March 2, 2004 14:58 WSPC/148-RMP
00193
Reviews in Mathematical Physics Vol. 16, No. 1 (2004) 1–28 © World Scientific Publishing Company
CONSTRUCTION OF METASTABLE STATES IN QUANTUM ELECTRODYNAMICS
MATTHIAS MÜCK
FB Mathematik, Johannes Gutenberg-Universität Mainz, Staudingerweg 9, D-55128 Mainz, Germany
[email protected]

Received 31 October 2002
Revised 7 August 2003

In this paper, we construct metastable states of atoms interacting with the quantized radiation field. These states emerge from the excited bound states of the non-interacting system. We prove that these states obey an exponential time-decay law. In detail, we show that their decay is given by an exponential function in time, predicted by Fermi’s Golden Rule, plus a small remainder term. The latter is proportional to the (4 + β)th power of the coupling constant and decays algebraically in time. As a result, though it is small, it dominates the decay for large times. A central point of the paper is that our remainder term is significantly smaller than the one previously obtained in [1]; as a result, we are able to show that the time interval during which Fermi’s Golden Rule can be observed is significantly longer than the time interval obtained in [1]. This improvement is achieved by incorporating a part of the complex dilatation resonance states into our construction of the metastable states rather than using the unperturbed eigenstates (the excited bound states of the non-interacting system). Thus, the connection to resonance states allows us to introduce metastable states which are better suited to describing unstable excited states of the interacting system.

Keywords: Metastable states; resonances; QED; spectral deformation; Fermi’s golden rule.
1. Introduction and Survey of Results

This paper is concerned with the study of unstable, excited states of a nonrelativistic $N$-electron atom, coupled to the quantized electromagnetic field. We base our analysis on the Standard Model of Nonrelativistic Quantum Electrodynamics introduced in [2, 3, 1]. Physics textbooks teach us that atomic systems, while possessing stable excited energy levels in the absence of an electromagnetic field, have no stable states except for the ground state when interacting with photons. However, experiments on atoms interacting with the electromagnetic field show that the spectral density is located around the former atomic energy levels as “smeared out levels”. Further, there exist distinguished states, so-called metastable states, related to such an energy band which are stable for long times and show a controlled, i.e. exponential, decay behavior. These states control the process of emission and
absorption of light since they allow a transition of the electron system into different states under annihilation or creation of photons. The unstable energy levels are well understood as complex resonances (complex eigenvalues of the spectrally deformed Hamiltonian). The real part of the resonance is the center of the energy level while the imaginary part is the spectral linewidth, proportional to the decay rate. However, it is not yet clear how to assign a physical state to a resonance such that its energy and decay behavior are given by the resonance coordinates. The problem is that the corresponding resonance eigenvector of the deformed Hamiltonian depends on the spectral deformation (unlike the resonance itself) and that the resonance eigenvector leaves the Hilbert space when the deformation is removed. Thus it has no immediate physical significance.

However, it seems plausible that the resonance eigenvector is related to a metastable state. This conjecture is well-founded in the work of Hunziker [4]. He has shown for quantum systems with isolated resonances that formal Rayleigh–Schrödinger expansions for the (nonexistent) eigenstate (i.e. for the resonance eigenvector) possess the required exponential decay behavior up to a power-law decay error term which is small in the coupling constant. Furthermore, Hunziker proved that one can achieve arbitrary orders in the coupling constant in the error term provided one incorporates sufficiently high-order terms of the expansion into the definition of the metastable state. Thus he showed that one can assign a class of physical states (labelled by the order of the Rayleigh–Schrödinger expansion) to a resonance, whose characteristic exponential time decay (referred to as Fermi’s Golden Rule) dominates the power-law decay for longer times the higher the considered orders are.

A first attempt to identify metastable states in quantum electrodynamics was made by Bach, Fröhlich and Sigal in [1].
Unlike in Hunziker’s work [4], the resonances of a quantum electrodynamical system stay at the thresholds of continuous spectrum, even after spectral deformation. Therefore, standard perturbation theory for discrete eigenvalues, as it was used in [4], cannot be applied to the quantum electrodynamical framework to generate expansions of the resonance eigenvectors. Technically much more involved strategies, such as the Renormalization Group technique using the Feshbach map, would have to replace the standard perturbation procedure. Further, the imbedded resonances make the problem of extracting information about the time evolution from the spectral information (via Fourier or Laplace transformation) a difficult task, in particular if one considers higher order corrections. The authors of [1] restricted themselves to considering the decay of the zeroth order expansion, i.e. the unperturbed eigenstate. They could identify an exponential decay, predicted by Fermi’s Golden Rule, and a correction term which decays by a power law and is of fourth order in the coupling constant.

Our aim in this paper is to apply Hunziker’s approach to quantum electrodynamics to some extent. In other words, we incorporate a first-order approximation of the resonance eigenvector into the construction of a metastable state. Our efforts are directed at regularizing the resonance eigenvector at the expense
of small error terms. We relate, in a constructive way, the metastable state with the (dilatation) resonance eigenvector; more precisely, we define a first-order approximation of the resonance eigenvector which survives when the spectral deformation is removed. This state has physical meaning and, indeed, fulfils the expected time decay. Our construction of a metastable state and the bounds on its time decay improve the previous result in [1] in the sense that we have better estimates on the remainder terms. This gives physical significance to the mathematically motivated construction, since the observation of Fermi’s Golden Rule is possible over a longer time interval.

Furthermore, by connecting metastable states with resonance eigenvectors, we take a first step towards understanding the nature of excited, slowly decaying “eigenstates” of particle-photon systems. The explicit construction of a better-suited metastable state as a first-order approximation to the resonance eigenvector reveals, to some extent, the nature of what we understand by excited “eigenstates” of the given quantum electrodynamical system. We believe that higher-order approximations to the resonance eigenvector give even better results with respect to the smallness of the power-law term in the time decay of the states under consideration, i.e. we expect results similar to [4]. But since the first order is already technically non-trivial, we content ourselves with tackling this “initial” problem.

1.1. The mathematical model of nonrelativistic electrons coupled to the quantized radiation field

We begin by briefly reviewing the model of nonrelativistic electrons interacting with photons, and we refer the reader to [2, 3, 1] for a detailed discussion of its derivation from first principles of quantum mechanics. Other contributions to various aspects of this model can be found in [5–8].
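The Hilbert space $\mathcal{H}$, containing the state vectors of $N$ electrons and the photon field, is given as the tensor product
\[ \mathcal{H} := \mathcal{H}_{el} \otimes \mathcal{F} \cong \mathcal{A}_N L^2[(\mathbb{R}^3 \times \{1,2\})^N; \mathcal{F}] \tag{1} \]
of the $N$-electron Hilbert space
\[ \mathcal{H}_{el} := \mathcal{A}_N L^2[(\mathbb{R}^3 \times \{1,2\})^N] \cong \bigwedge_{j=1}^{N} (L^2[\mathbb{R}^3] \otimes \mathbb{C}^2), \tag{2} \]
of totally antisymmetric wave functions of $N$ position and spin variables $x_j \equiv (\vec{x}_j, \lambda_j) \in \mathbb{R}^3 \times \{1,2\}$ and the bosonic Fock space
\[ \mathcal{F} := \bigoplus_{n=0}^{\infty} \mathcal{H}_n, \tag{3} \]
where $\mathcal{H}_n$ denotes the space of totally symmetric, square-integrable wave functions of $n$ momentum and polarization variables $k_j \equiv (\vec{k}_j, \lambda_j) \in \mathbb{R}^3 \times \{1,2\}$.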
$\mathcal{F}$ is the Hilbert space of all sequences $(\psi_n \in \mathcal{H}_n)_{n \in \mathbb{N}_0} \equiv \bigoplus_{n=0}^{\infty} \psi_n$, for which $|\psi_0|^2 + \sum_{n=1}^{\infty} \langle \psi_n | \psi_n \rangle_{\mathcal{H}_n} < \infty$. We assume that the electrons are bound in a Coulomb potential
\[ V_C(\vec{x}_1, \dots, \vec{x}_N) := \sum_{j=1}^{N} \frac{-Z}{|\vec{x}_j|} + \sum_{1 \le i < j \le N} \frac{1}{|\vec{x}_i - \vec{x}_j|} \tag{4} \]
of a single static nucleus located at the origin with nuclear charge $Z \in \mathbb{R}_+$. The electron dynamics is generated by the Schrödinger operator
\[ H_{el} := -\Delta_x + V_C(x), \qquad x \in \mathbb{R}^{3N}. \tag{5} \]
While it is well-known from the HVZ-Theorem (see, e.g., [9]) that the essential spectrum of the electron Hamiltonian is given by $\sigma_{ess}(H_{el}) = [\Sigma, \infty)$, $\Sigma \le 0$, much less is known about the discrete spectrum of $H_{el}$, and we have to require explicitly some additional properties of the atom to ensure stability of the electron system.

Hypothesis 1.1. The discrete spectrum of $H_{el}$ consists of at most countably many, but at least one, finitely degenerate eigenvalues $E_0 < E_1 < \dots < E_j < E_{j+1} < \dots < \Sigma = \inf(\sigma_{ess}(H_{el}))$. The spectrum, $\sigma(H_{el})$, has no accumulation point in $\mathbb{R} \setminus \{\Sigma\}$ and
\[ \sigma(H_{el}) = \{E_0, E_1, \dots, E_j, E_{j+1}, \dots\} \cup [\Sigma, \infty) \tag{6} \]
(see Fig. 1).
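As a concrete illustration of Hypothesis 1.1, the following sketch lists the levels of a one-electron atom. It assumes $N = 1$ and the convention $H_{el} = -\Delta - Z/|\vec{x}|$ of (5), for which the textbook eigenvalues are $-Z^2/(4n^2)$, $n = 1, 2, \dots$, and $\Sigma = 0$; spin and degeneracy are ignored. The function names are hypothetical and not part of the paper.

```python
from fractions import Fraction

SIGMA = Fraction(0)  # bottom of the essential spectrum for N = 1 (assumption)

def hydrogenic_levels(Z, n_max):
    """Eigenvalues -Z^2 / (4 n^2), n = 1..n_max, of -Laplace - Z/|x|."""
    return [Fraction(-Z * Z, 4 * n * n) for n in range(1, n_max + 1)]

def isolation_gaps(Z, n_max):
    """Distance from each of the first n_max levels to the rest of the spectrum."""
    # One extra level is generated so that the gap of the last listed level is exact.
    levels = hydrogenic_levels(Z, n_max + 1)
    gaps = []
    for i in range(n_max):
        others = levels[:i] + levels[i + 1:] + [SIGMA]
        gaps.append(min(abs(levels[i] - o) for o in others))
    return gaps

levels = hydrogenic_levels(Z=1, n_max=5)
gaps = isolation_gaps(Z=1, n_max=5)
for k, (e, d) in enumerate(zip(levels, gaps)):
    print(f"E_{k} = {float(e):+.5f}, distance to rest of spectrum = {float(d):.5f}")
```

The levels increase strictly towards $\Sigma = 0$ and the isolation distances shrink, so the only accumulation point is $\Sigma$, as the hypothesis requires.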
Fig. 1. The spectrum of the electronic Hamiltonian $H_{el}$.
Hypothesis 1.1 is fulfilled for neutral atoms and positively charged ions (i.e. for $N \le Z$, see also [10]). It is known that there are no eigenvalues for highly-charged negative ions with $N \ge 2Z + 1$ (see also [11]).

The dynamics of the photons is generated by the Hamiltonian of the free quantized electromagnetic field
\[ H_f := \int \omega(\vec{k})\, a(k)^* a(k)\, dk, \tag{7} \]
which is the second quantization of the dispersion relation $\omega(\vec{k}) := |\vec{k}|$ of massless bosons. The creation and annihilation operators, $a^*(k)$ and $a(k)$, respectively, are the usual densely defined, operator-valued, tempered distributions representing the canonical commutation relations (CCR)
\[ [a(k), a(k')^*] = \delta(k - k')\, 1_{\mathcal{F}} \quad \text{and} \quad [a(k)^*, a(k')^*] = [a(k), a(k')] = 0. \tag{8} \]
The normalized vacuum state $\Omega := (1, 0, 0, \dots) \in \mathcal{H}_0$ is uniquely determined (up to a phase factor) by
\[ a(k)\Omega = 0, \tag{9} \]
for all $k \in \mathbb{R}^3 \times \{1,2\}$. The spectrum of $H_f$ is characterized by
\[ \sigma(H_f) = [0, \infty), \qquad \sigma_{pp}(H_f) = \{0\}, \qquad \mathrm{Ker}(H_f) = \mathcal{H}_0 = \mathbb{C}\Omega. \tag{10} \]
The operator
\[ H_0 := H_{el} \otimes 1_{\mathcal{F}} + 1_{\mathcal{H}_{el}} \otimes H_f \tag{11} \]
describes the combined system of non-interacting electrons and photons. Its spectrum is illustrated in Fig. 2. The atomic eigenvalues $\{E_j \mid j \in \mathbb{N}_0\}$ persist as eigenvalues of $H_0$, but the energy levels are no longer isolated points of the spectrum. The eigenvalues $E_k$ are the thresholds of half lines $E_k + [0, \infty)$ of absolutely continuous spectrum and are imbedded in the half lines of absolutely continuous spectrum $E_l + [0, \infty)$ with $l < k$.
Fig. 2. Spectrum of the unperturbed Hamiltonian $H_0$.
The interaction between electrons and photons is introduced through minimal coupling,
\[ H'_\alpha := \sum_{j=1}^{N} \big[ \vec{\sigma}_j \cdot \big( \vec{p}_j - 2\pi^{1/2} \alpha^{3/2} \vec{A}(\alpha \vec{x}_j) \big) \big]^2 + V_C(x) + H_f, \tag{12} \]
where $\vec{\sigma}_j = ((\sigma_1)_j, (\sigma_2)_j, (\sigma_3)_j)$ are the Pauli matrices and $\vec{p}_j = -i\nabla_{\vec{x}_j}$ is the momentum operator of the $j$th electron. The vector potential
\[ \vec{A}(\vec{x}_j) := \int dk\, \frac{\kappa(|\vec{k}|/K)}{\pi \sqrt{2\omega(\vec{k})}} \big\{ \vec{\varepsilon}(k)\, e^{-i\vec{k}\cdot\vec{x}_j} a(k)^* + \vec{\varepsilon}(k)^*\, e^{i\vec{k}\cdot\vec{x}_j} a(k) \big\} \tag{13} \]
couples the $j$th electron to the radiation field with the strength of the fine structure constant $\alpha > 0$. The function $\kappa$ is fast decaying and regularizes the UV behavior on a scale $K < \infty$. Note that the representation (12) of the Hamiltonian is a rescaled form of the standard Hamiltonian found in quantum mechanics textbooks. The rescaling can be obtained by using the homogeneity of degree $-1$ of the Coulomb potential $V_C$.

Using standard algebra for Pauli matrices, we perform normal ordering of the squares in creation and annihilation operators. After subtracting the self energy $E^{(K)}_{self}$ (a constant proportional to $N \int dk\, \kappa(|\vec{k}|/K)^2\, |\vec{\varepsilon}(k)|^2 / \omega(\vec{k})$), the Hamilton operator of the system reads
\[ H_g := H'_\alpha - E^{(K)}_{self} = H_0 + W_g(\theta = 0) := H_0 + \sum_{m+n=1,2} g^{m+n}\, W_{m,n}(\theta = 0), \tag{14} \]
where $g := (\alpha K)^{3/2} > 0$ represents the strength of the electromagnetic coupling between electrons and photons and
\[ W_{m,n}(\theta) := W_{n,m}(\bar{\theta})^* := \int dk\, w_{m,n}(k_1, \dots, k_m, k_{m+1}, \dots, k_{m+n}, \theta) \cdot a(k_1)^* \cdots a(k_m)^*\, a(k_{m+1}) \cdots a(k_{m+n}) \tag{15} \]
for suitable form factors $w_{m,n}$. The artificially introduced parameter $\theta$ is the dilatation parameter used later for spectral analysis (see Appendix A). A suitable choice of the UV cutoff $\kappa$ (i.e. $\kappa(\tau) = \exp(-\tau^4)$) gives us the relative bounds
\[ \| w_{m,n}(k, \theta)(-\Delta_x + 1)^{-1/2} \|_{\mathcal{H}_{el}} \le J(\vec{k}) \quad (m + n = 1), \qquad \| w_{m,n}(k_1, k_2, \theta) \|_{\mathcal{H}_{el}} \le J(\vec{k}_1) J(\vec{k}_2) \quad (m + n = 2) \tag{16} \]
for a square-integrable function $J : \mathbb{R}^3 \to \mathbb{R}$ with $J(\vec{k}) \sim \omega(\vec{k})^{-1/2}$ as $|\vec{k}| \to 0$ and $J(\vec{k}) \sim \exp(-|\vec{k}|^4)$ as $|\vec{k}| \to \infty$. This guarantees the existence of
\[ \Lambda_\beta := \Big( \int_{\mathbb{R}^3} J(\vec{k})^2\, \omega(\vec{k})^\beta\, d^3k \Big)^{1/2}, \qquad \beta > -2. \tag{17} \]
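The restriction $\beta > -2$ in (17) can be made explicit on a model form factor. The sketch below assumes $J(\vec{k}) = |\vec{k}|^{-1/2} \exp(-|\vec{k}|^4)$, which has the two asymptotic behaviors stated above but is not taken from the paper; for this choice $\Lambda_\beta^2 = 4\pi \int_0^\infty r^{\beta+1} e^{-2r^4}\, dr = \tfrac{\pi}{2}\, 2^{(2-\beta)/4}\, \Gamma\big(\tfrac{\beta+2}{4}\big)$, and the Gamma function is finite exactly for $\beta > -2$. The function names are hypothetical.

```python
import math

def lambda_sq_closed(beta):
    """Lambda_beta^2 for the model J(k) = |k|^(-1/2) * exp(-|k|^4) (an assumption)."""
    if beta <= -2:
        raise ValueError("the integral diverges at k = 0 for beta <= -2")
    return 0.5 * math.pi * 2 ** ((2 - beta) / 4) * math.gamma((beta + 2) / 4)

def lambda_sq_numeric(beta, r_max=4.0, steps=20_000):
    """Midpoint rule for 4*pi * int_0^r_max r^(beta+1) * exp(-2 r^4) dr."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (beta + 1) * math.exp(-2.0 * r ** 4)
    return 4.0 * math.pi * h * total

for beta in (0, -1):
    print(beta, lambda_sq_closed(beta), lambda_sq_numeric(beta))
```

Only $\beta = 0$ and $\beta = -1$ are compared numerically, since these are the two constants entering the relative bounds; for $-2 < \beta < -1$ the integrand is singular at the origin and the plain midpoint rule would need refinement.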
Furthermore, for such a choice of $\kappa$, we have strong analyticity of $\theta \mapsto w_{m,n}(k_1, \dots, k_m, k_{m+1}, \dots, k_{m+n}, \theta)$ in a complex neighborhood of $\theta = 0$ (for details refer to [1]). Henceforth, we will assume that $\kappa$ is chosen in such a way. An immediate consequence of the above assumptions is

Lemma 1.2. For $m, n \in \mathbb{N}_0$, $1 \le m + n \le 2$ and $|\theta| < \pi/4$, the operators $W_{m,n}(\theta)$ are defined on $D(H_0) = D(H_{el}) \otimes D(H_f)$ and satisfy the relative bound
\[ \| W_{m,n}(\theta)\, |H_0 + i|^{-1} \|_{\mathcal{H}} \le [4(1 + \Lambda_0^2 + \Lambda_{-1}^2)]^{(m+n)/2}. \tag{18} \]

Proof. We refer the reader to [1, Lemma I.1].

Furthermore, the perturbed Hamiltonian $H_g$ is well-defined and extends to a self-adjoint operator.

Theorem 1.3. The perturbation $W_g$ is defined on $D(H_0) = D(H_{el}) \otimes D(H_f)$. If $g < \frac{1}{10}(1 + \Lambda_0^2 + \Lambda_{-1}^2)^{-1/2}$, then $H_g = H_0 + W_g$ is essentially self-adjoint on $D(H_g) = D(H_0)$ and bounded from below.

Proof. See [1, Corollary I.2].

1.2. Survey of the main result and strategies

In [1] it was proven that, for sufficiently small $g > 0$, the excited energy levels disappear (see also Theorem A.6) when electrons and photons are minimally coupled. We study the fate of the corresponding eigenstates in Secs. 2 and 3.
In Sec. 2 we construct metastable states close to the unperturbed eigenstates, i.e. for a given eigenstate $\varphi_j \otimes \Omega$ of the uncoupled system with energy $E_j$ we explicitly define a vector $\Phi$ in the Hilbert space $\mathcal{H}$ which is of the form $\Phi = \varphi_j \otimes \Omega + O(g^\beta)$, where $0 < \beta < 1/9$. In Sec. 3, we identify the time-evolution of this state $\Phi$ to be exponentially decaying, up to a small remainder term, i.e. there exist constants $C, C_n < \infty$ for $n = 0, 1$ such that for $t > 1$, the following holds:
\[ | \langle \Phi \mid \exp(-iH_g t)\Phi \rangle_{\mathcal{H}} | \le \underbrace{C \exp(-S_g t)}_{\text{Fermi's Golden Rule}} + \underbrace{\frac{C_n}{t^n}\, g^{4+\beta}}_{\text{remainder term}}, \tag{19} \]
where $0 < \beta < 1/9$ and $S_g = O(g^2)$. This main result is restated as Theorem 3.1. The first part in (19) is predicted by Fermi’s Golden Rule. Following standard expositions in physics textbooks (see, e.g., [12, Chap. 20]), the probability per unit time, $p_{\Phi \to \phi_0}$, of transition of a perturbed, excited state $\Phi$ into the ground state $\phi_0$ is given by the matrix element of the perturbation,
\[ p_{\Phi \to \phi_0} = 2\pi\, | \langle \phi_0 \mid W_g \Phi \rangle_{\mathcal{H}} |^2 = O(g^2). \tag{20} \]
This constant transition rate implies an exponential decay of the state $\Phi$ under the time-evolution with a life-time $p_{\Phi \to \phi_0}^{-1}$. However, for large times $t \gg 1$ the decay (19) is dominated by the remainder term (“background”).

We point out that this result is an improvement on [1], where the time decay of the eigenstate $\varphi_j \otimes \Omega$ itself was identified to be of this kind, except for the error term; specifically, Bach, Fröhlich and Sigal proved that the error term is $O(t^{-n} g^4)$ (which, however, is valid for all $n \in \mathbb{N}$). Some ideas for estimating metastable states used here are borrowed from [1]. Our improved estimate expands the time slot in which the exponential decay dominates the error term and, thus, in which Fermi’s Golden Rule can be observed, i.e. where $C \exp(-S_g t) \gg \frac{C_n}{t^n}\, g^{4+\beta}$. This shows that the states we construct are physically better suited to describing unstable, excited states than excited states of the decoupled system. Increasing the order of the error term requires a refinement of the choice of $\Phi$ compared to [1]. The construction of the state $\Phi$ is not straightforward, and we devote an entire section to this task.

In our paper, we follow an idea of [4] for constructing metastable states. For the case when the resonance originates from an isolated eigenvalue of the unperturbed, deformed system, Hunziker proved that the error is of order $O(t^{-n} g^M)$, for all $n \in \mathbb{N}$, assuming the existence of a Rayleigh–Schrödinger expansion to order $M$ of the (nonexistent) perturbed eigenstate. Moreover, Hunziker showed how metastable states of a quantum system and resonances (complex eigenvalues of a dilated Hamiltonian $H_g(\theta)$, see Appendix A) are related and that the imaginary parts of the resonance energies determine the reciprocal life-time $S_g$ of the metastable states. In our case, we expect that the resonance eigenstate $\psi_\theta$ of $H_g(\theta)$, $\mathrm{Im}(\theta) > 0$, corresponding to the resonance $\tilde{E}_j$ (i.e. $H_g(\theta)\psi_\theta = \tilde{E}_j \psi_\theta$) carries the features of a metastable state.
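To see how the extra factor $g^\beta$ in the remainder of (19) lengthens the window in which the exponential term dominates, one can compare crossover times numerically. The sketch below is only illustrative: it sets $C = C_n = 1$ and $S_g = g^2$ (assumptions, since the paper only states $S_g = O(g^2)$), assumes $g$ is small, and uses hypothetical function names.

```python
import math

def log_ratio(t, g, power, n=1):
    """log of exp(-S t) / (g**power / t**n), with S = g**2 and all constants 1."""
    S = g * g
    return -S * t - power * math.log(g) + n * math.log(t)

def crossover_time(g, power, n=1):
    """Time beyond which the power-law remainder exceeds the exponential term."""
    S = g * g
    lo = max(1.0, n / S)  # log_ratio is decreasing for t > n / S
    hi = 2.0 * lo
    while log_ratio(hi, g, power, n) > 0.0:
        hi *= 2.0
    for _ in range(200):  # plain bisection on the sign change
        mid = 0.5 * (lo + hi)
        if log_ratio(mid, g, power, n) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g, beta = 0.01, 0.1
t_old = crossover_time(g, 4.0)         # remainder of order g^4, as in [1]
t_new = crossover_time(g, 4.0 + beta)  # remainder of order g^(4 + beta)
print(t_old, t_new, (t_new - t_old) * g * g)
```

Under these assumptions the gain, measured in units of the life-time $1/S_g$, is roughly $\beta \ln(1/g)$, so it grows as the coupling becomes weaker.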
In this paper we consider an approximation of this state in first order. Such an approximation can be found by using Renormalization Group techniques (see [2]) and reads
\[ \psi_\theta^{(1)} = \big[ P(\theta) - \bar{P}(\theta) \big( \bar{P}(\theta) H_g(\theta) \bar{P}(\theta) - \tilde{E}_j \big)^{-1} \bar{P}(\theta) W_g(\theta) P(\theta) \big]\, \varphi_j(\theta) \otimes \Omega. \tag{21} \]
We can understand this vector as a formal Rayleigh–Schrödinger expansion (of the resonance vector) up to first order. Here, $P(\theta)$ is a projection on the unperturbed eigenspace corresponding to $E_j$ and on “small photon energies”, and $\bar{P}(\theta) = 1_{\mathcal{H}} - P(\theta)$. It is the first prototype of a metastable state. But the transformation back to the undilated system is problematic because $\lim_{\mathrm{Im}(\theta) \to 0} \psi_\theta^{(1)}$ does not exist. In the limit $\mathrm{Im}(\theta) \to 0$, the spectrum of $H_g(\theta)$ is turned back from the lower half plane into the real axis. Thus, the resonance $\tilde{E}_j = E_j + O(g^2) - iO(g^2)$ is covered by a sector of absolutely continuous spectrum of $H_g(\theta)$, which results from perturbing the branches $E_k + e^{-\theta}[0, \infty)$, $k < j$, of the spectrum of $H_0(\theta)$ (this is illustrated in Fig. 3; for spectral properties of the dilated Hamiltonian see also Appendix A and Fig. 7).

Fig. 3. Spectrum of the perturbed, dilated Hamiltonian $H_g(\theta)$, $g > 0$, $\mathrm{Im}(\theta) > 0$, resulting from perturbing $H_0(\theta)$. The sectors of continuous spectrum fixed at the resonances $\tilde{E}_k$ are caused by perturbing the branches $E_k + e^{-\theta}[0, \infty)$. They cover the resonance $\tilde{E}_j$ as $\mathrm{Im}(\theta) \to 0$. (Legend: unperturbed energy level with branch of continuous spectrum; resonance with sector of continuous spectrum.)

We briefly sketch our approach of constructing a metastable state. Inspired by [4], we modify the state $\psi_\theta^{(1)}$ so that it becomes analytic in $\theta$ (i.e. we require dilatation analyticity), and yet, the new state should approximate $\psi_\theta^{(1)}$ as $g \to 0$. We obtain this state $\Phi_\theta$ by replacing, in a first step, the perturbed resolvent in (21) by its unperturbed analogue and then, in a second step, by introducing regularization terms to ensure the existence of the resolvent restricted to several eigenspaces, see Definition 2.2. In Sec. 2, we see that this modification of the resolvent enables us to define a state $\Phi_{\theta=0}$ which lives in the physical configuration space $\mathcal{H}$ and has the required decay properties. However, incorporating the resolvent in the definition of the metastable state results in a dependence of the spectral deformation on the coupling constant (see Remark 2.3 and Fig. 4), which in turn requires subtle estimates on the resolvent $(e^{-\theta}H_f + E_k - E_j - ig^\alpha)^{-1}$ as $g, \mathrm{Im}(\theta) \to 0$.
Fig. 4. Illustration of the $g$-dependence of the analytic dilatation of $\phi$.

2. Construction of a Metastable State

The first part of this section deals with the construction of metastable states. We prove dilatation analyticity (in the sense of Appendix A) and some important estimates which enter the proof of our main Theorem 3.1 about the decay of metastable states. Before constructing a metastable state of $H_g$, we introduce some notation.

Notation 2.1. Let $\sigma(H_{el}) = \{E_0, E_1, \dots, E_j, E_{j+1}, \dots\} \cup [\Sigma, \infty)$ be the spectrum of the electronic Hamiltonian $H_{el}$ as postulated in Hypothesis 1.1. For $k \in \mathbb{N}_0$ we consider the discrete eigenvalue $E_k$ (which is finitely degenerate by assumption) and denote by
\[ P_{el}^k := \chi_{[E_k - \frac{\delta_k}{2},\, E_k + \frac{\delta_k}{2}]}(H_{el}), \qquad \delta_k := \mathrm{dist}(E_k, \sigma(H_{el}) \setminus \{E_k\}) > 0, \tag{22} \]
the orthogonal projection of rank $N_k < \infty$ on the corresponding eigenspace. Furthermore, we define an orthogonal projection on $\mathcal{H}$,
\[ P_k := P_{el}^k \otimes 1_{\mathcal{F}}. \tag{23} \]
We fix $j \in \mathbb{N}$ (note that $j = 0$ is excluded), set $\delta := \delta_j$ and introduce
\[ P_{>j} := 1_{\mathcal{H}} - \sum_{k=0}^{j} P_k = \chi\big[ H_{el} > E_j + \tfrac{\delta_j}{2} \big] \otimes 1_{\mathcal{F}} \tag{24} \]
(where $\chi[f > a](x) := \chi_{\{y \in \mathbb{R} \mid f(y) > a\}}(x)$, for a measurable function $f$).

Now, we are in the position to implement the strategy of constructing metastable states as outlined in Sec. 1.2.

Definition 2.2 (Construction of a Metastable State). Using Notation 2.1 we define, for $g > 0$,
\[ \phi := \Big[ 1_{\mathcal{H}} - \Big\{ \sum_{k=0}^{j-1} P_k (H_f + E_k - E_j - ig^\alpha)^{-1} P_k + P_j (H_f + g^\gamma)^{-1} P_j + P_{>j} (H_0 P_{>j} - E_j)^{-1} P_{>j} \Big\} W_g \Big]\, \varphi_j \otimes \Omega, \tag{25} \]
where $\varphi_j \in \mathrm{Ran}\, P_{el}^j$ is an arbitrary, normalized eigenstate of $H_{el}$ corresponding to the $j$th excited energy level $E_j$, and $\Omega$ is the vacuum vector of the photon Fock space. The positive constants $\alpha, \gamma > 0$ are fixed later, and $(H_0 P_{>j} - E_j)^{-1}$ is the resolvent of the operator $H_0$, restricted to the range of $P_{>j}$. Furthermore, we define, for $\omega > 0$, the interval
\[ I(\omega) := (E_j - \omega, E_j + \omega). \tag{26} \]
Let $F \in C_0^\infty(I(\delta/2); [0, 1])$ with $F \equiv 1$ on $I(\delta/4)$ and let $F$ be symmetric with respect to $E_j$, i.e. $F(E_j - \lambda) = F(E_j + \lambda)$. The metastable state is defined as
\[ \Phi := F^{1/2}(H_g)\, \phi. \tag{27} \]
We define two further states which are useful for our analysis,
\[ F\phi := (H_g - E_j)\phi, \tag{28} \]
\[ \xi_k := -ig^\alpha (H_f + E_k - E_j - ig^\alpha)^{-1} P_k W_g\, \varphi_j \otimes \Omega, \qquad k = 0, \dots, j - 1. \tag{29} \]

Remark 2.3. We consider the map $\theta \mapsto \phi(\theta) = U(\theta)\phi$ where $U(\theta)$ is the dilatation group defined in Definition A.1. The complex domain in which this map has an analytic continuation depends on the coupling constant $g$. The resolvents $(H_f + E_k - E_j - ig^\alpha)^{-1}$ acting on the range of $P_k$ ($k < j$) are the terms that confine the dilatation parameter $\theta$ to a region that shrinks as $g$ decreases. We observe that $(e^{-\theta}H_f + E_k - E_j - ig^\alpha)^{-1}$ diverges for $\mathrm{Im}(\theta) \to \arctan \frac{g^\alpha}{E_k - E_j}$. Thus, for a fixed $g > 0$, we bound $\vartheta := \mathrm{Im}(\theta)$ by $\vartheta > \arctan \frac{g^\alpha}{E_k - E_j}$. The boundary corresponds to the critical angle $\vartheta$ in Fig. 4, so that the spectrum of $e^{-\theta}H_f + E_k$ (the ray in the figure) touches the pole $E_j + ig^\alpha$. Since we consider matrix elements of the type $\langle \phi(\bar{\theta}) \mid A(\theta)\, \phi(\theta) \rangle_{\mathcal{H}}$, we get an equal bound on $-\vartheta = \mathrm{Im}(\bar{\theta})$, i.e. for the reflected ray in Fig. 4. Thus, the angle is confined to the sector
\[ |\vartheta| < \arctan \frac{g^\alpha}{E_j - E_k} \xrightarrow{\; g \to 0 \;} 0. \]
Note that the bound on $\vartheta$ is due to the presence of the resolvent in the definition of $\phi$. This is in contrast to [1], where metastable states were given by the electron eigenstates (and therefore no resolvent enters the definition), such that dilatation analyticity holds true uniformly in the coupling constant $g$.

Remark 2.3 suggests choosing the dilatation parameter $\vartheta$ dependent on the interaction strength $g$ to ensure dilatation analyticity. This requires a more delicate estimate of the singularities of $(e^{-\theta}H_f + E_k - E_j - ig^\alpha)^{-1}$ as $\vartheta, g \to 0$. These estimates are the technically most crucial difference to [1] and are carried out in the following.
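The geometric content of Remark 2.3 can be checked numerically: the spectrum of $e^{-i\vartheta}H_f$ is the ray $\{e^{-i\vartheta} r : r \ge 0\}$, and the resolvent $(e^{-i\vartheta}H_f + E_k - E_j - ig^\alpha)^{-1}$ is singular exactly when this ray meets the point $E_j - E_k + ig^\alpha$. The sketch below computes the distance from that point to the ray; the numerical values and function names are hypothetical and chosen only for illustration.

```python
import cmath
import math

def distance_to_ray(theta_im, gap, g_alpha):
    """Distance from z0 = gap + i*g_alpha to the ray {exp(-i*theta_im) * r : r >= 0}.

    gap stands for E_j - E_k > 0 and g_alpha for g**alpha > 0.
    """
    z0 = complex(gap, g_alpha)
    direction = cmath.exp(-1j * theta_im)
    # Orthogonal projection of z0 onto the ray direction, clipped to r >= 0.
    r_star = max(0.0, (z0 * direction.conjugate()).real)
    return abs(direction * r_star - z0)

def critical_angle(gap, g_alpha):
    """The angle arctan(g^alpha / (E_j - E_k)) bounding |Im(theta)|."""
    return math.atan(g_alpha / gap)

gap, g_alpha = 1.0, 0.05
theta_c = critical_angle(gap, g_alpha)
print(distance_to_ray(-theta_c, gap, g_alpha))       # ray touches the pole
print(distance_to_ray(0.0, gap, g_alpha))            # undilated case: distance g^alpha
print(distance_to_ray(-0.5 * theta_c, gap, g_alpha)) # inside the sector: positive
```

The distance vanishes at $\vartheta = -\arctan\big(g^\alpha / (E_j - E_k)\big)$ and stays positive for smaller $|\vartheta|$, in line with the sector condition stated in the remark.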
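Lemma 2.4 (Dilatation Analyticity of $\phi$). The states $\phi$, $F\phi$, $\xi_k$ are dilatation analytic in the sense of Definition A.2. More precisely, let $\beta \ge \alpha$ and define
\[ \vartheta_g := C_\vartheta\, g^\beta, \tag{30} \]
where $C_\vartheta > 0$ is a sufficiently small constant so that
\[ \frac{g^\alpha}{\tan(2\vartheta_g)} \ge \sup_{k=0,\dots,j-1} (E_j - E_k + 2), \tag{31} \]
for $g > 0$ sufficiently small. Then, using the notations of Appendix A, the maps $\mathbb{R} \ni \theta \mapsto \phi(\theta), F\phi(\theta), \xi_k(\theta)$ have Hilbert space valued analytic continuations to the disc $U_{2\vartheta_g} = \{\theta \in \mathbb{C} \mid |\theta| < 2\vartheta_g\}$, for $g > 0$ sufficiently small, namely
\[ \phi(\theta) = \Big[ 1_{\mathcal{H}} - \Big\{ \sum_{k=0}^{j-1} P_k(\theta)(e^{-\theta}H_f + E_k - E_j - ig^\alpha)^{-1} P_k(\theta) + P_j(\theta)(e^{-\theta}H_f + g^\gamma)^{-1} P_j(\theta) + P_{>j}(\theta)(H_0(\theta)P_{>j}(\theta) - E_j)^{-1} P_{>j}(\theta) \Big\} W_g(\theta) \Big]\, \varphi_j(\theta) \otimes \Omega, \tag{32} \]
\[ F\phi(\theta) = (H_g(\theta) - E_j)\phi(\theta), \tag{33} \]
\[ \xi_k(\theta) = -ig^\alpha (e^{-\theta}H_f + E_k - E_j - ig^\alpha)^{-1} P_k(\theta) W_g(\theta)\, \varphi_j(\theta) \otimes \Omega. \tag{34} \]

Proof. The proof is elementary and is omitted.

For the estimate of the time-decay of the metastable state $\Phi$, we have to examine the behavior of the dilated states $\phi(\theta)$, $F\phi(\theta)$, $\xi_k(\theta)$ in the limit $g \to 0$.

Lemma 2.5. Let $\theta_g := \pm i\vartheta_g = O(g^\beta)$ and $0 < \alpha \le \beta \le 1$, $0 < \gamma \le 1$. Then the states $\phi$, $F\phi$, $\xi_k$ defined in (25), (28) and (29) obey the following estimates:
\[ \phi(\theta_g) - \varphi_j(\theta_g) \otimes \Omega = O\big( g^{1 - \max\{\frac{\beta}{2}, \frac{\gamma}{2}\}} \big), \tag{35} \]
\[ \xi_k(\theta_g) = O\big( g^{1 + \alpha - \frac{\beta}{2}} \big), \tag{36} \]
\[ F\phi(\theta_g) - \sum_{k=0}^{j-1} \xi_k(\theta_g) = O\big( g^{1 + \min\{\frac{\gamma}{2},\, 1 - \frac{\beta}{2}\}} \big), \tag{37} \]
\[ F\phi(\theta_g) = O\big( g^{1 + \min\{\frac{\gamma}{2},\, \alpha - \frac{\beta}{2}\}} \big). \tag{38} \]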
Proof. We first establish Eq. (38): (Hg (θg ) − Ej )φ(θg )
March 2, 2004 14:58 WSPC/148-RMP
12
00193
M. M¨ uck
=
j−1 X
−θ α −1 ξk (θg ) − Wg (θg )(e g Hf + Ek − Ej − ig ) Pk (θg )Wg (θg ) ϕj (θg ) ⊗ Ω | {z } k=0 =:T1k (θg )
− (Wg (θg ) − g γ )(e−θg Hf + g γ )−1 Pj (θg )Wg (θg ) ϕj (θg ) ⊗ Ω {z } | =:T2 (θg )
− Wg (θg )(H0 (θg )P>j (θg ) − Ej )−1 P>j (θg )Wg (θg ) ϕj (θg ) ⊗ Ω | {z } =:T3 (θg )
=
j−1 X
[ξk (θg ) − T1k (θg )] − T2 (θg ) − T3 (θg ) .
k=0
We estimate the four terms ξk (θg ), T1k (θg ), T2 (θg ), T3 (θg ) separately. It is obvious that T3 (θg ) = O(g 2 ), because kWg (θ)(H0 + i)−1 kH = O(g 1 ). Next, we have kξk (θg )kH = g α k(e−θg Hf + Ek − Ej − ig α )−1 Pk (θg )Wg (θg ) ϕj (θg ) ⊗ ΩkH β = O g 1+α− 2 ,
where we used Lemma 2.7 to estimate the norm. Analogously, we see that k 2− β kT1 (θg )kH = O g 2 . To estimate T2 (θg ), we consider kT2 (θg )kH ≤ k(Wg (θg ) − g γ )Pj (θg )kH k|e−θg Hf + g γ |−1/2 kH
· k|e−θg Hf + g γ |−1/2 Pj (θg )Wg (θg ) ϕj (θg ) ⊗ ΩkH . γ Lemma 2.8 then implies that kT2 (θg )kH = O g 1+ 2 , and finally Fφ(θg ) −
j−1 X
ξk (θg ) = −
k=0
j−1 X k=0
Fφ(θg ) =
j−1 X
k=0
β γ T1k (θg ) − T2 (θg ) − T3 (θg ) = O g 1+min{ 2 ,1− 2 } ,
γ γ β β ξk (θg ) + O g 1+min{ 2 ,1− 2 } = O g 1+min{ 2 ,α− 2 } .
The estimate of (35) is similar.

Remark 2.6. Later, we choose $\beta$ smaller than $\varepsilon/2 < 1/9$ in order to use properties of the dilated spectrum of $H_g$. To get the largest possible exponent in (36) and (38), we choose $\alpha$ as large as possible, i.e. $\alpha = \beta$, in accordance with Lemma 2.4. Furthermore, we choose $\gamma \ge \alpha = \beta$. Then Eqs. (35)–(38) read:
\[ \phi(\theta_g) - \varphi_j(\theta_g)\otimes\Omega = O\big(g^{1-\gamma/2}\big), \quad \xi_k(\theta_g) = O\big(g^{1+\beta/2}\big), \quad F\phi(\theta_g) - \sum_{k=0}^{j-1}\xi_k(\theta_g) = O\big(g^{1+\gamma/2}\big), \quad F\phi(\theta_g) = O\big(g^{1+\beta/2}\big). \tag{39} \]
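The exponents in (39) can be checked by direct substitution into (35)–(38). The following is a short worked check, assuming only the constraints stated in Remark 2.6 ($\alpha = \beta < 1/9$ and $\beta \le \gamma \le 1$):

```latex
\begin{align*}
(35):&\quad 1-\max\{\tfrac{\beta}{2},\tfrac{\gamma}{2}\} = 1-\tfrac{\gamma}{2}
      && (\gamma \ge \beta), \\
(36):&\quad 1+\alpha-\tfrac{\beta}{2} = 1+\tfrac{\beta}{2}
      && (\alpha = \beta), \\
(37):&\quad 1+\min\{\tfrac{\gamma}{2},\,1-\tfrac{\beta}{2}\} = 1+\tfrac{\gamma}{2}
      && (\tfrac{\gamma}{2} \le \tfrac{1}{2} < 1-\tfrac{\beta}{2}), \\
(38):&\quad 1+\min\{\tfrac{\gamma}{2},\,\alpha-\tfrac{\beta}{2}\} = 1+\tfrac{\beta}{2}
      && (\alpha-\tfrac{\beta}{2} = \tfrac{\beta}{2} \le \tfrac{\gamma}{2}).
\end{align*}
```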
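Lemma 2.7. Let $0 < \alpha \le \beta < 1/9$, $\vartheta_g$ as defined in (30), and $\theta_g := \pm i\vartheta_g = O(g^{\beta})$. Abbreviate $E_{jk} := E_j - E_k > 0$. Then we have the estimates
\[ \big\|P_k(\theta_g)\big(e^{-\theta_g}H_f - E_{jk} - ig^{\alpha}\big)^{-1}W_g(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H} = O\big(g^{1-\beta/2}\big) \tag{40} \]
and
\[ \big\|P_k(\theta_g)\big(e^{-\theta_g}H_f - E_{jk} \mp \omega + is\big)^{-1}\big(e^{-\theta_g}H_f - E_{jk} - ig^{\alpha}\big)^{-1}W_g(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H} = O\big(g^{1-\beta/2}\big), \tag{41} \]
uniformly in $\delta/4 \le \omega \le \delta/2$ and $-\mathrm{const}\, g^2 \le s \le \mathrm{const}\, g^2$.

Proof. Note that
\[ W_g(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega = \Big(g\,W_{1,0}(\theta_g) + \sum_{m+n=2} g^2\,W_{m,n}(\theta_g)\Big)\varphi_j(\theta_g)\otimes\Omega. \]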
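We start with Eq. (40). First, we consider for $m + n = 2$:
\[ \big\|P_k(\theta_g)\big(e^{-\theta_g}H_f - E_{jk} - ig^{\alpha}\big)^{-1}g^2\,W_{m,n}(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H} = O(g^{2-\alpha}). \]
Thus, it remains to consider the case $(m,n) = (1,0)$. Using Lemma A.4 and the notation therein and the Pull-Through formula (Lemma B.1), we derive the following:
\begin{align*}
&\big\|P_k(\theta_g)\big(e^{-\theta_g}H_f - E_{jk} - ig^{\alpha}\big)^{-1}g\,W_{1,0}(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H}^2 \\
&\quad = g^2\,\Big\|\sum_{l=1}^{N_k}\int dk\;\frac{\langle\varphi_k^l(\bar\theta_g)\,|\,w_{1,0}(k,\theta_g)\,\varphi_j(\theta_g)\rangle_{\mathcal H_{el}}}{e^{-\theta_g}\omega(\vec k) - E_{jk} - ig^{\alpha}}\;\varphi_k^l(\theta_g)\otimes a(k)^*\Omega\Big\|_{\mathcal H}^2 \\
&\quad \le g^2 N_k \max_{l=1,\dots,N_k}\bigg[\|\varphi_k^l(\theta_g)\|_{\mathcal H_{el}}^2\int dk\,\bigg|\frac{\langle\varphi_k^l(\bar\theta_g)\,|\,w_{1,0}(k,\theta_g)\,\varphi_j(\theta_g)\rangle_{\mathcal H_{el}}}{e^{-\theta_g}\omega(\vec k) - E_{jk} - ig^{\alpha}}\bigg|^2\bigg] \\
&\quad \le g^2 C\int dk\;\frac{J(\vec k)^2}{|e^{-\theta_g}\omega(\vec k) - E_{jk} - ig^{\alpha}|^2} \\
&\quad \le g^2 C'\int_0^{\infty}\frac{\big(r^{-1} + \exp(-2r^4)\big)\,r^2\,dr}{|e^{\mp i\vartheta_g}r - E_{jk} - ig^{\alpha}|^2}
\end{align*}
for some suitable constants $C, C' < \infty$. Now, we prove that the last integral is of order $O(g^{-\beta})$. We apply the bound (31) on $\vartheta_g$,
\begin{align*}
&\int_0^{\infty}\frac{r\big(1 + r\exp(-2r^4)\big)\,dr}{(\cos(\vartheta_g)r - E_{jk})^2 + (\sin(\pm\vartheta_g)r + g^{\alpha})^2} \\
&\quad \le \cos(\vartheta_g)^{-3}\Bigg[\int_{|r-E_{jk}|>1}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk})^2}\,dr
 + \int_{|r-E_{jk}|\le 1}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk})^2 + \tan(\vartheta_g)^2\big(r + \frac{g^{\alpha}}{\tan(\pm\vartheta_g)}\big)^2}\,dr\Bigg] \\
&\quad \le \cos(\vartheta_g)^{-3}\Bigg[C'' + C'''\,\frac{1}{\tan(\vartheta_g)^2}\int_{-\infty}^{\infty}\frac{\tan(\vartheta_g)}{1 + \big(\frac{r}{\tan(\vartheta_g)}\big)^2}\,\frac{dr}{\tan(\vartheta_g)}\Bigg]
 = O\big(\tan(\vartheta_g)^{-1}\big) = O(g^{-\beta}), \tag{42}
\end{align*}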
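with $C'', C''' < \infty$. This proves (40).

Now, we turn to the proof of (41). Again, it is sufficient to consider the case $(m,n) = (1,0)$ only. We proceed similarly to (40). Using the Pull-Through formula twice we get for a suitable constant $C < \infty$:
\begin{align*}
&\big\|P_k(\theta_g)\big(e^{-\theta_g}H_f - E_{jk} \mp \omega + is\big)^{-1}\big(e^{-\theta_g}H_f - E_{jk} - ig^{\alpha}\big)^{-1}g\,W_{1,0}(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H}^2 \\
&\quad \le \frac{g^2 C}{\cos(\vartheta_g)^3}\int_0^{\infty}\Bigg[\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk} \mp \omega)^2 + \tan(\vartheta_g)^2\big(r - \frac{s}{\tan(\pm\vartheta_g)}\big)^2}\cdot\frac{1}{(r - E_{jk})^2 + \tan(\vartheta_g)^2\big(r + \frac{g^{\alpha}}{\tan(\pm\vartheta_g)}\big)^2}\Bigg]dr. \tag{43}
\end{align*}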
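The integral in (43) can be split as
\begin{align*}
\int_0^{\infty}\big[\dots\big]\,dr
&\le \frac{64}{\delta^2}\int_{|r-E_{jk}|>\delta/8}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk} \mp \omega)^2 + \tan(\vartheta_g)^2\big(r - \frac{s}{\tan(\pm\vartheta_g)}\big)^2}\,dr \\
&\quad + \frac{64}{\delta^2}\int_{|r-E_{jk}|\le\delta/8}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk})^2 + \tan(\vartheta_g)^2\big(r + \frac{g^{\alpha}}{\tan(\pm\vartheta_g)}\big)^2}\,dr \\
&\le \frac{64}{\delta^2}\int_0^{\infty}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk} \mp \omega)^2 + \tan(\vartheta_g)^2\big(r - \frac{s}{\tan(\pm\vartheta_g)}\big)^2}\,dr + O(g^{-\beta}),
\end{align*}
since $(r - E_{jk})^2 > \delta^2/64$ for $|r - E_{jk}| > \delta/8$ and $(r - E_{jk} \mp \omega)^2 \ge (|\omega| - |r - E_{jk}|)^2 \ge (\delta/4 - \delta/8)^2 = \delta^2/64$ for $|r - E_{jk}| \le \delta/8$. The second integral was already estimated in (42) to be a term of order $O(g^{-\beta})$. We consider the remaining integral. First, we note that $E_{jk} \pm \omega \ge E_j - E_{j-1} - \delta/2 \ge \delta/2 > 0$. Since $|s| \le \mathrm{const}\, g^2$ and $\tan(\pm\vartheta_g) = O(g^{\beta})$ with $\beta < 1/9$, we have $\big|\frac{s}{\tan(\pm\vartheta_g)}\big| < \delta/8$ for $g > 0$ sufficiently small. Thus
\begin{align*}
&\int_0^{\infty}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk} \mp \omega)^2 + \tan(\vartheta_g)^2\big(r - \frac{s}{\tan(\pm\vartheta_g)}\big)^2}\,dr \\
&\quad = O(g^0) + \int_{\delta/4}^{\infty}\frac{r\big(1 + r\exp(-2r^4\cos(\vartheta_g)^{-4})\big)}{(r - E_{jk} \mp \omega)^2 + \tan(\vartheta_g)^2\big(r - \frac{s}{\tan(\pm\vartheta_g)}\big)^2}\,dr = O(g^{-\beta}),
\end{align*}
as in (42). This proves (41).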
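The mechanism behind the $O(g^{-\beta})$ bounds in (42) and in the proof of (41) is the elementary Lorentzian integral. As a sketch, with the shorthand $\tau := \tan(\vartheta_g) > 0$ and a constant $c > 0$ (both introduced here only for illustration; $c$ stands for a lower bound on the factor multiplying $\tan(\vartheta_g)$ near the singularity):

```latex
% \tau := \tan(\vartheta_g) > 0 and c > 0 are illustrative shorthand, not notation of the paper.
\begin{align*}
\int_{-\infty}^{\infty}\frac{dr}{(r-E_{jk})^{2} + c^{2}\tau^{2}}
 = \frac{1}{c\,\tau}\Big[\arctan\frac{r-E_{jk}}{c\,\tau}\Big]_{r=-\infty}^{r=\infty}
 = \frac{\pi}{c\,\tau}.
\end{align*}
```

With $\vartheta_g$ proportional to $g^{\beta}$, this is of order $g^{-\beta}$. Since the squared norms carry a prefactor $g^2$, the norms themselves are of order $g^{1-\beta/2}$, which is the exponent appearing in (40) and (41).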
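Lemma 2.8. Let $\theta = i\vartheta \in U_{\vartheta_g}$, i.e. $|\vartheta| \le \vartheta_g$, and $\gamma \le 2$. Then we have
\[ \big\|P_j(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}\,W_g(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}P_j(\theta)\big\|_{\mathcal H} = O\big(g^{1-\gamma/2}\big), \tag{44} \]
\[ \big\|P_j(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}\,W_g(\theta)\,\varphi_j(\theta)\otimes\Omega\big\|_{\mathcal H} = O(g), \tag{45} \]
uniformly in $\vartheta \in (-\vartheta_g, +\vartheta_g)$.

Proof. Recall that $W_g(\theta) = \sum_{m+n=1,2} g^{m+n}\,W_{m,n}(\theta)$. First, we derive a proposition.

Proposition 2.9. There is a constant $C > 0$ so that for $m + n = 1$:
\[ \big\|P_j(\theta)\,|e^{-\theta}H_f + \mu|^{-1/2}\,W_{m,n}(\theta)\,|e^{-\theta}H_f + \mu|^{-1/2}P_j(\theta)\big\|_{\mathcal H} \le C\sqrt{\frac{\frac{1+\mu}{\cos(\vartheta)}\Lambda_{-1}^2 + \Lambda_0^2}{\mu}} \tag{46} \]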
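for all $\mu > 0$.

Proof of (46). Let $\psi \in \mathcal H$. Using the Pull-Through formula (Lemma B.1), we obtain:
\begin{align*}
&\big\|P_j(\theta)\,|e^{-\theta}H_f + \mu|^{-1/2}\,W_{0,1}(\theta)\,|e^{-\theta}H_f + \mu|^{-1/2}P_j(\theta)\,\psi\big\|_{\mathcal H} \\
&\quad = \Big\|\int |e^{-\theta}H_f + \mu|^{-1/2}P_j(\theta)\,w_{0,1}(k,\theta)\,P_j(\theta)\,|e^{-\theta}(H_f + \omega(\vec k)) + \mu|^{-1/2}\,a(k)\,\psi\;dk\Big\|_{\mathcal H} \\
&\quad \le \bigg(\int \big\||e^{-\theta}H_f + \mu|^{-1/2}P_j(\theta)\,w_{0,1}(k,\theta)\,P_j(\theta)\,|e^{-\theta}(H_f + \omega(\vec k)) + \mu|^{-1/2}(H_f + \omega(\vec k))^{1/2}\big\|_{\mathcal H}^2\,\frac{dk}{\omega(\vec k)}\bigg)^{1/2} \\
&\qquad \cdot \bigg(\int \big\|(H_f + \omega(\vec k))^{-1/2}a(k)\,\psi\big\|_{\mathcal H}^2\,\omega(\vec k)\,dk\bigg)^{1/2}. \tag{47}
\end{align*}
The bound of the second factor in (47) follows from
\begin{align*}
\int \big\|(H_f + \omega(\vec k))^{-1/2}a(k)\psi\big\|_{\mathcal H}^2\,\omega(\vec k)\,dk
&= \int \big\langle (H_f + \omega(\vec k))^{-1/2}a(k)\psi\,\big|\,(H_f + \omega(\vec k))^{-1/2}a(k)\psi\big\rangle_{\mathcal H}\,\omega(\vec k)\,dk \\
&= \int \big\langle a(k)(H_f P_\Omega^{\perp})^{-1/2}P_\Omega^{\perp}\psi\,\big|\,a(k)(H_f P_\Omega^{\perp})^{-1/2}P_\Omega^{\perp}\psi\big\rangle_{\mathcal H}\,\omega(\vec k)\,dk \le \|P_\Omega^{\perp}\psi\|_{\mathcal H}^2, \tag{48}
\end{align*}
where $P_\Omega^{\perp} = \mathbf{1}_{\mathcal H} - P_\Omega$ and $P_\Omega \equiv \mathbf{1}_{\mathcal H_{el}}\otimes P_\Omega$ denotes the orthogonal projection on the kernel $\mathcal H_{el}\otimes\mathbb{C}\Omega$ of $H_f \equiv \mathbf{1}_{\mathcal H_{el}}\otimes H_f$. The integrand of the first factor in (47) is estimated as follows:
\begin{align*}
&\big\||e^{-\theta}H_f + \mu|^{-1/2}P_j(\theta)\,w_{0,1}(k,\theta)\,P_j(\theta)\,|e^{-\theta}(H_f + \omega(\vec k)) + \mu|^{-1/2}(H_f + \omega(\vec k))^{1/2}\big\|_{\mathcal H} \\
&\quad \le \big\||e^{-\theta}H_f + \mu|^{-1/2}(H_f + \omega(\vec k))^{1/2}\big\|_{\mathcal F}\cdot\|P_j(\theta)\,w_{0,1}(k,\theta)\,P_j(\theta)\|_{\mathcal H_{el}}\cdot\big\||e^{-\theta}(H_f + \omega(\vec k)) + \mu|^{-1/2}\big\|_{\mathcal F} \\
&\quad \le C\sqrt{\frac{1}{\cos(\vartheta)} + \frac{\omega(\vec k)}{\mu}}\;J(\vec k)\,\sqrt{\frac{1}{\omega(\vec k)\cos(\vartheta) + \mu}} \\
&\quad \le \frac{C\,J(\vec k)}{\sqrt{\mu\cos(\vartheta)}}\sqrt{1 + \mu + \omega(\vec k)\cos(\vartheta)}. \tag{49}
\end{align*}
Inserting the estimates (48) and (49) into (47) we finally get:
\begin{align*}
\big\|P_j(\theta)\,|e^{-\theta}H_f + \mu|^{-1/2}\,W_{0,1}(\theta)\,|e^{-\theta}H_f + \mu|^{-1/2}P_j(\theta)\big\|_{\mathcal H}
&\le \frac{C}{\sqrt{\mu\cos(\vartheta)}}\bigg(\int\frac{1 + \mu + \omega(\vec k)\cos(\vartheta)}{\omega(\vec k)}\,J(\vec k)^2\,dk\bigg)^{1/2} \\
&= C\sqrt{\frac{\frac{1+\mu}{\cos(\vartheta)}\Lambda_{-1}^2 + \Lambda_0^2}{\mu}}.
\end{align*}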
Replacing $W_{0,1}(\theta)$ by $W_{1,0}(\theta) = W_{0,1}(\bar\theta)^*$ and similar estimates finally prove (46).

We return to the proof of Eq. (44). Since, obviously,
\[ \big\|P_j(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}\,g^2\,W_{m,n}(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}P_j(\theta)\big\|_{\mathcal H} = O(g^{2-\gamma}) \]
for $m + n = 2$, it is sufficient to consider the case $m + n = 1$. But this case is easily handled using (46) and replacing $\mu$ by $g^{\gamma}$. Finally, Eq. (45) follows from:
\begin{align*}
&\big\|P_j(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}\,W_g(\theta)\,\varphi_j(\theta)\otimes\Omega\big\|_{\mathcal H} \\
&\quad \le \big\|P_j(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}\,W_g(\theta)\,|e^{-\theta}H_f + g^{\gamma}|^{-1/2}P_j(\theta)\big\|_{\mathcal H}\cdot\big\||e^{-\theta}H_f + g^{\gamma}|^{1/2}\,\varphi_j(\theta)\otimes\Omega\big\|_{\mathcal H} = O(g),
\end{align*}
uniformly in $\vartheta \in (-\vartheta_g, +\vartheta_g)$.

3. Time-Decay Law for Metastable States

In this section we restate and prove the main theorem of this paper.

Theorem 3.1 (Quasi-Exponential Time-Decay of Metastable States). Let $\Phi = F^{1/2}(H_g)\,\phi \in D(H_{el})\otimes D(H_f) = D(H_g)$ be the state defined in (25) and (27). Then there exist constants $C, C_n < \infty$, for $n = 0, 1$, such that, for any $0 < \beta < 1/9$,
\[ |\langle\Phi\,|\,\exp(-iH_g t)\Phi\rangle_{\mathcal H}| \le C\exp(-S_g t) + \frac{C_n}{t^n}\,g^{4+\beta}, \tag{50} \]
for $t > 1$ and sufficiently small $g > 0$.
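To get a feeling for the time window in which the exponential term in (50) is the larger of the two contributions to the bound, one can compare them directly. The following is only a rough sketch for $n = 0$; it treats $C$ and $C_0$ as fixed constants, and the name $t_*(g)$ is introduced here for illustration:

```latex
% t_*(g) is an illustrative name; C and C_0 are the (unspecified) constants of (50) with n = 0.
\begin{align*}
C\,e^{-S_g t} \;\ge\; C_0\,g^{4+\beta}
\quad\Longleftrightarrow\quad
t \;\le\; t_*(g) := \frac{1}{S_g}\Big[(4+\beta)\ln\frac{1}{g} + \ln\frac{C}{C_0}\Big].
\end{align*}
```

Since $S_g = O(g^2)$, the window $t \le t_*(g)$ extends over a time scale at least of order $g^{-2}\ln(1/g)$, i.e. roughly $(4+\beta)\ln(1/g)$ multiples of $1/S_g$. This is a statement about the two terms of the upper bound only, not about the matrix element itself.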
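Remark 3.2. The constants $\alpha$, $\gamma$ in Definition 2.2 and the constant $\beta$ in Lemma 2.4 are fixed by $\alpha := \beta$, where $0 < \beta < 1/9$ is arbitrary, and $\gamma := 1$. The quantity $S_g$, defined in (A.12) and satisfying $0 < S_g = O(g^2)$, is the reciprocal of the life-time of the metastable state.

Proof. We roughly follow the strategy of the proof of [1, Theorem III.5]. We first use the fact that
\[ F(\lambda) = -\int_{\lambda - E_j}^{\delta/2}F'(E_j + \omega)\,d\omega = -\int_{\delta/4}^{\delta/2}F'(E_j + \omega)\,\chi_{I(\omega)}(\lambda)\,d\omega \tag{51} \]
and Stone's formula (see e.g. [13]), to rewrite
\begin{align*}
\langle\Phi\,|\,\exp(-iH_g t)\Phi\rangle_{\mathcal H}
&= \langle\phi\,|\,\exp(-iH_g t)F(H_g)\phi\rangle_{\mathcal H} \\
&= -\frac{1}{2\pi i}\lim_{\eta\searrow 0}\int_{\delta/4}^{\delta/2}d\omega\,F'(E_j + \omega)\int_{I(\omega)}d\lambda\,\exp(-i\lambda t) \\
&\qquad \cdot\big[\langle\phi\,|\,(H_g - \lambda - i\eta)^{-1}\phi\rangle_{\mathcal H} - \langle\phi\,|\,(H_g - \lambda + i\eta)^{-1}\phi\rangle_{\mathcal H}\big]. \tag{52}
\end{align*}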
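Now, we use our results on the dilatation analyticity derived in Sec. 2. Fix $0 < \varepsilon < 2/9$, and let $\theta_g := i\vartheta_g = iC_\vartheta g^{\beta}$, as defined in (30). We note that Theorem A.6 is applicable for $\beta < \varepsilon/2 < 1/9$ and sufficiently small $g > 0$. Thus, appealing to Theorem A.6, we transform the matrix elements in (52) by complex dilatation, using the fact that $\phi$ is dilatation analytic on $U_{2\vartheta_g}$ (Lemma 2.4),
\begin{align*}
\langle\phi\,|\,(H_g - \lambda - i\eta)^{-1}\phi\rangle_{\mathcal H} &= \langle\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - \lambda - i\eta)^{-1}\phi(\theta_g)\rangle_{\mathcal H}, \\
\langle\phi\,|\,(H_g - \lambda + i\eta)^{-1}\phi\rangle_{\mathcal H} &= \langle\phi(\theta_g)\,|\,(H_g(\bar\theta_g) - \lambda + i\eta)^{-1}\phi(\bar\theta_g)\rangle_{\mathcal H},
\end{align*}
because $H_g(\theta_g)^* = H_g(\bar\theta_g)$. Inserting this into (52), we have:
\[ \langle\Phi\,|\,\exp(-iH_g t)\Phi\rangle_{\mathcal H} = -\frac{1}{\pi}\int_{\delta/4}^{\delta/2}d\omega\,F'(E_j + \omega)\int_{I(\omega)}d\lambda\,e^{-i\lambda t}\,\mathrm{Im}\,\langle\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - \lambda)^{-1}\phi(\theta_g)\rangle_{\mathcal H}, \tag{53} \]
where the limit $\eta \searrow 0$ exists due to the analyticity of $A_\varepsilon \ni z \mapsto (H_g(\theta_g) - z)^{-1}$, see Theorem A.6. The analyticity also allows us to deform the integration contour along $I(\omega)$ in (53) into the lower half plane (see Fig. 5). Applying Cauchy's integral formula, we get:
\[ \langle\Phi\,|\,\exp(-iH_g t)\Phi\rangle_{\mathcal H} = M_- - M_+ + M_{\parallel}(\theta_g) - M_{\parallel}(\bar\theta_g), \tag{54} \]
where
\begin{align*}
M_{\pm} &:= -\int_{\delta/4}^{\delta/2}d\omega\,\frac{F'(E_j + \omega)}{2\pi i}\int_0^{S_g}ds\,\exp(-it(E_j \pm \omega - is)) \\
&\qquad \cdot\big[\langle\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - E_j \mp \omega + is)^{-1}\phi(\theta_g)\rangle_{\mathcal H} - \langle\phi(\theta_g)\,|\,(H_g(\bar\theta_g) - E_j \mp \omega + is)^{-1}\phi(\bar\theta_g)\rangle_{\mathcal H}\big], \tag{55} \\
M_{\parallel}(\theta_g) &:= -\int_{\delta/4}^{\delta/2}d\omega\,\frac{F'(E_j + \omega)}{2\pi i}\int_{I(\omega)}d\lambda\,e^{-it(\lambda - iS_g)}\,\langle\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - \lambda + iS_g)^{-1}\phi(\theta_g)\rangle_{\mathcal H}. \tag{56}
\end{align*}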
Theorem 3.1 follows directly from Lemmas 3.3 and 3.5 below, which state that the integration along the horizontal contour $-iS_g + I(\omega)$ (see Fig. 5) gives the exponential decay in (50), and the integration along the vertical contour yields the error term $O(t^{-n}g^{4+\beta})$.

Lemma 3.3. Under the conditions of Theorem 3.1 and for $n = 0, 1$, there is a constant $C_n < \infty$, such that
\[ |M_{\pm}| \le \frac{C_n}{2t^n}\,g^{4+\beta}, \tag{57} \]
provided $t > 1$, and $g > 0$ is sufficiently small.

Fig. 5. Deformation of the integration contour into the lower half plane. (Labels in the figure: the region $A_\varepsilon$; the points $E_j - \omega$, $E_j$, $E_j + \omega$ on the real axis; the points $E_j - \omega - iS_g$ and $E_j + \omega - iS_g$; the resonance; $\sigma(H_g(\theta))$.)
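Proof. Recall the definition (55) of $M_{\pm}$. By $n$-fold integration by parts with respect to $d\omega$ and using the notation
\[ D_\nu(s,\omega) := \langle\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu-1}\phi(\theta_g)\rangle_{\mathcal H} - \langle\phi(\theta_g)\,|\,(H_g(\bar\theta_g) - E_j \mp \omega + is)^{-\nu-1}\phi(\bar\theta_g)\rangle_{\mathcal H} \tag{58} \]
we can rewrite
\[ M_{\pm} = \frac{-(\mp i)^n}{2\pi i\,t^n}\int_0^{S_g}ds\int_{\delta/4}^{\delta/2}d\omega\,\exp(-it(E_j \pm \omega - is))\cdot\sum_{\nu=0}^{n}\frac{(\pm 1)^{\nu}\,n!}{(n-\nu)!}\,F^{(n-\nu+1)}(E_j + \omega)\cdot D_\nu(s,\omega). \tag{59} \]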
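To compute $D_\nu(s,\omega)$, we use the operator equality
\[ (H_g(\theta_g) - E_j \mp \omega + is)^{-1} = \frac{1}{\mp\omega + is}\big[\mathbf{1}_{\mathcal H} - (H_g(\theta_g) - E_j \mp \omega + is)^{-1}(H_g(\theta_g) - E_j)\big], \tag{60} \]
on the domain $D(H_g) = D(H_{el})\otimes D(H_f)$. Introducing two further definitions,
\[ D_\nu^{F}(s,\omega) := \langle\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu-1}F\phi(\theta_g)\rangle_{\mathcal H} - \langle\phi(\theta_g)\,|\,(H_g(\bar\theta_g) - E_j \mp \omega + is)^{-\nu-1}F\phi(\bar\theta_g)\rangle_{\mathcal H}, \tag{61} \]
\[ D_\nu^{F^2}(s,\omega) := \langle F\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu-1}F\phi(\theta_g)\rangle_{\mathcal H} - \langle F\phi(\theta_g)\,|\,(H_g(\bar\theta_g) - E_j \mp \omega + is)^{-\nu-1}F\phi(\bar\theta_g)\rangle_{\mathcal H}, \tag{62} \]
and repeatedly replacing the resolvent in (58) by (60), we obtain the recursion relations
\[ D_\nu(s,\omega) = \frac{1}{\mp\omega + is}\big[D_{\nu-1}(s,\omega) - D_\nu^{F}(s,\omega)\big], \tag{63} \]
\[ D_\nu^{F}(s,\omega) = \frac{1}{\mp\omega + is}\big[D_{\nu-1}^{F}(s,\omega) - D_\nu^{F^2}(s,\omega)\big]. \tag{64} \]
By induction in $\nu$, we prove the following proposition.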
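For the reader's convenience, here is a short check of how the operator equality (60) produces the recursion (63). The shorthand $A := H_g(\theta_g) - E_j$ and $z := \mp\omega + is$ is introduced only for this sketch, and $F\phi = (H_g - E_j)\phi$ is used as in the text:

```latex
% Shorthand (for this check only): A := H_g(\theta_g) - E_j,  z := \mp\omega + is.
\begin{align*}
(A+z)^{-1} z &= (A+z)^{-1}\big[(A+z) - A\big] = \mathbf{1} - (A+z)^{-1}A, \\
\big\langle\phi(\bar\theta_g)\,\big|\,(A+z)^{-\nu-1}\phi(\theta_g)\big\rangle
 &= \frac{1}{z}\Big[\big\langle\phi(\bar\theta_g)\,\big|\,(A+z)^{-\nu}\phi(\theta_g)\big\rangle
  - \big\langle\phi(\bar\theta_g)\,\big|\,(A+z)^{-\nu-1}A\,\phi(\theta_g)\big\rangle\Big].
\end{align*}
```

The first line is (60) after dividing by $z$. In the second line, $A\,\phi(\theta_g) = F\phi(\theta_g)$; repeating the same step for the term with $\theta_g$ and $\bar\theta_g$ interchanged and subtracting gives (63). The relation (64) follows in the same way, moving the extra factor $A$ to the left slot via $H_g(\theta_g)^* = H_g(\bar\theta_g)$.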
Proposition 3.4. For $\nu \in \{-1, 0, \dots, n\}$ and $D_\nu^{\#}$ standing for either $D_\nu$ or $D_\nu^{F}$ the following estimate holds true:
\[ \sup_{0\le s\le S_g}\;\sup_{\delta/4\le\omega\le\delta/2}|D_\nu^{\#}(s,\omega)| = O(g^{2+\beta}). \tag{65} \]

Proof of (65). We start the induction with $\nu = -1$ (for technical reasons) and note that
\begin{align*}
D_{-1}(s,\omega) &= \langle\phi(\bar\theta_g)\,|\,\phi(\theta_g)\rangle_{\mathcal H} - \langle\phi(\theta_g)\,|\,\phi(\bar\theta_g)\rangle_{\mathcal H} = 0, \\
D_{-1}^{F}(s,\omega) &= \langle\phi(\bar\theta_g)\,|\,F\phi(\theta_g)\rangle_{\mathcal H} - \langle\phi(\theta_g)\,|\,F\phi(\bar\theta_g)\rangle_{\mathcal H} = 0, \\
D_{-1}^{F^2}(s,\omega) &= \langle F\phi(\bar\theta_g)\,|\,F\phi(\theta_g)\rangle_{\mathcal H} - \langle F\phi(\theta_g)\,|\,F\phi(\bar\theta_g)\rangle_{\mathcal H} = 0,
\end{align*}
because of the dilatation analyticity of $\phi$ and $F\phi = (H_g - E_j)\phi$. Let us assume now that (65) is already proven for $\nu - 1 < n$, $n = 0, 1$. For the induction, we recall the recursion relations (63) and (64). It is obvious that the proposition is implied by the estimate
\[ \sup_{0\le s\le S_g}\;\sup_{\delta/4\le\omega\le\delta/2}|D_\nu^{F^2}(s,\omega)| = O(g^{2+\beta}), \tag{66} \]
for $\nu = -1, 0, 1$. This is already done for $\nu = -1$. To derive (66) in the cases $\nu = 0, 1$ we note that
\[ \|(H_g(\theta_g) - E_j \mp \omega + is)^{-1}\|_{\mathcal H} \le \frac{\mathrm{const}}{\tan(\vartheta_g)\big(|\omega + O(g^2)| + O(g^2)\big)} = O(g^{-\beta}), \tag{67} \]
uniformly in $\delta/4 \le \omega \le \delta/2$ and $|s| \le S_g$. This result is also derived in [1, Eq. (III.33)] (which is based on [1, Theorem III.2, (III.17), including (III.14)]). Furthermore, we have due to Theorem A.5 and Eq. (A.10) an estimate for the unperturbed resolvent,
\[ \|(H_0(\theta_g) - E_j \mp \omega + is)^{-1}\|_{\mathcal H} = O(g^{-\beta}). \tag{68} \]
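Combining (67) with (68), we get:
\[ \|(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu} - (H_0(\theta_g) - E_j \mp \omega + is)^{-\nu}\|_{\mathcal H} = O(g^{1-2\nu\beta}). \tag{69} \]
We are now in a position to estimate $D_\nu^{F^2}(s,\omega)$. Abbreviating
\[ \xi(\theta_g) := \sum_{k=0}^{j-1}\xi_k(\theta_g), \tag{70} \]
we have (because of Remark 2.6, (39), (67) and (69))
\begin{align*}
&\langle F\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu-1}F\phi(\theta_g)\rangle_{\mathcal H} \\
&\quad = \langle\xi(\bar\theta_g)\,|\,(H_0(\theta_g) - E_j \mp \omega + is)^{-\nu-1}\xi(\theta_g)\rangle_{\mathcal H} \\
&\qquad + \langle\xi(\bar\theta_g)\,|\,\big[(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu-1} - (H_0(\theta_g) - E_j \mp \omega + is)^{-\nu-1}\big]\xi(\theta_g)\rangle_{\mathcal H} \\
&\qquad + O\big(g^{2+\frac{\gamma-(2\nu+1)\beta}{2}}\big) + O\big(g^{2+\gamma-(\nu+1)\beta}\big) + O\big(g^{2+\frac{\gamma-(2\nu+1)\beta}{2}}\big) \\
&\quad = \sum_{k=0}^{j-1}\langle\xi_k(\bar\theta_g)\,|\,(e^{-\theta_g}H_f + E_k - E_j \mp \omega + is)^{-\nu-1}\xi_k(\theta_g)\rangle_{\mathcal H} \\
&\qquad + O\big(g^{3-(2\nu-1)\beta}\big) + O\big(g^{2+\frac{\gamma-(2\nu+1)\beta}{2}}\big) + O\big(g^{2+\gamma-(\nu+1)\beta}\big),
\end{align*}
where we choose $\gamma = 1$. It remains to consider the inner products. For $\nu = 0$, we have
\begin{align*}
&|\langle\xi_k(\bar\theta_g)\,|\,(e^{-\theta_g}H_f + E_k - E_j \mp \omega + is)^{-\nu-1}\xi_k(\theta_g)\rangle_{\mathcal H}| \\
&\quad \le \|\xi_k(\bar\theta_g)\|_{\mathcal H}\;g^{\alpha}\,\big\|(e^{-\theta_g}H_f + E_k - E_j \mp \omega + is)^{-1}(e^{-\theta_g}H_f + E_k - E_j - ig^{\alpha})^{-1}P_k(\theta_g)W_g(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H} \\
&\quad = O(g^{2+\beta})
\end{align*}
(in the case $\alpha = \beta$), uniformly in $\delta/4 \le \omega \le \delta/2$ and $|s| \le \mathrm{const}\,g^2$. Here we made use of the results of Remark 2.6 and Lemma 2.7, Eq. (41). For $\nu = 1$, the estimate is similar:
\begin{align*}
&|\langle\xi_k(\bar\theta_g)\,|\,(e^{-\theta_g}H_f + E_k - E_j \mp \omega + is)^{-\nu-1}\xi_k(\theta_g)\rangle_{\mathcal H}| \\
&\quad \le g^{\alpha}\,\big\|(e^{-\bar\theta_g}H_f + E_k - E_j \mp \omega - is)^{-1}(e^{-\bar\theta_g}H_f + E_k - E_j - ig^{\alpha})^{-1}P_k(\bar\theta_g)W_g(\bar\theta_g)\,\varphi_j(\bar\theta_g)\otimes\Omega\big\|_{\mathcal H} \\
&\qquad \cdot g^{\alpha}\,\big\|(e^{-\theta_g}H_f + E_k - E_j \mp \omega + is)^{-1}(e^{-\theta_g}H_f + E_k - E_j - ig^{\alpha})^{-1}P_k(\theta_g)W_g(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega\big\|_{\mathcal H} \\
&\quad = O(g^{2+\beta}) \quad (\alpha = \beta),
\end{align*}
uniformly in $\omega$ and $s$, where we used (41) once more. Finally, we have
\[ \langle F\phi(\bar\theta_g)\,|\,(H_g(\theta_g) - E_j \mp \omega + is)^{-\nu-1}F\phi(\theta_g)\rangle_{\mathcal H} = O(g^{2+\beta}) \]
for $\gamma = 1$ and $\beta < 1/9$. The estimate of the second addend of (62) is analogous. This proves (66) and therefore the proposition (65).

We return to the proof of Lemma 3.3. By inserting (58) into (59), we estimate:
\begin{align*}
|M_{\pm}| &\le \frac{S_g}{2\pi\,t^n}\sup_{0\le s\le S_g}\bigg|\int_{\delta/4}^{\delta/2}d\omega\,e^{-it(E_j \pm \omega - is)}\sum_{\nu=0}^{n}\frac{(\pm 1)^{\nu}\,n!}{(n-\nu)!}\,F^{(n-\nu+1)}(E_j + \omega)\cdot D_\nu(s,\omega)\bigg| \\
&\le \frac{\delta S_g}{8\pi\,t^n}\sum_{\nu=0}^{n}\frac{n!}{(n-\nu)!}\sup_{0\le s\le S_g}\;\sup_{\delta/4\le\omega\le\delta/2}|F^{(n-\nu+1)}(E_j + \omega)\cdot D_\nu(s,\omega)| \\
&\le \frac{C_n}{2t^n}\,g^{4+\beta} \tag{71}
\end{align*}
for some positive constant $C_n < \infty$ and sufficiently small $g > 0$ because of (65), $S_g = O(g^2)$, and since all derivatives of $F$ are bounded.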
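The power $g^{4+\beta}$ in (71) is the product of two separately small factors. The following display is only a bookkeeping sketch of the last step, using $S_g = O(g^2)$, the bound (65), and the boundedness of the derivatives of $F$; the suprema are over $0 \le s \le S_g$ and $\delta/4 \le \omega \le \delta/2$:

```latex
\begin{align*}
|M_\pm| \;\le\; \frac{\delta}{8\pi\,t^{n}}\;
\underbrace{S_g}_{O(g^{2})}\;
\sum_{\nu=0}^{n}\frac{n!}{(n-\nu)!}\;
\underbrace{\sup\big|F^{(n-\nu+1)}\big|}_{O(1)}\;
\underbrace{\sup\big|D_\nu\big|}_{O(g^{2+\beta})}
\;=\; O\big(t^{-n}\,g^{2}\cdot g^{2+\beta}\big)
\;=\; O\big(t^{-n}\,g^{4+\beta}\big).
\end{align*}
```

One factor $g^2$ comes from the length $S_g$ of the vertical contour, the remaining $g^{2+\beta}$ from the difference $D_\nu$ of the dilated matrix elements.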
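Lemma 3.5. Under the conditions of Theorem 3.1, there is a constant $C < \infty$, such that
\[ |M_{\parallel}(\theta_g)| + |M_{\parallel}(\bar\theta_g)| \le C\exp(-S_g t), \tag{72} \]
for $t > 1$ and sufficiently small $g > 0$.

Proof. Recall the definition (56) of $M_{\parallel}(\theta_g)$. We introduce a quadratic form
\[ \tilde M_{\theta}(\varphi,\psi) := -\int_{\delta/4}^{\delta/2}d\omega\,\frac{F'(E_j + \omega)}{2\pi i}\int_{I(\omega)}d\lambda\,e^{-it(\lambda - iS_g)}\,\langle\varphi\,|\,(H_g(\theta) - \lambda + iS_g)^{-1}\psi\rangle_{\mathcal H} \tag{73} \]
to rewrite
\begin{align*}
M_{\parallel}(\theta_g) &= \tilde M_{\theta_g}\big(\varphi_j(\bar\theta_g)\otimes\Omega\,,\,\varphi_j(\theta_g)\otimes\Omega\big) + \tilde M_{\theta_g}\big(\varphi_j(\bar\theta_g)\otimes\Omega\,,\,\phi(\theta_g) - \varphi_j(\theta_g)\otimes\Omega\big) \\
&\quad + \tilde M_{\theta_g}\big(\phi(\bar\theta_g) - \varphi_j(\bar\theta_g)\otimes\Omega\,,\,\varphi_j(\theta_g)\otimes\Omega\big) + \tilde M_{\theta_g}\big(\phi(\bar\theta_g) - \varphi_j(\bar\theta_g)\otimes\Omega\,,\,\phi(\theta_g) - \varphi_j(\theta_g)\otimes\Omega\big).
\end{align*}
By [1, Eq. (III.34)], we have
\[ \|(H_g(\pm i\vartheta_g) - \lambda + iS_g)^{-1}\|_{\mathcal H} \le \frac{\mathrm{const}}{\tan(\vartheta_g)\big(|E_j - \lambda + O(g^2)| + |C_S\,g^{2+\varepsilon}|\big)} \tag{74} \]
which implies
\begin{align*}
|\tilde M_{\pm i\vartheta_g}(\varphi,\psi)| &\le \frac{C\,\|\varphi\|_{\mathcal H}\|\psi\|_{\mathcal H}\,e^{-S_g t}}{\tan(\vartheta_g)}\int_{I(\delta/2)}\frac{d\lambda}{|E_j - \lambda + O(g^2)| + |C_S\,g^{2+\varepsilon}|} \\
&\le \frac{2C\,\|\varphi\|_{\mathcal H}\|\psi\|_{\mathcal H}\exp(-S_g t)}{\tan(\vartheta_g)}\cdot\Big[\ln\Big(\frac{\delta}{2} + C_S\,g^{2+\varepsilon} + O(g^2)\Big) - \ln\big(C_S\,g^{2+\varepsilon} + O(g^2)\big)\Big],
\end{align*}
for a suitable constant $C < \infty$, because $F'$ is bounded. By (39), it follows that
\begin{align*}
|\tilde M_{\pm i\vartheta_g}\big(\varphi_j(\mp i\vartheta_g)\otimes\Omega\,,\,\phi(\pm i\vartheta_g) - \varphi_j(\pm i\vartheta_g)\otimes\Omega\big)|
&\le \mathrm{const}\,\frac{\|\phi(\pm i\vartheta_g) - \varphi_j(\pm i\vartheta_g)\otimes\Omega\|_{\mathcal H}\,\ln(O(g^2))}{\tan(\vartheta_g)}\exp(-S_g t) \\
&= O\big(g^{1-\frac{\gamma}{2}-\beta}\ln(g)\big)\exp(-S_g t) \le \mathrm{const}\cdot\exp(-S_g t).
\end{align*}
The estimates
\begin{align*}
|\tilde M_{\pm i\vartheta_g}\big(\phi(\mp i\vartheta_g) - \varphi_j(\mp i\vartheta_g)\otimes\Omega\,,\,\varphi_j(\pm i\vartheta_g)\otimes\Omega\big)| &\le \mathrm{const}\cdot\exp(-S_g t), \\
|\tilde M_{\pm i\vartheta_g}\big(\phi(\mp i\vartheta_g) - \varphi_j(\mp i\vartheta_g)\otimes\Omega\,,\,\phi(\pm i\vartheta_g) - \varphi_j(\pm i\vartheta_g)\otimes\Omega\big)| &\le \mathrm{const}\cdot\exp(-S_g t),
\end{align*}
follow similarly. It remains to consider the term $\tilde M_{\pm i\vartheta_g}\big(\varphi_j(\mp i\vartheta_g)\otimes\Omega\,,\,\varphi_j(\pm i\vartheta_g)\otimes\Omega\big)$. To this end, we define a projection
\[ P(\theta) := P_{el}^{j}(\theta)\otimes\chi[H_f \le g^{2-2\zeta}] \tag{75} \]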
on small photon energies, where $\tfrac{3}{2}\varepsilon < \zeta < \tfrac{1}{3}$. We observe that $P(\theta_g)\,\varphi_j(\theta_g)\otimes\Omega = \varphi_j(\theta_g)\otimes\Omega$ and hence
\begin{align*}
&|\langle\varphi_j(\bar\theta_g)\otimes\Omega\,|\,(H_g(\theta_g) - \lambda + iS_g)^{-1}\varphi_j(\theta_g)\otimes\Omega\rangle_{\mathcal H}| \\
&\quad \le |\langle\varphi_j(\bar\theta_g)\otimes\Omega\,|\,(e^{-\theta_g}H_f + E_j - \lambda + iS_g + \mathrm{Re}(O(g^2)) - ig^2\Gamma)^{-1}\varphi_j(\theta_g)\otimes\Omega\rangle_{\mathcal H}| \\
&\qquad + \frac{\mathrm{const}\cdot g^{2+\zeta}}{\tan(\vartheta_g)\big[(E_j - \lambda + O(g^2))^2 + (C_S\,g^{2+\varepsilon})^2\big]}, \tag{76}
\end{align*}
where we used (74) and [1, Theorem III.4], which states that
\[ \big\|\big[F_{P(\theta)}(H_g(\theta) - z) - (e^{-\theta}H_f + E_j - z + \mathrm{Re}(O(g^2)) - ig^2\Gamma)\big]P(\theta)\big\|_{\mathcal H} \le \mathrm{const}\cdot g^{2+\zeta} \tag{77} \]
(uniformly in $z \in A_\varepsilon$), where the Feshbach map is defined by
\[ F_{P(\theta)}(H_g(\theta) - z) := P(\theta)(H_g(\theta) - z)P(\theta) - P(\theta)W_g(\theta)\bar P(\theta)(H_g(\theta) - z)^{-1}\bar P(\theta)W_g(\theta)P(\theta) \]
on $\mathrm{Ran}(P(\theta))$ (for details see [2, Chap. II]). Furthermore, we compute:
\begin{align*}
\int_{I(\omega)}\frac{g^{2+\zeta}\,d\lambda}{\tan(\vartheta_g)\big[(E_j - \lambda + O(g^2))^2 + (C_S\,g^{2+\varepsilon})^2\big]}
&\le \frac{2g^{2+\zeta}}{\tan(\vartheta_g)}\int_0^{\delta/2 + O(g^2)}\frac{d\lambda}{\lambda^2 + (C_S\,g^{2+\varepsilon})^2} \\
&\le \frac{2g^{2+\zeta}}{\tan(\vartheta_g)\,C_S\,g^{2+\varepsilon}}\int_0^{\infty}\frac{d\lambda}{\lambda^2 + 1} = O(g^{\zeta-\varepsilon-\beta}) = O(g^0),
\end{align*}
because $\zeta \ge \tfrac{3}{2}\varepsilon$ and $\beta < \varepsilon/2$. Inserting this and (76) in (73), we can estimate (since $F'$ is bounded):
\[ |\tilde M_{\theta_g}\big(\varphi_j(\bar\theta_g)\otimes\Omega\,,\,\varphi_j(\theta_g)\otimes\Omega\big)| \le C\exp(-S_g t) + C'\,\|\varphi_j\otimes\Omega\|_{\mathcal H}^2\sup_{\delta/4\le\omega\le\delta/2}\bigg|\int_{I(\omega)}\frac{\exp(-it(\lambda - iS_g))}{E_j - \lambda + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,d\lambda\bigg|. \]
Finally, we get the bound of the integral by applying Cauchy's integral formula and the residue theorem (see Fig. 6):
\begin{align*}
\int_{I(\omega)}\frac{\exp(-it(\lambda - iS_g))}{E_j - \lambda + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,d\lambda
&= \int_{S_g}^{\infty}\frac{\exp(-it(E_j - \omega - is))}{\omega + is - iS_g + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,ds \\
&\quad - \int_{S_g}^{\infty}\frac{\exp(-it(E_j + \omega - is))}{-\omega + is - iS_g + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,ds \\
&\quad + \lim_{R\to\infty}\int_{I(\omega)}\frac{\exp(-it(\lambda - iR))}{E_j - \lambda - iS_g + iR + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,d\lambda \\
&\quad - \frac{1}{2\pi i}\exp(-it(\lambda - iS_g))\Big|_{\lambda = E_j + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}},
\end{align*}
where
\[ \bigg|\int_{S_g}^{\infty}\frac{\exp(-it(E_j \mp \omega - is))}{\pm\omega + is - iS_g + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,ds\bigg| \le \frac{4}{\delta + O(g^2)}\int_{S_g}^{\infty}\exp(-st)\,ds < \frac{4\exp(-S_g t)}{\delta + O(g^2)}, \]
because $t > 1$,
\[ \lim_{R\to\infty}\bigg|\int_{I(\omega)}\frac{\exp(-it(\lambda - iR))}{E_j - \lambda - iS_g + iR + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\,d\lambda\bigg| \le \lim_{R\to\infty}\int_{I(\omega)}\frac{2e^{-Rt}}{R}\,d\lambda = 0 \]
and
\[ \Big|\exp(-it(\lambda - iS_g))\big|_{\lambda = E_j + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}}\Big| = \exp(-t(S_g + C_S\,g^{2+\varepsilon})) \le \exp(-S_g t). \]
Thus, we conclude
\[ |\tilde M_{\pm i\vartheta_g}\big(\varphi_j(\bar\theta_g)\otimes\Omega\,,\,\varphi_j(\theta_g)\otimes\Omega\big)| \le \mathrm{const}\cdot\exp(-S_g t) \]

Fig. 6. Applying Cauchy's integral formula and residue theorem. (Labels in the figure: the points $E_j - \omega$, $E_j$, $E_j + \omega$ on the real axis; the pole $\tilde E_j = E_j + \mathrm{Re}(O(g^2)) - iC_S\,g^{2+\varepsilon}$; the lower contour $-iR + I(\omega)$, $R \to \infty$.)
(note that we did not make use of the sign of $\mathrm{Im}(\theta_g)$). This proves the lemma.

Appendix A. Dilatation Analyticity of the System

In this appendix, we study the analytic property of the resolvent $z \mapsto \langle\varphi\,|\,(H_g - z)^{-1}\psi\rangle_{\mathcal H}$ across the real axis for suitable dilatation analytic states $\varphi, \psi \in \mathcal H$. The technique of dilatation analyticity as discussed in [14, 15] provides us with an analytic continuation.

Definition A.1 (Dilatation Group). On the Hilbert space $\mathcal H$ we introduce a strongly continuous one-parameter group of unitary operators $U(\theta)$, $\theta \in \mathbb{R}$, the Dilatation Group, which scales the position variables of the electrons and the photon momenta by $\vec x \mapsto e^{\theta}\vec x$ and $\vec k \mapsto e^{-\theta}\vec k$, respectively. For $\psi \in \mathcal H$, and a closed operator $A$ on $\mathcal H$ we abbreviate
\[ \psi(\theta) := U(\theta)\psi, \qquad A(\theta) := U(\theta)\,A\,U(\theta)^{-1}. \tag{A.1} \]
We collect the dilatation properties of the operators involved in our analysis:
\[ H_{el}(\theta) = e^{-2\theta}(-\Delta_x) + e^{-\theta}V_C(x), \qquad H_f(\theta) = e^{-\theta}H_f, \tag{A.2} \]
\[ H_0(\theta) = e^{-2\theta}(-\Delta_x\otimes\mathbf{1}_{\mathcal F}) + e^{-\theta}\big(V_C(x)\otimes\mathbf{1}_{\mathcal F} + \mathbf{1}_{\mathcal H_{el}}\otimes H_f\big). \tag{A.3} \]
Note that the homogeneity of the Coulomb potential under dilatation is due to the fact that we consider an atom, i.e. we deal with a single nucleus at the origin. The perturbation $W_g \equiv W_g(\theta = 0)$ transforms under dilatation as follows:
\[ W_g(\theta) = U(\theta)\,W_g\,U(\theta)^{-1} = \sum_{m+n\le 2}g^{m+n}\,W_{m,n}(\theta). \tag{A.4} \]

Definition A.2 (Dilatation Analyticity). A vector $\psi \in \mathcal H$ is called dilatation analytic, if and only if the Hilbert space valued map $\mathbb{R} \ni \theta \mapsto \psi(\theta)$ has an analytic continuation in a neighborhood of zero. A closed operator $A$ on $\mathcal H$ is called dilatation analytic if and only if $U_r \ni \theta \mapsto A(\theta)$ defines a holomorphic family of type (A) (in the sense of [16, Chap. VII.2.1]) in a neighborhood of zero.

The advantage of the concept of dilatation analytic operators and vectors lies in the following identities:
\[ [A(\bar\theta)]^* = A^*(\theta), \tag{A.5} \]
\[ \langle\phi\,|\,A\psi\rangle_{\mathcal H} = \langle\phi(\bar\theta)\,|\,A(\theta)\psi(\theta)\rangle_{\mathcal H}, \tag{A.6} \]
which hold for dilatation analytic operators $A$, $A^*$ and vectors $\phi$, $\psi$ and $\theta \in \mathbb{C}$ sufficiently close to zero. By Eq. (A.6), a matrix element of an operator $A$ can be transformed into a matrix element of the dilated operator $A(\theta)$, whose properties are more accessible in various applications.
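The identity (A.6) rests on a standard two-step argument, sketched here for convenience (the function name $f$ is introduced only for this illustration): unitarity of $U(\theta)$ gives the identity for real $\theta$, and analyticity extends it to complex $\theta$.

```latex
% f is an illustrative name; \phi, \psi, A are dilatation analytic as in Definition A.2.
\begin{align*}
\langle\phi\,|\,A\psi\rangle_{\mathcal H}
 &= \langle U(\theta)\phi\,|\,U(\theta)\,A\,U(\theta)^{-1}\,U(\theta)\psi\rangle_{\mathcal H}
  = \langle\phi(\theta)\,|\,A(\theta)\psi(\theta)\rangle_{\mathcal H},
  \qquad \theta\in\mathbb{R}, \\
f(\theta) &:= \langle\phi(\bar\theta)\,|\,A(\theta)\psi(\theta)\rangle_{\mathcal H}
  \ \text{ is analytic near } 0 \ \text{ and } \ f(\theta) = \langle\phi\,|\,A\psi\rangle_{\mathcal H}
  \ \text{ for real } \theta
  \;\Longrightarrow\; f \equiv \langle\phi\,|\,A\psi\rangle_{\mathcal H}.
\end{align*}
```

The complex conjugate $\bar\theta$ in the left slot is what makes $f$ analytic rather than anti-analytic, since the inner product is antilinear in its first argument.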
The dilatation analyticity of the unperturbed Hamiltonian $H_0$ follows directly from Eq. (A.3). An immediate implication of Lemma 1.2, Eq. (A.4) and the analyticity of $\theta \mapsto w_{m,n}(k_1, \dots, k_{m+n}, \theta)\,\psi$ is the dilatation analyticity of $W_g(\theta)$ and of $H_g(\theta)$:

Lemma A.3. If $g < \tfrac{1}{10}(1 + \Lambda_0^2 + \Lambda_{-1}^2)^{-1/2}$ and $|\theta| < \pi/4$, then the map
\[ W_g : U_{\pi/4} \to \mathcal{B}(D(H_0); \mathcal H), \qquad \theta \mapsto W_g(\theta) \]
is analytic with respect to the graph norm on $D(H_0)$. In particular, $W_g(\theta)$ and $H_g(\theta)$ are dilatation analytic.

For the electron system, the analyticity properties can be quoted:

Lemma A.4. (a) Let $\varphi_k \in \mathcal H_{el}$ be an eigenstate of $H_{el}$ corresponding to the eigenvalue $E_k$. Then $\varphi_k$ is dilatation analytic. The dilated state $\varphi_k(\theta)$ is an eigenstate of the dilated operator $H_{el}(\theta)$ corresponding to $E_k$.
(b) Let $P_{el}^{k} = \sum_{l=1}^{N_k}|\varphi_k^l\rangle\langle\varphi_k^l|$ be the orthogonal rank $N_k$ projection onto the eigenspace of $H_{el}$ corresponding to the eigenvalue $E_k$. Then $P_{el}^{k}$ is dilatation analytic and
\[ P_{el}^{k}(\theta) = \sum_{l=1}^{N_k}|\varphi_k^l(\theta)\rangle\langle\varphi_k^l(\bar\theta)| \tag{A.7} \]
is the (not necessarily orthogonal) finite rank projection onto the eigenspace of $H_{el}(\theta)$ corresponding to $E_k$.

Proof. We refer to [15, Theorem 1].

The spectrum of $H_0(\theta)$ and $H_g(\theta)$ can be located as follows.

Theorem A.5. The spectrum of the dilated electronic Hamiltonian $H_{el}(\theta)$ is given by
\[ \sigma(H_{el}(\theta)) = \{E_k \mid k \in \mathbb{N}_0\}\cup S_{H_{el}}(\theta), \tag{A.8} \]
where
\[ S_{H_{el}}(\theta) \subseteq \tilde S := \Sigma + \{re^{-i\vartheta} \mid r \ge 0,\ \vartheta \text{ between } 0 \text{ and } 2\,\mathrm{Im}(\theta)\}. \tag{A.9} \]
Thus, the spectrum of $H_0(\theta) = H_{el}(\theta)\otimes\mathbf{1}_{\mathcal F} + e^{-\theta}\,\mathbf{1}_{\mathcal H_{el}}\otimes H_f$ is given by
\[ \sigma(H_0(\theta)) = \{E_k + re^{-i\,\mathrm{Im}(\theta)} \mid k \in \mathbb{N}_0,\ r \ge 0\}\cup S_{H_0}(\theta), \tag{A.10} \]
with $S_{H_0}(\theta) \subseteq \tilde S$ (see Fig. 7).
Fig. 7. Spectrum of the unperturbed, dilated Hamiltonian $H_0(\theta)$, $\mathrm{Im}(\theta) > 0$. (Labels in the figure: the eigenvalues $E_0, E_1, \dots, E_j, E_{j+1}, \dots$; the threshold $\Sigma$; the sector $\tilde S$; the angle $\mathrm{Im}(\theta)$.)

Proof. Equation (A.8) is proven in [15, Lemmas 1 and 3]. The remaining assertions are a simple consequence of $\sigma(e^{-\theta}H_f) = e^{-i\,\mathrm{Im}(\theta)}[0,\infty)$ and the functional calculus.

Theorem A.6. Fix $0 < \varepsilon < 2/9$ and define
\[ A_\varepsilon := (E_j - \delta/2,\ E_j + \delta/2) + i[-S_g, \infty), \tag{A.11} \]
\[ S_g := g^2\Gamma - g^{2+\varepsilon}C_S = O(g^2) \tag{A.12} \]
with suitable constants $\Gamma, C_S > 0$, defined as in [1] (note that $S_g > 0$ for $g$ sufficiently small). Let $g > 0$, $\mathrm{Im}(\theta) > 0$ and $g^{\varepsilon} \le \mathrm{const}\,\mathrm{Im}(\theta)^2$ and $g^{2-2\varepsilon} \le \mathrm{const}\,\mathrm{Im}(\theta)$ where the constants are suitably chosen. Then
\[ A_\varepsilon\cap\sigma(H_g(\theta)) = \emptyset \tag{A.13} \]
and $\mathbb{C}^{+} \ni z \mapsto \langle\varphi\,|\,(H_g - z)^{-1}\psi\rangle_{\mathcal H}$ has an analytic continuation across the real axis into the lower half plane for all dilatation analytic vectors $\varphi, \psi \in \mathcal H$. In particular, there are no eigenvalues of $H_g$ in $A_\varepsilon$.

Proof. (see [1, Lemmas III.14 and III.15]).

The value $S_g$ approximates the imaginary part of the resonance corresponding to $E_j$. It enters the estimates of the time decay of the metastable state as the reciprocal life-time (see also Fig. 5).

Appendix B. The Pull-Through Formula

A useful tool for calculations with the free field Hamiltonian and creation and annihilation operators is the so-called Pull-Through Formula, which can easily be derived from functional calculus (see, e.g., [2, Lemma A.1]):

Lemma B.1 (Pull-Through Formula). Let $F : \mathbb{R}_+ \to \mathbb{C}$ be a measurable function, obeying $F(r) = O(r)$. Then $F(H_f)$ is defined on $D(H_f)$ and
\[ a(k)\,F(H_f) = F(H_f + \omega(\vec k))\,a(k), \qquad F(H_f)\,a(k)^* = a(k)^*\,F(H_f + \omega(\vec k)). \tag{B.1} \]
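The simplest instance of (B.1) can be verified by hand. The sketch below assumes the usual representation $H_f = \int\omega(\vec k')\,a^*(k')a(k')\,dk'$ and the canonical commutation relations $[a(k), a^*(k')] = \delta(k-k')$, $[a(k), a(k')] = 0$, which are not restated in this section; it covers $F(r) = r$ and, by induction, monomials:

```latex
% Assumes H_f = \int \omega(\vec k') a^*(k') a(k') dk' and the canonical commutation relations.
\begin{align*}
a(k)\,H_f
 &= \int\omega(\vec k')\,a(k)\,a^*(k')\,a(k')\,dk'
  = \int\omega(\vec k')\,\big[a^*(k')\,a(k) + \delta(k-k')\big]\,a(k')\,dk' \\
 &= \Big(\int\omega(\vec k')\,a^*(k')\,a(k')\,dk'\Big)a(k) + \omega(\vec k)\,a(k)
  = \big(H_f + \omega(\vec k)\big)\,a(k), \\
a(k)\,H_f^{\,m} &= \big(H_f + \omega(\vec k)\big)^{m}\,a(k)
  \qquad (m\in\mathbb{N},\ \text{by induction}).
\end{align*}
```

The statement for general measurable $F$ is the content of the cited functional-calculus argument; the second identity in (B.1) follows from the first by taking adjoints.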
Acknowledgments

The author thanks V. Bach, F. Baldus, J. Lutgen, M. Schneider, I. M. Sigal, U. Staude, T. Weth, and H. Zenk for helpful discussions. He is especially indebted to V. Bach, J. Lutgen and I. M. Sigal for careful proofreading.

References

[1] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Commun. Math. Phys. 207 (1999), 249–290.
[2] V. Bach, J. Fröhlich and I. M. Sigal, Renormalization group analysis of spectral problems in quantum field theory, Adv. Math. 137 (1998), 205–298.
[3] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998), 299–395.
[4] W. Hunziker, Resonances, metastable states and exponential decay laws in perturbation theory, Commun. Math. Phys. 132 (1990), 177–188.
[5] C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Photons and Atoms: Introduction to Quantum Electrodynamics (John Wiley & Sons, New York, 1991).
[6] M. Hübner and H. Spohn, Radiative decay: Nonperturbative approaches, Rev. Math. Phys. 7 (1995), 363–387.
[7] M. Hübner and H. Spohn, Spectral properties of the spin-boson Hamiltonian, Ann. Inst. H. Poincaré 62 (1995), 289–323.
[8] E. Skibsted, Spectral analysis of n-body systems coupled to a bosonic field, Rev. Math. Phys. 10(7) (1997), 989–1026.
[9] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators (Springer Verlag, Berlin, Heidelberg, New York, 1987).
[10] G. M. Zishlin, Discussion of the spectrum of the Schrödinger operator for systems of many particles, Tr. Mosk. Math. O. -va 9 (1960), 81–120.
[11] E. H. Lieb, Bound on the maximum negative ionization of atoms and molecules, Phys. Rev. A29 (1984), 3018–3028.
[12] E. Merzbacher, Quantum Mechanics (John Wiley & Sons, New York, 1964).
[13] M. Reed and B. Simon, Methods of Modern Mathematical Physics II: Fourier Analysis, Self-Adjointness (Academic Press, 1975).
[14] J. Aguilar and J. M. Combes, A class of analytic perturbations for one-body Schrödinger Hamiltonians, Commun. Math. Phys. 22 (1971), 269–279.
[15] E. Balslev and J. M. Combes, Spectral properties of Schrödinger operators with dilatation analytic potentials, Commun. Math. Phys. 22 (1971), 280–294.
[16] T. Kato, Perturbation theory of linear operators, in Die Grundlehren der Mathematischen Wissenschaften, Vol. 132 (Springer Verlag, New York, 1966).
March 10, 2004 14:17 WSPC/148-RMP
00191
Reviews in Mathematical Physics Vol. 16, No. 1 (2004) 29–123 c World Scientific Publishing Company
SCATTERING OF MASSLESS DIRAC FIELDS BY A KERR BLACK HOLE
DIETRICH HÄFNER∗ and JEAN-PHILIPPE NICOLAS†
M.A.B., UMR CNRS no 5466, Institut de Mathématiques de Bordeaux, Université Bordeaux 1, 351 cours de la Libération, 33405 Talence cedex, France
∗[email protected]
†[email protected]
Received 15 April 2003
Revised 13 November 2003

For the massless Dirac equation outside a slow Kerr black hole, we prove asymptotic completeness. We introduce a new Newman–Penrose tetrad in which the expression of the equation contains no artificial long-range perturbations. The main technique used is then a Mourre estimate. The geometry near the horizon requires us to apply a unitary transformation before we find ourselves in a situation where the generator of dilations is a good conjugate operator. The results are eventually re-interpreted geometrically to provide the solution to a Goursat problem on the Penrose compactified exterior.

Keywords: Dirac equation; general relativity; Goursat problem; Kerr metric; scattering theory.
Contents
1. Introduction
2. The Kerr Metric and Dirac's Equation
2.1. The Kerr metric
2.2. Dirac's and Weyl's equations in the Newman–Penrose formalism
2.3. A choice of null tetrad and the calculation of the spin coefficients
2.4. Calculation and first simplifications of Weyl's equation
2.5. Further simplifications of the equation
2.5.1. A new Newman–Penrose tetrad adapted to the foliation
2.5.2. The new expressions of Weyl's equation and the conserved quantity
2.6. The main theorems for the Kerr framework
3. Abstract Analytic Framework
3.1. Symbol classes
3.2. Technical results
3.3. The reference Dirac operator
3.4. The perturbed Dirac operator
3.5. Asymptotic dynamics
4. Some Fundamental Properties of our Dirac-Type Hamiltonians
4.1. Description of the domains
4.2. Resolvent estimates
4.3. Absence of eigenvalues for $\slashed{D}_\nu$, $\nu \in \mathbb{N}$
5. The Mourre Estimate
5.1. Preliminary remarks
5.2. The abstract setting of Mourre theory
5.3. Technical results
5.4. Conjugate operator for $\slashed{D}$
5.5. The Mourre estimate for $\slashed{D}$
5.6. The Mourre estimate for $\slashed{D}_\nu$, $\nu \in \mathbb{N}$
6. Asymptotic Completeness
6.1. Technical results
6.2. Comparison with the intermediate dynamics
6.3. Technical results concerning the separable problem
6.4. Comparison with the asymptotic dynamics
7. Proof of the Theorems of Sec. 2.6
7.1. Absence of point spectrum
7.2. Compatibility of the general analytic framework
7.3. Proof of Theorems 2.1–2.3
7.3.1. Scattering theory in terms of cut-off functions
7.3.2. Asymptotic velocity (proof of Theorem 2.1)
7.3.3. Proof of Theorems 2.2 and 2.3
8. Geometric Interpretation
8.1. Theorem 2.2 in terms of principal null geodesics
8.2. Inverse wave operators at the horizon as trace operators
8.2.1. Kerr-star and star-Kerr coordinates
8.2.2. Kruskal–Boyer–Lindquist coordinates
8.2.3. Interpretation of the traces on the horizon
8.3. Inverse wave operators at infinity as trace operators
8.3.1. Penrose compactification of block I
8.3.2. Interpretation of the traces on $\mathscr{I}^{\pm}$
8.4. The Goursat problem
Acknowledgments
References
1. Introduction Black holes are the cosmological objects in which the effects of gravity are the most extreme. In the 1960s and the 1970s, some striking phenomena related to black holes were discovered, among which were the Hawking radiation and superradiance. A complete mathematical understanding of these phenomena is far from being achieved yet; it requires a detailed study of the propagation and scattering properties of classical and quantum fields on black hole space-times. The first and simplest solution of the Einstein vacuum equations describing a black hole is the Schwarzschild metric, discovered in 1916 by Karl Schwarzschild [53]. It represents an asymptotically flat space-time containing nothing but a static, spherically symmetric, uncharged black hole. The Kerr family of metrics, discovered in 1963 by Roy Patrick Kerr [35], is a set of solutions of the Einstein vacuum equations generalizing the Schwarzschild metric. A subset of this family, referred to as slow Kerr metrics, describes an asymptotically flat space-time containing nothing
March 10, 2004 14:17 WSPC/148-RMP
00191
Scattering of Massless Dirac Fields by a Kerr Black Hole
but an eternal, uncharged, rotating black hole. This provides the realistic model for the exterior of a black hole (all cosmological objects are in rotation). The scattering properties of classical and quantum fields outside a Schwarzschild black hole have been thoroughly studied. The first results on the subject were obtained by Dimock in 1985 [18] and by Dimock and Kay in 1986 and 1987 [19–21] for classical and quantum scalar fields. This work was pushed further by Bachelot in the 1990s; his important series of papers starts with scattering theories for classical fields, Maxwell in 1991 [2] and Klein–Gordon in 1994 [3], and culminates with a rigorous mathematical description of the Hawking effect for a spherical gravitational collapse in 1997 [4], 1999 [5] and 2000 [6]. Meanwhile other authors contributed to the subject, such as Nicolas in 1995 with a scattering theory for classical massless Dirac fields [44], Jin in 1998 with a construction of wave operators in the massive case [34] and Melnyk in 2003, who obtained a complete scattering theory for massive charged Dirac fields [39] and the Hawking effect for charged, massive spin 1/2 fields [40]. Note that in [6, 39, 40, 44], the cases of Reissner–Nordstrøm (charged) and de Sitter (with a cosmological horizon) black holes are also treated; these geometries do not fundamentally change the analytic difficulties in the construction of classical or quantum scattering theories. All these works use trace class perturbation methods and therefore cannot be extended to the Kerr case because of the lack of symmetry of the geometry (see below). One paper by De Bièvre, Hislop and Sigal using different techniques appeared in 1992 [14]: by means of a Mourre estimate, they studied the wave equation on non-compact Riemannian manifolds; possible applications are therefore static situations, such as the Schwarzschild case, which they treat, but the Kerr geometry is not even stationary and the results cannot be applied.
A complete scattering theory for the wave equation, on stationary, asymptotically flat space-times, was subsequently obtained by Häfner in 2001 using the Mourre theory [31]. The theory of resonances is well understood in the Schwarzschild geometry, thanks to works by Bachelot and Motet-Bachelot in 1993 [7] and Sá Barreto and Zworski in 1998 [52]. There is also a work on a nonlinear Klein–Gordon equation on the Schwarzschild metric (and other similar geometries) with partial scattering results obtained by conformal methods, due to Nicolas in 1995 [43].

In the more realistic framework of Kerr black holes, the analysis of the scattering properties of fields is faced with three fundamental difficulties, not present in the Schwarzschild framework.

(1) Lack of symmetry. The Kerr solutions possess only two commuting Killing vector fields. In the Boyer–Lindquist coordinate system $(t, r, \theta, \varphi)$, based on these Killing vector fields, they are interpreted as the time coordinate vector field $\partial/\partial t$ and the longitude coordinate vector field $\partial/\partial\varphi$. Kerr space-time therefore has cylindrical, but not spherical, spatial symmetry. This prevents a straightforward decomposition in spin-weighted spherical harmonics, which would reduce the problem to the study of a (1 + 1)-dimensional evolution system with potential. The trace-class perturbation methods used in the Schwarzschild
D. Häfner & J.-P. Nicolas
case are in consequence not applicable. Another effect of the lack of spherical symmetry is the presence of artificial long-range terms at infinity in the field equations. To get rid of these terms, it is necessary to have a deeper understanding of the geometry, and of the dynamics naturally associated with the conformal structure, than what is required in the Schwarzschild case.

(2) The point of view of scattering theory is that of an observer static at infinity. Such an observer perceives the propagation of a field outside the black hole as an evolution on a cylindrical manifold $\Sigma \simeq \mathbb{R} \times S^2$, with one asymptotically flat end corresponding to infinity and one exponentially large (i.e. asymptotically hyperbolic, see Remark 3.3) end representing the horizon. In the absence of spherical symmetry, the exponentially large end is awkward for scattering theory, more particularly for the choice of a conjugate operator in the framework of Mourre theory. The generator of dilations, that is the usual conjugate operator, cannot be used here.

(3) Kerr space-time is not stationary; there exists no globally defined timelike Killing vector field outside the black hole. In particular, the vector $\partial/\partial t$ is spacelike in a toroidal region, called the ergosphere, surrounding the horizon. For field equations of integral spin, such as the wave equation, Klein–Gordon or Maxwell, this means that no positive definite conserved energy exists, which allows fields to extract energy from the ergosphere, a phenomenon referred to as superradiance. For field equations of half integral spin (Weyl, Dirac or Rarita–Schwinger), we have a conserved $L^2$ norm, there is no superradiance and the lack of stationarity is not in itself a serious difficulty. This conserved $L^2$ norm is usually interpreted as a conserved charge.
It is the right conserved quantity to work with because the field energy, which is the quadratic form associated with the Hamiltonian operator, is not positive definite for these equations (see also Remark 2.1 for the Dirac case).

Because of the geometric complexity of the Kerr metric and the three difficulties mentioned above, analytic studies of the propagation of fields outside a Kerr black hole are few. In particular the complete understanding of superradiance in terms of time-dependent scattering is a major open problem. Chandrasekhar’s fundamental work [12] systematically uses the Newman–Penrose formalism to develop stationary scattering theories and describe superradiance in terms of transmission and reflection coefficients. As for time-dependent scattering, to our knowledge, the only result in the Kerr framework is Häfner’s paper [32]; it is a proof of asymptotic completeness for the non-superradiant modes of the Klein–Gordon equation. In this work, the first two difficulties are present, but the third is avoided by the restriction to non-superradiant modes. Some analytic results have also been obtained outside the scope of scattering theory: the existence of smooth solutions for Dirac’s and Maxwell’s equations was shown on generic space-times by De Vries [16, 17] with application to the Kerr metric where the existence of superradiance for Maxwell and its
absence for Dirac are obtained; one of us (Nicolas) has published a generic analytic study of the evolution of Dirac fields in Sobolev and weighted Sobolev spaces, with applications to the Kerr metric and its maximal analytic extension [46], as well as a work on a nonlinear Klein–Gordon equation, proving the well-posedness of the minimum regularity Cauchy problem and, by means of a Penrose compactification, the existence of smooth asymptotic profiles for smooth solutions [47]; there is also a paper by Finster et al. [23] on the time decay of Dirac fields.

In this work we develop a complete scattering theory for massless Dirac fields outside a slow Kerr black hole; this is, to our knowledge, the first complete scattering theory on the Kerr background. The choice of Dirac fields, with their conserved $L^2$ norm, has the advantage of avoiding the third difficulty. The spinorial aspect, however, requires a better understanding of the first two difficulties than is necessary for the Klein–Gordon equation. The paper is organized as follows:

• Section 2 is devoted to the presentation of the Kerr metric, of the Dirac equation on it and of our main results. We begin with a brief description of the Kerr metric, then we give the expression of Dirac’s equation in the two-spinor formalism of Penrose and Rindler (see [50]). The Newman–Penrose formalism allows us to transform this intrinsic expression into a system of partial differential equations with respect to a coordinate basis. For this purpose, we choose Kinnersley’s tetrad, which is the one commonly used. The resulting system contains artificial long-range terms. In order to get rid of these terms, we introduce a new tetrad closely related to the local rotation of space-time.
The section ends with the statement of the main theorems of this work; they express the existence and completeness of classical wave operators for two types of simplified dynamics: asymptotic profiles and Hamiltonians of Dirac type involving the Dirac operator on the 2-sphere. Sections 3 to 7 contain the proofs of the theorems of Sec. 2.
• In Sec. 3, we define an abstract analytic framework, generalizing Dirac’s equation outside a Kerr black hole, by retaining only the analytic features relevant to scattering theory. Then some simplified asymptotic comparison Hamiltonians are defined, for both asymptotic regions, in the general setting.
• Section 4 contains intermediate technical results necessary for the scattering theory.
• In Sec. 5, after recalling the basic principles of Mourre theory, we prove the fundamental Mourre estimate. The generator of dilations cannot be used as conjugate operator because of the difficulty related to the asymptotic end corresponding to the horizon. However, it is possible to define a unitary transformation leading to a situation where it is a good conjugate operator. The correct conjugate operator is then defined by conjugating the generator of dilations by this unitary transformation; it is similar to the operator introduced by Froese and Hislop [24],
but the arguments used to prove the Mourre estimate are different (Froese and Hislop’s argument is not adapted to Dirac’s equation).
• Once the Mourre estimate is established, the asymptotic completeness follows by standard arguments described in Sec. 6.
• Section 7 opens with a proof of the absence of eigenvalues for the Hamiltonian of the massless Dirac equation on the Kerr metric; this is a straightforward consequence of Teukolsky’s separation of variables in the equation. Then, we construct asymptotic velocities using the asymptotic completeness results of Sec. 6. Finally, the theorems of Sec. 2 are obtained as consequences of this construction, the absence of eigenvalues and the results of Sec. 6.
• Section 8 is a re-interpretation of the results of Sec. 2 in geometrical terms. The inverse wave operators are understood as trace operators on smooth null hypersurfaces at the boundary of the Penrose compactification of the exterior of the black hole. The full scattering theory is thus realized as the solution of a Goursat problem on the compactified exterior, with null data specified on a union of two such smooth hypersurfaces, singular at the junction.

In the massive charged case on Kerr–Newman backgrounds for the classical equation, the full dynamics is a short-range perturbation of an intermediate spherically symmetric dynamics. This intermediate dynamics is a one-dimensional Dirac equation with a long-range (at infinity) matrix-valued potential. This will require the introduction of a Dollard modification in the wave operators. This case is currently under study. The purpose of the present paper is to solve the geometrical difficulties of the scattering of Dirac fields outside a rotating black hole. All such difficulties are already present in the case of massless Dirac fields on a Kerr background. In particular, the Mourre theory developed here should hold without modification in the charged massive case. Note however that the geometrical interpretation of Sec.
8 is highly dependent on the massless, chargeless aspect. Indeed, in the massive or charged case, the equation is no longer conformally invariant and the conformal constructions fail. The reverse problem, consisting of solving the Goursat problem on a compactified space-time in order to extract a scattering theory, is under study with a first work on asymptotically simple space-times [38]. The results of the present paper will be used in a subsequent work to develop a quantum scattering theory for the Dirac equation on the Kerr metric. It is at this quantum level that the effects of the non-stationarity of space-time will appear (see Remark 2.1).

Notations. Many of our equations will be expressed using the two-component spinor notations and abstract index formalism of Penrose and Rindler [50]. Abstract indices are denoted by light face latin letters, capital for spinor indices and lower case for tensor indices. Abstract indices are a notational device for keeping track of the nature of objects in the course of calculations; they do not imply any reference to a coordinate basis, and all expressions and calculations involving them are perfectly intrinsic. For example, $g_{ab}$ will refer to the space-time metric as an intrinsic
symmetric tensor field of valence $\left[\begin{smallmatrix} 0 \\ 2 \end{smallmatrix}\right]$, i.e. a section of $T^*\mathcal{M} \odot T^*\mathcal{M}$, and $g^{ab}$ will refer to the inverse metric as an intrinsic symmetric tensor field of valence $\left[\begin{smallmatrix} 2 \\ 0 \end{smallmatrix}\right]$, i.e. a section of $T\mathcal{M} \odot T\mathcal{M}$ (where $\odot$ denotes the symmetric tensor product, $T\mathcal{M}$ the tangent bundle to our space-time manifold $\mathcal{M}$ and $T^*\mathcal{M}$ its cotangent bundle). Concrete indices defining components in reference to a basis are represented by bold face latin letters. Concrete spinor indices, denoted by bold face capital latin letters, take their values in $\{0, 1\}$ while concrete tensor indices, denoted by bold face lower case latin letters, take their values in $\{0, 1, 2, 3\}$. Consider for example a basis of $T\mathcal{M}$, that is a family of four smooth vector fields on $\mathcal{M}$: $\mathcal{B} = \{e_0, e_1, e_2, e_3\}$ such that at each point $p$ of $\mathcal{M}$ the four vectors $e_0(p)$, $e_1(p)$, $e_2(p)$, $e_3(p)$ are linearly independent, and the corresponding dual basis of $T^*\mathcal{M}$: $\mathcal{B}^* = \{e^0, e^1, e^2, e^3\}$ such that $e^{\mathbf{a}}(e_{\mathbf{b}}) = \delta^{\mathbf{a}}_{\mathbf{b}}$, $\delta^{\mathbf{a}}_{\mathbf{b}}$ denoting the Kronecker symbol; $g_{\mathbf{ab}}$ will refer to the components of the metric $g_{ab}$ in the basis $\mathcal{B}$: $g_{\mathbf{ab}} = g(e_{\mathbf{a}}, e_{\mathbf{b}})$ and $g^{\mathbf{ab}}$ will denote the components of the inverse metric $g^{ab}$ in the dual basis $\mathcal{B}^*$, i.e. the $4 \times 4$ real symmetric matrices $(g_{\mathbf{ab}})$ and $(g^{\mathbf{ab}})$ are the inverse of one another. In the abstract index formalism, the basis vectors $e_{\mathbf{a}}$, $\mathbf{a} = 0, 1, 2, 3$, are denoted $e_{\mathbf{a}}{}^a$ or $g_{\mathbf{a}}{}^a$. In a coordinate basis, the basis vectors $e_{\mathbf{a}}$ are coordinate vector fields and will also be denoted by $\partial_{\mathbf{a}}$ or $\frac{\partial}{\partial x^{\mathbf{a}}}$; the dual basis covectors $e^{\mathbf{a}}$ are coordinate 1-forms and will be denoted by $dx^{\mathbf{a}}$. We adopt Einstein’s convention for the same index appearing twice, once up, once down, in the same term. For concrete indices, the sum is taken over all the values of the index. In the case of abstract indices, this signifies the contraction of the index, i.e. $f_a V^a$ denotes the action of the 1-form $f_a$ on the vector field $V^a$. The indexed 1-form $dx^a \in T^*\mathcal{M} \otimes \mathbb{S}^A \otimes \mathbb{S}^{A'}$ and the indexed vector $\partial_a \in T\mathcal{M} \otimes \mathbb{S}_A \otimes \mathbb{S}_{A'}$ (see Subsec.
2.2 for the meaning of the notations $\mathbb{S}^A$, $\mathbb{S}_A$, $\mathbb{S}^{A'}$ and $\mathbb{S}_{A'}$) are used to suppress form and vector abstract indices: $dx^a$ maps the 1-form $\omega_a$ as an indexed quantity to the same 1-form $\omega = \omega_a\, dx^a$ with its index suppressed, $\partial_a$ maps the vector field $V^a$ to the same vector field $V = V^a \partial_a$ with its index suppressed. For a manifold $Y$ we denote by $C_b^\infty(Y)$ the set of all $C^\infty$ functions on $Y$, that are bounded together with all their derivatives. We denote by $C_\infty(Y)$ the set of all continuous functions tending to zero at infinity.

2. The Kerr Metric and Dirac’s Equation

2.1. The Kerr metric

Kerr’s space-time is described in terms of Boyer–Lindquist coordinates as the manifold $\mathcal{M} = \mathbb{R}_t \times \mathbb{R}_r \times S^2_\omega$ equipped with the Lorentzian metric
$$
g = \left(1 - \frac{2Mr}{\rho^2}\right) dt^2 + \frac{4aMr \sin^2\theta}{\rho^2}\, dt\, d\varphi - \frac{\rho^2}{\Delta}\, dr^2 - \rho^2\, d\theta^2 - \frac{\sigma^2}{\rho^2} \sin^2\theta\, d\varphi^2 , \tag{2.1}
$$
$$
\rho^2 = r^2 + a^2 \cos^2\theta , \qquad \Delta = r^2 - 2Mr + a^2 ,
$$
$$
\sigma^2 = (r^2 + a^2)\rho^2 + 2Mra^2 \sin^2\theta = (r^2 + a^2)^2 - \Delta a^2 \sin^2\theta ,
$$
where $M$ is the mass of the black hole and $a$ its angular momentum per unit mass. If $|a|$ is not too large, $(\mathcal{M}, g)$ is an asymptotically flat universe containing nothing but an eternal, uncharged, rotating black hole. For no value of $r$ is the sphere $\{r\} \times S^2_{\theta,\varphi}$ reduced to a point, which justifies the extension of the variable $r$ to the whole real axis. The expression (2.1) of the Kerr metric has two types of singularities. The set of points $\{\rho^2 = 0\}$ (the equatorial ring $\{r = 0, \theta = \pi/2\}$ of the $\{r = 0\}$ sphere) is a true curvature singularity. The spheres where $\Delta$ vanishes, called horizons, are mere coordinate singularities. Using appropriate coordinate systems, they are understood as regular null hypersurfaces that can be crossed one way but would require speeds greater than that of light to be crossed the other way, hence their name: event horizons. The black hole is the part of our space-time lying beyond an event horizon. There are three types of Kerr space-times according to the number of horizons (which depends on the respective importance of $M$ and $a$).
• Slow Kerr space-time for $0 < |a| < M$. $\Delta$ has two real roots
$$
r_\pm = M \pm \sqrt{M^2 - a^2} , \tag{2.2}
$$
so there are two horizons, the spheres $\{r = r_-\}$ and $\{r = r_+\}$, on either side of $\{r = M\}$. The case $a = 0$ reduces to Schwarzschild’s space-time.
• Extreme Kerr space-time for $|a| = M$. $M$ is then the double root of $\Delta$ and the sphere $\{r = M\}$ is the only horizon.
• Fast Kerr space-time for $|a| > M$. $\Delta$ has no real root and the space-time has no horizon. There is no black hole in this case; the ring singularity is a naked singularity.
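The classification above, together with the two equivalent expressions of $\sigma^2$, can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names `kerr_horizons` and `sigma2_both_forms` are hypothetical, and the exact floating-point test for the extreme case is only meaningful for exactly representable inputs.

```python
import math


def kerr_horizons(M, a):
    """Classify a Kerr space-time and return the real roots of
    Delta(r) = r^2 - 2 M r + a^2, i.e. the horizon radii of Eq. (2.2)."""
    if M <= 0:
        raise ValueError("the mass M is assumed positive")
    disc = M * M - a * a
    if a == 0:
        # Schwarzschild limit: Delta = r (r - 2M)
        return "Schwarzschild", (0.0, 2.0 * M)
    if disc > 0:
        s = math.sqrt(disc)
        return "slow", (M - s, M + s)   # r_- and r_+
    if disc == 0:
        return "extreme", (M,)          # double root r = M
    return "fast", ()                   # no real root, no horizon


def sigma2_both_forms(M, a, r, theta):
    """Evaluate the two expressions of sigma^2 given after Eq. (2.1)."""
    rho2 = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    s2 = math.sin(theta) ** 2
    first = (r * r + a * a) * rho2 + 2.0 * M * r * a * a * s2
    second = (r * r + a * a) ** 2 - delta * a * a * s2
    return first, second
```

For instance, `kerr_horizons(1.0, 0.6)` reports a slow Kerr space-time with horizons close to $r_- = 0.2$ and $r_+ = 1.8$, and the two values returned by `sigma2_both_forms` agree up to rounding.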
We only work with slow Kerr metrics; they are usually considered as the generic description of a space-time containing simply a rotating uncharged black hole, since the extreme case is believed to be unstable. The two horizons separate $\mathcal{M}$ into three connected components called Boyer–Lindquist blocks: block I, denoted here $\mathcal{B}_I$, is the exterior of the black hole $\{r > r_+\}$; block II, $\{r_- < r < r_+\}$, is a dynamic region situated beyond the outer horizon and where the inertial frames are dragged towards the inner horizon; block III, $\{r < r_-\}$, is the part of space-time located beyond the inner horizon, it contains the ring singularity and a time machine called Carter’s time machine. No Boyer–Lindquist block is stationary, that is to say there exists no globally defined timelike Killing vector field on any given block. In particular, block I contains a toroidal region, called the ergosphere, surrounding the horizon,
$$
\mathcal{E} = \left\{ (t, r, \theta, \varphi) ;\ r_+ < r < M + \sqrt{M^2 - a^2 \cos^2\theta} \right\} ,
$$
where the vector $\partial/\partial t$ is spacelike. An important feature of Kerr’s space-time is that it has Petrov type D (see translation of Petrov’s original paper [51], or standard general relativity textbooks,
or [48]). This means that the Weyl tensor has two double roots at each point. These roots, referred to as the principal null directions of the Weyl tensor, are given by the two vector fields
$$
V^\pm = \frac{r^2 + a^2}{\Delta} \frac{\partial}{\partial t} \pm \frac{\partial}{\partial r} + \frac{a}{\Delta} \frac{\partial}{\partial \varphi} . \tag{2.3}
$$
Since $V^+$ and $V^-$ are (twice) repeated null directions of the Weyl tensor, by the Goldberg–Sachs theorem (see for example [48]) their integral curves define geodesic shear-free null congruences. We shall refer to the integral curves of $V^+$ (respectively $V^-$) as the outgoing (respectively incoming) principal null geodesics. Since the quantities $\rho^2$ and $\sigma^2$ are positive on block I (in fact $\rho^2$ is positive on the whole space-time, but $\sigma^2$ is negative in the time machine in block III), we denote $\rho = \sqrt{\rho^2}$ and $\sigma = \sqrt{\sigma^2}$.

Our purpose in this paper is to describe the scattering of linear massless Dirac fields by a slow Kerr black hole from the point of view of an observer static at infinity. For such observers, the exterior of the black hole is the only visible part of space-time. Besides, their perception of time is well described by the time function $t$ of the Boyer–Lindquist coordinates. The horizon will therefore appear to them as a singularity of the metric (for more details on the nature of this singularity, see for example [46] or [47]). One may tend to think that $t$ is simply a bad choice of time coordinate since it makes a regular part of space-time appear as singular. However, our choice of observer is natural in that it is a good description of a distant observer (typically, a telescope on earth aimed in the direction of a black hole) and the choice of time coordinate describes the experience of such observers. Hence, we work on $\mathcal{B}_I = \mathbb{R}_t \times ]r_+, +\infty[ \times S^2_{\theta,\varphi}$ equipped with the metric (2.1) and we shall consider Dirac’s equation as an evolution equation with respect to $t$. We denote $\Sigma$ the generic spacelike slice: $\Sigma = ]r_+, +\infty[ \times S^2_{\theta,\varphi}$ and $\Sigma_t = \{t\} \times \Sigma$.

2.2. Dirac’s and Weyl’s equations in the Newman–Penrose formalism

The function $t$ of Boyer–Lindquist coordinates is a globally defined time function on block I, i.e. its gradient $\nabla^a t$,
$$
\nabla^a t = g^{ab} \nabla_b t , \qquad \nabla_a t\, dx^a = dt ,
$$
is a smooth, timelike, non-vanishing vector field on block I (in spite of the fact that in Boyer–Lindquist coordinates $\partial/\partial t$ is not everywhere timelike). The time orientation of block I is defined by $t$, i.e. a timelike or null vector field is said to be future oriented if $t$ is increasing along its integral lines. The foliation $\{\Sigma_t\}_{t \in \mathbb{R}}$ by the level hypersurfaces $\Sigma_t = \{t\} \times \Sigma$ of the function $t$, is a foliation of block I by Cauchy hypersurfaces. Block I is therefore globally hyperbolic. In dimension 4, this entails the existence of a spin-structure (see Geroch [27–29] and Stiefel [54]). We denote by $\mathbb{S}$ (or $\mathbb{S}^A$ in the abstract index formalism) the spin bundle over $\mathcal{B}_I$ and $\bar{\mathbb{S}}$ (or $\mathbb{S}^{A'}$) the same bundle with the complex structure replaced by its opposite. The dual bundles $\mathbb{S}^*$ and $\bar{\mathbb{S}}^*$ will be denoted respectively $\mathbb{S}_A$ and $\mathbb{S}_{A'}$. The complexified tangent bundle to $\mathcal{B}_I$ is recovered as the tensor product of $\mathbb{S}$ and $\bar{\mathbb{S}}$, i.e.
$$
T\mathcal{B}_I \otimes \mathbb{C} = \mathbb{S} \otimes \bar{\mathbb{S}} \quad \text{or} \quad T^a \mathcal{B}_I \otimes \mathbb{C} = \mathbb{S}^A \otimes \mathbb{S}^{A'}
$$
and similarly
$$
T^*\mathcal{B}_I \otimes \mathbb{C} = \mathbb{S}^* \otimes \bar{\mathbb{S}}^* \quad \text{or} \quad T_a \mathcal{B}_I \otimes \mathbb{C} = \mathbb{S}_A \otimes \mathbb{S}_{A'} .
$$
An abstract tensor index $a$ is thus understood as an unprimed spinor index $A$ and a primed spinor index $A'$ clumped together: $a = AA'$. The spin bundle $\mathbb{S}$ is equipped with a canonical symplectic form, $\varepsilon_{AB}$, referred to as the Levi–Civita symbol. It is used to raise and lower spinor indices, but due to its skew-symmetry, the order is important:
$$
\varepsilon^{AB} \kappa_B = \kappa^A , \qquad \kappa^A \varepsilon_{AB} = \kappa_B .
$$
The complex conjugate $\overline{\varepsilon_{AB}} = \bar\varepsilon_{A'B'}$, simply denoted $\varepsilon_{A'B'}$, plays a similar role on $\bar{\mathbb{S}}$. These symplectic structures are compatible with the metric, more precisely
$$
g_{ab} = \varepsilon_{AB}\, \varepsilon_{A'B'} .
$$
The Dirac equation finds its simplest expression in terms of two-component spinors (sections of the bundles $\mathbb{S}^A$, $\mathbb{S}_A$, $\mathbb{S}^{A'}$ or $\mathbb{S}_{A'}$):
$$
\begin{cases}
\nabla^{AA'} \phi_A = \mu \chi^{A'} , \\
\nabla^{AA'} \chi_{A'} = \mu \phi^A ,
\end{cases}
\qquad \mu = \frac{m}{\sqrt{2}} , \tag{2.4}
$$
where $m \geq 0$ is the mass of the field. In the massless case, Eq. (2.4) reduces to the Weyl anti-neutrino equation
$$
\nabla^{AA'} \phi_A = 0 , \tag{2.5}
$$
since the equation on $\chi$ (the Weyl neutrino equation),
$$
\nabla^{AA'} \chi_{A'} = 0 ,
$$
is the complex conjugate of the anti-neutrino equation
$$
\nabla^{AA'} \bar\chi_A = 0 .
$$
Equation (2.5) is the object of this paper; we shall refer to it as the Weyl equation. The full Dirac equation (2.4) possesses a conserved current (see for example [46]) on general curved space-times, defined by the future oriented non-spacelike vector field, sum of two future oriented null vector fields:
$$
V^a = \phi^A \bar\phi^{A'} + \bar\chi^A \chi^{A'} .
$$
This implies that the total charge outside the black hole
$$
C(t) = \int_{\Sigma_t} V_a T^a\, d\mathrm{Vol} = \int_{\Sigma_t} \left( \phi_A \bar\phi_{A'} + \bar\chi_A \chi_{A'} \right) T^{AA'}\, d\mathrm{Vol} , \tag{2.6}
$$
is constant throughout time, where $T^a$ is the future oriented normal vector field to $\Sigma_t$, normalized for convenience so that $T_a T^a = 2$, and $d\mathrm{Vol}$ is the volume form induced on $\Sigma_t$ by the Kerr metric, i.e.
$$
d\mathrm{Vol} = \sqrt{\frac{\sigma^2 \rho^2}{\Delta}}\, dr\, d\omega . \tag{2.7}
$$
The quantity $C(t)$ defines a norm for $(\phi_A, \chi_{A'})$ on $\Sigma_t$ (in fact the natural $L^2$ norm, see for example [46]). This will be explained in more detail in Sec. 2.5.

Remark 2.1. Thanks to this charge conservation, the non-stationarity of space-time is not seen as a difficulty for the scattering theory of classical Dirac fields. The effects, however, do appear at the level of the physical interpretation. Let us consider the so-called Klein paradox as a toy model to explain how they can be seen:
$$
i \partial_t \psi = (\alpha D_r + \beta m + V) \psi ,
$$
with
$$
V \in C_b^\infty(\mathbb{R}) , \qquad \lim_{r \to -\infty} V(r) = 0 , \qquad \lim_{r \to +\infty} V(r) = U > 0 .
$$
If $U > 2m$, then a particle whose energy is between $m$ and $-m + U$ will propagate near $-\infty$ as an electron and near $+\infty$ as a positron. It is therefore natural to ask whether there is creation of particles in this situation, i.e. whether eternal rotating black holes create particles. Most physicists claim that there is in fact creation of particles (see for example [13]), but the mathematical proof is still missing. It is clear that such a mathematical proof can only be given in a second quantized, many particle framework, and it would require the use of the classical scattering results proved in this paper. The Klein paradox has been studied from a mathematical point of view by Bongaarts and Ruijsenaars [9, 10]; they show that the classical scattering matrix cannot be implemented as a unitary operator in the Fock space of the free fields.

Using the Newman–Penrose formalism, Eq. (2.4) can be expressed as a system of partial differential equations with respect to a coordinate basis. This formalism is based on the choice of a null tetrad, i.e. a set of four vector fields $l^a$, $n^a$, $m^a$ and $\bar m^a$, the first two being real and future oriented, $\bar m^a$ being the complex conjugate of $m^a$, such that all four vector fields are null and $m^a$ is orthogonal to $l^a$ and $n^a$, that is to say
$$
l_a l^a = n_a n^a = m_a m^a = l_a m^a = n_a m^a = 0 . \tag{2.8}
$$
The tetrad is said to be normalized if in addition
$$
l_a n^a = 1 , \qquad m_a \bar m^a = -1 . \tag{2.9}
$$
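As a quick illustration of conditions (2.8) and (2.9), consider flat Minkowski space in spherical coordinates with signature $(+,-,-,-)$. This flat example is added here for orientation only and is not part of the construction in the paper; the tetrad below is one standard choice.

```latex
% Flat metric: g = dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta \, d\varphi^2
\begin{align*}
l^a \partial_a &= \tfrac{1}{\sqrt{2}} (\partial_t + \partial_r), &
n^a \partial_a &= \tfrac{1}{\sqrt{2}} (\partial_t - \partial_r), &
m^a \partial_a &= \tfrac{1}{r\sqrt{2}} \Big( \partial_\theta + \tfrac{i}{\sin\theta}\, \partial_\varphi \Big), \\
l_a l^a &= n_a n^a = \tfrac{1}{2} (g_{tt} + g_{rr}) = 0, &
l_a n^a &= \tfrac{1}{2} (g_{tt} - g_{rr}) = 1, \\
m_a m^a &= \tfrac{1}{2r^2} \Big( g_{\theta\theta} - \tfrac{g_{\varphi\varphi}}{\sin^2\theta} \Big) = 0, &
m_a \bar m^a &= \tfrac{1}{2r^2} \Big( g_{\theta\theta} + \tfrac{g_{\varphi\varphi}}{\sin^2\theta} \Big) = -1 .
\end{align*}
```

The mixed products $l_a m^a$ and $n_a m^a$ vanish because the metric is diagonal and $m^a$ has no $t$ or $r$ component.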
Such a null tetrad defines at each point a basis of the complexified tangent space to our manifold, in other words, the tetrad is a global section of the complexified principal bundle. The vectors $l^a$ and $n^a$ describe “dynamic” or scattering directions, i.e. directions along which light rays may escape towards infinity (or more generally asymptotic regions corresponding to scattering channels). The vector $m^a$ tends to have, at least spatially, bounded integral curves, typically $m^a$ and $\bar m^a$ generate rotations. The principle of the Newman–Penrose formalism is to decompose the covariant derivative into directional covariant derivatives along the frame vectors. To each directional derivative corresponds a standard symbol:
$$
D = l^a \nabla_a , \qquad D' = n^a \nabla_a , \qquad \delta = m^a \nabla_a , \qquad \delta' = \bar m^a \nabla_a .
$$
The connection coefficients (first order derivatives of the metric) can be organized into combinations involving only derivatives of frame vectors along frame vectors. These combinations are referred to as spin coefficients. For a normalized tetrad, there are twelve spin coefficients defined as follows (see Penrose and Rindler [50, Vol. 1, pp. 226–228])
$$
\begin{aligned}
\kappa &= m^a D l_a , & \tilde\rho &= m^a \delta' l_a , & \tilde\sigma &= m^a \delta l_a , & \tau &= m^a D' l_a , \\
\varepsilon &= \tfrac{1}{2} \left( n^a D l_a + m^a D \bar m_a \right) , & & & \alpha &= \tfrac{1}{2} \left( n^a \delta' l_a + m^a \delta' \bar m_a \right) , \\
\beta &= \tfrac{1}{2} \left( n^a \delta l_a + m^a \delta \bar m_a \right) , & & & \gamma &= \tfrac{1}{2} \left( n^a D' l_a + m^a D' \bar m_a \right) , \\
\pi &= -\bar m^a D n_a , & \lambda &= -\bar m^a \delta' n_a , & \mu &= -\bar m^a \delta n_a , & \nu &= -\bar m^a D' n_a ,
\end{aligned}
\tag{2.10--2.13}
$$
where we have denoted by $\tilde\rho$ and $\tilde\sigma$ the spin coefficients usually denoted $\rho$ and $\sigma$, in order to avoid confusion with the functions $\rho = \sqrt{\rho^2}$ and $\sigma = \sqrt{\sigma^2}$ appearing in the expression (2.1) of the Kerr metric. The spin coefficients can also be expressed in terms of the Ricci rotation coefficients $\gamma_{(a)(b)(c)}$ (see for example Chandrasekhar [12]). For this definition, the frame vectors are denoted by
$$
l^a = e_{(1)}{}^a , \qquad n^a = e_{(2)}{}^a , \qquad m^a = e_{(3)}{}^a , \qquad \bar m^a = e_{(4)}{}^a ,
$$
the dual 1-forms by
$$
l_a = e^{(1)}{}_a , \qquad n_a = e^{(2)}{}_a , \qquad m_a = e^{(3)}{}_a , \qquad \bar m_a = e^{(4)}{}_a ,
$$
and the components of tensors with respect to this frame and co-frame are denoted by light-face latin indices within brackets, e.g.:
$$
R^{(a)}{}_{(b)(c)(d)} = R^a{}_{bcd}\, e^{(a)}{}_a\, e_{(b)}{}^b\, e_{(c)}{}^c\, e_{(d)}{}^d .
$$
The Ricci rotation coefficients are defined by
$$
\gamma_{(a)(b)(c)} = \frac{1}{2} \left[ \lambda_{(a)(b)(c)} + \lambda_{(c)(a)(b)} - \lambda_{(b)(c)(a)} \right] ,
$$
$$
\lambda_{(a)(b)(c)} = \left[ \frac{\partial}{\partial x^j} e_{(b)i} - \frac{\partial}{\partial x^i} e_{(b)j} \right] e_{(a)}{}^i\, e_{(c)}{}^j
$$
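and the expression of the spin-coefficients in terms of the $\gamma_{(a)(b)(c)}$ is
$$
\kappa = \gamma_{(3)(1)(1)} , \qquad \tilde\rho = \gamma_{(3)(1)(4)} , \qquad \varepsilon = \frac{1}{2} \left( \gamma_{(2)(1)(1)} + \gamma_{(3)(4)(1)} \right) , \tag{2.14}
$$
$$
\tilde\sigma = \gamma_{(3)(1)(3)} , \qquad \mu = \gamma_{(2)(4)(3)} , \qquad \gamma = \frac{1}{2} \left( \gamma_{(2)(1)(2)} + \gamma_{(3)(4)(2)} \right) , \tag{2.15}
$$
$$
\lambda = \gamma_{(2)(4)(4)} , \qquad \tau = \gamma_{(3)(1)(2)} , \qquad \alpha = \frac{1}{2} \left( \gamma_{(2)(1)(4)} + \gamma_{(3)(4)(4)} \right) , \tag{2.16}
$$
$$
\nu = \gamma_{(2)(4)(2)} , \qquad \pi = \gamma_{(2)(4)(1)} , \qquad \beta = \frac{1}{2} \left( \gamma_{(2)(1)(3)} + \gamma_{(3)(4)(3)} \right) . \tag{2.17}
$$
We can now express Eq. (2.4) as a system of partial differential equations, involving partial derivatives along the frame vectors; this system acts on the components of $\phi_A$ and $\chi_{A'}$ in a unitary spin-frame $(o^A, \iota^A)$, defined uniquely up to an overall sign factor by the requirements that
$$
o^A \bar o^{A'} = l^a , \qquad \iota^A \bar\iota^{A'} = n^a , \qquad o^A \bar\iota^{A'} = m^a , \qquad \iota^A \bar o^{A'} = \bar m^a , \qquad o_A \iota^A = 1 . \tag{2.18}
$$
We denote by $\phi_0$ and $\phi_1$ the components of $\phi_A$ in $(o^A, \iota^A)$, and $\chi_{0'}$ and $\chi_{1'}$ the components of $\chi_{A'}$ in $(\bar o^{A'}, \bar\iota^{A'})$:
$$
\phi_0 = \phi_A o^A , \qquad \phi_1 = \phi_A \iota^A , \qquad \chi_{0'} = \chi_{A'} \bar o^{A'} , \qquad \chi_{1'} = \chi_{A'} \bar\iota^{A'} .
$$
Dirac’s equation then takes the form (see for example [12])
$$
\begin{cases}
n^a \partial_a \phi_0 - m^a \partial_a \phi_1 + (\mu - \gamma) \phi_0 + (\tau - \beta) \phi_1 = \dfrac{m}{\sqrt{2}}\, \chi_{1'} , \\[1ex]
l^a \partial_a \phi_1 - \bar m^a \partial_a \phi_0 + (\alpha - \pi) \phi_0 + (\varepsilon - \tilde\rho) \phi_1 = -\dfrac{m}{\sqrt{2}}\, \chi_{0'} , \\[1ex]
n^a \partial_a \chi_{0'} - \bar m^a \partial_a \chi_{1'} + (\bar\mu - \bar\gamma) \chi_{0'} + (\bar\tau - \bar\beta) \chi_{1'} = \dfrac{m}{\sqrt{2}}\, \phi_1 , \\[1ex]
l^a \partial_a \chi_{1'} - m^a \partial_a \chi_{0'} + (\bar\alpha - \bar\pi) \chi_{0'} + (\bar\varepsilon - \bar{\tilde\rho}) \chi_{1'} = -\dfrac{m}{\sqrt{2}}\, \phi_0
\end{cases}
$$
and the Weyl equation is simply
$$
\begin{cases}
n^a \partial_a \phi_0 - m^a \partial_a \phi_1 + (\mu - \gamma) \phi_0 + (\tau - \beta) \phi_1 = 0 , \\
l^a \partial_a \phi_1 - \bar m^a \partial_a \phi_0 + (\alpha - \pi) \phi_0 + (\varepsilon - \tilde\rho) \phi_1 = 0 .
\end{cases}
\tag{2.19}
$$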
2.3. A choice of null tetrad and the calculation of the spin coefficients

The description of Kerr’s space-time in the framework of the Newman–Penrose formalism has been used before by Teukolsky [56] and Unruh [57] to calculate the expression of the massless Dirac equation in Boyer–Lindquist coordinates (note that Unruh, although his calculations used the Newman–Penrose formalism, described his results in terms of Dirac matrices), and subsequently by Chandrasekhar for the full Dirac equation (see [11] for the original work, but also [12]). The tetrad used in
all these references is due to Kinnersley [36]. It is naturally inherited from the type D structure. The two real null vectors are chosen along the principal null directions $V^+$ and $V^-$:
$$
l^a \frac{\partial}{\partial x^a} = \lambda V^+ , \qquad n^a \frac{\partial}{\partial x^a} = \mu V^- ,
$$
the normalization condition $l_a n^a = 1$ then gives
$$
\lambda \mu\, g(V^+, V^-) = 1 ,
$$
whence, after calculation,
$$
\lambda \mu\, \frac{2\rho^2}{\Delta} = 1 .
$$
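The "after calculation" step can be confirmed numerically at any point of block I: with the metric (2.1) and the vector fields (2.3), one expects $g(V^\pm, V^\pm) = 0$ and $g(V^+, V^-) = 2\rho^2/\Delta$. The sketch below is an illustration, not the authors' computation; the helper names `kerr_metric`, `principal_null` and `inner` are hypothetical, and coordinates are ordered $(t, r, \theta, \varphi)$.

```python
import math


def kerr_metric(M, a, r, theta):
    """Covariant components of the Kerr metric (2.1) in Boyer-Lindquist
    coordinates (t, r, theta, phi), together with rho^2 and Delta."""
    s2 = math.sin(theta) ** 2
    rho2 = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    sigma2 = (r * r + a * a) ** 2 - delta * a * a * s2
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0 - 2.0 * M * r / rho2
    # the cross term 4aMr sin^2(theta)/rho^2 dt dphi is split symmetrically
    g[0][3] = g[3][0] = 2.0 * a * M * r * s2 / rho2
    g[1][1] = -rho2 / delta
    g[2][2] = -rho2
    g[3][3] = -sigma2 * s2 / rho2
    return g, rho2, delta


def principal_null(M, a, r, sign):
    """Components of V^+ (sign=+1) or V^- (sign=-1) from Eq. (2.3)."""
    delta = r * r - 2.0 * M * r + a * a
    return [(r * r + a * a) / delta, float(sign), 0.0, a / delta]


def inner(g, u, v):
    """Bilinear pairing g(u, v) = g_ij u^i v^j."""
    return sum(g[i][j] * u[i] * v[j] for i in range(4) for j in range(4))
```

Evaluating `inner(g, vp, vm)` with `vp = principal_null(M, a, r, +1)` and `vm = principal_null(M, a, r, -1)` should reproduce `2 * rho2 / delta` up to rounding, for any $r > r_+$.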
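Kinnersley’s choice was simply to take $\lambda = 1$. Once the directions of $l^a$ and $n^a$ are chosen, the complex vector fields are uniquely determined, modulo a phase factor $e^{i\theta}$, by (2.8) and (2.9). This gives Kinnersley’s tetrad, which we denote $L^a$, $N^a$, $m^a$, $\bar m^a$:
$$
L^a \frac{\partial}{\partial x^a} = \frac{1}{\Delta} \left( (r^2 + a^2) \frac{\partial}{\partial t} + \Delta \frac{\partial}{\partial r} + a \frac{\partial}{\partial \varphi} \right) , \tag{2.20}
$$
$$
N^a \frac{\partial}{\partial x^a} = \frac{1}{2\rho^2} \left( (r^2 + a^2) \frac{\partial}{\partial t} - \Delta \frac{\partial}{\partial r} + a \frac{\partial}{\partial \varphi} \right) , \tag{2.21}
$$
$$
m^a \frac{\partial}{\partial x^a} = \frac{1}{p \sqrt{2}} \left( i a \sin\theta\, \frac{\partial}{\partial t} + \frac{\partial}{\partial \theta} + \frac{i}{\sin\theta} \frac{\partial}{\partial \varphi} \right) , \tag{2.22}
$$
$$
\bar m^a \frac{\partial}{\partial x^a} = \frac{1}{\bar p \sqrt{2}} \left( -i a \sin\theta\, \frac{\partial}{\partial t} + \frac{\partial}{\partial \theta} - \frac{i}{\sin\theta} \frac{\partial}{\partial \varphi} \right) , \tag{2.23}
$$
where $p = r + i a \cos\theta$. In this tetrad, the real null vectors $L^a$ and $N^a$ have very different behaviors near the horizon because $1/\Delta$ blows up there while $1/(2\rho^2)$ remains bounded. The consequence will be that the two components $\phi_0$ and $\phi_1$ of the spinor $\phi$, solution to the massless Dirac equation, will not be on an equal footing near the horizon. This would break the time symmetry of our scattering construction (the components would need to be rescaled near the horizon in different manners for future and past scattering data). We prefer to modify this tetrad so that the real vectors behave similarly at the horizon. We define a normalized Newman–Penrose tetrad $l^a$, $n^a$, $m^a$, $\bar m^a$ by a simple modification of Kinnersley’s tetrad; we choose
$$
l^a \frac{\partial}{\partial x^a} = \lambda V^+ , \qquad n^a \frac{\partial}{\partial x^a} = \mu V^- , \qquad \lambda = \mu
$$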
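and the vectors $m^a$ and $\bar m^a$ remain unchanged. This gives us
\[ l^a \frac{\partial}{\partial x^a} = \frac{1}{\sqrt{2\Delta\rho^2}}\left( (r^2+a^2)\frac{\partial}{\partial t} + \Delta \frac{\partial}{\partial r} + a \frac{\partial}{\partial\varphi} \right) , \tag{2.24} \]
\[ n^a \frac{\partial}{\partial x^a} = \frac{1}{\sqrt{2\Delta\rho^2}}\left( (r^2+a^2)\frac{\partial}{\partial t} - \Delta \frac{\partial}{\partial r} + a \frac{\partial}{\partial\varphi} \right) , \tag{2.25} \]
\[ m^a \frac{\partial}{\partial x^a} = \frac{1}{p\sqrt{2}}\left( ia\sin\theta \frac{\partial}{\partial t} + \frac{\partial}{\partial\theta} + \frac{i}{\sin\theta}\frac{\partial}{\partial\varphi} \right) , \tag{2.26} \]
\[ \bar m^a \frac{\partial}{\partial x^a} = \frac{1}{\bar p\sqrt{2}}\left( -ia\sin\theta \frac{\partial}{\partial t} + \frac{\partial}{\partial\theta} - \frac{i}{\sin\theta}\frac{\partial}{\partial\varphi} \right) . \tag{2.27} \]
The dual tetrad of 1-forms is
\[ l_a\, dx^a = \sqrt{\frac{\Delta}{2\rho^2}}\left( dt - \frac{\rho^2}{\Delta}\, dr - a\sin^2\theta\, d\varphi \right) , \tag{2.28} \]
\[ n_a\, dx^a = \sqrt{\frac{\Delta}{2\rho^2}}\left( dt + \frac{\rho^2}{\Delta}\, dr - a\sin^2\theta\, d\varphi \right) , \tag{2.29} \]
\[ m_a\, dx^a = \frac{1}{p\sqrt{2}}\left( ia\sin\theta\, dt - \rho^2\, d\theta - i(r^2+a^2)\sin\theta\, d\varphi \right) , \tag{2.30} \]
\[ \bar m_a\, dx^a = \frac{1}{\bar p\sqrt{2}}\left( -ia\sin\theta\, dt - \rho^2\, d\theta + i(r^2+a^2)\sin\theta\, d\varphi \right) . \tag{2.31} \]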
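To the tetrad (2.24)–(2.27), we associate a spin-frame $(o^A, \iota^A)$ satisfying (2.18). The calculation of the spin-coefficients gives (equations (2.32)–(2.35))
\[ \kappa = \tilde\sigma = \lambda = \nu = 0 , \qquad \tilde\rho = \mu = -\frac{1}{\bar p}\sqrt{\frac{\Delta}{2\rho^2}} , \qquad \tau = -\frac{ia\sin\theta}{\sqrt{2}\,\rho^2} , \]
\[ \pi = \frac{ia\sin\theta}{\sqrt{2}\,\bar p^2} , \qquad \varepsilon = \frac{(r-M)\rho^2 - r\Delta}{2\rho^2\sqrt{2\Delta\rho^2}} , \]
\[ \alpha = \frac{ia\sin\theta}{\sqrt{2}\,\bar p^2} - \frac{\cot\theta}{2\sqrt{2}\,\bar p} + \frac{a^2\sin\theta\cos\theta}{2\rho^2\sqrt{2}\,\bar p} , \qquad \beta = \frac{\cot\theta}{2\sqrt{2}\,p} + \frac{a^2\sin\theta\cos\theta}{2\rho^2\sqrt{2}\,p} , \]
\[ \gamma = \frac{(r-M)\rho^2 - r\Delta}{2\rho^2\sqrt{2\Delta\rho^2}} - \frac{ia\cos\theta}{\rho^2}\sqrt{\frac{\Delta}{2\rho^2}} . \]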
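2.4. Calculation and first simplifications of Weyl's equation
Replacing in Eq. (2.19) the expressions of the frame vectors and of the spin-coefficients gives us the following explicit expression of the Weyl equation on the Kerr metric in terms of Boyer–Lindquist coordinates:
\[ \frac{r^2+a^2}{\sqrt{2\Delta\rho^2}}\,\partial_t\phi_0 - \sqrt{\frac{\Delta}{2\rho^2}}\,\partial_r\phi_0 + \frac{a}{\sqrt{2\Delta\rho^2}}\,\partial_\varphi\phi_0 - \frac{ia\sin\theta}{p\sqrt{2}}\,\partial_t\phi_1 - \frac{1}{p\sqrt{2}}\,\partial_\theta\phi_1 - \frac{i}{p\sqrt{2}\sin\theta}\,\partial_\varphi\phi_1 \]
\[ \qquad - \frac{(r-M)\rho^2 + r\Delta}{2\rho^2\sqrt{2\Delta\rho^2}}\,\phi_0 - \left( \frac{\cot\theta}{2p\sqrt{2}} + \frac{ia\sin\theta}{\sqrt{2}\,\rho^2} + \frac{a^2\sin\theta\cos\theta}{2\rho^2\sqrt{2}\,p} \right)\phi_1 = 0 , \tag{2.36} \]
\[ \frac{r^2+a^2}{\sqrt{2\Delta\rho^2}}\,\partial_t\phi_1 + \sqrt{\frac{\Delta}{2\rho^2}}\,\partial_r\phi_1 + \frac{a}{\sqrt{2\Delta\rho^2}}\,\partial_\varphi\phi_1 + \frac{ia\sin\theta}{\bar p\sqrt{2}}\,\partial_t\phi_0 - \frac{1}{\bar p\sqrt{2}}\,\partial_\theta\phi_0 + \frac{i}{\bar p\sqrt{2}\sin\theta}\,\partial_\varphi\phi_0 \]
\[ \qquad + \left( \frac{-\cot\theta}{2\bar p\sqrt{2}} + \frac{a^2\sin\theta\cos\theta}{2\rho^2\sqrt{2}\,\bar p} \right)\phi_0 + \left( \frac{(r-M)\rho^2 + r\Delta}{2\rho^2\sqrt{2\Delta\rho^2}} + \frac{ia\Delta\cos\theta}{\rho^2\sqrt{2\Delta\rho^2}} \right)\phi_1 = 0 . \tag{2.37} \]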
Multiplying the spinor by the measure density associated with an adequate radial variable will get rid of some long-range potentials (the same technique was used in [44] for the Dirac equation on the Schwarzschild metric). Other long-range potentials do remain in the equation. The method used to eliminate them is described in Subsec. 2.5.1. We introduce the “good” radial variable for time-dependent scattering: a variable $r_*$ (already used in [12] and more recently in [32]) such that the principal null geodesics have radial speed $\pm 1$ with respect to this coordinate, i.e. such that
\[ \frac{dr_*}{dr} = \frac{r^2+a^2}{\Delta} . \tag{2.38} \]
In the Schwarzschild case, $r_*$ is the Regge–Wheeler coordinate $r + 2M\,\mathrm{Log}(r - 2M)$. On the exterior of a slow Kerr black hole, we have
\[ r_* = r + M\,\mathrm{Log}(r^2 - 2Mr + a^2) + \frac{2M^2}{\sqrt{M^2 - a^2}}\,\mathrm{Log}\sqrt{\frac{r - r_+}{r - r_-}} + R_0 , \tag{2.39} \]
where $R_0 \in \mathbb{R}$ is arbitrary. The measure dVol has the following expression with respect to the coordinates $r_*$, $\theta$ and $\varphi$:
\[ d\mathrm{Vol} = \sqrt{\frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2}}\; dr_*\, d\omega , \qquad d\omega = \sin\theta\, d\theta\, d\varphi . \]
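As a quick sanity check of (2.38) and (2.39), the following minimal Python sketch compares a finite-difference derivative of $r_*$ with $(r^2+a^2)/\Delta$, and checks the Schwarzschild limit $a = 0$. The function names, the step size `h` and the sample values of $M$, $a$, $r$ are illustrative assumptions, not part of the paper.

```python
import math

def kerr_horizons(M, a):
    """Roots r_+ >= r_- of Delta = r^2 - 2 M r + a^2 (requires |a| < M)."""
    root = math.sqrt(M * M - a * a)
    return M + root, M - root

def r_star(r, M, a, R0=0.0):
    """Radial variable of (2.39), valid outside the outer horizon r > r_+."""
    r_plus, r_minus = kerr_horizons(M, a)
    delta = r * r - 2.0 * M * r + a * a
    log_ratio = math.log(math.sqrt((r - r_plus) / (r - r_minus)))
    return (r + M * math.log(delta)
            + 2.0 * M * M / math.sqrt(M * M - a * a) * log_ratio
            + R0)

def dr_star_dr_exact(r, M, a):
    """Right-hand side of (2.38)."""
    delta = r * r - 2.0 * M * r + a * a
    return (r * r + a * a) / delta

def dr_star_dr_numeric(r, M, a, h=1e-6):
    """Central finite difference of r_star with respect to r."""
    return (r_star(r + h, M, a) - r_star(r - h, M, a)) / (2.0 * h)

if __name__ == "__main__":
    M, a, r = 1.0, 0.5, 3.0
    print(dr_star_dr_numeric(r, M, a), dr_star_dr_exact(r, M, a))
```

The constant $R_0$ drops out of the derivative, which is why it can be fixed later by other considerations.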
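We define the “density spinor”
\[ \tilde\phi_A = \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{1/4} \phi_A . \tag{2.40} \]
The only differences between the equation satisfied by $\tilde\phi$ and (2.36) and (2.37) come from the terms
\[ \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{1/4} \frac{\partial}{\partial r} \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{-1/4} = -\frac{(r-M)\rho^2 + r\Delta}{2\Delta\rho^2} + \frac{\left( (r-M)(r^2+a^2) - 2r\Delta \right) a^2\sin^2\theta}{2\sigma^2(r^2+a^2)} , \]
\[ \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{1/4} \frac{\partial}{\partial\theta} \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{-1/4} = \frac{a^2\sin\theta\cos\theta}{2\rho^2} + \frac{\Delta a^2\sin\theta\cos\theta}{2\sigma^2} . \]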
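Hence, the vector $\tilde\Phi = {}^t(\tilde\phi_0, \tilde\phi_1)$ satisfies the following system of equations
\[ M_t\,\partial_t\tilde\Phi + M_r\,\partial_r\tilde\Phi + M_\theta\left( \partial_\theta + \frac{1}{2}\cot\theta \right)\tilde\Phi + M_\varphi\,\frac{1}{\sin\theta}\,\partial_\varphi\tilde\Phi + P\tilde\Phi = 0 , \tag{2.41} \]
\[ M_t = \begin{pmatrix} \dfrac{r^2+a^2}{\sqrt{2\Delta\rho^2}} & -\dfrac{ia\sin\theta}{p\sqrt{2}} \\[2mm] \dfrac{ia\sin\theta}{\bar p\sqrt{2}} & \dfrac{r^2+a^2}{\sqrt{2\Delta\rho^2}} \end{pmatrix} , \qquad M_r = \sqrt{\frac{\Delta}{2\rho^2}} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} , \]
\[ M_\theta = \begin{pmatrix} 0 & \dfrac{-1}{p\sqrt{2}} \\[2mm] \dfrac{-1}{\bar p\sqrt{2}} & 0 \end{pmatrix} , \qquad M_\varphi = \begin{pmatrix} \dfrac{a\sin\theta}{\sqrt{2\Delta\rho^2}} & \dfrac{-i}{p\sqrt{2}} \\[2mm] \dfrac{i}{\bar p\sqrt{2}} & \dfrac{a\sin\theta}{\sqrt{2\Delta\rho^2}} \end{pmatrix} , \]
\[ P = \begin{pmatrix} -\dfrac{(r-M)(r^2+a^2) - 2r\Delta}{2\sigma^2(r^2+a^2)}\, a^2\sin^2\theta \sqrt{\dfrac{\Delta}{2\rho^2}} & -\dfrac{ia\sin\theta}{\sqrt{2}\,\rho^2} - \dfrac{a^2\sin\theta\cos\theta}{\rho^2\sqrt{2}\,p} - \dfrac{\Delta a^2\sin\theta\cos\theta}{2\sigma^2 p\sqrt{2}} \\[3mm] -\dfrac{\Delta a^2\sin\theta\cos\theta}{2\sigma^2\sqrt{2}\,\bar p} & \dfrac{(r-M)(r^2+a^2) - 2r\Delta}{2\sigma^2(r^2+a^2)}\, a^2\sin^2\theta \sqrt{\dfrac{\Delta}{2\rho^2}} + \dfrac{ia\sqrt{\Delta}\cos\theta}{\rho^2\sqrt{2\rho^2}} \end{pmatrix} . \]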
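2.5. Further simplifications of the equation
Multiplying Eq. (2.41) by the matrix $M_t^{-1}$, we obtain the evolution system:
\[ \partial_t\tilde\Phi + M_t^{-1}M_r\,\partial_r\tilde\Phi + M_t^{-1}M_{S^2}\, i\slashed{D}_{S^2}\tilde\Phi + \frac{a}{\sqrt{2\Delta\rho^2}}\, M_t^{-1}\partial_\varphi\tilde\Phi + M_t^{-1}P\tilde\Phi = 0 , \]
\[ M_{S^2} = \begin{pmatrix} \dfrac{-1}{p\sqrt{2}} & 0 \\[2mm] 0 & \dfrac{-1}{\bar p\sqrt{2}} \end{pmatrix} , \qquad i\slashed{D}_{S^2} = \begin{pmatrix} 0 & \partial_\theta + \dfrac{1}{2}\cot\theta + \dfrac{i}{\sin\theta}\partial_\varphi \\[2mm] \partial_\theta + \dfrac{1}{2}\cot\theta - \dfrac{i}{\sin\theta}\partial_\varphi & 0 \end{pmatrix} , \]
where the angular terms have been decomposed into the Dirac operator $\slashed{D}_{S^2}$ on the 2-sphere and a remainder involving only derivatives with respect to $\varphi$. A first advantage of this decomposition is that the operators $\slashed{D}_{S^2}$ and $\partial_\varphi$ are regular on the whole 2-sphere; the singularities appearing in $\cot\theta$ and $\frac{1}{\sin\theta}\partial_\varphi$ are thus understood as coordinate singularities. The other advantage is that the Dirac operator $\slashed{D}_{S^2}$ is spherically symmetric. Hence the lack of spherical symmetry (that is to say, the effects of rotation) is materialized first by the term in $\partial_\varphi$ and second by the lack of symmetry in the matrix $M_t^{-1}M_{S^2}$. We see that the matrix $M_t^{-1}M_{S^2}$ behaves as $r^{-1}$ near infinity, whereas $\frac{a}{\sqrt{2\Delta\rho^2}}M_t^{-1}$ falls off as $r^{-2}$. Thus, the term in $\partial_\varphi$ can be understood as a long-range perturbation of the “principal” part involving $\slashed{D}_{S^2}$:
\[ \frac{a}{\sqrt{2\Delta\rho^2}}\, M_t^{-1}\partial_\varphi = \frac{1}{r}\, O\!\left( M_t^{-1}M_{S^2}\, i\slashed{D}_{S^2} \right) \quad \text{as } r \to +\infty . \]
This however is of no matter since the term in $\partial_\varphi$ will be treated as a potential (falling off as $r^{-2}$ and therefore short-range) using the cylindrical symmetry. The real problem comes from the matrix $M_t^{-1}M_{S^2}$: we have
\[ M_t^{-1}M_{S^2} = \begin{pmatrix} -\dfrac{(r^2+a^2)\sqrt{\Delta\rho^2}}{p\,\sigma^2} & -\dfrac{ia\Delta\sin\theta}{\sigma^2} \\[3mm] \dfrac{ia\Delta\sin\theta}{\sigma^2} & -\dfrac{(r^2+a^2)\sqrt{\Delta\rho^2}}{\bar p\,\sigma^2} \end{pmatrix} \simeq \frac{-1}{r}\,\mathrm{Id}_2 \quad \text{as } r \to +\infty \]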
and there exists no “spherically symmetric” matrix $M_0$ (meaning that the coefficients of $M_0$ depend solely on $r$), falling off as $r^{-1}$ at infinity, such that $M_t^{-1}M_{S^2} - M_0 = O(r^{-2-\varepsilon})$ as $r \to +\infty$. This is obvious when we consider the fact that $ia\Delta\sin\theta/\sigma^2$ is zero on the axis and falls off as $r^{-2}$ at infinity on the equator; no spherically symmetric matrix can make up for such a behavior. This shows that $M_t^{-1}M_{S^2}\slashed{D}_{S^2}$ is a long-range perturbation of $M_0\slashed{D}_{S^2}$ for any spherically symmetric matrix $M_0$ falling off as $r^{-1}$ at infinity.
Remark 2.2. This problem is caused by the rotation of Kerr's space-time. The natural way of minimizing the effects of rotation in the expression of an equation is to choose means of describing the geometry that are as closely tied in with this rotation as possible. We have essentially two possibilities:
• Change coordinates to follow locally non-rotating observers; this induces time-dependent expressions for the metric and the equation, and therefore entails even more serious analytic difficulties.
• Find a new Newman–Penrose tetrad in some sense associated with locally non-rotating observers.
The next paragraph is devoted to the construction of such a tetrad. The upshot will be that Kinnersley's tetrad, although it is systematically used in detailed studies of the Kerr geometry, including Chandrasekhar's stationary scattering theories, is not adapted to the point of view of time-dependent scattering. Note that we have not quite used Kinnersley's tetrad, but a rescaled version of it. Using the exact Kinnersley tetrad would produce similar long-range terms at infinity.
2.5.1. A new Newman–Penrose tetrad adapted to the foliation
Given a Newman–Penrose tetrad $l^a$, $n^a$, $m^a$, $\bar m^a$, the vector field $l^a + n^a$ is timelike future-oriented as the sum of two future-oriented null vectors. Hence, to any Newman–Penrose tetrad, we can associate a preferred timelike future-pointing vector field (or observer), given by the sum of the two real frame vectors. Besides, the norm of such a vector field must always be $\sqrt{2}$ since
\[ (l_a + n_a)(l^a + n^a) = 2 . \]
Locally non-rotating observers are described by the future-oriented normal to the hypersurfaces $\Sigma_t$. We consider $T^a$ the future-oriented vector field normal to the $\Sigma_t$ and normalized so that $T^aT_a = 2$. It is given in Boyer–Lindquist coordinates by (see [46])
\[ T^a \frac{\partial}{\partial x^a} = \sqrt{\frac{2\sigma^2}{\Delta\rho^2}}\left( \frac{\partial}{\partial t} + \frac{2aMr}{\sigma^2}\frac{\partial}{\partial\varphi} \right) . \]
We are looking for a Newman–Penrose tetrad $l^a$, $n^a$, $m^a$, $\bar m^a$ that follows the local rotation of space-time. The first natural idea is to impose
\[ l^a + n^a = T^a . \tag{2.42} \]
This is exactly the notion of a tetrad adapted to the foliation as it was defined in [46]. The way we choose to construct such a tetrad is guided by our wish to minimize the apparent influence of rotation in our equation. Requiring (2.42) is a first step in this direction, but there are many possible choices of la and na compatible with (2.42). We single out a pair of null vectors that are not accelerated in the angular directions; i.e. we choose la and na in the plane spanned by T a and ∂r . Requiring that la should be outgoing, na incoming, and a similar behavior of the two vectors near the horizon, we obtain
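\[ l^a \frac{\partial}{\partial x^a} = \frac{1}{2}\,T^a\frac{\partial}{\partial x^a} + \sqrt{\frac{\Delta}{2\rho^2}}\,\frac{\partial}{\partial r} = \frac{\sigma}{\sqrt{2\Delta\rho^2}}\left( \frac{\partial}{\partial t} + \frac{2aMr}{\sigma^2}\frac{\partial}{\partial\varphi} \right) + \sqrt{\frac{\Delta}{2\rho^2}}\,\frac{\partial}{\partial r} , \tag{2.43} \]
\[ n^a \frac{\partial}{\partial x^a} = \frac{1}{2}\,T^a\frac{\partial}{\partial x^a} - \sqrt{\frac{\Delta}{2\rho^2}}\,\frac{\partial}{\partial r} = \frac{\sigma}{\sqrt{2\Delta\rho^2}}\left( \frac{\partial}{\partial t} + \frac{2aMr}{\sigma^2}\frac{\partial}{\partial\varphi} \right) - \sqrt{\frac{\Delta}{2\rho^2}}\,\frac{\partial}{\partial r} . \tag{2.44} \]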
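The choice of $m^a$ is now imposed, except for the freedom of a complex factor of modulus 1. The vector fields $T^a\partial_a$, $\partial_r$, $\partial_\theta$ and $\partial_\varphi$ define an orthogonal frame everywhere (except on the axis where $\partial_\theta$ is singular); since $m^a$ is orthogonal to $l^a$ and $n^a$ and since these two vectors span the plane $\langle T^a, \partial_r\rangle$, $m^a$ must be tangent to the 2-sphere. This gives (choosing the complex factor so as to obtain the simplest expression)
\[ m^a \frac{\partial}{\partial x^a} = \frac{1}{\sqrt{2\rho^2}}\left( \frac{\partial}{\partial\theta} + \frac{\rho^2}{\sigma}\,\frac{i}{\sin\theta}\frac{\partial}{\partial\varphi} \right) , \tag{2.45} \]
\[ \bar m^a \frac{\partial}{\partial x^a} = \frac{1}{\sqrt{2\rho^2}}\left( \frac{\partial}{\partial\theta} - \frac{\rho^2}{\sigma}\,\frac{i}{\sin\theta}\frac{\partial}{\partial\varphi} \right) . \tag{2.46} \]
We now recall some well-known facts about Newman–Penrose tetrads and spin-frames, then we shall see how they can be significant to us.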
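The normalization of the tetrad (2.43)–(2.46) can be checked numerically. The sketch below is only an illustration: it assumes the standard Boyer–Lindquist form of the Kerr metric with signature (+,−,−,−), written with $\Delta$, $\rho^2$ and $\sigma^2 = (r^2+a^2)^2 - a^2\Delta\sin^2\theta$, and all function names and sample values are hypothetical.

```python
import math

def kerr_quantities(r, theta, M, a):
    """Return Delta, rho^2, sigma^2 at the point (r, theta)."""
    s, c = math.sin(theta), math.cos(theta)
    delta = r * r - 2.0 * M * r + a * a
    rho2 = r * r + a * a * c * c
    sigma2 = (r * r + a * a) ** 2 - a * a * delta * s * s
    return delta, rho2, sigma2

def kerr_metric(r, theta, M, a):
    """Assumed Kerr metric g_ab in coordinates (t, r, theta, phi), signature (+,-,-,-)."""
    delta, rho2, sigma2 = kerr_quantities(r, theta, M, a)
    s2 = math.sin(theta) ** 2
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1.0 - 2.0 * M * r / rho2
    g[0][3] = g[3][0] = 2.0 * a * M * r * s2 / rho2
    g[1][1] = -rho2 / delta
    g[2][2] = -rho2
    g[3][3] = -sigma2 * s2 / rho2
    return g

def adapted_tetrad(r, theta, M, a):
    """Components of l, n, m from (2.43)-(2.45); m-bar is the conjugate of m."""
    delta, rho2, sigma2 = kerr_quantities(r, theta, M, a)
    sigma = math.sqrt(sigma2)
    time_part = sigma / math.sqrt(2.0 * delta * rho2)
    omega = 2.0 * a * M * r / sigma2          # local rotation rate in T^a
    radial = math.sqrt(delta / (2.0 * rho2))
    l = [time_part, radial, 0.0, time_part * omega]
    n = [time_part, -radial, 0.0, time_part * omega]
    norm_m = 1.0 / math.sqrt(2.0 * rho2)
    m = [0.0, 0.0, norm_m, norm_m * 1j * rho2 / (sigma * math.sin(theta))]
    return l, n, m

def inner(g, u, v):
    """Bilinear (not sesquilinear) pairing g_ab u^a v^b."""
    return sum(g[i][j] * u[i] * v[j] for i in range(4) for j in range(4))

if __name__ == "__main__":
    g = kerr_metric(3.0, 1.0, 1.0, 0.5)
    l, n, m = adapted_tetrad(3.0, 1.0, 1.0, 0.5)
    mbar = [z.conjugate() if isinstance(z, complex) else z for z in m]
    print(inner(g, l, l), inner(g, l, n), inner(g, m, mbar))
```

Under these assumptions one expects $l_al^a = n_an^a = m_am^a = 0$, $l_an^a = 1$ and $m_a\bar m^a = -1$ up to rounding.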
Properties. We consider a Newman–Penrose tetrad $l^a$, $n^a$, $m^a$, $\bar m^a$ and a unitary spin-frame $(o^A, \iota^A)$ related to the tetrad by (2.18). We also denote the frame spinors $o^A$ and $\iota^A$ by
\[ o^A = \varepsilon_0{}^A , \qquad \iota^A = \varepsilon_1{}^A \]
and the dual dyad $(-\iota_A, o_A)$ by
\[ -\iota_A = \varepsilon_A{}^0 , \qquad o_A = \varepsilon_A{}^1 . \]
To any vector field $X^a$, we can associate the matrix $X^{\mathbf{AA'}}$ of the components of its spinor form $X^{AA'}$. More precisely
\[ X^{\mathbf{AA'}} = \begin{pmatrix} X^{00'} & X^{01'} \\ X^{10'} & X^{11'} \end{pmatrix} , \]
and, writing for example the first component in details,
\[ X^{00'} = \varepsilon_A{}^0\, \bar\varepsilon_{A'}{}^{0'} X^{AA'} = \iota_A \bar\iota_{A'} X^{AA'} = n_a X^a . \]
With similar calculations for the three other components, we obtain
\[ X^{\mathbf{AA'}} = \begin{pmatrix} n_a X^a & -\bar m_a X^a \\ -m_a X^a & l_a X^a \end{pmatrix} . \]
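Denoting $X$ the matrix $X^{\mathbf{AA'}}$, the quadratic form on $\mathbb{S}_A$ associated with $X^a$:
\[ \phi_A \mapsto \phi_A \bar\phi_{A'} X^{AA'} , \]
is expressed in terms of $X$ and the vector
\[ \Phi = \begin{pmatrix} \phi_0 = \varepsilon_0{}^A \phi_A \\ \phi_1 = \varepsilon_1{}^A \phi_A \end{pmatrix} \]
as follows
\[ \phi_A \bar\phi_{A'} X^{AA'} = {}^t\Phi X \bar\Phi = \langle \Phi , \bar X \Phi \rangle_{\mathbb{C}^2} , \]
where $\langle \cdot , \cdot \rangle_{\mathbb{C}^2}$ denotes the standard scalar product on $\mathbb{C}^2$. In particular, we see that for the vector
\[ Z^a := (l^a + n^a) , \]
the matrix $Z^{\mathbf{AA'}}$ is the identity and therefore
\[ \phi_A \bar\phi_{A'} Z^{AA'} = |\phi_0|^2 + |\phi_1|^2 . \tag{2.47} \]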
The conserved charge (2.6) outside the black hole involves the quadratic form $\phi_A \bar\phi_{A'} T^{AA'}$ associated with the normal vector $T^a$. In the Newman–Penrose tetrad $l^a$, $n^a$, $m^a$, $\bar m^a$ of (2.43)–(2.46), the vector $T^a$ is the sum of the two real frame vectors, whence the above quadratic form becomes simply $|\phi_0|^2 + |\phi_1|^2$. It follows that, with respect to this new tetrad, the conserved charge is exactly the $L^2$ norm of the vector $\Phi$ representing the spinor in the associated spin-frame.
2.5.2. The new expressions of Weyl's equation and the conserved quantity
Having found a Newman–Penrose tetrad meeting our requirements, we now wish to re-calculate Weyl's equation using this new tetrad. We have the possibility of computing the new values of the spin-coefficients using (2.10)–(2.13) or (2.14)–(2.17). This is excessively long and tedious and we prefer to follow a somewhat shorter path. Given any two normalized Newman–Penrose tetrads, there is a unique Lorentz transformation changing the first into the second. To this Lorentz transformation corresponds a unique (modulo sign) spin-transformation. All we have to do here is to calculate the Lorentz transformation $L_a^b$ which transforms the tetrad (2.24)–(2.27) into (2.43)–(2.46), infer the spin transformation $S_A^B$ such that $L_a^b = S_A^B \bar S_{A'}^{B'}$, then modify the components of the spinor $\tilde\phi_A$ using this spin-transformation. The
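equation satisfied by the modified components will be the form of Weyl's equation corresponding to the tetrad (2.43)–(2.46) where the unknown is the “density spinor” $\tilde\phi_A$ defined by (2.40).
First, in order to obtain the expression of the Lorentz transformation $L_a^b$, we express the frame-vectors (2.43)–(2.46) in terms of (2.24)–(2.27). Writing $\mathfrak{l}^a$, $\mathfrak{n}^a$, $\mathfrak{m}^a$ for the vectors (2.43)–(2.45) and $l^a$, $n^a$, $m^a$, $\bar m^a$ for the tetrad (2.24)–(2.27), we have
\[ \mathfrak{l}^a = \frac{\sigma_+}{2\sigma}\, l^a + \frac{\Delta a^2\sin^2\theta}{2\sigma\sigma_+}\, n^a + \frac{\sqrt{\Delta}\, a\sin\theta}{2\sigma\rho}\left( ip\, m^a - i\bar p\, \bar m^a \right) , \]
\[ \mathfrak{n}^a = \frac{\Delta a^2\sin^2\theta}{2\sigma\sigma_+}\, l^a + \frac{\sigma_+}{2\sigma}\, n^a + \frac{\sqrt{\Delta}\, a\sin\theta}{2\sigma\rho}\left( ip\, m^a - i\bar p\, \bar m^a \right) , \]
\[ \mathfrak{m}^a = -i\,\frac{\sqrt{\Delta}\, a\sin\theta}{2\sigma}\left( l^a + n^a \right) + \frac{p\,\sigma_+}{2\sigma\rho}\, m^a - \frac{\bar p\,\Delta a^2\sin^2\theta}{2\sigma\sigma_+\rho}\, \bar m^a , \]
where $\sigma_+ = \sigma + r^2 + a^2$. The matrix of the Lorentz transformation in the basis (2.24)–(2.27) is therefore
\[ L_{(a)}^{(b)} = L_a^b\, e_{(a)}{}^a\, e^{(b)}{}_b = \begin{pmatrix} \dfrac{\sigma_+}{2\sigma} & \dfrac{\Delta a^2\sin^2\theta}{2\sigma\sigma_+} & ip\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma\rho} & -i\bar p\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma\rho} \\[3mm] \dfrac{\Delta a^2\sin^2\theta}{2\sigma\sigma_+} & \dfrac{\sigma_+}{2\sigma} & ip\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma\rho} & -i\bar p\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma\rho} \\[3mm] -i\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma} & -i\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma} & \dfrac{p\,\sigma_+}{2\sigma\rho} & -\dfrac{\bar p\,\Delta a^2\sin^2\theta}{2\sigma\sigma_+\rho} \\[3mm] i\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma} & i\,\dfrac{\sqrt{\Delta}\, a\sin\theta}{2\sigma} & -\dfrac{p\,\Delta a^2\sin^2\theta}{2\sigma\sigma_+\rho} & \dfrac{\bar p\,\sigma_+}{2\sigma\rho} \end{pmatrix} . \tag{2.48} \]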
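The matrix $S_{\mathbf{A}}^{\mathbf{B}}$ of the spin-transformation $S_A^B$ in the spin-frame $(o^A, \iota^A)$ is uniquely determined, modulo sign, by $L_a^b = S_A^B \bar S_{A'}^{B'}$ and $\det(S_{\mathbf{A}}^{\mathbf{B}}) = 1$. The first condition can be expressed in terms of components as
\[ L_{(a)}^{(b)} = \begin{pmatrix} |S_0^0|^2 & |S_0^1|^2 & S_0^0 \bar S_{0'}^{1'} & S_0^1 \bar S_{0'}^{0'} \\[1mm] |S_1^0|^2 & |S_1^1|^2 & S_1^0 \bar S_{1'}^{1'} & S_1^1 \bar S_{1'}^{0'} \\[1mm] S_0^0 \bar S_{1'}^{0'} & S_0^1 \bar S_{1'}^{1'} & S_0^0 \bar S_{1'}^{1'} & S_0^1 \bar S_{1'}^{0'} \\[1mm] S_1^0 \bar S_{0'}^{0'} & S_1^1 \bar S_{0'}^{1'} & S_1^0 \bar S_{0'}^{1'} & S_1^1 \bar S_{0'}^{0'} \end{pmatrix} . \tag{2.49} \]
Identifying (2.48) and (2.49) and imposing $\det(S_{\mathbf{A}}^{\mathbf{B}}) = 1$, we obtain
\[ S_{\mathbf{A}}^{\mathbf{B}} = \begin{pmatrix} S_0^0 & S_0^1 \\ S_1^0 & S_1^1 \end{pmatrix} = \sqrt{\frac{p}{2\sigma\rho}} \begin{pmatrix} \sqrt{\sigma_+} & -\dfrac{\bar p}{\rho}\,\dfrac{ia\sin\theta\sqrt{\Delta}}{\sqrt{\sigma_+}} \\[3mm] \dfrac{ia\sin\theta\sqrt{\Delta}}{\sqrt{\sigma_+}} & \dfrac{\bar p}{\rho}\,\sqrt{\sigma_+} \end{pmatrix} =: \mathbf{U} , \tag{2.50} \]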
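where the square root of $p$ is calculated using any given determination of the square root on the complex plane. The spin-transformation (2.50) transforms the spin-frame $(o^A, \iota^A)$ into a new spin-frame $(\mathfrak{o}^A = S_B^A o^B ,\ \mathfrak{i}^A = S_B^A \iota^B)$ such that the vectors (2.43)–(2.46) are respectively $\mathfrak{o}^A\bar{\mathfrak{o}}^{A'}$, $\mathfrak{i}^A\bar{\mathfrak{i}}^{A'}$, $\mathfrak{o}^A\bar{\mathfrak{i}}^{A'}$ and $\mathfrak{i}^A\bar{\mathfrak{o}}^{A'}$. The components of the spinor $\tilde\phi_A$ in this spin-frame are given by
\[ \psi_0 = \tilde\phi_A \mathfrak{o}^A = \tilde\phi_A S_B^A o^B = S_0^A \tilde\phi_A , \qquad \psi_1 = \tilde\phi_A \mathfrak{i}^A = \tilde\phi_A S_B^A \iota^B = S_1^A \tilde\phi_A , \]
that is to say
\[ \Psi := \begin{pmatrix} \psi_0 \\ \psi_1 \end{pmatrix} = \mathbf{U}\tilde\Phi , \qquad \tilde\Phi = \begin{pmatrix} \tilde\phi_0 \\ \tilde\phi_1 \end{pmatrix} = \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{1/4} \begin{pmatrix} \phi_0 \\ \phi_1 \end{pmatrix} . \tag{2.51} \]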
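The identification of (2.48) with (2.49) can be spot-checked numerically. The following sketch builds the matrix of (2.50) as read above at a sample point and compares the products appearing in the first row of (2.49) with the first row of (2.48); it also checks the determinant condition. The function names and the sample point are illustrative assumptions.

```python
import cmath
import math

def kerr_scalars(r, theta, M, a):
    """Delta, rho, sigma, sigma_+ and p = r + i a cos(theta) at (r, theta)."""
    s, c = math.sin(theta), math.cos(theta)
    delta = r * r - 2.0 * M * r + a * a
    rho = math.sqrt(r * r + a * a * c * c)
    sigma = math.sqrt((r * r + a * a) ** 2 - a * a * delta * s * s)
    sigma_plus = sigma + r * r + a * a
    p = complex(r, a * c)
    return delta, rho, sigma, sigma_plus, p

def spin_matrix_U(r, theta, M, a):
    """Matrix U of (2.50), with the principal branch chosen for the square root."""
    delta, rho, sigma, sigma_plus, p = kerr_scalars(r, theta, M, a)
    pref = cmath.sqrt(p / (2.0 * sigma * rho))
    off = 1j * a * math.sin(theta) * math.sqrt(delta) / math.sqrt(sigma_plus)
    phase = p.conjugate() / rho               # complex number of modulus 1
    return [[pref * math.sqrt(sigma_plus), -pref * phase * off],
            [pref * off, pref * phase * math.sqrt(sigma_plus)]]

def det2(U):
    """Determinant of a 2x2 matrix given as nested lists."""
    return U[0][0] * U[1][1] - U[0][1] * U[1][0]

def first_row_from_spin(U):
    """First row of (2.49): |S00|^2, |S01|^2, S00 conj(S01), S01 conj(S00)."""
    s00, s01 = U[0][0], U[0][1]
    return [abs(s00) ** 2, abs(s01) ** 2,
            s00 * s01.conjugate(), s01 * s00.conjugate()]

def first_row_expected(r, theta, M, a):
    """First row of (2.48)."""
    delta, rho, sigma, sigma_plus, p = kerr_scalars(r, theta, M, a)
    s = math.sin(theta)
    mixed = math.sqrt(delta) * a * s / (2.0 * sigma * rho)
    return [sigma_plus / (2.0 * sigma),
            delta * a * a * s * s / (2.0 * sigma * sigma_plus),
            1j * p * mixed,
            -1j * p.conjugate() * mixed]

if __name__ == "__main__":
    U = spin_matrix_U(3.0, 1.0, 1.0, 0.5)
    print(det2(U))
    print(first_row_from_spin(U))
    print(first_row_expected(3.0, 1.0, 1.0, 0.5))
```

The determinant condition rests on the identity $\sigma_+^2 - \Delta a^2\sin^2\theta = 2\sigma\sigma_+$, which follows from $\sigma^2 = (r^2+a^2)^2 - a^2\Delta\sin^2\theta$.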
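The equation satisfied by $\Psi$ is
\[ \tilde M_t\,\partial_t\Psi + \tilde M_r\,\partial_r\Psi + \tilde M_\theta\left( \partial_\theta + \frac{1}{2}\cot\theta \right)\Psi + \tilde M_\varphi\,\frac{1}{\sin\theta}\,\partial_\varphi\Psi + \tilde P\Psi = 0 , \tag{2.52} \]
where
\[ \tilde M_t = \mathbf{U}M_t\mathbf{U}^{-1} = \begin{pmatrix} \dfrac{r^2+a^2}{\sqrt{2\Delta\rho^2}} & -\dfrac{ia\sin\theta}{\sqrt{2\rho^2}} \\[3mm] \dfrac{ia\sin\theta}{\sqrt{2\rho^2}} & \dfrac{r^2+a^2}{\sqrt{2\Delta\rho^2}} \end{pmatrix} , \]
\[ \tilde M_r = \mathbf{U}M_r\mathbf{U}^{-1} = \sqrt{\frac{\Delta}{2\rho^2}} \begin{pmatrix} -\dfrac{r^2+a^2}{\sigma} & -\dfrac{ia\sin\theta\sqrt{\Delta}}{\sigma} \\[3mm] -\dfrac{ia\sin\theta\sqrt{\Delta}}{\sigma} & \dfrac{r^2+a^2}{\sigma} \end{pmatrix} , \]
\[ \tilde M_\theta = \mathbf{U}M_\theta\mathbf{U}^{-1} = \frac{-1}{\sigma\sqrt{2}} \begin{pmatrix} \dfrac{-ia\sin\theta\sqrt{\Delta}}{\rho} & \dfrac{\sigma\sigma_+ + \Delta a^2\sin^2\theta}{\rho\,\sigma_+} \\[3mm] \dfrac{\sigma\sigma_+ + \Delta a^2\sin^2\theta}{\rho\,\sigma_+} & \dfrac{ia\sin\theta\sqrt{\Delta}}{\rho} \end{pmatrix} , \]
\[ \tilde M_\varphi = \mathbf{U}M_\varphi\mathbf{U}^{-1} = \begin{pmatrix} \dfrac{a\sin\theta}{\sqrt{2\Delta\rho^2}} & \dfrac{-i}{\sqrt{2}\,\rho} \\[3mm] \dfrac{i}{\sqrt{2}\,\rho} & \dfrac{a\sin\theta}{\sqrt{2\Delta\rho^2}} \end{pmatrix} , \]
\[ \tilde P = \mathbf{U}P\mathbf{U}^{-1} + \mathbf{U}M_\theta\left[ \frac{\partial}{\partial\theta} , \mathbf{U}^{-1} \right] + \mathbf{U}M_r\left[ \frac{\partial}{\partial r} , \mathbf{U}^{-1} \right] , \]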
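\[ \mathbf{U}^{-1} = \sqrt{\frac{\rho}{2\sigma p}} \begin{pmatrix} \sqrt{\sigma_+} & \dfrac{ia\sin\theta\sqrt{\Delta}}{\sqrt{\sigma_+}} \\[3mm] -\dfrac{\rho}{\bar p}\,\dfrac{ia\sin\theta\sqrt{\Delta}}{\sqrt{\sigma_+}} & \dfrac{\rho}{\bar p}\,\sqrt{\sigma_+} \end{pmatrix} , \]
and the commutators $[\partial_\theta, \mathbf{U}^{-1}]$, $[\partial_r, \mathbf{U}^{-1}]$ are simply the partial derivatives of $\mathbf{U}^{-1}$ with respect to $\theta$ and $r$. Left-multiplying Eq. (2.52) by
\[ \tilde M_t^{-1} = \frac{2\Delta\rho^2}{\sigma^2} \begin{pmatrix} \dfrac{r^2+a^2}{\sqrt{2\Delta\rho^2}} & \dfrac{ia\sin\theta}{\sqrt{2\rho^2}} \\[3mm] \dfrac{-ia\sin\theta}{\sqrt{2\rho^2}} & \dfrac{r^2+a^2}{\sqrt{2\Delta\rho^2}} \end{pmatrix} , \]
we get
\[ \partial_t\Psi + \mathcal{M}_r\,\partial_r\Psi + \mathcal{M}_\theta\left( \partial_\theta + \frac{1}{2}\cot\theta \right)\Psi + \mathcal{M}_\varphi\,\frac{1}{\sin\theta}\,\partial_\varphi\Psi + \tilde M_t^{-1}\tilde P\Psi = 0 , \tag{2.53} \]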
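where
\[ \mathcal{M}_r = \tilde M_t^{-1}\tilde M_r = \begin{pmatrix} -\dfrac{\Delta}{\sigma} & 0 \\[2mm] 0 & \dfrac{\Delta}{\sigma} \end{pmatrix} = \frac{\Delta}{\sigma} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} , \qquad \mathcal{M}_\theta = \tilde M_t^{-1}\tilde M_\theta = \begin{pmatrix} 0 & -\dfrac{\sqrt{\Delta}}{\sigma} \\[2mm] -\dfrac{\sqrt{\Delta}}{\sigma} & 0 \end{pmatrix} , \]
\[ \mathcal{M}_\varphi = \tilde M_t^{-1}\tilde M_\varphi = \begin{pmatrix} \dfrac{2Mra\sin\theta}{\sigma^2} & \dfrac{-i\sqrt{\Delta}\,\rho^2}{\sigma^2} \\[3mm] \dfrac{i\sqrt{\Delta}\,\rho^2}{\sigma^2} & \dfrac{2Mra\sin\theta}{\sigma^2} \end{pmatrix} . \]
We then modify Eq. (2.53) by isolating the Dirac operator $\slashed{D}_{S^2}$ on the 2-sphere $S^2$ from the rest of the angular terms:
\[ \partial_t\Psi + A_r\,\partial_r\Psi + A_{S^2}\, i\slashed{D}_{S^2}\Psi + A_\varphi\,\partial_\varphi\Psi + B\Psi = 0 , \tag{2.54} \]
\[ A_r = \mathcal{M}_r = \frac{\Delta}{\sigma} \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} , \qquad A_{S^2} = \frac{-\sqrt{\Delta}}{\sigma}\,\mathrm{Id}_2 , \]
\[ A_\varphi = \begin{pmatrix} \dfrac{2Mra}{\sigma^2} & \dfrac{-i\sqrt{\Delta}}{\sigma\sin\theta}\left( \dfrac{\rho^2}{\sigma} - 1 \right) \\[3mm] \dfrac{i\sqrt{\Delta}}{\sigma\sin\theta}\left( \dfrac{\rho^2}{\sigma} - 1 \right) & \dfrac{2Mra}{\sigma^2} \end{pmatrix} , \qquad B = \tilde M_t^{-1}\tilde P . \]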
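Remark 2.3. The matrix $A_{S^2}$ is now diagonal and furthermore $A_{S^2}\slashed{D}_{S^2}$ is a short-range perturbation of $A^0_{S^2}\slashed{D}_{S^2}$, where
\[ A^0_{S^2} = \frac{-\sqrt{\Delta}}{r^2+a^2}\,\mathrm{Id}_2 . \]
Remark 2.4. As was remarked at the end of the previous subsection, the conserved quantity takes a considerably simplified form with respect to the new tetrad, namely
\[ \int_{\Sigma_t} T^{AA'}\phi_A\bar\phi_{A'}\, d\mathrm{Vol} = \int_{\Sigma_t} T^{AA'}\phi_A\bar\phi_{A'} \sqrt{\frac{\sigma^2\rho^2}{\Delta}}\; dr\, d\omega = \int_{\Sigma_t} T^{AA'}\phi_A\bar\phi_{A'} \sqrt{\frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2}}\; dr_*\, d\omega \]
\[ \qquad = \int_{\Sigma_t} T^{AA'}\tilde\phi_A\bar{\tilde\phi}_{A'}\; dr_*\, d\omega = \int_{\Sigma_t} \langle \Psi , \Psi \rangle_{\mathbb{C}^2}\; dr_*\, d\omega . \tag{2.55} \]
In the tetrad $l^a$, $n^a$, $m^a$, $\bar m^a$ of (2.24)–(2.27), the explicit expression of the quantity $T^{AA'}\phi_A\bar\phi_{A'}$ involves the matrix $\mathbf{T}$ of $T^{AA'}$ in the associated spin-frame, given by
\[ \mathbf{T} = \mathbf{U}^*\mathbf{U} = \begin{pmatrix} \dfrac{r^2+a^2}{\sigma} & \dfrac{ia\sin\theta\sqrt{\Delta\rho^2}}{\bar p\,\sigma} \\[3mm] \dfrac{-ia\sin\theta\sqrt{\Delta\rho^2}}{p\,\sigma} & \dfrac{r^2+a^2}{\sigma} \end{pmatrix} . \]
2.6. The main theorems for the Kerr framework
We start by re-expressing the form (2.54) of Weyl's equation in a manner which makes explicit the existence of two asymptotic regions: one corresponding to the horizon, the other to infinity. This is done by using the Regge–Wheeler-type coordinate $r_*$, defined in (2.38), instead of $r$. This coordinate $r_*$, as was remarked earlier, is chosen so that the principal null geodesics have radial speed $\pm 1$. The consequence is that the horizon is now described as the asymptotic region $r_* \to -\infty$, sometimes referred to as “negative infinity”. Equation (2.54) takes the new form
\[ \partial_t\Psi = i\slashed{D}_K\Psi , \tag{2.56} \]
where $\slashed{D}_K$, the Hamiltonian for the Weyl equation on the Kerr metric, is given by
\[ \slashed{D}_K = \frac{r^2+a^2}{\sigma}\,\gamma D_{r_*} + \frac{\sqrt{\Delta}}{\sigma}\,\slashed{D}_{S^2} - A_\varphi D_\varphi + iB , \]
\[ \gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad D_{r_*} = \frac{1}{i}\frac{\partial}{\partial r_*} , \qquad D_\varphi = \frac{1}{i}\frac{\partial}{\partial\varphi} . \]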
This expression allows us to define asymptotic dynamics near the horizon and in the neighborhood of infinity, corresponding to approximations of $\slashed{D}_K$ in these asymptotic regions. For our first construction of wave operators, we make the simplest choice of asymptotic dynamics: asymptotic profiles. In addition to being simple and intuitive, this has the major advantage of allowing an almost immediate geometrical interpretation of the scattering theory, as providing the solution to a non-trivial Goursat problem on the Penrose compactified block I. The details of this interpretation^a are given in Sec. 8. As $r_* \to -\infty$, $\slashed{D}_K$ approaches
\[ D_H = \gamma D_{r_*} - \frac{2Mr_+ a}{(r_+^2+a^2)^2}\, D_\varphi = \gamma D_{r_*} - \frac{a}{r_+^2+a^2}\, D_\varphi , \tag{2.57} \]
whereas in the neighborhood of infinity, $\slashed{D}_K$ is close to
\[ D_\infty = \gamma D_{r_*} . \tag{2.58} \]
The asymptotic Hamiltonians are both self-adjoint on
\[ \mathcal{H} = L^2((\mathbb{R}\times S^2 ;\, dr_*\, d\omega) ;\, \mathbb{C}^2) \tag{2.59} \]
and for $\Psi = {}^t(\psi_0, \psi_1) \in \mathcal{H}$,
\[ (e^{itD_H}\Psi)(r_*, \theta, \varphi) = \begin{pmatrix} \psi_0\!\left( r_* + t ,\, \theta ,\, \varphi - \dfrac{a}{r_+^2+a^2}\, t \right) \\[3mm] \psi_1\!\left( r_* - t ,\, \theta ,\, \varphi - \dfrac{a}{r_+^2+a^2}\, t \right) \end{pmatrix} , \qquad (e^{itD_\infty}\Psi)(r_*, \theta, \varphi) = \begin{pmatrix} \psi_0(r_* + t, \theta, \varphi) \\ \psi_1(r_* - t, \theta, \varphi) \end{pmatrix} . \]
The dynamics generated by $D_H$ operates a radial translation at speed 1 with respect to $r_*$ (towards $-\infty$ for the first component of $\Psi$ and towards $+\infty$ for the second) as well as a rotation of fixed angular velocity $a/(r_+^2+a^2)$, i.e. the rotation speed of the horizon as perceived by an observer static at infinity. The operator $D_\infty$ induces the same radial translation as $D_H$ without the rotation. Both Hamiltonians have the same spaces of incoming (respectively outgoing) data:
\[ \mathcal{H}^- = \{ \Psi = (\psi_0, \psi_1) \in \mathcal{H} ;\ \psi_1 = 0 \} \quad (\text{respectively } \mathcal{H}^+ = \{ \Psi = (\psi_0, \psi_1) \in \mathcal{H} ;\ \psi_0 = 0 \}) . \]
^a The constructions of Sec. 8 will indeed be based on asymptotic profiles, but they will be slightly different from the ones used here, so as to make their geometric significance more obvious. The scattering results using the profiles of Sec. 8 and the ones described in this section are equivalent.
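The explicit action of $e^{itD_H}$ and $e^{itD_\infty}$ on profiles is simple enough to be written as function composition. The sketch below represents $\psi_0$, $\psi_1$ as Python callables of $(r_*, \theta, \varphi)$; this representation, the function names and the sample profile are illustrative assumptions, not a discretization used in the paper.

```python
import math

def horizon_angular_velocity(M, a):
    """Rotation speed a / (r_+^2 + a^2) of the horizon (requires |a| < M)."""
    r_plus = M + math.sqrt(M * M - a * a)
    return a / (r_plus * r_plus + a * a)

def propagate_horizon(psi0, psi1, t, M, a):
    """Apply exp(i t D_H): radial translation at speed 1 plus rigid rotation."""
    omega = horizon_angular_velocity(M, a)

    def new_psi0(r_star, theta, phi):
        return psi0(r_star + t, theta, phi - omega * t)

    def new_psi1(r_star, theta, phi):
        return psi1(r_star - t, theta, phi - omega * t)

    return new_psi0, new_psi1

def propagate_infinity(psi0, psi1, t):
    """Apply exp(i t D_infinity): the same radial translation, no rotation."""

    def new_psi0(r_star, theta, phi):
        return psi0(r_star + t, theta, phi)

    def new_psi1(r_star, theta, phi):
        return psi1(r_star - t, theta, phi)

    return new_psi0, new_psi1

def sample_profile(r_star, theta, phi):
    """Hypothetical smooth test profile, used only for illustration."""
    return math.exp(-r_star * r_star) * math.sin(theta) * math.cos(phi)

if __name__ == "__main__":
    p0, p1 = propagate_horizon(sample_profile, sample_profile, 2.0, 1.0, 0.5)
    print(p0(-2.0, 1.0, 0.3), p1(2.0, 1.0, 0.3))
```

Composing two such maps with times $s$ and $t$ gives the map with time $s + t$, which is the group property of the two unitary dynamics.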
Although the geometric interpretation is less relevant, it is also interesting to use Dirac-type operators, involving the full $\slashed{D}_{S^2}$ in their angular part, as comparison dynamics. We introduce
\[ \slashed{D}_H = \gamma D_{r_*} + e^{-\kappa_+|r_*|}\,\theta_0(r_*)\,\slashed{D}_{S^2} - \frac{a}{r_+^2+a^2}\, D_\varphi , \qquad \slashed{D}_\infty = \gamma D_{r_*} + \frac{\theta_1(r_*)}{|r_*|}\,\slashed{D}_{S^2} , \tag{2.60} \]
where $\kappa_+$, the surface gravity at the outer horizon, is given by
\[ \kappa_+ = \frac{r_+ - r_-}{2(r_+^2+a^2)} = \frac{\sqrt{M^2 - a^2}}{r_+^2+a^2} , \tag{2.61} \]
$\theta_0, \theta_1 \in C^\infty(\mathbb{R})$, $\theta_0$ is zero in the neighborhood of 0 and 1 far from the origin, and $\theta_1(x) = \mathbf{1}_{\mathbb{R}^+}(x)\theta_0(x)$. The choice of $\slashed{D}_H$ is related to an adequate choice of constant $R_0$ in the definition (2.39) of $r_*$ (see Remark 7.2). These two Hamiltonians are self-adjoint on $\mathcal{H}$.
Our first theorem establishes the existence of asymptotic velocities for all Hamiltonians $\slashed{D}_K$, $D_H$, $D_\infty$, $\slashed{D}_H$ and $\slashed{D}_\infty$. Then we give a first construction of wave operators using asymptotic profiles as comparison dynamics in Theorem 2.2 and another construction in Theorem 2.3 using $\slashed{D}_H$ and $\slashed{D}_\infty$ instead. We denote by the letters $W$ and $\mathcal{W}$ the wave operators associated with asymptotic profiles; we use the letter $\Omega$, in accordance with the notations of Sec. 6, for the wave operators associated with $\slashed{D}_H$ and $\slashed{D}_\infty$. All these wave operators are defined using projections onto the positive and negative spectra of our asymptotic velocities.
Theorem 2.1 (Asymptotic velocities).
(i) The three Hamiltonians $\slashed{D}_H$, $\slashed{D}_\infty$ and $\slashed{D}_K$ are self-adjoint on $\mathcal{H}$ and their spectra are purely absolutely continuous; in particular, their point spectra are empty.
(ii) There exist bounded self-adjoint operators $P^\pm$, $P_H^\pm$, $P_\infty^\pm$ such that, for all $J \in C_\infty(\mathbb{R})$:
\[ J(P^\pm) = \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K}\, J\!\left( \frac{r_*}{t} \right) e^{it\slashed{D}_K} , \tag{2.62} \]
\[ J(P_H^\pm) = \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_H}\, J\!\left( \frac{r_*}{t} \right) e^{it\slashed{D}_H} , \tag{2.63} \]
\[ J(P_\infty^\pm) = \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_\infty}\, J\!\left( \frac{r_*}{t} \right) e^{it\slashed{D}_\infty} , \tag{2.64} \]
\[ J(\mp\gamma) = \operatorname*{s-lim}_{t\to\pm\infty} e^{-itD_H}\, J\!\left( \frac{r_*}{t} \right) e^{itD_H} = \operatorname*{s-lim}_{t\to\pm\infty} e^{-itD_\infty}\, J\!\left( \frac{r_*}{t} \right) e^{itD_\infty} . \tag{2.65} \]
In addition, we have $P^- = -P^+$, $P_H^- = -P_H^+$, $P_\infty^- = -P_\infty^+$,
\[ \sigma(P^+) = \sigma(P_H^+) = \sigma(P_\infty^+) = \{-1, 1\} . \]
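Remark 2.5. Note that $\mathbf{1}_{\mathbb{R}^\pm}(-\gamma) = P_{\mathcal{H}^\pm}$, where $P_{\mathcal{H}^\pm}$ is the projector from $\mathcal{H}$ onto $\mathcal{H}^\pm$.
Theorem 2.2 (Asymptotic profiles).
(i) The classical wave operators defined by the strong limits
\[ W_H^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K} e^{itD_H} P_{\mathcal{H}^\mp} , \tag{2.66} \]
\[ W_\infty^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K} e^{itD_\infty} P_{\mathcal{H}^\pm} , \tag{2.67} \]
\[ \tilde W_H^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-itD_H} e^{it\slashed{D}_K}\, \mathbf{1}_{\mathbb{R}^-}(P^\pm) , \tag{2.68} \]
\[ \tilde W_\infty^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-itD_\infty} e^{it\slashed{D}_K}\, \mathbf{1}_{\mathbb{R}^+}(P^\pm) , \tag{2.69} \]
exist and satisfy
\[ \tilde W_H^\pm = (W_H^\pm)^* , \qquad \tilde W_\infty^\pm = (W_\infty^\pm)^* , \]
\[ \tilde W_H^\pm W_H^\pm + \tilde W_\infty^\pm W_\infty^\pm = W_H^\pm \tilde W_H^\pm + W_\infty^\pm \tilde W_\infty^\pm = \mathrm{Id}_{\mathcal{H}} , \]
\[ \ker(W_H^\pm) = \mathcal{H}^\pm , \qquad \ker(W_\infty^\pm) = \mathcal{H}^\mp , \qquad \operatorname{ran}(\tilde W_H^\pm) = \mathcal{H}^\mp , \qquad \operatorname{ran}(\tilde W_\infty^\pm) = \mathcal{H}^\pm . \]
(ii) The scattering can be described in a more synthetic manner by defining global wave operators involving both asymptotic dynamics:
\[ \mathcal{W}^+ : \mathcal{H}^- \oplus \mathcal{H}^+ \longrightarrow \mathcal{H} , \qquad ((\psi_0, 0), (0, \psi_1)) \longmapsto W_H^+(\psi_0, 0) + W_\infty^+(0, \psi_1) , \tag{2.70} \]
\[ \mathcal{W}^- : \mathcal{H}^+ \oplus \mathcal{H}^- \longrightarrow \mathcal{H} , \qquad ((0, \psi_1), (\psi_0, 0)) \longmapsto W_H^-(0, \psi_1) + W_\infty^-(\psi_0, 0) , \tag{2.71} \]
\[ \tilde{\mathcal{W}}^+ : \mathcal{H} \longrightarrow \mathcal{H}^- \oplus \mathcal{H}^+ , \qquad \tilde{\mathcal{W}}^+\Psi = (\tilde W_H^+\Psi , \tilde W_\infty^+\Psi) , \tag{2.72} \]
\[ \tilde{\mathcal{W}}^- : \mathcal{H} \longrightarrow \mathcal{H}^+ \oplus \mathcal{H}^- , \qquad \tilde{\mathcal{W}}^-\Psi = (\tilde W_H^-\Psi , \tilde W_\infty^-\Psi) . \tag{2.73} \]
The operators $\mathcal{W}^\pm$ are isometries and satisfy
\[ \tilde{\mathcal{W}}^+\mathcal{W}^+ = \mathrm{Id}_{\mathcal{H}^-\oplus\mathcal{H}^+} , \qquad \tilde{\mathcal{W}}^-\mathcal{W}^- = \mathrm{Id}_{\mathcal{H}^+\oplus\mathcal{H}^-} , \qquad \mathcal{W}^+\tilde{\mathcal{W}}^+ = \mathcal{W}^-\tilde{\mathcal{W}}^- = \mathrm{Id}_{\mathcal{H}} . \]
The scattering operator S is the isometry defined by the commutative diagram: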
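Theorem 2.3 (Dirac-type comparison dynamics). The classical wave operators defined by the strong limits
\[ \Omega_H^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K} e^{it\slashed{D}_H}\, \mathbf{1}_{\mathbb{R}^-}(P_H^\pm) , \tag{2.74} \]
\[ \Omega_\infty^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K} e^{it\slashed{D}_\infty}\, \mathbf{1}_{\mathbb{R}^+}(P_\infty^\pm) , \tag{2.75} \]
\[ \tilde\Omega_H^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_H} e^{it\slashed{D}_K}\, \mathbf{1}_{\mathbb{R}^-}(P^\pm) , \tag{2.76} \]
\[ \tilde\Omega_\infty^\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_\infty} e^{it\slashed{D}_K}\, \mathbf{1}_{\mathbb{R}^+}(P^\pm) , \tag{2.77} \]
exist and satisfy
\[ \tilde\Omega_H^\pm = (\Omega_H^\pm)^* , \qquad \tilde\Omega_\infty^\pm = (\Omega_\infty^\pm)^* , \]
\[ \tilde\Omega_H^\pm\Omega_H^\pm + \tilde\Omega_\infty^\pm\Omega_\infty^\pm = \Omega_H^\pm\tilde\Omega_H^\pm + \Omega_\infty^\pm\tilde\Omega_\infty^\pm = \mathrm{Id}_{\mathcal{H}} . \]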
Remark 2.6. Theorems 2.2 and 2.3 describe the scattering properties of the solutions of (2.56). The scattering properties of the vector $\Phi$ (describing the physical Weyl field $\phi_A$ in the spin-frame $(o^A, \iota^A)$), are obtained from the results of these theorems via the identifying operator
\[ \mathcal{J} : L^2((\Sigma ;\, d\mathrm{Vol}) ;\, \mathbb{C}^2) \longrightarrow \mathcal{H} , \qquad \mathcal{J}\Phi := \left( \frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2} \right)^{1/4} \mathbf{U}\Phi . \]
More precisely, to a given wave operator $W$, $\mathcal{W}$ or $\Omega$, for the solution $\Psi$ of (2.56), corresponds the wave operator $\mathcal{J}^{-1}W\mathcal{J}$, $\mathcal{J}^{-1}\mathcal{W}\mathcal{J}$, or $\mathcal{J}^{-1}\Omega\mathcal{J}$, for the vector $\Phi$.
Remark 2.7. The theorems above show that the solutions of Eq. (2.56) satisfy asymptotically the same $L^2$ properties as the solutions propagated by the simpler comparison dynamics. In particular, the $L^2$ norm in a compact set tends to zero as $t \to \pm\infty$. Remark 2.6 entails that these properties are also satisfied by the physical field $\Phi$.
The next four sections describe a complete scattering theory based on a Mourre estimate for a general analytic framework. In Sec. 7, the form (2.56) of Weyl's equation outside a slow Kerr black hole is understood as a special case of this general framework; Theorems 2.1–2.3 are then deduced from the results of Sec. 6. In Sec. 8, we shall describe the scattering properties of Dirac fields outside a Kerr black hole in a more geometrical manner. A new form of Theorem 2.2 will be derived, using the flows of outgoing and incoming principal null geodesics as comparison dynamics. This form is the most natural geometrically and enables us to interpret the scattering theory as the solution of a singular Goursat problem on the Penrose compactification of block I.
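3. Abstract Analytic Framework
In this section, we describe generic Dirac-type operators on the manifold $\Sigma = \mathbb{R}\times S^2$, endowed with the $C^\infty$ density $d\mu = dr\, d\omega$. We use the notation $r$ for the “radial” variable, for simplicity; it is to be understood as corresponding to the variable $r_*$, and not $r$, in the Kerr case. We shall often denote $f'$ the derivative of $f$ with respect to $r$, even for functions depending also on $\omega$. We define several operators: first the reference Dirac operator $\slashed{D}_0$, then a perturbed and some asymptotic Dirac operators. The perturbed operator is a generalization of the Hamiltonian of Eq. (2.56). The choice of the others is guided by the wish to compare the full evolution with both asymptotic profiles and the Dirac propagator on simplified Lorentzian manifolds.
3.1. Symbol classes
Let $\eta > 0$. We define the following symbol classes as subsets of $C^\infty(\Sigma)$:
\[ f \in \mathbf{S}^{m,n} \quad \text{iff} \quad \forall\, \alpha \in \mathbb{N},\ \beta \in \mathbb{N}^2 \quad \partial_r^\alpha\partial_\omega^\beta f \in \begin{cases} O(\langle r\rangle^{m-\alpha}) & r \to +\infty , \\ O(e^{n\eta|r|}) & r \to -\infty , \end{cases} \]
\[ f \in \mathbf{S}^m \quad \text{iff} \quad \forall\, \alpha \in \mathbb{N},\ \beta \in \mathbb{N}^2 \quad \partial_r^\alpha\partial_\omega^\beta f \in O(\langle r\rangle^{m-\alpha}) . \]
Recall that for $f \in C^\infty(\mathbb{R})$, we have
\[ f \in S^m \quad \text{iff} \quad \forall\, \alpha \in \mathbb{N} \quad \partial_r^\alpha f \in O(\langle r\rangle^{m-\alpha}) . \]
We shall understand $S^m$ as the subset of spherically symmetric elements of $\mathbf{S}^m$.
3.2. Technical results
We consider the operator
\[ \slashed{D}_T = \gamma D_r + p(r)\slashed{D}_{S^2} + c_1 \]
with
\[ \gamma = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad i\slashed{D}_{S^2} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \left( \partial_\theta + \frac{1}{2}\cot\theta \right) + \frac{1}{\sin\theta} \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} \partial_\varphi , \qquad c_1 \in \mathbb{R} \]
and $p \in C^\infty(\mathbb{R})$, not necessarily bounded. We consider $\slashed{D}_T$ as an operator acting on the Hilbert space $\mathcal{H}$ defined earlier in (2.59):
\[ \mathcal{H} = L^2((\mathbb{R}\times S^2 ;\, dr\, d\omega) ;\, \mathbb{C}^2) . \]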
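In order to describe the domain of $\slashed{D}_T$ we will introduce spin weighted harmonics $Y^l_{sn}$ (for a complete definition, see for example [45]). For each spinorial weight $s$, $2s \in \mathbb{Z}$, the family $\{ Y^l_{sn} = e^{in\varphi}u^l_{sn} ;\ l - |s| \in \mathbb{N},\ l - |n| \in \mathbb{N} \}$ forms a Hilbert basis of $L^2(S^2_\omega, d\omega)$ and we have the following relations
\[ \frac{du^l_{sn}}{d\theta} - \frac{n - s\cos\theta}{\sin\theta}\, u^l_{sn} = -i\left[ (l+s)(l-s+1) \right]^{1/2} u^l_{s-1,n} , \]
\[ \frac{du^l_{sn}}{d\theta} + \frac{n - s\cos\theta}{\sin\theta}\, u^l_{sn} = -i\left[ (l+s+1)(l-s) \right]^{1/2} u^l_{s+1,n} . \]
We define $\otimes_2$ as the following operation between two vectors of $\mathbb{C}^2$
\[ \forall\, v = (v_1, v_2),\ u = (u_1, u_2) , \qquad v \otimes_2 u = (u_1v_1, u_2v_2) . \]
Since the families
\[ \left\{ Y^l_{\frac{1}{2},n} ;\ (n,l) \in \mathcal{I} \right\} , \qquad \left\{ Y^l_{-\frac{1}{2},n} ;\ (n,l) \in \mathcal{I} \right\} , \qquad \mathcal{I} = \left\{ (n,l) \,/\, l - \tfrac{1}{2} \in \mathbb{N},\ l - |n| \in \mathbb{N} \right\} \]
form a Hilbert basis of $L^2(S^2_\omega, d\omega)$, we express $\mathcal{H}$ as a direct sum
\[ \mathcal{H} = \oplus_{(n,l)\in\mathcal{I}}\, \mathcal{H}_{nl} , \qquad \mathcal{H}_{nl} = L^2((\mathbb{R} ;\, dr) ;\, \mathbb{C}^2) \otimes_2 Y_{nl} , \qquad Y_{nl} = \left( Y^l_{-\frac{1}{2},n} ,\, Y^l_{\frac{1}{2},n} \right) . \]
We shall henceforth identify $\mathcal{H}_{nl}$ and $L^2((\mathbb{R} ;\, dr) ;\, \mathbb{C}^2)$ as well as $\psi_{nl} \otimes_2 Y_{nl}$ and $\psi_{nl}$. We see that
\[ \slashed{D}_T = \oplus_{nl}\, \slashed{D}_T^{nl} \quad \text{with} \quad \slashed{D}_T^{nl} := \gamma D_r + p(r)\,\tau\left( l + \frac{1}{2} \right) + c_1 , \qquad \tau := \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} . \tag{3.1} \]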
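The algebra behind the mode operators in (3.1) can be illustrated with 2×2 matrices. In the sketch below, $p$ is frozen at a constant value and $D_r$ is replaced by a real frequency $\xi$; both simplifications are assumptions made only for this illustration and are not part of the paper's argument. Since $\gamma$ and $\tau$ anticommute and square to the identity, the matrix $\gamma\xi + p\,q\,\tau$ with $q = l + \frac12$ squares to $(\xi^2 + p^2q^2)\,\mathrm{Id}_2$.

```python
GAMMA = [[1.0, 0.0], [0.0, -1.0]]
TAU = [[0.0, -1.0], [-1.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, A):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def frozen_symbol(xi, p_value, l, c1=0.0):
    """Matrix gamma*xi + p*tau*(l + 1/2) + c1, with p frozen at p_value."""
    q = l + 0.5
    return mat_add(mat_add(mat_scale(xi, GAMMA), mat_scale(p_value * q, TAU)),
                   mat_scale(c1, IDENTITY))

def anticommutator(A, B):
    """A B + B A."""
    return mat_add(mat_mul(A, B), mat_mul(B, A))

if __name__ == "__main__":
    print(anticommutator(GAMMA, TAU))
    S = frozen_symbol(3.0, 2.0, 1.5)
    print(mat_mul(S, S))
```

In this frozen picture the eigenvalues of the matrix (with $c_1 = 0$) are $\pm\sqrt{\xi^2 + p^2q^2}$, which is the heuristic reason why control of the mode operator yields separate control of $D_r$ and of $p(r)\,\tau q$, as the estimates of this subsection make precise.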
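In what follows, we put $q := l + \frac{1}{2}$ and we assume
\[ \exists\, C > 0 , \quad \forall\, r \in \mathbb{R} \quad |p'(r)| \le C|p(r)| . \]
We put
\[ D(\slashed{D}_T^{nl}) = \{ u \in \mathcal{H}_{nl} ;\ \slashed{D}_T^{nl}u \in \mathcal{H}_{nl} \} , \]
\[ D(\slashed{D}_T) = \left\{ u = \sum_{nl} u_{nl} ;\ u_{nl} \in D(\slashed{D}_T^{nl}) ,\ \sum_{nl} \|u_{nl}\|^2 + \|\slashed{D}_T^{nl}u_{nl}\|^2 < \infty \right\} . \]
Our aim is to show that $(\slashed{D}_T, D(\slashed{D}_T))$ is self-adjoint. We will need several lemmas.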
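Lemma 3.1. $(C_0^\infty(\mathbb{R}))^2$ is dense in $D(\slashed{D}_T^{nl})$ equipped with the graph norm.
Proof. For $\phi \in C_0^\infty(\mathbb{R})$ and $f = (f_1, f_2) \in \mathcal{H}_{nl}$ we put $\phi * f := (\phi * f_1, \phi * f_2)$, “$*$” denoting the convolution. Let $f \in D(\slashed{D}_T^{nl})$. We use a standard approximation procedure. Let $\phi \in C_0^\infty(\mathbb{R})$ such that $\phi \ge 0$, $\int\phi = 1$, $\phi_\delta(x) := \delta^{-1}\phi(\frac{x}{\delta})$, $\chi \in C_0^\infty(\{|x| < 1\})$, $\chi \equiv 1$ in a neighborhood of 0, $\|\chi^{(\alpha)}\| \le 1$, $\alpha = 0, 1$, $\chi_m(x) := \chi(\frac{x}{m})$. We put
\[ f_{\delta,m}(x) := \phi_\delta * (\chi_m f) , \qquad f_m := \chi_m f . \]
We write
\[ \|\slashed{D}_T^{nl}f - \slashed{D}_T^{nl}f_{\delta,m}\| \le \|\slashed{D}_T^{nl}f - \slashed{D}_T^{nl}f_m\| + \|\slashed{D}_T^{nl}f_m - \slashed{D}_T^{nl}f_{\delta,m}\| . \]
Let us first consider
\[ \slashed{D}_T^{nl}f - \slashed{D}_T^{nl}f_m = (1 - \chi_m)\slashed{D}_T^{nl}f + i\gamma(\chi_m)'f . \]
We obtain
\[ \forall\, \epsilon > 0 ,\ \exists\, M ;\quad \forall\, m \ge M \quad \|\slashed{D}_T^{nl}f - \slashed{D}_T^{nl}f_m\| < \frac{\epsilon}{2} . \]
For a given $\epsilon > 0$ we fix $m \ge M$. Note that $f_m \in (H^1_{\mathrm{comp}})^2$. Indeed
\[ \slashed{D}_T^{nl}f_m = \gamma D_rf_m + p\tau qf_m \]
and $\slashed{D}_T^{nl}f_m$, $p\tau qf_m \in \mathcal{H}_{nl}$. We have
\[ \|\slashed{D}_T^{nl}f_m - \slashed{D}_T^{nl}f_{\delta,m}\| \le \|\gamma D_rf_m - \gamma D_rf_{\delta,m}\| + \|p\tau qf_m - p\tau qf_{\delta,m}\| . \tag{3.2} \]
It is well known that
\[ \|\gamma D_rf_m - \gamma D_rf_{\delta,m}\| \to 0 \quad (\delta \to 0) . \]
Let us consider the second term in (3.2). We have $\operatorname{supp} f_m \subset B(0, m)$, $\operatorname{supp}\phi_\delta \subset B(0, 1)$. It follows that
\[ \operatorname{supp}\phi_\delta * f_m \subset B(0, 1) + B(0, m) =: K . \]
We estimate ($|f|^2 = |f_1|^2 + |f_2|^2$):
\[ \int_K |p(f_{\delta,m} - f_m)|^2\, dx \le \sup_{x\in K}|p(x)|^2 \int_K |f_{\delta,m} - f_m|^2\, dx \le C\int_K |f_{\delta,m} - f_m|^2\, dx \to 0 \quad (\delta \to 0) . \]
This concludes the proof of the lemma.
Lemma 3.2. We have
\[ \forall\, u \in D(\slashed{D}_T) , \qquad \|\gamma D_ru\| \le C(\|\slashed{D}_Tu\| + \|u\|) , \tag{3.3} \]
\[ \forall\, u \in D(\slashed{D}_T) , \qquad \|p(r)\slashed{D}_{S^2}u\| \le C(\|\slashed{D}_Tu\| + \|u\|) , \tag{3.4} \]
\[ \forall\, u \in D(\slashed{D}_T^{nl}) , \qquad \|\gamma D_ru\| \le C(\|\slashed{D}_T^{nl}u\| + \|u\|) , \tag{3.5} \]
\[ \forall\, u \in D(\slashed{D}_T^{nl}) , \qquad \|p(r)\tau qu\| \le C(\|\slashed{D}_T^{nl}u\| + \|u\|) . \tag{3.6} \]
This implies
\[ D(\slashed{D}_T^{nl}) = (H^1(\mathbb{R}))^2 \cap D(p) , \]
where
\[ D(p) = \{ u \in \mathcal{H}_{nl} ;\ pu \in \mathcal{H}_{nl} \} . \]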
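Proof. (3.5), (3.6) follow from (3.3), (3.4). In the sense of quadratic forms on $D(\slashed{D}_T)$ we have
\[ \slashed{D}_T^2 = D_r^2 + p^2(r)\slashed{D}_{S^2}^2 + \frac{\gamma}{i}\, p'(r)\slashed{D}_{S^2} \ge \frac{1}{2}\left( D_r^2 + p^2(r)\slashed{D}_{S^2}^2 \right) - C , \]
which proves the lemma.
Corollary 3.1. We have
(i) $D(\slashed{D}_T^{nl}) \subset (H^1(\mathbb{R}))^2$.
(ii) If $f = (f_{ij})$ and $f_{ij}, g \in C_\infty(\mathbb{R})$, then $f(r)g(\slashed{D}_T^{nl})$ is compact.
Lemma 3.3. $(\slashed{D}_T^{nl}, D(\slashed{D}_T^{nl}))$ is self-adjoint.
Proof. By a classical result due to Thaller^b [55, Theorem 4.3], we know that $(\slashed{D}_T^{nl}, (C_0^\infty(\mathbb{R}))^2)$ is essentially self-adjoint. Let us denote by $D_T(\slashed{D}_T^{nl})$ the domain of its self-adjoint extension. We have to show that $D_T(\slashed{D}_T^{nl}) = D(\slashed{D}_T^{nl})$.
If $u$ belongs to $D_T(\slashed{D}_T^{nl})$ then, by definition, there exists a sequence $u_m \in (C_0^\infty(\mathbb{R}))^2$ such that $u_m \to u$, $\slashed{D}_T^{nl}u_m \to v =: \slashed{D}_T^{nl}u$ in $\mathcal{H}_{nl}$. Besides $\slashed{D}_T^{nl}u_m \to \slashed{D}_T^{nl}u$ in the sense of distributions and we find that $\slashed{D}_T^{nl}u$, defined in the sense of distributions, belongs to $\mathcal{H}_{nl}$, i.e. $u \in D(\slashed{D}_T^{nl})$.
Let now $u \in D(\slashed{D}_T^{nl})$. As $(C_0^\infty(\mathbb{R}))^2$ is dense in $D(\slashed{D}_T^{nl})$ by Lemma 3.1, there exists a sequence $u_m \in (C_0^\infty(\mathbb{R}))^2$ such that $u_m \to u$ in $\mathcal{H}_{nl}$, $\slashed{D}_T^{nl}u_m \to \slashed{D}_T^{nl}u$ in $\mathcal{H}_{nl}$, i.e. $u \in D_T(\slashed{D}_T^{nl})$.
Lemma 3.4. $(C_0^\infty(\Sigma))^2$ is dense in $D(\slashed{D}_T)$ equipped with the graph norm.
Proof. Recall that
\[ D(\slashed{D}_T) = \left\{ u = \sum_{nl} u_{nl} \in \mathcal{H} ;\ u_{nl} \in H^1_{nl} ,\ \sum_{nl} \|\slashed{D}_T^{nl}u_{nl}\|^2 < \infty \right\} . \]
Let $u = \sum_{nl} u_{nl} \in D(\slashed{D}_T)$. For $\epsilon > 0$ we choose $N > 0$ such that
\[ \sum_{|(n,l)|\ge N} \|\slashed{D}_T^{nl}u_{nl}\|^2 + \|u_{nl}\|^2 < \frac{\epsilon}{2} . \]
$(C_0^\infty(\mathbb{R}))^2$ being dense in $D(\slashed{D}_T^{nl})$ we can choose $\phi^N_{nl} \in (C_0^\infty(\mathbb{R}))^2$ such that
\[ \forall\, |(n,l)| \le N , \qquad \|\slashed{D}_T^{nl}(u_{nl} - \phi^N_{nl})\|^2 + \|(u_{nl} - \phi^N_{nl})\|^2 < \frac{\epsilon}{2N^2} . \]
We put
\[ \phi^N := \sum_{|(n,l)|\le N} \phi^N_{nl} \in (C_0^\infty(\Sigma))^2 . \]
^b Thaller's result is proved in dimension 3 but the proof is independent of the dimension.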
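We have
\[ \|\slashed{D}_T(u - \phi^N)\|^2 + \|u - \phi^N\|^2 = \sum_{|(n,l)| \le N} \bigl( \|\slashed{D}_T^{nl}(u_{nl} - \phi^N_{nl})\|^2 + \|u_{nl} - \phi^N_{nl}\|^2 \bigr) + \sum_{|(n,l)| \ge N} \bigl( \|\slashed{D}_T^{nl}u_{nl}\|^2 + \|u_{nl}\|^2 \bigr) < \varepsilon. \]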
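We find:

Lemma 3.5. The operator $\slashed{D}_T$ with domain
\[ D(\slashed{D}_T) = \{u \in \mathcal{H};\ \slashed{D}_T u \in \mathcal{H}\} = \Bigl\{ u = \sum_{nl} u_{nl};\ u_{nl} \in D(\slashed{D}_T^{nl}),\ \sum_{nl}\|\slashed{D}_T^{nl}u_{nl}\|^2 < \infty \Bigr\} \]
is self-adjoint.

Proof. Let us first show that
\[ D(\slashed{D}_T) = \{u \in \mathcal{H};\ \slashed{D}_T u \in \mathcal{H}\} = \Bigl\{ u = \sum_{nl} u_{nl};\ u_{nl} \in D(\slashed{D}_T^{nl}),\ \sum_{nl}\|\slashed{D}_T^{nl}u_{nl}\|^2 < \infty \Bigr\}. \tag{3.7} \]
Let $u = \sum_{nl} u_{nl} \in \mathcal{H}$. As $\slashed{D}_T : \mathcal{H} \to \mathcal{D}'$ is continuous, it follows that
\[ \slashed{D}_T u = \sum_{nl} \slashed{D}_T^{nl}u_{nl} \]
in the sense of distributions. The equality (3.7) then follows from the fact that
\[ \sum_{nl} \slashed{D}_T^{nl}u_{nl} \in \mathcal{H} \iff \forall\, n, l \ \ \slashed{D}_T^{nl}u_{nl} \in \mathcal{H}_{nl}, \quad \sum_{nl}\|\slashed{D}_T^{nl}u_{nl}\|^2 < \infty. \]
We now have to show:
(i) $(\slashed{D}_T, D(\slashed{D}_T))$ is closed,
(ii) $\operatorname{ran}(\slashed{D}_T \pm i) = \mathcal{H}$.

We start with (i). Let $u_m \in D(\slashed{D}_T)$, $u_m \to u$, $\slashed{D}_T u_m \to v$. We must show that $u \in D(\slashed{D}_T)$ and $\slashed{D}_T u = v$. Let
\[ u_m = \sum_{nl} u^m_{nl}, \quad u = \sum_{nl} u_{nl}, \quad v = \sum_{nl} v_{nl}. \]
Clearly $u^m_{nl} \to u_{nl}$, $\slashed{D}_T^{nl}u^m_{nl} \to v_{nl}$. As $(\slashed{D}_T^{nl}, D(\slashed{D}_T^{nl}))$ is closed, $u_{nl} \in D(\slashed{D}_T^{nl})$ and $\slashed{D}_T^{nl}u_{nl} = v_{nl}$. We have
\[ \slashed{D}_T u = \sum_{nl} \slashed{D}_T^{nl}u_{nl} = \sum_{nl} v_{nl} = v. \]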
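But $v \in \mathcal{H}$, i.e.
\[ \sum_{nl}\|\slashed{D}_T^{nl}u_{nl}\|^2 = \sum_{nl}\|v_{nl}\|^2 < \infty \quad\text{and}\quad u \in D(\slashed{D}_T). \]
Let us now show (ii). Let $v = \sum_{nl} v_{nl} \in \mathcal{H}$. We have to find $u \in D(\slashed{D}_T)$ such that $(\slashed{D}_T \pm i)u = v$. As $(\slashed{D}_T^{nl}, D(\slashed{D}_T^{nl}))$ is self-adjoint, for each $n, l$ we find $u_{nl} \in D(\slashed{D}_T^{nl})$ such that $(\slashed{D}_T^{nl} \pm i)u_{nl} = v_{nl}$. We put $u := \sum_{nl} u_{nl}$ (the sum converges in $\mathcal{H}$ since $\|u_{nl}\| \le \|v_{nl}\|$) and check
\[ \slashed{D}_T u = \sum_{nl} \slashed{D}_T^{nl}u_{nl} = \mp iu + v \in \mathcal{H}, \quad\text{i.e. } u \in D(\slashed{D}_T). \]
This concludes the proof of the lemma.

Remark 3.1. If we suppose that $p$ is bounded, our results follow immediately from the Kato–Rellich theorem. For technical reasons, we shall need to consider the case $p(r) = c_0 e^{\eta r}$ for some constant $c_0 \ge 0$. We put
\[ \slashed{D}_e = \gamma D_r + c_0 e^{\eta r}\slashed{D}_{S^2} + c_1. \]

3.3. The reference Dirac operator

We consider on $(C_0^\infty(\Sigma))^2$ the operator
\[ \slashed{D}_0 := \gamma D_r + g(r)\slashed{D}_{S^2} + f(r), \qquad f, g \in C_b^\infty(\mathbb{R}),\ g > 0. \]
We assume $g \in S^{-1,-1}$, $f' \in S^{-3}$, and the existence of some constants $c_0 \ge 0$ and $c_1 \in \mathbb{R}$ such that
\[ (g(r) - c_0 e^{\eta r})^{(i)} = O(e^{(\eta+\varepsilon)r}) \ \text{ as } r \to -\infty, \quad \varepsilon > 0,\ i = 0, 1, \tag{3.8} \]
\[ f(r) - c_1 \in O(\langle r\rangle^{-2}), \quad r \to -\infty, \tag{3.9} \]
\[ \Bigl(g(r) - \frac{1}{r}\Bigr)^{(i)} \in O(\langle r\rangle^{-1-i-\varepsilon}) \ \text{ as } r \to +\infty, \quad \varepsilon > 0,\ i = 0, 1, \tag{3.10} \]
\[ f(r) \in O(\langle r\rangle^{-2}), \quad r \to \infty. \tag{3.11} \]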
Remark 3.2. Properties (3.8) and (3.10) imply the existence of $R_0 > 0$ and $c_2 > 0$ such that
\[ \forall\, r \ge R_0, \ g(r) \ge \frac{c_2}{r} \quad\text{and}\quad \forall\, r \le -R_0, \ g(r) \ge c_2 e^{\eta r}. \tag{3.12} \]

Remark 3.3. Note that the reference Dirac operator has the same principal terms as the Dirac operator associated with the Riemannian metric
\[ g_0 = dr^2 + g^{-2}(r)\,d\omega^2 \]
on $\mathbb{R}_r \times S^2$. The Riemannian manifold $(\mathbb{R}_r \times S^2, g_0)$ has two asymptotic ends: the end corresponding to $r \to +\infty$ is asymptotically flat and that corresponding to $r \to -\infty$ is asymptotically hyperbolic, in other words exponentially large (the size of the 2-sphere grows exponentially as $r \to -\infty$).

We have
\[ \slashed{D}_0 = \oplus_{(n,l) \in I}\, \slashed{D}_0^{nl}, \qquad \slashed{D}_0^{nl} = \gamma D_r + g(r)q\tau + f(r). \]
$\slashed{D}_0^{nl}$ is self-adjoint with domain $D(\slashed{D}_0^{nl}) = (H^1(\mathbb{R}))^2$. By Lemma 3.5, $\slashed{D}_0$ is self-adjoint with domain
\[ D(\slashed{D}_0) = \Bigl\{ \Psi = \sum_{(n,l) \in I} \psi_{nl};\ \psi_{nl} \in D(\slashed{D}_0^{nl}),\ \sum_{(n,l) \in I} \|\slashed{D}_0^{nl}\psi_{nl}\|^2_{(L^2(\mathbb{R}))^2} < \infty \Bigr\}. \]
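3.4. The perturbed Dirac operator

We consider on $C_0^\infty(\Sigma)$ a Dirac-type operator of the form
\[ \slashed{D} = h\slashed{D}_0 h + V; \qquad V = (V_{ij}),\ h \ge 0. \tag{3.13} \]
We suppose that $V_{ij}, h \in C_b^\infty(\Sigma)$ are some real functions satisfying the following conditions:
\[ \exists\, \alpha, \ 0 < \alpha < 1, \quad |h^2 - 1| \le \alpha, \tag{3.14} \]
\[ \partial_\omega h \in S^{-2}, \tag{3.15} \]
\[ h - 1 \in S^{-2}, \tag{3.16} \]
\[ V_{ij} \in S^{-2}. \tag{3.17} \]
We define $\slashed{D}_1 := \slashed{D} - \slashed{D}_0$.

3.5. Asymptotic dynamics

We have two asymptotic regions ($r \to \pm\infty$) and to each we associate an asymptotic operator. Let $\theta_0 \in C_b^\infty(\mathbb{R})$ be such that $\theta_0 = 0$ in a neighborhood of $0$ and $\theta_0(x) = 1$ for all $|x| \ge 1$, and let $\theta_1(x) := 1_{\mathbb{R}^+}(x)\theta_0(x)$. We first consider negative infinity. We put
\[ \slashed{D}_- := \gamma D_r + c\,e^{-\eta\theta_0(r)|r|}\slashed{D}_{S^2} + c_1 \qquad (c \ge 0) \]
and we define $\slashed{D}_-^{nl}$, $D(\slashed{D}_-^{nl}) = (H^1(\mathbb{R}))^2$ in the same way as for $\slashed{D}_0$. Clearly $(\slashed{D}_-^{nl}, D(\slashed{D}_-^{nl}))$ is self-adjoint and $\slashed{D}_-$ is self-adjoint with domain
\[ D(\slashed{D}_-) = \Bigl\{ \psi = \sum_{nl} \psi_{nl};\ \psi_{nl} \in D(\slashed{D}_-^{nl}),\ \sum_{nl}\|\slashed{D}_-^{nl}\psi_{nl}\|^2 < \infty \Bigr\}. \]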
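For positive infinity, we put
\[ \slashed{D}_+ = \gamma D_r + \theta_1(r)\frac{1}{|r|}\slashed{D}_{S^2}. \]
$\slashed{D}_+^{nl}$, $D(\slashed{D}_+^{nl})$ are defined as for $\slashed{D}_0$ and $\slashed{D}_-$; $(\slashed{D}_+^{nl}, D(\slashed{D}_+^{nl}))$ is self-adjoint and $\slashed{D}_+$ is self-adjoint with domain
\[ D(\slashed{D}_+) = \Bigl\{ \psi = \sum_{nl} \psi_{nl};\ \psi_{nl} \in D(\slashed{D}_+^{nl}),\ \sum_{nl}\|\slashed{D}_+^{nl}\psi_{nl}\|^2 < \infty \Bigr\}. \]
The constant $c$ in $\slashed{D}_-$ will be taken equal to $0$ for a comparison with asymptotic profiles and to $c_0$ for a Dirac-type asymptotic operator (see introduction to this section). We denote in what follows:
\[ \mathcal{N} := \{0, \pm\}, \quad f_0 := f, \quad g_0 := g, \quad f_+ := 0, \quad g_- := c\,e^{-\eta\theta_0(r)|r|}, \quad g_+(r) := \frac{1}{|r|}\theta_1(r), \quad f_- := c_1. \]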
4. Some Fundamental Properties of our Dirac-Type Hamiltonians

This section is mostly devoted to the proof of technical results that will be important later on. Many results stated here concern functions of self-adjoint operators. Their proof requires the use of the Helffer–Sjöstrand formula (see e.g. [15]). Let $\chi \in C_0^\infty(\mathbb{R})$ and let $H$ be a self-adjoint operator. There exists an almost analytic extension $\tilde\chi$ of $\chi$ such that
\[ \tilde\chi|_{\mathbb{R}} = \chi, \qquad \Bigl|\frac{\partial\tilde\chi}{\partial\bar z}(z)\Bigr| \le C|\mathrm{Im}\, z|^N, \quad \forall\, N \in \mathbb{N}, \]
\[ \chi(H) = \frac{1}{2\pi i}\int \frac{\partial\tilde\chi}{\partial\bar z}(z)\,(z - H)^{-1}\, dz \wedge d\bar z. \]

4.1. Description of the domains

Let us first note that the operator $\slashed{D}$ is self-adjoint with the same domain as $\slashed{D}_0$.

Lemma 4.1. $(\slashed{D}, D(\slashed{D}_0))$ is self-adjoint.

Proof. As $h : D(\slashed{D}_0) \to D(\slashed{D}_0)$, $(\slashed{D}, D(\slashed{D}_0))$ is well-defined and symmetric. We have
\[ \slashed{D} = h^2\slashed{D}_0 + \tilde V \tag{4.1} \]
with $\tilde V = (\tilde V_{ij})$, $\tilde V_{ij} \in S^{-2}$. The self-adjointness of $(\slashed{D}, D(\slashed{D}_0))$ follows from (3.14) and the Kato–Rellich theorem.

We put $D(\slashed{D}) := D(\slashed{D}_0)$. By (3.13) and (3.14) it can easily be checked that for $u \in \mathcal{H}$, the properties $\slashed{D}u \in \mathcal{H}$ and $\slashed{D}_0 u \in \mathcal{H}$ are equivalent. So we obtain
\[ D(\slashed{D}) = \{u \in \mathcal{H};\ \slashed{D}u \in \mathcal{H}\}. \]
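The passage from (3.13) to (4.1), and the way (3.14) enters the Kato–Rellich theorem, can be sketched as follows. This is only an illustration under assumptions not stated in this section: $D_r = -i\partial_r$, $h$ is scalar (so it commutes with $\gamma$ and with $g(r)$), and the angular commutator $[\slashed{D}_{S^2}, h]$ is a bounded multiplication operator built from $\partial_\omega h$.

```latex
\begin{align*}
\slashed{D} &= h\slashed{D}_0 h + V
  = h^2\slashed{D}_0 + h[\slashed{D}_0, h] + V, \\
[\slashed{D}_0, h] &= \gamma[D_r, h] + g(r)[\slashed{D}_{S^2}, h]
  = -i\gamma\,\partial_r h + g(r)[\slashed{D}_{S^2}, h], \\
\tilde V &:= V + h[\slashed{D}_0, h], \\
\|(\slashed{D} - \slashed{D}_0)u\|
  &\le \|(h^2 - 1)\slashed{D}_0 u\| + \|\tilde V u\|
  \le \alpha\,\|\slashed{D}_0 u\| + C\|u\|, \qquad 0 < \alpha < 1 .
\end{align*}
```

Under these assumptions the perturbation $\slashed{D} - \slashed{D}_0$ is symmetric and $\slashed{D}_0$-bounded with relative bound $\alpha < 1$, which is the hypothesis of the Kato–Rellich theorem.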
We denote in what follows $\mathcal{H}^1 := D(\slashed{D}) = D(\slashed{D}_0)$, $\mathcal{H}^1_{nl} := D(\slashed{D}_0^{nl}) = (H^1(\mathbb{R}))^2 \otimes_2 Y_{nl}$. Recall from Lemma 3.4 that $(C_0^\infty(\Sigma))^2$ is dense in $D(\slashed{D}_0) = D(\slashed{D})$. Let $\|u\|_{\mathcal{H}^1} = \|u\| + \|\slashed{D}_0 u\|$ be the graph norm of $\slashed{D}_0$ and $V^1$ the closure of $(C_0^\infty(\Sigma))^2$ in this norm.

Lemma 4.2. $\mathcal{H}^1 = V^1$.

Proof. (i) Let us first show that $V^1 \subset \mathcal{H}^1$. Let $u \in V^1$ and $u_m \in (C_0^\infty(\Sigma))^2$ such that $u_m \to u$. $\slashed{D}_0 u_m$ is a Cauchy sequence, so $\slashed{D}_0 u_m \to v \in \mathcal{H}$. Besides, $\slashed{D}_0 u_m \to \slashed{D}_0 u$ in the sense of distributions, so $\slashed{D}_0 u = v \in \mathcal{H}$, i.e. $u \in \mathcal{H}^1$.
(ii) We now show $\mathcal{H}^1 \subset V^1$. Let $u \in \mathcal{H}^1$. By Lemma 3.4 there exists a sequence $u_m \in (C_0^\infty(\Sigma))^2$ such that $\|u_m - u\|_{\mathcal{H}^1} \to 0$; it follows that $u \in V^1$.

We consider the quadratic forms associated to $\slashed{D}_0^2$ and
\[ H := D_r^2 + g^2(r)\slashed{D}_{S^2}^2, \]
which we denote by $Q_0$ and $Q_H$; for example $Q_0(u, u) = (\slashed{D}_0^2 u, u) + \|u\|^2$. We also denote $H^{nl} := D_r^2 + g^2(r)q^2$. Let $D(Q_i)$, $i \in \{0, H\}$, be the closure of $(C_0^\infty(\Sigma))^2$ in the norm $Q_i(u, u)$.

Lemma 4.3. The norms $Q_0(u, u)$ and $Q_H(u, u)$ are equivalent.

Proof. Let us first note
\[ \exists\, C > 0, \ \forall\, r \in \mathbb{R}, \quad |g'(r)| \le C g(r). \]
In the sense of quadratic forms on $(C_0^\infty(\Sigma))^2$ we have
\[ \slashed{D}_0^2 = D_r^2 + g^2(r)\slashed{D}_{S^2}^2 + \frac{\gamma}{i}\,g'(r)\slashed{D}_{S^2} + \gamma D_r f(r) + f(r)\gamma D_r + f^2(r) + 2f(r)g(r)\slashed{D}_{S^2}, \]
whence
\[ \frac{1}{2}\bigl(D_r^2 + g^2(r)\slashed{D}_{S^2}^2\bigr) - C \le \slashed{D}_0^2 \le 2\bigl(D_r^2 + g^2(r)\slashed{D}_{S^2}^2\bigr) + C. \]
This establishes the equivalence of the norms $Q_0$ and $Q_H$.

Corollary 4.1. $D(Q_0) \subset (H^1_{loc}(\Sigma))^2$.

Proof. $D(Q_0) = D(Q_H)$ by the previous lemma. But $D(Q_H) \subset (H^1_{loc}(\Sigma))^2$ by the local ellipticity of the operator $H$.
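The expansion of $\slashed{D}_0^2$ used in the proof of Lemma 4.3 can be checked term by term. The sketch below assumes, beyond what is stated in this section, that $\gamma^2 = 1$, that $\gamma$ anticommutes with $\slashed{D}_{S^2}$, that $D_r = -i\partial_r$ commutes with $\slashed{D}_{S^2}$, and that $f$ is scalar.

```latex
\begin{align*}
\slashed{D}_0^2 &= \bigl(\gamma D_r + g\slashed{D}_{S^2} + f\bigr)^2 \\
 &= D_r^2 + g^2\slashed{D}_{S^2}^2
   + \bigl(\gamma D_r\, g\slashed{D}_{S^2} + g\slashed{D}_{S^2}\gamma D_r\bigr)
   + \gamma D_r f + f\gamma D_r + 2fg\slashed{D}_{S^2} + f^2, \\
\gamma D_r\, g\slashed{D}_{S^2} + g\slashed{D}_{S^2}\gamma D_r
 &= \gamma[D_r, g]\slashed{D}_{S^2}
  = \frac{\gamma}{i}\,g'(r)\slashed{D}_{S^2}, \\
\Bigl|\Bigl(\frac{\gamma}{i}g'\slashed{D}_{S^2}u, u\Bigr)\Bigr|
 &\le C\,\|g\slashed{D}_{S^2}u\|\,\|u\|
  \le \tfrac14\|g\slashed{D}_{S^2}u\|^2 + C^2\|u\|^2 .
\end{align*}
```

The last line uses $|g'| \le Cg$; the cross terms involving the bounded function $f$ are absorbed in the same way, which is what produces the factors $\frac12$ and $2$ in the two-sided bound.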
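Corollary 4.1 and Lemma 4.2 together give:

Lemma 4.4. $\mathcal{H}^1 \subset (H^1_{loc}(\Sigma))^2$.

Corollary 4.2. If $f_{ij}, g \in C_\infty(\mathbb{R})$, $f(r) = (f_{ij}(r))_{ij}$, then $f(r)g(\slashed{D})$ is compact on $\mathcal{H}$.

Proof. It is sufficient to suppose $f, g \in C_0^\infty(\mathbb{R})$. So
\[ f(r)g(\slashed{D}) : \mathcal{H} \to (H^1(\Omega))^2 \hookrightarrow \mathcal{H} \]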
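where $\Omega$ is some bounded set and the above embedding is compact by the Rellich theorem.

Lemma 4.5. Let $\chi \in C_0^\infty(\mathbb{R})$. Then the operator $\chi(\slashed{D}) - \chi(\slashed{D}_0)$ is compact.

Proof. Using the Helffer–Sjöstrand formula it is sufficient to show
\[ \|(z - \slashed{D}_0)^{-1}(\slashed{D} - \slashed{D}_0)(z - \slashed{D})^{-1}\| \le C|\mathrm{Im}\, z|^{-2}, \tag{4.2} \]
\[ (z - \slashed{D}_0)^{-1}(\slashed{D} - \slashed{D}_0)(z - \slashed{D})^{-1} \ \text{ is compact for all } z \in \mathbb{C} \setminus (\sigma(\slashed{D}_0) \cup \sigma(\slashed{D})). \tag{4.3} \]
(4.2) is clear; let us show (4.3). We have
\[ \slashed{D} - \slashed{D}_0 = (h - 1)\slashed{D}_0 h + \slashed{D}_0(h - 1) + V. \]
It follows that
\begin{align*}
(z - \slashed{D}_0)^{-1}(\slashed{D} - \slashed{D}_0)(z - \slashed{D})^{-1} &= (z - \slashed{D}_0)^{-1}(h - 1)\slashed{D}_0 h(z - \slashed{D})^{-1} \\
&\quad + (z - \slashed{D}_0)^{-1}\slashed{D}_0(h - 1)(z - \slashed{D})^{-1} + (z - \slashed{D}_0)^{-1}V(z - \slashed{D})^{-1}.
\end{align*}
(4.3) now follows from the fact that $h : D(\slashed{D}) \to D(\slashed{D})$, (3.16), (3.17) and Corollary 4.2.

We will also need the following: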
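Lemma 4.6. $D(\slashed{D}^2) = D(\slashed{D}_0^2) = D(H)$.

Proof. We have
\[ \slashed{D} = h\slashed{D}_0 h + V, \qquad \slashed{D}^2 = h\slashed{D}_0 hV + Vh\slashed{D}_0 h + h\slashed{D}_0 h^2\slashed{D}_0 h + V^2, \tag{4.4} \]
\[ D(\slashed{D}_0^2) = \{u \in D(\slashed{D}_0);\ \slashed{D}_0 u \in D(\slashed{D}_0)\}, \]
\[ D(\slashed{D}^2) = \{u \in D(\slashed{D});\ \slashed{D}u \in D(\slashed{D})\}, \qquad D(\slashed{D}) = D(\slashed{D}_0). \]
Let $u \in D(\slashed{D}_0^2)$. We have to show that $\slashed{D}^2 u \in \mathcal{H}$. This follows from (4.4) and the fact that $h, V : D(\slashed{D}_0) \to D(\slashed{D}_0)$, $D(\slashed{D}_0^2) \to D(\slashed{D}_0^2)$. The proof of the reverse inclusion $D(\slashed{D}^2) \subset D(\slashed{D}_0^2)$ is analogous, using the fact that $h$ is non-vanishing [see (3.14)]. The following estimates give $D(H) = D(\slashed{D}_0^2)$:
\[ \|Hu\|^2 \le \|\slashed{D}_0^2 u\|^2 + \Bigl\|\frac{\gamma}{i}\,g'(r)\slashed{D}_{S^2}u\Bigr\|^2 \le C\bigl(\|\slashed{D}_0^2 u\|^2 + \|u\|^2\bigr), \]
\[ \|\slashed{D}_0^2 u\|^2 \le C\Bigl(\|Hu\|^2 + \Bigl\|\frac{\gamma}{i}\,g'(r)\slashed{D}_{S^2}u\Bigr\|^2\Bigr) \le C\bigl(\|Hu\|^2 + \|u\|^2\bigr). \]
We shall henceforth denote $\mathcal{H}^2 := D(\slashed{D}^2) = D(\slashed{D}_0^2) = D(H)$.

4.2. Resolvent estimates

Lemma 4.7. We have for all $u \in D(\slashed{D}_0)$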
\[ \|g(r)\slashed{D}_{S^2}u\| \le C(\|\slashed{D}_0 u\| + \|u\|), \tag{4.5} \]
\[ \|\gamma D_r u\| \le C(\|\slashed{D}_0 u\| + \|u\|). \tag{4.6} \]

Proof. The lemma follows from the equivalence of the norms $Q_0$ and $Q_H$ and the fact that
\[ \|g(r)\slashed{D}_{S^2}u\|^2 + \|\gamma D_r u\|^2 = (Hu, u). \]

Lemma 4.8. For $u \in D(\slashed{D}_0)$ we have
\[ \|\langle r\rangle^2\slashed{D}_1 u\| \le C(\|\slashed{D}_0 u\| + \|u\|), \tag{4.7} \]
\[ \|\langle r\rangle\slashed{D}_1\langle r\rangle u\| \le C(\|\slashed{D}_0 u\| + \|u\|), \tag{4.8} \]
\[ \|\slashed{D}_1\langle r\rangle^2 u\| \le C(\|\slashed{D}_0 u\| + \|u\|). \tag{4.9} \]

Proof. We will only show (4.7); the proof of the other estimates is analogous. We have, using (3.14)–(3.17):
\[ \slashed{D}_1 = (h^2 - 1)\slashed{D}_0 + \tilde V \quad\text{with } \tilde V = (\tilde V_{ij}),\ \tilde V_{ij} \in S^{-2}, \]
which gives (4.7).

4.3. Absence of eigenvalues for $\slashed{D}_\nu$, $\nu \in \mathcal{N}$

The following lemma is analogous to [6, Lemma VI.1].

Lemma 4.9. $\slashed{D}_\nu$ has no eigenvalues for all $\nu \in \mathcal{N}$. Similarly, $\slashed{D}_e$ has no eigenvalues.

Proof. We prove the lemma only for $\slashed{D}_0$; the other cases are analogous. It is sufficient to show that $\slashed{D}_0^{nl} - c_1 = \gamma D_r + g(r)\tau q + f(r) - c_1$ has no eigenvalues. We put $\hat V(r) = g(r)\tau q + f(r) - c_1$. If $u \in (L^2(\mathbb{R}))^2$ is an eigenvector of $\slashed{D}_0^{nl} - c_1$ with eigenvalue $\lambda$, then $w(r) = e^{-i\lambda\gamma r}u(r)$ satisfies
\[ w'(r) - i\gamma e^{-i\lambda\gamma r}\hat V(r)e^{i\lambda\gamma r}w(r) = 0. \tag{4.10} \]
Each solution of (4.10) is in $H^1$ and therefore $\lim_{r \to -\infty} w(r) = 0$. As
\[ \int_{-\infty}^0 |\hat V|\,dr < \infty, \]
we conclude by Gronwall's lemma that $w = 0$.
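The Gronwall step at the end of this proof can be spelled out as follows. This is a sketch which assumes that $\lambda$ is real and $\gamma$ is Hermitian, so that $e^{i\lambda\gamma r}$ is unitary and the conjugated potential in (4.10) has the same matrix norm $|\hat V(r)|$ as $\hat V(r)$; the sign convention in (4.10) plays no role.

```latex
\begin{align*}
\frac{d}{dr}|w(r)|^2 &= 2\,\mathrm{Re}\,\langle w(r), w'(r)\rangle
  \le 2\,|\hat V(r)|\,|w(r)|^2, \\
|w(r)|^2 &\le |w(r_0)|^2 \exp\Bigl(2\int_{r_0}^{r}|\hat V(s)|\,ds\Bigr)
  \le |w(r_0)|^2 \exp\Bigl(2\int_{-\infty}^{0}|\hat V(s)|\,ds\Bigr),
  \qquad r_0 \le r \le 0 .
\end{align*}
```

Letting $r_0 \to -\infty$ and using $w(r_0) \to 0$ gives $w = 0$ on $(-\infty, 0]$, hence $w = 0$ everywhere by uniqueness for the linear equation (4.10).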
5. The Mourre Estimate

5.1. Preliminary remarks

The Mourre estimate is a positive commutator estimate between the Hamiltonian and another self-adjoint operator, called the conjugate operator. The conjugate operator thus represents an observable that increases along the evolution. For Schrödinger or Dirac equations in flat space-time, the situation has been thoroughly studied and we can take the generator of dilations as conjugate operator. In our case we have two asymptotic regions. The space-time is asymptotically flat at positive infinity and we can use the generator of dilations as conjugate operator there. Near the black hole horizon the problem is much more complicated. Let us consider a toy model of this situation:
\[ \slashed{D} = \gamma D_r + e^{\eta r}\slashed{D}_{S^2} \quad\text{on } \mathbb{R}^- \times S^2. \]
The Dirac operators that we consider are short range perturbations of an operator of this kind. For such a Hamiltonian, if we try to use the generator of dilations
\[ A := \frac{1}{2}(rD_r + D_r r) \]
as conjugate operator, we find
\[ [i\slashed{D}, A] = \gamma D_r - \eta r e^{\eta r}\slashed{D}_{S^2}. \]
For $\chi \in C_0^\infty(\mathbb{R})$, $\chi(\slashed{D})[i\slashed{D}, A]\chi(\slashed{D})$ generically has no sign. Moreover, this commutator is not controlled by $\slashed{D}$. In this spherically symmetric setting, we can use spin weighted harmonics and write
\[ \slashed{D}^{nl} = \gamma D_r + e^{\eta r}\tau q. \]
The angular part is replaced by $e^{\eta r}\tau q$ ($q = l + 1/2$), a mere potential. Therefore, after diagonalization, we can use the generator of dilations. If the metric is not spherically symmetric, we cannot proceed in this manner. We consider instead the unitary transformation $U = e^{\eta^{-1} i D_r \ln|\slashed{D}_{S^2}|}$. We obtain
\[ \hat{\slashed{D}} = U^*\slashed{D}U = \gamma D_r + e^{\eta r}\frac{\slashed{D}_{S^2}}{|\slashed{D}_{S^2}|}. \]
On each spherical harmonic, $\hat{\slashed{D}}$ reduces to the operator
\[ \hat{\slashed{D}}^{nl} = \gamma D_r + e^{\eta r}\tau. \]
If now we use the generator of dilations as conjugate operator, all the necessary estimates are uniform in $q$ simply because no term involves $q$. In particular, $e^{\eta r}\tau\chi(\hat{\slashed{D}}^{nl})$ and $\eta r e^{\eta r}\tau\chi(\hat{\slashed{D}}^{nl})$ will be compact and thus small if the support of $\chi$ is sufficiently small. This "smallness result" is uniform in $q$. If we apply our unitary transformation to the generator of dilations, we find an operator similar to the one introduced by Froese and Hislop (see [24]). The argument is however different. In the case
of the Laplacian we can show that the commutator between the angular part of the Laplacian and the Froese–Hislop conjugate operator is positive. In our case we cannot find such a conjugate operator because the angular part has no sign. One might think it better to use $\slashed{D}^2$ rather than $\slashed{D}$ to get a Mourre estimate and then apply known results about the Mourre estimate for the square root of an operator (see [14, 32]). Let us first remark that the angular part of $\slashed{D}^2$ also has no sign:
\[ \slashed{D}^2 = D_r^2 + e^{2\eta r}\slashed{D}_{S^2}^2 + \eta e^{\eta r}\frac{\gamma}{i}\slashed{D}_{S^2}. \]
Note also that the connection term $\eta e^{\eta r}\frac{\gamma}{i}\slashed{D}_{S^2}$ is not a perturbation (not even a long range one) of the Laplacian. This is typical for exponentially large ends. It is however reasonable to expect that the connection term is a perturbation of the Laplacian for a large class of asymptotic ends. For manifolds with such ends, a Mourre theory for the Laplacian implies directly a Mourre theory for the Dirac operator.

5.2. The abstract setting of Mourre theory

In this section we recall the technical hypotheses for the Mourre estimate. We consider the commutator $[H, iA]$ between the Hamiltonian $H$ and another self-adjoint operator $A$, called the conjugate operator. We say that the pair $(H, A)$ satisfies a Mourre estimate on some energy interval $\Delta$ if
\[ 1_\Delta(H)[iH, A]1_\Delta(H) \ge \delta\, 1_\Delta(H); \qquad \delta > 0. \]
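As a simple illustration of this definition (not taken from the argument of the paper), consider the free one-dimensional operator $H = \gamma D_r$ on $(L^2(\mathbb{R}))^2$ with the generator of dilations $A = \frac12(rD_r + D_r r)$. The sketch assumes $D_r = -i\partial_r$ and that $\gamma$ is a constant matrix.

```latex
\begin{align*}
[D_r, r] &= -i, \qquad A = \tfrac12(rD_r + D_r r) = rD_r - \tfrac{i}{2}, \\
[iH, A] &= i\gamma\,[D_r, rD_r] = i\gamma\,[D_r, r]\,D_r = \gamma D_r = H, \\
1_\Delta(H)[iH, A]1_\Delta(H) &= H\,1_\Delta(H) \ge \delta\,1_\Delta(H)
  \qquad\text{for } \Delta \subset [\delta, \infty),\ \delta > 0 .
\end{align*}
```

For intervals $\Delta \subset (-\infty, -\delta]$ the same computation gives the estimate with $-A$ in place of $A$.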
As both operators $H$ and $A$ are unbounded we have to be careful to define the commutator correctly. We say that the pair $(H, A)$ satisfies the Mourre conditions (see [42]) iff
(M1') $D(A) \cap D(H)$ is dense in $D(H)$,
(M2') $e^{isA}$ preserves $D(H)$, $\sup_{|s| \le 1}\|He^{isA}u\| < \infty$, $\forall\, u \in D(H)$,
(M3') $[iH, A]$, which is defined as a quadratic form on $D(H) \cap D(A)$, is semibounded, closable and can be extended to a bounded operator from $D(H)$ to $\mathcal{H}$:
\[ |[iH, A](u, v)| \le C\|Hu\|\,\|v\|, \qquad \forall\, u, v \in D(H) \cap D(A). \]
It has been remarked in [25] that the Virial theorem remains valid under the following conditions:
(M1) $e^{isA}$ preserves $D(H)$,
(M2) $[iH, A]$ defined as a quadratic form on $D(H) \cap D(A)$ can be extended to a bounded operator from $D(H)$ to $\mathcal{H}$:
\[ |[iH, A](u, v)| \le C\|Hu\|\,\|v\|, \qquad \forall\, u, v \in D(H) \cap D(A). \]
(M1') + (M2') is even equivalent to (M1) (see [1, Proposition 3.2.5]). Note also that even in Mourre's original work [42], the assumption that $[iH, A]$ is semibounded is not necessary.
In our opinion the simplest and most useful condition for the Mourre estimate is the following (see [1]): a bounded operator $C$ is of class $C^k(A; \mathcal{H})$ iff
\[ \mathbb{R} \ni s \mapsto e^{isA}Ce^{-isA} \ \text{ is } C^k \text{ for the strong topology of } B(\mathcal{H}). \]
$H \in C^k(A)$ if there exists $z \in \mathbb{C} \setminus \sigma(H)$ such that $(z - H)^{-1} \in C^k(A; \mathcal{H})$. (M1)–(M2) implies $H \in C^1(A)$ and the Virial theorem is valid under the only condition $H \in C^1(A)$ (see [1]). The condition $H \in C^1(A)$ has been characterized in [1, Theorem 6.2.10] by the following property of the commutator $[H, iA]$:

Proposition 5.1. The operator $H$ is of class $C^1(A)$ iff the following two conditions are satisfied:
(i) There exists $C < \infty$ such that
\[ |(Au, Hu) - (Hu, Au)| \le C\|(H + i)u\|^2, \qquad \forall\, u \in D(H) \cap D(A). \]
(ii) There exists $z \in \mathbb{C} \setminus \sigma(H)$ such that $\{u \in D(A) \mid (z - H)^{-1}u \in D(A),\ (\bar z - H)^{-1}u \in D(A)\}$ is a core for $A$.

In general it is not easy to check the conditions (i) and (ii) if the domains of $H$ and $A$ are not explicitly known. In such a case, one way of checking the condition $H \in C^1(A)$ consists in first searching for a common core for $H$ and $A$. This is described in [26]. We start with an extension of the Nelson theorem (see [26, Lemma 1.2.5]):

Lemma 5.1. Let $\mathcal{H}$ be a Hilbert space, $N \ge 1$ a self-adjoint operator on $\mathcal{H}$, $A$ a symmetric operator on $\mathcal{H}$ such that $D(N) \subset D(A)$ and
(i) $\|Au\| \le C\|Nu\|$, $u \in D(N)$,
(ii) $|(Au, Nu) - (Nu, Au)| \le C\|N^{1/2}u\|^2$, $u \in D(N)$.
Then $A$ is essentially self-adjoint on $D(N)$. If $u \in D(\bar A)$, then $(1 + i\varepsilon N)^{-1}u$ converges to $u$ in the graph topology of $D(\bar A)$ when $\varepsilon \to 0$.

The operator $N$ is called a comparison operator. In this situation, it is sufficient to calculate the commutator on $D(N)$; more precisely, we have the following lemma (see [26, Lemma 3.2.2]):

Lemma 5.2. Let $H, H_0, N$ be self-adjoint operators on a Hilbert space $\mathcal{H}$ such that $N \ge 1$, $D(H) = D(H_0)$ as Banach spaces, and $(z - H)^{-1}$ sends $D(N)$ into itself. Let $A$ be a symmetric operator with domain $D(N)$. Suppose that $H_0$ and $A$ satisfy the assumptions of Lemma 5.1 with comparison operator $N$ and denote still by $A$ the unique self-adjoint extension of $A$. Suppose furthermore that
\[ |(Au, Hu) - (Hu, Au)| \le C(\|Hu\|^2 + \|u\|^2), \qquad \forall\, u \in D(N). \]
Then
(i) $D(N)$ is dense in $D(A) \cap D(H)$ equipped with the norm $\|Hu\| + \|Au\| + \|u\|$,
(ii) the quadratic form $[H, iA]$ on $D(A) \cap D(H)$ is the unique extension of $[H, iA]$ on $D(N)$,
(iii) $H$ is of class $C^1(A)$.

We will also use the following lemma (see [25, Lemma 2]):

Lemma 5.3. Let $H \in C^1(A)$ and suppose that the commutator $[iH, A]$ can be extended to a bounded operator from $D(H)$ to $\mathcal{H}$. Then $e^{isA}$ preserves $D(H)$.

5.3. Technical results

We now define the comparison operator by
\[ N := H + r^2 + 1 \ \ (\text{acting on } \mathcal{H}), \qquad N^{nl} := H^{nl} + r^2 + 1 \ \ (\text{acting on } \mathcal{H}_{nl}). \]
We put
\[ D(N^{nl}) := \{u \in \mathcal{H}_{nl};\ N^{nl}u \in \mathcal{H}_{nl}\}, \]
\[ D(N) := \{u \in \mathcal{H};\ Nu \in \mathcal{H}\} = \Bigl\{ u = \sum_{nl} u_{nl};\ u_{nl} \in D(N^{nl}),\ \sum_{nl}\|N^{nl}u_{nl}\|^2 < \infty \Bigr\}. \]
We recall (in a slightly weaker version for the first one) [32, Lemmas 4.1.1 and 5.1.1]:

Lemma 5.4. $(C_0^\infty(\mathbb{R}))^2$ is dense in $D(N^{nl})$ and $(C_0^\infty(\Sigma))^2$ is dense in $D(N)$.

Lemma 5.5. We have for all $u \in D(N)$
\[ \|r^2 u\|^2 \le \|Nu\|^2 + \|u\|^2, \qquad \|Hu\|^2 \le \|Nu\|^2 + \|u\|^2. \]
Therefore we can characterize the domains of $N^{nl}$ and $N$ in the following way:
\[ D(N^{nl}) = D(H^{nl}) \cap D(r^2) = D((\slashed{D}_0^{nl})^2) \cap D(r^2), \]
\[ D(N) = D(H) \cap D(r^2) = D(\slashed{D}^2) \cap D(r^2) = \Bigl\{ u = \sum_{nl} u_{nl} \in \mathcal{H};\ u_{nl} \in D(N^{nl}),\ \sum_{nl} \bigl( \|H^{nl}u_{nl}\|^2 + \|r^2 u_{nl}\|^2 \bigr) < \infty \Bigr\}, \]
where we have used Lemma 4.6.

Lemma 5.6. Let $n \in \mathbb{N}$ and $z \in \mathbb{C} \setminus \sigma(\slashed{D})$. We have
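(i) $(z - \slashed{D})^{-1} : D(\langle r\rangle^n) \to D(\langle r\rangle^n)$,
(ii) $(z - \slashed{D})^{-1} : D(N) \to D(N)$.

Proof. We clearly have $(z - \slashed{D})^{-1} : D(\slashed{D}^2) \to D(\slashed{D}^2)$, and (ii) follows from (i) because $D(N) = D(\slashed{D}^2) \cap D(\langle r\rangle^2)$. Let us first show that $(z - \slashed{D})^{-1} : D(r) \to D(r)$. This is equivalent to
\[ \sup_{|s| \le 1}\Bigl\| \frac{e^{isr} - 1}{s}(z - \slashed{D})^{-1}u \Bigr\| < \infty, \qquad \forall\, u \in D(r). \tag{5.1} \]
We have
\[ \frac{e^{isr} - 1}{s}(z - \slashed{D})^{-1} = \frac{e^{isr}(z - \slashed{D})^{-1}e^{-isr} - (z - \slashed{D})^{-1}}{s} + \frac{e^{isr}(z - \slashed{D})^{-1}(1 - e^{-isr})}{s}. \]
Clearly
\[ \sup_{|s| \le 1}\Bigl\| e^{isr}(z - \slashed{D})^{-1}\frac{1 - e^{-isr}}{s}u \Bigr\| < \infty, \qquad \forall\, u \in D(r). \]
Moreover
\[ \frac{e^{isr}(z - \slashed{D})^{-1}e^{-isr} - (z - \slashed{D})^{-1}}{s} = (z - \slashed{D}_s)^{-1}\,\frac{\slashed{D}_s - \slashed{D}}{s}\,(z - \slashed{D})^{-1} \]
with
\[ \slashed{D}_s = e^{isr}\slashed{D}e^{-isr}, \qquad \slashed{D}_s - \slashed{D} = -s\gamma h^2. \]
Using $(z - \slashed{D}_s)^{-1} = e^{isr}(z - \slashed{D})^{-1}e^{-isr}$, this gives (5.1). Let us now suppose
\[ (z - \slashed{D})^{-1} : D(\langle r\rangle^n) \to D(\langle r\rangle^n) \]
and show that
\[ (z - \slashed{D})^{-1} : D(\langle r\rangle^{n+1}) \to D(\langle r\rangle^{n+1}). \tag{5.2} \]
If $u \in D(\langle r\rangle^{n+1})$, then $\langle r\rangle u \in D(\langle r\rangle^n)$ and
\[ \langle r\rangle^{n+1}(z - \slashed{D})^{-1}u = \langle r\rangle^n\bigl(\langle r\rangle(z - \slashed{D})^{-1}\langle r\rangle^{-1}\bigr)\langle r\rangle u. \]
In order to prove (5.2), it is therefore sufficient to show
\[ \langle r\rangle(z - \slashed{D})^{-1}\langle r\rangle^{-1} : D(\langle r\rangle^n) \to D(\langle r\rangle^n). \]
We have $\langle r\rangle(z - \slashed{D})^{-1}\langle r\rangle^{-1} = (z - \langle r\rangle\slashed{D}\langle r\rangle^{-1})^{-1}$, and $\langle r\rangle\slashed{D}\langle r\rangle^{-1}$ can be treated in exactly the same way as $\slashed{D}$. It follows that
\[ (z - \langle r\rangle\slashed{D}\langle r\rangle^{-1})^{-1} : D(\langle r\rangle^n) \to D(\langle r\rangle^n). \]

Lemma 5.7. We have $\slashed{D} \in C^1(\langle r\rangle)$ and the commutator $[i\slashed{D}, \langle r\rangle]$ is bounded.

Proof. We use Proposition 5.1. By Lemma 5.6 we have for all $z \in \mathbb{C} \setminus \sigma(\slashed{D})$
\[ (z - \slashed{D})^{-1} : D(\langle r\rangle) \to D(\langle r\rangle). \]
Furthermore $[i\slashed{D}, \langle r\rangle] = h\gamma\langle r\rangle^{-1}rh$ and this is a bounded operator.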
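The two commutator identities used in the proofs of Lemmas 5.6 and 5.7 can be verified directly. The sketch assumes $D_r = -i\partial_r$, $\langle r\rangle = (1 + r^2)^{1/2}$, and that $h$ is scalar, so that multiplication operators in $r$ commute with $h$, $V$, $g(r)\slashed{D}_{S^2}$ and $f(r)$.

```latex
\begin{align*}
e^{isr}D_r e^{-isr} &= D_r - s, \\
\slashed{D}_s = e^{isr}\slashed{D}e^{-isr}
 &= h\bigl(\gamma(D_r - s) + g(r)\slashed{D}_{S^2} + f(r)\bigr)h + V
  = \slashed{D} - s\gamma h^2, \\
[i\slashed{D}, \langle r\rangle]
 &= h\,[i\gamma D_r, \langle r\rangle]\,h
  = h\gamma\langle r\rangle' h
  = h\gamma\langle r\rangle^{-1}r\,h .
\end{align*}
```

Since $|r|\langle r\rangle^{-1} \le 1$ and $h^2 \le 1 + \alpha$ by (3.14), the last operator is bounded, provided $\gamma$ has norm one (which holds if $\gamma$ is Hermitian with $\gamma^2 = 1$).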
5.4. Conjugate operator for $\slashed{D}$

Let $F \in C^\infty(\mathbb{R})$ with $F(x) = 0$ for $x \ge 1$ and $F(x) = 1$ for $x \le \frac12$. Let $\eta > 0$ be the constant of Sec. 3.1. We define
\[ F_S := F\Bigl(\frac{\eta r + \ln|\slashed{D}_{S^2}|}{S}\Bigr). \]
Note that $F_S$ is well defined because $0 \notin \sigma(\slashed{D}_{S^2})$. Let $j_\pm \in C^\infty(\mathbb{R})$, $j_\pm \ge 0$ with $j_-(x) = 1$ for $x \le 0$, $j_-(x) = 0$ for $x \ge 1$, $j_+(x) = 1$ for $x \ge 1$, $j_+(x) = 0$ for $x \le 0$ and $j_-^2 + j_+^2 = 1$. Let $j_{\pm,R}(\cdot) = j_\pm(\cdot/R)$. We put
\[ K_S := (\eta r + \ln|\slashed{D}_{S^2}|)F_S^2, \qquad D(K_S) = \{u \in \mathcal{H},\ K_S u \in \mathcal{H}\}, \]
\[ X_-(r, \slashed{D}_{S^2}) := j_{-,R}^2(r)K_S, \qquad X_+(r) := r\,j_{+,R}^2(r), \qquad Z := X_- + X_+. \]
We obtain from [32, Corollary 5.2.2]:

Lemma 5.8.
(i) $|X_-(r, q)| \le C\langle r\rangle$ uniformly in $q, R$ for all $S$, and $|X_-^{(i)}(r, q)| \le C$ uniformly in $q, R$, for all $i \ge 1$ and for all $S$,
(ii) $j_{-,R}^2 K_S$ is bounded from $D(N)$ to $D(D_r)$, and $j_{-,R}^2 D_r$ is bounded from $D(N)$ to $D(K_S)$.

We put
\[ A_- := \frac12\bigl(X_-(r, \slashed{D}_{S^2})D_r + hc\bigr) + c_1\gamma X_-(r, \slashed{D}_{S^2}), \qquad A_+ := \frac12\bigl(X_+(r)D_r + hc\bigr), \qquad A := A_- + A_+. \]

Remark 5.1. In [39], a term of type $c_1\eta r F(\frac{r}{S})\gamma$ was introduced to treat an electromagnetic scalar potential, constant on the horizon of the Reissner–Nordstrøm black hole. Near the horizon, the effects of rotation in the case of Dirac's equation outside a slow Kerr black hole are similar to the effects of charge on a Reissner–Nordstrøm background. We therefore use the same extra term as in [39], but conjugated by the unitary transformation introduced in Sec. 5.1. After a cut-off near the horizon, this gives the term $c_1\gamma X_-(r, \slashed{D}_{S^2})$ in $A_-$. By Lemma 5.8 the operators $A_\pm$, $A$ are well defined on $D(N)$.

Remark 5.2. From now on we will systematically consider all commutators between two of the operators $\slashed{D}_0$, $A_\pm$, $A$, $N$ as quadratic forms on $D(N)$. All these operators preserve $\mathcal{H}_{nl}$, hence it is sufficient to calculate the commutators on
$D(N^{nl})$; in fact we can even do these calculations on $(C_0^\infty(\mathbb{R}))^2$ using the density of $(C_0^\infty(\mathbb{R}))^2$ in $D(N^{nl})$. This justifies in particular the application of the Leibniz rule. In order to extend the commutators to larger spaces, we need to obtain estimates that are uniform in $n, l$.

Lemma 5.9. The pairs $(\slashed{D}_0, N)$ and $(A, N)$ satisfy the hypotheses of Lemma 5.1.

Proof. Let us start with $(\slashed{D}_0, N)$:
\[ D(N) \subset D(\slashed{D}_0^2) \subset D(\slashed{D}_0), \]
\[ \|\slashed{D}_0 u\|^2 \le C(\|\slashed{D}_0^2 u\|^2 + \|u\|^2) \le C\|Nu\|^2, \qquad \forall\, u \in D(N). \]
For $u \in D(N)$, we have
\[ |[i\slashed{D}_0, N](u, u)| \le |(2\gamma ru, u)| + 2|(f'(r)u, D_r u)| + 2|(g'(r)\slashed{D}_{S^2}u, D_r u)| \le C(Nu, u). \]
The proof for $(A, N)$ is similar to the proof of [32, Lemma 5.2.4]. We have one extra term, which is
\[ (c_1\gamma X_-' D_r + hc)(u, u) \le C(Nu, u). \]
We omit the details.

Lemma 5.10. We have $\slashed{D} \in C^1(A)$ and the commutator $[i\slashed{D}, A]$ can be extended to a bounded operator from $D(\slashed{D})$ to $\mathcal{H}$, which we denote $[i\slashed{D}, A]^0$.

Proof. We use Lemma 5.2. We will show that
\[ \forall\, u \in D(N), \qquad |(Au, \slashed{D}u) - (\slashed{D}u, Au)| \le C(\|\slashed{D}u\|\,\|u\| + \|u\|^2). \tag{5.3} \]
Step 1. We will estimate $|[i\slashed{D}_0, A](u, u)|$.

(i) Let us first estimate $|[i\slashed{D}_0, A_-](u, u)|$. We have as a quadratic form on $D(N^{nl})$:
\[ [i\slashed{D}_0^{nl}, A_-^{nl}] = \gamma X_-' D_r - X_-(g'(r)\tau q + f'(r)) + c_1 X_-' + [ig(r)\tau q, c_1\gamma X_-], \]
i.e.
\begin{align*}
[i\slashed{D}_0^{nl}, A_-^{nl}](u_{nl}, u_{nl}) &= (\gamma X_-' D_r u_{nl}, u_{nl}) - (X_-(g'(r)\tau q + f'(r))u_{nl}, u_{nl}) \\
&\quad + (c_1 X_-' u_{nl}, u_{nl}) + [ig(r)\tau q, c_1\gamma X_-](u_{nl}, u_{nl}).
\end{align*}
As $X_-'$ is uniformly bounded in $n, l$, the first term can be estimated by
\[ (\gamma X_-' D_r u_{nl}, u_{nl}) \le C(\|\slashed{D}_0^{nl}u_{nl}\|\,\|u_{nl}\| + \|u_{nl}\|^2), \]
where we have also used Lemma 4.7. The second term is in fact bounded:
\[ |X_- g'(r)\tau q| \le C|X_- e^{\eta r + \ln q}| \le C, \qquad |X_- f'(r)| \le C, \]
where we have used $g \in S^{-1,-1}$, $f' \in S^{-3}$ and Lemma 5.8. The third term is bounded by Lemma 5.8 again and the last is bounded uniformly in $q$:
\[ |g(r)qX_- c_1| \le C|X_- e^{\eta r + \ln q}| \le C. \]
(ii) Let us now estimate $|[i\slashed{D}_0, A_+](u, u)|$. As a quadratic form on $D(N^{nl})$ we find
\[ [i\slashed{D}_0^{nl}, A_+^{nl}] = \gamma X_+' D_r - X_+(g'(r)\tau q + f'(r)). \]
We obtain the estimate for the first term using the fact that $X_+'$ is bounded. In order to estimate the second term we use Lemma 4.7 and
\[ |X_+ g'(r)| \le C|g(r)|, \qquad |X_+ f'(r)| \le C. \]
Step 2. We now have to estimate $[i\slashed{D}, A]$. We have
\[ [i\slashed{D}, A] = h[i\slashed{D}_0, A]h + h\slashed{D}_0[ih, A] + [ih, A]\slashed{D}_0 h + [iV, A]. \]
We have $[iV, A] = -ZV'$, $[ih, A] = -Zh'$. The estimate now follows from the fact that
\[ h,\ Zh',\ ZV' : D(\slashed{D}_0) \to D(\slashed{D}_0). \]
Using Lemma 5.3 we get:

Corollary 5.1. The pair $(\slashed{D}, A)$ satisfies the Mourre conditions (M1) and (M2).

We obtain from [32, Lemmas A.2.1 and A.2.2]:

Lemma 5.11. We have $|Z^{(i)}Z^{(k)}| \le C$ and $|Z^{(i)}X_-^{(k)}| \le C$ uniformly in $q$ if $i + k \ge 2$.

Lemma 5.12. Let $i, j, k \in \mathbb{N}$. We have uniformly in $q$
\[ |g^{(i)}qX_-^{(j)}| \le C, \qquad |X_-^{(k)}g^{(i)}qX_-^{(j)}| \le C. \]
If in addition $3 \ge i \ge 1$ and $i + k \ge 2$, then we have uniformly in $q$
\[ |g^{(i)}qX_+^{(j)}| \le Cg(r)q, \qquad |X_-^{(j)}g^{(i)}qX_+^{(k)}| \le C, \]
\[ |X_+^{(j)}g^{(i)}qX_+^{(k)}| \le Cg(r)q, \qquad |X_+^{(j)}g^{(i)}qX_-^{(k)}| \le C. \]

Lemma 5.13. The double commutator $[[i\slashed{D}, A]^0, A]$, defined as a quadratic form on $D(N)$, can be extended to a bounded operator from $\mathcal{H}^1$ to $\mathcal{H}$.

Proof. Recall from the proof of Lemma 5.10 that
\[ [i\slashed{D}_0^{nl}, A^{nl}] = \gamma Z' D_r - Z(g'(r)\tau q + f'(r)) + c_1 X_-' + [ig(r)\tau q, c_1\gamma X_-]. \]
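So we get
\begin{align*}
[[i\slashed{D}_0^{nl}, A^{nl}], iA^{nl}] &= \gamma(Z')^2 D_r - \gamma ZZ'' D_r + c_1 Z' X_-' + Z\bigl(Z(g'(r)\tau q + f'(r))\bigr)' \\
&\quad - [Z(g'(r)\tau q + f'(r)), ic_1\gamma X_-] - c_1 ZX_-'' \\
&\quad - Z\bigl([ig(r)\tau q, c_1\gamma X_-]\bigr)' + [[ig(r)\tau q, c_1\gamma X_-], ic_1\gamma X_-]
\end{align*}
and the lemma follows with $\slashed{D}$ replaced by $\slashed{D}_0$ using Lemmas 4.7, 5.8, 5.11 and 5.12. Recall now that
\[ \slashed{D} = h\slashed{D}_0 h + V. \]
So we have (see the proof of Lemma 5.10)
\[ [i\slashed{D}, A] = -h\slashed{D}_0 h'Z - h'Z\slashed{D}_0 h + h[i\slashed{D}_0, A]h - ZV', \]
\begin{align*}
[[i\slashed{D}, A], iA] &= h'Z\slashed{D}_0 h'Z + h\slashed{D}_0(h'Z)'Z + (h'Z)'Z\slashed{D}_0 h + h'Z\slashed{D}_0 h'Z \\
&\quad - h[i\slashed{D}_0, A]h'Z - h'Z[i\slashed{D}_0, A]h - h'Z[i\slashed{D}_0, A]h - h[i\slashed{D}_0, A]h'Z \\
&\quad + h[[i\slashed{D}_0, A], iA]h + (ZV')'Z.
\end{align*}
Using $h' \in S^{-2}$, $V_{ij} \in S^{-2}$ we observe that
\[ h'Z,\ (h'Z)'Z,\ (ZV')'Z : D(\slashed{D}_0) \to D(\slashed{D}_0), \]
so the double commutator is bounded from $\mathcal{H}^1$ to $\mathcal{H}$.

5.5. The Mourre estimate for $\slashed{D}$

Let us start with some technical lemmas.

Lemma 5.14. Let $\chi \in C_0^\infty(\mathbb{R})$. Then
\[ j_{-,R}\bigl(\chi(\slashed{D}_0) - \chi(\slashed{D}_e)\bigr) \ \text{ is compact.} \]

Proof. Using the Helffer–Sjöstrand formula, it is sufficient to show
\[ j_{-,R}(z - \slashed{D}_0)^{-1}(\slashed{D}_0 - \slashed{D}_e)(z - \slashed{D}_e)^{-1} \ \text{ is compact for all } z \in \mathbb{C} \setminus (\sigma(\slashed{D}_0) \cup \sigma(\slashed{D}_e)), \tag{5.4} \]
\[ \|j_{-,R}(z - \slashed{D}_0)^{-1}(\slashed{D}_0 - \slashed{D}_e)(z - \slashed{D}_e)^{-1}\| \le C|\mathrm{Im}\, z|^{-2}. \tag{5.5} \]
(5.5) is clear; let us show (5.4). We have
\begin{align*}
j_{-,R}(z - \slashed{D}_0)^{-1}(\slashed{D}_0 - \slashed{D}_e)(z - \slashed{D}_e)^{-1} &= -(z - \slashed{D}_0)^{-1}\gamma j_{-,R}'(z - \slashed{D}_0)^{-1}(\slashed{D}_0 - \slashed{D}_e)(z - \slashed{D}_e)^{-1} \\
&\quad + (z - \slashed{D}_0)^{-1}j_{-,R}(g(r) - c_0 e^{\eta r})\slashed{D}_{S^2}(z - \slashed{D}_e)^{-1}.
\end{align*}
Both terms are compact by (3.8), Corollary 3.1 and Lemma 3.2.

Let us put $T^{nl} = \ln q\, D_r$, $D(T^{nl}) = \mathcal{H}^1_{nl}$. $(T^{nl}, D(T^{nl}))$ is clearly self-adjoint and the operator $T := \ln|\slashed{D}_{S^2}|\, D_r$ is self-adjoint with domain
\[ D(T) = \Bigl\{ u = \sum_{nl} u_{nl};\ u_{nl} \in D(T^{nl}),\ \sum_{nl}\|(T^{nl} + i)u_{nl}\|^2 < \infty \Bigr\}. \]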
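Lemma 5.15. Let $f, \chi \in C_\infty(\mathbb{R})$. Then
\[ f(\eta r + \ln q)\chi(\slashed{D}_e^{nl}) \ \text{ is compact on } \mathcal{H}_{nl}. \]

Proof. It is sufficient to show that
\[ e^{-\frac{1}{\eta} iD_r \ln q} f(\eta r + \ln q)\chi(\slashed{D}_e^{nl})e^{\frac{1}{\eta} iD_r \ln q} = f(\eta r)\chi(\gamma D_r + c_1 + c_0 e^{\eta r}\tau) \]
is compact, and this follows from Corollary 3.1.

Lemma 5.16. Let $f \in C_\infty(\mathbb{R})$, $\lambda \in \mathbb{R}$. Then
\[ \forall\, \varepsilon > 0, \ \exists\, \delta > 0, \qquad \|f(\eta r + \ln|\slashed{D}_{S^2}|)1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_e)\| < \varepsilon. \]

Proof. We have
\begin{align*}
\|f(\eta r + \ln|\slashed{D}_{S^2}|)1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_e)\| &= \|e^{-\frac{i}{\eta}T} f(\eta r + \ln|\slashed{D}_{S^2}|)1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_e)e^{\frac{i}{\eta}T}\| \\
&= \Bigl\| f(\eta r)1_{[\lambda-\delta,\lambda+\delta]}\Bigl(\gamma D_r + c_0 e^{\eta r}\frac{\slashed{D}_{S^2}}{|\slashed{D}_{S^2}|} + c_1\Bigr) \Bigr\|
\end{align*}
and it is sufficient to show that
\[ \|f(\eta r)1_{[\lambda-\delta,\lambda+\delta]}(\gamma D_r + c_0 e^{\eta r}\tau + c_1)\| < \varepsilon, \]
uniformly in $q$. The operator $f(\eta r)1_{[\lambda-\delta,\lambda+\delta]}(\gamma D_r + c_0 e^{\eta r}\tau + c_1)$ is compact by Corollary 3.1 and does not depend on $q$. Since $\slashed{D}_e$ has no eigenvalues (Lemma 4.9), the spectral projections $1_{[\lambda-\delta,\lambda+\delta]}(\gamma D_r + c_0 e^{\eta r}\tau + c_1)$ tend strongly to $0$ as $\delta \to 0$. So for any given $\varepsilon > 0$, we can find $\delta > 0$ such that
\[ \|f(\eta r)1_{[\lambda-\delta,\lambda+\delta]}(\gamma D_r + c_0 e^{\eta r}\tau + c_1)\| < \varepsilon. \]
This concludes the proof of Lemma 5.16.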
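The conjugation identities behind Lemmas 5.15 and 5.16 amount to a translation in $r$. The sketch below assumes $D_r = -i\partial_r$, $q > 0$, and $\slashed{D}_e^{nl} = \gamma D_r + c_0 e^{\eta r}\tau q + c_1$ (the restriction of $\slashed{D}_e$ to $\mathcal{H}_{nl}$, by analogy with $\slashed{D}_0^{nl}$); the shorthand $a$ and the test function $\phi$ are introduced here only for illustration.

```latex
\begin{align*}
(e^{iaD_r}u)(r) &= u(r + a), \qquad a := \eta^{-1}\ln q, \\
e^{-iaD_r}\,\phi(r)\,e^{iaD_r} &= \phi(r - a), \\
e^{-iaD_r}\,f(\eta r + \ln q)\,e^{iaD_r} &= f(\eta r), \\
e^{-iaD_r}\bigl(\gamma D_r + c_0 e^{\eta r}\tau q + c_1\bigr)e^{iaD_r}
 &= \gamma D_r + c_0 e^{\eta r}\tau + c_1 .
\end{align*}
```

The last line uses $e^{\eta(r - a)}q = e^{\eta r}$: the transformed operator no longer involves $q$, which is why the resulting estimates are uniform in $q$.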
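The following corollary estimates a remainder term in the Mourre estimate.

Corollary 5.2. For all $S, R > 0$, $\varepsilon > 0$, $\lambda \in \mathbb{R}$, there exists $\delta > 0$ such that
\begin{align*}
\bigl\| 1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0)\bigl( &-\eta g(r)\slashed{D}_{S^2}F_S^2 j_{-,R}^2 + j_{-,R}^2(\eta r + \ln|\slashed{D}_{S^2}|)(F_S^2)'\gamma D_r \\
&- X_- g'(r)\slashed{D}_{S^2} + c_1 j_{-,R}^2(\eta r + \ln|\slashed{D}_{S^2}|)(F_S^2)' + ig(r)\slashed{D}_{S^2}X_-\gamma c_1 \\
&- i\gamma c_1 X_- g(r)\slashed{D}_{S^2} \bigr)1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0) \bigr\| < \varepsilon.
\end{align*}

Proof. Let us treat
\[ 1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0)\,\eta g(r)\slashed{D}_{S^2}F_S^2 j_{-,R}^2\, 1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0). \]
Using that
\[ 1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0)(g(r) - c_0 e^{\eta r})\slashed{D}_{S^2}F_S^2 j_{-,R}^2 \]
and
\[ \bigl(1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0) - 1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_e)\bigr)j_{-,R}^2 \]
are compact and that $e^{\eta r}\slashed{D}_{S^2}F_S^2$ is bounded, it is sufficient to treat
\[ 1_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_e)\,c_0 e^{\eta r}\slashed{D}_{S^2}F_S^2, \]
and this term can be estimated using Lemma 5.16. The estimation of the remaining terms is analogous.

Lemma 5.17. We have
(i) $j_{\pm,R}$ preserves $D(\slashed{D}^2) = D(\slashed{D}_0^2)$, its norm in $L(D(\slashed{D}^2))$ is bounded uniformly in $R$,
(ii) $F_S$ preserves $D(\slashed{D}^2) = D(\slashed{D}_0^2)$, its norm in $L(D(\slashed{D}^2))$ is bounded uniformly in $S$,
(iii) $W := j_{-,R}^2(1 - F_S)$ preserves $D(\slashed{D}^2) = D(\slashed{D}_0^2)$, its norm in $L(D(\slashed{D}^2))$ is bounded uniformly in $R, S$.
We have an analogous statement if we replace $\slashed{D}^2$ (respectively $\slashed{D}_0^2$) by $\slashed{D}$ (respectively $\slashed{D}_0$).

Proof. We have
\[ [(z - \slashed{D}_0^2)^{-1}, ij_{\pm,R}] = (z - \slashed{D}_0^2)^{-1}\bigl(2f(r)\gamma j_{\pm,R}' + j_{\pm,R}' D_r + D_r j_{\pm,R}'\bigr)\frac{1}{R}(z - \slashed{D}_0^2)^{-1}, \]
which gives
\[ [(z - \slashed{D}_0^2)^{-1}, j_{\pm,R}] \in O(R^{-1})|\mathrm{Im}\, z|^{-2} \quad\text{for } z \in K \subset\subset \mathbb{C}. \tag{5.6} \]
In the same manner we calculate
\[ [(z - \slashed{D}_0^2)^{-1}, F_S] = (z - \slashed{D}_0^2)^{-1}[\slashed{D}_0^2, F_S](z - \slashed{D}_0^2)^{-1}. \]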
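Therefore,
\[ [(z - \slashed{D}_0^2)^{-1}, F_S] \in O(S^{-1})|\mathrm{Im}\, z|^{-2} \tag{5.7} \]
and $F_S$ preserves $D(\slashed{D}_0^2)$. Finally (5.6) and (5.7) give (iii).

Lemma 5.18. If $\operatorname{supp}\chi \subset\, ]0, \infty[$, then
\[ \lim_{S \to \infty}\|\chi(\slashed{D}_0)W\| = 0 \quad\text{uniformly in } R \text{ large.} \]

Proof. Let $\hat\chi \in C_0^\infty(]0, \infty[)$ with $\hat\chi\chi = \chi$. As $\operatorname{supp}\chi \subset\, ]0, \infty[$,
\[ \chi(\slashed{D}_0) = \hat\chi(\slashed{D}_0)\chi(|\slashed{D}_0|) = \hat\chi(\slashed{D}_0)\tilde\chi(\slashed{D}_0^2), \]
where $\tilde\chi(x) = \chi(\sqrt{x})$. We have on $\operatorname{supp} j_{-,R}$ ($R$ sufficiently large)
\begin{align*}
\slashed{D}_0^2 &= D_r^2 + g^2(r)\slashed{D}_{S^2}^2 + \frac{\gamma}{i}\,g'(r)\slashed{D}_{S^2} + \gamma D_r f(r) + f(r)\gamma D_r + f^2(r) + 2f(r)g(r)\slashed{D}_{S^2} \\
&\ge g^2(r)\slashed{D}_{S^2}^2 - Cg(r)|\slashed{D}_{S^2}| - C \\
&\ge C_1 e^{2\eta r}\slashed{D}_{S^2}^2 - C_2 e^{\eta r}|\slashed{D}_{S^2}| - C_3. \tag{5.8}
\end{align*}
On $\operatorname{supp}(1 - F_S)j_{-,R}^2$ we have
\[ \ln(e^{\eta r}|\slashed{D}_{S^2}|) \ge \frac{S}{2}, \qquad e^{\eta r}|\slashed{D}_{S^2}| \ge e^{S/2}. \]
Using (5.8) we get for $S$ large enough
\[ \slashed{D}_0^2 \ge Ce^{S/2}, \]
i.e.
\[ \forall\, M > 1, \ \exists\, S_0; \ \forall\, S \ge S_0, \quad W\slashed{D}_0^2 W \ge MW^2 \tag{5.9} \]
in the sense of quadratic forms on $D(\slashed{D}_0^2)$. Using (5.9), we get
\begin{align*}
(z - \slashed{D}_0^2)^{-1}W^2(\bar z - \slashed{D}_0^2)^{-1} &\le \frac{1}{M}(z - \slashed{D}_0^2)^{-1}W\slashed{D}_0^2 W(\bar z - \slashed{D}_0^2)^{-1} \\
&= \frac{1}{M}W(z - \slashed{D}_0^2)^{-1}\slashed{D}_0^2(\bar z - \slashed{D}_0^2)^{-1}W + \frac{1}{M}O(R^{-1}, S^{-1})|\mathrm{Im}\, z|^{-3}
\end{align*}
for $z \in K \subset\subset \mathbb{C}$. This follows from (5.6) and (5.7). We have
\[ (z - \slashed{D}_0^2)^{-1}\slashed{D}_0^2(\bar z - \slashed{D}_0^2)^{-1} \le C|\mathrm{Im}\, z|^{-2}, \qquad z \in K \subset\subset \mathbb{C}. \]
It follows that
\[ (z - \slashed{D}_0^2)^{-1}W^2(\bar z - \slashed{D}_0^2)^{-1} \le \frac{C}{M}|\mathrm{Im}\, z|^{-2} + \frac{1}{M}O(R^{-1}, S^{-1})|\mathrm{Im}\, z|^{-3}, \]
\[ \|(z - \slashed{D}_0^2)^{-1}W\| \le \frac{C}{\sqrt{M}}\bigl(|\mathrm{Im}\, z|^{-1} + O(R^{-1}, S^{-1})|\mathrm{Im}\, z|^{-\frac32}\bigr), \qquad z \in K. \]
Using the Helffer–Sjöstrand formula we obtain
\[ \|\tilde\chi(\slashed{D}_0^2)W\| \le \frac{C}{\sqrt{M}}. \]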
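Lemma 5.19. We have for $R, S$ large enough: for all $\lambda_0 > 0$, there exists an interval $I$, neighborhood of $\lambda_0$, and $\mu > 0$ such that
\[ 1_I(\slashed{D})[i\slashed{D}, A]1_I(\slashed{D}) \ge \mu 1_I(\slashed{D}) + K, \qquad K \text{ compact.} \]
For $\lambda_0 < 0$, we take $-A$ instead of $A$ and obtain a similar estimate.

Proof. We work with $\lambda_0 > 0$; the case $\lambda_0 < 0$ is proved similarly.

Step 1. We first calculate the commutator $[i\slashed{D}_0, A]$. We have
\[ [i\slashed{D}_0^{nl}, A_-^{nl}] = \gamma X_-' D_r - X_-(g'(r)\tau q + f'(r)) + c_1 X_-' + [ig(r)\tau q, c_1\gamma X_-]. \]
We have for $\chi \in C_0^\infty(]0, \infty[)$, $0 \le \chi \le 1$:
\[ \chi(\slashed{D}_0^{nl})\gamma X_-' D_r\chi(\slashed{D}_0^{nl}) = \eta\chi(\slashed{D}_0^{nl})j_{-,R}F_S\gamma D_r F_S j_{-,R}\chi(\slashed{D}_0^{nl}) + O(R^{-1}, S^{-1}) + T_1^{nl} + \tilde K_1^{nl}, \]
uniformly in $n, l$, where
\[ T_1^{nl} := \chi(\slashed{D}_0^{nl})j_{-,R}^2(\eta r + \ln q)(F_S^2)'\gamma D_r\chi(\slashed{D}_0^{nl}), \]
\[ \tilde K_1^{nl} := \chi(\slashed{D}_0^{nl})(j_{-,R}^2)'(\eta r + \ln q)F_S^2\gamma D_r\chi(\slashed{D}_0^{nl}). \]
We put
\[ T_1 := \bigoplus_{n,l} T_1^{nl}, \qquad \tilde K_1 := \bigoplus_{n,l} \tilde K_1^{nl}. \]
The operator $\tilde K_1$ is compact by Lemma 5.8 and Corollary 4.2. Using Lemma 5.17, we obtain:
\[ \chi(\slashed{D}_0)[i\slashed{D}_0, A_-]\chi(\slashed{D}_0) = \eta\chi(\slashed{D}_0)j_{-,R}F_S\slashed{D}_0 F_S j_{-,R}\chi(\slashed{D}_0) + O(R^{-1}, S^{-1}) + \tilde K_1 + R_1 + T, \tag{5.10} \]
with
\[ R_1 = -\chi(\slashed{D}_0)\bigl(X_- f'(r) - c_1(j_{-,R}^2(\eta r + \ln|\slashed{D}_{S^2}|))'F_S^2 + \eta j_{-,R}F_S f(r)F_S j_{-,R}\bigr)\chi(\slashed{D}_0), \]
\begin{align*}
T = -\chi(\slashed{D}_0)\bigl( &X_- g'(r)\slashed{D}_{S^2} + \eta j_{-,R}F_S g(r)\slashed{D}_{S^2}F_S j_{-,R} - c_1 j_{-,R}^2(\eta r + \ln|\slashed{D}_{S^2}|)(F_S^2)' \\
&- ig(r)\slashed{D}_{S^2}c_1\gamma X_- + ic_1\gamma X_- g(r)\slashed{D}_{S^2} \bigr)\chi(\slashed{D}_0) + T_1,
\end{align*}
as an identity between quadratic forms on $D(\slashed{D}_0)$. We first estimate $R_1$. We have
\[ \chi(\slashed{D}_0)\eta j_{-,R}^2 F_S^2 f(r)\chi(\slashed{D}_0) = \eta\chi(\slashed{D}_0)(F_S^2 - 1)j_{-,R}^2 f(r)\chi(\slashed{D}_0) + \eta\chi(\slashed{D}_0)j_{-,R}^2 f(r)\chi(\slashed{D}_0) \]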
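and $\lim_{S\to\infty}\|\chi(\slashed{D}_0)(F_S^2-1)j_{-,R}^2f(r)\chi(\slashed{D}_0)\|=0$ using Lemma 5.18. We put
$$\hat R_1:=R_1+\eta\chi(\slashed{D}_0)(F_S^2-1)j_{-,R}^2f(r)\chi(\slashed{D}_0).$$
Let us now consider the term $\chi(\slashed{D}_0)c_1\big(j_{-,R}^2(\eta r+\ln|\slashed{D}_{S^2}|)\big)'F_S^2\chi(\slashed{D}_0)$. We have
$$\begin{aligned}\chi(\slashed{D}_0)c_1\big(j_{-,R}^2(\eta r+\ln|\slashed{D}_{S^2}|)\big)'F_S^2\chi(\slashed{D}_0)={}&\chi(\slashed{D}_0)c_1(j_{-,R}^2)'(\eta r+\ln|\slashed{D}_{S^2}|)F_S^2\chi(\slashed{D}_0)\\&+\chi(\slashed{D}_0)c_1j_{-,R}^2\eta F_S^2\chi(\slashed{D}_0).\end{aligned}\tag{5.11}$$
We use Lemma 5.18 to obtain
$$\lim_{S\to\infty}\chi(\slashed{D}_0)c_1j_{-,R}^2(F_S^2-1)\chi(\slashed{D}_0)=0.$$
We put
$$\hat K_1:=\hat R_1-\chi(\slashed{D}_0)c_1j_{-,R}^2(F_S^2-1)\chi(\slashed{D}_0).$$
Let us show that $\hat K_1$ is compact. We first note that the first term in (5.11) is compact by Corollary 4.2 and Lemma 5.8. Furthermore, $(\eta(c_1-f(r))j_{-,R}^2)\mathbf{1}_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0)$ is compact by (3.9) and Corollary 4.2. Besides, $X_-f'(r)\mathbf{1}_{[\lambda-\delta,\lambda+\delta]}(\slashed{D}_0)$ is compact by Corollary 4.2 and Lemma 5.8. We introduce the compact operator
$$K_1:=\tilde K_1+\hat K_1.$$
Let us now treat the first term in (5.10). We have
$$\begin{aligned}\chi(\slashed{D}_0)j_{-,R}F_S\slashed{D}_0F_Sj_{-,R}\chi(\slashed{D}_0)={}&\chi(\slashed{D}_0)j_{-,R}\slashed{D}_0j_{-,R}\chi(\slashed{D}_0)+\chi(\slashed{D}_0)j_{-,R}\slashed{D}_0j_{-,R}(F_S-1)\chi(\slashed{D}_0)\\&+\chi(\slashed{D}_0)j_{-,R}(F_S-1)\slashed{D}_0j_{-,R}F_S\chi(\slashed{D}_0).\end{aligned}\tag{5.12}$$
We know by Lemma 5.17 that $\|\chi(\slashed{D}_0)j_{-,R}\slashed{D}_0\|$ is bounded uniformly in $R$ and
$$\|j_{-,R}(F_S-1)\chi(\slashed{D}_0)\|\to0\quad\text{as } S\to\infty.$$
This estimates the second term in (5.12). The last term can be treated in the same manner. We obtain
$$\chi(\slashed{D}_0)[i\slashed{D}_0,A_-]\chi(\slashed{D}_0)\ge\eta\chi(\slashed{D}_0)j_{-,R}\slashed{D}_0j_{-,R}\chi(\slashed{D}_0)-\chi^2(\slashed{D}_0)+T+K_1$$
if $R$, $S$ are large enough. Let us now estimate $[i\slashed{D}_0,A_+]$. We obtain
$$\chi(\slashed{D}_0)[i\slashed{D}_0,A_+]\chi(\slashed{D}_0)=\chi(\slashed{D}_0)j_{+,R}\gamma D_rj_{+,R}\chi(\slashed{D}_0)-\chi(\slashed{D}_0)X_+(g'(r)\slashed{D}_{S^2}+f'(r))\chi(\slashed{D}_0)+k,$$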
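where $k$ is a compact operator. We now use the fact that the operator
$$\begin{aligned}&X_+(g'(r)\slashed{D}_{S^2}+f'(r))\chi(\slashed{D}_0)+(g(r)\slashed{D}_{S^2}+f(r))j_{+,R}^2\chi(\slashed{D}_0)\\&\quad=\left((X_+g'(r)+j_{+,R}^2g(r))\slashed{D}_{S^2}+(X_+f'(r)+j_{+,R}^2f(r))\right)\chi(\slashed{D}_0)\\&\quad=\left(j_{+,R}^2(rg'(r)+g(r))\slashed{D}_{S^2}+j_{+,R}^2(rf'(r)+f(r))\right)\chi(\slashed{D}_0)\end{aligned}$$
is compact, by (3.10), (3.11) and Corollary 4.2, to obtain
$$\chi(\slashed{D}_0)[i\slashed{D}_0,A_+]\chi(\slashed{D}_0)=\chi(\slashed{D}_0)j_{+,R}\slashed{D}_0j_{+,R}\chi(\slashed{D}_0)+K_2$$
with some compact operator $K_2$. Putting everything together, we find for $R$, $S$ large enough:
$$\begin{aligned}\chi(\slashed{D}_0)[i\slashed{D}_0,A]\chi(\slashed{D}_0)\ge{}&\eta\chi(\slashed{D}_0)j_{-,R}\slashed{D}_0j_{-,R}\chi(\slashed{D}_0)+\chi(\slashed{D}_0)j_{+,R}\slashed{D}_0j_{+,R}\chi(\slashed{D}_0)\\&-\chi^2(\slashed{D}_0)+T+K_1+K_2.\end{aligned}$$
Using the compactness of $\chi(\slashed{D}_0)j_{\pm,R}'$, we obtain
$$\chi(\slashed{D}_0)[i\slashed{D}_0,A]\chi(\slashed{D}_0)\ge\tilde\mu\chi^2(\slashed{D}_0)+K+T$$
with a compact operator $K$ and $\tilde\mu>0$. We now fix $R$, $S$ large enough. We can apply Corollary 5.2 to obtain
$$\chi(\slashed{D}_0)[i\slashed{D}_0,A]\chi(\slashed{D}_0)\ge\mu\chi^2(\slashed{D}_0),\qquad\mu>0,$$
if the support of $\chi$ is sufficiently small.

Step 2. We now estimate $[i\slashed{D}_1,A]$. The operator $\chi(\slashed{D})[i\slashed{D}_1,A]\chi(\slashed{D})$ is in fact compact. Let us write
$$\begin{aligned}\chi(\slashed{D})\slashed{D}_1A\chi(\slashed{D})={}&\left(\chi(\slashed{D})(\slashed{D}_0+i)\langle r\rangle^{-1}\right)\times\left((\slashed{D}_0+i)^{-1}\langle r\rangle\slashed{D}_1\langle r\rangle\langle r\rangle^{-1}A\chi(\slashed{D})\right)\\&+\chi(\slashed{D})(\slashed{D}_0+i)\langle r\rangle^{-1}(\slashed{D}_0+i)^{-1}[\gamma D_r,\langle r\rangle]\times(\slashed{D}_0+i)^{-1}\slashed{D}_1\langle r\rangle\langle r\rangle^{-1}A\chi(\slashed{D}).\end{aligned}$$
The first factor of each term is compact, the others are bounded. This concludes the proof of the lemma using that $\chi(\slashed{D})-\chi(\slashed{D}_0)$ is compact.

Using [42] we obtain the following consequence of the limiting absorption principle:

Theorem 5.1. For all $\chi\in C_0^\infty(\mathbb{R}\setminus(\{0\}\cup\sigma_{pp}(\slashed{D})))$, $\mu>\frac12$, $\psi\in\mathcal{H}$, we have
$$\int_0^\infty\|\langle A\rangle^{-\mu}e^{it\slashed{D}}\chi(\slashed{D})\psi\|^2\,dt\le C\|\psi\|^2.$$
The operator $\slashed{D}$ has no singular continuous spectrum and the pure point spectrum is locally finite in $\mathbb{R}\setminus\{0\}$.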
5.6. The Mourre estimate for $\slashed{D}_\nu$, $\nu\in N$

Let us first remark that we cannot apply directly the results of the previous sections to $\slashed{D}_\pm$. The situation for $\slashed{D}_\nu$, $\nu\in N$ is however much simpler as we can restrict our attention to the subspaces of spherical harmonics. So we shall work in what follows with the spaces $\mathcal{H}^{nl}$ and the operators $\slashed{D}_\nu^{nl}$. We will drop the indices $n$, $l$. We define
$$B_\nu:=\frac12(rD_r+D_rr)+\gamma c_1^\nu r,\qquad c_1^\nu=\begin{cases}c_1&\text{if }\nu=-,\\c_1j_-^2&\text{if }\nu=0,\\0&\text{if }\nu=+.\end{cases}$$
Let $N:=D_r^2+r^2+1$, $D(N)=\{\psi\in\mathcal{H},\ N\psi\in\mathcal{H}\}$. $N$ is self-adjoint with this domain and we have also
$$D(N)=(H^2(\mathbb{R}))^2\cap D(r^2).$$
All commutators in this subsection are defined as quadratic forms on $D(N)$. As $(C_0^\infty(\mathbb{R}))^2$ is dense in $D(N)$, it is sufficient to calculate them on $(C_0^\infty(\mathbb{R}))^2$.

Lemma 5.20. The pairs $(B_\nu,N)$ and $(\slashed{D}_\nu,N)$, $\nu\in N$ satisfy the hypotheses of Lemma 5.1.

Proof. The proof for $(B_\nu,N)$ is contained in the proof of [39, Lemma 4.5]. Let us treat $(\slashed{D}_\nu,N)$:
$$D(N)\subset D(\slashed{D}_\nu^2)=(H^2(\mathbb{R}))^2$$
and
$$\|\slashed{D}_\nu u\|^2\le C(\|\slashed{D}_\nu^2u\|^2+\|u\|^2)\le C\|Nu\|^2,\qquad\forall u\in D(N).$$
We calculate
$$[i\slashed{D}_\nu,N]=2\gamma r+D_rg'\tau q+g'\tau qD_r+D_rf_\nu'+f_\nu'D_r.$$
This implies:
$$|[i\slashed{D}_\nu,N](u,u)|\le C(Nu,u).$$

Lemma 5.21. We have
$$\forall\nu\in N,\quad\forall z\in\mathbb{C}\setminus\sigma(\slashed{D}_\nu),\quad(z-\slashed{D}_\nu)^{-1}:D(N)\to D(N).$$
The argument for the proof is the same as in the proof of Lemma 5.6; we omit the details.

Lemma 5.22. We have $\slashed{D}_\nu\in C^1(B_\nu)$ for all $\nu\in N$ and the commutator $[i\slashed{D}_\nu,B_\nu]$ can be extended to a bounded operator from $D(\slashed{D}_\nu)$ to $\mathcal{H}$, that we denote by $[i\slashed{D}_\nu,B_\nu]^0$.
Proof. We use Lemma 5.2. We show
$$|(B_\nu u,\slashed{D}_\nu u)-(\slashed{D}_\nu u,B_\nu u)|\le C(\|\slashed{D}_\nu u\|\,\|u\|+\|u\|^2). \tag{5.13}$$
We have
$$[i\slashed{D}_\nu,B_\nu]=\gamma D_r-rg_\nu'\tau q-rf_\nu'+(c_1^\nu r)'+[ig_\nu\tau q,\gamma c_1^\nu r].$$
This gives the desired estimate by Lemma 4.7 and (3.8)–(3.11).

Using Lemma 5.3 we obtain:

Corollary 5.3. The pair $(\slashed{D}_\nu,B_\nu)$ satisfies the Mourre conditions (M1) and (M2) for all $\nu\in N$.

Lemma 5.23. For all $\nu\in N$ the double commutator $[iB_\nu,[i\slashed{D}_\nu,B_\nu]^0]$, defined as a quadratic form on $D(N)$, can be extended to a bounded operator from $D(\slashed{D}_\nu)$ to $\mathcal{H}$.

Proof. We have
$$\begin{aligned}[[i\slashed{D}_\nu,B_\nu],iB_\nu]={}&\gamma D_r+r(rg_\nu')'\tau q+r(rf_\nu')'-r(c_1^\nu r)''-r([ig_\nu\tau q,\gamma c_1^\nu r])'\\&+(c_1^\nu r)'+[[ig_\nu\tau q,\gamma c_1^\nu r],i\gamma c_1^\nu r]\end{aligned}$$
and the estimate for the double commutator follows in the same way as for the commutator.

Lemma 5.24. Let $\nu\in N$. For all $\lambda_0>0$ there exists a neighborhood $I$ of $\lambda_0$ and $\mu>0$ such that
$$\mathbf{1}_I(\slashed{D}_\nu)[i\slashed{D}_\nu,B_\nu]\mathbf{1}_I(\slashed{D}_\nu)\ge\mu\mathbf{1}_I(\slashed{D}_\nu)+k_\nu$$
where $k_\nu$ is a compact operator. For $\lambda_0<0$, taking $-B_\nu$ instead of $B_\nu$, we obtain a similar estimate.

Proof. We work in the case $\lambda_0>0$ (the proof is similar for $\lambda_0<0$). By the proof of Lemma 5.22 we have
$$[i\slashed{D}_\nu,B_\nu]=\slashed{D}_\nu+(c_1^\nu r)'-f_\nu-rf_\nu'-rg_\nu'\tau q-g_\nu\tau q+[ig_\nu\tau q,\gamma c_1^\nu r].$$
Using Corollary 3.1 and (3.8)–(3.11), we see that for $\chi\in C_0^\infty(]0,+\infty[)$,
$$\chi(\slashed{D}_\nu)\left(rf_\nu'+rg_\nu'\tau q+g_\nu\tau q-(c_1^\nu r)'+f_\nu-[ig_\nu\tau q,\gamma c_1^\nu r]\right)\chi(\slashed{D}_\nu)$$
is compact. Putting everything together we find
$$\chi(\slashed{D}_\nu)[i\slashed{D}_\nu,B_\nu]\chi(\slashed{D}_\nu)\ge\mu\chi^2(\slashed{D}_\nu)+k_\nu$$
where $k_\nu$ is compact.
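The commutator formula for $[i\slashed{D}_\nu,B_\nu]$ used in the proofs of Lemmas 5.22 and 5.24 can be checked term by term. The sketch below assumes, as the formulas of this subsection suggest, that $\slashed{D}_\nu=\gamma D_r+g_\nu\tau q+f_\nu$ with $D_r=-i\partial_r$, that $\gamma^2=1$, and that $\gamma$, $\tau$ are constant matrices commuting with scalar functions of $r$. These are assumptions inferred from the surrounding text, not a restatement of Sec. 3; $A_0$ and $\varphi$ are notation introduced only for this sketch.

```latex
\begin{aligned}
A_0 &:= \tfrac12(rD_r + D_r r) = rD_r - \tfrac{i}{2}, \qquad [D_r,\varphi(r)] = -i\varphi'(r),\\
[i\gamma D_r, A_0] &= i\gamma\,[D_r, r]\,D_r = \gamma D_r,\\
[i\varphi(r), A_0] &= i r\,[\varphi, D_r] = -r\varphi'(r)
   \qquad (\varphi = f_\nu \ \text{or}\ \varphi = g_\nu \tau q),\\
[i\gamma D_r, \gamma c_1^\nu r] &= i\gamma^2\,[D_r, c_1^\nu r] = (c_1^\nu r)',\\
[i f_\nu, \gamma c_1^\nu r] &= 0 .
\end{aligned}
```

Summing these contributions and keeping $[ig_\nu\tau q,\gamma c_1^\nu r]$ unevaluated gives the expression in the proof of Lemma 5.22; substituting $\gamma D_r=\slashed{D}_\nu-g_\nu\tau q-f_\nu$ then gives the form used in the proof of Lemma 5.24.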
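Using [42] we obtain the following consequence of the limiting absorption principle:

Theorem 5.2. For all $\nu\in N$, $\chi\in C_0^\infty(\mathbb{R}\setminus\{0\})$, $\mu>\frac12$ and $\psi\in\mathcal{H}$, we have
$$\int_0^\infty\|\langle B_\nu\rangle^{-\mu}e^{it\slashed{D}_\nu}\chi(\slashed{D}_\nu)\psi\|^2\,dt\le C\|\psi\|^2.$$
The operators $\slashed{D}_\nu$ have no singular continuous spectrum.

6. Asymptotic Completeness

Let $j_\pm\in C_b^\infty(\mathbb{R})$ be a partition of unity as follows: there exists $R>0$ such that
$$j_-(r)=0,\ j_+(r)=1,\quad\forall r\ge R,\qquad j_+(r)=0,\ j_-(r)=1,\quad\forall r\le-R,\qquad j_+^2+j_-^2=1.$$

6.1. Technical results

Lemma 6.1. Let $\chi\in C_0^\infty(\mathbb{R})$. Then the operator $\langle A\rangle\chi(\slashed{D})\langle r\rangle^{-1}$ is bounded.

Proof. We first show that $[\langle A\rangle,\chi(\slashed{D})]$ is bounded. Using the Helffer–Sjöstrand formula, it is sufficient to show that $(z-\slashed{D})^{-1}[\langle A\rangle,\slashed{D}](z-\slashed{D})^{-1}$ is bounded and that
$$\|(z-\slashed{D})^{-1}[\langle A\rangle,\slashed{D}](z-\slashed{D})^{-1}\|\le C|\mathrm{Im}\,z|^{-2}.$$
This follows from the commutator and double commutator estimates and from [15, Lemma C.3.2]. It remains to show that $\chi(\slashed{D})\langle A\rangle\langle r\rangle^{-1}$ is bounded. This follows from Lemma 5.8.

Lemma 6.2. Let $\chi\in C_0^\infty(\mathbb{R})$. Then the operator $\langle A\rangle\chi(\slashed{D})(\slashed{D}-\slashed{D}_0)\chi(\slashed{D}_0)\langle A\rangle$ is bounded.

Proof. Let $\tilde\chi\in C_0^\infty(\mathbb{R})$ with $\chi\tilde\chi=\chi$. We have
$$\langle A\rangle\chi(\slashed{D})(\slashed{D}-\slashed{D}_0)\chi(\slashed{D}_0)\langle A\rangle=\langle A\rangle\chi(\slashed{D})\langle r\rangle^{-1}\,\langle r\rangle\tilde\chi(\slashed{D})(\slashed{D}-\slashed{D}_0)\tilde\chi(\slashed{D}_0)\langle r\rangle\,\langle r\rangle^{-1}\chi(\slashed{D}_0)\langle A\rangle$$
and it is sufficient to show that
$$\langle r\rangle\tilde\chi(\slashed{D})(\slashed{D}-\slashed{D}_0)\tilde\chi(\slashed{D}_0)\langle r\rangle\ \text{is bounded}.$$
By Lemma 5.7, $[\langle r\rangle,\tilde\chi(\slashed{D})]$ is bounded from $\mathcal{H}$ to $\mathcal{H}^1$, so it remains to show that
$$\tilde\chi(\slashed{D})\langle r\rangle(\slashed{D}-\slashed{D}_0)\langle r\rangle\tilde\chi(\slashed{D}_0)\ \text{is bounded},$$
which is a consequence of Lemma 4.8.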
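Several proofs in this section reduce a commutator with $\chi(\slashed{D})$ to resolvent estimates through the Helffer–Sjöstrand formula. As a reminder, here is the mechanism in one common normalization. The almost analytic extension $\chi^{\mathbb{C}}$ and the generic operators $H$, $B$ are notation introduced only for this illustration; the text itself does not restate the formula.

```latex
\begin{aligned}
\chi(H) &= -\frac{1}{\pi}\int_{\mathbb{C}} \partial_{\bar z}\chi^{\mathbb{C}}(z)\,(z-H)^{-1}\,dx\,dy,
  \qquad z = x+iy,\quad \chi^{\mathbb{C}}\in C_0^\infty(\mathbb{C}),\quad
  |\partial_{\bar z}\chi^{\mathbb{C}}(z)| \le C_N\,|\mathrm{Im}\,z|^{N},\\
[B,\chi(H)] &= -\frac{1}{\pi}\int_{\mathbb{C}} \partial_{\bar z}\chi^{\mathbb{C}}(z)\,
  (z-H)^{-1}[B,H](z-H)^{-1}\,dx\,dy .
\end{aligned}
```

If $\|(z-H)^{-1}[B,H](z-H)^{-1}\|\le C|\mathrm{Im}\,z|^{-2}$, the integrand is bounded in norm by $C\,C_N|\mathrm{Im}\,z|^{N-2}$, which is integrable over the compact support of $\chi^{\mathbb{C}}$ as soon as $N\ge2$, so $[B,\chi(H)]$ is bounded. If in addition the integrand is compact for each $z\notin\mathbb{R}$, the norm-convergent integral is compact as well.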
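6.2. Comparison with the intermediate dynamics

Theorem 6.1. The limits
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}}e^{it\slashed{D}_0}, \tag{6.1}$$
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_0}e^{it\slashed{D}}\mathbf{1}^c(\slashed{D}) \tag{6.2}$$
exist ($\mathbf{1}^c(\slashed{D})$ is the projector onto the continuous subspace of $\slashed{D}$). If we denote (6.1) by $\Omega^+$, then (6.2) equals $(\Omega^+)^*$ and we have
$$(\Omega^+)^*\Omega^+=1,\qquad\Omega^+(\Omega^+)^*=\mathbf{1}^c(\slashed{D}).$$

Proof. By a density argument and using that $\sigma_{pp}(\slashed{D})$ has no accumulation point except possibly 0, as well as the fact that $\slashed{D}_0$ does not have any eigenvalue, it is sufficient to show the existence of
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}}e^{it\slashed{D}_0}\tilde\chi^2(\slashed{D}_0), \tag{6.3}$$
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_0}e^{it\slashed{D}}\chi^2(\slashed{D}) \tag{6.4}$$
with $\chi,\tilde\chi\in C_0^\infty(\mathbb{R})$ and $\operatorname{supp}\chi\subset\mathbb{R}\setminus(\{0\}\cup\sigma_{pp}(\slashed{D}))$, $\operatorname{supp}\tilde\chi\subset\mathbb{R}\setminus\{0\}$. We have
$$e^{-it\slashed{D}_0}e^{it\slashed{D}}\chi^2(\slashed{D})=e^{-it\slashed{D}_0}(\chi(\slashed{D})-\chi(\slashed{D}_0))e^{it\slashed{D}}\chi(\slashed{D})+e^{-it\slashed{D}_0}\chi(\slashed{D}_0)e^{it\slashed{D}}\chi(\slashed{D}). \tag{6.5}$$
By Lemma 4.5, $\chi(\slashed{D}_0)-\chi(\slashed{D})$ is compact. As $\operatorname{supp}\chi\cap\sigma_{pp}(\slashed{D})=\emptyset$ and $\sigma_{sc}(\slashed{D})=\emptyset$, $e^{it\slashed{D}}\chi(\slashed{D})\to0$ weakly, so $(\chi(\slashed{D}_0)-\chi(\slashed{D}))e^{it\slashed{D}}\chi(\slashed{D})\to0$ strongly, and the first term in (6.5) tends strongly to zero. We have
$$\frac{d}{dt}\chi(\slashed{D}_0)e^{-it\slashed{D}_0}e^{it\slashed{D}}\chi(\slashed{D})=\chi(\slashed{D}_0)e^{-it\slashed{D}_0}i(\slashed{D}-\slashed{D}_0)e^{it\slashed{D}}\chi(\slashed{D}).$$
Let $\hat\chi\in C_0^\infty(\mathbb{R}\setminus(\{0\}\cup\sigma_{pp}(\slashed{D})))$ with $\chi\hat\chi=\chi$. Then
$$\frac{d}{dt}\chi(\slashed{D}_0)e^{-it\slashed{D}_0}e^{it\slashed{D}}\chi(\slashed{D})=\left(e^{-it\slashed{D}_0}\chi(\slashed{D}_0)\langle A\rangle^{-1}\right)\times\left(\langle A\rangle\hat\chi(\slashed{D}_0)i(\slashed{D}-\slashed{D}_0)\hat\chi(\slashed{D})\langle A\rangle\right)\left(\langle A\rangle^{-1}\chi(\slashed{D})e^{it\slashed{D}}\right).$$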
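The way such a factorization yields convergence can be sketched as a standard Cook-type duality estimate. The sketch assumes that the local decay bound of Theorem 5.1 with $\mu=1$ is available for both $\slashed{D}$ and $\slashed{D}_0$, and that the middle factor is bounded; the symbols $W(t)$, $B$ and $\varphi$ are introduced only for this illustration.

```latex
\begin{aligned}
W(t) &:= \chi(\slashed{D}_0)\,e^{-it\slashed{D}_0}e^{it\slashed{D}}\chi(\slashed{D}), \qquad
B := \langle A\rangle\hat\chi(\slashed{D}_0)\,i(\slashed{D}-\slashed{D}_0)\,\hat\chi(\slashed{D})\langle A\rangle,\\
|(\varphi,(W(t_2)-W(t_1))\psi)|
 &\le \int_{t_1}^{t_2} \|\langle A\rangle^{-1}\chi(\slashed{D}_0)e^{it\slashed{D}_0}\varphi\|\;\|B\|\;
      \|\langle A\rangle^{-1}\chi(\slashed{D})e^{it\slashed{D}}\psi\|\,dt\\
 &\le \|B\|\Big(\int_0^\infty \|\langle A\rangle^{-1}\chi(\slashed{D}_0)e^{it\slashed{D}_0}\varphi\|^2dt\Big)^{1/2}
      \Big(\int_{t_1}^{t_2}\|\langle A\rangle^{-1}\chi(\slashed{D})e^{it\slashed{D}}\psi\|^2dt\Big)^{1/2}\\
 &\le C\,\|B\|\,\|\varphi\|\,\Big(\int_{t_1}^{t_2}\|\langle A\rangle^{-1}\chi(\slashed{D})e^{it\slashed{D}}\psi\|^2dt\Big)^{1/2}.
\end{aligned}
```

Taking the supremum over $\|\varphi\|=1$ bounds $\|(W(t_2)-W(t_1))\psi\|$ by the tail of a convergent integral, so $W(t)\psi$ is Cauchy as $t\to\infty$.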
The second operator is bounded by Lemma 6.2. The first and third operators, using Theorem 5.1 and a duality argument, give the integrability of the whole expression. This shows the existence of (6.2). The proof of the existence of (6.1) is analogous.

6.3. Technical results concerning the separable problem

In this subsection we drop the indices $n$, $l$. Recall that $D(\slashed{D}_\nu)=(H^1(\mathbb{R}))^2$, $\forall\nu\in N$.

Lemma 6.3. Let $\psi\in C_b^\infty(\mathbb{R})$ such that $\psi'\in C_0^\infty(\mathbb{R})$, $\chi\in C_0^\infty(\mathbb{R})$. Then $[\psi,\chi(\slashed{D}_\nu)]$ is compact.
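Proof. By the Helffer–Sjöstrand formula it is sufficient to show that
$$\forall z\in\mathbb{C}\setminus\sigma(\slashed{D}_\nu),\quad(z-\slashed{D}_\nu)^{-1}[\psi,\slashed{D}_\nu](z-\slashed{D}_\nu)^{-1}\ \text{is compact}, \tag{6.6}$$
$$\|(z-\slashed{D}_\nu)^{-1}[\psi,\slashed{D}_\nu](z-\slashed{D}_\nu)^{-1}\|\le C|\mathrm{Im}\,z|^{-2}. \tag{6.7}$$
(6.7) is clear and (6.6) follows from Corollary 3.1 because $[i\slashed{D}_\nu,\psi]=\gamma\psi'$.

Lemma 6.4. The operator $j_\pm(\chi(\slashed{D}_\pm)-\chi(\slashed{D}_0))$ is compact.

Proof. Using the Helffer–Sjöstrand formula it is sufficient to show that
$$\|j_\pm(z-\slashed{D}_0)^{-1}(\slashed{D}_0-\slashed{D}_\pm)(z-\slashed{D}_\pm)^{-1}\|\le C|\mathrm{Im}\,z|^{-2}, \tag{6.8}$$
$$\forall z\in\mathbb{C}\setminus(\sigma(\slashed{D}_0)\cup\sigma(\slashed{D}_\pm)),\quad j_\pm(z-\slashed{D}_0)^{-1}(\slashed{D}_0-\slashed{D}_\pm)(z-\slashed{D}_\pm)^{-1}\ \text{is compact}. \tag{6.9}$$
(6.8) is clear; let us show (6.9). We have
$$\begin{aligned}j_\pm(z-\slashed{D}_0)^{-1}(\slashed{D}_0-\slashed{D}_\pm)(z-\slashed{D}_\pm)^{-1}={}&(z-\slashed{D}_0)^{-1}[j_\pm,\slashed{D}_0](z-\slashed{D}_0)^{-1}(\slashed{D}_0-\slashed{D}_\pm)(z-\slashed{D}_\pm)^{-1}\\&+(z-\slashed{D}_0)^{-1}j_\pm(\slashed{D}_0-\slashed{D}_\pm)(z-\slashed{D}_\pm)^{-1}.\end{aligned}\tag{6.10}$$
The first term is compact by Lemma 6.3. We have
$$j_\pm(\slashed{D}_0-\slashed{D}_\pm)=j_\pm(g_0(r)-g_\pm(r))\tau q+j_\pm(f(r)-f_\pm).$$
Both terms are functions which tend to zero as $|r|\to\infty$, so the last term in (6.10) is compact by Corollary 3.1.

Lemma 6.5. For all $\chi\in C_0^\infty(\mathbb{R})$,
$$\langle B_\nu\rangle\chi(\slashed{D}_0)(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\chi(\slashed{D}_\pm)\langle B_\nu\rangle\ \text{is bounded}.$$

Proof. Let $\tilde\chi\in C_0^\infty(\mathbb{R})$ with $\chi\tilde\chi=\chi$. We have
$$\langle B_\nu\rangle\chi(\slashed{D}_0)(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\chi(\slashed{D}_\pm)\langle B_\nu\rangle=\langle B_\nu\rangle\chi(\slashed{D}_0)\langle r\rangle^{-1}\,\langle r\rangle\tilde\chi(\slashed{D}_0)(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\tilde\chi(\slashed{D}_\pm)\langle r\rangle\,\langle r\rangle^{-1}\chi(\slashed{D}_\pm)\langle B_\nu\rangle.$$
We first show that
$$\langle B_\nu\rangle\chi(\slashed{D}_\nu)\langle r\rangle^{-1}\ \text{is bounded for all }\nu\in N. \tag{6.11}$$
By [15, Lemma C.3.2], the Helffer–Sjöstrand formula and the commutator estimates included in the proofs of Lemmas 5.22 and 5.23, $[\langle B_\nu\rangle,\chi(\slashed{D}_\nu)]$ is bounded. As $\chi(\slashed{D}_\nu)\langle B_\nu\rangle\langle r\rangle^{-1}$ is also bounded, (6.11) follows. It remains to show that
$$\langle r\rangle\tilde\chi(\slashed{D}_0)(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\tilde\chi(\slashed{D}_\pm)\langle r\rangle\ \text{is bounded}. \tag{6.12}$$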
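Clearly, the commutator $[\langle r\rangle,\slashed{D}_\nu]$ is bounded and by the Helffer–Sjöstrand formula, we find
$$[\langle r\rangle,\tilde\chi(\slashed{D}_\nu)]\ \text{is bounded for all }\nu\text{ from }\mathcal{H}\text{ to }\mathcal{H}^1. \tag{6.13}$$
It remains to show that
$$\tilde\chi(\slashed{D}_0)\langle r\rangle(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\langle r\rangle\tilde\chi(\slashed{D}_\pm)\ \text{is bounded}. \tag{6.14}$$
We have
$$\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm=\frac1ij_\pm'\gamma+j_\pm(g(r)-g_\pm(r))\tau q+j_\pm(f(r)-f_\pm(r)).$$
It follows that $\langle r\rangle(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\langle r\rangle$ is a uniformly bounded function thanks to (3.8)–(3.11).

6.4. Comparison with the asymptotic dynamics

Theorem 6.2. The limits
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_0}j_\pm e^{it\slashed{D}_\pm}, \tag{6.15}$$
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_\pm}j_\pm e^{it\slashed{D}_0} \tag{6.16}$$
exist. If we denote (6.15) by $\Omega^+_{0,\pm}$, then (6.16) equals $(\Omega^+_{0,\pm})^*$ and we have
$$\Omega^+_{0,+}(\Omega^+_{0,+})^*+\Omega^+_{0,-}(\Omega^+_{0,-})^*=(\Omega^+_{0,+})^*\Omega^+_{0,+}+(\Omega^+_{0,-})^*\Omega^+_{0,-}=1.$$
$\Omega^+_{0,\pm}$, $(\Omega^+_{0,\pm})^*$ are independent of the choice of the partition of unity.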
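Proof. We start by proving the existence of $\Omega^+_{0,\pm}$. It is sufficient to show for all $n$, $l$ the existence of
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_0^{nl}}j_\pm e^{it\slashed{D}_\pm^{nl}}\chi^2(\slashed{D}_\pm^{nl})$$
for $\chi\in C_0^\infty(\mathbb{R}\setminus\{0\})$. From now on we omit the indices $n$, $l$. We have
$$\begin{aligned}e^{-it\slashed{D}_0}j_\pm e^{it\slashed{D}_\pm}\chi^2(\slashed{D}_\pm)={}&e^{-it\slashed{D}_0}j_\pm(\chi(\slashed{D}_\pm)-\chi(\slashed{D}_0))e^{it\slashed{D}_\pm}\chi(\slashed{D}_\pm)\\&+e^{-it\slashed{D}_0}[j_\pm,\chi(\slashed{D}_0)]e^{it\slashed{D}_\pm}\chi(\slashed{D}_\pm)\\&+e^{-it\slashed{D}_0}\chi(\slashed{D}_0)j_\pm e^{it\slashed{D}_\pm}\chi(\slashed{D}_\pm).\end{aligned}\tag{6.17}$$
Using that $[j_\pm,\chi(\slashed{D}_0)]$ is compact by Lemma 6.3 and that $j_\pm(\chi(\slashed{D}_\pm)-\chi(\slashed{D}_0))$ is compact by Lemma 6.4, it is sufficient to show that the last term in (6.17) has a limit. Let $\hat\chi\in C_0^\infty(\mathbb{R}\setminus\{0\})$ such that $\chi\hat\chi=\chi$. We have
$$\begin{aligned}\frac{d}{dt}\chi(\slashed{D}_0)e^{-it\slashed{D}_0}j_\pm e^{it\slashed{D}_\pm}\chi(\slashed{D}_\pm)&=-ie^{-it\slashed{D}_0}\chi(\slashed{D}_0)(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)e^{it\slashed{D}_\pm}\chi(\slashed{D}_\pm)\\&=-i\left(e^{-it\slashed{D}_0}\chi(\slashed{D}_0)\langle B_\nu\rangle^{-1}\right)\times\left(\langle B_\nu\rangle\hat\chi(\slashed{D}_0)(\slashed{D}_0j_\pm-j_\pm\slashed{D}_\pm)\hat\chi(\slashed{D}_\pm)\langle B_\nu\rangle\right)\left(\langle B_\nu\rangle^{-1}\chi(\slashed{D}_\pm)e^{it\slashed{D}_\pm}\right).\end{aligned}$$
We conclude as in the proof of Theorem 6.1 using Theorem 5.2 and Lemma 6.5. The proof of the existence of (6.16) is analogous. In order to prove the last statement, it is sufficient to show that
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_\pm}\psi e^{it\slashed{D}_0}=0$$
for all $\psi\in C_0^\infty(\mathbb{R})$. This will follow from
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_\pm}\psi e^{it\slashed{D}_0}\chi(\slashed{D}_0)=0$$
for all $\chi\in C_0^\infty(\mathbb{R}\setminus\{0\})$, which is true because $\psi\chi(\slashed{D}_0)$ is compact.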
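Theorem 6.3. The limits
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}}j_\pm e^{it\slashed{D}_\pm}, \tag{6.18}$$
$$\mathrm{s}\text{-}\lim_{t\to\infty}e^{-it\slashed{D}_\pm}j_\pm e^{it\slashed{D}}\mathbf{1}^c(\slashed{D}) \tag{6.19}$$
exist. If we denote (6.18) by $\Omega^+_\pm$, then (6.19) equals $(\Omega^+_\pm)^*$ and we have
$$\Omega^+_+(\Omega^+_+)^*+\Omega^+_-(\Omega^+_-)^*=\mathbf{1}^c(\slashed{D}),\qquad(\Omega^+_+)^*\Omega^+_++(\Omega^+_-)^*\Omega^+_-=1.$$
$\Omega^+_\pm$, $(\Omega^+_\pm)^*$ are independent of the choice of the partition of unity.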
Proof. This follows from Theorems 6.1 and 6.2 and the chain rule.

7. Proof of the Theorems of Sec. 2.6

7.1. Absence of point spectrum

The separability of Weyl's equation in Boyer–Lindquist coordinates was proved independently by Unruh [57] and Teukolsky [56]; Chandrasekhar then extended the result to the full Dirac equation (see [11, 12]). All these proofs rely on the Newman–Penrose formalism and adopt Kinnersley's null tetrad (for its definition, see Eqs. (2.20)–(2.23) or Kinnersley's original work [36]). The absence of stationary solutions to the charged massive Dirac equation outside a non-extreme Kerr–Newman black hole is the object of a work by Finster et al. [22]; the class of solutions considered there is specified by means of so-called matching conditions across the horizon. In fact, if we simply impose the physical requirement that solutions should have finite total charge, the absence of stationary solutions becomes an immediate consequence of the separability of the equations. We prove this for the massless Dirac equation on the Kerr metric. The same technique should be valid on Kerr–Newman backgrounds for charged and massive fields. In the extreme case, however, the method fails because of the lack of integrability at the horizon.

Proposition 7.1. There are no stationary finite charge Weyl fields outside a Kerr black hole, i.e. there are no non-zero solutions $\phi_A\in C(\mathbb{R}_t;L^2((\Sigma;d\mathrm{Vol});\mathbb{S}_A))$ of (2.5), of the form
$$\phi(t,r,\theta,\varphi)=e^{-i\alpha t}\chi(r,\theta,\varphi),\qquad\alpha\in\mathbb{R}.$$
In other words, the Hamiltonian $\slashed{D}_K$ has empty point spectrum on $\mathcal{H}$.

Proof. Let us consider $\alpha\in\mathbb{R}$ and $\chi_A\in L^2((\Sigma;d\mathrm{Vol});\mathbb{S}_A)$ such that $\phi_A(t,r,\theta,\varphi)=e^{-i\alpha t}\chi_A(r,\theta,\varphi)$ satisfies (2.5). We denote by $f_0$ and $f_1$ the components of $\chi_A$ in the spin-frame $(O^A,I^A)$ associated with Kinnersley's null tetrad $L^a$, $N^a$, $m^a$, $\bar m^a$, i.e.
$$L^a=O^A\bar O^{A'},\quad N^a=I^A\bar I^{A'},\quad m^a=O^A\bar I^{A'},\quad\bar m^a=I^A\bar O^{A'},$$
$$f_0=\chi_AO^A,\qquad f_1=\chi_AI^A.$$
A simple calculation shows that $\chi_A\in L^2((\Sigma;d\mathrm{Vol});\mathbb{S}_A)$ if and only if
$$f_0\in L^2(\Sigma;\Delta\,dr_*\,d\omega),\qquad f_1\in L^2(\Sigma;\rho^2\,dr_*\,d\omega). \tag{7.1}$$
We know from [11, 56] that for $s=\pm1/2$ and for each value of $\alpha$, there exist orthonormal bases $\{Y^{s,\alpha}_{l,n}(\theta,\varphi)=S^{s,\alpha}_{l,n}(\theta)e^{in\varphi}\}_{l,n}$ of $L^2(S^2,d\omega)$, such that $f_0$ (respectively $f_1$) can be decomposed into a series, convergent in $L^2(\Sigma;\Delta\,dr_*\,d\omega)$ (respectively in $L^2(\Sigma;\rho^2\,dr_*\,d\omega)$) of functions of the form (see footnote c):
$$f_0^{l,n}(t,r,\theta,\varphi)=e^{in\varphi}R_{1/2}(r)S_{1/2}(\theta),\qquad f_1^{l,n}(t,r,\theta,\varphi)=\frac{1}{\bar p}e^{in\varphi}R_{-1/2}(r)S_{-1/2}(\theta)$$
(where we omit in $R$ and $S$ the indices $l$, $n$ and $\alpha$ for simplicity of notation), the functions $e^{in\varphi}S_{\pm1/2}(\theta)$ are smooth functions on the sphere and $R_{\pm1/2}$ satisfy
$$\left(\frac{r^2+a^2}{\Delta}\frac{\partial}{\partial r_*}+\frac{iK}{\Delta}\right)R_{-1/2}=\lambda R_{1/2},$$
$$\left((r^2+a^2)\frac{\partial}{\partial r_*}-iK+(r-M)\right)R_{1/2}=2\lambda R_{-1/2},$$
$$K=(r^2+a^2)\alpha+an,$$
$\lambda$ being a separation constant depending on the discrete parameters $l$ and $n$.

Footnote c. Note that Chandrasekhar's unknowns $F_1$ and $F_2$ are the components of $\phi^A$, and not $\phi_A$, with respect to the spin-frame $(O^A,I^A)$; the correspondence with our unknowns is therefore $f_0=-e^{i\alpha t}F_2$, $f_1=e^{i\alpha t}F_1$.

The condition (7.1) is equivalent to (since $|\bar p|^2=\rho^2$)
$$R_{1/2}\in L^2(\mathbb{R};\Delta\,dr_*),\qquad R_{-1/2}\in L^2(\mathbb{R};dr_*). \tag{7.2}$$
Putting
$$U=R_{-1/2},\qquad V=\sqrt\Delta\,R_{1/2},$$
we obtain
$$\left(\frac{\partial}{\partial r_*}+\frac{iK}{r^2+a^2}\right)U=\lambda\sqrt\Delta\,V,\qquad\left(\frac{\partial}{\partial r_*}-\frac{iK}{r^2+a^2}\right)V=2\lambda\sqrt\Delta\,U.$$
We can multiply $U$ and $V$ by phase factors in order to get rid of the terms involving $iK$:
$$\tilde U=\exp\left(\int_0^{r_*}\frac{iK}{(r(s))^2+a^2}\,ds\right)U=:\beta U,$$
$$\tilde V=\exp\left(-\int_0^{r_*}\frac{iK}{(r(s))^2+a^2}\,ds\right)V=\bar\beta V,$$
where $s\mapsto r(s)$ is the reciprocal function of $r\mapsto r_*$. Now $\tilde U$ and $\tilde V$ satisfy the differential system
$$\tilde U'=\lambda\beta^2\sqrt\Delta\,\tilde V,\qquad\tilde V'=2\lambda\bar\beta^2\sqrt\Delta\,\tilde U,$$
i.e.
$$\begin{pmatrix}\tilde U\\\tilde V\end{pmatrix}'=B\begin{pmatrix}\tilde U\\\tilde V\end{pmatrix},\qquad B=\begin{pmatrix}0&\lambda\beta^2\sqrt\Delta\\2\lambda\bar\beta^2\sqrt\Delta&0\end{pmatrix} \tag{7.3}$$
and (7.2) is equivalent to
$$\tilde U,\ \tilde V\in L^2(\mathbb{R};dr_*).$$
The factor $\lambda\beta^2\sqrt\Delta$ falls off exponentially fast as $r_*\to-\infty$ and is therefore integrable at $-\infty$. Hence, there exists a solution ${}^t(\tilde U,\tilde V)$ of (7.3) that tends to ${}^t(c_1,c_2)$ at $-\infty$ for any $c_1,c_2\in\mathbb{C}$. The space of solutions of (7.3) being of complex dimension 2, it follows that non-zero solutions of (7.3) do not belong to $L^2(\mathbb{R};dr_*)$.
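The final step of this proof can be made explicit. The following sketch assumes only what is stated above, namely that $\|B(s)\|$ is integrable near $-\infty$; the notation $Y$, $c$ and $b(s)$ is introduced here for illustration.

```latex
\begin{aligned}
Y &:= {}^t(\tilde U,\tilde V), \qquad Y' = B\,Y, \qquad b(s) := \int_{-\infty}^{s}\|B(u)\|\,du < \infty,\\
Y(s) &= c + \int_{-\infty}^{s} B(u)\,Y(u)\,du
   \quad\Longrightarrow\quad |Y(s)| \le |c|\,e^{b(s)} \quad\text{(Gronwall)},\\
|Y(s) - c| &\le \int_{-\infty}^{s}\|B(u)\|\,|Y(u)|\,du \le |c|\,\big(e^{b(s)}-1\big) \longrightarrow 0
   \quad (s\to-\infty).
\end{aligned}
```

The map $c\mapsto Y$ is linear and injective from $\mathbb{C}^2$ into the two-dimensional solution space, hence onto: every solution has a limit $c$ at $-\infty$, and $c\neq0$ for a non-zero solution. Then $|Y(s)|\ge|c|/2$ for all $s$ below some $s_0$, so $\int_{-\infty}^{s_0}|Y|^2\,dr_*=\infty$ and $Y\notin L^2(\mathbb{R};dr_*)$.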
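7.2. Compatibility of the general analytic framework

With the notations of Sec. 3 we put
$$g(r_*):=\frac{\sqrt\Delta}{r^2+a^2},\qquad f(r_*):=-\frac{2Mran}{(r^2+a^2)^2},\qquad h(r_*):=\sqrt{\frac{r^2+a^2}{\sigma}},$$
$$\begin{aligned}V^n:={}&\frac{i\sqrt\Delta}{\sigma\sin\theta}\left(\frac{\rho^2}{\sigma}-1\right)\begin{pmatrix}0&1\\-1&0\end{pmatrix}+i\sqrt{\frac{r^2+a^2}{\sigma}}\,\gamma\left(\sqrt{\frac{r^2+a^2}{\sigma}}\right)'+iB\\&+\frac{i\Delta^{3/2}a^2\sin\theta\cos\theta}{2\sigma^3}\begin{pmatrix}0&1\\1&0\end{pmatrix}+\frac{2Mran}{\sigma}\left(\frac{1}{r^2+a^2}-\frac1\sigma\right),\end{aligned}$$
$$\slashed{D}_0^n:=\gamma D_{r_*}+g(r_*)\slashed{D}_{S^2}+f(r_*),\qquad\slashed{D}^n:=h\slashed{D}_0^nh+V^n.$$
We have
$$\begin{aligned}\slashed{D}^n=h\slashed{D}_0^nh+V^n={}&\frac{r^2+a^2}{\sigma}\gamma D_{r_*}+\frac{\sqrt\Delta}{\sigma}\slashed{D}_{S^2}-\frac{2Mran}{\sigma(r^2+a^2)}-\frac{i\Delta^{3/2}a^2\sin\theta\cos\theta}{2\sigma^3}\begin{pmatrix}0&1\\1&0\end{pmatrix}\\&+\sqrt{\frac{r^2+a^2}{\sigma}}\,\frac1i\gamma\left(\sqrt{\frac{r^2+a^2}{\sigma}}\right)'+V^n=\slashed{D}_K^n,\end{aligned}$$
where $\slashed{D}_K^n$ is obtained from $\slashed{D}_K$ by fixing $D_\varphi=n$ without changing $\slashed{D}_{S^2}$.

Remark 7.1. The operators $\slashed{D}^n$ and $\slashed{D}_0^n$ are operators acting on $\mathcal{H}$. They coincide with the restrictions of $\slashed{D}$ and $\slashed{D}_0$ to the subspace of functions whose dependence in $\varphi$ is given by $e^{in\varphi}$. For such restrictions, the operator $\slashed{D}_{S^2}$ is not "simplified"; it is better to keep the whole $\slashed{D}_{S^2}$, which is a regular operator acting on this subspace, than to use an explicit expression with coordinate singularities.

Let us recall from Sec. 3 that [$\eta=\kappa_+$ defined in (2.61)]
$$f\in S^{m,n}\quad\text{iff}\quad\forall\alpha,\beta\in\mathbb{N},\quad\partial_{r_*}^\alpha\partial_\theta^\beta f\in\begin{cases}O(\langle r_*\rangle^{m-\alpha}),&r_*\to+\infty,\\O(e^{n\eta|r_*|}),&r_*\to-\infty,\end{cases}$$
$$f\in S^m\quad\text{iff}\quad\forall\alpha,\beta\in\mathbb{N},\quad\partial_{r_*}^\alpha\partial_\theta^\beta f\in O(\langle r_*\rangle^{m-\alpha}),\quad|r_*|\to\infty.$$
We check
$$S^{m,n}\times S^{m',n'}\subset S^{m+m',n+n'}, \tag{7.4}$$
$$\forall\alpha\in\mathbb{N},\quad\partial_{r_*}^\alpha:S^{m,n}\to S^{m-\alpha,n}, \tag{7.5}$$
$$\forall\beta\in\mathbb{N},\quad\partial_\theta^\beta:S^{m,n}\to S^{m,n}. \tag{7.6}$$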
For a function depending on $r$ and $\theta$ we define
$$f(r,\theta)\in\Pi^m\quad\text{iff}\quad\forall\alpha,\beta\in\mathbb{N},\quad\partial_r^\alpha\partial_\theta^\beta f\in\begin{cases}O(\langle r\rangle^{m-\alpha}),&r\to\infty,\\O(1),&r\to r_+.\end{cases}$$
We have (see [32, Lemma 9.7.1]):

Lemma 7.1. (i) If $f(r)\in\Pi^m$, then we have for all $\alpha\in\mathbb{N}$, $D_{r_*}^\alpha f(r(r_*))\in S^{m-\alpha,-2}$.
(ii) If $f(r_*)\in S^{m,n}$ and $g(r)\in\Pi^k$, then $f(r_*)g(r(r_*))\in S^{m+k,n}$.

Examples.
(i) $e_1(r,\theta)=\Delta\in S^{2,-2}$,
(ii) $e_2(r,\theta)=\sqrt\Delta\in S^{1,-1}$,
(iii) $e_3(r,\theta)=\frac{1}{\sigma^p}\in S^{-2p,0}$,
(iv) $e_4(r,\theta)=\partial_\theta\left(\frac{1}{\sigma^p}\right)\in S^{-2(p+1),-2}$.

Lemma 7.2. The functions $g$, $f$, $h$, $V$ satisfy the hypotheses of Sec. 3.3.

Proof. We start with $g(r_*)=\frac{\sqrt\Delta}{r^2+a^2}\in S^{-1,-1}$ according to Example (ii) and (7.4). We put
$$c_0:=\frac{(r_+-r_-)^{\frac12\left(\frac{r_-}{r_+}+1\right)}}{r_+^2+a^2}\,e^{-\eta r_+}.$$
We have $\Delta=(r(r_*)-r_-)^{\frac{r_-}{r_+}+1}e^{2\eta(r_*-r(r_*))}$ and therefore
$$g(r_*)-c_0e^{\eta r_*}=\tilde g(r_*)e^{\eta r_*}$$
with
$$\tilde g(r_*):=\frac{(r-r_-)^{\frac12\left(\frac{r_-}{r_+}+1\right)}e^{-\eta r}}{r^2+a^2}-c_0=O(r-r_+)=O(e^{2\eta r_*})\quad\text{as } r_*\to-\infty.$$
Therefore $g(r_*)$ satisfies condition (3.8). We have
$$g(r)-\frac{1}{r_*}=\left(\frac{\sqrt\Delta}{r^2+a^2}-\frac1r\right)+\left(\frac1r-\frac{1}{r_*}\right).$$
The term
$$\frac{\sqrt\Delta}{r^2+a^2}-\frac1r$$
is $O(\langle r_*\rangle^{-2})$ while the remainder, due to the logarithmic terms in $r_*$, is $O(\langle r_*\rangle^{-2+\varepsilon})$ for any $\varepsilon>0$. The condition on the derivative of $g$ is checked in the same manner. Therefore $g(r_*)$ satisfies condition (3.10).

Let us now check the condition on $f(r_*)$. Recall that $f(r_*)=-\frac{2Mran}{(r^2+a^2)^2}$. We have
$$\hat f(r):=-\frac{2Mran}{(r^2+a^2)^2}\in\Pi^{-3}$$
and $f'\in S^{-4,-2}$ by Lemma 7.1. We put $c_1:=-(r_+^2+a^2)^{-1}$ and we obtain
$$f(r_*)-c_1=-\frac{2M(r-r_+)}{(r^2+a^2)^2}+\frac{2Mr_+(r_+^2-r^2)(r_+^2+r^2+2a^2)}{(r^2+a^2)^2(r_+^2+a^2)^2}\in O(e^{2\eta r_*}),\qquad r_*\to-\infty.$$
Clearly $f(r_*)\in O(\langle r_*\rangle^{-2})$, $r_*\to+\infty$. This proves that (3.9), (3.11) are fulfilled.

Let us now check the conditions on $h$. Recall that $h(r_*)=\sqrt{\frac{r^2+a^2}{\sigma}}$. We have
$$0\le h^2-1=\frac{h^4-1}{h^2+1}\le h^4-1=\frac{(r^2+a^2)^2-\sigma^2}{\sigma^2}=\frac{\Delta a^2\sin^2\theta}{\sigma^2}\le\frac{a^2}{r^2}\le\alpha<1,$$
where we have used that $r\ge r_+\ge\frac1\alpha M$ ($0<\alpha<1$) if $M>|a|$. We have
$$\partial_\theta h=\frac{\sqrt{r^2+a^2}\,\Delta a^2\sin\theta\cos\theta}{2\sigma^{5/2}}\in S^{-2,-2},$$
which shows that (3.15) is fulfilled. We have
$$h-1=\frac{h^2-1}{h+1}=\frac{h^4-1}{(h+1)(h^2+1)}=\frac{\Delta a^2\sin^2\theta}{\sigma^2(h+1)(h^2+1)}\in S^{-2,-2}.$$
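The chain of inequalities for $h$ can be sanity-checked numerically. The sketch below is only an illustration: it assumes the standard Boyer–Lindquist expression $\Delta=r^2-2Mr+a^2$ (not restated in this section) together with $\sigma^2=(r^2+a^2)^2-a^2\Delta\sin^2\theta$, which is the relation implied by the identity $h^4-1=\Delta a^2\sin^2\theta/\sigma^2$ above. The function names and the grid are hypothetical choices.

```python
import math


def delta(r, M, a):
    """Delta = r^2 - 2 M r + a^2 (standard Kerr form, assumed here)."""
    return r * r - 2.0 * M * r + a * a


def sigma_sq(r, theta, M, a):
    """sigma^2 = (r^2 + a^2)^2 - a^2 Delta sin^2(theta)."""
    return (r * r + a * a) ** 2 - a * a * delta(r, M, a) * math.sin(theta) ** 2


def h_sq(r, theta, M, a):
    """h^2 = (r^2 + a^2) / sigma."""
    return (r * r + a * a) / math.sqrt(sigma_sq(r, theta, M, a))


def check_h_bounds(M, a, n_r=200, n_theta=40, tol=1e-9):
    """Check 0 <= h^2 - 1 <= h^4 - 1 = Delta a^2 sin^2 / sigma^2 <= a^2 / r^2
    on a grid of points with r >= r_+ (requires M > |a|)."""
    r_plus = M + math.sqrt(M * M - a * a)
    for i in range(n_r):
        r = r_plus * (1.0 + 0.05 * i)  # march outwards from the horizon
        for k in range(n_theta + 1):
            theta = math.pi * k / n_theta
            h2 = h_sq(r, theta, M, a)
            closed = (delta(r, M, a) * a * a * math.sin(theta) ** 2
                      / sigma_sq(r, theta, M, a))
            if not (-tol <= h2 - 1.0 <= h2 * h2 - 1.0 + tol):
                return False
            if abs((h2 * h2 - 1.0) - closed) > tol:
                return False
            if closed > a * a / (r * r) + tol:
                return False
    return True
```

For instance, `check_h_bounds(1.0, 0.9)` scans radii from $r_+$ to roughly $11\,r_+$ and all polar angles; a floating-point tolerance is used because $\Delta$ vanishes at the horizon.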
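We now have to check the conditions on $(V_{ij})$. We put
$$V_1:=\frac{i\sqrt\Delta}{\sigma\sin\theta}\left(\frac{\rho^2}{\sigma}-1\right)\begin{pmatrix}0&1\\-1&0\end{pmatrix},\qquad V_2:=i\sqrt{\frac{r^2+a^2}{\sigma}}\,\gamma\left(\sqrt{\frac{r^2+a^2}{\sigma}}\right)',$$
$$V_3:=\frac{i\Delta^{3/2}a^2\sin\theta\cos\theta}{2\sigma^3}\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad V_4:=\frac{2Mran}{\sigma}\left(\frac{1}{r^2+a^2}-\frac1\sigma\right),\qquad V_5:=iB.$$
Let us first treat $V_1$. We have
$$\frac{\sqrt\Delta(\rho^2-\sigma)}{\sigma\sin\theta}=\frac{(\rho^4-\sigma^2)\sqrt\Delta}{(\rho^2+\sigma)\sigma\sin\theta}=-\frac{(a^2\sin\theta\,\rho^2+2Mra^2\sin\theta)\sqrt\Delta}{(\rho^2+\sigma)\sigma}\in S^{-2,-1}.$$
An explicit calculation gives
$$V_2=\frac i2\,\frac{(r^2+a^2)(r-M)a^2\sin^2\theta-2ra^2\Delta\sin^2\theta}{\sigma^3}\,\frac{\Delta}{r^2+a^2}\,\gamma\in S^{-3,-2}.$$
We have $V_3\in S^{-3,-3}$ and
$$V_4=-\frac{2Mrn\Delta a^3\sin^2\theta}{\sigma^2\sigma_+(r^2+a^2)}\in S^{-5,-2}.$$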
To treat $V_5$ let us first remark that $\tilde M_t=(\tilde M_{ij})$ with $\tilde M_{ij}\in S^{0,1}$ and $\tilde M_t^{-1}=(\tilde M^-_{ij})$ with $\tilde M^-_{ij}\in S^{0,-1}$. We have $V_5=\tilde M_t^{-1}\tilde P$ and it is therefore sufficient to show that $\tilde P\in S^{-2,0}$. We have
$$\tilde P=UPU^{-1}+UM_\theta[\partial_\theta,U^{-1}]+UM_r[\partial_r,U^{-1}].$$
We first claim that the components of $UM_\theta[\partial_\theta,U^{-1}]$ are in $S^{-2,0}$. Indeed we have $U^{-1}=:(U^-_{ij})$ with $U^-_{12},U^-_{21}\in S^{-1,-1}$, $\partial_\theta U^-_{ii}\in S^{-1,0}$, $M_\theta=(M_\theta^{ij})$ with $M_\theta^{ij}\in S^{-1,0}$ and $U=(U_{ij})$ with $U_{ij}\in S^{0,0}$. Using (7.4) we get the desired fall-off. We now claim that the components of $UM_r[\partial_r,U^{-1}]$ are in $S^{-2,0}$. Note that the derivative is with respect to $r$ but the symbol class is defined in reference to $r_*$. It is sufficient to show that $\partial_rU^-_{ij}\in S^{-2,0}$ for all $i$, $j$. This is obvious for $\partial_rU^-_{12}$ and $\partial_rU^-_{21}$. Let us consider $\partial_rU^-_{11}$. We have
$$X:=\frac{\rho\sigma_+}{\sigma p}=1-\frac{a^2\cos^2\theta}{\rho(r+\rho)}-\frac{ia\cos\theta}{\rho}+\frac{r^2+a^2}{\sigma p}=1+O(r^{-1}).$$
Consequently,
$$\partial_rU^-_{11}=\frac{1}{2\sqrt2}X^{-1/2}\partial_rX\in S^{-2,0}.$$
The proof is similar for $\partial_rU^-_{22}$ using in addition that $\rho/\bar p=1+O(r^{-1})$. We now have to show that $UPU^{-1}$ belongs to $S^{-2,0}$. This is equivalent to $P\in S^{-2,0}$, which follows from the explicit form of $P$.

Remark 7.2. We can choose the arbitrary constant $R_0$ [see (2.39)] so that $e^{-r_*}\sqrt\Delta/\sigma\to1$ as $r_*\to-\infty$, i.e. the constant $c_0$ then becomes 1.

7.3. Proof of Theorems 2.1–2.3

We start with a proof of the scattering theories of Theorems 2.2 and 2.3 using cut-off functions. Then we use these results to construct the asymptotic velocity. This in turn gives us the more elegant definitions of the wave operators given in the theorems.

7.3.1. Scattering theory in terms of cut-off functions

We first note that, by Lemma 3.5, $\slashed{D}_H$ and $\slashed{D}_\infty$ are self-adjoint on $\mathcal{H}$ and, by Lemma 4.8, their point spectra are empty. Moreover $\slashed{D}_K$ is self-adjoint on $\mathcal{H}$ by Lemma 4.1 (the conservation of the total charge guarantees that $\slashed{D}_K$ is symmetric). The absence of point spectrum for $\slashed{D}_K$ is shown in Sec. 7.1.

Proof of the scattering theory of Theorem 2.3. We consider cut-off functions $j_\pm\in C_b^\infty(\mathbb{R})$ satisfying: there exists $R>0$ such that
$$j_+\equiv0\ \text{on }]-\infty,-R],\qquad j_+\equiv1\ \text{on }[R,+\infty[, \tag{7.7}$$
$$j_-\equiv1\ \text{on }]-\infty,-R],\qquad j_-\equiv0\ \text{on }[R,+\infty[, \tag{7.8}$$
$$j_+^2+j_-^2\equiv1\ \text{on }\mathbb{R}. \tag{7.9}$$
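We prove the existence of the following direct and inverse wave operators, defined by the strong limits:
$$\Omega_H^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_K}j_-e^{it\slashed{D}_H},\qquad\Omega_\infty^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_K}j_+e^{it\slashed{D}_\infty},$$
$$\tilde\Omega_H^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_H}j_-e^{it\slashed{D}_K},\qquad\tilde\Omega_\infty^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_\infty}j_+e^{it\slashed{D}_K}.$$
They are denoted by the same notations as in Theorem 2.3. It will in fact become apparent at the end of the proof that they are the same operators. Let us decompose $\mathcal{H}$ into the direct sum of the spaces
$$\mathcal{H}^n:=\{u=e^{in\varphi}v,\ v\in L^2(\mathbb{R}\times[0,\pi],dr\,\sin\theta\,d\theta)\}.$$
The dynamics $e^{it\slashed{D}_K}$, $e^{it\slashed{D}_H}$, $e^{it\slashed{D}_\infty}$ as well as the cut-off functions $j_\pm$ preserve the spaces $\mathcal{H}^n$. We have furthermore
$$e^{it\slashed{D}_K}|_{\mathcal{H}^n}=e^{it\slashed{D}_K^n}|_{\mathcal{H}^n},\qquad e^{it\slashed{D}_H}|_{\mathcal{H}^n}=e^{it\slashed{D}_H^n}|_{\mathcal{H}^n},\qquad e^{it\slashed{D}_\infty}|_{\mathcal{H}^n}=e^{it\slashed{D}_\infty^n}|_{\mathcal{H}^n},$$
where the operator with index $n$ is obtained from the operator without index by replacing $D_\varphi$ by $n$ without changing $\slashed{D}_{S^2}$, e.g.
$$\slashed{D}_K^n=\frac{r^2+a^2}{\sigma}\gamma D_{r_*}+\frac{\sqrt\Delta}{\sigma}\slashed{D}_{S^2}-A_\varphi n+iB.$$
Using the absence of pure point spectrum for $\slashed{D}_K$, it is thus sufficient to show the existence of the limits (on $\mathcal{H}^n$):
$$\Omega_H^{\pm,n}:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_K^n}j_-e^{it\slashed{D}_H^n},\quad\text{etc.}$$
These limits exist on $\mathcal{H}\supset\mathcal{H}^n$ by Theorem 6.3 and Lemma 7.2 and have the required properties. They are moreover independent of the choice of the cut-off function.

Proof of the scattering theory of Theorem 2.2. Theorem 6.3 also gives the existence of limits
$$\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_K}j_-e^{itD_H},\qquad\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-itD_H}j_-e^{it\slashed{D}_K},$$
with the correct properties. From [44, Lemmas 6.1 and 6.2], we also infer the existence of the limits
$$\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_\infty}e^{itD_\infty},\qquad\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-itD_\infty}e^{it\slashed{D}_\infty}.$$
The existence of the direct and inverse wave operators
$$W_H^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_K}j_-e^{itD_H},\qquad W_\infty^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-it\slashed{D}_K}j_+e^{itD_\infty},$$
$$\tilde W_H^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-itD_H}j_-e^{it\slashed{D}_K},\qquad\tilde W_\infty^\pm:=\mathrm{s}\text{-}\lim_{t\to\pm\infty}e^{-itD_\infty}j_+e^{it\slashed{D}_K},$$
then follows from the chain rule. These operators are independent of the choice of the cut-off functions.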
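The chain rule invoked here rests on the elementary fact that strong limits of uniformly bounded operator families multiply. A minimal sketch, with generic families $A_t$, $B_t$ (notation introduced only for this illustration):

```latex
\begin{aligned}
&A_t \to A,\quad B_t \to B \ \text{strongly},\qquad \sup_t\|A_t\| < \infty,\\
&\|A_tB_t\psi - AB\psi\| \le \sup_t\|A_t\|\;\|(B_t-B)\psi\| + \|(A_t-A)B\psi\| \longrightarrow 0,\\[4pt]
&e^{-it\slashed{D}_K}\,j_+\,e^{itD_\infty}
  = \big(e^{-it\slashed{D}_K}\,j_+\,e^{it\slashed{D}_\infty}\big)\big(e^{-it\slashed{D}_\infty}e^{itD_\infty}\big).
\end{aligned}
```

In the last line both factors have norm at most 1 (since $j_+^2+j_-^2\equiv1$ forces $|j_+|\le1$), so the product converges strongly as soon as each factor does.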
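7.3.2. Asymptotic velocity (proof of Theorem 2.1)

We first establish the results for the asymptotic profiles.

Lemma 7.3. For each $J\in C_\infty(\mathbb{R})$, we have
$$\exists\ \mathrm{s}\text{-}\lim_{t\to+\infty}e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}=J(-\gamma), \tag{7.10}$$
$$\exists\ \mathrm{s}\text{-}\lim_{t\to+\infty}e^{-itD_H}J\left(\frac{r_*}{t}\right)e^{itD_H}=J(-\gamma). \tag{7.11}$$

Proof. We only prove (7.10); the proof of (7.11) is analogous. Recall that for $\Psi={}^t(\psi_0,\psi_1)$ in $(L^2(\mathbb{R}\times S^2;dr_*\,d\omega))^2$, the action of $e^{itD_\infty}$ on $\Psi$ is given by
$$(e^{itD_\infty}\Psi)(r_*,\omega)=\begin{pmatrix}\psi_0(r_*+t,\omega)\\\psi_1(r_*-t,\omega)\end{pmatrix}.$$
We establish the following properties, fundamental for the proof. Consider $\Psi\in(C_0^\infty(\mathbb{R}\times S^2))^2$ and $J\in C_\infty(\mathbb{R})$. We have
(i) if $J\equiv0$ in a neighborhood of 1, then
$$\lim_{t\to+\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\begin{pmatrix}0\\\psi_1\end{pmatrix}=0;$$
(ii) if $J\equiv0$ in a neighborhood of $-1$, then
$$\lim_{t\to+\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\begin{pmatrix}\psi_0\\0\end{pmatrix}=0.$$
We establish the first limit. Let $\varepsilon>0$ such that $J\equiv0$ in $[1-\varepsilon,1+\varepsilon]$,
$$\left\|J\left(\frac{r_*}{t}\right)e^{itD_\infty}\begin{pmatrix}0\\\psi_1\end{pmatrix}\right\|^2\le C\left\{\int_{S^2}\int_{-\infty}^{(1-\varepsilon)t}|\psi_1(r_*-t,\omega)|^2\,dr_*\,d\omega+\int_{S^2}\int_{(1+\varepsilon)t}^{+\infty}|\psi_1(r_*-t,\omega)|^2\,dr_*\,d\omega\right\}$$
which is zero for $t$ large enough. The proof of the second limit is similar. It follows that:
(i) if $J\equiv0$ in a neighborhood of $\{-1,1\}$, then
$$\mathrm{s}\text{-}\lim_{t\to+\infty}e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}=0,$$
(ii) if $J\equiv1$ in a neighborhood of 1 and $J\equiv0$ in a neighborhood of $-1$, then
$$\lim_{t\to+\infty}e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\Psi=\lim_{t\to+\infty}e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\begin{pmatrix}0\\\psi_1\end{pmatrix}=\begin{pmatrix}0\\\psi_1\end{pmatrix}$$
since $J-1\equiv0$ in a neighborhood of 1,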
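(iii) if $J\equiv1$ in a neighborhood of $-1$ and $J\equiv0$ in a neighborhood of 1, then
$$\lim_{t\to+\infty}e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\Psi=\lim_{t\to+\infty}e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\begin{pmatrix}\psi_0\\0\end{pmatrix}=\begin{pmatrix}\psi_0\\0\end{pmatrix}$$
since $J-1\equiv0$ in a neighborhood of $-1$. Hence the result.

We now introduce the following wave operators, whose existence is a trivial consequence of our previous results:
$$V_\infty^+:=\mathrm{s}\text{-}\lim_{t\to+\infty}e^{-it\slashed{D}_\infty}e^{itD_\infty},\qquad\tilde V_\infty^+:=\mathrm{s}\text{-}\lim_{t\to+\infty}e^{-itD_\infty}e^{it\slashed{D}_\infty}=(V_\infty^+)^*,$$
$$V_H^+:=\mathrm{s}\text{-}\lim_{t\to+\infty}e^{-it\slashed{D}_H}e^{itD_H},\qquad\tilde V_H^+:=\mathrm{s}\text{-}\lim_{t\to+\infty}e^{-itD_H}e^{it\slashed{D}_H}=(V_H^+)^*.$$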
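The support argument in the proof of Lemma 7.3 can be illustrated with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: $\psi_1$ is taken to have radial support in an interval $[a,b]$, so that $r_*\mapsto\psi_1(r_*-t)$ is supported in $[a+t,b+t]$; the function names are hypothetical.

```python
def velocity_window(a, b, t):
    """Range of r/t over the support [a + t, b + t] of r -> psi_1(r - t), for t > 0."""
    return ((a + t) / t, (b + t) / t)


def is_killed(a, b, eps, t):
    """True if r/t stays in [1 - eps, 1 + eps] on the whole transported support,
    so that J(r/t) psi_1(r - t) = 0 whenever J vanishes on that interval."""
    lo, hi = velocity_window(a, b, t)
    return 1.0 - eps <= lo and hi <= 1.0 + eps


def threshold_time(a, b, eps):
    """A time beyond which is_killed holds: t >= max(|a|, |b|) / eps."""
    return max(abs(a), abs(b)) / eps
```

With support $[-3,5]$ and $\varepsilon=0.1$, the transported profile still overlaps the region where $J$ may be non-zero at $t=10$, but not at twice the threshold time; this is the statement that the two integrals in the proof vanish for $t$ large enough.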
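The asymptotic velocity constructed for $D_\infty$ and $D_H$ in Lemma 7.3, together with these operators, gives us asymptotic velocities for $\slashed{D}_\infty$ and $\slashed{D}_H$.

Corollary 7.1. For all $J\in C_\infty(\mathbb{R})$,
$$\exists\ \mathrm{s}\text{-}\lim_{t\to+\infty}e^{-it\slashed{D}_H}J\left(\frac{r_*}{t}\right)e^{it\slashed{D}_H}=V_H^+J(-\gamma)\tilde V_H^+, \tag{7.12}$$
$$\exists\ \mathrm{s}\text{-}\lim_{t\to+\infty}e^{-it\slashed{D}_\infty}J\left(\frac{r_*}{t}\right)e^{it\slashed{D}_\infty}=V_\infty^+J(-\gamma)\tilde V_\infty^+, \tag{7.13}$$
$$\exists\ \mathrm{s}\text{-}\lim_{t\to+\infty}e^{-it\slashed{D}_K}J\left(\frac{r_*}{t}\right)e^{it\slashed{D}_K}=W_H^+J(-\gamma)\tilde W_H^++W_\infty^+J(-\gamma)\tilde W_\infty^+. \tag{7.14}$$

Proof. We only establish (7.14); the proof of (7.12) and (7.13) is trivial. We consider $j_\pm,j_0\in C_b^\infty(\mathbb{R})$ such that
$$\operatorname{supp}j_+\subset\,]0,+\infty[,\qquad\operatorname{supp}j_-\subset\,]-\infty,0[,\qquad\operatorname{supp}j_0\subset\left]-\tfrac12,\tfrac12\right[,\qquad j_-^2+j_0^2+j_+^2\equiv1.$$
We have clearly
$$\mathrm{s}\text{-}\lim_{t\to+\infty}e^{-it\slashed{D}_K}j_0(r_*)J\left(\frac{r_*}{t}\right)j_0(r_*)e^{it\slashed{D}_K}=0.$$
Now we have the following strong convergence as $t\to+\infty$:
$$\begin{aligned}e^{-it\slashed{D}_K}j_+(r_*)J\left(\frac{r_*}{t}\right)j_+(r_*)e^{it\slashed{D}_K}&=e^{-it\slashed{D}_K}j_+(r_*)e^{itD_\infty}\,e^{-itD_\infty}J\left(\frac{r_*}{t}\right)e^{itD_\infty}\,e^{-itD_\infty}j_+(r_*)e^{it\slashed{D}_K}\\&\to W_\infty^+J(-\gamma)\tilde W_\infty^+\end{aligned}$$
and we have an analogous result for $j_-$.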
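Using [15, Proposition B.2.1], we obtain the existence of self-adjoint operators $P^+$, $P_H^+$ and $P_\infty^+$ such that, for all $J \in C_\infty(\mathbb{R})$,
$$J(P^+) = \operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_K} J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_K},$$
$$J(P_H^+) = \operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_H} J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_H},$$
$$J(P_\infty^+) = \operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_\infty} J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_\infty}.$$
They are referred to as the asymptotic velocities associated with $\slashed{D}_K$, $\slashed{D}_H$ and $\slashed{D}_\infty$. By Lemma 7.3, $-\gamma$ is the asymptotic velocity associated with $D_H$ and $D_\infty$. We next calculate the spectra of $P^+$, $P_H^+$ and $P_\infty^+$, describing for each Hamiltonian the allowed radial propagation speeds.

Lemma 7.4. $\sigma(P^+) = \sigma(P_H^+) = \sigma(P_\infty^+) = \{-1, 1\}$.

Proof. The result for $P_H^+$ and $P_\infty^+$ is clear. Using
$$\operatorname{ran}\tilde W_H^+ = \mathcal{H}^- \quad\text{and}\quad \operatorname{ran}\tilde W_\infty^+ = \mathcal{H}^+,$$
we find
$$J(P^+) = J(1)\,W_\infty^+\tilde W_\infty^+ + J(-1)\,W_H^+\tilde W_H^+.$$
Clearly, if $J(1) = J(-1) = 0$, we have $J(P^+) = 0$. In the case where $J(-1) \neq 0$, we choose $\Psi \in \mathcal{H}^-$, $\Psi \neq 0$ and we put $\Phi := W_H^+\Psi$. We have $\Phi \neq 0$ since $\|W_H^+\Psi\| = \|\Psi\|$. Applying $J(P^+)$ to $\Phi$, we find
$$J(P^+)\Phi = J(-1)\,W_H^+\tilde W_H^+\Phi = J(-1)\Phi \neq 0,$$
since $\operatorname{ran} W_H^+ \subset \ker\tilde W_\infty^+$. A similar construction can be done for $J(1) \neq 0$.

Remark 7.3. As a trivial consequence of the previous lemma, we have
$$P^+ = W_H^+(-\gamma)\tilde W_H^+ + W_\infty^+(-\gamma)\tilde W_\infty^+, \qquad P_H^+ = V_H^+(-\gamma)\tilde V_H^+, \qquad P_\infty^+ = V_\infty^+(-\gamma)\tilde V_\infty^+.$$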
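The role of $-\gamma$ as an asymptotic velocity can be illustrated on a one-dimensional toy model. The sketch below is only an illustration under stated assumptions: it replaces the comparison dynamics by $D = \gamma D_x$ on the real line with $\gamma = \operatorname{diag}(1,-1)$ and $D_x = -i\,d/dx$, so that $e^{itD}$ merely translates the two components in opposite directions. The function names, the cut-off `J` and the sample data are all hypothetical choices made for the example.

```python
import math

def bump(u):
    """exp(-1/u) for u > 0, else 0: the usual building block of smooth cut-offs."""
    return math.exp(-1.0 / u) if u > 0 else 0.0

def J(s):
    """Smooth cut-off with J = 1 for s <= -0.5 and J = 0 for s >= 0.5."""
    u = s + 0.5
    return 1.0 - bump(u) / (bump(u) + bump(1.0 - u))

def evolve(psi, t):
    """Toy e^{itD} for D = gamma * D_x (assumption): opposite translations."""
    psi0, psi1 = psi
    return (lambda x: psi0(x + t), lambda x: psi1(x - t))

def multiply(f, psi):
    """Multiplication of both components by the scalar function f."""
    psi0, psi1 = psi
    return (lambda x: f(x) * psi0(x), lambda x: f(x) * psi1(x))

def conjugated_cutoff(psi, t):
    """e^{-itD} J(x/t) e^{itD} applied to psi."""
    return evolve(multiply(lambda x: J(x / t), evolve(psi, t)), -t)

def distance_to_limit(psi, t, grid):
    """Discrete L^2 distance between e^{-itD} J(x/t) e^{itD} psi and (psi_0, 0)."""
    out0, out1 = conjugated_cutoff(psi, t)
    dx = grid[1] - grid[0]
    total = sum((out0(x) - psi[0](x)) ** 2 + out1(x) ** 2 for x in grid)
    return math.sqrt(total * dx)

grid = [-20.0 + 0.05 * k for k in range(801)]
psi = (lambda x: math.exp(-x * x), lambda x: math.exp(-(x - 1.0) ** 2))
for t in (1.0, 10.0, 1000.0):
    print(t, distance_to_limit(psi, t, grid))
```

Since this `J` equals 1 near $-1$ and 0 near $1$, the printed distances should shrink as `t` grows, in line with $J(-\gamma) = \operatorname{diag}(J(-1), J(1))$ in the toy model.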
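7.3.3. Proof of Theorems 2.2 and 2.3

Let $0 < \varepsilon < 1$, $J \in C_b^\infty(\mathbb{R})$, $\operatorname{supp} J \subset\ ]-\infty, 0[$, such that $J \equiv 1$ on $]-\infty, -\varepsilon[$. Let $j_\pm, j_0$ be a partition of unity as in the proof of Corollary 7.1. We show that
$$\operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_H} J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_K} = \operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_H} j_-^2(r_*)\, e^{it\slashed{D}_K}.$$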
March 10, 2004 14:17 WSPC/148-RMP
00191
Scattering of Massless Dirac Fields by a Kerr Black Hole
101
We clearly have
$$\operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_H} j_0^2(r_*) J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_K} = \operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_H} j_+^2(r_*) J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_K} = 0.$$
Now, for $\Psi \in \mathcal{H}$,
$$e^{-it\slashed{D}_H} j_-^2(r_*) J\!\left(\frac{r_*}{t}\right) e^{it\slashed{D}_K}\Psi = e^{-it\slashed{D}_H} e^{itD_H}\, e^{-itD_H} J\!\left(\frac{r_*}{t}\right) e^{itD_H}\, e^{-itD_H} j_-^2(r_*) e^{it\slashed{D}_K}\Psi$$
$$\to V_H^+ J(-\gamma)\tilde W_H^+\Psi = V_H^+\tilde W_H^+\Psi = \tilde\Omega_H^+\Psi \quad\text{as } t\to+\infty,$$
since $J(-\gamma) = P_{\mathcal{H}^-}$ and $\mathcal{H}^- = \operatorname{ran}\tilde W_H^+$. The proof is similar for the other limits.

8. Geometric Interpretation

All the constructions of this section are based on the skeleton of the conformal geometry of block I: the two congruences of principal null geodesics. We start by giving a new version of Theorem 2.2, using the flow of principal null geodesics as comparison dynamics. The new wave operators are denoted as in Theorem 2.2 but with an additional index pn for "principal null". Then, we interpret geometrically this scattering theory as providing the solution to a Goursat problem on a singular null hypersurface on the Penrose compactification of block I.

• The first step is to interpret the inverse wave operators $\tilde W^\pm_{H,pn}$ as representations of trace operators on the future (respectively past) horizon. We describe the standard choices of coordinates used to understand the horizon as the union of two smooth null hypersurfaces at the boundary of block I. This provides an explicit diffeomorphism between the $\{t = 0\}$ hypersurface and the future (respectively past) horizon. Next, we construct a spin-frame that is regular in the neighborhood of the horizon, and we describe its relation with $(o^A, \iota^A)$ defined in Sec. 2. This, together with standard regularity results for hyperbolic equations (essentially Leray's theorem), enables us to define traces on the future and past horizons for the vector $\Psi$. We then understand the operators $\tilde W^\pm_{H,pn}$ as the pull-back of the trace operators by the explicit diffeomorphisms.

• The next step is a similar interpretation of the wave operators $\tilde W^\pm_{\infty,pn}$ as trace operators on future and past null infinities; this is based on the Penrose compactification of the exterior of the black hole.

Each of the global inverse wave operators $\tilde W^\pm_{pn}$ is then understood as a trace operator, on the union of two of the previous null hypersurfaces: the future (respectively past) horizon and future (respectively past) null infinity. This larger
null hypersurface is singular at the junction of the two smooth surfaces (more precisely, the conformal metric is singular there). The direct operators $W^\pm_{pn}$ therefore solve the corresponding Goursat problems on these singular null hypersurfaces.

8.1. Theorem 2.2 in terms of principal null geodesics

We introduce the vector fields $v^\pm$, generating the outgoing and incoming principal null geodesics, normalized so that their flows preserve the foliation $\{\Sigma_t\}_t$:
$$v^\pm := \frac{\Delta}{r^2+a^2}\,V^\pm = \frac{\partial}{\partial t} \pm \frac{\Delta}{r^2+a^2}\frac{\partial}{\partial r} + \frac{a}{r^2+a^2}\frac{\partial}{\partial\varphi} = \frac{\partial}{\partial t} \pm \frac{\partial}{\partial r_*} + \frac{a}{r^2+a^2}\frac{\partial}{\partial\varphi}.$$
We introduce the spatial part $w^\pm$ of $v^\pm$:
$$w^\pm := \pm\frac{\partial}{\partial r_*} + \frac{a}{r^2+a^2}\frac{\partial}{\partial\varphi}.$$
The flow of the vector field $w^\pm$, acting on $\Sigma$, is the dynamics generated by the Hamiltonian $P_N^\pm$ (PN for principal null), defined by
$$P_N^\pm = \mp D_{r_*} - \frac{a}{r^2+a^2}\,D_\varphi.$$
We denote by $F_{w^\pm}(t)$ the flow of $w^\pm$ on $\Sigma$ and $F_{v^\pm}(t)$ the flow of $v^\pm$ on $\mathbb{R}\times\Sigma$. They are related as follows:
$$F_{v^\pm}(t)(t_0, r_0, \theta_0, \varphi_0) = (t + t_0,\ F_{w^\pm}(t)(r_0, \theta_0, \varphi_0))$$
and the group associated with $P_N^\pm$ is expressed in terms of $F_{w^\pm}$ as
$$(e^{itP_N^\pm} g)(r_0, \theta_0, \varphi_0) = g(F_{w^\pm}(-t)(r_0, \theta_0, \varphi_0)).$$
We introduce the comparison operator
$$P_N = \begin{pmatrix} P_N^- & 0\\ 0 & P_N^+\end{pmatrix} = \gamma D_{r_*} - \frac{a}{r^2+a^2}\,D_\varphi.$$
Its action on $\mathcal{H}$ is described in terms of the flows of $w^\pm$:
$$(e^{itP_N} G)(r_0, \theta_0, \varphi_0) = \begin{pmatrix} g_0(F_{w^-}(-t)(r_0, \theta_0, \varphi_0))\\ g_1(F_{w^+}(-t)(r_0, \theta_0, \varphi_0))\end{pmatrix}, \qquad G = \begin{pmatrix} g_0\\ g_1\end{pmatrix} \in \mathcal{H}.$$
The operator PN is self-adjoint on H and its spaces of incoming and outgoing data are H− and H+ . Moreover, the results of Theorem 2.2 are still valid if, instead of DH and D∞ , we use PN as comparison dynamics in the neighborhoods of both the horizon and infinity. This can also be expressed as follows:
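Theorem 8.1. The following strong limits
$$W^\pm_{H,pn} := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K} e^{itP_N} P_{\mathcal{H}^\mp}, \tag{8.1}$$
$$W^\pm_{\infty,pn} := \operatorname*{s-lim}_{t\to\pm\infty} e^{-it\slashed{D}_K} e^{itP_N} P_{\mathcal{H}^\pm}, \tag{8.2}$$
$$\tilde W^\pm_{H,pn} := \operatorname*{s-lim}_{t\to\pm\infty} e^{-itP_N} e^{it\slashed{D}_K}\, 1_{\mathbb{R}^-}(P^\pm), \tag{8.3}$$
$$\tilde W^\pm_{\infty,pn} := \operatorname*{s-lim}_{t\to\pm\infty} e^{-itP_N} e^{it\slashed{D}_K}\, 1_{\mathbb{R}^+}(P^\pm) \tag{8.4}$$
exist and satisfy the same properties as the wave operators of Theorem 2.2. The corresponding global wave operators are
$$W^+_{pn} : \mathcal{H}^-\oplus\mathcal{H}^+ \to \mathcal{H}, \qquad ((\psi_0, 0), (0, \psi_1)) \mapsto W^+_{H,pn}(\psi_0, 0) + W^+_{\infty,pn}(0, \psi_1), \tag{8.5}$$
$$W^-_{pn} : \mathcal{H}^+\oplus\mathcal{H}^- \to \mathcal{H}, \qquad ((0, \psi_1), (\psi_0, 0)) \mapsto W^-_{H,pn}(0, \psi_1) + W^-_{\infty,pn}(\psi_0, 0), \tag{8.6}$$
$$\tilde W^+_{pn} : \mathcal{H} \to \mathcal{H}^-\oplus\mathcal{H}^+, \qquad \tilde W^+_{pn}\Psi = (\tilde W^+_{H,pn}\Psi,\ \tilde W^+_{\infty,pn}\Psi), \tag{8.7}$$
$$\tilde W^-_{pn} : \mathcal{H} \to \mathcal{H}^+\oplus\mathcal{H}^-, \qquad \tilde W^-_{pn}\Psi = (\tilde W^-_{H,pn}\Psi,\ \tilde W^-_{\infty,pn}\Psi). \tag{8.8}$$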
Proof. It is a straightforward consequence of Theorem 2.2 and of the fact that $P_N$ is a short-range perturbation of $D_H$ as $r_* \to -\infty$, of $D_\infty$ as $r_* \to +\infty$ and that all three Hamiltonians $P_N$, $D_H$ and $D_\infty$ satisfy Huygens's principle. This, by Cook's method and the chain rule, allows us to prove existence of direct and inverse wave operators.

Remark 8.1. The wave operators in the above theorem can also be defined in terms of cut-off functions $j_\pm$ as in the proof of Theorem 2.2; the two definitions are equivalent. We have for example
$$W^+_{H,pn} := \operatorname*{s-lim}_{t\to+\infty} e^{-it\slashed{D}_K}\, j_-\, e^{itP_N}.$$
8.2. Inverse wave operators at the horizon as trace operators

8.2.1. Kerr-star and star-Kerr coordinates

A full account of these two coordinate systems can be found in [48]. The Kerr-star coordinate system $(t^*, r, \theta, \varphi^*)$ is based on incoming principal null geodesics, parametrized as the integral lines of the vector $V^-$ [defined in (2.3)]. The new coordinates $t^*$ and $\varphi^*$ are of the form
$$t^* = t + T(r), \qquad \varphi^* = \varphi + \Lambda(r),$$
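where the functions $T$ and $\Lambda$ are required to satisfy
$$\frac{dT}{dr} = \frac{r^2+a^2}{\Delta}, \qquad \frac{d\Lambda}{dr} = \frac{a}{\Delta}. \tag{8.9}$$
The incoming principal null geodesics now appear as the $r$ coordinate curves parametrized by $s = -r$:
$$\dot r = -1, \qquad \dot\theta = 0, \qquad \dot t^* = \dot t + \frac{dT}{dr}\,\dot r = 0, \qquad \dot\varphi^* = \dot\varphi + \frac{d\Lambda}{dr}\,\dot r = 0.$$
Remark 8.2. In other words, in Kerr-star coordinates, the vector $V^-$ takes the form
$$V^- = -\frac{\partial}{\partial r}.$$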
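It will be useful in what follows^d to express the Boyer–Lindquist coordinate vector fields in terms of the Kerr-star coordinate vector fields. In order to avoid confusion, we denote by $(\partial/\partial r)_{BL}$ and $(\partial/\partial r)_{K*}$ the $r$ coordinate vector fields in Boyer–Lindquist and Kerr-star coordinates respectively; we do the same for the $\theta$ vector fields. We have
$$\frac{\partial}{\partial t} = \frac{\partial}{\partial t^*}, \tag{8.10}$$
$$\left(\frac{\partial}{\partial\theta}\right)_{BL} = \left(\frac{\partial}{\partial\theta}\right)_{K*}, \tag{8.11}$$
$$\frac{\partial}{\partial\varphi} = \frac{\partial}{\partial\varphi^*}, \tag{8.12}$$
$$\left(\frac{\partial}{\partial r}\right)_{BL} = \frac{r^2+a^2}{\Delta}\frac{\partial}{\partial t^*} + \left(\frac{\partial}{\partial r}\right)_{K*} + \frac{a}{\Delta}\frac{\partial}{\partial\varphi^*}. \tag{8.13}$$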
Kerr-star coordinates are defined globally on block I.^e The Kerr metric in Kerr-star coordinates takes the form
$$g = g_{tt}\,dt^{*2} + 2g_{t\varphi}\,dt^*d\varphi^* + g_{\varphi\varphi}\,d\varphi^{*2} + g_{\theta\theta}\,d\theta^2 - 2\,dt^*dr + 2a\sin^2\theta\,d\varphi^*dr, \tag{8.14}$$
where $g_{tt}$, $2g_{t\varphi}$, $g_{\theta\theta}$ and $g_{\varphi\varphi}$ are the coefficients of $dt^2$, $dt\,d\varphi$, $d\theta^2$ and $d\varphi^2$ in the expression (2.1) of $g$ in Boyer–Lindquist coordinates:
$$g_{tt} = 1 - \frac{2Mr}{\rho^2}, \qquad g_{t\varphi} = \frac{2aMr\sin^2\theta}{\rho^2}, \qquad g_{\theta\theta} = -\rho^2, \qquad g_{\varphi\varphi} = -\frac{\sigma^2}{\rho^2}\sin^2\theta.$$

^d When studying the behavior of the Newman–Penrose tetrad $l^a, n^a, m^a, \bar m^a$ at null infinity.
^e With the exception of the axis ($\theta = 0$ and $\theta = \pi$); this coordinate singularity, similar to that of spherical coordinates on $\mathbb{R}^3$, can be dealt with simply (see [48, Lemma 2.2.2]); we shall systematically ignore it.
The expression (8.14) shows that $g$ can be extended smoothly across the horizon $\{r = r_+\}$. Besides, it does not degenerate there since its determinant is given by
$$\det(g) = -\rho^4\sin^2\theta$$
and does not vanish for $r = r_+$. Thus, we can add the horizon to block I as a smooth boundary. It is important at this point to understand the nature of the boundary we have just glued to block I. The hypersurface
$$\mathfrak{h}^+ := \mathbb{R}_{t^*}\times\{r = r_+\}\times S^2_{\theta,\varphi^*}$$
is reached along incoming null geodesics; it is the horizon that is reached as $t \to +\infty$ by light rays or material bodies falling into the black hole and not the horizon seen as $\mathbb{R}_t\times\{r_+\}_r\times S^2_{\theta,\varphi}$ in Boyer–Lindquist coordinates. We refer to it as the future horizon. It is a smooth hypersurface in the space-time $(\mathcal{B}_I\cup\mathfrak{h}^+, g)$. We can easily show that it is a null hypersurface. The metric induced by $g$ on hypersurfaces of constant $r$,
$$g_r = g_{tt}\,dt^{*2} + 2g_{t\varphi}\,dt^*d\varphi^* + g_{\varphi\varphi}\,d\varphi^{*2} - \rho^2 d\theta^2,$$
has determinant
$$\det(g_r) = -\rho^2\left(g_{tt}g_{\varphi\varphi} - (g_{t\varphi})^2\right) = \rho^2\Delta\sin^2\theta$$
and thus degenerates for $\Delta = 0$, i.e. at $\mathfrak{h}^+$. Since $g$ does not degenerate, it follows that one of the generators of $\mathfrak{h}^+$ is null (i.e. $\mathfrak{h}^+$ is a null hypersurface).

Star-Kerr coordinates $({}^*t, r, \theta, {}^*\varphi)$ are constructed using the outgoing principal null geodesics parametrized as the integral lines of $V^+$. We have
$${}^*t = t - T(r), \qquad {}^*\varphi = \varphi - \Lambda(r),$$
with the same functions $T$ and $\Lambda$ as for Kerr-star coordinates. Consequently the outgoing principal null geodesics appear as the $r$ coordinate curves parametrized by $r$.

Remark 8.3. It is equivalent to say that in star-Kerr coordinates, we have
$$V^+ = \frac{\partial}{\partial r}.$$
As we did for Kerr-star coordinates, we express the Boyer–Lindquist coordinate vector fields in terms of the star-Kerr coordinate vector fields:
$$\frac{\partial}{\partial t} = \frac{\partial}{\partial\,{}^*t}, \tag{8.15}$$
$$\left(\frac{\partial}{\partial\theta}\right)_{BL} = \left(\frac{\partial}{\partial\theta}\right)_{*K}, \tag{8.16}$$
$$\frac{\partial}{\partial\varphi} = \frac{\partial}{\partial\,{}^*\varphi}, \tag{8.17}$$
$$\left(\frac{\partial}{\partial r}\right)_{BL} = -\frac{r^2+a^2}{\Delta}\frac{\partial}{\partial\,{}^*t} + \left(\frac{\partial}{\partial r}\right)_{*K} - \frac{a}{\Delta}\frac{\partial}{\partial\,{}^*\varphi}. \tag{8.18}$$
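The two determinants quoted for the Kerr-star form (8.14) of the metric, $\det(g) = -\rho^4\sin^2\theta$ and $\det(g_r) = \rho^2\Delta\sin^2\theta$, lend themselves to a quick numerical sanity check. The sketch below is not part of the original argument. It assumes the standard Kerr functions $\rho^2 = r^2 + a^2\cos^2\theta$, $\Delta = r^2 - 2Mr + a^2$ and $\sigma^2 = (r^2+a^2)^2 - a^2\Delta\sin^2\theta$, which belong to (2.1) and are not restated in this section; all function names and sample values are hypothetical.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def kerr_functions(M, a, r, theta):
    """sin^2(theta), rho^2, Delta, sigma^2 (standard Kerr expressions, assumed)."""
    s2 = math.sin(theta) ** 2
    rho2 = r * r + a * a * math.cos(theta) ** 2
    delta = r * r - 2.0 * M * r + a * a
    sigma2 = (r * r + a * a) ** 2 - a * a * delta * s2
    return s2, rho2, delta, sigma2

def kerr_star_metric(M, a, r, theta):
    """Matrix of (8.14) in the coordinate order (t*, r, theta, phi*)."""
    s2, rho2, delta, sigma2 = kerr_functions(M, a, r, theta)
    g_tt = 1.0 - 2.0 * M * r / rho2
    g_tphi = 2.0 * a * M * r * s2 / rho2
    g_phiphi = -sigma2 * s2 / rho2
    return [
        [g_tt, -1.0, 0.0, g_tphi],
        [-1.0, 0.0, 0.0, a * s2],
        [0.0, 0.0, -rho2, 0.0],
        [g_tphi, a * s2, 0.0, g_phiphi],
    ]

def induced_metric(M, a, r, theta):
    """Metric induced on {r = const}: drop the r row and column."""
    g = kerr_star_metric(M, a, r, theta)
    keep = (0, 2, 3)
    return [[g[i][j] for j in keep] for i in keep]

M, a, theta = 1.0, 0.5, 1.0
r_plus = M + math.sqrt(M * M - a * a)
for r in (r_plus, 3.0):
    s2, rho2, delta, _ = kerr_functions(M, a, r, theta)
    print(det(kerr_star_metric(M, a, r, theta)), -rho2 ** 2 * s2)
    print(det(induced_metric(M, a, r, theta)), rho2 * delta * s2)
```

At $r = r_+$ the first pair of printed values should agree and stay away from zero, while the second pair should both be numerically zero, which is the degeneracy used to identify $\mathfrak{h}^+$ as a null hypersurface.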
This coordinate system allows us to add the past horizon
$$\mathfrak{h}^- := \mathbb{R}_{{}^*t}\times\{r = r_+\}_r\times S^2_{\theta,{}^*\varphi}$$
as a smooth null boundary to block I. This is a white hole horizon from which light rays (and in particular outgoing principal null geodesics) emerge. White holes are natural features of the maximal extension of space-times containing eternal black holes (for a description of maximal Kerr space-time, see [8] or [48], or [46] for a shorter account).

The future and past horizons, that we have understood as smooth null boundaries to block I, are both reached for infinite values of $t$; they do not contain any point of the horizon described in Boyer–Lindquist coordinates as $\mathbb{R}_t\times\{r_+\}\times S^2_{\theta,\varphi}$. In the next subsection, we describe a coordinate system that gives us a global description of the horizon, encompassing the future and past horizons as well as the Boyer–Lindquist horizon. It will be useful to know the behavior of $t^*$ and ${}^*t$ on both $\mathfrak{h}^+$ and $\mathfrak{h}^-$.

Properties. We have constructed $\mathfrak{h}^+$ (respectively $\mathfrak{h}^-$) using a coordinate system $(t^*, r, \theta, \varphi^*)$ (respectively $({}^*t, r, \theta, {}^*\varphi)$) that is regular across the horizon along incoming (respectively outgoing) principal null geodesics. The Kerr-star variable $t^*$, defined by
$$t^* = t + T(r), \qquad \frac{dT}{dr} = \frac{r^2+a^2}{\Delta},$$
is constant on incoming principal null geodesics. Along outgoing principal null geodesics, its behavior is best described in terms of star-Kerr coordinates:
$$t^* = {}^*t + 2T(r).$$
Hence, on an outgoing principal null geodesic, $t^*$ tends to $+\infty$ as $r \to +\infty$ and to $-\infty$ as $r \to r_+$. Consequently, $t^*$ is regular on $\mathfrak{h}^+$, takes all real values on $\mathfrak{h}^+$, and tends to $-\infty$ on $\mathfrak{h}^-$. Similarly, ${}^*t$ is regular on $\mathfrak{h}^-$, takes all real values on $\mathfrak{h}^-$, and tends to $+\infty$ on $\mathfrak{h}^+$.

8.2.2. Kruskal–Boyer–Lindquist coordinates

This coordinate system, discovered in 1967 by Boyer and Lindquist [8], is made up of a combination of the two Kerr coordinate systems, modified in such a way that it is regular on both the future and the past horizons. We describe briefly its definition and properties; for details, see [8] or [48]. The time and radial variables are replaced by
$$U = e^{-\kappa_+\,{}^*t}, \qquad V = e^{\kappa_+ t^*}, \tag{8.19}$$
where $\kappa_+$ is the surface gravity at the outer horizon, given by (2.61). The coordinate $\theta$ is conserved since it is regular on all three blocks (except on the axes, but as
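remarked above, this singularity is no more serious than that of standard spherical coordinates and we ignore it). The longitude function is defined by
$$\varphi^\sharp = \frac12\left(\varphi^* + {}^*\varphi - \frac{a}{r_+^2+a^2}\,(t^* + {}^*t)\right) = \varphi - \frac{a}{r_+^2+a^2}\,t. \tag{8.20}$$
It is chosen so that the principal null geodesics in the future and past horizons are coordinate curves. The functions $(U, V, \theta, \varphi^\sharp)$ form an analytic coordinate system on $\mathcal{B}_I\cup\mathfrak{h}^+\cup\mathfrak{h}^-$ − (axes). In this coordinate system, we have
$$\mathcal{B}_I =\ ]0,+\infty[_U\times]0,+\infty[_V\times S^2_{\theta,\varphi^\sharp}, \qquad \mathfrak{h}^+ = \{0\}_U\times[0,+\infty[_V\times S^2_{\theta,\varphi^\sharp}, \qquad \mathfrak{h}^- = [0,+\infty[_U\times\{0\}_V\times S^2_{\theta,\varphi^\sharp},$$
simply because $t^*$ (respectively ${}^*t$) is regular at $\mathfrak{h}^+$ (respectively $\mathfrak{h}^-$), takes all real values on $\mathfrak{h}^+$ (respectively $\mathfrak{h}^-$), and tends to $-\infty$ (respectively $+\infty$) at $\mathfrak{h}^-$ (respectively $\mathfrak{h}^+$). In these new coordinates, the Kerr metric takes the form
$$\begin{aligned} g = {} & -\frac{G_+^2a^2\sin^2\theta}{4\kappa_+^2\rho^2}\,\frac{(r-r_-)(r+r_+)}{(r^2+a^2)(r_+^2+a^2)}\left(\frac{\rho^2}{r^2+a^2}+\frac{\rho_+^2}{r_+^2+a^2}\right)\left(U^2dV^2+V^2dU^2\right)\\ & -\frac{G_+(r-r_-)}{2\kappa_+^2\rho^2}\left(\frac{\rho^4}{(r^2+a^2)^2}+\frac{\rho_+^4}{(r_+^2+a^2)^2}\right)dU\,dV\\ & -\frac{G_+a\sin^2\theta}{\kappa_+\rho^2(r_+^2+a^2)}\left[\rho_+^2(r-r_-)+(r^2+a^2)(r+r_+)\right](U\,dV-V\,dU)\,d\varphi^\sharp\\ & -\rho^2d\theta^2-g_{\varphi\varphi}\,(d\varphi^\sharp)^2, \end{aligned} \tag{8.21}$$
where
$$\rho_+^2 = r_+^2+a^2\cos^2\theta, \qquad G_+ = \frac{r-r_+}{UV} = e^{-2\kappa_+r}\,|r-r_-|^{r_-/r_+}.$$
The functions $r$ and $G_+$ are analytic and non-vanishing on $[0,+\infty[_U\times[0,+\infty[_V$. The expression (8.21) shows that $g$ is smooth on $\mathcal{B}_I\cup\mathfrak{h}^+\cup\mathfrak{h}^-$ and can be extended smoothly on $[0,+\infty[_U\times[0,+\infty[_V\times S^2_{\theta,\varphi^\sharp}$. The 2-sphere $\{U = V = 0\}$, where the future and past horizons meet, is called the crossing sphere; we denote it by $S_c^2$. It is a regular surface in the extended space-time
$$\left(\mathcal{B}_I^{KBL} := [0,+\infty[_U\times[0,+\infty[_V\times S^2_{\theta,\varphi^\sharp},\ g\right)$$
and it represents the whole horizon as described in Boyer–Lindquist coordinates (i.e. $\mathbb{R}_t\times\{r_+\}_r\times S^2_{\theta,\varphi}$). Hence, the Kruskal–Boyer–Lindquist coordinates give us a global description of the horizon
$$\mathfrak{h} = \mathfrak{h}^-\cup S_c^2\cup\mathfrak{h}^+ = \left([0,+\infty[_U\times\{0\}_V\times S^2_{\theta,\varphi^\sharp}\right)\cup\left(\{0\}_U\times[0,+\infty[_V\times S^2_{\theta,\varphi^\sharp}\right)$$
as a union of two smooth null boundaries $\mathfrak{h}^+\cup S_c^2$ and $S_c^2\cup\mathfrak{h}^-$ (see Fig. 1 for a picture of the extended space-time $\mathcal{B}_I^{KBL}$). This allows us to construct a spin-frame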
Fig. 1. The extended space-time $\mathcal{B}_I^{KBL}$ in Kruskal–Boyer–Lindquist coordinates.
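that behaves smoothly at the horizon. As we shall see below, such a spin-frame will be fundamental for the interpretation of $\tilde W^\pm_{H,pn}$ as trace operators on the horizon. We define the new spin-frame by choosing a Newman–Penrose tetrad that is regular at the horizon. We start with the Kinnersley-type tetrad $l^a, n^a, m^a, \bar m^a$ defined by (2.24)–(2.27). We express its vectors in the Kruskal–Boyer–Lindquist basis and rescale the frame vectors when necessary, so that they neither vanish nor blow up at the horizon. The relation between the coordinate vector fields of Kruskal–Boyer–Lindquist coordinates and of Boyer–Lindquist coordinates is given by
$$\frac{\partial}{\partial t} = \kappa_+\left(-U\frac{\partial}{\partial U}+V\frac{\partial}{\partial V}\right)-\frac{a}{r_+^2+a^2}\frac{\partial}{\partial\varphi^\sharp},$$
$$\frac{\partial}{\partial r} = \kappa_+\frac{r^2+a^2}{\Delta}\left(U\frac{\partial}{\partial U}+V\frac{\partial}{\partial V}\right),$$
$$\frac{\partial}{\partial\theta} = \frac{\partial}{\partial\theta}, \qquad \frac{\partial}{\partial\varphi} = \frac{\partial}{\partial\varphi^\sharp}.$$
This yields the expression of the null vectors (2.24)–(2.27) in terms of Kruskal–Boyer–Lindquist coordinates:
$$l^a\frac{\partial}{\partial x^a} = \frac{1}{\sqrt{2\Delta\rho^2}}\left(2\kappa_+(r^2+a^2)\,V\frac{\partial}{\partial V}-\frac{a(r^2-r_+^2)}{r_+^2+a^2}\frac{\partial}{\partial\varphi^\sharp}\right),$$
$$n^a\frac{\partial}{\partial x^a} = \frac{1}{\sqrt{2\Delta\rho^2}}\left(-2\kappa_+(r^2+a^2)\,U\frac{\partial}{\partial U}-\frac{a(r^2-r_+^2)}{r_+^2+a^2}\frac{\partial}{\partial\varphi^\sharp}\right),$$
$$m^a\frac{\partial}{\partial x^a} = \frac{1}{p\sqrt2}\left(ia\kappa_+\sin\theta\left(-U\frac{\partial}{\partial U}+V\frac{\partial}{\partial V}\right)+\frac{\partial}{\partial\theta}+\frac{i}{\sin\theta}\frac{\partial}{\partial\varphi^\sharp}\right).$$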
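Lemma 8.1. The rescaled Newman–Penrose tetrad $\mathfrak{l}^a, \mathfrak{n}^a, m^a, \bar m^a$, defined by
$$\mathfrak{l}^a = \frac{U}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\,l^a, \qquad \mathfrak{n}^a = \frac{V}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\,n^a,$$
is smooth on $\mathfrak{h}$ − (axes). The spin-frame $(\mathfrak{o}^A, \mathfrak{i}^A)$ associated to $\mathfrak{l}^a, \mathfrak{n}^a, m^a, \bar m^a$ is therefore regular (off axis) on $\mathfrak{h}$ and is given by
$$\mathfrak{o}^A = \left(\frac{U}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\right)^{1/2}o^A, \qquad \mathfrak{i}^A = \left(\frac{V}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\right)^{1/2}\iota^A. \tag{8.22}$$

Proof. The vector $m^a$ is clearly regular off axis on $\mathcal{B}_I^{KBL}$. For $\mathfrak{l}^a$, we have
$$\begin{aligned}\mathfrak{l}^a\frac{\partial}{\partial x^a} &= e^{-\kappa_+r}(r-r_-)^{M/r_+}\left(\frac{2\kappa_+(r^2+a^2)}{\sqrt{2\rho^2}}\,\frac{UV}{\Delta}\frac{\partial}{\partial V}-\frac{a(r^2-r_+^2)\,U}{\sqrt{2\rho^2}\,(r_+^2+a^2)\,\Delta}\frac{\partial}{\partial\varphi^\sharp}\right)\\ &= e^{-\kappa_+r}(r-r_-)^{M/r_+}\left(\frac{2\kappa_+(r^2+a^2)}{\sqrt{2\rho^2}}\,\frac{1}{(r-r_-)G_+}\frac{\partial}{\partial V}-\frac{a(r+r_+)\,U}{\sqrt{2\rho^2}\,(r_+^2+a^2)(r-r_-)}\frac{\partial}{\partial\varphi^\sharp}\right)\end{aligned}$$
and $\mathfrak{l}^a$ is therefore regular on $\mathcal{B}_I^{KBL}$ − (axes). A similar calculation can be done for $\mathfrak{n}^a$. It is also easy to check, using the definition and properties of $G_+$, that the tetrad satisfies
$$\mathfrak{l}_a\mathfrak{n}^a = 1 = -m_a\bar m^a, \qquad \mathfrak{l}_am^a = \mathfrak{n}_am^a = 0.$$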
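The "definition and properties of $G_+$" invoked at the end of the proof can be probed numerically. The sketch below is a hedged illustration, not the authors' computation. It assumes the usual Kerr values $r_\pm = M \pm \sqrt{M^2-a^2}$ and $\kappa_+ = (r_+-r_-)/(2(r_+^2+a^2))$ (the source defers $\kappa_+$ to (2.61)), and it fixes the integration constant in $T$ to zero. It compares $(r-r_+)/(UV)$ with $e^{-2\kappa_+r}(r-r_-)^{r_-/r_+}$ and evaluates the product of the two rescaling factors of Lemma 8.1, which should equal 1 for the normalization $\mathfrak{l}_a\mathfrak{n}^a = l_an^a$ to hold. All function names are hypothetical.

```python
import math

def horizons(M, a):
    """Outer and inner horizon radii r_+ and r_- (assumes |a| < M)."""
    root = math.sqrt(M * M - a * a)
    return M + root, M - root

def kappa_plus(M, a):
    """Surface gravity at the outer horizon (standard expression, assumed)."""
    r_plus, r_minus = horizons(M, a)
    return (r_plus - r_minus) / (2.0 * (r_plus ** 2 + a * a))

def T(M, a, r):
    """A primitive of (r^2 + a^2)/Delta on r > r_+, integration constant 0."""
    r_plus, r_minus = horizons(M, a)
    c_plus = (r_plus ** 2 + a * a) / (r_plus - r_minus)
    c_minus = (r_minus ** 2 + a * a) / (r_plus - r_minus)
    return r + c_plus * math.log(r - r_plus) - c_minus * math.log(r - r_minus)

def kruskal_uv(M, a, t, r):
    """U = exp(-kappa_+ *t) and V = exp(kappa_+ t*), as in (8.19)."""
    k = kappa_plus(M, a)
    star_t = t - T(M, a, r)
    t_star = t + T(M, a, r)
    return math.exp(-k * star_t), math.exp(k * t_star)

def g_plus_from_uv(M, a, t, r):
    """G_+ computed from its definition (r - r_+)/(U V)."""
    r_plus, _ = horizons(M, a)
    U, V = kruskal_uv(M, a, t, r)
    return (r - r_plus) / (U * V)

def g_plus_closed_form(M, a, r):
    """G_+ = exp(-2 kappa_+ r) (r - r_-)^(r_-/r_+) for r > r_+."""
    r_plus, r_minus = horizons(M, a)
    return math.exp(-2.0 * kappa_plus(M, a) * r) * (r - r_minus) ** (r_minus / r_plus)

def tetrad_normalization(M, a, t, r):
    """(U V / Delta) exp(-2 kappa_+ r) (r - r_-)^(2M/r_+), expected to equal 1."""
    r_plus, r_minus = horizons(M, a)
    U, V = kruskal_uv(M, a, t, r)
    delta = (r - r_plus) * (r - r_minus)
    scale = math.exp(-2.0 * kappa_plus(M, a) * r) * (r - r_minus) ** (2.0 * M / r_plus)
    return U * V / delta * scale

print(g_plus_from_uv(1.0, 0.5, 0.7, 2.5), g_plus_closed_form(1.0, 0.5, 2.5))
print(tetrad_normalization(1.0, 0.5, 0.7, 2.5))
```

The agreement of the two expressions of $G_+$ rests on $r_+ + r_- = 2M$ and $r_\pm^2 + a^2 = 2Mr_\pm$; a different integration constant in $T$ would rescale $U$ and $V$ by constants.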
8.2.3. Interpretation of the traces on the horizon

Let us consider on $\Sigma_0$ some smooth compactly supported initial data $\chi_A \in C_0^\infty(\Sigma_0, \mathbb{S}_A)$ for the Weyl equation (2.5). Then (2.5) admits a unique solution $\phi_A \in C^\infty(\mathbb{R}_t, C_0^\infty(\Sigma, \mathbb{S}_A))$ such that $\phi_A|_{\Sigma_0} = \chi_A$. This solution can be extended uniquely as a smooth spinor-valued function on $\mathcal{B}_I^{KBL}$, still denoted $\phi_A$. This is proved by first applying standard regularity results for symmetric hyperbolic systems, then extending the space-time $(\mathcal{B}_I^{KBL}, g)$ beyond the horizon (a natural way is to construct the maximal analytic extension of Kerr space-time) and finally applying once more the standard theorems for symmetric hyperbolic systems (for details, see [46]). It follows that for smooth solutions of (2.5), we can naturally define the trace of $\phi_A$ on $\mathfrak{h}^+$ and $\mathfrak{h}^-$. This trace, projected onto the spin-frame (8.22), can easily be compared with the limit of the vector $\Phi$ (of the components of $\phi_A$ in the spin-frame $(o^A, \iota^A)$) along principal null geodesics as $r \to r_+$. This is expressed by the following proposition.
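Proposition 8.1. Given $\phi_A \in C^\infty(\mathcal{B}_I^{KBL})$ a smooth solution of (2.5), we denote by $f$ the vector
$$f = \begin{pmatrix} f_0 := \phi_A\mathfrak{o}^A\\ f_1 := \phi_A\mathfrak{i}^A\end{pmatrix}.$$
The trace of $f$ on $\mathfrak{h}^+$ is related to the limit of $\Phi$ in the future along incoming principal null geodesics as follows:
$$f_0(0, V, \theta, \varphi^\sharp) = \lim_{r\to r_+}\left(\frac{U}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\right)^{1/2}\phi_0\!\left(\gamma^-_{V,\theta,\varphi^\sharp}(r)\right),$$
$$f_1(0, V, \theta, \varphi^\sharp) = \lim_{r\to r_+}\left(\frac{V}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\right)^{1/2}\phi_1\!\left(\gamma^-_{V,\theta,\varphi^\sharp}(r)\right),$$
where $\gamma^-_{V,\theta,\varphi^\sharp}(r)$ is the incoming radial null geodesic, parametrized by $r$, that encounters $\mathfrak{h}^+$ at the point of Kruskal–Boyer–Lindquist coordinates $(0, V, \theta, \varphi^\sharp)$; the description of this geodesic in Kerr-star coordinates is
$$\gamma^-_{V,\theta,\varphi^\sharp}(r) = \left(t^* = \frac{1}{\kappa_+}\log(V),\ r,\ \theta,\ \varphi^* = \varphi^\sharp + \Lambda(r_+) + \frac{a}{r_+^2+a^2}\,(t^* - T(r_+))\right),$$
$T$ and $\Lambda$ being the functions defining the Kerr-star and star-Kerr coordinate systems, satisfying (8.9). Similarly, the trace of $f$ on $\mathfrak{h}^-$ is related to the limit of $\Phi$ in the past along outgoing principal null geodesics as follows:
$$f_0(U, 0, \theta, \varphi^\sharp) = \lim_{r\to r_+}\left(\frac{U}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\right)^{1/2}\phi_0\!\left(\gamma^+_{U,\theta,\varphi^\sharp}(r)\right),$$
$$f_1(U, 0, \theta, \varphi^\sharp) = \lim_{r\to r_+}\left(\frac{V}{\sqrt\Delta}\,e^{-\kappa_+r}(r-r_-)^{M/r_+}\right)^{1/2}\phi_1\!\left(\gamma^+_{U,\theta,\varphi^\sharp}(r)\right),$$
where $\gamma^+_{U,\theta,\varphi^\sharp}(r)$ is the outgoing radial null geodesic, parametrized by $r$, that encounters $\mathfrak{h}^-$ at the point of Kruskal–Boyer–Lindquist coordinates $(U, 0, \theta, \varphi^\sharp)$; the description of this geodesic in star-Kerr coordinates is
$$\gamma^+_{U,\theta,\varphi^\sharp}(r) = \left({}^*t = -\frac{1}{\kappa_+}\log(U),\ r,\ \theta,\ {}^*\varphi = \varphi^\sharp - \Lambda(r_+) + \frac{a}{r_+^2+a^2}\,({}^*t + T(r_+))\right).$$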
Proof. It is an immediate consequence of (8.22), the definition of Kruskal–Boyer–Lindquist coordinates and the fact that in Kerr-star coordinates (respectively star-Kerr coordinates), the incoming (respectively outgoing) principal null geodesics are the $r$ coordinate lines.
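The trace of $f$ on $\mathfrak{h}^\pm$ can then be related to the limit of the vector $\Psi$ [defined in (2.51)], solution of Eq. (2.56), along incoming and outgoing principal null geodesics.

Corollary 8.1. The vector field $\Psi$ extends as a smooth vector field on $\mathcal{B}_I^{KBL}$ and its trace on $\mathfrak{h}^\pm$ is naturally defined as the limit of $\Psi$ as $r \to r_+$ along incoming or outgoing principal null geodesics. The trace of $\Psi$ on $\mathfrak{h}^+$ is given in terms of the trace of $f$ on $\mathfrak{h}^+$ by
$$\Psi_0|_{\mathfrak{h}^+}(0, V, \theta, \varphi^\sharp) := \lim_{r\to r_+}\Psi_0\!\left(\gamma^-_{V,\theta,\varphi^\sharp}(r)\right) = \left((r_+-r_-)G_+(r_+)\right)^{1/4}\sqrt{pV}\,f_0(0, V, \theta, \varphi^\sharp), \tag{8.23}$$
$$\Psi_1|_{\mathfrak{h}^+}(0, V, \theta, \varphi^\sharp) := \lim_{r\to r_+}\Psi_1\!\left(\gamma^-_{V,\theta,\varphi^\sharp}(r)\right) = 0, \tag{8.24}$$
and the trace of $\Psi$ on $\mathfrak{h}^-$ is given in terms of the trace of $f$ on $\mathfrak{h}^-$ by
$$\Psi_0|_{\mathfrak{h}^-}(U, 0, \theta, \varphi^\sharp) := \lim_{r\to r_+}\Psi_0\!\left(\gamma^+_{U,\theta,\varphi^\sharp}(r)\right) = 0, \tag{8.25}$$
$$\Psi_1|_{\mathfrak{h}^-}(U, 0, \theta, \varphi^\sharp) := \lim_{r\to r_+}\Psi_1\!\left(\gamma^+_{U,\theta,\varphi^\sharp}(r)\right) = \left((r_+-r_-)G_+(r_+)\right)^{1/4}\sqrt{\bar pU}\,f_1(U, 0, \theta, \varphi^\sharp), \tag{8.26}$$
with
$$(r_+-r_-)\,G_+(r_+) = 2\sqrt{M^2-a^2}\;e^{-2\kappa_+r_+}.$$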
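Proof. The relation between $\Psi$ and $f$ is given by, using the two expressions of $G_+$:
$$\Psi = \mathcal{U}\left(\frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2}\right)^{1/4}\begin{pmatrix}\sqrt{\dfrac{\sqrt\Delta}{U}\,e^{\kappa_+r}(r-r_-)^{-M/r_+}} & 0\\ 0 & \sqrt{\dfrac{\sqrt\Delta}{V}\,e^{\kappa_+r}(r-r_-)^{-M/r_+}}\end{pmatrix}f = \mathcal{U}\begin{pmatrix}\sqrt{\dfrac{\sigma\rho}{r^2+a^2}}\,\sqrt V\,((r-r_-)G_+)^{1/4} & 0\\ 0 & \sqrt{\dfrac{\sigma\rho}{r^2+a^2}}\,\sqrt U\,((r-r_-)G_+)^{1/4}\end{pmatrix}f.$$
The matrix
$$\mathcal{U}\begin{pmatrix}\sqrt{\dfrac{\sigma\rho}{r^2+a^2}}\,\sqrt V\,((r-r_-)G_+)^{1/4} & 0\\ 0 & \sqrt{\dfrac{\sigma\rho}{r^2+a^2}}\,\sqrt U\,((r-r_-)G_+)^{1/4}\end{pmatrix}$$
is smooth on $\mathcal{B}_I^{KBL}$, hence $\Psi$ extends as a smooth vector field on $\mathcal{B}_I^{KBL}$. Besides, on the horizon, $\mathcal{U}$ reduces to
$$\begin{pmatrix}\sqrt{\dfrac{r_++ia\cos\theta}{\sqrt{r_+^2+a^2\cos^2\theta}}} & 0\\ 0 & \sqrt{\dfrac{r_+-ia\cos\theta}{\sqrt{r_+^2+a^2\cos^2\theta}}}\end{pmatrix} = \begin{pmatrix}\sqrt{\dfrac{p}{\rho}} & 0\\ 0 & \sqrt{\dfrac{\bar p}{\rho}}\end{pmatrix}$$
and for $r = r_+$, we have $\sigma = r^2+a^2$. These identities and the facts that $U = 0$ on $\mathfrak{h}^+$ and $V = 0$ on $\mathfrak{h}^-$ imply (8.23)–(8.26).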
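Definition 8.1. We define the trace operators:
$$T^+_{\mathfrak{h}} : C_0^\infty(\Sigma_0;\mathbb{C}^2)\to C^\infty(\mathfrak{h}^+,\mathbb{C}), \qquad \Psi_{\Sigma_0}\mapsto\psi_0|_{\mathfrak{h}^+}, \tag{8.27}$$
$$T^-_{\mathfrak{h}} : C_0^\infty(\Sigma_0;\mathbb{C}^2)\to C^\infty(\mathfrak{h}^-,\mathbb{C}), \qquad \Psi_{\Sigma_0}\mapsto\psi_1|_{\mathfrak{h}^-}, \tag{8.28}$$
where
$$\Psi = \begin{pmatrix}\psi_0\\ \psi_1\end{pmatrix}$$
is the solution of (2.56) in $C^\infty(\mathcal{B}_I^{KBL})$ associated with $\Psi_{\Sigma_0}$.

We are now in position, for any initial data for Eq. (2.56) in $\mathcal{H}$, to prove the existence of a trace on $\mathfrak{h}^\pm$ of the corresponding solution of (2.56) and to relate this trace to the image of the initial data by the inverse wave operator $\tilde W^\pm_{H,pn}$.

Theorem 8.2. We consider the $C^\infty$ diffeomorphisms $F^\pm_{\mathfrak{h}}$ from $\mathfrak{h}^\pm$ onto $\Sigma_0$ defined by identifying points along incoming (respectively outgoing) principal null geodesics. Their expressions in terms of Kruskal–Boyer–Lindquist coordinates on $\mathfrak{h}^\pm$ and Boyer–Lindquist coordinates on $\Sigma_0$ are as follows: first for $F^+_{\mathfrak{h}}$, we have
$$r(F^+_{\mathfrak{h}}(0, V, \theta, \varphi^\sharp)) = T^{-1}\!\left(\frac{1}{\kappa_+}\operatorname{Log}(V)\right), \tag{8.29}$$
$$\theta(F^+_{\mathfrak{h}}(0, V, \theta, \varphi^\sharp)) = \theta, \tag{8.30}$$
$$\varphi(F^+_{\mathfrak{h}}(0, V, \theta, \varphi^\sharp)) = \varphi^\sharp - \frac{a}{r_+^2+a^2}\left(r(F^+_{\mathfrak{h}}(0, V, \theta, \varphi^\sharp)) - r_+\right) - \frac{2Ma}{r_+^2+a^2}\operatorname{Log}\!\left(\frac{r(F^+_{\mathfrak{h}}(0, V, \theta, \varphi^\sharp)) - r_-}{r_+-r_-}\right), \tag{8.31}$$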
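$T^{-1}$ being the inverse of the function $T$ defined in (8.9); and for $F^-_{\mathfrak{h}}$, we have similar formulae
$$r(F^-_{\mathfrak{h}}(U, 0, \theta, \varphi^\sharp)) = T^{-1}\!\left(\frac{1}{\kappa_+}\operatorname{Log}(U)\right), \tag{8.32}$$
$$\theta(F^-_{\mathfrak{h}}(U, 0, \theta, \varphi^\sharp)) = \theta, \tag{8.33}$$
$$\varphi(F^-_{\mathfrak{h}}(U, 0, \theta, \varphi^\sharp)) = \varphi^\sharp - \frac{a}{r_+^2+a^2}\left(r(F^-_{\mathfrak{h}}(U, 0, \theta, \varphi^\sharp)) - r_+\right) - \frac{2Ma}{r_+^2+a^2}\operatorname{Log}\!\left(\frac{r(F^-_{\mathfrak{h}}(U, 0, \theta, \varphi^\sharp)) - r_-}{r_+-r_-}\right). \tag{8.34}$$
The trace operators $T^\pm_{\mathfrak{h}}$ [defined in (8.27) and (8.28)] extend in a unique manner as bounded operators^f from $\mathcal{H}$ to $L^2(\mathfrak{h}^\pm; d\mathrm{Vol}_{\mathfrak{h}^\pm})$, where the measure $d\mathrm{Vol}_{\mathfrak{h}^\pm}$ on $\mathfrak{h}^\pm$ is the pull-back of the volume measure $dr_*\,d\omega$ on $\Sigma_0$ by the diffeomorphism $F^\pm_{\mathfrak{h}}$. They are related to the inverse wave operators $\tilde W^\pm_{H,pn}$ via the diffeomorphisms $F^\pm_{\mathfrak{h}}$ as follows:
$$(\tilde W^\pm_{H,pn}\Psi|_{\Sigma_0})(r, \theta, \varphi) = (T^\pm_{\mathfrak{h}}\Psi|_{\Sigma_0})\!\left((F^\pm_{\mathfrak{h}})^{-1}(r, \theta, \varphi)\right), \tag{8.35}$$
that is to say, $T^\pm_{\mathfrak{h}}$ is the pull-back $(F^\pm_{\mathfrak{h}})^*\tilde W^\pm_{H,pn}$ of the inverse wave operator $\tilde W^\pm_{H,pn}$ by the diffeomorphism $F^\pm_{\mathfrak{h}}$.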
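Formula (8.29) only defines the radial coordinate of $F^+_{\mathfrak{h}}(0, V, \theta, \varphi^\sharp)$ implicitly, through the inverse of $T$. The following sketch shows one way this could be evaluated numerically by bisection. It is an illustration under assumptions rather than a construction taken from the paper: the integration constant of $T$ is set to zero, $\kappa_+$ is taken to be the standard Kerr surface gravity $(r_+-r_-)/(2(r_+^2+a^2))$, and the names `T_inverse` and `r_on_sigma0` are hypothetical.

```python
import math

def horizons(M, a):
    """Outer and inner horizon radii r_+ and r_- (assumes |a| < M)."""
    root = math.sqrt(M * M - a * a)
    return M + root, M - root

def kappa_plus(M, a):
    """Surface gravity at the outer horizon (standard expression, assumed)."""
    r_plus, r_minus = horizons(M, a)
    return (r_plus - r_minus) / (2.0 * (r_plus ** 2 + a * a))

def T(M, a, r):
    """A primitive of (r^2 + a^2)/Delta on r > r_+, integration constant 0."""
    r_plus, r_minus = horizons(M, a)
    c_plus = (r_plus ** 2 + a * a) / (r_plus - r_minus)
    c_minus = (r_minus ** 2 + a * a) / (r_plus - r_minus)
    return r + c_plus * math.log(r - r_plus) - c_minus * math.log(r - r_minus)

def T_inverse(M, a, value, iterations=200):
    """Solve T(r) = value on r > r_+ by bisection (T is increasing there).

    Intended for moderate values only: very negative targets push r so
    close to r_+ that floating point can no longer resolve the gap.
    """
    r_plus, _ = horizons(M, a)
    gap = 1.0
    while T(M, a, r_plus + gap) > value:   # move the lower end towards r_+
        gap *= 0.5
    lo = r_plus + gap
    hi = r_plus + 1.0
    while T(M, a, hi) < value:             # move the upper end outwards
        hi = r_plus + 2.0 * (hi - r_plus)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if T(M, a, mid) < value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def r_on_sigma0(M, a, V):
    """Radial coordinate of F_h^+(0, V, theta, phi_sharp), following (8.29)."""
    return T_inverse(M, a, math.log(V) / kappa_plus(M, a))

for V in (0.5, 1.0, 2.0, 4.0):
    print(V, r_on_sigma0(1.0, 0.5, V))
```

Larger values of $V$ on the future horizon should correspond to points of $\Sigma_0$ farther from the black hole, and $V \to 0$ (the crossing sphere) to $r \to r_+$.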
Proof. We first establish the expressions of the diffeomorphisms $F^\pm_{\mathfrak{h}}$. We start with $F^+_{\mathfrak{h}}$. Given $V_+ > 0$ and $(\theta_+, \varphi^\sharp_+) \in S^2$, we denote by $(t = 0, r_0, \theta_0, \varphi_0)$ the Boyer–Lindquist coordinates of $F^+_{\mathfrak{h}}(0, V_+, \theta_+, \varphi^\sharp_+)$ and we calculate them in terms of $V_+$, $\theta_+$, $\varphi^\sharp_+$. Along an incoming principal null geodesic, $t^*$, $\theta$ and $\varphi^*$ are constant. This already gives us (8.30). Using the definitions of $t^*$ and of $V$, we have, on the incoming principal null geodesic going from $(t_0, r_0, \theta_0, \varphi_0)$ to $(0, V_+, \theta_+, \varphi^\sharp_+)$:
$$t^* = \frac{1}{\kappa_+}\operatorname{Log}(V_+) = t_0 + T(r_0) = T(r_0),$$
which entails (8.29). In order to calculate $\varphi_0$, we express the vector $V^-$ in terms of Kruskal–Boyer–Lindquist coordinates (using the expression of $n^a$ in these coordinates):
$$V^- = -2\kappa_+\frac{r^2+a^2}{\Delta}\,U\frac{\partial}{\partial U} - \frac{a(r^2-r_+^2)}{\Delta(r_+^2+a^2)}\frac{\partial}{\partial\varphi^\sharp}.$$
This shows that along an incoming principal null geodesic,
$$\dot\varphi^\sharp = -\frac{a(r^2-r_+^2)}{\Delta(r_+^2+a^2)} = -\frac{a}{r_+^2+a^2}\,\frac{r+r_+}{r-r_-}.$$

^f These operators can also be understood as trace operators acting on solutions of (2.56) in $C(\mathbb{R}_t;\mathcal{H})$, instead of on their initial data.
Consequently, we must have
$$\varphi^\sharp_+ - \varphi^\sharp(t_0, r_0, \theta_0, \varphi_0) = -\frac{a}{r_+^2+a^2}\int_{r_0}^{r_+}\frac{r+r_+}{r-r_-}\,dr = \frac{a}{r_+^2+a^2}\,(r_0-r_+) + \frac{2Ma}{r_+^2+a^2}\operatorname{Log}\!\left(\frac{r_0-r_-}{r_+-r_-}\right).$$
Moreover, the definition (8.20) of $\varphi^\sharp$ shows that this coordinate function coincides with the Boyer–Lindquist function $\varphi$ on $\Sigma_0$, which gives (8.31). The verification of (8.32)–(8.34) is done in the same manner, working with outgoing null geodesics and star-Kerr coordinates instead.

We now justify the existence of trace operators for minimum regularity solutions and their relation to the inverse wave operators. We consider some initial data $\Psi_{\Sigma_0} \in C_0^\infty(\Sigma_0)$ and $\Psi$ the associated solution of (2.56). As observed in Remark 8.1, we have
$$\operatorname*{s-lim}_{t\to+\infty} e^{-itP_N}e^{it\slashed{D}_K}\,1_{\mathbb{R}^-}(P^+) = \operatorname*{s-lim}_{t\to+\infty} e^{-itP_N}\,j_-\,e^{it\slashed{D}_K},$$
where $j_- \in C^\infty(\mathbb{R})$, $\operatorname{supp} j_- \subset \mathbb{R}^-$ and $j_- \equiv 1$ on $]-\infty, -\varepsilon[$, $\varepsilon > 0$ and small. We denote by $\Psi(t) \in C_0^\infty(\Sigma)$ the restriction to $\Sigma_t$ ($\simeq \Sigma$) of the solution $\Psi$. As remarked above, we have for $(r_0, \theta_0, \varphi_0)$ given in $\Sigma$,
$$\begin{aligned}(e^{-itP_N}j_-e^{it\slashed{D}_K}\Psi_{\Sigma_0})(r_0, \theta_0, \varphi_0) &= \begin{pmatrix}(j_-\psi_0(t))(F_{w^-}(t)(r_0, \theta_0, \varphi_0))\\ (j_-\psi_1(t))(F_{w^+}(t)(r_0, \theta_0, \varphi_0))\end{pmatrix}\\ &= \begin{pmatrix}j_-(r_0+t)\,\psi_0(t)(F_{w^-}(t)(r_0, \theta_0, \varphi_0))\\ j_-(r_0-t)\,\psi_1(t)(F_{w^+}(t)(r_0, \theta_0, \varphi_0))\end{pmatrix}\\ &= \begin{pmatrix}\psi_0(t)(F_{w^-}(t)(r_0, \theta_0, \varphi_0))\\ 0\end{pmatrix} = \begin{pmatrix}\psi_0(F_{v^-}(t)(0, r_0, \theta_0, \varphi_0))\\ 0\end{pmatrix}\end{aligned}$$
for $t$ large enough. The line
$$\{F_{v^-}(t)(0, r_0, \theta_0, \varphi_0),\ t\in\mathbb{R}\}$$
is exactly the principal null geodesic $\gamma^-_{(F^+_{\mathfrak{h}})^{-1}(r_0,\theta_0,\varphi_0)}$ going through the point $(0, r_0, \theta_0, \varphi_0)$ of $\Sigma_0$, parametrized by $t$ instead of $r$. This change of parameter is analytic and given by
$$t\!\left(\gamma^-_{(F^+_{\mathfrak{h}})^{-1}(r_0,\theta_0,\varphi_0)}(r)\right) = -r_*(r) + r_*(r_0),$$
in other words
$$F_{v^-}(t)(0, r_0, \theta_0, \varphi_0) = \gamma^-_{(F^+_{\mathfrak{h}})^{-1}(r_0,\theta_0,\varphi_0)}\!\left(r(-t + r_*(r_0))\right).$$
00191
Scattering of Massless Dirac Fields by a Kerr Black Hole
115
Fig. 2. The hypersurfaces $\Sigma_t$ in $(U, V, \theta, \varphi^\sharp)$ coordinates and the effect of the function $j_-$. We have represented the hypersurface $t = 0$ and three hypersurfaces of constant positive $t$ for $t_3 > t_2 > t_1$. The effect of the cut-off function $j_-$ is to obliterate all that happens in the striped region, corresponding to $r_* > R$.
As $t$ tends to $+\infty$ along this line, $r$ tends to $r_+$, hence, we find
$$\lim_{t\to+\infty}(e^{-itP_N}j_-e^{it\slashed{D}_K}\Psi_{\Sigma_0})(r_0, \theta_0, \varphi_0) = \begin{pmatrix}\displaystyle\lim_{r\to r_+}\psi_0\!\left(\gamma^-_{(F^+_{\mathfrak{h}})^{-1}(r_0,\theta_0,\varphi_0)}(r)\right)\\ 0\end{pmatrix}.$$
This shows that $\tilde W^+_{H,pn}\Psi_{\Sigma_0}$ and $T^+_{\mathfrak{h}}(\Psi_{\Sigma_0})\circ(F^+_{\mathfrak{h}})^{-1}$ coincide for smooth compactly supported data $\Psi_{\Sigma_0}$. Since $\tilde W^+_{H,pn}$ extends by density to a bounded operator from $\mathcal{H}$ to $\mathcal{H}^+$, this entails that $T^+_{\mathfrak{h}}$ also extends to a bounded operator from $\mathcal{H}$ to $L^2(\mathfrak{h}^+, d\mathrm{Vol}_{\mathfrak{h}^+})$. A similar construction can be done for $\tilde W^-_{H,pn}$ and $T^-_{\mathfrak{h}}$.

Remark 8.4. The equality $T^+_{\mathfrak{h}} = (F^+_{\mathfrak{h}})^*\tilde W^+_{H,pn}$ can be explained in a more visual manner for smooth compactly supported data. As $t \to +\infty$, the hypersurfaces $\Sigma_t$ accumulate on the horizon and the limit of the quantity $j_-e^{it\slashed{D}_K}\Psi_{\Sigma_0}$ is simply the trace of the solution $\Psi$ on $\mathfrak{h}^+$ (see Fig. 2 for a description of the way the hypersurfaces accumulate on the horizon and of the effect of the cut-off function $j_-$). The operator $e^{-itP_N}$ pulls back for a time interval $t$ the first component of $\Psi$
along the incoming principal null congruence and its second component along the outgoing null congruence. At the limit $t \to +\infty$, acting on the trace of $\Psi$ on $\mathfrak{h}^+$, whose second component is zero, it simply pulls back the trace of $\Psi$ onto $\Sigma_0$ along the incoming null congruence.

8.3. Inverse wave operators at infinity as trace operators

8.3.1. Penrose compactification of block I

The Penrose compactification of the exterior of a Kerr black hole is done using two independent and symmetric constructions, one based on Kerr-star, the other on star-Kerr coordinates. We describe explicitly only the first of these two constructions, following [46]. Past null infinity is defined as the set of limit points of incoming principal null geodesics as $r \to +\infty$. This rather abstract definition of a 3-surface, describing the congruence of incoming principal null geodesics, can be given a precise meaning using Kerr-star coordinates. We consider the expression (8.14) of the Kerr metric in Kerr-star coordinates and replace the variable $r$ by $w = 1/r$. In these new variables, the exterior of the black hole is described as
$$\mathcal{B}_I = \mathbb{R}_{t^*}\times\left]0, \frac{1}{r_+}\right[_w\times S^2_{\theta,\varphi^*}.$$
The conformally rescaled metric
$$\hat g = \Omega^2g, \qquad \Omega = w = \frac1r \tag{8.36}$$
takes the form
$$\begin{aligned}\hat g = {} & \left(w^2 - \frac{2Mw^3}{1+a^2w^2\cos^2\theta}\right)dt^{*2} + \frac{4Maw^3\sin^2\theta}{1+a^2w^2\cos^2\theta}\,dt^*d\varphi^*\\ & - \left(1+a^2w^2+\frac{2Ma^2w^3\sin^2\theta}{1+a^2w^2\cos^2\theta}\right)\sin^2\theta\,d\varphi^{*2}\\ & - (1+a^2w^2\cos^2\theta)\,d\theta^2 + 2\,dt^*dw - 2a\sin^2\theta\,d\varphi^*dw.\end{aligned}$$
This expression shows that $\hat g$ can be extended smoothly on the domain
$$\mathbb{R}_{t^*}\times\left[0, \frac{1}{r_+}\right[_w\times S^2_{\theta,\varphi^*}.$$
The hypersurface
$$\mathscr{I}^- := \mathbb{R}_{t^*}\times\{w = 0\}\times S^2_{\theta,\varphi^*}$$
can thus be added to the rescaled space-time as a smooth hypersurface, describing past null infinity as defined above. This hypersurface is indeed null since
$$\hat g|_{w=0} = -d\theta^2 - \sin^2\theta\,d\varphi^{*2}$$
Fig. 3. The Penrose compactification of block I, with two hypersurfaces $\Sigma_s$ and $\Sigma_t$, $t > s$.
is degenerate (recall that $\mathscr{I}^-$ is a 3-surface) and
$$\det(\hat g) = -w^4\rho^4\sin^2\theta = -(1+a^2w^2\cos^2\theta)^2\sin^2\theta$$
does not vanish for $w = 0$. Similarly, using star-Kerr instead of Kerr-star coordinates, we describe future null infinity, the set of limit points as $r \to +\infty$ of outgoing principal null geodesics, as
$$\mathscr{I}^+ := \mathbb{R}_{{}^*t}\times\{w = 0\}\times S^2_{\theta,{}^*\varphi}.$$
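As a sanity check on the rescaled metric, its determinant in the coordinates $(t^*, w, \theta, \varphi^*)$ can be evaluated numerically and compared with $-(1+a^2w^2\cos^2\theta)^2\sin^2\theta$, and the metric induced on $\{w = 0\}$ can be confirmed to be degenerate. The sketch below is only an illustration; the matrix entries are read off from the expression of $\hat g$ above, and the function names and sample values are hypothetical.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def rescaled_metric(M, a, w, theta):
    """Matrix of g-hat in the coordinate order (t*, w, theta, phi*)."""
    s2 = math.sin(theta) ** 2
    c = 1.0 + a * a * w * w * math.cos(theta) ** 2
    g_tt = w * w - 2.0 * M * w ** 3 / c
    g_tphi = 2.0 * M * a * w ** 3 * s2 / c   # half the dt* dphi* coefficient
    g_phiphi = -(1.0 + a * a * w * w + 2.0 * M * a * a * w ** 3 * s2 / c) * s2
    return [
        [g_tt, 1.0, 0.0, g_tphi],
        [1.0, 0.0, 0.0, -a * s2],
        [0.0, 0.0, -c, 0.0],
        [g_tphi, -a * s2, 0.0, g_phiphi],
    ]

def induced_on_constant_w(M, a, w, theta):
    """Metric induced on {w = const}: drop the w row and column."""
    g = rescaled_metric(M, a, w, theta)
    keep = (0, 2, 3)
    return [[g[i][j] for j in keep] for i in keep]

def expected_determinant(a, w, theta):
    """-(1 + a^2 w^2 cos^2(theta))^2 sin^2(theta)."""
    c = 1.0 + a * a * w * w * math.cos(theta) ** 2
    return -c * c * math.sin(theta) ** 2

M, a, theta = 1.0, 0.5, 1.0
for w in (0.0, 0.1, 0.3):
    print(w, det(rescaled_metric(M, a, w, theta)), expected_determinant(a, w, theta))
print(det(induced_on_constant_w(M, a, 0.0, theta)))
```

The full determinant stays away from zero at $w = 0$ while the induced one vanishes there, which is the pattern that characterizes a null hypersurface of a non-degenerate metric.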
The Penrose compactification of block I is then the space-time
$$(\bar{\mathcal{B}}_I, \hat g), \qquad \bar{\mathcal{B}}_I = \mathcal{B}_I\cup\mathfrak{h}^+\cup S_c^2\cup\mathfrak{h}^-\cup\mathscr{I}^+\cup\mathscr{I}^-,$$
$\hat g$ being defined by (8.36). In spite of the terminology used, the compactified space-time is not compact. There are three "points" missing from the boundary: $i^+$, or future timelike infinity, defined as the limit point of uniformly timelike curves as $t \to +\infty$; $i^-$, past timelike infinity, the counterpart of $i^+$ in the distant past; and $i^0$, spacelike infinity, the limit point of uniformly spacelike curves as $r \to +\infty$. These "points", that can be described as 2-spheres, or even blown up further, are singularities of the rescaled metric. See Fig. 3 for a representation of the compactified block I. We conclude this paragraph with a useful result concerning the behavior at null infinity of the Newman–Penrose tetrad $l^a, n^a, m^a, \bar m^a$:
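Proposition 8.2. Each vector field of the Newman–Penrose tetrad $l^a, n^a, m^a, \bar m^a$ extends as a smooth vector field over $\mathcal{B}_I\cup\mathscr{I}^+\cup\mathscr{I}^-$. All the frame vectors vanish on $\mathscr{I}^-$ (respectively $\mathscr{I}^+$) except $l^a$ (respectively $n^a$) which coincides there with the future oriented null generator of $\mathscr{I}^-$ (respectively $\mathscr{I}^+$). Consequently, the spinor fields $o^A$ and $\iota^A$ extend as smooth spinor fields on $\mathcal{B}_I\cup\mathscr{I}^+\cup\mathscr{I}^-$; $o^A$ does not vanish on $\mathscr{I}^-$ but vanishes on $\mathscr{I}^+$ while $\iota^A$ does not vanish on $\mathscr{I}^+$ but vanishes on $\mathscr{I}^-$. The spin-frame $(\hat o^A, \hat\iota^A) = (\Omega^{-1}o^A, \iota^A)$ is a smooth non-degenerate normalized (relative to the metric $\hat g$) spin-frame over $\mathcal{B}_I\cup\mathscr{I}^+$. Symmetrically, the spin-frame $(\check o^A, \check\iota^A) = (o^A, \Omega^{-1}\iota^A)$ is a smooth non-degenerate normalized spin-frame over $\mathcal{B}_I\cup\mathscr{I}^-$.

Remark 8.5. These properties are well-known for more general space-times admitting a regular null infinity (see [50, Vol. 2]). We prove them here explicitly.

Proof. We first express each of the frame vectors in Kerr-star coordinates, using (8.10)–(8.13):
$$\begin{aligned}l^a\frac{\partial}{\partial x^a} &= \frac{\Delta}{\sqrt{2\Delta\rho^2}}\,V^+ = \frac{\Delta}{\sqrt{2\Delta\rho^2}}\,V^- + \sqrt{\frac{2\Delta}{\rho^2}}\left(\frac{\partial}{\partial r}\right)_{BL}\\ &= -\frac{\Delta}{\sqrt{2\Delta\rho^2}}\left(\frac{\partial}{\partial r}\right)_{K*} + \sqrt{\frac{2\Delta}{\rho^2}}\left(\frac{r^2+a^2}{\Delta}\frac{\partial}{\partial t^*}+\left(\frac{\partial}{\partial r}\right)_{K*}+\frac{a}{\Delta}\frac{\partial}{\partial\varphi^*}\right),\end{aligned}$$
$$n^a\frac{\partial}{\partial x^a} = \frac{\Delta}{\sqrt{2\Delta\rho^2}}\,V^- = -\frac{\Delta}{\sqrt{2\Delta\rho^2}}\left(\frac{\partial}{\partial r}\right)_{K*},$$
$$m^a\frac{\partial}{\partial x^a} = \frac{1}{p\sqrt2}\left(ia\sin\theta\frac{\partial}{\partial t^*}+\frac{\partial}{\partial\theta}+\frac{i}{\sin\theta}\frac{\partial}{\partial\varphi^*}\right).$$
Going over to coordinates $t^*, w, \theta, \varphi^*$, we obtain
$$l^a\frac{\partial}{\partial x^a} = -\frac{\Delta}{\sqrt{2\Delta\rho^2}}\,w^2\left(\frac{\partial}{\partial w}\right)_{K*} + \sqrt{\frac{2\Delta}{\rho^2}}\left(\frac{r^2+a^2}{\Delta}\frac{\partial}{\partial t^*}+\frac{a}{\Delta}\frac{\partial}{\partial\varphi^*}\right),$$
$$n^a\frac{\partial}{\partial x^a} = \frac{\Delta}{\sqrt{2\Delta\rho^2}}\,w^2\left(\frac{\partial}{\partial w}\right)_{K*},$$
$$m^a\frac{\partial}{\partial x^a} = \frac{1}{p\sqrt2}\left(ia\sin\theta\frac{\partial}{\partial t^*}+\frac{\partial}{\partial\theta}+\frac{i}{\sin\theta}\frac{\partial}{\partial\varphi^*}\right).$$
All three vector fields are smooth on $\mathcal{B}_I\cup\mathscr{I}^+\cup\mathscr{I}^-$; $l^a$ is the only one not to vanish on $\mathscr{I}^-$ and
$$\left.l^a\frac{\partial}{\partial x^a}\right|_{w=0} = \sqrt2\,\frac{\partial}{\partial t^*},$$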
i.e. on $\mathscr{I}^-$, $l^a$ is the future oriented null generator of $\mathscr{I}^-$. A similar calculation can be done for $\mathscr{I}^+$ using star-Kerr coordinates and identities (8.15)–(8.18). We obtain in particular that
$$\left. n^a\frac{\partial}{\partial x^a}\right|_{w=0} = \sqrt{2}\,\frac{\partial}{\partial\,{}^*t}\,.$$
The properties of the spin-frame $(\check o^A, \check\iota^A) = (o^A, \Omega^{-1}\iota^A)$ are easily proved by noticing that the vectors $l^a$, $\Omega^{-2}n^a$, $\Omega^{-1}m^a$ and $\Omega^{-1}\bar m^a$ are all smooth and non-vanishing over $\mathscr{I}^-$, and define on $\mathcal{B}_I \cup \mathscr{I}^-$ a normalized Newman–Penrose tetrad for the metric $\hat g$. The same can be done for $(\hat o^A, \hat\iota^A)$ on $\mathscr{I}^+$.
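The limiting behavior of $l^a$ at $\mathscr{I}^-$ used in the proof can be illustrated numerically. The following is only a sketch under stated assumptions: it takes the standard Boyer–Lindquist expressions $\Delta = r^2 - 2Mr + a^2$ and $\rho^2 = r^2 + a^2\cos^2\theta$, the normalized tetrad vector $l^a\partial_a = (2\Delta\rho^2)^{-1/2}\bigl((r^2+a^2)\partial_t + \Delta\partial_r + a\partial_\varphi\bigr)$ rewritten in the coordinates $(t^*, w, \theta, \varphi^*)$ with $w = 1/r$, and illustrative values of $M$ and $a$. The function name is hypothetical.

```python
import math

def tetrad_l_coefficients(w, theta, M=1.0, a=0.5):
    """Coefficients of l^a along (d/dt*, d/dw, d/dphi*) at inverse radius w > 0.

    Assumes the standard Kerr functions Delta and rho^2 (not restated in the text).
    """
    r = 1.0 / w
    delta = r * r - 2.0 * M * r + a * a           # Delta = r^2 - 2Mr + a^2 (assumed)
    rho2 = r * r + (a * math.cos(theta)) ** 2     # rho^2 = r^2 + a^2 cos^2(theta) (assumed)
    pref = math.sqrt(2.0 * delta / rho2)
    c_t = pref * (r * r + a * a) / delta          # coefficient of d/dt*
    c_phi = pref * a / delta                      # coefficient of d/dphi*
    c_w = -w * w * math.sqrt(delta / (2.0 * rho2))  # coefficient of d/dw
    return c_t, c_w, c_phi

# The d/dt* coefficient approaches sqrt(2); the other two go to zero as w -> 0.
for w in (1e-1, 1e-3, 1e-6):
    print(w, tetrad_l_coefficients(w, theta=1.0))
```

As $w$ decreases, the printed triples approach $(\sqrt{2}, 0, 0)$, in line with the statement that $l^a$ restricts to $\sqrt{2}\,\partial_{t^*}$ on $\mathscr{I}^-$.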
8.3.2. Interpretation of the traces on $\mathscr{I}^\pm$

The conformal invariance of the Dirac equation entails that a spinor field $\phi_A \in L^2_{\mathrm{loc}}(\mathcal{B}_I; \mathbb{S}_A)$ satisfies Eq. (2.5) if and only if the rescaled spinor field $\hat\phi_A = \Omega^{-1}\phi_A \in L^2_{\mathrm{loc}}(\mathcal{B}_I; \mathbb{S}_A)$ satisfies on $\mathcal{B}_I$
$$\hat\nabla^{AA'}\hat\phi_A = 0\,, \qquad (8.37)$$
where $\hat\nabla_a$ is the covariant derivative associated with the rescaled metric $\hat g$. We consider for (2.5) initial data $\phi_A(0) \in C_0^\infty(\Sigma_0)$ and $\phi_A \in C^\infty(\mathbb{R}_t, C_0^\infty(\Sigma; \mathbb{S}_A))$ the associated solution. The support of the spinor field $\hat\phi_A$ remains far from $i^0$ and therefore, by standard arguments$^g$ of regularity of the solution of Dirac’s (or Weyl’s) equation on a smooth space-time, $\hat\phi_A$ extends as a solution of (8.37) in $C^\infty(\bar{\mathcal{B}}_I)$. Now, the vector field $\Psi$, defined in relation to $\phi_A$ by (2.51), is related to $\hat\phi_A$ as follows:
$$\Psi = U\left(\frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2}\right)^{1/4}\begin{pmatrix} o^A\phi_A \\ \iota^A\phi_A \end{pmatrix} = U\,\Omega\left(\frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2}\right)^{1/4}\begin{pmatrix} o^A\hat\phi_A \\ \iota^A\hat\phi_A \end{pmatrix}.$$
The matrix $U$, defined in (2.50), as well as its inverse, can be extended smoothly over $\bar{\mathcal{B}}_I$, and the same is true of the quantity
$$\Omega\left(\frac{\Delta\sigma^2\rho^2}{(r^2+a^2)^2}\right)^{1/4}.$$
Moreover, by Proposition 8.2, $o^A$ and $\iota^A$ extend as smooth spinor fields on $\mathcal{B}_I \cup \mathscr{I}^+ \cup \mathscr{I}^-$. All this entails that the vector field $\Psi$ extends as a smooth vector field over $\bar{\mathcal{B}}_I$ (the regularity over $\mathfrak{H}$ was shown in the previous subsection).

Remark 8.6. Noting that $U$ reduces to the identity matrix on $\mathscr{I}^\pm$, Proposition 8.2 entails that the trace of $\Psi$ on $\mathscr{I}^+$ is simply the second component of the trace of $\hat\phi_A$ on $\mathscr{I}^+$ in the spin-frame $(\hat o^A, \hat\iota^A)$ that is regular and non-degenerate on $\mathscr{I}^+$.

$^g$ These arguments are explained in detail for a nonlinear wave equation in [47]. Their application to the Dirac case is identical.
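Similarly, the trace of $\Psi$ on $\mathscr{I}^-$ is the first component of the trace of $\hat\phi_A$ on $\mathscr{I}^-$ in the spin-frame $(\check o^A, \check\iota^A)$ that is regular and non-degenerate on $\mathscr{I}^-$.

Arguments similar to the ones used for the horizon now allow us to interpret the inverse wave operators at infinity as trace operators on $\mathscr{I}^\pm$.

Theorem 8.3. We denote by $F^+_{\mathscr{I}}$ (respectively $F^-_{\mathscr{I}}$) the $C^\infty$ diffeomorphism from $\mathscr{I}^+$ (respectively $\mathscr{I}^-$) onto $\Sigma_0$ defined by identifying points along outgoing (respectively incoming) principal null geodesics in $\bar{\mathcal{B}}_I$. They have the following explicit expressions ($F^+_{\mathscr{I}}$ being defined in terms of star-Kerr coordinates and $F^-_{\mathscr{I}}$ in terms of Kerr-star coordinates):
$$F^+_{\mathscr{I}}({}^*t, w = 0, \theta, {}^*\varphi) = \bigl(t = 0,\ r = T^{-1}(-{}^*t),\ \theta,\ \varphi = {}^*\varphi + \Lambda(T^{-1}(-{}^*t))\bigr)\,,$$
$$F^-_{\mathscr{I}}(t^*, w = 0, \theta, \varphi^*) = \bigl(t = 0,\ r = T^{-1}(t^*),\ \theta,\ \varphi = \varphi^* - \Lambda(T^{-1}(t^*))\bigr)\,.$$
We define, on $\mathscr{I}^\pm$, volume measures $d\mathrm{Vol}_{\mathscr{I}^\pm}$ as the pull-backs of the measure $dr_*\,d\omega$ on $\Sigma_0$ by the diffeomorphisms $F^\pm_{\mathscr{I}}$. The trace operators $T^\pm_{\mathscr{I}}$, which to initial data $\Psi_{\Sigma_0} \in C_0^\infty(\Sigma_0)$ for (2.56) associate the smooth trace on $\mathscr{I}^\pm$ of the associated solution $\Psi$, extend as bounded operators from $\mathcal{H}$ to $L^2(\mathscr{I}^\pm; d\mathrm{Vol}_{\mathscr{I}^\pm})$. They are related to the inverse wave operators $\tilde W^\pm_{\infty,pn}$ via the diffeomorphisms $F^\pm_{\mathscr{I}}$:
$$(\tilde W^\pm_{\infty,pn}\Psi_{\Sigma_0})(r, \theta, \varphi) = (T^\pm_{\mathscr{I}}\Psi_{\Sigma_0})\bigl((F^\pm_{\mathscr{I}})^{-1}(r, \theta, \varphi)\bigr)\,. \qquad (8.38)$$
In other words, $T^\pm_{\mathscr{I}} = (F^\pm_{\mathscr{I}})^*\,\tilde W^\pm_{\infty,pn}$.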
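The identification maps of Theorem 8.3 can be sketched numerically. The code below is an illustration under assumptions, not the authors' construction: it takes $T$ to be the tortoise coordinate $r \mapsto r_*$ with $dr_*/dr = (r^2+a^2)/\Delta$ and $\Lambda$ with $d\Lambda/dr = a/\Delta$, fixes their integration constants arbitrarily, and uses illustrative values of $M$ and $a$. All function names are hypothetical.

```python
import math

M, A = 1.0, 0.5                          # illustrative mass and rotation parameter
R_PLUS = M + math.sqrt(M * M - A * A)    # outer root of Delta = r^2 - 2Mr + a^2
R_MINUS = M - math.sqrt(M * M - A * A)   # inner root of Delta

def T(r):
    """Tortoise coordinate r_* for r > r_+, with dr_*/dr = (r^2 + a^2)/Delta.

    The additive constant is an arbitrary choice made for this sketch.
    """
    k = 2.0 * M / (R_PLUS - R_MINUS)
    return r + k * (R_PLUS * math.log(r - R_PLUS) - R_MINUS * math.log(r - R_MINUS))

def Lam(r):
    """Angular shift Lambda(r) for r > r_+, with dLambda/dr = a/Delta."""
    return A / (R_PLUS - R_MINUS) * math.log((r - R_PLUS) / (r - R_MINUS))

def T_inv(r_star):
    """Invert the increasing map T on (r_+, infinity) by bisection.

    Intended for moderate values of r_star; very negative values would need
    more careful arithmetic near the horizon.
    """
    lo, hi = R_PLUS + 1.0, R_PLUS + 2.0
    while T(lo) > r_star:                 # move lo towards the horizon
        lo = R_PLUS + 0.5 * (lo - R_PLUS)
    while T(hi) < r_star:                 # move hi outwards
        hi = R_PLUS + 2.0 * (hi - R_PLUS)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if T(mid) < r_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F_minus(t_star, theta, phi_star):
    """Point of Sigma_0 (t = 0) reached from (t*, w = 0, theta, phi*) on past null infinity."""
    r = T_inv(t_star)
    return r, theta, phi_star - Lam(r)

def F_plus(star_t, theta, star_phi):
    """Point of Sigma_0 (t = 0) reached from (*t, w = 0, theta, *phi) on future null infinity."""
    r = T_inv(-star_t)
    return r, theta, star_phi + Lam(r)

print(F_minus(3.0, 1.0, 0.2))
print(F_plus(-3.0, 1.0, 0.2))
```

Relation (8.38) then says that evaluating the inverse wave operator at a point $(r, \theta, \varphi)$ of $\Sigma_0$ amounts to evaluating the trace at the preimage of that point under the corresponding map.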
8.4. The Goursat problem

Putting together Theorems 8.2 and 8.3, we obtain the interpretation of the scattering theory for Eq. (2.56) as the solution of a singular Goursat problem on $\bar{\mathcal{B}}_I$:

Theorem 8.4. The global inverse wave operator $\tilde W^+_{pn}$ (respectively $\tilde W^-_{pn}$) is a representation of the trace operator that, to initial data for (2.56), associates the trace of the solution $\Psi$ on the null hypersurface in $\bar{\mathcal{B}}_I$, singular at its vertex, $\mathfrak{H}^+ \cup \mathscr{I}^+$ (respectively $\mathfrak{H}^- \cup \mathscr{I}^-$). The scattering theory of Theorem 8.1 states that these operators are isomorphisms (even isometries) from $\mathcal{H}$ onto $\mathcal{H}^- \oplus \mathcal{H}^+$ (respectively $\mathcal{H}^+ \oplus \mathcal{H}^-$), i.e. that the solutions are completely and uniquely determined by their trace on $\mathfrak{H}^+ \cup \mathscr{I}^+$ (respectively $\mathfrak{H}^- \cup \mathscr{I}^-$). This is exactly saying that the Goursat problem for (2.56) is well posed on $\mathfrak{H}^+ \cup \mathscr{I}^+$ (respectively $\mathfrak{H}^- \cup \mathscr{I}^-$). The direct wave operator $W^+_{pn}$ (respectively $W^-_{pn}$) then solves this Goursat problem by associating to the null data the initial data on $\Sigma_0$ of the unique corresponding solution. More precisely, we define the future trace operator
$$T_F :\ \mathcal{H} \to L^2(\mathfrak{H}^+; d\mathrm{Vol}_{\mathfrak{H}^+}) \oplus L^2(\mathscr{I}^+; d\mathrm{Vol}_{\mathscr{I}^+}) =: \mathcal{H}_F\,,$$
$$\Psi_{\Sigma_0} \mapsto (T^+_{\mathfrak{H}}\Psi_{\Sigma_0},\ T^+_{\mathscr{I}}\Psi_{\Sigma_0}) = \bigl((F^+_{\mathfrak{H}})^*\,\tilde W^+_{H,pn}\Psi_{\Sigma_0},\ (F^+_{\mathscr{I}})^*\,\tilde W^+_{\infty,pn}\Psi_{\Sigma_0}\bigr)\,.$$
$T_F$ is an isomorphism, i.e. for each $\Phi \in \mathcal{H}_F$ there exists a unique solution $\Psi \in C(\mathbb{R}_t; \mathcal{H})$ of (2.56) such that $\Phi = T_F\Psi(0)$. A similar formulation is valid for the past.

Remark 8.7. It is interesting to remember that the hypersurface $\mathfrak{H}^+ \cup \mathscr{I}^+$ (respectively $\mathfrak{H}^- \cup \mathscr{I}^-$) on which the Goursat problem is solved is singular at its vertex because the conformal metric is singular there. This means that there is no choice of conformal factor $\Omega$ that would make the corresponding rescaled metric regular and non-degenerate at $i^\pm$. In the time-dependent scattering theory that we have constructed here, this singularity is not really seen; we have two separate asymptotic regions and $i^\pm$ are considered as points at infinity on $\mathscr{I}^\pm$ and $\mathfrak{H}^\pm$ (similarly $i^0$ is understood as a point at infinity on $\mathscr{I}^\pm$).

Acknowledgments

The authors would like to warmly thank Christian Gérard for support and enlightening conversations.

References

[1] W. Amrein, A. Boutet de Monvel and V. Georgescu, $C_0$-groups, Commutator Methods and Spectral Theory of N-body Hamiltonians (Birkhäuser Verlag, 1996).
[2] A. Bachelot, Gravitational scattering of electromagnetic field by a Schwarzschild black hole, Ann. Inst. H. Poincaré Phys. Théor. 54 (1991) 261–320.
[3] A. Bachelot, Asymptotic completeness for the Klein–Gordon equation on the Schwarzschild metric, Ann. Inst. H. Poincaré Phys. Théor. 61(4) (1994) 411–441.
[4] A. Bachelot, Quantum vacuum polarization at the black-hole horizon, Ann. Inst. H. Poincaré Phys. Théor. 67(2) (1997) 181–222.
[5] A. Bachelot, The Hawking effect, Ann. Inst. H. Poincaré Phys. Théor. 70(1) (1999) 41–99.
[6] A. Bachelot, Creation of fermions at the charged black-hole horizon, Ann. Henri Poincaré 1(6) (2000) 1043–1095.
[7] A. Bachelot and A. Motet-Bachelot, Les résonances d’un trou noir de Schwarzschild, Ann. Inst. H. Poincaré Phys. Théor. 59(1) (1993) 3–68.
[8] R. H. Boyer and R. W. Lindquist, Maximal analytic extension of the Kerr metric, J. Math. Phys. 8 (1967) 265–281.
[9] P. J. M. Bongaarts and S. N. M. Ruijsenaars, Scattering theory for one-dimensional step potentials, Ann. Inst. H. Poincaré Phys. Théor. 26 (1977) 1–17.
[10] P. J. M. Bongaarts and S. N. M. Ruijsenaars, The Klein paradox as a many particle problem, Ann. Phys. 101 (1976) 289–318.
[11] S. Chandrasekhar, The solution of Dirac’s equation in Kerr geometry, Proc. Roy. Soc. London A349 (1976) 571–575.
[12] S. Chandrasekhar, The Mathematical Theory of Black Holes (Oxford University Press, 1983).
[13] T. Damour, Klein paradox and vacuum polarization, in Proc. 1st Marcel Grossmann Meeting, ed. R. Ruffini (North Holland, 1977), pp. 459–482.
[14] S. De Bièvre, P. Hislop and I. M. Sigal, Scattering theory for the wave equation on non-compact manifolds, Rev. Math. Phys. 4 (1992) 575–618.
[15] J. Dereziński and C. Gérard, Scattering Theory of Classical and Quantum N-particle Systems (Springer, 1997).
[16] A. DeVries, The evolution of the Dirac field in curved space-times, Manuscripta Math. 88(2) (1995) 233–246.
[17] A. DeVries, The evolution of the Weyl and Maxwell fields in curved space-times, Math. Nachr. 179 (1996) 27–45.
[18] J. Dimock, Scattering for the wave equation on the Schwarzschild metric, Gen. Relativity Gravitation 17(4) (1985) 353–369.
[19] J. Dimock and B. S. Kay, Scattering for massive scalar fields on Coulomb potentials and Schwarzschild metrics, Classical Quantum Gravity 3 (1986) 71–80.
[20] J. Dimock and B. S. Kay, Classical and quantum scattering theory for linear scalar fields on the Schwarzschild metric I, Ann. Phys. 175 (1987) 366–426.
[21] J. Dimock and B. S. Kay, Classical and quantum scattering theory for linear scalar fields on the Schwarzschild metric II, J. Math. Phys. 27 (1986) 2520–2525.
[22] F. Finster, N. Kamran, J. Smoller and S.-T. Yau, Nonexistence of time-periodic solutions of the Dirac equation in an axisymmetric black hole geometry, Comm. Pure Appl. Math. 53(7) (2000) 902–929.
[23] F. Finster, N. Kamran, J. Smoller and S.-T. Yau, Decay Rates and Probability Estimates for Massive Dirac Particles in the Kerr–Newman Black Hole Geometry, Comm. Math. Phys. 230(2) (2002) 201–244.
[24] R. Froese and P. Hislop, Spectral analysis of second-order elliptic operators on non-compact manifolds, Duke Math. J. 58 (1989) 103–129.
[25] V. Georgescu and C. Gérard, On the virial theorem in quantum mechanics, Comm. Math. Phys. 208 (1999) 275–281.
[26] C. Gérard and I. Laba, Multiparticle Quantum Scattering in Constant Magnetic Fields, Mathematical Surveys and Monographs, Vol. 90 (American Mathematical Society, 2002).
[27] R. P. Geroch, Spinor structure of space-times in general relativity I, J. Math. Phys. 9 (1968) 1739–1744.
[28] R. P. Geroch, Spinor structure of space-times in general relativity II, J. Math. Phys. 11 (1970) 342–348.
[29] R. P. Geroch, The domain of dependence, J. Math. Phys. 11 (1970) 437–449.
[30] G. F. R. Ellis and S. W. Hawking, The Large Scale Structure of Space-time, Cambridge Monographs in Mathematical Physics (Cambridge University Press, 1973).
[31] D. Häfner, Complétude asymptotique pour l’équation des ondes dans une classe d’espaces-temps stationnaires et asymptotiquement plats, Ann. Inst. Fourier 51 (2001) 779–833.
[32] D. Häfner, Sur la théorie de la diffusion pour l’équation de Klein–Gordon dans la métrique de Kerr, Dissertationes Mathematicae 421 (2003).
[33] T. Ikebe and J. Uchiyama, On the asymptotic behavior of eigenfunctions of second-order elliptic operators, J. Math. Kyoto Univ. 11 (1971) 425–448.
[34] W. M. Jin, Scattering of massive Dirac fields on the Schwarzschild black hole spacetime, Classical Quantum Gravity 15(10) (1998) 3163–3175.
[35] R. P. Kerr, Gravitational field of a spinning mass as an example of algebraically special metrics, Phys. Rev. Lett. 11 (1963) 237–238.
[36] W. Kinnersley, Type D vacuum metrics, J. Math. Phys. 10 (1969) 1195–1203.
[37] O. Klein, Die Reflexion von Elektronen an einem Potentialsprung nach der relativistischen Dynamik von Dirac, Z. Phys. 53 (1929) 157–165.
[38] L. J. Mason and J.-P. Nicolas, Conformal scattering and the Goursat problem, to appear in J. Hyperbolic Differential Equations.
[39] F. Melnyk, Scattering on Reissner–Nordstrøm metric for massive charged spin 1/2 fields, Ann. H. Poincaré 4(5) (2003) 813–846.
[40] F. Melnyk, The Hawking effect for spin 1/2 fields, Commun. Math. Phys. 244(3) (2004) 483–525.
[41] C. W. Misner, K. Thorne and J. A. Wheeler, Gravitation (Freeman, San Francisco, 1973).
[42] E. Mourre, Absence of singular continuous spectrum for certain selfadjoint operators, Commun. Math. Phys. 78 (1981) 391–408.
[43] J.-P. Nicolas, Non linear Klein–Gordon equation on Schwarzschild like metrics, J. Math. Pures Appl. 74(9) (1995) 35–58.
[44] J.-P. Nicolas, Scattering of linear Dirac fields by a spherically symmetric black hole, Ann. Inst. H. Poincaré Phys. Théor. 62(2) (1995) 145–179.
[45] J.-P. Nicolas, Global exterior Cauchy problem for spin 3/2 zero rest-mass fields in the Schwarzschild space-time, Comm. Partial Differential Equations 22(3,4) (1997) 465–502.
[46] J.-P. Nicolas, Dirac fields on asymptotically flat space-times, Dissertationes Mathematicae 408 (2002).
[47] J.-P. Nicolas, A non linear Klein–Gordon equation on Kerr metrics, J. Math. Pures Appl. 81(9) (2002) 885–914.
[48] B. O’Neill, The Geometry of Kerr Black Holes (A. K. Peters, Wellesley, 1995).
[49] R. Penrose, Zero rest-mass fields including gravitation: asymptotic behavior, Proc. Roy. Soc. A284 (1965) 159–203.
[50] R. Penrose and W. Rindler, Spinors and Space-time, Cambridge Monographs on Mathematical Physics, Vols. I & II (Cambridge University Press, 1984 & 1986).
[51] A. Z. Petrov, The classification of spaces defining gravitational fields, in Scientific Proc. Kazan State University (named after V. I. Ulyanov-Lenin), Jubilee (1804–1954) Collection 114(8) (1954) 55–69; translation by J. Jezierski and M. A. H. MacCallum, with introduction by M. A. H. MacCallum, Gen. Relativity Gravitation 32 (2000) 1661–1685.
[52] A. Sá Barreto and M. Zworski, Distribution of resonances for spherical black holes, Math. Res. Lett. 4 (1997) 103–121.
[53] K. Schwarzschild, Über das Gravitationsfeld eines Massenpunktes nach der Einsteinschen Theorie, Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften 1 (1916) 189–196.
[54] E. Stiefel, Richtungsfelder und Fernparallelismus in n-dimensionalen Mannigfaltigkeiten, Comment. Math. Helv. 8 (1936) 305–353.
[55] B. Thaller, The Dirac Equation, Texts and Monographs in Physics (Springer, 1992).
[56] S. A. Teukolsky, Perturbations of a rotating black hole. I. Fundamental equations for gravitational, electromagnetic, and neutrino-field perturbations, Astrophys. J. 185 (1973) 635–647.
[57] W. G. Unruh, Separability of the neutrino equations in a Kerr background, Phys. Rev. Lett. 31(20) (1973) 1265–1267.
[58] R. M. Wald, General Relativity (The University of Chicago Press, 1984).
March 2, 2004 14:47 WSPC/148-RMP
00192
Reviews in Mathematical Physics Vol. 16, No. 1 (2004) 125–146 © World Scientific Publishing Company
NONUNITAL SPECTRAL TRIPLES ASSOCIATED TO DEGENERATE METRICS
A. RENNIE School of Mathematical and Physical Sciences University of Newcastle Callaghan, NSW, 2308, Australia Received 22 July 2002 Revised 9 January 2004 We show that one can define (p, ∞)-summable spectral triples using degenerate metrics on smooth manifolds. Furthermore, these triples satisfy Connes–Moscovici’s discrete and finite dimension spectrum hypothesis, allowing one to use the Local Index Theorem [1] to compute the pairing with K-theory. We demonstrate this with a concrete example. Keywords: Spectral triples; index theorem; K-theory; K-homology.
1. Introduction

Let $X$ be a $p$-dimensional, geodesically complete, paracompact, σ-compact Riemannian spin manifold with metric $g$. Our aim is to show that if $\tilde g$ is another “metric” which is allowed to be degenerate on a submanifold of measure zero, then a $(p, \infty)$-summable spectral triple can be constructed by employing the Dirac operator associated to this degenerate metric.

The next section makes some preliminary definitions and fixes notation. Some of these definitions are modifications of standard definitions, necessary to encompass the nonunital setting. More information can be found in [2, 3]. Section 3 describes the spectral triples and presents our main theorems. The final section provides a detailed example.

The original aim of the constructions in this paper was to find an explicit example of a spectral triple with nonsimple dimension spectrum. This would provide an example where the index pairing could be computed using Connes and Moscovici’s Local Index Theorem [1], and hopefully there would be contributions arising from the higher order poles. This would be of benefit in obtaining greater understanding of the various terms in the Local Index Theorem.

This original aim failed, but several interesting results were obtained. This paper shows that one must work quite hard to obtain such an example. The “Dirac” operator of our main example seems to contain a double pole in its
zeta function; however, when one considers, for a smooth function $a$,
$$s \mapsto \mathrm{Trace}(a(1 + D^2)^{-s})\,, \qquad \mathrm{Re}(s) > 1\,,$$
either there is a simple pole at $s = 1$, or the operator $a(1 + D^2)^{-s}$ is not compact for any $s \in \mathbb{C}$, so the trace fails to make sense. In some sense there is a trade-off between the size of the (nonzero) point spectrum and the kernel of $D$, so that the zeta function has a simple pole, or we do not even obtain a spectral triple.

2. Definitions

Here we review the relevant definitions, language and notation we will employ in the remainder of the paper.

Definition 1. A ∗-algebra $\mathcal{A}$ is smooth if it is Fréchet and ∗-isomorphic to a proper dense subalgebra $i(\mathcal{A})$ of a $C^*$-algebra $A$ which is stable under the holomorphic functional calculus.

Definition 2. An algebra $\mathcal{A}$ has local units if for every finite subset of elements $\{a_i\}_{i=1}^n \subset \mathcal{A}$, there exists $\phi \in \mathcal{A}$ such that for each $i$, $\phi a_i = a_i\phi = a_i$.

Definition 3. Let $\mathcal{A}$ be a Fréchet algebra and $\mathcal{A}_c \subset \mathcal{A}$ be a dense ideal with local units. Then we call $\mathcal{A}$ a local algebra (when $\mathcal{A}_c$ is understood).

Remark. Localizable would be a more descriptive word, and local is overused, but it will do for now. Note that unital algebras are automatically local. Furthermore, the dense ideal $\mathcal{A}_c$ is saturated in the following sense: if $a \in \mathcal{A}$ and there exists $\phi \in \mathcal{A}_c$ such that $\phi a = a\phi = a$, then $a \in \mathcal{A}_c$. This follows because $\mathcal{A}_c$ is an ideal.

Example. The basic example of a smooth local algebra is $C_0^\infty(X)$, where $X$ is a noncompact manifold, and $C_0^\infty(X)$ denotes the smooth functions all of whose derivatives vanish at infinity. This is Fréchet, stable under the holomorphic functional calculus, and the dense ideal of compactly supported functions has local units.

Numerous properties of, and constructions with, local algebras are presented in [2, 3]. Next we present the definition of spectral triples appropriate to our situation, modelled on Connes’ definitions [9, Chap. VI].

Definition 4. A spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is given by:
(1) A representation $\pi : \mathcal{A} \to B(\mathcal{H})$ of a local ∗-algebra $\mathcal{A}$ on the Hilbert space $\mathcal{H}$.
(2) A self-adjoint (unbounded, densely defined) operator $D : \mathrm{dom}\,D \to \mathcal{H}$ such that $[D, a]$ extends to a bounded operator on $\mathcal{H}$ for all $a \in \mathcal{A}$ and $a(1 + D^2)^{-1/2}$ is compact for all $a \in \mathcal{A}$.
The triple is said to be even if there is an operator $\Gamma = \Gamma^*$ such that $\Gamma^2 = 1$, $[\Gamma, a] = 0$ for all $a \in \mathcal{A}$ and $\Gamma D + D\Gamma = 0$ (i.e. $\Gamma$ is a $\mathbb{Z}_2$-grading such that $D$ is odd and $\mathcal{A}$ is even). Otherwise the triple is said to be odd.

Definition 5. If $(\mathcal{A}, \mathcal{H}, D)$ is a spectral triple, then we define $\Omega^*_D(\mathcal{A})$ to be the algebra generated by $\mathcal{A}$ and $[D, \mathcal{A}]$.

Definition 6. A spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is smooth if
$$\mathcal{A} \ \text{and}\ [D, \mathcal{A}] \subseteq \bigcap_{m \geq 0} \mathrm{dom}\,\delta^m\,,$$
where for $x \in B(\mathcal{H})$, $\delta(x) = [|D|, x]$.

Remark. Note the difference between the definitions of smooth for topological algebras and spectral triples. In [4] such triples are called regular. In fact we have the following [2]:

Lemma 7. If $(\mathcal{A}, \mathcal{H}, D)$ is a smooth spectral triple, then $(\mathcal{A}_\delta, \mathcal{H}, D)$ is also a smooth spectral triple, where $\mathcal{A}_\delta$ is the completion of $\mathcal{A}$ in the locally convex topology determined by the seminorms
$$q_n(a) = \|\delta^n(a)\|_D\,, \quad\text{where}\quad \|a\|_D = \|a\| + \|[D, a]\|\,.$$
Moreover, $\mathcal{A}_\delta$ is a smooth algebra.

The following definition is, if not crucial, hugely simplifying for summability issues [2, 3].

Definition 8. A local spectral triple $(\mathcal{A}, \mathcal{H}, D)$ is a spectral triple such that there exists a local approximate unit $\{\phi_n\} \subset \mathcal{A}_c$ for $\mathcal{A}$ satisfying
$$\Omega^*_D(\mathcal{A}_c) = \bigcup_n \Omega^*_D(\mathcal{A})_n\,, \qquad \Omega^*_D(\mathcal{A})_n = \{\omega \in \Omega^*_D(\mathcal{A}) : \phi_n\omega = \omega\phi_n = \omega\}\,.$$
Thus we may assume without loss of generality that a local spectral triple has a local approximate unit $\{\phi_n\}_{n \geq 1} \subset \mathcal{A}_c$ such that $\phi_{n+1}\phi_n = \phi_n$ and $\phi_{n+1}[D, \phi_n] = [D, \phi_n]$.

Definition 9. A local spectral triple is $(p, \infty)$-summable if $p \geq 1$ and for all $\lambda$ in the resolvent set of $D$
$$a(D - \lambda)^{-1} \in \mathcal{L}^{(p,\infty)}(\mathcal{H})\,, \qquad \forall\, a \in \mathcal{A}_c\,.$$
We call it θ-summable if
$$\mathrm{Trace}(a e^{-t(1 + D^2)}) < \infty \quad\text{for all } a \in \mathcal{A}_c \text{ and } t > 0\,.$$
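As an illustration of what $(p, \infty)$-summability means in the simplest unital case, one can take $D = -i\,d/d\theta$ on the circle, so that $p = 1$ and $(1 + D^2)^{-1/2}$ has eigenvalues $(1 + n^2)^{-1/2}$, $n \in \mathbb{Z}$. This example is not taken from the text, and it assumes the usual characterization of $\mathcal{L}^{(1,\infty)}$ by logarithmic growth of the partial sums of singular values. The sketch below shows that the partial sums are not bounded (so the operator is not trace class) but grow like $2\log N$.

```python
import math

def partial_trace(N):
    """Sum of the eigenvalues (1 + n^2)^(-1/2) of (1 + D^2)^(-1/2) over |n| <= N,
    for the illustrative choice D = -i d/dtheta on the circle."""
    return 1.0 + 2.0 * sum(1.0 / math.sqrt(1.0 + n * n) for n in range(1, N + 1))

# The ratio to log N stays bounded and slowly approaches 2.
for N in (10**2, 10**4, 10**6):
    print(N, partial_trace(N) / math.log(N))
```

The printed ratios decrease slowly towards 2, which is the kind of logarithmic divergence that a Dixmier trace is designed to measure.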
Remark. If $\mathcal{A}$ is unital, $\ker D$ is finite dimensional. This case is well described in the literature, see for instance [9, Chap. VI] and [5]. Note that the summability requirements are only for $a \in \mathcal{A}_c$. We do not assume that elements of the algebra $\mathcal{A}$ are all “integrable”. Note that Lemma 7 does not guarantee that elements of the completion of $\mathcal{A}$ for the seminorms arising from the derivation $\delta$ satisfy the above summability condition in the nonunital case. Of course, there is no difficulty in the unital case.

In [3], we show that if $(\mathcal{A}, \mathcal{H}, D)$ is a $(p, \infty)$-summable local spectral triple, then the operator $A = a(1 + D^2)^{-p/2} \in \mathcal{L}^{(1,\infty)}(\mathcal{H})$. As such, we may apply any Dixmier trace $\mathrm{Tr}_\omega$ [9, IV.2.β] to the operator $A$. An operator $T \in \mathcal{L}^{(1,\infty)}(\mathcal{H})$ is called measurable if the number $\mathrm{Tr}_\omega(T)$ is independent of the choice of Dixmier trace $\mathrm{Tr}_\omega$.

The other main summability requirement is Connes–Moscovici’s “discrete and finite dimension spectrum” hypothesis [1, 5].

Definition 10. A smooth spectral triple $(\mathcal{A}, \mathcal{H}, D)$ has discrete dimension spectrum $Sd$ if the set $Sd \subset \{z \in \mathbb{C} : \mathrm{Re}(z) \leq p\}$, $p \geq 1$, is discrete and for any $b \in B(\mathcal{A}_c)$ the function
$$\zeta_b(z) := \mathrm{Trace}(b(1 + D^2)^{-z/2})\,, \qquad (1)$$
is defined for all $z \in \mathbb{C}$ with $\mathrm{Re}(z) > p$ and extends holomorphically to $\mathbb{C} \setminus Sd$. Furthermore we require that $\Gamma(z)\zeta_b(z)$ is of rapid decay on vertical lines with $\mathrm{Re}(z) > 0$. We say that the discrete dimension spectrum $Sd$ is of finite multiplicity $k$ if for all $b \in B(\mathcal{A}_c)$, $\zeta_b$ has a pole of order at most $k$. We say that $Sd$ is simple if $k = 1$.

Here $\Gamma$ denotes the gamma function, and $B(\mathcal{A}_c)$ is the algebra generated by $\delta^k(a)$, $\delta^n([D, a])$ for $a \in \mathcal{A}_c$ and $k, n \geq 0$. It is important to note that in the case of simple dimension spectrum, this definition implies the measurability of all the operators $b(1 + D^2)^{-p/2}$, $b \in B(\mathcal{A}_c)$, by [3, Corollary 18].

The Local Index Theorem of Connes–Moscovici [1, 3] computes the index pairing between the K-theory of the algebra $\mathcal{A}$ and a smooth local spectral triple with discrete and finite dimension spectrum. In the following $k = (k_1, \ldots, k_n) \in \mathbb{N}^n$, $|k| = k_1 + \cdots + k_n$, $da = [D, a]$ for $a \in \mathcal{A}$, and $(da)^{(k)} = \nabla^k(da)$ where $\nabla(T) = [D^2, T]$. Finally, if the function
$$z \mapsto \mathrm{Trace}(T(1 + D^2)^{-m-z})\,, \qquad T \in B(\mathcal{H})\,,$$
has a Laurent expansion around $z = 0$, let $\tau_q(T(1 + D^2)^{-m})$ be the coefficient of $z^{-q-1}$ in this expansion.
Theorem 11 (Local Index Theorem [1, 3]). Let $(\mathcal{A}, \mathcal{H}, D)$ be a smooth local spectral triple, with discrete and finite dimension spectrum contained in the half plane $\{z : \mathrm{Re}(z) \leq p\}$. Then if $\mathcal{A}$ is unital, the following formulas define the components of a cyclic cocycle in the $(b, B)$ bicomplex of $\mathcal{A}$ whose class coincides with the class of the Chern character in $HC^*(\mathcal{A})$. If $\mathcal{A}$ is nonunital, then the following formulas define cyclic cocycles in the distributional sense, and their class coincides with that of the Chern character in the cyclic cohomology $HC^*(\mathcal{A}_c)$.

(a) For $(\mathcal{A}, \mathcal{H}, D)$ even and summing over $q \leq |k| + \frac{n}{2}$ and $|k| + n \leq p$,
$$\phi_n(a_0, \ldots, a_n) = \sum_{k,q} \frac{(-1)^{|k|}}{k_1! \cdots k_n!}\,\alpha_{k,n}\,\sigma_q\!\left(|k| + \frac{n}{2}\right)\tau_q\!\left(\Gamma a_0 (da_1)^{(k_1)} \cdots (da_n)^{(k_n)} (1 + D^2)^{-\frac{2|k|+n}{2}}\right)$$
for $n \neq 0$ even, while $\phi_0(a_0) = \tau_{-1}(\Gamma a_0)$ where
$$\tau_{-1}(b) = \mathrm{res}_{z=0}\, z^{-1}\,\mathrm{Trace}(b(1 + D^2)^{-z})\,.$$
The $\sigma_q$ are the symmetric functions of the numbers $1, 2, \ldots, |k| + \frac{n}{2}$,
$$\prod_{i=1}^{|k|+\frac{n}{2}} (s + i) = \sum_{j=0}^{|k|+\frac{n}{2}-1} \sigma_j\!\left(|k| + \frac{n}{2}\right) s^j\,,$$
and
$$\alpha_{k,n}^{-1} = (k_1 + 1)(k_1 + k_2 + 2) \cdots (k_1 + k_2 + \cdots + k_n + n)\,.$$

(b) For $(\mathcal{A}, \mathcal{H}, D)$ odd and summing over $q \leq |k| + \frac{n-1}{2}$ and $|k| + n \leq p$,
$$\phi_n(a_0, \ldots, a_n) = \sqrt{2\pi i} \sum_{k,q} \frac{(-1)^{|k|}}{k_1! \cdots k_n!}\,\alpha_{k,n}\,\sigma_{m-q}(m)\,\tau_q\!\left(a_0 (da_1)^{(k_1)} \cdots (da_n)^{(k_n)} (1 + D^2)^{-\frac{2|k|+n}{2}}\right)$$
where $m = |k| + \frac{n-1}{2}$ and $\sigma_j$ is defined by
$$\prod_{l=0}^{m-1}\left(z + \frac{2l+1}{2}\right) = \sum_j z^j\,\sigma_{m-j}(m)\,.$$

This statement is slightly different to that in [1], in that it has been extended to the nonunital case as described in [3]. More details can be found in these papers.
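The coefficients $\sigma_j(m)$ entering the odd case are simply the coefficients of an explicit polynomial, so they can be tabulated mechanically. The following sketch expands $\prod_{l=0}^{m-1}(z + \frac{2l+1}{2})$ with exact rational arithmetic and reads off $\sigma_{m-j}(m)$ as the coefficient of $z^j$; the function name is hypothetical and the code only illustrates this one defining identity.

```python
from fractions import Fraction

def odd_sigma(m):
    """Return [sigma_0(m), ..., sigma_m(m)], where the coefficient of z^j in
    prod_{l=0}^{m-1} (z + (2l+1)/2) is sigma_{m-j}(m)."""
    coeffs = [Fraction(1)]                      # polynomial 1, in increasing powers of z
    for l in range(m):
        root = Fraction(2 * l + 1, 2)
        shifted = [Fraction(0)] + coeffs        # z * polynomial
        scaled = [root * c for c in coeffs] + [Fraction(0)]  # (2l+1)/2 * polynomial
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    # coeffs[j] multiplies z^j and equals sigma_{m-j}(m); reverse to index by subscript
    return coeffs[::-1]

for m in range(4):
    print(m, odd_sigma(m))
```

For instance, $m = 2$ gives $(z + \tfrac12)(z + \tfrac32) = z^2 + 2z + \tfrac34$, so $\sigma_0(2) = 1$, $\sigma_1(2) = 2$ and $\sigma_2(2) = \tfrac34$.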
3. Construction of the Triples

Let $X$ be a $p$-dimensional, geodesically complete, paracompact, σ-compact Riemannian spin manifold with metric $g$. Let $S_{\mathbb{C}} \to X$ be the complex spinor bundle canonically associated to the spin structure [6], and $D : \Gamma(S_{\mathbb{C}}) \to \Gamma(S_{\mathbb{C}})$ the Dirac operator of the spin structure. So in local coordinates $x^1, \ldots, x^p$ we have
$$dx^i \cdot dx^j + dx^j \cdot dx^i = -2g(dx^i, dx^j)\,, \qquad D = \sum_{j=1}^{p} dx^j \cdot \nabla^{LC}_j\,,$$
where $\cdot$ denotes Clifford multiplication and $\nabla^{LC}$ is the lift of the Levi–Civita connection on the cotangent bundle to the spinor bundle. Finally, let $\omega_{\mathbb{C}}$ be the complex volume form [6], which in local coordinates is given by
$$\omega_{\mathbb{C}} = i^{[\frac{p+1}{2}]}\, dx^1 \cdots dx^p\,.$$
If we define $C_0^\infty(X)$ to be the smooth complex-valued functions all of whose partial derivatives vanish at infinity, we have:

Proposition 12. The tuple $(C_0^\infty(X), L^2(X, S_{\mathbb{C}}, g), D, \omega_{\mathbb{C}})$ is a spectral triple. It is $(p, \infty)$-summable, where $p = \dim X$, and has discrete and simple dimension spectrum. For all compactly supported $a \in C_0^\infty(X)$, the operator $a(1 + D^2)^{-p/2}$ is measurable, and so for any Dixmier trace
$$\mathrm{Tr}_\omega(a(1 + D^2)^{-p/2}) = c(p)\int_X a(x)\,d\mathrm{Vol}(x)\,,$$
where $c(p)$ is a constant depending only on $p$ and $d\mathrm{Vol}$ is the Riemannian volume form.

Proof. In [2] it is shown that for a complete spin manifold, the topology (on smooth functions) of convergence in the seminorms $q_n(a) = \|\delta^n(a)\|_D$, $a : X \to \mathbb{C}$, $\delta(a) = [|D|, a]$, is the topology of uniform convergence of all derivatives. Thus by Lemma 7, it suffices to show that $(C_c^\infty(X), L^2(X, S_{\mathbb{C}}, g), D, \omega_{\mathbb{C}})$ is a spectral triple, where $C_c^\infty(X)$ denotes the smooth compactly supported functions.

The first step is to show that $D$ is essentially self-adjoint, and so can be extended to a closed self-adjoint operator on $L^2(X, S_{\mathbb{C}}, g)$. An integration by parts shows that $D$ is symmetric. The completeness of $X$ and the finite propagation speed of the Dirac operator allow us to employ [7, Proposition 10.2.11], which shows that $D$ is essentially self-adjoint.

The compactness and $(p, \infty)$-summability results are proven in [3]. In particular, for all compactly supported functions $a$ on $X$, $a(1 + D^2)^{-p/2} \in \mathcal{L}^{(1,\infty)}(L^2(X, S_{\mathbb{C}}, g))$. The statements on the dimension spectrum are implied by Seeley’s results [8, Theorem 4 and Sec. 2], namely that for any function $a$ with support contained in a single coordinate chart (with compact closure), the function
$$s \mapsto \mathrm{Trace}(a(1 + D^2)^{-s/2})\,, \qquad s > p\,,$$
extends to a meromorphic function with at most simple poles. The value of the residue at $s = p$ is given by the Wodzicki residue [4, 8–10],
$$\mathrm{WRes}(a(1 + D^2)^{-p/2}) = \frac{2^{[p/2]}}{p(2\pi)^p}\int_{S^*X} a(x)\,\|\xi\|^{-p}\,dS(\xi)\,d\mathrm{Vol} = c(p)\int_X a(x)\,d\mathrm{Vol}(x)\,.$$
By Connes’ trace theorem [1, Appendix A], the operator $a(1 + D^2)^{-p/2}$ is in the Dixmier ideal $\mathcal{L}^{(1,\infty)}(L^2(X, S_{\mathbb{C}}, g))$, and the Wodzicki residue coincides with the value of any Dixmier trace on $a(1 + D^2)^{-p/2}$. These results depend crucially on the self-adjointness and uniform ellipticity of the Dirac operator.

To conclude that the dimension spectrum is simple we need to check that the above statements are still true when we replace $a \in C_c^\infty(X)$ with $b = \delta^k(a)$ or $b = \delta^k([D, a])$. In both cases, $b$ is an order zero pseudodifferential operator with principal symbol a compactly supported function [1, 2, 4]. The lower order terms do not contribute to the Wodzicki residue.

The grading conditions are well known [6], and we have $\omega_{\mathbb{C}}D + (-1)^p D\omega_{\mathbb{C}} = 0$ and $\omega_{\mathbb{C}}$ may be normalized to 1 when $p$ is odd. Hence if $\dim X$ is even, the triple is even, and if $\dim X$ is odd, the triple is odd.

Now let $\tilde g$ be a positive semidefinite metric; that is, a smooth, bounded, symmetric section of $TX \otimes TX$, possibly degenerate. Let
$$F = \{x \in X : \exists\, 0 \neq \alpha \in \Gamma(T^*X) \text{ such that } \tilde g(\alpha, \alpha)(x) = 0\}$$
be the degeneracy set of $\tilde g$. We assume that $F$ is closed, of measure zero (with respect to the Riemannian volume form defined by the original complete metric $g$) and a smooth submanifold (possibly with boundary), so that $X \setminus F$ is a smooth manifold. The Clifford algebra determined by the semidefinite metric $\tilde g$ allows one to define a new Dirac operator, so that in local coordinates on $X$
$$dx^i \bullet dx^j + dx^j \bullet dx^i = -2\tilde g(dx^i, dx^j)\,, \qquad \tilde D = \sum_{i=1}^{p} dx^i \bullet \nabla^{LC}_i\,,$$
where $\bullet$ is the Clifford multiplication determined by the new metric $\tilde g$ and $\nabla^{LC}$ is the lift of the Levi–Civita connection (with respect to the old metric $g$) on $TX$ to the spinor bundle (again with respect to the old metric $g$). Thus we are retaining the spinor bundle and connection of the complete metric $g$, and using $\tilde g$ to obtain a new Clifford action and hence a new Dirac operator. The only remaining issue is to define the action of the new Clifford algebra on the old spinor bundle. An obviously sufficient condition for this to be possible is that there is an inclusion of the algebras of sections
$$\Gamma(\mathrm{Cliff}(TX, \tilde g)) \subseteq \Gamma(\mathrm{Cliff}(TX, g))\,.$$
A sufficient condition for this to hold is as follows. Suppose that in any local coordinates we have $\tilde g_{ij} = f_{ij}g_{ij}$, with each $f_{ij}$ a smooth nonnegative function. Provided that for each $i, j$ we have either $f_{ij} = \sqrt{f_{ii}f_{jj}}$ or $f_{ij} = 0$, we can define a representation of the new Clifford algebra on the old spinor bundle by setting
$$dx^i \bullet \xi = \sqrt{f_{ii}}\;dx^i \cdot \xi\,, \qquad \xi \in \Gamma(X, S_{\mathbb{C}})\,.$$
One can now check that the Clifford relations for the new metric are satisfied. In the even case we also have (as operators on the spinor bundle or on Hilbert space) that $dx^i\,\bullet$ and $\omega_{\mathbb{C}}$ anticommute. In the following we assume that the new Clifford algebra acts on the old spinor bundle, by restricting to the above case if necessary.

Theorem 13. The tuple $(C_c^\infty(X \setminus F), L^2(X, S_{\mathbb{C}}, g), \tilde D, \omega_{\mathbb{C}})$, with $F$ and $\tilde D$ as above, is a spectral triple. It is local and $(p, \infty)$-summable, where $p = \dim X$, and has discrete and simple dimension spectrum. For all functions $a \in C_c^\infty(X \setminus F)$, the operator $a(1 + \tilde D^2)^{-p/2}$ is measurable and for any Dixmier trace
$$\mathrm{Tr}_\omega(a(1 + \tilde D^2)^{-p/2}) = \frac{1}{p(2\pi)^p}\int_X\int_{S^*X} a(x)\,\mathrm{Trace}(\tilde g(\xi, \xi)^{-p/2})\,dS(\xi)\,d\mathrm{Vol}(x)\,,$$
where $dS(\xi)\,d\mathrm{Vol}(x)$ is the volume form of the cosphere bundle $S^*X$ for the complete metric $g$.
Proof. The first portion of the proof is exactly the same as that of the last proposition, with $\tilde D$ self-adjoint by the completeness of $X$ and the finite propagation speed of $\tilde D$ [7, Proposition 10.2.11], the finite propagation speed following from the boundedness of the semidefinite metric $\tilde g$. To apply Seeley’s results, as in Proposition 12, to $a(1 + \tilde D^2)^{-s/2}$, we need to be sure that $\tilde D^2$ is uniformly elliptic over the support of $a$. However, the support of $a$ is disjoint from the set of degeneracy $F$, so over the support of $a$, the size of the smallest eigenvalue of the principal symbol of $\tilde D^2$ is bounded from below (and is greater than zero). Thus Seeley’s techniques can be applied, and we deduce the simplicity of the dimension spectrum. The remainder of the proof now follows as in Proposition 12, the value of the Dixmier trace being given by the Wodzicki residue, which is given by the formula in the statement of the theorem.
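Returning to the Clifford action that defines $\tilde D$, the claim that the rescaled generators $\sqrt{f_{ii}}\,dx^i\,\cdot$ satisfy the Clifford relations of $\tilde g$ can be checked by hand in a toy case. The sketch below is an illustration under simplifying assumptions that are not in the text: $X = \mathbb{R}^2$ with the flat metric $g = \delta$, constant values for $f_{11}, f_{22}$ at a single point, and an explicit $2 \times 2$ matrix representation of the flat Clifford generators. The helper names are hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
# Flat generators with e_i e_j + e_j e_i = -2 delta_ij (i*sigma_1 and i*sigma_2).
E = [[[0, 1j], [1j, 0]], [[0, 1], [-1, 0]]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(2)] for i in range(2)]

def new_generators(f):
    """Rescaled generators sqrt(f_ii) * e_i for f = [f_11, f_22] with f_ii >= 0."""
    return [[[math.sqrt(f[i]) * x for x in row] for row in E[i]] for i in range(2)]

def satisfies_clifford(f, tol=1e-12):
    """Check {e~_i, e~_j} = -2 g~_ij with g~ = diag(f_11, f_22) (flat g assumed)."""
    gens = new_generators(f)
    for i in range(2):
        for j in range(2):
            target = -2.0 * f[i] if i == j else 0.0
            ac = anticommutator(gens[i], gens[j])
            for r in range(2):
                for c in range(2):
                    if abs(ac[r][c] - target * I2[r][c]) > tol:
                        return False
    return True

print(satisfies_clifford([0.25, 1.0]))   # nondegenerate point
print(satisfies_clifford([0.0, 4.0]))    # a point of the degeneracy set
```

In this toy case the relations hold even where $f_{11} = 0$: the corresponding generator becomes the zero operator, which is how the degeneracy of $\tilde g$ shows up in the symbol of $\tilde D$.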
Remark. The example in the next section employs a degenerate metric which ˜ does not have finite propagation speed. Nevertheless, is not bounded, and so D ˜ is in fact self-adjoint. explicit calculations in the next section show that D ˜ ωC ) as in the proposition, and σ Lemma 14. With (Cc∞ (X\F ), L2 (X, SC , g), D, 2 ˜ the principal symbol of D , the algebra
\[ A_h = \{ a \in C_0^\infty(X\setminus F) : x \mapsto (\partial^\alpha a)(x)\, \mathrm{Trace}_{S_C}(\sigma(x, \xi)^{-p/2}) \text{ is integrable for all multi-indices } \alpha \} \]
is a smooth algebra. Here integrability is over the cosphere bundle of $X$ with respect to the volume form of the original metric $g$.

Proof. The algebra $A_h$ is dense in $C_0(X\setminus F)$, since it contains the smooth compactly supported functions. Define seminorms on $A_h$ by
\[ q_n(a) = \sup_{|\alpha| \le n}\, \sup_{x \in X} |\partial^\alpha a(x)|, \qquad q_n^1(a) = \sup_{|\alpha| \le n} \int_{S^*X} |(\partial^\alpha a)(x)\, \mathrm{Trace}(\sigma(x, \xi)^{-p/2})|\, dS(\xi)\, d\mathrm{Vol}(x). \]
These seminorms determine a locally convex metrisable topology on $A_h$, and a standard $\epsilon/3$ proof shows that $A_h$ is complete, and so Fréchet. To show that $A_h$ is stable under the holomorphic functional calculus, suppose that $a \in A_h$ and $1 + a$ is invertible in $C((X\setminus F)^+)$, with inverse $1 + b$. Then $b \in C_0(X\setminus F)$ and $b$ is smooth. This is because the equation $a + b + ab = 0$ implies that $b = -a/(1 + a)$, which has derivatives of all orders by hypothesis, and these all vanish at infinity. The integrability condition follows similarly, since
\[ b\sigma^{-p/2} = -a\sigma^{-p/2} - ab\sigma^{-p/2} = -(1 + b)\, a\sigma^{-p/2}, \]
and $1 + b$ is bounded whilst $a$ satisfies the integrability criteria by hypothesis. Differentiating $b = -(1 + b)a$, and applying the Leibniz rule completes the proof. Hence $A_h$ is stable under the holomorphic functional calculus, and so smooth.

Corollary 15. The results of Theorem 13 remain true with the compactly supported functions replaced by $A_h$.

Proof. (Sketch) For $a \in A_h$ positive, we may use the monotone convergence theorem for the measure
\[ \mu(E) = \int_{S^*E} \mathrm{Trace}_{S_C}(\sigma(x, \xi)^{-p/2})\, dS(\xi)\, d\mathrm{Vol}(x), \qquad E \subset X\setminus F, \]
to show that the measurability results hold for a ≥ 0. Linearity allows us to conclude for general a ∈ Ah . This measurability result, and its proof, is essentially the same as [4, Corollary 7.22]. The boundedness of [D, a], a ∈ Ah , follows from the smoothness of the functions in Ah . This is important for computing the pairing with K-theory. Along with results proved in [2, 3], this means that we can apply the Local Index Theorem to compute
the pairing of the K-homology class of the spectral triple of a degenerate metric on $X$ with the K-theory of $X\setminus F$. This follows because $K_*(A_h) \cong K_*(C_0(X\setminus F))$, so any class $[x]$ in the right-hand group has a representative $x \in M_N(A_h)$ for some sufficiently large $N$. This result of course applies to the case where $\tilde g$ is not degenerate; in particular it applies to $g$.

4. A Detailed Example

In this section we present an example which shows how the construction of spectral triples from degenerate metrics can be used to do index theory on mildly singular spaces. We are quite explicit in what follows, so that it is clear what prevents us from being able to work with some algebra of functions which is nonzero on the set of degeneracy of the metric. The computations in this section determine precisely for which functions we obtain a spectral triple. Once this is done, we compute the K-theory of the singular space on which we work, and identify generators of the even K-theory. The index pairing between the spectral triple on this singular space and the K-theory generators is then determined using the Local Index Theorem of Connes–Moscovici [1], which reduces to a Wodzicki residue computation.

We build this triple by making a deliberately naive attempt to work on a singular space. The extremely simple space we choose is the double cone,
\[ C = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = \kappa^2 z^2\}, \]
where $\kappa = \tan(\frac{\alpha}{2})$, and $\alpha \in (0, \pi)$ is the cone angle. At every point $z \neq 0$, we have a well-defined cotangent space, and choosing cylindrical coordinates $(z, \theta)$, this cotangent space is spanned by $dz, d\theta$. At each such point, we define a Clifford action of these covectors on $\mathbb{C}^2$ by
\[ dz = \begin{pmatrix} 0 & -\kappa \\ \kappa & 0 \end{pmatrix}, \qquad d\theta = \begin{pmatrix} 0 & \frac{\kappa z}{i} \\ \frac{\kappa z}{i} & 0 \end{pmatrix}. \]
These satisfy the Clifford relations for the metric
\[ g(z, \theta) = \begin{pmatrix} \kappa^2 z^2 & 0 \\ 0 & \kappa^2 \end{pmatrix}. \]
Of course this metric is degenerate at $z = 0$, but elsewhere reproduces the correct distances on the cone. The next step is to define the corresponding Dirac operator,
\[ D = dz\, \partial_z + d\theta\, \partial_\theta = \begin{pmatrix} 0 & \frac{\kappa z}{i} \partial_\theta - \kappa \partial_z \\ \frac{\kappa z}{i} \partial_\theta + \kappa \partial_z & 0 \end{pmatrix}. \]
We initially regard $D$, and $D^2$, as defined on the smooth sections of the spinor bundle over the cylinder, which are of rapid decrease. In the Hilbert space completions below, this will mean that $D$ is not closed, but by [7, Lemma 10.2.1], it is closable.
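The Clifford relations for the two matrices $dz$ and $d\theta$ above can be checked mechanically. The following is a minimal sketch, not part of the original argument: it multiplies the $2 \times 2$ matrices at one sample point and compares with the expected squares. The helper names and the sample values of `kappa` and `z` are assumptions chosen only for illustration, and the sign convention (a covector squares to minus its squared length) is simply the one these matrices satisfy.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def clifford_generators(kappa, z):
    """The matrices dz and dtheta from the text at height z on the cone."""
    c = kappa * z / 1j
    dz = ((0, -kappa), (kappa, 0))
    dtheta = ((0, c), (c, 0))
    return dz, dtheta


def anticommutator(a, b):
    """Return ab + ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] + ba[i][j] for j in range(2)) for i in range(2))


# Sample values (assumptions chosen only for illustration).
kappa, z = 0.7, 1.3
dz, dtheta = clifford_generators(kappa, z)

dz_sq = matmul(dz, dz)                # -kappa^2 times the identity
dtheta_sq = matmul(dtheta, dtheta)    # -(kappa z)^2 times the identity
mixed = anticommutator(dz, dtheta)    # the zero matrix
```

The square of $d\theta$ vanishes at $z = 0$, which is the degeneracy of the metric seen at the level of the Clifford action.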
The Hilbert space we employ is $H = L^2(\mathrm{Cyl}, \mathbb{C}^2, dz\,d\theta)$, where Cyl denotes the doubly infinite cylinder of unit radius, $L^2(\mathrm{Cyl}, \mathbb{C}^2)$ is the $L^2$ sections of the trivial plane bundle (the spinor bundle) over the cylinder, and $dz\,d\theta$ denotes the usual Riemannian volume form on the cylinder (not the above Clifford action). By making this choice of Hilbert space we are regarding the cone as the cylinder imbued with a degenerate metric. The operator $\Gamma = i\,dz\,d\theta$ is a $\mathbb{Z}_2$-grading for $H$ anticommuting with $D$. In this expression $dz, d\theta$ act via the usual Clifford action on the cylinder,
\[ dz = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad d\theta = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \]
So there is a good deal of interplay between the two metrics we have imposed on the cylinder.

Finally, our algebra of functions must encode the topology of the cone, and act on $H$. We must expect trouble from the singularity at $z = 0$, so we adopt the definition
\[ A = \{ a : C \to \mathbb{C} : z^k \partial_\theta^m \partial_z^l a \text{ is smooth and vanishes at } z = 0, \pm\infty \text{ for all } k, l, m \ge 0 \}. \]
The vanishing of a function $a$ at $z = 0, \pm\infty$ is taken in the usual topological sense, so $a(z) \to 0$ as $z \to 0, \pm\infty$. The algebra $A$ has a local structure, with the dense ideal of functions compactly supported away from $z = 0, \pm\infty$ providing $A_c \subseteq A$. We let $A$ act by multiplication on $H$.

Next we compute the spectrum of the operator $D$. The sensible way to tackle the spectrum of a Dirac operator is to first consider the associated Laplace equation.

Lemma 16. The operator $D^2$ is essentially self-adjoint with spectrum the nonnegative reals. The kernel is infinite dimensional, the point spectrum consists of the values $2\kappa^2 N$, $N > 0$ integral, with multiplicity $4d(N)$, where $d(N)$ is the divisor function of $N$.

Proof. We first consider the equation
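\[ D^2 \xi = \lambda^2 \xi, \qquad \begin{pmatrix} -z^2 \partial_\theta^2 - \partial_z^2 - \frac{1}{i} \partial_\theta & 0 \\ 0 & -z^2 \partial_\theta^2 - \partial_z^2 + \frac{1}{i} \partial_\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} = \frac{\lambda^2}{\kappa^2} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix}. \]
To solve this equation, we employ separation of variables. If we can span the Hilbert space with such solutions there will be no need to try anything more esoteric. For the first component we write $\xi_1(z, \theta) = f(\theta) g(z)$, and we consider three possibilities.

(1) $f$ is constant. In this case the equation for the first component reduces to
\[ g''(z) = -\frac{\lambda^2}{\kappa^2}\, g(z). \]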
If $\lambda^2 > 0$, then the only solutions are oscillatory, and do not vanish at infinity. Provided that such a $\lambda^2$ is not an eigenvalue, this shows that it is in the continuous spectrum of $D^2$. If $\lambda^2 = 0$, we will obtain a linear solution, again not vanishing at infinity, but we will see later that there are in fact many solutions in the kernel of $D^2$. Finally, if $\lambda^2 < 0$, we have the solutions
\[ g_\lambda(z) = e^{\pm\sqrt{-\lambda^2/\kappa^2}\; z}, \]
and these fail to vanish at one of $\pm\infty$ or the other, and they do not belong to the Hilbert space.

(2) $f(\theta) = e^{im\theta}$, $m > 0$. This yields the equation
\[ g''(z) = \Big( z^2 m^2 - m - \frac{\lambda^2}{\kappa^2} \Big) g(z). \]
The substitution $g(z) = \tilde g(z) e^{-\frac{m}{2} z^2}$ reduces this to
\[ \tilde g''(z) - 2mz\, \tilde g'(z) + \frac{\lambda^2}{\kappa^2}\, \tilde g = 0. \]
For $m = 1$ this is the defining equation for the Hermite polynomials, and it is not difficult from there to see that
\[ g(z) = H_n(\sqrt{m}\, z)\, e^{-\frac{m}{2} z^2}, \qquad \lambda^2 = 2\kappa^2 nm, \qquad m > 0,\ n \ge 0 \]
is the unique square integrable solution [12, 13].

(3) $f = e^{-im\theta}$, $m > 0$. With the same ansatz as the last case we find
\[ \tilde g''(z) - 2mz\, \tilde g'(z) - \Big( 2m - \frac{\lambda^2}{\kappa^2} \Big) \tilde g(z) = 0, \]
and for $\lambda^2 = 2\kappa^2 m(n + 1)$ this is the same as for the last case. Thus the unique solution is
\[ g(z) = H_n(\sqrt{m}\, z)\, e^{-\frac{m}{2} z^2}, \qquad \lambda^2 = 2\kappa^2 m(n + 1), \qquad m > 0,\ n \ge 0. \]
The equation for the second component behaves exactly as the first when $f$ is constant, while the roles of the two cases $f(\theta) = e^{im\theta}$ and $f(\theta) = e^{-im\theta}$ are reversed.

For $n \ge 0$ and $0 \neq m \in \mathbb{Z}$, define the functions $s_{nm} = e^{im\theta} e^{-|m| z^2/2} H_n(\sqrt{|m|}\, z)$. For $n < 0$ we set $s_{nm} = 0$, and for $m = 0$ we set $s_{n0} = e^{-z^2/2} H_n(z)$. Next define spinors
\[ \xi_{nm1}(z, \theta) = \begin{pmatrix} s_{nm} \\ 0 \end{pmatrix}, \qquad \xi_{nm2}(z, \theta) = \begin{pmatrix} 0 \\ s_{nm} \end{pmatrix}. \]
Using the orthogonality relations
\[ \int_{-\infty}^{\infty} H_n(w) H_m(w)\, e^{-w^2}\, dw = \delta_{nm}\, 2^n n! \sqrt{\pi}, \qquad \int_0^{2\pi} e^{il\theta} e^{-im\theta}\, d\theta = \delta_{lm}\, 2\pi \]
and the completeness of the Hermite and trigonometric polynomials, one can show that these spinors provide a complete orthogonal basis of L2 (Cyl, C2 ). The operator D 2 is defined on all finite linear combinations of these spinors, which is a dense subset of the smooth spinors of rapid decrease. Thus it suffices to show that D 2 is essentially self-adjoint on this subspace, for then the unique self-adjoint extension, given by the closure, will coincide with the closure of D 2 defined on all smooth spinors of rapid decrease. Moreover the projections onto the (closures of the) following three subspaces commute with D 2 , so we can write D 2 as the direct sum of the restrictions of D 2 to these subspaces. The subspaces are: • The kernel of D 2 is the L2 closure of the span of the spinors ξ0m1 , m < 0, and ξ0m2 , m > 0. Thus the restriction of D 2 to this subspace is a closed operator. • The restriction of D 2 to finite linear combinations of the spinors ξn0i , i = 1, 2 is essentially self-adjoint. This follows because these basis vectors are independent of θ and so D2 acts as −∂z2 , which is known to be essentially self-adjoint with continuous spectrum the positive reals. • Finally, the action of D 2 on the subspace of finite linear combinations of the eigenspinors for nonzero eigenvalues is essentially self-adjoint. This follows from the denseness of the range of D 2 ± i on this subspace and [11, p. 257]. The denseness of the range of D 2 ± i follows from the explicit computations above. Since D2 is the direct sum of these three restrictions, D 2 is essentially self-adjoint and so has a unique self-adjoint extension, which we shall also refer to as D 2 . Thus the spectrum of D 2 is the nonnegative real axis, with the points 2κ2 N , N ∈ N, being eigenvalues and everything else being continuous spectrum. The multiplicity of each λ2 = 2κ2 N , N > 0, is 4d(N ), where d(N ) is the divisor function, the number of divisors of N including 1 and N [14]. 
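The multiplicity count $4d(N)$ can be checked by brute force. The sketch below is an illustration only, with hypothetical function names: it enumerates the pairs $(n, m)$ in the two eigenvalue families found by separation of variables, $\lambda^2 = 2\kappa^2 nm$ with $n, m > 0$ and $\lambda^2 = 2\kappa^2 m(n + 1)$ with $m > 0$, $n \ge 0$, and counts two spinors per admissible pair (one for each spinor component).

```python
def divisor_count(N):
    """Number of positive divisors of N, including 1 and N."""
    return sum(1 for d in range(1, N + 1) if N % d == 0)


def multiplicity(N):
    """Count eigenspinors of D^2 with eigenvalue 2 * kappa^2 * N by enumeration."""
    count = 0
    for m in range(1, N + 1):
        for n in range(0, N + 1):
            if n > 0 and n * m == N:
                # family with lambda^2 = 2 kappa^2 n m: one spinor per component
                count += 2
            if m * (n + 1) == N:
                # family with lambda^2 = 2 kappa^2 m (n + 1): one spinor per component
                count += 2
    return count
```

Each family contributes one pair for every divisor $m$ of $N$, which is where the divisor function enters.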
The origin of the divisor function is clear; the four arises by counting the eigenvectors for $\lambda^2 = 2\kappa^2 nm$, $m > 0$, $n > 0$, namely $\xi_{n(-m)1}$, $\xi_{nm2}$, and those for $\lambda^2 = 2\kappa^2 m(n + 1)$, $m > 0$, $n \ge 0$, which are $\xi_{nm1}$, $\xi_{n(-m)2}$. The presence of the divisor function, whose asymptotics are extremely subtle [14], indicates that the zeta function of $D^2$ will have very interesting behaviour.

Lemma 17. The operator $D$ is essentially self-adjoint with spectrum the whole real line. The kernel is infinite dimensional, the point spectrum consists of the values $\pm\kappa\sqrt{2N}$, $N > 0$ integral, with multiplicity $2d(N)$.

Proof. As in Lemma 16, for $n \ge 0$ and $m > 0$, define functions $s_{nm} = e^{im\theta} e^{-mz^2/2} H_n(\sqrt{m}\, z)$. For $n < 0$ we set $s_{nm} = 0$, and for $m = 0$ we set $s_{n0} = e^{-z^2/2} H_n(z)$. Using the orthogonality relations and the completeness of the Hermite and trigonometric polynomials as in Lemma 16, it is easy to check that the spinors
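\[ \chi_{nm\pm} = \begin{cases} \begin{pmatrix} \sqrt{2n}\, s_{n-1,m} \\ \mp s_{nm} \end{pmatrix} & m < 0 \\[2ex] \begin{pmatrix} \pm s_{nm} \\ \sqrt{2n}\, s_{n-1,m} \end{pmatrix} & m > 0 \\[2ex] \begin{pmatrix} s_{n0} \\ \pm i s_{n0} \end{pmatrix} & m = 0 \end{cases} \]
provide a complete orthogonal basis for $L^2(\mathrm{Cyl}, \mathbb{C}^2)$. For $m \neq 0$, $D\chi_{nm\pm} = \pm\kappa\sqrt{2n|m|}\, \chi_{nm\pm}$, and for $m = 0$, $D\chi_{n0\pm} = \pm(\kappa/i)\partial_z \chi_{n0\pm}$. As in Lemma 16, $D$ is closed on its kernel (spanned by $\chi_{0m\pm}$, $m \neq 0$), acts as $\pm(\kappa/i)\partial_z$ on two copies of $L^2(\mathbb{R})$ (spanned by $\chi_{n0\pm}$), and $D \pm i$ has dense range when restricted to the finite linear combinations of the eigenvectors $\chi_{nm\pm}$, $m \neq 0$. As $D$ is the direct sum of these restrictions, and each is essentially self-adjoint, $D$ is essentially self-adjoint [11, p. 257], and has a unique self-adjoint extension which we also denote by $D$.

In the following we will be estimating traces for operators of the form $a(1 + D^2)^{-s}$, $a \in A$. The basis described in Lemma 16 is more suitable than that in Lemma 17. The normalizations to obtain an orthonormal basis are
\[ \xi_{nmi} \to \frac{|m|^{1/4}}{\sqrt{2\pi\sqrt{\pi}\, 2^n n!}}\, \xi_{nmi}, \quad i = 1, 2,\ m \neq 0, \qquad \xi_{n0i} \to \frac{1}{\sqrt{2\pi\sqrt{\pi}\, 2^n n!}}\, \xi_{n0i}, \quad i = 1, 2. \tag{2} \]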
The only remaining item to check in order to show that $(A, H, D, \Gamma)$ is a spectral triple is the compactness of $a(1 + D^2)^{-1/2}$, for all $a \in A$ compactly supported away from zero.

Lemma 18. If $a \in A$ has compact support disjoint from the set $\{(z, \theta) \in \mathbb{R} \times [0, 2\pi) : z = 0\}$, then the operator $a(1 + D^2)^{-1/2}$ is compact. If $a$ is a function defined on the cone which is nonzero at $z = 0$, $a(1 + D^2)^{-1/2}$ is not compact.

Proof. Write $H = H_c \oplus H_p \oplus H_k$ for the decomposition of $H$ into closed subspaces corresponding to the continuous subspace, the nonzero eigenspaces and the kernel of $D^2$, respectively. Let $P_c, P_p, P_k$ be the corresponding projections. Then from what we already know about the spectrum of $D^2$, and employing the closure of the compacts under adjoints,
\[ a(1 + D^2)^{-1/2} = \begin{pmatrix} ? & K & ? \\ K & K & K \\ ? & K & ? \end{pmatrix} \begin{matrix} H_c \\ H_p \\ H_k \end{matrix}, \]
where $K$ indicates that the entry is a compact operator between the appropriate subspaces. So, we begin with
\[ P_c\, a(1 + D^2)^{-1/2} P_c = \tilde a \begin{pmatrix} 1 - \kappa^2 \partial_z^2 & 0 \\ 0 & 1 - \kappa^2 \partial_z^2 \end{pmatrix}^{-1/2}, \]
where $a(z, \theta) = \tilde a(z) + \sum_{m \neq 0} a_m(z) e^{im\theta}$. In this case, [3, Proposition 21] shows that
\[ P_c\, a(1 + D^2)^{-1/2} P_c \in \mathcal{L}^{(1,\infty)}(H_c), \]
and so compact. In fact it is measurable and
\[ -\!\!\!\!\!\!\int P_c\, a(1 + D^2)^{-1/2} P_c = \kappa\, c(2) \int_{-\infty}^{\infty} \tilde a(z)\, dz. \]
Note that this piece of the computation did not require that $a$ vanish at $z = 0$. Next we consider
\[ P_k\, a(1 + D^2)^{-1/2} P_c. \]
The projection $P_k$ projects onto the subspace spanned by
\[ e^{im\theta} e^{-\frac{m}{2} z^2}, \qquad e^{-im\theta} e^{-\frac{m}{2} z^2}, \qquad m > 0, \]
while $P_c$ projects onto the space spanned by $H_n(z) e^{-z^2/2}$, $n \ge 0$. Thus we need to estimate
\[ \frac{m^{1/4}}{2\pi\sqrt{2\pi\, 2^k k!}} \int_C a(z, \theta)\, H_k(z)\, e^{-\frac{m+1}{2} z^2} e^{-im\theta}\, dz\, d\theta. \]
Let $a_{mk}$ be the coefficient of $H_k(z)$ in the expansion of the function $a_m(z)$ on the basis provided by the Hermite functions (with respect to the measure $e^{-z^2} dz$). These coefficients $a_{mk}$ are $o((mk)^{-1/2})$ for large $k, m$. Thus
\[ \frac{m^{1/4}}{2\pi\sqrt{2\pi\, 2^k k!}} \int_C a(z, \theta)\, H_k(z)\, e^{-\frac{m+1}{2} z^2} e^{-im\theta}\, dz\, d\theta \le \frac{m^{1/4}}{\sqrt{2}}\, a_{mk} \to 0. \]
So $P_k\, a(1 + D^2)^{-1/2} P_c$ is compact. In fact we have shown that this term remains compact even if $a$ is not zero at $z = 0$.

We now come to the final term. It is now that we need the compact support away from $z = 0$ for the functions $a$ that we consider. So let $\mathrm{supp}(a(z, \theta)) \subseteq ([-K, -\epsilon] \cup [\epsilon, K]) \times [0, 2\pi]$ for some $K \gg 1$. Then
\[ P_k\, a(1 + D^2)^{-1/2} P_k = P_k\, a P_k \]
is compact, and to show this we need to estimate
\[ \frac{\sqrt{m}}{2\sqrt{\pi}} \int_{-\infty}^{\infty} \int_0^{2\pi} e^{-mz^2} a(z, \theta)\, dz\, d\theta \le \frac{\sqrt{m}}{2\sqrt{\pi}}\, e^{-m\epsilon^2}\, \|\tilde a(z)\|_1, \]
and we see that this is compact. If $a$ is compactly supported but nonzero at $z = 0$, the sequence of integrals
\[ \int_{-\infty}^{\infty} \int_0^{2\pi} e^{-mz^2} a(z, \theta)\, dz\, d\theta \]
is $O(m^{-1/2})$, and so the operator $P_k\, a(1 + D^2)^{-1/2} P_k$ is bounded but not compact.

From Theorem 13 we know that the triple we have built over the cone has discrete and simple dimension spectrum, and is $(2, \infty)$-summable. So for $a$ compactly supported away from zero, $\mathrm{Trace}(a(1 + D^2)^{-s})$ is meromorphic, where initially we suppose that $s \gg 1$. This trace is the sum of three pieces
\[ \mathrm{Trace}(a(1 + D^2)^{-s}) = \mathrm{Trace}(P_k\, a(1 + D^2)^{-s} P_k) + \mathrm{Trace}(P_c\, a(1 + D^2)^{-s} P_c) + \mathrm{Trace}(P_p\, a(1 + D^2)^{-s} P_p). \]
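The role of the support condition in Lemma 18 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it evaluates the quantity $\frac{\sqrt{m}}{2\sqrt{\pi}} \int\!\!\int e^{-mz^2} a\, dz\, d\theta$ for two hypothetical $\theta$-independent test functions, a smooth bump supported in $\epsilon \le |z| \le K$ and a Gaussian that is nonzero at $z = 0$. The function names, the midpoint rule and the parameter values are all choices made here.

```python
import math


def diagonal_entry(a, m, half_width=4.0, steps=40000):
    """(sqrt(m) / (2 sqrt(pi))) times the integral of exp(-m z^2) a(z) over z and theta,
    for a theta-independent a, using the midpoint rule in z on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        z = -half_width + (i + 0.5) * h
        total += math.exp(-m * z * z) * a(z)
    # the theta integral of a theta-independent function contributes a factor 2 pi
    return math.sqrt(m) / (2.0 * math.sqrt(math.pi)) * 2.0 * math.pi * total * h


def bump_away_from_zero(z, eps=0.5, K=2.0):
    """Smooth test function supported in eps <= |z| <= K (hypothetical example)."""
    r = abs(z)
    if r <= eps or r >= K:
        return 0.0
    return math.exp(-1.0 / ((r - eps) * (K - r)))


def gaussian(z):
    """Test function that is nonzero at z = 0 (hypothetical example)."""
    return math.exp(-z * z)
```

For the bump the values decay like $e^{-m\epsilon^2}$, while for the Gaussian the exact value is $\pi\sqrt{m/(m+1)}$, which tends to $\pi$ rather than to zero. This is the behaviour that separates the compact case from the merely bounded one.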
As already noted, $\mathrm{Trace}(P_c\, a(1 + D^2)^{-s} P_c)$ is holomorphic for all $s$ with $\mathrm{Re}(s) > \frac{1}{2}$. The pole at $s = \frac{1}{2}$ is simple and the residue is given by
\[ \mathrm{res}_{s = \frac{1}{2}}\, \mathrm{Trace}(P_c\, a(1 + D^2)^{-s} P_c) = 2\, \frac{\kappa}{2\pi} \int_{-\infty}^{\infty} \tilde a(z)\, dz, \]
with $\tilde a$ the piece of $a$ independent of $\theta$. Seeley's results, [8, Theorem 4 and Sec. 2], and the compact support of $\tilde a$, allow us to conclude that this piece of the trace analytically continues to $\mathbb{C}$ with the exception of the half-integers less than or equal to $\frac{1}{2}$, and all poles are simple.

We have already seen that the contribution of $\mathrm{Trace}(P_k\, a(1 + D^2)^{-s} P_k) = \mathrm{Trace}(P_k\, a P_k)$ is independent of $s$ and in fact finite (provided $a$ is supported away from $z = 0$), from our earlier estimate. So we are left with the point spectrum. It is shown in [14, Theorem 289, p. 250] that
\[ \sum_{n=1}^{\infty} \frac{d(n)}{n^s} = \zeta(s)^2, \qquad s > 1, \]
where $\zeta$ denotes the Riemann zeta function. To put this information to use, we estimate
\[ |\mathrm{Trace}(P_p\, a(1 + D^2)^{-s} P_p)| = \sum_{k,m>0} \frac{4\sqrt{m}\, (1 + 2\kappa^2 mk)^{-s}}{2\pi\sqrt{\pi}\, 2^k k!} \int_C a(z, \theta)\, H_k(\sqrt{m}\, z)^2\, e^{-mz^2}\, dz\, d\theta \]
\[ \le 4\|\tilde a\|_\infty \sum_{k,m>0} (1 + 2\kappa^2 km)^{-s} \]
\[ \sim 4\|\tilde a\|_\infty\, 2^{2-s} \kappa^{-2s} \sum_{n=1}^{\infty} \frac{d(n)}{n^s}. \]
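Here $\sim$ indicates that
\[ \lim_{s \to 1^+} \Big( \sum_{k,m>0} (1 + 2\kappa^2 km)^{-s} - 2^{2-s} \kappa^{-2s} \sum_{n=1}^{\infty} \frac{d(n)}{n^s} \Big) = \text{constant}. \]
Indeed
\[ \lim_{s \to 1^+} \sum_{m,k>0} (2mk)^{-s} \Big( 1 + \frac{1}{2mk} \Big)^{-s} = \lim_{s \to 1^+} 2^{-s} \sum_{n=1}^{\infty} \frac{d(n)}{n^s} + \sum_{k=1}^{\infty} \frac{(-1)^k}{2^{k+1}} \sum_{n=1}^{\infty} \frac{d(n)}{n^{k+1}}. \]
So summing over the nonzero eigenvalues of $(1 + D^2)^{-s}$ gives asymptotically
\[ |\mathrm{Trace}(P_p (1 + D^2)^{-s} P_p)| \sim 2^{2-s} \kappa^{-2s} \sum_{n=1}^{\infty} d(n) n^{-s} = 2^{2-s} \kappa^{-2s} \Big( \frac{1}{(1 - s)^2} + \frac{2\gamma}{s - 1} + \gamma^2 + \text{holomorphic} \Big). \]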
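The Dirichlet series identity $\sum_n d(n) n^{-s} = \zeta(s)^2$ used above can be sanity-checked at a point of absolute convergence. The following sketch is an illustration only: it compares a partial sum at $s = 2$ with the closed form $\zeta(2)^2 = (\pi^2/6)^2$. The function names and the truncation point are assumptions; near $s = 1$, where the double pole lives, partial sums converge far too slowly for this kind of check.

```python
import math


def divisor_counts(limit):
    """Return a list d with d[n] = number of divisors of n for 1 <= n <= limit (sieve)."""
    d = [0] * (limit + 1)
    for k in range(1, limit + 1):
        for multiple in range(k, limit + 1, k):
            d[multiple] += 1
    return d


def divisor_dirichlet_partial_sum(s, limit):
    """Partial sum of sum_{n >= 1} d(n) / n^s up to n = limit."""
    d = divisor_counts(limit)
    return sum(d[n] / n ** s for n in range(1, limit + 1))


zeta_two_squared = (math.pi ** 2 / 6.0) ** 2   # zeta(2)^2, using zeta(2) = pi^2 / 6
partial = divisor_dirichlet_partial_sum(2.0, 20000)
```

The partial sum approaches $\zeta(2)^2$ from below, with a tail of size roughly $(\log N)/N$ after $N$ terms.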
This shows that $\mathrm{Trace}(a(1 + D^2)^{-s})$ contains at worst a double pole. The precise behavior will depend on $a$. Here $\gamma$ is Euler's constant, the value of $\phi(s)$ at $s = 1$, where $\zeta(s) = \frac{1}{s - 1} + \phi(s)$ with $\phi$ holomorphic.

In fact we have already shown that if the function $a$ has support disjoint from the set $\{z = 0\}$, there can only be a simple pole. This follows from Theorem 13 and Lemma 18. Computing the actual values of the residue requires a concrete form for the function $a$, and of course we are mostly interested in the case where the function $a$ is (a component of) a projection or unitary representing a K-theory class.

The Local Index Theorem [1, 3] gives us a formula for components of the Chern character of $(A, H, D, \Gamma)$. Substituting the various constant terms and using the simplicity of the dimension spectrum we obtain
\[ \phi_2(a_0, a_1, a_2) = \frac{1}{2}\, \tau_0(\Gamma a_0\, da_1\, da_2\, (1 + D^2)^{-1}), \qquad \phi_0(a_0) = \mathrm{res}_{z=0}\, \frac{1}{z}\, \mathrm{Trace}(\Gamma a_0 (1 + D^2)^{-z}). \]
The top component involves the coefficient of $\frac{1}{z}$ in the Laurent expansion at $z = 0$ of $\mathrm{Trace}(a(1 + D^2)^{-1-z})$, while the zeroth component involves the coefficient of the constant term. A routine calculation shows that
\[ \Gamma a_0 [D, a_1][D, a_2] = \begin{pmatrix} -a_0\, g(da_1, da_2) - iz\kappa^2 a_0\, da_1 \wedge da_2 & 0 \\ 0 & a_0\, g(da_1, da_2) - iz\kappa^2 a_0\, da_1 \wedge da_2 \end{pmatrix}, \]
so the trace $\mathrm{Trace}(\Gamma a_0\, da_1\, da_2\, (1 + D^2)^{-s})$ is given by
\[ 2i\kappa^2\, \mathrm{Trace}_{H_+}\big( a_0 ((\partial_z a_1)(\partial_\theta a_2) - (\partial_\theta a_1)(\partial_z a_2))\, z\, (1 + D^2)^{-s} \big) \tag{3} \]
where $H_+$ is the $+1$ eigenspace of $\Gamma$. The factor of $\kappa^2$ is precisely what one would expect for a critical point at $s = 1$ since $(D^2 + 1)^{-s} \sim \kappa^{-2s}$; $\kappa$ is a geometric feature,
and the residues we are employing compute purely topological quantities, and so should be insensitive to the precise value of $\kappa$.

To compute the pairing with K-theory using the residue, we require a concrete form for the generators of the even K-group of the cone. We first compute the K-theory for the cone.

Lemma 19. The K-theory of the cone is given by
\[ K_0(C_0(\mathrm{cone})) \cong \mathbb{Z}^2, \qquad K_1(C_0(\mathrm{cone})) \cong \mathbb{Z}^2. \]
Proof. The ($C^*$-closure of the) algebra of functions we are employing decomposes as
\[ \bar A \cong C_0(\mathbb{R}^2\setminus\{0\}) \oplus C_0(\mathbb{R}^2\setminus\{0\}), \]
so $K_*(\bar A) \cong K_*(C_0(\mathbb{R}^2\setminus\{0\})) \oplus K_*(C_0(\mathbb{R}^2\setminus\{0\}))$. To compute $K_0(C_0(\mathbb{R}^2\setminus\{0\}))$, and find explicit generators, consider the exact sequence [7, 4],
\[ 0 \to C_0(\mathbb{R}^2\setminus\{0\}) \to C(D^2) \to C(S^1) \oplus \mathbb{C} \to 0, \]
where $C_0(\mathbb{R}^2\setminus\{0\})$ is included as the continuous functions on the closed unit disk $D^2$ vanishing at 0 and on the boundary circle. The corresponding K-theory exact sequence is
\[ 0 \to K_1(C(S^1) \oplus \mathbb{C}) \xrightarrow{\ \mathrm{Ind}\ } K_0(C_0(\mathbb{R}^2\setminus\{0\})) \to \mathbb{Z} \to \mathbb{Z} \oplus \mathbb{Z} \xrightarrow{\ \exp\ } K_1(C_0(\mathbb{R}^2\setminus\{0\})) \to 0, \]
since $K_1(C(D^2)) = \{0\}$. The Index map on the left is necessarily injective, and it is also onto. To see this, observe that the map from $K_0(D^2) \cong \mathbb{Z}$ to $K_0(C(S^1) \oplus \mathbb{C}) \cong \mathbb{Z} \oplus \mathbb{Z}$ takes the trivial bundle of rank $k$ on the disk to the trivial bundle of rank $k$ on the circle union the point zero. Hence it is the diagonal map, and is injective, whence the map from $K_0(C_0(\mathbb{R}^2\setminus\{0\}))$ to $K_0(D^2)$ is zero. Furthermore, the exponential map on the right is onto, taking $(n, m)$ onto $n - m$.

So to obtain the generator of $K_0(C_0(\mathbb{R}^2\setminus\{0\})) \cong \mathbb{Z}$, it suffices to find a generator for the odd K-group of the circle union a point, and apply the boundary map. The obvious generator of $K_1(C(S^1) \oplus \mathbb{C})$ is the function which is the identity on the circle, and equal to 1 on the adjoined point. This is unitary. To apply the boundary map, we first need to "double" this unitary to an element of the connected component of the identity (in a larger matrix algebra), so
\[ \mathrm{Id}_{S^1} \oplus 1 \to w := \begin{pmatrix} \mathrm{Id}_{S^1} \oplus 1 & 0 \\ 0 & \mathrm{Id}^*_{S^1} \oplus 1 \end{pmatrix} \]
where $\mathrm{Id}^*_{S^1} : z \mapsto \bar z$. Then we need to lift this unitary to a function in $C(D^2)$ which is equal to $w$ modulo $C_0(\mathbb{R}^2\setminus\{0\})$. So choose any continuous function $f$ on the closed disk such that $f$ is the identity on the boundary, 1 at the centre and has $|f|^2 \le 1$ on the whole disk; for example
\[ f(re^{i\theta}) = re^{i\theta} + (1 - r) \]
will do. Then the required lift is
\[ \tilde w = \begin{pmatrix} f & \sqrt{1 - |f|^2} \\ -\sqrt{1 - |f|^2} & \bar f \end{pmatrix}. \]
This is the analogue of the Bott generator on the punctured disk. To convert this into a projection in M2 (C0 (R2 \{0})) (as opposed to M2 (punctured disk)), we need to compose f with a diffeomorphism h : (0, ∞) → (0, 1). We leave this choice until later. Finally, to obtain generators on the cone, we take the Bott generator on each half and extend them by zero to the other half. To compute the Chern character pairing, first recall that for a projection p [4], X pii , Ch0 (p) = i
Ch2 (p) = −2
X
i,j,k
1 p− 2
ij
⊗ pjk ⊗ pki ,
which is the trace of (p − 21 )dpdp ∈ Ω∗ (M2 (A)), the universal differential algebra [4, p. 320]. We actually require Ch∗ (pB ) − Ch∗ 10 00 , but Ch2 (1) = 0 and Ch0 (pB ) − Trace
1 0 0 0
1 0 = Trace pB − , 0 0
and this trace is zero. Hence 1 Ress=0 TraceH2 (Γ2 (Ch0 (pB ) − Ch0 (1))(1 + D22 )−s ) = 0 , s and we need only worry about the order two pairing. Since this has only a simple pole, we may compute it using the Wodzicki residue [9, 1, 10], via Connes’ trace theorem. Proposition 20. The index pairing between the Bott generator supported on the positive half of the cone and the spectral triple (A, H, D, Γ) described above is h[pB ] − 1, [(A, H, D, Γ)]i = 1 . Proof. To complete the computation of the pairing, we must choose an explicit diffeomorphism h : (0, ∞) → (0, 1). We take 2
h(z) = 1 − e−z ,
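as this allows effective computations. Substituting in the components of $p_B$ into the formula for the Chern character yields $-4i\kappa^2 Kz$, where $K$ is a complicated expression in terms of $z$ and $\theta$. The integral in the $\theta$ direction of $K$ yields
\[ \int_0^{2\pi} K\, d\theta = -4\pi i z (e^{-2z^2} - e^{-z^2}). \]
Together with the definition of the Wodzicki residue of an operator of order $-2$ on the cylinder,
\[ \mathrm{WRes}(A) = \frac{1}{2(2\pi)^2} \int_{\mathrm{Cyl}} \int_{\|\xi\| = 1} \sigma^A_{-2}(x, \xi)\, dS(\xi)\, d\mathrm{Vol}(x), \]
we can compute the pairing. The computation is as follows.
\[ \mathrm{WRes}\Big( \sum_{ijk} (p - 1/2)_{ij} [D, p_{jk}][D, p_{ki}] (1 + D^2)^{-1} \Big) \]
\[ = \frac{-2i}{(2\pi)^2} \int_{-\infty}^{\infty} \int_0^{2\pi} K \int_{\xi_\theta^2 + \xi_z^2 = 1} (z^2 \xi_\theta^2 + \xi_z^2)^{-1}\, dS\; z\, dz\, d\theta \tag{4} \]
\[ = \frac{-i}{2\pi^2} \int_{-\infty}^{\infty} \int_0^{2\pi} K \int_0^{2\pi} (z^2 \sin^2 t + \cos^2 t)^{-1}\, dt\; z\, dz\, d\theta \tag{5} \]
\[ = \frac{-i}{2\pi^2} \int_{-\infty}^{\infty} \int_0^{2\pi} K\, \frac{2\pi}{|z|}\, z\, dz\, d\theta \tag{6} \]
\[ = \frac{-i}{\pi} \int_0^{\infty} (-4\pi i)\, z (e^{-2z^2} - e^{-z^2})\, dz \tag{7} \]
\[ = -4 \Big( \int_0^{\infty} z e^{-2z^2}\, dz - \int_0^{\infty} z e^{-z^2}\, dz \Big) \tag{8} \]
\[ = -4 \Big( \frac{\Gamma(1)}{2^2} - \frac{\Gamma(1)}{2} \Big) \]
\[ = -1 + 2 = 1. \]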
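Two steps of the computation above lend themselves to a quick numerical check: the angular integral in (5), which should equal $2\pi/|z|$, and the final radial integral, which should produce the value 1. The sketch below is an illustration only; the function names, the midpoint rule, the step counts and the truncation of the radial integral at `upper = 8.0` are all assumptions made here.

```python
import math


def angular_integral(z, steps=20000):
    """Midpoint rule for the integral over t in [0, 2 pi] of 1 / (z^2 sin^2 t + cos^2 t)."""
    h = 2.0 * math.pi / steps
    return sum(
        h / (z * z * math.sin((i + 0.5) * h) ** 2 + math.cos((i + 0.5) * h) ** 2)
        for i in range(steps)
    )


def radial_integral(steps=100000, upper=8.0):
    """Midpoint rule for the integral over z in [0, upper] of z (exp(-2 z^2) - exp(-z^2))."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += z * (math.exp(-2.0 * z * z) - math.exp(-z * z))
    return total * h


# (-i / pi) * (-4 pi i) = -4, so the pairing is -4 times the radial integral.
pairing = -4.0 * radial_integral()
```

The radial integral equals $\frac{1}{4} - \frac{1}{2} = -\frac{1}{4}$, so the prefactor $-4$ gives the pairing 1, in agreement with Proposition 20.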
Equation (4) is just the definition of the Wodzicki residue, and we have replaced the sum of products of differentials of components of pB with −4izκ2K, as described above. The integral in Eq. (5) is a standard one, and the equality between (6) and (7) follows from integrating K in the θ direction. Finally we recall, the Bott projector we employ is supported only on a half-line, so the |z| term becomes simply z. It is clear from the above computation that if we begin with the punctured Bott projector on the other half of the cone (i.e. z < 0) we obtain the result −1. A final point to notice is that the trace over the continuous subspace for D and the trace over the kernel of D do not contribute, since both are finite as s → 1. For the kernel this follows from our previous estimates and the independence of this
trace on $s$, and for the continuous subspace it follows from our earlier computation that the trace only becomes singular as $s \to \frac{1}{2}$. Thus only the point spectrum of $D$ contributes to the above pairing.

The industrious reader will find that the explicit expression for the trace of $\mathrm{Ch}_2(p_B)(1 + D^2)^{-s}$ is given by the function
\[ T(s) = \sum_{N>1} \frac{-4\kappa^2}{(1 + 2\kappa^2 N)^s} \sum_{m|N} \sum_{l=1,2} (-1)^l \Big( \frac{N + m}{m^2} A_{N/m+1,m,l} + \frac{N}{m} A_{N/m-1,m,l} + \frac{N}{m^2} A_{N/m,m,l} + \frac{N - m}{m} A_{N/m-2,m,l} \Big), \]
or,
\[ \sum_{n,m>1} \sum_{l=1,2} \frac{-4\kappa^2 (-1)^l}{(1 + 2\kappa^2 nm)^s} \Big( \frac{n + 1}{m} A_{n+1,m,l} + n A_{n-1,m,l} + \frac{n}{m} A_{n,m,l} + (n - 1) A_{n-2,m,l} \Big), \]
the two obviously being equal. Here
\[ A_{n,m,l} = \sqrt{\frac{m}{\pi}} \sum_{k,p=0}^{[n/2]} \frac{(-1)^{k+p}\, n!\, 2^{n-2k-2p}\, m^{n-k-p}\, \Gamma(n - k - p + \frac{1}{2})}{k!\, p!\, (n - 2k)!\, (n - 2p)!\, (m + l)^{n-k-p+\frac{1}{2}}}. \]
Our computations have shown that $T$ is a meromorphic function whose residue at $s = 1$ is precisely 1.

5. Conclusion

Despite obtaining $(p, \infty)$-summable spectral triples from degenerate metrics, our original aim of obtaining a spectral triple with non-simple dimension spectrum failed. We feel that explicit examples of non-simple dimension spectrum are an important step towards understanding the full content of the Local Index theorem of Connes–Moscovici [1].

Acknowledgments

I would like to thank Alan Carey, Fyodor Sukochev and Steven Lord for useful discussions concerning the cone example. I would also like to thank Joseph Varilly for his encouragement and interest. This work was supported by Australian Research Council grant DP0211367.

References

[1] A. Connes and H. Moscovici, The local index formula in noncommutative geometry, Geom. Funct. Anal. 5 (1995), 174–243.
[2] A. Rennie, Smoothness and locality for nonunital spectral triples, K-Theory 28(2) (2003), 127–165.
[3] A. Rennie, Summability for nonunital spectral triples, to appear in K-Theory.
[4] J. M. Gracia-Bondía, J. C. Varilly and H. Figueroa, Elements of Noncommutative Geometry (Birkhauser, Boston, 2001).
[5] A. Connes, Geometry from the spectral point of view, Lett. Math. Phys. 34 (1995), 203–238.
[6] H. Lawson and M. Michelsohn, Spin Geometry (Princeton University Press, 1989).
[7] N. Higson and J. Roe, Analytic K-Homology, Oxford Mathematical Monographs (Oxford University Press, New York, 2000).
[8] R. T. Seeley, Complex powers of an elliptic operator, Proc. Symp. Pure Math. 10 (1967), 288–307.
[9] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[10] M. Wodzicki, Noncommutative Residue. Chapter I: Fundamentals, Lecture Notes in Mathematics, 1289 (Springer, Berlin, 1987), pp. 320–399.
[11] M. Reed and B. Simon, Functional Analysis (Academic Press, 1980).
[12] R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. I (Interscience, New York, 1931).
[13] E. D. Rainville, Special Functions (The MacMillan Company, New York, 1960).
[14] G. H. Hardy and E. M. Wright, An Introduction to the Theory of Numbers (Clarendon Press, Oxford, 1960).
March 31, 2004 10:39 WSPC/148-RMP
00195
Reviews in Mathematical Physics Vol. 16, No. 2 (2004) 147–174 c World Scientific Publishing Company
ON THE SECOND CRITICAL FIELD FOR A GINZBURG–LANDAU MODEL WITH FERROMAGNETIC INTERACTIONS
STAN ALAMA∗ and LIA BRONSARD∗ Department of Mathematics and Statistics, McMaster University Hamilton, Ontario, Canada L8S 4K1
Received 10 November 2003

We consider a two-dimensional Ginzburg–Landau model for superconductors which exhibit ferromagnetic ordering in the superconducting phase, introduced by physicists to describe unconventional p-wave superconductors. In this model the magnetic field is directly coupled to a vector-valued order parameter in the energy functional. We show that one effect of spin coupling is to increase the second critical field $H_{c2}$, the value of the applied magnetic field at which superconductivity is lost in the bulk. Indeed, when the spin coupling is strong we show that the upper critical field is no longer present, confirming predictions in the physics literature. We treat the energy density as a measure, and show that the order parameter converges (as the Ginzburg–Landau parameter $\kappa \to \infty$) in an average sense to a constant determined by the average energy.

Keywords: Calculus of variations; partial differential equations; Ginzburg–Landau model; superconductivity; ferromagnetism.
1. Introduction

Some recent papers in the physics literature have introduced Ginzburg–Landau models for superconductors with ferromagnetic properties, in an attempt to understand experimental results on a new class of Ru-based superconductors (Sr$_2$YRu$_{1-x}$Cu$_x$O$_6$). Ferromagnetism is observed at or near the onset of superconductivity at the critical temperature $T_C$, which leads to the hypothesis that it is the same physical mechanism responsible for both types of ordering. On a microscopic level, physicists believe that these properties may be attributed to a p-wave pairing of Cooper pairs, as opposed to the s-wave pairing in conventional superconductivity, and the d-wave pairing in the high-$T_C$ cuprates. Zhu et al. [1] have derived a Ginzburg–Landau model from first principles assuming p-wave symmetry. To incorporate ferromagnetic effects, they introduced a vectorial order parameter which serves as both the usual wave function for superconducting electrons and a magnetic spin vector. Similar models have been proposed for heavy-fermion superconductivity as well as other types of "unconventional" superconductors, by Wang and Wang [2].

∗ Supported by an NSERC Research Grant.

In this paper we consider a variant on the model of [1, 2] introduced by Knigavko and Rosenstein [3] in which the order parameter's spin is directly coupled to the magnetic field in the energy. They consider a two-dimensional setting, with an order parameter $\Psi = (\psi_1, \psi_2) : \Omega \subset \mathbb{R}^2 \to \mathbb{C}^2$, and a magnetic field $h = \mathrm{curl}\, \vec A$, with $\vec A : \Omega \to \mathbb{R}^2$. The free energy takes the form:
\[ \int_\Omega \frac{\hbar^2}{2m} |\nabla_A \Psi|^2 + \alpha |\Psi|^2 + \frac{\beta_1}{2} |\Psi|^4 + \frac{\beta_2}{2} \big| \psi_1^2 + \psi_2^2 \big|^2 + \frac{1}{8\pi} (h - h_{ex})^2 - \mu\, S\, h, \tag{1} \]
where
\[ |\Psi|^2 = |\psi_1|^2 + |\psi_2|^2, \qquad S = \psi_1 \times \psi_2 = \mathrm{Im}(\psi_1^* \psi_2), \qquad |\nabla_A \Psi|^2 = \sum_{j=1}^{2} \big| (\nabla - i\vec A) \psi_j \big|^2, \]
$h_{ex}$ is the external applied field, and $\mu, \alpha, \beta_1, \beta_2$ are material parameters. $\alpha$ is positive above the critical temperature and negative below. $\beta_1 > 0$ but $\beta_2$ could be of either sign, with $\beta_1 + \beta_2 > 0$. Of particular interest is the Zeeman coupling constant $\mu$ which implements the interaction between the superconductor's spin vector $\vec S = S \vec k$ and the magnetic field $\vec h = h \vec k$.

By some formal arguments and numerical computations, Knigavko and Rosenstein make several predictions for this spin-coupled model. First, they note that the sign of $\beta_2$ distinguishes two very different types of behavior for the model. In Phase I ($\beta_2 > 0$) energy minimization forces the superconductor to have as large a spin $S$ as possible, maximizing the ferromagnetic effects of the model. In Phase II (with $-\beta_1 < \beta_2 < 0$) the potential term penalizes the spin. The intuition behind these observations comes from the identity $|\psi_1^2 + \psi_2^2|^2 = |\Psi|^4 - 4S^2$. They reason that ferromagnetic effects in Phase II will be slight, and concentrate on Phase I, $\beta_2 > 0$ in their paper.

The first part of [3] is devoted to the lower critical field $H_{c1}$. They argue that $H_{c1}$ in Phase I is significantly reduced in comparison with the usual type-II superconductors, decreasing linearly with Zeeman coupling $\mu$, and that in fact above a critical value of the coupling $\mu_{c1}$ the Meissner phase disappears altogether: for $\mu \ge \mu_{c1}$ they predict a spontaneous vortex state present at zero applied field. Our previous paper [4] gives a rigorous analytical justification of these predictions.

Knigavko and Rosenstein also studied the upper critical field, $H_{c2}$, at which superconductivity is lost in the bulk and the sample reverts to the normal state, with $\Psi = 0$ and the external magnetic field penetrating the entire interior of $\Omega$. They predict that the effect of Zeeman coupling is to elevate $H_{c2}$ above its usual
March 31, 2004 10:39 WSPC/148-RMP
00195
Ginzburg–Landau Model with Ferromagnetic Interactions
149
value in the Ginzburg–Landau theory. Indeed, they argue that there is another critical coupling $\mu_{c2}$, above which there is no upper critical field, $H_{c2}(\mu) = +\infty$ for $\mu > \mu_{c2}$. In this supercritical regime the transition from the normal to superconducting states occurs for $T > T_C$, and superconductivity nucleates in increasing external fields. This behavior is exactly the opposite of what is well-known for type-II superconductors. Nevertheless, very similar unconventional properties have also been reported for the organic superconductor TMTSF$_2$PF$_6$ near the upper critical field (see Lebed [5]), and again this is attributed to a microscopic p-wave symmetry.
In this paper we sharpen and verify these conjectures, and study the nature of the superconducting state at applied fields $h_{ex}$ on the order of $H_{c2}$. We also show that the simplistic distinction between Phase I ($\beta_2 > 0$) and Phase II ($-\beta_1 < \beta_2 < 0$) is artificial in this high field regime.
1.1. Heuristics and predictions
We rewrite the energy functional by introducing non-dimensional quantities in the usual way, choosing to measure lengths in units of the penetration depth, $\lambda = \sqrt{\frac{m \beta_1 c^2}{4\pi |\alpha| e^2}}$. In these new units the free energy takes the form:
\[ E(\Psi, \vec A) = \int_\Omega \left[ |\nabla_A \Psi|^2 + \frac{\kappa^2}{2} \left( |\Psi|^2 - 1 \right)^2 + \frac{\kappa^2 \gamma}{2} \left| \psi_1^2 + \psi_2^2 \right|^2 + (h - h_{ex})^2 - 2g\, S\, h \right] . \tag{2} \]
Here $h_{ex}$ is the external applied magnetic field, $\kappa = \lambda/\xi$ is the usual Ginzburg–Landau parameter, and the Zeeman coupling constant is $g = \frac{2mc}{e\hbar}\mu > 0$. Phase I corresponds to $\gamma > 0$, and $\gamma \in (-1, 0)$ for Phase II.
To predict the value of $H_{c2}$, the magnitude of the applied field at which superconductivity disappears in the bulk, physicists examine the linear stability of the normal state ($\Psi \equiv 0$, $\operatorname{curl} \vec A = h_{ex}$). To study the bulk effect (rather than the “surface superconductivity” which appears along the boundary of a finite sample at the upper critical field, $H_{c3}$) it is common to look at the linearization as an operator on the whole $\mathbb{R}^2$.
In our case, the linearized operators for the order parameter are a coupled system of magnetic Schrödinger operators,
\[ L\Psi := -\nabla_{A_0}^2 \Psi - \kappa^2 \Psi - g\, h_{ex}\, \sigma_2 \Psi , \]
for $\Psi = (\psi_1, \psi_2)$, where $A_0$ is chosen with $\operatorname{curl} A_0 = h_{ex}$, and $\sigma_2$ is the Pauli spin matrix given by
\[ \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} . \tag{3} \]
Following [3] we change variables via $\psi_\pm := \frac{1}{\sqrt{2}} (\psi_1 \pm i \psi_2)$ to diagonalize these operators and obtain an equivalent system of uncoupled magnetic Schrödinger operators,
\[ \tilde L_\pm \psi_\pm := -\nabla_{A_0}^2 \psi_\pm + (-\kappa^2 \pm g h_{ex}) \psi_\pm . \]
March 31, 2004 10:39 WSPC/148-RMP
150
00195
S. Alama & L. Bronsard
Since $h_{ex}$ is a constant, we may immediately identify the ground state eigenvalue $\lambda_\pm$ of each of these operators,
\[ \lambda_\pm = h_{ex} - \kappa^2 \pm g\, h_{ex} . \]
Clearly $\lambda_- < \lambda_+$, and the normal state should be stable (locally minimizing) when $\lambda_- > 0$ and unstable when $\lambda_- < 0$. When $0 < g < 1$, this implies that the normal state loses its stability when the applied field crosses below
\[ H_{c2}(g) := \frac{\kappa^2}{1-g} . \]
That is, when $h_{ex} < H_{c2}(g)$ the energy minimizer should have a non-zero order parameter and exhibit superconductivity in the bulk, but for larger values $h_{ex} > H_{c2}(g)$ we expect that energy minimization produces a normal state inside $\Omega$. Note that when $g > 1$ the eigenvalue $\lambda_- < 0$ regardless of the applied field $h_{ex}$. In this regime the normal state is never locally minimizing, and so we do not expect any transition to the normal state no matter how large the applied field is, $H_{c2}(g) = +\infty$ for $g > 1$. Knigavko and Rosenstein [3] denote this critical value of the coupling $g_{c2} = 1$.
Furthermore, we observe that near the critical field $h_{ex} \simeq H_{c2}$ small amplitude solutions of the Euler–Lagrange equations (which ostensibly bifurcate from the normal state at the critical field) should resemble the eigenfunction corresponding to the eigenvalue $\lambda_-$. In particular, the minimizer should have the form $\psi_+ \equiv 0$ with $\psi_-$ a small multiple of the eigenvector of $\tilde L_-$. This structure is exactly the form of $\Psi$ which makes $|\psi_1^2 + \psi_2^2|^2 = 4|\psi_+ \psi_-|^2$ vanish in the energy and gives the largest possible value of the spin $S = \frac12 |\Psi|^2$, and therefore represents what we will call a “Phase I bulk state”. In particular, Knigavko and Rosenstein use this special form as a bulk state for solutions $\Psi$ in Phase I. This special structure of minimizers should arise naturally from the analysis near the critical field, and indeed we will show that minimizers are (approximately) of this form in certain parameter regimes. We note also that this structure is chosen by the quadratic part of the energy, and hence the distinction of Phase I and Phase II cases is misleading near the upper critical field $H_{c2}$. Indeed, the bifurcation at $H_{c2}$ should produce minimizers which resemble the “Phase I bulk state” regardless of the sign of $\gamma$.
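The heuristic computation of the critical field can be written out in a few lines. The sketch below assumes the standard fact, not stated explicitly in the text, that the magnetic Laplacian $-\nabla_{A_0}^2$ on $\mathbb{R}^2$ with constant field $h_{ex} > 0$ has lowest eigenvalue $h_{ex}$ (the lowest Landau level); the formula for $\lambda_\pm$ appears to rest on this.

```latex
% Assumption: \inf \operatorname{spec}(-\nabla_{A_0}^2) = h_{ex} on \mathbb{R}^2 (lowest Landau level).
\begin{align*}
\lambda_\pm &= h_{ex} - \kappa^2 \pm g\,h_{ex} = (1 \pm g)\,h_{ex} - \kappa^2 , \\
\lambda_- < 0 &\iff (1-g)\,h_{ex} < \kappa^2
  \iff h_{ex} < \frac{\kappa^2}{1-g} = H_{c2}(g) \qquad (0 < g < 1) , \\
\lambda_- &\le -\kappa^2 < 0 \quad \text{for every } h_{ex} > 0 \text{ when } g \ge 1 .
\end{align*}
```

The last line is the reason no finite critical field is expected once the coupling reaches $g_{c2} = 1$.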
1.2. Main results
To study this phenomenon using variational analysis we consider the singular limit $\kappa \to \infty$ and recover the physicists’ conjectures in the limit. First, we note that the heuristic arguments above indicate that the critical field $H_{c2}$ is of the order of $\kappa^2$, and so we assume a simple relationship between $h_{ex}$ and $\kappa$,
\[ h_{ex} = \kappa^2 b , \]
for $b > 0$ a fixed constant. We emphasize the dependence of the energy (2) on the parameters by writing $E = \tilde E_{\kappa,b,g}$, and introduce the energy density function $\tilde e_{\kappa,b,g}$ via
\[ \tilde e_{\kappa,b,g}(\Psi, \vec A) := |\nabla_A \Psi|^2 + \frac{\kappa^2}{2} \left( |\Psi|^2 - 1 \right)^2 + \frac{\kappa^2 \gamma}{2} \left| \psi_1^2 + \psi_2^2 \right|^2 + (h - h_{ex})^2 - 2g\, S\, h . \tag{4} \]
First, we show that for such external fields the energy density of minimizers is approximately constant in Ω.
Theorem 1.1. Let $\gamma > -1$, $b, g > 0$, and $(\Psi, \vec A)$ minimizers of $\tilde E_{\kappa,b,g}$. Then, there exists a constant $\tilde F = \tilde F(b, g)$ such that the energy density converges weakly in the sense of measures,
\[ \tilde\mu_\kappa := \frac{1}{\kappa^2}\, \tilde e_{\kappa,b,g}(\Psi, \vec A)\, dx \rightharpoonup \tilde F(b, g)\, L \]
as $\kappa \to \infty$, where $L$ denotes Lebesgue measure.
In fact, the family of functions $\{\tilde e_{\kappa,b,g}(\Psi, \vec A)\}$ converges in the slightly stronger weak* topology on $L^\infty(\Omega)$. The proof of this result follows along the lines of the recent paper of Sandier and Serfaty [6] concerning the $H_{c2}$ transition in the classical Ginzburg–Landau model. Here we introduce a measure-theoretic framework which we believe to be a natural setting for studying Ginzburg–Landau-type functionals at high fields $h_{ex} = O(\kappa^2)$ where (in physicists’ parlance) the vortices are “overlapping” and can no longer be isolated.
The underlying explanation of why the energy is (to the highest order in $\kappa$) equally distributed comes from the idea (already recognized by Pan [7] in studying surface superconductivity) that at fields of order $h_{ex} = O(\kappa^2)$ boundary effects appear in the energy at order $\kappa$, whereas the bulk energy is of the order $\kappa^2$. Hence, sets may be decomposed into cubes by the introduction of extra boundaries without affecting the highest order term in the energy. Since the energy functional is translation invariant, each cube is interchangeable and the energy density must be approximately constant in the sample.
The equidistribution of energy then leads to a convergence of averages of the modulus of the order parameter. We prove the following result for “Phase I”:
Theorem 1.2 (Phase I). Let $\gamma \ge 0$, $g > 0$, $h_{ex} = b\kappa^2$ with $b > 0$, and $(\Psi, \vec A)$ be minimizers of $\tilde E_{\kappa,b,g}$. Then:
(a) There exists a constant $0 < c < 1$ (independent of $\kappa, b, g$) such that for any cube $Q \subset \Omega$,
\[ c\,(1 - (1-g)b)_+^2 \le \liminf_{\kappa\to\infty} \frac{1}{L(Q)} \int_Q |\Psi|^4 \le \limsup_{\kappa\to\infty} \frac{1}{L(Q)} \int_Q |\Psi|^4 \le (1 - (1-g)b)_+^2 . \tag{5} \]
In particular, $\|\Psi\|_{L^4(\Omega)} \to 0$ if and only if $0 < g < 1$ and $b \ge \frac{1}{1-g}$.
(b) $S - \frac12 |\Psi|^2 \to 0$ as $\kappa \to \infty$, in $L^1(\Omega)$ when $\gamma \ge 1$, or in $L^2(\Omega)$ provided $b \ge \frac{1}{g+1}$.
(c) If either $b \ge \frac{1}{1+g}$ or $\gamma \ge 1$, then $|\Psi|^4$ converges in the weak* topology of $L^\infty(\Omega)$.
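Theorem 1.2 suggests the following definition of an upper critical field,
\[ b_{c2}(g) = \inf\left\{ b > 0 : \limsup_{\kappa\to\infty} \int_\Omega |\Psi|^4\, dx = 0 \text{ for minimizers } (\Psi, \vec A) \text{ of } \tilde E_{\kappa,b,g} \text{ with applied field } h_{ex} = \kappa^2 b \right\} = \begin{cases} \dfrac{1}{1-g} , & \text{if } 0 < g < 1 , \\ +\infty , & \text{if } g \ge 1 , \end{cases} \]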
and a critical value of the coupling constant $g_{c2} = 1$. In the case of subcritical coupling ($0 < g < 1$) we observe the average value of $|\Psi|^4$ (which measures the square of the density of superconducting electrons) decreases as $h_{ex} = b\kappa^2$ increases, reaching zero total density at the critical value $h_{c2} = b_{c2}\kappa^2$. With supercritical coupling ($g \ge 1$), the average value of $|\Psi|^4$ does not decrease with increasing $b$; in fact the averages increase with the applied field strength for $g > 1$. This is a consequence of the direct form of the coupling of $h$ to $\Psi$ in the Euler–Lagrange equations, and an effect not foreseen in the physics paper [3]. Note that assertion (b) implies that the minimizers approach the physicists’ Phase I bulk state in the large $\kappa$ limit.
In Phase II we must restrict the parameters somewhat to obtain a similar result:
Theorem 1.3 (Phase II). Let $-\frac12 \le \gamma < 0$, $g > 0$, and $h_{ex} = b\kappa^2$ with $b \ge \frac{1}{1+g}$. Then the conclusions of Theorem 1.2 hold.
The restriction on $\gamma$ is technical, but it is not clear whether the full result of Theorem 1.2 should be expected for all Phase II cases. Indeed, conclusion (b) asserts that minimizers approximately resemble the Phase I bulk state ($S = \frac12 |\Psi|^2$, as predicted by the linear heuristic, independently of the value of $\gamma$). However, it is unlikely that the order parameter prefers this arrangement very far beyond the transition point, since the fourth-order term $\gamma |\psi_1^2 + \psi_2^2|^2$ should become more important as the average value of $|\Psi|$ increases.
In proving both theorems we find upper and lower bounds on the minimum energy via comparison with rescaled Ginzburg–Landau functionals. Indeed, the rescaling is suggested by the physicists’ intuitive idea that vortices become “narrower” due to the spin coupling, and can be packed more densely than would normally be allowed in a classical Ginzburg–Landau setting. The average value of $|\Psi|^4$ is then related to the limiting energy density using techniques from [6].
1.3. Recovery of superconductivity for $T > T_C$
Another unconventional prediction in [3] is that in supercritical coupling $g > g_{c2} = 1$ the transition between a superconducting and normal bulk occurs for temperatures higher than the critical temperature, $T > T_C$, and for increasing external fields.
To model the behavior above $T_C$, we assume that only the coefficient $\alpha = \alpha_T$ in the quadratic term in the functional is temperature dependent, with $\alpha_T > 0$ for $T > T_C$ and $\alpha_T < 0$ for $T < T_C$. We fix a reference temperature $T_0 < T_C$ and define our non-dimensional units with respect to that reference temperature, with length scale $\lambda = \sqrt{\frac{m \beta_1 c^2}{4\pi |\alpha_{T_0}| e^2}}$. This leads to a functional of the form
\[ E(\Psi, \vec A) = \int_\Omega \left[ |\nabla_A \Psi|^2 + \frac{\kappa^2}{2} \left( |\Psi|^2 + a_T \right)^2 + \frac{\kappa^2 \gamma}{2} \left| \psi_1^2 + \psi_2^2 \right|^2 + (h - h_{ex})^2 - 2g\, S\, h \right] , \tag{6} \]
with $a_T$ carrying the temperature dependence: $a_T > 0$ if $T > T_C$ and $a_T < 0$ for $T < T_C$. The value $a_T = -1$ corresponds to the reference temperature $T = T_0 < T_C$ and recovers the original energy as defined in (2). Applying the heuristic arguments above to the linearization about the normal state now yields a ground-state eigenvalue
\[ \lambda_- = h_{ex}(1 - g) + \kappa^2 a_T , \]
and a prediction of a critical field $H_c(g, T) = \frac{\kappa^2 a_T}{g-1}$. In other words, when $g > 1$ we expect that the normal state is minimizing for $h_{ex} < H_c(g, T)$ but that the minimizers exhibit bulk superconductivity for $h_{ex} > H_c(g, T)$. This situation is quite counterintuitive, since it is exactly the opposite of what is known to be true for the Ginzburg–Landau functional, and in [3] it is recognized that this simplistic temperature dependence of the energy may only be valid very close to the critical temperature $T_C$. Again, we consider a $\kappa \to \infty$ limit and prove the following result, which confirms the nucleation in increasing fields for Phase I and certain Phase II models:
Theorem 1.4 ($T > T_C$). Let $\gamma \ge -\frac12$, $a_T > 0$, $g > 0$, and $h_{ex} = b\kappa^2$ with $b > 0$, and $(\Psi, \vec A)$ minimizers of (6). Then, $|\Psi|^4$ converges in the weak* topology of $L^\infty(\Omega)$, and there exists a constant $0 < c < 1$ (independent of $\kappa, b, g$) such that for any cube $Q \subset \Omega$,
\[ c\,((g-1)b - a_T)_+^2 \le \lim_{\kappa\to\infty} \frac{1}{L(Q)} \int_Q |\Psi|^4 \le ((g-1)b - a_T)_+^2 . \tag{7} \]
In particular, $\|\Psi\|_{L^4(\Omega)} \to 0$ if and only if $0 < g < 1$ or $g > 1$ and $b \le \frac{a_T}{g-1}$. Moreover, $S - \frac12 |\Psi|^2 \to 0$ as $\kappa \to \infty$ in $L^2(\Omega)$.
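In the scaled variable $b = h_{ex}/\kappa^2$, the sign of the heuristic eigenvalue lines up with the positive part appearing in (7). A short check, using only the formula for $\lambda_-$ stated above for $T > T_C$:

```latex
\begin{align*}
\frac{\lambda_-}{\kappa^2} &= (1-g)\,b + a_T = -\bigl[(g-1)\,b - a_T\bigr] , \\
\lambda_- < 0 &\iff (g-1)\,b > a_T
  \iff g > 1 \ \text{ and } \ b > \frac{a_T}{g-1}
  \qquad (a_T > 0,\ b > 0) .
\end{align*}
```

So the bound $((g-1)b - a_T)_+^2$ is nonzero exactly in the regime where the linearization predicts an unstable normal state.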
Note again that the superconducting state resembles the “Phase I bulk state” ($S \equiv \frac12 |\Psi|^2$) regardless of the sign of $\gamma \ge -\frac12$, as predicted by the heuristic argument. We also remark that, just as for the classical Ginzburg–Landau model, there should be another critical field $H_c'(g, T) < H_c(g, T)$ at which superconductivity appears along the surface of $\Omega$, forming a thin superconducting sheath as the applied field increases. The appearance of this surface effect for conventional superconductors in decreasing fields was first remarked by the physicists DeGennes and
Saint-James and has been the subject of many interesting mathematical papers; see Bauman, Phillips and Tang [8], Lu and Pan [9], Del Pino, Felmer and Sternberg [10], Helffer and Morame [11], and Pan [7] among others. Here we will treat bulk (interior) superconductivity only, as the analytical methods for studying these two critical phenomena are rather different.
2. The Energy and Euler–Lagrange Equations
Throughout the paper we will make the hypothesis that the applied field $h_{ex} = b\kappa^2$, for $b > 0$ a constant. We label the energy $E = E_{\kappa,b,g}$ defined in (2) to emphasize the dependence on the parameters. We introduce a function space for admissible configurations,
\[ H = H_\Omega := \{ (\Psi, \vec A) \in H^1(\Omega; \mathbb{C}^2) \times H^1(\Omega; \mathbb{R}^2) \} . \]
In fact, for any regular domain $\omega \subset \Omega$ (that is, a domain with Lipschitz boundary) we define the space $H_\omega$ analogously. Since the functional is gauge invariant,
\[ E(\Psi \exp[i\phi(x)], \vec A + \nabla\phi(x)) = E(\Psi, \vec A) \]
for all real-valued functions $\phi(x)$, we should fix a gauge in order to eliminate the pernicious effect of this non-compact group action. This may be done in a standard way using the Coulomb gauge, with $\operatorname{div} \vec A = 0$ in $\Omega$ and $\vec A \cdot \vec n = 0$ on $\partial\Omega$.
The Euler–Lagrange equations are:
\[ -(\nabla - i\vec A)^2 \Psi + \kappa^2 \left( |\Psi|^2 - 1 \right) \Psi + \kappa^2 \gamma \left( \psi_1^2 + \psi_2^2 \right) \Psi^* - g\, h\, \sigma_2 \Psi = 0 , \tag{8} \]
\[ \nabla^\perp \left( h - h_{ex} - g(\psi_1 \times \psi_2) \right) + \sum_{k=1,2} \left( \psi_k \times (\nabla - i\vec A)\psi_k \right) = 0 , \tag{9} \]
where Eq. (8) denotes a system of two equations for $\Psi = (\psi_1, \psi_2)$, $\Psi^* = (\psi_1^*, \psi_2^*)$ denotes the complex conjugate, and the Pauli spin matrix $\sigma_2$ is defined in (3). The boundary conditions are:
\[ (\nabla - i\vec A)\psi_k \cdot n = 0 \ (k = 1, 2) , \qquad h - h_{ex} - g(\psi_1 \times \psi_2) = 0 \quad \text{on } \partial\Omega . \tag{10} \]
We have the following a priori estimates of solutions.
Proposition 2.1. Let $(\Psi, \vec A)$ be critical points with $h_{ex} = b\kappa^2$ for fixed constant $b > 0$, and energy $E_{\kappa,b,g}(\Psi, \vec A) = O(\kappa^2)$. Then there exists a constant $C = C(b, g, \Omega) > 0$ (independent of $\kappa$) such that:
\[ \sup_\Omega |\Psi|^2 \le 1 + bg + \frac{C}{\kappa} , \tag{11} \]
\[ \sup_\Omega |h - h_{ex}| \le C\kappa , \tag{12} \]
\[ \sup_\Omega |\nabla_A \Psi| \le C\kappa . \tag{13} \]
The proof is a modification of an analogous result presented in [4] (based on the method of [12]; see also [9]) and is deferred to a later section. Note that the usual simple proof of the boundedness of $|\Psi(x)|$ via the maximum principle fails here due to the coupling of the spin term to the magnetic field, and the bounds must be obtained by a bootstrap argument applied to the entire system.
We now redefine the energy in a convenient form for our later analysis. Following the heuristic analysis of [3] we make a rotation in the image space which diagonalizes the quadratic part of the energy: define
\[ \psi_\pm = \frac{1}{\sqrt{2}} (\psi_1 \pm i\psi_2) . \]
We continue to use the symbol $\Psi$ to denote the $\mathbb{C}^2$-valued order parameter, writing $\Psi = [\psi_+, \psi_-]$ when expressing $\Psi$ in this rotated frame. We then calculate,
\[ |\Psi|^2 = |\psi_+|^2 + |\psi_-|^2 , \qquad |\nabla_A \Psi|^2 = |\nabla_A \psi_+|^2 + |\nabla_A \psi_-|^2 , \]
\[ \psi_+ \psi_- = \frac12 \left( \psi_1^2 + \psi_2^2 \right) , \qquad S = \frac12 \left( |\psi_-|^2 - |\psi_+|^2 \right) . \]
We define the energy
\[ E_{\kappa,b,g}(\Psi, \vec A) = \int_\Omega e_{\kappa,b,g}(\Psi, \vec A)\, dx \]
with energy density
\[ e_{\kappa,b,g}(\Psi, \vec A) = |\nabla_A \Psi|^2 + \frac{\kappa^2}{2} \left( |\Psi|^2 - 1 \right)^2 + 2\kappa^2 \gamma\, |\psi_+ \psi_-|^2 + (h - h_{ex})^2 - g h \left( |\psi_-|^2 - |\psi_+|^2 \right) + \kappa^2 K_{b,g,\gamma} , \]
with constant
\[ K_{b,g,\gamma} := \begin{cases} 1 + g^2 b^2 , & \text{if } \gamma \ge -\frac12 , \\ \dfrac{1 + g^2 b^2}{2(1+\gamma)} , & \text{if } -1 < \gamma < -\frac12 . \end{cases} \tag{14} \]
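The rotated-frame identities used here (and the identity $|\psi_1^2 + \psi_2^2|^2 = |\Psi|^4 - 4S^2$ from the Introduction) follow from a direct expansion. The block below is a sketch of that algebra; it uses only the definitions $\psi_\pm = (\psi_1 \pm i\psi_2)/\sqrt{2}$ and $S = \operatorname{Im}(\psi_1^*\psi_2)$, together with the elementary fact $\operatorname{Re}(iz) = -\operatorname{Im}(z)$.

```latex
\begin{align*}
\psi_+\psi_- &= \tfrac12 (\psi_1 + i\psi_2)(\psi_1 - i\psi_2) = \tfrac12 \bigl(\psi_1^2 + \psi_2^2\bigr) , \\
|\psi_\pm|^2 &= \tfrac12 \bigl( |\psi_1|^2 + |\psi_2|^2 \bigr) \pm \operatorname{Re}\bigl( i\,\psi_1^*\psi_2 \bigr)
             = \tfrac12 |\Psi|^2 \mp \operatorname{Im}(\psi_1^*\psi_2) , \\
|\psi_+|^2 + |\psi_-|^2 &= |\Psi|^2 , \qquad
|\psi_-|^2 - |\psi_+|^2 = 2\operatorname{Im}(\psi_1^*\psi_2) = 2S , \\
\bigl|\psi_1^2 + \psi_2^2\bigr|^2 &= 4\,|\psi_+|^2 |\psi_-|^2
  = \bigl(|\psi_+|^2 + |\psi_-|^2\bigr)^2 - \bigl(|\psi_-|^2 - |\psi_+|^2\bigr)^2
  = |\Psi|^4 - 4S^2 .
\end{align*}
```

In particular $\frac{\kappa^2\gamma}{2}|\psi_1^2+\psi_2^2|^2 = 2\kappa^2\gamma|\psi_+\psi_-|^2$ and $-2gSh = -gh(|\psi_-|^2 - |\psi_+|^2)$, which is how the density in (4) turns into the rotated form, up to the added constant $\kappa^2 K_{b,g,\gamma}$.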
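The constant term $K_{b,g,\gamma}$ is chosen such that the energy density of any minimizer is positive pointwise in $\Omega$,
\[ \frac{1}{\kappa^2}\, e_{\kappa,b,g}(\Psi, \vec A) \ge \frac12 + o(1) . \]
Indeed, using (11) and (12) in the case $\gamma \ge -\frac12$ we have:
\begin{align*}
\frac12 \left( |\Psi|^2 - 1 \right)^2 &+ 2\gamma\, |\psi_+ \psi_-|^2 - g\, \frac{h_{ex}}{\kappa^2} \left( |\psi_-|^2 - |\psi_+|^2 \right) \\
&= \frac12 |\psi_+|^4 + \frac12 |\psi_-|^4 + (gb - 1)|\psi_+|^2 - (gb + 1)|\psi_-|^2 + (1 + 2\gamma)\, |\psi_+ \psi_-|^2 + \frac12 + o(1) \\
&= \frac12 \left( |\psi_+|^2 - [1 - gb] \right)^2 + \frac12 \left( |\psi_-|^2 - [1 + gb] \right)^2 + (1 + 2\gamma)\, |\psi_+ \psi_-|^2 - \frac12 - g^2 b^2 + o(1) \\
&\ge - \left( \frac12 + g^2 b^2 \right) + o(1) .
\end{align*}
For $-1 < \gamma < -\frac12$, we split the quartic terms in the second line to arrive at
\begin{align*}
\frac12 \left( |\Psi|^2 - 1 \right)^2 &+ 2\gamma\, |\psi_+ \psi_-|^2 - g\, \frac{h_{ex}}{\kappa^2} \left( |\psi_-|^2 - |\psi_+|^2 \right) \\
&= -\frac{1 + 2\gamma}{2} \left( |\psi_-|^2 - |\psi_+|^2 \right)^2 + \frac12 - \frac{1 + g^2 b^2}{2(1+\gamma)} \\
&\quad + (1 + \gamma) \left[ \left( |\psi_+|^2 - \frac{1 - gb}{2(1+\gamma)} \right)^2 + \left( |\psi_-|^2 - \frac{1 + gb}{2(1+\gamma)} \right)^2 \right] + o(1) \\
&\ge \frac12 - \frac{1 + g^2 b^2}{2(1+\gamma)} + o(1) .
\end{align*}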
3. Equidistribution of Energy
Now we let $(\Psi_D, \vec A_D)$ denote energy minimizers of $E_{\kappa,b,g}$ in a regular domain $D \subset \mathbb{R}^2$, and define the measure $\mu_{D,\kappa}$ with density $\frac{1}{\kappa^2} e_{\kappa,b,g}(\Psi_D, \vec A_D)$: for any Borel set $E \subset D$,
\[ \mu_{D,\kappa}(E) := \frac{1}{\kappa^2} \int_E e_{\kappa,b,g}(\Psi_D, \vec A_D)\, dx . \]
We denote $\mu_\kappa := \mu_{\Omega,\kappa}$ when the underlying domain is the sample domain $D = \Omega$. We also define a set function $\nu_\kappa$ for all regular domains $\omega \subset \Omega$ (for our purposes, $\omega$ will be a smooth subdomain or a cube),
\[ \nu_\kappa(\omega) := \inf\left\{ \frac{1}{\kappa^2} \int_\omega e_{\kappa,b,g}(\Phi, \vec B)\, dx : (\Phi, \vec B) \in H_\omega \right\} = \mu_{\omega,\kappa}(\omega) . \]
The advantage of treating the energy as a measure is that it leads to a weak convergence result. Using the normal state ($\Psi = 0$, $\vec A$ with $\operatorname{curl} \vec A = h_{ex}$) as a test function, we have the trivial upper bound,
\[ \mu_\kappa(\Omega) = \frac{1}{\kappa^2} \inf_{H_\Omega} E_{\kappa,b,g}(\Psi, \vec A) \le \left( \frac12 + K_{b,g,\gamma} \right) L(\Omega) . \]
Hence, there exists a subsequence $\kappa_n \to \infty$ and a Radon measure $\mu_*$ with
\[ \mu_{\kappa_n} \rightharpoonup \mu_* \]
in the sense of measures. In fact we can say a bit more: by Proposition 2.1 the family of functions $\{ e_{\kappa,b,g}(\Psi, \vec A) \}$ is uniformly bounded in $L^\infty(\Omega)$, and hence by Banach–Alaoglu there is a convergent subsequence in the $L^\infty(\Omega)$ weak* sense. However, for the analysis of the limit we prefer to think of the energy as defining a measure.
Lemma 3.1. If $E = \cup_{j=1}^N Q_\delta^j$ is a finite disjoint union of congruent cubes $Q_\delta^j$ of size $\delta$, then
\[ \mu_\kappa(E) = N \nu_\kappa(Q_\delta) + O\!\left( \frac{N\delta}{\kappa} \right) . \]
Note that the error made by decomposing E into cubes and minimizing in each is proportional to the total perimeter of the cubes.
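Proof. This result is essentially contained in [6, Lemma III.1]; we provide a sketch here. Denote by $(\Psi_\Omega, \vec A_\Omega)$ the minimizer of $E_{\kappa,b,g}$ in $\Omega$. Since $(\Psi_\Omega, \vec A_\Omega)$ restricted to each $Q_\delta^j$ is an admissible test function for the minimization problem in $Q_\delta^j$, we have
\[ \mu_\kappa(E) \ge \sum_{j=1}^N \nu_\kappa(Q_\delta^j) = N \nu_\kappa(Q_\delta) . \tag{15} \]
To prove the opposite inequality, let $(\Psi_Q, \vec A_Q)$ be minimizers in a cube $Q_\delta$, the minimizers being the same modulo translation in each cube. Introduce a smooth cut-off $\eta_\kappa(x)$ with $0 \le \eta_\kappa(x) \le 1$ and
\[ \eta_\kappa(x) = \begin{cases} 1 , & \text{if } \operatorname{dist}(x, \partial Q) < \frac{1}{\kappa} , \\ 0 , & \text{if } \operatorname{dist}(x, \partial Q) > \frac{2}{\kappa} , \end{cases} \]
and replicate $\eta_\kappa^j(x)$ by translation to each cube $Q_\delta^j$. We patch together a modified configuration $(\hat\Psi, \hat A)$, defined with
\[ (\hat\Psi, \hat A) = \begin{cases} \eta_\kappa^j(x) (\Psi_\Omega, \vec A_\Omega) + (1 - \eta_\kappa^j(x)) (\Psi_Q, \vec A_Q) , & \text{for } x \in Q_\delta^j , \\ (\Psi_\Omega, \vec A_\Omega) , & \text{for } x \notin E . \end{cases} \]
The calculations of [6, Lemma III.1], based on the a priori estimates of the form obtained in Proposition 2.1, then show
\[ \frac{1}{\kappa^2} \int_{Q_\delta^j} e_{\kappa,b,g}(\hat\Psi, \hat A)\, dx = \frac{1}{\kappa^2} \int_{Q_\delta^j} e_{\kappa,b,g}(\Psi_Q, \vec A_Q)\, dx + O\!\left( \frac{\delta}{\kappa} \right) = \nu_\kappa(Q_\delta) + O\!\left( \frac{\delta}{\kappa} \right) . \tag{16} \]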
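In other words, the error made in truncation is proportional to the perimeter of the cube.
By construction, $(\hat\Psi, \hat A)$ is admissible for the minimization problem in $\Omega$, and therefore we calculate
\begin{align*}
0 &\le \frac{1}{\kappa^2} \left[ E_{\kappa,b,g}(\hat\Psi, \hat A) - E_{\kappa,b,g}(\Psi_\Omega, \vec A_\Omega) \right] \\
&= \frac{1}{\kappa^2} \int_E \left[ e_{\kappa,b,g}(\hat\Psi, \hat A) - e_{\kappa,b,g}(\Psi_\Omega, \vec A_\Omega) \right] \\
&= \frac{1}{\kappa^2} \sum_{j=1}^N \int_{Q_\delta^j} e_{\kappa,b,g}(\hat\Psi, \hat A) - \mu_\kappa(E) \\
&\le N \left[ \nu_\kappa(Q_\delta) + O\!\left( \frac{\delta}{\kappa} \right) \right] - N \nu_\kappa(Q_\delta) \\
&= O\!\left( \frac{N\delta}{\kappa} \right) .
\end{align*}
(We have used (15) and (16) in the fourth line.) In particular, all the terms in the above inequality are of the same order, and hence
\[ \mu_\kappa(E) \le \frac{1}{\kappa^2} \sum_{j=1}^N \int_{Q_\delta^j} e_{\kappa,b,g}(\hat\Psi, \hat A) + O\!\left( \frac{N\delta}{\kappa} \right) , \]
and the proof is complete.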
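Next we show that the measures $\mu_\kappa$ are approximately equal to a constant times the Lebesgue measure.
Lemma 3.2. There exists a constant $F_\kappa = F_\kappa(b, g) > 0$ such that for any dyadic cube $Q_{2^{-m}}$,
\[ \mu_{\Omega,\kappa}(Q_{2^{-m}}) = \nu_\kappa(Q_{2^{-m}}) + O\!\left( \frac{2^{-m}}{\kappa} \right) = F_\kappa\, L(Q_{2^{-m}}) + O\!\left( \frac{2^{-m}}{\kappa} \right) . \]
Proof. First, we note that $\mu_\kappa$ is (approximately) translation invariant. Indeed, let $Q \subset \Omega$ denote any cube. We observe that if $Q_\tau = Q + \tau$ is a translate of $Q$ by vector $\tau$, and $Q, Q_\tau \subset \Omega$, then
\[ \mu_\kappa(Q_\tau) = \nu_\kappa(Q_\tau) + O(2^{-m}/\kappa) = \nu_\kappa(Q) + O(2^{-m}/\kappa) = \mu_\kappa(Q) + O(2^{-m}/\kappa) , \]
since $\nu_\kappa$ is translation invariant. Following the proof of [13, Theorem 2.2(d)] we fix any (small) cube $Q_1 \subset \Omega$ and define
\[ F_\kappa(b, g) := \frac{\nu_\kappa(Q_1)}{L(Q_1)} = \frac{\mu_\kappa(Q_1)}{L(Q_1)} + O\!\left( \frac{1}{\kappa} \right) , \]
where the second equality follows from Lemma 3.1 with $N = 1$. Denote by $Q_\lambda$ a cube with sides of length $\lambda$, and decompose $Q_1$ into disjoint dyadic cubes,
\[ Q_1 = \bigcup_{j=1}^{4^m} Q_{2^{-m}}^{(j)} . \]
Then,
\begin{align*}
\mu_\kappa(Q_1) &= \sum_{j=1}^{4^m} \mu_\kappa(Q_{2^{-m}}^{(j)}) \\
&= \sum_{j=1}^{4^m} \left[ \nu_\kappa(Q_{2^{-m}}^{(j)}) + O(2^{-m}/\kappa) \right] \\
&= 4^m \nu_\kappa(Q_{2^{-m}}) + O(2^m/\kappa) = 4^m \mu_\kappa(Q_{2^{-m}}) + O(2^m/\kappa) .
\end{align*}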
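Therefore,
\[ \mu_\kappa(Q_{2^{-m}}) = 4^{-m} \mu_\kappa(Q_1) + O(2^{-m}/\kappa) = F_\kappa(b, g) L(Q_{2^{-m}}) + O(2^{-m}/\kappa) . \]
Remark 3.3. Note that the constant $F_\kappa = F_\kappa(b, g)$ is independent of the domain $\Omega$.
In order to pass to the limit, we show that $F_\kappa(b, g)$ is approximately monotone in $\kappa$:
Lemma 3.4. There exists a constant $c_1 > 0$ such that
\[ F_\kappa \ge F_{\lambda\kappa} - \frac{c_1}{\sqrt{\kappa}} \]
holds uniformly for $\lambda \in (0, 1]$.
Proof. We follow the construction in [6, Lemma III.3]. Let $(\Psi, \vec A)$ be minimizers of $E_{\kappa,b,g}$ in $\Omega$, $0 < \lambda < 1$, and take the cube $Q_1$ from Lemma 3.2. Rescaling,
\[ \Phi(x) = \Psi(\lambda x) = \Psi(y) , \qquad \vec B(x) = \lambda \vec A(\lambda x) = \lambda \vec A(y) , \qquad H(x) = \operatorname{curl} \vec B(x) = \lambda^2 h(y) , \]
for $x \in \Omega_\lambda := \{ x : y = \lambda x \in \Omega \}$. We calculate:
\begin{align*}
\mu_\kappa(Q_1) &= \frac{1}{\kappa^2} \int_{Q_1} e_{\kappa,b,g}(\Psi, \vec A)\, dy \\
&= \frac{1}{\kappa^2} \int_{Q_{1/\lambda}} e_{\lambda\kappa,b,g}(\Phi, \vec B)\, dx + \left( \frac{1}{\lambda^2} - 1 \right) \frac{1}{\kappa^2} \int_{Q_{1/\lambda}} (H - \lambda^2 \kappa^2 b)^2\, dx \\
&\ge \lambda^2 \nu_{\lambda\kappa}(Q_{1/\lambda}) + (1 - \lambda^2) \frac{1}{\kappa^2} \int_{Q_1} (h - h_{ex})^2\, dy . \tag{17}
\end{align*}
For any $m \in \mathbb{N}$ let $M = [2^m/\lambda]$ (the integer part). Then we approximate $Q_{1/\lambda}$ by the union $\tilde Q_M$ of $M^2$ disjoint dyadic cubes $Q_{2^{-m}}$. The error in total area is $L(Q_{1/\lambda} \setminus \tilde Q_M) \le 2^{-m}/\lambda$. Using Lemma 3.2 we have
\begin{align*}
\mu_{\lambda\kappa}(Q_{1/\lambda}) \ge \mu_{\lambda\kappa}(\tilde Q_M) &= M^2 \nu_{\lambda\kappa}(Q_{2^{-m}}) + O\!\left( \frac{M^2 2^{-m}}{\kappa} \right) \\
&= M^2 F_{\lambda\kappa} L(Q_{2^{-m}}) + O\!\left( \frac{2^m}{\lambda^2 \kappa} \right) \\
&= F_{\lambda\kappa} \left[ L(Q_{1/\lambda}) - L(Q_{1/\lambda} \setminus \tilde Q_M) \right] + O\!\left( \frac{2^m}{\lambda^2 \kappa} \right) \\
&= F_{\lambda\kappa} L(Q_{1/\lambda}) + O\!\left( \frac{2^m}{\lambda^2 \kappa} \right) + O\!\left( \frac{2^{-m}}{\lambda} \right) . \tag{18}
\end{align*}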
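For any $\kappa > 0$ we may choose $m \in \mathbb{N}$ such that $\sqrt{\kappa} \le 2^m < 2\sqrt{\kappa}$. For that choice of $m$, we then obtain:
\[ \lambda^2 \mu_{\lambda\kappa}(Q_{1/\lambda}) \ge F_{\lambda\kappa} L(Q_1) + O(1/\sqrt{\kappa}) \]
uniformly in $\lambda \in (0, 1]$. By definition (see the proof of Lemma 3.2) and by Lemma 3.1,
\[ F_\kappa L(Q_1) = \nu_\kappa(Q_1) = \mu_\kappa(Q_1) + o(1) . \]
Applying this observation and (18) to (17) we obtain the desired result.
Lemma 3.4 is sufficient to pass to the limit $\kappa \to \infty$ without recourse to subsequences:
Lemma 3.5. There exists a constant $F(b, g)$ with $\frac12 \le F(b, g) \le (\frac12 + K_{b,g,\gamma})$ such that for every fixed $b, g \ge 0$, $F_\kappa(b, g) \to F(b, g)$ as $\kappa \to \infty$. Moreover, for any Borel set $B \subset \Omega$,
\[ \lim_{\kappa\to\infty} \mu_\kappa(B) = \mu_*(B) = F(b, g)\, L(B) . \]
As already noted above, we may also conclude convergence of $\{ \frac{1}{\kappa^2} e_{\kappa,b,g}(\Psi, \vec A) \}$ in the weak* topology of $L^\infty(\Omega)$.
Proof. First, since
\[ \frac12 L(\Omega) \le \frac{1}{\kappa^2} \inf E_{\kappa,b,g} \le \left( \frac12 + K_{b,g,\gamma} \right) L(\Omega) , \]
by Lemmas 3.1 and 3.2 we have $\frac12 \le F_\kappa(b, g) \le \frac12 + K_{b,g,\gamma} + o(1)$. We define
\[ F(b, g) := \limsup_{\kappa\to\infty} F_\kappa(b, g) \le \frac12 + K_{b,g,\gamma} . \]
For any $\varepsilon > 0$ there is a $\kappa_0$ large enough such that $F(b, g) - \varepsilon \le F_{\kappa_0}(b, g)$ and $F_\kappa(b, g) \le F(b, g) + \varepsilon$ for all $\kappa > \kappa_0$. Then for any $\kappa > \kappa_0$, choose $\lambda = \kappa_0/\kappa < 1$ and apply Lemma 3.4 to obtain:
\[ F(b, g) + \varepsilon > F_\kappa(b, g) \ge F_{\kappa_0}(b, g) - \frac{c_1}{\sqrt{\kappa}} \ge F(b, g) - \frac{c_1}{\sqrt{\kappa}} - \varepsilon . \]
In particular, $F_\kappa \to F$.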
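For any cube $Q_\delta$ we may cover $\partial Q_\delta$ with a sequence $E_m$ of open sets, each $E_m$ a disjoint union of dyadic cubes of fixed size $Q_{2^{-m}}$. The number of such cubes in the covering is at most $N_m = 4\delta \cdot 2^m$, and hence
\begin{align*}
\mu_\kappa(\partial Q_\delta) \le \mu_\kappa(E_m) &\le N_m \nu_\kappa(Q_{2^{-m}}) + O\!\left( \frac{N_m \delta 2^{-m}}{\kappa} \right) \\
&\le F_\kappa(b, g) N_m 4^{-m} + O\!\left( \frac{\delta^2}{\kappa} \right) .
\end{align*}
By weak convergence,
\[ \mu_*(\partial Q_\delta) \le \mu_*(E_m) \le \liminf_{\kappa\to\infty} \mu_\kappa(E_m) \le \delta 2^{-m} F(b, g) . \]
Since $m$ is arbitrary, we have $\mu_*(\partial Q_\delta) = 0$, and therefore for any cube $Q_\delta$ we conclude (see [14])
\[ \mu_*(Q_\delta) = \lim_{\kappa\to\infty} \mu_\kappa(Q_\delta) = F(b, g) L(Q_\delta) . \]
Any open set $E \subset \Omega$ is a countable disjoint union of dyadic cubes $E = \cup Q_j$. From the a priori bounds in Proposition 2.1 we have $\mu_\kappa(Q_j) \le C L(Q_j)$, with constant $C$ independent of $\kappa, j$. Therefore we may apply Lebesgue dominated convergence to the sum,
\[ \lim_{\kappa\to\infty} \mu_\kappa(E) = \lim_{\kappa\to\infty} \sum \mu_\kappa(Q_j) = \sum \mu_*(Q_j) = \sum F(b, g) L(Q_j) = F(b, g) L(E) . \]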
The result extends to Borel sets E by the regularity of the measure.
Remark 3.6. (a) We note that the uniformity in the statement of Lemma 3.4 is needed to arrive at the conclusions to Lemma 3.5. Indeed, the counterexample $F_\kappa = \sin[\ln(\ln \kappa)]$ satisfies $F_\kappa \ge F_{\lambda\kappa} + o(1)$ for each fixed $\lambda$, yet it diverges (cf. [6, III.12]).
(b) From Remark 3.3 we conclude that $F(b, g)$ is independent of the domain $\Omega$.
As a corollary to the proof of Lemma 3.4 we obtain the following further information about the limiting energy density:
Lemma 3.7. For any minimizer $(\Psi, \vec A)$ of $E_{\kappa,b,g}$ in $\Omega$,
\[ \lim_{\kappa\to\infty} \frac{1}{\kappa^2} \int_\Omega (h - h_{ex})^2\, dx = 0 . \tag{19} \]
Proof. Since $\Omega$ is open it decomposes into a disjoint countable union $\Omega = \cup Q_j$ of dyadic cubes. We fix any $\lambda \in (0, 1)$ and apply (17) to each dyadic cube $Q_j$ to obtain
\[ (1 - \lambda^2) \frac{1}{\kappa^2} \int_{Q_j} (h - h_{ex})^2\, dx \le \mu_\kappa(Q_j) - \lambda^2 \nu_{\lambda\kappa}\!\left( \frac{1}{\lambda} Q_j \right) \to 0 \]
as $\kappa \to \infty$ by Lemma 3.2 and Lemma 3.5. From (12) and Lebesgue dominated convergence we conclude
\[ \lim_{\kappa\to\infty} \frac{1}{\kappa^2} \int_\Omega (h - h_{ex})^2\, dx = \lim_{\kappa\to\infty} \sum_j \frac{1}{\kappa^2} \int_{Q_j} (h - h_{ex})^2\, dx = 0 . \]
We note that [6, Lemma III.3] also includes the continuity and monotonicity of the limiting energy density with respect to the external field strength. These results do not extend to our spin-coupled system, as the spin term competes with the other potential terms in the energy, ruining the monotonicity observed in the classical Ginzburg–Landau energy. We do recover continuity in certain cases later on, but by different arguments (see Case 1 of Sec. 4 below).
We now connect the average behavior of $|\Psi|$ to the energy density in the large $\kappa$ limit.
Lemma 3.8. Let $(\Psi, \vec A)$ be a minimizer of $E_{\kappa,b,g}$ in $H_\Omega$. For any cube $Q \subset \Omega$,
\[ \lim_{\kappa\to\infty} \int_Q \left[ |\Psi|^4 + 4\gamma |\psi_+ \psi_-|^2 \right] = \left( 1 - 2 \left[ F(b, g) - K_{b,g,\gamma} \right] \right) L(Q) . \tag{20} \]
That is, $[|\Psi|^4 + 4\gamma |\psi_+ \psi_-|^2] \rightharpoonup [1 - 2(F(b, g) - K_{b,g,\gamma})]$ in the weak* topology on $L^\infty(\Omega)$.
We could also have said that $[|\Psi|^4 + 4\gamma |\psi_+ \psi_-|^2]$ converges in the sense of measures, as in Lemma 3.5.
Proof. The proof is exactly as in [6]. We multiply the Euler–Lagrange equations for each $\psi_k$ by $\psi_k$ itself, and integrate over the cube $Q$. The boundary term will be $O(\kappa)$, and hence we obtain the identity,
\[ \int_Q \left[ |\nabla_A \Psi|^2 + \kappa^2 \left( |\Psi|^2 - 1 \right) |\Psi|^2 + \gamma \kappa^2 \left| \psi_1^2 + \psi_2^2 \right|^2 - 2g\, h\, S \right] = O(\kappa) . \]
Substituting into the energy density yields
\begin{align*}
\frac{\kappa^2}{2} \int_Q \left[ |\Psi|^4 + \gamma \left| \psi_1^2 + \psi_2^2 \right|^2 \right] &= \frac{\kappa^2}{2} \int_Q \left[ |\Psi|^4 + 4\gamma |\psi_+ \psi_-|^2 \right] \\
&= \int_Q \left[ \frac{\kappa^2}{2} - e_{\kappa,b,g}(\Psi, \vec A) + \kappa^2 K_{b,g,\gamma} \right] + o(\kappa^2) ,
\end{align*}
using Lemma 3.7 to eliminate the magnetic field energy. The conclusion then follows from Lemma 3.5.
4. Minimizers Near the Upper Critical Field
We now use the limiting energy density and Lemma 3.8 from the previous section to study the nature of minimizers when $h_{ex} = O(\kappa^2)$.
Following the scheme of [6] we introduce the annihilation operator,
\[ D_- \varphi := \partial_x \varphi - i A_x \varphi + i (\partial_y \varphi - i A_y \varphi) . \]
The identity
\[ |\nabla_A \varphi|^2 = |D_- \varphi|^2 + h\, |\varphi|^2 + \operatorname{curl} \left[ \operatorname{Im}(\varphi^* \nabla_A \varphi) \right] \tag{21} \]
illustrates how a magnetic field of order $\kappa^2$ can affect the magnitude of the order parameter, essentially reducing the depth of the potential well. We will need to consider different cases, and apply different techniques for each case.
Case 1: $\gamma \ge -\frac12$ and $h_{ex} = \kappa^2 b$ with $b \ge \frac{1}{1+g}$
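This is the most relevant physical case, as it covers Phase I ($\gamma > 0$) in the range of applied fields near $h_{c2}(g)$, although the choice of parameters is entirely technical. The basic idea is that the Zeeman spin coupling forces a splitting between the critical fields of the two components $\psi_+$ and $\psi_-$ of the order parameter, as described in the Introduction. First we note that (21) implies that for any regular domain $\omega \subset \Omega$ (including $\Omega$ itself),
\begin{align*}
\int_\omega |\nabla_A \psi_+|^2 &= \int_\omega \left[ |D_- \psi_+|^2 + h\, |\psi_+|^2 + \operatorname{curl} \left[ \operatorname{Im}(\psi_+^* \nabla_A \psi_+) \right] \right] \\
&= \int_\omega \left[ |D_- \psi_+|^2 + h\, |\psi_+|^2 \right] + \int_{\partial\omega} \operatorname{Im}(\psi_+^* \nabla_A \psi_+) \cdot \vec\tau\, ds \\
&= \int_\omega \left[ |D_- \psi_+|^2 + h\, |\psi_+|^2 \right] + O(\kappa) , \tag{22}
\end{align*}
using Proposition 2.1. We may therefore bound the energy from below as follows: let
\[ \rho^2 := 1 + gb > 0 , \qquad \sigma := (1 + g)b - 1 \ge 0 . \]
Then,
\begin{align*}
\int_\omega e_{\kappa,b,g}(\Psi, \vec A) &= \int_\omega \Bigl[ |D_- \psi_+|^2 + \frac{\kappa^2}{2} \left( |\psi_+|^4 + 2\sigma |\psi_+|^2 \right) + \kappa^2 (1 + 2\gamma) |\psi_+ \psi_-|^2 \\
&\qquad + |\nabla_A \psi_-|^2 + \frac{\kappa^2}{2} \left( |\psi_-|^2 - \rho^2 \right)^2 + (h - h_{ex})^2 + \frac{\kappa^2}{2} (1 - \rho^4) + \kappa^2 K_{b,g,\gamma} \Bigr] + o(\kappa^2) \\
&\ge \frac{\kappa^2}{2} \int_\omega \left[ |\psi_+|^4 + \sigma |\psi_+|^2 \right] + G^\omega_{\kappa,b,\rho}(\psi_-, \vec A) + \kappa^2 \left[ \frac12 (1 - \rho^4) + K_{b,g,\gamma} \right] L(\omega) + o(\kappa^2) . \tag{23}
\end{align*}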
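Here $G_{\kappa,b,\rho}$ denotes the Ginzburg–Landau energy,
\[ G^\omega_{\kappa,b,\rho}(\varphi, \vec A) = \int_\omega \left[ |\nabla_A \varphi|^2 + \frac{\kappa^2}{2} \left( |\varphi|^2 - \rho^2 \right)^2 + (h - \kappa^2 b)^2 \right] . \]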
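The first equality in (23) combines (22) with an expansion of the potential terms. A sketch of that algebra follows, under the assumption that $h$ may be replaced by $h_{ex} = \kappa^2 b$ in the terms $h|\psi_+|^2$ and $gh(|\psi_-|^2 - |\psi_+|^2)$ at the cost of an $o(\kappa^2)$ error after integration (which (11) and (12) provide). The shorthand $a = |\psi_+|^2$, $c = |\psi_-|^2$ is introduced here only for readability, and all terms are divided by $\kappa^2$.

```latex
% a = |\psi_+|^2, c = |\psi_-|^2 (shorthand for this sketch only).
% (1+g) b a collects b a from (22) and g b a from the spin term; -g b c is the rest of the spin term.
\begin{align*}
\tfrac12 (a + c - 1)^2 + 2\gamma\, a c + (1+g)\, b\, a - g\, b\, c
 &= \tfrac12 a^2 + \bigl[(1+g) b - 1\bigr] a + (1 + 2\gamma)\, a c
    + \tfrac12 c^2 - (1 + g b)\, c + \tfrac12 \\
 &= \tfrac12 \bigl( a^2 + 2\sigma a \bigr) + (1 + 2\gamma)\, a c
    + \tfrac12 \bigl( c - \rho^2 \bigr)^2 + \tfrac12 \bigl( 1 - \rho^4 \bigr) .
\end{align*}
```

With $\sigma \ge 0$ and $1 + 2\gamma \ge 0$ every term involving $a$ is nonnegative, which is what allows the lower bound in (23) to isolate the Ginzburg–Landau energy of $\psi_-$.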
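We obtain a matching upper bound by choosing $(\varphi, \vec A)$ to be the minimizers of $G^\omega_{\kappa,b,\rho}$ in the Hilbert space $H_\omega$, and taking $\Psi = [\psi_+, \psi_-] = [0, \varphi]$. For this choice, $S = \frac12 |\varphi|^2$, and by Proposition 2.1 we then have:
\begin{align*}
\inf_{H_\omega} \int_\omega e_{\kappa,b,g}(\Psi, \vec A) &\le \int_\omega e_{\kappa,b,g}([0, \varphi], \vec A) \\
&= \int_\omega \left[ |\nabla_A \varphi|^2 + \frac{\kappa^2}{2} \left( |\varphi|^2 - 1 \right)^2 - g\, h\, |\varphi|^2 + \kappa^2 K_{b,g,\gamma} \right] \\
&= \int_\omega \left[ |\nabla_A \varphi|^2 + \frac{\kappa^2}{2} \left( |\varphi|^4 - 2(1 + gb) |\varphi|^2 + 1 \right) + \kappa^2 K_{b,g,\gamma} \right] + o(\kappa^2) \\
&\le G^\omega_{\kappa,b,\rho}(\psi_-, \vec A) + \kappa^2 \left[ \frac12 (1 - \rho^4) + K_{b,g,\gamma} \right] L(\omega) + o(\kappa^2) . \tag{24}
\end{align*}
Matching with the lower bound (23), we conclude that
\[ \int_\omega |\psi_+|^4 \to 0 \quad \text{as } \kappa \to \infty , \tag{25} \]
and
\[ \inf_{H_\omega} \int_\omega e_{\kappa,b,g}(\Psi, \vec A) = \inf_{H_\omega} G^\omega_{\kappa,b,\rho} + \kappa^2 \left[ \frac12 (1 - \rho^4) + K_{b,g,\gamma} \right] L(\omega) + o(\kappa^2) . \tag{26} \]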
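Applying (25) with $\omega = \Omega$ we have $2S = |\psi_-|^2 + o(1) = |\Psi|^2 + o(1)$ in $L^2(\Omega)$.
Now, the energy $G^\omega_{\kappa,b,\rho}$ rescales to the usual Ginzburg–Landau energy, but for a different domain and applied field. If we define $\tilde x = \rho x \in \tilde\omega_\rho$, and
\[ \tilde\varphi(\tilde x) = \frac{1}{\rho} \varphi(\tilde x/\rho) = \frac{1}{\rho} \varphi(x) , \qquad \tilde A(\tilde x) = \frac{1}{\rho} A(\tilde x/\rho) = \frac{1}{\rho} A(x) , \]
\[ \tilde h(\tilde x) = \operatorname{curl}_{\tilde x} \tilde A(\tilde x) = \frac{1}{\rho^2} h(x) , \qquad \tilde b = \frac{b}{\rho^2} , \]
then
\[ G^\omega_{\kappa,b,\rho}(\varphi, A) = \rho^2 G^{\tilde\omega}_{\kappa,\tilde b,1}(\tilde\varphi, \tilde A) = \rho^2 \int_{\tilde\omega} \left[ |\nabla_{\tilde A} \tilde\varphi|^2 + \frac{\kappa^2}{2} \left( |\tilde\varphi|^2 - 1 \right)^2 + (\tilde h - \kappa^2 \tilde b)^2 \right] , \]
where $G^{\tilde\omega}_{\kappa,\tilde b,1}$ is the classical Ginzburg–Landau energy for configurations in the domain $\tilde\omega$ subject to the applied field $\tilde h_{ex} = \kappa^2 \tilde b$. The paper of Sandier and Serfaty [6] is devoted to studying the behavior of minimizers of this functional, and they show that the energy density of minimizers of $G^{\tilde\Omega}_{\kappa,\tilde b,1}$ converges to a constant: if $(\tilde\varphi, \tilde A)$ are minimizers of $G^{\tilde\Omega}_{\kappa,\tilde b,1}$ and $\tilde\omega \subset \tilde\Omega$ is a regular domain, then
\[ \frac{1}{2\kappa^2} G^{\tilde\omega}_{\kappa,\tilde b,1}(\tilde\varphi, \tilde A) = f(\tilde b) L(\tilde\omega) + o(1) . \]
They also prove the following upper and lower bounds on the constant $f(\tilde b)$,
\[ \frac12 \left( \tilde b - \frac12 \tilde b^2 \right) \le f(\tilde b) \le \frac14 \left[ 1 - \alpha (1 - \tilde b)_+^2 \right] , \tag{27} \]
where $\alpha \in (0, 1)$ is constant, and $f(b) = \frac14$ for all $b \ge 1$. Applying these bounds in our case, we have from (26),
\begin{align*}
\frac{1}{\kappa^2} \inf_{H_\omega} \int_\omega e_{\kappa,b,g}(\Psi, \vec A) &= 2\rho^2 f(\tilde b) L(\tilde\omega) + \left[ \frac12 (1 - \rho^4) + K_{b,g,\gamma} \right] L(\omega) + o(1) \\
&= \left[ 2\rho^4 f(\tilde b) + \frac12 (1 - \rho^4) + K_{b,g,\gamma} \right] L(\omega) + o(1) . \tag{28}
\end{align*}
In particular (see Lemma 3.5),
\[ F(b, g) = 2\rho^4 \left[ f\!\left( \frac{b}{\rho^2} \right) - \frac14 \right] + \frac12 + K_{b,g,\gamma} . \tag{29} \]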
By [6, Lemma III.3] we conclude that F (b, g) is continuous in both variables. Note that we cannot conclude monotonicity in either variable, despite the monotonicity of f (˜b) from [6]. We now use Lemma 3.8 to infer information about the transition to the normal state at the critical field. From (20) we conclude that (in the weak* topology of L∞ (Ω),) b . (30) |Ψ|4 * [1 − 2(F (b, g) − Kb,g,γ )] = ρ4 1 − 4f ρ2 1 Consider first the case ˜b ≤ 1, that is b ≤ bc2 = 1−g and 0 < g < 1 = gc2 . Using the ˜ upper and lower bounds (27) on f (b), after some algebra we have
1 1 1 − (ρ2 − b)2+ + o(1) ≤ F (b, g) − Kb,g,γ ≤ 1 − α(ρ2 − b)2+ + o(1) , 2 2
(31)
with α ∈ (0, 1) as in [6]. In particular, for any cube Q ⊂ Ω, Z 1 α(1 − (1 − g)b)2+ ≤ lim |Ψ|4 ≤ (1 − (1 − g)b)2+ , L(Q) κ→∞ Q
1 as the smallest value of the which proves the characterization of bc2 (g) = 1−g applied field for which the minimizer vanishes (on average) in the bulk. Note that
the local averages of $|\Psi|^4$ are bounded above and below by increasing functions of $b$ if $g > 1$, by decreasing functions of $b$ if $0 < g < 1$, and $|\Psi|^4$ remains bounded above and below on average in the critical case $g = 1$.

In the case $b \ge b_{c2}(g)$ and $0 < g < 1 = g_{c2}$ we note that $\tilde b \ge 1$, hence $f(\tilde b) = \frac{1}{4}$ for such $b$, and therefore $F(b,g) - K_{b,g,\gamma} = \frac{1}{2}$. From Lemma 3.8 we conclude that $|\Psi| \to 0$ in $L^4(\Omega)$. This proves part (a) of Theorems 1.2 and 1.3 in Case 1.

In the case $g \ge g_{c2} = 1$, we note that by the strict monotonicity of $f(\tilde b)$,
\[
f(b/\rho^2) < \lim_{b\to\infty} f\bigl(b/(1+gb)\bigr) = f(1/g) < \frac{1}{4}.
\]
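The "some algebra" that turns the bounds (27) on $f$ into the bounds (31) on $F - K_{b,g,\gamma}$ can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes $\rho^2 = 1 + gb$ (suggested by the limit $b/(1+gb)$ and by the identity $\rho^2 - b = 1 - (1-g)b$ used above), and the function names are hypothetical.

```python
def positive_part(x):
    """Return x_+ = max(x, 0)."""
    return max(x, 0.0)


def f_bounds(b_tilde, alpha):
    """Lower and upper bounds (27) on f(b_tilde); f = 1/4 once b_tilde >= 1."""
    if b_tilde >= 1.0:
        return 0.25, 0.25
    lower = 0.5 * (b_tilde - 0.5 * b_tilde ** 2)
    upper = 0.25 * (1.0 - alpha * positive_part(1.0 - b_tilde) ** 2)
    return lower, upper


def energy_bounds(b, g, alpha):
    """Bounds on F - K obtained two ways: via (29) and via the closed form (31)."""
    rho2 = 1.0 + g * b  # assumption: rho^2 = 1 + g b
    f_low, f_up = f_bounds(b / rho2, alpha)
    # (29): F - K = 2 rho^4 (f - 1/4) + 1/2, applied to each bound on f
    via_29 = tuple(2.0 * rho2 ** 2 * (f - 0.25) + 0.5 for f in (f_low, f_up))
    # (31): closed form in terms of (rho^2 - b)_+
    gap = positive_part(rho2 - b) ** 2
    via_31 = (0.5 * (1.0 - gap), 0.5 * (1.0 - alpha * gap))
    return via_29, via_31


def critical_field(g):
    """b_c2(g) = 1/(1-g) for 0 < g < 1, where rho^2 - b = 1 - (1-g) b vanishes."""
    return 1.0 / (1.0 - g)
```

For instance, `energy_bounds(1.0, 0.5, 0.3)` returns two pairs that agree up to rounding, and at `b = critical_field(g)` both bounds collapse to $\frac12$, the value for the normal state.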
From (30) we conclude that the weak limit of $|\Psi|^4$ is bounded away from zero, and in fact grows quadratically with $b$ when $g > 1$. This proves part (b) of Theorems 1.2 and 1.3 in Case 1.

Remark 4.1. Note that this proof via comparison with a rescaled Ginzburg–Landau energy follows the physical interpretation of the effect of spin coupling at high fields as discussed in [3], where they reason that vortices become "narrower", and therefore can be packed more densely than the Ginzburg–Landau theory would otherwise allow. Indeed, in terms of the physical parameters this rescaling effectively decreases both the penetration depth $\lambda$ and the coherence length $\xi$. This can also be thought of as decreasing temperature $T$. Note that $\kappa$ is left unchanged by this rescaling.

Case 2: $\gamma \ge 1$ and $h_{ex} = \kappa^2 b$ with $0 < b < \frac{1}{1+g}$
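In this case it is more convenient to work with the original representation for $\Psi = (\psi_1,\psi_2)$. Using the usual test configuration $\Psi = \frac{1}{\sqrt2}(\varphi, i\varphi)$, with $\varphi$ the minimizer of $G^{\Omega}_{\kappa,b,\rho}$ above, we still have an upper bound on the energy,
\[
F(b,g) \le 2\rho^4\Bigl[f(\tilde b) - \frac{1}{4}\Bigr] + \frac{1}{2} + K_{b,g,\gamma}. \tag{32}
\]
To obtain a lower bound, we let $\Psi = \frac{1}{\sqrt2}(\varphi_1,\varphi_2)$. Then,
\[
\bigl(|\Psi|^2 - 1\bigr)^2 + \bigl|\psi_1^2 + \psi_2^2\bigr|^2
= \frac{1}{2}\Bigl[\bigl(|\varphi_1|^2-1\bigr)^2 + \bigl(|\varphi_2|^2-1\bigr)^2\Bigr]
+ \frac{1}{2}\Bigl[|\varphi_1|^2|\varphi_2|^2 + \operatorname{Re}\bigl(\varphi_1^2\bar\varphi_2^2\bigr)\Bigr]
\ge \frac{1}{2}\Bigl[\bigl(|\varphi_1|^2-1\bigr)^2 + \bigl(|\varphi_2|^2-1\bigr)^2\Bigr].
\]
Also,
\[
-2ghS = -g h_{ex}|\Psi|^2 + g h_{ex}\bigl(|\Psi|^2 - 2S\bigr) + o(\kappa^2),
\]
where we recall that $|\Psi|^2 - 2S \ge 0$ pointwise. This leads to the lower bound
\[
\begin{aligned}
\int_\omega e_{\kappa,b,g}(\Psi,\vec A)\,dx
&\ge \frac{1}{2}\rho^2\, G^{\tilde\omega}_{\kappa,\tilde b,1}(\tilde\varphi_1,\tilde A) + \frac{1}{2}\rho^2\, G^{\tilde\omega}_{\kappa,\tilde b,1}(\tilde\varphi_2,\tilde A) \\
&\quad + \int_\omega \Bigl[\frac{\kappa^2}{2}(\gamma-1)\bigl|\psi_1^2+\psi_2^2\bigr|^2 + g h_{ex}\bigl(|\Psi|^2 - 2S\bigr)\Bigr] + o(\kappa^2) \\
&\ge \kappa^2\Bigl[2\rho^4\Bigl(f(\tilde b) - \frac{1}{4}\Bigr) + \frac{1}{2} + K_{b,g,\gamma}\Bigr]L(\omega) \\
&\quad + \int_\omega \Bigl[\frac{\kappa^2}{2}(\gamma-1)\bigl|\psi_1^2+\psi_2^2\bigr|^2 + g h_{ex}\bigl(|\Psi|^2 - 2S\bigr)\Bigr] + o(\kappa^2).
\end{aligned}
\]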
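The pointwise identity used in Case 2 for the lower bound, which splits the potential for $\Psi = \frac{1}{\sqrt2}(\varphi_1,\varphi_2)$ into two decoupled Ginzburg–Landau potentials plus a non-negative cross term, can be spot-checked numerically. This is an illustrative sketch only; the helper names are hypothetical.

```python
import random


def coupled_potential(phi1, phi2):
    """(|Psi|^2 - 1)^2 + |psi1^2 + psi2^2|^2 with Psi = (phi1, phi2)/sqrt(2)."""
    psi1, psi2 = phi1 / 2 ** 0.5, phi2 / 2 ** 0.5
    norm_sq = abs(psi1) ** 2 + abs(psi2) ** 2
    return (norm_sq - 1.0) ** 2 + abs(psi1 ** 2 + psi2 ** 2) ** 2


def decoupled_and_cross(phi1, phi2):
    """Return the decoupled part and the cross term of the right-hand side."""
    a, b = abs(phi1) ** 2, abs(phi2) ** 2
    decoupled = 0.5 * ((a - 1.0) ** 2 + (b - 1.0) ** 2)
    # |Re(phi1^2 conj(phi2)^2)| <= |phi1|^2 |phi2|^2, so the cross term is >= 0
    cross = 0.5 * (a * b + (phi1 ** 2 * phi2.conjugate() ** 2).real)
    return decoupled, cross


def check_identity(samples=1000, seed=0):
    """Largest deviation between both sides over random complex inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        phi1 = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        phi2 = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        decoupled, cross = decoupled_and_cross(phi1, phi2)
        worst = max(worst, abs(coupled_potential(phi1, phi2) - decoupled - cross))
    return worst
```

The cross term vanishes exactly when $\varphi_1^2\bar\varphi_2^2$ is real and non-positive, which includes the test configuration $\varphi_2 = i\varphi_1$ used for the upper bound.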
(Here we use the notations introduced to handle the previous case, above.) This matches the upper bound exactly except for the last two (non-negative) terms, which are therefore forced to zero: $0 \le |\Psi|^2 - 2S \to 0$ in $L^1(\Omega)$, and $|\Psi|^4 \rightharpoonup \rho^4[1 - 4f(\tilde b)]$ weak* in $L^\infty(\Omega)$ by Lemma 3.8. We may complete this case by following the same arguments as in Case 1.

Case 3: $0 \le \gamma \le 1$ and $h_{ex} = \kappa^2 b$ with $0 < b < \frac{1}{1+g}$
In this situation we can no longer directly compare with a rescaled Ginzburg–Landau functional, but we can still obtain similar results. The upper bound on energy is exactly the same as in the previous case, (32). To do the lower bound we recomplete the square with the annihilation operator using (21), but applied with $\psi_1, \psi_2$. The result will be similar to (31), although we obtain no useful bound on the spin term as in the previous two cases. Indeed, calculating as in (23), for any regular $\omega \subset \Omega$ we obtain
\[
\begin{aligned}
\int_\omega e_{\kappa,b,g}(\Psi,\vec A)
&\ge \int_\omega |D_-\Psi|^2 + \frac{\kappa^2}{2}\int_\omega \Bigl[|\Psi|^4 - 2\bigl(1+(g-1)b\bigr)|\Psi|^2 + 1\Bigr] + \kappa^2 K_{b,g,\gamma} + o(\kappa^2) \\
&\ge \kappa^2\Bigl[\frac{1}{2}\bigl(1 - (\rho^2-b)_+^2\bigr) + K_{b,g,\gamma}\Bigr]L(\omega) + o(\kappa^2),
\end{aligned}
\]
with $\rho^2$ as above. Thus we recover the upper and lower bounds (31) on the limiting energy density $F(b,g)$ once again.

The difference with the previous parts comes with the application of Lemma 3.8. In the present case we cannot conclude that the term $|\psi_1^2+\psi_2^2|^2 = 4|\psi_+\psi_-|^2$ in (20) vanishes. Instead, we use the facts $0 \le \gamma \le 1$ and $0 \le |\psi_1^2+\psi_2^2|^2 \le |\Psi|^4$ to conclude from (20) that
\[
\limsup_{\kappa\to\infty}\int_Q |\Psi|^4 \le \bigl(1 - 2[F(b,g) - K_{b,g,\gamma}]\bigr)L(Q) \le \bigl(1-(1-g)b\bigr)_+^2\,L(Q),
\]
and
\[
\liminf_{\kappa\to\infty}\int_Q 2|\Psi|^4 \ge \lim_{\kappa\to\infty}\int_Q \bigl[|\Psi|^4 + \gamma|\psi_1^2+\psi_2^2|^2\bigr]
= \bigl(1 - 2[F(b,g) - K_{b,g,\gamma}]\bigr)L(Q) \ge \alpha\bigl(1-(1-g)b\bigr)_+^2\,L(Q).
\]
This is sufficient to conclude that the order parameter vanishes if and only if $0 < g < 1$ and $b \ge b_{c2}(g) = \frac{1}{1-g}$, as desired. This completes the proof of Theorem 1.2.

5. Nucleation in Increasing Fields, $T > T_C$

This section is not substantially different from Case 1, so we will just briefly sketch the set-up and leave the details to the interested reader. We consider the energy $E(\Psi,\vec A) = E_{\kappa,b,g,T}(\Psi,\vec A)$ defined by (6), and denote by $e_{\kappa,b,g}(\Psi,\vec A)$ the energy density. We assume $\gamma \ge -\frac{1}{2}$, and $g > g_{c2} = 1$.

First, we note that the result of Theorem 1.1 holds essentially unchanged (except for the choice of constant $K_{b,g,\gamma,T}$ required to make the energy density positive in $\Omega$), and hence there exists a constant $F = F(b,g,\gamma,T)$ such that the energy density $\mu_\kappa \rightharpoonup F\,L$ weakly in the sense of measures on $\Omega$.

Next we expand the effective potential term of the energy,
\[
\frac{1}{2}\bigl(|\Psi|^2 + a_T\bigr)^2 + 2\gamma|\psi_+\psi_-|^2 - 2gbS
= \frac{1}{2}|\psi_+|^4 + (a_T + gb)|\psi_+|^2 + (1+2\gamma)|\psi_+\psi_-|^2 + \frac{1}{2}|\psi_-|^4 - (gb - a_T)|\psi_-|^2 + \frac{1}{2}a_T^2.
\]
If $a_T \ge gb$ all terms in this expression are non-negative, and hence $e_{\kappa,b,g,T}(\Psi,A) \ge \frac{1}{2}a_T^2$ for minimizers. Since this is the energy of the normal state, we have a matching upper bound and $|\Psi|^4 \to 0$ for $b \le \frac{a_T}{g}$. If $a_T < gb$ we define $\sigma_T^2 = gb - a_T > 0$ and then
\[
e_{\kappa,b,g,T}(\Psi,A) \ge e^{GL}_{\kappa,b,\sigma_T}(\psi_-,A) + \frac{1}{2}|\psi_+|^4 - \frac{1}{2}\bigl(\sigma_T^4 - a_T^2\bigr),
\]
where $e^{GL}_{\kappa,b,\sigma_T}$ is the energy density of the Ginzburg–Landau functional $G_{\kappa,b,\sigma_T}$ defined in the previous section. Indeed, taking as a test function the minimizer of $G_{\kappa,b,\sigma_T}$ in any subdomain $\omega \subset \Omega$, we have
\[
\begin{aligned}
\mu_\kappa(\omega) = F(b,g,\gamma,T)\,L(\omega) + o(1)
&= \frac{1}{\kappa^2}\inf G^{\omega}_{\kappa,b,\sigma_T} - \frac{1}{2}\bigl(\sigma_T^4 - a_T^2\bigr)L(\omega) + o(1) \\
&= \Bigl[2\sigma_T^4\, f\Bigl(\frac{b}{\sigma_T^2}\Bigr) - \frac{1}{2}\bigl(\sigma_T^4 - a_T^2\bigr)\Bigr]L(\omega) + o(1),
\end{aligned}
\]
and $\|\psi_+\|_{L^4(\Omega)} \to 0$. Reworking Lemma 3.8 with the modified energy and Euler–Lagrange equations, we have for any cube $Q \subset \Omega$,
\[
\begin{aligned}
\frac{1}{L(Q)}\int_Q |\Psi|^4
&= \frac{1}{L(Q)}\int_Q \bigl[|\Psi|^4 + 4\gamma|\psi_+\psi_-|^2\bigr] + o(1) \\
&= a_T^2 - 2\bigl(F(b,g,\gamma,T) + K_{b,g,\gamma,T}\bigr) + o(1) \\
&= \sigma_T^4\Bigl[1 - 4f\Bigl(\frac{b}{\sigma_T^2}\Bigr)\Bigr] + o(1).
\end{aligned}
\]
As in Case 1, we use the upper and lower bounds on $f(\tilde b)$ from [6] to obtain
\[
\alpha\bigl((g-1)b - a_T\bigr)_+^2 \le \lim_{\kappa\to\infty}\frac{1}{L(Q)}\int_Q |\Psi|^4 \le \bigl((g-1)b - a_T\bigr)_+^2,
\]
with constant $\alpha \in (0,1)$ independent of $\kappa$. In particular, if $g > g_{c2} = 1$, we obtain $|\Psi|^4 \rightharpoonup 0$ if and only if $b \le b_c(g,T) = \frac{a_T}{g-1}$, and Theorem 1.4 is completed.

6. A Priori Estimates

In this section we prove the a priori estimates on solutions from Proposition 2.1. We modify the proof of [4], which is based on the framework of Bethuel and Rivière [12]. Denote by
\[
M_\kappa := \sup_\Omega |\Psi|^2.
\]
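As opposed to the case of the classical Ginzburg–Landau functional, the direct coupling of the magnetic field to the order parameter in (8) precludes the usual simple proof of the boundedness of $M_\kappa$. We obtain the boundedness of the order parameter only after some preliminary estimates on the field have been calculated (see Step 5 below).

Step 1: Blow-up to scale $\epsilon := 1/\kappa$

Take any point $p \in \bar\Omega$ and let
\[
\tilde x = \frac{x-p}{\epsilon} \in \tilde\Omega_\epsilon = \{y : p + \epsilon y \in \Omega\}.
\]
Then we define $\tilde\Psi = (\tilde\psi_1,\tilde\psi_2)$,
\[
\tilde\psi_k(\tilde x) = \psi_k(\epsilon\tilde x + p) = \psi_k(x), \qquad
\tilde A(\tilde x) = \epsilon\vec A(\epsilon\tilde x + p) = \epsilon\vec A(x),
\]
\[
\nabla_{\tilde A}\tilde\psi_k(\tilde x) = \epsilon\nabla_A\psi_k(x), \qquad
\tilde h(\tilde x) = \operatorname{curl}\tilde A(\tilde x) = \epsilon^2 h(x).
\]
Under this scaling, the Euler–Lagrange equations (8) and (9) transform to:
\[
-\nabla_{\tilde A}^2\tilde\Psi + \bigl(|\tilde\Psi|^2 - 1\bigr)\tilde\Psi + \gamma\bigl(\tilde\psi_1^2 + \tilde\psi_2^2\bigr)\tilde\Psi^* - g\tilde h\,\sigma_2\tilde\Psi = 0, \tag{33}
\]
\[
-\nabla^\perp\tilde h = \epsilon^2\sum_{k=1,2}\operatorname{Im}\bigl\{\tilde\psi_k^*\nabla_{\tilde A}\tilde\psi_k\bigr\} - g\,\epsilon^2\,\nabla^\perp\tilde S, \tag{34}
\]
where we write $\tilde S = \tilde\psi_1 \times \tilde\psi_2$. The rescaled energy,
\[
\tilde E_{\tilde\Omega}(\tilde\Psi,\tilde A) := \int_{\tilde\Omega_\epsilon}\Bigl\{|\nabla_{\tilde A}\tilde\Psi|^2 + \frac{1}{2}\bigl(|\tilde\Psi|^2 - 1\bigr)^2 + \frac{\gamma}{2}\bigl|\tilde\psi_1^2 + \tilde\psi_2^2\bigr|^2 + \kappa^2\bigl(\tilde h - b\bigr)^2 - 2g\tilde h\tilde S\Bigr\}
= E_{\kappa,b,g}(\Psi,\vec A) = O(\kappa^2),
\]
which gives a first simple energy-estimate for various quantities,
\[
\|\nabla_{\tilde A}\tilde\Psi\|_{L^2(\tilde\Omega_\epsilon)} + \|\tilde\Psi\|_{L^4(\tilde\Omega_\epsilon)} + \kappa\|\tilde h - b\|_{L^2(\tilde\Omega_\epsilon)} \le C\kappa.
\]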
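Step 2: Interior $L^p_{loc}$ estimates of $\nabla_{\tilde A}\tilde\psi_k$

Let $K_R := B(0,R)$, the closed ball of radius $R$ centered at the origin. Suppose $p \in \Omega_{int}$. Then for $\epsilon = 1/\kappa$ sufficiently small, $\operatorname{dist}(p,\partial\Omega) \ge 4\epsilon$, which is to say $B(0,4) \subset \tilde\Omega_\epsilon$. Fix a gauge for $\tilde A$ with
\[
\operatorname{div}\tilde A = 0 \ \text{in } B(0,4), \qquad \tilde A\cdot n = 0 \ \text{on } \partial B(0,4).
\]
In particular, this implies that $\|\tilde A\|_{H^1(B(0,4))} \le c\|\tilde h\|_{L^2(B(0,4))}$ with universal constant $c$. We rewrite (33) in the form of two Poisson's equations for $\tilde\Psi = (\tilde\psi_1,\tilde\psi_2)$,
\[
\Delta\tilde\Psi = \bigl(|\tilde\Psi|^2 - 1\bigr)\tilde\Psi + \gamma\bigl(\tilde\psi_1^2 + \tilde\psi_2^2\bigr)\tilde\Psi^* - g\tilde h\,\sigma_2\tilde\Psi + 3|\tilde A|^2\tilde\Psi + 2i\tilde A\cdot\nabla_{\tilde A}\tilde\Psi, \tag{35}
\]
and estimate the right-hand side in $L^q(K_3)$, for $1 < q < 2$:
\[
\begin{aligned}
\bigl\|(|\tilde\Psi|^2 - 1)\tilde\Psi\bigr\|_{L^q(K_3)} &\le c M_\kappa\bigl\|(|\tilde\Psi|^2 - 1)\bigr\|_{L^2(K_3)} \le c M_\kappa\kappa; \\
\bigl\|(\tilde\psi_1^2 + \tilde\psi_2^2)\tilde\Psi^*\bigr\|_{L^q(K_3)} &\le c M_\kappa\bigl\|(\tilde\psi_1^2 + \tilde\psi_2^2)\bigr\|_{L^2(K_3)} \le c M_\kappa\kappa; \\
\bigl\|\tilde h\tilde\Psi\bigr\|_{L^q(K_3)} &\le c M_\kappa\|\tilde h\|_{L^2(K_3)} \le c M_\kappa; \\
\bigl\||\tilde A|^2\tilde\Psi\bigr\|_{L^q(K_3)} &\le M_\kappa\|\tilde A\|^2_{L^{2q}(K_3)} \le c M_\kappa\|\tilde h\|^2_{L^2(K_3)} \le c M_\kappa; \\
\bigl\|\tilde A\cdot\nabla_{\tilde A}\tilde\Psi\bigr\|_{L^q(K_3)} &\le \|\tilde A\|_{L^p(K_3)}\|\nabla_{\tilde A}\tilde\psi_k\|_{L^2(K_3)} \le c\|\tilde h\|_{L^2(K_3)}\|\nabla_{\tilde A}\tilde\psi_k\|_{L^2(K_3)} \le c\kappa,
\end{aligned}
\]
with $\frac{1}{p} + \frac{1}{2} = \frac{1}{q}$. Note that it is only the last estimate which requires $q < 2$, and that the constant in each case depends only on $q$, $g$ and $b = h_{ex}/\kappa^2$. In summary, we have
\[
\|\Delta\tilde\psi_k\|_{L^q(K_3)} \le c_q M_\kappa(\kappa + O(1)) \tag{36}
\]
for any $q \in (1,2)$, with constant $c_q$ depending only on $q$, $g$ and $b$. By interior estimates for Poisson's equation we then gain two derivatives in the smaller region $K_2 \subset K_3$, $\|\tilde\psi_k\|_{W^{2,q}(K_2)} \le c'_q M_\kappa\kappa$, and by the Sobolev embedding
\[
\|\nabla\tilde\psi_k\|_{L^p(K_2)} \le c_p M_\kappa\kappa, \tag{37}
\]
for every $p \le q^* = \frac{2q}{2-q}$. Since we may choose any $q \in (1,2)$, we obtain (37) for any $p \ge 2$ in this way. Furthermore,
\[
\|\nabla_{\tilde A}\tilde\psi_k\|_{L^p(K_2)} \le \|\nabla\tilde\psi_k\|_{L^p(K_2)} + M_\kappa\|\tilde A\|_{L^p(K_2)} \le c_p M_\kappa\kappa, \tag{38}
\]
for every $p \ge 2$.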
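Step 3: Interior uniform estimate for $\tilde h$

Now we turn to Eq. (34) for $\tilde h$, and estimate its right-hand side. First, (38) implies:
\[
\bigl\|\operatorname{Im}\{\tilde\psi_k^*\nabla_{\tilde A}\tilde\psi_k\}\bigr\|_{L^p(K_2)} \le M_\kappa\|\nabla_{\tilde A}\tilde\psi_k\|_{L^p(K_2)} \le M_\kappa^2\kappa.
\]
Next, note that
\[
\nabla\tilde S = (\nabla_{\tilde A}\tilde\psi_1)\times\tilde\psi_2 + \tilde\psi_1\times(\nabla_{\tilde A}\tilde\psi_2), \tag{39}
\]
since $(i\tilde A\tilde\psi_1)\times\tilde\psi_2 = -\tilde\psi_1\times(i\tilde A\tilde\psi_2)$. Therefore,
\[
\bigl\|\nabla^\perp\tilde S\bigr\|_{L^p(K_2)} \le M_\kappa\|\nabla_{\tilde A}\tilde\Psi\|_{L^p(K_2)} \le c_p M_\kappa^2\kappa,
\]
with constant $c_p$ depending only on $p$, $g$ and $b$. We conclude from (34) that
\[
\|\nabla\tilde h\|_{L^p(K_2)} \le c_p\frac{M_\kappa^2}{\kappa} \tag{40}
\]
with constant $c_p$ depending only on $g$, $p$.

From the energy estimate we have $\|\tilde h - b\|_{L^2(\tilde\Omega_\epsilon)} \le c$, and so applying (40) with $p = 2$,
\[
\|\tilde h - b\|_{L^r(K_2)} \le c\|\tilde h - b\|_{W^{1,2}(K_2)} \le c\Bigl(1 + \frac{M_\kappa^2}{\kappa}\Bigr),
\]
for any $r \ge 2$. Finally, the above estimate together with (40) bounds $\tilde h$ in $W^{1,r}(K_2)$ with any $r > 2$, and we obtain the result:
\[
\sup_{K_2}|\tilde h - b| \le \|\tilde h\|_{W^{1,r}(K_2)} \le c\Bigl(1 + \frac{M_\kappa^2}{\kappa}\Bigr), \tag{41}
\]
with the constant depending only on $r$, $g$, $b$.

Step 4: Estimates near the boundary for $\nabla_{\tilde A}\tilde\Psi$, $\tilde h$

Consider now points $p_0 \in \partial\Omega$ and repeat the blow-up procedure from Step 1: define $\tilde x = \frac{x - p_0}{\epsilon} \in \tilde\Omega_\epsilon$, so $p_0$ maps to $0 \in \partial\tilde\Omega_\epsilon$. Let $K'_R := \operatorname{cl}[B(0,R)\cap\tilde\Omega_\epsilon]$. We fix a gauge for $\tilde A$ in $K'_8$ so that $\operatorname{div}\tilde A = 0$ in $K'_8$ and $\tilde A\cdot n = 0$ on $\partial K'_8$.

We now repeat the estimate of Step 2: now the region $K'_8$ adjoins $\partial\tilde\Omega_\epsilon$, where $\tilde\psi_k$ satisfies a homogeneous Neumann condition,
\[
\frac{\partial\tilde\psi_k}{\partial n} = \nabla\tilde\psi_k\cdot n = \nabla_{\tilde A}\tilde\psi_k\cdot n = 0 \quad \text{on } \partial K'_8\cap\partial\tilde\Omega_\epsilon,
\]
by our choice of gauge for $\tilde A$. Therefore we may apply the standard Neumann boundary estimate for Poisson's equation to (35) to obtain:
\[
\|\tilde\psi_k\|_{W^{2,q}(K'_6)} \le c_q\|\Delta\tilde\psi_k\|_{L^q(K'_8)} \le c_q M_\kappa\kappa,
\]
with $c_q$ depending only on $g$, $b$ and $q \in (1,2)$. (The constant depends on the domain $K'_6$, but in a uniformly bounded way due to the smoothness hypothesis on $\partial\Omega$.) The remainder of the estimates in Steps 2 and 3 continue as before, and we conclude that
\[
\|\nabla_{\tilde A}\tilde\psi_k\|_{L^p(K'_6)} \le c_p M_\kappa\kappa, \qquad
\sup_{K'_6}|\tilde h - b| \le c\|\tilde h - b\|_{W^{1,p}(K'_6)} \le c_p\Bigl(1 + \frac{M_\kappa^2}{\kappa}\Bigr),
\]
with $c_p$ depending only on $p \in [2,\infty)$ and $b$.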
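We may now conclude a global estimate for $\tilde h$,
\[
\sup_{\tilde\Omega_\epsilon}|\tilde h - b| \le c\Bigl(1 + \frac{M_\kappa^2}{\kappa}\Bigr), \quad \text{equivalently} \quad
\sup_\Omega|h - h_{ex}| \le c\kappa^2\Bigl(1 + \frac{M_\kappa^2}{\kappa}\Bigr). \tag{42}
\]
Indeed, for any fixed $\epsilon = 1/\kappa$, we obtain a covering of $\operatorname{cl}[\tilde\Omega_\epsilon]$ by the sets $K_2$ corresponding to the points $p$ with $\operatorname{dist}(p,\partial\Omega) \ge 4\epsilon$ and the sets $K'_6$ corresponding to the points $p_0 \in \partial\Omega$. Since the same uniform estimate holds in each of these regions we obtain the global bound on the supremum of $\tilde h$ (and by rescaling, $h$).

Step 5: $M_\kappa^2 \le C_1$ for constant $C_1$ independent of $\kappa$

Let $p \in \operatorname{cl}[\Omega]$ with $M_\kappa = \sup_\Omega|\Psi| = |\Psi(p)|$. We observe that $V(x) := |\Psi(x)|^2$ solves the elliptic inequality,
\[
\begin{aligned}
\Delta V &= 2|\nabla_A\Psi|^2 + 2\kappa^2\bigl(|\Psi|^2 - 1\bigr)|\Psi|^2 + 2\kappa^2\gamma\bigl|\psi_1^2 + \psi_2^2\bigr|^2 - 4gh(\psi_1\times\psi_2) \\
&\ge 2\kappa^2(V-1)V - 2g|h|V \\
&\ge 2\kappa^2\Bigl[V - 1 - gb - \frac{g\|h - h_{ex}\|_\infty}{\kappa^2}\Bigr]V,
\end{aligned}
\]
and $\frac{\partial V}{\partial n} = 0$ on $\partial\Omega$. If $p$ is an interior maximum of $V$, we must have $\Delta V(p) \le 0$, and therefore
\[
M_\kappa^2 = V(p) \le 1 + gb + \frac{g\|h - h_{ex}\|_\infty}{\kappa^2} \le 1 + gb + c\Bigl(1 + \frac{M_\kappa^2}{\kappa}\Bigr), \tag{43}
\]
applying (42). In particular, there is a constant $C_1$ such that $M_\kappa^2 \le C_1$. If the maximum is attained for $p \in \partial\Omega$, the same conclusion follows from the Hopf Lemma.

Step 6: $|\nabla_{\tilde A}\tilde\Psi|$ is uniformly bounded in $\tilde\Omega_\epsilon$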
Take any $x \in \Omega$. If $\operatorname{dist}(x,\partial\Omega) \ge 5\epsilon$, consider the corresponding set $K_5 = \operatorname{cl}[B(\tilde x,5)] \subset \tilde\Omega_\epsilon$. As before, fix a Coulomb gauge for $\tilde A$ so that $\tilde A = \nabla^\perp\xi$, with $\Delta\xi = \tilde h$ in $K_4$ and $\xi = 0$ on $\partial K_4$. By the global $L^p$ estimates for the Dirichlet problem,
\[
\|\tilde A\|_{L^\infty(K_4)} = \|\nabla\xi\|_{L^\infty(K_4)} \le c\|\xi\|_{W^{2,p}(K_4)} \le \|\tilde h\|_{L^\infty(K_4)} \le c. \tag{44}
\]
Notice that this means that the right-hand side of Eq. (33) for $\tilde\psi_k$ is uniformly bounded in $L^\infty(K_4)$. Let $\eta$ be a smooth function which equals one in $K_3$ and vanishes outside of $K_4$. We take the complex inner product of (33) with $(\eta\psi_k)$, and integrate over $K_4$, using the Neumann condition to eliminate the boundary integral. This gives us the easy "energy" estimate,
\[
\|\nabla_A\tilde\psi_k\|_{L^2(K_3)} \le c, \tag{45}
\]
with the constant depending only on $b$. By a bootstrap argument using (35) we then obtain the uniform bound $\|\nabla\tilde\psi_k\|_{L^\infty(K_2)} \le C'$.

If $\operatorname{dist}(x,\partial\Omega) < 5\epsilon$, we blow up around $p_0 \in \partial\Omega$ with $\operatorname{dist}(x,p_0) < 5\epsilon$, and do the same as above but in the set $K'_6 = \operatorname{cl}[B(0,6)\cap\tilde\Omega_\epsilon]$ which adjoins the boundary of $\tilde\Omega_\epsilon$.
The uniform estimate (44) of $\tilde A$ and the "energy" estimate (45) of $\nabla\tilde\psi_k$ in $K'_4$ follow in the same way as above. Then we apply the boundary regularity estimates to the equations (35), and obtain the same bound $\|\nabla\tilde\psi_k\|_{L^\infty(K'_4)} \le C''$. The constants will depend on the blow-up point, but can be chosen uniformly assuming that $\partial\Omega$ is smooth.

Since (for fixed $\epsilon = 1/\kappa$) any $x \in \Omega$ falls into one of these two categories, we have the uniform gradient bound in all of $\Omega$. Estimate (13) then follows from the uniform bound on $\tilde A$.

Step 7: Sharpening the bounds

Using the second Euler–Lagrange equation (9), we have
\[
\|\nabla(h - h_{ex})\|_{L^\infty(\Omega)} \le g\|\nabla S\|_{L^\infty(\Omega)} + \|\Psi\|_{L^\infty(\Omega)}\|\nabla_A\Psi\|_{L^\infty(\Omega)} \le C\kappa,
\]
using (39), (13) and the result of Step 5 above. On $\partial\Omega$, $|h - h_{ex}| = g|S| \le \frac{g}{2}$, so we obtain (12).

Finally, we may now return to Step 5 to improve the bound on $M_\kappa$. Replaying the same steps, we improve (43) by substituting (12),
\[
M_\kappa^2 = V(p) \le 1 + gb + \frac{g\|h - h_{ex}\|_\infty}{\kappa^2} \le 1 + gb + \frac{c}{\kappa},
\]
which is (11). This completes the proof of Proposition 2.1.

References

[1] J. Zhu, C. Ting, J. Shen and Z. Wang, Ginzburg–Landau equations for layered p-wave superconductors, Phys. Rev. B56 (1997) 14093–14101.
[2] Z. Wang and Q.-H. Wang, Dynamics of time-reversal–symmetry-breaking vortices in unconventional superconductors, Phys. Rev. B57 (1998) R724–727.
[3] A. Knigavko and B. Rosenstein, Spontaneous vortex state and ferromagnetic behavior of type-II p-wave superconductors, Phys. Rev. B58 (1998) 9354–9364.
[4] S. Alama and L. Bronsard, Vortices and lower critical field for superconductors with ferromagnetic interactions, preprint (2003).
[5] A. G. Lebed, Revival of superconductivity in high magnetic fields and a possible p-wave pairing in (TMTSF)2PF6, Phys. Rev. B59 (1999) R721–R724.
[6] E. Sandier and S. Serfaty, The decrease in bulk-superconductivity close to the second critical field in the Ginzburg–Landau model, SIAM J. Math. Anal. 34 (2003) 936–956.
[7] X. Pan, Surface superconductivity in applied magnetic fields above Hc2, Commun. Math. Phys. 228 (2002) 327–370.
[8] P. Bauman, D. Phillips and Q. Tang, Stable nucleation for the Ginzburg–Landau system with an applied magnetic field, Arch. Ration. Mech. Anal. 142 (1998) 1–43.
[9] K. Lu and X. Pan, Estimates of the upper critical field for the Ginzburg–Landau equations of superconductivity, Physica D127 (1999) 73–104.
[10] M. Del Pino, P. Felmer and P. Sternberg, Boundary concentration for eigenvalue problems related to the onset of superconductivity, Commun. Math. Phys. 210 (2000) 413–446.
[11] B. Helffer and H. Morame, Magnetic bottles in connection with superconductivity, J. Funct. Anal. 185 (2001) 604–680.
[12] F. Bethuel and T. Rivière, Vortices for a variational problem related to superconductivity, Ann. Inst. Henri Poincaré, Analyse non linéaire 12 (1995) 243–303.
[13] W. Rudin, Real and Complex Analysis, 2nd edn. (McGraw-Hill, New York, 1974).
[14] L. C. Evans and R. Gariepy, Measure Theory and Fine Properties of Functions (CRC Press, Boca Raton, Florida, 1992).
[15] E. Sandier and S. Serfaty, On the energy of type-II superconductors in the mixed phase, Rev. Math. Phys. 12 (2000) 1219–1257.
[16] M. Tinkham, Introduction to Superconductivity, 2nd edn. (McGraw-Hill, New York, 1996).
March 22, 2004 10:8 WSPC/148-RMP
00197
Reviews in Mathematical Physics Vol. 16, No. 2 (2004) 175–241
© World Scientific Publishing Company

MULTIPLE HAMILTONIAN STRUCTURE OF BOGOYAVLENSKY–TODA LATTICES

PANTELIS A. DAMIANOU
Department of Mathematics and Statistics, University of Cyprus
P. O. Box 20537, 1678 Nicosia, Cyprus
[email protected]

Received 20 July 2003
Revised 21 December 2003

This paper is mainly a review of the multi-Hamiltonian nature of Toda and generalized Toda lattices corresponding to the classical simple Lie groups but it includes also some new results. The areas investigated include master symmetries, recursion operators, higher Poisson brackets, invariants and group symmetries for the systems. In addition to the positive hierarchy we also consider the negative hierarchy which is crucial in establishing the bi-Hamiltonian structure for each particular simple Lie group. Finally, we include some results on point and Noether symmetries and an interesting connection with the exponents of simple Lie groups. The case of exceptional simple Lie groups is still an open problem.

Keywords: Toda lattice; Poisson brackets; master symmetries; bi-Hamiltonian systems; group symmetries; simple Lie groups.

Mathematics Subject Classification: 37J35, 22E70 and 70H06.
Contents

1. Introduction
2. Background
2.1. Schouten bracket
2.2. Poisson manifolds
2.3. Symplectic and Lie–Poisson manifolds
2.4. Local theory
2.5. Cohomology
2.6. Bi-Hamiltonian systems
2.7. Master symmetries
3. AN Toda Lattice
3.1. Definition of the system
3.2. Multi-Hamiltonian structure
3.3. Properties of Xn and πn
3.4. The Faybusovich–Gekhtman approach
3.5. A theorem of Petalidou
3.6. A recursive process of Kosmann–Schwarzbach and Magri
4. Lie Group Symmetries of the Toda Lattice
5. The Toda Lattice in Natural Coordinates
5.1. The Das–Okubo–Fernandes approach
5.2. The negative Toda hierarchy
5.3. Master integrals and master symmetries
5.4. Noether symmetries
5.5. Rational Poisson brackets
6. Generalized Toda Systems Associated with Simple Lie Groups
7. BN Toda Systems
7.1. A rational bracket for a central extension of BN-Toda
7.2. A recursion operator for Bogoyavlensky–Toda systems of type Bn
7.3. Bi-Hamiltonian formulation of Bn systems
8. Cn Toda Systems
8.1. A recursion operator for Bogoyavlensky–Toda systems of type Cn
8.2. Bi-Hamiltonian formulation of Cn systems
9. Dn Toda Systems
9.1. A recursion operator for Dn Bogoyavlensky–Toda systems in Flaschka coordinates
9.2. Master symmetries
9.3. A recursion operator for Dn Toda systems in natural (q, p) coordinates
9.4. Bi-Hamiltonian formulation of Bogoyavlensky–Toda systems of type Dn
10. Conclusion
10.1. Summary of results
10.2. Open problems
Acknowledgments
References
1. Introduction

In this paper we review the bi-Hamiltonian and multiple Hamiltonian nature of the Toda lattices corresponding to simple Lie groups. These are systems that generalize the usual finite, non-periodic Toda lattice (which corresponds to a root system of type $A_N$). This generalization is due to Bogoyavlensky [1]. These systems were studied extensively in [2] where the solution of the system was connected intimately with the representation theory of simple Lie groups. There are also studies by Olshanetsky and Perelomov [3], and Adler and van Moerbeke [4]. We will call such systems the Bogoyavlensky–Toda lattices.

We begin with the following more general definition which involves systems with exponential interaction: consider a Hamiltonian of the form
\[
H = \frac{1}{2}(p,p) + \sum_{i=1}^{m} e^{(v_i,q)}, \tag{1}
\]
where $q = (q_1,\dots,q_N)$, $p = (p_1,\dots,p_N)$, $v_1,\dots,v_m$ are vectors in $\mathbb{R}^N$ and $(\ ,\ )$ is the standard inner product in $\mathbb{R}^N$. The set of vectors $\Delta = \{v_1,\dots,v_m\}$ is called the spectrum of the system.
In this paper we limit our attention to the case where the spectrum is a system of simple roots for a simple Lie algebra $\mathcal{G}$. In this case $m = l = \operatorname{rank}\mathcal{G}$. It is worth mentioning that the case where $m$, $N$ are arbitrary is an open and unexplored area of research. The main exception is the work of Kozlov and Treshchev [5] where a classification of system (1) is performed under the assumption that the system possesses $N$ polynomial (in the momenta) integrals. We also note the papers by Ranada [6], and Annamalai and Tamizhmani [7]. Such systems are called Birkhoff integrable.

For each Hamiltonian in (1) we associate a Dynkin type diagram as follows: it is a graph whose vertices correspond to the elements of $\Delta$. Each pair of vertices $v_i$, $v_j$ are connected by
\[
\frac{4(v_i,v_j)^2}{(v_i,v_i)(v_j,v_j)}
\]
edges.

Example. The usual Toda lattice corresponds to a Lie algebra of type $A_{N-1}$. In other words $m = l = N-1$ and we choose $\Delta$ to be the set:
\[
v_1 = (1,-1,0,\dots,0), \ \dots, \ v_{N-1} = (0,0,\dots,0,1,-1).
\]
The graph is the usual Dynkin diagram of a Lie algebra of type $A_{N-1}$. The Hamiltonian becomes
\[
H(q_1,\dots,q_N,p_1,\dots,p_N) = \sum_{i=1}^{N}\frac{1}{2}p_i^2 + \sum_{i=1}^{N-1} e^{q_i - q_{i+1}}, \tag{2}
\]
which is the well-known classical, non-periodic Toda lattice.

It is more convenient to work, instead of in the space of the natural $(q,p)$ variables, with the Flaschka variables $(a,b)$, which are defined by:
\[
\begin{aligned}
a_i &= \frac{1}{2}e^{\frac{1}{2}(v_i,q)}, & i &= 1,2,\dots,m, \\
b_i &= -\frac{1}{2}p_i, & i &= 1,2,\dots,N.
\end{aligned} \tag{3}
\]
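The edge-count rule for the Dynkin type diagram is easy to evaluate for a concrete spectrum. The sketch below is illustrative only (the function names are hypothetical); it builds the $A_{N-1}$ spectrum $v_i = e_i - e_{i+1}$ from the Example and counts edges with exact rational arithmetic.

```python
from fractions import Fraction


def simple_roots_A(n):
    """Spectrum of the usual Toda lattice: v_i = e_i - e_{i+1} in R^n, i = 1..n-1."""
    roots = []
    for i in range(n - 1):
        v = [0] * n
        v[i], v[i + 1] = 1, -1
        roots.append(v)
    return roots


def inner(u, v):
    """Standard inner product in R^N."""
    return sum(x * y for x, y in zip(u, v))


def edge_count(u, v):
    """Number of edges joining two vertices: 4 (u,v)^2 / ((u,u)(v,v))."""
    return Fraction(4 * inner(u, v) ** 2, inner(u, u) * inner(v, v))


def dynkin_edges(roots):
    """Map each connected pair (i, j), i < j, to its number of edges."""
    edges = {}
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            count = edge_count(roots[i], roots[j])
            if count != 0:
                edges[(i, j)] = count
    return edges
```

For `simple_roots_A(5)` only consecutive vertices are joined, each by a single edge, which is the chain diagram of type $A_4$. For two vectors of different lengths such as `[1, -1]` and `[0, 1]` the same rule gives a double edge.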
We end up with a new set of polynomial equations in the variables $(a,b)$. One can write the equations in Lax pair form (at least this is well-known for a spectrum corresponding to simple Lie algebras); see for example [8]. The Lax pair $(L(t),B(t))$ in $\mathcal{G}$ can be described in terms of the root system as follows:
\[
L(t) = \sum_{i=1}^{l} b_i(t)\,h_{\alpha_i} + \sum_{i=1}^{l} a_i(t)\,(x_{\alpha_i} + x_{-\alpha_i}),
\]
\[
B(t) = \sum_{i=1}^{l} a_i(t)\,(x_{\alpha_i} - x_{-\alpha_i}).
\]
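For the usual Toda lattice, a minimal sketch of this Lax pair can be written with tridiagonal matrices. The assumptions here, none of which are fixed by the text above, are: the matrix realization in which $L$ is the symmetric Jacobi matrix with diagonal $b$ and off-diagonal $a$, and $B$ is its skew-symmetric companion; the sign convention $\dot L = [B, L]$; and the Flaschka equations $\dot a_i = a_i(b_{i+1} - b_i)$, $\dot b_i = 2(a_i^2 - a_{i-1}^2)$, which follow from (2) and (3) by direct differentiation.

```python
def jacobi_matrices(a, b):
    """Tridiagonal L (symmetric) and B (skew) for off-diagonal a and diagonal b."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    B = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = b[i]
    for i in range(n - 1):
        L[i][i + 1] = L[i + 1][i] = a[i]
        B[i][i + 1] = a[i]
        B[i + 1][i] = -a[i]
    return L, B


def matmul(X, Y):
    """Plain matrix product of two square lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def flaschka_rhs(a, b):
    """Toda equations in Flaschka variables (assumed sign conventions)."""
    n = len(b)
    a_dot = [a[i] * (b[i + 1] - b[i]) for i in range(n - 1)]
    b_dot = []
    for i in range(n):
        right = a[i] ** 2 if i < n - 1 else 0.0
        left = a[i - 1] ** 2 if i > 0 else 0.0
        b_dot.append(2.0 * (right - left))
    return a_dot, b_dot
```

With these conventions the entries of `commutator(B, L)` reproduce `flaschka_rhs(a, b)`: the diagonal gives $\dot b_i$, the first off-diagonals give $\dot a_i$, and all other entries vanish. Since $\operatorname{Tr}(L[B,L]) = 0$, the quantity $\frac12\operatorname{Tr}L^2$ is constant along the flow.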
As usual $h_{\alpha_i}$ is an element of a fixed Cartan subalgebra and $x_{\alpha_i}$ is a root vector corresponding to the simple root $\alpha_i$. The Chevalley invariants of $\mathcal{G}$ provide for the constants of motion. We will describe them separately for each case.

In this paper we begin with a review of the $A_N$ Toda system. The theory and bi-Hamiltonian structure for this case is well-developed and the results are well known. We present a review of the results in Secs. 3–5 and then, for the remaining part of the paper, we deal exclusively with the other classical simple Lie algebras of type $B_N$, $C_N$ and $D_N$. We will demonstrate that these systems are bi-Hamiltonian and then illustrate with some low-dimensional examples, namely $B_2$, $C_3$ and $D_4$.

The multi-Hamiltonian structure of the Toda lattice was established in [9]. For the remaining Bogoyavlensky–Toda systems it was established recently in several papers. The results for the $B_N$ Bogoyavlensky–Toda lattice were computed in [10] in Flaschka coordinates, and in [11] in $(q,p)$ coordinates. The $C_N$ case is in [12] in Flaschka coordinates, and [11] in natural $(q,p)$ coordinates. The $D_N$ case was settled in [13]. The bi-Hamiltonian structure of these systems was established recently in [14]. The negative Toda hierarchy was constructed in [15] and it was crucial in establishing the bi-Hamiltonian formulation in [14].

The construction of the bi-Hamiltonian pair may be summarized as follows: Define a recursion operator $R$ in $(a,b)$ space by finding a second bracket, $\pi_3$, and inverting the initial Poisson bracket $\pi_1$. Define the negative recursion operator $N$ by inverting the second Poisson bracket $\pi_3$. This recursion operator is the inverse of the operator $R$. Finally, define a new rational bracket $\pi_{-1}$ by $\pi_{-1} = N\pi_1 = \pi_1\pi_3^{-1}\pi_1$. We obtain a bi-Hamiltonian formulation of the system:
\[
\pi_1\nabla H_2 = \pi_{-1}\nabla H_4,
\]
where $H_i = \frac{1}{i}\operatorname{Tr}L^i$. The brackets $\pi_1$ and $\pi_{-1}$ are compatible and Poisson.

There is also an interesting connection with the exponents of the corresponding Lie group. For example, in the case of $D_N$ there is a sequence of invariants $H_2, H_4, \dots$ of even degree and an additional invariant of degree $N$. Let $\chi_i$ denote the Hamiltonian vector field generated by $H_i$ and let $Z_0$ denote a conformal symmetry. Then we have $[Z_0,\chi_j] = f(j)\chi_j$. The values of $f(j)$ corresponding to independent $\chi_j$ generate all the exponents except one. When $Z_0$ acts on the Hamiltonian vector field $\chi_P$, where $P$ is the invariant corresponding to the Pfaffian of the Jacobi matrix, we obtain the last exponent $N-1$.

For example, in the case of $D_5$ the exponents are 1, 3, 5, 7, and 4. The independent invariants are $H_2$, $H_4$, $H_6$, $H_8$ and $P_5$ where $H_{2i} = \frac{1}{2i}\operatorname{Tr}L^{2i}$ and $P_5 = \sqrt{\det L}$. We obtain
\[
\begin{aligned}
[Z_0,\chi_2] &= \chi_2, \\
[Z_0,\chi_4] &= 3\chi_4, \\
[Z_0,\chi_6] &= 5\chi_6, \\
[Z_0,\chi_8] &= 7\chi_8, \\
[Z_0,\chi_{P_5}] &= 4\chi_{P_5}.
\end{aligned}
\]
In other words, the coefficients on the right-hand side are precisely the exponents of a simple Lie group of type $D_5$.

2. Background

In this section we review the necessary background from Poisson and symplectic geometry, bi-Hamiltonian systems, master symmetries and recursion operators.

2.1. Schouten bracket

We list some properties of the Schouten bracket following [16–18]. Let $M$ be a $C^\infty$ manifold, $N = C^\infty(M)$ the algebra of $C^\infty$ real-valued functions on $M$. A contravariant, antisymmetric tensor of order $p$ will be called a $p$-tensor for short. These tensors form a superspace endowed with a Lie-superalgebra structure via the Schouten bracket. The Schouten bracket assigns to each $p$-tensor $A$, and $q$-tensor $B$, a $(p+q-1)$-tensor, denoted by $[A,B]$. For $p = 1$ we have $[A,B] = L_A B$ where $L_A$ is the Lie-derivative in the direction of the vector field $A$. The bracket satisfies

(i)
\[
[A,B] = (-1)^{pq}[B,A]. \tag{4}
\]
(ii) If $C$ is an $r$-tensor,
\[
(-1)^{pq}[[B,C],A] + (-1)^{qr}[[C,A],B] + (-1)^{rp}[[A,B],C] = 0. \tag{5}
\]
(iii)
\[
[A, B\wedge C] = [A,B]\wedge C + (-1)^{pq+q}B\wedge[A,C]. \tag{6}
\]
2.2. Poisson manifolds

We review the basic definitions and properties of Poisson manifolds following [16, 18] and [19]. A Poisson structure on $M$ is a bilinear form, called the Poisson bracket $\{\ ,\ \} : N\times N \to N$ such that

(i)
\[
\{f,g\} = -\{g,f\} \tag{7}
\]
(ii)
\[
\{f,\{g,h\}\} + \{g,\{h,f\}\} + \{h,\{f,g\}\} = 0 \tag{8}
\]
(iii)
\[
\{f,gh\} = \{f,g\}h + \{f,h\}g. \tag{9}
\]
Properties (i) and (ii) define a Lie algebra structure on $N$. Property (ii) is called the Jacobi identity and (iii) is the analogue of the Leibniz rule from calculus. A Poisson manifold is a manifold $M$ together with a Poisson bracket $\{\ ,\ \}$. To a Poisson bracket one can associate a 2-tensor $\pi$ such that
\[
\{f,g\} = \langle\pi, df\wedge dg\rangle. \tag{10}
\]
Jacobi's identity is equivalent to the condition $[\pi,\pi] = 0$ where $[\ ,\ ]$ is the Schouten bracket. Therefore, one could define a Poisson manifold by specifying a pair $(M,\pi)$ where $M$ is a manifold and $\pi$ a 2-tensor satisfying $[\pi,\pi] = 0$. In local coordinates $(x_1,x_2,\dots,x_n)$, $\pi$ is given by
\[
\pi = \sum_{i,j}\pi_{ij}\,\frac{\partial}{\partial x_i}\wedge\frac{\partial}{\partial x_j} \tag{11}
\]
and
\[
\{f,g\} = \langle\pi, df\wedge dg\rangle = \sum_{i,j}\pi_{ij}\,\frac{\partial f}{\partial x_i}\frac{\partial g}{\partial x_j}. \tag{12}
\]
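The coordinate formula (12) and the Jacobi identity can be tested on a concrete Poisson matrix. The sketch below is a hypothetical illustration, not taken from the text: it uses the linear structure on $\mathbb{R}^3$ with $\{x_1,x_2\} = x_3$, $\{x_2,x_3\} = x_1$, $\{x_3,x_1\} = x_2$, evaluates brackets from gradients, and checks the component form of the Jacobi identity, $\sum_l(\pi_{il}\partial_l\pi_{jk} + \pi_{jl}\partial_l\pi_{ki} + \pi_{kl}\partial_l\pi_{ij}) = 0$, using central differences.

```python
def so3_structure(x):
    """Poisson matrix pi_ij(x) of the example bracket on R^3 (an assumption)."""
    x1, x2, x3 = x
    return [[0.0, x3, -x2],
            [-x3, 0.0, x1],
            [x2, -x1, 0.0]]


def bracket(structure, x, grad_f, grad_g):
    """{f, g}(x) = sum_ij pi_ij(x) (df/dx_i)(dg/dx_j), as in (12)."""
    pi = structure(x)
    n = len(x)
    return sum(pi[i][j] * grad_f[i] * grad_g[j]
               for i in range(n) for j in range(n))


def d_pi(structure, x, l, j, k, h=1e-3):
    """Central-difference approximation of d(pi_jk)/dx_l at x."""
    xp, xm = list(x), list(x)
    xp[l] += h
    xm[l] -= h
    return (structure(xp)[j][k] - structure(xm)[j][k]) / (2.0 * h)


def jacobiator(structure, x, i, j, k):
    """{x_i,{x_j,x_k}} + cyclic terms, written through the Poisson matrix."""
    pi = structure(x)
    return sum(pi[i][l] * d_pi(structure, x, l, j, k)
               + pi[j][l] * d_pi(structure, x, l, k, i)
               + pi[k][l] * d_pi(structure, x, l, i, j)
               for l in range(len(x)))
```

At any point the jacobiator vanishes for all index triples, so this matrix defines a Poisson structure. The function $x_1^2 + x_2^2 + x_3^2$, with gradient $2x$, has zero bracket with every function, so it commutes with everything in this example.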
In particular $\{x_i,x_j\} = \pi_{ij}(x)$. Knowledge of the Poisson matrix $(\pi_{ij})$ is sufficient to define the bracket of arbitrary functions. The rank of the matrix $(\pi_{ij})$ at a point $x \in M$ is called the rank of the Poisson structure at $x$.

A function $F : M_1 \to M_2$ between two Poisson manifolds is called a Poisson mapping if
\[
\{f\circ F, g\circ F\}_1 = \{f,g\}_2\circ F \tag{13}
\]
for all $f, g \in C^\infty(M_2)$. In terms of tensors, $F_*\pi_1 = \pi_2$. Two Poisson manifolds are called isomorphic, if there exists a diffeomorphism between them which is a Poisson mapping.

The Poisson bracket allows one to associate a vector field to each element $f \in N$. The vector field $\chi_f$ is defined by the formula
\[
\chi_f(g) = \{f,g\}. \tag{14}
\]
It is called the Hamiltonian vector field generated by f . In terms of the Schouten bracket χf = [π, f ] .
(15)
Hamiltonian vector fields are infinitesimal automorphisms of the Poisson structure. These are vector fields X satisfying LX π = 0. In the case of Hamiltonian vector fields we have Lχf π = [π, χf ] = [π, [π, f ]] = −2[[π, π], f ] = 0 .
(16)
The Hamiltonian vector fields form a Lie algebra and in fact [χf , χg ] = χ{f,g} .
(17)
So, the map $f \to \chi_f$ is a Lie algebra homomorphism. The Poisson structure defines a bundle map
\[
\pi^* : T^*M \to TM \tag{18}
\]
such that
\[
\pi^*(df) = \chi_f. \tag{19}
\]
The rank of the Poisson structure at a point $x \in M$ is the rank of $\pi_x^* : T_x^*M \to T_xM$. Throughout this paper we use the symbol $\pi$ to denote a Poisson tensor but
occasionally we use the same symbol to denote the matrix of the components of the tensor (i.e. the Poisson matrix). The same convention applies for the recursion operators.

The functions in the center of $N$ are called Casimirs. It is the set of functions $f$ so that $\{f,g\} = 0$ for all $g \in N$. These are functions which are constant along the orbits of Hamiltonian vector fields. The differentials of these functions are in the kernel of $\pi^*$. In terms of the Schouten bracket a Casimir satisfies $[\pi,f] = 0$.

Given a function $f$, there is a reasonable algorithm for constructing a Poisson bracket in which $f$ is a Casimir. One finds two vector fields $X_1$ and $X_2$ such that $L_{X_1}f = L_{X_2}f = 0$. If in addition $X_1$, $X_2$ and $[X_1,X_2]$ are linearly dependent, then $X_1\wedge X_2$ is a Poisson tensor and $f$ is a Casimir in this bracket. In fact
\[
[f, X_1\wedge X_2] = [f,X_1]\wedge X_2 - X_1\wedge[f,X_2] = 0. \tag{20}
\]
More generally, there is a formula due to Flaschka and Ratiu which gives locally a Poisson bracket when the number of Casimirs is 2 less than the dimension of the space. Let $f_1, f_2, \dots, f_r$ be functions on $\mathbb{R}^{r+2}$. Then the formula
\[
\omega\{g,h\} = df_1\wedge\dots\wedge df_r\wedge dg\wedge dh, \tag{21}
\]
where $\omega$ is a non-vanishing $(r+2)$-form, defines a Poisson bracket on $\mathbb{R}^{r+2}$ and the functions $f_1,\dots,f_r$ are Casimirs. For more details on this formula, see [20].

Multiplication of a Poisson bracket by a Casimir gives another Poisson bracket. Suppose $[\pi,\pi] = 0$ and $[\pi,f] = 0$. Then
\[
[f\pi, f\pi] = f\wedge[f,\pi]\wedge\pi + f\wedge\pi\wedge[\pi,f] + f^2[\pi,\pi] = 0. \tag{22}
\]
2.3. Symplectic and Lie–Poisson manifolds

The most basic examples of Poisson brackets are the symplectic and Lie–Poisson brackets.

(i) Symplectic manifolds: A symplectic manifold is a pair $(M^{2n},\omega)$ where $M^{2n}$ is an even dimensional manifold and $\omega$ is a closed, non-degenerate 2-form. The associated isomorphism
\[
\mu : TM \to T^*M \tag{23}
\]
extends naturally to a tensor bundle isomorphism still denoted by $\mu$. Let $\lambda = \mu^{-1}$, $f \in N$ and let $\chi_f = \lambda(df)$ be the corresponding Hamiltonian vector field. The symplectic bracket is given by
\[
\{f,g\} = \omega(\chi_f,\chi_g). \tag{24}
\]
In the case of $\mathbb{R}^{2n}$, according to a Theorem of Darboux, there are coordinates $(x_1,\dots,x_n,y_1,\dots,y_n)$, so that
\[
\omega = \sum_{i=1}^{n} dx_i\wedge dy_i \tag{25}
\]
and the Poisson bracket is the standard symplectic bracket on $\mathbb{R}^{2n}$.
(ii) Lie–Poisson: Let $M = \mathcal{G}^*$ where $\mathcal{G}$ is a Lie algebra. For $a \in \mathcal{G}$, define the function $\Phi_a$ on $\mathcal{G}^*$ by
$$\Phi_a(\mu) = \langle a, \mu \rangle \tag{26}$$
where $\mu \in \mathcal{G}^*$ and $\langle\ ,\ \rangle$ is the pairing between $\mathcal{G}$ and $\mathcal{G}^*$. Define a bracket on $\mathcal{G}^*$ by
$$\{\Phi_a, \Phi_b\} = \Phi_{[a,b]} . \tag{27}$$
This bracket is easily extended to arbitrary $C^\infty$ functions on $\mathcal{G}^*$. The bracket of linear functions is linear and every linear bracket is of this form, i.e., it is associated with a Lie algebra. Therefore, the classification of linear Poisson brackets is equivalent to the classification of Lie algebras.

2.4. Local theory

In his paper [19] Weinstein proved the so-called "splitting theorem", which describes the local behavior of Poisson manifolds.

Theorem 1. Let $x_0$ be a point in a Poisson manifold $M$. Then near $x_0$, $M$ is isomorphic to a product $S \times N$ where $S$ is symplectic, $N$ is a Poisson manifold, and the rank of $N$ at $x_0$ is zero.

$S$ is called the symplectic leaf through $x_0$ and $N$ is called the transverse Poisson structure at $x_0$. $N$ is unique up to isomorphism. So, through each point $x_0$ passes a symplectic leaf $S_{x_0}$ whose dimension equals the rank of the Poisson structure on $M$ at $x_0$. The bracket on the transverse manifold $N_{x_0}$ can be calculated using Dirac's constraint bracket formula.

Theorem 2. Let $x_0$ be a point in a Poisson manifold $M$ and let $U$ be a neighborhood of $x_0$ which is isomorphic to a product $S \times N$ as in Weinstein's Splitting Theorem. Let $p_i$, $i = 1, \dots, 2n$ be functions on $U$ such that
$$N = \{x \in U \mid p_i(x) = \text{constant}\} . \tag{28}$$
Denote by $P = P_{ij} = \{p_i, p_j\}$ and by $P^{ij}$ the inverse matrix of $P$. Then the bracket formula for the transverse Poisson structure on $N$ is given as follows:
$$\{F, G\}_N(x) = \{\hat F, \hat G\}_M(x) + \sum_{i,j}^{2n} \{\hat F, p_i\}_M(x)\, P^{ij}(x)\, \{\hat G, p_j\}_M(x) \tag{29}$$
for all $x \in N$, where $F, G$ are functions on $N$ and $\hat F, \hat G$ are extensions of $F$ and $G$ to a neighborhood of $M$. Dirac's formula depends only on $F, G$, but not on the extensions $\hat F, \hat G$.

When $\mu$ is an element of $\mathcal{G}^*$, where $\mathcal{G}$ is a semi-simple Lie algebra, Cushman and Roberts proved that, in suitable coordinates, the transverse structure is polynomial; see [21] and [22].
2.5. Cohomology

Cohomology of Lie algebras was introduced by Chevalley and Eilenberg in [23]. Let $\mathcal{G}$ be a Lie algebra and $\rho$ be a representation of $\mathcal{G}$ with representation space $V$. A $q$-linear skew-symmetric mapping of $\mathcal{G}$ into $V$ will be called a $q$-dimensional $V$-cochain. The $q$-cochains form a space $C^q(\mathcal{G}, V)$. By definition, $C^0(\mathcal{G}, V) = V$. We define a coboundary operator $\delta = \delta_q : C^q(\mathcal{G}, V) \to C^{q+1}(\mathcal{G}, V)$ by the formula
$$(\delta f)(x_0, \dots, x_q) = \sum_{i=0}^{q} (-1)^i \rho(x_i) f(x_0, \dots, \hat x_i, \dots, x_q) + \sum_{i<j} (-1)^{i+j} f([x_i, x_j], x_0, \dots, \hat x_i, \dots, \hat x_j, \dots, x_q) , \tag{30}$$
where $f \in C^q(\mathcal{G}, V)$ and $x_0, \dots, x_q \in \mathcal{G}$. As can be easily checked, $\delta_{q+1} \circ \delta_q = 0$, so that $\{C^q(\mathcal{G}, V), \delta_q\}$ is an algebraic complex. Define the space $Z^q(\mathcal{G}, V)$ of $q$-cocycles as the kernel of $\delta : C^q \to C^{q+1}$ and the space $B^q(\mathcal{G}, V)$ of $q$-coboundaries as the image $\delta C^{q-1}$. Since $\delta\delta = 0$ we can define
$$H^q(\mathcal{G}, V) = \frac{Z^q(\mathcal{G}, V)}{B^q(\mathcal{G}, V)} . \tag{31}$$
Lichnerowicz [16] considered the following cohomology defined on the tensors of a Poisson manifold. Let $(M, \pi)$ be a Poisson manifold. If we set $B = C = \pi$ in (5) we get
$$[\pi, [\pi, A]] = 0 \tag{32}$$
for every tensor $A$. Define a coboundary operator $\partial_\pi$ which assigns to each $p$-tensor $A$ a $(p+1)$-tensor $\partial_\pi A$ given by
$$\partial_\pi A = -[\pi, A] . \tag{33}$$
We have $\partial_\pi^2 A = [\pi, [\pi, A]] = 0$ and therefore $\partial_\pi$ defines a cohomology. An element $A$ is a $p$-cocycle if $[\pi, A] = 0$. An element $B$ is a $p$-coboundary if $B = [\pi, C]$, for some $(p-1)$-tensor $C$. Let
$$Z^n(M, \pi) = \{A \in T_n \mid [\pi, A] = 0\} \tag{34}$$
and
$$B^n(M, \pi) = \{B \mid B = [\pi, C],\ C \in T_{n-1}\} . \tag{35}$$
The quotient
$$H^n(M, \pi) = \frac{Z^n(M, \pi)}{B^n(M, \pi)} \tag{36}$$
is the $n$th cohomology group.
Let $\mathcal{G}$ be a Lie algebra and consider the Lie–Poisson manifold $\mathcal{G}^*$. Define a representation $\rho$ of $\mathcal{G}$ with values in $C^\infty(\mathcal{G}^*)$ by
$$\rho(x_i) f = \sum_{j,k} c_{ij}^k\, x_k \frac{\partial f}{\partial x_j} \tag{37}$$
where $x_i$ denotes coordinates on $\mathcal{G}^*$ and at the same time elements of a basis for $\mathcal{G}$. In other words, $\rho(x_i) f = \{x_i, f\}$, where the bracket is the Lie–Poisson bracket on $\mathcal{G}^*$. We denote the $n$th cohomology group of $\mathcal{G}$ with respect to this representation by
$$H^n(\mathcal{G}, C^\infty(\mathcal{G}^*)) . \tag{38}$$
We have the following result:

Theorem 3.
$$H^n(\mathcal{G}^*, \pi) \cong H^n(\mathcal{G}, C^\infty(\mathcal{G}^*)) . \tag{39}$$
The proof can be found in [24] or [10].

2.6. Bi-Hamiltonian systems

Proposition 1. Let $(M, \pi_1)$, $(M, \pi_2)$ be two Poisson structures on $M$. The following are equivalent:
(i) $\pi_1 + \pi_2$ is Poisson.
(ii) $[\pi_1, \pi_2] = 0$.
(iii) $\partial_{\pi_1} \partial_{\pi_2} = -\partial_{\pi_2} \partial_{\pi_1}$.
(iv) $\pi_1 \in Z^2(M, \pi_2)$, $\pi_2 \in Z^2(M, \pi_1)$.

Two tensors which satisfy the equivalent conditions are said to form a Poisson pair on $M$. The corresponding Poisson brackets are called compatible.

Lemma 1. Suppose $\pi_1$ is Poisson and $\pi_2 = L_X \pi_1 = -\partial_{\pi_1} X$ for some vector field $X$. Then $\pi_1$ is compatible with $\pi_2$.

Proof. $[\pi_1, \pi_2] = [\pi_1, -[\pi_1, X]] = -\partial_{\pi_1} \partial_{\pi_1} X = 0$.

If $\pi_1$ is symplectic, we call the Poisson pair non-degenerate. If we assume a non-degenerate pair we make the following definition: the recursion operator associated with a non-degenerate pair is the (1, 1)-tensor $R$ defined by
$$R = \pi_2 \pi_1^{-1} . \tag{40}$$
A bi-Hamiltonian system is defined by specifying two Hamiltonian functions $H_1$, $H_2$ satisfying
$$X = \pi_1 \nabla H_2 = \pi_2 \nabla H_1 . \tag{41}$$
We have the following result due to Magri [25]:

Theorem 4. Suppose that we have a non-degenerate bi-Hamiltonian system on a manifold $M$, whose first cohomology group is trivial. Then, there exists a hierarchy of mutually commuting functions $H_1, H_2, \dots$, all in involution with respect to both brackets. If we denote by $\chi_i$ the Hamiltonian vector field generated by $H_i$ with respect to the initial bracket $\pi_1$ then the $\chi_i$ generate mutually commuting bi-Hamiltonian flows, satisfying the Lenard recursion relations
$$\chi_{i+j} = \pi_i \nabla H_j , \tag{42}$$
where $\pi_i = R^{i-1} \pi_1$ are the higher order Poisson tensors.

For further information on bi-Hamiltonian systems relevant to Toda type systems see [26–29].

2.7. Master symmetries

We recall the definition and basic properties of master symmetries following Fuchssteiner [30]. Consider a differential equation on a manifold $M$ defined by a vector field $\chi$. We are mostly interested in the case where $\chi$ is a Hamiltonian vector field. A vector field $Z$ is a symmetry of the equation if
$$[Z, \chi] = 0 . \tag{43}$$
If $Z$ is time dependent, then a more general condition is
$$\frac{\partial Z}{\partial t} + [Z, \chi] = 0 . \tag{44}$$
A vector field $Z$ is called a master symmetry if
$$[[Z, \chi], \chi] = 0 , \tag{45}$$
but
$$[Z, \chi] \neq 0 . \tag{46}$$
Master symmetries were first introduced by Fokas and Fuchssteiner in [31] in connection with the Benjamin–Ono equation.

Suppose that we have a bi-Hamiltonian system defined by the Poisson tensors $\pi_1$, $\pi_2$ and the Hamiltonians $H_1$, $H_2$. Assume that $\pi_1$ is symplectic. We define the recursion operator $R = \pi_2 \pi_1^{-1}$, the higher flows
$$\chi_i = R^{i-1} \chi_1 ,$$
and the higher order Poisson tensors
$$\pi_i = R^{i-1} \pi_1 . \tag{47}$$
For a non-degenerate bi-Hamiltonian system, master symmetries can be generated using a method due to Oevel [32].

Theorem 5. Suppose that $X_0$ is a conformal symmetry for both $\pi_1$, $\pi_2$ and $H_1$, i.e., for some scalars $\lambda$, $\mu$ and $\nu$ we have
$$L_{X_0} \pi_1 = \lambda \pi_1 , \qquad L_{X_0} \pi_2 = \mu \pi_2 , \qquad L_{X_0} H_1 = \nu H_1 .$$
Then the vector fields $X_i = R^i X_0$ are master symmetries and we have
(i) $L_{X_i} H_j = (\nu + (j - 1 + i)(\mu - \lambda)) H_{i+j}$,
(ii) $L_{X_i} \pi_j = (\mu + (j - i - 2)(\mu - \lambda)) \pi_{i+j}$,
(iii) $[X_i, X_j] = (\mu - \lambda)(j - i) X_{i+j}$.
3. $A_N$ Toda Lattice

3.1. Definition of the system

Equation (2) is the classical, finite, nonperiodic Toda lattice. This system was investigated in [33–39] and numerous other papers that are impossible to list here. This type of Hamiltonian was considered first by Morikazu Toda [39]. The original Toda lattice can be viewed as a discrete version of the Korteweg–de Vries equation. It is called a lattice as in atomic lattice since interatomic interaction was studied. This system also appears in Cosmology. It appears also in the work of Seiberg and Witten on supersymmetric Yang–Mills theories and it has applications in analogue computing and numerical computation of eigenvalues. But the Toda lattice is mainly a theoretical mathematical model which is important due to the rich mathematical structure encoded in it.

Hamilton's equations become
$$\dot q_j = p_j , \qquad \dot p_j = e^{q_{j-1} - q_j} - e^{q_j - q_{j+1}} .$$
The system is integrable. One can find a set of independent functions $\{H_1, \dots, H_N\}$ which are constants of motion for Hamilton's equations. To determine the constants of motion, one uses Flaschka's transformation:
$$a_i = \frac12 e^{\frac12 (q_i - q_{i+1})} , \qquad b_i = -\frac12 p_i . \tag{48}$$
Then
$$\dot a_i = a_i (b_{i+1} - b_i) , \qquad \dot b_i = 2 (a_i^2 - a_{i-1}^2) . \tag{49}$$
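These equations can be written as a Lax pair $\dot L = [B, L]$, where $L$ is the Jacobi matrix
$$L = \begin{pmatrix}
b_1 & a_1 & 0 & \cdots & \cdots & 0 \\
a_1 & b_2 & a_2 & \cdots & & \vdots \\
0 & a_2 & b_3 & \ddots & & \\
\vdots & & \ddots & \ddots & & \vdots \\
\vdots & & & \ddots & \ddots & a_{N-1} \\
0 & \cdots & & \cdots & a_{N-1} & b_N
\end{pmatrix} ,$$
and
$$B = \begin{pmatrix}
0 & a_1 & 0 & \cdots & \cdots & 0 \\
-a_1 & 0 & a_2 & \cdots & & \vdots \\
0 & -a_2 & 0 & \ddots & & \\
\vdots & & \ddots & \ddots & & \vdots \\
\vdots & & & \ddots & \ddots & a_{N-1} \\
0 & \cdots & & \cdots & -a_{N-1} & 0
\end{pmatrix} .$$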
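The Lax form can be checked directly in exact integer arithmetic. The sketch below (helper names and sample values are hypothetical, not from the text) builds $L$ and $B$, confirms that $[B, L]$ is the tridiagonal matrix whose entries are the right-hand sides of (49), and verifies that $\frac{d}{dt} \operatorname{Tr} L^k = k \operatorname{Tr}(L^{k-1} \dot L)$ vanishes for the first few $k$.

```python
def jacobi(a, b):
    """Symmetric tridiagonal matrix with diagonal b and off-diagonal a."""
    n = len(b)
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = b[i]
    for i in range(n - 1):
        M[i][i + 1] = M[i + 1][i] = a[i]
    return M

def skew(a, n):
    """The matrix B of the Lax pair: a_i above the diagonal, -a_i below."""
    M = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        M[i][i + 1] = a[i]
        M[i + 1][i] = -a[i]
    return M

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

# Hypothetical integer sample point (N = 4).
a = [2, -1, 3]
b = [1, 4, -2, 5]
n = len(b)
L = jacobi(a, b)
B = skew(a, n)
BL, LB = matmul(B, L), matmul(L, B)
Ldot = [[BL[i][j] - LB[i][j] for j in range(n)] for i in range(n)]

# Entries predicted by (49), with a_0 = a_N = 0.
a_pad = [0] + a + [0]
da = [a[i] * (b[i + 1] - b[i]) for i in range(n - 1)]
db = [2 * (a_pad[i + 1] ** 2 - a_pad[i] ** 2) for i in range(n)]
lax_matches = (Ldot == jacobi(da, db))

# Derivatives of Tr L^k along the flow, k = 1..5.
power = [[int(i == j) for j in range(n)] for i in range(n)]
trace_derivs = []
for k in range(1, 6):
    trace_derivs.append(k * trace(matmul(power, Ldot)))
    power = matmul(power, L)
print(lax_matches, trace_derivs)
```

Working with integers keeps every comparison exact, so no tolerance is needed.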
This is an example of an isospectral deformation; the entries of $L$ vary over time but the eigenvalues remain constant. It follows that the functions $H_i = \frac{1}{i} \operatorname{Tr} L^i$ are constants of motion. We note that
$$H_1 = \sum_{i=1}^{N} b_i = -\frac12 (p_1 + p_2 + \cdots + p_N)$$
corresponds to the total momentum and
$$H_2 = H(q_1, \dots, q_N, p_1, \dots, p_N) = \frac12 \sum_{i=1}^{N} b_i^2 + \sum_{i=1}^{N-1} a_i^2$$
is the Hamiltonian.

Consider $\mathbb{R}^{2N}$ with coordinates $(q_1, \dots, q_N, p_1, \dots, p_N)$, the standard symplectic bracket
$$\{f, g\}_s = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial q_i} \frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial q_i} \right) , \tag{50}$$
and the mapping $F : \mathbb{R}^{2N} \to \mathbb{R}^{2N-1}$ defined by
$$F : (q_1, \dots, q_N, p_1, \dots, p_N) \to (a_1, \dots, a_{N-1}, b_1, \dots, b_N) .$$
There exists a bracket on $\mathbb{R}^{2N-1}$ which satisfies
$$\{f, g\} \circ F = \{f \circ F, g \circ F\}_s .$$
It is a bracket which (up to a constant multiple) is given by
$$\{a_i, b_i\} = -a_i , \qquad \{a_i, b_{i+1}\} = a_i ; \tag{51}$$
all other brackets are zero. $H_1 = b_1 + b_2 + \cdots + b_N$ is the only Casimir. The Hamiltonian in this bracket is $H_2 = \frac12 \operatorname{Tr} L^2$. We also have involution of invariants, $\{H_i, H_j\} = 0$. The Lie algebraic interpretation of this bracket can be found in [2]. We denote this bracket by $\pi_1$.

The quadratic Toda bracket appears in conjunction with isospectral deformations of Jacobi matrices. First, let $\lambda$ be an eigenvalue of $L$ with normalized eigenvector $v$. Standard perturbation theory shows that
$$\nabla \lambda = (2 v_1 v_2, \dots, 2 v_{N-1} v_N, v_1^2, \dots, v_N^2)^T := U^\lambda ,$$
where $\nabla \lambda$ denotes $\left( \frac{\partial \lambda}{\partial a_1}, \dots, \frac{\partial \lambda}{\partial b_N} \right)$. Some manipulations show that $U^\lambda$ satisfies
$$\pi_2 U^\lambda = \lambda\, \pi_1 U^\lambda ,$$
where $\pi_1$ and $\pi_2$ are skew-symmetric matrices. It turns out that $\pi_1$ is the matrix of coefficients of the Poisson tensor (51), and $\pi_2$, whose coefficients are quadratic functions of the $a$'s and $b$'s, can be used to define a new Poisson tensor. The quadratic Toda bracket appeared in a paper of Adler [40] in 1979. It is a Poisson bracket in which the Hamiltonian vector field generated by $H_1$ is the same as the Hamiltonian vector field generated by $H_2$ with respect to the $\pi_1$ bracket. The defining relations are
$$\begin{aligned}
\{a_i, a_{i+1}\} &= \tfrac12 a_i a_{i+1} \\
\{a_i, b_i\} &= -a_i b_i \\
\{a_i, b_{i+1}\} &= a_i b_{i+1} \\
\{b_i, b_{i+1}\} &= 2 a_i^2 ;
\end{aligned} \tag{52}$$
all other brackets are zero. This bracket has $\det L$ as Casimir and $H_1 = \operatorname{Tr} L$ is the Hamiltonian. The eigenvalues of $L$ are still in involution. Furthermore, $\pi_2$ is compatible with $\pi_1$. We also have
$$\pi_2 \nabla H_l = \pi_1 \nabla H_{l+1} . \tag{53}$$
These relations are similar to the Lenard relations for the KdV equation; they are generally called the Lenard relations. Taking $l = 1$ in (53), we conclude that the Toda lattice is bi-Hamiltonian. In fact, using results from [15], we can prove that the Toda lattice is multi-Hamiltonian:
$$\pi_2 \nabla H_1 = \pi_1 \nabla H_2 = \pi_0 \nabla H_3 = \pi_{-1} \nabla H_4 = \cdots \tag{54}$$
Finally, we remark that further manipulations with the Lenard relations for the infinite Toda lattice, followed by setting all but finitely many ai , bi equal to zero,
yield another Poisson bracket, $\pi_3$, which is cubic in the coordinates; see [41]. The defining relations for $\pi_3$ are
$$\begin{aligned}
\{a_i, a_{i+1}\} &= a_i a_{i+1} b_{i+1} \\
\{a_i, b_i\} &= -a_i b_i^2 - a_i^3 \\
\{a_i, b_{i+1}\} &= a_i b_{i+1}^2 + a_i^3 \\
\{a_i, b_{i+2}\} &= a_i a_{i+1}^2 \\
\{a_{i+1}, b_i\} &= -a_i^2 a_{i+1} \\
\{b_i, b_{i+1}\} &= 2 a_i^2 (b_i + b_{i+1}) ;
\end{aligned} \tag{55}$$
all other brackets are zero. The bracket $\pi_3$ is compatible with both $\pi_1$ and $\pi_2$ and the eigenvalues of $L$ are still in involution. The Casimir for this bracket is $\operatorname{Tr} L^{-1}$.

The multi-Hamiltonian structure of the Toda lattice is well-known. The results are usually presented either in the natural $(q, p)$ coordinates or in the more convenient Flaschka coordinates $(a, b)$. In the former case the hierarchy of higher invariants is generated by the use of a recursion operator [42, 43]. In the latter case one uses master symmetries as in [9] and [10]. We have to point out that chronologically every result obtained so far was done first in Flaschka coordinates $(a, b)$ and then transferred through the inverse of Flaschka's transformation to the original $(q, p)$ coordinates. This is to be expected since it is always easier to work with sums of polynomials than with sums of exponentials.

The sequence of Poisson tensors can be extended to form an infinite hierarchy. In order to produce the hierarchy of Poisson tensors one uses master symmetries. The first three Poisson brackets are precisely the linear, quadratic and cubic brackets we mentioned above. If a system is bi-Hamiltonian and one of the brackets is symplectic, one can find a recursion operator by inverting the symplectic tensor. The recursion operator is then applied to the initial symplectic bracket to produce an infinite sequence. However, in the case of the Toda lattice (in Flaschka variables $(a, b)$) both operators are non-invertible and therefore this method fails. The absence of a recursion operator for the finite Toda lattice is also mentioned in Morosi and Tondo [44] where a Nijenhuis tensor for the infinite Toda lattice is calculated. Recursion operators were introduced by Olver [45].
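The defining relations (51), (52) and (55) can be tested against the Lenard relations $\pi_n \nabla H_l = \pi_{n-1} \nabla H_{l+1}$ at a sample point. The sketch below is only an illustration under stated assumptions: the helper names, the coordinate ordering $(a_1, \dots, a_{N-1}, b_1, \dots, b_N)$, the convention $(\pi \nabla f)_i = \sum_j \{x_i, x_j\}\, \partial_j f$ and the sample values are choices made here. It uses the identities $\partial H_l / \partial b_i = (L^{l-1})_{ii}$ and $\partial H_l / \partial a_i = 2 (L^{l-1})_{i,i+1}$, which follow from $H_l = \frac1l \operatorname{Tr} L^l$.

```python
from fractions import Fraction

def jacobi(a, b):
    n = len(b)
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        M[i][i] = b[i]
    for i in range(n - 1):
        M[i][i + 1] = M[i + 1][i] = a[i]
    return M

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def grad_H(a, b, l):
    """Gradient of H_l = Tr(L^l)/l in the coordinates (a, b)."""
    n = len(b)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    L = jacobi(a, b)
    for _ in range(l - 1):
        P = matmul(P, L)
    return [2 * P[i][i + 1] for i in range(n - 1)] + [P[i][i] for i in range(n)]

def poisson_matrix(a, b, order):
    """Matrix of brackets {x_r, x_c} for the linear (1), quadratic (2), cubic (3) bracket."""
    n = len(b)
    P = [[0] * (2 * n - 1) for _ in range(2 * n - 1)]

    def put(r, c, value):  # fill an entry and its skew-symmetric partner
        P[r][c] += value
        P[c][r] -= value

    for i in range(n - 1):
        ai, bi = i, n - 1 + i  # positions of a_{i+1} and b_{i+1} (0-based i)
        inner = i + 1 < n - 1  # True when a_{i+2} exists
        if order == 1:
            put(ai, bi, -a[i])
            put(ai, bi + 1, a[i])
        elif order == 2:
            put(ai, bi, -a[i] * b[i])
            put(ai, bi + 1, a[i] * b[i + 1])
            put(bi, bi + 1, 2 * a[i] ** 2)
            if inner:
                put(ai, ai + 1, Fraction(a[i] * a[i + 1], 2))
        else:
            put(ai, bi, -a[i] * b[i] ** 2 - a[i] ** 3)
            put(ai, bi + 1, a[i] * b[i + 1] ** 2 + a[i] ** 3)
            put(bi, bi + 1, 2 * a[i] ** 2 * (b[i] + b[i + 1]))
            if inner:
                put(ai, ai + 1, a[i] * a[i + 1] * b[i + 1])
                put(ai, bi + 2, a[i] * a[i + 1] ** 2)
                put(ai + 1, bi, -a[i] ** 2 * a[i + 1])
    return P

def apply(P, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in P]

# Hypothetical integer sample point (N = 4).
a = [2, -1, 3]
b = [1, 4, -2, 5]
lenard = {}
for n_ in (2, 3):
    for l in (1, 2, 3):
        lhs = apply(poisson_matrix(a, b, n_), grad_H(a, b, l))
        rhs = apply(poisson_matrix(a, b, n_ - 1), grad_H(a, b, l + 1))
        lenard[(n_, l)] = (lhs == rhs)
print(lenard)
```

A single sample point is of course not a proof; the check only guards against sign or index slips when the relations are transcribed.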
3.2. Multi-Hamiltonian structure In the case of Toda equations, the master symmetries map invariant functions to other invariant functions. Hamiltonian vector fields are also preserved. New Poisson brackets are generated by using Lie derivatives in the direction of these vector fields and they satisfy interesting deformation relations. We give a summary of the results. There exists
• a sequence of invariants $H_1, H_2, H_3, \dots$, where $H_i = \frac{1}{i} \operatorname{Tr} L^i$;
• a corresponding sequence of Hamiltonian vector fields $\chi_1, \chi_2, \chi_3, \dots$, where $\chi_i = \chi_{H_i}$;
• a hierarchy of Poisson tensors $\pi_1, \pi_2, \pi_3, \dots$, where $\pi_i$ is polynomial, homogeneous, of degree $i$.
• Finally, one can determine a sequence of master symmetries $X_1, X_2, X_3, \dots$, which are used to create the hierarchies through Lie derivatives.

We quote the results from Refs. [9] and [10].

Theorem 6.
(i) $\pi_j$, $j \ge 1$ are all Poisson.
(ii) The functions $H_i$, $i \ge 1$ are in involution with respect to all of the $\pi_j$.
(iii) $X_i(H_j) = (i + j) H_{i+j}$, $i \ge -1$, $j \ge 1$.
(iv) $L_{X_i} \pi_j = (j - i - 2) \pi_{i+j}$, $i \ge -1$, $j \ge 1$.
(v) $[X_i, X_j] = (j - i) X_{i+j}$, $i \ge 0$, $j \ge 0$.
(vi) $\pi_j \nabla H_i = \pi_{j-1} \nabla H_{i+1}$, where $\pi_j$ denotes the Poisson matrix of the tensor $\pi_j$.
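To define the vector fields $X_n$ one considers expressions of the form
$$\dot L = [B, L] + L^{n+1} . \tag{56}$$
This equation is similar to a Lax equation, but in this case the eigenvalues satisfy $\dot\lambda = \lambda^{n+1}$ instead of $\dot\lambda = 0$. We give an outline of the construction of the vector fields $X_n$. We define $X_{-1}$ to be
$$\nabla H_1 = \nabla \operatorname{Tr} L = \sum_{i=1}^{N} \frac{\partial}{\partial b_i} , \tag{57}$$
and $X_0$ to be the Euler vector field
$$X_0 = \sum_{i=1}^{N-1} a_i \frac{\partial}{\partial a_i} + \sum_{i=1}^{N} b_i \frac{\partial}{\partial b_i} . \tag{58}$$
We want $X_1$ to satisfy
$$X_1(\operatorname{Tr} L^n) = n \operatorname{Tr} L^{n+1} . \tag{59}$$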
One way to find such a vector field is by considering the equation
$$\dot L = [B, L] + L^2 . \tag{60}$$
Note that the left-hand side of this equation is a tridiagonal matrix while the right-hand side is pentadiagonal. We look for $B$ as a tridiagonal matrix
$$B = \begin{pmatrix}
\gamma_1 & \beta_1 & 0 & \cdots & \cdots \\
\alpha_1 & \gamma_2 & \beta_2 & \cdots & \cdots \\
0 & \alpha_2 & \gamma_3 & \beta_3 & \cdots \\
\vdots & \vdots & \ddots & \ddots & \ddots
\end{pmatrix} . \tag{61}$$
We want to choose the $\alpha_i$, $\beta_i$ and $\gamma_i$ so that the right-hand side of Eq. (60) becomes tridiagonal. One simple solution is $\alpha_n = -(n + 1) a_n$, $\beta_n = (n + 1) a_n$, $\gamma_n = 0$. The vector field $X_1$ is defined by the right-hand side of (60):
$$X_1 = \sum_{n=1}^{N-1} \dot a_n \frac{\partial}{\partial a_n} + \sum_{n=1}^{N} \dot b_n \frac{\partial}{\partial b_n} , \tag{62}$$
where
$$\dot a_n = -n a_n b_n + (n + 2) a_n b_{n+1} , \tag{63}$$
$$\dot b_n = (2n + 3) a_n^2 + (1 - 2n) a_{n-1}^2 + b_n^2 . \tag{64}$$
To construct the vector field $X_2$ we consider the equation
$$\dot L = [B, L] + L^3 .$$
The calculations are similar to those for $X_1$. The matrix $B$ is now pentadiagonal and the system of equations slightly more complicated. The result is a vector field
$$X_2 = \sum_{n=1}^{N-1} \dot a_n \frac{\partial}{\partial a_n} + \sum_{n=1}^{N} \dot b_n \frac{\partial}{\partial b_n} ,$$
where
$$\dot a_n = (2 - n) a_{n-1}^2 a_n + (1 - n) a_n b_n^2 + a_n b_n b_{n+1} + (n + 1) a_n a_{n+1}^2 + (n + 1) a_n b_{n+1}^2 + a_n^3 + \sigma_n a_n (b_{n+1} - b_n) ,$$
$$\dot b_n = 2 \sigma_n a_n^2 - 2 \sigma_{n-1} a_{n-1}^2 + (2n + 2) a_n^2 b_n + (2n + 1) a_n^2 b_{n+1} + (3 - 2n) a_{n-1}^2 b_{n-1} + (4 - 2n) a_{n-1}^2 b_n + b_n^3 ,$$
with
$$\sigma_n = \sum_{i=1}^{n-1} b_i ,$$
and $\sigma_1 = 0$. We continue the sequence of master symmetries for $n \ge 3$ by
$$[X_1, X_{n-1}] = (n - 2) X_n . \tag{65}$$
3.3. Properties of $X_n$ and $\pi_n$

It is well known that $\pi_1$, $\pi_2$, $\pi_3$ satisfy Lenard relations
$$\pi_n \nabla H_l = \pi_{n-1} \nabla H_{l+1} , \qquad n = 2, 3 \quad \forall\, l . \tag{66}$$
We want to show that these relations hold for all values of $n$. We denote the Hamiltonian vector field of $H_l$ with respect to the $n$th bracket by $\chi_l^n$. In other words,
$$\chi_l^n = [\pi_n, H_l] . \tag{67}$$
We prove the Lenard relations in an equivalent form.

Proposition 2. $\chi_l^{n+1} = \chi_{l+1}^n$.

Proof. To prove this we need the identity
$$[X_1, \chi_l^n] = (n - 3) \chi_l^{n+1} + (l + 1) \chi_{l+1}^n \tag{68}$$
which follows easily from $X_1(H_l) = (l + 1) H_{l+1}$ and (67). Therefore,
$$\begin{aligned}
(n - 3) \chi_l^{n+1} &= [X_1, \chi_l^n] - (l + 1) \chi_{l+1}^n \\
&= [X_1, \chi_{l+1}^{n-1}] - (l + 1) \chi_{l+1}^n \\
&= (n - 4) \chi_{l+1}^n + (l + 2) \chi_{l+2}^{n-1} - (l + 1) \chi_{l+1}^n \\
&= (n - 4) \chi_{l+1}^n + (l + 2) \chi_{l+1}^n - (l + 1) \chi_{l+1}^n \\
&= (n - 3) \chi_{l+1}^n .
\end{aligned} \tag{69}$$

Using the Lenard relations we can show that the functions $H_n$ are in involution with respect to all of the brackets $\pi_n$.

Proposition 3. $\{H_i, H_j\}_n = 0$, where $\{\ ,\ \}_n$ is the bracket corresponding to $\pi_n$.

Proof. First we consider the Lie–Poisson Toda bracket. We have
$$\{H_1, H_j\} = 0 \quad \forall\, j , \tag{70}$$
since $H_1$ is a Casimir for $\pi_1$. Suppose that $\{H_{i-1}, H_j\} = 0$ $\forall\, j$. Then
$$\begin{aligned}
i \{H_i, H_j\} &= \{X_1(H_{i-1}), H_j\} \\
&= -[\chi_j^1, [X_1, H_{i-1}]] \\
&= [X_1, \{H_{i-1}, H_j\}] + [H_{i-1}, (j + 1) \chi_{j+1}^1] \\
&= (j + 1) \{H_{i-1}, H_{j+1}\} \\
&= 0 .
\end{aligned} \tag{71}$$
Now we use induction on $n$. Suppose
$$\{H_i, H_j\}_n = 0 \quad \forall\, i, j . \tag{72}$$
Then
$$\{H_i, H_j\}_{n+1} = \chi_i^{n+1}(H_j) = \chi_{i+1}^n(H_j) = \{H_{i+1}, H_j\}_n = 0 . \tag{73}$$

Of course, one can prove the involution of integrals without using master symmetries. We present the classical proof. In this proof the symbol $\pi_n$ stands for the bundle map $\pi_n^*$ defined by (18). We first prove the involution of constants in the Lie–Poisson Toda bracket. We use the basic Lenard relation $\pi_2\, dH_1 = \pi_1\, dH_2$. Since $H_1$ is a Casimir in the linear bracket, we have $\pi_1\, dH_1 = 0$. We calculate
$$\begin{aligned}
\{H_i, H_j\}_1 &= \langle dH_i, \pi_1\, dH_j \rangle = -\langle dH_j, \pi_1\, dH_i \rangle = -\langle dH_j, \pi_2\, dH_{i-1} \rangle \\
&= \langle dH_{i-1}, \pi_2\, dH_j \rangle = \langle dH_{i-1}, \pi_1\, dH_{j+1} \rangle = \{H_{i-1}, H_{j+1}\}_1 \\
&= \cdots = \{H_1, H_{j+i-1}\}_1 = 0 .
\end{aligned}$$
It is also easy to show involution in the second, quadratic bracket:
$$\{H_i, H_j\}_2 = \langle dH_i, \pi_2\, dH_j \rangle = \langle dH_i, \pi_1\, dH_{j+1} \rangle = \{H_i, H_{j+1}\}_1 = 0 .$$
The general result follows from the Lenard relations $\pi_n\, dH_j = \pi_{n-1}\, dH_{j+1}$ and induction:
$$\{H_i, H_j\}_n = \langle dH_i, \pi_n\, dH_j \rangle = \langle dH_i, \pi_{n-1}\, dH_{j+1} \rangle = \{H_i, H_{j+1}\}_{n-1} = 0 .$$

It is straightforward to verify that the mapping
$$f(a_1, \dots, a_{N-1}, b_1, \dots, b_N) = (a_1, \dots, a_{N-1}, 1 + b_1, \dots, 1 + b_N) \tag{74}$$
is a Poisson map between $\pi_2$ and $\pi_1 + \pi_2$. Since $f$ is a diffeomorphism, we have the isomorphism
$$\pi_2 \cong \pi_1 + \pi_2 . \tag{75}$$
In other words, the tensor $\pi_2$ encodes sufficient information for both the linear and quadratic Toda brackets. An easy induction generalizes this result, i.e.,

Proposition 4.
$$\pi_n \cong \sum_{j=0}^{n-1} \binom{n-1}{j} \pi_{n-j} . \tag{76}$$
(77)
satisfied by the eigenvalues of L. To prove the last equation, one uses the relation X X πn λl−1 λlk ∇λk . (78) k ∇λk = πn−1
But
X
λl−1 k (πk ∇λk − λk πn−1 ∇λk ) = 0 ,
(79)
for l = 1, 2, . . . , N + 1, has only the trivial solution because the (Vandermonde) coefficient determinant is non-zero. Proposition 5. For n > 2, Tr L2−n is a Casimir for πn on the open dense set det L 6= 0. Proof. For n = 3, π3 ∇ Tr L−1 = π3 =
X
X k
=−
k
−
X k
−
1 ∇λk λ2k
1 λk π2 ∇λk λ2k π1 ∇λk
= −π1 ∇ Tr L = χ11 = 0 . For n > 3 the induction step is as follows: X πn ∇ Tr L2−n = πn ∇ k
=
X k
1 λkn−2
(2 − n)1λkn−1 πn ∇λk
(80)
March 22, 2004 10:8 WSPC/148-RMP
00197
Multiple Hamiltonian Structure of Bogoyavlensky–Toda Lattices
=
X k
=
(2 − n)
195
1 λk πn−1 ∇λk λkn−1
n−2 πn−1 ∇ Tr L3−n n−3
= 0.
(81)
3.4. The Faybusovich–Gekhtman approach

In [46] Faybusovich and Gekhtman found another method of generating the multi-Hamiltonian structure for the Toda lattice. Their method is important and will certainly have applications to other integrable systems, both finite and infinite dimensional, solvable by the inverse spectral transform. Their work shows that the Hamiltonian formalism is built into the spectral theory. In the case of the Toda lattice, the key ingredient is the Moser map which takes the $(a, b)$ phase space of tridiagonal Jacobi matrices to a new space of variables $(\lambda_i, r_i)$ where $\lambda_i$ is an eigenvalue of the Jacobi matrix and $r_i$ is the residue of rational functions that appear in the solution of Toda equations. The Poisson brackets of Theorem 6 project onto some rational brackets in the space of Weyl functions and in particular, the Lie–Poisson bracket $\pi_1$ corresponds to the Atiyah–Hitchin bracket [47].

We briefly describe the construction: Moser in [48] introduced the resolvent
$$R(\lambda) = (\lambda I - L)^{-1} ,$$
and defined the Weyl function
$$f(\lambda) = (R(\lambda) e_1, e_1) ,$$
where $e_1 = (1, 0, \dots, 0)$. The function $f(\lambda)$ has a simple pole at $\lambda = \lambda_i$ and positive residue at $\lambda_i$ equal to $r_i$:
$$f(\lambda) = \sum_{i=1}^{n} \frac{r_i}{\lambda - \lambda_i} .$$
The variables $(a, b)$ may be expressed as rational functions of $\lambda_i$ and $r_i$ using a continued fraction expansion of $f(\lambda)$ which dates back to Stieltjes. Since the computation of the continued fraction from the partial fraction expansion is a rational process the solution is expressed as a rational function of the variables $(\lambda_i, r_i)$. The idea of Faybusovich and Gekhtman is to construct a sequence of Poisson brackets on the space $(\lambda_i, r_i)$ whose image under the inverse spectral transform are the brackets $\pi_i$ defined in Theorem 6. The Lie–Poisson bracket $\pi_1$ corresponds to the Atiyah–Hitchin bracket on Weyl functions which in coordinate free form is written as
$$\{f(\lambda), f(\mu)\} = \frac{(f(\lambda) - f(\mu))^2}{\lambda - \mu} .$$
A rational function of the form $\frac{q(\lambda)}{p(\lambda)}$ is determined uniquely by the distinct roots $\lambda_1, \dots, \lambda_n$ of $p(\lambda)$ and the values of $q$ at these roots. The residue $r_i$ is equal to $\frac{q(\lambda_i)}{p'(\lambda_i)}$ and therefore we may choose
$$\lambda_1, \dots, \lambda_n, q(\lambda_1), \dots, q(\lambda_n)$$
as global coordinates on the space of rational functions (of the form $\frac{q}{p}$ with $p$ having simple roots and $q$, $p$ coprime). We have to remark that the image of the Moser map is a much larger set. The $k$th Poisson bracket is defined by
$$\{\lambda_i, q(\lambda_i)\} = -\lambda_i^k\, q(\lambda_i) , \qquad \{q(\lambda_i), q(\lambda_j)\} = \{\lambda_i, \lambda_j\} = 0 .$$
Let us denote this bracket by $w_k$. On the other hand, in [9, page 108], there is a definition of vector fields on the space of eigenvalues of the Jacobi matrix which are projections of the master symmetries $X_i$. They are defined by
$$e_i = \sum_{j=1}^{N} \lambda_j^{i+1} \frac{\partial}{\partial \lambda_j} .$$
One verifies easily that these vector fields satisfy the usual Virasoro type relation
$$[e_i, e_j] = (j - i) e_{i+j} .$$
If we denote by $F$ the function which sends the Jacobi matrix to its eigenvalues then
$$dF(X_1) = e_1 , \qquad dF(X_2) = e_2 .$$
Therefore, it follows by induction that $dF(X_i) = e_i$. Faybusovich and Gekhtman used the brackets $w_i$ and the vector fields $e_j$ to obtain the analogue of Theorem 6 in the space of rational functions. The relations obtained correspond to the relations of Theorem 6 under the inverse of the Moser map. The explicit formulas for the brackets $w_k$ can be deduced easily from the formulas in [46]. They are
$$\begin{aligned}
\{r_i, r_j\}_k &= \frac{\lambda_i^k + \lambda_j^k}{\lambda_i - \lambda_j}\, r_i r_j \\
\{r_i, \lambda_i\}_k &= \lambda_i^k r_i \\
\{\lambda_i, \lambda_j\}_k &= 0 .
\end{aligned}$$
In a recent paper [49] Vaninsky also has explicit formulas in $(\lambda_i, r_i)$ coordinates for the initial bracket $w_1$.
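The Virasoro type relation for the vector fields $e_i$ quoted in this subsection is a one-line computation, which can be spelled out as follows (a worked verification added for convenience; only the definition of $e_i$ is used):

```latex
\begin{align*}
[e_i, e_j]
  &= \sum_{k=1}^{N} \left( \lambda_k^{i+1}\,
       \frac{\partial \lambda_k^{j+1}}{\partial \lambda_k}
     - \lambda_k^{j+1}\,
       \frac{\partial \lambda_k^{i+1}}{\partial \lambda_k} \right)
     \frac{\partial}{\partial \lambda_k} \\
  &= \sum_{k=1}^{N} \bigl( (j+1) - (i+1) \bigr)\, \lambda_k^{i+j+1}\,
     \frac{\partial}{\partial \lambda_k} \\
  &= (j - i)\, e_{i+j} .
\end{align*}
```

This matches part (v) of Theorem 6, as expected if the $e_i$ are projections of the master symmetries $X_i$.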
3.5. A theorem of Petalidou

Finally, we mention an interesting result of Petalidou [50]. She proved the following theorem: suppose that $(M, \Lambda_0, \Lambda_1)$ is a bi-Hamiltonian manifold of odd dimension and let $p$ be a point in $M$ of corank 1. If there exists locally an infinitesimal automorphism $Z_0$ of $\Lambda_0$ which is transverse to the symplectic leaf through $p$ and a vector field $Z_1$ which depends on a parameter $t$ such that
$$[\Lambda_1, Z_1] + \left[ Z_1, \frac{\partial}{\partial t} \right] \wedge Z_1 = 0 ,$$
and
$$[\Lambda_0, Z_1] + [\Lambda_1, Z_0] + \left[ Z_1, \frac{\partial}{\partial t} \right] \wedge Z_0 = 0 ,$$
then one can find a symplectic realization of both $\Lambda_0$ and $\Lambda_1$ by a pair of symplectic brackets $\hat\Lambda_0$, $\hat\Lambda_1$ given by
$$\hat\Lambda_i = \Lambda_i + Z_i \wedge \frac{\partial}{\partial t} , \qquad i = 0, 1 .$$
This result applies in the case of the Toda lattice by taking $Z_0 = X_{-1}$ and $Z_1 = X_0$. Petalidou obtained symplectic realizations of $\pi_1$ and $\pi_2$. Furthermore, she constructed symplectic realizations of the whole sequence $\pi_1, \pi_2, \pi_3, \dots$ of Theorem 6. The corresponding symplectic sequence is given by
$$\hat\pi_k = \pi_k + X_{k-2} \wedge \frac{\partial}{\partial t} .$$
The tensors $\hat\pi_k$ may be generated by a recursion operator since the initial tensor, which is a multiple of $\hat\Lambda_0$, is invertible.

3.6. A recursive process of Kosmann–Schwarzbach and Magri

In [51] Kosmann–Schwarzbach and Magri considered the relationship between Lax and bi-Hamiltonian formulations of integrable systems. They introduced an equation, called the Lax–Nijenhuis equation, relating the Lax matrix with the bi-Hamiltonian pair and they showed that every operator that satisfies that equation satisfies also the Lenard recursion relations. They derived the multi-Hamiltonian structure of the Toda lattice by defining a matrix $M$ and a vector $\lambda_0$ which arise by manipulating the Lax–Nijenhuis equation. They showed that
$$\pi_2 = M \pi_1 + X \otimes \lambda_0 ,$$
where $X$ is the Hamiltonian vector field $\chi_2$. In the next step of the recursive process they showed that
$$\pi_3 = M \pi_2 + X \otimes \lambda_1 ,$$
where $\lambda_1 = M \lambda_0$. In general,
$$\pi_{i+1} = M \pi_i + X \otimes M^{(i-1)} \lambda_0 .$$
4. Lie Group Symmetries of the Toda Lattice Sophus Lie introduced his theory of continuous groups in order to study symmetry properties of differential equations. His approach allowed a unification of existing methods for solving ordinary differential equations as well as classifications of symmetry groups of partial and ordinary differential equations. A symmetry group of a system of differential equations is a Lie group acting on the space of independent and dependent variables in such a way that solutions are mapped into other solutions. Knowing the symmetry group allows one to determine some special types of solutions invariant under a subgroup of the full symmetry group, and in some cases one can solve the equations completely. Lie’s methods have been developed into powerful tools for examining differential equations through group analysis. In many cases, symmetry groups are the only known means for finding concrete solutions to complicated equations. The method applies of course to the case of Hamiltonian or Lagrangian systems, both autonomous and time dependent. Recently, the immense amount of computation needed for determining symmetry groups of concrete systems has been greatly reduced by the implementation of computer algebra packages for symmetry analysis of differential equations. The symmetry approach to solving differential equations can be found, for example, in the books of Olver [52], Bluman and Kumei [53], Ovsiannikov [54] and Ibragimov [55]. Some properties of master symmetries are clear: they preserve constants of motion, Hamiltonian vector fields and they generate a hierarchy of Poisson brackets. We are interested in the following problem: can one find a symmetry group of the system whose infinitesimal generator is a given master symmetry? In the case of Toda equations the answer is negative. However, in this section we find a sequence consisting of time-dependent evolution vector fields whose time-independent part is a master symmetry. 
Each master symmetry $X_n$ can be written in the form $Y_n + t Z_n$ where $Y_n$ is a time-dependent symmetry and $Z_n$ is a time-independent Hamiltonian symmetry (i.e. a Hamiltonian vector field). In other words, we find an infinite sequence of evolution vector fields that are symmetries of Eqs. (49). We do not know if every symmetry of Toda equations is included in this sequence.

We begin by writing Eqs. (49) in the form
$$\Gamma_j = \dot a_j - a_j b_{j+1} + a_j b_j = 0 , \qquad \Delta_j = \dot b_j - 2 a_j^2 + 2 a_{j-1}^2 = 0 .$$
We look for symmetries of Toda equations, i.e. vector fields of the form
$$v = \tau \frac{\partial}{\partial t} + \sum_{j=1}^{N-1} \phi_j \frac{\partial}{\partial a_j} + \sum_{j=1}^{N} \psi_j \frac{\partial}{\partial b_j}$$
that generate the symmetry group of the Toda equations. The first prolongation of $v$ is
$$\mathrm{pr}^{(1)} v = v + \sum_{j=1}^{N-1} f_j \frac{\partial}{\partial \dot a_j} + \sum_{j=1}^{N} g_j \frac{\partial}{\partial \dot b_j} ,$$
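where
$$f_j = \dot\phi_j - \dot\tau\, \dot a_j , \qquad g_j = \dot\psi_j - \dot\tau\, \dot b_j .$$
The infinitesimal condition for a group to be a symmetry of the system is
$$\mathrm{pr}^{(1)}(\Gamma_j) = 0 , \qquad \mathrm{pr}^{(1)}(\Delta_j) = 0 .$$
Therefore we obtain the equations
$$\dot\phi_j - \dot\tau\, a_j (b_{j+1} - b_j) + \phi_j (b_j - b_{j+1}) + a_j \psi_j - a_j \psi_{j+1} = 0 , \tag{82}$$
$$\dot\psi_j - 2 \dot\tau (a_j^2 - a_{j-1}^2) - 4 a_j \phi_j + 4 a_{j-1} \phi_{j-1} = 0 . \tag{83}$$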
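We first give some obvious solutions:

(i) $\tau = 0$, $\phi_j = 0$, $\psi_j = 1$. This is the vector field $X_{-1}$.

(ii) $\tau = -1$, $\phi_j = 0$, $\psi_j = 0$. The resulting vector field is the time translation $-\frac{\partial}{\partial t}$ whose evolutionary representative is
$$\sum_{j=1}^{N-1} \dot a_j \frac{\partial}{\partial a_j} + \sum_{j=1}^{N} \dot b_j \frac{\partial}{\partial b_j} .$$
This is the Hamiltonian vector field $\chi_{H_2}$. It generates a Hamiltonian symmetry group.

(iii) $\tau = -t$, $\phi_j = a_j$, $\psi_j = b_j$. Then
$$v = -t \frac{\partial}{\partial t} + \sum_{j=1}^{N-1} a_j \frac{\partial}{\partial a_j} + \sum_{j=1}^{N} b_j \frac{\partial}{\partial b_j} = -t \frac{\partial}{\partial t} + X_0 .$$
Here $\dot\tau = -1$, so (82) and (83) reduce to the Toda equations (49) themselves. This vector field generates the same symmetry as the evolutionary vector field $X_0 + t \chi_{H_2}$.

We next look for some non-obvious solutions. The vector field $X_1$ is not a symmetry, so we add a term which depends on time. We try
$$\phi_j = -j a_j b_j + (j + 2) a_j b_{j+1} + t \left( a_j a_{j+1}^2 + a_j b_{j+1}^2 - a_{j-1}^2 a_j - a_j b_j^2 \right) ,$$
$$\psi_j = (2j + 3) a_j^2 + (1 - 2j) a_{j-1}^2 + b_j^2 + t \left( 2 a_j^2 b_{j+1} + 2 a_j^2 b_j - 2 a_{j-1}^2 b_{j-1} - 2 a_{j-1}^2 b_j \right) ,$$
and $\tau = 0$.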
A tedious but straightforward calculation shows that $\phi_j$, $\psi_j$ satisfy (82) and (83). It is also straightforward to check that the vector field
$$\sum \phi_j \frac{\partial}{\partial a_j} + \sum \psi_j \frac{\partial}{\partial b_j}$$
is precisely equal to $X_1 + t \chi_{H_3}$. The pattern suggests that $X_n + t \chi_{H_{n+2}}$ is a symmetry of Toda equations.

Theorem 7. The vector fields $X_n + t \chi_{n+2}$ are symmetries of Toda equations for $n \ge -1$.

Proof. Note that $\chi_{H_1} = 0$ because $H_1$ is a Casimir for the Lie–Poisson bracket. We use the formula
$$[X_n, \chi_l] = (l - 1) \chi_{n+l} . \tag{84}$$
In particular, for $l = 2$, we have $[X_n, \chi_2] = \chi_{n+2}$. Since the Toda flow is Hamiltonian, generated by $\chi_2$, to show that $Y_n = X_n + t \chi_{n+2}$ are symmetries of Toda equations we must verify the equation
$$\frac{\partial Y_n}{\partial t} + [\chi_2, Y_n] = 0 . \tag{85}$$
But
$$\begin{aligned}
\frac{\partial Y_n}{\partial t} + [\chi_2, Y_n] &= \frac{\partial Y_n}{\partial t} + [\chi_2, X_n + t \chi_{n+2}] \\
&= \chi_{n+2} - [X_n, \chi_2] \\
&= \chi_{n+2} - \chi_{n+2} = 0 .
\end{aligned} \tag{86}$$
5. The Toda Lattice in Natural Coordinates

In this section we define the positive and negative Toda hierarchies for the Toda lattice in $(q, p)$ variables. We follow Ref. [15].

5.1. The Das–Okubo–Fernandes approach

Another approach, which explains the relations of Theorem 6, is adopted in Das and Okubo [42], and Fernandes [43]. In principle, their method is general and may work for other finite dimensional systems as well. This approach was also used in [56] by da Costa and Marle in the case of the Relativistic Toda lattice. The procedure is as follows: one defines a second Poisson bracket in the space of canonical variables $(q_1, \dots, q_N, p_1, \dots, p_N)$. This gives rise to a recursion operator. The presence of a conformal symmetry as defined in Oevel [32] allows one, by using the recursion operator, to generate an infinite sequence of master symmetries. These, in turn,
project to the space of the new variables $(a, b)$ to produce a sequence of master symmetries in the reduced space. Let $\hat J_1$ be the symplectic bracket (50) with Poisson matrix
\[
\hat J_1 = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} ,
\]
where $I$ is the $N \times N$ identity matrix. We use $J_1 = 4\hat J_1$. With this convention the bracket $J_1$ is mapped precisely onto the bracket $\pi_1$ under the Flaschka transformation (48). We define $\hat J_2$ to be the tensor
\[
\hat J_2 = \begin{pmatrix} A & B \\ -B & C \end{pmatrix} ,
\]
where $A$ is the skew-symmetric matrix defined by $a_{ij} = 1 = -a_{ji}$ for $i < j$, $B$ is the diagonal matrix $(-p_1, -p_2, \dots, -p_N)$ and $C$ is the skew-symmetric matrix whose non-zero terms are $c_{i,i+1} = -c_{i+1,i} = e^{q_i - q_{i+1}}$ for $i = 1, 2, \dots, N-1$. We define $J_2 = 2\hat J_2$. With this convention the bracket $J_2$ is mapped precisely onto the bracket $\pi_2$ under the Flaschka transformation. It is easy to see that we have a bi-Hamiltonian pair. We define
\[
h_1 = -2(p_1 + p_2 + \cdots + p_N) ,
\]
and $h_2$ to be the Hamiltonian:
\[
h_2 = \sum_{i=1}^{N} \frac{1}{2} p_i^2 + \sum_{i=1}^{N-1} e^{q_i - q_{i+1}} .
\]
Under Flaschka's transformation (48), $h_1$ is mapped onto $4(b_1 + b_2 + \cdots + b_N) = 4 \operatorname{Tr} L = 4H_1$ and $h_2$ is mapped onto $2 \operatorname{Tr} L^2 = 4H_2$. Using the relationship
\[
\pi_2 \nabla H_1 = \pi_1 \nabla H_2 ,
\]
which follows from Proposition 2, we obtain, after multiplication by 4, the following pair:
\[
J_1 \nabla h_2 = J_2 \nabla h_1 .
\]
We define the recursion operator as follows:
\[
R = J_2 J_1^{-1} .
\]
The matrix form of $R$ is quite simple:
\[
R = \frac{1}{2} \begin{pmatrix} B & -A \\ C & B \end{pmatrix} . \tag{87}
\]
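The block form (87) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it assembles $J_1 = 4\hat J_1$ and $J_2 = 2\hat J_2$ for a small number of particles and compares $J_2 J_1^{-1}$ with the closed form. The helper names (`toda_blocks`, `recursion_operator`) are hypothetical, and the exponentials $e^{q_i - q_{i+1}}$ are replaced by arbitrary rational placeholders `c[i]`, since only the block structure matters here.

```python
from fractions import Fraction


def matmul(X, Y):
    """Dense product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def scale(s, X):
    """Multiply every entry of X by the scalar s."""
    return [[s * x for x in row] for row in X]


def block(tl, tr, bl, br):
    """Assemble a 2x2 block matrix from four square blocks of equal size."""
    top = [r1 + r2 for r1, r2 in zip(tl, tr)]
    bottom = [r1 + r2 for r1, r2 in zip(bl, br)]
    return top + bottom


def toda_blocks(p, c):
    """Blocks A, B, C of J2-hat; c[i] is a stand-in for exp(q_i - q_{i+1})."""
    n = len(p)
    zero = Fraction(0)
    A = [[Fraction((j > i) - (j < i)) for j in range(n)] for i in range(n)]
    B = [[-p[i] if i == j else zero for j in range(n)] for i in range(n)]
    C = [[zero] * n for _ in range(n)]
    for i in range(n - 1):
        C[i][i + 1] = c[i]
        C[i + 1][i] = -c[i]
    return A, B, C


def recursion_operator(p, c):
    """Return (J2 * J1^{-1}, closed form (87)) as exact rational matrices."""
    n = len(p)
    A, B, C = toda_blocks(p, c)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    Z = [[Fraction(0)] * n for _ in range(n)]
    J1 = scale(4, block(Z, I, scale(-1, I), Z))
    J2 = scale(2, block(A, B, scale(-1, B), C))
    # Candidate inverse of J1; verified by multiplication on the next lines.
    J1_inv = scale(Fraction(1, 4), block(Z, scale(-1, I), I, Z))
    identity = [[Fraction(int(i == j)) for j in range(2 * n)] for i in range(2 * n)]
    assert matmul(J1, J1_inv) == identity
    R = matmul(J2, J1_inv)
    R_closed = scale(Fraction(1, 2), block(B, scale(-1, A), C, B))
    return R, R_closed


p = [Fraction(3), Fraction(-1, 2), Fraction(5)]
c = [Fraction(2), Fraction(7, 3)]
R, R_closed = recursion_operator(p, c)
print(R == R_closed)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point tolerance check.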
This operator raises degrees and we therefore call it the positive Toda operator. In $(q, p)$ coordinates, the symbol $\chi_i$ is a shorthand for $\chi_{h_i}$. It is generated as usual by
\[
\chi_i = R^{i-1} \chi_1 .
\]
In a similar fashion we obtain the higher order Poisson tensors
\[
J_i = R^{i-1} J_1 .
\]
We finally define the conformal symmetry
\[
Z_0 = \sum_{i=1}^{N} (N - 2i + 1) \frac{\partial}{\partial q_i} + \sum_{i=1}^{N} p_i \frac{\partial}{\partial p_i} .
\]
It is straightforward to verify that
\[
L_{Z_0} J_1 = -J_1 , \qquad L_{Z_0} J_2 = 0 .
\]
In fact, $Z_0$ is Hamiltonian in the $J_2$ bracket with Hamiltonian function $\frac{1}{2} \sum_{i=1}^{N} q_i$; see [43]. This observation will be generalized in Sec. 5.3. In addition,
\[
Z_0(h_1) = h_1 , \qquad Z_0(h_2) = 2h_2 .
\]
Consequently, $Z_0$ is a conformal symmetry for $J_1$, $J_2$ and $h_1$. The constants appearing in Oevel's Theorem are $\lambda = -1$, $\mu = 0$ and $\nu = 1$. Therefore, we end up with the following deformation relations:
\[
[Z_i, h_j] = (i+j) h_{i+j} , \qquad L_{Z_i} J_j = (j - i - 2) J_{i+j} , \qquad [Z_i, Z_j] = (j - i) Z_{i+j} .
\]
Switching to Flaschka coordinates, we obtain relations (iii)–(v) of Theorem 6.

5.2. The negative Toda hierarchy

To define the negative Toda hierarchy we use the inverse of the positive recursion operator $R$. We define
\[
\mathcal{N} = R^{-1} = J_1 J_2^{-1} .
\]
Obviously we can use the same conformal symmetry $Z_0 = K_0$ and take $\lambda = 0$, $\mu = -1$ and $\nu = 2$. In other words the role of $\lambda$ and $\mu$ is reversed. We define the vector fields
\[
K_i = \mathcal{N}^i K_0 = \mathcal{N}^i Z_0 , \qquad i = 1, 2, \dots
\]
which are master symmetries. We use the convention $Y_{-i} = K_i$ for $i = 0, 1, 2, \dots$. For example,
\[
Y_{-1} = K_1 = \mathcal{N} Z_0 = -2 \sum_{i=1}^{N} \frac{\partial}{\partial p_i} .
\]
This vector field, in $(a, b)$ coordinates, is given by
\[
X_{-1} = \nabla H_1 = \nabla \operatorname{Tr} L = \sum_{i=1}^{N} \frac{\partial}{\partial b_i} .
\]
This is precisely the same vector field (57). In that section, $X_{-1}$ was constructed through a different method. Similarly, the vector field $Z_0$ corresponds to the Euler vector field (58):
\[
X_0 = \sum_{i=1}^{N-1} a_i \frac{\partial}{\partial a_i} + \sum_{i=1}^{N} b_i \frac{\partial}{\partial b_i} .
\]
Note: We use the symbol $Y_i$ for a vector field in $(q, p)$ coordinates and $X_i$ for the same vector field in $(a, b)$ coordinates. Similarly, we denote by $J_i$ a Poisson tensor in $(q, p)$ coordinates and $\pi_i$ the corresponding Poisson tensor in $(a, b)$ coordinates. The index $i$ ranges over all integers.

We now calculate, using Oevel's Theorem:
\[
[Y_{-i}, Y_{-j}] = [K_i, K_j] = (\mu - \lambda)(j - i) K_{i+j} = (-1)(j - i) K_{i+j} = (i - j) Y_{-(i+j)} .
\]
Letting $m = -i$ and $n = -j$ we obtain the relationship
\[
[Y_m, Y_n] = (n - m) Y_{m+n} , \tag{88}
\]
for all $m$, $n$ negative. The same relation holds in Flaschka coordinates. In other words
\[
[X_m, X_n] = (n - m) X_{m+n} , \qquad \forall\, m, n \in \mathbb{Z}^- .
\]
This last relation may be modified to hold for any two arbitrary integers $m$, $n$. We suppose, without loss of generality, that $j > i$ and consider the bracket of two master symmetries $K_i = Y_{-i}$ and $Z_j = Y_j$, one in the negative hierarchy and the second in the positive hierarchy, i.e. $K_i = \mathcal{N}^i Z_0 = R^{-i} Z_0$, and $Z_j = R^j Z_0$. We proceed as in the proof of Oevel's Theorem (see [43]): First we note that
\[
L_{Z_0} R = (L_{Z_0} J_2) J_1^{-1} - J_2 J_1^{-1} (L_{Z_0} J_1) J_1^{-1} = (\mu - \lambda) R .
\]
On the other hand
\[
L_{Z_0} \mathcal{N} = L_{Z_0} (J_1 J_2^{-1}) = (\lambda - \mu) \mathcal{N} .
\]
Finally,
\[
\begin{aligned}
[Y_{-i}, Y_j] = [K_i, Z_j] &= [\mathcal{N}^i Z_0, R^j Z_0] = \mathcal{N}^i (L_{Z_0} R^j) Z_0 - R^j (L_{Z_0} \mathcal{N}^i) Z_0 \\
&= \mathcal{N}^i j (\mu - \lambda) R^j Z_0 - R^j i (\lambda - \mu) \mathcal{N}^i Z_0 \\
&= j (\mu - \lambda) R^{j-i} Z_0 - i (\lambda - \mu) R^{j-i} Z_0 \\
&= (i + j)(\mu - \lambda) R^{j-i} Z_0 = (i + j)(\mu - \lambda) Y_{j-i} .
\end{aligned}
\]
In the case of Toda lattice $\mu = 0$ and $\lambda = -1$, therefore
\[
[Y_{-i}, Y_j] = (i + j) Y_{j-i} .
\]
We deduce that (88) holds for any integer value of the index.

We define $W_i = J_{3-i}$. This is necessary since the conclusions of Oevel's Theorem assume that the index begins at $i = 1$ and is positive. We compute
\[
L_{Y_{-i}} J_{-j} = L_{K_i} W_{j+3} = \bigl( \mu + (j + 3 - 2 - i)(\mu - \lambda) \bigr) W_{i+j+3} = (i - j - 2) W_{i+j+3} = (i - j - 2) J_{-(i+j)} .
\]
Letting $m = -i$ and $n = -j$ we obtain
\[
L_{Y_m} J_n = (n - m - 2) J_{n+m} ,
\]
for $n$, $m$ negative integers. Switching to Flaschka coordinates we deduce that the relation (iv) of Theorem 6 holds also for negative values of the index. In other words
\[
L_{X_i} \pi_j = (j - i - 2) \pi_{i+j} , \qquad i \le 0 , \quad j \le 0 .
\]
Again, a straightforward modification of the proof of Oevel's Theorem shows that the last relationship holds for any integer value of $m$, $n$.

We have shown that conclusions (iv) and (v) of Theorem 6 hold for integer values of the index. In fact, it is not difficult to demonstrate all the other parts of Theorem 6.

Theorem 8. The conclusions of Theorem 6 hold for any integer value of the index.

Proof. We need to prove parts (i)–(iii) and (vi) of the Theorem.

(i) The fact that $J_n$ are Poisson for $n \in \mathbb{Z}$ follows from the properties of the recursion operator. The similar result in $(a, b)$ coordinates follows easily from properties of the Schouten bracket, and the fact that $J_n$ and $\pi_n$ are $F$-related. We have $\pi_n = F_* J_n$, therefore
\[
[\pi_n, \pi_n] = [F_*(J_n), F_*(J_n)] = F_* [J_n, J_n] = F_*(0) = 0 .
\]
The vanishing of the Schouten bracket is equivalent to the Poisson property.

(iii) The case where $i$ and $j$ are both of the same sign was already proved. We next note that $X_n(\lambda) = \lambda^{n+1}$ if $\lambda$ is an eigenvalue of $L$. This follows from Eq. (56) which is used to define the vector fields $X_n$ for $n \ge 0$. We would like to extend the formula $X_n(\lambda) = \lambda^{n+1}$ for $n < 0$. Since $X_{-1}(\lambda) = 1$ we consider $X_{-2}$. We look at the equation
\[
[X_{-2}, X_n] = (n + 2) X_{n-2} .
\]
We act on $\lambda$ with both sides of the equation and let $X_{-2}(\lambda) = f(\lambda)$. We obtain the equation
\[
(n + 1) \lambda f(\lambda) - f'(\lambda) \lambda^2 = (n + 2) .
\]
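This is a linear first-order ordinary differential equation with general solution
\[
f(\lambda) = \frac{1}{\lambda} + c \lambda^{n+1} .
\]
Since $n$ is arbitrary, we obtain $f(\lambda) = \frac{1}{\lambda}$. In order to calculate $X_{-3}(\lambda)$ we use
\[
X_{-3} = -[X_{-1}, X_{-2}] .
\]
We obtain
\[
X_{-3}(\lambda) = X_{-2} X_{-1}(\lambda) - X_{-1} X_{-2}(\lambda) = -X_{-1}\!\left( \frac{1}{\lambda} \right) = \frac{1}{\lambda^2} .
\]
The result follows by induction. Finally we calculate
\[
X_i(H_j) = X_i\!\left( \frac{1}{j} \sum_k \lambda_k^j \right) = \frac{1}{j} \sum_k X_i(\lambda_k^j) = \sum_k \lambda_k^{j-1} X_i(\lambda_k) = \sum_k \lambda_k^{j-1} \lambda_k^{i+1} = \sum_k \lambda_k^{i+j} = (i + j) H_{i+j} .
\]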
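As a sanity check on part (iii), the sketch below (an illustration only, with hypothetical function names) substitutes $f(\lambda) = 1/\lambda + c\lambda^{n+1}$ into the equation $(n+1)\lambda f - f'\lambda^2 = n+2$ using exact rationals, and shows that two different values of $n$ give the same $f$ only when $c = 0$, which is the reason the text concludes $f(\lambda) = 1/\lambda$.

```python
from fractions import Fraction


def f(lam, n, c):
    """General solution f = 1/lam + c * lam^(n+1)."""
    return 1 / lam + c * lam ** (n + 1)


def f_prime(lam, n, c):
    """Derivative of f with respect to lam."""
    return -1 / lam ** 2 + c * (n + 1) * lam ** n


def ode_residual(lam, n, c):
    """Left side minus right side of (n+1) lam f - f' lam^2 = n + 2."""
    return (n + 1) * lam * f(lam, n, c) - f_prime(lam, n, c) * lam ** 2 - (n + 2)


lam = Fraction(3, 2)
for n in range(-1, 5):
    for c in (Fraction(0), Fraction(2), Fraction(-5, 7)):
        assert ode_residual(lam, n, c) == 0

# The solution is independent of n only if the homogeneous part vanishes.
print(f(lam, 0, Fraction(1)) == f(lam, 1, Fraction(1)))
print(f(lam, 0, Fraction(0)) == f(lam, 1, Fraction(0)))
```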
(vi) First we note that
\[
\pi_j \nabla H_i = \pi_{j-1} \nabla H_{i+1}
\]
holds for $i$, $j$ of the same sign. More generally, in the positive (or the negative) hierarchy we have the Lenard relations (77) for the eigenvalues, i.e.
\[
\pi_j \nabla \lambda_i = \lambda_i\, \pi_{j-1} \nabla \lambda_i . \tag{89}
\]
Assume now that $i < 0$, $j > 0$. The calculation is straightforward:
\[
\pi_j \nabla \frac{1}{i} \sum_k \lambda_k^{i} = \sum_k \lambda_k^{i-1} \pi_j \nabla \lambda_k = \sum_k \lambda_k^{i} \pi_{j-1} \nabla \lambda_k = \pi_{j-1} \nabla \frac{1}{i+1} \sum_k \lambda_k^{i+1} .
\]
Therefore,
\[
\pi_j \nabla H_i = \pi_{j-1} \nabla H_{i+1} . \tag{90}
\]
In the case $i > 0$ and $j < 0$ we use exactly the same calculation but use (89) for the negative hierarchy.

(ii) It is clearly enough to show the involution of the eigenvalues of $L$ since $H_i$ are functions of the eigenvalues. We prove involution of eigenvalues by using the Lenard relations (90). We give the proof for the case of the bracket $\pi_j$ with $j > 0$ but if $j < 0$ the proof is identical. First we show that the eigenvalues are in involution with respect to the bracket $\pi_1$. Let $\lambda$ and $\mu$ be two distinct eigenvalues and let $U$, $V$ be the gradients of $\lambda$ and $\mu$ respectively. We use the notation $\{\ ,\ \}$ to
denote the bracket $\pi_1$ and $\langle\ ,\ \rangle$ the standard inner product. The Lenard relations (89) translate into $\pi_2 U = \lambda\, \pi_1 U$ and $\pi_2 V = \mu\, \pi_1 V$. Therefore,
\[
\begin{aligned}
\{\lambda, \mu\} = \langle \pi_1 U, V \rangle &= \frac{1}{\lambda} \langle \pi_2 U, V \rangle = -\frac{1}{\lambda} \langle U, \pi_2 V \rangle = -\frac{1}{\lambda} \langle U, \mu \pi_1 V \rangle \\
&= -\frac{\mu}{\lambda} \langle U, \pi_1 V \rangle = \frac{\mu}{\lambda} \langle \pi_1 U, V \rangle = \frac{\mu}{\lambda} \{\lambda, \mu\} .
\end{aligned}
\]
Since $\lambda \ne \mu$, it follows that $\{\lambda, \mu\} = 0$. To show the involution with respect to all brackets $\pi_j$, and in view of part (iv) of Theorem 6, it is enough to show the following: let $f_1$, $f_2$ be two functions in involution with respect to the Poisson bracket $\pi$, and $X$ be a vector field such that $X(f_i) = f_i^2$ for $i = 1, 2$. Define a Poisson bracket $w$ by $w = L_X \pi$. Then the functions $f_1$, $f_2$ remain in involution with respect to the bracket $w$. The proof follows trivially if we write $w = L_X \pi$ in Poisson form:
\[
\{f_1, f_2\}_w = X\{f_1, f_2\}_\pi - \{f_1, X(f_2)\}_\pi - \{X(f_1), f_2\}_\pi .
\]

Remark. We should point out that
\[
H_n = \frac{1}{n} \operatorname{Tr} L^n
\]
makes sense for $n \ne 0$ but it is undefined for $n = 0$. The reader should interpret the formulas involving $H_0$ as a degenerate case, i.e. $H_0 = \frac{\operatorname{Tr} L^0}{0} = \frac{N}{0} = \infty$. Accordingly, the result of $X_{-n}(H_n)$ is $N$, where $N$ is the size of $L$. It makes sense to define
\[
X_m(H_0) = \lim_{n \to 0} \frac{1}{n} X_m(\operatorname{Tr} L^n) .
\]
For example, $X_{-1}(H_0)$ is calculated by $X_{-1}\!\left( \frac{1}{n} \operatorname{Tr} L^n \right) = \operatorname{Tr} L^{n-1}$. Taking the limit as $n \to 0$ gives $X_{-1}(H_0) = \operatorname{Tr} L^{-1} = -H_{-1}$ which is the correct answer.
5.3. Master integrals and master symmetries

In this section we prove some further results and give some specific examples. In Sec. 5.1 we noticed that $Z_0$ is Hamiltonian with respect to the $J_2$ bracket with Hamiltonian function $f = \frac{1}{2} \sum_{i=1}^{N} q_i$. This observation is due to Fernandes [43]. This type of function is called a master integral. It is not a constant of motion, but its derivative is. We generalize the result as follows:

Theorem 9. The master symmetry $Y_n$, $n \in \mathbb{Z}$, is the Hamiltonian vector field of $f$ with respect to the $J_{n+2}$ bracket.

Proof. We will prove the result for the positive hierarchy $Z_n = Y_n$ but the proof for $Y_{-n} = K_n$ is similar. As a first step we show that
\[
Z_n(f) = 0 , \qquad \forall\, n \ge 0 .
\]
We recall that
\[
Z_0 = \sum_{i=1}^{N} (N - 2i + 1) \frac{\partial}{\partial q_i} + \sum_{i=1}^{N} p_i \frac{\partial}{\partial p_i} .
\]
Since
\[
\sum_{i=1}^{N} (N + 1 - 2i) = 0 ,
\]
we obtain
\[
Z_0(f) = Z_0\!\left( \frac{1}{2} \sum_{i=1}^{N} q_i \right) = \frac{1}{2} \left( \sum_{i=1}^{N} Z_0(q_i) \right) = 0 .
\]
By examining the form (87) of the recursion operator $R$ we deduce easily that the $q_i$ component of $Z_1$ is
\[
Z_1(q_i) = -\frac{1}{2} \left( (N - 2i + 1) p_i + \sum_{j > i} p_j - \sum_{j < i} p_j \right) .
\]
In other words, the vector $(Z_1(q_1), \dots, Z_1(q_N))$ is the product $AP$ where
\[
A = -\frac{1}{2}
\begin{pmatrix}
Z_0(q_1) & 1 & \cdots & \cdots & 1 \\
-1 & Z_0(q_2) & 1 & \cdots & 1 \\
-1 & -1 & Z_0(q_3) & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 1 \\
-1 & -1 & \cdots & -1 & Z_0(q_N)
\end{pmatrix} ,
\]
and $P$ is the column vector $(p_1, p_2, \dots, p_N)^t$. Note that $\sum_{i=1}^{N} a_{ij} = 0$ and $\sum_{j=1}^{N} a_{ij} = -Z_0(q_i)$. Therefore,
\[
Z_1(f) = \frac{1}{2} \bigl( Z_1(q_1) + \cdots + Z_1(q_N) \bigr) = \frac{1}{2} \sum_{i,j} a_{ij} p_j = \frac{1}{2} \sum_j \left( \sum_i a_{ij} \right) p_j = 0 .
\]
In the same fashion one proves that $Z_2(f) = 0$. For $n > 2$, we proceed by induction:
\[
Z_n = \frac{1}{n-2} [Z_1, Z_{n-1}] .
\]
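The row and column sums of $A$ used in the step $Z_1(f) = 0$ are easy to confirm for small $N$. The sketch below is illustrative only; `matrix_a` and `z1_of_f` are hypothetical helper names, and the matrix is built directly from the displayed form (entries $Z_0(q_i) = N - 2i + 1$ on the diagonal, $+1$ above, $-1$ below, overall factor $-\tfrac12$).

```python
from fractions import Fraction


def z0_q(i, n):
    """q_i-component of Z0 for 1-based index i: N - 2i + 1."""
    return n - 2 * i + 1


def matrix_a(n):
    """The N x N matrix A with the overall factor -1/2 included."""
    half = Fraction(-1, 2)
    return [
        [half * (z0_q(i, n) if i == j else (1 if j > i else -1))
         for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


def z1_of_f(p):
    """Z1(f) = (1/2) * sum_i Z1(q_i), where (Z1(q_i)) = A P."""
    n = len(p)
    A = matrix_a(n)
    z1_q = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
    return Fraction(1, 2) * sum(z1_q)


for n in range(1, 7):
    A = matrix_a(n)
    column_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
    row_sums = [sum(A[i]) for i in range(n)]
    assert column_sums == [0] * n
    assert row_sums == [-z0_q(i, n) for i in range(1, n + 1)]

print(z1_of_f([Fraction(3), Fraction(-2), Fraction(7, 5), Fraction(1)]))
```

Because every column of $A$ sums to zero, `z1_of_f` vanishes for any momentum vector, which is the content of the displayed computation.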
Therefore,
\[
Z_n(f) = \frac{1}{n-2} [Z_1, Z_{n-1}](f) = \frac{1}{n-2} \bigl( Z_1 Z_{n-1} f - Z_{n-1} Z_1 f \bigr) = 0 ,
\]
by the induction hypothesis. To complete the proof of the theorem, it is enough to show
\[
Z_n = [J_{n+2}, f] ,
\]
where $[\ ,\ ]$ denotes the Schouten bracket. First we note that
\[
[[J_{n+1}, f], Z_1] + [[f, Z_1], J_{n+1}] + [[Z_1, J_{n+1}], f] = 0
\]
due to the super Jacobi identity for the Schouten bracket. Since
\[
[Z_1, f] = Z_1(f) = 0 ,
\]
the middle term in the last identity is zero. We obtain
\[
[Z_1, [J_{n+1}, f]] = [[Z_1, J_{n+1}], f] .
\]
Finally, we calculate using induction:
\[
\begin{aligned}
Z_n &= \frac{1}{n-2} [Z_1, Z_{n-1}] \\
&= \frac{1}{n-2} [Z_1, [J_{n+1}, f]] \\
&= \frac{1}{n-2} [[Z_1, J_{n+1}], f] \\
&= \frac{1}{n-2} \bigl( [(n-2) J_{n+2}, f] \bigr) \\
&= [J_{n+2}, f] .
\end{aligned}
\]
The result of the theorem is striking. It shows that the master symmetries are determined once the Poisson hierarchy is constructed. Of course one requires knowledge of the function $f$. The function $f$ may be constructed by using Noether's theorem: the symmetries of the Toda lattice in $(q, p)$ coordinates have been constructed in [57], at least for two degrees of freedom. The Lie algebra for the potential of the
Toda lattice with $N$ degrees of freedom is five dimensional with generators
\[
\begin{aligned}
X_1 &= \frac{\partial}{\partial t} \\
X_2 &= t \frac{\partial}{\partial t} - 2 \sum_{i=2}^{N} (i - 1) \frac{\partial}{\partial q_i} \\
X_3 &= \left( \sum_{i=1}^{N} q_i \right) \sum_{i=1}^{N} \frac{\partial}{\partial q_i} \\
X_4 &= \sum_{i=1}^{N} \frac{\partial}{\partial q_i} \\
X_5 &= t \sum_{i=1}^{N} \frac{\partial}{\partial q_i} .
\end{aligned} \tag{91}
\]
The non-zero bracket relations satisfied by the generators are
\[
\begin{aligned}
[X_1, X_2] &= X_1 , & [X_1, X_5] &= X_4 , & [X_2, X_3] &= -2X_4 , \\
[X_2, X_5] &= X_5 , & [X_3, X_4] &= -2X_4 , & [X_3, X_5] &= -2X_5 .
\end{aligned}
\]
This Lie algebra $L$ is solvable with $L^{(1)} = [L, L] = \{X_1, X_4, X_5\}$, $L^{(2)} = \{X_4\}$ and $L^{(3)} = \{0\}$. We examine the symmetry $X_5$. A corresponding time-dependent integral produced from Noether's Theorem is
\[
I = \frac{1}{2} \sum_{i=1}^{N} q_i - \frac{1}{2}\, t \sum_{i=1}^{N} p_i = f + \frac{1}{4}\, t h_1 .
\]
Motivated by the results of [58] and [59], it makes sense to consider the time-independent part of $I$ which is precisely the function $f$. It is an interesting question whether this procedure works for other integrable systems as well. We also remark that the integrals are also determined from the knowledge of the Poisson brackets and the function $f$. For example, it follows easily from Theorem 9 that
\[
h_{i+1} = \frac{1}{i+1} \{h_i, f\}_3 ,
\]
where $\{\ ,\ \}_3$ denotes the cubic Toda bracket.

5.4. Noether symmetries

We recall that Noether's Theorem states that for a first order Lagrangian, the action integral $\int_{t_1}^{t_2} L\, dt$ is invariant under the infinitesimal transformation generated by the
differential operator, known as a Noether symmetry,
\[
X = T \frac{\partial}{\partial t} + \sum_{i=1}^{N} Q_i \frac{\partial}{\partial q_i} \tag{92}
\]
if there exists a function $F$, known as a gauge term, such that
\[
\dot F = T \frac{\partial L}{\partial t} + \sum_{i=1}^{N} Q_i \frac{\partial L}{\partial q_i} + \sum_{i=1}^{N} \bigl( \dot Q_i - \dot q_i \dot T \bigr) \frac{\partial L}{\partial \dot q_i} + \dot T L . \tag{93}
\]
When the corresponding Euler–Lagrange equation is taken into account, Eq. (93) can be manipulated to yield the first integral
\[
I = F - \left[ T L + \sum_{i=1}^{N} (Q_i - \dot q_i T) \frac{\partial L}{\partial \dot q_i} \right] . \tag{94}
\]
Thus to every Noether symmetry there is an associated first integral. Consider the Lagrangian to be of the form
\[
L = \frac{1}{2} \sum_{i=1}^{N} \dot q_i^2 - V(q_1, q_2, \dots, q_N) . \tag{95}
\]
We summarize the results of [60] in the following theorem:

Theorem 10. Let $X$ be the $N \times 1$ vector with entries $Q_i$, $x$ the vector with entries $q_i$ and $b$ the vector with entries $b_i(t)$. Let $A$ be an $N \times N$ skew-symmetric matrix with constant entries. Finally we denote by $I_N$ the $N \times N$ identity matrix. If $X$, given by (92), is a Noether symmetry then the infinitesimals must be of the form
\[
T = T(t) , \qquad X = \left( A + \frac{1}{2} \frac{dT}{dt} I_N \right) x + b \tag{96}
\]
and the gauge term is restricted to
\[
F = \frac{1}{4} \frac{d^2 T}{dt^2} \sum_{i=1}^{N} q_i^2 + \sum_{i=1}^{N} \frac{d b_i(t)}{dt}\, q_i + d(t) .
\]
The associated first integral $I$ is equal to
\[
F + T H - \sum_{i=1}^{N} Q_i p_i , \tag{97}
\]
where $H$ is the Hamiltonian.
By examining the form (91) of the generators for the Toda lattice we conclude, using Theorem 10, that only $X_1$, $X_4$ and $X_5$ are Noether symmetries. The corresponding integrals provided by Noether's Theorem are the Hamiltonian $H = h_2$, the total momentum $h_1 = p_1 + \cdots + p_N$ and $f + t h_1$. In order to obtain more integrals we consider generalized Noether symmetries. That is, the infinitesimals in (92) do not just depend on $t, q_1, \dots, q_N$ but also on $\dot q_1, \dots, \dot q_N$. For Lagrangians of the form (95) for one, two and three degrees of freedom, all the possible point Noether symmetries are classified in [60]. The following results are from [61]. In the case of generalized symmetries, we can take without loss of generality $T = 0$. Hence, we consider operators of the form
\[
G = \sum_{i=1}^{N} \eta_i \frac{\partial}{\partial q_i} \tag{98}
\]
where the infinitesimals of (92) and (98) are related by $\eta_i = Q_i - \dot q_i T$. Using (95) and (98), Noether's condition (93) becomes
\[
\sum_{i=1}^{N} \eta_i V_{q_i} + f_t + \sum_{i=1}^{N} \dot q_i f_{q_i} + \sum_{i=1}^{N} \ddot q_i f_{\dot q_i} = \sum_{i=1}^{N} \dot q_i \left( \eta_{i t} + \sum_{j=1}^{N} \dot q_j \eta_{i q_j} + \sum_{j=1}^{N} \ddot q_j \eta_{i \dot q_j} \right) . \tag{99}
\]
We consider Eq. (99) in the case of the Toda lattice with two degrees of freedom. By assuming various forms of the $\eta_i$ (i.e. linear, quadratic or arbitrary) we can solve this equation and produce the following integrals, one of which ($I_3$) is new:
\[
\begin{aligned}
I_1 &= p_1 + p_2 , \\
I_2 &= (p_1 - p_2)^2 + 4 e^{q_1 - q_2} , \\
I_3 &= \frac{p_1 - p_2 + \sqrt{I_2}}{p_1 - p_2 - \sqrt{I_2}}\, \exp\!\left( \sqrt{I_2}\, \frac{q_1 + q_2}{p_1 + p_2} \right) .
\end{aligned}
\]
Note that $H = \frac{1}{4} (I_1^2 + I_2)$ and that the function $G = \frac{q_1 + q_2}{p_1 + p_2}$ which appears in the exponent of $I_3$ satisfies $\{G, H\} = 1$. The existence of the integral $I_3$ shows that the two degrees of freedom Toda lattice is super-integrable with three integrals of motion $\{I_1, I_2, I_3\}$. A Hamiltonian system with $N$ degrees of freedom is called super-integrable if it possesses $2N - 1$ independent integrals of motion. Of course these integrals cannot be all in involution. Based on this computation for 2 degrees of freedom we make the following conjecture.
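The conservation of $I_1$, $I_2$, $I_3$ for two degrees of freedom can be illustrated numerically. The sketch below is an illustration, not the method of [61]: it assumes the Hamiltonian $H = \frac12(p_1^2 + p_2^2) + e^{q_1 - q_2}$ with the canonical equations $\dot q_i = p_i$, $\dot p_1 = -e^{q_1 - q_2}$, $\dot p_2 = e^{q_1 - q_2}$, integrates them with a classical fourth-order Runge–Kutta step, and monitors the relative drift of the three integrals. The initial condition, step size and function names are arbitrary choices.

```python
import math


def rhs(state):
    """Canonical equations for H = (p1^2 + p2^2)/2 + exp(q1 - q2)."""
    q1, q2, p1, p2 = state
    e = math.exp(q1 - q2)
    return (p1, p2, -e, e)


def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = rhs(state)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrals(state):
    """The integrals I1, I2, I3 (requires p1 + p2 != 0)."""
    q1, q2, p1, p2 = state
    i1 = p1 + p2
    i2 = (p1 - p2) ** 2 + 4 * math.exp(q1 - q2)
    root = math.sqrt(i2)
    ratio = (p1 - p2 + root) / (p1 - p2 - root)
    i3 = ratio * math.exp(root * (q1 + q2) / (p1 + p2))
    return i1, i2, i3


def hamiltonian(state):
    q1, q2, p1, p2 = state
    return 0.5 * (p1 ** 2 + p2 ** 2) + math.exp(q1 - q2)


def max_relative_drift(state, dt=1e-3, steps=1000):
    """Largest relative change of I1, I2, I3 along the numerical orbit."""
    start = integrals(state)
    worst = 0.0
    for _ in range(steps):
        state = rk4_step(state, dt)
        now = integrals(state)
        worst = max(worst, max(abs(x - y) / abs(y) for x, y in zip(now, start)))
    return worst


initial = (0.3, -0.2, 0.8, 0.5)
i1, i2, _ = integrals(initial)
print(math.isclose(hamiltonian(initial), 0.25 * (i1 ** 2 + i2)))
print(max_relative_drift(initial))
```

The first printed value checks the identity $H = \frac14(I_1^2 + I_2)$ at the initial point; the second is the observed drift, which should be at the level of the integrator error.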
Conjecture. The Toda lattice is super-integrable.

5.5. Rational Poisson brackets

The rational brackets in $(q, p)$ coordinates are given by complicated expressions that are quite hard to write in explicit form. When projected in the space of $(a, b)$
variables they give rational brackets whose numerator is polynomial and the denominator is the determinant of the Jacobi matrix. We give examples of these brackets and master symmetries for $N = 3$. For example, the tensor $J_0$ is a homogeneous rational bracket of degree 0. It is defined by
\[
J_0 = \mathcal{N} J_1 = J_1 J_2^{-1} J_1 .
\]
In the case of three particles the corresponding bracket $\pi_0$ is given as follows: first define the skew-symmetric matrix $A$ by
\[
\begin{aligned}
a_{12} &= -\tfrac{1}{2} a_1 a_2 (b_3 + b_1 - b_2) \\
a_{13} &= a_1 (a_2^2 - b_2 b_3) \\
a_{14} &= -a_1 (a_2^2 - b_1 b_3) \\
a_{15} &= a_1 a_2^2 \\
a_{23} &= -a_1^2 a_2 \\
a_{24} &= a_2 (a_1^2 - b_1 b_3) \\
a_{25} &= -a_2 (a_1^2 - b_1 b_2) \\
a_{34} &= -2 a_1^2 b_3 \\
a_{35} &= 0 \\
a_{45} &= -2 a_2^2 b_1 .
\end{aligned}
\]
The matrix of the tensor $\pi_0$ is defined by
\[
\pi_0 = \frac{1}{\det L} A \tag{100}
\]
where $\det L = b_1 b_2 b_3 - a_2^2 b_1 - a_1^2 b_3$. This formula defines a Poisson bracket with one single Casimir $H_2 = \frac{1}{2} \operatorname{Tr} L^2$. The bracket is defined on the open dense set $\det L \ne 0$. Taking $H_3 = \frac{1}{3} \operatorname{Tr} L^3$ as the Hamiltonian we have another bi-Hamiltonian formulation of the system:
\[
\pi_1 dH_2 = \pi_0 dH_3 .
\]
In fact we have infinite pairs of such formulations since
\[
\pi_2 dH_1 = \pi_1 dH_2 = \pi_0 dH_3 = \pi_{-1} dH_4 = \cdots
\]
The explicit formulas for the vector fields $X_1$ and $X_2$ are given in Sec. 3.1, therefore we give an example for the vector field $X_{-2}$. In the case $N = 3$ it is given by
\[
X_{-2} = \frac{1}{\det L} \left( \sum_{i=1}^{2} r_i \frac{\partial}{\partial a_i} + \sum_{i=1}^{3} s_i \frac{\partial}{\partial b_i} \right)
\]
where
\[
\begin{aligned}
r_1 &= \tfrac{1}{2} a_1 (b_1 - b_2 - 2b_3) \\
r_2 &= \tfrac{1}{2} a_2 (b_3 - 2b_1 - b_2) \\
s_1 &= b_2 b_3 - a_1^2 - a_2^2 \\
s_2 &= b_1 b_3 + a_1^2 + a_2^2 \\
s_3 &= b_1 b_2 - a_1^2 - a_2^2 .
\end{aligned}
\]
Finally, we consider the Casimirs of these new Poisson brackets.

Theorem 11. The Casimir of $\pi_n$ in the open dense set $\det L \ne 0$ is $\operatorname{Tr} L^{2-n}$ for all $n \ne 2$. The Casimir of $\pi_2$ is $\det L$.

Proof. For $n \ge 1$ the result was proved in Proposition 5. Therefore, we only have to show that the Casimir of $\pi_{-m}$ is $\operatorname{Tr} L^{m+2}$ ($m \ge 0$). This follows from (90) and the fact that $H_1 = \operatorname{Tr} L$ is the Casimir for the Lie–Poisson bracket $\pi_1$:
\[
0 = \pi_1 \nabla H_1 = \pi_0 \nabla H_2 = \pi_{-1} \nabla H_3 = \cdots .
\]

6. Generalized Toda Systems Associated with Simple Lie Groups

In this section we consider mechanical systems which generalize the finite, nonperiodic Toda lattice. These systems correspond to Dynkin diagrams. They are special cases of (1) where the spectrum corresponds to a system of simple roots for a simple Lie algebra. It is well known that irreducible root systems classify simple Lie groups. So, in this generalization for each simple Lie algebra there exists a mechanical system of Toda type.

The generalization is obtained from the following simple observation: in terms of the natural basis $q_i$ of weights, the simple roots of $A_{n-1}$ are
\[
q_1 - q_2 ,\ q_2 - q_3 ,\ \dots ,\ q_{n-1} - q_n . \tag{101}
\]
On the other hand, the potential for the Toda lattice is of the form
\[
e^{q_1 - q_2} + e^{q_2 - q_3} + \cdots + e^{q_{n-1} - q_n} . \tag{102}
\]
We note that the angle between $q_{i-1} - q_i$ and $q_i - q_{i+1}$ is $\frac{2\pi}{3}$ and the lengths of $q_i - q_{i+1}$ are all equal. The Toda lattice corresponds to a Dynkin diagram of type $A_{n-1}$.

More generally, we consider potentials of the form
\[
U = c_1 e^{f_1(q)} + \cdots + c_l e^{f_l(q)} \tag{103}
\]
where $c_1, \dots, c_l$ are constants, $f_i(q)$ is linear and $l$ is the rank of the simple Lie algebra. For each Dynkin diagram we construct a Hamiltonian system of Toda type. These systems are interesting not only because they are integrable, but also for their fundamental importance in the theory of semisimple Lie groups. For example
Kostant in [2] shows that the integration of these systems and the theory of the finite dimensional representations of semisimple Lie groups are equivalent. For reference, we give a complete list of the Hamiltonians for each simple Lie algebra.

• $A_{n-1}$: $H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1 - q_2} + \cdots + e^{q_{n-1} - q_n}$

• $B_n$: $H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1 - q_2} + \cdots + e^{q_{n-1} - q_n} + e^{q_n}$

• $C_n$: $H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1 - q_2} + \cdots + e^{q_{n-1} - q_n} + e^{2 q_n}$

• $D_n$: $H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1 - q_2} + \cdots + e^{q_{n-1} - q_n} + e^{q_{n-1} + q_n}$

• $G_2$: $H = \frac{1}{2} \sum_{j=1}^{3} p_j^2 + e^{q_1 - q_2} + e^{-2q_1 + q_2 + q_3}$

• $F_4$: $H = \frac{1}{2} \sum_{j=1}^{4} p_j^2 + e^{q_1 - q_2} + e^{q_2 - q_3} + e^{q_3} + e^{\frac{1}{2}(q_4 - q_1 - q_2 - q_3)}$

• $E_6$: $H = \frac{1}{2} \sum_{j=1}^{8} p_j^2 + \sum_{j=1}^{4} e^{q_j - q_{j+1}} + e^{-(q_1 + q_2)} + e^{\frac{1}{2}(-q_1 + q_2 + \cdots + q_7 - q_8)}$

• $E_7$: $H = \frac{1}{2} \sum_{j=1}^{8} p_j^2 + \sum_{j=1}^{5} e^{q_j - q_{j+1}} + e^{-(q_1 + q_2)} + e^{\frac{1}{2}(-q_1 + q_2 + \cdots + q_7 - q_8)}$

• $E_8$: $H = \frac{1}{2} \sum_{j=1}^{8} p_j^2 + \sum_{j=1}^{6} e^{q_j - q_{j+1}} + e^{-(q_1 + q_2)} + e^{\frac{1}{2}(-q_1 + q_2 + \cdots + q_7 - q_8)}$

We should note that the Hamiltonians in the list are not unique. For example, the $A_2$ Hamiltonian is
\[
H = \frac{1}{2} p_1^2 + \frac{1}{2} p_2^2 + \frac{1}{2} p_3^2 + e^{q_1 - q_2} + e^{q_2 - q_3} . \tag{104}
\]
An equivalent system is
\[
H(Q_i, P_i) = \frac{1}{2} P_1^2 + \frac{1}{2} P_2^2 + e^{\sqrt{\frac{2}{3}} (\sqrt{3} Q_1 + Q_2)} + e^{-2 \sqrt{\frac{2}{3}} Q_2} . \tag{105}
\]
The second Hamiltonian is obtained from the first by using the transformation
\[
Q_1 = \frac{\sqrt{2}}{4} (q_1 + q_2 - 2q_3) \tag{106}
\]
\[
Q_2 = \frac{\sqrt{6}}{4} (q_2 - q_1) \tag{107}
\]
\[
P_1 = \frac{2}{\sqrt{2}} (p_1 + p_2) \tag{108}
\]
\[
P_2 = \frac{2}{\sqrt{6}} (p_2 - p_1) . \tag{109}
\]
Another example is the following two systems, both corresponding to a Lie algebra of type $D_4$:
\[
\sum_{i=1}^{4} \frac{p_i^2}{2} + e^{q_1} + e^{q_2} + e^{q_3} + e^{\frac{1}{2}(q_4 - q_1 - q_2 - q_3)}
\]
\[
\sum_{i=1}^{4} \frac{p_i^2}{2} + e^{q_1 - q_2} + e^{q_2 - q_3} + e^{q_3 - q_4} + e^{q_3 + q_4} .
\]
Finally, let us recall the definition of exponents for a semisimple group $G$. An excellent reference is the book by Collingwood and McGovern [62]. Let $G$ be a connected, complex, simple Lie group. We form the de Rham cohomology groups $H^i(G, \mathbb{C})$ and the corresponding Poincaré polynomial of $G$:
\[
p_G(t) = \sum_i d_i t^i ,
\]
where $d_i = \dim H^i(G, \mathbb{C})$. A Theorem of Hopf shows that the cohomology algebra is that of a finite product of $l$ spheres of odd dimension, where $l$ is the rank of $G$. This Theorem implies that
\[
p_G(t) = \prod_{i=1}^{l} (1 + t^{2e_i + 1}) .
\]
The positive integers $\{e_1, e_2, \dots, e_l\}$ are called the exponents of $G$. One can also extract the exponents from the root space decomposition of $G$. The connection with the invariant polynomials is the following: let $H_1, H_2, \dots, H_l$ be independent, homogeneous, invariant polynomials of degrees $m_1, m_2, \dots, m_l$. Then $e_i = m_i - 1$. The exponents of a simple Lie group are given in the following list:

• $A_{n-1}$: $1, 2, 3, \dots, n-1$
• $B_n$, $C_n$: $1, 3, 5, \dots, 2n-1$
• $D_n$: $1, 3, 5, \dots, 2n-3, n-1$
• $G_2$: $1, 5$
• $F_4$: $1, 5, 7, 11$
• $E_6$: $1, 4, 5, 7, 8, 11$
• $E_7$: $1, 5, 7, 9, 11, 13, 17$
• $E_8$: $1, 7, 11, 13, 17, 19, 23, 29$

7. $B_N$ Toda Systems

7.1. A rational bracket for a central extension of $B_N$-Toda

In this subsection we show that the $B_N$ Toda system is bi-Hamiltonian by considering a central extension of the corresponding Lie algebra in analogy with $gl(n, \mathbb{C})$ which is a central extension of $sl(n, \mathbb{C})$ in the case of $A_N$ Toda.
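The list of exponents at the end of Sec. 6 can be cross-checked against Hopf's product formula. The sketch below is an illustration under stated assumptions: it expands $p_G(t) = \prod_i (1 + t^{2e_i+1})$ as a coefficient list and compares its top degree, $\sum_i (2e_i + 1)$, with the standard dimensions of the exceptional groups (14, 52, 78, 133, 248), which are quoted here as known values and are not taken from the text. The dictionary and function names are hypothetical.

```python
EXPONENTS = {
    "G2": [1, 5],
    "F4": [1, 5, 7, 11],
    "E6": [1, 4, 5, 7, 8, 11],
    "E7": [1, 5, 7, 9, 11, 13, 17],
    "E8": [1, 7, 11, 13, 17, 19, 23, 29],
}

# Standard dimensions of the exceptional groups (assumed known values).
DIMENSIONS = {"G2": 14, "F4": 52, "E6": 78, "E7": 133, "E8": 248}


def poincare_polynomial(exponents):
    """Coefficients d_0, d_1, ... of prod_i (1 + t^(2 e_i + 1))."""
    coeffs = [1]
    for e in exponents:
        shift = 2 * e + 1
        product = coeffs + [0] * shift
        # Add the contribution of t^shift times the previous polynomial.
        for k, c in enumerate(coeffs):
            product[k + shift] += c
        coeffs = product
    return coeffs


def type_a_exponents(n):
    """Exponents 1, 2, ..., n-1 of A_{n-1}."""
    return list(range(1, n))


for name, exps in EXPONENTS.items():
    poly = poincare_polynomial(exps)
    assert len(poly) - 1 == DIMENSIONS[name]   # top degree equals dim G
    assert sum(poly) == 2 ** len(exps)         # p_G(1) = 2^rank

print(len(poincare_polynomial(type_a_exponents(3))) - 1)
```

For $A_{n-1}$ the same computation gives top degree $n^2 - 1$, the dimension of $sl(n, \mathbb{C})$.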
Another way to describe these generalized Toda systems is to give a Lax pair representation in each case. It can be shown that the equation $\dot L = [B, L]$ is equivalent to the equations of motion generated by the Hamiltonian $H_2 = \frac{1}{2} \operatorname{Tr} L^2$ on the orbit through $L$ of the coadjoint action of $B_-$ (lower triangular group) on the dual of its Lie algebra, $\mathcal{B}_-^*$. The space $\mathcal{B}_-^*$ can be identified with the set of symmetric matrices.

This situation, which corresponds to $sl(n, \mathbb{C}) = A_{n-1}$, can be generalized to other semisimple Lie algebras. We use notation and definitions from Humphreys [63]. Let $\mathcal{G}$ be a semisimple Lie algebra, $\Phi$ a root system for $\mathcal{G}$, $\Delta = \{\alpha_1, \dots, \alpha_l\}$ the simple roots, $\mathfrak{h}$ a Cartan subalgebra and $\mathcal{G}_\alpha$ the root space of $\alpha$. We denote by $x_\alpha$ a generator of $\mathcal{G}_\alpha$. Define
\[
\mathcal{B}_- = \mathfrak{h} \oplus \sum_{\alpha < 0} \mathcal{G}_\alpha .
\]
There is an automorphism $\sigma$ of $\mathcal{G}$, of order 2, satisfying $\sigma(x_\alpha) = x_{-\alpha}$ and $\sigma(x_{-\alpha}) = x_\alpha$. Let $\mathcal{K} = \{x \in \mathcal{G} \mid \sigma(x) = -x\}$. Then we have a direct sum decomposition $\mathcal{G} = \mathcal{B}_- \oplus \mathcal{K}$. The Toda flow is a coadjoint flow on $\mathcal{B}_-^*$ and the coadjoint invariant functions on $\mathcal{G}^*$, when restricted to $\mathcal{B}_-^*$, are still in involution. The Jacobi elements are of the form
\[
L = \sum_{i=1}^{l} b_i h_i + \sum_{i=1}^{l} a_i (x_{\alpha_i} + x_{-\alpha_i}) .
\]
We define
\[
B = \sum_{i=1}^{l} a_i (x_{\alpha_i} - x_{-\alpha_i}) .
\]
The generalized Toda flow takes the Lax pair form:
\[
\dot L = [B, L] .
\]
The $B_N$ Toda systems were shown to be bi-Hamiltonian. The second bracket can be found in [10]. It turned out to be a rational bracket and it was obtained by using Dirac's constrained bracket formula (29). The idea is to use the inclusion of $B_N$ into $A_{2N}$ and to restrict the hierarchy of brackets from $A_{2N}$ to $B_N$ via Dirac's bracket. Straightforward restriction does not work. We briefly describe the procedure in the case of $B_2$. The Jacobi matrices for $A_4$ and $B_2$ are given by
\[
L_{A_4} = \begin{pmatrix}
b_1 & a_1 & 0 & 0 & 0 \\
a_1 & b_2 & a_2 & 0 & 0 \\
0 & a_2 & b_3 & a_3 & 0 \\
0 & 0 & a_3 & b_4 & a_4 \\
0 & 0 & 0 & a_4 & b_5
\end{pmatrix} , \tag{110}
\]
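and
\[
L_{B_2} = \begin{pmatrix}
b_1 & a_1 & 0 & 0 & 0 \\
a_1 & b_2 & a_2 & 0 & 0 \\
0 & a_2 & b_3 & -a_2 & 0 \\
0 & 0 & -a_2 & 2b_3 - b_2 & -a_1 \\
0 & 0 & 0 & -a_1 & 2b_3 - b_1
\end{pmatrix} . \tag{111}
\]
Note that $L_{A_4}$ lies in $gl(5, \mathbb{C})$ instead of $sl(5, \mathbb{C})$. Therefore we have added an additional variable in $L_{B_2}$. We define
\[
\begin{aligned}
p_1 &= a_1 + a_4 \\
p_2 &= a_2 + a_3 \\
p_3 &= b_1 + b_5 - 2b_3 \\
p_4 &= b_2 + b_4 - 2b_3 .
\end{aligned}
\]
It is clear that we obtain $B_2$ from $A_4$ by setting $p_i = 0$ for $i = 1, 2, 3, 4$. We calculate the matrix $P = \{p_i, p_j\}$. The bracket used is the quadratic Toda (52) on $A_4$.
\[
\begin{aligned}
\{p_1, p_2\} &= \tfrac{1}{2} (a_1 a_2 - a_3 a_4) \\
\{p_1, p_3\} &= a_4 b_5 - a_1 b_1 \\
\{p_1, p_4\} &= a_1 b_2 - a_4 b_4 \\
\{p_2, p_3\} &= 2 (a_3 b_3 - a_2 b_3) \\
\{p_2, p_4\} &= a_3 b_4 + 2 a_3 b_3 - 2 a_2 b_3 - a_2 b_2 \\
\{p_3, p_4\} &= -2 a_4^2 - 4 a_3^2 + 4 a_2^2 + 2 a_1^2 .
\end{aligned}
\]
If we evaluate at a point in $B_2$ we get
\[
\begin{aligned}
\{p_1, p_2\} &= 0 \\
\{p_1, p_3\} &= -2 a_1 b_3 \\
\{p_1, p_4\} &= 2 a_1 b_3 \\
\{p_2, p_3\} &= -4 a_2 b_3 \\
\{p_2, p_4\} &= -6 a_2 b_3 \\
\{p_3, p_4\} &= 0 .
\end{aligned}
\]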
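The evaluation of the constraint brackets on $B_2$ can be reproduced with exact arithmetic. The sketch below is an illustration, not the computation of [10]. It assumes that the quadratic Toda bracket (52) has the standard form $\{a_i, a_{i+1}\} = \frac12 a_i a_{i+1}$, $\{a_i, b_i\} = -a_i b_i$, $\{a_i, b_{i+1}\} = a_i b_{i+1}$, $\{b_i, b_{i+1}\} = 2a_i^2$ (this form is an assumption here, since (52) is not restated in this section), and it uses bilinearity to bracket the linear functions $p_1, \dots, p_4$. All function names are hypothetical.

```python
from fractions import Fraction


def quadratic_toda_matrix(a, b):
    """Bracket matrix on coordinates (a_1..a_4, b_1..b_5); assumed form of (52)."""
    na = len(a)
    size = na + len(b)
    P = [[Fraction(0)] * size for _ in range(size)]

    def put(r, c, value):
        P[r][c] = value
        P[c][r] = -value

    for i in range(na):
        if i + 1 < na:
            put(i, i + 1, Fraction(1, 2) * a[i] * a[i + 1])
        put(i, na + i, -a[i] * b[i])
        put(i, na + i + 1, a[i] * b[i + 1])
        put(na + i, na + i + 1, 2 * a[i] ** 2)
    return P


# Coefficients of p_1..p_4 with respect to (a1, a2, a3, a4, b1, b2, b3, b4, b5).
CONSTRAINTS = [
    [1, 0, 0, 1, 0, 0, 0, 0, 0],    # p1 = a1 + a4
    [0, 1, 1, 0, 0, 0, 0, 0, 0],    # p2 = a2 + a3
    [0, 0, 0, 0, 1, 0, -2, 0, 1],   # p3 = b1 + b5 - 2 b3
    [0, 0, 0, 0, 0, 1, -2, 1, 0],   # p4 = b2 + b4 - 2 b3
]


def constraint_brackets(a1, a2, b1, b2, b3):
    """Matrix {p_i, p_j} evaluated at the B2 point with the given parameters."""
    a = [a1, a2, -a2, -a1]
    b = [b1, b2, b3, 2 * b3 - b2, 2 * b3 - b1]
    P = quadratic_toda_matrix(a, b)

    def bracket(u, v):
        return sum(u[r] * P[r][s] * v[s] for r in range(9) for s in range(9))

    return [[bracket(u, v) for v in CONSTRAINTS] for u in CONSTRAINTS]


a1, a2, b1, b2, b3 = (Fraction(x) for x in (2, 3, 1, 5, 7))
M = constraint_brackets(a1, a2, b1, b2, b3)
expected_upper = {
    (0, 1): 0, (0, 2): -2 * a1 * b3, (0, 3): 2 * a1 * b3,
    (1, 2): -4 * a2 * b3, (1, 3): -6 * a2 * b3, (2, 3): 0,
}
print(all(M[i][j] == v for (i, j), v in expected_upper.items()))
```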
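Therefore the matrix $P$ is given by
\[
P = \begin{pmatrix}
0 & 0 & -2a_1 b_3 & 2a_1 b_3 \\
0 & 0 & -4a_2 b_3 & -6a_2 b_3 \\
2a_1 b_3 & 4a_2 b_3 & 0 & 0 \\
-2a_1 b_3 & 6a_2 b_3 & 0 & 0
\end{pmatrix} ,
\]
and $P^{-1}$ is the matrix
\[
P^{-1} = \begin{pmatrix}
0 & 0 & \frac{3}{10 a_1 b_3} & -\frac{1}{5 a_1 b_3} \\
0 & 0 & \frac{1}{10 a_2 b_3} & \frac{1}{10 a_2 b_3} \\
-\frac{3}{10 a_1 b_3} & -\frac{1}{10 a_2 b_3} & 0 & 0 \\
\frac{1}{5 a_1 b_3} & -\frac{1}{10 a_2 b_3} & 0 & 0
\end{pmatrix} .
\]
Using Dirac's formula (29) we obtain a homogeneous quadratic bracket on $B_2$ given by
\[
\begin{aligned}
\{a_1, a_2\} &= \frac{a_1 a_2 (3b_3 - b_2 - 2b_1)}{10 b_3} \\
\{a_1, b_1\} &= \frac{-a_1 (10 b_1 b_3 - 2 b_1 b_2 - 3 b_1^2 - a_1^2)}{10 b_3} \\
\{a_1, b_2\} &= \frac{a_1 (10 b_2 b_3 - 3 b_2^2 - 2 b_1 b_2 - 4 a_2^2 - a_1^2)}{10 b_3} \\
\{a_1, b_3\} &= \frac{a_1 (b_2 - b_1)}{5} \\
\{a_2, b_1\} &= \frac{a_2 (2 b_1 b_3 - 2 b_1 b_2 + a_1^2)}{10 b_3} \\
\{a_2, b_2\} &= \frac{-a_2 (8 b_2 b_3 - 3 b_2^2 - 6 a_2^2 - 4 a_1^2)}{10 b_3} \\
\{a_2, b_3\} &= \frac{a_2 (b_3 - b_2)}{5} \\
\{b_1, b_2\} &= \frac{10 a_1^2 b_3 - 3 a_1^2 b_2 - 2 a_2^2 b_1 - 3 a_1^2 b_1}{5 b_3} \\
\{b_1, b_3\} &= \frac{2 a_1^2}{5} \\
\{b_2, b_3\} &= \frac{2}{5} (a_2^2 - a_1^2) .
\end{aligned}
\]
The bracket satisfies the following properties which are analogous to the quadratic $A_N$ Toda (52).

(i) It is a homogeneous quadratic Poisson bracket.
(ii) It is compatible with the $B_2$ Lie–Poisson bracket.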
(iii) The functions $H_n = \frac{1}{n} \operatorname{Tr} L^n$ are in involution in this bracket.
(iv) We have Lenard type relations $\pi_2 \nabla H_i = \pi_1 \nabla H_{i+1}$ where $\pi_1$, $\pi_2$ are the Poisson matrices of the linear and quadratic $B_2$ Toda brackets respectively.
(v) The function $\det L$ is the Casimir.

7.2. A recursion operator for Bogoyavlensky Toda systems of type $B_n$

In this section, we show that higher polynomial brackets exist also in the case of $B_n$ Toda systems. We will prove that these systems possess a recursion operator and we will construct an infinite sequence of compatible Poisson brackets in which the constants of motion are in involution. The Hamiltonian for $B_n$ is
\[
H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1 - q_2} + \cdots + e^{q_{n-1} - q_n} + e^{q_n} . \tag{112}
\]
We make a Flaschka-type transformation
\[
a_i = \frac{1}{2} e^{\frac{1}{2}(q_i - q_{i+1})} , \qquad a_n = \frac{1}{2} e^{\frac{1}{2} q_n} , \qquad b_i = -\frac{1}{2} p_i . \tag{113}
\]
Then
\[
\begin{aligned}
\dot a_i &= a_i (b_{i+1} - b_i) , & i &= 1, \dots, n \\
\dot b_i &= 2 (a_i^2 - a_{i-1}^2) , & i &= 1, \dots, n ,
\end{aligned} \tag{114}
\]
with the convention that $a_0 = b_{n+1} = 0$. These equations can be written as a Lax pair $\dot L = [B, L]$, where $L$ is the symmetric matrix
\[
\begin{pmatrix}
b_1 & a_1 & & & & & & \\
a_1 & \ddots & \ddots & & & & & \\
 & \ddots & \ddots & a_{n-1} & & & & \\
 & & a_{n-1} & b_n & a_n & & & \\
 & & & a_n & 0 & -a_n & & \\
 & & & & -a_n & -b_n & \ddots & \\
 & & & & & \ddots & \ddots & -a_1 \\
 & & & & & & -a_1 & -b_1
\end{pmatrix} , \tag{115}
\]
and $B$ is the skew-symmetric part of $L$ (in the decomposition, lower Borel plus skew-symmetric). We note that the determinant of $L$ is zero.
The mapping $F : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$, $(q_i, p_i) \to (a_i, b_i)$, defined by (113), transforms the standard symplectic bracket into another symplectic bracket $\pi_1$ given (up to a constant multiple) by
\[
\{a_i, b_i\} = -a_i , \qquad \{a_i, b_{i+1}\} = a_i . \tag{116}
\]
It is easy to show by induction that
\[
\det \pi_1 = a_1^2 a_2^2 \cdots a_n^2 .
\]
The invariant polynomials for $B_n$, which we denote by
\[
H_2, H_4, \dots, H_{2n} ,
\]
are defined by $H_{2i} = \frac{1}{2i} \operatorname{Tr} L^{2i}$. The degrees of the first $n$ (independent) polynomials are $2, 4, \dots, 2n$ and the exponents of the corresponding Lie group are $1, 3, \dots, 2n-1$. We look for a bracket $\pi_3$ which satisfies
\[
\pi_3 \nabla H_2 = \pi_1 \nabla H_4 . \tag{117}
\]
Using trial and error, we end up with the following homogeneous cubic bracket $\pi_3$:
\[
\begin{aligned}
\{a_i, a_{i+1}\} &= a_i a_{i+1} b_{i+1} \\
\{a_i, b_i\} &= -a_i b_i^2 - a_i^3 , \qquad i = 1, 2, \dots, n-1 \\
\{a_n, b_n\} &= -a_n b_n^2 - 2 a_n^3 \\
\{a_i, b_{i+1}\} &= a_i b_{i+1}^2 + a_i^3 \\
\{a_i, b_{i+2}\} &= a_i a_{i+1}^2 \\
\{a_i, b_{i-1}\} &= -a_{i-1}^2 a_i \\
\{b_i, b_{i+1}\} &= 2 a_i^2 (b_i + b_{i+1}) .
\end{aligned} \tag{118}
\]
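The defining relation $\pi_3 \nabla H_2 = \pi_1 \nabla H_4$ can be tested numerically for small $n$. The sketch below is an illustration only: it builds the Lax matrix (115), the bracket $\pi_1$ from (116) (with the normalization written there) and the cubic bracket $\pi_3$ from (118), approximates the gradients of $H_{2} = \frac12 \operatorname{Tr} L^2$ and $H_4 = \frac14 \operatorname{Tr} L^4$ by central differences, and reports the largest component of the difference of the two sides. The function names, sample points and step size are arbitrary choices.

```python
def lax_matrix(a, b):
    """Symmetric (2n+1) x (2n+1) Jacobi matrix of the B_n system."""
    n = len(a)
    diag = list(b) + [0.0] + [-x for x in reversed(b)]
    off = list(a) + [-x for x in reversed(a)]
    size = 2 * n + 1
    L = [[0.0] * size for _ in range(size)]
    for i in range(size):
        L[i][i] = diag[i]
    for i in range(size - 1):
        L[i][i + 1] = L[i + 1][i] = off[i]
    return L


def invariant(x, k):
    """H_k = (1/k) Tr L^k for x = (a_1..a_n, b_1..b_n)."""
    n = len(x) // 2
    L = lax_matrix(x[:n], x[n:])
    M = L
    for _ in range(k - 1):
        M = [[sum(r * c for r, c in zip(row, col)) for col in zip(*L)] for row in M]
    return sum(M[i][i] for i in range(len(M))) / k


def gradient(func, x, h=1e-5):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        dn = list(x)
        up[i] += h
        dn[i] -= h
        grad.append((func(up) - func(dn)) / (2 * h))
    return grad


def bracket_matrix(n, entries):
    """Skew-symmetric matrix from a dict {(row, col): value}."""
    P = [[0.0] * (2 * n) for _ in range(2 * n)]
    for (r, c), v in entries.items():
        P[r][c] = v
        P[c][r] = -v
    return P


def pi1(a, b):
    n = len(a)
    e = {}
    for i in range(n):
        e[(i, n + i)] = -a[i]
        if i + 1 < n:
            e[(i, n + i + 1)] = a[i]
    return bracket_matrix(n, e)


def pi3(a, b):
    n = len(a)
    e = {}
    for i in range(n):
        cube = 2 if i == n - 1 else 1  # {a_n, b_n} carries 2 a_n^3
        e[(i, n + i)] = -a[i] * b[i] ** 2 - cube * a[i] ** 3
        if i + 1 < n:
            e[(i, i + 1)] = a[i] * a[i + 1] * b[i + 1]
            e[(i, n + i + 1)] = a[i] * b[i + 1] ** 2 + a[i] ** 3
            e[(n + i, n + i + 1)] = 2 * a[i] ** 2 * (b[i] + b[i + 1])
        if i + 2 < n:
            e[(i, n + i + 2)] = a[i] * a[i + 1] ** 2
        if i >= 1:
            e[(i, n + i - 1)] = -a[i - 1] ** 2 * a[i]
    return bracket_matrix(n, e)


def lenard_residual(a, b):
    """max | pi3 grad(H2) - pi1 grad(H4) | over all components."""
    n = len(a)
    x = list(a) + list(b)
    g2 = gradient(lambda y: invariant(y, 2), x)
    g4 = gradient(lambda y: invariant(y, 4), x)
    P1, P3 = pi1(a, b), pi3(a, b)
    lhs = [sum(P3[r][c] * g2[c] for c in range(2 * n)) for r in range(2 * n)]
    rhs = [sum(P1[r][c] * g4[c] for c in range(2 * n)) for r in range(2 * n)]
    return max(abs(l - r) for l, r in zip(lhs, rhs))


print(lenard_residual([0.7, 1.3], [0.4, -0.9]))
print(lenard_residual([0.5, 1.1, 0.8], [0.3, -0.6, 1.2]))
```

The residuals should be at the level of the finite-difference error; this checks only relation (117) at sample points and says nothing about the Jacobi identity for $\pi_3$.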
We summarize the properties of this new bracket in the following:

Theorem 12. The bracket $\pi_3$ satisfies

(i) $\pi_3$ is Poisson.
(ii) $\pi_3$ is compatible with $\pi_1$.
(iii) $H_{2i}$ are in involution.

Define $R = \pi_3 \pi_1^{-1}$. Then $R$ is a recursion operator. We obtain a hierarchy
\[
\pi_1, \pi_3, \pi_5, \dots
\]
consisting of compatible Poisson brackets of odd degree in which the constants of motion are in involution.

(iv) $\pi_{j+2} \nabla H_{2i} = \pi_j \nabla H_{2i+2}$, $\forall\, i, j$.
The proof of this result is in [10]. It is interesting to compute the cubic Poisson bracket in $(p, q)$ coordinates. We will see that in the expression for the master symmetry the exponents of the corresponding Lie group appear explicitly. We reproduce the formula for the cubic Poisson bracket in $(p, q)$-coordinates from [11].
\begin{align*}
\{q_i, q_{i-1}\} &= \{q_i, q_{i-2}\} = \cdots = \{q_i, q_1\} = 2p_i, && i = 2, \dots, n,\\
\{p_i, q_{i-2}\} &= \{p_i, q_{i-3}\} = \cdots = \{p_i, q_1\} = 2\bigl(e^{q_{i-1}-q_i} - e^{q_i-q_{i+1}}\bigr), && i = 3, \dots, n-1,\\
\{p_n, q_{n-2}\} &= \{p_n, q_{n-3}\} = \cdots = \{p_n, q_1\} = 2\bigl(e^{q_{n-1}-q_n} - e^{q_n}\bigr),\\
\{q_i, p_i\} &= p_i^2 + 2e^{q_i-q_{i+1}}, && i = 1, \dots, n-1,\\
\{q_n, p_n\} &= p_n^2 + 2e^{q_n},\\
\{q_{i+1}, p_i\} &= e^{q_i-q_{i+1}},\\
\{q_i, p_{i+1}\} &= 2\bigl(e^{q_{i+1}-q_{i+2}} - e^{q_i-q_{i+1}}\bigr), && i = 1, \dots, n-2,\\
\{q_{n-1}, p_n\} &= 2e^{q_n} - e^{q_{n-1}-q_n},\\
\{p_i, p_{i+1}\} &= -e^{q_i-q_{i+1}} (p_i + p_{i+1}) .
\end{align*}
In $(p, q)$-coordinates, $J_1$ is the (symplectic) canonical Poisson tensor, $h_2$ is the Hamiltonian, $J_3$ is the cubic Poisson tensor for $B_n$ and $Z_0$ is the conformal symmetry for both $J_1$, $J_3$ and $h_2$. So, with
$$Z_0 = \sum_{i=1}^{n} p_i \frac{\partial}{\partial p_i} + \sum_{i=1}^{n} 2(n-i+1) \frac{\partial}{\partial q_i},$$
we have
$$L_{Z_0} J_1 = -J_1, \qquad L_{Z_0} J_3 = J_3, \qquad L_{Z_0} h_2 = 2h_2 .$$
Using Oevel's Theorem we obtain a hierarchy of Poisson tensors, master symmetries and invariants. For example, we have
$$[Z_i, \chi_j] = (2j+1)\, \chi_{i+j} \tag{119}$$
and the coefficients of the first $n$ independent Hamiltonian vector fields correspond to the exponents of a Lie group of type $B_n$.

7.3. Bi-Hamiltonian formulation of $B_n$ systems

Following the procedure outlined in the introduction we obtain a bi-Hamiltonian formulation of the system. In other words, we define $\pi_{-1} = N \pi_1 = \pi_1 \pi_3^{-1} \pi_1$ and we use it to obtain the desired formulation. We illustrate with the $B_2$ Toda system. In this case $\det \pi_1 = a_1^2 a_2^2$ and $\det \pi_3 = a_1^2 a_2^2 \Delta^2 = \det \pi_1\, \Delta^2$, where
$$\Delta = a_1^4 + 2a_2^2 a_1^2 + 2a_2^2 b_1^2 + b_1^2 b_2^2 - 2a_1^2 b_1 b_2 .$$
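The explicit formula for $\pi_{-1}$ is
$$\pi_{-1} = \frac{1}{\Delta}\, A,$$
where
$$A = \begin{pmatrix}
0 & -a_1 a_2 b_2 & -a_1 (b_2^2 + a_1^2 + 2a_2^2) & a_1 (b_1^2 + a_1^2 + 2a_2^2) \\
a_1 a_2 b_2 & 0 & a_1^2 a_2 & -a_2 (b_1^2 + 2a_1^2) \\
a_1 (b_2^2 + a_1^2 + 2a_2^2) & -a_1^2 a_2 & 0 & -2a_1^2 (b_1 + b_2) \\
-a_1 (b_1^2 + a_1^2 + 2a_2^2) & a_2 (b_1^2 + 2a_1^2) & 2a_1^2 (b_1 + b_2) & 0
\end{pmatrix} .$$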
This bracket is Poisson by construction. We will prove later that it is compatible with $\pi_1$. We note that $\Delta = \sqrt{\det R}$ and it is also equal to the product of the non-zero eigenvalues of $L$. Using the rational bracket $\pi_{-1}$ we establish the bi-Hamiltonian nature of the system, i.e. $\pi_1 \nabla H_2 = \pi_{-1} \nabla H_4$.

8. $C_n$ Toda Systems

We now consider $C_n$ Toda systems. We will prove that these systems also possess a recursion operator and we will construct an infinite sequence of compatible Poisson brackets as in the $B_n$ case. We also show that the systems are bi-Hamiltonian.

8.1. A recursion operator for Bogoyavlensky–Toda systems of type $C_n$

The Hamiltonian for $C_n$ is
$$H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1-q_2} + \cdots + e^{q_{n-1}-q_n} + e^{2q_n} . \tag{120}$$
We make a Flaschka-type transformation
$$a_i = \frac{1}{2}\, e^{\frac{1}{2}(q_i-q_{i+1})}, \quad i = 1, \dots, n-1, \qquad a_n = \frac{1}{\sqrt{2}}\, e^{q_n}, \qquad b_i = -\frac{1}{2}\, p_i . \tag{121}$$
The equations in $(a, b)$ coordinates are
\begin{align*}
\dot a_i &= a_i\, (b_{i+1} - b_i), && i = 1, \dots, n-1,\\
\dot a_n &= -2 a_n b_n, \tag{122}\\
\dot b_i &= 2\, (a_i^2 - a_{i-1}^2), && i = 1, \dots, n,
\end{align*}
with the convention that $a_0 = 0$.
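Only the equations at the last node differ from those of the $B_n$ system (114). A brief check, added here as a sketch under the assumption of the canonical equations $\dot q_i = p_i$, $\dot p_i = -\partial H/\partial q_i$ for (120), uses $e^{2q_n} = 2a_n^2$ and $e^{q_{n-1}-q_n} = 4a_{n-1}^2$:

```latex
\begin{align*}
\dot a_n &= a_n\, \dot q_n = a_n\, p_n = -2\, a_n b_n,\\
\dot b_n &= -\tfrac12\, \dot p_n = \tfrac12 \frac{\partial H}{\partial q_n}
          = \tfrac12\bigl(2 e^{2q_n} - e^{q_{n-1}-q_n}\bigr)
          = 2 a_n^2 - 2 a_{n-1}^2 .
\end{align*}
```

The factor $1/\sqrt{2}$ in $a_n$ is what keeps $\dot b_n$ in the same form $2(a_n^2 - a_{n-1}^2)$ as the other $\dot b_i$.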
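These equations can be written as a Lax pair $\dot L = [B, L]$, where $L$ is the matrix
$$L = \begin{pmatrix}
b_1 & a_1 & & & & & & \\
a_1 & \ddots & \ddots & & & & & \\
 & \ddots & \ddots & a_{n-1} & & & & \\
 & & a_{n-1} & b_n & a_n & & & \\
 & & & a_n & -b_n & -a_{n-1} & & \\
 & & & & -a_{n-1} & \ddots & \ddots & \\
 & & & & & \ddots & \ddots & -a_1 \\
 & & & & & & -a_1 & -b_1
\end{pmatrix}, \tag{123}$$
and $B$ is the skew-symmetric part of $L$. In the new variables $a_i$, $b_i$, the canonical bracket on $\mathbb{R}^{2n}$ is transformed into a bracket $\pi_1$ which is given by
\begin{align*}
\{a_i, b_i\} &= -a_i, && i = 1, 2, \dots, n-1,\\
\{a_i, b_{i+1}\} &= a_i, && i = 1, 2, \dots, n-1, \tag{124}\\
\{a_n, b_n\} &= -2a_n .
\end{align*}
The invariant polynomials for $C_n$, which we denote by $H_2, H_4, \dots, H_{2n}$, are defined by $H_{2i} = \frac{1}{2i} \operatorname{Tr} L^{2i}$. We look for a bracket $\pi_3$ which satisfies
$$\pi_3 \nabla H_2 = \pi_1 \nabla H_4 . \tag{125}$$
The bracket $\pi_3$ was obtained in [12]:
\begin{align*}
\{a_i, a_{i+1}\} &= a_i a_{i+1} b_{i+1}, && i = 1, 2, \dots, n-2,\\
\{a_{n-1}, a_n\} &= 2a_{n-1} a_n b_n,\\
\{a_i, b_i\} &= -a_i b_i^2 - a_i^3, && i = 1, 2, \dots, n-1,\\
\{a_n, b_n\} &= -2a_n b_n^2 - 2a_n^3,\\
\{a_i, b_{i+2}\} &= a_i a_{i+1}^2,\\
\{a_i, b_{i+1}\} &= a_i b_{i+1}^2 + a_i^3, \tag{126}\\
\{a_{n-1}, b_n\} &= a_{n-1}^3 + a_{n-1} b_n^2 - a_{n-1} a_n^2,\\
\{a_i, b_{i-1}\} &= -a_{i-1}^2 a_i,\\
\{a_n, b_{n-1}\} &= -2a_{n-1}^2 a_n,\\
\{b_i, b_{i+1}\} &= 2a_i^2 (b_i + b_{i+1}) .
\end{align*}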
We summarize the properties of this bracket in the following:

Theorem 13. The bracket $\pi_3$ satisfies
(i) $\pi_3$ is Poisson.
(ii) $\pi_3$ is compatible with $\pi_1$.
(iii) $H_{2i}$ are in involution.
Define $R = \pi_3 \pi_1^{-1}$. Then $R$ is a recursion operator. We obtain a hierarchy $\pi_1, \pi_3, \pi_5, \dots$ consisting of compatible Poisson brackets of odd degree in which the constants of motion are in involution.
(iv) $\pi_{j+2} \nabla H_{2i} = \pi_j \nabla H_{2i+2}$ for all $i, j$.
The proofs are precisely the same as in the case of $B_n$. Even though it is not necessary to work in $(p, q)$-coordinates, we reproduce the formulas from [11] for completeness.
\begin{align*}
\{q_i, q_{i-1}\} &= \{q_i, q_{i-2}\} = \cdots = \{q_i, q_1\} = 2p_i, && i = 2, \dots, n,\\
\{p_i, q_{i-2}\} &= \{p_i, q_{i-3}\} = \cdots = \{p_i, q_1\} = 2\bigl(e^{q_{i-1}-q_i} - e^{q_i-q_{i+1}}\bigr), && i = 3, \dots, n-1,\\
\{p_n, q_{n-2}\} &= \{p_n, q_{n-3}\} = \cdots = \{p_n, q_1\} = 2e^{q_{n-1}-q_n} - 4e^{2q_n}, \tag{127}\\
\{q_i, p_i\} &= p_i^2 + 2e^{q_i-q_{i+1}}, && i = 1, \dots, n-1,\\
\{q_n, p_n\} &= p_n^2 + 2e^{2q_n},
\end{align*}
\begin{align*}
\{q_{i+1}, p_i\} &= e^{q_i-q_{i+1}},\\
\{q_i, p_{i+1}\} &= 2e^{q_{i+1}-q_{i+2}} - e^{q_i-q_{i+1}}, && i = 1, \dots, n-2,\\
\{q_{n-1}, p_n\} &= 4e^{2q_n} - e^{q_{n-1}-q_n}, \tag{128}\\
\{p_i, p_{i+1}\} &= -e^{q_i-q_{i+1}} (p_i + p_{i+1}) .
\end{align*}
For $C_n$, the conformal symmetry is
$$Z_0 = \sum_{i=1}^{n} p_i \frac{\partial}{\partial p_i} + \sum_{i=1}^{n} (2n-2i+1) \frac{\partial}{\partial q_i},$$
and we have the same constants as in the case of $B_n$:
$$L_{Z_0} J_0 = -J_0, \qquad L_{Z_0} J_1 = J_1, \qquad L_{Z_0} H_0 = 2H_0 .$$
The relations of Oevel's Theorem are the same as for the $B_n$ Toda systems:
\begin{align*}
[Z_i, \chi_j] &= (2j+1)\, \chi_{i+j}, \tag{129}\\
[Z_i, Z_j] &= 2(j-i)\, Z_{i+j}, \tag{130}\\
L_{Z_i} J_j &= (2(j-i)-1)\, J_{i+j} . \tag{131}
\end{align*}
Note that (129) gives a method of generating the exponents.
8.2. Bi-Hamiltonian formulation of $C_n$ systems

In order to show that the $C_n$ Toda systems are bi-Hamiltonian we define $\pi_{-1} = \pi_1 \pi_3^{-1} \pi_1$. This is the second bracket required to obtain a bi-Hamiltonian pair. We illustrate with a small dimensional example, namely $C_3$. The explicit formula for $\pi_{-1}$ is the following: let $A$ be the skew-symmetric $6 \times 6$ matrix defined by the following terms:
\begin{align*}
a_{12} &= a_1 a_2 (b_2 a_3^2 + b_3^2 b_2 - a_2^2 b_3 - b_3 b_2^2 + a_2^2 b_2 + b_1^2 b_3)\\
a_{13} &= -2a_1 a_3 (a_2^2 b_2 + b_1^2 b_3 - b_3 b_2^2)\\
a_{14} &= a_1 (b_2^2 a_3^2 + a_2^4 + b_2^2 b_3^2 - 2a_2^2 b_2 b_3 + a_1^2 a_3^2 + a_1^2 b_3^2)\\
a_{15} &= -a_1 (2a_2^4 - 2a_2^2 b_2 b_3 + a_1^2 a_3^2 + a_1^2 b_3^2 + a_3^2 b_1^2 + b_3^2 b_1^2)\\
a_{16} &= a_1 (2a_2^4 + 2a_3^2 b_1^2 + a_2^2 a_1^2 - 2a_2^2 b_2 b_3 - a_2^2 b_1^2 - 2b_2^2 a_3^2)\\
a_{23} &= 2a_2 (b_1^2 + a_1^2) b_3 a_3\\
a_{24} &= -a_1^2 a_2 (a_3^2 + b_3^2 - 2b_2 b_3 + a_2^2 - 2b_1 b_3)\\
a_{25} &= a_2 (2a_1^2 a_3^2 + 2a_1^2 b_3^2 - 2a_1^2 b_2 b_3 + 2a_2^2 a_1^2 - 2a_1^2 b_1 b_3 + a_3^2 b_1^2 + b_3^2 b_1^2 + a_2^2 b_1^2)\\
a_{26} &= -a_2 (2a_1^2 a_3^2 + 2a_3^2 b_1^2 + a_2^2 b_1^2 + 2a_2^2 a_1^2 + a_1^4 - 2a_1^2 b_1 b_2 + b_2^2 b_1^2)\\
a_{34} &= 2a_1^2 a_3 (a_2^2 - 2b_1 b_3 - 2b_2 b_3)\\
a_{35} &= -2a_3 (2a_2^2 a_1^2 - 2a_1^2 b_1 b_3 + a_2^2 b_1^2 - 2a_1^2 b_2 b_3)\\
a_{36} &= 2a_3 (2a_2^2 b_1^2 + 2a_2^2 a_1^2 + a_1^4 - 2a_1^2 b_1 b_2 + b_2^2 b_1^2)\\
a_{45} &= 2a_1^2 (a_3^2 b_1 + b_2 a_3^2 + b_1 b_3^2 + b_3^2 b_2)\\
a_{46} &= 2a_1^2 (a_2^2 b_1 - 2a_3^2 b_1 - 2b_2 a_3^2 - a_2^2 b_3)\\
a_{56} &= 2(2a_1^2 a_3^2 b_1 + 2a_1^2 b_2 a_3^2 + 2a_1^2 a_2^2 b_3 - 2a_1^2 a_2^2 b_1 + a_2^2 b_1^2 b_3 + a_2^2 b_1^2 b_2) .
\end{align*}
The Poisson tensor $\pi_{-1}$ is of the form
$$\pi_{-1} = \frac{1}{\det L}\, A,$$
where
$$\det L = \sqrt{\det R} = 2a_2^2 b_1^2 b_2 b_3 - 2a_1^2 a_2^2 b_1 b_3 - a_3^2 b_1^2 b_2^2 + 2a_3^2 b_1 b_2 a_1^2 - a_1^4 a_3^2 - b_1^2 b_2^2 b_3^2 + 2a_1^2 b_1 b_2 b_3^2 - a_1^4 b_3^2 - a_2^4 b_1^2 .$$
As in the case of $B_2$ we have $\pi_1 \nabla H_2 = \pi_{-1} \nabla H_4$.

9. $D_n$ Toda Systems

In this section, we show that higher polynomial brackets exist also in the case of $D_n$ Bogoyavlensky–Toda systems. Using Flaschka coordinates, we will prove that these
systems possess a recursion operator and we will construct an infinite sequence of compatible Poisson brackets in which the constants of motion are in involution. We also show that the system is bi-Hamiltonian.

9.1. A recursion operator for $D_n$ Bogoyavlensky–Toda systems in Flaschka coordinates

The Hamiltonian for $D_n$ is
$$H = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1-q_2} + \cdots + e^{q_{n-1}-q_n} + e^{q_{n-1}+q_n}, \qquad n \ge 4 . \tag{132}$$
We make a Flaschka-type transformation, $F : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$, defined by
$$F : (q_1, \dots, q_n, p_1, \dots, p_n) \mapsto (a_1, \dots, a_n, b_1, \dots, b_n),$$
with
\begin{align*}
a_i &= \frac{1}{2}\, e^{\frac{1}{2}(q_i-q_{i+1})}, && i = 1, 2, \dots, n-1,\\
a_n &= \frac{1}{2}\, e^{\frac{1}{2}(q_{n-1}+q_n)}, \tag{133}\\
b_i &= -\frac{1}{2}\, p_i, && i = 1, 2, \dots, n .
\end{align*}
Then
\begin{align*}
\dot a_i &= a_i\, (b_{i+1} - b_i), && i = 1, 2, \dots, n-1,\\
\dot a_n &= -a_n\, (b_{n-1} + b_n),\\
\dot b_i &= 2\, (a_i^2 - a_{i-1}^2), && i = 1, 2, \dots, n-2 \text{ and } i = n, \tag{134}\\
\dot b_{n-1} &= 2\, (a_n^2 + a_{n-1}^2 - a_{n-2}^2) .
\end{align*}
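The equations in (134) that involve the branching node can be verified directly. The sketch below is an added check rather than part of the original argument; it assumes the canonical equations $\dot q_i = p_i$, $\dot p_i = -\partial H/\partial q_i$ for the Hamiltonian (132) and uses $e^{q_i-q_{i+1}} = 4a_i^2$ and $e^{q_{n-1}+q_n} = 4a_n^2$.

```latex
\begin{align*}
\dot a_n &= \tfrac12 a_n (\dot q_{n-1} + \dot q_n) = \tfrac12 a_n (p_{n-1} + p_n) = -a_n (b_{n-1} + b_n),\\
\dot b_{n-1} &= \tfrac12 \frac{\partial H}{\partial q_{n-1}}
  = \tfrac12\bigl(e^{q_{n-1}-q_n} + e^{q_{n-1}+q_n} - e^{q_{n-2}-q_{n-1}}\bigr)
  = 2\bigl(a_{n-1}^2 + a_n^2 - a_{n-2}^2\bigr),\\
\dot b_n &= \tfrac12 \frac{\partial H}{\partial q_n}
  = \tfrac12\bigl(e^{q_{n-1}+q_n} - e^{q_{n-1}-q_n}\bigr)
  = 2\bigl(a_n^2 - a_{n-1}^2\bigr).
\end{align*}
```

The extra term $a_n^2$ in $\dot b_{n-1}$ reflects the fact that $q_{n-1}$ appears in both of the last two exponentials of (132).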
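These equations can be written as a Lax pair $\dot L = [B, L]$, where $L$ is the symmetric matrix
$$L = \begin{pmatrix}
b_1 & a_1 & & & & & & \\
a_1 & \ddots & \ddots & & & & & \\
 & \ddots & \ddots & a_{n-1} & -a_n & 0 & & \\
 & & a_{n-1} & b_n & 0 & a_n & & \\
 & & -a_n & 0 & -b_n & -a_{n-1} & & \\
 & & 0 & a_n & -a_{n-1} & \ddots & \ddots & \\
 & & & & & \ddots & \ddots & -a_1 \\
 & & & & & & -a_1 & -b_1
\end{pmatrix}, \tag{135}$$
and $B$ is the skew-symmetric part of $L$ (in the decomposition, lower Borel plus skew-symmetric).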
The mapping $F : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$, $(q_i, p_i) \mapsto (a_i, b_i)$, defined by (133), transforms the standard symplectic bracket $J_0$ into another symplectic bracket $\pi_1$ given (up to a constant multiple) by
\begin{align*}
\{a_i, b_i\} &= -\frac{1}{2}\, a_i, && i = 1, 2, \dots, n,\\
\{a_i, b_{i+1}\} &= \frac{1}{2}\, a_i, && i = 1, 2, \dots, n-1, \tag{136}\\
\{a_n, b_{n-1}\} &= -\frac{1}{2}\, a_n .
\end{align*}
We obtain a hierarchy of invariant polynomials, which we denote by $H_2, H_4, \dots, H_{2n}, \dots$, defined by $H_{2i} = \frac{1}{2i} \operatorname{Tr} L^{2i}$. The degrees of the first $n-1$ (independent) polynomials are $2, 4, \dots, 2n-2$. We also define
$$P_n = \sqrt{\det L} .$$
The degree of $P_n$ is $n$. The set $\{H_2, H_4, \dots, H_{2n-2}, P_n\}$ corresponds to the Chevalley invariants for a Lie group of type $D_n$. The exponents of the Lie group form the set $\{1, 3, 5, \dots, 2n-3, n-1\}$, which is obtained by subtracting 1 from the degrees of the invariant polynomials. A conjecture of Flaschka states that the degrees of the Poisson brackets are in one-to-one correspondence with the exponents of the corresponding Lie group. Taking $H_2 = \frac{1}{2} \operatorname{Tr} L^2$ as the Hamiltonian we have that $\pi_1 \nabla H_2$ gives precisely Eqs. (134). In this section, we find a bracket $\pi_{-1}$ which satisfies $\pi_1 \nabla H_2 = \pi_{-1} \nabla H_4$. First, we define a bracket $\pi_3$ which satisfies
$$\pi_3 \nabla H_2 = \pi_1 \nabla H_4, \tag{137}$$
and whose non-zero terms are
\begin{align*}
\{a_i, a_{i+1}\} &= a_i a_{i+1} b_{i+1}, && i = 1, 2, \dots, n-2,\\
\{a_{n-2}, a_n\} &= a_{n-2} a_n b_{n-1},\\
\{a_{n-1}, a_n\} &= 2a_{n-1} a_n b_n,\\
\{a_i, b_i\} &= -a_i (b_i^2 + a_i^2), && i = 1, 2, \dots, n-2,\\
\{a_{n-1}, b_{n-1}\} &= -a_{n-1} (a_{n-1}^2 + 3a_n^2 + b_{n-1}^2),\\
\{a_n, b_n\} &= -a_n (a_n^2 + b_n^2 - a_{n-1}^2),\\
\{a_i, b_{i+1}\} &= a_i (a_i^2 + b_{i+1}^2), && i = 1, 2, \dots, n-2,\\
\{a_{n-1}, b_n\} &= a_{n-1} (a_{n-1}^2 + b_n^2 - a_n^2), \tag{138}\\
\{a_{n-2}, b_n\} &= a_{n-2} (a_{n-1}^2 - a_n^2),\\
\{a_n, b_{n-2}\} &= -a_{n-2}^2 a_n,\\
\{a_i, b_{i+2}\} &= a_i a_{i+1}^2, && i = 1, 2, \dots, n-3,\\
\{a_i, b_{i-1}\} &= -a_{i-1}^2 a_i, && i = 2, 3, \dots, n-1,\\
\{a_n, b_{n-1}\} &= -a_n (3a_{n-1}^2 + a_n^2 + b_{n-1}^2),\\
\{b_i, b_{i+1}\} &= 2a_i^2 (b_i + b_{i+1}), && i = 1, 2, \dots, n-2,\\
\{b_{n-1}, b_n\} &= 2a_{n-1}^2 (b_{n-1} + b_n) + 2a_n^2 (b_n - b_{n-1}) .
\end{align*}
This bracket appeared recently in [13]. We summarize the properties of this new bracket in the following:

Theorem 14. The bracket $\pi_3$ satisfies
(i) $\pi_3$ is Poisson.
(ii) $\pi_3$ is compatible with $\pi_1$.
Define $R = \pi_3 \pi_1^{-1}$. Then $R$ is a recursion operator. We obtain a hierarchy $\pi_1, \pi_3, \pi_5, \dots$ consisting of compatible Poisson brackets of odd degree in which the constants of motion are in involution.
(iii) All the $H_{2i}$ and $P_n$ are in involution with respect to all the brackets $\pi_1, \pi_3, \pi_5, \dots$.
(iv) $\pi_{j+2} \nabla H_{2i} = \pi_j \nabla H_{2i+2}$ for all $i, j$.

The proof of (i) is a straightforward verification of the Jacobi identity. We will see later, in the next subsection, that $\pi_3$ is the Lie derivative of $\pi_1$ in the direction of a master symmetry and this fact makes $\pi_1$, $\pi_3$ compatible. (iv) follows from properties of the recursion operator. (iii) is a consequence of (iv) (see for example Proposition 3 for a method of proof). The only part which is not obvious is the
involution of $P_n$ with $H_n$ which will be proved at the end of the next subsection using master symmetries.

9.2. Master symmetries

We would like to make some observations concerning master symmetries. Due to the presence of a recursion operator, we will use the approach of Oevel. We define $Z_0$ to be the Euler vector field
$$Z_0 = \sum_{i=1}^{n} \left( a_i \frac{\partial}{\partial a_i} + b_i \frac{\partial}{\partial b_i} \right) .$$
We define the master symmetries $Z_i$ by $Z_i = R^i Z_0$. For obvious reasons we use the notations
$$X_{2i} = Z_i, \qquad h_i = H_{2i+2}, \qquad \Pi_i = \pi_{2i+1}, \qquad \psi_i = \chi_{2i+2}, \qquad i = 0, 1, 2, \dots$$
where $\chi_{2i}$ denotes the Hamiltonian vector field generated by $H_{2i}$, with respect to $\pi_1$. This notation is convenient since $X_2$ is a master symmetry which raises the degrees of invariants and Poisson tensors by 2 each time. One calculates easily that
$$L_{Z_0} \Pi_0 = -\Pi_0, \qquad L_{Z_0} \Pi_1 = \Pi_1, \qquad L_{Z_0} h_0 = 2h_0 .$$
Therefore $Z_0$ is a conformal symmetry for $\Pi_0$, $\Pi_1$ and $h_0$. The constants appearing in Oevel's Theorem are $\lambda = -1$, $\mu = 1$ and $\nu = 2$. Therefore we obtain
\begin{align*}
[Z_i, \psi_j] = (1+2j)\, \psi_{i+j} \ &\Leftrightarrow\ [X_{2i}, \chi_{2j+2}] = (1+2j)\, \chi_{2(i+j+1)} \tag{139}\\
[Z_i, Z_j] = 2(j-i)\, Z_{i+j} \ &\Leftrightarrow\ [X_{2i}, X_{2j}] = 2(j-i)\, X_{2(i+j)} \tag{140}\\
L_{Z_i}(\Pi_j) = (2j-2i-1)\, \Pi_{i+j} \ &\Leftrightarrow\ L_{X_{2i}}(\pi_{2j+1}) = (2j-2i-1)\, \pi_{2(i+j)+1} \tag{141}\\
Z_i(h_j) = (2+2i+2j)\, h_{i+j} \ &\Leftrightarrow\ X_{2i}(H_{2j}) = 2(i+j)\, H_{2(i+j)} . \tag{142}
\end{align*}
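To make the index bookkeeping concrete, the lowest cases of (139), (140) and (141) are listed below. They are obtained purely by substituting small values of $i$ and $j$ and translating with $X_{2i} = Z_i$, $\Pi_i = \pi_{2i+1}$, $\psi_i = \chi_{2i+2}$; this is an illustration added here, not an additional result.

```latex
\begin{align*}
\text{(139), } i = 0:\quad & [Z_0, \psi_j] = (2j+1)\,\psi_j,
   \ \text{ i.e. }\ [X_0,\chi_2] = \chi_2,\ \ [X_0,\chi_4] = 3\chi_4,\ \ [X_0,\chi_6] = 5\chi_6,\ \dots\\
\text{(140), } i = 0,\ j = 1:\quad & [Z_0, Z_1] = 2 Z_1, \ \text{ i.e. }\ [X_0, X_2] = 2 X_2,\\
\text{(141), } i = 1,\ j = 0:\quad & L_{Z_1}\Pi_0 = (0 - 2 - 1)\,\Pi_1 = -3\,\Pi_1, \ \text{ i.e. }\ L_{X_2}\pi_1 = -3\,\pi_3,\\
\text{(141), } i = 1,\ j = 1:\quad & L_{Z_1}\Pi_1 = (2 - 2 - 1)\,\Pi_2 = -\Pi_2, \ \text{ i.e. }\ L_{X_2}\pi_3 = -\pi_5 .
\end{align*}
```

The coefficients $1, 3, 5, \dots$ in the first line are the odd exponents of the Lie group.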
Remark 1. The relation (141) implies that $L_{X_2}(\pi_1) = -3\pi_3$ and therefore $\pi_3$ is the Lie derivative of $\pi_1$ in the direction of a master symmetry. This makes $\pi_1$ compatible with $\pi_3$ (see Lemma 1).

Remark 2. The relation (139) gives a procedure for generating almost all the exponents. As we mentioned in the introduction, the last exponent is generated by the application of the conformal symmetry on the Hamiltonian vector field corresponding to the Pfaffian of the Jacobi matrix.

It is interesting to note that one can obtain the master symmetry $X_2$ by using the matrix equation
$$\dot L = [B, L] + L^3, \tag{143}$$
where $L$ is the Lax matrix (135) and $B$ is the skew-symmetric $2n \times 2n$ matrix defined as follows. Its non-zero entries above the diagonal are
\begin{align*}
B_{i,i+1} &= x_i, && i = 1, \dots, n-1, & B_{i,i+2} &= y_i, && i = 1, \dots, n-2,\\
B_{n-1,n+1} &= x_n, && & B_{n-2,n+1} &= y_{n-1},\\
B_{n,n+2} &= -x_n, && & B_{n,n+3} &= -y_{n-1},\\
B_{2n-i,\,2n+1-i} &= -x_i, && i = 1, \dots, n-1, & B_{2n-1-i,\,2n+1-i} &= -y_i, && i = 1, \dots, n-2,
\end{align*}
the entries below the diagonal are determined by skew-symmetry, and
\begin{align*}
x_i &= a_i \left( \sum_{j=1}^{i-1} b_j + (i+1-n)(b_i + b_{i+1}) \right), && i = 1, 2, \dots, n-1,\\
x_n &= -a_n \sum_{j=1}^{n-2} b_j,\\
y_i &= (i+1-n)\, a_i a_{i+1}, && i = 1, 2, \dots, n-2,\\
y_{n-1} &= a_{n-2} a_n .
\end{align*}
It is interesting to note that each $y_i$ is a constant times a product $a_j a_k$, where the product is determined from the Dynkin diagram of a Lie algebra of type $D_n$. The matrix $B$ was chosen in such a way that both sides of (143) have the same form. The components of the vector field $X_2$ are defined by the right-hand side of (143). Finally we note the action of the first master symmetry on $P_n = \sqrt{\det L}$:
$$X_2(P_n) = P_n H_2 .$$
Remark. This last result should be expected since the eigenvalues of $L$ satisfy $\dot\lambda = \lambda^3$ under (143). Therefore,
\begin{align*}
X_2(P_n) &= X_2\bigl(\sqrt{\det L}\bigr)\\
&= X_2 \sqrt{\lambda_1 \cdots \lambda_n}\\
&= \frac{1}{2} (\lambda_1 \cdots \lambda_n)^{-\frac{1}{2}} \bigl( \dot\lambda_1 \lambda_2 \cdots \lambda_n + \lambda_1 \dot\lambda_2 \cdots \lambda_n + \cdots + \lambda_1 \cdots \dot\lambda_n \bigr)\\
&= \frac{1}{2\sqrt{\det L}} \bigl( \lambda_1^3 \lambda_2 \cdots \lambda_n + \lambda_1 \lambda_2^3 \cdots \lambda_n + \cdots + \lambda_1 \cdots \lambda_n^3 \bigr)\\
&= \frac{\det L}{2\sqrt{\det L}} \bigl( \lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2 \bigr)\\
&= \sqrt{\det L}\; \frac{\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2}{2}\\
&= P_n H_2 .
\end{align*}
We conclude this subsection by proving the involution of $H_i$ with $P_n$. It is clearly enough to show the involution of the eigenvalues of $L$ since $P_n$ and $H_i$ are both functions of the eigenvalues. It is well-known that the eigenvalues are in involution with respect to the symplectic bracket $\pi_1$. We will give here a proof based on the Lenard relations (137). Let $\lambda$ and $\mu$ be two distinct eigenvalues and let $U$, $V$ be the gradients of $\lambda$ and $\mu$ respectively. We use the notation $\{\ ,\ \}$ to denote the bracket $\pi_1$ and $\langle\ ,\ \rangle$ the standard inner product. The Lenard relations (137) translate into $\pi_3 U = \lambda^2 \pi_1 U$ and $\pi_3 V = \mu^2 \pi_1 V$. Therefore,
\begin{align*}
\{\lambda, \mu\} = \langle \pi_1 U, V \rangle &= \frac{1}{\lambda^2} \langle \pi_3 U, V \rangle\\
&= -\frac{1}{\lambda^2} \langle U, \pi_3 V \rangle\\
&= -\frac{1}{\lambda^2} \langle U, \mu^2 \pi_1 V \rangle\\
&= -\frac{\mu^2}{\lambda^2} \langle U, \pi_1 V \rangle\\
&= \frac{\mu^2}{\lambda^2} \langle \pi_1 U, V \rangle\\
&= \frac{\mu^2}{\lambda^2} \{\lambda, \mu\} .
\end{align*}
Therefore, $\{\lambda, \mu\} = 0$. To show the involution with respect to all brackets $\pi_{2j+1}$ and in view of (141) it is enough to show the following: let $f_1$, $f_2$ be two functions in involution with respect to the Poisson bracket $\pi$, let $X$ be a vector field such that $X(f_i) = f_i^3$ for $i = 1, 2$. Define a Poisson bracket $w$ by $w = L_X \pi$. Then the functions $f_1$, $f_2$ remain in involution with respect to the bracket $w$. The proof
follows trivially if we write $w = L_X \pi$ in Poisson form
$$\{f_1, f_2\}_w = X\{f_1, f_2\}_\pi - \{f_1, X(f_2)\}_\pi - \{X(f_1), f_2\}_\pi .$$
Remark. We have to point out that unlike the case of $B_n$ and $C_n$ the cubic bracket (138) was discovered not by manipulating the left-hand side of (137) but through the use of the master symmetry $X_2$. In other words, we construct the master symmetry $X_2$ using (143) and then compute $\pi_3 = -\frac{1}{3} L_{X_2} \pi_1$.

9.3. A recursion operator for $D_n$ Toda systems in natural $(q, p)$ coordinates

We now define a bi-Hamiltonian formulation for $D_n$ Bogoyavlensky–Toda systems in natural $(q_i, p_i)$ coordinates. This bracket is simply the pull-back of $\pi_3$ under the Flaschka transformation (133). After some tedious calculation, we obtain the following bracket in $(q_i, p_i)$ coordinates:
\begin{align*}
\{q_i, q_j\} &= -2p_j, && i < j,\\
\{q_i, p_i\} &= p_i^2 + 2e^{q_i-q_{i+1}}, && i = 1, 2, \dots, n-2,\\
\{q_{n-1}, p_{n-1}\} &= p_{n-1}^2 + 2e^{q_{n-1}-q_n} + 2e^{q_{n-1}+q_n},\\
\{q_n, p_n\} &= p_n^2,\\
\{q_i, p_{i-1}\} &= e^{q_{i-1}-q_i}, && i = 2, 3, \dots, n-1,\\
\{q_n, p_{n-1}\} &= e^{q_{n-1}-q_n} - e^{q_{n-1}+q_n},\\
\{q_i, p_{i+1}\} &= -e^{q_i-q_{i+1}} + 2e^{q_{i+1}-q_{i+2}}, && i = 1, 2, \dots, n-3,\\
\{q_{n-2}, p_{n-1}\} &= -e^{q_{n-2}-q_{n-1}} + 2e^{q_{n-1}-q_n} + 2e^{q_{n-1}+q_n}, \tag{144}\\
\{q_{n-1}, p_n\} &= -e^{q_{n-1}-q_n} + e^{q_{n-1}+q_n},\\
\{q_i, p_j\} &= -2e^{q_{j-1}-q_j} + 2e^{q_j-q_{j+1}}, && 1 \le i < j-1 \le n-3,\\
\{q_i, p_{n-1}\} &= -2e^{q_{n-2}-q_{n-1}} + 2e^{q_{n-1}-q_n} + 2e^{q_{n-1}+q_n}, && i = 1, 2, \dots, n-3,\\
\{q_i, p_n\} &= -2e^{q_{n-1}-q_n} + 2e^{q_{n-1}+q_n}, && i = 1, 2, \dots, n-2,\\
\{p_i, p_{i+1}\} &= -e^{q_i-q_{i+1}} (p_i + p_{i+1}), && i = 1, 2, \dots, n-2,\\
\{p_{n-1}, p_n\} &= -(p_{n-1} + p_n)\, e^{q_{n-1}-q_n} + (p_{n-1} - p_n)\, e^{q_{n-1}+q_n};
\end{align*}
and all other brackets are zero. Denote this Poisson tensor by $J_1$ and let $J_0$ be the standard symplectic bracket. A simple computation leads to the following:

Theorem 15. The bracket $J_1$ satisfies
(i) $J_1$ is Poisson.
(ii) $J_1$ is compatible with $J_0$.
(iii) The mapping $F$ given by (133) is a Poisson mapping between $J_1$ and the cubic bracket $\pi_3$.

Thus, in $(q, p)$ coordinates we also have a non-degenerate pair $(J_0, J_1)$ for $D_n$ Bogoyavlensky–Toda and therefore we may define a recursion operator $R = J_1 J_0^{-1}$. We obtain a hierarchy of mutually compatible Poisson tensors defined by $J_i = R^i J_0$. The vector field
$$Z_0 = \sum_{i=1}^{n} p_i \frac{\partial}{\partial p_i} + \sum_{i=1}^{n} 2(n-i) \frac{\partial}{\partial q_i} \tag{145}$$
is a conformal symmetry for the Poisson tensors $J_0$ and $J_1$ and for the Hamiltonian
$$h_0 = \frac{1}{2} \sum_{j=1}^{n} p_j^2 + e^{q_1-q_2} + \cdots + e^{q_{n-1}-q_n} + e^{q_{n-1}+q_n} . \tag{146}$$
We compute
$$L_{Z_0} J_0 = -J_0, \qquad L_{Z_0} J_1 = J_1, \qquad L_{Z_0} h_0 = 2h_0 . \tag{147}$$
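The third identity in (147) follows from a term-by-term computation with (145) and (146). The lines below are a sketch added for illustration; note that the coefficient of $\partial/\partial q_n$ in $Z_0$ is $2(n-n) = 0$.

```latex
\begin{align*}
Z_0\bigl(\tfrac12 p_i^2\bigr) &= p_i^2 = 2 \cdot \tfrac12 p_i^2,\\
Z_0\bigl(e^{q_i - q_{i+1}}\bigr) &= \bigl[\,2(n-i) - 2(n-i-1)\,\bigr]\, e^{q_i - q_{i+1}} = 2\, e^{q_i - q_{i+1}}, \qquad 1 \le i \le n-1,\\
Z_0\bigl(e^{q_{n-1} + q_n}\bigr) &= \bigl[\,2 \cdot 1 + 2 \cdot 0\,\bigr]\, e^{q_{n-1}+q_n} = 2\, e^{q_{n-1}+q_n} .
\end{align*}
```

Every summand of $h_0$ is therefore scaled by 2, which gives $L_{Z_0} h_0 = 2h_0$.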
So Oevel's Theorem applies. With $Z_i = R^i Z_0$, $\chi_0 = \chi_{h_0}$ and $\chi_i = R^i \chi_0$ one calculates easily that
(a) $[Z_i, \chi_j] = (1+2j)\, \chi_{i+j}$
(b) $[Z_i, Z_j] = 2(j-i)\, Z_{i+j}$
(c) $L_{Z_i}(J_j) = (2j-2i-1)\, J_{i+j}$.
Note that (a) gives the exponents (except one) for a Lie group of type $D_n$. The action of the first master symmetry on $P_n$ is the same as in Flaschka coordinates:
$$Z_1(P_n) = h_0 P_n . \tag{148}$$
Finally, we calculate that
$$[Z_0, \chi_{P_n}] = (n-1)\, \chi_{P_n}, \tag{149}$$
producing the last exponent.

9.4. Bi-Hamiltonian formulation of Bogoyavlensky–Toda systems of type $D_n$

In order to show that the $D_n$ Toda systems are bi-Hamiltonian we use the same procedure as in the previous two cases. The tensors $\pi_1$ and $\pi_3$ are both invertible and we define $\pi_{-1} = \pi_1 \pi_3^{-1} \pi_1$. This is the second bracket required to obtain a bi-Hamiltonian pair. We illustrate with a small dimensional example, namely $D_4$. The explicit formula for $\pi_{-1}$ is the following:
Let A be a skew-symmetric 8 × 8 matrix given by the following terms: a12 = −a1 a2 (a22 a23 b4 + b2 b23 b24 + 2b4 b3 b2 a24 − 2b4 b3 b2 a23 + b2 a43 + a44 b2 − 2b2 a24 a23 + a22 b2 b24 − b22 b3 b24 + b4 b22 a23 − b4 b22 a24 + b21 b3 b24 + b21 a24 b4 − b21 a23 b4 − a22 b3 b24 − a22 a24 b4 ) a13 = −a1 a3 (b22 b3 b24 − b4 b22 a23 − a22 b2 b24 + b4 b22 a24 − b21 b3 b24 − b21 a24 b4 + b21 a23 b4 − a42 b4 + 2a22 b2 b3 b4 − a21 a22 b4 + a22 b2 a24 − a22 a23 b2 − b22 b23 b4 + a23 b22 b3 − b3 b22 a24 + b21 a22 b4 +b21 b23 b4 − b21 a23 b3 + b21 b3 a24 ) a14 = a1 a4 (a22 b2 b24 − b22 b3 b24 + b4 b22 a23 − b4 b22 a24 + b21 b3 b24 + b21 a24 b4 − b21 a23 b4 − a42 b4 + 2a22 b2 b3 b4 − a21 a22 b4 + a22 b2 a24 − a22 a23 b2 − b22 b23 b4 + a23 b22 b3 − b3 b22 a24 + b21 a22 b4 + b21 b23 b4 − b21 a23 b3 + b21 b3 a24 ) a15 = −a1 (2a24 b3 b22 b4 + 2b4 a22 a23 b2 + b22 a44 + a42 b24 − 2b3 b2 a22 b24 + b22 a43 − 2b3 b4 b22 a23 −2b22 a24 a23 − 2b4 a22 a24 b2 + b23 b24 b22 + a21 b23 b24 − 2a21 b3 b4 a23 + 2a21 b3 b4 a24 + a21 a44 − 2a21 a24 a23 + a21 a43 ) a16 = a1 (2a21 b3 b4 a24 + 2b21 b3 b4 a24 − 2a21 b3 b4 a23 − 2b4 a22 a24 b2 + 2b4 a22 a23 b2 + 2a42 b24 −2b3 b2 a22 b24 + a21 a44 + a21 a43 + b21 a44 + b21 a43 − 2a21 a24 a23 − 2b21 a23 a24 + a21 b23 b24 − 2b21 a23 b3 b4 + b21 b23 b24 ) a17 = a1 (2b22 a44 − 2b21 b3 b4 a24 + 2b22 a43 + 2a24 b3 b22 b4 − 2a42 b24 − 4b22 a24 a23 + 2b3 b2 a22 b24 − 2b3 b4 b22 a23 − 2b21 a44 − 2b21 a43 + 4b21 a23 a24 + 2b21 a23 b3 b4 + b21 a22 b24 − a21 a22 b24 ) a18 = a1 (a21 a22 a24 − a21 a22 a23 − 2b21 b3 b4 a24 + 2b22 a44 − 2b4 a22 a24 b2 − 2b22 a43 + 2a24 b3 b22 b4 − 2b4 a22 a23 b2 + 2b3 b4 b22 a23 − 2b21 a44 + 2b21 a43 − 2b21 a23 b3 b4 + b21 a22 a23 − b21 a22 a24 ) a23 = a2 a3 (b21 a23 b4 − b21 b3 b24 − b21 a24 b4 − b3 a21 b24 − a21 a24 b4 + a21 a23 b4 + b21 b23 b4 − b21 a23 b3 + b21 b3 a24 + b23 b4 a21 + a24 b3 a21 − b3 a21 a23 − b4 b21 b22 + 2b4 b1 a21 b2 − a21 a22 b4 − b4 a41 ) a24 = −a2 a4 (b21 b3 b24 + b21 a24 b4 − b21 a23 b4 + b3 a21 b24 + a21 a24 b4 
− a21 a23 b4 + b21 b23 b4 − b21 a23 b3 + b21 b3 a24 + b23 b4 a21 + a24 b3 a21 − b3 a21 a23 − b4 b21 b22 + 2b4 b1 a21 b2 − a21 a22 b4 − b4 a41 ) a25 = −a21 a2 (2b4 b3 a23 − b23 b24 − 2b4 a42 b3 − a44 + 2a24 a23 − a43 + 2b1 b3 b24 − 2b4 b1 a23 +2b4 a24 b1 − a22 b24 + 2b3 b2 b24 + 2b4 b2 a24 − 2b4 a23 b2 ) a26 = −a2 (4a21 b3 b4 a24 + 2b21 b3 b4 a24 − 2a21 b4 b2 a24 − 2a21 b3 b2 b24 + 2a21 b4 a23 b2 − 2b1 a21 a24 b4
+ 2b1 a21 a23 b4 − 2b1 b3 a21 b24 − 4a21 b3 b4 a23 + 2a21 a44 + 2a21 a43 + b21 a44 + b21 a43 − 4a21 a24 a23 − 2b21 a23 a24 + 2a21 b23 b24 − 2b21 a23 b3 b4 + b21 a22 b24 + 2a21 a22 b24 + b21 b23 b24 ) a27 = a2 (a41 b24 + 2a21 b3 b4 a24 + 2b21 b3 b4 a24 − 2a21 b3 b4 a23 + 2a21 a44 + 2a21 a43 + 2b21 a44 +2b21 a43 − 4a21 a24 a23 − 4b21 a23 a24 − 2b21 a23 b3 b4 + b21 a22 b24 + 2a21 a22 b24 − 2b1 a21 b2 b24 + b21 b22 b24 ) a28 = −a2 (a24 b21 b22 − 2a21 b3 b4 a24 − 2b21 b3 b4 a24 − 2a21 b3 b4 a23 − 2a21 a44 + 2a21 a43 − 2b21 a44 + 2b21 a43 − 2b21 a23 b3 b4 − 2b1 a24 a21 b2 − b21 a23 b22 + b21 a22 a23 − b21 a22 a24 − a41 a23 + a41 a24 + 2b1 a21 a23 b2 ) a34 = −2a3 a4 b4 (b21 a22 + b21 b22 − 2b1 a21 b2 + a21 a22 + a41 ) a35 = −a3 a21 (2b4 b1 a23 − 2b1 b3 b24 − 2b4 a24 b1 + a22 b24 − 2b3 b2 b24 − 2b4 b2 a24 + 2b4 a23 b2 + 2b1 b4 a22 + 2b1 b23 b4 − 2b1 a23 b3 + 2a24 b1 b3 − 2a22 b3 b4 + a22 a23 − a22 a24 + 2b2 b23 b4 + 2a24 b2 b3 − 2a23 b2 b3 ) a36 = −a3 (2b21 b4 a22 b2 − 4b1 a21 a22 b4 + 2a21 a22 a24 − 2a21 a22 a23 + 2a21 b4 b2 a24 + 2a21 b3 b2 b24 − 2a21 a24 b2 b3 + 2a21 a23 b2 b3 + 2a22 b21 b3 b4 + 2a21 b1 a23 b3 + 4a21 a22 b3 b4 − 2a21 b2 b23 b4 − 2a21 a24 b1 b3 − 2a21 b1 b23 b4 − 2a21 b4 a23 b2 + 2b1 a21 a24 b4 − 2b1 a21 a23 b4 + 2b1 b3 a21 b24 − b21 a22 b24 − 2a21 a22 b24 − b21 a22 a23 + b21 a22 a24 ) a37 = a3 (2b21 b4 a22 b2 − 2b1 a21 a22 b4 + 2a21 a22 a24 − a41 b24 − 2a21 a22 a23 + 2a22 b21 b3 b4 + 2a21 a22 b3 b4 + a24 b21 b22 − 2b21 a22 b24 − 2a21 a22 b24 + 2b1 a21 b2 b24 − 2b1 a24 a21 b2 − b21 a23 b22 − 2b21 a22 a23 + 2b21 a22 a24 − b21 b22 b24 − a41 a23 + a41 a24 + 2b1 a21 a23 b2 ) a38 = a3 (2b1 a21 a22 b3 + 2a21 a22 a24 + 2a21 a22 a23 + a41 b23 + b21 b22 b23 + 3a24 b21 b22 − 6b1 a24 a21 b2 + b21 a23 b22 + b21 a42 + 2b21 a22 a23 + 2b21 a22 a24 − 2b21 a22 b3 b2 − 2b1 a21 b2 b23 + a41 a23 + 3a41 a24 − 2b1 a21 a23 b2 ) a45 = a4 a21 (2b1 b3 b24 − 2b4 b1 a23 + 2b4 a24 b1 − a22 b24 + 2b3 b2 b24 + 2b4 b2 a24 − 2b4 a23 b2 + 2b1 b4 a22 + 2b1 b23 b4 − 2b1 a23 b3 + 2a24 b1 b3 
− 2a22 b3 b4 + a22 a23 − a22 a24 + 2b2 b23 b4 + 2a24 b2 b3 − 2a23 b2 b3 ) a46 = a4 (2b21 b4 a22 b2 − 4b1 a21 a22 b4 + 2a21 aa2 42 − 2a21 a22 a23 − 2a21 b4 b2 a24 − 2a21 b3 b2 b24
− 2a21 a24 b2 b3 + 2a21 a23 b2 b3 + 2a22 b21 b3 b4 + 2a21 b1 a23 b3 + 4a21 a22 b3 b4 − 2a21 b2 b23 b4 − 2a21 a24 b1 b3 − 2a21 b1 b23 b4 + 2a21 b4 a23 b2 − 2b1 a21 a24 b4 + 2b1 a21 a23 b4 − 2b1 b3 a21 b24 + b21 a22 b24 + 2a21 a22 b24 − b21 a22 a23 + b21 a22 a24 ) a47 = −a4 (2b21 b4 a22 b2 − 2b1 a21 a22 b4 + 2a21 a22 a24 + a41 b24 − 2a21 a22 a23 + 2a22 b21 b3 b4 +2a21 a22 b3 b4 + a24 b21 b22 + 2b21 a22 b24 + 2a21 a22 b24 − 2b1 a21 b2 b24 − 2b1 a24 a21 b2 − b21 a23 b22 − 2b21 a22 a23 + 2b21 a22 a24 + b21 b22 b24 − a41 a23 + a41 a24 + 2b1 a21 a23 b2 ) a48 = −a4 (2b1 a21 a22 b3 + 2a21 a22 a24 + 2a21 a22 a23 + a41 b23 + b21 b22 b23 + a24 b21 b22 − 2b1 a24 a21 b2 + 3b21 a23 b22 + b21 a42 + 2b21 a22 a23 + 2b21 a22 a24 − 2b21 a22 b3 b2 − 2b1 a21 b2 b23 + 3a41 a23 + a41 a24 − 6b1 a21 a23 b2 ) a56 = −2a21 (b1 b23 b24 − 2b1 b4 a23 b3 + 2b1 b4 a24 b3 + b1 a44 − 2b1 a23 a24 + b1 a43 + b2 b23 b24 + 2b4 b3 b2 a24 − 2b4 b3 b2 a23 + a44 b2 − 2b2 a24 a23 + b2 a43 ) a57 = −2a21 (4b2 a24 a23 − 2b1 a44 − 2b1 a43 − 2a44 b2 − a22 b3 b24 + 2b4 b3 b2 a23 − 2b4 b3 b2 a24 − 2b2 a43 − 2b1 b4 a24 b3 + 4b1 a23 a24 + 2b1 b4 a23 b3 + b1 a22 b24 ) a58 = −2a21 (2b1 a43 − 2b1 a44 − 2a44 b2 − 2b4 b3 b2 a23 − 2b4 b3 b2 a24 + 2b2 a43 + a22 a23 b4 − 2b1 b4 a24 b3 − 2b1 b4 a23 b3 − b1 a22 a24 + a22 a24 b4 + b1 a22 a23 ) a67 = −4a21 b2 a44 − 4a21 a43 b2 − 4b3 a21 a22 b24 + 8a21 a23 b2 a24 − 4a21 b1 a44 − 4a21 b1 a43 + 4a21 b1 a22 b24 − 2a22 b21 b3 b24 − 2a22 b21 b24 b2 + 8a21 b1 a23 a24 + 4a21 b1 b4 a23 b3 − 4a21 b1 b4 a24 b3 − 4b3 b4 a21 b2 a24 + 4b3 b4 a21 a23 b2 a68 = −4b3 b4 a21 a23 b2 − 4b3 b4 a21 b2 a24 − 4a21 b1 b4 a23 b3 − 4a21 b1 b4 a24 b3 − 4a21 b2 a44 + 4a21 a43 b2 − 4a21 b1 a44 + 4a21 b1 a43 + 4a21 a22 a23 b4 + 4a21 a22 a24 b4 + 4b1 a21 a22 a23 − 4b1 a21 a22 a24 + 2b21 a22 a24 b4 + 2b21 a22 a23 b4 + 2b21 a22 b2 a24 − 2b21 a22 b2 a23 a78 = 2a24 b3 a41 − 2a41 a24 b4 − 4a21 a22 a23 b4 − 4a21 a22 a24 b4 + 4b1 a21 a23 b2 b3 − 4b1 a21 a24 b2 b3 + 4b1 b4 a21 a23 b2 − 4b1 a21 a22 a23 + 4b1 a21 a22 a24 
− 2b21 a23 b3 b22 − 2b21 b4 b22 a23 − 2b21 b4 b22 a24 + 4b1 b4 a24 a21 b2 − 4b21 a22 a24 b4 + 2b21 b3 b22 a24 − 4b21 a22 a23 b4 − 4b21 a22 b2 a24 + 4b21 a22 b2 a23 − 2a23 b3 a41 − 2a41 a23 b4 . The Poisson tensor of π−1 is given by the formula
$$\pi_{-1} = \frac{1}{2 \det L}\, A,$$
where
$$\det L = P_4^2 = \bigl( a_2^2 b_1 b_4 + a_3^2 b_1 b_2 - a_4^2 b_1 b_2 - b_1 b_2 b_3 b_4 - a_1^2 a_3^2 + a_1^2 a_4^2 + a_1^2 b_3 b_4 \bigr)^2 .$$
We note that $\det \pi_3 = \det \pi_1\, (\det L)^2 = \det \pi_1\, P_4^4$, and therefore $\det R = (\det L)^2$. This formula (as well as the formulas in the previous two cases) indicates that the eigenvalues of the recursion operator should be the squares of the non-zero eigenvalues of the Jacobi matrix. This is known to hold in the case of the classical $A_n$ Toda lattice, see [26]. For a general recursion operator, the relation between its eigenvalues and the eigenvalues of the Lax matrix has not been fully investigated yet. As in the case of $B_2$ and $C_3$ we have $\pi_1 \nabla H_2 = \pi_{-1} \nabla H_4$.

10. Conclusion

10.1. Summary of results

The classical, finite, non-periodic Toda lattice is known to be bi-Hamiltonian. Moreover, (53) is a multi-Hamiltonian formulation of the system. We have indicated how to obtain similar results for the other classical Lie algebras and we have illustrated with some small dimensional examples. These examples may be generalized:

Theorem 16. The $B_n$, $C_n$ and $D_n$ Toda systems are bi-Hamiltonian. In fact, they are multi-Hamiltonian. In each case we define $N = \pi_1 \pi_3^{-1}$, where $\pi_1$ is the Lie–Poisson bracket, $\pi_3$ is the cubic Poisson bracket and
$$\pi_{-(2i-1)} = N^i \pi_1, \qquad i = 1, 2, \dots .$$
Then all the brackets are mutually compatible, Poisson and satisfy
$$\pi_1 \nabla H_2 = \pi_{-1} \nabla H_4 = \pi_{-3} \nabla H_6 = \cdots . \tag{150}$$
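The first two equalities in (150) can be spelled out as follows. This is a sketch added for illustration: it treats the brackets as matrices, assumes $\pi_3$ is invertible, and uses only the relations $\pi_3 \nabla H_2 = \pi_1 \nabla H_4$ and $\pi_3 \nabla H_4 = \pi_1 \nabla H_6$ (the cases $j = 1$, $i = 1, 2$ of part (iv) of Theorems 12, 13 and 14), together with $N = \pi_1 \pi_3^{-1}$.

```latex
\begin{align*}
\pi_{-1} \nabla H_4 &= \pi_1 \pi_3^{-1} \bigl( \pi_1 \nabla H_4 \bigr)
                     = \pi_1 \pi_3^{-1} \bigl( \pi_3 \nabla H_2 \bigr)
                     = \pi_1 \nabla H_2,\\
\pi_{-3} \nabla H_6 &= N^2 \bigl( \pi_1 \nabla H_6 \bigr)
                     = N^2 \bigl( \pi_3 \nabla H_4 \bigr)
                     = N\, \pi_1 \pi_3^{-1} \pi_3 \nabla H_4
                     = N \pi_1 \nabla H_4
                     = \pi_{-1} \nabla H_4 .
\end{align*}
```

The same cancellation of $\pi_3^{-1} \pi_3$ repeats at every further step of the negative hierarchy.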
Proof. The proof of (150) is trivial: these are just the Lenard relations for the negative hierarchy. The brackets $\pi_{-1}, \pi_{-3}, \pi_{-5}, \dots$ are all Poisson since they are generated by the negative recursion operator, $N$, applied to the initial Poisson bracket $\pi_1$. To prove compatibility of all Poisson brackets appearing in (150) we take two brackets $\pi_t$ and $\pi_s$ where $t$, $s$ are odd integers, with $t < s \le 1$. Using condition (b) of Oevel's theorem (for the negative operator) we can express $\pi_t$ as
the Lie derivative of $\pi_s$ in the direction of a master symmetry. It is therefore enough to prove the following simple general result: if $\pi$ and $\sigma$ are both Poisson tensors and $\sigma = L_X \pi$ for some vector field $X$, then $\pi$ and $\sigma$ are compatible. The one line proof uses the super-Jacobi identity for the Schouten bracket:
$$[\pi, \sigma] = [\pi, [\pi, X]] = -\frac{1}{2} [X, [\pi, \pi]] = 0 .$$
We remark that Oevel's theorem applies in all three cases (and for both hierarchies) since the Euler vector field $X_0$ (58) is a conformal symmetry for $\pi_1$, $\pi_3$ and $H_2$. Furthermore, the compatibility condition holds for all brackets in both hierarchies. If $\pi_t$ and $\pi_s$ are both in the positive hierarchy then the argument of the theorem still works using the positive recursion operator. The remaining case, when one of the tensors has negative index and the other one positive, can also be proved in a straightforward manner using similar arguments, i.e., properties of the Schouten bracket and the fact that the formulas in Oevel's theorem hold for any integer value of the index (Theorem 8). The tensor in the positive hierarchy is the Lie derivative of the tensor in the negative hierarchy in the direction of a suitable master symmetry and the argument of the theorem shows that they are compatible. Therefore, we have a more general result: any two brackets in either the positive or negative hierarchy are compatible.

Remark. The compatibility condition follows also from a general result of bi-Hamiltonian geometry: if $\pi$ and $\sigma$ are two compatible Poisson tensors and $\pi$ is invertible, then $N = \sigma \pi^{-1}$ is a recursion operator (i.e. its torsion vanishes) and all the tensors $N^i \pi$, with $i \ge 0$, are Poisson and compatible. Using this result one can prove compatibility of all brackets in both hierarchies. For example, to show that $\pi_5$ is compatible with $\pi_{-3}$ we use the fact that $R = \pi_3 \pi_1^{-1} = \pi_{-1} (\pi_{-3})^{-1}$ to obtain $\pi_5 = R^4 \pi_{-3}$. Since $\pi_{-1}$ and $\pi_{-3}$ are compatible and Poisson, $R$ generates a chain of compatible Poisson tensors. In particular $\pi_5$ and $\pi_{-3}$ are compatible.

10.2. Open problems

We conclude with some open problems and some possible directions of research for systems related to the Toda lattice.

• Exceptional Toda lattices
The case of exceptional simple Lie groups is still an open problem. It is also a much more difficult problem. The only case that is reasonable to complete is the Toda system of type $G_2$. In that case the second Poisson bracket should be a homogeneous bracket of degree 5 (a conjecture of Flaschka states that the degrees of the independent Poisson tensors coincide with the exponents of the corresponding Lie group). The other exceptional cases are even more complicated. It is a nontrivial task even to write down an explicit Lax pair for the systems and therefore the methods of this paper will be difficult to apply.
• Full-Kostant Toda
One has a tri-Hamiltonian formulation of the $A_n$ system but no hierarchy. In the case of the generalized full-Kostant Toda lattice (associated to simple Lie groups) one could seek to find similar structures as in the present paper. So far, nothing is known. The interesting feature of these systems is the presence of the rational integrals that are necessary to prove integrability. The Lie-algebraic background of the systems will certainly play a prominent role.
• The Volterra or KM-system
The multi-Hamiltonian structure of this system was first obtained in [64]. However, there is a symplectic realization of the system, due to Volterra, and it would be interesting to find a recursion operator in that symplectic space that projects onto the known hierarchy (as in Sec. 5.1).
• Bogoyavlensky–Volterra lattices
There is also an interesting connection with the corresponding generalized Volterra systems also defined by Bogoyavlensky [65] in 1988. It seems that the multiple Hamiltonian structures of the Volterra and Toda lattices are in one-to-one correspondence through a procedure of Moser. Multiple Hamiltonian structures for the generalized Volterra lattices were constructed recently by Kouzaris [66], at least for the classical Lie algebras. The relation between the Volterra systems of type $B_n$ and $C_n$ and the corresponding Toda $B_n$, $C_n$ systems was demonstrated in [67]. The connection between Volterra $D_n$ and Toda $D_n$ is still an open problem.
• Independence of Poisson structures
Equations (53) and (150) are remarkable: one Hamiltonian system, infinitely many formulations. On the other hand the systems are finite dimensional and after some point some dependencies should occur. Indeed, the integrals $H_k$ are not all independent. It is natural to ask a similar question about the infinite sequence of Poisson structures. Do they become dependent after a certain point? Unfortunately, there exists no widely accepted definition of independence for Poisson tensors.
• Is the Toda lattice super-integrable?
This conjecture should be true. A number of well-known systems are super-integrable, e.g. the free particle, the harmonic oscillator and the Calogero–Moser systems. In the case of the open Toda lattice, asymptotically the particles become free as time goes to infinity, with asymptotic momenta being the eigenvalues of the Lax matrix. Therefore, the system behaves asymptotically like a system of free particles, which is super-integrable.
Acknowledgments
I would like to thank H. Flaschka for introducing me to the problems of the present paper and for his considerable input in this long project. I thank also a number of people who have discussed the subject with me and given me some useful ideas: J.
M. Costa Nunes, R. Fernandes, M. Gekhtman, S. Kouzaris, F. Magri, I. Marshall, W. Oevel, T. Ratiu and C. Sophocleous.
References
[1] O. I. Bogoyavlensky, Commun. Math. Phys. 51 (1976) 201.
[2] B. Kostant, Adv. Math. 34 (1979) 195.
[3] M. A. Olshanetsky and A. M. Perelomov, Invent. Math. 54 (1979) 261.
[4] M. Adler and P. van Moerbeke, Adv. Math. 38 (1980) 267.
[5] V. V. Kozlov and D. V. Treshchev, Math. USSR-Izv. 34 (1990) 555.
[6] M. F. Ranada, J. Math. Phys. 36 (1995) 6846.
[7] A. Annamalai and K. M. Tamizhmani, J. Math. Phys. 34 (1993) 1876.
[8] A. M. Perelomov, Integrable Systems of Classical Mechanics and Lie Algebras, Vol. I (Birkhäuser Verlag, Basel, 1990).
[9] P. A. Damianou, Lett. Math. Phys. 20 (1990) 101.
[10] P. A. Damianou, J. Math. Phys. 35 (1994) 5511.
[11] J. M. Nunes da Costa and P. A. Damianou, Bull. Sci. Math. 125 (2001) 49.
[12] P. A. Damianou, Regul. Chaotic Dyn. 5 (2000) 17.
[13] P. A. Damianou and S. P. Kouzaris, J. Phys. A36 (2003) 1385.
[14] P. A. Damianou, Nonlinearity 17 (2004) 397.
[15] P. A. Damianou, J. Geom. Phys. 45 (2003) 184.
[16] A. Lichnerowicz, J. Differential Geom. 12 (1977) 253.
[17] J. E. Marsden and T. S. Ratiu, Introduction to Mechanics and Symmetry, A Basic Exposition of Classical Mechanical Systems (Springer-Verlag, New York, 1999).
[18] I. Vaisman, Lectures on the Geometry of Poisson Manifolds, Progress in Mathematics, 118 (Birkhäuser, Basel, 1994).
[19] A. Weinstein, J. Differential Geom. 18 (1983) 523.
[20] J. Grabowski, G. Marmo and A. M. Perelomov, Modern Phys. Lett. A8 (1993) 1719.
[21] R. Cushman and M. Roberts, Bull. Sci. Math. 126 (2002) 525.
[22] P. A. Damianou, Bull. Sci. Math. 120 (1996) 195.
[23] C. Chevalley and S. Eilenberg, Trans. Amer. Math. Soc. 63 (1948) 85.
[24] J. L. Koszul, Soc. Math. France Asterisque hors serie (1985) 257.
[25] F. Magri, J. Math. Phys. 19 (1978) 1156.
[26] G. Falqui, F. Magri and M. Pedroni, J. Nonlinear Math. Phys. 8 (2001) 118.
[27] I. M. Gelfand and I. Zakharevich, Selecta Math. 6 (2000) 131.
[28] R. G. Smirnov, C. R. Math. Acad. Sci. Soc. R. Can. 17 (1995) 225.
[29] Y. B. Suris, Phys. Lett. A180 (1993) 419.
[30] B. Fuchssteiner, Progr. Theoret. Phys. 70 (1983) 1508.
[31] A. S. Fokas and B. Fuchssteiner, Phys. Lett. A86 (1981) 341.
[32] W. Oevel, Topics in Soliton Theory and Exactly Solvable Non-linear Equations (World Scientific Publishing, Singapore, 1987).
[33] H. Flaschka, Phys. Rev. B9 (1974) 1924.
[34] H. Flaschka, Progr. Theoret. Phys. 51 (1974) 703.
[35] M. Henon, Phys. Rev. B9 (1974) 1921.
[36] S. Manakov, Zh. Exp. Teor. Fiz. 67 (1974) 543.
[37] J. Moser, Lect. Notes Phys. 38 (1976) 97.
[38] J. Moser, Adv. Math. 16 (1975) 197.
[39] M. Toda, J. Phys. Soc. Japan 22 (1967) 431.
[40] M. Adler, Invent. Math. 50 (1979) 219.
[41] B. Kupershmidt, Asterisque 123 (1985) 1.
[42] A. Das and S. Okubo, Ann. Phys. 190 (1989) 215.
[43] R. L. Fernandes, J. Phys. A26 (1993) 3797.
[44] C. Morosi and G. Tondo, Inv. Probl. 6 (1990) 557.
[45] P. J. Olver, J. Math. Phys. 18 (1977) 1212.
[46] L. Faybusovich and M. Gekhtman, Phys. Lett. A272 (2000) 236.
[47] M. F. Atiyah and N. Hitchin, The Geometry and Dynamics of Magnetic Monopoles, M. B. Porter Lectures (Princeton University Press, Princeton, 1988).
[48] J. Moser, Finitely many mass points on the line under the influence of an exponential potential. Battelle Rencontres, Springer Notes in Physics (1974) 417–497.
[49] K. L. Vaninsky, J. Geom. Phys. 46 (2003) 283.
[50] F. Petalidou, Bull. Sci. Math. 124 (2000) 255.
[51] Y. Kosmann-Schwarzbach and F. Magri, J. Math. Phys. 37 (1996) 6173.
[52] P. J. Olver, Applications of Lie Groups to Differential Equations, GTM, 107 (Springer-Verlag, New York, 1986).
[53] G. W. Bluman and S. Kumei, Symmetries and Differential Equations (Springer-Verlag, New York, 1989).
[54] L. V. Ovsiannikov, Group Analysis of Differential Equations (Academic Press, New York, 1982).
[55] N. H. Ibragimov, Elementary Lie Group Analysis and Ordinary Differential Equations, Wiley Series in Mathematical Methods in Practice, 4 (John Wiley and Sons, Ltd., 1999).
[56] J. M. Nunes da Costa and C. M. Marle, J. Phys. A30 (1997) 7551.
[57] P. A. Damianou and C. Sophocleous, J. Math. Phys. 40 (1999) 210.
[58] P. A. Damianou, J. Phys. A26 (1993) 3791.
[59] M. F. Ranada, J. Math. Phys. 40 (1999) 236.
[60] C. Sophocleous, S. Moyo, P. G. L. Leach and P. A. Damianou, Noether Symmetries and Integrals in One, Two and Three Dimensions, TR/16/2000, Department of Mathematics and Statistics, University of Cyprus.
[61] P. A. Damianou and C. Sophocleous, Master and Noether symmetries for the Toda lattice, Proc. 16th Int. Symp. Nonlinear Acoustics, 1 (2003), pp. 618–622.
[62] D. H. Collingwood and W. M. McGovern, Nilpotent Orbits in Semisimple Lie Algebras (Van Nostrand Reinhold Co., New York, 1993).
[63] J. E. Humphreys, Introduction to Lie Algebras and Representation Theory, GTM 9 (Springer-Verlag, 1972).
[64] P. A. Damianou, Phys. Lett. A155 (1991) 126.
[65] O. I. Bogoyavlensky, Phys. Lett. A134 (1988) 34.
[66] S. P. Kouzaris, J. Nonlinear Math. Phys. 10 (2003) 431.
[67] P. A. Damianou and R. Fernandes, Rep. Math. Phys. 50 (2002) 361.
March 18, 2004 11:29 WSPC/148-RMP
00194
Reviews in Mathematical Physics Vol. 16, No. 2 (2004) 243–255 © World Scientific Publishing Company
MARKOV QUANTUM FIELDS ON A MANIFOLD
J. DIMOCK Dept. of Mathematics, SUNY at Buffalo Buffalo, NY, 14260, USA [email protected] Received 23 May 2003 Revised 26 December 2003 We study scalar quantum field theory on a compact manifold. The free theory is defined in terms of functional integrals. For positive mass it is shown to have the Markov property in the sense of Nelson. This property is used to establish a reflection positivity result when the manifold has a reflection symmetry. In dimension d = 2 we use the Markov property to establish a sewing operation for manifolds with boundary circles. Also in d = 2 the Markov property is proved for interacting fields. Keywords: Markov property; reflection positivity; sewing.
1. Introduction
We consider a Riemannian manifold $(M, g)$ consisting of an oriented compact connected manifold $M$ of dimension $d$ and a positive definite metric $g$. The natural inner product on functions is
$$\langle u, v\rangle = \int \bar{u}\, v\, d\tau = \int \bar{u}(x)\, v(x) \sqrt{\det g(x)}\, dx \tag{1}$$
where $d\tau$ is the Riemannian volume element and the second expression refers to local coordinates. The Laplacian $\Delta$ can be defined by the quadratic form
$$\langle u, (-\Delta)u\rangle = \int |du|^2\, d\tau = \int g^{\mu\nu}(x)\, \frac{\partial \bar{u}}{\partial x^\mu}(x)\, \frac{\partial u}{\partial x^\nu}(x) \sqrt{\det g(x)}\, dx . \tag{2}$$
As is well known, $-\Delta$ defines a self-adjoint operator in $L^2(M, d\tau)$ with non-negative discrete spectrum and an isolated simple eigenvalue at zero whose eigenspace consists of the constants. We want to study the free scalar quantum field of mass $m \ge 0$ on $(M, g)$. (We use the term quantum loosely; there may be no direct quantum mechanical interpretation.) For $m > 0$ this is a family $\phi(f) = \langle \phi, f\rangle$ of Gaussian random variables indexed by smooth real functions $f$ on $M$. The fields $\phi(f)$ are defined to
have mean zero and covariance $(-\Delta + m^2)^{-1}$. If $\mu$ is the underlying measure we have the characteristic function
$$\int e^{i\phi(f)}\, d\mu = e^{-\frac{1}{2}\langle f, (-\Delta+m^2)^{-1} f\rangle} \tag{3}$$
from which one can generate the correlation functions. For $m = 0$ the Laplacian is only invertible on the orthogonal complement of the constants and we restrict the test functions $f$ to lie in this subspace, i.e. $\int f\, d\tau = 0$. For $m = 0$ and $d = 2$, metrics which are equivalent by a local rescaling give rise to the same fields,$^a$ and we have a conformal field theory.
In this paper we show that for $m > 0$ the fields $\phi(f)$ satisfy a Markov property in the sense of Nelson [1–3]. The Markov property is the statement that for a function of the fields localized in a region $\Omega$, conditioning on fields in $\Omega^c$ is the same as conditioning on fields in $\partial\Omega$. Nelson originally established this property for Euclidean quantum fields in $\mathbb{R}^n$, and we show that his treatment can also be carried out on manifolds.
We also work out some applications, generally for $m > 0$ and sometimes by limits for $m = 0$. We show that functional integrals can be written as inner products of states localized on $(d-1)$-dimensional submanifolds. If the manifold has a reflection symmetry this leads to a reflection positivity result and an enhanced Hilbert space structure. In $d = 2$ another application is the establishment of a sewing property for manifolds with boundary circles. Operations of this type are widely used in conformal field theory and string theory. Finally we obtain the Markov property for interacting fields in $d = 2$.
2. Sobolev Spaces
We begin with some preliminary definitions. (See for example [4].) Let $H^{\pm 1}(M)$ be the usual real Sobolev spaces consisting of those distributions on $M$ which when expressed in local coordinates are in the spaces $H^{\pm 1}(\mathbb{R}^d)$. These have no particular norm, but we give an alternate definition which supplies a norm and an inner product.
The spaces $H^{\pm 1}(M)$ can be identified as the completion of $C^\infty(M)$ in the norm
$$\|u\|_{\pm 1}^2 = \langle u, (-\Delta + m^2)^{\pm 1} u\rangle \tag{4}$$
for any $m > 0$. These are real Hilbert spaces and we have $H^1(M) \subset L^2(M, d\tau) \subset H^{-1}(M)$. We also have $|\langle u, v\rangle| \le \|u\|_1 \|v\|_{-1}$ so the inner product extends by limits to a bilinear pairing of $H^1$, $H^{-1}$. These spaces are dual with respect to this pairing. Also $-\Delta + m^2$ is unitary from $H^1$ to $H^{-1}$. For any closed subset $A \subset M$ define a closed subspace
$$H_A^{-1}(M) = \{u \in H^{-1}(M) : \operatorname{supp} u \subset A\} . \tag{5}$$
$^a$ For smooth $\lambda > 0$ we have $\Delta_{\lambda g} = \lambda^{-1}\Delta_g$ and hence $\langle \lambda^{-1} f, \Delta_{\lambda g}^{-1} \lambda^{-1} f\rangle_{\lambda g} = \langle f, \Delta_g^{-1} f\rangle_g$. Thus $\phi_{\lambda g}(\lambda^{-1} f)$ and $\phi_g(f)$ have the same characteristic function and are equivalent.
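The scaling identity of footnote a can be unpacked in one line. The following is a sketch for $d = 2$, written for mean-zero test functions so that the inverse Laplacians make sense; the only inputs assumed are $d\tau_{\lambda g} = \lambda\, d\tau_g$ and $\Delta_{\lambda g} = \lambda^{-1}\Delta_g$, which give $\Delta_{\lambda g}^{-1} h = \Delta_g^{-1}(\lambda h)$:

```latex
\begin{align*}
\langle \lambda^{-1} f,\ \Delta_{\lambda g}^{-1}\lambda^{-1} f\rangle_{\lambda g}
 &= \int (\lambda^{-1} f)\,\bigl(\Delta_g^{-1}(\lambda\,\lambda^{-1} f)\bigr)\,\lambda\, d\tau_g \\
 &= \int f\,\Delta_g^{-1} f\, d\tau_g
  = \langle f,\ \Delta_g^{-1} f\rangle_g .
\end{align*}
```

Note that the mean-zero condition is compatible with the rescaling, since $\int \lambda^{-1} f\, d\tau_{\lambda g} = \int f\, d\tau_g$.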
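Also for $\Omega \subset M$ open let $H_0^1(\Omega)$ be the closure of $C_0^\infty(\Omega)$ in $H^1(M)$. Now let $\Omega$ be an open set and consider the disjoint unions
$$M = \Omega^c \cup \Omega, \qquad M = (\operatorname{ext}\Omega) \cup \bar\Omega, \qquad M = (\operatorname{ext}\Omega) \cup \partial\Omega \cup \Omega . \tag{6}$$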
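For each of these we have an associated decomposition of $H^{-1}(M)$:
Lemma 1. For open $\Omega \subset M$
$$H^{-1}(M) = H^{-1}_{\Omega^c}(M) \oplus (-\Delta + m^2) H_0^1(\Omega) \tag{7}$$
$$H^{-1}(M) = (-\Delta + m^2) H_0^1(\operatorname{ext}\Omega) \oplus H^{-1}_{\bar\Omega}(M) \tag{8}$$
$$H^{-1}(M) = (-\Delta + m^2) H_0^1(\operatorname{ext}\Omega) \oplus H^{-1}_{\partial\Omega}(M) \oplus (-\Delta + m^2) H_0^1(\Omega) . \tag{9}$$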
Proof. It is straightforward to show that the orthogonal complement of $H_0^1(\Omega)$ in the dual space $H^{-1}(M)$ is $H^{-1}_{\Omega^c}(M)$. The dual relation is that the orthogonal complement of $H^{-1}_{\Omega^c}(M)$ in $H^1(M)$ is $H_0^1(\Omega)$. To find the orthogonal complement of $H^{-1}_{\Omega^c}(M)$ in $H^{-1}(M)$ we apply the unitary operator $-\Delta + m^2$ and hence get $(-\Delta + m^2) H_0^1(\Omega)$. This gives the first result. For the second result replace $\Omega$ by $\operatorname{ext}\Omega$. For the third result replace $\Omega$ by $(\partial\Omega)^c$ and obtain
$$H^{-1}(M) = H^{-1}_{\partial\Omega}(M) \oplus (-\Delta + m^2) H_0^1((\partial\Omega)^c) . \tag{10}$$
The result now follows from
$$H_0^1((\partial\Omega)^c) = H_0^1(\Omega) \oplus H_0^1(\operatorname{ext}\Omega) . \tag{11}$$
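For orientation, here is a sketch of what the decomposition (10) looks like in the simplest case $d = 1$. This is an illustration under stated assumptions, not part of the original argument: take $M$ a circle and $\Omega = (a, b)$ an open arc, so that $\partial\Omega = \{a, b\}$, and write $G_m$ (a symbol introduced here) for the Green's function of $-\Delta + m^2$ on the circle. In one dimension $\delta_a, \delta_b$ lie in $H^{-1}(M)$ while their derivatives do not, and $H^1$ functions are continuous, so one expects:

```latex
\begin{align*}
H^{-1}_{\partial\Omega}(M) &= \operatorname{span}\{\delta_a, \delta_b\}, \\
u &= \alpha\,\delta_a + \beta\,\delta_b + (-\Delta + m^2)\,w,
   \qquad w \in H^1(M),\ w(a) = w(b) = 0, \\
\begin{pmatrix} G_m(a,a) & G_m(a,b) \\ G_m(b,a) & G_m(b,b) \end{pmatrix}
\begin{pmatrix} \alpha \\ \beta \end{pmatrix}
 &= \begin{pmatrix} h(a) \\ h(b) \end{pmatrix},
   \qquad h = (-\Delta + m^2)^{-1} u \in H^1(M).
\end{align*}
```

The $2 \times 2$ system simply says that $w = h - \alpha\, G_m(\cdot, a) - \beta\, G_m(\cdot, b)$ vanishes at $a$ and $b$; for $a \neq b$ the matrix is positive definite, so $\alpha$ and $\beta$ are uniquely determined.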
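Remark. Applying the unitary $(-\Delta + m^2)^{-1}$ to the decomposition (9) of $H^{-1}(M)$ we get a decomposition of $H^1(M)$ which is
$$H^1(M) = H_0^1(\operatorname{ext}\Omega) \oplus (-\Delta + m^2)^{-1} H^{-1}_{\partial\Omega}(M) \oplus H_0^1(\Omega) . \tag{12}$$
This says that any element of $H^1(M)$ can be uniquely written as the sum of a function which satisfies $(-\Delta + m^2)u = 0$ on $(\partial\Omega)^c = \Omega \cup \operatorname{ext}\Omega$ and a function which vanishes on $\partial\Omega$.
By comparing the various decompositions in the lemma we also have, corresponding to $\bar\Omega = \partial\Omega \cup \Omega$ and $\Omega^c = (\operatorname{ext}\Omega) \cup \partial\Omega$, the decompositions:
Corollary 1.
$$H^{-1}_{\bar\Omega}(M) = H^{-1}_{\partial\Omega}(M) \oplus (-\Delta + m^2) H_0^1(\Omega), \qquad H^{-1}_{\Omega^c}(M) = (-\Delta + m^2) H_0^1(\operatorname{ext}\Omega) \oplus H^{-1}_{\partial\Omega}(M) . \tag{13}$$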
Now for $A \subset M$ let $e_A$ be the orthogonal projection onto $H_A^{-1}(M)$. The following pre-Markov property is basic to our treatment.
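Lemma 2. For open $\Omega \subset M$
(1) If $u \in H^{-1}_{\bar\Omega}(M)$ then $e_{\Omega^c} u = e_{\partial\Omega} u$.
(2) $e_{\Omega^c} e_{\bar\Omega} = e_{\partial\Omega}$.
Proof. The two statements are equivalent. With respect to the decomposition (9) we have
$$e_{\Omega^c} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad e_{\bar\Omega} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad e_{\partial\Omega} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{14}$$
and hence $e_{\Omega^c} e_{\bar\Omega} = e_{\partial\Omega}$.
Remark. If $u \in H^{-1}_{\Omega^c}(M)$ and $v \in H^{-1}_{\bar\Omega}(M)$ then
$$(u, v)_{-1} = (e_{\Omega^c} u, e_{\bar\Omega} v)_{-1} = (u, e_{\partial\Omega} v)_{-1} = (e_{\partial\Omega} u, e_{\partial\Omega} v)_{-1} \tag{15}$$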
which reduces the inner product to the boundary. We can use this to obtain a sufficient condition for $H^{-1}_{\partial\Omega}(M)$ to be nontrivial. (The condition is not necessary.)
Corollary 2. If $\Omega, \operatorname{ext}\Omega \neq \emptyset$ then $H^{-1}_{\partial\Omega}(M) \neq \{0\}$.
Proof. The space $H^{-1}_{\partial\Omega}(M)$ has a meaning independent of any norm. It suffices to show that it is nontrivial as a subspace of $H^{-1}(M)$ with the norm (4) and $m^2$ small. Let $u \in C_0^\infty(\operatorname{ext}\Omega)$ and $v \in C_0^\infty(\Omega)$ be positive functions. We will show that $e_{\partial\Omega} u \neq 0$ and $e_{\partial\Omega} v \neq 0$. By (15) it suffices to show that $(u, v)_{-1} \neq 0$. Let $\psi_0 = 1/\sqrt{\operatorname{Vol}(M)}$ be the lowest eigenfunction of $-\Delta$ on $L^2(M, d\tau)$. Then $u_0 = \langle u, \psi_0\rangle$ and $v_0 = \langle v, \psi_0\rangle$ are nonzero. As $m \searrow 0$ we have
$$(u, v)_{-1} = \langle u, (-\Delta + m^2)^{-1} v\rangle = u_0 v_0\, m^{-2} + O(1) . \tag{16}$$
Thus $(u, v)_{-1} \neq 0$ for $m$ small.
3. Markov Property
We use these results to establish the Markov property for our $m > 0$ field theory following Nelson [3]. First extend the class of test functions from $C^\infty(M)$ to $H^{-1}(M)$ so that now $\phi(f)$ is a family of Gaussian random variables indexed by $f \in H^{-1}(M)$ with covariance given by the $H^{-1}(M)$ inner product. The underlying measure space $(Q, \mathcal{O}, \mu)$ consists of a set $Q$, a $\sigma$-algebra of measurable subsets $\mathcal{O}$ generated by the $\phi(f)$, and a measure $\mu$. Polynomials in $\phi(f)$ are dense in $L^2(Q, \mathcal{O}, d\mu)$. We also need Wick monomials $:\!\phi(f_1) \cdots \phi(f_n)\!:_{(-\Delta+m^2)^{-1}}$ defined as the projection in $L^2(Q, \mathcal{O}, d\mu)$ of $\phi(f_1) \cdots \phi(f_n)$ onto the orthogonal complement of polynomials of degree $n - 1$. These are polynomials of degree $n$ and for example
$$:\!\phi(f)\phi(g)\!:_{(-\Delta+m^2)^{-1}} = \phi(f)\phi(g) - \langle f, (-\Delta + m^2)^{-1} g\rangle . \tag{17}$$
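As a worked instance of this definition, the first few Wick powers of a single field are Hermite polynomials with variance $c$. The shorthand $c = \langle f, (-\Delta + m^2)^{-1} f\rangle$ is introduced here only for illustration, and the subscript on the Wick dots is suppressed:

```latex
\begin{align*}
 :\!\phi(f)\!:   &= \phi(f), \\
 :\!\phi(f)^2\!: &= \phi(f)^2 - c, \\
 :\!\phi(f)^3\!: &= \phi(f)^3 - 3c\,\phi(f),
 \qquad c = \langle f, (-\Delta + m^2)^{-1} f\rangle .
\end{align*}
```

Orthogonality to lower-degree polynomials can be checked from the Gaussian moments $\int \phi(f)^2\, d\mu = c$ and $\int \phi(f)^4\, d\mu = 3c^2$: for instance $\int :\!\phi(f)^3\!:\, \phi(f)\, d\mu = 3c^2 - 3c \cdot c = 0$.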
Let us recall the well-known connection between Gaussian processes and Fock space. Let $\mathcal{F}(H_{\mathbb{C}}^{-1})$ be the Fock space over the complexification $H_{\mathbb{C}}^{-1}(M)$, that is the infinite direct sum of $n$-fold symmetric tensor products of $H_{\mathbb{C}}^{-1}(M)$. Then there is a unitary identification of (complex) $L^2(Q, \mathcal{O}, d\mu)$ with $\mathcal{F}(H_{\mathbb{C}}^{-1})$ determined by
$$:\!\phi(f_1) \cdots \phi(f_n)\!:_{(-\Delta+m^2)^{-1}} \;\leftrightarrow\; \sqrt{n!}\, \operatorname{Sym}(f_1 \otimes \cdots \otimes f_n) . \tag{18}$$
Any contraction $T$ on $H_{\mathbb{C}}^{-1}(M)$ (linear operator with $\|T\| \le 1$) induces a contraction $\Gamma(T)$ on the Fock space by sending $\operatorname{Sym}(f_1 \otimes \cdots \otimes f_n)$ to $\operatorname{Sym}(T f_1 \otimes \cdots \otimes T f_n)$. This determines a contraction on $L^2(Q, \mathcal{O}, d\mu)$ also denoted $\Gamma(T)$. We have $\Gamma(T)\Gamma(S) = \Gamma(TS)$.
Now for closed $A \subset M$ let $\mathcal{O}_A$ be the smallest subalgebra of $\mathcal{O}$ such that the functions $\{\phi(f) : \operatorname{supp} f \subset A\}$ are measurable. Also let $E_A F = E\{F \,|\, \mathcal{O}_A\}$ be the conditional expectation of a function $F$ with respect to $\mathcal{O}_A$. Then $E_A$ is an orthogonal projection on $L^2(Q, \mathcal{O}, d\mu)$ with range $L^2(Q, \mathcal{O}_A, d\mu)$, the $\mathcal{O}_A$ measurable $L^2$-functions. The conditional expectations are related to the projections in Sobolev space by
$$E_A = \Gamma(e_A) . \tag{19}$$
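For the proof see Simon [5]. This leads to
Theorem 1 (The Markov Property). For open $\Omega \subset M$
(1) If $F \in L^2(Q, \mathcal{O}_{\bar\Omega}, d\mu)$ then $E_{\Omega^c} F = E_{\partial\Omega} F$.
(2) $E_{\Omega^c} E_{\bar\Omega} = E_{\partial\Omega}$.
Proof. The two statements are equivalent. The second follows from $e_{\Omega^c} e_{\bar\Omega} = e_{\partial\Omega}$ and (19), for we have
$$E_{\Omega^c} E_{\bar\Omega} = \Gamma(e_{\Omega^c})\Gamma(e_{\bar\Omega}) = \Gamma(e_{\Omega^c} e_{\bar\Omega}) = \Gamma(e_{\partial\Omega}) = E_{\partial\Omega} . \tag{20}$$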
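The operator identity $e_{\Omega^c} e_{\bar\Omega} = e_{\partial\Omega}$ behind this proof can be checked in a finite-dimensional analogue. The following is a minimal sketch and not part of the original argument: the manifold is replaced by a path graph with 7 vertices, $-\Delta$ by the graph Laplacian, and $H_A^{-1}$ by the vectors supported on a vertex set $A$, with inner product $u^T C v$ for $C = (L + m^2)^{-1}$. All function names are hypothetical, and exact rational arithmetic is used so that the comparison is an equality rather than a tolerance check. As a control, the same identity is tested for the covariance $C^2$, whose inverse couples next-nearest neighbours, so a one-vertex boundary should no longer screen the two sides.

```python
from fractions import Fraction


def solve(a, b):
    """Return x with a x = b (a square, b a matrix), by exact Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def covariance(n, m2):
    """C = (L + m2)^{-1}, with L the Laplacian of the path graph on n vertices."""
    k = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        k[i][i] = Fraction(m2)
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                k[i][i] += 1
                k[i][j] = Fraction(-1)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    return solve(k, identity)


def projection(c, subset):
    """Orthogonal projection, for the inner product u^T c v, onto vectors supported on subset."""
    n = len(c)
    idx = sorted(subset)
    c_aa = [[c[i][j] for j in idx] for i in idx]
    coeff = solve(c_aa, [c[i] for i in idx])  # (C_AA)^{-1} C_{A,:}
    e = [[Fraction(0)] * n for _ in range(n)]
    for row, i in enumerate(idx):
        e[i] = coeff[row]
    return e


def markov_identity_holds(c, ext, boundary, omega):
    """Test e_{Omega^c} e_{closure Omega} == e_{boundary} as exact matrices."""
    e_omega_c = projection(c, ext | boundary)
    e_omega_bar = projection(c, boundary | omega)
    e_boundary = projection(c, boundary)
    return matmul(e_omega_c, e_omega_bar) == e_boundary


n, k = 7, 3
ext, boundary, omega = set(range(k)), {k}, set(range(k + 1, n))
c = covariance(n, 1)
local_ok = markov_identity_holds(c, ext, boundary, omega)
nonlocal_ok = markov_identity_holds(matmul(c, c), ext, boundary, omega)
print(local_ok, nonlocal_ok)
```

The design choice worth noting is that the projection is computed from the Gram matrix alone, $e_A = \iota_A (C_{AA})^{-1} C_{A,\cdot}$, which is the discrete counterpart of projecting in the nonlocal norm (4); the locality of $L + m^2$ enters only through the structure of $C$.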
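Remark. Now suppose that $F$ is $\mathcal{O}_{\Omega^c}$ measurable and $G$ is $\mathcal{O}_{\bar\Omega}$ measurable. Then by $E_{\Omega^c} E_{\bar\Omega} = E_{\partial\Omega}$ we have
$$\int \bar{F} G\, d\mu = \int \overline{E_{\Omega^c} F}\, (E_{\bar\Omega} G)\, d\mu = \int \bar{F}\, (E_{\partial\Omega} G)\, d\mu = \int \overline{E_{\partial\Omega} F}\, (E_{\partial\Omega} G)\, d\mu . \tag{21}$$
This says that the conditional expectation $E_{\partial\Omega}$ maps $\mathcal{O}_{\bar\Omega}$ measurable functions and $\mathcal{O}_{\Omega^c}$ measurable functions to $\mathcal{O}_{\partial\Omega}$ measurable functions in such a way that the functional integral is evaluated as the inner product in the boundary Hilbert space $L^2(Q, \mathcal{O}_{\partial\Omega}, d\mu)$. We exploit this identity in the next two sections.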
4. Reflection Positivity
As a first application we show that if the manifold has a reflection symmetry then the functional integrals have a more elementary Hilbert space structure. We assume that our $d$-dimensional manifold $M$ has a $(d-1)$-dimensional submanifold $B$ which divides the manifold in two identical parts. That is, we have the disjoint union
$$M = \Omega_- \cup B \cup \Omega_+ \tag{22}$$
where $\Omega_\pm$ are open and $\partial\Omega_\pm = B$. Further we assume there is an isometric involution $\theta$ on $M$ so that $\theta\Omega_\pm = \Omega_\mp$ and $\theta B = B$. For $d = 2$ this is the structure of a Schottky double. As an example in $d$ dimensions we could take $M$ to be the sphere $\{x \in \mathbb{R}^{d+1} : x_0^2 + \cdots + x_d^2 = 1\}$, take $B = \{x_0 = 0\}$ and $\Omega_\pm = \{\pm x_0 > 0\}$, and let $\theta$ be the reflection $x_0 \to -x_0$.
As a diffeomorphism $\theta$ defines a map $\theta_*$ on $C^\infty(M)$ by $\theta_* u = u \circ \theta^{-1}$ which extends to a bounded operator on $H^{\pm 1}(M)$ or $L^2(M)$. Since $\theta$ is an isometry, $\theta_*$ is unitary on these spaces and preserves the $H^1$, $H^{-1}$ pairing. Since $\theta^2 = 1$ we have $(\theta_*)^2 = 1$.
Lemma 3. Let $u \in H_B^{-1}(M)$.
(i) $\langle u, f\rangle = 0$ for any smooth function $f$ vanishing on $B$.
(ii) $\theta_* u = u$.
Proof. By choosing local coordinates we reduce (i) to the following statement. Let $u \in H^{-1}_{B_0}(\mathbb{R}^d)$ where $B_0 = \{x \in \mathbb{R}^d : x_d = 0\}$ and let $f \in C_0^\infty(\mathbb{R}^d)$ vanish on $B_0$. Then $\langle u, f\rangle = 0$. A distribution with support in $B_0$ is a finite sum of derivatives of delta functions: $u = \sum_j h_j \otimes \delta^{(j)}_{B_0}$. The condition $u \in H^{-1}(\mathbb{R}^d)$ rules out $j \ge 1$, as can be seen by looking at the Fourier transform. Thus $u = h \otimes \delta_{B_0}$ and the result follows.
For (ii) we must show that $\langle \theta_* u - u, f\rangle = 0$ for smooth $f$, or equivalently that $\langle u, f - \theta_* f\rangle = 0$. Since $f - \theta_* f$ vanishes on $B$ this follows from part (i). This completes the proof.
Now let $\Theta = \Gamma(\theta_*)$ be the induced reflection on $L^2(Q, \mathcal{O}, d\mu)$. This is unitary since $\theta_*$ is unitary and we also have
$$\Theta(\phi(f_1) \cdots \phi(f_n)) = \phi(\theta_* f_1) \cdots \phi(\theta_* f_n) . \tag{23}$$
Theorem 2 (Reflection Positivity, $m > 0$). For $F \in L^2(Q, \mathcal{O}_{\bar\Omega_+}, d\mu)$
$$\int \overline{\Theta(F)}\, F\, d\mu \ge 0 . \tag{24}$$
Remarks. The positivity is also known as Osterwalder–Schrader positivity. A similar result was previously obtained by De Angelis, de Falco and Di Genova [6] by other methods. The proof below follows Nelson [3].
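Proof. For any closed set $A$ we have $\theta_* H_A^{-1} = H_{\theta A}^{-1}$ and hence $\theta_* e_A = e_{\theta A}\theta_*$. It follows that
$$\Theta E_A = \Gamma(\theta_*)\Gamma(e_A) = \Gamma(e_{\theta A})\Gamma(\theta_*) = E_{\theta A}\Theta . \tag{25}$$
In particular we have $\Theta E_{\bar\Omega_+} = E_{\Omega_+^c}\Theta$ and $\Theta E_B = E_B\Theta$. The result now follows by the calculation
$$\int \overline{(\Theta F)}\, F\, d\mu = \int \overline{E_B(\Theta F)}\, E_B F\, d\mu = \int |E_B(F)|^2\, d\mu \ge 0 . \tag{26}$$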
Here in the first step we have used $\Theta E_{\bar\Omega_+} = E_{\Omega_+^c}\Theta$ to conclude that $\Theta F$ is $\mathcal{O}_{\Omega_+^c}$ measurable and then (21) to reduce the calculation to $B$. For the second step we note that the lemma says $\theta_* e_B = e_B$ and so $\Theta E_B = E_B$. Hence $E_B\Theta = E_B$ to complete the proof.
Next we consider the case $m = 0$ as defined in the introduction. Let $\mu_0$ denote the measure and again define $\Theta$ so that (24) holds. We take a smaller class of functions $F$ but otherwise have the same result.
Corollary 3 (Reflection Positivity, $m = 0$). Let $F$ be a polynomial in the fields $\phi(f)$ with $f \in C_0^\infty(\Omega_+)$ and $\int f\, d\tau = 0$. Then
$$\int \overline{\Theta(F)}\, F\, d\mu_0 \ge 0 . \tag{27}$$
Proof. We have $\langle f, (-\Delta)^{-1} g\rangle = \lim_{m \to 0} \langle f, (-\Delta + m^2)^{-1} g\rangle$ provided $f, g$ satisfy $\int f\, d\tau = 0$. Gaussian integrals of polynomials can be explicitly evaluated as sums of products of these expressions. Hence if $P$ is any polynomial with these test functions and $\mu_m$ the massive measure then $\int P\, d\mu_0 = \lim_{m \to 0} \int P\, d\mu_m$. In particular
$$\int \overline{\Theta(F)}\, F\, d\mu_0 = \lim_{m \to 0} \int \overline{\Theta(F)}\, F\, d\mu_m . \tag{28}$$
The result now follows from the previous theorem.
Remarks. Returning to the case $m > 0$ one can now define an inner product on $\mathcal{O}_{\bar\Omega_+}$ measurable functions $F, G$ by
$$\langle F, G\rangle = \int \overline{\Theta(F)}\, G\, d\mu . \tag{29}$$
Then $\langle F, F\rangle \ge 0$ and if we divide out the null vectors $\mathcal{N} = \{F : \langle F, F\rangle = 0\}$ we get something positive definite and hence a pre-Hilbert space. We call the Hilbert space completion $\mathcal{K}$:
$$\mathcal{K} = L^2(Q, \mathcal{O}_{\bar\Omega_+}, d\mu)/\mathcal{N} . \tag{30}$$
A similar construction works for m = 0. Now we are in a position to define operators on K from certain operators on the L2 space. This can lead to a detailed quantum mechanical/operator interpretation of the theory. For details on such constructions and related positivity results in conformal field theory see [7–10].
5. Sewing
Now restrict to $d = 2$ and suppose that we have a Riemann surface $(M_1, g_1)$ with a boundary circle $C_1$. Further suppose that the metric is flat on a neighborhood of the boundary. This means that there is a local coordinate $z$ in which the circle is $|z| = 1$ and the metric has the form $|z|^{-2}\, dz\, d\bar z$ for $|z| > 1$. If we allow ourselves local rescalings of the metric $g \to \lambda g$ this is not a restrictive condition. These rescalings are permitted if $m = 0$. Even if $m > 0$ the effect of such a transformation would be to change to a variable mass, and this would not spoil our results.
We want to define a mapping from an algebra of fields on $M_1$ to states on the boundary $C_1$. We have already noted that for a manifold without boundary the conditional expectation serves this function, so we proceed by closing $M_1$. That is, we cap off the circle in some standard fashion to get a compact manifold $(\tilde M_1, \tilde g_1)$ without boundary, also flat in a neighborhood of $C_1$. Then for $m > 0$ we have Gaussian fields $\{\phi_1(f) : f \in H^{-1}(\tilde M_1)\}$ on a measure space $(Q_1, \mathcal{O}_1, \mu_1)$. As the boundary Hilbert space we take the $L^2$ functions measurable with respect to $\mathcal{O}_{1,C_1}$:
$$\mathcal{H}_{C_1} \equiv L^2(Q_1, \mathcal{O}_{1,C_1}, d\mu_1) . \tag{31}$$
Then we define
$$A_{C_1,M_1} : L^2(Q_1, \mathcal{O}_{M_1}, d\mu_1) \to \mathcal{H}_{C_1} \tag{32}$$
as the restriction of the conditional expectation in $\tilde M_1$:
$$A_{C_1,M_1} = E^{\tilde M_1}_{C_1} . \tag{33}$$
We further restrict the domain to polynomials in $\{\phi_1(f) : f \in H^{-1}_{M_1}(\tilde M_1)\}$.
Suppose also there is a second such Riemann surface $(M_2, g_2)$ with boundary circle $C_2$ and a local coordinate in which the circle is $|w| = 1$ and the metric has the form $|w|^{-2}\, dw\, d\bar w$ for $|w| > 1$. We cap off $M_2$ to form a manifold without boundary $(\tilde M_2, \tilde g_2)$. Then we have fields $\{\phi_2(f) : f \in H^{-1}(\tilde M_2)\}$ on a measure space $(Q_2, \mathcal{O}_2, \mu_2)$, and an operator $A_{C_2,M_2} = E^{\tilde M_2}_{C_2}$.
The two manifolds $M_1, M_2$ can be joined together by identifying points in a neighborhood of $C_1$ in $\tilde M_1$ with points in a neighborhood of $C_2$ in $\tilde M_2$ when the coordinates satisfy $z = 1/w$. Then $C_1$ and $C_2$ are identified by an orientation reversing map. On the overlap we have two coordinates and two metrics, but the metrics agree since the coordinate change $z = 1/w$ takes $|z|^{-2}\, dz\, d\bar z$ to $|w|^{-2}\, dw\, d\bar w$. Thus we get a compact Riemann surface $(M, g)$ which is flat in a neighborhood of a circle $C$. (See Fig. 1, and [11] for more details on this construction.) There is an isometric mapping $j_1$ from a neighborhood of $M_1$ in $\tilde M_1$ into $M$ which takes $C_1$ to $C$. The image of $M_1$ in $M$ will also be called $M_1$. Similarly we have an isometric mapping $j_2$ from a neighborhood of $M_2$ in $\tilde M_2$ to $M$ which takes $C_2$ to $C$.
Fig. 1. $M_1$, $M_2$ are manifolds with boundary circles $C_1$, $C_2$. They are capped off to form $\tilde M_1$, $\tilde M_2$. They are sewn together to form the manifold $M$ without boundary.
On the new manifold $M$ we have Gaussian fields $\{\phi(f) : f \in H^{-1}(M)\}$ on a measure space $(Q, \mathcal{O}, \mu)$. We also have an identification between fields on $M_1$ in $\tilde M_1$ and fields on $M_1$ in $M$. To see this, first note that the isometry $j_1$ induces a map $j_{1,*}$ from distributions on $\tilde M_1$ with support in $M_1$ to distributions on $M$ with support in $M_1$. This map preserves Sobolev spaces and so
$$j_{1,*} : H^{-1}_{M_1}(\tilde M_1) \to H^{-1}_{M_1}(M) . \tag{34}$$
However with our nonlocal norms (4) this is not unitary. There is an induced map on Fock space subspaces:
$$J_1 \equiv \Gamma(j_{1,*}) : \mathcal{F}(H^{-1}_{M_1}(\tilde M_1)) \to \mathcal{F}(H^{-1}_{M_1}(M)) . \tag{35}$$
Since $j_{1,*}$ is not a contraction $J_1$ is unbounded. We take as the domain elements with a finite number of entries. We can also regard $J_1$ as a map of the corresponding $L^2$ subspaces
$$J_1 : L^2(Q_1, \mathcal{O}_{1,M_1}, d\mu_1) \to L^2(Q, \mathcal{O}_{M_1}, d\mu) \tag{36}$$
with domain the polynomials. We note also that $J_1$ maps $\mathcal{H}_{C_1}$ to $\mathcal{H}_C \equiv L^2(Q, \mathcal{O}_C, d\mu)$. There is a similar map $J_2$.
Our goal is to sew together the operators $A_{C_1,M_1}$ and $A_{C_2,M_2}$ and obtain a manageable functional integral on the new manifold $M$. The recipe is as follows. Starting with polynomials $F, G$ on $M_1, M_2$ we propagate them to the circles $C_1, C_2$ by forming $A_{C_1,M_1} F$ and $A_{C_2,M_2} G$. Then we map to the circle $C$, forming $J_1 A_{C_1,M_1} F$ and $J_2 A_{C_2,M_2} G$ in $\mathcal{H}_C$. Finally we take the inner product in $\mathcal{H}_C$. Thus we define
$$(A_{C_1,M_1} F, A_{C_2,M_2} G) = \int (J_1 A_{C_1,M_1} F)(J_2 A_{C_2,M_2} G)\, d\mu . \tag{37}$$
Theorem 3 (Sewing, $m > 0$). Let $F$ be a polynomial in $\{\phi_1(f) : f \in H^{-1}_{M_1}(\tilde M_1)\}$ and let $G$ be a polynomial in $\{\phi_2(f) : f \in H^{-1}_{M_2}(\tilde M_2)\}$. Then
$$(A_{C_1,M_1} F, A_{C_2,M_2} G) = \int (J_1 F)(J_2 G)\, d\mu . \tag{38}$$
Remark. Thus sewing involves the identification operators $J_1, J_2$ from $M_1, M_2$ to $M$. These can be understood as a change in Wick ordering. We have
$$J_1 :\!\phi_1(f_1) \cdots \phi_1(f_n)\!:_{(-\Delta_{\tilde M_1}+m^2)^{-1}} \;=\; :\!\phi(j_{1,*} f_1) \cdots \phi(j_{1,*} f_n)\!:_{(-\Delta_M+m^2)^{-1}} . \tag{39}$$
Proof. We have that $j_{1,*}$ maps $H^{-1}_{M_1}(\tilde M_1)$ to $H^{-1}_{M_1}(M)$. These spaces have the decompositions (13)
$$H^{-1}_{M_1}(\tilde M_1) = H^{-1}_{C_1}(\tilde M_1) \oplus (-\Delta_{\tilde M_1} + m^2) H_0^1(\operatorname{int} M_1), \qquad H^{-1}_{M_1}(M) = H^{-1}_C(M) \oplus (-\Delta_M + m^2) H_0^1(\operatorname{int} M_1) \tag{40}$$
and since $j_1$ is an isometry $j_{1,*}$ preserves the decomposition. The operators $e^{\tilde M_1}_{C_1}$ and $e^M_C$ are the projections onto the first factors and so we have the identity on $H^{-1}_{M_1}(\tilde M_1)$
$$j_{1,*}\, e^{\tilde M_1}_{C_1} = e^M_C\, j_{1,*} . \tag{41}$$
It follows that
$$J_1 E^{\tilde M_1}_{C_1} = \Gamma(j_{1,*})\Gamma(e^{\tilde M_1}_{C_1}) = \Gamma(e^M_C)\Gamma(j_{1,*}) = E^M_C J_1 . \tag{42}$$
Then we have
$$(A_{C_1,M_1} F, A_{C_2,M_2} G) = \int (J_1 E^{\tilde M_1}_{C_1} F)(J_2 E^{\tilde M_2}_{C_2} G)\, d\mu = \int (E^M_C J_1 F)(E^M_C J_2 G)\, d\mu = \int (J_1 F)(J_2 G)\, d\mu . \tag{43}$$
In the last step we use that $J_1 F$ is $\mathcal{O}_{M_1}$ measurable, that $J_2 G$ is $\mathcal{O}_{M_2}$ measurable, and the Markov property via the identity (21). This completes the proof.
Remarks. (1) We do not attempt a direct sewing result in the case $m = 0$. However one can get something in this direction by restricting the class of test functions and taking the limit $m \to 0$ as we did for reflection positivity.
(2) Our treatment has featured manifolds with a single boundary circle. However one could as well consider manifolds with many boundary circles $\{C_i\}$. In this case one would consider operators between (algebraic) tensor products of Hilbert spaces $\mathcal{H}_{C_i}$ based on the boundary circles. Again one can show a sewing property of the type we have presented. This is essentially the structure discussed by Segal [12] in his axioms for conformal field theory, except that we have not accommodated
the possibility of sewing together boundary circles on the same manifold. See also Gawędzki [10], Huang [11], and Langlands [13].
6. Interacting Fields
We continue to restrict to $d = 2$ and now study interacting fields on a compact Riemann surface $(M, g)$. For this we may as well assume $m > 0$. We introduce a potential for $A \subset M$
$$V_A(\phi) = \int_A :\!P(\phi(x))\!:_{(-\Delta+m^2)^{-1}} \sqrt{\det g(x)}\, dx . \tag{44}$$
Here $P$ is a lower semi-bounded polynomial. This is not obviously well-defined since it refers to products of distributions. However it turns out that the Wick ordering provides sufficient regularization and we have
Lemma 4. $V_A$, $e^{-V_A}$ are functions in $L^p(Q, \mathcal{O}, d\mu)$ for all $p < \infty$.
In the plane and with $A$ compact this is a classic result of constructive field theory [3, 5, 14]. The proof has been extended to compact subsets of paracompact complete Riemannian manifolds by De Angelis, de Falco and Di Genova [6]. Hence it holds for compact manifolds and an interacting field theory can be defined by the measure$^b$
$$d\nu = \frac{e^{-V_M}\, d\mu}{\int e^{-V_M}\, d\mu} . \tag{45}$$
As noted by Gawędzki [10] there may be special choices of the polynomial $P$ such that this is a conformal field theory. We want to establish the Markov property for $\nu$, generalizing Nelson's result on the plane [3]. The conditional expectations for $\nu$ are denoted $E^\nu_A F \equiv E^\nu\{F \,|\, \mathcal{O}_A\}$.
Theorem 4. Let $F \in L^2(Q, \mathcal{O}, d\mu)$.
(1) For $A \subset M$ closed
$$E^\nu_A F = \frac{E_A(F e^{-V_{A^c}})}{E_A(e^{-V_{A^c}})} . \tag{46}$$
(2) (Markov property) For open $\Omega \subset M$, let $F$ be $\mathcal{O}_{\bar\Omega}$ measurable. Then
$$E^\nu_{\Omega^c} F = E^\nu_{\partial\Omega} F . \tag{47}$$
$^b$ See [15] for some results on Lorentzian manifolds.
Proof (cf. [5]). Let $G$ be bounded and $\mathcal{O}_A$ measurable. We compute $\int G F e^{-V_M}\, d\mu$ in two different ways. On the one hand, conditioning with respect to $\nu$, it is
$$\int G\, E^\nu_A(F)\, e^{-V_M}\, d\mu = \int G\, E^\nu_A(F)\, E_A(e^{-V_{A^c}})\, e^{-V_A}\, d\mu . \tag{48}$$
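The second version follows since everything except $e^{-V_{A^c}}$ is $\mathcal{O}_A$ measurable. On the other hand, conditioning with respect to $\mu$, it is
$$\int G\, E_A(F e^{-V_M})\, d\mu = \int G\, E_A(F e^{-V_{A^c}})\, e^{-V_A}\, d\mu . \tag{49}$$
It follows that $E^\nu_A(F)\, E_A(e^{-V_{A^c}}) = E_A(F e^{-V_{A^c}})$, which is the first result. For the second result we have by the Markov property for $\mu$
$$E^\nu_{\Omega^c} F = \frac{E_{\Omega^c}(F e^{-V_\Omega})}{E_{\Omega^c}(e^{-V_\Omega})} = \frac{E_{\partial\Omega}(F e^{-V_\Omega})}{E_{\partial\Omega}(e^{-V_\Omega})} . \tag{50}$$
This shows that $E^\nu_{\Omega^c} F$ is $\mathcal{O}_{\partial\Omega}$ measurable and hence
$$E^\nu_{\Omega^c} F = E^\nu_{\partial\Omega} E^\nu_{\Omega^c} F = E^\nu_{\partial\Omega} F \tag{51}$$
where the last follows by the adjoint of $E^\nu_{\Omega^c} E^\nu_{\partial\Omega} = E^\nu_{\partial\Omega}$.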
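The steps (50) and (51) can be annotated as follows. This is a sketch of the reasoning as we read it, assuming (as the proof implicitly does) that $V_\Omega$ is $\mathcal{O}_{\bar\Omega}$ measurable and that the conditional expectations $E^\nu_A$ are regarded as self-adjoint projections on $L^2(Q, \mathcal{O}, d\nu)$:

```latex
\begin{align*}
E^\nu_{\Omega^c} F
 &= \frac{E_{\Omega^c}(F e^{-V_\Omega})}{E_{\Omega^c}(e^{-V_\Omega})}
   && \text{(46) with } A = \Omega^c,\ A^c = \Omega, \\
 &= \frac{E_{\partial\Omega}(F e^{-V_\Omega})}{E_{\partial\Omega}(e^{-V_\Omega})}
   && \text{Theorem 1, since } F e^{-V_\Omega} \text{ and } e^{-V_\Omega}
      \text{ are } \mathcal{O}_{\bar\Omega} \text{ measurable}, \\
E^\nu_{\partial\Omega} E^\nu_{\Omega^c}
 &= \bigl(E^\nu_{\Omega^c} E^\nu_{\partial\Omega}\bigr)^{*}
  = \bigl(E^\nu_{\partial\Omega}\bigr)^{*}
  = E^\nu_{\partial\Omega}
   && \text{since } \mathcal{O}_{\partial\Omega} \subset \mathcal{O}_{\Omega^c} .
\end{align*}
```

The first two lines show that $E^\nu_{\Omega^c} F$ is a ratio of $\mathcal{O}_{\partial\Omega}$ measurable functions, so it is fixed by $E^\nu_{\partial\Omega}$; the third line then converts $E^\nu_{\partial\Omega} E^\nu_{\Omega^c} F$ into $E^\nu_{\partial\Omega} F$.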
Acknowledgment This work was initiated at the Institute for Advanced Study in Princeton whose hospitality I gratefully acknowledge. This research was supported by NSF Grant PHY0070905. References [1] E. Nelson, Construction of quantum fields from Markov fields, J. Funct. Anal. 12 (1973) 97–112. [2] E. Nelson, The free Markov field, J. Funct. Anal. 12 (1973) 211–227. [3] E. Nelson, Probability theory and Euclidean field theory, in Constructive Quantum Field Theory, eds. G. Velo and A. Wightman (Springer-Verlag, New York, 1973). [4] M. Taylor, Partial Differential Equations I (Springer, New York, 1996). [5] B. Simon, The P (φ)2 Euclidean Field Theory (Princeton University Press, Princeton, 1974). [6] G. De Angelis, D. de Falco and G. Di Genova, Random fields on Riemannian manifolds: A constructive approach, Commun. Math. Phys. 103 (1986) 297–303. [7] G. Felder, J. Frohlich and J. Keller, On the structure of unitary conformal field theory, Commun. Math. Phys. 124 (1989) 417–463. [8] A. Jaffe, S. Klimek and A. Lesniewski, Representations of the Heisenberg algebra on a Riemann surface, Commun. Math. Phys. 126 (1989) 421–431. [9] Z. Haba, Reflection positivity and quantum fields on a Riemann surface, in Stochastic Processes, Physics, and Geometry, eds. S. Albeverio, G. Casati, G. Cattaneo, D. Merlini and R. Moresi (World Scientific, Singapore, 1990). [10] K. Gawedski, Lectures on conformal field theory, in Quantum Fields and Strings: A Course for Mathematicians, eds. P. Deligne, P. Etingof, D. Freed, L. Jeffrey, D. Kazhdan, J. Morgan, D. Morrison and E. Witten (American Mathematical Society, Providence, 1999). [11] Y. Z. Huang, Two Dimensional Conformal Geometry and Vertex Operator Algebras (Birkhauser, Boston, 1997). [12] G. Segal, Two-dimensional conformal field theories and modular functions, in IX Int. Congress on Mathematical Physics, eds. B. Simon, A. Truman and I. M. Davies (Adam Hilger, Bristol, 1989), pp. 22–37.
[13] R. P. Langlands, The renormalization fixed point as a mathematical object, IAS, preprint. [14] J. Glimm and A. Jaffe, Quantum Physics, a Functional Integral Point of View (Springer, New York, 1987). [15] J. Dimock, P (φ)2 models with variable coefficients, Ann. Phys. 154 (1984) 283–307.
March 19, 2004 12:14 WSPC/148-RMP
00198
Reviews in Mathematical Physics Vol. 16, No. 2 (2004) 257–280 © World Scientific Publishing Company

SELF-ADJOINT OPERATORS AFFILIATED TO C*-ALGEBRAS

MONDHER DAMAK and VLADIMIR GEORGESCU
CNRS and Department of Mathematics, University of Cergy-Pontoise, 2, avenue Adolphe Chauvin, 95302 Cergy-Pontoise Cedex, France
[email protected]

Received 21 July 2003
Revised 13 January 2004

We discuss criteria for the affiliation of a self-adjoint operator to a C*-algebra. We consider in particular the case of graded C*-algebras and we give applications to Hamiltonians describing the motion of dispersive N-body systems and the wave propagation in pluristratified media.

Keywords: Operators affiliated to C*-algebras; graded C*-algebra; N-body problem; dispersive Hamiltonians; stratified media; pluristratified media.
1. Introduction

Let $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$ and $\mathcal{C}$ a C*-algebra of operators on $\mathcal{H}$. We say that $H$ is affiliated to $\mathcal{C}$ if there is a complex number $z \notin \sigma(H)$ (the spectrum of $H$) such that $(H - z)^{-1} \in \mathcal{C}$. By the Stone–Weierstrass theorem this is equivalent to the property: $\varphi(H) \in \mathcal{C}$ if $\varphi \in C_0(\mathbb{R})$ (see the end of this section for notations). If, moreover, the operators of the form $\varphi(H)T$, with $\varphi \in C_0(\mathbb{R})$ and $T \in \mathcal{C}$, generate a dense subspace of $\mathcal{C}$, we say that $H$ is strictly affiliated to $\mathcal{C}$.

In the first part of this paper we consider the following problem. Assume that $H = H_0 + V$, where $H_0$ is a self-adjoint operator (strictly) affiliated to $\mathcal{C}$ and $V$ is a symmetric densely defined sesquilinear form, the sum being interpreted in the sense of quadratic forms. Under what conditions on $V$ is $H$ (strictly) affiliated to $\mathcal{C}$? We give a rather complete answer to this question under the assumption that $V$ is form bounded with respect to $H_0$, cf. Theorems 2.5 and 2.8. The main feature of Theorem 2.8 is that, when $H_0$ is bounded from below, the perturbation $V$ is allowed to be of the same order as $H_0$ with no restrictions on its positive part (this is case (ii) of Definition 2.1). More precisely, the following is a particular case of Theorem 2.8 (see also Lemma 2.9).

Theorem 1.1. Let $H_0$ be a positive self-adjoint operator strictly affiliated to $\mathcal{C}$. Let $V$ be a continuous symmetric sesquilinear form on $D(H_0^{1/2})$ such that $V \ge -\mu H_0 - \delta$ for some real numbers $\mu, \delta$ with $\mu < 1$, and let $H$ be the self-adjoint
operator associated to the form sum H0 + V . If there is a real number α > 1/2 such that (H0 + 1)−α V (H0 + 1)−1/2 ∈ C , then H is strictly affiliated to C . Earlier results (the best ones, as far as we know, being those from [1, 2]) require much stronger conditions on V , and this is quite unnatural in applications, cf. the example concerning stratified media discussed below. We note that, since the class of self-adjoint operators affiliated to C is stable under norm resolvent limits, one could also treat situations where V has a positive component which is not bounded with respect to H0 . Hamiltonians of N -body systems with hard core interactions belong to this category, see [1]. In the second part of the paper we consider the case when the C ∗ -algebra C is graded by a semilattice (which could be infinite), for example by the lattice of all vector subspaces of a finite dimensional vector space, or by the lattice of all subsets of a finite set. Such situations appear in the study of the Hamiltonians of N -body systems or of quantum fields with a finite particle number cut-off, see [3] for a review of these applications. Theorem 3.5, our main result in this context, is then used in Sec. 4 where we show that our framework allows one to give a unified treatment of the spectral theory of the Hamiltonians of N -body systems and of stratified media. We consider here only the question of the essential spectrum; we shall discuss the Mourre estimate in a later publication. Our initial motivation for studying these matters came from a conversation with Viorel Iftimie concerning his paper [4] with Yves Dermenjian on the Hamiltonians of “pluristratified media”, which generalize both non-relativistic N -body Hamiltonians and Hamiltonians of stratified media (studied in [5]). It seemed at that moment that the geometric methods they used (e.g. partitions of unity of Simon type) gave better results than the algebraic methods described in [6]. 
In fact, the strongest affiliation criterion to the N-body C*-algebra we knew at that moment [1, Proposition 3.6], when applied to an operator of the form $Hu = \operatorname{div}(M \operatorname{grad} u)$, required that the function $M$ be a small Hölder continuous perturbation of the constant function 1. Instead, they were able to treat the case $M \in L^\infty$. Now our Theorem 3.5 covers this situation and goes much further (for example, we allow non-local operators $M$ and a more general behavior at infinity, cf. Sec. 4).

Our interest in the questions studied here is, however, mainly motivated by the developments in [7–9] concerning the role of the C*-algebras in the study of the spectral properties of quantum Hamiltonians. We can summarize the point of view of [9] as follows: instead of focusing on the study of one self-adjoint operator $H$, the Hamiltonian of the system, one should study the C*-algebra $\mathcal{C}$ of “energy observables” of the system. Indeed, some general assertions concerning the essential spectrum or the Mourre estimate are better understood at this level and are valid for all the self-adjoint operators affiliated to $\mathcal{C}$ (an example is Theorem 4.4). Several techniques for constructing $\mathcal{C}$ are presented in [9, 3], the main idea being that $\mathcal{C}$ must be the C*-algebra generated by some very simple operators (see Sec. 4 for the N-body case). But then one has to prove that “realistic Hamiltonians” are affiliated
to C . The results of our paper show that the class of self-adjoint operators affiliated to such algebras is much larger than one would a priori believe. Thus, in the papers [5, 4] it was realized that N -body techniques can be used in the study of perturbed stratified media. But, according to the preceding point of view, this is not an accident: indeed, the corresponding Hamiltonians are affiliated to the same C ∗ -algebra, the C ∗ -algebra generated by “elementary” N -body Hamiltonians in the sense of [9], and the main assertions concerning the essential spectrum and the Mourre estimate are valid for all the self-adjoint operators affiliated to this algebra. In Appendix A we discuss several questions related to the notion of affiliation. The results established there are needed in Secs. 3 and 4 (but are also of some independent interest). The algebraic techniques used in these sections force us to consider the more general framework of “observables” affiliated to C ∗ -algebras. Indeed, although our purpose is to study a self-adjoint operator defined in the usual sense on a Hilbert space, these techniques require to show that the operator is affiliated to a certain C ∗ -algebra and then to take its image under a morphism, an operation which has no meaning at a purely Hilbertian level. In this way we get an abstract object (that we call observable in order to avoid confusions) affiliated to the image C ∗ -algebra, and then we are faced with the problem of showing that in certain representations this observable defines a self-adjoint operator. This is the main problem discussed in Appendix A. In the last part of Appendix A we show that an observable strictly affiliated to C is an unbounded self-adjoint element affiliated with C in the sense of Woronowicz [10]; but the reciprocal assertion is not true in general. 
The notion of affiliation that we use is that introduced in [11, 12], where its usefulness in the spectral analysis of N-body systems is pointed out (proof of the HVZ theorem and of the Mourre estimate; see also [1]). When this terminology was introduced the authors of the quoted papers were not aware of the work of Woronowicz, and the assertion concerning the connection between the two notions made later on in [9] is only partly correct (see Appendix A for details). On the other hand, note that the unbounded self-adjoint elements affiliated with $\mathcal{C}$ in the sense of Woronowicz are exactly the self-adjoint operators of the (right) Hilbert C*-module $\mathcal{C}$, and this now seems to be the standard terminology used in the literature [13, Chaps. 9 and 10].

The present paper is a revised and improved version of a part of our work on “C*-algebras related to the N-body problem and the self-adjoint operators affiliated to them”. This is preprint 99-482 at http://www.ma.utexas.edu/mp_arc/ and was not submitted for publication.

Notations and terminology

If $X$ is a locally compact space we denote by $C_c(X)$, $C_0(X)$, $C_b(X)$ the spaces of continuous functions on $X$ which have compact support, tend to zero at infinity, or are bounded, respectively. We use the notation $\langle x \rangle = (1 + |x|^2)^{1/2}$ whenever it
makes sense. A ∗-morphism between two ∗-algebras will be called morphism. If H is a Hilbert space, then B(H) and K(H) are the C ∗ -algebras of bounded or compact operators on H, respectively. A self-adjoint operator on H is assumed to be densely defined (this assumption is not always convenient, cf. [1]). 2. Main Affiliation Criterion In this section H0 will be a self-adjoint operator on a Hilbert space H affiliated to a C ∗ -algebra C ⊂ B(H). We denote by G the form domain of H0 , so G = D(|H0 |1/2 ) equipped with the graph topology, and we embed G ⊂ H = H ∗ ⊂ G ∗ in the usual way (G ∗ is the space of antilinear continuous functionals). Then for each z ∈ C\σ(H0 ) the operator H0 − z extends to an isomorphism of G onto G ∗ whose inverse is a continuous extension of (H0 − z)−1 to an operator G ∗ → G. We shall keep the notations H0 − z and (H0 − z)−1 for these extensions. We identify as usual the continuous (symmetric) sesquilinear forms on G with the continuous (symmetric) linear operators G → G ∗ . Definition 2.1. We say that V is a standard form perturbation of H0 if V is a continuous symmetric sesquilinear form on G and there are numbers µ ∈ [0, 1) and δ ∈ R such that one of the following conditions is satisfied: (i) ±V ≤ µ |H0 | + δ as forms on G; (ii) H0 is bounded from below and V ≥ −µH0 − δ as forms on G. If V is a standard form perturbation of H0 then the operator H = H0 + V ∈ B(G, G ∗ ) induces a self-adjoint operator in H which we shall denote also by H. More precisely, the restriction of H to D(H) = {f ∈ G | Hf ∈ H} is a self-adjoint operator on H. This known fact is also a consequence of the proof of Theorem 2.5. Indeed, we show there that there is a complex number z such that the maps H − z and H − z¯ are isomorphisms of G onto G ∗ , from which the self-adjointness of H on D(H) follows easily. 
Now it is easy to show that $\mathcal{G}$ is adapted to $H$ in the following sense: for each $z \in \mathbb{C} \setminus \sigma(H)$ the operator $H - z$ extends to an isomorphism of $\mathcal{G}$ onto $\mathcal{G}^*$ whose inverse is a continuous extension of $(H - z)^{-1}$ to an operator $\mathcal{G}^* \to \mathcal{G}$. As in the case of $H_0$, we shall keep the notations $H - z$ and $(H - z)^{-1}$ for these extensions. Moreover, in case (ii) the operator $H$ is bounded from below and $\mathcal{G}$ is its form domain. But we stress that in case (i) the space $\mathcal{G}$ is not necessarily the form domain of $H$.

Definition 2.2. We say that a continuous symmetric sesquilinear form $V$ on $\mathcal{G}$ is of class $(H_0, \mathcal{C})$ if for each integer $k \ge 1$ there are numbers $z_0, z_1, \dots, z_k$ outside the spectrum of $H_0$ such that
$$(H_0 - z_0)^{-1} V (H_0 - z_1)^{-1} \cdots V (H_0 - z_k)^{-1} \in \mathcal{C}. \tag{2.1}$$
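For $k = 1$ one can check directly that the relation (2.1) does not depend on the choice of the points. The computation below is a sketch added for illustration and is not taken from the original argument. It uses the shorthand $R(z) = (H_0 - z)^{-1}$ (our notation), the first resolvent identity, and the fact that $R(w) \in \mathcal{C}$ for every $w$ outside the spectrum of $H_0$, which holds because $H_0$ is affiliated to $\mathcal{C}$.

```latex
% Notation (ours): R(z) = (H_0 - z)^{-1}; z_0, z_1, w_0, w_1 lie outside \sigma(H_0).
% First resolvent identity: R(w) - R(z) = (w - z) R(w) R(z).
\begin{align*}
R(w_0) &= \bigl[1 + (w_0 - z_0) R(w_0)\bigr] R(z_0), \\
R(w_1) &= R(z_1) \bigl[1 + (w_1 - z_1) R(w_1)\bigr], \\
R(w_0)\, V\, R(w_1)
  &= \bigl[1 + (w_0 - z_0) R(w_0)\bigr]\; R(z_0) V R(z_1)\; \bigl[1 + (w_1 - z_1) R(w_1)\bigr].
\end{align*}
% If R(z_0) V R(z_1) \in \mathcal{C}, then the right-hand side is a sum of products
% of elements of \mathcal{C}, so R(w_0) V R(w_1) \in \mathcal{C} as well.
```

For larger $k$ the same identity applied to an interior resolvent splits the product into shorter products of the same type, which is presumably how an induction on $k$ would proceed.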
Remark 2.3. If $V$ is of class $(H_0, \mathcal{C})$ then the relation (2.1) holds for all the choices of $z_0, \dots, z_k$ outside the spectrum of $H_0$. This follows by induction on $k$ by using the resolvent identity.

For the proof of the main results of this section we need the next lemma.

Lemma 2.4. Let $I$ be a real open interval and let $\{H_\nu\}_{\nu \in I}$ be a family of self-adjoint operators on $\mathcal{H}$ which depend analytically on $\nu$; more precisely, assume that the map $\nu \mapsto (H_\nu - z)^{-1} \in B(\mathcal{H})$ is analytic on $I$ for some (hence for all) $z \in \mathbb{C} \setminus \mathbb{R}$. If there is a non-empty open interval $J \subset I$ such that $H_\nu$ is affiliated to $\mathcal{C}$ when $\nu \in J$, then $H_\nu$ is affiliated to $\mathcal{C}$ for all $\nu \in I$.

Proof. Let $F(\nu) = (H_\nu - z)^{-1}$; then $F : I \to B(\mathcal{H})$ is analytic and $F(\nu)$ belongs to the closed subspace $\mathcal{C}$ if $\nu \in J = (a, b)$. Assume that $b$ is not the right extremity of the interval $I$, so that $b \in I$; then all the derivatives $F^{(k)}(b)$ belong to $\mathcal{C}$ (because they can be computed using only values $F(\nu)$ with $\nu < b$ and $\mathcal{C}$ is closed). Since $F(\nu)$ for $\nu$ near $b$ is the sum of its Taylor expansion at $\nu = b$ we see that $F(\nu) \in \mathcal{C}$ if $b \le \nu \le b + \varepsilon$ for some $\varepsilon > 0$. Repeating the argument we get $F(\nu) \in \mathcal{C}$ for all $\nu \in I$, $\nu > b$. The values $\nu < a$ are treated in the same way.

Theorem 2.5. Assume that $V$ is a standard form perturbation of $H_0$ and that $V$ is of class $(H_0, \mathcal{C})$. Then the self-adjoint operator $H = H_0 + V$ is affiliated to $\mathcal{C}$. If $H_0$ is strictly affiliated to $\mathcal{C}$ then $H$ is strictly affiliated to $\mathcal{C}$.

Proof. Assume first that we are in case (i) of Definition 2.1 and denote $\Lambda = (H_0^2 + \lambda^2)^{-1/4}$ where $\lambda$ is a real number $\ne 0$. Then $\Lambda$ is an isomorphism of $\mathcal{H}$ onto $\mathcal{G}$ and of $\mathcal{G}^*$ onto $\mathcal{H}$, hence $\Lambda V \Lambda$ is a bounded operator on $\mathcal{H}$, and from the estimate $\pm V \le \mu |H_0| + \delta$ we get
$$\pm \Lambda V \Lambda \le \frac{\mu |H_0| + \delta}{(H_0^2 + \lambda^2)^{1/2}} \le \mu + \frac{\delta}{|\lambda|} < 1$$
if $|\lambda| > \delta (1 - \mu)^{-1}$. Fix such a $\lambda$ and observe that $\|\Lambda V \Lambda\| < 1$. Let $S$ be the unitary operator $S = (H_0 + i\lambda)\Lambda^2$. Then we have
$$H + i\lambda = (H_0 + i\lambda) + V = \Lambda^{-1} [S + \Lambda V \Lambda] \Lambda^{-1}.$$
But $\Lambda^{-1}$ is an isomorphism of $\mathcal{G}$ onto $\mathcal{H}$ and of $\mathcal{H}$ onto $\mathcal{G}^*$ and $\|\Lambda V \Lambda\| < 1$, so $S + \Lambda V \Lambda$ is invertible on $\mathcal{H}$. Hence $H + i\lambda$ is an isomorphism of $\mathcal{G}$ onto $\mathcal{G}^*$ with
$$(H + i\lambda)^{-1} = \Lambda \sum_{k=0}^{\infty} (-S^{-1} \Lambda V \Lambda)^k S^{-1} \Lambda = \sum_{k=0}^{\infty} (H_0 + i\lambda)^{-1} (-V) (H_0 + i\lambda)^{-1} \cdots (-V) (H_0 + i\lambda)^{-1}. \tag{2.2}$$
This is a norm convergent series of elements of $\mathcal{C}$, so $(H + i\lambda)^{-1} \in \mathcal{C}$, i.e. $H$ is affiliated to $\mathcal{C}$.
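As a sanity check on the expansion (2.2), the following short computation spells out why $S$ is unitary and why the two series agree term by term. It is an illustrative sketch added here, not part of the original proof; the scalar function $s$ is our notation.

```latex
% Notation as in the proof: \Lambda = (H_0^2 + \lambda^2)^{-1/4}, S = (H_0 + i\lambda)\Lambda^2.
% (1) S = s(H_0) with s(x) = (x + i\lambda)(x^2 + \lambda^2)^{-1/2}; since |s(x)| = 1
%     for all real x, the functional calculus shows that S is unitary.
% (2) \Lambda and S are both functions of H_0, hence they commute, and
\begin{align*}
\Lambda S^{-1} \Lambda &= \Lambda^{2} S^{-1} = (H_0 + i\lambda)^{-1}, \\
\Lambda \bigl(-S^{-1} \Lambda V \Lambda\bigr)^{k} S^{-1} \Lambda
  &= (-1)^{k}\, (\Lambda S^{-1} \Lambda)\, V\, (\Lambda S^{-1} \Lambda)\, V \cdots V\, (\Lambda S^{-1} \Lambda) \\
  &= (H_0 + i\lambda)^{-1} (-V) (H_0 + i\lambda)^{-1} \cdots (-V) (H_0 + i\lambda)^{-1},
\end{align*}
% with k factors V and k + 1 resolvents.
% (3) Norm convergence: \|S^{-1} \Lambda V \Lambda\| = \|\Lambda V \Lambda\| < 1.
```

Each term on the last line has exactly the form appearing in (2.1) with all $z_j = -i\lambda$, which is why the class $(H_0, \mathcal{C})$ hypothesis places it in $\mathcal{C}$.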
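Assume now that we are in case (ii) of Definition 2.1 and let $\lambda$ be a positive number such that $\lambda > -\inf H_0$. We take $\Lambda = (H_0 + \lambda)^{-1/2}$ and observe that the estimate $V \ge -\mu H_0 - \delta$ implies
$$U := \Lambda V \Lambda \ge -\frac{\mu H_0 + \delta}{H_0 + \lambda} = -\mu + \frac{\mu\lambda - \delta}{H_0 + \lambda} \ge -\mu - \frac{\delta}{\lambda + \inf H_0}.$$
Hence we can choose $\mu_0 \in (\mu, 1)$ and $\lambda$ such that the bounded symmetric operator $U$ in $\mathcal{H}$ has the property $U \ge -\mu_0$. Let $\varepsilon > 0$ be such that $(1 + \varepsilon)\mu_0 = \mu_1 < 1$; then for $0 \le \nu \le 1 + \varepsilon$ we shall have $1 + \nu U \ge 1 - \mu_1 > 0$, hence $1 + \nu U$ is invertible in $B(\mathcal{H})$ with a bound $\|(1 + \nu U)^{-1}\| \le (1 - \mu_1)^{-1}$ independent of $\nu$. Let us denote $H_\nu = H_0 + \nu V$, so $H_1 = H$; then we have
$$H_\nu + \lambda = (H_0 + \lambda) + \nu V = \Lambda^{-1} [1 + \nu U] \Lambda^{-1}.$$
So $H_\nu + \lambda$ is an isomorphism of $\mathcal{G}$ onto $\mathcal{G}^*$ and $\|(H_\nu + \lambda)^{-1}\|_{\mathcal{G}^* \to \mathcal{G}}$ is bounded by a constant independent of $\nu$. Moreover, the map $\nu \mapsto (H_\nu + \lambda)^{-1} \in B(\mathcal{G}^*, \mathcal{G})$ is analytic on the interval $(0, 1 + \varepsilon)$ (because so is $\nu \mapsto (1 + \nu U)^{-1} \in B(\mathcal{H})$). If $0 < \nu < \|U\|^{-1}$ then
$$(H_\nu + \lambda)^{-1} = \Lambda \sum_{k=0}^{\infty} (-\nu U)^k \Lambda = \sum_{k=0}^{\infty} (-\nu)^k (H_0 + \lambda)^{-1} V (H_0 + \lambda)^{-1} \cdots V (H_0 + \lambda)^{-1}$$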
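belongs to $\mathcal{C}$. From Lemma 2.4 we get that the self-adjoint operator $H_\nu$ is affiliated to $\mathcal{C}$ for all $\nu \in (0, 1 + \varepsilon)$.

Finally, we assume that $H_0$ is strictly affiliated to $\mathcal{C}$ and we prove that $H$ has the same property. Lemma A.5 and the strict affiliation of $H_0$ imply that it suffices to show that $\lim_{\varepsilon \to 0} \|\theta(\varepsilon H) R_0 - R_0\| = 0$ if $\theta(\varepsilon H) = (1 + i\varepsilon H)^{-1}$ and $R_0 = (H_0 + i)^{-1}$. We have $\|\theta(\varepsilon H_0) R_0 - R_0\| \to 0$, so it is enough to prove that $\|[\theta(\varepsilon H) - \theta(\varepsilon H_0)] R_0\| \to 0$. The identity $\theta(\varepsilon H) - \theta(\varepsilon H_0) = -i\varepsilon\, \theta(\varepsilon H) V \theta(\varepsilon H_0)$ holds in $B(\mathcal{G}^*, \mathcal{G})$ and implies:
$$\|[\theta(\varepsilon H) - \theta(\varepsilon H_0)] R_0\| \le \varepsilon\, \bigl\|(1 + i\varepsilon H)^{-1} (1 + |H_0|)^{1/2}\bigr\| \cdot \|W\| \cdot \bigl\|(1 + |H_0|)^{1/2} (H_0 + i)^{-1} (1 + i\varepsilon H_0)^{-1}\bigr\| \le c\,\varepsilon\, \bigl\|(1 + i\varepsilon H)^{-1} (1 + |H_0|)^{1/2}\bigr\|$$
for some finite constant $c$. Here $W = (|H_0| + 1)^{-1/2} V (|H_0| + 1)^{-1/2}$. If we are in situation (ii) of Definition 2.1 then the operators $H_0$ and $H$ have the same form domain, hence the operator $(1 + |H|)^{-1/2} (1 + |H_0|)^{1/2}$ is bounded. This clearly implies that the last term above is dominated by $\sqrt{\varepsilon}$, and hence tends to zero with $\varepsilon$. In situation (i) of Definition 2.1 the form domains of $H_0$ and $H$ could be different, but one can argue as follows. Let $\Lambda_\varepsilon = (1 + \varepsilon^2 H_0^2)^{-1/4}$ and $S_\varepsilon = (1 + i\varepsilon H_0) \Lambda_\varepsilon^2$, so that $S_\varepsilon$ is a unitary operator in $\mathcal{H}$. Then
$$1 + i\varepsilon H = 1 + i\varepsilon H_0 + i\varepsilon V = \Lambda_\varepsilon^{-1} [S_\varepsilon + i\varepsilon \Lambda_\varepsilon V \Lambda_\varepsilon] \Lambda_\varepsilon^{-1}.$$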
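From $\pm \varepsilon V \le \mu |\varepsilon H_0| + \varepsilon\delta$ we get $\pm \varepsilon \Lambda_\varepsilon V \Lambda_\varepsilon \le \mu + \varepsilon\delta \le \mu_0 < 1$ if $\varepsilon$ is small enough, hence $\|\varepsilon \Lambda_\varepsilon V \Lambda_\varepsilon\| \le \mu_0$. Then by using
$$(1 + i\varepsilon H)^{-1} = \Lambda_\varepsilon [S_\varepsilon + i\varepsilon \Lambda_\varepsilon V \Lambda_\varepsilon]^{-1} \Lambda_\varepsilon$$
we see that there is a finite number $c$ such that for $\varepsilon$ small:
$$\bigl\|(1 + i\varepsilon H)^{-1} (1 + |H_0|)^{1/2}\bigr\| \le c\, \bigl\|\Lambda_\varepsilon (1 + |H_0|)^{1/2}\bigr\|$$
which is dominated by $\varepsilon^{-1/2}$.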
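The final bound by $\varepsilon^{-1/2}$ reduces to an elementary scalar inequality through the functional calculus of $H_0$. The following sketch, added for illustration, works it out with explicit constants; the restriction $0 < \varepsilon \le 1$ and the constant $2^{1/4}$ are our choices and are not claimed to be optimal.

```latex
% Here \Lambda_\varepsilon = (1 + \varepsilon^2 H_0^2)^{-1/4} and 0 < \varepsilon \le 1.
% By the functional calculus the norm is at most the supremum of a scalar function:
\begin{align*}
\bigl\|\Lambda_\varepsilon (1 + |H_0|)^{1/2}\bigr\|
  &\le \sup_{x \in \mathbb{R}} \left[ \frac{(1 + |x|)^{2}}{1 + \varepsilon^{2} x^{2}} \right]^{1/4}.
\end{align*}
% Use (1 + |x|)^2 \le 2 (1 + x^2) and 1 + \varepsilon^2 x^2 \ge \varepsilon^2 (1 + x^2):
\begin{align*}
\left[ \frac{(1 + |x|)^{2}}{1 + \varepsilon^{2} x^{2}} \right]^{1/4}
  \le \left[ \frac{2 (1 + x^{2})}{\varepsilon^{2} (1 + x^{2})} \right]^{1/4}
  = 2^{1/4}\, \varepsilon^{-1/2}.
\end{align*}
% Combined with the factor \varepsilon coming from the resolvent difference, this gives
% a bound of order \varepsilon \cdot \varepsilon^{-1/2} = \varepsilon^{1/2} \to 0.
```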
The condition isolated in Definition 2.2 is optimal in a sense made precise by the following proposition. Note that if $V$ is a continuous symmetric form on $\mathcal{G}$ then $\nu V$ is a standard form perturbation of $H_0$ if $\nu$ is real and small, so $H_\nu = H_0 + \nu V$ is a well defined self-adjoint operator for such $\nu$.

Proposition 2.6. Let $V$ be a continuous symmetric form on $\mathcal{G}$ such that $H_\nu = H_0 + \nu V$ is affiliated to $\mathcal{C}$ if $\nu > 0$ is small enough. Then $V$ is of class $(H_0, \mathcal{C})$.

Proof. Let $C$ be a real number such that $\pm V \le C(|H_0| + 1)$ and let us assume that $2\nu C < 1$. Since $\pm \nu V \le C\nu(|H_0| + 1)$, we are in situation (i) of Definition 2.1. In the first part of the proof of Theorem 2.5 we take $\lambda = 1$ and observe that $\|\Lambda \nu V \Lambda\| < 1$. Hence for the operator $R_\nu = (H_\nu + i)^{-1}$ we have the norm convergent expansion (2.2) in which $V$ is replaced by $\nu V$. Thus the (right) derivative of order $k$ at zero of the $\mathcal{C}$-valued map $\nu \mapsto R_\nu$ exists in norm and equals $(-1)^k R_0 V R_0 \cdots V R_0$ ($k$ factors $V$), up to the factor $k!$. So this product belongs to $\mathcal{C}$.

If $\mathcal{C}$ is a C*-subalgebra of $B(\mathcal{H})$ we set
$$M(\mathcal{C}) = \{ S \in B(\mathcal{H}) \mid T \in \mathcal{C} \Rightarrow ST,\ TS \in \mathcal{C} \}. \tag{2.3}$$
This is clearly a C ∗ -algebra. It can be identified with the usual multiplier algebra (see [13]) of C if and only if C is nondegenerate on H (see the few lines after Theorem A.1 in Appendix A for this notion). This is the case for the algebra considered here because the self-adjoint operator H0 is affiliated to it, see Proposition A.3. Note that by Proposition A.6 if H0 is strictly affiliated to C then ϕ(H0 ) belongs to M(C ) for all ϕ ∈ Cb (R). Example 2.7. In general M(C ) is considerably larger than C . The following example is relevant in the context of Sec. 4. Let X be an euclidean space, E a Hilbert space, and H = L2 (X; E). Let C = C0 (X; K(E)) be the C ∗ -algebra of norm continuous functions X → K(E) which tend to zero at infinity. Then M(C ) is the algebra of bounded strongly continuous functions X → B(E). If H0 is affiliated to C and V ∈ M(C ) is symmetric then it is easy to show that the self-adjoint operator H = H0 + V is affiliated to C . More generally: Theorem 2.8. Assume that V is a standard form perturbation of H0 and , if H0 is not semibounded , assume that H0 is strictly affiliated to C . Let us denote
$U \equiv (|H_0| + 1)^{-1/2} V (|H_0| + 1)^{-1/2}$. If $U$ belongs to $M(\mathcal{C})$ then $H$ is affiliated to $\mathcal{C}$. If $H_0$ is strictly affiliated to $\mathcal{C}$ then $H$ is strictly affiliated to $\mathcal{C}$.

Proof. We check the conditions of Theorem 2.5. Let $\Lambda = (|H_0| + 1)^{-1/2}$ and $S = (|H_0| + 1)(H_0 - z)^{-1}$, so that $(H_0 - z)^{-1} = S\Lambda^2$. If $H_0$ is semibounded, then clearly $S = 1 + \varphi(H_0)$ for some $\varphi \in C_0(\mathbb{R})$, hence $S$ belongs to $M(\mathcal{C})$. If $H_0$ is not semibounded, then $S = \phi(H_0)$ for some $\phi \in C_b(\mathbb{R})$ and $H_0$ was supposed strictly affiliated to $\mathcal{C}$, hence $S$ belongs to $M(\mathcal{C})$ by a remark made above. Then
$$(H_0 - z)^{-1} V (H_0 - z)^{-1} \cdots V (H_0 - z)^{-1} = S \Lambda U S U S \cdots U S \Lambda.$$
The operators $U$ and $S$ belong to $M(\mathcal{C})$ and $M(\mathcal{C})$ is an algebra. Moreover, $\Lambda$ belongs to $\mathcal{C}$, so $(H_0 - z)^{-1} V (H_0 - z)^{-1} \cdots V (H_0 - z)^{-1} \in \mathcal{C}$.

If $H_0$ is bounded from below one can obviously replace $(|H_0| + 1)^{-1/2}$ in the definition of $U$ from Theorem 2.8 by $(H_0 + \lambda)^{-1/2}$ if $H_0 + \lambda \ge c > 0$. More generally, the following result is useful in applications for checking the hypotheses of the theorem.

Lemma 2.9. Assume that $H_0$ is strictly affiliated to $\mathcal{C}$.
(i) A symmetric operator $U \in B(\mathcal{H})$ belongs to $M(\mathcal{C})$ if and only if one has $\varphi(H_0) U \in \mathcal{C}$ for all $\varphi \in C_c(\mathbb{R})$ and also if and only if there is $\varphi \in C_0(\mathbb{R})$ with $\varphi(x) \ne 0$ for all $x \in \mathbb{R}$ such that $\varphi(H_0) U \in \mathcal{C}$.
(ii) Let $V \in B(\mathcal{G}, \mathcal{G}^*)$ be symmetric and let $U$ be as in Theorem 2.8. Then $U$ belongs to $M(\mathcal{C})$ if and only if there is $\theta \in C_b(\mathbb{R})$, with $|\theta(x)| \sim |x|^{-1/2}$ for large $x$, such that $\varphi(H_0) V \theta(H_0) \in \mathcal{C}$ for all $\varphi \in C_c(\mathbb{R})$.

Proof. The first assertion follows from the fact that the elements of the form $T\varphi(H_0)$, with $T \in \mathcal{C}$ and $\varphi \in C_c(\mathbb{R})$, generate a dense subspace of $\mathcal{C}$. For the second one, observe that there is $\eta \in C_b(\mathbb{R})$ such that $(|x| + 1)^{-1/2} = \theta(x)\eta(x)$. Hence from $\varphi(H_0) V \theta(H_0) \in \mathcal{C}$ we get $\varphi(H_0) V (|H_0| + 1)^{-1/2} \in \mathcal{C}$ by using Proposition A.6. Replacing $\varphi(x)$ by $\varphi(x)(|x| + 1)^{-1/2}$ we then get $\varphi(H_0) U \in \mathcal{C}$ for all $\varphi \in C_c(\mathbb{R})$, so $U \in M(\mathcal{C})$ by the first assertion of the lemma. The reciprocal assertion is proved similarly.

3. Graded Algebras

We consider here an application to $\mathcal{L}$-graded C*-algebras, a class of algebras defined in [14] for arbitrary semilattices $\mathcal{L}$ (see [11, 12] or [6] for finite $\mathcal{L}$ and note that our conventions are different, the roles of the lower and upper bounds in the definition of the grading being interchanged). If $\mathcal{A}$, $\mathcal{B}$ are subalgebras of an algebra $\mathcal{C}$ then we denote by $\mathcal{A} \cdot \mathcal{B}$ the linear subspace of $\mathcal{C}$ generated by the elements of the form $AB$ with $A \in \mathcal{A}$, $B \in \mathcal{B}$. A family $\{\mathcal{C}_i\}_{i \in I}$ of subalgebras of $\mathcal{C}$ is linearly independent if for each family $\{S_i\}_{i \in I}$
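such that $S_i \in \mathcal{C}_i$ and $S_i \ne 0$ for at most a finite number of $i$ we have $\sum_{i \in I} S_i = 0$ if and only if $S_i = 0$ for all $i \in I$.

Let $\mathcal{L}$ be a semilattice, i.e. a partially ordered set in which each pair of elements $a, b$ has a greatest lower bound $a \wedge b$. For later convenience we also assume that $\mathcal{L}$ has a greatest element $\max \mathcal{L} \equiv e$, although Definition 3.1 does not require this. Then $(\mathcal{L}, \wedge)$ is an abelian monoid with $e$ as neutral element. If $\mathcal{E} \subset \mathcal{L}$ is a $\wedge$-stable subset we say that $\mathcal{E}$ is a sub-semilattice of $\mathcal{L}$. For $a \in \mathcal{L}$ we set $\mathcal{L}_a = \{ b \in \mathcal{L} \mid b \ge a \}$; this is a sub-semilattice and a submonoid (i.e. $e \in \mathcal{L}_a$).

Definition 3.1. We say that a C*-algebra $\mathcal{C}$ is $\mathcal{L}$-graded if a linearly independent family $\{\mathcal{C}(a)\}_{a \in \mathcal{L}}$ of C*-subalgebras of $\mathcal{C}$ is given such that:
(i) $\mathcal{C}(a) \cdot \mathcal{C}(b) \subset \mathcal{C}(a \wedge b)$ for all $a, b \in \mathcal{L}$;
(ii) if $\mathcal{E} \subset \mathcal{L}$ is finite then $\sum_{a \in \mathcal{E}} \mathcal{C}(a)$ is a closed subspace of $\mathcal{C}$;
(iii) $\sum_{a \in \mathcal{L}} \mathcal{C}(a)$ is dense in $\mathcal{C}$.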
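To make Definition 3.1 concrete, here is a minimal toy example. It is an illustration added here and does not come from the paper: the two-element semilattice $\{o, e\}$, the infinite-dimensional Hilbert space $\mathcal{H}$, and the choice of the compact operators plus scalars are all assumptions made only for this sketch.

```latex
% Toy example (ours): \mathcal{L} = \{o, e\} with o < e, so that
%   o \wedge o = o, \quad o \wedge e = o, \quad e \wedge e = e.
% Let \mathcal{H} be an infinite-dimensional Hilbert space and put
\begin{align*}
\mathcal{C}(o) &= K(\mathcal{H}) \quad \text{(compact operators)}, &
\mathcal{C}(e) &= \mathbb{C} \cdot 1, &
\mathcal{C} &= K(\mathcal{H}) + \mathbb{C} \cdot 1 .
\end{align*}
% Linear independence: K(\mathcal{H}) \cap \mathbb{C} \cdot 1 = \{0\}, because the
% identity is not compact in infinite dimension.
% Condition (i):
\begin{align*}
\mathcal{C}(o) \cdot \mathcal{C}(o) &\subset \mathcal{C}(o), &
\mathcal{C}(o) \cdot \mathcal{C}(e) &\subset \mathcal{C}(o) = \mathcal{C}(o \wedge e), &
\mathcal{C}(e) \cdot \mathcal{C}(e) &\subset \mathcal{C}(e).
\end{align*}
% Conditions (ii) and (iii): the sum of the closed ideal K(\mathcal{H}) and the
% one-dimensional space \mathbb{C} \cdot 1 is closed and equals \mathcal{C}.
```

In this example $\mathcal{C}(e)$ contains the identity, so its action on $\mathcal{C}(o)$ is nondegenerate.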
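Observe that each $\mathcal{C}(a)$ is equipped with a natural left and right action of the C*-algebra $\mathcal{C}(e)$: indeed, one has $\mathcal{C}(e) \cdot \mathcal{C}(a) \subset \mathcal{C}(a)$ and $\mathcal{C}(a) \cdot \mathcal{C}(e) \subset \mathcal{C}(a)$. It is clear that the action of $\mathcal{C}(e)$ on $\mathcal{C}$ is nondegenerate if and only if its action on each $\mathcal{C}(a)$ is nondegenerate. In this case, as explained in Appendix A, for each $S \in \mathcal{C}$ there are $T \in \mathcal{C}$ and $A, B \in \mathcal{C}(e)$ such that $S = ATB$.

For each $\mathcal{E} \subset \mathcal{L}$ we define $\mathcal{C}(\mathcal{E})$ as the closure of $\sum_{a \in \mathcal{E}} \mathcal{C}(a)$. If $\mathcal{E} \subset \mathcal{L}$ is a sub-semilattice then $\mathcal{C}(\mathcal{E})$ is an $\mathcal{E}$-graded (hence $\mathcal{L}$-graded) C*-subalgebra of $\mathcal{C}$. If $a \in \mathcal{L}$ we set $\mathcal{C}_a = \mathcal{C}(\mathcal{L}_a)$.

Proposition 3.2. One has $\mathcal{C} = \bigcup_{\mathcal{E}} \mathcal{C}(\mathcal{E})$ where $\mathcal{E}$ runs over the set of countable sub-semilattices of $\mathcal{L}$. Each observable affiliated to $\mathcal{C}$ is affiliated to a C*-algebra of the form $\mathcal{C}(\mathcal{E})$ with $\mathcal{E}$ a countable sub-semilattice of $\mathcal{L}$.

Proof. Note first that each finite (countable) set $\mathcal{E} \subset \mathcal{L}$ is a subset of a finite (countable) sub-semilattice. Indeed, it suffices to add to it the greatest lower bounds of its finite non-empty subsets. From condition (iii) of Definition 3.1 it follows that each $T \in \mathcal{C}$ is the limit of a sequence of elements $T_n \in \mathcal{C}(\mathcal{E}_n)$ where $\mathcal{E}_n$ are finite sub-semilattices such that $\mathcal{E}_n \subset \mathcal{E}_{n+1}$. Then it suffices to take $\mathcal{E} = \bigcup_n \mathcal{E}_n$. This proves the first assertion. Now let $H$ be an observable affiliated to $\mathcal{C}$ and let $D \subset C_0(\mathbb{R})$ be a countable dense subset. For each $\varphi \in D$ there is $\mathcal{E}_\varphi \subset \mathcal{L}$ countable such that $\varphi(H) \in \mathcal{C}(\mathcal{E}_\varphi)$. Let $\mathcal{E}$ be the (countable) sub-semilattice generated by $\bigcup_{\varphi \in D} \mathcal{E}_\varphi$. Then $\varphi(H) \in \mathcal{C}(\mathcal{E})$ for all $\varphi \in D$ and $\mathcal{C}(\mathcal{E})$ is closed hence, by density and continuity, we get $\varphi(H) \in \mathcal{C}(\mathcal{E})$ for all $\varphi \in C_0(\mathbb{R})$.

The next two propositions are from [14]. For completeness, we include slightly simplified proofs. Note that if $\mathcal{L}$ is finite the proofs are very easy.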
Proposition 3.3. There is a unique linear map $P_a : \mathcal{C} \to \mathcal{C}_a$ such that $P_a T = T$ if $T \in \mathcal{C}(b)$ and $a \le b$ and $P_a T = 0$ if $T \in \mathcal{C}(b)$ and $a \not\le b$. This map is a
surjective morphism and a projection (in the sense of linear spaces). Its kernel is $\mathcal{C}(\mathcal{L}'_a)$, where $\mathcal{L}'_a = \{ b \in \mathcal{L} \mid b \not\ge a \}$.

Proof. Let $\mathcal{C}^\circ$ be the dense ∗-subalgebra of $\mathcal{C}$ given by $\mathcal{C}^\circ = \sum_{a \in \mathcal{L}} \mathcal{C}(a)$. We shall first prove the following assertion: the map $P_a^\circ : \mathcal{C}^\circ \to \mathcal{C}^\circ$ defined by $P_a^\circ \bigl[\sum_b T(b)\bigr] = \sum_{b \ge a} T(b)$ extends to a norm 1 projection $P_a$ of $\mathcal{C}$ onto $\mathcal{C}_a$, and this projection is also a morphism.

Since the sum $\sum_{a \in \mathcal{L}} \mathcal{C}(a)$ is direct and stable under taking adjoints, we see that the map $P_a^\circ$ is well defined and has the property $P_a^\circ[T]^* = P_a^\circ[T^*]$. If $S = \sum_{b \in \mathcal{L}} S(b)$ with $S(b) \ne 0$ only for a finite number of $b$ and $T$ is of a similar form, then
$$ST = \sum_{b, c} S(b) T(c) = \sum_{d \in \mathcal{L}} \sum_{b \wedge c = d} S(b) T(c).$$
Thus we have
$$P_a^\circ[ST] = \sum_{a \le d} \sum_{b \wedge c = d} S(b) T(c) = \sum_{a \le b \wedge c} S(b) T(c) = \sum_{a \le b,\ a \le c} S(b) T(c) = P_a^\circ[S]\, P_a^\circ[T].$$
This proves that $P_a^\circ$ is a morphism of the ∗-algebra $\mathcal{C}^\circ$ into $\mathcal{C}$. Clearly its range is equal to $\sum_{b \ge a} \mathcal{C}(b)$, which is a dense ∗-subalgebra of $\mathcal{C}_a$.

We now show that $\|P_a^\circ\| = 1$. For each $T \in \mathcal{C}^\circ$ there is a finite sub-semilattice $\mathcal{E}$ such that $T \in \mathcal{C}(\mathcal{E})$ (indeed, if $\mathcal{F}$ is the finite set of $b$ such that $T(b) \ne 0$, then we can take $\mathcal{E}$ equal to the set of elements of the form $b_1 \wedge \cdots \wedge b_n$ with $b_1, \dots, b_n \in \mathcal{F}$). Then $P_a^\circ|_{\mathcal{C}(\mathcal{E})}$ is a non-zero morphism of the C*-algebra $\mathcal{C}(\mathcal{E})$ onto the C*-algebra $\mathcal{C}(\mathcal{E}_a)$, with $\mathcal{E}_a = \mathcal{E} \cap \mathcal{L}_a$. But such a morphism has norm 1. The fact that $\|P_a^\circ\| = 1$ is now clear.

It follows that $P_a^\circ$ extends to a morphism $P_a : \mathcal{C} \to \mathcal{C}_a$ with $P_a T = T$ if $T \in \mathcal{C}_a$. In particular, $P_a$ is also a linear projection of $\mathcal{C}$ onto $\mathcal{C}_a$ with $\|P_a\| = 1$. Clearly $\sum_{b \not\ge a} \mathcal{C}(b) = (1 - P_a)\mathcal{C}^\circ$. Since $\mathcal{C}^\circ$ is dense in $\mathcal{C}$ and $1 - P_a$ is a continuous surjective map of $\mathcal{C}$ onto $\ker P_a$, we get that $\sum_{b \not\ge a} \mathcal{C}(b)$ is dense in $\ker P_a$. So $\ker P_a = \mathcal{C}(\mathcal{L}'_a)$.

The proof of Theorem 4.4 requires one more abstract result. We recall that the atoms of an ordered set $\mathcal{L}$ which has a least element $o$ are the minimal elements of $\mathcal{L} \setminus \{o\}$ and that $\mathcal{L}$ is atomic if each $a \ne o$ is minorated by an atom.

Proposition 3.4. Assume that the semilattice $\mathcal{L}$ has a least element $o$ and is atomic and let $\mathcal{M}$ be the set of its atoms. Then the map $P : S \mapsto (P_a S)_{a \in \mathcal{M}}$ is a morphism of $\mathcal{C}$ into $\prod_{a \in \mathcal{M}} \mathcal{C}_a$ and its kernel is equal to $\mathcal{C}(o)$.
Proof. We denoted by $\prod_{a \in \mathcal{M}} \mathcal{C}_a$ the direct product C*-algebra; then the fact that $P$ is a morphism follows immediately from Proposition 3.3. Moreover, the inclusion $\mathcal{C}(o) \subset \ker P$ is obvious. So it remains to prove that if $P_a T = 0$ for all $a \in \mathcal{M}$, then
$T \in \mathcal{C}(o)$. Observe that if $o \ne b \in \mathcal{L}$ then there is $a \in \mathcal{M}$ such that $a \le b$, hence for such a $T$ we have $P_b T = 0$ for all $b \ne o$ (the general relation $a \le b \Rightarrow P_b = P_b P_a$ is easy to prove).

We assume first that $\mathcal{L}$ is finite and we prove the following: if $S \in \mathcal{C}$ has the property $\|P_b S\| \le \varepsilon$ for a fixed $\varepsilon \ge 0$ and all $o \ne b \in \mathcal{L}$, then there is $S_o \in \mathcal{C}(o)$ such that $\|S - S_o\| \le 2\varepsilon$. If $\varepsilon = 0$ this amounts to proving the proposition for finite semilattices and the argument is quite straightforward. Indeed, write $S = \sum_{c \in \mathcal{L}} S(c)$ with $S(c) \in \mathcal{C}(c)$; then the hypothesis says that $\sum_{c \ge b} S(c) = 0$ for all $b \ne o$. Since the family of operators $\{S(c)\}$ is linearly independent, we get $S(b) = 0$ for all $b \ne o$. This proves that the map $T \mapsto (P_b T)_{b \ne o}$ is a morphism $\mathcal{C} \to \prod_{b \ne o} \mathcal{C}_b$ and has $\mathcal{C}(o)$ as its kernel. The map $\mathcal{C}/\mathcal{C}(o) \to \prod_{b \ne o} \mathcal{C}_b$ will be an isometry hence, if $S \in \mathcal{C}$ has the property $\|P_b S\| \le \varepsilon$ for each $b \ne o$, the image of $S$ in the quotient space $\mathcal{C}/\mathcal{C}(o)$ has norm $\le \varepsilon$. From the definition of the quotient norm it follows that there is $S_o \in \mathcal{C}(o)$ such that $\|S - S_o\| \le 2\varepsilon$.

Now we consider the general case. Let $T$ be such that $P_b T = 0$ for all $b \ne o$. It is clear that for each $\varepsilon > 0$ there is a finite sub-semilattice $\mathcal{F}$ of $\mathcal{L}$ with $o \in \mathcal{F}$ and there is $S \in \mathcal{C}(\mathcal{F})$ such that $\|T - S\| \le \varepsilon$. If $o \ne b \in \mathcal{L}$ then $\|P_b S\| = \|P_b[T - S]\| \le \|T - S\| \le \varepsilon$. Applying the result obtained above to the finite semilattice $\mathcal{F}$ and $S$, we find $S_o \in \mathcal{C}(o)$ such that $\|S - S_o\| \le 2\varepsilon$. Then $\|T - S_o\| \le 3\varepsilon$. Since $\varepsilon$ is arbitrary and $\mathcal{C}(o)$ is closed we get $T \in \mathcal{C}(o)$.

From now on we assume that $\mathcal{C}$ is realized as a C*-algebra of bounded operators on a Hilbert space $\mathcal{H}$. Let $H_e$ be a self-adjoint operator on $\mathcal{H}$ strictly affiliated to $\mathcal{C}(e)$. If $\mathcal{C}(e)$ acts nondegenerately on each algebra $\mathcal{C}(a)$ then $H_e$ will be strictly affiliated to $\mathcal{C}$. From Proposition A.6 it follows then that for each $S \in \mathcal{C}$ there are $T \in \mathcal{C}$ and $\varphi, \psi \in C_0(\mathbb{R})$ such that $S = \varphi(H_e) T \psi(H_e)$; if $S \in \mathcal{C}(a)$ then one may choose $T \in \mathcal{C}(a)$.
For the next theorem we assume that a bounded from below self-adjoint operator $H_e$ on $\mathcal{H}$ and a family $\{H(a)\}_{a \in \mathcal{L}}$ of symmetric continuous sesquilinear forms on $\mathcal{G} = D(|H_e|^{1/2})$ are given such that $H(a) \ge -\mu_a H_e - \delta_a$ for some positive numbers $\mu_a, \delta_a$ with $\sum_{a \in \mathcal{L}} \mu_a < 1$ and $\sum_{a \in \mathcal{L}} \delta_a < \infty$. Moreover, we assume that the family $\{H(a)\}_{a \in \mathcal{L}}$ is (norm) summable in $B(\mathcal{G}; \mathcal{G}^*)$; in particular the set of $a$ for which $H(a) \ne 0$ is countable. We shall take $H(e) = 0$ (this amounts to a redefinition of $H_e$). If we denote $V = \sum_{b \in \mathcal{L}} H(b)$ and $V_a = \sum_{b \ge a} H(b)$ then $V$ and $V_a$ are standard form perturbations of $H_e$. Hence $H = H_e + V$ and $H_a = H_e + V_a$ are bounded from below self-adjoint operators on $\mathcal{H}$ with form domain equal to $\mathcal{G}$.

Theorem 3.5. Assume that $H_e$ is strictly affiliated to the algebra $\mathcal{C}(e)$ and that $\mathcal{C}(e)$ acts nondegenerately on each $\mathcal{C}(a)$. Moreover, assume that for some numbers
$\alpha > 1/2$ and $\lambda > -\inf H_e$ and for each $a \in \mathcal{L}$ one has:
$$(\lambda + H_e)^{-\alpha} H(a) (\lambda + H_e)^{-1/2} \in \mathcal{C}(a). \tag{3.1}$$
Then H is strictly affiliated to C and Pa [H] = Ha for all a ∈ L.
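Remarks. $P_a[H]$ is defined as in Proposition A.7. If (3.1) holds for some $\alpha > 1/2$ and $\lambda > -\inf H_e$ then it holds for all such $\alpha, \lambda$. Moreover, let $\theta \in C_0(\mathbb{R})$ be such that $\theta(x) \sim x^{-1/2}$ as $x \to +\infty$. Then (3.1) is a consequence of
$$\varphi(H_e) H(a) \theta(H_e) \in \mathcal{C}(a), \quad \forall\, \varphi \in C_c(\mathbb{R}). \tag{3.2}$$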
On the other hand, (3.1) implies $\psi(H_e) H(a) \theta(H_e) \in \mathcal{C}(a)$ if $\psi$ is a continuous function such that $\psi(x) x^{1/2} \to 0$ as $x \to +\infty$.

Proof. We apply Theorem 2.8 with $U = (\lambda + H_0)^{-1/2} V (\lambda + H_0)^{-1/2}$ and $H_0 = H_e$ (see the comment after Theorem 2.8). By the preceding remark we have $\varphi(H_0) U \in \mathcal{C}$ for all $\varphi \in C_0(\mathbb{R})$. We have seen that $H_0$ is strictly affiliated to $\mathcal{C}$ so each $S \in \mathcal{C}$ is of the form $S = T\varphi(H_0)$ with $T \in \mathcal{C}$ and $\varphi \in C_0(\mathbb{R})$. We get $SU \in \mathcal{C}$ for all $S \in \mathcal{C}$, hence $U \in M(\mathcal{C})$. So, by Theorem 2.8, $H$ is strictly affiliated to $\mathcal{C}$.

It remains to prove that $P_a[H] = H_a$ for all $a \in \mathcal{L}$. Let us consider the operators $H^\nu = H_0 + \nu V$ and $H_a^\nu = H_0 + \nu V_a$ for $0 \le \nu \le 1 + \varepsilon$. The numbers $\lambda, \varepsilon$ are chosen exactly as in the proof of the second part of Theorem 2.5. Then $H^\nu$ is strictly affiliated to $\mathcal{C}$ and the operators $(\lambda + H^\nu)^{-1}$ and $(\lambda + H_a^\nu)^{-1}$ depend analytically on $\nu \in (0, 1 + \varepsilon)$. Clearly it suffices to prove that the equality $P_a[(\lambda + H^\nu)^{-1}] = (\lambda + H_a^\nu)^{-1}$ holds for small values of $\nu > 0$. Set $\Lambda = (H_0 + \lambda)^{-1/2}$, $U_a = \Lambda V_a \Lambda$ and $U = \Lambda V \Lambda$. Let us assume that $0 < \nu < \min(\|U\|^{-1}, \|U_a\|^{-1})$. Then (using the proof of Theorem 2.5) we have a norm convergent expansion
$$(\lambda + H^\nu)^{-1} = \sum_{k=0}^{\infty} (-\nu)^k \Lambda U^k \Lambda.$$
This remains valid if we replace H ν by Haν and U by Ua . So it suffices to prove that ΛU k Λ belongs to C and that its projection onto Ca is ΛUak Λ. P Since U = b ΛH(b)Λ one can develop the power U k and see that ΛU k Λ is a linear combination of terms of the form J = ΛU (b1 ) · · · U (bk )Λ with U (b) = ΛH(b)Λ. If in this linear combination one replaces by zero the terms such that bj is not ≥ a for some j then one clearly gets ΛUak Λ. So it suffices to prove that J ∈ C (b1 ∧ · · · ∧ bk ). We shall show that ΛU (b1 ) · · · U (bk ) ∈ C (b1 ∧ · · · ∧ bk ) by induction over k. If k = 1 this is clear by the remarks after the statement of the theorem. If the property holds for some k then by the Cohen–Hewitt theorem (see Appendix A) one can write ΛU (b1 ) · · · U (bk ) = T ϕ(H0 ) for some T ∈ C (b1 ∧ · · · ∧ bk ) and ϕ ∈ C0 (R) (cf. the comments after Proposition 3.3). Then ΛU (b1 ) · · · U (bk )U (bk+1 ) = T ϕ(H0 )U (bk+1 ) ∈ C (b1 ∧ · · · ∧ bk+1 )
because ϕ(H0 )U (bk+1 ) ∈ C (bk+1 ).
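The norm convergent expansion used in this proof can be made explicit. The following short derivation is only a sketch under the standing assumptions of the proof (all identities are understood in the form sense on $\mathcal{G}$); it uses nothing beyond $\Lambda = (H_0 + \lambda)^{-1/2}$ and $U = \Lambda V \Lambda$:

```latex
\begin{align*}
\lambda + H^{\nu}
  &= \lambda + H_0 + \nu V
   = \Lambda^{-1}\,(1 + \nu U)\,\Lambda^{-1}, \\
(\lambda + H^{\nu})^{-1}
  &= \Lambda\,(1 + \nu U)^{-1}\,\Lambda
   = \sum_{k=0}^{\infty} (-\nu)^{k}\, \Lambda U^{k} \Lambda ,
   \qquad \nu\,\|U\| < 1 .
\end{align*}
```

The Neumann series for $(1 + \nu U)^{-1}$ converges in norm as soon as $\nu \|U\| < 1$, which is why only small $\nu > 0$ need to be considered. The same computation with $V_a$ and $U_a$ in place of $V$ and $U$ gives the expansion of $(\lambda + H_a^{\nu})^{-1}$.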
Remark 3.6. It is clear that Theorem 2.8 can be used to get better results than Theorem 3.5. Indeed, the decomposability of the perturbation $V$ into a sum $\sum_a H(a)$ is not necessary. For example, a symmetric element $V \in \mathcal{C}$ is not necessarily decomposable in such a sum (if $\mathcal{L}$ is infinite); however, $H_e + V$ is obviously affiliated to $\mathcal{C}$.

4. Operators Affiliated to the N-Body Algebra

Let $X$ be a euclidean space and $\mathcal{H} = \mathcal{H}(X) \equiv L^2(X)$. The euclidean structure is not really needed in our arguments, but it simplifies the notations since it allows us to identify the adjoint space $X^*$ with $X$ and so to define the Fourier transformation $F$ such as to be a unitary operator in $\mathcal{H}(X)$ (the presentation in [14] involves only the vector space structure of $X$). We denote by $Q$ and $P$ the position and momentum ($X$-valued) observables; more precisely, if $\varphi, \psi$ are Borel functions on $X$ then $\varphi(Q)$ is the operator of multiplication by $\varphi$ in $\mathcal{H}(X)$ and $\psi(P) = F^* \psi(Q) F$.

Let $\mathcal{H}^s(X)$ be the Sobolev space of order $s \in \mathbb{R}$ with the usual identifications, so that $\mathcal{H}^0(X) = \mathcal{H}(X)$ and, if $s > 0$, $\mathcal{H}^s(X) \subset \mathcal{H}(X) \subset \mathcal{H}^{-s}(X) = \mathcal{H}^s(X)^*$. The spaces $\mathcal{B}^s(X) \equiv B(\mathcal{H}^s(X), \mathcal{H}^{-s}(X))$ are equipped with the natural norm topologies. $\mathcal{B}(X) \equiv \mathcal{B}^0(X) = B(\mathcal{H}(X))$ is a $C^*$-algebra (independent of the euclidean structure of $X$) and we denote by $\mathcal{K}(X) = K(\mathcal{H}(X))$ the ideal of compact operators. We also need the following class of operators.

Definition 4.1. $\mathcal{B}_0^s(X)$ is the set of operators $T \in \mathcal{B}^s(X)$ with the following property: there is $\nu > 0$ such that $T : \mathcal{H}^{s+\nu}(X) \to \mathcal{H}^{-s}(X)$ is compact.

It is clear that the last condition is satisfied if and only if, for each $\psi \in C_c^\infty(X)$, the operator $T\psi(P) : \mathcal{H}^s(X) \to \mathcal{H}^{-s}(X)$ is compact; and then this property holds for all $\psi \in C_0(X)$. The space $\mathcal{B}_0^s(X)$ is a closed linear subspace of $\mathcal{B}^s(X)$ consisting of operators which are small, in a weak sense, at large $x$. An example clarifying this matter can be found in Appendix B.
The $C^*$-algebra of Hamiltonians considered in the rest of this section has been introduced in [14]. The semilattice $\mathcal{L}$ will be a sub-semilattice of the Grassmannian $G(X)$, the set of all linear subspaces of $X$. $G(X)$ has a natural semilattice structure: we set $Y \le Z$ if $Y \subset Z$ and note that $Y \wedge Z = Y \cap Z$. In fact $G(X)$ is a complete lattice, the upper bound of $Y$ and $Z$ being $Y \vee Z = Y + Z$; the greatest element is $\max G(X) = X$ and the least element is $\min G(X) = \{0\}$. If $Y$ is a linear subspace of $X$ then the quotient $X/Y$ is a finite dimensional vector space (which can be identified with the orthogonal subspace $Y^\perp$ of $Y$ in $X$), hence the $C^*$-algebra $C_0(X/Y)$ is well defined (if $Y = X$ we take $C_0(X/Y) = \mathbb{C}$). Let $\pi_Y : X \to X/Y$ be the canonical surjection and let us denote $\varphi(Q_Y) = \varphi \circ \pi_Y(Q)$ for $\varphi \in C_0(X/Y)$.
Definition 4.2. $\mathcal{C}^X(Y)$ is the closed linear subspace of $\mathcal{B}(X)$ generated by the operators of the form $\varphi(Q_Y)\psi(P)$ with $\varphi \in C_0(X/Y)$ and $\psi \in C_0(X)$.

The following facts are easy to prove (see [14] for details).

Proposition 4.3. Each $\mathcal{C}^X(Y)$ is a $C^*$-subalgebra of $\mathcal{B}(X)$ and

$$\mathcal{C}^X(Y) \cdot \mathcal{C}^X(Z) \subset \mathcal{C}^X(Y \cap Z). \tag{4.1}$$

If $E \subset G(X)$ is finite then $\sum_{Y \in E} \mathcal{C}^X(Y)$ is closed in $\mathcal{B}(X)$ and the family of algebras $\{\mathcal{C}^X(Y)\}_{Y \in E}$ is linearly independent. We have $\mathcal{C}^X(X) = \{\varphi(P) \mid \varphi \in C_0(X)\}$ and $\mathcal{C}^X(\{0\}) = \mathcal{K}(X)$. The natural action of the algebra $\mathcal{C}^X(X)$ on each $\mathcal{C}^X(Y)$ is nondegenerate.

In order to keep close to notations traditional in the quantum mechanical N-body problem we assume now that a semilattice $\mathcal{L}$ with a largest element $e$ is given together with an injective map $\mathcal{L} \ni a \mapsto X_a \in G(X)$ such that $X_{a \wedge b} = X_a \cap X_b$ and $X_e = X$. We could identify $\mathcal{L}$ with a subset of $G(X)$, we could even take $\mathcal{L} = G(X)$, but in the usual N-body problem $\mathcal{L}$ is the semilattice of all partitions of the set $\{1, \ldots, N\}$. Note that, because of Proposition 3.2, we could assume that $\mathcal{L}$ is countable. Now we define:

$$\mathcal{C}(a) = \mathcal{C}^X(X_a), \qquad \mathcal{C} = \text{norm closure of } \sum_{a \in \mathcal{L}} \mathcal{C}(a). \tag{4.2}$$

Proposition 4.3 implies that $\mathcal{C}$ is an $\mathcal{L}$-graded $C^*$-algebra. We mentioned in the introduction that the affiliation to $\mathcal{C}$ of a Hamiltonian completely determines some of its basic spectral properties. The following result, a general version of the HVZ theorem, exemplifies this idea. We use the morphisms $\mathcal{P}_a$ introduced in Proposition 3.3 and for $H$ strictly affiliated to $\mathcal{C}$ we define $\mathcal{P}_a[H]$ as explained before Proposition A.7 and in its statement (below we assume strict affiliation although affiliation would be sufficient). Let $\mathcal{L}^*$ be the set of $a \in \mathcal{L}$ such that $X_a \neq \{0\}$ and let us denote by $\mathcal{M}$ the set of minimal elements of $\mathcal{L}^*$.

Theorem 4.4. Let $H$ be a bounded from below self-adjoint operator strictly affiliated to $\mathcal{C}$ and for each $a \in \mathcal{L}$ let $H_a$ be the self-adjoint operator on $\mathcal{H}(X)$ defined by $H_a = \mathcal{P}_a[H]$. Then for each $a \in \mathcal{L}^*$ there is a real number $\tau_a$ such that $\sigma(H_a) = [\tau_a, \infty[$ and one has $\sigma_{\mathrm{ess}}(H) = [\tau, \infty[$ with $\tau = \inf_{a \in \mathcal{M}} \tau_a$.

Proof. The theorem is a consequence of more general abstract results from [14] (see also [11, 12]). For completeness, we shall give here a self-contained proof which takes into account the specificities of the present situation. We first note that $\mathcal{L}$ must have a least element $o$. Indeed, let $n_a$ be the dimension of the vector space $X_a$ and let $m = \min_a n_a$. If the dimensions of $X_a$ and $X_b$ are equal to $m$, since $X_a \cap X_b = X_{a \wedge b}$ must have the same dimension, we get $X_a = X_b$, hence the existence of $o$ follows from the injectivity of the map $a \mapsto X_a$. Now we treat the (easy) case when $X_o \neq \{0\}$. Then $\mathcal{M} = \{o\}$ and $\mathcal{C}_o = \mathcal{C}$, so $H_a = H$ for $a = o$, hence it
suffices to prove that $\sigma(H) = [\tau, \infty[$ for some real number $\tau$. Since $(H + c)^{-1} \in \mathcal{C}$ if $c$ is large, we are reduced to proving that the spectrum of any symmetric element of $\mathcal{C}$ is an interval. Let $X^o$ be the orthogonal of $X_o$ in $X$. By making a Fourier transformation in $X^o$ it is easy to see that there is a natural injective morphism of $\mathcal{C}$ into the algebra $C_0(X_o; \mathcal{B}(X^o))$ (see Lemma 4.5). If $S$ is a symmetric element of this algebra, then $S$ is a norm continuous map $S : X_o \to \mathcal{B}(X^o)$ such that $S(x)$ is a symmetric operator for each $x \in X_o$ and $\lim_{x \to \infty} \|S(x)\| = 0$. Let $\alpha(x)$ and $\beta(x)$ be the lower and upper bound of $S(x)$. Then $\alpha, \beta \in C_0(X_o)$, we have $\{\alpha(x), \beta(x)\} \subset \sigma(S(x)) \subset [\alpha(x), \beta(x)]$ and the spectrum of $S$ is the union of the spectra of the operators $S(x)$. So clearly $\sigma(S) = [\min \alpha, \max \beta]$.

It remains to treat the case $X_o = \{0\}$. Then $\mathcal{C}(o) = \mathcal{K}(X)$ and it is well known that the essential spectrum of a symmetric element $S \in \mathcal{C}$ is equal to the spectrum of its image $\hat{S}$ in the quotient algebra $\hat{\mathcal{C}} := \mathcal{C}/\mathcal{K}(X)$. The rest of the proof is based on Proposition 3.4, which gives us a natural embedding $\hat{\mathcal{C}} \subset \prod_{a \in \mathcal{M}} \mathcal{C}_a$. Since the spectrum of an element of a product $C^*$-algebra is the closure of the union of the spectra of its components, we get

$$\sigma_{\mathrm{ess}}(S) = \overline{\bigcup_{a \in \mathcal{M}} \sigma(\mathcal{P}_a S)} .$$
The proof of the theorem is then finished by taking $S = (H + c)^{-1}$ for some large $c$, then using $\mathcal{P}_a S = (H_a + c)^{-1}$ and observing that the spectra of the operators $H_a$ are intervals, by what we have shown above.

Our goal is to construct a large class of self-adjoint operators strictly affiliated to $\mathcal{C}$. For this, we begin by choosing for each $a$ a supplementary subspace $X^a$ of $X_a$ in $X$. In the non-relativistic N-body problem one takes $X^a = X_a^\perp$, but it was pointed out in [15] that this is not convenient in the relativistic case. We denote by $\pi_a$ and $\pi^a$ the projections onto $X_a$ and $X^a$ determined by the direct sum decomposition $X = X_a + X^a$. Note that we have a canonical identification $X^a \cong X/X_a$ such that $\pi^a = \pi_{X_a}$, in particular $C_0(X^a) \cong C_0(X/X_a)$.

Since $X^a$ is a euclidean space, we can associate to it the observables $Q^a$ and $P^a$ and the spaces $\mathcal{H}(X^a)$, $\mathcal{B}^s(X^a)$, ..., as in the case of $X$. Observe that to each $\varphi \in C_0(X^a)$ one can associate two operators, namely $\varphi(Q^a)$ acting in $\mathcal{H}(X^a)$ and $\varphi(Q_{X_a}) = \varphi \circ \pi^a(Q)$ acting in $\mathcal{H}(X)$. Except for the next lemma, whose proof is easy and will not be given, we shall not distinguish between them. Note that the choice of $X^a$ allows one to consider functions defined on $X$ as functions of two variables $(x_a, x^a) \in X_a \times X^a$.

Lemma 4.5. There is a unique isomorphism $\mathcal{C}(a) \to C_0(X_a; \mathcal{K}(X^a))$ such that, for all $\varphi \in C_0(X^a)$, $\psi \in C_0(X)$, the image of $\varphi \circ \pi^a(Q)\psi(P)$ is the map defined by $x_a \mapsto \varphi(Q^a)\psi(x_a, P^a)$.

The direct sum decomposition $X = X_a + X^a$ allows us to identify $\mathcal{H}(X)$ with the space of vector valued functions $\mathcal{H}(X_a; \mathcal{H}(X^a)) \equiv L^2(X_a; \mathcal{H}(X^a))$. It is easy to
show that the Sobolev space of order $s > 0$ is then given by $\mathcal{H}^s(X) = \mathcal{H}(X_a; \mathcal{H}^s(X^a)) \cap \mathcal{H}^s(X_a; \mathcal{H}(X^a))$ with natural interpretations of the spaces from the right hand side. Now assume that $\Phi : X_a \to \mathcal{B}^s(X^a)$ is a weakly measurable symmetric operator valued map satisfying $\pm\Phi(x_a) \le C(1 + |x_a|^2 + |P^a|^2)^s$ for a constant $C$ and all $x_a$. Let $F_a$ be the Fourier transform in $\mathcal{H}(X_a)$, naturally extended to a unitary operator in $\mathcal{H}(X_a; \mathcal{H}(X^a))$. Then we can define a symmetric operator $\Phi(P_a) \in \mathcal{B}^s(X)$ by the following rule: if $u \in \mathcal{H}^s(X)$ then

$$\langle u, \Phi(P_a) u \rangle = \int_{X_a} \langle F_a u(x_a), \Phi(x_a) F_a u(x_a) \rangle \, dx_a .$$

We clearly have $\pm\Phi(P_a) \le C(1 + |P|^2)^s = C' \langle P \rangle^{2s}$, where $C'$ depends only on $C$ and on the definition of the norms in the various Sobolev spaces involved.

Theorem 4.6. Let $h : X \to [0, \infty[$ be continuous and assume that there is $s > 0$ such that $C'|x|^{2s} \le h(x) \le C''|x|^{2s}$ for some numbers $C', C''$ and all large $x$. For each $a \in \mathcal{L}$, $a \neq e$, let $V^a : X_a \to \mathcal{B}_0^s(X^a)$ be a symmetric operator valued norm continuous map such that $\pm V^a(x_a) \le C_a(1 + |x_a|^2 + |P^a|^2)^s$ for a constant $C_a$ and all $x_a$. Let $H_e = h(P)$, $H(e) = 0$ and $H(a) = V^a(P_a)$ for $a \neq e$. Assume that $H(a) \ge -\mu_a H_e - \delta_a$ for some positive numbers $\mu_a, \delta_a$ with $\sum_{a \in \mathcal{L}} \mu_a < 1$ and $\sum_{a \in \mathcal{L}} \delta_a < \infty$, and that the family $\{H(a)\}_{a \in \mathcal{L}}$ is (norm) summable in $\mathcal{B}^s(X)$. Then the self-adjoint operator $H = H_e + \sum_a H(a)$ (form sum) is strictly affiliated to $\mathcal{C}$ and for all $a \in \mathcal{L}$ one has $\mathcal{P}_a[H] = H_a \equiv H_e + \sum_{b \ge a} H(b)$.

Proof. We check the conditions of Theorem 3.5 and for this it suffices to prove that $\varphi(H_e) H(a) (H_e + 1)^{-1} \in \mathcal{C}(a)$ if $\varphi \in C_c(\mathbb{R})$, see (3.2). Note that $\mathcal{G} = \mathcal{H}^s(X)$. Clearly there is $\theta \in C_c(X)$ such that $\varphi(H_e) = \varphi(H_e)\theta(P)$. Then

$$\varphi(H_e) H(a) (H_e + 1)^{-1} = \varphi(H_e) \cdot \theta(P) H(a) \langle P \rangle^{-s} \cdot \langle P \rangle^{s} (H_e + 1)^{-1} \tag{4.3}$$

and so we see that it suffices to show that $S \equiv \theta(P) H(a) \langle P \rangle^{-s} \in \mathcal{C}(a)$. Indeed, the action of $\mathcal{C}(e)$ on $\mathcal{C}(a)$ being nondegenerate, we will then be able to write $S = S_0 \xi(P)$ with $S_0 \in \mathcal{C}(a)$ and $\xi \in C_0(X)$, cf. Theorem A.1. But $\varphi(H_e)$ belongs to $\mathcal{C}(e)$ and one can write $\langle P \rangle^{s} (H_e + 1)^{-1} = \eta(P)$ with $\eta \in C_b(X)$. Thus we see that the operator on the left hand side of (4.3) belongs to $\mathcal{C}(a)$.

Hence it remains to prove that $S \in \mathcal{C}(a)$. According to Lemma 4.5 this is equivalent to the fact that, if $\theta \in C_c(X)$, then the map $F : x_a \mapsto \theta(x_a, P^a) V^a(x_a) (1 + |x_a|^2 + |P^a|^2)^{-s/2}$ belongs to $C_0(X_a; \mathcal{K}(X^a))$. The support of $F$ being obviously compact, we have only to show its norm continuity and the fact that $F(x_a)$ is compact. But

$$F(x_a) = \theta(x_a, P^a) \langle P^a \rangle^{s+\nu} \cdot \langle P^a \rangle^{-s-\nu} V^a(x_a) \langle P^a \rangle^{-s} \cdot \langle P^a \rangle^{s} (1 + |x_a|^2 + |P^a|^2)^{-s/2} .$$
The first and the last factors on the right hand side are easily seen to be norm continuous functions of $x_a$. Finally, $\langle P^a \rangle^{-s-\nu} V^a(x_a) \langle P^a \rangle^{-s} \in \mathcal{K}(X^a)$ and depends norm continuously on $x_a$ by hypothesis.

We now consider the particular case of the non-relativistic N-body systems. This means that we take $h(x) = |x|^2$ and $s = 1$ in the preceding theorem. In particular, this time the euclidean structure of $X$ is really needed, $H_e = \Delta$ being the (positive) Laplace operator on $X$. It is then clear that the class of operators affiliated to $\mathcal{C}$ contains the dispersive N-body Hamiltonians first studied by J. Derezinski [16] and C. Gerard [15] and also the generalizations which appear in [6, 1, 2]. However, the class of Hamiltonians considered by these authors is significantly smaller than that covered by Theorem 4.6. Indeed, they require either that $V^a(x_a)$ is a compact operator $\mathcal{H}^1(X^a) \to \mathcal{H}^{-1}(X^a)$, so it cannot be a second order operator, or that $V^a(x_a)$ is independent of $x_a$ and is small and regular with respect to $\Delta$, cf. [1, Proposition 3.6]. These conditions are particularly annoying in view of the fact that the Hamiltonians which appear naturally in the study of wave propagation in (perturbed) stratified or pluristratified media [5, 4] do not satisfy them.

Our purpose now is to show that the class of operators introduced in Theorem 4.6 covers these situations (and much more). In order to simplify the discussion we shall assume from now on that $\mathcal{L}$ is finite. However, we stress that our formalism allows one to consider interesting situations when $\mathcal{L}$ is infinite: for example, take $X$ of dimension 2 and let the subspaces $X_a$ be one dimensional if $a \neq e$ and such that $\bigcup_{a \neq e} X_a$ is dense in $X$. Let us first restate the particular case of Theorem 4.6 that we need.

Corollary 4.7.
Assume that for each $a \in \mathcal{L}$, $a \neq e$, a norm continuous map $V^a : X_a \to \mathcal{B}_0^1(X^a)$ is given such that $V^a(x_a)$ is a symmetric form satisfying $\pm V^a(x_a) \le C_a(1 + |x_a|^2 + |P^a|^2)$ for a constant $C_a$ and all $x_a$. Let $H(e) = 0$ and $H(a) = V^a(P_a)$ for $a \neq e$. Assume that $H(a) \ge -\mu_a \Delta - \delta_a$ for some positive numbers $\mu_a, \delta_a$ with $\sum_{a \in \mathcal{L}} \mu_a < 1$ and $\sum_{a \in \mathcal{L}} \delta_a < \infty$. Then the self-adjoint operator $H = \Delta + \sum_a H(a)$ (form sum) is strictly affiliated to $\mathcal{C}$ and for all $a \in \mathcal{L}$ one has $\mathcal{P}_a[H] = H_a \equiv \Delta + \sum_{b \ge a} H(b)$.
In the rest of this section we take $X^a = X_a^\perp$ and assume that two families of symmetric operators $\{M^a\}$ and $\{W^a\}$ are given such that $M^a \in \mathcal{B}_0(X^a)$ and $W^a \in \mathcal{B}_0^1(X^a)$ for all $a \in \mathcal{L}$ and $M^e = W^e = 0$. For example, we can take $M^a = m^a(Q^a)$ where $m^a \in L^\infty(X^a)$ satisfies the conditions of Proposition B.1 (with $X$ replaced by $X^a$) and $W^a$ could be a differential operator of order $\le 2$ on $X^a$. The coefficients of $W^a$ are allowed to be quite singular, but they must tend to zero at infinity, in some weak sense. However, we stress that $M^a, W^a$ could be arbitrary, not necessarily local, operators. The only other assumption we make is: there are numbers $\mu_a, \delta \ge 0$ such that $\sum_a \mu_a < 1$ and, for all $a \in \mathcal{L}$,

$$M^a + \mu_a \ge 0 \quad \text{and} \quad \nabla^{a*}(M^a + \mu_a)\nabla^a + W^a + \delta \ge 0 . \tag{4.4}$$
Here $\nabla^a$ is the gradient operator associated to the euclidean structure of $X^a$ and $\nabla^{a*}$ is the corresponding divergence operator. Let us define for $x_a \in X_a$

$$V^a(x_a) = |x_a|^2 M^a + \nabla^{a*} M^a \nabla^a + W^a . \tag{4.5}$$
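The lower bound required in Corollary 4.7 can be checked directly from (4.4) and (4.5). The computation below is a sketch that assumes the normalization $|P^a|^2 = \nabla^{a*}\nabla^a$ (the Laplace operator of $X^a$) and takes $\delta_a = \delta$:

```latex
\begin{align*}
V^a(x_a) + \mu_a\bigl(|x_a|^2 + |P^a|^2\bigr) + \delta
  &= |x_a|^2\,(M^a + \mu_a)
     + \nabla^{a*}(M^a + \mu_a)\nabla^a + W^a + \delta \\
  &\ge 0 .
\end{align*}
```

The first term on the right is nonnegative because $M^a + \mu_a \ge 0$, and the remaining terms are nonnegative by the second inequality in (4.4). Replacing $x_a$ by $P_a$ (that is, conjugating with the Fourier transform in the $X_a$ variable) turns this fibrewise inequality into $H(a) + \mu_a \Delta + \delta \ge 0$.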
Since $|P_a|^2 + |P^a|^2 = |P|^2 = \Delta$ it is clear that the hypotheses of Corollary 4.7 are satisfied, hence the operator $H$ is strictly affiliated to $\mathcal{C}$. To see that this class of operators contains both non-relativistic N-body Hamiltonians and the Hamiltonians of perturbed pluristratified media, introduced and studied in [4], we shall now make the connection with the usual N-body formalism.

For each $a$ we have a canonical identification $\mathcal{H}(X) = \mathcal{H}(X_a) \otimes \mathcal{H}(X^a)$ which allows us to define the operator $M(a) \in \mathcal{B}(X)$ by $M(a) = 1_a \otimes M^a$ and $W(a) \in \mathcal{B}^1(X)$ by a similar formula. Here $1_a$ denotes the identity operator in $\mathcal{H}(X_a)$. The tensor product symbol depends on $a$ but we shall not indicate it explicitly since it will be clear from the context to what tensor decomposition we refer. If $\Delta_a = \nabla_a^* \nabla_a$ is the Laplace operator associated to $X_a$ then:

$$H(a) \equiv V^a(P_a) = \Delta_a \otimes M^a + 1_a \otimes (\nabla^{a*} M^a \nabla^a + W^a) = \nabla^* M(a) \nabla + W(a) .$$

Thus, the condition (4.4) implies (in fact, is clearly equivalent to) the estimate

$$\nabla^* M(a) \nabla + W(a) \ge -\mu_a \Delta - \delta . \tag{4.6}$$

Let us set $N = 1 + \sum_a M(a)$ and $V = \sum_a W(a)$. Then $H = \nabla^* N \nabla + V$ where $\nabla$ is the gradient operator associated to the euclidean structure on $X$. We compute the Hamiltonians $H_a = \mathcal{P}_a[H]$ with the help of Corollary 4.7:

$$H_a = \Delta + \sum_{b \ge a} H(b) = \nabla^* N_a \nabla + V_a \tag{4.7}$$

where $N_a = 1 + \sum_{b \ge a} M(b)$ and $V_a = \sum_{b \ge a} W(b)$. Hence, exactly as in the standard N-body situation, $H_a$ has the same structure as $H$.

We show now that, although not a non-relativistic N-body Hamiltonian in the sense of [6, Definition 9.4.1], $H$ is a rather simple dispersive Hamiltonian. For this we have to further decompose $H_a$. If $a, b \in \mathcal{L}$ with $a \le b$ and if we denote $X_b^a = X^a \cap X_b$ then $X^a = X_b^a \oplus X^b$, hence $\mathcal{H}(X^a) = \mathcal{H}(X_b^a) \otimes \mathcal{H}(X^b)$. We introduce several new operators in $\mathcal{H}(X^a)$. Let us set $M^a(b) = 1_b^a \otimes M^b \in \mathcal{B}(X^a)$ and $W^a(b) = 1_b^a \otimes W^b \in \mathcal{B}^1(X^a)$, where $1_b^a$ is the identity operator in $\mathcal{H}(X_b^a)$. Then let $N^a = 1^a + \sum_{b \ge a} M^a(b)$ and $V^a = \sum_{b \ge a} W^a(b)$, with $1^a$ the identity operator in $\mathcal{H}(X^a)$. Finally, we define

$$H^a = \nabla^{a*} N^a \nabla^a + V^a \tag{4.8}$$

and note that $H^a$ is a well defined self-adjoint operator in $\mathcal{H}(X^a)$. Observe also that $N_a = 1_a \otimes N^a$ and similarly for $V_a$. From (4.7) we then easily get

$$H_a = \Delta_a \otimes N^a + 1_a \otimes H^a . \tag{4.9}$$
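The passage from (4.7) to (4.9) can be spelled out as follows. This is a sketch, assuming that the gradient splits as $\nabla = (\nabla_a, \nabla^a)$ along the orthogonal decomposition $X = X_a \oplus X^a$, so that $\nabla_a$ acts only on the first tensor factor and $\nabla^a$ only on the second:

```latex
\begin{align*}
H_a &= \nabla^{*} N_a \nabla + V_a
     = \nabla_a^{*}\,(1_a \otimes N^a)\,\nabla_a
       + \nabla^{a*}\,(1_a \otimes N^a)\,\nabla^{a}
       + 1_a \otimes V^a \\
    &= \Delta_a \otimes N^a
       + 1_a \otimes \bigl(\nabla^{a*} N^a \nabla^{a} + V^a\bigr)
     = \Delta_a \otimes N^a + 1_a \otimes H^a .
\end{align*}
```

The first equality uses $N_a = 1_a \otimes N^a$ and $V_a = 1_a \otimes V^a$, the second uses that $\nabla_a$ commutes with $1_a \otimes N^a$ together with $\Delta_a = \nabla_a^* \nabla_a$, and the last is the definition (4.8).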
The set $\mathcal{L}$ is a finite semilattice hence has a smallest element $o = \min \mathcal{L}$ and $H_o = H$. If $X_o = \{0\}$ let $\mathcal{M}$ be the set of minimal elements of $\mathcal{L} \setminus \{o\}$. If not, let $\mathcal{M} = \{o\}$. The following result describes the essential spectrum of $H$.

Theorem 4.8. Let $\tau_a = \inf H^a$. Then for $a \in \mathcal{M}$ one has $\sigma(H_a) = [\tau_a, \infty[$ and if $\tau = \inf_{a \in \mathcal{M}} \tau_a$ then $\sigma_{\mathrm{ess}}(H) = [\tau, \infty[$.

Proof. Because of Theorem 4.4 it suffices to show that $H_a$ and $H^a$ have the same lower bound. Note that $F_a H_a F_a^*$ is the operator of multiplication by the operator valued function $H^a(x_a) := |x_a|^2 N^a + H^a$ in $\mathcal{H}(X_a; \mathcal{H}(X^a))$. Since the spectrum of $H_a$ is an interval and is the union of the spectra of the operators $H^a(x_a)$, and since $N^a \ge c > 0$ and $H^a(0) = H^a$, we get $\inf H_a = \inf H^a$.

We make two final remarks. First, it is easy to show that $H_a$ has no eigenvalues if $X_a \neq \{0\}$. This depends only on the fact that $H_a$ can be written as in (4.9) where the operator $\Delta_a$ has purely continuous spectrum and the operator $N^a$ is positive and invertible. Second, one can easily prove the Mourre estimate for $H$ (with the generator of dilations as conjugate operator) by using [6, Theorem 8.4.3] and a straightforward extension of [6, Theorem 8.3.6] to operators having the structure (4.9) of $H_a$ (see the proof of [6, Theorem 9.4.4]). However, we prefer to treat this question for a larger class of dispersive Hamiltonians in a later publication.

Acknowledgments

We are grateful to Viorel Iftimie for a very useful discussion and to Christian Gérard for pointing out to us that the initial version of Theorem 2.8 was not correctly formulated. We would also like to thank the referees for several useful suggestions which have been incorporated in the final version of our paper.

Appendix A. Observables and Strict Affiliation

In this appendix we discuss several questions raised by our definition of operators strictly affiliated to a $C^*$-algebra and establish some facts required in the preceding sections.
We need a particular case of a very interesting result due to Cohen and Hewitt (see [17, V.9.2] for the general case). Note that the Cohen–Hewitt theorem is used not only in this appendix but also plays an important role in the proof of Theorem 3.5. Let $\mathcal{A}$ be a $C^*$-algebra and let $\mathcal{E}$ be a (left) Banach $\mathcal{A}$-module. This means that $\mathcal{E}$ is a Banach space and that a continuous bilinear map $\mathcal{A} \times \mathcal{E} \ni (A, T) \mapsto AT \in \mathcal{E}$ has been given such that $(AB)T = A(BT)$ for all $A, B \in \mathcal{A}$ and $T \in \mathcal{E}$. Denote by $\mathcal{A} \cdot \mathcal{E}$ the linear subspace generated by the elements $AT$ with $A \in \mathcal{A}$ and $T \in \mathcal{E}$. Say that $\mathcal{E}$ is a nondegenerate $\mathcal{A}$-module, or that $\mathcal{A}$ acts nondegenerately on $\mathcal{E}$, if $\mathcal{A} \cdot \mathcal{E}$ is (norm) dense in $\mathcal{E}$. The following is the Cohen–Hewitt theorem:
Theorem A.1. If $\mathcal{E}$ is a nondegenerate $\mathcal{A}$-module then for each $S \in \mathcal{E}$ there are $A \in \mathcal{A}$ and $T \in \mathcal{E}$ such that $S = AT$.
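In the simplest commutative situation the factorization can be written down by hand. The following illustration is not taken from the paper; it assumes $\mathcal{A} = \mathcal{E} = C_0(\mathbb{R})$ with pointwise multiplication, which is a nondegenerate module over itself:

```latex
\begin{align*}
S &\in C_0(\mathbb{R}), \qquad
A := |S|^{1/2}, \qquad
T(x) :=
\begin{cases}
  S(x)\,|S(x)|^{-1/2}, & S(x) \neq 0, \\
  0, & S(x) = 0,
\end{cases} \\
S &= A\,T, \qquad |A| = |T| = |S|^{1/2} .
\end{align*}
```

Both factors are continuous and vanish at infinity because their modulus is $|S|^{1/2}$, so $A, T \in C_0(\mathbb{R})$. The content of the theorem is that such a factorization persists in the noncommutative setting, where no explicit formula of this kind is available.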
For example, if $\mathcal{C}$ is a $C^*$-algebra of operators on a Hilbert space $\mathcal{H}$ then $\mathcal{H}$ has a natural structure of Banach $\mathcal{C}$-module. One says that $\mathcal{C}$ is nondegenerate on $\mathcal{H}$ if $\mathcal{H}$ is a nondegenerate $\mathcal{C}$-module. If this is the case, then for each $g \in \mathcal{H}$ there are $T \in \mathcal{C}$ and $f \in \mathcal{H}$ such that $g = Tf$. As a second example, let $\mathcal{A}$ be a $C^*$-subalgebra of an arbitrary $C^*$-algebra $\mathcal{C}$. Then $\mathcal{C}$ has an obvious (left) $\mathcal{A}$-module structure (left multiplication by elements of $\mathcal{A}$) and if this action is nondegenerate then each $S \in \mathcal{C}$ can be written as a product $S = AT$ with $A \in \mathcal{A}$ and $T \in \mathcal{C}$. In fact we have more: $\mathcal{C} = \{ATB \mid A, B \in \mathcal{A} \text{ and } T \in \mathcal{C}\}$.
Indeed, if $S \in \mathcal{C}$ then $S = AT_1$ for some $A \in \mathcal{A}$ and $T_1 \in \mathcal{C}$. Since $T_1^* \in \mathcal{C}$ we can also write it as $T_1^* = A_1 T_2$ for some $A_1 \in \mathcal{A}$ and $T_2 \in \mathcal{C}$, and then $S = ATB$ with $T = T_2^* \in \mathcal{C}$ and $B = A_1^* \in \mathcal{A}$.

Now let $X$ be a locally compact topological space and $H : C_0(X) \to \mathcal{C}$ a morphism. It is convenient to use the notation $\varphi(H) = H(\varphi)$ and to call $H$ an $X$-valued observable affiliated to $\mathcal{C}$ (this is consistent with the quantum mechanical terminology). We say that $H$ is strictly affiliated to $\mathcal{C}$ if the range of the morphism $H$ (which is a $C^*$-subalgebra of $\mathcal{C}$) acts nondegenerately on $\mathcal{C}$, i.e. if the elements of the form $\varphi(H)T$, with $\varphi \in C_0(X)$ and $T \in \mathcal{C}$, generate a dense linear subspace in $\mathcal{C}$. By what we have seen above, this is equivalent to: for each $S \in \mathcal{C}$ there are $\varphi, \psi \in C_0(X)$ and $T \in \mathcal{C}$ such that $S = \varphi(H) T \psi(H)$. Note that if $\mathcal{A}$ is a $C^*$-subalgebra of $\mathcal{C}$ then an observable affiliated to $\mathcal{A}$ is also affiliated to $\mathcal{C}$. Clearly:

Lemma A.2. Let $H$ be an observable strictly affiliated to a $C^*$-subalgebra $\mathcal{A}$ of a $C^*$-algebra $\mathcal{C}$. Then $H$ is strictly affiliated to $\mathcal{C}$ if and only if the action of $\mathcal{A}$ on $\mathcal{C}$ is nondegenerate.

From now on we take $X = \mathbb{R}$, and observable will be an abbreviation for "real valued observable". Let us fix a $C^*$-algebra $\mathcal{C}$ of operators on a Hilbert space $\mathcal{H}$. If $H$ is a self-adjoint operator on $\mathcal{H}$ affiliated to $\mathcal{C}$ in the sense defined in the introduction, i.e. such that $\varphi(H) \in \mathcal{C}$ if $\varphi \in C_0(\mathbb{R})$, then one can obviously associate to it an observable $O_H$ affiliated to $\mathcal{C}$. Since the $C_0$ functional calculus uniquely determines a self-adjoint operator, the map $H \mapsto O_H$ is injective, so that we can identify a self-adjoint operator affiliated to $\mathcal{C}$ with an observable affiliated to $\mathcal{C}$. But there are observables which are not of this type: in fact the map $H \mapsto O_H$ becomes bijective only if we do not require that self-adjoint operators be densely defined (this is easy to prove, see [11, pp. 3102–3103] or [6]).
However, in the context of this paper it seems more natural to keep the standard definition of self-adjointness. On the other hand, an observable strictly affiliated to a $C^*$-algebra is realized as a self-adjoint operator in each nondegenerate representation of the algebra. Indeed, we have:
Proposition A.3. Let $\mathcal{C}$ be a $C^*$-algebra of operators on a Hilbert space $\mathcal{H}$. If there is a self-adjoint operator $H$ on $\mathcal{H}$ which is affiliated to $\mathcal{C}$ then $\mathcal{C}$ is nondegenerate on $\mathcal{H}$. If $\mathcal{C}$ is nondegenerate on $\mathcal{H}$ and $H$ is an observable strictly affiliated to $\mathcal{C}$, then $H$ is a self-adjoint operator on $\mathcal{H}$.

Proof. To prove the first assertion, let $\theta \in C_0(\mathbb{R})$ with $\theta(0) = 1$. Then for each $\varepsilon \neq 0$ one has $\theta(\varepsilon H) \in \mathcal{C}$ and $\theta(\varepsilon H) f \to f$ as $\varepsilon \to 0$ for each $f \in \mathcal{H}$. Now assume that $\phi : C_0(\mathbb{R}) \to \mathcal{C}$ is an observable strictly affiliated to $\mathcal{C}$. If $z$ is a non-real complex number we set $R(z) = \phi(r_z) \in \mathcal{C}$, where $r_z \in C_0(\mathbb{R})$ is the function $x \mapsto (x - z)^{-1}$. We will show that there is a self-adjoint operator $H$ such that $R(z) = (H - z)^{-1}$. This suffices to finish the proof, because the Stone–Weierstrass theorem will then give $\theta(H) = \phi(\theta)$ for all $\theta \in C_0(\mathbb{R})$. We clearly have $R(z_1) - R(z_2) = (z_1 - z_2) R(z_1) R(z_2)$ so, by a well known theorem, it is sufficient to prove, for each $f \in \mathcal{H}$, that $\|ni R(ni) f + f\| \to 0$ when $n \to \infty$. Since the action of $\mathcal{C}$ on $\mathcal{H}$ is nondegenerate and that of the range of $\phi$ on $\mathcal{C}$ is nondegenerate too, we can write $f = \phi(\theta) T g$ for some $\theta \in C_0(\mathbb{R})$, $T \in \mathcal{C}$ and $g \in \mathcal{H}$. Thus it suffices to show $\|(ni R(ni) + 1)\phi(\theta)\| \to 0$. But this is clear because $(ni(x - ni)^{-1} + 1)\theta(x) \to 0$ uniformly in $x$.

We now give an example which clarifies the distinction between self-adjoint operators affiliated or strictly affiliated to an algebra. It is easy to see that a self-adjoint operator $H$ affiliated to $\mathcal{C}$ is strictly affiliated to it if and only if the set of $T \in \mathcal{C}$ such that $TH$ extends to an operator in $\mathcal{C}$ is a dense subset of $\mathcal{C}$. This is also the case if and only if the set of $T \in \mathcal{C}$ such that $T\mathcal{H} \subset D(H)$ is dense in $\mathcal{C}$.

Example A.4. Let $h$ be a real rational function on $\mathbb{R}$ which tends to infinity at infinity and let $H$ be the operator of multiplication by $h$ in $L^2(\mathbb{R})$. Then $H$ is affiliated to $C_0(\mathbb{R})$ but it is strictly affiliated to it if and only if $h$ has no real poles.
Indeed, one can write $h = f/g$ for some real polynomials $f, g$ with no common roots and such that the degree of $f$ is strictly larger than that of $g$. Hence $(H + i)^{-1} = g/(f + gi) \in C_0(\mathbb{R})$. If $a$ is a real pole of $h$ then $g(a) = 0$, hence if $T \in C_0(\mathbb{R})$ is such that $TH$ extends to an operator in $C_0(\mathbb{R})$ then $T(a) = 0$. Thus the set of such $T$ is not dense in $C_0(\mathbb{R})$.

The following obvious fact will be useful.

Lemma A.5. If a self-adjoint operator $H$ is affiliated to $\mathcal{C}$ and if there is a function $\theta \in C_0(\mathbb{R})$ with $\theta(0) = 1$ such that $\lim_{\varepsilon \to 0} \|\theta(\varepsilon H) T - T\| = 0$ for each $T \in \mathcal{C}$, then $H$ is strictly affiliated to $\mathcal{C}$.

A convenient choice is $\theta(\varepsilon H) = (1 + i\varepsilon H)^{-1}$. The next proposition, which follows from the results established before, summarizes the facts that are important for us.

Proposition A.6. Let $H$ be a self-adjoint operator on $\mathcal{H}$ strictly affiliated to a $C^*$-algebra $\mathcal{C} \subset B(\mathcal{H})$. Then for each $S \in \mathcal{C}$ there are $\varphi, \psi \in C_0(\mathbb{R})$ and $T \in \mathcal{C}$
such that $S = \varphi(H) T \psi(H)$. Moreover, for all $\varphi \in C_b(\mathbb{R})$ and $T \in \mathcal{C}$ the operators $\varphi(H)T$ and $T\varphi(H)$ belong to $\mathcal{C}$. Finally, if $\{\varphi_n\}$ is a bounded sequence in $C_b(\mathbb{R})$ and $\lim_n \varphi_n(x) = \varphi(x)$ locally uniformly in $x$, then $\lim_n \|\varphi_n(H)T - \varphi(H)T\| = 0$ and $\lim_n \|T\varphi_n(H) - T\varphi(H)\| = 0$ for all $T \in \mathcal{C}$.

In the next proposition we consider images of observables through morphisms. If $\mathcal{C}_1$ and $\mathcal{C}_2$ are $C^*$-algebras, $H$ is an observable affiliated to $\mathcal{C}_1$ and $\mathcal{P} : \mathcal{C}_1 \to \mathcal{C}_2$ is a morphism, then $\mathcal{P}[H] := \mathcal{P} \circ H$ is an observable affiliated to the $C^*$-subalgebra $\mathcal{P}(\mathcal{C}_1)$ of $\mathcal{C}_2$, hence to $\mathcal{C}_2$. If $H$ is strictly affiliated to $\mathcal{C}_1$ then clearly $\mathcal{P}[H]$ is strictly affiliated to $\mathcal{P}(\mathcal{C}_1)$. These comments together with Lemma A.2 and Proposition A.3 imply the following.

Proposition A.7. Let $\mathcal{C}_1, \mathcal{C}_2$ be nondegenerate $C^*$-algebras of bounded operators on the Hilbert spaces $\mathcal{H}_1, \mathcal{H}_2$ respectively and let $\mathcal{P} : \mathcal{C}_1 \to \mathcal{C}_2$ be a morphism such that $\mathcal{P}(\mathcal{C}_1)$ acts nondegenerately on $\mathcal{C}_2$ (e.g. assume $\mathcal{P}$ is surjective). Then for each self-adjoint operator $H_1$ on $\mathcal{H}_1$ strictly affiliated to $\mathcal{C}_1$ there is a unique self-adjoint operator $H_2$ on $\mathcal{H}_2$ such that $\mathcal{P}[\varphi(H_1)] = \varphi(H_2)$ for each $\varphi \in C_0(\mathbb{R})$. Moreover, $H_2$ is strictly affiliated to $\mathcal{C}_2$.

Finally, let us describe the relation between observables affiliated to an abstract $C^*$-algebra $\mathcal{C}$ and self-adjoint elements affiliated to $\mathcal{C}$ in the sense of Woronowicz [10] (see also [18], [13, Chaps. 9 and 10] and references therein). We shall prove the following fact: to each observable $H$ strictly affiliated to $\mathcal{C}$ one can associate an unbounded self-adjoint element $W_H$ affiliated to $\mathcal{C}$ in the sense of Woronowicz, and the map $H \mapsto W_H$ is injective.$^{a}$

Note first that, for each observable $H$ affiliated to $\mathcal{C}$ and for each $T \in \mathcal{C}$, there is a largest open real set $\Omega_T$ such that $\varphi \in C_0(\Omega_T) \Rightarrow \varphi(H)T = 0$. We say that $\mathrm{supp}_H T := \mathbb{R} \setminus \Omega_T$ is the (left) $H$-support of $T$ and we denote by $\mathcal{C}_c^H$ the set of $T$ such that $\mathrm{supp}_H T$ is compact. Then $H$ is strictly affiliated to $\mathcal{C}$ if and only if $\mathcal{C}_c^H$ is a dense subspace of $\mathcal{C}$.
It is clear that if $\varphi \in C_0(\mathbb{R})$ then $\varphi(H)T$ depends only on the restriction of $\varphi$ to $\mathrm{supp}_H T$, so if the $H$-support of $T$ is compact then $\varphi(H)T$ is well defined for any $\varphi \in C(\mathbb{R})$. Let $\zeta$ be the map $\zeta(\lambda) = \lambda$ on $\mathbb{R}$ and let us define $W_H^\circ T = \zeta(H)T$ for $T \in \mathcal{C}_c^H$. Then $W_H^\circ$ is a densely defined closable operator on the Banach space $\mathcal{C}$ and we take $W_H$ equal to its closure. Then $W_H$ is an unbounded self-adjoint element affiliated to $\mathcal{C}$ in the sense of Woronowicz. In fact, by using [10, Theorem 1.2] we have $W_H = H(\zeta)$, the notation $H(\zeta)$ being interpreted in the sense of the quoted theorem. Alternatively, one can work in a faithful nondegenerate representation of $\mathcal{C}$ on some Hilbert space $\mathcal{H}$, realize $H$ as a self-adjoint operator on $\mathcal{H}$ (see Proposition A.3) and check (which is easy) that the conditions of [10, Sec. 1, Example 4] are satisfied.

$^{a}$ That this map is not bijective in general, as it was erroneously stated on [9, p. 534], follows from [10, Sec. 1, Example 3]. The second author (V.G.) would like to thank Jan Derezinski for a discussion which clarified this point to us.
Appendix B. Operators Small at Infinity

We describe here the multiplication operators of class $\mathcal{B}_0(X)$, cf. Definition 4.1.

Proposition B.1. Let $M = m(Q)$ be the operator of multiplication by a measurable function $m : X \to \mathbb{C}$. Then $M \in \mathcal{B}_0(X)$ if and only if $m$ is essentially bounded and $\lim_{a \to \infty} \int_{|x - a| < 1} |m(x)|\,dx = 0$.

Note that $m \in L^\infty(X)$ satisfies the last condition above if and only if $\theta * |m|$ belongs to $C_0(X)$ for all $\theta \in C_c^\infty(X)$, which means that the distribution $|m|$ weakly vanishes at infinity, see [6, Sec. 1.4]. This condition was suggested to us by the paper [19]. Our initial version of the proposition involved a stronger condition, namely the measure of the set where $|m(x)| > r$ had to be finite for each $r > 0$.

Proof of Proposition B.1. Assume first that $m \in L^\infty(X)$ and $M : \mathcal{H}^1 \to \mathcal{H}$ is compact. If $\theta$ is a $C_c^\infty(X)$ function such that $\theta(x) = 1$ on the unit ball and if we set $\theta_a(x) = \theta(x - a)$, then $\theta_a \to 0$ weakly in $\mathcal{H}^1$ as $a \to \infty$, hence $\|M\theta_a\| \to 0$ too. Thus $\lim_{a \to \infty} \int |m(x)\theta(x - a)|^2\,dx = 0$, which is more than needed.

Reciprocally, let us denote by $B_a$ the unit ball with center $a$ and let $m \in L^\infty(X)$ be such that $\int_{B_a} |m|\,dx \to 0$ if $a \to \infty$. Let $m_\varepsilon(x) = v(\varepsilon)^{-1} \int_{|y - x| < \varepsilon} m(y)\,dy$, where $v(\varepsilon)$ is the volume of the ball of radius $\varepsilon$ in $X$, and let $M_\varepsilon = m_\varepsilon(Q)$. Then $m_\varepsilon \in C_0(X)$, hence $M_\varepsilon : \mathcal{H}^s \to \mathcal{H}$ is compact if $s > 0$. Thus it suffices to show that there is $s > 0$ such that $\|M_\varepsilon - M\|_{\mathcal{H}^s \to \mathcal{H}} \to 0$ if $\varepsilon \to 0$. We choose some $s > n/2$ and recall that one has $\|\varphi(Q)\|_{\mathcal{H}^s \to \mathcal{H}} \le C[\varphi]_2$ for some constant $C$, where we used the notation $[\varphi]_p = \sup_a \left( \int_{B_a} |\varphi(x)|^p\,dx \right)^{1/p}$ (see [6, (1.3.23)]). If $\varphi$ is bounded then $[\varphi]_2^2 \le \|\varphi\|_\infty [\varphi]_1$, hence it suffices to show that $[m_\varepsilon - m]_1 \to 0$. But

$$\int_{|x - a| < 1} |m_\varepsilon(x) - m(x)|\,dx \le \frac{1}{v(1)} \int_{|y| < 1} dy \int_{|x - a| < 1} dx\, |m(x - \varepsilon y) - m(x)|$$

hence it suffices to prove that

$$\lim_{y \to 0} \sup_a \int_{|x - a| < 1} |m(x - y) - m(x)|\,dx = 0 .$$
R But this is true because limy→0 |x|
[4] Y. Dermenjian and V. Iftimie, Méthodes à N corps pour un problème de milieux pluristratifiés perturbés, Publications of RIMS 35(4) (1999) 679–709.
[5] S. De Bièvre and D. W. Pravica, Spectral analysis for optical fibres and stratified fluids I. The limiting absorption principle, J. Funct. Anal. 98 (1991) 404–436.
[6] W. Amrein, A. Boutet de Monvel and V. Georgescu, $C_0$-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians (Birkhauser, Progress in Math. Ser. 135, 1996).
[7] J. Bellissard, K-theory of $C^*$-algebras in solid state physics, in Statistical Mechanics and Field Theory: Mathematical Aspects, eds. T. C. Dorlas, N. M. Hugenholtz and M. Winnink, Groningen 1985, Lecture Notes in Physics 257, 1986, pp. 99–156.
[8] J. Bellissard, Gap labelling theorems for Schrödinger operators, in From Number Theory to Physics (Les Houches 1989), eds. J. M. Luck, P. Moussa and M. Waldschmidt (Springer, 1993), pp. 538–630.
[9] V. Georgescu and A. Iftimovici, Crossed products of $C^*$-algebras and spectral analysis of quantum Hamiltonians, Commun. Math. Phys. 228 (2002) 519–560.
[10] S. L. Woronowicz, Unbounded elements affiliated with $C^*$-algebras and non-compact quantum groups, Commun. Math. Phys. 136 (1991) 399–432.
[11] A. Boutet de Monvel and V. Georgescu, Graded $C^*$-algebras in the N-body problem, J. Math. Physics 32 (1991) 3101–3110.
[12] A. Boutet de Monvel and V. Georgescu, Graded $C^*$-algebras associated to symplectic spaces and spectral analysis of many channel Hamiltonians, in Dynamics of Complex and Irregular Systems (Bielefeld Encounters in Mathematics and Physics VIII, 1991), eds. Ph. Blanchard, L. Streit, M. Sirugue-Collin and D. Testard (World Scientific, 1993), pp. 22–66.
[13] C. Lance, Hilbert $C^*$-Modules. A Toolkit for Operator Algebraists, London Math. Soc. Lecture Note Series 210 (Cambridge University Press, 1995).
[14] M. Damak and V. Georgescu, $C^*$-crossed products and a generalized quantum mechanical N-body problem, in Proc. Symp. "Mathematical Physics and Quantum Field Theory", Electronic J. of Diff. Equations, Conference 04 (2000), pp. 51–69 (an improved version is available as preprint 99–481 at http://www.ma.utexas.edu/mp_arc/).
[15] C. Gerard, The Mourre estimate for regular dispersive systems, Ann. Inst. H. Poincaré, Phys. Theor. 54 (1991) 59–88.
[16] J. Derezinski, The Mourre estimate for dispersive N-body Schrödinger operators, Trans. Amer. Math. Soc. 317 (1990) 773–798.
[17] J. M. G. Fell and R. S. Doran, Representations of ∗-Algebras, Locally Compact Groups, and Banach ∗-Algebraic Bundles; Vol. 1, Basic Representation Theory of Groups and Algebras (Academic Press, Boston, 1988).
[18] S. L. Woronowicz and K. Napiórkowski, Operator theory in the $C^*$-algebra framework, Rep. Math. Phys. 31 (1992) 353–371.
[19] E. M. Ouhabaz and P. Stollmann, Stability of the essential spectrum of second-order complex elliptic operators, J. Reine Angew. Math. 500 (1998) 113–126.
April 27, 2004 19:23 WSPC/148-RMP
00199
Reviews in Mathematical Physics Vol. 16, No. 3 (2004) 281–330 © World Scientific Publishing Company
A QUANTUM TRANSMITTING SCHRÖDINGER–POISSON SYSTEM
M. BARO∗, H.-CHR. KAISER†, H. NEIDHARDT‡ and J. REHBERG§
Weierstrass Institute for Applied Analysis and Stochastics, Mohrenstr. 39, 10117 Berlin, Germany
∗[email protected] †[email protected] ‡[email protected] §[email protected]
Received 21 March 2003
Revised 9 January 2004
To our fathers
We study a stationary Schrödinger–Poisson system on a bounded interval of the real axis. The Schrödinger operator is defined on the bounded domain with transparent boundary conditions. This allows us to model a non-zero current through the boundary of the interval. We prove that the system always admits a solution and give explicit a priori estimates for the solutions.
Keywords: Quantum phenomena; current carrying state; inflow boundary condition; dissipative operators; open quantum systems; carrier and current densities; density matrices; quantum transmitting boundary method.
2000 Mathematics Subject Classification: 34B24, 34L40, 47B44, 81U20, 82D37
Contents
1. Introduction
2. Buslaev–Fomin Operator and QTB Family
3. Eigenfunction Expansion
4. Scattering Matrix
5. Carrier and Current Densities
5.1. Carrier densities
5.2. Current densities
6. Carrier Density Operator
7. Quantum Transmitting Schrödinger–Poisson System
7.1. Assumptions
7.2. Definition of solutions
7.3. Existence of solutions
7.4. Concluding remarks
Acknowledgments
References

∗ Supported by the DFG Research Center “Mathematics for key technologies” (FZT 86) in Berlin.
‡ Supported by the DFG under grant RE 1480/2.
1. Introduction

In this paper we investigate, from a mathematical point of view, a basic quantum mechanical model for the transport of electrons and holes in a semiconductor device. More precisely, our subject is the distribution of electrons and holes in a device between two reservoirs within a selfconsistent electrical field, taking into account quantum phenomena such as tunnelling and the quantization of energy levels in a quantum well. These very quantum effects are the active principle of many nanoelectronic devices: quantum well lasers, resonant tunnelling diodes etc., see [40]. We look for stationary states of a quasi two-dimensional electron-hole gas in a semiconductor heterostructure which is translationally invariant in these two dimensions. Thus, neglecting any magnetic field induced by the carrier currents, we are dealing with an essentially one-dimensional physical system. Let us first regard the transport model for a single band, electrons or holes, in a given spatially varying potential $v$ under the assumption that this potential $v$ as well as the material parameters of the physical system are constant outside a fixed interval $(a, b)$, see [12, 13, 26]. The possible wave functions are given by the generalized solutions of
\[ K_v \psi_k = \lambda(k)\psi_k , \tag{1.1} \]
where
\[ K_v = -\frac{\hbar^2}{2}\frac{d}{dx}\frac{1}{m}\frac{d}{dx} + v \tag{1.2} \]
is the one particle, effective mass Hamiltonian in Ben–Daniel–Duke form, $\hbar$ is the reduced Planck constant, $m = m(x) > 0$ is the spatially varying effective mass of the particle species under consideration, and $\lambda = \lambda(k)$ is a dispersion relation, e.g.
\[ \lambda(k) = \begin{cases} \dfrac{\hbar^2 k^2}{2m_a} + v_a & \text{for } k > 0 , \\[1ex] \dfrac{\hbar^2 k^2}{2m_b} + v_b & \text{for } k < 0 ; \end{cases} \tag{1.3} \]
$m_a$, $m_b$ are the effective masses, and $v_a$, $v_b$ are the potentials in the asymptotic regions $x < a$ and $x > b$, respectively. For the sake of simplicity of presentation we assume (only in this section) that $K_v$ has no bound states. The particle density $u$
is a composition of the wave functions weighted by values of a distribution function $f$:
\[ u(x) = c\int_0^{\infty} dk\, f(\lambda(k) - \epsilon_a)\,|\psi_k(x)|^2 + c\int_{-\infty}^{0} dk\, f(\lambda(k) - \epsilon_b)\,|\psi_k(x)|^2 \tag{1.4} \]
for all $x \in (a, b)$. Here $\epsilon_a$ and $\epsilon_b$ are the quasi-Fermi potentials of the reservoirs in the asymptotic regions $x < a$ and $x > b$, respectively, and $c$ is the two-dimensional density of states. The distribution function is
\[ f(\xi) = \begin{cases} \exp\left(-\dfrac{\xi}{T}\right) & \text{for Boltzmann statistics} , \\[1ex] \ln\left(1 + \exp\left(-\dfrac{\xi}{T}\right)\right) & \text{for Fermi–Dirac statistics} , \end{cases} \tag{1.5} \]
where $T$ is the temperature scaled by Boltzmann's constant. (1.4) can be written in the following way: let $\hat\rho$ be the multiplication operator on $L^2(\mathbb{R})$ induced by the function
\[ \rho(k) = \begin{cases} c f(\lambda(k) - \epsilon_a) & \text{for } k > 0 , \\ c f(\lambda(k) - \epsilon_b) & \text{for } k < 0 , \end{cases} \tag{1.6} \]
and let $F_v : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ be the Fourier transform
\[ (F_v \phi)(k) = \int_{\mathbb{R}} dx\, \phi(x)\psi_k(x) \]
which diagonalizes the operator $K_v$ on $L^2(\mathbb{R})$, that means
\[ F_v K_v F_v^* = \hat\lambda , \]
where $\hat\lambda$ is the maximal multiplication operator induced by the dispersion relation $\lambda = \lambda(k)$. Then the operator
\[ \varrho(v) = F_v^* \hat\rho F_v \tag{1.7} \]
is a steady state, that means a selfadjoint, positive operator on the Hilbert space $L^2(\mathbb{R})$ which commutes with $K_v$. Moreover, any steady state can be expressed in the form (1.7) by means of a function $\rho = \rho(k)$. The particle density $u$, defined by (1.4), is the Radon–Nikodým derivative of the (Lebesgue) absolutely continuous measure
\[ (a, b) \supset \omega \mapsto \operatorname{tr}\bigl(\varrho(v) M(\chi_\omega)\bigr) \]
($M(\chi_\omega)$ denotes the multiplication operator induced by the characteristic function $\chi_\omega$ of the set $\omega$), that means
\[ \int_\omega dx\, u(x) = \operatorname{tr}\bigl(\varrho(v) M(\chi_\omega)\bigr) , \tag{1.8} \]
for all Lebesgue measurable subsets $\omega$ of $(a, b)$.
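The ingredients of (1.3), (1.5) and (1.6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the keyword-argument layout and the numerically stable form of the Fermi–Dirac expression are choices made here, not part of the paper, and the wave functions $\psi_k$ needed for the full density (1.4) are not computed.

```python
import math


def dispersion(k, hbar, m_a, m_b, v_a, v_b):
    """Dispersion relation (1.3): parabolic in k, with the parameters of the
    region x < a for k > 0 and of the region x > b for k < 0."""
    if k > 0:
        return (hbar * k) ** 2 / (2.0 * m_a) + v_a
    if k < 0:
        return (hbar * k) ** 2 / (2.0 * m_b) + v_b
    raise ValueError("(1.3) is stated for k != 0 only")


def f_boltzmann(xi, T):
    """Boltzmann branch of (1.5): exp(-xi / T)."""
    return math.exp(-xi / T)


def f_fermi_dirac(xi, T):
    """Fermi-Dirac branch of (1.5): ln(1 + exp(-xi / T)).

    Written as max(x, 0) + log1p(exp(-|x|)) with x = -xi / T, which equals
    ln(1 + exp(x)) but does not overflow for large x (an implementation
    choice made for this sketch).
    """
    x = -xi / T
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def rho(k, c, f, T, eps_a, eps_b, **band):
    """Weight function (1.6): c * f(lambda(k) - eps_a) for k > 0 and
    c * f(lambda(k) - eps_b) for k < 0; `band` holds the arguments of
    `dispersion`."""
    lam = dispersion(k, **band)
    eps = eps_a if k > 0 else eps_b
    return c * f(lam - eps, T)
```

For energies far above the quasi-Fermi potential the two branches of (1.5) nearly coincide, since $\ln(1 + e^{-\xi/T}) \approx e^{-\xi/T}$ for large $\xi/T$.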
By replacing the real-valued distribution function (1.6) by a generalized distribution function with 2×2-matrix values this concept of particle density carries over to the setup we investigate in this paper, see Sec. 5.1. It should be noted that the species current density between the reservoirs can also be expressed in terms of the generalized solutions of (1.1), see Sec. 5.2. In the asymptotic regions $x < a$ and $x > b$ the generalized eigenfunctions $\psi_k$ can be written as a superposition of plane waves. This allows us to define boundary conditions at $a$ and $b$, with respect to the dispersion relation $\lambda = \lambda(k)$, by means of the quantum transmitting boundary method, see [26, 13]. The corresponding homogeneous boundary conditions are
\[ \frac{\hbar}{m(a)}\psi'(a) = -i v(k)\psi(a) , \qquad \frac{\hbar}{m(b)}\psi'(b) = i v(-k)\psi(b) , \qquad k \in \mathbb{R} , \tag{1.9} \]
where $v(k)$, $k \in \mathbb{R}$, is the group velocity defined by
\[ v = \frac{1}{\hbar}\frac{d\lambda}{dk} . \tag{1.10} \]
The differential expression (1.2) together with the boundary conditions (1.9) sets up a family of maximal dissipative operators on the Hilbert space $L^2(a, b)$. We call this family, in the style of [26], the quantum transmitting boundary operator family (QTB operator family), see Sec. 2 and in particular Definition 2.2. The QTB operator family already contains all the information needed to define, in conjunction with a generalized distribution function $\rho$, physical quantities such as the particle density, the current density, or the scattering matrix. The interaction between an electric field and carriers of charge, electrons and holes, within a semiconductor device can be modelled by Poisson's equation, see [37, 28, 15] and the references cited there. In the spatially one-dimensional case, which we consider here, the Poisson equation is
\[ -\frac{d}{dx}\varepsilon(x)\frac{d}{dx}\varphi(x) = q\bigl(C(x) + N^+(v^+)(x) - N^-(v^-)(x)\bigr) , \qquad x \in (a, b) , \tag{1.11} \]
where $q$ denotes the elementary charge, $C$ is the density of ionized dopants in the semiconductor device, $\varepsilon > 0$ is the dielectric permittivity function, and $\varphi$ is the electrostatic potential. $N^-$ and $N^+$ map a (chemical) potential to the corresponding density of electrons and holes, respectively. The potential energies $v^+$ and $v^-$ are given by
\[ v^+ = w^+ + q\varphi , \qquad v^- = w^- - q\varphi , \tag{1.12} \]
where $w^-$ and $-w^+$ are the conduction and valence band offsets. In the following the superscript “+” always refers to quantities related to holes, the superscript “−” to electrons, and the superscript “±” to something which can be specified for holes as well as for electrons. If drawing a distinction between electrons and holes is of no importance we omit these superscripts, in particular in Secs. 2–6. In general (1.11) is complemented by mixed boundary conditions allowing Ohmic (metal) contacts on some parts of the boundary while other parts of the boundary of the
device are insulated, see [37, 15]. (1.11) can be regarded as an equation in the dual of a Sobolev space which is determined by these boundary conditions. The quantum transmitting Schrödinger–Poisson system is a Poisson equation (1.11) with nonlinear electron and hole density operators $N^-$ and $N^+$ defined as the map of a potential $v$ to the density (1.8) with steady states $\varrho^-(v)$ and $\varrho^+(v)$, respectively. In Sec. 6 we demonstrate that the carrier density operators defined in this way are continuous; the corresponding currents are uniformly bounded for all potentials $v$, see Proposition 5.12. We prove that the quantum transmitting Schrödinger–Poisson system comprising electrons and holes always admits a solution provided the function inducing the steady states has reasonable decay properties with increasing energy, see Theorem 7.7. Furthermore, we give a priori estimates for the solutions. The a priori bounds for the electrostatic potential and the electron and hole density of solutions are explicit expressions in the data of the problem. Ben Abdallah, Degond and Markowich have investigated a special case of this model in [6] and have proved the existence of solutions for the unipolar case. Unfortunately the mathematical techniques used in their proof do not apply to the bipolar case, which we treat in this paper. The quantum transmitting Schrödinger–Poisson system is not the only nonlinear Poisson equation used for the description of operating states of semiconductor devices. Depending on the objective of modelling and the underlying assumptions about the device, operators $N^\pm$ are set up in different ways. In van Roosbroeck's system, which describes the motion of electrons and holes in a semiconductor due to drift and diffusion, the operators $N^\pm$ are Nemytzkii operators of the form
\[ N^\pm(v)(x) = N^\pm(x)\,\mathcal{F}\left(\frac{\phi^\pm(x) - v(x)}{T}\right) , \tag{1.13} \]
see [37, 28, 15].
In this context let us assume that the temperature $T$ (scaled to an energy), the densities of states $N^+$, $N^-$ and the electro-chemical potentials $\phi^+$ and $\phi^-$ are given functions. The function $\mathcal{F}$ is a statistical distribution function
\[ \mathcal{F}(\xi) = \begin{cases} \exp(\xi) & \text{for Boltzmann's statistics} , \\[1ex] \dfrac{2}{\sqrt{\pi}}\displaystyle\int_0^\infty \dfrac{\sqrt{\nu}\, d\nu}{1 + \exp(\nu - \xi)} & \text{for Fermi–Dirac statistics} . \end{cases} \tag{1.14} \]
Poisson's equation (1.11), with operators $N^\pm$ defined by (1.13) and (1.12), has a unique solution due to the anti-monotonicity and Lipschitz continuity of the carrier density operators $N^+$ and $N^-$, see [14, 28, 34, 29, 15] and the references cited there. The semi-classical approximation (1.13) of the carrier densities has the limitation of not taking into account quantum effects such as tunnelling, resonances, or the quantization of energy levels in a quantum well, which are the very phenomena of interest in many nanoelectronic devices. Modelling devices in thermodynamic equilibrium which inherently employ quantum effects is achieved by defining the density operators $N^\pm$ by means of Schrödinger operators of the form (1.1) (there
specifying the position dependent effective mass $m$ as $m^+$ and $m^-$ for holes and electrons, respectively) on the bounded interval occupied by the nanoelectronic device. These Schrödinger operators are completed by homogeneous boundary conditions in such a way that the operators, which we denote by $H^\pm_{sa}(v)$, are selfadjoint. The operators $H^\pm_{sa}(v)$ are semibounded from below and have compact resolvent. To get a Hamiltonian describing for instance confined states in a stack of quantum wells one can use homogeneous Dirichlet (hard wall) boundary conditions. For a quantum device in thermodynamic equilibrium the density of each species (electrons or holes) is given by
\[ u(x) = \sum_{j=1}^{\infty} c f(\lambda_j - \epsilon_F)\,|\psi_j(x)|^2 , \tag{1.15} \]
where $\lambda_j$ are the eigenvalues (counting multiplicity) and $\psi_j$ the corresponding eigenfunctions of the operator $H_{sa}(v)$. $\epsilon_F$ is the quasi-Fermi potential of the species under consideration in the interval $(a, b)$, $c$ is the two-dimensional density of states, and $f$ is one of the equilibrium distribution functions (1.5). The steady state corresponding to (1.15) is
\[ \varrho(v) = c f(H_{sa}(v) - \epsilon_F) \tag{1.16} \]
and the density $u$ can be rewritten as in (1.8), that means the carrier densities $N^\pm(v)$ are the Radon–Nikodým derivatives of the (Lebesgue) absolutely continuous measures
\[ (a, b) \supset \omega \mapsto \operatorname{tr}\bigl(c^\pm f(H^\pm_{sa}(v) - \epsilon^\pm_F) M(\chi_\omega)\bigr) . \tag{1.17} \]
It turns out that the (nonlinear) mappings $v \mapsto N^\pm(v)$ are, as in the semiclassical approximation, boundedly Lipschitz continuous and anti-monotone, see [10, 33, 17–19, 24]. Hence, the nonlinear Poisson equation (1.11) with carrier density operators $N^\pm$ defined by (1.17) and (1.12) has a unique solution which, moreover, depends boundedly Lipschitz continuously on the reference energies $w^+$ and $w^-$. This nonlinear Poisson equation, a stationary selfadjoint Schrödinger–Poisson system, is closely related to the Euler equations of density functional theory, see [19]. The latter have been investigated also on bounded two- and three-dimensional domains, see [20]. The selfadjointness of the operators $H^\pm_{sa}(v)$ reflects that the corresponding quantum systems of electrons and holes inside the bounded domain $(a, b)$ are in thermodynamic equilibrium. Consequently, there is no flow of carriers through the boundary of the device. To model systems out of equilibrium, where a non-zero electron and hole current through the boundary of the device is possible, boundary conditions at $a$ and $b$ have to be chosen such that the operator on the Hilbert space $L^2(a, b)$ becomes essentially non-selfadjoint. This leads to open quantum systems which allow the transmission of scattering states through the boundary of the device domain.
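As a rough illustration of the equilibrium density (1.15), the following Python sketch evaluates it in the one case where eigenvalues and eigenfunctions are available in closed form. Everything specific here is an assumption made for illustration and is not taken from the paper: constant effective mass `m`, constant potential `v0`, Dirichlet (hard wall) boundary conditions, Boltzmann statistics from (1.5), a hypothetical function name, and a truncation of the series after `n_terms` terms.

```python
import math


def box_density(x, a, b, m, v0, eps_F, T, c=1.0, hbar=1.0, n_terms=200):
    """Truncated version of (1.15) for a hard-wall box on (a, b).

    Assumes constant mass m and constant potential v0, so that
        lambda_j = (hbar * pi * j / L)**2 / (2 m) + v0,
        psi_j(x) = sqrt(2 / L) * sin(j * pi * (x - a) / L),   L = b - a,
    and uses the Boltzmann weight f(xi) = exp(-xi / T).
    """
    L = b - a
    total = 0.0
    for j in range(1, n_terms + 1):
        lam_j = (hbar * math.pi * j / L) ** 2 / (2.0 * m) + v0
        psi_sq = (2.0 / L) * math.sin(j * math.pi * (x - a) / L) ** 2
        total += c * math.exp(-(lam_j - eps_F) / T) * psi_sq
    return total
```

Because the eigenfunctions are normalized, integrating this density over $(a, b)$ returns the sum of the weights $c f(\lambda_j - \epsilon_F)$, which is the statement (1.8) for $\omega = (a, b)$ in this special case.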
The boundary conditions (1.9) comprise the group velocity (1.10). If the device is driven by a suitable macroscopic current acting on the boundary of the domain $(a, b)$, see [19–21], then the corresponding velocity field $v$ induces an operator of the form (1.2) on $L^2(a, b)$ with the boundary conditions
\[ \frac{\hbar}{m(a)}\psi'(a) = -i v_a \psi(a) , \qquad \frac{\hbar}{m(b)}\psi'(b) = i v_b \psi(b) , \tag{1.18} \]
where $v_a > 0$ and $v_b > 0$ are the values of the velocity field $v$ at $a$ and $b$, respectively. The resultant pseudo-Hamiltonian on the Hilbert space $L^2(a, b)$, which we denote by $H_{dis}(v)$, is essentially non-selfadjoint; more precisely, it is maximal dissipative and completely non-selfadjoint. Physical quantities such as the particle density and the current density are only well defined with respect to a selfadjoint Hamiltonian, see [27]. However, as the operator $H_{dis}(v)$ is maximal dissipative, a minimal closed quantum system exists which contains the open one. That means, there is a Hilbert space, with $L^2(a, b)$ as a subspace, and a selfadjoint operator, the dilation of $H_{dis}(v)$, such that $(L^2(a, b), H_{dis}(v))$ is a “projection” of the larger system in the sense of resolvents, see [11] and Remark 2.5. The dilation of $H_{dis}(v)$ serves as a quasi-Hamiltonian for the open quantum system, which allows one to define physical quantities related to the latter, see [21–23]. In particular, this way one obtains the densities of electrons and holes and the corresponding current densities on $(a, b)$ for a device which is driven by an exterior current, see [23]. One can prove that the corresponding carrier density operators $v \mapsto N^\pm(v)$ are continuous and obey certain a priori estimates, see [1, Sec. 3]. The ensuing dissipative Schrödinger–Poisson system, that means the Poisson equation (1.11) with the electron and hole density derived from the dilations of $H^\pm_{dis}(v)$ and potentials (1.12), has a solution, see [1, Sec. 4.2], though in general uniqueness cannot be expected. The quantum transmitting Schrödinger–Poisson system is closely related to the dissipative Schrödinger–Poisson system. We point out the relation between the two systems throughout this paper. In particular we show that the dissipative Schrödinger–Poisson system and the quantum transmitting Schrödinger–Poisson system coincide for fixed energy, modulo a unitary transformation.
The paper is organized as follows: in Sec. 2 we rigorously define the operator (1.2), which we call the Buslaev–Fomin operator [8], and the generalized eigenvalue problem (1.1)–(1.3) in a more general form, and we set up the associated family of quantum transmitting boundary (QTB) operators. The generalized eigenfunctions of the Buslaev–Fomin operator can be expressed in terms of the QTB family, see Sec. 3. Section 4 is devoted to the scattering matrix of the Buslaev–Fomin operator. In Sec. 5 we define the carrier and current density for the open quantum system related to the QTB family. The properties of the corresponding carrier density operator are investigated in Sec. 6. Finally, in Sec. 7 we characterize the solutions of the quantum transmitting Schrödinger–Poisson system as the fixed points of a mapping which meets the preconditions of Schauder's fixed point theorem.
2. Buslaev–Fomin Operator and QTB Family

Let us first introduce some notations which we use throughout this paper: $\mathbb{N}$, $\mathbb{R}$ and $\mathbb{C}$ denote the natural, the real and the complex numbers, respectively; $\mathbb{C}_+ := \{z \in \mathbb{C} \mid \mathrm{Im}(z) > 0\}$, $\mathbb{C}_- := \{z \in \mathbb{C} \mid \mathrm{Im}(z) < 0\}$; if $z \in \mathbb{C}$, then $\bar{z}$ denotes the complex conjugate number. $L^p(\Omega, X, \nu)$, $1 \le p < \infty$, is the space of $\nu$-measurable, $p$-integrable functions with values in the Banach space $X$; $L^\infty(\Omega, X, \nu)$ is the corresponding space of essentially bounded functions. If $\Omega \subseteq \mathbb{R}$ is a domain, $\nu$ the Lebesgue measure, and $X = \mathbb{C}$, then we write $L^p(\Omega)$, $1 \le p \le \infty$, for short. Furthermore we denote by $W^{1,2}(\Omega)$ the usual Sobolev space of complex-valued functions on $\Omega$, by $C(\Omega)$ the space of continuous complex-valued functions on $\Omega$ and by $C_b(\Omega)$ the space of continuous bounded complex-valued functions on $\Omega$ equipped with the supremum norm. If $\Omega = (a, b)$ we abbreviate $L^p$, $W^{1,2}$, ... for $L^p(\Omega)$, $W^{1,2}(\Omega)$, ...; moreover, we introduce $\mathcal{K} := L^2(\mathbb{R})$ and $\mathcal{H} := L^2 = L^2(a, b)$. The real part of a function space is indexed by $\mathbb{R}$, i.e. the real part of $L^p$, $W^{1,2}$, ... is denoted by $L^p_{\mathbb{R}}$, $W^{1,2}_{\mathbb{R}}$, .... For Banach spaces $X$ and $Y$, we denote by $\mathcal{B}(X, Y)$ the space of all linear, continuous operators from $X$ into $Y$; if $X = Y$ we write $\mathcal{B}(X)$; $I_X \in \mathcal{B}(X)$ is the identity operator. If $X$, $Y$ are Hilbert spaces, then $\mathcal{B}_1(X, Y)$ denotes the space of trace class operators and $\mathcal{B}_2(X, Y)$ denotes the space of Hilbert–Schmidt operators; if $X = Y$ we abbreviate $\mathcal{B}_1(X) := \mathcal{B}_1(X, X)$ and $\mathcal{B}_2(X) := \mathcal{B}_2(X, X)$. For a densely defined linear operator $A : X \to Y$ we denote by $A^*$ the adjoint operator and by $|A|$ the absolute value, if $A$ is closed. If $A$ is a selfadjoint operator in a Hilbert space we denote by $\sigma(A)$, $\sigma_p(A)$, $\sigma_{ac}(A)$ the spectrum of $A$, its point spectrum, and its absolutely continuous spectrum, respectively.

Let $v_a, v_b \in \mathbb{R}$, $v_a > v_b$, be given. We define the operator $E : L^\infty_{\mathbb{R}} \to L^\infty_{\mathbb{R}}(\mathbb{R})$ by
\[ (Ev)(x) := \begin{cases} v_a , & -\infty < x \le a , \\ v(x) , & x \in (a, b) , \\ v_b , & b \le x < \infty , \end{cases} \qquad v \in D(E) := L^\infty_{\mathbb{R}} = L^\infty_{\mathbb{R}}(a, b) . \tag{2.1} \]
Moreover, we assume that $m \in L^\infty_{\mathbb{R}}$, $m > 0$ and $1/m \in L^\infty$. We set
\[ \hat{m}(x) := \begin{cases} m_a , & -\infty < x \le a , \\ m(x) , & x \in (a, b) , \\ m_b , & b \le x < \infty , \end{cases} \tag{2.2} \]
where $m_a, m_b \in \mathbb{R}$ are given, with $m_a, m_b > 0$, and define the operator $K_v$ by
\[ K_v f := l_v(f) , \qquad f \in D(K_v) := \left\{ f \in W^{1,2}(\mathbb{R}) \;\middle|\; \frac{1}{\hat{m}} f' \in W^{1,2}(\mathbb{R}) \right\} , \]
where
\[ l_v(f)(x) := -\frac{\hbar^2}{2}\frac{d}{dx}\frac{1}{\hat{m}(x)}\frac{d}{dx} f(x) + (Ev)(x) f(x) . \tag{2.3} \]
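$K_v$ is selfadjoint on $\mathcal{K} = L^2(\mathbb{R})$. We call $K_v$ the Buslaev–Fomin operator, see [8]; operators of this form also have been investigated in [39, Chap. 17] and [7]. Let us define
\[ q_a(z) := \sqrt{\frac{z - v_a}{2m_a}} \quad\text{and}\quad q_b(z) := \sqrt{\frac{z - v_b}{2m_b}} , \qquad z \in \mathbb{C} , \tag{2.4} \]
where the cut of the square root is taken along $[0, \infty)$, i.e. $\mathrm{Im}(\sqrt{z}) > 0$ for $z \in \mathbb{C} \setminus [0, \infty)$ and $\sqrt{z} > 0$ for $z \in (0, \infty)$. To construct the resolvent of $K_v$ we proceed as in [39, Chap. 17] and introduce the functions
\[ g_1(x, z) = \exp\left(i\frac{2m_b}{\hbar} q_b(z) x\right) , \qquad x \in (b, \infty) , \]
\[ h_1(x, z) = \exp\left(-i\frac{2m_a}{\hbar} q_a(z) x\right) , \qquad x \in (-\infty, a) , \]
for $z \in \mathbb{C}$. Furthermore let $g_2(v)(x, z)$ be the solution of the integral equation
\[ g_2(v)(x, z) = c_1(z) - c_2(z)\frac{2}{\hbar}\int_x^b dt\, \hat{m}(t) + \frac{2}{\hbar^2}\int_x^b dt\, \hat{m}(t) \int_t^b ds\, \bigl((Ev)(s) - z\bigr) g_2(v)(s, z) , \]
$x \in (-\infty, b)$, $z \in \mathbb{C}$, where
\[ c_1(z) := \exp\left(i\frac{2m_b}{\hbar} q_b(z) b\right) , \qquad c_2(z) := i q_b(z) \exp\left(i\frac{2m_b}{\hbar} q_b(z) b\right) . \]
Similarly we introduce $h_2(v)(x, z)$ as the solution of
\[ h_2(v)(x, z) = d_1(z) + d_2(z)\frac{2}{\hbar}\int_a^x dt\, \hat{m}(t) + \frac{2}{\hbar^2}\int_a^x dt\, \hat{m}(t) \int_a^t ds\, \bigl((Ev)(s) - z\bigr) h_2(v)(s, z) , \]
$x \in (a, \infty)$, $z \in \mathbb{C}$, with
\[ d_1(z) := \exp\left(-i\frac{2m_a}{\hbar} q_a(z) a\right) , \qquad d_2(z) := -i q_a(z) \exp\left(-i\frac{2m_a}{\hbar} q_a(z) a\right) . \]
Then we define
\[ f_+(v)(x, z) := \begin{cases} g_2(v)(x, z) , & -\infty < x < b , \\ g_1(x, z) , & b \le x < \infty , \end{cases} \qquad z \in \mathbb{C} , \tag{2.5} \]
and
\[ f_-(v)(x, z) := \begin{cases} h_1(x, z) , & -\infty < x \le a , \\ h_2(v)(x, z) , & a < x < \infty , \end{cases} \qquad z \in \mathbb{C} . \tag{2.6} \]
The $f_\pm(v)$ fulfill
\[ l_v(f_\pm(v)(x, z)) = z f_\pm(v)(x, z) \quad \text{for almost every } x \in \mathbb{R} , \]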
and the $f_\pm(v)(x, \cdot)$ are holomorphic on $\mathbb{C} \setminus [0, \infty)$ and continuous on $\mathbb{C}$. We define the restrictions of $f_\pm(v)$ to the cut complex plane:
\[ k_1(v)(x, z) := f_+(v)(x, z) , \qquad k_2(v)(x, z) := f_-(v)(x, z) , \qquad z \in \mathbb{C} \setminus (v_b, \infty) , \quad x \in \mathbb{R} . \]
For $\psi_1, \psi_2 \in W^{1,2}_{loc}(\mathbb{R})$ the Wronskian is defined by
\[ W(\psi_1(x), \psi_2(x)) := \psi_1(x)\frac{\hbar}{2\hat{m}(x)}\psi_2'(x) - \psi_2(x)\frac{\hbar}{2\hat{m}(x)}\psi_1'(x) . \tag{2.7} \]
In the sequel the Wronskian $W(k_1(v)(x, z), k_2(v)(x, z))$ for $z \in \mathbb{C} \setminus (v_b, \infty)$ is of interest to us; we abbreviate it by $W_v(z)$. Indeed, this Wronskian does not depend on $x$.

Lemma 2.1. The resolvent $(K_v - z)^{-1}$ of the Buslaev–Fomin operator (2.3) admits the representation
\[ ((K_v - z)^{-1} f)(x) = \frac{k_1(v)(x, z)}{\hbar W_v(z)}\int_{-\infty}^x dy\, k_2(v)(y, z) f(y) + \frac{k_2(v)(x, z)}{\hbar W_v(z)}\int_x^\infty dy\, k_1(v)(y, z) f(y) , \tag{2.8} \]
for all $f \in \mathcal{K}$ and all $z$ from the resolvent set of $K_v$.
Proof. For convenience we do not indicate the dependence on $v$ throughout the proof. Setting
\[ g(x) := \frac{1}{\hbar W(z)}\left( k_1(x, z)\int_{-\infty}^x dy\, k_2(y, z) f(y) + k_2(x, z)\int_x^\infty dy\, k_1(y, z) f(y) \right) \]
for $f \in \mathcal{K}$ we get
\[ \frac{\hbar^2}{2}\frac{d}{dx}\frac{1}{\hat{m}(x)}\frac{d}{dx} g(x) = \frac{1}{\hbar W(z)}\left( \bigl((Ev)(x) - z\bigr) k_1(x, z)\int_{-\infty}^x dy\, k_2(y, z) f(y) + \bigl((Ev)(x) - z\bigr) k_2(x, z)\int_x^\infty dy\, k_1(y, z) f(y) - \hbar W(z) f(x) \right) . \]
Hence, $l(g(x)) - z g(x) = f(x)$, i.e. $g \in D(K)$ and $(K - z) g = f$.

The spectrum of $K_v$ is given by $\sigma(K_v) = \sigma_{ac}(K_v) \cup \sigma_p(K_v)$, where the absolutely continuous part is $\sigma_{ac}(K_v) = [v_b, \infty)$ and the point spectrum $\sigma_p(K_v)$ consists of finitely many simple eigenvalues $\lambda_j(v)$, $j = 1, \ldots, N(v)$, with $\lambda_j(v) < v_b$. $\sigma_{ac}(K_v)$ is simple on $[v_b, v_a)$ and has multiplicity two on $[v_a, \infty)$, see [8] or [39, Theorem 17.C.1]. With respect to (2.4) we define
\[ \kappa_a(z) := i q_a(z) , \qquad \kappa_b(z) := i q_b(z) , \qquad z \in \mathbb{C}_+ . \tag{2.9} \]
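The branch convention in (2.4) differs from the principal branch used by most numerical libraries, so a small Python sketch may help when evaluating $q_a$, $q_b$ and the quantities $\kappa_a$, $\kappa_b$ of (2.9). The function names and the trick of flipping the sign of the principal root are choices made for this illustration, not part of the paper.

```python
import cmath


def sqrt_cut_positive_axis(w):
    """Square root with the cut along [0, inf), as described after (2.4):
    Im sqrt(w) > 0 for w off the cut, and the positive root for w > 0.

    cmath.sqrt returns the principal root (Re >= 0, cut along the negative
    real axis); when that root has negative imaginary part, the other root
    -s is the one required here.
    """
    s = cmath.sqrt(complex(w))
    return -s if s.imag < 0 else s


def q(z, v_inf, m_inf):
    """q_a or q_b of (2.4); pass (v_a, m_a) or (v_b, m_b) as (v_inf, m_inf)."""
    return sqrt_cut_positive_axis((z - v_inf) / (2.0 * m_inf))


def kappa(z, v_inf, m_inf):
    """kappa_a or kappa_b of (2.9), i.e. i times the corresponding q."""
    return 1j * q(z, v_inf, m_inf)
```

With this branch, a point $z$ in the upper half plane gives $\mathrm{Re}\, q > 0$ and $\mathrm{Im}\, q > 0$, hence $\mathrm{Re}\, \kappa < 0$ and $\mathrm{Im}\, \kappa > 0$; for real $z$ below the threshold, $q$ is purely imaginary with positive imaginary part, so that $g_1$ decays as $x \to \infty$.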
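Definition 2.2. The quantum transmitting boundary operator family (QTB family) $\{H_v(z)\}_{z \in \mathbb{C}_+}$ is the family of maximal dissipative operators on $\mathcal{H} = L^2$ given by
\[ D(H_v(z)) := \left\{ f \in W^{1,2} \;\middle|\; \frac{1}{m} f' \in W^{1,2} ,\ \frac{\hbar}{2m(a)} f'(a) = -\kappa_a(z) f(a) ,\ \frac{\hbar}{2m(b)} f'(b) = \kappa_b(z) f(b) \right\} \]
and
\[ H_v(z) f := -\frac{\hbar^2}{2}\frac{d}{dx}\frac{1}{m}\frac{d}{dx} f + v f , \qquad f \in D(H_v(z)) , \]
for all $z \in \mathbb{C}_+$.

Since $\mathrm{Re}(q_j(z)) > 0$ for $z \in \mathbb{C}_+$, we have $\mathrm{Im}(\kappa_j(z)) > 0$, $j = a, b$. Thus the operator $H_v(z)$ is dissipative for each fixed $z$, i.e. $\mathrm{Im}(H_v(z) f, f) \le 0$ for all $f \in D(H_v(z))$. Furthermore, for each $z \in \mathbb{C}_+$ the spectrum of $H_v(z)$ is contained in the lower half plane $\mathbb{C}_-$ and $H_v(z)$ is maximal dissipative and completely non-selfadjoint, see [21, Theorems 4.6 and 5.2].

Proposition 2.3. Let $P^{\mathcal{K}}_{\mathcal{H}}$ denote the projection operator from $\mathcal{K}$ onto $\mathcal{H}$. Then
\[ P^{\mathcal{K}}_{\mathcal{H}} (K_v - z)^{-1}\big|_{\mathcal{H}} = (H_v(z) - z)^{-1} , \quad \text{for all } z \in \mathbb{C}_+ , \tag{2.10} \]
\[ P^{\mathcal{K}}_{\mathcal{H}} (K_v - z)^{-1}\big|_{\mathcal{H}} = (H_v(z)^* - z)^{-1} , \quad \text{for all } z \in \mathbb{C}_- . \tag{2.11} \]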
Proof. Again we omit the subscript $v$ within the proof. It suffices to show that $g := P^{\mathcal{K}}_{\mathcal{H}} (K - z)^{-1} f$ satisfies the boundary condition for every $f \in \mathcal{H}$ and $z \in \mathbb{C}_+$. We only prove that the boundary condition at $b$ is satisfied; the corresponding statement at $a$ is proven similarly. By Eq. (2.5) we get $k_1(x, z) = \exp(2 i m_b q_b(z) x / \hbar)$, for $x \ge b$. Thus,
\[ k_1(b, z) = \exp\left(i\frac{2m_b}{\hbar} q_b(z) b\right) \quad\text{and}\quad \frac{\hbar}{2m(b)} k_1'(b, z) = i q_b(z) k_1(b, z) . \]
Using the expression (2.8) for the resolvent of $K$ we get
\[ g(b) = \frac{1}{\hbar W(z)}\left( k_1(b, z)\int_{-\infty}^b dy\, k_2(y, z) f(y) + k_2(b, z)\int_b^\infty dy\, k_1(y, z) f(y) \right) = \frac{k_1(b, z)}{\hbar W(z)}\int_a^b dy\, k_2(y, z) f(y) , \]
since $f(y) = 0$ for $y \in (b, \infty)$. Similarly we obtain
\[ \frac{\hbar}{2m(b)} g'(b) = \frac{1}{\hbar W(z)}\left( i q_b(z) k_1(b, z)\int_{-\infty}^b dy\, k_2(y, z) f(y) + \frac{\hbar}{2m(b)} k_2'(b, z)\int_b^\infty dy\, k_1(y, z) f(y) \right) = i q_b(z)\frac{k_1(b, z)}{\hbar W(z)}\int_a^b dy\, k_2(y, z) f(y) = \kappa_b(z) g(b) . \]
Now (2.11) follows from (2.10) by passing to the adjoints.

Remark 2.4. The QTB family $\{H_v(z)\}_{z \in \mathbb{C}_+}$ describes an open quantum system on $\mathcal{H}$. The expression (2.10) is interpreted as the embedding of this open system into the larger quantum system described by the Buslaev–Fomin operator $K_v$.

Remark 2.5. The relation (2.10) looks similar to a relation between maximal dissipative operators and their dilations, see [11]. More precisely: Let $v \in L^\infty_{\mathbb{R}}$ and $\lambda_0 \in \mathbb{R}$ with $\lambda_0 > v_b$ be given. We set $H_{dis}(v) := H_v(\lambda_0)$. $H_{dis}(v)$ is maximal dissipative and completely non-selfadjoint. By the dilation theory we get the existence of a larger Hilbert space $\mathcal{K}_{dis}$ with $\mathcal{H} \subseteq \mathcal{K}_{dis}$ and the existence of a selfadjoint operator $K_{dis}(v)$ on $\mathcal{K}_{dis}$, such that
\[ P^{\mathcal{K}_{dis}}_{\mathcal{H}} (K_{dis}(v) - z)^{-1}\big|_{\mathcal{H}} = (H_{dis}(v) - z)^{-1} , \quad \text{for all } z \in \mathbb{C}_+ , \tag{2.12} \]
where $P^{\mathcal{K}_{dis}}_{\mathcal{H}}$ denotes the projection from $\mathcal{K}_{dis}$ onto $\mathcal{H}$, see [11]. The operator $K_{dis}(v)$ is called the dilation corresponding to $H_{dis}(v)$. Note that Eq. (2.12) differs from the expression (2.10), since $H_{dis}(v)$ is independent of $z \in \mathbb{C}_+$. The Hilbert space $\mathcal{K}_{dis}$ and the operator $K_{dis}$ have been explicitly calculated in [22]. We remark that the Hilbert space $\mathcal{K}_{dis}$ differs from the space $\mathcal{K}$. Furthermore, the operator $K_{dis}(v)$ is not bounded from below and its spectrum is completely absolutely continuous, i.e. $\sigma(K_{dis}(v)) = \sigma_{ac}(K_{dis}(v)) = \mathbb{R}$ (see [22] for details).
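Proposition 2.6. If $v \in L^\infty_{\mathbb{R}}$ and $K_v$ is the operator (2.3), then $(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}}$ is trace class and its trace norm can be estimated by
\[ \|(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}}\|_{\mathcal{B}_1(\mathcal{K})} \le 3 + \left( 8 + 4\sqrt{\|m\|_{L^\infty}}\,\frac{(b - a)}{\hbar} \right)\sqrt{1 + \|v\|_{L^\infty}} . \]
For the proof we need the subsequent lemma. We set
\[ r_a := \mathrm{Re}(\kappa_a(i)) , \quad r_b := \mathrm{Re}(\kappa_b(i)) , \quad \alpha_a := \mathrm{Im}(\kappa_a(i)) , \quad \alpha_b := \mathrm{Im}(\kappa_b(i)) . \]
We note that $r_a, r_b \le 0$ and $\alpha_a, \alpha_b \ge 0$. Furthermore we abbreviate $H_v := H_v(i)$. Following [22] we introduce the (unclosed) operator $\alpha : \mathcal{H} \to \mathbb{C}^2$ by
\[ \alpha f = \begin{pmatrix} \sqrt{\alpha_b}\, f(b) \\ -\sqrt{\alpha_a}\, f(a) \end{pmatrix} , \qquad D(\alpha) = W^{1,2} , \tag{2.13} \]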
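and the operator $U_v(z) : \mathcal{H} \to \mathbb{C}^2$ given by
\[ U_v(z) = \alpha (H_v - z)^{-1} , \qquad D(U_v(z)) = \mathcal{H} \]
for $z$ from the resolvent set of the operator $H_v$.

Lemma 2.7. For every $v \in L^\infty_{\mathbb{R}}$ we have the estimates
\[ \|(H_v - i)^{-1}\|_{\mathcal{B}_1(\mathcal{H})} \le 3 + 4\sqrt{\|m\|_{L^\infty}}\,\frac{(b - a)}{\hbar}\sqrt{1 + \|v\|_{L^\infty}} , \tag{2.14} \]
and
\[ \|U_v(i)\|_{\mathcal{B}_1(\mathcal{H}, \mathbb{C}^2)} \le 8\sqrt{1 + \|v\|_{L^\infty}} . \tag{2.15} \]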
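Proof. We define the selfadjoint operator $H_0$ by
\[ D(H_0) := \left\{ f \in W^{1,2} \;\middle|\; \frac{1}{m} f' \in W^{1,2} ,\ \frac{\hbar}{2m(b)} f'(b) = \frac{\hbar}{2m(a)} f'(a) = 0 \right\} , \]
\[ H_0 f = -\frac{\hbar^2}{2}\frac{d}{dx}\frac{1}{m}\frac{d}{dx} f + f , \qquad f \in D(H_0) . \]
Note that $H_0 \ge I_{\mathcal{H}}$. Similar to the operator $\alpha$ we define the operator $r : \mathcal{H} \to \mathbb{C}^2$ by
\[ r f = \begin{pmatrix} \sqrt{-r_b}\, f(b) \\ -\sqrt{-r_a}\, f(a) \end{pmatrix} , \qquad D(r) = W^{1,2} . \]
The operators $V_\alpha(\mu), V_r(\mu) : \mathcal{H} \to \mathbb{C}^2$ are given by
\[ V_\alpha(\mu) = \alpha (H_0 + \mu)^{-1/2} , \qquad V_r(\mu) = r (H_0 + \mu)^{-1/2} , \qquad \mu \ge 0 . \]
Furthermore we introduce the operator $B_v(\mu) : \mathcal{H} \to \mathcal{H}$ by
\[ B_v(\mu) := (H_0 + \mu)^{-1/2}(v - 1)(H_0 + \mu)^{-1/2} + V_r(\mu)^* V_r(\mu) - i V_\alpha(\mu)^* V_\alpha(\mu) , \]
for $\mu \ge 0$, see [1]. We have
\[ \|(H_0 + \mu)^{-1/2}(v - 1)(H_0 + \mu)^{-1/2}\|_{\mathcal{B}(\mathcal{H})} \le \frac{1 + \|v\|_{L^\infty}}{1 + \mu} . \]
Hence,
\[ \|(H_0 + \mu)^{-1/2}(v - 1)(H_0 + \mu)^{-1/2}\| \le \frac{1}{2} , \quad \text{for } \mu \ge 1 + 2\|v\|_{L^\infty} . \]
We set
\[ R_v(\mu) := (H_0 + \mu)^{-1/2}(v - 1)(H_0 + \mu)^{-1/2} + V_r(\mu)^* V_r(\mu) , \]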
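for $\mu \ge 1 + 2\|v\|_{L^\infty}$. $R_v(\mu)$ is selfadjoint and $1 + R_v(\mu) \ge \frac{1}{2}$. Hence, $(1 + R_v(\mu))^{-1/2}$ exists and its norm does not exceed $\sqrt{2}$. Now a straightforward calculation shows that
\[ (1 + B_v(\mu))^{-1} = (1 + R_v(\mu))^{-1/2}\Bigl( 1 - i(1 + R_v(\mu))^{-1/2} V_\alpha(\mu)^* V_\alpha(\mu) (1 + R_v(\mu))^{-1/2} \Bigr)^{-1} (1 + R_v(\mu))^{-1/2} . \]
The middle factor is the inverse of $1 - iT$ with a nonnegative selfadjoint operator $T$, so its norm is at most one. Hence,
\[ \|(1 + B_v(\mu))^{-1}\|_{\mathcal{B}(\mathcal{H})} \le 2 , \quad \text{for } \mu \ge 1 + 2\|v\|_{L^\infty} . \tag{2.16} \]
By [1, Lemma 2.3] we have the representation
\[ (H_v + \mu)^{-1} = (H_0 + \mu)^{-1/2}(1 + B_v(\mu))^{-1}(H_0 + \mu)^{-1/2} , \tag{2.17} \]
for large, positive $\mu$. Because both sides depend on $\mu$ analytically, this operator equality extends to all real $\mu$ for which $(1 + B_v(\mu))^{-1} \in \mathcal{B}(\mathcal{H})$. This is true for any $\mu \ge 1 + 2\|v\|_{L^\infty}$; hence, (2.17) extends to all these $\mu$. Using the first resolvent equation we obtain
\[ (H_v - i)^{-1} = (H_v + \mu)^{-1}\bigl( 1 + (\mu + i)(H_v - i)^{-1} \bigr) , \]
and we get by (2.17) that
\[ \|(H_v - i)^{-1}\|_{\mathcal{B}_1(\mathcal{H})} \le (2 + \mu)\|(H_v + \mu)^{-1}\|_{\mathcal{B}_1(\mathcal{H})} \le 2(2 + \mu)\|(H_0 + \mu)^{-1/2}\|^2_{\mathcal{B}_2(\mathcal{H})} . \tag{2.18} \]
According to [1, Appendix] we have
\[ \|(H_0 + \mu)^{-1/2}\|^2_{\mathcal{B}_2(\mathcal{H})} \le \frac{1}{1 + \mu} + \sqrt{\|m\|_{L^\infty}}\,\frac{(b - a)}{\hbar\sqrt{2}}\,\frac{1}{\sqrt{1 + \mu}} . \]
Thus, from (2.18) we obtain for $\mu = 1 + 2\|v\|_{L^\infty}$:
\[ \|(H_v - i)^{-1}\|_{\mathcal{B}_1(\mathcal{H})} \le 3 + 4\sqrt{\|m\|_{L^\infty}}\,\frac{(b - a)}{\hbar}\sqrt{1 + \|v\|_{L^\infty}} , \]
i.e. the first assertion (2.14) of the lemma. To prove the second assertion we estimate for $\mu \ge 1 + 2\|v\|_{L^\infty}$
\[ \|U_v(i)\|_{\mathcal{B}_1(\mathcal{H}, \mathbb{C}^2)} \le 2\|U_v(i)\|_{\mathcal{B}(\mathcal{H}, \mathbb{C}^2)} \le 2(2 + \mu)\|U_v(\mu)\|_{\mathcal{B}(\mathcal{H}, \mathbb{C}^2)} . \tag{2.19} \]
Using the definition of $U_v(\mu)$ and the representation (2.17) for the resolvent of $H_v$ we get
\[ U_v(\mu) = V_\alpha(\mu)(1 + B_v(\mu))^{-1}(H_0 + \mu)^{-1/2} = V_\alpha(\mu)(1 + R_v(\mu))^{-1/2}\Bigl( 1 - i(1 + R_v(\mu))^{-1/2} V_\alpha(\mu)^* V_\alpha(\mu)(1 + R_v(\mu))^{-1/2} \Bigr)^{-1} (1 + R_v(\mu))^{-1/2}(H_0 + \mu)^{-1/2} . \]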
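Because the operator norm of
\[ V_\alpha(\mu)(1 + R_v(\mu))^{-1/2}\Bigl( 1 - i(1 + R_v(\mu))^{-1/2} V_\alpha(\mu)^* V_\alpha(\mu)(1 + R_v(\mu))^{-1/2} \Bigr)^{-1} \]
is not larger than one, we get
\[ \|U_v(\mu)\|_{\mathcal{B}(\mathcal{H}, \mathbb{C}^2)} \le \sqrt{2}\,\|(H_0 + \mu)^{-1/2}\|_{\mathcal{B}(\mathcal{H})} \le \sqrt{\frac{2}{1 + \mu}} \]
and with (2.19):
\[ \|U_v(i)\|_{\mathcal{B}_1(\mathcal{H}, \mathbb{C}^2)} \le 2\sqrt{2}\,\frac{2 + \mu}{\sqrt{1 + \mu}} \le 4\sqrt{2}\sqrt{1 + \mu} . \]
Setting $\mu = 1 + 2\|v\|_{L^\infty}$ we finally obtain the inequality (2.15).

Proof of Proposition 2.6. Using Proposition 2.3 and the identity $(K_v + i)^{-1}(K_v - i)^{-1} = \frac{1}{2i}\bigl((K_v - i)^{-1} - (K_v + i)^{-1}\bigr)$ we get
\[ \bigl|(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}}\bigr|^2 = P^{\mathcal{K}}_{\mathcal{H}} (K_v + i)^{-1}(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}} = \frac{1}{2i} P^{\mathcal{K}}_{\mathcal{H}}\bigl( (K_v - i)^{-1} - (K_v + i)^{-1} \bigr) P^{\mathcal{K}}_{\mathcal{H}} = \frac{1}{2i}\bigl( (H_v - i)^{-1} - (H_v^* + i)^{-1} \bigr) . \]
By [22, Lemma 3.2] we get
\[ \bigl|(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}}\bigr|^2 = U_v(i)^* U_v(i) + (H_v^* + i)^{-1}(H_v - i)^{-1} . \]
Therefore we find
\[ \|(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}}\|_{\mathcal{B}_1(\mathcal{K})} = \bigl\| |(K_v - i)^{-1} P^{\mathcal{K}}_{\mathcal{H}}| \bigr\|_{\mathcal{B}_1(\mathcal{K})} \le \|U_v(i)\|_{\mathcal{B}_1(\mathcal{H}, \mathbb{C}^2)} + \|(H_v - i)^{-1}\|_{\mathcal{B}_1(\mathcal{H})} . \]
Now using Lemma 2.7 the proof is finished.

Proposition 2.8. Assume that $v \in L^\infty_{\mathbb{R}}$ and let $K_v$ be given by (2.3). The number of eigenvalues $N(v)$ of $K_v$ is estimated by
\[ N(v) \le 1 + \frac{\sqrt{2\|m\|_{L^\infty}}\,(b - a)}{\pi\hbar}\sqrt{\|v\|_{L^\infty} + |v_b|} . \]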
Proof. We define the operator $K_1$ by
\[ K_1 := -\frac{d^2}{dx^2} - \frac{2\|m\|_{L^\infty}(\|v\|_{L^\infty} + |v_b|)}{\hbar^2}\chi_{(a,b)} , \qquad D(K_1) = W^{2,2}(\mathbb{R}) , \]
where $\chi_{(a,b)} \in L^\infty_{\mathbb{R}}(\mathbb{R})$ is the indicator function of the set $(a, b)$. The number of eigenvalues of $K_1$, which we denote by $N(K_1)$, can be estimated, see [39, p. 274]:
\[ N(K_1) \le 1 + \frac{\sqrt{2\|m\|_{L^\infty}}\,(b - a)}{\pi\hbar}\sqrt{\|v\|_{L^\infty} + |v_b|} . \]
Therefore it suffices to show that $N(v) \le N(K_1)$. A straightforward calculation shows that $\frac{\hbar^2}{2\|m\|_{L^\infty}} K_1 \le K_v$. By the min–max principle, see e.g. [36, Theorem XIII.2], this implies $N(v) \le N(K_1)$.
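The bound of Proposition 2.8 is explicit in the data, so it can be evaluated directly. The short Python sketch below does just that; the function name, the argument names and the input checks are assumptions made for this illustration, and the caller is expected to supply the sup-norms $\|m\|_{L^\infty}$ and $\|v\|_{L^\infty}$ in units consistent with `hbar`.

```python
import math


def eigenvalue_count_bound(m_sup, v_sup, v_b, a, b, hbar=1.0):
    """Right-hand side of the estimate in Proposition 2.8:

        1 + sqrt(2 * m_sup) * (b - a) / (pi * hbar) * sqrt(v_sup + |v_b|),

    where m_sup and v_sup stand for the L^infinity norms of m and v.
    """
    if not (b > a and m_sup > 0 and hbar > 0 and v_sup >= 0):
        raise ValueError("need b > a, m_sup > 0, hbar > 0 and v_sup >= 0")
    well_depth = v_sup + abs(v_b)
    return 1.0 + math.sqrt(2.0 * m_sup) * (b - a) / (math.pi * hbar) * math.sqrt(well_depth)
```

The bound grows linearly in the width $b - a$ and like the square root of the potential scale, which matches the comparison with the square-well operator $K_1$ used in the proof.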
M. Baro et al.
3. Eigenfunction Expansion

In this section we investigate the generalized eigenfunctions of the Buslaev–Fomin operator $K_v$, see (2.3). These eigenfunctions are important for the definition of the carrier and current densities, which we define in Sec. 5. Furthermore, we introduce the Fourier transform corresponding to $K_v$ and show that the eigenfunctions of the Buslaev–Fomin operator can be expressed in terms of the QTB family. The first part of this section slightly generalizes results by Buslaev and Fomin [8], see also [39, Chap. 17]. We define by means of the functions (2.5) and (2.6)
\[
f_1(v)(x,\lambda) := f_+(v)(x,\lambda), \qquad \lambda \in (v_b,\infty), \quad x \in \mathbb{R},
\]
and
\[
f_2(v)(x,\lambda) := f_-(v)(x,\lambda), \qquad \lambda \in (v_b,\infty), \quad x \in \mathbb{R}.
\]
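In the following investigations Wronskians, as defined in (2.7), appear repeatedly. A straightforward calculation shows that
\[
W\big(\overline{f_1(v)(x,\lambda)}, f_1(v)(x,\lambda)\big) = 2iq_b(\lambda), \qquad \lambda \in (v_b,\infty), \tag{3.1}
\]
\[
W\big(\overline{f_2(v)(x,\lambda)}, f_2(v)(x,\lambda)\big) = -2iq_a(\lambda), \qquad \lambda \in (v_a,\infty). \tag{3.2}
\]
Remark 3.1. It should be noted that
\[
v(k) = \begin{cases} 2q_a(\lambda(k)) & \text{for } k > 0, \\ 2q_b(\lambda(k)) & \text{for } k < 0, \end{cases}
\]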
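where $\lambda = \lambda(k)$ is the dispersion relation (1.3) and $v(k)$ for $k > 0$ and $k < 0$ is the group velocity of the species in the asymptotic regions $x < a$ and $x > b$, respectively, see (1.10) and [12]. By means of the Wronskian $W$ we define the coefficients
\[
\begin{aligned}
C_{11}(v)(\lambda) &:= \frac{1}{2iq_b(\lambda)}W\big(\overline{f_1(v)(x,\lambda)}, f_2(v)(x,\lambda)\big), & \lambda &\in (v_b,\infty), \\
C_{12}(v)(\lambda) &:= \frac{1}{2iq_b(\lambda)}W\big(f_2(v)(x,\lambda), f_1(v)(x,\lambda)\big), & \lambda &\in (v_b,\infty), \\
C_{21}(v)(\lambda) &:= \frac{1}{2iq_a(\lambda)}W\big(f_2(v)(x,\lambda), f_1(v)(x,\lambda)\big), & \lambda &\in (v_a,\infty), \\
C_{22}(v)(\lambda) &:= \frac{1}{2iq_a(\lambda)}W\big(f_1(v)(x,\lambda), \overline{f_2(v)(x,\lambda)}\big), & \lambda &\in (v_a,\infty).
\end{aligned}
\]
Thus,
\[
f_1(v)(x,\lambda) = C_{22}(v)(\lambda)f_2(v)(x,\lambda) + C_{21}(v)(\lambda)\overline{f_2(v)(x,\lambda)}, \qquad \lambda \in (v_a,\infty), \tag{3.3}
\]
\[
f_2(v)(x,\lambda) = C_{11}(v)(\lambda)f_1(v)(x,\lambda) + C_{12}(v)(\lambda)\overline{f_1(v)(x,\lambda)}, \qquad \lambda \in (v_b,\infty). \tag{3.4}
\]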
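There are the following equations:
\[
q_b(\lambda)C_{12}(v)(\lambda) = q_a(\lambda)C_{21}(v)(\lambda), \qquad q_b(\lambda)C_{11}(v)(\lambda) = -q_a(\lambda)\overline{C_{22}(v)(\lambda)}, \qquad \lambda \in (v_a,\infty), \tag{3.5}
\]
and
\[
\frac{q_b(\lambda)}{q_a(\lambda)}|C_{12}(v)(\lambda)|^2 = 1 + \frac{q_b(\lambda)}{q_a(\lambda)}|C_{11}(v)(\lambda)|^2 = 1 + \frac{q_a(\lambda)}{q_b(\lambda)}|C_{22}(v)(\lambda)|^2 \tag{3.6}
\]
for $\lambda \in (v_a,\infty)$. Moreover, we have
\[
C_{11}(v)(\lambda) = \overline{C_{12}(v)(\lambda)}, \qquad \lambda \in (v_b,v_a). \tag{3.7}
\]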
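The scattering coefficients $S_{ij}(v)(\lambda)$ are given by
\[
S_{ba}(v)(\lambda) := \frac{1}{C_{21}(v)(\lambda)}, \qquad S_{aa}(v)(\lambda) := \frac{C_{22}(v)(\lambda)}{C_{21}(v)(\lambda)}, \qquad \lambda \in (v_a,\infty), \tag{3.8}
\]
\[
S_{bb}(v)(\lambda) := \frac{C_{11}(v)(\lambda)}{C_{12}(v)(\lambda)}, \qquad S_{ab}(v)(\lambda) := \frac{1}{C_{12}(v)(\lambda)}, \qquad \lambda \in (v_b,\infty). \tag{3.9}
\]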
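Using (3.5) and (3.6) we obtain the following relations for $\lambda \in (v_a,\infty)$:
\[
q_a(\lambda)S_{ab}(v)(\lambda) = q_b(\lambda)S_{ba}(v)(\lambda), \qquad q_b(\lambda)S_{ba}(v)(\lambda)\overline{S_{bb}(v)(\lambda)} = -q_a(\lambda)S_{aa}(v)(\lambda)\overline{S_{ab}(v)(\lambda)} \tag{3.10}
\]
and
\[
\frac{q_b(\lambda)}{q_a(\lambda)}|S_{ba}(v)(\lambda)|^2 + |S_{aa}(v)(\lambda)|^2 = \frac{q_a(\lambda)}{q_b(\lambda)}|S_{ab}(v)(\lambda)|^2 + |S_{bb}(v)(\lambda)|^2 = 1. \tag{3.11}
\]
Equation (3.11) implies
\[
|S_{aa}(v)(\lambda)| \le 1 \quad \text{and} \quad |S_{bb}(v)(\lambda)| \le 1 \qquad \text{for } \lambda \in (v_a,\infty). \tag{3.12}
\]
Furthermore, we get from (3.7)
\[
|S_{bb}(v)(\lambda)| = 1 \qquad \text{for } \lambda \in (v_b,v_a). \tag{3.13}
\]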
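The boundary values of $S_{ij}(\lambda)$, $i,j = a,b$, are given by
\[
\lim_{\lambda \to v_a} S_{ba}(v)(\lambda) = \lim_{\lambda \to v_b} S_{ab}(v)(\lambda) = 0, \qquad \lim_{\lambda \to v_a} S_{aa}(v)(\lambda) = \lim_{\lambda \to v_b} S_{bb}(v)(\lambda) = -1.
\]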
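The relations between the scattering coefficients can be checked numerically in a simple special case. The sketch below assumes a piecewise-constant potential and mass (values `v_0`, `m_0` on $(a,b)$), continuity of $\psi$ and of $\psi'/m$ at $a$ and $b$, and $q_a(\lambda) = \sqrt{(\lambda - v_a)/(2m_a)}$, which is inferred from the plane waves $\exp(\pm i\frac{2m_a}{\hbar}q_a(\lambda)x)$ used in this section. The function name and all sample numbers are hypothetical.

```python
import numpy as np

def scattering_coefficients(lam, v_a, v_b, v_0, m_a, m_b, m_0, a, b, hbar=1.0):
    """Return (S_aa, S_ba, S_bb, S_ab, q_a, q_b) for a piecewise-constant
    profile. Assumes lam > max(v_a, v_b) and lam != v_0."""
    q_a = np.sqrt((lam - v_a) / (2.0 * m_a))
    q_b = np.sqrt((lam - v_b) / (2.0 * m_b))
    k_a = 2.0 * m_a * q_a / hbar
    k_b = 2.0 * m_b * q_b / hbar
    # Interior wave number; purely imaginary when lam < v_0 (tunnelling).
    k_0 = np.sqrt(complex(2.0 * m_0 * (lam - v_0))) / hbar

    out_left = np.exp(-1j * k_a * a)    # outgoing wave on x < a, taken at a
    out_right = np.exp(1j * k_b * b)    # outgoing wave on x > b, taken at b
    ia_p, ia_m = np.exp(1j * k_0 * a), np.exp(-1j * k_0 * a)
    ib_p, ib_m = np.exp(1j * k_0 * b), np.exp(-1j * k_0 * b)
    c_a, c_0, c_b = k_a / m_a, k_0 / m_0, k_b / m_b

    # Unknowns: [amplitude on x < a, interior C, interior D, amplitude on x > b].
    # Rows: continuity of psi and of psi'/m at a, then at b.
    matching = np.array([
        [-out_left,      ia_p,        ia_m,         0.0],
        [c_a * out_left, c_0 * ia_p, -c_0 * ia_m,   0.0],
        [0.0,            ib_p,        ib_m,        -out_right],
        [0.0,            c_0 * ib_p, -c_0 * ib_m,  -c_b * out_right],
    ], dtype=complex)
    from_left = np.array([np.conj(out_left), c_a * np.conj(out_left), 0, 0],
                         dtype=complex)
    from_right = np.array([0, 0, np.conj(out_right), -c_b * np.conj(out_right)],
                          dtype=complex)

    s_aa, _, _, s_ba = np.linalg.solve(matching, from_left)
    s_ab, _, _, s_bb = np.linalg.solve(matching, from_right)
    return s_aa, s_ba, s_bb, s_ab, q_a, q_b

# Hypothetical sample data: a barrier of height 3 on (-0.4, 0.9), energy 2.
s_aa, s_ba, s_bb, s_ab, q_a, q_b = scattering_coefficients(
    lam=2.0, v_a=0.5, v_b=0.0, v_0=3.0, m_a=1.0, m_b=0.7, m_0=1.3, a=-0.4, b=0.9)

flux_left = (q_b / q_a) * abs(s_ba) ** 2 + abs(s_aa) ** 2     # (3.11), first part
flux_right = (q_a / q_b) * abs(s_ab) ** 2 + abs(s_bb) ** 2    # (3.11), second part
reciprocity = q_a * s_ab - q_b * s_ba                         # (3.10), first part
orthogonality = q_b * s_ba * np.conj(s_bb) + q_a * s_aa * np.conj(s_ab)  # (3.10)
```

Both flux sums should equal one and both residuals should vanish up to rounding, which also makes the bounds (3.12) visible in a concrete case.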
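Now we define
\[
\psi_a(v)(x,\lambda) := S_{ba}(v)(\lambda)f_1(v)(x,\lambda), \quad \lambda \in (v_a,\infty), \qquad \psi_b(v)(x,\lambda) := S_{ab}(v)(\lambda)f_2(v)(x,\lambda), \quad \lambda \in (v_b,\infty).
\]
By (3.3) and (3.4) we get
\[
\begin{aligned}
\psi_a(v)(x,\lambda) &= \overline{f_2(v)(x,\lambda)} + S_{aa}(v)(\lambda)f_2(v)(x,\lambda), & &\text{for } \lambda \in (v_a,\infty), \\
\psi_b(v)(x,\lambda) &= \overline{f_1(v)(x,\lambda)} + S_{bb}(v)(\lambda)f_1(v)(x,\lambda), & &\text{for } \lambda \in (v_b,\infty).
\end{aligned} \tag{3.14}
\]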
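Thus, outside the interval $(a,b)$ the functions $\psi_a(v)$ and $\psi_b(v)$ are given by
\[
\psi_a(v)(x,\lambda) = \begin{cases} \exp\big(i\tfrac{2m_a}{\hbar}q_a(\lambda)x\big) + S_{aa}(v)(\lambda)\exp\big(-i\tfrac{2m_a}{\hbar}q_a(\lambda)x\big), & x \in (-\infty,a), \\ S_{ba}(v)(\lambda)\exp\big(i\tfrac{2m_b}{\hbar}q_b(\lambda)x\big), & x \in (b,\infty) \end{cases} \tag{3.15}
\]
for $\lambda \in (v_a,\infty)$ and
\[
\psi_b(v)(x,\lambda) = \begin{cases} S_{ab}(v)(\lambda)\exp\big(-i\tfrac{2m_a}{\hbar}q_a(\lambda)x\big), & x \in (-\infty,a), \\ \exp\big(-i\tfrac{2m_b}{\hbar}q_b(\lambda)x\big) + S_{bb}(v)(\lambda)\exp\big(i\tfrac{2m_b}{\hbar}q_b(\lambda)x\big), & x \in (b,\infty) \end{cases} \tag{3.16}
\]
for $\lambda \in (v_b,\infty)$, respectively.

Remark 3.2. Formulas (3.15) and (3.16) have the following physical interpretation: the wave $\exp(i\frac{2m_a}{\hbar}q_a(\lambda)x)$ coming from $-\infty$ is scattered at the potential $v$. During the scattering the wave is partially reflected and partially transmitted by $v$. The reflected and the transmitted parts are given by
\[
S_{aa}(v)(\lambda)\exp\big(-i\tfrac{2m_a}{\hbar}q_a(\lambda)x\big) \quad \text{and} \quad S_{ba}(v)(\lambda)\exp\big(i\tfrac{2m_b}{\hbar}q_b(\lambda)x\big),
\]
respectively. Similarly, the wave $\exp(-i\frac{2m_b}{\hbar}q_b(\lambda)x)$, which comes from $+\infty$, splits up during the scattering into the reflected and transmitted parts
\[
S_{bb}(v)(\lambda)\exp\big(i\tfrac{2m_b}{\hbar}q_b(\lambda)x\big) \quad \text{and} \quad S_{ab}(v)(\lambda)\exp\big(-i\tfrac{2m_a}{\hbar}q_a(\lambda)x\big),
\]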
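respectively.

Lemma 3.3. There are the following identities for the functions (3.14):
\[
\int_{\mathbb{R}} dx\, \psi_a(v)(x,\lambda)\overline{\psi_a(v)(x,\mu)} = 4\pi\hbar q_a(\lambda)\delta(\lambda-\mu), \qquad \text{for } \lambda,\mu \in (v_a,\infty), \tag{3.17}
\]
\[
\int_{\mathbb{R}} dx\, \psi_b(v)(x,\lambda)\overline{\psi_b(v)(x,\mu)} = 4\pi\hbar q_b(\lambda)\delta(\lambda-\mu), \qquad \text{for } \lambda,\mu \in (v_b,\infty), \tag{3.18}
\]
and
\[
\int_{\mathbb{R}} dx\, \psi_a(v)(x,\lambda)\overline{\psi_b(v)(x,\mu)} = 0, \qquad \text{for } \lambda,\mu \in (v_a,\infty). \tag{3.19}
\]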
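Proof. For the sake of simplicity we omit the index $v$ within the proof. Assume $\lambda,\mu \in (v_a,\infty)$. Then
\[
\begin{aligned}
\int_{-N}^{N} dx\, \psi_a(x,\lambda)\overline{\psi_a(x,\mu)} &= \frac{\hbar}{\lambda-\mu}\left(\int_{-N}^{N} dx\, \psi_a(x,\lambda)\frac{\partial}{\partial x}\frac{\hbar}{2m(x)}\frac{\partial}{\partial x}\overline{\psi_a(x,\mu)} - \int_{-N}^{N} dx \left(\frac{\partial}{\partial x}\frac{\hbar}{2m(x)}\frac{\partial}{\partial x}\psi_a(x,\lambda)\right)\overline{\psi_a(x,\mu)}\right) \\
&= \frac{\hbar}{\lambda-\mu}\Big(W\big(\psi_a(N,\lambda),\overline{\psi_a(N,\mu)}\big) - W\big(\psi_a(-N,\lambda),\overline{\psi_a(-N,\mu)}\big)\Big).
\end{aligned}
\]
For $N \ge b$ we get by (3.15)
\[
W\big(\psi_a(N,\lambda),\overline{\psi_a(N,\mu)}\big) = -iS_{ba}(\lambda)\overline{S_{ba}(\mu)}\,(q_b(\lambda)+q_b(\mu))\exp\Big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))N\Big). \tag{3.20}
\]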
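Hence, we get by observing $\delta\big(\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))\big) = 2\hbar q_b(\lambda)\delta(\lambda-\mu)$, see [16, Chap. II, Sec. 2, Eq. II]:
\[
\lim_{N\to\infty}\frac{\hbar}{\lambda-\mu}W\big(\psi_a(N,\lambda),\overline{\psi_a(N,\mu)}\big) = \pi|S_{ba}(\lambda)|^2\,\delta\Big(\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))\Big) = 2\pi\hbar|S_{ba}(\lambda)|^2 q_b(\lambda)\delta(\lambda-\mu).
\]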
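For $N \le a$ one obtains, analogously to (3.20):
\[
\begin{aligned}
\frac{\hbar}{\lambda-\mu}W\big(\psi_a(N,\lambda),\overline{\psi_a(N,\mu)}\big) = {} & -i\hbar\,\frac{\exp\big(i\frac{2m_a}{\hbar}(q_a(\lambda)-q_a(\mu))N\big)}{2m_a(q_a(\lambda)-q_a(\mu))} - i\hbar\,\overline{S_{aa}(\mu)}\,\frac{\exp\big(i\frac{2m_a}{\hbar}(q_a(\lambda)+q_a(\mu))N\big)}{2m_a(q_a(\lambda)+q_a(\mu))} \\
& + i\hbar\,S_{aa}(\lambda)\,\frac{\exp\big(-i\frac{2m_a}{\hbar}(q_a(\lambda)+q_a(\mu))N\big)}{2m_a(q_a(\lambda)+q_a(\mu))} + i\hbar\,\overline{S_{aa}(\mu)}S_{aa}(\lambda)\,\frac{\exp\big(-i\frac{2m_a}{\hbar}(q_a(\lambda)-q_a(\mu))N\big)}{2m_a(q_a(\lambda)-q_a(\mu))}.
\end{aligned} \tag{3.21}
\]
Therefore
\[
\lim_{N\to-\infty}\frac{\hbar}{\lambda-\mu}W\big(\psi_a(N,\lambda),\overline{\psi_a(N,\mu)}\big) = -\pi\big(1+\overline{S_{aa}(\mu)}S_{aa}(\lambda)\big)\,\delta\Big(\frac{2m_a}{\hbar}(q_a(\lambda)-q_a(\mu))\Big) - \pi\big(\overline{S_{aa}(\mu)}+S_{aa}(\lambda)\big)\,\delta\Big(\frac{2m_a}{\hbar}(q_a(\lambda)+q_a(\mu))\Big).
\]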
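Since $q_a(\lambda') > 0$ for all $\lambda' \in (v_a,\infty)$ we get $\delta\big(\frac{2m_a}{\hbar}(q_a(\lambda)+q_a(\mu))\big) = 0$. Thus,
\[
\lim_{N\to-\infty}\frac{\hbar}{\lambda-\mu}W\big(\psi_a(N,\lambda),\overline{\psi_a(N,\mu)}\big) = -2\pi\hbar q_a(\mu)\big(1+|S_{aa}(\mu)|^2\big)\delta(\lambda-\mu). \tag{3.22}
\]
Combining the limit $N \to \infty$ computed from (3.20) with (3.22) yields
\[
\int_{-\infty}^{\infty}\psi_a(x,\lambda)\overline{\psi_a(x,\mu)}\,dx = 2\pi\hbar\big(q_b(\lambda)|S_{ba}(\lambda)|^2 + q_a(\lambda) + q_a(\lambda)|S_{aa}(\lambda)|^2\big)\delta(\lambda-\mu).
\]
By (3.11) we get
\[
\int_{-\infty}^{\infty}\psi_a(x,\lambda)\overline{\psi_a(x,\mu)}\,dx = 4\pi\hbar q_a(\lambda)\delta(\lambda-\mu),
\]
which proves (3.17). Analogously we obtain (3.18) and (3.19).

The generalized eigenfunctions of the Buslaev–Fomin operator $K_v$, see (2.3), are given by
\[
\phi_b(v)(x,\lambda) := \frac{1}{\sqrt{4\pi\hbar q_b(\lambda)}}\psi_b(v)(x,\lambda), \quad \lambda \in (v_b,\infty), \qquad \phi_a(v)(x,\lambda) := \frac{1}{\sqrt{4\pi\hbar q_a(\lambda)}}\psi_a(v)(x,\lambda), \quad \lambda \in (v_a,\infty), \tag{3.23}
\]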
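and the orthonormal eigenfunctions corresponding to the eigenvalues $\lambda_1(v),\ldots,\lambda_{N(v)}(v)$ are denoted by $\phi_p(v)(x,\lambda_j(v))$, $j = 1,\ldots,N(v)$.

Corollary 3.4. The functions $\{\phi_a(v)(\cdot,\lambda)\}_{\lambda\in(v_a,\infty)} \cup \{\phi_b(v)(\cdot,\lambda)\}_{\lambda\in(v_b,\infty)} \cup \{\phi_p(v)(\cdot,\lambda)\}_{\lambda\in\sigma_p(K_v)}$ constitute a complete system of orthonormal generalized eigenfunctions, i.e.
\[
\big(\phi_\tau(v)(\cdot,\lambda),\phi_{\tau'}(v)(\cdot,\lambda')\big)_{\mathcal{K}} = \delta_{\tau,\tau'}\,\delta(\lambda-\lambda'), \qquad \tau,\tau' \in \{a,b,p\},
\]
where $\lambda$ and $\lambda'$ are from a part of $\sigma(K_v)$ which corresponds to $\tau$ and $\tau'$, respectively.

We now introduce the Hilbert space $\hat{\mathcal{K}}_v := L^2(\sigma(K_v),h(\lambda),\nu)$, see [3, Chap. 4], where
\[
h(\lambda) := \begin{cases} \mathbb{C}, & \lambda \in \sigma_p(K_v) = \bigcup_{j=1}^{N(v)}\{\lambda_j(v)\}, \\ \mathbb{C}, & \lambda \in (v_b,v_a), \\ \mathbb{C}^2, & \lambda \in (v_a,\infty). \end{cases} \tag{3.24}
\]
The measure $\nu(\cdot)$ decomposes
\[
\nu(\cdot) = \nu_p(\cdot) + \nu_{ac}(\cdot) \tag{3.25}
\]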
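into an atomic measure $\nu_p(\{\lambda_j(v)\}) = 1$, $j = 1,\ldots,N(v)$, supported on $\sigma_p(K_v)$, and an absolutely continuous measure $d\nu_{ac}(\lambda) = \chi_{(v_b,\infty)}(\lambda)\,d\lambda$ supported on $(v_b,\infty)$. With respect to the decomposition (3.24) we define
\[
\vec{\phi}_v(x,\lambda) := \begin{cases} \phi_p(v)(x,\lambda), & \lambda \in \sigma_p(K_v), \\ \phi_b(v)(x,\lambda), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} \phi_b(v)(x,\lambda) \\ \phi_a(v)(x,\lambda) \end{pmatrix}, & \lambda \in (v_a,\infty). \end{cases} \tag{3.26}
\]
By means of the functions $\vec{\phi}_v(x,\lambda)$ we now define the Fourier transform with respect to $K_v$ as the unitary operator $\Phi_v : \mathcal{K} \to \hat{\mathcal{K}}_v$, see [39, Theorem 17.C.2]:
\[
(\Phi_v f)(\lambda) = \int_{\mathbb{R}} dx\, f(x)\,\overline{\vec{\phi}_v(x,\lambda)}, \qquad \lambda \in \sigma(K_v). \tag{3.27}
\]
The inverse Fourier transform $\Phi_v^{-1} : \hat{\mathcal{K}}_v \to \mathcal{K}$ is given by
\[
(\Phi_v^{-1}\hat{g})(x) = \int_{\sigma(K_v)} d\nu(\lambda)\,\Big\langle \hat{g}(\lambda), \overline{\vec{\phi}_v(x,\lambda)} \Big\rangle_{h(\lambda)}, \qquad x \in \mathbb{R}, \quad \hat{g} \in \hat{\mathcal{K}}_v.
\]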
Remark 3.5. We note that $\Phi_v K_v \Phi_v^{-1} = M$, where $M$ is the multiplication operator
\[
(Mg)(\lambda) := \lambda g(\lambda) \qquad \text{for } g \in D(M) := \{g \in \hat{\mathcal{K}}_v \mid \lambda g(\lambda) \in \hat{\mathcal{K}}_v\}.
\]
This is a generalized form of (1.7).

In the following we give a description of the eigenfunctions of the Buslaev–Fomin operator $K_v$, see (2.3), on the interval $(a,b)$ in terms of the QTB family. To that end we consider the QTB family $\{H_v(\lambda)\}_{\lambda\in\mathbb{R}}$ on the real axis. By the definition of $H_v(\lambda)$ we have to distinguish two cases:

$\lambda \in (-\infty,v_b)$: The coefficients $\kappa_a(\lambda)$ and $\kappa_b(\lambda)$ are real and negative. Therefore the operator family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$ is a family of selfadjoint operators.

$\lambda \in (v_b,\infty)$: The imaginary part of $\kappa_b(\lambda)$ is strictly positive, $\mathrm{Im}(\kappa_b(\lambda)) > 0$. Hence, the operator family $\{H_v(\lambda)\}_{\lambda\in(v_b,\infty)}$ is a family of dissipative operators. Please note that the boundary conditions of this family are exactly the boundary conditions (1.9) in energy form, see also Remark 3.5.
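Let us first show that the generalized eigenfunctions of $K_v$ are closely related to the family $\{H_v(\lambda)\}_{\lambda\in(v_b,\infty)}$. To that end we introduce the operators $\alpha(\lambda) : \mathcal{H} \to h(\lambda)$, $\lambda \in (v_b,\infty)$, see also (2.13):
\[
\alpha(\lambda)f := \begin{cases} \sqrt{2}\,\sqrt{q_b(\lambda)}\,f(b), & \lambda \in (v_b,v_a), \\ \sqrt{2}\begin{pmatrix} \sqrt{q_b(\lambda)}\,f(b) \\ -\sqrt{q_a(\lambda)}\,f(a) \end{pmatrix}, & \lambda \in (v_a,\infty), \end{cases} \qquad f \in D(\alpha(\lambda)) = W^{1,2}. \tag{3.28}
\]
Moreover, we define the vectors $e_b(\lambda), e_a(\lambda) \in h(\lambda)$, $\lambda \in (v_b,\infty)$, by
\[
e_b(\lambda) := \begin{cases} 1, & \lambda \in (v_b,v_a), \\ (1,0)^T, & \lambda \in (v_a,\infty), \end{cases} \qquad e_a(\lambda) := \begin{cases} 0, & \lambda \in (v_b,v_a), \\ (0,1)^T, & \lambda \in (v_a,\infty), \end{cases}
\]
where $T$ denotes the transpose of a vector. Furthermore we define the operators $T_v(\lambda) : \mathcal{H} \to h(\lambda)$, $\lambda \in (v_b,\infty)$, by
\[
T_v(\lambda)f := \alpha(\lambda)(H_v(\lambda)^* - \lambda)^{-1}f, \qquad f \in \mathcal{H}. \tag{3.29}
\]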
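Note that the definition makes sense, since the spectrum of $H_v(\lambda)$ does not intersect the real line, see [21, Theorem 5.2].

Lemma 3.6. For $x \in (a,b)$ there is the following representation of the eigenfunctions of the Buslaev–Fomin operator $K_v$:
\[
\begin{aligned}
\phi_b(v)(x,\lambda) &= -i\sqrt{\frac{\hbar}{2\pi}}\exp\Big(-i\frac{2m_b}{\hbar}q_b(\lambda)b\Big)(T_v(\lambda)^*e_b(\lambda))(x), & \lambda &\in (v_b,\infty), \\
\phi_a(v)(x,\lambda) &= i\sqrt{\frac{\hbar}{2\pi}}\exp\Big(i\frac{2m_a}{\hbar}q_a(\lambda)a\Big)(T_v(\lambda)^*e_a(\lambda))(x), & \lambda &\in (v_a,\infty).
\end{aligned}
\]
Proof. Using Lemma 2.1 and Proposition 2.3 we get for every $f \in \mathcal{H}$ and $x \in (a,b)$
\[
((H_v(\lambda)^*-\lambda)^{-1}f)(x) = \frac{f_1(v)(x,\lambda)}{\hbar W_v(\lambda)}\int_a^x dy\, f_2(v)(y,\lambda)f(y) + \frac{f_2(v)(x,\lambda)}{\hbar W_v(\lambda)}\int_x^b dy\, f_1(v)(y,\lambda)f(y).
\]
Hence,
\[
T_v(\lambda)f = \frac{\sqrt{2}}{\hbar W_v(\lambda)}\begin{cases} \sqrt{q_b(\lambda)}\,f_1(v)(b,\lambda)\int_a^b dy\, f_2(v)(y,\lambda)f(y), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} \sqrt{q_b(\lambda)}\,f_1(v)(b,\lambda)\int_a^b dy\, f_2(v)(y,\lambda)f(y) \\ -\sqrt{q_a(\lambda)}\,f_2(v)(a,\lambda)\int_a^b dy\, f_1(v)(y,\lambda)f(y) \end{pmatrix}, & \lambda \in (v_a,\infty). \end{cases}
\]
Since
\[
f_1(v)(b,\lambda) = \exp\Big(i\frac{2m_b}{\hbar}q_b(\lambda)b\Big) \quad \text{and} \quad f_2(v)(a,\lambda) = \exp\Big(-i\frac{2m_a}{\hbar}q_a(\lambda)a\Big),
\]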
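we obtain for all $x \in (a,b)$:
\[
\begin{aligned}
(T_v(\lambda)^*e_b(\lambda))(x) &= \frac{\sqrt{2q_b(\lambda)}\,f_2(v)(x,\lambda)}{\hbar W_v(\lambda)}\exp\Big(i\frac{2m_b}{\hbar}q_b(\lambda)b\Big), & \lambda &\in (v_b,\infty), \\
(T_v(\lambda)^*e_a(\lambda))(x) &= -\frac{\sqrt{2q_a(\lambda)}\,f_1(v)(x,\lambda)}{\hbar W_v(\lambda)}\exp\Big(-i\frac{2m_a}{\hbar}q_a(\lambda)a\Big), & \lambda &\in (v_a,\infty).
\end{aligned}
\]
By the definition (2.7) of the Wronskians $W_v(\lambda)$ we have
\[
W_v(\lambda) = -2iq_a(\lambda)C_{21}(v)(\lambda) = -2iq_b(\lambda)C_{12}(v)(\lambda)
\]
or in terms of the scattering coefficients (3.8) and (3.9)
\[
\frac{1}{W_v(\lambda)} = \frac{i}{2q_a(\lambda)}S_{ba}(v)(\lambda) = \frac{i}{2q_b(\lambda)}S_{ab}(v)(\lambda).
\]
Thus,
\[
\begin{aligned}
\frac{1}{\sqrt{2\pi}}(T_v(\lambda)^*e_b(\lambda))(x) &= \frac{i\exp\big(i\frac{2m_b}{\hbar}q_b(\lambda)b\big)}{\hbar\sqrt{4\pi q_b(\lambda)}}S_{ab}(v)(\lambda)f_2(v)(x,\lambda) \\
&= \frac{i\exp\big(i\frac{2m_b}{\hbar}q_b(\lambda)b\big)}{\hbar\sqrt{4\pi q_b(\lambda)}}\psi_b(v)(x,\lambda) = \frac{i\exp\big(i\frac{2m_b}{\hbar}q_b(\lambda)b\big)}{\sqrt{\hbar}}\phi_b(x,\lambda),
\end{aligned}
\]
which proves the first equation of Lemma 3.6; similarly we get the second one.

By means of
\[
\vec{\phi}_T(v)(x,\lambda) := \begin{cases} (T_v(\lambda)^*e_b(\lambda))(x), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} (T_v(\lambda)^*e_b(\lambda))(x) \\ (T_v(\lambda)^*e_a(\lambda))(x) \end{pmatrix}, & \lambda \in (v_a,\infty), \end{cases} \qquad x \in (a,b),
\]
and
\[
Q(\lambda) := \begin{cases} -i\exp\big(-i\frac{2m_b}{\hbar}q_b(\lambda)b\big), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} -i\exp\big(-i\frac{2m_b}{\hbar}q_b(\lambda)b\big) & 0 \\ 0 & i\exp\big(i\frac{2m_a}{\hbar}q_a(\lambda)a\big) \end{pmatrix}, & \lambda \in (v_a,\infty), \end{cases}
\]
we can write the functions (3.26)
\[
\vec{\phi}(v)(x,\lambda) = \sqrt{\frac{\hbar}{2\pi}}\,Q(\lambda)\vec{\phi}_T(v)(x,\lambda), \qquad \lambda \in (v_b,\infty), \quad x \in (a,b). \tag{3.30}
\]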
Thus, we get the following

Corollary 3.7. For $f \in \mathcal{H}$ we get
\[
(\Phi_v f)(\lambda) = \sqrt{\frac{\hbar}{2\pi}}\,Q(\lambda)^*T_v(\lambda)f, \qquad \lambda \in (v_b,\infty).
\]
Remark 3.8. Let $v \in L^\infty_{\mathbb{R}}$ and $\lambda_0 \in \mathbb{R}$ with $\lambda_0 > v_b$ be given. We set $H_{dis}(v) := H_v(\lambda_0)$ and denote by $K_{dis}(v)$ the selfadjoint dilation corresponding to $H_{dis}(v)$, see Remark 2.5. The generalized eigenfunctions $\phi_{dis,b}(v)$, $\phi_{dis,a}(v)$ of $K_{dis}(v)$ on the interval $(a,b)$ are given by
\[
\phi_{dis,b}(v)(x,\xi) = \frac{1}{\sqrt{2\pi}}(T_{dis}(v;\xi)^*e_b(\lambda_0))(x), \qquad \phi_{dis,a}(v)(x,\xi) = \frac{1}{\sqrt{2\pi}}(T_{dis}(v;\xi)^*e_a(\lambda_0))(x), \qquad \xi \in \mathbb{R}, \quad x \in (a,b),
\]
where $T_{dis}(v;\xi) : \mathcal{H} \to \mathbb{C}^2$ is defined by
\[
T_{dis}(v;\xi) := \alpha(\lambda_0)(H_{dis}(v)^*-\xi)^{-1}, \qquad \xi \in \mathrm{cl}(\mathbb{C}_-),
\]
see [22, Theorem 5.1]. Therefore, we get by Lemma 3.6 ($\hbar$ scaled to 1)
\[
\begin{pmatrix} \phi_b(v)(x,\lambda_0) \\ \phi_a(v)(x,\lambda_0) \end{pmatrix} = Q(\lambda_0)\begin{pmatrix} \phi_{dis,b}(v)(x,\lambda_0) \\ \phi_{dis,a}(v)(x,\lambda_0) \end{pmatrix}, \qquad x \in (a,b),
\]
i.e. the generalized eigenfunctions of $K_v$ and $K_{dis}(v)$ coincide, modulo a unitary transformation, for any fixed energy $\lambda_0 > v_b$.

We are now going to show that the eigenfunctions and eigenvalues of the operator family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$ determine the eigenfunctions and eigenvalues of $K_v$ in a unique way. Let us first define what we mean by the eigenvalues and eigenfunctions of an operator family.

Definition 3.9 (See [30, Chap. II, Sec. 11]). An element $f \in \mathcal{H}$ is called an eigenvector of the QTB operator family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$, if $H_v(\mu(v))f = \mu(v)f$ for some $\mu(v) \in (-\infty,v_b)$; $\mu(v)$ is called the corresponding eigenvalue. The set of all these eigenvalues, the spectrum of $\{H_v(\lambda)\}$, is denoted by $\sigma(\{H_v(\lambda)\})$ and the normalized eigenfunction of the QTB operator family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$ corresponding to the eigenvalue $\mu(v)$ is denoted by $\eta(v)(\cdot,\mu(v))$.

For every $\mu(v) \in \sigma(\{H_v(\lambda)\})$ we set
\[
\tilde{\phi}(v)(x,\mu(v)) = \begin{cases} \exp\big(-\kappa_a(\mu(v))\frac{2m_a}{\hbar}(x-a)\big)\,\eta(v)(a,\mu(v)), & x \in (-\infty,a], \\ \eta(v)(x,\mu(v)), & x \in (a,b), \\ \exp\big(\kappa_b(\mu(v))\frac{2m_b}{\hbar}(x-b)\big)\,\eta(v)(b,\mu(v)), & x \in [b,\infty). \end{cases}
\]
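Definition 3.9 describes a nonlinear eigenvalue problem, because the boundary condition of $H_v(\mu)$ depends on $\mu$ itself. The following minimal sketch solves it in a special case under several assumptions that are not taken from the paper: constant mass, a constant potential `v_0` on an interval of length `length`, units with $\hbar = m = 1$, and $\kappa_a(\mu) = -\sqrt{(v_a-\mu)/(2m_a)}$ (similarly for $\kappa_b$), which is inferred from the exponential continuation above. With these assumptions the boundary conditions read $f'(a) = \gamma_a f(a)$ and $f'(b) = -\gamma_b f(b)$ with decay rates $\gamma = \sqrt{2m(v-\mu)}/\hbar$. The function names are hypothetical, and a plain bisection stands in for a proper root finder.

```python
import math

def qtb_mismatch(mu, v_a, v_b, v_0, length, mass=1.0, hbar=1.0):
    """Residual f'(b) + gamma_b f(b) of the interior solution
    f(x) = cos(k (x - a)) + (gamma_a / k) sin(k (x - a)),
    which satisfies the energy-dependent condition at a by construction.
    The residual vanishes exactly at the eigenvalues of the family."""
    k = math.sqrt(2.0 * mass * (mu - v_0)) / hbar
    gamma_a = math.sqrt(2.0 * mass * (v_a - mu)) / hbar
    gamma_b = math.sqrt(2.0 * mass * (v_b - mu)) / hbar
    s, c = math.sin(k * length), math.cos(k * length)
    return (gamma_a + gamma_b) * c + (gamma_a * gamma_b / k - k) * s

def family_eigenvalues(v_a, v_b, v_0, length, samples=4000, tol=1e-12):
    """Locate sign changes of the residual on (v_0, min(v_a, v_b)) and
    refine each one by bisection."""
    lo, hi = v_0, min(v_a, v_b)
    grid = [lo + (hi - lo) * (i + 0.5) / samples for i in range(samples)]
    values = [qtb_mismatch(mu, v_a, v_b, v_0, length) for mu in grid]
    roots = []
    for i in range(samples - 1):
        x0, x1, g0 = grid[i], grid[i + 1], values[i]
        if g0 * values[i + 1] < 0.0:
            while x1 - x0 > tol:
                mid = 0.5 * (x0 + x1)
                gm = qtb_mismatch(mid, v_a, v_b, v_0, length)
                if g0 * gm <= 0.0:
                    x1 = mid
                else:
                    x0, g0 = mid, gm
            roots.append(0.5 * (x0 + x1))
    return roots

# Hypothetical sample data: a symmetric well of depth 10 and width 2.
eigenvalues = family_eigenvalues(v_a=0.0, v_b=0.0, v_0=-10.0, length=2.0)
```

In the symmetric case the lowest root should satisfy the textbook finite-well condition $k\tan(kL/2) = \gamma$, which gives an independent check of the sketch.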
Note that $\kappa_a(\lambda), \kappa_b(\lambda) < 0$ for all $\lambda \in (-\infty,v_b)$. Hence, $\tilde{\phi}(v)(\cdot,\mu(v)) \in \mathcal{K}$ for every $\mu(v) \in \sigma(\{H_v(\lambda)\})$. Since $\eta(v)(\cdot,\mu(v)) \in D(H_v(\mu(v)))$, see Definition 2.2, it satisfies the quantum transmitting boundary condition and a straightforward calculation shows that
\[
\big(\tilde{\phi}(v)(\cdot,\mu(v)),\tilde{\phi}(v)(\cdot,\xi(v))\big)_{\mathcal{K}} = 0 \tag{3.31}
\]
for all $\mu(v),\xi(v) \in \sigma(\{H_v(\lambda)\})$ with $\mu(v) \ne \xi(v)$. The following lemma states the relation between the eigenvalues and eigenfunctions of the family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$ and the eigenvalues and eigenfunctions of the Buslaev–Fomin operator $K_v$.

Lemma 3.10. Assume that $v \in L^\infty_{\mathbb{R}}$.

(i) If $\mu(v) \in \sigma(\{H_v(\lambda)\})$, then
\[
\phi_p(v)(\cdot,\mu(v)) := \frac{\tilde{\phi}(v)(\cdot,\mu(v))}{\|\tilde{\phi}(v)(\cdot,\mu(v))\|_{\mathcal{K}}} \tag{3.32}
\]
is a normalized eigenfunction of $K_v$ corresponding to the eigenvalue $\mu(v)$. Furthermore, these eigenfunctions are mutually orthogonal for different eigenvalues $\mu(v)$.

(ii) If $\lambda_j(v) \in \sigma_p(K_v)$ and $\phi_p(v)(\cdot,\lambda_j(v))$ is an eigenfunction of $K_v$, then
\[
\eta(v)(x,\lambda_j(v)) := \frac{\phi_p(v)(x,\lambda_j(v))}{\|\phi_p(v)(\cdot,\lambda_j(v))\chi_{(a,b)}\|_{\mathcal{K}}}, \qquad x \in (a,b),
\]
is a normalized eigenfunction of the family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$.

(iii) The point spectrum of $K_v$ and the spectrum of the operator family $\{H_v(\lambda)\}$ coincide, i.e. $\sigma_p(K_v) = \sigma(\{H_v(\lambda)\})$.

Proof. Let us first prove (i). Assume that $\mu(v) \in \sigma(\{H_v(\lambda)\})$. Using the boundary conditions of $\eta(v)(\cdot,\mu(v))$ one verifies that $\tilde{\phi}(v)(\cdot,\mu(v)) \in D(K_v)$. Since $H_v(\mu(v))\eta(v)(\cdot,\mu(v)) = \mu(v)\eta(v)(\cdot,\mu(v))$ there is
\[
K_v\tilde{\phi}(v)(\cdot,\mu(v)) = \mu(v)\tilde{\phi}(v)(\cdot,\mu(v)).
\]
By means of (3.31) and (3.32) one now obtains that the $\phi_p(v)(\cdot,\mu(v))$ are indeed mutually orthogonal eigenfunctions of $K_v$ for different eigenvalues of the QTB operator family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$. Moreover, the $\phi_p(v)(\cdot,\mu(v))$ have norm one in $\mathcal{K}$. Assume now $\lambda_j(v) \in \sigma_p(K_v)$. In order to prove (ii) it suffices to show that the functions $\phi_p(v)(\cdot,\lambda_j(v))$ satisfy the boundary condition at $a$ and $b$ imposed on
functions from the domain of the QTB family $\{H_v(\lambda)\}_{\lambda\in(-\infty,v_b)}$, see Definition 2.2. We set
\[
\tilde{\psi}(v)(x,\lambda_j(v)) := \begin{cases} \exp\big(-\kappa_a(\lambda_j(v))\frac{2m_a}{\hbar}(x-a)\big)\,\phi_p(v)(a,\lambda_j(v)), & x \in (-\infty,a], \\ \phi_p(v)(x,\lambda_j(v)), & x \in (a,b), \\ \exp\big(\kappa_b(\lambda_j(v))\frac{2m_b}{\hbar}(x-b)\big)\,\phi_p(v)(b,\lambda_j(v)), & x \in [b,\infty). \end{cases}
\]
There is $\tilde{\psi}(v)(\cdot,\lambda_j(v)) \in D(K_v)$ and $K_v\tilde{\psi}(v)(\cdot,\lambda_j(v)) = \lambda_j(v)\tilde{\psi}(v)(\cdot,\lambda_j(v))$. Since the eigenvalues of $K_v$ are simple, there exists a constant $C(\lambda_j(v)) \in \mathbb{C}$ such that
\[
\phi_p(v)(\cdot,\lambda_j(v)) = C(\lambda_j(v))\tilde{\psi}(v)(\cdot,\lambda_j(v)).
\]
We have
\[
\frac{\hbar}{2m(a)}\tilde{\psi}'(v)(a,\lambda_j(v)) = \frac{\hbar}{2m_a}\tilde{\psi}'(v)(a,\lambda_j(v)) = -\kappa_a(\lambda_j(v))\tilde{\psi}(v)(a,\lambda_j(v)),
\]
i.e. $\tilde{\psi}(v)(\cdot,\lambda_j(v))$ satisfies the boundary condition at $a$. In the same way one gets that $\tilde{\psi}(v)(\cdot,\lambda_j(v))$ satisfies the boundary condition at $b$. Thus, (ii) has been proved. Statement (iii) follows directly from statements (i) and (ii).

4. Scattering Matrix

In this section we investigate the scattering matrix corresponding to the Buslaev–Fomin operator (2.3). The scattering matrix plays an important role for the current. Furthermore, we show that the scattering matrix can be completely expressed in terms of the QTB family. The scattering matrix $S_v(\lambda)$ is defined by
\[
S_v(\lambda) := \begin{cases} S_{bb}(v)(\lambda), & \lambda \in (v_b,v_a), \\ \tilde{S}(v)(\lambda), & \lambda \in (v_a,\infty), \end{cases} \tag{4.1}
\]
where
\[
\tilde{S}(v)(\lambda) := \begin{pmatrix} S_{bb}(v)(\lambda) & \sqrt{\dfrac{q_b(\lambda)}{q_a(\lambda)}}\,S_{ba}(v)(\lambda) \\ \sqrt{\dfrac{q_a(\lambda)}{q_b(\lambda)}}\,S_{ab}(v)(\lambda) & S_{aa}(v)(\lambda) \end{pmatrix}
\]
and $S_{ij}(v)(\lambda)$, $i,j = a,b$, are the scattering coefficients (3.8) and (3.9). By (3.10), (3.11) and (3.13) we get that $S_v(\lambda)S_v(\lambda)^* = S_v(\lambda)^*S_v(\lambda) = I_{h(\lambda)}$, $\lambda \in (v_b,\infty)$, i.e. $S_v(\lambda)$ is unitary. We define $\hat{\mathcal{K}}_{ac} := L^2((v_b,\infty),h(\lambda),d\lambda)$. By $\hat{S}_v$ we denote the unitary multiplication operator $\hat{S}_v : \hat{\mathcal{K}}_{ac} \to \hat{\mathcal{K}}_{ac}$ induced by $S_v(\lambda)$:
\[
(\hat{S}_v f)(\lambda) = S_v(\lambda)f(\lambda), \qquad f \in D(\hat{S}_v) := \hat{\mathcal{K}}_{ac}. \tag{4.2}
\]
Since
\[
(K_v-i)^{-1} - (K_w-i)^{-1} = (K_v-i)^{-1}(Ew-Ev)(K_w-i)^{-1} = (K_v-i)^{-1}(w-v)P^{\mathcal{K}}_{\mathcal{H}}(K_w-i)^{-1}
\]
is trace class for all $v,w \in L^\infty_{\mathbb{R}}$, the wave operators
\[
W_\pm(K_v,K_w) := \operatorname*{s-lim}_{t\to\pm\infty}\exp(itK_v)\exp(-itK_w)P_{ac}(K_w)
\]
exist and are asymptotically complete, where $P_{ac}(K_w)$ denotes the projection onto the absolutely continuous subspace of $K_w$, see [35, Theorem XI.9].

Lemma 4.1. For all $v,w \in L^\infty_{\mathbb{R}}$ the wave operators obey
\[
\Phi_v W_+(K_v,K_w)\Phi_w^* = \hat{S}_v^*\hat{S}_w, \qquad \Phi_v W_-(K_v,K_w)\Phi_w^* = I_{\hat{\mathcal{K}}_{ac}},
\]
where $\Phi_v$ is the Fourier transform with respect to $K_v$, see (3.27), and $\hat{S}_v$ is the multiplication operator (4.2) induced by the scattering matrix.

For the special case $v_a = v_b = 0$ and $\hat{m} \equiv 1$, Lemma 4.1 can be found in [39, 17.c].

Proof. We only prove the first equation; the second one can be proven similarly. Let us define the projections $P_1(v) : \mathcal{K} \to \mathcal{K}$ and $P_2(v) : \mathcal{K} \to \mathcal{K}$ by
\[
P_1(v) := \Phi_v^*\chi_{(v_b,v_a)}\Phi_v, \qquad P_2(v) := \Phi_v^*\chi_{(v_a,\infty)}\Phi_v, \qquad v \in L^\infty_{\mathbb{R}}.
\]
There is $W_+(K_v,K_w) = P_1(v)W_+(K_v,K_w)P_1(w) + P_2(v)W_+(K_v,K_w)P_2(w)$. Hence, it suffices to show:
\[
\big(\Phi_v W_+(K_v,K_w)\Phi_w^*\hat{f}\big)(\lambda) = \overline{S_{bb}(v)(\lambda)}\,S_{bb}(w)(\lambda)\hat{f}(\lambda), \qquad \hat{f} \in L^2(v_b,v_a), \tag{4.3}
\]
\[
\big(\Phi_v W_+(K_v,K_w)\Phi_w^*\hat{f}\big)(\lambda) = \tilde{S}(v)(\lambda)^*\tilde{S}(w)(\lambda)\hat{f}(\lambda), \qquad \hat{f} \in L^2((v_a,\infty),\mathbb{C}^2). \tag{4.4}
\]
We only prove (4.3); the proof of (4.4) is similar. Since $\chi_{(a,b)}(K_w+i)^{-1}$ is compact, in fact it is even trace class, we get by [3, Proposition 6.70] that
\[
\operatorname*{s-lim}_{t\to+\infty}\chi_{(a,b)}(K_w+i)^{-1}\exp(-itK_w)P_{ac}(K_w) = 0.
\]
Applying this to all vectors of the form $(K_w+i)f$ we obtain
\[
\operatorname*{s-lim}_{t\to+\infty}\chi_{(a,b)}\exp(-itK_w)P_{ac}(K_w) = 0. \tag{4.5}
\]
Therefore
\[
\Phi_v W_+(K_v,K_w)\Phi_w^*\hat{f} = \Phi_v W_a(K_v,K_w)\Phi_w^*\hat{f} + \Phi_v W_b(K_v,K_w)\Phi_w^*\hat{f} \tag{4.6}
\]
for $\hat{f} \in L^2(v_b,v_a)$, where
\[
\begin{aligned}
W_a(K_v,K_w) &:= \operatorname*{s-lim}_{t\to+\infty}\exp(itK_v)\chi_{(-\infty,a)}\exp(-itK_w)P_{ac}(K_w), \\
W_b(K_v,K_w) &:= \operatorname*{s-lim}_{t\to+\infty}\exp(itK_v)\chi_{(b,\infty)}\exp(-itK_w)P_{ac}(K_w).
\end{aligned}
\]
For every $\hat{g} \in L^2(v_b,v_a)$ there is
\[
(\chi_{(-\infty,a)}\Phi_w^*\hat{g})(x) = \chi_{(-\infty,a)}(x)\int_{v_b}^{v_a}d\lambda\,\phi_b(w)(x,\lambda)\hat{g}(\lambda).
\]
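Since $q_a(\lambda)$ is purely imaginary for $\lambda \in (v_b,v_a)$, we obtain by (3.16) that $\chi_{(-\infty,a)}\Phi_w^*$ is a compact operator from $L^2(v_b,v_a)$ into $\mathcal{K}$. This yields
\[
\lim_{t\to+\infty}\chi_{(-\infty,a)}\exp(-itK_w)\Phi_w^*\hat{f} = 0, \qquad \text{for all } \hat{f} \in L^2(v_b,v_a).
\]
Therefore
\[
\Phi_v W_a(K_v,K_w)\Phi_w^*\hat{f} = 0, \qquad \text{for all } \hat{f} \in L^2(v_b,v_a). \tag{4.7}
\]
For every $\hat{f} \in C_0^\infty(v_b,v_a)$ we have
\[
\big(\Phi_v W_b(K_v,K_w)\Phi_w^*\hat{f}\big)(\lambda) = \lim_{t\to+\infty}\lim_{N\to+\infty}\int_b^N dx\int_{v_b}^{v_a}d\mu\,\exp(it(\lambda-\mu))\,\overline{\phi_b(v)(x,\lambda)}\,\phi_b(w)(x,\mu)\hat{f}(\mu). \tag{4.8}
\]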
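Using (3.16) we find for $x \in (b,\infty)$
\[
\begin{aligned}
\overline{\phi_b(v)(x,\lambda)}\,\phi_b(w)(x,\mu) = \frac{1}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\Big[ & \exp\Big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\Big) + \overline{S_{bb}(v)(\lambda)}\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\Big) \\
& + S_{bb}(w)(\mu)\exp\Big(i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\Big) + \overline{S_{bb}(v)(\lambda)}S_{bb}(w)(\mu)\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\Big)\Big].
\end{aligned} \tag{4.9}
\]
We have
\[
\int_b^N dx\int_{v_b}^{v_a}d\mu\,\exp(it(\lambda-\mu))\frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\hat{f}(\mu) = \int_{v_b}^{v_a}d\mu\,\hat{f}(\mu)e^{it(\lambda-\mu)}\int_b^N dx\,\frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}.
\]
There is
\[
\frac{\hbar}{i\pi}\,\frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))N\big)}{2m_b(q_b(\lambda)-q_b(\mu))} \to \delta\Big(\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))\Big) \qquad \text{as } N \to \infty
\]
in the sense of distributions. Hence,
\[
\begin{aligned}
\lim_{N\to+\infty}\int_b^N dx\,\frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}} &= \frac{\delta\big(\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))\big)}{4\hbar\sqrt{q_b(\lambda)q_b(\mu)}} - \frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))b\big)}{4i\pi\sqrt{q_b(\lambda)q_b(\mu)}}\,\frac{q_b(\lambda)+q_b(\mu)}{\lambda-\mu} \\
&= \frac{1}{2}\delta(\lambda-\mu) - \frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))b\big)}{4i\pi\sqrt{q_b(\lambda)q_b(\mu)}}\,\frac{q_b(\lambda)+q_b(\mu)}{\lambda-\mu}.
\end{aligned}
\]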
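Further we get
\[
\begin{aligned}
\lim_{N\to+\infty}\int_b^N dx\int_{v_b}^{v_a}d\mu\,\exp(it(\lambda-\mu))\frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\hat{f}(\mu) = {} & \frac{1}{2}\hat{f}(\lambda) - \frac{1}{2}\int_{v_b}^{v_a}d\mu\,\hat{f}(\mu)\,\frac{\exp(it(\lambda-\mu))}{2i\pi(\lambda-\mu)}\,\frac{q_b(\lambda)+q_b(\mu)}{\sqrt{q_b(\lambda)q_b(\mu)}} \\
& \times \exp\Big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))b\Big).
\end{aligned}
\]
Since
\[
\frac{1}{i\pi}\,\frac{\exp(it(\lambda-\mu))}{\lambda-\mu} \to \delta(\lambda-\mu) \qquad \text{as } t \to \infty \tag{4.10}
\]
in the sense of distributions, we finally obtain
\[
\lim_{t\to+\infty}\int_b^\infty dx\int_{v_b}^{v_a}d\mu\,\frac{\exp\big(i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\exp(it(\lambda-\mu))\hat{f}(\mu) = 0. \tag{4.11}
\]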
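Furthermore we have
\[
\begin{aligned}
& \lim_{N\to+\infty}\int_b^N dx\int_{v_b}^{v_a}d\mu\,\overline{S_{bb}(v)(\lambda)}\,\frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\exp(it(\lambda-\mu))\hat{f}(\mu) \\
& \qquad = \overline{S_{bb}(v)(\lambda)}\int_{v_b}^{v_a}d\mu\,\hat{f}(\mu)\exp(it(\lambda-\mu))\int_b^\infty dx\,\frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}.
\end{aligned}
\]
Since
\[
\int_b^\infty dx\,\frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}} = \frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))b\big)}{4\pi i\sqrt{q_b(\lambda)q_b(\mu)}}\,\frac{q_b(\lambda)-q_b(\mu)}{\lambda-\mu}
\]
one gets
\[
\begin{aligned}
& \int_b^\infty dx\int_{v_b}^{v_a}d\mu\,\frac{\overline{S_{bb}(v)(\lambda)}}{4\pi\hbar\sqrt{q_b(\lambda)}\sqrt{q_b(\mu)}}\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\Big)\exp(it(\lambda-\mu))\hat{f}(\mu) \\
& \qquad = \overline{S_{bb}(v)(\lambda)}\int_{v_b}^{v_a}d\mu\,\hat{f}(\mu)\,\frac{\exp(it(\lambda-\mu))}{i\pi(\lambda-\mu)}\,\frac{q_b(\lambda)-q_b(\mu)}{4\sqrt{q_b(\lambda)q_b(\mu)}}\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))b\Big)
\end{aligned}
\]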
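which yields
\[
\lim_{t\to\infty}\int_b^\infty dx\int_{v_b}^{v_a}d\mu\,\frac{\overline{S_{bb}(v)(\lambda)}}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\Big)\exp(it(\lambda-\mu))\hat{f}(\mu) = 0. \tag{4.12}
\]
Similarly we prove
\[
\lim_{t\to\infty}\int_b^\infty dx\int_{v_b}^{v_a}d\mu\,\frac{S_{bb}(w)(\mu)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\exp\Big(i\frac{2m_b}{\hbar}(q_b(\lambda)+q_b(\mu))x\Big)\exp(it(\lambda-\mu))\hat{f}(\mu) = 0. \tag{4.13}
\]
We have
\[
\begin{aligned}
& \lim_{N\to+\infty}\int_b^N dx\int_{v_b}^{v_a}d\mu\,\frac{\overline{S_{bb}(v)(\lambda)}S_{bb}(w)(\mu)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\Big)\exp(it(\lambda-\mu))\hat{f}(\mu) \\
& \qquad = \int_{v_b}^{v_a}d\mu\,\hat{f}(\mu)\exp(it(\lambda-\mu))\,\frac{\overline{S_{bb}(v)(\lambda)}S_{bb}(w)(\mu)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\int_b^\infty dx\,\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\Big).
\end{aligned}
\]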
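Since
\[
\begin{aligned}
\lim_{N\to+\infty}\int_b^N dx\,\frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\big)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}} &= \frac{\delta\big(\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))\big)}{4\hbar\sqrt{q_b(\lambda)q_b(\mu)}} + \frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))b\big)}{4i\pi\sqrt{q_b(\lambda)q_b(\mu)}}\,\frac{q_b(\lambda)+q_b(\mu)}{\lambda-\mu} \\
&= \frac{1}{2}\delta(\lambda-\mu) + \frac{\exp\big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))b\big)}{4i\pi\sqrt{q_b(\lambda)q_b(\mu)}}\,\frac{q_b(\lambda)+q_b(\mu)}{\lambda-\mu}
\end{aligned}
\]
we obtain
\[
\lim_{t\to+\infty}\int_b^\infty dx\int_{v_b}^{v_a}d\mu\,\frac{\overline{S_{bb}(v)(\lambda)}S_{bb}(w)(\mu)}{4\pi\hbar\sqrt{q_b(\lambda)q_b(\mu)}}\exp\Big(-i\frac{2m_b}{\hbar}(q_b(\lambda)-q_b(\mu))x\Big)\exp(it(\lambda-\mu))\hat{f}(\mu) = \overline{S_{bb}(v)(\lambda)}S_{bb}(w)(\lambda)\hat{f}(\lambda). \tag{4.14}
\]
Thus, we get by (4.8), (4.9), (4.11)–(4.14) that
\[
\big(\Phi_v W_b(K_v,K_w)\Phi_w^*\hat{f}\big)(\lambda) = \overline{S_{bb}(v)(\lambda)}S_{bb}(w)(\lambda)\hat{f}(\lambda) \tag{4.15}
\]
for $\hat{f} \in C_0^\infty(v_b,v_a)$ and $\lambda \in (v_b,v_a)$. Now (4.15), (4.7) and (4.6) imply the assertion (4.3).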
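In the following we set $v_0 \in L^\infty_{\mathbb{R}}$, $v_0 \equiv v_b$. Lemma 4.1 implies

Corollary 4.2. If $v_0(x) = v_b$ for all $x \in (a,b)$, then
\[
\begin{aligned}
\Phi_v &= \Phi_{v_0}W_-(K_v,K_{v_0})^* = \hat{S}_v^*\hat{S}_{v_0}\Phi_{v_0}W_+(K_v,K_{v_0})^*, \\
\hat{S}_v &= \hat{S}_{v_0}\Phi_{v_0}W_+(K_v,K_{v_0})^*W_-(K_v,K_{v_0})\Phi_{v_0}^* = \hat{S}_{v_0}\Phi_{v_0}S(K_v,K_{v_0})\Phi_{v_0}^*
\end{aligned}
\]
for all $v \in L^\infty_{\mathbb{R}}$, where $S(K_v,K_{v_0})$ is the scattering operator
\[
S(K_v,K_{v_0}) := W_+(K_v,K_{v_0})^*W_-(K_v,K_{v_0}).
\]
The scattering matrix $S_v(\lambda)$ can be completely described by the QTB family:

Lemma 4.3. Let $\alpha(\lambda)$, $T_v(\lambda)$, and $Q(\lambda)$ be given by (3.28), (3.29), and (3.30), respectively. The scattering matrix (4.1) obeys
\[
S_v(\lambda) = Q(\lambda)\big(I_{h(\lambda)} + i\hbar\alpha(\lambda)T_v(\lambda)^*\big)Q(\lambda), \qquad \lambda \in (v_b,\infty). \tag{4.16}
\]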
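Proof. Using the definitions of $\alpha(\lambda)$ and $T_v(\lambda)$ one gets
\[
\alpha(\lambda)T_v(\lambda)^* = \sqrt{2}\begin{cases} \sqrt{q_b(\lambda)}\,(T_v(\lambda)^*e_b)(b), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} \sqrt{q_b(\lambda)}\,(T_v(\lambda)^*e_b)(b) & \sqrt{q_b(\lambda)}\,(T_v(\lambda)^*e_a)(b) \\ -\sqrt{q_a(\lambda)}\,(T_v(\lambda)^*e_b)(a) & -\sqrt{q_a(\lambda)}\,(T_v(\lambda)^*e_a)(a) \end{pmatrix}, & \lambda \in (v_a,\infty). \end{cases}
\]
Taking into account Lemma 3.6 we obtain
\[
\alpha(\lambda)T_v(\lambda)^* = i\sqrt{\frac{4\pi}{\hbar}}\begin{cases} p_b(\lambda)\sqrt{q_b(\lambda)}\,\phi_b(b,\lambda), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} p_b(\lambda)\sqrt{q_b(\lambda)}\,\phi_b(b,\lambda) & -p_a(\lambda)\sqrt{q_b(\lambda)}\,\phi_a(b,\lambda) \\ -p_b(\lambda)\sqrt{q_a(\lambda)}\,\phi_b(a,\lambda) & p_a(\lambda)\sqrt{q_a(\lambda)}\,\phi_a(a,\lambda) \end{pmatrix}, & \lambda \in (v_a,\infty), \end{cases}
\]
where
\[
p_b(\lambda) := \exp\Big(i\frac{2m_b}{\hbar}q_b(\lambda)b\Big) \quad \text{and} \quad p_a(\lambda) := \exp\Big(-i\frac{2m_a}{\hbar}q_a(\lambda)a\Big).
\]
By Eqs. (3.15), (3.16), and (3.23) one has, if $\lambda \in (v_b,v_a)$,
\[
\alpha(\lambda)T_v(\lambda)^* = \frac{i}{\hbar}I_{h(\lambda)} + \frac{i}{\hbar}S_{bb}(v)(\lambda)p_b(\lambda)^2
\]
and
\[
\alpha(\lambda)T_v(\lambda)^* = \frac{i}{\hbar}I_{h(\lambda)} + \frac{i}{\hbar}\begin{pmatrix} S_{bb}(v)(\lambda)p_b(\lambda)^2 & -\sqrt{\dfrac{q_b(\lambda)}{q_a(\lambda)}}\,S_{ba}(v)(\lambda)p_b(\lambda)p_a(\lambda) \\ -\sqrt{\dfrac{q_a(\lambda)}{q_b(\lambda)}}\,S_{ab}(v)(\lambda)p_b(\lambda)p_a(\lambda) & S_{aa}(v)(\lambda)p_a(\lambda)^2 \end{pmatrix},
\]
if $\lambda \in (v_a,\infty)$. This yields (4.16).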
Remark 4.4. Rewriting (4.16) formally as
\[
S_v(\lambda) = Q(\lambda)\big(I_{h(\lambda)} + i\hbar\alpha(\lambda)(H_v(\lambda)-\lambda)^{-1}\alpha(\lambda)^*\big)Q(\lambda)
\]
it is not at all surprising that the resonances of $K_v$ are given by
\[
\{z \in \mathbb{C}_- \mid z \text{ is an eigenvalue of } H_v(z)\},
\]
see also Definition 3.9.

Remark 4.5. Let $v \in L^\infty_{\mathbb{R}}$ and $\lambda_0 \in \mathbb{R}$ with $\lambda_0 > v_b$ be given. As in Remarks 2.5 and 3.8 we set $H_{dis}(v) := H_v(\lambda_0)$. The adjoint of the so-called characteristic function $\Theta_{H_{dis}(v)}(z) : \mathbb{C}^2 \to \mathbb{C}^2$, $z \in \mathrm{cl}(\mathbb{C}_+)$, corresponding to the dissipative operator $H_{dis}(v)$ (see [11]) is given by
\[
\Theta_{H_{dis}(v)}(z)^* = I_{\mathbb{C}^2} + i\alpha(\lambda_0)T_{dis}(v;z)^*, \qquad z \in \mathrm{cl}(\mathbb{C}_+),
\]
where $T_{dis}(v;z)$ is defined as in Remark 3.8 and $\hbar$ is scaled to 1, see [22, Lemma 3.3]. Therefore we get by Lemma 4.3
\[
S_v(\lambda_0) = Q(\lambda_0)\Theta_{H_{dis}(v)}(\lambda_0)^*Q(\lambda_0),
\]
i.e. for any fixed energy $\lambda_0$ the characteristic function is equal to the scattering matrix $S_v(\lambda_0)$, modulo the transformation $Q(\lambda_0)$.

5. Carrier and Current Densities

In this section we introduce the carrier density and the current density corresponding to the QTB operator family $\{H_v(z)\}_{z\in\mathbb{C}_+}$ from Definition 2.2 with respect to a generalized thermodynamic distribution function $\rho$. We assume $\rho = \rho_p \oplus \rho_{ac}$
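with
\[
\rho_p \in C_{\mathbb{R}}(-\infty,v_b), \qquad \rho_{ac} \in L^\infty((v_b,\infty),\mathcal{B}(h(\lambda)),\nu_{ac}), \tag{5.1}
\]
where $\nu_{ac}$ is given as in (3.25),
\[
\rho_{ac}(\lambda)^* = \rho_{ac}(\lambda), \qquad \rho_{ac}(\lambda) \ge 0 \text{ for a.e. } \lambda \in (v_b,\infty), \qquad \rho_p \ge 0, \tag{5.2}
\]
and
\[
C_{ac} := \operatorname*{ess\,sup}_{\lambda\in(v_b,\infty)}\|\rho_{ac}(\lambda)\|_{\mathcal{B}(h(\lambda))}\sqrt{\lambda^2+1} < \infty. \tag{5.3}
\]
Furthermore, we define
\[
C_p(v) := \sup_{\lambda\in(v_b-\|v\|_{L^\infty},\,v_b)}\rho_p(\lambda). \tag{5.4}
\]
Note that $\nu_{ac}$ does not depend on the potential $v$ while $\nu_p$ in (3.25) does. (5.1) implies
\[
\rho \in L^\infty(\sigma(K_v),\mathcal{B}(h(\lambda)),\nu), \qquad \text{for all } v \in L^\infty_{\mathbb{R}}.
\]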
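By means of a function $\rho$ with (5.1)–(5.3) we define the multiplication operator $\hat{\rho}$ on $\hat{\mathcal{K}}_v$ by
\[
(\hat{\rho}g)(\lambda) := \rho(\lambda)g(\lambda), \qquad g \in D(\hat{\rho}) = L^2(\sigma(K_v),h(\lambda),\nu) = \hat{\mathcal{K}}_v \tag{5.5}
\]
and the steady state $\varrho(v) : \mathcal{K} \to \mathcal{K}$ by
\[
\varrho(v) = \Phi_v^*\hat{\rho}\,\Phi_v. \tag{5.6}
\]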
Obviously $\varrho(v)$ is a bounded, non-negative, selfadjoint operator which commutes with $K_v$.

Remark 5.1. There is a one-to-one correspondence between bounded, non-negative, selfadjoint operators which commute with $K_v$ and multiplication operators of the form (5.5), i.e. if $\varrho(v) : \mathcal{K} \to \mathcal{K}$ is a bounded, non-negative, selfadjoint operator which commutes with $K_v$, then there exists exactly one function $\rho \in L^\infty(\sigma(K_v),\mathcal{B}(h(\lambda)),\nu)$ such that $\varrho(v)$ has the representation given by (5.6), see [3, Proposition 4.18].

Remark 5.2. Using Corollary 4.2 we can rewrite Eq. (5.6) in the form
\[
\varrho(v) = \varrho_p(v) + \varrho_{ac}(v), \tag{5.7}
\]
where, see also [32],
\[
\varrho_{ac}(v) = W_-(K_v,K_{v_0})\varrho(v_0)W_-(K_v,K_{v_0})^*, \qquad \varrho(v_0) = \Phi_{v_0}^*\hat{\rho}_{ac}\Phi_{v_0}, \tag{5.8}
\]
and $K_{v_0}$ is the operator (2.3) and $\Phi_{v_0}$ the corresponding Fourier transform with respect to the constant potential $v_0 \equiv v_b$, see also Corollary 4.2. (5.6) and (5.3) imply
\[
\|\varrho_{ac}(v)(K_v-i)\|_{\mathcal{B}(\mathcal{K})} = C_{ac} < \infty. \tag{5.9}
\]
The operator $\varrho_p(v)$ admits the representation
\[
\varrho_p(v) = \sum_{j=1}^{N(v)}\rho_p(\lambda_j(v))P(\lambda_j(v)), \tag{5.10}
\]
where $P(\lambda_j(v))$ are the orthogonal projections onto the eigenspaces of $K_v$ corresponding to the eigenvalues $\lambda_j(v)$, $j = 1,\ldots,N(v)$, i.e.
\[
P(\lambda_j(v))f = \big(f,\vec{\phi}_v(\cdot,\lambda_j(v))\big)_{\mathcal{K}}\,\vec{\phi}_v(\cdot,\lambda_j(v)), \qquad f \in \mathcal{K}.
\]
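5.1. Carrier densities

First we define a (Lebesgue) absolutely continuous measure, the Radon–Nikodým derivative of which is the carrier density. For every $h \in L^\infty_{\mathbb{R}}$ let $M(h) : \mathcal{K} \to \mathcal{K}$ be the multiplication operator
\[
(M(h)f)(x) = \begin{cases} h(x)f(x), & x \in (a,b), \\ 0, & x \in \mathbb{R}\setminus(a,b), \end{cases} \qquad f \in D(M(h)) = \mathcal{K}; \tag{5.11}
\]
note that $\mathrm{ran}(M(h)) \subseteq \mathcal{H}$. For any Borel set $\omega \subseteq (a,b)$ we consider the observable $M(\chi_\omega)$ and define the expectation value of $M(\chi_\omega)$ with respect to $\varrho(v)$ by
\[
E_{\varrho(v)}(\omega) := \mathrm{tr}(\varrho(v)M(\chi_\omega)).
\]
The definition is justified since
\[
|\mathrm{tr}(\varrho(v)M(\chi_\omega))| \le C_p(v)N(v) + C_{ac}\|(K_v-i)^{-1}P^{\mathcal{K}}_{\mathcal{H}}\|_{\mathcal{B}_1(\mathcal{K})} < \infty,
\]
where (5.9) comes to bear. There is
\[
\mathrm{tr}(\varrho(v)M(\chi_\omega)) = \mathrm{tr}(\hat{\rho}\,\Phi_v M(\chi_\omega)\Phi_v^*) = \int_\omega dx\int_{\sigma(K_v)}d\nu(\lambda)\,\mathrm{tr}_{h(\lambda)}\big(\rho(\lambda)D(v)(x,\lambda)\big) \tag{5.12}
\]
with
\[
D(v)(x,\lambda) = \begin{cases} |\phi_p(v)(x,\lambda)|^2, & \lambda \in \sigma_p(K_v), \\ |\phi_b(v)(x,\lambda)|^2, & \lambda \in (v_b,v_a), \\ \begin{pmatrix} |\phi_b(v)(x,\lambda)|^2 & \phi_a(v)(x,\lambda)\overline{\phi_b(v)(x,\lambda)} \\ \phi_b(v)(x,\lambda)\overline{\phi_a(v)(x,\lambda)} & |\phi_a(v)(x,\lambda)|^2 \end{pmatrix}, & \lambda \in (v_a,\infty). \end{cases}
\]
Hence, $E_{\varrho(v)}(\cdot)$ defines a measure which is absolutely continuous with respect to the Lebesgue measure.

Definition 5.3. The Radon–Nikodým derivative of $E_{\varrho(v)}(\cdot)$ is called the carrier density, with respect to $\varrho(v)$, of the open quantum system described by the QTB operator family $\{H_v(z)\}_{z\in\mathbb{C}_+}$ and we denote it by $u_{\varrho(v)}$.

Note the assumptions (5.1)–(5.3). From (5.12) it follows directly that
\[
u_{\varrho(v)}(x) = \int_{\sigma(K_v)}d\nu(\lambda)\,u_{\varrho(v)}(x,\lambda), \qquad x \in (a,b), \tag{5.13}
\]
with
\[
u_{\varrho(v)}(x,\lambda) := \big\langle \rho(\lambda)^T\vec{\phi}_v(x,\lambda),\vec{\phi}_v(x,\lambda)\big\rangle_{h(\lambda)}, \qquad \lambda \in \sigma(K_v), \quad x \in (a,b), \tag{5.14}
\]
where $\rho(\lambda)^T$ denotes the transposed matrix. Since $\rho \ge 0$ the carrier density is non-negative, i.e. $u_{\varrho(v)}(x) \ge 0$ for a.e. $x \in (a,b)$. Note that in (5.14) enter only the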
values of the eigenfunctions for arguments $x \in (a,b)$. These can be expressed by the QTB family, see Lemmas 3.6 and 3.10.

Remark 5.4. As in Remarks 2.5, 3.8, and 4.5 we fix $\lambda_0 > v_b$ and define
\[
\rho_{dis}(\lambda_0) := Q(\lambda_0)^*\rho(\lambda_0)Q(\lambda_0), \tag{5.15}
\]
where $Q(\lambda_0)$ is given by (3.30). Remark 3.8 and the results from [22, Sec. 3] imply
\[
u_{dis,v}(x,\lambda_0) = u_{\varrho(v)}(x,\lambda_0),
\]
where $u_{dis,v}(x,\xi)$, $\xi \in \mathbb{R}$, is defined as in [22, Sec. 3] with the generalized distribution function $\rho_{dis}(\lambda_0)$ given by (5.15). Thus, we get that the carrier density of the dissipative system and the carrier density of the QTB system coincide for fixed energy $\lambda_0$, if the generalized distribution function for the dissipative system is transformed by (5.15).

Example 5.5. Let $f \in C_{\mathbb{R}}(\mathbb{R})$ be a distribution function with
\[
0 < f(\lambda), \quad \lambda \in \mathbb{R}, \qquad \operatorname*{ess\,sup}_{\lambda\in(v_b,\infty)} f(\lambda)\sqrt{\lambda^2+1} < \infty, \tag{5.16}
\]
see for instance (1.5). Then the system is in thermodynamic equilibrium, the steady state is $\varrho(v) = f(K_v)$, and the carrier density is given by
\[
u_{\varrho(v)}(x) = \int_{\sigma(K_v)}d\nu(\lambda)\,f(\lambda)\,\big\|\vec{\phi}_v(x,\lambda)\big\|^2_{h(\lambda)}, \qquad x \in (a,b). \tag{5.17}
\]
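Example 5.6. Let $f \in C_{\mathbb{R}}(\mathbb{R})$ be a distribution function with (5.16), like for instance (1.5), and let $\epsilon_a, \epsilon_b, \epsilon_p \in \mathbb{R}$ be given constants representing the quasi-Fermi potential of the reservoir $x < a$ and $x > b$ and the bound states in $a < x < b$, respectively. If
\[
\rho(\lambda) := \begin{cases} f(\lambda-\epsilon_p), & \lambda \in (-\infty,v_b], \\ f(\lambda-\epsilon_b), & \lambda \in (v_b,v_a), \\ \begin{pmatrix} f(\lambda-\epsilon_b) & 0 \\ 0 & f(\lambda-\epsilon_a) \end{pmatrix}, & \lambda \in [v_a,\infty), \end{cases}
\]
then
\[
u_{\varrho(v)}(x) = \sum_{j=1}^{N(v)}f(\lambda_j(v)-\epsilon_p)|\phi_p(v)(x,\lambda_j(v))|^2 + \int_{v_b}^\infty d\lambda\,f(\lambda-\epsilon_b)|\phi_b(v)(x,\lambda)|^2 + \int_{v_a}^\infty d\lambda\,f(\lambda-\epsilon_a)|\phi_a(v)(x,\lambda)|^2, \qquad x \in (a,b),
\]
see also (1.4) and [12, 31]. If $\epsilon_a = \epsilon_b = \epsilon_p$, then the system is in thermodynamic equilibrium, see also Example 5.5.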
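The piecewise definition of $\rho(\lambda)$ in Example 5.6 translates directly into code. The sketch below is only an illustration: the Fermi-Dirac form of `fermi` is an assumption (the paper refers to its own formula (1.5), which is not reproduced here), and the names `occupation`, `eps_a`, `eps_b`, `eps_p` as well as the sample values are hypothetical.

```python
import numpy as np

def fermi(lam, temperature=1.0):
    """Assumed distribution function: positive and exponentially decaying,
    so it satisfies both conditions in (5.16)."""
    return 1.0 / (1.0 + np.exp(lam / temperature))

def occupation(lam, v_a, v_b, eps_a, eps_b, eps_p, f=fermi):
    """Generalized distribution function rho(lam) of Example 5.6.

    Returns a scalar where h(lam) is one-dimensional and a diagonal
    2x2 matrix, ordered (b, a), where h(lam) = C^2."""
    if lam <= v_b:
        return f(lam - eps_p)      # bound-state part
    if lam < v_a:
        return f(lam - eps_b)      # simple continuous spectrum
    return np.diag([f(lam - eps_b), f(lam - eps_a)])

# Equal quasi-Fermi potentials: rho is a multiple of the identity.
equilibrium = occupation(2.0, v_a=1.0, v_b=0.0, eps_a=0.3, eps_b=0.3, eps_p=0.3)
# Different reservoir potentials: the two diagonal entries differ.
biased = occupation(2.0, v_a=1.0, v_b=0.0, eps_a=0.8, eps_b=0.3, eps_p=0.3)
simple_part = occupation(0.5, v_a=1.0, v_b=0.0, eps_a=0.8, eps_b=0.3, eps_p=0.3)
```

The equilibrium case reproduces $\rho(\lambda) = f(\lambda - \epsilon)\,I$, in line with the last sentence of the example.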
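Lemma 5.7. The carrier density $u_{\varrho(v)}$ from Definition 5.3 obeys
\[
\int_a^b dx\,u_{\varrho(v)}(x)h(x) = \mathrm{tr}(\varrho(v)M(h)), \qquad \text{for all } h \in L^\infty_{\mathbb{R}}, \tag{5.18}
\]
$M(h)$ being the multiplication operator (5.11). In particular there is
\[
\|u_{\varrho(v)}\|_{L^1} \le C_p(v)N(v) + C_{ac}\|(K_v-i)^{-1}P^{\mathcal{K}}_{\mathcal{H}}\|_{\mathcal{B}_1(\mathcal{K})}.
\]
Proof. For any Borel set $\omega \subseteq (a,b)$ there is
\[
\mathrm{tr}(\varrho(v)M(\chi_\omega)) = E_{\varrho(v)}(\omega) = \int_a^b dx\,u_{\varrho(v)}(x)\chi_\omega(x),
\]
i.e. (5.18) for all $h = \chi_\omega$. By linearity (5.18) also holds for step functions on $(a,b)$. Since $u_{\varrho(v)}$ is in $L^1_{\mathbb{R}}$ and $\varrho(v)M(\chi_{(a,b)})$ is a trace class operator, (5.18) extends by density of the step functions in $L^\infty_{\mathbb{R}}$ and continuity to all $h \in L^\infty_{\mathbb{R}}$. Because $M(\chi_{(a,b)}) = P^{\mathcal{K}}_{\mathcal{H}}$ there is
\[
\|u_{\varrho(v)}\|_{L^1} = \mathrm{tr}(\varrho(v)P^{\mathcal{K}}_{\mathcal{H}}) \le C_p(v)N(v) + C_{ac}\|(K_v-i)^{-1}P^{\mathcal{K}}_{\mathcal{H}}\|_{\mathcal{B}_1(\mathcal{K})}.
\]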
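5.2. Current densities

For $x \in (a, b)$, $\lambda \in \sigma(K_v)$ the current density $j_{\varrho(v)}$ is defined by, see [27],
\[ j_{\varrho(v)}(x) := \int_{\sigma(K_v)} d\nu(\lambda)\, j_{\varrho(v)}(x, \lambda), \qquad j_{\varrho(v)}(x, \lambda) := \mathrm{Im}\left( \rho(\lambda)^T \frac{\hbar}{m(x)} \frac{\partial}{\partial x} \vec{\phi}_v(x, \lambda), \vec{\phi}_v(x, \lambda) \right)_{h(\lambda)}. \tag{5.19} \]
Direct calculation shows that $\frac{\partial}{\partial x} j_{\varrho(v)}(x, \lambda) = 0$, i.e. $j_{\varrho(v)}$ is constant, $j_{\varrho(v)}(x) = j_{\varrho(v)}$.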
Remark 5.8. The point spectrum and the simple spectrum do not contribute to the current: j%(v) (λ) = 0 ,
λ ∈ σ(Kv )\(va , ∞) .
Indeed, as Im(iqa (λ)) = 0 for λ ∈ (vb , va ) we get by (3.16) 4ma 2ma 2 qa (λ)|Sab (v)(λ)| exp −i qa (λ)a = 0, j%(v) (λ) = Im −i ~ ~ if λ ∈ (vb , va ). Now let us regard a µ ∈ σp (Kv ). Since Kv is selfadjoint we can find a real-valued eigenfunction φp (v)(x, µ) and thus get ~ ∂ Im φp (v)(x, µ)φp (x, µ) = 0 , for all µ ∈ σp (Kv ) . m(x) ∂x
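A straightforward calculation shows that
\[ j_{\varrho(v)}(\lambda) = \mathrm{tr}_{\mathbb{C}^2}(\rho_{ac}(\lambda) C(v)(\lambda)), \qquad \lambda \in (v_a, \infty), \tag{5.20} \]
with
\[ C(v)(\lambda) = \frac{1}{i} \begin{pmatrix} W(\phi_b(v)(x, \lambda), \phi_b(x, \lambda)) & W(\phi_b(v)(x, \lambda), \phi_a(x, \lambda)) \\ W(\phi_a(v)(x, \lambda), \phi_b(x, \lambda)) & W(\phi_a(v)(x, \lambda), \phi_a(x, \lambda)) \end{pmatrix}. \tag{5.21} \]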
Lemma 5.9. The operator C(v)(λ) is selfadjoint and admits the representation s qb (λ) qa (λ) 2 − |Sba (v)(λ)| Sab (v)(λ)Sbb (v)(λ) qa (λ) qb (λ) 1 . s C(v)(λ) = 2π~ qb (λ) qa (λ) 2 − Sba (v)(λ)Saa (v)(λ) |Sab (v)(λ)| qa (λ) qb (λ)
(5.22)
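There is
\[ C(v)(\lambda) = \frac{1}{2\pi\hbar} \left( P_a S_v(\lambda)^* P_b - P_b S_v(\lambda)^* P_a \right) S_v(\lambda), \qquad \text{for } \lambda \in (v_a, \infty), \tag{5.23} \]
where $P_a := (\cdot, e_a)_{\mathbb{C}^2}\, e_a$ and $P_b := (\cdot, e_b)_{\mathbb{C}^2}\, e_b$ with $e_b := (1, 0)^T$, $e_a := (0, 1)^T \in \mathbb{C}^2$.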
Proof. By (3.23) and (3.16) for x ≤ a we have W (φb (v)(x, λ), φb (x, λ)) = − =−
1 W (ψb (x, λ), ψb (x, λ)) 4π~qa (λ) i qa (λ) |Sab (v)(λ)|2 . 2π~ qb (λ)
Using (3.23) as well as (3.15) and (3.16) for x ≥ b we get s i qb (λ) Sba (v)(λ)Sbb (v)(λ) . W (φa (v)(x, λ), φb (x, λ)) = 2π~ qa (λ) Similarly we obtain i W (φb (v)(x, λ), φa (x, λ)) = − 2π~
s
qa (λ) Sab (v)(λ)Saa (v)(λ) qb (λ)
and W (φa (v)(x, λ), φa (x, λ)) =
i qb (λ) |Sba (v)(λ)|2 . 2π~ qa (λ)
Using (5.21) and (3.10) we verify (5.22). The relation (5.23) immediately follows from (5.22). The selfadjointness of C(v)(λ) follows from (3.10) and the identity (5.22). Note that the current depends only on the part ρac of the generalized distribution function ρ and the scattering matrix Sv (λ). As has been shown in Lemma 4.3
the scattering matrix is completely described by the QTB family. Thus, the same is true for the current.

Remark 5.10. As in Remark 5.4 we assume that $\lambda_0 > v_b$ is fixed. Let $\rho_{dis}(\lambda_0)$ be given by (5.15) and let $j_{dis,v}$ denote the current density of the dissipative system corresponding to the steady state $\varrho_{dis}$ as defined in [22, Sec. 4]. By Lemma 5.9 and [2, Theorem 7.1.] we get $j_{dis,v}(\lambda_0) = j_{\varrho(v)}(\lambda_0)$, i.e. the current density of the dissipative system and of the QTB system coincide for fixed energy $\lambda_0$.

Example 5.11. For the generalized distribution function $\rho$ from Example 5.6 we get the current
\[ j_{\varrho(v)}(\lambda) = \frac{1}{2\pi\hbar}\, T(v)(\lambda) \left( f(\lambda - \epsilon_a) - f(\lambda - \epsilon_b) \right), \qquad \lambda \in (v_a, \infty), \]
where
\[ T(v)(\lambda) := \frac{q_b(\lambda)}{q_a(\lambda)} |S_{ba}(\lambda)|^2 = \frac{q_a(\lambda)}{q_b(\lambda)} |S_{ab}(\lambda)|^2 \]
is the transmission coefficient. If the system is in thermodynamic equilibrium, i.e. if $\epsilon_a = \epsilon_b$, then $j_{\varrho(v)}(\lambda) = 0$. In particular, the current vanishes for any steady state given by (5.17), see Example 5.5.

Proposition 5.12. If
\[ \int_{v_a}^{\infty} d\lambda\, \mathrm{tr}_{\mathbb{C}^2}(\rho_{ac}(\lambda)) < \infty, \]
then the total current is bounded and the bound does not depend on $v$:
\[ |j_{\varrho(v)}| \le \frac{1}{2\pi\hbar} \int_{v_a}^{\infty} d\lambda\, \mathrm{tr}_{\mathbb{C}^2}(\rho_{ac}(\lambda)). \]

Proof. Since $\|(P_a S_v(\lambda)^* P_b - P_b S_v(\lambda)^* P_a) S_v(\lambda)\|_{h(\lambda)} \le 1$ for $\lambda \in (v_a, \infty)$, we get immediately from (5.20) and Lemma 5.9
\[ |j_{\varrho(v)}| \le \frac{1}{2\pi\hbar} \int_{v_a}^{\infty} d\lambda\, \mathrm{tr}_{\mathbb{C}^2}(\rho_{ac}(\lambda)). \]

6. Carrier Density Operator

In this section we define the nonlinear carrier density operator which associates the potential seen by particles to their density and prove that the density depends continuously on the potential. Following [1] the carrier density operator $N_{\hat\rho} : L^\infty_{\mathbb{R}} \to L^1_{\mathbb{R}}$ is defined by
\[ N_{\hat\rho}(v) := u_{\varrho(v)}, \qquad v \in D(N_{\hat\rho}) := L^\infty_{\mathbb{R}}, \]
where $u_{\varrho(v)}$ is the carrier density from Definition 5.3.

Remark 6.1. If we assume instead of (5.3) the slightly stronger condition
\[ \operatorname*{ess\,sup}_{\lambda \in (v_b, \infty)} \|\rho_{ac}(\lambda)\|_{B(h(\lambda))} (1 + \lambda^2) < \infty, \tag{6.1} \]
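then one can prove that the particle density operator $N_{\hat\rho}(\cdot)$ is not only well defined as an operator from $L^\infty_{\mathbb{R}}$ into $L^1_{\mathbb{R}}$, but takes its values in $W^{1,2}_{\mathbb{R}}$.

By means of Proposition 2.6, Proposition 2.8, and Lemma 5.7 one obtains

Lemma 6.2. If $v \in L^\infty_{\mathbb{R}}$, then
\[ \|N_{\hat\rho}(v)\|_{L^1} \le C_p(v) \left( 1 + \frac{\sqrt{2\|m\|_{L^\infty}}\,(b - a)}{\pi\hbar} \sqrt{\|v\|_{L^\infty} + |v_b|} \right) + C_{ac} \left( 3 + \left( 8 + 4\,\frac{(b - a)}{\hbar} \sqrt{\|m\|_{L^\infty}} \right) \sqrt{1 + \|v\|_{L^\infty}} \right). \]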
We are now going to prove that the particle density operator is continuous. For doing this we need some technical lemmas.

Lemma 6.3. Assume $(v_n)_{n \in \mathbb{N}} \subset L^\infty_{\mathbb{R}}$ and $v \in L^\infty_{\mathbb{R}}$. If $v_n \xrightarrow{L^\infty} v$ as $n \to \infty$, then
\[ \lim_{n \to \infty} \|(K_{v_n} - z)^{-1} - (K_v - z)^{-1}\|_{B_1(K)} = 0, \]
for all $z$ in the resolvent set of $K_v$.

Proof. By [25, Theorem IV.1.16] $K_{v_n} - z$ is boundedly invertible for $z$ in the resolvent set of $K_v$, if $\|v - v_n\|_{L^\infty} < \|(K_v - z)^{-1}\|^{-1}_{B(K)}$. Hence, $z$ is also in the resolvent set of $K_{v_n}$ for sufficiently large $n$. Furthermore, there is
\[ \|(K_{v_n} - z)^{-1}\|_{B(K)} \le \frac{\|(K_v - z)^{-1}\|_{B(K)}}{1 - \|v_n - v\|_{L^\infty} \|(K_v - z)^{-1}\|_{B(K)}}. \]
Since $P^K_H (K_v - z)^{-1}$ is trace class one gets
\[ \|(K_{v_n} - z)^{-1} - (K_v - z)^{-1}\|_{B_1(K)} \le \|(K_{v_n} - z)^{-1}\|_{B(K)}\, \|v_n - v\|_{L^\infty}\, \|P^K_H (K_v - z)^{-1}\|_{B_1(K)}, \]
which completes the proof.

Lemma 6.4. Assume $v \in L^\infty_{\mathbb{R}}$, $(v_n)_{n \in \mathbb{N}} \subseteq L^\infty_{\mathbb{R}}$. Let $\lambda_1(v), \ldots, \lambda_{N(v)}(v)$ and $\lambda_1(v_n), \ldots, \lambda_{N(v_n)}(v_n)$ be the eigenvalues of $K_v$ and $K_{v_n}$, respectively. If $v_n \xrightarrow{L^\infty} v$ as $n \to \infty$, then we have $N(v) = N(v_n)$ for sufficiently large $n$ and
\[ \lim_{n \to \infty} \lambda_k(v_n) = \lambda_k(v), \qquad \lim_{n \to \infty} \|P(\lambda_k(v_n)) - P(\lambda_k(v))\|_{B(K)} = 0 \]
for $k = 1, \ldots, N(v)$, where $P(\lambda_k(v_n))$, $P(\lambda_k(v))$ denote the projections onto the eigenspaces of $K_{v_n}$, $K_v$ corresponding to the eigenvalues $\lambda_k(v_n)$, $\lambda_k(v)$, $k = 1, \ldots, N(v)$, respectively.

The proof follows immediately from Lemma 6.3 and [25, Theorem IV.3.16].
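Lemma 6.5. If $v \in L^\infty_{\mathbb{R}}$, $(v_n)_{n \in \mathbb{N}} \subseteq L^\infty_{\mathbb{R}}$ with $v_n \xrightarrow{L^\infty} v$, then
\[ \operatorname*{s-lim}_{n \to \infty} W_\pm(K_{v_n}, K_{v_0}) = W_\pm(K_v, K_{v_0}), \qquad \operatorname*{s-lim}_{n \to \infty} W_\pm^*(K_{v_n}, K_{v_0}) = W_\pm^*(K_v, K_{v_0}), \tag{6.2} \]
where $K_{v_0}$ is the operator (2.3) referring to the constant potential $v \equiv v_0$, see also Corollary 4.2.

Proof. Lemma 6.3 and [25, Theorem X.4.15] provide
\[ \operatorname*{s-lim}_{n \to \infty} W_\pm((K_{v_n} + \mu)^{-1}, (K_{v_0} + \mu)^{-1}) = W_\pm((K_v + \mu)^{-1}, (K_{v_0} + \mu)^{-1}) \]
for a sufficiently large $\mu \in \mathbb{R}$. Taking into account the invariance principle, see for example [3], we obtain the first assertion; the second one follows in the same manner.

Theorem 6.6. Assume $v \in L^\infty_{\mathbb{R}}$ and $(v_n)_{n \in \mathbb{N}} \subset L^\infty_{\mathbb{R}}$. If $v_n \xrightarrow{L^\infty} v$, then
\[ N_{\hat\rho}(v_n) \xrightarrow{L^1} N_{\hat\rho}(v) \quad \text{and} \quad j_{\varrho(v_n)} \to j_{\varrho(v)} \quad \text{as } n \to \infty, \]
i.e. the carrier density operator is a continuous operator from $L^\infty_{\mathbb{R}}$ to $L^1_{\mathbb{R}}$ and the current density operator is continuous from $L^\infty_{\mathbb{R}}$ to $\mathbb{R}$.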
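Proof. According to Lemma 5.7 there is
\[ \int_a^b dx\, \left( (N_{\hat\rho}(v_n))(x) - (N_{\hat\rho}(v))(x) \right) h(x) = \mathrm{tr}((\varrho(v_n) - \varrho(v)) M(h)), \tag{6.3} \]
for every real-valued $h \in L^\infty$. Using the decomposition (5.7) for the steady states $\varrho(v_n)$ and $\varrho(v)$ we get
\[ \mathrm{tr}((\varrho(v_n) - \varrho(v)) M(h)) = \mathrm{tr}((\varrho_p(v_n) - \varrho_p(v)) M(h)) + \mathrm{tr}((\varrho_{ac}(v_n) - \varrho_{ac}(v)) M(h)), \tag{6.4} \]
where $\varrho_{ac}(v)$, $\varrho_{ac}(v_n)$ and $\varrho_p(v)$, $\varrho_p(v_n)$ are given by (5.8) and (5.10), respectively.

We first show that the first addend in (6.4) tends to zero as $n \to \infty$. By Lemma 6.4 we have that $\dim(\mathrm{ran}(\varrho_p(v_n))) = \dim(\mathrm{ran}(\varrho_p(v))) =: N(v) < \infty$, for sufficiently large $n$. Hence, it suffices to show that $\|\varrho_p(v_n) - \varrho_p(v)\|_{B(K)} \to 0$ as $n \to \infty$. We have by the definition of $\varrho_p(v_n)$ and $\varrho_p(v)$, see (5.10),
\[ \|\varrho_p(v_n) - \varrho_p(v)\|_{B(K)} \le \sum_{j=1}^{N} \Big( |\rho_p(\lambda_j(v_n)) - \rho_p(\lambda_j(v))| + |\rho_p(\lambda_j(v))|\, \|P(\lambda_j(v_n)) - P(\lambda_j(v))\|_{B(K)} \Big). \tag{6.5} \]
Since $\rho_p \in C^b_{\mathbb{R}}(-\infty, v_b)$, see (5.1), we get by Lemma 6.4 and (6.5) that $\varrho_p(v_n)$ converges in the norm of $B(K)$ to $\varrho_p(v)$ as $n \to \infty$. Hence,
\[ \lim_{n \to \infty} \sup_{\|h\|_{L^\infty} \le 1} |\mathrm{tr}((\varrho_p(v_n) - \varrho_p(v)) M(h))| \le N(v) \lim_{n \to \infty} \|\varrho_p(v_n) - \varrho_p(v)\|_{B(K)} = 0. \tag{6.6} \]

To prove that the second addend in (6.4) tends to zero as $n \to \infty$, we set
\[ W := W_-(K_v, K_{v_0}) \quad \text{and} \quad W_n := W_-(K_{v_n}, K_{v_0}). \]
By (5.8) we obtain
\[ \varrho_{ac}(v_n) - \varrho_{ac}(v) = (W_n - W) \varrho_{ac}(v_0) W_n^* + W \varrho_{ac}(v_0) (W_n^* - W^*). \]
Since $W K_{v_0} = K_v W$ and $W_n K_{v_0} = K_{v_n} W_n$ we get by (5.9):
\[ |\mathrm{tr}((\varrho_{ac}(v_n) - \varrho_{ac}(v)) M(h))| \le \left| \mathrm{tr}\left( ((K_{v_n} + i)^{-1} W_n - (K_{v_0} + i)^{-1} W) \varrho_{ac}(v_0) (K_{v_0} + i) W_n^* M(h) \right) \right| + \left| \mathrm{tr}\left( W (K_{v_0} - i) \varrho_{ac}(v_0) (W_n^* (K_{v_n} - i)^{-1} - W^* (K_{v_0} - i)^{-1}) M(h) \right) \right|. \]
We estimate the terms on the right-hand side separately:
\[ \left| \mathrm{tr}\left( ((K_{v_n} + i)^{-1} W_n - (K_{v_0} + i)^{-1} W) \varrho_{ac}(v_0) (K_{v_0} + i) W_n^* M(h) \right) \right| \le C_{ac} \left\| \left( W_n^* (K_{v_n} - i)^{-1} - W^* (K_{v_0} - i)^{-1} \right) M(h) \right\|_{B_1(K)}, \]
\[ \left| \mathrm{tr}\left( W (K_{v_0} - i) \varrho_{ac}(v_0) (W_n^* (K_{v_n} - i)^{-1} - W^* (K_{v_0} - i)^{-1}) M(h) \right) \right| \le C_{ac} \left\| \left( W_n^* (K_{v_n} - i)^{-1} - W^* (K_{v_0} - i)^{-1} \right) M(h) \right\|_{B_1(K)} \]
\[ \le C_{ac} \left\| (K_{v_n} - i)^{-1} - (K_{v_0} - i)^{-1} \right\|_{B_1(K)} \|h\|_{L^\infty} + C_{ac} \left\| (W_n^* - W^*) (K_{v_0} - i)^{-1} P^K_H \right\|_{B_1(K)} \|h\|_{L^\infty}, \]
and finally obtain
\[ |\mathrm{tr}((\varrho_{ac}(v_n) - \varrho_{ac}(v)) M(h))| \le 2 C_{ac} \|h\|_{L^\infty} \left( \left\| (K_{v_n} - i)^{-1} - (K_v - i)^{-1} \right\|_{B_1(K)} + \left\| (W_n^* - W^*) (K_{v_0} - i)^{-1} P^K_H \right\|_{B_1(K)} \right). \]
By Lemma 6.3, Lemma 6.5 and the fact that $(K_{v_0} - i)^{-1} P^K_H$ is trace class one gets
\[ \lim_{n \to \infty} \sup_{\|h\|_{L^\infty} \le 1} |\mathrm{tr}((\varrho_{ac}(v_n) - \varrho_{ac}(v)) M(h))| = 0. \tag{6.7} \]
Now (6.3), (6.4), (6.6) and (6.7) imply the assertion on the carrier density.

Lemma 6.5 implies by Corollary 4.2 the strong convergence $\hat{S}_{v_n} \xrightarrow{s} \hat{S}_v$. Using the expression (5.23) for $C(v)(\lambda)$ one sees that $j_{\varrho(v_n)} \to j_{\varrho(v)}$ as $n \to \infty$.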
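The current formula of Example 5.11 and the bound of Proposition 5.12 can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the transmission coefficient is a hypothetical Breit–Wigner-type profile with values in $[0, 1]$ (it is not derived from a scattering matrix), the distribution function is a Fermi–Dirac-type stand-in for (1.5), units with $\hbar = 1$ are used, and the integral over $(v_a, \infty)$ is truncated at a finite cutoff. All names (`fermi`, `transmission`, `total_current`, `current_bound`) are illustrative.

```python
import math

HBAR = 1.0  # assumption: units with hbar = 1


def fermi(x, kT=0.05):
    """Hypothetical Fermi-Dirac-type distribution function (stand-in for (1.5))."""
    t = x / kT
    if t > 0:
        e = math.exp(-t)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(t))


def transmission(lam, lam_res=0.5, gamma=0.02):
    """Hypothetical resonance-shaped transmission coefficient with 0 <= T <= 1."""
    return gamma ** 2 / ((lam - lam_res) ** 2 + gamma ** 2)


def current_density(lam, eps_a, eps_b):
    """j(lam) = T(lam) (f(lam - eps_a) - f(lam - eps_b)) / (2 pi hbar)."""
    diff = fermi(lam - eps_a) - fermi(lam - eps_b)
    return transmission(lam) * diff / (2.0 * math.pi * HBAR)


def integrate(g, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    s = 0.5 * (g(lo) + g(hi)) + sum(g(lo + k * h) for k in range(1, n))
    return s * h


def total_current(v_a, eps_a, eps_b, cutoff=5.0):
    """Total current, with the energy integral truncated at `cutoff`."""
    return integrate(lambda lam: current_density(lam, eps_a, eps_b), v_a, cutoff)


def current_bound(v_a, eps_a, eps_b, cutoff=5.0):
    """Right-hand side of Proposition 5.12 for the diagonal rho_ac of Example 5.6."""
    trace = lambda lam: fermi(lam - eps_a) + fermi(lam - eps_b)
    return integrate(trace, v_a, cutoff) / (2.0 * math.pi * HBAR)
```

For equal quasi-Fermi potentials the integrand vanishes identically, so the sketch returns zero current; for unequal potentials the computed current stays below the bound because the toy transmission coefficient never exceeds one.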
7. Quantum Transmitting Schrödinger–Poisson System

Let us now consider the quantum transmitting Schrödinger–Poisson system, i.e. the nonlinear Poisson equation (1.11) with the carrier density operator from Sec. 6:
\[ -\frac{d}{dx} \varepsilon \frac{d}{dx} \varphi = q \left( C + N^+_{\hat\rho^+}(v^+) - N^-_{\hat\rho^-}(v^-) \right) \quad \text{in } (a, b) \tag{7.1} \]
and the mixed boundary conditions
\[ \varphi = \varphi_\Gamma \quad \text{on } \Gamma, \qquad -\varepsilon \frac{d}{dx} \varphi = \iota (\varphi - \varphi_\Gamma) \quad \text{on } \{a, b\} \setminus \Gamma, \tag{7.2} \]
where $\Gamma \subseteq \{a, b\}$ is the Dirichlet part of the boundary and the function $\varphi_\Gamma$, defined on $[a, b]$, represents the Dirichlet boundary values given on $\Gamma$ and the inhomogeneous boundary conditions of the third kind on $\{a, b\} \setminus \Gamma$; the function $\iota \ge 0$ is defined on $\{a, b\}$ and can be seen as the capacity of the boundary; in particular $\iota$ is zero at an insulating interface, see [15]. We recall from Sec. 1 that $\varepsilon$ is the dielectric permittivity function, $q$ is the elementary charge, and $C$ is the concentration of ionized dopants. Each of the carrier density operators $N^\pm_{\hat\rho^\pm}(v^\pm)$ is determined by a Buslaev–Fomin operator $K^\pm_{v^\pm}$, one for electrons (with superscript "−") and one for holes (with superscript "+"). The potential energies for the electrons and holes are given by $v^\pm := w^\pm \pm q\varphi$, where $w^\pm$ are given reference energies, see (1.12). The thus specified nonlinear Poisson equation (7.1), (7.2) is the quantum transmitting Schrödinger–Poisson system, see also Sec. 1.

In order to define precisely what a solution of this system is, we introduce the following function spaces: with respect to the (possibly empty) set $\Gamma \subset \{a, b\}$ of Dirichlet points in the boundary conditions on Poisson's equation we define
\[ W^{1,2}_\Gamma := W^{1,2}_{\mathbb{R}} \cap \{\psi \mid \psi(\Gamma) \subset \{0\}\}. \]
In the following we denote by $W^{-1,2}_\Gamma$ the dual space of $W^{1,2}_\Gamma$, by $\langle \cdot, \cdot \rangle$ the dual pairing between $W^{1,2}_\Gamma$ and $W^{-1,2}_\Gamma$, by $E_1$ the embedding operator from $L^1_{\mathbb{R}}$ into $W^{-1,2}_\Gamma$ with norm $\|E_1\|$, and by $E_\infty$ the embedding operator from $W^{1,2}_{\mathbb{R}}$ into $L^\infty_{\mathbb{R}}$ with norm $\|E_\infty\|$.

7.1. Assumptions

Throughout this section we make the following assumptions about the data of the problem:

A1: The effective masses $m^\pm$ are positive and obey $m^\pm, \frac{1}{m^\pm} \in L^\infty_{\mathbb{R}}$. The constants $m^\pm_a, m^\pm_b \in \mathbb{R}$ are positive.
A2: The real constants $v^\pm_a$, $v^\pm_b$ obey $v^\pm_a > v^\pm_b$.
A3: The external reference energies $w^\pm$ belong to $L^\infty_{\mathbb{R}}$.
A4: The generalized distribution functions $\rho^\pm = \rho^\pm_p \oplus \rho^\pm_{ac}$ obey: $\rho^\pm_p \in C^b_{\mathbb{R}}(-\infty, v^\pm_b)$ and $\rho^\pm_{ac}(\cdot) \in L^\infty((v^\pm_b, \infty), B(h^\pm(\lambda)), \nu_{ac})$,
where $\nu_{ac}$ is as in (3.25), and $h^\pm(\lambda)$ are given by (3.24) there replacing $v_b$, $v_a$ by $v^\pm_b$, $v^\pm_a$; moreover, $\rho^\pm_p \ge 0$ and
\[ \rho^\pm_{ac}(\lambda)^* = \rho^\pm_{ac}(\lambda), \quad \rho^\pm_{ac}(\lambda) \ge 0, \quad \text{for a.e. } \lambda \in (v_b, \infty), \qquad \operatorname*{ess\,sup}_{\lambda \in (v_b, \infty)} \|\rho^\pm_{ac}(\lambda)\|_{B(h^\pm(\lambda))} \sqrt{\lambda^2 + 1} < \infty. \]
We define the constants
\[ C^\pm_p := \sup_{\lambda \in (-\infty, v_b)} \rho_p(\lambda), \qquad C^\pm_{ac} := \operatorname*{ess\,sup}_{\lambda \in (v_b, \infty)} \|\rho^\pm_{ac}(\lambda)\|_{B(h^\pm(\lambda))} \sqrt{\lambda^2 + 1}. \]
A5: The doping profile $C$, see (7.1), belongs to the function space $W^{-1,2}_\Gamma$.
A6: The dielectric permittivity function $\varepsilon$ is positive and obeys $\varepsilon, \frac{1}{\varepsilon} \in L^\infty_{\mathbb{R}}$. We set $\tilde\varepsilon := \max\{1, \|\frac{1}{\varepsilon}\|_{L^\infty}\}$.
A7: The set $\Gamma \subset \{a, b\}$ is not empty or at least one of the numbers $\iota(x)$, $x \in \{a, b\} \setminus \Gamma$, is strictly positive.
A8: The function $\varphi_\Gamma$ is from the space $W^{1,2}_{\mathbb{R}}$.

Remark 7.1. In assumption A4 we demand the boundedness of the distribution functions $\rho^\pm_p$ governing the point spectrum of the respective Hamiltonian. This seems to be very restrictive because distribution functions like (1.5) are unbounded. However, for important classes of devices, e.g. resonant tunnelling diodes, the point spectrum is generically absent. That is why we cut off unbounded distribution functions like (1.5) for proving that the quantum transmitting Schrödinger–Poisson system has solutions. Instead of assuming boundedness of the distribution functions $\rho^\pm_p$ one can presuppose the monotonicity of $\rho^\pm_p$. Using the decomposition (5.7) and the fact that $\varrho^\pm_p(v)$ is a function of the point part of the respective Hamiltonian $K^\pm_v$ one obtains monotonicity of the "point part" of the particle density operator in a similar way as in the case of a Hamiltonian with compact resolvent. Thus, in the case of unbounded, but monotone distribution functions $\rho^\pm_p$ the monotonicity of the "point part" of the particle density operator can be used to prove the existence of solutions of the quantum transmitting Schrödinger–Poisson system as in the theory of Schrödinger–Poisson systems comprising Hamiltonians with compact resolvent, see [10, 33, 17–20, 24]. In this paper we assume boundedness of the distribution functions $\rho^\pm_p$ since the "monotonicity" approach is technically more involved and because dealing with the point spectrum of the operator $K^\pm_v$ is not the primary objective of this paper.

To each Buslaev–Fomin operator $K^\pm_{v^\pm}$, see (2.3), we associate a QTB operator family $\{H^\pm_{v^\pm}(z)\}_{z \in \mathbb{C}_+}$ according to Definition 2.2.
The functions $\rho^\pm(\cdot)$ define by (5.6) steady states $\varrho^\pm(v)$, i.e. non-negative, selfadjoint operators which commute with $K^\pm_{v^\pm}$. The carrier densities $u_{\varrho^\pm(v)}$ for the electrons and holes are determined by Definition 5.3. The corresponding carrier density operators (6.1) are denoted by $N^\pm_{\hat\rho^\pm}(v)$.
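7.2. Definition of solutions

The linear Poisson operator $P : W^{1,2}_{\mathbb{R}} \to W^{-1,2}_\Gamma$ is defined by
\[ \langle P\upsilon, \varsigma \rangle = \int_a^b dx\, \varepsilon(x) \upsilon'(x) \varsigma'(x) + \sum_{x \in \{a,b\} \setminus \Gamma} \iota(x) \upsilon(x) \varsigma(x), \qquad \text{for all } \varsigma \in W^{1,2}_\Gamma, \ \upsilon \in D(P) = W^{1,2}_{\mathbb{R}}. \]
We denote the restriction of $P$ to $W^{1,2}_\Gamma$ by $P_0$. The inverse of $P_0$ exists and its norm does not exceed $\tilde\varepsilon (1 + \gamma_\iota)$, see [1]. The constant
\[ \gamma_\iota := \sup_{0 \ne \psi \in W^{1,2}_\Gamma} \frac{\|\psi\|^2_{L^2}}{\|\psi'\|^2_{L^2} + \sum_{x \in \{a,b\} \setminus \Gamma} \iota(x) |\psi(x)|^2} \]
is finite because the case of purely homogeneous Neumann boundary conditions is excluded by assumption A7. $\tilde\varphi_\Gamma$ denotes the bounded linear form
\[ \upsilon \mapsto \int_a^b dx\, \varepsilon(x) \varphi_\Gamma'(x) \upsilon'(x), \qquad \upsilon \in D(\tilde\varphi_\Gamma) = W^{1,2}_\Gamma. \]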
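Definition 7.2. If $u^\pm \in L^1_{\mathbb{R}}$, then $\varphi \in W^{1,2}_{\mathbb{R}}$ is a solution of Poisson's equation
\[ -\frac{d}{dx} \varepsilon \frac{d}{dx} \varphi = q (C + u^+ - u^-) \quad \text{in } (a, b) \]
with the boundary conditions (7.2) if and only if $\varphi - \varphi_\Gamma \in W^{1,2}_\Gamma$ satisfies
\[ P_0 (\varphi - \varphi_\Gamma) = D + q E_1 u^+ - q E_1 u^-, \tag{7.3} \]
where $D := qC - \tilde\varphi_\Gamma$ and $E_1$ is the embedding operator from $L^1$ into $W^{-1,2}_\Gamma$.

Definition 7.3. $(\varphi, u^+, u^-) \in W^{1,2}_{\mathbb{R}} \times L^1_{\mathbb{R}} \times L^1_{\mathbb{R}}$ is a solution of the quantum transmitting Schrödinger–Poisson system if $\varphi$ satisfies Poisson's equation in the sense of Definition 7.2 with
\[ u^+ = u^+_{\varrho^+(w^+ + q E_\infty \varphi)} \quad \text{and} \quad u^- = u^-_{\varrho^-(w^- - q E_\infty \varphi)}, \]
where $E_\infty$ is the embedding operator from $W^{1,2}_{\mathbb{R}}$ into $L^\infty_{\mathbb{R}}$.

7.3. Existence of solutions

Following [19] we define a mapping whose fixed points determine the solutions of the quantum transmitting Schrödinger–Poisson system. To that end we first introduce the mapping $J : L^1_{\mathbb{R}} \times L^1_{\mathbb{R}} \to W^{1,2}_{\mathbb{R}}$ which assigns to a couple of densities $(u^+, u^-) \in L^1_{\mathbb{R}} \times L^1_{\mathbb{R}}$ the solution of Poisson's equation:
\[ J(u^+, u^-) = P_0^{-1} \left( D + q E_1 (u^+ - u^-) \right) + \varphi_\Gamma. \tag{7.4} \]
Obviously, the map $J$ is continuous. Now we define $\Psi : L^\infty_{\mathbb{R}} \to W^{1,2}_{\mathbb{R}}$ by
\[ \Psi : v \mapsto \left( N^+_{\hat\rho^+}(w^+ + qv), N^-_{\hat\rho^-}(w^- - qv) \right) \mapsto J\left( N^+_{\hat\rho^+}(w^+ + qv), N^-_{\hat\rho^-}(w^- - qv) \right). \]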
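Since the carrier density operators $N^\pm_{\hat\rho^\pm}$ are continuous, see Theorem 6.6, we get that $\Psi$ is continuous. Finally we define $\Psi_\infty : L^\infty_{\mathbb{R}} \to L^\infty_{\mathbb{R}}$ by $\Psi_\infty := E_\infty \Psi$. As both $\Psi$ and the embedding operator $E_\infty$ from $W^{1,2}_{\mathbb{R}}$ into $L^\infty_{\mathbb{R}}$ are continuous, so is $\Psi_\infty$. Moreover, the map $\Psi_\infty$ is compact because $E_\infty$ is compact.

Lemma 7.4. An element $v \in L^\infty_{\mathbb{R}}$ is a fixed point of $\Psi_\infty$ if and only if the triple
\[ (\Psi(v), u^+, u^-) = \left( \Psi(v), u^+_{\varrho^+(w^+ + qv)}, u^-_{\varrho^-(w^- - qv)} \right) \]
is a solution of the quantum transmitting Schrödinger–Poisson system.

The proof follows directly from the definitions.

Definition 7.5. With respect to the data of the problem we define
\[ x_0 := \frac{\sigma_1}{2} + \sqrt{\frac{\sigma_1^2}{4} + \sigma_2}, \]
as the (unique) positive root of the polynomial $p : x \mapsto x^2 - \sigma_1 x - \sigma_2$, where
\[ \sigma_1 := \|E_\infty\| \|E_1\| \tilde\varepsilon (1 + \gamma_\iota)\, q\, (\sigma_1^+ + \sigma_1^-), \qquad \sigma_2 := \|E_\infty\| \|\varphi_\Gamma\|_{W^{1,2}} + \|E_\infty\| \tilde\varepsilon (1 + \gamma_\iota) \left( \|D\|_{W^{-1,2}} + q \|E_1\| (\sigma_2^+ + \sigma_2^-) \right), \tag{7.5} \]
and
\[ \sigma_1^\pm := \sqrt{q}\, C^\pm_{ac} \left( 8 + 4\,\frac{(b - a)}{\hbar} \sqrt{\|m^\pm\|_{L^\infty}} \right) + \sqrt{q}\, C^\pm_p\, \frac{\sqrt{2\|m^\pm\|_{L^\infty}}\,(b - a)}{\pi\hbar}, \]
\[ \sigma_2^\pm := C^\pm_{ac} \left( 3 + \left( 8 + 4\,\frac{(b - a)}{\hbar} \sqrt{\|m^\pm\|_{L^\infty}} \right) \sqrt{1 + \|w^\pm\|_{L^\infty}} \right) + C^\pm_p \left( 1 + \frac{\sqrt{2\|m^\pm\|_{L^\infty} (\|w^\pm\|_{L^\infty} + |v_b|)}\,(b - a)}{\pi\hbar} \right), \tag{7.6} \]
with $C^\pm_{ac}$ and $C^\pm_p$ according to (5.3) and (5.4), respectively. Here $\|E_1\|$ and $\|E_\infty\|$ denote the norms of the embedding operators $E_1$ and $E_\infty$.

Theorem 7.6. The map $\Psi_\infty : L^\infty_{\mathbb{R}} \to L^\infty_{\mathbb{R}}$ has a fixed point. For any fixed point $v$ of $\Psi_\infty$ there is
\[ \|v\|_{L^\infty} \le x_0^2, \tag{7.7} \]
where $x_0$ is according to Definition 7.5.

Proof. By the definition (7.4) of the operator $J$ we have
\[ \|J(u^+, u^-)\|_{W^{1,2}_{\mathbb{R}}} \le \|\varphi_\Gamma\|_{W^{1,2}} + \tilde\varepsilon (1 + \gamma_\iota) \|D + q E_1 (u^+ - u^-)\|_{W^{-1,2}_\Gamma} \le \|\varphi_\Gamma\|_{W^{1,2}} + \tilde\varepsilon (1 + \gamma_\iota) \left( \|D\|_{W^{-1,2}_\Gamma} + \|E_1\|\, q \left( \|u^+\|_{L^1} + \|u^-\|_{L^1} \right) \right). \tag{7.8} \]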
326
00199
M. Baro et al.
Since u± := Nρˆ±± (v ± ) with v ± = w± ± qv we get from Lemma 6.2: p ku± kL1 ≤ σ1± kvkL∞ + σ2± , where
σ1±
and
σ2±
(7.9)
are given by (7.6). Thus, (7.8) and (7.5) provide 1/2
kΨ∞ (v)kL∞ ≤ ∞ kJ (Nρˆ++ (v), Nρˆ−− (v))kW1,2 ≤ σ1 kvkL∞ + σ2 .
(7.10)
If x0 is the unique positive root of the polynomial p : x 7→ x2 − σ1 x − σ2 and kvkL∞ ≤ x20 , then kΨ∞ (v)kL∞ ≤ σ1 x0 + σ2 = x20 .
2 Hence, Ψ∞ maps the ball {v ∈ L∞ R | kvkL∞ ≤ x0 } continuously into itself. Since Ψ∞ is compact, the image of this ball is precompact in L∞ R . Therefore Schauder’s fixed point theorem assures the existence of a fixed point. Let us now assume the second assertion were false, i.e. we assume that there exists a fixed point v satisfying kvkL∞ > x20 . By (7.10) we get 1/2
1/2
(kvkL∞ )2 = kvkL∞ = kΨ∞ (v)kL∞ ≤ σ1 kvkL∞ + σ2 .
This is a contradiction to p(x) > 0 for x > x0 . Theorem 7.7. Under the assumptions in Sec. 7.1 the quantum transmitting Schr¨ odinger–Poisson system always admits a solution in the sense of Definition 7.3 and any solution (ϕ, u+ , u− ) of the quantum transmitting Schr¨ odinger–Poisson system satisfies the a priori estimate kϕkL∞ ≤ x20 ,
± ± r + Cp± s± , ku± kL1 ≤ Cac
(7.11)
where x0 is given by Definition 7.5 and p p (b − a) √ ± ± r := 3 + 8 + 4 km kL∞ qx0 + 1 + kw± kL∞ , ~ p 2km± kL∞ (b − a) p ± √ kw kL∞ + |vb | + qx0 . s± := 1 + π~
Proof. The first assertion follows from Lemma 7.4 and Theorem 7.6. As for (7.11) one obtains the first inequality from (7.7) and the second inequality from the first one in conjunction with Lemma 6.2. 7.4. Concluding remarks
Open quantum systems like the quantum transmitting Schrödinger–Poisson system have been treated in a very general framework in [32]. However, in [32] certain restrictions are imposed on the steady state $\varrho(v)$ and only the unipolar case has been treated. In the present paper we are dealing with the general bipolar case and we obtain explicit estimates for the electrostatic potential and the densities of electrons and holes.

We have demonstrated that for steady states of a more general form than $\varrho(v) = f(K_v)$ the QTB approach leads to a non-zero current through the boundary of the device. This allows a current coupling between a QTB family and a classical drift diffusion model or a kinetic model. However, in coupling the QTB family with an external model for the potential to be seen by Schrödinger's operator one cannot assume anymore that the potential outside the interval $(a, b)$, i.e. $v_a$, $v_b$, is fixed. Hence, the operator $E$ defined by (2.1) has to be replaced by the operator $\tilde{E} : C_{\mathbb{R}}([a, b]) \to C^b_{\mathbb{R}}(\mathbb{R})$ given by
\[ (\tilde{E}v)(x) := \begin{cases} v(a), & -\infty < x \le a, \\ v(x), & x \in (a, b), \\ v(b), & b \le x < \infty, \end{cases} \qquad v \in D(\tilde{E}) = C_{\mathbb{R}}([a, b]). \]
The Buslaev–Fomin operators are then defined by
\[ \tilde{K}_v := -\frac{\hbar^2}{2} \frac{d}{dx} \frac{1}{\hat{m}} \frac{d}{dx} + \tilde{E}v. \]
Note that in this case the absolutely continuous spectrum depends on the potential $v$, i.e. $\sigma_{ac}(\tilde{K}_v) = [\min\{v(a), v(b)\}, \infty)$. Furthermore, in general the wave operators
\[ W_\pm(\tilde{K}_w, \tilde{K}_v) = \operatorname*{s-lim}_{t \to \pm\infty} \exp(it\tilde{K}_w) \exp(-it\tilde{K}_v) P_{ac}(\tilde{K}_v), \qquad v, w \in C_{\mathbb{R}}([a, b]), \]
do not exist and, therefore, we do not have a representation as given in Remark 5.2. Thus, the techniques used in this paper to prove the continuity of the particle density operator, see Theorem 6.6, do not apply in this case.

There is a close relation between the dissipative Schrödinger–Poisson system treated in [1] and the quantum transmitting Schrödinger–Poisson system we have investigated in this paper. As already noted in Remarks 5.4 and 5.10 the carrier and current densities of the dissipative and the quantum transmitting Schrödinger–Poisson system coincide for fixed energy $\lambda_0 > v_b$, provided the generalized distribution function of the dissipative system is transformed according to (5.15). Therefore, the dissipative Schrödinger–Poisson system can be regarded as a single energy approximation of the quantum transmitting Schrödinger–Poisson system.

We have demonstrated that the quantum transmitting Schrödinger–Poisson system always admits a solution, if the assumptions in Sec. 7.1 are satisfied. Moreover, there are a priori estimates for any solution $\varphi$ and the corresponding carrier densities $u^\pm$. The QTB family already contains all the information which is needed to determine the carrier and the current density by means of a function $\rho$ generating all the steady states of the system, see Lemma 3.6 and (5.13) (for the carrier density) and Lemmas 4.3 and 5.9 (for the current density). As for the current driven van Roosbroeck system, see [28], in general one cannot expect uniqueness of a solution of the quantum transmitting Schrödinger–Poisson system. Indeed, different modes of operation of nanoelectronic devices, to which the quantum transmitting boundary method applies, are important for their application. That is why numerical algorithms target specific solutions of the quantum transmitting Schrödinger–Poisson system, see e.g. [38].
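The role of the constant $x_0$ from Definition 7.5 in the a priori estimates can be illustrated numerically. The following sketch only checks the elementary scalar facts used in the proof of Theorem 7.6: $x_0$ is the positive root of $x^2 - \sigma_1 x - \sigma_2$, the majorant $t \mapsto \sigma_1 \sqrt{t} + \sigma_2$ from (7.10) maps $[0, x_0^2]$ into itself, and no $t > x_0^2$ can satisfy $t \le \sigma_1 \sqrt{t} + \sigma_2$. The function names and the sample values of $\sigma_1$, $\sigma_2$ are hypothetical; the sketch does not compute these constants from device data.

```python
import math


def x0(sigma1, sigma2):
    """Positive root of p(x) = x^2 - sigma1 * x - sigma2 (Definition 7.5)."""
    return sigma1 / 2.0 + math.sqrt(sigma1 ** 2 / 4.0 + sigma2)


def p(x, sigma1, sigma2):
    """The polynomial from Definition 7.5."""
    return x * x - sigma1 * x - sigma2


def bound_map(t, sigma1, sigma2):
    """Right-hand side of (7.10) as a function of t = ||v||_{L^infty}."""
    return sigma1 * math.sqrt(t) + sigma2


if __name__ == "__main__":
    s1, s2 = 1.5, 2.0  # hypothetical sample values, not taken from the paper
    r = x0(s1, s2)
    print("x0 =", r, " a priori bound x0^2 =", r * r)
    print("residual p(x0) =", p(r, s1, s2))
```

The invariance of the ball of radius $x_0^2$ is what allows Schauder's fixed point theorem to be applied; the sketch makes no statement about how a fixed point would be computed in practice.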
Acknowledgments

The authors thank Prof. P. Degond and Prof. N. Ben Abdallah for fruitful discussions. Michael Baro and Hagen Neidhardt enjoyed the hospitality of the Institute "Mathématiques pour l'Industrie et la Physique", Toulouse. The financial support of the DFG under grant RE 1480/2 for Neidhardt, and by the DFG Research Center "Mathematics for key technologies" (FZT 86) in Berlin for Baro, is gratefully acknowledged.

References

[1] M. Baro, H.-Chr. Kaiser, H. Neidhardt and J. Rehberg, Dissipative Schrödinger–Poisson systems, J. Math. Phys. 45 (2004) 21–43.
[2] M. Baro and H. Neidhardt, Dissipative Schrödinger-type operator as a model for generation and recombination, J. Math. Phys. 44 (2003) 2373–2401.
[3] H. Baumgärtel and M. Wollenberg, Mathematical Scattering Theory (Birkhäuser, Basel, 1983).
[4] N. Ben Abdallah, A hybrid kinetic–quantum model for stationary electron transport in a resonant tunneling diode, J. Stat. Phys. 90 (1998) 627–662.
[5] N. Ben Abdallah, On a multidimensional Schrödinger–Poisson scattering model for semiconductors, J. Math. Phys. 41 (2000) 4241–4261.
[6] N. Ben Abdallah, P. Degond and P. Markowich, On a one-dimensional Schrödinger–Poisson scattering model, Z. Angew. Math. Phys. 48 (1997) 135–155.
[7] M. Ben-Artzi, On spectral properties of the acoustic propagator in a layered band, J. Differ. Equations 136(1) (1997) 115–135.
[8] V. Buslaev and V. Fomin, An inverse scattering problem for the one dimensional Schrödinger equation on the entire axis, Vestnik Leningrad Univ. 17 (1962) 56–64 (Russian).
[9] Ph. Caussignac, J. Descloux and A. Yamnhakki, Simulation of some quantum models for semiconductors, Math. Models and Methods Appl. Sci. 12 (2002) 1049–1074.
[10] Ph. Caussignac, B. Zimmermann and F. Ferro, Finite element approximation of electrostatic potential in one dimensional multilayer structures with quantized electronic charge, Computing 45 (1990) 251–264.
[11] C. Foias and B. Sz.-Nagy, Harmonic Analysis of Operators on Hilbert Space (Akadémia Kiadó, Budapest and North-Holland Publishing Company, Amsterdam-London, 1970).
[12] W. R. Frensley, Quantum Transport, in Heterostructures and Quantum Devices, eds. N. G. Einspruch and W. R. Frensley (Academic Press, New York, 1994).
[13] W. R. Frensley, Boundary conditions for open quantum systems driven far from equilibrium, Rev. Modern Phys. 62 (1990) 745–791.
[14] H. Gajewski, On existence, uniqueness and asymptotic behavior of solutions of the basic equations for carrier transport in semiconductors, Z. Angew. Math. Mech. 65 (1985) 101–108.
[15] H. Gajewski, Analysis und Numerik von Ladungstransport in Halbleitern (Analysis and numerics of carrier transport in semiconductors), Mitt. Ges. Angew. Math. Mech. 16 (1993) 35–57.
[16] I. M. Gelfand and G. E. Schilow, Verallgemeinerte Funktionen I, Distributionen (Deutscher Verlag der Wissenschaften, Berlin, 1960).
[17] H.-Chr. Kaiser and J. Rehberg, On stationary Schrödinger–Poisson equations, Z. Angew. Math. Mech. 75 (1995) 467–468.
[18] H.-Chr. Kaiser and J. Rehberg, On stationary Schrödinger–Poisson equations modelling an electron gas with reduced dimension, Math. Methods Appl. Sci. 20 (1997) 1283–1312.
[19] H.-Chr. Kaiser and J. Rehberg, About a one-dimensional stationary Schrödinger–Poisson system with Kohn–Sham potential, Z. Angew. Math. Phys. 50 (1999) 423–458.
[20] H.-Chr. Kaiser and J. Rehberg, About a stationary Schrödinger–Poisson system with Kohn–Sham potential in a bounded two- or three-dimensional domain, Nonlinear Anal. Theory Methods Appl. 41 (2000) 33–72.
[21] H.-Chr. Kaiser, H. Neidhardt and J. Rehberg, Macroscopic current induced boundary conditions for Schrödinger operators, Integral Equations Oper. Theory 45 (2003) 39–63.
[22] H.-Chr. Kaiser, H. Neidhardt and J. Rehberg, On 1-dimensional dissipative Schrödinger-type operators, their dilations and eigenfunction expansions, Math. Nach. 252 (2002) 51–69.
[23] H.-Chr. Kaiser, H. Neidhardt and J. Rehberg, Density and current of a dissipative Schrödinger operator, J. Math. Phys. 43 (2002) 5325–5350.
[24] H.-Chr. Kaiser, H. Neidhardt and J. Rehberg, Convexity of trace functionals and Schrödinger operators, Preprint 835 (Weierstrass Institute for Applied Analysis and Stochastics (http://www.wias-berlin.de), Berlin, 2003).
[25] T. Kato, Perturbation Theory for Linear Operators, 2nd edn. (Springer, Berlin, 1980).
[26] D. Kirkner and C. Lent, The quantum transmitting boundary method, J. Appl. Phys. 67 (1990) 6353–6359.
[27] L. A. Landau and E. M. Lifschitz, Quantenmechanik (Akademie-Verlag, Berlin, 1971).
[28] P. A. Markowich, The Stationary Semiconductor Device Equations (Springer, Wien, 1986).
[29] P. A. Markowich, C. A. Ringhofer and C. Schmeiser, Semiconductor Equations (Springer, Wien, 1990).
[30] A. S. Markus, Introduction to the spectral theory of polynomial operator pencils. Transl. from the Russian by H. H. McFaden, Translations of Mathematical Monographs, 71. Providence, RI: American Mathematical Society (AMS), 1988.
[31] V. V. Mitin, V. A. Kochelap and M. A. Stroscio, Quantum heterostructures: microelectronics and optoelectronics (Cambridge University Press, 1999).
[32] F. Nier, The dynamics of some quantum open system with short-range nonlinearities, Nonlinearity 11 (1998) 1127–1172.
[33] F. Nier, A stationary Schrödinger–Poisson system arising from the modelling of electronic devices, Forum Math. 2 (1994) 489–510.
[34] S. J. Polak, C. den Heuer, W. H. A. Schilders and P. Markowich, Semiconductor device modelling from the numerical point of view, Int. J. Numer. Methods Eng. 24 (1987) 763–838.
[35] M. Reed and B. Simon, Methods of Modern Mathematical Physics III, Scattering Theory (Academic Press, New York, 1979).
[36] M. Reed and B. Simon, Methods in Modern Mathematical Physics IV: Analysis of Operators (Academic Press, New York, 1978).
[37] S. Selberherr, Analysis and Simulation of Semiconductor Devices (Springer, Wien, 1984).
[38] A. Svizhenko, M. P. Anantram, T. R. Govindan, B. Biegel and R. Venugopal, Two-dimensional quantum mechanical modeling of nanotransistors, J. Appl. Phys. 91 (2002) 2343–2353.
[39] J. Weidmann, Spectral Theory of Ordinary Differential Operators (Springer, Berlin, 1987).
[40] C. Weisbuch and B. Vinter, Quantum semiconductor structures: Fundamentals and applications (Academic Press, Boston, 1991).
April 20, 2004 11:4 WSPC/148-RMP
00201
Reviews in Mathematical Physics Vol. 16, No. 3 (2004) 331–352 © World Scientific Publishing Company
CONNECTIONS OVER TWO-DIMENSIONAL CELL COMPLEXES
AMBAR N. SENGUPTA∗ Department of Mathematics, Louisiana State University Baton Rouge, LA 70803, USA
Received 12 April 2003
Revised 21 January 2004

A framework of connections, gauge transformations, and more, is developed over cell complexes. The relationship between this and the continuum theory is explained. A 2-form over the space of flat connections over certain types of cell complexes is defined and its significant properties proved; this 2-form is induced by the Atiyah–Bott symplectic structure on the space of connections over closed, oriented surfaces.

Keywords: Gauge theory; cell complexes; moduli space; symplectic structure.

1. Introduction

1.1. Objectives

The purpose of this article is to study connections over cell complexes, primarily those corresponding to closed, oriented surfaces. This discrete framework is induced by a continuum picture, that of connections on bundles over surfaces, and we shall explain the detailed relationship between the discrete and continuum structures. In more detail, our objectives are to:

(i) describe the framework of connections and gauge transformations for cell complexes (Sec. 3);
(ii) demonstrate how the continuum structures of connections and gauge transformations induce their discrete counterparts over cell complexes (Sec. 3.3);
(iii) define a 2-form on the space of discrete connections over cell complexes (Sec. 4.1), explain its relation to the Atiyah–Bott symplectic form on the space of connections over a closed, oriented surface, and prove that this 2-form is closed (Sec. 4.2) and that it satisfies a moment-map type property (Sec. 4.3).

∗ http://www.math.lsu.edu/~sengupta
The above results generalize those proved in [7, 8] (where the cell complexes had only one 2-cell).

1.2. Background

Discrete connections arose naturally in lattice-approximations to continuum gauge theories. In the case of two-dimensional quantum pure Yang–Mills gauge theory, Wilson loop expectation values have exact expressions in terms of integration over discrete connections with respect to the discrete version of the continuum quantum Yang–Mills measure. In more detail, if $c_1, \ldots, c_k$ are loops, all based at one point, on a surface, and $h(c; \omega)$ denotes the holonomy of a connection $\omega$ around $c$, then, for a bounded measurable function $f$ on $G^k$, the expected value of $f(h(c_1; \omega), \ldots, h(c_k; \omega))$ with respect to the quantum Yang–Mills measure works out to be $\int f(h(c_1; x), \ldots, h(c_k; x))\, d\mu(x)$, where now the integration is over a space of connections $x$ on a cell complex containing the loops. Here $\mu$ is the discrete Yang–Mills measure (see [3]). The discrete Yang–Mills measure was defined mathematically in [3], but the ideas go back to Migdal's [10] and Witten's works [17, 18]. For more references see [16].

It is also of interest to start from a discrete group $K$ with a finite number of generators obeying a finite set of relations; view this as specifying a cell complex in a standard way and then study the space of flat connections over this cell complex to obtain "invariants" of the group $K$. The case we study in the present paper is when $K$ is the fundamental group of a closed, oriented surface. For recent interesting developments and elegant ideas from this point of view see Mulase and Penkava [11]. This point of view is, at some level, that of group cohomology [4]. For more on the moduli space of flat connections see the collection [9] (in particular, Huebschmann's article [6] for the group-cohomological approach).
A highly successful geometry, including Morse theory, over cell complexes has been developed by Robin Forman (papers are available at his website). Though originally inspired by the Yang–Mills gauge theory, this work has an independent value and has had a substantial impact in the field of combinatorics. As alluded to above, the origin of much of the interest in studying spaces of connections, especially the moduli space of flat connections, arose from the physics literature. One way to look at this is from the point of view of the Yang–Mills gauge theory. The quantum field theoretic functional integrals (Wilson loop expectations) for Yang–Mills fields over compact surfaces converge to integrals over the moduli space of flat connections over the surface in the “classical” limit (see, for instance, [16] for an exposition of this idea, originating with Witten’s work [17]). These limiting integrals are with respect to the volume measure corresponding to the symplectic structure, a general version of which (as a 2-form) we study in this paper. Another source of interest in the moduli space of flat connections is from Chern– Simons field theory. This is, classically, a dynamical system in 2 (space) + 1 (time) dimensions. The field is again a connection over a surface evolving in time. The
Connections over Complexes
phase space for this dynamical system turns out to be precisely the moduli space of flat connections, equipped with its symplectic structure. Yet another way this moduli space arises is from three-dimensional gravity, though the group G is then generally non-compact (see the article by Schroers [12]). The discrete framework, from the physics standpoint, is useful mainly as an approximation tool in the quantum theory. Few of the concepts and structures of the continuum/classical theory survive the discretization. In our work, we extract certain such structures, such as the 2-form Ω studied in Sec. 4, which do survive the discretization to cell complexes from a continuum manifold setting. In the first few sections we shall review the definition and fundamental facts about a symplectic structure on the space of connections over a closed, oriented surface. Then in the second half of this paper we shall study this space of discrete flat connections over a cell complex in an independent way. We shall construct here an explicit formula for the induced structure on the moduli space of “discrete” flat connections over the 1-skeleton of a cell complex spanning the surface. 2. The Space of Connections over a Surface In this section we shall review the structure of the space A of connections on a principal bundle over a closed, oriented surface Σ. The structure of interest here is a symplectic form, introduced by Atiyah and Bott [2], defined on the affine space A. The group of gauge transformations acts as symmetries of this symplectic space and a Marsden–Weinstein type reduction yields (modulo technicalities) a symplectic structure on the moduli space of flat connections. This moduli space, and its symplectic structure, is of interest from many points of view. To cite just one instance, it is the phase space for the dynamical system describing the Chern– Simons field in an appropriate setting. 
We shall work with a compact, oriented two-dimensional manifold $\Sigma$, a compact, connected Lie group $G$ whose Lie algebra $LG$ is equipped with an Ad-invariant metric, and a principal $G$-bundle $\pi : P \to \Sigma$. In particular, the group $G$ acts freely on $P$ by a smooth map
\[ P \times G \to P : (p, g) \mapsto pg = R_g p . \]
If $v \in T_pP$ and $g \in G$ then we write $vg$ to mean the vector $(dR_g)_p v \in T_{pg}P$. If $X \in LG$ and $p \in P$ then we define
\[ X^*(p) = \frac{d}{dt}\Big|_{t=0}\, p\exp(tX) . \tag{2.1} \]
Then $X^*$ is a smooth vector field on $P$. A vector $v \in T_pP$ is called vertical if $d\pi_p v = 0$.

2.1. The space $\mathcal{A}$ and the group $\mathcal{G}$

A connection $\omega$ on $P$ is an $LG$-valued 1-form $\omega$ on $P$ satisfying two conditions:
(i) $\omega(X^*(p)) = X$ for every $X \in LG$,
(ii) $R_g^*\omega = \mathrm{Ad}(g^{-1})\omega$ for every $g \in G$.
Let $\mathcal{A}$ be the set of all connections on $P$. This is an infinite-dimensional affine space under pointwise operations. The tangent space at any $\omega \in \mathcal{A}$ is
\[ T_\omega\mathcal{A} = \{\omega' - \omega : \omega' \in \mathcal{A}\} . \tag{2.2} \]
Denote by $\mathcal{G}$ the set of all bundle equivalences of $P$ (gauge transformations), i.e. all $G$-equivariant diffeomorphisms $\phi : P \to P$ for which $\pi \circ \phi = \pi$. Then $\mathcal{G}$ is a group under composition. There is another useful incarnation of $\mathcal{G}$: it is $C_G(P, G)$, the set of all smooth maps $\hat\phi : P \to G$ satisfying $\hat\phi(pg) = g^{-1}\hat\phi(p)g$ for all $p \in P$ and $g \in G$. This forms a group under pointwise multiplication, and there is an isomorphism $C_G(P, G) \to \mathcal{G} : \hat\phi \mapsto \phi$ specified by requiring that for every $p \in P$, $\phi(p) = p\hat\phi(p)$. The group $\mathcal{G}$ acts on $\mathcal{A}$ by pullbacks:
\[ \mathcal{A} \times \mathcal{G} \to \mathcal{A} : (\omega, \phi) \mapsto \gamma_\omega(\phi) = \phi^*\omega = \mathrm{Ad}(\hat\phi^{-1})\omega + \hat\phi^{-1}d\hat\phi . \tag{2.3} \]
2.2. The tangent space $T_\omega\mathcal{A}$ and the Lie algebra $L\mathcal{G}$

We denote by $\bar\Lambda^k(P; LG)$ the linear space of all smooth $LG$-valued $k$-forms $A$ on $P$ satisfying two conditions:
(i) $A(v_1, \dots, v_k) = 0$ if at least one of the $v_i$ is vertical;
(ii) $R_g^*A = \mathrm{Ad}(g^{-1})A$ for every $g \in G$, i.e. $A(v_1g, \dots, v_kg) = \mathrm{Ad}(g^{-1})A(v_1, \dots, v_k)$ for every $g \in G$, every $p \in P$ and every $v_1, \dots, v_k \in T_pP$.

The linear space $\bar\Lambda^0(P; \mathfrak{g})$ is, pointwise, a Lie algebra. Considering the incarnation $C_G(P, G)$ of $\mathcal{G}$, it is seen that it is reasonable to think of $\bar\Lambda^0(P; \mathfrak{g})$ as the Lie algebra of $\mathcal{G}$, and so we set
\[ L\mathcal{G} = \bar\Lambda^0(P; \mathfrak{g}) . \tag{2.4} \]
There is an exponential map: if $H \in L\mathcal{G}$ then we define $e^H \in \hat{\mathcal{G}}$ by
\[ e^H(p) = \exp H(p) \in G . \tag{2.5} \]
Thus,
\[ \frac{\partial}{\partial t}\Big|_{t=0} e^{tH}(p) = H(p) . \tag{2.6} \]
The tangent space $T_\omega\mathcal{A}$ is $\bar\Lambda^1(P; \mathfrak{g})$, for any $\omega \in \mathcal{A}$:
\[ T_\omega\mathcal{A} = \bar\Lambda^1(P; \mathfrak{g}) . \]
We will say that a map $[0, 1] \to \mathcal{A} : t \mapsto \omega_t$ is a pointwise smooth path in $\mathcal{A}$ if $[0, 1] \times P \to T^*P \otimes LG : (t, p) \mapsto \omega_t(p)$ is smooth. For such a path we have the initial tangent vector
\[ \frac{d}{dt}\Big|_{t=0}\omega_t \in T_{\omega_0}\mathcal{A} \]
which is the $LG$-valued 1-form whose value at any $p \in P$ is $\frac{d}{dt}\big|_{t=0}\omega_t(p)$.
Similarly, from (2.3) we have the derivative of the orbit map $\gamma_\omega$:
\[ \gamma'_\omega H = \frac{d}{dt}\Big|_{t=0}\gamma_\omega(e^{tH}) = dH + [\omega, H] = D_0^\omega H \tag{2.7} \]
where the notation $D_0^\omega H$ is as explained in (2.8) below.
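2.3. The covariant derivative $D^\omega$ and the curvature $\Omega^\omega$

For a connection $\omega$, and any $LG$-valued $k$-form $\eta$ on $P$, the covariant derivative of $\eta$ is the $(k + 1)$-form on $P$ given by
\[ D^\omega\eta = D_k^\omega\eta = d\eta + [\omega \wedge \eta] , \tag{2.8} \]
where $D_k^\omega$ is the restriction of $D^\omega$ to $k$-forms. The curvature of $\omega$ is the $LG$-valued 2-form $J(\omega) = \Omega^\omega$ given by
\[ J(\omega) = \Omega^\omega = D^\omega\omega \tag{2.9} \]
and this is in $\bar\Lambda^2(P; \mathfrak{g})$. A connection is flat if its curvature is zero. A simple calculation produces the derivative
\[ J'(\omega)A = \frac{d}{dt}\Big|_{t=0} J(\omega + tA) = dA + [\omega \wedge A] = D_1^\omega A \tag{2.10} \]
which shows that $J'(\omega) : T_\omega\mathcal{A} \to L\mathcal{G}$ is in fact the linear map $D_1^\omega$. Restricting the covariant derivative to $\bar\Lambda^k(P; \mathfrak{g})$ we have
\[ D^\omega = D_k^\omega : \bar\Lambda^k(P; \mathfrak{g}) \to \bar\Lambda^{k+1}(P; \mathfrak{g}) : \eta \mapsto d\eta + [\omega \wedge \eta] . \]
A calculation shows that
\[ (D^\omega)^2\eta = \Omega^\omega \wedge \eta \tag{2.11} \]
for every $\eta \in \bar\Lambda^k(P; \mathfrak{g})$.

2.4. The chain complex $C_\omega$

If $\omega$ is flat then $(D^\omega)^2 = 0$, and so there is a chain complex $C_\omega$:
\[ 0 \to C_\omega^0 \xrightarrow{\;D_0^\omega\;} C_\omega^1 \xrightarrow{\;D_1^\omega\;} C_\omega^2 \to 0 \tag{2.12} \]
where
\[ C_\omega^0 = L\mathcal{G} , \qquad C_\omega^1 = T_\omega\mathcal{A} , \qquad C_\omega^2 = \bar\Lambda^2(P; \mathfrak{g}) . \tag{2.13} \]
If $\omega$ is a flat connection and $A \in T_\omega\mathcal{A}$ is the initial tangent to a pointwise smooth path lying entirely on the set $\mathcal{A}^0$ of flat connections, then we say that $A$ is tangent to $\mathcal{A}^0$ at $\omega$. The set of all such vectors will be denoted $T_\omega\mathcal{A}^0$. It is not being claimed that this is a linear space.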
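Since $J'(\omega) = D_1^\omega$, we have
\[ T_\omega\mathcal{A}^0 \subset \ker D_1^\omega . \tag{2.14} \]
Assume that $\omega$ is regular/irreducible enough that $D_1^\omega$ is surjective, $\ker D_0^\omega = 0$, and $T_\omega\mathcal{A}^0 = \ker D_1^\omega$. Then the cohomology of $C_\omega$ is contained in the first cohomology group
\[ H^1(C_\omega) = \ker D_1^\omega / \mathrm{Im}\, D_0^\omega = T_\omega\mathcal{A}^0 / T_\omega(\mathcal{G}\omega) \simeq T_{[\omega]}(\mathcal{A}^0/\mathcal{G}) \tag{2.15} \]
where we have taken $T_\omega(\mathcal{G}\omega)$ to be the image of $L\mathcal{G}$ under the map
\[ \gamma'_\omega = D_0^\omega : L\mathcal{G} \to T_\omega\mathcal{A} . \tag{2.16} \]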
2.5. The symplectic structure $\Omega$

Assume now that $\Sigma$ is an oriented surface, possibly with boundary. Suppose the Lie algebra $LG$ of $G$ has an Ad-invariant non-degenerate symmetric bilinear form $\langle\cdot,\cdot\rangle$. If $A, B \in T_\omega\mathcal{A}$ then there is a 2-form $\langle A \wedge B\rangle$ on $\Sigma$ whose value on any vectors $v, w \in T_m\Sigma$ is
\[ \langle A \wedge B\rangle(v, w) = \langle A(v), B(w)\rangle - \langle A(w), B(v)\rangle . \]
Integrating this 2-form over the compact, oriented 2-manifold $\Sigma$ gives
\[ \Omega_\omega(A, B) = \int_\Sigma \langle A \wedge B\rangle . \tag{2.17} \]
This specifies a constant 2-form $\Omega$ (introduced in [2]) on the infinite-dimensional affine space $\mathcal{A}$. In fact it is a symplectic structure on $\mathcal{A}$. It is readily verified that $\Omega$ is invariant under the action of $\mathcal{G}$.

The space $\bar\Lambda^2(P; \mathfrak{g})$ can be taken to be the dual space $L\mathcal{G}^*$. The pairing is given as follows: if $H \in L\mathcal{G}$ and $\eta \in \bar\Lambda^2(P; \mathfrak{g})$ then
\[ \langle\eta, H\rangle = \int_\Sigma H\eta . \]
Thus the curvature $J(\omega) = \Omega^\omega$ can be viewed as taking values in the dual $L\mathcal{G}^*$, and $J$ is then a map
\[ J : \mathcal{A} \to L\mathcal{G}^* . \tag{2.18} \]
The adjoint action of $\mathcal{G}$ on $L\mathcal{G}$ produces a dual action on $L\mathcal{G}^*$. With respect to this, $J$ is $\mathcal{G}$-equivariant. Suppose now that the compact, oriented 2-manifold $\Sigma$ has no boundary, i.e. $\Sigma$ is a closed, oriented surface. Then an application of Stokes' theorem shows
\[ \Omega_\omega(\gamma'_\omega H, B) = \langle J'(\omega)B, H\rangle . \tag{2.19} \]
Thus $J$ is a moment map for the action of $\mathcal{G}$ on the symplectic space $\mathcal{A}$.
This condition (2.19) implies that, for any flat connection $\omega$, $\Omega$ induces a skew-linear pairing in $H^1(C_\omega) = \ker D_1^\omega / \mathrm{Im}\, D_0^\omega$:
\[ \bar\Omega_{[\omega]}(\bar A, \bar B) = \Omega(A, B) \]
where $A, B \in T_\omega\mathcal{A}^0 = \ker D_1^\omega$, $[\omega]$ is the image of $\omega$ in $\mathcal{A}/\mathcal{G}$, $\bar A = A + \mathrm{Im}\, D_0^\omega$, and similarly for $\bar B$. The skew-symmetric pairing $\bar\Omega$ may be viewed as a 2-form on the moduli space of flat connections
\[ \mathcal{M}^0 = \mathcal{A}^0/\mathcal{G} = J^{-1}(0)/\mathcal{G} . \tag{2.20} \]
The Marsden–Weinstein quotient procedure, for finite-dimensional symplectic manifolds, suggests that $\bar\Omega$ is a symplectic structure on (part of) $\mathcal{M}^0$, and this may be rigorously proved (see [14]) with appropriate definitions in the context of the infinite-dimensional space $\mathcal{A}$. The pairing $\bar\Omega$ may be viewed as a “cup-product” on $H^1(C_\omega)$.

2.6. The form induced by $\Omega$ for a 2-cell

Now suppose we carry out the construction of $\Omega$ with the disk $D$ (a “cell”) as base manifold. Suppose that the boundary $\partial D$ is divided into consecutive arcs $e_1, \dots, e_N$, with $\partial D = e_N \cdots e_1$, with $e_i$ running from vertex $v_{i-1}$ to vertex $v_i$, for $i = 1, \dots, N$. Fix a section $s$ over the set $\{v_1, \dots, v_N\}$. For $\omega \in \mathcal{A}$, the space of all connections on the bundle, let $x_\omega(e_i) \in G$ be specified by
\[ \tau_\omega(e_i)s(v_{i-1}) = s(v_i)x_\omega(e_i) \]
where $\tau_\omega(e_i)$ is parallel transport by $\omega$ along $e_i$. Consider the map
\[ \Phi : \mathcal{A} \to G^N : \omega \mapsto \bigl(x_\omega(e_1), \dots, x_\omega(e_N)\bigr) . \]
The derivative $d\Phi_\omega : T_\omega\mathcal{A} \to (LG)^N$, defined by
\[ d\Phi_\omega(A) = \frac{d}{dt}\Big|_{t=0}\Phi(\omega + tA) , \]
exists, is a linear map, and is given explicitly via the Duhamel formula (3.3) below. Let $\Omega'$ be the 2-form on $G^N$ given by
\[ \Omega'_x(xA, xB) = \frac{1}{2}\sum_{1 \le i,j \le N}\epsilon_{ij}\,\bigl\langle f_{i-1}^{-1}A_i,\, f_{j-1}^{-1}B_j\bigr\rangle \tag{2.21} \]
where $f_i = \mathrm{Ad}(x_i \cdots x_1)$, $\epsilon_{ij}$ is $1$ if $i < j$, $-1$ if $i > j$, and $0$ if $i = j$, and $A = (A_1, \dots, A_N) \in (LG)^N$ and $B = (B_1, \dots, B_N) \in (LG)^N$. Let $\omega$ be a flat connection on the bundle over $D$ (more accurately, boundary arcs on $\partial D$ are pasted together in appropriate pairs to form a closed, oriented surface of positive genus). It is essentially proven in [14] and [7] that for any $A, B \in T_\omega\mathcal{A}^0$,
\[ \Omega_\omega(A, B) = \Omega'(d\Phi_\omega A, d\Phi_\omega B) . \tag{2.22} \]
This is the reason for our interest in the 2-form $\Omega'$.
3. Connections over Cell Complexes

Our next objective is to understand how the continuum structures described in the preceding section restrict over a cell-decomposition of the underlying surface.

3.1. The cell complex

Here we shall state all the conditions specifying the type of cell complexes we shall be working with in this paper. Let $X$ be a compact Hausdorff topological space. Let $B_k$ be the closed unit ball in $\mathbb{R}^k$, with $B_0$ being just a one-point set. By an open $k$-cell in $X$ we mean an open subset of $X$ which is the homeomorphic image of $\mathrm{Int}(B_k)$. We consider $X$, together with a finite family of cells $e_i$ which cover all of $X$, and for each $k$-cell a continuous “attachment map” $f : B_k \to X$ which carries the interior of $B_k$ homeomorphically onto the cell. It is also assumed that this attachment map carries the boundary of $B_k$ into a set which is the union of lower dimensional cells. Zero-cells, or vertices, shall typically be denoted $v_i$, 1-cells are edges, denoted by $e_\alpha$, and 2-cells by $\sigma$.

The general framework of discrete connections over cell complexes which we set up is meaningful without additional hypotheses. However, many of our results, for instance those in Sec. 3.4 and Sec. 4, require certain additional hypotheses, which we state below. It is possible (as has been remarked by a referee) that these additional conditions imply that the cell complex is homotopically equivalent to a closed, oriented 2-manifold. The results we prove are discrete versions of the continuum results already known from the Atiyah–Bott theory, and one may hope for analogous, but possibly different, results if the conditions below are altered.

For the purposes of results proven in Sec. 3.4 and beyond, we shall assume that there is no boundary for the complex and
• each 0-cell is on the boundary of a 1-cell, and each 1-cell is on the boundary of a 2-cell.

Only the existence of the attachment maps is part of the definition of a CW complex, but for our purposes we often need to consider 2-cells with orientation and base point, specifying a particular loop going around the boundary of the cell. Consider a 2-cell $\sigma$, and an attachment map $f : B_2 \to \mathrm{Closure}(\sigma)$.
We will assume that on the boundary of B2 there are points v0 , . . . , vn = v0 , sequenced in the positive sense, such that f restricts on each open arc (vi−1 , vi ) to a 1-cell. We shall eventually need each 2-cell equipped with this additional data. Note that we are imposing some additional condition on the nature of the cell complex: for instance, the representation of a 2-sphere using a single 2-cell along with a 0-cell is excluded. We shall say that the complex is oriented if it is possible to choose an orientation on each 2-cell in such a way that
• if a 1-cell lies once each on the boundary of two 2-cells then these cells induce opposite orientations on the 1-cell, and each 1-cell lies on the boundary of at most two 2-cells;
• a 1-cell can appear at most twice on the boundary of a 2-cell, and if it does appear twice on such a boundary then the induced orientations are opposite to each other.

3.2. Connections over cell complexes

We shall define the notions of “connection” and “gauge transformations” over a cell complex. Later we shall show how the continuum notions induce these corresponding structures over cell complexes. In this section $C$ is a cell complex and $G$ a Lie group, with Lie algebra denoted $LG$.

A connection, with values in the group $G$, over the complex $C$ is a map $x$ from the set of oriented 1-cells to $G$, such that for any 1-cell $e$
\[ x(\bar e) = x(e)^{-1} \]
where $\bar e$ is the orientation reverse of $e$. If $\kappa$ is a “curve” in the complex, i.e. a sequence of oriented 1-cells $e_1, \dots, e_N$, each ending where the next begins, then we write
\[ x(\kappa) = x(e_N) \cdots x(e_1) . \]
Now let $x$ be a connection and $\sigma$ a based, oriented 2-cell. We will call $x(\partial\sigma)$ the curvature of the connection $x$ over $\sigma$, and denote it by $K_x(\sigma)$:
\[ K_x(\sigma) = x(\partial\sigma) . \tag{3.1} \]
When the base point of $\sigma$ is changed, $K_x(\sigma)$ is conjugated. The set of all connections, with values in a fixed group $G$, over $C$ will be denoted $\mathcal{A}_C$. A connection $x$ is flat if $x(\partial\sigma) = e$, the identity in $G$, for every 2-cell $\sigma$.

A gauge transformation is a map $\theta : \{\text{all 0-cells}\} \to G$. The set of all gauge transformations is a group, to be denoted $\mathcal{G}_C$, under pointwise multiplication. The group $\mathcal{G}_C$ acts on $\mathcal{A}_C$ as follows:
\[ x^\theta(ab) = \theta(b)^{-1}x(ab)\theta(a) \tag{3.2} \]
for any oriented 1-cell $ab$, running from $a$ to $b$.

3.3. From the continuum to the discrete case

Let $\pi : P \to \Sigma$ be a principal $G$-bundle. We assume that $C$ is a cell-decomposition of $\Sigma$, and each 1-cell $e : [0, 1] \to \Sigma$ is smooth. Fix a section $\theta$ of $P$ over the 0-cells. Let $\omega$ be a flat connection on $P$.
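Before passing to the continuum comparison, the purely discrete definitions of Sec. 3.2 (the holonomy $x(\kappa)$, the curvature (3.1), and the gauge action (3.2)) can be illustrated numerically. The following is only a sketch under assumptions of our own choosing: $G = SU(2)$ is realized as unit quaternions, the complex is a single triangular 2-cell with vertices 0, 1, 2, and all function names (`qmul`, `holonomy`, `gauge_transform`, and so on) are hypothetical helpers introduced for this illustration.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qinv(q):
    """Inverse of a unit quaternion, i.e. its conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def random_unit(rng):
    """A random element of SU(2), realized as a unit quaternion."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def close(p, q, tol=1e-12):
    """Componentwise comparison of two tuples."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

def holonomy(x, curve):
    """x(kappa) = x(e_N) ... x(e_1) for the curve kappa = [e_1, ..., e_N]."""
    h = (1.0, 0.0, 0.0, 0.0)
    for e in curve:
        h = qmul(x[e], h)
    return h

def gauge_transform(x, theta):
    """The action (3.2): x^theta(ab) = theta(b)^{-1} x(ab) theta(a)."""
    return {(a, b): qmul(qmul(qinv(theta[b]), g), theta[a])
            for (a, b), g in x.items()}

rng = random.Random(0)
# Assumed toy complex: one triangular 2-cell; oriented 1-cells are (tail, head).
x = {(0, 1): random_unit(rng), (1, 2): random_unit(rng), (2, 0): random_unit(rng)}
for (a, b), g in list(x.items()):
    x[(b, a)] = qinv(g)               # x(e-bar) = x(e)^{-1}

boundary_at_0 = [(0, 1), (1, 2), (2, 0)]
boundary_at_1 = [(1, 2), (2, 0), (0, 1)]
K0 = holonomy(x, boundary_at_0)       # curvature K_x(sigma), base point 0
K1 = holonomy(x, boundary_at_1)       # same 2-cell, base point 1

theta = {v: random_unit(rng) for v in range(3)}
x_theta = gauge_transform(x, theta)
K0_theta = holonomy(x_theta, boundary_at_0)

# Moving the base point from 0 to 1 conjugates the curvature by x(01).
base_point_ok = close(K1, qmul(qmul(x[(0, 1)], K0), qinv(x[(0, 1)])))
# A gauge transformation conjugates the curvature by theta at the base point.
gauge_ok = close(K0_theta, qmul(qmul(qinv(theta[0]), K0), theta[0]))
```

In this sketch both flags come out true for any choice of random seed, which reflects the two conjugation statements: a change of base point conjugates $K_x(\sigma)$, and $K_{x^\theta}(\sigma) = \theta(a)^{-1}K_x(\sigma)\theta(a)$ for the base point $a$. In particular flatness is a gauge-invariant condition.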
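Define $x_\omega \in \mathcal{A}_C$ as follows: if $e$ is a 1-cell running from $a$ to $b$,
\[ \tau_\omega(e)\theta(a) = \theta(b)x_\omega(e) \]
where $\tau_\omega(e)$ is parallel transport by $\omega$ along the oriented 1-cell $e$. Then
\[ \tau_\omega(\bar e)\theta(b) = \theta(a)x_\omega(e)^{-1} . \]
So $x_\omega(\bar e) = x_\omega(e)^{-1}$. Let $\tilde e_\omega$ be the $\omega$-horizontal lift over $e$; thus
\[ \tilde e_\omega(t) = \tau_\omega(e|[0, t])\theta(a) . \]
Let $A \in T_\omega\mathcal{A}$. Define
\[ A(e) = -\int_{\tilde e_\omega} A . \]
It is proven in [14, Eq. (3.6g)] that $A(e)$ is the derivative of $\omega \mapsto x_\omega(e)$ at $\omega$ along the vector $A$:
\[ A(e) = x_\omega(e)^{-1}\frac{d}{dt}\Big|_{t=0}x_{\omega+tA}(e) . \tag{3.3} \]
This is the Duhamel formula. Then
\[ \begin{aligned} \tilde{\bar e}_\omega(t) &= \tau_\omega\bigl(\bar e\,|\,[0, t]\bigr)\theta(b) \\ &= \tau_\omega\bigl(\bar e\,|\,[0, t]\bigr)\tau_\omega(e)\theta(a)x_\omega(e)^{-1} \\ &= \tau_\omega\bigl(e\,|\,[0, 1-t]\bigr)\theta(a)x_\omega(e)^{-1} \\ &= \tilde e_\omega(1-t)x_\omega(e)^{-1} . \end{aligned} \]
So $\tilde{\bar e}_\omega(1) = \theta(a)x_\omega(e)^{-1}$ and
\[ A\bigl(\tilde{\bar e}'_\omega(t)\bigr) = -\mathrm{Ad}\,x_\omega(e)\,A\bigl(\tilde e'_\omega(1-t)\bigr) . \]
So
\[ A(\bar e) = \mathrm{Ad}\,x_\omega(e)\int_{\tilde e_\omega} A = -\mathrm{Ad}\,x_\omega(e)\,A(e) . \tag{3.4} \]
Thus the discrete analog of $A$ can be taken to be a mapping, also denoted $A$, which assigns to each oriented 1-cell an element $A(e)$ in $LG$ such that
\[ A(\bar e) = -\mathrm{Ad}\,x_\omega(e)\,A(e) . \tag{3.5} \]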
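Thus $A \in C^1_{x_\omega}$. Now for $H \in C_\omega^0$,
\[ \begin{aligned} -\int_{\tilde e_\omega} D^\omega H &= -\int_{\tilde e_\omega}\bigl(dH + [\omega, H]\bigr) \\ &= -\int_{\tilde e_\omega} dH \qquad (\text{since } \omega \text{ is } 0 \text{ on } \tilde e_\omega) \\ &= H\bigl(\theta(a)\bigr) - \mathrm{Ad}\,x_\omega(e)^{-1}H\bigl(\theta(b)\bigr) . \end{aligned} \]
If $\omega$ is flat, the 2-cell $\sigma$ has a “horizontal lift” $\tilde\sigma$ in the interior $\sigma^0$ of $\sigma$. Then, for any such based, oriented 2-cell $\sigma$, we have
\[ \begin{aligned} \int_{\tilde\sigma} D^\omega A &= \int_{\tilde\sigma}\bigl(dA + [\omega \wedge A]\bigr) \\ &= \int_{\tilde\sigma} dA \\ &= \int_{\partial\tilde\sigma} A \\ &= A(e_1) + \mathrm{Ad}\,x_\omega(e_1)^{-1}A(e_2) + \cdots + \mathrm{Ad}\bigl(x_\omega(e_{N-1}) \cdots x_\omega(e_1)\bigr)^{-1}A(e_N) \\ &= D_1^{x_\omega}A(\sigma) \end{aligned} \]
where $\partial\sigma = e_N \cdots e_1$. In view of this, we take the discrete analog of $D^\omega A$ to be $D_1^{x_\omega}A$, which associates to each based, oriented 2-cell $\sigma$ the element of $LG$ given by
\[ D_1^{x_\omega}A(\sigma) = A(e_1) + \mathrm{Ad}\,x_\omega(e_1)^{-1}A(e_2) + \cdots + \mathrm{Ad}\bigl(x_\omega(e_{N-1}) \cdots x_\omega(e_1)\bigr)^{-1}A(e_N) . \tag{3.6} \]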
3.4. The chain complex associated to a flat connection

In this section we shall describe a chain complex associated with a flat connection over a cell complex. If $x$ is a connection then we have the tangent space $T_x\mathcal{A}_C$, consisting of all maps $A : \{\text{1-cells}\} \to LG$ such that
\[ A(\bar e) = -\mathrm{Ad}\,x(e)\,A(e) \tag{3.7} \]
for every oriented 1-cell $e$. This condition is obtained by “differentiating” the relation $x(\bar e) = x(e)^{-1}$ with respect to $x(e)$ along the vector $A(e)$.
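For the reader's convenience, the differentiation behind (3.7) can be spelled out. This is only a sketch, written for a matrix group and under the assumption (consistent with the Duhamel formula (3.3)) that tangent vectors are left-translated to the identity, so that $A(e) = x(e)^{-1}\dot x(e)$ for a path $t \mapsto x_t$ of connections with $x_0 = x$; the path $x_t$ and the notation $\dot x(e) = \frac{d}{dt}\big|_{t=0}x_t(e)$ are introduced here purely for illustration.

```latex
\begin{align*}
A(\bar e)
  &= x(\bar e)^{-1}\,\frac{d}{dt}\Big|_{t=0} x_t(\bar e)
   = x(e)\,\frac{d}{dt}\Big|_{t=0} x_t(e)^{-1} \\
  &= -\,x(e)\,x(e)^{-1}\,\dot x(e)\,x(e)^{-1}
   = -\,x(e)\,\bigl(x(e)^{-1}\dot x(e)\bigr)\,x(e)^{-1} \\
  &= -\,\mathrm{Ad}\,x(e)\,A(e) .
\end{align*}
```

The second line uses the standard derivative of the inverse, $\frac{d}{dt}\,g_t^{-1} = -g_t^{-1}\dot g_t\,g_t^{-1}$.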
We shall also denote $T_x\mathcal{A}_C$ by $C_x^1$:
\[ C_x^1 = T_x\mathcal{A}_C . \tag{3.8} \]
Assume now that $x$ is a flat connection. In the following we shall use our requirement that a based, oriented 2-cell $\sigma$ comes with a specified based loop tracing out $\partial\sigma$. Let $C_x^2$ be the set of all maps $f : \{\text{2-cells}\} \to LG$ such that $f(\bar\sigma) = -f(\sigma)$ and, if $\sigma_a$ is an oriented cell based at $a$, and $\sigma_b$ the same cell, based now at $b$, then
\[ f(\sigma_b) = \mathrm{Ad}\,x(ab)\,f(\sigma_a) \tag{3.9} \]
where $ab$ is the curve formed by the sequence of 1-cells on $\partial\sigma_a$ running positively from $a$ to $b$. Because $x(\partial\sigma) = e$, the identity, the condition defining $f$ is self-consistent. Finally, let
\[ C_x^0 = \{\text{all maps } H : \{\text{all 0-cells}\} \to LG\} . \tag{3.10} \]
It is useful to note that this may be viewed as the Lie algebra of the group of gauge transformations:
\[ L\mathcal{G}_C = C_x^0 . \tag{3.11} \]
Recall from (3.2) how the group of gauge transformations $\mathcal{G}_C$ acts on the space of connections $\mathcal{A}_C$. The orbit map through $x \in \mathcal{A}_C$,
\[ \gamma_x : \mathcal{G}_C \to \mathcal{A}_C : \theta \mapsto x^\theta , \tag{3.12} \]
has the derivative $\gamma'_x : L\mathcal{G}_C \to T_x\mathcal{A}_C$ given by
\[ \gamma'_x(Y)(ab) = Y(a) - \mathrm{Ad}\,x(ab)^{-1}\,Y(b) \tag{3.13} \]
for every 1-cell $ab$ running from $a$ to $b$. Compare with the continuum equations (2.7). The map $\gamma'_x$ will also be denoted $D_0^x$:
\[ D_0^x = \gamma'_x : C_x^0 \to C_x^1 . \tag{3.14} \]
If $\sigma$ is a based, oriented 2-cell then we have the derivative of the map $x \mapsto K_x(\sigma)$, left translated back to the identity in $G$:
\[ dK_x(\sigma) : T_x\mathcal{A}_C \to LG \tag{3.15} \]
\[ A \mapsto A(e_1) + \mathrm{Ad}\,x(e_1)^{-1}A(e_2) + \cdots + \mathrm{Ad}\bigl(x(e_{N-1}) \cdots x(e_1)\bigr)^{-1}A(e_N) \tag{3.16} \]
where $\partial\sigma = e_N \cdots e_1$. Sometimes, however, it will be convenient to take the derivative as a map $dK_x(\sigma) : T_x\mathcal{A}_C \to T_{K_x(\sigma)}G$, and write the map in (3.15) as $K_x(\sigma)^{-1}dK_x(\sigma)$. If $x$ is flat then, as is readily verified, $dK_x(\sigma)$ is independent of the choice of base point. For flat $x$ we define
\[ D_1^x : C_x^1 \to C_x^2 : A \mapsto D_1^xA \tag{3.17} \]
where $D_1^xA$ is the element of $C_x^2$ specified by
\[ D_1^xA(\sigma) = dK_x(\sigma)A \tag{3.18} \]
for each based, oriented 2-cell $\sigma$. Our first observation is

Lemma 3.1. For any flat connection $x$,
\[ D_1^xD_0^x = 0 . \]

Proof. Let $H \in C_x^0$ and $\sigma$ a based, oriented 2-cell. Write $\partial\sigma = e_N \cdots e_1$, where the $e_i$ are oriented 1-cells, and $e_i$ runs from $v_{i-1}$ to $v_i$. Then
\[ \begin{aligned} D_1^xD_0^xH(\sigma) &= \sum_{i=1}^N \mathrm{Ad}\bigl(x(e_{i-1}) \cdots x(e_1)\bigr)^{-1}\Bigl[H(v_{i-1}) - \mathrm{Ad}\,x(e_i)^{-1}H(v_i)\Bigr] \\ &= H(v_0) - \mathrm{Ad}\,K_x(\sigma)^{-1}H(v_0) \\ &= 0 , \end{aligned} \]
the last equality following from $K_x(\sigma) = x(\partial\sigma) = e$, the identity in $G$, because $x$ is flat.

Thus we obtain a chain complex:
\[ 0 \to C_x^0 \to C_x^1 \to C_x^2 \to 0 \tag{3.19} \]
where the first differential is $D_0^x$ and the second $D_1^x$. The zeroth cohomology is $H^0(C_x) = \ker D_0^x$.

Proposition 3.2. Suppose that, for each 0-cell $a$, $Z = 0$ is the only vector in $LG$ which satisfies $\mathrm{Ad}\,x(L)\,Z = Z$ for every loop $L$ in $C$ based at $a$. Then
\[ H^0(C_x) = \{0\} . \]
Proof. Suppose Y ∈ ker D0x and a is any 0-cell. By our hypotheses on the cell complex C, a is the base point of some 1-cell e1 , and e1 is part of the boundary of some 2-cell. So, in particular, there exist loops with base point a.
If $L = e_N \cdots e_1$ is any loop based at $a$, with the 1-cell $e_i$ running from vertex $a_{i-1}$ to vertex $a_i$, then the definition of $D_0^x$ shows that
\[ Y(a_{i-1}) = \mathrm{Ad}\,x(e_i)^{-1}\,Y(a_i) \]
holding for $i = 1, \dots, N$. Combining these we have, using $a = a_0$,
\[ Y(a) = \mathrm{Ad}\,x(L)\,Y(a) . \]
Since this holds for every loop $L$ based at $a$, it follows by the hypothesis that $Y(a) = 0$. Since $a$ is an arbitrary 0-cell of the complex, we have $Y = 0$.

The first cohomology group $H^1(C_x)$ is:
\[ H^1(C_x) = \ker D_1^x / \mathrm{Im}\, D_0^x . \tag{3.20} \]
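The telescoping computation in the proof of Lemma 3.1 can be checked numerically on the boundary of a single 2-cell. The sketch below rests on assumptions of our own: $G = SU(2)$ is realized as unit quaternions, $LG$ as $\mathbb{R}^3$ with the adjoint action given by rotation, the flat connection is taken to be a pure gauge $x(e_i) = \theta(v_i)^{-1}\theta(v_{i-1})$, and the helper names (`Ad`, `D0`, `D1`, and so on) are hypothetical. It verifies the identity $D_1^xD_0^xH(\sigma) = H(v_0) - \mathrm{Ad}\,K_x(\sigma)^{-1}H(v_0)$ for an arbitrary connection, and its vanishing in the flat case.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qinv(q):
    """Inverse of a unit quaternion, i.e. its conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def random_unit(rng):
    """A random element of SU(2), realized as a unit quaternion."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def random_vec(rng):
    """A random element of LG, identified with R^3."""
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

def Ad(g, X):
    """Adjoint action of the unit quaternion g on X in R^3: g X g^{-1}."""
    w, x, y, z = qmul(qmul(g, (0.0, X[0], X[1], X[2])), qinv(g))
    return (x, y, z)

def add(X, Y):
    return tuple(a + b for a, b in zip(X, Y))

def sub(X, Y):
    return tuple(a - b for a, b in zip(X, Y))

def D0(xs, H):
    """(D0 H)(e_i) = H(v_{i-1}) - Ad x(e_i)^{-1} H(v_i); H = [H(v_0), ..., H(v_N)]."""
    return [sub(H[i], Ad(qinv(xs[i]), H[i + 1])) for i in range(len(xs))]

def D1(xs, A):
    """Return (dK_x(sigma)A, K_x(sigma)) as in (3.16), for boundary e_N ... e_1."""
    total = (0.0, 0.0, 0.0)
    prefix = (1.0, 0.0, 0.0, 0.0)      # x(e_{i-1}) ... x(e_1)
    for xe, Ae in zip(xs, A):
        total = add(total, Ad(qinv(prefix), Ae))
        prefix = qmul(xe, prefix)
    return total, prefix

rng = random.Random(1)
N = 5
H = [random_vec(rng) for _ in range(N)]
H.append(H[0])                          # v_N = v_0

# Arbitrary (non-flat) connection on the boundary of one 2-cell.
xs = [random_unit(rng) for _ in range(N)]
value, K = D1(xs, D0(xs, H))
defect = sub(value, sub(H[0], Ad(qinv(K), H[0])))

# Flat connection, taken here to be a pure gauge.
theta = [random_unit(rng) for _ in range(N)]
theta.append(theta[0])
flat = [qmul(qinv(theta[i + 1]), theta[i]) for i in range(N)]
flat_value, flat_K = D1(flat, D0(flat, H))
```

Here `defect` measures the failure of the telescoping identity for a generic connection, and `flat_value` is $D_1^xD_0^xH(\sigma)$ for the flat one; both should vanish up to rounding error.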
Define the tangent space
\[ T_x\mathcal{A}_C^0 \tag{3.21} \]
to be the set of all vectors in $T_x\mathcal{A}_C$ which are initial tangents to $C^\infty$ paths lying entirely on $\mathcal{A}_C^0 \subset \mathcal{A}_C$. If $x$ is a flat connection then $K_x(\sigma) = e$ for every 2-cell $\sigma$ and so
\[ T_x\mathcal{A}_C^0 \subset \ker D_1^x . \tag{3.22} \]
We shall say that a flat connection $x$ is regular if
\[ T_x\mathcal{A}_C^0 = \ker D_1^x . \tag{3.23} \]
The following result is now clear:

Proposition 3.3. For any regular flat connection $x$,
\[ T_x\mathcal{A}_C^0 / \gamma'_x(L\mathcal{G}_C) = H^1(C_x) . \tag{3.24} \]
The left side in (3.24) could be viewed as the tangent space at $[x] \in \mathcal{A}_C^0/\mathcal{G}_C$ of the moduli space $\mathcal{A}_C^0/\mathcal{G}_C$.

4. The 2-form $\Omega$

In this section, hypotheses and notation will be as in the previous section. The gauge group $G$ has Lie algebra $LG$ equipped with an Ad-invariant symmetric bilinear form $\langle\cdot,\cdot\rangle$. In particular, we work with connections over a cell complex $C$ satisfying all the conditions described in Sec. 3.1. We shall construct and study a special 2-form $\Omega$ on the space $\mathcal{A}_C^0$ of flat connections.
4.1. Definition of the 2-form $\Omega$

Let $x \in \mathcal{A}_C$ and consider tangent vectors $A, B \in T_x\mathcal{A}_C$. Let $\sigma$ be a based, oriented 2-cell whose boundary $\partial\sigma$ is $e_N \cdots e_1$, where $e_i$ is the 1-cell from $a_{i-1}$ to $a_i$. Define
\[ \Omega_\sigma(A, B)_x = \Omega_x(A, B)_\sigma = \frac{1}{2}\sum_{1 \le i,j \le N}\epsilon_{ij}\,\bigl\langle f_{i-1}^{-1}A(e_i),\, f_{j-1}^{-1}B(e_j)\bigr\rangle \tag{4.1} \]
where $\epsilon_{ij}$ is $1$ if $i < j$, is $-1$ if $i > j$, and is $0$ if $i = j$, and
\[ f_i = \mathrm{Ad}\bigl(x(e_i) \cdots x(e_1)\bigr) \tag{4.2} \]
with $f_0$ being the identity map on $LG$. The reason for considering this 2-form is that in the continuum situation there is the Atiyah–Bott symplectic form on the space of all connections and our formulas (2.22) and (2.21) show that at the discrete level we then have the induced 2-form $\Omega_\sigma(\cdot, \cdot)_x$. In general, the value $\Omega_x(A, B)_\sigma$ will depend on the choice of base point on $\sigma$. However, for flat connections we have:

Lemma 4.1. Let $x$ be a flat connection. If $A, B \in \ker D_1^x$ then $\Omega_x(A, B)_\sigma$ is independent of the choice of the base point of the oriented 2-cell $\sigma$.

Proof. Let $\partial\sigma = e_N \cdots e_1$, with $e_i$ running from $v_{i-1}$ to $v_i$, where $v_N = v_0$ is the base point. Moving the base point over to $v_1$, we call the 2-cell $\sigma'$. So $\partial\sigma' = e_1e_N \cdots e_2$. The terms in $\Omega_x(A, B)_\sigma$ and $\Omega_x(A, B)_{\sigma'}$ are identical except for those which involve $A(e_1)$ and those which involve $B(e_1)$. With this in mind we write
\[ \begin{aligned} \Omega_x(A, B)_\sigma &= \frac{1}{2}\sum_{i,j}\epsilon_{ij}\,\bigl\langle f_{i-1}^{-1}A(e_i),\, f_{j-1}^{-1}B(e_j)\bigr\rangle \\ &= \frac{1}{2}\langle A(e_1), L\rangle + \frac{1}{2}\sum_{i=2}^N\Bigl\langle f_{i-1}^{-1}A(e_i),\, \sum_{j=i+1}^N f_{j-1}^{-1}B(e_j) - \sum_{j=1}^{i-1}f_{j-1}^{-1}B(e_j)\Bigr\rangle \end{aligned} \]
where
\[ L = \sum_{j=2}^N f_{j-1}^{-1}B(e_j) . \]
Using $\mathrm{Ad}\bigl(x(e_i) \cdots x(e_2)\bigr)^{-1} = \mathrm{Ad}\,x(e_1)\,f_i^{-1}$, we also have
\[ \begin{aligned} \Omega_x(A, B)_{\sigma'} = {} & -\frac{1}{2}\Bigl\langle \mathrm{Ad}\,x(e_1)\,f_N^{-1}A(e_1),\, \mathrm{Ad}\,x(e_1)\sum_{j=2}^N f_{j-1}^{-1}B(e_j)\Bigr\rangle \\ & + \frac{1}{2}\sum_{i=2}^N\bigl\langle \mathrm{Ad}\,x(e_1)\,f_{i-1}^{-1}A(e_i),\, \mathrm{Ad}\,x(e_1)\,M\bigr\rangle \end{aligned} \tag{4.3} \]
where
\[ M = \sum_{j=i+1}^N f_{j-1}^{-1}B(e_j) + f_N^{-1}B(e_1) - \sum_{j=2}^{i-1}f_{j-1}^{-1}B(e_j) . \tag{4.4} \]
Then
\[ \begin{aligned} 2\Omega_x(A, B)_\sigma - 2\Omega_x(A, B)_{\sigma'} &= \Bigl\langle (1 + f_N^{-1})A(e_1),\, \sum_{j=2}^N f_{j-1}^{-1}B(e_j)\Bigr\rangle + \sum_{i=2}^N\bigl\langle f_{i-1}^{-1}A(e_i),\, -B(e_1) - f_N^{-1}B(e_1)\bigr\rangle \\ &= \Bigl\langle (1 + f_N^{-1})A(e_1),\, \sum_{j=1}^N f_{j-1}^{-1}B(e_j) - B(e_1)\Bigr\rangle - \Bigl\langle \sum_{i=1}^N f_{i-1}^{-1}A(e_i) - A(e_1),\, (1 + f_N^{-1})B(e_1)\Bigr\rangle \\ &= \bigl\langle (f_N - f_N^{-1})A(e_1),\, B(e_1)\bigr\rangle + \Bigl\langle (1 + f_N^{-1})A(e_1),\, \sum_{j=1}^N f_{j-1}^{-1}B(e_j)\Bigr\rangle - \Bigl\langle \sum_{i=1}^N f_{i-1}^{-1}A(e_i),\, (1 + f_N^{-1})B(e_1)\Bigr\rangle \\ &= \bigl\langle (f_N - f_N^{-1})A(e_1),\, B(e_1)\bigr\rangle + \bigl\langle (1 + f_N^{-1})A(e_1),\, dK_\sigma(x)B\bigr\rangle - \bigl\langle (1 + f_N^{-1})B(e_1),\, dK_\sigma(x)A\bigr\rangle . \end{aligned} \]
In this, using $f_N = \text{Identity}$ (because $x$ is flat), the first term is 0, and the hypothesis that $A, B \in \ker D_1^x$ shows that the remaining terms also equal zero.

We work with the hypothesis that $C$ is oriented. Then each 2-cell comes with a favored orientation, and, moreover, we have also assumed that the boundary of the 2-cell can be expressed as a sequence of oriented 1-cells, starting at a specific base point (depending on the 2-cell). Define
\[ \Omega_x(A, B) = \sum_{\text{all 2-cells } \sigma}\Omega_x(A, B)_\sigma \tag{4.5} \]
where the sum is over all positively oriented 2-cells, with base points as chosen. Clearly $\Omega_x$ is skew-symmetric, and so $\Omega$ is a 2-form on $\mathcal{A}_C$.
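As an informal check of (4.1) and of Lemma 4.1, one can evaluate $\Omega_x(A, B)_\sigma$ numerically for a single 2-cell. The sketch below makes several assumptions that are ours and not part of the text: $G = SU(2)$ as unit quaternions, $LG = \mathbb{R}^3$ with the Euclidean inner product (which is Ad-invariant), $f_i = \mathrm{Ad}(x(e_i) \cdots x(e_1))$ with $f_0$ the identity, a flat connection given as a pure gauge, and elements of $\ker D_1^x$ produced as $D_0^xH$ (legitimate by Lemma 3.1). All function names are hypothetical.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qinv(q):
    """Inverse of a unit quaternion, i.e. its conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def random_unit(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def random_vec(rng):
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

def Ad(g, X):
    """Adjoint action of the unit quaternion g on X in R^3: g X g^{-1}."""
    w, x, y, z = qmul(qmul(g, (0.0, X[0], X[1], X[2])), qinv(g))
    return (x, y, z)

def sub(X, Y):
    return tuple(a - b for a, b in zip(X, Y))

def dot(X, Y):
    return sum(a * b for a, b in zip(X, Y))

def D0(xs, H):
    """(D0 H)(e_i) = H(v_{i-1}) - Ad x(e_i)^{-1} H(v_i); H = [H(v_0), ..., H(v_N)]."""
    return [sub(H[i], Ad(qinv(xs[i]), H[i + 1])) for i in range(len(xs))]

def omega_sigma(xs, A, B):
    """(4.1): (1/2) sum_{i,j} eps_ij <f_{i-1}^{-1} A(e_i), f_{j-1}^{-1} B(e_j)>."""
    a, b = [], []
    prefix = (1.0, 0.0, 0.0, 0.0)      # x(e_{i-1}) ... x(e_1), so f_{i-1} = Ad(prefix)
    for xe, Ae, Be in zip(xs, A, B):
        a.append(Ad(qinv(prefix), Ae))
        b.append(Ad(qinv(prefix), Be))
        prefix = qmul(xe, prefix)
    total = 0.0
    for i in range(len(xs)):
        for j in range(len(xs)):
            eps = (i < j) - (i > j)    # +1, -1 or 0
            total += eps * dot(a[i], b[j])
    return 0.5 * total

def rotate(seq):
    """Move the base point from v_0 to v_1: boundary e_1 e_N ... e_2."""
    return seq[1:] + seq[:1]

rng = random.Random(2)
N = 5

# Skew-symmetry holds for any connection and any pair of vectors.
xs = [random_unit(rng) for _ in range(N)]
A = [random_vec(rng) for _ in range(N)]
B = [random_vec(rng) for _ in range(N)]
skew_defect = omega_sigma(xs, A, B) + omega_sigma(xs, B, A)

# Base-point independence for a flat connection and vectors in ker D1.
theta = [random_unit(rng) for _ in range(N)]
theta.append(theta[0])
flat = [qmul(qinv(theta[i + 1]), theta[i]) for i in range(N)]
H1 = [random_vec(rng) for _ in range(N)]
H1.append(H1[0])
H2 = [random_vec(rng) for _ in range(N)]
H2.append(H2[0])
A0, B0 = D0(flat, H1), D0(flat, H2)
base_point_defect = (omega_sigma(flat, A0, B0)
                     - omega_sigma(rotate(flat), rotate(A0), rotate(B0)))
```

Both `skew_defect` and `base_point_defect` should vanish up to rounding. Replacing `flat` by a generic connection in the last computation would in general give a nonzero difference, in line with the remark preceding Lemma 4.1.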
4.2. $\Omega$ is closed

Let $\sigma$ be a based, oriented 2-cell with boundary $\partial\sigma = e_N \cdots e_1$. Consider the map $\Phi_\sigma : \mathcal{A}_C \to G^N$ given by
\[ \Phi_\sigma(x) = \bigl(x(e_1),\, x(e_2)x(e_1),\, \dots,\, x(e_N) \cdots x(e_1)\bigr) . \]
Now let $\omega$ be the 2-form on $G^N$ given by
\[ \omega_x(xA, xB) = \sum_{i,j}\epsilon_{ij}\,\langle A_i - A_{i-1},\, B_j - B_{j-1}\rangle \]
where $A, B \in (LG)^N$. Then
\[ \frac{1}{2}\Phi_\sigma^*\omega = \Omega_\sigma \tag{4.6} \]
and so
\[ \Omega = \sum_\sigma\frac{1}{2}\Phi_\sigma^*\omega \tag{4.7} \]
where the sum is over all 2-cells, each taken once only, with specified base point and orientation as specified by the orientation of the cell complex $C$. Using this we prove:

Theorem 4.2. The 2-form $\Omega$ is closed:
\[ d\Omega = 0 . \tag{4.8} \]

Proof. From (4.7) and using the expression for $d\omega$ given in Lemma 4.3 below, we have
\[ d\Omega_x(A, B, C) = -\frac{1}{2}\sum_\sigma\sum_i\bigl\langle A(e_i),\, [B(e_i), C(e_i)]\bigr\rangle \tag{4.9} \]
where the sum $\sum_i$ is over the oriented 1-cells on the boundary of $\sigma$. Now each 1-cell appears in the sum on the right in (4.9) exactly twice: if $e$ appears on the boundary of some $\sigma$ then there is a $\sigma'$ (possibly the same as $\sigma$) such that the orientation reverse $\bar e$ appears on the boundary of $\sigma'$. Moreover, $A(\bar e) = -\mathrm{Ad}(x(e))A(e)$ and similarly for $B(\bar e)$ and $C(\bar e)$. Since
\[ \bigl\langle -\mathrm{Ad}(x(e))A(e),\, [-\mathrm{Ad}(x(e))B(e),\, -\mathrm{Ad}(x(e))C(e)]\bigr\rangle = -\bigl\langle A(e),\, [B(e), C(e)]\bigr\rangle \]
it follows that the terms in the sum on the right in (4.9) cancel in pairs, leaving a sum of 0.

We have used the following result [7, Proposition E], for which we include a proof for completeness.
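Lemma 4.3. Let $\omega$ be the 2-form on $G^N$ given by
\[ \omega_x(xA, xB) = \sum_{i,j}\epsilon_{ij}\,\langle A_i - A_{i-1},\, B_j - B_{j-1}\rangle \]
where $A, B \in (LG)^N$. Then
\[ \omega_x(xA, xB) = \sum_{i=1}^N\bigl(\langle A_{i-1}, B_i\rangle - \langle A_i, B_{i-1}\rangle\bigr) \]
and
\[ d\omega = -\sum_{i=1}^{N+1}\bigl\langle A_i - A_{i-1},\, [B_i - B_{i-1},\, C_i - C_{i-1}]\bigr\rangle \]
where, for any vector $X = (X_1, \dots, X_N) \in (LG)^N$, we take $X_i = 0$ for $i \le 0$ and $i > N$.

Proof. We have
\[ \begin{aligned} \omega_x(xA, xB) &= \sum_{i=1}^N\bigl\langle A_i - A_{i-1},\, B_N - B_i - (B_{i-1} - B_0)\bigr\rangle \\ &= \sum_{i=1}^N\bigl(\langle A_{i-1}, B_i\rangle - \langle A_i, B_{i-1}\rangle\bigr) + \sum_{i=1}^N\bigl(\langle A_{i-1}, B_{i-1}\rangle - \langle A_i, B_i\rangle\bigr) + \langle A_N - A_0, B_N\rangle + \langle A_N - A_0, B_0\rangle \\ &= \sum_{i=1}^N\bigl(\langle A_{i-1}, B_i\rangle - \langle A_i, B_{i-1}\rangle\bigr) . \end{aligned} \]
The 2-form $\omega$ is manifestly left-invariant. So, instead of writing $\omega_x(xA, xB)$ we shall simply write $\omega(A, B)$. Next we compute $d\omega$. By left invariance it suffices to compute it at the identity in $G$. Denoting the vector field $x \mapsto xA$ by $\tilde A$ etc., we have
\[ \begin{aligned} d\omega(A, B, C) = {} & \tilde A_e\bigl(\omega(\tilde B, \tilde C)\bigr) + \tilde C_e\bigl(\omega(\tilde A, \tilde B)\bigr) + \tilde B_e\bigl(\omega(\tilde C, \tilde A)\bigr) \\ & + \omega(A, [B, C]) + \omega(C, [A, B]) + \omega(B, [C, A]) . \end{aligned} \]
The first three terms, being derivatives of constant functions, are each equal to zero. Proceeding with the remaining terms we have:
\[ \begin{aligned} d\omega(A, B, C) = {} & \sum_{i=1}^N\bigl(\langle A_{i-1}, [B_i, C_i]\rangle - \langle A_i, [B_{i-1}, C_{i-1}]\rangle\bigr) \\ & + \sum_{i=1}^N\bigl(\langle C_{i-1}, [A_i, B_i]\rangle - \langle C_i, [A_{i-1}, B_{i-1}]\rangle\bigr) \\ & + \sum_{i=1}^N\bigl(\langle B_{i-1}, [C_i, A_i]\rangle - \langle B_i, [C_{i-1}, A_{i-1}]\rangle\bigr) \\ = {} & \sum_{i=1}^N\bigl\langle A_i,\, [B_{i+1}, C_{i+1}] - [B_{i-1}, C_{i-1}]\bigr\rangle \\ & + \sum_{i=1}^N\bigl\langle A_i,\, [B_i, C_{i-1}] - [B_i, C_{i+1}]\bigr\rangle \\ & + \sum_{i=1}^N\bigl\langle A_i,\, [B_{i-1}, C_i] - [B_{i+1}, C_i]\bigr\rangle \\ = {} & \sum_{i=1}^N\langle A_i, S_i\rangle \end{aligned} \]
where
\[ \begin{aligned} S_i &= [B_{i+1}, C_{i+1}] - [B_{i-1}, C_{i-1}] + [B_i, C_{i-1}] - [B_i, C_{i+1}] + [B_{i-1}, C_i] - [B_{i+1}, C_i] \\ &= [B_{i+1} - B_i,\, C_{i+1} - C_i] - [B_i - B_{i-1},\, C_i - C_{i-1}] . \end{aligned} \]
Thus
\[ \begin{aligned} d\omega(A, B, C) &= \sum_{i=1}^N\bigl\langle A_i,\, [B_{i+1} - B_i,\, C_{i+1} - C_i] - [B_i - B_{i-1},\, C_i - C_{i-1}]\bigr\rangle \\ &= \sum_{i=1}^{N+1}\bigl\langle A_{i-1},\, [B_i - B_{i-1},\, C_i - C_{i-1}]\bigr\rangle - \sum_{i=1}^{N+1}\bigl\langle A_i,\, [B_i - B_{i-1},\, C_i - C_{i-1}]\bigr\rangle \\ &= -\sum_{i=1}^{N+1}\bigl\langle A_i - A_{i-1},\, [B_i - B_{i-1},\, C_i - C_{i-1}]\bigr\rangle \end{aligned} \]
as claimed.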
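The first identity of Lemma 4.3 is a purely algebraic telescoping statement and is easy to test numerically. The following sketch compares the defining double sum with the simplified single sum, using the convention $A_0 = B_0 = 0$ from the lemma. The choices made here are assumptions for illustration only: $LG$ is modelled as $\mathbb{R}^3$ with the Euclidean inner product, and the function names are hypothetical.

```python
import random

def dot(X, Y):
    return sum(a * b for a, b in zip(X, Y))

def sub(X, Y):
    return tuple(a - b for a, b in zip(X, Y))

def random_vec(rng):
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

def omega_from_definition(A, B):
    """sum_{i,j} eps_ij <A_i - A_{i-1}, B_j - B_{j-1}>, lists indexed 0..N with A_0 = B_0 = 0."""
    N = len(A) - 1
    total = 0.0
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            eps = (i < j) - (i > j)    # +1, -1 or 0
            total += eps * dot(sub(A[i], A[i - 1]), sub(B[j], B[j - 1]))
    return total

def omega_telescoped(A, B):
    """sum_{i=1}^N (<A_{i-1}, B_i> - <A_i, B_{i-1}>), same indexing convention."""
    N = len(A) - 1
    return sum(dot(A[i - 1], B[i]) - dot(A[i], B[i - 1]) for i in range(1, N + 1))

rng = random.Random(3)
N = 6
zero = (0.0, 0.0, 0.0)
A = [zero] + [random_vec(rng) for _ in range(N)]   # A_0 = 0, then A_1, ..., A_N
B = [zero] + [random_vec(rng) for _ in range(N)]
gap = abs(omega_from_definition(A, B) - omega_telescoped(A, B))
```

The quantity `gap` should vanish up to rounding for any input, since the identity uses only bilinearity of the inner product and the convention $A_0 = B_0 = 0$.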
4.3. A moment-map type property Let us briefly review the situation. Associated to our cell complex C we have the space AC of connections, on which there is a 2-form Ω, and the group GC of gauge transformations acts on AC . The definition of the 2-form Ω requires a choice of a base point on each 2-cell. In this section we shall derive a formula which is strongly reminiscent of the definition of a moment map in Hamiltonian mechanics. The formula of this section originated in a result proven in [7].
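Let $x$ be a connection. Then we have the derivative of the orbit map
\[ D_0^x = \gamma'_x : L\mathcal{G}_C \to T_x\mathcal{A}_C . \]
Recall that
\[ L\mathcal{G}_C = C_x^0 \quad\text{and}\quad T_x\mathcal{A}_C = C_x^1 . \]
We prove

Theorem 4.3. Assume that $C$ is oriented. Let $x \in \mathcal{A}_C$ be a connection over $C$, $A$ any element of $C_x^1$, and $H$ any element of $C_x^0$. Then:
(i) the following holds:
\[ \Omega_x(D_0^xH, A) = \sum_\sigma\frac{1}{2}\Bigl\langle \bigl(1 + \mathrm{Ad}\,K_\sigma(x)^{-1}\bigr)H(v_\sigma),\, dK_\sigma(x)A\Bigr\rangle \tag{4.10} \]
where $v_\sigma$ is the chosen base point on the boundary of the 2-cell $\sigma$, and the sum is over all positively oriented 2-cells;
(ii) if $x$ is a flat connection then
\[ \Omega_x(D_0^xH, A) = \sum_\sigma\bigl\langle H(v_\sigma),\, D_1^xA(\sigma)\bigr\rangle \tag{4.11} \]
with notation as before;
(iii) if $x$ is a flat connection and $A \in \ker D_1^x$ then
\[ \Omega_x(D_0^xH, A) = 0 . \tag{4.12} \]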
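Part (i) of the theorem can be tested numerically on the simplest closed example, a torus built from one 0-cell, two 1-cells $a$ and $b$, and one 2-cell with boundary $\bar b\,\bar a\,b\,a$. The sketch below is an illustration under assumptions of our own: $G = SU(2)$ as unit quaternions, $LG = \mathbb{R}^3$ with the Euclidean inner product, $f_i = \mathrm{Ad}(x(e_i) \cdots x(e_1))$ with $f_0$ the identity, and $dK_\sigma(x)A$ computed by formula (3.16). The connection is generic (not flat), and all names are hypothetical.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qinv(q):
    """Inverse of a unit quaternion, i.e. its conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def random_unit(rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def random_vec(rng):
    return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

def Ad(g, X):
    """Adjoint action of the unit quaternion g on X in R^3: g X g^{-1}."""
    w, x, y, z = qmul(qmul(g, (0.0, X[0], X[1], X[2])), qinv(g))
    return (x, y, z)

def add(X, Y):
    return tuple(a + b for a, b in zip(X, Y))

def sub(X, Y):
    return tuple(a - b for a, b in zip(X, Y))

def neg(X):
    return tuple(-a for a in X)

def dot(X, Y):
    return sum(a * b for a, b in zip(X, Y))

def transported(xs, A):
    """Return ([f_{i-1}^{-1} A(e_i)]_i, K) where K = x(e_N) ... x(e_1)."""
    out = []
    prefix = (1.0, 0.0, 0.0, 0.0)
    for xe, Ae in zip(xs, A):
        out.append(Ad(qinv(prefix), Ae))
        prefix = qmul(xe, prefix)
    return out, prefix

def omega_sigma(xs, A, B):
    """(4.1) for a single based, oriented 2-cell with boundary e_N ... e_1."""
    a, _ = transported(xs, A)
    b, _ = transported(xs, B)
    n = len(xs)
    return 0.5 * sum(((i < j) - (i > j)) * dot(a[i], b[j])
                     for i in range(n) for j in range(n))

rng = random.Random(4)
xa, xb = random_unit(rng), random_unit(rng)
xs = [xa, xb, qinv(xa), qinv(xb)]         # e_1 = a, e_2 = b, e_3 = a-bar, e_4 = b-bar
Aa, Ab = random_vec(rng), random_vec(rng)
A = [Aa, Ab, neg(Ad(xa, Aa)), neg(Ad(xb, Ab))]   # tangency condition (3.7)
h = random_vec(rng)                       # H at the single 0-cell
D0H = [sub(h, Ad(qinv(xe), h)) for xe in xs]     # (3.13) on each oriented 1-cell

lhs = omega_sigma(xs, D0H, A)
terms, K = transported(xs, A)
dKA = (0.0, 0.0, 0.0)
for t in terms:
    dKA = add(dKA, t)                     # dK_sigma(x)A as in (3.16)
rhs = 0.5 * dot(add(h, Ad(qinv(K), h)), dKA)
```

In this sketch `lhs` and `rhs` are the two sides of (4.10); they should agree up to rounding even though $K$ is not the identity. The agreement depends on every 1-cell appearing on the boundary together with its reverse, which is what the orientation hypothesis on $C$ provides.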
Proof. Let $\sigma$ be a based, oriented 2-cell, with $\partial\sigma = e_N \cdots e_1$, where $e_i$ runs from a vertex $v_{i-1}$ to a vertex $v_i$. Thus $v_N = v_0$ is the base point of $\sigma$. Recall that $D_0^xH$ is the element of $T_x\mathcal{A}_C$ given by
\[ D_0^xH(e) = H(a) - \mathrm{Ad}\,x(e)^{-1}H(b) \]
where $e$ is the oriented cell from vertex $a$ to vertex $b$. We then have
\[ \begin{aligned} \Omega_x(D_0^xH, A)_\sigma &= \frac{1}{2}\sum_{i,j}\epsilon_{ij}\,\Bigl\langle f_{i-1}^{-1}\bigl[H(v_{i-1}) - \mathrm{Ad}\,x(e_i)^{-1}H(v_i)\bigr],\, f_{j-1}^{-1}A(e_j)\Bigr\rangle \\ &= \frac{1}{2}\sum_{j=1}^N\Bigl\langle H(v_0) - f_{j-1}^{-1}H(v_{j-1}) - f_j^{-1}H(v_j) + f_N^{-1}H(v_N),\, f_{j-1}^{-1}A(e_j)\Bigr\rangle \\ &= \frac{1}{2}\sum_{j=1}^N\Bigl\langle (1 + f_N^{-1})H(v_0) - f_j^{-1}H(v_j) - f_{j-1}^{-1}H(v_{j-1}),\, f_{j-1}^{-1}A(e_j)\Bigr\rangle \\ &= \frac{1}{2}\Bigl\langle \bigl(1 + \mathrm{Ad}\,K_\sigma(x)^{-1}\bigr)H(v_\sigma),\, dK_\sigma(x)A\Bigr\rangle - M_\sigma \end{aligned} \]
where $v_\sigma$ is the chosen base point $v_0$ on the boundary of the 2-cell $\sigma$, and
\[ M_\sigma = \frac{1}{2}\sum_{j=1}^N\bigl\langle \mathrm{Ad}\,x(e_j)^{-1}H(v_j) + H(v_{j-1}),\, A(e_j)\bigr\rangle . \]
By our assumptions concerning the oriented, boundary-less cell complex $C$, each 1-cell appears exactly twice in the sum
\[ \sum_\sigma M_\sigma , \]
once in the form $e$ as a boundary of an oriented 2-cell and once as $\bar e$ of either the same 2-cell or a neighboring 2-cell. Suppose $e$ runs from $a$ to $b$; then the terms involving $e$ are:
\[ \begin{aligned} & \bigl\langle \mathrm{Ad}\,x(e)^{-1}H(b) + H(a),\, A(e)\bigr\rangle + \bigl\langle \mathrm{Ad}\,x(\bar e)^{-1}H(a) + H(b),\, A(\bar e)\bigr\rangle \\ &\quad = \bigl\langle \mathrm{Ad}\,x(e)^{-1}H(b) + H(a),\, A(e)\bigr\rangle + \bigl\langle \mathrm{Ad}\,x(e)H(a) + H(b),\, -\mathrm{Ad}\,x(e)A(e)\bigr\rangle \\ &\quad = 0 . \end{aligned} \]
Using this we conclude that
\[ \Omega_x(D_0^xH, A) = \sum_\sigma\frac{1}{2}\Bigl\langle \bigl(1 + \mathrm{Ad}\,K_\sigma(x)^{-1}\bigr)H(v_\sigma),\, dK_\sigma(x)A\Bigr\rangle \]
where $v_\sigma$ is the chosen base point on the boundary of the 2-cell $\sigma$, and the sum is over all positively oriented 2-cells. This proves part (i). Parts (ii) and (iii) are immediate consequences.

The moment-map property was proved in [7] for the case of complexes with one 2-cell (corresponding to closed, oriented surfaces). In a later paper, Alekseev et al. [1] developed an entire theory of Lie group-valued moment maps. There is a non-degeneracy condition that is used in that theory and it would be of interest to verify if that condition holds for our context; at this time, we have not been able to verify it. (The author thanks a referee for directing him to this question.)

The above moment-map result was used in [8] to prove non-degeneracy of the symplectic form on the moduli space of flat connections. It has also played a crucial role in the computation of the symplectic volume of the moduli space of flat connections. A similar application can, hopefully, be made in the more general case we have worked with here. For more ideas in this direction we refer to some recent work of Mulase and Penkava [11].

Acknowledgments

I am thankful to the referees for their comments (in particular, for pointing out [1]). Research support provided by US NSF grant DMS-0201683 is also gratefully acknowledged.
References

[1] A. Alekseev, A. Malkin and E. Meinrenken, Lie group valued moment maps, J. Differential Geom. 48(3) (1998) 445–495.
[2] M. Atiyah and R. Bott, The Yang–Mills equations over Riemann surfaces, Phil. Trans. R. Soc. Lond. A308 (1982) 523–615.
[3] C. Becker and A. Sengupta, Sewing Yang–Mills measures and moduli spaces over compact surfaces, J. Funct. Anal. 152 (1998) 74–99.
[4] K. Brown, Cohomology of Groups (Springer-Verlag, 1982).
[5] W. Goldman, The symplectic nature of fundamental groups of surfaces, Adv. Math. 54 (1984) 200–225.
[6] J. Huebschmann, Singularities and Poisson geometry of certain representation spaces, in Deformation Quantization of Singular Symplectic Quotients, Progress in Mathematics, Vol. 198, eds. N. P. Landsman, M. Pflaum and M. Schlichenmaier (Birkhäuser, 2001).
[7] C. King and A. Sengupta, An explicit description of the symplectic structure of moduli spaces of flat connections, J. Math. Phys. 10 (1994) 5338–5353.
[8] C. King and A. Sengupta, The semiclassical limit of the two dimensional quantum Yang–Mills model, J. Math. Phys. 35 (1994) 5354–5361.
[9] Deformation Quantization of Singular Symplectic Quotients, Progress in Mathematics, Vol. 198, eds. N. P. Landsman, M. Pflaum and M. Schlichenmaier (Birkhäuser, 2001).
[10] A. A. Migdal, Recursion equations in gauge field theories, Sov. Phys. JETP 42 (1975) 413; 743.
[11] M. Mulase and M. Penkava, Volume of representation varieties, preprint.
[12] B. Schroers, Combinatorial quantization of Euclidean gravity in three dimensions, in Deformation Quantization of Singular Symplectic Quotients, Progress in Mathematics, Vol. 198, eds. N. P. Landsman, M. Pflaum and M. Schlichenmaier (Birkhäuser, 2001).
[13] A. Sengupta, Gauge theory on compact surfaces, Memoirs of the Amer. Math. Soc. 126(600) (1997).
[14] A. Sengupta, The moduli space of Yang–Mills connections over a compact surface, Rev. Math. Phys. 9 (1997) 77–121.
[15] A. Sengupta, Sewing symplectic volumes for flat connections over compact surfaces, J. Geom. Phys. 32 (2000) 269–292.
[16] A. Sengupta, The Yang–Mills measure and symplectic structure on spaces of connections, in Deformation Quantization of Singular Symplectic Quotients, Progress in Mathematics, Vol. 198, eds. N. P. Landsman, M. Pflaum and M. Schlichenmaier (Birkhäuser, 2001).
[17] E. Witten, On quantum gauge theories in two dimensions, Commun. Math. Phys. 141 (1991) 153–209.
[18] E. Witten, Two dimensional quantum gauge theory revisited, J. Geom. Phys. 9 (1992) 303–368.
April 20, 2004 11:21 WSPC/148-RMP
00200
Reviews in Mathematical Physics Vol. 16, No. 3 (2004) 353–382 © World Scientific Publishing Company
LOCAL NATURE OF COSET MODELS
S. KÖSTER
Inst. f. Theor. Physik, Tammannstr. 1, 37077 Göttingen, Germany
[email protected]

Received 17 July 2003
Revised 24 January 2004

The local algebras of the maximal Coset model $C_{\max}$ associated with a chiral conformal subtheory A ⊂ B are shown to coincide with the local relative commutants of A in B, provided A possesses a stress-energy tensor. Making the same assumption, the adjoint action of the unique inner-implementing representation $U^A$ associated with A ⊂ B on the local observables in B is found to define net-endomorphisms of B. This property is exploited for constructing from B a conformally covariant holographic image in (1 + 1) dimensions which proves useful as a geometric picture for the joint inclusion $A \vee C_{\max} \subset B$. Immediate applications to the analysis of current subalgebras are given and the relation to normal canonical tensor product subfactors is clarified. A natural converse of Borchers’ theorem on half-sided translations is made accessible.

Keywords: Conformal quantum field theory; nets of subfactors; coset construction; current algebras; isotony.
1. Introduction

Structural and conceptual questions of quantum field theory are often addressed best within the framework of local quantum physics, where physics is described by assigning local algebras B(O) of observables to localization regions O rather than in terms of quantum fields [35]. In this picture it is natural to investigate the relative position of subnets A(O) ⊂ B(O) in the larger theory B. In (3 + 1)-dimensional spacetime this problem may be dealt with by means of the powerful reconstruction method of Doplicher and Roberts [22] and according to the results of Carpi and Conti [19] any subtheory satisfying certain assumptions is in a tensor product position in the larger theory. In lower dimensions, however, the situation is less restrictive and a lot of interesting examples are known.

In this article we will study a class of inclusions of local quantum theories A ⊂ B, where B is given in its vacuum representation. When not all of the energy-content of B belongs to A, there is space for other subtheories C ⊂ B which commute with all of A. We call such subtheories C Coset models (associated with A ⊂ B), and we want to derive typical features of these. By placing the problem into the setting of
chiral conformal quantum field theory the large spacetime symmetry enables us to state and discuss the problems concerned clearly and rigorously.

Chiral conformal Coset models are studied for various reasons, in most cases connected to inclusions of chiral current algebras. These models exhibit a rich and yet tractable structure. One of the major achievements in this direction was the construction of the discrete series of Virasoro theories as Coset models by Goddard, Kent, and Olive [28]. In the algebraic approach to quantum field theory much has been achieved for inclusions of local quantum theories generated by chiral current algebras and closely related structures [48, 68, 67, 70, 49, 40]. Here, we want to broaden the perspective by using methods which do not make use of structures specifically connected to chiral current algebras, but which apply in a more general context.

For chiral conformal quantum field theories the natural localization regions are open, non-dense intervals $I$ in the circle. The local algebras $C(I)$ of a Coset model associated with a subnet A ⊂ B are contained in the local relative commutants $C_I := A(I)' \cap B(I)$. Actually, if the local relative commutants fulfill isotony, i.e. the local relative commutants $C_I$ increase with $I$, the $C_I$ define a Coset model themselves which is obviously maximal. Isotony for the $C_I$ holds for chiral current subalgebras because of the strong additivity property of these models [63, Corollary IV.1.3.3.]: If $I_1$, $I_2$ arise from $I$ by removing a point in its interior, the local algebra $A(I)$ is generated by its subalgebras $A(I_1)$ and $A(I_2)$. This property is absent in many chiral conformal models [15, 71] and, naturally, the question arises, under which circumstances the equality of $C_I$ and $C_{\max}(I)$ can be proven. Our intention is to find more general conditions which secure this equality for various reasons and we refer to this task as the isotony problem.
In a recent work [43] we constructed for any chiral conformal subtheory A ⊂ B a globally A-inner representation $U^A$ which implements the (global) chiral conformal transformations on A. The result provides a factorization of $U$, the implementation of chiral conformal symmetry in the vacuum representation of B, into two commuting representations, $U^A$ and $U^{A'}$, which share with $U$ the properties of leaving the vacuum invariant and of positivity of energy. It was proved, by a simple argument, that the local operators in B which commute with $U^A$ form the maximal Coset model $C_{\max}$ associated with every particular inclusion A ⊂ B. $C_{\max}$ can be non-trivial, if we have $U^{A'} \neq \mathbb{1}$, i.e. if not all of the energy-content of B belongs to A.

We would like to have a simple and applicable characterization of local operators in B which belong to a Coset model associated with a subtheory A, and we want this characterization to involve only local data according to the conviction that all observation is of finite extension and of finite duration. Of course, it is in principle possible to make this decision simply by taking all operators from $C_I$ and discarding all operators which do not commute with all operators belonging to an algebra $A(J)$, $J$ slightly enlarged. But chiral conformal quantum field theories usually behave well when $J$ tends to $I$, and we are led to the conjecture that the local algebras of
the maximal Coset model and the corresponding local relative commutants should coincide in very general circumstances. As it stands for the moment, the maximal Coset model is determined by global data, the inner-implementing representation $U^A$, and establishing the equality $C_I = C_{\max}(I)$ would prove that the Coset model is of a local nature, its local operators being singled out by a simple algebraic relation only involving local data associated with the very same localization region.

For dealing with the isotony problem in our context,$^{a}$ we look at the action of $\mathrm{Ad}_{U^A}$ on the local observables of B. Because the construction of $U^A$ does not refer to the local structure of A at all, we need some information on the way this representation is generated by local observables. In chiral conformal field theory it is natural to assume that the inner implementing representation is generated by integrals of a stress-energy tensor affiliated with A. This assumption does not imply strong additivity [15] and concerning the models known today (at least to the author) is more general, since all strongly additive models contain a stress-energy tensor.

Because of the special features stress-energy tensors of chiral (and (1+1)-dimensional) conformal field theory have according to the Lüscher–Mack theorem [26], solving the isotony problem proves possible, but the presence of a stress-energy tensor does not trivialize it at all. In fact, one is led to pinpoint the problem very much using arguments independent of the additional assumption, before the stress-energy tensor actually is needed to prove two crucial, but natural lemmas. Our discussion should, therefore, serve well as a setup for further generalizations.

Even for current subalgebras, which always contain a stress-energy tensor by the Sugawara construction, the action of a stress-energy tensor $\Theta^A$ of a current subalgebra on general currents in the larger current algebra B has not been studied as such, yet.
Only in connection with the classification of conformal inclusions, i.e. the case that the stress-energy tensor $\Theta^B$ coincides with that of A [60, 1, 3], this action has been the object of research. The new perspective of analyzing the action of $U^A$ on B (in this context: of $\Theta^A$ on B) directly has led to a simple and natural characterization of conformal inclusions by methods familiar in (axiomatic) quantum field theory [44].

As mentioned above, the local nature of maximal Coset models associated with current subalgebras A ⊂ B is clear because of the strong additivity property of A. If the embedding $A \vee C_{\max} \subset B$ is known to be of finite index, then $C_{\max}$ inherits the strong additivity property from A and B by the results of Longo [49]. According to Xu [68] a large number of current algebra inclusions are known to satisfy this condition (cofinite inclusions A ⊂ B), but for a lot of others the situation has not been clarified, yet.

$^{a}$ Apparently, Carpi and Conti encountered the same problem while generalizing their analysis [19] to general field algebras and solved it by methods quite different from the ones applied here [18].

If we now look at the embedding $C_{\max} \subset B$ and consider the (“iterated”) Coset models associated with this inclusion, we arrive at another isotony problem. In case $C_{\max}$ is strongly additive as well, the local relative commutants of
$C_{\max}$ and the local algebras of $C_{\max}$ form a pair of subnets which locally are their respective relative commutants. Inclusions of this type are of particular interest and Rehren called them a normal pair of subnets [57]. Our analysis applies to the maximal Coset models associated with current subalgebras as these always contain the Coset stress-energy tensor $\Theta^B - \Theta^A$. This way we extend the finding on normal pairs for cofinite current subalgebras to all inclusions A ⊂ B where both B and A contain a stress-energy tensor, independent of strong additivity or the index of the inclusion $A \vee C_{\max} \subset B$.

In the next section we first state our general assumptions and conventions and then discuss the “geometric impact” of $U^A$ on B. Intuitively, we do not expect an observable of B to be more sensitive to the action of $\mathrm{Ad}_{U^A}$ than to that of $\mathrm{Ad}_U$: the generator of translations, $P$, is known to decompose into two commuting positive parts, $P = P^A + P^{A'}$, and regarding them as chiral analogues of Hamiltonians leads us to the expectation that $P^A$ should not transport observables of B “faster” than $P$ itself. A typical local observable $B$ in B should exhibit a behavior interpolating between invariance ($B$ in $C_{\max}$) and covariance ($B$ in $A_{\max}$). For this behavior to be ensured we have, as it turns out, only to show that scale transformations represented through $U^A$ respect the two fixed points of scale transformations, namely 0 and ∞, when acting on B. We can prove this to be the case in the presence of a stress-energy tensor and it seems natural in any case.

The sub-geometrical transformation behavior for translations, which we expect, then follows by the results of Borchers [11, 12] using the spectrum condition and modular theory. We collected, rearranged and reformulated results of Borchers and Wiesbrock in order to provide a natural converse of Borchers’ theorem on half-sided translations, which was not yet available in the literature.
By extending the analysis to general conformal transformations we arrive at the notion of net-endomorphism property for the action of $U^A$ on B.

In the third section we use the net-endomorphic action of $U^A$ to construct from the chiral conformal theory B a conformal net in (1+1) dimensions which contains the chiral algebras as time-zero algebras. The result satisfies all axioms of a (1+1)-dimensional conformal quantum theory, except that its translations in spacelike directions to the right have positive spectrum rather than in futurelike directions. While this prohibits interpreting the picture of chiral holography as (completely) physically sensible, it gives a satisfactory geometric interpretation to the net-endomorphisms induced by $U^A$ and it provides a rather helpful geometrical framework of a quasi-theory in (1+1) dimensions. The subnet A ⊂ B and its Coset models appear as subtheories of chiral observables and thus we make connection with the results of Rehren [57], which have interesting consequences for known examples.

In the closing section we provide our solution to the isotony problem (main Theorem 4.1), i.e. we establish the local nature of the maximal Coset model. We start by giving a new characterization of $C_{\max}$ making use of the particular structure of the group of chiral conformal transformations. And then, again, the presence of
a stress-energy tensor for A is only needed in order to establish a rather natural, but crucial lemma on the representation of scale transformations through $U^A$. At the very end we discuss possible generalizations to models having no stress-energy tensor and to subtheories in other spacetimes. The appendix contains background on our additional assumption on the inner-implementing representation $U^A$, while we will use an abstract formulation of it in the main sections, and a simple, technical lemma on scale transformations as elements of the group of orientation preserving diffeomorphisms of the circle, $\mathrm{Diff}_+(S^1)$.

2. Net-Endomorphism Property

The fundamental object of this study is an inclusion of a chiral conformal theory, A, in another chiral conformal theory, B. The theory B shall be given in its vacuum representation, of which we summarize the general assumptions and some of its properties (cf. [29, 27] and references therein), and we describe the embedding of A in B in this setting.

The localization regions for chiral conformal theories are taken to be the proper intervals contained in the unit circle $S^1$, which is to be regarded as the conformal compactification of a (chiral) light-ray; the point $+1$ on $S^1$ corresponds to the point 0 on the light-ray and $-1 \in S^1$ corresponds to ∞. A proper interval $I$ is an open, connected subset of $S^1$ which has a causal (open) complement, $I' := \{S^1 \setminus I\}^\circ \neq \emptyset$. The inclusion of such a proper interval $I$ in the unit circle will be denoted as $I \Subset S^1$.

The vacuum representation of B is given by a map from the set of proper intervals to von Neumann algebras of bounded operators on a separable Hilbert space $\mathcal{H}$ satisfying isotony, i.e. for $I_1 \subset I_2 \Subset S^1$ we have $B(I_1) \subset B(I_2)$, and locality, that is: if $I_1 \subset I_2'$, then $B(I_1)$ is contained in $B(I_2)'$, the commutant of $B(I_2)$.
It is required as well that there is a unitary, strongly continuous representation $U$ of the group of global, chiral, conformal transformations, $\mathrm{PSL}(2,\mathbb{R})$, which satisfies the following: the generator of translations has positive spectrum (positivity of energy), $U$ implements the corresponding symmetry of B, i.e. for $g \in \mathrm{PSL}(2,\mathbb{R})$ the adjoint action of $U(g)$ on local algebras of B defines an isomorphism $\alpha_g$ from any $B(I)$ onto the corresponding $B(gI)$, and, finally, $U$ has to contain the trivial representation exactly once. We choose a vector $\Omega$, the vacuum, of length 1 in the corresponding representation space. $\Omega$ has to be cyclic for B, which, by the Reeh–Schlieder theorem, amounts to demanding $B(I)\Omega$ to be dense in $\mathcal{H}$ for all $I \Subset S^1$.

A chiral conformal subtheory A embedded in B, written as A ⊂ B, is given by a map from the set of proper intervals to local von Neumann algebras, $S^1 \Supset I \mapsto A(I)$, with the following properties:
• Inclusion: $A(I) \subset B(I)$ for $I \Subset S^1$.
• Isotony: If $I_1 \subset I_2$, then $A(I_1) \subset A(I_2)$.
• Covariance: For all $g \in \mathrm{PSL}(2,\mathbb{R})$ and $I \Subset S^1$ we have: $A(gI) = \alpha_g(A(I))$.
These assumptions have a lot of interesting consequences of which we only name a few directly involved in this work. For instance, B has the Bisognano–Wichmann property, i.e. the modular data of the local algebra assigned to the upper half circle have a direct geometrical interpretation. If the action of scale transformations on the (chiral) light-ray, which we identify with $\mathbb{R}$, reads $D(t) : x \mapsto e^{t} x$, $x \in \mathbb{R}$, and the modular group of $B(S^1_+)$ is given by $\Delta^{it}$, then we have $U(D(-2\pi t)) = \Delta^{it}$. Furthermore, the modular conjugation of $B(S^1_+)$, denoted by $J$, implements the reflection $x \mapsto -x$. By covariance, this means in particular: the vacuum representation of B satisfies Haag duality (on the circle), namely we have $B(I)' = B(I')$, $I \Subset S^1$.

The local algebras $B(I)$, $I \Subset S^1$, are continuous from the inside as well as from the outside, that is: $B(I)$ coincides with the intersection of all local algebras assigned to proper intervals $J$ containing $\bar I$ and is generated by all its local subalgebras (assigned to proper intervals $J$ with $\bar J \subset I$), respectively. Continuity from the inside implies weak additivity, i.e. $B(I)$ is generated by the subalgebras $B(J_i)$ for each covering $\bigcup_i J_i = I$ [23].

The vacuum representation of A is contained in the representation induced by the embedding A ⊂ B. In fact, the local inclusions $A(I) \subset B(I)$, $I \Subset S^1$, define a (quantum field theoretical) net of subfactors in the sense of Longo and Rehren [50]. By the Reeh–Schlieder theorem, the projection $e_A$ onto the Hilbert space resulting from the closure of $A(I)\Omega$, $I \Subset S^1$, does not depend on $I$ and, because the local subalgebras $A(I) \subset B(I)$ are modular covariant by the Bisognano–Wichmann property of B and conformal covariance of A ⊂ B, it follows [61, 37] that for every $I \Subset S^1$ we have: $A(I) = \{e_A\}' \cap B(I)$. For a general summary on modular covariant subalgebras see e.g. [11].
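The geometric statements above (the light-ray point 0 corresponds to $+1$ on the circle, ∞ to $-1$, dilatations preserve the upper half circle, and the reflection $x \mapsto -x$ exchanges the two half circles) can be checked numerically. The following is only an illustrative sketch: the Cayley map $x \mapsto (1+ix)/(1-ix)$ is one common convention compatible with these identifications and is an assumption here, as are all function names.

```python
import math

def cayley(x: float) -> complex:
    """Assumed convention: map a light-ray point x to the unit circle."""
    return (1 + 1j * x) / (1 - 1j * x)

def dilate(t: float, x: float) -> float:
    """Scale transformation D(t): x -> e^t x on the light-ray."""
    return math.exp(t) * x

def reflect(x: float) -> float:
    """Reflection x -> -x (the geometric action attributed to J)."""
    return -x

samples = [0.01, 0.5, 1.0, 7.0, 300.0]   # points on the positive half-line
times = [-3.0, -0.5, 0.0, 2.0]           # dilatation parameters

# x = 0 is sent to +1; very large x approaches -1 (the image of infinity).
origin_image = cayley(0.0)
far_image = cayley(1e9)

# Every image lies on the unit circle.
on_circle = all(abs(abs(cayley(x)) - 1.0) < 1e-12 for x in samples)

# D(t) maps the positive half-line to itself, i.e. the upper half circle to itself.
upper_preserved = all(cayley(dilate(t, x)).imag > 0 for t in times for x in samples)

# The reflection acts as complex conjugation on the circle, so it swaps
# the upper and lower half circles while fixing +1 and -1.
reflection_is_conjugation = all(
    abs(cayley(reflect(x)) - cayley(x).conjugate()) < 1e-12 for x in samples
)
```

In this picture the two fixed points of the dilatations, 0 and ∞, are the points $+1$ and $-1$ bounding the upper half circle.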
We denote the von Neumann algebra which is generated by all local algebras $A(I)$, $I \Subset S^1$, by A as well (with a slight abuse of notation); this algebra contains all local observables of the theory A and all global observables associated with the subtheory A ⊂ B, that is all bounded operators which are weak limits of local observables of the subtheory but which are not local themselves. By the Borchers–Sugawara construction [43] there is a unique representation $U^A$ of $\mathrm{PSL}(2,\mathbb{R})^\sim$, the universal covering group of $\mathrm{PSL}(2,\mathbb{R})$, which consists of global observables of A and implements conformal covariance on A by its adjoint action. Namely, the representation $U$ factorizes as $U \circ p(g) = U^A(g)\,U^{A'}(g)$, where $p$ denotes the covering projection from $\mathrm{PSL}(2,\mathbb{R})^\sim$ onto $\mathrm{PSL}(2,\mathbb{R})$ and $U^{A'}$ is another representation of $\mathrm{PSL}(2,\mathbb{R})^\sim$ by unitaries in $A'$.

In the following there will appear frequently the subgroups of translations, $T(a)x = x + a$, $x, a \in \mathbb{R}$, and of special conformal transformations, $S(n)x = x/(1 + nx)$, $x, n \in \mathbb{R}$. Both groups are inverse-conjugate in $\mathrm{PSL}(2,\mathbb{R})$, i.e. $T(a)$ is conjugate to $S(-a)$, and the same holds true for their images in $\mathrm{PSL}(2,\mathbb{R})^\sim$, which we will denote by $\tilde T$ and $\tilde S$, respectively. Furthermore, we have the groups of rigid conformal rotations, denoted by $R$ and $\tilde R$, respectively, and of dilatations (scale transformations), $D$, $\tilde D$. We adopt the physicists’ convention on the Lie algebra
and use the same symbols for elements of the Lie algebra and their representatives as elements of an infinitesimal representation by (essentially) self-adjoint operators. We use parameters on the three subgroups mentioned so far which make the subgroup $R$ of rotations naturally isomorphic to $\mathbb{R}/2\pi\mathbb{Z}$ and yield the following relation between the generator of translations, $P$, the generator of special conformal transformations, $K$, and the generator of rigid conformal rotations, the conformal Hamiltonian $L_0$: $2L_0 = P - K$. $P$ is a positive operator if and only if $L_0$ is positive or, equivalently, if and only if $-K$ is positive (e.g. [43, Proposition 1]). $U^A$ and $U^{A'}$ are both of positive energy since $U$ is. Furthermore, both representations leave the vacuum invariant [43, Corollaries 6 and 7].

In the following we deduce, step by step, the sub-geometric character of the adjoint action of $U^A$ (and of $U^{A'}$) on B. The analysis relies on a single property of the dilatations in $U^A$. The notion of net-endomorphisms arises naturally in the course of the argument and will be discussed at the end of this section. We therefore define:

Definition 2.1. $U^A$ is said to have the net-endomorphism property, if the adjoint action of $U^A(\tilde D(t))$, $t \in \mathbb{R}$, defines a group of automorphisms of $B(S^1_+)$.

This property holds making the following

Additional Assumption: There is a unitary, strongly continuous, projective representation $\Upsilon^A$ of the universal covering group of orientation preserving diffeomorphisms of the circle, $\mathrm{Diff}_+(S^1)^\sim$, on $\mathcal{H}$ such that:
• If a diffeomorphism $\varphi \in \mathrm{Diff}_+(S^1)$ is localized in $I \Subset S^1$, i.e. $\varphi|_{I'} = \mathrm{id}|_{I'}$, it is represented by a local observable of A, namely: $\Upsilon^A(p^{-1}(\varphi)) \in A(I)$.
• $\Upsilon^A(\tilde D(t))\,U^A(\tilde D(t))^* \in \mathbb{C}\mathbb{1}$ for all $t \in \mathbb{R}$.

Here, the covering projection from $\mathrm{Diff}_+(S^1)^\sim$ onto $\mathrm{Diff}_+(S^1)$ is denoted by $p$. Localized diffeomorphisms $\varphi$ are identified by their pre-image $p^{-1}(\varphi)$ in the first sheet of the covering.
The Additional Assumption only enters through Lemmas 2.1 and 4.2, which we believe to hold true in a lot more general circumstances. It can be verified in the presence of an integrable stress-energy tensor for A (see discussion in Appendix). In this case the representations $\Upsilon^A|_{\mathrm{PSL}(2,\mathbb{R})^\sim}$ and $U^A$ coincide, whereas we have only assumed that the respective generators agree up to a multiple of $\mathbb{1}$. At this point we want to stress: we do not assume A to be diffeomorphism covariant, i.e. the adjoint action of $\Upsilon^A$ on A to implement a geometric, automorphic action of $\mathrm{Diff}_+(S^1)$ on A.

Lemma 2.1. $U^A$ has the net-endomorphism property, if the Additional Assumption holds.

Proof. By Lemma A.1 there exist, for small $t \in \mathbb{R}$, diffeomorphisms $g_\varepsilon$, $g_\delta$ localized in arbitrarily small neighborhoods of $-1$ and $1$, respectively, and diffeomorphisms
$g_+$, $g_-$ localized in $S^1_+$ and $S^1_-$, respectively, such that we have: $D(t) = g_+ g_- g_\delta g_\varepsilon$. If the closure of a proper interval $I$ is contained in $S^1_+$, we have with an appropriate choice of $g_\delta$, $g_\varepsilon$ by the Additional Assumption:
$$U^A(\tilde D(t))\,B(I)\,U^A(\tilde D(t))^* = \Upsilon^A(p^{-1}(g_+))\,B(I)\,\Upsilon^A(p^{-1}(g_+))^* \subset B(S^1_+) . \tag{2.1}$$
Because $B(S^1_+)$ is continuous from the inside, we see that $\mathrm{Ad}_{U^A(\tilde D(t))}$ induces an endomorphism of $B(S^1_+)$. The same holds true for $U^A(\tilde D(-t))$ and, therefore, these endomorphisms are automorphisms.

The next step is to give a natural characterization of one-parameter groups of unitary operators which define, by their adjoint action, endomorphism semigroups of a standard von Neumann algebra. The following theorem is mainly a new formulation of results by Borchers and Wiesbrock. Its present form is new and appears to be a natural converse of Borchers’ theorem on half-sided translations. The methods of proof are completely standard, but the result ought to be made available.$^{b}$

Theorem 2.1. Assume $M \subset \mathcal{B}(\mathcal{H})$ to be a von Neumann algebra having a cyclic and separating vector $\Omega$ in the separable Hilbert space $\mathcal{H}$. $J$, $\Delta$ shall stand for the modular data of this pair. Let $V(t)$, $t \in \mathbb{R}$, be a strongly continuous one-parameter group. Then any two from {1, 2, 3} imply the remaining two in the list below; 4 yields 1, 2, 3.
1. (a) $V(s) = e^{iHs}$, $H \ge 0$,
   (b) $V(s) M V(s)^* \subset M$, $s \ge 0$.
2. (a) $V(s)\Omega = \Omega$, $s \in \mathbb{R}$,
   (b) $V(s) M V(s)^* \subset M$, $s \ge 0$.
3. (a) $\Delta^{it} V(s) \Delta^{-it} = V(e^{-2\pi t} s)$, $J V(s) J = V(-s)$, $t, s \in \mathbb{R}$,
   (b) $V(s) M V(s)^* \subset M$, $s \ge 0$.
4. (a) $V(s) = e^{iHs}$, $H \ge 0$,
   (b) $\Delta^{it} V(s) \Delta^{-it} = V(e^{-2\pi t} s)$, $t, s \in \mathbb{R}$,
   (c) $\langle m'_+ \Omega, V(s) m_+ \Omega \rangle \ge 0$, $s \ge 0$, $m_+ \in M_+$, $m'_+ \in M'_+$.

$M_+$ denotes the cone of positive elements in $M$, $M'_+$ the cone of positive elements in its commutant $M'$.

Proof. Most of the implications were proved by Borchers and Wiesbrock, respectively: 1 ∧ 2 ⇒ 3: [10] (cf. [24]); 2 ∧ 3 ⇒ 1: [66]; 1 ∧ 3 ⇒ 2: [13]; 1 ∧ 2 ∧ 3 ⇒ 4: [14, Proposition 2.5.27]. We prove the remaining statement, namely 4 ⇒ 1 ∧ 2 ∧ 3, by reduction to [11, Theorem 1.1].$^{c}$

$^{b}$ Compare [20] for another characterization of endomorphism semigroups related to Borchers’ theorem.
$^{c}$ Alternatively, one may use the same statement in [12, Theorem 2.5].

As a first step we look at the domain of entire analytic vectors with
respect to $\Delta^{iz}$, which we denote by $D_\Delta$, and derive an analytic continuation of relation 4(b) as a quadratic form on $D_\Delta$. We define:
$$F(z, w) := \langle \Delta^{i\bar z}\psi,\, e^{i e^{2\pi w} H} \Delta^{iz}\phi \rangle .$$
According to the spectrum condition on $H$, $F$ is analytic in $w$ for $0 < \mathrm{Im}(w) < \frac{1}{2}$, and this function is bounded and continuous for the closure of this region; the region itself shall be denoted by $S$. In fact, by Hartogs’ theorem, $F$ is analytic on $\mathbb{C} \times S$ as a function in two complex variables. We make full use of relation 4(b) by looking at another function $G$, which agrees with $F$ for $0 < \mathrm{Im}(w) + \mathrm{Im}(z) < \frac{1}{2}$:
$$G(z, w) := \langle \psi,\, e^{i e^{2\pi (w+z)} H} \phi \rangle .$$
Evaluating at $w \in \mathbb{R}$ and $z = \frac{i}{4}$ we get:
$$\langle \Delta^{\frac{1}{4}}\psi,\, e^{i e^{2\pi w} H} \Delta^{-\frac{1}{4}}\phi \rangle = \langle \psi,\, e^{-e^{2\pi w} H} \phi \rangle . \tag{2.2}$$
Both $\psi$, $\phi$ are of the form $\psi = \Delta^{-\frac{1}{4}}\psi'$, $\phi = \Delta^{\frac{1}{4}}\phi'$, $\psi', \phi' \in D_\Delta$. Since the set of such $\psi'$, $\phi'$ is dense in $\mathcal{H}$, the equation above becomes an equation for bounded operators, which yields:
$$e^{isH} = \Delta^{-\frac{1}{4}} e^{-sH} \Delta^{\frac{1}{4}} , \quad s \ge 0 . \tag{2.3}$$
Next, we show invariance of $\Omega$ following arguments from [13, Proof of Lemma 2.3.c]: let $E$ be the projection onto the eigenvectors of $\Delta$ having eigenvalue 1. Multiplying the identity (2.3) from both sides by $E$ leads to:
$$E e^{isH} E = E e^{-sH} E , \quad s \ge 0 .$$
Here, the right-hand side is a positive operator and thus we have as well:
$$(E e^{isH} E)^* = E e^{-isH} E = E e^{-sH} E = E e^{isH} E , \quad s \ge 0 .$$
According to a standard argument,$^{d}$ this invariance with respect to conjugation yields: $E e^{isH} E = E e^{i0H} E = E$. Therefore, all vectors $\xi$ satisfying $\xi = E\xi$ are invariant under the action of $V$ and this means in particular: $V(s)\Omega = \Omega$, $\forall s \in \mathbb{R}$.

It now follows from 4(c) and [14, Proposition 2.5.28] that $e^{-sH}$, $s \ge 0$, leaves the natural cone of $(M, \Omega)$ globally fixed. The other assumptions of [11, Theorem 1.1] are the identities:
$$\Delta^{it} e^{-Hs} \Delta^{-it} = e^{-s e^{-2\pi t} H} , \quad s \ge 0 ,$$
$$e^{-Hs}\Omega = \Omega , \quad s \ge 0 .$$
These relations are obvious by analytic continuation of results derived above. By [11, Theorem 1.1] the adjoint action of $V(s)$, $s \ge 0$, does indeed induce endomorphisms of $M$ and we have completed the proof.

$^{d}$ Such an argument is given, for example, in [43, Proof of Corollary 7] and uses the spectrum condition, the Phragmén–Lindelöf theorem, Schwarz’ reflection principle and Liouville’s theorem.
The arguments in the proof of Theorem 2.1 apply, with minor alterations, to translation groups with negative generator, for example the special conformal transformations $U(S(\cdot))$. While $J$ has the same action, $J U(S(n)) J = U(S(-n))$, the scaling behavior is opposite:
$$\Delta^{it} U(S(n)) \Delta^{-it} = U(S(e^{2\pi t} n)) . \tag{2.4}$$
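At the level of the group $\mathrm{PSL}(2,\mathbb{R})$, and using the identification $\Delta^{it} = U(D(-2\pi t))$ stated earlier, the two scaling laws 3(a) and (2.4) reduce to elementary matrix identities, as does the conjugacy of $T(a)$ and $S(-a)$. The following sketch checks them numerically. The $2 \times 2$ matrix realization, the helper names, and the choice of $x \mapsto -1/x$ as conjugating element are illustrative assumptions, not taken from the text.

```python
import math

def mul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def T(a):
    """Translation x -> x + a."""
    return ((1.0, a), (0.0, 1.0))

def S(n):
    """Special conformal transformation x -> x / (1 + n x)."""
    return ((1.0, 0.0), (n, 1.0))

def D(t):
    """Dilatation x -> e^t x, written with determinant 1."""
    r = math.exp(t / 2.0)
    return ((r, 0.0), (0.0, 1.0 / r))

def same_in_psl(m, n, tol=1e-9):
    """Equality in PSL(2, R): matrices agree up to an overall sign."""
    fm = [x for row in m for x in row]
    fn = [x for row in n for x in row]
    plus = all(abs(x - y) < tol for x, y in zip(fm, fn))
    minus = all(abs(x + y) < tol for x, y in zip(fm, fn))
    return plus or minus

t, s, n = 0.3, 1.7, -0.8
u = 2.0 * math.pi * t

# Group-level form of 3(a): D(-2 pi t) T(s) D(2 pi t) = T(e^{-2 pi t} s).
translation_law = same_in_psl(mul(mul(D(-u), T(s)), D(u)), T(math.exp(-u) * s))

# Group-level form of (2.4): D(-2 pi t) S(n) D(2 pi t) = S(e^{2 pi t} n).
special_law = same_in_psl(mul(mul(D(-u), S(n)), D(u)), S(math.exp(u) * n))

# T(a) is conjugate to S(-a); here via W: x -> -1/x (assumed choice).
W = ((0.0, -1.0), (1.0, 0.0))
W_inv = ((0.0, 1.0), (-1.0, 0.0))
conjugacy = same_in_psl(mul(mul(W, T(s)), W_inv), S(-s))
```

The opposite signs in the exponents of the two laws are exactly the "opposite scaling behavior" referred to above.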
The negative spectrum together with the opposite scaling law (2.4) shows that the condition characterizing endomorphism semi-groups is just the same as in condition 4(c). Since the arguments are completely analogous as for the case of positive spectrum and scaling law 3(a), 4(b) we state the following corollary without proof:

Corollary 2.1. The statements in Theorem 2.1 still hold, if one replaces 1(a), 4(a) by $V(s) = e^{iKs}$, $K \le 0$, and uses $\Delta^{it} V(s) \Delta^{-it} = V(e^{2\pi t} s)$, $s, t \in \mathbb{R}$, instead of 3(a), 4(b).

At this stage our intuition about the geometric action of $U^A$ on B can be verified. We will discuss the general situation right after the following corollary:

Corollary 2.2. Assume $U^A$ to have the net-endomorphism property. Then the adjoint action of $U^{A'}(\tilde D(\cdot))$ on $B(S^1_+)$ defines a group of automorphisms. For $s \ge 0$ the adjoint action of $U^A(\tilde T(s))$ induces endomorphisms of $B(S^1_+)$ and the adjoint action of $U^A(\tilde T(-s))$ maps $B(S^1_+)$ into $B(T(-s)S^1_+)$. The corresponding statements hold true, if one replaces $A$ by $A'$ or $\tilde T(\cdot)$ by $\tilde S(\cdot)$.

Proof. The statement on $\mathrm{Ad}_{U^{A'}(\tilde D(\cdot))}$ follows from $U^{A'} = U \circ p\; U^{A*}$ and covariance of B. Using the factorization of $U(T(s)) = U^A(\tilde T(s))\,U^{A'}(\tilde T(s))$, covariance and isotony of B, the statement on $\mathrm{Ad}_{U^{A'}(\tilde D(\cdot))}$ and invariance of $\Omega$ with respect to $U^{A'}$, we have the following inequality for all $t \in \mathbb{R}$, $s \ge 0$, $B_+ \in B(S^1_+)_+$, $B'_+ \in B(S^1_-)_+$:
$$0 \le \langle U^{A'}(\tilde D(t))^* B'_+ U^{A'}(\tilde D(t))\Omega,\; U(T(s))\, U^{A'}(\tilde D(t))^* B_+ U^{A'}(\tilde D(t))\Omega \rangle$$
$$= \langle B'_+ \Omega,\; U^A(\tilde T(s))\, U^{A'}(\tilde T(e^{t} s)) B_+ \Omega \rangle .$$
In the limit $t \to -\infty$ strong continuity of $U^{A'}$ implies $\langle B'_+ \Omega, U^A(\tilde T(s)) B_+ \Omega \rangle \ge 0$, which in turn yields the statement on $U^A(\tilde T(s))$, $s \ge 0$, by Theorem 2.1. Following the same argument with $A$ instead of $A'$ and vice versa leads to the corresponding statement on $U^{A'}(\tilde T(s))$, $s \ge 0$. If one replaces in both statements $\tilde T(s)$ by $\tilde S(s)$, one may apply the argument as well, but using the limit $t \to \infty$ and Corollary 2.1. The remainder follows immediately from the following argument, which we indicate for the translations represented through $U^A$:
$$\mathrm{Ad}_{U^A(\tilde T(-s))} B(S^1_+) = \mathrm{Ad}_{U(T(-s))}\, \mathrm{Ad}_{U^{A'}(\tilde T(s))} B(S^1_+) \subset B(T(-s)S^1_+) .$$

The geometric action of a general $U^A(\tilde g)$, $\tilde g \in \mathrm{PSL}(2,\mathbb{R})^\sim$, on an arbitrary local algebra $B(I)$ is discussed easily. We may restrict our attention to group elements $\tilde g$
for which there is a single sheet of the covering projection p containing both g˜ and the identity, as the following discussion indicates. Every element g in PSL(2, R) is contained in (at least) one one-parameter group e [52, 53]. We use the local identification of one-parameter subgroups in PSL(2, R) and in PSL(2, R)∼ , choose a parameterization such that g˜ = g˜(1), id = g˜(0), and we S1 set γg˜ (I) := τ =0 p(˜ g(τ ))I. For g˜ further away from the identity we set γg˜ (I) = S1 1 and take B(S ) to be the algebra of all bounded operators on H . Then we have: Proposition 2.1. Assume U A to have the net-endomorphism property. Then we have for any g˜ ∈ PSL(2, R)∼ and any I b S1 : AdU A (˜g) B(I) ⊂ B(γg˜ (I)), and AdU A0 (˜g) B(I) ⊂ B(γg˜ (I)). Proof. Each proper interval I in S1 may be identified by the ordered pair consisting of its boundary points, z+ and z− . We define three one-parameter subgroups in PSL(2, R) referring to each I b S1 with respect to a particular choice h ∈ PSL(2, R) satisfying hS1+ = I: DI (·) = hD(·)h−1 , TI (·) = hT (·)h−1 , SI (·) = hS(·)h−1 . Each element g in PSL(2, R) is fixed, up to a dilatation DI (t), by its action on {z+ , z− }. Under the action of elements g(τ ), τ = 0, . . . , 1, interpolating in the one-parameter group associated with g between the identity and g = g(1), the orbits of z± are given by monotonous functions z+ (τ ), z− (τ ). Demanding s, n, t to depend continuously on τ and to take value 0 at τ = 0, every g(τ ) may be represented as g(τ ) = STI (s(τ ))I (n(τ ))TI (s(τ ))DI (t(τ )) or as g(τ ) = TSI (n(τ ))I (s(τ ))SI (n(τ ))DI (t(τ )). We choose one form which works for all interpolating elements. By the requirements we have made it is ensured that the representation works (after obvious identifications) in PSL(2, R)∼ as well. Corollary 2.2 implies the claim of the proposition now. This proves in particular: for every I b S1 there is a neighborhood of the identity in PSL(2, R)∼ for which the action of AdU A (.) 
on $\mathcal{B}(I)$ yields local observables. We have found $\mathrm{Ad}_{U^{\mathcal A}}$ to induce homomorphisms from local algebras of $\mathcal{B}$ into algebras associated with an enlarged localization region. This action respects isotony, i.e. the net-structure. The adjoint action of $U$ induces the covariance isomorphisms of local algebras and one usually regards these as automorphisms of the net $\mathcal{B}$. We consider, therefore, the term net-endomorphisms appropriate. The automorphic action of $\mathrm{Ad}_{U^{\mathcal A}(\tilde D(\cdot))}$ on $\mathcal{B}(S^1_+)$ which we proved in Lemma 2.1 does not, apparently, follow from the endomorphism property for the translation subgroups in Corollary 2.2. This motivated Definition 2.1 above. In the next section we give a holographic interpretation of the net-endomorphism property. This shows that the results achieved so far are satisfactory and yield an interesting and useful new insight into structures associated with chiral conformal subnets and their Coset models.

$^{\mathrm e}$ I am indebted to D. Guido for providing the reference. In the particular case of $\mathrm{PSL}(2,\mathbb{R})$ this fact may be checked directly (cf. [45]).
April 20, 2004 11:21 WSPC/148-RMP
364
00200
S. K¨ oster
3. Chiral Holography

The mapping $(\tilde g, \tilde h) \mapsto U^{\mathcal A}(\tilde g)\, U^{\mathcal A'}(\tilde h)$ defines a representation $U^{\mathcal A} \times U^{\mathcal A'}$ of the group $\mathrm{PSL}(2,\mathbb{R})^\sim \times \mathrm{PSL}(2,\mathbb{R})^\sim$. This is, in fact, a representation of the conformal symmetry group of a local conformal quantum theory in (1+1) dimensions, which is isomorphic to $(\mathrm{PSL}(2,\mathbb{R})^\sim \times \mathrm{PSL}(2,\mathbb{R})^\sim)/\mathbb{Z}$. This factor group arises, if one identifies the simultaneous rigid conformal rotation by $2\pi$, namely $(\tilde R(2\pi), \tilde R(2\pi))$, with the trivial transformation. The last section taught us a lot about the subgeometrical action of $U^{\mathcal A}$, $U^{\mathcal A'}$ on the local observables in $\mathcal{B}$. So, it is natural to look for a relation between the geometrical character of this action and structures in (1+1) dimensions.

This relation turns out to be a complete correspondence: We construct a (1+1)-dimensional, local, conformal theory from the original chiral theory $\mathcal{B}$ applying the net-endomorphism property of $U^{\mathcal A}$. In order to prove locality in (1+1) dimensions we are led to a particular choice of light-cone coordinates, by which the original local algebras $\mathcal{B}(I)$, $I \Subset S^1$, are included in the (1+1)-dimensional picture as time zero algebras. This choice of coordinates yields an unphysical spectrum condition: translations in the right spacelike wedge have positive spectrum. Whereas this prohibits an interpretation of the new theory as a genuinely physical one, where we would have positivity of the spectrum in future-like directions, the construction does provide us with a useful geometrical picture for questions concerned with chiral subnets and their Coset models. For this reason we regard the result of our construction as a local, conformal quasi-theory in (1+1) dimensions.

If, on the other hand, one takes a (physical) conformal quantum theory in (1+1) dimensions and defines a chiral conformal net by restriction to time zero algebras, a similar phenomenon arises (cf.
[41, 49]): the spectrum condition disappears altogether, but powerful tools of local quantum theory are available still, because the Reeh–Schlieder property survives. In our case there remains a spectrum condition from which one can still derive the Reeh–Schlieder property. In this sense we find a natural “converse” of the restriction process which justifies the term chiral holography for our construction.

The main result of this section will be proved by making contact with the analysis of Brunetti, Guido and Longo [7] who discussed conformal quantum field theories in general spacetime dimensions as local quantum theories on the conformal covering of the respective Minkowski space as extensions of local nets living on Minkowski space itself. In (1+1) dimensions, Minkowski space $M$ is the Cartesian product of two chiral light-rays, which we take as light-cone coordinates of $M$. One arrives at the (physical) conformal covering $\tilde M$ of $M$, if one compactifies both light-rays adding the points at infinity, takes the infinite, simply connected covering of the compactification $S^1 \times S^1$, which yields $\mathbb{R} \times \mathbb{R}$, and, finally, one identifies all points which are connected by the action of simultaneous rigid conformal rotations by $2\pi$. The result has the shape of a cylinder having infinite timelike extension: $\tilde M = S^1 \times \mathbb{R}$. Without
the final identification we would have spacelike separated copies of $M$ in covering space, which we consider unphysical; conformally covariant quantum fields can be proven to live on this (physical) conformal covering of Minkowski space, see [47]. Light-rays in $\tilde M$ are infinitely extended, universal coverings of the compactified light-rays and serve well as light-cone coordinates. The localization regions are (1+1)-dimensional double cones given as Cartesian product of two intervals, $I \times J$, where $I$, $J$ are properly contained in a single copy of $S^1$ on the left and right light-rays, respectively, in $\tilde M$.

$\mathrm{PSL}(2,\mathbb{R})^\sim$ has an action on the infinite covering $\mathbb{R}$ of $S^1$ which is transitive for the intervals properly contained in a single copy of $S^1$. We exclude the point at infinity from $S^1$ and choose a fixed interval $I$ which is properly contained in the remainder. This interval is identified with its first (pre-)image in covering space and we choose for any pair of proper intervals $J_{L,R}$ group elements $\tilde g_{L,R} \in \mathrm{PSL}(2,\mathbb{R})^\sim$ satisfying $J_L = \tilde g_L I$, $J_R = \tilde g_R I$. Making use of this choice we define a set of (local) algebras indexed by (1+1)-dimensional double cones:

$$\mathcal{B}^{1+1}(J_L \times J_R) := U^{\mathcal A}(\tilde g_L)\, U^{\mathcal A'}(\tilde g_R)\, \mathcal{B}(I)\, U^{\mathcal A}(\tilde g_L)^*\, U^{\mathcal A'}(\tilde g_R)^* . \tag{3.1}$$

By covariance of $\mathcal{B}$, the resulting algebra $\mathcal{B}^{1+1}(J_L \times J_R)$ is uniquely determined by $J_L \times J_R$. Furthermore, we define a covering projection $p$ from $\mathbb{R}$ onto $S^1$ referring to the covering projection $p : \mathrm{PSL}(2,\mathbb{R})^\sim \to \mathrm{PSL}(2,\mathbb{R})$ such that we have: $pJ_{L,R} := p(\tilde g_{L,R}) I$. This definition enables us to state two identities for the algebras defined in equation (3.1):

$$\mathcal{B}^{1+1}(J_L \times J_R) = U^{\mathcal A}(\tilde g_L \tilde g_R^{-1})\, \mathcal{B}(pJ_R)\, U^{\mathcal A}(\tilde g_L \tilde g_R^{-1})^* = U^{\mathcal A'}(\tilde g_R \tilde g_L^{-1})\, \mathcal{B}(pJ_L)\, U^{\mathcal A'}(\tilde g_R \tilde g_L^{-1})^* .$$

Double cones $J \times J$, which are centered at the time zero axis, are called time zero double cones and we get for the corresponding time zero algebras: $\mathcal{B}^{1+1}(J \times J) = \mathcal{B}(pJ)$. Thus, the local algebras of the original chiral conformal theory $\mathcal{B}$ are included into the new quasi-theory $\mathcal{B}^{1+1}$ as time zero algebras. Now we are prepared to state the main result of this section:

Theorem 3.1. If $\mathcal{A} \subset \mathcal{B}$ is an inclusion of chiral conformal theories and if the unique inner-implementing representation $U^{\mathcal A}$ associated with this inclusion has the net-endomorphism property, then Eq. (3.1) defines a set $\mathcal{B}^{1+1}$ of local algebras assigned to double cones in (1+1)-dimensional conformal space time, $\tilde M$, having all but one of the usual properties of a local, conformal, weakly additive quantum theory in (1+1) dimensions (see [7]): the spectrum condition holds for translations in the right spacelike wedge.

Proof. Obviously, the set $\mathcal{B}^{1+1}$ of local algebras is covariant with respect to the representation $U^{\mathcal A} \times U^{\mathcal A'}$. Because of the identity $U^{\mathcal A}(\tilde R(2\pi))\, U^{\mathcal A'}(\tilde R(2\pi)) = \mathbf{1}$ the set $\mathcal{B}^{1+1}$ is in fact labelled by the double cones in $\tilde M$ and $U^{\mathcal A} \times U^{\mathcal A'}$ is a
representation of the conformal group in (1+1) dimensions, namely the group $(\mathrm{PSL}(2,\mathbb{R})^\sim \times \mathrm{PSL}(2,\mathbb{R})^\sim)/\mathbb{Z}$. The spectrum condition for $U^{\mathcal A} \times U^{\mathcal A'}$ was proved in [43, Corollary 6]. The vacuum vector is invariant with respect to $U^{\mathcal A} \times U^{\mathcal A'}$ [43, Corollary 7] and it is a basis for the space of vectors with this property, because the space of $U$-invariant vectors is one-dimensional. $\Omega$ is cyclic for all local algebras in $\mathcal{B}^{1+1}$ because of the Reeh–Schlieder property of $\mathcal{B}$.

Isotony follows directly from the net-endomorphism property. An inclusion of (1+1)-dimensional double cones $\tilde g_L I \times \tilde g_R I \subset \tilde h_L I \times \tilde h_R I$ contained in Minkowski space $M$ yields the relations: $\tilde h_{L,R}^{-1} \tilde g_{L,R} I \subset I$. Applying Proposition 2.1 we get: $\mathrm{Ad}_{U^{\mathcal A'}(\tilde h_R^{-1} \tilde g_R)\, U^{\mathcal A}(\tilde h_L^{-1} \tilde g_L)}\, \mathcal{B}(I) \subset \mathcal{B}(I)$. This is equivalent to $\mathcal{B}^{1+1}(\tilde g_L I \times \tilde g_R I) \subset \mathcal{B}^{1+1}(\tilde h_L I \times \tilde h_R I)$.

Locality for double cones in $M$ is shown easily as well. We can reduce the discussion to the situation where there is a double cone $J_1 \times J_2$ spacelike to our basic time zero double cone $I \times I$ simply by applying an appropriate transformation. There is a time zero double cone $J \times J$ which contains $J_1 \times J_2$ and is spacelike to $I \times I$. Since we have shown isotony for $\mathcal{B}^{1+1}$, locality for this set follows from locality of $\mathcal{B}$.

Weak additivity may be proved as in the chiral case. By scale covariance the local algebras of $\mathcal{B}^{1+1}$ are continuous from the inside as well as from the outside [51]. Because we can restrict the discussion to time zero algebras and the argument of Jörß [38] for the corresponding chiral situation may be extended directly, we have weak additivity for $\mathcal{B}^{1+1}$.

The proof is complete, if one recognizes that the proof of [7, Proposition 1.9] (on the unique extendibility of $\mathcal{B}^{1+1}$ from $M$ to all of $\tilde M$) only requires the prerequisites established so far. In particular, the spectrum condition itself is not needed, but only its consequence, the Reeh–Schlieder property.
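The isotony step of the proof can be written out as a short chain of inclusions. The display below is only a sketch: it uses that $U^{\mathcal A}$ and $U^{\mathcal A'}$ commute (as the representation property of $U^{\mathcal A} \times U^{\mathcal A'}$ requires), and the shorthand $U^{\times}(\tilde g_L, \tilde g_R) := U^{\mathcal A}(\tilde g_L)\, U^{\mathcal A'}(\tilde g_R)$ is introduced here for illustration and does not appear in the text.

```latex
% Sketch of the isotony argument; U^\times is a hypothetical shorthand for U^A(g_L) U^{A'}(g_R).
\begin{align*}
\mathcal{B}^{1+1}(\tilde g_L I \times \tilde g_R I)
 &= \mathrm{Ad}_{U^{\times}(\tilde g_L,\,\tilde g_R)}\,\mathcal{B}(I) \\
 &= \mathrm{Ad}_{U^{\times}(\tilde h_L,\,\tilde h_R)}\,
    \mathrm{Ad}_{U^{\times}(\tilde h_L^{-1}\tilde g_L,\;\tilde h_R^{-1}\tilde g_R)}\,\mathcal{B}(I) \\
 &\subset \mathrm{Ad}_{U^{\times}(\tilde h_L,\,\tilde h_R)}\,\mathcal{B}(I)
  \;=\; \mathcal{B}^{1+1}(\tilde h_L I \times \tilde h_R I).
\end{align*}
```

The inclusion in the last line is the place where Proposition 2.1 enters, applied to the group elements $\tilde h_{L,R}^{-1} \tilde g_{L,R}$, which map $I$ into itself.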
In light of this theorem we obtain a straightforward interpretation of the subgeometrical action of $U^{\mathcal A}$ on $\mathcal{B}$. If we apply a chiral coordinate transformation $\tilde g_R$ to a time zero double cone $J \times J$ and if we test the localization of the correspondingly transformed local algebra of $\mathcal{B}^{1+1}$ only by looking at time zero algebras, then we find that the result commutes just with time zero algebras $\mathcal{B}(K)$ assigned to proper intervals $K$ contained in the causal complement of $\gamma_{\tilde g_R} J$. The statement of Proposition 2.1 then follows from Haag duality of $\mathcal{B}$.

The theorem has some direct applications to chiral subtheories and their Coset models: we have found that the maximal Coset model $\mathcal{C}_{\max}$ associated with a subtheory $\mathcal{A} \subset \mathcal{B}$ may be regarded as the chiral conformal theory of all right chiral observables in $\mathcal{B}^{1+1}$ in the sense of Rehren [57], i.e. the local observables of $\mathcal{B}^{1+1}$ which are invariant under the action of transformations on the left light-cone coordinate only. The observables of $\mathcal{A}$ may be viewed as left chiral observables and the chiral conformal subnet $\mathcal{A}_{\max} \subset \mathcal{B}$ consisting of local observables invariant with respect
to the action of $U^{\mathcal A'}$ (and hence covariant with respect to the action of $U^{\mathcal A}$) is to be identified with the chiral theory of all left chiral observables in $\mathcal{B}^{1+1}$. Thus, we have identified $\mathcal{A}_{\max}$ and $\mathcal{C}_{\max}$ as fixed-points of a space-time symmetry acting on a suitably extended theory, namely $\mathcal{B}^{1+1}$. In the presence of the net-endomorphism property it is not necessary to extend the “classical” symmetry concept (see e.g. [2]), if one wants to interpret the chiral subtheories $\mathcal{A}_{\max}$ and $\mathcal{C}_{\max}$ as fixed-points of a symmetry; all one has to do is to extend the theory $\mathcal{B}$ to its holographic image. Generalizations of the symmetry concept are necessary for a large class of chiral conformal subtheories [50, 56].

Further remarks$^{\mathrm f}$: Another interesting, direct consequence of Theorem 3.1 is the following: The cyclic subspaces of $\mathcal{C}_{\max}$ and $\mathcal{A}_{\max}$, namely $\mathcal{C}_{\max}(I)\Omega$ and $\mathcal{A}_{\max}(I)\Omega$, coincide with the spaces of $U^{\mathcal A}$- and $U^{\mathcal A'}$-invariant vectors, respectively.$^{\mathrm g}$ By this, results from character arguments on inclusions of current algebras have a direct and rigorous meaning to the analysis of the respective inclusions of chiral conformal theories and Coset models. Here one starts with an inclusion of current algebras, $\mathcal{A} \subset \mathcal{B}$, and looks at the decomposition of the vacuum representation of $\mathcal{B}$ when restricted to the subtheory $\mathcal{A} \vee \mathcal{C} \subset \mathcal{B}$, $\mathcal{C}$ some Coset model associated with $\mathcal{A} \subset \mathcal{B}$.

Goddard, Kent and Olive [28] constructed the minimal series of the Virasoro algebra, i.e. the quantum field theories generated by the stress-energy tensors having central charge less than 1, as Coset models associated with the inclusion of current algebras $SU(2)_{k+1} \subset SU(2)_1 \otimes SU(2)_k$, $k = 1, 2, \ldots$. Their decomposition formulas show the Coset stress-energy tensor to generate all of $\mathcal{C}_{\max}$ and $SU(2)_{k+1}$ to coincide with its maximal covariant extension,$^{\mathrm h}$ $\mathcal{A}_{\max}$. We call the chiral conformal quantum theories generated by a stress-energy tensor with central charge $c$ less than one “$\mathrm{Vir}_{c<1}$ models”.
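That these Coset models have central charge less than 1 can be checked by a short computation. The sketch below assumes the standard Sugawara value $c(SU(2)_k) = 3k/(k+2)$, a textbook formula that is not stated in the text itself.

```latex
% Central charge of the Coset model for SU(2)_{k+1} \subset SU(2)_1 \otimes SU(2)_k,
% assuming c(SU(2)_k) = 3k/(k+2).
\begin{align*}
c_{\mathrm{coset}}(k)
 &= c(SU(2)_1) + c(SU(2)_k) - c(SU(2)_{k+1})
  = 1 + \frac{3k}{k+2} - \frac{3(k+1)}{k+3} \\
 &= 1 + \frac{3\bigl[k(k+3) - (k+1)(k+2)\bigr]}{(k+2)(k+3)}
  = 1 - \frac{6}{(k+2)(k+3)}, \qquad k = 1, 2, \ldots
\end{align*}
```

For $k = 1$ this gives $c = 1/2$ and for $k = 2$ it gives $c = 7/10$; the values increase with $k$ and stay strictly below 1.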
Kac and Wakimoto [46] gave a list of such decompositions for inclusions $\mathcal{A} \vee \mathcal{C} \subset \mathcal{B}$, where $\mathcal{A} \subset \mathcal{B}$ is an inclusion of current algebras and $\mathcal{C}$ is a $\mathrm{Vir}_{c<1}$-Coset model associated with this inclusion. Their list includes some examples in which $\mathcal{A}_{\max}$ and/or $\mathcal{C}_{\max}$ are non-trivial local extensions of $\mathcal{A}$ and $\mathcal{C}$, respectively.

The local extensions of all $\mathrm{Vir}_{c<1}$ models have been classified completely by Kawahigashi and Longo [40]. Most of the non-trivial ones are given by orbifolds: the local extension contains the $\mathrm{Vir}_{c<1}$ model as fixed-point subtheory with respect to a $\mathbb{Z}_2$ symmetry. Some of these are among the examples of [46]. Only four local extensions are of a different type. For two of these Kawahigashi and Longo gave a rigorous interpretation as Coset models following the suggestions of Böckenhauer and Evans [5].
$^{\mathrm f}$ For further details see [45] and the Appendix.
$^{\mathrm g}$ By [57, Lemma 2.3]. The proof of Proposition 4.1 includes an alternative argument leading to this statement.
$^{\mathrm h}$ Kawahigashi and Longo gave an alternative argument on this point [40, Lemma 3.2, Corollary 3.3].
One of the remaining two is given as a maximal Coset model by chiral holography and the results of [46]: the vacuum representations of the maximal Coset models associated with the current algebra inclusions $SU(9)_2 \subset E(8)_2$ and $E(8)_3 \subset E(8)_2 \otimes E(8)_1$ both decompose upon restriction to the $\mathrm{Vir}_{c=21/22}$ model into the direct sum of the vacuum representation and the representation with highest weight 8. Following Kawahigashi and Longo there is only one local extension with this decomposition, namely the extension $(A_{10}, E_6)$ according to the classification scheme [40], which thus is identified as the maximal Coset model associated with both current algebra inclusions.

By classification results on inclusions of current algebras giving rise to $\mathrm{Vir}_{c<1}$ Coset models [6], the fourth exceptional local extension, namely $(A_{28}, E_8)$ of $\mathrm{Vir}_{c=144/145}$, does not seem to be available by a Coset construction using current algebra inclusions. However, the local extension is known to exist by an abstract construction relying on the DHR-data of $\mathrm{Vir}_{c=144/145}$ [40] and thus appears to be a genuine achievement of local quantum physics.

As we have mentioned before, the holographic image $\mathcal{B}^{1+1}$ may not be interpreted as a physical model because of its peculiar spectrum condition. The picture changes in this aspect, if the net-endomorphism property takes a sharper form, namely if the transformed algebra, $\mathrm{Ad}_{U^{\mathcal A}(\tilde g)}\, \mathcal{B}(I)$, commutes with all $\mathcal{B}(J)$, $J$ a proper interval contained in $I' \cap (p(\tilde g) I)'$. After transfer into the holographic picture this property can easily be seen to be equivalent to timelike commutativity of $\mathcal{B}^{1+1}$. In this case we may interchange the role of space and time in the holographic picture and get a physically sensible conformal quantum theory in (1+1) dimensions.
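The central charges $21/22$ and $144/145$ quoted above for the two exceptional extensions can be cross-checked numerically. The sketch below assumes the standard Sugawara formula $c(G_k) = k \dim G / (k + h^\vee)$ with the usual Lie-algebra data $\dim E(8) = 248$, $h^\vee(E(8)) = 30$, $\dim SU(9) = 80$, $h^\vee(SU(9)) = 9$, and the usual labelling $c = 1 - 6/(m(m+1))$ of the minimal series; none of these conventions is spelled out in the text.

```latex
% Cross-check of the central charges, assuming c(G_k) = k dim G / (k + h^vee).
\begin{align*}
c(E(8)_1) &= \tfrac{248}{31} = 8, \quad
c(E(8)_2) = \tfrac{496}{32} = \tfrac{31}{2}, \quad
c(E(8)_3) = \tfrac{744}{33} = \tfrac{248}{11}, \quad
c(SU(9)_2) = \tfrac{160}{11}, \\
c(E(8)_2) - c(SU(9)_2) &= \tfrac{341 - 320}{22} = \tfrac{21}{22}, \\
c(E(8)_2) + c(E(8)_1) - c(E(8)_3) &= \tfrac{341 + 176 - 496}{22} = \tfrac{21}{22}, \\
1 - \tfrac{6}{m(m+1)}\Big|_{m=11} &= 1 - \tfrac{6}{132} = \tfrac{21}{22}, \qquad
1 - \tfrac{6}{m(m+1)}\Big|_{m=29} = 1 - \tfrac{6}{870} = \tfrac{144}{145}.
\end{align*}
```

Under these assumptions both current algebra inclusions indeed produce a Coset stress-energy tensor with $c = 21/22$, and the labels $(A_{10}, E_6)$ and $(A_{28}, E_8)$ match $m = 11$ and $m = 29$ in the pattern $(A_{m-1}, \cdot)$.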
It is not difficult to extend the arguments of Longo [49] on chiral subtheories to the (1+1)-dimensional inclusions $\mathcal{A}_{\max} \vee \mathcal{C}_{\max} \subset \mathcal{B}^{1+1}$: one can show that the representation of $\mathcal{A}_{\max} \otimes \mathcal{C}_{\max}$ induced by the inclusion is unitarily equivalent to a localized representation $\rho$. In case $\rho$ has finite statistical dimension and a finite decomposition into tensor products $\sigma_i \otimes \tau_j$ of irreducible localized representations $\sigma_i$ of $\mathcal{A}_{\max}$ and $\tau_j$ of $\mathcal{C}_{\max}$, respectively, this can be applied to the situation where $\mathcal{B}^{1+1}$ fulfills timelike commutativity in order to derive a necessary criterion for this particular property to hold.

One knows that the statistical phases of $\sigma_i$ and $\tau_j$ in a tensor product $\sigma_i \otimes \tau_j$ occurring in $\rho$ have to be conjugate to each other because of spacelike commutativity [58, Corollary 3.2]. The argument covers the situation of timelike commutativity, where it forces the same statistical phases to coincide. The conformal spin-statistics theorem [29] tells us then that $U^{\mathcal A}(\tilde R(2\pi))$ has to have spectrum in $\{\pm 1\}$, i.e. the conformal highest weights associated with $\sigma_i$ and $\tau_j$ have to lie in $\frac{1}{2}\mathbb{N}$.

This necessary condition excludes all inclusions of current algebras known to the author (except the ones that one can make up trivially). The result on the conformal highest weights is well known for (quasi-) primary fields in (1+1) dimensions commuting with themselves not only for spacelike, but also for timelike
separation. The analyticity properties of their two-point function force both chiral scaling dimensions of such fields to be half integers.

4. Isotony Problem

In this section we use the Additional Assumption in order to solve the isotony problem for the local relative commutants $\mathcal{C}_I$ of an inclusion of chiral conformal theories, $\mathcal{A} \subset \mathcal{B}$. Once their isotony is proved, they are known to coincide with the local algebras of the maximal Coset model $\mathcal{C}_{\max}$ associated with $\mathcal{A} \subset \mathcal{B}$. This way, we reach the main goal of this paper: the maximal Coset model is found to be of a local nature, i.e. it is determined completely by local data.

As we mentioned in the introduction, the isotony problem requires a discussion using an argument suited for our specific scenario. We reduce the task by a purely group theoretical lemma first, which is a slightly extended version of a result of Guido and Longo [29]. In a side remark we use this lemma to characterize the subnets $\mathcal{A}_{\max}$ and $\mathcal{C}_{\max}$ and the vacuum subrepresentation of $\mathcal{A}_{\max} \vee \mathcal{C}_{\max}$.

The argument is continued by a proposition providing necessary and sufficient conditions for isotony of local relative commutants to hold. Aside from being an intermediate step of our analysis, it illustrates the character of the isotony problem. The argument is completed by an application of the Additional Assumption and summarized in the main theorem of this work. In the remainder of this section we make some remarks on immediate applications of the theorem and relations to other works.

Lemma 4.1. Let $\mathcal{H}$ be a separable Hilbert space and $V$ a unitary, strongly continuous representation of $\mathrm{PSL}(2,\mathbb{R})^\sim$ on $\mathcal{H}$. If $H \subset \mathrm{PSL}(2,\mathbb{R})^\sim$ is a subgroup having closed, non-compact image in $\mathrm{PSL}(2,\mathbb{R})$ under the action of the covering projection $p$, then each $V|_H$-invariant vector is in fact $V$-invariant. If $V$ is a representation of positive energy, then each vector which is invariant with respect to $V(\tilde R(\cdot))$ is $V$-invariant as well.

Proof.
The proof of the claim is, up to trivial modifications, identical to the one indicated by Guido and Longo for [29, Corollary B.2]. For the reader’s convenience we include a sketch of the argument. First, one recognizes that it is completely sufficient to discuss the complement of the trivial subrepresentation, $V^\perp$, on the Hilbert subspace $\mathcal{H}^\perp$, which contains no vectors invariant with respect to the whole of $V$. We decompose $V^\perp$ into a direct integral of irreducible representations $V_x$. We look at $V_x \otimes \overline{V_x}$, which can easily be seen not to contain the trivial representation, because $V_x$ is infinite dimensional (cf. e.g. [32]). Moreover, $V_x \otimes \overline{V_x}$ is a representation of $\mathrm{PSL}(2,\mathbb{R})$. Now we are in the position to apply [72, Theorem 2.2.20] and thus we have for any $\xi_x \in \mathcal{H}_x$:

$$\lim_{p(\tilde g) \to \infty} \bigl| \langle V_x(\tilde g)\xi_x, \xi_x \rangle \bigr|^2 = \lim_{p(\tilde g) \to \infty} \bigl\langle V_x(\tilde g) \otimes \overline{V_x}(\tilde g)\, \xi_x \otimes \overline{\xi_x},\; \xi_x \otimes \overline{\xi_x} \bigr\rangle = 0 .$$
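The equality between the two limits, and the way the vanishing limit is used, can be unpacked as follows. This is a sketch which assumes that the second tensor factor is the conjugate representation $\overline{V_x}$ acting on the conjugate Hilbert space (the overbars are a reading of the intended notation, not something the extracted text shows explicitly).

```latex
% Matrix coefficients of V_x \otimes \overline{V_x}; the overbar denotes the conjugate representation (assumed).
\begin{align*}
\bigl\langle (V_x \otimes \overline{V_x})(\tilde g)\,\xi_x \otimes \overline{\xi_x},\;
             \xi_x \otimes \overline{\xi_x} \bigr\rangle
 &= \langle V_x(\tilde g)\xi_x, \xi_x\rangle\,
    \overline{\langle V_x(\tilde g)\xi_x, \xi_x\rangle}
  = \bigl|\langle V_x(\tilde g)\xi_x, \xi_x\rangle\bigr|^2 , \\
V_x(\tilde h)\psi_x = \psi_x \ \ \text{for all } \tilde h \in H
 &\;\Longrightarrow\;
 \bigl|\langle V_x(\tilde h)\psi_x, \psi_x\rangle\bigr|^2 = \|\psi_x\|^4
 \ \ \text{for all } \tilde h \in H .
\end{align*}
```

Since $p(H)$ is closed and non-compact, $p(\tilde h)$ can be sent to infinity within $p(H)$, so the vanishing limit is only compatible with the constant value $\|\psi_x\|^4$ if $\psi_x = 0$.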
If we apply this to a $V_x|_H$-invariant vector $\psi_x$, we readily see: $\psi_x = 0$. Integrating over $x$ yields the first statement of the lemma.

The result on rigid conformal rotations may be deduced in the same manner: the irreducible representations $V_x$ are almost all of positive energy and the only irreducible representation of $\mathrm{PSL}(2,\mathbb{R})^\sim$ having positive energy and containing a non-trivial $\tilde R(\cdot)$-invariant vector is the trivial representation [32].

The following result is partly known from [57, Lemma 2.3]; we give an alternative proof here. Together with the other parts, this proposition may be viewed as a generalized version of [68, Theorem 2.4], which is formulated for a particular class of chiral subnets.

Proposition 4.1. Assume $U^{\mathcal A}$ to have the net-endomorphism property and denote the projections onto the subspaces of $U^{\mathcal A}$- and $U^{\mathcal A'}$-invariant vectors by $E_{\mathcal A}$ and $E_{\mathcal A'}$, respectively. Then we have for the maximal $U^{\mathcal A}$-covariant extension of $\mathcal{A}$, given by $\mathcal{A}_{\max}(I) := \{U^{\mathcal A'}\}' \cap \mathcal{B}(I)$, and the maximal Coset model associated with $\mathcal{A} \subset \mathcal{B}$, given by $\mathcal{C}_{\max}(I) := \{U^{\mathcal A}\}' \cap \mathcal{B}(I)$, for arbitrary $I \Subset S^1$:

$$\mathcal{A}_{\max}(I)\Omega = E_{\mathcal A'} \mathcal{H} , \qquad \mathcal{C}_{\max}(I)\Omega = E_{\mathcal A} \mathcal{H} . \tag{4.1}$$
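For any Coset model $\mathcal{C}$ associated with $\mathcal{A} \subset \mathcal{B}$ we have a unitary equivalence of chiral conformal theories: $\mathcal{A} \vee \mathcal{C}\, e_{\mathcal{A} \vee \mathcal{C}} \cong \mathcal{A} e_{\mathcal A} \otimes \mathcal{C} e_{\mathcal C}$. $E_{\mathcal A} \mathcal{H}$ has a direct interpretation as multiplicity space of the vacuum subrepresentation of $\mathcal{A} \subset \mathcal{B}$.

Proof. Concerning the proof of (4.1) we may restrict to $I = S^1_+$ (because of the Reeh–Schlieder theorem). By Lemma 4.1 the spaces of vectors which are invariant with respect to translations are identical with $E_{\mathcal A} \mathcal{H}$ and $E_{\mathcal A'} \mathcal{H}$, respectively. Taking into account Corollary 2.2 above, the statement (4.1) was proved by Borchers [13, Theorem 2.6.3].

Straightforward verification shows $\mathcal{A} e_{\mathcal A} \otimes \mathcal{C} e_{\mathcal C}$ to be a chiral conformal theory with the obvious definitions: its vacuum is given by $\Omega \otimes \Omega$, the representation implementing covariance is $U e_{\mathcal A} \otimes U e_{\mathcal C}(\cdot)$, its representation space is $e_{\mathcal A} \mathcal{H} \otimes e_{\mathcal C} \mathcal{H}$. The factoriality of the local algebras proves that $\Omega \otimes \Omega$ is (up to scalar multiples) unique [29, Proposition 1.2], [62, IV.5, Corollary 5.11]. One can establish uniqueness of $\Omega \otimes \Omega$ by group theoretic arguments as in Lemma 4.1 as well.

We now look at the restrictions of $\mathcal{A} \vee \mathcal{C}\, e_{\mathcal{A} \vee \mathcal{C}}$ and $\mathcal{A} e_{\mathcal A} \otimes \mathcal{C} e_{\mathcal C}$ to the chiral light-ray, $\mathbb{R}$. $\Omega$ is separating for $\bigcup_{I \Subset \mathbb{R}} \mathcal{A} \vee \mathcal{C}\, e_{\mathcal{A} \vee \mathcal{C}}(I)$, the union of all local algebras assigned to compact intervals in $\mathbb{R}$. Thus, we are allowed to define a linear operator $W$ densely by:

$$W A C \Omega := A\Omega \otimes C\Omega , \qquad A \in \mathcal{A}(I) ,\ C \in \mathcal{C}(I) ,\ I \Subset \mathbb{R} . \tag{4.2}$$

The vacuum is a product state for $\bigcup_{I \Subset \mathbb{R}} \mathcal{A} \vee \mathcal{C}\, e_{\mathcal{A} \vee \mathcal{C}}(I)$ (a corollary to Takesaki’s theorem on modular covariant subalgebras [61]). Hence, $W$ is bounded and extends by continuity to an isometry, as one may readily verify. Moreover, it is elementary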
to check that $W W^*$ and $W^* W$ commute with the respective restricted nets on $\mathbb{R}$, but these are irreducible. Hence, $W$ is a unitary operator. $\mathrm{Ad}_W$ induces a unitary equivalence of the respective local algebras associated with every $I \Subset \mathbb{R}$ by its definition (4.2) and the separating property of the vacuum.

Furthermore, $W$ is readily shown to be covariant. If we denote the covariance automorphisms of $\mathcal{A} e_{\mathcal A} \otimes \mathcal{C} e_{\mathcal C}$ by $\alpha^\otimes$, we have for $gI \Subset \mathbb{R}$, $I \Subset \mathbb{R}$: $\alpha^\otimes_g\, \mathrm{Ad}_W |_{\mathcal{A} \vee \mathcal{C}(I)} = \mathrm{Ad}_W\, \alpha_g |_{\mathcal{A} \vee \mathcal{C}(I)}$. Using the Reeh–Schlieder property of the local algebras, one may reconstruct the representations $U e_{\mathcal A} \otimes U e_{\mathcal C}(\cdot)$ and $U(\cdot) e_{\mathcal{A} \vee \mathcal{C}}$ from the action of the automorphisms. This, in turn, proves that $W$ intertwines the representations $U(\cdot) e_{\mathcal{A} \vee \mathcal{C}}$ and $U e_{\mathcal A} \otimes U e_{\mathcal C}(\cdot)$. Finally, we reconstruct the conformal models from their restrictions to the light-ray using conformal covariance.

In the following discussion $A$ denotes local observables in $\mathcal{A} \subset \mathcal{B}$ and $\pi_0(A)$ its representative in the vacuum representation on $e_{\mathcal A} \mathcal{H} =: \mathcal{H}_0$. The implementation of conformal covariance in $\pi_0$ shall be written $U_0$. For every vacuum subrepresentation in $\mathcal{A} \subset \mathcal{B}$ there is a partial isometry $R : \mathcal{H} \to \mathcal{H}_0$ satisfying $R A = \pi_0(A) R$, for all local $A$ in $\mathcal{A} \subset \mathcal{B}$. The projection $e_R := R^* R$ commutes with all of $\mathcal{A}$. $R U^{\mathcal A}(\cdot) R^*$ is a unitary strongly continuous representation of $\mathrm{PSL}(2,\mathbb{R})^\sim$ which implements global conformal covariance in $\pi_0$, thus: $R U^{\mathcal A}(\cdot) R^* = U_0(\cdot)$. It follows directly that $\Phi_\Omega := R^* \Omega$, the vacuum of the subrepresentation associated with $R$, is invariant with respect to $U^{\mathcal A}$, i.e. $\Phi_\Omega \in E_{\mathcal A} \mathcal{H}$. This completes the proof of the last statement.

It is not clear in general that the representation $\mathcal{A} \vee \mathcal{C}_{\max} \subset \mathcal{B}$ of the tensor product theory defined by the vacuum representation of a chiral subnet $\mathcal{A} \subset \mathcal{B}$ and the vacuum representation of its maximal Coset model has a (spatial) tensor product decomposition. This is known under certain conditions [41]. We write $\mathcal{A} \otimes \mathcal{C}$ for the vacuum representation of $\mathcal{A} \vee \mathcal{C}$.
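The step from the product state property to the isometry of $W$ in the proof of Proposition 4.1 can be made explicit. The following is a sketch, assuming $\omega(X) = \langle \Omega, X\Omega \rangle$ with the inner product linear in its second argument, $A_1, A_2 \in \mathcal{A}(I)$ and $C_1, C_2 \in \mathcal{C}(I)$ for a common interval $I \Subset \mathbb{R}$, and using that elements of $\mathcal{C}(I)$ commute with elements of $\mathcal{A}(I)$.

```latex
% W preserves inner products on the vectors A C \Omega (sketch; conventions assumed as stated above).
\begin{align*}
\langle W A_1 C_1 \Omega,\; W A_2 C_2 \Omega \rangle
 &= \langle A_1\Omega, A_2\Omega\rangle\,\langle C_1\Omega, C_2\Omega\rangle
  = \omega(A_1^* A_2)\,\omega(C_1^* C_2) \\
 &= \omega(A_1^* A_2\, C_1^* C_2)
  = \omega(C_1^* A_1^* A_2 C_2)
  = \langle A_1 C_1 \Omega,\; A_2 C_2 \Omega \rangle .
\end{align*}
```

The third equality is the product state property and the fourth uses commutativity. By sesquilinearity the identity carries over to finite sums of such vectors, which is what makes $W$ well defined and isometric on its dense domain.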
We now give a characterization of isotony for the local relative commutants:

Proposition 4.2. Assume the unique inner-implementing representation $U^{\mathcal A}$ associated with a chiral subnet $\mathcal{A} \subset \mathcal{B}$ to have the net-endomorphism property. Referring to $I \Subset S^1$, $e^c_I$ shall denote the projection onto the Hilbert subspace which the local relative commutant $\mathcal{C}_I = \mathcal{A}(I)' \cap \mathcal{B}(I)$ generates from the vacuum. The following are equivalent:
(1) For some pair $I$, $K$ of intervals satisfying $K \subsetneq I \Subset S^1$ we have: $e^c_K \subset e^c_I$.
(2) $\mathcal{C}_{S^1_+} \subset \{U^{\mathcal A}(\tilde D(t)),\ t \in \mathbb{R}\}'$.
(3) $\mathcal{C}_{\max}(I) = \{U^{\mathcal A}(\tilde g),\ \tilde g \in \mathrm{PSL}(2,\mathbb{R})^\sim\}' \cap \mathcal{B}(I) = \mathcal{C}_I$, $I \Subset S^1$.

Remark: The statement on the cyclic projections is non-trivial since, although the local relative commutants are manifestly covariant with respect to $U$, the Reeh–Schlieder theorem does not apply due to the unclear status of isotony (cf. e.g. [9]).

Proof. The implications 3 ⇒ 1, 2 are obvious. We start the proof with a discussion on 1 ⇒ 3 and here we look at the case $I = S^1_+$ (general case by covariance). We set
$e^c_{S^1_+} \equiv e^c_+$. The inclusion $e^c_K \subset e^c_+$ yields by the separating property of the vacuum and modular covariance of $\mathcal{C}_{S^1_+} \subset \mathcal{B}(S^1_+)$: $\mathcal{C}_K \subset \mathcal{C}_{S^1_+}$. Thus, any $g \in \mathrm{PSL}(2,\mathbb{R})$ satisfying $g S^1_+ = K$ leads to an operator $U(g)$ which leaves $e^c_+ \mathcal{H}$ globally invariant. $g$ has the form $g = S(n) T(s) D(t)$, $n, s \geq 0$. $g$ may be chosen such that $t = 0$.

By modular covariance, $J$, the modular conjugation of $\mathcal{B}(S^1_+)$, and $e^c_+$ commute and, by covariance and the Bisognano–Wichmann property of $\mathcal{B}$, $\mathrm{Ad}_{J U(R(\pi))}$ induces an automorphism of $\mathcal{C}_{S^1_+}$, so $e^c_+$ commutes with $U(R(\pi))$, too. The relations $J T(s) J = T(-s)$, $J S(n) J = S(-n)$ lead to $U(S(-n)) U(T(-s))\, e^c_+ \mathcal{H} \subset e^c_+ \mathcal{H}$. We assume $n, s > 0$ and define

$$g(n, s) := S\!\left(-n\, \frac{ns + (1+ns)^2}{2 + ns}\right) T\!\left(-s\, \frac{2 + ns}{ns + (1+ns)^2}\right) \bigl(S(n) T(s)\bigr)^2 .$$

Applying scale covariance we arrive at: $U(g(n, s))\, e^c_+ \mathcal{H} \subset e^c_+ \mathcal{H}$. The group element $g(n, s)$ leaves the point $1 \in S^1$ invariant and is not a pure scale transformation. This proves that all special conformal transformations leave $e^c_+$ invariant. The same follows for the translations because of $R(\pi) S(n) R(\pi) = T(-n)$, which proves $[U(g), e^c_+] = 0$ for all $g \in \mathrm{PSL}(2,\mathbb{R})$ recognizing that translations and special conformal transformations generate the whole group. For $n = 0$ or $s = 0$ the last part applies directly.

This proves: $e^c_K = e^c_+$ for all $K \Subset S^1$. By modular covariance of the inclusions $\mathcal{C}_K \subset \mathcal{B}(K)$ we have $\mathcal{C}_K = \{e^c_K\}' \cap \mathcal{B}(K)$ and this yields isotony for the local relative commutants. The remainder follows by maximality of $\mathcal{C}_{\max}$.

Finally we discuss the implication 2 ⇒ 3. If $B \in \mathcal{B}(S^1_+)$ commutes with $U^{\mathcal A}(\tilde D(t))$, $t \in \mathbb{R}$, then $B\Omega$ is invariant under the action of all of $U^{\mathcal A}$ (Lemma 4.1). If $\tilde g$ is sufficiently close to the identity, $\mathrm{Ad}_{U^{\mathcal A}(\tilde g)}(B)$ is a local operator (Proposition 2.1), and the separating property of the vacuum proves that $B$ commutes with all of $U^{\mathcal A}$. Thereby, we arrive at $\mathcal{C}_{S^1_+} \subset \mathcal{C}_{\max}(S^1_+)$, provided assumption (2) of Proposition 4.2 holds. The other inclusion is trivial.
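The group-theoretic claims about $g(n,s)$ in the proof of 1 ⇒ 3 can be checked by hand with $2 \times 2$ matrices. The sketch below rests on assumptions that are not fixed in the text: the matrix representatives chosen for $T$, $S$, $D$ and $R(\pi)$ (acting on the real line by Möbius transformations), the reading of the displayed definition of $g(n,s)$ as $S\bigl(-n q/(2+ns)\bigr)\, T\bigl(-s(2+ns)/q\bigr)\, (S(n)T(s))^2$, and the abbreviation $q := ns + (1+ns)^2$, which is introduced here for illustration.

```latex
% Assumed matrix representatives in SL(2,R), acting by x -> (ax+b)/(cx+d); equalities hold up to sign in PSL(2,R).
\begin{align*}
T(s) &= \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}, \quad
S(n) = \begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix}, \quad
D(t) = \begin{pmatrix} e^{t/2} & 0 \\ 0 & e^{-t/2} \end{pmatrix}, \quad
R(\pi) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \\
R(\pi)\, S(n)\, R(\pi) &= -\begin{pmatrix} 1 & -n \\ 0 & 1 \end{pmatrix}
  \;\equiv\; T(-n) \quad \text{in } \mathrm{PSL}(2,\mathbb{R}), \\
\bigl(S(n)T(s)\bigr)^2 &= \begin{pmatrix} 1+ns & s(2+ns) \\ n(2+ns) & q \end{pmatrix},
  \qquad q := ns + (1+ns)^2, \\
g(n,s) &= S\!\left(-\frac{n\,q}{2+ns}\right) T\!\left(-\frac{s(2+ns)}{q}\right)
          \bigl(S(n)T(s)\bigr)^2
  = \begin{pmatrix} q^{-1} & 0 \\[4pt] \dfrac{n(1+ns)(3+ns)}{2+ns} & q \end{pmatrix}.
\end{align*}
```

With these conventions $g(n,s)$ is lower triangular, so it fixes the point $x = 0$, and its off-diagonal entry is non-zero for $n, s > 0$, so it is not a pure dilatation. The two arguments of $S$ and $T$ have product $ns$, which is what allows $S(-nq/(2+ns))\,T(-s(2+ns)/q)$ to be written as $D(\tau)\, S(-n) T(-s)\, D(-\tau)$ with $e^{\tau} = (2+ns)/q$; this is the point where scale covariance enters.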
If the dilatations $U^{\mathcal A}(\tilde D(t))$, $t \in \mathbb{R}$, induce automorphisms of $\mathcal{B}(S^1_+)$, the last part of the proof shows $\mathcal{C}_{\max}(S^1_+)$ to be the fixed-point subalgebra with respect to this automorphism group. Covariance leads to a corresponding identification of every $\mathcal{C}_{\max}(I)$, $I \Subset S^1$. This may be regarded as an alternative “local” characterization of $\mathcal{C}_{\max}$, but since the automorphism groups are determined by global observables, namely non-trivial unitaries from $U^{\mathcal A}$, this is not satisfactory.

Only for the final step of our analysis we need to invoke the Additional Assumption once again:

Lemma 4.2. Assume the Additional Assumption to hold. Then we have: $U^{\mathcal A}(\tilde D(t)) \in \mathcal{A}(S^1_+) \vee \mathcal{A}(S^1_-)$, $t \in \mathbb{R}$, and $U^{\mathcal A}$ has the net-endomorphism property.

Proof. According to the Additional Assumption and Lemma A.1 there exist, for small, fixed $t$, diffeomorphisms $g_\delta$, $g_\varepsilon$ localized in arbitrarily small neighborhoods
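of $+1 \in S^1$ and $-1 \in S^1$, respectively, and diffeomorphisms $g_+^{\tau_1,\tau_2}$, $g_-^{\tau_1,\tau_2}$ which are localized in $S^1_+$ and $S^1_-$, respectively, and phases $\varphi(\tau_1, \tau_2)$ such that for $\tau_{1,2} \in \mathbb{R}_+$:

$$U^{\mathcal A}(\tilde D(t)) = \varphi(\tau_1, \tau_2)\, \Upsilon^{\mathcal A}(p^{-1}(g_+^{\tau_1,\tau_2}))\, \Upsilon^{\mathcal A}(p^{-1}(g_-^{\tau_1,\tau_2})) \cdot \mathrm{Ad}_{U^{\mathcal A}(\tilde D(\tau_1))}\bigl(\Upsilon^{\mathcal A}(p^{-1}(g_\varepsilon))\bigr)\, \mathrm{Ad}_{U^{\mathcal A}(\tilde D(-\tau_2))}\bigl(\Upsilon^{\mathcal A}(p^{-1}(g_\delta))\bigr) .$$

Following Roberts [59, Corollary 2.5], dilatation invariance of the vacuum and the shrinking supports ensure that the last two operators converge weakly to their vacuum expectation values in the limit $\tau_{1,2} \to \infty$. We rewrite the equation above:

$$\mathrm{Ad}_{U^{\mathcal A}(\tilde D(\tau_1))}\bigl(\Upsilon^{\mathcal A}(p^{-1}(g_\varepsilon))\bigr)\, \mathrm{Ad}_{U^{\mathcal A}(\tilde D(-\tau_2))}\bigl(\Upsilon^{\mathcal A}(p^{-1}(g_\delta))\bigr)\, U^{\mathcal A}(\tilde D(t))^* = \varphi(\tau_1, \tau_2)\, \Upsilon^{\mathcal A}(p^{-1}(g_+^{\tau_1,\tau_2}))^*\, \Upsilon^{\mathcal A}(p^{-1}(g_-^{\tau_1,\tau_2}))^* . \tag{4.3}$$

The operators to the right converge weakly by this equation in the limit $\tau_1, \tau_2 \to \infty$. For small $t$, $g_\varepsilon$ and $g_\delta$ may be chosen close to the identity, $\omega(\cdot)$ is continuous and normalized, which means that for $g_\varepsilon, g_\delta \approx \mathrm{id}$ we have $\omega(\Upsilon^{\mathcal A}(p^{-1}(g_\varepsilon))) \neq 0$, $\omega(\Upsilon^{\mathcal A}(p^{-1}(g_\delta))) \neq 0$. This implies $U^{\mathcal A}(\tilde D(t)) \in \mathcal{A}(S^1_+) \vee \mathcal{A}(S^1_-)$ for small and hence for all $t$.

Because $\Upsilon^{\mathcal A}(p^{-1}(g_+^{\tau_1,\tau_2}))$ and $\Upsilon^{\mathcal A}(p^{-1}(g_-^{\tau_1,\tau_2}))$ are unitary operators, the right-hand side of Eq. (4.3) converges, up to a phase, strongly to $U^{\mathcal A}(\tilde D(t))$ for small $t$. This strong convergence proves that for $B \in \mathcal{B}(S^1_+)$ and small $t$ the following holds in the weak topology:

$$U^{\mathcal A}(\tilde D(t))\, B\, U^{\mathcal A}(\tilde D(t))^* = \lim_{\tau_1,\tau_2 \to \infty} \mathrm{Ad}_{\Upsilon^{\mathcal A}(p^{-1}(g_+^{\tau_1,\tau_2}))}(B) \in \mathcal{B}(S^1_+) . \tag{4.4}$$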
This establishes the net-endomorphism property (Definition 2.1).

The statement of this lemma holds trivially, if the global algebra $\mathcal{A}$ coincides with $\mathcal{A}(S^1_+) \vee \mathcal{A}(S^1_-)$. This is a desirable property (e.g. for the Connes’ fusion approach to superselection structure) and it holds in the presence of strong additivity, but a proof of it relying on general properties of chiral conformal subtheories seems out of reach.

We summarize and state the main result of this work, which proves that the maximal Coset models are of a local nature:

Theorem 4.1 (Main Theorem). Let $\mathcal{A} \subset \mathcal{B}$ be an inclusion of chiral conformal quantum theories and suppose the Additional Assumption to hold. Then the unique inner-implementing representation $U^{\mathcal A}$ has the net-endomorphism property and for all $I \Subset S^1$ we have: $\mathcal{C}_{\max}(I) := \{U^{\mathcal A}\}' \cap \mathcal{B}(I) = \mathcal{C}_I := \mathcal{A}(I)' \cap \mathcal{B}(I)$.

Proof. The net-endomorphism property of $U^{\mathcal A}$ holds by Lemma 2.1, and 2 in Proposition 4.2 is fulfilled because of Lemma 4.2.
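The way Lemma 4.2 feeds into condition 2 of Proposition 4.2 can be spelled out in one chain of inclusions. This is a sketch of the intermediate step, assuming only the definition of the local relative commutant and locality of $\mathcal{B}$ (so that $\mathcal{B}(S^1_-)$ commutes with $\mathcal{B}(S^1_+)$).

```latex
% From Lemma 4.2 to condition (2) of Proposition 4.2 (sketch).
\begin{align*}
\mathcal{C}_{S^1_+} = \mathcal{A}(S^1_+)' \cap \mathcal{B}(S^1_+)
 &\subset \mathcal{A}(S^1_+)' \cap \mathcal{A}(S^1_-)'
 && \text{(locality: } \mathcal{A}(S^1_-) \subset \mathcal{B}(S^1_-) \subset \mathcal{B}(S^1_+)'\text{)} \\
 &= \bigl(\mathcal{A}(S^1_+) \vee \mathcal{A}(S^1_-)\bigr)'
  \subset \{U^{\mathcal A}(\tilde D(t)),\ t \in \mathbb{R}\}'
 && \text{(Lemma 4.2)} .
\end{align*}
```

With condition 2 in hand, the implication 2 ⇒ 3 of Proposition 4.2 gives $\mathcal{C}_{\max}(I) = \mathcal{C}_I$ for all $I \Subset S^1$.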
In the cases where both $\mathcal{A}$ and $\mathcal{B}$ possess an integrable stress-energy tensor, and hence $\mathcal{C}_{\max}$ alike, the main theorem means in particular: $\mathcal{A}_{\max}(I)$ and $\mathcal{C}_{\max}(I)$, $I \Subset S^1$ arbitrary, are their mutual relative commutants in $\mathcal{B}(I)$. The local algebras $\mathcal{A}_{\max}(I)$ are factors, which shows that the relative commutant of $\mathcal{A}_{\max}(I) \vee \mathcal{C}_{\max}(I)$ in $\mathcal{B}(I)$ is $\mathbb{C}\mathbf{1}$, i.e. this inclusion is irreducible.

The main theorem proves the conclusions of Rehren [57] to hold true which rely on the generating property of nets of chiral observables, if the (1+1)-dimensional theory contains a stress-energy tensor in the sense of the Lüscher–Mack theorem [26]. Since such a stress-energy tensor factorizes into its independent chiral components, our analysis applies directly. The generating property introduced in [57] resisted attempts of proof even in the presence of a stress-energy tensor, unfortunately.

Further remarks$^{\mathrm i}$: Results of Xu [68, 69] and Longo [49] show that the current algebras $SU(n)_k$, $n, k \in \mathbb{N}$, and all $\mathrm{Vir}_{c<1}$ models$^{\mathrm j}$ are completely rational [41], i.e. they have finitely many sectors, all with finite statistics, they are strongly additive and satisfy the split property. Finiteness of statistics shows that the decomposition formulas of Kac and Wakimoto of inclusions in the current algebras just mentioned yield examples of nets of normal, irreducible canonical tensor product subfactors (normal CTPS) in the sense of Rehren [57]. For these inclusions the fact that $\mathcal{A}_{\max}$ and $\mathcal{C}_{\max}$ are locally their mutual relative commutants follows from the heredity of strong additivity for inclusions of finite index [49]. Our result gives an independent proof relying on the presence of stress-energy tensors only and covers directly all current algebra inclusions.
Rehren has shown for normal CTPS that the sectors $\rho_i^{A_{max}} \prec \rho$, $\rho_j^{C_{max}} \prec \rho$ form sets which are closed under conjugation and (up to direct sums) fusion, that the coupling matrix $Z_{ij}$ has to be a permutation matrix and that the coupling matrix induces an isomorphism of the fusion rules of $A_{max}$ and $C_{max}$ as far as only subendomorphisms of ρ are involved. In particular, the statistical dimensions of $\rho_i^{A_{max}}$ and $\rho_j^{C_{max}}$ have to coincide for $Z_{ij} \neq 0 \Rightarrow Z_{ij} = 1$. Thereby, the results of Kac and Wakimoto and similar decomposition formulas allow us to translate information on the superselection structure of $A_{max}$ into information on $C_{max}$ and vice versa. Recently, Müger [54] succeeded in extending the results of Rehren: he proved that for normal CTPS $A_{max} \vee C_{max} \subset B$ the coupling matrix even induces an isomorphism of the respective DHR subcategories if B has trivial superselection structure, i.e. the vacuum representation is its only locally normal representation. There is one current algebra which has trivial superselection structure, namely $E(8)_1$. In a study on branching rules associated with conformal inclusions in exceptional current algebras, Kac and Niculescu Sanielevici [42] provided some decomposition formulas which yield examples of this structure. Particularly interesting is the embedding $SU(2)_{16} \vee SU(3)_6 \subset E(8)_1$. If we regard $SU(2)_{16}$ as chiral conformal subtheory $A \subset E(8)_1$ and $SU(3)_6$ as associated Coset model, C, then both $A_{max}$ and $C_{max}$ are non-trivial extensions and the localized representation connected with $A_{max} \vee C_{max} \subset E(8)_1$ is found to be a “diagonal” sum of six tensor products. The latter are known to be inequivalent by the result of Rehren, and Müger’s results show that the respective DHR categories associated with the endomorphisms of $A_{max}$ and $C_{max}$, respectively, involved in ρ are isomorphic.

i Further details in [45] and in the Appendix.
j This list may easily be extended, e.g. by looking at branching rules, for example in [46, 42], and through conformal inclusions.

5. Discussion

Information on the way in which the Borchers–Sugawara representation $U^A$ associated with a chiral conformal subtheory A ⊂ B is generated by local observables led to further knowledge on $U^A$. This, in turn, was exploited for proving the maximal Coset model, $C_{max}$, associated with A ⊂ B to be of a local nature, more specifically to coincide with the respective local relative commutant. This way, we provided a solution of the isotony problem for a large class of chiral subsystems A ⊂ B (Theorem 4.1). All that turned out to be necessary were two special features of the implementers of dilatations (Lemmas 2.1 and 4.2). The first one leads to an understanding of how $U^A$ acts geometrically on general local observables in B, which we summarized as the net-endomorphism property (Proposition 2.1, Definition 2.1). We found that this property is in complete correspondence with the geometry of a (1+1)-dimensional conformal quantum theory and derived all properties one can ask from a (1+1)-dimensional holographic image of B (Theorem 3.1). The derivation of the net-endomorphism property relied mainly on a result on the interplay of modular theory and positivity of energy; we found it worthwhile to summarize and reformulate the facts known today in a natural converse of Borchers’ theorem on half-sided translations (Theorem 2.1).
Our solution of the isotony problem made use of specific structures of the chiral conformal group, $PSL(2,\mathbb{R})^\sim$, and integrable positive energy representations of the group of orientation preserving diffeomorphisms of the circle, $\mathrm{Diff}_+(S^1)$. The results are satisfactory in many respects: The Borchers–Sugawara construction of $U^A$ is completely general, yet completely independent of local information, and we have shown that additional input is needed only for deriving two natural lemmas (Lemmas 2.1 and 4.2). Our Additional Assumption, the presence of an integrable stress-energy tensor, is satisfied for a large class of well-investigated examples, the inclusions of current algebras (cf. Appendix). The main results exhibit the natural objects of studies on Coset models to be the maximal Coset model, $C_{max}$, and the maximal covariant extension, $A_{max}$, and open the gate for a direct incorporation of results which have been compiled in research on the representation theory of affine Lie algebras and string theory. In particular, we made accessible examples of normal canonical tensor
product subfactors [57] in which $A_{max}$ and $C_{max}$ are both non-trivial local extensions. Yet, there are chiral conformal models which do not possess a stress-energy tensor [45], so our analysis asks for a more general approach. The most general concept relating covariance and local observables is given in terms of local implementers constructed via the universal localization map as an application of the split property, known as the quantum Noether Theorem [4]. Especially in connection with chiral conformal models this concept proved applicable: Carpi [17] reconstructed the stress-energy tensor of certain models via pointlike limits of local implementers by methods which were introduced and applied in the context of general chiral conformal quantum theory by Fredenhagen and Jörß [23] with remarkable success. Approaching the problem from this angle appears to be promising. First, we have reduced it to a question on the way a particular set of global observables, namely to the dilatation group $U^A(\tilde{D}(\cdot))$, is generated by local observables of A ⊂ B. The dilatations proved natural and very useful to look at in connection with the isotony problem. Secondly, the models to look at first, the conformally covariant derivatives of the U(1) current, obey canonical commutation relations and are well known in many respects (see e.g. [71, 30]). Analytic problems connected with nuclearity and the split property have been addressed successfully for free fields [16, 21], and the task looks interesting and difficult enough. As we already mentioned, our analysis does not directly extend to subsystems in other spacetimes. Conformally invariant theories in higher dimensions might be accessible by the more general applicability of the Borchers–Sugawara construction [43] and the presence of spacetime symmetry groups leaving compact localization regions globally fixed (cf. e.g. [7]), so an analysis based on local implementers might work here as well.
For other local quantum theories the isotony problem has been solved by methods less direct than ours, but very general ones [19, 18]. Thus, the quest for the heart of this problem still awaits further investigation.

Acknowledgments

I thank K.-H. Rehren and S. Carpi for interesting discussions and helpful criticism. Financial support from the Ev. Studienwerk Villigst is gratefully acknowledged.

Appendix A

The first part of this appendix discusses the background on our additional assumption. Since we only need to give a summary of (mostly) well known results, the discussion will be brief; further details may be found in [45]. The second part contains a simple lemma on the position of dilatations (scale transformations) in $\mathrm{Diff}_+(S^1)^\sim$.

Chiral current algebras provide a large class of interesting models. They constitute chiral conformal quantum field theories defined by fields, the currents, for which the commutator is linear in the fields. The current algebras which we are
interested in are labelled by reductive, compact Lie algebras, which we call the respective color algebra of the model. The structure constants of the color algebra determine the current algebra up to a central extension, which is labelled for any simple ideal in the color algebra by a positive integer, called the level. The current algebras associated with abelian and with simple, non-abelian color algebras of type A(n), B(n), C(n), D(n), G(2) can be constructed at level 1 as quark models [8], i.e. by combining Wick squares of free fermions (reviews: [31, 26]). The corresponding models of higher level, k, are constructed by tensoring the level 1 current algebra k times with itself and extracting the vacuum representation of the level k current (sub-)algebra from this tensor product representation. We denote the current algebra of type A(n) at level k as $A(n)_k$ and extend this notation to all other simple, non-abelian current algebras mutatis mutandis. These models manifestly fulfill Wightman’s axioms. The conformal Hamiltonian yields an energy grading on them in terms of modes. In the mode picture, the current algebra associated with a simple, non-abelian color algebra is given by commutation relations of the following form:

$[j^a_m, j^b_n] = i f^{ab}{}_c\, j^c_{m+n} + k\, g^{ab}\, m\, \delta_{m,-n}$.
The indices a, b, c refer to a basis of the color Lie algebra, $f^{ab}{}_c$ are its structure constants, $g^{ab}$ its Cartan metric (in a natural normalization). Unitarity reads in terms of the modes $j^a_n$: $(j^a_n)^\dagger = j^a_{-n}$. Using these commutation relations and positivity of energy it is possible to establish linear H bounds by arguments as in [15]. This yields the following properties of the fields: smeared currents which are symmetric and localized in a proper interval are essentially self-adjoint on the Wightman domain and generate local von Neumann algebras which satisfy the Haag–Kastler axioms of local quantum theory [36, 35]. For current algebras with an abelian color algebra these axioms were established by Buchholz and Schulz-Mirbach [15]. For the cases $E_6$, $E_7$, $E_8$, $F_4$ there are means to construct the current algebras in the mode picture at all levels, and Wightman’s axioms appear implicitly in the literature for these cases (see [39, 33, 63]). Hence, the Haag–Kastler axioms may be proved as indicated above. Another way of establishing current algebras as local quantum theories is to look at the exponentiated, positive-energy representations of loop groups stemming from the modes^k [27, 63, 65].

In conformally covariant Wightman quantum field theories in one (chiral) or (1 + 1) dimensions the theorem of Lüscher and Mack [26] determines the commutation relations of a symmetric Wightman field implementing the infinitesimal conformal spacetime transformations on fields of the quantum field theory, the stress-energy tensor (SET), up to a numerical constant, which is determined by the two-point function of the SET. In (1 + 1) dimensions the SET is found to factorize into two (independent) chiral components. The chiral SETs form, by their commutation relations, an infinitesimal, positive energy representation of $\mathrm{Diff}_+(S^1)$. In terms of its modes, $L_n$, the SET defines a Virasoro algebra with the numerical constant, the central charge c, determining the central extension:

$[L_m, L_n] = (m - n) L_{m+n} + \frac{c}{12} (m - 1) m (m + 1)\, \delta_{m,-n}$.

k This approach relies entirely on the structure and representation theory of loop groups and their Lie algebras and works for the integration of quark models as well. Mentioning the integration through linear H bounds seems worthwhile since this is closer to the general approach for the transition from a local quantum theory in terms of quantum fields to the corresponding formulation in terms of local algebras of bounded operators.
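As a quick consistency check of the central term in the Virasoro relations (a sketch added for illustration, not part of the original argument), one can verify with exact arithmetic that the coefficient $\frac{c}{12}(m-1)m(m+1)$ is compatible with the Jacobi identity and that it vanishes for m = −1, 0, 1, i.e. on the modes $L_{\pm 1}$, $L_0$. The helper names `central_term` and `jacobi_defect` and the sample value of c are hypothetical choices made only for this sketch.

```python
from fractions import Fraction
from itertools import product


def central_term(m, c):
    """Coefficient (c/12)(m-1)m(m+1) of delta_{m,-n} in [L_m, L_n]."""
    return Fraction(c) * (m - 1) * m * (m + 1) / 12


def jacobi_defect(m, n, k, c):
    """Central part of [L_m,[L_n,L_k]] + cyclic permutations.

    Only m + n + k = 0 can contribute, because of the Kronecker delta.
    """
    if m + n + k != 0:
        return Fraction(0)
    return ((n - k) * central_term(m, c)
            + (k - m) * central_term(n, c)
            + (m - n) * central_term(k, c))


c = Fraction(1, 2)  # sample central charge, chosen arbitrarily for the check
defects = [jacobi_defect(m, n, -(m + n), c)
           for m, n in product(range(-6, 7), repeat=2)]
print(all(d == 0 for d in defects))
print([central_term(m, c) for m in (-1, 0, 1)])
```

Rational arithmetic is used so that the defect is identically zero rather than zero up to rounding.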
Unitarity of this representation manifests itself in the relations $L_n = L_{-n}^\dagger$. Either by establishing linear H bounds [15] or by integrating the Virasoro algebra [34, 48] the SET can be shown to define a conformally covariant, local quantum theory. In both formulations of current algebras the (Segal–)Sugawara construction (cf. [55, Sec. 9.4], [26]) yields a SET which is a quadratic function of the currents, either in terms of the modes or as a Wick square. In fact, these models prove to be diffeomorphism invariant.

The embedding of one color Lie algebra, h, into another one, g, yields an inclusion of current algebras and the respective chiral conformal quantum theories. By the (Segal–)Sugawara construction both current algebras contain a SET. Due to complete reducibility results [46, 39] the respective current algebra associated with h ⊂ g and the Sugawara–Virasoro algebras are known to be represented as direct sums of irreducible highest-weight representations tensored by trivial representations on multiplicity spaces. This ensures the integrability of the infinitesimal representations and even more: the cocycles of the respective representations are found to be completely determined by the infinitesimal central extensions, i.e. the cocycles for all irreducible subrepresentations agree and the group laws are fulfilled in the direct sum representations up to phases. Due to technical problems connected with the infinite dimension of the groups/Lie algebras involved this is not obvious at all, but it has been established by Toledano Laredo [64] on the grounds indicated here. This is of special importance for the SET of the current algebra associated with h embedded in the current algebra of g. It is straightforward to prove that the modes $L^{h \subset g}_{\pm 1, 0}$ agree with the respective linear combinations of the generators of $U^A$, where A ⊂ B is taken to be the corresponding inclusion of current algebras as local quantum theories.
This identity follows by integrability of both infinitesimal representations (e.g. [25]) and uniqueness of $U^A$ [43]. Our additional assumption is thus shown to hold in this class of examples: $\Upsilon^A$, the integrated representation of $\mathrm{Diff}_+(S^1)^\sim$ generated by the Sugawara SET of A, is projective, unitary, strongly continuous and of positive energy, and its restriction to $PSL(2,\mathbb{R})^\sim$ agrees with $U^A$. The statement of the additional assumption possibly covers a more general set of models, but due to technical difficulties connected with the infinite dimension of $\mathrm{Diff}_+(S^1)^\sim$ one has to be content with discussing integrable representations.
We come now to a simple, technical lemma on the position of scale transformations in $\mathrm{Diff}_+(S^1)^\sim$.

Lemma A.1. For a fixed scale transformation $D(t) \neq \mathrm{id}$, t small, there exist diffeomorphisms $g_\delta, g_\varepsilon \in \mathrm{Diff}_+(S^1)$ which are localized in arbitrarily small neighborhoods of +1 and −1, respectively, and which agree with D(t) close to +1 and −1, respectively, such that, by defining

$g_\delta^{\tau_1} := D(\tau_1)\, g_\delta\, D(\tau_1)^{-1}$, $g_\varepsilon^{\tau_2} := D(\tau_2)^{-1}\, g_\varepsilon\, D(\tau_2)$,

we have for all $\tau_{1,2} \in \mathbb{R}_+$:

$D(t) = g_+^{\tau_1,\tau_2}\, g_-^{\tau_1,\tau_2}\, g_\delta^{\tau_1}\, g_\varepsilon^{\tau_2}$. (A.1)

Here, the diffeomorphisms $g_+^{\tau_1,\tau_2}$, $g_-^{\tau_1,\tau_2}$ are uniquely specified by their being localized in the upper and lower half circle, respectively. After a local identification of $\mathrm{Diff}_+(S^1)$ with a sheet of $\mathrm{Diff}_+(S^1)^\sim$ containing the identity, equation (A.1) still holds for the respective images in $\mathrm{Diff}_+(S^1)^\sim$.
Proof. If $I_1$ and $I_2$ are neighboring intervals, the completed union which consists of $I_1 \cup I_2$ and the common boundary point will be denoted $I_1 \bar{\cup} I_2$; the result is assumed to be a proper interval in $S^1$.

Choose a set $\{I^0_\iota,\ \iota = +, -, \delta, \varepsilon\}$ of proper, disjoint intervals such that $I^0_\pm \subset S^1_\pm$, $+1 \in I^0_\delta$, $-1 \in I^0_\varepsilon$, the $I^0_\iota$ are separated by proper intervals $I_a, \ldots, I_d$ and a covering of $S^1$ by proper intervals $I^1_\iota$ is defined through:

$I^1_+ := I_a \bar{\cup} I^0_+ \bar{\cup} I_b$, $I^1_- := I_c \bar{\cup} I^0_- \bar{\cup} I_d$, $I^1_\delta := I_a \bar{\cup} I^0_\delta \bar{\cup} I_d$, $I^1_\varepsilon := I_c \bar{\cup} I^0_\varepsilon \bar{\cup} I_b$.

For fixed t, one can choose these intervals such that D(t) satisfies $D(t) I^0_\iota \subset I^1_\iota$. Since $D(t) S^1_\pm \subset S^1_\pm$, we may choose $g_\delta$, $g_\varepsilon$ close to id such that $g_\delta$ agrees with D(t) on $I^0_\delta$ and with id on ${I^1_\delta}'$, and $g_\varepsilon$ agrees with D(t) on $I^0_\varepsilon$ and with id on ${I^1_\varepsilon}'$. Referring to this choice we set:

$g_\pm \upharpoonright I^1_\pm := D(t)\, g_\delta^{-1} g_\varepsilon^{-1} \upharpoonright S^1_\pm$, $g_\pm \upharpoonright S^1_\mp := \mathrm{id} \upharpoonright S^1_\mp$.

Then we have $D(t) = g_+ g_- g_\delta g_\varepsilon$. We may now apply the definitions in the lemma to this choice and recognize the results to satisfy equally well the assumptions of the construction just given.

For a neighborhood of the identity the covering projection $p : \mathrm{Diff}_+(S^1)^\sim \to \mathrm{Diff}_+(S^1)$ is a homeomorphism. If we apply $p^{-1}$ to D(t), $g_\delta$, $g_\varepsilon$, $g_+$, $g_-$, we have $p^{-1}(D(t)) = p^{-1}(g_+)\, p^{-1}(g_-)\, p^{-1}(g_\delta)\, p^{-1}(g_\varepsilon)$. For small $\tau_1$, $\tau_2$ the equality (A.1) holds with the corresponding replacements, and the same is true for all $\tau_{1,2} \in \mathbb{R}_+$ by continuity: denoting the covering projection from $\mathbb{R}$ onto $S^1$ by p, all the group elements involved belong to the identity component of the subgroup of $\mathrm{Diff}_+(S^1)^\sim$ which stabilizes $p^{-1}(+1)$ and $p^{-1}(-1)$, i.e. we never leave the first sheet of the covering.
References

[1] R. C. Arcuri, J. F. Gomes and D. I. Olive, Conformal subalgebras and symmetric spaces, Nuclear Phys. B285 (1987) 327.
[2] H. Araki, Symmetries in theory of local observables and the choice of the net of local algebras, Rev. Math. Phys. SI 1 (Special Issue) (1992) 1–14.
[3] F. A. Bais and P. G. Bouwknegt, A classification of subgroup truncations of the bosonic strings, Nuclear Phys. B279 (1987) 561.
[4] D. Buchholz, S. Doplicher and R. Longo, On Noether’s theorem in quantum field theory, Ann. Phys. 170 (1986) 1–17.
[5] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors II, Commun. Math. Phys. 200 (1999) 57–103.
[6] P. Bowcock and P. Goddard, Virasoro algebras with central charge c < 1, Nuclear Phys. B285 [FS19] (1987) 651–670.
[7] R. Brunetti, D. Guido and R. Longo, Modular structure and duality in conformal quantum field theory, Commun. Math. Phys. 156 (1993) 201–219.
[8] K. Bardakci and M. B. Halpern, New dual quark models, Phys. Rev. D3 (1971) 2493.
[9] H.-J. Borchers, On the converse of the Reeh–Schlieder theorem, Commun. Math. Phys. 10 (1968) 269–273.
[10] H.-J. Borchers, The CPT theorem in two-dimensional theories of local observables, Commun. Math. Phys. 143 (1992) 315–332.
[11] H.-J. Borchers, Half-sided modular inclusions and structure analysis in quantum field theory, in Operator Algebras and Quantum Field Theory, eds. S. Doplicher, R. Longo, J. E. Roberts and L. Zsido (International Press, Cambridge, MA, 1997), pp. 589–608.
[12] H.-J. Borchers, On the lattice of subalgebras associated with the principle of half-sided modular inclusion, Lett. Math. Phys. 40 (1997) 371–390.
[13] H.-J. Borchers, Half-sided translations and the type of von Neumann algebras, Lett. Math. Phys. 44 (1998) 283–290.
[14] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics I. C*- and W*-Algebras, Symmetry Groups, Decomposition of States, 2nd edn. (Springer, New York, 1987).
[15] D. Buchholz and H. Schulz-Mirbach, Haag duality in conformal quantum field theory, Rev. Math. Phys. 2 (1990) 105–125.
[16] D. Buchholz and E. H. Wichmann, Causal independence and the energy-level density of states in local quantum field theory, Commun. Math. Phys. 106 (1986) 321–344.
[17] S. Carpi, Quantum Noether’s theorem and conformal field theory: Study of some models, Rev. Math. Phys. 11 (1999) 519–532.
[18] S. Carpi and R. Conti, Classification of subsystems for graded-local nets with trivial superselection structure, to appear in Commun. Math. Phys., math.OA/0312033.
[19] S. Carpi and R. Conti, Classification of subsystems for local nets with trivial superselection structure, Commun. Math. Phys. 217 (2001) 89–106.
[20] D. R. Davidson, Endomorphism semigroups and lightlike translations, Lett. Math. Phys. 38 (1996) 77–90.
[21] C. D’Antoni, S. Doplicher, K. Fredenhagen and R. Longo, Convergence of local charges and continuity properties of W*-inclusions, Commun. Math. Phys. 110 (1987) 325–348.
[22] S. Doplicher and J. E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, Commun. Math. Phys. 131 (1990) 51–107.
[23] K. Fredenhagen and M. Jörß, Conformal Haag–Kastler nets, pointlike localized fields, and the existence of operator product expansions, Commun. Math. Phys. 176 (1996) 541–554.
[24] M. Florig, On Borchers’ theorem, Lett. Math. Phys. 46 (1998) 289–293.
[25] J. Fröhlich, Application of commutator theorems to the integration of representations of Lie algebras and commutation relations, Commun. Math. Phys. 54 (1977) 135–150.
[26] P. Furlan, G. M. Sotkov and I. T. Todorov, Two-dimensional conformal quantum field theory, Riv. Nuovo Cim. 12 (1989) 1–203.
[27] F. Gabbiani and J. Fröhlich, Operator algebras and conformal field theory, Commun. Math. Phys. 155 (1993) 569–640.
[28] P. Goddard, A. Kent and D. Olive, Unitary representations of the Virasoro and super-Virasoro algebras, Commun. Math. Phys. 103 (1986) 105–119.
[29] D. Guido and R. Longo, The conformal spin and statistics theorem, Commun. Math. Phys. 181 (1996) 11–35.
[30] D. Guido, R. Longo and H.-W. Wiesbrock, Extensions of conformal nets and superselection sectors, Commun. Math. Phys. 192 (1998) 217–244.
[31] P. Goddard and D. Olive, Kac-Moody and Virasoro algebras in relation to quantum physics, Internat. J. Modern Phys. A1 (1986) 303–414.
[32] D. R. Grigore, The projective unitary irreducible representations of the Poincaré group in (1 + 2) dimensions, J. Math. Phys. 34 (1993) 4127–4189.
[33] R. Goodman and N. R. Wallach, Structure and unitary cocycle representations of loop groups and the group of diffeomorphisms of the circle, J. Reine Angew. Math. 347 (1984) 69–222.
[34] R. Goodman and N. R. Wallach, Projective unitary positive-energy representations of Diff(S^1), J. Funct. Anal. 63 (1985) 299–321.
[35] R. Haag, Local Quantum Physics: Fields, Particles, Algebras (Springer, Berlin, Heidelberg, 1992).
[36] R. Haag and D. Kastler, An algebraic approach to quantum field theory, J. Math. Phys. 5 (1964) 848–861.
[37] V. F. R. Jones, Index for subfactors, Invent. Math. 72 (1983) 1–25.
[38] M. Jörß, Conformal quantum field theory: From Haag-Kastler nets to Wightman Fields, Ph.D. thesis (Universität Hamburg, 1996) DESY 96-136, KEK 96-10-219.
[39] V. Kac, Infinite Dimensional Lie Algebras (Cambridge University Press, 1990).
[40] Y. Kawahigashi and R. Longo, Classification of local conformal nets. Case c < 1, math-ph/0201015, 2002.
[41] Y. Kawahigashi, R. Longo and M. Müger, Multi-interval subfactors and modularity of representations in conformal field theory, Commun. Math. Phys. 219 (2001) 631–669.
[42] V. G. Kac and M. Niculescu Sanielevici, Decompositions of representations of exceptional affine algebras with respect to conformal subalgebras, Phys. Rev. D37 (1988) 2231–2237.
[43] S. Köster, Conformal transformations as observables, Lett. Math. Phys. 61 (2002) 187–198.
[44] S. Köster, Conformal covariance subalgebras, to appear in Lett. Math. Phys. (2003), hep-th/0303201.
[45] S. Köster, Structure of Coset Models, Ph.D. thesis (Georg-August-Universität Göttingen, 2003).
[46] V. Kac and M. Wakimoto, Modular and conformal invariance constraints in representation theory of affine algebras, Adv. Math. 70 (1988) 156–236.
[47] M. Lüscher and G. Mack, Global conformal invariance in quantum field theory, Commun. Math. Phys. 41 (1975) 203–234.
[48] T. Loke, Operator Algebras and Conformal Field Theory of the Discrete Series Representations of Diff(S^1), Ph.D. thesis (University of Cambridge, 1994).
[49] R. Longo, Conformal subnets and intermediate subfactors, Commun. Math. Phys. 237 (2003) 7–30.
[50] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[51] P. Leyland, J. Roberts and D. Testard, Duality for quantum free fields, 1978, CNRS Marseille 78/P.1016, KEK 79-1-157.
[52] M. Moskowitz, On the surjectivity of the exponential map for certain Lie groups, Annali di Matematica pura ed applicata (serie IV) 166 (1994) 129–143.
[53] M. Moskowitz, Correction and addenda to: On the surjectivity of the exponential map for certain Lie groups, Annali di Matematica pura ed applicata (serie IV) 173 (1997) 351–358.
[54] M. Müger, private communication.
[55] A. Pressley and G. Segal, Loop Groups (Clarendon Press, Oxford, 1986).
[56] K.-H. Rehren, Subfactors and coset models, in Generalized Symmetries in Physics, eds. H.-D. Doebner, Y. K. Dobrev and A. G. Ushveridze (World Scientific, Singapore, 1994), pp. 338–356.
[57] K.-H. Rehren, Chiral observables and modular invariants, Commun. Math. Phys. 208 (2000) 689–712.
[58] K.-H. Rehren, Locality and modular invariance in 2D conformal QFT, in Mathematical Physics in Mathematics and Physics: Quantum and Operator Algebraic Aspects, Fields Institute Communications, Vol. 30, ed. R. Longo (American Mathematical Society, Providence, RI, 2001), pp. 341–354. Siena 2000 Proceedings.
[59] J. E. Roberts, Some applications of dilatation invariance to structural questions in the theory of local observables, Commun. Math. Phys. 37 (1974) 273–286.
[60] A. N. Schellekens and N. P. Warner, Conformal subalgebras of Kac-Moody algebras, Phys. Rev. D34 (1986) 3092–3096.
[61] M. Takesaki, Conditional expectations in von Neumann algebras, J. Funct. Anal. 9 (1972) 306–321.
[62] M. Takesaki, Theory of Operator Algebras I (Springer Verlag, New York, 1979).
[63] V. Toledano Laredo, Fusion of Positive Energy Representations of LSpin_{2n}, Ph.D. thesis (University of Cambridge, 1997).
[64] V. Toledano Laredo, Integrating unitary representations of infinite-dimensional Lie groups, J. Funct. Anal. 161 (1999) 478–508.
[65] A. Wassermann, Operator algebras and conformal field theory III. Fusion of positive energy representations of LSU(N) using bounded operators, Invent. Math. 133 (1998) 467–538.
[66] H.-W. Wiesbrock, A comment on a recent work of Borchers, Lett. Math. Phys. 25 (1992) 157–159.
[67] F. Xu, Algebraic Coset conformal field theories II, Publ. Res. Inst. Math. Sci. 35 (1999) 795–824.
[68] F. Xu, Algebraic Coset conformal field theories, Commun. Math. Phys. 211 (2000) 1–43.
[69] F. Xu, Jones–Wassermann subfactors for disconnected intervals, Commun. Contemp. Math. 2 (2000) 307–347.
[70] F. Xu, On a conjecture of Kac–Wakimoto, Publ. Res. Inst. Math. Sci. 37 (2001) 165–190.
[71] J. Yngvason, A note on essential duality, Lett. Math. Phys. 31 (1994) 127–141.
[72] R. J. Zimmer, Ergodic Theory and Semisimple Groups, Monographs in Mathematics, Vol. 81 (Birkhäuser, Boston, Basel, Stuttgart, 1984).
April 21, 2004 14:27 WSPC/148-RMP
00202
Reviews in Mathematical Physics Vol. 16, No. 3 (2004) 383–420 © World Scientific Publishing Company
ON APPROXIMATE SOLUTIONS OF SEMILINEAR EVOLUTION EQUATIONS
CARLO MOROSI∗ and LIVIO PIZZOCCHERO†
∗Dipartimento di Matematica, Politecnico di Milano, P.za L. da Vinci 32, I-20133 Milano, Italy
†Dipartimento di Matematica, Università di Milano, Via C. Saldini 50, I-20133 Milano, Italy and Istituto Nazionale di Fisica Nucleare, Sezione di Milano, Italy
∗[email protected]
†[email protected]
Received 9 April 2003
Revised 11 February 2004

A general framework is presented to discuss the approximate solutions of an evolution equation in a Banach space, with a linear part generating a semigroup and a sufficiently smooth nonlinear part. A theorem is presented, allowing one to infer from an approximate solution the existence of an exact solution. According to this theorem, the interval of existence of the exact solution and the distance of the latter from the approximate solution can be evaluated by solving a one-dimensional “control” integral equation, where the unknown gives a bound on the previous distance as a function of time. For example, the control equation can be applied to the approximation methods based on the reduction of the evolution equation to finite-dimensional manifolds; among them, the Galerkin method is discussed in detail. To illustrate this framework, the nonlinear heat equation is considered. In this case the control equation is used to evaluate the error of the Galerkin approximation; depending on the initial datum, this approach either grants global existence of the solution or gives fairly accurate bounds on the blow up time.

Keywords: Differential equations; theoretical approximation; nonlinear heat equation; blow up.
1. Introduction

In this paper we consider, within a Banach space F, a Volterra integral equation

$\varphi(t) = U(t - t_0) f_0 + \int_{t_0}^{t} ds\, U(t - s) P(\varphi(s), s)$, (1.1)

for an unknown function $\varphi$ from a real interval to F. Here $f_0 \in F$, U is a linear semigroup on F and P is a locally Lipschitz nonlinear map from an open set of $F \times \mathbb{R}$ to F. If U is the semigroup generated by a linear operator A : Dom A ⊂ F → F, under minimal technical conditions the above Volterra equation is equivalent to a Cauchy problem

$\dot{\varphi}(t) = A\varphi(t) + P(\varphi(t), t)$, $\varphi(t_0) = f_0$, ($\dot{}$ := d/dt). (1.2)
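To make the Volterra form (1.1) concrete, the following minimal sketch treats a scalar toy case that is not taken from the paper: F = R, U(t) = e^{-t} (generated by A = −1) and P(φ, t) = φ², for which (1.2) reads φ' = −φ + φ² and has a closed-form solution. The code applies the right-hand side of (1.1) repeatedly, starting from the zero function, and compares the result with the closed form. The names `volterra_apply`, `picard` and `exact`, the trapezoidal quadrature and the grid size are all assumptions made for illustration.

```python
import math


def volterra_apply(phi, ts, f0):
    """One application of the right-hand side of (1.1) on the grid ts,
    for the toy choice U(t) = exp(-t), P(phi) = phi**2, t0 = ts[0] = 0."""
    out = []
    for i, t in enumerate(ts):
        # Integrand U(t - s) P(phi(s)) sampled at the grid points s <= t.
        g = [math.exp(-(t - ts[j])) * phi[j] ** 2 for j in range(i + 1)]
        # Trapezoidal rule; the sum is empty (zero) at the initial time.
        integral = sum(0.5 * (g[j] + g[j + 1]) * (ts[j + 1] - ts[j])
                       for j in range(i))
        out.append(math.exp(-t) * f0 + integral)
    return out


def picard(f0, t1=1.0, n=200, iterations=30):
    """Iterate the Volterra operator, starting from the zero function."""
    ts = [t1 * i / n for i in range(n + 1)]
    phi = [0.0] * (n + 1)
    for _ in range(iterations):
        phi = volterra_apply(phi, ts, f0)
    return ts, phi


def exact(t, f0):
    """Closed-form solution of phi' = -phi + phi**2, phi(0) = f0 != 0."""
    return 1.0 / (1.0 + (1.0 / f0 - 1.0) * math.exp(t))


ts, phi = picard(0.5)
print(phi[-1], exact(ts[-1], 0.5))
```

The two printed numbers should agree up to the quadrature error of the grid; refining the grid (larger `n`) reduces the gap.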
To standardize the language, problems (1.1) and (1.2) are defined precisely in Sec. 2; the local existence and uniqueness of their solutions are well known. The aim of this paper is to discuss the approximate solutions of (1.1). In the most general sense, an approximate solution is simply a continuous map $t \mapsto \varphi_{ap}(t)$ which can be inserted in the right-hand side of (1.1), i.e., such that graph $\varphi_{ap}$ ⊂ Dom P. For any such map, we can define the integral error as the difference between the two sides of (1.1). If $\varphi_{ap}$ is a bit more regular, the integral error is determined by the differential and datum errors which are, respectively, the differences between the two sides in the differential equation and in the initial condition of (1.2). All the above concepts are formalized in Sec. 3. Here, we also present a general statement (Proposition 3.4) which can be applied to an approximate solution $t \mapsto \varphi_{ap}(t)$ to infer the existence of an exact solution $\varphi$ on an appropriate time interval, and also to estimate the difference $\varphi(t) - \varphi_{ap}(t)$. The essential ingredient in Proposition 3.4 is an integral control inequality, depending on the available estimators for the integral error of $\varphi_{ap}$ and for the growth of P away from the graph of $\varphi_{ap}$. The unknown in the control inequality is a real, nonnegative function $t \mapsto R(t)$; if a solution R is found to exist on a time interval $[t_0, t_1|$ (i.e., either $[t_0, t_1]$ or $[t_0, t_1)$), then it is granted that (1.1) possesses an exact solution $\varphi : [t_0, t_1| \to F$, and that $\|\varphi(t) - \varphi_{ap}(t)\| \le R(t)$. In typical cases, a solution of the previous integral inequality can be constructed by solving an ordinary differential equation for R, which we also call the control equation. In this way, the problem of giving estimates on the existence time for (1.1) and on its exact solution $\varphi$, living in F which is typically of infinite dimension, is reduced to the analysis of a one-dimensional ODE.
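The role of such a control equation can be illustrated on a scalar toy problem (an assumption of this sketch, not the setting of Proposition 3.4, whose precise hypotheses are given in Sec. 3): take F = R, U(t) = e^{-t}, P(φ) = φ² and the approximate solution $\varphi_{ap} = 0$. The integral error is then $e^{-t} f_0$, and $|P(\varphi)| \le R^2$ whenever $|\varphi| \le R$, so any R with $e^{-t}|f_0| + \int_0^t e^{-(t-s)} R(s)^2\, ds \le R(t)$ bounds the solution. Taking equality and differentiating gives the toy control equation R' = −R + R², R(0) = |f_0|, which is solvable in closed form. The function names below are hypothetical.

```python
import math


def exact_solution(t, f0):
    """Solution of phi' = -phi + phi**2, phi(0) = f0 != 0
    (valid as long as the denominator does not vanish)."""
    return 1.0 / (1.0 + (1.0 / f0 - 1.0) * math.exp(t))


def control_bound(t, r0):
    """Solution R(t) of the toy control equation R' = -R + R**2, R(0) = r0 > 0
    (finite for all t >= 0 if r0 <= 1, otherwise only up to blow_up_time(r0))."""
    return 1.0 / (1.0 + (1.0 / r0 - 1.0) * math.exp(t))


def blow_up_time(r0):
    """Time at which R becomes infinite; math.inf when 0 < r0 <= 1."""
    return math.log(r0 / (r0 - 1.0)) if r0 > 1.0 else math.inf


# Small datum: the bound |phi(t)| <= R(t) holds on the whole half line.
f0 = -0.5
for t in (0.0, 0.5, 1.0, 5.0):
    print(t, abs(exact_solution(t, f0)), control_bound(t, abs(f0)))

# Large datum: the control equation only guarantees existence on [0, T).
print(blow_up_time(2.0))
```

For |f_0| < 1 the bound is global. For |f_0| > 1 it only guarantees existence up to log(|f_0|/(|f_0| − 1)), even though the exact solution of this toy problem with datum f_0 = −2 exists for all t ≥ 0; in this sense the control equation yields a lower bound on the existence time.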
Proposition 3.4 can be regarded as a general formulation of many statements about specific evolutionary problems often encountered in the literature. From this viewpoint, the content of this proposition is not at all surprising; however, the technique we use to prove it is essentially different from the arguments often employed in related situations. The standard way of thinking would suggest proving Proposition 3.4 in two steps: (a) derive (via some nonlinear Gronwall Lemma [16]) an a priori bound $\|\varphi(t) - \varphi_{ap}(t)\| \le R(t)$, holding as long as $\varphi(t)$ exists; (b) show that nonexistence of $\varphi$ on the whole interval $[t_0, t_1|$ would contradict the previous bound: this argument is called the “continuation principle” in [19]. On the contrary, the proof we propose (in Sec. 4) is very direct, and shows that $\varphi$ can be constructed on the whole $[t_0, t_1|$ by a convergent Peano–Picard iteration, applying repeatedly the Volterra integral operator to the approximate solution $\varphi_{ap}$. The control inequality ensures the invariance under the Volterra operator of the space of functions with distance ≤ R(t) from $\varphi_{ap}(t)$ on $[t_0, t_1|$; the confinement to
this domain of all iterates of ϕap, and the local Lipschitz nature of P, allow us to prove their convergence to a function ϕ, whose distance from ϕap is also bounded by R. As a first, very simple illustration of Proposition 3.4, in Sec. 5 we apply the control equation to the approximate solution ϕap(t) := 0. In spite of the trivial choice for ϕap, the control equation gives useful information on the interval of existence and on the growth of the exact solution ϕ, depending on the norm ‖f0‖ of the initial datum. The accuracy of these predictions is tested on an example, concerning the (one-dimensional) wave equation with polynomial nonlinearity. A second, more refined application of the control equation is proposed in Sec. 6 for the Galerkin scheme (and similar approaches). In the conventional formulation, the Galerkin method is an algorithm to construct approximate solutions t ↦ ϕap(t) of (1.1) with values in a finite-dimensional submanifold of F. In this section, the standard evolution equations for the coordinates of ϕap(t) in the Galerkin submanifold are coupled with the control equation for R(t); in this way, a finite-dimensional system of ODE’s gives simultaneously the Galerkin approximate solution ϕap, an interval [t0, t1| on which the exact solution ϕ of (1.1) is granted to exist and an upper bound for ‖ϕap(t) − ϕ(t)‖ on this interval. In Sec. 7, all the previous results are applied to a nonlinear heat equation, working for simplicity in one space dimension (with a spatial coordinate x ∈ (0, π)). In this case, the Cauchy problem (1.2) (with initial time t0 := 0) is, symbolically,
$$ \dot\varphi(x,t) = \varphi_{xx}(x,t) + \varphi(x,t)^p, \qquad \varphi(x,0) = f_0(x) \tag{1.3} $$
with p ∈ {2, 3, 4, . . .}, to be discussed in the Sobolev space F := H^1_0(0, π). The implementation of the general framework in the present case with polynomial nonlinearity requires accurate information on the pointwise product of functions in H^1_0(0, π); in particular, precise estimates are needed for the norm ‖f h‖ when f, h are in this space (see Appendix A about this, and [11] for more general information about multiplication in Sobolev spaces). To exemplify some general facts about (1.3), in the same section we consider the initial datum f0(x) := √(2/π) A sin x. If the (nonnegative) constant A is below a critical value, the control equation for the zero approximate solution suffices to prove existence of a globally defined solution ϕ : [0, +∞) → F of (1.3). For larger A, the same control equation gives a finite lower bound for the existence time of the solution ϕ. These conclusions are complementary to the ones arising from a known “blow up” theorem of Kaplan for the nonlinear heat equation (see [5]; a review is given in Appendix B). When Kaplan’s theorem is applied to (1.3) with the previous datum, for sufficiently large A it predicts a finite, explicitly determined upper bound on the existence time of the solution. Again in Sec. 7, we add to the above facts the information arising from application of the control equation to the Galerkin scheme; the chosen Galerkin submanifold is the linear span of finitely many elements in the Fourier basis. As an example, we consider the Galerkin differential equations for two modes, coupled with the control equation for R, with p = 2 and the previous f0. This system in three unknown
April 21, 2004 14:27 WSPC/148-RMP
386
00202
C. Morosi & L. Pizzocchero
real functions can be easily treated by any package for the numerical solution of ODE’s; the results obtained by the MATHEMATICA package, for several values of A, are presented in some detail. Among other things, the Galerkin approach with the control equation allows us to increase the critical value of A below which global existence is granted for (1.3); for A above the new critical value, a better lower bound for the existence time is derived. If A is fairly large, the new lower bound is close to the Kaplan upper bound, which yields an uncertainty between 20% and 30% on the existence time of the exact solution. Also, the upper bound R(t) on ‖ϕ(t) − ϕap(t)‖ is fairly small in comparison with ‖ϕap(t)‖ as long as t is not too large. To some extent, it is surprising that fairly good accuracy can be obtained combining the control equation with a Galerkin scheme in two modes only. These outcomes encourage us to hope that the same method would give nontrivial information on the Cauchy problem for the equations of fluid dynamics, whose Galerkin approximations in few modes give rise, among others, to the widely studied Lorenz model [9, 15].
2. Preliminaries
Throughout the paper, F denotes a real or complex Banach space with norm ‖ ‖ and elements f, f0, f1, h, . . . . We write B(f0, ρ) for the open ball in F of center f0 and radius ρ (if ρ = +∞, this means the whole F). Let us be given a linear operator
$$ A : \operatorname{Dom} A \subset F \to F \tag{2.1} $$
with domain a linear subspace of F; whenever we speak of a continuous map from/to Dom A, we always refer to the topology of the graph norm ‖f‖_A := ‖f‖ + ‖Af‖ (as is well known, Dom A is complete in this norm if and only if A is closed). We denote by L(F) the Banach space of bounded linear operators of (the whole) F into itself. We always write [t0, t1| for a real interval of the form [t0, t1] or [t0, t1) (always intending t0 < t1; in the second case, t1 can be +∞). If ψ : [t0, t1| → F, the graph of this function and the tube around ψ of any radius ρ ∈ (0, +∞] are
$$ \operatorname{graph}\psi := \{(\psi(t), t) \mid t \in [t_0, t_1|\,\} \subset F \times \mathbb{R}, \tag{2.2} $$
$$ T(\psi, \rho) := \{(f, t) \in F \times [t_0, t_1| \;\mid\; \|f - \psi(t)\| < \rho\} \tag{2.3} $$
(the latter is the whole F × [t0, t1|, if ρ = +∞; it becomes B(f0, ρ) × [t0, t1|, if ψ(t) = const. = f0).
2.1. Linear semigroups on F
This name indicates maps U such that
$$ U : [0, +\infty) \to L(F), \qquad U(t + s) = U(t)\,U(s), \qquad U(0) = 1_F. \tag{2.4} $$
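The generator of a linear semigroup U is the linear operator
$$ A : \operatorname{Dom} A \subset F \to F, \qquad f \mapsto Af, \tag{2.5} $$
$$ \operatorname{Dom} A := \Big\{ f \in F \;\Big|\; \frac{d}{dt}\Big|_{t=0} [U(t)f] \text{ exists} \Big\}, \qquad Af := \frac{d}{dt}\Big|_{t=0} [U(t)f] \tag{2.6} $$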
(with (d/dt)|_{t=0} denoting the right derivative). A linear semigroup U in F is strongly continuous if for all f ∈ F the map [0, +∞) → F, t ↦ U(t)f is continuous. In this case (see, e.g., [2, 19]), the map (f, t) ↦ U(t)f is jointly continuous, the generator A is densely defined in F and closed, and (a) and (b) hold:
(a) A determines U. For all f0 ∈ Dom A, the function t ↦ U(t)f0 is the unique function ϕ such that
$$ \varphi \in C([0, +\infty), \operatorname{Dom} A) \cap C^1([0, +\infty), F), \qquad \dot\varphi(t) = A\varphi(t) \ \text{for all } t, \qquad \varphi(0) = f_0; \tag{2.7} $$
(b) for any function ψ ∈ C([t0, t1|, Dom A) ∩ C^1([t0, t1|, F) and t in this interval, it is
$$ \psi(t) = U(t - t_0)\psi(t_0) + \int_{t_0}^{t} ds\, U(t - s)\,[\dot\psi(s) - A\psi(s)] \tag{2.8} $$
(here and in the sequel, the dot indicates the derivative). A linear semigroup U is uniformly continuous if the map U is continuous from [0, +∞) to L(F) with the standard operator norm; this happens if and only if U has generator A ∈ L(F), and gives a trivial example of a strongly continuous semigroup (extendable to t < 0).
Definition 2.1. An estimator for a linear semigroup U on F is a continuous function u : [0, +∞) → [0, +∞) such that, for all f ∈ F and t ∈ [0, +∞),
$$ \|U(t)f\| \le u(t)\,\|f\|. \tag{2.9} $$
Each strongly continuous linear semigroup admits an estimator of the form u(t) = U e^{−Bt}, where U ≥ 1 and B are real constants: see [3].
2.2. Lipschitz maps
If C, D are subsets of a topological vector space, we say that C is a strict subset of D, and write C ⋐ D, if C is bounded and C̄ ⊂ D, the bar denoting the closure. Now, let us be given a (possibly nonlinear) map, with open domain,
$$ P : \operatorname{Dom} P \subset F \times \mathbb{R} \to F, \qquad (f, t) \mapsto P(f, t). \tag{2.10} $$
Definition 2.2. We say that P is Lipschitz at fixed time (or, respectively, Lipschitz) on the strict subsets of its domain if, for every C ⋐ Dom P, there is a nonnegative
constant L = L(C) (or, resp., a pair of nonnegative constants L = L(C), M = M(C)) such that
$$ \|P(f, t) - P(f', t)\| \le L\,\|f - f'\| \quad \text{for } (f, t), (f', t) \in C; \tag{2.11} $$
$$ \|P(f, t) - P(f', t')\| \le L\,\|f - f'\| + M\,|t - t'| \quad \text{for } (f, t), (f', t') \in C. \tag{2.12} $$
Of course (2.12) implies (2.11) and the continuity of P.
Example. Some applications presented in the sequel rely on a map P of the form
$$ \operatorname{Dom} P = F \times \Delta \quad (\Delta \subset \mathbb{R} \text{ an open interval}), \qquad P(f, t) := \mathscr{P}(f, \dots, f, t), \tag{2.13} $$
where
$$ \mathscr{P} : \times^{p} F \times \Delta \to F \quad (p \in \{1, 2, \dots\}), \qquad (f_1, \dots, f_p, t) \mapsto \mathscr{P}(f_1, \dots, f_p, t) \tag{2.14} $$
is R-linear in each argument f1, . . . , fp; it is also assumed that
$$ \|\mathscr{P}(f_1, \dots, f_p, t)\| \le P(t)\,\|f_1\| \cdots \|f_p\|, \qquad \|\mathscr{P}(f_1, \dots, f_p, t) - \mathscr{P}(f_1, \dots, f_p, t')\| \le Q(t, t')\,\|f_1\| \cdots \|f_p\| \tag{2.15} $$
for all f1, . . . , fp ∈ F and t, t′ ∈ ∆, where P : ∆ → [0, +∞) and Q : ∆ × ∆ → [0, +∞) are continuous functions. It is finally required that, for each B ⋐ ∆, there is a constant M = M(B) such that
$$ Q(t, t') \le M\,|t - t'| \quad \text{for } t, t' \in B. \tag{2.16} $$
Proposition 2.3. With the previous assumptions, for all f, f′ ∈ F and t, t′ ∈ ∆ it is
$$ \|P(f, t) - P(f', t')\| \le P(t) \sum_{j=1}^{p} \binom{p}{j} \|f'\|^{p-j}\, \|f - f'\|^{j} + Q(t, t')\,\|f'\|^{p}. \tag{2.17} $$
Proof. Setting for convenience f1 := f′, f2 := f − f′ we can write
$$ P(f, t) - P(f', t) = \mathscr{P}(f_1 + f_2, \dots, f_1 + f_2, t) - \mathscr{P}(f_1, \dots, f_1, t) = \sum_{j=1}^{p} \sum_{(l_1, \dots, l_p) \in \Lambda_{pj}} \mathscr{P}(f_{l_1}, \dots, f_{l_p}, t), $$
$$ \Lambda_{pj} := \{(l_1, \dots, l_p) \in \{1, 2\}^{p} \mid l_s = 2 \text{ for } j \text{ values of } s\}. \tag{2.18} $$
From the first inequality (2.15), we infer
$$ \|P(f, t) - P(f', t)\| \le P(t) \sum_{j=1}^{p} \binom{p}{j} \|f'\|^{p-j}\, \|f - f'\|^{j}, \tag{2.19} $$
because Λ_{pj} has cardinality \binom{p}{j}. Finally, the second assumption (2.15) gives
$$ \|P(f', t) - P(f', t')\| \le Q(t, t')\,\|f'\|^{p} \tag{2.20} $$
and (2.19) and (2.20), with the triangular inequality, yield the thesis (2.17).
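As a quick numerical sanity check of (2.17), one can take what is perhaps the simplest instance of the Example: F = R^n with the sup norm and the time-independent pointwise product, so that P(f) = f^p, P(t) = 1 and Q = 0. The Python sketch below is only an illustration under these assumptions; the helper names (sup_norm, power_map, bound_2_17, check) are hypothetical and do not appear in the paper.

```python
import math
import random


def sup_norm(v):
    """Sup norm on R^n."""
    return max(abs(x) for x in v)


def power_map(f, p):
    """P(f) := f^p, the p-linear pointwise product evaluated on the diagonal."""
    return [x ** p for x in f]


def bound_2_17(f, f_prime, p):
    """Right-hand side of (2.17), assuming P(t) = 1 and Q = 0."""
    a = sup_norm(f_prime)
    d = sup_norm([x - y for x, y in zip(f, f_prime)])
    return sum(math.comb(p, j) * a ** (p - j) * d ** j for j in range(1, p + 1))


def check(n=5, p=3, trials=1000, seed=0):
    """Test (2.17) on random pairs (f, f'); a small slack absorbs rounding."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.uniform(-2, 2) for _ in range(n)]
        g = [rng.uniform(-2, 2) for _ in range(n)]
        lhs = sup_norm([x - y for x, y in zip(power_map(f, p), power_map(g, p))])
        if lhs > bound_2_17(f, g, p) * (1 + 1e-12) + 1e-12:
            return False
    return True
```

For n = 1, p = 2, f = 2 and f′ = 1 the two sides of (2.17) coincide (both equal 3), so in this setting the bound cannot be improved in general.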
Equation (2.17) will be frequently used in the sequel; together with (2.16), it implies
Corollary 2.4. The map P is Lipschitz on the strict subsets of F × ∆.
2.3. General formulation of the Volterra and Cauchy problems
We define formally both problems, and review their relations.
Definition 2.5. Let us be given: (i) a strongly continuous linear semigroup U on F; (ii) a continuous map P : Dom P ⊂ F × R → F, with open domain; (iii) a pair (f0, t0) ∈ Dom P. The Volterra problem related to U and P with datum f0 at time t0 is as follows: Find ϕ ∈ C([t0, t1|, F) such that graph ϕ ⊂ Dom P and
$$ \varphi(t) = U(t - t_0) f_0 + \int_{t_0}^{t} ds\, U(t - s)\, P(\varphi(s), s) \quad \text{for all } t \in [t_0, t_1|. \tag{2.21} $$
Definition 2.6. Consider: (i) a linear operator A : Dom A ⊂ F → F; (ii) a continuous map P as in the previous definition; (iii) a pair (f0, t0) ∈ Dom P such that f0 ∈ Dom A. The Cauchy problem corresponding to A, P with datum f0 at time t0 is as follows: Find ϕ ∈ C([t0, t1|, Dom A) ∩ C^1([t0, t1|, F) such that graph ϕ ⊂ Dom P and
$$ \dot\varphi(t) = A\varphi(t) + P(\varphi(t), t) \quad \text{for all } t \in [t_0, t_1|, \qquad \varphi(t_0) = f_0. \tag{2.22} $$
Proposition 2.7. Let A, P, f0, t0 be as in Definition 2.6, and further assume A to be the generator of a strongly continuous linear semigroup U. Then: (i) a solution ϕ of the Cauchy problem (2.22) is also a solution of the Volterra problem (2.21); (ii) as a partial converse, a solution ϕ of the Volterra problem (2.21) is a solution of the Cauchy problem (2.22) in either of these situations: (α) F is reflexive and P is Lipschitz on the strict subsets of its domain; (β) (trivial case) A ∈ L(F), with no further assumptions on F and P. Proof. It is essentially based on (2.7) and (2.8): see [2] (conditions (α) and (β) in item (ii) ensure that a solution ϕ ∈ C([t0, t1|, F) of (2.21) is in C^1([t0, t1|, F) ∩ C([t0, t1|, Dom A)).
In particular, the operator A := 0 is the generator of the identity semigroup U(t) = 1_F for all t. With this remark, the framework of this paper applies to any ODE ϕ̇(t) = P(ϕ(t), t) in a Banach space, also including the finite dimensional cases F = R^m or C^m.
Proposition 2.8. Consider the Volterra problem (2.21), where U is a strongly continuous linear semigroup, and P is continuous and Lipschitz at fixed time on the strict subsets of its domain; then (i) and (ii) hold. (i) Problem (2.21) has a solution. (ii) If ϕ : [t0, t1| → F and ϕ′ : [t0, t′1| → F are two solutions, it is
$$ \varphi(t) = \varphi'(t) \quad \text{for } t \in [t_0, t_1| \cap [t_0, t_1'|. \tag{2.23} $$
Proof. (ii) We consider any t2 in the intersection of the domains. Subtracting Eq. (2.21) for ϕ from the analogous equation for ϕ′, and taking the norm, we obtain
$$ \|\varphi(t) - \varphi'(t)\| \le \int_{t_0}^{t} ds\, u(t - s)\, \|P(\varphi(s), s) - P(\varphi'(s), s)\| \le U L \int_{t_0}^{t} ds\, \|\varphi(s) - \varphi'(s)\| \tag{2.24} $$
for each t ∈ [t0, t2]. Here: u is any estimator for U; U := max_{s∈[t0,t2]} u(s); L is a constant fulfilling the Lipschitz condition (2.11) for P on the set C := graph(ϕ′↾[t0, t2]) ∪ graph(ϕ↾[t0, t2]) (this C is a strict subset of Dom P). Equation (2.24) and the classical Gronwall Lemma [10] imply ‖ϕ(t) − ϕ′(t)‖ = 0 for all t ∈ [t0, t2]. (i) Equation (2.21) is the fixed point problem for a Volterra type integral operator, and a solution can be constructed by standard Peano–Picard iteration, starting from the function ϕ0(t) := const. := f0; see, e.g., [2].
From our viewpoint, the previously mentioned argument for local existence is a particular case of a more general statement, allowing us to construct a solution of (2.21) by a Peano–Picard iteration with starting point any approximate solution (of sufficiently small error); all this will be discussed in the next section. Of course, Proposition 2.7 allows us to transfer the statements on uniqueness and existence from (2.21) to (2.22). Let us call maximal a solution ϕ of (2.21) or (2.22) which has no proper extension. If one can grant the existence of a solution on a sufficiently small interval, and the coincidence of two solutions on the intersection of their domains, it follows that a unique maximal solution ϕ exists and any other solution is a proper restriction of the maximal one. Furthermore, if local existence is granted for arbitrary data, the domain of the maximal solution ϕ with a given datum has the form [t0, ϑ) (otherwise, ϕ could be extended taking its value at ϑ as a new initial datum).
3. Approximate Solutions. Statements of the Main Results
We consider a strongly continuous linear semigroup U on the Banach space F, and a continuous function P : Dom P ⊂ F × R → F with open domain. We are interested in the Volterra problem (2.21), for a given pair (f0, t0) ∈ Dom P.
Definition 3.1. By an approximate solution of problem (2.21), we mean any continuous function ϕap : [t0, t1| → F, such that graph ϕap ⊂ Dom P. Given any such function, we stipulate the following:
(i) the integral error of ϕap is the function
$$ E(\varphi_{\mathrm{ap}}) : t \in [t_0, t_1| \mapsto E(\varphi_{\mathrm{ap}})(t) := \varphi_{\mathrm{ap}}(t) - U(t - t_0) f_0 - \int_{t_0}^{t} ds\, U(t - s)\, P(\varphi_{\mathrm{ap}}(s), s); \tag{3.1} $$
an integral error estimator for ϕap is a continuous function E : [t0, t1| → [0, +∞) such that, for all t in this interval,
$$ \|E(\varphi_{\mathrm{ap}})(t)\| \le E(t). \tag{3.2} $$
(ii) The datum error for ϕap is the difference
$$ d(\varphi_{\mathrm{ap}}) := \varphi_{\mathrm{ap}}(t_0) - f_0; \tag{3.3} $$
a datum error estimator for ϕap is a nonnegative real number δ such that
$$ \|d(\varphi_{\mathrm{ap}})\| \le \delta. \tag{3.4} $$
(iii) If A is the generator of U and ϕap ∈ C([t0, t1|, Dom A) ∩ C^1([t0, t1|, F), the differential error of ϕap is the function
$$ e(\varphi_{\mathrm{ap}}) : t \in [t_0, t_1| \mapsto e(\varphi_{\mathrm{ap}})(t) := \dot\varphi_{\mathrm{ap}}(t) - A\varphi_{\mathrm{ap}}(t) - P(\varphi_{\mathrm{ap}}(t), t); \tag{3.5} $$
a differential error estimator for ϕap is a continuous function ε : [t0, t1| → [0, +∞) such that, for t in this interval,
$$ \|e(\varphi_{\mathrm{ap}})(t)\| \le \varepsilon(t). \tag{3.6} $$
Of course, ϕap is a solution of the Volterra (resp., Cauchy) problem iff E(ϕap) = 0 (resp., d(ϕap) = 0 and e(ϕap) = 0).
Lemma 3.2. Let ϕap : [t0, t1| → F be an approximate solution of the Volterra problem (2.21), and assume the regularity conditions in item (iii) of Definition 3.1. Then, the integral error of ϕap is related to the datum and differential errors by
$$ E(\varphi_{\mathrm{ap}})(t) = U(t - t_0)\, d(\varphi_{\mathrm{ap}}) + \int_{t_0}^{t} ds\, U(t - s)\, e(\varphi_{\mathrm{ap}})(s). \tag{3.7} $$
If U, d(ϕap), e(ϕap) have estimators u, δ, ε, then E(ϕap) has the estimator
$$ E(t) := u(t - t_0)\,\delta + \int_{t_0}^{t} ds\, u(t - s)\,\varepsilon(s) \quad \text{for all } t \in [t_0, t_1|. \tag{3.8} $$
Proof. Equation (3.7) follows applying the definitions of E(ϕap), d(ϕap), e(ϕap) and the identity (2.8) with ψ := ϕap. Given (3.7), the bound (3.8) on ‖E(ϕap)(t)‖ is evident.
Remark. The estimator E defined by (3.8) is useful, because in many cases it can be easily computed. However, in peculiar situations involving oscillating functions, this estimator can be rough. For example, consider the semigroup U(t) := 1_F for all t, with generator A = 0 and estimator u(t) := 1. Let us choose ϕap(t) := f0 for all t, so that d(ϕap) = 0, e(ϕap)(t) = −P(f0, t) and E(ϕap)(t) = −∫_{t0}^{t} ds P(f0, s). Suppose P(f0, t) = g0 e^{iωt}, with ω ∈ (0, +∞) and g0 a vector of the (complex) space F; then, the best estimators for d(ϕap) and e(ϕap) are, respectively, δ = 0 and ε(t) = ‖g0‖. Correspondingly, Eq. (3.8) gives the integral error estimator E(t) = ‖g0‖(t − t0); on the other hand, it is found by direct computation that E(ϕap)(t) = i g0 (e^{iωt} − e^{iωt0})/ω; thus ‖E(ϕap)(t)‖ is a bounded function of t, whereas the estimator E grows linearly. Similar drawbacks of the estimator (3.8) in the presence of oscillatory functions are met (even for F = R^m) if one considers a differential equation with fast periodic variables and the approximate solutions which arise from averaging methods [8].
To formulate the main theorem on approximate solutions, we need one more notion describing the growth of P away from a function ψ : [t0, t1| → F, such that graph ψ ⊂ Dom P.
Definition 3.3. A growth estimator for P from ψ (if it exists) is a continuous function
$$ \ell : [0, \rho) \times [t_0, t_1| \to [0, +\infty), \qquad (r, t) \mapsto \ell(r, t) \tag{3.9} $$
such that: (i) ρ ∈ (0, +∞] and T(ψ, ρ) ⊂ Dom P (see Eq. (2.3)); (ii) ℓ is nondecreasing in the first variable: ℓ(r, t) ≤ ℓ(r′, t) for r ≤ r′; (iii) for all (f, t) ∈ T(ψ, ρ), it is
$$ \|P(f, t) - P(\psi(t), t)\| \le \ell(\|f - \psi(t)\|, t). \tag{3.10} $$
Remarks. (a) From a continuous function ℓ′ : [0, ρ) × [t0, t1| → [0, +∞) fulfilling (i) and (iii) but not (ii), we can construct the function ℓ(r, t) := max_{r′∈[0,r]} ℓ′(r′, t), which also fulfils (ii). (b) A function P which is Lipschitz at fixed time on a tube around ψ possesses on it a growth estimator linear in r. Less trivial estimators appear if Dom P = F × ∆, ∆ a real interval, and one wishes to estimate the growth of P from ψ on the whole product space F × [t0, t1| (i.e., on a tube of infinite radius). For instance, consider a map P as in the Example of page 388; the growth of P from any ψ : [t0, t1| ⊂ ∆ → F admits the estimator
$$ \ell(r, t) := P(t) \sum_{j=1}^{p} \binom{p}{j} \|\psi(t)\|^{p-j}\, r^{j} \tag{3.11} $$
(r ∈ [0, +∞), t ∈ [t0, t1|). To find this, apply Eq. (2.17) with f′ = ψ(t) and t′ = t.
We come to the main theorem of this section: the proof will be given in Sec. 4.
Proposition 3.4. Let us be given a Volterra problem (2.21), where: U is a strongly continuous linear semigroup; P : Dom P ⊂ F × R → F is continuous and Lipschitz at fixed time on the strict subsets of its open domain (Definition 2.2). Assume that: (i) u is an estimator for U; (ii) ϕap : [t0, t1| → F is an approximate solution of (2.21), and E : [t0, t1| → [0, +∞) is an estimator for the integral error E(ϕap); (iii) ℓ : [0, ρ) × [t0, t1| → [0, +∞) is a growth estimator for P from ϕap (ρ ∈ (0, +∞]). Consider the following problem: Find R ∈ C([t0, t1|, [0, ρ)) such that
$$ E(t) + \int_{t_0}^{t} ds\, u(t - s)\, \ell(R(s), s) \le R(t) \quad \text{for } t \in [t_0, t_1|. \tag{3.12} $$
If (3.12) has a solution R with domain [t0, t1|, then (2.21) has a solution ϕ with the same domain, and for all t therein it is
$$ \|\varphi(t) - \varphi_{\mathrm{ap}}(t)\| \le R(t). \tag{3.13} $$
The solution ϕ can be constructed by a Peano–Picard iteration, starting from ϕap.
Definition 3.5. Equation (3.12) will be referred to as the control inequality.
Remarks. (i) The function R is required to exist on the same domain [t0, t1| as ϕap. In many applications, one starts with an approximate solution on a domain [t0, t2| and then finds (3.12) to have a solution R on a domain [t0, t1| ⊂ [t0, t2|; of course, in this case Proposition 3.4 must be applied to the restriction ϕap↾[t0, t1|. (ii) As anticipated, the argument we will employ to prove Proposition 3.4 is different from the “continuation principle” mentioned in the Introduction. Instead of using a Gronwall Lemma plus a reductio ad absurdum, we will prove the existence of ϕ on [t0, t1|, and the bound (3.13), by a constructive Peano–Picard iteration; the convergence of this iteration on the whole [t0, t1| has some theoretical interest by itself. Furthermore, this approach overcomes some technicalities required by the application of nonlinear Gronwall Lemmas (the analysis of the associated integral equation, and the necessity to determine the greatest solution when uniqueness fails [16]). Apart from the general concept of approximate solution employed here, the idea to prove existence for an ODE ḟ = P(f, t) by the Peano–Picard method, under
conditions of nonlinear growth for P of a more global type than the Lipschitz property can be ascribed to Caratheodory [1], and was developed in [17] and [12]. (iii) Of course, we can accept as a solution of (3.12) an R fulfilling the equation
$$ E(t) + \int_{t_0}^{t} ds\, u(t - s)\, \ell(R(s), s) = R(t) \quad \text{for } t \in [t_0, t_1|, \tag{3.14} $$
hereafter referred to as the control integral equation. (The existence of such an R on a sufficiently short interval is granted by standard compactness arguments, see [16]. Uniqueness can be proved under supplementary assumptions of Lipschitz kind for ℓ.)
Let us exploit a typical case, where the control integral equation (3.14) is equivalent to a Cauchy problem. To this purpose, assume that
$$ u(t) = U e^{-Bt} \quad (U \ge 1,\ B \in \mathbb{R}), \qquad E(t) = U e^{-B(t - t_0)}\,\delta + U \int_{t_0}^{t} ds\, e^{-B(t - s)}\,\varepsilon(s) \tag{3.15} $$
for some constant δ ≥ 0 and some continuous function ε : [t0, t1| → [0, +∞) (for example, the estimator E derived from Eq. (3.8) has the above form). Then, multiplying by e^{B(t−t0)} we see that Eq. (3.14) is equivalent to
$$ U\delta + U \int_{t_0}^{t} ds\, e^{B(s - t_0)}\,\varepsilon(s) + U \int_{t_0}^{t} ds\, e^{B(s - t_0)}\, \ell(R(s), s) = e^{B(t - t_0)} R(t). \tag{3.16} $$
Any solution R of (3.16) is clearly C^1. Differentiating this equation with respect to t, and evaluating it at t = t0, we get
Proposition 3.6. If u and E are as in (3.15), Eq. (3.14) is equivalent to the problem
$$ \dot R(t) = U\varepsilon(t) + U\ell(R(t), t) - B R(t), \qquad R(t_0) = U\delta, \tag{3.17} $$
for an unknown function R ∈ C^1([t0, t1|, [0, ρ)) (the terms control problem, or control equation, will be employed as well for (3.17) or the differential equation therein).
4. Proof of Proposition 3.4
We present in detail the argument in the case of a compact interval [t0, t1]. In this case, we use the space C([t0, t1], F), regarded as a Banach space with the usual sup norm ‖ψ‖ := max_{t∈[t0,t1]} ‖ψ(t)‖. The case when ϕap, R, etc. are defined on [t0, t1) (with t1 possibly infinite) is treated in a similar way, using C([t0, t1), F) with the topology of uniform convergence on all compact subintervals [t0, τ] ⊂ [t0, t1).^a
^a This complete, locally convex topology on C([t0, t1), F) is defined by the seminorms (‖ ‖_τ)_{τ∈[t0,t1)} where ‖ψ‖_τ := max_{t∈[t0,τ]} ‖ψ(t)‖. To adapt the proof to this case, the objects ϱ, T̄(ϕap, ϱ), Λ, etc. appearing in the sequel must be replaced by families of objects ϱ_τ, T̄(ϕap↾[t0, τ], ϱ_τ), Λ_τ, etc., one for each τ; the definition of D is simply rephrased using [t0, t1). The Peano–Picard iteration converges in all the seminorms ‖ ‖_τ.
Sticking from now on to the case [t0, t1], we introduce the objects
$$ \varrho := \max_{t \in [t_0, t_1]} R(t); \qquad \bar T(\varphi_{\mathrm{ap}}, \varrho) := \{(f, t) \in F \times [t_0, t_1] \mid \|f - \varphi_{\mathrm{ap}}(t)\| \le \varrho\}; \tag{4.1} $$
$$ D := \{\psi \in C([t_0, t_1], F) \mid \|\psi(t) - \varphi_{\mathrm{ap}}(t)\| \le R(t) \ \text{for } t \in [t_0, t_1]\}; \tag{4.2} $$
then ϱ < ρ, and T̄(ϕap, ϱ) ⊂ T(ϕap, ρ) is a strict subset of Dom P. D is a closed subset of C([t0, t1], F) (containing ϕap) and ψ ∈ D ⇒ graph ψ ⊂ T̄(ϕap, ϱ).
Definition 4.1. We put
$$ J : D \to C([t_0, t_1], F), \qquad \psi \mapsto J(\psi), \qquad J(\psi)(t) := U(t - t_0) f_0 + \int_{t_0}^{t} ds\, U(t - s)\, P(\psi(s), s) \quad \text{for } t \in [t_0, t_1]. \tag{4.3} $$
Of course, we have
Lemma 4.2. ϕ ∈ D solves the Volterra problem (2.21) if and only if ϕ = J(ϕ).
Lemma 4.3. There is a constant Λ ≥ 0 such that, for all ψ, ψ′ ∈ D,
$$ \|J(\psi)(t) - J(\psi')(t)\| \le \Lambda \int_{t_0}^{t} ds\, \|\psi(s) - \psi'(s)\| \quad \text{for } t \in [t_0, t_1]. \tag{4.4} $$
Thus ‖J(ψ) − J(ψ′)‖ ≤ Λ(t1 − t0)‖ψ − ψ′‖, which implies the continuity of J.
Proof. P is Lipschitz at fixed time on the strict subsets of its domain, so there is a constant L ≥ 0 such that ‖P(f, t) − P(f′, t)‖ ≤ L‖f − f′‖ for (f, t), (f′, t) ∈ T̄(ϕap, ϱ). If u is an estimator for the semigroup U and U := max_{t∈[t0,t1]} u(t), we see that Eq. (4.4) is fulfilled with Λ := U L; the remaining statements are trivial.
Lemma 4.4. J(D) ⊂ D.
Proof. Let ψ ∈ D. For all t ∈ [t0, t1] the definitions of J and of the error E(ϕap), with the properties of E, u, ℓ, imply
$$ J(\psi)(t) - \varphi_{\mathrm{ap}}(t) = -E(\varphi_{\mathrm{ap}})(t) + \int_{t_0}^{t} ds\, U(t - s)\,[P(\psi(s), s) - P(\varphi_{\mathrm{ap}}(s), s)], \tag{4.5} $$
$$ \|J(\psi)(t) - \varphi_{\mathrm{ap}}(t)\| \le E(t) + \int_{t_0}^{t} ds\, u(t - s)\, \ell(\|\psi(s) - \varphi_{\mathrm{ap}}(s)\|, s). \tag{4.6} $$
On the other hand, ‖ψ(s) − ϕap(s)‖ ≤ R(s), which implies ℓ(‖ψ(s) − ϕap(s)‖, s) ≤ ℓ(R(s), s); inserting this into (4.6), and using the control inequality (3.12) for R, we conclude
$$ \|J(\psi)(t) - \varphi_{\mathrm{ap}}(t)\| \le R(t), \quad \text{i.e.,} \quad J(\psi) \in D. \tag{4.7} $$
The invariance of D under J is a central result; with the previously shown properties of J, it allows us to set up the Peano–Picard iteration and ultimately get a fixed point.
Definition 4.5. (ϕk) (k ∈ N) is the sequence of functions in D, defined recursively by
$$ \varphi_0 := \varphi_{\mathrm{ap}}, \qquad \varphi_k := J(\varphi_{k-1}) \quad (k \ge 1). \tag{4.8} $$
Lemma 4.6. For all k ∈ N and t ∈ [t0, t1], it is
$$ \|\varphi_{k+1}(t) - \varphi_k(t)\| \le \Sigma\, \frac{\Lambda^{k} (t - t_0)^{k}}{k!} \tag{4.9} $$
where Λ is the constant of Eq. (4.4) and Σ := max_{t∈[t0,t1]} E(t). So,
$$ \|\varphi_{k+1} - \varphi_k\| \le \Sigma\, \frac{\Lambda^{k} (t_1 - t_0)^{k}}{k!}. \tag{4.10} $$
Proof. Equation (4.10) is an obvious consequence of (4.9). We will prove (4.9) by recursion, indicating with a subscript k the thesis at a specified order. We have ϕ1(t) − ϕ0(t) = J(ϕap)(t) − ϕap(t) = −E(ϕap)(t) by the definition of E(ϕap), whence ‖ϕ1(t) − ϕ0(t)‖ ≤ E(t) ≤ Σ; this gives (4.9)_0. For each k ≥ 0, we have
$$ \|\varphi_{k+2}(t) - \varphi_{k+1}(t)\| = \|J(\varphi_{k+1})(t) - J(\varphi_k)(t)\| \le \Lambda \int_{t_0}^{t} ds\, \|\varphi_{k+1}(s) - \varphi_k(s)\|, \tag{4.11} $$
the last passage depending on Eq. (4.4). Eqs. (4.11) and (4.9)_k imply (4.9)_{k+1}.
Lemma 4.7. For all k, k′ ∈ N, it is
$$ \|\varphi_{k'} - \varphi_k\| \le \Sigma\, e^{\Lambda(t_1 - t_0)}\, \frac{\Lambda^{h} (t_1 - t_0)^{h}}{h!}, \qquad h := \min(k, k'); \tag{4.12} $$
so, (ϕk) is a Cauchy sequence.
Proof. To prove Eq. (4.12), it suffices to consider the case k′ > k (so that h = k). Writing ϕ_{k′} − ϕ_k = Σ_{j=k}^{k′−1} (ϕ_{j+1} − ϕ_j) and using Eq. (4.10) we get
$$ \|\varphi_{k'} - \varphi_k\| \le \Sigma \sum_{j=k}^{k'-1} \frac{\Lambda^{j} (t_1 - t_0)^{j}}{j!}. \tag{4.13} $$
On the other hand, for each ξ ≥ 0, it is Σ_{j=k}^{k′−1} ξ^j/j! ≤ Σ_{j=k}^{+∞} ξ^j/j! ≤ e^ξ ξ^k/k!; with ξ = Λ(t1 − t0) we obtain Eq. (4.12), implying ‖ϕ_{k′} − ϕ_k‖ → 0 for (k, k′) → ∞.^b
^b Incidentally we note that (4.12) could be improved, but this is unnecessary: this estimate is needed only to infer the Cauchy property of the sequence.
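The iteration (4.8) and the invariance of D can be observed numerically on a toy problem. The sketch below is an illustration only, under assumptions that are ours and not taken from the paper: F = R, A = 0 (so U(t) = 1), P(f) = f², datum f0 < 0 at t0 = 0, and approximate solution ϕap := 0. In this setting one can take E(t) = |f0| and ℓ(r) = r², and R(t) = |f0|/(1 − |f0|t) solves the control integral equation (3.14) on [0, 1/|f0|). The integral in J is approximated by the trapezoidal rule, and the function names are hypothetical.

```python
def picard_iterates(f0, t_end, n_steps=500, n_iter=25):
    """Peano-Picard iterates for phi' = phi**2, phi(0) = f0 (A = 0, U(t) = 1),
    started from the zero approximate solution phi_0 := phi_ap = 0."""
    h = t_end / n_steps
    ts = [i * h for i in range(n_steps + 1)]
    phi = [0.0] * (n_steps + 1)
    history = [phi]
    for _ in range(n_iter):
        new = [f0]
        integral = 0.0
        for i in range(1, n_steps + 1):
            # trapezoidal rule for the integral of phi(s)**2 over [0, t_i]
            integral += 0.5 * h * (phi[i - 1] ** 2 + phi[i] ** 2)
            new.append(f0 + integral)
        phi = new
        history.append(phi)
    return ts, history


def control_bound(f0, t):
    """R(t) = |f0| / (1 - |f0| t), valid for 0 <= t < 1/|f0|."""
    return abs(f0) / (1.0 - abs(f0) * t)


def stays_in_tube(f0, t_end):
    """True if every computed iterate satisfies |phi_k(t)| <= R(t) on the grid."""
    ts, history = picard_iterates(f0, t_end)
    return all(abs(v) <= control_bound(f0, t) + 1e-9
               for phi in history for t, v in zip(ts, phi))
```

With f0 = −1 and t_end = 0.5, all iterates stay in the tube of radius R(t) around ϕap = 0, and the last one agrees with the exact solution −1/(1 + t) up to the quadrature error of the grid.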
Proof of Proposition 3.4. (ϕk) being a Cauchy sequence, ϕ := lim_{k→+∞} ϕk exists in C([t0, t1], F); ϕ belongs to D because this set is closed. By the continuity of J, we have
$$ J(\varphi) = \lim_{k \to +\infty} J(\varphi_k) = \lim_{k \to +\infty} \varphi_{k+1} = \varphi; \tag{4.14} $$
thus, ϕ solves the Volterra problem (2.21). Finally, the inequality ‖ϕ(t) − ϕap(t)‖ ≤ R(t) for all t ∈ [t0, t1] is ensured by the definition of D.
5. An Elementary Application of Proposition 3.4
The results we are presenting in the forthcoming Propositions 5.1 and 5.2 are essentially known (see, e.g., [7] for the case U(t) = 1 and dim F finite), but their derivation as a subcase of Proposition 3.4 is instructive: the main idea is to use the zero function as an approximate solution.
Proposition 5.1. Consider the Volterra problem (2.21), where:
(i) U is a strongly continuous linear semigroup, with an estimator u(t) := U e^{−Bt} (U ≥ 1, B ≥ 0).
(ii) P is continuous and Lipschitz at fixed time on the strict subsets of its open domain. It is B(0, ρ) × [t0, T) ⊂ Dom P for some ρ ∈ (0, +∞], T ∈ (t0, +∞], and
$$ P(0, t) = 0 \quad \text{for } t \in [t_0, T). \tag{5.1} $$
There is a continuous function ℓ : [0, ρ) × [t0, T) → [0, +∞), nondecreasing in the first variable, such that
$$ \|P(f, t)\| \le \ell(\|f\|, t) \quad \text{for } (f, t) \in B(0, \rho) \times [t_0, T). \tag{5.2} $$
(iii) The control problem
$$ \dot R(t) = U\ell(R(t), t) - B R(t), \qquad R(t_0) = U\|f_0\|, \tag{5.3} $$
has a solution R ∈ C^1([t0, tN), [0, ρ)), for some tN ∈ (t0, T).
Then, the Volterra problem (2.21) has a solution ϕ ∈ C([t0, tN), F) and, for all t in this interval,
$$ \|\varphi(t)\| \le R(t). \tag{5.4} $$
Proof. We apply Propositions 3.4 and 3.6 with ϕap(t) := 0 for t ∈ [t0, tN); the function ℓ in item (ii) is a growth estimator for P from the approximate solution. The datum and differential errors are d(ϕap) = −f0, e(ϕap)(t) = 0, so they admit the estimators δ := ‖f0‖, ε(t) := 0. With these estimators, problem (3.17) takes the form (5.3).
The symbol tN adopted here for the right extreme of the domain of R is chosen for future convenience; it emphasizes the dependence of this object on the norm
of the initial datum. In the time independent case ℓ(r, t) = ℓ(r), Eq. (5.3) can be solved by the quadrature formula
$$ \int_{U\|f_0\|}^{R(t)} \frac{dr}{U\ell(r) - Br} = t - t_0; \tag{5.5} $$
let us write the explicit solution in a simple case.
Proposition 5.2. Let the previous assumptions be satisfied with t0 = 0, ρ = +∞, T = +∞ and ℓ(r, t) = P r^p (P ≥ 0, p > 1). Then, the problem (5.3) has the solution R ∈ C^1([0, tN), [0, +∞)) defined hereafter. It is
$$ t_N := \begin{cases} +\infty & \text{if } P U^{p} \|f_0\|^{p-1} \le B, \\[4pt] \dfrac{1}{p-1}\, L_B\big(P U^{p} \|f_0\|^{p-1}\big) & \text{if } P U^{p} \|f_0\|^{p-1} > B, \end{cases} \tag{5.6} $$
$$ L_B(u) := \begin{cases} -(1/B) \log(1 - B/u) & \text{if } 0 < B < u, \\ 1/u & \text{if } B = 0 < u, \end{cases} \tag{5.7} $$
$$ R(t) := \frac{U\|f_0\|}{\big[1 - (P U^{p} \|f_0\|^{p-1} - B)\, E_B((p-1)t)\big]^{\frac{1}{p-1}}} \quad \text{for all } t \in [0, t_N), \tag{5.8} $$
$$ E_B(u) := \begin{cases} \dfrac{e^{Bu} - 1}{B} & \text{if } B > 0, \\[4pt] u & \text{if } B = 0. \end{cases} \tag{5.9} $$
The function R has the following features. If P U^p ‖f0‖^{p−1} < B, R is decreasing and R(t) → 0 for t → +∞. If P U^p ‖f0‖^{p−1} = B, R(t) = const. = U‖f0‖. If P U^p ‖f0‖^{p−1} > B, R is increasing and R(t) → +∞ for t → tN^−.
Proof. Everything follows in an elementary way from (5.5).
Remarks. (i) A map P as in the Example of page 388 has the properties required by Proposition 5.2, if the function t ↦ P(t) appearing in Eq. (2.15) is bounded on the interval [t0, T) under consideration. In this case, the growth of P from zero admits the estimator ℓ(r, t) := P r^p, with P the sup of the function t ↦ P(t). (ii) Obviously enough: if P U^p ‖f0‖^{p−1} < B, the Volterra problem (2.21) has a solution ϕ defined for all t ∈ [0, +∞), and ‖ϕ(t)‖ ≤ R(t) → 0 for t → +∞. If P U^p ‖f0‖^{p−1} = B, we have again a solution defined on [0, +∞), and ‖ϕ(t)‖ ≤ U‖f0‖ for all t. If P U^p ‖f0‖^{p−1} > B, we can grant existence of a solution ϕ at least until the time tN in Eq. (5.6), and the bound ‖ϕ(t)‖ ≤ R(t) with R diverging at tN; the result for this case can be applied to blow up problems, to get a lower bound on the time of explosion of the solution and an upper bound on its growth.
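The closed-form expressions of Proposition 5.2 are easy to evaluate, and they can be cross-checked against a direct numerical integration of the control problem (5.3) with ℓ(r) = P r^p. The Python sketch below is a minimal illustration and not the computation performed in the paper: the function names are hypothetical, and the classical fourth-order Runge–Kutta scheme is simply our choice of integrator.

```python
import math


def existence_time(P, U, B, p, norm_f0):
    """t_N of Eq. (5.6), with K := P U^p ||f0||^(p-1)."""
    K = P * U ** p * norm_f0 ** (p - 1)
    if K <= B:
        return math.inf
    if B == 0:
        return 1.0 / ((p - 1) * K)
    return -math.log(1.0 - B / K) / ((p - 1) * B)


def control_solution(t, P, U, B, p, norm_f0):
    """R(t) of Eq. (5.8), meaningful for 0 <= t < t_N."""
    K = P * U ** p * norm_f0 ** (p - 1)
    u = (p - 1) * t
    E = u if B == 0 else math.expm1(B * u) / B   # E_B(u) of Eq. (5.9)
    return U * norm_f0 / (1.0 - (K - B) * E) ** (1.0 / (p - 1))


def integrate_control(t_end, P, U, B, p, norm_f0, n_steps=2000):
    """RK4 integration of R' = U P R^p - B R, R(0) = U ||f0||, up to t_end."""
    def rhs(r):
        return U * P * r ** p - B * r

    h = t_end / n_steps
    r = U * norm_f0
    for _ in range(n_steps):
        k1 = rhs(r)
        k2 = rhs(r + 0.5 * h * k1)
        k3 = rhs(r + 0.5 * h * k2)
        k4 = rhs(r + h * k3)
        r += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r
```

For instance, with P = U = 1, B = 0, p = 2 and ‖f0‖ = 1 one finds tN = 1 and R(0.5) = 2, and the numerical integration reproduces this value; the comparison is only reliable for t_end safely below tN, since R diverges there.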
Example. We consider the Banach space
$$ F := C_0(\mathbb{R}) := \{f : \mathbb{R} \to \mathbb{R} \mid f \text{ is continuous},\ f(x) \to 0 \text{ for } x \to \infty\}; \tag{5.10} $$
$$ \|f\| := \sup_{x \in \mathbb{R}} |f(x)| \quad \text{for } f \in F. \tag{5.11} $$
We define a linear semigroup U : t ∈ [0, +∞) → U(t) ∈ L(F) setting
$$ (U(t)f)(x) := f(x + t) \tag{5.12} $$
for all f ∈ F; U is strongly continuous, and ‖U(t)f‖ = ‖f‖.^c The generator of U is the operator
$$ A := \frac{d}{dx} : C_0^1(\mathbb{R}) \subset F \to F, \qquad f \mapsto f_x \tag{5.13} $$
where C_0^1(R) is the space of the C^1 functions f : R → R such that f, f_x vanish at infinity. We also introduce the function
$$ P : F \to F, \qquad P(f) := f^{p} \quad (p > 1 \text{ integer}), \tag{5.14} $$
which can be seen as a t-independent case of the Example on page 388, with $\mathscr{P}(f_1, \dots, f_p) := f_1 \cdots f_p$; of course $\|\mathscr{P}(f_1, \dots, f_p)\| \le \|f_1\| \cdots \|f_p\|$. We consider the Volterra problem (2.21) with t0 := 0, and an arbitrary initial datum f0 ∈ F; in the special case f0 ∈ C_0^1(R), the corresponding Cauchy problem is
$$ \dot\varphi(t) = \varphi(t)_x + \varphi(t)^{p}, \qquad \varphi(0) = f_0, \tag{5.15} $$
involving a first order wave equation with polynomial nonlinearity. The results of Proposition 5.2 can be applied in this framework with U = 1, B = 0 and P = 1. For any f0 ∈ F, this proposition ensures existence of the solution ϕ from time 0 to
$$ t_N := \frac{1}{(p-1)\,\|f_0\|^{p-1}} \tag{5.16} $$
(intending tN := +∞ if f0 = 0), and gives the bound
$$ \|\varphi(t)\| \le R(t), \qquad R(t) := \frac{\|f_0\|}{\big[1 - (p-1)\,\|f_0\|^{p-1}\, t\big]^{\frac{1}{p-1}}} \tag{5.17} $$
for all t ∈ [0, tN). In this case, the accuracy of the estimates in Proposition 5.2 can be checked in a very direct way, because the maximal solution of (2.21) is known; this is given by
$$ \varphi(t)(x) = \frac{f_0(x + t)}{\big[1 - (p-1)\, f_0(x + t)^{p-1}\, t\big]^{\frac{1}{p-1}}} \quad \text{for } t \in [0, \vartheta), \tag{5.18} $$
$$ \vartheta := \sup\Big\{ t > 0 \;\Big|\; (p-1) \sup_{x \in \mathbb{R}} f_0(x)^{p-1}\, t < 1 \Big\}. \tag{5.19} $$
^c In fact, U can be extended to a linear group, also defined for t ≤ 0, but we do not emphasize this aspect: our general framework is designed for time evolution in the future.
If $f_0 \in C_0^1(\mathbf{R})$, the above $\varphi$ also fulfils the Cauchy problem (5.15): this implies the full equivalence of the Volterra and Cauchy problems for such an $f_0$, in spite of the fact that item (ii) of Proposition 2.7 does not apply to the nonreflexive Banach space $C_0(\mathbf{R})$. The following facts occur:
(i) for $p$ odd, or $p$ even and $\sup_x f_0(x) = \sup_x |f_0(x)|$, $\vartheta$ equals the time $t_N$ in Eq. (5.16); thus, Proposition 5.2 gives the best possible lower bound on the existence time of the maximal solution.
(ii) For $p$ even and $0 < \sup_x f_0(x) < \sup_x |f_0(x)|$, it is $+\infty > \vartheta > t_N$.
(iii) For $p$ even and $\sup_x f_0(x) \leq 0 < \sup_x |f_0(x)|$, it is $+\infty = \vartheta > t_N$.
The accuracy of the growth estimate (5.17) is easily analysed by comparison with Eq. (5.18). We think that better results would arise in cases (ii) and (iii) by suitably generalizing the theory of approximate solutions to the framework of ordered Banach spaces [18]; this will be done elsewhere.

6. Approximate Solutions on Finite-Dimensional Submanifolds of F

Let us discuss a general scheme to construct accurate approximate solutions, and apply Proposition 3.4 to it to get information on the exact solution; a typical realization of this scheme is the Galerkin method, discussed in the sequel.

6.1. The framework

From now on: $U$ is a strongly continuous linear semigroup with generator $A$ and an estimator $u(t) := U e^{-Bt}$ ($U \geq 1$, $B \in \mathbf{R}$); $P : \mathrm{Dom}\, P \subset F \times \mathbf{R} \to F$ is continuous and Lipschitz on the strict subsets of its open domain; $(f_0, t_0) \in \mathrm{Dom}\, P$. Our idea is to construct an approximate solution $\varphi_{ap}$ for the Volterra problem (2.21), lying on a finite-dimensional (linear or nonlinear) submanifold of $F$; we assume the latter to be coordinatized by some real parameters $a^k$, labelled by a finite set of indices $I$. More precisely, we consider an injective $C^1$ map
\[
G : \mathrm{Dom}\, G \subset \mathbf{R}^I \to F , \quad a = (a^k)_{k \in I} \mapsto G(a) , \tag{6.1}
\]
with open domain, such that the partial derivatives
\[
\partial_k G(a) \equiv \frac{\partial G}{\partial a^k}(a) \in F \quad (k \in I) \tag{6.2}
\]
are linearly independent for all $a \in \mathrm{Dom}\, G$; we regard $\mathrm{Im}\, G$ as a multidimensional surface in $F$. We also suppose that
\[
\mathrm{Im}\, G \subset \mathrm{Dom}\, A , \quad \mathrm{Im}\, G \times [t_0, T) \subset \mathrm{Dom}\, P \tag{6.3}
\]
for some $T \in (t_0, +\infty]$, and ask $G$ to be continuous as a map to $\mathrm{Dom}\, A$ with the graph norm. The approximate solution we consider has the form
\[
\varphi_{ap}(t) := G(a(t)) , \quad a(\cdot) \in C^1([t_0, t_1), \mathrm{Dom}\, G) , \ t \mapsto a(t) \tag{6.4}
\]
where $[t_0, t_1) \subset [t_0, T)$, and $a(\cdot)$ is a function determined in the sequel. Clearly, the datum and differential errors of $\varphi_{ap}$ are
\[
d(\varphi_{ap}) := G(a(t_0)) - f_0 , \tag{6.5}
\]
\[
e(\varphi_{ap})(t) := \partial_k G(a(t))\, \dot a^k(t) - A G(a(t)) - P(G(a(t)), t) \tag{6.6}
\]
(here and in the sequel, we employ the familiar Einstein summation convention on repeated indices). We prescribe $a(\cdot)$ to fulfil a Cauchy problem
\[
\dot a(t) = X(a(t), t) , \quad a(t_0) = a_0
\]
where $a_0$ is an initial datum, and $X : \mathrm{Dom}\, G \subset \mathbf{R}^I \to \mathbf{R}^I$ a continuous vector field; the criteria to fix $a_0$ and $X$ are discussed later. For convenience, for all $a \in \mathrm{Dom}\, G$, $\dot a \in \mathbf{R}^I$, $t \in [t_0, T)$ we put
\[
\hat\delta(a) := \|G(a) - f_0\| , \quad \hat\epsilon(a, \dot a, t) := \|A G(a) + P(G(a), t) - \partial_k G(a)\, \dot a^k\| , \tag{6.7}
\]
\[
\hat\epsilon(a, t) := \hat\epsilon(a, X(a, t), t) ; \tag{6.8}
\]
then, the approximate solution $\varphi_{ap}$ admits the datum and differential error estimators
\[
\delta := \hat\delta(a_0) , \quad \epsilon(t) := \hat\epsilon(a(t), t) . \tag{6.9}
\]
To conclude, we assume there are $\rho \in (0, +\infty]$ and a continuous function
\[
\hat\ell : [0, \rho) \times \mathrm{Dom}\, G \times [t_0, T) \to [0, +\infty) , \quad (r, a, t) \mapsto \hat\ell(r, a, t) , \tag{6.10}
\]
nondecreasing in the variable $r$, such that $a \in \mathrm{Dom}\, G$, $t \in [t_0, T)$ and $\|f - G(a)\| < \rho$ imply $(f, t) \in \mathrm{Dom}\, P$ and
\[
\|P(f, t) - P(G(a), t)\| \leq \hat\ell(\|f - G(a)\|, a, t) . \tag{6.11}
\]
Then, the function
\[
\ell(r, t) := \hat\ell(r, a(t), t) \tag{6.12}
\]
is a growth estimator for $P$ from $\varphi_{ap}$. The application of Propositions 3.4 and 3.6 gives

Proposition 6.1. Consider the equations
\[
\dot a(t) = X(a(t), t) , \quad a(t_0) = a_0 , \tag{6.13}
\]
\[
\dot R(t) = U \hat\epsilon(a(t), t) + U \hat\ell(R(t), a(t), t) - B R(t) , \quad R(t_0) = U \hat\delta(a_0) , \tag{6.14}
\]
for the unknowns $a(\cdot) \in C^1([t_0, t_1), \mathrm{Dom}\, G)$, $R \in C^1([t_0, t_1), [0, \rho))$. If $(a(\cdot), R)$ is a solution on some interval $[t_0, t_1)$ and $\varphi_{ap}(t) := G(a(t))$, then the Volterra problem (2.21) has a solution $\varphi$ on $[t_0, t_1)$, and $\|\varphi(t) - \varphi_{ap}(t)\| \leq R(t)$ on the same interval.

Let us pass to the criteria for choosing $X$ and $a_0$. One of the most familiar is the Galerkin criterion (see, e.g., [4] or [15]): we will concentrate on it and will not discuss other approaches (such as the variational methods often used for the Lagrangian
or Hamiltonian evolution equations, see, e.g., [6]). The Galerkin choice for $a_0$ and $X$ is the one minimizing the norms of the datum error and of the differential error (at any time):

Definition 6.2. The vector field $X$ and the datum $a_0$ fulfil the Galerkin criterion if
\[
\hat\epsilon(a, \dot a, t) = \min! \quad \text{for } \dot a = X(a, t) , \tag{6.15}
\]
\[
\hat\delta(a) = \min! \quad \text{for } a = a_0 \tag{6.16}
\]
(the symbol min! indicating the absolute minimum). (Of course, condition (6.16) is trivially satisfied if $f_0 = G(a_0)$; then the absolute minimum of $\hat\delta$, attained at this point, is zero).

Both Eqs. (6.15) and (6.16) can be studied in a systematic way if $F$ is a Hilbert space, say real, with an inner product $\langle \ | \ \rangle$ yielding the norm $\|f\| := \sqrt{\langle f | f \rangle}$; from now on we stick to this assumption. For all $a \in \mathrm{Dom}\, G$ we introduce the matrix
\[
g_{kl}(a) := \langle \partial_k G(a) \,|\, \partial_l G(a) \rangle \quad (k, l \in I) , \tag{6.17}
\]
which is symmetric and positive definite (recall the linear independence of the vectors $\partial_k G(a)$). As customary in tensor calculus, we denote the inverse matrix with $g^{kh}(a)$ and introduce the convention of "raising and lowering indices" with these matrices. In connection with this, it is worth writing down the identities
\[
\langle \partial^k G(a) \,|\, \partial_h G(a) \rangle = \delta^k_h ; \quad v^k(a)\, \partial_k G(a) = v_k(a)\, \partial^k G(a) , \tag{6.18}
\]
\[
\langle v^k(a) \partial_k G(a) \,|\, w^h(a) \partial_h G(a) \rangle = v_k(a) w^k(a) = v^k(a) w_k(a) . \tag{6.19}
\]
Here: $\partial^k G(a) := g^{kh}(a) \partial_h G(a) \in F$; $v^k(a)$ ($k \in I$) is a family of real numbers, $v_k(a) := g_{kh}(a) v^h(a)$ and $w^k(a)$, $w_k(a)$ have a similar meaning. We apply all these notations to the discussion of the minimum problems (6.15) and (6.16); the solutions are given by the two forthcoming propositions.

Proposition 6.3. For any fixed $(a, t) \in \mathrm{Dom}\, G \times [t_0, T)$, the function $\dot a \mapsto \hat\epsilon^2(a, \dot a, t)$ is quadratic; it has a unique point of absolute minimum at
\[
\dot a^k = X^k(a, t) , \quad X^k(a, t) := \langle \partial^k G(a) \,|\, A G(a) + P(G(a), t) \rangle . \tag{6.20}
\]
The absolute minimum $\hat\epsilon(a, X(a, t), t) := \hat\epsilon(a, t)$ is given by
\[
\begin{aligned}
\hat\epsilon(a, t)^2 = {} & \|A G(a)\|^2 - \langle A G(a) \,|\, \partial_k G(a) \rangle \langle \partial^k G(a) \,|\, A G(a) \rangle \\
& + 2 \langle A G(a) \,|\, P(G(a), t) \rangle - 2 \langle A G(a) \,|\, \partial_k G(a) \rangle \langle \partial^k G(a) \,|\, P(G(a), t) \rangle \\
& + \|P(G(a), t)\|^2 - \langle P(G(a), t) \,|\, \partial_k G(a) \rangle \langle \partial^k G(a) \,|\, P(G(a), t) \rangle .
\end{aligned} \tag{6.21}
\]
Proof. The general theory of Hilbert spaces tells us that, given a vector $g$ and a closed vector subspace $\mathcal{T}$ of $F$, the problem
\[
\text{find } f \text{ in } \mathcal{T} \text{ such that } \|g - f\| = \min! \tag{6.22}
\]
has the unique solution
\[
f = \Pi g ; \quad \|g - f\|^2 = \|g\|^2 - \langle g \,|\, \Pi g \rangle , \quad \Pi := \text{the orthogonal projection of } F \text{ onto } \mathcal{T} . \tag{6.23}
\]
For any fixed $(a, t)$, the problem we have in mind has just the form (6.22): in this case
\[
\mathcal{T} = \mathcal{T}(a) := \{ \dot a^k \partial_k G(a) \mid \dot a \in \mathbf{R}^I \} \subset F \tag{6.24}
\]
(which represents the tangent subspace at $G(a)$ of $\mathrm{Im}\, G$), the unknown $f$ is written $X^k(a, t) \partial_k G(a)$ and
\[
g = A G(a) + P(G(a), t) , \quad \Pi = \Pi(a) = \langle \partial^k G(a) \,|\, \cdot \rangle\, \partial_k G(a) ; \tag{6.25}
\]
computing from (6.23) the solution $f$ and the quantity $\|g - f\|^2 = \hat\epsilon(a, t)^2$, we obtain Eqs. (6.20) and (6.21).

Proposition 6.4. Assume that:
(i) $G$ is $C^2$, $\mathrm{Dom}\, G$ is convex;
(ii) the matrix $g_{kl}(a) + \langle G(a) - f_0 \,|\, \partial^2_{kl} G(a) \rangle$ is semipositive for all $a \in \mathrm{Dom}\, G$ (where $\partial^2_{kl}$ are the second partial derivatives with respect to $a^k$, $a^l$);
(iii) there is a point $a_0$ such that
\[
\langle \partial^k G(a_0) \,|\, G(a_0) \rangle = \langle \partial^k G(a_0) \,|\, f_0 \rangle . \tag{6.26}
\]
Then, the absolute minimum of $\hat\delta(a)$ is attained at $a = a_0$.

Proof. Everything follows computing the first and second derivatives of the function $a \mapsto \hat\delta^2(a)$ in Eq. (6.7). Equation (6.26) is the vanishing condition for the first derivatives, the other assumptions ensure that the stationary point $a_0$ is of absolute minimum.

Remark. The geometrical meaning of Eq. (6.26) is $\Pi(a_0) G(a_0) = \Pi(a_0) f_0$, where $\Pi(a_0)$ is the orthogonal projection onto $\mathcal{T}(a_0)$, see (6.25).

6.2. The classical Galerkin method

All the previous formulas become very simple in the "classical" realization, where
\[
G : \mathbf{R}^I \to F , \quad a \mapsto G(a) = a^k e_k , \tag{6.27}
\]
$(e_k)_{k \in I}$ linearly independent vectors of $\mathrm{Dom}\, A$, $(e_k, t) \in \mathrm{Dom}\, P$ for $k \in I$, $t \in [t_0, T)$.
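In this case $\mathrm{Im}\, G$ is a vector subspace, and we have the identities
\[
\partial_k G(a) = \text{const.} = e_k , \quad g_{kl}(a) = \text{const.} := g_{kl} , \quad \partial^k G(a) = \text{const.} = g^{kh} e_h := e^k . \tag{6.28}
\]
(Also, the tangent space at any point $G(a)$ is $\mathcal{T}(a) = \text{const.} = \mathrm{Im}\, G$; further simplifications occur in the orthonormal case where $g_{kl} = \delta_{kl}$, $e^k = e_k$).

Proposition 6.5. The absolute minimum point for $\hat\delta$ and its value at this point are given by
\[
a_0^k = \langle e^k \,|\, f_0 \rangle , \quad \hat\delta(a_0) = \|f_0 - a_0^k e_k\| . \tag{6.29}
\]
(In particular, $\hat\delta(a_0) = 0$ if $f_0$ is in the linear subspace spanned by the family $(e_k)$).

Proof. Elementary (recall Eq. (6.26)).

Proposition 6.6. Suppose $P$ to be determined by a $p$-linear map $P$, as in the Example of page 388. Then, Eqs. (6.20) and (6.21) become
\[
X^k(a, t) = \langle e^k \,|\, A e_l \rangle a^l + \langle e^k \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle a^{l_1} \cdots a^{l_p} ; \tag{6.30}
\]
\[
\begin{aligned}
\hat\epsilon(a, t)^2 = {} & \big( \langle A e_j \,|\, A e_l \rangle - \langle A e_j \,|\, e_k \rangle \langle e^k \,|\, A e_l \rangle \big) a^j a^l \\
& + 2 \big( \langle A e_j \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle - \langle A e_j \,|\, e_k \rangle \langle e^k \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle \big) a^j a^{l_1} \cdots a^{l_p} \\
& + \big( \langle P(e_{j_1}, \dots, e_{j_p}, t) \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle \\
& \qquad - \langle P(e_{j_1}, \dots, e_{j_p}, t) \,|\, e_k \rangle \langle e^k \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle \big) a^{j_1} \cdots a^{j_p} a^{l_1} \cdots a^{l_p} .
\end{aligned} \tag{6.31}
\]
In the right-hand side of the last equation, the coefficients of $a^j a^l$ and $a^j a^{l_1} \cdots a^{l_p}$ are zero if the subspace spanned by the family $(e_k)$ is invariant under $A$ (which occurs, in particular, if each $e_k$ is an eigenvector of $A$).

Proof. Both Eqs. (6.30) and (6.31) follow easily from $\partial_k G(a) = e_k$ and from the multilinearity of $P$. In the second equation the coefficients of $a^j a^l$ and $a^j a^{l_1} \cdots a^{l_p}$ are, respectively,
\[
\begin{aligned}
& \langle A e_j \,|\, A e_l \rangle - \langle A e_j \,|\, e_k \rangle \langle e^k \,|\, A e_l \rangle = \langle A e_j \,|\, A e_l \rangle - \langle A e_j \,|\, \Pi A e_l \rangle , \\
& \langle A e_j \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle - \langle A e_j \,|\, e_k \rangle \langle e^k \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle \\
& \qquad = \langle A e_j \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle - \langle \Pi A e_j \,|\, P(e_{l_1}, \dots, e_{l_p}, t) \rangle ,
\end{aligned} \tag{6.32}
\]
where $\Pi$ is the projection on the linear subspace spanned by $(e_k)$. If this subspace is left invariant by $A$ we have $\Pi A e_l = A e_l$ for each $l \in I$, so the above coefficients are zero.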
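The projection step behind Eqs. (6.20), (6.23) and (6.30) can be illustrated in a finite-dimensional setting. The sketch below is only a toy model under our own assumptions: the Hilbert space is replaced by $\mathbf{R}^n$ with the Euclidean inner product, the family $(e_k)$ consists of two vectors (so that the Gram matrix can be inverted by hand), and the names `inner` and `galerkin_coefficients` are hypothetical. It returns the components $X^k = g^{kh} \langle e_h | g \rangle$ of $\Pi g$ and the squared residual $\|g\|^2 - \langle g | \Pi g \rangle$.

```python
def inner(u, v):
    """Euclidean inner product, standing in for <u|v> on the Hilbert space."""
    return sum(x * y for x, y in zip(u, v))


def galerkin_coefficients(e1, e2, g):
    """Project g on span{e1, e2} (e1, e2 linearly independent).

    Returns ((X1, X2), residual_sq) where X^k = g^{kh} <e_h|g>, as in
    Eq. (6.20), and residual_sq = ||g||^2 - <g|Pi g>, as in Eq. (6.23).
    """
    # Gram matrix g_{kl} = <e_k|e_l> and its determinant.
    g11, g12, g22 = inner(e1, e1), inner(e1, e2), inner(e2, e2)
    det = g11 * g22 - g12 * g12
    # Covariant components <e_k|g>.
    b1, b2 = inner(e1, g), inner(e2, g)
    # Raise the index with the inverse Gram matrix g^{kh}.
    x1 = (g22 * b1 - g12 * b2) / det
    x2 = (-g12 * b1 + g11 * b2) / det
    residual_sq = inner(g, g) - (x1 * b1 + x2 * b2)
    return (x1, x2), residual_sq


coeffs, res_sq = galerkin_coefficients((1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 3.0, 4.0))
```

In this example the two basis vectors span the plane of the first two coordinates, so the projection of $(2, 3, 4)$ is $(2, 3, 0) = -e_1 + 3 e_2$ and the squared residual is $16$. If $g$ already lies in the span, the residual vanishes, which is the mechanism making the first coefficients of (6.31) zero for an $A$-invariant subspace.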
Proposition 6.7. Assume again $P$ to be as in the Example of page 388. Then, the growth of the function $P$ starting from a point $G(a) = a^k e_k$ admits this estimate, for all $f \in F$ and $t \in \Delta$:
\[
\|P(f, t) - P(G(a), t)\| \leq \hat\ell(\|f - G(a)\|, a, t) , \tag{6.33}
\]
\[
\hat\ell : [0, +\infty) \times \mathbf{R}^I \times \Delta \to [0, +\infty) , \quad \hat\ell(r, a, t) := P(t) \sum_{j=1}^{p} \binom{p}{j} \big( \sqrt{a_k a^k} \big)^{p-j} r^j . \tag{6.34}
\]
Proof. Apply Eq. (2.17) with $f' = G(a)$ and $t' = t$; note that $\|f'\| = \sqrt{a_k a^k}$.

In applications of the classical Galerkin method, some choices for the family $(e_k)$ deserve special consideration. Apart from systems of eigenvectors of $A$, other choices occur in finite element methods, which are strictly related to the idea of approximating the evolutionary problem by space discretization; in this case, $(e_k)$ is typically a family of piecewise linear (or polynomial) "chapeau functions" related to some spatial grid, see e.g. [4].

7. Applications to the Nonlinear Heat Equation

Our aim is to discuss the nonlinear heat equation $\dot f = f_{xx} + f^p$ with Dirichlet boundary conditions, for $x$ ranging in $(0, \pi)$ (we work in one space dimension only for simplicity). Let us introduce this equation in the framework of Sobolev spaces.

7.1. Notations for Sobolev spaces

All functions on $(0, \pi)$ are real-valued; $F(0, \pi)$ means $F((0, \pi), \mathbf{R})$ for each functional class $F$. We consider the Hilbert space $L^2(0, \pi)$, with the standard inner product $\langle f | g \rangle_{L^2} := \int_0^\pi dx\, f(x) g(x)$; here the functions
\[
s_k(x) := \sqrt{\frac{2}{\pi}}\, \sin(kx) \quad (k \in \{1, 2, 3, \dots\}) \tag{7.1}
\]
form a complete orthonormal system. We introduce the Sobolev space
\[
H^1(0, \pi) := \{ f \in L^2(0, \pi) \mid f_x \in L^2(0, \pi) \} \subset C([0, \pi]) , \tag{7.2}
\]
$f_x$ denoting the distributional derivative of $f$; this is a Hilbert space with the inner product
\[
\langle f | g \rangle := \langle f | g \rangle_{L^2} + \langle f_x | g_x \rangle_{L^2} , \tag{7.3}
\]
yielding the norm $\|f\| := \sqrt{\langle f | f \rangle}$. The inclusion indicated in (7.2) is well known, and allows us to define $f(0)$, $f(\pi)$ for all $f \in H^1(0, \pi)$; we fix attention on the closed subspace
\[
F := H^1_0(0, \pi) := \{ f \in H^1(0, \pi) \mid f(0) = f(\pi) = 0 \} , \tag{7.4}
\]
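and equip it with the restriction of the inner product (7.3). It turns out that
\[
F = \Big\{ f \in L^2(0, \pi) \ \Big| \ \sum_{k=1}^{\infty} k^2 \langle s_k | f \rangle_{L^2}^2 < +\infty \Big\} . \tag{7.5}
\]
The functions $s_k$ form a complete orthogonal system for this space: it is
\[
\begin{aligned}
& \langle s_k | s_l \rangle = (1 + k^2) \delta_{kl} , \quad \langle s_k | f \rangle = (1 + k^2) \langle s_k | f \rangle_{L^2} \quad \forall f \in F , \\
& \langle f | g \rangle = \sum_{k=1}^{\infty} (1 + k^2) \langle f | s_k \rangle_{L^2} \langle s_k | g \rangle_{L^2} \quad \forall f, g \in F .
\end{aligned} \tag{7.6}
\]
Both $H^1(0, \pi)$ and $F$ are known to be Banach algebras with respect to the pointwise product. We will use systematically the inequality (almost optimal, see Appendix A)
\[
\|f g\| \leq \|f\| \, \|g\| \quad \forall f, g \in F . \tag{7.7}
\]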
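7.2. The operator A

This is the linear map
\[
A := \frac{d^2}{dx^2} \quad \text{on } \mathrm{Dom}\, A := \{ f \in F \mid f_{xx} \in F \} . \tag{7.8}
\]
Of course, for all $k$,
\[
A s_k = -k^2 s_k . \tag{7.9}
\]
The operator $A$ generates the strongly continuous linear semigroup $U$ on $F$, defined by
\[
U(t) f := \sum_{k=1}^{\infty} e^{-k^2 t} \langle s_k | f \rangle_{L^2}\, s_k \quad \text{for } t \in [0, +\infty), \ f \in F ; \tag{7.10}
\]
the above series is in fact convergent in $F$, and
\[
\|U(t) f\| = \sqrt{ \sum_{k=1}^{\infty} e^{-2 k^2 t} (1 + k^2) \langle s_k | f \rangle_{L^2}^2 } \ \leq e^{-t} \|f\| . \tag{7.11}
\]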
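Equations (7.6), (7.10) and (7.11) are easy to experiment with on a truncated sine expansion. The following Python sketch is a minimal illustration under our own conventions: a function $f \in F$ is represented by the finite list of its coefficients $\langle s_k | f \rangle_{L^2}$, $k = 1, \dots, n$, and the names `h1_norm` and `evolve` are hypothetical.

```python
import math


def h1_norm(coeffs):
    """Norm of F = H^1_0(0, pi) computed via (7.6).

    coeffs[k-1] is the L^2 coefficient <s_k|f>_{L^2} of a finite sine expansion.
    """
    return math.sqrt(sum((1 + k * k) * c * c for k, c in enumerate(coeffs, start=1)))


def evolve(coeffs, t):
    """Coefficients of U(t)f according to (7.10): mode k is damped by exp(-k^2 t)."""
    return [math.exp(-k * k * t) * c for k, c in enumerate(coeffs, start=1)]


f = [1.0, 0.5, 0.25]
decay_ratio = h1_norm(evolve(f, 0.3)) / h1_norm(f)
```

Since every mode is damped at least as fast as the first one, `decay_ratio` cannot exceed $e^{-0.3}$; equality in (7.11) holds when only the mode $k = 1$ is present, which is why the estimator $u(t) = e^{-t}$ (that is, $U = 1$, $B = 1$) is the natural choice for this semigroup.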
7.3. The nonlinear function P

This is defined by
\[
P : F \to F , \quad P(f) := f^p \quad (p > 1 \text{ integer}) . \tag{7.12}
\]
It belongs to the class of maps in the Example of page 388, and corresponds to the time independent $p$-linear map
\[
P : \times^p F \to F , \quad P(f_1, \dots, f_p) := f_1 \cdots f_p . \tag{7.13}
\]
Of course, Eq. (7.7) implies $\|P(f_1, \dots, f_p)\| \leq \|f_1\| \cdots \|f_p\|$.
7.4. The Volterra and Cauchy problems

We consider the Volterra problem (2.21) on $F$, with $U$ as before, $P(f, t) :=$ the above defined $P(f)$, $t_0 := 0$ and some initial datum $f_0 \in F$; this reads
\[
\varphi(t) = U(t) f_0 + \int_0^t ds\, U(t - s)\, \varphi(s)^p . \tag{7.14}
\]
From now on, we will denote with
\[
\varphi : [0, \vartheta) \to F \tag{7.15}
\]
the maximal solution. If $f_0 \in \mathrm{Dom}\, A$, (2.21) is fully equivalent to the Cauchy problem
\[
\dot\varphi(t) = \varphi(t)_{xx} + \varphi(t)^p , \quad \varphi(0) = f_0 \tag{7.16}
\]
(because $F$ is reflexive and $P$ Lipschitz on the strict subsets of its domain).

7.5. Kaplan's blow up criterion

We specialize to the present framework a general criterion of Kaplan [5] for the blow up in a finite time of the solution of a nonlinear parabolic equation. To this purpose, we introduce the function
\[
Q : L^2(0, \pi) \to \mathbf{R} , \quad f \mapsto Q(f) := \frac{1}{2} \langle \sin | f \rangle_{L^2} = \frac{1}{2} \int_0^\pi dx\, \sin x \, f(x) . \tag{7.17}
\]
Proposition 7.1. Consider the Volterra problem (7.14); if
\[
f_0 \geq 0 , \quad Q(f_0) > 1 , \tag{7.18}
\]
then
\[
\vartheta \leq t_K , \quad t_K := -\frac{1}{p-1} \log \Big( 1 - \frac{1}{Q(f_0)^{p-1}} \Big) . \tag{7.19}
\]
Proof. It is sketched in Appendix B, adapting Kaplan's general argument.

From now on, $t_K$ will be referred to as the Kaplan time for the datum $f_0$.

7.6. Basic estimates on the solution

As a first step in our analysis, let us apply Propositions 5.1 and 5.2 to the Volterra problem. In the case we are considering, the semigroup has an estimator $u(t) = U e^{-Bt}$ with $U = 1$, $B = 1$; also, it is $\|P(f)\| \leq \ell(\|f\|)$ with $\ell(r) := r^p$. Therefore, we have
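Proposition 7.2. For any initial datum $f_0 \in F$, it is $\vartheta \geq t_N$ and $\|\varphi(t)\| \leq R(t)$ for all $t \in [0, t_N)$, with $t_N$ and $R$ depending on the norm of $f_0$ in the following way:
\[
t_N := \begin{cases} +\infty & \text{if } \|f_0\| \leq 1 , \\[1mm] -\dfrac{1}{p-1} \log \Big( 1 - \dfrac{1}{\|f_0\|^{p-1}} \Big) & \text{if } \|f_0\| > 1 , \end{cases} \tag{7.20}
\]
\[
R(t) := \frac{\|f_0\|}{\big[ 1 - (\|f_0\|^{p-1} - 1)(e^{(p-1)t} - 1) \big]^{\frac{1}{p-1}}} . \tag{7.21}
\]
If $\|f_0\| < 1$, $R$ is decreasing and $R(t) \to 0$ for $t \to +\infty$. If $\|f_0\| = 1$, $R(t) = 1$ for all $t$. If $\|f_0\| > 1$, $R$ is increasing and $R(t) \to +\infty$ for $t \to t_N^-$.

7.7. Summary of the previous results on ϑ. An example

We have
\[
t_N \leq \vartheta \ \text{ for all } f_0 \in F ; \quad \vartheta \leq t_K \ \text{ if } f_0 \geq 0 , \ Q(f_0) > 1 , \tag{7.22}
\]
with $t_N$ as in (7.20), $t_K$ as in (7.19). Let us consider, in particular, the initial datum
\[
f_0(x) := A s_1(x) = A \sqrt{\frac{2}{\pi}}\, \sin x \quad (A \geq 0) ; \tag{7.23}
\]
then $f_0 \in \mathrm{Dom}\, A$, so we have a full equivalence of (7.14) with the Cauchy problem (7.16). It turns out that
\[
\|f_0\| = \frac{A}{C_N} , \quad C_N := \frac{\sqrt{2}}{2} = 0.7071.. ; \qquad Q(f_0) = \frac{A}{C_K} , \quad C_K := 2 \sqrt{\frac{2}{\pi}} = 1.595.. . \tag{7.24}
\]
Therefore, Eq. (7.22) with this choice of the datum tells us that
\[
\vartheta = +\infty \ \text{ if } 0 \leq A \leq C_N ; \quad t_N \leq \vartheta \ \text{ if } C_N < A \leq C_K ; \quad t_N \leq \vartheta \leq t_K \ \text{ if } A > C_K , \tag{7.25}
\]
\[
t_N := -\frac{1}{p-1} \log \Big( 1 - \frac{C_N^{p-1}}{A^{p-1}} \Big) \ \sim_{A \to +\infty} \ \frac{1}{p-1} \frac{C_N^{p-1}}{A^{p-1}} , \tag{7.26}
\]
\[
t_K := -\frac{1}{p-1} \log \Big( 1 - \frac{C_K^{p-1}}{A^{p-1}} \Big) \ \sim_{A \to +\infty} \ \frac{1}{p-1} \frac{C_K^{p-1}}{A^{p-1}} . \tag{7.27}
\]
It should be noted that (7.22) does not allow us to establish whether $\vartheta$ is finite or infinite, for $A$ in the interval $(C_N, C_K]$. In the rest of the section, we will infer more precise estimates about $\vartheta$ by means of the Galerkin method, and also rediscuss its behavior for large $A$.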
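The bounds (7.25)-(7.27) for the datum (7.23) are elementary to evaluate. The short Python sketch below does so for a given amplitude $A$ and exponent $p$; the function names `t_lower` and `t_kaplan` are ours, and we adopt the convention (an assumption of this sketch) of returning `math.inf` for $t_K$ when $A \leq C_K$, i.e. when Kaplan's criterion gives no upper bound.

```python
import math

C_N = math.sqrt(2.0) / 2.0            # Eq. (7.24): ||f_0|| = A / C_N
C_K = 2.0 * math.sqrt(2.0 / math.pi)  # Eq. (7.24): Q(f_0) = A / C_K


def t_lower(A, p=2):
    """Lower bound t_N on the existence time, Eqs. (7.20) and (7.26)."""
    if A <= C_N:
        return math.inf
    return -math.log(1.0 - (C_N / A) ** (p - 1)) / (p - 1)


def t_kaplan(A, p=2):
    """Kaplan upper bound t_K, Eq. (7.27); math.inf if A <= C_K (no bound)."""
    if A <= C_K:
        return math.inf
    return -math.log(1.0 - (C_K / A) ** (p - 1)) / (p - 1)
```

For $p = 2$ and $A = 4$ this gives $t_N \simeq 0.19$ and $t_K \simeq 0.51$, so the existence time is localized only within a factor of about 2.6; for $A = 1$, which lies in $(C_N, C_K]$, only the lower bound is finite.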
7.8. A Galerkin approach to the nonlinear heat equation

We apply the scheme of Sec. 6 with
\[
G(a) := a^k s_k \ \text{ for all } a = (a^k) \in \mathbf{R}^I , \quad I \text{ a finite subset of } \{1, 2, 3, \dots\} , \tag{7.28}
\]
and $s_k$ the functions (7.1). We refer, in particular, to the description given in the previous section for the "classical" Galerkin method, to be employed with $e_k := s_k$, $\langle \ | \ \rangle$ the inner product (7.3) on $F := H^1_0(0, \pi)$, and
\[
g_{kl} = (1 + k^2) \delta_{kl} , \quad g^{kl} = \frac{1}{1 + k^2}\, \delta^{kl} ; \tag{7.29}
\]
these matrices are used to raise and lower indices. The vector field $X$, the error function $\hat\epsilon$ and the growth estimator function $\hat\ell$ of Eqs. (6.30), (6.31) and (6.34) are time independent, and given by
\[
X^k(a) := -k^2 a^k + \langle s^k \,|\, s_{l_1} \cdots s_{l_p} \rangle\, a^{l_1} \cdots a^{l_p} , \tag{7.30}
\]
\[
\hat\epsilon(a)^2 := \big( \langle s_{j_1} \cdots s_{j_p} \,|\, s_{l_1} \cdots s_{l_p} \rangle - \langle s_{j_1} \cdots s_{j_p} \,|\, s_k \rangle \langle s^k \,|\, s_{l_1} \cdots s_{l_p} \rangle \big)\, a^{j_1} \cdots a^{j_p} a^{l_1} \cdots a^{l_p} , \tag{7.31}
\]
\[
\hat\ell(r, a) := \sum_{j=1}^{p} \binom{p}{j} \big( \sqrt{a_k a^k} \big)^{p-j} r^j \tag{7.32}
\]
(see the observation following Eq. (7.13); also, note that $a_k a^k = \sum_{k \in I} (1 + k^2) (a^k)^2$). This amount of information (completed by Eq. (6.29) for the initial datum) must be inserted into the general scheme of Proposition 6.1; the solution of the finite dimensional system (6.13) and (6.14) appearing therein provides simultaneously:
(i) a pair of functions $a(\cdot)$, $R(\cdot)$ on an interval $[0, t_G)$, the former giving the approximate solution $\varphi_{ap}(t) := a^k(t) s_k$. In the sequel $t_G$ will be called the Galerkin time;
(ii) an assurance that the Volterra problem (7.14) has a maximal solution $\varphi$ on $[0, \vartheta) \supset [0, t_G)$, and a bound $\|\varphi(t) - \varphi_{ap}(t)\| \leq R(t)$ for all $t < t_G$.

7.9. Introducing an example

We assume
\[
p := 2 , \quad f_0 \text{ as in (7.23)} . \tag{7.33}
\]
Problem (7.14)–(7.16) will be treated with a "two-modes" application of the Galerkin method; more precisely, we will work on the linear submanifold spanned by $(s_k)_{k \in I}$, setting
\[
I := \{1, 3\} , \quad \alpha := a^1 , \quad \gamma := a^3 \tag{7.34}
\]
(the choice $I = \{1, 2, 3\}$ would not yield any improvement, because the function $t \mapsto a^2(t)$ would be ultimately found to be zero). The vector field, the error function
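and the growth function of Eqs. (7.30), (7.31) and (7.32) are given by
\[
\begin{aligned}
X^\alpha(\alpha, \gamma) & = -\alpha + \sqrt{\frac{2}{\pi^3}} \Big( \frac{8}{3} \alpha^2 - \frac{16}{15} \alpha\gamma + \frac{72}{35} \gamma^2 \Big) , \\
X^\gamma(\alpha, \gamma) & := -9\gamma + \sqrt{\frac{2}{\pi^3}} \Big( -\frac{8}{15} \alpha^2 + \frac{144}{35} \alpha\gamma + \frac{8}{9} \gamma^2 \Big) ;
\end{aligned} \tag{7.35}
\]
\[
\begin{aligned}
\hat\epsilon(\alpha, \gamma)^2 = {} & \Big( \frac{7}{2\pi} - \frac{512}{15 \pi^3} \Big) \alpha^4 + \Big( \frac{34816}{315 \pi^3} - \frac{10}{\pi} \Big) \alpha^3 \gamma + \Big( \frac{46}{\pi} - \frac{12172288}{33075 \pi^3} \Big) \alpha^2 \gamma^2 \\
& - \frac{22528}{175 \pi^3}\, \alpha \gamma^3 + \Big( \frac{39}{2\pi} - \frac{3247616}{99225 \pi^3} \Big) \gamma^4 ;
\end{aligned} \tag{7.36}
\]
\[
\hat\ell(\alpha, \gamma, r) = r^2 + 2 \sqrt{2\alpha^2 + 10\gamma^2}\; r . \tag{7.37}
\]
According to (6.29), the initial conditions for $\alpha(t)$ and $\gamma(t)$ are, respectively,
\[
\langle s^1 | f_0 \rangle = A , \quad \langle s^3 | f_0 \rangle = 0 ; \tag{7.38}
\]
the corresponding datum error is zero. In conclusion, we have to study the system
\[
\begin{aligned}
& \dot\alpha = X^\alpha(\alpha, \gamma) , \quad \dot\gamma = X^\gamma(\alpha, \gamma) , \quad \dot R = \hat\epsilon(\alpha, \gamma) + \hat\ell(\alpha, \gamma, R) - R , \\
& \alpha(0) = A , \quad \gamma(0) = 0 , \quad R(0) = 0 ,
\end{aligned} \tag{7.39}
\]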
for the unknown functions $t \mapsto \alpha(t), \gamma(t), R(t)$. This cannot be solved analytically, but can be easily treated by any package for the numerical solution of ordinary differential equations; an integration algorithm with adaptive control of the step size gives an excellent approximation for the solution of (7.39), also including the evaluation of its existence time. All statements that follow about the system (7.39) are based on the MATHEMATICA package; thus, expressions such as "the solution of (7.39)" etc., always indicate the MATHEMATICA output (of which we report the first digits).

7.10. New estimates on the existence time ϑ

We have the bounds
\[
t_G \leq \vartheta \ \text{ for all } A \geq 0 , \quad \vartheta \leq t_K \ \text{ for } A > C_K \tag{7.40}
\]
involving the Galerkin and Kaplan times, defined previously (the latter depends on $C_K = 1.595..$). It is found that
\[
t_G = +\infty \ (\text{whence } \vartheta = +\infty) \quad \text{for } 0 \leq A \leq C_G , \quad C_G = 1.056.. ; \tag{7.41}
\]
\[
\alpha(t), \gamma(t), R(t) \to 0 \quad \text{for } t \to +\infty \text{ and } 0 \leq A \leq C_G . \tag{7.42}
\]
It should be noted that the bound $C_G$ on $A$ for the global existence of (7.16) is better than the previously derived bound $C_N = 0.7071..$ . For larger values of $A$, the existence time $t_G$ for (7.39) is finite. The Table (7.44) reports $t_G$ for some values of
$A$ above the Kaplan critical value $C_K$, as well as the corresponding values of $t_K$. It also reports
\[
\eta := \frac{t_K - t_G}{t_K + t_G} , \tag{7.43}
\]
which is the relative uncertainty about the actual existence time $\vartheta$ of (7.16) corresponding to the lower and upper bounds $t_G$, $t_K$.
\[
\begin{array}{c|c|c|c}
A & t_G & t_K & \eta \\ \hline
\leq 1.056.. & +\infty & & \\
1.60 & 1.104.. & 5.935.. & 0.6861.. \\
2 & 0.7730.. & 1.598.. & 0.3481.. \\
4 & 0.3138.. & 0.5090.. & 0.2372.. \\
10 & 0.1112.. & 0.1738.. & 0.2196.. \\
20 & 0.05340.. & 0.08315.. & 0.2177..
\end{array} \tag{7.44}
\]
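The system (7.39) can be explored with a few lines of code, as a rough cross-check of the Galerkin times in Table (7.44). The sketch below is not the MATHEMATICA computation used in this paper: it is a fixed-step classical Runge-Kutta integration written in Python, in which the step size `dt`, the threshold `r_cap` used to declare that $R$ has blown up, and all function names are assumptions of ours. The right-hand side implements (7.35), (7.36) and (7.37).

```python
import math

PI = math.pi
K = math.sqrt(2.0 / PI**3)  # common factor sqrt(2/pi^3) in (7.35)

# Coefficients of the quartic form (7.36) giving eps_hat(alpha, gamma)^2.
E40 = 7 / (2 * PI) - 512 / (15 * PI**3)
E31 = 34816 / (315 * PI**3) - 10 / PI
E22 = 46 / PI - 12172288 / (33075 * PI**3)
E13 = -22528 / (175 * PI**3)
E04 = 39 / (2 * PI) - 3247616 / (99225 * PI**3)


def rhs(state):
    """Right-hand side of (7.39) for state = (alpha, gamma, R)."""
    al, ga, R = state
    a2, g2 = al * al, ga * ga
    x_al = -al + K * (8 / 3 * a2 - 16 / 15 * al * ga + 72 / 35 * g2)
    x_ga = -9 * ga + K * (-8 / 15 * a2 + 144 / 35 * al * ga + 8 / 9 * g2)
    eps_sq = (E40 * a2 * a2 + E31 * a2 * al * ga + E22 * a2 * g2
              + E13 * al * ga * g2 + E04 * g2 * g2)
    eps = math.sqrt(max(eps_sq, 0.0))  # guard against tiny negative rounding
    ell = R * R + 2.0 * math.sqrt(2.0 * a2 + 10.0 * g2) * R
    return (x_al, x_ga, eps + ell - R)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return tuple(s + c * dt * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5))
    k3 = rhs(shifted(k2, 0.5))
    k4 = rhs(shifted(k3, 1.0))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def galerkin_time(A, t_max, dt, r_cap=1.0e6):
    """Estimate t_G: first time at which R exceeds r_cap, or inf up to t_max."""
    state, t = (A, 0.0, 0.0), 0.0
    while t < t_max:
        state = rk4_step(state, dt)
        t += dt
        if state[2] > r_cap:
            return t
    return math.inf
```

A call such as `galerkin_time(4.0, 1.0, 1.0e-4)` is expected to stop close to the value $t_G = 0.3138..$ of the table, while `galerkin_time(1.0, 10.0, 1.0e-3)` should report no blow up on $[0, 10]$; a fixed step is of course cruder near the singularity than the adaptive algorithm mentioned above, so only the first digits can be trusted.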
The $A \to +\infty$ limit for the previous estimates is easily discussed. To determine the asymptotics of $t_G$, we re-express the unknown functions $\alpha(t)$, $\gamma(t)$ and $R(t)$ in terms of three rescaled functions $\mathbf{t} \mapsto a(\mathbf{t}), c(\mathbf{t}), \mathcal{R}(\mathbf{t})$, depending on $\mathbf{t} := A t$ and defined by
\[
\alpha(t) = A\, a(At) , \quad \gamma(t) = A\, c(At) , \quad R(t) = A\, \mathcal{R}(At) . \tag{7.45}
\]
Then, the system (7.39) becomes (with ${}' := d/d\mathbf{t}$)
\[
\begin{aligned}
& a' = -\frac{a}{A} + \sqrt{\frac{2}{\pi^3}} \Big( \frac{8}{3} a^2 - \frac{16}{15} a c + \frac{72}{35} c^2 \Big) , \quad c' = -\frac{9c}{A} + \sqrt{\frac{2}{\pi^3}} \Big( -\frac{8}{15} a^2 + \frac{144}{35} a c + \frac{8}{9} c^2 \Big) , \\
& \mathcal{R}' = \hat\epsilon(a, c) + \hat\ell(a, c, \mathcal{R}) - \frac{\mathcal{R}}{A} , \\
& a(0) = 1 , \quad c(0) = 0 , \quad \mathcal{R}(0) = 0 .
\end{aligned} \tag{7.46}
\]
In the $A \to +\infty$ limit, the terms $a/A$, $c/A$ and $\mathcal{R}/A$ can be neglected and the resulting system $(7.46)_\infty$ is $A$-independent. The numerical treatment of this limit system shows that the solution $\mathbf{t} \mapsto a(\mathbf{t}), c(\mathbf{t}), \mathcal{R}(\mathbf{t})$ exists for $\mathbf{t} \in [0, \mathcal{C}_G)$, where
\[
\mathcal{C}_G = 1.026.. \tag{7.47}
\]
(a constant distinct from $C_G = 1.056..$ of Eq. (7.41)). Returning to the standard time $t = \mathbf{t}/A$, we conclude that
\[
t_G \sim \frac{\mathcal{C}_G}{A} \quad \text{for } A \to +\infty . \tag{7.48}
\]
It should be noted that all values of $A$ in the Table (7.44) are seen empirically to fulfil the inequality
\[
t_G \geq -\frac{\mathcal{C}_G}{C_G} \log \Big( 1 - \frac{C_G}{A} \Big) , \tag{7.49}
\]
with $t_G$ very close to the right-hand side. To compare these results with the Kaplan upper bound recall that, for all $A > C_K = 1.595..$,
\[
t_K = -\log \Big( 1 - \frac{C_K}{A} \Big) \sim_{A \to +\infty} \frac{C_K}{A} . \tag{7.50}
\]
Due to (7.48) and (7.50), the relative uncertainty (7.43) has the limit
\[
\eta \to \frac{C_K - \mathcal{C}_G}{C_K + \mathcal{C}_G} = 0.2173.. \quad \text{for } A \to +\infty . \tag{7.51}
\]
As a matter of fact, in this limit case one can find directly the asymptotics for the actual existence time $\vartheta$ of (7.14)–(7.16). In fact, if one writes the maximal solution $\varphi$ as
\[
\varphi(t) = A \chi(At) \tag{7.52}
\]
one obtains for $\chi$ the Cauchy problem
\[
\chi'(\mathbf{t}) = \frac{1}{A} \chi(\mathbf{t})_{xx} + \chi^2(\mathbf{t}) , \quad \chi(0)(x) = \sqrt{\frac{2}{\pi}}\, \sin x . \tag{7.53}
\]
For $A \to +\infty$, the differential equation becomes $\chi'(\mathbf{t}) = \chi^2(\mathbf{t})$, and the solution is
\[
\chi(\mathbf{t})(x) = \frac{\sqrt{2/\pi}\, \sin x}{1 - \sqrt{2/\pi}\; \mathbf{t} \sin x} \quad \text{for } \mathbf{t} \in [0, \sqrt{\pi/2}) ; \tag{7.54}
\]
so, returning to the standard time $t = \mathbf{t}/A$, we conclude
\[
\vartheta \sim_{A \to +\infty} \frac{\sqrt{\pi/2}}{A} = \frac{1.253..}{A} . \tag{7.55}
\]
The constant $\sqrt{\pi/2}$ is fairly close to the arithmetic mean of the constants $\mathcal{C}_G$ and $C_K$, which appear in the asymptotic expressions (7.48) and (7.50) for the lower and upper bounds $t_G$ and $t_K$. Thus, the actual existence time $\vartheta$ should be close to the arithmetic mean of $t_G$ and $t_K$ if $A$ is sufficiently large. An attack on the Cauchy problem (7.16) that we performed approximating $d^2/dx^2$ by finite differences seems to indicate that this actually occurs for all $A \gtrsim 2$.

7.11. Analysis of the Galerkin solution

After solving the system (7.39) for a given $A$, one constructs the corresponding approximate solution for (7.16) such that, for $t \in [0, t_G)$,
\[
\varphi_{ap}(t)(x) = \alpha(t) \sqrt{\frac{2}{\pi}}\, \sin x + \gamma(t) \sqrt{\frac{2}{\pi}}\, \sin(3x) ; \quad \|\varphi_{ap}(t)\| = \sqrt{2 \alpha(t)^2 + 10 \gamma(t)^2} . \tag{7.56}
\]
The system (7.39) also gives a function $R$ such that $\|\varphi_{ap}(t) - \varphi(t)\| \leq R(t)$ for $t$ in the same interval; let us illustrate the behavior of the above functions for two values of $A$.

Case A = 1. It is $t_G = +\infty$, which implies $\vartheta = +\infty$. Figures 1, 2 and 4 give the graphs of the functions $t \mapsto \alpha(t), \gamma(t), \|\varphi_{ap}(t)\|$ and $R(t)$, all converging to zero for
$t \to +\infty$. Figure 3 gives the function $x \in (0, \pi) \mapsto \varphi_{ap}(t)(x)$ at three fixed times. For the exact solution $\varphi$ of (7.16), we infer
\[
\|\varphi(t)\| \leq \|\varphi_{ap}(t)\| + R(t) \to 0 \quad \text{for } t \to +\infty . \tag{7.57}
\]
Figure 5 is a graph of the relative bound $R(t)/\|\varphi_{ap}(t)\|$ in a time interval where it is fairly small.

Case A = 4. The Galerkin system (7.39) has a finite existence time $t_G = 0.3138..$ . Figures 6–10 give information of the same kind as the figures of the case $A = 1$, but describe a qualitatively different behavior; in particular, the function $t \mapsto R(t)$ diverges for $t \to t_G$.

Fig. 1. A = 1. Graph of $\alpha(t)$.
Fig. 2. A = 1. Graph of $\gamma(t)$.
Fig. 3. A = 1. Graphs of $\varphi_{ap}(t)(x)$ for $x \in (0, \pi)$ and t = 1 (continuous line), t = 2 (short dashes), t = 3 (long dashes).
Fig. 4. A = 1. Graphs of $\|\varphi_{ap}(t)\|$ (continuous line) and $R(t)$ (dashed line).
Fig. 5. A = 1. Graph of $R(t)/\|\varphi_{ap}(t)\|$.
Fig. 6. A = 4. Graph of $\alpha(t)$.
Fig. 7. A = 4. Graph of $\gamma(t)$.
Fig. 8. A = 4. Graphs of $\varphi_{ap}(t)(x)$ for $x \in (0, \pi)$ and t = 0.1 (continuous line), t = 0.2 (short dashes), t = 0.3 (long dashes).
Fig. 9. A = 4. Graphs of $\|\varphi_{ap}(t)\|$ (continuous line) and $R(t)$ (dashed line).
Fig. 10. A = 4. Graph of $R(t)/\|\varphi_{ap}(t)\|$.

Acknowledgments

We are grateful to the anonymous referees for some suggestions that stimulated an improvement of the paper, and to D. Bambusi for useful comments. This work was partly supported by INdAM and by MIUR, COFIN 2001 Research Project "Geometry of Integrable Systems".

Appendix A. Sobolev Spaces and Pointwise Product

We consider the space $H^1(\mathbf{R}, \mathbf{C}) := \{ f : \mathbf{R} \to \mathbf{C} \mid f, f_x \in L^2(\mathbf{R}, \mathbf{C}) \}$ with the inner product $\langle f | g \rangle := \langle f | g \rangle_{L^2} + \langle f_x | g_x \rangle_{L^2}$ and the corresponding norm $\| \ \|$. This is useful to treat the space $H^1_0(0, \pi)$ of Sec. 7 (made of real functions): in fact, there is an $\mathbf{R}$-linear, norm preserving inclusion
H01 (0, π) H01 (0, π)
H01 (0, π) ⊂ H 1 (R, C) ,
(A.1)
where each f ∈ is extended to the full real axis setting f (x) := 0 for x 6∈ (0, π). Both and H 1 (R, C) are closed under the pointwise product, and kf gk ≤ const. kf k kgk for all functions therein [14]. We claim the following:
Proposition A.1. Consider the sharp (i.e. the minimum) constants $L$, $M$ in the inequalities
\[
\|fg\| \le L\,\|f\|\,\|g\| \qquad \text{for all } f, g \in H^1_0(0,\pi); \tag{A.2}
\]
\[
\|fg\| \le M\,\|f\|\,\|g\| \qquad \text{for all } f, g \in H^1(\mathbb{R},\mathbb{C}). \tag{A.3}
\]
Then
\[
0.811 < L \le M \le 1. \tag{A.4}
\]
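Proof. (i) $L \le M$ follows readily from (A.1).

(ii) A lower bound for $L$ follows applying (A.2) with $f = g = f_\lambda$, where
\[
f_\lambda(x) := e^{-\lambda|x-\pi/2|} - e^{-\lambda\pi/2} \qquad (\lambda > 0). \tag{A.5}
\]
The norms $\|f_\lambda\|$, $\|f_\lambda^2\|$ are computed in an elementary way, and we get a minorant of $L$ for each $\lambda$. The best lower bound is attained for $\lambda$ close to 1.55, and implies $L > 0.811$.

(iii) Let us prove that $M \le 1$. To this purpose, we employ for the complex functions $f$ on $\mathbb{R}$ the Fourier transform $(\mathcal{F}f)(k) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} dx\, e^{-ikx} f(x)$; this yields the representation
\[
H^1(\mathbb{R},\mathbb{C}) = \Big\{ f \in L^2(\mathbb{R},\mathbb{C}) \;\Big|\; +\infty > \int_{\mathbb{R}} dk\,(1+k^2)\,|\mathcal{F}f(k)|^2 = \|f\|^2 \Big\}, \tag{A.6}
\]
and sends pointwise product into $(1/\sqrt{2\pi})\times$ the convolution product $*$. Consider any two functions $f, g \in H^1(\mathbb{R},\mathbb{C})$. Then the following holds:
\[
\|fg\|^2 = \int_{\mathbb{R}} dk\,(1+k^2)\,|\mathcal{F}(fg)(k)|^2 = \frac{1}{2\pi}\int_{\mathbb{R}} dk\,(1+k^2)\,|(\mathcal{F}f * \mathcal{F}g)(k)|^2; \tag{A.7}
\]
\[
(\mathcal{F}f * \mathcal{F}g)(k) = \int_{\mathbb{R}} dh\, \mathcal{F}f(k-h)\,\mathcal{F}g(h) = \int_{\mathbb{R}} dh\, \frac{1}{\sqrt{1+(k-h)^2}\,\sqrt{1+h^2}} \times \sqrt{1+(k-h)^2}\,\mathcal{F}f(k-h)\,\sqrt{1+h^2}\,\mathcal{F}g(h); \tag{A.8}
\]
\[
\frac{1}{2\pi}\,|(\mathcal{F}f * \mathcal{F}g)(k)|^2 \le C(k)\,P(k), \tag{A.9}
\]
\[
C(k) := \frac{1}{2\pi}\int_{\mathbb{R}} \frac{dh}{(1+(k-h)^2)(1+h^2)} = \frac{1}{4+k^2}, \tag{A.10}
\]
\[
P(k) := \int_{\mathbb{R}} dh\,(1+(k-h)^2)\,|\mathcal{F}f(k-h)|^2\,(1+h^2)\,|\mathcal{F}g(h)|^2. \tag{A.11}
\]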
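The closed form for $C(k)$ in (A.10) can be checked numerically. The following is only an illustrative sketch: the function names `C_numeric` and `C_closed`, the substitution $h = \tan t$ and the number of quadrature nodes are our own choices and are not part of the proof.

```python
import math

def C_numeric(k, n=4000):
    """Approximate C(k) = (1/2pi) * int_R dh / ((1+(k-h)^2)(1+h^2)).

    With h = tan(t), dh / (1 + h^2) = dt on (-pi/2, pi/2), so the midpoint
    rule is applied to the bounded integrand 1 / (1 + (k - tan t)^2).
    """
    step = math.pi / n
    total = 0.0
    for j in range(n):
        t = -math.pi / 2 + (j + 0.5) * step  # midpoints never hit +-pi/2
        total += 1.0 / (1.0 + (k - math.tan(t)) ** 2)
    return total * step / (2.0 * math.pi)

def C_closed(k):
    """Closed form stated in (A.10)."""
    return 1.0 / (4.0 + k * k)

if __name__ == "__main__":
    for k in (0.0, 0.5, 1.0, 3.0, 10.0):
        # last column: (1 + k^2) C(k), the quantity whose supremum enters (A.12)
        print(k, C_numeric(k), C_closed(k), (1.0 + k * k) * C_closed(k))
```

The last printed column stays below 1 and approaches 1 as $k$ grows, which is the supremum used in the final estimate of the proof.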
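Equation (A.9) follows from Hölder's inequality
\[
\Big| \int dh\, U V \Big|^2 \le \Big( \int dh\, |U|^2 \Big) \Big( \int dh\, |V|^2 \Big);
\]
inserting (A.9) into Eq. (A.7), we get
\[
\|fg\|^2 \le \int_{\mathbb{R}} dk\,(1+k^2)\,C(k)\,P(k) \le \Big( \sup_{k\in\mathbb{R}} (1+k^2)\,C(k) \Big) \int_{\mathbb{R}} dk\, P(k) = 1 \times \|f\|^2\,\|g\|^2. \tag{A.12}
\]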
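Step (ii) of the proof can be reproduced with a few lines of code. The sketch below evaluates the ratio $\|f_\lambda^2\| / \|f_\lambda\|^2$ from closed-form integrals of exponentials; the function name `sobolev_ratio`, the helper `E` and the scanning grid are hypothetical choices made for illustration, and the computation is not taken from the paper.

```python
import math

def sobolev_ratio(lam):
    """Return ||f^2|| / ||f||^2 (H^1 norms on (0, pi)) for the test function
    f(x) = exp(-lam*|x - pi/2|) - exp(-lam*pi/2) of (A.5)."""
    a = math.pi / 2
    c = math.exp(-lam * a)

    def E(n):
        # integral over (0, a) of exp(-n*lam*s) ds, with s = |x - pi/2|
        return (1.0 - c ** n) / (n * lam)

    # ||f||^2 = int f^2 + int (f')^2; the factor 2 uses symmetry about pi/2
    f_l2 = 2.0 * (E(2) - 2.0 * c * E(1) + c * c * a)
    f_dx = 2.0 * lam ** 2 * E(2)
    # g = f^2 = e^{-2 lam s} - 2c e^{-lam s} + c^2 and g' = -2 lam (e^{-2 lam s} - c e^{-lam s})
    g_l2 = 2.0 * (E(4) - 4.0 * c * E(3) + 6.0 * c ** 2 * E(2)
                  - 4.0 * c ** 3 * E(1) + c ** 4 * a)
    g_dx = 8.0 * lam ** 2 * (E(4) - 2.0 * c * E(3) + c * c * E(2))
    return math.sqrt(g_l2 + g_dx) / (f_l2 + f_dx)

if __name__ == "__main__":
    grid = [0.1 * j for j in range(1, 51)]  # lambda = 0.1, 0.2, ..., 5.0
    best = max(grid, key=sobolev_ratio)
    print("ratio at lambda = 1.55:", sobolev_ratio(1.55))
    print("best grid value of lambda:", best, "ratio:", sobolev_ratio(best))
```

Each value of `sobolev_ratio(lam)` is a lower bound for $L$, and by (A.4) none of them can exceed 1.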
In [11] we have discussed the constants for more general inequalities related to the pointwise product and to the spaces $H^n(\mathbb{R}^d,\mathbb{C})$. The upper bound $M \le 1$ derived now improves the result arising from [11] in the special case of the inequality (A.3); the method employed here to bound $M$ develops in a fully quantitative way an idea suggested in [13].

Appendix B. Proof of Proposition 7.1

We keep all notations of Sec. 7. The proof consists of the following steps:

(i) The function $Q$ of Eq. (7.17) can be seen as a continuous linear form, both on $L^2(0,\pi)$ and on $F$. For all $f \in F$ and $t \in [0,+\infty)$, we easily infer from (7.10) that
\[
Q(U(t)f) = e^{-t}\,Q(f). \tag{B.1}
\]

(ii) Let $f \in L^{2p}(0,\pi)$ $(\subset L^2(0,\pi))$ and $f \ge 0$; then
\[
Q(f) \le Q(f^p)^{1/p}. \tag{B.2}
\]
This follows taking $q$ such that $1/p + 1/q = 1$, and writing $Q(f) = \int_0^\pi dx\, u(x)v(x)$ with $u(x) := (\tfrac12 \sin x)^{1/q}$, $v(x) := (\tfrac12 \sin x)^{1/p} f(x)$; Hölder's inequality $\int uv \le (\int u^q)^{1/q} (\int v^p)^{1/p}$ yields Eq. (B.2).$^{d}$

$^{d}$Equation (B.2) is optimal, in this sense: the best constant in the inequality $Q(f) \le C\,Q(f^p)^{1/p}$ for all nonnegative $L^{2p}$ functions is $C = 1$. This is true even if we restrict the inequality to much smaller classes, such as the nonnegative $C^\infty$, compactly supported functions $f$ on $(0,\pi)$.

(iii) We consider the maximal solution $\varphi : [0,\vartheta) \to F$ of the Volterra problem (7.14) with datum $f_0 \in F$, assuming $f_0 \ge 0$ and $Q(f_0) > 1$; the nonnegativity of $f_0$ implies $\varphi(t) \ge 0$ for all $t$ (see, e.g., [2]). We define the (continuous) function
\[
t \in [0,\vartheta) \mapsto Q(t) := Q(\varphi(t)), \tag{B.3}
\]
and note that
\[
Q(t) \ge 0, \qquad Q(t) \ge e^{-t}\,Q(f_0) + \int_0^t ds\, e^{-(t-s)}\,Q(s)^p; \tag{B.4}
\]
the first bound follows from $\varphi(t) \ge 0$, and the second one from (7.14), (B.1) and (B.2).

(iv) For each $n \in \mathbb{N}$, we define a continuous function $S_n : [0,\vartheta) \to \mathbb{R}$ by
\[
S_0(t) := e^{-t}\,Q(f_0), \qquad S_{n+1}(t) := e^{-t}\,Q(f_0) + \int_0^t ds\, e^{-(t-s)}\,S_n(s)^p; \tag{B.5}
\]
it is proved recursively that, for all $n \in \mathbb{N}$ and $t \in [0,\vartheta)$,
\[
Q(t) \ge S_n(t) \ge 0 \tag{B.6}
\]
(the first inequality depends on (B.4), the second one is elementary). It is easily checked that the sequence of functions $(S_n)$ is a Cauchy sequence in the topology of the uniform convergence on all compact subintervals $[0,\tau] \subset [0,\vartheta)$; its $n \to +\infty$ limit is a continuous function $S : [0,\vartheta) \to \mathbb{R}$ such that, for all $t$ in this interval,
\[
S(t) = e^{-t}\,Q(f_0) + \int_0^t ds\, e^{-(t-s)}\,S(s)^p, \qquad Q(t) \ge S(t) \ge 0. \tag{B.7}
\]
From the above integral equation, we see that $S$ is in fact $C^1$, and fulfils the Cauchy problem
\[
\dot S(t) = S(t)\big( S(t)^{p-1} - 1 \big), \qquad S(0) = Q(f_0); \tag{B.8}
\]
this has a unique maximal solution (for nonnegative times), denoted again with $t \mapsto S(t)$, which extends the function considered up to now on $[0,\vartheta)$, and is given by
\[
\int_{Q(f_0)}^{S(t)} \frac{dr}{r(r^{p-1}-1)} = t, \quad \text{for } t \in [0,t_K), \qquad t_K := \int_{Q(f_0)}^{+\infty} \frac{dr}{r(r^{p-1}-1)}. \tag{B.9}
\]
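The integral defining $t_K$ in (B.9) can be evaluated explicitly, since $\frac{d}{dr}\ln(1 - r^{1-p}) = \frac{p-1}{r(r^{p-1}-1)}$. The sketch below compares this antiderivative with a direct quadrature; the function names and the sample values $p = 3$, $Q(f_0) = 2$ are assumptions made for illustration only, and the closed form is our own computation, which should be compared with (7.19) rather than read as a quotation of it.

```python
import math

def blowup_time_closed(q0, p):
    """t_K = int_{q0}^{inf} dr / (r (r^(p-1) - 1)) from the antiderivative
    ln(1 - r^(1-p)) / (p - 1); requires q0 > 1 and p > 1."""
    return -math.log(1.0 - q0 ** (1.0 - p)) / (p - 1.0)

def blowup_time_quadrature(q0, p, n=20000):
    """Same integral after the substitution u = 1/r:
    int_0^{1/q0} u^(p-2) / (1 - u^(p-1)) du, midpoint rule.
    Intended for p >= 2, where the integrand is bounded at u = 0."""
    step = (1.0 / q0) / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * step
        total += u ** (p - 2.0) / (1.0 - u ** (p - 1.0))
    return total * step

if __name__ == "__main__":
    q0, p = 2.0, 3.0  # illustrative values, not taken from the paper
    print(blowup_time_closed(q0, p), blowup_time_quadrature(q0, p))
```

Both functions return the same number up to quadrature error; the blow-up time decreases as $Q(f_0)$ grows and diverges as $Q(f_0) \to 1^+$.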
Furthermore, $S(t) \to +\infty$ for $t \to t_K^-$. Computing the last integral, we see that $t_K$ has the expression (7.19) in the statement of the theorem; we know that $\vartheta \le t_K$, so the proof of Proposition 7.1 is concluded.

References

[1] C. Caratheodory, Calculus of Variations and Partial Differential Equations of the First Order (Holden-Day, San Francisco, 1965).
[2] T. Cazenave and A. Haraux, An Introduction to Semilinear Evolution Equations (Oxford Univ. Press, New York, 1998).
[3] N. Dunford and J. T. Schwartz, Linear Operators. I (Interscience, New York, 1958).
[4] D. R. Durran, Numerical Methods for Wave Equations in Geophysical Fluid Dynamics (Springer, New York, 1999).
[5] S. Kaplan, On the growth of solutions of quasi-linear parabolic equations, Commun. Pure Appl. Mathematics 16 (1963) 305–330.
[6] Y. B. Kivshar and B. A. Malomed, Dynamics of solitons in nearly integrable systems, Rev. Mod. Phys. 61 (1989) 763–915.
[7] V. Lakshmikantham, S. Leela and A. A. Martynyuk, Stability Analysis of Nonlinear Systems (Marcel Dekker, New York, 1989).
[8] P. Lochak and C. Meunier, Multiphase Averaging for Classical Systems. With Applications to Adiabatic Theorems, Appl. Math. Sciences 72 (Springer-Verlag, New York, 1988).
[9] E. N. Lorenz, Deterministic nonperiodic flow, J. Atmospheric Sci. 20 (1963) 130–141.
[10] D. S. Mitrinovic, J. E. Pecaric and A. M. Fink, Inequalities involving functions and their integrals and derivatives (Kluwer, Dordrecht, 1991).
[11] C. Morosi and L. Pizzocchero, On the constants in some inequalities for the Sobolev norms and pointwise product, J. Inequal. Appl. 7 (2002) 421–452.
[12] C. Olech, On the existence and uniqueness of solutions of an ordinary differential equation in the case of Banach space, Bull. Acad. Polon. Sci. Sér. Sci. Math. Astron. Phys. 8 (1960) 668–673.
[13] J. Pöschel, Quasi-periodic solutions for a nonlinear wave equation, Comment. Math. Helv. 71 (1996) 269–296.
[14] T. Runst and W. Sickel, Sobolev spaces of fractional order, Nemytskij operators and nonlinear partial differential equations (de Gruyter, Berlin, 1996).
[15] R. Temam, Infinite-Dimensional Dynamical Systems in Mechanics and Physics (Springer, New York, 1988).
[16] W. Walter, Differential and Integral Inequalities (Springer, New York, 1970).
[17] T. Wazewski, Sur l'existence et l'unicité des intégrales des équations différentielles ordinaires au cas de l'espace de Banach, Bull. Acad. Polon. Sci. Sér. Sci. Math. Astron. Phys. 8 (1960) 301–305.
[18] E. Zeidler, Non Linear Functional Analysis and its Applications I. Fixed-Point Theorems (Springer, New York, 1986).
[19] E. Zeidler, Non Linear Functional Analysis and its Applications II/B. Nonlinear Monotone Operators (Springer, New York, 1990).
May 31, 2004 11:53 WSPC/148-RMP
00205
Reviews in Mathematical Physics Vol. 16, No. 4 (2004) 421–450 © World Scientific Publishing Company
ON THE HESSIAN OF THE ENERGY FORM IN THE GINZBURG LANDAU MODEL OF SUPERCONDUCTIVITY
MYRIAM COMTE∗ and MYRTO SAUVAGEOT†
Laboratoire Jacques-Louis Lions, Boîte 187, Université Pierre et Marie Curie
4 Place Jussieu, F-75252 Paris Cedex 05, France
∗[email protected]
†[email protected]

Received 25 April 2003
Revised 28 January 2004

The purpose of this work is to study the stability of radial solutions of degree $d$ for the Ginzburg–Landau model of superconductivity with an applied magnetic field in a disk of radius $\bar r$. We consider the branch of solutions introduced in [24] as a branch with the radius of the ball as parameter. We prove that for small radii the branch is stable while it is unstable for large radii, see [6]. We then study in detail the Hessian of the energy at the symmetric vortex at the stability transition. Finally, under a couple of extra assumptions, we construct a branch of solutions bifurcating from the radial one at this point, and describe it.

Keywords: Ginzburg–Landau model; stability of symmetric vortices; bifurcation branches.
0. Introduction

In the Ginzburg–Landau model for superconductivity in a ball in $\mathbb{R}^2$ with applied magnetic field orthogonal to it, equilibrium states are critical points for the free energy form
\[
G(\psi,A) = \frac12 \int_{B_{\bar r}} \Big( |i\nabla\psi + A\psi|^2 + \frac{\lambda}{4}\big(1-|\psi|^2\big)^2 + |\operatorname{curl} A - h e_3|^2 \Big) \tag{1}
\]
defined for $(\psi,A) \in H^1(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)$, where $\psi$ is an order parameter related with the current and with the density of pairs of superconducting electrons, $A$ is a vector potential and $B_{\bar r}$ is the ball with radius $\bar r$ in $\mathbb{R}^2$. The parameters $\lambda$, $\bar r$ and $h$ represent respectively a coupling constant from the material, the radius of the ball and the intensity of the applied magnetic field on the boundary of the ball.

In this paper we study the existence of critical points for this energy form, and their stability. A critical point, i.e. a solution of the associated Euler–Lagrange equation $DG(\psi,A) = 0$, is said to be stable if the quadratic form $D^2G(\psi,A)$ is positive, and unstable if $D^2G(\psi,A)$ can take strictly negative values.
Among solutions of the Ginzburg–Landau equation $DG(\psi,A) = 0$, one distinguishes the normal solutions, i.e. normally-conducting solutions, which are of the form
\[
\psi \equiv 0, \qquad A(x) = \frac{h}{2}\begin{pmatrix} -y \\ x \end{pmatrix}.
\]
One distinguishes also families of superconducting solutions called symmetric vortices: for a given integer $d \ge 2$, symmetric vortices of degree $d$ are the solutions of the Euler–Lagrange equation $DG(\psi,A) = 0$ which are of the form
\[
\psi(r,\theta) = f(r)\,e^{id\theta}, \qquad A(r,\theta) = a(r)\,v
\]
with $v = \begin{pmatrix} -\sin(\theta) \\ \cos(\theta) \end{pmatrix}$.

The study of their existence and of their properties has been initiated by Plohr [20, 21], and afterwards by Berger and Chen [6] for infinite radius, and then completed by many authors (cf. for instance [1], [2] or [22]) for finite radius. Here we have to mention some results obtained about the uniqueness of symmetric vortices. Uniqueness within radial class was proved in [4] in the case $\lambda \ge 4d^2$ and by Clemons [10] for $\lambda$ near 1.

In this paper we consider $B_{\bar r}$ a ball of radius $\bar r$ and a branch of symmetric vortices with Dirichlet boundary conditions
\[
\psi(\bar r,\theta) = e^{id\theta}, \qquad \bar r\,|A(\bar r,\theta)| = d \ge 2, \qquad \forall\,\theta \in [0,2\pi] \tag{2}
\]
constructed in [24]. Let us notice that the boundary condition for $A$ in formula (2) can be interpreted in terms of quantization of the flux of the magnetic field $H = \operatorname{curl} A$ through $B_{\bar r}$:
\[
\frac{1}{2\pi}\int_{B_{\bar r}} H \cdot e_3 = \frac{1}{2\pi}\int_{B_{\bar r}} \operatorname{curl} A \cdot e_3 = d.
\]
Let us recall the main result of [24]:

Theorem 1. (a) For fixed degree $d \ge 1$, and any given $(\alpha,c)$ in $\mathbb{R}^2$, there exists a radial solution of degree $d$ of the Ginzburg–Landau equation $DG(\psi,A) = 0$ in $H^1(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)$ denoted
\[
\psi_{\alpha,c}(r,\theta) = f_{\alpha,c}(r)\,e^{id\theta}, \qquad A_{\alpha,c}(r,\theta) = a_{\alpha,c}(r)\begin{pmatrix} -\sin(\theta) \\ \cos(\theta) \end{pmatrix}. \tag{3}
\]
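(b) There exists a constant $L > 0$ such that

(i) For any $\alpha$ in $]L,+\infty[$, there exist a unique $c > 0$, denoted by $c(\alpha)$, and a unique $R > 0$, denoted by $\hat R(\alpha)$, such that $f_{\alpha,c}(r) \ge 0$ and the boundary conditions
\[
\begin{cases}
f_{\alpha,c}(\hat R(\alpha)) = 1 \\
d - \hat R(\alpha)\,a_{\alpha,c}(\hat R(\alpha)) = 0
\end{cases}
\]
are satisfied.

(ii) The maps $\alpha \to c(\alpha)$ and $\alpha \to \hat R(\alpha)$ are of class $C^1$. One has $\lim_{\alpha\to L} \hat R(\alpha) = +\infty$, $\lim_{\alpha\to+\infty} \hat R(\alpha) = 0$ and $\hat R'(\alpha) < 0$ for $\alpha$ large enough.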
Thus we can consider this branch as a function $\alpha \to (\hat\psi(\alpha), \hat A(\alpha)) = (\psi_{\alpha,c(\alpha)}, A_{\alpha,c(\alpha)})$ with $\alpha$ running in an interval $]L,+\infty[$.

We are interested in the linear stability of the branch with respect to $\alpha$, that is, in fact, with respect to the size of the ball. The case of Berger and Chen vortices, which corresponds to infinite radius, has been solved for large $\lambda$ by [14] and [5], then in full generality by [16], who prove that for $\lambda > 1$ such vortices are unstable except in degree $d = 1$. On the other hand, we show (cf. Lemma 3.2 below) that vortices with small radius are stable. Hence, on any one-parameter branch of symmetric vortices starting with small radius and ending with arbitrarily large radius, some loss of stability will occur. Consequently, there will exist a critical value $\alpha_d$ of $\alpha$ for which $(\hat\psi(\alpha_d), \hat A(\alpha_d))$ is degenerately stable, while $(\hat\psi(\alpha), \hat A(\alpha))$ is stable for all $\alpha > \alpha_d$.

Our first purpose is to give a precise meaning to this loss of stability, and to provide a spectral analysis of the Hessian of $G$ at states where this loss of stability occurs, which are points of degenerate stability. In order to do that we first remark that at any critical point $(\psi,A)$, the Hessian
\[
\mathrm{Hess}_G(\psi,A) : H^1_0(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2) \to H^{-1}(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2),
\]
which is the linearized operator of the quadratic form $D^2G(\psi,A)$, has a nontrivial kernel, containing at least all elements of the form $(i\gamma\psi, \nabla\gamma)$ for $\gamma \in H^2_0(B_{\bar r},\mathbb{R})$; we shall denote by $K_0$ the space of such elements. In the stable case, strict stability means that the kernel of the Hessian is reduced to the closure of $K_0$, while degenerate stability means that this kernel contains elements which do not belong to $K_0$. Using gauge invariance we only need to consider perturbations $(\varphi,B) \ne (0,0) \in K_0^\perp$. We introduce then a new quadratic form $\bar Q$ also considered in [16], and defined by
\[
\bar Q(\varphi,B) = Q(\varphi,B) + \int_{B_{\bar r}} (\operatorname{div} B - i\psi\cdot\varphi)^2
\]
which is equal to $Q$ in $K_0^\perp$ (see Sec. 2 for more details). Next we use the fact that the generic element $\varphi$ of $H^1_0(B_{\bar r},\mathbb{C})$ can be decomposed as
\[
\varphi(r,\theta) = \sum_{n\in\mathbb{Z}} \varphi_n(r)\,e^{in\theta},
\]
and that the generic element $B$ of $H^1_0(B_{\bar r},\mathbb{R}^2)$ can be written as $B = B^u u + B^v v$ with $B^u(r,\theta) = \sum_{n\in\mathbb{Z}} B^u_n(r)\,e^{in\theta}$ and $B^v(r,\theta) = \sum_{n\in\mathbb{Z}} B^v_n(r)\,e^{in\theta}$,
\[
B^u_{-n}(r) = \overline{B^u_n(r)}, \qquad B^v_{-n}(r) = \overline{B^v_n(r)}.
\]
We introduce $W^{(1)}$, the space of pairs $(\varphi,B)$ in $H^1_0(B_{\bar r},\mathbb{C}) \times H^1_0(B_{\bar r},\mathbb{R}^2)$ such that, for any $n$, $\varphi_n$ and $B^v_n$ are real valued, while $B^u_n$ takes values in $i\mathbb{R}$.
Finally for $n \ge 0$, $W_n$ will be the subspace in $H^1_0(B_{\bar r},\mathbb{C}) \times H^1_0(B_{\bar r},\mathbb{R}^2)$ of pairs $(\varphi,B)$ such that $\varphi$ is of the form
\[
\varphi(r,\theta) = \varphi_{d+n}(r)\,e^{i(d+n)\theta} + \varphi_{d-n}(r)\,e^{i(d-n)\theta}
\]
and $B$ satisfies
\[
B^u(r,\theta) = B^u_n(r)\,e^{in\theta} + \overline{B^u_n(r)}\,e^{-in\theta}, \qquad
B^v(r,\theta) = B^v_n(r)\,e^{in\theta} + \overline{B^v_n(r)}\,e^{-in\theta}.
\]
With these notations we have the following result:

Theorem 2. Let $(\psi = f(r)e^{id\theta}, A = a(r)v)$ be a symmetric vortex of degree $d$. Then one has
\[
\bar Q = \sum_{n\ge 0} \bar Q_n,
\]
where $\bar Q_n = \bar Q \cap W_n$. As a consequence,
\[
\operatorname{Ker}(\bar Q) = \oplus_{n\ge 0} \operatorname{Ker}(\bar Q_n).
\]

Our second purpose is to study the possibility of a bifurcation. Degenerately stable critical states are the states at which bifurcation phenomena can occur. In [9], it is shown how to get bifurcating branches of symmetric vortices issued from reference branches of normal solutions. But in turn, some bifurcations are expected to exist, emanating from those branches of symmetric vortices. Existence of such a bifurcation has been proved in [11] (see also [23]) for the Ginzburg–Landau model without the magnetic field investigated in [7], in which the only variable is the wave function $\psi$, and which, up to rescaling, is governed by the energy form on the unit disk $B_1$
\[
E_\varepsilon(\psi) = \frac12 \int_{B_1} |\nabla\psi|^2 + \frac{1}{4\varepsilon^2}\big(1-|\psi|^2\big)^2
\]
with Dirichlet boundary condition $\psi(e^{i\theta}) = e^{id\theta}$. For this restricted model and a given degree $d \ge 2$, there exists a unique smooth branch of symmetric vortices, constructed in [18], and a unique value $\varepsilon_d$ of the parameter $\varepsilon$ which provides a degenerately stable solution for the equation $DE_\varepsilon(\psi) = 0$ (cf. [19]). From the vortex at $\varepsilon_d$ a bifurcation branch emanates (cf. [11, 23]).

Unfortunately here we are not able to prove existence of bifurcation in most cases. We have to make two technical assumptions in order to conclude. To be more precise, we consider the diagonalization of the quadratic form $\bar Q$ given in Theorem 2, that is $\bar Q = \oplus_{m\ge 0} \bar Q_m$, at a degenerately stable vortex $(\hat\psi(\alpha_d), \hat A(\alpha_d))$. We introduce then
\[
p = \max\{ m \in [2 \cdots d] \;/\; \operatorname{Ker}(\bar Q_m) \ne \{0\} \} \tag{4}
\]
and we fix $(\varphi_p, B_p) \ne (0,0)$ in $\operatorname{Ker}(\bar Q_p^{(1)})$, where $\bar Q_p^{(1)} = \bar Q_p \cap W^{(1)}$. We assume the following:

Assumption 1. $\hat R'(\alpha_d) \ne 0$.

Assumption 2. $\dfrac{d\bar Q_\alpha(\varphi_p,B_p)}{d\alpha} \ne 0$ at $\alpha = \alpha_d$.
Remark 1. These two assumptions seem difficult to verify. Nevertheless, when $\lambda \ge 4d^2$ we can adapt the non-degeneracy result of [4] for the Plohr or Berger–Chen vortices and conclude that Assumption 1 is always satisfied.

We can then state the following result:

Theorem 3. Assume that Assumptions 1 and 2 are satisfied. Then, there exist $\delta_0 > 0$, a smooth map from $[-\delta_0,\delta_0]$ in $\mathbb{R}^*_+$, $\delta \to \alpha(\delta)$, and a map $\delta \to (\psi^{bif}_\delta, A^{bif}_\delta)$ which to $\delta \in [-\delta_0,\delta_0]$ associates $(\psi^{bif}_\delta, A^{bif}_\delta)$ in $H^1(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)$ with $\bar r = \hat R(\alpha(\delta))$, so that the following properties are satisfied:
(i) $\alpha(\delta) = \alpha_d + O(\delta^2)$;
(ii) $(\psi^{bif}_\delta, A^{bif}_\delta) = (\hat\psi(\alpha(\delta)), \hat A(\alpha(\delta))) + \delta(\varphi_p, B_p) + O(\delta^2)$;
(iii) for each $\delta$, $(\psi^{bif}_\delta, A^{bif}_\delta)$ is a solution of the Ginzburg–Landau equation with parameter $\bar r = \hat R(\alpha(\delta))$ and boundary conditions (2).
Moreover, if $d = 2$, then for $\delta \ne 0$ the function $\psi^{bif}_\delta$ possesses exactly two zeroes, symmetric with respect to 0 and at distance $r_\delta \sim \delta^{1/2}$.

If $d = 2$, or more generally if the generic element in the kernel of $\bar P$ does not vanish at 0, one also gets a description of the zeroes of the wave function $\psi$ on the new branch: $\psi_\delta$ has exactly $d$ zeroes, located at the vertices of a regular polygon of radius $\delta^{1/d}$. The physical meaning of this theorem is that when the radius of the disk increases the degree-$d$ vortex will split into $d$ degree-1 vortices. Note that our radial branch does not satisfy the Neumann boundary condition $\nabla_u \psi|_{\partial B_{\bar r}} = 0$ usually assumed by physicists ($u$ is the unit outward normal vector). This passage from one zero to $d$ zeroes for $\psi$, i.e. from one normally conducting tube to many, is a phenomenon which is expected to be true in a physically relevant model. Let us notice that the study of the stability of solutions with respect to the size of the domain has been considered for the one-dimensional model for thin films (cf. [8]).

This paper is organized as follows. The first three sections deal with degenerate stability. Section 1 is a presentation of previous results on symmetric vortices and of the radial branch $\alpha \to (\hat\psi(\alpha), \hat A(\alpha))$. In Sec. 2, the quadratic form $\bar Q$ is substituted for $D^2G$ in order to take into account the gauge invariance property of $G$, and the first properties of $\bar Q$ are presented. Section 3 shows the stability of vortices for large $\alpha$, and
their instability for $\alpha$ close to $L$, which leads to the existence of a critical $\alpha = \alpha_d$ at which degenerate stability occurs.

The next four sections are a spectral analysis for the Hessian and the associated elliptic operator $\bar P$. Section 4 presents a diagonalization of the quadratic form $\bar Q$ as $\oplus_{m\ge 0} \bar Q_m$. The next sections deal with the kernel of the forms $\bar Q_m$ at the point of stability mentioned above, and more generally at degenerately stable symmetric vortices at which the function $f$ is positive increasing, and the function $b$: $b(r) = d - ra(r)$ is positive decreasing; each $\bar Q_m$ has a kernel which is at most 2-dimensional, and $\bar Q_m$ is non-degenerate for $m > d$ (Sec. 5) or for $m = 1$ (Sec. 6). In Sec. 7, it is shown that degeneracy for the form $\bar Q_0$ is equivalent to the technical condition $\hat R'(\alpha_d) = 0$, in which case the kernel of $\bar Q_0$ has dimension 1 and is tangent to the radial branch.

The last two sections are a tentative construction of a bifurcation branch. We suppose that $\bar Q_0$ is non-degenerate (which should be true at least for large values of $\lambda$), and we show in Sec. 8 how to apply the bifurcation theorem of Crandall and Rabinowitz ([12]) to obtain a bifurcation branch emanating from the radial branch at $\alpha = \alpha_d$. In the case when $\bar Q_d$ is degenerate (an assumption automatically satisfied for $d = 2$), it is shown in Sec. 9 that the wave function $\psi$ on the bifurcation branch has exactly $d$ zeroes, located at the vertices of a regular polygon centered at 0.

1. Symmetric Vortices

Notation. $u = \begin{pmatrix} \cos(\theta) \\ \sin(\theta) \end{pmatrix}$, $v = \begin{pmatrix} -\sin(\theta) \\ \cos(\theta) \end{pmatrix}$ is the usual Frénet frame.
Definition 1.1. A symmetric vortex of degree $d$ ($d \in \mathbb{N}^*$) is a solution $(\psi,A)$, in $H^1(B_{\bar r},\mathbb{C}) \times H^1(B_{\bar r},\mathbb{R}^2)$, of the Ginzburg–Landau equation $DG(\psi,A) = 0$, which is of the form
\[
\psi(r,\theta) = f(r)\,e^{id\theta}, \qquad A(r,\theta) = a(r)\,v.
\]
$f$ and $a$ are $C^2$ real-valued functions which satisfy on the interval $[0,\bar r]$ the system of equations
\[
[E] \quad
\begin{cases}
f''(r) + \dfrac1r f'(r) - \dfrac{d^2}{r^2} f(r) = -2d\,\dfrac{a(r)}{r}\,f(r) + a(r)^2 f(r) - \dfrac{\lambda}{2} f(r)\big(1 - f(r)^2\big) \\[2mm]
a''(r) + \dfrac1r a'(r) - \dfrac{1}{r^2} a(r) = f(r)^2 a(r) - \dfrac{d}{r} f(r)^2
\end{cases}
\]
with $f(r) = O(r^d)$ and $a(r) = O(r)$ near 0; or, equivalently, setting $b(r) = d - ra(r)$:
\[
[E'] \quad
\begin{cases}
f''(r) + \dfrac1r f'(r) = f(r)\Big( \dfrac{b(r)^2}{r^2} - \dfrac{\lambda}{2}\big(1 - f(r)^2\big) \Big) \\[2mm]
b''(r) - \dfrac1r b'(r) = b(r)\,f(r)^2.
\end{cases}
\]
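The equivalence between $[E]$ and $[E']$ is a direct substitution. The short derivation below is our own working, included only to make the step explicit; it uses nothing beyond $b(r) = d - ra(r)$.

```latex
\[
\begin{aligned}
b(r) &= d - r a(r), \qquad b'(r) = -a(r) - r a'(r), \qquad b''(r) = -2a'(r) - r a''(r), \\
\frac{b(r)^2}{r^2} &= \frac{d^2}{r^2} - 2d\,\frac{a(r)}{r} + a(r)^2
\quad\Longrightarrow\quad
\text{the first lines of } [E] \text{ and } [E'] \text{ coincide}, \\
b''(r) - \frac1r\, b'(r) &= -r\Big( a''(r) + \frac1r a'(r) - \frac{1}{r^2} a(r) \Big)
= -r\Big( f(r)^2 a(r) - \frac{d}{r} f(r)^2 \Big) = b(r)\, f(r)^2 .
\end{aligned}
\]
```

The same computation read backwards recovers the second line of $[E]$ from the second line of $[E']$ for $r > 0$.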
It is shown in [24] that symmetric vortices of degree $d$ constitute a family
\[
\big( \psi_{\alpha,c} = f_{\alpha,c}(r)\,e^{id\theta},\ A_{\alpha,c} = a_{\alpha,c}(r)\,v \big)_{(\alpha,c)\in\mathbb{R}^2} \tag{5}
\]
smoothly indexed by two parameters $\alpha$ and $c$, with $f_{\alpha,c}$ and $a_{\alpha,c}$ analytic on $[0,\bar r]$, such that $\alpha$ and $c$ parametrize the initial conditions
\[
\alpha = a'_{\alpha,c}(0), \qquad c = \frac{1}{d!}\, f^{(d)}_{\alpha,c}(0).
\]

The branch $\alpha \to (\hat\psi(\alpha), \hat A(\alpha))$. In [24] a smooth branch of symmetric vortices has been constructed which satisfies the Dirichlet boundary conditions
\[
\psi|_{\partial B_{\bar r}} = e^{id\theta}, \qquad \bar r\,|A|\,|_{\partial B_{\bar r}} = d,
\]
as mentioned in Theorem 1. More precisely, with the notations of (5), there exists a smooth increasing function $\alpha \to c(\alpha)$, defined for $\alpha$ increasing from some limit value $L > 0$ to $+\infty$, such that, for a given $\alpha > L$, there exists one (and only one) $\hat R(\alpha)$ such that
\[
f_{\alpha,c(\alpha)}(\hat R(\alpha)) = 1, \qquad b_{\alpha,c(\alpha)}(\hat R(\alpha)) = 0.
\]
One denotes $(\hat\psi(\alpha), \hat A(\alpha)) = (\psi_{\alpha,c(\alpha)}, A_{\alpha,c(\alpha)})$ the corresponding vortex. Note that $f = f_{\alpha,c(\alpha)}$ is positive increasing ($f'(r) > 0$ for $r > 0$) and $b = b_{\alpha,c(\alpha)}$ positive decreasing ($b'(r) < 0$ for $r > 0$). Since $b(0) = d$, one has $0 \le b(r) \le d$ for $r \in [0,\bar r]$.

2. The Quadratic Forms $Q = D^2G(\psi,A)$ and $\bar Q$

At any critical point $(\psi,A)$, the Hessian
\[
\mathrm{Hess}_G(\psi,A) : H^1_0(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2) \to H^{-1}(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2),
\]
is the linearized operator of the quadratic form $D^2G(\psi,A)$. Note that such a Hessian (shortly denoted by $P$) is not an elliptic operator.

The gauge group, which is the group $J$ of affine diffeomorphisms of $H^1(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)$ of the form
\[
(\psi,A) \to (e^{i\gamma}\psi,\ A + \nabla\gamma), \qquad \gamma \in H^2(B_{\bar r},\mathbb{R}),
\]
leaves $G$ invariant. In some sense, $G$ is defined on the orbit space $H^1(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)/J$. The element $DG(\psi,A)$ in $H^{-1}(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)$ satisfies
\[
\int_{B_{\bar r}} DG(\psi,A)\cdot(i\gamma\psi, \nabla\gamma) = 0, \qquad \forall\,\gamma \in H^2(B_{\bar r},\mathbb{R}). \tag{6}
\]
(The integral notation stands for the pairing between distributions and functions, while the dot $\cdot$ stands for the real-valued pairing between complex numbers: $z\cdot z' = \operatorname{Re} \bar z z'$.)
Since the dual action of the Lie algebra of the gauge group $J$ associates with any critical point $(\psi,A)$ a closed subspace
\[
K_*(\psi) = \{ (i\gamma\psi, \nabla\gamma) \mid \gamma \in L^2(B_{\bar r},\mathbb{R}) \}
\]
of the space $H^{-1}(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)$, we are allowed to consider $P$ as defined on
\[
H(\psi) = K_*(\psi)^\perp = \{ (\varphi,B) \mid \operatorname{div} B = i\psi\cdot\varphi \},
\]
and taking values in the dual space $(H(\psi))^* = H^{-1}(B_{\bar r}, \mathbb{C}\times\mathbb{R}^2)/K_*(\psi)$.

Thus if $\gamma$ is any function in $H^2_0(B_{\bar r},\mathbb{R})$, the space $K_0 = \{ (i\gamma\psi, \nabla\gamma) \,/\, \gamma \in H^2_0(B_{\bar r},\mathbb{R}) \}$ will always be a subspace of the kernel $\operatorname{Ker}(Q)$ of the quadratic form $Q$. Elements of $K_0$ will be considered as trivial elements of $\operatorname{Ker}(Q)$.

Then, on $H(\psi)$, the quadratic form $Q = D^2G(\psi,A)$ coincides with an extended quadratic form $\bar Q$ also considered in [16], and defined by
\[
\bar Q(\varphi,B) = Q(\varphi,B) + \int_{B_{\bar r}} (\operatorname{div} B - i\psi\cdot\varphi)^2.
\]
Note that $\bar Q$ is positive if and only if $Q$ is positive, i.e. if and only if $(\psi,A)$ is a stable solution of the Ginzburg–Landau equation. In this case, the elements in the kernel of $\bar Q$ are the elements in the kernel of $Q$ which are $L^2$-orthogonal to $K_0$. Indeed an element $(\varphi,B)$ of $H^1_0(B_{\bar r},\mathbb{C}) \times H^1_0(B_{\bar r},\mathbb{R}^2)$ is orthogonal to $K_0$ in $L^2$ if and only if it satisfies the relation
\[
\operatorname{div}(B) = i\psi\cdot\varphi. \tag{7}
\]
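Definition 2.1. Let $(\psi,A)$ be a critical point of the energy form $G$ defined in (1), and $\bar Q$ the quadratic form associated with it. $(\psi,A)$ will be said to be
(i) strictly stable if $\bar Q$ is positive definite;
(ii) degenerately stable if $\bar Q$ is positive and degenerate;
(iii) unstable if $\bar Q$ takes strictly negative values.

Our purpose is to investigate the kernel of $\bar Q$ at a point of degenerate stability.

Lemma 2.2. One has
\[
\begin{aligned}
\bar Q(\varphi,B) ={}& \int |i\nabla\varphi + A\varphi|^2 + \int |B\psi|^2 + 4\int (i\nabla\psi + A\psi)\cdot B\varphi \\
& - \frac{\lambda}{2}\int (1-|\psi|^2)\,|\varphi|^2 + \lambda\int (\psi\cdot\varphi)^2 + \int (i\psi\cdot\varphi)^2 + \int |\nabla B|^2.
\end{aligned} \tag{8}
\]

Proof. One computes
\[
\begin{aligned}
Q(\varphi,B) ={}& \int |i\nabla\varphi + A\varphi|^2 + \int |B\psi|^2 + 2\int (i\nabla\varphi + A\varphi)\cdot B\psi \\
& + 2\int (i\nabla\psi + A\psi)\cdot B\varphi - \frac{\lambda}{2}\int (1-|\psi|^2)\,|\varphi|^2 + \lambda\int (\psi\cdot\varphi)^2 + \int |\operatorname{Curl} B|^2
\end{aligned}
\]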
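and makes use of the two equalities
\[
\int (i\nabla\varphi + A\varphi)\cdot B\psi = \int (i\nabla\psi + A\psi)\cdot B\varphi + \int (i\psi\cdot\varphi)\,\operatorname{div}(B)
\]
and
\[
\int |\operatorname{Curl} B|^2 + \int |\operatorname{div}(B)|^2 = \int |\nabla B|^2.
\]

We shall make use of the following result:

Lemma 2.3. Suppose that $A(r,\theta)\cdot u = 0$ $\forall (r,\theta)$. Then, for any closed subspace $X$ of $H^1_0(B_{\bar r},\mathbb{C}) \times H^1_0(B_{\bar r},\mathbb{R}^2)$, one has
\[
\mu_X = \inf\{ \bar Q(\varphi,B) \,/\, (\varphi,B) \in X,\ \|(\varphi,B)\|_{L^2} = 1 \} > -\infty
\]
and there exists $(\varphi,B)$ in $X$ with $\|(\varphi,B)\|_{L^2} = 1$ and $\bar Q(\varphi,B) = \mu_X$.

Similarly, defining $\nu_X$ as the lower bound of $\bar Q(\varphi,B)$ for $\|(\varphi,B)\|_{H^1_0} = 1$, there exists $(\varphi,B)$ in $X$ with $\|(\varphi,B)\|_{H^1_0} = 1$ and $\bar Q(\varphi,B) = \nu_X$.

Let us prove first an auxiliary result which will also be used later:

Lemma 2.4. For any $M > 0$, there exists $c_M > 0$ such that, for any $A \in L^\infty(B_1,\mathbb{R}^2)$ with $A(r,\theta)\cdot u = 0$ $\forall (r,\theta)$ and $\|A\|_\infty \le M$, and any $\varphi$ in $H^1_0(B_1,\mathbb{C})$, one has
\[
\int_{B_1} |i\nabla\varphi + A\varphi|^2 \ge c_M\,\|\varphi\|_{H^1_0}.
\]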
Proof. Suppose that such a $c_M$ does not exist: there exists a sequence $(\varphi_n, A_n)$ in $H^1_0(B_1,\mathbb{C}) \times L^\infty(B_1,\mathbb{R}^2)$ with $\|\varphi_n\|_{H^1} = 1$, $A_n(r,\theta)\cdot u = 0$ and $\|A_n\|_\infty \le M$ for all $n$, such that $\lim_n \int |i\nabla\varphi_n + A_n\varphi_n|^2 = 0$. Replacing this sequence by a subsequence, we can suppose that the $\varphi_n$ have a weak limit $\varphi$ in $H^1_0$ and the $A_n$ have a $\sigma$-weak limit $A$ in $L^\infty$, with $\|A\|_\infty \le M$ and $A(r,\theta)\cdot u \equiv 0$. As the $\varphi_n$ tend to $\varphi$ strongly in $L^2$, the sequence $A_n\varphi_n$ converges weakly to $A\varphi$ in $L^2$, the sequence $i\nabla\varphi_n + A_n\varphi_n$ converges weakly to $i\nabla\varphi + A\varphi$ in $L^2$, and we have
\[
\int |i\nabla\varphi + A\varphi|^2 \le \liminf \int |i\nabla\varphi_n + A_n\varphi_n|^2 = 0
\]
which implies $i\nabla\varphi + A\varphi = 0$. As $A\cdot u = 0$, we have $0 = \nabla\varphi\cdot u = \frac{\partial\varphi}{\partial r}$, i.e. $\varphi = 0$ since $\varphi$ vanishes at the boundary.

We have proved that the sequence $\varphi_n$, and consequently the sequence $A_n\varphi_n$, converges strongly to 0 in $L^2$, which implies $\lim \int |\nabla\varphi_n|^2 = 0$. This last result contradicts the assumption $\|\varphi_n\|_{H^1_0} = 1$ $\forall n$.

Proof of Lemma 2.3. Let $X_1 = \{ (\varphi,B) \in X \,/\, \|(\varphi,B)\|_{L^2} = 1 \}$. The fact that $\bar Q$ is bounded below on $X_1$ is an obvious consequence of the formula in Lemma 2.2.
Let $(\varphi_n, B_n)$ be a sequence in $X_1$ such that $\lim_n \bar Q(\varphi_n,B_n) = \mu_X$. As the $\varphi_n$ and $B_n$ are bounded in $L^2$, the sequences $\int |i\nabla\varphi_n + A\varphi_n|^2$ and $\int |\nabla B_n|^2$ are bounded. By Lemma 2.4, the sequence $(\varphi_n, B_n)$ is bounded in $X$ for the $H^1$ norm. Replacing it by a subsequence, we can suppose that it has a weak limit $(\varphi,B)$ in $X$ for the $H^1$-topology. The sequence $(\varphi_n,B_n)$ converges strongly to $(\varphi,B)$ in the $L^2$-topology. This implies first that $\|(\varphi,B)\|_{L^2} = 1$, i.e. $(\varphi,B) \in X_1$. This implies also that in the formula of Lemma 2.2, all the terms in $\varphi_n$ and $B_n$ will converge towards the corresponding term in $\varphi$ and $B$, except perhaps for the first and last one, for which we have
\[
\int |i\nabla\varphi + A\varphi|^2 \le \liminf \int |i\nabla\varphi_n + A\varphi_n|^2, \qquad
\int |\nabla B|^2 \le \liminf \int |\nabla B_n|^2
\]
which provide
\[
\mu_X \le \bar Q(\varphi,B) \le \lim_n \bar Q(\varphi_n,B_n) = \mu_X.
\]
The first assertion is proved. A similar argument proves the second assertion.

3. Loss of Stability on the Branch at Dirichlet Boundary Conditions

Lemma 3.1. (i) When $\alpha$ tends decreasingly to the limit value $L$, the vortex $(\hat\psi(\alpha), \hat A(\alpha))$ tends to a Berger and Chen vortex, i.e. a vortex of the type studied in [6].
(ii) Suppose $\lambda > 1$. Then for $\alpha$ close to $L$, the vortex $(\hat\psi(\alpha), \hat A(\alpha))$ is unstable.

Proof. (i) When $\alpha$ tends decreasingly to $L$, $c(\alpha)$ tends decreasingly to some $C \ge 0$. As $\hat R(\alpha)$ tends to infinity, the limit functions $f = f_{L,C}$ and $b = b_{L,C}$ are defined on the whole half-line $[0,+\infty[$, which implies $-1 < f_{L,C} < 1$. Moreover, since $f_{\alpha,c(\alpha)}$ is nonnegative increasing and $b_{\alpha,c(\alpha)}$ nonnegative decreasing, $f$ will be nonnegative increasing and $b$ nonnegative decreasing since $f_{\alpha,c(\alpha)}$, $f'_{\alpha,c(\alpha)}$, $b_{\alpha,c(\alpha)}$, $b'_{\alpha,c(\alpha)}$ tend to $f$, $f'$, $b$, $b'$ respectively, uniformly on compacts of $\mathbb{R}_+$.

One has $C > 0$ ($C = 0$ leads to a contradiction since $b_{L,0}(r) = d - Lr^2$ takes negative values). Hence, as $f(r)$ is equivalent to $Cr^d$ near 0, $f$ is strictly positive for $r > 0$ and has an increasing limit $l \in\, ]0,1]$ at infinity, while $b$ has a decreasing limit $m \in [0,d[$ at infinity.

Suppose $l < 1$: then $(rf'(r))' = rf(r)\Big( \frac{b(r)^2}{r^2} - \frac{\lambda}{2}(1 - f(r)^2) \Big)$ would be equivalent to $-r\lambda\frac{l}{2}(1 - l^2)$ at infinity, which would imply that $rf'(r)$ tends to $-\infty$ at infinity and would contradict the nonnegativity of $f'$. So we have necessarily $\lim_{r\to+\infty} f(r) = 1$.
Similarly, suppose $m > 0$: then $\Big( \frac{b'(r)}{r} \Big)' = \frac{b(r)f(r)^2}{r}$ would be equivalent to $\frac{m}{r}$ and $\frac{b'(r)}{r}$ would be equivalent to $m\,\mathrm{Log}(r)$ at infinity, which would contradict the nonpositivity of $b'$.

We have shown that the limit vortex $(\psi_{L,C}, A_{L,C})$ is defined on the whole plane and of the form $\psi(r,\theta) = f(r)e^{id\theta}$, $A(r,\theta) = a(r)v$ with $f$ increasing from 0 to 1 and $ra(r)$ increasing from 0 to $d$ as $r$ runs from 0 to $+\infty$, which characterizes Berger and Chen vortices.

(ii) It is shown in [16] that Berger and Chen vortices are unstable for $\lambda > 1$ and $d \ge 2$, hence the result.

Lemma 3.2. For $\alpha$ large enough, the vortex $(\hat\psi(\alpha), \hat A(\alpha))$ is strictly stable. More precisely, the quadratic form $\bar Q$ associated to it is positive definite.

Proof. One fixes $L_0 > L$ and supposes $\alpha \ge L_0$. One writes $\psi = f(r)e^{id\theta}$ for $\hat\psi(\alpha)$, $A = a(r)v$ for $\hat A(\alpha)$, $R$ for $\hat R(\alpha)$, and sets $b(r) = d - ra(r)$.

Step one: there exists $M > 0$ such that $\|A\|_\infty \le \frac{M}{R}$. The second line of equation $[E]$ can be written
\[
\Big( a'(r) + \frac{a(r)}{r} \Big)' = -\frac{b(r)f(r)^2}{r}
\]
and since $a'(r) + \frac{a(r)}{r} = 2\alpha$ at $r = 0$, we get
\[
a'(r) + \frac{a(r)}{r} = 2\alpha - \int_0^r \frac{b(s)f(s)^2}{s}\,ds \le 2\alpha
\]
so that $a(r)$ reaches its maximum either at $r = R$, in which case $\|a\|_\infty = a(R) = \frac{d}{R}$ (since $0 = b(R) = d - Ra(R)$), or at $r_\mu < R$, in which case $\|a\|_\infty = a(r_\mu) \le 2\alpha r_\mu \le 2\alpha R$. Since $\lim_{\alpha\to\infty} \alpha R^2 = d$, $R\|A\|_\infty$ remains bounded and the claim is proved.

Step two: there exists $N$ such that $\|i\nabla\psi + A\psi\|_2 \le N$. It is shown in [24] that there exists a decreasing function $h$ such that $f(r) = r^d h(r)$, so that we have $rf'(r) = dr^d h(r) + r^{d+1}h'(r) \le dr^d h(r) = d f(r)$, and in particular $Rf'(R) \le df(R) = d$. We compute, using $0 \le f(r) \le 1$:
\[
\begin{aligned}
\frac{1}{2\pi}\,\|i\nabla\psi + A\psi\|_2^2 &= \int_0^R \Big( f'(r)^2 + \frac{b(r)^2 f(r)^2}{r^2} \Big)\, r\,dr \\
&= Rf'(R)f(R) + \int_0^R \Big( -f''(r) - \frac{f'(r)}{r} + \frac{b(r)^2}{r^2} f(r) \Big) f(r)\, r\,dr \\
&= Rf'(R) + \frac{\lambda}{2}\int_0^R f(r)^2\big(1 - f(r)^2\big)\, r\,dr \\
&\le d + \frac{\lambda R^2}{4}
\end{aligned}
\]
and the claim is proved.
Step three: there exist two constants $L_2 > 0$ and $L_4 > 0$ such that, for any $R$ and any $\varphi$ in $H^1_0(B_R,\mathbb{C})$, one has
\[
\int_{B_R} |i\nabla\varphi + A\varphi|^2 \ge C_M \int_{B_R} |\nabla\varphi|^2, \qquad
\int_{B_R} |\nabla\varphi|^2 \ge \frac{L_2}{R^2}\int_{B_R} |\varphi|^2
\qquad\text{and}\qquad
\int_{B_R} |\nabla\varphi|^2 \ge \frac{L_4}{R}\Big( \int_{B_R} |\varphi|^4 \Big)^{1/2}.
\]
(The first inequality is Lemma 2.4 rescaled to the disk of radius $R$, with the constant $M$ provided by the first claim. The second one is just the Sobolev inclusion theorem from $H^1_0$ into $L^2$ and $L^4$ respectively, written for the unit disk $B_1$ and rescaled to the disk of radius $R$.)

We can now prove the lemma. Invoking the second and third claims above, and making use of $\|\psi\|_\infty = 1$, the formula of Lemma 2.2 provides
\[
\begin{aligned}
\bar Q(\varphi,B) &\ge C_M \int |\nabla\varphi|^2 - 4\,\|i\nabla\psi + A\psi\|_2\,\|\varphi\|_4\,\|B\|_4 - \frac{\lambda}{2}\|\varphi\|_2^2 + \int |\nabla B|^2 \\
&\ge \Big( \frac{C_M L_2}{2R^2} - \frac{\lambda}{2} \Big)\|\varphi\|_2^2 + \frac{C_M L_4}{2R}\|\varphi\|_4^2 - 4N\,\|\varphi\|_4\,\|B\|_4 + \frac{L_4}{R}\|B\|_4^2
\end{aligned}
\]
which shows that, for $R$ small enough, $\bar Q$ is positive definite. As $R = \hat R(\alpha)$ tends to 0 as $\alpha$ tends to infinity, the lemma is proved.

3.1. A degenerately stable vortex

From now on one assumes $\lambda > 1$ and $d \ge 2$.

Define $\alpha_d$ as the upper bound of the set of $\alpha$'s in $]L,+\infty[$ for which the quadratic form $D^2G(\hat\psi(\alpha), \hat A(\alpha))$ takes a strictly negative value on at least one element of $H^1_0(B_{\bar r},\mathbb{C}) \times H^1_0(B_{\bar r},\mathbb{R}^2)$.

Proposition 3.3. At $(\hat\psi(\alpha_d), \hat A(\alpha_d))$, the quadratic form $\bar Q$ is positive and degenerate.

Proof. $\bar Q$ is positive for $\alpha = \alpha_d$ since it is positive for $\alpha > \alpha_d$. Suppose that it is nondegenerate, i.e. that it is positive definite. Then Lemma 2.3 provides some $\nu > 0$ such that $\bar Q(\varphi,B) > \nu\,\|(\varphi,B)\|^2_{H^1_0}$ for all $(\varphi,B)$. By continuity with respect to $\alpha$, the form $\bar Q$ associated with $(\hat\psi(\alpha), \hat A(\alpha))$ will remain positive definite for $\alpha$ in a neighborhood of $\alpha_d$. This contradicts the definition of $\alpha_d$ as an upper bound of a set of $\alpha$ for which $\bar Q$ is not positive.

4. Diagonalization of $\bar Q$ at a Symmetric Vortex

The generic element $\varphi$ of $H^1_0(B_{\bar r},\mathbb{C})$ will be decomposed as
\[
\varphi(r,\theta) = \sum_{n\in\mathbb{Z}} \varphi_n(r)\,e^{in\theta}.
\]
Ginzburg–Landau Model of Superconductivity
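The generic element $B$ of $H_0^1(B_{\bar r}, \mathbb{R}^2)$ will be written as $B = B^u u + B^v v$ with
\[ B^u(r, \theta) = \sum_{n \in \mathbb{Z}} B_n^u(r)\, e^{in\theta}
\quad\text{and}\quad
B^v(r, \theta) = \sum_{n \in \mathbb{Z}} B_n^v(r)\, e^{in\theta} , \]
\[ B_{-n}^u(r) = \overline{B_n^u(r)} , \qquad B_{-n}^v(r) = \overline{B_n^v(r)} . \]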
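Notations. One denotes $W^{(1)}$ the space of pairs $(\varphi, B)$ in $H_0^1(B_{\bar r}, \mathbb{C}) \times H_0^1(B_{\bar r}, \mathbb{R}^2)$ such that, for any $n$, $\varphi_n$ and $B_n^v$ are real valued, while $B_n^u$ takes values in $i\mathbb{R}$. Similarly, $W^{(2)}$ will denote the subspace in $H_0^1(B_{\bar r}, \mathbb{C}) \times H_0^1(B_{\bar r}, \mathbb{R}^2)$ of pairs $(\varphi, B)$ such that, for any $n$, $\varphi_n$ and $B_n^v$ take values in $i\mathbb{R}$, while $B_n^u$ takes values in $\mathbb{R}$. $\pi^{(1)}$ and $\pi^{(2)} = 1 - \pi^{(1)}$ will be the orthogonal projections on $W^{(1)}$ and $W^{(2)}$ respectively.

For $n \ge 0$, $W_n$ will be the subspace in $H_0^1(B_{\bar r}, \mathbb{C}) \times H_0^1(B_{\bar r}, \mathbb{R}^2)$ of pairs $(\varphi, B)$ such that $\varphi$ is of the form
\[ \varphi(r, \theta) = \varphi_{d+n}(r)\, e^{i(d+n)\theta} + \varphi_{d-n}(r)\, e^{i(d-n)\theta} \]
and $B$ satisfies
\[ B^u(r, \theta) = B_n^u(r)\, e^{in\theta} + \overline{B_n^u(r)}\, e^{-in\theta} , \qquad
B^v(r, \theta) = B_n^v(r)\, e^{in\theta} + \overline{B_n^v(r)}\, e^{-in\theta} . \]

$\pi_n$ will denote the orthogonal projection on $W_n$, while $\pi_n^{(1)} = \pi_n \circ \pi^{(1)} = \pi^{(1)} \circ \pi_n$ (resp. $\pi_n^{(2)} = \pi_n \circ \pi^{(2)} = \pi^{(2)} \circ \pi_n$) will be the orthogonal projection on $W_n^{(1)} = W_n \cap W^{(1)}$ (resp. $W_n^{(2)} = W_n \cap W^{(2)}$).

We shall also use the notations $\bar Q_n$, $\bar Q_n^{(1)}$ and $\bar Q_n^{(2)}$ to denote the restrictions of the quadratic form $\bar Q$ to the subspaces $W_n$, $W_n^{(1)}$ and $W_n^{(2)}$ respectively.

Proposition 4.1. Let $(\psi = f(r) e^{id\theta}, A = a(r) v)$ be a symmetric vortex of degree $d$. Then one has
\[ \bar Q = \sum_{n \ge 0} \bar Q \circ \pi_n = \sum_{n \ge 0} \bar Q \circ \pi_n^{(1)} + \sum_{n \ge 0} \bar Q \circ \pi_n^{(2)} . \]
As a consequence,
\[ \mathrm{Ker}(\bar Q) = \bigoplus_{n \ge 0} \mathrm{Ker}(\bar Q_n) = \bigoplus_{n \ge 0} \bigl( \mathrm{Ker}(\bar Q_n^{(1)}) \oplus \mathrm{Ker}(\bar Q_n^{(2)}) \bigr) . \]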
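Proof. One computes, with notations above and using Lemma 2.2,
\begin{align*}
\frac{1}{2\pi}\, \bar Q(\varphi, B)
 &= \int \sum_{n \in \mathbb{Z}} \Bigl[ |\varphi_n'(r)|^2 + \Bigl( \frac{(n - r a(r))^2}{r^2} - \frac{\lambda}{2}\bigl(1 - f(r)^2\bigr) \Bigr) |\varphi_n(r)|^2 \Bigr]\, r\,dr \\
 &\quad + \int f(r)^2 \Bigl[ \frac{\lambda}{4} \sum_{n \in \mathbb{Z}} \bigl| \varphi_{d+n}(r) + \overline{\varphi_{d-n}(r)} \bigr|^2 + \frac{1}{4} \sum_{n \in \mathbb{Z}} \bigl| \varphi_{d+n}(r) - \overline{\varphi_{d-n}(r)} \bigr|^2 \Bigr]\, r\,dr \\
 &\quad - 4\, \mathrm{Re} \int \sum_{n \in \mathbb{Z}} \varphi_{d-n}(r) \Bigl( i f'(r) B_n^u(r) + \frac{b(r) f(r)}{r} B_n^v(r) \Bigr)\, r\,dr \\
 &\quad + \int \sum_{n \in \mathbb{Z}} \Bigl( |(B_n^u)'(r)|^2 + |(B_n^v)'(r)|^2 - \frac{4n}{r^2}\, \overline{B_n^v(r)} \cdot i B_n^u(r) \Bigr)\, r\,dr \\
 &\quad + \int \sum_{n \in \mathbb{Z}} \Bigl( \frac{1 + n^2}{r^2} + f(r)^2 \Bigr) \bigl( |B_n^u(r)|^2 + |B_n^v(r)|^2 \bigr)\, r\,dr \\
 &= \frac{1}{2\pi} \sum_{n \ge 0} \bar Q \circ \pi_n(\varphi, B)
\end{align*}
since
\begin{align*}
\frac{1}{2\pi}\, \bar Q \circ \pi_0(\varphi, B)
 &= \int \Bigl[ |\varphi_d'(r)|^2 + \Bigl( \frac{b(r)^2}{r^2} - \frac{\lambda}{2}\bigl(1 - f(r)^2\bigr) \Bigr) |\varphi_d(r)|^2 \Bigr]\, r\,dr \\
 &\quad + \int f(r)^2 \bigl( \lambda (\mathrm{Re}\, \varphi_d)^2 + (\mathrm{Im}\, \varphi_d)^2 \bigr)\, r\,dr \\
 &\quad - 4\, \mathrm{Re} \int \varphi_d(r) \Bigl( i f'(r) B_0^u(r) + \frac{b(r) f(r)}{r} B_0^v(r) \Bigr)\, r\,dr \\
 &\quad + \int \Bigl[ (B_0^u)'(r)^2 + (B_0^v)'(r)^2 + \Bigl( \frac{1}{r^2} + f(r)^2 \Bigr) \bigl( B_0^u(r)^2 + B_0^v(r)^2 \bigr) \Bigr]\, r\,dr
 \tag{9}
\end{align*}
and for $n > 0$
\begin{align*}
\frac{1}{2\pi}\, \bar Q \circ \pi_n(\varphi, B)
 &= \int \bigl( |\varphi_{d+n}'(r)|^2 + |\varphi_{d-n}'(r)|^2 \bigr)\, r\,dr \\
 &\quad + \int \Bigl( \frac{(b(r) + n)^2}{r^2} |\varphi_{d+n}(r)|^2 + \frac{(b(r) - n)^2}{r^2} |\varphi_{d-n}(r)|^2 \Bigr)\, r\,dr \\
 &\quad - \frac{\lambda}{2} \int \bigl(1 - f(r)^2\bigr) \bigl( |\varphi_{d+n}(r)|^2 + |\varphi_{d-n}(r)|^2 \bigr)\, r\,dr \\
 &\quad + \int f(r)^2 \Bigl( \frac{\lambda}{2} \bigl| \varphi_{d+n}(r) + \overline{\varphi_{d-n}(r)} \bigr|^2 + \frac{1}{2} \bigl| \varphi_{d+n}(r) - \overline{\varphi_{d-n}(r)} \bigr|^2 \Bigr)\, r\,dr \\
 &\quad - 4\, \mathrm{Re} \int \varphi_{d-n}(r) \Bigl( i f'(r) B_n^u(r) + \frac{b(r) f(r)}{r} B_n^v(r) \Bigr)\, r\,dr \\
 &\quad - 4\, \mathrm{Re} \int \varphi_{d+n}(r) \Bigl( i f'(r) \overline{B_n^u(r)} + \frac{b(r) f(r)}{r} \overline{B_n^v(r)} \Bigr)\, r\,dr \\
 &\quad + 2 \int \Bigl( |(B_n^u)'(r)|^2 + |(B_n^v)'(r)|^2 + \frac{4n}{r^2}\, \mathrm{Im}\bigl( B_n^u(r)\, \overline{B_n^v(r)} \bigr) \Bigr)\, r\,dr \\
 &\quad + 2 \int \Bigl( \frac{1 + n^2}{r^2} + f(r)^2 \Bigr) \bigl( |B_n^u(r)|^2 + |B_n^v(r)|^2 \bigr)\, r\,dr .
 \tag{10}
\end{align*}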
Then separate real and imaginary parts in $\varphi_n$, $B_n^u$ and $B_n^v$ to check that the sums in formulas (9) and (10) split along the decomposition $(\varphi, B) = \pi^{(1)}(\varphi, B) + \pi^{(2)}(\varphi, B)$.

5. On the Kernel of $\bar Q_n$ for $n > 0$

Thanks to the previous section, to go on with the study of $\bar Q$, we just need to focus our attention on $\bar Q_n$. In this section we prove the following result:

Proposition 5.1. Suppose $f(r) \ge 0$, $f'(r) \ge 0$ and $0 \le b(r) \le d$ for $r \in [0, \bar r]$. If $\bar Q$ is positive, then

(i) For $n > d$, the quadratic form $\bar Q_n$ is nondegenerate and its kernel is reduced to $\{0\}$.

(ii) For $1 \le n \le d$, either $\bar Q_n$ is nondegenerate or it has a two-dimensional kernel. In this case, there exist four nonnegative analytic functions $\sigma_0$, $\tau_0$, $\beta_0^u$ and $\beta_0^v$ on $[0, \bar r]$ such that $(\varphi, B) \in \mathrm{Ker}(\bar Q_n)$ if and only if $\exists \lambda \in \mathbb{C}$ such that
\[ \varphi(r, \theta) = \lambda \bigl( \sigma_0(r) - \tau_0(r) \bigr) e^{i(d+n)\theta} + \bar\lambda \bigl( \sigma_0(r) + \tau_0(r) \bigr) e^{i(d-n)\theta} \tag{11} \]
\[ B(r, \theta) = \beta_0^u(r)\, \mathrm{Im}(\lambda e^{in\theta})\, u + \beta_0^v(r)\, \mathrm{Re}(\bar\lambda e^{in\theta})\, v . \tag{12} \]
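In order to prove this proposition we need the following notations:

Notations.

1. The generic element $(\varphi, B)$ of $W_n^{(1)}$ will be denoted
\[ \varphi(r, \theta) = \varphi_{d+n}(r)\, e^{i(d+n)\theta} + \varphi_{d-n}(r)\, e^{i(d-n)\theta} , \qquad
B(r, \theta) = \beta^u(r) \sin(n\theta)\, u + \beta^v(r) \cos(n\theta)\, v \]
with $\varphi_{d+n}$, $\varphi_{d-n}$, $\beta^u$ and $\beta^v$ taking values in $\mathbb{R}$. (This corresponds to $B_n^u = -\frac{i}{2} \beta^u$ and $B_n^v = \frac{1}{2} \beta^v$.)

2. One will set $\varphi_{d+n} + \varphi_{d-n} = \sigma$, $\varphi_{d-n} - \varphi_{d+n} = \tau$.

3. With the notations above, one sets
\[ \frac{1}{\pi}\, \bar Q(\varphi, B) = Q_n(\sigma, \tau, \beta^u, \beta^v) . \]

Formula (10) provides
\begin{align*}
Q_n(\sigma, \tau, \beta^u, \beta^v)
 &= \int \Bigl( \sigma'(r)^2 + \tau'(r)^2 + \frac{b(r)^2 + n^2}{r^2} \bigl( \sigma(r)^2 + \tau(r)^2 \bigr) \Bigr)\, r\,dr \\
 &\quad - \frac{\lambda}{2} \int \bigl(1 - f(r)^2\bigr) \bigl( \sigma(r)^2 + \tau(r)^2 \bigr)\, r\,dr - \int \frac{4n\, b(r)}{r^2}\, \sigma(r) \tau(r)\, r\,dr \\
 &\quad + \int f(r)^2 \bigl( \lambda\, \sigma(r)^2 + \tau(r)^2 \bigr)\, r\,dr \\
 &\quad - 4 \int \Bigl( f'(r)\, \tau(r)\, \beta^u(r) + \frac{f(r) b(r)}{r}\, \sigma(r)\, \beta^v(r) \Bigr)\, r\,dr \\
 &\quad + \int \Bigl( (\beta^u)'(r)^2 + (\beta^v)'(r)^2 - \frac{4n}{r^2}\, \beta^u(r) \beta^v(r) \Bigr)\, r\,dr \\
 &\quad + \int \Bigl( \frac{1 + n^2}{r^2} + f(r)^2 \Bigr) \bigl( \beta^u(r)^2 + \beta^v(r)^2 \bigr)\, r\,dr
 \tag{13}
\end{align*}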
from which one deduces the following results:

Lemma 5.2. Suppose $f(r) \ge 0$, $f'(r) \ge 0$ and $0 \le b(r) \le d$ for $r \in [0, \bar r]$.

(i) For $n > 0$, one has
\[ Q_n(\sigma, \tau, \beta^u, \beta^v) \ge Q_n(|\sigma|, |\tau|, |\beta^u|, |\beta^v|) . \tag{14} \]

(ii) For $n > d$ and $(\varphi, B) \ne (0, 0)$, one has
\[ Q_n(\sigma, \tau, \beta^u, \beta^v) > Q_{n-1}(\sigma, \tau, \beta^u, \beta^v) . \tag{15} \]

(iii) If $Q_n$ is positive, either it is nondegenerate, or it has a 1-dimensional kernel $\mathbb{R}(\sigma_0, \tau_0, \beta_0^u, \beta_0^v)$, with $\sigma_0$, $\tau_0$, $\beta_0^u$, $\beta_0^v$ nonnegative and analytic on $]0, \bar r]$.

Proof. The first assertion is obvious. For the second one, compute
\begin{align*}
Q_n(\sigma, \tau, \beta^u, \beta^v) - Q_{n-1}(\sigma, \tau, \beta^u, \beta^v)
 &= \int \frac{(2n - 1)\bigl( \sigma(r)^2 + \tau(r)^2 \bigr) - 4 b(r)\, \sigma(r) \tau(r)}{r^2}\, r\,dr \\
 &\quad + \int \frac{(2n - 1)\bigl( \beta^u(r)^2 + \beta^v(r)^2 \bigr) - 4 \beta^u(r) \beta^v(r)}{r^2}\, r\,dr
 \tag{16}
\end{align*}
which is strictly positive as soon as $2n - 1 \ge \max(2, 2\|b\|_\infty)$.

For (iii), we claim first that, if $Q_n$ is positive and $(\sigma, \tau, \beta^u, \beta^v)$ is a nontrivial element in its kernel, one has
- either $\sigma(r) \ge 0$, $\tau(r) \ge 0$, $\beta^u(r) \ge 0$, $\beta^v(r) \ge 0$ $\forall r \in [0, \bar r]$
- or $\sigma(r) \le 0$, $\tau(r) \le 0$, $\beta^u(r) \le 0$, $\beta^v(r) \le 0$ $\forall r \in [0, \bar r]$.
To prove it, note that $Z = (\sigma, \tau, \beta^u, \beta^v)$ satisfies a linear second order equation
\[ Z'' + \frac{1}{r} Z'(r) + M(r) Z = 0 \]
where $M$ is a $4 \times 4$ matrix. The entries of $M$ being analytic functions on $]0, \bar r]$, $\sigma$, $\tau$, $\beta^u$ and $\beta^v$ must be analytic functions on $]0, \bar r]$. The first assertion of the lemma ensures that $(|\sigma|, |\tau|, |\beta^u|, |\beta^v|)$ is also an element of the kernel, and that consequently $|\sigma|$, $|\tau|$, $|\beta^u|$ and $|\beta^v|$ are also analytic functions. $\sigma$ and $|\sigma|$ being analytic, they coincide or are opposite everywhere: there exist $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ and $\varepsilon_4$ in $\{-1, 1\}$ such that $\sigma = \varepsilon_1 |\sigma|$, $\tau = \varepsilon_2 |\tau|$, $\beta^u = \varepsilon_3 |\beta^u|$ and $\beta^v = \varepsilon_4 |\beta^v|$. But one has necessarily $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = \varepsilon_4$: otherwise, the inequality (14) would be strict. The claim is proved.

It remains to show that two elements $(\sigma_1, \tau_1, \beta_1^u, \beta_1^v)$ and $(\sigma_2, \tau_2, \beta_2^u, \beta_2^v)$ in the kernel of $Q_n$ are proportional. Without loss of generality, we can suppose that the eight functions are nonnegative. Choose $r_0 > 0$ such that $\sigma_1(r_0) \ge 0$ and $\sigma_2(r_0) > 0$, and set $\lambda_0 = -\frac{\sigma_1(r_0)}{\sigma_2(r_0)}$. For $\lambda > \lambda_0$, one has $\sigma_1(r_0) + \lambda \sigma_2(r_0) > 0$, and the claim above implies
\[ \sigma_1(r) + \lambda \sigma_2(r) \ge 0 , \quad \tau_1(r) + \lambda \tau_2(r) \ge 0 , \quad \beta_1^u(r) + \lambda \beta_2^u(r) \ge 0 , \quad \beta_1^v(r) + \lambda \beta_2^v(r) \ge 0 , \quad \forall r \in [0, \bar r] . \]
Similarly, for $\lambda < \lambda_0$, one has
\[ \sigma_1(r) + \lambda \sigma_2(r) \le 0 , \quad \tau_1(r) + \lambda \tau_2(r) \le 0 , \quad \beta_1^u(r) + \lambda \beta_2^u(r) \le 0 , \quad \beta_1^v(r) + \lambda \beta_2^v(r) \le 0 , \quad \forall r \in [0, \bar r] \]
and finally $\sigma_1 + \lambda_0 \sigma_2 = \tau_1 + \lambda_0 \tau_2 = \beta_1^u + \lambda_0 \beta_2^u = \beta_1^v + \lambda_0 \beta_2^v = 0$.

Proof of Proposition 5.1. Assertion (i) results from the second assertion of the lemma: if $Q_n(\sigma, \tau, \beta^u, \beta^v)$ vanishes, then $Q_{n-1}(\sigma, \tau, \beta^u, \beta^v)$ takes a strictly negative value, and $\bar Q$ cannot be positive.

To prove assertion (ii) we first notice that the restrictions of $\bar Q$ to $W_n^{(1)}$ and $W_n^{(2)}$ have isomorphic kernels: for $(\varphi, B)$ in $W_n^{(2)}$, one has $\frac{1}{\pi} \bar Q(\varphi, B) = Q_n(\tau, \sigma, \beta^u, \beta^v)$ with $\sigma = i(\varphi_{d+n} + \varphi_{d-n})$, $\tau = i(\varphi_{d-n} - \varphi_{d+n})$, $\beta^u = 2 B_n^u$ and $\beta^v = 2i B_n^v$. Using this together with the third assertion of the lemma allows us to conclude.

6. The Case $n = 1$

The main result of this section is the following:

Proposition 6.1. Assume $f' \ge 0$, $0 \le b \le d$ on $[0, \bar r]$ and $f'(\bar r) > 0$, $b'(\bar r) < 0$. Then the restriction $\bar Q_1$ of $\bar Q$ to $W_1$ is nondegenerate.

Proof. As $\bar Q_1^{(1)}$ and $\bar Q_1^{(2)}$ have isomorphic kernels, it suffices to show that the quadratic form $Q_1$ is nondegenerate.
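Let $(\sigma, \tau, \beta^u, \beta^v)$ be a quadruple of functions belonging to the kernel of $Q_1$. By Lemma 5.2(i), they can be assumed to be nonnegative. They satisfy the linear differential system
\begin{align*}
\sigma''(r) + \frac{1}{r} \sigma'(r) &= \sigma(r) \Bigl( \frac{b(r)^2 + 1}{r^2} - \frac{\lambda}{2} \bigl(1 - 3 f(r)^2\bigr) \Bigr) - \frac{2 b(r) \tau(r)}{r^2} - \frac{2 f(r) b(r)}{r} \beta^v(r) \\
\tau''(r) + \frac{1}{r} \tau'(r) &= \tau(r) \Bigl( \frac{b(r)^2 + 1}{r^2} - \frac{\lambda}{2} \bigl(1 - f(r)^2\bigr) + f(r)^2 \Bigr) - \frac{2 b(r) \sigma(r)}{r^2} - 2 f'(r) \beta^u(r) \\
(\beta^u)''(r) + \frac{1}{r} (\beta^u)'(r) &= \beta^u(r) \Bigl( \frac{2}{r^2} + f(r)^2 \Bigr) - 2 f'(r) \tau(r) - \frac{2}{r^2} \beta^v(r) \\
(\beta^v)''(r) + \frac{1}{r} (\beta^v)'(r) &= \beta^v(r) \Bigl( \frac{2}{r^2} + f(r)^2 \Bigr) - \frac{2 b(r) f(r)}{r} \sigma(r) - \frac{2}{r^2} \beta^u(r) .
\tag{17}
\end{align*}

Set $\sigma_1(r) = f'(r)$, $\tau_1(r) = \frac{f(r) b(r)}{r}$ and $\beta_1(r) = -\frac{b'(r)}{r}$. As $f$ and $b$ satisfy the system $[E']$, one checks that the quadruple $(\sigma_1, \tau_1, \beta_1, \beta_1)$ is also a solution of the system (17) above, so that one has
\begin{align*}
\frac{1}{r} \bigl[ r \bigl( \sigma'(r) \sigma_1(r) - \sigma(r) \sigma_1'(r) \bigr) \bigr]'
 &= \frac{2 b(r)}{r^2} \bigl[ \tau_1(r) \sigma(r) - \sigma_1(r) \tau(r) \bigr] + \frac{2 b(r) f(r)}{r} \bigl[ \beta_1(r) \sigma(r) - \sigma_1(r) \beta^v(r) \bigr] \\
\frac{1}{r} \bigl[ r \bigl( \tau'(r) \tau_1(r) - \tau(r) \tau_1'(r) \bigr) \bigr]'
 &= \frac{2 b(r)}{r^2} \bigl[ \sigma_1(r) \tau(r) - \tau_1(r) \sigma(r) \bigr] + 2 f'(r) \bigl[ \beta_1(r) \tau(r) - \tau_1(r) \beta^u(r) \bigr] \\
\frac{1}{r} \bigl[ r \bigl( (\beta^u)'(r) \beta_1(r) - \beta^u(r) \beta_1'(r) \bigr) \bigr]'
 &= 2 f'(r) \bigl[ \tau_1(r) \beta^u(r) - \beta_1(r) \tau(r) \bigr] + \frac{2}{r^2} \bigl[ \beta_1(r) \beta^u(r) - \beta_1(r) \beta^v(r) \bigr] \\
\frac{1}{r} \bigl[ r \bigl( (\beta^v)'(r) \beta_1(r) - \beta^v(r) \beta_1'(r) \bigr) \bigr]'
 &= \frac{2 b(r) f(r)}{r} \bigl[ \sigma_1(r) \beta^v(r) - \sigma(r) \beta_1(r) \bigr] + \frac{2}{r^2} \bigl[ \beta_1(r) \beta^v(r) - \beta_1(r) \beta^u(r) \bigr] .
\end{align*}
Summing up those four equalities, one gets
\[ [\, r \theta(r) \,]' = 0 \]
with
\[ \theta(r) = \sigma'(r) \sigma_1(r) - \sigma(r) \sigma_1'(r) + \tau'(r) \tau_1(r) - \tau(r) \tau_1'(r) + (\beta^u + \beta^v)'(r) \beta_1(r) - \beta_1'(r) (\beta^u + \beta^v)(r) . \]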
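The cancellation behind $[r\theta(r)]' = 0$ can be checked directly from system (17). The sketch below assumes only that both $(\sigma, \tau, \beta^u, \beta^v)$ and $(\sigma_1, \tau_1, \beta_1, \beta_1)$ solve (17). The labels $S_\sigma, S_\tau, S_u, S_v$ for the four right-hand sides are introduced here for illustration and are not the paper's notation.

```latex
% For two solutions x, x_1 of a component equation x'' + x'/r = c(r) x + (coupling),
%   (1/r) [ r (x' x_1 - x x_1') ]' = x_1 (x'' + x'/r) - x (x_1'' + x_1'/r),
% so the diagonal terms c(r) x x_1 drop out and only the couplings remain:
\begin{align*}
S_\sigma &= \tfrac{2b}{r^2}\,(\tau_1\sigma - \sigma_1\tau) + \tfrac{2bf}{r}\,(\beta_1\sigma - \sigma_1\beta^v), \\
S_\tau   &= \tfrac{2b}{r^2}\,(\sigma_1\tau - \tau_1\sigma) + 2f'\,(\beta_1\tau - \tau_1\beta^u), \\
S_u      &= 2f'\,(\tau_1\beta^u - \beta_1\tau) + \tfrac{2}{r^2}\,\beta_1(\beta^u - \beta^v), \\
S_v      &= \tfrac{2bf}{r}\,(\sigma_1\beta^v - \beta_1\sigma) + \tfrac{2}{r^2}\,\beta_1(\beta^v - \beta^u).
\end{align*}
% The 2b/r^2 terms cancel between S_sigma and S_tau, the 2bf/r terms between
% S_sigma and S_v, the 2f' terms between S_tau and S_u, and the 2/r^2 terms
% between S_u and S_v, hence
\[
S_\sigma + S_\tau + S_u + S_v = 0
\quad\Longrightarrow\quad
\frac{1}{r}\,[\,r\theta(r)\,]' = 0 .
\]
```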
As $r\theta(r)$ is equal to 0 at $r = 0$, one gets $\theta(r) = 0$ $\forall r$. In particular, as $\sigma$, $\tau$, $\beta^u$ and $\beta^v$ vanish at $r = \bar r$, one gets
\[ \sigma'(\bar r) \sigma_1(\bar r) + \tau'(\bar r) \tau_1(\bar r) + (\beta^u + \beta^v)'(\bar r) \beta_1(\bar r) = 0 . \]
Since $\sigma$, $\tau$, $\beta^u$ and $\beta^v$ are positive and vanish at $\bar r$, one has necessarily $\sigma'(\bar r) \le 0$, $\tau'(\bar r) \le 0$, $(\beta^u + \beta^v)'(\bar r) \le 0$, so that, assuming $f'(\bar r) > 0$ (i.e. $\sigma_1(\bar r) > 0$), $f(\bar r) > 0$ and $b(\bar r) \ge 0$ (hence $\tau_1(\bar r) \ge 0$), $b'(\bar r) < 0$ (hence $\beta_1(\bar r) > 0$), we have $\sigma'(\bar r) = (\beta^u)'(\bar r) = (\beta^v)'(\bar r) = 0$.

If $b(\bar r) > 0$, we have also $\tau'(\bar r) = 0$: the four functions and their derivatives vanish at $\bar r$, and by the Cauchy–Lipschitz uniqueness property, we get the conclusion:
\[ (\sigma, \tau, \beta^u, \beta^v) = (0, 0, 0, 0) . \]
If $b(\bar r) = 0$, and if we suppose $\tau'(\bar r) < 0$, Eq. (17) provides $(\beta^u)''(\bar r) = 0$ and, differentiating the equation, $(\beta^u)'''(\bar r) = -2 f'(\bar r) \tau'(\bar r) > 0$, which provides $\beta^u(r) = \frac{1}{6} (\beta^u)'''(\bar r) (r - \bar r)^3 + O(r - \bar r)^4 < 0$ for $r < \bar r$ close to $\bar r$: this contradicts the positivity of $\beta^u$. So in this case also we have $\tau'(\bar r) = 0$, and we conclude as above.

7. On the Kernel of $\bar Q_0$

The main result of this section is:

Proposition 7.1. Let $\bar Q_0$ be the restriction to $W_0$ of the quadratic form associated with a vortex $(\hat\psi(\alpha), \hat A(\alpha))$ ($\alpha \in\, ]L, +\infty[$) of the radial branch defined in Sec. 1. Then $\bar Q_0$ is degenerate if and only if one has $\hat R'(\alpha) = 0$. In this case, its kernel is one-dimensional, and equal to
\[ \mathbb{R}\, \frac{d}{d\alpha} \bigl( \hat\psi(\alpha), \hat A(\alpha) \bigr) . \]

We need some preliminaries to prove this proposition. First, since $B_0^u$ and $B_0^v$ are real-valued, the generic element $(\varphi, B)$ of $W_0^{(1)}$ can be written
\[ \varphi(r, \theta) = g(r) e^{id\theta} , \qquad B(r, \theta) = \beta(r) u \]
with $g$ and $\beta$ real-valued, vanishing at $r = 0$ and $r = \bar r$. Note that $(\varphi, B)$ satisfies $\mathrm{div}(B) = i\psi \cdot \varphi = 0$, so that the quadratic forms $Q$ and $\bar Q$ coincide on $W_0^{(1)}$.

Such a $(\varphi, B)$ belongs to $\mathrm{Ker}(\bar Q_0^{(1)})$ if and only if $g$ and $\beta$ are solutions of the linear differential system
\begin{align*}
g''(r) + \frac{1}{r} g'(r) &= g(r) \Bigl( \frac{b(r)^2}{r^2} - \frac{\lambda}{2} \bigl(1 - 3 f(r)^2\bigr) \Bigr) - 2\, \frac{f(r) b(r)}{r}\, \beta(r) \\
\beta''(r) + \frac{1}{r} \beta'(r) &= \Bigl( \frac{1}{r^2} + f(r)^2 \Bigr) \beta(r) - \frac{2 b(r) f(r)}{r}\, g(r) .
\tag{18}
\end{align*}
Next we have the following:

Remarks. (i) In Sec. 1 a two-parameter family $(f_{\alpha,c}, a_{\alpha,c})$ of solutions of system $[E]$ has been defined. Differentiating in $[E]$ with respect to $\alpha$ or $c$, one checks easily that the pairs
\[ g_\alpha = \frac{\partial f_{\alpha,c}}{\partial \alpha} , \quad \beta_\alpha = \frac{\partial a_{\alpha,c}}{\partial \alpha} ; \qquad
g_c = \frac{\partial f_{\alpha,c}}{\partial c} , \quad \beta_c = \frac{\partial a_{\alpha,c}}{\partial c} \tag{19} \]
are nontrivial solutions of the system (18). By results of [24], they have expansions at 0:
\begin{align*}
g_\alpha(r) &= -\frac{cd}{2d + 2}\, r^{2d+2} + O(r^{2d+4}) , & \beta_\alpha(r) &= r + O(r^{2d+1}) \\
g_c(r) &= r^d + O(r^{2d+2}) , & \beta_c(r) &= -\frac{c}{2d + 2}\, r^{2d+1} + O(r^{2d+3}) .
\tag{20}
\end{align*}

(ii) There exists an interval $]0, r_0]$ on which one has $f(r) > 0$ and $\frac{b(r)^2}{r^2} - \lambda > 0$, so that the maximum principle implies: if $(g, \beta)$ is a solution of (18) which satisfies $g(r) > 0$, $g'(r) > 0$, $\beta(r) < 0$ and $\beta'(r) < 0$ in a neighborhood of 0, then one has $g(r) > 0$, $g'(r) > 0$, $\beta(r) < 0$ and $\beta'(r) < 0$ on $]0, r_0]$. In particular, if $f$ is positive increasing and $b$ positive decreasing on $[0, \bar r]$, then $g_c$ and $\beta_\alpha$ are positive increasing, while $g_\alpha$ and $\beta_c$ are negative decreasing, on $[0, \bar r]$.

(iii) If $(g, \beta)$ is a solution of (18) which is bounded near 0, then there exist $\mu_0$ and $\nu_0$ in $\mathbb{R}$ such that
\[ g(r) = \mu_0 r^d + O(r^{d+2}) , \qquad \beta(r) = \nu_0 r + O(r^{2d+1}) \]
for $r$ close to 0. (The proof of this fact is quite similar to the proof of [24, Lemma 2.4] and will be omitted.)

We are now able to prove the following results:

Lemma 7.2. Let $(g, \beta)$ be a solution of the system (18) on $]0, \bar r]$ which is bounded near 0. Then there exist two real constants $\mu_0$ and $\nu_0$ such that
\[ g(r) = \mu_0 g_c(r) + \nu_0 g_\alpha(r) , \qquad \beta(r) = \mu_0 \beta_c(r) + \nu_0 \beta_\alpha(r) , \qquad \forall r \in\, ]0, \bar r] . \]

Proof. Let $\mu_0$ and $\nu_0$ be as in remark (iii) above. For $\mu > \mu_0$ and $\nu < \nu_0$, one has in a neighborhood of 0
\[ \mu g_c(r) + \nu g_\alpha(r) - g(r) = (\mu - \mu_0) r^d + O(r^{d+2}) , \qquad
\mu \beta_c(r) + \nu \beta_\alpha(r) - \beta(r) = (\nu - \nu_0) r + O(r^{2d+1}) \]
so that, by remark (ii) above, one has $\mu g_c(r) + \nu g_\alpha(r) - g(r) > 0$ and $\mu \beta_c(r) + \nu \beta_\alpha(r) - \beta(r) < 0$ on $]0, r_0]$. Hence $\mu_0 g_c(r) + \nu_0 g_\alpha(r) - g(r) \ge 0$ and $\mu_0 \beta_c(r) + \nu_0 \beta_\alpha(r) - \beta(r) \le 0$ on $]0, r_0]$.
Similarly, for $\mu < \mu_0$ and $\nu > \nu_0$, one has $\mu g_c(r) + \nu g_\alpha(r) - g(r) < 0$ and $\mu \beta_c(r) + \nu \beta_\alpha(r) - \beta(r) > 0$ on $]0, r_0]$. Hence $\mu_0 g_c(r) + \nu_0 g_\alpha(r) - g(r) \le 0$ and $\mu_0 \beta_c(r) + \nu_0 \beta_\alpha(r) - \beta(r) \ge 0$ on $]0, r_0]$. Finally $\mu_0 g_c(r) + \nu_0 g_\alpha(r) - g(r) = 0$ and $\mu_0 \beta_c(r) + \nu_0 \beta_\alpha(r) - \beta(r) = 0$ on $]0, r_0]$. The equality extends to $[0, \bar r]$ since we deal with analytic functions.

Corollary 7.3. Suppose $f$ is positive increasing and $b$ is positive decreasing. The quadratic form $\bar Q_0^{(1)}$ is degenerate if and only if the function $g_c \beta_\alpha - g_\alpha \beta_c$ vanishes at $r = \bar r$. In this case, its kernel is one-dimensional, generated by $(g_\alpha + \mu g_c, \beta_\alpha + \mu \beta_c)$ with $\mu = -\frac{g_\alpha(\bar r)}{g_c(\bar r)} = -\frac{\beta_\alpha(\bar r)}{\beta_c(\bar r)}$.

Proof. If $g_c \beta_\alpha - g_\alpha \beta_c$ vanishes at $r = \bar r$, then with $\mu$ as above, $(g = g_\alpha + \mu g_c, \beta = \beta_\alpha + \mu \beta_c)$ is a solution of the system (18) which satisfies $g(\bar r) = \beta(\bar r) = 0$, and the corresponding pair $(\varphi = g(r) e^{id\theta}, B = \beta(r) u)$ will belong to $H_0^1(B_{\bar r}, \mathbb{C}) \times H_0^1(B_{\bar r}, \mathbb{R}^2)$ and to the kernel of $\bar Q$. Hence $\bar Q$ is degenerate.

Conversely, if $\bar Q$ is degenerate, one will find a solution $(g, \beta)$ of the system (18) which satisfies $g(\bar r) = \beta(\bar r) = 0$. By Lemma 7.2, there will be two real constants $\mu_0$ and $\nu_0$ such that $\mu_0 g_c(\bar r) + \nu_0 g_\alpha(\bar r) = 0$ and $\mu_0 \beta_c(\bar r) + \nu_0 \beta_\alpha(\bar r) = 0$. As a consequence of remark (ii) above, $\mu_0$ and $\nu_0$ cannot be equal to 0, so that $g_c \beta_\alpha - g_\alpha \beta_c$ vanishes at $\bar r$, and any linear combination of $(g_\alpha, \beta_\alpha)$ and $(g_c, \beta_c)$ which vanishes at $r = \bar r$ must be proportional to $(\mu_0 g_c + \nu_0 g_\alpha, \mu_0 \beta_c + \nu_0 \beta_\alpha)$.

Lemma 7.4. If $\bar Q$ is positive, the restriction to $W_0^{(2)}$ of the form $\bar Q$ is nondegenerate.

Proof. Let $(\varphi, B)$ be an element in the kernel of $\bar Q$ which is of the form
\[ \varphi(r, \theta) = i g(r) e^{id\theta} , \qquad B(r, \theta) = \beta(r) u \]
with $g$ and $\beta$ real-valued. Set $\gamma(r) = \int_r^{\bar r} \beta(s)\,ds$: one has $B + \nabla\gamma = 0$. Moreover $(\varphi + i\gamma\psi, B + \nabla\gamma) = (\varphi + i\gamma\psi, 0)$ belongs to the kernel of $Q$, and so does $(i\varphi - \gamma\psi, 0)$. As $(i\varphi - \gamma\psi, 0)$ belongs to $W_0^{(1)}$, the previous study shows that $i\varphi - \gamma\psi = 0$, and we have proved $(\varphi, B) = -(i\gamma\psi, \nabla\gamma) \in K_0$. As $K_0 \cap \mathrm{Ker}(\bar Q) = \{(0, 0)\}$, we have $(\varphi, B) = (0, 0)$ and the result.

As a consequence, we get Proposition 7.1. Indeed let us recall [24, formula (41)],
\[ \hat R'(\alpha) = \frac{r (g_c \beta_\alpha - g_\alpha \beta_c)}{r \beta_c f'_{c,c(\alpha)} + g_c b'_{c,c(\alpha)}} . \tag{21} \]
Thus $g_c \beta_\alpha - g_\alpha \beta_c$ vanishes at $r = \bar r$ if and only if $\hat R'(\alpha) = 0$. In this case, using the results and the notations of [24], the linear combination of $(g_\alpha, \beta_\alpha)$ and $(g_c, \beta_c)$ vanishing at $r = \bar r$ is $(g_\alpha + c'(\alpha) g_c, \beta_\alpha + c'(\alpha) \beta_c)$. This ends the proof of the proposition.
8. Tentative Construction of a Bifurcation Branch

The purpose of this section is to show how, if some additional assumptions are satisfied, the previous results lead to a bifurcation branch of solutions of the Ginzburg–Landau equation, emanating from the radial branch $\{(\tilde\psi(\alpha), \tilde A(\alpha))\}_{\alpha \in ]L, +\infty[}$ at $\alpha = \alpha_d$ defined in Sec. 3.

Assumption 1. $\hat R'(\alpha_d) \ne 0$.

With this assumption we know from Proposition 7.1 that $\bar Q_0$ is not degenerate. The results of Propositions 4.1, 5.1 and 6.1 imply then
\[ \mathrm{Ker}(\bar Q) = \bigoplus_{m=2}^{d} \mathrm{Ker}(\bar Q_m) \tag{22} \]
where each $\mathrm{Ker}(\bar Q_m)$ is reduced to $\{0\}$ or 2-dimensional, one of them at least being not reduced to $\{0\}$ since $\bar Q$ is positive degenerate.

Notations. We set
\[ p = \max \bigl\{ m \in [2 \cdots d] \,/\, \mathrm{Ker}(\bar Q_m) \ne \{0\} \bigr\} \tag{23} \]
and we fix $(\varphi_p, B_p) \ne (0, 0)$ in $\mathrm{Ker}(\bar Q_p^{(1)})$.

Let $\bar Q_\alpha$ be the quadratic form $\bar Q$ associated with the vortex $(\hat\psi(\alpha), \hat A(\alpha))$.

Assumption 2. $\frac{d \bar Q_\alpha(\varphi_p, B_p)}{d\alpha} \ne 0$ at $\alpha = \alpha_d$.

Our purpose is to apply the bifurcation theorem of Crandall and Rabinowitz [12] to a suitable function $\tilde F$ defined below, in order to obtain the following result:

Proposition 8.1. Suppose that Assumptions 1 and 2 above are satisfied. Then there exists a bifurcation branch (not necessarily unique) emanating from the radial branch $\{(\hat\psi(\alpha), \hat A(\alpha))\}$ at $\alpha = \alpha_d$, in the direction of $(\varphi_p, B_p)$, where $p$ is defined by (23). This means that there exist $\delta_0 > 0$, a smooth function $\delta \to \alpha(\delta)$ defined from $[0, \delta_0]$ in $]L, +\infty[$, and a smooth branch $\delta \to (\hat\psi_\delta, \hat A_\delta)$ defined for $\delta \in [0, \delta_0]$ such that

(i) $\alpha(0) = \alpha_d$,

(ii) $(\hat\psi_\delta, \hat A_\delta)$ is a solution of the Ginzburg–Landau equation $DG(\psi, A) = 0$ with $\bar r = \hat R(\alpha(\delta))$ with boundary conditions (2).

(iii) In a neighborhood of 0,
\[ (\hat\psi_\delta, \hat A_\delta) = \bigl( \hat\psi(\alpha(\delta)), \hat A(\alpha(\delta)) \bigr) + \delta (\varphi_p, B_p) + O(\delta^2) . \tag{24} \]

The remainder of the section is devoted to the proof of this proposition.

Proof of Proposition 8.1. Define first the function $\Gamma_{\bar r}$ as the energy form $G$ for the parameter $\bar r$ (with $h = 0$), rescaled to the unit disk $B_1$, by the formula
\[ \Gamma(\psi, A) = \frac{1}{2} \int_{B_1} \Bigl( |i\nabla\psi + A\psi|^2 + \frac{\lambda \bar r^2}{4} \bigl( 1 - |\psi|^2 \bigr)^2 + \frac{1}{\bar r^2} |\mathrm{curl}\, A|^2 \Bigr) . \tag{25} \]
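One has $\Gamma(\psi, A) = G(\tilde\psi, \tilde A)$ when $(\psi, A)$ and $(\tilde\psi, \tilde A)$ are related by
\[ \psi(x) = \tilde\psi(\bar r x) , \qquad A(x) = \bar r \tilde A(\bar r x) , \tag{26} \]
and $D\Gamma(\psi, A) = 0$ if and only if $DG(\tilde\psi, \tilde A) = 0$.

Setting, for $\alpha \in\, ]L, +\infty[$,
\[ \psi_\alpha^{\mathrm{rad}}(x) = \hat\psi(\alpha)(\bar r x) \quad\text{and}\quad A_\alpha^{\mathrm{rad}}(x) = \bar r \hat A(\alpha)(\bar r x) , \]
we get a radial branch of solutions for the rescaled Ginzburg–Landau equation $D\Gamma(\psi, A) = 0$. We shall write $(\psi_0, A_0)$ for $(\psi_\alpha^{\mathrm{rad}}, A_\alpha^{\mathrm{rad}})$ at $\alpha = \alpha_d$.

Define now the following spaces:

$H_p = \bigoplus_{k \ge 0} W_{kp}^{(1)}$ as a closed subspace of $H_0^1(B_1, \mathbb{C}) \times H_0^1(B_1, \mathbb{R}^2)$;

$H$ as the space of $(\varphi, B)$ in $H_p$ which have the property $\mathrm{div}(B) = i\psi_0 \cdot \varphi$.

$H_p^*$ will be the space of distributions $(\varphi^*, B^*)$ in $H^{-1}(B_1, \mathbb{C}) \times H^{-1}(B_1, \mathbb{R}^2)$ such that $\varphi^*$ has a Fourier expansion of the form
\[ \varphi^*(r, \theta) = \sum_{k \in \mathbb{Z}} \varphi_{d+kp}^*(r)\, e^{i(d+kp)\theta} \]
while $(B^*)^u$ and $(B^*)^v$ have Fourier expansions of the form
\[ (B^*)^u = \sum_{k \in \mathbb{Z}} (B^*)_{kp}^u(r)\, e^{ikp\theta} , \qquad (B^*)^v = \sum_{k \in \mathbb{Z}} (B^*)_{kp}^v(r)\, e^{ikp\theta} \]
with, for all $k$, $\varphi_{d+kp}^*$ and $(B^*)_{kp}^v$ real-valued distributions, while $(B^*)_{kp}^u$ takes values in $i\mathbb{R}$.

$K_p^*$ is the space of elements of $H_p^*$ which are of the form $(i\gamma\psi_0, \nabla\gamma)$ for some $\gamma$ in $L^2(B_1, \mathbb{R})$. For having $(i\gamma\psi_0, \nabla\gamma) \in H_p^*$, $\gamma$ must have a Fourier expansion $\gamma = \sum_{k \in \mathbb{Z}} \gamma_{pk}(r)\, e^{ikp\theta}$ with $\gamma_{pk}(r) \in i\mathbb{R}$ and $\gamma_{-kp}(r) = -\gamma_{kp}(r)$, for all $r$ and $k$. This last property implies $\gamma_0 = 0$, so that $\gamma$ is orthogonal to constant functions. This implies that, on $K_p^*$, the norms $\|(i\gamma\psi_0, \nabla\gamma)\|_{H^{-1}}$, $\|\nabla\gamma\|_{H^{-1}}$ and $\|\gamma\|_{L^2}$ are equivalent, and that $K_p^*$ is a closed subspace of $H_p^*$.

Define $H^*$ as the quotient space $H_p^* / K_p^*$, and $\pi_p^*$ as the canonical projection from $H_p^*$ onto $H^*$.

Finally, define the functions
\[ F : \,]L, +\infty[\, \times H \to H_p^* , \qquad F(\alpha, (\varphi, B)) = D\Gamma_{\hat R(\alpha)} \bigl( \psi_\alpha^{\mathrm{rad}} + \varphi, A_\alpha^{\mathrm{rad}} + B \bigr) \]
and
\[ \tilde F : \,]L, +\infty[\, \times H \to H^* , \qquad \tilde F = \pi_p^* \circ F . \]
(One checks easily that $D\Gamma_{\hat R(\alpha)}(\psi_\alpha^{\mathrm{rad}} + \varphi, A_\alpha^{\mathrm{rad}} + B)$ belongs to $H_p^*$ whenever $(\varphi, B)$ belongs to $H$.)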
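The identity $\Gamma(\psi, A) = G(\tilde\psi, \tilde A)$ under the correspondence (26) is a change of variables $y = \bar r x$ in dimension two. The sketch below assumes, since the definition of $G$ lies outside this section, that for $h = 0$ the energy reads $G(\tilde\psi, \tilde A) = \frac12 \int_{B_{\bar r}} \bigl( |i\nabla\tilde\psi + \tilde A\tilde\psi|^2 + \frac{\lambda}{4}(1 - |\tilde\psi|^2)^2 + |\mathrm{curl}\,\tilde A|^2 \bigr)$.

```latex
% With \psi(x) = \tilde\psi(\bar r x), A(x) = \bar r \tilde A(\bar r x), y = \bar r x, dx = \bar r^{-2} dy:
\begin{align*}
(i\nabla\psi + A\psi)(x) &= \bar r\,(i\nabla\tilde\psi + \tilde A\tilde\psi)(y)
 &&\Rightarrow&
 \int_{B_1} |i\nabla\psi + A\psi|^2\,dx &= \int_{B_{\bar r}} |i\nabla\tilde\psi + \tilde A\tilde\psi|^2\,dy , \\
 & && &
 \frac{\lambda \bar r^2}{4}\int_{B_1} (1 - |\psi|^2)^2\,dx &= \frac{\lambda}{4}\int_{B_{\bar r}} (1 - |\tilde\psi|^2)^2\,dy , \\
(\mathrm{curl}\,A)(x) &= \bar r^{2}\,(\mathrm{curl}\,\tilde A)(y)
 &&\Rightarrow&
 \frac{1}{\bar r^2}\int_{B_1} |\mathrm{curl}\,A|^2\,dx &= \int_{B_{\bar r}} |\mathrm{curl}\,\tilde A|^2\,dy .
\end{align*}
% Summing the three lines and multiplying by 1/2 gives \Gamma(\psi,A) = G(\tilde\psi,\tilde A).
```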
We shall prove that $\tilde F$ satisfies at $\alpha = \alpha_d$ the three conditions in [12] for a bifurcation branch:

Condition 1. $\mathrm{Ker}\, D_2 \tilde F(\alpha_d, (0, 0))$ has dimension 1.

Condition 2. $\mathrm{Im}\, D_2 \tilde F(\alpha_d, (0, 0))$ is a closed subspace of $H^*$, with codimension 1.

Condition 3. If $(\varphi_p, B_p)$ is a nonzero element in $\mathrm{Ker}\, D_2 \tilde F(\alpha_d, (0, 0))$, then $D_{12} \tilde F(\alpha_d, (0, 0))(\varphi_p, B_p) \notin \mathrm{Im}\, D_2 \tilde F(\alpha_d, (0, 0))$.

($D_2 \tilde F$ is the derivative of $\tilde F$ with respect to $(\varphi, B)$; $D_{12} \tilde F$ is the derivative of $D_2 \tilde F$ with respect to $\alpha$.)

Proof of Condition 1. $\mathrm{Ker}\, D_2 F(\alpha_d, (0, 0))$ can be interpreted as the restriction to $H$ of the kernel of the quadratic form $D^2 G(\hat\psi(\alpha_d), \hat A(\alpha_d))$ rescaled from $B_{\tilde R(\alpha_d)}$ to $B_1$ by the formula (26). As elements of $H$ satisfy the condition $\mathrm{div}(B) = i\psi_0 \cdot \varphi$, $\mathrm{Ker}\, D_2 F(\alpha_d, (0, 0))$ will be in one-to-one correspondence with the kernel of $\bar Q$ restricted to $H_p = \bigoplus_{k \ge 0} W_{kp}^{(1)}$. By the choice of $p$ by (23) and the formula (22), the kernel of this restriction is equal to the kernel of $\bar Q_p^{(1)}$ and is 1-dimensional, equal to $\mathbb{R}(\varphi_p, B_p)$. So that we have
\[ \mathrm{Ker}(D_2 F(\alpha_d, (0, 0))) = \mathbb{R}(\varphi_p^\circ, B_p^\circ) \]
with $\varphi_p^\circ(x) = \varphi_p(\hat R(\alpha_d) x)$ and $B_p^\circ(x) = \hat R(\alpha_d) B_p(\hat R(\alpha_d) x)$.

Moreover, by gauge invariance of $G$ the image of $F(\alpha_d, (0, 0))$ belongs to the annihilator $K_0^\perp$ of $K_0$ in $H_p^*$. One checks easily
\[ K_0^\perp \cap K_p^* = \{0\} , \]
so that $\pi_p^*$ is one-to-one on the image of $F(\alpha_d, (0, 0))$. We get then
\[ \mathrm{Ker}(D_2 \tilde F(\alpha_d, (0, 0))) = \mathbb{R}(\varphi_p^\circ, B_p^\circ) \tag{27} \]
and the result.

Proof of Condition 2. Note first that $H_p^*$ can be considered as the dual space of $H_p$, and that $H$ is the annihilator of $K_p^*$ in $H_p$. So, $H^*$ is the dual space of $H$, and $D_2 \tilde F(\alpha_d, (0, 0))$, as an operator from $H$ into $H^*$, is a symmetric operator which linearizes the restriction to $H$ of the quadratic form $D^2 \Gamma(\psi_0, A_0)$. This implies
\[ \mathrm{Ker}\, D_2 \tilde F(\alpha_d, (0, 0)) = \bigl( \mathrm{Im}\, D_2 \tilde F(\alpha_d, (0, 0)) \bigr)^\perp \quad\text{in } H , \]
so that it suffices to prove that $D_2 \tilde F(\alpha_d, (0, 0))$ has a closed image.

Let $\tilde{\bar Q}$ be the quadratic form $\bar Q$ rescaled to the unit disk $B_1$ by the correspondence (26). It coincides on $H$ with the quadratic form
\[ (\varphi, B) \to \int D_2 \tilde F(\alpha_d, (0, 0))(\varphi, B) \cdot (\varphi, B) . \]
Let $W$ be a closed supplementary subspace of $\mathbb{R}(\varphi_p^\circ, B_p^\circ)$ in $H$. Lemma 2.3 provides a constant $\nu > 0$ such that, for any $(\varphi, B) \in W$, one has
\[ \int D_2 \tilde F(\alpha_d, (0, 0))(\varphi, B) \cdot (\varphi, B) \ge \nu \|(\varphi, B)\|_{H_0^1}^2 \]
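which in turn implies $\|D_2 \tilde F(\alpha_d, (0, 0))(\varphi, B)\|_{H^{-1}} \ge \nu \|(\varphi, B)\|_{H_0^1}$ and the result. More precisely, we get
\[ \mathrm{Im}\, D_2 \tilde F(\alpha_d, (0, 0)) = \bigl( \mathbb{R}(\varphi_p^\circ, B_p^\circ) \bigr)^\perp \quad\text{in } H^* . \tag{28} \]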
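The passage from the coercivity estimate on $W$ to the closed-image statement can be spelled out as follows. This is a sketch under the assumption that $T = D_2\tilde F(\alpha_d, (0,0))$ is regarded as a bounded operator from $W \subset H_0^1$ into $H^{-1}$, with $\langle\cdot,\cdot\rangle$ the duality pairing; the symbols $T$ and $x$ are introduced here only for brevity.

```latex
\[
\nu\,\|x\|_{H_0^1}^2 \;\le\; \langle T x, x\rangle
  \;\le\; \|T x\|_{H^{-1}}\,\|x\|_{H_0^1}
  \quad\Longrightarrow\quad
  \|T x\|_{H^{-1}} \;\ge\; \nu\,\|x\|_{H_0^1}, \qquad x \in W .
\]
% If T x_k converges in H^{-1} with x_k in W, the lower bound shows that (x_k)
% is a Cauchy sequence in H_0^1; W being closed, x_k -> x in W and T x_k -> T x.
% Hence T(W), which equals the image of T because T vanishes on the kernel
% direction complementary to W, is closed.
```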
Proof of Condition 3. By (28), it suffices to prove
\[ \int \frac{d}{d\alpha} D_2 \tilde F(\alpha, (0, 0))(\varphi_p^\circ, B_p^\circ) \Big|_{\alpha = \alpha_d} \cdot (\varphi_p^\circ, B_p^\circ) \ne 0 \]
which is the translation, up to rescaling, of Assumption 2.

The bifurcation theorem of [12] provides then:
Theorem 4. There exist $\delta_0 > 0$ and a smooth map $\delta \to (\alpha(\delta), (\varphi(\delta), B(\delta)))$ from $]-\delta_0, \delta_0[$ in $]L, \infty[\, \times H$ with
\[ \alpha(\delta) = \alpha_d + O(\delta^2) \quad\text{and}\quad (\varphi(\delta), B(\delta)) = \delta (\varphi_p^\circ, B_p^\circ) + O(\delta^2) \tag{29} \]
such that, for any $\delta$,
\[ \tilde F(\alpha(\delta), (\varphi(\delta), B(\delta))) = 0 \quad\text{in } H^* . \]

Remark. The Hessian of the quadratic form $\Gamma$ at $(\psi_0, A_0)$ is not an elliptic operator. However, its restriction to $H$ coincides, up to rescaling, with the linearized operator $\bar P$ of the quadratic form $\bar Q$, and $\bar P$ is an elliptic operator. So, the operator $D_2 F(\alpha_d, (0, 0))$, which is the Hessian of $\Gamma$ restricted to $H$, shares the elliptic regularity property of $\bar P$ and we have: if $h^* \in \mathrm{Im}\, D_2 F(\alpha_d, (0, 0)) \subset H_p^*$ belongs to $H^k$ for some $k \ge 1$ (resp. to $C^\infty$), then any pre-image of $h^*$ under the map $D_2 F(\alpha_d, (0, 0))$ in $H$ will belong to $H^{k+2}$ (resp. $C^\infty$). In particular, $(\varphi_p^\circ, B_p^\circ)$ is of class $C^\infty$.

As a consequence, if one considers $\tilde F$ as defined on $H \cap H^{k+2}$, with values in the quotient space of $H_p^* \cap H^k$ by $K_p^* \cap H^k$, the conditions for a bifurcation are still satisfied. By uniqueness of a bifurcation, the expansion formula (29) is true in any $H^k$ and also in $L^\infty$.

With the notations above, set $\zeta(\delta) = F(\alpha(\delta), (\varphi(\delta), B(\delta)))$. We claim that $\zeta(\delta) = 0$ for $\delta$ small enough.

One has, for $\delta$ in $]-\delta_0, \delta_0[$, $\zeta(\delta) \in K_p^*$, i.e. $\zeta(\delta) = (i\gamma(\delta)\psi_0, \nabla\gamma(\delta))$ for some $\gamma(\delta)$ in $L^2(B_1, \mathbb{R})$. By gauge invariance of $G$, one will have
\[ \int \zeta(\delta) \cdot (i\gamma\psi_\delta, \nabla\gamma) = \int F(\alpha(\delta), (\varphi(\delta), B(\delta))) \cdot (i\gamma\psi_\delta, \nabla\gamma) = 0 \tag{30} \]
for all $\gamma$ in $H^1(B_1, \mathbb{R})$, with $\psi_\delta = \psi_{\alpha(\delta)}^{\mathrm{rad}} + \varphi(\delta)$.
Taking first all $\gamma$ in $H_0^1$, Eq. (30) provides $\Delta\gamma(\delta) = \gamma(\delta)\, \psi_0 \cdot \psi_\delta$ and $\Delta\gamma(\delta) \in L^2(B_1, \mathbb{R})$, which implies $\gamma(\delta) \in H^2(B_1, \mathbb{R})$. So, in (30), one can substitute $\gamma(\delta)$ to $\gamma$ to obtain
\[ \int \gamma(\delta)^2\, \psi_0 \cdot \psi_\delta + \int |\nabla\gamma(\delta)|^2 = \int \zeta(\delta) \cdot (i\gamma(\delta)\psi_\delta, \nabla\gamma(\delta)) = 0 . \]
So one has
\[ \|\zeta(\delta)\|_{L^2}^2 = -\int \gamma(\delta)^2\, \psi_0 \cdot (\psi_\delta - \psi_0) \le \|\psi_0\|_\infty \|\psi_\delta - \psi_0\|_\infty \|\gamma(\delta)\|_{L^2}^2 . \]
As $\gamma(\delta)$ is orthogonal to the constants, there will exist a universal constant $C_1$ such that
\[ \|\zeta(\delta)\|_{L^2}^2 \ge \|\nabla\gamma(\delta)\|_{L^2}^2 \ge C_1 \|\gamma(\delta)\|_{L^2}^2 . \]
As $\psi_\delta$ tends to $\psi_0$ in $L^\infty$ (see Remark above), we get $\gamma(\delta) = 0$ as soon as $\|\psi_0\|_\infty \|\psi_\delta - \psi_0\|_\infty$ is strictly smaller than $C_1$.

We have proved $F(\alpha(\delta), (\varphi(\delta), B(\delta))) = 0$ for $\delta$ small enough, which, by definition of $F$ and rescaling for each $\delta$ from the disk $B_1$ to the disk $B_{\bar r}$ with $\bar r = \tilde R(\alpha(\delta))$, proves the proposition.

9. Study of the Zeroes on the Bifurcation Branch

In this section, we suppose that Assumptions 1 and 2 of the previous section are satisfied, and that the quadratic form $\bar Q_d$ is degenerate. Note that this assumption is a consequence of the two previous ones for $d = 2$ using (22). Hence, we get a bifurcation branch $(\psi_\delta^{\mathrm{bif}}, A_\delta^{\mathrm{bif}})$ for the Ginzburg–Landau equation $DG(\psi, A) = 0$, which is of the form
\begin{align*}
\psi_\delta^{\mathrm{bif}} &= \hat\psi(\alpha(\delta)) + \delta \varphi_d + O(\delta^2) \\
 &= \hat\psi(\alpha_d) + \delta \varphi_d + O(\delta^2) .
\tag{31}
\end{align*}
By Proposition 5.1, $\varphi_d$ is of the form
\[ \varphi_d(r, \theta) = \sigma(r) + \tau(r) + (\sigma(r) - \tau(r)) e^{2id\theta} \]
with $\sigma$ and $\tau$ taking values in $\mathbb{R}_+$.

Our purpose is to prove the following result:

Proposition 9.1. For $\delta > 0$ small enough, the function $\psi_\delta^{\mathrm{bif}}$ possesses exactly $d$ zeroes, located at the vertices of a regular polygon with center 0 and radius $r_\delta \sim \delta^{1/d}$.

The proof of this proposition requires a preliminary lemma.

Lemma 9.2. With the notations above, $\varphi_d(0)$ is a strictly positive real number.
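Proof. The linearized operator $\bar P$ of $\bar Q$ is elliptic, which implies that $(\varphi_d, B_d)$, as an element of the kernel of $\bar P$, is of class $C^\infty$. Consequently, $\sigma$ and $\tau$ are smooth functions of $r^2$, and one has $\sigma(r) - \tau(r) = O(r^{2d})$. Consequently, $\sigma$ and $\tau$ have limited expansions near 0
\[ \sigma(r) = \sum_{k=0}^{d} \sigma_k r^{2k} + O(r^{2d+2}) , \qquad
\tau(r) = \sum_{k=0}^{d} \tau_k r^{2k} + O(r^{2d+2}) , \qquad
\text{with } \sigma_k = \tau_k \text{ for } k \le d - 1 . \tag{32} \]
In particular, $\sigma(0) = \tau(0)$ in $\mathbb{R}_+$. What we have to prove is $\sigma(0) \ne 0$.

If $B_d$ has coordinates $(B_1, B_2)$ in the canonical frame, one has
\[ B_1 + i B_2 = i \bigl( \beta^u(r) + \beta^v(r) \bigr) e^{i(1-d)\theta} + i \bigl( \beta^u(r) - \beta^v(r) \bigr) e^{i(1+d)\theta} . \]
As $B_d$ is a smooth vector field, we have $\beta^u(r) + \beta^v(r) = O(r^{d-1})$, $\beta^u(r) - \beta^v(r) = O(r^{d+1})$, and
\[ \beta^u(r) = \gamma r^{d-1} + O(r^{d+1}) , \qquad \beta^v(r) = \gamma r^{d-1} + O(r^{d+1}) \tag{33} \]
for some $\gamma$ in $\mathbb{R}$. We claim that $\sigma(0) = 0$ would imply $\gamma = 0$.

$(\sigma, \tau, \beta^u, \beta^v)$ satisfy
\begin{align*}
\sigma''(r) + \frac{1}{r} \sigma'(r) &= \sigma(r) \Bigl( \frac{b(r)^2 + d^2}{r^2} - \frac{\lambda}{2} \bigl(1 - 3 f(r)^2\bigr) \Bigr) - \frac{2d\, b(r) \tau(r)}{r^2} - \frac{2 f(r) b(r)}{r} \beta^v(r) \\
\tau''(r) + \frac{1}{r} \tau'(r) &= \tau(r) \Bigl( \frac{b(r)^2 + d^2}{r^2} - \frac{\lambda}{2} \bigl(1 - f(r)^2\bigr) + f(r)^2 \Bigr) - \frac{2d\, b(r) \sigma(r)}{r^2} - 2 f'(r) \beta^u(r) \\
(\beta^u)''(r) + \frac{1}{r} (\beta^u)'(r) &= \beta^u(r) \Bigl( \frac{1 + d^2}{r^2} + f(r)^2 \Bigr) - 2 f'(r) \tau(r) - \frac{2d}{r^2} \beta^v(r) \\
(\beta^v)''(r) + \frac{1}{r} (\beta^v)'(r) &= \beta^v(r) \Bigl( \frac{1 + d^2}{r^2} + f(r)^2 \Bigr) - \frac{2 b(r) f(r)}{r} \sigma(r) - \frac{2d}{r^2} \beta^u(r) .
\tag{34}
\end{align*}

We suppose $\sigma(0) = \tau(0) = 0$. A limited expansion at order $2d - 4$ in the first equation provides, using the equality of $\sigma_k$ and $\tau_k$ for $k \le d - 1$ and the fact that $f(r) = c r^d + O(r^{d+2})$ and $b(r) = d - \alpha r^2 + O(r^{2d+2})$:
\begin{align*}
\sum_{k=1}^{d-1} 4k^2 \sigma_k r^{2k-2} + O(r^{2d-2})
 &= \sum_{k=1}^{d-1} \sigma_k r^{2k-2} (d - b(r))^2 - \frac{\lambda}{2} \sum_{k=1}^{d-2} \sigma_k r^{2k} + O(r^{2d-2}) \\
 &= 4\alpha^2 \sum_{k=3}^{d-1} \sigma_{k-2} r^{2k-2} - \frac{\lambda}{2} \sum_{k=2}^{d-1} \sigma_{k-1} r^{2k-2} + O(r^{2d-2})
\end{align*}
which, by identification, implies $\sigma_1 = 0$, $16\sigma_2 = -\frac{\lambda}{2}\sigma_1$ and $\sigma_2 = 0$, then $\sigma_k = 0$ for $k \le d - 1$.

In the same equation, expansion at order $2d - 2$ provides
\[ 4d^2 \sigma_d r^{2d-2} + O(r^{2d}) = 2d^2 \sigma_d r^{2d-2} - 2d^2 \tau_d r^{2d-2} - 2cd\gamma r^{2d-2} + O(r^{2d}) \]
and $2d^2(\sigma_d + \tau_d) + 2cd\gamma = 0$. As $\sigma$, $\tau$ and $\beta^u$ are nonnegative functions, one must have $\sigma_d \ge 0$, $\tau_d \ge 0$ and $\gamma \ge 0$, which implies $\gamma = 0$ as claimed. We have shown
\[ \sigma(0) = 0 \;\Rightarrow\; \beta^u(r) + \beta^v(r) = O(r^{d+1}) \text{ near } 0 . \tag{35} \]

Set now, in the spirit of [17],
\[ \sigma_d(r) = \frac{f'(r)}{r^{d-1}} , \qquad \tau_d(r) = \frac{f(r) b(r)}{r^d} , \qquad \beta_d(r) = -\frac{b'(r)}{r^d} \]
with $f$ and $b$: $b(r) = d - r a(r)$ as above associated with the vortex $(\psi_0, A_0)$. One checks that $(\sigma_d, \tau_d, \beta_d)$ satisfy the following system:
\begin{align*}
\sigma_d''(r) + \frac{1}{r} \sigma_d'(r) &= \sigma_d(r) \Bigl( \frac{b(r)^2 + d^2}{r^2} - \frac{\lambda}{2} \bigl(1 - 3 f(r)^2\bigr) \Bigr) - \frac{2d\, b(r) \tau_d(r)}{r^2} - \frac{2 f(r) b(r)}{r} \beta_d(r) + \lambda (d - 1) \frac{f(r) (1 - f(r)^2)}{r^d} \\
\tau_d''(r) + \frac{1}{r} \tau_d'(r) &= \tau_d(r) \Bigl( \frac{b(r)^2 + d^2}{r^2} - \frac{\lambda}{2} \bigl(1 - f(r)^2\bigr) + f(r)^2 \Bigr) - \frac{2d\, b(r) \sigma_d(r)}{r^2} - 2 f'(r) \beta_d(r) - (2d - 2) \frac{b'(r) f(r)}{r^{d+1}} \\
\beta_d''(r) + \frac{1}{r} \beta_d'(r) &= \beta_d(r) \Bigl( \frac{1 + d^2}{r^2} + f(r)^2 \Bigr) - 2 f'(r) \tau_d(r) - \frac{2d}{r^2} \beta_d(r) + (2d - 2) \frac{b(r) f(r)^2}{r^{d+1}} \\
 &= \beta_d(r) \Bigl( \frac{1 + d^2}{r^2} + f(r)^2 \Bigr) - \frac{2 b(r) f(r)}{r} \sigma_d(r) - \frac{2d}{r^2} \beta_d(r) + (2d - 2) \frac{b(r) f(r)^2}{r^{d+1}} .
\tag{36}
\end{align*}
Defining a function $\eta$ by
\[ \eta(r) = \sigma_d'(r) \sigma(r) - \sigma'(r) \sigma_d(r) + \tau_d'(r) \tau(r) - \tau'(r) \tau_d(r) + \beta_d'(r) \bigl( \beta^u(r) + \beta^v(r) \bigr) - (\beta^u + \beta^v)'(r) \beta_d(r) \]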
one computes, from systems (34) and (36) f (r)(1 − f (r)2 ) 1 (rη(r))0 = λ(m − 1) σ(r) r rd − (2m − 2)
b0 (r)f (r) τ (r) rd+1
+ (2m − 2)
b(r)f (r)2 u (β (r) + β v (r)) rd+1
and consequently $(r\eta(r))' \ge 0$. If one supposes $\sigma(0) = 0$, near 0 one has $\sigma(r) = O(r^{2d})$, $\tau(r) = O(r^{2d})$, $\beta^u(r) + \beta^v(r) = O(r^{d+1})$. As $\beta_d(r) = 2\alpha r^{1-d} + O(r^{d+1})$, one will have $r\eta(r)|_{r=0} = 0$, which implies $\eta(r) > 0$ for $r > 0$ close to 0.

If $\sigma(0) = \tau(0) = 0$, then $\sigma/\sigma_d$, $\tau/\tau_d$ and $(\beta^u + \beta^v)/\beta_d$ are positive functions vanishing at 0. So they have a positive derivative near 0, which implies $\eta(r) < 0$ for $r > 0$ close to 0 and a contradiction. The lemma is proved.

Proof of Proposition 9.1. With the result of the previous lemma, the proof of the proposition is identical to the proof of the same property for the bifurcation branch in the simplified model without magnetic field, in [11, Sec. III], for d = 2, and in [23, Sec. 4], for larger d.

Acknowledgments

The authors would like to thank the RTN project Fronts-Singularities for its help (HPRN-CT-2002-00274).

References

[1] A. Aftalion, On the minimizers of the Ginzburg–Landau energy for high kappa: the axially symmetric case, Annales I.H.P. Analyse non linéaire 16/6 (1999) 747–772.
[2] A. Aftalion and E. N. Dancer, On the symmetry and uniqueness of solutions of the Ginzburg–Landau equation for small domains, Commun. Contemp. Math. 3(1) (2001) 1–14.
[3] A. Aftalion, E. Sandier and S. Serfaty, Pinning phenomena in the Ginzburg–Landau model of superconductivity, J. Math. Pures Appl. 80(3) (2001) 339–372.
[4] S. Alama, L. Bronsard and T. Giorgi, Uniqueness of symmetric vortex solutions in the Ginzburg–Landau model of superconductivity, J. Funct. Anal. 167 (1999) 399–424.
[5] L. Almeida, F. Bethuel and Y. Guo, A remark on the instability of symmetric vortices with large coupling constant, Comm. Pure Appl. Math. L (1997) 1295–1300.
[6] M. S. Berger and Y. Y. Chen, Symmetric vortices for the Ginzburg–Landau equation of superconductivity and the nonlinear desingularization phenomenon, J. Funct. Anal. 82 (1989) 259–295.
[7] F. Bethuel, H. Brézis and F. Hélein, Ginzburg–Landau Vortices (Birkhäuser, 1994).
[8] C. Bolley and B. Helffer, Champs magnétiques critiques et hystérésis dans les films supraconducteurs, Prépublication, juillet 2002.
[9] P. Baumann, D. Phillips and Q. Tang, Stable nucleation for the Ginzburg–Landau system with an applied magnetic field, A.R.M.A. 142 (1998) 1–43.
[10] C. B. Clemons, An existence and uniqueness result for symmetric vortices for the Ginzberg–Landau equations of superconductivity, J. Differential Equations 157 (1999) 150–162.
[11] M. Comte and P. Mironescu, A bifurcation analysis for the Ginzburg–Landau equation, A.R.M.A. 144 (1998) 301–311.
[12] M. G. Crandall and P. H. Rabinowitz, Bifurcation, perturbation of simple eigenvalues and linearized stability, Arch. Rational Mech. Anal. 52 (1973) 161–180.
[13] Q. Du, M. D. Gunzburger and J. S. Peterson, Analysis and approximation of the Ginzburg–Landau model of superconductivity, SIAM Review 34 (1992) 54–81.
[14] Y. Guo, Instability of symmetric vortices with large charge and coupling constant, Comm. Pure Appl. Math. XLIX (1996) 1051–1080.
[15] V. L. Ginzburg and L. D. Landau, On the theory of superconductivity, Soviet Phys. JETP 20 (1950) 1064–1082.
[16] S. Gustafson and I. M. Sigal, The stability of magnetic vortices, Comm. Math. Phys. 212 (2000) 257–275.
[17] P. Hagan, Spiral waves in reaction diffusion equation, SIAM J. Appl. Math. 42 (1982) 762–786.
[18] R. M. Hervé and M. Hervé, Étude qualitative des solutions réelles d'une équation différentielle liée à l'équation de Ginzburg–Landau, Ann. I.H.P., Analyse non linéaire 11 (1994) 427–440.
[19] P. Mironescu, On the stability of radial solutions of the Ginzburg–Landau equation, J. Funct. Anal. 130 (1995) 334–344.
[20] B. Plohr, Princeton thesis.
[21] B. Plohr, The behavior at infinity of isotropic vortices and monopoles, J. Math. Phys. 22 (1981) 2184–2190.
[22] E. Sandier and S. Serfaty, A rigorous derivation of a free-boundary problem arising in superconductivity, Ann. Scient. Ec. Norm. Sup. 33 (2000) 561–592.
[23] M. Sauvageot, Properties of the solutions of the Ginzburg–Landau equation on the bifurcation branch, Nonlinear Partial Differential Equations and Applications 10 (2003) 375–397.
[24] M. Sauvageot, Radial solutions for the Ginzburg–Landau equation with applied magnetic field, J. Nonlin. Anal.: Series A Theory and Methods 55(7/8) (2003) 785–826.
[25] S. Serfaty, Solutions stables de l'équation de Ginzburg–Landau en présence de champ magnétique, C. R. Acad. Sc. Sér. I, 326(8) (1998) 949–954.
May 31, 2004 12:28 WSPC/148-RMP
00204
Reviews in Mathematical Physics Vol. 16, No. 4 (2004) 451–477 © World Scientific Publishing Company
POSITIVE DEFINITE MAPS, REPRESENTATIONS AND FRAMES
DORIN ERVIN DUTKAY
Department of Mathematics, The University of Iowa
14 MacLean Hall, Iowa City, IA 52242-1419, USA
[email protected]

Received 21 May 2002
Revised 2 February 2004

We present a unitary approach to the construction of representations and intertwining operators. We apply it to C*-algebras, groups, Gabor-type unitary systems and wavelets. We give an application of our method to the theory of frames, and we prove a general dilation theorem which is in turn applied to specific cases; in this way we obtain a dilation theorem for wavelets.

Keywords: Positive definite kernels; Kolmogorov decomposition; representations; Gabor frames; wavelets; Ruelle transfer operator; tight frames.
Contents
1. Introduction
2. Positive Definite Maps and Representations
3. Intertwining Operators
4. Frames and Dilations
5. A Dilation Theorem for Wavelets
Acknowledgments
References
1. Introduction

Engineering problems in time-frequency analysis of coherent vector expansions, Gabor bases, wavelets based on scaling and integral translations, and multiresolution algorithms in signal processing are generally not thought to be related to operator algebras. In this paper, we show nonetheless that a fundamental idea of Kolmogorov adds clarity to known constructions in operator algebra theory, and moreover is the key to an extension of recent results in the more applied areas that we enumerated above. Of our original results (see Sec. 5 below) we highlight a new algorithm for the construction of certain orthonormal frames of wavelet type. Our paper proposes
a general method of construction of representations of various algebraic structures as operators on Hilbert spaces. Our goal is to show how some well-known constructions of representations fit into the same framework and are consequences of a general result. Among the structures considered, we mention C*-algebras, groups, Gabor-type unitary systems and wavelet representations.

In operator theory, the GNS construction producing representations of C*-algebras is a fundamental tool (see [3]). In harmonic analysis, unitary representations of groups can be constructed when a function of positive type is present (see [11]). Representations are ubiquitous also in the theory of wavelets and frames (see [12, 14]). We will see how these various results have in fact a common ground: a classical theorem of Kolmogorov (Theorem 2.2), also known in the literature as the Kolmogorov decomposition of positive definite kernels.

We follow here the ideas introduced in [10]. It is shown there that the Kolmogorov theorem gives a unified treatment of several important dilation theorems such as the GNS-Stinespring construction for C*-algebras, the Naimark–Sz.-Nagy unitary dilation of positive definite functions on groups, the construction of Fock spaces and the algebras of canonical commutation and anticommutation relations. Kolmogorov's result was used also by Sz.-Nagy and Foias in dilation theory, for the commutant lifting theorem ([17, 18]), which in turn was a key idea used by Sarason to obtain a solution to the Nevanlinna–Pick interpolation problem [16]. For a more complete account of the history and applications of Kolmogorov's result, we refer to [4]. We will indicate how this technique can be used also for the construction of wavelet representations and Gabor-type unitary systems. More general constructions for Hermitian kernels are also possible and they are based on Krein spaces (see [5]).

In Sec. 2 we review the general result of Kolmogorov and we show how it can be used for the GNS construction and for positive definite maps on groups. Then we apply it to Gabor-type unitary systems and we obtain unitary representations, and for wavelets we get the cyclic representations introduced in [14].

Section 3 concerns operators compatible with the representations defined in Sec. 2, called intertwining operators. Again, the starting point is a general theorem (Theorem 3.2). We consider some particular cases and study how the intertwining operators will be compatible with the additional structure that appears.

In Sec. 4 we analyze some connections between representations and frames. We recall that a set $\{x_n \mid n \in \mathbb{N}\}$ of vectors in a Hilbert space $H$ is called a frame for the Hilbert space $H$ if there are some positive constants $A$ and $B$ such that
\[ A\|f\|^2 \le \sum_{n \in \mathbb{N}} |\langle f \mid x_n \rangle|^2 \le B\|f\|^2, \qquad (f \in H). \]
When $A = B = 1$, we call it a normalized tight frame. It is known that any normalized tight frame is the projection of an orthonormal basis of a bigger Hilbert space (see [12]). We will prove that the normalized tight frames can be dilated to orthonormal bases in a way that is compatible with the
representations defined in Sec. 2. We will get as immediate consequences the dilation theorems for groups and Gabor-type unitary systems introduced in [12].

In the last section we consider the case of wavelets obtained from a multiresolution analysis. It is known (see [6]) that, unless some restrictions are imposed on the low-pass filter that starts the MRA construction, the wavelets obtained do not form an orthonormal basis but a normalized tight frame. Since such frames can be dilated to orthonormal bases, a natural question is whether the dilation preserves the multiresolution structure. The answer is affirmative and it is given in Theorem 5.2 and more concretely in Theorem 5.3. In this way we obtain "wavelets" in a Hilbert space bigger than $L^2(\mathbb{R})$.

2. Positive Definite Maps and Representations

We begin this section with a general result of Kolmogorov ([9, 10]). Then we consider several structures and show how to obtain representations from this general theorem.

Definition 2.1. Let $X$ be a nonempty set. We say that a map $K: X \times X \to \mathbb{C}$ is positive definite, and denote this by $0 \le K$, if
\[ \sum_{i,j=1}^{n} K(x_i, x_j)\,\xi_i \overline{\xi_j} \ge 0, \qquad (n \in \mathbb{N},\ x_i \in X,\ \xi_i \in \mathbb{C} \text{ for all } i \in \{1, \dots, n\}). \]
Theorem 2.2 (Kolmogorov's Theorem). If $K: X \times X \to \mathbb{C}$ is positive definite then there exists a Hilbert space $H_K$ and a map $v_K: X \to H_K$ such that the linear span of $\{v_K(x) \mid x \in X\}$ is dense in $H_K$ and
\[ \langle v_K(x) \mid v_K(y) \rangle = K(x, y), \qquad (x, y \in X). \]
Moreover, $H_K$ and $v_K$ are unique up to unitary isomorphisms.

Remark 2.3. Kolmogorov's theorem is valid also for operator-valued positive definite maps and in this form it can be applied for the Stinespring construction and the Naimark–Sz.-Nagy dilation. For details consult [10] and [9]. In this paper, for the application to wavelets and Gabor frames, we will need only the more particular version of Kolmogorov's theorem that we mentioned before.

Definition 2.4. If $K: X \times X \to \mathbb{C}$ is positive definite then we call $[H_K, v_K]$ the representation associated to $K$.

We note that Kolmogorov's theorem is purely set theoretic; there is no structure on $X$. We expect that, if $X$ has some additional structure on it and if we assume some compatibility between the positive definite map $K$ and this structure, then the representation associated to $K$ will also be in agreement with the structure of $X$. In the next examples we will see that this is indeed the case and we review the technique in the case of C*-algebras and groups.
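
For a finite set $X$, the representation $[H_K, v_K]$ of Theorem 2.2 can be computed explicitly. The following is a minimal sketch, not taken from the paper: it assumes the Gram matrix of $K$ on the chosen points is strictly positive definite, so that a Cholesky factorization $K = LL^*$ exists, and it takes the rows of $L$ as the vectors $v_K(x)$. The kernel $K(x, y) = e^{x\bar{y}}$, the sample points and all function names are illustrative choices.

```python
import cmath

def inner(u, v):
    """Inner product <u | v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def kolmogorov_vectors(points, kernel):
    """Return vectors v(x) with <v(x) | v(y)> = kernel(x, y).

    Uses a Cholesky factorization K = L L^*, so it assumes that the Gram
    matrix of `kernel` on `points` is strictly positive definite.
    """
    n = len(points)
    gram = [[complex(kernel(x, y)) for y in points] for x in points]
    low = [[0j] * n for _ in range(n)]
    for j in range(n):
        pivot = gram[j][j] - sum(abs(low[j][k]) ** 2 for k in range(j))
        if pivot.real <= 0:
            raise ValueError("Gram matrix is not strictly positive definite")
        low[j][j] = complex(pivot.real ** 0.5)
        for i in range(j + 1, n):
            s = sum(low[i][k] * low[j][k].conjugate() for k in range(j))
            low[i][j] = (gram[i][j] - s) / low[j][j]
    # Row i of L represents v(points[i]) in C^n.
    return {x: low[i] for i, x in enumerate(points)}

def exp_kernel(x, y):
    """Illustrative positive definite kernel K(x, y) = exp(x * conj(y))."""
    return cmath.exp(x * y.conjugate())

points = [0j, 1 + 0j, 1j]
v = kolmogorov_vectors(points, exp_kernel)
max_error = max(abs(inner(v[x], v[y]) - exp_kernel(x, y))
                for x in points for y in points)
print(max_error)
```

Here `max_error` measures how well the computed vectors reproduce the kernel. Any other factorization $K = WW^*$ would give a unitarily equivalent family of vectors, in line with the uniqueness statement.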
Example 2.5 (C*-Algebras and the GNS Construction). We consider now the case when $X = A$ is a C*-algebra and prove that we can obtain the well-known GNS construction from Kolmogorov's theorem.

Theorem 2.6 (The GNS Construction). If $A$ is a C*-algebra and $\varphi$ is a positive linear functional on $A$, then there exists a representation $\pi$ of $A$ on a Hilbert space $H$, that has a cyclic vector $\xi_0 \in H$ such that
\[ \langle \pi(x)\xi_0 \mid \xi_0 \rangle = \varphi(x), \qquad (x \in A). \]

Proof. The idea is to define $K: A \times A \to \mathbb{C}$ by
\[ K(x, y) = \varphi(y^* x), \qquad (x, y \in A). \]
We can use Kolmogorov's theorem to obtain the Hilbert space $H_K$ and the map $v_K: A \to H_K$. For a fixed $x \in A$, define the operator $\pi(x)$ as follows:
\[ \pi(x)(v_K(y)) = v_K(xy), \qquad (y \in A), \]
and extend by linearity. Then everything checks out.

Example 2.7 (Groups and Unitary Representations). Take $X = G$ a group. We call $K: G \times G \to \mathbb{C}$ a group positive definite map if $0 \le K$ and
\[ K(x, y) = K(zx, zy), \qquad (x, y, z \in G). \]
We note that such a positive definite map $K$ is uniquely determined by its restriction $\phi(x) = K(x, 1)$ and $\phi$ is a function of positive type (see [11]). The proof of Theorem 2.8 will show how the well-known correspondence between functions of positive type and unitary representations of groups can be regarded as a consequence of Kolmogorov's theorem.

Theorem 2.8. Let $G$ be a group and $K$ a group positive definite map on $G$. Then there exists a unitary representation $\pi_K$ of $G$ on a Hilbert space $H_K$ with a cyclic vector $\xi_0 \in H_K$ such that
\[ \langle \pi_K(x)\xi_0 \mid \pi_K(y)\xi_0 \rangle = K(x, y), \qquad (x, y \in G). \]

Proof. The proof works exactly as in the case of C*-algebras: consider $[H_K, v_K]$ the representation associated to $K$ by Kolmogorov's theorem. Define the operators $\pi_K(x)$ for $x \in G$ as follows:
\[ \pi_K(x)(v_K(y)) = v_K(xy), \qquad (x, y \in G), \]
and extend by linearity.

Remark 2.9. Note that in the proof of Theorem 2.8 we used the representation associated to $K$ and we see that, when $K$ is a group positive definite map, this
representation has the unitary representation $\pi_K$ attached to it. The same observation can be made for the GNS construction: the representation of the C*-algebra is attached to the representation $v_K$. This confirms our expectation: when the positive definite map has some compatibility with the existing structure on $X$, this compatibility projects a nice structure on the associated representation $[H_K, v_K]$. This is the idea that we use throughout this section.

Example 2.10 (Gabor-Type Unitary Systems). We recall that a Gabor system is associated to two positive constants $a, b > 0$ and a function $g \in L^2(\mathbb{R})$ and is defined by
\[ g_{m,n}(\xi) = e^{2\pi i m b \xi} g(\xi - na), \qquad (\xi \in \mathbb{R}). \]
The Gabor systems are one of the major subjects in the study of frames and wavelet theory. If we define the unitary operators $U, V$ on $L^2(\mathbb{R})$,
\[ (Uf)(\xi) = e^{2\pi i b \xi} f(\xi), \qquad (f \in L^2(\mathbb{R})), \]
\[ (Vf)(\xi) = f(\xi - a), \qquad (f \in L^2(\mathbb{R})), \]
then $g_{m,n} = U^m V^n g$, $(m, n \in \mathbb{Z})$, and $U$ and $V$ satisfy the relation
\[ UV = e^{2\pi i ab} VU. \]
Following [12], if $U$ and $V$ are unitary operators on a Hilbert space $H$ that verify the relation
\[ UV = \lambda VU \]
for some unimodular scalar $\lambda$, we then call $\{U^m V^n \mid m, n \in \mathbb{Z}\}$ a Gabor-type unitary system. We will prove that these systems fit into our general framework and we construct representations for them.

Theorem 2.11. Suppose $\lambda$ is a unimodular scalar and $K: \mathbb{Z}^2 \times \mathbb{Z}^2 \to \mathbb{C}$ is a positive definite map satisfying
\[ K((m + 1, n), (m' + 1, n')) = K((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}), \tag{2.1} \]
\[ K((m, n + 1), (m', n' + 1)) = \lambda^{m - m'} K((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}). \tag{2.2} \]
Then, on the Hilbert space $H_K$, there are unitaries $U, V$ and a vector $\xi_0 \in H_K$ such that
\[ UV = \lambda VU, \tag{2.3} \]
\[ \{U^m V^n \xi_0 \mid m, n \in \mathbb{Z}\} \text{ is dense in } H_K, \tag{2.4} \]
\[ \langle U^m V^n \xi_0 \mid U^{m'} V^{n'} \xi_0 \rangle = K((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}). \tag{2.5} \]
Moreover, this representation is unique up to unitary isomorphism.
Proof. Let $v_K: \mathbb{Z}^2 \to H_K$ be the representation associated to $K$. Define the operators $U$ and $V$ as follows:
\[ U(v_K(m, n)) = v_K(m + 1, n), \qquad (m, n \in \mathbb{Z}), \]
\[ V(v_K(m, n)) = \lambda^{-m} v_K(m, n + 1), \qquad (m, n \in \mathbb{Z}), \]
and then extend by linearity. We check that $U, V$ are well defined and isometric. Take $a_i \in \mathbb{C}$, $(m_i, n_i) \in \mathbb{Z}^2$, $(i \in \{1, \dots, p\})$.
\begin{align*}
\Big\langle V\Big(\sum_{i=1}^{p} a_i v_K(m_i, n_i)\Big) \,\Big|\, V\Big(\sum_{i=1}^{p} a_i v_K(m_i, n_i)\Big) \Big\rangle
&= \Big\langle \sum_{i=1}^{p} a_i \lambda^{-m_i} v_K(m_i, n_i + 1) \,\Big|\, \sum_{i=1}^{p} a_i \lambda^{-m_i} v_K(m_i, n_i + 1) \Big\rangle \\
&= \sum_{i,j=1}^{p} a_i \overline{a_j}\, \lambda^{-m_i} \overline{\lambda^{-m_j}}\, K((m_i, n_i + 1), (m_j, n_j + 1)) \\
&= \sum_{i,j=1}^{p} a_i \overline{a_j}\, K((m_i, n_i), (m_j, n_j)) \\
&= \Big\langle \sum_{i=1}^{p} a_i v_K(m_i, n_i) \,\Big|\, \sum_{i=1}^{p} a_i v_K(m_i, n_i) \Big\rangle.
\end{align*}
A similar calculation shows that $U$ is well defined and isometric. Since the linear span of the vectors $v_K(m, n)$ is dense in $H_K$, we can extend $U$ and $V$ to unitaries on $H_K$. Next, we check (2.3). Take $(m, n) \in \mathbb{Z}^2$.
\begin{align*}
UV v_K(m, n) &= U(\lambda^{-m} v_K(m, n + 1)) = \lambda^{-m} v_K(m + 1, n + 1) \\
&= \lambda V(v_K(m + 1, n)) = \lambda VU(v_K(m, n)),
\end{align*}
and (2.3) follows by density. Also, note that, if $\xi_0 = v_K(0, 0)$, then
\[ U^m V^n \xi_0 = U^m V^n v_K(0, 0) = U^m v_K(0, n) = v_K(m, n), \qquad (m, n \in \mathbb{Z}). \]
This will imply (2.4) and (2.5). The uniqueness is a consequence of the uniqueness part of Kolmogorov's theorem.

Remark 2.12. Any Gabor-type unitary system $U$, $V$ on a Hilbert space $H$, that has a vector $\xi_0 \in H$ with the property that the linear span of
\[ \{U^m V^n \xi_0 \mid m, n \in \mathbb{Z}\} \]
is dense in $H$, gives rise to a positive definite map $K$ on $\mathbb{Z}^2$ that satisfies (2.1) and (2.2) as follows:
\[ K((m, n), (m', n')) = \langle U^m V^n \xi_0 \mid U^{m'} V^{n'} \xi_0 \rangle, \qquad (m, n, m', n' \in \mathbb{Z}). \]
(2.1) and (2.2) are just immediate consequences of the fact that $U$ and $V$ are unitary and $UV = \lambda VU$.
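
Remark 2.12 can be checked numerically on a finite-dimensional model. The sketch below is an illustration under assumptions that are not in the paper: it replaces $L^2(\mathbb{R})$ by $\mathbb{C}^N$, takes $(Uf)(k) = \lambda^k f(k)$ and $(Vf)(k) = f(k - 1 \bmod N)$ with $\lambda = e^{2\pi i/N}$, and uses an arbitrary vector `xi0`. It then verifies $UV = \lambda VU$ together with the relations (2.1) and (2.2) for the map $K$ of the remark.

```python
import cmath

N = 5                                   # dimension of the model space C^N (assumption)
LAM = cmath.exp(2j * cmath.pi / N)      # unimodular scalar lambda

def apply_UmVn(m, n, f):
    """Return U^m V^n f, with (U f)(k) = LAM**k * f(k) and (V f)(k) = f(k - 1 mod N)."""
    return [LAM ** (m * k) * f[(k - n) % N] for k in range(N)]

def inner(u, v):
    """Inner product <u | v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

xi0 = [1, 2j, -1, 0.5, 3 - 1j]          # arbitrary illustrative vector

def K(p, q):
    """K((m, n), (m', n')) = <U^m V^n xi0 | U^m' V^n' xi0>, as in Remark 2.12."""
    (m, n), (m2, n2) = p, q
    return inner(apply_UmVn(m, n, xi0), apply_UmVn(m2, n2, xi0))

# Commutation relation U V = lambda V U, tested on xi0.
uv = apply_UmVn(1, 1, xi0)
vu = apply_UmVn(0, 1, apply_UmVn(1, 0, xi0))
commutation_error = max(abs(a - LAM * b) for a, b in zip(uv, vu))

# Relations (2.1) and (2.2) on a small box of indices.
idx = range(-2, 3)
error_21 = max(abs(K((m + 1, n), (m2 + 1, n2)) - K((m, n), (m2, n2)))
               for m in idx for n in idx for m2 in idx for n2 in idx)
error_22 = max(abs(K((m, n + 1), (m2, n2 + 1)) - LAM ** (m - m2) * K((m, n), (m2, n2)))
               for m in idx for n in idx for m2 in idx for n2 in idx)
print(commutation_error, error_21, error_22)
```

All three errors should vanish up to rounding. The pair $(U, V)$ here is the familiar clock-and-shift pair, chosen only because it is the simplest finite-dimensional instance of the relation $UV = \lambda VU$.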
Example 2.13 (Wavelet Representations). We recall briefly some facts about wavelet representations. Wavelet theory deals with two unitary operators $U$ and $T$ on $L^2(\mathbb{R})$, corresponding to the integer $N \ge 2$ called the scale:
\[ Uf(x) = \frac{1}{\sqrt{N}} f\Big(\frac{x}{N}\Big), \qquad Tf(x) = f(x - 1), \qquad (x \in \mathbb{R},\ f \in L^2(\mathbb{R})). \]
A wavelet is a function $\psi \in L^2(\mathbb{R})$ such that
\[ \{U^m T^n \psi \mid m, n \in \mathbb{Z}\} \]
is an orthonormal basis for $L^2(\mathbb{R})$. One way to construct wavelets is by multiresolutions and scaling functions (see [6]). Scaling functions satisfy equations of the form
\[ U\varphi = \sum_{k \in \mathbb{Z}} a_k T^k \varphi, \tag{2.6} \]
where $a_k$ are complex coefficients. The scaling equation can be reformulated using representations. There is a representation of $L^\infty(\mathbb{T})$ ($\mathbb{T}$ is the unit circle) on $L^2(\mathbb{R})$ given by
\[ \widehat{\pi(f)\xi} = f\hat{\xi}, \qquad (f \in L^\infty(\mathbb{T}),\ \xi \in L^2(\mathbb{R})) \]
($\hat{\xi}$ denotes the Fourier transform of $\xi$ and functions on $\mathbb{T}$ are identified with $2\pi$-periodic functions on $\mathbb{R}$). Using this representation, (2.6) can be rewritten as
\[ U\varphi = \pi(m_0)\varphi, \]
where $m_0(e^{-i\theta}) = \sum_{k \in \mathbb{Z}} a_k e^{-ik\theta}$ is called a low-pass filter. Also, the representation satisfies
\[ U\pi(f)U^{-1} = \pi(f(z^N)), \qquad (f \in L^\infty(\mathbb{T})). \]
$(U, \pi, L^2(\mathbb{R}), \varphi)$ is called the wavelet representation with scaling function $\varphi$. Wavelet theory has shown a strong interconnection between properties of the scaling function $\varphi$ and spectral properties of the transfer operator associated to the low-pass filter $m_0$:
\[ R_{m_0,m_0} f(z) = \frac{1}{N} \sum_{w^N = z} |m_0|^2(w) f(w), \qquad (z \in \mathbb{T},\ f \in L^1(\mathbb{T})), \]
where $\mathbb{T}$ is endowed with the normalized Haar measure. For more information on this we refer the reader to [2]. In particular, functions that are harmonic with respect to $R_{m_0,m_0}$, i.e. $R_{m_0,m_0} h = h$, play an important role in the theory. We recall here a theorem from [14] which establishes the link between functions which are harmonic with respect to $R_{m_0,m_0}$ and wavelet representations, because it is another particularized instance of Kolmogorov's theorem.

Theorem 2.14. If $m_0 \in L^\infty(\mathbb{T})$ is non-singular (i.e. it does not vanish on a set of positive measure) and $h \in L^1(\mathbb{T})$, $h \ge 0$, satisfies
\[ R_{m_0,m_0} h = h, \]
then there exists a Hilbert space $H_h$, a representation $\pi_h$ of $L^\infty(\mathbb{T})$ on $H_h$, a unitary $U_h$ on $H_h$ and a vector $\varphi_h \in H_h$ such that
\[ \overline{\operatorname{span}}\{U_h^{-n} \pi_h(f)\varphi_h \mid n \in \mathbb{N},\ f \in L^\infty(\mathbb{T})\} = H_h; \]
\[ U_h \pi_h(f) U_h^{-1} = \pi_h(f(z^N)), \qquad (f \in L^\infty(\mathbb{T})); \]
\[ U_h \varphi_h = \pi_h(m_0)\varphi_h; \]
\[ \langle \pi_h(f)\varphi_h \mid \varphi_h \rangle = \int_{\mathbb{T}} f h \, d\mu. \]
Moreover, this is unique up to unitary equivalence.

Proof. We give here only a sketch of the proof that uses Kolmogorov's theorem; the rest are calculations which can be found in [14]. Let
\[ X = \{(f, n) \mid f \in L^\infty(\mathbb{T}),\ n \in \mathbb{N}\}. \]
We want to define a positive definite map $K$ on $X$ such that in the end $v_K(f, n) = U_h^{-n} \pi_h(f)\varphi_h$. Then, we must have
\begin{align*}
K((f, n), (g, m)) &= \langle U_h^{-n} \pi_h(f)\varphi_h \mid U_h^{-m} \pi_h(g)\varphi_h \rangle = \langle U_h^{m} \pi_h(f)\varphi_h \mid U_h^{n} \pi_h(g)\varphi_h \rangle \\
&= \big\langle \pi_h\big(f(z^{N^m}) m_0^{(m)}(z)\big)\varphi_h \,\big|\, \pi_h\big(g(z^{N^n}) m_0^{(n)}(z)\big)\varphi_h \big\rangle \\
&= \int_{\mathbb{T}} f(z^{N^m}) m_0^{(m)}(z)\, \overline{g(z^{N^n}) m_0^{(n)}(z)}\, h \, d\mu,
\end{align*}
where $m_0^{(m)}(z) = m_0(z) m_0(z^N) \cdots m_0(z^{N^{m-1}})$. So we have to define, for $(f, n), (g, m) \in X$,
\[ K((f, n), (g, m)) = \int_{\mathbb{T}} f(z^{N^m}) m_0^{(m)}(z)\, \overline{g(z^{N^n}) m_0^{(n)}(z)}\, h \, d\mu. \]
$K$ can be checked to be positive definite so it induces a representation $(H_h, v_h)$, according to Kolmogorov's theorem. Then, define $\varphi_h = v_h(1, 0)$,
\[ U_h v_h(f, 0) = v_h(f(z^N) m_0, 0), \qquad (f \in L^\infty(\mathbb{T})), \]
\[ U_h v_h(f, n) = v_h(f, n - 1), \qquad (n \ge 1,\ f \in L^\infty(\mathbb{T})), \]
and extend by linearity and density. Define
\[ \pi_h(f) v_h(g, n) = v_h(f(z^{N^n}) g(z), n), \qquad (f, g \in L^\infty(\mathbb{T}),\ n \in \mathbb{N}), \]
and extend by linearity and density. Everything can be checked, as the reader may see in [14].
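
As a concrete instance of the harmonicity condition $R_{m_0,m_0} h = h$ in Theorem 2.14, the following sketch evaluates the transfer operator pointwise. The choice of the Haar low-pass filter $m_0(z) = (1 + z)/\sqrt{2}$ with $N = 2$ and of the constant function $h = 1$ is an illustrative assumption, not an example worked in the paper; for this filter $|m_0(w)|^2 + |m_0(-w)|^2 = 2$, so $h = 1$ is expected to be harmonic.

```python
import cmath

def nth_roots(z, N):
    """All N solutions w of w**N == z, for z != 0."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / N), (theta + 2 * cmath.pi * k) / N)
            for k in range(N)]

def transfer(m0, f, z, N):
    """(R_{m0,m0} f)(z) = (1/N) * sum over w**N == z of |m0(w)|^2 * f(w)."""
    return sum(abs(m0(w)) ** 2 * f(w) for w in nth_roots(z, N)) / N

def haar_filter(z):
    """Haar low-pass filter, i.e. a_0 = a_1 = 1/sqrt(2) in (2.6) (illustrative choice)."""
    return (1 + z) / 2 ** 0.5

def one(z):
    """The constant function h = 1 on the unit circle."""
    return 1.0

samples = [cmath.exp(1j * t) for t in (0.0, 0.7, 2.0, 3.1, 5.5)]
values = [transfer(haar_filter, one, z, 2) for z in samples]
deviation = max(abs(val - 1.0) for val in values)
print(deviation)
```

The printed deviation from 1 should be at rounding level. Replacing `one` by another candidate `h` gives a quick pointwise test of whether it can be harmonic for a given filter.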
3. Intertwining Operators

In the previous section we saw how positive definite maps induce representations on Hilbert spaces. Now we will show that intertwining operators can be constructed in a similar way from maps $L: X \times X \to \mathbb{C}$ which satisfy some boundedness condition. We will also see that, when $X$ has some structure on it and $L$ is compatible with this structure, then the intertwining operator induced by $L$ will be compatible with the extra structure existing on the induced representations, i.e. the operator is indeed intertwining. The format of this section is similar to the format of the previous one. We begin with a general, set theoretic result and then particularize it to various structures to obtain more information.

Definition 3.1. Consider two positive definite maps $K, K': X \times X \to \mathbb{C}$ and $L: X \times X \to \mathbb{C}$ (not necessarily positive definite). We say that $L$ is bounded with respect to $K$ and $K'$ if there is a constant $c > 0$ such that
\[ \Big| \sum_{i=1}^{m} \sum_{j=1}^{n} L(x_i, y_j)\,\xi_i \overline{\eta_j} \Big|^2 \le c \Big( \sum_{i,i'=1}^{m} K(x_i, x_{i'})\,\xi_i \overline{\xi_{i'}} \Big) \Big( \sum_{j,j'=1}^{n} K'(y_j, y_{j'})\,\eta_j \overline{\eta_{j'}} \Big) \tag{3.1} \]
for all $x_i, y_j \in X$, $\xi_i, \eta_j \in \mathbb{C}$, $i \in \{1, \dots, m\}$, $j \in \{1, \dots, n\}$. We denote this by $L^2 \le cKK'$.

Theorem 3.2. Suppose $X$ is a nonempty set and $K, K'$ are positive definite maps on $X$. If $L: X \times X \to \mathbb{C}$ and $L^2 \le cKK'$ for some $c > 0$, then there exists a unique bounded linear operator $S: H_K \to H_{K'}$ such that
\[ \langle S v_K(x) \mid v_{K'}(y) \rangle = L(x, y), \qquad (x, y \in X). \tag{3.2} \]
($(H_K, v_K)$, $(H_{K'}, v_{K'})$ are the representations induced by $K$ and $K'$ respectively, according to Kolmogorov's theorem.) Moreover, $\|S\| \le \sqrt{c}$. Conversely, if $S: H_K \to H_{K'}$ is a bounded linear operator, then there is a unique map $L: X \times X \to \mathbb{C}$ with $L^2 \le \|S\|^2 KK'$ that satisfies (3.2).

Proof. Define $B$ on the linear spans of $\{v_K(x)\}$ and $\{v_{K'}(y)\}$ as follows: for $x_i, y_j \in X$, $\xi_i, \eta_j \in \mathbb{C}$,
\[ B\Big( \sum_{i=1}^{m} \xi_i v_K(x_i), \sum_{j=1}^{n} \eta_j v_{K'}(y_j) \Big) = \sum_{i=1}^{m} \sum_{j=1}^{n} \xi_i \overline{\eta_j}\, L(x_i, y_j). \]
Because $L^2 \le cKK'$, we have
\begin{align*}
\Big| B\Big( \sum_{i=1}^{m} \xi_i v_K(x_i), \sum_{j=1}^{n} \eta_j v_{K'}(y_j) \Big) \Big|^2
&\le c \Big( \sum_{i,i'=1}^{m} K(x_i, x_{i'})\,\xi_i \overline{\xi_{i'}} \Big) \Big( \sum_{j,j'=1}^{n} K'(y_j, y_{j'})\,\eta_j \overline{\eta_{j'}} \Big) \\
&= c\, \Big\| \sum_{i=1}^{m} \xi_i v_K(x_i) \Big\|_{H_K}^2 \Big\| \sum_{j=1}^{n} \eta_j v_{K'}(y_j) \Big\|_{H_{K'}}^2 .
\end{align*}
This shows that $B$ is a well-defined bounded sesquilinear map which can be extended (by the density properties of $v_K$ and $v_{K'}$) to a bounded sesquilinear map $B: H_K \times H_{K'} \to \mathbb{C}$. Then there exists a bounded linear operator $S: H_K \to H_{K'}$ such that $\|S\| \le \sqrt{c}$ and
\[ B(v_1, v_2) = \langle S v_1 \mid v_2 \rangle, \qquad (v_1 \in H_K,\ v_2 \in H_{K'}). \]
In particular, one obtains (3.2). The uniqueness is clear because the spans of $\{v_K(x) \mid x \in X\}$ and $\{v_{K'}(y) \mid y \in X\}$ are dense. The converse is also easy: one needs to check that the map $L$ defined by (3.2) satisfies $L^2 \le \|S\|^2 KK'$, but this is a consequence of Schwarz's inequality.

Definition 3.3. We call the operator $S$ associated to $L$ in Theorem 3.2 the intertwining operator associated to $L$.

We will also be interested in subrepresentations and in the commutant of a representation. In these instances we will work with only one positive definite map $K$. We give here a definition which will be appropriate for these situations.

Definition 3.4. Consider $K, K'$, two positive definite maps on a nonempty set $X$ and a constant $c > 0$. We denote $K' \le cK$ if, for all $x_i \in X$ and $\xi_i \in \mathbb{C}$, $(i \in \{1, \dots, n\})$,
\[ \sum_{i,j=1}^{n} K'(x_i, x_j)\,\xi_i \overline{\xi_j} \le c \sum_{i,j=1}^{n} K(x_i, x_j)\,\xi_i \overline{\xi_j}. \]
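Proposition 3.5. If $K$ and $K'$ are positive definite maps and $c > 0$ then $K' \le cK$ if and only if $K'^2 \le c^2 KK$.

Proof. Suppose $K' \le cK$. Take $x_i, y_j \in X$, $\xi_i, \eta_j \in \mathbb{C}$.
\begin{align*}
\Big| \sum_{i=1}^{m} \sum_{j=1}^{n} K'(x_i, y_j)\,\xi_i \overline{\eta_j} \Big|^2
&= \Big| \Big\langle \sum_{i=1}^{m} \xi_i v_{K'}(x_i) \,\Big|\, \sum_{j=1}^{n} \eta_j v_{K'}(y_j) \Big\rangle \Big|^2 \\
&\le \Big\| \sum_{i=1}^{m} \xi_i v_{K'}(x_i) \Big\|_{H_{K'}}^2 \Big\| \sum_{j=1}^{n} \eta_j v_{K'}(y_j) \Big\|_{H_{K'}}^2 \\
&= \Big( \sum_{i,i'=1}^{m} K'(x_i, x_{i'})\,\xi_i \overline{\xi_{i'}} \Big) \Big( \sum_{j,j'=1}^{n} K'(y_j, y_{j'})\,\eta_j \overline{\eta_{j'}} \Big) \\
&\le c^2 \Big( \sum_{i,i'=1}^{m} K(x_i, x_{i'})\,\xi_i \overline{\xi_{i'}} \Big) \Big( \sum_{j,j'=1}^{n} K(y_j, y_{j'})\,\eta_j \overline{\eta_{j'}} \Big).
\end{align*}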
Hence $K'^2 \le c^2 KK$. Conversely, if $K'^2 \le c^2 KK$ then just take $m = n$, $x_i = y_i$, $\xi_i = \eta_i$ in (3.1) to obtain exactly $K' \le cK$.

Corollary 3.6. Suppose $K$ is positive definite on $X$. Then, for every positive definite map $K'$ with $K' \le cK$ for some $c > 0$, there exists a unique positive operator $S: H_K \to H_K$ with
\[ \langle S v_K(x) \mid v_K(y) \rangle = K'(x, y), \qquad (x, y \in X). \tag{3.3} \]
Moreover, $\|S\| \le c$. Conversely, for every positive operator $S$ on $H_K$ there is a unique positive definite map $K'$ on $X$ that satisfies (3.3). In addition, $K' \le \|S\| K$.

Proof. Using Proposition 3.5 and Theorem 3.2, we find an operator $S$ on $H_K$ that satisfies (3.3) and $\|S\| \le c$. $S$ is positive because
\[ \Big\langle S\Big( \sum_{i=1}^{n} \xi_i v_K(x_i) \Big) \,\Big|\, \sum_{i=1}^{n} \xi_i v_K(x_i) \Big\rangle = \sum_{i,j=1}^{n} \xi_i \overline{\xi_j}\, K'(x_i, x_j) \ge 0 \]
and $\{v_K(x) \mid x \in X\}$ span a dense subspace of $H_K$. For the converse, when $S$ is given, Theorem 3.2 shows that there is a $K'$ satisfying (3.3) and $K'^2 \le \|S\|^2 KK$. $K'$ is positive because $S$ is, and Proposition 3.5 implies $K' \le \|S\| K$.
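
The converse direction of Corollary 3.6 lends itself to a quick numerical sanity check. The sketch below is an illustration under assumptions that are not in the paper: $H_K$ is modelled by $\mathbb{C}^3$, the vectors $v_K(x_i)$ are random, and $S$ is a positive diagonal matrix, so that $\|S\|$ is simply its largest diagonal entry. It defines $K'$ through (3.3) and tests both positivity of $K'$ and the inequality $K' \le \|S\| K$ of Definition 3.4 on random coefficient vectors.

```python
import random

def inner(u, v):
    """Inner product <u | v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

random.seed(0)
DIM, POINTS = 3, 5

def random_complex():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

# vecs[i] plays the role of v_K(x_i) in H_K = C^3 (hypothetical data).
vecs = [[random_complex() for _ in range(DIM)] for _ in range(POINTS)]

diag = [0.2, 1.0, 2.5]          # S = diag(0.2, 1.0, 2.5) is a positive operator
norm_S = max(diag)              # operator norm of a positive diagonal matrix

def S(u):
    return [d * a for d, a in zip(diag, u)]

def K(i, j):
    return inner(vecs[i], vecs[j])

def K_prime(i, j):
    return inner(S(vecs[i]), vecs[j])       # relation (3.3)

def quadratic_form(kernel, xi):
    """sum_{i,j} kernel(x_i, x_j) * xi_i * conj(xi_j); real for these kernels."""
    total = sum(kernel(i, j) * xi[i] * xi[j].conjugate()
                for i in range(POINTS) for j in range(POINTS))
    return total.real

smallest_form = float("inf")    # positivity of K'
smallest_gap = float("inf")     # ||S|| * K-form minus K'-form
for _ in range(200):
    xi = [random_complex() for _ in range(POINTS)]
    form_K_prime = quadratic_form(K_prime, xi)
    gap = norm_S * quadratic_form(K, xi) - form_K_prime
    smallest_form = min(smallest_form, form_K_prime)
    smallest_gap = min(smallest_gap, gap)

print(smallest_form >= -1e-9, smallest_gap >= -1e-9)
```

Random sampling of this kind cannot prove the inequality; it only fails to find a counterexample, which is what the corollary predicts.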
In the remainder of this section we apply Theorem 3.2 to the situations when $X$ has some additional structure on it and see how the intertwining operators are in compliance with the extra structure of the representations.

Example 3.7 (C*-Algebras). Consider now $X = A$, a C*-algebra. We saw in Example 2.5 that, when the positive definite map $K: A \times A \to \mathbb{C}$ is given by a positive functional $\varphi: A \to \mathbb{C}$,
\[ K(x, y) = \varphi(y^* x), \qquad (x, y \in A), \]
then the representation induced by $K$ has the GNS construction attached to it. We want to see for what functions $L: A \times A \to \mathbb{C}$ the associated intertwining operator will intertwine the GNS representations.

Theorem 3.8. Let $A$ be a C*-algebra and $\varphi, \varphi'$ two positive functionals on $A$. Suppose that $\varphi_0: A \to \mathbb{C}$ is linear and $\varphi_0^2 \le c\varphi\varphi'$ for some $c > 0$ in the sense that
\[ |\varphi_0(y^* x)|^2 \le c\,\varphi(x^* x)\,\varphi'(y^* y), \qquad (x, y \in A). \tag{3.4} \]
Then there exists a unique bounded operator $S: H_\varphi \to H_{\varphi'}$ such that
\[ S\pi_\varphi(x) = \pi_{\varphi'}(x) S, \qquad (x \in A), \tag{3.5} \]
\[ \langle S\pi_\varphi(x)\xi_0 \mid \pi_{\varphi'}(y)\xi_0' \rangle = \varphi_0(y^* x), \qquad (x, y \in A). \tag{3.6} \]
(Here $(H_\varphi, \pi_\varphi, \xi_0)$ and $(H_{\varphi'}, \pi_{\varphi'}, \xi_0')$ are the GNS representations associated to $\varphi$ and $\varphi'$ respectively (see Theorem 2.6).) Moreover, $\|S\| \le \sqrt{c}$. Conversely, if
$S: H_\varphi \to H_{\varphi'}$ is a bounded operator that satisfies (3.5) then there is a unique linear map $\varphi_0: A \to \mathbb{C}$ that satisfies (3.6). In addition (3.4) holds with $c = \|S\|^2$.

Proof. Let $K_\varphi, K_{\varphi'}: A \times A \to \mathbb{C}$,
\[ K_\varphi(x, y) = \varphi(y^* x), \qquad K_{\varphi'}(x, y) = \varphi'(y^* x), \qquad (x, y \in A). \]
Recall that $H_\varphi = H_{K_\varphi}$, $H_{\varphi'} = H_{K_{\varphi'}}$, $\pi_\varphi(x)\xi_0 = v_{K_\varphi}(x)$, $\pi_{\varphi'}(x)\xi_0' = v_{K_{\varphi'}}(x)$ (see the proof of Theorem 2.6). Define $L(x, y) = \varphi_0(y^* x)$ for $x, y \in A$. Then (3.4) implies $L^2 \le cK_\varphi K_{\varphi'}$. Theorem 3.2 gives an operator $S$ with
\[ \langle S v_{K_\varphi}(x) \mid v_{K_{\varphi'}}(y) \rangle = L(x, y), \qquad (x, y \in A); \]
then one checks that $S$ satisfies all the requirements.

As a corollary we deduce a basic fact about positive operators in the commutant of the GNS representation (see [3]).

Corollary 3.9. Let $\varphi, \varphi'$ be two positive functionals on a C*-algebra $A$, $\varphi' \le c\varphi$ for some $c > 0$ (i.e. $\varphi'(x) \le c\varphi(x)$ for all positive $x \in A$). There exists a unique positive linear operator $S$ in the commutant of the GNS representation corresponding to $\varphi$ such that
\[ \langle S\pi_\varphi(x)\xi_0 \mid \pi_\varphi(y)\xi_0 \rangle = \varphi'(y^* x), \qquad (x, y \in A). \tag{3.7} \]
Conversely, for any positive operator $S$ in the commutant of $\pi_\varphi(A)$, there is a unique positive functional $\varphi'$ on $A$ such that (3.7) holds and $\varphi' \le \|S\|\varphi$.

Example 3.10 (Groups). Take now $X = G$ a group. We know from Theorem 2.8 that, if $K: G \times G \to \mathbb{C}$ is positive definite and satisfies
\[ K(x, y) = K(zx, zy), \qquad (x, y, z \in G), \]
then $K$ induces a unitary representation of $G$ on $H_K$. In the next theorem we look at operators that intertwine these representations.

Theorem 3.11. Suppose $G$ is a group and $K, K'$ are positive definite maps on $G$ satisfying
\[ K(x, y) = K(zx, zy), \qquad K'(x, y) = K'(zx, zy), \qquad (x, y, z \in G). \]
Let $L: G \times G \to \mathbb{C}$ with $L^2 \le cKK'$ for some $c > 0$. If
\[ L(x, y) = L(zx, zy), \qquad (x, y, z \in G), \tag{3.8} \]
then there is a unique operator $S: H_K \to H_{K'}$ such that
\[ S\pi_K(x) = \pi_{K'}(x) S, \qquad (x \in G), \tag{3.9} \]
\[ \langle S\pi_K(x)\xi_0 \mid \pi_{K'}(y)\xi_0' \rangle = L(x, y), \qquad (x, y \in G). \tag{3.10} \]
($(H_K, \pi_K, \xi_0)$, $(H_{K'}, \pi_{K'}, \xi_0')$ are the unitary representations of $G$ associated to $K$ and $K'$ respectively (see Theorem 2.8).) Moreover, $\|S\| \le \sqrt{c}$. Conversely, if $S: H_K \to H_{K'}$ satisfies (3.9), then there is a unique $L$ that satisfies (3.10). In addition $L$ satisfies (3.8) and $L^2 \le \|S\|^2 KK'$.

Proof. Recall that, if $v_K: G \to H_K$ and $v_{K'}: G \to H_{K'}$ are the representations associated to $K$ and $K'$ by Kolmogorov's theorem, then
\[ \pi_K(x)\xi_0 = v_K(x), \qquad \pi_{K'}(x)\xi_0' = v_{K'}(x), \qquad (x \in G) \]
(see the proof of Theorem 2.8). Theorem 3.2 implies the existence of an operator $S: H_K \to H_{K'}$ with $\|S\| \le \sqrt{c}$ and
\[ \langle S v_K(x) \mid v_{K'}(y) \rangle = L(x, y), \qquad (x, y \in G). \]
The rest follows.

Corollary 3.12. Let $K, K'$ be two positive definite maps on the group $G$ that satisfy
\[ K(x, y) = K(zx, zy), \qquad K'(x, y) = K'(zx, zy), \qquad (x, y, z \in G), \]
and $K' \le cK$ for some $c > 0$. Then there exists a unique positive operator $S$ on $H_K$ in the commutant of the unitary representation $\pi_K(G)$, such that
\[ \langle S\pi_K(x)\xi_0 \mid \pi_K(y)\xi_0 \rangle = K'(x, y), \qquad (x, y \in G). \tag{3.11} \]
Conversely, for every positive operator $S$ in the commutant of $\pi_K(G)$, there is a unique positive definite map $K'$ on $G$ that satisfies (3.11) and
\[ K'(x, y) = K'(zx, zy), \qquad (x, y, z \in G). \]

Proof. It is an immediate consequence of Theorem 3.11. It can also be proved from Corollary 3.6.

Example 3.13 (Gabor-Type Unitary Systems). We proved in Theorem 2.11 that, given a unimodular $\lambda \in \mathbb{C}$ and a positive definite map $K$ on $\mathbb{Z}^2$ that satisfies
\[ K((m + 1, n), (m' + 1, n')) = K((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}), \tag{3.12} \]
\[ K((m, n + 1), (m', n' + 1)) = \lambda^{m - m'} K((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}), \tag{3.13} \]
there is a Gabor-type unitary system on $H_K$ generated by two unitaries $U_K$ and $V_K$. As the reader probably expects, we look at the operators that intertwine these systems.

Theorem 3.14. Let $\lambda \in \mathbb{C}$, $|\lambda| = 1$ and $K, K'$ positive definite maps on $\mathbb{Z}^2$ satisfying the corresponding relations (3.12) and (3.13). Let $L: \mathbb{Z}^2 \times \mathbb{Z}^2 \to \mathbb{C}$ with the property that $L^2 \le cKK'$ for some $c > 0$. If $L$ satisfies the relations (3.12)
and (3.13) (with $K$ replaced by $L$, of course), then there is a unique operator $S: H_K \to H_{K'}$ such that
\[ SU_K = U_{K'} S, \qquad SV_K = V_{K'} S, \tag{3.14} \]
\[ \langle S U_K^m V_K^n \xi_0 \mid U_{K'}^{m'} V_{K'}^{n'} \xi_0' \rangle = L((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}). \tag{3.15} \]
($(U_K, V_K, \xi_0)$, $(U_{K'}, V_{K'}, \xi_0')$ are given by Theorem 2.11.) Moreover, $\|S\| \le \sqrt{c}$. Conversely, if $S: H_K \to H_{K'}$ satisfies (3.14), then there exists a unique $L: \mathbb{Z}^2 \times \mathbb{Z}^2 \to \mathbb{C}$ that verifies (3.15), and in addition $L$ will verify (3.12) and (3.13) too, and $L^2 \le \|S\|^2 KK'$.

Proof. We recall that $U_K^m V_K^n \xi_0 = v_K(m, n)$ and similarly for $K'$, $(m, n \in \mathbb{Z})$ (see the proof of Theorem 2.11). Theorem 3.2 shows that there is an operator $S: H_K \to H_{K'}$ with $\|S\| \le \sqrt{c}$ and such that (3.15) holds. We need to check (3.14). Take $m, n, m', n' \in \mathbb{Z}$ and compute:
\begin{align*}
\langle S U_K U_K^m V_K^n \xi_0 \mid U_{K'}^{m'} V_{K'}^{n'} \xi_0' \rangle &= L((m + 1, n), (m', n')), \\
\langle U_{K'} S U_K^m V_K^n \xi_0 \mid U_{K'}^{m'} V_{K'}^{n'} \xi_0' \rangle &= \langle S U_K^m V_K^n \xi_0 \mid U_{K'}^{m' - 1} V_{K'}^{n'} \xi_0' \rangle \\
&= L((m, n), (m' - 1, n')) = L((m + 1, n), (m', n')).
\end{align*}
The density of the linear spans of $\{U_K^m V_K^n \xi_0 \mid m, n \in \mathbb{Z}\}$ and $\{U_{K'}^{m'} V_{K'}^{n'} \xi_0' \mid m', n' \in \mathbb{Z}\}$ implies $SU_K = U_{K'} S$. A similar calculation shows that $SV_K = V_{K'} S$. The converse follows from Theorem 3.2: if $L$ is defined by (3.15), the only thing that remains to be verified is that $L$ satisfies (3.12) and (3.13), but this is a consequence of (3.14) and $U_K V_K = \lambda V_K U_K$, $U_{K'} V_{K'} = \lambda V_{K'} U_{K'}$.

Corollary 3.15. If $K, K'$ are positive definite maps on $\mathbb{Z}^2$ satisfying the relations (3.12) and (3.13) and $K' \le cK$ then there is a unique positive definite operator $S$ on $H_K$ that commutes with $U_K$ and $V_K$ and
\[ \langle S U_K^m V_K^n \xi_0 \mid U_K^{m'} V_K^{n'} \xi_0 \rangle = K'((m, n), (m', n')), \qquad (m, n, m', n' \in \mathbb{Z}). \tag{3.16} \]
Conversely, if $S$ is a positive operator that commutes with $U_K$ and $V_K$ then $K'$ defined by (3.16) satisfies (3.12) and (3.13).

Proof. The proof follows the same lines as before.

Remark 3.16. Theorem 3.2 gives us a general existence result for intertwining operators. The theorems that follow it answer the question of what conditions should be imposed on $L$ so that its associated operator $S$ intertwines the extra structure existing on $H_K$. We saw that for C*-algebras the necessary and sufficient condition is that $L(x, y) = \varphi_0(y^* x)$ for some linear $\varphi_0$, for groups we must have $L(x, y) = L(zx, zy)$ and for Gabor-type unitary systems, $L$ must satisfy the relations (3.12) and (3.13).
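
The group case of Remark 3.16 can be illustrated on a finite cyclic group. The sketch below rests on assumptions that are not in the paper: $G = \mathbb{Z}_N$ (written additively, so $zx$ becomes $z + x$) acts on $\mathbb{C}^N$ by cyclic shifts, $\xi_0 = \delta_0$ is the cyclic vector, and $S$ is a circulant matrix with arbitrarily chosen entries. It checks that $S$ commutes with the shifts and that $L(x, y) = \langle S\pi(x)\xi_0 \mid \pi(y)\xi_0 \rangle$ satisfies the invariance (3.8).

```python
N = 6
coeffs = [2, 1j, 0, -0.5, 0, 3]     # first column of a circulant S (arbitrary choice)

def shift(z, f):
    """(pi(z) f)(k) = f(k - z mod N): the action of z in Z_N on C^N."""
    return [f[(k - z) % N] for k in range(N)]

def S(f):
    """Circulant operator: (S f)(k) = sum_j coeffs[(k - j) mod N] * f(j)."""
    return [sum(coeffs[(k - j) % N] * f[j] for j in range(N)) for k in range(N)]

def inner(u, v):
    """Inner product <u | v>, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

xi0 = [1] + [0] * (N - 1)           # cyclic vector delta_0

def L(x, y):
    """L(x, y) = <S pi(x) xi0 | pi(y) xi0>, as in (3.10)."""
    return inner(S(shift(x, xi0)), shift(y, xi0))

# Invariance (3.8): L(x, y) = L(z + x, z + y) for all x, y, z in Z_N.
invariant = all(abs(L(x, y) - L(z + x, z + y)) < 1e-12
                for x in range(N) for y in range(N) for z in range(N))

# Intertwining relation (3.9) with K = K': S pi(z) = pi(z) S on a basis.
basis = [shift(x, xi0) for x in range(N)]
commutes = all(abs(a - b) < 1e-12
               for z in range(N) for e in basis
               for a, b in zip(S(shift(z, e)), shift(z, S(e))))
print(invariant, commutes)
```

Replacing $S$ by a matrix that is not circulant would be expected to break both properties at once, which is the equivalence expressed by Theorem 3.11.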
Example 3.17 (Intertwiners of Wavelet Representations). We mentioned in Example 2.13 and Theorem 2.14 how wavelet representations can be associated to positive functions $h \in L^1(\mathbb{T})$ with $R_{m_0,m_0} h = h$. In [7] and [8] we studied the operators that intertwine these representations. We now indicate how these can be connected to Kolmogorov's theorem: we recall the results from [7] and sketch the proof based on Theorem 3.2.
Given $h$ as in Theorem 2.14, call $(U_h, \pi_h, H_h, \varphi_h)$ the cyclic representation of $A_N$ associated to $h$. Also, define the transfer operator associated to a pair $m_0, m_0' \in L^\infty(\mathbb{T})$ by
\[ R_{m_0,m_0'} f(z) = \frac{1}{N} \sum_{w^N = z} m_0(w)\, \overline{m_0'(w)}\, f(w) , \quad (z \in \mathbb{T},\ f \in L^1(\mathbb{T})) . \]
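The transfer operator can be evaluated pointwise by summing over the $N$-th roots of $z$. The following is a minimal sketch, assuming the usual convention in which the second filter enters through its complex conjugate. The filter `stretched_haar` (scale $N = 2$) and the test function `h_candidate` are illustrative choices of ours; the helper names are hypothetical. For this filter both the constant function 1 and the non-constant function $|1 + z + z^2|^2/9$ come out as fixed points of $R_{m_0,m_0}$.

```python
import cmath
import math


def nth_roots(z, n):
    """All n solutions w of w**n == z, for a complex number z."""
    r, theta = cmath.polar(z)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * k) / n)
            for k in range(n)]


def transfer(m0, m0_prime, f, z, n):
    """R_{m0,m0'} f(z) = (1/N) * sum over w^N = z of m0(w) conj(m0'(w)) f(w)."""
    total = sum(m0(w) * complex(m0_prime(w)).conjugate() * f(w)
                for w in nth_roots(z, n))
    return total / n


def stretched_haar(z):
    """Illustrative low-pass filter m0(z) = (1 + z^3)/sqrt(2), scale N = 2."""
    return (1 + z ** 3) / math.sqrt(2)


def one(z):
    """The constant function 1 on the circle."""
    return 1.0


def h_candidate(z):
    """A non-constant test function, h(z) = |1 + z + z^2|^2 / 9."""
    return abs(1 + z + z * z) ** 2 / 9.0


if __name__ == "__main__":
    for theta in (0.0, 0.7, 2.0, math.pi):
        z = cmath.exp(1j * theta)
        err_one = abs(transfer(stretched_haar, stretched_haar, one, z, 2) - 1.0)
        err_h = abs(transfer(stretched_haar, stretched_haar, h_candidate, z, 2)
                    - h_candidate(z))
        print(theta, err_one, err_h)
```

Both printed errors should be at the level of floating-point rounding; the check is numerical at a few sample points, not a proof.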
Theorem 3.18. Let $m_0, m_0' \in L^\infty(\mathbb{T})$ be nonsingular and $h, h' \in L^1(\mathbb{T})$, $h, h' \ge 0$, $R_{m_0,m_0}(h) = h$, $R_{m_0',m_0'}(h') = h'$. Let $(U, \pi, H, \varphi)$, $(U', \pi', H', \varphi')$ be the cyclic representations corresponding to $h$ and $h'$ respectively. If $h_0 \in L^1(\mathbb{T})$, $R_{m_0,m_0'}(h_0) = h_0$ and $|h_0|^2 \le c\,h h'$ for some $c > 0$, then there exists a unique operator $S: H \to H'$ such that
\[ SU = U'S , \quad S\pi(f) = \pi'(f)S , \quad (f \in L^\infty(\mathbb{T})) , \tag{3.17} \]
\[ \langle S\pi(f)\varphi \mid \varphi' \rangle = \int_{\mathbb{T}} f h_0 \, d\mu , \quad (f \in L^\infty(\mathbb{T})) . \tag{3.18} \]
Moreover $\|S\| \le \sqrt{c}$. Conversely, if $S$ is an operator that satisfies (3.17), then there is a unique $h_0 \in L^1(\mathbb{T})$ with $R_{m_0,m_0'} h_0 = h_0$ such that (3.18) holds. Moreover, $|h_0|^2 \le \|S\|^2 h h'$.
Proof. Define $X$ as in the proof of Theorem 2.14. For all $(f, n), (g, m) \in X$, we want to obtain
\[ \begin{aligned} L((f, n), (g, m)) &= \langle S U^{-n} \pi(f)\varphi \mid U'^{-m} \pi'(g)\varphi' \rangle = \langle S U^{m} \pi(f)\varphi \mid U'^{n} \pi'(g)\varphi' \rangle \\ &= \langle S \pi(f(z^{N^m}) m_0^{(m)}(z))\varphi \mid \pi'(g(z^{N^n}) m_0'^{(n)}(z))\varphi' \rangle \\ &= \int_{\mathbb{T}} f(z^{N^m})\, m_0^{(m)}(z)\, \overline{g(z^{N^n})}\, \overline{m_0'^{(n)}(z)}\, h_0 \, d\mu . \end{aligned} \]
Keep the first and the last terms of the equality and this defines $L$. $L$ will give rise to $S$ by Theorem 3.2. For the details of the required computations, see [7]. The converse can also be obtained from Theorem 3.2, but here the generality of Theorem 3.2 is not really needed.
4. Frames and Dilations
Recall that a set $\{x_i \mid i \in I\}$ of vectors in a Hilbert space $H$ is called a frame if there are two constants $A, B > 0$ such that
\[ A\|f\|^2 \le \sum_{i \in I} |\langle f \mid x_i \rangle|^2 \le B\|f\|^2 , \quad (f \in H) . \]
If $A = B = 1$ the set $\{x_i \mid i \in I\}$ is called a normalized tight frame. Frames have been used extensively in applied mathematics for signal processing and data compression. They play a central role in wavelet theory and the analysis of Gabor systems. In [12] the normalized tight frames are interpreted as projections of orthonormal bases and it is proved there that Gabor-type normalized tight frames can be dilated to Gabor-type orthonormal bases, and normalized tight frames generated by groups can be dilated to orthonormal bases generated by the same group (see [12, Theorems 3.8 and 4.8]). We will revisit these theorems and show that they are immediate consequences of a general result which proves that any normalized tight frame can be dilated to an orthonormal basis in such a way that the extra structure that may exist is preserved under the dilation.
We begin with a proposition that establishes which positive definite maps give rise to normalized tight frames when represented on a Hilbert space.
Proposition 4.1. Let $K$ be a positive definite map on a set $X$. Then $\{v_K(x) \mid x \in X\}$ is a normalized tight frame if and only if for all $x_i \in X$, $\xi_i \in \mathbb{C}$ $(i \in \{1, \ldots, n\})$:
\[ \sum_{i,j=1}^{n} K(x_i, x_j)\, \xi_i \overline{\xi_j} = \sum_{x \in X} \Bigl| \sum_{i=1}^{n} K(x_i, x)\, \xi_i \Bigr|^2 . \tag{4.1} \]
Proof. If $\{v_K(x) \mid x \in X\}$ is a normalized tight frame then take $f = \sum_{i=1}^{n} \xi_i v_K(x_i)$. The fact that
\[ \|f\|^2 = \sum_{x \in X} |\langle f \mid v_K(x) \rangle|^2 \tag{4.2} \]
translates into (4.1). For the converse, we only need to verify (4.2) for $f$ in a dense subset of $H_K$ (see [13, Lemma 1.10]). Since the linear span of $\{v_K(x) \mid x \in X\}$ is dense in $H_K$, we can take $f = \sum_{i=1}^{n} \xi_i v_K(x_i)$ and (4.2) follows from (4.1).
Definition 4.2. A positive definite map $K$ on $X$ is called NTF if and only if $\{v_K(x) \mid x \in X\}$ is a normalized tight frame for $H_K$.
Before we prove our general result we note that, if $\delta: X \times X \to \mathbb{C}$ is defined by
\[ \delta(x, y) = \begin{cases} 1, & \text{if } x = y \\ 0, & \text{otherwise} \end{cases} \]
then $\delta$ is a positive definite map and $\{v_\delta(x) \mid x \in X\}$ is an orthonormal basis for $H_\delta$.
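As a concrete illustration of the frame identity and of condition (4.1), the sketch below uses three vectors in the plane at angles of 120 degrees, scaled by $\sqrt{2/3}$ (often called the Mercedes-Benz frame). This example does not appear in the text; it is our assumption for illustration, and all names are hypothetical. The kernel is $K(x_i, x_j) = \langle x_i \mid x_j \rangle$ on the index set $\{0, 1, 2\}$, and everything is real, so no conjugates are needed.

```python
import math

# Three vectors at 120 degrees, scaled so that the frame bounds are A = B = 1.
SCALE = math.sqrt(2.0 / 3.0)
FRAME = [(SCALE * math.cos(2 * math.pi * k / 3),
          SCALE * math.sin(2 * math.pi * k / 3)) for k in range(3)]


def inner(u, v):
    """Real inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))


def frame_energy(f):
    """Sum over the frame of |<f | x_i>|^2."""
    return sum(inner(f, x) ** 2 for x in FRAME)


# Gram kernel K(x_i, x_j) = <x_i | x_j>.
K = [[inner(xi, xj) for xj in FRAME] for xi in FRAME]


def lhs_41(coeffs):
    """Left side of (4.1): sum_{i,j} K(x_i, x_j) xi_i xi_j."""
    return sum(K[i][j] * coeffs[i] * coeffs[j]
               for i in range(3) for j in range(3))


def rhs_41(coeffs):
    """Right side of (4.1): sum_x (sum_i K(x_i, x) xi_i)^2."""
    return sum(sum(K[i][x] * coeffs[i] for i in range(3)) ** 2
               for x in range(3))


if __name__ == "__main__":
    f = (0.3, -1.7)
    print(frame_energy(f), inner(f, f))      # the two numbers should agree
    xi = [1.0, -2.0, 0.5]
    print(lhs_41(xi), rhs_41(xi))            # the two numbers should agree
```

In this finite case (4.1) says that the Gram matrix satisfies $K^2 = K$, which is the matrix form of the statement that the frame is normalized and tight.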
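Proposition 4.3. If $K$ is an NTF positive definite map on $X$ then $K \le \delta$.
Proof. By definition, $\{v_K(x) \mid x \in X\}$ is a normalized tight frame for $H_K$. Then, by [12, Proposition 1.1], there exists a Hilbert space $H$ containing $H_K$ as a subspace and an orthonormal basis $\{e(x) \mid x \in X\}$ such that, if $P$ is the projection onto $H_K$, then $P e(x) = v_K(x)$ for all $x \in X$. Now take $x_i \in X$, $\xi_i \in \mathbb{C}$ $(i \in \{1, \ldots, n\})$. Then
\[ \begin{aligned} \sum_{i,j=1}^{n} K(x_i, x_j)\, \xi_i \overline{\xi_j} &= \Bigl\langle \sum_{i=1}^{n} \xi_i v_K(x_i) \Bigm| \sum_{i=1}^{n} \xi_i v_K(x_i) \Bigr\rangle \\ &= \Bigl\langle P\Bigl( \sum_{i=1}^{n} \xi_i e(x_i) \Bigr) \Bigm| P\Bigl( \sum_{i=1}^{n} \xi_i e(x_i) \Bigr) \Bigr\rangle \\ &\le \|P\|^2 \Bigl\| \sum_{i=1}^{n} \xi_i e(x_i) \Bigr\|^2 = \sum_{i=1}^{n} |\xi_i|^2 \\ &= \sum_{i,j=1}^{n} \delta(x_i, x_j)\, \xi_i \overline{\xi_j} . \end{aligned} \]
Therefore $K \le \delta$.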
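The dilation used in this proof can be written down by hand in a small case. The sketch below is an illustration under our own assumptions, not the construction of [12]: it takes the three-vector frame in the plane (vectors at 120 degrees scaled by $\sqrt{2/3}$), appends a third coordinate $1/\sqrt{3}$ to each vector to obtain an orthonormal basis $e(x)$ of $\mathbb{R}^3$, and checks that projecting onto the first two coordinates returns the frame and that the quadratic form of $\delta - K$ is nonnegative. All names are hypothetical.

```python
import math

SCALE = math.sqrt(2.0 / 3.0)
# Normalized tight frame v_K(x), x in {0, 1, 2}, in the plane.
FRAME = [(SCALE * math.cos(2 * math.pi * k / 3),
          SCALE * math.sin(2 * math.pi * k / 3)) for k in range(3)]
# Candidate orthonormal basis e(x) of R^3: same vectors plus one extra coordinate.
BASIS = [(x, y, 1.0 / math.sqrt(3.0)) for (x, y) in FRAME]


def inner(u, v):
    """Real inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))


def project(e):
    """Orthogonal projection P of R^3 onto the first two coordinates."""
    return (e[0], e[1])


def kernel(i, j):
    """K(x_i, x_j) = <v_K(x_i) | v_K(x_j)>."""
    return inner(FRAME[i], FRAME[j])


def delta_minus_k_form(coeffs):
    """Quadratic form sum_{i,j} (delta(i, j) - K(i, j)) xi_i xi_j."""
    return sum(((1.0 if i == j else 0.0) - kernel(i, j)) * coeffs[i] * coeffs[j]
               for i in range(3) for j in range(3))


if __name__ == "__main__":
    # Gram matrix of BASIS should be the identity (orthonormal basis).
    for i in range(3):
        print([round(inner(BASIS[i], BASIS[j]), 12) for j in range(3)])
    # K <= delta: the form is nonnegative on sample coefficient vectors.
    print(delta_minus_k_form([1.0, -2.0, 0.5]), delta_minus_k_form([1.0, 1.0, 1.0]))
```

Here $P e(x) = v_K(x)$ holds by construction, so the inequality $K \le \delta$ is just the statement that a projection does not increase norms.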
Theorem 4.4. If $K$, $K'$ are NTF positive definite maps on a countable set $X$ with $K \le cK'$ for some $c > 0$, then there exists an isometry $W: H_K \to H_{K'}$ such that $W$ is induced by $K$, that is
\[ \langle W v_K(x) \mid v_{K'}(y) \rangle = K(x, y) , \quad (x, y \in X) , \]
the projection $P$ onto $W H_K$ is also induced by $K$, i.e.
\[ \langle P v_{K'}(x) \mid v_{K'}(y) \rangle = K(x, y) , \quad (x, y \in X) , \]
and $P v_{K'}(x) = W v_K(x)$.
Proof. Since $K \le cK'$, by Corollary 3.6, there exists a positive operator $S$ on $H_{K'}$ such that
\[ \langle S v_{K'}(x) \mid v_{K'}(y) \rangle = K(x, y) , \quad (x, y \in X) . \]
Since $S$ is positive, it has a positive square root $S^{1/2}$. Then
\[ \langle S^{1/2} v_{K'}(x) \mid S^{1/2} v_{K'}(y) \rangle = \langle S v_{K'}(x) \mid v_{K'}(y) \rangle = K(x, y) , \quad (x, y \in X) . \]
Take
\[ H = \overline{\mathrm{span}}\{ S^{1/2} v_{K'}(x) \mid x \in X \} . \]
By the uniqueness part of Kolmogorov's theorem, there is a unitary $W: H_K \to H$ such that
\[ W v_K(x) = S^{1/2} v_{K'}(x) , \quad (x \in X) . \]
But then $\{S^{1/2} v_{K'}(x) \mid x \in X\}$ is a normalized tight frame for $H$. Also, we know that $\{v_{K'}(x) \mid x \in X\}$ is a normalized tight frame for $H_{K'}$. So $S^{1/2}: H_{K'} \to H$ maps a normalized tight frame to a normalized tight frame, therefore it must be a co-isometry (see [12, Proposition 1.9]). It follows that $S^{1/2}(S^{1/2})^*: H \to H$ is the identity on $H$, so $S$ is the identity on $H$.
We also know that $\mathrm{range}(S^{1/2}) = \mathrm{range}(S)$. This implies that $S(Sv) = Sv$ for all $v \in H_{K'}$ and, as $S \ge 0$, $S$ is the projection onto $H$. Consequently, we also have $S = S^{1/2}$ and everything follows now by an easy computation:
\[ S v_{K'}(x) = S^{1/2} v_{K'}(x) = W v_K(x) , \quad (x \in X) , \]
\[ \langle W v_K(x) \mid v_{K'}(y) \rangle = \langle S v_{K'}(x) \mid v_{K'}(y) \rangle = K(x, y) , \quad (x, y \in X) . \]
Remark 4.5. Theorem 4.4 can be used to construct dilation theorems for unitary systems (the reader should have in mind the specific examples of groups and Gabor-type unitary systems). Recall some definitions from [12]. If $\mathcal{U}$ is a countable set of unitaries on a Hilbert space $H$, then $\xi \in H$ is called a complete wandering vector (complete normalized tight frame vector) if $\{U\xi \mid U \in \mathcal{U}\}$ is an orthonormal basis (normalized tight frame) for $H$.
A dilation theorem will take the following form: If $\mathcal{U}$ is a unitary system on a Hilbert space $H$ that has a complete normalized tight frame vector $\eta$, then there is a Hilbert space $H_1$ that contains $H$ and a unitary system $\mathcal{U}_1$ on $H_1$ such that $\mathcal{U}_1$ has a complete wandering vector $\xi$ and, if $P$ is the projection onto $H$, then $P\xi = \eta$, $P$ commutes with $\mathcal{U}_1$ and $U_1 \mapsto U_1|_H$ is an isomorphism of $\mathcal{U}_1$ onto $\mathcal{U}$.
The proof will be guided by the following steps:
(1) Construct $K: \mathcal{U} \times \mathcal{U} \to \mathbb{C}$, $K(x, y) = \langle x\eta \mid y\eta \rangle$; then $K$ is an NTF positive definite map and $H_K = H$, $v_K(x) = x\eta$ $(x \in \mathcal{U})$, and $\mathcal{U}$ is the extra structure $\mathcal{U}_K$ induced by $K$.
(2) Verify that $\delta: \mathcal{U} \times \mathcal{U} \to \mathbb{C}$ satisfies the required compatibility conditions with $\mathcal{U}$.
(3) Construct $H_\delta$, $v_\delta$ and the additional structure $\mathcal{U}_\delta$ with cyclic vector $\xi_\delta$, which is a complete wandering vector for $\mathcal{U}_\delta$.
(4) Since $K \le \delta$ (Proposition 4.3), according to Theorem 4.4 there is an isometry $W: H \to H_\delta$ which is induced by $K$; the projection $P$ onto $WH$ is also induced by $K$ and $P\xi_\delta = \eta$. As $K$ is compatible with the structure $\mathcal{U}$, $W$ will intertwine $\mathcal{U}$ and $\mathcal{U}_\delta$ and $P$ commutes with $\mathcal{U}_\delta$. So $WH$ is invariant for $\mathcal{U}_\delta$ and $W U W^{-1} = U_\delta|_H$ for all $U \in \mathcal{U}$ ($U_\delta$ is the unitary in $\mathcal{U}_\delta$ that corresponds to $U$ in the representation).
(5) Identify $H$ with $WH$ and everything will follow.
We will use the guidelines of Remark 4.5 to show how one can obtain the dilation Theorems 3.8 and 4.8 from [12] for groups and Gabor-type unitary systems.
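The steps above can be traced in the smallest non-trivial group case. The sketch below is an illustration under our own assumptions: the cyclic group $\mathbb{Z}_3$ acts on the plane by rotations through multiples of 120 degrees, with $\eta = (\sqrt{2/3}, 0)$ as complete normalized tight frame vector. The kernel $K(x, y) = \langle x\eta \mid y\eta \rangle$ is then translation invariant, $H_\delta$ is $\mathbb{C}^3$ with the cyclic shift as the structure induced by $\delta$, and the matrix of $K$ plays the role of the projection $P$. All names are hypothetical.

```python
import math


def rotate(v, k):
    """Action of the group element k in Z_3: rotation by 2*pi*k/3."""
    angle = 2 * math.pi * k / 3
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def inner(u, v):
    """Real inner product of two tuples."""
    return sum(a * b for a, b in zip(u, v))


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 3x3 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))


ETA = (math.sqrt(2.0 / 3.0), 0.0)                 # frame vector eta
ORBIT = [rotate(ETA, k) for k in range(3)]        # step (1): v_K(x) = x eta
K = [[inner(ORBIT[x], ORBIT[y]) for y in range(3)] for x in range(3)]
# Step (3): on H_delta = C^3 the group element 1 acts as the cyclic shift.
SHIFT = [[1.0 if i == (j + 1) % 3 else 0.0 for j in range(3)] for i in range(3)]

if __name__ == "__main__":
    # Invariance K(z + x, z + y) = K(x, y).
    print(all(abs(K[(x + 1) % 3][(y + 1) % 3] - K[x][y]) < 1e-12
              for x in range(3) for y in range(3)))
    # Step (4): the matrix of K is a projection commuting with the shift.
    print(close(matmul(K, K), K), close(matmul(K, SHIFT), matmul(SHIFT, K)))
```

Because $K$ is real and symmetric here, its matrix coincides with that of $P$; in the complex case one would have to keep track of the order of the arguments.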
Theorem 4.6 ([12]). Suppose $\mathcal{U}$ is a unitary group on $H$ with a complete normalized tight frame vector $\eta$. Then there is a Hilbert space $H_1$ containing $H$ and a unitary group $\mathcal{U}_1$ such that $\mathcal{U}_1$ has a complete wandering vector $\xi$, and if $P$ is the projection onto $H$ then $P$ commutes with $\mathcal{U}_1$, $P\xi = \eta$ and $U_1 \mapsto U_1|_H$ is an isomorphism of $\mathcal{U}_1$ onto $\mathcal{U}$. Consequently, $P U_1 \xi = U_1|_H \eta$ for all $U_1 \in \mathcal{U}_1$ (that is, the normalized tight frame $\{U\eta \mid U \in \mathcal{U}\}$ can be dilated to the orthonormal basis $\{U_1 \xi \mid U_1 \in \mathcal{U}_1\}$).
Proof. Define $K: \mathcal{U} \times \mathcal{U} \to \mathbb{C}$, $K(x, y) = \langle x\eta \mid y\eta \rangle$ for $x, y \in \mathcal{U}$. It is clear that $K$ is an NTF positive definite map with
\[ K(zx, zy) = K(x, y) , \quad (x, y, z \in \mathcal{U}) \tag{4.3} \]
and $H_K = H$, $v_K(x) = x\eta$, and the representation $\pi_K$ given by Theorem 2.8 is $\pi_K(x) = x$ for $x \in \mathcal{U}$. It is also clear that $\delta$ satisfies a relation of type (4.3), so it is compatible with the group structure and by Theorem 2.8 it induces a cyclic representation $(H_\delta, \pi_\delta, \xi_\delta)$ of $\mathcal{U}$ with $\xi_\delta = v_\delta(1)$ a complete wandering vector. By Proposition 4.3, $K \le \delta$. By Theorem 4.4 there is an isometry $W: H \to H_\delta$ which is induced by $K$, the projection $P$ onto $WH$ is also induced by $K$ and $P\xi_\delta = \eta$. Then, by Theorem 3.11, $W$ is intertwining, that is
\[ W x = \pi_\delta(x) W , \quad (x \in \mathcal{U}) , \]
and $P$ is in the commutant of $\pi_\delta(\mathcal{U})$. So $WH$ is invariant for all $\pi_\delta(x)$, $x \in \mathcal{U}$, and $W x W^{-1} = \pi_\delta(x)$ for $x \in \mathcal{U}$. Identify $H$ with $WH$ and define $\mathcal{U}_1 = \pi_\delta(\mathcal{U})$, $\xi = \xi_\delta$, and everything follows.
Theorem 4.7 ([12]). Let $\mathcal{U} = \{U^m V^n \mid m, n \in \mathbb{Z}\}$ be a Gabor-type unitary system associated to $\lambda$ on a Hilbert space $H$. Suppose $\mathcal{U}$ has a complete normalized tight frame vector $\eta \in H$. Then there is a Gabor-type unitary system $\mathcal{U}_1$ $(= \{U_1^m V_1^n \mid m, n \in \mathbb{Z}\})$ associated to $\lambda$ on a Hilbert space $H_1$ containing $H$, such that $\mathcal{U}_1$ has a complete wandering vector $\xi$ and, if $P$ is the projection onto $H$, then $P$ commutes with $U_1$ and $V_1$, $P\xi = \eta$ and $U = U_1|_H$, $V = V_1|_H$.
Proof. The proof is analogous to the proof of Theorem 4.6; the only difference is to verify that $\delta$ satisfies the compatibility relations (3.12) and (3.13), and this is trivial.
5. A Dilation Theorem for Wavelets
Let us recall the algorithm for the construction of compactly supported wavelets. For details we refer the reader to [6] for the scale $N = 2$ and to [1] for arbitrary scale $N$. One starts with the low-pass filter $m_0 \in L^2(\mathbb{T})$, which is a trigonometric polynomial that satisfies $m_0(1) = \sqrt{N}$ and the quadrature mirror filter condition
\[ \frac{1}{N} \sum_{w^N = z} |m_0|^2(w) = 1 , \quad (z \in \mathbb{T}) . \tag{5.1} \]
Then define the scaling function $\varphi \in L^2(\mathbb{R})$ by taking the inverse Fourier transform of
\[ \hat\varphi(x) = \prod_{k=1}^{\infty} \frac{m_0\bigl(\frac{x}{N^k}\bigr)}{\sqrt{N}} , \quad (x \in \mathbb{R}) . \tag{5.2} \]
To construct wavelets one needs the high-pass filters $m_1, \ldots, m_{N-1} \in L^\infty(\mathbb{T})$ such that the matrix
\[ \frac{1}{\sqrt{N}} \begin{pmatrix} m_0(z) & m_0(\rho z) & \cdots & m_0(\rho^{N-1} z) \\ m_1(z) & m_1(\rho z) & \cdots & m_1(\rho^{N-1} z) \\ \vdots & \vdots & \ddots & \vdots \\ m_{N-1}(z) & m_{N-1}(\rho z) & \cdots & m_{N-1}(\rho^{N-1} z) \end{pmatrix} \text{ is unitary for a.e. } z \in \mathbb{T} \tag{5.3} \]
($\rho = e^{2\pi i/N}$). When $N = 2$ a choice for $m_1$ can be
\[ m_1(z) = z\, \overline{m_0(-z)}\, f(z^2) , \quad (z \in \mathbb{T}) , \]
where $|f(z)| = 1$ on $\mathbb{T}$. The wavelets are defined as follows:
\[ \hat\psi_i(x) = \frac{m_i\bigl(\frac{x}{N}\bigr)}{\sqrt{N}}\, \hat\varphi\Bigl(\frac{x}{N}\Bigr) , \quad (x \in \mathbb{R},\ i \in \{1, \ldots, N-1\}) , \tag{5.4} \]
or, in terms of the wavelet representation,
\[ \psi_i = U^{-1} \pi(m_i) \varphi , \quad (i \in \{1, \ldots, N-1\}) . \tag{5.5} \]
It is known that, in order to achieve orthogonality, extra conditions must be imposed on $m_0$. If $R_{m_0,m_0}$ has only one continuous fixed point (up to a multiplicative constant), the set
\[ \{U^m T^n \psi_i \mid m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \]
is an orthonormal basis for $L^2(\mathbb{R})$. However, when this extra condition is not satisfied, one still gets good properties, namely, the fact that the above set is a normalized tight frame for $L^2(\mathbb{R})$. In the sequel, we show how one can dilate this normalized tight frame to an orthonormal basis in such a way that the multiresolution structure is preserved, so that "wavelets" in a space bigger than $L^2(\mathbb{R})$ are obtained.
We begin with a proposition that explains the multiresolution structure of the cyclic representations presented in Example 2.13. In the sequel we define $m_0 \in L^\infty(\mathbb{T})$ to be nonsingular if the set $\{z \in \mathbb{T} \mid m_0(z) = 0\}$ has zero measure and $|m_0|$ is not constant 1 a.e.
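The filter conditions (5.1) and (5.3) can be checked numerically for scale $N = 2$. The sketch below is a minimal illustration under our own assumptions: it uses the low-pass filter $m_0(z) = (1 + z^3)/\sqrt{2}$ and the high-pass choice $m_1(z) = z\,\overline{m_0(-z)}$ (that is, $f \equiv 1$, with the complex conjugate on $m_0(-z)$ as in the usual construction). The function names are hypothetical.

```python
import cmath
import math

N = 2                                   # scale
RHO = cmath.exp(2j * math.pi / N)       # primitive N-th root of unity


def m0(z):
    """Illustrative low-pass filter (1 + z^3)/sqrt(2)."""
    return (1 + z ** 3) / math.sqrt(2)


def m1(z):
    """High-pass filter z * conj(m0(-z)), i.e. the choice f = 1."""
    return z * m0(-z).conjugate()


def qmf_sum(z):
    """Left side of (5.1) for N = 2: average of |m0|^2 over the two square roots of z."""
    w = cmath.sqrt(z)
    return (abs(m0(w)) ** 2 + abs(m0(-w)) ** 2) / N


def filter_matrix(z):
    """The matrix in (5.3): rows indexed by filters, columns by rho^k z."""
    return [[filt(RHO ** k * z) / math.sqrt(N) for k in range(N)]
            for filt in (m0, m1)]


def is_unitary(matrix, tol=1e-9):
    """Check M M^* = I entrywise."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            entry = sum(matrix[i][k] * matrix[j][k].conjugate() for k in range(n))
            if abs(entry - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


if __name__ == "__main__":
    for theta in (0.1, 1.0, 2.5):
        z = cmath.exp(1j * theta)
        print(theta, qmf_sum(z), is_unitary(filter_matrix(z)))
```

The check only samples a few points of the circle; it is meant to make the two conditions tangible, not to replace the algebraic verification.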
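Proposition 5.1. Let $m_0 \in L^\infty(\mathbb{T})$ be nonsingular, $h \in L^1(\mathbb{T})$, $h \ge 0$ and $R_{m_0,m_0} h = h$. Let $(U_h, \pi_h, H_h, \varphi_h)$ be the cyclic representation associated to $h$. Define $T_h = \pi_h(z)$,
\[ V_0^h = \overline{\mathrm{span}}\{T_h^k \varphi_h \mid k \in \mathbb{Z}\} , \quad V_j^h = U_h^{-j} V_0^h , \quad (j \in \mathbb{Z}) . \]
Then
\[ U_h T_h U_h^{-1} = T_h^N , \tag{5.6} \]
\[ V_j^h \subset V_{j+1}^h , \quad (j \in \mathbb{Z}) , \tag{5.7} \]
\[ \overline{\bigcup_{j \in \mathbb{Z}} V_j^h} = H_h , \tag{5.8} \]
\[ \bigcap_{j \in \mathbb{Z}} V_j^h = \{0\} . \tag{5.9} \]
Assume $h = 1$. Then
\[ \{T_h^k \varphi_h \mid k \in \mathbb{Z}\} \text{ is an orthonormal basis for } V_0^h . \tag{5.10} \]
If $m_1, \ldots, m_{N-1}$ satisfy (5.3) and
\[ \psi_i^h = U_h^{-1} \pi_h(m_i) \varphi_h , \quad (i \in \{1, \ldots, N-1\}) , \tag{5.11} \]
then
\[ \{T_h^k \psi_i^h \mid k \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \text{ is an orthonormal basis for } V_1^h \ominus V_0^h \tag{5.12} \]
and
\[ \{U_h^m T_h^n \psi_i^h \mid m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \text{ is an orthonormal basis for } H_h . \tag{5.13} \]
Proof. (5.6) follows from $U_h \pi_h(f(z)) U_h^{-1} = \pi_h(f(z^N))$ with $f(z) = z$.
If $f(z) = \sum_{k=-p}^{p} a_k z^k$ is a trigonometric polynomial then
\[ \pi_h(f) \varphi_h = \sum_{k=-p}^{p} a_k T_h^k \varphi_h \in V_0^h . \]
Each $f \in L^\infty(\mathbb{T})$ is the pointwise limit of a uniformly bounded sequence of trigonometric polynomials, hence, by [7, Lemma 2.8], $\pi_h(f) \varphi_h \in V_0^h$. Then
\[ U_h \pi_h(f) \varphi_h = \pi_h(f(z^N)) U_h \varphi_h = \pi_h(f(z^N) m_0(z)) \varphi_h \in V_0^h , \]
so $V_{-1}^h \subset V_0^h$ and this implies (5.7). Also $U_h^{-n} \pi_h(f) \varphi_h \in V_n^h$ for all $f \in L^\infty(\mathbb{T})$ and $n \in \mathbb{Z}$, and (5.8) follows by density. (5.9) is proved in [14, Theorem 5.6].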
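If $h = 1$ then, for $k \in \mathbb{Z}$,
\[ \langle T_h^k \varphi_h \mid \varphi_h \rangle = \langle \pi_h(z^k) \varphi_h \mid \varphi_h \rangle = \int_{\mathbb{T}} z^k \, d\mu = \delta_{k,0} , \]
so (5.10) is valid. It remains to prove (5.12) because (5.13) follows from this immediately. The argument is essentially the one in [1, Theorem 10.1]. We include it here for completeness. For $k, l \in \mathbb{Z}$ and $i, j \in \{1, \ldots, N-1\}$ we have
\[ \begin{aligned} \langle T_h^k \psi_i^h \mid T_h^l \psi_j^h \rangle &= \langle \pi_h(z^k) U_h^{-1} \pi_h(m_i) \varphi_h \mid \pi_h(z^l) U_h^{-1} \pi_h(m_j) \varphi_h \rangle \\ &= \langle U_h^{-1} \pi_h(z^{Nk} m_i(z)) \varphi_h \mid U_h^{-1} \pi_h(z^{Nl} m_j(z)) \varphi_h \rangle \\ &= \int_{\mathbb{T}} z^{N(k-l)}\, m_i(z)\, \overline{m_j(z)} \, d\mu \\ &= \int_{\mathbb{T}} z^{k-l}\, \frac{1}{N} \sum_{w^N = z} m_i(w)\, \overline{m_j(w)} \, d\mu = \delta_{i,j}\, \delta_{k,l} ; \end{aligned} \]
for the last equality we used (5.3). So
\[ \{T_h^k \psi_i^h \mid k \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \]
is an orthonormal set. Take $U_h^{-1} \pi_h(m) \varphi_h \in V_1^h$, $m \in L^\infty(\mathbb{T})$ (vectors of this form are dense in $V_1^h$). Then $U_h^{-1} \pi_h(m) \varphi_h \perp V_0^h$ is equivalent to, for all $f \in L^\infty(\mathbb{T})$:
\[ \begin{aligned} 0 &= \langle U_h^{-1} \pi_h(m) \varphi_h \mid \pi_h(f) \varphi_h \rangle = \langle \pi_h(m) \varphi_h \mid \pi_h(f(z^N) m_0(z)) \varphi_h \rangle \\ &= \int_{\mathbb{T}} m(z)\, \overline{f(z^N)}\, \overline{m_0(z)} \, d\mu \\ &= \int_{\mathbb{T}} \frac{1}{N} \sum_{w^N = z} m(w)\, \overline{m_0(w)}\, \overline{f(z)} \, d\mu , \end{aligned} \]
which is equivalent to
\[ \frac{1}{N} \sum_{w^N = z} m(w)\, \overline{m_0(w)} = 0 \quad \text{a.e. on } \mathbb{T} . \]
This shows in particular that $\psi_i^h \perp V_0^h$ for all $i \in \{1, \ldots, N-1\}$. Also, the vector
\[ \vec{m}(z) = (m(z), m(\rho z), \ldots, m(\rho^{N-1} z)) \]
($\rho = e^{-2\pi i/N}$) must be perpendicular to the vector $\vec{m}_0(z) = (m_0(z), m_0(\rho z), \ldots, m_0(\rho^{N-1} z))$ for almost all $z$, so
\[ \vec{m}(z) = \sum_{k=1}^{N-1} \mu_k(z)\, \vec{m}_k(z) , \]
where $\mu_k(z) = \langle \vec{m}(z) \mid \vec{m}_k(z) \rangle$ (which shows that $\mu_k \in L^\infty(\mathbb{T})$).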
Since $\vec{m}(\rho z)$ is a circular permutation of $\vec{m}(z)$, it follows that we must have $\mu_k(\rho z) = \mu_k(z)$, that is $\mu_k(z) = \lambda_k(z^N)$ for some $\lambda_k \in L^\infty(\mathbb{T})$. Then
\[ m(z) = \sum_{k=1}^{N-1} \lambda_k(z^N)\, m_k(z) , \quad (z \in \mathbb{T}) , \]
and we compute
\[ U_h^{-1} \pi_h(m) \varphi_h = U_h^{-1} \pi_h\Bigl( \sum_{k=1}^{N-1} \lambda_k(z^N)\, m_k(z) \Bigr) \varphi_h = \sum_{k=1}^{N-1} \pi_h(\lambda_k)\, \psi_k^h , \]
and this shows that
\[ U_h^{-1} \pi_h(m) \varphi_h \in \overline{\mathrm{span}}\{T_h^k \psi_i^h \mid k \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \]
by an argument similar to the one used in the beginning of the proof (now for $\psi_i^h$ instead of $\varphi_h$). This completes the proof of (5.12).
Motivated by the discussion in the beginning of this section, we give a dilation theorem for wavelets. The theorem describes how one can dilate a normalized tight frame wavelet to an orthonormal wavelet in a bigger space.
Theorem 5.2. Let $m_0 \in L^\infty(\mathbb{T})$ be a nonsingular filter with $R_{m_0,m_0} 1 = 1$, $h \in L^\infty(\mathbb{T})$, $h \ge 0$, $R_{m_0,m_0} h = h$, and consider $(U_h, \pi_h, H_h, \varphi_h)$, the cyclic representation associated to $h$. Assume also that there are given filters $m_1, \ldots, m_{N-1} \in L^\infty(\mathbb{T})$ such that (5.3) holds, define $\psi_i^h$ as in (5.11) $(i \in \{1, \ldots, N-1\})$ and suppose
\[ \{U_h^m T_h^n \psi_i^h \mid m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \]
is a normalized tight frame for $H_h$ ($T_h = \pi_h(z)$). Then, if $(U_1, \pi_1, H_1, \varphi_1)$ is the cyclic representation associated to the constant function 1, there exists an isometry $W: H_h \to H_1$ with the following properties:
(i) $W U_h = U_1 W$, $W \pi_h(f) = \pi_1(f) W$ for all $f \in L^\infty(\mathbb{T})$;
(ii) If $P$ is the projection onto $W H_h$ then
\[ P U_1 = U_1 P , \quad P \pi_1(f) = \pi_1(f) P , \quad (f \in L^\infty(\mathbb{T})) ; \tag{5.14} \]
\[ P \varphi_1 = W \varphi_h . \tag{5.15} \]
(iii) If $\psi_i^1 = U_1^{-1} \pi_1(m_i) \varphi_1$, $i \in \{1, \ldots, N-1\}$, then
\[ \{U_1^m T_1^n \psi_i^1 \mid m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \text{ is an orthonormal basis for } H_1 , \tag{5.16} \]
where $T_1 = \pi_1(z)$;
\[ P \psi_i^1 = W \psi_i^h , \quad (i \in \{1, \ldots, N-1\}) . \tag{5.17} \]
Proof. The proof is similar to the one of Theorem 4.4 but some additional arguments are needed. Since $h \in L^\infty(\mathbb{T})$, we have $|h|^2 \le \|h\|_\infty^2 \cdot 1$ so, by Theorem 3.18, there is a positive operator $S$ on $H_1$ that commutes with $U_1$ and $\pi_1$ and
\[ \langle S \pi_1(f) \varphi_1 \mid \varphi_1 \rangle = \int_{\mathbb{T}} f h \, d\mu , \quad (f \in L^\infty(\mathbb{T})) . \]
$S$ has a positive square root $S^{1/2}$ that commutes with $U_1$ and $\pi_1$. Also the projection $P$ onto the range $H$ of $S^{1/2}$ must commute with $U_1$ and $\pi_1$. Then we can restrict $\pi_1$ and $U_1$ to $H$ and
\[ \langle \pi_1(f) S^{1/2} \varphi_1 \mid S^{1/2} \varphi_1 \rangle = \int_{\mathbb{T}} f h \, d\mu , \quad U_1 S^{1/2} \varphi_1 = \pi_1(m_0) S^{1/2} \varphi_1 . \]
The uniqueness part of Theorem 2.14 implies that there is a unitary $W$ from $H_h$ to $H$ with $W \varphi_h = S^{1/2} \varphi_1$, $W U_h = U_1 W$ and $W \pi_h(f) = \pi_1(f) W$ for $f \in L^\infty(\mathbb{T})$. From these commuting properties of $W$ and $P$ it follows that
\[ W(U_h^m T_h^n \psi_i^h) = S^{1/2}(U_1^m T_1^n \psi_i^1) , \quad (m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}) . \]
Hence, $S^{1/2}$ maps an orthonormal basis to a normalized tight frame so it must be a co-isometry. Then, proceeding as in the proof of Theorem 4.4, we get $S = S^{1/2} = P$ and everything follows.
When $m_0$ is a regular filter (we will give the precise meaning of that in a moment), the abstract cyclic representation associated to the constant function 1 can be described explicitly, so that we obtain a very concrete dilation theorem for non-orthogonal wavelets in $L^2(\mathbb{R})$. The construction is given in the next theorem and it is based on the results presented in [8]. Before we state the result, some definitions are needed.
A vector $(z_1, z_2, \ldots, z_p)$ is called an $m_0$-cycle if $z_1^N = z_2$, $z_2^N = z_3$, $\ldots$, $z_p^N = z_1$, the $z_i$ are distinct and $|m_0(z_i)| = \sqrt{N}$ for all $i \in \{1, \ldots, p\}$. For $f \in L^\infty(\mathbb{T})$ and $z_0 \in \mathbb{T}$ define $\alpha_{z_0}(f)(z) = f(z z_0)$ for $z \in \mathbb{T}$. For $n \in \mathbb{N}$,
\[ m_0^{(n)}(z) = m_0(z)\, m_0(z^N) \cdots m_0(z^{N^{n-1}}) , \quad (z \in \mathbb{T}) . \]
Theorem 5.3. Let $m_0$ be a Lipschitz function on $\mathbb{T}$ with finitely many zeroes, $m_0(1) = \sqrt{N}$, $R_{m_0,m_0} 1 = 1$. Let $C_j = (z_{1,j}, \ldots, z_{p_j,j})$ be the $m_0$-cycles, $j \in \{1, \ldots, n\}$, $m_0(z_{k,j}) = \sqrt{N} e^{i\theta_{k,j}}$ for all $k \in \{1, \ldots, p_j\}$, $j \in \{1, \ldots, n\}$, $\theta_j = \theta_{1,j} + \cdots + \theta_{p_j,j}$. For each $j \in \{1, \ldots, n\}$ define $H_j = L^2(\mathbb{R})^{p_j}$, $U_j: H_j \to H_j$,
\[ U_j(\xi_1, \ldots, \xi_{p_j}) = \bigl( e^{i\theta_{1,j}} U \xi_2, \ldots, e^{i\theta_{p_j-1,j}} U \xi_{p_j}, e^{i\theta_{p_j,j}} U \xi_1 \bigr) , \]
where $U\xi(x) = \frac{1}{\sqrt{N}} \xi\bigl(\frac{x}{N}\bigr)$ for $\xi \in L^2(\mathbb{R})$. For $f \in L^\infty(\mathbb{T})$,
\[ \pi_j(f)(\xi_1, \ldots, \xi_{p_j}) = \bigl( \pi(\alpha_{z_{1,j}}(f))(\xi_1), \ldots, \pi(\alpha_{z_{p_j,j}}(f))(\xi_{p_j}) \bigr) , \]
where $\pi$ is the representation on $L^2(\mathbb{R})$ defined in Example 2.13. Define
\[ \hat\varphi_{k,j}(x) = \prod_{l=1}^{\infty} \frac{e^{-i\theta_j}\, \alpha_{z_{k,j}}\bigl(m_0^{(p_j)}\bigr)\bigl(\frac{x}{N^{l p_j}}\bigr)}{\sqrt{N^{p_j}}} , \quad (k \in \{1, \ldots, p_j\}) , \]
\[ \varphi_j = (\varphi_{1,j}, \ldots, \varphi_{p_j,j}) . \]
Finally, define $H_0 = H_1 \oplus \cdots \oplus H_n$, $U_0 = U_1 \oplus \cdots \oplus U_n$, $\pi_0(f) = \pi_1(f) \oplus \cdots \oplus \pi_n(f)$ for $f \in L^\infty(\mathbb{T})$ and $\varphi_0 = \varphi_1 \oplus \cdots \oplus \varphi_n$. Then $(U_0, \pi_0, H_0, \varphi_0)$ is the cyclic representation associated to the constant function 1.
Also, if $C_1$ is the trivial $m_0$-cycle $C_1 = (1)$, then $H_1 = L^2(\mathbb{R})$, $U_1 = U$, $\pi_1 = \pi$ and
\[ \hat\varphi_1(x) = \prod_{l=1}^{\infty} \frac{m_0\bigl(\frac{x}{N^l}\bigr)}{\sqrt{N}} , \quad (x \in \mathbb{R}) , \]
so $(U_1, \pi_1, H_1, \varphi_1)$ is the usual wavelet representation on $L^2(\mathbb{R})$. If $T_j = \pi_j(z)$ for $j \in \{1, \ldots, n\}$, then
\[ T_j(\xi_1, \ldots, \xi_{p_j}) = (z_{1,j} T \xi_1, \ldots, z_{p_j,j} T \xi_{p_j}) , \]
where $T\xi(x) = \xi(x - 1)$ for $\xi \in L^2(\mathbb{R})$.
Assume $m_1, \ldots, m_{N-1} \in L^\infty(\mathbb{T})$ satisfy (5.3). Define $\psi_i^1 = U_1^{-1} \pi_1(m_i) \varphi_1$ $(\in L^2(\mathbb{R}))$, $\psi_i^0 = U_0^{-1} \pi_0(m_i) \varphi_0$, $i \in \{1, \ldots, N-1\}$, and let $P_1$ be the projection from $H_0$ onto $H_1$ and $T_0 = T_1 \oplus \cdots \oplus T_n$. Then
\[ P_1 U_0 = U_0 P_1 , \quad P_1 T_0 = T_0 P_1 , \quad P_1 \pi_0(f) = \pi_0(f) P_1 , \quad (f \in L^\infty(\mathbb{T})) ; \tag{5.18} \]
\[ U_0|_{H_1} = U_1 \,(= U) , \quad T_0|_{H_1} = T_1 \,(= T) , \quad \pi_0(f)|_{H_1} = \pi_1(f) \,(= \pi(f)) , \quad (f \in L^\infty(\mathbb{T})) ; \tag{5.19} \]
\[ P_1 \varphi_0 = \varphi_1 , \quad P_1 \psi_i^0 = \psi_i^1 , \quad (i \in \{1, \ldots, N-1\}) ; \tag{5.20} \]
\[ U_0 \varphi_0 = \pi_0(m_0) \varphi_0 , \quad U_1 \varphi_1 = \pi_1(m_0) \varphi_1 ; \tag{5.21} \]
\[ \{T_0^k \varphi_0 \mid k \in \mathbb{Z}\} \text{ is an orthonormal set} ; \tag{5.22} \]
\[ \{U_0^m T_0^n \psi_i^0 \mid m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \text{ is an orthonormal basis for } H_0 ; \tag{5.23} \]
\[ \{U_1^m T_1^n \psi_i^1 \mid m, n \in \mathbb{Z},\ i \in \{1, \ldots, N-1\}\} \text{ is a normalized tight frame for } H_1 = L^2(\mathbb{R}) . \tag{5.24} \]
Proof. Since the cyclic representation associated to the constant function 1 is given in [8], one needs only to take the inverse Fourier transform of the representation presented there to obtain the one described here. Then (5.18) and (5.19) follow trivially from the definition, (5.20) follows from the definition and the commuting properties of $P_1$, (5.21) is included in the definition of the cyclic representation,
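(5.22) and (5.23) are consequences of Proposition 5.1 and (5.24) (which is also well-known, see [6] or [1]) follows from the fact that the projection of an orthonormal basis is a normalized tight frame (see [12]).
Example 5.4. We apply Theorem 5.3 to the low-pass filter
\[ m_0(z) = \frac{1 + z^3}{\sqrt{2}} = \sqrt{2}\, e^{-\frac{3i\theta}{2}} \cos\Bigl(\frac{3\theta}{2}\Bigr) , \quad (z = e^{-i\theta} \in \mathbb{T}) , \]
which is known to give non-orthogonal wavelets. The scale is $N = 2$. Some short computations show that $m_0(1) = \sqrt{2}$, $R_{m_0,m_0} 1 = 1$. The $m_0$-cycles are
\[ C_1 = (z_{1,1} = 1) , \quad C_2 = (z_{2,1} = e^{2\pi i/3},\ z_{2,2} = e^{4\pi i/3}) , \]
$p_1 = 1$, $p_2 = 2$, $m_0(z_{1,1}) = m_0(z_{2,1}) = m_0(z_{2,2}) = \sqrt{2}$, so $\theta_{1,1} = \theta_{2,1} = \theta_{2,2} = 0$ and $\theta_1 = \theta_2 = 0$. Moreover,
\[ U_0: L^2(\mathbb{R})^3 \to L^2(\mathbb{R})^3 , \quad U_0(\xi_1, \xi_2, \xi_3) = (U\xi_1, U\xi_3, U\xi_2) , \]
\[ T_0: L^2(\mathbb{R})^3 \to L^2(\mathbb{R})^3 , \quad T_0(\xi_1, \xi_2, \xi_3) = (T\xi_1, e^{2\pi i/3} T\xi_2, e^{4\pi i/3} T\xi_3) . \]
Then, as $\alpha_{z_{1,1}}(m_0) = m_0$, $\alpha_{z_{2,1}}(m_0) = \alpha_{z_{2,2}}(m_0) = m_0$,
\[ \hat\varphi_{1,1}(x) = \prod_{l=1}^{\infty} e^{-\frac{3ix}{2^{l+1}}} \cos\Bigl(\frac{3x}{2^{l+1}}\Bigr) = e^{-\frac{3ix}{2}}\, \frac{\sin\frac{3x}{2}}{\frac{3x}{2}} , \quad (x \in \mathbb{R}) , \]
\[ \hat\varphi_{2,1}(x) = \prod_{l=1}^{\infty} \frac{m_0^{(2)}\bigl(\frac{x}{2^{2l}}\bigr)}{\sqrt{2^2}} = \prod_{l=1}^{\infty} \frac{m_0\bigl(\frac{x}{2^{2l}}\bigr)\, m_0\bigl(\frac{x}{2^{2l-1}}\bigr)}{\sqrt{2^2}} = \hat\varphi_{1,1}(x) , \]
and similarly for $\hat\varphi_{2,2}$. Hence, $\varphi_{1,1} = \varphi_{2,1} = \varphi_{2,2} =: \varphi = \frac{1}{3} \chi_{[0,3)}$, and $\varphi_0 = (\varphi, \varphi, \varphi)$. To construct the wavelet we can pick
\[ m_1(z) = \frac{1 - z^3}{\sqrt{2}} , \quad (z \in \mathbb{T}) . \]
Then the wavelet $\psi_0 = (\psi_1, \psi_2, \psi_3)$ is given by
\[ U_0 \psi_0 = \frac{1}{\sqrt{2}} (\varphi_0 - T_0^3 \varphi_0) , \]
so
\[ \psi_1 = \psi_2 = \psi_3 =: \psi = \frac{1}{3} \bigl( \chi_{[0,\frac{3}{2})} - \chi_{[\frac{3}{2},3)} \bigr) , \]
and $\{U_0^m T_0^n \psi_0 \mid m, n \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})^3$ which dilates the normalized tight frame $\{U^m T^n \psi \mid m, n \in \mathbb{Z}\}$ of $L^2(\mathbb{R})$.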
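Some of the claims in this example can be verified with a few lines of arithmetic. The sketch below is an illustration with hypothetical helper names. It checks the cycle conditions for $C_1$ and $C_2$ (without testing distinctness of the points), computes $\langle T^k \varphi \mid \varphi \rangle$ for $\varphi = \frac{1}{3}\chi_{[0,3)}$ from the overlap of the supports, and then evaluates $\langle T_0^k \varphi_0 \mid \varphi_0 \rangle$ for the dilated vector $\varphi_0 = (\varphi, \varphi, \varphi)$, using that $T_0$ multiplies the three components of $T\xi$ by $1$, $e^{2\pi i/3}$ and $e^{4\pi i/3}$.

```python
import cmath
import math

OMEGA = cmath.exp(2j * math.pi / 3)
CYCLE_POINTS = [1, OMEGA, OMEGA ** 2]     # z_{1,1}, z_{2,1}, z_{2,2}


def m0(z):
    """Low-pass filter of the example, (1 + z^3)/sqrt(2)."""
    return (1 + z ** 3) / math.sqrt(2)


def is_m0_cycle(points, n=2, tol=1e-12):
    """Check z_i^N = z_{i+1} (cyclically) and |m0(z_i)| = sqrt(N)."""
    p = len(points)
    maps_ok = all(abs(points[i] ** n - points[(i + 1) % p]) < tol
                  for i in range(p))
    modulus_ok = all(abs(abs(m0(z)) - math.sqrt(n)) < tol for z in points)
    return maps_ok and modulus_ok


def overlap(k):
    """<T^k phi | phi> for phi = (1/3) chi_[0,3): (1/9) * |[0,3) ∩ [k,k+3)|."""
    return max(0, 3 - abs(k)) / 9.0


def dilated_overlap(k):
    """<T_0^k phi_0 | phi_0>: each component contributes z_j^k <T^k phi | phi>."""
    return sum(z ** k for z in CYCLE_POINTS) * overlap(k)


if __name__ == "__main__":
    print(is_m0_cycle([1]), is_m0_cycle([OMEGA, OMEGA ** 2]))
    print([overlap(k) for k in range(-3, 4)])                 # not a delta sequence
    print([abs(dilated_overlap(k)) for k in range(-3, 4)])    # numerically delta_{k,0}
```

The translates of $\varphi$ alone are not orthogonal, since $\langle T\varphi \mid \varphi \rangle = 2/9$, while for $\varphi_0$ the cube roots of unity cancel these overlaps, in line with (5.22).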
Acknowledgments
The author wants to thank Professor Şerban Strătilă for pointing out the connection between the GNS construction and Kolmogorov's theorem. This was the starting point and the key idea of this paper. Also many thanks to Professor Palle Jorgensen for his suggestions and his constant support.
References
[1] O. Bratteli and P. E. T. Jorgensen, Isometries, shifts, Cuntz algebras and multiresolution wavelet analysis of scale N, Integral Equations Operator Theory 28 (1997) 382–443.
[2] O. Bratteli and P. E. T. Jorgensen, Wavelets Through a Looking Glass (Birkhäuser) to appear.
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. I (Springer-Verlag, 1987).
[4] T. Constantinescu, Schur Parameters, Factorization and Dilation Problems, Operator Theory Advances and Applications, Vol. 82 (Birkhäuser Verlag, 1996).
[5] T. Constantinescu, Representations of Hermitian kernels in Krein spaces, Publ. RIMS, Kyoto Univ. 33 (1997) 917–951.
[6] I. Daubechies, Ten Lectures on Wavelets, CBMS-NSF Regional Conf. Ser. in Appl. Math., Vol. 61 (Society for Industrial and Applied Mathematics, Philadelphia, 1992).
[7] D. Dutkay, Harmonic analysis of signed Ruelle transfer operators, J. Math. Anal. Appl. 273(2) (2002) 590–617.
[8] D. Dutkay, The wavelet Galerkin operator, J. Operator Theory 51 (2004) 49–70.
[9] D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras (Oxford Science Publications, 1998).
[10] D. E. Evans and J. T. Lewis, Dilations of irreversible evolutions in algebraic quantum theory, Commun. Dubl. Inst. Adv. Studies, Ser. A 24 (1977).
[11] G. Folland, A Course in Abstract Harmonic Analysis (CRC Press, 1995).
[12] D. Han and D. Larson, Frames, bases and group representations, Memoirs of the AMS, Sept. 2000, Vol. 147, No. 697.
[13] E. Hernandez and G. Weiss, A First Course on Wavelets (CRC Press, Inc., 1996).
[14] P. E. T. Jorgensen, Ruelle operators: Functions which are harmonic with respect to a transfer operator, Mem. Amer. Math. Soc. 152(720) (2001).
[15] W. M. Lawton, Necessary and sufficient conditions for constructing orthonormal wavelet bases, J. Math. Phys. 32 (1991) 57–61.
[16] D. Sarason, Generalized interpolation in $H^\infty$, Trans. Amer. Math. Soc. 127 (1967) 179–203.
[17] B. Sz.-Nagy and C. Foias, Dilatations des commutants d'opérateurs, C. R. Acad. Sci. Paris, Série A 266 (1968) 493–495.
[18] B. Sz.-Nagy and C. Foias, Harmonic Analysis of Operators on Hilbert Space (North Holland, Amsterdam-Budapest, 1970).
May 31, 2004 15:53 WSPC/148-RMP
00203
Reviews in Mathematical Physics Vol. 16, No. 4 (2004) 479–507
© World Scientific Publishing Company
CORE SYMMETRIES OF A FLOW
AKITAKA KISHIMOTO
Department of Mathematics, Hokkaido University, Sapporo, Japan 060-0810
Received 28 March 2003
Revised 23 February 2004
For a flow $\alpha$ on a $C^*$-algebra one defines a symmetry as the group of automorphisms $\gamma$ such that $\gamma\alpha\gamma^{-1}$ is a cocycle perturbation of $\alpha$. We propose to define a core of this symmetry, which acts trivially on the set of equivalence classes of KMS state representations, but may act non-trivially on the set of equivalence classes of covariant irreducible representations. In particular this core acts transitively on the set of those which induce faithful representations of the crossed product by $\alpha$.
Keywords: $C^*$-algebra; automorphism; asymptotically inner; ground state; cocycle perturbation.
1. Introduction
Let $\alpha$ be a flow on a $C^*$-algebra $A$, i.e., $\alpha$ is a homomorphism of $\mathbb{R}$ into the automorphism group $\mathrm{Aut}(A)$ of $A$ such that $t \mapsto \alpha_t(x)$ is continuous for all $x \in A$. We regard such a pair $(A, \alpha)$ as representing a physical system with $\alpha$ a time-flow.
A major motivation behind this work comes from the problem of symmetry-breaking; i.e., if $\omega_1$ and $\omega_2$ are distinct equilibrium states at some temperature (or ground states) for $(A, \alpha)$, the problem asks whether there is a symmetry $\gamma$ (i.e., an automorphism $\gamma$ of $A$ with $\gamma\alpha_t = \alpha_t\gamma$) such that $\omega_1\gamma = \omega_2$. Since we are not specifying $\alpha$ based on a model, if this is true then it should also be true for an inner perturbation of $\alpha$. Thus we should at least modify the problem as follows: if $\omega_1$ and $\omega_2$ are as above, is there an automorphism $\gamma$ of $A$ such that the flow $t \mapsto \gamma\alpha_t\gamma^{-1}$ is an inner perturbation of $\alpha$ and $\omega_1\gamma$ is equivalent to $\omega_2$ (i.e., $\pi_{\omega_1}\gamma$ is equivalent to $\pi_{\omega_2}$, or $\pi_{\omega_1}\gamma(x) \mapsto \pi_{\omega_2}(x)$ extends to an isomorphism of $\pi_{\omega_1}(A)''$ onto $\pi_{\omega_2}(A)''$, with $\pi_{\omega_i}$ the GNS representation associated with $\omega_i$)? Of course we know this cannot hold in such generality.
There are some results, obtained recently in [24, 8], which answer similar questions, e.g., if $\omega_1$ and $\omega_2$ are states of a simple separable $A$ such that $\pi_{\omega_1}(A)''$ and $\pi_{\omega_2}(A)''$ are isomorphic to an injective type III$_1$ factor (as expected for equilibrium states), there is an automorphism $\gamma$ of $A$ such that $\omega_1\gamma$ is equivalent to $\omega_2$, under a mild assumption on $A$ (see Corollary A.3 of the Appendix and [8]). As a matter of fact, results of this kind have been known for some time for UHF algebras and AF
algebras, which are general enough to cover many models in statistical mechanics (see [27] and [2]). It is the new proof that matters. This proof seems to enable us to tailor $\gamma$ as we like; so it raises the hope that we might challenge the problem of symmetry-breaking.
The aforementioned results do not involve $\alpha$. By using the additional fact that $\omega_1$ and $\omega_2$ are equilibrium states (or more precisely factorial KMS states) at some temperature, the problem is then whether we could impose an additional condition on $\gamma$ that $t \mapsto \gamma\alpha_t\gamma^{-1}$ (written as $\gamma\alpha\gamma^{-1}$) is a mild (if not inner) perturbation of $\alpha$. Although it looks desirable to find wilder perturbations than inner ones, which are yet mild enough to preserve the structure of equilibrium states, I know only cocycle perturbations, which are only slightly more general than inner ones. Hence we will stick to cocycle perturbations below. (If $u$ is a continuous map of $\mathbb{R}$ into the unitary group $U(A)$ of $A$ (or $A + \mathbb{C}1$) such that $u_{s+t} = u_s\alpha_s(u_t)$ for all $s, t \in \mathbb{R}$, then $t \mapsto u_t$ is called an $\alpha$-cocycle and $t \mapsto \mathrm{Ad}\, u_t\, \alpha_t$ is a flow, called a cocycle perturbation of $\alpha$.)
This ambition soon proved to be unattainable (at the moment) for two technical reasons. One is as follows: the only way I can conceive to make $\gamma$ satisfy that $\gamma\alpha\gamma^{-1}$ is a cocycle perturbation of $\alpha$ is to choose a sequence $(u_n)$ in $U(A)$ such that $\gamma$ is defined as the limit of $\mathrm{Ad}\, u_n$ and $u_n\alpha_t(u_n)^*$ converges in norm, say to $u_t$, uniformly in $t$ on every compact subset. (Then $t \mapsto u_t$ is an $\alpha$-cocycle and $\mathrm{Ad}\, u_t\, \alpha_t = \gamma\alpha_t\gamma^{-1}$.) In this case Fannes et al. [6] have shown that the symmetry $\gamma$ is never broken, i.e., $\omega_1\gamma$ is equivalent to $\omega_1$ itself (see below). So actually we are restricting ourselves to the core symmetries which are not relevant to the original problem of symmetry-breaking!
The other reason is: to define $\gamma$ we will have to find a sequence $(U_n)$ of unitaries in $M = \pi_{\omega_1}(A)''$, identified with $\pi_{\omega_2}(A)''$, such that $\mathrm{Ad}\, U_n\, \pi_{\omega_1}(x)$ converges to $\pi_{\omega_2}(x)$ in the strong operator topology for each $x \in A$ (see Theorem A.1 of the Appendix). Then we will try to replace $(U_n)$ by $(u_n)$ from $A$, in a sense, to define $\gamma$. For that purpose $(U_n)$ must behave well with respect to the two flows on $M$ induced by $\alpha$ through $\pi_{\omega_1}$ and $\pi_{\omega_2}$. Then we have at hand a famous result of Connes [5] saying that these flows on $M$ are a (weakly continuous) cocycle perturbation of each other. Although it appears that this result points to what I aim at, the discrepancy between the weak topology and the norm topology prevents me from pursuing along these lines. This is another reason why we do not consider equilibrium states but only ground states.
Thus we have ended up proving something, as in the abstract, which is only valid for ground states and in no way applicable to the symmetry-breaking of equilibrium states. I must say that the result is not really satisfactory for ground states either (see below for more details) but hope that this will be a starting point for a possible general theory of symmetry-breaking.
We will now describe the contents more precisely. In [3] we have defined the symmetry group $G_\alpha$ (the class of cocycle perturbations) of $\alpha$ as follows:
\[ G_\alpha = \{\gamma \in \mathrm{Aut}(A) \mid \gamma\alpha\gamma^{-1} \text{ is a cocycle perturbation of } \alpha\} . \]
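To make the notions of $\alpha$-cocycle and cocycle perturbation concrete, here is a short worked check in the simplest (inner) situation. It rests on assumptions of ours that are not part of the text: the flow is taken to be implemented by a self-adjoint element $H \in A$ (for instance $A = M_n$), and $P = P^* \in A$ is an arbitrary self-adjoint perturbation. The symbols $H$ and $P$ are introduced only for this illustration.

```latex
% Assumption (illustration only): \alpha_t = \mathrm{Ad}\, e^{itH} with H = H^* \in A, and P = P^* \in A.
% Candidate cocycle:
\[ u_t = e^{it(H+P)}\, e^{-itH} . \]
% Cocycle identity u_{s+t} = u_s \alpha_s(u_t):
\[ \begin{aligned}
u_s\, \alpha_s(u_t)
  &= e^{is(H+P)} e^{-isH} \cdot e^{isH} \bigl( e^{it(H+P)} e^{-itH} \bigr) e^{-isH} \\
  &= e^{i(s+t)(H+P)}\, e^{-i(s+t)H} = u_{s+t} .
\end{aligned} \]
% The perturbed flow is again inner, now generated by H + P:
\[ \mathrm{Ad}\, u_t\, \alpha_t(x)
   = e^{it(H+P)} e^{-itH}\, e^{itH} x\, e^{-itH}\, e^{itH} e^{-it(H+P)}
   = e^{it(H+P)}\, x\, e^{-it(H+P)} . \]
```

In this special case the cocycle perturbation is just the inner perturbation of the generator by $P$; the general definition in the text does not require the flow to be inner.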
Note that each γ ∈ Gα extends to an automorphism γ¯ of the crossed product A×α R by fixing an α-cocycle u with γαt γ −1 = Ad ut αt as follows: Z γ¯ (aλ(f )) = γ(a) f (t)ut λ(t)dt , a ∈ A, f ∈ Cc (R) , where λ denotes the canonical unitary flow in the multiplier algebra of A ×α R and Cc (R) denotes the algebra of continuous functions on R of compact support. (See [4, 26] for the definition of crossed products.) In some cases we can choose such an α-cocycle u for each γ ∈ Gα in a unique way so that γ 7→ γ¯ defines an isomorphism of Gα into Aut(A ×α R). We notice that even if γ is approximately inner, γ¯ need not be. We define a subset Gα0 of Gα as the set of γ ∈ Gα having a continuous map v of [0, ∞) into U(A) with v0 = 1 such that Ad vs (x) converges to γ(x) for x ∈ A as s → ∞ and vs αt (vs∗ ) converges, say to ut , uniformly in t on every compact subset of R as s → ∞; then it follows that u is an α-cocycle with γαt γ −1 = Ad ut αt and that γ¯ (defined by using the α-cocycle u) is asymptotically inner because γ¯ = lims→∞ Ad vs on A ×α R (since if a ∈ A and f ∈ Cc (R), then R R vs aλ(f )vs∗ = Ad vs (a) f (t)vs αt (vs∗ )λ(t)dt converges to γ(a) f (t)ut λ(t)dt in norm as s → ∞). (Since vs is only a multiplier unitary for A ×α R, we will need some work to choose a continuous function w: [0, ∞) → U(A ×α R) = U(A ×α R + C1) such that w0 = 1 and γ¯ = lims→∞ Ad ws ; we shall indicate the construction of w in Proposition B.1.) Note that Gα0 is a normal subgroup of Gα . We can regard Gα0 as a core of Gα because Gα0 is a subgroup of asymptotically inner automorphisms (which may be regarded as a core of the automorphism group) and at the same time each γ ∈ Gα0 extends to such an automorphism of A ×α R, and because Gα0 acts trivially on the set of equivalence classes of KMS state representations, by which we mean the following type of result due to M. Fannes et al. (see [6, 4, 19]). Proposition 1.1. 
Under the above situation let ω be a KMS state of (A, α) at an inverse temperature c ∈ R. Then ω is equivalent to ωγ for all γ ∈ Gα0.

A slight generalization of the above result will be given in Sec. 2. Let (π, U) be a representation of (A, α), i.e., π is a (non-degenerate) representation of A on a Hilbert space H and U is a unitary flow (or a strongly continuous one-parameter unitary group) on H such that Ad Ut π(x) = παt(x), x ∈ A. Then the pair (π, U) naturally induces a representation π × U of the crossed product A ×α R on H by (π × U)(aλ(f)) = π(a) ∫ f(t)Ut dt, a ∈ A, f ∈ Cc(R). In this note we shall show:

Theorem 1.2. Let α be a flow on a separable C*-algebra A. Let (π1, U1) and (π2, U2) be representations of (A, α) such that π1 and π2 are irreducible and Ran(πi × Ui) ∩ K(Hi) = {0} for i = 1, 2, where K(Hi) denotes the compact operators on Hi. Then the following conditions are equivalent:
A. Kishimoto
(i) Ker(π1 × U1) = Ker(π2 × U2).
(ii) There exists a γ ∈ Gα0 and a continuous map v of [0, ∞) into U(A) with v0 = 1 such that γ = lim_{s→∞} Ad vs and (π1 × U1)γ̄ is equivalent to π2 × U2, where γ̄ is the asymptotically inner automorphism of A ×α R defined by γ̄(x) = lim_{s→∞} Ad vs(x) for x ∈ A ×α R. (In particular π1γ is equivalent to π2.)

It is obvious that (ii) ⇒ (i), because γ̄ leaves every ideal invariant. We will show (i) ⇒ (ii) in Sec. 3. The range condition that Ran(πi × Ui) ∩ K(Hi) = {0} follows by requiring that α is sufficiently non-trivial, e.g., that the Connes spectrum of the flow on the quotient A/Ker(πi) induced by α is non-zero. (See [26] for the definition of Connes spectrum, which is essentially the essential spectrum of the flow.) Under this assumption, assuming Ker(πi) = {0}, for any UHF flow β on the CAR algebra B = ⊗_{1}^{∞} M2 with Sp(β) contained in the Connes spectrum in the theorem, there is a cocycle perturbation α′ of α and a closed α′-invariant projection e ∈ A** such that πi**(e)πi(A)πi**(e) is isomorphic to B under which the restriction of α′ corresponds to β and e is large in the sense that c(e)x = 0 implies x = 0 for x ∈ A, where c(e) is the central support of e in A**. This is a variation of Glimm's type theorem [10, 21]; a UHF flow means just a flow of infinite tensor product type. Hence it follows that the range of πi × Ui cut off by πi**(e) does not contain a non-zero compact operator.

We recall that the condition Ker(π1 × U1) = Ker(π2 × U2) can be expressed in terms of the spectra of Ui restricted to various invariant subspaces of the representation space Hi, which will be the condition actually used in the proof of the above theorem. Here we mean by the spectrum of Ui the spectrum of the unitary representation of R given by t ↦ Ui(t), i.e., the spectrum of the (self-adjoint) infinitesimal generator Hi of Ui: Ui(t) = e^{√−1 Hi t}.

Proposition 1.3.
Let (π1, U1) and (π2, U2) be representations of (A, α). Then the following conditions are equivalent:
(i) Ker(π1 × U1) = Ker(π2 × U2).
(ii) For any non-zero α-invariant hereditary C*-subalgebra B of A, it follows that Sp(U1|[π1(B)H1]) = Sp(U2|[π2(B)H2]).

Proof. Suppose that condition (ii) does not hold. Then there is a non-zero α-invariant hereditary C*-subalgebra B of A such that Sp(U1|[π1(B)H1]) ≠ Sp(U2|[π2(B)H2]). Then there is an f ∈ L1(R) such that U1(f) = ∫ U1(t)f(t)dt is non-zero on the U1-invariant subspace [π1(B)H1] but U2(f) is zero on [π2(B)H2], or vice versa. This means that Bλ(f) ⊄ Ker(π1 × U1) but Bλ(f) ⊂ Ker(π2 × U2), or vice versa. To show the other implication, we use the fact that for any closed ideal J of A ×α R the closed linear span of {λ(f)a ∈ J | a ∈ A, f ∈ L1(R)} equals J [15].
That λ(f)a ∈ Ker(πi × Ui) means that if B denotes the α-invariant hereditary C*-subalgebra generated by aa* (see below), then Ui(f) = 0 on [πi(B)Hi]. Hence we get that (ii) ⇒ (i).

If a ∈ A is non-zero, the smallest α-invariant hereditary C*-subalgebra B of A containing aa* is given as the closed linear span of αs(a)Aαt(a)*, s, t ∈ R; in this case the closed subspace [πi(B)Hi] is the closed linear span of Ui(t)πi(a)Hi, t ∈ R. In condition (ii) of the above proposition we only have to take those C*-subalgebras B. Note that if Sp(Ui|[πi(B)Hi]) = R for all those B, then Ker(πi × Ui) is generated by Ker(πi) [13].

In the course of the proof of the above theorem we will show other equivalent conditions:

Proposition 1.4. Let (π1, U1) and (π2, U2) be representations of (A, α) such that π1 and π2 are mutually disjoint irreducible representations. Suppose that Ran(πi × Ui) does not contain a non-zero compact operator. Then the following conditions are equivalent:
(i) Ker(π1 × U1) = Ker(π2 × U2).
(ii) For any ε > 0 there exists an h ∈ Asa with ‖h‖ < ε satisfying: Hi + πi(h) is diagonal with the same set S of eigenvalues for i = 1, 2, where Hi is given by Ui(t) = e^{√−1 Hi t}, and S1(p) = S2(p) for any p ∈ S, where Si(p) denotes the weak* closure of the set of states ωΦπi of A for all unit vectors Φ ∈ Hi with (Hi + πi(h))Φ = pΦ.
(iii) For any ε > 0 there exists an h ∈ Asa with ‖h‖ < ε satisfying: both operators Hi + πi(h) have a common eigenvalue p such that S1(p) = S2(p), where Si(p) is defined as in the above condition.

The equivalence of (i) and (ii) is Proposition 3.8, and that (ii) implies (iii) is obvious. That (iii) implies (i) follows from the proof of Theorem 1.2.

Unfortunately the above theorem is unlikely to apply to ground state representations which may arise in physical models, since we expect that the ground states are closely related to the KMS states at low temperature, on the latter of which Gα0 acts trivially.
Let ω be a ground state of (A, α), i.e., if (πω, Hω, Ωω) is the GNS triple for ω and if we define a unitary flow Uω on Hω by Uω(t)πω(x)Ωω = πω(αt(x))Ωω, x ∈ A, then the generator Hω of Uω is non-negative, i.e., Sp(Uω) ⊂ [0, ∞). Note that then Borchers' theorem tells us that Uω(t) ∈ πω(A)″ (see [4, 29]). If ω is extreme in the set of ground states, then πω is irreducible (by checking that πω(A)′Ωω = CΩω using Borchers' theorem). We call ω physical if {ξ ∈ Hω | Hωξ = 0} is one-dimensional (and so is equal to CΩω) and the spectrum of Uω has a gap, i.e., Sp(Uω) ⊂ {0} ∪ [λ, ∞) for some λ > 0 (see [29, 4.2.6], where the notion of physical ground state is defined without the gap condition). If ω is a physical ground state, then it is an extreme ground state. We can show the following.
Proposition 1.5. If ω1 and ω2 are distinct physical ground states of (A, α), then Ker(πω1 × Uω1) ≠ Ker(πω2 × Uω2).

Proof. Since ω1 ≠ ω2 and ω1 and ω2 are extreme ground states, ω1 is disjoint from ω2. Hence there is an α-invariant hereditary C*-subalgebra B of A such that ω1|B = 0 and ‖ω2|B‖ = 1. Since ω1 is a ground state with a gap, the spectrum of Uω1|[πω1(B)Hω1] is contained in (0, ∞), but certainly Sp(Uω2|[πω2(B)Hω2]) ∋ 0. This implies that Ker(πω1 × Uω1) ≠ Ker(πω2 × Uω2).

Hence the applicability of the above theorem may be limited in the case πi × Ui is not faithful. Let us state the following as a corollary.

Corollary 1.6. Let A be a separable C*-algebra and let α be a flow on A. If (π1, U1) and (π2, U2) are representations of (A, α) such that both π1 and π2 are irreducible and both π1 × U1 and π2 × U2 are faithful, then there is a γ ∈ Gα0 and an extension γ̄ to an automorphism of A ×α R such that π1γ is equivalent to π2 and (π1 × U1)γ̄ is equivalent to π2 × U2.

The assumption implies that A is prime as well as A ×α R; so the Connes spectrum of α is R [26] and the range condition Ran(πi × Ui) ∩ K(Hi) = {0} is satisfied. It follows from [24] that there is an asymptotically inner automorphism φ of A ×α R such that (π1 × U1)φ is equivalent to π2 × U2. The above result says that we can choose φ as γ̄, an extension of an automorphism γ ∈ Gα0. Some examples of (A, α) each giving a simple crossed product can be found in [13, 17]. There are quite a few examples of (π, U) which give a faithful covariant irreducible representation of A ×α R [14, 21]. For example, if (A, α) has a faithful covariant irreducible representation and the Connes spectrum is full, then we can embed an arbitrary UHF flow into (A, α) as noted after the theorem; hence (A, α) has as many faithful covariant irreducible representations as a UHF flow has.
Before concluding this section I would like to express my thanks to the referees for numerous comments on an earlier version of this note and to M. Rørdam for his suggestion, which is materialized in the Appendix.

2. Core Symmetries

When a flow α is given on A, we have defined the symmetry group Gα as the group of automorphisms γ of A for which γαγ⁻¹ is a cocycle perturbation of α. We have also defined the core Gα0 of Gα as the group of asymptotically inner automorphisms γ of A which have a continuous map v: [0, ∞) → U(A) with v0 = 1 such that γ = lim_{s→∞} Ad vs and vs αt(vs*) converges to an α-cocycle ut (uniformly in t on every compact subset as s → ∞) with γαtγ⁻¹ = Ad ut αt. Let A be a unital simple C*-algebra with a unique tracial state τ and let α be an approximately inner flow on A. We denote by δα the infinitesimal generator of
α, which is a closed linear operator defined on a dense ∗-subalgebra D(δα) of A such that δα(xy) = δα(x)y + xδα(y) and δα(x*) = δα(x)*. In this case we have that τ(uδα(u*)) = 0 for all u ∈ U(A) ∩ D(δα). (Since α is approximately inner, there is a sequence (hn) in Asa such that δα is the graph limit of (ad ihn); hence for any unitary u ∈ D(δα) there is a sequence (un) in U(A) such that ‖u − un‖ → 0 and ‖δα(u) − ad ihn(un)‖ → 0. From this the assertion follows.) We call an α-cocycle u admissible if it has a v ∈ U(A) such that v*ut αt(v) is differentiable in t [18] and (d/dt) τ(v*ut αt(v))|t=0 = 0. It then follows that for each α-cocycle u there is a unique p ∈ R such that the α-cocycle t ↦ e^{ipt}ut is admissible and that the set of admissible α-cocycles is closed (under the topology of uniform convergence on every compact subset). To show the former we just note that if t ↦ ut is differentiable, then

\[ \frac{d}{dt}\,\tau(v u_t \alpha_t(v^*))\Big|_{t=0} = \frac{d}{dt}\,\tau(u_t)\Big|_{t=0} \]

for any v ∈ U(A) ∩ D(δα) and (d/dt) τ(ut)|t=0 ∈ iR. To show the latter let (un) be a sequence of α-cocycles such that (un,t) converges to an α-cocycle ut uniformly in t on every compact subset. Let A∞ be the C*-algebra of sequences (xn) in A such that lim xn exists and define a flow α∞ on A∞ by α∞_t((xn)) = (αt(xn)) and an α∞-cocycle U by (un). To see that U is really an α∞-cocycle, we check that Us α∞_s(Ut) = (un,s αs(un,t)) = (un,s+t) = Us+t and that ‖Ut − 1‖ ≡ supn ‖un,t − 1‖ converges to zero as t → 0. Then from [18] there is a V = (vn) ∈ U(A∞) such that V Ut α∞_t(V*) is differentiable. If v = lim vn, then vn un,t αt(vn*) converges to v ut αt(v*) (uniformly in t on every compact subset). Since t ↦ V Ut α∞_t(V*) is differentiable, each t ↦ vn un,t αt(vn*) is differentiable and the differential of vn un,t αt(vn*) converges. This implies that t ↦ v ut αt(v*) is differentiable and that (d/dt)(v ut αt(v*)) = lim_n (d/dt)(vn un,t αt(vn*)).
From this it follows that if the un's are admissible, then u is admissible. For each γ ∈ Gα we choose the unique admissible α-cocycle u^γ such that γαtγ⁻¹ = Ad u^γ_t αt and extend γ to an automorphism γ̄ of A ×α R by using u^γ. If β, γ ∈ Gα, then u^{βγ} = β(u^γ)u^β, which shows that γ ↦ γ̄ is an isomorphism of Gα into Aut(A ×α R) in this case. If γ ∈ Gα0, then we have chosen an α-cocycle u such that there is a continuous v: [0, ∞) → U(A) such that v0 = 1, γ = lim_{s→∞} Ad vs, and ut = lim_{s→∞} vs αt(vs*). This shows that u = u^γ, i.e., u is admissible. Hence under the above isomorphism γ ↦ γ̄, Gα0 maps into the group of asymptotically inner automorphisms of A ×α R. (Note that γ̄ = lim_{s→∞} Ad vs and vs is a unitary multiplier of A ×α R; but we can replace (vs) by a continuous family in U(A ×α R) by Proposition B.1.)

We now come back to the general case and define a new subgroup Gα1 of Gα as the group of γ ∈ Gα having a sequence (vn) in U(A) such that γ = lim_{n→∞} Ad vn and the α-cocycles
t ↦ vn* ut αt(vn)
are equicontinuous in n ∈ N, where u is an α-cocycle with γαtγ⁻¹ = Ad ut αt. Then Gα0 ⊂ Gα1, which are normal subgroups of Gα. To show that Gα1 is normal, we proceed as follows. If γ ∈ Gα1 with (vn) as above and ϕ ∈ Gα, then ϕγϕ⁻¹ = lim Ad ϕ(vn). If u and w are α-cocycles such that γαtγ⁻¹ = Ad ut αt and ϕαtϕ⁻¹ = Ad wt αt, then

ϕγϕ⁻¹αtϕγ⁻¹ϕ⁻¹ = ϕγ Ad ϕ⁻¹(wt*) αt γ⁻¹ϕ⁻¹ = ϕ Ad(γϕ⁻¹(wt*)ut) αt ϕ⁻¹ = Ad(ϕγ⁻¹ϕ⁻¹(wt*)ϕ(ut)wt) αt,

where t ↦ ϕγ⁻¹ϕ⁻¹(wt*)ϕ(ut)wt is an α-cocycle. We have to check the equicontinuity of

t ↦ ϕ(vn*)ϕγ⁻¹ϕ⁻¹(wt*)ϕ(ut)wt αtϕ(vn) = ϕ(vn*γ⁻¹ϕ⁻¹(wt*)ut ϕ⁻¹(wt)ϕ⁻¹αtϕ(vn)) = ϕ(vn*γ⁻¹ϕ⁻¹(wt*)vn vn*ut αt(vn)ϕ⁻¹(wt)).

This follows since the last expression approximately equals wt*ϕ(vn*ut αt(vn))wt uniformly in t on every compact subset as n → ∞.

We do not know how strong the condition αt ∈ Gα1 is. (If α is a Rohlin flow on a unital separable nuclear purely infinite simple C*-algebra, then αt ∈ Gα1; see [23]. If α is an AF flow, then αt ∈ Gα0; see [3].) The following result shows that the group generated by Gα1 (and αR) acts trivially on the set of equivalence classes of KMS state representations. Hence the quotient of Gα by this (normal) subgroup may be considered as an essential symmetry group of α.

Proposition 2.1. Let α be a flow on a C*-algebra A and ω a KMS state of (A, α) at an inverse temperature c ∈ R. Then πωγ is equivalent to πω for all γ ∈ Gα1.

Proof. This proof follows the one of [19, 4.4]. By the assumption on γ there is an α-cocycle u and a sequence (vn) in U(A) such that γαtγ⁻¹ = Ad ut αt, γ = lim Ad vn, and the functions t ↦ vn*ut αt(vn) are equicontinuous in n ∈ N. If ω is a KMS state at c = 0, then it is a tracial state, from which it follows that ωγ = ω. Suppose that c > 0. (A similar proof applies in the case c < 0.)
Since ω is a KMS state at c [4, 29], for any x, y ∈ A there is a bounded continuous function F on Sc = {z ∈ C | 0 ≤ Im(z) ≤ c} such that F is holomorphic in the interior of Sc and satisfies the boundary conditions:

F(t) = ω(xαt(y)), F(t + ic) = ω(αt(y)x),
for all t ∈ R. Replacing x by vn x and y by vn*, we get such a function Fn with the boundary conditions:

Fn(t) = ω(vn x αt(vn*)), Fn(t + ic) = ω(αt(vn*)vn x).

Let wn,t = vn* ut αt(vn); (wn,t)n is equicontinuous (as functions in t). Then αt(vn*)vn = wn,t* vn* ut vn is approximately equal to wn,t* γ⁻¹(ut) uniformly in t on every compact subset as n → ∞. Thus it follows that the functions t ↦ αt(vn*)vn are equicontinuous. Similarly the functions t ↦ vn αt(vn*) = vn wn,t* vn* ut are equicontinuous. Note also that these functions equal 1 at t = 0. By passing to a suitable subsequence of (Fn) we get a bounded continuous function Fx on Sc such that Fx is holomorphic in the interior and satisfies

Fx(t) = ⟨πω(γ(x))Vt Ω, Ω⟩, Fx(t + ic) = ⟨πω(x)Ω, Wt Ω⟩,

where πω is the GNS representation of A associated with ω, Ω is the canonical unit vector there, and (Vt) and (Wt) are norm-continuous families in the unit ball of πω(A)″ such that V0 = 1 = W0. One can choose Vt and Wt for countably many x. If πω and πωγ were not equivalent, then we could choose a sequence (xn) in the unit ball of A+ such that πω(xn)Ω → 0 and πωγ(xn)Ω converges to a non-zero vector in norm, or vice versa. This gives us a contradiction since (Fxn) converges, say to F, such that F is a bounded continuous function on Sc and is holomorphic in the interior with F(0) ≠ 0 and F(t + ic) = 0 for all t ∈ R; so πωγ must be equivalent to πω.

3. Proof of Theorem 1.2

The present proof is given largely following [7, 24]. Recall that (π, U) is a representation of (A, α) on the Hilbert space H. We generalize a lemma in [22, 25] as follows.

Lemma 3.1. Suppose that U is diagonal. Let F be a finite subset of A, E a finite-rank projection on H such that Ad Ut(E) = E, and ε > 0. Then there is a finite-rank self-adjoint operator H on H such that

E ≤ H ≤ 1, Ad Ut(H) = H, ‖[π(x), H]‖ < ε, x ∈ F.

Proof. Let n ∈ N and δ > 0 be such that n⁻¹ ≪ ε and δn ≪ ε. Let W1 = EH. Let V2 be the linear subspace of H generated by π(x)ξ, x ∈ F ∪ F*, ξ ∈ W1, where F* = {x* | x ∈ F}. We choose a finite-dimensional subspace
W2 of H such that W2 ⊃ W1, UtW2 = W2, and ‖P_{W2}P_{V2} − P_{V2}‖ < δ, where P_W denotes the (self-adjoint) projection onto the subspace W. We define V3 to be the subspace generated by π(x)ξ, x ∈ F ∪ F*, ξ ∈ W2. We repeat this process to define finite sequences (Vk) and (Wk) of subspaces up to k = n satisfying, e.g., Wk−1 ⊂ Vk, Wk−1 ⊂ Wk, ‖P_{Wk}P_{Vk} − P_{Vk}‖ < δ, and UtWk = Wk. We define

\[ H = \frac{1}{n}\sum_{k=1}^{n} P_{W_k} , \]

which has finite rank, dominates P_{W1} = E, and commutes with Ut. Let W0 = {0} and Wn+1 = H (the whole Hilbert space) and let k = 1, 2, . . . , n − 1. If ξ ∈ Wk+1 ⊖ Wk and x ∈ F, then

\[ [H, \pi(x)]\xi = \Bigl(H - \frac{n-k}{n}\Bigr)\pi(x)\xi , \]

which almost belongs to Wk+2 ⊖ Wk−1. More precisely, note that

\[ \|(1 - P_{W_{k+2}})\pi(x)\xi\| \le \|(1 - P_{W_{k+2}})P_{V_{k+2}}\pi(x)\xi\| \le \delta\|x\|\|\xi\| \]

and that

\[ \|P_{W_{k-1}}\pi(x)\xi\| = \sup\{|\langle \xi, \pi(x^*)\eta\rangle| \;;\; \eta \in W_{k-1},\ \|\eta\| \le 1\} = \sup\{|\langle P_{V_k}(1 - P_{W_k})\xi, \pi(x^*)\eta\rangle| \;;\; \eta \in W_{k-1},\ \|\eta\| \le 1\} \le \delta\|x\|\|\xi\| . \]

Now we will estimate

\[ [H, \pi(x)]\xi = \sum_{k=0}^{n}\Bigl(H - \frac{n-k}{n}\Bigr)\pi(x)(P_{W_{k+1}} - P_{W_k})\xi \]

for ξ ∈ H and x ∈ F. For each i = 0, 1, 2 we have that

\[ \Bigl\|\sum_{k \equiv i \bmod 3}\Bigl(H - \frac{n-k}{n}\Bigr)(P_{W_{k+2}} - P_{W_{k-1}})\pi(x)(P_{W_{k+1}} - P_{W_k})\xi\Bigr\|^2
\le \sum_{k \equiv i \bmod 3} n^{-2}\,\|(P_{W_{k+2}} - P_{W_{k-1}})\pi(x)(P_{W_{k+1}} - P_{W_k})\xi\|^2
\le \sum_{k \equiv i \bmod 3} n^{-2}\,\|x\|^2\,\|(P_{W_{k+1}} - P_{W_k})\xi\|^2
\le n^{-2}\|x\|^2\|\xi\|^2 . \]

Hence it follows that

\[ \|[H, \pi(x)]\xi\| \le 3n^{-1}\|x\|\|\xi\|
+ \Bigl\|\sum_{k=1}^{n}\Bigl(H - \frac{n-k}{n}\Bigr)P_{W_{k-1}}\pi(x)(P_{W_{k+1}} - P_{W_k})\xi\Bigr\|
+ \Bigl\|\sum_{k=0}^{n-1}\Bigl(H - \frac{n-k}{n}\Bigr)(1 - P_{W_{k+2}})\pi(x)(P_{W_{k+1}} - P_{W_k})\xi\Bigr\|
\le 3n^{-1}\|x\|\|\xi\| + 2n\delta\|x\|\|\xi\| = (3n^{-1} + 2n\delta)\|x\|\|\xi\| . \]

This concludes the proof.
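The averaging trick in this proof can be checked by hand in a toy model. The following is a minimal sketch under assumptions that are not taken from the paper: the Hilbert space is C^N with basis e_0, ..., e_{N-1}, the subspaces are W_k = span{e_0, ..., e_{k-1}}, and π(x) is replaced by the truncated shift S e_j = e_{j+1}, which maps W_k into W_{k+1} exactly (so δ = 0). The helper names `averaged_projection_diagonal` and `commutator_norm_with_shift` are hypothetical.

```python
from fractions import Fraction


def averaged_projection_diagonal(n, dim):
    """Diagonal of H = (1/n) * sum_{k=1}^{n} P_{W_k}, W_k = span{e_0..e_{k-1}}.

    e_j lies in W_k exactly when j < k, so the j-th diagonal entry is
    (number of k in 1..n with k > j) / n.
    """
    return [Fraction(max(n - j, 0), n) for j in range(dim)]


def commutator_norm_with_shift(h):
    """Operator norm of [H, S] for H = diag(h) and the truncated shift S.

    [H, S] e_j = (h[j+1] - h[j]) e_{j+1} is a weighted shift, so its norm
    is the largest absolute weight.
    """
    return max(abs(h[j + 1] - h[j]) for j in range(len(h) - 1))


n, dim = 10, 15
h = averaged_projection_diagonal(n, dim)
print(h[0], commutator_norm_with_shift(h))
```

In this model H fixes e_0 (so H dominates the projection onto W_1), and the commutator norm is exactly 1/n, which is within the bound (3n⁻¹ + 2nδ)‖x‖ of the lemma with δ = 0 and ‖S‖ = 1.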
Recall that U(A) denotes the unitary group of A (or A + C1 if 1 ∉ A) and U0(A) denotes the connected component of 1 in U(A). The following is also taken from [22, 25].

Lemma 3.2. Suppose that U is diagonal. Let E be a finite-rank projection on H such that Ad Ut(E) = E, F a finite subset of U0(A), and ε > 0. Then there exists an n ∈ N such that each u ∈ F has a finite sequence (u1, u2, . . . , un) in U0(A) satisfying u1 = u, un = 1, and ‖uk − uk+1‖ < ε/2 for k = 1, 2, . . . , n − 1. Furthermore there exists a finite-rank projection F on Cn ⊗ H such that

Ad(1 ⊗ Ut)(F) = F, F ≥ E ⊕ 0 ⊕ · · · ⊕ 0, and ‖[πn(û), F]‖ < ε, u ∈ F,

where πn = id ⊗ π is a representation of Mn ⊗ A on Cn ⊗ H and û = u1 ⊕ u2 ⊕ · · · ⊕ un using the finite sequence given above.

Proof. The first part is obvious. To show the second part let G = ∪_{u∈F}{u1, u2, . . . , un} and δ > 0. By the previous lemma we find a sequence (Hk)k=1,2,...,n−1 of finite-rank self-adjoint operators on H such that

E ≤ H1 ≤ E1 ≤ H2 ≤ · · · ≤ Hn−1 ≤ En−1 ≡ Hn, Ad Ut(Hk) = Hk, ‖[Hk, π(x)]‖ < δ, x ∈ G,

where Ek denotes the support projection of Hk. We define a projection F on Cn ⊗ H by

\[ F_{k,k} = H_k - H_{k-1} , \qquad F_{k,k+1} = F_{k+1,k} = \sqrt{H_k - H_k^2} , \]

and Fi,j = 0 otherwise, where H0 = 0. Noting that HkHk−1 = Hk−1 and H1 ≥ E, it follows that F is indeed a finite-rank projection such that Ad(1 ⊗ Ut)(F) = F and F ≥ E ⊕ 0 ⊕ · · · ⊕ 0. For u ∈ F we have that

\[ (\pi_n(\hat{u})F - F\pi_n(\hat{u}))_{k,k} = [\pi(u_k), H_k] - [\pi(u_k), H_{k-1}] , \]
\[ (\pi_n(\hat{u})F - F\pi_n(\hat{u}))_{k,k+1} = \bigl[\pi(u_k), \sqrt{H_k - H_k^2}\bigr] + \sqrt{H_k - H_k^2}\,\pi(u_k - u_{k+1}) . \]
Hence, since ‖√(Hk − Hk²)‖ ≤ 1/2, the norm of [πn(û), F] is smaller than

ε/2 + 2δ + 2 max{‖[π(x), √(Hk − Hk²)]‖ | x ∈ G, k = 1, . . . , n},

which can be made smaller than ε by choosing δ > 0 small.

The following is an α-covariant version of [22, 25].

Lemma 3.3. Let (π, U) be a representation of (A, α) on H such that π is irreducible and U is diagonal. For any finite subset F of U0(A), any non-zero finite-rank projection E on H with Ad Ut(E) = E, and any ε > 0, there exists an n ∈ N and a finite subset G of M1n(A) satisfying: ‖ww*‖ = 1 and π(ww*)E = E for w ∈ G, and for any t ∈ [−1, 1] there is a bijection ft: G → G such that ‖αt(w) − ft(w)‖ < ε, w ∈ G, and for any u ∈ F there is a bijection fu: G → G such that ‖uw − fu(w)‖ < ε, w ∈ G.

Remark 3.4. In the above situation for any t ∈ [−1, 1] and u ∈ F there is a bijection g: G → G such that ‖αt(u)w − g(w)‖ < 3ε, w ∈ G. We may take ft ◦ fu ◦ ft⁻¹ for g, where ft and fu are the bijections on G specified in the lemma.

When (X, d) is a metric space, S ⊂ X, and ε > 0, we call S an ε-net if X = ∪{B(x, ε) | x ∈ S}, where B(x, ε) = {y ∈ X | d(x, y) < ε}. When X has a finite ε-net, we define N(X, ε) to be the minimum of orders over all the finite ε-nets. If X is compact, then N(X, ε) is well-defined for any ε > 0.
Lemma 3.5. Let (X, d) be a compact metric space. If S1 and S2 are ε-nets consisting of N(X, ε) points, then there is a bijection f of S1 onto S2 such that d(x, f(x)) < 2ε, x ∈ S1.

Proof. Let F be a non-empty subset of S1 and set

G = {y ∈ S2 | ∃ x ∈ F, B(y, ε) ∩ B(x, ε) ≠ ∅}.

Since ∪{B(x, ε) | x ∈ F} ⊂ ∪{B(x, ε) | x ∈ G}, it follows that G ∪ (S1\F) is an ε-net; hence the order of G is greater than or equal to the order of F. Then by the matching theorem we can find a bijection f of S1 onto S2 such that f(x) ∈ {y ∈ S2 | B(y, ε) ∩ B(x, ε) ≠ ∅}.
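The statement of Lemma 3.5 can be tried out on a small finite metric space. The sketch below is only an illustration under our own assumptions: X = {0, ..., 9} with d(i, j) = |i − j| and ε = 1.5 are arbitrary choices, minimal ε-nets are found by brute force, and the bijection is produced by a standard augmenting-path bipartite matching standing in for the matching theorem invoked in the proof. All function names are hypothetical.

```python
from itertools import combinations


def is_net(points, space, dist, eps):
    """True if the open eps-balls around `points` cover `space`."""
    return all(any(dist(x, s) < eps for s in points) for x in space)


def minimal_nets(space, dist, eps):
    """All eps-nets of minimal cardinality N(X, eps), found by brute force."""
    for size in range(1, len(space) + 1):
        nets = [c for c in combinations(space, size)
                if is_net(c, space, dist, eps)]
        if nets:
            return nets
    return []


def match_nets(s1, s2, space, dist, eps):
    """Bijection f: s1 -> s2 with B(x, eps) and B(f(x), eps) intersecting.

    Augmenting-path matching on the bipartite graph joining x in s1 to
    y in s2 whenever the two open balls share a point of the space.
    """
    def balls_meet(x, y):
        return any(dist(x, z) < eps and dist(y, z) < eps for z in space)

    owner = {}  # y in s2 -> x in s1 currently matched to y

    def try_assign(x, seen):
        for y in s2:
            if balls_meet(x, y) and y not in seen:
                seen.add(y)
                if y not in owner or try_assign(owner[y], seen):
                    owner[y] = x
                    return True
        return False

    for x in s1:
        if not try_assign(x, set()):
            return None  # would contradict Lemma 3.5 for minimal nets
    return {x: y for y, x in owner.items()}


def dist(a, b):
    return abs(a - b)


space = list(range(10))  # X = {0, ..., 9} with d(i, j) = |i - j|
eps = 1.5
nets = minimal_nets(space, dist, eps)
s1, s2 = nets[0], nets[-1]
f = match_nets(s1, s2, space, dist, eps)
print(s1, s2, f)
```

Since intersecting open ε-balls have centres at distance less than 2ε, any bijection returned by `match_nets` satisfies d(x, f(x)) < 2ε, as in the lemma.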
Lemma 3.6. Let π be an irreducible representation of A on a Hilbert space H and E be a finite-rank projection on H. For any finite subset F of A and ε > 0, there is an e ∈ A such that 0 ≤ e ≤ 1, E ≤ π(e), and

‖xe‖ < ‖π(x)E‖ + ε, x ∈ F.

Proof. By Kadison's transitivity [11] the set of e ∈ A with 0 ≤ e ≤ 1 and E ≤ π(e) is non-empty. Assuming A has a unit, the set D = ∪_{ε>0}{e ∈ A | ε·1 ≤ e ≤ 1, E ≤ π(e)} forms a decreasing net and has limit E in the second dual A** (see [26, 1.4.2]; for e1, e2 ∈ D, e = 1 − (a1 + a2)(1 + a1 + a2)⁻¹ is smaller than e1 and e2, where
ai = ei⁻¹(1 − ei)). Hence it follows that the limit of the net (‖xe‖)e∈D is ‖π(x)E‖. In the case where A has no unit, the conclusion follows from this case.

Proof of Lemma 3.3. For a finite subset F of U0(A), a finite-rank projection E, and ε > 0 as in Lemma 3.3, we choose an n ∈ N and a projection F on Cn ⊗ H as in Lemma 3.2; in particular ‖[πn(û), F]‖ ≈ 0 for u ∈ F, where û = u1 ⊕ u2 ⊕ · · · ⊕ un as in Lemma 3.2. We assume that there is a unitary v(u) ∈ Mn(A) for u ∈ F such that ‖v(u) − 1‖ < ε and [πn(v(u)û), F] = 0. Let

B = {x ∈ Mn(A) | [πn(x), F] = 0},

which is a C*-subalgebra of Mn(A) = Mn ⊗ A and is left invariant under id ⊗ α (as Ad(1 ⊗ Ut)(F) = F). We also have that v(u)û ∈ B for u ∈ F. Since πn = id ⊗ π is an irreducible representation of Mn(A), Kadison's transitivity implies that Fπn(B)F equals the set of all linear operators on F(Cn ⊗ H). We choose an ε-net S in the unitary group U(F(Cn ⊗ H)) of minimal order and choose a subset G of U(B) of the same order such that Fπn(G) = S. For u ∈ F there is a bijection fu: G → G such that

‖(πn(v(u)ûw) − πn(fu(w)))F‖ < 2ε, w ∈ G.

There exists an N ∈ N such that if |t| < 1/N, then ‖(id ⊗ αt)(w) − w‖ < ε/2 for w ∈ G. Let ∆ = {k/N | k = −N, −N + 1, . . . , 0, . . . , N}. For each t ∈ ∆, there is a bijection ft: G → G such that

‖(πn((id ⊗ αt)(w)) − πn(ft(w)))F‖ < 2ε, w ∈ G.

Then there is an e ∈ Mn(A) by Lemma 3.6 such that 0 ≤ e ≤ 1, F ≤ πn(e), and

‖(id ⊗ αt)(e) − e‖ < ε, t ∈ [−1, 1],
‖(v(u)ûw − fu(w))e‖ < 2ε, u ∈ F, w ∈ G,
‖((id ⊗ αt)(w) − ft(w))e‖ < 2ε, t ∈ ∆, w ∈ G.

The first of the displayed inequalities can be shown easily. The last condition implies, by the choice of ∆, that for any t ∈ [−1, 1] there is a bijection ft: G → G such that ‖((id ⊗ αt)(w) − ft(w))e‖ < 3ε. By using the first condition it then follows that

‖(id ⊗ αt)(we) − ft(w)e‖ < 4ε.

Note that we²w* ≤ ww* = 1n and πn(we²w*) ≥ πn(w)Fπn(w*) = F; thus the (1, 1) component of we²w* is dominated by 1 and dominates E. Let G1 be the set of the first rows of we, w ∈ G. Then it follows that ‖x‖ ≤ 1 and π(xx*) ≥ E for x ∈ G1. For u ∈ F, if we define a bijection fu: G1 → G1 in an obvious way by using fu: G → G and the natural bijection G → G1, then it follows that

‖ux − fu(x)‖ ≤ ‖ûwe − fu(w)e‖ < ‖v(u)ûwe − fu(w)e‖ + ε < 2ε,
where x ∈ G1 is the first row of we with w ∈ G. For t ∈ [−1, 1], if we define a bijection ft: G1 → G1 by using ft: G → G, then it follows that

‖αt(x) − ft(x)‖ ≤ ‖(id ⊗ αt)(we) − ft(w)e‖ < 4ε,

where x ∈ G1 is the first row of we with w ∈ G. This concludes the proof.

Lemma 3.7. Let (π1, U1) and (π2, U2) be representations of (A, α) such that the πi are irreducible. Then for any ε > 0 there is an h ∈ Asa such that ‖h‖ < ε and Hi + πi(h) is diagonal for i = 1, 2, where Hi is a self-adjoint operator with Ui(t) = e^{√−1 tHi}.

The above lemma easily follows from [16]. We will elaborate on it under some additional assumptions. When π is a representation of A on H and Φ is a unit vector in H, we denote by ωΦπ the state of A defined by x ↦ ⟨π(x)Φ, Φ⟩.

Proposition 3.8. Let (π1, U1) and (π2, U2) be representations of (A, α) such that π1 and π2 are mutually disjoint irreducible representations. Suppose that Ran(πi × Ui) does not contain a non-zero compact operator. Then the following conditions are equivalent:
(i) Ker(π1 × U1) = Ker(π2 × U2).
(ii) For any ε > 0 there exists an h ∈ Asa with ‖h‖ < ε satisfying: Hi + πi(h) is diagonal with the same set S of eigenvalues for i = 1, 2, where Hi is given by Ui(t) = e^{√−1 Hi t}, and S1(p) = S2(p) for any p ∈ S, where Si(p) denotes the weak* closure of the set of states ωΦπi of A for all unit vectors Φ ∈ Hi with (Hi + πi(h))Φ = pΦ.

Proof. Suppose (ii). Denote by α^(h) the inner perturbation of α by h, i.e., the flow generated by δα + ad ih, where δα is the generator of α. There is a natural isomorphism φ from A ×α R onto A ×_{α^(h)} R such that πi × Ui^(h) = (πi × Ui)φ, where Ui^(h)(t) = e^{it(Hi + πi(h))}. Thus it suffices to show that Ker(π1 × U1^(h)) = Ker(π2 × U2^(h)). We now denote α^(h) (resp. Ui^(h)) by α (resp. Ui) and assume that both the Ui are diagonal.
To show that the kernels of the πi × Ui are the same, it suffices to show, by Proposition 1.3, that for any α-invariant hereditary C*-subalgebra B of A, the spectra Sp(Ui|[πi(B)Hi]) are the same, where Hi is the Hilbert space on which πi(A) acts and [πi(B)Hi] denotes the closed subspace spanned by πi(B)Hi. Since Sp(Ui|[πi(B)Hi]) is the closure of the set of p ∈ S with Si(p)|B ≠ 0, this follows immediately. The proof of the other implication will be continued soon.

For the proof of the other part, we prepare the following lemma, where Ei denotes the spectral measure of Hi.
Lemma 3.9. Under the above situation let I be a non-empty open interval of R and Φ be a unit vector in H1 such that E1(I)Φ = Φ and let ω1 = ωΦπ1. Then there exists a sequence (Ψn) of unit vectors in H2 such that

|⟨π2(x)Ψn, Ψn⟩ − ω1(x)| → 0, x ∈ A,

and E2(I)Ψn = Ψn for all n. Moreover (Ψn) converges to zero in the weak topology.

Proof. Let ω̄1 = ωΦ ◦ (π1 × U1), a state of A ×α R. Let g ∈ L1(R) be such that the Fourier transform ĝ of g is supported on the closure of I and ĝ(t) > 0, t ∈ I. Let B be the hereditary C*-subalgebra of A ×α R generated by λ(g). Since (π2 × U2)(B) does not contain a non-zero compact operator and ω̄1 restricts to a state on B, Glimm's theorem [9] says that ω̄1|B can be approximated by states of the form ωΨ ◦ (π2 × U2)|B; in particular ω1 can be approximated by ωΨ ◦ π2, where Ψ is a unit vector in E2(I)H2. The last statement follows since π1 and π2 are disjoint. This concludes the proof.

Proof of Proposition 3.8, continued. Let (Fn) be an increasing sequence of finite subsets of A such that the union ∪n Fn is dense in A and let ε ∈ (0, 1). Let (ηi,n) be a dense sequence in the unit sphere of Hi and let ∆ be a dense countable subset of Sp(H1) = Sp(H2) such that ∆ contains all eigenvalues of Hi with i = 1, 2. Let ξ = η1,1 and let F1 be a finite family of mutually disjoint open intervals (a, b) with a, b ∉ ∆ and 0 < b − a < ε/2 ≡ ε1 such that ξ(I) = E1(I)ξ ≠ 0 for I ∈ F1 and

\[ \|\xi\|^2 - \sum_{I \in F_1} \|\xi(I)\|^2 < \varepsilon_1 . \]

For each I ∈ F1 specify m(I) ∈ R such that m(I) ∈ I ∩ ∆ and let

\[ K_{i,1} = \sum_{I \in F_1} (H_i - m(I))\,E_i(I) \]

for i = 1, 2. Then Ki,1 is a self-adjoint operator in Hi such that ‖Ki,1‖ < ε1. Let V1,1 = {ξ(I) | I ∈ F1} and define m: V1,1 → ∆ and I: V1,1 → Int by m(ξ(I)) = m(I) and I(ξ(I)) = I, where Int denotes the set of open intervals of R with endpoints in R\∆. Let P1,1 denote the projection onto the subspace spanned by V1,1. By Lemma 3.9 we choose a φ(v) ∈ H2 for each v ∈ V1,1 such that E2(I(v))φ(v) = φ(v), ‖φ(v)‖ = ‖v‖, and

‖v‖⁻² |⟨π1(x)v, v⟩ − ⟨π2(x)φ(v), φ(v)⟩| < ε1, x ∈ F1.

Let V2,1 = {φ(v) | v ∈ V1,1} and let P2,1 be the projection onto the subspace generated by V2,1. Define m: V2,1 → ∆ by m(φ(v)) = m(v). By Kadison's transitivity [11, 26] there exists an h1 ∈ Asa such that ‖h1‖ < ε1 and πi(h1)Pi,1 = −Ki,1Pi,1
for i = 1, 2. Then we have that

(Hi + πi(h1))v = m(v)v, v ∈ Vi,1.

We set Hi,1 = Hi + πi(h1) and ∆1 = {m(v) | v ∈ V1,1} = {m(v) | v ∈ V2,1}. Note that Hi,1 commutes with Pi,1 and the spectrum of Hi,1Pi,1 equals ∆1.

Let ε2 ∈ (0, ε1/2) be such that |p − q| > ε2 for all distinct p, q ∈ ∆1. Let ξ = (1 − P2,1)η2,1. Let F2 be a finite family of mutually disjoint open intervals in Int of length smaller than ε2 such that ∪F2 ⊃ ∆1, ξ(I) = E2,1(I)ξ ≠ 0 for I ∈ F2 with I ∩ ∆1 = ∅, and

\[ \|\xi\|^2 - \sum_{I \in F_2} \|\xi(I)\|^2 < \varepsilon_2 , \]

where Ei,1 is the spectral measure of Hi,1. For I ∈ F2 we set m(I) = p if I ∩ ∆1 = {p} (the intersection contains at most one point) and otherwise specify m(I) in I ∩ ∆ arbitrarily. Let V2,2 be the union of V2,1 and the set of non-zero ξ(I) with I ∈ F2 and let P2,2 be the projection onto the subspace generated by V2,2. Note that P2,2 ≥ P2,1. We extend m: V2,1 → ∆ to m: V2,2 → ∆ by m(ξ(I)) = m(I) and let ∆2 be the image of m on V2,2. We also define I: V2,2 → Int by I(v) = I ∈ F2 if E2,1(I)v = v. For p ∈ ∆2 let W(p) be the subspace spanned by {v ∈ V2,2 | m(v) = p} (which is either one-dimensional or two-dimensional at this stage). Let V(p) be a finite ε2-net of the unit sphere of W(p) (i.e., any unit vector of W(p) has distance less than ε2 from some vector in V(p)). For each v ∈ V(p) with p ∈ ∆2, we choose a unit vector ψ(v) ∈ (1 − P1,1)H1, by Lemma 3.9, such that E1,1(I(v))ψ(v) = ψ(v) and

|⟨π1(x)ψ(v), ψ(v)⟩ − ⟨π2(x)v, v⟩| < ε2, x ∈ F2.

We may suppose that V1(p) ≡ {ψ(v) | v ∈ V(p)} is orthogonal. Note that if p ≠ q in ∆2, then the vectors in V1(p) are orthogonal to the ones in V1(q). We set V1,2 = ∪{V1(p) | p ∈ ∆2} and define m: V1,2 → ∆ by m(v) = p for v ∈ V1(p). Let

\[ K_{i,2} = \sum_{I \in F_2} (H_{i,1} - m(I))\,E_{i,1}(I) , \]

which is a self-adjoint operator in Hi such that ‖Ki,2‖ < ε2.

Let P′1,2 be the projection onto the subspace spanned by V1(p), p ∈ ∆2; then P′1,2 ⊥ P1,1. We define P1,2 = P1,1 + P′1,2. Note that K1,2P1,1 = 0. By Kadison's transitivity there exists an h2 ∈ Asa such that ‖h2‖ < ε2 and πi(h2)Pi,2 = −Ki,2Pi,2. Then it follows that

(Hi,1 + πi(h2))v = m(v)v, v ∈ Vi,2.

We set Hi,2 = Hi,1 + πi(h2) = Hi + πi(h1 + h2); then Hi,2 commutes with Pi,2 and the spectrum of Hi,2Pi,2 equals ∆2. We repeat this process.
Suppose that we have defined a decreasing sequence (k ) of positive numbers, an increasing sequence (∆k ) of finite subsets of ∆, an increasing sequence (Pi,k ) of finite-rank projections in Hi , and a sequence (hk ) in Asa , all up to k = n such that k+1 < k /2, the minimum of the differences of distinct points in ∆k is greater than k+1 , k(1 − Pi,k )ηi,` k2 < k for ` = [(k + 1)/2], khk k < k , πi (hk )Pi,k−1 = 0, Pi,k commutes with Hi,k with Hi,k = Hi,k−1 + πi (hk ), and Sp(Hi,k Pi,k ) = ∆k . Moreover if k is odd and p ∈ ∆k , then there is an k -net V (p) in the unit sphere of P1,k E1,k ({p})H1 such that the state ωΦ π1 of A for Φ ∈ V (p) can be approximated by a state ωΨ π2 with Ψ ∈ P2,k E2,k ({p})H2 on Fk up to k , and if k is even and p ∈ ∆k , the same type condition follows switching π1 and π2 . Suppose that n is even. We define n+1 ∈ (0, n /2) such that |p − q| > n+1 for distinct p, q ∈ ∆n . Let ξ = (1 − P1,n )η1,n/2+1 and we find a finite family Fn+1 of mutually disjoint open intervals in Int (i.e., of endpoints in R\∆) of length less than S n+1 such that Fn+1 ⊃ ∆n , if I ∈ Fn+1 and I ∩ ∆n = ∅ then ξ(I) ≡ E1,n ξ 6= 0, and X kξk2 − kξ(I)k2 < n+1 . I∈Fn+1
For $I \in F_{n+1}$ we set $m(I) = p$ if $I \cap \Delta_n = \{p\}$ and otherwise set $m(I) \in I \cap \Delta$ arbitrarily. We define $\Delta_{n+1}$ to be the set of $m(I)$, $I \in F_{n+1}$. Let $W$ be the subspace generated by $P_{1,n}H_1$ and $\xi(I)$, $I \in F_{n+1}$, and let $P_{1,n+1}$ be the projection onto $W$. For each $p \in \Delta_{n+1}$ let $W(p)$ be the subspace (of $W$) generated by $P_{1,n}E_{1,n}(\{p\})H_1$ and $E_{1,n}(I)\xi$ with $m(I) = p$. We choose a finite $\epsilon_{n+1}$-net $V(p)$ in the unit sphere of $W(p)$. By Lemma 3.9 we choose a unit vector $\phi(v) \in (1 - P_{2,n})E_{2,n}(I)H_2$ for each $v \in V(p)$, where $m(I) = p$, such that
$$|\langle\pi_1(x)v, v\rangle - \langle\pi_2(x)\phi(v), \phi(v)\rangle| < \epsilon_{n+1}, \quad x \in F_{n+1}.$$
We may suppose that $V_2(p) \equiv \{\phi(v) \mid v \in V(p)\}$ is orthogonal. Let $P_{2,n+1}$ be the projection onto the subspace generated by $P_{2,n}H_2$ and $V_2(p)$, $p \in \Delta_{n+1}$. We set
$$K_{i,n+1} = \sum_{I \in F_{n+1}} (H_{i,n} - m(I))E_{i,n}(I),$$
which is a self-adjoint operator in $H_i$ such that $\|K_{i,n+1}\| < \epsilon_{n+1}$ and $K_{i,n+1}P_{i,n} = 0$. By Kadison’s transitivity we find an $h_{n+1} \in A_{sa}$ such that $\|h_{n+1}\| < \epsilon_{n+1}$ and $\pi_i(h_{n+1})P_{i,n+1} = -K_{i,n+1}P_{i,n+1}$. Then it follows that $H_{i,n+1} = H_{i,n} + \pi_i(h_{n+1})$ commutes with $P_{i,n+1}$ and the spectrum of $H_{i,n+1}P_{i,n+1}$ equals $\Delta_{n+1}$. We can argue similarly in the case where $n$ is odd. In this way we complete the induction.

From the conditions on $(P_{i,n})$ which use $\eta_{i,n}$, we can conclude that $(P_{i,n})$ converges to 1 as $n \to \infty$. Let $h = \sum_{k=1}^{\infty} h_k$, which converges in $A_{sa}$ and satisfies $\|h\| < \epsilon$. We can conclude that $H_i + \pi_i(h)$ commutes with all $P_{i,n}$ and is diagonal, with point spectrum equal to $\bigcup_n \Delta_n$.
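The convergence of the series for $h$ can be made explicit by a geometric estimate. The following is only a sketch: it assumes that the first term of the sequence was chosen with $\epsilon_1 \le \epsilon/2$, a choice that is not visible in this part of the argument and is introduced here purely for illustration.

```latex
\[
\epsilon_{k+1} < \frac{\epsilon_k}{2}
\;\Longrightarrow\;
\epsilon_k \le 2^{-(k-1)}\epsilon_1 ,
\qquad
\|h\| \le \sum_{k=1}^{\infty}\|h_k\|
      < \sum_{k=1}^{\infty}\epsilon_k
      \le 2\epsilon_1 \le \epsilon .
\]
```

The same bound shows that the partial sums of $\sum_k h_k$ form a Cauchy sequence in norm, so the limit lies in $A_{sa}$.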
Let $p$ be a point in the point spectrum of $H_{1,\infty} = H_1 + \pi_1(h)$ and let $v$ be a unit vector in $H_1$ such that $H_{1,\infty}v = pv$. Then $v$ almost belongs to $P_{1,n}H_1$ for large odd $n$ and hence to $P_{1,n}E_{1,n}(\{p\})H_1$ (since $P_{1,n}E_{1,n} = P_{1,n}E_{1,\infty}$, where $E_{i,\infty}$ is the spectral measure of $H_{i,\infty}$). Thus the state $\omega_v\pi_1$ can be approximated by states $\omega_w\pi_2$ with $w \in P_{2,n}E_{2,n}(\{p\})H_2$ on $F_n$ up to $\epsilon_n$. Since $n$ is arbitrary, this shows that $\omega_v\pi_1$ belongs to $S_2(p)$, the weak* closure of the set of states $\omega_w\pi_2$ for $w \in E_{2,\infty}(\{p\})H_2$. Thus, since $S_i(p)$ is weak*-closed, we obtain that $S_1(p) \subset S_2(p)$. By a symmetric argument, we also obtain the other inclusion. This concludes the proof of (i) ⇒ (ii).

The following is an α-covariant version of [24, 1.2] (see also [7]).

Lemma 3.10. Let $(\pi, U)$ be a representation of $(A, \alpha)$ on a Hilbert space $H$ such that $\pi$ is irreducible and $U$ is diagonal. Let $\Phi$ be a unit vector in $H$ such that $U_t\Phi = \Phi$. For any finite subset $F$ of $A$ and $\epsilon > 0$, there exists a finite subset $G$ of $A$ and $\delta > 0$ satisfying: for any unit vector $\Psi \in H$ such that $U_t\Psi = \Psi$ and
$$|\langle\pi(x)\Phi, \Phi\rangle - \langle\pi(x)\Psi, \Psi\rangle| < \delta, \quad x \in G,$$
and
$$\langle\pi(x)\Phi, \Psi\rangle = 0, \quad x \in G,$$
there is a continuous path $(u_s)_{s\in[0,1]}$ in $U(A)$ such that $u_0 = 1$, $\pi(u_1^*)\Phi = \Psi$, $\|\alpha_t(u_s) - u_s\| < \epsilon$ for $t \in [-1, 1]$, and $\|[u_s, x]\| < \epsilon$ for $x \in F$.

Proof. Let $F$ be a finite subset of $U_0(A)$ (instead of $A$) and $\epsilon > 0$; we will choose a required path $(u_s)$ in such a way that it almost commutes with $F$. By Lemma 3.3 there exists a finite subset $G$ of $M_{1n}(A)$ for some $n \in \mathbb{N}$ satisfying: (i) $\|ww^*\| = 1$ and $\pi(ww^*)\Phi = \Phi$ for $w \in G$, (ii) for any $t \in [-1, 1]$ there is a bijection $f_t\colon G \to G$ such that $\|\alpha_t(w) - f_t(w)\| < \epsilon$ for $w \in G$, and (iii) for any $u \in F$ there is a bijection $f_u\colon G \to G$ such that $\|uw - f_u(w)\| < \epsilon$ for $w \in G$. Let $\epsilon' > 0$ and let $g$ be a non-negative, even, continuous function on $\mathbb{R}$ such that $\int g(t)\,dt = 1$, $T \equiv \sup\{t \in \mathbb{R} \mid g(t) \neq 0\} < \infty$, and
$$\int |g(t) - g(t - s)|\,dt < \epsilon', \quad s \in [-1, 1].$$
Let $G' = \{w_i^* \mid w \in G,\ i = 1, 2, \dots, n\}$. We choose an $N \in \mathbb{N}$ such that if $|t| < T/N$, then $\|\alpha_t(x) - x\| < \epsilon'$ for $x \in G'$. We set $\mathcal{T} = \{kT/N \mid k = -N, -N+1, \dots, N\}$. We choose a $\delta > 0$ as follows: if $\Psi$ is a unit vector in $H$ such that $U_t\Psi = \Psi$,
$$|\langle\pi\alpha_s(x)\Phi, \pi\alpha_t(y)\Phi\rangle - \langle\pi\alpha_s(x)\Psi, \pi\alpha_t(y)\Psi\rangle| < \delta, \quad x, y \in G',\ s, t \in \mathcal{T},$$
and
$$\langle\pi\alpha_s(x)\Phi, \pi\alpha_t(y)\Psi\rangle = 0, \quad x, y \in G',\ s, t \in \mathcal{T},$$
then there is a self-adjoint unitary S on H such that kSπαs (x)Φ − παt (x)Ψk < 0 ,
x ∈ G0 ,
s∈T .
(See [7] for details.) Hence we assume that we have a self-adjoint unitary S on H and a unit vector Ψ ∈ H with the above properties. Let E = (1 − S)/2, which is a projection. For x ∈ G 0 and s ∈ T we have that kEπαs (x)(Φ + Ψ)k = 2−1 kπαs (x)Φ + παs (x)Ψ − Sπαs (x)Φ − Sπαs (x)Ψk < 0 and kEπαs (x)(Φ − Ψ) − παs (x)(Φ − Ψ)k = 2−1 k − παs (x)Φ + παs (x)Ψ − Sπαs (x)Φ + Sπαs (x)Ψk < 0 . We choose an h ∈ Asa , by Kadison’s transitivity, such that khk < 1 + 0 and π(h) = E on the subspace spanned by παs (x)Φ and παs (x)Ψ with all x ∈ G 0 and s ∈ T . Then we have, for t ∈ [−T, T ], w ∈ G, and i = 1, 2, . . . , n, that kUt π(h)Ut∗ π(wi∗ )(Φ + Ψ)k = kπ(h)α−t (wi∗ )(Φ + Ψ)k < kEπαs (wi∗ )(Φ + Ψ)k + 20 < 30 , where s ∈ T is chosen so that |s + t| < T /N , and kUt π(h)Ut∗ π(wi∗ )(Φ − Ψ) − π(wi∗ )(Φ − Ψ)k = kπ(h)πα−t (wi∗ )(Φ − Ψ) − πα−t (wi∗ )(Φ − Ψ)k
< kπ(h)παs (wi∗ )(Φ − Ψ) − παs (wi∗ )(Φ − Ψ)k + 40
< 50 , with s ∈ T chosen as above. Hence if we define Z ¯ h = αt (h)g(t)dt , ¯ − hk ¯ < 0 , t ∈ [−1, 1], we have that which satisfies that kαt (h) ∗ 0 ¯ kπ(h)π(w i )(Φ + Ψ)k < 3 ,
and ∗ ∗ 0 ¯ kπ(h)π(w i )(Φ − Ψ) − π(wi )(φ − Ψ)k < 5 .
We further define X ¯= 1 ¯ ∗, whw h |G| w∈G
¯ ≤ khk < 1 + 0 and k[h, ¯ x]k < which is a self-adjoint element in Asa such that khk , x ∈ F. Then it follows that 1 X ¯ ¯ ∗ )(Φ + Ψ)k kπ(whw kπ(h)(Φ + Ψ)k ≤ |G| w∈G
≤
n 1 XX ¯ ∗ )(Φ + Ψ)k kπ(wi hw i |G| i=1 w∈G
< 3n0 and ¯ kπ(h)(Φ − Ψ) − (Φ − Ψ)k < 5n0 . ¯
If 0 is sufficiently small, then we get eiππ(h) Φ ≈ Ψ. The desired path will be obtained ¯ by setting us = eiπhs , s ∈ [0, 1] with an appropriate modification around s = 1. If t ∈ [−1, 1], we choose s ∈ T such that |s − t| < T /N and compute: X ¯ ∗ ) − whw ¯ ∗ )k ¯ − hk ¯ ≤ |G|−1 k (αt (whw kαt (h) w∈G
< |G|−1
X
¯ s (w∗ ) − fs (w)hf ¯ s (w)∗ k + 0 + 2n0 khk kαs (w)hα
w∈G 0
< (1 + 2nkhk) + 2khk . Since khk < 1 + 0 and 0 was chosen after n was decided, this completes the proof. We should note that in the above lemma the condition that U is diagonal could be dropped and the condition that Ut Ψ = Ψ could be replaced by kUt Ψ − Ψk < δ, t ∈ [0, 1]. Now we turn to the proof of (i) ⇒ (ii) of Theorem 1.2. If π1 is equivalent to π2 , then there is nothing to prove. Hence we suppose that they are disjoint (since they are irreducible representations). By a small inner perturbation of α, we may suppose that the conditions as stated in Lemma 3.8 are satisfied. (More precisely, if α0 is an inner perturbation of α, then there is an isomorphism γ of A ×α0 R onto A ×α R such that (πi × Ui )γ equals πi × Ui0 , where Ui0 is the corresponding perturbation of Ui , and thus we may impose any desirable conditions on the perturbed Ui0 .) In particular we suppose that 0 belongs to the point spectrum of Ui (or the self-adjoint generator of Ui ) and that S1 (0) = S2 (0), where Si (0) is the weak∗ closure of the set of ωΦ πi with all unit vectors Φ ∈ Hi (0) ≡ Ei ({0})Hi . Note that we will use only the fact S1 (0) = S2 (0) in the following proof, which shows why (iii) implies (i) in Proposition 1.4. Let (Fn ) be an increasing sequence of finite subsets of A with dense union and let > 0.
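The mechanism behind the path in Lemma 3.10 can be seen in a finite-dimensional toy model. The sketch below is an illustration under simplifying assumptions, not the construction of the proof: it replaces the averaged element by an exact projection E onto the line through Φ − Ψ (so that E(Φ + Ψ) = 0 and E(Φ − Ψ) = Φ − Ψ hold exactly rather than approximately), takes Φ and Ψ to be real orthonormal vectors, and uses the hypothetical helper name `path_unitary`.

```python
import cmath

def path_unitary(phi, psi, s):
    """Return the map v -> exp(i*pi*s*E) v, with E the projection onto span{phi - psi}.

    Because E is a projection, exp(i*pi*s*E) = 1 + (exp(i*pi*s) - 1) E.
    The vectors are real lists, so the inner product needs no conjugation.
    """
    d = [a - b for a, b in zip(phi, psi)]
    norm_sq = sum(x * x for x in d)
    factor = cmath.exp(1j * cmath.pi * s) - 1

    def apply(v):
        coeff = factor * sum(x * y for x, y in zip(d, v)) / norm_sq
        return [y + coeff * x for x, y in zip(d, v)]

    return apply

# Toy data (assumed for illustration): two orthonormal vectors in R^3.
phi = [1.0, 0.0, 0.0]
psi = [0.0, 1.0, 0.0]

start = path_unitary(phi, psi, 0.0)(phi)  # s = 0: the identity, so phi is fixed
end = path_unitary(phi, psi, 1.0)(phi)    # s = 1: the reflection 1 - 2E, sending phi to psi
```

At s = 1 the unitary is the reflection 1 − 2E, which is its own adjoint, so it exchanges Φ and Ψ. In the lemma the same effect is obtained only approximately, which is why the path needs a modification near s = 1.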
Let Φ1 be a unit vector in H1 (0) and let F10 = F1 and 1 = /2. We choose a finite subset G1 = G of A and δ1 = δ > 0 for (π, U, Φ, F, ) = (π1 , U1 , Φ1 , F10 , 1 ) as in Lemma 3.10 such that δ1 < 1 . Let Ψ1 ∈ H2 (0) be a unit vector such that |hπ1 (x)Φ1 , Φ1 i − hπ2 (x)Ψ1 , Ψ1 i| < δ1 /2 ,
x ∈ G 1 ∪ F1 .
Let F100 = F1 . We choose a finite subset G2 of A and δ2 > 0 for (π2 , U2 , Ψ1 , F100 , 1 ) such that G2 ⊃ G1 and δ2 < δ1 /2. Let Φ2 ∈ H1 (0) be a unit vector such that |hπ1 (x)Φ2 , Φ2 i − hπ2 (x)Ψ1 , Ψ1 i| < δ2 /2 ,
x ∈ G 2 ∪ F2 ,
and hπ1 (x)Φ1 , Φ2 i = 0, x ∈ G1 . Then it follows that |hπ1 (x)Φ1 , Φ1 i − hπ1 (x)Φ2 , Φ2 i| < δ1 ,
x ∈ G1 .
Applying Lemma 3.10 with this condition, we find a continuous path (u1 (s))s∈[0,1] in U(A) such that u1 (0) = 1, π1 (u1 (1)∗ )Φ1 = Φ2 , k[u1 (s), x]k < 1 , x ∈ F10 , and kαt (u1 (s)) − u1 (s)k < 1 , t ∈ [0, 1]. Let π1,1 = π1 Ad u1 (1) and U1,1 = Ad π1 (u1 (1))U1 (so that (π1,1 , U1,1 ) is a representation of (A, α)). Then it follows that hπ1 (x)Φ2 , Φ2 i = hπ1,1 (x)Φ1 , Φ1 i ,
x ∈ A.
Let F20 = F2 ∪ {Ad u1 (1)∗ (x) | x ∈ F2 } and let 2 = 1 /2 = 2−2 . We choose a finite subset G3 of A and δ3 > 0 for (π1,1 , U1,1 , Φ1 , F20 , 2 ), as in Proposition 3.8, such that G3 ⊃ G2 and δ3 < δ2 /2. We then find a unit vector Ψ2 ∈ H2 (0) such that |hπ1,1 (x)Φ1 , Φ1 i − hπ2 (x)Ψ2 , Ψ2 i| < δ3 /2 ,
x ∈ G 3 ∪ F3
and hπ2 (x)Ψ1 , Ψ2 i = 0, x ∈ G2 . Then it follows that |hπ2 (x)Ψ1 , Ψ1 i − hπ2 (x)Ψ2 , Ψ2 i| < δ2 ,
x ∈ G2 ,
by which we find a continuous path (v1 (s))s∈[0,1] such that v1 (0) = 1, π2 (v1 (1)∗ )Ψ1 = Ψ2 , k[v1 (s), x]k < 1 , x ∈ F2 , and kαt (v1 (s))−v1 (s)k < 1 , t ∈ [0, 1]. Let π2,1 = π2 Ad v1 (1) and U2,1 = Ad π2 (v1 (1))U2 and note that hπ2 (x)Ψ2 , Ψ2 i = hπ2,1 (x)Ψ1 , Ψ1 i ,
x ∈ A.
Let F200 = F2 ∪ {Ad v1 (1)∗ (x) | x ∈ F2 }. We find a finite subset G4 of A and δ4 > 0 for (π2,1 , U2,1 , Ψ1 , F200 , 2 ), as in Lemma 3.10, such that G4 ⊃ G3 and δ4 < δ3 /2. We then find a unit vector Φ3 ∈ H1 (0) such that |hπ1 (x)Φ3 , Φ3 i − hπ2,1 (x)Ψ1 , Ψ1 i| < δ4 /2 ,
x ∈ G 4 ∪ F4 .
Note that hπ1,1 (x)π1 (u1 (1))Φ3 , π1 (u1 (1))Φ3 i = hπ1 (x)Φ3 , Φ3 i , and Ad π1 (u1 (1))(U1 (t))π1 (u1 (1))Φ3 = π1 (u1 (1))Φ3 ,
x ∈ A,
where the unitary flow t → Ad π1 (u1 (1))(U1 (t)) is what is denoted by U1,1 . Since |hπ1,1 (x)Φ1 , Φ1 i − hπ1,1 (x)π1 (u1 (1))Φ3 , π1 (u1 (1))Φ3 i| < δ3 ,
x ∈ G3 ,
we find, by Lemma 3.10, a continuous path (u2 (s))s∈[0,1] in U(A) such that u2 (0) = 1, π1,1 (u2 (1)∗ )Φ1 = π1 (u1 (1))Φ3 , k[u2 (s), x]k < 3 , x ∈ F20 , and kαt (u2 (s)) − u2 (s)k < 2 , t ∈ [0, 1]. Let π1,2 = π1,1 Ad u2 (1) = π1 Ad (u1 (1)u2 (1)); then hπ1,2 (x)Φ1 , Φ1 i = hπ1 (x)Φ3 , Φ3 i ,
x ∈ A.
We repeat this process. We thus obtain sequences (un ) and (vn ) of continuous paths in U(A) such that un (0) = 1, vn (0) = 1, k[un (s), x]k < n , x ∈ Fn0 , kαt (un (s))−un (s)k < n , t ∈ [0, 1], k[vn (s), x]k < n , x ∈ Fn00 , kαt (vn (s))−vn (s)k < n , t ∈ [0, 1], and π1 (un (1)∗ un−1 (1)∗ · · · u1 (1)∗ )Φ1 ∈ H1 (0) , π2 (vn (1)∗ vn−1 (1)∗ · · · v1 (1)∗ )Ψ1 ∈ H2 (0) , and |hπ1 Ad (u1 (1)u2 (1) · · · un (1))(x)Φ1 , Φ1 i − hπ2 Ad (v1 (1)v2 (1) · · · vn (1))(x)Ψ1 , Ψ1 i| < 2n+1 , 0 for x ∈ F2n+1 , where n = 2−n , Fn0 = Fn ∪ {Ad un−1 (1)∗ (x) | x ∈ Fn ∪ Fn−1 }, and 00 ∗ 00 Fn = Fn ∪ {Ad vn (1) (x) | x ∈ Fn ∪ Fn−1 }. (We have used the fact that δn < n by the choice of (δn ) above.) We then conclude that Ad(u1 (1)u2 (1) · · · un (1)) converges to an automorphism, say φ, as n → ∞ and that Ad(v1 (1)v2 (1) · · · vn (1)) converges to an automorphism, say ψ, as n → ∞. (This is because if m > n and x ∈ Fn then kAd(u1 (1) · · · un (1))(x) − Ad(u1 (1) · · · um (1))(x)k < n+1 + n+2 + · · · + m < n and kAd(un (1)∗ · · · u1 (1)∗ )(x)−Ad(um (1)∗ · · · u1 (1)∗ )(x)k < n .) Moreover defining u, v: [0, ∞) → U(A) by u(s) = u1 (1) · · · un−1 (1)un (s − n + 1) for s ∈ [n − 1, n) and v(s) = v1 (1) · · · vn (s − n + 1) for s ∈ [n − 1, n), we have that u(0) = 1 = v(0) and
φ = lim Ad u(s) , s→∞
ψ = lim Ad v(s) , s→∞
which shows that both $\phi$ and $\psi$ are asymptotically inner. Since $\|\alpha_t(u_n(s)) - u_n(s)\| < \epsilon_n$ for $t \in [0, 1]$, we have that for $s > n - 1$ and $t \in [0, 1]$,
$$\|u(s)\alpha_t(u(s)^*) - u_1(1)\cdots u_{n-1}(1)\alpha_t(u_{n-1}(1)^*\cdots u_1(1)^*)\| < 2^{-n+1}.$$
Thus it follows that $u(s)\alpha_t(u(s)^*)$ converges, say to $w(t)$, uniformly in $t$ on every compact subset of $\mathbb{R}$, as $s \to \infty$ and that $w(t)$ is an $\alpha$-cocycle such that $\phi\alpha_t\phi^{-1} = \mathrm{Ad}\,w(t)\alpha_t$. The same applies to $\psi$; $z(t) = \lim_{s\to\infty} v(s)\alpha_t(v(s)^*)$ exists and is an $\alpha$-cocycle such that $\psi\alpha_t\psi^{-1} = \mathrm{Ad}\,z(t)\alpha_t$. Hence $\bar\phi = \lim \mathrm{Ad}\,u(s)$ and $\bar\psi = \lim \mathrm{Ad}\,v(s)$ define automorphisms of $A \times_\alpha \mathbb{R}$ such that
$$\bar\phi(a\lambda(f)) = \phi(a)\int f(t)w(t)\lambda(t)\,dt,$$
$$\bar\psi(a\lambda(f)) = \psi(a)\int f(t)z(t)\lambda(t)\,dt,$$
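for $a \in A$ and $f \in C_c(\mathbb{R})$. Since $\pi_1(u(n))^*\Phi_1 = \pi_1(u_n(1)^*u_{n-1}(1)^*\cdots u_1(1)^*)\Phi_1 \in H_1(0)$, we get that
$$\langle(\pi_1 \times U_1)\bar\phi(a\lambda(f))\Phi_1, \Phi_1\rangle = \langle\pi_1\phi(a)\Phi_1, \Phi_1\rangle\int f(t)\,dt$$
for $a \in A$ and $f \in C_c(\mathbb{R})$. Similarly we have that
$$\langle(\pi_2 \times U_2)\bar\psi(a\lambda(f))\Psi_1, \Psi_1\rangle = \langle\pi_2\psi(a)\Psi_1, \Psi_1\rangle\int f(t)\,dt.$$
Since
$$\langle\pi_1\phi(a)\Phi_1, \Phi_1\rangle = \langle\pi_2\psi(a)\Psi_1, \Psi_1\rangle, \quad a \in A,$$
we can conclude that $(\pi_1 \times U_1)\bar\phi$ is equivalent to $(\pi_2 \times U_2)\bar\psi$.

Note that $\gamma = \phi\psi^{-1}$ is an asymptotically inner automorphism such that $\gamma\alpha\gamma^{-1}$ is a cocycle perturbation of $\alpha$. Moreover there exists a continuous map $\zeta\colon [0, \infty) \to U(A)$ (each $\zeta(s)$ of the form $u(s_1)v(s_2)^*$ for large $s_1, s_2$) such that $\zeta(0) = 1$, $\gamma = \lim_{s\to\infty} \mathrm{Ad}\,\zeta(s)$, and $\lim_{s\to\infty} \zeta(s)\alpha_t(\zeta(s)^*)$ exists and is an $\alpha$-cocycle by which $\alpha$ is perturbed to $\gamma\alpha\gamma^{-1}$. Hence we get the conclusion with $\gamma = \phi\psi^{-1}$ and $v_s = \zeta(s)$.

A. Appendix

We recall that an automorphism $\gamma$ of $A$ is said to be asymptotically inner if there is a continuous map $v\colon [0, \infty) \to U(A)$ such that
$$\gamma(x) = \lim_{s\to\infty} \mathrm{Ad}\,v(s)(x), \quad x \in A.$$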
We denote by AInn(A) the group of asymptotically inner automorphisms of A and by AInn0 (A) the subgroup of γ ∈ AInn(A) with v: [0, ∞) → U(A) such that v(0) = 1 and γ = lims→∞ Ad v(s). Note that AInn(A) can be a proper subgroup of the group of approximate inner automorphisms of A, i.e., the closure of {Ad v | v ∈ U(A)} with pointwise norm convergence on A. Theorem A.1. Let A be a separable C ∗ -algebra and π1 and π2 be non-degenerate representations of A on a separable Hilbert space H such that π 1 (A)00 = π2 (A)00 ≡ M. Suppose that (i) M is injective. (ii) There exists a sequence (Un ) in U(M) such that Ad Un π1 (x) converges strongly to π2 (x) for all x ∈ A. (iii) There exists a sequence (Vn ) in U(M) such that Ad Vn π2 (x) converges strongly to π1 (x) for all x ∈ A.
Then there exists a $\gamma \in \mathrm{AInn}_0(A)$ such that $\pi_1\gamma(x) \mapsto \pi_2(x)$ extends to an automorphism of $M$ which is trivial on the center.

For the proof, see [8, 4.1], where we have assumed, instead of the conditions (ii) and (iii) above, that there exists a sequence $(U_n)$ in $U(M)$ such that $\|\mathrm{Ad}\,U_n\pi_1(x) - \pi_2(x)\| \to 0$ for all $x \in A$. But inspection of the proof shows that we only used these weaker conditions. If $M$ is a type $\mathrm{I}_\infty$ factor (or non-factor), the conditions (ii) and (iii) follow from $\mathrm{Ker}(\pi_1) = \mathrm{Ker}(\pi_2)$ (see [31, 24, 20]). If $M$ is of type II, these conditions may not follow from the kernel condition (see [2]). We shall show the following lemma, whose proof was suggested to us by M. Rørdam in the summer of 2002.

Lemma A.2. Let $A$ be a unital simple nuclear separable C*-algebra and let $M$ be an injective factor of type $\mathrm{III}_\lambda$ with $0 < \lambda \le 1$ on a separable Hilbert space $H$. Let $\pi_1$ and $\pi_2$ be unital homomorphisms of $A$ into $M$. Then there is a sequence $(U_n)$ in $U(M)$ such that $\mathrm{Ad}\,U_n\pi_1(x)$ converges strongly to $\pi_2(x)$ for any $x \in A$.

Since $\pi_i(A)''$ is injective for a nuclear $A$ and injective type $\mathrm{III}_\lambda$ factors with separable predual are unique up to isomorphism for $\lambda \in (0, 1]$, we get the following result by combining the above two results.

Corollary A.3. Let $A$ be a unital simple nuclear separable C*-algebra and let $\pi_1$ and $\pi_2$ be factorial representations of $A$. Suppose that $\lambda \in (0, 1]$ and that $\pi_1(A)''$ and $\pi_2(A)''$ are of type $\mathrm{III}_\lambda$. Then there is a $\gamma \in \mathrm{AInn}_0(A)$ such that $\pi_1\gamma$ is equivalent to $\pi_2$.

In the situation of the above corollary we do not know whether the conclusion follows if $\pi_i(A)''$ is of type $\mathrm{III}_0$. But for some C*-algebras $A$ this is certainly true (see [8]).
We will prove Lemma A.2 below, by invoking some deep results: the uniqueness of injective type IIIλ factors with 0 < λ ≤ 1 (due to Connes and Haagerup), A ⊗ O2 ∼ = O2 (due to Kirchberg), and the fact that any two homomorphisms of O2 into M are approximately unitarily equivalent (due to Rørdam), where O2 is the Cuntz algebra generated by two isometries. Lemma A.4. Let M be an injective factor of type IIIλ with 0 < λ < 1 on a separable Hilbert space. Then there exists a unital endomorphism γ of M and a sequence (Un ) in U(M) such that (i) M ∩ γ(M)0 ∼ = M. (ii) Ad Un (Q) strongly converges to γ(Q) for Q ∈ M. (iii) Ad Un∗ γ(Q) strongly converges to Q for Q ∈ M. Proof. For λ ∈ (0, 1) we define a state ϕ on M2 by x11 x12 1 (x11 + λx22 ) ϕ = 1+λ x21 x22
N∞ N∞ and define a state f on the UHF C ∗ -algebra B = 1 M2 by f = 1 ϕ. Then by the uniqueness of injective type IIIλ with separable predual, M is isomorphic to πf (B)00 [5, 1, 30], where (πf , Hf , Ωf ) is the GNS triple associated with f . Thus we assume that M = πf (B)00 and identify B as a subalgebra of M omitting πf . An injective function ψ of N into N induces a unital endomorphism γ of B = N N M2 sending a copy of M2 at i ∈ N to that at ψ(i). If the range of ψ has infinite complement, we have that B ∩ γ(B)0 ∼ = B. Let (eij ) denote the canonical family of matrix units for M2 and let u = P2 i,j=1 eij ⊗eji . Then u is a self-adjoint unitary in M2 ⊗M2 such that Ad u(x⊗y) = y ⊗ x, with x, y ∈ M2 . For each pair (k, `) with k, ` ∈ N, we construct such a unitary, denoted by u(k,`) , based on M2 at k and `. Then it is straightforward to get a sequence (un ) from the set of products of these unitaries u(k,`) such that γ(x) = lim Ad un (x), x ∈ B. We should note that f (un x) = f (xun ), x ∈ B, i.e., un belongs to the centralizer of M with respect to Ωf (or the extension of f to a state of M). Since f ◦ γ = f , we can define an isometry V on Hf by V xΩf = γ(x)Ωf , x ∈ B. Since V x = γ(x)V, x ∈ B, it follows that for any Q ∈ M there is a Q1 ∈ M such that V Q = Q1 V . Since V QΩf = Q1 Ωf and Ωf is separating for M, Q1 must be unique. Thus we can define a map γ¯: M → M by γ¯ (Q) = Q1 , i.e., by the requirement V Q = γ¯ (Q)V . It is straightforward to show that γ¯ is a unital endomorphism of M. Let J denote the modular conjugation for M with respect to Ωf . (J is obtained from the polar decomposition of the closure of the preclosed operator S defined by SQΩf = Q∗ Ωf , Q ∈ M; see [5].) We should note that Jun Ωf = u∗n Ωf and JMJ = M0 . Since un Jun JxΩf = un xu∗n Ωf → γ(x)Ωf for x ∈ B, we have that un Jun J strongly converges to V . Hence for Q ∈ M we have that un Qu∗n Ωf = un Jun JQΩf → V QΩf = γ¯(Q)Ωf . 
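The flip property of the unitary $u = \sum_{i,j} e_{ij} \otimes e_{ji}$ used in the proof can be checked directly on $M_2 \otimes M_2$. The following is a small sanity check, not part of the argument: the helper names (`kron`, `matmul`, `matrix_unit`, `flip_unitary`) and the sample matrices `x`, `y` are chosen here for illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def matrix_unit(i, j):
    """The 2x2 matrix unit e_ij: 1 in position (i, j), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(2)] for r in range(2)]

def flip_unitary():
    """Build u = sum over i, j of e_ij (x) e_ji as a 4x4 matrix."""
    u = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            term = kron(matrix_unit(i, j), matrix_unit(j, i))
            u = [[u[r][c] + term[r][c] for c in range(4)] for r in range(4)]
    return u

u = flip_unitary()
x = [[1, 2], [3, 4]]   # arbitrary sample elements of M_2
y = [[0, 1], [5, 7]]

# u is real and symmetric, so u* = u and Ad u(z) = u z u.
flipped = matmul(matmul(u, kron(x, y)), u)
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
```

Here `flipped` can be compared with `kron(y, x)` and `matmul(u, u)` with `identity`, which are the two properties (flip and self-adjoint unitary) stated in the text.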
Since $\Omega_f$ is separating for $M$, it follows that $\mathrm{Ad}\,u_n(Q)$ strongly converges to $\bar\gamma(Q)$ for $Q \in M$. On the other hand, since $u_n^*\bar\gamma(Q)u_n\Omega_f = u_n^*Ju_n^*J\bar\gamma(Q)\Omega_f = u_n^*Ju_n^*JVQ\Omega_f$ for $Q \in M$, we have that $u_n^*\bar\gamma(Q)u_n\Omega_f - Q\Omega_f = u_n^*Ju_n^*J(VQ\Omega_f - u_nJu_nJQ\Omega_f)$ converges to 0. Hence it follows that $\mathrm{Ad}\,u_n^*\bar\gamma(Q)$ strongly converges to $Q$ for $Q \in M$. This completes the proof.

The following result is due to M. Rørdam (see [28, 3.6]).

Lemma A.5. Let $\pi_1$ and $\pi_2$ be unital homomorphisms of the Cuntz algebra $O_2$ into a von Neumann algebra $M$. Then there is a sequence $(u_n)$ in $U(M)$ such that $\|\mathrm{Ad}\,u_n\pi_1(x) - \pi_2(x)\| \to 0$ for $x \in O_2$.

Proof. We briefly indicate how to prove it. Note that $O_2$ is generated by two isometries $s_1$ and $s_2$ with the relation $s_1s_1^* + s_2s_2^* = 1$. The linear span of $s_{i_1}s_{i_2}\cdots s_{i_n}s_{j_n}^*\cdots s_{j_2}^*s_{j_1}^*$ for all $n$-tuples
(i1 , . . . , in ), (j1 , . . . , jn ) in {1, 2} is isomorphic to the tensor product of n copies of M2 . The closure B of the union for all n is isomorphic to the UHF C ∗ -algebra N∞ 1 M2 . Define U = π2 (s1 )π1 (s1 )∗ + π2 (s2 )π1 (s2 )∗ , which is a unitary in M and satisfies that U π1 (si ) = π2 (si ). Define a unital endomorphism Φ of M by Φ(Q) = π1 (s1 )Qπ1 (s1 )∗ + π1 (s2 )Qπ1 (s2 )∗ . N∞ The restriction of Φ to π1 (B) induces the one-sided shift on B ∼ = 1 M2 . Using the Rohlin property of Φ|π1 (B) we can find a sequence (un ) in U(M) such that kun Φ(u∗n ) − U k → 0 . Then it follows that kAd un π1 (si ) − π2 (si )k → 0 for i = 1, 2, concluding the proof. Proof of Lemma A.2. Let M be an injective type III factor with separable predual. If M is of type III1 , then M may be identified with Mλ1 ⊗ Mλ2 , where Mλ is of type IIIλ for 0 < λ < 1 and log λ1 and log λ2 are rationally independent [30]. In any case, if M is of type IIIλ with 0 < λ ≤ 1, we find, by Lemma A.4, a unital endomorphism γ of M and a sequence (Un ) in U(M) such that M∩γ(M)0 contains a factor of type I∞ and Ad Un (Q) → γ(Q) and Ad Un∗ γ(Q) → Q in the strong operator topology for all Q ∈ M. Choose two isometries s1 and s2 in M ∩ γ(M)0 such that s1 s∗1 + s2 s∗2 = 1. Let O2 denote the C ∗ -algebra generated by these s1 and s2 . Then by Kirchberg and Phillips [12] we have that O2 ⊗ A ∼ = O2 . We define unital homomorphisms ψ1 and ψ2 of O2 ⊗ A into M by ψi (si ⊗ x) = si γπi (x). By Lemma A.5 there is a sequence (Vn ) in U(M) such that kAd Vn ψ1 (x) − ψ2 (x)k → 0 for x ∈ O2 ⊗ A ∼ = O2 . In particular we have that kAd Vn γπ1 (x) − γπ2 (x)k → 0 ,
x ∈ A.
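For the unitary $U$ and the endomorphism $\Phi$ introduced in the proof of Lemma A.5, the claimed identities can be written out as follows. This is a worked sketch using only the Cuntz relations $s_j^*s_i = \delta_{ij}1$ and $s_1s_1^* + s_2s_2^* = 1$; the last line indicates why $\|u_n\Phi(u_n^*) - U\| \to 0$ yields the norm convergence on the generators.

```latex
\begin{align*}
U\pi_1(s_i) &= \sum_{j=1}^{2}\pi_2(s_j)\,\pi_1(s_j^*s_i) = \pi_2(s_i), \\
U^*U &= \sum_{j,k=1}^{2}\pi_1(s_j)\,\pi_2(s_j^*s_k)\,\pi_1(s_k)^*
      = \sum_{j=1}^{2}\pi_1(s_js_j^*) = 1, \\
\Phi(Q)\pi_1(s_i) &= \sum_{j=1}^{2}\pi_1(s_j)\,Q\,\pi_1(s_j^*s_i) = \pi_1(s_i)\,Q
      \qquad (Q \in M), \\
u_n\pi_1(s_i)u_n^* &= u_n\Phi(u_n^*)\,\pi_1(s_i)
      \;\longrightarrow\; U\pi_1(s_i) = \pi_2(s_i).
\end{align*}
```

The computation of $UU^* = 1$ is symmetric to the second line, with the roles of $\pi_1$ and $\pi_2$ exchanged.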
Let F be a finite subset of A, G a finite subset of Hf , and > 0. Then there are k, ` ∈ N such that kUk∗ γπ2 (x)Uk ξ − π2 (x)ξk < ,
x∈F,
ξ∈G
and kV` γπ1 (x)V`∗ − γπ2 (x)k < ,
x∈F.
Then there is an m ∈ N such that ∗ kUm π1 (x)Um · V`∗ Uk ξ − γπ1 (x)V`∗ Uk ξk < ,
x∈F,
ξ ∈G.
Then we have that kAd(Uk∗ V` Um )π1 (x)ξ − π2 (x)ξk < 3 ,
x∈F,
ξ ∈G.
Thus we can choose a sequence (Wn ) from {Uk∗ V` Um | k, `, m ∈ N} such that Ad Wn π1 (x) strongly converges to π2 (x) for all x ∈ A. B. Appendix Let α be a flow on a C ∗ -algebra and let v be a continuous map of [0, ∞) into the unitary group U(A) of A (or A + C1 if A is non-unital) with v(0) = 1 such that γ(a) = lims→∞ Ad v(s)(a) defines an automorphism of A and t 7→ v(s)αt (v(s)∗ ) converges uniformly on every compact subset of t when s goes to infinity. In this case Ad v(s)(x) converges for any element x of the crossed product A ×α R and defines an automorphism γ¯ of A ×α R. We shall show: Proposition B.1. Under the above situation there is a continuous map z of [0, ∞) into U(A ×α R) such that z(0) = 1 and γ¯ (x) = lims→∞ Ad z(s)(x) for x ∈ A ×α R. Proof. If A is not unital, we suppose that v(s) ∈ A + 1 as we may. We denote by λ(t), t ∈ R the canonical unitary group in the multiplier algebra ∗ ∗ ∗ of A ×α R satisfying that R λ(t)aλ(t) = αt (a), a ∈ A. Let∗ C (λ) be the C -algebra ∼ generated by λ(f ) = f (t)λ(t)dt, f ∈ Cc (R), i.e., C (λ) = C0 (R). Note that A ×α R is generated by AC ∗ (λ). First of all we can find a continuous increasing map e of [0, ∞) into {f ∈ C ∗ (λ) | 0 ≤ f ≤ 1} such that ke(s)x − xk → 0 for x ∈ A ×α R as s → ∞ and e(s)e(t) = e(s) for t > s + 1 for any s. We find a sequence (φi ) of non-decreasing continuous functions from [0, ∞) into [0, ∞) such that φ1 (s) = s ≥ φ2 (s) ≥ φ3 (s) ≥ · · ·, {i | φi (s) > 0} is finite for any s, and supi≥1 kv(φi+1 (s)) − v(φi (s))k ≤ 2−s for a small constant > 0. We then find a sequence (ψi ) of increasing continuous functions of [0, ∞) into [0, ∞) such that s ≤ ψ1 (s) < ψ1 (s) + 1 < ψ2 (s) < ψ2 (s) + 1 < ψ3 (s) < · · ·, and w(s) = v(φ1 (s))e(ψ1 (s)) +
∞ X
v(φi+1 (s))(e(ψi+1 (s)) − e(ψi (s)))
i=1
= v(s)e(ψ1 (s)) +
n−1 X
v(φi+1 (s))(e(ψi+1 (s)) − e(ψi (s))) + 1 − e(ψn (s))
i=1
is invertible and approximately equal to a unitary as s → ∞, where n satisfies that φi (s) = 0 for i > n. This is indeed possible by making ψi (s) large since (e(ψi+1 (s)) − e(ψi (s)))(e(ψj+1 (s)) − e(ψj (s))) = 0 for i > j + 1. Because then v(φi (s)) and e(ψj (s)) nearly commute with each other. Since w(s)xw(s)∗ → γ¯(x) for x ∈ A ×α R and w(s) ∈ A ×α R + 1, we may take the unitary part of the polar decomposition of w(s) for z(s); then z, as a map into U(A ×α R), will satisfy the required condition.
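The orthogonality relation used in the proof of Proposition B.1 follows from the absorption property $e(s)e(t) = e(s)$ for $t > s + 1$ together with the commutativity of $C^*(\lambda) \cong C_0(\mathbb{R})$. As a sketch, fix $s$ and write $a = \psi_j(s)$, $b = \psi_{j+1}(s)$, $c = \psi_i(s)$, $d = \psi_{i+1}(s)$ (abbreviations introduced here only for this computation). For $i > j + 1$ the spacing of the $\psi_i$ gives $a + 1 < b$ and $b + 1 < c < d$, so that:

```latex
\begin{align*}
\bigl(e(d) - e(c)\bigr)\bigl(e(b) - e(a)\bigr)
  &= e(d)e(b) - e(d)e(a) - e(c)e(b) + e(c)e(a) \\
  &= e(b) - e(a) - e(b) + e(a) \\
  &= 0 .
\end{align*}
```

Each product $e(t)e(r)$ with $t > r + 1$ was replaced by $e(r)$, which is the absorption property after commuting the two factors.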
References

[1] H. Araki and E. J. Woods, A classification of factors, Publ. Res. Inst. Math. Sci. Kyoto Ser. A 4 (1968) 51–130.
[2] O. Bratteli, Inductive limits of finite-dimensional C*-algebras, Trans. Amer. Math. Soc. 171 (1972) 195–234; 4 (1968) 51–130.
[3] O. Bratteli and A. Kishimoto, AF flows and continuous symmetries, Rev. Math. Phys. 13 (2001) 1505–1528.
[4] O. Bratteli and D. W. Robinson, Operator algebras and quantum statistical mechanics, I (Springer, 1979); II (Springer, 1996).
[5] A. Connes, Une classification des facteurs de type III, Ann. Scient. Ec. Norm. Sup. 4e serie 6 (1973) 133–252.
[6] M. Fannes, P. Vanheuverzwijn and A. Verbeure, Quantum energy-entropy inequalities: A new method for proving the absence of symmetry breaking, J. Math. Phys. 25 (1984) 76–78.
[7] H. Futamura, N. Kataoka and A. Kishimoto, Homogeneity of the pure state space for separable C*-algebras, Int. J. Math. 12 (2001) 813–845.
[8] H. Futamura, N. Kataoka and A. Kishimoto, Type III representations and automorphisms of some separable nuclear C*-algebras, J. Funct. Anal. 197 (2003) 560–575.
[9] J. Glimm, A Stone-Weierstrass theorem for C*-algebras, Ann. of Math. 72 (1960) 216–244.
[10] J. Glimm, Type I C*-algebras, Ann. of Math. 73 (1961) 572–612.
[11] R. V. Kadison, Irreducible operator algebras, Proc. Nat. Acad. Sci. U.S.A. 43 (1957) 273–276.
[12] E. Kirchberg and N. C. Phillips, Embedding of exact C*-algebras in the Cuntz algebra O2, J. Reine Angew. Math. 525 (2000) 17–53.
[13] A. Kishimoto, Simple crossed products of C*-algebras by locally compact abelian groups, Yokohama Math. J. 28 (1980) 69–85.
[14] A. Kishimoto, C*-crossed products by R, Yokohama Math. J. 30 (1982) 151–164.
[15] A. Kishimoto, Ideals of C*-crossed products by locally compact abelian groups, Proc. of Symposia in Pure Mathematics, Part 1, 38 (1982) 365–368.
[16] A. Kishimoto, Outer automorphism subgroups of a compact abelian ergodic action, J. Operator Theory 20 (1988) 59–67.
[17] A. Kishimoto, A Rohlin property for one-parameter automorphism groups, Commun. Math. Phys. 179 (1996) 599–622.
[18] A. Kishimoto, Locally representable one-parameter automorphism groups of AF algebras and KMS states, Rep. Math. Phys. 45 (2000) 333–356.
[19] A. Kishimoto, UHF flows and the flip automorphism, Rev. Math. Phys. 9 (2001) 1163–1181.
[20] A. Kishimoto, Approximately inner flows on separable C*-algebras, Rev. Math. Phys. 14 (2002) 649–673.
[21] A. Kishimoto, Quasi-product flows on a C*-algebra, Commun. Math. Phys. 229 (2002) 397–413.
[22] A. Kishimoto, The representations and endomorphisms of a separable nuclear C*-algebra, Int. J. Math. 14 (2003) 313–326.
[23] A. Kishimoto, Central sequence algebras of a purely infinite simple C*-algebra, Canad. J. Math. to appear.
[24] A. Kishimoto, N. Ozawa and S. Sakai, Homogeneity of the pure state space of a separable C*-algebra, Canad. Math. Bull. 46 (2003) 365–372.
[25] A. Kishimoto and S. Sakai, Homogeneity of the pure state space for the separable nuclear C*-algebras (2001) unpublished.
[26] G. K. Pedersen, C*-algebras and their automorphism groups (Academic Press, 1979).
[27] R. T. Powers, Representations of uniformly hyperfinite algebras and their associated von Neumann rings, Ann. of Math. 86 (1967) 138–171.
[28] M. Rørdam, Classification of inductive limits of Cuntz algebras, J. Reine Angew. Math. 440 (1993) 175–200.
[29] S. Sakai, Operator algebras in dynamical systems (Cambridge Univ. Press, 1991).
[30] M. Takesaki, Theory of Operator Algebras III, Encyclopedia of Math. 127 (Springer, 2000).
[31] D. Voiculescu, A non-commutative Weyl-von Neumann theorem, Rev. Roum. Pures Appl. 21 (1976) 97–113.
May 31, 2004 13:56 WSPC/148-RMP
00207
Reviews in Mathematical Physics Vol. 16, No. 4 (2004) 509–558 c World Scientific Publishing Company
ALGEBRAIC APPROACH TO THE 1/N EXPANSION IN QUANTUM FIELD THEORY
STEFAN HOLLANDS
Enrico Fermi Institute, Department of Physics, University of Chicago, 5640 Ellis Ave., Chicago IL 60637, USA
[email protected]
Received 18 September 2003
Revised 19 March 2003

The 1/N expansion in quantum field theory is formulated within an algebraic framework. For a scalar field taking values in the N by N hermitian matrices, we rigorously construct the gauge invariant interacting quantum field operators in the sense of power series in 1/N and the ’t Hooft coupling parameter as members of an abstract *-algebra. The key advantages of our algebraic formulation over the usual formulation of the 1/N expansion in terms of Green’s functions are (i) that it is completely local so that infrared divergences in massless theories are avoided on the algebraic level and (ii) that it admits a generalization to quantum field theories on globally hyperbolic Lorentzian curved spacetimes. We expect that our constructions are also applicable in models possessing local gauge invariance such as Yang–Mills theories. The 1/N expansion of the renormalization group flow is constructed on the algebraic level via a family of *-isomorphisms between the algebras of interacting field observables corresponding to different scales. We also consider k-parameter deformations of the interacting field algebras that arise from reducing the symmetry group of the model to a diagonal subgroup with k factors. These parameters smoothly interpolate between situations of different symmetry.

Keywords: Quantum field theory; large N expansion; perturbation theory; algebraic methods.
1. Introduction

A common strategy to gain (at least approximate) information about physical models is to expand quantities of interest in terms of the parameters of the model. For example in perturbation theory, one expands in terms of the coupling parameter(s) of the theory. In quantum theories, it is sometimes fruitful to expand in terms of Planck’s constant, ℏ. The key point in all cases is that the theory one is expanding about (often a linear theory in the first example, and the classical limit in the second example) is under better control, and that there exist, in many cases, systematic and constructive schemes to calculate the deviations order by order. Another such expansion that has by now become standard in quantum field theory is the expansion in 1/N, where N describes the number of components of the
field(s) in the model. As in the previous two examples, the theory that one expands about, i.e. the large N limit, is often somewhat simpler than the theory at finite N, and can sometimes even be solved exactly. (Just as an example, it can happen [17] that the large N limit of a non-renormalizable theory is renormalizable, and the 1/N-corrections remain renormalizable.) The 1/N expansion in quantum field theory was first introduced by ’t Hooft [15] in the context of non-abelian gauge theories. He observed that, if one explicitly keeps track of all factors of N in the perturbative expansion of a connected Green’s function of gauge invariant interacting fields, then the series can be organized as a power series in 1/N, provided that the coupling parameter of the theory is also chosen to depend on N in a suitable way. Moreover, he showed that the Feynman diagrams associated with terms at a given order in 1/N can naturally be related to Riemann surfaces with a number of handles equal to that order. Thus, in the 1/N expansion of this model, the leading contribution corresponds to planar diagrams, the subleading contribution to diagrams with a toroidal topology, etc. The usual schemes for calculating the Green’s functions in perturbation theory implicitly assume that the interacting fields approach suitable “in”-fields in the asymptotic past, which one assumes can be identified with the fields in the underlying free field theory that one is expanding about. The existence of such “in”-fields is closely related to the possibility of interpreting the theory in terms of particles, and to the existence of an S-matrix. However, none of these usually exist in massless theories. Thus, the formulation of the 1/N expansion in terms of Green’s functions is potentially problematic in massless theories. These problems come into even sharper focus if one considers theories on non-static (globally hyperbolic) Lorentzian spacetimes.
Here, in general, there is not even a preferred vacuum state available on which to base the calculation of the Green’s functions. Moreover, one would certainly not expect the fields to approach free “in”-fields in the asymptotic past, for example in spacetimes that do not have suitable static regions in the asymptotic future and past (such as our very own universe). A strategy to avoid these difficulties has recently been developed in [3, 4]. The new idea in those references is to construct directly the interacting field operators as members of some *-algebra of observables, rather than trying to construct the Green’s functions. The key advantage of this approach is that, as it turns out, the interacting field operators can always be defined in a completely satisfactory way, without any reference to an imagined (in general non-existent) “in”-field, or “in”-states in the asymptotic past. Consequently, infrared problems do not arise on the level of the interacting field observables and their associated algebras. [Footnote a: Such divergences will arise if one tries to construct (non-existent) quantum states in the theory corresponding to “free incoming particles”.] Not surprisingly, these ideas have also been a key ingredient in the construction of interacting quantum field theories in curved spacetimes [10–12]. The purpose of this article is to show that it is also possible to formulate the 1/N expansion directly in terms of the interacting fields and their associated algebras
May 31, 2004 13:56 WSPC/148-RMP
00207
Algebraic Approach to the 1/N Expansion in Quantum Field Theory
511
of observables to which these fields belong, thereby achieving a complete disentanglement from the infrared behavior of the theory. We will consider in this article explicitly only the theory of a scalar field in the $N \otimes \bar N$ representation of U(N) on Minkowski space. However, since our algebraic construction is done without making use of any of the particular features of Minkowski space, it can be generalized to arbitrary Lorentzian curved spacetimes by the methods of [3, 11, 12]. Also, we expect that our methods are applicable to models with local gauge symmetry such as Yang–Mills theories, although the algebraic construction of the interacting fields in such theories is more complex due to the presence of unphysical degrees of freedom, and is still a subject of investigation, see e.g. [5, 16, 7].
We now summarize the contents of this paper. In Sec. 2 we review, in a pedagogical way, the construction of the field observables and the corresponding algebra associated with a single, free hermitian scalar field φ. This algebra is sufficiently large to contain the Wick powers of φ and their time-ordered products, which are required later in the construction of the corresponding interacting quantum field theory. A description of the properties of these objects and their construction is therefore included.
In Sec. 3, we generalize these constructions to a free scalar field in the $N \otimes \bar N$ representation of U(N), and we show how to construct the 1/N expansion of this theory on the level of field observables and their associated algebras.
In Sec. 4, we proceed to interacting quantum field theories. Based on the algebraic construction of the underlying free quantum field theory in Sec. 3, we construct the interacting quantum fields as formal power series in 1/N and the 't Hooft coupling as members of a suitable algebra $A_V$, where V is a gauge invariant interaction.
The contributions to these quantities arising at order H in the 1/N expansion correspond precisely to Feynman diagrams whose topology is that of a Riemann surface with H handles. We point out that the algebras $A_V$ also incorporate an expansion in ħ (“loop-expansion”) [4], as well as, by construction, an expansion in the coupling parameters appearing in V. Therefore, the construction of $A_V$ in fact incorporates all three expansions mentioned at the beginning.
In Sec. 5 we review, first for a single scalar field, the formulation of the renormalization group in the algebraic framework [10] that we are working in. We then show that this construction can be generalized to an interacting field in the $N \otimes \bar N$ representation of U(N) with gauge invariant interaction, defined in the sense of a power series in 1/N. The renormalization group map is therefore also defined as a power series in 1/N.
In Sec. 6, we vary the constructions of Secs. 3 and 4 by considering interactions that are invariant only under some subgroup of U(N). We show that, for a diagonal subgroup with k factors, the algebraic construction of the 1/N expansion can still be carried through, but now leads to deformed algebras of interacting field observables which are labeled by k real deformation parameters associated with the relative size of the subgroups. These parameters smoothly interpolate between situations of different symmetry. The contributions to an interacting field at order H in 1/N
S. Hollands
are now associated with Riemann surfaces that are “colored” by k “spins”, where each coloring is weighted according to the values of the deformation parameters.
2. Algebraic Construction of a Single Scalar Quantum Field
The perturbative construction of an interacting quantum field theory is based on the construction of the corresponding free quantum field theory, and we shall therefore begin by considering free fields. In this section, we will review how to define an algebra of observables associated with a single free hermitian Klein–Gordon field of mass m, described by the classical action^b
$$ S = \int (\partial_\mu \phi\, \partial^\mu \phi + m^2 \phi^2)\, d^d x , \tag{1} $$
which is large enough to contain the Wick powers and the time-ordered products of the field φ. Our review is essentially self-contained and follows the ideas developed in [4, 10, 11, 12, 3], which the reader may look up for details.
For pedagogical purposes, we begin by defining first a “minimal algebra” of observables associated with the action (1). Consider the free *-algebra over the complex numbers generated by a unit 1 and formal expressions $\phi(f)$ and $\phi(h)^*$, where f and h run through the space of compactly supported smooth test functions on $\mathbb{R}^d$. The minimal algebra is obtained by factoring this free algebra by the following relations:
(i) (Linearity) $\phi(af + bh) = a\phi(f) + b\phi(h)$ for all $a, b \in \mathbb{C}$ and test functions f, h.
(ii) (Field equation) $\phi((\partial^\mu \partial_\mu - m^2) f) = 0$ for all test functions f.
(iii) (Hermiticity) $\phi(f)^* = \phi(\bar f)$.
(iv) (Commutation relations) $[\phi(f), \phi(h)] = i\Delta(f, h) \cdot 1$, where $\Delta(f, h)$ is the advanced minus retarded propagator for the Klein–Gordon equation, smeared with the test functions f and h.
We formally think of the expressions $\phi(f)$ as the “smeared” quantum fields, i.e., the integral of the formal^c pointlike quantum field against the test function f,
$$ \phi(f) = \int_{\mathbb{R}^d} \phi(x) f(x)\, d^d x . \tag{2} $$
The linearity of the expression $\phi(f)$ in f corresponds to the linearity of the integral. Relation (ii) is the field equation for $\phi(x)$ in the sense of distributions, i.e. it formally corresponds to the Klein–Gordon equation for $\phi(x)$ via a partial integration. Relation (iii) says that the field φ is hermitian and relation (iv) implements the usual commutation relations of the free hermitian scalar field on d-dimensional Minkowski spacetime.
b Our signature convention is − + + + ···.
c Since ∆ is a distribution, relation (iv) implies that the field φ necessarily has a distributional character. Therefore, the field only makes good mathematical sense after smearing with a test function.
The minimal algebra is too small for our purposes. It does not, for example, contain observables corresponding to Wick powers of the field φ at the same spacetime point, nor their time-ordered products. These are, however, required if one wants to construct the interacting quantum field theory perturbatively around the free theory. We now construct an enlarged algebra, W, which contains elements corresponding to these observables. For this purpose, it is useful to first present the minimal algebra in terms of a new set of generators, defined by $W_0 = 1$, and
$$ W_a(f_1 \otimes \cdots \otimes f_a) = (-i)^a \frac{\partial^a}{\partial\lambda_1 \cdots \partial\lambda_a} \left. e^{i\phi(F)}\, e^{\frac{1}{2}\Delta_+(F,F)} \right|_{\lambda_i = 0} , \qquad F = \sum_{i=1}^{a} \lambda_i f_i , \tag{3} $$
where $\Delta_+$ is any distribution in 2 spacetime variables which is a solution to the Klein–Gordon equation in each variable, and which has the property that its antisymmetric part is equal to $(i/2)\Delta$. In particular, we could choose $\Delta_+ = (i/2)\Delta$ at this stage, but it is important to leave this choice open for later. It follows from the definition that $W_1(f) = \phi(f)$, that the quantities $W_a$ are symmetric under exchange of the test functions, and that
$$ W_a(f_1 \otimes \cdots \otimes f_a)^* = W_a(\bar f_1 \otimes \cdots \otimes \bar f_a) , \qquad W_a(f_1 \otimes \cdots (\partial^\mu \partial_\mu - m^2) f_i \otimes \cdots f_a) = 0 . \tag{4} $$
Using the algebraic relations (i)–(iv), one can express the product of two such quantities again as a linear combination of such quantities,
$$ W_a(f_1 \otimes \cdots \otimes f_a) \cdot W_b(h_1 \otimes \cdots \otimes h_b) = \sum_{P} \prod_{(k,l) \in P} \Delta_+(f_k, h_l) \cdot W_c\Big( \bigotimes_{i \notin P_1} f_i \otimes \bigotimes_{j \notin P_2} h_j \Big) , \tag{5} $$
where the following notation has been used to organize the sum on the right side: we consider sets P of pairs $(i, j) \in \{1, \ldots, a\} \times \{1, \ldots, b\}$; we say that $i \notin P_1$ if there is no j such that $(i, j) \in P$, and we say $j \notin P_2$ if there is no i such that $(i, j) \in P$. The number c is related to a and b by $a + b - c = 2|P|$, where $|P|$ is the number of pairs in P. It is easy to see that relations (4) and (5) form an equivalent presentation of the minimal algebra, i.e. we could equivalently define the minimal algebra to be the abstract algebra generated by the elements $W_a(\otimes_i f_i)$, subject to the relations (4) and (5), instead of defining it as the algebra generated by $\phi(f)$ subject to the relations (i)–(iv) above (this is true no matter what the particular choice of $\Delta_+$ is). Thus, all we have done so far is to rewrite the minimal algebra in terms of different generators.
To obtain the desired extension, W, of the minimal algebra, we now choose a distribution $\Delta_+$ which is not only a bisolution to the Klein–Gordon equation with the property that its antisymmetric part is equal to $(i/2)\Delta$, but which has the additional property that it is of positive frequency type in the first variable and of negative frequency type in the second variable. This condition is formalized by
demanding that the wave front set^d $\mathrm{WF}(\Delta_+)$ (for the definition of the wave front set of a distribution, see [9]) has the following, so-called “Hadamard”, property
$$ \mathrm{WF}(\Delta_+) \subset \{ (x_1, x_2; p_1, p_2) \in (\mathbb{R}^d \times \mathbb{R}^d) \times (\mathbb{R}^d \times \mathbb{R}^d \setminus (0,0)) \mid p_1 = -p_2,\ (x_1 - x_2)^2 = 0,\ p_1 \in \bar V^+ \} \equiv C_+ , \tag{6} $$
where $\bar V^\pm$ denotes the closure of the future, resp. past, lightcone in $\mathbb{R}^d$. The key point now is that the relations (4) and (5) make sense not only for test functions of the form $f_1 \otimes \cdots \otimes f_a$, but even much more generally for any test distribution, t, in the space $E'_a$ of compactly supported distributions in a spacetime arguments which have the property that their wave front set does not contain any element of the form $(x_1, \ldots, x_a; p_1, \ldots, p_a)$ such that all $p_i$ are either in the closure of the forward lightcone, or the closure of the past lightcone,
$$ E'_a = \{ t \in D'(\times^a \mathbb{R}^d) \mid t \text{ comp. supp.},\ \mathrm{WF}(t) \text{ has no element in common with } (\times^a \mathbb{R}^d) \times (\times^a \bar V^+) \text{ or } (\times^a \mathbb{R}^d) \times (\times^a \bar V^-) \} . \tag{7} $$
Indeed, these conditions on the wave front set of t, together with the wave front set properties (6) of $\Delta_+$, can be shown to guarantee that the potentially ill-defined products of distributions occurring in a product $W_a(t) \cdot W_b(s)$ are in fact well-defined and are such that the resulting terms are each of the form $W_c(u)$, with u again an element in the space $E'_c$.^e We take W to be the algebra generated by symbols of the form $W_a(t)$, $t \in E'_a$, subject to the relations (4) and (5), with $\otimes_i f_i$ in those relations replaced by distributions in the spaces $E'_a$.
Our definition of the generators $W_a(t)$ depends on the particular choice of $\Delta_+$. However, as an abstract algebra, W is independent of this choice [11]. To see this, choose any other bidistribution $\Delta'_+$ with the same wave front set property as $\Delta_+$, and let $W'$ be the corresponding algebra with generators $W'_a(t)$ defined as in Eq. (3). Then W and $W'$ are isomorphic. The isomorphism is given in terms of the generators by
$$ W_a(t) \to \sum_{2n \le a} \frac{a!}{(2n)!\,(a - 2n)!}\, W'_{a-2n}(\langle F^{\otimes n}, t \rangle) , \tag{8} $$
where $F = \Delta_+ - \Delta'_+$, and where $\langle F^{\otimes n}, t \rangle$ is the compactly supported distribution on $\times^{a-2n} \mathbb{R}^d$ defined by
$$ \langle F^{\otimes n}, t \rangle (y_1, \ldots, y_{a-2n}) = \int t(x_1, \ldots, x_{2n}, y_1, \ldots, y_{a-2n}) \prod_i F(x_i, x_{i+1}) \prod_i d^d x_i . \tag{9} $$
d It can be shown that the wave front set of a distribution is actually invariantly defined as a subset of the cotangent space of the manifold on which the distribution is defined. The set (6) should therefore be intrinsically thought of as a subset of $T^*(\mathbb{R}^d \times \mathbb{R}^d)$.
e We note that the wave front set of $(i/2)\Delta$ is not of Hadamard type, and $\Delta_+ = (i/2)\Delta$ is not a possible choice in the construction of W.
(That this distribution is in the class $E'_{a-2n}$ follows from the wave front set properties of $\Delta_+$, $\Delta'_+$, which, together with the wave equation, imply that F is smooth.) This completes our construction of the algebra of quantum observables for a single Klein–Gordon field associated with the action (1).
Quantum states in the algebraic framework are by definition linear functionals $\omega: W \to \mathbb{C}$ which are positive in the sense that $\omega(A^* A) \ge 0$ for all $A \in W$, and which are normalized so that $\omega(1) = 1$. This algebraic notion of a quantum state encompasses the usual Hilbert-space notion of state, in the sense that any vector or density matrix in a Hilbert space on which the elements of W are represented as linear operators defines an algebraic state in the above sense via taking expectation values. Conversely, given an algebraic state ω, the GNS-construction yields a representation π on a Hilbert space H containing a vector $|\Omega\rangle$ such that $\omega(A) = \langle \Omega | \pi(A) | \Omega \rangle$. Note, however, that it is not true that any state on W arises in this way from a single, given Hilbert-space representation (this is closely related to the fact that W has (many) inequivalent representations). The algebra W can be equipped with a unique topology that makes the product and *-operation continuous [11], and the notion of a continuous state on W can thereby be defined. The continuous states ω on W can be characterized entirely in terms of the n-point distributions $\omega(\phi(x_1) \cdots \phi(x_n))$ of the field φ. Moreover, it can be shown [14] that the continuous states are precisely those for which the 2-point distribution has wave front set given by Eq. (6), and for which the so-called “connected” n-point distributions are smooth for $n \ne 2$.
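The two defining properties of an algebraic state, positivity and normalization, can be illustrated in a finite-dimensional toy setting. The sketch below is only an illustration under a strong assumption: the algebra W is replaced by the 2 × 2 complex matrices, and the state is the functional ω(A) = Tr(ρA) built from a hypothetical density matrix `rho`. None of the names in the code appear in the text.

```python
# Toy model (assumption): 2x2 complex matrices stand in for the algebra W.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """The *-operation: conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def make_state(rho):
    """Return the linear functional omega(A) = Tr(rho A)."""
    return lambda a: trace(matmul(rho, a))

rho = [[0.75, 0.0], [0.0, 0.25]]   # hypothetical density matrix with Tr(rho) = 1
omega = make_state(rho)

identity = [[1, 0], [0, 1]]
A = [[1 + 2j, 0.5j], [3, -1j]]     # an arbitrary element of the toy algebra

normalization = omega(identity)           # omega(1), expected to equal 1
positivity = omega(matmul(dagger(A), A))  # omega(A* A), expected to be >= 0
```

In this toy model the functional is a state exactly when `rho` is positive semi-definite with unit trace, which mirrors the statement that density matrices in a Hilbert-space representation define algebraic states.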
The invariance of the action (1) under the Poincaré group is reflected in a corresponding invariance of W, in the sense that W admits an automorphic action of the Poincaré group: For any element $\{\Lambda, a\}$ of the Poincaré group consisting of a proper, orthochronous Lorentz transformation Λ and a translation vector $a \in \mathbb{R}^d$, there is an automorphism $\alpha_{\{\Lambda,a\}}$ on W satisfying the composition law $\alpha_{\{\Lambda,a\}} \circ \alpha_{\{\Lambda',a'\}} = \alpha_{\{\Lambda,a\} \cdot \{\Lambda',a'\}}$. The action of this automorphism is most easily described if we choose a $\Delta_+$ which is invariant under the Poincaré group. (Since W is independent of the choice of $\Delta_+$, we may do so if we like.) An admissible^f choice for $\Delta_+$ with the above properties is the Wightman function of the free field,
$$ \Delta_+(x, y) = w^{(m)}(x - y) \equiv \frac{1}{(2\pi)^{d-1}} \int_{p^0 \ge 0} \delta(p^2 - m^2)\, e^{ip(x-y)}\, d^d p . \tag{10} $$
With this choice, the action of $\alpha_{\{\Lambda,a\}}$ is simply given by^g
$$ \alpha_{\{\Lambda,a\}}(W_a(t)) = W_a(t \circ \{\Lambda, a\}) . \tag{11} $$
f That the wave front set of the Wightman function is equal to (6) is proved e.g. in [18].
g That $t \circ \{\Lambda, a\}$ is again an element of $E'_a$ is a consequence of the covariant transformation law $\mathrm{WF}(f^* t) = f^* \mathrm{WF}(t)$ of the wave front set [9], where f can be any diffeomorphism, together with the fact that the future/past lightcones are preserved under the action of the proper, orthochronous Poincaré group. On the other hand, a Lorentz transformation reversing the time orientation does not preserve the spaces $E'_a$ and consequently does not give rise to an automorphism of W.
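To make the combinatorics of the product formula (5) concrete, the following sketch enumerates the sets P of pairs together with the unpaired indices and the resulting value of c. It assumes, as is implicit in the contraction structure of (5), that no index i or j occurs in more than one pair of a given P. The function names are hypothetical and the code only counts terms; it does not evaluate any distribution.

```python
from itertools import combinations, permutations

def pairings(a, b):
    """Yield every set P of pairs (i, j), 1 <= i <= a, 1 <= j <= b,
    in which each index is used at most once (assumption of this sketch)."""
    for size in range(min(a, b) + 1):
        for left in combinations(range(1, a + 1), size):
            for right in permutations(range(1, b + 1), size):
                yield tuple(zip(left, right))

def product_terms(a, b):
    """List the terms of W_a * W_b as (P, unpaired f-indices, unpaired h-indices, c)."""
    terms = []
    for p in pairings(a, b):
        used_f = {i for i, _ in p}
        used_h = {j for _, j in p}
        unpaired_f = [i for i in range(1, a + 1) if i not in used_f]
        unpaired_h = [j for j in range(1, b + 1) if j not in used_h]
        c = len(unpaired_f) + len(unpaired_h)  # satisfies a + b - c = 2|P|
        terms.append((p, unpaired_f, unpaired_h, c))
    return terms

terms_22 = product_terms(2, 2)
terms_11 = product_terms(1, 1)
```

For a = b = 1 the enumeration returns two terms, the empty set and the single pair (1, 1), which reproduces $\phi(f)\phi(h) = W_2(f \otimes h) + \Delta_+(f, h) \cdot 1$ as read off from (5).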
Furthermore, with this choice for $\Delta_+$, the generators $W_a(t)$ correspond to the usually considered normal-ordered products of fields,
$$ W_a(f_1 \otimes \cdots \otimes f_a) = \,: \prod_{i=1}^{a} \phi(f_i) : \, , \tag{12} $$
and the product formula (5) simply corresponds to “Wick’s theorem” for multiplying two Wick polynomials.
Since $W_1(f) = \phi(f)$, the enlarged algebra W contains the minimal algebra generated by the free field $\phi(f)$ as a subalgebra. In fact, W also contains Wick powers of the field at the same spacetime point as well as their time-ordered products, which are not in the minimal algebra. These objects can be characterized axiomatically (not uniquely, as we shall see) by a number of properties that we will list now. In order to state these properties in a convenient way, let us introduce the vector space V whose basis elements are labeled by formal products of the field φ and its derivatives,
$$ V = \mathrm{span}\Big\{ O = \prod \partial_{\mu_1} \cdots \partial_{\mu_k} \phi \Big\} , \tag{13} $$
so that each element of V is given by a formal linear combination $\sum_j g_j O_j$, $g_j \in \mathbb{C}$. We refer to the elements of V as “formal” field expressions, because no relations such as the field equation are assumed to hold at this stage. Consider, furthermore, the space $D(\mathbb{R}^d; V)$ of smooth functions of compact support whose values are elements in the vector space V. Thus, any element $F \in D(\mathbb{R}^d; V)$ can be written in the form $F(x) = \sum_i f_i(x) O_i$, where $O_i$ are basis elements in V, and where $f_i$ are complex valued smooth functions on $\mathbb{R}^d$ of compact support. We view the Wick powers as linear maps
$$ D(\mathbb{R}^d; V) \to W , \qquad f O \mapsto O(f) , \tag{14} $$
and the n-fold time-ordered products as multilinear maps
$$ T : \times^n D(\mathbb{R}^d; V) \to W , \qquad (f_1 O_1, \ldots, f_n O_n) \mapsto T\Big( \prod_i f_i O_i \Big) . \tag{15} $$
Time-ordered products with only one factor are required to be given by the corresponding Wick power,
$$ T(f O) \equiv O(f) . \tag{16} $$
The further properties required from the time-ordered products (including the Wick powers as a special case) are the following^h:
h Actually, one ought to impose additional renormalization conditions specifically for time-ordered products containing derivatives of the fields, see e.g. [5] and [13], beyond the requirements (t1)–(t8) below. Such conditions are important e.g. in order to show that the field equations or conservation equations hold for the interacting fields, but they do not play a role in the present paper. We have therefore omitted them here to keep things as simple as possible.
(t1) (Symmetry) The time-ordered products are symmetric under exchange of the arguments.
(t2) (Causal factorization) If I is a subset of $\{1, \ldots, n\}$, and if the supports of $\{f_i\}_{i \in I}$ are in the causal future of the supports of $\{f_j\}_{j \in I^c}$ ($I^c$ denotes the complement of I), then we ask that
$$ T\Big( \prod_i f_i O_i \Big) = T\Big( \prod_{i \in I} f_i O_i \Big)\, T\Big( \prod_{j \in I^c} f_j O_j \Big) . \tag{17} $$
(t3) (Commutator)
$$ \Big[ T\Big( \prod_{i=1}^{n} f_i O_i \Big), \phi(h) \Big] = i \sum_k \sum_{\mu_1 \cdots \mu_l} T\Big( f_1 O_1 \cdots (h\, \partial_{\mu_1} \cdots \partial_{\mu_l} \Delta * f_k) \frac{\partial O_k}{\partial(\partial_{\mu_1} \cdots \partial_{\mu_l} \phi)} \cdots f_n O_n \Big) , \tag{18} $$
where we have set $(\Delta * f)(x) = \int \Delta(x - y) f(y)\, d^d y$.
(t4) (Covariance) Let $\{\Lambda, a\}$ be a Poincaré transformation. Then
$$ \alpha_{\{\Lambda,a\}}\Big( T\Big( \prod_i f_i O_i \Big) \Big) = T\Big( \prod_i \psi^*_{\{\Lambda,a\}}(f_i O_i) \Big) , \tag{19} $$
where $\psi^*_{\{\Lambda,a\}}$ denotes the pull-back of an element in $D(\mathbb{R}^d; V)$ by the linear transformation $x \to \Lambda x + a$.
(t5) (Scaling) The time-ordered products have the following “almost homogeneous” scaling behavior under simultaneous rescalings of the inertial coordinates and the mass, m. Let $\lambda > 0$, and set $f^\lambda(x) = \lambda^{-n} f(\lambda x)$ for any function or distribution on $\mathbb{R}^n$. For a given prescription $T^{(m)}$ for the value m of the mass (valued in the algebra $W^{(m)}$ associated with this value of the mass), consider the new prescription $T^{(m)\prime}$ defined by
$$ T^{(m)\prime}\Big( \prod_i f_i O_i \Big) \equiv \lambda^{-\sum_i d_i}\, \sigma_\lambda\Big[ T^{(\lambda m)}\Big( \prod_i f_i^\lambda O_i \Big) \Big] , \tag{20} $$
where $\sigma_\lambda : W^{(\lambda m)} \to W^{(m)}$ is the canonical isomorphism,^i and where $d_i$ is the “engineering dimension”^j of the field $O_i$. (Note that $T^{(\lambda m)}$ is valued in $W^{(\lambda m)}$.)
i The canonical isomorphism is defined by $\sigma_\lambda : W^{(\lambda m)} \ni W_a(t) \to \lambda^{-a} \cdot W_a(t^\lambda) \in W^{(m)}$.
j In scalar field theory in d spacetime dimensions, the mass dimension of a field O is defined as the number of derivatives plus $(d - 2)/2$ times the number of factors of φ plus twice the number of factors of $m^2$.
Then we demand that $T^{(m)\prime}$ depends at most logarithmically on λ in the sense that^k
$$ T^{(m)\prime} = T^{(m)} + \text{polynomial expressions in } \ln \lambda . \tag{21} $$
(t6) (Microlocal spectrum condition) Let ω be a continuous state on W. Then the distributions $\omega_T : (f_1, \ldots, f_n) \to \omega(T(\prod f_i O_i))$ are demanded to have wave front set
$$ \mathrm{WF}(\omega_T) \subset C_T , \tag{22} $$
where the set $C_T \subset (\times^n \mathbb{R}^d) \times (\times^n \mathbb{R}^d \setminus \{0\})$ is described as follows (we use the graphological notation introduced in [2, 3]): Let $\Gamma(p)$ be a “decorated embedded Feynman graph” in $\mathbb{R}^d$. By this we mean an embedded Feynman graph in $\mathbb{R}^d$ whose vertices are points $x_1, \ldots, x_n$ with valence specified by the fields $O_i$ occurring in the time-ordered product under consideration, and whose edges, e, are oriented null-lines [i.e. $(x_i - x_j)^2 = 0$ if $x_i$ and $x_j$ are connected by an edge]. Each such null line is equipped with a momentum vector $p_e$ parallel to that line. If e is an edge in $\Gamma(p)$ connecting the points $x_i$ and $x_j$ with $i < j$, then $s(e) = i$ is its source and $t(e) = j$ its target. It is required that $p_e$ is future/past directed if $x_{s(e)}$ is not in the past/future of $x_{t(e)}$. With this notation, we define
$$ C_T = \Big\{ (x_1, \ldots, x_n; k_1, \ldots, k_n) \;\Big|\; \exists \text{ decorated Feynman graph } \Gamma(p) \text{ with vertices } x_1, \ldots, x_n \text{ such that } k_i = \sum_{e : s(e) = i} p_e - \sum_{e : t(e) = i} p_e \ \ \forall i \Big\} . \tag{23} $$
(t7) (Unitarity) We have $T^* = \bar T$, where $\bar T$ is the “anti-time-ordered” product, defined as
$$ \bar T(f_1 O_1 \cdots f_n O_n) = \sum_{I_1 \sqcup \cdots \sqcup I_j = \{1, \ldots, n\}} (-1)^{n+j}\, T\Big( \prod_{i \in I_1} \bar f_i O_i \Big) \cdots T\Big( \prod_{i \in I_j} \bar f_i O_i \Big) , \tag{24} $$
where the sum runs over all partitions of the set $\{1, \ldots, n\}$ into disjoint subsets $I_1, \ldots, I_j$.
(t8) (Smooth dependence upon m) The time-ordered products depend smoothly upon the mass parameter m in the following sense. Let $\omega^{(m)}$ be a 1-parameter family of states on $W^{(m)}$. We say that $\omega^{(m)}$ depends smoothly upon m if (i) the 2-point function $\omega_2^{(m)}(x, y)$, when viewed as a distribution jointly in m, x, y, has wave front set
$$ \mathrm{WF}(\omega_2^{(m)}) \subset \{ (x_1, x_2, m; p_1, p_2, \rho) \mid (p_1, p_2, \rho) \ne 0,\ (x_1, x_2; p_1, p_2) \in C_+ \} , \tag{25} $$
where the set $C_+$ was defined above in Eq. (6), and if (ii) the truncated n-point functions $\omega_n^{(m)\,\mathrm{conn}}$ are smooth jointly in $m, x_1, \ldots, x_n$. We say the prescription $T^{(m)}$ is smooth in m if $\omega_T^{(m)}(f_1, \ldots, f_n) = \omega^{(m)}(T^{(m)}(\prod f_i O_i))$ (viewed as a distribution jointly in m and its spacetime arguments) has wave front set
$$ \mathrm{WF}(\omega_T^{(m)}) \subset \{ (x_1, \ldots, x_n, m; k_1, \ldots, k_n, \rho) \mid (k_1, \ldots, k_n, \rho) \ne 0,\ (x_1, \ldots, x_n; k_1, \ldots, k_n) \in C_T \} \tag{26} $$
for such a smooth family of states, where the set $C_T$ was defined above in Eq. (23).
k The difference $T^{(m)} - T^{(m)\prime}$ describes the failure of $T^{(m)}$ to scale exactly homogeneously. For the time-ordered products with only one factor (i.e. the Wick powers), it can be shown that this difference vanishes, i.e. the Wick powers scale exactly homogeneously in Minkowski space. For the time-ordered products with more than one factor, the logarithms cannot in general be avoided.
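The “engineering dimension” entering the scaling property (t5) is a simple counting rule, and a short sketch may help fix the bookkeeping. The function below implements the rule stated for scalar field theory in d spacetime dimensions (number of derivatives, plus (d − 2)/2 per factor of φ, plus 2 per factor of m²). The function names and the way a field is encoded by three integers are assumptions of this sketch, not notation from the text.

```python
from fractions import Fraction

def engineering_dimension(d, n_derivatives, n_phi, n_m2=0):
    """Mass dimension of a field monomial in d spacetime dimensions:
    #derivatives + (d - 2)/2 * #phi + 2 * #(m^2)."""
    return Fraction(n_derivatives) + Fraction(d - 2, 2) * n_phi + 2 * n_m2

def scaling_prefactor(lam, dimensions):
    """The factor lambda^(-sum_i d_i) appearing in the rescaled prescription (20)."""
    total = sum(dimensions, Fraction(0))
    return lam ** float(-total)

dim_phi4 = engineering_dimension(4, 0, 4)          # phi^4 in d = 4
dim_kinetic = engineering_dimension(4, 2, 2)       # (d phi)(d phi) in d = 4
dim_phi_3d = engineering_dimension(3, 0, 1)        # phi in d = 3
prefactor = scaling_prefactor(2.0, [dim_phi4, dim_kinetic])
```

Exact rational arithmetic is used because (d − 2)/2 is half-integral in odd dimensions; only the final power of λ is evaluated in floating point.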
It is relatively straightforward to demonstrate the existence of a prescription for defining the Wick powers as elements of W satisfying the above properties. For example, for the fields $\phi^a \in V$, $a = 1, 2, \ldots$, the corresponding algebra elements $\phi^a(f) \in W$ satisfying the above properties may be defined as follows. Let $H^{(m)}(x, y)$ be any family of bidistributions satisfying the wave equation in both entries and the wave front set condition (6), whose antisymmetric part is equal to $(i/2)\Delta(x, y)$, and which has a smooth dependence upon m in the sense of (t8). Define
$$ \phi^a(x) = \frac{\delta^a}{i^a\, \delta f(x)^a} \left. e^{i\phi(f) + \frac{1}{2} H^{(m)}(f,f)} \right|_{f=0} . \tag{27} $$
Then $\phi^a(f) \in W$ satisfies (t1)–(t8). This definition of $\phi^a(f)$ can be restated equivalently as follows: we may use the bidistribution $\Delta_+ = H^{(m)}$ in the definition of the generators $W_a$ (see Eq. (3)) and the algebra product (5) of W, since we have already argued that W is independent of the particular choice of $\Delta_+$. Consider the distribution t given by
$$ t(x_1, \ldots, x_a) = f(x_1)\, \delta(x_1 - x_2) \cdots \delta(x_{a-1} - x_a) , \tag{28} $$
where δ is the ordinary delta-distribution in $\mathbb{R}^d$. Then one can show that t is in the class of distributions $E'_a$, and definition (27) (in smeared form) is equivalent to setting
$$ \phi^a(f) = W_a(t) \in W , \tag{29} $$
where it is understood that $W_a$ is defined in terms of $\Delta_+ = H^{(m)}$. Wick powers containing derivatives are defined in a similar way via suitable derivatives of delta distributions.
The usual “normal ordering” prescription for Wick powers would correspond to setting $H^{(m)}$ equal to the Wightman 2-point function $w^{(m)}$ given above in Eq. (10). However, this is actually not an admissible choice in our framework since, by inspection, $w^{(m)}$ (and hence the vacuum state) does not depend smoothly upon m in the
sense of (t8). In fact, the Wightman 2-point function $w^{(m)}(x - y)$ explicitly contains a term of the form $J[m^2 (x - y)^2] \log m^2$ with a logarithmic dependence upon the mass m, where J is a smooth (in fact, analytic) function that can be expressed in terms of Bessel functions. For this reason, the usual normal ordering prescription violates our condition (t8) that the Wick powers have a smooth dependence upon m. An admissible choice for $H^{(m)}$ is e.g. $w^{(m)}$ without this logarithmic term,
$$ H^{(m)}(x, y) = w^{(m)}(x - y) - J[m^2 (x - y)^2] \log m^2 . \tag{30} $$
Since normal ordering is not admissible in our framework, it follows that no prescription for Wick powers satisfying (t1)–(t8) can have the property that it has a vanishing expectation value in the vacuum state for all values of $m \in \mathbb{R}$, because this property precisely distinguishes normal ordering. However, we can always adjust our prescription within the freedom left over by (t1)–(t8) in such a way that all Wick powers have a vanishing expectation value in the vacuum state for an arbitrary, but fixed value of m. It is therefore clear that, in practice, our prescription is just as viable as the usual normal ordering prescription, since m can take on only one value. On the other hand, our prescription would lead to different predictions in a theory containing a spacetime dependent mass.
It is not possible to give a similarly explicit construction of time-ordered products satisfying (t1)–(t8) with more than one factor. Using the ideas of “causal perturbation theory” (see e.g. [19]) one can, however, give an inductive construction of the time-ordered products so that (t1)–(t8) are satisfied, which is based upon the above construction of the Wick powers (i.e. time-ordered products with one factor). These constructions are described in detail in [3, 5, 12] (see especially [12] for the proof that scaling property (t5) can be satisfied), and we will therefore only sketch the key steps and ideas going into this inductive construction, referring the reader to the references for details.
The main idea behind the inductive construction is that the causal factorization property expressing the temporal ordering of the factors in the time-ordered product already defines time-ordered products with the desired properties for non-coinciding spacetime points once the Wick powers are known. Namely, if e.g. supp $f_1$ is after supp $f_2$, supp $f_2$ is after supp $f_3$ etc., so that by (t2) the later factors stand to the left, then the causal factorization property tells us that we must have
$$ T\Big( \prod_i f_i O_i \Big) = O_1(f_1) \cdots O_n(f_n) . \tag{31} $$
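The ordering rule behind the causal factorization property (t2), in which factors supported in the causal future stand to the left, can be sketched as a sorting procedure. The code below is an illustration under simplifying assumptions: the support of each test function is summarized by a closed time interval in a fixed inertial frame, and strict separation of these intervals is used as a sufficient (not necessary) condition for the supports to be temporally ordered. The function name and the tuple encoding are hypothetical.

```python
def time_ordered_labels(factors):
    """Order the factors of a time-ordered product by their temporal support.

    factors: list of (label, t_min, t_max), the time extent of supp f_i.
    Returns the labels with later supports to the left.
    Raises ValueError if two time intervals overlap, since the factorization
    rule then does not fix the product.
    """
    ordered = sorted(factors, key=lambda item: item[1], reverse=True)
    for later, earlier in zip(ordered, ordered[1:]):
        if earlier[2] >= later[1]:
            raise ValueError("supports are not temporally ordered")
    return [label for label, _, _ in ordered]

# supp f_1 is later than supp f_2, which is later than supp f_3
ordering = time_ordered_labels([("O2", 2.0, 3.0), ("O1", 4.0, 5.0), ("O3", 0.0, 1.0)])
```

When the intervals overlap, the sketch refuses to return an ordering; this corresponds to configurations that are not determined by causal factorization alone.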
Since the Wick powers on the right side have already been constructed, we may take this relation as the definition of the time-ordered products for test functions^l $F = \otimes_{i=1}^{n} f_i$ whose support has no intersection with any of the “partial diagonals”
$$ D_I = \{ (x_1, \ldots, x_n) \in \times^n \mathbb{R}^d \mid x_i = x_j \ \ \forall\, i, j \in I \} , \qquad I \subset \{1, \ldots, n\} , \tag{32} $$
in the product manifold $\times^n \mathbb{R}^d$, because one can decompose such F into contributions whose supports are temporally ordered via a partition of unity [3].
l Note that the time-ordered products can be viewed as multilinear maps $\times^n D(\mathbb{R}^d) \to W$ for a fixed choice of fields $O_1, \ldots, O_n$.
The causal factorization property alone therefore already defines the time-ordered products as W-valued distributions, denoted $T^0$, on the space $\times^n \mathbb{R}^d$ minus the union $\cup_I D_I$ of all partial diagonals, and it can furthermore be seen that these objects have the desired properties (t1)–(t8) on that domain. In order to define the time-ordered products as distributions on all of $\times^n \mathbb{R}^d$, one has to construct a suitable extension T of $T^0$ to a distribution defined on all of $\times^n \mathbb{R}^d$ in such a way that (t1)–(t8) are preserved in the extension process. This step corresponds to the usual “renormalization” step in other approaches and is the hard part of the analysis. Actually, we can even assume that $T^0$ is already defined everywhere apart from the total diagonal $D_n = \{ x_i = x_j \ \forall\, i, j \}$, since one can construct the extension T inductively in the number of factors. Having constructed these for up to $n - 1$ factors then leaves the time-ordered products with n factors undetermined only on the total diagonal. A key simplification for the extension problem occurs because the commutator condition (inductively known to hold for $T^0$) can be shown to be equivalent to the following “Wick-expansion” for $T^0$,
$$ T^0\Big( \prod_{i=1}^{n} O_i(x_i) \Big) = \sum_{\alpha_1, \alpha_2, \ldots} \frac{1}{\alpha_1! \cdots \alpha_n!}\, \tau^0[\delta^{\alpha_1} O_1 \otimes \cdots \otimes \delta^{\alpha_n} O_n](x_1, \ldots, x_n)\, : \prod_{i=1}^{n} \prod_j [(\partial)^j \phi(x_i)]^{\alpha_{ij}} :_H . \tag{33} $$
Here, the $\tau^0[\otimes_i \Psi_i]$ are c-number distributions on $\times^n \mathbb{R}^d \setminus D_n$ [in fact equal to the expectation value of the time-ordered product $T^0(\prod_i \Psi_i)$] depending in addition upon an arbitrary collection of fields $\Psi_1 \otimes \cdots \otimes \Psi_n \in \otimes^n V$, each $\alpha_j$ is a multi-index and we are using the notation
$$ \delta^\alpha O = \Big\{ \prod_j \Big( \frac{\partial}{\partial[(\partial)^j \phi]} \Big)^{\alpha_j} \Big\} O \in V \tag{34} $$
as well as $\alpha! = \prod_j \alpha_j!$ for multi-indices.^m The notation $: \ :_H$ stands for the “H-normal ordered products”, defined by
$$ : \prod_{i=1}^{k} \phi(x_i) :_H \; = \frac{\delta^k}{i^k\, \delta f(x_1) \cdots \delta f(x_k)} \left. e^{i\phi(f) + \frac{1}{2} H^{(m)}(f,f)} \right|_{f=0} . \tag{35} $$
m We are also suppressing tensor indices in Eqs. (33) and (34). For example, the notation $(\partial)^j \phi$ is a shorthand for $\partial_{(\mu_1} \cdots \partial_{\mu_j)} \phi$.
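For fields without derivatives, $O_i = \phi^{a_i}$, the Wick expansion (33) reduces to a finite sum with binomial coefficients, because $\frac{1}{\alpha!} \delta^\alpha \phi^a = \binom{a}{\alpha} \phi^{a - \alpha}$. The sketch below lists these terms. It assumes that only the j = 0 component of each multi-index contributes (no derivatives of φ) and that $\tau^0$ is multilinear in the fields, so that numerical factors can be pulled out; the function name is hypothetical.

```python
from itertools import product
from math import comb

def wick_expansion_terms(powers):
    """Terms of the Wick expansion of T^0(phi^{a_1}(x_1) ... phi^{a_n}(x_n)).

    Returns a dict mapping alpha = (alpha_1, ..., alpha_n) to a pair
    (coefficient, remaining_powers), meaning the term
    coefficient * tau^0[phi^{a_1 - alpha_1} x ... x phi^{a_n - alpha_n}]
                * :phi(x_1)^{alpha_1} ... phi(x_n)^{alpha_n}:_H .
    """
    terms = {}
    for alpha in product(*(range(a + 1) for a in powers)):
        coefficient = 1
        for a, k in zip(powers, alpha):
            coefficient *= comb(a, k)  # (1/k!) * a!/(a-k)!
        remaining = tuple(a - k for a, k in zip(powers, alpha))
        terms[alpha] = (coefficient, remaining)
    return terms

terms = wick_expansion_terms((2, 1))
```

For instance, with powers (2, 1) the term with α = (1, 1) carries the coefficient 2 and the c-number distribution $\tau^0[\phi \otimes 1]$, multiplying $: \phi(x_1) \phi(x_2) :_H$.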
The key point about the Wick expansion is that it reduces the problem of extending the algebra valued $T^0$ to the problem of extending the c-number distributions $\tau^0$. Since we want the extensions T to satisfy (t1)–(t8), we also want the extensions τ of the $\tau^0$ to satisfy a number of corresponding properties: First, the wave front set condition on the T corresponds to the requirement that $\mathrm{WF}(\tau) \subset C_T$, where the set $C_T$ was defined above in Eq. (23). Second, since the T are required to be Poincaré invariant, the extension τ must also be Poincaré invariant. Finally, since the T are supposed to have an almost homogeneous scaling behavior under a rescaling $x \to \lambda x$ (and a simultaneous rescaling $m \to \lambda^{-1} m$ of the mass), the τ must have the scaling behavior
$$ \Big( \frac{\partial}{\partial \log \lambda} \Big)^{k} \Big[ \lambda^{D}\, \tau^{(\lambda^{-1} m)}(\lambda x_1, \ldots, \lambda x_n) \Big] = 0 , \tag{36} $$
for some k, where D is the sum of the mass dimensions of the fields $\Psi_i$ on which τ depends, and where we are indicating explicitly the dependence of τ upon the mass parameter.^n By induction, these properties are already known for $\tau^0$ (i.e. off the total diagonal $D_n$), so the question is only whether they can also be satisfied in the extension process.
n The unitarity condition on the T also implies a certain reality condition on the τ, which however is rather easy to satisfy in the present context.
To reduce this remaining extension problem to a simpler task, one shows [12] that it is possible to expand the $\tau^0$ in terms of the mass parameter m in a “scaling expansion” of the form
$$ \tau^{(m)\,0} = \sum_{k=n}^{j} m^{2k} \cdot u^0_k + r^0_j , \tag{37} $$
where the $u^0_k$ are Poincaré invariant distributions (independent of m) that scale almost homogeneously under a rescaling of the spacetime coordinates,
$$ \Big( \frac{\partial}{\partial \log \lambda} \Big)^{k} \Big[ \lambda^{D - 2k}\, u^0_k(\lambda x_1, \ldots, \lambda x_n) \Big] = 0 , \tag{38} $$
with $\mathrm{WF}(u^0_k) \subset C_T$, and where the remainder $r^0_j$ is a distribution with $\mathrm{WF}(r^0_j) \subset C_T$, smooth in m, whose scaling degree [3] can be made arbitrarily low by carrying out the expansion to sufficiently large order j. The idea now is to construct the desired extension τ by constructing separately suitable extensions of $u^0_k$ and $r^0_j$. Actually, since the remainder has a sufficiently low scaling degree, it extends by continuity to a unique distribution $r_j$, and that extension is seen to be automatically Poincaré invariant, have wave front set $\mathrm{WF}(r_j) \subset C_T$, and have an almost homogeneous scaling behavior under a rescaling of the coordinates and the mass parameter. The distributions $u^0_k$, on the other hand, do not extend by continuity, but one can construct the desired extension as follows [12]: one first constructs, by the methods originally due to Epstein and Glaser and described e.g. in [19], an arbitrary extension that
is translationally invariant and has the same scaling degree as the unextended distribution. That extension then also satisfies the wave front set condition [3], but it will not, in general, yield a distribution with an almost homogeneous scaling behavior (i.e. homogeneous scaling only up to logarithmic terms), nor will it be Lorentz invariant. The point is, however, that this preliminary extension can always be modified, if necessary, so as to restore the almost homogeneous scaling behavior and Lorentz invariance (while at the same time keeping the wave front set property and translational invariance), see [12, Lemma 4.1]. This accomplishes the desired extension of the $\tau^0$, and thereby establishes the existence of a prescription $T$ for time-ordered products satisfying (t1)–(t8).

We emphasize that the above list of properties (t1)–(t8) does not determine the Wick powers and time-ordered products uniquely (for the time-ordered products, this non-uniqueness arises because the extension process is not unique). The non-uniqueness corresponds to the usual "finite renormalization ambiguities". Their form is severely restricted by the properties (t1)–(t8) and is described by the "renormalization group", see Sec. 5.

3. Algebraic Construction of the Field Observables as Polynomials in 1/N

In this section, we generalize the algebraic construction of the field observables from a single scalar field to a multiplet of scalar fields, and we will show that the number of field components can be viewed as a free parameter that can be taken to infinity in a meaningful way on the algebraic level. The model that we want to consider is described by the classical action

$$S = \int \mathrm{Tr}\big(\partial^\mu\phi\,\partial_\mu\phi + m^2\phi^2\big)\, d^dx , \qquad (39)$$

where $\phi = \{\phi_{ij'}\}$ is now a field taking values in the hermitian $N \times N$ matrices, and where "Tr" denotes the trace, with no implicit normalization factors.
More precisely, we should think of the field as taking values in the $N \otimes \bar N$ representation^o of the group $U(N)$, the trace being given by $\mathrm{Tr}\,\phi^a = \sum \phi_{ij'}\,\delta^{j'k}\,\phi_{kl'}\,\delta^{l'm} \cdots \phi_{mn'}\,\delta^{n'i}$ in terms of the invariant tensor $\delta_{ij'}$ in $N \otimes \bar N$.

^o We are putting a prime on the indices associated with the tensor factor transforming under $\bar N$, in the spirit of van der Waerden's notation.

For an arbitrary, but fixed $N$, we begin by constructing a minimal algebra of observables corresponding to the action (39) in a similar way as described in the previous section for the case of a single field. The minimal algebra is now generated by a unit and finite sums of products of smeared field components, $\phi_{ij'}(f)$, where $f$ runs through all compactly supported test functions, and where the "color indices" $i$ and $j'$ run from 1 to $N$. The relations in the case of general $N$ differ from relations (i)–(iv) for a single field only in that the hermiticity and commutation relations now read
(iii_N) (Hermiticity) $\phi_{ij'}(f)^* = \phi_{j'i}(\bar f)$
(iv_N) (Commutator) $[\phi_{ij'}(f), \phi_{kl'}(h)] = i\,\delta_{il'}\delta_{kj'}\,\Delta(f,h)\cdot 1$,

where $\Delta$ is the advanced minus retarded propagator of a single Klein–Gordon field. We construct an enlarged algebra, $W_N$, by passing to a new set of generators of the form (3) and by allowing these generators to be smeared with suitable distributions, i.e. $W_N$ is spanned by expressions of the form

$$A = \int : \prod_{k=1}^{a} \phi_{i_k j_k'}(x_k) : \; t(x_1,\dots,x_a) \prod_{k=1}^{a} d^dx_k , \qquad (40)$$

where $t \in E'_a$ and where we are using the usual informal integral notation for distributions. The product of these quantities can again be expressed in a form that is similar to (5). Since the real components of the field $\phi_{ij'}$ are not coupled to each other, the enlarged algebra $W_N$ is isomorphic to the tensor product of the corresponding algebra $W_1$ for each independent real component of the field as defined in the previous section,

$$W_N \cong \bigotimes^{N^2} W_1 , \qquad (41)$$
$N^2$ being the number of independent real components of the field $\phi_{ij'}$.

The transformation $\phi_{ij'} \to U_i{}^{k}\,\bar U_{j'}{}^{l'}\,\phi_{kl'}$ leaves the classical action functional (39) invariant for any unitary matrix $U \in U(N)$. This invariance property is expressed on the algebraic level by a corresponding action of the group $U(N)$ on the algebras $W_N$ via a group of *-automorphisms $\alpha_U$. We are interested in the subalgebra

$$W_N^{\mathrm{inv}} = \{A \in W_N \mid \alpha_U(A) = A \quad \forall\, U \in U(N)\} \qquad (42)$$

of "gauge invariant" elements, i.e. the subalgebra of $W_N$ consisting of those elements that are invariant under this automorphic action of the group $U(N)$. It is not difficult to convince oneself that, as a vector space, $W_N^{\mathrm{inv}}$ is given by

$$W_N^{\mathrm{inv}} = \mathrm{span}\{W_a(t) \mid t \in E'_{|a|}\} , \qquad (43)$$
where

$$W_a(t) = \frac{1}{N^{|a|/2}} \int : \mathrm{Tr}\Big(\prod_{i_1\in I_1}\phi(x_{i_1})\Big) \cdots \mathrm{Tr}\Big(\prod_{i_T\in I_T}\phi(x_{i_T})\Big) : \; t(x_1,\dots,x_{|a|}) \prod_{i=1}^{|a|} d^dx_i . \qquad (44)$$

Here, $t \in E'_{|a|}$, the symbol $a$ now stands for a multi-index, $a = (a_1,\dots,a_T)$, and we are using the usual multi-index notation

$$|a| = \sum_{i=1}^{T} a_i . \qquad (45)$$
The $I_j$ are mutually disjoint index sets containing $a_j$ elements each, such that $\cup_j I_j = \{1,\dots,|a|\}$. For later convenience, we have also incorporated an overall normalization factor into our definition of the generators $W_a(t)$. The generators are symmetric under exchange of the arguments within each trace and under exchange of the traces.^p Furthermore, they satisfy a wave equation and hermiticity condition completely analogous to the ones in the scalar case,

$$W_a(\bar t) = W_a(t)^* , \qquad W_a\big([1 \otimes \cdots (\partial^\mu\partial_\mu - m^2) \otimes \cdots 1]\,t\big) = 0 , \qquad (46)$$

where the Klein–Gordon operator acts on any of the arguments of $t$.

Since $W_N^{\mathrm{inv}}$ is a subalgebra of $W_N$ (i.e. closed under multiplication), the product of two such generators can again be expressed as a linear combination of these generators. In order to determine the precise form of this linear combination, one has to take care of the color indices and the structure of the traces. Since the traces in the generators (44) imply that there are "closed loops" of contractions of the color indices, there will also appear similar closed loops of index contractions in the formula for the product of two generators. Such closed index loops will give rise to combinatorial factors involving $N$. We are ultimately interested in taking the limit when $N$ goes to infinity, so we must study the precise form of this $N$-dependence. Since the $N$-dependence arises solely from the index structure and not from the spacetime dependence of the propagators, it is sufficient for this purpose to study the "matrix model" given by the zero-dimensional version of the action functional (39),

$$S_{\mathrm{matrix}} = m^2\, \mathrm{Tr}\, M^2 , \qquad (47)$$
where we have put $M = \phi$ in this case in order to emphasize the fact that we are dealing now with hermitian $N \times N$ matrices $M = \{M_{ij'}\}$ with no dependence upon the spacetime point (this action does not, of course, describe a quantum field theory). By analogy with Eq. (3), we define the "normal-ordered product" of $k$ matrix entries to be the function

$$: M_{i_1 j_1'} \cdots M_{i_k j_k'} : \; = (-i)^k \left.\frac{\partial^k}{\partial J^{i_1 j_1'} \cdots \partial J^{i_k j_k'}}\, e^{\,iJ\cdot M + \frac{1}{m^2} J^2}\right|_{J=0} \qquad (48)$$

of the matrix entries, where we have put $M \cdot J = \sum_{ij} M_{ij'} J^{ij'}$. For the first values of $k$, the definition yields $: M_{ij'} : \; = M_{ij'}$, and $: M_{ij'} M_{kl'} : \; = M_{ij'} M_{kl'} - \delta_{il'}\delta_{kj'}/m^2$, etc. The (commutative) product of normal-ordered polynomials $: Q(\{M_{ij'}\}) :$ can be expressed in terms of normal-ordered polynomials via the following version of Wick's theorem:

$$: Q_1 : \cdots : Q_r : \; = \; : e^{\,iM\cdot\partial/\partial J} : \, \Big\langle : e^{-iJ\cdot\partial/\partial M} Q_1 : \cdots : e^{-iJ\cdot\partial/\partial M} Q_r : \Big\rangle_{\mathrm{matrix}} \Big|_{J=0} , \qquad (49)$$

where we have introduced the "correlation functions"

$$\langle : Q_1 : \cdots : Q_r : \rangle_{\mathrm{matrix}} \equiv \mathcal{N} \int dM \, : Q_1 : \cdots : Q_r : \; e^{-S_{\mathrm{matrix}}} \qquad (50)$$

^p Another way of saying this is that the $W_a$ really act on distributions $t$ with these symmetry properties.
which we normalize so that $\langle 1 \rangle_{\mathrm{matrix}} = 1$. It follows from these definitions that we always have $\langle : Q : \rangle_{\mathrm{matrix}} = 0$. The correlation functions can be written as a sum of contributions associated with Feynman diagrams. Each such diagram consists of $r$ vertices that are connected by "propagators"

$$\langle M_{ij'} M_{kl'} \rangle_{\mathrm{matrix}} = \frac{1}{m^2}\, \delta_{il'}\delta_{kj'} , \qquad (51)$$

which are represented by oriented double lines, the arrow always going from the primed to the unprimed index.

[Figure: oriented double line representing the propagator (51), joining the index pairs $ij'$ and $kl'$.]

The structure of the $i$th vertex is determined by the form of the polynomial $Q_i$. The space of gauge invariant polynomials in $M_{ij'}$ is spanned by functionals of the form^q

$$W_a = \frac{1}{N^{|a|/2}} : \mathrm{Tr}\, M^{a_1} \cdots \mathrm{Tr}\, M^{a_T} : , \qquad (52)$$

which is the 0-dimensional analogue of expression (44), with the only difference being that there is no dependence on the smearing distribution $t$ since we are in 0 spacetime dimensions. Since these multi-trace observables span the space of all polynomial $U(N)$-invariant functions of the matrix entries, we already know that the product of $W_a$ with $W_b$ can again be written as a linear combination of such observables. We are interested in the dependence of the coefficients in this linear combination on $N$. We calculate the product $W_a \cdot W_b$ via Wick's formula (49), and we organize the resulting sum of expressions in terms of the following Feynman graphs. From the $T$ traces in $W_a$, there will be $T$ a-vertices with $a_j$ legs each (we think of the legs as carrying a number), where $j = 1,\dots,T$. We draw the legs as double lines. As an example, let $a = (a_1, a_2) = (3,3)$, so that $W_{(3,3)} = 1/N^3 : \mathrm{Tr}\, M^3\, \mathrm{Tr}\, M^3 :$. In this case, we have 2 a-vertices with 3 lines, each corresponding to one trace with 3 factors of $M$. Each such vertex therefore looks as shown in Fig. 1. The double lines should also be equipped with orientations that are compatible at the vertex, although we have not drawn this here. From the $S$ traces in $W_b$ there are similarly $S$ b-vertices with $b_j$ legs each, where $j = 1,\dots,S$. We consider graphs obtained by joining a-vertices with b-vertices by a double line representing a matrix propagator (51), but we do not allow any a–a or b–b connections (such connections have already been taken care of by the normal ordering prescription used in the definition of $W_a$, respectively $W_b$). We finally attach an "external current" $M_{ij'}$ to every leg of an a-vertex or b-vertex that is not connected by a propagator. An example of a Feynman graph resulting from this procedure occurring in the product $W_{(a_1,a_2)} \cdot W_{(b_1,b_2)}$ with $a_1 = a_2 = b_1 = b_2 = 3$ is drawn as shown in Fig. 2.

Fig. 1. [Single a-vertex with three double-line legs carrying the index pairs $ij'$, $jk'$, $ki'$.]

Fig. 2. [Feynman graph with vertices $a_1 = 3$, $a_2 = 3$, $b_1 = 3$, $b_2 = 3$ and external currents $M_{ij'}$, $M_{jk'}$, $M_{kl'}$, $M_{li'}$.]

^q Note that these polynomials are not linearly independent at finite $N$. For example, for $N = 2$ there holds the relation $\mathrm{Tr}\, M^3 - \frac{3}{2}\, \mathrm{Tr}\, M\, \mathrm{Tr}\, M^2 + \frac{1}{2} (\mathrm{Tr}\, M)^3 = 0$. A set of linearly independent polynomials can be obtained using so-called "Schur-polynomials".

The resulting structure will consist of a number of closed loops obtained by following the lines (including loops that run through external currents). There will be, in general, 3 kinds of loops: (i) Degenerate loops around a single vertex that has only external currents but no propagators attached to it. Let the number of such loops (i.e. isolated vertices) be $D$. (ii) Loops that contain at least one external current and at least one propagator line. Let the number of these surfaces be $J$. (iii) Loops that contain no external current. Let $I$ be the number of these loops. [Thus, in the example graph shown in Fig. 2, we have $D = 0$, $I = 1$ (corresponding to the inner square-shaped loop) and $J = 1$ (the loop running around the square passing through the 4 external currents).] Following a set of ideas by 't Hooft [15], we consider the big (in general multiply connected) closed 2-dimensional surface $\mathcal{S}$ obtained by capping off the loops of types (ii) and (iii) with little surfaces (we would obtain a sphere in the above example). The total number $F$ of little surfaces in $\mathcal{S}$ is consequently given by

$$F = I + J . \qquad (53)$$
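The linear dependence at finite $N$ noted in the footnote to Eq. (52) can be checked directly. The following is a minimal sketch in plain Python; the helper names (`matmul`, `trace`, `trace_relation`) are introduced here for illustration only and are not part of the construction in the text. It evaluates $\mathrm{Tr}\, M^3 - \frac{3}{2}\mathrm{Tr}\, M\, \mathrm{Tr}\, M^2 + \frac{1}{2}(\mathrm{Tr}\, M)^3$ on a hermitian $2 \times 2$ matrix, where it vanishes, and on a $3 \times 3$ matrix, where it does not.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def trace_relation(m):
    """Evaluate Tr M^3 - (3/2) Tr M Tr M^2 + (1/2) (Tr M)^3."""
    m2 = matmul(m, m)
    m3 = matmul(m2, m)
    t1, t2, t3 = trace(m), trace(m2), trace(m3)
    return t3 - 1.5 * t1 * t2 + 0.5 * t1 ** 3


# A hermitian 2 x 2 matrix with small integer entries (exact in floating point).
m_2x2 = [[1, 2 + 3j], [2 - 3j, -4]]
# A 3 x 3 example, for which the relation is not expected to hold.
m_3x3 = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]

value_n2 = trace_relation(m_2x2)   # vanishes for N = 2
value_n3 = trace_relation(m_3x3)   # non-zero for N = 3
```

For $N = 2$ the relation is a consequence of the Cayley–Hamilton theorem, which is why the same combination fails to vanish for the diagonal $3 \times 3$ example.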
Let us label the loops containing currents by j = 1, . . . , D + J, and let cj be the number of currents in the corresponding loop. By construction, the number of edges,
$P$, of the surface $\mathcal{S}$ is related to $a$, $b$, and $c$ by

$$2P = |a| + |b| - |c| , \qquad (54)$$

where we are using the same multi-index notation as above. The number of vertices, $V$, in $\mathcal{S}$ is

$$V = T + S - D , \qquad (55)$$

i.e. is equal to the total number of traces $T$ and $S$ in $W_a$ and $W_b$ minus $D$, the number of vertices that are not connected to any other vertex. We apply the well-known theorem by Euler to the surface $\mathcal{S}$, which tells us that

$$F - P + V = \sum_k (2 - 2H_k) , \qquad (56)$$
where $H_k$ is the genus of the $k$th disconnected component of $\mathcal{S}$. The little surfaces in $\mathcal{S}$ each carry an orientation induced by the direction of the enclosing index loops, and these give rise to an orientation on each of the connected components of the big surface $\mathcal{S}$. An oriented 2-dimensional surface always has $H_k \geq 0$, and $H_k$ is equal to the number of handles of the corresponding connected component in that case.

Let us analyze the contributions to the product $W_a \cdot W_b$ associated with a given graph. From the $P$ double lines of the graph, there will be a contribution

$$\prod_{\text{lines } (k,l)} \frac{1}{m^2} , \qquad (57)$$

associated with the double line propagators. From the closed loops of the kind (iii) in the graph there will be a factor

$$N^I = N^{\sum (2 - 2H_k) + (|a| + |b| - |c|)/2 - V - J} \qquad (58)$$

because each of the $I$ such closed index loops gives rise to a closed loop of index contractions of Kronecker deltas, $N = \sum_i \delta^i{}_i$. Finally, there will be a contribution

$$: \mathrm{Tr}\, M^{c_1} \cdots \mathrm{Tr}\, M^{c_{J+D}} : \qquad (59)$$

corresponding to the external currents in $J + D$ closed loops of the kind (i) and (ii) containing $c_i$ external currents each. Taking into account the normalization factors of $N^{-|a|/2}$, respectively $N^{-|b|/2}$, associated with $W_a$, respectively $W_b$, and letting $V_k$ be the number of vertices in the $k$th connected component of $\mathcal{S}$, we therefore find

$$W_a \cdot W_b = \sum_{\text{graphs}} (1/N)^{J + \sum H_k + \sum (V_k - 2)} \prod_{\text{lines } (k,l)} \frac{1}{m^2} \cdot W_c , \qquad (60)$$

where the sum is over all distinct Feynman graphs.^r

^r Note that we think of the legs of a- and b-vertices as numbered, and so a graph is understood here as a graph carrying the corresponding numberings. Topologically identical graphs with distinct numberings of the legs count as different in the above sum, as well as similar sums below.
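The counting in Eqs. (53)–(56) can be traced through for the example graph of Fig. 2. The sketch below is an illustration only: the function names are hypothetical, and it is restricted to a single connected component, for which Eq. (56) reads $F - P + V = 2 - 2H$. It also counts the power of $N$ directly from the $I$ closed index loops and the normalization factors of $W_a$, $W_b$ and $W_c$.

```python
from fractions import Fraction


def loop_counting(a, b, c, D, I, J):
    """Return (F, P, V, Euler number) following Eqs. (53)-(56).

    a, b : multi-indices of W_a and W_b (one entry per trace)
    c    : number of external currents in each loop that carries currents
    D, I, J : numbers of loops of kinds (i), (iii) and (ii), respectively
    """
    T, S = len(a), len(b)
    F = I + J                                # Eq. (53): little surfaces
    two_P = sum(a) + sum(b) - sum(c)         # Eq. (54): twice the edges
    assert two_P % 2 == 0, "legs must pair up into propagators"
    P = two_P // 2
    V = T + S - D                            # Eq. (55): connected vertices
    return F, P, V, F - P + V


def genus_connected(euler_number):
    """Genus H of one connected component, from F - P + V = 2 - 2H."""
    assert (2 - euler_number) % 2 == 0
    return (2 - euler_number) // 2


def power_of_N_direct(a, b, c, I):
    """Net power of N of one graph, counted directly: N^I from the closed
    loops, N^(-|a|/2) and N^(-|b|/2) from W_a and W_b, and N^(+|c|/2)
    needed to rewrite the remaining traces as the normalized W_c."""
    return I - Fraction(sum(a) + sum(b) - sum(c), 2)


# Fig. 2: a = b = (3, 3), one loop with 4 currents, D = 0, I = 1, J = 1.
a, b, c = (3, 3), (3, 3), (4,)
F, P, V, chi = loop_counting(a, b, c, D=0, I=1, J=1)
H = genus_connected(chi)                     # genus 0: a sphere
direct = power_of_N_direct(a, b, c, I=1)     # net power of N
exponent_eq60 = 1 + H + (V - 2)              # J + H + (V - 2), power of 1/N
```

For this graph one finds $F = 2$, $P = 4$, $V = 4$, hence genus 0 as stated in the text, and the direct count $N^{-3}$ agrees with the exponent $J + H + (V - 2) = 3$ of $1/N$ in Eq. (60). The sketch only exercises this genus-zero example.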
We can easily generalize these considerations to calculate the product of $W_a(f_1 \otimes \dots \otimes f_{|a|})$ with $W_b(h_1 \otimes \dots \otimes h_{|b|})$ in the case when the dimension of the spacetime is nonzero. In this case, the legs of each a-vertex are associated with the smearing functions $f_j$ appearing in the corresponding trace in Eq. (44), and the legs of every b-vertex are likewise associated with the smearing functions $h_k$. Every matrix propagator connecting such an a- and b-vertex then gets replaced by $\Delta_+$,

$$\frac{1}{m^2} \to \Delta_+(f_j, h_k) . \qquad (61)$$

Furthermore, in a given graph, the $J + D$ index loops with currents now correspond to a contribution of the form^s

$$: \mathrm{Tr}\Big(\prod^{c_1} \phi(j_i)\Big) \cdots \mathrm{Tr}\Big(\prod^{c_{J+D}} \phi(j_k)\Big) : , \qquad (62)$$

where $j_k \in \{f_k, h_k\}$ denotes the test function associated with the corresponding external current. With these replacements, we obtain the following formula for the product of two generators $W_a(\otimes_i f_i)$ and $W_b(\otimes_i h_i)$ of the algebra $W_N^{\mathrm{inv}}$:

$$W_a(f_1 \otimes \cdots \otimes f_{|a|}) \cdot W_b(h_1 \otimes \cdots \otimes h_{|b|}) = \sum_{\text{graphs}} (1/N)^{J + \sum H_k + \sum (V_k - 2)} \prod_{\text{lines } (k,l)} \Delta_+(f_k, h_l) \cdot W_c(j_1 \otimes \cdots \otimes j_{|c|}) . \qquad (63)$$
An entirely analogous formula is obtained if the test functions $\otimes_i f_i$ and $\otimes_j h_j$ are replaced by arbitrary distributions $t$ and $s$ in the spaces $E'_{|a|}$, respectively $E'_{|b|}$. The important thing to observe about Eq. (63) is how the coefficients in the sum on the right side depend on $N$: the numbers $H_k$ (the number of handles of the $k$th component of the surface $\mathcal{S}$ associated with the graph) and $J$ are always non-negative. The number $V_k - 2$ is also non-negative, since the number of vertices in each component, $V_k$, is by construction always greater than or equal to 2. Hence, we conclude that $1/N$ always appears with a non-negative power in the coefficients on the right side of Eq. (63). Since these coefficients are essentially the "structure constants" of the algebra $W_N^{\mathrm{inv}}$, it is therefore possible to take the large $N$ limit on the algebraic level.

^s To simplify, we are assuming here that $\Delta_+$ is given by the Wightman 2-point function, see Eq. (10).

We now formalize this idea by constructing a new algebra which has essentially the same relations as the algebras $W_N^{\mathrm{inv}}$, but which incorporates the important new point of view that $N$, or rather

$$\varepsilon = \frac{1}{N} , \qquad (64)$$

is not fixed, but is instead considered as a free expansion parameter that can range freely over the real numbers, including in particular $\varepsilon = 0$. We will then show
that this algebra contains elements corresponding to Wick powers and their time-ordered products. The construction of this new algebra therefore incorporates the 1/N expansion of the quantum field observables associated with the action (39), including in particular the large $N$ limit of the theory.

Consider the complex vector space $X[\varepsilon]$ consisting of formal power series expressions of the form

$$\sum_{j \geq 0} \varepsilon^j\, W_{a_j}(t_j) \qquad (65)$$

in the "dummy variable" $\varepsilon$, where the $a_j$ are multi-indices, and where the $t_j$ are taken from the space $E'_{|a_j|}$ of distributions in $|a_j|$ spacetime variables defined in (7). We implement the second of relations Eq. (46) by viewing the symbols $W_{a_j}(t_j)$ as depending only on the equivalence class of $t_j$ in the quotient space $J_{|a_j|}$, where

$$J_n = E'_n \big/ \{(1 \otimes \cdots (\partial^\mu\partial_\mu - m^2) \otimes \cdots 1)\,s \mid s \in E'_n\} . \qquad (66)$$

On the so-defined complex vector space $X[\varepsilon]$, we define a product by

$$\Big(\sum_{j \geq 0} \varepsilon^j\, W_{a_j}(t_j)\Big) \cdot \Big(\sum_{k \geq 0} \varepsilon^k\, W_{b_k}(s_k)\Big) = \sum_{r \geq 0} \varepsilon^r \sum_{r = k + j} W_{a_j}(t_j) \cdot W_{b_k}(s_k) , \qquad (67)$$

where the product $W_{a_j}(t_j) \cdot W_{b_k}(s_k)$ is given by formula Eq. (63) (with $1/N$ replaced by $\varepsilon$ in this formula), and we define a *-operation on $X[\varepsilon]$ by

$$\Big(\sum_{j \geq 0} \varepsilon^j\, W_{a_j}(t_j)\Big)^* = \sum_{j \geq 0} \varepsilon^j\, W_{a_j}(\bar t_j) . \qquad (68)$$
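The structure of the product (67) can be illustrated on a toy model. The sketch below is not the graph sum of Eq. (63): as a stand-in, it assumes a single-variable analogue in which generators are labeled by one integer and multiply according to the standard linearization formula for Hermite (normal-ordered) polynomials, $W_a \cdot W_b = \sum_k k!\binom{a}{k}\binom{b}{k}\, C^k \varepsilon^k\, W_{a+b-2k}$, with a hypothetical propagator value `C`. What it shares with the text is the feature that matters here: the structure constants carry only non-negative powers of $\varepsilon$, so the product of two power series is well defined order by order.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb, factorial

C = Fraction(1)  # toy "propagator" value; an assumption of this sketch


def gen_product(a, b):
    """Toy structure constants: list of (power of eps, label, coefficient)
    for W_a * W_b = sum_k k! binom(a,k) binom(b,k) C^k eps^k W_{a+b-2k}."""
    return [(k, a + b - 2 * k, factorial(k) * comb(a, k) * comb(b, k) * C ** k)
            for k in range(min(a, b) + 1)]


def series_product(x, y, order):
    """Product in the spirit of Eq. (67), truncated above eps**order.

    x, y : dicts mapping (power of eps, generator label) -> coefficient.
    """
    out = defaultdict(Fraction)
    for (j, a), cx in x.items():
        for (k, b), cy in y.items():
            for extra, label, coeff in gen_product(a, b):
                r = j + k + extra            # total power of eps
                if r <= order:
                    out[(r, label)] += cx * cy * coeff
    return {key: val for key, val in out.items() if val != 0}


# W_2 * W_2 = W_4 + 4 eps W_2 + 2 eps^2 W_0 in this toy model.
w2 = {(0, 2): 1}
w2_squared = series_product(w2, w2, order=4)
```

Because every term of `gen_product` has a non-negative power of `eps`, truncating at a fixed order commutes with taking further products, which is what makes an order-by-order check of associativity meaningful in this toy setting.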
Proposition 1. The product formula (67) and the formula (68) for the *-operation make $X[\varepsilon]$ into an (associative) *-algebra with unit (given by $1 \equiv W_0$).

Proof. We need to check that the product formula (67) defines an associative product, and that the formula (68) for the *-operation is compatible with this product in the usual sense. For associativity, we consider the associator of generators

$$A(\varepsilon) = W_a(r) \cdot (W_b(s) \cdot W_c(t)) - (W_a(r) \cdot W_b(s)) \cdot W_c(t) , \qquad (69)$$

which we evaluate using the product formula in the order specified by the brackets. The resulting expression can be written as a finite sum of terms of the form $\sum_j Q_j(\varepsilon)\, W_{d_j}(u_j)$, where the $Q_j(\varepsilon)$ are polynomials in $\varepsilon$, and where the $u_j$ are linearly independent. But we already know $A(\varepsilon) = 0$ for $\varepsilon = 1, 1/2, 1/3$, etc., since the algebras $W_N^{\mathrm{inv}}$, $N = 1, 2, 3$, etc. are associative. Therefore, the $Q_j(\varepsilon)$ must vanish for these values of $\varepsilon$. Since a polynomial vanishes identically if it vanishes when evaluated on an infinite set of distinct real numbers, it follows that the $Q_j$ vanish identically, proving that $A(\varepsilon) = 0$ as a power series in $\varepsilon$. The consistency of the *-operation is proved similarly.

The construction of the algebra $X[\varepsilon]$ completes our desired algebraic formulation of the 1/N expansion of the field theory associated with the free action (39). Our
construction of $X[\varepsilon]$ depends on a particular choice of the distribution $\Delta_+$, but different choices again give rise to isomorphic algebras, showing that, as an abstract algebra, $X[\varepsilon]$ is independent of this choice. Indeed, let $\Delta'_+$ be another bidistribution whose antisymmetric part is $(i/2)\Delta$, which satisfies the wave equation and has wave front set $\mathrm{WF}(\Delta'_+)$ of Hadamard form, and let $X'[\varepsilon]$ be the corresponding algebra constructed from $\Delta'_+$ with generators $W'_a(t)$. Then the desired *-isomorphism from $X[\varepsilon] \to X'[\varepsilon]$ is given by

$$W_a(t) \to \sum_{\text{graphs}} \varepsilon^{J + \sum H_k + \sum (V_k - 2)} \cdot W'_b\Big(\Big\langle \bigotimes_{\text{lines}} F,\; t \Big\rangle\Big) . \qquad (70)$$

Here $F$ is the smooth function given by $\Delta_+ - \Delta'_+$ and a graph notation as in Eq. (63) has been used: the sum is over all graphs obtained by writing down the $T$ vertices corresponding to the $T$ traces in $W_a(t)$, $a = (a_1,\dots,a_T)$, by contracting some legs with "propagators", and by attaching "external currents" to others. If $x_i$ is the point associated with the $i$th leg in a graph with $n$ propagator lines ($2n \leq |a|$), then

$$\Big\langle \bigotimes_{\text{lines}} F,\; t \Big\rangle (x_1,\dots,x_{|a|-2n}) = \int t(x_1,\dots,x_{|a|}) \prod_{\text{lines } (i,j)} F(x_i, x_j) \prod_{\text{legs } i} d^dx_i , \qquad (71)$$

where the second product is over all legs which have a propagator attached to them. The numbers $H_k$ and $V_k$ are the number of handles, respectively the number of vertices ($\geq 2$), in the $k$th disconnected component of the surface associated with the graph. $J$ is the number of closed index loops associated with the graph which contain currents, and each such loop with $b_i$ currents corresponds to a trace in $W'_b$, where $b = (b_1, b_2, \dots, b_J)$. We note that this implies in particular that only positive powers of $\varepsilon$ appear in Eq. (70), which is necessary in order for the right side to be an element in $X[\varepsilon]$.

We finally show that $X[\varepsilon]$ contains observables corresponding to the suitably normalized smeared gauge invariant Wick powers and their time-ordered products. To have a reasonably compact notation for these objects, let us introduce the vector space of all formal gauge invariant expressions in the field $\phi$ and its derivatives,

$$V^{\mathrm{inv}} = \mathrm{span}\Big\{ O = \prod \mathrm{Tr} \Big( \prod_i \partial^{\mu_1} \cdots \partial^{\mu_k} \phi \Big) \Big\} . \qquad (72)$$
If $O$ is a monomial in $V^{\mathrm{inv}}$, we denote by $|O|$ the number of free field factors $\phi$ in the formal expression for $O$, for example $|O| = 6$ for the field $O = \mathrm{Tr}\,\phi^4\, \mathrm{Tr}\,\phi^2$. For a fixed $N$, the gauge invariant Wick powers are viewed as linear maps

$$D(\mathbb{R}^d, V^{\mathrm{inv}}) \to W_N^{\mathrm{inv}} , \qquad fO \to O(f) . \qquad (73)$$

Likewise, the gauge invariant time-ordered products are viewed as multilinear maps

$$T : \times^n D(\mathbb{R}^d; V^{\mathrm{inv}}) \to W_N^{\mathrm{inv}} , \qquad (f_1 O_1, \dots, f_n O_n) \to T(f_1 O_1 \cdots f_n O_n) . \qquad (74)$$
The Wick powers are identified with the time-ordered products with a single factor. In the previous section, we demonstrated that, in the scalar case ($N = 1$), the Wick powers and time-ordered products can be constructed so as to satisfy a number of properties that we labeled (t1)–(t8). It is clear that these constructions can be generalized straightforwardly also to the case of a multiplet of scalar fields in the adjoint representation of $U(N)$ (with $N$ arbitrary but fixed) and thereby yield time-ordered products with properties completely analogous to the properties (t1)–(t8) stated above for the scalar case. We would now like to investigate the dependence upon $N$ of these objects and show that, if the time-ordered products are normalized by suitable powers of $1/N$, these can be viewed as elements of $X[\varepsilon]$, i.e. that they can be expressed as a linear combination of $W_a(t)$, with $t$ depending only on positive powers of $\varepsilon = 1/N$.

The Wick powers $O(f)$ are constructed in the same way as in the scalar case, see Eq. (27). The only difference is that we need to multiply the Wick powers by suitable normalization factors depending upon $\varepsilon = 1/N$ in order to get well-defined elements of $X[\varepsilon]$. Taking into account the normalization factor in the definition of the generators $W_a$, Eq. (44), one sees that

$$\varepsilon^{|O|/2}\, O(f) \in X[\varepsilon] . \qquad (75)$$

Given that the suitably normalized Wick powers Eq. (75) are elements in $X[\varepsilon]$, one naturally expects that their time-ordered products are also elements in $X[\varepsilon]$,

$$\varepsilon^{\sum |O_i|/2}\, T\Big(\prod_i f_i O_i\Big) \in X[\varepsilon] . \qquad (76)$$
Now, if the test functions $f_i$ are temporally ordered, i.e. if for example the support of $f_1$ is before $f_2$, the support of $f_2$ before $f_3$ etc., then the time-ordered product factorizes into the ordinary algebra product $\varepsilon^{|O_1|/2} O_1(f_1)\, \varepsilon^{|O_2|/2} O_2(f_2) \cdots$ in $X[\varepsilon]$, by the causal factorization property of the time-ordered products. Therefore, since the normalized Wick powers have already been demonstrated to be elements in $X[\varepsilon]$, their product also is (because $X[\varepsilon]$ was shown to be an algebra). Hence, one concludes by this argument that if $F = \otimes_i f_i$ is supported away from the union of all partial diagonals $D_I$ in the product manifold $\times^n \mathbb{R}^d$, then the corresponding time-ordered product satisfies (76). However, for arbitrarily supported test functions $f_i$, the time-ordered product does not factorize, and therefore does not correspond to the usual algebra product. In other words, whether Eq. (76) is satisfied or not depends on the definition of the time-ordered products on the partial diagonals $D_I$ (as a function of $N$). As we have described explicitly above in the scalar case, the definition of the time-ordered products on the diagonals is achieved by extending the time-ordered products defined by causal factorization away from the diagonals in a suitable way. Therefore, in order that Eq. (76) be satisfied, we must control the dependence upon $N$ of the constructions in the extension argument, or, said differently, we must control the way in which the time-ordered products are renormalized as a function of $N$.
Actually, we will now argue that one can construct time-ordered products $T$ satisfying Eq. (76) [in addition to (t1)–(t8)] from the scalar time-ordered products that were constructed in the previous section, so there is no need to repeat the extension step. The arguments are purely combinatorial and very similar to the kinds of arguments used before in the construction of the algebra $X[\varepsilon]$, so we will only sketch them here. Also, to keep things as simple as possible, we will consider explicitly only the case in which all the $O_i$ contain only one trace and no derivatives, $O_i = \mathrm{Tr}\,\phi^{a_i}$. In order to describe our construction of $T(\prod_i O_i)$, we begin by considering the coefficient distributions $\tau[\otimes_i \phi^{b_i}]$ occurring in the Wick expansion (33) of the time-ordered products $T(\prod \phi^{a_i})$ in the scalar theory ($N = 1$). As is well known, these can be decomposed into contributions from individual Feynman graphs

$$\tau[\phi^{b_1} \otimes \cdots \otimes \phi^{b_n}] = \sum_{\text{graphs } \gamma} c_\gamma \cdot \tau^\gamma[\phi^{b_1} \otimes \cdots \otimes \phi^{b_n}] . \qquad (77)$$
Here, the sum is over all distinct Feynman graphs $\gamma$ (in the scalar theory) with $n$ vertices of valence $b_i$ each (and no external lines), and $c_\gamma$ is a combinatorial factor chosen so that $\tau^\gamma$ coincides with the distribution constructed by the usual Feynman rules (where the latter are well-defined as distributions, i.e. away from the diagonals). Consider now, in $X[\varepsilon]$, the product $\varepsilon^{|O_1|/2} O_1(f_1)\, \varepsilon^{|O_2|/2} O_2(f_2) \cdots$. If one evaluates this product successively using the product formula (64), then one sees that the result is organized in terms of the following double-line Feynman graphs $\Gamma$: each such graph has $n$ vertices labeled by points $x_i$ of valence $a_i$ each (drawn as in Fig. 1). External lines ending on a vertex $x_k$ carrying a color index pair $(ij')$ are associated with a factor of $\phi_{ij'}(x_k)$. Given such a graph $\Gamma$, we form the surface $\mathcal{S}_\Gamma$ by capping off the closed index loops with little surfaces, i.e. the index loops that do not meet any of the factors $\phi_{ij'}(x_k)$. We do not cap off any of the index loops that meet one or more of the factors $\phi_{ij'}(x_k)$ along the way, and these will consequently correspond to holes in the surface. We let $J$ be the number of such holes and we let $h = 1,\dots,J$ be an index labeling the holes. We will say that $k \in h$ if the factor $\phi_{ij'}(x_k)$ is encountered when running around the index loop in the hole labeled by $h$. Finally, for a given double-line graph $\Gamma$, let $\gamma$ be the single-line graph obtained from $\Gamma$ by removing all the external lines, and by replacing all remaining double lines by single ones. Then, for $(x_1,\dots,x_n)$ such that $x_i \neq x_j$ for all $i, j$ (i.e. away from all partial diagonals), we can rewrite $T(\prod \mathrm{Tr}\,\phi^{a_i}(x_i))$ as follows:

$$\varepsilon^{\sum a_i/2}\, T\Big(\prod_i \mathrm{Tr}\,\phi^{a_i}(x_i)\Big) = \sum_{\Gamma} \varepsilon^{J + \sum H_k + \sum (V_k - 2)}\, \tau^\gamma(x_1,\dots,x_n) \times : \prod_h \mathrm{Tr}\Big\{\prod_{k \in h} \varepsilon^{1/2} \phi(x_k)\Big\} :_H , \qquad (78)$$
where $V_k$ is the number of vertices in the $k$th disconnected component of $S_\Gamma$, and $H_k$ the number of handles. The idea now is to define the time-ordered product on the left side by the right side for arbitrary $(x_1, \dots, x_n)$, including configurations on the partial diagonals. A similar definition can be given for operators $O_i$ containing multiple traces or derivatives, the only difference being that the Feynman diagrams that are involved have to also incorporate the multiple traces.

The key point about our definition (78) is that we now have complete control over the dependence upon $N$ of the time-ordered products of gauge invariant elements: Since the expression in the second line of the above equation is an element of $X[\varepsilon]$ (after smearing), it follows that the so-defined $\varepsilon^{\sum a_i/2}\, T(\prod \mathrm{Tr}\,\phi^{a_i}(x_i))$ is an element of $X[\varepsilon]$ (after smearing). Also, since the $\tau$ have been defined so that the corresponding time-ordered products in the scalar theory (see Eq. (33), with $\tau^{0}$ replaced by $\tau$ in that equation) satisfy (t1)–(t8), it follows that the time-ordered products at arbitrary $N$ defined by Eq. (78) also satisfy these properties. A similar argument can be given when the operators $O_i$ contain multiple traces or derivatives. Thus, we have altogether shown that the algebra $X[\varepsilon]$ of formal power series in $\varepsilon = 1/N$ contains the suitably normalized time-ordered products of gauge invariant elements, and that these time-ordered products can be defined so that they satisfy the analogs of (t1)–(t8).

4. The Interacting Field Theory

In the previous sections, we constructed an algebra of observables $X[\varepsilon]$ associated with the free field described by the action (39), whose elements are (finite) power series in the free parameter $\varepsilon = 1/N$. This algebra contains, among others, the gauge invariant smeared Wick powers of the free field and their time-ordered products.
In the present section we will show how to construct from these building blocks the interacting field quantities as power series in $\varepsilon$ and the self-coupling constant in an interacting quantum field theory with free part (39) and gauge invariant interaction part,

\[ S = \int \big[ \mathrm{Tr}(\partial^\mu \phi\, \partial_\mu \phi + m^2 \phi^2) + V(\phi) \big]\, d^d x . \tag{79} \]

For definiteness we consider the self-interaction

\[ V(\phi) = g\, \mathrm{Tr}\, \phi^4 \tag{80} \]

which will be treated perturbatively. We begin by constructing the perturbation series for the interacting fields for a given but fixed $N$. Let $K$ be a compact region in $d$-dimensional Minkowski spacetime, and let $\theta$ be a smooth cutoff function which is equal to 1 on $K$ and which vanishes outside a compact neighborhood of $K$. For the cutoff interaction $\theta(x)V$ and a given $N$, we define interacting fields by Bogoliubov's formula

\[ O_{\theta V}(f) \equiv \frac{\partial}{i\,\partial\lambda}\, S(\theta V)^{-1} S(\theta V + \lambda f O) \Big|_{\lambda = 0} , \tag{81} \]
where the local S-matrices appearing in the above equation are defined in terms of the time-ordered products in the free theory by

\[ S\Big( g \sum_j f_j O_j \Big) = \sum_n \frac{(ig)^n}{n!}\, T\Big( \prod^{n} \sum_j f_j O_j \Big) , \qquad O_j \in V^{\mathrm{inv}} . \tag{82} \]

Although each term in the power series defining the local S-matrix is a well-defined element in the algebra $W_N^{\mathrm{inv}}$, the infinite sum of these terms is not, since this algebra by definition only contains finite sums of generators. We do not want to concern ourselves here with the problem of convergence of the perturbative series, so we will view the local S-matrix, and likewise the interacting quantum fields (81), simply as a formal power series in $g$ with coefficients in $W_N^{\mathrm{inv}}$, that is, as elements of the vector space

\[ W_N^{\mathrm{inv}}[g] = \Big\{ \sum_{n=0}^{\infty} A_n g^n \;\Big|\; A_n \in W_N^{\mathrm{inv}} \ \forall n \Big\} . \tag{83} \]
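As a purely illustrative aside, elements of the space in Eq. (83) can be modelled by truncated coefficient lists, multiplied order by order with the Cauchy-product rule that the text states next. The sketch below is a toy under explicit assumptions: the coefficients $A_n$ are stood in for by 2×2 integer matrices (only to keep the coefficient product non-commutative), and `mat_mul`, `mat_add` and `series_mul` are hypothetical helper names, not objects from the paper.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as tuples of tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def mat_add(a, b):
    """Entrywise sum of two matrices stored as tuples of tuples."""
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))


def series_mul(A, B, order):
    """Coefficients of (sum_n A_n g^n) * (sum_m B_m g^m) up to g^order.

    A and B are lists of matrix coefficients; missing higher coefficients
    are treated as zero. The order of factors A_n * B_m is preserved.
    """
    n = len(A[0])
    zero = tuple(tuple(0 for _ in range(n)) for _ in range(n))
    result = []
    for k in range(order + 1):
        coeff = zero
        for i in range(k + 1):
            j = k - i
            if i < len(A) and j < len(B):
                coeff = mat_add(coeff, mat_mul(A[i], B[j]))
        result.append(coeff)
    return result


# Toy stand-ins for non-commuting algebra elements (an assumption of this sketch).
I2 = ((1, 0), (0, 1))
X = ((0, 1), (0, 0))
Y = ((0, 0), (1, 0))
ab = series_mul([I2, X], [I2, Y], 2)  # (1 + X g)(1 + Y g)
ba = series_mul([I2, Y], [I2, X], 2)  # (1 + Y g)(1 + X g)
```

Each coefficient of the product involves only finitely many terms, which is why no convergence question arises at the level of formal power series; the two orderings above differ at order $g^2$ because the toy coefficients do not commute.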
We make the space $W_N^{\mathrm{inv}}[g]$ into a $*$-algebra by defining the product of two formal power series to be the formal power series obtained by formally expanding out the product of the infinite sums, $(\sum_n A_n g^n) \cdot (\sum_m B_m g^m) = \sum_k \sum_{m+n=k} (A_n \cdot B_m) g^k$, and by defining the $*$-operation to be $(\sum_n A_n g^n)^* = \sum_n A_n^* g^n$.

We now remove the cutoff $\theta$ on the algebraic level. For this, we first note that the coefficients in the power series$^{t}$ defining the interacting field (81) with cutoff are in fact the so-called “totally retarded products”,

\[ O_{\theta V}(f) = O(f) + \sum_{n \ge 1} \frac{(ig)^n}{n!}\, R\big( \underbrace{\theta\, \mathrm{Tr}\, \phi^4 \cdots \theta\, \mathrm{Tr}\, \phi^4}_{n \text{ factors}} ;\, f O \big) , \tag{84} \]
each of which can in turn be written in terms of products of time-ordered products. It can be shown that the retarded products vanish whenever the support of $\theta$ is not in the causal past of the support of $f$. This makes it possible to define the interacting fields not only for compactly supported cutoff functions $\theta$, but more generally for cutoff functions with compact support only in the time direction, i.e. we can choose $K$ to be a time slice. Thus, when $\theta$ is supported in a time slice $K$, then the right side of Eq. (84) is still a well-defined element of $W_N^{\mathrm{inv}}[g]$.

$^{t}$ This series is sometimes referred to as “Haag’s series”, since it was first obtained in [8].

We next remove the restriction to interactions localized in a time slice. For this, we consider a sequence of cutoff functions $\{\theta_j\}$ which are 1 on time slices $\{K_j\}$ of increasing size, eventually covering all of Minkowski spacetime in the limit as $j$ goes to infinity. It is tempting to try to define the interacting field without cutoff as the limit of the algebra elements obtained by replacing the cutoff function $\theta$ in Eq. (81) by the members of the sequence $\{\theta_j\}$. This limit, provided it existed, would in effect correspond to defining the interacting field in such a way that it coincides with the free “in”-field in the asymptotic past. However, it is well known that such an “in”-field will in general fail to make sense in the massless case due to infrared divergences. Moreover, it is clear that the local quantum fields in the interior of the spacetime should at any rate make sense no matter what the infrared behavior of the theory is. As we will see, these difficulties are successfully avoided if, instead of trying to fix the interacting fields as a suitable “in”-field in the asymptotic past, we fix them in the interior of the spacetime.

We now formalize this idea following [10] (which in turn is based on ideas of [3]). For this, it is important that for any pair of cutoff functions $\theta, \theta'$ which are equal to 1 on a time slice $K$, there exists a unitary $U(\theta, \theta') \in W_N^{\mathrm{inv}}[g]$ such that [3]

\[ U(\theta, \theta') \cdot O_{\theta V}(f) \cdot U(\theta, \theta')^{-1} = O_{\theta' V}(f) \tag{85} \]
for all test functions $f$ supported in $K$, and for all $O$. These unitaries are in fact given by

\[ U(\theta, \theta') = S(\theta V)^{-1} S(h_- V) , \tag{86} \]

where $h_-$ is equal to $\theta - \theta'$ in the causal past of $K$ and equal to 0 in the causal future of $K$. Equation (85) shows in particular that, within $K$, the algebraic relations between the interacting fields do not depend on one's choice of the cutoff function. From our sequence of cutoff functions $\{\theta_j\}$, we now define $u_1 = 1$ and unitaries $u_j = U(\theta_j, \theta_{j-1})$ for $j > 1$, and we set $U_j = u_1 \cdot u_2 \cdot \ldots \cdot u_j$. We define the interacting field without cutoff to be

\[ O_V(f) \equiv \lim_{j \to \infty} U_j \cdot O_{\theta_j V}(f) \cdot U_j^{-1} , \tag{87} \]

where $f$ is allowed to be an arbitrary test function of compact support. In fact, using Eqs. (85) and (86), one can show (see [10, Proposition 3.1]) that the sequence on the right side remains constant once $j$ is so large that $K_j$ contains the support of $f$, which implies that the right side is always a well-defined element of $W_N^{\mathrm{inv}}[g]$. The unitaries $U_j$ in Eq. (87) implement the idea to “keep the interacting field fixed in the interior of the slice $K_1$”, instead of keeping it fixed in the asymptotic past. This completes our construction of the interacting fields without cutoff.

These constructions can be generalized to define time-ordered products of interacting fields by first considering the corresponding quantities associated with the cutoff interaction $\theta(x)V$,

\[ T_{\theta V}(f_1 O_1 \cdots f_n O_n) \equiv \frac{\partial^n}{i^n\, \partial\lambda_1 \cdots \partial\lambda_n}\, S(\theta V)^{-1} S\Big( \theta V + \sum_i \lambda_i f_i O_i \Big) \Big|_{\lambda_i = 0} , \tag{88} \]

possessing a similar expansion in terms of retarded products,

\[ T_{\theta V}\Big( \prod_i f_i O_i \Big) = T\Big( \prod_i f_i O_i \Big) + \sum_{n \ge 1} \frac{(ig)^n}{n!}\, R\Big( \underbrace{\theta\, \mathrm{Tr}\, \phi^4 \cdots \theta\, \mathrm{Tr}\, \phi^4}_{n \text{ factors}} ;\, \prod_i f_i O_i \Big) . \tag{89} \]
The corresponding time-ordered products without cutoff, denoted $T_V(\prod f_i O_i)$, are then defined in the same way as the interacting Wick powers, see Eq. (87). The latter are, of course, equal to the time-ordered products with only one factor,

\[ O_V(f) = T_V(f O) . \tag{90} \]
The definition of the interacting fields as elements of $W_N^{\mathrm{inv}}[g]$ depends on the chosen sequence of time slices $\{K_j\}$ and corresponding cutoff functions $\{\theta_j\}$. However, one can show (see [10, p. 138]) that the $*$-algebra generated by the interacting fields does not depend on these choices in the sense that different choices give rise to isomorphic algebras.$^{u}$ Furthermore, the local fields and their time-ordered products constructed from different choices of $\{K_j\}$ and $\{\theta_j\}$ are mapped into each other under this isomorphism. In this sense, our algebraic construction of the interacting field theory is independent of these choices. Although this is not required in this paper, we remark that the above algebras of interacting fields can also be equipped with an action of the Poincaré group on $d$-dimensional Minkowski spacetime by a group of automorphisms transforming the fields in the usual way. Thus, we have achieved our algebraic formulation of the interacting quantum field theory given by the action (79) for an arbitrary, but fixed $N$.

We will now take the large $N$ limit of the interacting field theory on the algebraic level in a similar way as in the free theory described in the previous section, by showing that the (suitably normalized) interacting fields can be viewed as formal power series in the free parameter $\varepsilon = 1/N$, provided that the ’t Hooft coupling

\[ g_t = g N \tag{91} \]

is held fixed at the same time. In fact, the suitably normalized interacting fields will be shown to be elements of a subalgebra of the algebra $X[\varepsilon, g_t]$ of formal power series in $g_t$ with coefficients in $X[\varepsilon]$. To begin, we prove a lemma about the dependence upon $N$ of the $n$th order contribution to the interacting field with cutoff interaction, given by the $n$th retarded product in Eq. (89).

Lemma 1. Let $\varepsilon = 1/N$, $O_i, \Psi_j \in V^{\mathrm{inv}}$, let $f_i, h_j \in D(\mathbb{R}^d)$, and let $T_i$, respectively $S_j$, be the number of traces occurring in $O_i$, respectively $\Psi_j$. Then we have

\[ R\Big( \prod_{i=1}^{n} \varepsilon^{-2 + T_i + |O_i|/2} f_i O_i ;\, \prod_{j=1}^{m} \varepsilon^{S_j + |\Psi_j|/2} h_j \Psi_j \Big) = O(1) , \tag{92} \]

where the notation $O(\varepsilon^k)$ means that the corresponding algebra element of $W_N^{\mathrm{inv}}$ (supposed to be given for all $N$) can be written as a linear combination of the generators $W_a(t)$ with $t$ independent of $\varepsilon$, and with coefficients of order $\varepsilon^k$.
$^{u}$ These algebras do not, of course, define the same subalgebra of $W_N^{\mathrm{inv}}[g]$.
Proof. For simplicity, we first give a proof of Eq. (92) in the case $m = 1$; the case of general $m$ is treated below. Let us define, following [4], the “connected product” in $W_N^{\mathrm{inv}}$ as the $k$-times multilinear maps on $W_N^{\mathrm{inv}}$ defined recursively by the relation

\[ \big( W_{a_1}(t_1) \cdot \ldots \cdot W_{a_k}(t_k) \big)^{\mathrm{conn}} \equiv W_{a_1}(t_1) \cdot \ldots \cdot W_{a_k}(t_k) - \sum_{\{1, \dots, k\} = \cup I} \; \prod_{I}^{\mathrm{class}} \Big( \prod_{j \in I} W_{a_j}(t_j) \Big)^{\mathrm{conn}} , \tag{93} \]

where the “classical product” $\cdot_{\mathrm{class}}$ is the commutative associative product on $W_N^{\mathrm{inv}}$ defined by

\[ W_a(t) \cdot_{\mathrm{class}} W_b(s) = W_{ab}(t \otimes s) , \tag{94} \]
and where the trivial partition $I = \{1, \dots, k\}$ is excluded in the sum. We now analyze the $\varepsilon$-dependence of the connected product, restricting attention for simplicity first to the case when each of the $W_{a_i}(t_i)$ contains only one trace. We use the product formula (63) to evaluate the connected product $(W_{a_1}(t_1) \cdot \ldots \cdot W_{a_k}(t_k))^{\mathrm{conn}}$ as a sum of contributions of the form $\varepsilon^{I} W_b(s)$ associated with Feynman graphs $\Gamma$, where $I$ is the number of index loops in the graph, and where $s \in E'_{|b|}$ does not depend upon $\varepsilon$. It is seen, as a consequence of our definition of the connected product, that precisely the connected diagrams occur in the sum. By arguments similar to the one given in the previous section, the number $I$ associated with a given connected diagram with $k$ vertices is given by $2 - k - H - J$, where $H$ is the number of handles of the surface associated with the diagram, and where $J$ is the number of traces in the algebraic element $W_b(s)$ associated with the contribution of that Feynman graph. Consequently, since $H, J \ge 0$, we have

\[ \big( W_{a_1}(t_1) \cdot \ldots \cdot W_{a_k}(t_k) \big)^{\mathrm{conn}} = O(\varepsilon^{k-2}) \tag{95} \]

when each of the $W_{a_i}(t_i)$ contains only one trace.

Now consider the retarded product when each of the fields has only one trace, and when the supports of the test functions $f_i, h$ satisfy

\[ \mathrm{supp}\, f_i \cap \mathrm{supp}\, h = \mathrm{supp}\, f_i \cap \mathrm{supp}\, f_j = \emptyset . \tag{96} \]

Without loss of generality, we can assume that the supports of the $f_i$ have no intersection with either the causal past or the causal future of the support of $h$ (otherwise, we write each $f_i$ as a sum of two test functions with this property). Under these assumptions, the retarded product is given by [4]

\[ R\Big( \prod_{i=1}^{n} f_i O_i ;\, h \Psi \Big) = \sum_{\pi} [O_{\pi 1}(f_{\pi 1}), [O_{\pi 2}(f_{\pi 2}), \dots [O_{\pi n}(f_{\pi n}), \Psi(h)] \dots ]] , \tag{97} \]

when the supports of all $f_i$ have no point in common with the causal future of the support of $h$, and by 0 otherwise. We now use the following lemma which we are going to prove below:
Lemma 2. Let $B, A_1, \dots, A_n \in W_N^{\mathrm{inv}}$. Then

\[ \sum_{\pi} [A_{\pi n}, [A_{\pi(n-1)}, \dots [A_{\pi 1}, B] \dots ]] = \sum_{\pi} [A_{\pi n}, [A_{\pi(n-1)}, \dots [A_{\pi 1}, B] \dots ]]^{\mathrm{conn}} . \tag{98} \]
Since $\varepsilon^{|O_i|/2} O_i(f_i)$ and $\varepsilon^{|\Psi|/2} \Psi(h)$ can be written in the form $W_a(t)$ for some distributions $t$ not depending on $\varepsilon$, it follows by Eqs. (95) and (97) and the lemma that

\[ R\Big( \prod_{i=1}^{n} \varepsilon^{|O_i|/2} f_i O_i ;\, \varepsilon^{|\Psi|/2} h \Psi \Big) = O(\varepsilon^{n-1}) \tag{99} \]

when the supports of $f_i, h$ satisfy Eq. (96), and when each of the fields $O_i, \Psi$ contains only one trace. When the test functions $f_i, h$ have overlapping supports, the formula (97) for the retarded products is not well-defined, or, alternatively speaking, the formula only defines an algebra-valued distribution on the domain

\[ \big( \times^{n+1}\, \mathbb{R}^d \big) \setminus \bigcup_{I \subset \{1, \dots, n+1\}} D_I , \tag{100} \]

where $D_I$ is a “partial diagonal” in the product manifold $\times^{n+1} \mathbb{R}^d$, see Eq. (32). However, as explained at the end of Sec. 3, we are considering a prescription for constructing the time-ordered (and hence retarded) product possessing a Wick expansion of the form Eq. (78) everywhere, including the diagonals. The Wick expansion Eq. (78) implies that the $N$-dependence on the diagonals is identical to that off the diagonals. Equation (99) therefore follows immediately for all test functions. This proves the desired relation (92) when $m = 1$ and when all fields contain only one trace.

The situation is a bit more complicated when the fields $O_i, \Psi$ contain multiple traces. In that case, we similarly begin by analyzing the $N$-dependence of the connected product (93) when the $W_{a_i}(t_i)$ contain multiple traces, so that each $a_i$ now stands for a multi-index $(a_{i1}, \dots, a_{iT_i})$, where $T_i$ is the number of traces in $W_{a_i}(t_i)$, and where $a_{ij}$ is the number of free field factors appearing in the $j$th trace of $W_{a_i}(t_i)$. It is seen that only the following type of Feynman graphs can occur in the connected product of these algebra elements: the valences of the vertices of the graphs are determined by the number of fields $a_{ij}$ appearing in the $j$th trace of the $i$th algebra element. For each fixed $i$, no $a_{ij}$-vertex can be connected to an $a_{ik}$-vertex. For fixed $i, l$, there exist indices $j, k$ such that the $a_{ij}$-vertex is connected to the $a_{lk}$-vertex. Analyzing the $N$-dependence of these graphs arising from index contractions along closed index loops in the same way as in our analysis of the $N$-dependence of the algebra product (63), we find that the contributions from these Feynman graphs are at most of order

\[ O\big( \varepsilon^{J + 2 \sum H_j + \sum V_j - 2C} \big) , \tag{101} \]
where $C$ is the number of disconnected components of the surface associated with the graph, $J$ is the number of closed index loops containing “external currents”, $V_j$ is the number of vertices in the $j$th disconnected component, and $H_j$ the number of handles (components containing only a single vertex do not count). Clearly, we have $H_j, J \ge 0$ and we know that

\[ \sum V_j = \sum T_i - D , \tag{102} \]

with $D$ the number of vertices that are not connected to any other vertex. In order to estimate the number $C$ of connected components of the graph, we first assume $D = 0$ and imagine the graph obtained by moving all the $a_{ij}$, $j = 1, \dots, T_i$ on top of each other for each $i$. The resulting structure will then only have one connected component, since we know that for fixed $i, l$, there exist indices $j, k$ such that the $a_{ij}$-vertex is connected to the $a_{lk}$-vertex. If we now move the $a_{ij}$, $j = 1, \dots, T_i$ apart again for a given $i$, then it is clear that we will create at most $T_i - 1$ new disconnected components. Doing this for all $i$, we therefore see that our graph can have at most $1 + (T_1 - 1) + \ldots + (T_k - 1)$ disconnected components. If $D$ is not zero, then we repeat this argument for those vertices that are not isolated, and we similarly arrive at the estimate

\[ C \le 1 - k + \sum T_i - D \tag{103} \]

for the number of disconnected components of any graph appearing in the connected product (93). Hence, we find altogether that

\[ \big( W_{a_1}(t_1) \cdot \ldots \cdot W_{a_k}(t_k) \big)^{\mathrm{conn}} = O\big( \varepsilon^{2k - 2 - \sum T_i} \big) \tag{104} \]

when each of the $W_{a_j}(t_j)$ contains $T_j$ traces. We can now finish the proof in just the same way as in the case when all the fields $O_i, \Psi$ contain only a single trace.

Now let $m$ in Eq. (92) be arbitrary and consider a situation wherein the supports of the test functions $f_i, h_j$ satisfy

\[ \mathrm{supp}\, f_i \cap \mathrm{supp}\, h_j = \mathrm{supp}\, f_i \cap \mathrm{supp}\, f_j = \mathrm{supp}\, h_i \cap \mathrm{supp}\, h_j = \emptyset . \tag{105} \]
Without loss of generality, we assume that the support of $f_{i+1}$ has no intersection with the causal future of the support of $f_i$. Then it follows from the recursion formula (74) of [4] together with the causal factorization property of the time-ordered products (17) that

\[
\begin{aligned}
R\Big( \prod_{i=1}^{n} f_i O_i ;\, \prod_{j=1}^{m} h_j \Psi_j \Big)
&= \sum_{\pi} [O_{\pi 1}(f_{\pi 1}), [O_{\pi 2}(f_{\pi 2}), \dots [O_{\pi n}(f_{\pi n}), \Psi_1(h_1) \cdots \Psi_m(h_m)] \dots ]] \\
&= \sum_{I_1 \cup \ldots \cup I_m = \{1, \dots, n\}} \; \prod_{k=1}^{m} \Big( \prod_{i \in I_k} \mathrm{ad}(O_i(f_i)) \Big) [\Psi_k(h_k)]
\end{aligned}
\tag{106}
\]
when the supports of all $f_i$ have no point in common with the causal future of the supports of $h_j$, and by 0 otherwise, and where we have set $\mathrm{ad}(A)[B] = [A, B]$. Since $N^{-|O_i|/2} O_i(f_i)$ and $N^{-|\Psi_j|/2} \Psi_j(h_j)$ can be written in the form $W_a(t)$ for some distribution $t$ not depending on $N$, we conclude by the same arguments as above that

\[ \Big( \prod_{i \in I_k} \mathrm{ad}(O_i(f_i)) \Big) [\Psi_k(h_k)] = O\big( \varepsilon^{\,2|I_k| - \sum_{i \in I_k} (T_i + |O_i|/2) - (S_k + |\Psi_k|/2)} \big) , \tag{107} \]

from which the statement of the lemma follows when the supports of $f_i, h_j$ have the properties (105). The general case can be proved from this as above.

We end the proof of Lemma 1 with the demonstration of Lemma 2: let $A = \sum \lambda_i A_i$ and consider the formal power series expression

\[ e^{-A} \cdot B \cdot e^{A} = \sum_{m, n \ge 0} \frac{1}{m!\, n!}\, (-A)^m \cdot B \cdot A^n . \tag{108} \]

For a fixed $k > 0$, consider the contribution to the sum on the right-hand side arising from diagrams such that precisely $k$ $A$-vertices are disconnected from the other $A$- and $B$-vertices. Since disconnected diagrams factorize with respect to the classical product $\cdot_{\mathrm{class}}$, this contribution is seen to be equal to

\[ \sum_{m, n \ge 0} \; \sum_{r + s = k} \frac{1}{m!\, n!}\, \frac{m!\, n!}{r!\, (m - r)!\, s!\, (n - s)!}\, \big( (-A)^r \cdot A^s \big) \cdot_{\mathrm{class}} \big( (-A)^{m - r} \cdot B \cdot A^{n - s} \big) . \tag{109} \]

But this expression vanishes, due to $\sum_{r + s = k} (-A)^r \cdot A^s / r!\, s! = 0$, showing that $e^{-A} \cdot B \cdot e^{A} = (e^{-A} \cdot B \cdot e^{A})^{\mathrm{conn}}$. The statement of the lemma is obtained by differentiating this expression $n$ times with respect to the parameters $\lambda_i$.
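The vanishing used in the last step can be checked by hand: powers of a single element commute, so $(-A)^r \cdot A^s = (-1)^r A^{k}$ for $r + s = k$, and the sum reduces to $A^k \sum_{r=0}^{k} (-1)^r / (r!\,(k-r)!) = A^k (1-1)^k / k!$. The following minimal sketch verifies this scalar identity with exact rational arithmetic; the function name `disconnected_sum` is a hypothetical label introduced here for illustration only.

```python
from fractions import Fraction
from math import factorial


def disconnected_sum(k):
    """Return sum over r + s = k of (-1)**r / (r! * s!).

    This is the numerical coefficient multiplying A**k in
    sum_{r+s=k} (-A)**r * A**s / (r! s!).
    """
    return sum(
        Fraction((-1) ** r, factorial(r) * factorial(k - r))
        for r in range(k + 1)
    )


# The coefficient is 1 for k = 0 and vanishes for every k > 0.
coefficients = [disconnected_sum(k) for k in range(8)]
```

Only the $k = 0$ term survives, which is the statement that contributions with disconnected $A$-vertices drop out of $e^{-A} \cdot B \cdot e^{A}$.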
Applying the lemma to the retarded products appearing in the definition (84) of the interacting field with cutoff (i.e. $O_i = \mathrm{Tr}\, \phi^4$, so that $T_i = 1$, $|O_i| = 4$ in that case), and using our assumption $g \propto \varepsilon$ (see Eq. (91)), we get

\[ g^n R\Big( \prod^{n} \theta\, \mathrm{Tr}\, \phi^4 ;\, f O \Big) = O(\varepsilon^{-T - |O|/2}) , \tag{110} \]

where $T$ is the number of traces in the field $O$. Therefore, since the cutoff interacting field is a sum of such terms, we have found $O_{\theta V}(f) = O(\varepsilon^{-T - |O|/2})$ for the cutoff interacting fields, viewed now as formal power series in the ’t Hooft coupling parameter $g_t$ rather than $g$. For the interacting time-ordered products with cutoff, we similarly get $T_{\theta V}(\prod O_i(f_i)) = O(\varepsilon^{-\sum (T_i + |O_i|/2)})$. We claim that the same is true for the interacting fields without cutoff:

Proposition 2. Let $\varepsilon = 1/N$, $O \in V^{\mathrm{inv}}$ with $n$ factors of $\phi$ and $T$ traces. Then

\[ O_V(f) = O(\varepsilon^{-T - n/2}) \tag{111} \]
as formal power series in the ’t Hooft coupling $g_t$. More generally, for the interacting time-ordered products

\[ T_V\Big( \prod O_i(f_i) \Big) = O\big( \varepsilon^{-\sum (T_i + n_i/2)} \big) , \tag{112} \]

where $T_i$ is the number of traces in $O_i$, and where $n_i$ is the number of factors of $\phi$ in $O_i$.
Proof. According to our definition of the interacting field without cutoff, Eq. (87), we must show that

\[ \varepsilon^{n/2 + T} \cdot U_j \cdot O_{\theta_j V}(f) \cdot U_j^{-1} = O(1) \tag{113} \]

where $\{\theta_j\}$ and $\{U_j\}$ are sequences of cutoff functions and unitary elements as in our definition of the interacting field, see Eq. (87). We expand $U_j$ and $O_{\theta_j V}(f)$ in terms of the retarded products and use the fact, shown in [4], that only connected diagrams contribute to each term in the resulting formal power series. The $N$-dependence of these terms can then be analyzed in a similar fashion as in the proof of Lemma 1 and gives (113).$^{v}$ The proof for the time-ordered products is similar.

The proposition allows us to view the suitably normalized interacting fields and their time-ordered products as elements of the algebra $X[\varepsilon, g_t]$ of formal power series in $g_t$ with coefficients in $X[\varepsilon]$, i.e. we have shown

\[ \varepsilon^{T + n/2}\, O_V(f) \in X[\varepsilon, g_t] , \tag{114} \]

and similarly for the interacting time-ordered products.$^{w}$ We denote by $A_V$ the subalgebra of $X[\varepsilon, g_t]$ generated by the fields (114) and their time-ordered products,

\[ A_V = \mathrm{alg}\Big\{ \varepsilon^{\sum (T_i + |O_i|/2)} \cdot T_V\Big( \prod_i f_i O_i \Big) \;\Big|\; f_i \in D(\mathbb{R}^d),\ O_i \in V^{\mathrm{inv}} \Big\} \subset X[\varepsilon, g_t] . \tag{115} \]
$^{v}$ Note, however, that the expansion of $U_j$ itself contains negative powers of $\varepsilon$, i.e. it is not true that $U_j$ is of $O(1)$ separately.

$^{w}$ Note that the $\varepsilon$-dependence of the normalization factors necessary to make the interacting fields and their time-ordered products elements of $X[\varepsilon, g_t]$ differs from that in the free field theory, see (75).

By the same arguments as given in [10, p. 138], one can again prove that, as an abstract algebra, $A_V$ does not depend on the choice of the cutoff functions entering in the definition of the interacting field. Since the algebra $A_V$ is an algebra of formal power series in $\varepsilon = 1/N$, the construction of $A_V$ accomplishes the desired algebraic formulation of the $1/N$-expansion for the interacting quantum field theory associated with the action (79).

Since the algebra $A_V$ was constructed perturbatively, it incorporates not only an expansion in $1/N$, but also of course a formal expansion in the coupling parameters. Moreover, one can show that the value of Planck's constant, $\hbar$ (set equal to 1 so far), can be incorporated explicitly into the algebra $A_V$, and it is seen that the classical limit, $\hbar \to 0$, can thereby be included into our algebraic formulation. Following [4], we briefly describe how this is done. One first introduces an explicit dependence on $\hbar$ into the algebra product (5) in $W$ by replacing $\Delta_+$ in that product formula by $\hbar \Delta_+$. With this replacement understood, $W$ can now be viewed as a 1-parameter family of $*$-algebras depending on the parameter $\hbar$. It is possible to set $\hbar = 0$ on the algebraic level. In this limit, $W$ becomes a commutative algebra, and $\frac{1}{i\hbar}$ times the commutator defines a Poisson bracket in the limit. In this way, the 1-parameter family of algebras $W$ depending on $\hbar$ is seen to be a deformation of the classical Poisson algebra associated with the free Klein–Gordon field. These considerations can be generalized straightforwardly to the algebras $X[\varepsilon]$ as well as $X[\varepsilon, g_t]$, and we incorporate the dependence on $\hbar$ of these algebras into the new notation $X[\varepsilon, g_t, \hbar]$. The algebras of interacting fields, $A_V$, with interaction now taken to be $\frac{1}{\hbar} V$, can be seen$^{x}$ to be subalgebras of $X[\varepsilon, g_t, \hbar]$, and therefore depend likewise on the indicated deformation parameters,

\[ A_V = A_V[\varepsilon, g_t, \hbar] . \tag{116} \]
The interacting field algebras consequently have a classical limit, $\hbar \to 0$, and can thereby be seen to be non-commutative deformations of the Poisson algebras of classical (perturbatively defined) field observables associated with the action (79), that depend on $1/N$ as a free parameter. In this way, the expansion of the large $N$ interacting field theory in terms of $\hbar$ is incorporated on the algebraic level, and the classical limit $\hbar \to 0$ can be taken on this level. On the other hand, one can show that the corresponding limit of the vacuum state and of the Hilbert space representations of $A_V$ as operators on Hilbert space cannot be taken. This demonstrates the strength of the algebraic viewpoint.

For a more general interaction

\[ V(\phi) = \sum g_i O_i \tag{117} \]

including interaction vertices $O_i \in V^{\mathrm{inv}}$ with $T_i$ multiple traces, it follows from Lemma 1 that the interacting field will still satisfy Eq. (114), provided that the coupling constants $g_i$ tend to zero for large $N$ in such a way that the corresponding ’t Hooft parameters $g_{it}$ defined by

\[ g_{it} = g_i N^{T_i + |O_i|/2 - 2} \tag{118} \]

remain fixed (note that (91) is the special case $O_i = \mathrm{Tr}\, \phi^4$ of this relation). Thus, if the coupling constants $g_i$ are tuned in the prescribed way, the interacting field algebra $A_V$ is defined as a subalgebra of the algebra $X[\varepsilon, g_{1t}, g_{2t}, \dots]$ of formal power series in the ’t Hooft coupling parameters with coefficients in $X[\varepsilon]$.

$^{x}$ This is a non-trivial statement, because $\frac{1}{\hbar} V$ contains negative powers of $\hbar$. The proof of this statement can be adapted from [4].
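The scaling rule (118) is simple bookkeeping, and a short sketch may help in checking it against Eq. (91). The function `thooft_exponent` below is a hypothetical helper introduced only for illustration; it returns the power $p$ of $N$ in $g_{it} = g_i N^{p}$ for a vertex with a given number of traces and field factors. The double-trace vertex $(\mathrm{Tr}\,\phi^2)^2$ used as a second example is an assumption of this sketch and is not discussed in the text.

```python
from fractions import Fraction


def thooft_exponent(num_traces, num_fields):
    """Power p of N in g_t = g * N**p, following Eq. (118).

    num_traces: T_i, the number of traces in the vertex O_i
    num_fields: |O_i|, the number of factors of phi in O_i
    """
    return num_traces + Fraction(num_fields, 2) - 2


# Single-trace quartic vertex Tr(phi^4): T = 1, |O| = 4, so g_t = g * N.
quartic = thooft_exponent(1, 4)

# Hypothetical double-trace vertex (Tr phi^2)^2: T = 2, |O| = 4.
double_trace = thooft_exponent(2, 4)
```

With these inputs the single-trace quartic vertex gives exponent 1, in agreement with Eq. (91), while the double-trace example would require its coupling to be tuned with one additional power of $1/N$.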
The perturbative expansion of the interacting fields (114) defined by the interaction (80) as an element of $A_V$ is organized in terms of Feynman graphs that are associated with Riemannian surfaces, where contributions from genus $H$ surfaces are suppressed by a factor $\varepsilon^{H}$. To illustrate this in an example, consider the interacting field $\varepsilon^{3/2} (\mathrm{Tr}\, \phi)_{\theta V}$ with cutoff interaction $\theta(x)V$. In order to have a compact notation for the decomposition of the $n$th order retarded product occurring in the perturbative expansion of this interacting field into contributions associated with Feynman graphs, we first consider a corresponding retarded product occurring in $\phi_{\theta V}$ in the theory of a single scalar field with interaction $\theta(x)V$, where $V = g \phi^4$. Such a retarded product can be decomposed in the form [19]

\[ R(V(y_1) \cdots V(y_n) ;\, \phi(x)) = g^n \sum_{\text{graphs } \Gamma} r_\Gamma(y_1, \dots, y_n ; x)\, : \phi^{a_1}(y_1) \cdots \phi^{a_n}(y_n) :_H . \tag{119} \]
The sum is over all connected graphs $\Gamma$ with 4-valent vertices $y_i$ and a 1-valent vertex $x$, and $a_i$ is the number of external legs (i.e. lines with open ends) attached to the vertex $y_i$. The $r_\Gamma$ are c-number distributions associated with the graph which are determined by appropriate Feynman rules. We now look at a corresponding retarded product occurring in the perturbative expansion of the corresponding field $\varepsilon^{3/2} (\mathrm{Tr}\, \phi)_{\theta V}$ in the large $N$ interacting quantum field theory with $V = g\, \mathrm{Tr}\, \phi^4$. By an analysis analogous to the one given in the proof of Lemma 1, it can be shown that such a retarded product can be written as a sum of contributions from individual Feynman graphs as follows:

\[
\begin{aligned}
\varepsilon^{3/2} R(V(y_1) \cdots V(y_n) ;\, \mathrm{Tr}\, \phi(x))
= {} & g_t^n \sum_{\text{genera } H} \varepsilon^{H} \sum_{\text{graphs } \Gamma} \varepsilon^{f/2 + T} \\
& \cdot\, r_\Gamma(y_1, \dots, y_n ; x)\, : \mathrm{Tr}\Big\{ \prod_i \phi(y_i) \Big\} \cdots \mathrm{Tr}\Big\{ \prod_j \phi(y_j) \Big\} :_H .
\end{aligned}
\tag{120}
\]
The expression on the right side is to be understood as follows: $g_t$ is the ’t Hooft coupling (91). The sum is over all distinct Feynman graphs $\Gamma$ that occur in the corresponding expansion (119) in the theory with only a single scalar field, and the c-number distributions $r_\Gamma$ are identical to the ones appearing in that expansion. The sum over graphs is subdivided into contributions grouped together according to their topology specified by the genus, $H$, of the graph, defined as the number of handles of the surface $S$ obtained by attaching faces to the closed index loops occurring in the given graph (we assume that a double-line notation as described in Sec. 3 is used for the propagators and the vertices). The external legs are incorporated by capping off each such external line connected to $y_k$ and ending on the index pair $ij'$ with an “external current” $\phi_{ij'}(y_k)$. The external currents are collected in the normal-ordered term appearing in Eq. (120), where each trace corresponds to following through the index line to which the currents within that trace belong.
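The genus of a double-line graph can be computed mechanically in the simplest situation, a connected graph with no external lines, using the standard Euler relation $V - E + F = 2 - 2H$ for a closed orientable surface. The sketch below is an illustration under stated assumptions, not the paper's construction: the graph is encoded by a permutation `sigma` giving the cyclic order of half-edges around each vertex and an involution `alpha` pairing half-edges into propagators; the helper names `cycles` and `genus` are hypothetical, connectedness is assumed rather than checked, and external currents (holes) are not treated.

```python
def cycles(perm):
    """Number of cycles of a permutation given as a list, perm[i] = image of i."""
    seen = set()
    count = 0
    for start in range(len(perm)):
        if start in seen:
            continue
        count += 1
        i = start
        while i not in seen:
            seen.add(i)
            i = perm[i]
    return count


def genus(sigma, alpha):
    """Genus of a connected double-line graph without external lines.

    sigma: cyclic order of the half-edges around each vertex
    alpha: fixed-point-free involution pairing half-edges into propagators
    """
    vertices = cycles(sigma)
    edges = len(alpha) // 2
    # A closed index loop (face) is traced by crossing a propagator and then
    # turning to the next half-edge at the vertex that is reached.
    faces = cycles([sigma[alpha[h]] for h in range(len(alpha))])
    euler = vertices - edges + faces
    return (2 - euler) // 2


# One 4-valent vertex with half-edges 0, 1, 2, 3 in cyclic order.
sigma = [1, 2, 3, 0]
planar = genus(sigma, [1, 0, 3, 2])      # neighbours contracted: sphere
non_planar = genus(sigma, [2, 3, 0, 1])  # opposite legs contracted: torus
```

For the single quartic vertex, contracting neighbouring legs gives three index loops and genus 0, whereas contracting opposite legs gives a single index loop and genus 1, which is the kind of topological grouping that the sum over genera in Eq. (120) refers to.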
The number of traces in such a normal-ordered term is denoted $T$, and the number of factors of $\phi$ is denoted $f$. The same remarks also apply to the perturbative expansion of the more general gauge invariant fields $\varepsilon^{|O|/2 + T} O_{\theta V}$ in the large $N$ theory. A similar expansion is also valid for the corresponding fields without cutoff $\theta$. Moreover, $V$ may be replaced by an arbitrary (possibly non-renormalizable) local interaction of the form (117), provided that the couplings are tuned in the large $N$ limit in the manner prescribed in Eq. (118).

5. Renormalization Group

Our construction of the interacting field theory given in the previous section is equally valid for interactions $V$ that are renormalizable by the usual power counting criterion as well as for non-renormalizable theories. Let us sketch how the distinction between renormalizable and non-renormalizable theories appears in the algebraic framework that we are working in. For simplicity, let us first consider the theory of a single hermitian scalar field, $\phi$. We take the action of this scalar field to consist of a free part given by Eq. (1), and an interaction given by $V = \sum g_i O_i$, which might be renormalizable or non-renormalizable. (As above, $O_i$ are monomials in $\phi$ and its derivatives.) The difference between renormalizable $V$ and non-renormalizable $V$ shows up in the perturbatively defined interacting quantum field theory as follows: our definition of interacting fields depends on a prescription for defining the Wick powers and their time-ordered products in the free theory, which is given by a map $T$ with the properties (t1)–(t8) specified in Sec. 3. As explained there, these properties do not, in general, determine the time-ordered products (i.e. the map $T$) uniquely, and this consequently leaves a corresponding ambiguity in the definition of the interacting fields.
However, as first showny in [10], the algebra of interacting fields associated with interaction V constructed from a given prescription T is isomorphic to the algebra constructed from any other prescription, T 0 , provided the interaction P is also changed from V to V 0 = gi0 Oi , where each of the modified couplings gi0 is a suitable formal power series in the couplings g1 , g2 , . . . . Renormalizable theories are characterized by the fact that V 0 always has the same form as V , modulo terms of the form already present in the free Lagrangian. If OV are the interacting fields constructed from the interaction V using the first prescription for defining time-ordered products in the free theory, and if OV0 0 are the fields constructed from the interaction V 0 and the second prescription, then the above isomorphism, let us call it R, can be shown [10] to be of the form X 0 R: OiV → Zij · OjV (121) 0 , j
y The constructions in [10] were actually given in the more general context of an interacting (scalar) field theory on an arbitrary globally hyperbolic curved spacetime. An explicit treatment of the special case of Minkowski spacetime was recently given in [6].
May 31, 2004 13:56 WSPC/148-RMP
546
00207
S. Hollands
where we have omitted the smearing functions for simplicity. The “field strength renormalization” constants $Z_{ij}$ are formal power series in $g_1, g_2, \dots$. For renormalizable theories, one can show that only finitely many terms appear in the sum on the right side. The possible terms are restricted in that case by the requirement that the fields $O_j$ on the right side cannot have a greater engineering dimension than the field $O_i$ on the left side. In a non-renormalizable theory, no such restriction occurs. The map $R$ together with the transformation $V \to V'$ corresponds to the “renormalization group” in other approaches. Since the interactions $V$ might be viewed as elements of the abstract vector space $\mathcal{V}$ spanned by the field monomials $O$, we may view the renormalization group as providing a map $\mathcal{V} \to \mathcal{V}$. The subspace of renormalizable interaction vertices $V \in \mathcal{V}$ thus corresponds precisely to the largest finite dimensional subspace of $\mathcal{V}$ that is invariant under all renormalization group transformations. One can in particular consider the special case in which the alternate prescription $T'$ is related to the original prescription, $T$, for defining the time-ordered products in the free theory by a multiplicative change of scale (with multiplication factor $\lambda > 0$), i.e. $T'$ is given in terms of $T$ by Eq. (20). In that case, we obtain a family of isomorphisms $R(\lambda)$ labeled by the parameter $\lambda$, together with one-parameter families $g_i' = g_i(\lambda)$, $V' = \sum_i g_i(\lambda) O_i$ and $Z_{ij}(\lambda)$ (for details, we refer to [10]). By the almost homogeneous scaling behavior of the time-ordered products in the free theory, Eq. (21), it follows that each term appearing in the power series expansions of $g_i(\lambda)$ and $Z_{ij}(\lambda)$ depends at most polynomially on $\ln \lambda$. The functions $\lambda \to g_i(g_1, g_2, \dots, \lambda)$ define the “renormalization group flow” of the theory, which may be viewed as a 1-parameter family (in fact, group) of diffeomorphisms on $\mathcal{V}$.
Thus, our formulation of the renormalization group flow is that a given way of defining the interacting fields $O_V$ (i.e. using a given renormalization prescription) is equivalent, via the isomorphism $R(\lambda)$, to defining the fields $O'_{V'}$ via the “rescaled” prescription (denoted by “prime”) obtained from the previous prescription by changing the “scale” according to Eq. (20), provided that the interaction is at the same time modified to $V' = \sum_i g_i(\lambda) O_i$. We can re-express this renormalization group flow in a somewhat more transparent way by noting that, from Eq. (20), the rescaled prescription (i.e. the “primed” prescription appearing in the renormalization group flow (121)) is given in terms of the original one (up to the isomorphism $\sigma_\lambda$) simply by appropriately rescaling the mass, the field strength and the coordinates in the time-ordered products in the free theory. Thus, by composing $R(\lambda)$ with $\sigma_\lambda$, we get the following equivalent version of our algebraic formulation of the renormalization group flow: let $\mathcal{A}_V(U)$ be the algebra of interacting fields smeared with test functions supported in a region $U \subset \mathbb{R}^d$ of Minkowski space. Then $\rho_\lambda = R(\lambda) \circ \sigma_\lambda$ is given by
$$\rho_\lambda : \mathcal{A}_V^{(m)}(\lambda U) \to \mathcal{A}_{V(\lambda)}^{(\lambda^{-1} m)}(U) , \qquad O_{iV}(\lambda x) \to \sum_j \lambda^{-d_j} Z_{ij}(\lambda) \cdot O_{jV(\lambda)}(x) \qquad (122)$$
May 31, 2004 13:56 WSPC/148-RMP
00207
Algebraic Approach to the 1/N Expansion in Quantum Field Theory
547
and is again an isomorphism, where we are now indicating the dependence of the algebras upon the mass parameter, $m$. Here, $V(\lambda) = \sum_i \lambda^{-\delta_i} g_i(\lambda) O_i$, where $d_i$ is the engineering dimension of the field $O_i$ and $\delta_i$ the engineering dimension of the corresponding coupling $g_i$, and the functions $Z_{ij}(\lambda)$, $g_i(\lambda)$ are as in Eq. (121). Stated differently, the action of $\rho_\lambda$ is described as follows: if the argument of an interacting field is rescaled by $\lambda$, this is equivalent via $\rho_\lambda$ to a redefinition of the interaction, $V \to V(\lambda)$, together with a suitable redefinition of the field strength by the matrix $\lambda^{-d_j} Z_{ij}(\lambda)$. We also note explicitly that Eq. (122) makes reference to only one given renormalization prescription. An important feature of our algebraic formulation of the renormalization group flow is that it is given directly in terms of the interacting field operators which are members of the algebra $\mathcal{A}_V$, rather than in terms of the correlation functions of these objects, as is normally done. Of course, one can always apply a state (i.e. a normalized linear functional on the field algebra) to the relation (121) and thereby obtain a relation for the behavior of the Green’s functions under a rescaling. Our algebraic formulation makes it clear that the existence of the renormalization group flow is an algebraic property of the theory, i.e. it is encoded in the local algebraic relations between the quantum fields. It has nothing to do a priori with the vacuum state or e.g. the superselection sector of the theory. Besides offering a conceptually new perspective on the nature of the renormalization group flow, our algebraic formulation has the advantage that, since the construction is essentially of a local nature, it works regardless of what the infrared behavior of the theory is. This makes the algebraic approach superior e.g.
in curved spacetime [10], where there is no preferred vacuum state, and where moreover the infrared behavior of generic states is very difficult to control (and at any rate, depends upon the behavior of the spacetime metric at large distances). The statements just made for the theory of a single, scalar field carry over straightforwardly to a multiplet of scalar fields. In particular, they are true for the theory of a field $\phi$ in the $N \otimes \bar{N}$ representation of the group $U(N)$ with action (79), for any arbitrary but fixed $N$. The aim of the present section is to show that, for gauge invariant interactions, the algebraic formulation of the renormalization group carries over in a meaningful way in the limit of large $N$, or more properly, that the renormalization group can be defined in the sense of power series in $\varepsilon = 1/N$, with positive powers. For this, consider two different prescriptions $T$ and $T'$ for defining Wick powers and time-ordered products in the free theory satisfying (t1)–(t8), as well as Eq. (76). An explicit construction of such a prescription was given at the end of Sec. 3, but we will not need to know the details of that construction here. For a given gauge invariant interaction $V = \sum_i g_i O_i \in \mathcal{V}^{\rm inv}$ (renormalizable or non-renormalizable), let $\mathcal{A}_V$, respectively $\mathcal{A}'_V$, be the algebras of interacting field observables constructed via the two prescriptions, each of which is a subalgebra of $\mathcal{X}[\varepsilon, g_{1t}, g_{2t}, \dots]$, where the $g_{it}$ are the 't Hooft coupling parameters related to the couplings $g_i$ in the interaction via formula (118). Let the interacting quantum fields in these algebras be $\varepsilon^{|O|/2+T} O_V$,
respectively $\varepsilon^{|O|/2+T} O'_V$ ($T$ the number of traces), defined as formal power series in $\varepsilon$ and the 't Hooft parameters $g_{it}$.

Proposition 3. For any given $V = \sum_i g_i O_i \in \mathcal{V}^{\rm inv}$ there exists a $V' = \sum_i g_i' O_i \in \mathcal{V}^{\rm inv}$ and a *-isomorphism
$$R: \mathcal{A}_V \to \mathcal{A}'_{V'} \qquad (123)$$
such that $g_i' = g'_{it}\, \varepsilon^{T_i + |O_i|/2 - 2}$ ($T_i$ is the number of traces in the field $O_i$), with
$$g'_{it} = g'_{it}(g_{1t}, g_{2t}, \dots, \varepsilon) \qquad (124)$$
a formal power series in $g_{it}$ and $\varepsilon$ (i.e. containing only positive powers of $\varepsilon$). The action of $R$ on a local field is given by
$$R: \underbrace{\varepsilon^{|O_i|/2+T_i} O_{iV}}_{\in \mathcal{A}_V} \to \sum_j Z_{ij} \cdot \underbrace{\varepsilon^{|O_j|/2+T_j} O'_{jV'}}_{\in \mathcal{A}'_{V'}} , \qquad (125)$$
where the $Z_{ij}$ are formal power series in $g_{1t}, g_{2t}, \dots$ and $\varepsilon$, and where $T_i$ is the number of traces in $O_i$. [Recall that the $\varepsilon$-normalization factor in expressions like $\varepsilon^{|O|/2+T} O_V$ in the above equation is precisely the factor needed to make the latter an element of $\mathcal{A}_V$.] A similar formula holds for the time-ordered products. Moreover, if the “prime” prescription is related to the “unprime” prescription via a multiplicative change of scale (with multiplication factor $\lambda$), then each term in the expansion of $g'_{it}(\lambda)$ and $Z_{ij}(\lambda)$ depends at most polynomially on $\ln \lambda$, e.g.
$$Z_{ij}(\lambda) = \sum_{a_1, a_2, \dots, h \ge 0} z_{ij,a_1 a_2 \cdots h}(\ln \lambda)\, g_{1t}^{a_1} g_{2t}^{a_2} \cdots \varepsilon^h \qquad (126)$$
where the $z_{ij,a_1 a_2 \cdots h}$ are polynomials in $\ln \lambda$.
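As a quick illustration of the exponent in $g_i' = g'_{it}\,\varepsilon^{T_i + |O_i|/2 - 2}$, the following worked cases evaluate it for three sample monomials. This is only a sketch: the monomials are chosen here for illustration, and it assumes that $|O|$ counts the factors of $\phi$ in $O$ (the quantity called $f$ earlier) and that the unprimed couplings scale in the same way.

```latex
\begin{align*}
O = \mathrm{Tr}\,\phi^3 :&\quad T = 1,\ |O| = 3 &&\Rightarrow\quad g = g_t\,\varepsilon^{1/2},\\
O = \mathrm{Tr}\,\phi^4 :&\quad T = 1,\ |O| = 4 &&\Rightarrow\quad g = g_t\,\varepsilon,\\
O = (\mathrm{Tr}\,\phi^2)^2 :&\quad T = 2,\ |O| = 4 &&\Rightarrow\quad g = g_t\,\varepsilon^{2}.
\end{align*}
```

Under these assumptions a single-trace quartic coupling is suppressed by one power of $1/N$ and a double-trace quartic coupling by two.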
Proof. For any given, but fixed $N$, one can show by the same arguments as in [11] that any two prescriptions $T$ and $T'$ for defining time-ordered products with properties (t1)–(t8) are related to each other in the following way:
$$T'\Big(\prod_{i=1}^n f_i O_i\Big) = T\Big(\prod_{i=1}^n f_i O_i\Big) + \sum_{\cup_j I_j = \{1,\dots,n\}} T\Big(\prod_j \delta_{|I_j|}\Big(\prod_{k \in I_j} f_k O_k\Big)\Big) . \qquad (127)$$
Here, the following notation has been introduced: the sum runs over all partitions of the set $\{1, \dots, n\}$, excluding the trivial partition. The $\delta_k$ are maps
$$\delta_k : \otimes^k \mathcal{D}(\mathbb{R}^d; \mathcal{V}^{\rm inv}) \to \mathcal{D}(\mathbb{R}^d; \mathcal{V}^{\rm inv}) , \qquad (128)$$
characterizing the difference between $T$ and $T'$ at order $k$. The maps $\delta_k$ have the form$^{z}$
$$\delta_k\Big(\prod_{i=1}^k f_i O_i\Big) = \sum_i F_{k,i} \Psi_i , \qquad (129)$$
z Note that $\delta_1$ is not the identity, since we are allowing ambiguities in the definition of Wick powers, rather than defining them by normal ordering.
where the functions $F_{k,i}$ are of the form
$$F_{k,i}(x) = \sum_{(\mu_1) \cdots (\mu_k)} c_{k,i}^{(\mu_1) \cdots (\mu_k)} \prod_k \partial_{(\mu_k)} f_k(x) , \qquad (130)$$
with each $(\mu_i)$ denoting a symmetrized spacetime multi-index $(\mu_{i1} \cdots \mu_{is})$, and with each $c_{k,i}^{(\mu_1) \cdots (\mu_k)}$ denoting a Lorentz invariant tensor field (independent of $x$). Let us define a $V'$ in $\mathcal{V}^{\rm inv}[g_1, g_2, \dots]$ (the space of formal power series in $g_i$ with coefficients in $\mathcal{V}^{\rm inv}$) by
$$V' = \lim_{j \to \infty} \sum_{k \ge 1} \frac{i^k}{k!}\, \delta_k\Big(\prod^k \theta_j V\Big) , \qquad (131)$$
where $\{\theta_j\}$ represents any series of cutoff functions that are equal to 1 on compact sets $K_j$ exhausting $\mathbb{R}^d$ in the limit as $j$ goes to infinity. Then, for any given but fixed $N$, the result [11] establishes the existence of an isomorphism $R$ between the algebras of interacting field observables associated with the two prescriptions satisfying Eq. (121) for some set of formal power series $Z_{ij}$ in $g_1, g_2, \dots$, where the fields in that equation are now given by gauge invariant expressions in $\mathcal{V}^{\rm inv}$. In order to prove the theorem, we must show that the interaction $V'$ and the factors $Z_{ij}$ appearing in the automorphism $R$ have the $N$-dependence specified by Eqs. (121), respectively (120). This will guarantee that the above automorphisms $R$ defined separately for each $N$ give rise to a corresponding automorphism of the interacting field algebras, viewed now as depending on $\varepsilon = 1/N$ as a free parameter. In order to analyze the $N$-dependence of $V'$, let us consider the prescription $T''$ defined by $T'$ when applied to $k$ factors or more, and defined by Eq. (127) when applied to $n \le k - 1$ factors. Then, by definition, the prescriptions $T$ and $T''$ will agree on $n \le k - 1$ factors, and
$$T''\Big(\prod_{i=1}^k f_i O_i\Big) - T\Big(\prod_{i=1}^k f_i O_i\Big) = \sum_i \Psi_i(F_{k,i}) , \qquad (132)$$
where $F_{k,i}$ is as in Eq. (130). If $S_i$ is the number of traces in the field $\Psi_i$, then we claim that
$$\Psi_i(F_{k,i}) = O\big(1/N^{S_i + 2k - 2 - \sum T_j + |O_j|/2}\big) , \qquad (133)$$
where we recall that an algebra element $A \in \mathcal{W}_N^{\rm inv}$ given for all $N$ is said to be $O(1/N^h)$ if it can be written as $1/N^h$ times a sum of terms of the form $W_a(t_a)$, with each $t_a \in \mathcal{E}'_{|a|}$ depending only on positive powers of $1/N$. Using Eq. (75), Eq. (133) is equivalent to
$$F_{k,i} = O\big(1/N^{S_i + |\Psi_i|/2 + 2k - 2 - \sum T_j + |O_j|/2}\big) . \qquad (134)$$
Assuming that this has been shown, we get the statement (120) about the $N$-dependence of $V'$ by plugging this relation into Eqs. (128), (130) and (131), and using the definition of the 't Hooft couplings, Eq. (118). In order to show (134), let
us begin by introducing the “connected time-ordered product” as the map $T^{\rm conn} : \otimes^k \mathcal{D}(\mathbb{R}^d, \mathcal{V}^{\rm inv}) \to \mathcal{W}_N^{\rm inv}$ defined recursively in terms of $T$ by
$$T^{\rm conn}\Big(\prod_{i=1}^n f_i O_i\Big) \equiv T\Big(\prod_{i=1}^n f_i O_i\Big) - \sum_{\{1,\dots,n\} = \cup I}\ {\prod_I}^{\rm class}\, T^{\rm conn}\Big(\prod_{j \in I} f_j O_j\Big) , \qquad (135)$$
where the “classical product” $\cdot_{\rm class}$ is the commutative associative product on $\mathcal{W}_N^{\rm inv}$ defined by Eq. (94), and where the trivial partition $I = \{1, \dots, k\}$ is excluded in the sum. By definition, we have $T''^{\rm conn} = T^{\rm conn}$ when acting on $n \le k - 1$ factors, because $T'' = T$ in that case. This implies that we can alternatively write $\sum_i \Psi_i(F_{k,i})$ in Eq. (132) as the corresponding difference of connected time-ordered products. By a line of arguments similar to the proof of Eq. (104) in Lemma 1, using that only connected Feynman diagrams contribute to the connected time-ordered products, it can be seen that
$$T^{\rm conn}\Big(\prod_{i=1}^k f_i O_i\Big) = \sum_j \sum_{a = (a_1, \dots, a_j)} (1/N)^{j + 2k - 2 - \sum T_l + |O_l|/2}\, W_a(t_a(\otimes_i f_i)) , \qquad (136)$$
where each $t_a$ is a linear map
$$t_a : \otimes^k \mathcal{D}(\mathbb{R}^d) \to \mathcal{E}'_{|a|} \qquad (137)$$
which can contain only positive powers of $1/N$. A completely analogous estimate holds for $T''^{\rm conn}$, with $t_a$ replaced by maps $t''_a$ with the same property. By Eq. (132) (with the time-ordered products replaced by the connected products in that equation), we therefore find
$$\sum_i \Psi_i(F_{k,i}) = \sum_j \sum_{a = (a_1, \dots, a_j)} (1/N)^{j + 2k - 2 - \sum T_l + |O_l|/2}\, W_a(s_a) , \qquad (138)$$
where we have set $s_a = t_a - t''_a$. We now write the expressions appearing on the left side as $\Psi_i(F_{k,i}) = W_a(u_a)$, where the distributions $u_a$ are related to $\Psi_i$ and $F_{k,i}$ via a relation of the form (28) and (29). If we now match the terms on both sides of this equation and use the linear independence of the $W_a$’s, we obtain the desired estimate (133). As already explained, this proves the desired $N$-dependence of $V'$. The proof that the field strength renormalization factors $Z_{ij}$ in Eq. (121) have the desired $N$-dependence expressed in Eq. (125) is very similar to the proof that we have just given, so we only sketch the argument. For a given, but fixed $N$, the factors $Z_{ij}$ are defined implicitly by the relation
$$\lim_{l \to \infty} \sum_{k \ge 0} \frac{i^k}{k!}\, \delta_{k+1}\Big(f O_i , \prod^k \theta_l V\Big) = \sum_j Z_{ij} f O_j . \qquad (139)$$
The desired $N$-dependence of the field strength renormalization factors implicit in Eq. (125) is equivalent to
$$Z_{ij}(\varepsilon, g_1, g_2, \dots) = \varepsilon^{|O_i|/2 + T_i - |O_j|/2 - T_j} \cdot Z_{ij}(\varepsilon, g_{1t}, g_{2t}, \dots) , \qquad (140)$$
where the $Z_{ij}$ are formal power series in the 't Hooft parameters and $\varepsilon$ (i.e. depending only on positive powers of $\varepsilon$), and where $T_i$ is the number of traces in the field $O_i$. In order to prove this equation from the definition (139), one proceeds by analyzing the $N$-dependence of the maps $\delta_{k+1}$ in the same way as above. The desired polynomial dependence of the coefficients of $Z_{ij}(\lambda)$ and $g'_{it}(\lambda)$ on $\ln \lambda$ when $T'$ arises from $T$ via a scale transformation follows as in [10] from the almost homogeneous scaling behavior (21) of the time-ordered products in the free theory.

6. Reduced Symmetry

In the previous sections, we have constructed interacting field algebras associated with a $U(N)$-invariant action as a power series in $1/N$. Instead of considering $U(N)$-invariant actions, one can also consider actions that are only invariant under some subgroup. In the present section we will consider actions of the form Eq. (81) in which the free part of the action is invariant under the full $U(N)$-group, and in which the interaction term $V$ is now invariant only under a subgroup $G$ of the form
$$G = U(N_1) \times \cdots \times U(N_k) \subset U(N) , \qquad (141)$$
where $\sum_\alpha N_\alpha = N$. Since the perturbative construction of a quantum field theory with such an interaction involves Wick powers and time-ordered products in the free theory that are not invariant under the full $U(N)$ symmetry group but only under the subgroup, we begin by describing the algebra
$$\mathcal{W}_N^G = \{A \in \mathcal{W}_N \mid \alpha_U(A) = A , \ \forall\, U \in G\} \qquad (142)$$
of observables invariant under $G$ of which these fields are elements. If we define $P_\alpha$ to be the projection matrix corresponding to the $\alpha$th factor in the product (141), then it is easy to see that $\mathcal{W}_N^G$ is spanned by expressions of the form
$$W_{a,\alpha}(t) = \frac{1}{N^{|a|/2}} \int : \mathrm{Tr} \prod_{i_1 \in I_1} \phi(x_{i_1}) P_{\alpha_{i_1}} \cdots \mathrm{Tr} \prod_{i_T \in I_T} \phi(x_{i_T}) P_{\alpha_{i_T}} : \, t(x_1, \dots, x_{|a|}) \prod_j d^d x_j . \qquad (143)$$
Here, $a$ represents a multi-index $(a_1, \dots, a_T)$, $\alpha$ represents a multi-index $(\alpha_1, \dots, \alpha_{|a|})$, the $I_j$’s are mutually disjoint index sets with $a_j$ elements each such that $\cup_j I_j = \{1, \dots, |a|\}$, and $t$ is a distribution in the space $\mathcal{E}'_{|a|}$. The large $N$ limit of the algebras $\mathcal{W}_N^G$ can be taken in a similar way as in the case of full symmetry described in Sec. 3, provided that the ratios
$$s_\alpha = N_\alpha / N \qquad (144)$$
have a limit. As above, the large $N$ limit is incorporated in the construction of a suitable algebra $\mathcal{X}_s[\varepsilon]$, depending now on the ratios $s = (s_1, \dots, s_k)$, of polynomial
expressions in $\varepsilon$ whose coefficients are given by generators $W_{a,\alpha}(t)$. To work out the algebra product between two such generators as a power series in $\varepsilon = 1/N$, it is useful again to consider first the simplest case $d = 0$ corresponding to the matrix model given by the action functional (47), and to consider the product of the matrix generators
$$W_{a,\alpha} = \frac{1}{N^{|a|/2}} : \prod_i^T \mathrm{Tr}\, \underbrace{M P_{\alpha_i} M P_{\alpha_{i+1}} \cdots M P_{\alpha_k}}_{a_i \text{ factors of } M} : \, , \qquad (145)$$
corresponding to the algebra elements (143). We expand the product of two such generators in terms of Feynman graphs as in Sec. 3, the only difference being that the vertices corresponding to the traces in Eq. (145) now also contain projection operators $P_\alpha$. We take this into account by modifying our notation of these vertices by indicating also the projectors adjacent to the vertex. As an example, consider the generator with a single trace given by
$$W_{3,(\alpha_1,\alpha_2,\alpha_3)} = \frac{1}{N^{3/2}} : \mathrm{Tr}\, P_{\alpha_1} M P_{\alpha_2} M P_{\alpha_3} M : \, . \qquad (146)$$
This generator will contribute a 3-valent vertex drawn in the following picture:

[Fig. 3. The 3-valent vertex, with index lines labeled $ij'$, $jk'$, $ki'$ and the adjacent projectors $P_{\alpha_1}$, $P_{\alpha_2}$, $P_{\alpha_3}$.]

A closed loop of index contractions occurring in a diagram associated with the product $W_{a,\alpha} \cdot W_{b,\beta}$ will contribute a factor of
$$\mathrm{Tr}\Big(\prod_{\gamma \in \{\alpha_i, \beta_j\}} P_\gamma\Big) , \qquad (147)$$
where the product is over all projectors that are encountered when following the index loop. But the projection matrices $P_\gamma$ are mutually orthogonal, $P_{\gamma'} P_\gamma = \delta_{\gamma\gamma'} P_\gamma$, so this factor is given by $N_\gamma$ if the projectors in the index loop are all equal to some $\gamma$, and vanishes otherwise. If we let $I_\alpha$ be the number of index loops in a given graph containing only projections on the $N_\alpha$-subspace, then we consequently get
$$W_{a,\alpha} \cdot W_{b,\beta} = \sum_{\text{graphs}} s_1^{I_1} \cdots s_k^{I_k}\, \varepsilon^{J + \sum H_k + \sum (V_k - 2)} \prod_{\text{lines } (k,l)} \frac{1}{m^2} \cdot W_{c,\gamma} , \qquad (148)$$
where the sum is only over graphs whose index loops contain only one kind of projectors, and where $\varepsilon = 1/N$ as usual. As in the previous section, $W_{\gamma,c}$ arises from the graphs with loops containing $c_j$ external currents each, and $c$ denotes the multi-index $(c_1, c_2, \dots)$. The new feature is that each of these loops now also contains projection operators $P_\gamma$. As in the previous section, we can generalize the considerations leading to formula (148) to determine the product of algebra elements $W_{a,\alpha}(t)$, $t \in \mathcal{E}'_{|a|}$, when the spacetime dimension is not zero. We thereby obtain a family of algebras $\mathcal{X}_s[\varepsilon]$ depending analytically on the ratios $s = (s_1, \dots, s_k)$. Since by definition $0 \le s_\alpha \le 1$ and $\sum_\alpha s_\alpha = 1$, the tuples $s$ can naturally be viewed as elements of the standard $(k - 1)$-dimensional simplex
$$\Delta_{k-1} = \Big\{ (s_1, \dots, s_k) \in \mathbb{R}^k \ \Big|\ 0 \le s_\alpha \le 1, \ \sum_\alpha s_\alpha = 1 \Big\} . \qquad (149)$$
Thus, our construction of the algebras associated with the reduced symmetry group yields a bundle
$$\sigma_{k-1} : \Delta_{k-1} \to \mathrm{Alg} , \qquad s \to \mathcal{X}_s[\varepsilon] \qquad (150)$$
of algebras over the standard $(k - 1)$-simplex for every $k$. The parameters $s$ interpolate continuously between situations of different symmetry. If $\Delta_l$ is a face of $\Delta_k$ (so that $l < k$), then the assignment fulfills the “self-similar” restriction property
$$\sigma_k |_{\Delta_l} = \sigma_l . \qquad (151)$$
The extremal points of $\Delta_k$ (i.e. the zero-dimensional faces) correspond to full symmetry, i.e. the restriction of $\sigma_k$ to these points yields the algebras $\mathcal{X}[\varepsilon]$ constructed in Sec. 3. We now repeat the construction of the interacting field algebras $\mathcal{A}_V$, $V = \sum_i g_i O_i$, with each $O_i$ an expression in the field that is invariant under the reduced symmetry group. We denote the vector space of such formal expressions by
$$\mathcal{V}_k^{\rm inv} = \mathrm{span}\Big\{ O = \prod_i \mathrm{Tr}\Big( \prod^{a_i} (P_{\alpha_l} \partial_{\mu_1} \cdots \partial_{\mu_j} \phi) \Big) , \ \alpha_l = 1, \dots, k \Big\} , \qquad (152)$$
(note that $\mathcal{V}_1^{\rm inv}$ can naturally be identified with $\mathcal{V}^{\rm inv}$ in the notation introduced earlier). For an arbitrary but fixed $N$, a gauge invariant interaction $V = \sum_i g_i O_i \in \mathcal{V}_k^{\rm inv}$ gives rise to corresponding interacting quantum fields $O_V$ and their time-ordered products as elements in the corresponding algebra $\mathcal{W}_N^G[g_1, g_2, \dots]$. The limit $N \to \infty$ can be taken on the algebraic level in the same way as in the case of full symmetry described in Sec. 4, provided that the ratios $s_\alpha = N_\alpha / N$ are held fixed, and provided that the couplings are tuned as in (118). This construction directly leads to an algebra $\mathcal{A}_{V,s}$ of formal power series in $\varepsilon = 1/N$ as well as the 't Hooft couplings $g_{it}$, of which the smeared normalized gauge invariant interacting fields $\varepsilon^{|O|/2+T} O_V(f)$ and their time-ordered products are elements. The algebra $\mathcal{A}_{V,s}$ is
now a subalgebra of the algebra $\mathcal{X}_s[\varepsilon, g_{1t}, g_{2t}, \dots]$ of formal power series in the 't Hooft couplings, with coefficients in the algebra $\mathcal{X}_s[\varepsilon]$. We have constructed in this way a family of algebras $\mathcal{A}_{V,s}$ parametrized by deformation parameters $s$, i.e. a bundle
$$\sigma_{V,k} : \Delta_{k-1} \to \mathrm{Alg} , \qquad s \to \mathcal{A}_{V,s} , \qquad (153)$$
where $\Delta_{k-1}$ is the $(k - 1)$-dimensional standard simplex (149) of which the $s$ are elements. These deformation parameters smoothly interpolate between situations of different symmetry as well as between different interactions. For example, $s_1 = 1$, $s_2 = \cdots = s_k = 0$ (i.e. $N_1 = N$) corresponds to the extremal case of full symmetry, where only those terms in the interaction $V \in \mathcal{V}_k^{\rm inv}$ contribute that contain only projectors $P_1$ associated with the $N_1$-factor in the symmetry group. More generally, if $\Delta_l$ is a face of $\Delta_k$, $l < k$, then we have
$$\sigma_{V,k} |_{\Delta_l} = \sigma_{V,l} , \qquad (154)$$
where it is understood that the “$V$” appearing in $\sigma_{V,l}$ is the formal expression in $\mathcal{V}_l^{\rm inv}$ obtained by dropping in $V \in \mathcal{V}_k^{\rm inv}$ all terms containing projectors that are not associated with the extremal points of $\Delta_l$. Using the restriction property (154), one can also construct bundles of interacting field algebras over an arbitrary $k$-dimensional ($C^0$-) manifold $X$ by triangulating $X$ into simplices $\Delta_l$. As in the case of full symmetry, the perturbative expansion of the interacting fields $\varepsilon^{|O|/2+T} O_V$ defined via an interaction $V \in \mathcal{V}_k^{\rm inv}$ as an element of $\mathcal{A}_{V,s}$, $s = (s_1, \dots, s_k)$, is organized in terms of Feynman graphs that are associated with Riemann surfaces. Moreover, the faces of these Feynman graphs defined by the closed index loops are now “colored” by the numbers $s_\alpha$. To illustrate this in an example, consider the interacting field $\varepsilon^{3/2} (\mathrm{Tr}\, \phi)_{\theta V}$ in the case of 3 colors, $k = 3$, with interaction $\theta(x) V$, where $\theta$ is a cutoff function, and where we take $V$ to be
$$V(\phi) = g \sum_{\alpha_i \in \{1,2,3\}} \mathrm{Tr}\, (P_{\alpha_1} \phi P_{\alpha_2} \phi \cdots P_{\alpha_n} \phi) . \qquad (155)$$
In order to make things a little more interesting, we restrict the sum in this equation to sequences of colors $(\alpha_1, \dots, \alpha_n)$ such that
$$\alpha_1 \ne \alpha_2 \cdots \ne \alpha_n \ne \alpha_1 . \qquad (156)$$
In the graphical notation introduced above this condition means that the vertices occurring in $V$ are restricted by the property that adjacent projectors $P_{\alpha_i}$ (as one moves around the vertex) are different. A retarded product appearing in the perturbative expansion of $\varepsilon^{3/2} (\mathrm{Tr}\, \phi)_{\theta V}$ with $V$ given by Eq. (155) can now be written as a sum of contributions from individual Feynman graphs as follows:
555
1 F2 F3 CC,Γ · sF 1 s2 s3 ·
graphs Γ coloringsC
· εf /2+T rΓ (y1 , . . . , yn ; x) : Tr
YY i
αi
φ(yi )Pαi · · · Tr
YY j
φ(yj )Pβj :H . (157)
βj
The notation used in the expression is analogous to that in the corresponding Eq. (120) in the case of full symmetry, with the following differences: the $r_\Gamma$ are the distributions appearing in the expansion of the interacting field in scalar $\phi^n$-theory. In contrast to Eq. (120), there appears now an additional sum over colorings, $C$, i.e. over all ways to assign colors $s_1, s_2, s_3$ to those little surfaces in the big surface $S$ associated with the Feynman graph not containing currents, in such a way that adjacent surfaces are never occupied by the same color (this corresponds to the property (156) of $V$), and $F_\alpha$ is the number of such little surfaces colored by $s_\alpha$. The combinatorial factor $C_{C,\Gamma}$ counts the number of ways in which a given coloring scheme can be produced by assigning the different terms in $V$ to the vertices. The external currents are again collected in the normal-ordered term appearing in Eq. (157), where each trace corresponds to following through the index line to which the currents within that trace belong, but there now appear also the projectors $P_\alpha$ that are encountered when following through such an index line. A similar expansion can be written down for the interacting fields without cutoff $\theta$, as defined in Eq. (87). Thus, roughly speaking, the Feynman expansion of an interacting field $\varepsilon^{3/2} (\mathrm{Tr}\, \phi)_V$ with interaction $V$ given by (155) differs from the corresponding expansion of $\phi_V$ in the theory of a single scalar field with $V = g\phi^n$ only in that the little surfaces in the graphs defined by the propagator lines are now colored according to the structure of the interaction $V$, and each coloring is weighted by the number $\prod_\alpha s_\alpha^{F_\alpha}$, where $F_\alpha$ is the number of little surfaces colored by $\alpha \in \{1, 2, 3\}$. The property (156) of the interaction chosen in our example implies that only those graphs occur which can be colored by 3 colors in such a way that adjacent little surfaces have different colors.
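The coloring weights described above can be illustrated with a small enumeration. The following is a minimal sketch under simplifying assumptions, not the construction of the paper: the function name `coloring_weight` and the two sample adjacency lists are hypothetical, the little surfaces are represented only by an abstract adjacency list, and the combinatorial factor and the distributions $r_\Gamma$ are ignored. It sums $\prod_\alpha s_\alpha^{F_\alpha}$ over all colorings in which adjacent surfaces differ.

```python
from fractions import Fraction
from itertools import product


def coloring_weight(num_faces, adjacent_pairs, s):
    """Sum of prod_alpha s_alpha**F_alpha over all proper colorings.

    num_faces      -- number of little surfaces (faces) to be colored
    adjacent_pairs -- pairs (a, b) of faces that share a propagator line
    s              -- tuple of ratios s_alpha, one per color
    """
    total = Fraction(0)
    for assignment in product(range(len(s)), repeat=num_faces):
        # Discard colorings in which two adjacent faces carry the same color.
        if any(assignment[a] == assignment[b] for a, b in adjacent_pairs):
            continue
        weight = Fraction(1)
        for color in assignment:
            weight *= s[color]  # one factor s_alpha per face colored alpha
        total += weight
    return total


# Illustrative ratios s_alpha = N_alpha / N, summing to 1.
s = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))

# Three mutually adjacent faces: 3-colorable.
triangle = [(0, 1), (1, 2), (0, 2)]
# Four mutually adjacent faces: needs 4 colors, so no proper 3-coloring.
tetrahedron = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

print(coloring_weight(3, triangle, s))     # 1/6
print(coloring_weight(4, tetrahedron, s))  # 0
```

In this toy setting a face pattern that cannot be properly 3-colored receives total weight zero, which mirrors the suppression of the corresponding graphs.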
In other words, there cannot appear any Feynman graphs such that the associated surface cannot be colored by fewer than 4 colors in this way. Thus, by choosing the interaction $V$ in the way described above, we have, in effect, suppressed certain Feynman graphs that would be present in scalar $\phi^n$-theory.

7. Summary and Comparison to Other Approaches

In this paper, we have constructed perturbatively the gauge invariant interacting quantum field operators for scalar field theory in the adjoint representation of $U(N)$, with an arbitrary gauge invariant interaction. These operators are members of an abstract algebra, whose structure constants, as we have demonstrated, have a well-defined limit as $N \to \infty$, or, more properly, are power series in $1/N$, with positive
powers (provided the coupling parameters are also rescaled in a specific way by suitable powers of $1/N$). In this sense, these algebras, and the interacting quantum fields that are the elements of this algebra, also possess a well-defined large $N$ limit. We showed that the renormalization group flow can be defined on the algebraic level via a 1-parameter family of isomorphisms acting on the fields via a rescaling of the spacetime arguments, the field strength, and an appropriate change in the coupling parameters. That flow was shown to be a power series in $1/N$ with positive powers, and hence has a large $N$ limit. We also presented similar results in the case when the interaction of the fields is not invariant under $U(N)$, but only invariant under certain diagonal subgroups. We did not address issues related to the convergence of the perturbation expansion or the expansion in $1/N$. Our motivation for investigating the formulation of the $1/N$-expansion in an algebraic framework rather than via Green’s functions of the vacuum state, as is conventionally done, was that the algebraic formulation is completely local in nature and thereby bypasses potential infrared problems, which can occur in the usual formulations via Green’s functions in massless theories. Also, although we explicitly only worked in Minkowski space, we were strongly motivated by the fact that an algebraic approach is essential if one wants to formulate quantum field theory in a generic curved spacetime, where no preferred vacuum state exists. Actually, since our arguments are mostly of a combinatorial nature, we expect that the present algebraic formulation of the $1/N$ expansion can be carried over rather straightforwardly to curved space. The algebraic approach presented in this paper is rather different in appearance from the usual formulation via Green’s functions, so we would briefly like to explain the relationship between the two approaches.
In the conventional approach, one considers the $N$-dependence of the vacuum Green’s functions$^{aa}$ of gauge invariant interacting fields, $G_n = \omega_0(O_V \cdots O_V)$, associated with the interaction $V$. For example, for $O = \mathrm{Tr}\, \phi^2$, one finds that the corresponding connected Green’s function $G_n^{\rm conn}$ receives contributions of order $N^{2-2H}$ from Feynman graphs of genus $H$ (assuming that the couplings in $V$ are scaled by appropriate powers of $1/N$). Thus, the planar diagrams $H = 0$ make the leading contribution at large $N$, with $G_n^{\rm conn} \sim N^2$, independent of $n$. If one wants to reconstruct from the Green’s functions the Hilbert space of the theory and the interacting field observables as linear operators on that Hilbert space, one needs to consider not the connected Green’s functions, but the Wightman Green’s functions $G_n$ themselves, since the latter enter in the Wightman reconstruction argument. Writing the Wightman Green’s functions $G_n$ in terms of $G_n^{\rm conn}$ via the usual formulae, one immediately gets that $G_n \sim N^{2n}$. Hence, it is clear that, if one wants these constructions to be well defined at infinite $N$, then one needs to
aa One normally considers time-ordered Green’s functions, but the arguments do not depend on the time ordering and therefore also equally apply to the Wightman functions, which we prefer to consider here.
consider the normalized fields $N^{-2}\, \mathrm{Tr}\, \phi^2$. Similar remarks also apply to more general composite fields, with appropriate powers of $1/N$ in the normalization factor, depending on the number of traces and the number of basic fields. These powers coincide precisely with the powers found in our algebraic approach (see Eq. (114)) by different means. The arguments that we have just given are of course only formal, because the reconstruction theorem, as it stands, is not really applicable in perturbation theory. Also, as we have already emphasized several times, the $G_n$ may actually be ill-defined because they may involve infrared divergent integrations over interaction vertices (in massless theories). The methods of this paper, on the other hand, give a rigorous construction of the field theory at the algebraic level that, by contrast to the formulation via Green’s functions, should also be applicable in curved spacetimes. It is a trivial consequence of the large $N$-behavior of the connected Green’s functions that, in the large $N$ limit, the Wightman Green’s functions $\hat{G}_n$ of the suitably normalized field operators factorize into 1-point functions, $\hat{G}_n(x_1, \dots, x_n) \sim \hat{G}_1(x_1) \cdots \hat{G}_1(x_n)$. Thus, it is formally clear that the large $N$ theory is abelian, i.e. the field commutators vanish. This can be seen explicitly in our algebraic framework, since the commutator of any two (suitably normalized) interacting fields is seen to be of order $N^{-2}$. Thus, the algebra $\mathcal{A}_V$ of interacting fields is abelian in the large $N$ limit, and consequently the representations are degenerate. This behavior can be nicely formalized in the algebraic framework by viewing $\mathcal{A}_V$ as a Poisson algebra,$^{bb}$ with antisymmetric bracket defined by $\{ \, . \, , \, . \, \} = N^2 [ \, . \, , \, . \, ]$. In the limit of large $N$, that Poisson algebra becomes abelian, as is also the case for the classical limit$^{cc}$ $\hbar \to 0$ (the appropriate definition of the Poisson bracket in that case being $\{ \, . \, , \, . \, \} = (i\hbar)^{-1} [ \, . \, , \, . \, ]$).
Thus, it is seen clearly at the algebraic level that there exist formal similarities between the large $N$ limit and the classical limit, and that, in particular, the large $N$ limit of a field theory does not define a quantum field theory in the usual sense, but rather a Poisson algebra. We note that the large $N$ limit can thereby be interpreted, within our algebraic framework, as some kind of “deformation quantization” [1], the deformation parameter being $1/N^2$.
Acknowledgments

I would like to thank K. H. Rehren for useful conversations. This work was supported by NSF-grant PH00-90138 to the University of Chicago.

^bb A Poisson algebra is an algebra $A$ together with an antisymmetric bracket $\{\cdot,\cdot\}$ from $A \times A$ to $A$ satisfying the Leibniz rule $\{ab, c\} = a\{b, c\} + \{a, c\}b$, together with the Jacobi identity. A Poisson algebra is called abelian if $A$ is abelian. The observables of a classical field theory form an abelian Poisson algebra, with the (commutative) algebra multiplication given by pointwise multiplication of the observables, and with the Poisson bracket given in terms of the symplectic structure of the theory. A trivial example of a nonabelian Poisson algebra is any non-commutative algebra with the Poisson bracket defined by the algebra commutator.

^cc The classical limit and the large $N$ limit are not, of course, equivalent.
References

[1] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization, Ann. Phys. (NY) 111 (1978) 61.
[2] R. Brunetti, K. Fredenhagen and M. Köhler, The microlocal spectrum condition and Wick polynomials on curved spacetimes, Commun. Math. Phys. 180 (1996) 633–652.
[3] R. Brunetti and K. Fredenhagen, Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds, Commun. Math. Phys. 208 (2000) 623–661.
[4] M. Dütsch and K. Fredenhagen, Algebraic quantum field theory, perturbation theory, and the loop expansion, Commun. Math. Phys. 219 (2001) 5, arXiv:hep-th/0001129; Perturbative algebraic field theory, and deformation quantization, to appear in Fields Inst. Commun., arXiv:hep-th/0101079.
[5] M. Dütsch and K. Fredenhagen, The master Ward identity and generalized Schwinger–Dyson equation in classical field theory, Commun. Math. Phys. 243 (2003) 275, arXiv:hep-th/0211242.
[6] M. Dütsch and K. Fredenhagen, Causal perturbation theory in terms of retarded products, and a proof of the action Ward identity, arXiv:hep-th/0403213.
[7] M. Dütsch, T. Hurth and G. Scharf, Causal construction of Yang–Mills theories. 4. Unitarity, Nuovo Cim. A108 (1995) 737. See also references contained in this paper.
[8] R. Haag, On quantum field theories, Dan. Mat. Fys. Medd. 29(12) (1955) 13; reprinted in Dispersion Relations and the Abstract Approach to Field Theory, ed. L. Klein (Gordon & Breach, NY, 1961).
[9] L. Hörmander, The Analysis of Linear Partial Differential Operators I, 2nd edn. (Springer-Verlag, 1990).
[10] S. Hollands and R. M. Wald, On the renormalization group in curved spacetime, Commun. Math. Phys. 237 (2003) 123–160, arXiv:gr-qc/0209029.
[11] S. Hollands and R. M. Wald, Local Wick polynomials and time-ordered products of quantum fields in curved spacetime, Commun. Math. Phys. 223 (2001) 289, arXiv:gr-qc/0103074.
[12] S. Hollands and R. M. Wald, Existence of local covariant time-ordered products of quantum fields in curved spacetime, Commun. Math. Phys. 231 (2002) 309, arXiv:gr-qc/0111108.
[13] S. Hollands and R. M. Wald, Time ordered products of fields with derivatives, in preparation.
[14] S. Hollands and W. Ruan, The state space of perturbative quantum field theory in curved space-times, Ann. Inst. H. Poincaré 3 (2002) 635, arXiv:gr-qc/0108032.
[15] G. 't Hooft, A planar diagram theory for strong interactions, Nucl. Phys. B72 (1974) 461.
[16] T. Hurth and K. Skenderis, Quantum Noether method, Nucl. Phys. B541 (1999) 566, arXiv:hep-th/9803030.
[17] G. Parisi, The theory of non-renormalizable interactions. The large N expansion, Nucl. Phys. B100 (1975) 368–388.
[18] M. Reed and B. Simon, Methods of Modern Mathematical Physics I (Academic Press, New York, 1973).
[19] G. Scharf, Finite Quantum Electrodynamics. The Causal Approach, 2nd edn. (Springer-Verlag, 1995).
July 14, 2004 17:56 WSPC/148-RMP
00208
Reviews in Mathematical Physics Vol. 16, No. 5 (2004) 559–582 c World Scientific Publishing Company
STRONG-COUPLING ASYMPTOTIC EXPANSION FOR SCHRÖDINGER OPERATORS WITH A SINGULAR INTERACTION SUPPORTED BY A CURVE IN $\mathbb{R}^3$

P. EXNER* and S. KONDEJ†

*Nuclear Physics Institute, Academy of Sciences, 25068 Řež near Prague, Czech Republic, and Doppler Institute, Czech Technical University, Břehová 7, 11519 Prague, Czech Republic
†Institute of Physics, University of Zielona Góra, ul. Szafrana 4a, 65246 Zielona Góra, Poland
*[email protected]
†[email protected]

Received 6 March 2003
Revised 21 April 2004
We investigate a class of generalized Schrödinger operators in $L^2(\mathbb{R}^3)$ with a singular interaction supported by a smooth curve $\Gamma$. We find a strong-coupling asymptotic expansion of the discrete spectrum in the case when $\Gamma$ is a loop or an infinite bent curve which is asymptotically straight. It is given in terms of an auxiliary one-dimensional Schrödinger operator with a potential determined by the curvature of $\Gamma$. In the same way, we obtain asymptotics of spectral bands for a periodic curve. In particular, the spectrum is shown to have open gaps in this case if $\Gamma$ is not a straight line and the singular interaction is strong enough.

Keywords: Schrödinger operators; singular interaction; strong-coupling asymptotics.
1. Introduction

The subjects of this paper are asymptotic spectral properties for several classes of generalized Schrödinger operators in $L^2(\mathbb{R}^3)$ with an attractive singular interaction supported by a smooth curve or a family of such curves. On a formal level, we can write such a Hamiltonian as
\[ -\Delta - \tilde\alpha\,\delta(x - \Gamma); \tag{1.1} \]
however, a proper way to define the operator corresponding to the formal expression is involved and will be explained in Sec. 2.2.^a A physical motivation for this model is to understand the electron behavior in "leaky" quantum wires, i.e. a model of these semiconductor structures which is realistic since it takes into account the fact that the electron, as a quantum particle capable of tunneling, can be found outside the wire (cf. [8] for a more detailed discussion).

One natural question is whether, in the case of a strong transverse coupling, the properties of such a "leaky" wire will approach those of an ideal wire of zero thickness, i.e. the model in which the particle is confined to $\Gamma$ alone, and how the geometry of the configuration manifold will manifest itself in this limit. In the two-dimensional case when $\Gamma$ is a planar curve, this problem was analyzed in [11, 12], where it was shown that apart from the divergent term which describes the energy of coupling to the curve, the spectrum coincides asymptotically with that of an auxiliary one-dimensional Schrödinger operator with a curvature-induced potential.^b

The case of a curve in $\mathbb{R}^3$ which we are going to discuss here is more complicated for several reasons. First of all, the codimension of $\Gamma$ is two in this situation, which means that to define the Hamiltonian we cannot use the natural quadratic form and have to employ generalized boundary conditions instead. Furthermore, while the strategy of [11, 12] based on bracketing bounds combined with the use of suitable curvilinear coordinates in the vicinity of $\Gamma$ can be applied again, the "straightening" transformation we have to employ is more involved here. Also the bound on the transverse part of the estimating operators is less elementary in this case.

Let us review briefly the contents of the paper. We begin by constructing a self-adjoint operator $H_{\alpha,\Gamma}$ which corresponds to the formal expression (1.1), where $\Gamma$ is a curve in $\mathbb{R}^3$; this will be done in Sec. 2.2. To this aim, we employ in the transverse plane to $\Gamma$ the usual boundary conditions defining a two-dimensional point interaction [2, Sec. I.5].

^a In particular, this is the reason why we use here a formal coupling constant different from the parameter $\alpha$ introduced in the condition (2.4) below.
Recall that the latter is known to have, for any $\alpha \in \mathbb{R}$, a single negative eigenvalue which equals $\xi_\alpha = -4e^{2(-2\pi\alpha + \psi(1))}$, where $-\psi(1) = 0.577\ldots$ is the Euler constant. The main topic of this paper is the spectral properties of $H_{\alpha,\Gamma}$ in the strong-coupling asymptotic regime, which means here that $-\alpha$ is large. The auxiliary operator mentioned above is given by
\[ S := -\Delta - \tfrac{1}{4}\kappa^2, \]
where $\Delta$ is the one-dimensional Laplace operator on the segment parameterizing $\Gamma$ and $\kappa$ is the curvature of $\Gamma$. Its discrete spectrum is non-empty unless $\Gamma$ is a straight line; we denote the $j$th eigenvalue as $\mu_j$. Our main results can then be characterized briefly as follows.

Discrete spectrum: if $\Gamma$ is a loop, we show in Sec. 3 that the $j$th eigenvalue $\lambda_j(\alpha)$ of $H_{\alpha,\Gamma}$ admits an asymptotic expansion of the following form
\[ \lambda_j(\alpha) = \xi_\alpha + \mu_j + O(e^{\pi\alpha}) \quad \text{as } \alpha \to -\infty, \]
and the counting function $\alpha \mapsto \sharp\sigma_d(H_{\alpha,\Gamma})$ satisfies in this limit the relation
\[ \sharp\sigma_d(H_{\alpha,\Gamma}) = \frac{L}{\pi}(-\xi_\alpha)^{1/2}\,(1 + O(e^{\pi\alpha})). \]
In addition, the last formula does not require $\Gamma$ to be a closed curve, as we shall show in Sec. 3.5. Moreover, if $\Gamma$ is infinite with $\kappa \neq 0$ and at the same time asymptotically straight in an appropriate sense, then the above expansion for $\lambda_j(\alpha)$ holds again (cf. Sec. 4).

Periodic curves are discussed in Sec. 5; we perform Bloch decomposition and use the same technique as above to estimate the discrete spectrum of the fiber operators. In particular, we find that if $\Gamma$ is a periodic curve and $\kappa(\cdot)$ is non-constant, then $\sigma(H_{\alpha,\Gamma})$ contains open gaps for $-\alpha$ sufficiently large. In the closing section, we will show that the problem can be rephrased in terms of a semiclassical approximation and list some open problems.

^b A similar analysis was performed in [6] for smooth surfaces in $\mathbb{R}^3$, where the asymptotic form of the spectrum is given by a suitable "two-dimensional" operator supported by the surface $\Gamma$.
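To get a feeling for the scales involved, the following minimal sketch evaluates the point-interaction eigenvalue $\xi_\alpha$ and the leading term $(L/\pi)(-\xi_\alpha)^{1/2}$ of the counting function. The function names and the sample values of $\alpha$ and $L$ are illustrative assumptions, not taken from the paper; only the two formulae quoted above are used.

```python
import math

# -psi(1) is the Euler constant quoted in the text (0.577...).
EULER_GAMMA = 0.5772156649015329


def xi(alpha):
    """Eigenvalue xi_alpha = -4 exp(2(-2*pi*alpha + psi(1))) of the 2D point interaction."""
    return -4.0 * math.exp(2.0 * (-2.0 * math.pi * alpha - EULER_GAMMA))


def counting_leading_term(alpha, length):
    """Leading term (L/pi) * (-xi_alpha)^(1/2) of the eigenvalue counting function."""
    return length / math.pi * math.sqrt(-xi(alpha))


# Illustrative values only: a loop of length 2*pi and increasingly strong coupling.
for alpha in (0.0, -0.5, -1.0):
    print(alpha, xi(alpha), counting_leading_term(alpha, 2.0 * math.pi))
```

The divergent term $\xi_\alpha$ grows like $e^{-4\pi\alpha}$ as $\alpha \to -\infty$, so in this sketch each unit step in $-\alpha$ multiplies $|\xi_\alpha|$ by $e^{4\pi}$ and the leading count by $e^{2\pi}$.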
2. Hamiltonians with Curve-Supported Perturbations

2.1. The curve geometry

Let $\Gamma$ be a curve in $\mathbb{R}^3$ (either infinite or a closed loop) which is assumed to be $C^k$, $k \geq 4$. Without loss of generality, we may assume that it is parametrized by its arc length, i.e. identify $\Gamma$ with the graph of a function $\gamma: I \to \mathbb{R}^3$, where $I = [0, L]$ (with the periodic boundary conditions, $\gamma(0) = \gamma(L)$ and the same for the derivatives) if $\Gamma$ is finite and $I = \mathbb{R}$ otherwise. One of our tools will be a parametrization of some neighborhoods of $\Gamma$. To describe it, let us suppose first that the curve possesses a global Frenet frame, i.e. the triple $(t(s), b(s), n(s))$ of tangent, binormal and normal vectors which are by assumption $C^{k-2}$ smooth functions of $s \in I$; recall that this is true if the second derivative of $\Gamma$ vanishes nowhere. The mentioned neighborhoods are open tubes of a fixed radius centered at $\Gamma$: given $d > 0$, we call $\Omega_d := \{x \in \mathbb{R}^3 : \mathrm{dist}(x, \Gamma) < d\}$. We will impose another restriction on the class of curves, excluding those with self-intersections and "near-intersections", i.e. we suppose that

(aΓ1) there exists $d > 0$ such that the tube $\Omega_d$ does not intersect itself.

Our aim is to describe $\Omega_d$ by means of curvilinear coordinates, i.e. to write it as the image of a straight cylinder, with the cross-section $B_d := \{r \in [0, d),\ \theta \in [0, 2\pi)\}$, by a suitable map. If the Frenet frame exists, we choose the latter as $\phi_d : D_d \to \mathbb{R}^3$ defined by
\[ \phi_d(s, r, \theta) = \gamma(s) - r\,[\,n(s)\cos(\theta - \beta(s)) + b(s)\sin(\theta - \beta(s))\,], \tag{2.1} \]
where $D_d := I \times B_d$ and the function $\beta$ will be specified further. For convenience, we will denote the curvilinear coordinates $(s, r, \theta)$ also as $q$ with the coordinate indices $(1, 2, 3) \leftrightarrow (s, r, \theta)$, and moreover, since it can hardly lead to confusion, we use the same notation $\phi_d$ for the mappings with target spaces $\mathbb{R}^3$ and $\Omega_d$ which we will need later.
The geometry of $\Omega_d$ is naturally described in terms of its metric tensor $(g_{ij})$; the latter is according to [5] expressed by means of the curvature $\kappa$ and torsion $\tau$ of $\Gamma$ in the following way
\[ g_{ij} = \begin{pmatrix} h^2 + r^2\varsigma^2 & 0 & r^2\varsigma \\ 0 & 1 & 0 \\ r^2\varsigma & 0 & r^2 \end{pmatrix}, \]
where
\[ \varsigma := \tau - \beta_{,s} \quad \text{and} \quad h := 1 + r\kappa\cos(\theta - \beta). \tag{2.2} \]
We use here the standard conventions $\beta_{,s} \equiv \partial_s\beta$ and $g^{ij} \equiv (g_{ij})^{-1}$. In particular, the volume element of $\Omega_d$ is given by $d\Omega = g^{1/2}\,dq$ where $g := \det(g_{ij})$. The simplest situation occurs if we choose
\[ \beta_{,s} = \tau, \tag{2.3} \]
because then the tensor $g_{ij}$ takes the diagonal form $g_{ij} = \mathrm{diag}(h^2, 1, r^2)$.

Remarks 2.1. (1) It is well-known that compact manifolds in $\mathbb{R}^n$ have the tubular neighborhood property. Thus if $\Gamma$ is a finite $C^4$ curve, then the assumption (aΓ1) is satisfied if and only if $\Gamma$ has no self-intersections.
(2) Combining the explicit formula for $g_{ij}$ with the inverse function theorem, it is easy to see that the inequality $d\|\kappa\|_\infty < 1$ is sufficient for $\phi_d$ to be locally diffeomorphic.

The special rotating system described above is usually called, in the theory of waveguides, the Tang system of coordinates. If $I$ is finite, the functions $h, \dot h, \ddot h$ are bounded by assumption, while in the case when $I = \mathbb{R}$, the global boundedness has to be assumed. The main problem, however, is that the described construction may fail if the Frenet frame is not uniquely defined. Hence we suppose in general that

(aΓ2) for all $d > 0$ small enough, there is a diffeomorphism $\phi_d : D_d \to \Omega_d$ such that the corresponding metric tensor is $g_{ij} = \mathrm{diag}(h^2, 1, r^2)$, where $h$ is given by (2.2) with $\beta$ which is locally bounded, $C^{k-2}$ smooth with a possible exception of a nowhere dense subset of $I$, and $h$ together with its first two derivatives are bounded.

While it represents a nontrivial restriction, this hypothesis can nevertheless be satisfied for a wide class of curves without a global Frenet frame.

Example 2.2. Suppose that the curve parameter interval $I$ can be covered by an at most countable union $\bigcup_{j \in J} I_j$ of intervals $I_j \equiv [a_j, b_j]$ such that a pair of different $I_j$, $I_k$ has in common at most one endpoint, and furthermore, either the Frenet frame exists in $(a_j, b_j)$ or $\Gamma_{,ss} = 0$ in $[a_j, b_j]$. In the former case we assume also that the limits of $n(s)$ as $s$ approaches $a_j$ and $b_j$ exist. We claim that in such a case a diffeomorphism with a diagonal $g_{ij}$ can be constructed.
Let us describe first its building blocks. If the Frenet frame exists in $(a_j, b_j)$, we construct a map $\phi_d^{(j)}$ on the appropriate part of the curve by (2.1) with $\beta$ replaced by a function $\beta_j$ satisfying the condition (2.3). On the other hand, in the straight parts we construct $\phi_d^{(j)}$ similarly, choosing a constant for $\beta_j$ and, for $b, n$, an arbitrary pair of unit vectors forming an orthogonal system with the tangent and having a proper orientation.

The maps $\phi_d^{(j)}$ can be patched into a global diffeomorphism by choosing the $\beta_j$'s properly. Suppose first that the family $\{a_j\}$ of left endpoints has no accumulation points in the interior of $I$. In that case, we may identify without loss of generality the index set $J$ with a segment of $\mathbb{Z}$, i.e. $j = -N, -N+1, \ldots, M$ for some $N, M \in \mathbb{N}_0 \cup \{\infty\}$, and suppose that the interval family is ordered, $a_{n+1} = b_n$. Assume now that such a diffeomorphic map exists on $\bigcup_{j=-n}^{m} I_j$. The left and right limits of the base vector systems at the points $a_{-n} = b_{-n-1}$ and $b_m = a_{m+1}$ exist by assumption and differ at most by a rotation in the normal planes to $\Gamma$ at these points, so the map can be extended to a diffeomorphism on $\bigcup_{j=-n-1}^{m+1} I_j$ by adjusting $\beta_{-n-1}$ and $\beta_{m+1}$; notice that the condition (2.3) remains valid on $(a_j, b_j)$ if the function $\beta_j$ is shifted by a constant. The sought conclusion then follows easily by induction.

On the other hand, let the set $\{a_j\}$ have accumulation points in the interior of $I$; we call them $\{c_k\}$, ordering them into an increasing sequence, finite or countable. The above construction defines a diffeomorphism in each interval $\tilde I_k := (c_k, c_{k+1})$ between adjacent points. Then the argument can be repeated, the role of $I_j$ now being played by the intervals $\tilde I_k$. Having constructed such a $\phi_d : D_d \to \Omega_d$, one can check directly whether the global boundedness conditions of the assumption (aΓ2) are satisfied.
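The diagonal form of the metric in the Tang coordinates can be checked numerically. The sketch below does this for a circular helix, which is an assumption made purely for illustration (the helix parameters, the function names and the finite-difference step are not from the paper). It builds the map (2.1) with $\beta(s) = \tau s$, so that (2.3) holds, computes $g_{ij}$ from a finite-difference Jacobian, and compares it with $\mathrm{diag}(h^2, 1, r^2)$.

```python
import math

# Illustrative helix gamma(s) = (A cos(s/C), A sin(s/C), B s/C), parametrized by arc length.
A, B = 1.0, 0.5
C = math.hypot(A, B)
KAPPA, TAU = A / C**2, B / C**2  # constant curvature and torsion of this helix


def phi(s, r, theta):
    """Map (2.1) for the helix, with beta(s) = TAU * s so that beta' = tau."""
    u = s / C
    gamma = (A * math.cos(u), A * math.sin(u), B * u)
    normal = (-math.cos(u), -math.sin(u), 0.0)
    binormal = (B * math.sin(u) / C, -B * math.cos(u) / C, A / C)
    angle = theta - TAU * s
    return tuple(
        g - r * (n * math.cos(angle) + b * math.sin(angle))
        for g, n, b in zip(gamma, normal, binormal)
    )


def metric(s, r, theta, eps=1e-5):
    """Metric g_ij = (d_i phi) . (d_j phi) from central finite differences."""
    q = (s, r, theta)
    columns = []
    for i in range(3):
        plus, minus = list(q), list(q)
        plus[i] += eps
        minus[i] -= eps
        fp, fm = phi(*plus), phi(*minus)
        columns.append([(a - b) / (2.0 * eps) for a, b in zip(fp, fm)])
    return [[sum(x * y for x, y in zip(columns[i], columns[j])) for j in range(3)]
            for i in range(3)]


def expected_metric(s, r, theta):
    """diag(h^2, 1, r^2) with h = 1 + r*kappa*cos(theta - beta), cf. (2.2)."""
    h = 1.0 + r * KAPPA * math.cos(theta - TAU * s)
    return [[h * h, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, r * r]]


print(metric(0.7, 0.1, 1.3))
print(expected_metric(0.7, 0.1, 1.3))
```

Replacing `TAU * s` by a constant in `phi` should make the off-diagonal entry $r^2\varsigma$ reappear, which is one way to see why the rotation by $\beta$ is needed for curves with torsion.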
2.2. Singularly-perturbed Schrödinger operators

The Hamiltonians we want to study are Schrödinger operators with $s$-independent perturbations supported by the curve $\Gamma$. Such operators can be understood as the Laplacian with specific boundary conditions on $\Gamma$, and the aim of this section is to make these conditions precise.

Let us assume that for a number $d$, the map $\phi_d$ satisfies conditions (aΓ1, 2). Given $\rho \in (0, d)$ and $\theta_0 \in [0, 2\pi)$, denote by $\Gamma_{\rho,\theta_0}$ the "shifted" curve located at the distance $\rho$ from $\Gamma$ which is defined as the $\phi_d$ image of the set $I \times \{\rho, \theta_0\} \subset D_d$; recall that the global diffeomorphism $\phi_d$ exists by assumption (aΓ2). Consider the Sobolev space $W^{2,2}_{\mathrm{loc}}(\Lambda \setminus \Gamma)$, where $\Lambda$ is an open bounded or unbounded set in $\mathbb{R}^3$ such that $\Omega_d \subseteq \Lambda$; in particular, $\Lambda$ may coincide with the whole $\mathbb{R}^3$. Since its elements are continuous on $\Lambda$ away from $\Gamma$, the restriction of a function $f \in W^{2,2}_{\mathrm{loc}}(\Lambda \setminus \Gamma)$ to the "shifted" curve located sufficiently close to $\Gamma$ is well defined; we will denote it as $f\restriction_{\Gamma_{\rho,\theta_0}}(\cdot)$. In fact, we can regard $f\restriction_{\Gamma_{\rho,\theta_0}}$ as a distribution from $\mathcal{D}'(0, L)$ parameterized by the distance $\rho$ and the angle $\theta_0$. We shall say that a function $f \in W^{2,2}_{\mathrm{loc}}(\Lambda \setminus \Gamma) \cap L^2(\Lambda)$ belongs to $\Upsilon_\Lambda$ if the following limits
\[ \Xi(f)(s) := -\lim_{\rho \to 0} \frac{1}{\ln\rho}\, f\restriction_{\Gamma_{\rho,\theta_0}}(s), \qquad \Omega(f)(s) := \lim_{\rho \to 0} \big[\, f\restriction_{\Gamma_{\rho,\theta_0}}(s) + \Xi(f)(s)\ln\rho \,\big] \]
exist a.e. in $[0, L]$, are independent of $\theta_0$, and define a pair of functions belonging to $L^2(0, L)$; for an infinite curve, $[0, L]$ is replaced by $\mathbb{R}$. We should also stress here that the elements of $W^{2,2}_{\mathrm{loc}}(\Lambda \setminus \Gamma)$ are in fact distributions from $\mathcal{D}'(\mathbb{R}^3)$; however, in the definition of $\Upsilon_\Lambda$, we can naturally identify them with their canonical imbeddings into $L^2(\Lambda)$. Given a function $f \in \Upsilon_\Lambda$, we write $f \sim \alpha\cdot\mathrm{bc}(\Gamma)$ if the limits $\Xi(f)(\cdot)$, $\Omega(f)(\cdot)$ characterizing the behavior of $f$ close to $\Gamma$ satisfy the following relation
\[ 2\pi\alpha\,\Xi(f)(s) = \Omega(f)(s). \tag{2.4} \]
With these prerequisites, we can define the singularly-perturbed Schrödinger operator in question through the set $D(H_{\alpha,\Gamma}) = \{f \in \Upsilon_{\mathbb{R}^3} : f \sim \alpha\cdot\mathrm{bc}(\Gamma)\}$, on which the operator $H_{\alpha,\Gamma} : D(H_{\alpha,\Gamma}) \to L^2(\mathbb{R}^3)$ acts as
\[ H_{\alpha,\Gamma} f(x) = -\Delta f(x), \qquad x \in \mathbb{R}^3 \setminus \Gamma. \tag{2.5} \]
To show that $H_{\alpha,\Gamma}$ makes sense as a quantum mechanical Hamiltonian, we will assume here that $\Gamma$ is finite or infinite periodic. Another interesting case, that of an infinite non-periodic curve which is asymptotically straight, needs additional assumptions and will be discussed separately in Sec. 4.

Theorem 2.3. Under the stated assumptions, $H_{\alpha,\Gamma}$ is self-adjoint.

Proof. One checks, using integration by parts and passing to the curvilinear system of coordinates $q = (s, r, \theta)$ in a sufficiently small tubular neighborhood of $\Gamma$, that the boundary form
\[ \upsilon(f, g) = (H_{\alpha,\Gamma} f, g) - (f, H_{\alpha,\Gamma} g) \]
vanishes for all $f, g \in D(H_{\alpha,\Gamma})$, i.e. the operator $H_{\alpha,\Gamma}$ is symmetric. To check its self-adjointness, we can proceed in analogy with [9, Theorem 4.1]. Repeating the argument presented there step by step, we derive the resolvent of $H_{\alpha,\Gamma}$, and the sought result then follows from [14, Theorem 2.1]. An alternative way is to note that $H_{\alpha,\Gamma}$ is one of the self-adjoint extensions discussed in [13]. It is true that stronger smoothness conditions for $\Gamma$ were adopted in that paper; however, the results remain valid for the $C^4$ class.

The operator $H_{\alpha,\Gamma}$ will be a central object of our interest. It is natural to regard it as a Schrödinger operator with the singular perturbation supported by the curve $\Gamma$.
Remarks 2.4. (1) The choice of boundary conditions (2.4) which we used in the construction had a natural motivation. If $\Gamma$ is a line in $\mathbb{R}^3$, one can separate variables; in the cross plane, we then have the two-dimensional Laplace operator with a single-centre point interaction $-\Delta_{\alpha,\{0\}}$, which is a well studied object (cf. [2, Sec. I.5]). To define it, one considers for a function $f \in W^{2,2}_{\mathrm{loc}}(\mathbb{R}^2 \setminus \{0\}) \cap L^2(\mathbb{R}^2)$ the following limits
\[ \tilde\Xi(f) := -\lim_{r \to 0} \frac{1}{\ln r}\, f, \qquad \tilde\Omega(f) := \lim_{r \to 0} \big( f + \tilde\Xi(f)\ln r \big); \]
if they are finite and satisfy the relation
\[ 2\pi\alpha\,\tilde\Xi(f) = \tilde\Omega(f), \tag{2.6} \]
the function $f$ belongs to the domain of $-\Delta_{\alpha,\{0\}}$. Using the explicit form of its resolvent, it is easy to see that such an operator has for any $\alpha \in \mathbb{R}$ exactly one negative eigenvalue which is given by
\[ \xi_\alpha = -4e^{2(-2\pi\alpha + \psi(1))}, \qquad \psi(1) = -0.577\ldots. \tag{2.7} \]
Obviously, it coincides with the bottom of the essential spectrum of $H_{\alpha,\Gamma}$ for a straight $\Gamma$. We know from [9] that this property is preserved if $\Gamma$ is curved but asymptotically straight in a suitable sense; in that case the operator has a nonempty discrete spectrum (cf. Sec. 4). It is also clear from the relation (2.7) and the corresponding eigenfunction [2, Sec. I.5] that a strong coupling corresponds to large negative values of $\alpha$.
(2) For the sake of brevity, we use for the boundary condition (2.6) the abbreviation $f \sim \alpha\cdot\mathrm{bc}(0)$, in analogy with (2.4); later we employ similar self-explanatory symbols for other conditions, Dirichlet, Neumann, periodic, etc.

3. Strong Coupling Asymptotics for a Loop

In this section we will discuss in detail the strong-coupling asymptotic behavior of the discrete spectrum in the simplest case when $\Gamma$ is a finite closed curve satisfying the regularity assumptions stated above; by Remark 2.1(1), it means that $\Gamma$ is $C^4$ and does not intersect itself.

Remark 3.1. In the following considerations we will rely on the operator inequality $A \leq B$, where both operators, $A$ and $B$, are self-adjoint and bounded from below. To be precise, we are going to follow the definition from [15, Sec. XIII.15], i.e. $A \leq B$ if and only if
\[ q_A[f] \leq q_B[f], \qquad f \in Q(B) \subseteq Q(A), \]
where $q_A$, $q_B$ are the forms associated with $A$, $B$ having the form domains $Q(A)$, $Q(B)$, respectively.

Since $\Gamma$ is compact, it does not influence the essential spectrum of $H_{\alpha,\Gamma}$. This can be seen by writing explicitly the resolvent [14] and checking that it differs from the free one by a compact operator, in analogy with the argument used in [4] for $\mathrm{codim}\,\Gamma = 1$. However, there is a simpler way.

Proposition 3.2. With the stated assumptions, we have $\sigma_{\mathrm{ess}}(H_{\alpha,\Gamma}) = \sigma_{\mathrm{ess}}(-\Delta) = [0, \infty)$.

Proof. By Neumann bracketing, we can check that $\inf\sigma_{\mathrm{ess}}(H_{\alpha,\Gamma}) = 0$. Indeed, choose a ball $B$ such that $\Gamma$ is contained in its interior and call $H^{N\partial B}_{\alpha,\Gamma}$ the Laplace operator in $L^2(\mathbb{R}^3)$ with the same boundary condition on $\Gamma$ as $H_{\alpha,\Gamma}$ and the Neumann condition at $\partial B$. We have $H_{\alpha,\Gamma} \geq H^{N\partial B}_{\alpha,\Gamma}$, and the spectrum of the latter is the union of the spectra of its interior and exterior components. The spectrum of the former is discrete and that of the latter is the non-negative halfline, so the claim follows from the minimax principle. To show that every positive number belongs to $\sigma(H_{\alpha,\Gamma})$, it is sufficient to construct a suitable Weyl sequence; one can use a Weyl sequence for $-\Delta$ chosen in such a way that its elements have supports disjoint from $B$.

Let us turn to the main subject of this section. To describe how the discrete spectrum of $H_{\alpha,\Gamma}$ behaves asymptotically for $\alpha \to -\infty$, we employ the comparison operator defined by
\[ S = -\frac{d^2}{ds^2} - \frac{\kappa(s)^2}{4} : D(S) \to L^2(0, L), \tag{3.1} \]
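with the domain $D(S) = \{\phi \in W^{2,2}(0, L) : \phi \sim p\cdot\mathrm{bc}(0, L)\}$, i.e. determined by periodic boundary conditions, $\phi(0) = \phi(L)$, $\phi'(0) = \phi'(L)$. Furthermore, $\kappa(\cdot)$ is the curvature of $\Gamma$. It is worth stressing that $S$ acts in a different Hilbert space than $H_{\alpha,\Gamma}$. We denote by $\mu_j$ the $j$th eigenvalue of $S$. With this notation, our main result reads as follows.

Theorem 3.3. (a) To any fixed $n \in \mathbb{N}$, there exists an $\alpha(n) \in \mathbb{R}$ such that
\[ \sharp\sigma_d(H_{\alpha,\Gamma}) \geq n \quad \text{for } \alpha \leq \alpha(n). \]
The $j$th eigenvalue $\lambda_j(\alpha)$ of $H_{\alpha,\Gamma}$ admits an asymptotic expansion of the following form
\[ \lambda_j(\alpha) = \xi_\alpha + \mu_j + O(e^{\pi\alpha}) \quad \text{as } \alpha \to -\infty. \]
(b) The counting function $\alpha \mapsto \sharp\sigma_d(H_{\alpha,\Gamma})$ behaves asymptotically as
\[ \sharp\sigma_d(H_{\alpha,\Gamma}) = \frac{L}{\pi}(-\xi_\alpha)^{1/2}\,(1 + O(e^{\pi\alpha})). \]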
The proof of the theorem is divided into several steps which we will describe subsequently in the following sections. It is also worth stressing here that the error term $O(e^{\pi\alpha})$ is not uniform with respect to $j$; this will be clear from Lemma 3.5.
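As a worked special case (added for illustration, not part of the original argument), assume that $\Gamma$ is a circle of radius $R$. Then $L = 2\pi R$ and $\kappa \equiv 1/R$, so the comparison operator (3.1) has a constant potential and its periodic eigenfunctions are the exponentials $e^{i\ell s/R}$:

```latex
\[
S = -\frac{d^2}{ds^2} - \frac{1}{4R^2}, \qquad
S\, e^{i\ell s/R} = \Big( \frac{\ell^2}{R^2} - \frac{1}{4R^2} \Big) e^{i\ell s/R},
\quad \ell \in \mathbb{Z},
\]
\[
\mu_1 = -\frac{1}{4R^2}, \qquad
\mu_{2m} = \mu_{2m+1} = \frac{4m^2 - 1}{4R^2} \quad (m \geq 1).
\]
```

Under this assumption, the expansion of Theorem 3.3 for the lowest eigenvalue would read $\lambda_1(\alpha) = \xi_\alpha - \frac{1}{4R^2} + O(e^{\pi\alpha})$, and the growth $\mu_j \sim j^2/(4R^2)$ illustrates why the error term cannot be uniform in $j$.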
3.1. Dirichlet–Neumann bracketing

Our aim is to estimate the operator $H_{\alpha,\Gamma}$ in the negative part of its spectrum from both sides by means of suitable operators acting in a tubular neighborhood $\Omega_d$ of $\Gamma$, with $d$ sufficiently small to make the assumptions (aΓ1, 2) satisfied. The first step in obtaining the estimating operators is to impose additional Dirichlet and Neumann conditions at the boundary of $\Omega_d$. Thus let the operators $H^j_{\alpha,\Gamma}$, $j = D, N$, in $L^2(\Omega_d)$ act as the Laplacian with the domains given respectively by $D(H^j_{\alpha,\Gamma}) = \{f \in \Upsilon_{\Omega_d} : f \sim \alpha\cdot\mathrm{bc}(\Gamma),\ f \sim j\cdot\mathrm{bc}(\partial\Omega_d)\}$; it is straightforward to check that the operators $H^j_{\alpha,\Gamma}$ are self-adjoint. Now the well-known result [15, Sec. XIII.15] says that
\[ -\Delta^N_{\Sigma_d} \oplus H^N_{\alpha,\Gamma} \leq H_{\alpha,\Gamma} \leq -\Delta^D_{\Sigma_d} \oplus H^D_{\alpha,\Gamma}, \qquad \Sigma_d := \mathbb{R}^3 \setminus \Omega_d. \]
What is important is that the operators $-\Delta^j_{\Sigma_d}$ corresponding to the exterior of $\Omega_d$ do not contribute to the negative part of the spectrum because they are both positive by definition.

It is convenient to express the operators $H^j_{\alpha,\Gamma}$ in the curvilinear coordinates $q = (s, r, \theta)$; this can be done by means of the unitary transformation
\[ U f = f \circ \phi_d : L^2(\Omega_d) \to L^2(D_d, g^{1/2}\,dq), \qquad D_d = [0, L] \times B_d; \]
recall that the global diffeomorphism $\phi_d$ exists by assumption (aΓ2). Then the operators $\tilde H^j_{\alpha,\Gamma} := U H^j_{\alpha,\Gamma} U^{-1}$ act as
\[ f(x) \mapsto -(g^{-1/2}\,\partial_i\, g^{1/2} g^{ij}\,\partial_j f)(x) \quad \text{for } x \in \Omega_d \setminus \Gamma \]
with the domains $\{f \in \Upsilon_{\Omega_d} : f \sim \alpha\cdot\mathrm{bc}(\Gamma),\ f \sim j\cdot\mathrm{bc}(\omega_r(d)),\ f \sim p\cdot\mathrm{bc}(\omega_s(0), \omega_s(L))\}$, respectively, where we have introduced the notation
\[ \omega_{q_i}(t) := \{q \in D_d : q_i = t\}. \]
To simplify it further, we remove the weight $g^{1/2}$ appearing in the inner product of the space $L^2(D_d, g^{1/2}\,dq)$. This is done by means of another unitary map,
\[ \hat U : L^2(D_d, g^{1/2}\,dq) \to L^2(D_d, dq), \qquad \hat U f := g^{1/4} f; \]
the images of $\tilde H^j_{\alpha,\Gamma}$ will be denoted as $\hat H^j_{\alpha,\Gamma} = \hat U \tilde H^j_{\alpha,\Gamma} \hat U^{-1}$. The aim of these unitary transformations is to find a representation where the eigenvalues (which we need to estimate the eigenvalues of $H_{\alpha,\Gamma}$ by means of the minimax principle) are easy to analyze. A straightforward calculation analogous to that performed in [5] yields explicit formulae for $\hat H^j_{\alpha,\Gamma}$, $j = D, N$, which both act as^c
\[ -\partial_i\, g^{ij}\,\partial_j - \frac{1}{4} r^{-2} + V, \]
where $V$ is the effective potential given by
\[ V = g^{-1/4}\big(\partial_i\, g^{ij}\,(\partial_j g^{1/4})\big) + \frac{1}{4} r^{-2}, \tag{3.2} \]
while their domains are different,
\[ D(\hat H^D_{\alpha,\Gamma}) = \{f \in \Upsilon_{D_d} : g^{-1/4} f \sim \alpha\cdot\mathrm{bc}(\Gamma),\ f \sim p\cdot\mathrm{bc}(\omega_s(0), \omega_s(L)),\ f \sim D\cdot\mathrm{bc}(\omega_r(d))\}, \]
\[ D(\hat H^N_{\alpha,\Gamma}) = \{f \in \Upsilon_{D_d} : g^{-1/4} f \sim \alpha\cdot\mathrm{bc}(\Gamma),\ f \sim p\cdot\mathrm{bc}(\omega_s(0), \omega_s(L)),\ (\partial_r f)_{r=d} = -[(g^{1/4}\,\partial_r g^{-1/4}) f]_{r=d}\}. \]

^c We employ the usual convention that summation is performed over repeated indices, keeping in mind that $(g^{ij})$ is diagonal.

Remark 3.4. Notice that the boundary conditions satisfied by functions from $D(\hat H^j_{\alpha,\Gamma})$ on the curve $\Gamma$ can be written in a simpler way. Since only the leading term in $g^{-1/4}$ is important as $r \to 0$, they are equivalent to $r^{-1/2} f \sim \alpha\cdot\mathrm{bc}(\Gamma)$. Notice also that while the Dirichlet boundary condition at $\partial\Omega_d$ persists under the unitary transformation, the Neumann one is changed by $\hat U$ into a mixed boundary condition.

3.2. Estimates by operators with separated variables

While the operators $\hat H^j_{\alpha,\Gamma}$, $j = D, N$, give two-sided bounds for the negative eigenvalues of $H_{\alpha,\Gamma}$, they are not easy to handle. This is why we pass to a cruder, but still sufficient, estimate by operators with separated variables.

In the first step, we will make the boundary conditions in the lower bound independent of the coordinates. The boundary term involved in the definition of $D(\hat H^N_{\alpha,\Gamma})$ depends on $s$ and $\theta$. We replace the corresponding coefficient by $M := \|g^{1/4}\,\partial_r g^{-1/4}\|_{L^\infty(\omega_r(d))}$, passing thus to the operator
\[ \dot H^-_{\alpha,\Gamma} := -\Delta_h \otimes I + I \otimes (-\Delta^-_\alpha) + V \leq \hat H^N_{\alpha,\Gamma} \]
on $L^2(0, L) \otimes L^2(B_d)$, where $-\Delta_h := -\partial_s h^{-2} \partial_s : D(S) \to L^2(0, L)$ and
\[ -\Delta^-_\alpha := -\partial_r^2 - r^{-2}\partial_\theta^2 - \frac{1}{4} r^{-2} : D(\Delta^-_\alpha) \to L^2(B_d), \]
\[ D(\Delta^-_\alpha) := \{f \in W^{2,2}_{\mathrm{loc}}(B_d \setminus \{0\}) : \Delta^-_\alpha f \in L^2(B_d),\ r^{-1/2} f \sim \alpha\cdot\mathrm{bc}(0),\ (\partial_r f)|_{r=d} = M f|_{r=d}\}, \]
with the boundary condition at the centre of the circle written in the simplified form mentioned in Remark 3.4. The upper bound contains no boundary term depending on $s$ or $\theta$, so we can put
\[ \dot H^+_{\alpha,\Gamma} = \hat H^D_{\alpha,\Gamma} = -\Delta_h \otimes I + I \otimes (-\Delta^+_\alpha) + V, \]
where $-\Delta^+_\alpha$ acts in the same way as $-\Delta^-_\alpha$ but the above mixed boundary condition on $\partial B_d$ is replaced by the Dirichlet condition.

The next estimate concerns the effective potential $V$ given by (3.2); by a straightforward calculation [5], we can express it in terms of the curvature together with the function $h$ and its first two derivatives with respect to the variable $s$ as follows,
\[ V = -\frac{\kappa^2}{4h^2} + \frac{h_{,ss}}{2h^3} - \frac{5(h_{,s})^2}{4h^4}. \tag{3.3} \]
It is important that up to an $O(d)$ term, this expression coincides with the potential involved in the comparison operator $S$. Indeed, since $h$ is continuous on a compact set and thus bounded, by (2.2) there exists a positive $C_h$ such that the inequalities
\[ C_h^-(d) \leq h^{-2} \leq C_h^+(d) \quad \text{with} \quad C_h^\pm(d) := 1 \pm C_h d \]
hold for all $d$ small enough. Since $\Gamma$ is $C^4$ by assumption, the derivatives $h_{,s}$ and $h_{,ss}$ are also bounded; hence (3.3) yields the estimate
\[ \Big| V + \frac{\kappa^2}{4} \Big| \leq C_V d \]
with a positive $C_V$ valid on $D_d$ for all sufficiently small $d$. At the same time, we can apply the above bounds for $h^{-2}$ to the longitudinal part of the kinetic term. Putting all this together, we get
\[ L_d^- \otimes I \leq -\Delta_h \otimes I + V \leq L_d^+ \otimes I, \]
where
\[ L_d^\pm := -C_h^\pm(d)\,\frac{d^2}{ds^2} - \frac{\kappa^2}{4} \pm C_V d : D(S) \to L^2(0, L). \]
Summarizing the above discussion, we can introduce a pair of operators with the longitudinal and transverse components separated, namely
\[ B_\alpha^\pm := L_d^\pm \otimes I + I \otimes (-\Delta_\alpha^\pm) \quad \text{on } L^2(0, L) \otimes L^2(B_d), \tag{3.4} \]
which give the sought two-sided bounds, $\pm\dot H^\pm_{\alpha,\Gamma} \leq \pm B_\alpha^\pm$.
3.3. Component eigenvalues estimates

In the next step we have to estimate the eigenvalues of $L_d^\pm$ and $-\Delta_\alpha^\pm$. Let us start with the longitudinal part. It is easy to check the identity
\[ L_d^\pm = C_h^\pm(d)\, S \pm \Big( C_V + C_h \frac{\kappa^2}{4} \Big) d; \]
combining it with the minimax principle and the fact that the eigenvalues of $S$ behave as $\big(\frac{2\pi}{L}\big)^2 \ell^2 + O(1)$ as $\ell \to \pm\infty$, we arrive at the following conclusion.

Lemma 3.5. There is a positive $C$ such that the eigenvalues $l_j^\pm(d)$ of $L_d^\pm$, numbered in the ascending order, satisfy the inequalities
\[ |l_j^\pm(d) - \mu_j| \leq C j^2 d \tag{3.5} \]
for all $j \in \mathbb{N}$ and $d$ small enough.

The transverse part is a bit more involved. Our aim is to show that in the strong-coupling case, the influence of the boundary conditions is weak, i.e. the negative eigenvalues of the operators $-\Delta_\alpha^\pm$ do not differ much from the number (2.7).
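The passage from the identity for $L_d^\pm$ to the bound (3.5) can be spelled out as follows. This is only a sketch of the minimax argument; the constants $c_0$, $c_1$ are introduced here for illustration and are assumed to satisfy $|\mu_j| \leq c_0 + c_1 j^2$, which is consistent with the quoted growth of the eigenvalues of $S$.

```latex
\begin{align*}
L_d^\pm &= (1 \pm C_h d)\, S \pm \Big( C_V + C_h \frac{\kappa^2}{4} \Big) d, \\
\big| l_j^\pm(d) - (1 \pm C_h d)\,\mu_j \big|
  &\leq \Big( C_V + \frac{C_h}{4} \|\kappa\|_\infty^2 \Big) d
  && \text{(minimax principle, for } 1 - C_h d > 0\text{)}, \\
| l_j^\pm(d) - \mu_j |
  &\leq C_h d\, |\mu_j| + \Big( C_V + \frac{C_h}{4} \|\kappa\|_\infty^2 \Big) d \\
  &\leq \Big( C_h (c_0 + c_1) + C_V + \frac{C_h}{4} \|\kappa\|_\infty^2 \Big)\, j^2 d
  && \text{(using } |\mu_j| \leq (c_0 + c_1)\, j^2,\ j \geq 1\text{)}.
\end{align*}
```

The factor $j^2$ comes entirely from the term $C_h d\,|\mu_j|$, i.e. from rescaling the kinetic part; this is the source of the non-uniformity in $j$ mentioned after Theorem 3.3.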
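Lemma 3.6. There exist positive numbers $C_i$, $1 \leq i \leq 4$, such that each one of the operators $-\Delta_\alpha^\pm$ has exactly one negative eigenvalue $t_\alpha^\pm$ which satisfies
\[ \xi_\alpha - S(\alpha) < t_\alpha^- < \xi_\alpha < t_\alpha^+ < \xi_\alpha + S(\alpha) \tag{3.6} \]
for $\alpha$ large enough negative, where
\[ S(\alpha) := C_1 \zeta_\alpha^2 \sqrt{d\zeta_\alpha}\, \exp(-C_2 d\zeta_\alpha) \]
with $\zeta_\alpha := (-\xi_\alpha)^{1/2}$, provided $d\zeta_\alpha > C_3$ and $dM < C_4$.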
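The proof characterizes the Dirichlet eigenvalue $t_\alpha^+ = -k_\alpha^2$ through the equation $x = \zeta_\alpha\eta(x)$ with $\eta(x) = \exp(-K_0(xd)/I_0(xd))$. A minimal numerical sketch of that equation is given below. The quadrature-based Bessel routines, the bisection bracket, and the sample values of $\alpha$ and $d$ are assumptions made for illustration and are not part of the paper; the bracket presupposes $\frac{1}{2}e^{-\psi(1)}d\zeta_\alpha > 1$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # equals -psi(1)


def bessel_i0(x, n=200):
    """I0(x) = (1/pi) * int_0^pi exp(x cos t) dt, evaluated by the midpoint rule."""
    step = math.pi / n
    return sum(math.exp(x * math.cos((i + 0.5) * step)) for i in range(n)) / n


def bessel_k0(x, step=0.01, t_max=30.0):
    """K0(x) = int_0^inf exp(-x cosh t) dt for x > 0, evaluated by the trapezoidal rule."""
    total = 0.5 * math.exp(-x)
    for i in range(1, int(t_max / step) + 1):
        total += math.exp(-x * math.cosh(i * step))
    return total * step


def zeta(alpha):
    """zeta_alpha = (-xi_alpha)^(1/2) = 2 exp(-2*pi*alpha + psi(1))."""
    return 2.0 * math.exp(-2.0 * math.pi * alpha - EULER_GAMMA)


def eta(x, d):
    """eta(x) = exp(-K0(x d) / I0(x d))."""
    return math.exp(-bessel_k0(x * d) / bessel_i0(x * d))


def solve_k(alpha, d, iterations=100):
    """Bisection for x = zeta_alpha * eta(x) on the illustrative bracket (1e-3*zeta, zeta)."""
    z = zeta(alpha)
    lo, hi = 1e-3 * z, z
    if lo - z * eta(lo, d) >= 0.0:
        raise ValueError("bracket invalid: d * zeta_alpha is too small for this sketch")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid - z * eta(mid, d) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha, d = -0.3, 1.0  # illustrative values only
k = solve_k(alpha, d)
print("zeta_alpha =", zeta(alpha), " k_alpha =", k)
print("xi_alpha =", -zeta(alpha) ** 2, " t_plus =", -k * k)
```

For these sample values the root lies very close to $\zeta_\alpha$, which is the qualitative content of the lemma: the Dirichlet condition at $r = d$ shifts the eigenvalue upwards only by an amount that is exponentially small in $d\zeta_\alpha$.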
Proof. Let us start with the eigenvalue of the operator $-\Delta_\alpha^+$ involved in the upper bound; the argument will be divided into four parts.
Step 1. We will show that the number $-k_\alpha^2$ with $k_\alpha > 0$ is an eigenvalue of $-\Delta_\alpha^+$ if and only if $k_\alpha$ is a solution of the equation
$$x = \zeta_\alpha \eta(x), \tag{3.7}$$
where $\zeta_\alpha$ has been defined above and $\eta$ is the function given by
$$\eta : \mathbb{R}_+ \to \mathbb{R}_+, \qquad \eta(x) = \exp\left( -\frac{K_0(xd)}{I_0(xd)} \right); \tag{3.8}$$
the symbols $K_0$, $I_0$ denote the Macdonald and the modified Bessel function, respectively [1]. To verify this claim, we note that the eigenfunction $\varphi$ of $-\Delta_\alpha^+$ corresponding to $-k_\alpha^2$ is a linear combination
$$\varphi(r) = D_1 I_0(k_\alpha r) r^{1/2} + D_2 K_0(k_\alpha r) r^{1/2}$$
with the coefficients $D_1$, $D_2$ chosen in such a way that the conditions following from $\varphi \sim D\cdot bc(\partial B_d)$ and $r^{-1/2}\varphi \sim \alpha\cdot bc(0)$ are satisfied. Using the behavior of $K_0$, $I_0$ at the origin,
$$K_0(\rho) = -\ln\frac{\rho}{2} + \psi(1) + O(\rho) \quad \text{and} \quad I_0(\rho) = 1 + O(\rho) \tag{3.9}$$
as $\rho \to 0$, we can readily check that $\varphi$ fulfils the needed boundary conditions if and only if $(D_1, D_2) \in \ker M(\alpha)$, where $M(\alpha)$ is the matrix given by
$$M(\alpha) = \begin{pmatrix} I_0(k_\alpha d) & K_0(k_\alpha d) \\ 1 & \omega(\alpha, k_\alpha) \end{pmatrix}$$
with $\omega(\alpha, k_\alpha) := \psi(1) - 2\pi\alpha - \ln(k_\alpha/2)$. Of course, the condition $\ker M(\alpha) \ne \{0\}$ is equivalent to $\det M(\alpha) = 0$; the latter holds if and only if $k_\alpha$ is a solution of (3.7).
Step 2. Our next aim is to show that Eq. (3.7) has at least one solution for $-\alpha$ sufficiently large, and moreover, that such a solution $k_\alpha$ satisfies the inequalities
$$\tilde C \zeta_\alpha < k_\alpha < \zeta_\alpha \tag{3.10}$$
with $\tilde C \in (0,1)$ independent of $\alpha$. Using (3.9) again together with the asymptotic behavior of the functions $K_0$, $I_0$ at infinity, we get for a fixed $\alpha$
$$\zeta_\alpha \eta(x) \to \zeta_\alpha \quad \text{as } x \to \infty$$
and
$$\zeta_\alpha \eta(x) = g_{\alpha,d}\, x + O(x^2) \quad \text{as } x \to 0, \tag{3.11}$$
where $g_{\alpha,d} := \frac{1}{2} e^{-\psi(1)} d\zeta_\alpha$. It is clear that the error term is uniform with respect to $\alpha$ over finite intervals only; however, if
$$g_{\alpha,d} > 1, \tag{3.12}$$
then Eq. (3.7) obviously has at least one solution. The second inequality in (3.10) holds trivially because $\eta(x) < 1$ for any $x > 0$. Let us assume that the first one is violated. This means that there is a sequence $\{\alpha_n\}$ with $\alpha_n \to -\infty$ as $n \to \infty$ such that $\eta(k_{\alpha_n}) \to 0$ as $n \to \infty$. This may happen only if $k_{\alpha_n}$ tends to the singularity of $K_0$, in other words if $k_{\alpha_n} \to 0$ holds as $n \to \infty$. However, the inequality (3.12) is valid for $\alpha_n$ with $n$ large enough, thus in view of the asymptotics (3.11) a small $k_{\alpha_n}$ cannot be a solution of (3.7), in contradiction with the assumption.
Step 3. To show that there exists only one solution of (3.7), it suffices to check that the function
$$h_\alpha : \mathbb{R}_+ \to \mathbb{R}, \qquad h_\alpha(x) = x - \zeta_\alpha \eta(x),$$
is strictly monotone for $x \in (\tilde C\zeta_\alpha, \zeta_\alpha)$ and $-\alpha$ sufficiently large. Using again the behavior of $K_0$, $I_0$ at large values of the argument, we find that the derivative $\eta'(x) \to 0$ as $x \to \infty$, which implies the result.
Step 4. It remains to show that the eigenvalue $t_\alpha^+ = -k_\alpha^2$ satisfies the second one of the inequalities
$$\xi_\alpha < -k_\alpha^2 < \xi_\alpha + S(\alpha). \tag{3.13}$$
Since the functions $-K_0$, $I_0$ are increasing and $I_0(0) = 1$, we get from (3.10) the estimate
$$\eta(k_\alpha) \ge \exp(-K_0(\tilde C\zeta_\alpha d)).$$
Putting $\tilde S(\alpha) = \left( 1 - \exp(-2K_0(\tilde C\zeta_\alpha d)) \right) \zeta_\alpha^2$ and using the asymptotic behavior of $K_0$ at large distances, one finds that
$$\tilde S(\alpha) \le \tilde C_1 \zeta_\alpha^2 \sqrt{d\zeta_\alpha}\, \exp(-\tilde C_2 d\zeta_\alpha) \quad \text{as } \alpha \to -\infty$$
holds with suitable constants $\tilde C_1$, $\tilde C_2$ and the inequality (3.13) is satisfied, which concludes the proof for the operator $-\Delta_\alpha^+$.
Let us turn to the operator $-\Delta_\alpha^-$. The argument is similar, so we just sketch it with the emphasis on the differences. The number $t_\alpha^- = -k_\alpha^2$ is an eigenvalue of $-\Delta_\alpha^-$ if and only if $k_\alpha$ is a solution of the equation
$$x = \zeta_\alpha \tilde\eta(x), \tag{3.14}$$
where $\tilde\eta : \mathbb{R}_+ \to \mathbb{R}_+$ is the function given by
$$\tilde\eta(x) = \exp\left( -\frac{S_K(xd)}{S_I(xd)} \right), \qquad S_F(xd) = \tilde F_1(xd)\, xd + w_d F_0(xd)$$
for $F = K, I$, where $\tilde I_1 = I_1$, $\tilde K_1 = -K_1$ and $w_d := \frac{1}{2} - Md$; we assume that
$$w_d > 0. \tag{3.15}$$
To proceed further, we employ again the asymptotics of the functions $I_n$, $K_n$, $n = 0, 1$, for $x \to 0$ and at large values of the argument. It is easy to see that the behavior of $x \mapsto S_K(xd)/S_I(xd)$ for small $x$ is dominated by that of $K_0(\cdot)$. Thus, mimicking the second step of the above argument, we can show that Eq. (3.14) has at least one solution for $-\alpha$ sufficiently large provided that assumption (3.15) is satisfied. Repeating Step 3, we can check that the solution $k_\alpha$ is unique for $-\alpha$ sufficiently large. By reductio ad absurdum, as in the second step, we can also prove that there exists a constant $\hat C$ such that $\hat C\zeta_\alpha < k_\alpha$, which means that $k_\alpha \to \infty$ as $\alpha \to -\infty$. The constant $\hat C$ can be made more specific: using the fact that the term $-dx\, K_1(dx)$ dominates the behavior of $S_K(dx)$ for large $x$ and $S_I > 0$, we get $\tilde\eta > 1$ for $-\alpha$ sufficiently large, i.e. $\zeta_\alpha < k_\alpha$. Using properties of the special functions involved here, we also find that
$$\tilde\eta(k_\alpha) \le \exp(\dot C K_1(d\zeta_\alpha)\, d\zeta_\alpha)$$
holds for any $\dot C$ satisfying $(w_d)^{-1} + (d\zeta_\alpha)^{-1} < \dot C$. Thus proceeding similarly as in Step 4, we infer that there are constants $\breve C_1$, $\breve C_2$ such that
$$\xi_\alpha - \tilde S(\alpha) < -k_\alpha^2 < \xi_\alpha, \tag{3.16}$$
where
$$\tilde S(\alpha) \le \breve C_1 \zeta_\alpha^2 \sqrt{d\zeta_\alpha}\, \exp(-\breve C_2 d\zeta_\alpha) \quad \text{as } \alpha \to -\infty.$$
Finally, putting together (3.12), (3.13), (3.15) and (3.16), we get the claim with $C_1 := \max\{\tilde C_1, \breve C_1\}$ and $C_2 := \min\{\tilde C_2, \breve C_2\}$.
3.4. Proof of Theorem 3.3 for a loop
Suppose now that $\Gamma$ is a closed curve. The result will follow from a combination of the above estimates. We have to couple the width of the neighborhood $\Omega_d$ and the coupling constant $\alpha$ in such a way that $d$ shrinks properly to zero as $\alpha \to -\infty$. This is achieved, e.g. by choosing
$$d(\alpha) = e^{\pi\alpha}. \tag{3.17}$$
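The fixed-point equation (3.7) used in the proof of Lemma 3.6 is easy to explore numerically. The following is a minimal sketch, not part of the original argument: it evaluates $I_0$ by its power series and $K_0$ by the integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$, and then locates the root of $h_\alpha(x) = x - \zeta_\alpha\eta(x)$ on $(\tilde C\zeta_\alpha, \zeta_\alpha)$ by bisection. The function names, the value $\tilde C = 1/2$ and the sample parameters $\zeta_\alpha = 5$, $d = 1$ are illustrative assumptions only.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I0 from its power series sum_k (x^2/4)^k / (k!)^2."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x * x / 4.0) / (k * k)
        total += term
    return total


def bessel_k0(x, t_max=20.0, steps=2000):
    """Macdonald function K0 for x > 0 via K0(x) = int_0^inf exp(-x cosh t) dt.

    The integrand is even and analytic in t, so the plain trapezoid rule
    converges very quickly; the tail beyond t_max is negligible.
    """
    step = t_max / steps
    total = 0.5 * math.exp(-x)
    for n in range(1, steps + 1):
        total += math.exp(-x * math.cosh(n * step))
    return step * total


def eta(x, d):
    """The function eta of (3.8): exp(-K0(x d) / I0(x d))."""
    return math.exp(-bessel_k0(x * d) / bessel_i0(x * d))


def solve_k(zeta, d, c_tilde=0.5, iterations=80):
    """Bisection for the root of h(x) = x - zeta * eta(x) on (c_tilde*zeta, zeta).

    c_tilde is a hypothetical stand-in for the constant in (3.10).
    """
    def h(x):
        return x - zeta * eta(x, d)

    lo, hi = c_tilde * zeta, zeta
    if not (h(lo) < 0.0 < h(hi)):
        raise ValueError("no sign change on the bracket; d * zeta may be too small")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    zeta, d = 5.0, 1.0          # illustrative values with d * zeta moderately large
    k = solve_k(zeta, d)
    # The eigenvalue -k^2 lies slightly above xi = -zeta^2, as in (3.13).
    print("k =", k, " t_plus - xi =", zeta * zeta - k * k)
```

For these sample values the shift $t_\alpha^+ - \xi_\alpha$ is already small, in line with the exponential factor in $S(\alpha)$; with the choice (3.17) the product $d\zeta_\alpha$ grows as $\alpha \to -\infty$, so the shift quickly falls below double-precision resolution.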
Proof of Theorem 3.3(1). To find the asymptotic behavior of the eigenvalues $\lambda_j(\alpha)$ of $H_{\alpha,\Gamma}$, we will rely on the decomposition (3.4), according to which we know that the negative eigenvalues of $H_{\alpha,\Gamma}$ are squeezed between $l_j^\pm(d) + t_\alpha^\pm$. Since the operators
$-\Delta_\alpha^\pm$ have a single negative eigenvalue, the sought values $\lambda_j(\alpha)$ are ordered in the same way as the $l_j^\pm(d)$ are. Combining (3.17) with the results of Lemmas 3.5 and 3.6, we get for the upper and lower bound
$$l_j^\pm(d(\alpha)) + t_\alpha^\pm = \xi_\alpha + \mu_j + O(e^{\pi\alpha}) \quad \text{as } \alpha \to -\infty,$$
and of course, the same asymptotics holds for $\lambda_j(\alpha)$. Clearly, to a given integer $n$, there exists $\alpha(n) \in \mathbb{R}$ such that $l_n^+(d(\alpha)) + t_\alpha^+ < 0$ is true for all $\alpha \le \alpha(n)$; this completes the proof of (1).
Proof of Theorem 3.3(2). Using the above asymptotic estimates and Lemma 3.5, we get
$$\nu_j^-(\alpha) \le \lambda_j(\alpha) \le \nu_j^+(\alpha), \tag{3.18}$$
where
$$\nu_j^\pm(\alpha) := \xi_\alpha + j^2 \left( \frac{2\pi}{L} \right)^2 + O(e^{\pi\alpha}) \pm v$$
and $v = 4^{-1}\|\kappa^2\|_\infty$. Combining this with the minimax principle, we arrive at the two-sided estimate
$$\sharp\{ j \in \mathbb{Z} : \nu_j^+(\alpha) < 0 \} \le \sharp\sigma_d(H_\alpha) \le \sharp\{ j \in \mathbb{Z} : \nu_j^-(\alpha) < 0 \},$$
which implies
$$\sharp\sigma_d(H_\alpha) = \frac{L}{\pi} (-\xi_\alpha)^{1/2} (1 + O(e^{\pi\alpha})).$$
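The counting argument can be illustrated with a few lines of code. The sketch below drops the $O(e^{\pi\alpha})$ term, counts the integers $j$ for which $\xi_\alpha + j^2(2\pi/L)^2 \pm v < 0$, and compares both counts with the leading term $(L/\pi)(-\xi_\alpha)^{1/2}$. The function names and the sample values of $\xi_\alpha$, $L$ and $v$ are assumptions chosen for illustration.

```python
import math


def count_negative(xi, length, shift):
    """Number of integers j with xi + j^2 (2 pi / length)^2 + shift < 0."""
    bound = -(xi + shift)
    if bound <= 0.0:
        return 0
    j_max = int(math.sqrt(bound) * length / (2.0 * math.pi)) + 1
    return sum(
        1
        for j in range(-j_max, j_max + 1)
        if xi + (2.0 * math.pi * j / length) ** 2 + shift < 0.0
    )


def leading_term(xi, length):
    """Leading term (L / pi) * sqrt(-xi) of the counting function."""
    return length / math.pi * math.sqrt(-xi)


if __name__ == "__main__":
    length, v = 2.0 * math.pi, 0.25   # illustrative loop length and curvature bound
    for xi in (-100.5, -10000.5):
        lower = count_negative(xi, length, +v)   # count based on nu_j^+
        upper = count_negative(xi, length, -v)   # count based on nu_j^-
        print(xi, lower, upper, leading_term(xi, length))
```

Since the shift $\pm v$ is fixed while $-\xi_\alpha$ grows, the two counts and the leading term agree up to a relative error that vanishes as $\xi_\alpha \to -\infty$.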
3.5. A curve with free ends
Theorem 3.3(2) does not require $\Gamma$ to be a closed curve. One can repeat the argument with a small modification, taking for $\Omega_d$ a closed tube around $\Gamma$ bordered by the additional “lid” surfaces normal to $\Gamma$ at its ends. Thus instead of $S$ we have a pair of comparison operators $S^i = -\frac{d^2}{ds^2} - \frac{\kappa^2}{4}$ on $L^2(0,L)$ with $D(S^i) := \{ f \in W^{2,2}(0,L),\ f \sim i\cdot bc \}$, $i = D, N$, which give in the same way as above estimates for the eigenvalues $\lambda_j(\alpha)$ of $H_{\alpha,\Gamma}$ as $\alpha \to -\infty$, namely
$$\xi_\alpha + \mu_j^N + O(e^{\pi\alpha}) \le \lambda_j(\alpha) \le \xi_\alpha + \mu_j^D + O(e^{\pi\alpha}),$$
where $\mu_j^i$, $j = 1, 2, \dots$, denote the eigenvalues of $S^i$. The fact that the latter are different for the Dirichlet and Neumann condition does not allow us to squeeze $\lambda_j(\alpha)$ sufficiently well to get its asymptotics in analogy with the claim (1) of the theorem. On the other hand, the behavior of $\mu_j^D - \mu_j^N$ as $j \to \infty$ allows us to find an asymptotic estimate for the counting function. Recall that the eigenvalues of $-\Delta^i = -\frac{d^2}{ds^2} : D(S^i) \to L^2(0,L)$ are of the form $s_j^i = j^2 (\pi/L)^2$, where $j \in \mathbb{N}$ for $i = D$ and $j \in \mathbb{N} \cup \{0\}$ for $i = N$; thus in analogy with (3.18) we can define the functions
$$\nu_j^\pm(\alpha) := \xi_\alpha + (j^\pm)^2 \left( \frac{\pi}{L} \right)^2 + O(e^{\pi\alpha}) \pm v,$$
where $j^+ = j$ and $j^- = j - 1$ with $j \in \mathbb{N}$, which give a two-sided bound for $\lambda_j(\alpha)$. Combining it with the minimax principle, we arrive again at the formula
$$\sharp\sigma_d(H_{\alpha,\Gamma}) = \frac{L}{\pi} (-\xi_\alpha)^{1/2} (1 + O(e^{\pi\alpha})) \quad \text{as } \alpha \to -\infty.$$
Remark 3.7. While Theorem 3.3 was formulated for a single finite curve, which may not be closed for part (2), the argument easily extends to any $\Gamma$ which decomposes into a finite disjoint union of such curves, up to the eigenvalue numbering. The latter may be ambiguous in case the corresponding operator $S$, which is now an orthogonal sum of components of the type (3.1), exhibits an accidental degeneracy in its spectrum.
4. Infinite Asymptotically Straight Curves
We know from [9] that the operator $H_{\alpha,\Gamma}$ has a non-empty discrete spectrum if $\Gamma$ is an infinite $C^4$ curve which is non-straight but asymptotically straight in the following sense:
(aΓinf1) for all $s \in \mathbb{R}$ we have $|\kappa(s)| \le M|s|^{-\beta}$, where $\beta > 5/4$ and $M > 0$.
Moreover, one has to assume that
(aΓinf2) there exists a constant $c \in (0,1)$ such that $|\gamma(s) - \gamma(s')| \ge c|s - s'|$.
If these conditions are satisfied, then the operator $H_{\alpha,\Gamma}$ is self-adjoint and
$$\sigma_{\mathrm{ess}}(H_{\alpha,\Gamma}) = [\xi_\alpha, \infty), \qquad \sigma_d(H_{\alpha,\Gamma}) \ne \emptyset.$$
Since the infinite curve has no free ends, the asymptotics of the eigenvalues of $H_{\alpha,\Gamma}$ for $\alpha \to -\infty$ can be found in the same way as for the loop. We employ the comparison operator which now takes the form
$$S = -\frac{d^2}{ds^2} - \frac{1}{4}\kappa(s)^2 : D(S) \to L^2(\mathbb{R})$$
with the domain $D(S)$ equal to $W^{2,2}(\mathbb{R})$. It is a Schrödinger operator on the line with a potential which is purely attractive provided $\kappa \ne 0$, and therefore $\sigma_d(S) \ne \emptyset$. On the other hand, in view of the assumed decay of curvature as $|s| \to \infty$, the number $N := \sharp\sigma_d(S)$ is finite [15, Theorem XIII.9]. Using the symbol $\mu_j$ for the $j$th eigenvalue of the operator $S$, we get the following result.
Theorem 4.1. Under the above stated assumptions, there is $\alpha_0 \in \mathbb{R}$ such that $\sharp\sigma_d(H_{\alpha,\Gamma}) = N$ holds for all $\alpha < \alpha_0$. Moreover, the $j$th eigenvalue $\lambda_j(\alpha)$ of $H_{\alpha,\Gamma}$, $j = 1, \dots, N$, admits the asymptotic expansion
$$\lambda_j(\alpha) = \xi_\alpha + \mu_j + O(e^{\pi\alpha}) \quad \text{as } \alpha \to -\infty.$$
Since the proof is fully analogous to that of Theorem 3.3, we omit the details.
5. Spectrum for an Infinite Periodic Curve
5.1. The Floquet–Bloch decomposition
Now we turn our attention to Hamiltonians with singular perturbations supported by a periodic $C^4$ curve without self-intersections. In other words, we assume that there is a vector $K_1 \equiv K \in \mathbb{R}^3$ and a number $L > 0$ such that
$$\gamma(s + L) = K + \gamma(s) \quad \text{for all } s \in \mathbb{R}.$$
Of course, we can always choose the Cartesian system of coordinates such that $K = (K, 0, 0)$ with $K > 0$, and $\gamma(0) = 0$. As usual in periodic situations, we decompose the space $\mathbb{R}^3$ according to the periodicity of $\Gamma$. To this aim, we define the basic period cell as
$$C_0 \equiv C := \left\{ x : x = \sum_{i=1}^{3} t_i K_i,\ t_1 \in [0,1),\ t_i \in \mathbb{R},\ i = 2, 3 \right\}, \tag{5.1}$$
where $\{K_i\}_{i=1}^{3}$ are linearly independent vectors in $\mathbb{R}^3$; without loss of generality, we may suppose that $K_2 \perp K_3$. Then the translated cells $C_n := C + nK$, where $n \in \mathbb{Z}$, are mutually disjoint for different values of the index and $\mathbb{R}^3 = \bigcup_{n \in \mathbb{Z}} C_n$. As in the previous section, we assume that $\Gamma$ has no self-intersections. However, to proceed further, we need an additional assumption, namely
(aΓper) the restriction of $\Gamma_C := C \cap \Gamma$ to the interior of $C$ is connected.
Let us note that the choice of the point $s = 0$ is important in checking the assumption (aΓper), and for the same reason, we do not generally require that $K_1 \perp \{K_2, K_3\}$ (see also Remark 5.4 below). While a smooth periodic curve without self-intersections satisfies (aΓ1), the property (aΓper) ensures that we can choose a neighborhood of $\Gamma_C$ which is a connected set contained in $C$; this is important for the construction described below.
In view of Theorem 2.3, the Hamiltonian with the singular perturbation supported by $\Gamma$ is well defined as a self-adjoint operator in $L^2(\mathbb{R}^3)$. To perform the Floquet–Bloch reduction for $H_{\alpha,\Gamma}$, we first decompose the state Hilbert space into a direct integral
$$\mathcal{H} = \int^{\oplus}_{[-\pi/K, \pi/K)} \mathcal{H}_0 \, d\theta, \qquad \mathcal{H}_0 := L^2(C).$$
It is a standard matter to check that the operator $U : L^2(\mathbb{R}^3) \to \mathcal{H}$ given by
$$(Uf)_\theta(x) = \frac{1}{(2\pi)^{1/2}} \sum_{n \in \mathbb{Z}} e^{-i\theta K n} f(x + nK) \tag{5.2}$$
on $f \in C_0^\infty(\mathbb{R}^3)$ acts isometrically, so it can be uniquely extended to a unitary operator on the whole $L^2(\mathbb{R}^3)$. We will say that the function $f \in C^2(C \setminus \Gamma_C)$ belongs to $\Upsilon_\alpha(\theta)$ if it satisfies the condition
$$f \sim \alpha\cdot bc(\Gamma_C),$$
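and furthermore, for all $x$ such that both $x$ and $x + K$ belong to $\partial C$ and $x \ne (0,0,0)$, we have
$$f^{(\nu)}(x + K) = e^{i\theta K} f^{(\nu)}(x), \qquad \nu = 0, 1, \tag{5.3}$$
where $f^{(0)} := f$, $f^{(1)} := \partial_{x_1} f$. Now we define $H_{\alpha,\Gamma}(\theta)$ as the self-adjoint Laplace operator in $L^2(C)$ with the boundary conditions introduced above; more precisely, $H_{\alpha,\Gamma}(\theta)$ is the closure of
$$\dot H_{\alpha,\Gamma}(\theta) : D(\dot H_{\alpha,\Gamma}(\theta)) = \{ f \in \Upsilon_\alpha(\theta) : \dot H_{\alpha,\Gamma}(\theta) f \in L^2(C) \} \to L^2(C),$$
$$\dot H_{\alpha,\Gamma}(\theta) f(x) = -\Delta f(x), \qquad x \in C \setminus \Gamma_C.$$
The following lemma states the usual unitary equivalence between $H_{\alpha,\Gamma}$ and the direct integral of its fiber components $H_{\alpha,\Gamma}(\theta)$.
Lemma 5.1. $U H_{\alpha,\Gamma} U^{-1} = \int^{\oplus}_{[-\pi/K, \pi/K)} H_{\alpha,\Gamma}(\theta) \, d\theta$.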
Proof. Take a function $f$ belonging to the set
$$\mathcal{L} := \{ g \in C^2(\mathbb{R}^3 \setminus \Gamma) : g \sim \alpha\cdot bc(\Gamma),\ \operatorname{supp} g \text{ is compact} \}, \tag{5.4}$$
then for all $i = 1, 2, 3$ we have
$$(U \partial_i f)_\theta(x) = \partial_i (Uf)_\theta(x), \qquad x \notin \Gamma,$$
and the same relations hold for the second derivatives. Thus to prove the lemma, it suffices to show that any function admitting the representation $(Uf)_\theta$ with $f \in \mathcal{L}$ belongs to $\Upsilon_\alpha(\theta)$. It is easy to check that for all $x \ne (0,0,0)$ such that $x$ and $x + K$ are in $\partial C$, we have
$$(Uf)_\theta^{(\nu)}(x + K) = e^{i\theta K} (Uf)_\theta^{(\nu)}(x) \qquad \text{for } \nu = 0, 1.$$
The behavior of the function $(Uf)_\theta$ in the vicinity of $\Gamma_C$ is characterized by the limits $\Xi((Uf)_\theta)(\cdot)$ and $\Omega((Uf)_\theta)(\cdot)$. Using the periodicity of $\Gamma$, we get
$$\Xi((Uf)_\theta)(s) = (2\pi)^{-1/2} \sum_{n \in \mathbb{Z}} e^{-in\theta K}\, \Xi(f)(s + nL), \qquad s \in (0, L),$$
$$\Omega((Uf)_\theta)(s) = (2\pi)^{-1/2} \sum_{n \in \mathbb{Z}} e^{-in\theta K}\, \Omega(f)(s + nL), \qquad s \in (0, L);$$
to derive these relations, we also used the uniform convergence of the sums. In this way, we conclude that $(Uf)_\theta \sim \alpha\cdot bc(\Gamma_C)$. The Laplace operator in $L^2(C)$ with the domain consisting of functions which admit the representation $(Uf)_\theta$ with $f \in \mathcal{L}$ is essentially self-adjoint and its closure coincides with $H_{\alpha,\Gamma}(\theta)$; this completes the proof.
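The quasi-periodicity used in the proof, $(Uf)_\theta(x + K) = e^{i\theta K}(Uf)_\theta(x)$, follows from (5.2) by shifting the summation index. A minimal one-dimensional sketch of this check is given below; it works along the $x_1$ axis only, truncates the sum, and uses a Gaussian as a stand-in for a compactly supported test function, all of which are simplifying assumptions made here for illustration.

```python
import cmath
import math


def floquet_transform(f, theta, x, period, n_max=60):
    """Truncated version of (5.2): (2 pi)^(-1/2) sum_n exp(-i theta K n) f(x + n K)."""
    total = 0j
    for n in range(-n_max, n_max + 1):
        total += cmath.exp(-1j * theta * period * n) * f(x + n * period)
    return total / math.sqrt(2.0 * math.pi)


def gaussian(x):
    """Rapidly decaying stand-in for a compactly supported test function."""
    return math.exp(-x * x)


if __name__ == "__main__":
    period, theta, x = 1.0, 0.7, 0.3   # illustrative values of K, theta and x
    lhs = floquet_transform(gaussian, theta, x + period, period)
    rhs = cmath.exp(1j * theta * period) * floquet_transform(gaussian, theta, x, period)
    print(abs(lhs - rhs))   # difference between the two sides of the Floquet condition
```

The truncation error comes only from the two boundary terms of the sum, which are negligible for a rapidly decaying $f$.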
5.2. Spectral analysis of $H_{\alpha,\Gamma}(\theta)$
As in the case of a finite curve, we can now analyze the discrete spectrum of the operator $H_{\alpha,\Gamma}(\theta)$. Before doing that, let us localize the essential spectrum. An argument analogous to that of Proposition 3.2 shows that the singular perturbation supported by $\Gamma_C$ does not change the essential spectrum of the Laplacian in a slab with Floquet boundary conditions, i.e.
$$\sigma_{\mathrm{ess}}(H_{\alpha,\Gamma}(\theta)) = [\theta^2, \infty). \tag{5.5}$$
To describe the asymptotic behavior of the eigenvalues of $H_{\alpha,\Gamma}(\theta)$, we introduce a comparison operator by $S_\theta = -\frac{d^2}{ds^2} - \frac{\kappa(s)^2}{4} : D(S_\theta) \to L^2(0,L)$, where
$$D(S_\theta) := \{ f \in W^{2,2}(0,L) : f(L) = e^{i\theta K} f(0),\ f'(L) = e^{i\theta K} f'(0) \}.$$
In analogy with Theorem 3.3, we state the following theorem.
Theorem 5.2. Under the assumptions given above, for a fixed number $n$ there exists $\alpha(n) \in \mathbb{R}$ such that $\sharp\sigma_d(H_{\alpha,\Gamma}(\theta)) \ge n$ holds for $\alpha \le \alpha(n)$. Moreover, the $j$th eigenvalue of $H_{\alpha,\Gamma}(\theta)$ has the asymptotic expansion of the form
$$\lambda_j(\alpha, \theta) = \xi_\alpha + \mu_j(\theta) + O(e^{\pi\alpha}) \quad \text{as } \alpha \to -\infty,$$
where $\mu_j(\theta)$ is the $j$th eigenvalue of $S_\theta$ and the error term is uniform with respect to $\theta$.
Proof. The argument follows closely that of Theorem 3.3; the only difference is the replacement of the periodic boundary condition by the Floquet one. The fact that the error is uniform with respect to $\theta$ is a consequence of Lemma 3.5 and the continuity of the functions $\mu_j(\cdot)$.
5.3. Spectral analysis of $H_{\alpha,\Gamma}$ in terms of $H_{\alpha,\Gamma}(\theta)$
Now our aim is to express the spectrum of $H_{\alpha,\Gamma}$ in terms of $H_{\alpha,\Gamma}(\theta)$. First, let us note that combining (5.5) with standard results [15, Sec. XIII.16], we get the following equivalence for the positive part of the spectrum:
$$\sigma(H_{\alpha,\Gamma}) \cap [0, \infty) = \bigcup_{\theta \in [-\pi/K, \pi/K)} \sigma(H_{\alpha,\Gamma}(\theta)) \cap [0, \infty) = [0, \infty).$$
The negative part of the spectrum is more interesting, being given by the union of ranges of the functions $\lambda_j(\alpha, \cdot)$. They give rise to well defined spectral bands because the latter are continuous in the Brillouin zone $[-\pi/K, \pi/K)$. This can be seen by checking in the usual way, putting $\theta$ into the operator and showing that the $\theta$-dependent part is an analytic perturbation. Alternatively, one can take $g = (Uf)_\theta$ with $f \in \mathcal{L}$ as defined by (5.4) and investigate the functions
$$\theta \mapsto q_g(\theta) := (g, H_{\alpha,\Gamma}(\theta) g)_{L^2(C)} = \frac{1}{2\pi} \sum_{n,m \in \mathbb{Z}} e^{-i(n-m)\theta} (f_n, H_{\alpha,\Gamma} f_m)_{L^2(C)},$$
where $f_n(x) := f(x + nK)$. In view of (5.2) and the uniform convergence of the respective sums, such a $q_g(\cdot)$ is continuous for $g$ running over a common core of all $H_{\alpha,\Gamma}(\theta)$. Thus by the minimax principle, we get the continuity of $\lambda_j(\alpha, \cdot)$ and combining this fact with the results of [15], we get
$$\sigma(H_{\alpha,\Gamma}) \cap (-\infty, 0] = \bigcup_{\theta \in [-\pi/K, \pi/K)} \sigma(H_{\alpha,\Gamma}(\theta)) \cap (-\infty, 0],$$
arriving finally at
$$\sigma(H_{\alpha,\Gamma}) = \bigcup_{\theta \in [-\pi/K, \pi/K)} \sigma(H_{\alpha,\Gamma}(\theta)).$$
These results together with Theorem 5.2 allow us to describe the band structure of $H_{\alpha,\Gamma}$, in particular, the existence of gaps. Notice that this operator, as well as $S = -\frac{d^2}{ds^2} - \frac{\kappa(s)^2}{4}$ in $L^2(\mathbb{R})$, commutes with the complex conjugation, so their Floquet eigenvalues are generically twice degenerate, depending on $|\theta|$ only. For the comparison operator, the width of the $j$th gap is
$$G_j(S) = \begin{cases} \mu_{j+1}(\pi/K) - \mu_j(\pi/K), & \text{for odd } j, \\ \mu_{j+1}(0) - \mu_j(0), & \text{for even } j, \end{cases}$$
and similarly for $H_{\alpha,\Gamma}$. The expansion of Theorem 5.2 then gives
$$G_j(H_{\alpha,\Gamma}) = G_j(S) + O(e^{\pi\alpha}).$$
In combination with the known result about the existence of gaps for one-dimensional Schrödinger operators, we arrive at the following conclusion.
Corollary 5.3. Suppose that in addition to the above assumption, the function $\kappa(\cdot)$ is non-constant. In the generic case when $S$ has infinitely many open gaps, one can find to any $n \in \mathbb{N}$ an $\alpha(n) \in \mathbb{R}$ such that the operator $H_{\alpha,\Gamma}$ has at least $n$ open gaps in its spectrum if $\alpha < \alpha(n)$. If the number of gaps in $\sigma(S)$ is $N < \infty$, then $\sigma(H_{\alpha,\Gamma})$ has the same property for $-\alpha$ large enough.
Notice that this property is determined by the curvature alone. Thus the result applies neither to the trivial case of a straight line nor to screw-shaped spirals $\Gamma$ for which $\kappa$ is non-zero but constant.
Remark 5.4. It is not always possible to choose $C$ in the form of a rectangular slab (5.1), as we did above, which would satisfy the assumption (aΓper); counterexamples can be easily found. However, if we choose instead another period cell $C$ with a smooth boundary for which the property (aΓper) is valid, the argument modifies easily and the claim of Theorem 5.2 remains valid. On the other hand, such a decomposition may not exist if the topology of $\Gamma$ is non-trivial; a simple counterexample is given by a “crotchet-shaped” curve. While we conjecture that the claim of Theorem 5.2 is still true in this situation, a different method is required to demonstrate it.
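The role of the non-constancy assumption in Corollary 5.3 can be made concrete. For constant curvature the comparison operator $S_\theta$ has the explicit eigenfunctions $e^{ips}$ with $pL = \theta K + 2\pi m$, $m \in \mathbb{Z}$, so its eigenvalues are $((\theta K + 2\pi m)/L)^2 - \kappa^2/4$. The sketch below, which is an illustration added here and not part of the original text, evaluates the gap formula for $G_j(S)$ in this case; the function names and sample parameters are assumptions.

```python
import math


def floquet_eigenvalues(theta, K, L, kappa, m_max=20):
    """Sorted eigenvalues of S_theta = -d^2/ds^2 - kappa^2/4 for constant kappa.

    The eigenfunctions are exp(i p s) with p L = theta K + 2 pi m, which satisfy
    the Floquet conditions f(L) = exp(i theta K) f(0), f'(L) = exp(i theta K) f'(0).
    """
    return sorted(
        ((theta * K + 2.0 * math.pi * m) / L) ** 2 - kappa ** 2 / 4.0
        for m in range(-m_max, m_max + 1)
    )


def gap_width(j, K, L, kappa):
    """Width of the j-th gap (1-based j) according to the formula for G_j(S)."""
    theta = math.pi / K if j % 2 == 1 else 0.0
    mu = floquet_eigenvalues(theta, K, L, kappa)
    return mu[j] - mu[j - 1]   # mu_{j+1}(theta) - mu_j(theta)


if __name__ == "__main__":
    K, L, kappa = 1.0, 2.0, 0.5   # illustrative helix-like data with constant curvature
    print([gap_width(j, K, L, kappa) for j in range(1, 7)])
```

All the widths vanish here because the eigenvalues at $\theta = 0$ and $\theta = \pi/K$ come in degenerate pairs, which is why a curve with constant curvature gives no gaps in $\sigma(S)$.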
5.4. Compactly disconnected periodic curves
So far we have considered a single periodic connected curve. A slightly stronger result about the existence of gaps in the spectrum of $H_{\alpha,\Gamma}$ as $\alpha \to -\infty$ can be obtained for compactly disconnected periodic curves in $\mathbb{R}^3$, i.e. those which decompose into a disjoint union in which each of the connected components is compact. To be more specific, we consider a family of curves obtained by translations of a loop $\Gamma_0$ (being a graph of a function $\gamma_0$) generated by an $r$-tuple $\{K_i\}$ of linearly independent vectors, where $r = 1, 2, 3$. The curve $\Gamma$ in question is then a union $\Gamma = \bigcup_{n \in \mathbb{Z}^r} \Gamma_n$, where $\Gamma_n$ are graphs of
$$\gamma_n := \gamma_0 + \sum_{i} n_i K_i : [0, L] \to \mathbb{R}^3, \qquad n = \{n_i\} \in \mathbb{Z}^r;$$
for the sake of brevity, we put here $\Gamma_{n_0} = \Gamma_0$, $\gamma_{n_0} = \gamma_0$, where $n_0 := (0, 0, 0)$. We assume that $\Gamma_0$ is contained in the interior of the period cell
$$C = \left\{ \sum_{i=0}^{r-1} t_i K_i : 0 \le t_i < 1 \right\} \times \{K_i\}^\perp,$$
which is non-compact if $r = 1, 2$ and compact otherwise. Similarly as before, we can make the Floquet–Bloch decomposition of $H_{\alpha,\Gamma}$ into a direct integral of the fiber operators $H_{\alpha,\Gamma}(\theta)$. However, since $\operatorname{dist}(\partial C, \Gamma_C) > 0$ holds by assumption, the comparison operator $S = S(\theta)$ is now independent of the quasimomentum $\theta \in \prod_{1 \le i \le r} [-\pi|K_i|^{-1}, \pi|K_i|^{-1})$. While in the previous case some gaps of $S(\theta)$ might be closed, now they are all open. As a result, each gap in the spectrum $\sigma(H_{\alpha,\Gamma})$, which depends of course on $\theta$, will eventually open for $-\alpha$ large enough.
Theorem 5.5. Under the assumptions stated above, the spectrum of $H_{\alpha,\Gamma}(\theta)$ is purely discrete if $r = 3$, and $\sigma_{\mathrm{ess}}(H_{\alpha,\Gamma}(\theta)) = \left[ \sum_{i=1}^{r} \theta_i^2, \infty \right)$ if $r = 1, 2$. The $j$th eigenvalue of $H_{\alpha,\Gamma}(\theta)$ admits the asymptotic expansion of the following form
$$\lambda_j(\alpha, \theta) = \xi_\alpha + \mu_j + O(e^{\pi\alpha}) \quad \text{as } \alpha \to -\infty,$$
where $\mu_j$ is the $j$th eigenvalue of $S$ and the error is uniform with respect to $\theta$. Consequently, for any $n \in \mathbb{N}$ there is $\alpha(n) \in \mathbb{R}$ such that the operator $H_{\alpha,\Gamma}$ has at least $n$ open gaps in its spectrum if $\alpha < \alpha(n)$.
6. Concluding Remarks
(1) The results obtained in the previous discussion can be rephrased as a semiclassical approximation. To see this, let us consider the Hamiltonian $H_{\alpha,\Gamma}(h)$ with Planck’s constant $h$ reintroduced; the latter is understood in the mathematical sense, i.e. as a parameter which allows us to investigate the asymptotic behavior as $h \to 0$. The operator in question then acts as
$$H_{\alpha,\Gamma}(h) f(x) = -h^2 \Delta f(x), \qquad x \in \mathbb{R}^3 \setminus \Gamma,$$
and has the domain $D(H_{\alpha,\Gamma}(h)) = \{ f \in \Upsilon_{\mathbb{R}^3} : f \sim \alpha(h)\cdot bc(\Gamma) \}$, where
$$\alpha(h) := \alpha + \frac{1}{2\pi} \ln h. \tag{6.1}$$
This definition of $H_{\alpha,\Gamma}(h)$ requires a comment. In the case when $\operatorname{codim} \Gamma = 1$ discussed in [6], the Hamiltonian is defined by the natural quadratic form, hence introducing $h$ means a multiplicative change of the coupling parameter, $\alpha \to \alpha h^{-2}$; one can see that also from the approximation of such an operator by means of scaled regular potentials [8]. In contrast to that, a two-dimensional point interaction involves a complicated nonlinear coupling constant renormalization [2, Sec. I.5], so introducing Planck’s constant is in this case arbitrary to a certain extent. We choose the simplest way, noticing that the relation between the free operators $-\Delta$ and $-h^2\Delta$ can be expressed by means of the scaling transformation $x \mapsto hx$, and require the similar behavior for the singular interaction term; it is well known that a scaling for a two-dimensional point interaction is equivalent to a logarithmic shift of the coupling parameter (cf. [7]). In view of (6.1), the semiclassical limit $h \to 0$ is within this convention for a fixed coupling constant $\alpha$ equivalent to $\alpha(h) \to -\infty$, which means a strong coupling again. Since $H_{\alpha,\Gamma}(h) = h^2 H_{\alpha(h),\Gamma}(1)$, we see that the eigenvalues $\lambda_j(\alpha, h)$ of $H_{\alpha,\Gamma}(h)$ then take the following form
$$\lambda_j(\alpha, h) = \xi_\alpha + \mu_j h^2 + O(h^{5/2}) \quad \text{as } h \to 0.$$
In the same way, we find the counting function which is given by
$$\sharp\sigma_d(H_\alpha(h)) = \frac{L}{\pi h} (-\xi_\alpha)^{1/2} (1 + O(h^{1/4})).$$
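The bookkeeping behind the semiclassical expansion can be checked numerically. The sketch below assumes the explicit form $\xi_\alpha = -4e^{2(-2\pi\alpha + \psi(1))}$, which is the value for which $\omega(\alpha, k)$ from the proof of Lemma 3.6 vanishes at $k = \zeta_\alpha$; it is not restated in this section, so it should be read as an assumption. With it, the shift (6.1) gives $h^2 \xi_{\alpha(h)} = \xi_\alpha$ and $e^{\pi\alpha(h)} = h^{1/2} e^{\pi\alpha}$, which is how the error $O(e^{\pi\alpha(h)})$ turns into $O(h^{5/2})$ after multiplication by $h^2$.

```python
import math

# psi(1) equals minus the Euler-Mascheroni constant.
PSI_1 = -0.5772156649015329


def xi(alpha):
    """Assumed point-interaction eigenvalue xi_alpha = -4 exp(2(-2 pi alpha + psi(1)))."""
    return -4.0 * math.exp(2.0 * (-2.0 * math.pi * alpha + PSI_1))


def alpha_h(alpha, h):
    """Shifted coupling constant alpha(h) = alpha + ln(h) / (2 pi) from (6.1)."""
    return alpha + math.log(h) / (2.0 * math.pi)


if __name__ == "__main__":
    alpha, h = -0.2, 0.1   # illustrative values
    # h^2 * xi_{alpha(h)} reproduces xi_alpha, the leading term of lambda_j(alpha, h).
    print(h * h * xi(alpha_h(alpha, h)), xi(alpha))
    # exp(pi alpha(h)) equals sqrt(h) * exp(pi alpha), the source of the h^(5/2) error.
    print(math.exp(math.pi * alpha_h(alpha, h)), math.sqrt(h) * math.exp(math.pi * alpha))
```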
(2) Let us finally list some open problems related to the present subject.
• One is naturally interested in the asymptotic expansion in the situation when $\Gamma$ is a curve with free ends, where the present method allows us to treat the counting function only; the analogous question stands for planar curves [11] and surfaces with a boundary [6]. We conjecture that the expansion of Theorem 3.3 holds again with $\mu_j$ corresponding to the comparison operator which acts according to (3.1) with Dirichlet boundary conditions at the boundary of $\Gamma$.
• The results can be extended to higher dimensions provided $\operatorname{codim} \Gamma \le 3$ so that the singular interaction Hamiltonian is well defined.
• The smoothness assumption is crucial in our argument. A self-similar curve such as a broken line consisting of two halflines joined at a point provides an example of a situation where the asymptotic behavior differs from that of Theorem 3.3. One can ask, e.g. what the asymptotics looks like for a piecewise smooth curve with non-zero angles at a discrete set of points.
• Another important question concerns the absolute continuity of the spectrum in the case when $\Gamma$ is a periodic curve or a family of curves. The answer is known if $\operatorname{codim} \Gamma = 1$ and the elementary cell is compact [3, 16]. The cases of a single connected periodic curve or a periodic surface diffeomorphic to the plane are open, and the same is true for periodic curve(s) in $\mathbb{R}^3$, i.e. the situation with $\operatorname{codim} \Gamma = 2$.
Acknowledgment
We thank the referees for their remarks which helped to improve the text. One of the authors (S. Kondej) is grateful for the hospitality in the Department of Theoretical Physics, NPI, Czech Academy of Sciences, where a part of this work was done. The research has been partially supported by the ASCR project K1010104 and by the Polish Ministry of Scientific Research and Information Technology under (solicited) Grant No. PBZ-Min-008/PO3-03.
References
[1] M. S. Abramowitz and I. A. Stegun (eds.), Handbook of Mathematical Functions (Dover, New York, 1965).
[2] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, Solvable Models in Quantum Mechanics (Springer, Heidelberg 1988).
[3] M. S. Birman, T. A. Suslina and R. G. Shterenberg, Absolute continuity of the two-dimensional Schrödinger operator with delta potential concentrated on a periodic system of curves, Algebra i Analiz 12 (2000) 140–177; translated in St. Petersburg Math. J. 12 (2001) 535–567.
[4] J. F. Brasche, P. Exner, Yu. A. Kuperin and P. Šeba, Schrödinger operators with singular interactions, J. Math. Anal. Appl. 184 (1994) 112–139.
[5] P. Duclos and P. Exner, Curvature-induced bound states in quantum waveguides in two and three dimensions, Rev. Math. Phys. 7 (1995) 73–102.
[6] P. Exner, Spectral properties of Schrödinger operators with a strongly attractive δ interaction supported by a surface, in Proc. NSF Summer Research Conference (Mt. Holyoke 2002); Series in AMS Contemporary Mathematics, Vol. 339, Providence, R. I., (2003) 25–36.
[7] P. Exner, R. Gawlista, P. Šeba and M. Tater, Point interactions in a strip, Ann. Phys. 252 (1996) 133–179.
[8] P. Exner and T. Ichinose, Geometrically induced spectrum in curved leaky wires, J. Phys. A34 (2001) 1439–1450.
[9] P. Exner and S. Kondej, Curvature-induced bound states for a δ interaction supported by a curve in R3, Ann. H. Poincaré 3 (2002) 967–981.
[10] P. Exner and P. Šeba, Bound states in curved quantum waveguides, J. Math. Phys. 30 (1989) 2574–2580.
[11] P. Exner and K. Yoshitomi, Asymptotics of eigenvalues of the Schrödinger operator with a strong δ-interaction on a loop, J. Geom. Phys. 41 (2002) 344–358.
[12] P. Exner and K. Yoshitomi, Band gap of the Schrödinger operator with a strong δ-interaction on a periodic curve, Ann. H. Poincaré 2 (2001) 1139–1158.
[13] Y. V. Kurylev, Boundary condition on a curve for a three-dimensional Laplace operator, J. Sov. Math. 22 (1983) 1072–1082.
[14] A. Posilicano, A Krein-like formula for singular perturbations of self-adjoint operators and applications, J. Funct. Anal. 183 (2001) 109–147.
[15] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV. Analysis of Operators (Academic Press, New York, 1978).
[16] T. A. Suslina and R. G. Shterenberg, Absolute continuity of the spectrum of the Schrödinger operator with the potential concentrated on a periodic system of hypersurfaces, Algebra i Analiz 13 (2001) 197–240.
July 14, 2004 18:3 WSPC/148-RMP
00211
Reviews in Mathematical Physics Vol. 16, No. 5 (2004) 583–602 © World Scientific Publishing Company
TWISTED ENTIRE CYCLIC COHOMOLOGY, J-L-O COCYCLES AND EQUIVARIANT SPECTRAL TRIPLES
DEBASHISH GOSWAMI
The Stat-Math Unit, Indian Statistical Institute
203, B.T. Road, Kolkata 700 108, India
[email protected]
Received 02 April 2003
Revised 29 September 2003
We study the “quantized calculus” corresponding to the algebraic ideas related to “twisted cyclic cohomology” introduced in [12]. With definitions and techniques very similar to those used in [9], we define and study “twisted entire cyclic cohomology” and the “twisted Chern character” associated with appropriate operator theoretic data called “twisted spectral data”, which consist of a spectral triple in the conventional sense of noncommutative geometry [1] and an additional positive operator having some specified properties. Furthermore, it is shown that given a spectral triple (in the conventional sense) which is equivariant under the (co-)action of a compact matrix pseudogroup, it is possible to obtain canonical twisted spectral data and hence the corresponding (twisted) Chern character, which will be invariant (in the usual sense) under the (co-)action of the pseudogroup, in contrast to the fact that the Chern character coming from the conventional noncommutative geometry need not be invariant. In the last section, we also try to elaborate on some remarks made in [3], in the context of a new definition of invariance satisfied by the conventional (untwisted) cyclic cocycles when lifted to an appropriate larger algebra.
Keywords: J-L-O Cocycles; twisted cyclic cohomology; quantum groups.
2000 Mathematics Subject Classifications: 58B34, 58B32
1. Introduction
Ordinary and entire cyclic cohomology theories are among the fundamental ingredients of Connes’ noncommutative geometry. A comprehensive account of this theory can be found in [1] and the references cited in that book. Let us briefly recall how this theory is used to define a noncommutative version of the Chern character. First of all, there is a canonical pairing between the K-theory and the ordinary as well as the entire cyclic cohomology. Let A be a Banach or more generally locally convex topological algebra, ⟨·, ·⟩ : K∗ (A) × H ∗ (A) → C and ⟨·, ·⟩ : K∗ (A) × H∗ (A) → C be the canonical pairing [1] between the K-theory and the periodic cyclic cohomology and the pairing between the K-theory and the
entire cyclic cohomology of A respectively. Given a Fredholm module (H, F ) over A, or equivalently a spectral triple (A, H, D), one constructs a canonical element ch∗ (H, F ), called the Chern character, of H ∗ (A) or H∗ (A), depending on whether the Fredholm module is “p-summable” for some p > 0 or it is only “Θ-summable”. The map φ from K∗ (A) to C given by φ(·) = ⟨·, ch∗ (H, F )⟩ (⟨·, ch∗ (H, F )⟩ for the Θ-summable case) actually takes integer values and can be obtained as an index of a suitable Fredholm operator. For the finite summable situation, ch∗ (H, F ) is given by (up to a constant) the cocycle $\tau_n(a_0, \dots, a_n) = \mathrm{Tr}_s(a_0 [F, a_1] \cdots [F, a_n])$, $a_j \in A$, where $\mathrm{Tr}_s$ is a suitable graded trace defined in [1]. In terms of the associated spectral triple, under some additional assumptions, one gets a canonical Hochschild n-cocycle $\varphi_w$ given by $\varphi_w(a_0, \dots, a_n) = \lambda_n \Psi(a_0 [D, a_1] \cdots [D, a_n])$, where $\lambda_n$ is a constant and $\Psi(A) := \mathrm{Tr}_w(A|D|^{-n})$, $A \in B(H)$, where $\mathrm{Tr}_w(\cdot)$ denotes the Dixmier trace and n is suitably chosen so that $|D|^{-n}$ has finite Dixmier trace. The positive linear functional $A \ni a \mapsto \mathrm{Tr}_w(a|D|^{-n})$ is a trace and can be thought of as the “noncommutative volume form” associated with the noncommutative spin geometry encoded by the spectral triple. Furthermore, if A is equipped with the action of a classical compact group G implemented by a unitary representation of G on H and the spectral triple is G-equivariant (which means in particular that D commutes with the G-representation on H), then the above-mentioned cocycles are G-invariant in the sense that $\tau_n(g \cdot a_0, \dots, g \cdot a_n) = \tau_n(a_0, \dots, a_n)$, and a similar statement holds for $\varphi_w$.
In particular, if A is chosen to be an appropriate function algebra (containing the smooth functions) on a classical compact Lie group G and the canonical equivariant Dirac operator D is chosen, then the above-mentioned volume form will (up to a constant) coincide with the integral with respect to the Haar measure. The above-mentioned invariance of the cocycles makes it possible to consider their lifting to the algebraic crossed product in a canonical way. However, things change drastically when one replaces classical compact groups by noncommutative and non-cocommutative compact quantum groups defined by Woronowicz [17]. The first major difference is that the Haar state on such a quantum group is no longer a trace, and if one considers an equivariant spectral triple such as the ones constructed by Chakraborty and Pal [6], and constructs the Chern character as mentioned before, it may no longer be invariant under the natural (co-)action of the quantum group. In particular, the noncommutative volume form in general will not coincide with the Haar state, and in fact may not even be faithful. The simplest and important case of $SU_q(2)$ ($0 < q < 1$) deserves some discussion in this context. From the explicit description of the K-homology of $SU_q(2)$ in [14], it is easily seen that the Chern character ω (in the notation of the above-mentioned paper) of the 1-summable generator of the odd K-homology is not invariant under the $SU_q(2)$ (co-)action. Had it been invariant, one would have $\omega * \tau_{\mathrm{even}} = \omega$, which is not the case as it is shown in [14] (here $\tau_{\mathrm{even}}$ is as in [14] and $*$ denotes the product described in that paper, which is the combination
July 14, 2004 18:3 WSPC/148-RMP
00211
Twisted Entire Cyclic Cohomology
585
of the shuffle product and the $SU_q(2)$-coproduct). Thus, it is not possible to get an invariant Chern character within the framework of conventional noncommutative geometry, and this explains why the Chern character obtained in [6] cannot be invariant. However, even if one forgets the aspect of invariance, there are other strange properties observed in this example. It can be shown (to be discussed in detail elsewhere) that some of the natural properties, which one almost always observes for a nice compact manifold such as the classical $SU(2)$, are not valid. For example, there can be no odd 1-summable spectral triple for $SU_q(2)$ whose Chern character (in $HC^1$) is nontrivial and which also satisfies the following properties: the corresponding Dirac operator $D_1$ has compact resolvent; $|D_1|^{-1}$ (to be interpreted as the inverse of the restriction of $|D_1|$ to the orthogonal complement of the kernel of $D_1$) has finite Dixmier trace; and the repeated commutators with $|D_1|$ are bounded for every element in the finite $*$-algebraic span of the canonical generators of $SU_q(2)$. It is interesting to remark that the equivariant Dirac operator $D$ in [6], whose associated Fredholm module is the generator of the odd K-homology group, has the property that $|D|^{-3}$ is in the Dixmier-trace class, and the Hochschild cohomology class of the associated Chern character in $HC^3$ vanishes. Thus, this Chern character must be of the form $S\tau$ for some $\tau \in HC^1$ (where $S$ is the periodicity operator used in [1] in the exact couple relating the Hochschild and cyclic cohomology), and it is not clear whether it is at all possible to obtain this $\tau$ as a Chern character coming from some equivariant Dirac operator $D'$ such that $|D'|^{-1}$ has finite Dixmier trace.
Thus, it is not known whether one can describe the Hochschild class corresponding to $\tau \in HC^1$ by a suitable “local” formula involving the Dixmier trace (we should note that it is indeed possible to describe the periodic cyclic cohomology class of $\tau$ by a local formula involving residues, as shown in [3]). However, the issue of invariance has been discussed in [3], and the author has proposed a new formulation of invariance which is seen to be satisfied by the canonical cochains and cocycles in this example. It is also shown there how to put the alternative approach suggested in [12], based on what the authors have called “twisted cyclic cohomology”, in a proper perspective. In fact, the author of [3] has prescribed a canonical way to recover the so-called twisted cohomology theory from the conventional cyclic cohomology of a suitably enlarged algebra. However, as pointed out in [3], twisted cyclic cohomology is still useful as a “detector” of cohomology classes in the above-mentioned enlarged algebra [3, Sec. 8], and thus it is perhaps interesting to study the operator-theoretic framework corresponding to the algebraic theory of [12]. We would like to point out here that we shall mainly build some general theory, which in particular will enable one to obtain an invariant (twisted) Chern character in this context. We shall discuss the examples of $SU_q(2)$ and the quantum 2-spheres, but leave the study of other examples for the future. Motivated by the fact that the Haar states on typical compact quantum groups are not tracial, and by other considerations, the authors of [12] have found it somewhat
natural to introduce “twisted cyclic cohomology”, which is indeed a module in the cyclic category (see e.g. [1] or [13]). However, they did not focus on the “quantized calculus” related to the twisted cyclic cohomology, which is our goal in the present article. To be more precise, we shall discuss the twisted analogue of the entire cyclic cohomology and show how one can obtain canonical J-L-O type [9] cocycles in this twisted entire cyclic cohomology from a spectral triple and an additional positive operator giving rise to the “twist”. In fact, although we shall develop a somewhat general theory, our main focus will be on the examples coming from quantum group theory, and we shall show that a canonical “twisting” operator exists for a given equivariant spectral triple for the (co-)action of compact matrix pseudogroups of Woronowicz. Let us remark here that in some special examples of noncommutative manifolds studied in [4] and [5], the conventional theory of noncommutative geometry was shown to be nicely applicable to certain Hopf algebras or associated homogeneous spaces, but those Hopf algebras (e.g. $SL_q(2)$ for $q$ complex of modulus 1) do not come under the framework of topological quantum groups given by Woronowicz and others.

Before we enter into the main results, we should perhaps mention why we are interested in the twisted version of the entire cyclic cohomology (hence J-L-O type cocycles) rather than the ordinary cyclic cohomology. This is motivated by our study of the example $SU_q(2)$ [8, 6], where we have shown that the Haar state can be recaptured by the formula
\[ \frac{\mathrm{Tr}(aRe^{-tD^2})}{\mathrm{Tr}(Re^{-tD^2})}, \quad a \in SU_q(2), \]
where $D$ is the equivariant Dirac operator and $R$ is a suitable positive operator, coming from the modular theory of $SU_q(2)$. It is also shown that there is no finite positive number $d$ such that $\mathrm{Tr}(Re^{-tD^2}) = O(t^{-d})$. This in some sense indicates that the associated Fredholm module is not finite dimensional, and is only $\Theta$-summable, so that it is natural to construct J-L-O type cocycles in the (twisted) entire cyclic cohomology.
2. Twisted Entire Cyclic Cohomology and J-L-O Cocycles

2.1. Algebraic aspects

Let us develop the theory for Banach algebras for simplicity, but we note that our results extend to locally convex algebras, which we actually need. The extension to the locally convex algebra case follows exactly as remarked in [1, p. 370]. So, let $A$ be a unital Banach algebra, with $\|\cdot\|_*$ denoting its Banach norm, and let $\sigma$ be a continuous automorphism of $A$, $\sigma(1) = 1$. For $n \geq 0$, let $C^n$ be the space of continuous $(n+1)$-linear functionals $\phi$ on $A$ which are $\sigma$-invariant, i.e. $\phi(\sigma(a_0), \ldots, \sigma(a_n)) = \phi(a_0, \ldots, a_n)$, $\forall a_0, \ldots, a_n \in A$; and $C^n = \{0\}$ for $n < 0$. We define linear maps $T_n, N_n : C^n \to C^n$, $U_n : C^n \to C^{n-1}$ and $V_n : C^n \to C^{n+1}$ by
\[ (T_n f)(a_0, \ldots, a_n) = (-1)^n f(\sigma(a_n), a_0, \ldots, a_{n-1}), \qquad N_n = \sum_{j=0}^{n} T_n^j, \]
\[ (U_n f)(a_0, \ldots, a_{n-1}) = (-1)^n f(a_0, \ldots, a_{n-1}, 1), \qquad (V_n f)(a_0, \ldots, a_{n+1}) = (-1)^{n+1} f(\sigma(a_{n+1})a_0, a_1, \ldots, a_n). \]
Let $B_n = N_{n-1} U_n (T_n - I)$, $b_n = \sum_{j=0}^{n+1} T_{n+1}^{-j-1} V_n T_n^j$. Let $B$, $b$ be maps on the complex $C \equiv (C^n)_n$ given by $B|_{C^n} = B_n$, $b|_{C^n} = b_n$. It is easy to verify (similar to what is done for the untwisted case, e.g. in [1]) that $B^2 = 0$, $b^2 = 0$ and $Bb = -bB$, so that we get a bicomplex $(C^{n,m} \equiv C^{n-m})$ with differentials $d_1$, $d_2$ given by
\[ d_1 = (n - m + 1)\, b : C^{n,m} \to C^{n+1,m}, \qquad d_2 = \frac{B}{n-m} : C^{n,m} \to C^{n,m+1}. \]
Furthermore, let $C^e = \{(\phi_{2n})_{n \in \mathbb{N}};\ \phi_{2n} \in C^{2n},\ \forall n \in \mathbb{N}\}$ and $C^o = \{(\phi_{2n+1})_{n \in \mathbb{N}};\ \phi_{2n+1} \in C^{2n+1},\ \forall n \in \mathbb{N}\}$. We say that an element $\phi = (\phi_{2n})$ of $C^e$ is a $\sigma$-twisted even entire cochain if the radius of convergence of the complex power series $\sum_n \|\phi_{2n}\| \frac{z^n}{n!}$ is infinity, where $\|\phi_{2n}\| := \sup_{\|a_j\|_* \leq 1} |\phi_{2n}(a_0, \ldots, a_{2n})|$. Similarly, we define $\sigma$-twisted odd entire cochains, and let $C^e_\epsilon(A, \sigma)$ ($C^o_\epsilon(A, \sigma)$ respectively) denote the set of $\sigma$-twisted even (respectively odd) entire cochains. Let $\partial = d_1 + d_2$, and we have the short complex
\[ C^e_\epsilon(A, \sigma) \underset{\partial}{\overset{\partial}{\rightleftarrows}} C^o_\epsilon(A, \sigma). \]
We call the cohomology of this complex the $\sigma$-twisted entire cyclic cohomology of $A$ and denote it by $H^*_\epsilon(A, \sigma)$.
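The role of the $\sigma$-invariance condition built into $C^n$ can be made explicit by a short calculation. The following is only a reader's check, not part of the original argument: iterating the definition of $T_n$ shows that $T_n^{n+1}$ is the identity on $C^n$.

```latex
\begin{align*}
(T_n f)(a_0,\dots,a_n) &= (-1)^{n} f(\sigma(a_n),a_0,\dots,a_{n-1}),\\
(T_n^2 f)(a_0,\dots,a_n) &= (-1)^{2n} f(\sigma(a_{n-1}),\sigma(a_n),a_0,\dots,a_{n-2}),\\
&\;\;\vdots\\
(T_n^{n+1} f)(a_0,\dots,a_n) &= (-1)^{n(n+1)} f(\sigma(a_0),\dots,\sigma(a_n))
   = f(a_0,\dots,a_n).
\end{align*}
```

The last equality uses that $n(n+1)$ is even and that $f$ is $\sigma$-invariant. Consequently $T_n$ is invertible on $C^n$ (so the negative powers appearing in $b_n$ make sense), and $N_n = \sum_{j=0}^{n} T_n^j$ satisfies $T_n N_n = N_n T_n = N_n$, exactly as in the untwisted case.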
Proposition 2.1. Let $A_\sigma = \{a \in A : \sigma(a) = a\}$ be the fixed point subalgebra for the automorphism $\sigma$. There is a canonical pairing $\langle \cdot, \cdot \rangle_{\sigma,\epsilon} : K_*(A_\sigma) \times H^*_\epsilon(A, \sigma) \to \mathbb{C}$.

Proof. The restriction $\phi_\sigma$ of an element $\phi \in H^*_\epsilon(A, \sigma)$ to the subalgebra $A_\sigma$ clearly belongs to the untwisted entire cyclic cohomology $H^*_\epsilon(A_\sigma)$, hence can be paired with the K-theory of $A_\sigma$. Thus, we define $\langle [e], \phi \rangle_{\sigma,\epsilon} := \langle [e], \phi_\sigma \rangle_\epsilon$, where $[e] \in K_*(A_\sigma)$, $\phi \in H^*_\epsilon(A, \sigma)$ and $\langle \cdot, \cdot \rangle_\epsilon$ denotes the canonical pairing between the K-theory and the entire cyclic cohomology of $A_\sigma$. The Proposition now follows by arguments similar to those in the proof for the untwisted case, as in [1].

2.2. Construction of the Chern character using the J-L-O cocycles

We begin with the following definition.

Definition 2.1. Let $H$ be a separable Hilbert space, $A^\infty$ be a $*$-subalgebra (not necessarily complete) of $B(H)$, $R$ be a positive (possibly unbounded) operator on $H$, and $D$ be a self-adjoint operator in $H$ such that the following hold:
(1) $[D, a] \in B(H)$, $\forall a \in A^\infty$;
(2) $R$ commutes with $D$;
(3) for any real number $s$ and $a \in A^\infty$, $\sigma_s(a) := R^{-s} a R^s$ is bounded and belongs to $A^\infty$. Furthermore, for any positive integer $n$, $\sup_{s \in [-n,n]} \|\sigma_s(a)\| < \infty$.
Then we call the quadruple $(A^\infty, H, D, R)$ an odd $R$-twisted spectral data. Furthermore, if there is a grading given by $\gamma \in B(H)$ with $\gamma = \gamma^* = \gamma^{-1}$, and $\gamma$ commutes with $A^\infty$ and $R$, and anti-commutes with $D$, then we say that we are
given an even $R$-twisted spectral data $(A^\infty, H, D, R, \gamma)$. We say that the given (odd or even) twisted spectral data is $\Theta$-summable if $Re^{-tD^2}$ is trace-class for all $t > 0$.

Remark 2.1. (3) in the above definition in particular implies that for $s \in \mathbb{R}$, $\sigma_s$ is an automorphism of the algebra $A^\infty$. It is, however, not a $*$-automorphism in general.

In this section, let us consider even twisted spectral data only, as the odd case can be treated with obvious and minor modifications as done in the untwisted case. Let us assume that we are given an even twisted spectral data specified by $(A^\infty, H, D, R, \gamma)$ as in the above definition, and fix any $\beta > 0$. Let $H = D^2$, $A^{(s)} = e^{-sH} A e^{-sH}$, $A(s) = e^{-sH} A e^{sH}$ for $s > 0$ and $A \in B(H)$. Let us denote by $B$ the set of all $A \in B(H)$ for which $\sigma_s(A) := R^{-s} A R^s \in B(H)$ for all real numbers $s$, $[D, A] \in B(H)$, and $s \mapsto \|\sigma_s(A)\|$ is bounded over compact subsets of the real line. In particular, $A^\infty \subseteq B$. We define for $n \in \mathbb{N}$ an $(n+1)$-linear functional $F_n^\beta$ on $B$ by the formula
\[ F_n^\beta(A_0, \ldots, A_n) = \int_{\Sigma_n} \mathrm{Tr}\bigl(\gamma A_0 A_1(t_1) \cdots A_n(t_n) R e^{-\beta H}\bigr)\, dt_1 \cdots dt_n, \]
where $\Sigma_n = \{(t_1, \ldots, t_n) : 0 \leq t_1 \leq \cdots \leq t_n \leq \beta\}$. It follows from the following lemma that the above integral exists as a finite quantity.
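Lemma 2.1. $F_n^\beta$ is well defined and one has the estimate
\[ |F_n^\beta(A_0, \ldots, A_n)| \leq \frac{\beta^n}{n!}\, \mathrm{Tr}(Re^{-\beta H}) \prod_{j=0}^{n} C_j, \]
where $C_j = \sup_{s \in [-1,1]} \|\sigma_s(A_j)\|$.

Proof. The proof is very similar to that of [9, Proposition IV.2], so we only sketch the main ideas. We use the same notation $\delta_1, \ldots, \delta_{n+1}$ as in [9], i.e. $\delta_j = \frac{t_j - t_{j-1}}{\beta}$, with $t_0 = 0$, $t_{n+1} = \beta$. Thus, $(t_1, \ldots, t_n)$ in the integrand can be replaced by $(\delta_1, \ldots, \delta_{n+1})$ with the condition that $\delta_j \geq 0$, $\sum_j \delta_j = 1$. Then, as in [9], we have
\[ \begin{aligned} \mathrm{Tr}\bigl(\gamma A_0 A_1(t_1) \cdots A_n(t_n) R e^{-\beta H}\bigr) &= \mathrm{Tr}\bigl(\gamma A_0 e^{-\beta\delta_1 H} A_1 e^{-\beta\delta_2 H} \cdots A_n e^{-\beta\delta_{n+1} H} R^{\sum_j \delta_j}\bigr) \\ &= \mathrm{Tr}\bigl(\gamma\, \sigma_{-1}(A_0)(Re^{-\beta H})^{\delta_1}\, \sigma_{-\sum_{j=2}^{n+1}\delta_j}(A_1)(Re^{-\beta H})^{\delta_2} \cdots \sigma_{-\delta_{n+1}}(A_n)(Re^{-\beta H})^{\delta_{n+1}}\bigr), \end{aligned} \]
where in the last step we have used the fact that $R$ and $H$ commute, and $\gamma$ and $R$ also commute. Now, by the generalized Hölder's inequality for Schatten ideals, the desired estimate follows, noting that $\|(Re^{-\beta H})^{\delta}\|_{\delta^{-1}} = (\mathrm{Tr}(Re^{-\beta H}))^{\delta}$ as $Re^{-\beta H}$ is a positive operator.

For $A \in B(H)$, let $\dot{A}$ denote $\frac{d}{dt}(A(t))|_{t=0} = -[H, A]$, whenever it exists as a bounded operator. Clearly for $A$ of the form $A = B^{(s)}$, $s > 0$, we have $\dot{A} \in B(H)$.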
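Returning to the estimate of Lemma 2.1, the Hölder step can be spelled out as follows. This is only a sketch: the shorthand $P := Re^{-\beta H}$ and $X_j$ (for the twisted operators $\sigma_{-r_j}(A_j)$, $r_j \in [0,1]$, appearing in the trace) is introduced here for illustration and is not used in the text.

```latex
\begin{align*}
\bigl|\mathrm{Tr}\bigl(\gamma X_0 P^{\delta_1} X_1 P^{\delta_2}\cdots X_n P^{\delta_{n+1}}\bigr)\bigr|
  &\le \prod_{j=0}^{n}\|X_j\| \;\prod_{j=1}^{n+1}\bigl\|P^{\delta_j}\bigr\|_{\delta_j^{-1}}
   = \prod_{j=0}^{n}\|X_j\| \,(\mathrm{Tr}\,P)^{\sum_j \delta_j}
   = \mathrm{Tr}(P)\prod_{j=0}^{n}\|X_j\| ,\\
\int_{\Sigma_n} dt_1\cdots dt_n &= \frac{\beta^{n}}{n!} .
\end{align*}
```

Since $\|\gamma\| = 1$ and $\|X_j\| \le C_j$ (each index $-r_j$ lies in $[-1,1]$), integrating the pointwise bound over the simplex $\Sigma_n$ gives the factor $\beta^n/n!$ in the statement of the lemma.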
Lemma 2.2. Let $A_i$, $i = 0, 1, \ldots, n$, be elements of $B$ such that $\dot{A}_i \in B(H)$, $\forall i$. Let $dA := i[D, A]$. Then we have the following:
(1) for $j = 1, \ldots, n$, $F_{n+1}^\beta(A_0, \ldots, A_{j-1}, \dot{A}_j, \ldots, A_{n+1}) = F_n^\beta(A_0, \ldots, A_{j-1}, A_j A_{j+1}, \ldots, A_{n+1}) - F_n^\beta(A_0, \ldots, A_{j-1} A_j, \ldots, A_{n+1})$;
(2) $F_{n+1}^\beta(\dot{A}_0, A_1, \ldots, A_{n+1}) = F_n^\beta(A_0 A_1, \ldots, A_{n+1}) - F_n^\beta(A_1, \ldots, A_{n+1}\sigma_{-1}(A_0))$;
(3) $F_{n+1}^\beta(A_0, A_1, \ldots, \dot{A}_{n+1}) = F_n^\beta(\sigma_1(A_{n+1})A_0, \ldots, A_n) - F_n^\beta(A_0, A_1, \ldots, A_n A_{n+1})$;
(4) $F_n^\beta(A_0, \ldots, A_n) = F_n^\beta(\sigma_1(A_n), A_0, \ldots, A_{n-1})$, $F_n^\beta(\sigma_1(A_0), \ldots, \sigma_1(A_n)) = F_n^\beta(A_0, \ldots, A_n)$;
(5) $\sum_{j=0}^{n-1} F_n^\beta(A_0, \ldots, A_j, 1, A_{j+1}, \ldots, A_{n-1}) = \beta F_{n-1}^\beta(A_0, \ldots, A_{n-1})$;
(6) $F_n^\beta(dA_0, \ldots, dA_n) = \sum_{j=1}^{n} (-1)^j F_n^\beta(A_0, dA_1, \ldots, dA_{j-1}, \dot{A}_j, dA_{j+1}, \ldots, dA_n)$.
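As a consistency check on (4), consider the simplest case $n = 0$, where $F_0^\beta(A_0) = \mathrm{Tr}(\gamma A_0 R e^{-\beta H})$ and both identities in (4) reduce to $F_0^\beta(\sigma_1(A_0)) = F_0^\beta(A_0)$. The computation below is a formal sketch which assumes, for simplicity, that $R$ is bounded with bounded inverse (as in a finite-dimensional model); it uses only the cyclicity of the trace and the fact that $R$ commutes with $\gamma$ and with $H$.

```latex
\begin{align*}
F_0^\beta(\sigma_1(A_0))
  &= \mathrm{Tr}\bigl(\gamma R^{-1} A_0 R\, R e^{-\beta H}\bigr)
   = \mathrm{Tr}\bigl(R^{-1}\gamma A_0 R\, e^{-\beta H} R\bigr)\\
  &= \mathrm{Tr}\bigl(\gamma A_0 R\, e^{-\beta H} R R^{-1}\bigr)
   = \mathrm{Tr}\bigl(\gamma A_0 R e^{-\beta H}\bigr)
   = F_0^\beta(A_0).
\end{align*}
```

In other words, $F_0^\beta$ is invariant under the twist $\sigma_1$, which is the $n = 0$ instance of the $\sigma$-invariance required of cochains in Sec. 2.1.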
The proofs of the above formulae are straightforward and very similar to the analogous formulae derived in [9], hence we omit the proofs. Let us now equip $A^\infty$ with the locally convex topology given by the family of norms $\|\cdot\|_{*,n}$, $n = 1, 2, \ldots$, where $\|a\|_{*,n} := \sup_{s \in [-n,n]} (\|\sigma_s(a)\| + \|[D, \sigma_s(a)]\|)$. Let $A$ denote the completion of $A^\infty$ under this topology, so that $A$ is a Fréchet space. We shall now construct the Chern character in $H^*_\epsilon(A, \sigma)$, where $\sigma = \sigma_1$, which extends to the whole of $A$ by continuity.

Theorem 2.1. Let $\phi^e \equiv (\phi_{2n})_n$ and $\phi^o \equiv (\phi_{2n+1})_n$ be defined by
\[ \begin{aligned} \phi_{2n}(a_0, \ldots, a_{2n}) &= \beta^{-n} F_{2n}^\beta(a_0, [D, a_1], \ldots, [D, a_{2n}]), \quad a_i \in A, \\ \phi_{2n+1}(a_0, \ldots, a_{2n+1}) &= \sqrt{2i}\, \beta^{-n-\frac{1}{2}} F_{2n+1}^\beta(\gamma a_0, [D, a_1], \ldots, [D, a_{2n+1}]), \quad a_i \in A. \end{aligned} \]
Then $(b + B)\phi^e = 0$, $(b + B)\phi^o = 0$, and hence $\psi^e \equiv ((2n)!\,\phi_{2n})_n \in H^e_\epsilon(A, \sigma)$ and $\psi^o \equiv ((2n+1)!\,\phi_{2n+1})_n \in H^o_\epsilon(A, \sigma)$.

Proof. Let us consider $\phi^e$ only, as the proof for $\phi^o$ is similar. First of all, we extend the definition of $\phi_{2n}$ to $B \times \cdots \times B$ ($2n + 1$ copies) by the same expression as in the statement of the present theorem, which is well defined by Lemma 2.1. Let $B^\infty$ denote the unital algebraic span of elements of the form $A^{(s)}$ for $s > 0$ and $A \in A^\infty$. Let us denote by $C^n(B^\infty)$ the space of all $(n+1)$-linear functionals on $B^\infty$ (without any continuity requirements) and extend the definitions of $b$ and $B$ to the complex $C(B^\infty) \equiv (C^n(B^\infty))_n$ by the same expression as in the case of $C^n$, i.e. for functionals on $A$. This is possible because $\sigma = \sigma_1$ is defined on the whole of $B^\infty$. Now, the formulae (1) to (6) of Lemma 2.2 are applicable to elements of $B^\infty$, and by a straightforward calculation as in [9], we can show that $(b + B)(\phi^e) = 0$ on elements of $B^\infty$. Now, to prove the same for arbitrary elements of $A^\infty$, we note that for any $A \in A^\infty$, $A^{(s)} \to A$, $[D, A^{(s)}] = [D, A]^{(s)} \to [D, A]$ and $\sigma_t(A^{(s)}) \to \sigma_t(A)$ $\forall t$, strongly as $s \to 0+$. By using the fact that $\mathrm{Tr}(B_n C) \to \mathrm{Tr}(BC)$ whenever $B_n \to B$ w.r.t. the strong operator topology and $C$ is trace-class, we conclude that the integrand in the definition of $F_{2n}^\beta\bigl(a_0^{(s)}, [D, a_1^{(s)}], \ldots, [D, a_{2n}^{(s)}]\bigr)$ converges to that with $a_j^{(s)}$ replaced by $a_j$ (for $a_j \in A^\infty$). Finally, as $\|a^{(s)}\| \leq \|a\|$, $\forall a \in B(H)$, an application of the Dominated Convergence Theorem allows us to prove that
\[ F_{2n}^\beta\bigl(a_0^{(s)}, [D, a_1^{(s)}], \ldots, [D, a_{2n}^{(s)}]\bigr) \to F_{2n}^\beta\bigl(a_0, [D, a_1], \ldots, [D, a_{2n}]\bigr), \]
hence $(b + B)\phi^e = 0$ on elements of $A^\infty$. The remaining part of the proof of the theorem is straightforward, and follows exactly in the same way as in [1].

We shall call $\psi^*$ in the above theorem the Chern character of the twisted spectral data $(A^\infty, H, D, R)$ (or $(A^\infty, H, D, R, \gamma)$ for the even case). We remark here that by an easy adaptation of the techniques of [9] and [10], we can show that the above Chern characters do not depend on our choice of $\beta$, namely, they are cohomologous for all $\beta > 0$. Furthermore, invariance of the Chern character under some suitable homotopy of the spectral data can possibly be established along the lines of the above-mentioned references. We, however, would like to consider those issues elsewhere.

3. Canonical Twisted Equivariant Spectral Data Arising from (co-)Actions of Compact Matrix Pseudogroups

In this section we shall show how one can find canonical examples of twisted spectral data from the theory of compact matrix pseudogroups of Woronowicz [17]. Let $S$ be a compact matrix pseudogroup and assume that the countable set of inequivalent irreducible (co-)representations of $S$ is indexed by, say, $\pi_1, \pi_2, \ldots$, where $\pi_n$ is $d_n$-dimensional and $\{t^n_{ij},\ i, j = 1, \ldots, d_n\}$ is a set of matrix elements for the (co-)representation $\pi_n$. Thus, $T_n \equiv ((t^n_{ij}))_{i,j=1}^{d_n}$ is a unitary element of $M_{d_n}(\mathbb{C}) \otimes S$, and the coproduct $\Delta$ and the antipode $\kappa$ are given by $\Delta(t^n_{ij}) = \sum_k t^n_{ik} \otimes t^n_{kj}$, $\kappa(t^n_{ij}) = (t^n_{ji})^*$. It is well-known that the linear span of $\{t^n_{ij},\ i, j = 1, \ldots, d_n;\ n = 1, 2, \ldots\}$ is norm-dense in $S$. Let $h$ be the Haar state on $S$, and assume that it is faithful. Let $K = L^2(S, h)$ be the GNS space associated with the state $h$ on $S$, and we imbed $S$ in $B(K)$ in the natural manner. It should be noted that we have $h(a) = \langle 1, a1 \rangle$ for $a \in S$, where $1$ denotes the identity element of $S$ viewed as a vector in $K$ and $\langle \cdot, \cdot \rangle$ denotes the inner product of $K$.
We continue to denote the normal bounded linear functional $B(K) \ni x \mapsto \langle 1, x1 \rangle$ on $B(K)$ by the same notation $h$. For the rest of the present section, let us fix a unitary (co-)representation of the quantum group $S$ on a separable Hilbert space $H$ given by a unitary element $V$ of $L(H \otimes S) \subseteq B(H) \otimes B(K)$ with additional properties to be found in, e.g. [16] and other relevant literature on compact quantum groups. Note that we have denoted by $L(H \otimes S)$ the $C^*$-algebra of adjointable linear maps on the Hilbert module $H \otimes S$. We have an equivalent picture of this given by a map $V'$ from $H$ to the Hilbert module $H \otimes S$, defined by $V'(\xi) = V(\xi \otimes 1)$. For $A \in B(H)$, we define $\Delta_A(A) = V(A \otimes I)V^* \in B(H) \otimes B(K)$. Let us assume that there is a $*$-subalgebra $A^\infty$ of $B(H)$ such that $\Delta_A(A^\infty) \subseteq A^\infty \otimes_{alg} S^\infty$, where $S^\infty$ denotes the algebraic span of the matrix elements of $S$ and their adjoints. Clearly, $\Delta_A : A^\infty \to A^\infty \otimes_{alg} S^\infty$ is a coaction of the Hopf algebra $S^\infty$. By the Peter–Weyl theory for compact quantum groups (due to Woronowicz), there exists an orthogonal decomposition $H = \bigoplus_n H_n$, where $H_n$ is of the form $H_n = \mathbb{C}^{d_n} \otimes K_n$, $K_n$ is a finite-dimensional or separable infinite-dimensional Hilbert
space. Furthermore, we can choose an orthonormal basis $\{e^n_j,\ j = 1, 2, \ldots, d_n\}$ of $\mathbb{C}^{d_n}$ such that $V'(e^n_j \otimes \xi) = \sum_i (e^n_i \otimes \xi) \otimes t^n_{ij}$ for $\xi \in K_n$. We shall use the following notation: for two linear functionals $\eta, \psi$ on $S^\infty$, we define a linear functional $\eta * \psi$ on $S^\infty$ by $(\eta * \psi)(a) := (\eta \otimes \psi)(\Delta(a))$. Similarly, given a coassociative coaction $\Delta_C : C \to C \otimes_{alg} S^\infty$ of $S^\infty$ on some algebra $C$, and a linear functional $\alpha$ on $C$, we define $(\alpha * \eta)(c) := (\alpha \otimes \eta)(\Delta_C(c))$ for $c \in C$. We note that $(\alpha * \eta) * \psi = \alpha * (\eta * \psi)$. For $a \in S^\infty$, we define an element $\psi * a \in S^\infty$ by $\psi * a := (\mathrm{id} \otimes \psi)(\Delta(a))$. We shall occasionally use the so-called Sweedler notation, which we briefly explain now. For $a \in S^\infty$, there are finitely many elements $a^i_{(1)}, a^i_{(2)}$, $i = 1, 2, \ldots, p$ (say), such that $\Delta(a) = \sum_i a^i_{(1)} \otimes a^i_{(2)}$. For notational convenience, we abbreviate this as $\Delta(a) = a_{(1)} \otimes a_{(2)}$. The notation $\Delta_C(c) = c_{(1)} \otimes c_{(2)}$ is understood similarly. For any positive integer $m$, let $C = (A^\infty)^{\otimes m}$ be the $m$-fold algebraic tensor product of $A^\infty$. There is a natural coaction of $S^\infty$ on $(A^\infty)^{\otimes m}$ given by
\[ \Delta^m_A(a_1 \otimes a_2 \otimes \cdots \otimes a_m) := (a_{1(1)} \otimes \cdots \otimes a_{m(1)}) \otimes (a_{1(2)} \cdots a_{m(2)}), \]
using the Sweedler notation, with summation being implied. Let us now recall a few facts from the general theory of compact matrix pseudogroups.

Lemma 3.1 (Proved in [17]). For each $n = 1, 2, \ldots$, there is a unique $d_n \times d_n$ complex matrix $F_n$ with the following properties.
(1) $F_n$ is positive and invertible with $\mathrm{Tr}(F_n) = \mathrm{Tr}(F_n^{-1}) = M_n > 0$, say.
(2) $h(t^n_{ij} (t^n_{kl})^*) = \frac{1}{M_n} \delta_{ik} F_n(j, l)$, where $\delta_{ik}$ is the Kronecker delta.
(3) For any complex number $z$, let $\phi_z$ be the functional on $S^\infty$ defined by $\phi_z(t^n_{ij}) = (F_n)^z(j, i)$. Then each $\phi_z$ is multiplicative, $(\phi_z * 1) = 1$, $\phi_z * \phi_w = \phi_{z+w}$, and for any fixed element $a \in S^\infty$, $z \mapsto \phi_z(a)$ is a complex analytic map.

Using the above lemma, we prove the following.

Lemma 3.2. Let $\phi : S^\infty \to \mathbb{C}$ be a linear functional. We define a linear map $F_\phi$ on $H$ with the domain consisting of all $\xi \in H$ such that $V'\xi \in H \otimes_{alg} S^\infty$, and set $F_\phi(\xi) = (\mathrm{id} \otimes \phi)(V'\xi)$ for any $\xi$ in the above domain. Then we have the following:
(1) $F_\phi$ is densely defined.
(2) For $a \in A^\infty$, $a\,\mathrm{Dom}(F_\phi) \subseteq \mathrm{Dom}(F_\phi)$, and in case $\phi$ is multiplicative, $F_\phi(a\xi) = (\phi * a)F_\phi(\xi)$ for $\xi \in \mathrm{Dom}(F_\phi)$, where $(\phi * a) := (\mathrm{id} \otimes \phi)(\Delta_A(a))$.
(3) For any $a \in A^\infty$, $\mathbb{C} \ni z \mapsto \phi_z * a$ is complex analytic.

Proof. It follows from the decomposition of $H$ into the subspaces $H_n = \mathbb{C}^{d_n} \otimes K_n$ as discussed before, and the form of $V'$, that $H_n \subseteq \mathrm{Dom}(F_\phi)$ for each $n$. This proves (1). For (2), we first note that for $\xi \in \mathrm{Dom}(F_\phi)$ and $a \in A^\infty$, we have $V'(a\xi) = V(a\xi \otimes 1) = V(a \otimes 1)V^* V(\xi \otimes 1) = \Delta_A(a)V'\xi$ (where an element of $A^\infty \otimes_{alg} S$ acts naturally from the left by multiplication on $H \otimes S$), which shows
that $a\xi \in \mathrm{Dom}(F_\phi)$. Finally, if $\phi$ is multiplicative, we have $(\mathrm{id} \otimes \phi)(\Delta_A(a)V'\xi) = (\mathrm{id} \otimes \phi)(\Delta_A(a))\,(\mathrm{id} \otimes \phi)(V'\xi)$. From this, it follows that $F_\phi(a\xi) = (\phi * a)F_\phi(\xi)$. To prove (3), we first note that by our assumption $\Delta_A(a) \in A^\infty \otimes_{alg} S^\infty$ for $a \in A^\infty$. Thus, for any fixed $a \in A^\infty$, there are elements $a_1, \ldots, a_m \in A^\infty$; $b_1, \ldots, b_m \in S^\infty$ ($m$ a positive integer) such that $\Delta_A(a) = \sum_{i=1}^m a_i \otimes b_i$, hence $\phi_z * a = \sum_i a_i \phi_z(b_i)$. Now, (3) follows immediately from the fact that $z \mapsto \phi_z(b_i)$ is complex analytic for each $b_i$, as observed in Lemma 3.1.

Let us now take $R = F_{\phi_1}$ on $H$, where $\phi_1$ is as in Lemma 3.1. With this choice, we have the following result.

Lemma 3.3. Let $L \in B(H)$ be a bounded, positive, equivariant (i.e. $L \otimes I_K$ and $V$ commute with each other) operator. Then we have:
(1) $R$ and $L$ commute with each other.
(2) If we assume furthermore that $RL$ is trace-class (i.e. $RL$ has a bounded extension which is trace-class), and define a bounded linear normal functional $\chi$ on $B(H)$ by $\chi(A) = \mathrm{Tr}(ARL)$, $A \in B(H)$, then $(\chi \otimes h)(V(A \otimes I)V^*) = \chi(A)$, where $h$ denotes the extension of the Haar state to $B(L^2(S, h))$ given by $h(x) = \langle 1, x1 \rangle$, as mentioned before.

Proof. (1) Since $L$ is equivariant, it must be of the form $L = \bigoplus_n (I_{\mathbb{C}^{d_n}} \otimes L_n)$, with respect to the decomposition of $H$ mentioned earlier. Here each $L_n$ is a positive bounded operator on $K_n$. On the other hand, by definition $R$ is of the form $R = \bigoplus_n (R_n \otimes I_{K_n})$, where $R_n$ is a positive, invertible operator on the $d_n$-dimensional space, given by $R_n e^n_j = \sum_l F_n(j, l) e^n_l$. Thus, $R$ and $L$ commute with each other.
(2) The assumption of $RL$ being trace-class implies in particular that $R_n \otimes L_n$ is trace-class for each $n$ (by noting the operator inequality $0 \leq (R_n \otimes L_n) \leq RL$), thus $I_{\mathbb{C}^{d_n}} \otimes L_n = (R_n^{-1} \otimes I_{K_n})(R_n \otimes L_n)$, and hence $L_n$ is trace-class. So, we can choose a diagonalizing basis for $L_n$, i.e. an orthonormal basis $\xi^n_k$, $k = 1, 2, \ldots, m_n$ (where $m_n = \dim(K_n)$, $0 \leq m_n \leq \infty$) and a sequence of positive numbers $\lambda_{n,k}$, $k = 1, 2, \ldots, m_n$, such that $\sum_k \lambda_{n,k} < \infty$ for each $n$, and $L_n \xi^n_k = \lambda_{n,k} \xi^n_k$. Let us denote
by $e^{n,k}_i$ the vector $e^n_i \otimes \xi^n_k$, where the $e^n_i$'s are as mentioned earlier. Clearly, $\{e^{n,k}_i;\ n, i, k\}$ is an orthonormal basis for $H$. We first note that $V^*(e^{n,k}_i \otimes 1) = \sum_j e^{n,k}_j \otimes (t^n_{ij})^*$. Thus, since $M_n = \mathrm{Tr}(F_n)$, we have
\[ \begin{aligned} (\chi \otimes h)(V(a \otimes I)V^*) &= \sum_{n,i,j} \bigl\langle e^{n,i}_j \otimes 1,\ V(a \otimes I)V^*\bigl(RL(e^{n,i}_j) \otimes 1\bigr) \bigr\rangle \\ &= \sum_{n,i,j,r} \bigl\langle V^*(e^{n,i}_j \otimes 1),\ (a \otimes I)V^*\bigl(\lambda_{n,i} F_n(j, r)\, e^{n,i}_r \otimes 1\bigr) \bigr\rangle \\ &= \sum_{n,i,j,r,k,l} \lambda_{n,i} F_n(j, r) \bigl\langle e^{n,i}_k \otimes (t^n_{jk})^*,\ (a \otimes I)\bigl(e^{n,i}_l \otimes (t^n_{rl})^*\bigr) \bigr\rangle \\ &= \sum_{n,i,j,r,k,l} \lambda_{n,i} F_n(j, r) \bigl\langle e^{n,i}_k,\ a e^{n,i}_l \bigr\rangle\, h\bigl(t^n_{jk} (t^n_{rl})^*\bigr) \\ &= \sum_{n,i,j,k,l} \lambda_{n,i} F_n(j, j) \bigl\langle e^{n,i}_k,\ a e^{n,i}_l \bigr\rangle \frac{F_n(k, l)}{M_n} \\ &= \sum_{n,i,k} \Bigl( \sum_j \frac{F_n(j, j)}{M_n} \Bigr) \lambda_{n,i} \Bigl\langle e^{n,i}_k,\ a \Bigl( \sum_l F_n(k, l)\, e^{n,i}_l \Bigr) \Bigr\rangle \\ &= \chi(a). \end{aligned} \]

Corollary 3.1. Let us consider the case when $A = S$, $A^\infty = S^\infty$, $H = K = L^2(S, h)$, $R = F_{\phi_1}$ on $L^2(S, h)$, and let $V$ be the unitary associated with the canonical left regular representation of $S$ in $L^2(S, h)$. Given any positive nonzero operator $L$ on $L^2(S, h)$ such that $L(t^n_{ij}) = \lambda_n t^n_{ij}$ with $\sum_n M_n \lambda_n < \infty$, we can recover the Haar state by the following formula:
\[ h(a) = \frac{\mathrm{Tr}(aRL)}{\mathrm{Tr}(RL)}, \quad a \in S. \]

Proof. From the property of $L$, it is clear that $RL$ is a positive trace-class operator, and $L$ is also equivariant. Thus, by Lemma 3.3, we conclude that $(\eta \otimes h)(V(a \otimes I)V^*) = \eta(a)$ for all $a \in S$, where $\eta(a) = \mathrm{Tr}(aRL)$. Since $V(a \otimes I)V^* = \Delta(a)$ in this case, we have $\eta * h = \eta$. However, $\alpha * h = \alpha(1)h$ for any bounded linear functional $\alpha$ on $S$, so we must have $\eta = \eta(1)h$, or, $h(a) = \frac{\eta(a)}{\eta(1)}$, as $\eta(1) = \mathrm{Tr}(RL)$ is nonzero, since the operator $RL$ is a nonzero positive operator. This completes the proof of the corollary.

The above corollary generalizes a similar result obtained in [8] for $SU_q(2)$. We shall now prove the main result of this section. We say that an $m$-linear functional $\chi$ on $A^\infty$ is $S^\infty$-invariant (or simply invariant if no confusion arises) if $\chi(a_{1(1)}, \ldots, a_{m(1)})\, a_{1(2)} \cdots a_{m(2)} = \chi(a_1, \ldots, a_m) 1_S$, or, equivalently, $\chi * \lambda = \lambda(1)\chi$ for any bounded linear functional $\lambda$ on $S$.

Theorem 3.1. Assume that $(A^\infty, H, D)$ is an odd equivariant spectral triple in the sense of [6], i.e. $D$ is a self-adjoint operator on $H$ satisfying $[D, a] \in B(H)$ for $a \in A^\infty$, and $((i\lambda - D)^{-1} \otimes I)V = V((i\lambda - D)^{-1} \otimes I)$ for all $\lambda \in \mathbb{R}$. Then $(A^\infty, H, D, R)$ is an $R$-twisted odd spectral data. Similarly, we obtain a twisted even spectral data from a given even equivariant spectral data with an equivariant grading operator. Moreover, if $Re^{-\beta D^2}$ is trace-class for all $\beta > 0$, then the associated Chern characters are invariant.
Proof. Since the resolvent operators $(i\lambda - D)^{-1}$, $\lambda \in \mathbb{R}$, are equivariant bounded operators, it follows from Lemma 3.3 that $(i\lambda - D)^{-1}$ commutes with $R$ for all $\lambda \in \mathbb{R}$, and thus the self-adjoint operators $D$ and $R$ commute with each other. Furthermore, for $\xi$ in the algebraic direct sum of the $H_n$'s, it is clear from the definition of $R$ that $R^s \xi = F_{\phi_s} \xi$ for all $s \in \mathbb{R}$. Using (2) of Lemma 3.2, it is easy to see that for $a \in A^\infty$, $R^{-s} a R^s \xi = (\phi_{-s} * a)R^{-s}R^s \xi = (\phi_{-s} * a)\xi$. Thus, for any $s \in \mathbb{R}$, $\sigma_s(a) := R^{-s} a R^s$ admits a bounded extension, namely $\phi_{-s} * a$, which belongs to $A^\infty$. Moreover, by Lemma 3.2, $\mathbb{C} \ni z \mapsto \phi_z * a$ is complex analytic, so in particular continuous, hence $\sup_{s \in [-m,m]} \|\phi_{-s} * a\| < \infty$. This proves that $(A^\infty, H, D, R)$ is an $R$-twisted odd spectral data. A similar proof can be given when the spectral data is even.

Let us now verify the invariance of the associated Chern character, under the additional assumption that $Re^{-tD^2}$ is trace-class for all $t > 0$. Denote by $\chi_s$ the functional $\chi_s(A) := \mathrm{Tr}(ARe^{-sD^2})$ for any $s > 0$, $A \in B(H)$. By Lemma 3.3, we have $(\chi_s \otimes h)(V(A \otimes I)V^*) = \chi_s(A)$ for all $s > 0$, $A \in B(H)$. It is easy to see that since for any $s > 0$, $V$ and $e^{-sD^2} \otimes I$ commute with each other (by the assumed equivariance of the resolvent operators of $D$), one has for $a \in A^\infty$, $V(ae^{-sD^2} \otimes I)V^* = \Delta_A(a)(e^{-sD^2} \otimes I)$. For $s_0, \ldots, s_n > 0$, let $\eta$ be the $(n+1)$-linear functional on $A^\infty$ given by $\eta(a_0, \ldots, a_n) = \mathrm{Tr}(a_0 e^{-s_0 D^2}[D, a_1]e^{-s_1 D^2} \cdots [D, a_n]e^{-s_n D^2}R)$. Clearly, by using the equivariance mentioned earlier, it follows that $D \otimes I$ and $V$ commute with each other, and thus we have
\[ \begin{aligned} V\bigl((a_0 e^{-s_0 D^2}[D, a_1]e^{-s_1 D^2} \cdots [D, a_n]) \otimes I\bigr)V^* &= \Delta_A(a_0) \cdot (e^{-s_0 D^2} \otimes I) \cdot [(D \otimes I), \Delta_A(a_1)] \cdots [(D \otimes I), \Delta_A(a_n)] \\ &= \bigl(a_{0(1)} e^{-s_0 D^2}[D, a_{1(1)}]e^{-s_1 D^2} \cdots [D, a_{n(1)}]\bigr) \otimes \bigl(a_{0(2)} \cdots a_{n(2)}\bigr), \end{aligned} \]
using the Sweedler notation with summation implied. Thus, we have
\[ \begin{aligned} \eta(a_{0(1)}, \ldots, a_{n(1)})\, h(a_{0(2)} \cdots a_{n(2)}) &= (\chi_{s_n} \otimes h)\bigl((a_{0(1)} e^{-s_0 D^2}[D, a_{1(1)}]e^{-s_1 D^2} \cdots [D, a_{n(1)}]) \otimes (a_{0(2)} \cdots a_{n(2)})\bigr) \\ &= (\chi_{s_n} \otimes h)\bigl(V((a_0 e^{-s_0 D^2}[D, a_1]e^{-s_1 D^2} \cdots [D, a_n]) \otimes I)V^*\bigr) \\ &= \chi_{s_n}\bigl(a_0 e^{-s_0 D^2}[D, a_1]e^{-s_1 D^2} \cdots [D, a_n]\bigr) \\ &= \mathrm{Tr}\bigl(a_0 e^{-s_0 D^2}[D, a_1]e^{-s_1 D^2} \cdots [D, a_n]Re^{-s_n D^2}\bigr) = \eta(a_0, \ldots, a_n), \end{aligned} \]
which can be written as $\eta * h = \eta$, in the notation introduced earlier. From this the required invariance of the odd Chern characters follows, because by using the fact that $h * \lambda = \lambda(1)h$ for any bounded linear functional $\lambda$ on $S$, we get $\eta * \lambda = (\eta * h) * \lambda = \eta * (h * \lambda) = \lambda(1)\,\eta * h = \lambda(1)\eta$. Similarly, the case of the even Chern characters can be treated.
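The two convolution identities used above, $h * \lambda = \lambda(1)h$ and $\alpha * h = \alpha(1)h$, can be checked by hand in a classical toy model. The sketch below is purely illustrative and rests on assumptions not made in the text: the quantum group $S$ is replaced by the algebra of functions on the cyclic group $\mathbb{Z}_3$, the coproduct is $(\Delta f)(g_1, g_2) = f(g_1 + g_2)$, and a functional is stored as a list of weights. The names `convolve`, `haar` and `lam` are hypothetical.

```python
from itertools import product

# Illustrative assumption: Z_3 stands in for the compact quantum group S.
N = 3
G = range(N)

def convolve(eta, psi):
    """Weights of (eta * psi)(f) = (eta ⊗ psi)(Δ f), where (Δ f)(g1, g2) = f(g1 + g2 mod N).

    A functional on functions on G is stored as a list w of weights,
    acting by f -> sum_g w[g] * f(g).
    """
    out = [0.0] * N
    for g1, g2 in product(G, G):
        out[(g1 + g2) % N] += eta[g1] * psi[g2]
    return out

haar = [1.0 / N] * N        # Haar state: uniform average over the group
lam = [0.5, -2.0, 4.0]      # an arbitrary functional; lam(1) is the sum of its weights
lam_of_one = sum(lam)

h_lam = convolve(haar, lam)                 # h * lam
lam_h = convolve(lam, haar)                 # lam * h
expected = [lam_of_one * w for w in haar]   # lam(1) h

print(h_lam, lam_h, expected)
```

In this commutative model both convolutions return the same multiple of the Haar weights (up to rounding), which is the mechanism behind the step $\eta * \lambda = (\eta * h) * \lambda = \lambda(1)\eta$ in the proof; the noncommutative case of course requires the Haar state of [17] rather than a uniform average.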
We have already remarked that for typical nonclassical examples of compact matrix pseudogroups, we are forced to consider the $\Theta$-summable case rather than the finitely summable case. Now, we shall show that this assertion can be made a bit more definitive.

Proposition 3.1. Let $A = S$, $A^\infty = S^\infty$, $H = L^2(S, h)$ and $V$ be the unitary corresponding to the left regular representation as in Corollary 3.1. We assume that there is some $n$ such that $F_n$ is not equal to $I_{d_n}$ (cf. Lemma 3.1 for the definition of $F_n$). Suppose, furthermore, that we are given an equivariant spectral triple $(S^\infty, H, D)$ satisfying $[|D|, a], [|D|, [|D|, a]] \in B(H)$, $\forall a \in S^\infty$. Then, there cannot be any finite positive number $p$ such that $\lim_{t \to 0+}(t^p\, \mathrm{Tr}(Re^{-tD^2})) = C$, $0 < C < \infty$, for some suitable Banach limit $\lim$ on the space of bounded functions on $\mathbb{R}_+$, as considered in [7] and elsewhere.

Proof. Suppose that the assertion of the proposition is false, and we are indeed given an equivariant spectral triple $(S^\infty, H, D)$ such that $[|D|, a], [|D|, [|D|, a]] \in B(H)$ $\forall a \in S^\infty$ and there are positive constants $p$ and $C$ such that $\lim_{t \to 0+} t^p\, \mathrm{Tr}(Re^{-tD^2}) = C$. In particular, there will be a constant $M$ such that $t^p\, \mathrm{Tr}(Re^{-tD^2}) \leq M$ for small enough $t$. We have already seen that
\[ h(a) = \lim_{t \to 0+} \frac{\mathrm{Tr}(aRe^{-tD^2})}{\mathrm{Tr}(Re^{-tD^2})} \]
for all $a \in S$. We now claim that for $a, b \in S^\infty$, $h(ab) = h(\sigma(b)a)$, where $\sigma(b) = \phi_{-1} * b$. Before we prove this claim, we argue how it leads to a contradiction and hence completes the proof. It is shown in [17] that $h(ab) = h((\phi_{-1} * b * \phi_{-1})a)$, where $(b * \phi) := (\phi \otimes \mathrm{id})(\Delta(b))$. Thus, we get $h((\phi_{-1} * b * \phi_{-1})a) = h(\sigma(b)a)$ for all $a, b \in S^\infty$, hence $(\sigma(b) * \phi_{-1}) = \sigma(b)$ $\forall b \in S^\infty$, using the fact that $h$ is faithful by assumption. This implies that $b * \phi_1 = b$ for all $b \in S^\infty$, since $\sigma$ is an automorphism. But this is possible only if $F_n = I_{d_n}$, $\forall n$, which is contradictory to the assumption of the present proposition. So, it is enough to prove that $h(ab) = h(\sigma(b)a)$. We have
\[ \begin{aligned} R[e^{-tD^2}, a] &= -t \int_0^1 Re^{-tsD^2}[D^2, a]e^{-t(1-s)D^2}\, ds \\ &= -t \int_0^1 Re^{-tsD^2}\bigl(2|D|[|D|, a] - [|D|, [|D|, a]]\bigr)e^{-t(1-s)D^2}\, ds \\ &= -t \int_0^1 Re^{-tsD^2}\bigl(2[|D|, a]|D| + [|D|, [|D|, a]]\bigr)e^{-t(1-s)D^2}\, ds, \end{aligned} \]
since $[D^2, a] = 2|D|[|D|, a] - [|D|, [|D|, a]] = 2[|D|, a]|D| + [|D|, [|D|, a]]$. By standard estimates, one can now show that $t^p$ times the above expression goes to 0 in trace-norm as $t \to 0+$. To do this, we consider the integral from $0$ to $\frac{1}{2}$ and from $\frac{1}{2}$ to $1$ separately. Fix some $\alpha > p$, and $\beta > 0$ such that $\frac{1}{\alpha} + \frac{1}{\beta} = 1$. It
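is easy to see by a straightforward simplification, using the fact that $R$ commutes with $D$ (and hence with $e^{-sD^2}$ and $|D|$), that
\[ \int_0^{\frac{1}{2}} Re^{-tsD^2}\bigl(2[|D|, a]|D| + [|D|, [|D|, a]]\bigr)e^{-t(1-s)D^2}\, ds = \int_0^{\frac{1}{2}} R^{\frac{1}{\alpha}} e^{-tsD^2}(B_1 + B_2|D|)R^{\frac{1}{\beta}} e^{-\frac{t}{2}(1-s)D^2} e^{-\frac{t}{2}(1-s)D^2}\, ds, \]
where
\[ B_1 = R^{\frac{1}{\beta}}[|D|, [|D|, a]]R^{-\frac{1}{\beta}} = [|D|, [|D|, (\phi_{-\frac{1}{\beta}} * a)]], \qquad B_2 = 2R^{\frac{1}{\beta}}[|D|, a]R^{-\frac{1}{\beta}} = 2[|D|, (\phi_{-\frac{1}{\beta}} * a)], \]
since $R^{\frac{1}{\beta}}$ and $|D|$ commute. Note that $\phi_{-\frac{1}{\beta}}$ is as in Lemma 3.1, and also recall from Lemma 3.2 that $\phi_{-\frac{1}{\beta}} * a \in S^\infty$ for $a \in S^\infty$, so $B_1$, $B_2$ are bounded operators. It is easy to see that $\||D|e^{-uD^2}\| \leq C_1 u^{-\frac{1}{2}}$, where $C_1 = \sup_{0 \leq x < \infty} xe^{-x^2}$. Choose a small enough $t$ such that the inequality $t^p\, \mathrm{Tr}(Re^{-tD^2}) \leq M$ is valid. By Hölder's inequality (for Schatten ideals), we have
\[ \begin{aligned} & t \int_0^{\frac{1}{2}} \bigl\| R^{\frac{1}{\alpha}} e^{-tsD^2}(B_1 + B_2|D|)R^{\frac{1}{\beta}} e^{-\frac{t}{2}(1-s)D^2} e^{-\frac{t}{2}(1-s)D^2} \bigr\|_1\, ds \\ &\quad \leq t \int_0^{\frac{1}{2}} \bigl\| R^{\frac{1}{\alpha}} e^{-tsD^2} \bigr\|_\alpha \bigl( \|B_1\| + \|B_2\| \bigl\| |D|e^{-\frac{t}{2}(1-s)D^2} \bigr\| \bigr) \bigl\| R^{\frac{1}{\beta}} e^{-\frac{t}{2}(1-s)D^2} \bigr\|_\beta\, ds \\ &\quad \leq t \int_0^{\frac{1}{2}} \bigl( \mathrm{Tr}(Re^{-\alpha tsD^2}) \bigr)^{\frac{1}{\alpha}} \bigl( \|B_1\| + C_1\|B_2\| (t(1-s)/2)^{-\frac{1}{2}} \bigr) \bigl( \mathrm{Tr}(Re^{-\beta t(1-s)D^2/2}) \bigr)^{\frac{1}{\beta}}\, ds \\ &\quad \leq t\, \alpha^{-\frac{p}{\alpha}} \beta^{-\frac{p}{\beta}} (Mt^{-p})^{(\frac{1}{\alpha} + \frac{1}{\beta})} \int_0^{\frac{1}{2}} \bigl( \|B_1\| + C_1\|B_2\| (t(1-s)/2)^{-\frac{1}{2}} \bigr) s^{-\frac{p}{\alpha}} ((1-s)/2)^{-\frac{p}{\beta}}\, ds \\ &\quad \leq C' t^{1-p} \bigl( \|B_1\| + 2C_1\|B_2\| t^{-\frac{1}{2}} \bigr) \int_0^{\frac{1}{2}} s^{-\frac{p}{\alpha}}\, ds \\ &\quad = C' t^{1-p} \bigl( \|B_1\| + 2C_1\|B_2\| t^{-\frac{1}{2}} \bigr) \Bigl(1 - \frac{p}{\alpha}\Bigr)^{-1} \Bigl(\frac{1}{2}\Bigr)^{1 - \frac{p}{\alpha}} = o(t^{-p}), \end{aligned} \]
where $C'$ is some constant, and we have used the fact that $(1 - s) \geq \frac{1}{2}$ for $0 \leq s \leq \frac{1}{2}$, and also that $1 - \frac{p}{\alpha} > 0$ by our choice of $\alpha$. It should also be noted that in the above we have denoted by $\|\cdot\|$ the operator norm and by $\|\cdot\|_\mu$ the $\mu$th Schatten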
norm of an operator. The integral from $\frac{1}{2}$ to 1 can be estimated in a similar way. Thus, $t^p \|R[e^{-tD^2}, a]\|_1 \to 0$ as $t \to 0+$.
We now recall that by Corollary 3.1, $t^p\,\mathrm{Tr}(Re^{-tD^2})\,h(x) = t^p\,\mathrm{Tr}(x Re^{-tD^2})$ for any $t > 0$, $x \in S$. As $C = \lim_{t \to 0+} t^p\,\mathrm{Tr}(Re^{-tD^2})$ is finite and nonzero, we have $\lim_{t \to 0+} t^p\,\mathrm{Tr}(x Re^{-tD^2}) = C h(x)$. Thus, for $a, b \in S^\infty$,
\[
\begin{aligned}
h(\sigma(b)a) &= \frac{1}{C} \lim_{t \to 0+} t^p\,\mathrm{Tr}(\sigma(b) a R e^{-tD^2}) \\
&= \frac{1}{C} \lim_{t \to 0+} t^p\,\mathrm{Tr}(a R e^{-tD^2} \sigma(b)) \\
&= \frac{1}{C} \lim_{t \to 0+} t^p\,\mathrm{Tr}(a R \sigma(b) e^{-tD^2}) ,
\end{aligned}
\]
since $t^p |\mathrm{Tr}(aR[e^{-tD^2}, \sigma(b)])| \le t^p \|a\| \|R[e^{-tD^2}, \sigma(b)]\|_1 \to 0$ as $t \to 0+$. However,
\[
\begin{aligned}
\frac{1}{C} \lim_{t \to 0+} t^p\,\mathrm{Tr}(a R \sigma(b) e^{-tD^2}) &= \frac{1}{C} \lim_{t \to 0+} t^p\,\mathrm{Tr}(a R R^{-1} b R e^{-tD^2}) \\
&= \frac{1}{C} \lim_{t \to 0+} t^p\,\mathrm{Tr}(a b R e^{-tD^2}) \\
&= h(ab) .
\end{aligned}
\]
This completes the proof.
Remark 3.1. In the above Proposition 3.1, we have taken $A = S$. The conclusion of this proposition may not be valid if one considers more general $A$, for example the quantum homogeneous spaces. The argument that $b * \phi_{-1} = b$ cannot hold for all $b \in S^\infty$ when $F$ is not the identity breaks down if one replaces $S$ by a general quantum homogeneous space $A$; in particular it is not valid for the standard quantum 2-sphere associated with $SU_q(2)$. In fact, for all elements $b$ of this quantum sphere (imbedded into $SU_q(2)$ in the canonical way) one has $b * \phi_{-1} = b$. Thus, we cannot rule out the possibility of constructing a Dirac operator $D$ on such homogeneous spaces with a "twisted" finite summability condition in the sense that $\mathrm{Tr}(Re^{-tD^2}) = O(t^{-p})$ (as $t \to 0+$, $p$ a finite positive number). Indeed, the recent Dirac operator on the standard Podleś 2-sphere for $SU_q(2)$ constructed in [15] satisfies such a condition for $p = 2$. This very interesting example of spectral data obtained in [15] is $\epsilon$-summable for every $\epsilon > 0$ in the usual (untwisted) sense of summability, which can be interpreted as "metric dimension = 0"; whereas the fact that it has dimension 2 from the viewpoint of the theory of twisted spectral data is in agreement with the classical intuition, as the classical 2-sphere is indeed a 2-dimensional manifold.
Remark 3.2. If we look at the proof of Theorem 3.1 carefully, we notice that there is some flexibility in the choice of $R$. In fact, the conclusion of the theorem remains valid if we replace the canonical $R$ chosen by us by some operator of the form $R_1 = RR'$, where $R'$ is a positive operator having the $e^{n,k}_j$'s as a complete set of eigenvectors with $R'(e^{n,k}_j) = \mu_k e^{n,k}_j$, where the $\mu_k$'s are such
that $R_1 e^{-\beta D^2}$ is trace-class for all $\beta > 0$. In the context of $SU_q(2)$, this flexibility of choice will play a crucial role. However, it should be noted that the conclusion of Proposition 3.1 may no longer hold if we change $R$.
Remark 3.3. The twisted Chern character can be paired with the equivariant K-theory, i.e. the K-theory of the subalgebra $A_{inv} \equiv \{a \in A : \Delta_A(a) = a \otimes 1_S\}$. In fact, from the special form of $\sigma$, it is easily seen that $A_{inv} \subseteq A_\sigma$, hence we can restrict the pairing $\langle\cdot,\cdot\rangle_{\sigma,\epsilon}$ to $K_*(A_{inv}) \times H^*(A, \sigma)$ to get the desired map. Furthermore, since the equivariant Dirac operator $D$ decomposes into a direct sum of operators $D_\pi$, indexed by irreducible (co-)representations $\pi$ of $S$, and any projection of $A_{inv}$ also naturally respects this decomposition, one can consider the twisted Chern character corresponding to the spectral data given by any fixed $P_\pi D P_\pi$, where $P_\pi$ denotes the projection onto the subspace corresponding to $\pi$. The corresponding pairing assigns to each element of $K_*(A_{inv})$ a complex number depending on $\pi$, and thus one gets a map from the set of irreducible (co-)representations of $S$ to the dual of the K-theory, which may be formally thought of as some kind of "character-valued index". However, any attempt to give this a rigorous meaning requires first of all a generalization of equivariant entire cyclic cohomology as discussed, for example, in [11] and related works of other authors, to the framework of compact quantum groups.
We shall conclude with some discussion of the case of $SU_q(2)$. We recall from Sec. 2 that in general the twisted entire cyclic cohomology $H^*(A, \sigma)$ pairs with the K-theory of only a subalgebra $A_\sigma$ of $A$, and not with that of $A$. However, it may sometimes turn out that the subalgebra $A_\sigma$ is large enough to capture the K-theory of $A$ itself. We shall see that this is indeed the case if we consider $SU_q(2)$.
Let us recall the notation of [6], where the generators of $SU_q(2)$ were denoted by $\alpha, \beta$, and $u = I_{\{1\}}(\beta^*\beta)(\beta - I) + I$ was chosen to be the generator of $K_1(SU_q(2))$, which is $\mathbb{Z}$. It is easily seen that the map from $K_1(C^*(u))$ to $K_1(SU_q(2))$, induced by the inclusion map, is an isomorphism of the $K_1$-groups (where $C^*(u)$ denotes the unital $C^*$-algebra generated by $u$). Now our aim is to construct an appropriate twisted spectral triple so that the associated fixed point subalgebra $SU_q(2)_\sigma$ will contain $u$. To do this, we have to refer to Remark 3.2 made earlier. Let us deviate a little from our notational convention in this section, and index the space of irreducible (co-)representations of $SU_q(2)$ by half-integers, i.e. $n = 0, \frac{1}{2}, 1, \ldots$; and index the orthonormal basis of the corresponding $(2n+1)^2$-dimensional subspace of $L^2(SU_q(2), h)$ by $i, j = -n, \ldots, n$, instead of $1, 2, \ldots, (2n+1)$. Thus, let us consider the orthonormal basis $e^n_{i,j}$, $n = 0, \frac{1}{2}, \ldots$; $i, j = -n, -n+1, \ldots, n$ in the notation of [6]. We consider any of the equivariant spectral triples constructed by the authors of [6] and in the associated Hilbert space $H = L^2(SU_q(2), h)$ choose a suitable $R'$ as in Remark 3.2, given by $R' e^n_{i,j} = q^{-2i} e^n_{i,j}$, so that the new choice of $R$, i.e. $R_1$, actually coincides with the $R$ in [8]. We have $R_1(e^n_{i,j}) = q^{-2i-2j} e^n_{i,j}$,
$n = 0, \frac{1}{2}, 1, \ldots$; $i, j = -n, -n+1, \ldots, n$. Let us choose a spectral triple given by the Dirac operator $D$ on $H$, defined by $D(e^n_{i,j}) = d(n, i) e^n_{i,j}$, where $d(n, i)$ are as in (3.12) of [6], i.e. $d(n, i) = 2n + 1$ if $n = i$, $d(n, i) = -(2n + 1)$ otherwise. It can be easily seen that $(S^\infty, H, D, R_1)$ is an odd $R_1$-twisted spectral data, where $S = SU_q(2)$, and furthermore, the fixed point subalgebra $SU_q(2)_\sigma$ for $\sigma(\cdot) = R_1^{-1} \cdot R_1$ is the unital $*$-algebra generated by $\beta$, so in particular it contains $u$, allowing us to consider the pairing of the twisted Chern character with $K_1(C^*(u))$, and in turn with $K_1(SU_q(2))$ using the isomorphism noted before. The important question is whether we recover the nontrivial pairing obtained in [6] in our twisted framework. We would like to conjecture that the answer to this question is affirmative. However, it is necessary to build some more technical tools before we can attempt to investigate this, which we postpone to future work.
4. The New Notion of Invariance Due to Connes and its Implications
In this final section, we shall briefly recall the recent formulation of invariance by Connes in [3]. In fact, there is an important remark in the above mentioned article by Connes on a previous version of the present article, which motivates us to add this new section. Our aim is to understand the remark of Connes by working out some analytical details. Let $S'_\infty$ be the dual of $S^\infty$, i.e. the space of linear functionals on $S^\infty$, equipped with the algebra structure obtained by dualizing the coproduct of $S$, or equivalently, by taking $f \cdot g = f * g$ for $f, g \in S'_\infty$. Let us consider the setup of the previous section, and let $H, V, A, A^\infty$ be as in the beginning of the previous section. Let $\mathcal{O}$ denote the $*$-algebra of (possibly unbounded) operators $L$ from $H$ to $H$ having $\mathcal{D} \equiv \oplus^{\mathrm{alg}} H_n$ in the domain, and such that this domain is invariant under the action of $L$. 
In the notation of Lemma 3.2, it is straightforward to verify that $S'_\infty \ni \eta \mapsto F_\eta \in \mathcal{O}$ is an algebra homomorphism. Let us assume furthermore that $A^\infty(\mathcal{D}) \subseteq \mathcal{D}$. We consider the $*$-algebra $B_0$ generated by $A^\infty$, $R = F_{\phi_1}$, and $F_{\chi^n_{ij}}$; $i, j = 1, \ldots, d_n$, where $\phi_1$ is as in Sec. 3, and $\chi^n_{ij}$ is defined by $\chi^n_{ij}(t^m_{kl}) = 1$ if $n = m$, $i = k$, $j = l$; and 0 otherwise. Clearly, $F_{\chi^n_{ij}}$ is a finite-rank operator (in fact its range is a subset of $H_n$), and $R^{-1} F_{\chi^n_{ij}} R = \oplus_m T_m$, with $T_m$ nonzero for $m = n$ only. Thus, $R^{-1} F_{\chi^n_{ij}} R$ is finite-rank, in particular bounded. Furthermore, it is easily seen by using the definition of $R$ and the $F_{\chi^n_{ij}}$'s that $R^{-1} F_{\chi^n_{ij}} R$ can be expressed as a linear combination of the operators $F_{\chi^n_{kl}}$, $k, l = 1, \ldots, d_n$. It is easy to note that for the regular representation of $SU_q(2)$, $B_0$ will be nothing but $SU_q(2) \rtimes U$ in the notation of [3]. Following the suggestion of [3], we would like to construct a natural $\Theta$-summable spectral triple on $B_0$ using the given equivariant spectral triple $(A^\infty, H, D)$. However, there is some conceptual difficulty involved here, since $B_0$ is no longer acting on $H$ as a subset of bounded operators, and $[D, b]$ is not bounded for a general element $b \in B_0$. Thus, we are led to consider
the following modified definition of a $\Theta$-summable spectral data. Let us mention here that given a dense subspace $K_0$ of a Hilbert space $K$, we shall denote by $\mathcal{O}_{K_0}$ the set of (possibly unbounded) operators on $K$ which have $K_0$ as an invariant subspace in the domain.
Definition 4.1. A generalized (odd) spectral data is given by a quadruple $(C, K, K_0, D)$, where $K$ is a separable Hilbert space, $K_0$ is a dense subspace of $K$, $C$ is a $*$-subalgebra of $\mathcal{O}_{K_0}$, and $D$ is a self-adjoint operator on $K$ such that $D, [D, c] \in \mathcal{O}_{K_0}$ for all $c \in C$. We say that this data is $\Theta$-summable if for all $t > 0$, $ae^{-tD^2}$ is trace-class for all $a$ in the $*$-algebra $\tilde{C}$ generated (algebraically only, no completion) by $C$ and $[D, c]$, $c \in C$. The definition of a generalized even spectral data can be given in a similar way.
This definition is perhaps too general to work with, and the lack of boundedness makes it extremely difficult to consider some kind of J-L-O cocycle in this setup. However, we specialize to a particular situation relevant to us, and see that such cocycles can indeed be constructed along the lines of the construction of twisted J-L-O cocycles.
Proposition 4.1. Let $(A^\infty, H, D)$ be an equivariant spectral triple as in Sec. 3, such that $Re^{-tD^2}$ is trace-class for all $t > 0$, where $R$ is as in Sec. 3. Let $B_0$ be the $*$-algebra of (unbounded) operators described at the beginning of this section. Then $(B_0, H, \mathcal{D}, D)$ is a $\Theta$-summable generalized odd spectral data in the sense of the previous definition. Furthermore, for $b_0, b_1, \ldots, b_n$ in $B_0$, the following integral is absolutely convergent and is bounded in absolute value by $\frac{C}{n!}$ for some constant $C$ depending only on $b_0, \ldots, b_n$:
\[
\psi_n(b_0, b_1, \ldots, b_n) := \int_{\delta_i \ge 0,\ \sum_i \delta_i = 1} \mathrm{Tr}\bigl(b_0 e^{-\delta_1 D^2} [D, b_1] e^{-\delta_2 D^2} \cdots [D, b_n] e^{-\delta_{n+1} D^2}\bigr)\, d\delta_1 \cdots d\delta_{n+1} .
\]
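A factor of $\frac{1}{n!}$ of the kind appearing in this bound is what one gets from the volume of the domain of integration $\{\delta_i \ge 0, \sum_i \delta_i = 1\}$ alone. The following minimal Python sketch checks that volume exactly. It assumes the standard normalization in which the simplex is parametrized by its first $n$ coordinates; the function name `simplex_volume` is ours and does not come from the paper, and the sketch says nothing about the trace estimate itself.

```python
from fractions import Fraction
from math import factorial


def simplex_volume(n):
    """Exact volume of {x_i >= 0, x_1 + ... + x_n <= 1} in R^n.

    This is the simplex {delta_i >= 0, sum_{i=1}^{n+1} delta_i = 1}
    parametrized by its first n coordinates (an assumed normalization).
    """
    # poly[i] is the coefficient of s**i in V_k(s), the volume of
    # {x_i >= 0, x_1 + ... + x_k <= s}; V_0(s) = 1.
    poly = [Fraction(1)]
    for _ in range(n):
        # V_{k+1}(s) = integral from 0 to s of V_k(u) du, done termwise.
        poly = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(poly)]
    # Evaluate the polynomial at s = 1.
    return sum(poly)


if __name__ == "__main__":
    for n in range(6):
        print(n, simplex_volume(n), Fraction(1, factorial(n)))
```

Exact rational arithmetic is used so that the comparison with $1/n!$ involves no rounding.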
Proof. Let $E_0$ denote the algebra generated by $A^\infty$ and elements of the form $F_{\chi^n_{ij}}$. It is easy to see that any $b$ in $B_0$ is of the form $b = \sum_{j=-M}^{N} a_j R^j$ for some positive integers $M, N$ and bounded operators $a_j$. Using this observation as well as the fact that $R$ commutes with $D$, and that for any integer $m$ and element $c \in E_0$, $R^{-m} c R^m$ is bounded and belongs to $E_0$, we can express $(b_0 e^{-\delta_1 D^2} [D, b_1] e^{-\delta_2 D^2} \cdots [D, b_n] e^{-\delta_{n+1} D^2})$ as a finite linear combination of elements of the form $c_0 e^{-\delta_1 D^2} [D, c_1] e^{-\delta_2 D^2} \cdots [D, c_n] e^{-\delta_{n+1} D^2} R^m$ for some integer $m$ and elements $c_0, \ldots, c_n \in E_0$. However, it is not difficult to see that $(E_0, H, D, R^m)$ is indeed an $R^m$-twisted odd $\Theta$-summable spectral data in the language of Sec. 2. Thus, we complete the proof of the present proposition by using the arguments given in the proof of Lemma 2.1.
Now, let us consider the finest locally convex topology on $B_0$, and view $B_0$ as a locally convex algebra in this topology [1, Remark 7(b), p. 370]. It is easy to see
that $\psi := ((2n+1)!\sqrt{2i}\,\psi_{2n+1})_n$ defines an entire cyclic cocycle in $H^o(B_0)$. This is a canonical construction of a J-L-O type cocycle from the given equivariant spectral triple, and as pointed out in [3], under the map $H^o(B_0) \ni \eta \mapsto \rho(\eta) \in H^o(A^\infty, \sigma)$ given by $\rho(\eta)_{2n+1}(a_0, \ldots, a_{2n+1}) := \eta_{2n+1}(a_0, \ldots, a_{2n+1} R)$, the above canonical J-L-O cocycle on $B_0$ will be mapped to the canonical twisted Chern character obtained by us in the previous section.
Remark 4.1. We have already seen that even if we are given an equivariant finite-summable (in the usual sense) spectral data, the canonical twisted spectral data constructed from this may very often turn out to be only $\Theta$-summable in the "twisted" sense, as discussed in the previous section. However, in some cases we may indeed obtain a twisted data which is finitely summable (in the "twisted" sense), as noted in the context of the example constructed in [15] (cf. Remark 3.1). Unfortunately, even in such cases, the generalized spectral data $(B_0, H, \mathcal{D}, D)$ obtained in the present section will not be finitely summable in any suitable sense in general. For example, if we consider the spectral data constructed in [15], with the corresponding Dirac operator $D$ satisfying $\mathrm{Tr}(Re^{-tD^2}) = O(t^{-2})$ (i.e. 2-summability in the twisted sense), we cannot fix any $p > 0$ such that $\mathrm{Tr}(cRe^{-tD^2}) = O(t^{-p})$ for all $c$ in the algebra $B_0$ mentioned in Proposition 4.1, since $R^m$ belongs to the algebra $B_0$ for every integer $m$. Thus, we have elaborated the comments made in [3], and this has led us to consider a slight modification of the concept of a $\Theta$-summable spectral data, i.e. a minor change in the conventional framework of noncommutative geometry, apart from the modification of the notion of invariance. We do not know, however, whether this can be achieved without this modification, i.e. while avoiding the notion of generalized spectral data.
Acknowledgment
The author would like to acknowledge the support of the Abdus Salam I.C.T.P. (Trieste, Italy) where most of this work was carried out, during January-August 2002.
References
[1] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[2] A. Connes, Geometry from the spectral point of view, Lett. Math. Phys. 34(3) (1995) 203–238.
[3] A. Connes, Cyclic Cohomology, Quantum group symmetries and the local index formula for SUq(2), preprint (2002), math.QA/0209142.
[4] A. Connes and M. Dubois-Violette, Noncommutative finite-dimensional manifolds. I: Spherical manifolds and related examples, Commun. Math. Phys. 230(3) (2002) 539–579.
[5] A. Connes and G. Landi, Noncommutative manifolds, the instanton algebra and isospectral deformations, Commun. Math. Phys. 221 (2001) 141–159.
[6] P. S. Chakraborty and A. Pal, Equivariant spectral triples on the quantum SU(2) group, K-Theory 28(2) (2003) 107–126.
[7] J. Fröhlich, O. Grandjean and A. Recknagel, Supersymmetric quantum theory and noncommutative geometry, Commun. Math. Phys. 203 (1999) 119–184.
[8] D. Goswami, Some noncommutative geometric aspects of SUq(2), preprint, math-ph/0108003.
[9] A. Jaffe, A. Lesniewski and K. Osterwalder, Quantum K-theory. I: The Chern character, Commun. Math. Phys. 118(1) (1988) 1–14.
[10] K. Ernst, P. Feng, A. Jaffe and A. Lesniewski, Quantum K-theory. II: Homotopy invariance of the Chern character, J. Funct. Anal. 90(2) (1990) 355–368.
[11] S. Klimek and A. Lesniewski, Chern character in equivariant entire cyclic cohomology, K-Theory 4 (1991) 219–226.
[12] J. Kustermans, G. J. Murphy and L. Tuset, Differential calculi over quantum groups and twisted cyclic cocycles, J. Geom. Phys. 44(4) (2003) 570–594.
[13] J. L. Loday, Cyclic homology (Appendix E by M. O. Ronco), Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 301 (Springer-Verlag, Berlin, 1992).
[14] T. Masuda, Y. Nakagami and J. Watanabe, Noncommutative differential geometry on the Quantum SU(2), I: An algebraic viewpoint, K-Theory 4 (1990) 157–180.
[15] K. Schmüdgen and E. Wagner, Dirac operator and a twisted cyclic cocycle on the standard Podleś quantum sphere, preprint, QA/0305051.
[16] S. L. Woronowicz, Twisted SU(2)-group: An example of a noncommutative differential calculus, Publ. Res. Inst. Math. Sci. (Kyoto Univ.) 23 (1987) 117–181.
[17] S. L. Woronowicz, Compact matrix pseudogroups, Commun. Math. Phys. 111(4) (1987) 613–665.
July 13, 2004 11:49 WSPC/148-RMP
00213
Reviews in Mathematical Physics Vol. 16, No. 5 (2004) 603–628 © World Scientific Publishing Company
THE RADIAL PART OF THE ZERO-MODE HAMILTONIAN FOR SIGMA MODELS WITH GROUP TARGET SPACE
DOUG PICKRELL
Mathematics Department, University of Arizona, Tucson, AZ 85721
[email protected]
Received 30 September 2003
Revised 15 April 2004
In this note, we use geometric arguments to derive a possible form for the radial part of the "zero-mode Hamiltonian" for the two-dimensional sigma model with target space $S^3$, or more generally a compact simply connected Lie group.
Keywords: Sigma model; Hamiltonian; loop group; Wiener measure.
2000 Mathematics Subject Classifications: 81T40, 22E67, 58D20
0. Introduction
Suppose that $X$ is a Riemannian manifold. In this paper we will be mainly interested in the case when $X = K$, a simply connected compact Lie group with biinvariant metric. At the classical level, the $(d+1)$-dimensional sigma model with target space $X$ is defined by the kinetic energy action
\[
A(x) = \frac{1}{2} \int_\Sigma |dx|^2 \, dV , \tag{0.1}
\]
where the field $x$ is a map from $(d+1)$-dimensional space-time $\Sigma$ to $X$. In this paper we always consider the critical dimension, $\dim(\Sigma) = 2$, in which case the action is conformally invariant, i.e. depends only on the conformal structure of $\Sigma$. At the quantum level, from one perspective, defining the sigma model roughly amounts to making sense of path integrals
\[
Z(\Sigma) = \int_{\mathrm{Map}(\Sigma, X)} \exp\Bigl(-\frac{1}{\lambda} A(x)\Bigr) \, Dx , \tag{0.2}
\]
where $\lambda$ is a dimensionless coupling constant. The conventional wisdom regarding the qualitative properties of these integrals is largely based upon renormalization group analysis (see [2, Sec. 3], [7, Chaps. 2 and 8] and [12, Lecture 4]). In this analysis, one first regularizes the integrals, breaking conformal invariance. As the regularization is removed, the coupling constant $\lambda$ (and other parameters as well,
such as the metric of $X$, if it is not sufficiently symmetric) must be allowed to flow, to obtain well-defined limits. At least for homogeneous geometries on the target $X$, based upon this analysis, it is believed that the resulting quantum field theory is not conformally invariant whenever the Ricci curvature of $X$ is nonvanishing; this is the case for the targets considered in this paper (for generic metrics on $X$, the flow of parameters is so complex that it seems impossible to know what, in the end, parametrizes the possible quantum field theory limits). From a geometric point of view, for a quantum 2D sigma model, one expects that for each finite radius $R$, there is a Hilbert space of states, $H(S^1_R)$, associated to the circle with radius $R$, and there is a trace class operator, $U(\Sigma)$, from incoming to outgoing state spaces, associated to each oriented compact Riemannian surface $\Sigma$, with parametrized geodesic boundary components, such that sewing of surfaces corresponds to composition of operators. The sewing property expresses the locality of quantum field theory, as interpreted by Witten and Segal; the trace class condition arises from the expectation that finite numbers can be associated to the path integrals corresponding to closed surfaces (see [2, Sec. 2.6]). In particular, the infinitesimal generator for the one parameter family $U(S^1_R \times [0, t])$, $0 < t < \infty$, the Hamiltonian $H_R$ (our main concern), should have a discrete spectrum. In the case of a conformally invariant quantum sigma model (e.g. $X = T$, a flat torus), the Hilbert space $H(S^1_R)$ and Hamiltonian $H_R$ are essentially independent of $R$. Otherwise, there is dependence on $R$, and for the models we consider, it is expected that in the infrared limit $R \uparrow \infty$, the Hamiltonian $H = H_\infty$ has a mass gap and continuous spectrum. 
For emphasis, note that the mass parameter (and the dependence on the radius) does not appear explicitly in the action; the mass parameter arises in the renormalization group process, as the dimensionless parameter $\lambda$ is adjusted (this is referred to as "dimensional transmutation", see [12, p. 1174]). Now suppose that $X = K$, a simply connected compact Lie group with biinvariant metric. In this paper, we will assume that the Hilbert spaces $H(S^1_R)$ have a certain concrete mathematical form. We will also introduce some simplifying assumptions regarding the action of the Hamiltonian operators $H_R$. On the basis of these assumptions, we will draw some conclusions about the form of the radial part of the "zero-modes" of these operators. The arguments in this paper are most complete in the limiting case $R \uparrow \infty$. We assume that in this limit the space of states has the form $H(S^1_\infty) = L^2(\mu)$, where $\mu$ is a certain canonical measure on a distribution-like completion of the loop space $LK$. We think of this measure as specifying the vacuum for the theory. This is motivated by the known form for the vacuum of the conformally invariant WZW model (see [2, Sec. 4.1]). We also assume that the Hamiltonian acts as a second order differential operator on a certain subspace of zero-mode states. This "zero-mode Hamiltonian" is a $K \times K$-biinvariant Laplace type operator on $G$, the complexification of $K$. In the case of $K = S^3$, on the basis of these assumptions,
we find that the radial part of this zero-mode Hamiltonian is equivalent to
\[
-\Bigl(\frac{d}{dr}\Bigr)^2 + \frac{1}{4} - \frac{15}{4}\,\mathrm{sech}^2(r) , \tag{0.3}
\]
(acting on odd functions) up to a scale factor (the mass parameter). This operator has a unique ground state and a mass gap. The spectrum of this radial part does not reflect expected features of the full spectrum, such as jumps in multiplicity corresponding to multiparticle states. Jumps do occur in the spectrum of the full zero-mode Hamiltonian, due to the discreteness of the K × K isotypic decomposition (the radial part corresponds to the trivial K ×K representation). Unfortunately, the zero-mode Hamiltonian cannot be determined on mathematical grounds alone from its radial part. It is possible that together with a quantum group symmetry, the radial part determines the entire Hamiltonian, but I cannot presently account for this “symmetry” from this paper’s perspective; see [9] and references there for the integrable field theory point of view, which suggests this possibility. In Sec. 1, I will more fully explain the motivation for the conjecture. This involves a brief review of the Hamiltonian formalism for the two-dimensional sigma model. In Sec. 2, I will review some of the mathematics which Sec. 1 depends upon, especially the construction of the measure µ (which is carried out in [6]). This involves understanding the limit of Wiener measure on LK, in terms of Riemann– Hilbert factorization, as inverse temperature tends to zero (which corresponds to R → ∞). For the purposes of this paper, the key is Step 7 in Sec. 2, which is a conjectured formula for the spherical transform of the diagonal (or zero-mode) distribution of the measure µ, in terms of an affine analogue of Harish–Chandra’s famous c-function. 
(The mathematical conclusions of this paper depend exclusively on this affine c-function, and many readers may care to skip the infinite dimensional measure-theoretic considerations; however, it should be borne in mind that the fact that this function arises as a transform for a measure on (distributional) loop space is a key argument for the plausibility of our physical claims.) In Sec. 3, I will present the calculations which lead to (0.3), and to an analogous conjecture for the massive deformation of the conformally invariant WZW model. In Sec. 4, I will discuss the case of finite radius. The mathematical underpinnings for this section are not as strong as in the case $R \uparrow \infty$, because the analogue of the measure $\mu$ has not been rigorously constructed. However, there does appear to be a natural conjecture for the zero-mode distribution. This involves understanding a quantum (theta function) deformation of the affine c-function. Most of this section revolves around some nontrivial positivity checks for this deformation. These positivity checks depend ultimately on some Fourier transform type relations among classical theta functions which seem to be new (see e.g. Lemmas 4.1 and 4.2).
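The statement in the introduction that the radial operator has a ground state and a mass gap can be made concrete with a small numerical check. Reading (0.3) as $-(d/dr)^2 + \frac{1}{4} - \frac{15}{4}\mathrm{sech}^2(r)$, the odd function $\psi(r) = \tanh(r)\,\mathrm{sech}^{1/2}(r)$ is annihilated by the operator; this candidate is our assumption, taken from the standard Pöschl–Teller family, and is not stated in the paper. The sketch below verifies this with a central finite difference; all function names are illustrative.

```python
import math


def psi(r):
    """Candidate odd zero-energy state tanh(r) * sech(r)**0.5 (assumed)."""
    return math.tanh(r) / math.sqrt(math.cosh(r))


def potential(r):
    """Potential part of (0.3): 1/4 - (15/4) sech^2(r)."""
    return 0.25 - 3.75 / math.cosh(r) ** 2


def apply_operator(f, r, h=1e-3):
    """Apply -(d/dr)^2 + potential to f at r, using a central difference."""
    second = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return -second + potential(r) * f(r)


def max_residual(points):
    """Largest |(H psi)(r)| over the sample points; small if H psi = 0."""
    return max(abs(apply_operator(psi, r)) for r in points)


if __name__ == "__main__":
    grid = [0.1 * k for k in range(-40, 41)]
    print("max residual:", max_residual(grid))
    print("potential at large r:", potential(50.0))
```

Since the potential tends to $\frac{1}{4}$ as $|r| \to \infty$, a normalizable state at energy 0 lies a distance $\frac{1}{4}$ below the bottom of the continuous spectrum, which is consistent with the mass gap described in the text (before the overall scale factor is applied).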
1. Origins of the Conjecture
We initially suppose that space is the circle, $S^1$, so that spacetime is $\Sigma = S^1 \times \mathbb{R}$, with coordinates $(\theta, t)$. We also initially suppose that the target space is $X$, an arbitrary Riemannian manifold. The classical fields for the sigma model with target space $X$ are maps $x : \Sigma \to X$, and the action is the kinetic energy function
\[
A(x) = \frac{1}{2} \int_\Sigma \langle dx \wedge *dx \rangle = \frac{1}{2} \int_\Sigma \Bigl( \Bigl|\frac{\partial x}{\partial t}\Bigr|^2 + \Bigl|\frac{\partial x}{\partial \theta}\Bigr|^2 \Bigr) \, d\theta \, dt . \tag{1.1}
\]
This action is conformally invariant, meaning that if the metric $ds$ of $\Sigma$ is changed to $\rho\,ds$, where $\rho$ is a positive function, the action remains unchanged. In particular, the action depends upon the radius of the circle and time scale in a covariant way. The time zero fields constitute the loop space $LX = \mathrm{Map}(S^1, X)$ (when it is useful to denote the degree of smoothness, we will use a subscript, e.g. $L_{C^0}X$, which denotes the manifold of continuous loops). The tangent space to $LX$ at $x$ is naturally identified with $\Omega^0(x^*TX)$, the space of vector fields along the loop $x$. There is a Riemannian metric on this tangent space, given by
\[
\langle v, w \rangle_x = \int_{S^1} \langle v(\theta), w(\theta) \rangle_{x(\theta)} \, d\theta , \tag{1.2}
\]
where $v(\theta), w(\theta) \in TX|_{x(\theta)}$, and $\langle\cdot,\cdot\rangle_{x(\theta)}$ denotes the inner product (Riemannian metric) for $X$ at the point $x(\theta)$. In this way, we can view $L_{C^0}X$ as a Riemannian manifold. In the second expression in (1.1) for $A$, the first term is the usual kinetic energy for a path in the Riemannian manifold $L_{C^0}X$, and the second term represents a potential energy term, corresponding to the energy function on the finite energy loop space $L_{W^1}X$,
\[
E(x : S^1 \to X) = \frac{1}{2} \int_{S^1} \langle dx \wedge *dx \rangle = \frac{1}{2} \int_{S^1} \Bigl|\frac{\partial x}{\partial \theta}\Bigr|^2 \, d\theta . \tag{1.3}
\]
Note that the Riemannian metric (1.2) and $E$ depend upon the radius of $S^1$. From (1.1)–(1.3), we can deduce, in a rough heuristic way, that the quantum Hamiltonian for the sigma model is of the form
\[
H = \Delta + E , \tag{1.4}
\]
where $\Delta$ is the Laplacian for the Riemannian manifold $L_{C^0}X$, and $E$ is viewed as an (extremely singular) multiplication operator. At this heuristic level, the operator $H$ should define a nonnegative self-adjoint operator on a Hilbert space that is of the form "$L^2(LX, dV)$", where $dV$ is a fictional Riemannian volume element. To rigorously define $H$, one must introduce a regularization, necessarily breaking scale covariance, and (as discussed in the introduction) it is expected that in general, after removing the regularization, there is a residual nontrivial dependence of $H$ on $R$, the radius of the circle.
One can at least schematically think of one aspect of this renormalization process in terms of a commutative diagram (involving unbounded operators)
\[
\begin{array}{ccc}
L^2(\Omega_R^2\,dV) & \xrightarrow{\ \Omega\circ(\Delta+E)\circ\Omega^{-1}\ } & L^2(\Omega_R^2\,dV) \\
\Big\uparrow{\scriptstyle \Omega^{-1}} & & \Big\uparrow{\scriptstyle \Omega^{-1}} \\
L^2(dV) & \xrightarrow{\ \Delta+E\ } & L^2(dV)
\end{array} , \tag{1.5}
\]
where $\Omega = \Omega_R$, the fictional ground state for $\Delta + E$ (depending upon the radius $R$ of the circle), is viewed as a multiplication operator. In this diagram, the coupled pair $\Omega_R^2\,dV$ should represent a well-defined finite positive measure (when topological terms are added to the action, this might more generally represent a measure having values in a line bundle). The point is that to obtain a well-defined Hamiltonian, one must consider states relative to the ground state.
Example (see [2, Sec. 1.3]). Suppose $X = \mathbb{R}$, and for simplicity, we add an explicit mass term $m^2|x(t, \theta)|^2$ to the integrand in (1.1). By expanding $x(t, \theta) = \sum_k x_k(t) e^{ik\theta}$ in a Fourier series, one sees that $\Delta + E$ can be written as a sum of oscillators. The formal Hilbert space $L^2(L\mathbb{R}, dx)$ and the ground state
\[
\Omega = \exp\Bigl( -\frac{1}{2} \sum_{k=-\infty}^{\infty} \sqrt{m^2 + k^2}\,|x_k|^2 \Bigr) \tag{1.6}
\]
do not make sense individually. But in the top row of the diagram
\[
\begin{array}{ccc}
L^2\bigl(\frac{1}{Z} e^{-\sum \sqrt{m^2+k^2}\,|x_k|^2}\,dx\bigr) & \xrightarrow{\ \Omega\circ(\Delta+E)\circ\Omega^{-1}\ } & L^2\bigl(\frac{1}{Z} e^{-\sum \sqrt{m^2+k^2}\,|x_k|^2}\,dx\bigr) \\
\Big\uparrow{\scriptstyle \Omega^{-1}} & & \Big\uparrow{\scriptstyle \Omega^{-1}} \\
L^2(L\mathbb{R}, dx) & \xrightarrow{\ \Delta+E\ } & L^2(L\mathbb{R}, dx)
\end{array} \tag{1.7}
\]
the measure is a well-defined Gaussian measure, and the operator can be rigorously defined.
Before discussing what is mathematically known about operators on $LX$ (when $X$ is curved), we digress to recall an idea inspired by Witten's work. Suppose that $Y$ is a finite dimensional Riemannian manifold (but we will want to heuristically apply this to $Y = LX$), and that $E$ is a function on $Y$. There is a commutative diagram (involving unbounded operators)
\[
\begin{array}{ccc}
L^2(Y, e^{-\beta E}\,dV) & \xrightarrow{\ \Delta_\beta\ } & L^2(Y, e^{-\beta E}\,dV) \\
\Big\uparrow{\scriptstyle e^{\frac{\beta}{2}E}} & & \Big\uparrow{\scriptstyle e^{\frac{\beta}{2}E}} \\
L^2(Y, dV) & \xrightarrow{\ \Delta_Y\ } & L^2(Y, dV)
\end{array} , \tag{1.8}
\]
akin to (1.5), where $\Delta_\beta = e^{\frac{\beta}{2}E} \circ \Delta_Y \circ e^{-\frac{\beta}{2}E}$, and $dV$ denotes the Riemannian volume element (a refinement of this, involving the conjugation of $d$, the exterior derivative, is relevant to Morse theory and the supersymmetric sigma model; see [11]).
In the case when $Y = L_{W^1}X$, there is a natural choice for $E$, namely the energy function (1.3); there is an analogue of $e^{-\beta E}dV$, namely Wiener measure with inverse temperature $\beta$, denoted $\nu_\beta$ (although note this measure is not supported on $Y$); and there is an analogue of $\Delta_\beta$, which has been investigated by Gross and others. From the sigma model point of view, one can view these objects as regularizations of the heuristic expressions that we introduced above. This motivates the study of possible limits of Wiener measure $\nu_\beta$ and $\Delta_\beta$, plus a potential function that might stand in for $E$, as $\beta \downarrow 0$ (but note that $\Delta_\beta$ by itself tends to the Laplacian for the $W^1$ loop space, not the $C^0$ loop space). Note also that letting $\beta \downarrow 0$ corresponds to $R \uparrow \infty$. Now suppose that $X = K$, a compact simply connected Lie group with a simple Lie algebra, $\mathfrak{k}$. This space has an essentially unique biinvariant Riemannian structure, determined by a multiple of the Killing form, which we will normalize in a standard way (the length squared of a long root is 2; in the case when $K = SU(2, \mathbb{C}) = S^3$, this means that the inner product is $\langle x, y \rangle = \mathrm{tr}_{\mathbb{C}^2}(x^* y)$, for $x, y \in su(2, \mathbb{C})$). Eliminating this normalization would introduce a mass parameter. A first fact of note is that Gross has proven that for a large class of potentials, $\{V\}$, $\Delta_\beta + V$ has a unique ground state (see [3]; the nature of the spectrum apparently remains unknown). I am unaware of any result concerning limits of these operators as $\beta \downarrow 0$, which in the present context amounts to removing a regularization. Our speculations to follow are possibly related to these limits (and hence to an infinite radius limit). Let $G$ denote the complexification of $K$ (if $K = SU(2)$, then $G = SL(2, \mathbb{C})$). As I will describe in Sec. 2, there is a natural completion of the loop space $LG$, the hyperfunction loop space $L_{hyp}G$, with the properties that (1) $LK$ acts from the left and right, and (2) the Wiener measures $\nu_\beta$ converge to a biinvariant probability measure $\mu$ on $L_{hyp}G$ as $\beta \downarrow 0$. 
This depends in an essential way on our assumption that $K$ is simply connected. The measure $\mu$ should be characterized in the following way.
Conjecture 1.1. There is a unique probability measure on $L_{hyp}G$ which is biinvariant with respect to $LK$ (for uniqueness it should suffice to consider polynomial loops).
In this paper we assume that the Hilbert space for the sigma model with target space $K$, in the infinite radius limit, is $L^2(L_{hyp}G, \mu)$, and we propose to use the structure of $L_{hyp}G$ and $\mu$ to infer properties of the Hamiltonian $H = H_\infty$.
Remarks. (1) We emphasize that $\mu$ is a probability measure. Even in a heuristic sense, $\mu$ is not to be confused with the fictional Riemannian volume element $dV$ on $L_{C^0}K$. We think of switching from $dV$ to $\mu$ as similar to the vacuum renormalization process in (1.5), with $R = \infty$. (2) In Sec. 4, we will discuss the case when $R < \infty$, as best we can understand it. At this time, it is not clear how to formulate a characterization of the corresponding measure, $\Omega_R^2\,dV$ (assuming it exists), similar to Conjecture 1.1.
A generic $g \in L_{hyp}G$ can be represented as a formal product
$$g = g_- \cdot g_0 \cdot g_+ , \tag{1.9}$$
where $g_0 \in G$ is constant, and $g_\pm$ are G-valued holomorphic functions on the disks $\Delta = \{|z| < 1\}$ and $\Delta^* = \{|z| > 1\}$, respectively, with $g_+(0) = 1$ and $g_-(\infty) = 1$. If $g \in L_{C^0}G$ (an ordinary continuous loop in G) and generic (i.e. the Toeplitz operator associated to g is invertible), then (1.9) is the standard triangular or Riemann–Hilbert or Birkhoff factorization of g (see [1] or [8, Chap. 8]). There is a strongly motivated conjecture for the $g_0$ distribution of $\mu$; in the case when K = SU(2), the conjecture states that
$$(g_0)_*\mu = \frac{1}{Z}\,\mathrm{tr}(g_0^* g_0)^{-3}\,dm(g_0) , \tag{1.10}$$
where dm denotes an invariant measure for SL(2, C), and Z normalizes the total mass to be one (see (3.4) below for the general formula).
We now introduce two further assumptions. The first is that the low energy states of the sigma model should be functions of $g_0$ alone. The second is that the Hamiltonian, or an approximation to it, should act on the space $L^2(G, (g_0)_*\mu)$, and that this approximation should be given by a second order operator, necessarily biinvariant with respect to K. We will refer to this approximation as the “zero-mode Hamiltonian”, since functions of $g_0$ are rotation invariant. In the case of K = SU(2), K biinvariance leaves just one degree of freedom for the radial part, and as we will calculate in Sec. 3, this leads to (0.3).
We will now briefly indicate how this generalizes to include other action terms. Our main purpose here is to compare with the WZW model of conformal field theory. Returning to a general target space X, suppose that we are given a “B-field” on X, i.e. an element $b \in \hat{H}^2(X, \mathbb{T})$, the degree 2 Cheeger–Simons differential characters. Following the convention in physics, we will write b heuristically as $b = \exp(2\pi i B)$, where B is a (locally defined) 2-form on X. Given b, there is a multivalued generalization of the sigma model action, which gives rise to well-defined Feynman amplitudes,
$$\exp\left(-\beta A(x) + 2\pi i \int_\Sigma x^* B\right) . \tag{1.11}$$
The deformation invariant of a B-field is the cohomology class of dB in $H^3(X, \mathbb{Z})$. In the case when X = K, there are special B-fields, the WZW action terms, which are parametrized by a level $l \in \mathbb{Z} = H^3(K, \mathbb{Z})$. When the inverse temperature parameter $\beta$ and the level l satisfy $\beta = l$, then the corresponding sigma model is the conformally invariant WZW model at level l, for which the Hilbert space is
$$H^0_{L^2}(\mathcal{L}^{*\otimes l}) , \tag{1.12}$$
the space of holomorphic sections of the line bundle $\mathcal{L}^{*\otimes l}$, where $\mathcal{L} = \hat{L}_{hyp}G \times_{\mathbb{C}^*} \mathbb{C}$, where the hat denotes the Kac–Moody extension determined by the level. The vacuum for this theory is (an appropriate power of) the Toeplitz determinant, $\det A(\hat{g})$,
viewed as a section. For the purpose of motivating this paper, it is important to note that this Toeplitz determinant is a function of $\hat{g}_0$, the zero-mode of $\hat{g}$, in the factorization for $\hat{L}G$ corresponding to (1.9). The same is true for the other “primary fields” of the WZW model (see e.g. [6, Part III, Sec. 4.3]). Thus just as functions of $\hat{g}_0$ are the most important states for the WZW model, we expect that functions of $g_0$ are the most important states for the sigma model without WZW term.
It is believed that there is a massive deformation of the WZW model at level l, in the direction specified by the energy-momentum tensor. We consider the ansatz that the Hilbert space for this deformation is the larger space of all sections
$$\Omega^0_{L^2}(\mathcal{L}^{*\otimes l}) . \tag{1.13}$$
(In general, the state space of a massive model is expected to contain, in a natural way, the state space of its ultra-violet limit, describing its universality class; see [9, Sec. 1].) It is again reasonable to investigate the possibility that in terms of the Riemann–Hilbert factorization (1.9), the low energy states depend only upon $g_0$, and so on. We will write down a possible form for the radial part of the zero-mode Hamiltonian at level l in Sec. 3.

2. The Structure of µ

The existence of biinvariant limits of the measures $\nu_\beta$ as $\beta \downarrow 0$ is proven in [6]. Here I will give an outline of a relatively direct proof. The argument is broken into seven steps, two of which are listed as conjectural. Step 5 (Conjectural) can be bypassed, as I have indicated in Appendix A. However, this detour is somewhat messy, and Step 5 is of considerable intrinsic interest. Step 7 (Conjectural), which gives an explicit formula for the $g_0$ distribution of $\mu$, is essential for the purposes of this paper.
By definition (see [6, Part III, Chap. 2]), as a set,
$$L_{hyp}G = G(\mathcal{O}(S^{1-})) \times_{G(\mathcal{O}(S^1))} G(\mathcal{O}(S^{1+})) , \tag{2.1}$$
where $G(\mathcal{O}(S^1))$ is the group of analytic loops in G, $G(\mathcal{O}(S^{1+}))$ is the direct limit of the groups $G(\mathcal{O}(\{r < |z| < 1\}))$ as $r \uparrow 1$ (G-valued holomorphic functions on some annulus just inside $S^1$), $G(\mathcal{O}(S^{1-}))$ is the direct limit of the $G(\mathcal{O}(\{1 < |z| < r\}))$ as $r \downarrow 1$, and $G(\mathcal{O}(S^1))$ acts on these latter two groups by multiplication. This is a nonabelian generalization of Sato’s realization of the dual of $\mathcal{O}(S^1)$ (the elements of this dual are called hyperfunctions, and generalize the notion of a distribution). From this global definition, it is clear that $G(\mathcal{O}(S^1))$ acts on the left and right of $L_{hyp}G$ (but this action is far from transitive).
The set $L_{hyp}G$ can be turned into a complex manifold, where a model coordinate neighborhood is given by (1.9); the coordinates for this neighborhood are
$$(\theta_-, g_0, \theta_+) \in H^1(\Delta^*, \mathfrak{g}) \times G \times H^1(\Delta, \mathfrak{g}) , \tag{2.2}$$
where $\theta_+ = g_+^{-1}\partial g_+$ and $\theta_- = (\partial g_-)g_-^{-1}$. Other neighborhoods are obtained by translation by elements of $G(\mathcal{O}(S^1))$.
Technical Remark. Below, it will occasionally be useful to replace $\theta_+$ by its integral $x_+ \in H^0(\Delta, \mathfrak{g})_0$, where $\theta_+ = \partial x_+$, $x_+(0) = 0$. One could imagine using other coordinates as well. But (2.2) is natural in the following sense: there is a natural action of $\mathrm{Diff}^+_{C^\omega}(S^1)$ on $L_{hyp}G$, in addition to the action of LK × LK; the coordinates (2.2) are equivariant with respect to the subgroup PSU(1, 1), where PSU(1, 1) acts naturally on $H^1(\Delta, \mathfrak{g})$. Assuming the truth of Conjecture 1.1, the measure $\mu$ is invariant with respect to these actions.
There is a natural inclusion of $L_{C^0}G \to L_{hyp}G$; this follows from the existence of Riemann–Hilbert factorization for continuous loops. Wiener measure $\nu_\beta$ on $L_{C^0}K$ can therefore be viewed as a probability measure on $L_{hyp}G$. We recall that $\nu_\beta$ is characterized in the following way: given vertices {v} and associated edges {e} around $S^1$, the distribution of the values {g(v)} is given by the probability measure on $\prod_{\{v\}} K$
$$\frac{1}{Z}\prod_{\{e\}} p_{T\,l(e)}(g_{\partial e}) \prod_{\{v\}} dg_v , \tag{2.3}$$
where $T = 1/\beta$, $p_t$ denotes the heat kernel for K [in particular $p_t(g, h) \sim \frac{1}{Z}\exp(-\frac{1}{2t}d(g, h)^2)$ as $d(g, h) \to 0$], and $g_{\partial e}$ denotes the pair of values of g at the ends of the edge e. Unfortunately, this characterization is not directly useful in understanding $\nu_\beta$ in terms of the Riemann–Hilbert coordinates $\theta_-, g_0, \theta_+$. We now turn to the basic steps of the argument.
Step 1. $\nu_\beta$ is quasiinvariant with respect to $L_{W^1}K$ (finite energy loops) acting on $L_{C^0}K$ from either the left or right.
Step 2. $\nu_\beta$ is asymptotically invariant as $\beta \downarrow 0$ in the following precise sense: for each $p < \infty$, given $g' \in L_{W^1}K$,
$$\int_{LK}\left|\frac{d\nu_\beta(g' g)}{d\nu_\beta(g)} - 1\right|^p d\nu_\beta(g) \le 2c(\beta)\,\Gamma\!\left(\frac{p+1}{2}\right)(2\beta E(g'))^{p/2} ,$$
where $c(\beta) \to 1$ as $\beta \to 0$. There is a similar estimate for $g'$ acting on the right.
Step 3. With $\nu_\beta$ probability one, g has a Riemann–Hilbert factorization as in (1.9), and $g_\pm$ and $x_\pm$ have the same “smoothness properties” as g, where $\partial x_+ = g_+^{-1}\partial g_+$, $x_+(0) = 0$.
“Smoothness” in Step 3 can be understood in various ways. A version sufficient for our purposes is the following. It is known that with $\nu_\beta$ probability 1, g has a derivative of order s in a Sobolev (or Hölder) sense, for any $s < 1/2$. According to Step 3, the same is true for $g_\pm$ and $x_\pm$. In particular we have
$$\sum_{n>0} n^\alpha |\hat{x}_+(n)|^2 < \infty , \quad \text{a.e. } [\nu_\beta] , \tag{2.4}$$
for each $\alpha < 1$.
These first three steps are true for an arbitrary compact type Lie group K. In particular, the first two steps involve a reduction to a linear situation via the use of stochastic analysis (see [6, Part II, Sec. 4.1]). The third step depends fundamentally on the fact that the conjugation operator is continuous on the class of Hölder continuous functions, $C^\mu$, for any $0 < \mu < 1$ (see [1, p. 60, Sec. 2]).
The next step depends crucially upon the simple connectedness of K. In the case when K = SU(2), if we write
$$g_+(z) = 1 + \begin{pmatrix} a_1(g) & b_1(g) \\ c_1(g) & -a_1(g) \end{pmatrix} z + \hat{g}_+(2)z^2 + \cdots , \tag{2.5}$$
a straightforward calculation (see (A.3) of Appendix A) shows that for $k = \begin{pmatrix} a & bz \\ -\bar{b}z^{-1} & \bar{a} \end{pmatrix} \in SU_2^\tau \subset LSU_2$,
$$b_1(gk^{-1}) = \frac{a\,b_1(g) - b}{\bar{b}\,b_1(g) + \bar{a}} . \tag{2.6}$$
In other words, the right action of $k \in SU_2^\tau$ on $g \in L_{hyp}G$ intertwines with the natural linear fractional action of $SU_2$ on $b_1 \in \hat{\mathbb{C}}$ ($SU_2^\tau$ is a subgroup of $LSU_2$ which is conjugate to $SU_2$ via an outer automorphism $\tau$, hence the notation). This latter action is transitive and completely determines the form of an invariant measure. Since the $\nu_\beta$ are asymptotically invariant, the $b_1$ distributions are asymptotically invariant with respect to this $SU_2$ action on $\hat{\mathbb{C}}$. This leads to the following conclusion.
Step 4. In the limit as $\beta \to 0$, the distribution of $b_1$ is the $SU_2$-invariant distribution on $\hat{\mathbb{C}}$,
$$\frac{1}{Z}(1 + |b_1|^2)^{-2}\,dm(b_1) .$$
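The invariance asserted in Step 4 can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes only the linear fractional action written in (2.6) with $|a|^2 + |b|^2 = 1$, and verifies pointwise that the density $(1 + |w|^2)^{-2}$, multiplied by the real Jacobian of the map, reproduces itself. The function names and the random sampling helper are hypothetical.

```python
import random

def mobius(a, b, w):
    """Linear fractional action w -> (a*w - b) / (conj(b)*w + conj(a)), as in (2.6)."""
    return (a * w - b) / (b.conjugate() * w + a.conjugate())

def density(w):
    """Unnormalized density (1 + |w|^2)^(-2) with respect to Lebesgue measure dm(w)."""
    return (1.0 + abs(w) ** 2) ** -2

def jacobian(a, b, w):
    """Real Jacobian |dw'/dw|^2 of the holomorphic map w -> mobius(a, b, w).

    The complex derivative is (|a|^2 + |b|^2) / (conj(b) w + conj(a))^2,
    which equals 1 / (conj(b) w + conj(a))^2 when |a|^2 + |b|^2 = 1.
    """
    return abs(b.conjugate() * w + a.conjugate()) ** -4

def random_su2(rng):
    """Hypothetical helper: a random pair (a, b) with |a|^2 + |b|^2 = 1."""
    z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = (abs(z[0]) ** 2 + abs(z[1]) ** 2) ** 0.5
    return z[0] / norm, z[1] / norm

def invariance_defect(a, b, w):
    """density(w') * jacobian - density(w); this vanishes when the measure is invariant."""
    return density(mobius(a, b, w)) * jacobian(a, b, w) - density(w)
```

The defect vanishes because $1 + |w'|^2 = (1 + |w|^2)/|\bar{b}w + \bar{a}|^2$, so the two powers of $|\bar{b}w + \bar{a}|$ cancel exactly; this is the sense in which transitivity of the action fixes the form of the density.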
The behavior of the measures $(b_1)_*\nu_\beta$ contrasts sharply with the behavior of the Gaussian measures $\frac{1}{Z}\exp(-\beta x^2)\,dx$ on Euclidean space (which is what we encounter for K = T, a flat torus), as $\beta \to 0$, because the “probabilistic mass” of the latter measures escapes to $\infty$ as $\beta \to 0$. One theme of this note is that the preservation of probabilistic mass, which depends essentially on the semisimplicity of K, is related to the existence of a mass gap for the sigma model.
Remarks. (1) For a general simply connected K, there is a result similar to Step 4, where $b_1$ is replaced by the coordinate for the highest root space of $\mathfrak{g}$. (2) Note that $\theta_0 = \hat{\theta}_+(0) = \hat{x}_+(1) = \hat{g}_+(1) \in \mathfrak{g}$. Conjecturally,
$$\lim_{\beta \to 0}(\theta_0)_*\nu_\beta = \frac{1}{Z}(1 + |\theta_0|^2)^{-d-1}\,dm(\theta_0) , \tag{2.7}$$
where $d = \dim_{\mathbb{C}}(\mathfrak{g})$. But at this point, there does not exist a conjectural explicit formula for the joint distribution of all the modes $\theta_0$, $\theta_1 = \hat{\theta}_+(1)$, . . . (see Appendix A).
Using invariance, we can now use Steps 3 and 4 to show that the distributions of all the coefficients for $g_+$ (or $x_+$ or $\theta_+$) assume a finite shape as $\beta \to 0$. This is proven in [6], using an induction argument; a simplified version of this is presented in Appendix A. I believe, however, that there is a more elegant explanation, possibly useful in a more general context. A corollary of (2.4) is that for fixed $\alpha < 1$ and for each $R > 0$,
$$\nu_\beta\{n^\alpha|\hat{x}_+(n)|^2 > R\} \to 0 \quad \text{as} \quad n \to \infty . \tag{2.8}$$
It is natural to ask whether there exists $\alpha$ such that the sequence in (2.8) is actually nonincreasing (but not necessarily going to 0), for all $\beta$ and R; if so, there exists a largest such $\alpha$, $\alpha_c$. In the abelian case, one can easily calculate that $\alpha_c = 2$.
Step 5 (Conjectural). For $\alpha = 0$, (2.8) is a nonincreasing function of n, for all $\beta > 0$ and $R \ge 0$, i.e. $\alpha_c \ge 0$. In particular
$$\nu_\beta\{|\hat{x}_\pm(n)| > R\} \le \nu_\beta\{|\hat{x}_\pm(1)| > R\} .$$
The following is a consequence of Step 4 and Step 5 (Conjectural).
Step 6. There exists a constant d (depending only upon $\mathfrak{g}$) such that
$$\lim_{\beta \to 0}\nu_\beta\{|\hat{x}_\pm(n)| > R\} \le \frac{d}{1 + (R/d)^2} . \tag{2.9}$$
This step implies that the mass of the $\nu_\beta$ does not escape to $\infty$ as $\beta \to 0$, at least when we consider the $\theta_\pm$ coordinates. This has already been done in [6] in a qualitative way (see Appendix A); the point of (2.9) is to quantify this result in an elegant way.
To complete our outline, we need to know that mass does not escape to infinity through $g_0$. Again, this has already been done in a qualitative way in [6], but we need an explicit formula. Suppose that we choose a maximal torus T for K, and a choice of positive roots for the action of the corresponding Cartan subalgebra $\mathfrak{h}$ of $\mathfrak{g}$. We can generically write $g_0 \in G$ in triangular form, $g_0 = l_0\,m\,a\,u_0$, where we have further decomposed the diagonal term into a phase $m \in T$ and its magnitude $a \in \exp(\mathfrak{h}_{\mathbb{R}})$.
Step 7 (Conjectural). We have
$$\lim_{\beta \downarrow 0}\int a(g)^{-i\lambda}\,d\nu_\beta(g) = \prod_{\alpha > 0}\frac{\sin(\frac{\pi}{2\dot{g}}\langle\rho, \alpha\rangle)}{\sin(\frac{\pi}{2\dot{g}}\langle\rho - i\lambda, \alpha\rangle)} , \tag{2.10}$$
where $\dot{g}$ is the dual Coxeter number, $\rho$ is the sum of the positive roots, and $\lambda \in \mathfrak{h}^*_{\mathbb{R}}$ (recall that the inner product has been normalized). In the case of $K = SU_2$, this is equivalent to (1.10). The original motivation for this conjecture is explained in [6, Part III, Sec. 4.4]. This formula should be compared with the known formula of Harish–Chandra,
$$\lim_{\beta \uparrow \infty}\int a(g)^{-i\lambda}\,d\nu_\beta(g) = c(\rho - i\lambda) = \prod_{\alpha > 0}\frac{\langle\rho, \alpha\rangle}{\langle\rho - i\lambda, \alpha\rangle} \tag{2.11}$$
(see [6, Part II, Sec. 4.4]).
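When we incorporate the level l, the generalization of Step 7 (Conjectural) is
$$\lim_{\beta \downarrow 0}\int a^{-i\lambda}\,d\nu_{\beta,l} = \prod_{\alpha > 0}\frac{\sin(\frac{\pi}{2(\dot{g}+l)}\langle\rho, \alpha\rangle)}{\sin(\frac{\pi}{2(\dot{g}+l)}\langle\rho - i\lambda, \alpha\rangle)} . \tag{2.12}$$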
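As $l \to \infty$, we recover the classical limit of Haar measure, (2.11). If we write $(g_0)_*\mu_l = \phi_l\,dm(g_0)$, then (2.12) is equivalent to the following formula for the Harish–Chandra transform
$$(\mathcal{H}\phi_l)(\lambda) = c\prod_{\alpha > 0}\frac{\langle -i\lambda, \alpha\rangle}{\sin(\frac{\pi}{2(\dot{g}+l)}\langle -i\lambda, \alpha\rangle)} = \prod_{\alpha}\Gamma\!\left(1 + i\frac{\langle\lambda, \alpha\rangle}{2(\dot{g}+l)}\right) \tag{2.13}$$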
(this follows from [6, Part II, 4.4.27]).
Steps 6 and 7 imply that the measures $\nu_\beta$ have limits in $L_{hyp}G$ as $\beta \to 0$. Asymptotic invariance implies that these limits are biinvariant with respect to analytic loops in K. The remaining step is to show that there is a unique such measure. Considerable progress has been made, but this question remains open.
Remark. Although not directly relevant in this paper, we mention that there are conjectural expressions for the $\theta_\pm$ distributions, at least in terms of other, more explicit, limits. For example, for K = SU(n) in the defining representation, conjecturally
$$(\theta_-)_*\mu_l = \lim_{n \to \infty}\frac{1}{Z}\det(1 + Z^*Z)^{-2\dot{g}-l}\,dm(P_n\theta_-) , \tag{2.14}$$
where $Z = Z(g_-) = C(g_-)A(g_-)^{-1}$ (following the notation in [8]), $g_-$ corresponds to $P_n\theta_-$, and $P_n$ projects $\theta_-$ to its first n coefficients (so that it is an orthogonal projection for $H^1(\Delta^*, \mathfrak{g})$). This expression is manifestly PSU(1, 1) invariant. This is the analogue of a well-known formula of Harish–Chandra for the invariant measure on a finite dimensional flag space (see [4, Theorem 5.20, p. 198]).
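As a numerical aside on (2.12), the two limiting statements made for it can be checked in the simplest case. The sketch below is an illustration only: it assumes the SU(2) specialization in the form written in (3.13), namely $\sin(\frac{\pi}{2+l})/\sin(\frac{\pi}{2+l}(1 - i\lambda))$, and compares it with $1/(1 - i\lambda)$, which is what (2.11) gives when $\langle\rho, \alpha\rangle = 2$. The function names are hypothetical.

```python
import cmath
import math

def diagonal_transform_su2(lam, level):
    """SU(2) case of (2.12), in the form of (3.13):
    sin(pi/(2+l)) / sin(pi (1 - i lam)/(2+l))."""
    k = math.pi / (2.0 + level)
    return cmath.sin(k) / cmath.sin(k * (1.0 - 1j * lam))

def harish_chandra_c_su2(lam):
    """Classical value 1/(1 - i lam): the SU(2) case of (2.11), assuming <rho, alpha> = 2."""
    return 1.0 / (1.0 - 1j * lam)

def level_zero_closed_form(lam):
    """At level 0 the ratio collapses to sech(pi lam / 2), since sin(pi/2 - i x) = cosh(x)."""
    return 1.0 / math.cosh(math.pi * lam / 2.0)
```

At level 0 the transform is the rapidly decaying function $\mathrm{sech}(\pi\lambda/2)$, whereas in the limit $l \to \infty$ it only decays like $1/\lambda$; the former corresponds to a much more regular $g_0$ distribution.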
3. The Conjecture for the Radial Part (R = ∞)

We introduce the ansatz that the subspace
$$L^2(G, (g_0)_*\mu) \subset L^2(\mu) \tag{3.1}$$
is invariant, or at least approximately invariant, with respect to the action of the Hamiltonian, $H = H_\infty$. This approximation, $H_G$, will necessarily be a K × K-invariant linear operator. To further restrict the possibilities, we also assume that $H_G$ is a second order differential operator.
Consider the Cartan decomposition
$$\psi : K \times \mathfrak{p} \to G : (k, x) \mapsto g = ke^x . \tag{3.2}$$
In these coordinates Harish–Chandra’s formula for the Haar measure of G is
$$dg = \prod_{\alpha > 0}\left(\frac{\sinh\alpha(a(x))}{\alpha(a(x))}\right)^2 dk \times dx , \tag{3.3}$$
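where $x \in \mathfrak{p}$ is K-conjugate to $a(x) \in \mathfrak{h}_{\mathbb{R}}$, and the product is over the positive roots (see [4, Theorem 5.8, p. 186]). Step 7 (Conjectural) of Sec. 2 is equivalent to
$$(g_0)_*\mu = \frac{1}{Z}\,\frac{\sum_W (-1)^w\int\prod_{\alpha > 0}\langle\lambda, \alpha\rangle^2\sinh(\pi\langle\lambda, \alpha\rangle/2\dot{g})^{-1}\,a^{iw\cdot\lambda}\,d\lambda}{\prod_{\alpha > 0}(a^\alpha - a^{-\alpha})}\,dg_0 , \tag{3.4}$$
where in this formula, for $g \in G$, $KgK = Ka(g)K$, $a \in \exp(\mathfrak{h}_{\mathbb{R}})/W$. This reduces to (1.10) for $K = SU_2$.
If K = SU(2, C), then $\mathfrak{p}$ consists of 2 × 2 Hermitian matrices, and we can use the standard identification
$$K \times \mathfrak{p} = S^3 \times (\mathbb{R}\vec{\imath} + \mathbb{R}\vec{\jmath} + \mathbb{R}\vec{k}) , \tag{3.5}$$
where
$$\vec{\imath} \leftrightarrow \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \vec{\jmath} \leftrightarrow \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \text{and} \quad \vec{k} \leftrightarrow \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} .$$
In these coordinates
$$dg = \left(\frac{\sinh(2|x|)}{2|x|}\right)^2 dk \times dx , \tag{3.6}$$
where dk denotes Haar measure for SU(2) and dx is Lebesgue measure for $\mathbb{R}^3$. The formula (1.10) is then
$$(g_0)_*\mu_0 = \frac{1}{Z}\,\frac{1}{\cosh^3(2|x|)}\left(\frac{\sinh(2|x|)}{2|x|}\right)^2 dk \times dx = \frac{1}{Z}\,\mathrm{sech}(2|x|)\left(\frac{\tanh(2|x|)}{2|x|}\right)^2 dk \times dx = \frac{1}{Z}\,\delta(r)\,dr \times dk \times dA_{S^2}(x') , \tag{3.7}$$
where $\delta = \mathrm{sech}(r)\tanh^2(r)$, $2x = rx'$, $x' \in S^2$ (see [6, Part III, Sec. 4.4]).
Let $D = \frac{\partial}{\partial r}$, and let $H_r$ denote the radial part of $H_G$. Since $H_r$ is self-adjoint and nonnegative with respect to $\delta(r)\,dr$, and because H, hence $H_r$, applied to a constant (the vacuum) vanishes, $H_r$ must necessarily be of the form
$$H_r^\alpha = -\delta^{-1/2}\circ D\circ\alpha(r)\circ D\circ\delta^{1/2} + \frac{D(\alpha D\delta^{1/2})}{\delta^{1/2}}$$
$$= \alpha\left(-\delta^{-1/2}\circ D^2\circ\delta^{1/2} + \frac{D^2(\delta^{1/2})}{\delta^{1/2}}\right) - D(\alpha)D$$
$$= \alpha\left(-\delta^{-1/2}\circ D^2\circ\delta^{1/2} + \frac{1}{4} - \frac{15}{4}\,\mathrm{sech}^2(r)\right) - D(\alpha)D , \tag{3.8}$$
where $\alpha = \alpha(r)$ is a positive function. This is our initial ballpark conjecture. The principal symbol of $H_r^\alpha$, in the coordinate r, is $\alpha\xi^2$, where $\xi$ is a variable dual to r. Determining $\alpha$ is thus equivalent to picking out a preferred geometry. We will now explain why $\alpha = 1$ appears to be a preferred choice.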
We are assuming that $H_r$ is the radial part of an operator $H_G$, and there is an intermediate operator $H_{\mathfrak{p}}$, acting on functions of $\mathfrak{p}$ alone, in the Cartan decomposition (these are functions which are invariant with respect to the left action of K; we could just as well consider the right action). The principal symbol of $H_{\mathfrak{p}}$ corresponds to a metric on $\mathfrak{p}$. In considering interesting possibilities for the principal symbol of $H_{\mathfrak{p}}$, it seems that this metric has the form
$$g_x(v, w) = \langle A(\mathrm{ad}(x))v, w\rangle , \tag{3.9}$$
where A is an analytic function which is expressible as a power series in powers of ad(x), $x \in \mathfrak{p}$. In Appendix B, we will show that in all such cases, $\alpha = 1$ (see (6) of Lemma B.1).
Suppose that $\alpha = 1$. In this case $H_r$ is equivalent to
$$-D^2 + \frac{1}{4} - \frac{15}{4}\,\mathrm{sech}^2(r) , \tag{3.10}$$
acting on odd functions of r. The restriction to odd functions of r is necessitated by the fact that $\delta^{1/2} = \mathrm{sech}^{1/2}(r)\tanh(r)$ is an odd function (functions in the domain of $H_r$ will then be of the form (odd function)/$\delta^{1/2}$, which will represent a well-defined function on G). This operator has a unique eigenvalue $\lambda = 0$ corresponding to the ground state, $\delta^{1/2}$, and the rest of the spectrum is continuous and of the form $[m, \infty)$, where $m = \frac{1}{4}$ is the mass gap, with multiplicity one. (Note: if we remove the restriction on the domain of (3.10) to odd functions, then the operator has a lower energy state, the even function $\mathrm{sech}^{3/2}(r)$, which corresponds to the eigenvalue −2.) The scattering theory for the $\mathrm{sech}^2$ potential (at least without domain restriction) is well-known (see e.g. [5, Sec. 2.5]). Taking the domain restriction into account, this should be related to Zamolodchikov’s conjectural S-matrix for this model (see [14]).
In our argument for $\alpha = 1$, we noted that $H_r$ does not determine the form of $H_G$ (or $H_{\mathfrak{p}}$). At the level of G, we have
$$H_G = \Phi^{-1/2}\circ\Delta\circ\Phi^{1/2} - \frac{\Delta(\Phi^{1/2})}{\Phi^{1/2}} , \tag{3.11}$$
where $\Delta$ (a Laplace type operator) is self-adjoint with respect to $dk \times dx$ and $(g_0)_*\mu = \Phi\,(dk \times dx)$. There are numerous possibilities for $\Delta$. For example, relative to the Cartan decomposition $G = K \times \mathfrak{p}$, we could have $\Delta = \Delta_K + \Delta_{\mathfrak{p}}$, the sum of the Laplacians. For this example, the $m = 1/4$ is directly related to the curvature of $G/K = H^3$ (relative to the normalization of our metric), because $\Delta_{G/K}$ is equivalent to $\Delta_{\mathfrak{p}} + 1/4$ (see [4, Proposition 3.10, p. 268], and Example B.1 in Appendix B). This is relevant to the explanation for various miracles that occur in harmonic analysis in 3 versus n dimensions (see [4, p. 266]). In Appendix B, we consider a second possibility in detail. This second possibility is interesting because it generalizes to other Riemannian manifolds, in a way which seems linked to renormalization of sigma models.
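The two eigenvalue claims made for the operator (3.10) can be checked by direct numerical differentiation. The following is a minimal sketch, assuming only the formulas stated above ($\delta^{1/2} = \mathrm{sech}^{1/2}(r)\tanh(r)$ and the potential $\frac{1}{4} - \frac{15}{4}\mathrm{sech}^2(r)$); the central-difference step size and the function names are illustrative choices.

```python
import math

def sech(r):
    return 1.0 / math.cosh(r)

def potential(r):
    """Potential of (3.10): 1/4 - (15/4) sech^2(r)."""
    return 0.25 - 3.75 * sech(r) ** 2

def ground_state(r):
    """delta^(1/2) = sech^(1/2)(r) tanh(r), an odd function of r."""
    return math.sqrt(sech(r)) * math.tanh(r)

def even_state(r):
    """sech^(3/2)(r), the even function mentioned in the text."""
    return sech(r) ** 1.5

def apply_operator(f, r, h=1e-4):
    """Apply -D^2 + potential to f at r, using a central second difference."""
    second = (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)
    return -second + potential(r) * f(r)

def eigen_ratio(f, r):
    """(H f)(r) / f(r); independent of r exactly when f is an eigenfunction."""
    return apply_operator(f, r) / f(r)
```

Evaluating `eigen_ratio(ground_state, r)` at several values of r should give values close to 0, and `eigen_ratio(even_state, r)` values close to −2, up to discretization error. Analytically, $D^2(\mathrm{sech}^{3/2}) = (\frac{9}{4} - \frac{15}{4}\mathrm{sech}^2)\,\mathrm{sech}^{3/2}$, which is where the eigenvalue $\frac{1}{4} - \frac{9}{4} = -2$ comes from.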
We now discuss how to incorporate a level l, which presumably is related to the massive deformation of the conformally invariant WZW model at level $l > 0$. We first recall from [6] that, at least conjecturally,
$$(g_0)_*\mu_l = \frac{1}{Z}\,\frac{1}{\cosh^3((2+l)|x|)}\,\chi_{l/2}(e^{2x})\left(\frac{\sinh(2|x|)}{2|x|}\right)^2 dk \times dx , \tag{3.12}$$
where $\chi_{l/2}$ is a character, at least for integer l/2. Here $\mu_l$ denotes the measure gotten by coupling a certain density appropriate at level l; it is the conjectural limit of the $\nu_{\beta,l}$ in (2.12); see [6].
Proof of (3.12). In [6, Part III, Sec. 4.4] we conjectured that (in the case when G = SL(2, C), $\dot{\Lambda} = 0$, and $\lambda$ is identified with $\lambda\alpha_1$, $\alpha_1 = \lambda_1 - \lambda_2$)
$$\int_{L_{hyp}G} a(g)^{-i\lambda}\,d\mu_l = \frac{\sin(\frac{\pi}{2(2+l)}2)}{\sin(\frac{\pi}{2(2+l)}2(1 - i\lambda))} . \tag{3.13}$$
Write $(g_0)_*\mu_l = \phi_l\,dm(g_0)$, where dm denotes G Haar measure. By [6, (4.4.12)]
$$\mathcal{H}\phi_l(\lambda) = \sin\!\left(\frac{\pi}{2+l}\right)\frac{i\lambda}{\sin(\frac{\pi}{2+l}i\lambda)} = \sin\!\left(\frac{\pi}{2+l}\right)\frac{\lambda}{\sinh(\frac{\pi}{2+l}\lambda)} . \tag{3.14}$$
By [6, (4.4.15)]
$$\phi_l\begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} = \frac{1}{Z}\,\frac{a^{2+l} - a^{-(2+l)}}{(a^{2+l} + a^{-(2+l)})^3}\,\frac{1}{a^2 - a^{-2}} = \frac{1}{Z}\cosh^{-3}((2+l)x)\,\chi_{l/2}\begin{pmatrix} a^2 & 0 \\ 0 & a^{-2} \end{pmatrix} , \tag{3.15}$$
where $\chi_{l/2}$ is the character for the SU(2) representation of dimension l/2 (assuming this is integral). This implies (3.12).
Write $r = (2+l)|x|$, $a = \frac{2}{2+l}$, $D = \frac{\partial}{\partial r}$, and
$$\delta = \mathrm{sech}^3(r)\sinh(r)\sinh(ar) , \tag{3.16}$$
so that the radial projection of $(g_0)_*\mu_l$ is (conjecturally) $Z^{-1}\delta(r)\,dr$. Assuming that $\alpha = 1$, for the same reasons cited above, we find that $H_r^l$ is conjecturally of the form
$$\delta^{-1/2}\circ|D|^2\circ\delta^{1/2} + \frac{1}{4}\left(5 + 2a^2 - 6a\,\frac{\tanh(r)}{\tanh(ar)} - (a\coth(ar) - \coth(r))^2 - 15\,\mathrm{sech}^2(r)\right) .$$
The potential well for this operator digs deeper as $l \to \infty$, so that the number of eigenvalues and bound states go to $\infty$ as $l \uparrow \infty$. Since $l \uparrow \infty$ is a kind of classical limit, one might expect that this spectrum converges to the spectrum of the Laplacian of K, but this has not been checked.
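The level-l potential above can be cross-checked against the density (3.16). The sketch below is an illustration under one stated assumption: that the zeroth order term is $D^2(\delta^{1/2})/\delta^{1/2}$, as in (3.8) with $\alpha = 1$. It evaluates this quantity through the identity $D^2(\delta^{1/2})/\delta^{1/2} = \frac{1}{2}(\ln\delta)'' + \frac{1}{4}((\ln\delta)')^2$ and compares it with the closed form. The function names are hypothetical.

```python
import math

def log_delta_derivs(r, a):
    """First and second derivatives of ln(delta), delta = sech^3(r) sinh(r) sinh(a r)."""
    coth_r = 1.0 / math.tanh(r)
    coth_ar = 1.0 / math.tanh(a * r)
    first = -3.0 * math.tanh(r) + coth_r + a * coth_ar
    second = (-3.0 / math.cosh(r) ** 2
              - 1.0 / math.sinh(r) ** 2
              - (a / math.sinh(a * r)) ** 2)
    return first, second

def potential_from_delta(r, a):
    """D^2(delta^(1/2)) / delta^(1/2) = (1/2)(ln delta)'' + (1/4)((ln delta)')^2."""
    first, second = log_delta_derivs(r, a)
    return 0.5 * second + 0.25 * first ** 2

def potential_closed_form(r, a):
    """The bracketed level-l potential, read off from the displayed formula."""
    coth_r = 1.0 / math.tanh(r)
    coth_ar = 1.0 / math.tanh(a * r)
    return 0.25 * (5.0 + 2.0 * a * a
                   - 6.0 * a * math.tanh(r) / math.tanh(a * r)
                   - (a * coth_ar - coth_r) ** 2
                   - 15.0 / math.cosh(r) ** 2)

def level_to_a(level):
    """a = 2 / (2 + l), as in the text."""
    return 2.0 / (2.0 + level)
```

For $l = 0$ one has $a = 1$, the density (3.16) becomes $\mathrm{sech}(r)\tanh^2(r)$, and the closed form reduces to $\frac{1}{4} - \frac{15}{4}\mathrm{sech}^2(r)$, in agreement with (3.10).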
4. The Finite R Case

As we explained in the Introduction and in (1.5), for the sigma model with target K, we expect that there should be a natural Hilbert space $\mathcal{H}(S^1_R) = L^2(\frac{1}{Z}\Omega_R^2\,dV)$ for each $0 < R \le \infty$, where heuristically we think of $\Omega_R$ as the vacuum state. At this point, we lack a construction and a conjectural characterization (as in Conjecture 1.1), for the appropriate measure, when R is finite. However, in this section we will assume this can be done. The point of this section is to explore what appears to be a natural conjecture for the $g_0$ distribution.
As in Sec. 3, we will write $g_0 = l_0\,m\,a\,u_0$ for the triangular decomposition (when it exists), and $Kg_0K = KaK$ for the Cartan decomposition, where $a \in A = \exp(\mathfrak{h}_{\mathbb{R}})$ and $a \in A/W$, respectively. As in [13, Chap. XXI], $\theta_1$ will denote the odd theta function
$$\theta_1(x, \tau) = 2q^{1/8}\sin(x) - 2q^{3^2/8}\sin(3x) + \cdots \tag{4.1}$$
$$= 2q^{1/8}\sin(x)\prod_{n=1}^{\infty}(1 - q^n)(1 - q^n e^{i2x})(1 - q^n e^{-i2x}) , \tag{4.2}$$
where $q = \exp(2\pi i\tau)$ (this is the square of “q” in [13]), $\mathrm{Im}(\tau) > 0$, and the equality is known as the Jacobi triple product formula. This theta function has the quasiperiodicity properties
$$\theta_1(x + \pi) = -\theta_1(x), \quad \theta_1(x + \pi\tau) = -q^{-1/2}e^{-2ix}\theta_1(x) , \tag{4.3}$$
and zeros at the points
$$x = n\pi + m\pi\tau, \quad m, n \in \mathbb{Z} . \tag{4.4}$$
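The series (4.1), the product (4.2), and the location of the zeros (4.4) can be compared numerically. The following is a minimal sketch with $q = \exp(2\pi i\tau)$ as in the text; the truncation lengths and function names are illustrative choices. The quasi-periodicity check uses the relation in the form $\theta_1(x + \pi\tau) = -q^{-1/2}e^{-2ix}\theta_1(x)$, which is the form compatible with the zeros at $x = n\pi + m\pi\tau$.

```python
import cmath
import math

def theta1_series(x, tau, terms=40):
    """theta_1(x, tau) = 2 sum_{n>=0} (-1)^n q^((2n+1)^2/8) sin((2n+1) x), q = exp(2 pi i tau)."""
    total = 0j
    for n in range(terms):
        # q^((2n+1)^2 / 8), written through tau to avoid any branch ambiguity
        weight = cmath.exp(2j * math.pi * tau * (2 * n + 1) ** 2 / 8.0)
        total += (-1) ** n * weight * cmath.sin((2 * n + 1) * x)
    return 2.0 * total

def theta1_product(x, tau, terms=200):
    """Jacobi triple product form (4.2), truncated after `terms` factors."""
    q = cmath.exp(2j * math.pi * tau)
    result = 2.0 * cmath.exp(2j * math.pi * tau / 8.0) * cmath.sin(x)
    for n in range(1, terms + 1):
        qn = q ** n
        result *= (1 - qn) * (1 - qn * cmath.exp(2j * x)) * (1 - qn * cmath.exp(-2j * x))
    return result

def quasi_period_defect(x, tau):
    """theta_1(x + pi tau) + q^(-1/2) e^(-2ix) theta_1(x); zero if the relation holds."""
    shifted = theta1_series(x + math.pi * tau, tau)
    factor = cmath.exp(-1j * math.pi * tau) * cmath.exp(-2j * x)  # q^(-1/2) e^(-2ix)
    return shifted + factor * theta1_series(x, tau)
```

For example, with $\tau = iR$ and $R = 1/2$ one can evaluate both forms at a complex point such as $x = 0.7 + 0.2i$ and confirm that they agree to rounding error, and that `theta1_series(math.pi + math.pi * tau, tau)` is numerically zero.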
Below we will also need to consider the even theta functions $\theta_3$ and $\theta_4$, which have analogous properties (see [13]).
Conjecture 4.1. The analogue of (2.12) (the diagonal distribution) is
$$\int a^{-i\lambda}\,\frac{1}{Z}\Omega^2_{R,l}\,dV = c\prod_{\alpha > 0}\frac{\sinh(\frac{\pi}{2R(\dot{g}+l)}\langle\rho - i\lambda, \alpha\rangle)}{\langle\rho - i\lambda, \alpha\rangle\,\theta_1(\frac{\pi}{2(\dot{g}+l)}\langle\rho - i\lambda, \alpha\rangle, iR)} , \tag{4.5}$$
where c is determined by the condition that the right hand side of (4.5) is 1 at $\lambda = 0$.
If we write $(g_0)_*(\frac{1}{Z}\Omega^2_{R,l}\,dV) = \phi_{R,l}(g_0)\,dm(g_0)$, where dm denotes Haar measure for G, then Conjecture 4.1 is equivalent to
$$(\mathcal{H}\phi_{R,l})(\lambda) = c\prod_{\alpha > 0}\frac{\sin(\frac{\pi}{2R(\dot{g}+l)}\langle\lambda, \alpha\rangle)}{\theta_1(\frac{\pi}{2(\dot{g}+l)}\langle i\lambda, \alpha\rangle, iR)} \tag{4.6}$$
for the Harish–Chandra transform. Note that the zeros of the sine function in (4.6) exactly cancel with the zeros of $\theta_1(i(\cdot))$, so the $\alpha$ factor in (4.6) is smooth and rapidly decreasing as a function of the single variable $\langle\lambda, \alpha\rangle$, for each positive root $\alpha$.
The motivations for this conjecture are rather vague: the philosophy that theta functions are natural q-deformations of trigonometric functions, the relevance of the
q-deformation of the affine algebra $\hat{L}\mathfrak{g}$ to integrable models (see [9] and references there), and the surprising appearance of “τ” in similar models (especially gauge theories; see e.g. [12]).
To show that this formula is reasonable, there are several things that need to be checked. The first is to note that (4.5) reduces to (2.12) when $R \uparrow \infty$. This follows in an elementary way from (4.1) (in verifying this, one must bear in mind the dependence of c on R). Thus this formula is consistent with our earlier claim.
Secondly, we need to know that the transforms we are writing down actually correspond to positive measures. We first consider (4.5).
Proposition 4.1. The right hand side of (4.5) is a positive definite function of $\lambda \in \mathfrak{h}^*_{\mathbb{R}}$.
Proof. Products of positive definite functions are positive definite. In (4.5) we have a product over roots, and it suffices to show that each factor is positive definite as a function of one variable (the distance from $\ker(\langle\cdot, \alpha\rangle)$). The dual Coxeter number is given by $\dot{g} = 1 + \langle\rho, \theta\rangle/2$, where $\theta$ is the highest root and (we recall that) $\rho$ is the sum of the positive roots. Using this and the fact that $\langle\rho, \alpha\rangle \le \langle\rho, \theta\rangle$, for each root $\alpha$, we can write
$$\frac{\pi}{2(\dot{g}+l)}(\langle\rho, \alpha\rangle - i\langle\lambda, \alpha\rangle) = x_0 + iy , \tag{4.7}$$
where $0 < x_0 < \pi$, and y is a scaling of the variable $\langle\lambda, \alpha\rangle$. Since scaling a positive definite function does not change its positivity, it therefore suffices to prove that
$$\frac{\sinh((x_0 + iy)/R)}{\theta_1(x_0 + iy, iR)\,(x_0 + iy)/R} \tag{4.8}$$
is a positive definite function of y. This is a consequence of the following striking result, which probably is known.
Lemma 4.1. For $0 < x_0 < \pi$,
$$\frac{1}{2\pi}\int\frac{1}{\theta_1(x_0 + iy, iR)}\,e^{ipy}\,dy = \frac{1}{\theta_1'(0, iR)}\,\frac{\theta_4(\pi Rp/2, iR)}{e^{x_0 p} + e^{(x_0 - \pi)p}} . \tag{4.9}$$
Proof. This is a straightforward residue calculation. Suppose that $p > 0$. The residues for the integrand on the left hand side of (4.9), as a function of complex y, occur at the points $y = m\pi R + i(x_0 + n\pi)$, $n, m \in \mathbb{Z}$, $n \ge 0$. Thus the left hand side of (4.9) equals
$$i\sum_{m \in \mathbb{Z},\,n \ge 0}\frac{\exp(ip(m\pi R + i(x_0 + n\pi)))}{i\,\theta_1'(-n\pi + im\pi R)} . \tag{4.10}$$
Using the quasi-periodicity properties of $\theta_1$ in (4.3), we obtain
$$\theta_1'(-n\pi + im\pi R) = (-1)^{n+m}q^{-m^2/2}\,\theta_1'(0) . \tag{4.11}$$
Thus (4.10) equals
$$\frac{e^{-x_0 p}}{\theta_1'(0)}\left(\sum_{n=0}^{\infty}(-1)^n e^{-\pi pn}\right)\left(\sum_{m \in \mathbb{Z}}(-1)^m q^{m^2/2}e^{i\pi Rpm}\right) . \tag{4.12}$$
The first sum is a geometric series, equal to $1/(1 + e^{-\pi p})$. The second sum is expressible in terms of $\theta_4$, and this implies Lemma 4.1.
The Fourier transform of sin(y)/y is essentially a characteristic function. This, together with Lemma 4.1, implies that the inverse Fourier transform of (4.8), as a function of p, is the convolution of measures
$$\frac{1}{\theta_1'(0, iR)}\,\frac{\theta_4(\pi Rp/2, iR)}{e^{x_0 p} + e^{(x_0 - \pi)p}} * e^{-x_0 p}\chi_{[-1/R,\,1/R]}(p)/2R . \tag{4.13}$$
The crucial fact now is that the function $\theta_4(x, iR)$ is positive for $x \in \mathbb{R}$. Thus both measures are positive, implying that the convolution is positive. This completes the proof of Proposition 4.1.
Remark. Another possible approach to Lemma 4.1 is to consider the Jacobi triple product formula for $\theta_1$, (4.2), which corresponds to an (infinite) convolution product formula for Lemma 4.1. If we compute the inverse Fourier transform for the nth term, we find that for $0 < x_0 < \pi$,
$$\frac{1}{2\pi}\int\frac{1}{1 - 2\cos(2(x_0 + iy))q^n + q^{2n}}\,e^{ipy}\,dy \tag{4.14}$$
$$= \frac{\sin(\pi nRp)}{\sinh(\pi nR)(e^{x_0 p} + e^{(x_0 - \pi)p})} , \tag{4.15}$$
which is highly oscillatory. From this point of view, the positivity of Lemma 4.1 is surprising.
We now consider the formula (4.6) for the Harish–Chandra transform. The abstract inversion formula is
$$\phi_{R,l}(g_0) = \frac{1}{Z\prod_{\alpha > 0}(a^\alpha - a^{-\alpha})} \times \sum_W (-1)^w\int\prod_{\alpha > 0}\langle\lambda, \alpha\rangle^2\,\frac{\sin(\frac{\pi}{2R(\dot{g}+l)}\langle\lambda, \alpha\rangle)}{\theta_1(\frac{\pi}{2(\dot{g}+l)}\langle\lambda, \alpha\rangle)}\,a^{iw\cdot\lambda}\,d\lambda . \tag{4.16}$$
One can change variables in the integrals to reduce the calculations to the case when $l = 0$. We will analyze this in the case when K = SU(2).
Lemma 4.2.
$$\frac{i}{2\pi}\int\frac{y}{\theta_1(iy)}\,\frac{\sinh(iy/R)}{iy/R}\,e^{ipy}\,dy = \frac{R\sinh(\pi/R)\,\theta_3(\pi Rp/2, iR)}{4\theta_1'(0)(\sinh^2(\pi/(2R)) + \cosh^2(\pi p/2))} . \tag{4.17}$$
Proof. This is another straightforward residue calculation. Suppose that $p > 0$. As a function of the complex variable y, the singularities of the integrand on the
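left hand side of (4.17) occur at the points $y = m\pi R + in\pi$, $m, n \in \mathbb{Z}$, $n > 0$. Using the formula (4.11) to calculate the residues, we see that the left hand side of (4.17) equals
$$R\sum_{n > 0,\,m}\frac{\exp(-\pi pn + i\pi Rpm)\sinh((-n\pi + imR\pi)/R)}{(-1)^n(-1)^m q^{-m^2/2}\,\theta_1'(0)} \tag{4.18}$$
$$= \frac{R}{\theta_1'(0)}\left(\sum_m q^{m^2/2}e^{i\pi Rpm}\right)\left(\sum_{n > 0}(-1)^n\sinh(-n\pi/R)\,e^{-n\pi p}\right) \tag{4.19}$$
$$= \frac{R}{2\theta_1'(0)}\,\theta_3(\pi Rp/2)\left(\frac{e^{\pi/R}}{e^{\pi p} + e^{\pi/R}} - \frac{e^{-\pi/R}}{e^{\pi p} + e^{-\pi/R}}\right) . \tag{4.20}$$
After some elementary manipulations, this leads to (4.17).
For the SU(2, C) case, we employ the same notation as in Sec. 3. Thus we identify $a \in \exp(\mathfrak{h}_{\mathbb{R}})$ with $\begin{pmatrix} e^x & 0 \\ 0 & e^{-x} \end{pmatrix}$, $x \in \mathbb{R}$, we identify $\lambda \in \mathfrak{h}_{\mathbb{R}}$ with $\lambda\alpha_1$, where $\lambda \in \mathbb{R}$ and $\alpha_1$ is the positive root for sl(2, C), and we write $r = 2|x|$. We then have
$$\phi_{R,l}(g) = \frac{1}{Z}\,\frac{1}{\sinh(2|x|)}\left(-\frac{\partial}{\partial z}\right)\left.\frac{\theta_3(Rz, iR)}{\sinh(\pi/(2R))^2 + \cosh(z)^2}\right|_{z = (2+l)x/\pi} , \tag{4.21}$$
where Z is a normalization constant, so that the integral with respect to Haar measure for SL(2, C) is 1.
Proposition 4.2. We have $\phi_{R,l} \ge 0$.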
Proof. By doing the differentiation in (4.21), we see that Proposition 4.2 is equivalent to
$$\frac{\partial}{\partial z}\ln(\theta_3(Rz, iR)) \le \frac{2\sinh(z)\cosh(z)}{\sinh(\pi/(2R))^2 + \cosh(z)^2} , \quad z > 0 . \tag{4.22}$$
The left hand side of (4.22) has period $\pi/R$. The function $\theta_3(Rz, iR)$ is decreasing on $[0, \pi/(2R)]$, so the left hand side of (4.22) is negative on this interval, and hence the claim is trivially true on this interval. It is straightforward to check that the right hand side of (4.22) is an increasing function of z. Thus it suffices to prove (4.22) on the finite interval $[\pi/(2R), \pi/R]$. (Note this means that for values of R on the order of 1, one can with confidence simply look at the graph of $\theta_3(Rz, iR)/(\sinh(\pi/(2R))^2 + \cosh(z)^2)$, and check that it is decreasing on the appropriate interval.)
The left hand side of (4.22) equals
$$-4R\sin(2Rz)\sum_{n=1}^{\infty}\frac{q^{n-1/2}}{1 + 2\cos(2Rz)q^n + q^{2n}} . \tag{4.23}$$
The maximum of this function of $\pi/(2R) \le z \le \pi/R$ is the same as the maximum of the function
$$2R\sin(\theta)\,e^{\pi R}\sum_{n=1}^{\infty}\frac{1}{\cosh(2\pi Rn) - \cos(\theta)} , \quad 0 \le \theta \le \pi , \tag{4.24}$$
where $z = (2\pi - \theta)/(2R)$.
We first derive an easy bound for (4.24), which is sufficient for R sufficiently large. On the domain $0 \le \theta \le \pi$, the function $\sin(\theta)/(\cosh(2\pi Rn) - \cos(\theta))$ has a maximum value of $1/\sinh(2\pi Rn)$, which is achieved at the point $\theta = \theta_{R,n}$ satisfying $\cos(\theta) = \cosh(2\pi Rn)^{-1}$. Thus (4.24) is bounded by
$$2Re^{\pi R}\sum_{n=1}^{\infty}\frac{1}{\sinh(2\pi Rn)} , \tag{4.25}$$
which is a decreasing function of R. It is easy to check that this is dominated by the minimum of the right hand side of (4.22),
$$\frac{2\sinh(\pi/(2R))\cosh(\pi/(2R))}{\sinh(\pi/(2R))^2 + \cosh(\pi/(2R))^2} , \tag{4.26}$$
for R sufficiently large (in fact for $R > 1/20$ (using Maple, for example)). But (4.25) diverges as $R \downarrow 0$, and so this does not work in general.
Now consider small R. The function (4.24) vanishes at 0 and $\pi$, and it has a unique maximum at a point $\theta_R$ in the interior. This point is determined by setting the derivative of (4.24) to zero, and this gives rise to the equation
$$\sum_{n=1}^{\infty}\frac{\cos(\theta_R)\cosh(2\pi Rn) - 1}{(\cosh(2\pi Rn) - \cos(\theta_R))^2} = 0 , \tag{4.27}$$
which is not explicitly solvable. However, we previously calculated the unique critical points for the terms in (4.24), and from this we see that
$$\theta_R \ge \min\{\theta_{R,n} : n \ge 1\} = 2\pi R . \tag{4.28}$$
This will allow us to avoid multiple cases below. Since $\cosh(x) \ge 1 + x^2/2$, the function (4.24) is bounded by
$$2Re^{\pi R}\sin(\theta)\sum_{n=1}^{\infty}\frac{1}{(1 - \cos(\theta)) + (2\pi Rn)^2} . \tag{4.29}$$
The Poisson summation formula (applied to the function $f(x) = \exp(-|x|)$) implies the identity
$$\sum_{n=1}^{\infty}\frac{1}{\alpha^2 + n^2} = \frac{1}{2}\left(\frac{\pi}{\alpha}\,\mathrm{cotanh}(\pi\alpha) - \frac{1}{\alpha^2}\right) . \tag{4.30}$$
This identity, with $\alpha^2 = (1 - \cos(\theta))/(2\pi R)^2$, implies that (4.29) equals
$$Re^{\pi R}\sin(\theta)\,\frac{1}{(2\pi R)^2}\left(\frac{\pi\cdot 2\pi R}{(1 - \cos(\theta))^{1/2}}\,\mathrm{cotanh}(\pi\alpha) - \frac{(2\pi R)^2}{1 - \cos(\theta)}\right) . \tag{4.31}$$
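The summation identity (4.30) is easy to confirm numerically. The sketch below compares a long partial sum, corrected by an integral estimate of the tail, with the closed form; the truncation length and the function names are illustrative choices, not part of the argument.

```python
import math

def lattice_sum(alpha, terms=20000):
    """sum_{n>=1} 1/(alpha^2 + n^2): a partial sum plus a midpoint estimate of the tail."""
    partial = sum(1.0 / (alpha * alpha + n * n) for n in range(1, terms + 1))
    # tail ~ integral of dx/(alpha^2 + x^2) from terms + 1/2 to infinity
    tail = math.atan(alpha / (terms + 0.5)) / alpha
    return partial + tail

def closed_form(alpha):
    """Right hand side of (4.30): (1/2) (pi/alpha * coth(pi alpha) - 1/alpha^2)."""
    return 0.5 * (math.pi / alpha / math.tanh(math.pi * alpha) - 1.0 / (alpha * alpha))
```

In the application to (4.29), the relevant value is $\alpha = (1 - \cos\theta)^{1/2}/(2\pi R)$, so that $\sum_n 1/((1 - \cos\theta) + (2\pi Rn)^2) = (2\pi R)^{-2}\sum_n 1/(\alpha^2 + n^2)$.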
Thus (4.24) is bounded by
$$e^{\pi R}\,\frac{\sin(\theta)\,\mathrm{cotanh}(\pi\alpha)}{2(1 - \cos(\theta))^{1/2}} . \tag{4.32}$$
For sufficiently small $R$, because of (4.28),
\[
1 - \cos(\theta_R) \ge \frac{1}{2}(2\pi R)^2. \tag{4.33}
\]
This implies that $\alpha(\theta_R) \ge 1/\sqrt{2}$. Because cotanh is decreasing, and $\sin(\theta)(1 - \cos(\theta))^{-1/2}$ is bounded by $\sqrt{2}$, this implies that (4.32) is bounded by
\[
e^{\pi R}\,2^{-1/2}\,\mathrm{cotanh}(\pi/\sqrt{2}) \le (0.73)\,e^{\pi R}. \tag{4.34}
\]
This is bounded by (4.26) (which is very close to 1), for $R < 1/10$ (using Maple).

We now specialize to the case when $l = 0$. In the analogue of (3.7) we have
\[
\delta = \delta_R(r) = -\frac{1}{Z}\sinh(r)\,\frac{\partial}{\partial r}\,\frac{\theta_3(Rr, iR)}{\sinh(\pi/(2R))^2 + \cosh(r)^2}. \tag{4.35}
\]
This is positive, and its square root is the vacuum. The corresponding potential function is given by
\[
q_R(r) = \frac{D^2(\delta^{1/2})}{\delta^{1/2}} = \frac{1}{2}(\ln\delta)'' + \left(\frac{1}{2}(\ln\delta)'\right)^2 = \frac{1}{2}\frac{\delta''}{\delta} - \left(\frac{1}{2}\frac{\delta'}{\delta}\right)^2. \tag{4.36}
\]
When $R = \infty$, this is $\frac{1}{4} - \frac{15}{4}\,\mathrm{sech}^2(r)$, which is bounded. As we explained in the Introduction, we would like to believe that the operator $-D^2 + q_R$ has discrete spectrum, when $R < \infty$. This is equivalent to showing that $q_R$ is unbounded as $r \uparrow \infty$, for $R < \infty$. Unfortunately, I have not been able to resolve this issue.
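As an illustrative check of the two numerical claims above (the constant in (4.34) and the value of the potential at $R = \infty$), one can evaluate them directly. The sketch below assumes that in the limit $R = \infty$ the theta function in (4.35) tends to 1 and $\sinh(\pi/(2R))$ tends to 0, so that $\delta$ is proportional to $\sinh(r)^2/\cosh(r)^3$; this limiting form, the finite difference step and all function names are assumptions made for the example.

```python
import math

def bound_constant():
    """The constant 2^(-1/2) * cotanh(pi / sqrt(2)) appearing in (4.34)."""
    return 2.0 ** -0.5 / math.tanh(math.pi / math.sqrt(2.0))

def sqrt_delta(r):
    """delta^(1/2) at R = infinity, up to a constant factor (assumed limiting form)."""
    return math.sinh(r) / math.cosh(r) ** 1.5

def q_numeric(r, h=1e-4):
    """D^2(delta^(1/2)) / delta^(1/2) by a central second difference."""
    u = sqrt_delta
    return (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * u(r))

def q_closed(r):
    """The closed form 1/4 - (15/4) sech^2(r) quoted after (4.36)."""
    return 0.25 - 3.75 / math.cosh(r) ** 2

if __name__ == "__main__":
    print("constant in (4.34):", bound_constant())
    for r in (0.5, 1.0, 2.0, 4.0):
        print(r, q_numeric(r), q_closed(r))
```

Constant factors in $\delta$ drop out of the quotient in (4.36), which is why the normalization $Z$ plays no role in this check.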
Acknowledgment

The author thanks Hermann Flaschka and John Palmer for helpful conversations.

Appendix A

In this appendix, we will take $K = SU(2, \mathbb{C})$. In this case, the group $L_{pol}K$ is generated by the two subgroups, $K$ (the constant loops), and $K^{\tau}$, where $\tau$ is the outer automorphism
\[
\tau : LSL(2, \mathbb{C}) \to LSL(2, \mathbb{C}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto \begin{pmatrix} a & bz \\ cz^{-1} & d \end{pmatrix}, \tag{A.1}
\]
i.e. $\tau$ is conjugation by the multivalued loop $\begin{pmatrix} z^{1/2} & 0 \\ 0 & z^{-1/2} \end{pmatrix}$. It is evident that $\tau$ extends to the hyperfunction and formal completions. The Conjecture 1.1 would imply that $\mu$ is $\tau$-invariant.

The left and right actions of the constants $K$ on $L_{hyp}G$ are completely transparent, in terms of the Birkhoff factorization $g = g_- g_0 g_+$ of a generic $g \in L_{hyp}G$:
\[
k_L \cdot g \cdot k_R^{-1} = [k_L g_- k_L^{-1}] \cdot [k_L g_0 k_R^{-1}] \cdot [k_R g_+ k_R^{-1}]. \tag{A.2}
\]
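The action of $K^{\tau}$ is transparent in terms of the $\tau$-transformed factorization. But to put these two actions together, we need an expression for the action of $K^{\tau}$ in terms of Birkhoff coordinates.

Lemma A.1. The left action of $k = \begin{pmatrix} a & bz \\ -\bar{b}z^{-1} & \bar{a} \end{pmatrix}$ on a generic $g \in L_{hyp}G$, $g \to k \cdot g$, in terms of the Birkhoff factorization $g = g_- g_0 g_+$, is given by
\[
\left[ \begin{pmatrix} a & bz \\ -\bar{b}z^{-1} & \bar{a} \end{pmatrix} g_- \begin{pmatrix} \alpha^{-1} & \beta(z) \\ 0 & \alpha \end{pmatrix} \right] \cdot \left[ \begin{pmatrix} \alpha & -\beta_0 \\ 0 & \alpha^{-1} \end{pmatrix} g_0 \right] \cdot \left[ g_0^{-1} \begin{pmatrix} 1 & -\frac{\beta_1}{\alpha} z \\ 0 & 1 \end{pmatrix} g_0\, g_+ \right],
\]
where $\alpha = a + bC_1$, $\beta_0 = \dfrac{2abA_1 + b^2(C_2 + A_1C_1)}{\alpha}$, $\beta_1 = -b$, $\beta(z) = \beta_0 + \beta_1 z$, and
\[
g_- = \begin{pmatrix} A(z) & B(z) \\ C(z) & D(z) \end{pmatrix} = 1 + \begin{pmatrix} A_1 & B_1 \\ C_1 & -A_1 \end{pmatrix} z^{-1} + \begin{pmatrix} A_2 & B_2 \\ C_2 & D_2 \end{pmatrix} z^{-2} + \cdots.
\]
Similarly, the right action, $g \to g \begin{pmatrix} a & bz \\ -\bar{b}z^{-1} & \bar{a} \end{pmatrix}^{-1}$, is given by
\[
\left[ g_- g_0 \begin{pmatrix} 1 & 0 \\ -\frac{\gamma_1}{\alpha} z^{-1} & 1 \end{pmatrix} g_0^{-1} \right] \cdot \left[ g_0 \begin{pmatrix} \alpha & 0 \\ -\gamma_0 & \alpha^{-1} \end{pmatrix} \right] \cdot \left[ \begin{pmatrix} \alpha^{-1} & 0 \\ \gamma(z^{-1}) & \alpha \end{pmatrix} g_+ \begin{pmatrix} a & bz \\ -\bar{b}z^{-1} & \bar{a} \end{pmatrix}^{-1} \right],
\]
where $\alpha = \bar{a} + \bar{b}b_1$, $\gamma_0 = \dfrac{2\bar{a}\bar{b}a_1 + \bar{b}^2(b_2 + a_1b_1)}{\alpha}$, $\gamma_1 = -\bar{b}$, $\gamma(z^{-1}) = \gamma_0 + \gamma_1 z^{-1}$, and
\[
g_+ = \begin{pmatrix} a(z) & b(z) \\ c(z) & d(z) \end{pmatrix} = 1 + \begin{pmatrix} a_1 & b_1 \\ c_1 & -a_1 \end{pmatrix} z + \cdots.
\]
The action of $\tau$ on $L_{hyp}G$, in terms of the Birkhoff factorization, is given by
\[
(g^{\tau})_- = (g_-)^{\tau} \begin{pmatrix} 1 & 0 \\ \frac{c_0}{a_0} z^{-1} & 1 \end{pmatrix} \begin{pmatrix} 1 & -B_1 \\ 0 & 1 \end{pmatrix},
\qquad
(g^{\tau})_0 = \begin{pmatrix} a_0 + \frac{B_1 c_1}{a_0} & \frac{B_1}{a_0} \\ \frac{c_1}{a_0} & \frac{1}{a_0} \end{pmatrix},
\]
\[
\begin{aligned}
(g^{\tau})_+ &= \begin{pmatrix} 1 & 0 \\ -c_1 & 1 \end{pmatrix} \begin{pmatrix} 1 & \frac{b_0}{a_0} z \\ 0 & 1 \end{pmatrix} (g_+)^{\tau} \\
&= \begin{pmatrix} a(z) + \frac{b_0}{a_0} c(z) & b(z)z + \frac{b_0}{a_0} d(z)z \\ -c_1 a(z) - c_1 \frac{b_0}{a_0} c(z) + c(z)z^{-1} & -c_1 b(z)z - c_1 \frac{b_0}{a_0} d(z)z + d(z) \end{pmatrix} \\
&= \begin{pmatrix} 1 + (a_1 + \frac{b_0}{a_0} c_1)z + \cdots & \frac{b_0}{a_0} z + (b_1 + \frac{b_0}{a_0} d_1)z^2 + \cdots \\ (-c_1 a_1 - c_1 \frac{b_0}{a_0} c_1 + c_2)z + \cdots & 1 + (d_1 - c_1 \frac{b_0}{a_0})z + \cdots \end{pmatrix},
\end{aligned}
\]
where in addition to the notation for $g_{\pm}$ above, we have also written $g_0 = \begin{pmatrix} a_0 & b_0 \\ c_0 & d_0 \end{pmatrix}$.

Proof. These are tedious but straightforward calculations, and are best left to the reader!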
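To understand the significance of these formulae, note that for the right action of $k = \begin{pmatrix} a & bz \\ -\bar{b}z^{-1} & \bar{a} \end{pmatrix}$,
\[
\begin{aligned}
g_+(gk^{-1}) &= \begin{pmatrix} \alpha^{-1} & 0 \\ \gamma(z^{-1}) & \alpha \end{pmatrix} \begin{pmatrix} a(g_+)\bar{a} + b(g_+)\bar{b}z^{-1} & a(g_+)(-bz) + b(g_+)a \\ \cdots & \cdots \end{pmatrix} \\
&= \begin{pmatrix} 1 + \frac{a_1\bar{a} + b_2\bar{b}}{\alpha} z + \frac{a_2\bar{a} + b_3\bar{b}}{\alpha} z^2 + \cdots & \frac{-b + ab_1}{\alpha} z + \frac{-a_1 b + ab_2}{\alpha} z^2 + \cdots \\ \cdots & \cdots \end{pmatrix}.
\end{aligned} \tag{A.3}
\]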
Inspection of the (1, 2) entry leads to (2.6). More generally, we have the following consequence.

Proposition A.1. The space of variables $\{b_1, a_n, b_{n+1}\}$ is invariant under the right action of $k \in SU_2^{\tau}$, for each $n = 1, 2, \ldots$. The action is given by
\[
b_1,\ a_n,\ b_{n+1} \to \frac{-b + ab_1}{\alpha},\ \frac{a_n\bar{a} + b_{n+1}\bar{b}}{\alpha},\ \frac{-a_n b + ab_{n+1}}{\alpha},
\]
or
\[
b_1,\ b'_{n+1},\ a_n \to \frac{-b + ab_1}{\bar{a} + \bar{b}b_1},\ \frac{-b + ab'_{n+1}}{\bar{a} + \bar{b}b'_{n+1}},\ a_n\,\frac{\bar{a} + \bar{b}b'_{n+1}}{\bar{a} + \bar{b}b_1},
\]
where $b'_{n+1} = b_{n+1}/a_n$.

This implies the following generalization of Step 4.

Corollary A.1. For each $n \ge 1$, in the limit as $\beta \to 0$, the distribution of $b'_n$ is the $SU_2$-invariant distribution on $\hat{\mathbb{C}}$,
\[
\frac{1}{Z}(1 + |b'_n|^2)^{-2}\,dm(b'_n).
\]
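The invariance statement behind Corollary A.1 can be illustrated numerically: the fractional linear map $w \to (-b + aw)/(\bar{a} + \bar{b}w)$ of Proposition A.1, with $|a|^2 + |b|^2 = 1$, should preserve the measure $(1 + |w|^2)^{-2}\,dm(w)$. The Python sketch below checks the pointwise change-of-variables identity; the parametrization of $(a, b)$ by three angles and the function names are hypothetical choices made for the example.

```python
import cmath
import math

def su2_element(theta, phi, psi):
    """Return (a, b) with |a|^2 + |b|^2 = 1 (hypothetical parametrization)."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * psi)
    return a, b

def act(a, b, w):
    """The map w -> (-b + a w) / (conj(a) + conj(b) w) from Proposition A.1."""
    return (-b + a * w) / (a.conjugate() + b.conjugate() * w)

def density(w):
    """Unnormalized density (1 + |w|^2)^(-2) with respect to dm(w)."""
    return (1.0 + abs(w) ** 2) ** -2

def jacobian(a, b, w):
    """Real Jacobian |d act/dw|^2, which equals |conj(a) + conj(b) w|^(-4)."""
    return abs(a.conjugate() + b.conjugate() * w) ** -4

def transported_density(a, b, w):
    """density(act(w)) times the Jacobian; invariance means this equals density(w)."""
    return density(act(a, b, w)) * jacobian(a, b, w)

if __name__ == "__main__":
    a, b = su2_element(0.4, 0.3, -1.1)
    for w in (0.7 - 0.2j, -1.5 + 2.0j, 0.0j):
        print(w, transported_density(a, b, w), density(w))
```

The Jacobian formula uses that the complex derivative of the map is $(|a|^2 + |b|^2)/(\bar{a} + \bar{b}w)^2 = (\bar{a} + \bar{b}w)^{-2}$.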
We can now use these formulae to explain why the coefficients for $g_+$ (or $x_+$ or $\theta_+$) have the property that the mass of their $\nu_\beta$-distributions does not escape to infinity as $\beta \to 0$. For this purpose, we will say that one of these coefficients, or a function of these coefficients, say $\gamma$, is tight if the mass of the distributions $\{\gamma_*\nu_\beta\}$ does not escape to $\infty$ as $\beta \to 0$ (i.e. the set of measures $\{\gamma_*\nu_\beta\}$ is "tight"). Note that the sum or product of two tight variables will also be tight.

Write $\theta_+ = (\theta_1 + \theta_2 z + \cdots)dz$, $\theta_i = \begin{pmatrix} \alpha_i & \beta_i \\ \gamma_i & -\alpha_i \end{pmatrix}$, and we continue to write $g_+$ as in Lemma A.1. Since $\beta_1 = b_1 = b'_1$, Step 4 (or Proposition A.1) implies that $\beta_1$ is tight. Now $K$ acts on g $= \{\theta_1\}$ by the adjoint action (see (A.1)). Since the $\nu_\beta$ are $K$-invariant, this implies that all $K$-translates of the functional $\beta_1$ are also tight. Since these span the dual of g, it follows that $\theta_1$ is tight. This implies that $a_1 = \alpha_1$ is tight. Since Proposition A.1 implies that $b'_2$ is tight, we now have that $b_2 = a_1 b'_2$ is tight. Thus $\beta_2 = b_2$ is tight. Using $K$-invariance again, we see that $\theta_2$ is tight. Since $2a_2 = \alpha_2 +$ lower-order, $a_2$ is tight. Together with the tightness of $b'_3$, this implies that $b_3$ is tight. Since $3b_3 = \beta_3 +$ lower-order, $\beta_3$ is tight, and so on.
One can also use these formulae to generate very plausible formulae for the $\mu$-distributions of the coefficients $\theta_n$. However, these formulae become very complicated quite rapidly.

Appendix B

We identify p with $\mathbb{R}^n$ (with $n = 3$ in our rank one case), as in (3.5) above, so that our preferred inner product on p is twice the Euclidean dot product. Now suppose that we are given a metric on $\mathbb{R}^n$,
\[
g_x(\xi, \eta) = A(x)\xi \cdot \eta, \tag{B.1}
\]
where $A(x)$ is a positive matrix for each $x \in \mathbb{R}^n$. Let $\nabla, dV, \ldots$ denote the usual Euclidean gradient, volume element, $\ldots$.

Lemma B.1. We have
(1) $\nabla_g = A^{-1}\nabla$;
(2) $dV_g = \rho\,dV$, $\rho = \det(A)^{1/2}$;
(3) $\mathrm{div}_g(v) = \rho^{-1}\,\mathrm{div}(\rho v)$, $v \in \mathrm{Vect}(\mathbb{R}^n)$;
(4) $\Delta_g = -\rho^{-1}\,\mathrm{div}(\rho A^{-1}\nabla(\cdot))$;
(5) $Q_g(f) = \int (\Delta_g f)\bar{f}\rho\,dV = \int g(A^{-1}\nabla f, A^{-1}\nabla f)\rho\,dV$.
Assuming that $g$ is orthogonally invariant, we also have
(6) For $f = f(r)$,
\[
Q_g(f) = \int_0^{\infty} f'^2\,\alpha(r)\rho(r)r^{n-1}\,dr, \tag{B.2}
\]
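\[
\Delta_g^r = -\delta^{-1/2} \circ D \circ \alpha \circ D \circ \delta^{1/2} + \frac{1}{2}\frac{(\alpha\delta')'}{\delta} - \frac{\alpha}{4}(\ln\delta)'^2, \tag{B.3}
\]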
where $\delta(r) = \rho(r)r^{n-1}$ and $\alpha(r) = A^{-1}(x)\frac{x}{r} \cdot \frac{x}{r}$, for any choice of $x$ with $|x| = r$. In particular, if $A$ is analytic and locally expressible as a power series in $\mathrm{ad}(x)$, with $A(0) = 1$, then $\alpha = 1$.

Proof. Parts (1)–(5) are routine, and the formula for $Q_g$ in (B.2) follows directly from (5). For (B.3), since $\Delta_g^r$ is self-adjoint with respect to $\delta(r)dr$, we know a priori that $\Delta_g^r$ has the form
\[
-\delta^{-1/2} \circ D \circ \alpha \circ D \circ \delta^{1/2} + \gamma. \tag{B.4}
\]
We can plug this form into $Q_g$ in (5) and compare with (B.2). This determines $\alpha$ and $\gamma$.

Example B.1. Suppose first that we identify p $\to G/K : x \to e^x K$, where the latter space is equipped with its negatively curved metric. We have
\[
g_x^{G/K}(\xi, \eta) = \left\langle \frac{d}{dt}\Big|_{t=0} e^{x + t\xi}, \ldots \right\rangle_{e^x} = A(\mathrm{ad}(x))\xi \cdot \eta, \tag{B.5}
\]
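where $A(x) = \left| \frac{1 - e^{-z}}{z}\Big|_{z = \mathrm{ad}(x)} \right|^2$ (using a standard formula for the derivative of the exponential map). In the case when $G = SL(2, \mathbb{C})$, so that p $= \mathbb{R}^3$,
\[
\rho(r) = \det\left| \frac{1 - e^{-z}}{z} \right|_{z = \mathrm{ad}(x)} = \frac{1 - e^{-r}}{r} \cdot \frac{1 - e^{r}}{-r} = \frac{2\cosh(r) - 2}{r^2} = \frac{\sinh^2(\frac{r}{2})}{(\frac{r}{2})^2}, \tag{B.6}
\]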
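which is consistent with Harish–Chandra's formula, if we remember how things are normalized. Now (6) implies that
\[
\Delta_{G/K} = -\delta^{-1/2} \circ D^2 \circ \delta^{1/2} + \frac{1}{4}. \tag{B.7}
\]

Example B.2. We consider the Guillemin–Stenzel Kähler structure for $G = K \times$ p $= K \times$ k $= TK$, where k $\to$ p $: v \to iv$ ([10]). This is interesting to consider, because there is a conjectural generalization of this to a general compact Riemannian manifold $X$ with nonnegative sectional curvature (a condition related to asymptotic freedom for sigma models).

The complex structure is the usual one for $G$. If we identify the tangent space of $K \times$ p with k $\oplus$ p using left translation, then at the point $(k, x) \in K \times$ p, the complex structure is given by
\[
J_{(k,x)} \begin{pmatrix} \xi \\ \eta \end{pmatrix} = i \begin{pmatrix} \frac{1 - \cosh(z)}{\sinh(z)} & \frac{2(\cosh(z) - 1)}{z\sinh(z)} \\ \frac{z}{\sinh(z)} & \frac{\cosh(z) - 1}{\sinh(z)} \end{pmatrix}_{z = \mathrm{ad}(x)} \begin{pmatrix} \xi \\ \eta \end{pmatrix}, \tag{B.8}
\]
where "$i$" stands for usual multiplication by $i$ on g $=$ k $\oplus$ p. The canonical $T^*K$ symplectic structure, in the Cartan coordinates $K \times$ p, is constant and given by
\[
\omega\left( \begin{pmatrix} \xi \\ \eta \end{pmatrix}, \begin{pmatrix} \xi' \\ \eta' \end{pmatrix} \right) = \langle i\xi \otimes \eta' - \eta \otimes i\xi' \rangle; \tag{B.9}
\]
note that $i\xi \in$ p, so that the inner product makes sense. It follows from these calculations that the Riemannian metric is given by
\[
\begin{aligned}
g_{(k,x)}\left( \begin{pmatrix} \xi \\ \eta \end{pmatrix}, \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right) &= \omega\left( J_{(k,x)} \begin{pmatrix} \xi \\ \eta \end{pmatrix}, \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right) \\
&= \omega\left( i \begin{pmatrix} \frac{1 - \cosh(z)}{\sinh(z)} & \frac{2(\cosh(z) - 1)}{z\sinh(z)} \\ \frac{z}{\sinh(z)} & \frac{\cosh(z) - 1}{\sinh(z)} \end{pmatrix}_{z = \mathrm{ad}(x)} \begin{pmatrix} \xi \\ \eta \end{pmatrix}, \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right) \\
&= \left\langle \left( \frac{1 - \cosh(z)}{\sinh(z)}\xi + \frac{2(\cosh(z) - 1)}{z\sinh(z)}\eta \right) \otimes \eta - \left( \frac{z}{\sinh(z)}\xi + \frac{\cosh(z) - 1}{\sinh(z)}\eta \right) \otimes \xi \right\rangle_{z = \mathrm{ad}(x)} \\
&= \left\langle \frac{\mathrm{ad}(x)}{\sinh(\mathrm{ad}(x))}\xi \otimes \xi + 2\,\frac{\cosh(\mathrm{ad}(x)) - 1}{\mathrm{ad}(x)\sinh(\mathrm{ad}(x))}\eta \otimes \eta \right\rangle,
\end{aligned} \tag{B.10}
\]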
where the bracket denotes the negative of (the appropriate multiple of) the Killing form. Now suppose that we just consider p. In this case
\[
A(\mathrm{ad}(x)) = 2\,\frac{\cosh(z) - 1}{z\sinh(z)}\Big|_{z = \mathrm{ad}\,x} = \frac{\tanh(w)}{w}\Big|_{w = \frac{1}{2}\mathrm{ad}\,x} \tag{B.11}
\]
and $\rho(r) = 2\tanh(r/2)/r$. Again, $\alpha = 1$.

References
[1] K. Clancey and I. Gohberg, Factorization of Matrix Functions and Singular Integral Operators (Birkhäuser, 1981).
[2] K. Gawedzki, Introduction to CFT, in Quantum Fields and Strings: A Course for Mathematicians, AMS-IAS, Vol. 2 (1998), pp. 727–801.
[3] L. Gross, Uniqueness of ground states for Schrödinger operators over loop groups, J. Funct. Anal. 112 (1993) 373–441.
[4] S. Helgason, Groups and Geometric Analysis (Academic Press, 1984).
[5] G. Lamb, Elements of Soliton Theory (John Wiley, 1980).
[6] D. Pickrell, Invariant measures for unitary forms of Kac–Moody Lie groups, Memoirs of the AMS, Vol. 146(693) (2000).
[7] A. M. Polyakov, Gauge fields and strings, in Contemporary Concepts in Physics, Vol. 3 (Harwood Academic Publishers, 1987).
[8] A. Pressley and G. Segal, Loop Groups (Oxford University Press, 1986).
[9] F. A. Smirnov, Space of local fields in integrable field theory and deformed abelian differentials, in Proc. Int. Congress of Mathematicians, Doc. Math. J. DMV, Vol. III (Berlin, 1998), pp. 183–192.
[10] M. Stenzel, Kaehler structures on cotangent bundles of real analytic Riemannian manifolds, Ph.D. thesis, M. I. T. (1990).
[11] E. Witten, Supersymmetry and Morse theory, J. Diff. Geom. 17 (1982) 661–692.
[12] E. Witten, Dynamics of quantum field theory, in Quantum Fields and Strings: A Course for Mathematicians, AMS-IAS, Vol. II (1998), pp. 1119–1393.
[13] E. Whittaker and G. Watson, A Course of Modern Analysis (Cambridge University Press, NY, 1943).
[14] A. Zamolodchikov, Factorized S-matrices in two dimensions as the exact solutions of certain relativistic quantum field theory models, Annals of Physics 120 (1979) 253–291.
July 20, 2004 9:35 WSPC/148-RMP
00206
Reviews in Mathematical Physics Vol. 16, No. 5 (2004) 629–637 © World Scientific Publishing Company
THE THERMODYNAMIC LIMIT FOR FINITE DIMENSIONAL CLASSICAL AND QUANTUM DISORDERED SYSTEMS
PIERLUIGI CONTUCCI∗ and CRISTIAN GIARDINÀ†
Dipartimento di Matematica, Università di Bologna, 40127 Bologna, Italy
∗[email protected] †[email protected]

JOSEPH PULÉ
Department of Mathematical Physics, University College Dublin, Belfield, Dublin 4, Ireland
[email protected]

Received 27 January 2004
Revised 15 April 2004

We provide a very simple proof for the existence of the thermodynamic limit for the quenched specific pressure for classical and quantum disordered systems on a d-dimensional lattice, including spin glasses. We develop a method which relies simply on Jensen's inequality and which works for any disorder distribution with the only condition (stability) that the quenched specific pressure is bounded.

Keywords: Thermodynamic limit; quantum spin glasses; free energy; subadditivity.
1. Introduction, Definitions and Results

In this paper we study the problem of the existence of the thermodynamic limit for a wide class of disordered models defined on finite dimensional lattices. We consider both the classical and quantum case with random two-body or multi-body interaction. The classical case has been studied in various places (see for example [4–8]). In [4] and [7], the quantum case with pair interactions has also been considered. Here we deal only with the quenched pressure. Using only thermodynamic convexity and a mild stability condition, we give a very simple proof of the existence and monotonicity of the quenched specific pressure. A result in the same spirit for classical spin glasses has been obtained in [1] by using an interpolation technique introduced in [2, 3]. The present work extends the results of [1] not only to the quantum case, but also to the classical case with a nonzero mean of the interaction and to the continuous spin space.

We shall treat the classical and quantum cases in parallel. In the classical case to each point of the lattice $i \in \mathbb{Z}^d$, we associate a copy of the spin space $S$, which
is equipped with an a priori probability measure $\mu$. We shall denote this by $S_i$. In the quantum analogue, we associate to each $i \in \mathbb{Z}^d$ a copy of a finite dimensional Hilbert space $H$, denoted by $H_i$, and a set of self-adjoint operators, spin operators, on $H_i$.

Following [9] (see also [10]), we define the interaction in the following way. In the classical case, for each finite subset $X$ of $\mathbb{Z}^d$, we let $S_X := \times_{i \in X} S_i$, and $\{\Phi_X^{(j)} \mid j \in n_X\}$ is a finite set of bounded functions from $S_X$ to $\mathbb{R}$ which are measurable with respect to the product measure $\mu^{|X|}$ on $S_X$. In the quantum case, each $\Phi_X^{(j)}$ is a self-adjoint element of the algebra generated by the set of operators, the spin operators on $H_X := \otimes_{i \in X} H_i$. Without loss of generality, we set $\Phi_\emptyset = 0$. In both cases, we take the interaction to be translation invariant in the sense that if $\tau_a$ is translation by $a \in \mathbb{Z}^d$, then
\[
n_{\tau_a X} = n_X \quad \text{and} \quad \Phi_{\tau_a X}^{(j)} = \tau_a \Phi_X^{(j)} \quad \text{for } j \in n_X. \tag{1}
\]
We now define the random coefficients. For each $X$, let $\{J_X^{(j)} \mid j \in n_X\}$ be a set of random variables. We assume that the $J_X^{(j)}$'s are independent random variables and that $J_{\tau_a X}^{(j)}$ and $J_X^{(j)}$ have the same distribution for all $a \in \mathbb{Z}^d$. We shall denote the average over the $J$'s by $\mathrm{Av}[\cdot]$. Let $\Lambda \subset \mathbb{Z}^d$ be a finite set of a regular lattice in $d$ dimensions and denote by $|\Lambda| = N$ its cardinality. We define the random potential as
\[
U_\Lambda(J, \Phi) := \sum_{X \subset \Lambda} \sum_{j \in n_X} J_X^{(j)} \Phi_X^{(j)}. \tag{2}
\]
We stress here that the distributions of the $J_X^{(j)}$'s are independent of the volume $\Lambda$. This characterizes the short range case, such as the Edwards–Anderson model. In mean field (long range) models, such as the Sherrington–Kirkpatrick model in which the Hamiltonian sums over all the couples ($N^2$ terms), the variance of $J_X^{(j)}$ has to decrease like $N^{-1}$ in order to have a well defined thermodynamic behavior and in particular a finite energy density.

The complete definition of the model we are considering requires that we specify also the interaction on the frontier $\partial\Lambda$, i.e. boundary conditions. However, standard surface over volume arguments imply that if the quenched specific pressure for one boundary condition converges, then it also converges for all other boundary conditions. Therefore, to prove the convergence of the quenched specific pressure, it is sufficient to consider the free boundary condition. Thus in the sequel, we shall assume the free boundary condition and prove that in this case the quenched pressure is monotonically increasing in the volume.

We would like to emphasize the fact that in the classical case, our results are not restricted to the situation when the space $S$ consists of a finite number of points. Here we also want to cover the case of continuous spins and therefore we shall keep the classical and quantum cases separate. Of course, both cases can be covered simultaneously in a $C^*$ algebra setting but for the sake of simplicity, we shall not take this route.
Example 1 (Classical Edwards–Anderson Model). $S = \{-1, 1\}$, $\mu(\sigma_i) = \frac{1}{2}\delta(\sigma_i + 1) + \frac{1}{2}\delta(\sigma_i - 1)$. The interaction is only between nearest neighbors: $\Phi_{i,j}(\sigma_i, \sigma_j) = \sigma_i\sigma_j$ for $|i - j| = 1$, $\Phi_X = 0$ otherwise. To ensure that the specific pressure is bounded, it is enough that
\[
\mathrm{Av}[|J_{ij}|] < \infty. \tag{3}
\]
More generally, one may consider a long range interaction with $\Phi_{i,j}(\sigma_i, \sigma_j) = \sigma_i\sigma_j / R(|i - j|)$ with a sufficient condition for boundedness, for example
\[
\mathrm{Av}[J_{0i}] = 0 \quad \text{and} \quad \sum_i \frac{\mathrm{Av}[|J_{0i}|^2]}{(R(|i|))^2} < \infty, \tag{4}
\]
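or a many-body interaction with a suitable decay law. One can also add a (random) external field. We refer the reader to [1] for more classical examples.

Example 2 (Quantum Edwards–Anderson Model). $H = \mathbb{C}^2$. The spin operators are the set of the Pauli matrices: $\sigma_i = (\sigma_i^x, \sigma_i^y, \sigma_i^z)$,
\[
\sigma^x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{5}
\]
with commutation and anticommutation relations
\[
[\sigma_i^\alpha, \sigma_i^\beta] = 2i\,\epsilon_{\alpha\beta\gamma}\,\sigma_i^\gamma, \tag{6}
\]
\[
\{\sigma_i^\alpha, \sigma_i^\beta\} = 2\delta_{\alpha\beta}. \tag{7}
\]
The interaction is again only between nearest neighbors: $\Phi_{i,j}(\sigma_i, \sigma_j) = \sigma_i \cdot \sigma_j = \sigma_i^x\sigma_j^x + \sigma_i^y\sigma_j^y + \sigma_i^z\sigma_j^z$ for $|i - j| = 1$, $\Phi_X = 0$ otherwise. A transverse field $\Phi_i(\sigma_i) = \sigma_i^z$ can also be added. One can have an asymmetric version with local interaction
\[
J_{i,j}^x \Phi_{i,j}^x(\sigma_i, \sigma_j) + J_{i,j}^y \Phi_{i,j}^y(\sigma_i, \sigma_j) + J_{i,j}^z \Phi_{i,j}^z(\sigma_i, \sigma_j), \tag{8}
\]
where $\Phi_{i,j}^x(\sigma_i, \sigma_j) = \sigma_i^x\sigma_j^x$, $\Phi_{i,j}^y(\sigma_i, \sigma_j) = \sigma_i^y\sigma_j^y$ and $\Phi_{i,j}^z(\sigma_i, \sigma_j) = \sigma_i^z\sigma_j^z$. As in Example 1, one may consider a long range interaction with a suitable decay law.

Notation. We shall use the notation Tr to denote both the classical expectation over $S^N$ with the measure $\mu(d\sigma) = \prod_{i=1}^N \mu(d\sigma_i)$ and the usual trace in quantum mechanics on the Hilbert space $\otimes_{i=1}^N H$.

Definition 1. We define in the usual way:
(1) the random partition function, $Z_\Lambda(J)$, by
\[
Z_\Lambda(J) := \mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}; \tag{9}
\]
(2) the quenched pressure, $P_\Lambda$, by
\[
P_\Lambda := \mathrm{Av}[\ln Z_\Lambda(J)]; \tag{10}
\]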
(3) the quenched specific pressure, $p_\Lambda$, by
\[
p_\Lambda := \frac{P_\Lambda}{N}. \tag{11}
\]
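We are now ready to state our main theorem as follows.

Theorem 1. If all the $J_X^{(j)}$'s with $|X| > 1$ have zero mean, then the quenched pressure is superadditive: for any partition of $\Lambda$ into nonempty disjoint sets $\Lambda_1, \ldots, \Lambda_n$,
\[
P_\Lambda \ge \sum_{s=1}^{n} P_{\Lambda_s}. \tag{12}
\]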
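To make Definition 1 and the superadditivity statement concrete, the following Python sketch evaluates the quenched pressure by exact enumeration for a toy instance of the classical Edwards–Anderson model of Example 1. The setting is our own assumption for illustration: an open chain of four sites, couplings taking the values $\pm 1$ with probability $1/2$ (so that they have zero mean), and a partition into two blocks of two sites; the function names are hypothetical.

```python
import itertools
import math

def log_partition(couplings):
    """ln Z for an open chain with bonds J_1, ..., J_m.

    Tr is the expectation over spins with the a priori measure of Example 1,
    i.e. the uniform average over {-1, +1} for each of the m + 1 sites.
    """
    n_sites = len(couplings) + 1
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_sites):
        u = sum(J * spins[i] * spins[i + 1] for i, J in enumerate(couplings))
        total += math.exp(u)
    return math.log(total / 2 ** n_sites)

def quenched_pressure(n_bonds, values=(-1.0, 1.0)):
    """Av[ln Z] for i.i.d. couplings uniform on `values`, by exact enumeration."""
    configs = list(itertools.product(values, repeat=n_bonds))
    return sum(log_partition(c) for c in configs) / len(configs)

if __name__ == "__main__":
    p_full = quenched_pressure(3)    # Lambda = {1, 2, 3, 4}, three bonds
    p_block = quenched_pressure(1)   # Lambda_1 = {1, 2} or Lambda_2 = {3, 4}
    print("P_Lambda =", p_full)
    print("P_Lambda_1 + P_Lambda_2 =", 2.0 * p_block)
```

In this toy case the bond between the two blocks is exactly the interaction that is discarded when passing from $\Lambda$ to the pair $\Lambda_1, \Lambda_2$, and the pressure of the full chain exceeds the sum of the block pressures by the contribution of that bond.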
Let $\|\Phi_X^{(j)}\|$ denote the supremum norm in the classical case and the operator norm in the quantum case. For the case when the $J_X^{(j)}$'s do not have zero mean, we have the following corollary.

Corollary 1. Let
\[
\bar{P}_\Lambda = P_\Lambda + \sum_{X \subset \Lambda,\, |X| > 1} \sum_{j \in n_X} |\mathrm{Av}[J_X^{(j)}]|\, \|\Phi_X^{(j)}\|. \tag{13}
\]
Then $\bar{P}_\Lambda$ is superadditive.

Theorem 1 combined with the boundedness of the specific pressure is sufficient to ensure the convergence of the specific pressure in the thermodynamic limit (see for example [9, Chap. IV]) in the case when all the $J_X^{(j)}$'s with $|X| > 1$ have zero mean. In the case when the $J_X^{(j)}$'s do not have zero mean, we have to add to Corollary 1 the condition
\[
C := \sum_{X \ni 0,\, |X| > 1} \sum_{j \in n_X} \frac{|a_X^{(j)}|\, \|\Phi_X^{(j)}\|}{|X|} < \infty, \tag{14}
\]
where $a_X^{(j)} := \mathrm{Av}[J_X^{(j)}]$. This implies that
\[
\lim_{\Lambda \to \infty} \frac{1}{N} \sum_{X \subset \Lambda,\, |X| > 1} \sum_{j \in n_X} |a_X^{(j)}|\, \|\Phi_X^{(j)}\| = C \tag{15}
\]
and therefore the convergence of the specific pressure.

To prove the boundedness of the specific pressure, we need the following stability condition (cf. [8]). Let
\[
\|U\|_1 := \sum_{X \ni 0} \sum_{j \in n_X} \frac{\mathrm{Av}[|J_X^{(j)}|]\, \|\Phi_X^{(j)}\|}{|X|} \tag{16}
\]
and
\[
\|U\|_2 := \left( \sum_{X \ni 0} \sum_{j \in n_X} \frac{\mathrm{Av}[|J_X^{(j)}|^2]\, \|\Phi_X^{(j)}\|^2}{|X|} \right)^{\frac{1}{2}}. \tag{17}
\]
Definition 2. We shall say that the random potential $U(J, \Phi)$ is stable if it is of the form
\[
U_\Lambda(J, \Phi) = \tilde{U}_\Lambda(\tilde{J}, \tilde{\Phi}) + \hat{U}_\Lambda(\hat{J}, \hat{\Phi}), \tag{18}
\]
where all the $\tilde{J}_X^{(j)}$'s and $\hat{J}_X^{(j)}$'s are independent, the $\hat{J}_X^{(j)}$'s have zero mean, and $\|\tilde{U}\|_1$ and $\|\hat{U}\|_2$ are finite.

With this definition, we shall prove in the next theorem that the specific pressure is bounded. Note that the stability condition in Definition 2 implies that $C$ as defined in (14) is finite since $C \le \|U\|_1$.

Theorem 2. For a stable random potential, the quenched specific pressure is bounded.

In the next section we prove the theorems.

2. Proof of the Theorems

We start with the following definition.

Definition 3. Consider a partition of $\Lambda$ into $n$ nonempty disjoint sets $\Lambda_s$:
\[
\Lambda = \bigcup_{s=1}^{n} \Lambda_s, \tag{19}
\]
\[
\Lambda_s \cap \Lambda_{s'} = \emptyset, \quad s \ne s'. \tag{20}
\]
For each partition, the potential generated by all interactions among different subsets is defined as
\[
\tilde{U}_\Lambda := U_\Lambda - \sum_{s=1}^{n} U_{\Lambda_s}. \tag{21}
\]
From (2) it follows that
\[
\tilde{U}_\Lambda = \sum_{X \in C_\Lambda} \sum_{j \in n_X} J_X^{(j)} \Phi_X^{(j)}, \tag{22}
\]
where $C_\Lambda$ is the set of all $X \subset \Lambda$ which are not subsets of any $\Lambda_s$.

The idea here is to eliminate $\tilde{U}_\Lambda$ from the partition function. We shall use the following three lemmas.

Lemma 1. Let $X_1, \ldots, X_n$ be independent random variables with zero mean. Let $F : \mathbb{R}^n \to \mathbb{R}$ be such that for each $i = 1, \ldots, n$, $x_i \mapsto F(x_1, \ldots, x_n)$ is convex. Then
\[
E[F(X_1, \ldots, X_n)] \ge F(0, \ldots, 0), \tag{23}
\]
where $E$ denotes the expectation with respect to $X_1, \ldots, X_n$.

Proof. This follows by applying Jensen's inequality to each $X_i$ successively.
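The following two lemmas are related to the thermodynamic convexity of the pressure.

Lemma 2. Let $\mu$ be a probability measure on a space $\Omega$, and let $A$ and $B_1, \ldots, B_n$ be measurable real-valued functions on $\Omega$. Then
\[
E\left[ \log \int_\Omega \exp\left\{ A(\sigma) + \sum_{i=1}^{n} X_i B_i(\sigma) \right\} \mu(d\sigma) \right] \ge \log \int_\Omega \exp[A(\sigma)]\, \mu(d\sigma). \tag{24}
\]

Proof. We just have to check that if
\[
F(x_1, \ldots, x_n) = \log \int_\Omega \exp\left\{ A(\sigma) + \sum_{i=1}^{n} x_i B_i(\sigma) \right\} \mu(d\sigma),
\]
then $x_i \mapsto F(x_1, \ldots, x_n)$ is convex. Let
\[
\langle C \rangle := \frac{\int_\Omega C(\sigma) \exp\{A(\sigma) + \sum_{i=1}^{n} x_i B_i(\sigma)\}\, \mu(d\sigma)}{\int_\Omega \exp\{A(\sigma) + \sum_{i=1}^{n} x_i B_i(\sigma)\}\, \mu(d\sigma)}. \tag{25}
\]
Then, computing the derivatives, we have
\[
\frac{\partial F}{\partial x_i} = \langle B_i \rangle \tag{26}
\]
and
\[
\frac{\partial^2 F}{\partial x_i^2} = \langle B_i^2 \rangle - \langle B_i \rangle^2 = \left\langle (B_i - \langle B_i \rangle)^2 \right\rangle \ge 0. \tag{27}
\]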
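The inequality (24) can be illustrated on a finite probability space. The Python sketch below is only an example under our own assumptions: $\Omega$ has four points with arbitrarily chosen weights, there are two functions $B_1, B_2$, and the $X_i$ are independent and uniform on $\{-1, +1\}$, so the expectation is an exact finite average. All names and numerical values are hypothetical.

```python
import itertools
import math

def log_mean_exp(A, B, x, weights):
    """F(x) = log sum_sigma mu(sigma) exp(A(sigma) + sum_i x_i B_i(sigma))."""
    total = 0.0
    for a, b_row, w in zip(A, B, weights):
        total += w * math.exp(a + sum(xi * bi for xi, bi in zip(x, b_row)))
    return math.log(total)

def averaged_F(A, B, weights, scale=1.0):
    """E[F(X)] for independent X_i uniform on {-scale, +scale} (zero mean)."""
    n = len(B[0])
    signs = list(itertools.product((-scale, scale), repeat=n))
    return sum(log_mean_exp(A, B, x, weights) for x in signs) / len(signs)

# Illustrative data: four points of Omega, B[sigma] = (B_1(sigma), B_2(sigma)).
WEIGHTS = [0.1, 0.2, 0.3, 0.4]
A_VALUES = [0.3, -1.0, 0.5, 2.0]
B_VALUES = [[1.0, -2.0], [0.5, 0.5], [-1.0, 3.0], [0.0, -1.0]]

if __name__ == "__main__":
    lhs = averaged_F(A_VALUES, B_VALUES, WEIGHTS)
    rhs = log_mean_exp(A_VALUES, B_VALUES, [0.0, 0.0], WEIGHTS)
    print("E[F(X)] =", lhs, ">= F(0) =", rhs)
```

Any zero-mean law for the $X_i$ would do; the two-point law is used here only because it makes the expectation a finite sum.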
The next lemma is the quantum analogue of the previous one.

Lemma 3. Let $H$ be a finite-dimensional Hilbert space, and let $A$ and $B_1, \ldots, B_n$ be self-adjoint operators on $H$. Then
\[
E\left[ \log \mathrm{Tr} \exp\left( A + \sum_{i=1}^{n} X_i B_i \right) \right] \ge \log \mathrm{Tr} \exp A. \tag{28}
\]

Proof. Again we just have to check that if
\[
F(x_1, \ldots, x_n) = \log \mathrm{Tr} \exp\left( A + \sum_{i=1}^{n} x_i B_i \right),
\]
then $x_i \mapsto F(x_1, \ldots, x_n)$ is convex. The first derivative gives
\[
\frac{\partial F}{\partial x_i} = \langle B_i \rangle, \tag{29}
\]
where
\[
\langle C \rangle := \frac{\mathrm{Tr}\, C e^{-H}}{\mathrm{Tr}\, e^{-H}} \tag{30}
\]
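with
\[
-H = A + \sum_{i=1}^{n} x_i B_i,
\]
while, for the second derivative, we have
\[
\frac{\partial^2 F}{\partial x_i^2} = (B_i, B_i) - \langle B_i \rangle^2, \tag{31}
\]
where $(\cdot, \cdot)$ denotes the Du Hamel inner product (see for example [10])
\[
(C, D) := \frac{\mathrm{Tr} \int_0^1 ds\, e^{-sH} C^* e^{-(1-s)H} D}{\mathrm{Tr}\, e^{-H}}. \tag{32}
\]
By using the fact that $(C, 1) = \langle C \rangle$ and $(1, D) = \langle D \rangle$, we see that
\[
\frac{\partial^2 F}{\partial x_i^2} = (B_i - \langle B_i \rangle, B_i - \langle B_i \rangle) \ge 0. \tag{33}
\]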
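A minimal quantum instance of (28) can be checked by hand or with a few lines of code. In the Python sketch below, which is an illustration under our own assumptions, we take $H = \mathbb{C}^2$, $A = \sigma^z$, a single operator $B = \sigma^x$, and $X$ uniform on $\{-t, +t\}$. For a $2 \times 2$ Hermitian matrix the trace of the exponential follows from the two eigenvalues, so no linear algebra library is needed; the function names are hypothetical.

```python
import math

def log_tr_exp_2x2(m11, m22, m12):
    """log Tr exp(M) for the Hermitian matrix M = [[m11, m12], [conj(m12), m22]].

    The eigenvalues are mean +/- radius, hence Tr exp(M) = 2 e^mean cosh(radius).
    """
    mean = 0.5 * (m11 + m22)
    radius = math.sqrt((0.5 * (m11 - m22)) ** 2 + abs(m12) ** 2)
    return mean + math.log(2.0 * math.cosh(radius))

def lemma3_sides(t):
    """Both sides of (28) for A = sigma^z, B = sigma^x, X = +t or -t with probability 1/2."""
    lhs = 0.5 * (log_tr_exp_2x2(1.0, -1.0, t) + log_tr_exp_2x2(1.0, -1.0, -t))
    rhs = log_tr_exp_2x2(1.0, -1.0, 0.0)
    return lhs, rhs

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 3.0):
        print(t, lemma3_sides(t))
```

Here the left hand side equals $\log(2\cosh\sqrt{1 + t^2})$ and the right hand side equals $\log(2\cosh 1)$, so the inequality is visible directly.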
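Proof of Theorem 1. Let us assume first that all the $J_X^{(j)}$'s with $|X| > 1$ have zero mean. Then
\[
P_\Lambda = \mathrm{Av}[\ln \mathrm{Tr} \exp U_\Lambda] = \mathrm{Av}\left[ \ln \mathrm{Tr} \exp\left( \sum_{s=1}^{n} U_{\Lambda_s} + \sum_{X \in C_\Lambda} \sum_{j \in n_X} J_X^{(j)} \Phi_X^{(j)} \right) \right]. \tag{34}
\]
Note that $C_\Lambda$ does not contain any $X$ with $|X| = 1$. Applying Lemma 2 (resp. Lemma 3) for the classical (resp. quantum) case with $A = \sum_{s=1}^{n} U_{\Lambda_s}$, $B_i = \Phi_X^{(j)}$ and $n = \sum_{X \in C_\Lambda} |n_X|$, we get
\[
P_\Lambda \ge \mathrm{Av}\left[ \ln \mathrm{Tr} \exp\left( \sum_{s=1}^{n} U_{\Lambda_s} \right) \right] = \sum_{s=1}^{n} \mathrm{Av}[\ln \mathrm{Tr} \exp U_{\Lambda_s}] = \sum_{s=1}^{n} P_{\Lambda_s}. \tag{35}
\]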
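Proof of Corollary 1. Here we relax the condition that all the $J$'s have zero mean. Let $a_X^{(j)} := \mathrm{Av}[J_X^{(j)}]$ and $\bar{J}_X^{(j)} := J_X^{(j)} - a_X^{(j)}$ for $|X| > 1$, so that $\bar{J}_X^{(j)}$ has zero mean, and $\bar{J}_X^{(j)} := J_X^{(j)}$ if $|X| = 1$. Let
\[
U_\Lambda^{(1)}(J, \Phi) := \sum_{X \subset \Lambda} \sum_{j \in n_X} \bar{J}_X^{(j)} \Phi_X^{(j)}, \tag{36}
\]
\[
U_\Lambda^{(2)}(J, \Phi) := \sum_{X \subset \Lambda,\, |X| > 1} \sum_{j \in n_X} \left( a_X^{(j)} \Phi_X^{(j)} + |a_X^{(j)}|\, \|\Phi_X^{(j)}\| \right) \tag{37}
\]
and
\[
\bar{U}_\Lambda(J, \Phi) := U_\Lambda^{(1)}(J, \Phi) + U_\Lambda^{(2)}(J, \Phi). \tag{38}
\]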
Then
\[
\bar{U}_\Lambda(J, \Phi) = U_\Lambda(J, \Phi) + \sum_{X \subset \Lambda,\, |X| > 1} \sum_{j \in n_X} |a_X^{(j)}|\, \|\Phi_X^{(j)}\|. \tag{39}
\]
Thus $\bar{P}_\Lambda$ is the pressure corresponding to $\bar{U}_\Lambda(J, \Phi)$. One can then see that $\bar{P}_\Lambda$ is superadditive by treating the terms in $U_\Lambda^{(1)}(J, \Phi)$ as before, since each $\bar{J}_X^{(j)}$ has zero mean, except possibly if $|X| = 1$, and by using the fact that all the terms in $U_\Lambda^{(2)}(J, \Phi)$ are positive (cf. [10]). In the quantum case, we need the inequality
\[
\mathrm{Tr}\, e^{(A + B)} \ge \mathrm{Tr}\, e^{A} \tag{40}
\]
if $B$ is a positive operator.

Proof of Theorem 2. The proof in the classical case is given in [8]. Here we modify that proof to cover the quantum case. From the Bogoliubov inequality
\[
\frac{\mathrm{Tr}(A - B)e^{B}}{\mathrm{Tr}\, e^{B}} \le \ln \mathrm{Tr}\, e^{A} - \ln \mathrm{Tr}\, e^{B} \le \frac{\mathrm{Tr}(A - B)e^{A}}{\mathrm{Tr}\, e^{A}} \tag{41}
\]
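with $A = U_\Lambda(J, \Phi)$ and $B = 0$ we get
\[
\begin{aligned}
\log Z_\Lambda(J) - N \log \dim H &\le \frac{\mathrm{Tr}\, U_\Lambda(J, \Phi) e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}} \\
&= \frac{\mathrm{Tr}\, \tilde{U}_\Lambda(\tilde{J}, \tilde{\Phi}) e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}} + \frac{\mathrm{Tr}\, \hat{U}_\Lambda(\hat{J}, \hat{\Phi}) e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}} \\
&\le \|\tilde{U}_\Lambda(\tilde{J}, \tilde{\Phi})\| + \frac{\mathrm{Tr}\, \hat{U}_\Lambda(\hat{J}, \hat{\Phi}) e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}}.
\end{aligned} \tag{42}
\]
Now
\[
\mathrm{Av}[\|\tilde{U}_\Lambda(\tilde{J}, \tilde{\Phi})\|] \le N \|\tilde{U}(\tilde{J}, \tilde{\Phi})\|_1. \tag{43}
\]
For the other term, we use the identity for $A$ and $B$ self-adjoint
\[
\frac{\mathrm{Tr}\, A e^{A + B}}{\mathrm{Tr}\, e^{A + B}} - \frac{\mathrm{Tr}\, A e^{B}}{\mathrm{Tr}\, e^{B}} = \int_0^1 dt\, (A - \langle A \rangle_t, A - \langle A \rangle_t)_t, \tag{44}
\]
where $\langle \cdot \rangle_t$ and $(\cdot, \cdot)_t$ denote the mean and the Du Hamel inner product, respectively, with respect to $H = -(tA + B)$. The Du Hamel inner product satisfies
\[
(C, C) \le \frac{1}{2} \langle C^* C + C C^* \rangle \le \|C\|^2. \tag{45}
\]
Therefore
\[
\frac{\mathrm{Tr}\, A e^{A + B}}{\mathrm{Tr}\, e^{A + B}} - \frac{\mathrm{Tr}\, A e^{B}}{\mathrm{Tr}\, e^{B}} \le 4\|A\|^2. \tag{46}
\]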
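With $A = \hat{J}_X^{j} \hat{\Phi}_X^{j}$ and $B = U_\Lambda(J, \Phi) - \hat{J}_X^{j} \hat{\Phi}_X^{j}$ we get
\[
\begin{aligned}
\frac{\mathrm{Tr}\, \hat{U}_\Lambda(\hat{J}, \hat{\Phi}) e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}} &= \sum_{X \subset \Lambda} \sum_{j \in \hat{n}_X} \frac{\mathrm{Tr}\, \hat{J}_X^{j} \hat{\Phi}_X^{j} e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}} \\
&\le \sum_{X \subset \Lambda} \sum_{j \in \hat{n}_X} \frac{\mathrm{Tr}\, \hat{J}_X^{j} \hat{\Phi}_X^{j} e^{U_\Lambda(J, \Phi) - \hat{J}_X^{j} \hat{\Phi}_X^{j}}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi) - \hat{J}_X^{j} \hat{\Phi}_X^{j}}} + 4 \sum_{X \subset \Lambda} \sum_{j \in \hat{n}_X} |\hat{J}_X^{j}|^2 \|\hat{\Phi}_X^{j}\|^2.
\end{aligned} \tag{47}
\]
Thus since $U_\Lambda(J, \Phi) - \hat{J}_X^{j} \hat{\Phi}_X^{j}$ is independent of $\hat{J}_X^{j}$ and $\mathrm{Av}\, \hat{J}_X^{j} = 0$,
\[
\mathrm{Av}\left[ \frac{\mathrm{Tr}\, \hat{U}_\Lambda(\hat{J}, \hat{\Phi}) e^{U_\Lambda(J, \Phi)}}{\mathrm{Tr}\, e^{U_\Lambda(J, \Phi)}} \right] \le 4 \sum_{X \subset \Lambda} \sum_{j \in \hat{n}_X} \mathrm{Av}[|\hat{J}_X^{j}|^2]\, \|\hat{\Phi}_X^{j}\|^2 \le 4N \|\hat{U}(\hat{J}, \hat{\Phi})\|_2^2. \tag{48}
\]
Therefore
\[
P_\Lambda \le N\left( \log \dim H + \|\tilde{U}(\tilde{J}, \tilde{\Phi})\|_1 + 4\|\hat{U}(\hat{J}, \hat{\Phi})\|_2^2 \right). \tag{49}
\]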
Acknowledgments

The authors wish to thank B. Nachtergaele and A. van Enter for very constructive suggestions. One of the authors (J. Pulé) wishes to thank the Department of Mathematics of the University of Bologna, Italy, for their kind hospitality, and University College Dublin for the award of a President's Fellowship. P. Contucci and C. Giardinà wish to thank S. Graffi for some very useful discussions.

References

[1] P. Contucci and S. Graffi, J. Statist. Phys. 115(1/2) (2004) 581–589.
[2] F. Guerra and F. Toninelli, Commun. Math. Phys. 230 (2002) 71–79.
[3] P. Contucci, M. Degli Esposti, C. Giardinà and S. Graffi, Commun. Math. Phys. 236 (2003) 55–63.
[4] P. A. Vuillermot, J. Phys. A: Math. Gen. 10 (1977) 1319–1333.
[5] L. A. Pastur and A. L. Figotin, Theor. Math. Phys. 35 (1978) 193–202.
[6] K. M. Khanin and Ya. G. Sinai, J. Statist. Phys. 20 (1979) 573–584.
[7] A. C. D. van Enter and J. L. van Hemmen, J. Statist. Phys. 32 (1983) 141–152.
[8] B. Zegarlinski, Commun. Math. Phys. 139 (1991) 305–339.
[9] D. Ruelle, Statistical Mechanics, Rigorous Results (W. A. Benjamin, New York, 1969).
[10] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton University Press, 1992).
July 20, 2004 10:11 WSPC/148-RMP
00212
Reviews in Mathematical Physics Vol. 16, No. 5 (2004) 639–673 © World Scientific Publishing Company
THE INVARIANT MEASURES AT WEAK DISORDER FOR THE TWO-LINE ANDERSON MODEL
T. C. DORLAS
Dublin Institute for Advanced Studies, 10 Burlington Road, Dublin 4, Ireland
[email protected]

J. V. PULÉ∗
Department of Mathematical Physics, University College Dublin, Belfield, Dublin 4, Ireland
[email protected]
Received 1 October 2003 Revised 9 April 2004 We study the invariant measures in the weak disorder limit, for the Anderson model on two coupled chains. These measures live on a three-dimensional projective space, and we use a total set of functions on this space to characterize the measures. We find that at several points of the spectrum, there are anomalies similar to that first found by Kappus and Wegner for the single chain at zero energy. Keywords: Anderson model; invariant measures; Lyapunov exponents; weak disorder.
∗ Research Associate, School of Theoretical Physics, Dublin Institute for Advanced Studies.

1. Introduction

In this paper we consider the invariant measure for the one-dimensional Anderson model on two coupled chains. The Hamiltonian is given by $H = H_0 + \lambda V$, where
\[
\begin{aligned}
(H_0\psi)(n, 1) &= \psi(n + 1, 1) + \psi(n - 1, 1) + \psi(n, 2), \\
(H_0\psi)(n, 2) &= \psi(n + 1, 2) + \psi(n - 1, 2) + \psi(n, 1),
\end{aligned} \tag{1.1}
\]
and
\[
(V\psi)(n, s) = v_{n,s}\,\psi(n, s), \tag{1.2}
\]
where $s = 1, 2$ and the $v_{n,s}$ are i.i.d. random variables. We should explain at the outset that in the case of two chains, the invariant measure does not provide as much information as in the case of a single chain. While in the latter case, the invariant measure gives both the density of states and the Lyapunov exponent (see (1.4) and (1.5) below), in the two chain case it gives only the upper Lyapunov
exponent but gives no information on the second exponent (see for example [1, IV. 4.1]). However, as the simplest generalization of one chain, it is interesting to examine if the anomalies discovered there persist in the case of two chains. In connection with the Lyapunov exponents on a strip, we would like to note that a perturbation theory for them has been developed recently by Schulz-Baldes [2].

The invariant measure for one chain at weak disorder has been studied extensively. To get insight into the behavior for small disorder, Thouless [3] attempted to write down a perturbation expansion in the disorder (i.e. in $\lambda$) of the invariant measure in the case of a single chain. In terms of the variable $Z(n) = \psi(n)/\psi(n - 1)$, the Schrödinger equation at energy $E$ for this case can be written as
\[
Z(n + 1) = E - \lambda v_n - \frac{1}{Z(n)}.
\]
The invariant measure $\nu_\lambda^E$ for this transformation is then defined by
\[
\int f(x)\,\nu_\lambda^E(dx) = \mathbb{E}\int f\left( E - \lambda v - \frac{1}{x} \right) \nu_\lambda^E(dx) \tag{1.3}
\]
for all bounded continuous functions $f$, where $\mathbb{E}$ denotes the expectation over $v$. The Lyapunov exponent $\gamma(E)$ and the density of states $N(E)$ are related to this measure by
\[
\gamma(E) = \mathrm{Re}\,\tilde{\gamma}(E), \qquad N(E) = \frac{1}{\pi}\,\mathrm{Im}\,\tilde{\gamma}(E), \tag{1.4}
\]
where
\[
\tilde{\gamma}(E) = \int \ln x\;\nu_\lambda^E(dx). \tag{1.5}
\]
Kappus and Wegner [4] subsequently discovered that the perturbation series proposed by Thouless is incorrect for the case $E = 0$. They called this an anomaly. In fact, the limiting measure $\nu_0^E$ is discontinuous at $E = 0$. In particular one has
\[
\nu_0^E := \lim_{\lambda \downarrow 0} \nu_\lambda^E = \begin{cases} \dfrac{c}{x^2 - Ex + 1}\,dx, & \text{if } E \ne 0, \\[2ex] \dfrac{c'}{\sqrt{x^4 + 1}}\,dx, & \text{if } E = 0. \end{cases} \tag{1.6}
\]
The problem was further analyzed by Derrida and Gardner [5]. They found that the perturbation series is also anomalous at the values $E = 2\cos\frac{p}{q}\pi$ for integer $p$ and $q$. Bovier and Klein [6] then completed their investigation and derived the correct perturbation series in all cases. These series were subsequently shown to be asymptotic by Campanino and Klein [7] by means of a very sophisticated analysis.

In this paper we develop a new approach to this problem which is simpler but at the same time less powerful. It suffices for proving the discontinuity of the limiting measure, i.e. (1.6), but not the fact that the perturbation series is asymptotic. We next apply the method to the analogous problem for the case of two chains, again concentrating on the more limited objective of proving the convergence of the measures as $\lambda \to 0$, determining the limiting invariant measure and examining this measure for continuity. It turns out that this case is in fact incomparably more complicated than the case of a single chain.
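As a small illustration of (1.3) at $\lambda = 0$, one can check that the density $1/(x^2 - Ex + 1)$ appearing in (1.6) is invariant under the unperturbed map $T(x) = E - 1/x$. The Python sketch below performs the change of variables pointwise; it is our own consistency check with hypothetical function names, it ignores the normalization constant $c$, and it only tests invariance at $\lambda = 0$, which by itself does not identify the limit $\lambda \downarrow 0$.

```python
import math

def cauchy_density(x, E):
    """Unnormalized density 1 / (x^2 - E x + 1) from the E != 0 branch of (1.6)."""
    return 1.0 / (x * x - E * x + 1.0)

def pushforward_density(y, E):
    """Density at y of the image of cauchy_density under T(x) = E - 1/x.

    T is a bijection of the extended real line, with inverse x = 1 / (E - y)
    and |dx/dy| = 1 / (E - y)^2 = x^2.
    """
    x = 1.0 / (E - y)
    return cauchy_density(x, E) * x * x

if __name__ == "__main__":
    for E in (0.5, -1.3):
        for y in (-2.0, 0.1, 3.0):
            print(E, y, pushforward_density(y, E), cauchy_density(y, E))
```

The same computation goes through for $E = 0$, which is consistent with the remark above that the anomaly concerns the limit of $\nu_\lambda^E$ as $\lambda \downarrow 0$ rather than invariance under the unperturbed map alone.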
For one chain, the method consists of choosing coordinates in which the transformation $T_0$ at $\lambda = 0$ is a simple rotation and then applying the measures to a total set of functions. If the angle of rotation is an irrational multiple of $\pi$, it follows from ergodicity that the invariant measure is Lebesgue. Otherwise, we iterate the transformation at $\lambda \ne 0$ as in [6] and show that the second-order derivative with respect to $\lambda$ converges when applied to a suitable total set of functions (Lemma 3.1). This leads to integral equations (3.53) for the measure which have the Lebesgue measure as unique solution, except when $E = 0$, in which case the solution corresponds to Eq. (1.6).

In the case of two chains, the unperturbed ($\lambda = 0$) spectrum for the Hamiltonian (1.1) has two branches
\[
E(k) = 2\cos k \pm 1, \qquad k \in [-\pi, \pi]. \tag{1.7}
\]
Thus the spectrum, [−3, 3], splits naturally into three subsets, [−3, −1], (−1, 1) and [1, 3]. We examine the limiting measure $\nu_0^E$ for E in the interior of each of these subsets and at their boundaries. In some of these cases, we are not able to obtain the measure explicitly but we can only give the differential equation that it satisfies. We examine for continuity at each of the two junctions and at the edges of the spectrum. This is complicated by the fact that we have to use different parametrizations of the 3-dimensional projective space on which the measures live in the different parts of the spectrum. For E ∈ (−1, 1) there is a convenient parametrization in terms of three angles $\theta_1 \in [0, 2\pi)$, $\theta_2 \in [0, \pi)$ and $\theta_3 \in (0, \pi/2)$, for which the transformation at λ = 0 corresponds to a simple rotation in two planes over angles α and β related to the energy E: 2 cos α = E + 1 and 2 cos β = E − 1; see (4.68). (In fact, the parameter space consists of three pieces, $\Omega = \Omega_{(0,\pi/2)} \cup \Omega_{\pi/2} \cup \Omega_0$; see (4.7). The transformation on each has to be considered separately.) If the angles α and β are both irrational multiples of π, then it follows by ergodicity that the invariant measure is Lebesgue with respect to the two coordinate angles $\theta_1$ and $\theta_2$. The third coordinate angle $\theta_3$ is unaffected by the transformation at λ = 0. We therefore use the second-order term in λ of the expansion of $T_\lambda$. In Lemma 4.1 we show (analogous to Lemma 3.1 for the case of a single chain) that this term converges when the transformation is applied to a total set of functions, determining the measure. This leads to a differential equation for the density of the measure in the variable $\theta_3$; see (4.81). This equation can be solved explicitly; the solution has 3 different forms in different regimes of energy; see (4.84), (4.85) and (4.86).
In the case that only one of α/π or β/π is irrational, ergodicity only applies in one of the variables and the above argument, but after iterating the transformation, leads to a partial differential equation for the density of the measure. This equation has the same solution, however, as in the double-irrational case. The only case where both α and β are rational multiples of π is E = 0. In that case we again derive a partial differential equation. A simple symmetry reduces this to two variables, but we have not been able to solve this equation. Nevertheless, it
suffices for proving that the limiting invariant measure is discontinuous at E = 0, namely, the limit of the invariant measures at E ≠ 0 does not satisfy the equation at E = 0. At E = ±1, the transformation $T_0$ is no longer given by two rotations. Instead, it can be represented as a rotation in one angle variable and a contraction in another; see (4.101) and (4.102). As a result, one concludes immediately that the measure is concentrated on $\theta_2 = \pi/2$. We derive a differential equation for the density in the other two variables. This equation does not have a $\theta_1$-independent solution, from which we conclude that there is an anomaly at E = ±1. In the case when E ∈ (−3, −1) ∪ (1, 3), we show that the invariant measure is concentrated on the subspaces $\Omega_0$ respectively $\Omega_{\pi/2}$ corresponding to $\theta_3 = 0, \pi/2$, on which it is Lebesgue measure. Finally, at E = 3, the invariant measure is a Dirac measure on $\Omega_{\pi/2} \cap \{\theta_1 = \pi/4\}$. There is no anomaly at E = ±3.

The paper is set out as follows. In Sec. 2, we describe the method in more detail. In particular, we rewrite the equation for the invariant measure on the projective space $\mathbb{R}P^{2l-1}$ and in terms of a parametrization $t\colon \mathbb{R}P^{2l-1} \to \Omega$. We also give a general formula for the first and second order terms of the qth iterate of the linear transformation in the projective space. In Sec. 3, we demonstrate the method by doing the calculations for one chain. We obtain in a relatively simple fashion the limit as λ → 0 of the invariant measure in the sense of weak convergence. We also consider the special case E = ±2, which seems to have been overlooked in the literature. We next generalize our approach to the case of two coupled chains in Sec. 4. The calculations are too long to be suitable for publication in full. We therefore give them only in one or two shorter cases to give the flavour of the full problem. The parametrization of $\mathbb{R}P^3$ in terms of three angles is given in Sec. 4.1 and the iterated transformations are worked out in Sec. 4.2, leading to a general formulation of the invariance equation as a partial differential equation. In Sec. 4.3, the case E ∈ (−1, 1) is worked out, in Sec. 4.4, the case E = ±1, in Sec. 4.5 the case E ∈ (−3, −1) ∪ (1, 3), and in Sec. 4.6 the case E = ±3. We find that there are anomalies at E = 0 and E = ±1 (from both sides) but not at E = ±3.

2. Description of the Method

We can write the Schrödinger equation for the case of two lines in transfer matrix form as follows:
\[
\begin{pmatrix} \psi(n+1,1)\\ \psi(n+1,2)\\ \psi(n,1)\\ \psi(n,2) \end{pmatrix}
=
\begin{pmatrix}
E-\lambda v_{n,1} & -1 & -1 & 0\\
-1 & E-\lambda v_{n,2} & 0 & -1\\
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} \psi(n,1)\\ \psi(n,2)\\ \psi(n-1,1)\\ \psi(n-1,2) \end{pmatrix} .
\tag{2.1}
\]
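This can be written more concisely as
\[
\begin{pmatrix} \vec\psi(n+1)\\ \vec\psi(n) \end{pmatrix}
= A_\lambda
\begin{pmatrix} \vec\psi(n)\\ \vec\psi(n-1) \end{pmatrix} ,
\tag{2.2}
\]
with
\[
A_\lambda = \begin{pmatrix} C + \lambda X & -I_l\\ I_l & 0 \end{pmatrix} , \qquad (l = 2) ,
\tag{2.3}
\]
where
\[
C = \begin{pmatrix} E & -1\\ -1 & E \end{pmatrix}
\quad\text{and}\quad
X = \begin{pmatrix} -v_{n,1} & 0\\ 0 & -v_{n,2} \end{pmatrix} .
\]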
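This formulation has the advantage that it generalizes to an arbitrary number l of lines, though of course here we only consider l = 1 or 2. We consider the invariant measure on the projective space $\mathbb{R}P^{2l-1} = P(\mathbb{R}^{2l})$. The equation for the invariant measure $\nu_\lambda^E$ on $\mathbb{R}P^{2l-1}$ reads
\[
\int_{\mathbb{R}P^{2l-1}} f(x)\,\nu_\lambda^E(dx) = \int_{\mathbb{R}P^{2l-1}} \mathbb{E}\bigl(f([A_\lambda x])\bigr)\,\nu_\lambda^E(dx) ,
\]
for all $f \in C(\mathbb{R}P^{2l-1})$. X is a random l × l matrix with finite moments and symmetric distribution and [y] denotes the class in $\mathbb{R}P^{2l-1}$ containing y. It is convenient to transform $A_\lambda$ to a more suitable form, $J_\lambda$ say, so that the limit $J_0 = \lim_{\lambda\to 0} J_\lambda$ is a real Jordan form of $A_0$. Let
\[
S A_0 S^{-1} = J_0
\tag{2.4}
\]
and
\[
S A_\lambda S^{-1} = J_\lambda .
\tag{2.5}
\]
In terms of the image measures
\[
\tilde\nu_\lambda^E = \nu_\lambda^E \circ S^{-1} ,
\tag{2.6}
\]
where S acts on $\mathbb{R}P^{2l-1}$ by $S[x] = [Sx]$, the invariance equation reads
\[
\int_{\mathbb{R}P^{2l-1}} f(x)\,\tilde\nu_\lambda^E(dx) = \int_{\mathbb{R}P^{2l-1}} \mathbb{E}\bigl(f([J_\lambda x])\bigr)\,\tilde\nu_\lambda^E(dx) ,
\tag{2.7}
\]
for all $f \in C(\mathbb{R}P^{2l-1})$. It is convenient to parametrize $\mathbb{R}P^{2l-1}$ by 2l − 1 angles. Let Ω be a compact parametrization space and $t\colon \mathbb{R}P^{2l-1} \to \Omega$ a parametrization of $P(\mathbb{R}^{2l})$. The parametrization for the two particular cases that we consider will be specified later. Defining
\[
\sigma_\lambda^E = \tilde\nu_\lambda^E \circ t^{-1} ,
\tag{2.8}
\]
the invariance equation becomes
\[
\int_\Omega g(\omega)\,\sigma_\lambda^E(d\omega) = \int_\Omega \mathbb{E}\bigl(g(t[J_\lambda t^{-1}\omega])\bigr)\,\sigma_\lambda^E(d\omega) ,
\tag{2.9}
\]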
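or, with the notation
\[
(T_\lambda g)(\omega) = \mathbb{E}\bigl(g(t[J_\lambda t^{-1}\omega])\bigr) ,
\tag{2.10}
\]
\[
\int_\Omega g(\omega)\,\sigma_\lambda^E(d\omega) = \int_\Omega (T_\lambda g)(\omega)\,\sigma_\lambda^E(d\omega) .
\tag{2.11}
\]
Now suppose that $\sigma_\lambda^E$ tends to $\sigma_0^E$ weakly as λ tends to 0 and $J_\lambda$ tends to $J_0$. Let
\[
(T_0 g)(\omega) = g(t[J_0 t^{-1}\omega]) .
\tag{2.12}
\]
The left hand side of (2.11) clearly converges to
\[
\int_\Omega g(\omega)\,\sigma_0^E(d\omega) .
\tag{2.13}
\]
For the right hand side we have
\[
\begin{aligned}
\left| \int_\Omega (T_\lambda g)(\omega)\,\sigma_\lambda^E(d\omega) - \int_\Omega (T_0 g)(\omega)\,\sigma_0^E(d\omega) \right|
&\le \left| \int_\Omega (T_\lambda g - T_0 g)(\omega)\,\sigma_\lambda^E(d\omega) \right|
 + \left| \int_\Omega (T_0 g)(\omega)\bigl(\sigma_0^E(d\omega) - \sigma_\lambda^E(d\omega)\bigr) \right| \\
&\le \|T_\lambda g - T_0 g\| + \left| \int_\Omega (T_0 g)(\omega)\bigl(\sigma_0^E(d\omega) - \sigma_\lambda^E(d\omega)\bigr) \right| .
\end{aligned}
\tag{2.14}
\]
Thus if $\|T_0 g - T_\lambda g\| \to 0$ as λ → 0, then
\[
\left| \int_\Omega (T_\lambda g)(\omega)\,\sigma_\lambda^E(d\omega) - \int_\Omega (T_0 g)(\omega)\,\sigma_0^E(d\omega) \right| \to 0
\tag{2.15}
\]
and therefore
\[
\int_\Omega g(\omega)\,\sigma_0^E(d\omega) = \int_\Omega (T_0 g)(\omega)\,\sigma_0^E(d\omega) .
\tag{2.16}
\]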
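This invariance equation together with ergodicity is enough in some cases to determine $\sigma_0^E$. For the other cases, we need the following result. We have, again by (2.11), for any positive integer q,
\[
\int_\Omega g_\lambda^q(\omega)\,\sigma_\lambda^E(d\omega) = 0 ,
\tag{2.17}
\]
where $g_\lambda^q = \lambda^{-2}(T_\lambda^q g - g)$. Suppose $g_\lambda^q$ converges in norm as λ → 0 to a function $g_0^q \in C(\Omega)$. Then
\[
\begin{aligned}
\left| \int_\Omega g_\lambda^q(\omega)\,\sigma_\lambda^E(d\omega) - \int_\Omega g_0^q(\omega)\,\sigma_0^E(d\omega) \right|
&\le \left| \int_\Omega (g_\lambda^q - g_0^q)(\omega)\,\sigma_\lambda^E(d\omega) \right|
 + \left| \int_\Omega g_0^q(\omega)\bigl(\sigma_0^E(d\omega) - \sigma_\lambda^E(d\omega)\bigr) \right| \\
&\le \|g_\lambda^q - g_0^q\| + \left| \int_\Omega g_0^q(\omega)\bigl(\sigma_0^E(d\omega) - \sigma_\lambda^E(d\omega)\bigr) \right| .
\end{aligned}
\tag{2.18}
\]
Thus
\[
\left| \int_\Omega g_\lambda^q(\omega)\,\sigma_\lambda^E(d\omega) - \int_\Omega g_0^q(\omega)\,\sigma_0^E(d\omega) \right| \to 0
\tag{2.19}
\]
and therefore
\[
\int_\Omega g_0^q(\omega)\,\sigma_0^E(d\omega) = 0 .
\tag{2.20}
\]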
The power q is usually chosen so that $T_0^q g = g$. The first order term in λ in $T_\lambda^q g$ is 0 because the distribution of the random matrix is symmetric. With an appropriate choice of g, we can determine $\sigma_0^E$ or obtain a differential equation for it. For these choices of g, we have to prove that $T_\lambda g$ converges to $T_0 g$ and $g_\lambda^q$ converges to $g_0^q$ in norm and that $g_0^q$ is bounded. Because the Jordan form is different in the different regions of the spectrum, we cannot compare the measures directly at the junctions and edges. We have to transform them to the same set of variables first. Most of the work goes into the calculation of the limit of $g_\lambda^q$ for the right choice of functions. We have
\[
(T_\lambda^q g)(\omega) = \mathbb{E}\left( g\left( t\left[ \prod_{n=1}^{q} J_\lambda^{(n)}\, t^{-1}\omega \right] \right) \right) ,
\tag{2.21}
\]
where $J_\lambda^{(n)} = S A_\lambda^{(n)} S^{-1}$. The $A_\lambda^{(n)}$ are 2l × 2l matrices of the form
\[
A_\lambda^{(n)} = \begin{pmatrix} C + \lambda X_n & -I_l\\ I_l & 0 \end{pmatrix} ,
\tag{2.22}
\]
where $X_1, X_2, \ldots$ are i.i.d. random l × l matrices with finite moments and symmetric distribution. Let
\[
B(m) = \prod_{n=1}^{m} A_\lambda^{(n)} ,
\tag{2.23}
\]
then
\[
(T_\lambda^q g)(\omega) = \mathbb{E}\left( g\left( t\left[ S \left( \prod_{n=1}^{q} A_\lambda^{(n)} \right) S^{-1} t^{-1}\omega \right] \right) \right)
= \mathbb{E}\bigl( g( t[ S B(q) S^{-1} t^{-1}\omega ] ) \bigr) .
\tag{2.24}
\]
To be able to expand to second order in λ, we shall need the following iteration result. The matrix C can be written as 2 cos G where G is an l × l matrix.

Lemma 2.1. Let
\[
\tau(x, r) = \frac{\sin rx}{\sin x}
\tag{2.25}
\]
and $T(r) = \tau(G, r)$. Then
\[
B(m) = B_0(m) + \lambda B_1(m) + \lambda^2 B_2(m)(X_1, \ldots, X_m) + O(\lambda^3) ,
\tag{2.26}
\]
where
\[
B_0(m) = \begin{pmatrix} T(m+1) & -T(m)\\ T(m) & -T(m-1) \end{pmatrix} ,
\tag{2.27}
\]
\[
B_1(m) = \frac12 \sum_{n=1}^{m}
\begin{pmatrix} T(n) & T(n)\\ T(n-1) & T(n-1) \end{pmatrix}
\begin{pmatrix} X_n & 0\\ 0 & X_n \end{pmatrix}
\begin{pmatrix} T(m-n+1) & -T(m-n)\\ T(m-n+1) & -T(m-n) \end{pmatrix} ,
\tag{2.28}
\]
and $\mathbb{E}(B_2(m)) = 0$.

The proof is by induction using the identity
\[
T(r) = 2\cos G\; T(r-1) - T(r-2) = 2\,T(r-1)\cos G - T(r-2) .
\tag{2.29}
\]
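Lemma 2.1 can be sanity-checked numerically for a single chain (l = 1), where G = α is a scalar and T(r) = τ(α, r). The following is a minimal sketch only, not part of the proof: the function names are hypothetical, and it assumes that the product in (2.23) is ordered as $A_\lambda^{(1)} A_\lambda^{(2)} \cdots A_\lambda^{(m)}$. It compares B(m) at λ = 0 with (2.27), and a central difference in λ with (2.28) written out entrywise.

```python
import math

def tau(x, r):
    """tau(x, r) = sin(r x) / sin(x), as in (2.25)."""
    return math.sin(r * x) / math.sin(x)

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def product(alpha, lam, xs):
    """B(m) = A^(1) ... A^(m), A^(n) = [[2 cos(alpha) + lam X_n, -1], [1, 0]].
    The left-to-right ordering is an assumption of this sketch."""
    b = [[1.0, 0.0], [0.0, 1.0]]
    for x in xs:
        b = matmul(b, [[2 * math.cos(alpha) + lam * x, -1.0], [1.0, 0.0]])
    return b

def b0(alpha, m):
    """Zeroth-order term (2.27) for l = 1."""
    return [[tau(alpha, m + 1), -tau(alpha, m)],
            [tau(alpha, m), -tau(alpha, m - 1)]]

def b1(alpha, xs):
    """First-order term (2.28) for l = 1, multiplied out entry by entry."""
    m = len(xs)
    out = [[0.0, 0.0], [0.0, 0.0]]
    for n, x in enumerate(xs, start=1):
        out[0][0] += tau(alpha, n) * x * tau(alpha, m - n + 1)
        out[0][1] -= tau(alpha, n) * x * tau(alpha, m - n)
        out[1][0] += tau(alpha, n - 1) * x * tau(alpha, m - n + 1)
        out[1][1] -= tau(alpha, n - 1) * x * tau(alpha, m - n)
    return out
```

With fixed sample values for the $X_n$, `product(alpha, 0.0, xs)` should agree with `b0`, and `(product(alpha, h, xs) - product(alpha, -h, xs)) / (2h)` should agree with `b1` up to the discretization error.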
3. The Case of a Single Chain (l = 1)

In this section we study the case l = 1, i.e. a single chain. In this case, the projective space $\mathbb{R}P^1$ is homeomorphic to the circle and there is an obvious parametrization on Ω = [0, π), identifying 0 and π, defined by the map $t\colon \mathbb{R}P^1 \to \Omega$ given by
\[
\theta = \begin{cases}
\cot^{-1}\dfrac{x_2}{x_1} \in (0, \pi) , & \text{if } x_1 \neq 0 ,\\[2mm]
0 , & \text{if } x_1 = 0 .
\end{cases}
\tag{3.1}
\]
We put C(Ω) = {f | f ∈ C([0, π]), f(0) = f(π)}. Recall that E ∈ [−2, 2], so that we can write E = 2 cos α with α ∈ [0, π] and
\[
A_\lambda = \begin{pmatrix} 2\cos\alpha + \lambda X & -1\\ 1 & 0 \end{pmatrix} .
\tag{3.2}
\]
For simplicity we take the variance of X to be 1. To proceed, we need the following lemma.

Lemma 3.1. If the first r + 1 derivatives of g are bounded, then we have
\[
\lim_{\lambda\to 0} \lambda^{-r} \left\| T_\lambda^q g - \sum_{k=0}^{r} \frac{\lambda^k}{k!} \left. \frac{\partial^k}{\partial\lambda^k} T_\lambda^q g \right|_{\lambda=0} \right\| = 0 .
\tag{3.3}
\]

Proof. We note first that if M is an n × n matrix with det M = ±1, then
\[
\|Mx\| \ge \frac{\|x\|}{n!\,\|M\|^{n-1}} .
\tag{3.4}
\]
This follows from the inequality, obtained from Cramer's formula for $M^{-1}$,
\[
\|M^{-1}\| \le \frac{n!\,\|M\|^{n-1}}{|\det M|} = n!\,\|M\|^{n-1} ,
\tag{3.5}
\]
which gives
\[
\|x\| = \|M^{-1} M x\| \le \|M^{-1}\|\,\|Mx\| \le n!\,\|M\|^{n-1}\,\|Mx\| .
\tag{3.6}
\]
Let M(λ) be a 2 × 2 matrix with det M(λ) = ±1.
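Now let
\[
f_\lambda(x) = \tan^{-1}\frac{x_1'}{x_2'} ,
\tag{3.7}
\]
where $x' = M(\lambda)x$, and let $M^{(r)} \equiv \partial^r M(\lambda)/\partial\lambda^r$. Then
\[
\frac{\partial}{\partial\lambda} f_\lambda(x) = \frac{M^{(1)}(\lambda)x \wedge M(\lambda)x}{\|M(\lambda)x\|^2}
\tag{3.8}
\]
and so
\[
\left| \frac{\partial}{\partial\lambda} f_\lambda(x) \right| \le \frac{\|M^{(1)}(\lambda)x\|}{\|M(\lambda)x\|} .
\tag{3.9}
\]
Similarly, one derives bounds for higher derivatives. In general,
\[
\left| \frac{\partial^k}{\partial\lambda^k} f_\lambda(x) \right| \le \sum_{p=1}^{k} \sum_{\substack{r_1+r_2+\cdots+r_p=k\\ r_i\ge 1}} C_{r_1,\ldots,r_p} \frac{\|M^{(r_1)}(\lambda)x\| \cdots \|M^{(r_p)}(\lambda)x\|}{\|M(\lambda)x\|^p} .
\tag{3.10}
\]
Inserting (3.4) we get
\[
\left| \frac{\partial^k}{\partial\lambda^k} f_\lambda(x) \right| \le \sum_{p=1}^{k} \sum_{\substack{r_1+r_2+\cdots+r_p=k\\ r_i\ge 1}} C_{r_1,\ldots,r_p}\, 2^p\, \|M^{(r_1)}(\lambda)\| \cdots \|M^{(r_p)}(\lambda)\|\, \|M(\lambda)\|^p .
\tag{3.11}
\]
Now we take $M(\lambda) = \prod_{n=1}^{q} D_\lambda^{(n)}$, where $D_\lambda^{(n)} = S A_\lambda^{(n)} S^{-1}$. Note that $\partial^2 D_\lambda^{(n)}/\partial\lambda^2 = 0$ and there exists a constant C such that both $\|D_\lambda^{(n)}\| \le C + \|X_n\|$ and $\|\partial D_\lambda^{(n)}/\partial\lambda\| \le C$ for all n and all λ ∈ [−1, 1]. Since the random variables $X_n$ have finite moments, (3.11) then gives for any k ∈ ℕ,
\[
\left| \mathbb{E}\, \frac{\partial^k}{\partial\lambda^k} f_\lambda(x) \right| \le \mathbb{E} \left| \frac{\partial^k}{\partial\lambda^k} f_\lambda(x) \right| \le C_k .
\tag{3.12}
\]
If $h_\lambda = g \circ f_\lambda$, where the first k derivatives of g are bounded, then we also have
\[
\left| \mathbb{E}\, \frac{\partial^k}{\partial\lambda^k} h_\lambda(x) \right| \le K_k .
\tag{3.13}
\]
Since $T_\lambda^q g = \mathbb{E}(h_\lambda \circ t^{-1})$, this gives
\[
\left\| \frac{\partial^k}{\partial\lambda^k} T_\lambda^q g \right\| \le K_k .
\tag{3.14}
\]
By using the Mean-Value Theorem, we then see that if the first r + 1 derivatives of g are bounded, then we also have
\[
\lim_{\lambda\to 0} \lambda^{-r} \left\| T_\lambda^q g - \sum_{k=0}^{r} \frac{\lambda^k}{k!} \left. \frac{\partial^k}{\partial\lambda^k} T_\lambda^q g \right|_{\lambda=0} \right\| = 0 .
\tag{3.15}
\]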
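We now first consider the case E ≠ ±2. Then the real Jordan form of $A_0$ is $R_\alpha$, the rotation by α,
\[
R_\alpha = \begin{pmatrix} \cos\alpha & \sin\alpha\\ -\sin\alpha & \cos\alpha \end{pmatrix} .
\tag{3.16}
\]
We have
\[
J_0 = S A_0 S^{-1} = R_\alpha ,
\tag{3.17}
\]
where
\[
S = \begin{pmatrix} \sin\alpha & 0\\ \cos\alpha & -1 \end{pmatrix} .
\tag{3.18}
\]
As a result
\[
(T_0 g)(\theta) = g((\theta - \alpha) \bmod \pi) .
\tag{3.19}
\]
If g ∈ C(Ω) has bounded first derivative, it follows from Lemma 3.1 that $\|T_0 g - T_\lambda g\| \to 0$, and therefore for such g the invariance equation (2.16) for $\sigma_0^E$ holds. If α is not a rational multiple of π, the invariance equation (2.16) and ergodicity imply that $\sigma_0^E$ is the uniform measure on [0, π). If α = pπ/q is a rational multiple of π, we use the fact that $T_0^q$ is the identity map, I. Since the random variables $X_n$ are symmetric, $\left.\frac{\partial}{\partial\lambda} T_\lambda^q g\right|_{\lambda=0} = 0$. Therefore, if the first three derivatives of g are bounded, Lemma 3.1 shows that $\left.\frac{\partial^2}{\partial\lambda^2} T_\lambda^q g\right|_{\lambda=0}$ is bounded and
\[
\lim_{\lambda\to 0} \left\| \lambda^{-2}(T_\lambda^q g - g) - \frac12 \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda^q g \right|_{\lambda=0} \right\| = 0 .
\tag{3.20}
\]
If $\left.\frac{\partial^2}{\partial\lambda^2} T_\lambda^q g\right|_{\lambda=0}$ is continuous, Eq. (2.20) then yields
\[
\int_\Omega \left( \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda^q g \right|_{\lambda=0} \right)(\theta)\, \sigma_0^E(d\theta) = 0 .
\tag{3.21}
\]
We now calculate $\left.\frac{\partial^2}{\partial\lambda^2} T_\lambda^q g\right|_{\lambda=0}$ with $g(\theta) = e^{2in\theta}$. We have from Lemma 2.1,
\[
B(q) = B_0(q) + \lambda B_1(q) + \lambda^2 B_2(q)(X_1, \ldots, X_q) + O(\lambda^3) ,
\tag{3.22}
\]
where $B_0(q) = (-1)^p I_2$ and
\[
B_1(q) = (-1)^p \begin{pmatrix} -X & Y\\ -Z & X \end{pmatrix}
\tag{3.23}
\]
with
\[
X = \sum_{n=1}^{q} \tau(\alpha, n-1)\,\tau(\alpha, n)\,X_n ,
\tag{3.24}
\]
\[
Y = \sum_{n=1}^{q} \tau(\alpha, n)^2\,X_n ,
\tag{3.25}
\]
\[
Z = \sum_{n=1}^{q} \tau(\alpha, n-1)^2\,X_n .
\tag{3.26}
\]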
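If α ≠ π/2,
\[
\mathbb{E}(X^2) = \mathbb{E}(YZ) = \frac{(3 - 2\sin^2\alpha)\,q}{8\sin^4\alpha} ,
\tag{3.27}
\]
\[
\mathbb{E}(Y^2) = \mathbb{E}(Z^2) = \frac{3q}{8\sin^4\alpha} ,
\tag{3.28}
\]
\[
\mathbb{E}(XY) = \mathbb{E}(ZX) = \frac{3q\cos\alpha}{8\sin^4\alpha} .
\tag{3.29}
\]
If α = π/2 then
\[
\mathbb{E}(X^2) = \mathbb{E}(XY) = \mathbb{E}(YZ) = \mathbb{E}(XZ) = 0
\tag{3.30}
\]
and
\[
\mathbb{E}(Y^2) = \mathbb{E}(Z^2) = 1 .
\tag{3.31}
\]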
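The moments (3.27)–(3.31) follow from (3.24)–(3.26) once the $X_n$ are taken to be independent with mean 0 and variance 1, since then every second moment reduces to a finite trigonometric sum. The sketch below evaluates these sums directly; the function name and the dictionary keys are hypothetical, introduced only for this illustration.

```python
import math

def tau(x, r):
    """tau(x, r) = sin(r x) / sin(x), as in (2.25)."""
    return math.sin(r * x) / math.sin(x)

def second_moments(p, q):
    """Second moments of X, Y, Z from (3.24)-(3.26) for alpha = p*pi/q,
    assuming the X_n are independent with mean 0 and variance 1."""
    alpha = p * math.pi / q
    cx = [tau(alpha, n - 1) * tau(alpha, n) for n in range(1, q + 1)]  # X
    cy = [tau(alpha, n) ** 2 for n in range(1, q + 1)]                 # Y
    cz = [tau(alpha, n - 1) ** 2 for n in range(1, q + 1)]             # Z

    def dot(u, v):
        # E(UV) = sum of products of coefficients, by independence.
        return sum(s * t for s, t in zip(u, v))

    return {"XX": dot(cx, cx), "YY": dot(cy, cy), "ZZ": dot(cz, cz),
            "XY": dot(cx, cy), "YZ": dot(cy, cz), "ZX": dot(cz, cx)}
```

For example, `second_moments(1, 5)` can be compared with (3.27)–(3.29) at α = π/5, and `second_moments(1, 2)` with (3.30)–(3.31).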
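Let
\[
\tilde B_1(q) = S B_1(q) S^{-1} = (-1)^p \begin{pmatrix} -Z_1 & -Z_2\\ Z_3 & Z_1 \end{pmatrix} ,
\tag{3.32}
\]
where $Z_1 = X - Y\cos\alpha$, $Z_2 = Y\sin\alpha$, $Z_3 = (Z + Y\cos^2\alpha - 2X\cos\alpha)/\sin\alpha$. If α ≠ π/2 then
\[
\mathbb{E}(Z_2^2) = \mathbb{E}(Z_3^2) = \frac{3q}{8\sin^2\alpha} ,
\tag{3.33}
\]
\[
\mathbb{E}(Z_1^2) = \mathbb{E}(Z_2 Z_3) = \frac{q}{8\sin^2\alpha} ,
\tag{3.34}
\]
\[
\mathbb{E}(Z_1 Z_2) = \mathbb{E}(Z_1 Z_3) = 0 .
\tag{3.35}
\]
If α = π/2 then
\[
\mathbb{E}(Z_1^2) = \mathbb{E}(Z_1 Z_2) = \mathbb{E}(Z_2 Z_3) = \mathbb{E}(Z_3 Z_1) = 0
\tag{3.36}
\]
and
\[
\mathbb{E}(Z_2^2) = \mathbb{E}(Z_3^2) = 1 .
\tag{3.37}
\]
Now
\[
\tilde B(q) \equiv S B(q) S^{-1} = (-1)^p I_2 + \lambda \tilde B_1(q) + \lambda^2 \tilde B_2(q) + O(\lambda^3) ,
\tag{3.38}
\]
where $\mathbb{E}(\tilde B_2(q)) = 0$. If we put
\[
x = \begin{pmatrix} \sin\theta\\ \cos\theta \end{pmatrix}
\]
and $x' = \tilde B(q)x$, then
\[
x_1' = (-1)^p \{ (1 - \lambda Z_1)\sin\theta - \lambda Z_2 \cos\theta \} + \lambda^2 w_1 + O(\lambda^3)
\tag{3.39}
\]
and
\[
x_2' = (-1)^p \{ \lambda Z_3 \sin\theta + (1 + \lambda Z_1)\cos\theta \} + \lambda^2 w_2 + O(\lambda^3) ,
\tag{3.40}
\]
where $\mathbb{E}(w) = 0$. Writing
\[
x' = \begin{pmatrix} \sin\theta'\\ \cos\theta' \end{pmatrix} ,
\]
so that $\theta' = t[\tilde B(q) t^{-1}\theta]$, we have
\[
\tan\theta' = \frac{x_1'}{x_2'} = \tan\theta + \lambda U + \lambda^2 V + O(\lambda^3) ,
\tag{3.41}
\]
where
\[
U = -2\tan\theta\, Z_1 - \tan^2\theta\, Z_3 - Z_2 ,
\tag{3.42}
\]
\[
V = 2\tan\theta\, Z_1^2 + \tan^3\theta\, Z_3^2 + Z_1 Z_2 + \tan\theta\, Z_2 Z_3 + 3\tan^2\theta\, Z_3 Z_1 + (-1)^p \sec^2\theta\,(w_1\cos\theta - w_2\sin\theta) .
\tag{3.43}
\]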
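The expansion (3.41)–(3.43) is a second-order Taylor expansion of the ratio $x_1'/x_2'$ in λ. As an illustration, the sketch below evaluates the ratio from (3.39)–(3.40) with the second-order vector w set to zero and the $O(\lambda^3)$ terms dropped (both simplifications are assumptions of this check, under which the factor $(-1)^p$ cancels), and compares it with $\tan\theta + \lambda U + \lambda^2 V$. The function names are hypothetical.

```python
import math

def tan_theta_prime(theta, lam, z1, z2, z3):
    """x'_1 / x'_2 from (3.39)-(3.40), with w = 0 and O(lam^3) terms dropped."""
    x1 = (1 - lam * z1) * math.sin(theta) - lam * z2 * math.cos(theta)
    x2 = lam * z3 * math.sin(theta) + (1 + lam * z1) * math.cos(theta)
    return x1 / x2

def u_coeff(theta, z1, z2, z3):
    """First-order coefficient U of (3.42)."""
    t = math.tan(theta)
    return -2 * t * z1 - t * t * z3 - z2

def v_coeff(theta, z1, z2, z3):
    """Second-order coefficient V of (3.43), without the w-dependent term."""
    t = math.tan(theta)
    return (2 * t * z1 ** 2 + t ** 3 * z3 ** 2 + z1 * z2
            + t * z2 * z3 + 3 * t * t * z3 * z1)
```

For small λ the difference between `tan_theta_prime` and the truncated series should shrink like λ³.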
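We then get
\[
\exp(2in\theta') = \left( \frac{1 + i\tan\theta'}{1 - i\tan\theta'} \right)^{n}
= \exp(2in\theta) \bigl\{ 1 + 2i\lambda n U \cos^2\theta - 2in\lambda^2 \cos^4\theta\, \bigl( U^2(\tan\theta - in) - V\sec^2\theta \bigr) + O(\lambda^3) \bigr\} .
\]
From Lemma 3.1 we see that the $O(\lambda^3)$ term has finite expectation and therefore
\[
\mathbb{E}(\exp(2in\theta')) = \exp(2in\theta) \bigl\{ 1 - 2i\lambda^2 \bigl[ n\cos^4\theta\, \bigl( \mathbb{E}(U^2)(\tan\theta - in) - \mathbb{E}(V)\sec^2\theta \bigr) \bigr] + O(\lambda^3) \bigr\} ,
\tag{3.44}
\]
and thus
\[
\lim_{\lambda\to 0} \lambda^{-2}\, \mathbb{E}\bigl[ \exp(2in\theta') - \exp(2in\theta) \bigr] = \exp(2in\theta) \{ A_1 n + A_{11} n^2 \} ,
\tag{3.45}
\]
where
\[
A_1 = 2i\cos^4\theta\, \bigl( \mathbb{E}(V)\sec^2\theta - \mathbb{E}(U^2)\tan\theta \bigr) ,
\tag{3.46}
\]
\[
A_{11} = -2\,\mathbb{E}(U^2)\cos^4\theta .
\tag{3.47}
\]
If α ≠ π/2,
\[
\mathbb{E}(U^2) = \frac{3q}{8\sin^2\alpha}\sec^4\theta , \qquad
\mathbb{E}(V) = \frac{3q}{8\sin^2\alpha}\tan\theta\,\sec^2\theta ,
\tag{3.48}
\]
and
\[
A_1 = 0 , \qquad A_{11} = -\frac{3q}{4\sin^2\alpha} .
\tag{3.49}
\]
If α = π/2,
\[
\mathbb{E}(U^2) = 1 + \tan^4\theta , \qquad \mathbb{E}(V) = \tan^3\theta ,
\tag{3.50}
\]
and
\[
A_1 = 2i(\cos\theta\sin^3\theta - \sin\theta\cos^3\theta) = -\tfrac12\, i \sin 4\theta ,
\tag{3.51}
\]
\[
A_{11} = -2(\sin^4\theta + \cos^4\theta) = -\tfrac12 (\cos 4\theta + 3) .
\tag{3.52}
\]
From (3.21) with $g(\theta) = e^{2in\theta}$ we have
\[
\int_{[0,\pi)} e^{2in\theta} \{ A_1(\theta) + n A_{11}(\theta) \}\, \sigma_0^E(d\theta) = 0
\tag{3.53}
\]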
for n ≠ 0. Recall that the set $\{ e^{2in\theta} \mid n \in \mathbb{Z} \}$ is total in the space C(Ω). In the case when α ≠ π/2, (3.53) gives immediately
\[
\int_{[0,\pi)} e^{2in\theta}\, \sigma_0^E(d\theta) = 0
\tag{3.54}
\]
for n ≠ 0, and therefore $\sigma_0^E(d\theta) = d\theta/\pi$. In the case when α = π/2, that is E = 0, since X is a symmetric random variable, $\sigma_0^0$ is symmetric about π/2. It can be seen from the invariance equation that if $\sigma_\lambda^0$ is an invariant measure, so is its reflection about π/2. By the uniqueness of the invariant measure for λ ≠ 0 it follows that $\sigma_\lambda^0$ is symmetric and therefore so is $\sigma_0^0$. We can integrate by parts in (3.53) to get
\[
\int_{[0,\pi)} e^{2in\theta} A_1(\theta)\, \sigma_0^0(d\theta)
= \int_{[0,\pi)} A_1(\theta)\, \sigma_0^0(d\theta) - 2in \int_{[0,\pi)} e^{2in\theta} \int_{[0,\theta)} A_1(\theta')\, \sigma_0^0(d\theta')\, d\theta .
\tag{3.55}
\]
Since $\sigma_0^0$ is symmetric about π/2,
\[
\int_{[0,\pi)} A_1(\theta)\, \sigma_0^0(d\theta) = 0 ,
\tag{3.56}
\]
and Eq. (3.55) gives
\[
2i \int_{[0,\pi)} e^{2in\theta} \int_{[0,\theta)} A_1(\theta')\, \sigma_0^0(d\theta')\, d\theta = \int_{[0,\pi)} e^{2in\theta} A_{11}(\theta)\, \sigma_0^0(d\theta) .
\tag{3.57}
\]
Hence
\[
A_{11}(\theta)\, \sigma_0^0(d\theta) = 2i \int_{[0,\theta)} A_1(\theta')\, \sigma_0^0(d\theta')\, d\theta + K\, d\theta ,
\tag{3.58}
\]
where K is a constant. Since $A_{11}(\theta) \neq 0$, this implies that $\sigma_0^0$ is absolutely continuous. If $\rho_0$ is the density of $\sigma_0^0$ then
\[
A_{11}(\theta)\rho_0(\theta) = 2i \int_{[0,\theta)} A_1(\theta')\rho_0(\theta')\, d\theta' + K .
\tag{3.59}
\]
Thus $\rho_0$ is differentiable and
\[
(\cos 4\theta + 3)\,\rho_0'(\theta) = 2\sin 4\theta\, \rho_0(\theta) .
\tag{3.60}
\]
Integrating, we get
\[
\rho_0(\theta) = C(\cos 4\theta + 3)^{-\frac12} .
\tag{3.61}
\]
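Both steps around (3.61) can be checked numerically: that the density solves (3.60), and that it turns into the E = 0 branch of (1.6) under a change of variable. The sketch below is an illustration under a stated assumption, namely that the variable x in (1.6) is related to the angle by x = tan θ (x = cot θ gives the same result, since cos 4θ is unchanged under θ → π/2 − θ). The function names are hypothetical.

```python
import math

def rho0(theta, c=1.0):
    """Density (3.61): C (cos 4 theta + 3)^(-1/2)."""
    return c / math.sqrt(math.cos(4 * theta) + 3)

def ode_residual(theta, h=1e-5):
    """Left side minus right side of (3.60), with rho0' from a central difference."""
    d = (rho0(theta + h) - rho0(theta - h)) / (2 * h)
    return (math.cos(4 * theta) + 3) * d - 2 * math.sin(4 * theta) * rho0(theta)

def density_in_x(x, c=1.0):
    """Density of x = tan(theta) when theta has density rho0:
    rho0(theta) * d(theta)/dx with d(theta)/dx = 1 / (1 + x^2)."""
    return rho0(math.atan(x), c) / (1 + x * x)
```

With C = 1 the product `density_in_x(x) * sqrt(x**4 + 1)` should be the constant 1/2, which is the form $c_0/\sqrt{x^4+1}$ of (1.6).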
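This corresponds to Eq. (1.6) for E = 0.

Now suppose that E = 2. The case when E = −2 is similar. Here the real Jordan form for $A_0$ is
\[
J_0 = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix} .
\tag{3.62}
\]
The matrix S is now given by
\[
S_2 = \begin{pmatrix} 0 & 1\\ 1 & -1 \end{pmatrix} .
\tag{3.63}
\]
Note that
\[
J_0^q = \begin{pmatrix} 1 & q\\ 0 & 1 \end{pmatrix}
\tag{3.64}
\]
and therefore
\[
(T_0^q g)(\theta) = g(\theta^{(q)}) ,
\tag{3.65}
\]
where $\theta^{(q)}$ is given by
\[
\cot\theta^{(q)} = \begin{cases}
\dfrac{1}{q} , & \text{if } \theta = 0 ,\\[2mm]
\dfrac{\cot\theta}{1 + q\cot\theta} , & \text{if } \theta \neq 0 .
\end{cases}
\tag{3.66}
\]
It follows that $\theta^{(q)} \to \pi/2$ as q → ∞. We now have
\[
\int_\Omega g(\theta)\, \sigma_0^2(d\theta) = \lim_{q\to\infty} \int_\Omega (T_0^q g)(\theta)\, \sigma_0^2(d\theta) .
\tag{3.67}
\]
Thus we have, for n ∈ ℤ,
\[
\int_\Omega e^{2in\theta}\, \sigma_0^2(d\theta) = \int_{\Omega \cap \{\theta = \pi/2\}} e^{2in\theta}\, \sigma_0^2(d\theta) .
\tag{3.68}
\]
Therefore $\sigma_0^2$ is concentrated on $\Omega \cap \{\theta = \pi/2\}$, i.e. $\sigma_0^2 = \delta_{\pi/2}$.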
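The convergence $\theta^{(q)} \to \pi/2$ behind (3.67) is easy to see numerically by applying $J_0^q$ from (3.64) to the unit vector (sin θ, cos θ) and reading off the angle. This is a small illustrative sketch with a hypothetical function name; it uses `atan2` reduced modulo π as one way to realize the map t of (3.1).

```python
import math

def theta_q(theta, q):
    """Angle of J_0^q (sin theta, cos theta)^T, with J_0^q = [[1, q], [0, 1]].
    The result lies in [0, pi) and satisfies tan(theta_q) = x1 / x2."""
    x1 = math.sin(theta) + q * math.cos(theta)
    x2 = math.cos(theta)
    return math.atan2(x1, x2) % math.pi
```

For θ ≠ 0 the value satisfies cot θ^{(q)} = cot θ / (1 + q cot θ), as in (3.66), and for large q it approaches π/2 from every starting angle.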
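To investigate whether there is an anomaly at E = 2, we need to transform the invariant measure dθ to the coordinates given by the matrix $S_2$. Calling the new angle coordinate θ′, the transformation is given by
\[
\begin{pmatrix} \sin\theta'\\ \cos\theta' \end{pmatrix} = S_2 S^{-1} \begin{pmatrix} \sin\theta\\ \cos\theta \end{pmatrix}
\tag{3.69}
\]
(up to normalization) and
\[
S_2 S^{-1} = \begin{pmatrix} 0 & 1\\ 1 & -1 \end{pmatrix} \begin{pmatrix} \operatorname{cosec}\alpha & 0\\ \cot\alpha & -1 \end{pmatrix}
= \begin{pmatrix} \cot\alpha & -1\\[1mm] \dfrac{1-\cos\alpha}{\sin\alpha} & 1 \end{pmatrix} .
\tag{3.70}
\]
Hence
\[
\cot\theta' = \frac{\sin\alpha\,\cot\theta + 1 - \cos\alpha}{\cos\alpha - \sin\alpha\,\cot\theta}
\tag{3.71}
\]
and
\[
d\theta = \frac{\sin\alpha\; d\theta'}{\sin^2\theta'\, \bigl[ \cot^2\theta' + 2(1 - \cos\alpha)(1 + \cot\theta') \bigr]} .
\tag{3.72}
\]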
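The Jacobian in (3.72) can be compared with a finite-difference derivative of the map θ → θ′ defined by (3.71). The sketch below is a numerical illustration only; the function names are hypothetical, and the bracketing of the denominator in `density_372` follows the reading of (3.72) in which $\sin^2\theta'$ multiplies the whole bracket.

```python
import math

def theta_prime(theta, alpha):
    """New angle theta' in (0, pi) defined through cot(theta') in (3.71)."""
    c = 1 / math.tan(theta)
    cp = ((math.sin(alpha) * c + 1 - math.cos(alpha))
          / (math.cos(alpha) - math.sin(alpha) * c))
    # atan2(1, cp) is the angle in (0, pi) whose cotangent is cp.
    return math.atan2(1.0, cp)

def density_372(theta_p, alpha):
    """d(theta)/d(theta') according to (3.72)."""
    cp = 1 / math.tan(theta_p)
    bracket = cp ** 2 + 2 * (1 - math.cos(alpha)) * (1 + cp)
    return math.sin(alpha) / (math.sin(theta_p) ** 2 * bracket)
```

Away from the point where the denominator of (3.71) vanishes, the reciprocal of the numerical derivative dθ′/dθ should match `density_372` evaluated at θ′.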
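As α tends to 0, i.e. E → 2, this measure tends to $\delta_{\pi/2}$, so there is no (zeroth-order) anomaly at E = 2.

4. The Case of Two Coupled Chains (l = 2)

4.1. Parametrization

In the case of two coupled chains (l = 2), the matrix C in (2.3) is given by
\[
C = \begin{pmatrix} E & -1\\ -1 & E \end{pmatrix} .
\tag{4.1}
\]
If
\[
U = \frac{1}{\sqrt2} \begin{pmatrix} 1 & 1\\ -1 & 1 \end{pmatrix} ,
\tag{4.2}
\]
then $C = U D U^*$, where
\[
D = \begin{pmatrix} E+1 & 0\\ 0 & E-1 \end{pmatrix} .
\tag{4.3}
\]
We can write $D = 2\cos D_0$ with
\[
D_0 = \begin{pmatrix} \alpha & 0\\ 0 & \beta \end{pmatrix} ,
\tag{4.4}
\]
where α and β are defined by 2 cos α = E + 1 and 2 cos β = E − 1. Note that α and β are not always real. It follows that $G = U D_0 U^*$ and $T(r) = U \tau(D_0, r) U^*$. Thus
\[
T(r) = \frac12 \begin{pmatrix} \tau(\alpha, r) + \tau(\beta, r) & -\tau(\alpha, r) + \tau(\beta, r)\\ -\tau(\alpha, r) + \tau(\beta, r) & \tau(\alpha, r) + \tau(\beta, r) \end{pmatrix} .
\tag{4.5}
\]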
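As a consistency check of (4.5) against the identity (2.29), one can verify that T(1) is the identity, T(2) equals C, and T(r) = C T(r − 1) − T(r − 2). The sketch below is an illustration with hypothetical function names; it uses complex arithmetic because α and β need not be real, and it assumes E is not one of the band edges ±1, ±3, where sin α or sin β vanishes.

```python
import cmath

def tau(x, r):
    """tau(x, r) = sin(r x) / sin(x) of (2.25); x may be complex."""
    return cmath.sin(r * x) / cmath.sin(x)

def angles(energy):
    """alpha and beta with 2 cos(alpha) = E + 1 and 2 cos(beta) = E - 1."""
    return cmath.acos((energy + 1) / 2), cmath.acos((energy - 1) / 2)

def t_matrix(energy, r):
    """The 2x2 matrix T(r) of (4.5)."""
    alpha, beta = angles(energy)
    s = tau(alpha, r) + tau(beta, r)
    d = -tau(alpha, r) + tau(beta, r)
    return [[s / 2, d / 2], [d / 2, s / 2]]
```

The same functions work inside the band (for example E = 0.4, real angles) and outside (for example E = 2.5, where α is purely imaginary).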
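The real Jordan form of $A_0$ is always of the form
\[
J_0 = \begin{pmatrix} J_1 & 0\\ 0 & J_2 \end{pmatrix} .
\tag{4.6}
\]
It is therefore convenient to parametrize the projective space $\mathbb{R}P^3$ so that the 1–2 plane and the 3–4 plane have the usual parametrization. We map the projective space $\mathbb{R}P^3$ onto the set $\Omega = \Omega_{(0,\pi/2)} \cup \Omega_0 \cup \Omega_{\pi/2}$, where
\[
\Omega_{(0,\pi/2)} = [0, 2\pi) \times [0, \pi) \times \left( 0, \frac{\pi}{2} \right) , \qquad
\Omega_0 = [0, \pi) \times \{0\} , \qquad
\Omega_{\pi/2} = [0, \pi) \times \left\{ \frac{\pi}{2} \right\} ,
\tag{4.7}
\]
by the mapping $t\colon \mathbb{R}P^3 \to \Omega$ defined as follows. If $x_1^2 + x_2^2 \neq 0$ and $x_3^2 + x_4^2 \neq 0$,
\[
t(x) = (\theta_1, \theta_2, \theta_3) \in \Omega_{(0,\pi/2)} ,
\tag{4.8}
\]
where
\[
\theta_1 = \begin{cases}
\cot^{-1}\dfrac{x_2}{x_1} \in (0, \pi) , & \text{if } x_1 > 0 ,\\[2mm]
\cot^{-1}\dfrac{x_2}{x_1} + \pi \in (\pi, 2\pi) , & \text{if } x_1 < 0 ,\\[2mm]
0 , & \text{if } x_1 = 0 \text{ and } x_2 > 0 ,\\
\pi , & \text{if } x_1 = 0 \text{ and } x_2 < 0 ,
\end{cases}
\tag{4.9}
\]
\[
\theta_2 = \begin{cases}
\cot^{-1}\dfrac{x_4}{x_3} \in (0, \pi) , & \text{if } x_3 \neq 0 ,\\[2mm]
0 , & \text{if } x_3 = 0 ,
\end{cases}
\tag{4.10}
\]
\[
\theta_3 = \cot^{-1} \sqrt{\frac{x_3^2 + x_4^2}{x_1^2 + x_2^2}} \in \left( 0, \frac{\pi}{2} \right) .
\tag{4.11}
\]
If $x_1^2 + x_2^2 = 0$,
\[
t(x) = (\theta_2, 0) \in \Omega_0 ,
\tag{4.12}
\]
where
\[
\theta_2 = \begin{cases}
\cot^{-1}\dfrac{x_4}{x_3} \in (0, \pi) , & \text{if } x_3 \neq 0 ,\\[2mm]
0 , & \text{if } x_3 = 0 .
\end{cases}
\tag{4.13}
\]
If $x_3^2 + x_4^2 = 0$,
\[
t(x) = \left( \theta_1, \frac{\pi}{2} \right) \in \Omega_{\pi/2} ,
\tag{4.14}
\]
where
\[
\theta_1 = \begin{cases}
\cot^{-1}\dfrac{x_2}{x_1} \in (0, \pi) , & \text{if } x_1 \neq 0 ,\\[2mm]
0 , & \text{if } x_1 = 0 .
\end{cases}
\tag{4.15}
\]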
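The parametrization (4.8)–(4.15) can be written as a short function. The following is a sketch under an explicit assumption that the text does not state: since x and −x represent the same point of the projective space, the representative is first normalized so that $x_3 > 0$, or $x_3 = 0$ and $x_4 > 0$, before the case distinction of (4.9) is applied. The names `t`, `point` and `acot` are chosen for this illustration.

```python
import math

def acot(u):
    """cot^{-1}(u) with values in (0, pi)."""
    return math.atan2(1.0, u)

def point(theta1, theta2, theta3):
    """Unit vector with angles (theta1, theta2, theta3), as used in (4.32)."""
    return (math.sin(theta1) * math.sin(theta3), math.cos(theta1) * math.sin(theta3),
            math.sin(theta2) * math.cos(theta3), math.cos(theta2) * math.cos(theta3))

def t(x):
    """Angles of the class [x] following (4.8)-(4.15).
    Assumption: the representative is normalized to x3 > 0 (or x3 = 0, x4 > 0)."""
    x1, x2, x3, x4 = x
    if x3 < 0 or (x3 == 0 and x4 < 0):
        x1, x2, x3, x4 = -x1, -x2, -x3, -x4
    r12, r34 = math.hypot(x1, x2), math.hypot(x3, x4)
    if r12 == 0 and r34 == 0:
        raise ValueError("the zero vector has no projective class")
    if r12 == 0:                       # Omega_0, (4.12)-(4.13)
        return (acot(x4 / x3) if x3 != 0 else 0.0, 0.0)
    if r34 == 0:                       # Omega_{pi/2}, (4.14)-(4.15)
        return (acot(x2 / x1) if x1 != 0 else 0.0, math.pi / 2)
    if x1 > 0:                         # (4.9)
        theta1 = acot(x2 / x1)
    elif x1 < 0:
        theta1 = acot(x2 / x1) + math.pi
    else:
        theta1 = 0.0 if x2 > 0 else math.pi
    theta2 = acot(x4 / x3) if x3 != 0 else 0.0   # (4.10)
    theta3 = math.atan2(r12, r34)                # (4.11): cot(theta3) = r34 / r12
    return (theta1, theta2, theta3)
```

On the open piece $\Omega_{(0,\pi/2)}$ the composition `t(point(...))` returns the original angles, and `t(x)` agrees with `t` of the negated vector.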
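We give the induced topology on Ω by describing the continuous functions C(Ω) on Ω. For $f\colon \Omega \to \mathbb{C}$, define $f_{(0,\pi/2)} = f|_{\Omega_{(0,\pi/2)}}$, $f_0 = f|_{\Omega_0}$ and $f_{\pi/2} = f|_{\Omega_{\pi/2}}$. Now, f is in C(Ω) if $f_{(0,\pi/2)}$, $f_0$ and $f_{\pi/2}$ are continuous and
\[
\lim_{\theta_2\to\pi} f_0(\theta_2, 0) = f_0(0, 0) ,
\tag{4.16}
\]
\[
\lim_{\theta_1\to\pi} f_{\pi/2}\left( \theta_1, \frac{\pi}{2} \right) = f_{\pi/2}\left( 0, \frac{\pi}{2} \right) ,
\tag{4.17}
\]
\[
\lim_{\theta_1\to 2\pi} f_{(0,\pi/2)}(\theta_1, \theta_2, \theta_3) = f_{(0,\pi/2)}(0, \theta_2, \theta_3) ,
\tag{4.18}
\]
\[
\lim_{\theta_3\to 0} f_{(0,\pi/2)}(\theta_1, \theta_2, \theta_3) = f_0(\theta_2, 0) ,
\tag{4.19}
\]
\[
\lim_{\theta_3\to\pi/2} f_{(0,\pi/2)}(\theta_1, \theta_2, \theta_3) = f_{\pi/2}\left( \theta_1 \bmod \pi, \frac{\pi}{2} \right) ,
\tag{4.20}
\]
\[
\lim_{\theta_2\to\pi} f_{(0,\pi/2)}(\theta_1, \theta_2, \theta_3) = f_{(0,\pi/2)}((\theta_1 + \pi) \bmod 2\pi, 0, \theta_3) .
\tag{4.21}
\]
The union of the following three sets is total in C(Ω):
\[
\{ e^{i(n_1\theta_1 + n_2\theta_2)} \sin 2n_3\theta_3\; 1_{\Omega_{(0,\pi/2)}} \mid n_1, n_2 \in \mathbb{Z},\ n_3 \in \mathbb{N},\ n_1 + n_2 \text{ even} \} ,
\tag{4.22}
\]
\[
\{ e^{2in_1\theta_1} \sin\theta_3\; 1_{\Omega_{(0,\pi/2)}} + e^{2in_1\theta_1}\, 1_{\Omega_{\pi/2}} \mid n_1 \in \mathbb{Z} \} ,
\tag{4.23}
\]
\[
\{ e^{2in_2\theta_2} \cos\theta_3\; 1_{\Omega_{(0,\pi/2)}} + e^{2in_2\theta_2}\, 1_{\Omega_0} \mid n_2 \in \mathbb{Z} \} .
\tag{4.24}
\]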
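4.2. General scheme

We shall assume that the $X_n$ are diagonal, that is,
\[
X_n = \begin{pmatrix} X_n^{(1)} & 0\\ 0 & X_n^{(2)} \end{pmatrix}
\tag{4.25}
\]
and that the $X_n^{(i)}$ are all i.i.d. with variance 1. Let
\[
\tilde B(m) = S B(m) S^{-1} .
\tag{4.26}
\]
Then
\[
\tilde B(m) = \tilde B_0(m) + \lambda \tilde B_1(m) + \lambda^2 \tilde B_2(m)(X_1, \ldots, X_m) + O(\lambda^3) ,
\tag{4.27}
\]
where
\[
\tilde B_0(m) = J_0^m ,
\tag{4.28}
\]
\[
\tilde B_1(m) = S B_1(m) S^{-1} ,
\tag{4.29}
\]
and $\mathbb{E}(\tilde B_2(m)) = 0$. $\tilde B_1(m)$ can be expressed in the form
\[
\tilde B_1(m) = \sum_{n=1}^{m} Y_n^- C_n(m) + \sum_{n=1}^{m} Y_n^+ D_n(m) ,
\tag{4.30}
\]
where $Y_n^\pm = \frac12 (X_n^{(1)} \pm X_n^{(2)})$. Let
\[
(T_0^m g)(\theta_1, \theta_2, \theta_3) = g(\theta_1^{(m)}, \theta_2^{(m)}, \theta_3^{(m)}) ,
\tag{4.31}
\]
\[
x = \begin{pmatrix} \sin\theta_1 \sin\theta_3\\ \cos\theta_1 \sin\theta_3\\ \sin\theta_2 \cos\theta_3\\ \cos\theta_2 \cos\theta_3 \end{pmatrix}
\tag{4.32}
\]
and $x' = \tilde B(m)x$. Then
\[
x' = \begin{pmatrix} \sin\theta_1^{(m)} \sin\theta_3^{(m)}\\ \cos\theta_1^{(m)} \sin\theta_3^{(m)}\\ \sin\theta_2^{(m)} \cos\theta_3^{(m)}\\ \cos\theta_2^{(m)} \cos\theta_3^{(m)} \end{pmatrix} + \lambda y + \lambda^2 w + O(\lambda^3) ,
\tag{4.33}
\]
where $\mathbb{E}(w) = 0$ and
\[
y_i = \sum_{n=1}^{m} Y_n^- \langle C_n(m)^T e_i, x \rangle + \sum_{n=1}^{m} Y_n^+ \langle D_n(m)^T e_i, x \rangle .
\tag{4.34}
\]
$\mathbb{E}(y) = 0$ and
\[
\mathbb{E}(y_i y_j) = \frac12 \left( \sum_{n=1}^{m} \langle C_n(m)^T e_i, x \rangle \langle C_n(m)^T e_j, x \rangle + \sum_{n=1}^{m} \langle D_n(m)^T e_i, x \rangle \langle D_n(m)^T e_j, x \rangle \right) ,
\tag{4.35}
\]
where $\{e_1, e_2, e_3, e_4\}$ is the usual orthonormal basis in $\mathbb{R}^4$. Writing
\[
x' = \begin{pmatrix} \sin\theta_1' \sin\theta_3'\\ \cos\theta_1' \sin\theta_3'\\ \sin\theta_2' \cos\theta_3'\\ \cos\theta_2' \cos\theta_3' \end{pmatrix}
\tag{4.36}
\]
we have
\[
\tan\theta_1' = \frac{x_1'}{x_2'} = \tan\theta_1^{(m)} + \lambda U_1 + \lambda^2 V_1 + O(\lambda^3) ,
\tag{4.37}
\]
where
\[
U_1 = \frac{y_1 \cos\theta_1^{(m)} - y_2 \sin\theta_1^{(m)}}{\cos^2\theta_1^{(m)}\, \sin\theta_3^{(m)}}
\tag{4.38}
\]
and
\[
V_1 = -\frac{y_1 y_2 \cos\theta_1^{(m)} - y_2^2 \sin\theta_1^{(m)}}{\cos^3\theta_1^{(m)}\, \sin^2\theta_3^{(m)}} + W_1 ,
\tag{4.39}
\]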
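with $\mathbb{E}(W_1) = 0$. Next we have
\[
\tan\theta_2' = \frac{x_3'}{x_4'} = \tan\theta_2^{(m)} + \lambda U_2 + \lambda^2 V_2 + O(\lambda^3) ,
\tag{4.40}
\]
where
\[
U_2 = \frac{y_3 \cos\theta_2^{(m)} - y_4 \sin\theta_2^{(m)}}{\cos^2\theta_2^{(m)}\, \cos\theta_3^{(m)}}
\tag{4.41}
\]
and
\[
V_2 = -\frac{y_3 y_4 \cos\theta_2^{(m)} - y_4^2 \sin\theta_2^{(m)}}{\cos^3\theta_2^{(m)}\, \cos^2\theta_3^{(m)}} + W_2 ,
\tag{4.42}
\]
with $\mathbb{E}(W_2) = 0$. Thirdly,
\[
\tan\theta_3' = \left( \frac{(x_1')^2 + (x_2')^2}{(x_3')^2 + (x_4')^2} \right)^{\frac12} = \tan\theta_3^{(m)} + \lambda U_3 + \lambda^2 V_3 + O(\lambda^3) ,
\tag{4.43}
\]
where
\[
U_3 = \frac{\cos\theta_3^{(m)} \bigl( y_1 \sin\theta_1^{(m)} + y_2 \cos\theta_1^{(m)} \bigr) - \sin\theta_3^{(m)} \bigl( y_3 \sin\theta_2^{(m)} + y_4 \cos\theta_2^{(m)} \bigr)}{\cos^2\theta_3^{(m)}}
\tag{4.44}
\]
and
\[
\begin{aligned}
V_3 ={}& \frac{y_1^2 \cos^2\theta_1^{(m)} + y_2^2 \sin^2\theta_1^{(m)} - 2 y_1 y_2 \sin\theta_1^{(m)} \cos\theta_1^{(m)}}{2 \sin\theta_3^{(m)} \cos\theta_3^{(m)}} \\
&+ \sin\theta_3^{(m)}\, \frac{y_3^2 \bigl( 3\sin^2\theta_2^{(m)} - 1 \bigr) + y_4^2 \bigl( 3\cos^2\theta_2^{(m)} - 1 \bigr) + 6 y_3 y_4 \sin\theta_2^{(m)} \cos\theta_2^{(m)}}{2 \cos^3\theta_3^{(m)}} \\
&- \frac{y_2 y_4 \cos\theta_1^{(m)} \cos\theta_2^{(m)} + y_1 y_3 \sin\theta_1^{(m)} \sin\theta_2^{(m)} + y_2 y_3 \cos\theta_1^{(m)} \sin\theta_2^{(m)} + y_1 y_4 \cos\theta_2^{(m)} \sin\theta_1^{(m)}}{\cos^2\theta_3^{(m)}} + W_3 ,
\end{aligned}
\tag{4.45}
\]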
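where $\mathbb{E}(W_3) = 0$. For k = 1, 2, 3, therefore,
\[
\begin{aligned}
\exp(i n_k \theta_k') &= \left( \frac{1 + i\tan\theta_k'}{1 - i\tan\theta_k'} \right)^{\frac12 n_k} \\
&= \exp(i n_k \theta_k^{(m)}) \Bigl\{ 1 + i\lambda n_k U_k \cos^2\theta_k^{(m)} + i n_k \lambda^2 V_k \cos^2\theta_k^{(m)}
 - i n_k \lambda^2 \cos^4\theta_k^{(m)}\, U_k^2 \Bigl( \tan\theta_k^{(m)} - \tfrac12 i n_k \Bigr) + O(\lambda^3) \Bigr\} .
\end{aligned}
\]
Hence
\[
\begin{aligned}
\exp(i(n_1\theta_1' + n_2\theta_2' + n_3\theta_3'))
={}& \exp(i(n_1\theta_1^{(m)} + n_2\theta_2^{(m)} + n_3\theta_3^{(m)})) \Bigl\{ 1 + i\lambda \bigl[ n_1 U_1 \cos^2\theta_1^{(m)} + n_2 U_2 \cos^2\theta_2^{(m)} + n_3 U_3 \cos^2\theta_3^{(m)} \bigr] \\
&+ \lambda^2 \bigl[ B_1 n_1 + B_2 n_2 + B_3 n_3 + B_{11} n_1^2 + B_{22} n_2^2 + B_{33} n_3^2 + B_{12} n_1 n_2 + B_{23} n_2 n_3 + B_{31} n_3 n_1 \bigr] + O(\lambda^3) \Bigr\} ,
\end{aligned}
\tag{4.46}
\]
where
\[
B_k = i \bigl( V_k \cos^2\theta_k^{(m)} - U_k^2 \tan\theta_k^{(m)} \cos^4\theta_k^{(m)} \bigr) ,
\tag{4.47}
\]
\[
B_{kk} = -\tfrac12\, U_k^2 \cos^4\theta_k^{(m)} ,
\tag{4.48}
\]
and for k ≠ l,
\[
B_{kl} = -U_k U_l \cos^2\theta_k^{(m)} \cos^2\theta_l^{(m)} .
\tag{4.49}
\]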
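Taking expectations we get
\[
\begin{aligned}
\mathbb{E}\bigl( \exp(i(n_1\theta_1' + n_2\theta_2' + n_3\theta_3')) \bigr)
={}& \exp(i(n_1\theta_1^{(m)} + n_2\theta_2^{(m)} + n_3\theta_3^{(m)})) \\
&\times \bigl\{ 1 + \lambda^2 \bigl[ A_1 n_1 + A_2 n_2 + A_3 n_3 + A_{11} n_1^2 + A_{22} n_2^2 + A_{33} n_3^2 + A_{12} n_1 n_2 + A_{23} n_2 n_3 + A_{31} n_3 n_1 \bigr] + O(\lambda^3) \bigr\} ,
\end{aligned}
\tag{4.50}
\]
where $A_k = \mathbb{E}(B_k)$ and $A_{kl} = \mathbb{E}(B_{kl})$. The right hand side of this equation can be written as
\[
\begin{aligned}
\Biggl\{ 1 + \lambda^2 \Biggl[ & -iA_1 \frac{\partial}{\partial\theta_1^{(m)}} - iA_2 \frac{\partial}{\partial\theta_2^{(m)}} - iA_3 \frac{\partial}{\partial\theta_3^{(m)}}
 - A_{11} \frac{\partial^2}{\partial\theta_1^{(m)2}} - A_{22} \frac{\partial^2}{\partial\theta_2^{(m)2}} - A_{33} \frac{\partial^2}{\partial\theta_3^{(m)2}} \\
& - A_{12} \frac{\partial^2}{\partial\theta_1^{(m)} \partial\theta_2^{(m)}} - A_{23} \frac{\partial^2}{\partial\theta_2^{(m)} \partial\theta_3^{(m)}} - A_{31} \frac{\partial^2}{\partial\theta_3^{(m)} \partial\theta_1^{(m)}} \Biggr] \Biggr\}
\exp(i(n_1\theta_1^{(m)} + n_2\theta_2^{(m)} + n_3\theta_3^{(m)})) + O(\lambda^3) .
\end{aligned}
\tag{4.51}
\]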
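Though the calculations in this section are formal, in each application we use g's as in the lemma below with r ≤ 2 and these are finite linear combinations of exponentials.

Lemma 4.1. If $g(\theta_3) = l(\sin^2\theta_3)$ and the first r + 1 derivatives of l are bounded, then
\[
\lim_{\lambda\to 0} \lambda^{-r} \left\| T_\lambda^q g - \sum_{k=0}^{r} \frac{\lambda^k}{k!} \left. \frac{\partial^k}{\partial\lambda^k} T_\lambda^q g \right|_{\lambda=0} \right\| = 0 .
\tag{4.52}
\]
If g is of the form $g(\theta_1, \theta_2, \theta_3) = e^{i(N\theta_1 + M\theta_2)} \sin^{2s_1}\theta_3 \cos^{2s_2}\theta_3$ with N ≠ 0, M ≠ 0, and $r + 1 \le \min(s_1, s_2)$, $g(\theta_1, \theta_2, \theta_3) = e^{iN\theta_1} \sin^{2s_1}\theta_3$ with N ≠ 0 and $r + 1 \le s_1$, or $g(\theta_1, \theta_2, \theta_3) = e^{iM\theta_2} \cos^{2s_2}\theta_3$ with M ≠ 0 and $r + 1 \le s_2$, then (4.52) is again satisfied.

The proof of this lemma is very similar to that of Lemma 3.1. In this case we take
\[
f_\lambda(x) = \frac{x_1'^2 + x_2'^2}{x_1'^2 + x_2'^2 + x_3'^2 + x_4'^2} ,
\tag{4.53}
\]
which can be written as
\[
f_\lambda(x) = \frac{\|P M(\lambda)x\|^2}{\|M(\lambda)x\|^2} ,
\tag{4.54}
\]
where $x' = M(\lambda)x$ and $Px = (x_1, x_2, 0, 0)$. Equivalently, $f_\lambda(x) = \sin^2(\theta_3')$. Now
\[
\frac{\partial}{\partial\lambda} f_\lambda(x) = 2\, \frac{\bigl( P M^{(1)}(\lambda)x \cdot P M(\lambda)x \bigr) \|M(\lambda)x\|^2 - \bigl( M^{(1)}(\lambda)x \cdot M(\lambda)x \bigr) \|P M(\lambda)x\|^2}{\|M(\lambda)x\|^4}
\tag{4.55}
\]
and so
\[
\left| \frac{\partial}{\partial\lambda} f_\lambda(x) \right| \le 4\, \frac{\|M^{(1)}(\lambda)x\|}{\|M(\lambda)x\|} .
\tag{4.56}
\]
In general we have the analogue of (3.11),
\[
\left| \frac{\partial^k}{\partial\lambda^k} f_\lambda(x) \right| \le \sum_{n=1}^{k} \sum_{\substack{r_1+r_2+\cdots+r_n=k\\ r_i\ge 1}} C_{r_1,\ldots,r_n}\, (4!)^n\, \|M^{(r_1)}(\lambda)\| \cdots \|M^{(r_n)}(\lambda)\|\, \|M(\lambda)\|^{3n} .
\tag{4.57}
\]
Again if we take $M(\lambda) = \prod_{n=1}^{q} D_\lambda^{(n)}$, we get for any k ∈ ℕ,
\[
\mathbb{E} \left| \frac{\partial^k}{\partial\lambda^k} f_\lambda(x) \right| \le C_k .
\tag{4.58}
\]
If $h_\lambda = l \circ f_\lambda$, where the first k derivatives of l are bounded, then we also have
\[
\left| \mathbb{E}\, \frac{\partial^k}{\partial\lambda^k} h_\lambda(x) \right| \le K_k .
\tag{4.59}
\]
If $g(\theta_3) = l(\sin^2\theta_3)$, then $T_\lambda^q g = \mathbb{E}(h_\lambda \circ t^{-1})$ and we get
\[
\left\| \frac{\partial^k}{\partial\lambda^k} T_\lambda^q g \right\| \le K_k .
\tag{4.60}
\]
By using the Mean-Value Theorem we then see that if the first r + 1 derivatives of l are bounded, then we also have
\[
\lim_{\lambda\to 0} \lambda^{-r} \left\| T_\lambda^q g - \sum_{k=0}^{r} \frac{\lambda^k}{k!} \left. \frac{\partial^k}{\partial\lambda^k} T_\lambda^q g \right|_{\lambda=0} \right\| = 0 .
\tag{4.61}
\]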
Next consider functions of the form $e^{iN\theta_1'} \sin^{2s}\theta_3'$. First let
\[
t_\lambda(x) = \tan^{-1}\frac{x_1'}{x_2'} ,
\tag{4.62}
\]
that is $t_\lambda(x) = \theta_1'$. Then as in (3.10)
\[
\left| \frac{\partial^k}{\partial\lambda^k} t_\lambda(x) \right| \le \sum_{\substack{r_1+r_2+\cdots+r_n=k\\ 1\le r_i\le k}} C_{r_1,\ldots,r_n} \frac{\|P M^{(r_1)}(\lambda)x\| \cdots \|P M^{(r_n)}(\lambda)x\|}{\|P M(\lambda)x\|^n} .
\tag{4.63}
\]
Let $S_\lambda(x) = \exp(iN t_\lambda(x)) (f_\lambda(x))^s$ where $f_\lambda$ is as in (4.54), that is, $S_\lambda(x) = e^{iN\theta_1'} \sin^{2s}\theta_3'$. $\frac{\partial^k}{\partial\lambda^k} S_\lambda(x)$ consists of a finite linear combination of terms with l = 0, . . . , k, of the form
\[
\exp(iN t_\lambda(x))\, (f_\lambda(x))^{(s-n)}\, f_\lambda^{(p_1)}(x) \cdots f_\lambda^{(p_n)}(x)\; t_\lambda^{(q_1)}(x) \cdots t_\lambda^{(q_m)}(x)
\tag{4.64}
\]
with $p_1 + \cdots + p_n = l$, n ≤ l and $q_1 + \cdots + q_m = k - l$, m ≤ k − l. If we use (4.63) to get an upper bound for $|t_\lambda^{(q_1)}(x) \cdots t_\lambda^{(q_m)}(x)|$, we see that the highest power of $\|P M(\lambda)x\|$ in the denominator of the upper bound is k − l. From (4.54) we see that the term in (4.64) is bounded if s − n ≥ (k − l)/2 and therefore if s ≥ (k + l)/2, it is bounded for all m. Thus if s ≥ k, $\frac{\partial^k}{\partial\lambda^k} S_\lambda(x)$ is bounded. Clearly, the same argument works for $e^{iM\theta_2'} \cos^{2s}\theta_3'$ and for $e^{i(N\theta_1' + M\theta_2')} \sin^{2s_1}\theta_3' \cos^{2s_2}\theta_3'$ if $s_1 \ge k$ and $s_2 \ge k$.

4.3. The case E ∈ (−1, 1)

If −1 < E < 1, we can choose α ∈ (0, π/2) and β ∈ (π/2, π) satisfying 2 cos α = E + 1 and 2 cos β = E − 1. The real Jordan form of $A_0$ is
\[
J_0 = \begin{pmatrix} R_\alpha & 0\\ 0 & R_\beta \end{pmatrix} ,
\tag{4.65}
\]
where
\[
R_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha\\ \sin\alpha & \cos\alpha \end{pmatrix} ,
\tag{4.66}
\]
and
\[
S = \begin{pmatrix}
1 & -1 & -\cos\alpha & \cos\alpha\\
0 & 0 & \sin\alpha & -\sin\alpha\\
-\cos\beta & -\cos\beta & 1 & 1\\
-\sin\beta & -\sin\beta & 0 & 0
\end{pmatrix} .
\tag{4.67}
\]
Note that
\[
(T_0 g)(\theta_1, \theta_2, \theta_3) = g((\theta_1 - \alpha) \bmod 2\pi,\ (\theta_2 - \beta) \bmod \pi,\ \theta_3)
\tag{4.68}
\]
and therefore
\[
\theta_1^{(m)} = (\theta_1 - m\alpha) \bmod 2\pi , \qquad
\theta_2^{(m)} = (\theta_2 - m\beta) \bmod \pi , \qquad
\theta_3^{(m)} = \theta_3 .
\tag{4.69}
\]
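The relation $S A_0 S^{-1} = J_0$ for the two-chain case can be checked numerically. The sketch below is an illustration with hypothetical function names. The entries of S follow (4.67); the closed form used for $S^{-1}$ is the one that appears in (4.90), and the check that the product of the two is the identity is included because both matrices are read off from displayed formulas.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def a0(energy):
    """A_0 of (2.3) for l = 2 and lambda = 0."""
    return [[energy, -1, -1, 0], [-1, energy, 0, -1], [1, 0, 0, 0], [0, 1, 0, 0]]

def s_matrix(alpha, beta):
    """S as in (4.67)."""
    ca, sa, cb, sb = math.cos(alpha), math.sin(alpha), math.cos(beta), math.sin(beta)
    return [[1, -1, -ca, ca], [0, 0, sa, -sa], [-cb, -cb, 1, 1], [-sb, -sb, 0, 0]]

def s_inverse(alpha, beta):
    """Closed form of S^{-1}, as in (4.90)."""
    cot_a, csc_a = math.cos(alpha) / math.sin(alpha), 1 / math.sin(alpha)
    cot_b, csc_b = math.cos(beta) / math.sin(beta), 1 / math.sin(beta)
    rows = [[1, cot_a, 0, -csc_b], [-1, -cot_a, 0, -csc_b],
            [0, csc_a, 1, -cot_b], [0, -csc_a, 1, -cot_b]]
    return [[0.5 * v for v in row] for row in rows]

def j0(alpha, beta):
    """Block-diagonal J_0 of (4.65) with the rotation blocks of (4.66)."""
    ca, sa, cb, sb = math.cos(alpha), math.sin(alpha), math.cos(beta), math.sin(beta)
    return [[ca, -sa, 0, 0], [sa, ca, 0, 0], [0, 0, cb, -sb], [0, 0, sb, cb]]
```

For any E in (−1, 1), with α = arccos((E + 1)/2) and β = arccos((E − 1)/2), the product `s_matrix · a0 · s_inverse` should reproduce `j0`.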
Consider the total set
\[
\{ f_{n_1,n_2,n_3} \mid n_1, n_2 \in \mathbb{Z},\ n_3 \in \mathbb{N},\ n_1 + n_2 \text{ even} \} \cup \{ g_{n_1} \mid n_1 \in \mathbb{Z} \} \cup \{ h_{n_2} \mid n_2 \in \mathbb{Z} \} ,
\tag{4.70}
\]
where
\[
f_{n_1,n_2,n_3}(\theta_1, \theta_2, \theta_3) = e^{i(n_1\theta_1 + n_2\theta_2)} \cos 2n_3\theta_3\, \sin^2 2\theta_3\; 1_{\Omega_{(0,\pi/2)}} ,
\tag{4.71}
\]
\[
g_{n_1}(\theta_1, \theta_3) = e^{2in_1\theta_1} \sin^2\theta_3\; 1_{\Omega_{(0,\pi/2)}} + e^{2in_1\theta_1}\, 1_{\Omega_{\pi/2}}
\tag{4.72}
\]
and
\[
h_{n_2}(\theta_2, \theta_3) = e^{2in_2\theta_2} \cos^2\theta_3\; 1_{\Omega_{(0,\pi/2)}} + e^{2in_2\theta_2}\, 1_{\Omega_0} .
\tag{4.73}
\]
If g is in this total set, then it satisfies Lemma 4.1 with r = 0, that is,
\[
\lim_{\lambda\to 0} \|T_\lambda g - T_0 g\| = 0
\tag{4.74}
\]
and therefore
\[
T_0^* \sigma_0^E = \sigma_0^E ,
\tag{4.75}
\]
that is, $\sigma_0^E$ is invariant under rotations of $\theta_1$ and $\theta_2$ by α and β respectively. This is all we can deduce from this equation unless one of α/π or β/π is irrational. Consider the case when both α/π and β/π are irrational. Because of the relation between α and β, $(n_1\alpha + n_2\beta)/\pi$ is also irrational for any $n_1, n_2 \in \mathbb{Z}$. The standard ergodic argument then shows that $\sigma_0^E$ is Lebesgue with respect to $\theta_1$ and $\theta_2$, that is, on $\Omega_{(0,\pi/2)}$, $\sigma_0^E(d\theta_1, d\theta_2, d\theta_3) = d\theta_1\, d\theta_2\, \tilde\sigma_0^E(d\theta_3)$, on $\Omega_{\pi/2}$, $\sigma_0^E(d\theta_1) = \delta_{\pi/2}\, d\theta_1$ and on $\Omega_0$, $\sigma_0^E(d\theta_2) = \delta_0\, d\theta_2$. A similar argument applies if only one of α/π or β/π is irrational. Suppose for example, that α/π is irrational and β/π = p/q where p and q are integers. Then, replacing $T_0$ with $T_0^q$ in the above argument, we can see that $\sigma_0^E$ is Lebesgue with respect to $\theta_1$, that is, on $\Omega_{(0,\pi/2)}$, $\sigma_0^E(d\theta_1, d\theta_2, d\theta_3) = d\theta_1\, \tilde\sigma_0^E(d\theta_2, d\theta_3)$ and on $\Omega_{\pi/2}$, $\sigma_0^E(d\theta_1) = \delta_{\pi/2}\, d\theta_1$. Fortunately, there is only one case when both α/π and β/π are rational and that is the case when E = 0.

4.3.1. The special case when E = 0

Now we take E = 0 so that α = π/3 and β = 2π/3. In this case we choose m = 6. This is the smallest natural number so that when $n_1 + n_2$ is an even integer, $m(n_1\alpha + n_2\beta)$ is an integral multiple of 2π. Note that in this case, both mα and mβ are also integral multiples of 2π. In this case we can prove the following result. If we assume that $\sigma_0^0$ is absolutely continuous on $\Omega_{(0,\pi/2)}$ with density $\rho(\theta_1, \theta_2, \theta_3)$ satisfying the boundary conditions
\[
\lim_{\theta_1\to 2\pi} \rho(\theta_1, \theta_2, \theta_3) = \rho(0, \theta_2, \theta_3) , \qquad
\lim_{\theta_2\to\pi} \rho(\theta_1, \theta_2, \theta_3) = \rho((\theta_1 + \pi) \bmod 2\pi, 0, \theta_3) ,
\tag{4.76}
\]
then

(1) $\sigma_0^0(\Omega_0 \cup \Omega_{\pi/2}) = 0$; (4.77)

(2) ρ is a function of $\theta_1 + \theta_2$ and $\theta_3$;

(3) with $\psi = 2\theta_1 + 2\theta_2 + \pi/3$, $2\theta_3 = \varphi$ and $\rho = \sin\varphi\, S(\psi, \varphi)$, ρ satisfies the differential equation
\[
\begin{aligned}
& -16 \sin\varphi \cos\varphi \sin\psi\, \frac{\partial^2 S}{\partial\varphi\, \partial\psi}
 + 2\sin^2\varphi\, (\cos^2\varphi + 3\cos^2\varphi \cos\psi - \cos\psi + 3)\, \frac{\partial^2 S}{\partial\varphi^2} \\
& + (8\cos^2\varphi \cos\psi - 24\cos\psi + 72 - 40\cos^2\varphi)\, \frac{\partial^2 S}{\partial\psi^2}
 - 8\sin\psi\, (-7 + 5\cos^2\varphi)\, \frac{\partial S}{\partial\psi} \\
& + 2\cos\varphi \sin\varphi\, (-17\cos\psi + 5\cos^2\varphi + 15\cos^2\varphi \cos\psi - 1)\, \frac{\partial S}{\partial\varphi} \\
& - 4\sin^2\varphi\, (3\cos^2\varphi + 9\cos^2\varphi \cos\psi - 1 - 9\cos\psi)\, S = 0 .
\end{aligned}
\tag{4.78}
\]
This equation can also be written in terms of the variables $u = \cos 2\theta_3$ and $v = \cos\psi$ as follows:
\[
\begin{aligned}
& 2(1 - u)(1 - u^2)(3uv + u + 7)\, \frac{\partial^2 S}{\partial u^2} - 16(1 - u^2)(1 - v^2)\, \frac{\partial^2 S}{\partial u\, \partial v}
 + 4(1 - v^2)(uv - 5u - 5v + 13)\, \frac{\partial^2 S}{\partial v^2} \\
& - (1 - u)(7u^2 + 21u^2 v + 22u - 2uv - 19v + 3)\, \frac{\partial S}{\partial u}
 + (20uv - 52v - 36 - 24uv^2 + 20u + 56v^2)\, \frac{\partial S}{\partial v} \\
& - (1 - u)(3u + 9uv - 9v + 1)\, S = 0 .
\end{aligned}
\tag{4.79}
\]
Unfortunately, we have not been able to solve this equation, nor prove that it has a unique positive solution.
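The choice m = 6 at the start of Sec. 4.3.1 can be confirmed by a direct search using exact rational arithmetic. The sketch below is illustrative only; the function name and the finite search ranges (`search`, `span`) are assumptions of this example, and the angles are passed as multiples of π.

```python
from fractions import Fraction

def smallest_period(alpha_over_pi, beta_over_pi, search=50, span=6):
    """Smallest m >= 1 such that m (n1 alpha + n2 beta) is a multiple of 2 pi
    for all n1, n2 with n1 + n2 even (checked for |n1|, |n2| <= span)."""
    for m in range(1, search + 1):
        ok = all((m * (n1 * alpha_over_pi + n2 * beta_over_pi)) % 2 == 0
                 for n1 in range(-span, span + 1)
                 for n2 in range(-span, span + 1)
                 if (n1 + n2) % 2 == 0)
        if ok:
            return m
    return None
```

Calling `smallest_period(Fraction(1, 3), Fraction(2, 3))` corresponds to α = π/3 and β = 2π/3.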
4.3.2. The case when E ∈ (−1, 1) with both α/π and β/π irrational In this section, we consider the case when both α/π and β/π are irrational. We know that in this case σ0E is Lebesgue with respect to θ1 and θ2 , that is, on Ω(0, π2 ) , σ0E (dθ1 , dθ2 , dθ3 ) = dθ1 dθ2 σ ˜0E (dθ3 ), on Ω π2 , σ0E (dθ1 ) = δ π2 dθ1 and on Ω0 , σ0E (dθ2 ) = E δ0 dθ2 . Here σ0 is rotation invariant in both θ1 and θ2 . Therefore, we can choose m = 1. Also we need only to consider functions of θ3 to determine the limiting measure. We have in this case (1) σ00 (Ω0 ∪ Ω π2 ) = 0, that is δ0 = δ π2 = 0 ;
(4.80)
July 20, 2004 10:11 WSPC/148-RMP
00212
Two-Line Anderson Model
663
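(2) on $\Omega_{(0,\pi/2)}$, $\sigma_0^E(d\theta_1, d\theta_2, d\theta_3) = \bar\rho(\theta_3)\, d\theta_1\, d\theta_2\, d\theta_3$, where $\bar\rho$ is differentiable and satisfies the differential equation
\[ \frac{d}{d\theta_3}\bigl(C_{33}\, \bar\rho(\theta_3)\bigr) - i C_3\, \bar\rho(\theta_3) = 0 , \tag{4.81} \]
where in terms of $\phi = 2\theta_3$ we have
\[ C_3 = i\, \frac{\pi^2}{8} \left[ \frac{(1-\cos\phi)(5\cos\phi - \cos^2\phi + 2)}{\sin^2\alpha} + \frac{(1+\cos\phi)(5\cos\phi + \cos^2\phi - 2)}{\sin^2\beta} \right] \operatorname{cosec}\phi \tag{4.82} \]
and
\[ C_{33} = -\frac{\pi^2}{16} \left[ \frac{3 + 4\cos\phi + \cos^2\phi}{\sin^2\beta} + \frac{3 - 4\cos\phi + \cos^2\phi}{\sin^2\alpha} \right] . \tag{4.83} \]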
The differential Eq. (4.81) can be solved. Putting $\bar\rho = S(\cos\phi)$ and $t = \cos\phi$, we get the following expressions for $S(t)$, for $E > 0$. Note that $S(-E, t) = S(E, -t)$. Let $E_0 = (\sqrt{13} - 2)/\sqrt{3} \approx 0.927$.
(1) In the case when $E = E_0$,
\[ S(t) = \frac{C}{(a-t)^2} \exp\left( \frac{a}{a-t} \right) , \tag{4.84} \]
where $a = 4E_0/(3 - E_0^2)$. (Note that $a > 1.7$.)
(2) In the case when $E_0 < E < 1$,
\[ S(t) = \frac{C}{(a-t)^2 - b^2} \left( \frac{a + b - t}{a - b - t} \right)^{a/2b} , \tag{4.85} \]
where $a = \dfrac{4E}{3-E^2}$ and $b = \dfrac{\sqrt{34E^2 - 3E^4 - 27}}{3-E^2}$.
(3) In the case when $0 < E < E_0$,
\[ S(t) = \frac{C}{(a-t)^2 + b^2} \exp\left( \frac{a}{b} \tan^{-1} \frac{b}{a-t} \right) , \tag{4.86} \]
where $a = \dfrac{4E}{3-E^2}$ and $b = \dfrac{\sqrt{3E^4 - 34E^2 + 27}}{3-E^2}$.

Notice the limiting cases
\[ E \to 0 \;\Rightarrow\; S(t) \to \frac{C}{t^2 + 3} \tag{4.87} \]
and
\[ E \to 1 \;\Rightarrow\; S(t) \to \frac{C}{(1-t)^2} . \tag{4.88} \]
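The three closed forms (4.84)–(4.86) and the limits (4.87)–(4.88) can be cross-checked numerically. The following is only a minimal sketch: the function names `a_and_b2` and `S`, the tolerance `tol`, and the use of `atan2` to select a branch of $\tan^{-1}$ that is continuous in $t$ are assumptions of this illustration, not part of the original argument.

```python
import math

# Threshold energy quoted in the text, approximately 0.927.
E0 = (math.sqrt(13) - 2) / math.sqrt(3)


def a_and_b2(E):
    """Return a = 4E/(3 - E^2) and the signed quantity 3 - a^2.

    3 - a^2 equals +b^2 for 0 < E < E0 and -b^2 for E0 < E < 1.
    """
    a = 4.0 * E / (3.0 - E * E)
    return a, 3.0 - a * a


def S(E, t, C=1.0, tol=1e-12):
    """Evaluate the closed forms (4.84)-(4.86) for 0 < E < 1 and -1 <= t < 1."""
    a, b2 = a_and_b2(E)
    if abs(b2) < tol:                      # E = E0, Eq. (4.84)
        return C / (a - t) ** 2 * math.exp(a / (a - t))
    if b2 > 0:                             # 0 < E < E0, Eq. (4.86)
        b = math.sqrt(b2)
        # atan2 selects the branch of tan^{-1}(b/(a-t)) that is continuous
        # in t (an assumption of this sketch; it only matters when a < t).
        return C / ((a - t) ** 2 + b2) * math.exp((a / b) * math.atan2(b, a - t))
    b = math.sqrt(-b2)                     # E0 < E < 1, Eq. (4.85)
    return C / ((a - t) ** 2 - b * b) * ((a + b - t) / (a - b - t)) ** (a / (2 * b))


if __name__ == "__main__":
    for E in (1e-9, 0.5, E0, 0.95, 1 - 1e-9):
        print(E, S(E, 0.0))
```

With this choice of branch, the values just below and just above $E_0$ agree with (4.84) to within the step in $E$, and the two endpoints reproduce $C/(t^2+3)$ and $C/(1-t)^2$.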
The former clearly does not satisfy Eq. (4.78), which means that there is an anomaly at E = 0. The second even diverges at t = 1 and the corresponding ρ(θ3 ) also diverges at θ3 = 0. This of course means that the constant C needs to be scaled
and the resulting measure is Lebesgue measure on $\Omega_0$. This is due to the fact that the coordinates are singular at this point; however, we need a more careful analysis. For small $\epsilon$ we can write $a \approx 2(1 - 2\epsilon)$ and $b \approx 1 - 8\epsilon$, so that $S(t) \approx C/(1 + 4\epsilon - t)^2$, replacing $a/2b$ by 1. The normalization constant $C$ must be proportional to $\epsilon$, so the density is
\[ \rho(\theta_3) \sim \frac{C \epsilon \sin 2\theta_3}{(1 + 4\epsilon - \cos 2\theta_3)^2} . \tag{4.89} \]
To compare this measure with the invariant measure at $E = 1$, we need to change coordinates. The corresponding transformation is given by $S_1 S^{-1}$, where $S_1$ is the matrix (4.99) and $S$ is the matrix (4.67). For $E = 1 - \epsilon$ we have
\[
S^{-1} = \frac{1}{2}
\begin{pmatrix}
1 & \cot\alpha & 0 & -\operatorname{cosec}\beta \\
-1 & -\cot\alpha & 0 & -\operatorname{cosec}\beta \\
0 & \operatorname{cosec}\alpha & 1 & -\cot\beta \\
0 & -\operatorname{cosec}\alpha & 1 & -\cot\beta
\end{pmatrix}
\approx \frac{1}{2}
\begin{pmatrix}
1 & 1/\sqrt{\epsilon} & 0 & -1 \\
-1 & -1/\sqrt{\epsilon} & 0 & -1 \\
0 & 1/\sqrt{\epsilon} & 1 & 0 \\
0 & -1/\sqrt{\epsilon} & 1 & 0
\end{pmatrix} .
\tag{4.90}
\]
Thus
\[
S_1 S^{-1} \approx
\begin{pmatrix}
0 & 0 & 1 & -1 \\
0 & 0 & -1 & -1 \\
0 & -1/\sqrt{\epsilon} & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}
\tag{4.91}
\]

Fig. 1. $\theta_3 \mapsto \rho(\theta_3)$ for $E > E_0$, $E = E_0$ and $E < E_0$.
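The small-$\epsilon$ expansions $a \approx 2(1-2\epsilon)$ and $b \approx 1 - 8\epsilon$ at $E = 1 - \epsilon$ can be compared with the exact expressions for $a$ and $b$ from case (2). This is an illustrative sketch only; the helper name `a_b` and the sample values of `eps` are hypothetical.

```python
import math


def a_b(E):
    """Exact a and b for E0 < E < 1, as defined below Eq. (4.85)."""
    a = 4.0 * E / (3.0 - E * E)
    b = math.sqrt(34.0 * E**2 - 3.0 * E**4 - 27.0) / (3.0 - E * E)
    return a, b


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        a, b = a_b(1.0 - eps)
        # Differences between the exact values and the first-order expansions;
        # both should shrink like eps**2.
        print(eps, a - 2 * (1 - 2 * eps), b - (1 - 8 * eps), (a - b) - (1 + 4 * eps))
```

The last column is the error in $a - b \approx 1 + 4\epsilon$, the combination that enters the denominator of the approximate $S(t)$.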
and hence if we denote the original coordinates by θ and the new coordinates by θ0 , we get √ C sin θ30 cos θ30 dθ10 dθ20 dθ30 ρ(θ3 )dθ1 dθ2 dθ3 = 2 . (4.92) 4 cos2 θ20 cos2 θ30 + (sin2 θ30 + cos2 θ30 (1 + cos2 θ20 ))
In the limit $\epsilon \to 0$, this tends to
\[ \nu_1 = \delta\bigl(\theta_2' - \tfrac{1}{2}\pi\bigr) \sin\theta_3'\, d\theta_1'\, d\theta_3' . \tag{4.93} \]
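4.3.3. The case when E ∈ (−1, 1) with α/π rational and β/π irrational

In this section, we consider the case when $\alpha/\pi$ is rational and $\beta/\pi$ is irrational. We know that in this case $\sigma_0^E$ is Lebesgue with respect to $\theta_2$, that is, on $\Omega_{(0,\pi/2)}$, $\sigma_0^E(d\theta_1, d\theta_2, d\theta_3) = d\theta_2\, \hat\sigma_0^E(d\theta_1, d\theta_3)$, and on $\Omega_0$, $\sigma_0^E(d\theta_2) = \delta_0\, d\theta_2$. Since we need to consider only functions of $\theta_1$ and $\theta_3$ to determine the limiting measure, we choose $m$ so that $m\alpha$ is an integral multiple of $\pi$. In this case if we assume that $\hat\sigma_0^E$ is absolutely continuous with density $\hat\rho$, then
(1) $\sigma_0^0(\Omega_0 \cup \Omega_{\pi/2}) = 0$; (4.94)
(2) $\hat\rho$ satisfies the differential equation
\[ \left( \frac{\partial^2}{\partial\theta_3^2}\, C_{33} + \frac{\partial^2}{\partial\theta_1^2}\, C_{11} - i\, \frac{\partial}{\partial\theta_3}\, C_3 \right) \hat\rho(\theta_1, \theta_3) = 0 , \tag{4.95} \]
where $C_3$ and $C_{33}$ are the same as in the previous case and
\[ C_{11} = -2\, \frac{2(1+\cos\phi)\sin^2\alpha + 3(1-\cos\phi)\sin^2\beta}{(1-\cos\phi)\sin^2\alpha\, \sin^2\beta} . \tag{4.96} \]
Note that $C_{11}$ is independent of $\theta_1$ and therefore $\hat\rho(\theta_1, \theta_3) = \bar\rho(\theta_3)$ is a solution of this differential equation.

4.4. The case when E = ±1

Suppose that $E = 1$; the case when $E = -1$ is similar. Here the real Jordan form for $A_0$ is
\[ J_0 = S A_0 S^{-1} = \begin{pmatrix} R_{\pi/2} & 0 \\ 0 & J_2 \end{pmatrix} , \tag{4.97} \]
where
\[ J_2 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} . \tag{4.98} \]
The matrix $S$ is then given by
\[ S = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 0 & 0 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} . \tag{4.99} \]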
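Note that
\[ J_2^q = \begin{pmatrix} 1 & q \\ 0 & 1 \end{pmatrix} \tag{4.100} \]
and therefore $\theta_1^{(q)}$, $\theta_2^{(q)}$ and $\theta_3^{(q)}$ are given by
\[ \theta_1^{(q)} = \left( \theta_1 - \frac{q\pi}{2} \right) \bmod 2\pi , \tag{4.101} \]
\[ \cot\theta_2^{(q)} = \begin{cases} \dfrac{1}{q} , & \text{if } \theta_2 = 0 , \\[2mm] \dfrac{\cot\theta_2}{1 + q\cot\theta_2} , & \text{if } \theta_2 \neq 0 , \end{cases} \tag{4.102} \]
and
\[ \cot\theta_3^{(q)} = \cot\theta_3\, (1 + 2q \sin\theta_2 \cos\theta_2 + q^2 \cos^2\theta_2)^{1/2} . \tag{4.103} \]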
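The angle maps for the Jordan block can be checked by acting with $J_2^q$ directly on the vector $x$. The sketch below assumes the parametrisation $x = (\sin\theta_1\sin\theta_3, \cos\theta_1\sin\theta_3, \sin\theta_2\cos\theta_3, \cos\theta_2\cos\theta_3)$ used in (4.110), with $J_2^q$ acting on the last two components; the function names are hypothetical. Under that assumption the radicand in $\cot\theta_3^{(q)}$ comes out as $1 + 2q\sin\theta_2\cos\theta_2 + q^2\cos^2\theta_2$.

```python
import math


def transformed_cots(theta1, theta2, theta3, q):
    """Apply J_2^q = [[1, q], [0, 1]] to (x3, x4) and read off the new cotangents.

    The rotation block acting on (x1, x2) preserves x1^2 + x2^2, so it does not
    affect theta2 or theta3 and is omitted here.
    """
    x1 = math.sin(theta1) * math.sin(theta3)
    x2 = math.cos(theta1) * math.sin(theta3)
    x3 = math.sin(theta2) * math.cos(theta3)
    x4 = math.cos(theta2) * math.cos(theta3)
    y3, y4 = x3 + q * x4, x4                      # action of J_2^q on (x3, x4)
    cot2 = y4 / y3                                # cot(theta2') = x4'/x3'
    cot3 = math.hypot(y3, y4) / math.hypot(x1, x2)
    return cot2, cot3


def closed_form_cots(theta2, theta3, q):
    """Closed forms for cot(theta2^(q)) and cot(theta3^(q)) when theta2 != 0."""
    cot2 = math.cos(theta2) / math.sin(theta2)
    cot3 = math.cos(theta3) / math.sin(theta3)
    radicand = 1 + 2 * q * math.sin(theta2) * math.cos(theta2) + q * q * math.cos(theta2) ** 2
    return cot2 / (1 + q * cot2), cot3 * math.sqrt(radicand)


if __name__ == "__main__":
    for q in range(1, 6):
        print(q, transformed_cots(0.4, 0.7, 1.1, q), closed_form_cots(0.7, 1.1, q))
```

As $q$ grows, the first cotangent tends to 0 and the second grows without bound for $\theta_2 \neq \pi/2$, which is the limiting behaviour used in the text.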
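Therefore, $\theta_2^{(q)} \to \pi/2$ as $q \to \infty$. If $\theta_3 = 0$ or $\pi/2$, then $\theta_3^{(q)} = \theta_3$. If $\theta_2 = \pi/2$, then $\theta_3^{(q)} = \theta_3$, otherwise $\theta_3^{(q)} \to 0$. We have
\[ \int_\Omega g(\omega)\, \sigma_0^E(d\omega) = \lim_{q\to\infty} \int_\Omega (T_0^{4q} g)(\omega)\, \sigma_0^1(d\omega) . \tag{4.104} \]
By using the functions (4.22) in (4.104), we get for $n_1, n_2 \in \mathbb{Z}$, $n_3 \in \mathbb{N}$, $n_1 + n_2$ even,
\[ \int_{\Omega_{(0,\pi/2)}} e^{i(n_1\theta_1 + n_2\theta_2)} \sin 2n_3\theta_3\; \sigma_0^E(d\theta_1\, d\theta_2\, d\theta_3) = \int_{\Omega_{(0,\pi/2)} \cap \{\theta_2 = \pi/2\}} e^{i(n_1\theta_1 + n_2\theta_2)} \sin 2n_3\theta_3\; \sigma_0^1(d\theta_1\, d\theta_2\, d\theta_3) . \tag{4.105} \]
Thus $\sigma_0^1$ on $\Omega_{(0,\pi/2)}$ is concentrated on $\Omega_{(0,\pi/2)} \cap \{\theta_2 = \pi/2\}$. Then by using the functions (4.24) in (4.104), we get, for $n_2 \in \mathbb{Z}$,
\[ \int_{\Omega_0} e^{2in_2\theta_2}\, \sigma_0^1(d\theta_2) = \int_{\Omega_0 \cap \{\theta_2 = \pi/2\}} e^{2in_2\theta_2}\, \sigma_0^1(d\theta_2) . \tag{4.106} \]
Therefore, $\sigma_0^1$ is concentrated on $\bigl((\Omega_{(0,\pi/2)} \cup \Omega_0) \cap \{\theta_2 = \pi/2\}\bigr) \cup \Omega_{\pi/2}$. Since
\[ (T_0^4 g)\bigl(\theta_1, \tfrac{\pi}{2}, \theta_3\bigr) = g\bigl(\theta_1, \tfrac{\pi}{2}, \theta_3\bigr) , \tag{4.107} \]
we have
\[ \int_\Omega \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda^4 g \right|_{\lambda=0} (\theta)\, \sigma_0^1(d\theta) = 0 . \tag{4.108} \]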
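It is sufficient to calculate $\left. \frac{\partial^2}{\partial\lambda^2} T_\lambda^4 g \right|_{\lambda=0} (\theta_1, \tfrac{\pi}{2}, \theta_3)$. Let
\[ x = \begin{pmatrix} \sin\theta_1 \sin\theta_3 \\ \cos\theta_1 \sin\theta_3 \\ \cos\theta_3 \\ 0 \end{pmatrix} \tag{4.109} \]
and let
\[ x' = \tilde B(4)\, x = \begin{pmatrix} \sin\theta_1' \sin\theta_3' \\ \cos\theta_1' \sin\theta_3' \\ \sin\theta_2' \cos\theta_3' \\ \cos\theta_2' \cos\theta_3' \end{pmatrix} . \tag{4.110} \]
Then we get from (4.50) with $n_2 = 0$ and $m = 4$
\[ \mathbb{E}\bigl(\exp(i(n_1\theta_1' + n_3\theta_3'))\bigr) = \exp(i(n_1\theta_1 + n_3\theta_3)) \times \bigl\{ 1 + \lambda^2 [A_1 n_1 + A_3 n_3 + A_{11} n_1^2 + A_{33} n_3^2 + A_{31} n_3 n_1] + O(\lambda^3) \bigr\} . \tag{4.111} \]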
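Computing the $A$'s, we get
\[ A_1 = \frac{i}{2} \cos\psi \sin\psi , \tag{4.112} \]
\[ A_3 = \frac{i}{16\sin\phi} \bigl\{ 4(5\cos\phi + 3) + 13\sin\phi + 15\cos\phi\sin\phi + 2(2 - \cos\phi)\sin\phi\sin^2\psi - 8\cos\phi\sin\phi\cos\psi + 3(1 - \cos\phi)\sin\phi\sin\psi \bigr\} , \tag{4.113} \]
\[ A_{11} = \frac{3 - \sin^2\psi}{4} - \frac{2}{1 - \cos\phi} , \tag{4.114} \]
\[ A_{31} = \frac{\sin\phi}{4} (\cos\psi\sin\psi - 6 - 2\sin\psi) , \tag{4.115} \]
and
\[ A_{33} = -\frac{1}{32} \bigl\{ (2\sin^2\psi + 3\sin\psi + 8\cos\psi - 15)\cos^2\phi + (2 - 6\sin\psi)\cos\phi + (3\sin\psi - 8\cos\psi - 2\sin^2\psi + 45) \bigr\} . \tag{4.116} \]
As in the previous cases, we then get for suitable $g$'s
\[ \int_{\Omega \cap \{\theta_2 = \pi/2\}} \left( -iA_1 \frac{\partial}{\partial\theta_1} - iA_3 \frac{\partial}{\partial\theta_3} - A_{11} \frac{\partial^2}{\partial\theta_1^2} - A_{33} \frac{\partial^2}{\partial\theta_3^2} - A_{31} \frac{\partial^2}{\partial\theta_3\, \partial\theta_1} \right) g(\theta_1, \theta_3)\; \sigma_0^1(d\theta_1\, d\theta_3) = 0 . \tag{4.117} \]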
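If we assume that $\sigma_0^1$ restricted to $\Omega_{(0,\pi/2)} \cap \{\theta_2 = \pi/2\}$ is absolutely continuous with density $\rho$, then choosing $g$'s whose restriction to $\Omega_0 \cup \Omega_{\pi/2}$ is zero and such that the integrand is continuous, by integrating by parts we can show that $\rho$ satisfies the differential equation
\[ \left( i \frac{\partial}{\partial\theta_1} A_1 + i \frac{\partial}{\partial\theta_3} A_3 - \frac{\partial^2}{\partial\theta_1^2} A_{11} - \frac{\partial^2}{\partial\theta_3^2} A_{33} - \frac{\partial^2}{\partial\theta_3\, \partial\theta_1} A_{31} \right) \rho(\theta_1, \theta_3) = 0 . \tag{4.118} \]
Near $\theta_3 = 0$, $A_3$ behaves like $i\theta_3^{-1} + O(\theta_3)$ and $A_{33}$ behaves like $-1 + O(\theta_3^2)$, while near $\theta_3 = \pi/2$, $A_3 = -i\bigl(4(\tfrac{\pi}{2} - \theta_3)\bigr)^{-1} + O\bigl(\tfrac{\pi}{2} - \theta_3\bigr)$ and
\[ A_{33} = -\frac{1}{8} (3\sin\psi + 7) + O\left( \left( \frac{\pi}{2} - \theta_3 \right)^2 \right) . \tag{4.119} \]
Therefore, by choosing $g(\theta_3) = \sin^2\theta_3 \cos^4\theta_3$ we see that the measure $\sigma_0^1$ is zero on $\Omega_0$, and by choosing $g(\theta_3) = \sin^4\theta_3 \cos^2\theta_3$,
\[ \int_{\Omega_{\pi/2}} (9 + 3\sin\psi)\, \sigma_0^1(d\theta_1) = 0 . \tag{4.120} \]
Since the integrand is positive, the measure $\sigma_0^1$ is zero on $\Omega_{\pi/2}$ also. To sum up, in this case we have that
(1) $\sigma_0^1$ is concentrated on $\Omega_{(0,\pi/2)} \cap \{\theta_2 = \pi/2\}$;
(2) if we assume that $\sigma_0^1$ is absolutely continuous on $\Omega_{(0,\pi/2)} \cap \{\theta_2 = \pi/2\}$ with density $\rho$, then $\rho$ satisfies the differential equation (4.118).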
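The boundary behaviour of $A_{33}$ quoted above follows from evaluating (4.116) at $\phi = 0$ and $\phi = \pi$ (that is, $\theta_3 = 0$ and $\theta_3 = \pi/2$). A short numerical sketch, in which the function name `A33` and the sample values of $\psi$ are chosen only for illustration:

```python
import math


def A33(phi, psi):
    """Coefficient A_33 of Eq. (4.116) as a function of phi = 2*theta3 and psi."""
    s, c = math.sin(psi), math.cos(psi)
    cp = math.cos(phi)
    return -(
        (2 * s * s + 3 * s + 8 * c - 15) * cp * cp
        + (2 - 6 * s) * cp
        + (3 * s - 8 * c - 2 * s * s + 45)
    ) / 32.0


if __name__ == "__main__":
    for psi in (0.0, 0.5, 2.0, 4.0):
        # At phi = 0 the bracket sums to 32 for every psi; at phi = pi it sums
        # to 12*sin(psi) + 28.
        print(psi, A33(0.0, psi), A33(math.pi, psi), -(3 * math.sin(psi) + 7) / 8)
```

The second and third printed columns coincide, matching the leading term of (4.119), while the first is $-1$ independently of $\psi$.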
The differential equation (4.118) does not have a $\theta_1$-independent solution, and in particular $\rho(\theta_1, \theta_3) = \sin\theta_3$ is not a solution, so that there is an anomaly at $E = 1$ on the left hand side. In the next section, we will see that there is also an anomaly on the right hand side.

4.5. The case when E ∈ (−3, −1) ∪ (1, 3)

Suppose that $1 < E < 3$. The case when $-3 < E < -1$ is similar. We can choose $\beta \in (0, \pi/2)$ but we cannot choose $\alpha$ to be a real number. In fact, if we put $\alpha = i\gamma$, $\gamma > 0$, we get $2\cosh\gamma = E + 1$ and $2\cos\beta = E - 1$. Then
\[ J_0 = \begin{pmatrix} R_\beta & 0 \\ 0 & \tilde R_\gamma \end{pmatrix} , \tag{4.121} \]
where
\[ \tilde R_\gamma = \begin{pmatrix} \exp(-\gamma) & 0 \\ 0 & \exp(\gamma) \end{pmatrix} , \tag{4.122} \]
and
\[ S = \begin{pmatrix} 1 & 1 & -\cos\beta & -\cos\beta \\ 0 & 0 & \sin\beta & \sin\beta \\ -\exp(-\gamma) & \exp(-\gamma) & 1 & -1 \\ \exp(\gamma) & -\exp(\gamma) & -1 & 1 \end{pmatrix} . \tag{4.123} \]
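We have
\[ \theta_1^{(q)} = (\theta_1 - q\beta) \bmod 2\pi , \tag{4.124} \]
\[ \cot\theta_2^{(q)} = \cot\theta_2\, e^{2q\gamma} \tag{4.125} \]
and
\[ \cot\theta_3^{(q)} = \cot\theta_3\, (e^{-2q\gamma} \sin^2\theta_2 + e^{2q\gamma} \cos^2\theta_2)^{1/2} . \tag{4.126} \]
Therefore, as $q \to \infty$, $\theta_3^{(q)}$ converges to 0 or $\pi/2$. We have
\[ \int_\Omega g(\omega)\, \sigma_0^E(d\omega) = \lim_{q\to\infty} \int_\Omega (T_0^q g)(\omega)\, \sigma_0^E(d\omega) . \tag{4.127} \]
By using the functions (4.22) in (4.127), we get for $n_1, n_2 \in \mathbb{Z}$, $n_3 \in \mathbb{N}$, $n_1 + n_2$ even,
\[ \int_{\Omega_{(0,\pi/2)}} e^{i(n_1\theta_1 + n_2\theta_2)} \sin 2n_3\theta_3\; \sigma_0^E(d\theta_1\, d\theta_2\, d\theta_3) = 0 \tag{4.128} \]
since $\sin 2n_3\theta_3^{(q)}$ converges to 0. Thus $\sigma_0^E(\Omega_{(0,\pi/2)}) = 0$.

If $\theta_3 = 0$ or $\pi/2$, then $\theta_3^{(q)} = \theta_3$. If $\theta_2 = \pi/2$, then $\theta_2^{(q)} = \pi/2$, otherwise $\theta_2^{(q)} \to 0$. So by using the functions (4.24) in (4.127), we get for $n_2 \in \mathbb{Z}$,
\[ \int_{\Omega_0} e^{2in_2\theta_2}\, \sigma_0^E(d\theta_2) = \sigma_0^E\bigl(\Omega_0 \cap \{\theta_2 \neq \tfrac{\pi}{2}\}\bigr) + \sigma_0^E\bigl(\Omega_0 \cap \{\theta_2 = \tfrac{\pi}{2}\}\bigr)\, e^{in_2\pi} . \tag{4.129} \]
Therefore, $\sigma_0^E$ is concentrated on $(\Omega_0 \cap \{\theta_2 = 0 \text{ or } \tfrac{\pi}{2}\}) \cup \Omega_{\pi/2}$. From (2.20) we have
\[ \lim_{\lambda\to 0} \lambda^{-2} \int_{\Omega_{\pi/2}} (T_\lambda g - g)(\theta_1)\, \sigma_0^E(d\theta_1) + \lim_{\lambda\to 0} \lambda^{-2} \int_{\Omega_0 \cap \{\theta_2 = 0\}} (T_\lambda g - g)(\theta_2)\, \sigma_0^E(d\theta_2) = 0 . \tag{4.130} \]
If we let $g(\theta_3) = \sin^2\theta_3 \cos^4\theta_3$, then
\[ T_0 g \big|_{(\Omega_0 \cap \{\theta_2 = 0\}) \cup \Omega_{\pi/2}} = 0 = g \big|_{(\Omega_0 \cap \{\theta_2 = 0\}) \cup \Omega_{\pi/2}} . \tag{4.131} \]
Thus
\[ \int_{\Omega_{\pi/2}} \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda g \right|_{\lambda=0} (\theta_1)\, \sigma_0^E(d\theta_1) + \int_{\Omega_0 \cap \{\theta_2 = 0\}} \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda g \right|_{\lambda=0} (\theta_2)\, \sigma_0^E(d\theta_2) = 0 . \tag{4.132} \]
Using (4.50) with $n_1 = n_2 = 0$ and $m = 1$, we can show that the first term is 0 and that the second term is equal to
\[ 16 (e^{2\gamma} - 1)^{-2}\, \sigma_0^E(\Omega_0 \cap \{\theta_2 = 0\}) + 8 (1 - e^{-2\gamma})^{-2}\, \sigma_0^E\bigl(\Omega_0 \cap \{\theta_2 = \tfrac{\pi}{2}\}\bigr) . \]
Therefore, $\sigma_0^E(\Omega_0) = 0$ and $\sigma_0^E$ is concentrated on $\Omega_{\pi/2}$.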
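It is clear that $\sigma_0^E |_{\Omega_{\pi/2}}$ is invariant under rotation by $\beta$. Thus if $\beta/\pi$ is irrational, this measure must be the Lebesgue measure. If $\beta/\pi = p/q$, where $p$ and $q$ are positive integers, then we have
\[ \int_{\Omega_{\pi/2}} \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda^{2q} g \right|_{\lambda=0} (\theta_1)\, \sigma_0^E(d\theta_1) = 0 . \tag{4.133} \]
Let
\[ x = \begin{pmatrix} \sin\theta_1 \\ \cos\theta_1 \\ 0 \\ 0 \end{pmatrix} \tag{4.134} \]
and let $x' = \tilde B(2q)\, x$. As in (4.50) with $n_2 = n_3 = 0$ and $m = 2q$ we have
\[ \mathbb{E}(\exp(in_1\theta_1')) = \exp(in_1\theta_1) \bigl\{ 1 + \lambda^2 [A_1 n_1 + A_{11} n_1^2] + O(\lambda^3) \bigr\} . \tag{4.135} \]
One can then check that when $\theta_3 = \pi/2$,
\[ A_1 = 0 \quad \text{and} \quad A_{11} = -\frac{3q}{\sin^2\beta} . \tag{4.136} \]
Therefore, from (4.133), for $n_1 \neq 0$
\[ \int_{\Omega_{\pi/2}} e^{2in_1\theta_1}\, \sigma_0^E(d\theta_1) = 0 . \tag{4.137} \]
Thus, $\sigma_0^E |_{\Omega_{\pi/2}}$ is Lebesgue measure.

Summing up, we have that $\sigma_0^E$ is concentrated on $\Omega_{\pi/2}$ and on that, it is Lebesgue measure. To compare the limit as $E \to 1$ with the solution at $E = 1$, we compute
\[
S_1 S^{-1} =
\begin{pmatrix}
1 & \dfrac{\cos\beta + 1}{\sin\beta} & 0 & 0 \\[2mm]
1 & \dfrac{\cos\beta - 1}{\sin\beta} & 0 & 0 \\[2mm]
0 & 0 & \dfrac{e^{\gamma}}{e^{\gamma} - e^{-\gamma}} & \dfrac{e^{-\gamma}}{e^{\gamma} - e^{-\gamma}} \\[2mm]
0 & 0 & \dfrac{1 + e^{\gamma}}{e^{\gamma} - e^{-\gamma}} & \dfrac{1 + e^{-\gamma}}{e^{\gamma} - e^{-\gamma}}
\end{pmatrix} .
\tag{4.138}
\]
This maps $\Omega_{\pi/2}$ onto itself and in the limit $E \to 1$ we get
\[ \cot\theta_1' = \frac{1 - \cot\theta_1}{1 + \cot\theta_1} . \tag{4.139} \]
The limiting measure is therefore also concentrated on $\Omega_{\pi/2}$, in contradistinction to $\sigma_0^1$.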
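4.6. The cases E = ±3

Finally, we come to the case when $E = \pm 3$. It is sufficient to study the case $E = 3$. Here $\cos\alpha = 2$ and $\beta = 0$.
\[ J_0 = \begin{pmatrix} J_1 & 0 \\ 0 & J_2 \end{pmatrix} , \tag{4.140} \]
where
\[ J_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \tag{4.141} \]
and
\[ J_2 = \begin{pmatrix} 2 - \sqrt{3} & 0 \\ 0 & 2 + \sqrt{3} \end{pmatrix} . \tag{4.142} \]
The corresponding matrix is
\[ S_3 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & -1 & -1 \\ -(2 - \sqrt{3}) & 2 - \sqrt{3} & 1 & -1 \\ 2 + \sqrt{3} & -(2 + \sqrt{3}) & -1 & 1 \end{pmatrix} . \tag{4.143} \]
We have
\[ \cot\theta_1^{(q)} = \begin{cases} \dfrac{1}{q} , & \text{if } \theta_1 = 0 , \\[2mm] \dfrac{\cot\theta_1}{1 + q\cot\theta_1} , & \text{if } \theta_1 \neq 0 , \end{cases} \tag{4.144} \]
\[ \cot\theta_2^{(q)} = \cot\theta_2\, (2 + \sqrt{3})^{2q} \tag{4.145} \]
and
\[ \cot\theta_3^{(q)} = \cot\theta_3 \left( \frac{(2 - \sqrt{3})^{2q} \sin^2\theta_2 + (2 + \sqrt{3})^{2q} \cos^2\theta_2}{1 + q^2 \cos^2\theta_1 + 2q \sin\theta_1 \cos\theta_1} \right)^{1/2} . \tag{4.146} \]
Therefore as $q \to \infty$, $\theta_3^{(q)} \to 0$ or $\pi/2$. We have
\[ \int_\Omega g(\omega)\, \sigma_0^3(d\omega) = \lim_{q\to\infty} \int_\Omega (T_0^q g)(\omega)\, \sigma_0^3(d\omega) . \tag{4.147} \]
By using the functions (4.22) in (4.147), we get for $n_1, n_2 \in \mathbb{Z}$, $n_3 \in \mathbb{N}$, $n_1 + n_2$ even,
\[ \int_{\Omega_{(0,\pi/2)}} e^{i(n_1\theta_1 + n_2\theta_2)} \sin 2n_3\theta_3\; \sigma_0^3(d\theta_1\, d\theta_2\, d\theta_3) = 0 . \tag{4.148} \]
Thus $\sigma_0^3(\Omega_{(0,\pi/2)}) = 0$.

Now $\theta_3^{(q)} = \theta_3$ if $\theta_3 = 0$ or $\pi/2$. $\theta_2^{(q)} = \theta_2$ if $\theta_2 = 0$ or $\pi/2$, otherwise $\theta_2^{(q)} \to 0$ as $q \to \infty$. Therefore by the same argument as in Sec. 4.4, $\sigma_0^3$ is concentrated on $(\Omega_0 \cap \{\theta_2 = 0 \text{ or } \theta_2 = \tfrac{\pi}{2}\}) \cup \Omega_{\pi/2}$.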
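Similarly, since $\theta_1^{(q)} \to \pi/2$ as $q \to \infty$, we can argue that $\sigma_0^3$ is concentrated on $(\Omega_0 \cap \{\theta_2 = 0 \text{ or } \theta_2 = \tfrac{\pi}{2}\}) \cup (\Omega_{\pi/2} \cap \{\theta_1 = \tfrac{\pi}{2}\})$. From (2.20), we have
\[ \lim_{\lambda\to 0} \lambda^{-2} \int_{\Omega_{\pi/2} \cap \{\theta_1 = \pi/2\}} (T_\lambda g - g)(\theta_1)\, \sigma_0^E(d\theta_1) + \lim_{\lambda\to 0} \lambda^{-2} \int_{\Omega_0 \cap \{\theta_2 = 0 \text{ or } \pi/2\}} (T_\lambda g - g)(\theta_2)\, \sigma_0^E(d\theta_2) = 0 . \tag{4.149} \]
If we let $g(\theta_3) = \sin^2\theta_3 \cos^4\theta_3$, then
\[ T_0 g \big|_{\Omega_0 \cup \Omega_{\pi/2}} = 0 = g \big|_{\Omega_0 \cup \Omega_{\pi/2}} . \tag{4.150} \]
Thus
\[ \int_{\Omega_{\pi/2} \cap \{\theta_1 = \pi/2\}} \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda g \right|_{\lambda=0} (\theta_1)\, \sigma_0^E(d\theta_1) + \int_{\Omega_0 \cap \{\theta_2 = 0 \text{ or } \pi/2\}} \left. \frac{\partial^2}{\partial\lambda^2} T_\lambda g \right|_{\lambda=0} (\theta_2)\, \sigma_0^E(d\theta_2) = 0 . \tag{4.151} \]
From (4.50) with $n_1 = n_2 = 0$ and $m = 1$, we can check that the first term is 0 and that the second term is equal to $\sigma_0^E(\Omega_0 \cap \{\theta_2 = 0 \text{ or } \tfrac{\pi}{2}\})/12$. Therefore, $\sigma_0^E(\Omega_0 \cap \{\theta_2 = 0 \text{ or } \tfrac{\pi}{2}\}) = 0$ and $\sigma_0^E$ is concentrated on $\Omega_{\pi/2} \cap \{\theta_1 = \tfrac{\pi}{2}\}$. Thus $\sigma_0^E$ is concentrated on $\Omega_{\pi/2}$ and on that, it is the atomic measure at $\theta_1 = \pi/2$.

The limiting measure at $E = 3$ has to be transformed via the matrix
\[
S_3 S^{-1} =
\begin{pmatrix}
1 & \cot\beta & 0 & 0 \\[1mm]
1 & \dfrac{\cos\beta - 1}{\sin\beta} & 0 & 0 \\[2mm]
0 & 0 & \dfrac{-2 + \sqrt{3} + e^{\gamma}}{2\sinh\gamma} & \dfrac{-2 + \sqrt{3} + e^{-\gamma}}{2\sinh\gamma} \\[2mm]
0 & 0 & \dfrac{2 + \sqrt{3} - e^{\gamma}}{2\sinh\gamma} & \dfrac{2 + \sqrt{3} - e^{-\gamma}}{2\sinh\gamma}
\end{pmatrix} .
\tag{4.152}
\]
In particular, we obtain a relation between the original and the transformed angles analogous to (3.71):
\[ \cot\theta_1' = \frac{\sin\beta - (1 - \cos\beta)\cot\theta_1}{\sin\beta + \cos\beta\cot\theta_1} . \tag{4.153} \]
This implies, by a calculation similar to that in the case of one chain, that as $\beta \to 0$, the measure becomes concentrated at $\theta_1 = \pi/2$. We conclude therefore that there is no anomaly at $E = 3$.

Acknowledgment

One of the authors (J. V. Pulé) wishes to thank the Department of Mathematics of the University of Bologna, Italy, for their kind hospitality and University College Dublin for the award of a President's Fellowship.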
References

[1] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators (Birkhäuser, Boston, 1990).
[2] H. Schulz-Baldes, Perturbation theory for Lyapunov exponents of an Anderson model on a strip, Mathematical Physics Preprint Archive mp_arc 03-369, Aug 14 (2003).
[3] D. Thouless, in Ill-Condensed Matter, eds. R. Balian, R. Maynard and G. Toulouse (North-Holland, Amsterdam, 1979).
[4] M. Kappus and F. Wegner, Z. Phys. B45 (1981) 15.
[5] B. Derrida and E. Gardner, J. Phys. (Paris) 45 (1984) 1283.
[6] A. Bovier and A. Klein, J. Statist. Phys. 51 (1988) 501.
[7] M. Campanino and A. Klein, Commun. Math. Phys. 130 (1990) 441.
[8] T. C. Dorlas and J. V. Pulé, Markov Process. Related Fields 9 (2004) 567–578, in Special issue dedicated to Leonid Pastur on the occasion of his 65th birthday, eds. Jean-Michel Combes, Jean Ruiz and Valentin A. Zagrebnov.
July 14, 2004 18:6 WSPC/148-RMP
00210
Reviews in Mathematical Physics Vol. 16, No. 5 (2004) 675–677 c World Scientific Publishing Company
ERRATUM
A UNIFIED APPROACH TO RESOLVENT EXPANSIONS AT THRESHOLDS [Rev. Math. Phys. Vol. 13, No. 6 (2001) 717–754] Arne Jensen and Gheorghe Nenciu
This note contains an erratum to, and a few remarks on, the paper in the title [1]. The erratum is that the statement of Corollary 2.2 in [1] is incomplete. For the conclusion to be true, one needs that $A_0 S = 0$, which is not the case when $A_0$ has a nilpotent part. The correct form is

Proposition 1. Let $F \subset \mathbb{C}$ have zero as an accumulation point. Let $A(z)$, $z \in F$, be a family of bounded operators of the form
\[ A(z) = A_0 + z A_1(z) , \tag{1} \]
with $A_1$ uniformly bounded as $z \to 0$. Suppose 0 is an isolated point of the spectrum of $A_0$, and let $S$ be the corresponding Riesz projection. If
\[ A_0 S = 0 , \tag{2} \]
then for sufficiently small $z$, the operator $B(z) : S\mathcal{H} \to S\mathcal{H}$ defined by
\[ B(z) = \frac{1}{z} \bigl( S - S (A(z) + S)^{-1} S \bigr) = \sum_{j=0}^{\infty} (-z)^j\, S \bigl[ A_1(z) (A_0 + S)^{-1} \bigr]^{j+1} S , \tag{3} \]
is uniformly bounded as $z \to 0$. The operator $A(z)$ has a bounded inverse in $\mathcal{H}$ if and only if $B(z)$ has a bounded inverse in $S\mathcal{H}$, and in this case
\[ A(z)^{-1} = (A(z) + S)^{-1} + \frac{1}{z} (A(z) + S)^{-1} S B(z)^{-1} S (A(z) + S)^{-1} . \tag{4} \]

The rest of the paper is not affected, since everywhere $A_0$ is self-adjoint, so that (2) holds true.

We remark that (2) might be true, even if $A_0$ is not self-adjoint. As we shall prove below, this is the case when one considers the asymptotic expansion of the perturbed
resolvent around an embedded non-threshold eigenvalue, $\lambda_0$, of the unperturbed (self-adjoint) Hamiltonian, as in [2]. More precisely (see [2] for details), if the unperturbed resolvent is
\[ R_0(z) = \frac{P_0}{\lambda_0 - z} + \tilde R_0(z) \tag{5} \]
and $V = |V|^{1/2} U |V|^{1/2}$ is a bounded self-adjoint perturbation (here we choose $U$ to be unitary by defining it to be 1 on $\operatorname{Ker} V$, such that $U^2 = I$), then $A_0$ turns out to be
\[ A_0 = U + \lim_{\eta \downarrow 0} |V|^{1/2} \tilde R_0(\lambda_0 + i\eta) |V|^{1/2} . \tag{6} \]
It is assumed that the limit exists and is compact, and then either $A_0$ has a bounded inverse, or 0 is an isolated part of the spectrum of $A_0$. In the latter case, one can define the corresponding Riesz projection, $S$. The next proposition shows that under these conditions, (2) holds true.

Proposition 2.
\[ S A_0 = S A_0 S = 0 . \tag{7} \]
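Proof. The first equality follows from the fact that $S$ is a Riesz projection. The key observation is that from (6)
\[ \operatorname{Im} A_0 \ge 0 . \tag{8} \]
Consider now, as an operator in $S\mathcal{H}$,
\[ A_1 = S A_0 S . \tag{9} \]
On one hand $S\mathcal{H}$ is finite dimensional, and on the other hand for all $\Psi \in S\mathcal{H}$
\[ \operatorname{Im}\langle \Psi, A_1 \Psi \rangle = \operatorname{Im}\langle \Psi, S A_0 S \Psi \rangle = \operatorname{Im}\langle \Psi, A_0 S \Psi \rangle = \operatorname{Im}\langle \Psi, A_0 \Psi \rangle \ge 0 , \]
i.e.
\[ \operatorname{Im} A_1 \ge 0 . \tag{10} \]
Since $A_1$ is nilpotent,
\[ \operatorname{Tr} A_1 = \operatorname{Tr} \operatorname{Re} A_1 + i \operatorname{Tr} \operatorname{Im} A_1 = 0 , \tag{11} \]
which together with (10) implies that
\[ \operatorname{Im} A_1 = 0 . \tag{12} \]
As a consequence, $A_1$ is self-adjoint, and since $\sigma(A_1) = \{0\}$, it follows that $A_1 = 0$.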
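Returning to Proposition 1, the inversion formula (4) can be illustrated on the smallest possible example. The sketch below is an assumption-laden toy case, not taken from [1]: it uses $\mathcal{H} = \mathbb{C}^2$, $A_0 = \operatorname{diag}(0, 1)$, $S$ the projection onto the first coordinate (so that $A_0 S = 0$), and a constant matrix $A_1$; the helper names are hypothetical. Exact rational arithmetic is used so that the identity can be checked without rounding.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def inverse_via_proposition(A1, z):
    """Invert A(z) = A0 + z*A1 through formula (4).

    Here A0 = diag(0, 1) and S projects onto the first coordinate, so SH is
    one-dimensional and B(z) reduces to a scalar.
    """
    A = [[z * A1[0][0], z * A1[0][1]], [z * A1[1][0], 1 + z * A1[1][1]]]
    M = [[A[0][0] + 1, A[0][1]], [A[1][0], A[1][1]]]          # A(z) + S
    Minv = inverse(M)
    B = (1 - Minv[0][0]) / z                                   # Eq. (3) on SH
    # (1/z) (A+S)^{-1} S B^{-1} S (A+S)^{-1} is a rank-one correction.
    corr = [[Minv[i][0] * Minv[0][j] / (z * B) for j in range(2)] for i in range(2)]
    Ainv = [[Minv[i][j] + corr[i][j] for j in range(2)] for i in range(2)]
    return A, B, Ainv


A1 = [[F(2), F(1)], [F(1), F(3)]]
A, B, Ainv = inverse_via_proposition(A1, F(1, 10))

if __name__ == "__main__":
    print(B, matmul(A, Ainv))
```

In this example $A_0 + S$ is the identity, so the series in (3) reduces to the first entry of $A_1 (1 + z A_1)^{-1}$, which gives an independent way to compute the same scalar $B(z)$.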
The last remark is that one can generalize Corollary 2.2, and then the whole procedure in [1], to non-self-adjoint $A_0$, under the additional assumption that $A_0$ is a Fredholm operator with index zero [3] (i.e. $\dim \operatorname{Ker} A_0 = \dim (\operatorname{Ran} A_0)^\perp$). Then if $A_0 = W |A_0|$ is the polar decomposition of $A_0$, one can extend (in a non-unique way) $W$ to a unitary operator $U$ (just take $\{f_j\}$ and $\{g_j\}$ orthonormal bases in $\operatorname{Ker} A_0$ and $(\operatorname{Ran} A_0)^\perp$ respectively, and define $U f_j = g_j$). Then write
\[ A(z) = U \bigl( |A_0| + z U^{-1} A_1(z) \bigr) \tag{13} \]
and apply Corollary 2.2 to $|A_0| + z U^{-1} A_1(z)$. There is a price to pay in this case: due to the non-uniqueness of $U$, the obtained expansions are not “canonical”. Of course, the coefficients of the expansion do not depend on $U$, but various identities have to be used in order to see that.

References

[1] A. Jensen and G. Nenciu, A unified approach to resolvent expansions at threshold, Rev. Math. Phys. 13 (2001) 717–754.
[2] A. Jensen, On a unified approach to resolvent expansions for Schrödinger operators, in Spectral and Scattering Theory and Related Topics (RIMS, Kyoto, Japan); Sūrikaisekikenkyūsho Kōkyūroku 1208 (2001) 91–103.
[3] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, Berlin, 1966).
September 6, 2004 14:35 WSPC/148-RMP
00214
Reviews in Mathematical Physics Vol. 16, No. 6 (2004) 679–808 c World Scientific Publishing Company
QUASITRIANGULAR WZW MODEL
CTIRAD KLIMČÍK
Institute de mathématiques de Luminy, 163, Avenue de Luminy, 13288 Marseille, France

Received 9 March 2004

A dynamical system is canonically associated to every Drinfeld double of any affine Kac–Moody group. In particular, the choice of the affine Lu–Weinstein double gives a smooth one-parameter deformation of the standard WZW model. The deformed WZW model is exactly solvable and it admits the chiral decomposition. Its classical action is not invariant with respect to the left and right action of the loop group; however, it satisfies the weaker condition of the Poisson–Lie symmetry. The structure of the deformed WZW theory is characterized by several ordinary and dynamical r-matrices with spectral parameter. They describe the q-deformed current algebras, appear in the definition of q-primary fields and characterize the quasitriangular exchange (braiding) relations. The symplectic structure of the deformed chiral WZW theory is cocharacterized by the same elliptic dynamical r-matrix that appears in the Bernard generalization of the Knizhnik–Zamolodchikov equation, with q entering the modular parameter of the Jacobi theta functions. This reveals a remarkable connection between the classical q-deformed WZW model and the quantum standard WZW theory on elliptic curves.

Keywords: WZW model; dynamical r-matrices; Poisson–Lie symmetry; loop groups; symplectic geometry.
Contents

1. Introduction
  1.1. Basic observation
  1.2. Chiral decomposition
  1.3. Quasitriangular symplectic structure
  1.4. Quasitriangular Hamiltonian
  1.5. Quasitriangular classical action
  1.6. Quasitriangular current algebra
  1.7. The plan of the paper
2. Universal WZW Model
  2.1. Central biextension
  2.2. The symplectic reduction
    2.2.1. Second floor master model on $\tilde G$
    2.2.2. The symplectic reduction to the first floor $\hat G$
    2.2.3. The reduction to the ground floor $G$
3. Chiral Decomposition of the WZW Model
  3.1. Chiral geodesical model on $G_0$
    3.1.1. Cartan decomposition
    3.1.2. Standard chiral symplectic structure
    3.1.3. Dynamical r-matrix
  3.2. Chiral decomposition of the master model
    3.2.1. Affine Cartan decomposition
    3.2.2. Affine model space
    3.2.3. Chiral reduction to the first floor
    3.2.4. Ground floor: standard chiral WZW model
    3.2.5. Affine dynamical r-matrix
    3.2.6. Vertex-IRF transformation and braiding relation
4. Universal Quasitriangular WZW Model
  4.1. Poisson–Lie primer
    4.1.1. The Drinfeld double
    4.1.2. The Heisenberg double
    4.1.3. Lu–Weinstein double
    4.1.4. Non-Abelian moment maps
  4.2. Quasitriangular geodesical model
  4.3. WZW Drinfeld doubles of $\tilde G$
  4.4. Affine Lu–Weinstein double
    4.4.1. The double $D = LG_0^{\mathbb{C}}$
    4.4.2. The double $\hat{\hat D} = \mathbb{R} \times_Q \widehat{LG_0^{\mathbb{C}}}$
    4.4.3. The double $\tilde{\tilde D} = \mathbb{R}^2 \times_{S,Q} \widehat{LG_0^{\mathbb{C}}}$
  4.5. Conclusion
5. Loop Group Quasitriangular WZW Model
  5.1. Quasitriangular chiral geodesical model
    5.1.1. Chiral splitting of the Semenov-Tian-Shansky form
    5.1.2. The power of the Poisson–Lie symmetry
    5.1.3. Deformed dynamical r-matrix
    5.1.4. The classical solution
  5.2. Quasitriangular chiral WZW model
    5.2.1. Quasitriangular chiral master model
    5.2.2. Chiral symplectic reduction: the first step
    5.2.3. Chiral symplectic reduction: the second step
    5.2.4. Deformed affine dynamical r-matrix
    5.2.5. q-Kac–Moody primary fields
    5.2.6. q-deformed current algebra
    5.2.7. The limit q → 1
    5.2.8. Quasitriangular exact solution
    5.2.9. The left-right combination
6. Conclusions and Outlook
Appendices
  A.1. The loop group primer
  A.2. Cotangent bundle of a group manifold
  A.3. The symplectic reduction in the dual language
    A.3.1. The map between $M_\kappa(\hat G)/U(1)$ and $T^*G$
    A.3.2. The reduced Poisson bracket on $M_\kappa(\hat G)/U(1)$
  A.4. Proof of Lemma 5.8
References
1. Introduction

1.1. Basic observation

The WZW model [45] is certainly one of the most important models of two-dimensional (conformal) field theory. It is well-known that many interesting theories can be naturally obtained by its reductions, e.g. the coset models [30] or Toda theories [21]. Such fabrication of new structures from the roof model was called “the WZW factory” in [27].

This paper is based on the observation that the roof of the WZW factory is in fact two floors above the WZW model. In other words, we shall argue that there is a master model from which the WZW model can be obtained by two successive symplectic reductions. This master model describes the geodesical flow on the affine Kac–Moody group $\tilde G = \mathbb{R} \times_S \hat G$ and its action reads
\[ S(\tilde g) = -\frac{\kappa}{4} \int d\tau \left( \tilde g^{-1} \frac{d}{d\tau} \tilde g ,\; \tilde g^{-1} \frac{d}{d\tau} \tilde g \right)_{\tilde{\mathcal{G}}} . \tag{1.1} \]
Here $\tilde g(\tau) \in \tilde G$, $\kappa$ plays the role of the WZW level, $(\cdot,\cdot)_{\tilde{\mathcal{G}}}$ is the invariant inner product on $\tilde{\mathcal{G}} = \mathrm{Lie}(\tilde G)$, $\hat G = \widehat{LG_0}$ is the centrally extended loop group and $\times_S$ means the semidirect product corresponding to the loop group parameter shift automorphism $\sigma \to \sigma + s$. The reader may be surprised that neither the world-sheet space derivative $\partial_\sigma$ nor the WZW term is present in the master action (1.1). We shall see, however, that they are “born” in the process of the symplectic reduction.

The interest in lifting the WZW model relies on the fact that the master model sitting two floors higher has an extremely simple structure. Its phase space is the cotangent bundle $T^*\tilde G$ equipped with its canonical symplectic structure. We can therefore easily construct deformations of the master model by using the theory of various doubles (Manin, Drinfeld, Heisenberg) of the group $\tilde G$. One simply replaces the cotangent bundle $T^*\tilde G$ by a chosen Drinfeld double $\tilde{\tilde D}$ and the symplectic structure is then canonically given by the Semenov-Tian-Shansky two-form $\tilde{\tilde\omega}$ on $\tilde{\tilde D}$. The left-right $\tilde G$-symmetries of the master model (1.1) are thus deformed to Poisson–Lie symmetries, and the Hamiltonian charges (= Abelian moment maps generating the standard symmetries) become non-Abelian Poisson–Lie moment maps. We then obtain the quasitriangular WZW theory by performing the two-step symplectic reduction of the deformed master model.

Remark. It is also interesting to note that the various symplectic reductions can also be applied in the Poisson–Lie case, or, in other words, the whole WZW factory should survive the q-deformation.

1.2. Chiral decomposition

It is well-known that the standard WZW model can be obtained by the appropriate combination of two identical copies of a simpler dynamical system called the chiral
WZW model [19]. The same thing turns out to be true also for the master model (1.1). It can be combined from two copies of the chiral geodesical model whose (first order Hamiltonian) action is given by
\[ \tilde S_L(\tilde k, \tilde\phi) = \int d\tau \left[ \left\langle \tilde\phi ,\; \tilde k^{-1} \frac{d}{d\tau} \tilde k \right\rangle + \frac{1}{2\kappa} (\tilde\phi, \tilde\phi)_{\tilde{\mathcal{G}}^*} \right] , \tag{1.2} \]
where $\tilde k(\tau) \in \tilde G$ and $\tilde\phi(\tau) \in \tilde A_+$. Note that $\tilde A_+$ is the Weyl alcove viewed as the subset of the dual of the Cartan subalgebra $\tilde{\mathcal{T}}$ of $\tilde{\mathcal{G}}$.

We shall show that the chiral WZW model (whose symplectic structure was detailed by Gawędzki [29]) can also be obtained by a simple two-step symplectic reduction from the chiral master model (1.2). In fact, the σ-shift and the central circle subgroups of $\tilde G$ act in the standard Hamiltonian way on the phase space $\tilde M_L \equiv \tilde G \times \tilde A_+$. The reduction is then induced by setting the σ-shift Hamiltonian charge to 0 and its central circle fellow to $\kappa$.

It turns out that the master model (1.1) and the chiral geodesical model (1.2) have natural deformations based on the choice of an appropriate Drinfeld double $\tilde{\tilde D}$ of the affine Kac–Moody group $\tilde G$. In particular, the resulting deformed chiral geodesical model is formulated on the same phase space $\tilde M_L$ but now its action reads
\[ \tilde S_L^q(\tilde k, \tilde\phi) = \int \left[ \theta^q + \frac{1}{2\kappa} (\tilde\phi, \tilde\phi)_{\tilde{\mathcal{G}}^*}\, d\tau \right] . \tag{1.3} \]
In order to explain the notation, there exists a one-parameter family of embeddings of the affine model space $\tilde M_L$ into the Drinfeld double $\tilde{\tilde D}$. This parameter will be referred to as $q \equiv e^{\varepsilon}$. Then $\theta^q$ is the solution of the equation $d\theta^q = \Omega^q$,^a where $\Omega^q$ is the pullback of the Semenov-Tian-Shansky form to the q-embedded submanifold $\tilde M_L \hookrightarrow \tilde{\tilde D}$. For $q = 1$, $\theta^1$ is the standard symplectic form [29] of the non-deformed chiral WZW model.

There is the crucial condition to be imposed on $\tilde{\tilde D}$, namely, the σ-shift and the central circle subgroups of $\tilde G$ must still act on $\tilde M_L$ in the standard Hamiltonian way but now with respect to the symplectic structure $\Omega^q$. Such good doubles will be referred to as the WZW doubles of $\tilde G$. As in the non-deformed case, the quasitriangular chiral WZW model will then be obtained from the action (1.3) by the κ-depending symplectic reduction based on setting the corresponding σ-shift and central circle Hamiltonian charges to 0 and $\kappa$, respectively.

Although the σ-shift and the central circle still act in the standard Hamiltonian way on $(\tilde M_L, \Omega^q)$, this is no longer true for the action of the remaining loop group generators of $\mathrm{Lie}(\tilde G)$. Nevertheless, due to the fact that $\Omega^q$ is the pullback of the Semenov-Tian-Shansky form, the remaining generators act in the Poisson–Lie way. This means, in particular, that the action (1.3) of the deformed chiral geodesical model is Kac–Moody Poisson–Lie symmetric. Due to this property, the quantized quasitriangular chiral WZW model will enjoy the q-Kac–Moody symmetry.

^a The classical action (1.3) makes sense even if this solution $\theta^q$ exists only locally on $\tilde M_L$.

Remarks. (i) The deformation of the master model (1.1) exists for every Drinfeld double $\tilde{\tilde D}$ of $\tilde G$. However, the two-step symplectic reduction can be performed only if $\tilde{\tilde D}$ is the WZW double (see above). We found with satisfaction that a nontrivial WZW double $\tilde{\tilde D}$ of $\tilde G$ can indeed be constructed; it is in fact nothing but the complexification $\tilde{\tilde D} = \tilde G^{\mathbb{C}}$ of the affine Kac–Moody group $\tilde G$ equipped with a certain invariant maximally noncompact inner product on $\tilde{\tilde{\mathcal{D}}} = \mathrm{Lie}(\tilde{\tilde D})$. We shall call $\tilde{\tilde D}$ the “affine Lu–Weinstein double”, since it turns out to be the natural affine generalization of the standard Lu–Weinstein double $D_0$ of the group $G_0$.

(ii) The Weyl alcove $A_+$ is usually viewed as the fundamental domain of the action of the affine Weyl group on the Cartan subalgebra $\mathcal{T}$ of $\mathcal{G}_0 \equiv \mathrm{Lie}(G_0)$. The affine Weyl group acts also on the Cartan subalgebra $\tilde{\mathcal{T}}$ of $\tilde{\mathcal{G}} \equiv \mathrm{Lie}(\tilde G)$. The alcove $\tilde A_+$ is again the fundamental domain of this action. It is also important to note that the second floor chiral Hamiltonian
\[ \tilde H_L = -\frac{1}{2\kappa} (\tilde\phi, \tilde\phi)_{\tilde{\mathcal{G}}^*} \tag{1.4} \]
does not depend on $q$. The q-dependence of its descendant $H_L^{qWZ}$ will turn out to be the fruit of the reduction. The reader will find more detailed explanations of all that in the body of the paper.

(iii) The q-deformations of the WZW model have already been studied in the literature [12, 16, 19]. However, in those cases either the worldsheet or the target of the σ-model was first kinematically deformed to become a lattice or a noncommutative manifold. Then a kind of (discrete or non-commutative) dynamics was formulated on this deformed background. What we are doing here is somewhat different; we avoid any preliminary kinematical deformation of the worldsheet or of the target. The q-deformed objects are generated dynamically. This means that they appear solely as the result of standard field theoretical quantization of some (chiral) classical theory whose phase space $M_L^{WZ}$ is topologically the same as that of the non-deformed chiral WZW theory. The things that get (smoothly) deformed are the symplectic form and the Hamiltonian function on this unchanged phase space.

1.3. Quasitriangular symplectic structure

As already stated in Remark (iii) above, the crucial result of the two-step symplectic reduction of (1.3) is the fact that the phase space of the quasitriangular chiral WZW model is topologically the same manifold as the phase space $M_L^{WZ}$ of the non-deformed standard chiral WZW theory. Recall that points in $M_L^{WZ}$ are the maps $m : \mathbb{R} \to G_0$, fulfilling the monodromy condition
\[ m(\sigma + 2\pi) = m(\sigma) M . \tag{1.5} \]
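Here the monodromy$^{b}$ $M = \exp(-2\pi i a^\mu H^\mu)$ sits in the fundamental Weyl alcove viewed as the subset of the maximal torus $T$ of $G_0$. In what follows, $a^\mu$ will be coordinates on the alcove $A_+$ corresponding to the choice of the orthonormal basis $H^\mu$ on the Cartan subalgebra.

Although the phase space $M_L^{WZ}$ is the same, the symplectic structure $\omega_L^{qWZ}$ and the Hamiltonian $H_L^{qWZ}$ differ, however, from their non-deformed WZW counterparts. One of the main results of this paper is the explicit description of the pair $(\omega_L^{qWZ}, H_L^{qWZ})$. Thus the symplectic structure corresponding to the two-form $\omega_L^{qWZ}$ is fully characterized by the following Poisson bracket

$$\{m(\sigma) \stackrel{\otimes}{,} m(\sigma')\}_{qWZ} = (m(\sigma) \otimes m(\sigma'))\,B_\varepsilon(a^\mu, \sigma - \sigma') + \varepsilon\hat r(\sigma - \sigma')\,(m(\sigma) \otimes m(\sigma')), \tag{1.6}$$

where $B_\varepsilon(a^\mu, \sigma)$ is the so-called quasitriangular braiding matrix given by

$$B_\varepsilon(a^\mu, \sigma) = -\frac{i}{\kappa}\,\rho\!\left(\frac{i\sigma}{2\kappa\varepsilon}, \frac{i\pi}{\kappa\varepsilon}\right) H^\mu \otimes H^\mu - \frac{i}{\kappa}\sum_{\alpha\in\Phi} \frac{|\alpha|^2}{2}\,\sigma_{a^\mu\langle\alpha, H^\mu\rangle}\!\left(\frac{i\sigma}{2\kappa\varepsilon}, \frac{i\pi}{\kappa\varepsilon}\right) E^\alpha \otimes E^{-\alpha}, \tag{1.7}$$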
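and $\hat r(\sigma)$ is defined as

$$\hat r(\sigma) = r + C\,\mathrm{cotg}\,\tfrac{1}{2}\sigma. \tag{1.8}$$

Here $r$ and $C$ are the ordinary (non-affine) r-matrix and Casimir elements, respectively, given by

$$r = \sum_{\alpha\in\Phi_+} \frac{i|\alpha|^2}{2}\,(E^{-\alpha} \otimes E^\alpha - E^\alpha \otimes E^{-\alpha}); \tag{1.9}$$

$$C = \sum_\mu H^\mu \otimes H^\mu + \sum_{\alpha\in\Phi_+} \frac{|\alpha|^2}{2}\,(E^{-\alpha} \otimes E^\alpha + E^\alpha \otimes E^{-\alpha}). \tag{1.10}$$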
The functions $\rho(z, \tau)$, $\sigma_w(z, \tau)$ are defined as (cf. [17, 20, 21])

$$\sigma_w(z, \tau) = \frac{\theta_1(w - z, \tau)\,\theta_1'(0, \tau)}{\theta_1(w, \tau)\,\theta_1(z, \tau)}, \qquad \rho(z, \tau) = \frac{\theta_1'(z, \tau)}{\theta_1(z, \tau)}, \tag{1.11}$$

where $\theta_1(z, \tau)$ is the Jacobi theta function

$$\theta_1(z, \tau) = -\sum_{j=-\infty}^{\infty} e^{\pi i (j + \frac{1}{2})^2 \tau + 2\pi i (j + \frac{1}{2})(z + \frac{1}{2})}. \tag{1.12}$$

$^{b}$ Sometimes people consider [6, 19, 28] the bigger chiral WZW phase space in the sense that $M$ can be an arbitrary element of $G_0$. Such an enlargement is useful for description of the (finite dimensional) quantum group symmetries of the standard WZW model, however, it is not necessary for recovering the full left-right WZW model by appropriate combination of two chiral models. The choice of the maximal torus monodromy is sufficient to do this job and we stick to it for the rest of this paper.
In (1.11), the prime $'$ means the derivative with respect to the first argument $z$, the argument $\tau$ (the modular parameter) is a nonzero complex number such that $\mathrm{Im}\,\tau > 0$.

Remark. In Secs. 5.1 and 5.2, we describe in detail the hard work needed to arrive from the Semenov-Tian-Shansky form on the double $\tilde{\tilde D}$ to the qWZW chiral Poisson bracket $\{\cdot, \cdot\}_{qWZ}$. In spite of the intermediate complicated calculations, the resulting formula (1.6) is simple and aesthetically appealing. Tantalizingly, $B_\varepsilon(a^\mu, \sigma)$ is the Felder elliptic dynamical r-matrix that appears in the Bernard generalization of the Knizhnik–Zamolodchikov equation [7, 20, 21]. There is an added advantage: the corresponding quantum dynamical R-matrix is known [20]. Without doubt, the concept of the dynamical Hopf algebroid corresponding to $R$ underlies an important part of the structure of a $q$-deformed conformal field theory.

In the limit $\varepsilon \to 0$, Eq. (1.6) becomes

$$\{m(\sigma) \stackrel{\otimes}{,} m(\sigma')\}_{(q=1)WZ} = (m(\sigma) \otimes m(\sigma'))\,B_0(a^\mu, \sigma - \sigma'), \tag{1.13}$$

where

$$B_0(a^\mu, \sigma) = -\frac{\pi}{\kappa}\Bigl[\eta(\sigma)\,(H^\mu \otimes H^\mu) - i\sum_{\alpha\in\Phi} \frac{|\alpha|^2}{2}\,\frac{\exp(i\pi\eta(\sigma)\langle\alpha, H^\mu\rangle a^\mu)}{\sin(\pi\langle\alpha, H^\mu\rangle a^\mu)}\,E^\alpha \otimes E^{-\alpha}\Bigr]. \tag{1.14}$$

Here $\eta(\sigma)$ is the function defined by

$$\eta(\sigma) = 2\left[\frac{\sigma}{2\pi}\right] + 1, \tag{1.15}$$

where $[\sigma/2\pi]$ is the largest integer less than or equal to $\sigma/2\pi$. It is shown in Sec. 3.2 that the relation (1.13) completely characterizes the symplectic structure $\omega_L^{WZ}$ of the standard non-deformed chiral WZW model.

1.4. Quasitriangular Hamiltonian

The Hamiltonian $H_L^{qWZ}$ descends (upon the reduction) from the second floor master Hamiltonian (1.4). We want to make explicit how $H_L^{qWZ}$ is defined as the function on the phase space $M_L^{WZ}$. For this purpose, it is more convenient to perform the classical (inverse) vertex-IRF transformation defined by

$$k(\sigma) = m(\sigma)\exp(i a^\mu H^\mu \sigma). \tag{1.16}$$
Note that $k(\sigma)$ then becomes periodic, hence an element of the loop group $G = LG_0$. Therefore, topologically, $M_L^{WZ} = LG_0 \times A_+$. We shall first start with the case $q = 1$ that gives the standard chiral WZW Hamiltonian. Some basic knowledge of the Poisson–Lie world is needed for understanding the case $q \neq 1$. The interested reader may consult Sec. 4.1 where the relevant Poisson–Lie notions are explained.
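The periodicity of $k(\sigma)$ can be checked directly from the monodromy condition (1.5) and the definition (1.16). The following is a short sketch; it assumes only that the Cartan generators $H^\mu$ commute among themselves, so that the exponentials below can be combined.

```latex
\begin{align*}
k(\sigma + 2\pi)
  &= m(\sigma + 2\pi)\,\exp\bigl(i a^\mu H^\mu (\sigma + 2\pi)\bigr) \\
  &= m(\sigma)\,M\,\exp(2\pi i a^\mu H^\mu)\,\exp(i a^\mu H^\mu \sigma)
     && \text{by (1.5)} \\
  &= m(\sigma)\,\exp(-2\pi i a^\mu H^\mu)\,\exp(2\pi i a^\mu H^\mu)\,
     \exp(i a^\mu H^\mu \sigma)
     && M = \exp(-2\pi i a^\mu H^\mu) \\
  &= m(\sigma)\,\exp(i a^\mu H^\mu \sigma) = k(\sigma).
\end{align*}
```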
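The standard chiral WZW Hamiltonian $H_L^{WZ}$ is usually written in the monodromic variables $m(\sigma)$ and is given by the Sugawara formula

$$H_L^{WZ}(m) = -\frac{1}{2\kappa}\,(\kappa\partial_\sigma m m^{-1}, \kappa\partial_\sigma m m^{-1})_{\mathcal G_0}. \tag{1.17}$$

The minus sign appears because in our conventions the form $(\cdot,\cdot)_{\mathcal G_0}$ is negative definite. In the variables $(k, a^\mu)$, it becomes

$$H_L^{WZ}(k, a^\mu) = -\frac{1}{2\kappa}(\phi, \phi)_{\mathcal G^*} - \langle\phi, k^{-1}\partial_\sigma k\rangle - \frac{\kappa}{2}\,(k^{-1}\partial_\sigma k, k^{-1}\partial_\sigma k)_{\mathcal G}, \tag{1.18}$$

where $(\hat\phi_\kappa)_0 = \phi$ (see the meaning of this notation in a while) and $(\cdot,\cdot)_{\mathcal G}$ is the invariant scalar product on the loop group Lie algebra $\mathcal G = L\mathcal G_0$. As always in the paper, $\langle\cdot,\cdot\rangle$ means the canonical pairing between the elements of mutually dual spaces. The formula (1.18) can be rewritten in the following way

$$H_L^{WZ}(k, a^\mu) = -\frac{1}{2\kappa}(\phi, \phi)_{\mathcal G^*} - (\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)^0. \tag{1.19}$$

It turns out that for generic $q$, (1.19) generalizes to

$$H_L^{qWZ}(k, a^\mu) = -\frac{1}{2\kappa}(\phi, \phi)_{\mathcal G^*} - (\widetilde{\mathrm{Dres}}_{\hat k}\,e^{\tilde\Lambda(\hat\phi_\kappa)})^0. \tag{1.20}$$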
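The equivalence of (1.18) and (1.19) can be made explicit. The following sketch uses the characterization (1.22) of the lifted alcove, written as the triple $\hat\phi_\kappa = (0, \phi, \kappa)^*$ in the notation of Sec. 2.1, together with the coadjoint action formula (2.23) in which $\partial$ is taken to be $\partial_\sigma$. Identifying the σ-shift part $(\cdot)^0$ with the first entry of the triple is an assumption of this sketch.

```latex
\begin{align*}
\widetilde{\mathrm{Coad}}_{\hat k}\,(C, \gamma, c)^*
  &= \Bigl(C + \langle\gamma, k^{-1}\partial_\sigma k\rangle
      + \tfrac{1}{2}\,c\,(k^{-1}\partial_\sigma k, k^{-1}\partial_\sigma k)_{\mathcal G},\;
      \ldots,\; c\Bigr)^*
     && \text{by (2.23)} \\
(\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)^0
  &= \langle\phi, k^{-1}\partial_\sigma k\rangle
     + \tfrac{\kappa}{2}\,(k^{-1}\partial_\sigma k, k^{-1}\partial_\sigma k)_{\mathcal G}
     && (C, \gamma, c) = (0, \phi, \kappa) \\
-\tfrac{1}{2\kappa}(\phi,\phi)_{\mathcal G^*}
  - (\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)^0
  &= -\tfrac{1}{2\kappa}(\phi,\phi)_{\mathcal G^*}
     - \langle\phi, k^{-1}\partial_\sigma k\rangle
     - \tfrac{\kappa}{2}\,(k^{-1}\partial_\sigma k, k^{-1}\partial_\sigma k)_{\mathcal G}.
\end{align*}
```

The last line is the right-hand side of (1.18).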
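Recall that $\hat\phi_\kappa$ is the function of $a^\mu$ hence the Hamiltonians really depend on the indicated variables.

Notation 1.1. (i) Denote $\tilde T^0 \in \tilde{\mathcal G}$ and $\tilde T^\infty \in \tilde{\mathcal G}$ the generators of the σ-shift and of the central circle, respectively. Then we have the linear space decomposition $\tilde{\mathcal G} = \mathbb R\tilde T^0 + \mathbb R\tilde T^\infty + \mathcal G$ and its dual $\tilde{\mathcal G}^* = \mathbb R\tilde t_0 + \mathbb R\tilde t_\infty + \mathcal G^*$. Now we define

$$\tilde x = \tilde x^0\,\tilde t_0 + \tilde x^\infty\,\tilde t_\infty + \tilde x_0, \qquad \tilde x \in \tilde{\mathcal G}^*, \quad \tilde x_0 \in \mathcal G^*. \tag{1.21}$$

Expressed in words, $\tilde x^0$ is the σ-shift part of $\tilde x$, $\tilde x^\infty$ the central circle part and $\tilde x_0$ the $\mathcal G^*$-part. The lifted alcove $\hat\phi_\kappa$ is then characterized by the relations

$$(\hat\phi_\kappa)^0 = 0, \qquad (\hat\phi_\kappa)^\infty = \kappa, \qquad (\hat\phi_\kappa)_0 = \phi. \tag{1.22}$$

(ii) Consider $\tilde{\mathcal B} = \mathrm{Lie}(\tilde B)$ and $\mathcal B = \mathrm{Lie}(B)$, where $\tilde B$, $B$ are respectively the dual Poisson–Lie groups$^{c}$ of $\tilde G$, $G$. There is the following unique decomposition of any element $\tilde b \in \tilde B$

$$\tilde b = \exp(\tilde b^0\,\tilde\Lambda(\tilde t_0))\,\exp(\tilde b^\infty\,\tilde\Lambda(\tilde t_\infty))\,\tilde b_0, \qquad \tilde b \in \tilde B, \quad \tilde b_0 \in B, \tag{1.23}$$

$^{c}$ The existence and properties of $\tilde B$, $B$ are implied by the structure of the Drinfeld double $\tilde{\tilde D}$.

where $\tilde b_0$ is the Poisson–Lie analogue of $\tilde x_0$ and the real numbers $\tilde b^0$, $\tilde b^\infty$ are the analogues of $\tilde x^0$, $\tilde x^\infty$. The map $\tilde\Lambda : \tilde{\mathcal G}^* \to \tilde{\mathcal B}$ is the identification map defined by the invariant bilinear form $(\cdot,\cdot)_{\tilde{\tilde{\mathcal D}}}$ on the Drinfeld double $\tilde{\tilde D}$ of $\tilde G$. This form can be arbitrarily normalized. This normalization parameter is actually the deformation parameter of our WZW story. We call it either $\varepsilon$ or $q$, with $q = e^\varepsilon$. Note that $\tilde\Lambda$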
then depends on $q$ which gives the $q$-dependence of the quasitriangular Hamiltonian (1.20).

(iii) $\widetilde{\mathrm{Coad}}$ means the $\tilde G$-coadjoint action on $\hat\phi_\kappa$ viewed as the element of $\tilde{\mathcal G}^*$ and $\widetilde{\mathrm{Dres}}$ means the $\tilde G$-dressing action on $e^{\tilde\Lambda(\hat\phi_\kappa)}$ viewed as the element of $\tilde B$.

(iv) $\hat G = \widehat{LG_0}$ is the principal $U(1)$-bundle over $G = LG_0$ with the projection $\pi$. Then $\hat k$ is any element of $\hat G$ such that $\pi(\hat k) = k$. Since $\tilde G = \mathbb R \times_{\hat S} \hat G$, we can view $\hat k$ also as the element of $\tilde G$ and it is in this sense that $\hat k$ appears in (1.19) and (1.20).

It is shown in Sec. 5.2.7 that the $q \to 1$ limit of $H_L^{qWZ}$ gives indeed $H_L^{WZ}$. We have already learned that the symplectic form $\omega_L^{qWZ}$ also has the correct $q = 1$ limit. Thus we conclude that the quasitriangular chiral WZW model is indeed the smooth deformation of its standard counterpart.

1.5. Quasitriangular classical action

We have just described explicitly the pair $(\omega_L^{qWZ}, H_L^{qWZ})$. Knowing these data, we can write down the following classical action of the deformed chiral WZW model

$$S_L^{qWZ}[\eta_L(\tau)] = \int \bigl[\eta_L^*\theta_L^{qWZ} - H_L^{qWZ}(\eta_L)\,d\tau\bigr]. \tag{1.24}$$
Here $\eta_L(\tau)$ is a trajectory in the phase space $M_L^{WZ}$ parametrized by the ordinary (continuous) time parameter $\tau$, $\theta_L^{qWZ}$ is a 1-form on the phase space called the symplectic potential and $\eta_L^*\theta_L^{qWZ}$ is its pullback by the map $\eta_L$. The symplectic form $\omega_L^{qWZ}$ on $M_L^{WZ}$ can then be written as

$$\omega_L^{qWZ} = d\theta_L^{qWZ}. \tag{1.25}$$

Consider manifolds $M_L^{WZ} = LG_0 \times A_+$ and $M_R^{WZ} = LG_0 \times A_-$. Here $A_- = -A_+$. The full left-right quasitriangular WZW model has the following classical action

$$S^{qWZ}[\eta_L, \eta_R, \lambda_\mu] = S_L^{qWZ}[\eta_L(\tau)] + S_R^{qWZ}[\eta_R(\tau)] + \int d\tau\,\lambda_\mu(\tau)\,(a_L^\mu(\tau) + a_R^\mu(\tau)). \tag{1.26}$$

Here $\eta_L = (k_L, a_L^\mu)$, $\eta_R = (k_R, a_R^\mu)$ with $k_{L,R} \in LG_0$. The left and right chiral actions $S_L^{qWZ}(k_L, a_L^\mu)$ and $S_R^{qWZ}(k_R, a_R^\mu)$ have exactly the same dependence on their respective variables, but $a_L^\mu$'s run over the positive standard Weyl alcove $A_+$ and $a_R^\mu$'s over the negative one $A_-$. Finally, the fields $\lambda_\mu$ are the Lagrange multipliers. We note that in the limit $q \to 1$ the action (1.26) reduces to the classical action of the standard full left-right WZW model written in the form of Ref. 13.

Remark. The variation of the action (1.24) does not depend on the choice of the symplectic potential $\theta_L^{qWZ}$ but only on the pair $(\omega_L^{qWZ}, H_L^{qWZ})$. This explains why one can give the meaning to the classical action (1.24) also in the case where $\omega_L^{qWZ}$ is not exact (i.e. there is no globally defined $\theta_L^{qWZ}$ such that (1.25) is valid).
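The statement of the Remark can be illustrated by a short computation. The sketch below assumes a trajectory $\eta_L : [\tau_1, \tau_2] \to M_L^{WZ}$ and a second potential $\theta' = \theta_L^{qWZ} + df$ with some function $f$ on the phase space; both $\theta'$ and $f$ are notation introduced only for this illustration. Two potentials of the same symplectic form differ by such an exact term at least locally.

```latex
\begin{align*}
d\theta' &= d\theta_L^{qWZ} + d(df) = \omega_L^{qWZ}, \\
\int_{\tau_1}^{\tau_2} \eta_L^*\theta'
  &= \int_{\tau_1}^{\tau_2} \eta_L^*\theta_L^{qWZ}
     + \int_{\tau_1}^{\tau_2} \frac{d}{d\tau} f(\eta_L(\tau))\,d\tau \\
  &= \int_{\tau_1}^{\tau_2} \eta_L^*\theta_L^{qWZ}
     + f(\eta_L(\tau_2)) - f(\eta_L(\tau_1)).
\end{align*}
```

The two actions thus differ only by a boundary term, so variations with fixed endpoints lead to the same equations of motion.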
1.6. Quasitriangular current algebra

It is of crucial importance to understand the symmetries of the models (1.24) and (1.26). We have learned already from the standard ($q = 1$) chiral WZW example that the canonical quantization of the model relies heavily on its symmetry structure. In fact, one needs the identification of suitable observables whose Poisson brackets get promoted to the quantum commutation relations. In the $q = 1$ case, such observables are the components of the Kac–Moody current $j = \kappa\partial_\sigma m m^{-1}$, which serve as the Hamiltonian charges generating the action of $G = LG_0$ on the phase space $M_L^{WZ} = LG_0 \times A_+$. The latter fact can be expressed succinctly by the following matrix Poisson brackets

$$\{k(\sigma) \stackrel{\otimes}{,} j(\sigma')\}_{WZ} = 2\pi C\,\delta(\sigma - \sigma')\,(k(\sigma) \otimes 1), \tag{1.27}$$

$$\{j(\sigma) \stackrel{\otimes}{,} j(\sigma')\}_{WZ} = \pi\delta(\sigma - \sigma')\,[C, j(\sigma) \otimes 1 - 1 \otimes j(\sigma')] + 2\pi\kappa C\,\partial_\sigma\delta(\sigma - \sigma'). \tag{1.28}$$

We note that the quantum versions of (1.27) and of (1.28) mean respectively that $k$ is the Kac–Moody primary field and that $j$ generates the action of $LG_0$ on the quantum Hilbert space.

The present paper even clarifies the symmetry structure of the standard chiral $q = 1$ WZW model. In fact, the second floor master model (1.2) is strictly symmetric with respect to the left $\tilde G$ action. The two-step symplectic reduction down to the chiral WZW model reduces this exact $\tilde G$-symmetry to anomalous $LG_0$-symmetry (1.28), generated by the current $j$.

It turns out that the quasitriangular picture is exactly analogous. The deformed chiral master model is strictly $\tilde G$ Poisson–Lie symmetric with respect to the left action of $\tilde G$. This means that the chiral master Hamiltonian $\tilde H_L$ is strictly $\tilde G$-invariant but $\tilde G$-invariance of the symplectic form $\Omega^q$ is broken in certain special (Poisson–Lie) way. Then also the $\tilde G$-invariance of the deformed chiral classical action (1.24) is broken in the special way dictated by the Poisson–Lie symmetry (see [4] for a nice general discussion of this issue). The two-step symplectic reduction then changes the strict $\tilde G$ Poisson–Lie symmetry into anomalous $LG_0$ Poisson–Lie symmetry, whose non-Abelian moment map will satisfy the q-deformed version of the current algebra. The precise way how the central anomaly manifests itself follows from the fundamental Poisson bracket (1.6) defining the chiral symplectic form $\omega_L^{WZ}$.

In order to find the quasitriangular analogue of the Kac–Moody current $j \in \mathcal G = L\mathcal G_0$, we first write $j$ in the variables $(k, a^\mu)$

$$j = -\kappa a^\mu k T^\mu k^{-1} + \kappa\partial_\sigma k k^{-1}. \tag{1.29}$$

It then follows

$$(j, \cdot)_{\mathcal G} = (\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)_0, \tag{1.30}$$
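where all necessary notations were already defined after Eq. (1.20). The q-Kac–Moody current $F(k, a^\mu) \in B$ is in turn given by

$$F = (\widetilde{\mathrm{Dres}}_{\hat k}\,e^{\tilde\Lambda(\hat\phi_\kappa)})_0, \tag{1.31}$$

in full analogy with (1.30). The Poisson brackets involving $F$ follow from the defining formula (1.6). Their calculation is somewhat involved but the result is very simple. It reads

$$\{k \stackrel{\otimes}{,} F\}_{qWZ} = \hat r^\kappa\,(k \otimes F), \tag{1.32}$$

$$\{F \stackrel{\otimes}{,} F\}_{qWZ} = [\hat r, F \otimes F], \tag{1.33}$$

$$\{F^\dagger \stackrel{\otimes}{,} F^\dagger\}_{qWZ} = -[\hat r, F^\dagger \otimes F^\dagger], \tag{1.34}$$

$$\{(F^\dagger)^{-1} \stackrel{\otimes}{,} F\}_{qWZ} = \hat r^{2\kappa}\,((F^\dagger)^{-1} \otimes F) - ((F^\dagger)^{-1} \otimes F)\,\hat r. \tag{1.35}$$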
The r-matrices $\hat r^\kappa$ and $\hat r^{2\kappa}$ are detailed in Secs. 5.2.5 and 5.2.6. The superscript $\kappa$ over $\hat r^\kappa$ indicates the presence of the central extension. In the limit $q \to 1$, the q-primary field condition (1.32) becomes (1.27) and the q-current algebra (1.33)–(1.35) yields (1.28).

Remarks. (i) We observe that the quantity $\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa \in \tilde{\mathcal G}^*$ is the observable of the standard chiral WZW model. Its $\mathcal G^*$-part $(\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)_0$ gives the Kac–Moody current $j$, its central circle part $(\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)^\infty$ gives the level $\kappa$ and its σ-shift part $(\widetilde{\mathrm{Coad}}_{\hat k}\,\hat\phi_\kappa)^0$ contributes to the Hamiltonian $H_L^{WZ}$. A similar observation can also be made for the variable $\widetilde{\mathrm{Dres}}_{\hat k}\,e^{\tilde\Lambda(\hat\phi_\kappa)} \in \tilde B$ in the quasitriangular case.

(ii) It is worth stressing again that the q-Kac–Moody brackets (1.33)–(1.35) can be derived from the fundamental exchange relation (1.6) containing the elliptic dynamical r-matrix (1.7). This suggests, in particular, that there might also be a lurking q-current algebra in the description of the standard WZW conformal blocks on elliptic curves, Hitchin systems, etc.

It is convenient to introduce a new dynamical variable defined by the relation

$$L = FF^\dagger. \tag{1.36}$$

The Poisson brackets (1.33)–(1.35) can then be equivalently rewritten in terms of only one relation

$$\begin{aligned}
\{L(\sigma) \stackrel{\otimes}{,} L(\sigma')\}_{qWZ} ={}& (L(\sigma) \otimes L(\sigma'))\,\varepsilon\hat r(\sigma - \sigma') + \varepsilon\hat r(\sigma - \sigma')\,(L(\sigma) \otimes L(\sigma')) \\
& - (1 \otimes L(\sigma'))\,\varepsilon\hat r(\sigma - \sigma' + 2i\varepsilon\kappa)\,(L(\sigma) \otimes 1) \\
& - (L(\sigma) \otimes 1)\,\varepsilon\hat r(\sigma - \sigma' - 2i\varepsilon\kappa)\,(1 \otimes L(\sigma')).
\end{aligned} \tag{1.37}$$

Our formula (1.37) coincides with the defining relation of the Poisson algebra introduced by Reshetikhin and Semenov-Tian-Shansky in Sec. 1 of their paper [41]. Note that we derived (1.37) from the symplectic structure characterized by the
qWZ bracket (1.6). In other words, we have constructed the dynamical system whose symmetry structure is given by the Reshetikhin and Semenov-Tian-Shansky Poisson algebra. Such a system was not known previously, therefore its construction constitutes one of the main original results of our paper. We also expect that the future quantization of our quasitriangular WZW model will give a (so far unknown) quantum dynamical system whose symmetry structure will be characterized by the q-current algebra introduced in Sec. 2 of [41]. By a slight abuse of notation we shall therefore refer to both quantities $F$ and $L$ as the q-currents.

Recall that the non-deformed current $j$ can be simply expressed in terms of the primary field $m(\sigma)$

$$j = \kappa\partial_\sigma m m^{-1}. \tag{1.38}$$

This relation can be called the classical Knizhnik–Zamolodchikov equation [36] since its quantum version becomes indeed the KZ-equation written in the operatorial form [23]. It turns out that the q-current $L(\sigma)$ can also be simply expressed in terms of the q-primary field $m(\sigma)$

$$L(\sigma) = m(\sigma + i\kappa\varepsilon)\,m^{-1}(\sigma - i\varepsilon\kappa). \tag{1.39}$$
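To see how (1.39) relates to (1.38), one can expand (1.39) to first order in $\varepsilon$. The following is only a sketch: it assumes that $m(\sigma)$ extends analytically to complex values of $\sigma$ near the real axis, so that the shifted arguments can be Taylor expanded.

```latex
\begin{align*}
m(\sigma + i\kappa\varepsilon)
  &= m + i\kappa\varepsilon\,\partial_\sigma m + O(\varepsilon^2), \\
m^{-1}(\sigma - i\varepsilon\kappa)
  &= m^{-1} - i\kappa\varepsilon\,\partial_\sigma(m^{-1}) + O(\varepsilon^2)
   = m^{-1} + i\kappa\varepsilon\,m^{-1}(\partial_\sigma m)\,m^{-1} + O(\varepsilon^2), \\
L(\sigma)
  &= 1 + 2i\kappa\varepsilon\,\partial_\sigma m\,m^{-1} + O(\varepsilon^2)
   = 1 + 2i\varepsilon\,j(\sigma) + O(\varepsilon^2).
\end{align*}
```

Under this assumption, the first nontrivial term of the expansion of (1.39) reproduces the current (1.38).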
This nice relation can be interpreted as the classical q-KZ equation. As expected, it is not differential but rather a difference equation.

1.7. The plan of the paper

In Sec. 2, we first explain the crucial notion of the central biextension $\tilde G$ of a Lie group $G$ and we derive explicit formulae for adjoint and coadjoint actions of $\tilde G$. Then we detail our basic observation that the two-step symplectic reduction of the master model (1.1) on $\tilde G$ gives the standard WZW model on $G$. We shall also see that this construction can be performed for any central biextension; the affine Kac–Moody group being only the special case. We are thus led to the notion of the universal WZW model.

In Sec. 3, we study the case of the affine Kac–Moody group. We show that the master model on $\tilde G$ can be decomposed in two copies of the simpler chiral model. Then we perform the two-step chiral symplectic reduction to obtain the standard chiral WZW model and we detail the symplectic structure of the model in the $(k, a^\mu)$ variables.

We devote Sec. 4 to the construction of the universal quasitriangular model based on any Drinfeld double $\tilde{\tilde D}$ of the biextended group $\tilde G$. We first review some basic notions of the theory of the Poisson–Lie groups and then we identify which conditions $\tilde{\tilde D}$ must fulfil in order that the two-step reduction could be performed. We call such good doubles $\tilde{\tilde D}$ the WZW doubles of $\tilde G$. The rest of the section is devoted to the construction of the affine Lu–Weinstein double $\tilde G^{\mathbb C}$ and to proving that it does fulfil the required conditions. The quasitriangular WZW model based
on this particular double is the q-deformation of the standard loop group WZW model.

The core of the paper is Sec. 5. Starting from the affine Lu–Weinstein double, we first explain the construction of the deformed chiral geodesical model (1.3). Then we perform the two-step symplectic reduction down to the quasitriangular chiral WZW model. We shall make explicit the symplectic structure $\omega_L^{qWZ}$ and the Hamiltonian $H_L^{qWZ}$ of the model, introduce the q-current algebra and show how its commutation relations follow from the symplectic structure $\omega_L^{qWZ}$. We also show that the model has the correct $q \to 1$ limit and finally we combine the two quasitriangular chiral WZW models to obtain the full left-right theory.

In Sec. 6, we summarize the results and provide an outlook. In particular, we outline the further generalizations of the construction towards the q-deformation of the whole WZW factory, draw the plausible picture of the quantization of the model and also furnish some remarks on the role of the Virasoro group in the q-deformed case.

The Appendices are of three types: (1) they provide more background material for better understanding of the article; (2) they contain the detailed technical proofs of some assertions in the text; (3) they give alternative derivations of some results.
2. Universal WZW Model

In this section, we shall be very general and shall work with an arbitrary Lie group $\tilde G$ which is the central biextension of a Lie group $G$. The loop group case leading to the standard WZW model will be the only special (though very important) example of our construction. Indeed, we shall see that the WZW-like symplectic structure is “universal”; it can be defined not only for the loop groups and it does not depend on the detailed structure of the group multiplication or of the central extension.

2.1. Central biextension

Consider a central extension $\hat G$ of a Lie group $G$ by the circle group $U(1)$. This means that there is the exact sequence of morphisms of groups

$$1 \to U(1) \to \hat G \xrightarrow{\ \pi\ } G \to 1, \tag{2.1}$$

where $U(1)$ is injected into the centre of $\hat G$. The morphism from $\hat G$ to $G$ is denoted as $\pi$. Note that the circle fibration over the base $G$ can be topologically nontrivial. The fundamental example of this exact sequence is the famous central extension of the loop groups $LG_0$. It is reviewed in the Appendix A.1. The reader is invited to consult this appendix whenever an illustration of the general scheme presented in this section is needed.
It is clear that the exact sequence (2.1) of groups induces the following exact sequence of their Lie algebras

$$0 \to \mathbb R \to \hat{\mathcal G} \xrightarrow{\ \pi_*\ } \mathcal G \to 0. \tag{2.2}$$

Here $\pi_* : \hat{\mathcal G} \to \mathcal G$ is the Lie algebra homomorphism induced by the group homomorphism $\pi$. In general, there need not exist a canonical map between $\hat{\mathcal G}$ and $\mathcal G$ that would go in the opposite direction of $\pi_*$. Suppose, however, that we have chosen such a map$^{d}$ $\iota : \mathcal G \to \hat{\mathcal G}$. We do not suppose that $\iota$ is a homomorphism of Lie algebras! We just require that $\iota$ be a linear injection of $\mathcal G$ into $\hat{\mathcal G}$ fulfilling the following condition

$$\pi_*(\iota(\xi)) = \xi \tag{2.3}$$

for every $\xi \in \mathcal G$.

The existence of the map $\iota$ immediately implies that the structure of the Lie algebra $\hat{\mathcal G}$ of $\hat G$ must be given by the following commutator$^{e}$

$$[\hat\xi, \hat\eta] = \iota([\pi_*\hat\xi, \pi_*\hat\eta]) + \rho(\pi_*\hat\xi, \pi_*\hat\eta)\,\hat T^\infty \tag{2.4}$$

for some cocycle $\rho : \mathcal G \wedge \mathcal G \to \mathbb R$. Recall that the cocycle condition means

$$\rho([\xi, \eta], \zeta) + \rho([\eta, \zeta], \xi) + \rho([\zeta, \xi], \eta) = 0, \qquad \xi, \eta, \zeta \in \mathcal G. \tag{2.5}$$

The elements $\hat\xi$, $\hat\eta$ in (2.4) are from $\hat{\mathcal G}$, the element $\hat T^\infty \in \hat{\mathcal G}$ corresponds to the generator of $U(1)$ injected in $\hat G$ according to the exact sequence (2.1). In other words

$$\pi_*\hat T^\infty = 0. \tag{2.6}$$

In particular, the relation (2.4) implies

$$[\iota(\xi), \iota(\eta)] = \iota([\xi, \eta]) + \rho(\xi, \eta)\,\hat T^\infty. \tag{2.7}$$

$^{d}$ If $\hat G$ is constructed from $G$ in a suitable way, the existence of a natural $\iota$ may be the consequence of this construction; this happens in the case of the central extensions of the loop groups of simple compact Lie groups.

$^{e}$ We use the same symbol $[\cdot, \cdot]$ for the commutators of different Lie algebras. It should be clear which usage we have in mind by realizing to which Lie algebra the arguments of the commutator belong.

Suppose that there is a one-parameter subgroup $\hat S$ of automorphisms of $\hat G$ commuting with the central circle action. It then gives rise to the one-parameter subgroup $S$ of automorphisms of $G$. We denote as $\partial$ the generator of $S$ and we define as follows:

Definition 2.1. The group $\tilde G = \mathbb R \times_{\hat S} \hat G$ is the central biextension if

$$\rho(\xi, \eta) = (\xi, \partial\eta)_{\mathcal G}, \tag{2.8}$$
where $(\cdot,\cdot)_{\mathcal G}$ is a symmetric non-degenerate invariant bilinear form on $\mathcal G = \mathrm{Lie}(G)$, such that

$$(\xi, \partial\eta)_{\mathcal G} + (\partial\xi, \eta)_{\mathcal G} = 0. \tag{2.9}$$
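It may help to see why a form of the type (2.8) is automatically a cocycle in the sense of (2.5). The sketch below assumes, beyond (2.9), that $\partial$ acts as a derivation of the Lie bracket of $\mathcal G$ (as the generator of a one-parameter group of automorphisms does), and it uses the invariance $([a, b], c)_{\mathcal G} = (a, [b, c])_{\mathcal G}$ and the symmetry of the form.

```latex
\begin{align*}
\rho([\eta,\zeta],\xi)
  &= ([\eta,\zeta], \partial\xi)_{\mathcal G}
   = -(\partial[\eta,\zeta], \xi)_{\mathcal G}
     && \text{by (2.9)} \\
  &= -([\partial\eta,\zeta], \xi)_{\mathcal G} - ([\eta,\partial\zeta], \xi)_{\mathcal G}
     && \partial \text{ is a derivation} \\
  &= -(\partial\eta, [\zeta,\xi])_{\mathcal G} - (\partial\zeta, [\xi,\eta])_{\mathcal G}
     && \text{invariance, antisymmetry of } [\cdot,\cdot] \\
  &= -\rho([\zeta,\xi],\eta) - \rho([\xi,\eta],\zeta)
     && \text{symmetry of the form.}
\end{align*}
```

Moving all terms to one side gives (2.5). The antisymmetry $\rho(\xi, \eta) = -\rho(\eta, \xi)$ follows directly from (2.9) and the symmetry of the form.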
If $\tilde G$ is the central biextension, then there is a canonical symmetric non-degenerate invariant bilinear form on $\tilde{\mathcal G} = \mathrm{Lie}(\tilde G)$. It is given by

$$((iX, \xi, ix), (iY, \eta, iy))_{\tilde{\mathcal G}} = (\xi, \eta)_{\mathcal G} - Xy - Yx. \tag{2.10}$$

Convention 2.2. The generator of $\hat S$ in $\tilde{\mathcal G}$ will be denoted either as $\tilde T^0$ or as $(i, 0, 0)$. The elements $\iota(\xi)$ and $\hat T^\infty$ of $\hat{\mathcal G}$ will be denoted either as $\tilde\iota(\xi)$ and $\tilde T^\infty$ or as $(0, \xi, 0)$ and $(0, 0, i)$ when considered as the elements of the Lie algebra $\tilde{\mathcal G}$.

Remark. The theory of the biextension was developed by Medina and Revoy [32] in their quest to classify the Lie algebras admitting the symmetric non-degenerate invariant bilinear form. They used the terminology “double extension”. We take the liberty of modifying this name to “biextension” since in our paper, the word “double” will often appear in a different sense. It should also be noted that the extensions considered here form only the subclass of the Medina–Revoy extensions. For the physics-oriented survey of the subject, see also [22].

It is not difficult to verify directly the invariance of the bilinear form $(\cdot,\cdot)_{\tilde{\mathcal G}}$. For this, it is also useful to write the commutator in $\tilde{\mathcal G}$ in terms of the notation above

$$[(iX, \xi, ix), (iY, \eta, iy)] = (i0,\ [\xi, \eta] + X\partial\eta - Y\partial\xi,\ i(\xi, \partial\eta)_{\mathcal G}). \tag{2.11}$$
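The invariance claimed above can be verified in a few lines. The sketch below writes $A = (iX, \xi, ix)$, $B = (iY, \eta, iy)$, $C = (iZ, \zeta, iz)$ (labels introduced only for this check) and uses (2.10), (2.11) and (2.9), together with the invariance and symmetry of $(\cdot,\cdot)_{\mathcal G}$.

```latex
\begin{align*}
([A, B], C)_{\tilde{\mathcal G}}
  &= ([\xi,\eta] + X\partial\eta - Y\partial\xi,\ \zeta)_{\mathcal G}
     - Z\,(\xi, \partial\eta)_{\mathcal G}, \\
(B, [A, C])_{\tilde{\mathcal G}}
  &= (\eta,\ [\xi,\zeta] + X\partial\zeta - Z\partial\xi)_{\mathcal G}
     - Y\,(\xi, \partial\zeta)_{\mathcal G}, \\
([A, B], C)_{\tilde{\mathcal G}} + (B, [A, C])_{\tilde{\mathcal G}}
  &= \bigl[([\xi,\eta],\zeta)_{\mathcal G} + (\eta,[\xi,\zeta])_{\mathcal G}\bigr]
     + X\bigl[(\partial\eta,\zeta)_{\mathcal G} + (\eta,\partial\zeta)_{\mathcal G}\bigr] \\
  &\quad - Y\bigl[(\partial\xi,\zeta)_{\mathcal G} + (\xi,\partial\zeta)_{\mathcal G}\bigr]
     - Z\bigl[(\xi,\partial\eta)_{\mathcal G} + (\eta,\partial\xi)_{\mathcal G}\bigr]
   = 0.
\end{align*}
```

The first bracket vanishes by the invariance of $(\cdot,\cdot)_{\mathcal G}$ and the remaining three vanish by (2.9).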
The following theorem is of great importance for our construction.

Theorem 2.3. Suppose that an element $\hat g \in \hat G$ is viewed as the element of $\tilde G$, where $\tilde G$ is the central biextension of $G$. Then the adjoint action of $\hat g$ on the Lie algebra $\tilde{\mathcal G}$ has the following explicit form

$$\widetilde{\mathrm{Ad}}_{\hat g}(iX, \xi, ix) = \bigl(iX,\ \mathrm{Ad}_g\xi - X\partial g g^{-1},\ ix - i(g^{-1}\partial g, \xi)_{\mathcal G} + \tfrac{1}{2}iX(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}\bigr). \tag{2.12}$$

Convention 2.4. Since the large part of our technical work will consist of “travelling” between the groups $G$, $\hat G$ and $\tilde G$, we should be very careful in book-keeping with respect to which group the certain operations are considered. Thus, e.g. $\widetilde{\mathrm{Ad}}$ means that the adjoint operation is taken with respect to the group $\tilde G$ and we shall always use the convention that $g = \pi(\hat g)$, if both $g$ and $\hat g$ appear in the same formula. Moreover, we denote

$$-\partial g g^{-1} \equiv \mathrm{Ad}_g\partial - \partial; \qquad g^{-1}\partial g \equiv \mathrm{Ad}_{g^{-1}}\partial - \partial. \tag{2.13}$$

The Ad operation in (2.13) is taken in the group $\mathbb R \times_S G$. We do not denote this group by a special symbol because it will appear less frequently than its colleagues mentioned above. Finally, it is clear that both $\partial g g^{-1}$ and $g^{-1}\partial g$ live in $\mathcal G$.
Remark. In the context of the biextensions of the loop groups, $\partial g g^{-1}$ equals $\partial_\sigma g g^{-1}$, where $\partial_\sigma$ is the derivative with respect to the loop parameter.

Proof of Theorem 2.3. First we show that

$$\widetilde{\mathrm{Ad}}_{\hat g}(0, \xi, ix) = (0,\ \mathrm{Ad}_g\xi,\ ix - i(g^{-1}\partial g, \xi)_{\mathcal G}). \tag{2.14}$$

In this special case, we can forget about the group $\tilde G$ and work only with $\hat G$. The reason is that $(0, \xi, ix)$ is in $\hat{\mathcal G}$ and $\hat G$ is the subgroup of $\tilde G$. First of all, we have

$$\pi_*(\widehat{\mathrm{Ad}}_{\hat g}\,\iota(\xi)) = \mathrm{Ad}_g\xi, \tag{2.15}$$

because $\pi$ is the homomorphism of groups. Moreover,

$$\widehat{\mathrm{Ad}}_{\hat g}\,\hat T^\infty = \hat T^\infty, \tag{2.16}$$

because $\hat T^\infty$ is the generator of the central circle. Thus we conclude that

$$\widehat{\mathrm{Ad}}_{\hat g}(\iota(\xi) + x\hat T^\infty) = \iota(\mathrm{Ad}_g\xi) + (x - F(g, \xi))\,\hat T^\infty, \tag{2.17}$$

where $F(g, \xi)$ is a function to be determined. First of all, we know $F(g, \xi)$ at the group origin $g = e$ (where it vanishes) and near the group origin

$$F(\chi, \xi) = (\partial\chi, \xi)_{\mathcal G}, \qquad \chi \in \mathcal G. \tag{2.18}$$

Moreover, it is easy to check that $F(g, \xi)$ has to verify the following cocycle condition

$$F(g_1 g_2, \xi) = F(g_2, \xi) + F(g_1, \mathrm{Ad}_{g_2}\xi). \tag{2.19}$$

Infinitesimally,

$$F(g_1\chi, \xi) = (\partial\chi, \xi)_{\mathcal G} + F(g_1, [\chi, \xi]). \tag{2.20}$$

Thus we obtain a first order differential equation with the known initial condition. One readily checks that

$$F(g, \xi) = (g^{-1}\partial g, \xi)_{\mathcal G} \tag{2.21}$$

is its solution.

The proof of the formula (2.12) with $X \neq 0$ is then like in [40], i.e., we know that the right-hand side of (2.12) must have the form $(iX, \ldots, \ldots)$ and then one directly checks that the formula (2.12) is the only possible one preserving the invariance of the bilinear form $(\cdot,\cdot)_{\tilde{\mathcal G}}$. The theorem is proved.

It will also be useful to have the explicit expressions for the invariant bilinear form on the dual $\tilde{\mathcal G}^*$ and for the coadjoint action of $\hat g$ on $\tilde{\mathcal G}^*$. They read, respectively

$$((A, \alpha, a)^*, (B, \beta, b)^*)_{\tilde{\mathcal G}^*} = (\alpha, \beta)_{\mathcal G^*} - Ab - Ba; \tag{2.22}$$

$$\widetilde{\mathrm{Coad}}_{\hat g}(C, \gamma, c)^* = \bigl(C + \langle\gamma, g^{-1}\partial g\rangle + \tfrac{1}{2}c\,(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G},\ \mathrm{Coad}_g\gamma + c\,\Upsilon^{-1}(\partial g g^{-1}),\ c\bigr)^*. \tag{2.23}$$
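Returning to the proof of Theorem 2.3, the claim that (2.21) satisfies the cocycle condition (2.19) can be checked directly from the convention (2.13). The sketch below additionally uses that Ad is a group action, so that $\mathrm{Ad}_{(g_1g_2)^{-1}} = \mathrm{Ad}_{g_2^{-1}}\mathrm{Ad}_{g_1^{-1}}$, and the Ad-invariance of $(\cdot,\cdot)_{\mathcal G}$.

```latex
\begin{align*}
(g_1g_2)^{-1}\partial(g_1g_2)
  &\equiv \mathrm{Ad}_{g_2^{-1}}\mathrm{Ad}_{g_1^{-1}}\partial - \partial
   = \mathrm{Ad}_{g_2^{-1}}\bigl(g_1^{-1}\partial g_1 + \partial\bigr) - \partial \\
  &= \mathrm{Ad}_{g_2^{-1}}\bigl(g_1^{-1}\partial g_1\bigr) + g_2^{-1}\partial g_2, \\
F(g_1g_2, \xi)
  &= \bigl(\mathrm{Ad}_{g_2^{-1}}(g_1^{-1}\partial g_1), \xi\bigr)_{\mathcal G}
     + (g_2^{-1}\partial g_2, \xi)_{\mathcal G} \\
  &= (g_1^{-1}\partial g_1, \mathrm{Ad}_{g_2}\xi)_{\mathcal G}
     + (g_2^{-1}\partial g_2, \xi)_{\mathcal G}
   = F(g_1, \mathrm{Ad}_{g_2}\xi) + F(g_2, \xi).
\end{align*}
```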
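We also have

$$\widehat{\mathrm{Coad}}_{\hat g}(\pi^*(\gamma) + c\,\hat t_\infty) = \pi^*(\mathrm{Coad}_g\gamma + c\,\Upsilon^{-1}(\partial g g^{-1})) + c\,\hat t_\infty. \tag{2.24}$$

Convention 2.5. The map $\Upsilon : \mathcal G^* \to \mathcal G$ is defined by the invariant bilinear form $(\cdot,\cdot)_{\mathcal G}$. The decomposition $\tilde{\mathcal G} = \mathbb R\tilde T^0 + \tilde\iota(\mathcal G) + \mathbb R\tilde T^\infty$ induces the decomposition of the dual $\tilde{\mathcal G}^*$, hence every $\tilde\alpha \in \tilde{\mathcal G}^*$ can be cast as$^{f}$

$$\tilde\alpha = (A, \alpha, a)^*, \tag{2.25}$$

where $A$, $a$ are in $\mathbb R$ and $\alpha$ in $\mathcal G^*$. The element $(1, 0, 0)^*$ will be sometimes denoted as $\tilde t_0$, $(0, \alpha, 0)^*$ as $\tilde\pi^*(\alpha)$ and $(0, 0, 1)^*$ as $\tilde t_\infty$. In a similar way, the decomposition $\hat{\mathcal G} = \iota(\mathcal G) + \mathbb R\hat T^\infty$ induces

$$\hat\alpha = \pi^*(\alpha) + a\,\hat t_\infty, \qquad \hat\alpha \in \hat{\mathcal G}^*, \quad \alpha \in \mathcal G^*, \tag{2.26}$$

where $\hat t_\infty \in \hat{\mathcal G}^*$ is characterized by the property

$$\langle\hat t_\infty, \iota(\mathcal G)\rangle = 0, \qquad \langle\hat t_\infty, \hat T^\infty\rangle = 1. \tag{2.27}$$

Of course, $\pi^* : \mathcal G^* \to \hat{\mathcal G}^*$ is the map dual to $\pi_* : \hat{\mathcal G} \to \mathcal G$ defined by the exact sequence (2.2). Note also that by definition

$$\langle\widetilde{\mathrm{Coad}}_{\hat g}\,\tilde\gamma, \tilde\xi\rangle = \langle\tilde\gamma, \widetilde{\mathrm{Ad}}_{\hat g^{-1}}\,\tilde\xi\rangle, \tag{2.28}$$

where the pairing $\langle\cdot,\cdot\rangle$ is clearly defined as

$$\langle(C, \gamma, c)^*, (X, \xi, x)\rangle = CX + cx + \langle\gamma, \xi\rangle. \tag{2.29}$$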
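The coadjoint formula (2.23) can be recovered from Theorem 2.3 and the definition (2.28). The following sketch writes the Lie algebra element as $(iX, \xi, ix)$ and reads the pairing (2.29) on it as $CX + cx + \langle\gamma,\xi\rangle$, which is an interpretation of the notation made here. It also uses the convention (2.13), which gives $-\partial(g^{-1})\,g = g^{-1}\partial g$ and $(g^{-1})^{-1}\partial(g^{-1}) = -\partial g g^{-1}$, the relation $\partial g g^{-1} = \mathrm{Ad}_g(g^{-1}\partial g)$, and the identification $\langle\Upsilon^{-1}(\eta), \xi\rangle = (\eta, \xi)_{\mathcal G}$.

```latex
\begin{align*}
\widetilde{\mathrm{Ad}}_{\hat g^{-1}}(iX, \xi, ix)
  &= \bigl(iX,\ \mathrm{Ad}_{g^{-1}}\xi + X g^{-1}\partial g,\
      ix + i(\partial g g^{-1}, \xi)_{\mathcal G}
      + \tfrac{1}{2} iX (g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}\bigr), \\
\langle (C,\gamma,c)^*, \widetilde{\mathrm{Ad}}_{\hat g^{-1}}(iX, \xi, ix)\rangle
  &= X\bigl[C + \langle\gamma, g^{-1}\partial g\rangle
      + \tfrac{1}{2} c\,(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}\bigr] \\
  &\quad + \langle \mathrm{Coad}_g\gamma + c\,\Upsilon^{-1}(\partial g g^{-1}), \xi\rangle
      + c\,x.
\end{align*}
```

Reading off the coefficients of $X$, $\xi$ and $x$ gives the three entries of (2.23).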
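$^{f}$ Note that there is no $i$ in the dual objects. It is consistent to use such a notation because we shall never use in this paper the dual of the complexified algebra $\tilde{\mathcal G}^{\mathbb C}$.

2.2. The symplectic reduction

2.2.1. Second floor master model on $\tilde G$

As we have stated in the Introduction, the classical action of the geodesical model on $\tilde G$ reads

$$\tilde S(\tilde g) = -\frac{\kappa}{4}\int d\tau\,\Bigl(\tilde g^{-1}\frac{d}{d\tau}\tilde g,\ \tilde g^{-1}\frac{d}{d\tau}\tilde g\Bigr)_{\tilde{\mathcal G}}, \tag{2.30}$$

where $\tilde g(\tau) \in \tilde G$ and $\kappa$ is a parameter playing the role of the level of the WZW model. We first rewrite this action in the first order Hamiltonian form

$$\tilde S(\tilde\beta_L, \tilde g) = \int d\tau\,\Bigl[\Bigl\langle\tilde\beta_L, \frac{d\tilde g}{d\tau}\tilde g^{-1}\Bigr\rangle + \frac{1}{\kappa}(\tilde\beta_L, \tilde\beta_L)_{\tilde{\mathcal G}^*}\Bigr]. \tag{2.31}$$

Here $\tilde\beta_L \in \tilde{\mathcal G}^*$ are the “momentum” and $\tilde g$ the position coordinates induced by the right trivialization of the phase space $T^*\tilde G$. The symplectic potential on $T^*\tilde G$ is

$$\tilde\theta = \langle\tilde\beta_L, d\tilde g\,\tilde g^{-1}\rangle \tag{2.32}$$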
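in agreement with the general explanation after formula (1.3) in the Introduction. The reader can find in Appendix A.2 the detailed account of the canonical symplectic structure on the cotangent bundle of the group manifold. It is clear that we can obtain (2.30) from (2.31) by eliminating $\tilde\beta_L$ via the field equations.

2.2.2. The symplectic reduction to the first floor $\hat G$

The field multiplet $\tilde\beta_L \in \tilde{\mathcal G}^*$, $\tilde g \in \tilde G$ of (2.31) can also be parametrized as follows

$$\tilde\beta_L = \widetilde{\mathrm{Coad}}_u\,\tilde\gamma_L, \qquad \tilde g = u\hat g u, \tag{2.33}$$

where $\tilde\gamma_L \in \tilde{\mathcal G}^*$, $u = \exp s\tilde T^0 \in \tilde G$ and $\hat g \in \hat G$. Using the invariance of the form $(\cdot,\cdot)_{\tilde{\mathcal G}^*}$, the action written in the new variables becomes

$$\tilde S(\tilde\gamma_L, \hat g, s) = \int d\tau\,\Bigl[\langle\tilde\gamma_L, (\tilde T^0 + \hat g\tilde T^0\hat g^{-1})\rangle\frac{ds}{d\tau} + \Bigl\langle\tilde\gamma_L, \frac{d\hat g}{d\tau}\hat g^{-1}\Bigr\rangle + \frac{1}{\kappa}(\tilde\gamma_L, \tilde\gamma_L)_{\tilde{\mathcal G}^*}\Bigr]. \tag{2.34}$$

Following our list of conventions, we can represent $\tilde\gamma_L$ as

$$\tilde\gamma_L = (\gamma_L^0, \gamma_L, \gamma_L^\infty)^*. \tag{2.35}$$

We can introduce once again a new way of parametrizing the phase space $T^*\tilde G$, now by the set of coordinates $(\gamma_L^s, \gamma_L, \gamma_L^\infty, s, \hat g)$, where

$$\gamma_L^s = \langle\tilde\gamma_L, (\tilde T^0 + \hat g\tilde T^0\hat g^{-1})\rangle = 2\gamma_L^0 - \langle\gamma_L, \partial g g^{-1}\rangle + \frac{1}{2}\gamma_L^\infty\,(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}. \tag{2.36}$$

The action in these newest coordinates becomes

$$\begin{aligned}
\tilde S(\gamma_L^s, \gamma_L, \gamma_L^\infty, \hat g, s) ={}& \int d\tau\,\Bigl[\gamma_L^s\frac{ds}{d\tau} - \frac{1}{\kappa}\gamma_L^s\gamma_L^\infty\Bigr] + \int d\tau\,\Bigl[\Bigl\langle\hat\gamma_L, \frac{d\hat g}{d\tau}\hat g^{-1}\Bigr\rangle + \frac{1}{\kappa}(\gamma_L, \gamma_L)_{\mathcal G^*} \\
& - \frac{\gamma_L^\infty}{\kappa}\langle\gamma_L, \partial g g^{-1}\rangle + \frac{1}{2\kappa}(\gamma_L^\infty)^2\,(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}\Bigr].
\end{aligned} \tag{2.37}$$

Here we have used the formula (2.12), the explicit form (2.22) of the scalar product on $\tilde{\mathcal G}^*$, and we have set

$$\hat\gamma_L = (0, \gamma_L, \gamma_L^\infty)^*. \tag{2.38}$$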
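The second equality in (2.36) follows from Theorem 2.3. As a brief sketch, set $(iX, \xi, ix) = (i, 0, 0)$ in (2.12), recall from Convention 2.2 that $\tilde T^0 = (i, 0, 0)$, and pair the result with $\tilde\gamma_L = (\gamma_L^0, \gamma_L, \gamma_L^\infty)^*$. The pairing (2.29) is read here on a triple $(iX, \xi, ix)$ as $CX + cx + \langle\gamma, \xi\rangle$, which is an interpretation assumed for this check.

```latex
\begin{align*}
\hat g\,\tilde T^0\hat g^{-1} = \widetilde{\mathrm{Ad}}_{\hat g}(i, 0, 0)
  &= \bigl(i,\ -\partial g g^{-1},\
      \tfrac{1}{2} i\,(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}\bigr), \\
\langle\tilde\gamma_L, \tilde T^0\rangle &= \gamma_L^0, \\
\langle\tilde\gamma_L, \hat g\,\tilde T^0\hat g^{-1}\rangle
  &= \gamma_L^0 - \langle\gamma_L, \partial g g^{-1}\rangle
     + \tfrac{1}{2}\gamma_L^\infty (g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}, \\
\gamma_L^s
  &= 2\gamma_L^0 - \langle\gamma_L, \partial g g^{-1}\rangle
     + \tfrac{1}{2}\gamma_L^\infty (g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}.
\end{align*}
```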
The reader should pay attention to the distribution of hats and tildes. Of course, we still use the convention that $g = \pi(\hat g)$, if both $g$ and $\hat g$ appear in the same formula. The terms containing (not containing) the level $\kappa$ encode the Hamiltonian (the symplectic potential $\tilde\theta$) in our new coordinates. It is clear that the coordinate $\gamma_L^s$
Poisson-commutes$^{g}$ with the Hamiltonian since the latter does not contain the variable $s$. We can therefore consistently set $\gamma_L^s = 0$ in the action (2.37). This is the so-called symplectic reduction of the dynamical system (2.31) with respect to the moment map generating the axial action of $\tilde T^0$ on $T^*\tilde G$.

$^{g}$ Of course, this can be seen also from the fact that $\gamma_L^s$ is the moment map generating the axial action $\tilde g \to u\tilde g u$, $\tilde\gamma_L \to \mathrm{Coad}_u\tilde\gamma_L$, $u = \exp s\tilde T^0$ and the Hamiltonian $\tilde H = -\frac{1}{2\kappa}(\tilde\gamma_L, \tilde\gamma_L)_{\tilde{\mathcal G}^*}$ is clearly invariant with respect to this action.

The resulting reduced action reads

$$\hat S(\hat\gamma_L, \hat g) = \int d\tau\,\Bigl[\Bigl\langle\hat\gamma_L, \frac{d\hat g}{d\tau}\hat g^{-1}\Bigr\rangle + \frac{1}{\kappa}(\gamma_L, \gamma_L)_{\mathcal G^*} - \frac{\gamma_L^\infty}{\kappa}\langle\gamma_L, \partial g g^{-1}\rangle + \frac{1}{2\kappa}(\gamma_L^\infty)^2\,(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G}\Bigr]. \tag{2.39}$$

It can be rewritten entirely from the point of view of the cotangent bundle $T^*\hat G$. Indeed, we shall first note that the symplectic potential $\hat\theta$ of the reduced dynamical system (2.39) is

$$\hat\theta = \langle\hat\gamma_L, d\hat g\,\hat g^{-1}\rangle, \tag{2.40}$$

which is the canonical symplectic potential on $T^*\hat G$ written in the right trivialization coordinates $\hat\gamma_L$, $\hat g$. The Hamiltonian $\hat H$ of the reduced model can also be written elegantly in terms of natural quantities related to $T^*\hat G$. Indeed, it turns out that

$$\hat H = -\frac{1}{2\kappa}(\iota^*(\hat\gamma_L), \iota^*(\hat\gamma_L))_{\mathcal G^*} - \frac{1}{2\kappa}(\iota^*(\hat\gamma_R), \iota^*(\hat\gamma_R))_{\mathcal G^*}, \tag{2.41}$$

where $\iota^* : \hat{\mathcal G}^* \to \mathcal G^*$ is the map dual to the injection $\iota : \mathcal G \to \hat{\mathcal G}$ and $\hat\gamma_R$ is the coordinate on $T^*\hat G$ given by the left trivialization. Note also the minus signs reflecting the fact that the form $(\cdot,\cdot)_{\mathcal G}$ is negative definite. In other words, $\hat\gamma_L\hat g = \hat g\hat\gamma_R$ or $\hat\gamma_L = \widehat{\mathrm{Coad}}_{\hat g}\,\hat\gamma_R$. In deriving (2.41), we have used the formula (2.24) expressing the coadjoint action on $\hat{\mathcal G}^*$. We conclude that the reduced first floor action $\hat S$ can be written in a completely left-right symmetric way as

$$\hat S = \frac{1}{2}\int d\tau\,\Bigl[\Bigl\langle\hat\gamma_L, \frac{d\hat g}{d\tau}\hat g^{-1}\Bigr\rangle + \Bigl\langle\hat\gamma_R, \hat g^{-1}\frac{d\hat g}{d\tau}\Bigr\rangle\Bigr] + \frac{1}{2\kappa}\int d\tau\,\bigl[(\iota^*(\hat\gamma_L), \iota^*(\hat\gamma_L))_{\mathcal G^*} + (\iota^*(\hat\gamma_R), \iota^*(\hat\gamma_R))_{\mathcal G^*}\bigr]. \tag{2.42}$$

Remark. The first floor universal WZW action (2.42) can be naturally written for whatever central extension of the group $G$ admitting the biinvariant metric. In other words, even if the cocycle $\rho(x, y)$ does not have the form (2.8), the action (2.42) makes sense. Actually, we had begun to write this article from the vantage point of the action (2.42). This seemed sufficient for our deformation programme since the symplectic structure of the model on $T^*\hat G$ is already canonical. Thus
we can introduce the Heisenberg double, Semenov-Tian-Shansky form, etc. What happened was, however, that we had canonically obtained the symplectic structure of the deformed model but there was no clue how to single out the canonical choice of the deformed Hamiltonian. In fact, it turned out that many Hamiltonians have satisfied the basic condition of the Poisson–Lie symmetry and have had the correct limit when the deformation parameter went to zero. It was the search for the natural Hamiltonian that finally opened our eyes and we realized that the model (2.42) can be lifted to the second floor $\tilde G$ for the cocycles of the type (2.8). On $\tilde G$, there is the canonical choice of the Hamiltonian even in the deformed case. Then the canonical Hamiltonian on the first floor is the one inherited from the master model on $\tilde G$.

2.2.3. The reduction to the ground floor $G$

The second symplectic reduction is slightly more involved since the central circle bundle over $\hat G$ is nontrivial. In the Appendix A.3, the reduction is performed by working directly with the Poisson brackets on $T^*\hat G$ and $T^*G$. Here we shall rather work with the symplectic forms. This form language has the advantage of being briefer and for this reason, we expose it in the main body of the paper. However, the dual (Poisson bracket) derivation has the advantage of being more transparently deformable to the case of the nontrivial Drinfeld doubles. Anyway, we offer both derivations in this paper.

Recall that the canonical symplectic form $\hat\omega$ on $T^*\hat G$ is given by $\hat\omega = d\hat\theta$, where $\hat\theta$ is the symplectic potential

$$\hat\theta = +\langle\hat\gamma_L, d\hat g\,\hat g^{-1}\rangle. \tag{2.43}$$

In the same way, $\theta$ is the symplectic potential on $T^*G$

$$\theta = +\langle\gamma_L, dg\,g^{-1}\rangle. \tag{2.44}$$

Of course, we have trivialized the cotangent bundle $T^*G$ by the right-invariant forms hence every point $K \in T^*G$ can be decomposed as $K = \gamma_L g$, where $\gamma_L \in \mathcal G^*$ and $g \in G$. In this section, we shall never work with the opposite decomposition $K = g\gamma_R$.

Consider an extension $\pi_{ext} : T^*\hat G \to T^*G$ of the map $\pi : \hat G \to G$ defined by the exact sequence (2.1). $\pi_{ext}$ can be easily expressed in the right trivialization as follows

$$\pi_{ext}(\hat\gamma_L\hat g) = \iota^*(\hat\gamma_L)\,\pi(\hat g). \tag{2.45}$$

Recall that $\iota^*$ is the map dual to $\iota : \mathcal G \to \hat{\mathcal G}$. Then there is a natural relation between the forms $\hat\theta$ and $\theta$ as the following lemma states.

Lemma 2.6. It holds:

$$\hat\theta = \gamma_L^\infty\,(R^*_{\hat g^{-1}}\hat t_\infty) + \pi_{ext}^*\theta. \tag{2.46}$$
Quasitriangular WZW Model
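Proof. Let $T^i$ be a basis of the Lie algebra $\mathcal G$ and $t_i$ its dual basis in $\mathcal G^*$. We can then choose $\hat T^\infty$, $\iota(T^i)$ as the basis in $\hat{\mathcal G}$. Its dual is clearly $\hat t_\infty$, $\pi^*(t_i)$, since $\pi_* \circ \iota$ equals the identity map $\mathcal G \to \mathcal G$ (cf. (2.3)). The right-invariant Maurer–Cartan form $\rho_{\hat G} = d\hat g\,\hat g^{-1}$ can be clearly written as
\[ d\hat g\,\hat g^{-1} = R^*_{\hat g^{-1}} \hat t_\infty \otimes \hat T^\infty + R^*_{\hat g^{-1}} \pi^*(t_i) \otimes \iota(T^i) , \tag{2.47} \]
where $R^*$ is the pullback map. Then the symplectic potential $\hat\theta$ can be expressed as follows
\[ \hat\theta = \gamma_L^\infty \bigl(R^*_{\hat g^{-1}} \hat t_\infty\bigr) + \langle \hat\gamma_L, \iota(T^i) \rangle \bigl(R^*_{\hat g^{-1}} \pi^*(t_i)\bigr) . \tag{2.48} \]
Now we observe that

(1) $\langle \gamma_L, T^i \rangle$ is a function on $T^*G$. If we calculate its $\pi_{ext}$-pullback, we obtain
\[ \pi^*_{ext} \langle \gamma_L, T^i \rangle = \langle \iota^*(\hat\gamma_L), T^i \rangle = \langle \hat\gamma_L, \iota(T^i) \rangle . \tag{2.49} \]
(2) Due to the fact that $\pi : \hat G \to G$ is the group homomorphism, we have
\[ R^*_{\hat g^{-1}} \pi^* = \pi^* R^*_{\pi(\hat g)^{-1}} . \tag{2.50} \]
Using (1) and (2) in (2.48), we infer that
\[ \hat\theta = \gamma_L^\infty \bigl(R^*_{\hat g^{-1}} \hat t_\infty\bigr) + \pi^*_{ext} \langle \gamma_L, T^i \rangle\, \pi^* R^*_{\pi(\hat g)^{-1}} t_i = \gamma_L^\infty \bigl(R^*_{\hat g^{-1}} \hat t_\infty\bigr) + \pi^*_{ext}\theta . \tag{2.51} \]
The lemma is proved.

By using the formula
\[ d(d\hat g\,\hat g^{-1}) = d\hat g\,\hat g^{-1} \wedge d\hat g\,\hat g^{-1} \tag{2.52} \]
for the exterior derivative of the Maurer–Cartan form, we can immediately calculate also the exterior derivatives of its components. Thus we obtain, in particular, that
\[ d\bigl(R^*_{\hat g^{-1}} \hat t_\infty\bigr) = \frac{1}{2}\rho(T^i, T^j) \bigl(R^*_{\hat g^{-1}} \pi^*(t_i)\bigr) \wedge \bigl(R^*_{\hat g^{-1}} \pi^*(t_j)\bigr) = \frac{1}{2}\pi^*_{ext}\, \rho\bigl(dg\,g^{-1} \overset{\wedge}{,}\, dg\,g^{-1}\bigr) , \tag{2.53} \]
where $\rho(\cdot,\cdot)$ is the cocycle defining the central extension. We can therefore conveniently express the symplectic form $\hat\omega$ on $T^*\hat G$ as
\[ \hat\omega = d\hat\theta = +d\gamma_L^\infty \wedge \bigl(R^*_{\hat g^{-1}} \hat t_\infty\bigr) + \gamma_L^\infty\, \frac{1}{2}\pi^*_{ext}\, \rho\bigl(dg\,g^{-1} \overset{\wedge}{,}\, dg\,g^{-1}\bigr) + \pi^*_{ext}\, d\theta . \tag{2.54} \]
Now we can directly perform the symplectic reduction by setting
\[ \gamma_L^\infty = \kappa . \tag{2.55} \]
The restriction of the form $\hat\omega$ to the submanifold determined by (2.55) is clearly a $\pi^*_{ext}$-pullback of the two-form $\omega_{red}$ living on the manifold $T^*G$ and given by
\[ \omega_{red} = \frac{\kappa}{2}\rho\bigl(dg\,g^{-1} \overset{\wedge}{,}\, dg\,g^{-1}\bigr) + d\langle \gamma_L, dg\,g^{-1} \rangle . \tag{2.56} \]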
The form $\omega_{red}$ is the reduced symplectic form as the notation indicates.

Theorem 2.7. If $G$ is the loop group $LG_0$ then the form $\omega_{red}$ on $T^*G$ is the symplectic form of the standard WZW model.

Proof. The loop group cocycle $\rho(\eta, \xi)$ reads
\[ \rho(\eta, \xi) = \frac{1}{2\pi}\int_{S^1} (\eta, \partial_\sigma \xi)_{\mathcal G_0^{\mathbb C}} . \tag{2.57} \]
The form $\omega_{red}$ can be then rewritten as
\[ \omega_{red} = \frac{\kappa}{4\pi}\int_{S^1} \bigl(dg\,g^{-1} \overset{\wedge}{,}\, \partial_\sigma(dg\,g^{-1})\bigr)_{\mathcal G_0} + \frac{1}{2\pi}\, d\int_{S^1} \bigl(J_L(\sigma), dg\,g^{-1}\bigr)_{\mathcal G_0} . \tag{2.58} \]
Here $J_L(\sigma) = \Upsilon(\gamma_L)$, where $\Upsilon$ is the identification map $\mathcal G^* \to \mathcal G$ induced by $(\cdot,\cdot)_{\mathcal G} = \frac{1}{2\pi}\int (\cdot,\cdot)_{\mathcal G_0}$. It turns out that (2.58) is the standard WZW symplectic form of Ref. 6. (Actually, there is a difference in the overall normalization factor $(-2\pi)$; if so wished, this factor can be easily restored in all our formulae. The reader should also note that our bilinear form $(\cdot,\cdot)_{\mathcal G_0}$ is $-\mathrm{Tr}$ of Ref. 6.) The theorem is proved.
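The cocycle (2.57) can be probed numerically on Fourier modes. The sketch below is only an illustration under stated assumptions: it takes $G_0 = SU(2)$, works in the complexified loop algebra with trigonometric-polynomial loops, and uses the plain trace as the invariant form (a normalization chosen here for convenience, not taken from the paper). The helper names `form`, `bracket`, `rho` and `random_loop` are hypothetical. It checks antisymmetry and the 2-cocycle identity $\rho([x,y],z) + \rho([y,z],x) + \rho([z,x],y) = 0$.

```python
import numpy as np

# Pauli matrices; complex combinations span the complexified algebra sl(2, C).
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def form(a, b):
    """Ad-invariant bilinear form; the plain trace is an assumed normalization."""
    return np.trace(a @ b)

def bracket(x, y):
    """Pointwise commutator of two loops stored as {Fourier mode n: 2x2 matrix}."""
    out = {}
    for m, xm in x.items():
        for n, yn in y.items():
            out[m + n] = out.get(m + n, 0) + (xm @ yn - yn @ xm)
    return out

def rho(x, y):
    """(1/2pi) * integral over S^1 of (x, d_sigma y) = sum_n i*n*(x_{-n}, y_n)."""
    return sum(1j * n * form(x[-n], yn) for n, yn in y.items() if -n in x)

def random_loop(rng, max_mode=2):
    """Random trigonometric-polynomial loop with modes -max_mode..max_mode."""
    loop = {}
    for n in range(-max_mode, max_mode + 1):
        c = rng.normal(size=3) + 1j * rng.normal(size=3)
        loop[n] = sum(ci * s for ci, s in zip(c, PAULI))
    return loop

rng = np.random.default_rng(0)
x, y, z = (random_loop(rng) for _ in range(3))
antisymmetry = rho(x, y) + rho(y, x)
cocycle = rho(bracket(x, y), z) + rho(bracket(y, z), x) + rho(bracket(z, x), y)
print(abs(antisymmetry), abs(cocycle))  # both should vanish up to rounding
```

Both identities follow from integration by parts on the circle together with the invariance of the form, so the check does not depend on the particular normalization chosen for the trace.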
The Hamiltonian $\hat H$ of the first floor model on $\hat G$ can be read off from the formula (2.41). It clearly Poisson-commutes with the moment map $\tilde\gamma_L^\infty$ since it is invariant with respect to the central circle action. It thus descends to the function on the ground-floor phase space $T^*G$ where it is given by the formula
\[ H_{WZW}(\gamma_L, g) = -\frac{1}{\kappa}(\gamma_L, \gamma_L)_{\mathcal G^*} + \langle \gamma_L, \partial g\,g^{-1} \rangle - \frac{\kappa}{2}(g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G} = -\frac{1}{2\kappa}(J_L, J_L)_{\mathcal G} - \frac{1}{2\kappa}(J_R, J_R)_{\mathcal G} , \tag{2.59} \]
where
\[ J_L = \Upsilon(\gamma_L) , \qquad J_R(\sigma) = -\mathrm{Ad}_{g^{-1}} J_L(\sigma) + \kappa\, g^{-1}\partial_\sigma g . \tag{2.60} \]
We immediately observe that our Hamiltonian $H_{WZW}$ coincides (up to the factor $(2\pi)$ mentioned above) with the standard WZW Hamiltonian of Ref. 6. Note that the symplectic form of Ref. 6 is the $(-2\pi)$-multiple of our $\omega_{red}$ and the Hamiltonian of [6] is the $(2\pi)$-multiple of our $H_{WZW}$. The discrepancy in the relative sign is innocent. Indeed, if we change the sign of the symplectic potential in (2.31) and integrate away the momenta, we shall again obtain the same second order action (2.30). Thus we have proved the following theorem.

Theorem 2.8. The two-step symplectic reduction of the master model (1.1) induced by equating $\gamma_L^s = 0$, $\gamma_L^\infty = \kappa$ yields the standard WZW model.

Remark. We stress that the dynamics of the WZW model is intrinsically left-right symmetric. The left-right asymmetry in the Hamiltonian (cf. (2.60)) is purely a coordinate effect which can be traced back to the asymmetric way of performing the
symplectic reduction. Indeed, the choice of the right trivialization of the bundle in (2.32) already breaks the symmetry. The left-right symmetric formalism of Appendix A.1 does not use the trivialization of the cotangent bundle. We can choose a diffeomorphism relating $M_\kappa(\hat G)/U(1)$ and $T^*G$ that does not break the left-right symmetry. The reason why we are not making such a symmetric choice (we prefer the asymmetric one) is simple: it is because we want to arrive at the standard $T^*G$ presentation of the WZW symplectic structure existing in the literature [6, 19].

It is instructive to evaluate the Poisson bracket of functions on $T^*G$ with respect to the reduced form $\omega_{red}$. It is convenient to use a short-hand notation $\langle \gamma_L, T^i \rangle \equiv \gamma_L^i$. Then the reduced form $\omega_{red}$ can be written as
\[ \omega_{red} = +d\gamma_L^i \wedge R^*_{g^{-1}} t_i + \frac{1}{2}\bigl(\gamma_L^i f_i^{\ mn} + \kappa\rho(T^m, T^n)\bigr) R^*_{g^{-1}} t_m \wedge R^*_{g^{-1}} t_n , \tag{2.61} \]
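where $f_i^{\ mn}$ are the structure constants of the Lie algebra $\mathcal G$. This expression can be readily inverted to give the corresponding Poisson tensor $\Pi_{red}$:
\[ \Pi_{red} = \frac{1}{2}\bigl(\kappa\rho(T^i, T^j) + \gamma_L^m f_m^{\ ij}\bigr) \frac{\partial}{\partial\gamma_L^i} \wedge \frac{\partial}{\partial\gamma_L^j} - \frac{\partial}{\partial\gamma_L^i} \wedge R_{g*} T^i . \tag{2.62} \]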
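Since we have that $\langle \nabla^L_G, \xi \rangle = R_{g*}\xi$, for $\xi \in \mathcal G$, we obtain from $\Pi_{red}$ the following WZW Poisson brackets (cf. (A.41)–(A.43)):
\[ \{\Phi_1(g), \Phi_2(g)\}_{red} = 0 ; \tag{2.63} \]
\[ \{\Phi(g), \langle \gamma_L, \xi \rangle\}_{red} = \frac{d}{ds}\Phi(e^{s\xi}g)\Big|_{s=0} \equiv \langle \nabla^L_G \Phi, \xi \rangle ; \tag{2.64} \]
\[ \{\langle \gamma_L, \xi \rangle, \langle \gamma_L, \eta \rangle\}_{red} = \langle \gamma_L, [\xi, \eta] \rangle + \kappa\rho(\xi, \eta) . \tag{2.65} \]
From this we obtain for the loop group case
\[ \frac{1}{2\pi}\{(T^\alpha, J_L(\sigma))_{\mathcal G_0}, (T^\beta, J_L(\sigma'))_{\mathcal G_0}\}_{red} = ([T^\alpha, T^\beta], J_L(\sigma))_{\mathcal G_0}\,\delta(\sigma - \sigma') + \kappa (T^\alpha, T^\beta)_{\mathcal G_0}\,\partial_\sigma\delta(\sigma - \sigma') ; \tag{2.66} \]
\[ \frac{1}{2\pi}\{(T^\alpha, J_R(\sigma))_{\mathcal G_0}, (T^\beta, J_R(\sigma'))_{\mathcal G_0}\}_{red} = ([T^\alpha, T^\beta], J_R(\sigma))_{\mathcal G_0}\,\delta(\sigma - \sigma') - \kappa (T^\alpha, T^\beta)_{\mathcal G_0}\,\partial_\sigma\delta(\sigma - \sigma') ; \tag{2.67} \]
\[ \frac{1}{2\pi}\{g(\sigma), (T^\alpha, J_L(\sigma'))_{\mathcal G_0}\}_{red} = T^\alpha g(\sigma)\,\delta(\sigma - \sigma') ; \tag{2.68} \]
\[ \frac{1}{2\pi}\{g(\sigma), (T^\alpha, J_R(\sigma'))_{\mathcal G_0}\}_{red} = -g(\sigma) T^\alpha\,\delta(\sigma - \sigma') ; \tag{2.69} \]
\[ \frac{1}{2\pi}\{(T^\alpha, J_L(\sigma))_{\mathcal G_0}, (T^\beta, J_R(\sigma'))_{\mathcal G_0}\}_{red} = 0 . \tag{2.70} \]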
Here $T^\alpha$ is some element of the Lie algebra $\mathcal G_0$, $g(\sigma)$ is understood to be a matrix in some (typically fundamental) representation and $\delta(\sigma - \sigma')$ is the standard $\delta$-function given by
\[ \delta(\sigma - \sigma') = \frac{1}{2\pi}\sum_{n \in \mathbb Z} e^{in(\sigma - \sigma')} . \tag{2.71} \]
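The distributional meaning of (2.71) can be illustrated by truncating the sum. In the following minimal sketch (the function names `delta_N` and `smear`, the cutoff `N` and the test function `f` are illustrative choices, not taken from the paper), the truncated kernel reproduces any trigonometric polynomial of degree at most `N` exactly when integrated over the circle.

```python
import numpy as np

def delta_N(s, N):
    """Truncation of (2.71): (1/2pi) * sum over |n| <= N of exp(i n s)."""
    n = np.arange(-N, N + 1)
    return np.exp(1j * np.outer(s, n)).sum(axis=1).real / (2 * np.pi)

def smear(f, sigma, N, M=4096):
    """Integral over the circle of delta_N(sigma - s') f(s'), by a uniform Riemann sum."""
    s = 2 * np.pi * np.arange(M) / M
    return (delta_N(sigma - s, N) * f(s)).sum() * (2 * np.pi / M)

def f(s):
    """Test function: a trigonometric polynomial of degree 5."""
    return 1.0 + 2.0 * np.cos(3 * s) - 0.5 * np.sin(5 * s)

sigma0 = 0.7
value = smear(f, sigma0, N=8)
print(value, f(sigma0))  # the two numbers should agree up to rounding
```

The uniform Riemann sum is exact here because the integrand is itself a trigonometric polynomial of degree far below the number of sample points.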
Upon $(-2\pi)$ normalization (cf. the remark above concerning the normalization of the symplectic form), our reduced Poisson brackets (2.66)–(2.70) coincide with the Poisson brackets (2.4) of [6] and thus they define the standard WZW symplectic structure.

Remarks. (1) We should complete the list of the Poisson brackets (2.66)–(2.70) by the following "trivial" bracket
\[ \{g(\sigma) \overset{\otimes}{,}\, g(\sigma')\}_{red} = 0 . \tag{2.72} \]
It will turn out that in the quasitriangular generalization of the WZW model such a bracket will not vanish.

(2) It is important to note that the space derivative $\partial_\sigma$ in the reduced Hamiltonian (2.59) was "born" in the process of the symplectic reduction. So we observe that the field theoretic character of the WZW model is in a sense the fruit of the central extension.

3. Chiral Decomposition of the WZW Model

There exists a sort of square root of the dynamical structure of the standard WZW model. It is called the chiral WZW model [13, 14, 19] and it describes the dynamics of left (or right) movers independently. The full WZW model is then obtained by the appropriate combination of the left and right chiral WZW theories. The goal of this section is to present the derivation of the chiral WZW model starting from the master model (1.1). We shall first decompose (1.1) into the chiral components (1.2) called the chiral master models and then perform an appropriate two-step symplectic reduction of the latter. We shall see that the result is indeed the standard chiral WZW model [13, 14, 19].

As it was often remarked [4, 28], the analogue of the chiral decomposition already exists at the level of finite-dimensional Lie groups. We shall devote a section to the description of this finite-dimensional story in order to set the technical, notational and ideological background for the more involved infinite-dimensional case.

3.1. Chiral Geodesical Model on $G_0$

3.1.1. Cartan decomposition

The geodesical model can be naturally associated with every Lie group possessing a biinvariant non-degenerate metric. In other words, it is required that an invariant
symmetric non-degenerate $\mathbb R$-bilinear form $(\cdot,\cdot)_{\mathcal G}$ exists on the Lie algebra $\mathcal G$ of the group $G$. In this section, we are going to study the case of a simple compact connected and simply connected group $G_0$ equipped with its standard Killing–Cartan form playing the role of $(\cdot,\cdot)_{\mathcal G_0}$. In what follows, we shall introduce a map $\Upsilon_0$ that identifies the dual $\mathcal G_0^*$ of $\mathcal G_0$ with $\mathcal G_0$ itself via the bilinear form $(\cdot,\cdot)_{\mathcal G_0}$. Thus
\[ \langle x^*, y \rangle = (\Upsilon_0(x^*), y)_{\mathcal G_0} , \qquad x^* \in \mathcal G_0^* , \quad y \in \mathcal G_0 . \tag{3.1} \]
Consider now a subspace $\Upsilon_0^{-1}(\mathcal T)$ of $\mathcal G_0^*$, where $\mathcal T$ is the Lie algebra of a chosen maximal torus $\mathbb T$ in $G_0$. We can view $\Upsilon_0^{-1}(\mathcal T)$ also as the subspace of the cotangent space at the unit element of $G_0$, hence as the subgroup of $T^*G_0$. In the latter case, we shall denote $\Upsilon_0^{-1}(\mathcal T)$ as $\mathcal A^0$ and we shall call it the Cartan subgroup of $T^*G_0$. This terminology is not standard but it is very suitable for the purposes of this paper, in particular for the generalization to the loop group case.

Consider now subgroups $Norm(\mathcal A^0) \subset G_0$ and $Cent(\mathcal A^0) \subset G_0$ given by
\[ Norm(\mathcal A^0) = \{w \in G_0 ,\ w \mathcal A^0 w^{-1} \subset \mathcal A^0\} ; \tag{3.2} \]
\[ Cent(\mathcal A^0) = \{w \in G_0 ,\ w a w^{-1} = a , \text{ if } a \in \mathcal A^0\} . \tag{3.3} \]
Here the group multiplication law is that of the cotangent bundle $T^*G_0$. Clearly, $Cent(\mathcal A^0)$ is the normal subgroup of $Norm(\mathcal A^0)$.

Definition 3.1. The Weyl group $W$ is the factor group $Norm(\mathcal A^0)/Cent(\mathcal A^0)$.

Remark. The group $Cent(\mathcal A^0)$ is the maximal torus $\mathbb T$ of $G_0$. The Weyl group acts on $\mathcal T$ or on $\mathcal A^0$. The fundamental domains of this action on $\mathcal A^0$ are called (Weyl) chambers. One usually chooses one chamber which is then called the fundamental (or positive) Weyl chamber and denoted as $\mathcal A^0_+$.

It is a well-known fact that the $G_0$-adjoint orbit of every element of $\mathcal G_0$ intersects the Cartan subalgebra $\mathcal T$ of $\mathcal G_0$ (the diagonalization property in [8]). This fact, the trivializability of the cotangent bundle $T^*G_0$ and the definition of the Weyl chamber together imply the following theorem.

Theorem 3.2 (Cartan Decomposition). Every element $K \in T^*G_0$ can be decomposed as
\[ K = k_L \phi k_R^{-1} , \qquad k_{L,R} \in G_0 , \quad \phi \in \mathcal A^0_+ . \tag{3.4} \]
The ambiguity of the decomposition is given by the simultaneous right multiplication of $k_L$ and $k_R$ by the same element of $Cent(\mathcal A^0) = \mathbb T$.

Proof. By the left trivialization, every element $K \in T^*G_0$ can be written as $K = g_L \beta_R$, where $g_L \in G_0$ and $\beta_R \in \mathcal G_0^*$. By diagonalization, $\beta_R$ can be written as $\beta_R = k_R \phi k_R^{-1}$, for some $k_R \in G_0$ and $\phi \in \mathcal A^0$. By writing $g_L$ as $k_L k_R^{-1}$ for certain
$k_L \in G_0$ and by using the action of the Weyl group, we immediately arrive at the Cartan decomposition formula (3.4). The theorem is proved.

3.1.2. Standard chiral symplectic structure

It is explained in Appendix A.2 that the symplectic potential $\theta$ on $T^*G_0$ can be simply expressed in the right trivialization $K = \beta_L g_R$ as
\[ \theta = \langle \beta_L, \rho_{G_0} \rangle \equiv \langle \beta_L, dg_R\, g_R^{-1} \rangle . \tag{3.5} \]
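As a concrete illustration of the Cartan decomposition of Theorem 3.2, the following sketch treats $G_0 = SU(2)$. It rests on assumptions that are not spelled out in the text: $\mathcal G_0^*$ is identified with anti-Hermitian traceless matrices via the invariant form, the coadjoint action becomes matrix conjugation, and the positive Weyl chamber is taken to be $\mathrm{diag}(ia, -ia)$ with $a \geq 0$. The names `su2_element` and `cartan_decompose` are hypothetical helpers.

```python
import numpy as np

def su2_element(q):
    """SU(2) matrix built from a (not necessarily normalized) real 4-vector."""
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d], [-c + 1j * d, a - 1j * b]])

def cartan_decompose(g_L, beta_R):
    """Return (k_L, phi, k_R) with beta_R = k_R phi k_R^{-1} and g_L = k_L k_R^{-1}."""
    w, U = np.linalg.eigh(-1j * beta_R)      # -i*beta_R is Hermitian
    w, U = w[::-1], U[:, ::-1].copy()        # descending order: first eigenvalue >= 0
    U[:, 0] /= np.linalg.det(U)              # fix the phase so that det(k_R) = 1
    phi = np.diag(1j * w)                    # element of the assumed Weyl chamber
    k_R = U
    k_L = g_L @ k_R
    return k_L, phi, k_R

rng = np.random.default_rng(0)
sigma = [np.array([[0, 1], [1, 0]]), np.array([[0, -1j], [1j, 0]]), np.array([[1, 0], [0, -1]])]
beta_R = 1j * sum(c * s for c, s in zip(rng.normal(size=3), sigma))  # anti-Hermitian, traceless
g_L = su2_element(rng.normal(size=4))
k_L, phi, k_R = cartan_decompose(g_L, beta_R)
```

With these outputs, $K = g_L \beta_R$ is represented by the triple $(k_L, \phi, k_R)$; multiplying `k_L` and `k_R` on the right by the same diagonal unitary matrix leaves both reconstruction identities intact, which mirrors the ambiguity stated in the theorem.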
The dynamical system characterized by the symplectic form $d\theta$ and by the Hamiltonian
\[ H(K) = -(\beta_L(K), \beta_L(K))_{\mathcal G_0^*} \tag{3.6} \]
is called the standard geodesical model on $G_0$. Recall that the form $(\cdot,\cdot)_{\mathcal G_0^*}$ is dual to $(\cdot,\cdot)_{\mathcal G_0}$. The latter form is defined by the restriction of the Killing–Cartan form $(\cdot,\cdot)_{\mathcal G_0^{\mathbb C}}$ to the compact real form $\mathcal G_0$. As such, the form $(\cdot,\cdot)_{\mathcal G_0}$ is negative definite, which explains the minus sign in the definition (3.6) of the Hamiltonian and also in the second order action
\[ S(g) = -\frac{1}{4}\int d\tau \Bigl( g^{-1}\frac{d}{d\tau}g ,\ g^{-1}\frac{d}{d\tau}g \Bigr)_{\mathcal G_0} . \tag{3.7} \]
Now consider a manifold $G_0 \times \mathcal A^0_+ \times G_0$; we shall denote its points as triples $(k_L, \phi, k_R)$. The Cartan decomposition (3.4) then induces a natural map $\Xi$ from this manifold into the cotangent double $T^*G_0$. We can then pull back the polarization form $\theta$ by the map $\Xi$. By noting that
\[ \beta_L = k_L \phi k_L^{-1} , \qquad g_R = k_L k_R^{-1} , \tag{3.8} \]
we obtain
\[ \Xi^*\theta = \langle \mathrm{Coad}_{k_L}\phi ,\ dk_L\, k_L^{-1} + k_L\, dk_R^{-1}\, k_R\, k_L^{-1} \rangle = \langle \phi, k_L^{-1} dk_L \rangle - \langle \phi, k_R^{-1} dk_R \rangle . \tag{3.9} \]
Recall that $\phi$ is also the element of $\mathcal G_0^*$, hence the pairing in (3.9) makes sense. We observe that the resulting form can be chirally decomposed in the left and right parts which talk to each other only via the variable $\phi$. We can make the left and right forms in (3.9) completely independent by means of the following construction.

Consider a manifold $M_L = G_0 \times \mathcal A^0_+$. Its elements are couples $(k_L, \phi_L)$ and it is clearly a submanifold of $T^*G_0$. We can pull back the symplectic potential $\theta$ on $T^*G_0$ to $M_L$ by the map $(k_L, \phi_L) \to k_L \phi_L \in T^*G_0$, where the multiplication is in the sense of the group law in $T^*G_0$. The result is clearly
\[ \theta_L = \langle \phi_L, k_L^{-1} dk_L \rangle . \tag{3.10} \]
We shall prove soon that the form $d\theta_L$ on $G_0 \times \mathcal A^0_+$ is nondegenerate, hence it defines the symplectic structure.
Definition 3.3. The manifold $M_L = G_0 \times \mathcal A^0_+$ equipped with the symplectic form $d\theta_L$ is referred to as the model space of the (simple compact, etc.) group $G_0$.

We have seen that we can obtain the symplectic structure on the model space by the simple pullback of the canonical symplectic form on $T^*G_0$. We can show with the help of the Cartan decomposition that a sort of "inverse" procedure is also possible. Indeed, consider a direct product $M_L \times M_R$ of two copies of the model space $M_L = G_0 \times \mathcal A^0_+$ and $M_R = G_0 \times \mathcal A^0_-$, where $\mathcal A^0_- = -\mathcal A^0_+$. Equip the manifold $M_L \times M_R$ with a symplectic form
\[ \omega_{L\times R} = d\theta_L + d\theta_R = d\langle \phi_L, k_L^{-1} dk_L \rangle + d\langle \phi_R, k_R^{-1} dk_R \rangle . \tag{3.11} \]
The cotangent bundle $T^*G_0$ with its canonical symplectic structure $\omega = d\theta$ can be obtained by an appropriate symplectic reduction of the symplectic manifold $(M_L \times M_R, \omega_{L\times R})$. Indeed, consider a submanifold of $M_L \times M_R$ obtained by equating $\phi_L + \phi_R = 0$. The form $\omega_{L\times R}$ restricted to this submanifold becomes just $\Xi^* d\theta$. It is clearly degenerate, since by construction of the map $\Xi$, its kernel is given by vector fields generating the simultaneous right action of the maximal torus $\mathbb T$ on $k_L$ and $k_R$. By imposing the equivalence $(k_L, k_R, \phi) \cong (k_L h, k_R h, \phi)$, where $h \in \mathbb T$, we obtain the reduced manifold. According to the Cartan decomposition theorem, the latter is nothing but $T^*G_0$.

It turns out that the Hamiltonian (3.6) of the geodesical model on $T^*G_0$ can be also "descended" from a natural Hamiltonian on $M_L \times M_R$. The latter is given by the following formula
\[ H_{L\times R} = H_L + H_R = -\frac{1}{2}(\phi_L, \phi_L)_{\mathcal G_0^*} - \frac{1}{2}(\phi_R, \phi_R)_{\mathcal G_0^*} . \tag{3.12} \]
Since $H_{L\times R}$ restricted to $\phi_L + \phi_R = 0$ is trivially invariant with respect to the maximal torus action $(k_L, k_R, \phi) \cong (k_L h, k_R h, \phi)$, it defines a certain Hamiltonian on the reduced manifold $T^*G_0$. In order to show that this is precisely the Hamiltonian of the geodesical flow in (3.6), it is sufficient to note that
\[ (\beta_L(K), \beta_L(K))_{\mathcal G_0^*} = (\beta_L(k_L \phi k_R^{-1}), \beta_L(k_L \phi k_R^{-1}))_{\mathcal G_0^*} = (k_L \phi k_L^{-1}, k_L \phi k_L^{-1})_{\mathcal G_0^*} = (\phi, \phi)_{\mathcal G_0^*} . \tag{3.13} \]

3.1.3. Dynamical r-matrix

This section is devoted to the study of the chiral dynamical system defined on the model space $M_L$ and characterized by the symplectic potential $\theta_L$ and the Hamiltonian $H_L$. This system has been proposed in [3] as the finite dimensional analogue of the chiral WZW model. We have seen in the previous paragraph that the geodesical model on $G_0$ admits the chiral decomposition in two chiral models. By this we mean that it can be defined by the symplectic reduction of the model on $M_L \times M_R$, characterized by the symplectic form $\omega_{L\times R}$ and by the Hamiltonian $H_{L\times R}$.
The chiral dynamics can be derived from the following action principle
\[ S = \int d\tau \Bigl[ \langle \phi_L, k_L^{-1}\dot k_L \rangle + \frac{1}{2}(\phi_L, \phi_L)_{\mathcal G_0^*} \Bigr] , \tag{3.14} \]
where the dot indicates the time derivative. The equations of motion can be easily derived
\[ \frac{d}{d\tau}\bigl(\mathrm{Coad}_{k_L}\phi_L\bigr) = 0 , \qquad P_{\mathcal T}\bigl(k_L^{-1}\dot k_L\bigr) = -\Upsilon_0(\phi_L) , \tag{3.15} \]
where $P_{\mathcal T}$ denotes the orthogonal projection on $\mathcal T$ and $\Upsilon_0 : \mathcal G_0^* \to \mathcal G_0$ is the map that identifies $\mathcal G_0^*$ with $\mathcal G_0$ via the form $(\cdot,\cdot)_{\mathcal G_0}$. From the first equation, it follows
\[ \mathrm{Coad}_{k_L(\tau)}\phi_L(\tau) = \mathrm{Coad}_{k_L(0)}\phi_L(0) , \tag{3.16} \]
or
\[ \phi_L(\tau) = \mathrm{Coad}_{k_L^{-1}(\tau) k_L(0)}\,\phi_L(0) . \tag{3.17} \]
This implies that $k_L(\tau)^{-1} k_L(0) \in \mathbb T$ and $\phi_L(\tau) = \phi_L(0)$, where $\mathbb T$ is the maximal torus of $G_0$. From this and Eq. (3.15), we finally obtain
\[ k_L(\tau) = k_L(0) \exp[-\Upsilon_0(\phi_L(0))\tau] . \tag{3.18} \]
The only thing that changes in the treatment of the right model space $M_R$ is the fact that $\phi_R(0) \in \mathcal A^0_-$. The solution of the right dynamical system is
\[ k_R(\tau) = k_R(0) \exp[-\Upsilon_0(\phi_R(0))\tau] . \tag{3.19} \]
We combine the left and right systems by identifying $\phi_L = -\phi_R$, which gives the standard geodesical motion on the group manifold
\[ k(\tau) = k_L(\tau) k_R^{-1}(\tau) = k_L(0) \exp[-2\Upsilon_0(\phi_L(0))\tau]\, k_R^{-1}(0) . \tag{3.20} \]
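The way the chiral solutions (3.18) and (3.19) combine into (3.20) can be checked numerically for $G_0 = SU(2)$. The sketch below is illustrative only: it assumes $\Upsilon_0(\phi_L(0)) = \mathrm{diag}(ia, -ia)$ for an arbitrarily chosen real number `a`, and the helper names are hypothetical. It also verifies that $k(0)^{-1}k(\tau)$ is a one-parameter subgroup, as expected for geodesic motion with respect to a biinvariant metric.

```python
import numpy as np

def su2_element(q):
    """SU(2) matrix built from a (not necessarily normalized) real 4-vector."""
    a, b, c, d = q / np.linalg.norm(q)
    return np.array([[a + 1j * b, c + 1j * d], [-c + 1j * d, a - 1j * b]])

def torus_exp(a, tau):
    """exp(-Upsilon * tau) for the assumed Cartan element Upsilon = diag(i a, -i a)."""
    return np.diag([np.exp(-1j * a * tau), np.exp(1j * a * tau)])

a = 0.8                                    # parametrizes Upsilon_0(phi_L(0)); arbitrary choice
kL0 = su2_element(np.array([1.0, 2.0, -1.0, 0.5]))
kR0 = su2_element(np.array([0.3, -1.0, 2.0, 1.0]))

def k_L(tau):
    return kL0 @ torus_exp(a, tau)         # (3.18)

def k_R(tau):
    return kR0 @ torus_exp(-a, tau)        # (3.19) with phi_R = -phi_L

def k(tau):
    return k_L(tau) @ k_R(tau).conj().T    # k_L k_R^{-1}; the inverse is the adjoint in SU(2)

def k_closed(tau):
    return kL0 @ torus_exp(2 * a, tau) @ kR0.conj().T   # right-hand side of (3.20)

def step(tau):
    return k(0.0).conj().T @ k(tau)        # k(0)^{-1} k(tau)
```

For example, `k(1.3)` and `k_closed(1.3)` agree to rounding error, and `step(0.4) @ step(0.9)` reproduces `step(1.3)`.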
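The symplectic form $d\theta_L$ can be easily inverted to give the Poisson bracket on the model space. Although this calculation was already detailed in the literature [4], we shall repeat it here as the simplest prototype of several similar but technically more involved computations that we shall be doing later on.

Recall that the Killing–Cartan form $(\cdot,\cdot)_{\mathcal G_0^{\mathbb C}}$ on $\mathcal G_0^{\mathbb C}$ is normalized in such a way that the square of the length of the longest root is equal to two. We pick an orthonormal basis $H^\mu \in i\mathcal T$ in the Cartan subalgebra $\mathcal T^{\mathbb C}$ of $\mathcal G_0^{\mathbb C}$ with respect to the Killing–Cartan form $(\cdot,\cdot)_{\mathcal G_0^{\mathbb C}}$. Note that the elements $H^\mu$ are Hermitian, hence they are not the elements of the Lie algebra $\mathcal T$ of the maximal torus $\mathbb T$. Consider the root space decomposition of $\mathcal G_0^{\mathbb C}$
\[ \mathcal G_0^{\mathbb C} = \mathcal T^{\mathbb C} \oplus \Bigl( \bigoplus_{\alpha \in \Phi} \mathbb C E^\alpha \Bigr) , \tag{3.21} \]
where $\alpha$ runs over the space $\Phi$ of all roots $\alpha \in \mathcal T^{\mathbb C *}$. The step generators $E^\alpha$ fulfil
\[ [H^\mu, E^\alpha] = \alpha(H^\mu) E^\alpha , \qquad (E^\alpha)^\dagger = E^{-\alpha} , \tag{3.22} \]
\[ [E^\alpha, E^{-\alpha}] = \alpha^\vee , \qquad [\alpha^\vee, E^{\pm\alpha}] = \pm 2E^{\pm\alpha} , \qquad (E^\alpha, E^{-\alpha})_{\mathcal G_0^{\mathbb C}} = \frac{2}{|\alpha|^2} . \tag{3.23} \]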
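The element $\alpha^\vee \in i\mathcal T$ is called the coroot of the root $\alpha$. Thus the basis of the complex vector space $\mathcal G_0^{\mathbb C}$ is $(H^\mu, E^\alpha)$, $\alpha \in \Phi$. The corresponding dual basis of $(\mathcal G_0^{\mathbb C})^*$ will be denoted as $(h_\mu, e_\alpha)$.

We want to invert the symplectic form on the model space $M_L$ of the compact real form $G_0$ of the simple complex group $G_0^{\mathbb C}$. For this, we need a basis on the Lie algebra $\mathcal G_0$. We can construct such a basis in a canonical way from the basis $(H^\mu, E^\alpha)$ on $\mathcal G_0^{\mathbb C}$. Set
\[ T^\mu = iH^\mu , \qquad B^\alpha = \frac{i}{\sqrt 2}(E^\alpha + E^{-\alpha}) , \qquad C^\alpha = \frac{1}{\sqrt 2}(E^\alpha - E^{-\alpha}) . \tag{3.24} \]
The set $(T^\mu, B^\alpha, C^\alpha)$ is an orthogonal basis of the real vector space $\mathcal G_0$ with respect to $(\cdot,\cdot)_{\mathcal G_0}$. Note that $\alpha$ runs only over the positive roots in this context! The dual basis of $\mathcal G_0^*$ will be denoted as $(t_\mu, b_\alpha, c_\alpha)$. Using the relation
\[ k^{-1}dk = L^*_{k^{-1}} t_\mu \otimes T^\mu + \sum_{\alpha \in \Phi_+} \bigl( L^*_{k^{-1}} b_\alpha \otimes B^\alpha + L^*_{k^{-1}} c_\alpha \otimes C^\alpha \bigr) , \tag{3.25} \]
we can write down the symplectic form $d\theta_L$ on $M_L$ as
\[ d\theta_L = \langle d\phi \overset{\wedge}{,}\, k^{-1}dk \rangle - \langle \phi, k^{-1}dk \wedge k^{-1}dk \rangle = da^\mu \wedge L^*_{k^{-1}} t_\mu + \sum_{\alpha \in \Phi_+} \langle \phi, i\alpha^\vee \rangle\, L^*_{k^{-1}} b_\alpha \wedge L^*_{k^{-1}} c_\alpha . \tag{3.26} \]
Here the $a^\mu$'s are defined by the expansion $\phi = a^\mu t_\mu$. In deriving (3.26), we have used the commutation relation
\[ [B^\alpha, C^\alpha] = -i\alpha^\vee . \tag{3.27} \]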
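The basis (3.24) and the relation (3.27) can be checked explicitly in the simplest case. The sketch below assumes $\mathcal G_0 = su(2)$ in its defining representation, with the step generators realized as the elementary matrices `E` and `F` (standing for $E^\alpha$ and $E^{-\alpha}$); these concrete matrices are an illustrative choice, not taken from the paper. It also evaluates the antisymmetric combination $B^\alpha \otimes C^\alpha - C^\alpha \otimes B^\alpha$, which by direct expansion equals $-i(E^\alpha \otimes E^{-\alpha} - E^{-\alpha} \otimes E^\alpha)$.

```python
import numpy as np

E = np.array([[0, 1], [0, 0]], dtype=complex)   # step generator E^alpha (assumed realization)
F = E.conj().T                                   # E^{-alpha} = (E^alpha)^dagger
coroot = E @ F - F @ E                           # alpha^vee = [E^alpha, E^{-alpha}] = diag(1, -1)

B = 1j / np.sqrt(2) * (E + F)                    # B^alpha of (3.24)
C = 1 / np.sqrt(2) * (E - F)                     # C^alpha of (3.24)

comm_BC = B @ C - C @ B                          # should equal -i * alpha^vee, cf. (3.27)
wedge_BC = np.kron(B, C) - np.kron(C, B)         # B (x) C - C (x) B
wedge_EF = -1j * (np.kron(E, F) - np.kron(F, E)) # -i (E (x) F - F (x) E)

print(np.allclose(comm_BC, -1j * coroot), np.allclose(wedge_BC, wedge_EF))
```

In this realization `B` and `C` are anti-Hermitian, so they indeed belong to the compact real form, whereas `E` and `F` do not.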
To make the formulae less cumbersome, we have also suppressed the index $L$ on $\phi_L$ and $k_L$. It is now very easy to invert the symplectic form $d\theta_L$. The corresponding Poisson tensor reads
\[ \Pi_L = L_{k*} T^\mu \wedge \frac{\partial}{\partial a^\mu} - \sum_{\alpha \in \Phi_+} \frac{1}{\langle \phi, i\alpha^\vee \rangle}\, L_{k*} B^\alpha \wedge L_{k*} C^\alpha . \tag{3.28} \]
It is useful to give an explicit formula for the Poisson brackets of functions that can be obtained as matrix elements of representations of the group $G_0$. Consider two finite-dimensional representations $\rho_i : G_0 \to \mathrm{End}\, V_0$, $i = 1, 2$. The matrix element of the representation can be obtained as the function $\langle w^*, \rho_i(k) v \rangle$, where $v \in V_0$ and $w^* \in V_0^*$. The Poisson bracket of two such functions then reads
\[ \{\langle w_1^*, \rho_1(k) v_1 \rangle, \langle w_2^*, \rho_2(k) v_2 \rangle\}_{M_L} = \bigl\langle (w_1^* \otimes w_2^*), (\rho_1(k) \otimes \rho_2(k)) (\rho_1 \otimes \rho_2)(r_0(a)) (v_1 \otimes v_2) \bigr\rangle , \tag{3.29} \]
where
\[ r_0(a^\mu) = \sum_{\alpha \in \Phi_+} \frac{-1}{\langle \phi, i\alpha^\vee \rangle} (B^\alpha \otimes C^\alpha - C^\alpha \otimes B^\alpha) = \sum_{\alpha \in \Phi} \frac{i|\alpha|^2}{2a^\mu \langle \alpha, H^\mu \rangle}\, E^\alpha \otimes E^{-\alpha} . \tag{3.30} \]
The last equality in (3.30) follows from (3.24) and from the well-known relations
\[ \alpha^\vee = \frac{2}{|\alpha|^2} \langle \alpha, H^\mu \rangle H^\mu \qquad \text{or} \qquad i\alpha^\vee = \frac{2}{|\alpha|^2} \langle \alpha, H^\mu \rangle T^\mu . \tag{3.31} \]
Note that $r_0(a^\mu)$ is an $a^\mu$-dependent element of $\mathcal G_0 \wedge \mathcal G_0$; it is called the dynamical r-matrix. It is to be contrasted with the standard r-matrix (cf. (1.9)) which does not depend on $a^\mu$. Both standard and dynamical r-matrices have to satisfy some consistency conditions if the Poisson brackets based on them are to satisfy the Jacobi identities. Those conditions are called, respectively, the Yang–Baxter and the dynamical Yang–Baxter equations. We do not worry about the Jacobi identity here because we know a priori that the symplectic form $d\theta_L$ is closed.

Physicists use the so-called matrix Poisson brackets (cf. e.g. [5, 18]) in order to make expressions like (3.29) more transparent. For simplicity, let us consider the case where $\rho_1 = \rho_2$ and choose some basis of the representation space $V_0$. Then the Poisson bracket of two matrix valued functions $A_{ij}$ and $B_{kl}$ is written as
\[ \{A_{ij}, B_{kl}\} \equiv \{A \overset{\otimes}{,}\, B\}_{ik,jl} . \tag{3.32} \]
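The index convention in (3.32) is the one of the tensor (Kronecker) product, $(A \otimes B)_{ik,jl} = A_{ij} B_{kl}$. A minimal sketch, using NumPy's `kron` on arbitrary random matrices purely as an illustration of this bookkeeping (the Poisson bracket itself is not computed here):

```python
import numpy as np

n = 2
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))

# np.kron places A[i, j] * B[k, l] at row i*n + k and column j*n + l,
# so after reshaping the four axes are ordered as (i, k, j, l).
AB = np.kron(A, B).reshape(n, n, n, n)

i, j, k, l = 0, 1, 1, 0
print(AB[i, k, j, l], A[i, j] * B[k, l])  # the two numbers coincide
```

In this convention the pair $(i, k)$ labels the row and $(j, l)$ the column of the $n^2 \times n^2$ matrix.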
With such a notation, we can write the brackets (3.29) of the matrix elements in the following matrix form
\[ \{\rho(k) \overset{\otimes}{,}\, \rho(k)\}_{M_L} = (\rho(k) \otimes \rho(k))\, \rho(r_0(a^\mu)) . \tag{3.33} \]
Even more often, people use a notation where the dependence on the representation $\rho$ is explicitly suppressed but tacitly assumed, i.e.
\[ \{k \overset{\otimes}{,}\, k\}_{M_L} = (k \otimes k)\, r_0(a^\mu) . \tag{3.34} \]
The Poisson bracket between the variables $k$ and $a$ can also be written in the matrix form as follows
\[ \{k, a^\mu\}_{M_L} = kT^\mu . \tag{3.35} \]
We complete this section with the commutation relation of the moment maps generating the left action of $G_0$ on $M_L = G_0 \times \mathcal A^0_+$. The symplectic potential $\theta_L$ (hence the symplectic form $d\theta_L$) is clearly invariant with respect to the left multiplication by any $k_0 \in G_0$. Consider the infinitesimal vector field $V = R_{k_L *} T$ on $M_L$ corresponding to the left action of a generator $T \in \mathcal G_0$. As usual (cf. also (4.87)), the corresponding moment map $\langle M, T \rangle$ is defined by the relation
\[ -i_V\, d\theta_L \equiv d\theta_L(\cdot, V) = d\langle M, T \rangle . \tag{3.36} \]
The invariance of the symplectic potential $\theta_L$ means the vanishing of its Lie derivative with respect to $V$. In other words,
\[ (i_V d + d\, i_V)\theta_L = 0 . \tag{3.37} \]
From this relation, it immediately follows that
\[ \langle M, T \rangle = i_V \theta_L = \langle \phi_L, L_{k_L^{-1} *} R_{k_L *} T \rangle = \langle \mathrm{Coad}_{k_L}\phi_L, T \rangle = \langle \beta_L(k_L\phi_L), T \rangle . \tag{3.38} \]
Recall that $\beta_L(K)$ is the $\mathcal G_0^*$-valued map defined by the decomposition $K = \beta_L(K) g_R(K)$. From (3.36), one can immediately infer
\[ \Pi_L(\cdot, d\langle M, T \rangle) = R_{k*} T . \tag{3.39} \]
Equations (3.38) and (3.39) imply that for any function $f(k_L, \phi_L)$ on $M_L$, it holds
\[ \{f(k_L, \phi_L), \langle \beta_L(k_L\phi_L), T \rangle\}_{M_L} = \langle \nabla^L_{G_0} f, T \rangle , \tag{3.40} \]
where the differential operator $\nabla^L_{G_0}$ is defined in (A.54). In particular, by remarking that $\beta_L(k_L\phi_L) = k_L \phi_L k_L^{-1}$, we obtain
\[ \{\langle \beta_L(k_L\phi_L), x \rangle, \langle \beta_L(k_L\phi_L), y \rangle\}_{M_L} = \langle \beta_L(k_L\phi_L), [x, y] \rangle , \qquad x, y \in \mathcal G_0 . \tag{3.41} \]
Of course, the same relation can be directly obtained from the fact that $\beta_L(k_L\phi_L)$ is the moment map generating the left action of $G_0$ on $M_L$.

3.2. Chiral decomposition of the master model

3.2.1. Affine Cartan decomposition

Usually people derive the standard WZW left-right decomposition by using the equations of motion of the full WZW theory [19]. The solutions of these equations of motion split into left and right movers. Because the phase space of a dynamical system can be identified with its space of solutions, one infers that the phase space itself can be split in its chiral parts. The corresponding symplectic forms on the chiral parts have been derived by Gawędzki [28].

Here we shall show how the standard chiral WZW dynamics [28] emerges from the perspective of the master model (1.1). We do not start with the equations of motion. Instead, we shall consider a Cartan-like decomposition of the cotangent bundle of the centrally biextended loop group $\tilde G$. This will give the left-right splitting without the use of the field equations.

Denote by $\mathcal T$ the Lie algebra of the maximal torus $\mathbb T$ of the simple compact group $G_0$. Clearly, $\mathcal T$ can also be interpreted as the subalgebra of the loop group algebra $L\mathcal G_0$ consisting of the constant maps from $S^1$ into $\mathcal T$. In what follows, we shall use the same notation for $\mathcal T$ being the subalgebra of $\mathcal G_0$ or of $L\mathcal G_0$. Now we consider the following subalgebra of $\tilde{\mathcal G}$
\[ \tilde{\mathcal T} = \mathbb R \tilde T^0 + \tilde\iota(\mathcal T) + \mathbb R \tilde T^\infty . \tag{3.42} \]
Its elements are triples $(iX, \xi_0, ix)$, $X, x \in \mathbb R$, $\xi_0 \in \mathcal G_0 \subset L\mathcal G_0$ in the terminology of Sec. 2.1. The subalgebra $\tilde{\mathcal T}$ can be mapped by $\tilde\Upsilon^{-1}$ to the subgroup $\tilde\Upsilon^{-1}(\tilde{\mathcal T})$ of $T^*\tilde G$, since $\tilde{\mathcal G}^*$ can be identified with the cotangent space at the unit element of the group $\tilde G$. Here, as usual, the identification map $\tilde\Upsilon : \tilde{\mathcal G}^* \to \tilde{\mathcal G}$ is induced by the invariant bilinear form (2.10) on $\tilde{\mathcal G}$.
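Definition 3.4. $\tilde\Upsilon^{-1}(\tilde{\mathcal T}) \equiv \tilde{\mathcal A}$ is called the Cartan subgroup of the cotangent space $T^*_{\tilde e}\tilde G$ at the unit element $\tilde e$ of $\tilde G$. In the terminology of Sec. 2.1, its elements are triples $\tilde\phi = (a^0, \phi, a^\infty)^*$, where $\phi \in \Upsilon^{-1}(\mathcal T)$, $a^0, a^\infty \in \mathbb R$. Of course, $\tilde{\mathcal A}$ can also be interpreted as the subalgebra of the Lie algebra $\tilde{\tilde{\mathcal D}}$ of the group $\tilde{\tilde D} = T^*\tilde G$. We shall also define two subalgebras of $\tilde{\mathcal A}$ denoted $\hat{\mathcal A}$ and $\mathcal A$; the former spanned by elements having $a^0 = 0$ and the latter by those having $a^0 = a^\infty = 0$.

Consider now subgroups $Norm(\tilde{\mathcal A}) \subset \hat G$ and $Cent(\tilde{\mathcal A}) \subset \hat G$ given by
\[ Norm(\tilde{\mathcal A}) = \{\hat w \in \hat G ,\ \widetilde{\mathrm{Coad}}_{\hat w} \tilde{\mathcal A} \subset \tilde{\mathcal A}\} ; \tag{3.43} \]
\[ Cent(\tilde{\mathcal A}) = \{\hat w \in \hat G ,\ \widetilde{\mathrm{Coad}}_{\hat w} \tilde\phi = \tilde\phi , \text{ if } \tilde\phi \in \tilde{\mathcal A}\} . \tag{3.44} \]
Clearly, $Cent(\tilde{\mathcal A})$ is a normal subgroup of $Norm(\tilde{\mathcal A})$.

Definition 3.5. The affine Weyl group $\tilde W$ is the factor group $Norm(\tilde{\mathcal A})/Cent(\tilde{\mathcal A})$.

Remark. We shall see soon that the group $Cent(\tilde{\mathcal A})$ is nothing but the direct product $\mathbb T \times U(1)$, where $\mathbb T$ is the maximal torus of $G_0$ and $U(1)$ is the central circle subgroup of $\hat G = \widehat{LG_0}$. It is important to realize in this context that the circle bundle over $\mathbb T \subset LG_0$ is trivial, hence $\mathbb T$ can be embedded in $\hat G$.

The coadjoint action of $\hat G$ on $\tilde{\mathcal A}$ is given by the formula (2.23)
\[ \widetilde{\mathrm{Coad}}_{\hat g} \tilde\phi = \widetilde{\mathrm{Coad}}_{\hat g} (a^0, \phi, a^\infty)^* = \Bigl( a^0 + \langle \phi, g^{-1}\partial g \rangle + \frac{1}{2} a^\infty (g^{-1}\partial g, g^{-1}\partial g)_{\mathcal G} ,\ \mathrm{Coad}_g \phi + a^\infty \Upsilon^{-1}(\partial g\, g^{-1}) ,\ a^\infty \Bigr)^* . \tag{3.45} \]
Consider now an element $h(\sigma) = e^{iv\sigma}$ from the coroot group $\mathrm{Hom}(U(1), \mathbb T)$. We have
\[ \widetilde{\mathrm{Coad}}_{\hat h} (a^0, \phi, a^\infty)^* = \Bigl( a^0 + \langle \phi, iv \rangle + \frac{1}{2} a^\infty (iv, iv)_{\mathcal G} ,\ \phi + a^\infty \Upsilon^{-1}(iv) ,\ a^\infty \Bigr)^* . \tag{3.46} \]
Here $\hat h$ is some $\pi^{-1}$-lift of $h(\sigma)$ into $\hat G$ and $v \in \mathcal H (= i\mathcal T)$ is a $\sigma$-independent element called the coroot which corresponds to the element $h(\sigma)$ in $\mathrm{Hom}(U(1), \mathbb T)$. This correspondence is clearly one-to-one and for this reason, the coroot group $\mathrm{Hom}(U(1), \mathbb T)$ is often viewed as the coroot lattice in $i\mathcal T$ or in $\mathcal T$.

The inspection of the formulae (3.45) and (3.46) tells us what the affine Weyl group is. It is the semidirect product of the standard Weyl group of $G_0$ (which can also be naturally embedded in $\hat G$) and of the coroot group $\mathrm{Hom}(U(1), \mathbb T)$. By construction, the affine Weyl group acts on $\tilde{\mathcal A}$. Elements of $\tilde W$ can be represented by the elements of $\hat G$. For the elements of the ordinary Weyl group $W$, the standard representation by the elements of $G_0$ can be chosen. For the elements of the coroot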
lattice, the representation is evident since every element of $\mathrm{Hom}(U(1), \mathbb T)$ can be viewed by definition as the element of the loop group $L\mathbb T$. By the way, the fact that $Cent(\hat{\mathcal A}) = \mathbb T \times U(1)$ follows directly from the formulae (3.45) and (3.46).

We immediately realize from (3.46) that the affine Weyl group acts not only on $\tilde{\mathcal A}$, but also on $\hat{\mathcal A}$ and $\mathcal A$. However, in the latter case the action depends on $a^\infty$ as on a parameter. The fundamental domains of this action on $\mathcal A$ are called alcoves. Consider the decomposition of the positive Weyl chamber into alcoves. The alcove attached to the origin (zero) of this positive Weyl chamber is referred to as the fundamental alcove $\mathcal A^{a^\infty}_+$. It clearly depends on $a^\infty$. Now an element from $\tilde{\mathcal A}$ of the form $(a^0, \phi, a^\infty)^*$ is said to be in $\tilde{\mathcal A}_+$ if $\phi$ is in the fundamental alcove $\mathcal A^{a^\infty}_+$. In order not to create too much confusion, we shall call $\tilde{\mathcal A}_+$ the fundamental alcove, too. After this preliminary discussion we can now state the important theorem.

Theorem 3.6. Every element $\tilde K \in T^*\tilde G$ can be decomposed as
\[ \tilde K = \tilde k_L \tilde\phi \tilde k_R^{-1} , \qquad \tilde k_{L,R} \in \tilde G , \quad \tilde\phi \in \tilde{\mathcal A}_+ . \tag{3.47} \]
The ambiguity of the decomposition is given by the simultaneous right multiplication of $\tilde k_L$ and $\tilde k_R$ by the same element of $Cent(\tilde{\mathcal A}) \times \mathbb R_S = \exp \tilde{\mathcal T}$.

Proof. Obviously, we must prove that every element of $\tilde{\mathcal G}^*$ can be connected to some element in $\tilde{\mathcal A}_+$ by the coadjoint action of $\tilde G$. In other words, for every $(C, \gamma, c)^* \in \tilde{\mathcal G}^*$, there exist $\tilde k_L \in \tilde G$ and $(a^0, \phi, a^\infty)^* \in \tilde{\mathcal A}_+$ such that
\[ \widetilde{\mathrm{Coad}}_{\tilde k_L} (a^0, \phi, a^\infty)^* = (C, \gamma, c)^* . \tag{3.48} \]
In fact, it turns out that $\hat k_L \in \hat G$ already does the job. Indeed, we have
\[ \widetilde{\mathrm{Coad}}_{\hat k_L} (a^0, \phi, a^\infty)^* = \Bigl( a^0 + \langle \phi, k_L^{-1}\partial k_L \rangle + \frac{1}{2} a^\infty (k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G} ,\ \mathrm{Coad}_{k_L}\phi + a^\infty \Upsilon^{-1}(\partial k_L\, k_L^{-1}) ,\ a^\infty \Bigr)^* , \tag{3.49} \]
where we remind ourselves that our convention is $k_L = \pi(\hat k_L)$ if both $\hat k_L$ and $k_L$ are present in the same formula. Thus we have to show that every $\gamma$ can be written as
\[ \Upsilon(\gamma) = k_L \Upsilon(\phi) k_L^{-1} + a^\infty \partial_\sigma k_L\, k_L^{-1} , \tag{3.50} \]
where $\phi$ is in the fundamental alcove $\mathcal A^{a^\infty}_+$. Define
\[ V(\sigma) = \overset{\leftarrow}{P} \exp \int_0^\sigma \frac{\Upsilon(\gamma)(\sigma')}{a^\infty}\, d\sigma' . \tag{3.51} \]
The monodromy $V(2\pi)$ is the element of the compact group $G_0$, hence it can be diagonalized [9, 19] as $V(2\pi) = k_0 e^{2\pi\rho} k_0^{-1}$, where $k_0$ is in $G_0$ and $\rho$ in the alcove $\Upsilon(\mathcal A^1_+)$. Define now $k_L$ and $\phi$ as follows
\[ k_L(\sigma) = V(\sigma) k_0 \exp(-\rho\sigma) , \tag{3.52} \]
\[ \phi = a^\infty \Upsilon^{-1}(\rho) . \tag{3.53} \]
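The construction (3.51)–(3.53) can be tested numerically. The sketch below is a hedged illustration for $G_0 = SU(2)$ under several assumptions of ours: the profile $\Upsilon(\gamma)/a^\infty$ is taken piecewise constant (the list `X`), the path-ordered exponential places later factors to the left, and $\rho$ is chosen as $\mathrm{diag}(ir, -ir)$ with $0 \leq r \leq 1/2$, which is presumed (not verified against the paper's conventions) to correspond to the alcove for $SU(2)$. It checks that $k_L$ is periodic and that it satisfies (3.50) divided by $a^\infty$.

```python
import numpy as np

def expm_antiherm(X):
    """Matrix exponential of an anti-Hermitian matrix via a Hermitian eigendecomposition."""
    w, U = np.linalg.eigh(1j * X)            # 1j*X is Hermitian, X = -1j * (1j*X)
    return (U * np.exp(-1j * w)) @ U.conj().T

PAULI = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

rng = np.random.default_rng(1)
N = 6
edges = np.linspace(0.0, 2 * np.pi, N + 1)
# X[j] plays the role of Upsilon(gamma)/a_inf on the j-th segment (assumed piecewise constant).
X = [1j * sum(c * s for c, s in zip(rng.normal(size=3), PAULI)) for _ in range(N)]

def V(sigma):
    """Path-ordered exponential (3.51); factors from later segments act on the left."""
    out = np.eye(2, dtype=complex)
    for j in range(N):
        lo, hi = edges[j], edges[j + 1]
        if sigma <= lo:
            break
        out = expm_antiherm(X[j] * (min(sigma, hi) - lo)) @ out
    return out

M = V(2 * np.pi)                                   # monodromy
_, U = np.linalg.eigh((M - M.conj().T) / 2j)       # for SU(2) this shares eigenvectors with M
k0 = U[:, ::-1].copy()                             # first eigenvalue of M has angle in [0, pi]
lam = np.diag(k0.conj().T @ M @ k0)
rho = np.diag(1j * np.angle(lam) / (2 * np.pi))    # M = k0 exp(2 pi rho) k0^{-1}

def k_L(sigma):
    """Equation (3.52)."""
    return V(sigma) @ k0 @ np.diag(np.exp(-np.diag(rho) * sigma))

# Check (3.50)/a_inf at an interior point of a segment, using a central finite difference.
s, h = 2.0, 1e-4
j = int(np.searchsorted(edges, s)) - 1
k = k_L(s)
dk = (k_L(s + h) - k_L(s - h)) / (2 * h)
residual = X[j] - (k @ rho @ k.conj().T + dk @ k.conj().T)
periodicity = k_L(2 * np.pi) - k_L(0.0)
```

Periodicity holds because $V(2\pi) k_0 e^{-2\pi\rho} = k_0$, which is exactly why the monodromy has to be diagonalized before $k_L$ is defined.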
It can be easily checked that the pair (kL , φ) defined in this way satisfies (3.50). Thus we see that the elements k˜L , k˜R and φ˜ from the statement of the theorem always exist, moreover, φ˜ is unique. It then easily follows that ambiguity of the choice of k˜L and k˜R is given by the simultaneous right multiplication by an element from exp T˜ . The theorem is proved. 3.2.2. Affine model space We wish to construct the phase space of the loop group chiral WZW model. The first part of the exposition of this section will follow the spirit of the Sec. 3.1. Indeed, ˜L = G ˜ × A˜+ with a symplectic structure we shall equip the affine model space M by taking the pull-back of the canonical symplectic form on the cotangent bundle ˜ of the centrally biextended loop group. We shall also write down the natural T ∗G Hamiltonian thus constructing the chiral geodesical model on the affine Kac–Moody ˜ Then we shall take steps which are not rooted in Sec. 3.1; namely, we group G. perform the symplectic reduction of that chiral master model (down to the chiral WZW model). ˜ can be simply Recall from Sec. A.2 that the symplectic potential θ˜ on T ∗ G ˜ ˜ expressed in the right trivialization K = βL g˜ as θ˜ = hβ˜L , d˜ g g˜−1 i .
(3.54)
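The dynamical system characterized by the symplectic form $d\tilde\theta$ and by the Hamiltonian
\[ \tilde H(\tilde K) = -\frac{1}{\kappa}\bigl(\tilde\beta_L(\tilde K), \tilde\beta_L(\tilde K)\bigr)_{\tilde{\mathcal G}^*} \tag{3.55} \]
is nothing but the master model (1.1).

Consider the affine model space $\tilde M_L = \tilde G \times \tilde A_+$. Its elements are couples $(\tilde k_L, \tilde\phi_L)$ and it is clearly a submanifold of $T^*\tilde G$. We can pull back the symplectic potential $\tilde\theta$ on $T^*\tilde G$ to $\tilde M_L$ by the map $(\tilde k_L, \tilde\phi_L) \to \tilde k_L\tilde\phi_L \in T^*\tilde G$, where the multiplication is considered in the sense of the group law of $T^*\tilde G$. The result is clearly
\[ \tilde\theta_L = \langle \tilde\phi_L, \tilde k_L^{-1} d\tilde k_L \rangle. \tag{3.56} \]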
The form $d\tilde\theta_L$ on $\tilde G \times \tilde A_+$ will turn out to be non-degenerate, hence it defines a symplectic structure.

We have seen that we can obtain the symplectic structure on the affine model space by the simple pullback of the canonical symplectic form on $T^*\tilde G$. We can show with the help of the affine Cartan decomposition that a sort of "inverse" procedure is also possible. Indeed, consider the direct product $\tilde M_L \times \tilde M_R$ of two copies of the model space, $\tilde M_L = \tilde G \times \tilde A_+$ and $\tilde M_R = \tilde G \times \tilde A_-$, where $\tilde A_- = -\tilde A_+$. Equip the manifold $\tilde M_L \times \tilde M_R$ with the symplectic form
\[ \tilde\omega_{L\times R} = d\tilde\theta_L + d\tilde\theta_R = d\langle \tilde\phi_L, \tilde k_L^{-1} d\tilde k_L \rangle + d\langle \tilde\phi_R, \tilde k_R^{-1} d\tilde k_R \rangle. \tag{3.57} \]
The cotangent bundle $T^*\tilde G$ with its canonical symplectic structure $\tilde\omega = d\tilde\theta$ can be obtained by the appropriate symplectic reduction of the symplectic manifold $(\tilde M_L \times \tilde M_R, \tilde\omega_{L\times R})$ induced by equating $\tilde\phi_L + \tilde\phi_R = 0$. The argument proving this statement is step by step identical to the finite dimensional argument of Sec. 3.2 and we shall not repeat it here.

We have just shown that the symplectic structure of the geodesical model on $\tilde G$ can be obtained by the symplectic reduction of the product of two affine model spaces. It also turns out that the Hamiltonian on $T^*\tilde G$ can be descended from a Hamiltonian on $\tilde M_L \times \tilde M_R$. The latter is given by
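\[ \tilde H_{L\times R} = \tilde H_L + \tilde H_R = -\frac{1}{2\kappa}(\tilde\phi_L, \tilde\phi_L)_{\tilde{\mathcal G}^*} - \frac{1}{2\kappa}(\tilde\phi_R, \tilde\phi_R)_{\tilde{\mathcal G}^*}. \tag{3.58} \]
Here $(\cdot,\cdot)_{\tilde{\mathcal G}^*}$ is the form (2.22).

The symplectic reduction from $\tilde M_L \times \tilde M_R$ to $T^*\tilde G$ is governed by the moment map $\tilde\phi_L + \tilde\phi_R$. It generates the simultaneous right action of the group $\exp\tilde{\mathcal T} \equiv \mathbb T \times U(1) \times \mathbb R_S$ on $\tilde k_L$ and $\tilde k_R$. The Hamiltonian $\tilde H_{L\times R}$ is invariant with respect to this action, hence it defines the Hamiltonian $\tilde H$ on $T^*\tilde G$ given by
\begin{align*}
\tilde H &= -\frac{1}{2\kappa}\bigl(\widetilde{\mathrm{Coad}}_{\tilde k_L}\tilde\phi_L, \widetilde{\mathrm{Coad}}_{\tilde k_L}\tilde\phi_L\bigr)_{\tilde{\mathcal G}^*} - \frac{1}{2\kappa}\bigl(\widetilde{\mathrm{Coad}}_{\tilde k_L}\tilde\phi_L, \widetilde{\mathrm{Coad}}_{\tilde k_L}\tilde\phi_L\bigr)_{\tilde{\mathcal G}^*} \\
&= -\frac{1}{\kappa}\bigl(\tilde\beta_L(\tilde k_L\tilde\phi\tilde k_R^{-1}), \tilde\beta_L(\tilde k_L\tilde\phi\tilde k_R^{-1})\bigr)_{\tilde{\mathcal G}^*} = -\frac{1}{\kappa}\bigl(\tilde\beta_L(\tilde K), \tilde\beta_L(\tilde K)\bigr)_{\tilde{\mathcal G}^*}. \tag{3.59}
\end{align*}
Thus we have recovered the Hamiltonian $\tilde H$ given by the formula (3.55).

In what follows, we are therefore going to study the chiral model on $\tilde M_L$ given by the action
\[ \tilde S_L(\tilde k_L, \tilde\phi_L) = \int d\tau \left[ \Bigl\langle \tilde\phi_L, \tilde k_L^{-1}\frac{d}{d\tau}\tilde k_L \Bigr\rangle + \frac{1}{2\kappa}(\tilde\phi_L, \tilde\phi_L)_{\tilde{\mathcal G}^*} \right]. \tag{3.60} \]
So far we have learned that the master model (1.1) on $\tilde G$ admits the chiral decomposition into two chiral models (3.60). By this we mean that it can be defined by the symplectic reduction of the model defined on $\tilde M_L \times \tilde M_R$ and characterized by the symplectic form $\tilde\omega_{L\times R}$ and by the Hamiltonian $\tilde H_{L\times R}$. Combining this fact with the results of Sec. 2.2, we learn that the standard WZW model can be produced by combining the two models (3.60) and then performing the symplectic reduction. Next we shall show that we arrive at the same result (WZW model) if we first perform a simple symplectic reduction at the chiral level (3.60) and then combine two such reduced (chiral WZW) models.

3.2.3. Chiral reduction to the first floor

In order to perform the first step of the reduction, we should evaluate the standard (Abelian) moment maps generating the left action of $\tilde G$ on $\tilde M_L$. The symplectic potential $\tilde\theta_L$ (hence the symplectic form $d\tilde\theta_L$) is clearly invariant with respect to the left multiplication by any $\tilde k_0 \in \tilde G$. Consider an infinitesimal vector field $\tilde V =$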
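$R_{\tilde k_L *}\tilde T$ on $\tilde M_L$ corresponding to the left action of a generator $\tilde T \in \tilde{\mathcal G}$. As usual, the corresponding moment map $\langle \tilde M, \tilde T \rangle$ is defined by the relation
\[ -i_{\tilde V}\, d\tilde\theta_L \equiv d\tilde\theta_L(\cdot, \tilde V) = d\langle \tilde M, \tilde T \rangle. \tag{3.61} \]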
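The invariance of the symplectic potential $\tilde\theta_L$ means the vanishing of its Lie derivative with respect to $\tilde V$. In other words,
\[ (i_{\tilde V}\, d + d\, i_{\tilde V})\,\tilde\theta_L = 0. \tag{3.62} \]
From this relation, it immediately follows that
\[ \langle \tilde M, \tilde T \rangle = i_{\tilde V}\tilde\theta_L = \langle \tilde\phi_L, L_{\tilde k_L^{-1} *} R_{\tilde k_L *}\tilde T \rangle = \langle \widetilde{\mathrm{Coad}}_{\tilde k_L}\tilde\phi_L, \tilde T \rangle = \langle \tilde\beta_L(\tilde k_L\tilde\phi_L), \tilde T \rangle. \tag{3.63} \]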
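The step from (3.61) and (3.62) to (3.63) can be spelled out as follows. This is only a sketch of the standard argument; it uses nothing beyond the definitions given above.

```latex
% (3.62) says i_V d\theta = - d(i_V \theta). Inserting this into (3.61):
\begin{align*}
d\langle \tilde M, \tilde T\rangle
  &= -\, i_{\tilde V}\, d\tilde\theta_L
   = d\bigl(i_{\tilde V}\tilde\theta_L\bigr), \\
i_{\tilde V}\tilde\theta_L
  &= \bigl\langle \tilde\phi_L,\ (\tilde k_L^{-1} d\tilde k_L)(\tilde V)\bigr\rangle
   = \bigl\langle \tilde\phi_L,\ L_{\tilde k_L^{-1}*} R_{\tilde k_L *}\tilde T\bigr\rangle .
\end{align*}
% Hence <M,T> agrees with i_V \theta up to an additive constant,
% and (3.63) corresponds to choosing that constant equal to zero.
```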
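Since $\tilde\beta_L(\tilde k_L\tilde\phi_L) = \widetilde{\mathrm{Coad}}_{\tilde k_L}\tilde\phi_L$ is the moment map of the standard Hamiltonian left action of $\tilde G$ on $\tilde M_L$, we infer immediately the following Poisson brackets of its coefficient functions on $\tilde M_L$:
\[ \{\langle \tilde\beta_L(\tilde k_L\tilde\phi_L), \tilde x \rangle, \langle \tilde\beta_L(\tilde k_L\tilde\phi_L), \tilde y \rangle\}_{\tilde M_L} = \langle \tilde\beta_L(\tilde k_L\tilde\phi_L), [\tilde x, \tilde y] \rangle, \qquad \tilde x, \tilde y \in \tilde{\mathcal G}. \tag{3.64} \]
A particular case of these Poisson brackets will play an important role in what follows:
\[ \{\langle \tilde\beta_L(\tilde k_L\tilde\phi_L), (0,\xi,0) \rangle, \langle \tilde\beta_L(\tilde k_L\tilde\phi_L), (0,\eta,0) \rangle\}_{\tilde M_L} = \langle \tilde\beta_L(\tilde k_L\tilde\phi_L), (0,[\xi,\eta],0) \rangle + a_L^\infty\, \rho(\xi,\eta), \qquad \xi, \eta \in \mathcal G. \tag{3.65} \]
Here $\rho$ is the loop group cocycle (A.16) and we have used the fact that
\[ \langle \tilde\beta_L(\tilde k_L\tilde a_L), \tilde T^\infty \rangle = a_L^\infty. \tag{3.66} \]
$a_L^\infty$ is defined by the decomposition (for the notation cf. Sec. 3.2.1)
\[ \tilde\phi_L = (a_L^0, \phi_L, a_L^\infty)^*. \tag{3.67} \]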
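Note also that $\tilde k_L\tilde\phi_L$ is the product in the sense of $T^*\tilde G$.

Now we parametrize $\tilde k_L = u\hat k_L$, $u = \exp s\tilde T^0$, and rewrite the action (3.60) as
\[ \tilde S_L(s, \hat k_L, \tilde\phi_L) = \int d\tau \left[ \langle \widetilde{\mathrm{Coad}}_{\hat k_L}\tilde\phi_L, \tilde T^0 \rangle \frac{ds}{d\tau} + \Bigl\langle \hat\phi_L, \hat k_L^{-1}\frac{d}{d\tau}\hat k_L \Bigr\rangle + \frac{1}{2\kappa}(\tilde\phi_L, \tilde\phi_L)_{\tilde{\mathcal G}^*} \right]. \tag{3.68} \]
Of course, $\langle \widetilde{\mathrm{Coad}}_{\hat k_L}\tilde\phi_L, \tilde T^0 \rangle \equiv \tilde\beta_L^0$ is the moment map generating the infinitesimal left action of $\tilde T^0$ and we put
\[ \hat\phi_L = (0, \phi_L, a_L^\infty)^*. \tag{3.69} \]
Now we introduce the set of coordinates $(s, \tilde\beta_L^0, \hat k_L, \hat\phi_L)$ and, using the formula (2.23), we calculate
\[ \tilde\beta_L^0 = \langle \widetilde{\mathrm{Coad}}_{\hat k_L}\tilde\phi_L, \tilde T^0 \rangle = a_L^0 + \langle \phi_L, k_L^{-1}\partial k_L \rangle + \frac{1}{2} a_L^\infty (k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G}. \tag{3.70} \]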
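As usual, here $k_L = \pi(\hat k_L)$. The action (3.68) finally becomes
\begin{align*}
\tilde S_L(s, \tilde\beta_L^0, \hat k_L, \hat a_L) = \int d\tau \Bigl[ &\tilde\beta_L^0 \frac{ds}{d\tau} - \frac{1}{\kappa}\tilde\beta_L^0 a_L^\infty + \Bigl\langle \hat\phi_L, \hat k_L^{-1}\frac{d}{d\tau}\hat k_L \Bigr\rangle + \frac{1}{2\kappa}(\phi_L, \phi_L)_{\mathcal G^*} \\
&+ \frac{a_L^\infty}{\kappa}\langle \phi_L, k_L^{-1}\partial k_L \rangle + \frac{(a_L^\infty)^2}{2\kappa}(k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G} \Bigr]. \tag{3.71}
\end{align*}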
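As a consistency check, (3.71) follows from (3.68) by eliminating $a_L^0$ with the help of (3.70). The sketch below assumes that the form (2.22), which is not reproduced in this section, evaluates as $(\tilde\phi_L,\tilde\phi_L)_{\tilde{\mathcal G}^*} = (\phi_L,\phi_L)_{\mathcal G^*} - 2a_L^0 a_L^\infty$; this normalization is inferred from comparing (3.68) with (3.71) and is not quoted from the source.

```latex
\begin{align*}
a_L^0 &= \tilde\beta_L^0 - \langle\phi_L, k_L^{-1}\partial k_L\rangle
        - \tfrac12\, a_L^\infty (k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G}
        && \text{(from (3.70))} \\[2pt]
\frac{1}{2\kappa}(\tilde\phi_L,\tilde\phi_L)_{\tilde{\mathcal G}^*}
      &= \frac{1}{2\kappa}(\phi_L,\phi_L)_{\mathcal G^*} - \frac{1}{\kappa}\, a_L^0 a_L^\infty
        && \text{(assumed form of (2.22))} \\
      &= \frac{1}{2\kappa}(\phi_L,\phi_L)_{\mathcal G^*}
        - \frac{1}{\kappa}\,\tilde\beta_L^0 a_L^\infty
        + \frac{a_L^\infty}{\kappa}\langle\phi_L, k_L^{-1}\partial k_L\rangle
        + \frac{(a_L^\infty)^2}{2\kappa}(k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G} .
\end{align*}
% These are exactly the kappa-dependent terms of (3.71).
```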
The Hamiltonian $\tilde H_L = -\frac{1}{2\kappa}(\tilde\phi_L, \tilde\phi_L)_{\tilde{\mathcal G}^*}$ is obviously invariant with respect to the action of $\tilde T^0$ on $\tilde M_L$, hence it Poisson-commutes with the moment map $\tilde\beta_L^0$ and we can consistently set $\tilde\beta_L^0$ to zero in (3.71). This constitutes the first step of the symplectic reduction.

3.2.4. Ground floor: standard chiral WZW model

Recall that the first step of the symplectic reduction from the chiral master model (3.60) gave the result
\begin{align*}
\hat S_L(\hat k_L, \hat\phi_L) = \int d\tau \Bigl[ &\Bigl\langle \hat\phi_L, \hat k_L^{-1}\frac{d}{d\tau}\hat k_L \Bigr\rangle + \frac{1}{2\kappa}(\phi_L, \phi_L)_{\mathcal G^*} \\
&+ \frac{a_L^\infty}{\kappa}\langle \phi_L, k_L^{-1}\partial k_L \rangle + \frac{(a_L^\infty)^2}{2\kappa}(k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G} \Bigr]. \tag{3.72}
\end{align*}
This first floor chiral theory is formulated on the phase space $\hat M_L = \hat G \times \hat A_+$ that we shall call the reduced affine model space. Recall that the elements of the alcove $\hat A_+$ have the form $(0, \phi_L, a^\infty)$, where $\phi_L \in A_+^{a^\infty}$. The Hamiltonian $\hat H_L$ is given by the collection of terms in (3.72) depending on $\kappa$. On the other hand, the symplectic potential is independent of $\kappa$. In order to perform the second step of the symplectic reduction, it will be convenient to express the symplectic form
\[ \hat\omega_L = d\hat\theta_L = d\langle \hat\phi_L, \hat k_L^{-1} d\hat k_L \rangle \tag{3.73} \]
in some basis of the Lie algebra $\hat{\mathcal G} = \widehat{L\mathcal G_0}$. The convenient basis can be obtained by injecting a basis of $\mathcal G = L\mathcal G_0$ into $\hat{\mathcal G}$ by the map $\iota$ and adding the generator $\hat T^\infty$. The basis of $L\mathcal G_0$, in turn, can be naturally constructed from the canonical Cartan–Weyl basis of the complexified Lie algebra $L\mathcal G_0^{\mathbb C}$. The step generators of the latter are of the form
\[ E^\alpha e^{in\sigma} \equiv E_n^\alpha, \quad n \in \mathbb Z; \qquad H^\mu e^{in\sigma} \equiv H_n^\mu, \quad n \in \mathbb Z, \ n \neq 0, \tag{3.74} \]
where $E^\alpha$, $H^\mu$ are the basis of $\mathcal G_0^{\mathbb C}$ (cf. Sec. 3.1.3) and $\sigma$ is the loop parameter. In what follows, we shall often denote a generic element of the set (3.74) as $E^{\hat\alpha}$, where $\hat\alpha$ stands for the corresponding labels $(\alpha, n)$ or $(\mu, n \neq 0)$. If $\hat\alpha \in \hat\Phi$ is such that $\alpha$, $\mu$
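are arbitrary and $n > 0$, or $\alpha \in \Phi_+$ and $n = 0$, we say that $\hat\alpha \in \hat\Phi_+$. The basis of the Lie algebra $L\mathcal G_0$ can then be chosen as $(T^\mu, B^{\hat\alpha}, C^{\hat\alpha})$, $\hat\alpha \in \hat\Phi_+$, where
\[ T^\mu = iH^\mu, \qquad B^{\hat\alpha} = \frac{i}{\sqrt 2}(E^{\hat\alpha} + E^{-\hat\alpha}), \qquad C^{\hat\alpha} = \frac{1}{\sqrt 2}(E^{\hat\alpha} - E^{-\hat\alpha}). \tag{3.75} \]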
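Here by $-\hat\alpha$ we mean $(-\alpha, -n)$ for $\hat\alpha = (\alpha, n)$ and $(\mu, -n)$ for $\hat\alpha = (\mu, n)$. It turns out that this basis is orthogonal with respect to the form $(\cdot,\cdot)_{\mathcal G}$ defined in (A.1). The dual basis to (3.75) will be denoted as $t_\mu, b_{\hat\alpha}, c_{\hat\alpha}$, $\hat\alpha \in \hat\Phi_+$.

Now we can finally write down the basis of $\hat{\mathcal G} = \widehat{L\mathcal G_0}$ alluded to above; it reads
\[ \hat T^\infty, \ \iota(T^\mu), \ \iota(B^{\hat\alpha}), \ \iota(C^{\hat\alpha}), \qquad \hat\alpha \in \hat\Phi_+. \tag{3.76} \]
The dual basis is
\[ \hat t_\infty, \ \pi^*(t_\mu), \ \pi^*(b_{\hat\alpha}), \ \pi^*(c_{\hat\alpha}), \qquad \hat\alpha \in \hat\Phi_+, \tag{3.77} \]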
where the map $\pi^* : \mathcal G^* \to \hat{\mathcal G}^*$ is induced by the exact sequence (2.2). By using the general formula (A.14) and the explicit form (A.16) of the cocycle, there is no problem to write down all commutation relations among the generators of the basis (3.76). Here we shall write down only the commutators relevant for further discussion, or, in other words, only the commutators which are not annihilated by all elements of $\mathrm{Span}(\hat t_\infty, \pi^*(t_\mu))$. Thus we have for every $\hat\alpha \in \hat\Phi_+$
\[ [\iota(B^{\hat\alpha}), \iota(C^{\hat\alpha})] = -i\hat\alpha^\vee, \tag{3.78} \]
where $\hat\alpha^\vee$ is the so-called affine coroot. It is given explicitly as follows:
\[ -i\hat\alpha^\vee = \iota(-i\alpha^\vee) - \frac{2n}{|\alpha|^2}\hat T^\infty, \quad \hat\alpha = (\alpha, n); \qquad -i\hat\alpha^\vee = -n\hat T^\infty, \quad \hat\alpha = (\mu, n). \tag{3.79} \]
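Now we are ready to study the symplectic form $\hat\omega_L = d\hat\theta_L$ on the reduced model space $\hat M_L$. In what follows, we shall suppress the subscript $L$ on the coordinates $(\hat k_L, \hat\phi_L)$ of the model space. First of all, $\hat\omega_L$ can be written as
\begin{align*}
\hat\omega_L &= \langle d\hat\phi \stackrel{\wedge}{,} \hat k^{-1} d\hat k \rangle - \langle \hat\phi, \hat k^{-1} d\hat k \wedge \hat k^{-1} d\hat k \rangle \\
&= da^\infty \wedge L^*_{\hat k^{-1}}\hat t_\infty + d(a^\infty a^\mu) \wedge L^*_{\hat k^{-1}}\pi^*(t_\mu) + \sum_{\hat\alpha \in \hat\Phi_+} \langle \hat\phi, i\hat\alpha^\vee \rangle\, L^*_{\hat k^{-1}}\pi^*(b_{\hat\alpha}) \wedge L^*_{\hat k^{-1}}\pi^*(c_{\hat\alpha}). \tag{3.80}
\end{align*}
Here we have set
\[ \hat\phi = a^\infty \hat t_\infty + a^\infty a^\mu \pi^*(t_\mu) \tag{3.81} \]
and used
\[ \hat k^{-1} d\hat k = L^*_{\hat k^{-1}}\hat t_\infty \otimes \hat T^\infty + L^*_{\hat k^{-1}}\pi^*(t_\mu) \otimes \iota(T^\mu) + L^*_{\hat k^{-1}}\pi^*(b_{\hat\alpha}) \otimes \iota(B^{\hat\alpha}) + L^*_{\hat k^{-1}}\pi^*(c_{\hat\alpha}) \otimes \iota(C^{\hat\alpha}). \tag{3.82} \]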
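Note that the normalization is chosen in such a way that the $a^\mu$'s parametrize the alcove $A_+^1$, i.e. $a^\mu t_\mu \in A_+^1$. We have
\[ \langle \hat\phi, i\hat\alpha^\vee \rangle = \frac{2a^\infty}{|\alpha|^2}\bigl(n + \langle \alpha, H^\mu \rangle a^\mu\bigr), \qquad \hat\alpha = (\alpha, n); \tag{3.83} \]
\[ \langle \hat\phi, i\hat\alpha^\vee \rangle = n a^\infty, \qquad \hat\alpha = (\mu, n). \tag{3.84} \]
This follows from the well-known fact
\[ \alpha^\vee = \frac{2}{|\alpha|^2}\langle \alpha, H^\mu \rangle H^\mu. \tag{3.85} \]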
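For the root labels $\hat\alpha = (\alpha, n)$, the formula (3.83) can be verified in two lines from (3.79), (3.81) and (3.85). The sketch below uses only $T^\mu = iH^\mu$ and the duality of the bases (3.76) and (3.77), i.e. $\langle \pi^*(t_\mu), \iota(T^\nu)\rangle = \delta_\mu^\nu$, $\langle \hat t_\infty, \hat T^\infty\rangle = 1$, with all mixed pairings vanishing.

```latex
\begin{align*}
i\hat\alpha^\vee &= \iota(i\alpha^\vee) + \frac{2n}{|\alpha|^2}\,\hat T^\infty ,
\qquad
i\alpha^\vee = \frac{2}{|\alpha|^2}\,\langle\alpha,H^\mu\rangle\, T^\mu , \\[2pt]
\langle\hat\phi,\, i\hat\alpha^\vee\rangle
 &= a^\infty\,\frac{2n}{|\alpha|^2}
  + a^\infty a^\mu\,\frac{2}{|\alpha|^2}\,\langle\alpha,H^\mu\rangle
  = \frac{2a^\infty}{|\alpha|^2}\bigl(n + \langle\alpha,H^\mu\rangle a^\mu\bigr) .
\end{align*}
% For \hat\alpha = (\mu, n) only the central term i\hat\alpha^\vee = n \hat T^\infty
% is present, which gives (3.84) directly.
```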
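Theorem 3.7. The symplectic reduction of $\hat\omega_L$, induced by setting the moment map $\langle \hat\beta_L(\hat k\hat\phi), \hat T^\infty \rangle$ equal to some real number $\kappa$, gives the chiral WZW symplectic form $\omega_L^{WZ}$ on the WZW model space $M_L^{WZ} = G \times A_+^1 = LG_0 \times A_+^1$.

Proof. First we observe that $\langle \hat\beta_L(\hat k\hat\phi), \hat T^\infty \rangle = a^\infty$. The form $\hat\omega_L$ restricted to the surface $a^\infty = \kappa$ becomes
\begin{align*}
\hat\omega_L|_{a^\infty = \kappa} = {}& \kappa\, da^\mu \wedge \pi^* L^*_{\pi(\hat k)^{-1}} t_\mu \\
&+ \kappa \sum_{\hat\alpha = (\alpha, n) \in \hat\Phi_+} \frac{2}{|\alpha|^2}\bigl(n + \langle \alpha, H^\mu \rangle a^\mu\bigr)\, \pi^* L^*_{\pi(\hat k)^{-1}} b_{\hat\alpha} \wedge \pi^* L^*_{\pi(\hat k)^{-1}} c_{\hat\alpha} \\
&+ \kappa \sum_{\hat\alpha = (\mu, n) \in \hat\Phi_+} n\, \pi^* L^*_{\pi(\hat k)^{-1}} b_{\hat\alpha} \wedge \pi^* L^*_{\pi(\hat k)^{-1}} c_{\hat\alpha}. \tag{3.86}
\end{align*}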
In deriving this formula, we have used (3.83) and (3.84) and also the fact that $\pi$ is a group homomorphism, which implies that $\pi^* L^*_{\pi(\hat k)^{-1}} = L^*_{\hat k^{-1}}\pi^*$.

Since we know that the moment map $\langle \hat\beta_L(\hat k\hat a), \hat T^\infty \rangle = a^\infty$ generates the central circle action, we conclude immediately that the kernel of the form $\hat\omega_L$ restricted to $a^\infty = \kappa$ is spanned by the vectors $L_{\hat k *}\hat T^\infty$. This can also be seen directly from the formula (3.86) since the central circle does not act on the coordinates $a^\mu$, $a^\infty$ of the reduced affine model space $\hat M_L$. The restricted form (3.86) is therefore the pullback of some two-form $\omega'_L$ on the manifold $M_L = G \times A_+^1$ by the map $\pi : (\hat k, a^\mu) \to (\pi(\hat k), a^\mu)$. It remains to find this two-form $\omega'_L$ on $M_L$. The first term in (3.86) can be rewritten as
\[ \kappa\, da^\mu \wedge \pi^* L^*_{\pi(\hat k)^{-1}} t_\mu = \kappa\, \pi^*\bigl(da^\mu \wedge \langle t_\mu, k^{-1} dk \rangle\bigr). \tag{3.87} \]
Then we have
\[ \kappa \sum_{\hat\alpha = (\alpha, n) \in \hat\Phi_+} \frac{2}{|\alpha|^2}\langle \alpha, H^\mu \rangle a^\mu\, \pi^* L^*_{\pi(\hat k)^{-1}} b_{\hat\alpha} \wedge \pi^* L^*_{\pi(\hat k)^{-1}} c_{\hat\alpha} = -\kappa\, \pi^* \langle a^\mu t_\mu, k^{-1} dk \wedge k^{-1} dk \rangle. \tag{3.88} \]
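Here we have used the commutation relations in the Lie algebra $L\mathcal G_0$:
\[ [B^{\hat\alpha}, C^{\hat\alpha}] = -i\alpha^\vee, \qquad \hat\alpha = (\alpha, n); \tag{3.89} \]
\[ [B^{\hat\alpha}, C^{\hat\alpha}] = 0, \qquad \hat\alpha = (\mu, n). \tag{3.90} \]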
By using the same commutation relations and the cocycle formula (A.16), we find that the remaining term proportional to $\kappa$ is in fact equal to $-\frac{\kappa}{2}\pi^*(k^{-1}dk \stackrel{\wedge}{,} \partial_\sigma(k^{-1}dk))_{\mathcal G}$. Putting them all together,
\[ \omega'_L = \kappa\, da^\mu \wedge \langle t_\mu, k^{-1}dk \rangle - \kappa \langle a^\mu t_\mu, k^{-1}dk \wedge k^{-1}dk \rangle - \frac{\kappa}{2}\bigl(k^{-1}dk \stackrel{\wedge}{,} \partial_\sigma(k^{-1}dk)\bigr)_{\mathcal G}. \tag{3.91} \]
Now we make a comparison with the formula (4.5) of [19] to conclude that, up to the $(2\pi)$ normalization (cf. Sec. 2.2.3), our $\omega'_L$ is indeed the symplectic form $\omega_L^{WZ}$ of the chiral WZW model. The theorem is proved.

We conclude this paragraph by writing the formula for the (doubly) reduced Hamiltonian on the WZW model space $M_L^{WZ}$. It can be read off from the formula (3.72):
\[ H_L^{WZ} = -\frac{1}{2\kappa}(\phi_L, \phi_L)_{\mathcal G^*} - \langle \phi_L, k_L^{-1}\partial k_L \rangle - \frac{\kappa}{2}(k_L^{-1}\partial k_L, k_L^{-1}\partial k_L)_{\mathcal G}, \tag{3.92} \]
where $\phi_L = \kappa a^\mu t_\mu$. This coincides with the Sugawara Hamiltonian of the chiral WZW model as we shall see in Sec. 3.2.6. Having obtained the correct symplectic form and Hamiltonian, we have indeed produced the standard chiral WZW theory by the two-step chiral symplectic reduction from the chiral master model (1.2). The fact that the full left-right WZW model can be obtained by combining two chiral WZW theories has been explained, e.g. in [19], and we shall not repeat this argument here.
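3.2.5. Affine dynamical r-matrix

Our next task is to prepare for the quasitriangular generalization described later in this paper. For this we have to invert the chiral WZW symplectic form $\omega_L^{WZ}$. We shall write it as follows:
\[ \omega_L^{WZ} = \kappa\, da^\mu \wedge L^*_{k^{-1}} t_\mu + \sum_{\hat\alpha \in \hat\Phi_+} \langle \hat\phi_\kappa, i\hat\alpha^\vee \rangle\, L^*_{k^{-1}} b_{\hat\alpha} \wedge L^*_{k^{-1}} c_{\hat\alpha}, \tag{3.93} \]
where
\[ \hat\phi_\kappa = \kappa \hat t_\infty + \kappa a^\mu \pi^*(t_\mu). \tag{3.94} \]
From here we immediately find the corresponding Poisson bivector $\Pi_L^{WZ}$:
\[ \Pi_L^{WZ}(k, a^\mu) = -\frac{1}{\kappa}\frac{\partial}{\partial a^\mu} \wedge L_{k*}T^\mu - \sum_{\hat\alpha \in \hat\Phi_+} \frac{1}{\langle \hat\phi_\kappa, i\hat\alpha^\vee \rangle}\, L_{k*}B^{\hat\alpha} \wedge L_{k*}C^{\hat\alpha}. \tag{3.95} \]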
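Let us calculate the Poisson bracket of certain special functions of the variables $(k, a^\mu) \in G \times A_+^1$. These functions are simply the matrix elements in some representation of $LG_0$. The bivector formula (3.95) then immediately implies the following Poisson brackets
\[ \{k \stackrel{\otimes}{,} k\}_{WZ} = (k \otimes k)\, \hat r_0(\hat\phi_\kappa), \tag{3.96} \]
\[ \{k, a^\mu\}_{WZ} = kT^\mu, \qquad \{a^\mu, a^\nu\}_{WZ} = 0, \tag{3.97} \]
where
\[ \hat r_0(\hat\phi_\kappa) = \sum_{\hat\alpha \in \hat\Phi} \frac{i}{\langle \hat\phi_\kappa, i\hat\alpha^\vee \rangle}\, E^{\hat\alpha} \otimes E^{-\hat\alpha}. \tag{3.98} \]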
The brackets (3.96) and (3.97) characterize completely the Poisson structure on the WZW model space $M_L^{WZ}$. Note also that the summation in (3.98) is not restricted only to the positive roots.

The fundamental braiding relation (3.96) can be rewritten in the representation corresponding to the pointwise action of the loop group element on the $G_0$-representation space $V_0$. In other words, consider a function on $LG_0$ of the form $\rho^{\sigma'}_{ij}(k)$, where $\rho$ is a matrix representation of the group $G_0$ and $\sigma'$ is some point on the loop. In words, this function is defined as follows: take an element $(k, a^\mu) \in M_L^{WZ}$, forget about $a^\mu$, consider the element $k(\sigma')$ of the group $G_0$ obtained by evaluation of $k$ at the point $\sigma'$ and finally take the matrix element $ij$ of the element $k(\sigma')$ in the $G_0$-representation $\rho$. If we recall the definition (3.75) of $B^{\hat\alpha}$ and $C^{\hat\alpha}$ in terms of $E^{\hat\alpha}$ and the definition (3.74) of $E^{\hat\alpha}$ in terms of $E^\alpha$, $H^\mu$ and $e^{in\sigma}$, we can directly derive from (3.98)
\[ \{k(\sigma) \stackrel{\otimes}{,} k(\sigma')\}_{WZ} = (k(\sigma) \otimes k(\sigma'))\, \hat r_0(\hat\phi_\kappa, \sigma - \sigma'), \tag{3.99} \]
where the affine dynamical r-matrix is denoted as $\hat r_0(\hat\phi_\kappa, \sigma - \sigma')$ and defined as
\begin{align*}
\hat r_0(\hat\phi_\kappa, \sigma - \sigma') = {}& i \sum_{\alpha \in \Phi,\, n \in \mathbb Z} \frac{|\alpha|^2}{2}\, \frac{1}{\kappa\bigl(n + \langle \alpha, H^\mu \rangle a^\mu\bigr)}\, E^\alpha \otimes E^{-\alpha} \exp in(\sigma - \sigma') \\
&+ i \sum_{\mu,\, n \in \mathbb Z,\, n \neq 0} \frac{1}{n\kappa}\, H^\mu \otimes H^\mu \exp in(\sigma - \sigma'). \tag{3.100}
\end{align*}
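The two kinds of sums over $n$ in (3.100) are standard Fourier series. As a sketch, they can be written in closed form on the open interval $0 < x < 2\pi$; the shorthand $x = \sigma - \sigma'$ and $c = \langle\alpha, H^\mu\rangle a^\mu$ is introduced here only for this remark, and $c$ is assumed not to be an integer.

```latex
% Shorthand (not from the source): x = \sigma - \sigma',  c = <\alpha, H^\mu> a^\mu
\begin{align*}
i\sum_{n\neq 0}\frac{e^{inx}}{n}
  &= -2\sum_{n\ge 1}\frac{\sin nx}{n} = x - \pi , \\
i\sum_{n\in\mathbb Z}\frac{e^{inx}}{n+c}
  &= \frac{2\pi\, e^{-icx}}{e^{-2\pi i c}-1} ,
  \qquad 0 < x < 2\pi,\quad c\notin\mathbb Z .
\end{align*}
% The second identity is the Fourier expansion of e^{-icx} on (0, 2\pi):
% its n-th coefficient is
%   (1/2\pi) \int_0^{2\pi} e^{-i(c+n)x} dx = i (e^{-2\pi i c} - 1) / (2\pi (n+c)).
```

Outside the interval $(0, 2\pi)$ both right-hand sides have to be continued periodically, since the left-hand sides are manifestly $2\pi$-periodic in $x$.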
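It is important to note that the summation goes over all roots $\alpha \in \Phi$, not only over the positive ones. From (3.97), we can also derive the following bracket
\[ \{k(\sigma), a^\mu\}_{WZ} = \frac{1}{\kappa} k(\sigma) T^\mu. \tag{3.101} \]
It is simple to sum up the Fourier series in (3.100). The result is
\begin{align*}
\hat r_0(\hat\phi_\kappa, \sigma - \sigma') = {}& \frac{1}{\kappa}\, \mathrm{Per}(\sigma - \sigma' - \pi) \sum_\mu H^\mu \otimes H^\mu \\
&+ \sum_{\alpha \in \Phi} \frac{|\alpha|^2}{2\kappa}\, \frac{2\pi}{e^{-2\pi i a^\mu \langle \alpha, H^\mu \rangle} - 1}\, \mathrm{Per}\bigl(e^{-i a^\mu \langle \alpha, H^\mu \rangle (\sigma - \sigma')}\bigr)\, E^\alpha \otimes E^{-\alpha}, \tag{3.102}
\end{align*}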
where the notation $\mathrm{Per}(f(\sigma))$ means the function of $\sigma$ periodic with the period $2\pi$ and defined as $f(\sigma)$ for $\sigma \in [0, 2\pi]$.

3.2.6. Vertex-IRF transformation and braiding relation

The formula of the type (3.96) appears in the WZW literature [3, 10, 14, 15, 18] under the name of the exchange (braiding) relation. The reader might have noticed, however, that our formula (3.99) does not at all resemble, e.g. the braiding relations (26) and (31) of Ref. 14. The reason is that we have used different coordinates on the WZW model space $M_L^{WZ}$. In order to establish the equivalence of our approach with that of [14], we must perform the so-called classical vertex-IRF transformation (the terminology is borrowed from [19]). Consider a map $\sigma \to m(\sigma) \in G_0$ defined as
\[ m(\sigma) = k(\sigma) \exp\bigl(a^\mu \Upsilon(t_\mu)\sigma\bigr), \tag{3.103} \]
where $(k, a^\mu)$ are the old coordinates of the WZW model space $M_L^{WZ}$. Now we introduce the new set of "monodromic" coordinates $m(\sigma)$. The name is motivated by the fact that $m(\sigma)$ is no longer a single-valued function but it develops a monodromy upon going around the circle of sigmas. This monodromy is encoded in the variables $a^\mu$, hence $m(\sigma)$ encodes the information about both $k$ and $a^\mu$. We wish to calculate the exchange relation (3.99) in terms of the variables $m$. We shall use the following obvious matrix relation
\begin{align*}
\{AB \stackrel{\otimes}{,} CD\} = {}& (A \otimes 1)\{B \stackrel{\otimes}{,} C\}(1 \otimes D) + (A \otimes C)\{B \stackrel{\otimes}{,} D\} \\
&+ \{A \stackrel{\otimes}{,} C\}(B \otimes D) + (1 \otimes C)\{A \stackrel{\otimes}{,} D\}(B \otimes 1) \tag{3.104}
\end{align*}
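and write
\begin{align*}
\{m(\sigma) \stackrel{\otimes}{,} m(\sigma')\}_{WZ} &= \{k(\sigma) e^{a^\mu \Upsilon(t_\mu)\sigma} \stackrel{\otimes}{,} k(\sigma') e^{a^\mu \Upsilon(t_\mu)\sigma'}\}_{WZ} \\
&= (m(\sigma) \otimes m(\sigma')) \Bigl[ -\frac{\sigma - \sigma'}{\kappa}(H^\mu \otimes H^\mu) \\
&\qquad + \bigl(e^{-a^\mu \Upsilon(t_\mu)\sigma} \otimes e^{-a^\mu \Upsilon(t_\mu)\sigma'}\bigr)\, \hat r_0(\hat\phi_\kappa, \sigma - \sigma')\, \bigl(e^{a^\mu \Upsilon(t_\mu)\sigma} \otimes e^{a^\mu \Upsilon(t_\mu)\sigma'}\bigr) \Bigr] \\
&\equiv (m(\sigma) \otimes m(\sigma'))\, B_0(\hat\phi_\kappa, \sigma - \sigma'). \tag{3.105}
\end{align*}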
We shall call $B_0(\hat\phi_\kappa, \sigma - \sigma')$ the quasiclassical braiding matrix. It is important to note that the argument $\sigma$ of the braiding matrix $B_0(\hat\phi_\kappa, \sigma)$ is an element of $\mathbb R$ and not of $S^1$. This is related to the fact that the monodromic coordinate $m(\sigma)$ on $M_L$ is multi-valued from the point of view of $S^1$. Considering $\sigma$ as an element of $\mathbb R$ makes the quantity $m(\sigma)$ single-valued and the Poisson bracket (3.105) well-defined.
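Now combining (3.102) and (3.105), we immediately arrive at (cf. [14])
\[ B_0(\hat\phi_\kappa, \sigma) = -\frac{\pi}{\kappa}\left[ \eta(\sigma)(H^\mu \otimes H^\mu) - i \sum_\alpha \frac{|\alpha|^2}{2}\, \frac{\exp\bigl(i\pi\eta(\sigma)\langle \alpha, H^\mu \rangle a^\mu\bigr)}{\sin\bigl(\pi \langle \alpha, H^\mu \rangle a^\mu\bigr)}\, E^\alpha \otimes E^{-\alpha} \right]. \tag{3.106} \]
Here $\eta(\sigma)$ is the function defined by
\[ \eta(\sigma) = 2\left[\frac{\sigma}{2\pi}\right] + 1, \tag{3.107} \]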
where $[\sigma/2\pi]$ is the largest integer less than or equal to $\sigma/2\pi$.

It turns out that important dynamical variables can be particularly simply expressed in terms of the new variables $m(\sigma)$. Before showing this, some discussion is needed about the moment maps generating the left $\hat G$-action on $\hat M_L$ and about their behavior under the symplectic reduction leading from the reduced affine model space $\hat M_L$ to the WZW model space $M_L^{WZ}$.

We already know that the moment maps $\hat\beta_L(\hat k\hat\phi) = \widehat{\mathrm{Coad}}_{\hat k}\hat\phi$ generate (via the Poisson bracket on $\hat M_L$) the left action of the group $\hat G$ on $\hat M_L$. In particular, the moment map $\langle \hat\beta_L, \hat T^\infty \rangle$ generates the central circle action on $\hat M_L = \hat G \times \hat A_+$. This means that
\[ \{\langle \hat\beta_L, \hat T^\infty \rangle, \langle \hat\beta_L, \iota(x) \rangle\}_{\hat M_L} = 0, \qquad x \in \mathcal G. \tag{3.108} \]
It then follows that
\[ \langle \hat\beta_L(\hat k\hat\phi), \iota(x) \rangle = \langle \hat\beta_L(e^{s\hat T^\infty}\hat k\hat\phi), \iota(x) \rangle, \qquad x \in \mathcal G, \tag{3.109} \]
or, in other words, $\langle \hat\beta_L, \iota(x) \rangle$ is an invariant function with respect to the central circle action. As such, it gives rise to some function on the space of the central circle orbits located at the submanifold $a^\infty = \kappa$ of the affine model space $\hat M_L$. The latter space of orbits is nothing but the reduced model space $M_L^{WZ}$, hence we conclude that $\langle \hat\beta_L, \iota(x) \rangle$ can be interpreted as an honest function on $M_L^{WZ}$. We denote it as $j_L^x(k, a^\mu)$.

Actually, the functions $j_L^x(k, a^\mu)$ are the "important dynamical variables" mentioned above. In fact they are nothing but the generators of the chiral current algebra. To see this, we calculate their Poisson brackets $\{j_L^x, j_L^y\}_{WZ}$. The computation follows the general procedure of the symplectic reduction at the level of the Poisson brackets as described in Appendix A.3.

Consider a pair of functions $\phi_i$, $i = 1, 2$ on $M_L^{WZ}$. We wish to calculate their reduced Poisson bracket $\{\phi_1, \phi_2\}_{WZ}$. In our particular situation, the procedure works as follows: define two functions $\hat\phi_i$ on $\hat M_L$ that fulfil
\[ \hat\phi_i(\hat k, a^\mu, a^\infty = \kappa) = \phi_i(\pi(\hat k), a^\mu), \qquad \hat k \in \widehat{LG_0}. \tag{3.110} \]
Calculate then the Poisson bracket $\{\hat\phi_1, \hat\phi_2\}_{\hat M_L}$ on $\hat M_L$. It verifies
\[ \{a^\infty, \{\hat\phi_1, \hat\phi_2\}_{\hat M_L}\}_{\hat M_L} = 0 \tag{3.111} \]
as a simple consequence of the Jacobi identity and the central circle invariance of $\hat\phi_i$. This means that there exists a function on $M_L^{WZ}$, denoted suggestively as $\{\phi_1, \phi_2\}_{WZ}$, which verifies
\[ \{\hat\phi_1, \hat\phi_2\}_{\hat M_L}(\hat k, a^\mu, a^\infty = \kappa) = \{\phi_1, \phi_2\}_{WZ}(\pi(\hat k), a^\mu). \tag{3.112} \]
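Needless to say, the function $\{\phi_1, \phi_2\}_{WZ}$ is the reduced Poisson bracket that we seek.

Now the function $\langle \hat\beta_L, \iota(x) \rangle$ on $\hat M_L$ plays the role of $\hat\phi_1$ for the function $\phi_1 = j_L^x$ on $M_L^{WZ}$. But we know from (3.64) and (2.7) that
\[ \{\langle \hat\beta_L, \iota(x) \rangle, \langle \hat\beta_L, \iota(y) \rangle\}_{\hat M_L} = \langle \hat\beta_L, \iota([x, y]) \rangle + \langle \hat\beta_L, \hat T^\infty \rangle\, \rho(x, y). \tag{3.113} \]
Using the fact that $\langle \hat\beta_L, \hat T^\infty \rangle = a^\infty$ and the relation (3.112), we obtain immediately
\[ \{j_L^x, j_L^y\}_{WZ} = j_L^{[x,y]} + \kappa\rho(x, y). \tag{3.114} \]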
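This is nothing but the basic relation defining the chiral current algebra. Let us calculate the currents $j_L^x$ as explicit functions of $k$, $a^\mu$. By using the formulae (3.38) and (2.24), we infer
\begin{align*}
\langle \hat\beta_L(\hat k\hat\phi), \iota(x) \rangle &= \langle \widehat{\mathrm{Coad}}_{\hat k}\bigl(a^\infty \hat t_\infty + a^\infty a^\mu \pi^*(t_\mu)\bigr), \iota(x) \rangle \\
&= \langle \pi^*\bigl(\mathrm{Coad}_k(a^\infty a^\mu t_\mu) + a^\infty \Upsilon^{-1}(\partial_\sigma k k^{-1})\bigr), \iota(x) \rangle, \tag{3.115}
\end{align*}
where $k = \pi(\hat k)$. Thus for the currents, we obtain
\[ j_L^x(k, a^\mu) = \kappa\bigl(a^\mu k\Upsilon(t_\mu)k^{-1} + \partial_\sigma k k^{-1}, x\bigr)_{\mathcal G}. \tag{3.116} \]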
This expression looks quite complicated but it drastically simplifies in the monodromic variables $m(\sigma)$:
\[ j_L^x = \kappa(\partial_\sigma m\, m^{-1}, x)_{\mathcal G}. \tag{3.117} \]
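The simplification can be checked directly from the definition (3.103) of the monodromic variables. In this sketch the only property used is that the exponent $a^\mu\Upsilon(t_\mu)$ does not depend on $\sigma$.

```latex
\begin{align*}
m(\sigma) &= k(\sigma)\, e^{a^\mu\Upsilon(t_\mu)\sigma} , \\
\partial_\sigma m\, m^{-1}
 &= \bigl(\partial_\sigma k + k\, a^\mu\Upsilon(t_\mu)\bigr)\,
    e^{a^\mu\Upsilon(t_\mu)\sigma}\, e^{-a^\mu\Upsilon(t_\mu)\sigma}\, k^{-1} \\
 &= \partial_\sigma k\, k^{-1} + a^\mu\, k\,\Upsilon(t_\mu)\, k^{-1} ,
\end{align*}
% which is exactly the first argument of the bilinear form in (3.116).
```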
Moreover, the Hamiltonian $H_L^{WZ}$ given by (3.92) also simplifies considerably:
\[ H_L^{WZ} = -\frac{1}{2\kappa}(\kappa\partial_\sigma m\, m^{-1}, \kappa\partial_\sigma m\, m^{-1})_{\mathcal G}. \tag{3.118} \]
This is the Sugawara formula expressing the chiral WZW Hamiltonian solely in terms of the Kac–Moody currents. The reader should not be confused by the minus sign. It is related to the negative definiteness of our bilinear form $(\cdot,\cdot)_{\mathcal G}$.

The monodromic variables are generally used in the study of the standard WZW model. It turns out, however, that viewing things from the point of view of the variables $k$, $a^\mu$ will be more insightful for seeking the quasitriangular generalization of the story.

Another important Poisson bracket is the following one:
\[ \{m(\sigma), (\kappa\partial_\sigma m(\sigma') m^{-1}(\sigma'), x_0)_{\mathcal G_0}\}_{WZ} = x_0\, m(\sigma)\, \delta(\sigma - \sigma'), \qquad x_0 \in \mathcal G_0. \tag{3.119} \]
It is a direct consequence of the braiding relation (3.105) and it expresses the simple fact that even after the symplectic reduction the current $\kappa\partial_\sigma m\, m^{-1}$ continues to
generate the left action of the centrally extended loop group on the WZW model space MLW Z . However, the action of the generator of the central circle is now trivial. Upon the quantization, the bracket (3.119) means that m(σ) is the Kac–Moody primary field. Finally, as an illustration of the suitability of the monodromic variables, we give the explicit expression for the solution of the classical field equations of the chiral WZW model. It turns out that the time evolution of the coordinates m(σ) is given by the following simple formula [m(σ)](τ ) = m(σ − τ ) .
(3.120)
This fact can be derived from the braiding relation (3.105) and from the Sugawara formula (3.118) (because those two ingredients anyway characterize completely the structure of the standard chiral WZW model). Here we shall offer another derivation based on the solution of the second floor chiral master model (1.2). Indeed, varying the action (1.2), we find in full analogy with the finite-dimensional calculation of Sec. 3.1.3 that the classical solution of the second floor chiral model reads
\[ \tilde k(\tau) = \tilde k_0 \exp\left(-\frac{\tilde\Upsilon(\tilde\phi_0)}{\kappa}\tau\right). \tag{3.121} \]
Since we have performed the reduction with respect to the left action of the group $\mathbb R_S$, we have to cast $\tilde k(\tau)$ as
\[ \tilde k(\tau) = e^{s(\tau)\tilde T^0}\hat k(\tau), \tag{3.122} \]
and suppress $e^{s(\tau)\tilde T^0} \in \mathbb R_S$. This corresponds to the first step of the reduction. The result is
\[ \hat k(\tau) = e^{-\frac{a^\infty}{\kappa}\tau\tilde T^0}\, \hat k_0\, e^{\frac{a^\infty}{\kappa}\tau\tilde T^0}\, e^{\frac{1}{\kappa}(a^0\hat T^\infty - a^\infty a^\mu \Upsilon(t_\mu))\tau}. \tag{3.123} \]
The reduction $a^\infty = \kappa$ to the ground floor gives
\[ k(\tau) = \pi(\hat k(\tau)) = k_0(\sigma - \tau)\, e^{-a^\mu \Upsilon(t_\mu)\tau}, \tag{3.124} \]
where $\pi(\hat k_0) \equiv k_0$. Combining the definition (3.103) of the monodromic variables with the last formula (3.124), we arrive directly at the desired relation (3.120).

4. Universal Quasitriangular WZW Model

In the two preceding sections, we have described two important dynamical systems. The first one, the geodesical model, was constructed for any Lie group $G$ possessing an invariant symmetric non-degenerate bilinear form on the Lie algebra $\mathcal G$ of $G$. The construction of the second one, the WZW model, necessitated moreover the existence of the central biextension $\tilde G$ of $G$. The symplectic structures of both these models have either been identified with or derived from the canonical symplectic structure of the cotangent bundle $T^*G$ (for the geodesical model) and of $T^*\tilde G$ (for the WZW model).
In this section, we are going to show that one can generalize both geodesical and WZW models mentioned above. A recipe on how to do this lies in a crucial observation that $T^*G$ is the so-called Heisenberg double of the group $G$. The latter is a certain group equipped with an additional structure that we shall describe in what follows. There may exist many different Heisenberg doubles of a given group $G$; for us it is important that the geodesical model and the WZW one can be constructed by using only those properties of $T^*G$ that are shared also by all other Heisenberg doubles of $G$. In particular, given a nontrivial Heisenberg double of the centrally biextended group $\tilde G$, we can define an associated WZW-like model. We shall refer to the latter as the universal quasitriangular WZW model.

In order to keep this paper as self-contained as possible, we are going to give here a quick review of the theory of Poisson–Lie groups and of the various related doubles of Lie groups. The reader may find it somewhat inconvenient to read through the text with demonstrations of the relevant standard propositions. However, these demonstrations involve many important facts, technical skills and computational tools that may facilitate the understanding of the present article.

4.1. Poisson–Lie primer

4.1.1. The Drinfeld double

A Poisson bracket on a Lie group manifold $G$ that is compatible with the group multiplication law is called the Poisson–Lie bracket. Denote by $\Delta : \mathrm{Fun}(G) \to \mathrm{Fun}(G) \otimes \mathrm{Fun}(G)$ the standard coproduct defined as
\[ (\Delta F)(g, g') = F(gg'), \qquad g, g' \in G, \quad F \in \mathrm{Fun}(G). \tag{4.1} \]
Then the condition of compatibility reads {∆F1 , ∆F2 }G×G = ∆{F1 , F2 }G .
(4.2)
Here {·, ·}G×G is the direct product Poisson bracket on G × G characterized by the condition {F1 (x)G1 (y), F2 (x)G2 (y)}G×G = {F1 (x), F2 (x)}G G1 (y)G2 (y) + F1 (x)F2 (x){G1 (y), G2 (y)}G ,
(4.3)
where x and y are coordinates on the first and second copy of G, respectively. Note that upon the quantization of the algebra of functions on the group manifold, we obtain the so-called quantum group. The quantum version of the Poisson–Lie condition (4.2) then becomes the usual statement in the theory of Hopf algebras saying that the coproduct is the algebra homomorphism. On a given group G, one may have several inequivalent Poisson–Lie brackets. In this paper, we shall always be concerned with one privileged way of constructing the Poisson–Lie structures on G that uses the concept of the Heisenberg double of G. Before defining this notion, we have to recall respectively the definitions of the Manin and Drinfeld doubles.
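For orientation, the compatibility condition (4.2) is often stated at the level of the Poisson bivector. The following is the standard equivalent formulation; the symbol $\Pi_G$ for the bivector is notation introduced only for this remark and is not used elsewhere in the text.

```latex
% \Pi_G denotes the bivector of the Poisson-Lie bracket:
%   \{F_1, F_2\}_G = \Pi_G(dF_1, dF_2).
\[
\Pi_G(gg') \;=\; L_{g*}\,\Pi_G(g') \;+\; R_{g'*}\,\Pi_G(g),
\qquad g,\, g' \in G .
\]
% Setting g = g' = e gives \Pi_G(e) = 2\,\Pi_G(e), hence \Pi_G(e) = 0:
% a Poisson-Lie bivector vanishes at the unit element, so the bracket
% cannot be symplectic on the whole group G.
```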
The Manin double of a Lie group $G$ is any Lie group $D$ whose dimension is twice as big as that of $G$ and that fulfils two other conditions.
(1) The double $D$ contains $G$ as its subgroup;
(2) The Lie algebra $\mathcal D$ of the double $D$ is equipped with an invariant symmetric nondegenerate bilinear form $(\cdot,\cdot)_{\mathcal D}$ such that the Lie algebra $\mathcal G$ of $G$ is isotropic, i.e. $(\mathcal G, \mathcal G)_{\mathcal D} = 0$.

Suppose now that the same group $D$ equipped with the form $(\cdot,\cdot)_{\mathcal D}$ on $\mathcal D$ is the Manin double of two Lie groups $G$ and $B$. We say that $D$ is the Drinfeld double of $G$ (and of $B$), if the Lie algebras $\mathcal G$ and $\mathcal B$ of $G$ and $B$ linearly generate the whole Lie algebra $\mathcal D$ of $D$. In other words,
\[ \mathcal D = \mathcal G + \mathcal B, \qquad (\mathcal G, \mathcal G)_{\mathcal D} = 0, \qquad (\mathcal B, \mathcal B)_{\mathcal D} = 0. \tag{4.4} \]
It is very important to realize that G + B means here the direct sum of two vector spaces and not the direct sum of two Lie algebras. The Lie bracket on D can be conveniently encoded in terms of the Lie brackets on G and B and of the form (·, ·)D as follows [X + α, Y + β]D = [X, Y ]G + [α, β]B + CoadX β − CoadY α + Coadα Y − Coadβ X .
(4.5)
Here $X, Y \in \mathcal G$ and $\alpha, \beta \in \mathcal B$. As far as the coadjoint action is concerned, the elements $\alpha, \beta$ are viewed as elements of $\mathcal G^*$ by the prescription
\[ \langle \alpha, X \rangle = (\alpha, X)_{\mathcal D}, \tag{4.6} \]
and similarly $X, Y$ are viewed as elements of $\mathcal B^*$ under the coadjoint action of $\mathcal B$.

4.1.2. The Heisenberg double

The group multiplication law in the Drinfeld double $D$ induces smooth maps $M_L : G \times B \to D$ and $M_R : B \times G \to D$ given by
\[ M_L(g, b) = gb, \qquad M_R(b, g) = bg, \qquad g \in G, \quad b \in B. \tag{4.7} \]
The crucial fact underlying this article can be formulated as the following theorem.

Theorem 4.1. If the maps $M_{L,R}$ defined above are bijective, then $D$ is a symplectic manifold and the following expression defines the symplectic form
\[ \omega = \frac{1}{2}(b_L^*\lambda_B \stackrel{\wedge}{,} g_R^*\rho_G)_{\mathcal D} + \frac{1}{2}(b_R^*\rho_B \stackrel{\wedge}{,} g_L^*\lambda_G)_{\mathcal D}. \tag{4.8} \]
The corresponding Poisson bracket reads
\[ \{\phi, \psi\}_D = \frac{1}{2}(\nabla^L\phi, R^*\nabla^L\psi)_{\mathcal D^*} + \frac{1}{2}(\nabla^R\phi, R^*\nabla^R\psi)_{\mathcal D^*}. \tag{4.9} \]
Remark. The Poisson bracket (4.9) was introduced by Semenov-Tian-Shansky in [43]. If the maps $M_{L,R}$ are not bijective, then (4.9) still defines a Poisson bracket
and the symplectic leaves of the corresponding Poisson structure were described in [2]. The bijectiveness of $M_{L,R}$ is often referred to as the property that the group $D$ is smoothly globally decomposable as $D = GB$ and $D = BG$.

Let us explain the symbols appearing in (4.8). The maps $b_L : D \to B$ and $g_R : D \to G$ are induced by the decomposition $D = BG$, and $g_L : D \to G$ and $b_R : D \to B$ by $D = GB$. The expression $\lambda_G$ ($\rho_G$) denotes the left (right) invariant $\mathcal G$-valued Maurer–Cartan form on the group $G$. Recall that
\[ \lambda_G(X_g) = L_{g^{-1}*}X_g, \qquad \rho_G(X_g) = R_{g^{-1}*}X_g, \qquad X_g \in T_g G. \tag{4.10} \]
Note that the forms $\lambda_G$ and $\rho_G$ are often written also as
\[ \lambda_G = g^{-1}dg, \qquad \rho_G = dg\, g^{-1}. \tag{4.11} \]
The notation used in (4.9) is as follows: $(\cdot,\cdot)_{\mathcal D^*}$ is the bilinear form on the dual of $\mathcal D$ induced by the (nondegenerate) bilinear form $(\cdot,\cdot)_{\mathcal D}$ and $R^* : \mathcal D^* \to \mathcal D^*$ is the map dual to $R : \mathcal D \to \mathcal D$. The latter is given by
\[ R = \mathrm{Pr}_{\mathcal B} - \mathrm{Pr}_{\mathcal G}, \tag{4.12} \]
where $\mathrm{Pr}_{\mathcal G}$ ($\mathrm{Pr}_{\mathcal B}$) is the projector on $\mathcal G$ ($\mathcal B$) with the kernel $\mathcal B$ ($\mathcal G$). Clearly, the decomposition $\mathcal D = \mathcal G + \mathcal B$ induces the corresponding decomposition of the dual $\mathcal D^* = \mathcal G^* + \mathcal B^*$ and
\[ R^* = \mathrm{Pr}_{\mathcal B^*} - \mathrm{Pr}_{\mathcal G^*}. \tag{4.13} \]
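Recall also the definitions (cf. (A.26), (A.27)) of the differential operators $\nabla^L$, $\nabla^R$ on $D$:
\[ \nabla^L : \mathrm{Fun}(D) \to \mathrm{Fun}(D) \otimes \mathcal D^*, \qquad \nabla^R : \mathrm{Fun}(D) \to \mathrm{Fun}(D) \otimes \mathcal D^*, \tag{4.14} \]
\[ \langle \nabla^L\phi, \alpha \rangle(K) = \frac{d}{ds}\Big|_{s=0}\phi(e^{s\alpha}K), \qquad \langle \nabla^R\phi, \alpha \rangle(K) = \frac{d}{ds}\Big|_{s=0}\phi(Ke^{s\alpha}). \tag{4.15} \]
Here $\alpha \in \mathcal D$, $K \in D$ and $\phi \in \mathrm{Fun}(D)$. It is useful to write the bracket (4.9) in some basis $T^i, t_i$; $i = 1, \ldots, \dim G$ of $\mathcal D$, where the $T^i$'s form the basis of $\mathcal G$ and the $t_i$'s the corresponding dual basis of $\mathcal B$. We obtain
\begin{align*}
\{\phi, \psi\}_D = {}& \frac{1}{2}\langle \nabla^L\phi, T^i \rangle\langle \nabla^L\psi, t_i \rangle - \frac{1}{2}\langle \nabla^L\phi, t_i \rangle\langle \nabla^L\psi, T^i \rangle \\
&+ \frac{1}{2}\langle \nabla^R\phi, T^i \rangle\langle \nabla^R\psi, t_i \rangle - \frac{1}{2}\langle \nabla^R\phi, t_i \rangle\langle \nabla^R\psi, T^i \rangle, \tag{4.16}
\end{align*}
where the standard Einstein summation convention is used. By the duality of the basis $t_i$ with respect to the basis $T^i$, we mean that the following relation holds:
\[ (t_i, T^j)_{\mathcal D} = \delta_i^j. \tag{4.17} \]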
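The passage from (4.9) to (4.16) can be sketched explicitly. The symbols $\tau_i$ and $\theta^i$ below denote the basis of $\mathcal D^*$ dual to $(T^i, t_i)$; they are auxiliary names introduced only for this check and do not appear in the source.

```latex
% Auxiliary dual basis of D^* (hypothetical names):
%   <\tau_i, T^j> = \delta_i^j,  <\theta^i, t_j> = \delta^i_j,  all other pairings zero.
% Then G^* = span(\tau_i), B^* = span(\theta^i), and by (4.17) and the isotropy (4.4)
% the induced form on D^* satisfies
%   (\tau_i, \theta^j)_{D^*} = \delta_i^j,  (\tau_i, \tau_j)_{D^*} = (\theta^i, \theta^j)_{D^*} = 0.
% By (4.13), R^* acts as -1 on the \tau_i and as +1 on the \theta^i.
\begin{align*}
\nabla\phi &= \langle\nabla\phi, T^i\rangle\,\tau_i + \langle\nabla\phi, t_i\rangle\,\theta^i , \\
R^*\nabla\psi &= -\langle\nabla\psi, T^i\rangle\,\tau_i + \langle\nabla\psi, t_i\rangle\,\theta^i , \\
(\nabla\phi,\, R^*\nabla\psi)_{\mathcal D^*}
 &= \langle\nabla\phi, T^i\rangle\langle\nabla\psi, t_i\rangle
  - \langle\nabla\phi, t_i\rangle\langle\nabla\psi, T^i\rangle .
\end{align*}
% Taking \nabla = \nabla^L and \nabla = \nabla^R in turn and multiplying by 1/2
% reproduces the four terms of (4.16).
```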
In order to prove Theorem 4.1, we need to handle in an efficient way the Semenov-Tian-Shansky symplectic form $\omega$ given by (4.8). We shall first prove the following lemma that will be used also for the proof of Theorem 4.6 in the next section.
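Lemma 4.2. Consider a point $K \in D$ and four linear subspaces of the tangent space $T_K D$ defined as $S_L = L_{K*}\mathcal G$, $S_R = R_{K*}\mathcal G$, $\tilde S_L = L_{K*}\mathcal B$ and $\tilde S_R = R_{K*}\mathcal B$. Let $\Pi_{L\tilde R}$ be a projector on $\tilde S_R$ with a kernel $S_L$ and $\Pi_{\tilde L R}$ a projector on $S_R$ with a kernel $\tilde S_L$. Then
\[ \omega(t, u) = \bigl(t, (\Pi_{\tilde L R} - \Pi_{L\tilde R})u\bigr)_{\mathcal D}, \tag{4.18} \]
where $t, u$ are two arbitrary vectors in the tangent space $T_K D$ at the point $K \in D$ and the metric $(\cdot,\cdot)_{\mathcal D}$ at the point $K$ is defined by the right or left transport of the bilinear form $(\cdot,\cdot)_{\mathcal D}$ defined at the unit element $E \in D$.

Proof. First we rewrite (4.8) as
\[ \omega = \frac{1}{2}(b_L^*\rho_B \stackrel{\wedge}{,} \rho_D)_{\mathcal D} + \frac{1}{2}(b_R^*\lambda_B \stackrel{\wedge}{,} \lambda_D)_{\mathcal D}, \tag{4.19} \]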
where λD (ρD ) is the left (right) invariant Maurer–Cartan form on the double D. In order to see that (4.19) is correct, we write 1 1 −1 −1 −1 ∧ −1 (dKK −1 ∧, dbL b−1 L )D = (dbL bL + bL dgR gR bL , dbL bL )D 2 2 =
1 ∗ (g ρG ∧, b∗L λB )D 2 R
(4.20)
and 1 −1 1 −1 −1 −1 ∧ −1 (K dK ∧, b−1 R dbR )D = (bR gL dgL bR + bR dbR , bR dbR )D 2 2 =
1 ∗ (g λG ∧, b∗R ρB )D . 2 L
(4.21)
Here we used K = bL (K)gR (K) in the first relation (4.20) and K = gL (K)bR (K) in the second one (4.21). Take a vector v ∈ SL ⊂ TK D and calculate the expression hb∗L ρB , vi = Rb−1 ∗ bL∗ v = 0 . L
(4.22)
The vanishing of this expression follows from the fact that bL (KesT ) = bL (K) for every T ∈ G. Now consider another vector w ∈ S˜R ⊂ TK D. We have hb∗L ρB , wi = Rb−1 ∗ bL∗ w = Rb−1 ∗ Rg−1 ∗ w = RK −1 ∗ w . L
L
R
(4.23)
−1 This follows from the fact that bL (est K) = est bL (K) = est KgR (K) for every t ∈ B. We thus obtain for an arbitrary vector u ∈ TK D that
hb∗L ρB , ui = hb∗L ρB , (ΠRL ˜ + Π LR ˜ )ui = Rb−1 ∗ bL∗ ΠLR ˜ u = RK −1 ∗ ΠLR ˜u . L
(4.24)
Much in the same way as above, we derive hb∗R λB , ui = LK −1 ∗ ΠRL˜ u .
(4.25)
C. Klimčík
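Combining the formulae (4.19), (4.24) and (4.25), we arrive at
\[ \begin{aligned} -2\omega(t, u) &= (R_{K^{-1}*} t, R_{K^{-1}*} \Pi_{L\tilde{R}} u)_{\mathcal{D}} + (L_{K^{-1}*} t, L_{K^{-1}*} \Pi_{R\tilde{L}} u)_{\mathcal{D}} \\ &\quad - (R_{K^{-1}*} u, R_{K^{-1}*} \Pi_{L\tilde{R}} t)_{\mathcal{D}} - (L_{K^{-1}*} u, L_{K^{-1}*} \Pi_{R\tilde{L}} t)_{\mathcal{D}} \\ &= (t, \Pi_{L\tilde{R}} u)_{\mathcal{D}} + (t, \Pi_{R\tilde{L}} u)_{\mathcal{D}} - (u, \Pi_{L\tilde{R}} t)_{\mathcal{D}} - (u, \Pi_{R\tilde{L}} t)_{\mathcal{D}} . \end{aligned} \tag{4.26} \]
Now we use the obvious fact that
\[ \Pi_{L\tilde{R}} + \Pi_{\tilde{R}L} = 1 , \qquad \Pi_{R\tilde{L}} + \Pi_{\tilde{L}R} = 1 \]
(here the first index of the projector denotes its kernel and the second its image) and obtain three relations:
\[ (t, \Pi_{R\tilde{L}} u)_{\mathcal{D}} = (t, u)_{\mathcal{D}} - (t, \Pi_{\tilde{L}R} u)_{\mathcal{D}} ; \tag{4.27} \]
\[ -(u, \Pi_{L\tilde{R}} t)_{\mathcal{D}} = -(\Pi_{\tilde{R}L} u, \Pi_{L\tilde{R}} t)_{\mathcal{D}} = -(\Pi_{\tilde{R}L} u, t)_{\mathcal{D}} = -(u, t)_{\mathcal{D}} + (\Pi_{L\tilde{R}} u, t)_{\mathcal{D}} ; \tag{4.28} \]
\[ -(u, \Pi_{R\tilde{L}} t)_{\mathcal{D}} = -(\Pi_{\tilde{L}R} u, \Pi_{R\tilde{L}} t)_{\mathcal{D}} = -(\Pi_{\tilde{L}R} u, t)_{\mathcal{D}} . \tag{4.29} \]
Inserting (4.27), (4.28) and (4.29) into (4.26), we obtain
\[ \omega(t, u) = (t, (\Pi_{\tilde{L}R} - \Pi_{L\tilde{R}}) u)_{\mathcal{D}} . \tag{4.30} \]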
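For readers who want the bookkeeping spelled out, the insertion of the three relations into (4.26) can be written term by term. This is only a sketch of the intermediate step; it uses nothing beyond (4.27), (4.28), (4.29) and the symmetry of $(\cdot,\cdot)_{\mathcal{D}}$.

```latex
\begin{align*}
-2\omega(t,u)
&= (t,\Pi_{L\tilde R}u)_{\mathcal D}
 + \big[(t,u)_{\mathcal D} - (t,\Pi_{\tilde L R}u)_{\mathcal D}\big]
 + \big[-(u,t)_{\mathcal D} + (\Pi_{L\tilde R}u,t)_{\mathcal D}\big]
 - (\Pi_{\tilde L R}u,t)_{\mathcal D} \\
&= 2\,(t,\Pi_{L\tilde R}u)_{\mathcal D} - 2\,(t,\Pi_{\tilde L R}u)_{\mathcal D} .
\end{align*}
```

The terms $(t,u)_{\mathcal{D}}$ and $-(u,t)_{\mathcal{D}}$ cancel, and dividing by $-2$ gives the stated form of $\omega(t,u)$.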
The lemma is proved.

Proof of Theorem 4.1. The strategy of the proof is as follows. Since the antisymmetry of the bracket (4.9) is obvious from (4.16), we have to prove only the validity of the Jacobi identity in order to show that (4.9) is indeed a Poisson bracket. Then we shall prove the non-degeneracy of this Poisson structure by showing that the form $\omega$ is dual (inverse) to the Poisson bivector corresponding to the bracket (4.9).

The Poisson bracket (4.9) on $D$ can be rewritten in terms of a certain bivector (antisymmetric two-tensor) $\alpha$ such that the following relation holds:
\[ \{\phi, \psi\}_D = \alpha(d\phi, d\psi) , \qquad \phi, \psi \in \mathrm{Fun}(D) . \tag{4.31} \]
We can easily identify $\alpha$ by noting that (4.16) can be rewritten as
\[ \{\phi, \psi\}_D = \langle r, \nabla^L \phi \otimes \nabla^L \psi \rangle + \langle r, \nabla^R \phi \otimes \nabla^R \psi \rangle , \tag{4.32} \]
where $r \in \Lambda^2 \mathcal{D}$ is the so-called classical r-matrix. It is clearly given by
\[ r = \tfrac{1}{2} (T^i \otimes t_i - t_i \otimes T^i) . \tag{4.33} \]
From this, we have
\[ \alpha_K = (L_{K*} + R_{K*}) r , \qquad K \in D . \tag{4.34} \]
The Jacobi identity for the Poisson bracket is equivalent to the following condition for the bivector $\alpha$:
\[ [\alpha, \alpha]_S = 0 . \tag{4.35} \]
Here $[\cdot,\cdot]_S$ is the Schouten bracket of the multivectors [1]. Recall its main properties:
\[ [\alpha, \beta]_S = -(-1)^{(|\alpha|-1)(|\beta|-1)} [\beta, \alpha]_S , \tag{4.36} \]
\[ [\alpha, \beta \wedge \gamma]_S = [\alpha, \beta]_S \wedge \gamma + (-1)^{(|\alpha|-1)|\beta|} \beta \wedge [\alpha, \gamma]_S . \tag{4.37} \]
Moreover, for any vector field $X$, the bracket $[X, \alpha]_S$ is just the Lie derivative. Let us calculate $[\alpha, \alpha]_S$ for $\alpha$ given by (4.34). Since right-invariant vector fields on the Lie group manifolds commute with left-invariant ones, the result reads
\[ [\alpha, \alpha]_S = [L_{K*} r, L_{K*} r]_S + [R_{K*} r, R_{K*} r]_S = (L_{K*} - R_{K*}) [r, r]_S . \tag{4.38} \]
Here we use the same symbol $[\cdot,\cdot]_S$ also for the Schouten bracket based on the Lie algebra commutator. Thus we see that the Jacobi identity is fulfilled if and only if $[r, r]_S$ is an invariant element of $\Lambda^3 \mathcal{D}$. Actually, using (4.33), the direct calculation gives
\[ [r, r]_S = ([t_i, t_l], T^k)_{\mathcal{D}}\, t_k \wedge T^i \wedge T^l + ([T^i, T^l], t_k)_{\mathcal{D}}\, T^k \wedge t_i \wedge t_l . \tag{4.39} \]
Using the bracket (4.5), the $D$-invariance of this expression can be checked by direct calculation. Another way to see it is to realize that the right-hand side of (4.39) coincides with the invariant Cartan (or WZW) element of $\Lambda^3 \mathcal{D}$ canonically associated to the invariant bilinear form $(\cdot,\cdot)_{\mathcal{D}}$ (cf. [1]).

It is well known that the bivector $\alpha$ satisfying (4.35) defines a symplectic structure if and only if there exists a dual (or inverse) 2-form $\omega \in \Lambda^2 T^* D$. The latter is then automatically closed ($d\omega = 0$), as a consequence of (4.35). The duality means that
\[ \alpha(\cdot, \omega(\cdot, u)) = u \tag{4.40} \]
for any vector field $u \in TD$. Let us prove (4.40) for $\alpha$ given by (4.34) and $\omega$ by (4.18). First we note that the element $t_i \otimes T^i + T^i \otimes t_i \in \mathcal{D} \otimes \mathcal{D}$ is invariant since it corresponds to the invariant bilinear form $(\cdot,\cdot)_{\mathcal{D}}$. Due to this fact, the expression for $\alpha$ can be rewritten as
\[ \alpha = L_{K*} (T^i \otimes t_i) - R_{K*} (t_i \otimes T^i) . \tag{4.41} \]
Thus we have
\[ \alpha(\cdot, \omega(\cdot, u)) = R_{K*} t_i\, (R_{K*} T^i, (\Pi_{L\tilde{R}} - \Pi_{\tilde{L}R}) u)_{\mathcal{D}} - L_{K*} T^i\, (L_{K*} t_i, (\Pi_{L\tilde{R}} - \Pi_{\tilde{L}R}) u)_{\mathcal{D}} . \tag{4.42} \]
This can be rewritten as
\[ \alpha(\cdot, \omega(\cdot, u)) = (\Pi_{R\tilde{R}} - \Pi_{\tilde{L}L}) (\Pi_{L\tilde{R}} - \Pi_{\tilde{L}R}) u . \tag{4.43} \]
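Recall that the first subscript of the projector stands for the kernel and the second for the image (cf. proof of Lemma 4.2). Now we have
\[ \begin{aligned} (\Pi_{R\tilde{R}} - \Pi_{\tilde{L}L}) (\Pi_{L\tilde{R}} - \Pi_{\tilde{L}R}) &= \Pi_{R\tilde{R}} \Pi_{L\tilde{R}} - \Pi_{\tilde{L}L} \Pi_{L\tilde{R}} + \Pi_{\tilde{L}L} \Pi_{\tilde{L}R} \\ &= \Pi_{L\tilde{R}} - \Pi_{\tilde{L}L} \Pi_{L\tilde{R}} + \Pi_{\tilde{L}L} \\ &= (\Pi_{L\tilde{R}} + \Pi_{\tilde{R}L}) + (\Pi_{\tilde{L}L} - \Pi_{\tilde{R}L} - \Pi_{\tilde{L}L} \Pi_{L\tilde{R}}) = 1 + 0 . \end{aligned} \tag{4.44} \]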
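The chain (4.44) uses only how the projectors compose. With the convention that the first subscript is the kernel and the second the image, a projector acts as the identity on its image and annihilates its kernel. The following elementary rules are a sketch of the intermediate steps, written in the same convention, and are not part of the original text.

```latex
\begin{align*}
\Pi_{R\tilde R}\,\Pi_{L\tilde R} &= \Pi_{L\tilde R}, &
\Pi_{R\tilde R}\,\Pi_{\tilde L R} &= 0, \\
\Pi_{\tilde L L}\,\Pi_{\tilde L R} &= \Pi_{\tilde L L}\,(1-\Pi_{R\tilde L}) = \Pi_{\tilde L L}, &
\Pi_{\tilde L L}\,\Pi_{L\tilde R} &= \Pi_{\tilde L L}\,(1-\Pi_{\tilde R L}) = \Pi_{\tilde L L}-\Pi_{\tilde R L}.
\end{align*}
```

The second rule removes the fourth term of the expanded product, the first and third give the second line of (4.44), and the fourth shows that the last bracket in (4.44) vanishes.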
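Combining (4.43) and (4.44), we arrive finally at
\[ \alpha(\cdot, \omega(\cdot, u)) = u . \tag{4.45} \]
The theorem is proved.

Remark. In order to show that $d\omega = 0$, we can also use the Polyakov–Wiegmann formula [39] applied for $K = b_L(K) g_R(K)$ and $K = g_L(K) b_R(K)$, respectively:
\[ \tfrac{1}{3} (\rho_D \stackrel{\wedge}{,} \rho_D \wedge \rho_D)_{\mathcal{D}} = d (g_R^* \rho_G \stackrel{\wedge}{,} b_L^* \lambda_B)_{\mathcal{D}} + \tfrac{1}{3} (b_L^* \rho_B \stackrel{\wedge}{,} b_L^* \rho_B \wedge b_L^* \rho_B)_{\mathcal{D}} + \tfrac{1}{3} (g_R^* \rho_G \stackrel{\wedge}{,} g_R^* \rho_G \wedge g_R^* \rho_G)_{\mathcal{D}} ; \tag{4.46} \]
\[ \tfrac{1}{3} (\rho_D \stackrel{\wedge}{,} \rho_D \wedge \rho_D)_{\mathcal{D}} = d (b_R^* \rho_B \stackrel{\wedge}{,} g_L^* \lambda_G)_{\mathcal{D}} + \tfrac{1}{3} (b_R^* \rho_B \stackrel{\wedge}{,} b_R^* \rho_B \wedge b_R^* \rho_B)_{\mathcal{D}} + \tfrac{1}{3} (g_L^* \rho_G \stackrel{\wedge}{,} g_L^* \rho_G \wedge g_L^* \rho_G)_{\mathcal{D}} . \tag{4.47} \]
Note that the last two terms in (4.46) and also in (4.47) vanish because of the isotropy of the Lie algebras $\mathcal{G}$ and $\mathcal{B}$ with respect to the bilinear form $(\cdot,\cdot)_{\mathcal{D}}$. Using (4.46), (4.47) and the definition (4.8) of the Semenov-Tian-Shansky form $\omega$, we arrive at
\[ -d\omega = \tfrac{1}{2} d (g_R^* \rho_G \stackrel{\wedge}{,} b_L^* \lambda_B)_{\mathcal{D}} + \tfrac{1}{2} d (g_L^* \lambda_G \stackrel{\wedge}{,} b_R^* \rho_B)_{\mathcal{D}} = \tfrac{1}{6} (\rho_D \stackrel{\wedge}{,} \rho_D \wedge \rho_D)_{\mathcal{D}} - \tfrac{1}{6} (\rho_D \stackrel{\wedge}{,} \rho_D \wedge \rho_D)_{\mathcal{D}} = 0 . \tag{4.48} \]
We note also that physicists write the Polyakov–Wiegmann formula (4.46) as follows:
\[ \begin{aligned} \tfrac{1}{3} (dK K^{-1} \stackrel{\wedge}{,} dK K^{-1} \wedge dK K^{-1})_{\mathcal{D}} &= d (dg_R g_R^{-1} \stackrel{\wedge}{,} b_L^{-1} db_L)_{\mathcal{D}} + \tfrac{1}{3} (db_L b_L^{-1} \stackrel{\wedge}{,} db_L b_L^{-1} \wedge db_L b_L^{-1})_{\mathcal{D}} \\ &\quad + \tfrac{1}{3} (dg_R g_R^{-1} \stackrel{\wedge}{,} dg_R g_R^{-1} \wedge dg_R g_R^{-1})_{\mathcal{D}} . \end{aligned} \tag{4.49} \]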
Definition 4.3. The Heisenberg double is the Drinfeld double equipped with the Poisson structure (4.9).

In what follows, we shall always suppose that the multiplication maps $M_{L,R}$ are bijective; hence for us the Heisenberg double will always be a symplectic manifold. The Poisson bracket on the Heisenberg double of $G$ has the following crucial property.

Proposition 4.4 (Semenov-Tian-Shansky [43]). The algebra $\mathrm{Fun}(D)^B$ of right $B$-invariant functions on the Heisenberg double $D$ is a Lie subalgebra in the Poisson algebra (4.9) of all functions $\mathrm{Fun}(D)$. The same is true for the algebra ${}^B\mathrm{Fun}(D)$ of left $B$-invariant functions and also for the algebras $\mathrm{Fun}(D)^G$ and ${}^G\mathrm{Fun}(D)$.

Proof. Since the role of the groups $G$ and $B$ is completely symmetric in $D$, we shall restrict our attention only to the cases $\mathrm{Fun}(D)^G$ and ${}^G\mathrm{Fun}(D)$. Let $\rho, \eta \in \mathrm{Fun}(D)^G$, which means that
\[ \langle \nabla^R \rho, T^i \rangle = \langle \nabla^R \eta, T^i \rangle = 0 . \tag{4.50} \]
From (4.16) we see immediately that
\[ \{\rho, \eta\}_D = \tfrac{1}{2} \langle \nabla^L \rho, T^i \rangle \langle \nabla^L \eta, t_i \rangle - \tfrac{1}{2} \langle \nabla^L \rho, t_i \rangle \langle \nabla^L \eta, T^i \rangle = \tfrac{1}{2} (\nabla^L \rho, R^* \nabla^L \eta)_{\mathcal{D}^*} . \tag{4.51} \]
Now having in mind that the left and right derivatives $\nabla^L$ and $\nabla^R$ commute with each other, we obtain
\[ \langle \nabla^R \{\rho, \eta\}_D , T^i \rangle = 0 . \tag{4.52} \]
The case ${}^G\mathrm{Fun}(D)$ can be treated in the same way. The proposition is proved.
This proposition permits us to construct the Poisson–Lie brackets on $B$ (and on $G$). Indeed, both $\mathrm{Fun}(D)^G$ and ${}^G\mathrm{Fun}(D)$ are naturally isomorphic to $\mathrm{Fun}(B)$ (due to the existence of the global decomposition $D = GB = BG$) and the following proposition holds.

Proposition 4.5. The Poisson brackets on $\mathrm{Fun}(B)$ induced from the Semenov-Tian-Shansky bracket on $\mathrm{Fun}(D)^G$ and ${}^G\mathrm{Fun}(D)$ coincide up to sign and they verify the Poisson–Lie condition (4.2).

Proof. Take two functions $\Phi$ and $\Psi$ in $\mathrm{Fun}(B)$ and calculate their Poisson bracket $\{\Phi, \Psi\}^R_B$ induced from $\mathrm{Fun}(D)^G$. We have by definition
\[ b_L^* \{\Phi, \Psi\}^R_B = \{b_L^* \Phi, b_L^* \Psi\}_D = \tfrac{1}{2} \langle \nabla^L_D (b_L^* \Phi), T^i \rangle \langle \nabla^L_D (b_L^* \Psi), t_i \rangle - \tfrac{1}{2} \langle \nabla^L_D (b_L^* \Phi), t_i \rangle \langle \nabla^L_D (b_L^* \Psi), T^i \rangle . \tag{4.53} \]
The superscript $R$ over the Poisson bracket on $B$ indicates that the latter originates from $\mathrm{Fun}(D)^G$. Now it is obvious that
\[ \langle \nabla^L_D (b_L^* \Psi), t_i \rangle = b_L^* \langle \nabla^L_B \Psi, t_i \rangle , \tag{4.54} \]
where the subscripts $D$ and $B$ indicate the group on which the differential operators live. It is less straightforward to evaluate $\langle \nabla^L_D (b_L^* \Phi), T^i \rangle$. We shall proceed as follows: we note that the map $b_L \circ M_L : G \times B \to B$ given by
\[ g \times b \to b_L(gb) \equiv \mathrm{Dres}_g b \tag{4.55} \]
defines a left action of the group $G$ on the manifold $B$. It is easy to see this since the following relation holds:
\[ b_L(g_1 g_2 b) = b_L(g_1 b_L(g_2 b) g_R(g_2 b)) = b_L(g_1 b_L(g_2 b)) , \qquad g_1, g_2 \in G , \quad b \in B . \tag{4.56} \]
Now by definition
\[ \langle \nabla^L_D (b_L^* \Phi), T^i \rangle (K) = \frac{d}{ds}\Big|_{s=0} \Phi(b_L(e^{sT^i} b_L(K))) , \qquad K \in D . \tag{4.57} \]
Looking at the relations (4.56) and (4.57), we see that there exists a vector field (a differential operator) $\nabla^i_B$ acting on $\mathrm{Fun}(B)$ such that
\[ \langle \nabla^L_D (b_L^* \Phi), T^i \rangle = b_L^* (\nabla^i_B \Phi) . \tag{4.58} \]
Such an operator $\nabla^i_B$ can certainly be expressed as a linear combination of the operators $\langle \nabla^L_B, t_j \rangle$. In other words, there exists a matrix valued function $\Pi_R^{ij}(b)$ on $B$ such that
\[ \nabla^i_B \Phi = \Pi_R^{ij}(b) \langle \nabla^L_B \Phi, t_j \rangle . \tag{4.59} \]
Hence from (4.53), we obtain for our (right) Poisson bracket the following expression:
\[ \{\Phi, \Psi\}^R_B = \tfrac{1}{2} \langle \nabla^L_B \Phi, t_i \rangle (\Pi_R^{ij}(b) - \Pi_R^{ji}(b)) \langle \nabla^L_B \Psi, t_j \rangle . \tag{4.60} \]
Proceeding in the same way, we define the left bracket
\[ b_R^* \{\Phi, \Psi\}^L_B = \{b_R^* \Phi, b_R^* \Psi\}_D \tag{4.61} \]
and we have
\[ \{\Phi, \Psi\}^L_B = \tfrac{1}{2} \langle \nabla^R_B \Phi, t_i \rangle (\Pi_L^{ij}(b) - \Pi_L^{ji}(b)) \langle \nabla^R_B \Psi, t_j \rangle \tag{4.62} \]
for a certain matrix valued function $\Pi_L^{ij}(b)$ on $B$. In order to show that the right Poisson bracket (4.60) is equal to minus the left one (4.62), we have to know more about the matrices $\Pi^{ij}_{L,R}$. Introduce first the following matrices (cf. [34]):
\[ A_i^{\ j}(K) = (K^{-1} t_i K, T^j)_{\mathcal{D}} , \qquad B^{ij}(K) = (K^{-1} T^i K, T^j)_{\mathcal{D}} , \qquad K \in D . \tag{4.63} \]
Now calculate
\[ T^i b = b\, b^{-1} T^i b = b \big( B^{ij}(b) t_j + A_j^{\ i}(b^{-1}) T^j \big) . \tag{4.64} \]
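Since the differential operator $\langle \nabla^R_B, t_j \rangle$ corresponds to the vector field $L_{b*} t_j = b t_j$, we see from (4.64) that the operator $\nabla^i_B$ can be expressed as
\[ \nabla^i_B = B^{ij}(b) \langle \nabla^R_B, t_j \rangle = B^{ik}(b) A_k^{\ j}(b^{-1}) \langle \nabla^L_B, t_j \rangle . \tag{4.65} \]
Hence
\[ \Pi_R^{ij}(b) = -B^{ik}(b) A_k^{\ j}(b^{-1}) . \tag{4.66} \]
On the other hand, we have
\[ b T^i = b T^i b^{-1} b = \big( B^{ij}(b^{-1}) t_j + A_j^{\ i}(b) T^j \big) b \tag{4.67} \]
and from this, we arrive at
\[ \Pi_L^{ij}(b) = -B^{ik}(b^{-1}) A_k^{\ j}(b) = -B^{ki}(b) A_k^{\ j}(b) . \tag{4.68} \]
Let us now show that the tensors $\Pi_L^{ij}$ and $\Pi_R^{ij}$ are antisymmetric. We have
\[ \begin{aligned} -\Pi_L^{ij}(b) = B^{ki}(b) A_k^{\ j}(b) &= (b^{-1} T^k b, T^i)_{\mathcal{D}} (b^{-1} t_k b, T^j)_{\mathcal{D}} = (\mathrm{Ad}_b T^i, T^k)_{\mathcal{D}} (t_k, \mathrm{Ad}_b T^j)_{\mathcal{D}} \\ &= -(\mathrm{Ad}_b T^i, t_k)_{\mathcal{D}} (T^k, \mathrm{Ad}_b T^j)_{\mathcal{D}} = \Pi_L^{ji}(b) \end{aligned} \tag{4.69} \]
and similarly for $\Pi_R^{ij}$. Thus the expressions (4.60) and (4.62) simplify as follows:
\[ \{\Phi, \Psi\}^R_B = \langle \nabla^L_B \Phi, t_i \rangle \Pi_R^{ij}(b) \langle \nabla^L_B \Psi, t_j \rangle , \tag{4.70} \]
\[ \{\Phi, \Psi\}^L_B = \langle \nabla^R_B \Phi, t_i \rangle \Pi_L^{ij}(b) \langle \nabla^R_B \Psi, t_j \rangle . \tag{4.71} \]
Using the obvious relation
\[ \langle \nabla^L_B, t_i \rangle = A_i^{\ j}(b) \langle \nabla^R_B, t_j \rangle , \tag{4.72} \]
we can rewrite the right bracket as
\[ \{\Phi, \Psi\}^R_B = \langle \nabla^R_B \Phi, t_m \rangle A_i^{\ m}(b) \Pi_R^{ij}(b) A_j^{\ l}(b) \langle \nabla^R_B \Psi, t_l \rangle . \tag{4.73} \]
Now we have
\[ A_i^{\ m}(b) \Pi_R^{ij}(b) A_j^{\ l}(b) = -A_i^{\ m}(b) B^{ik}(b) A_k^{\ j}(b^{-1}) A_j^{\ l}(b) = -A_i^{\ m}(b) B^{il}(b) = \Pi_L^{lm}(b) . \tag{4.74} \]
From (4.70), (4.71) and (4.74), we conclude that
\[ \{\Phi, \Psi\}^R_B = -\{\Phi, \Psi\}^L_B . \tag{4.75} \]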
It remains to show that the Poisson bracket $\{\cdot,\cdot\}^R_B$ verifies the Poisson–Lie condition (4.2). Recall that a Poisson bracket on any manifold can be equivalently described by a Poisson bivector (= antisymmetric tensor) $\alpha \in \Lambda^2 TB$; the Poisson–Lie bracket on $B$ in terms of $\alpha$ is given by
\[ \{\phi, \psi\}_B = \langle \alpha, d\phi \otimes d\psi \rangle , \qquad \phi, \psi \in \mathrm{Fun}(B) . \tag{4.76} \]
The Poisson–Lie condition (4.2) can be directly rewritten as
\[ \alpha_{ab} = L_{a*} \alpha_b + R_{b*} \alpha_a , \qquad a, b \in B , \tag{4.77} \]
where $\alpha_b \in \Lambda^2 T_b B$. Since the bivector bundle $\Lambda^2 TB$ on any group manifold is trivializable by the right-invariant vector fields, we lose no information about the Poisson–Lie structure $\alpha$ if we trade it for another object, namely a map $\Pi_R : B \to \Lambda^2 \mathcal{B}$ defined as follows:
\[ \Pi_R(b) = R_{b^{-1}*} \alpha_b . \tag{4.78} \]
Now $\Pi_R(b)$ can be expressed in the basis $t_i$ as
\[ \Pi_R(b) = \Pi_R^{ij}(b)\, t_i \otimes t_j . \tag{4.79} \]
In fact, the matrix $\Pi_R^{ij}$ just introduced is the same as the one in (4.59) (the notation is thus consistent!) because the operator $\langle \nabla^L, t_i \rangle$ corresponds to the vector field defined at each point $b \in B$ as $R_{b*} t_i$. The condition (4.77) translates in terms of $\Pi_R(b)$ as
\[ \Pi_R(ab) = \Pi_R(a) + \mathrm{Ad}_a \Pi_R(b) . \tag{4.80} \]
One can directly check that the expression (4.66) fulfils (4.80), by using the definition (4.63) of the matrices $A_i^{\ j}(b)$ and $B^{ij}(b)$. The proposition is proved.

4.1.3. Lu–Weinstein double

We are now going to present two important examples of the Heisenberg doubles. We shall present a third important (loop group) example in Sec. 4.4.

(1) The cotangent bundle $T^*G$ of any Lie group $G$ is its Heisenberg double. The bilinear form $(\cdot,\cdot)_{\mathcal{D}}$ is defined as in (A.22). The role of the group $B$ is played by the subgroup of all elements $K$ of $T^*G$ for which PK = e. Looking at the $T^*G$ multiplication law (A.18), we immediately discover that $B$ is an Abelian group; in fact, it is nothing but the dual vector space $\mathcal{G}^*$ of the Lie algebra $\mathcal{G}$ of $G$. The global decomposability $D = GB = BG$ follows from the fact that the cotangent bundle of any Lie group is trivializable. The Poisson–Lie bracket on $G$, induced from the bracket on the Heisenberg double $T^*G$, identically vanishes. This follows from (A.44). On the other hand, the Poisson–Lie bracket on $B$ is nontrivial. From (A.53), we see that it is in fact the Kirillov–Kostant Poisson bracket on $\mathcal{G}^*$.

(2) Consider now any finite-dimensional simple compact connected Lie group $G_0$. For its Heisenberg double $D$ we take simply its complexification $G_0^{\mathbb{C}}$ (viewed as a real group). So, for example, the double of $SU(2)$ is $SL(2, \mathbb{C})$. The invariant non-degenerate form $(\cdot,\cdot)_{\mathcal{D}}$ on the Lie algebra $\mathcal{D} = \mathcal{G}_0^{\mathbb{C}}$ of $D = G_0^{\mathbb{C}}$ is given by
\[ (x, y)_{\mathcal{D}} = \frac{1}{\varepsilon} \mathrm{Im} (x, y)_{\mathcal{G}_0^{\mathbb{C}}} , \tag{4.81} \]
or, in other words, it is just the imaginary part of the Killing–Cartan form $(\cdot,\cdot)_{\mathcal{G}_0^{\mathbb{C}}}$ divided by a real parameter $\varepsilon$. We shall see later that $\varepsilon$ plays the role of a deformation parameter. Since $\mathcal{G}_0$ is the compact real form of $\mathcal{G}_0^{\mathbb{C}}$, clearly the imaginary part of $(x, y)_{\mathcal{G}_0^{\mathbb{C}}}$ vanishes if $x, y \in \mathcal{G}_0$. Hence, $G_0$ is indeed isotropically embedded in $G_0^{\mathbb{C}}$. The double $G_0^{\mathbb{C}}$ equipped with the metric (4.81) is usually referred to as the double of Lu and Weinstein [37].

It turns out that $G_0^{\mathbb{C}}$ is indeed the Drinfeld double, because $D = G_0^{\mathbb{C}}$ is at the same time the double of another subgroup $B_0$, which coincides with the so-called group $AN$ in the Iwasawa decomposition of $G_0^{\mathbb{C}}$:
\[ G_0^{\mathbb{C}} = G_0\, AN = AN\, G_0 . \tag{4.82} \]
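The decomposition (4.82) can be made concrete in the smallest case. The sketch below assumes $G_0 = SU(2)$ and $G_0^{\mathbb{C}} = SL(2,\mathbb{C})$, with $AN$ realized as upper triangular matrices of determinant 1 with positive diagonal; the function names are hypothetical. The factorization $K = g b$ is obtained by Gram–Schmidt orthonormalization of the columns of $K$ (a swapped-in standard technique, not taken from the text), and $K = b' g'$ by applying the same routine to $K^{-1}$.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def inverse(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))


def iwasawa_gb(K):
    """Return (g, b) with K = g b, g unitary, b upper triangular, diagonal > 0."""
    c1 = (complex(K[0][0]), complex(K[1][0]))   # first column of K
    c2 = (complex(K[0][1]), complex(K[1][1]))   # second column of K
    b11 = math.sqrt(abs(c1[0]) ** 2 + abs(c1[1]) ** 2)
    q1 = (c1[0] / b11, c1[1] / b11)
    b12 = q1[0].conjugate() * c2[0] + q1[1].conjugate() * c2[1]
    rest = (c2[0] - b12 * q1[0], c2[1] - b12 * q1[1])
    b22 = math.sqrt(abs(rest[0]) ** 2 + abs(rest[1]) ** 2)
    q2 = (rest[0] / b22, rest[1] / b22)
    g = ((q1[0], q2[0]), (q1[1], q2[1]))
    b = ((b11, b12), (0.0, b22))
    return g, b


def iwasawa_bg(K):
    """Return (b, g) with K = b g, obtained by factorizing K^{-1} = g' b'."""
    g_inv, b_inv = iwasawa_gb(inverse(K))
    return inverse(b_inv), inverse(g_inv)


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


K = ((1 + 1j, 1), (1j, 1))     # det K = 1, so K lies in SL(2, C)
g, b = iwasawa_gb(K)           # K = g b   (the order G_0 AN)
b2, g2 = iwasawa_bg(K)         # K = b2 g2 (the order AN G_0)
```

Because $\det K = 1$ and the diagonal of $b$ is positive, the unitary factor automatically has determinant 1, so it lies in $SU(2)$ rather than only in $U(2)$.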
For the groups $SL(n, \mathbb{C})$, the group $AN$ can be identified with the upper triangular matrices of determinant 1 and with positive real numbers on the diagonal. In general, the elements of $AN$ can be uniquely represented by means of the exponential map as follows:
\[ \tilde{g} = e^{\phi} \exp\Big[ \sum_{\alpha > 0} v_\alpha E^\alpha \Big] \equiv e^{\phi} n . \tag{4.83} \]
Here the $\alpha$'s denote the roots of $\mathcal{G}_0^{\mathbb{C}}$, $v_\alpha$ are complex numbers, $E^\alpha$ are the step operators and $\phi$ is a Hermitian element$^h$ of the Cartan subalgebra of $\mathcal{G}_0^{\mathbb{C}}$. Loosely said, $A$ is the "noncompact part" of the complex maximal torus of $G_0^{\mathbb{C}}$. The isotropy of the Lie algebra $\mathcal{B}_0$ of $B_0 = AN$ follows from (4.81); the fact that $\mathcal{G}_0$ and $\mathcal{B}_0$ generate together the Lie algebra $\mathcal{D}$ of the whole double is evident from (4.82). The Iwasawa decomposition itself is the global decomposition $D = G_0 B_0 = B_0 G_0$ needed for ensuring that the Semenov-Tian-Shansky Poisson bracket (4.9) does indeed define the (everywhere non-degenerate) symplectic structure on $D$.

4.1.4. Non-Abelian moment maps

The concept of a Poisson–Lie symmetry [43] is the generalization of the traditional Hamiltonian symmetry of a dynamical system defined by a symplectic manifold and a Hamiltonian. Here we shall partially follow the exposition of the papers [4, 19] and [35]. First we need to recall the definition of the dressing action of $G$ on its dual Poisson–Lie group $B$. An element $g \in G$ acts on an element $b \in B$ to give
\[ \mathrm{Dres}_g b \equiv b_L(gb) , \tag{4.84} \]
where the multiplication $gb$ is taken in the Drinfeld double $D$ and the map $b_L : D \to B$ is induced by the decomposition $D = BG$. It follows from (4.56) that this is really a group action, i.e.
\[ \mathrm{Dres}_e b = b , \qquad \mathrm{Dres}_{(g_1 g_2)} b = \mathrm{Dres}_{g_1} (\mathrm{Dres}_{g_2} b) . \tag{4.85} \]

$^h$ Recall that a Hermitian element of any complex simple Lie algebra $\mathcal{G}^{\mathbb{C}}$ is an eigenvector of the involution which defines the compact real form $\mathcal{G}$; the corresponding eigenvalue is $(-1)$. This involution originates from the group involution $g \to (g^{-1})^\dagger$. The anti-Hermitian elements that span the compact real form are eigenvectors of the same involution with the eigenvalue equal to 1. For elements of the $sl(n, \mathbb{C})$ Lie algebra, a Hermitian element is indeed a Hermitian matrix in the standard sense.
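The dressing action (4.84) and the action property (4.85) can be checked numerically in the Lu–Weinstein double of $SU(2)$. This is a minimal sketch under stated assumptions: $D = SL(2,\mathbb{C})$, $B = AN$ realized as upper triangular matrices with positive diagonal, and hypothetical helper names. For $K = b g$ with $g$ unitary one has $K K^\dagger = b b^\dagger$, so $b_L(K)$ can be read off from $K K^\dagger$ entry by entry.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def b_left(K):
    """B-factor of K = b g (b upper triangular with positive diagonal, g unitary)."""
    M = matmul(K, dagger(K))                 # K K^dagger = b b^dagger
    b22 = math.sqrt(M[1][1].real)            # (b b^dagger)_22 = b22^2
    b12 = M[0][1] / b22                      # (b b^dagger)_12 = b12 * b22
    b11 = math.sqrt(M[0][0].real - abs(b12) ** 2)
    return ((b11, b12), (0.0, b22))


def dres(g, b):
    """Dressing action of g in SU(2) on b in AN: Dres_g b = b_L(g b)."""
    return b_left(matmul(g, b))


def su2(a, b):
    """SU(2) element built from complex a, b with |a|^2 + |b|^2 = 1."""
    return ((a, b), (-b.conjugate(), a.conjugate()))


def close(x, y, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))


g1 = su2(0.6 + 0j, 0.8j)
g2 = su2(0.5 + 0.5j, 0.5 - 0.5j)
b = ((2.0, 1 + 3j), (0.0, 0.5))              # an element of AN (det = 1)
unit = ((1.0, 0.0), (0.0, 1.0))

composed = dres(matmul(g1, g2), b)           # Dres_{g1 g2} b
iterated = dres(g1, dres(g2, b))             # Dres_{g1}(Dres_{g2} b)
```

The two results agree up to rounding, and `dres(unit, b)` returns `b`, in line with (4.85).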
Suppose now that there is a symplectic form $\omega$ on a manifold $P$ and there is a left action of a Poisson–Lie group $G$ on $P$, infinitesimally generated by a section $v$ of the bundle $TP \otimes \mathcal{G}^* = TP \otimes \mathcal{B}$. The vector field corresponding to the action of a generator $T \in \mathcal{G}$ is then $\langle v, T \rangle \in TP$. Recall that $B$ is the dual Poisson–Lie group and $\mathcal{B}$ is its Lie algebra; the dual space $\mathcal{G}^*$ is identified with $\mathcal{B}$ via the invariant bilinear form $(\cdot,\cdot)_{\mathcal{D}}$ on the Drinfeld double $\mathcal{D} = \mathcal{G} + \mathcal{B}$ (cf. Secs. 4.1.1 and 4.1.2). Moreover, suppose that there is a $G$-equivariant map $M : P \to B$, where $G$ acts on $P$ as above and acts on $B$ via the dressing action. Finally, such a map $M$ is called the non-Abelian moment map if it holds
\[ -i_v \omega \equiv \omega(\cdot, v) = M^* \rho_B . \tag{4.86} \]
Expressed in words, the contraction of the symplectic form $\omega$ by the section $v \in TP \otimes \mathcal{B}$ is equal to the pullback of the right-invariant Maurer–Cartan form $\rho_B$ on $B$ by the map $M$. The condition (4.86) is often written as
\[ \omega(\cdot, v) = dM M^{-1} . \tag{4.87} \]

Theorem 4.6. Let the manifold $P$ be the Heisenberg double $D$ of $G$ and let $\omega$ be the Semenov-Tian-Shansky symplectic form (4.8). Then

(1) The map $b_L : D \to B$ induced by the decomposition $D = BG$ is the non-Abelian moment map $M$ of the standard left action $G \times D \to D$ given by the group multiplication law in $D$, i.e. $(g, K) \to gK$, $g \in G$, $K \in D$.

(2) The map $b_R^{-1} : D \to B$ induced by the decomposition $D = GB$ is the non-Abelian moment map $M$ of the left action $G \times D \to D$ given by $(g, K) \to K g^{-1}$, $g \in G$, $K \in D$.

Proof. The first part (1) is slightly easier. We have to show that, for a generator $T \in \mathcal{G}$, it holds
\[ \omega(u, R_{K*} T) = (T, \langle b_L^* \rho_B, u \rangle)_{\mathcal{D}} , \tag{4.88} \]
where $K \in D$ and $u \in T_K D$. We have
\[ \omega(u, R_{K*} T) = (R_{K*} T, (\Pi_{L\tilde{R}} - \Pi_{\tilde{L}R}) u)_{\mathcal{D}} = (R_{K*} T, \Pi_{L\tilde{R}} u)_{\mathcal{D}} = (T, \langle b_L^* \rho_B, u \rangle)_{\mathcal{D}} . \tag{4.89} \]
Here the first equality follows from Lemma 4.2, the second from the isotropy of the space $S_R$ with respect to $(\cdot,\cdot)_{\mathcal{D}}$ and the third one from the relation (4.24). For the second part (2), consider again the generator $T \in \mathcal{G}$. We want to show that
\[ \omega(L_{K*} T, u) = (T, \langle d(b_R^{-1}) (b_R^{-1})^{-1}, u \rangle)_{\mathcal{D}} . \tag{4.90} \]
The last relation can be rewritten as
\[ \omega(u, L_{K*} T) = (T, \langle b_R^{-1} db_R, u \rangle)_{\mathcal{D}} = (T, \langle b_R^* \lambda_B, u \rangle)_{\mathcal{D}} . \tag{4.91} \]
Let us prove (4.91). We have
\[ \begin{aligned} \omega(u, L_{K*} T) &= (L_{K*} T, (\Pi_{L\tilde{R}} - \Pi_{\tilde{L}R}) u)_{\mathcal{D}} \\ &= (L_{K*} T, (\Pi_{L\tilde{R}} + \Pi_{\tilde{R}L}) u)_{\mathcal{D}} - (L_{K*} T, (\Pi_{\tilde{L}R} + \Pi_{R\tilde{L}}) u)_{\mathcal{D}} + (L_{K*} T, \Pi_{R\tilde{L}} u)_{\mathcal{D}} \\ &= (L_{K*} T, u)_{\mathcal{D}} - (L_{K*} T, u)_{\mathcal{D}} + (L_{K*} T, \Pi_{R\tilde{L}} u)_{\mathcal{D}} \\ &= (L_{K*} T, \Pi_{R\tilde{L}} u)_{\mathcal{D}} = (T, \langle b_R^* \lambda_B, u \rangle)_{\mathcal{D}} . \end{aligned} \tag{4.92} \]
The last equality in (4.92) follows from (4.25). The $G$-equivariance of $M$ is obvious in the first case (1); in the second it follows from the relation
\[ b_R^{-1}(K) = b_L(K^{-1}) , \qquad K \in D . \tag{4.93} \]
The theorem is proved.

We end this section with a definition.

Definition 4.7. A dynamical system characterized by a symplectic manifold $P$, a symplectic form $\omega$ and a Hamiltonian $H$ is Poisson–Lie symmetric with respect to a left action of a Poisson–Lie group $G$, if the Hamiltonian is $G$-invariant and the $G$-action on $P$ is generated by the non-Abelian moment map $M : P \to B$ fulfilling the condition (4.86).

Remark. The classical action of any dynamical system can be written in the standard way
\[ S = \int (\theta - H dt) , \tag{4.94} \]
where $d\theta$ is the symplectic form and $H$ the Hamiltonian. The variation of the action of a general Poisson–Lie symmetric system was calculated in [4] with the result
\[ \delta_T S = -\int \epsilon_i A^i . \tag{4.95} \]
Here $A^i$ is the set of functions on the phase space $P$, satisfying the non-Abelian zero-curvature condition
\[ dA^i = \tilde{f}^i_{kl} A^k A^l ; \tag{4.96} \]
$\epsilon_i$ are the coefficients of the generator $T \in \mathcal{G}$ in some basis $T^i$ of $\mathcal{G}$ and $\tilde{f}^i_{kl}$ are the structure constants of $\mathcal{B}$ in the dual basis $t_i$.

If $A^i = dX^i$ for some collection of functions $X^i$ on $P$, then $\tilde{f}^i_{kl}$ obviously vanishes and the integrand on the right-hand side of (4.95) is a total derivative. The action $S$ is then strictly symmetric with respect to the $G$-action and the $X^i$ are nothing but the standard Hamiltonian charges generating the symmetry. If, on the other hand, the $\tilde{f}^i_{kl}$ do not vanish, the action $S$ is not symmetric in the strict sense of this word. It is nevertheless by definition Poisson–Lie symmetric, since its variation has the special form encoded in (4.95) and (4.96).
4.2. Quasitriangular geodesical model

Definition 4.8. Consider now a Lie group $G$ and let $D$ be one of its Heisenberg doubles. Choose an appropriate $G$-biinvariant Hamiltonian function $H(K)$, $K \in D$. The dynamical system defined by the Semenov-Tian-Shansky symplectic structure on $D$ and by the biinvariant Hamiltonian $H(K)$ will be called the quasitriangular geodesical model.

We know from the results of the previous section that the quasitriangular geodesical model is distinguished by two independent Poisson–Lie symmetries given by the left multiplication $k_L K$, $k_L \in G$, or the right multiplication (but the left action!) $K k_R^{-1}$, $k_R \in G$. Even without knowing anything more about the Hamiltonian $H(K)$, its $G$-biinvariance immediately entails important information about the trajectories of this dynamical system (we shall call such a trajectory a quasitriangular geodesic). Indeed, let $K(t)$ be a quasitriangular geodesic. Then
\[ b_L(K(t)) = b_L(K(0)) = b_L^0 , \qquad b_R(K(t)) = b_R(K(0)) = b_R^0 . \tag{4.97} \]
In other words, $b_L^0$ and $b_R^0$ are nonlinear constants of motion. We may say loosely that the nonlinear momentum $b_L$ (or $b_R$) is constant, and the quasitriangular geodesical motion is therefore "nonlinearly free" in the nonlinear coordinate $g_R$ (or $g_L$). The independence of $b_L$ and $b_R$ of time is a consequence of the $G$-biinvariance of $H(K)$ and of the following theorem.

Theorem 4.9. The Semenov-Tian-Shansky Poisson bracket of a $G$-left-invariant function on $D$ with a $G$-right-invariant function on $D$ always vanishes.

Proof. Due to the existence of the global decompositions $D = GB = BG$, each $G$-right-invariant function on $D$ is a $b_L$-pullback of some function $\Phi \in \mathrm{Fun}(B)$ and similarly each $G$-left-invariant function on $D$ is a $b_R$-pullback of some function $\Psi \in \mathrm{Fun}(B)$. Using the formula (4.41), the Semenov-Tian-Shansky bracket (4.9) of two such functions then becomes
\[ \{b_L^* \Phi, b_R^* \Psi\}_D = \tfrac{1}{2} \langle \nabla^R_D (b_L^* \Phi), T^i \rangle \langle \nabla^R_D (b_R^* \Psi), t_i \rangle - \tfrac{1}{2} \langle \nabla^L_D (b_L^* \Phi), t_i \rangle \langle \nabla^L_D (b_R^* \Psi), T^i \rangle . \tag{4.98} \]
Now the left (right) $G$-invariance of $b_R^* \Psi$ ($b_L^* \Phi$) means, respectively,
\[ \langle \nabla^L_D (b_R^* \Psi), T^i \rangle (K) = 0 , \tag{4.99} \]
\[ \langle \nabla^R_D (b_L^* \Phi), T^i \rangle (K) = 0 . \tag{4.100} \]
Thus
\[ \{b_L^* \Phi, b_R^* \Psi\}_D = 0 . \tag{4.101} \]
The theorem is proved.
As an example, we can take for $G$ the simple compact connected group $G_0$, with Heisenberg double the Lu–Weinstein double $D_0 = G_0^{\mathbb{C}}$ of example (2) of the previous paragraph. Now we look for a biinvariant Hamiltonian. Recall that in the case of the standard geodesical model the choice of the biinvariant Hamiltonian was canonical. It turns out that in the $D_0 = G_0^{\mathbb{C}}$ case, there is also a canonical choice. Up to a normalization (to be fixed later), it is given by
\[ H = (\ln(b_L^\dagger b_L), \ln(b_L^\dagger b_L))_{\mathcal{G}_0^{\mathbb{C}}} = (\ln(b_R^\dagger b_R), \ln(b_R^\dagger b_R))_{\mathcal{G}_0^{\mathbb{C}}} , \tag{4.102} \]
where $\dagger$ is the Hermitian conjugation defined in Sec. 4.1.3. We shall return to the quasitriangular geodesical model on $G_0^{\mathbb{C}}$ later, when we shall study its chiral decomposition and give the exact solution of the model.

4.3. WZW Drinfeld doubles of $\tilde{G}$

In this section, $\tilde{G}$ will denote the central biextension of the group $G$. Recall that in the non-deformed case, we have performed the two-step symplectic reduction starting from the geodesical model (1.1) on $\tilde{G}$ with its canonical symplectic structure on $T^*\tilde{G}$ and arriving at the WZW model on $G$ with its (non-canonical) WZW symplectic form on $T^*G$. In principle, we can construct the deformation of the master model (1.1) for whatever Drinfeld double of $\tilde{G}$. However, if we want to also make the two-step symplectic reduction, the double must be of the special form described in the following definition.

Definition 4.10. We shall say that a double $\tilde{\tilde{D}}$ ($= \tilde{G}\tilde{B} = \tilde{B}\tilde{G}$) of the central biextension $\tilde{G}$ is of the WZW type (or simply is the WZW double), if and only if there exist a double $\hat{\hat{D}}$ ($= \hat{G}\hat{B} = \hat{B}\hat{G}$) of $\hat{G}$ and a double $D$ ($= GB = BG$) of $G$ such that

(i) The group $\tilde{B}$ is isomorphic to the direct product of the real line with $\hat{B}$, i.e. $\tilde{B} = \mathbf{R} \times \hat{B}$.

(ii) The group $\hat{B}$ is isomorphic to the semi-direct product of the real line with $B$, i.e. $\hat{B} = \mathbf{R} \times_Q B$, where $Q$ is some one parameter group of automorphisms of $B$.

Theorem 4.11. The properties (i)–(ii) of Definition 4.10 are satisfied by the triple of Drinfeld doubles $\tilde{\tilde{D}} = T^*\tilde{G}$, $\hat{\hat{D}} = T^*\hat{G}$ and $D = T^*G$.

Proof. The dual group $\tilde{B}$ is clearly $\tilde{\mathcal{G}}^*$ viewed as the Abelian group (linear space). Its elements are $(A, \alpha, a)^*$ (cf. Convention 2.5). The subgroups formed by the elements of the forms $(0, \alpha, a)^*$ and $(0, \alpha, 0)^*$ are $\hat{B}$ and $B$, respectively. Since $\tilde{B}$ is Abelian, the theorem is proved.

In order to give the definition of the universal quasitriangular WZW model, it is useful to set appropriate conventions.
Convention 4.12. Consider the group $\tilde{B} = \mathbf{R} \times \hat{B} = \mathbf{R} \times (\mathbf{R} \times_Q B)$ from Definition 4.10. The symbol $\times$ without (with) the subscript means the direct (semi-direct) product of groups. This decomposition induces several natural maps; our conventions will give them names. First of all, from $\tilde{B} = \mathbf{R} \times \hat{B}$ we obtain two natural maps $m_0 : \tilde{B} \to \mathbf{R}$ and $\hat{m} : \tilde{B} \to \hat{B}$. Then from $\hat{B} = \mathbf{R} \times_Q B$ we produce $m_\infty : \hat{B} \to \mathbf{R}$ and two maps $m_{L,R} : \hat{B} \to B$ given by the two possible decompositions $\hat{B} = B\mathbf{R} = \mathbf{R}B$. Recall also that the notation $\tilde{b}_{L,R}$ is induced by the two canonical decompositions $\tilde{\tilde{D}} = \tilde{G}\tilde{B} = \tilde{B}\tilde{G}$.

Definition 4.13. The universal quasitriangular WZW model is the rule that associates a dynamical system to every WZW Drinfeld double $\tilde{\tilde{D}}$ (of the central biextension $\tilde{G}$) and to every $\tilde{G}$-biinvariant function $\tilde{H}(\tilde{K})$, $\tilde{K} \in \tilde{\tilde{D}}$. This system is obtained from the quasitriangular geodesical model on $\tilde{\tilde{D}}$ by the two-step symplectic reduction induced by setting
\[ m_0(\tilde{b}_L(\tilde{K})) + m_0(\tilde{b}_R(\tilde{K})) = 0 , \tag{4.103} \]
\[ (m_\infty \circ \hat{m})(\tilde{b}_L(\tilde{K})) + (m_\infty \circ \hat{m})(\tilde{b}_R(\tilde{K})) = 2\kappa . \tag{4.104} \]
We shall call this dynamical system the $(\tilde{\tilde{D}}, \tilde{H})$-quasitriangular WZW model.

Remarks. (1) For a general WZW double $\tilde{\tilde{D}}$, we do not have a natural choice of the Hamiltonian $\tilde{H}$. However, two important WZW doubles of the affine Kac–Moody group $\tilde{G}$ permit us to choose the Hamiltonian in a canonical way. The first one is the cotangent bundle $T^*\tilde{G}$, which leads to the definition of the standard loop group WZW model described in Secs. 2 and 3. The second one is the affine Lu–Weinstein double, which will be introduced in the next section and which will lead to the main result of this paper: the construction of the loop group quasitriangular WZW model.

(2) It is important to note that the moment maps appearing in the relations (4.103) and (4.104) are ordinary functions. They generate respectively the axial actions (i.e. $\tilde{K} \to u\tilde{K}u$) of the group $\mathbf{R}$ generated by $\tilde{T}^0$ and of the central circle $S^1$ generated by $\hat{T}^\infty$. For a non-WZW double of $\tilde{G}$, such an axial action would be only of the Poisson–Lie type and, generically, we would not be able to disentangle the moment maps of the $\tilde{T}^0$ and $\hat{T}^\infty$ symmetries from the moment maps of the other $\tilde{G}$-symmetries. The choice of the WZW double ensures that these particular symmetries are Hamiltonian in the standard sense of this word; hence the symplectic reduction can be performed.

(3) We shall describe the details of the symplectic reduction for the affine Lu–Weinstein double of the affine Kac–Moody group. There is no necessity to list explicitly the reduction for whatever WZW double, since the corresponding formulas are too general anyway to be illuminating.
4.4. Affine Lu–Weinstein double ˆ ˜ ˜ D, ˆ D) of Now we are advancing to our most important example of the triple (D, ˜ ˆ the Drinfeld doubles of the groups (G, G, G) (cf. Definition 4.10). G will be the loop group G = LG0 , where G0 is a simple compact connected and simply connected Lie ˆ = LG d0 will be its standard central extension described in Appendix A.1 group.i G ˜ d0 . Although the construction and G will be the affine Kac–Moody group R ×Sˆ LG that we are going to present here is apparently original, we choose the name “affine ˜ ˜ has many features similar as Lu–Weinstein double” because the resulting double D C the finite-dimensional Lu–Weinstein double D0 = G0 . 4.4.1. The double D = LGC 0 For the Heisenberg double D of G, we take the loop group LGC 0 consisting of smooth maps from the circle S 1 into GC . It will often be convenient to view the loop group 0 C LG0 as a group of holomorphic maps from the σ-Riemann sphere without poles into 1 the complex group GC 0 . Clearly, the loop circle S is identified with the equator. The σ-Riemann sphere is in fact the ordinary Riemann sphere but since in this section we shall encounter the notion of the Riemann sphere in several different contexts, we shall use the labels to distinguish them. [ C The complex Lie algebra G C = LG 0 is equipped with an invariant nondegenerate bilinear form given by Z 1 dσ(x(σ), y(σ))G0C , (4.105) (x, y)G C = 2π where the elements x, y ∈ G C are smooth maps from S 1 into G0C . Recall that D [ C viewed as the real group. The invariant nondegenerate bilinear is the group LG 0
form on D = Lie(D) is then defined as
1 Im(x, y)G C , (4.106) ε where ε is a real positive parameter. Note the full analogy with the finitedimensional definition (4.81). The Lie algebra G = LG0 is isotropic with respect to (·, ·)D since it is “pointwise” isotropic with respect to Im(·, ·)G0C . Thus we see that D is indeed the Manin double of G. In order to show that it is also the Drinfeld double, we need the complementary (or dual) group B. In fact, B is the group C L+ G C 0 consisting of loops in G0 that are boundary values of holomorphic maps C from the unit disc into GC 0 . In other words, we may view L+ G0 as the group of holomorphic maps from the σ-Riemann sphere without the north pole into the complex group GC 0 . We require moreover, that the value of this holomorphic map at the origin of the disc (= the south pole of the σ-Riemann sphere) is an element of B0 = AN ∈ GC 0 . (x, y)D =
$^{\rm i}$ Note that $G$ is then connected.
September 6, 2004 14:35 WSPC/148-RMP
742
00214
C. Klimčík
As we have already said, the isotropy of the Lie algebra $L\mathcal G_0$ of $G = LG_0$ with respect to the bilinear form (4.106) is obvious. But the Lie algebra $\mathcal B = L_+\mathcal G_0^{\mathbb C}$ is also isotropic with respect to $(\cdot,\cdot)_{\mathcal D}$. Indeed, the expression to be integrated in (4.106) has only non-negative Fourier modes. The integral of all strictly positive modes then vanishes. The zero mode does not contribute either since
$$\mathrm{Im}\,(\mathrm{Lie}(AN), \mathrm{Lie}(AN))_{\mathcal G_0^{\mathbb C}} = 0 \tag{4.107}$$
as in the example (2) of Sec. 4.1.3. Finally, the existence of the global decomposition $D = LG_0^{\mathbb C} = (LG_0)(L_+G_0^{\mathbb C}) = GB = BG$ was proved in [40].

4.4.2. The double $\hat{\hat D} = \mathbb R \times_Q {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$

Now we are going to construct the double $\hat{\hat D}$ of the centrally extended loop group $\hat G$.

Consider the group $DG_0^{\mathbb C}$ of smooth maps from the unit disc into $G_0^{\mathbb C}$ with the usual pointwise multiplication. We can now define an extended group ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$ whose elements are pairs $(\bar l, \lambda)$, where $\bar l \in DG_0^{\mathbb C}$ and $\lambda \in U(1)$, and whose multiplication law reads
$$(\bar l_1, \lambda_1)(\bar l_2, \lambda_2) = (\bar l_1\bar l_2,\ \lambda_1\lambda_2\exp[2\pi i\,\beta_{\mathbb R}(\bar l_1, \bar l_2)]). \tag{4.108}$$
Here $\beta_{\mathbb R}$ is a real valued 2-cocycle on $DG_0^{\mathbb C}$ given by
$$\beta_{\mathbb R}(\bar l_1, \bar l_2) = \frac{1}{8\pi^2}\int_{\rm Disc} \mathrm{Re}\,(\bar l_1^{-1}d\bar l_1 \stackrel{\wedge}{,} d\bar l_2\,\bar l_2^{-1})_{\mathcal G_0^{\mathbb C}}, \tag{4.109}$$
where $(\cdot,\cdot)_{\mathcal G_0^{\mathbb C}}$ is again the (standardly normalized) Killing–Cartan form on $\mathcal G_0^{\mathbb C}$. It is crucial to note the presence of the real part symbol in the definition of the 2-cocycle. This real part is also reflected by the (left) superscript $\mathbb R$ in the symbol ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$.

Consider now a subgroup $\partial G^{\mathbb C}$ of $DG_0^{\mathbb C}$ consisting of all smooth maps from the Disc into $G_0^{\mathbb C}$ such that their value at every point of the boundary $\partial D = S^1$ is the unit element $e_0$ of $G_0^{\mathbb C}$. Any $\bar l \in \partial G^{\mathbb C}$ can be thought of as a map $\bar l : S^2 \to G_0^{\mathbb C}$ by identifying the boundary $S^1$ of Disc with the north pole of $S^2$. The Riemann sphere thus obtained will be called the D-Riemann sphere. It turns out that there is a homomorphism $\Theta^{\mathbb C}_{\mathbb R} : \partial G^{\mathbb C} \to {}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$ defined by
$$\Theta^{\mathbb C}_{\mathbb R}(\bar l) = (\bar l,\ \exp[-2\pi i\,C^{\mathbb C}_{\mathbb R}(\bar l)]), \tag{4.110}$$
where
$$C^{\mathbb C}_{\mathbb R}(\bar l) = \frac{1}{24\pi^2}\int_{\rm Ball} \mathrm{Re}\,(d\bar l\,\bar l^{-1} \stackrel{\wedge}{,} [d\bar l\,\bar l^{-1} \stackrel{\wedge}{,} d\bar l\,\bar l^{-1}])_{\mathcal G_0^{\mathbb C}}. \tag{4.111}$$
Here Ball is the unit ball whose boundary is the D-Riemann sphere and we have extended the map $\bar l : S^2 \to G_0^{\mathbb C}$ to a map $\bar l : {\rm Ball} \to G_0^{\mathbb C}$. The proof of the fact that $\exp[-2\pi i\,C^{\mathbb C}_{\mathbb R}(\bar l)]$ does not depend on the extension of $\bar l$ to Ball reduces to the same proof as for the compact group $G_0$ since $G_0^{\mathbb C}$ has the same homotopies
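as $G_0$. The demonstration that $\Theta^{\mathbb C}_{\mathbb R}$ is indeed a homomorphism and the fact that the image $\Theta^{\mathbb C}_{\mathbb R}(\partial G^{\mathbb C})$ is a normal subgroup in ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$ follow again from the Polyakov–Wiegmann formula [39] (see Appendix A.1), which asserts that
$$C^{\mathbb C}_{\mathbb R}(\bar l_1\bar l_2) = C^{\mathbb C}_{\mathbb R}(\bar l_1) + C^{\mathbb C}_{\mathbb R}(\bar l_2) - \beta_{\mathbb R}(\bar l_1, \bar l_2). \tag{4.112}$$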
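As a consistency check, the following short computation sketches how (4.112) yields the homomorphism property of $\Theta^{\mathbb C}_{\mathbb R}$. It uses only (4.108), (4.110) and (4.112); since $C^{\mathbb C}_{\mathbb R}$ depends on the extension to Ball, we assume here that (4.112) is read under the exponential, i.e. modulo integers.

```latex
\begin{aligned}
\Theta^{\mathbb C}_{\mathbb R}(\bar l_1)\,\Theta^{\mathbb C}_{\mathbb R}(\bar l_2)
&= \bigl(\bar l_1\bar l_2,\ \exp\{-2\pi i\,[\,C^{\mathbb C}_{\mathbb R}(\bar l_1) + C^{\mathbb C}_{\mathbb R}(\bar l_2)
   - \beta_{\mathbb R}(\bar l_1,\bar l_2)\,]\}\bigr)
   && \text{by (4.108) and (4.110)} \\
&= \bigl(\bar l_1\bar l_2,\ \exp[-2\pi i\,C^{\mathbb C}_{\mathbb R}(\bar l_1\bar l_2)]\bigr)
   && \text{by (4.112)} \\
&= \Theta^{\mathbb C}_{\mathbb R}(\bar l_1\bar l_2).
\end{aligned}
```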
The subgroup $\hat D \equiv {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$ of the double $\hat{\hat D}$ is now defined as the factor group ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}/\Theta^{\mathbb C}_{\mathbb R}(\partial G^{\mathbb C})$. This group is a (nontrivial) circle bundle over the base space $LG_0^{\mathbb C} = D$. The projection $\Pi_0$ is $(\bar l, \lambda) \to \bar l|_{S^1}$ and the center of $\hat D$ consists of the Θ-equivalence classes represented by $(1, \lambda) \in {}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$. The projection homomorphism from ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$ onto $\hat D$ will be referred to as $\wp_{\mathbb C}$.

In order to construct the Drinfeld double $\hat{\hat D}$, we need a one-parameter group $\mathbb R_{\hat Q}$ of automorphisms of $\hat D$. For this, it is convenient first to define a one-parameter group of automorphisms $\mathbb R_{\bar Q}$ of ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$. If $(\bar l(z, r), \lambda)$ is an element in ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$, the action of an element $w \in \mathbb R_{\bar Q}$ reads
$${}^{q}(\bar l(z, r), \lambda) = (\bar l(qz, r), \lambda), \tag{4.113}$$
where $q = e^{w}$. Recall that we view the loop group $LG_0^{\mathbb C}$ as the group of holomorphic maps from the Riemann sphere without poles into the complex group $G_0^{\mathbb C}$. The standard polar coordinates $(\sigma, r)$ of the Disc thus get traded for $(z, r)$. We can view the disc as the intersection of the equatorial plane with the interior of the σ-Riemann sphere. We stress, however, that this σ-Riemann sphere is not the same as the D-sphere used for the definition of the term $\exp[-2\pi i\,C^{\mathbb C}_{\mathbb R}(\bar l)]$ for elements $\bar l \in \partial G^{\mathbb C}$. We have to prove that
$${}^{q}((\bar l_1, \lambda_1)(\bar l_2, \lambda_2)) = {}^{q}(\bar l_1, \lambda_1)\,{}^{q}(\bar l_2, \lambda_2), \tag{4.114}$$
where the product is considered in ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$. This in turn amounts to showing that
$$\beta_{\mathbb R}(\bar l_1(z, r), \bar l_2(z, r)) = \beta_{\mathbb R}(\bar l_1(qz, r), \bar l_2(qz, r)). \tag{4.115}$$
Recalling the definition of the cocycle $\beta_{\mathbb R}$, we can rewrite (4.109) as
$$\beta_{\mathbb R}(\bar l_1, \bar l_2) = \frac{1}{8\pi^2}\int_0^1 dr \int_{|z|=1} dz\,\Bigl(\mathrm{Re}\,(\bar l_1^{-1}\partial_r\bar l_1, \partial_z\bar l_2\,\bar l_2^{-1})_{\mathcal G_0^{\mathbb C}} - \mathrm{Re}\,(\bar l_1^{-1}\partial_z\bar l_1, \partial_r\bar l_2\,\bar l_2^{-1})_{\mathcal G_0^{\mathbb C}}\Bigr). \tag{4.116}$$
Here the integration over $\sigma$ is replaced by the contour integration along the equator $|z| = 1$ on the σ-Riemann sphere. It is straightforward to show that $\beta_{\mathbb R}(\bar l_1(qz, r), \bar l_2(qz, r))$ is given by the same integral as (4.116) but along the new contour $|z| = q$. Since the integrand is a holomorphic function everywhere between the two contours, they can be deformed into each other and we conclude that (4.115) holds.
Now we have to show that the group action (4.113) survives the factorization of ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$ by the group $\partial G^{\mathbb C}$. In other words, the automorphisms $\bar Q$ descend to automorphisms $\hat Q$ of the factor group $\hat D = {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$, or, rephrased differently still: $\bar Q$ acts on the $\partial G^{\mathbb C}$ classes in ${}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$. This amounts to showing that
$$C^{\mathbb C}_{\mathbb R}(\bar l(z, r)) = C^{\mathbb C}_{\mathbb R}(\bar l(qz, r)). \tag{4.117}$$
Looking at the formula (4.111), the integration now goes over three variables parametrizing the interior of the D-Riemann sphere. We can again change the σ-integration to the contour integration on the σ-Riemann sphere and then we prove (4.117) by the identical contour deformation argument as above.

Thus our group $\hat{\hat D}$ is defined as the group of couples $(X, q)$, $X \in {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$, $q \in \mathbb R_+$, with the following composition law
$$(X_1, q_1)(X_2, q_2) = (X_1\,{}^{q_1}X_2,\ q_1 q_2). \tag{4.118}$$
Now we have to prove several lemmas in order to show that $\hat{\hat D}$ is indeed the Drinfeld double of $\hat G$.

Lemma 4.14. The subgroup $\Pi_0^{-1}(LG_0) \subset \hat D$ is isomorphic to $\widehat{LG_0} = \hat G$.

Proof. Consider an element $\hat l \in \Pi_0^{-1}(LG_0) \subset {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$. It can be lifted by the "map" $\wp_{\mathbb C}^{-1}$ (of course non-uniquely) to some element $\bar l \in {}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$. We stress that the element $\bar l$ can be chosen in $\widehat{DG_0} \subset {}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$. The element $\bar l$ can then be projected by the map $\wp$ (not $\wp_{\mathbb C}$!) to some element of $\widehat{LG_0}$. We now show that this element of $\widehat{LG_0}$ does not depend on the choice $\bar l \in \widehat{DG_0}$, which means that we have constructed a certain map $\mu : \Pi_0^{-1}(LG_0) \to \widehat{LG_0}$. Indeed, if we have two elements $\bar l_1, \bar l_2 \in \widehat{DG_0}$ such that $\wp_{\mathbb C}(\bar l_1) = \wp_{\mathbb C}(\bar l_2) = \hat l$, then there certainly exists an element $F \in \partial G\ (\subset \partial G^{\mathbb C})$ such that $\bar l_1 = F\bar l_2$. This means, in other words, that $\wp(\bar l_1) = \wp(\bar l_2)$, hence $\mu$ is a well-defined map.

Let us show that $\mu$ is a homomorphism. First of all, the unit element in $\Pi_0^{-1}(LG_0)$ can be lifted directly to the unit element of $\widehat{DG_0}$ and mapped by $\wp$ to the unit element of $\widehat{LG_0}$ (since $\wp$ is a homomorphism). Then we want to prove that
$$\mu(\hat l_1\hat l_2) = \mu(\hat l_1)\mu(\hat l_2), \qquad \hat l_1, \hat l_2 \in \Pi_0^{-1}(LG_0).$$
But if $\bar l_1, \bar l_2 \in \widehat{DG_0}$ are such that $\wp_{\mathbb C}(\bar l_i) = \hat l_i$, $i = 1, 2$, then $\wp_{\mathbb C}(\bar l_1\bar l_2) = \hat l_1\hat l_2$ since $\wp_{\mathbb C}$ is a homomorphism. Thus we have $\mu(\hat l_1\hat l_2) = \wp(\bar l_1\bar l_2) = \wp(\bar l_1)\wp(\bar l_2) = \mu(\hat l_1)\mu(\hat l_2)$ because $\wp$ is also a homomorphism.

Injectivity: if we take again $\hat l_1, \hat l_2 \in \Pi_0^{-1}(LG_0)$ such that $\Pi_0(\hat l_1) \neq \Pi_0(\hat l_2)$, then clearly $\pi(\mu(\hat l_1)) \neq \pi(\mu(\hat l_2))$, hence $\mu(\hat l_1) \neq \mu(\hat l_2)$ (recall that $\pi$ denotes the homomorphism from $\widehat{LG_0}$ to $LG_0$ defined by the exact sequence (2.1)). If $\Pi_0(\hat l_1) = \Pi_0(\hat l_2)$ and $\hat l_1 \neq \hat l_2$, then $\hat l_1 = Y\hat l_2$, where $Y$ is a non-unit central circle element in ${}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$.
Then also $\bar l_{1,2} \in \widehat{DG_0}$ can be chosen to be connected by the same non-unit central circle element viewed as an element of $\widehat{DG_0}$, and $\wp(\bar l_1) = \mu(\hat l_1) = Y\mu(\hat l_2)$, where now the same $Y$ is viewed as an element of $\widehat{LG_0}$. Since the surjectivity is evident, the lemma is proved.

Lemma 4.15. The group $B = L_+G_0^{\mathbb C}$ can be homomorphically injected into $\hat D = {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$. Moreover, the image of this injection is preserved by the automorphisms $\hat Q$.

Proof. Consider an element $b \in L_+G_0^{\mathbb C}$. By definition, it is the boundary value of a holomorphic map $\bar b$ from the unit disc into $G_0^{\mathbb C}$. Consider now the map $\bar\nu : L_+G_0^{\mathbb C} \to {}^{\mathbb R}\widehat{DG_0^{\mathbb C}}$ defined by
$$\bar\nu(b) = (\bar b, 1). \tag{4.119}$$
The crucial thing is that the map $\bar\nu$ is a homomorphism of groups. This follows from the fact that the cocycle $\beta_{\mathbb R}(\bar b_1, \bar b_2)$ vanishes if the $\bar b_i$'s are holomorphic maps. Indeed, by using the contour representation (4.116) of this cocycle, we see immediately that the contour can be contracted to the origin of the unit disc without encountering any singularity because the integrated function is everywhere holomorphic.

Consider now the map $\nu : L_+G_0^{\mathbb C} \to \hat D = {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$ defined by
$$\nu = \wp_{\mathbb C} \circ \bar\nu. \tag{4.120}$$
First of all, $\nu$ is a group homomorphism, being the composition of two homomorphisms. Moreover, it holds that
$$(\Pi_0 \circ \wp_{\mathbb C} \circ \bar\nu)(b) = b \in LG_0^{\mathbb C}. \tag{4.121}$$
From this, it follows that $\nu$ is an injection. The invariance of $\nu(B)$ under the action of $\hat Q$ is obvious. The lemma is proved.

Remark. The two preceding lemmas say, in other words, that both $\hat G = \widehat{LG_0}$ and $\hat B = \mathbb R \times_Q L_+G_0^{\mathbb C}$ are subgroups of $\hat{\hat D}$, which is one of the basic properties of the Heisenberg double of $\hat G$ and of $\hat B$.

Lemma 4.16. The group $\hat{\hat D}$ can be globally decomposed as $\hat{\hat D} = \hat G\hat B = \hat B\hat G$, where $\hat G = \widehat{LG_0}$ and $\hat B = \mathbb R \times_Q L_+G_0^{\mathbb C}$.

Proof. Of course, in the sense of the two lemmas above, here $\Pi_0^{-1}(LG_0) \subset \hat D$ is viewed as $\hat G$ and $\mathbb R \times_Q \nu(L_+G_0^{\mathbb C})$ as $\hat B$. Take any element $\hat{\hat K}$ in the extended double $\hat{\hat D}$. Since $\hat{\hat D} = \mathbb R \times_Q \hat D$, $\hat{\hat K}$ can be uniquely decomposed as
$$\hat{\hat K} = (\hat K, 1)(1, q), \tag{4.122}$$
where $\hat K \in \hat D$ and $q = e^{w}$, $w \in \mathbb R_Q$.
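The decomposition (4.122) can be read off directly from the composition law (4.118). The sketch below spells this out; it assumes only that $q \mapsto {}^{q}(\cdot)$ is a group action, ${}^{q_1}({}^{q_2}X) = {}^{q_1 q_2}X$ and ${}^{1}X = X$, which is what (4.113) gives at the level of representatives.

```latex
\begin{aligned}
(\hat K, 1)(1, q) &= (\hat K\,{}^{1}1,\ 1\cdot q) = (\hat K, q), \\
(1, q)(\hat K, 1) &= ({}^{q}\hat K,\ q), \\
(X, q)\,\bigl({}^{q^{-1}}(X^{-1}),\ q^{-1}\bigr)
  &= \bigl(X\,{}^{q}({}^{q^{-1}}(X^{-1})),\ 1\bigr) = (1, 1).
\end{aligned}
```

The first line is (4.122); the second shows how the two factors fail to commute, and the third exhibits the inverse of a general element.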
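Now we have to prove that $\hat K$ can be uniquely decomposed as
$$\hat K = \hat a\,\nu(b), \qquad \hat a \in \hat G = \Pi_0^{-1}(LG_0), \quad b \in B = L_+G_0^{\mathbb C}. \tag{4.123}$$
Consider first the element $\Pi_0(\hat K) \in LG_0^{\mathbb C}$. It can be uniquely decomposed as
$$\Pi_0(\hat K) = ab, \qquad a \in LG_0, \quad b \in L_+G_0^{\mathbb C}. \tag{4.124}$$
Now it is certainly true that $\hat K$ can be found among the elements of the circle fiber $\Pi_0^{-1}(a)\nu(b)$ above $ab$. The existence of $\hat a$ and $b$ from (4.123) then follows immediately. It remains to be proved that the decomposition (4.123) is unique. So if
$$\hat K = \hat a\,\nu(b) = \hat a'\nu(b'), \qquad \hat a, \hat a' \in \hat G, \quad b, b' \in B, \tag{4.125}$$
then
$$\Pi_0(\hat a\,\nu(b)) = \Pi_0(\hat a)\Pi_0(\nu(b)) = \Pi_0(\hat a)\,b = \Pi_0(\hat a')\,b'. \tag{4.126}$$
The second equality here follows from (4.121). It is now clear that $b = b'$ because they are both in $L_+G_0^{\mathbb C}$ and the decomposition of the element $\Pi_0(\hat a\,\nu(b)) \in LG_0^{\mathbb C}$ into elements of $LG_0$ and $L_+G_0^{\mathbb C}$ is unique. We conclude that $b = b'$, hence $\hat a = \hat a'$. The lemma is proved.

Lemma 4.17. The group $\hat{\hat D} = \mathbb R \times_Q {}^{\mathbb R}\widehat{LG_0^{\mathbb C}}$ is indeed the common Heisenberg double of the groups $\widehat{LG_0}$ and $\mathbb R \times_Q L_+G_0^{\mathbb C}$, if we define the invariant bilinear form on its Lie algebra $\hat{\hat{\mathcal D}}$ as
$$(\hat\iota(x), \hat\iota(y))_{\hat{\hat{\mathcal D}}} = \frac{1}{2\pi\varepsilon}\int d\sigma\,\mathrm{Im}\,(x(\sigma), y(\sigma))_{\mathcal G_0^{\mathbb C}} = (x, y)_{\mathcal D}, \tag{4.127}$$
$$(\hat T^\infty, \hat\iota(\mathcal D))_{\hat{\hat{\mathcal D}}} = (\hat t^1_\infty, \hat\iota(\mathcal D))_{\hat{\hat{\mathcal D}}} = 0; \qquad (\hat T^\infty, \hat t^1_\infty)_{\hat{\hat{\mathcal D}}} = \frac{1}{\varepsilon}. \tag{4.128}$$
Here $\hat t^1_\infty$ is the generator of $\mathbb R_Q$; $\hat T^\infty$ that of the central $U(1)$ and $\hat\iota : \mathcal D \to \hat{\hat{\mathcal D}}$ is the natural extension of the injection $\iota : \mathcal G \to \hat{\mathcal G}$. The injection $\hat\iota$ acting on $\mathcal B$ is given by the derivation map $\nu_*$.

Proof. We know already that $\hat{\hat D}$ contains both $\hat G = \widehat{LG_0}$ and $\hat B = \mathbb R \times_Q L_+G_0^{\mathbb C}$ as its subgroups and that the global decomposition $\hat{\hat D} = \hat G\hat B = \hat B\hat G$ takes place. The isotropy of the corresponding Lie subalgebras $\hat{\mathcal G}$ and $\hat{\mathcal B}$ follows from the isotropy of $\mathcal G$ and $\mathcal B$ with respect to the form $(\cdot,\cdot)_{\mathcal D}$. Indeed, for instance, $\hat{\mathcal G} = \iota(\mathcal G) + \mathrm{Span}(\hat T^\infty)$; then for $\xi, \eta \in \mathcal G$, we have
$$(\iota(\xi), \iota(\eta))_{\hat{\hat{\mathcal D}}} = (\xi, \eta)_{\mathcal D} = 0.$$
Adding this to the fact (4.128) that $(\hat T^\infty, \iota(\mathcal G))_{\hat{\hat{\mathcal D}}} = 0$, we obtain $(\hat{\mathcal G}, \hat{\mathcal G})_{\hat{\hat{\mathcal D}}} = 0$. The bilinear form $(\cdot,\cdot)_{\hat{\hat{\mathcal D}}}$ is also symmetric (if we complete appropriately the part (4.128) of its definition) since the form $(\cdot,\cdot)_{\mathcal D}$ is symmetric. The non-degeneracy of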
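the form $(\cdot,\cdot)_{\hat{\hat{\mathcal D}}}$ also follows immediately from the defining relations (4.127), (4.128) and from the fact that $(\cdot,\cdot)_{\mathcal D}$ is non-degenerate. The remaining thing to show is the invariance of the form $(\cdot,\cdot)_{\hat{\hat{\mathcal D}}}$. This can be derived by a direct calculation from the Lie bracket in $\hat{\hat{\mathcal D}}$:
$$[w_1\hat T^\infty + \xi_1^c(\sigma) + \varrho_1\hat t^1_\infty,\ w_2\hat T^\infty + \xi_2^c(\sigma) + \varrho_2\hat t^1_\infty] = \frac{1}{2\pi}\int d\sigma\,\mathrm{Re}\,(\xi_1^c, \partial_\sigma\xi_2^c)\,\hat T^\infty + [\xi_1^c, \xi_2^c](\sigma) - \varrho_1\, i\partial_\sigma\xi_2^c(\sigma) + \varrho_2\, i\partial_\sigma\xi_1^c(\sigma).$$
Here $w_i$, $\varrho_i$ are real numbers and $\xi_i^c(\sigma)$ are from $L\mathcal G_0^{\mathbb C}$. The lemma is proved.

Remark. We observe that $\hat B = \mathbb R \times_Q B$ as required by Definition 4.10.

4.4.3. The double $\tilde{\tilde D} = \mathbb R^2 \times_{S,Q} \widehat{LG_0^{\mathbb C}}$

As the title of this subsection suggests, the double $\tilde{\tilde D}$ of the affine Kac–Moody group $\tilde G$ will be the semi-direct product of the plane $\mathbb R^2$ with a certain group $\widehat{LG_0^{\mathbb C}}$. In order to construct $\widehat{LG_0^{\mathbb C}}$, we shall proceed in close analogy with the previous section and our intermediate explanations will therefore be much briefer.

Consider the group $DG_0^{\mathbb C}$ of smooth maps from the unit disc into $G_0^{\mathbb C}$ with the usual pointwise multiplication. We can now define an extended group $\widehat{DG_0^{\mathbb C}}$ whose elements are pairs $(\bar l, v)$, where $\bar l \in DG_0^{\mathbb C}$ and $v \in \mathbb C^\times$, and whose multiplication law reads
$$(\bar l_1, v_1)(\bar l_2, v_2) = (\bar l_1\bar l_2,\ v_1 v_2 \exp[2\pi i\,\beta(\bar l_1, \bar l_2)]). \tag{4.129}$$
Here $\mathbb C^\times$ is the complex plane without the origin viewed as the Abelian multiplicative group of complex numbers and $\beta$ is a complex-valued 2-cocycle on $DG_0^{\mathbb C}$ given by
$$\beta(\bar l_1, \bar l_2) = \frac{1}{8\pi^2}\int_{\rm Disc} (\bar l_1^{-1}d\bar l_1 \stackrel{\wedge}{,} d\bar l_2\,\bar l_2^{-1})_{\mathcal G_0^{\mathbb C}}. \tag{4.130}$$
The remaining notations are the same as in the previous section. It is crucial to note the absence of the real part symbol in the definition of the 2-cocycle.

Consider now the subgroup $\partial G^{\mathbb C}$ of $DG_0^{\mathbb C}$. It turns out that there is a homomorphism $\Theta^{\mathbb C} : \partial G^{\mathbb C} \to \widehat{DG_0^{\mathbb C}}$ defined by
$$\Theta^{\mathbb C}(\bar l) = (\bar l,\ \exp[-2\pi i\,C^{\mathbb C}(\bar l)]), \tag{4.131}$$
where
$$C^{\mathbb C}(\bar l) = \frac{1}{24\pi^2}\int_{\rm Ball} (d\bar l\,\bar l^{-1} \stackrel{\wedge}{,} [d\bar l\,\bar l^{-1} \stackrel{\wedge}{,} d\bar l\,\bar l^{-1}])_{\mathcal G_0^{\mathbb C}}. \tag{4.132}$$
The group $\widehat{LG_0^{\mathbb C}}$ is now defined as the factor group $\widehat{DG_0^{\mathbb C}}/\Theta^{\mathbb C}(\partial G^{\mathbb C})$. This group is a (nontrivial) $\mathbb C^\times$ bundle over the base space $LG_0^{\mathbb C} = D$. The projection $\Pi_0^{\mathbb C}$ is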
$(\bar l, v) \to \bar l|_{S^1}$ and the center of $\widehat{LG_0^{\mathbb C}}$ consists of the $\Theta^{\mathbb C}$-equivalence classes represented by $(1, v) \in \widehat{DG_0^{\mathbb C}}$. The projection homomorphism from $\widehat{DG_0^{\mathbb C}}$ onto $\widehat{LG_0^{\mathbb C}}$ will be referred to as $\hat\wp_{\mathbb C}$.

In order to construct the Drinfeld double $\tilde{\tilde D}$, we need two commuting one-parameter groups of automorphisms of $\widehat{LG_0^{\mathbb C}}$. We shall denote them as $\mathbb R_{\hat S}$ and $\mathbb R_{\hat Q}$. To define them, associate first to every complex number $w + is$ an automorphism of $\widehat{DG_0^{\mathbb C}}$ given by
$${}^{q}(\bar l(z, r), v) = (\bar l(qz, r), v), \tag{4.133}$$
where $q = \exp(w + is)$ and $\bar l(z, r) \in DG_0^{\mathbb C}$. Recall that we view the loop group $LG_0^{\mathbb C}$ as the group of holomorphic maps from the σ-Riemann sphere without poles into the complex group $G_0^{\mathbb C}$. As in Sec. 4.4.2, we can prove that
$${}^{q}((\bar l_1, v_1)(\bar l_2, v_2)) = {}^{q}(\bar l_1, v_1)\,{}^{q}(\bar l_2, v_2), \tag{4.134}$$
and that the group action (4.133) survives the factorization by the group $\partial G^{\mathbb C}$.

Thus our group $\tilde{\tilde D}$ is defined as the group of couples $(X, q)$, $X \in \widehat{LG_0^{\mathbb C}}$, $q \in \mathbb C^\times$, with the following composition law
$$(X_1, q_1)(X_2, q_2) = (X_1\,{}^{q_1}X_2,\ q_1 q_2). \tag{4.135}$$
Now we have to prove several lemmas in order to show that $\tilde{\tilde D}$ is indeed the Drinfeld double of $\tilde G$.

Lemma 4.18. The group $\tilde G = \mathbb R \times_S \hat G$ is a subgroup of $\tilde{\tilde D}$.

Proof. First we prove that $\hat G = \widehat{LG_0}$ is a subset of $\widehat{LG_0^{\mathbb C}}$. Consider the set
$$S = \{\hat l \in \widehat{LG_0^{\mathbb C}};\ \exists\, \bar l \in \widehat{DG_0} \subset \widehat{DG_0^{\mathbb C}},\ \hat\wp_{\mathbb C}(\bar l) = \hat l\}. \tag{4.136}$$
We are going to show that the set $S$ is a subgroup of $\widehat{LG_0^{\mathbb C}}$ isomorphic to $\widehat{LG_0}$. The isomorphism $\mu : S \to \widehat{LG_0}$ is defined as follows
$$\mu(\hat l) = \wp(\bar l). \tag{4.137}$$
Recall that $\wp : \widehat{DG_0} \to \widehat{LG_0}$ is the map associating to every element of $\widehat{DG_0}$ its Θ-class (cf. Sec. A.1). It is immediate to check that the definition of $\mu$ does not depend on the choice of the representative $\bar l \in \widehat{DG_0}$.

If $\bar l_1, \bar l_2$ are the respective representatives of $\hat l_1, \hat l_2$, then obviously $\bar l_1\bar l_2$ can be chosen as the representative of $\hat l_1\hat l_2$. Since $\bar l_1\bar l_2 \in \widehat{DG_0}$, it follows that the set $S$ is a subgroup of $\widehat{LG_0^{\mathbb C}}$. Moreover, one has
$$\wp(\bar l_1\bar l_2) = \wp(\bar l_1)\wp(\bar l_2), \tag{4.138}$$
hence $\mu$ is a group homomorphism. It remains to show the injectivity and surjectivity of $\mu$.
If $\hat l_1 \neq \hat l_2$ and $\Pi_0^{\mathbb C}(\hat l_1) \neq \Pi_0^{\mathbb C}(\hat l_2)$, then obviously $\wp(\bar l_1) \neq \wp(\bar l_2)$. If $\hat l_1 \neq \hat l_2$ and $\Pi_0^{\mathbb C}(\hat l_1) = \Pi_0^{\mathbb C}(\hat l_2)$, then we can choose $\bar l_1, \bar l_2$ in such a way that
$$\bar l_1 = \bar l_2\,(1, e^{i\phi}), \qquad e^{i\phi} \neq 1. \tag{4.139}$$
Then obviously $\wp(\bar l_1) \neq \wp(\bar l_2)$ and the injectivity follows. The surjectivity is also clear since the central circle acts freely on $S$. The lemma is proved.

Lemma 4.19. The group $B = L_+G_0^{\mathbb C}$ can be homomorphically injected into $\widehat{LG_0^{\mathbb C}}$. Moreover, the image of this injection is preserved by the automorphisms $\hat Q$.

Proof. Consider an element $b \in L_+G_0^{\mathbb C}$. By definition, it is the boundary value of a holomorphic map $\bar b$ from the unit disc into $G_0^{\mathbb C}$. Consider now the map $\bar\nu : L_+G_0^{\mathbb C} \to \widehat{DG_0^{\mathbb C}}$ defined by
$$\bar\nu(b) = (\bar b, 1). \tag{4.140}$$
As in the proof of Lemma 4.15, it can be shown that the map $\bar\nu$ is a homomorphism of groups.

Consider now the map $\hat\nu : L_+G_0^{\mathbb C} \to \widehat{LG_0^{\mathbb C}}$ defined by
$$\hat\nu = \hat\wp_{\mathbb C} \circ \bar\nu. \tag{4.141}$$
First of all, $\hat\nu$ is a group homomorphism, being the composition of two homomorphisms. Moreover, it holds that
$$(\Pi_0^{\mathbb C} \circ \hat\wp_{\mathbb C} \circ \bar\nu)(b) = b \in LG_0^{\mathbb C}. \tag{4.142}$$
From this it follows that $\hat\nu$ is an injection. The invariance of $\hat\nu(B)$ under the action of $\hat Q$ is obvious. The lemma is proved.

Remark. The two preceding lemmas say, in other words, that both $\tilde G = \mathbb R \times_S \hat G$ and $\tilde B = (\mathbb R \times_Q L_+G_0^{\mathbb C}) \times \mathbb R$ are subgroups of $\tilde{\tilde D}$, which is one of the basic properties of the Heisenberg double of $\tilde G$ and of $\tilde B$. Of course, the direct product factor $\mathbb R$ here is the central line subgroup of $\widehat{LG_0^{\mathbb C}}$ corresponding to $\Theta^{\mathbb C}$-classes in $\widehat{DG_0^{\mathbb C}}$ of the form $(1, e^{t})$, $t \in \mathbb R$.

Lemma 4.20. The group $\tilde{\tilde D}$ can be globally decomposed as $\tilde{\tilde D} = \tilde G\tilde B = \tilde B\tilde G$, where $\tilde G = \mathbb R \times_S \widehat{LG_0}$ and $\tilde B = (\mathbb R \times_Q L_+G_0^{\mathbb C}) \times \mathbb R$.

Proof. Of course, it is sufficient to prove that $\widehat{LG_0^{\mathbb C}}$ can be decomposed as $\widehat{LG_0^{\mathbb C}} = \hat G\,(\hat\nu(B) \times \mathbb R)$, where $\mathbb R$ stands for the central line in the sense of the remark above. Thus we are going to prove that $\hat K \in \widehat{LG_0^{\mathbb C}}$ can be uniquely decomposed as
$$\hat K = \hat a\,\hat\nu(b)\,e^{t}, \qquad \hat a \in \hat G, \quad b \in B = L_+G_0^{\mathbb C}, \quad t \in \mathbb R. \tag{4.143}$$
Consider first the element $\Pi_0^{\mathbb C}(\hat K) \in LG_0^{\mathbb C}$. It can be uniquely decomposed as
$$\Pi_0^{\mathbb C}(\hat K) = ab, \qquad a \in LG_0, \quad b \in L_+G_0^{\mathbb C}. \tag{4.144}$$
Now it is certainly true that $\hat K$ can be found among the elements of the $\mathbb C^\times$ fiber $(\Pi_0^{\mathbb C})^{-1}(a)\,\hat\nu(b)$ above $ab$. The existence of $\hat a$, $b$ and $t$ from (4.143) then follows immediately. It remains to be proved that the decomposition (4.143) is unique. The required argument is very similar to that of the proof of Lemma 4.16 and we shall not repeat it here. The lemma is proved.

There exists a succinct way of presenting the commutator in the Lie algebra $\tilde{\tilde{\mathcal D}}$. It reads
$$[(X^c, \xi^c(\sigma), x^c), (Y^c, \eta^c(\sigma), y^c)] = \Bigl(0,\ [\xi^c, \eta^c] - iX^c\partial_\sigma\eta^c + iY^c\partial_\sigma\xi^c,\ \frac{i}{2\pi}\int (\xi^c, \partial_\sigma\eta^c)_{\mathcal G_0^{\mathbb C}}\Bigr). \tag{4.145}$$
Here $X^c, Y^c, x^c, y^c$ are complex numbers and $\xi^c, \eta^c \in L\mathcal G_0^{\mathbb C}$. The identification of the various generators is as follows. $\tilde T^0 = (i, 0, 0)$ corresponds to the automorphisms $S$ and $\tilde t^1_\infty = (1, 0, 0)$ to the automorphisms $Q$. Moreover, $\tilde T^\infty = (0, 0, i)$ is to be identified with the central circle generator and $\tilde t^1_0 = (0, 0, 1)$ with the central line generator corresponding to the group $\mathbb R$ in the decomposition $\tilde B = \mathbb R \times \hat B$. We stress that we view $\tilde{\tilde{\mathcal D}}$ as a real Lie algebra; nevertheless, we observe that the commutator just defined is complex bilinear, hence $\tilde{\tilde{\mathcal D}}$ possesses a natural complex structure. Moreover, it is instructive to compare this formula with the commutator (2.11) of the central biextension algebra $\tilde{\mathcal G}$. We observe immediately that $\tilde{\mathcal G}$ is the real form of $\tilde{\tilde{\mathcal D}}$ viewed as a complex Lie algebra. In other words, the affine Lu–Weinstein double $\tilde{\tilde{\mathcal D}}$ is nothing but the complexification $\tilde{\mathcal G}^{\mathbb C}$, in full analogy with the state of affairs for the finite-dimensional ordinary Lu–Weinstein double. We can define the following invariant non-degenerate bilinear form on $\tilde{\mathcal G}^{\mathbb C}$
$$((X^c, \xi^c, x^c), (Y^c, \eta^c, y^c))_{\tilde{\mathcal G}^{\mathbb C}} = (\xi^c, \eta^c)_{\mathcal G^{\mathbb C}} + X^c y^c + Y^c x^c, \tag{4.146}$$
where the form $(\cdot,\cdot)_{\mathcal G^{\mathbb C}}$ was defined in (4.105). The invariant non-degenerate bilinear form on the double $\tilde{\tilde{\mathcal D}}$ is then defined as
$$((X^c, \xi^c, x^c), (Y^c, \eta^c, y^c))_{\tilde{\tilde{\mathcal D}}} = \frac{1}{\varepsilon}\,\mathrm{Im}\,((X^c, \xi^c, x^c), (Y^c, \eta^c, y^c))_{\tilde{\mathcal G}^{\mathbb C}}. \tag{4.147}$$
Note that the form (4.147) is the analogue of the forms (4.106) and (4.81).

Lemma 4.21. The group $\tilde{\tilde D}$ is indeed the common Heisenberg double of the groups $\tilde G$ and $\tilde B$.
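Proof. Parts of this proposition were already proved in the preceding lemmas. The only thing that remains is to prove the isotropy of the Lie algebras $\tilde{\mathcal G}$ and $\tilde{\mathcal B}$ with respect to the bilinear form (4.147). We start with $\tilde{\mathcal G}$. In the parametrization of $\tilde{\tilde{\mathcal D}}$ given by (4.145), we have $\tilde{\mathcal G} = \mathrm{Span}(iR, \mathcal G, ir)$, $R, r \in \mathbb R$, and from (4.146) and (4.147) it follows that $(\tilde{\mathcal G}, \tilde{\mathcal G})_{\tilde{\tilde{\mathcal D}}} = 0$. On the other hand, $\tilde{\mathcal B} = \mathrm{Span}(R, \mathcal B, r)$, $R, r \in \mathbb R$, and again $(\tilde{\mathcal B}, \tilde{\mathcal B})_{\tilde{\tilde{\mathcal D}}} = 0$. The lemma is proved.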
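The two isotropy statements in this proof can be made explicit in a couple of lines. The sketch below uses only (4.146), (4.147), the reality of the Killing–Cartan form on the compact real form $\mathcal G_0$, and the isotropy of $\mathcal B$ established in Sec. 4.4.1; the labels $R_i, r_i, \xi_i, \beta_i$ are introduced here for illustration only.

```latex
\begin{aligned}
&\text{For } (iR_i,\xi_i,ir_i)\in\tilde{\mathcal G},\ \ \xi_i\in\mathcal G = L\mathcal G_0,\ \ R_i,r_i\in\mathbb R:\\
&\qquad ((iR_1,\xi_1,ir_1),(iR_2,\xi_2,ir_2))_{\tilde{\mathcal G}^{\mathbb C}}
   = (\xi_1,\xi_2)_{\mathcal G^{\mathbb C}} - R_1 r_2 - R_2 r_1 \ \in\ \mathbb R
   \ \Longrightarrow\ \mathrm{Im}(\dots) = 0 .\\[4pt]
&\text{For } (R_i,\beta_i,r_i)\in\tilde{\mathcal B},\ \ \beta_i\in\mathcal B = L_+\mathcal G_0^{\mathbb C},\ \ R_i,r_i\in\mathbb R:\\
&\qquad \tfrac{1}{\varepsilon}\,\mathrm{Im}\bigl[(\beta_1,\beta_2)_{\mathcal G^{\mathbb C}} + R_1 r_2 + R_2 r_1\bigr]
   = \tfrac{1}{\varepsilon}\,\mathrm{Im}\,(\beta_1,\beta_2)_{\mathcal G^{\mathbb C}}
   = (\beta_1,\beta_2)_{\mathcal D} = 0 .
\end{aligned}
```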
4.5. Conclusion

We have shown that $\tilde B$ is the direct product of the central line $\mathbb R$ with $\hat B$, i.e. $\tilde B = \mathbb R \times \hat B$, and that $\hat B = \mathbb R \times_Q B$. Thus the requirements of Definition 4.10 of the WZW double $\tilde{\tilde D}$ are indeed verified for the affine Lu–Weinstein double and we can safely perform the symplectic reduction leading to the quasitriangular WZW model.

5. Loop Group Quasitriangular WZW Model

We arrive at the core of this paper. We shall consider the loop group quasitriangular WZW model corresponding to the affine Lu–Weinstein double $\tilde{\tilde D}$ introduced in the previous section. The definition of this model involves a certain symplectic reduction (cf. Definition 4.13 of Sec. 4.3), which is the quasitriangular analogue of the full left-right reduction described in Secs. 2.2.2 and 2.2.3. However, in the case of the standard loop group WZW model described in Secs. 2 and 3, we have also used the alternative approach of performing the reduction at the chiral level. The full left-right WZW model was then obtained by combining two copies of the reduced chiral model. Although both approaches gave the same result, the chiral reduction was much simpler from both the conceptual and the technical points of view. Here we shall take advantage of the fact that in the loop group case, one can also construct a chiral second floor quasitriangular master model. Therefore, we can perform the symplectic reduction directly at the chiral level. As in the standard case, the full left-right quasitriangular WZW model will then be obtained by an appropriate combination of the reduced chiral copies. It can be shown that this gives the same result as the reduction of the full left-right master model on $\tilde{\tilde D}$ in the spirit of Definition 4.13.

Our strategy here will therefore be as follows: first we introduce the quasitriangular chiral master $\tilde G$-model and perform the quasitriangular analogue of the two-step chiral symplectic reduction of Sec. 3. In this way, we construct the main result of our paper, which is the chiral quasitriangular WZW model. Moreover, we shall give a very explicit description of its symplectic structure and of its Hamiltonian, thus making evident that we are really dealing with a one-parameter deformation of the standard chiral WZW model described in Sec. 3.2.4. We shall finally combine two copies of the chiral model to obtain the full left-right quasitriangular WZW theory.
5.1. Quasitriangular chiral geodesical model

We shall first illustrate the idea of the quasitriangular chiral decomposition in the (finite-dimensional) case of the geodesical model on the Lu–Weinstein–Soibelman double $G_0^{\mathbb C}$. This section should serve as the ideological and technical reference for the more complicated infinite-dimensional structures described later.

5.1.1. Chiral splitting of the Semenov-Tian-Shansky form

It is a well-known fact [47] that every element $K$ of a simple complex connected and simply connected group $G_0^{\mathbb C}$ can be decomposed as
$$K = k_L\,a\,k_R^{-1}, \qquad k_{L,R} \in G_0, \quad a \in A_+ = \exp\Lambda_0(\mathcal A_+^0). \tag{5.1}$$
Here $G_0$ is the compact real form of $G_0^{\mathbb C}$ and $\mathcal A_+^0 \subset \Upsilon_0^{-1}(\mathcal T) \subset \mathcal G_0^*$ is the positive Weyl chamber introduced in Sec. 3.1.1. Moreover, $\Lambda_0$ is the identification map $\Lambda_0 : \mathcal G_0^* \to \mathcal B_0$ defined as
$$(\Lambda_0(x^*), y)_{\mathcal D} = \langle x^*, y\rangle, \qquad x^* \in \mathcal G_0^*, \quad y \in \mathcal G_0. \tag{5.2}$$
Note that $\Lambda_0$ depends on the parameter $\varepsilon$ (cf. (4.81)), since the form $(\cdot,\cdot)_{\mathcal D}$ does, too. It is easy to see that $A_+ = \exp\Lambda_0(\mathcal A_+^0)$ is a subset of the group $A$ appearing in the Iwasawa decomposition $G_0^{\mathbb C} = G_0AN$. This subset does not depend on $\varepsilon$, however, since the Weyl chamber is invariant under scaling. The decomposition (5.1) is called the Cartan decomposition and its ambiguity is given by the simultaneous right multiplication of $k_L$ and $k_R$ by the same element of $\mathbb T$. Note that the same was true for the Cartan decomposition of $T^*G_0$ (cf. Theorem 3.2).

Consider the manifold $G_0 \times A_+ \times G_0$; we shall denote its points as triples $(k_L, a, k_R)$. The Cartan decomposition (5.1) then induces a natural map $\Xi$ from this manifold into the complex Heisenberg double $D = G_0^{\mathbb C}$. We can then pull back the Semenov-Tian-Shansky symplectic form $\omega$ by the map $\Xi$. The following lemma is of crucial importance to the success of our programme.

Lemma 5.1. Consider maps $\Xi_{L,R} : G_0 \times A_+ \times G_0 \to D$ defined as $\Xi_L(k_L, a, k_R) = k_L a$ and $\Xi_R(k_L, a, k_R) = a k_R^{-1}$. Then
$$\Xi^*\omega = \Xi_L^*\omega + \Xi_R^*\omega. \tag{5.3}$$
Remark. The statement of Lemma 5.1 can be restated intuitively as follows. The pullback form $\Xi^*\omega$ can be chirally decomposed into a left and a right part, which talk to each other only via the variable $a$.

Proof. First we write the Semenov-Tian-Shansky form $\omega$ as follows
$$\omega = -\frac12\,(g_R^{-1}dg_R \stackrel{\wedge}{,} K^{-1}dK)_{\mathcal D} - \frac12\,(dg_L\,g_L^{-1} \stackrel{\wedge}{,} dK\,K^{-1})_{\mathcal D}. \tag{5.4}$$
Recall (cf. (4.11)) that $K^{-1}dK$ denotes the left-invariant Maurer–Cartan form on the Heisenberg double $D$ and $g_R^{-1}dg_R$ is $g_R^*\lambda_{G_0}$, where $g_R$ and $g_L$ are induced by the decomposition
$$K = b_L(K)\,g_R(K) = g_L(K)\,b_R(K). \tag{5.5}$$
We can easily recover the original formula (4.8) for $\omega$ by using the isotropy of $\mathcal G_0$ with respect to $(\cdot,\cdot)_{\mathcal D}$ and by noting that
$$K^{-1}dK = g_R^{-1}(b_L^{-1}db_L)\,g_R + g_R^{-1}dg_R, \qquad dK\,K^{-1} = dg_L\,g_L^{-1} + g_L(db_R\,b_R^{-1})\,g_L^{-1}. \tag{5.6}$$
Consider now another decomposition of $K \in D$,
$$K = p_L k = k\,p_R, \tag{5.7}$$
where
$$k = k_L k_R^{-1}, \qquad p_L = k_L\,a\,k_L^{-1}, \qquad p_R = k_R\,a\,k_R^{-1}. \tag{5.8}$$
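The two factorizations in (5.7) follow directly from (5.1) and (5.8). The short check below merely restates those formulas; its last line also illustrates why the torus ambiguity of the Cartan decomposition drops out of $k$ and $p_{L,R}$, assuming (as is the case for the Iwasawa factor $A$) that every $h \in \mathbb T$ commutes with $a$.

```latex
\begin{aligned}
p_L\,k &= (k_L a k_L^{-1})(k_L k_R^{-1}) = k_L a k_R^{-1} = K, \\
k\,p_R &= (k_L k_R^{-1})(k_R a k_R^{-1}) = k_L a k_R^{-1} = K, \\
(k_L h)(k_R h)^{-1} &= k_L k_R^{-1} = k, \qquad
(k_L h)\,a\,(k_L h)^{-1} = k_L (h a h^{-1}) k_L^{-1} = p_L
\quad (h\in\mathbb T,\ ha = ah).
\end{aligned}
```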
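Clearly, $k_{L,R}$ and $a$ come from the Cartan decomposition (5.1). Although they are not given unambiguously, this ambiguity disappears in (5.8); in other words, $k$ and $p_{L,R}$ are uniquely fixed by $K \in D$. We can successively rewrite the form $\omega$ as follows
$$\begin{aligned}
-2\omega &= (g_R^{-1}dg_R \stackrel{\wedge}{,} K^{-1}dK)_{\mathcal D} + (dg_L\,g_L^{-1} \stackrel{\wedge}{,} dK\,K^{-1})_{\mathcal D} \\
&= (g_R^{-1}dg_R \stackrel{\wedge}{,} k^{-1}p_L^{-1}dp_L\,k)_{\mathcal D} + (dg_L\,g_L^{-1} \stackrel{\wedge}{,} k\,dp_R\,p_R^{-1}k^{-1})_{\mathcal D} \\
&= (k g_R^{-1}\,d(g_R k^{-1}) \stackrel{\wedge}{,} p_L^{-1}dp_L)_{\mathcal D} + (d(k^{-1}g_L)\,g_L^{-1}k \stackrel{\wedge}{,} dp_R\,p_R^{-1})_{\mathcal D} \\
&\quad + (dk\,k^{-1} \stackrel{\wedge}{,} p_L^{-1}dp_L)_{\mathcal D} + (k^{-1}dk \stackrel{\wedge}{,} dp_R\,p_R^{-1})_{\mathcal D}.
\end{aligned} \tag{5.9}$$
In deriving this equality, we have used the isotropy of $\mathcal G_0$ with respect to $(\cdot,\cdot)_{\mathcal D}$. Now we see from (5.5) and (5.7) that
$$g_R k^{-1} = b_L^{-1}p_L, \qquad k^{-1}g_L = p_R\,b_R^{-1}. \tag{5.10}$$
This permits us to write
$$\begin{aligned}
-2\omega &= (dp_L\,p_L^{-1} \stackrel{\wedge}{,} db_L\,b_L^{-1})_{\mathcal D} + (p_R^{-1}dp_R \stackrel{\wedge}{,} b_R^{-1}db_R)_{\mathcal D} \\
&\quad + (dk\,k^{-1} \stackrel{\wedge}{,} p_L^{-1}dp_L)_{\mathcal D} + (k^{-1}dk \stackrel{\wedge}{,} dp_R\,p_R^{-1})_{\mathcal D}.
\end{aligned} \tag{5.11}$$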
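The passage from (5.9) to (5.11) can be traced for the first term as follows; the second term is handled in the same way with $k^{-1}g_L = p_R b_R^{-1}$. This is a sketch that uses only (5.10), the Ad-invariance and symmetry of $(\cdot,\cdot)_{\mathcal D}$, and the resulting antisymmetry $(\alpha \stackrel{\wedge}{,} \beta)_{\mathcal D} = -(\beta \stackrel{\wedge}{,} \alpha)_{\mathcal D}$ for Lie-algebra-valued 1-forms; the symbol $u$ is introduced here for convenience only.

```latex
\begin{aligned}
u &:= g_R k^{-1} = b_L^{-1}p_L , \\
k g_R^{-1}\,d(g_R k^{-1}) &= u^{-1}du
   = p_L^{-1}dp_L - p_L^{-1}(db_L\,b_L^{-1})\,p_L , \\
(u^{-1}du \stackrel{\wedge}{,} p_L^{-1}dp_L)_{\mathcal D}
  &= \underbrace{(p_L^{-1}dp_L \stackrel{\wedge}{,} p_L^{-1}dp_L)_{\mathcal D}}_{=\,0}
     - (db_L\,b_L^{-1} \stackrel{\wedge}{,} dp_L\,p_L^{-1})_{\mathcal D} \\
  &= (dp_L\,p_L^{-1} \stackrel{\wedge}{,} db_L\,b_L^{-1})_{\mathcal D} .
\end{aligned}
```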
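Now there are four terms on the right-hand side of (5.11). We insert the expressions (5.8) into the last two of them. The result reads
$$\begin{aligned}
\Xi^*\omega &= \frac12\,(db_L\,b_L^{-1} \stackrel{\wedge}{,} dp_L\,p_L^{-1})_{\mathcal D} + \frac12\,(b_R^{-1}db_R \stackrel{\wedge}{,} p_R^{-1}dp_R)_{\mathcal D} \\
&\quad + \frac12\,((da\,a^{-1} + a^{-1}da) \stackrel{\wedge}{,} k_L^{-1}dk_L)_{\mathcal D} - \frac12\,((da\,a^{-1} + a^{-1}da) \stackrel{\wedge}{,} k_R^{-1}dk_R)_{\mathcal D} \\
&\quad + \frac12\,(k_L^{-1}dk_L \stackrel{\wedge}{,} a\,k_L^{-1}dk_L\,a^{-1})_{\mathcal D} - \frac12\,(k_R^{-1}dk_R \stackrel{\wedge}{,} a\,k_R^{-1}dk_R\,a^{-1})_{\mathcal D}.
\end{aligned} \tag{5.12}$$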
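Now observe that
$$b_L(k_L a k_R^{-1}) = b_L(k_L a), \qquad b_R(k_L a k_R^{-1}) = b_R(a k_R^{-1}). \tag{5.13}$$
This means that quantities bearing the index $L$ ($R$) do not depend on $k_R$ ($k_L$), hence the statement of the lemma follows.

Consider now the model spaces $M_L = G_0 \times \mathcal A_+^0$ and $M_R = G_0 \times \mathcal A_-^0$, where $\mathcal A_-^0 = -\mathcal A_+^0$. The symplectic form $\omega_L$ on $M_L$ is defined as
$$\omega_L = \tilde\Xi_L^*\omega = \frac12\,(db_L\,b_L^{-1} \stackrel{\wedge}{,} dp_L\,p_L^{-1})_{\mathcal D} + (da\,a^{-1} \stackrel{\wedge}{,} k_L^{-1}dk_L)_{\mathcal D} + \frac12\,(k_L^{-1}dk_L \stackrel{\wedge}{,} a\,k_L^{-1}dk_L\,a^{-1})_{\mathcal D}. \tag{5.14}$$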
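To see that (5.14) is indeed the left part of (5.12), it may help to note that $A$ is Abelian, so the two Maurer–Cartan forms of $a = \exp\Lambda_0(\phi)$ coincide. A minimal sketch of this step, using nothing beyond (5.12) and (5.15):

```latex
\begin{aligned}
a = \exp\Lambda_0(\phi)\ \Longrightarrow\ da\,a^{-1} &= a^{-1}da = d\Lambda_0(\phi), \\
\tfrac12\,\bigl((da\,a^{-1} + a^{-1}da) \stackrel{\wedge}{,} k_L^{-1}dk_L\bigr)_{\mathcal D}
  &= (da\,a^{-1} \stackrel{\wedge}{,} k_L^{-1}dk_L)_{\mathcal D} .
\end{aligned}
```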
The map $\tilde\Xi_L : M_L \to D$ is given by $\tilde\Xi_L(k_L, \phi_L) = k_L a_L$ for $(k_L, \phi_L) \in M_L$, where
$$a_L = \exp\Lambda_0(\phi_L) \tag{5.15}$$
and $p_L$ and $b_L$ are defined as before. Now a subtlety: we define the symplectic form $\omega_R$ on $M_R$ by exactly the same formula, i.e.,
$$\omega_R = \tilde\Xi_R^*\omega, \tag{5.16}$$
where $\tilde\Xi_R(k_R, \phi_R) = k_R a_R$, $a_R = \exp\Lambda_0(\phi_R)$. Such a definition may look surprising because the right part of the form $\Xi^*\omega$ in (5.12) was obtained by pulling back by the map $a_L k_R^{-1}$ rather than by $k_R a_R$. We shall see in a while that this gives the same thing, however. The advantage of our definition is evident: it allows us to study only the case $M_L$ since the symplectic structure on $M_R$ is the same (up to the change in the domain of the variable $\phi$: $\mathcal A_+^0 \to \mathcal A_-^0$).

Consider the manifold $M_L \times M_R$ equipped with the symplectic form
$$\omega_{L\times R} = \omega_L + \omega_R. \tag{5.17}$$
Lemma 5.2. The submanifold of $M_L \times M_R$ given by the condition $a_L a_R = 1$ is naturally diffeomorphic to $G_0 \times A_+ \times G_0$, and the form $\omega_{L\times R}$ restricted to this submanifold is nothing but $\Xi^*\omega$ given by Eq. (5.12).

Proof. The left part of $\Xi^*\omega$ (cf. (5.12)) coincides with $\omega_L$ by definition. Let us show that the right part gives $\omega_R$. We have from (5.14) and (5.16)
$$\omega_R = \frac12\,(db_L(k_R a_R)\,b_L^{-1}(k_R a_R) \stackrel{\wedge}{,} dp_R\,p_R^{-1})_{\mathcal D} + (da_R\,a_R^{-1} \stackrel{\wedge}{,} k_R^{-1}dk_R)_{\mathcal D} + \frac12\,(k_R^{-1}dk_R \stackrel{\wedge}{,} a_R\,k_R^{-1}dk_R\,a_R^{-1})_{\mathcal D}, \tag{5.18}$$
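where $p_R = k_R a_R k_R^{-1}$. Now we have to set $a_R = a_L^{-1} \equiv a^{-1}$ and insert this in (5.18). We obtain
$$\begin{aligned}
\omega_R &= \frac12\,(db_L(k_R a^{-1})\,b_L^{-1}(k_R a^{-1}) \stackrel{\wedge}{,} d(k_R a^{-1}k_R^{-1})\,k_R a k_R^{-1})_{\mathcal D} \\
&\quad + (k_R^{-1}dk_R \stackrel{\wedge}{,} da\,a^{-1})_{\mathcal D} - \frac12\,(k_R^{-1}dk_R \stackrel{\wedge}{,} a\,k_R^{-1}dk_R\,a^{-1})_{\mathcal D}.
\end{aligned} \tag{5.19}$$
Now we see that the second and third terms on the right-hand side of (5.19) have their counterparts in the right part of $\Xi^*\omega$. It remains to show that
$$\begin{aligned}
&(d(k_R a^{-1}k_R^{-1})\,k_R a k_R^{-1} \stackrel{\wedge}{,} db_L(k_R a^{-1})\,b_L^{-1}(k_R a^{-1}))_{\mathcal D} \\
&\qquad = ((k_R a^{-1}k_R^{-1})\,d(k_R a k_R^{-1}) \stackrel{\wedge}{,} b_R^{-1}(a k_R^{-1})\,db_R(a k_R^{-1}))_{\mathcal D}.
\end{aligned} \tag{5.20}$$
But this equality follows from the following obvious relation
$$b_L(k_R a^{-1}) = b_R^{-1}(a k_R^{-1}). \tag{5.21}$$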
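For completeness, here is a sketch of why (5.21) holds and how it implies (5.20). It assumes only the uniqueness of the decomposition (5.5) and the identity $d(X^{-1})X = -X^{-1}dX$; the abbreviation $P$ is introduced here for convenience only.

```latex
\begin{aligned}
a k_R^{-1} &= g_L\,b_R, \qquad g_L\in G_0,\ \ b_R = b_R(a k_R^{-1})\in B_0
   \quad\text{(by (5.5))}, \\
k_R a^{-1} &= (a k_R^{-1})^{-1} = b_R^{-1}\,g_L^{-1}
   \ \Longrightarrow\ b_L(k_R a^{-1}) = b_R^{-1}(a k_R^{-1}), \\
db_L\,b_L^{-1}\Big|_{k_R a^{-1}} &= d(b_R^{-1})\,b_R = -\,b_R^{-1}db_R\Big|_{a k_R^{-1}}, \qquad
d(P^{-1})\,P = -\,P^{-1}dP, \quad P := k_R a k_R^{-1} .
\end{aligned}
```

The two minus signs cancel in the pairing, which turns the left-hand side of (5.20) into its right-hand side.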
The lemma is proved.

Corollary 5.3. The symplectic reduction of the form $\omega_{L\times R}$ by the relation $a_L a_R = 1$ is the Semenov-Tian-Shansky form $\omega$ on $G_0^{\mathbb C}$.

Proof. The form $\omega_{L\times R}$ restricted to $a_L a_R = 1$ coincides with $\Xi^*\omega$ given by (5.12) and it is clearly degenerate along orbits of the maximal torus $\mathbb T$ acting as $(a, k_L, k_R) \to (a, k_L h, k_R h)$, $h \in \mathbb T$. The orbit space is nothing but $G_0^{\mathbb C}$.

So far we have shown that the symplectic structure $\omega$ on $D = G_0^{\mathbb C}$ naturally originates from $\omega_{L\times R}$ under the symplectic reduction induced by setting $a_L a_R = 1$. But the Hamiltonian $H(K)$ introduced in Sec. 4.2 can also be "descended" from some Hamiltonian on $M_L \times M_R$. Indeed, the latter is defined as follows
$$H_{L\times R} = H_L + H_R = -\frac12\,(\phi_L, \phi_L)_{\mathcal G_0^*} - \frac12\,(\phi_R, \phi_R)_{\mathcal G_0^*} = \frac12\,a_L^\mu a_L^\mu + \frac12\,a_R^\mu a_R^\mu. \tag{5.22}$$
Recall that
$$a_L = \exp(a_L^\mu\Lambda_0(t_\mu)), \qquad a_R = \exp(a_R^\mu\Lambda_0(t_\mu)). \tag{5.23}$$
Clearly, the $\Lambda_0(t_\mu)$'s are in $\mathrm{Lie}(A) \subset \mathcal B_0$, they fulfil $(\Lambda_0(t_\mu), T^\nu)_{\mathcal D} = \langle t_\mu, T^\nu\rangle = \delta_\mu^\nu$, and $T^\mu \in \mathcal G_0$ were defined in (3.24). By abuse of notation, we shall often write $\Lambda_0(t_i) = t_i$, i.e., the identification map will be tacitly assumed (cf. (4.6)). However, in cases where confusion can arise, we shall use the symbol $\Lambda_0$ explicitly.

Note that the Hamiltonians $H_{L,R}$ on $M_{L,R}$ coincide with the Hamiltonians of the standard geodesical model (3.12). It is the symplectic form $\omega_L$, defined by (5.14), that differs from $d\theta_L$ given by (3.11). We shall see, however, that for $\varepsilon \to 0$ it holds that $\omega_L \to d\theta_L$. The Hamiltonians do not depend on $k_L$, $k_R$, respectively, hence they trivially survive the symplectic reduction and give the Hamiltonian (4.102) on $D = G_0^{\mathbb C}$.
September 6, 2004 14:35 WSPC/148-RMP
756
00214
C. Klimˇ c´ık
Definition 5.4. The chiral quasitriangular geodesical model is the dynamical system whose phase space is $M_L$, parametrized by couples $(k \in G_0, \phi \in \mathcal A_+^0)$, whose Hamiltonian is

$$H_L = -\frac{1}{2}(\phi, \phi)_{\mathcal G_0^*} = \frac{1}{2}a^\mu a^\mu \tag{5.24}$$

and whose symplectic form is

$$\omega_L = \frac{1}{2}(db_L(ka)\,b_L^{-1}(ka) \stackrel{\wedge}{,} dp\,p^{-1})_D + (da\,a^{-1} \stackrel{\wedge}{,} k^{-1}dk)_D + \frac{1}{2}(k^{-1}dk \stackrel{\wedge}{,} a(k^{-1}dk)a^{-1})_D , \tag{5.25}$$
where $p = kak^{-1}$ and $a = \exp\Lambda_0(\phi) = \exp(a^\mu\Lambda_0(t_\mu))$.

We have learned in this section that the quasitriangular geodesical model formulated on the Lu–Weinstein double $G_0^{\mathbb C}$ admits a chiral decomposition into the two chiral models defined above. By this we mean that it can be obtained by the symplectic reduction of the model defined on $M_L \times M_R$ and characterized by the symplectic form $\omega_{L\times R}$ and the Hamiltonian $H_{L\times R}$.

Remark. The chiral quasitriangular geodesical model was apparently first proposed in [4]. In what follows, we shall invert the symplectic form $\omega_L$ in a way which technically differs from that of [4]. Its spirit is the same, however, in that we use the Poisson–Lie symmetry of $\omega_L$.

5.1.2. The power of the Poisson–Lie symmetry

In Secs. 5.1.2 and 5.1.3, we shall often parametrize the points of $M_L$ by couples $(k, a)$, where $k \in G_0$ and $a = e^{\Lambda_0(\phi)} \in A_+$.

Lemma 5.5. The chiral quasitriangular geodesical model is Poisson–Lie symmetric (cf. Definition 4.7) with respect to the left action $(k, a) \to (k_0 k, a)$ of the group $G_0$ on the model space $M_L$. The corresponding non-Abelian moment map $M : M_L \to B$ is given by $M(k, a) = b_L(ka) = \mathrm{Dres}_k a$.

Proof. Consider a point $(k, a)$ in $M_L$. This point can be mapped into $D$ as $\tilde\Xi(k, a) = ka$ under the embedding $M_L \hookrightarrow D$. Multiplication of $(k, a)$ on the left by an infinitesimal generator $T \in \mathcal G_0$ gives the vector $v_T = (Tk, a)$, hence the vector $\tilde\Xi_* v_T = Tka = R_{(ka)*}T \in T_{(ka)}D$. We want to show that

$$\omega_L(\cdot, v_T) = (T, dM\,M^{-1})_D . \tag{5.26}$$

Since the form $\omega_L$ is the pullback $\tilde\Xi^*\omega$, we infer

$$\omega_L(u, v_T) = \omega(\tilde\Xi_* u, \tilde\Xi_* v_T) = \omega(\tilde\Xi_* u, R_{(ka)*}T) , \tag{5.27}$$
where $u$ is an arbitrary vector at the point $(k, a) \in M_L$. Now from (4.88), we conclude

$$\omega(\tilde\Xi_* u, R_{(ka)*}T) = (T, \langle b_L^*\rho_B, \tilde\Xi_* u\rangle)_D = (T, \langle M^*\rho_B, u\rangle)_D = (T, \langle dM\,M^{-1}, u\rangle)_D , \tag{5.28}$$

because $\tilde\Xi^* b_L = M$. The Hamiltonian $H_L$ does not depend on $k$; it is therefore clearly invariant with respect to $(k, a) \to (k_0 k, a)$. The lemma is proved.

The existence of the non-Abelian moment map $M : M_L \to B$ entails a certain differential condition on the Poisson bivector $\Sigma_L$ corresponding to the form $\omega_L$. Recall first that

$$\Sigma_L(\cdot, \omega_L(\cdot, u)) = u \tag{5.29}$$

for an arbitrary vector $u$ at an arbitrary point of $M_L$. It then follows from (5.26) that

$$\Sigma_L(\cdot, dM\,M^{-1}) = v , \tag{5.30}$$

where $v \in TM_L \otimes \mathcal G_0^*\ (\equiv TM_L \otimes \mathcal B_0)$ generates the left action of $G_0$ on the model space $M_L$. Now calculate the Lie derivative $\mathcal L_v\Sigma_L$ with respect to $v$. The result is

$$\mathcal L_v\Sigma_L = \mathcal L_v\omega_L^{-1} = -\Sigma_L(\mathcal L_v\omega_L)\Sigma_L = -\Sigma_L(d\,i_v\omega_L)\Sigma_L = +\Sigma_L\,d(dM\,M^{-1})\,\Sigma_L = \Sigma_L(dM\,M^{-1}\wedge dM\,M^{-1})\Sigma_L = -v\wedge v . \tag{5.31}$$
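The chain of equalities in (5.31) compresses several standard steps. As a sketch, assuming only that $\omega_L$ is closed and writing $i_v$ for the contraction with $v$, the intermediate identities can be unpacked as follows; the sign of the final result is the one stated in (5.31).

```latex
\begin{align*}
\mathcal L_v\Sigma_L &= -\Sigma_L(\mathcal L_v\omega_L)\Sigma_L
  && \text{(Lie derivative of the inverse, } \Sigma_L = \omega_L^{-1})\\
\mathcal L_v\omega_L &= d\,i_v\omega_L + i_v\,d\omega_L = d\,i_v\omega_L
  && \text{(Cartan's formula with } d\omega_L = 0)\\
i_v\omega_L &= \omega_L(v,\cdot) = -\omega_L(\cdot,v) = -dM\,M^{-1}
  && \text{(antisymmetry and (5.26))}\\
d(dM\,M^{-1}) &= -dM\wedge d(M^{-1}) = dM\,M^{-1}\wedge dM\,M^{-1}
  && \text{(since } d(M^{-1}) = -M^{-1}\,dM\,M^{-1})
\end{align*}
```

In the last step of (5.31), each factor $dM\,M^{-1}$ is converted into $v$ by contracting with $\Sigma_L$ as in (5.30).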
The last relation can also be written in some basis $T^i$ of $\mathcal G_0$ as

$$\mathcal L_{v^i}\Sigma_L = -\frac{1}{2}\tilde f^i{}_{jk}\,v^j\wedge v^k , \tag{5.32}$$

where

$$\tilde f^i{}_{jk} = (T^i, [t_j, t_k])_D , \qquad (t_i, T^j)_D = \delta_i^j , \qquad v^i = (v, T^i)_D . \tag{5.33}$$

This is the differential condition that the Poisson tensor $\Sigma_L$ must obey. Note first that there is a particular solution of the differential condition (5.32). It reads

$$\Sigma_L^{\rm part}(k, a) = -\frac{1}{2}\Pi^R_{ij}(k)\,R_{k*}T^i\wedge R_{k*}T^j , \tag{5.34}$$

where the matrix $\Pi^R_{ij}(k)$ is given by

$$\Pi^R_{ij}(k) = -B_{ik}(k)\,A^k{}_j(k^{-1}) , \tag{5.35}$$

$$B_{ik}(k) = (k^{-1}t_i k, t_k)_D , \qquad A^k{}_j(k) = (k^{-1}T^k k, t_j)_D . \tag{5.36}$$
Now we exchange the roles of the groups $G$ and $B$ in Proposition 4.5 to see that, up to a minus sign, the right-hand side of (5.35) is the Poisson–Lie bivector on the group manifold $G_0$ (here viewed as a bivector on $G_0 \times A_+$). If we
regard the condition (5.32) as a differential equation for the unknown bivector $\Sigma_L$, we see immediately that its general solution can be given as the sum of the particular solution (5.34) and any solution of the homogeneous equation. But the latter is nothing but an arbitrary $G_0$-right-invariant bivector, because $v^i = R_{k*}T^i$. We can therefore write the following ansatz for the Poisson bivector $\Sigma_L$ that we seek:

$$\Sigma_L(k, a) = -\frac{1}{2}\Pi^R_{ij}(k)\,R_{k*}T^i\wedge R_{k*}T^j + \Sigma'_{ij}(a)\,L_{k*}T^i\wedge L_{k*}T^j + \sigma_i{}^\mu(a)\,L_{k*}T^i\wedge\frac{\partial}{\partial a^\mu} + s^{\mu\nu}(a)\,\frac{\partial}{\partial a^\mu}\wedge\frac{\partial}{\partial a^\nu} . \tag{5.37}$$

Our task is to find the coefficient functions $\Sigma'_{ij}(a)$, $\sigma_i{}^\mu(a)$ and $s^{\mu\nu}(a)$. Thus we observe the power of the Poisson–Lie symmetry: the Poisson tensor $\Sigma_L$ on the model space $M_L$ is completely determined by its value at the points $(e, a) \in M_L$, in other words, at the unit element $e$ of $G_0$.
5.1.3. Deformed dynamical r-matrix

Here we shall evaluate the unknown functions $\Sigma'_{ij}(a)$, $\sigma_i{}^\mu(a)$ and $s^{\mu\nu}(a)$. For this, we first calculate the matrix of the symplectic form $\omega_L$ at the points $(e, a)$ (in the basis $R_{a*}T^i$, $R_{a*}t_\mu$ of $T_{(e,a)}M_L$) and then we invert it to obtain the unknown coefficient functions of the Poisson bracket. The basis $T^i$ of $\mathcal G_0$ was canonically chosen in (3.24), i.e. $T^i = (T^\mu, B^\alpha, C^\alpha)$, where

$$T^\mu = iH^\mu , \qquad B^\alpha = \frac{i}{\sqrt 2}(E^\alpha + E^{-\alpha}) , \qquad C^\alpha = \frac{1}{\sqrt 2}(E^\alpha - E^{-\alpha}) . \tag{3.24}$$

Recall then that $a$ is an element of the group $A$; its Lie algebra $\mathrm{Lie}(A)$ is generated by the generators $t_\mu = \varepsilon H^\mu$, which are dual to $iH^\mu$ with respect to the bilinear form $(\cdot,\cdot)_D$ given by the formula (4.81). The full dual basis reads

$$t_i = (t_\mu, b_\alpha, c_\alpha) = \left(\varepsilon H^\mu,\ \frac{\varepsilon|\alpha|^2}{\sqrt 2}E^\alpha,\ -i\frac{\varepsilon|\alpha|^2}{\sqrt 2}E^\alpha\right) \tag{5.38}$$

and it satisfies the basic condition $(t_i, T^j)_D = \delta_i^j$.

Remark. Note that the form $(\cdot,\cdot)_D$ given by (4.81) depends on $\varepsilon$; therefore the dual generators $t_i$ are also $\varepsilon$-dependent. We shall occasionally stress this dependence by writing $(\cdot,\cdot)_\varepsilon$ and $t_i^\varepsilon$.

We decompose the elements of the basis of $T_{(e,a)}M_L$ into several parts: the $\alpha$-part generated by $R_{a*}B^\alpha$, $R_{a*}C^\alpha$, $\alpha \in \Phi_+$, and the $\mu$-part generated by $R_{a*}T^\mu$, $R_{a*}t_\mu$. Now we use the formula (4.18)

$$\omega(t, u) = (t, (\Pi_{\tilde L R} - \Pi_{L\tilde R})u)_D .$$
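We immediately observe that

$$\omega_L(\alpha, \mu) = 0 , \qquad \omega_L(\alpha, \beta) = 0 , \qquad \alpha, \beta \in \Phi_+ , \quad \alpha \neq \beta . \tag{5.39}$$

This is because we are at the point $(e, a) \in M_L\ (\subset D)$. It is hence sufficient to invert the matrix $\omega_L$ for the $\mu$-sector and for every $\alpha$-sector separately. We have

$$\Pi_{L\tilde R}R_{a*}T^\mu = 0 , \qquad \Pi_{\tilde L R}R_{a*}T^\mu = R_{a*}T^\mu . \tag{5.40}$$

From this, we obtain

$$\omega_L(R_{a*}t_\nu, R_{a*}T^\mu) = \delta_\nu^\mu . \tag{5.41}$$

One has

$$R_{a*}C^\alpha = (R_{a*}C^\alpha, L_{a*}t_i)_D\,L_{a*}T^i + (R_{a*}C^\alpha, L_{a*}T^i)_D\,L_{a*}t_i \tag{5.42}$$

and from (5.38)

$$L_{a*}b_\alpha = e^{a^\mu\langle\alpha, t_\mu\rangle}R_{a*}b_\alpha , \qquad L_{a*}c_\alpha = e^{a^\mu\langle\alpha, t_\mu\rangle}R_{a*}c_\alpha . \tag{5.43}$$

Then we have

$$\Pi_{L\tilde R}R_{a*}C^\alpha = e^{a^\mu\langle\alpha, t_\mu\rangle}(L_{a*}B^\alpha, R_{a*}C^\alpha)_D\,R_{a*}b_\alpha + e^{a^\mu\langle\alpha, t_\mu\rangle}(L_{a*}C^\alpha, R_{a*}C^\alpha)_D\,R_{a*}c_\alpha , \tag{5.44}$$

$$\Pi_{\tilde L R}R_{a*}C^\alpha = R_{a*}C^\alpha . \tag{5.45}$$

It follows that

$$\omega_L(R_{a*}B^\alpha, R_{a*}C^\alpha) = -e^{a^\mu\langle\alpha, t_\mu\rangle}(L_{a*}B^\alpha, R_{a*}C^\alpha)_D = \frac{1}{\varepsilon|\alpha|^2}\left(e^{2a^\mu\langle\alpha, t_\mu\rangle} - 1\right) . \tag{5.46}$$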
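Our conclusion is that

$$s^{\mu\nu} = 0 , \qquad \sigma_\alpha{}^\mu = 0 , \qquad \sigma_\mu{}^\nu = \delta_\mu^\nu , \qquad \Sigma'_{\mu\nu} = \Sigma'_{\alpha\mu} = \Sigma'_{\alpha\beta} = 0 \tag{5.47}$$

and

$$\Sigma_L(k, a) = -\frac{1}{2}\Pi^R_{ij}(k)\,R_{k*}T^i\wedge R_{k*}T^j + \sum_{\alpha\in\Phi_+}\frac{\varepsilon|\alpha|^2}{1 - e^{2\varepsilon a^\mu\langle\alpha, H^\mu\rangle}}\,L_{k*}B^\alpha\wedge L_{k*}C^\alpha + L_{k*}T^\mu\wedge\frac{\partial}{\partial a^\mu} . \tag{5.48}$$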
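As a check on the $\alpha$-sector coefficient in (5.48), the corresponding $2\times 2$ block of $\omega_L$ can be inverted by hand. The sketch below assumes the convention $X\wedge Y = X\otimes Y - Y\otimes X$ and that the one-form in (5.29) is contracted with the second slot of the bivector; $c_\alpha$, $s_\alpha$ and $\eta$ are shorthand introduced only for this computation, and $B^\alpha$, $C^\alpha$ stand for the corresponding basis vectors at the point $(e, a)$.

```latex
\begin{align*}
c_\alpha &\equiv \omega_L(B^\alpha, C^\alpha)
   = \frac{1}{\varepsilon|\alpha|^2}\bigl(e^{2a^\mu\langle\alpha,t_\mu\rangle}-1\bigr),
 \qquad \Sigma_L\big|_{\alpha} = s_\alpha\, B^\alpha\wedge C^\alpha ,\\
\eta &\equiv \omega_L(\cdot, C^\alpha):\qquad \eta(B^\alpha) = c_\alpha,\quad \eta(C^\alpha) = 0 ,\\
\Sigma_L(\cdot,\eta) &= s_\alpha\bigl(B^\alpha\,\eta(C^\alpha) - C^\alpha\,\eta(B^\alpha)\bigr)
   = -s_\alpha c_\alpha\, C^\alpha \overset{!}{=} C^\alpha ,\\
s_\alpha &= -\frac{1}{c_\alpha}
   = \frac{\varepsilon|\alpha|^2}{1-e^{2a^\mu\langle\alpha,t_\mu\rangle}}
   = \frac{\varepsilon|\alpha|^2}{1-e^{2\varepsilon a^\mu\langle\alpha,H^\mu\rangle}} .
\end{align*}
```

The last equality uses $t_\mu = \varepsilon H^\mu$. The same argument applied to (5.41) gives the coefficient $1$ in front of $L_{k*}T^\mu\wedge\partial/\partial a^\mu$.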
Recall that in the formulas above, we have set $a = e^{a^\mu t_\mu}$ and $t_\mu = \varepsilon H^\mu$. Denote by $\{\cdot,\cdot\}^q_{M_L}$ the Poisson bracket corresponding to the bivector $\Sigma_L$. Now we wish to calculate the bracket of the type $\{k \stackrel{\otimes}{,} k\}^q_{M_L}$; according to (5.48), it can be decomposed into two parts

$$\{k \stackrel{\otimes}{,} k\}^q_{M_L} = \{k \stackrel{\otimes}{,} k\}_{\Sigma'} - \{k \stackrel{\otimes}{,} k\}^R_{G_0} , \tag{5.49}$$

where, of course, $k$ is understood in some representation $\rho$, the bracket $\{k \stackrel{\otimes}{,} k\}_{\Sigma'}$ is associated to the bivector on the second line of (5.48) and the bracket $\{k \stackrel{\otimes}{,} k\}^R_{G_0}$
is the standard Poisson–Lie bracket (4.70) on the group $G_0$. Let us calculate the latter more explicitly:

$$\{k \stackrel{\otimes}{,} k\}^R_{G_0} = (T^i k\otimes T^j k)\,\Pi^R_{ij}(k) = -(T^i k\otimes T^j k)(k^{-1}t_i k, t_l)_D\,(kT^l k^{-1}, t_j)_D . \tag{5.50}$$

From the isotropy of $\mathcal G_0$, it follows that $(kT^l k^{-1}, t_j)_D\,T^j = kT^l k^{-1}$. Inserting this back into (5.50), we obtain

$$\{k \stackrel{\otimes}{,} k\}^R_{G_0} = -(T^i k\otimes kT^l)(k^{-1}t_i k, t_l)_D . \tag{5.51}$$

We insert into (5.51) another obvious identity, $k^{-1}t_i k = (k^{-1}t_i k, T^l)_D\,t_l + (k^{-1}t_i k, t_l)_D\,T^l$, to finally obtain

$$\{k \stackrel{\otimes}{,} k\}^R_{G_0} = -T^i k\otimes t_i k + kT^i\otimes kt_i = [(k\otimes k), (T^i\otimes t_i)] . \tag{5.52}$$
We recall that $D = G_0^{\mathbb C}$ was viewed as a real group. Among the representations of $D$, we can therefore consider those originating from the complex representations of $G_0^{\mathbb C}$. From now on, we restrict our attention to the faithful representations of this type. The representatives of the elements $B^\alpha$, $C^\alpha$, $b_\alpha$, $c_\alpha$ are then obtained from the representatives of $E^\alpha$ by using the formulae (3.24) and (5.38). In such representations, and using the canonical choice of the basis (3.24) and (5.38), we can calculate

$$r_+ \equiv T^i\otimes t_i = i\varepsilon\Bigl(H^\mu\otimes H^\mu + \sum_{\alpha>0}|\alpha|^2 E^{-\alpha}\otimes E^\alpha\Bigr) = i\varepsilon C + \varepsilon r . \tag{5.53}$$

Here

$$C \equiv H^\mu\otimes H^\mu + \sum_{\alpha\in\Phi_+}\frac{|\alpha|^2}{2}(E^\alpha\otimes E^{-\alpha} + E^{-\alpha}\otimes E^\alpha) \tag{5.54}$$

is the Casimir element and

$$r \equiv \sum_{\alpha\in\Phi_+}\frac{i|\alpha|^2}{2}(E^{-\alpha}\otimes E^\alpha - E^\alpha\otimes E^{-\alpha}) \tag{5.55}$$

is the so-called classical r-matrix. We note that the Casimir element commutes with the diagonal elements like $(k\otimes k)$. Hence, we can rewrite (5.52) as

$$-\{k \stackrel{\otimes}{,} k\}^R_{G_0} = \varepsilon[r, (k\otimes k)] . \tag{5.56}$$
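The identity (5.53) can be verified directly from the bases (3.24) and (5.38). The following worked computation, for a single positive root $\alpha$, uses nothing beyond those two formulas and the definitions (5.54) and (5.55).

```latex
\begin{align*}
B^\alpha\otimes b_\alpha &= \frac{i\varepsilon|\alpha|^2}{2}\,(E^\alpha + E^{-\alpha})\otimes E^\alpha ,\\
C^\alpha\otimes c_\alpha &= -\frac{i\varepsilon|\alpha|^2}{2}\,(E^\alpha - E^{-\alpha})\otimes E^\alpha ,\\
B^\alpha\otimes b_\alpha + C^\alpha\otimes c_\alpha &= i\varepsilon|\alpha|^2\,E^{-\alpha}\otimes E^\alpha ,\\
\bigl(i\varepsilon C + \varepsilon r\bigr)\big|_\alpha
 &= \frac{i\varepsilon|\alpha|^2}{2}\bigl[(E^\alpha\otimes E^{-\alpha} + E^{-\alpha}\otimes E^\alpha)
   + (E^{-\alpha}\otimes E^\alpha - E^\alpha\otimes E^{-\alpha})\bigr]
  = i\varepsilon|\alpha|^2\,E^{-\alpha}\otimes E^\alpha .
\end{align*}
```

The Cartan part is immediate: $T^\mu\otimes t_\mu = iH^\mu\otimes\varepsilon H^\mu$, which is the first term of $i\varepsilon C$. Since $C$ commutes with $k\otimes k$, only $\varepsilon r$ survives in the commutator of (5.52), which gives (5.56).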
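Now we use (5.56) and (5.48) to write down the Poisson bracket

$$\{k \stackrel{\otimes}{,} k\}^q_{M_L} = (k\otimes k)\,r'_\varepsilon(a^\mu) - \{k \stackrel{\otimes}{,} k\}^R_{G_0} = (k\otimes k)\,r'_\varepsilon(a^\mu) + \varepsilon[r, (k\otimes k)] , \tag{5.57}$$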
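where

$$r'_\varepsilon(a^\mu) = \sum_{\alpha\in\Phi_+}\frac{\varepsilon|\alpha|^2}{1 - e^{2\varepsilon a^\mu\langle\alpha, H^\mu\rangle}}\,(B^\alpha\otimes C^\alpha - C^\alpha\otimes B^\alpha) . \tag{5.58}$$

The Poisson bracket (5.57) can be rewritten as

$$\{k \stackrel{\otimes}{,} k\}^q_{M_L} = (k\otimes k)\,r_\varepsilon(a^\mu) + \varepsilon r(k\otimes k) , \tag{5.59}$$

where $r_\varepsilon(a^\mu)$ is the so-called canonical dynamical r-matrix associated to a simple Lie algebra (see e.g. [46]). It is given by

$$r_\varepsilon(a^\mu) = r'_\varepsilon(a^\mu) - \varepsilon r = i\varepsilon\sum_{\alpha\in\Phi}\frac{|\alpha|^2}{2}\coth(\varepsilon a^\mu\langle\alpha, H^\mu\rangle)\,E^\alpha\otimes E^{-\alpha} . \tag{5.60}$$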
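The passage from (5.58) and (5.55) to the hyperbolic cotangent in (5.60) can be made explicit. In the sketch below, $x_\alpha \equiv \varepsilon a^\mu\langle\alpha, H^\mu\rangle$ is shorthand introduced only here; everything else is taken from (3.24), (5.55) and (5.58).

```latex
\begin{align*}
B^\alpha\otimes C^\alpha - C^\alpha\otimes B^\alpha
  &= i\,(E^{-\alpha}\otimes E^\alpha - E^\alpha\otimes E^{-\alpha}) ,\\
\frac{1}{1-e^{2x}} - \frac{1}{2} &= \frac{1+e^{2x}}{2(1-e^{2x})} = -\frac{1}{2}\coth x ,\\
r'_\varepsilon - \varepsilon r
  &= \sum_{\alpha\in\Phi_+} i\varepsilon|\alpha|^2\Bigl(\frac{1}{1-e^{2x_\alpha}}-\frac{1}{2}\Bigr)
     (E^{-\alpha}\otimes E^\alpha - E^\alpha\otimes E^{-\alpha})\\
  &= \sum_{\alpha\in\Phi_+}\frac{i\varepsilon|\alpha|^2}{2}\coth(x_\alpha)\,
     (E^\alpha\otimes E^{-\alpha} - E^{-\alpha}\otimes E^\alpha)
   = i\varepsilon\sum_{\alpha\in\Phi}\frac{|\alpha|^2}{2}\coth(x_\alpha)\,E^\alpha\otimes E^{-\alpha} .
\end{align*}
```

The final step extends the sum to all roots using $x_{-\alpha} = -x_\alpha$, $|{-\alpha}| = |\alpha|$ and the fact that $\coth$ is odd.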
It is interesting to observe that the deformed braiding relation (5.59) involves two canonical r-matrices: the standard one and the dynamical one. The description of the full Poisson bivector $\Sigma_L$ on $M_L$ is then completed by the following bracket

$$\{k, a^\mu\}^q_{M_L} = kT^\mu . \tag{5.61}$$

It is important to calculate the limit $\varepsilon \to 0$ (or, equivalently, $q \to 1$) of the dynamical r-matrix $r_\varepsilon(a^\mu)$ and of the Poisson brackets $\{k \stackrel{\otimes}{,} k\}^q_{M_L}$, $\{k, a^\mu\}^q_{M_L}$. Recall the explicit expression for the dynamical r-matrix $r_0(a^\mu)$ obtained in (3.30)

$$r_0(a^\mu) = \sum_{\alpha\in\Phi_+}\frac{i|\alpha|^2}{2a^\mu\langle\alpha, H^\mu\rangle}\,E^\alpha\otimes E^{-\alpha} . \tag{5.62}$$
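The $\varepsilon \to 0$ behaviour of the root-by-root coefficient in (5.60) can also be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample value `y` (standing for $a^\mu\langle\alpha, H^\mu\rangle$) are illustrative assumptions, not part of the paper.

```python
import math


def deformed_coefficient(eps: float, y: float) -> float:
    """Coefficient multiplying i|alpha|^2 E^alpha (x) E^-alpha in (5.60):
    (eps / 2) * coth(eps * y)."""
    x = eps * y
    return 0.5 * eps * math.cosh(x) / math.sinh(x)


def undeformed_coefficient(y: float) -> float:
    """Coefficient multiplying i|alpha|^2 E^alpha (x) E^-alpha in (5.62):
    1 / (2 * y)."""
    return 1.0 / (2.0 * y)


def coth_identity_gap(x: float) -> float:
    """Difference between 1/(1 - e^{2x}) - 1/2 and -coth(x)/2.
    It should vanish up to rounding; this is the scalar identity
    that links (5.58) and (5.55) to (5.60)."""
    lhs = 1.0 / (1.0 - math.exp(2.0 * x)) - 0.5
    rhs = -0.5 * math.cosh(x) / math.sinh(x)
    return lhs - rhs


if __name__ == "__main__":
    y = 0.7  # hypothetical value of a^mu <alpha, H^mu>
    for eps in (1.0, 0.1, 0.01, 0.001):
        # The first column approaches the second as eps shrinks.
        print(eps, deformed_coefficient(eps, y), undeformed_coefficient(y))
```

Since $\tfrac{\varepsilon}{2}\coth(\varepsilon y) = \tfrac{1}{2y} + O(\varepsilon^2)$, the deviation from the undeformed coefficient shrinks quadratically in $\varepsilon$.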
We observe immediately that $\lim_{\varepsilon\to 0} r_\varepsilon(a^\mu) = r_0(a^\mu)$. Looking at (3.34), (3.35), (5.59) and (5.61), we conclude that $\lim_{q\to 1}\{\cdot,\cdot\}^q_{M_L} = \{\cdot,\cdot\}_{M_L}$. In other words, the symplectic structure of the chiral quasitriangular geodesical model is a smooth q-deformation of the symplectic structure of the standard chiral geodesical model. The same conclusion can obviously also be obtained by studying directly the $\varepsilon \to 0$ limit of the bivector (5.48).

Our next task is to calculate the Poisson bracket $\{b_L(ka) \stackrel{\otimes}{,} b_L(ka)\}^q_{M_L}$ of the non-Abelian moment maps $b_L(ka)$. The simplest way to do it is to use Lemma 5.5 and realize that for any function $f(k, a)$ on $M_L$, it holds that

$$(T^i, \{f(k, a), b_L(ka)\}^q_{M_L}\,b_L^{-1}(ka))_D = \langle\nabla^L_{G_0}f, T^i\rangle \tag{5.63}$$

or, equivalently,

$$\{f(k, a), b_L(ka)\}^q_{M_L} = \langle\nabla^L_{G_0}f, T^i\rangle\,t_i\,b_L(ka) . \tag{5.64}$$

In particular, we have

$$\{k \stackrel{\otimes}{,} b_L(ka)\}^q_{M_L} = (T^i\otimes t_i)(k\otimes b_L(ka)) = r_+(k\otimes b_L(ka)) . \tag{5.65}$$
Replacing $f(k, a)$ by the matrix-valued function $b_L(ka)$, we obtain

$$\{b_L(ka) \stackrel{\otimes}{,} b_L(ka)\}^q_{M_L} = b_L(T^i ka)\otimes t_i b_L(ka) = (b_L^{-1}T^i b_L, T^j)_D\,b_L t_j\otimes t_i b_L = (T^i\otimes t_i)(b_L\otimes b_L) - (b_L\otimes b_L)(T^i\otimes t_i) , \tag{5.66}$$

where the last equality follows from

$$b_L^{-1}T^i b_L = (b_L^{-1}T^i b_L, T^j)_D\,t_j + (b_L^{-1}T^i b_L, t_j)_D\,T^j . \tag{5.67}$$

Using (5.53), we can finally write

$$\{b_L(ka) \stackrel{\otimes}{,} b_L(ka)\}^q_{M_L} = \varepsilon[r, b_L(ka)\otimes b_L(ka)] . \tag{5.68}$$
In the case of the compact group $G_0$, the Poisson brackets of the type $\{k \stackrel{\otimes}{,} k\}$ determine the Poisson tensor completely. In the case of the group $AN$, this is no longer true, because (5.68) computes only the Poisson brackets of the holomorphic functions of the variables $v_\alpha$ in the formula (4.83). It turns out that knowing two other matrix Poisson brackets, of the form $\{b_L^\dagger(ka) \stackrel{\otimes}{,} b_L^\dagger(ka)\}^q_{M_L}$ and $\{(b_L^\dagger(ka))^{-1} \stackrel{\otimes}{,} b_L(ka)\}^q_{M_L}$, is already sufficient. The former bracket can be calculated similarly as before, with the result

$$\{b_L^\dagger(ka) \stackrel{\otimes}{,} b_L^\dagger(ka)\}^q_{M_L} = -\varepsilon[r, b_L^\dagger(ka)\otimes b_L^\dagger(ka)] . \tag{5.69}$$

The calculation of the latter goes as follows

$$\begin{aligned}\{(b_L^\dagger(ka))^{-1} \stackrel{\otimes}{,} b_L(ka)\}^q_{M_L} &= -(b_L^{-1}T^i b_L, T^j)_D\,(b_L^\dagger)^{-1}t_j^\dagger\otimes t_i b_L\\ &= T^i(b_L^\dagger)^{-1}\otimes t_i b_L - (b_L^{-1}T^i b_L, t_j)_D\,(b_L^\dagger)^{-1}T^j\otimes t_i b_L\\ &= (T^i\otimes t_i)((b_L^\dagger)^{-1}\otimes b_L) - ((b_L^\dagger)^{-1}\otimes b_L)(T^i\otimes t_i) = [r_+, ((b_L^\dagger)^{-1}\otimes b_L)] .\end{aligned} \tag{5.70}$$

In the derivation of the relation above, we have used the following formula

$$-b_L^\dagger T^i(b_L^\dagger)^{-1} = (b_L^{-1}T^i b_L, T^j)_D\,t_j^\dagger - (b_L^{-1}T^i b_L, t_j)_D\,T^j , \tag{5.71}$$

which is a consequence of (5.67). The Poisson brackets (5.68)–(5.70) constitute the quasitriangular generalization of the commutation relation (3.41) between the coefficients of the standard Abelian moment map $M(k, a) = \beta_L(ka)$ (cf. Sec. 3.1.3). To see this, we compute the $\varepsilon \to 0$ limit of (5.68)–(5.70). This calculation requires some notational care (cf. the remark after Eq. (5.23)). Until the end of this paragraph, we shall make the notational distinction between $\mathcal G_0^*$ and $\mathcal B_0$, i.e. the $t_i$ will always be the elements of $\mathcal G_0^*$ dual to $T^i \in \mathcal G_0$ with respect to the pairing $\langle\cdot,\cdot\rangle$, and $t_i^\varepsilon = \Lambda_0(t_i)$ the elements of $\mathcal B_0$ dual to $T^i \in \mathcal G_0$ with respect to the pairing $(\cdot,\cdot)_\varepsilon$. We start from the formula

$$b(ka) = \mathrm{Dres}_k(a) = 1 + a^\mu C_\mu{}^j(k)\,t_j^\varepsilon + O(\varepsilon^2) \tag{5.72}$$
written in some matrix representation of $G_0^{\mathbb C}$. Here $C_i{}^j(k)$ is the matrix defined by

$$\mathrm{Coad}_k t_i = C_i{}^j(k)\,t_j . \tag{5.73}$$

The formula (5.72) can be inferred from

$$a = \exp(a^\mu t_\mu^\varepsilon) = \exp(a^\mu\varepsilon H^\mu) = 1 + a^\mu t_\mu^\varepsilon + O(\varepsilon^2) \tag{5.74}$$

and from the relation (4.5) rewritten as

$$[T^j, t_i^\varepsilon] = f_i{}^{kj}\,t_k^\varepsilon + ([t_i^\varepsilon, t_k^\varepsilon], T^j)_\varepsilon\,T^k . \tag{5.75}$$

Here $f_l{}^{ij}$ are the structure constants of $\mathcal G_0$ defined by the relation $[T^i, T^j] = f_l{}^{ij}T^l$. Now recall the formula (3.38) for the Abelian moment maps $\langle M(k, a^\mu), T^j\rangle$ studied in Sec. 3.1.3. It can be rewritten as

$$\langle M(k, a^\mu), T^j\rangle = \langle\beta_L(k\phi), T^j\rangle = a^\mu C_\mu{}^j(k) . \tag{5.76}$$

Note that here the multiplication $k\phi$ is in the sense of $T^*G_0$. We conclude that the $\varepsilon$-expansion (5.72) can be rewritten as

$$b(ka) = \mathrm{Dres}_k(a) = 1 + \varepsilon\langle M(k, a^\mu), T^j\rangle\,t_j^1 + O(\varepsilon^2) . \tag{5.77}$$

Inserting this expansion into the formulae (5.68)–(5.70) and using the explicit form (5.38) of the dual basis $t_j^1$ as the chosen representation, we obtain (in the lowest nontrivial order $\propto \varepsilon^2$) the following formula

$$\{\langle M, T^i\rangle, \langle M, T^j\rangle\} = f_l{}^{ij}\langle M, T^l\rangle . \tag{5.78}$$
The result (5.78) coincides with the standard formula (3.41) of Sec. 3.1.3.

5.1.4. The classical solution

It is very easy to solve classically the quasitriangular chiral geodesical model. It is enough to use the Hamiltonian (5.22) and the Poisson bracket (5.61) to conclude that the quasitriangular geodesics satisfy the equations

$$\frac{d}{d\tau}a_L^\mu = \{a_L^\mu, H_L\}^q_{M_L} = 0 , \qquad \frac{d}{d\tau}k_L = \{k_L, H_L\}^q_{M_L} = k_L T^\mu a_L^\mu . \tag{5.79}$$

The general solution of these equations has the form

$$k_L(\tau) = k_L(0)\exp[(a_L^\mu T^\mu)\tau] , \qquad a_L^\mu(\tau) = a_L^\mu(0) . \tag{5.80}$$

By comparing with (3.18), we see that the deformation has not changed the classical solutions of the geodesical model! So what got deformed after all? It is in fact the symplectic structure on the space of solutions (phase space) that got deformed. This means that the natural dynamical variables of group-theoretical origin will have modified Poisson brackets and, upon quantization, modified commutation relations. For instance, the correlation functions (in the field-theoretical applications) will change.
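That (5.80) solves (5.79) can be illustrated numerically in the simplest case. The sketch below assumes $G_0 = SU(2)$ with a single Cartan generator represented as $T = \mathrm{diag}(i, -i)$; this normalization, the helper names and the sample values are hypothetical choices made for illustration only.

```python
import cmath
import math


def matmul(x, y):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(x[i][m] * y[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


# Assumed Cartan generator T = i * diag(1, -1); anti-Hermitian, as in T = iH.
T = [[1j, 0j], [0j, -1j]]


def geodesic(k0, a, tau):
    """k(tau) = k(0) exp(a T tau), the form of the solution (5.80).
    For diagonal T the exponential is computed entrywise."""
    expo = [[cmath.exp(1j * a * tau), 0j], [0j, cmath.exp(-1j * a * tau)]]
    return matmul(k0, expo)


def residual(k0, a, tau, h=1e-6):
    """Largest entry of |dk/dtau - k T a|, with the derivative
    approximated by central finite differences of step h."""
    k_plus = geodesic(k0, a, tau + h)
    k_minus = geodesic(k0, a, tau - h)
    rhs = matmul(geodesic(k0, a, tau), T)
    return max(abs((k_plus[i][j] - k_minus[i][j]) / (2 * h) - a * rhs[i][j])
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    theta = 0.4  # hypothetical initial condition: a rotation in SU(2)
    k0 = [[math.cos(theta), -math.sin(theta)],
          [math.sin(theta), math.cos(theta)]]
    print(residual(k0, a=0.8, tau=1.3))
```

The residual stays at the level of the finite-difference error, and $a$ is constant along the flow by construction, in agreement with the first equation in (5.79).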
5.2. Quasitriangular chiral WZW model

We proceed in full analogy with Sec. 5.1, where the chiral model was constructed from the full geodesical model. Thus the deformed chiral master model is going to live on the same affine model space $\tilde M_L$ as the non-deformed one (3.60). Its symplectic structure $\tilde\omega_L^q$ will be obtained by embedding the affine model space $\tilde M_L$ (defined in Sec. 3.2.2) into the double $\tilde{\tilde D}$ and by pulling back the Semenov-Tian-Shansky form from $\tilde{\tilde D}$ to $\tilde M_L$ by the embedding map. The chiral Hamiltonian on $\tilde M_L$ will be the same as in the standard non-deformed master model (3.58). Then the two-step symplectic reduction of this quasitriangular chiral master model will be the deformed chiral WZW theory.

5.2.1. Quasitriangular chiral master model

Consider the affine Lu–Weinstein double $\tilde{\tilde D}$ introduced in Sec. 4.4. There is a distinguished subgroup $\tilde A \subset \tilde{\tilde D}$ that we shall call the Cartan subgroup. It is defined as

$$\tilde A = \mathbb R_Q \times \nu(A) \times \mathbb R_l . \tag{5.81}$$

Here the first copy $\mathbb R_Q$ corresponds to the automorphisms $Q$ obtained by exponentiating the generator $\tilde t^1_\infty = (1, 0, 0)$ and the second copy $\mathbb R_l$ to the (central) line generated by $\tilde t^1_0 = (0, 0, 1)$. Recall that $\nu$ is the injective homomorphism sending the group $B = L_+G_0^{\mathbb C}$ into $\tilde{\tilde D}$ (cf. Sec. 4.4.2). Moreover, we have already encountered the group $A$ in Sec. 4.1.3. We have $A \subset B_0 \subset B$, hence the notation $\nu(A)$ makes sense. Note that the group $B_0 = AN$ is the zero mode subgroup of $B = L_+G_0^{\mathbb C}$, hence the automorphisms from $Q$ do not act on $A$. From this fact we conclude that the Cartan subgroup $\tilde A \subset \tilde{\tilde D}$ is commutative and the direct products in (5.81) make sense.

Recall that the Cartan subgroup $\tilde{\mathcal A} \subset T^*\tilde G$ was defined as $\tilde\Upsilon^{-1}(\tilde{\mathcal T})$, where $\tilde{\mathcal T} = \mathcal T \oplus \mathbb R(i, 0, 0) \oplus \mathbb R(0, 0, i)$ and $\mathcal T = \mathrm{Lie}(\mathbb T)$. Consider the identification map $\tilde\Lambda : \tilde{\mathcal G}^* \to \tilde{\mathcal B} = \mathrm{Lie}(\tilde B)$ defined by

$$(\tilde\Lambda(\tilde x^*), \tilde y)_{\tilde{\tilde D}} = \langle\tilde x^*, \tilde y\rangle , \qquad \tilde x^* \in \tilde{\mathcal G}^* , \quad \tilde y \in \tilde{\mathcal G} . \tag{5.82}$$

Note that $\tilde\Lambda$ depends on $\varepsilon$ because $(\cdot,\cdot)_{\tilde{\tilde D}}$ does (cf. (4.147)). We can naturally define the fundamental Weyl alcove $\tilde A_+$ in the Cartan subgroup $\tilde A \subset \tilde{\tilde D}$ as

$$\tilde A_+ = \exp\tilde\Lambda(\tilde{\mathcal A}_+) , \tag{5.83}$$

where $\tilde{\mathcal A}_+$ is the fundamental Weyl alcove in the Cartan subgroup $\tilde{\mathcal A} \subset T^*\hat G$ as explained in Sec. 3.2.1.

Recall that the Semenov-Tian-Shansky form $\tilde\omega$ on the Heisenberg double $\tilde{\tilde D}$ is given by the formula

$$\tilde\omega = \frac{1}{2}(\tilde b_L^*\lambda_{\tilde B} \stackrel{\wedge}{,} \tilde g_R^*\rho_{\tilde G})_{\tilde{\tilde D}} + \frac{1}{2}(\tilde b_R^*\rho_{\tilde B} \stackrel{\wedge}{,} \tilde g_L^*\lambda_{\tilde G})_{\tilde{\tilde D}} . \tag{5.84}$$
Here the maps $\tilde b_L : \tilde{\tilde D} \to \tilde B$ and $\tilde g_R : \tilde{\tilde D} \to \tilde G$ are induced by the decomposition $\tilde{\tilde D} = \tilde B\tilde G$, and $\tilde g_L : \tilde{\tilde D} \to \tilde G$ and $\tilde b_R : \tilde{\tilde D} \to \tilde B$ by $\tilde{\tilde D} = \tilde G\tilde B$. The expression $\lambda_{\tilde G}$ ($\rho_{\tilde G}$) denotes the left (right) invariant $\tilde{\mathcal G}$-valued Maurer–Cartan form on the group $\tilde G$.

Consider now the affine model space $\tilde M_L = \tilde G \times \tilde{\mathcal A}_+$. The Semenov-Tian-Shansky form $\tilde\omega$ can be pulled back to the space $\tilde M_L$ by the map $\tilde\Xi : \tilde M_L \to \tilde{\tilde D}$ defined by $\tilde\Xi(\tilde k, \tilde a) = \tilde k\tilde a \in \tilde{\tilde D}$, where $\tilde k \in \tilde G$, $\tilde a = \exp\tilde\Lambda(\tilde\phi)$ and $\tilde\phi \in \tilde{\mathcal A}_+$. The pullback form $\tilde\omega_L \equiv \tilde\Xi^*\tilde\omega$ defines the chiral symplectic structure on $\tilde M_L$. Its explicit form can be found by repeating step by step the finite-dimensional calculation of Sec. 5.1.1. The result is

$$\tilde\omega_L \equiv \tilde\Xi^*\tilde\omega = \frac{1}{2}(d\tilde b_L(\tilde k\tilde a)\,\tilde b_L^{-1}(\tilde k\tilde a) \stackrel{\wedge}{,} d\tilde p\,\tilde p^{-1})_{\tilde{\tilde D}} + (d\tilde a\,\tilde a^{-1} \stackrel{\wedge}{,} \tilde k^{-1}d\tilde k)_{\tilde{\tilde D}} + \frac{1}{2}(\tilde k^{-1}d\tilde k \stackrel{\wedge}{,} \tilde a(\tilde k^{-1}d\tilde k)\tilde a^{-1})_{\tilde{\tilde D}} , \tag{5.85}$$

where $\tilde p = \tilde k\tilde a\tilde k^{-1}$. This leads to the following definition.

Definition 5.6. The quasitriangular chiral master model is the dynamical system on the phase space $\tilde M_L$, whose symplectic structure is given by (5.85) and whose Hamiltonian is

$$\tilde H_L = -\frac{1}{2\kappa}(\tilde\phi, \tilde\phi)_{\tilde{\mathcal G}^*} . \tag{5.86}$$

Note that the Hamiltonian (5.86) is the same as that of the standard chiral master model (3.58); however, the symplectic structure is different. It is also crucial to remark that the model (5.85)–(5.86) is Poisson–Lie symmetric in the sense of Definition 4.7. The proof of this fact is very similar to that of the analogous finite-dimensional result expressed in Lemma 5.5. In particular, the non-Abelian moment map corresponding to the left $\tilde G$-action on $\tilde M_L$ is given by $\tilde b_L(\tilde k\tilde a)$.

5.2.2. Chiral symplectic reduction: the first step

Now we are going to perform the symplectic reduction of the chiral master model introduced in the previous subsection. As we know, we can do it in the language of the symplectic forms (as in Sec. 2.2.3) or using the Poisson bracket formalism (as in Sec. A.3). Here we choose the first (second) possibility for the first (second) step of the reduction.

We first need some preliminary description of the objects with which we are going to work. The Cartan subgroup $\hat A$ of the first floor double $\hat{\hat D}$ is defined as

$$\hat A = \mathbb R_Q \times \nu(A) , \tag{5.87}$$

with the same notation as in (5.81). Then define the identification map $\hat\Lambda : \hat{\mathcal G}^* \to \hat{\mathcal B} = \mathrm{Lie}(\hat B)$ as follows

$$(\hat\Lambda(\hat x^*), \hat y)_{\hat{\hat D}} = \langle\hat x^*, \hat y\rangle , \qquad \hat x^* \in \hat{\mathcal G}^* , \quad \hat y \in \hat{\mathcal G} . \tag{5.88}$$
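Recall also the definition of the alcove $\hat{\mathcal A}_+$

$$\hat{\mathcal A}_+ = \{\hat\phi \in \tilde{\mathcal A}_+ ,\ \hat\phi = (0, \phi, a^\infty)^*\} .$$

The hat-alcove $\hat A_+$ in $\hat{\hat D}$ is then set to be

$$\hat A_+ = \exp\hat\Lambda(\hat{\mathcal A}_+) . \tag{5.89}$$

Consider now the reduced model space $\hat M_L = \hat G \times \hat{\mathcal A}_+$. The Semenov-Tian-Shansky form $\hat\omega$ on $\hat{\hat D}$ can be pulled back to the reduced model space $\hat M_L$ by the map $\hat\Xi : \hat M_L \to \hat{\hat D}$ defined by $\hat\Xi(\hat k, \hat a) = \hat k\hat a \in \hat{\hat D}$, where $\hat k \in \hat G$, $\hat a = \exp\hat\Lambda(\hat\phi)$ and $\hat\phi \in \hat{\mathcal A}_+$. The pullback form $\hat\omega_L \equiv \hat\Xi^*\hat\omega$ defines the symplectic structure on $\hat M_L$

$$\hat\omega_L \equiv \hat\Xi^*\hat\omega = \frac{1}{2}(d\hat b_L(\hat k\hat a)\,\hat b_L^{-1}(\hat k\hat a) \stackrel{\wedge}{,} d\hat p\,\hat p^{-1})_{\hat{\hat D}} + (d\hat a\,\hat a^{-1} \stackrel{\wedge}{,} \hat k^{-1}d\hat k)_{\hat{\hat D}} + \frac{1}{2}(\hat k^{-1}d\hat k \stackrel{\wedge}{,} \hat a(\hat k^{-1}d\hat k)\hat a^{-1})_{\hat{\hat D}} , \tag{5.90}$$

where $\hat p = \hat k\hat a\hat k^{-1}$.

We note that due to the fact that $\tilde G = \mathbb R \times_S \hat G$, the affine model space $\tilde M_L$ is diffeomorphic to $\hat M_L \times \mathbb R_S \times \mathbb R_l$. The natural additive coordinates on $\mathbb R_l$ and $\mathbb R_S$ can be denoted as $a^0$ and $s$, respectively. Thus any element $(\tilde k, \tilde\phi)$ of $\tilde M_L$ can be naturally represented by the quadruple $(\hat k, \hat\phi, a^0, s)$, where $\tilde k = e^{s\tilde T^0}\hat k$ and $\tilde a = \hat a\,e^{a^0\tilde\Upsilon(\tilde t_0)}$.

In order to perform the symplectic reduction, it is useful to change the coordinates on $\tilde M_L$. For this, recall that the maps $m^0 : \tilde B \to \mathbb R$ and $\hat m : \tilde B \to \hat B$ (cf. Convention 4.12) were induced by the decomposition $\tilde B = \hat B \times \mathbb R_l$. We shall occasionally write $m^0(\tilde b) = \tilde b^0$ and $\hat m(\tilde b) = \hat b$. We define also the maps $m^0_L : \tilde{\tilde D} \to \mathbb R$ and $\hat m_L : \tilde{\tilde D} \to \hat B$ as $m^0_L = m^0\circ\tilde b_L$ and $\hat m_L = \hat m\circ\tilde b_L$. The essence of the first step of the symplectic reduction is contained in the following theorem.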
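Theorem 5.7. (1) It holds that

$$m^0_L(\hat k, \hat\phi, a^0, s) = a^0 + (\widetilde{\mathrm{Dres}}_{\hat k}\hat a)^0 , \qquad \hat a = \exp\hat\Lambda(\hat\phi) . \tag{5.91}$$

(2) In the coordinates $(\hat k, \hat\phi, m^0_L, s)$ on $\tilde M_L$, the symplectic form $\tilde\omega_L$ can be written as

$$\tilde\omega_L = -ds\wedge dm^0_L + \hat\omega_L , \tag{5.92}$$

where $\hat\omega_L$ is the symplectic form on $\hat M_L$ defined in (5.90).

The proof of this theorem will necessitate the following lemma.

Lemma 5.8. It holds that

$$\hat m_L(\widetilde{\mathrm{Ad}}_{\hat k}\hat a) = \hat b_L(\widehat{\mathrm{Ad}}_{\hat k}\hat a) , \qquad \tilde g_R(\widetilde{\mathrm{Ad}}_{\hat k}\hat a) = \hat g_R(\widehat{\mathrm{Ad}}_{\hat k}\hat a) . \tag{5.93}$$

This lemma is proved in Appendix A.4.

Proof of Theorem 5.7. (1) We have

$$\tilde b_L(e^{s\tilde T^0}\hat k\hat a\,e^{a^0\tilde\Upsilon(\tilde t_0)}) = e^{a^0\tilde\Upsilon(\tilde t_0)}\,\tilde b_L(e^{s\tilde T^0}\hat k\hat a) \tag{5.94}$$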
because the line $e^{a^0\tilde\Upsilon(\tilde t_0)}$ is central in $\tilde{\tilde D}$. We stress that the multiplication $\hat k\hat a$ in (5.94) is taken in $\tilde{\tilde D}$. Then we use the fact that the automorphism $S$ preserves both subgroups $\tilde G$ and $\tilde B$. This permits us to write

$$\tilde b_L(e^{s\tilde T^0}\hat k\hat a) = \tilde b_L(e^{s\tilde T^0}\hat k\hat a\,e^{-s\tilde T^0}) = e^{s\tilde T^0}\,\tilde b_L(\hat k\hat a)\,e^{-s\tilde T^0} = \bigl(\exp(m^0_L(\hat k\hat a)\tilde\Upsilon(\tilde t_0))\bigr)\,e^{s\tilde T^0}\,\hat m_L(\hat k\hat a)\,e^{-s\tilde T^0} , \tag{5.95}$$

where the maps $m^0$, $\hat m$ were defined in Convention 4.12. Since $S$ also preserves $\hat B$, it follows from (5.94) and (5.95) that

$$m^0_L(\hat k, \hat\phi, a^0, s) = a^0 + m^0_L(\hat k\hat a) \equiv a^0 + (\widetilde{\mathrm{Dres}}_{\hat k}\hat a)^0 . \tag{5.96}$$

(2) We needed statement (1) in order to show that the coordinate $a^0$ can be traded for the new coordinate $m^0_L$. So from now until the end of the proof, we shall work with the variables $(\hat k, \hat\phi, m^0_L, s)$ on $\tilde M_L$. The quasitriangular chiral master model is Poisson–Lie symmetric, hence we know that the group $\tilde G$ acts on the affine model space $\tilde M_L$ in the Poisson–Lie way. Recall that this follows from the fact that $\tilde M_L$ can be viewed as the submanifold of $\tilde{\tilde D}$ preserved by the $\tilde G$-action on $\tilde{\tilde D}$. The latter is of the Poisson–Lie type by construction and it has the non-Abelian moment map given by the factorization map $\tilde b_L : \tilde{\tilde D} \to \tilde B$. The restriction of the map $\tilde b_L$ to the affine model space $\tilde M_L$ is the Poisson–Lie moment map on $\tilde M_L$.

The relation (4.87) proved in Sec. 5.1.2 implies the following formula

$$\tilde\omega_L(\cdot, v_{\tilde T^0}) = (\tilde T^0, d\tilde b_L\,\tilde b_L^{-1})_{\tilde{\tilde D}} . \tag{5.97}$$
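Here on the left-hand side, $v_{\tilde T^0} = \partial/\partial s$ is viewed as the vector field on $\tilde M_L$ corresponding to the left action of the generator $\tilde T^0$, and on the right-hand side, $\tilde T^0$ is viewed simply as the element of $\tilde{\mathcal G}$.

Now recall the definition of the map $m^0 : \tilde B \to \mathbb R$. It is induced by the decomposition $\tilde B = \hat B \times \mathbb R$. Since $\mathbb R$ commutes with $\hat B$, it follows that

$$\tilde\omega_L(\cdot, v_{\tilde T^0}) = (\tilde T^0, d\tilde b_L\,\tilde b_L^{-1})_{\tilde{\tilde D}} = (\tilde T^0, \tilde\Lambda(\tilde t_0))_{\tilde{\tilde D}}\,dm^0_L = dm^0_L . \tag{5.98}$$

In other words, the structure of our WZW double $\tilde{\tilde D}$ ensures that the $\tilde T^0$-action on $\tilde M_L$ is Hamiltonian in the standard (Abelian) sense. It follows immediately from the formula (5.98) that

$$\tilde\omega_L = -ds\wedge dm^0_L + \tilde\Omega(\hat k, \hat\phi, m^0_L) , \tag{5.99}$$

where $\tilde\Omega$ does not contain $ds$. Moreover, $\tilde\Omega$ does not depend on $s$, because $\tilde\omega_L$ is invariant with respect to the Hamiltonian vector field $\partial/\partial s$. In fact, it is evident that $\tilde\Omega$ is nothing but the pullback of $\tilde\omega_L$ to the surface $s = 0$ in $\tilde M_L$. We can therefore write $\tilde\Omega = \tilde\omega_L(s = 0)$. Now we have from (5.85)

$$\tilde\omega_L(s = 0) = \frac{1}{2}(d\tilde b_L(\hat k\tilde a)\,\tilde b_L^{-1}(\hat k\tilde a) \stackrel{\wedge}{,} d\tilde p_0\,\tilde p_0^{-1})_{\tilde{\tilde D}} + (d\tilde a\,\tilde a^{-1} \stackrel{\wedge}{,} \hat k^{-1}d\hat k)_{\tilde{\tilde D}} + \frac{1}{2}(\hat k^{-1}d\hat k \stackrel{\wedge}{,} \tilde a(\hat k^{-1}d\hat k)\tilde a^{-1})_{\tilde{\tilde D}} , \tag{5.100}$$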
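where $\tilde p_0 = \widetilde{\mathrm{Ad}}_{\hat k}\tilde a$ and $\tilde a = \hat a\exp(a^0(m^0_L, \hat k, \hat\phi)\,\tilde\Upsilon(\tilde t_0))$. The reader should pay attention to the distribution of the hats and tildes and to the fact that the group multiplication in this formula is taken in $\tilde{\tilde D}$. Now we study (5.100) term by term:

$$(d\tilde a\,\tilde a^{-1} \stackrel{\wedge}{,} \hat k^{-1}d\hat k)_{\tilde{\tilde D}} = (d\hat a\,\hat a^{-1} \stackrel{\wedge}{,} \hat k^{-1}d\hat k)_{\hat{\hat D}} ; \tag{5.101}$$

this follows from the fact that $(\hat k^{-1}d\hat k, \tilde\Upsilon(\tilde t_0))_{\tilde{\tilde D}} = 0$. Then we have

$$(\hat k^{-1}d\hat k \stackrel{\wedge}{,} \tilde a(\hat k^{-1}d\hat k)\tilde a^{-1})_{\tilde{\tilde D}} = (\hat k^{-1}d\hat k \stackrel{\wedge}{,} \widetilde{\mathrm{Ad}}_{\hat a}(\hat k^{-1}d\hat k))_{\tilde{\tilde D}} , \tag{5.102}$$

because $\exp(a^0\tilde\Upsilon(\tilde t_0))$ is in the center of $\tilde{\tilde D}$. Moreover, one checks directly that

$$(\hat k^{-1}d\hat k \stackrel{\wedge}{,} \widetilde{\mathrm{Ad}}_{\hat a}(\hat k^{-1}d\hat k))_{\tilde{\tilde D}} = (\hat k^{-1}d\hat k \stackrel{\wedge}{,} \widehat{\mathrm{Ad}}_{\hat a}(\hat k^{-1}d\hat k))_{\hat{\hat D}} . \tag{5.103}$$

It is first convenient to rewrite the remaining term as

$$\frac{1}{2}(d\tilde b_L(\hat k\tilde a)\,\tilde b_L^{-1}(\hat k\tilde a) \stackrel{\wedge}{,} d\tilde p_0\,\tilde p_0^{-1})_{\tilde{\tilde D}} = \frac{1}{2}(\tilde b_L^{-1}(\widetilde{\mathrm{Ad}}_{\hat k}\tilde a)\,d\tilde b_L(\widetilde{\mathrm{Ad}}_{\hat k}\tilde a) \stackrel{\wedge}{,} d\tilde g_R(\widetilde{\mathrm{Ad}}_{\hat k}\tilde a)\,\tilde g_R^{-1}(\widetilde{\mathrm{Ad}}_{\hat k}\tilde a))_{\tilde{\tilde D}} . \tag{5.104}$$

We know from Lemma 5.8 that $\tilde g_R(\widetilde{\mathrm{Ad}}_{\hat k}\tilde a) \in \hat G$, hence it follows that

$$\frac{1}{2}(d\tilde b_L(\hat k\tilde a)\,\tilde b_L^{-1}(\hat k\tilde a) \stackrel{\wedge}{,} d\tilde p_0\,\tilde p_0^{-1})_{\tilde{\tilde D}} = \frac{1}{2}(\hat m_L^{-1}(\widetilde{\mathrm{Ad}}_{\hat k}\hat a)\,d\hat m_L(\widetilde{\mathrm{Ad}}_{\hat k}\hat a) \stackrel{\wedge}{,} d\tilde g_R(\widetilde{\mathrm{Ad}}_{\hat k}\hat a)\,\tilde g_R^{-1}(\widetilde{\mathrm{Ad}}_{\hat k}\hat a))_{\tilde{\tilde D}} , \tag{5.105}$$

where $\hat m_L = \hat m\circ\tilde b_L$. We use Lemma 5.8 again to finally rewrite (5.105) as

$$\frac{1}{2}(d\tilde b_L(\hat k\tilde a)\,\tilde b_L^{-1}(\hat k\tilde a) \stackrel{\wedge}{,} d\tilde p_0\,\tilde p_0^{-1})_{\tilde{\tilde D}} = \frac{1}{2}(\hat b_L^{-1}(\widehat{\mathrm{Ad}}_{\hat k}\hat a)\,d\hat b_L(\widehat{\mathrm{Ad}}_{\hat k}\hat a) \stackrel{\wedge}{,} d\hat g_R(\widehat{\mathrm{Ad}}_{\hat k}\hat a)\,\hat g_R^{-1}(\widehat{\mathrm{Ad}}_{\hat k}\hat a))_{\hat{\hat D}} . \tag{5.106}$$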
Collecting (5.101), (5.103) and (5.106), we conclude that $\tilde\omega_L(s = 0) = \hat\omega_L$. The theorem is proved.

The first step of the symplectic reduction now consists of setting $m^0_L = 0$. This is consistent, since the Hamiltonian $\tilde H_L = -\frac{1}{2\kappa}(\tilde\phi, \tilde\phi)_{\tilde{\mathcal G}^*}$ Poisson commutes with $m^0_L$, because it is obviously invariant with respect to the left $\tilde G$-action on $\tilde M_L$.

From Theorem 5.7, one concludes immediately that the reduced dynamical system has $\hat M_L$ for its phase space, its Hamiltonian reads

$$\hat H_L = -\frac{1}{2\kappa}(\phi, \phi)_{\mathcal G^*} - \frac{a^\infty}{\kappa}(\widetilde{\mathrm{Dres}}_{\hat k}\hat a)^0 , \tag{5.107}$$
where $\hat a = e^{\hat\Lambda(\hat\phi)}$, $\hat\phi = a^\infty\hat t_\infty + \pi^*(\phi)$, and its symplectic form is

$$\hat\omega_L = \frac{1}{2}(d\hat b_L(\hat k\hat a)\,\hat b_L^{-1}(\hat k\hat a) \stackrel{\wedge}{,} d\hat p\,\hat p^{-1})_{\hat{\hat D}} + (d\hat a\,\hat a^{-1} \stackrel{\wedge}{,} \hat k^{-1}d\hat k)_{\hat{\hat D}} + \frac{1}{2}(\hat k^{-1}d\hat k \stackrel{\wedge}{,} \hat a(\hat k^{-1}d\hat k)\hat a^{-1})_{\hat{\hat D}} , \tag{5.108}$$

where $\hat p = \hat k\hat a\hat k^{-1}$.

5.2.3. Chiral symplectic reduction: the second step

In order to perform the second step of the reduction, we have to find the standard Hamiltonian charge generating the central circle action on $\hat M_L$. Its existence is guaranteed because we know that the affine Lu–Weinstein double $\tilde{\tilde D}$ is of the WZW type (cf. Definition 4.10). We find this charge from the fundamental relation (4.87)

$$\hat\omega_L(\cdot, v_{\hat T^\infty}) = (\hat T^\infty, d\hat b_L\,\hat b_L^{-1})_{\hat{\hat D}} = dm^\infty_L(\hat k\hat a) , \tag{5.109}$$

which holds because $\hat\omega_L$ was pulled back from the Drinfeld double $\hat{\hat D}$. Here $v_{\hat T^\infty}$ is the vector field on $\hat M_L$ corresponding to the left action of $\hat T^\infty$, and $m^\infty_L = m^\infty\circ\hat b_L$. Recall that $m^\infty : \hat B \to \mathbb R$ was defined by the decomposition $\hat B = \mathbb R \times_Q B$. It is easy to evaluate $m^\infty_L$; in fact, it holds that $m^\infty_L(\hat k\hat a) = a^\infty$.

Now fix the submanifold of $\hat M_L$ defined by $a^\infty = \kappa$ and consider the space of the central circle orbits on this submanifold. This space of orbits is nothing but the WZW model space $M_L^{WZ} = G \times \mathcal A^1_+$, where $\mathcal A^1_+$ is the standard Weyl alcove. Before giving the explicit description of the (doubly) reduced symplectic structure on $M_L^{WZ}$, we first write the reduced Hamiltonian $H_L^{qWZ}$ on $M_L^{WZ}$. Consistency requires that the first floor Hamiltonian $\hat H_L(\hat k\hat a)$ be invariant with respect to the central circle action. But this is evident since

$$\widetilde{\mathrm{Dres}}_{\hat k}\hat a = \tilde b(\hat k\hat a) = \tilde b(\widetilde{\mathrm{Ad}}_{\hat k}\hat a) . \tag{5.110}$$

Thus we obtain

$$H_L^{qWZ}(k, a^\mu) = -\frac{1}{2\kappa}(\phi, \phi)_{\mathcal G^*} - (\widetilde{\mathrm{Dres}}_{\hat k}\,e^{\hat\Lambda(\hat\phi_\kappa)})^0 , \tag{5.111}$$

where $\hat\phi_\kappa = \kappa\hat t_\infty + \pi^*(\phi)$, $\phi = \kappa a^\mu t_\mu$ and $a^\mu t_\mu \in \mathcal A^1_+$. The Hamiltonian $H_L^{qWZ}$ depends on $\varepsilon$ due to the $\varepsilon$-dependence of $\hat\Lambda$ and of the map $m^0 : \tilde B \to \mathbb R$. We shall see in Sec. 5.2.7 that in the $\varepsilon \to 0$ (or $q \to 1$) limit the formula (5.111) yields the standard chiral Sugawara Hamiltonian $H_L^{WZ}$ given by (3.118).

Let us now study the quasitriangular symplectic structure on $M_L^{WZ}$. As we have already said, it is convenient to perform the second step of the symplectic reduction by working with the Poisson brackets. This means that we shall first need to invert the symplectic form $\hat\omega_L$ to obtain the corresponding Poisson bivector $\hat\Sigma_L$ on $\hat M_L$. Our strategy for inverting $\hat\omega_L$ will rely on using its Poisson–Lie symmetry. Indeed, the story of the finite-dimensional Sec. 5.1.2 can be directly used also in our affine
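situation if we replace $G_0$ and $\mathcal A^0_+$ of Sec. 5.1.2 by our $\hat G$ and $\hat{\mathcal A}_+$, respectively. Thus we have

$$\hat\omega_L(\cdot, v_{\hat T}) = (\hat T, d\hat b_L(\hat k\hat a)\,\hat b_L^{-1}(\hat k\hat a))_{\hat{\hat D}} , \tag{5.112}$$

where $\hat T$ is any generator of $\hat{\mathcal G}$ and $v_{\hat T}$ is the vector field on $\hat M_L$ corresponding to the left action of $\hat T$ on $\hat M_L$. Following the reasoning after the proof of Lemma 5.5, we conclude that the Poisson bivector $\hat\Sigma_L$ on $\hat M_L$ fulfils

$$\mathcal L_{\hat v}\hat\Sigma_L = -\hat v\wedge\hat v , \tag{5.113}$$

where $\hat v \in T\hat M_L \otimes \hat{\mathcal G}^* = T\hat M_L \otimes \hat{\mathcal B}$ generates the left action of $\hat G$ on $\hat M_L$ and $\mathcal L_{\hat v}$ is the Lie derivative. From (5.113), it follows (cf. (5.37)) that $\hat\Sigma_L$ must be of the form

$$\hat\Sigma_L(\hat k, \hat\phi) = -\frac{1}{2}(\Pi^R_{\hat G})_{ij}(\hat k)\,R_{\hat k*}\hat T^i\wedge R_{\hat k*}\hat T^j + \hat\Sigma'_{ij}(\hat\phi)\,L_{\hat k*}\hat T^i\wedge L_{\hat k*}\hat T^j + \hat\sigma_i{}^{\hat\mu}(\hat\phi)\,L_{\hat k*}\hat T^i\wedge\frac{\partial}{\partial\phi^{\hat\mu}} + \hat s^{\hat\mu\hat\nu}(\hat\phi)\,\frac{\partial}{\partial\phi^{\hat\mu}}\wedge\frac{\partial}{\partial\phi^{\hat\nu}} . \tag{5.114}$$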
Here $\hat\phi = \phi^\infty\hat t_\infty + \phi^\mu\pi^*(t_\mu) \equiv a^\infty\hat t_\infty + a^\infty a^\mu\pi^*(t_\mu)$, and the notation $\phi^{\hat\mu}$ means that we consider at the same time the coordinates $\phi^\mu$ and $\phi^\infty$; in other words, the subscript $\hat\mu$ runs over $\mu$ and over $\infty$. The expression $\frac12(\Pi^{\hat R}_{\hat G})_{ij}(\hat k)R_{\hat k*}\hat T^i\wedge R_{\hat k*}\hat T^j$ is the Poisson–Lie bivector on the group $\hat G$ written in some basis $\hat T^i$ of $\hat{\mathcal G}$. Recall that the Poisson–Lie bracket on $\hat G$ is entirely specified by the structure of the affine Lu–Weinstein double $\hat{\hat D}$, as it was explained in Sec. 4.1.2. Thus we again observe the power of the Poisson–Lie symmetry: the Poisson tensor $\hat\Sigma_L$ on the affine model space $\hat M_L$ is completely determined by its value at the points $(\hat e,\hat\phi)\in\hat M_L$, in other words, at the unit element $\hat e$ of $\hat G$.

Our next task is to find the coefficient functions $\hat\Sigma^0_{ij}(\hat\phi)$, $\hat\sigma_i^{\hat\mu}(\hat\phi)$ and $\hat s^{\hat\mu\hat\nu}(\hat\phi)$. For this, we first calculate the matrix of the symplectic form $\hat\omega_L$ at points $(\hat e,\hat\phi)$. We do it in the basis $R_{\hat a*}\hat T^i$, $R_{\hat a*}\hat\Lambda(\hat t_{\hat\mu})$ of the $\hat\Xi$-pushed-forward tangent space $T_{(\hat e,\hat\phi)}\hat M_L$; recall that $\hat a = \exp\hat\Lambda(\hat\phi)$ and we set $\hat t_{\hat\mu} = (\hat t_\infty, \pi^*(t_\mu))$. We then invert this matrix to obtain the unknown coefficient functions of the Poisson bivector.

The convenient basis of $\hat{\mathcal G} = \widehat{L\mathcal G_0}$ reads
\[ \hat T^i = \big(\hat T^\infty,\ \iota(T^\mu),\ \iota(B^{\hat\alpha}),\ \iota(C^{\hat\alpha})\big), \qquad \hat\alpha\in\hat\Phi_+ . \tag{5.115} \]
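Recall that
\[ T^\mu = iH^\mu, \qquad B^{\hat\alpha} = \frac{i}{\sqrt2}\,(E^{\hat\alpha} + E^{-\hat\alpha}), \qquad C^{\hat\alpha} = \frac{1}{\sqrt2}\,(E^{\hat\alpha} - E^{-\hat\alpha}), \tag{5.116} \]
where
\[ E^{\hat\alpha} = E^\alpha e^{in\sigma}, \quad \hat\alpha = (\alpha,n); \qquad E^{\hat\alpha} = H^\mu e^{in\sigma}, \quad \hat\alpha = (\mu, n\neq 0). \tag{5.117} \]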
September 6, 2004 14:35 WSPC/148-RMP
00214
Quasitriangular WZW Model
771
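The complete list of the conventions and notations concerning this basis can be found in Sec. 3.2.4. The corresponding dual basis of $\hat{\mathcal G}^*$ is
\[ \hat t_i = \big(\hat t_\infty,\ \pi^*(t_\mu),\ \pi^*(b_{\hat\alpha}),\ \pi^*(c_{\hat\alpha})\big), \qquad \hat\alpha\in\hat\Phi_+ . \tag{5.118} \]
From this we obtain the dual basis of $\hat{\mathcal B}$; it reads
\[ \hat\Lambda(\hat t_i) = \Big(\varepsilon\hat t^{\,1}_\infty,\ \varepsilon\nu_*(H^\mu),\ \varepsilon\nu_*\Big(\frac{|\hat\alpha|^2}{\sqrt2}E^{\hat\alpha}\Big),\ \varepsilon\nu_*\Big(-i\frac{|\hat\alpha|^2}{\sqrt2}E^{\hat\alpha}\Big)\Big), \qquad \hat\alpha\in\hat\Phi_+ , \tag{5.119} \]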
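where $\hat t^{\,1}_\infty = -i\partial_\sigma$ and $\nu_*:\mathcal B\to\hat{\mathcal B}$ were defined in Sec. 4.4.2. Moreover, $|\hat\alpha|^2 = |\alpha|^2$ for $\hat\alpha = (\alpha,n)$ and $|\hat\alpha|^2 = 2$ for $\hat\alpha = (\mu,n)$. Of course, $\hat{\mathcal B}$ is the Lie algebra of $\hat B\subset\hat{\hat D}$ and $\mathcal B = \mathrm{Lie}(B)$ where $L_+G_0^{\mathbb C} = B\subset D = LG_0^{\mathbb C}$. The dual basis depends on $\varepsilon$ and satisfies the basic condition $(\hat\Lambda(\hat t_i),\hat T^j)_{\hat{\hat D}} = \delta_i^j$.

Recall that $\hat a = \exp\hat\Lambda(\hat\phi)$ is the element of the alcove $\hat A_+\subset\hat A$; the Lie algebra $\hat{\mathcal A}$ of $\hat A$ is generated by the generators $\hat\Lambda(\hat t_{\hat\mu}) = \hat\Lambda(\hat t_\infty,\pi^*(t_\mu)) = (\varepsilon\nu_*(H^\mu), -i\varepsilon\partial_\sigma)$. We decompose the elements of the basis of $T_{\hat a}\hat\Xi(\hat M_L)$ into two parts: the $\hat\alpha$-part generated by $R_{\hat a*}\iota(B^{\hat\alpha})$, $R_{\hat a*}\iota(C^{\hat\alpha})$, $\hat\alpha\in\hat\Phi_+$, and the $\hat\mu$-part generated by $R_{\hat a*}\hat T^{\hat\mu}$, $R_{\hat a*}\hat\Lambda(\hat t_{\hat\mu})$. Here by $\hat T^{\hat\mu}$ we mean either $\iota(T^\mu)$ or $\hat T^\infty$. Now we use the formula (4.18) from Sec. 4.1.2, expressing the Semenov-Tian-Shansky form on $\hat{\hat D}$:
\[ \hat\omega(t,u) = \big(t, (\Pi_{\tilde LR} - \Pi_{L\tilde R})u\big)_{\hat{\hat D}} . \]
We immediately observe that
\[ \hat\omega_L(\hat\alpha,\hat\beta) = 0, \qquad \hat\omega_L(\hat\alpha,\hat\mu) = 0, \qquad \hat\alpha,\hat\beta\in\hat\Phi_+,\quad \hat\alpha\neq\hat\beta . \tag{5.120} \]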
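This is because we are at the point $(\hat e,\hat a)\in\hat M_L\ (\subset\hat{\hat D})$. It is hence sufficient to invert the matrix $\hat\omega_L$ for the $\hat\mu$-sector and for every $\hat\alpha$-sector separately. We have
\[ \Pi_{\tilde LR}R_{\hat a*}\hat T^{\hat\mu} = R_{\hat a*}\hat T^{\hat\mu}, \qquad \Pi_{L\tilde R}R_{\hat a*}\hat T^{\hat\mu} = 0 . \tag{5.121} \]
From this we obtain
\[ \hat\omega_L\big(R_{\hat a*}\hat\Lambda(\hat t_{\hat\nu}), R_{\hat a*}\hat T^{\hat\mu}\big) = \delta^{\hat\mu}_{\hat\nu} . \tag{5.122} \]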
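One has
\[ R_{\hat a*}\iota(C^{\hat\alpha}) = \big(R_{\hat a*}\iota(C^{\hat\alpha}), L_{\hat a*}\hat\Lambda(\hat t_i)\big)_{\hat{\hat D}}\,L_{\hat a*}\hat T^i + \big(R_{\hat a*}\iota(C^{\hat\alpha}), L_{\hat a*}\hat T^i\big)_{\hat{\hat D}}\,L_{\hat a*}\hat\Lambda(\hat t_i) \tag{5.123} \]
and
\[ L_{\hat a*}\hat\Lambda(b_{\hat\alpha}) = e^{\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle}R_{\hat a*}\hat\Lambda(b_{\hat\alpha}), \qquad L_{\hat a*}\hat\Lambda(c_{\hat\alpha}) = e^{\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle}R_{\hat a*}\hat\Lambda(c_{\hat\alpha}) . \tag{5.124} \]
Then we have for every $\hat\alpha\in\hat\Phi_+$
\[
\begin{aligned}
\Pi_{L\tilde R}R_{\hat a*}\iota(C^{\hat\alpha}) = e^{\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle}\Big[ & \big(L_{\hat a*}\iota(B^{\hat\alpha}), R_{\hat a*}\iota(C^{\hat\alpha})\big)_{\hat{\hat D}}\,R_{\hat a*}\hat\Lambda(b_{\hat\alpha}) \\
& + \big(L_{\hat a*}\iota(C^{\hat\alpha}), R_{\hat a*}\iota(C^{\hat\alpha})\big)_{\hat{\hat D}}\,R_{\hat a*}\hat\Lambda(c_{\hat\alpha})\Big] ;
\end{aligned}
\tag{5.125}
\]
\[ \Pi_{\tilde LR}R_{\hat a*}\iota(C^{\hat\alpha}) = R_{\hat a*}\iota(C^{\hat\alpha}) . \tag{5.126} \]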
September 6, 2004 14:35 WSPC/148-RMP
772
00214
C. Klimčík
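It follows that
\[ \hat\omega_L\big(R_{\hat a*}\iota(B^{\hat\alpha}), R_{\hat a*}\iota(C^{\hat\alpha})\big) = -e^{\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle}\big(L_{\hat a*}\iota(B^{\hat\alpha}), R_{\hat a*}\iota(C^{\hat\alpha})\big)_{\hat{\hat D}} = \frac{1}{\varepsilon|\hat\alpha|^2}\Big(e^{2\cdot\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle} - 1\Big) . \tag{5.127} \]
Recall from Sec. 3.2.4 that for $\hat\alpha = (\alpha,n)$
\[ \frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle = \varepsilon a^\infty\big(a^\mu\langle\alpha,H^\mu\rangle + n\big) \tag{5.128} \]
and for $\hat\alpha = (\mu,n)$,
\[ \frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle = \varepsilon a^\infty n . \tag{5.129} \]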
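Our conclusion is that
\[ \hat s^{\hat\mu\hat\nu} = 0, \qquad \hat\sigma^{\hat\mu}_{\hat\alpha} = 0, \qquad \hat\sigma^{\hat\nu}_{\hat\mu} = \delta^{\hat\nu}_{\hat\mu}, \qquad \hat\Sigma^0_{\hat\mu\hat\nu} = \hat\Sigma^0_{\hat\alpha\hat\mu} = \hat\Sigma^0_{\hat\alpha\hat\beta} = 0 \tag{5.130} \]
and the explicit formula for the Poisson bivector $\hat\Sigma_L$ on $\hat M_L$ reads
\[
\begin{aligned}
\hat\Sigma_L(\hat k,\phi^{\hat\mu}) = {}& -\tfrac12(\Pi^{\hat R}_{\hat G})_{ij}(\hat k)\,R_{\hat k*}\hat T^i\wedge R_{\hat k*}\hat T^j \\
& + \sum_{\hat\alpha\in\hat\Phi_+}\frac{\varepsilon|\hat\alpha|^2}{1 - e^{2\cdot\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi\rangle}}\,L_{\hat k*}\iota(B^{\hat\alpha})\wedge L_{\hat k*}\iota(C^{\hat\alpha}) \\
& + L_{\hat k*}\hat T^{\hat\mu}\wedge\frac{\partial}{\partial\phi^{\hat\mu}} .
\end{aligned}
\tag{5.131}
\]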
So far we have inverted the form $\hat\omega_L$; now we are ready to perform the second step of the symplectic reduction induced by setting $a^\infty = \kappa$. Consider a pair of functions $\phi_i$, $i = 1,2$, on $M_L^{WZ}$. We wish to calculate their reduced Poisson bracket $\{\phi_1,\phi_2\}_{qWZ}$. The general procedure of the symplectic reduction at the level of the Poisson brackets is described in Appendix A.3. In our particular situation, it works as follows: define two functions $\hat\phi_i$ on $\hat M_L$ as
\[ \hat\phi_i(\hat k, a^\mu, a^\infty = \kappa) \equiv \phi_i(\pi(\hat k), a^\mu), \qquad \hat k\in\widehat{LG_0} . \tag{5.132} \]
Then calculate the quasitriangular Poisson bracket $\{\hat\phi_1,\hat\phi_2\}^q_{\hat M_L}$ on $\hat M_L$. It verifies
\[ \{a^\infty, \{\hat\phi_1,\hat\phi_2\}^q_{\hat M_L}\}^q_{\hat M_L} = 0 \tag{5.133} \]
as the simple consequence of the Jacobi identity and the central circle invariance of $\hat\phi_i$. This means that there exists a function on $M_L^{WZ}$ denoted suggestively as $\{\phi_1,\phi_2\}_{qWZ}$ which verifies
\[ \{\hat\phi_1,\hat\phi_2\}^q_{\hat M_L}(\hat k, a^\mu, a^\infty = \kappa) = \{\phi_1,\phi_2\}_{qWZ}(\pi(\hat k), a^\mu) . \tag{5.134} \]
Needless to say, the function $\{\phi_1,\phi_2\}_{qWZ}$ is the reduced Poisson bracket that we seek. This method we now apply for the functions $\phi_i$ of particular form.
5.2.4. Deformed affine dynamical r-matrix

Our next task consists of computing the reduced Poisson bracket $\{k\stackrel{\otimes}{,}k\}_{qWZ}$. The reader who is still not used to the notation for the matrix Poisson bracket should again consult Secs. 3.1.3 and 3.2.6. We have
\[ \{k\stackrel{\otimes}{,}k\}_{qWZ} = \{k\stackrel{\otimes}{,}k\}_0 - \{k\stackrel{\otimes}{,}k\}_G , \tag{5.135} \]
where the first bracket corresponds (modulo the reduction) to the bivector on the second line on the right-hand side of (5.131), and the second bracket to that on the first line. Of course, the relation (5.135) is the affine analogue of (5.49). Also the way of evaluating the two terms in (5.135) follows in spirit the calculation of Sec. 5.1.3. We first introduce some necessary notions. Define the identification map $\Lambda:\mathcal G^*\to\mathcal B = \mathrm{Lie}(B)$ as usual
\[ (\Lambda(x^*), y)_D = \langle x^*, y\rangle, \qquad x^*\in\mathcal G^*,\ y\in\mathcal G . \tag{5.136} \]
We can directly check from the definition (4.127), (4.128) of $(\cdot,\cdot)_{\hat{\hat D}}$ that
\[ \hat\Lambda\circ\pi^* = \nu_*\circ\Lambda . \tag{5.137} \]
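Now using the relations (4.63) and (4.66), we first calculate the relevant Poisson–Lie bracket on $\hat G = \widehat{LG_0}$
\[
\begin{aligned}
\{\pi(\hat k)\stackrel{\otimes}{,}\pi(\hat k)\}^{\hat R}_{\hat G} = {}& -\pi(\iota(T^i)\hat k)\otimes\pi(\iota(T^j)\hat k) \\
& \times\Big[\big(\hat k^{-1}(\hat\Lambda\circ\pi^*)(t_i)\hat k, (\hat\Lambda\circ\pi^*)(t_l)\big)_{\hat{\hat D}}\big(\hat k^{-1}(\hat\Lambda\circ\pi^*)(t_j)\hat k, \iota(T^l)\big)_{\hat{\hat D}} \\
& \quad + \big(\hat k^{-1}(\hat\Lambda\circ\pi^*)(t_i)\hat k, \hat\Lambda(\hat t_\infty)\big)_{\hat{\hat D}}\big(\hat k^{-1}(\hat\Lambda\circ\pi^*)(t_j)\hat k, \hat T^\infty\big)_{\hat{\hat D}}\Big] \\
= {}& -T^i\pi(\hat k)\otimes T^j\pi(\hat k)\times\big(\pi(\hat k)^{-1}\Lambda(t_i)\pi(\hat k), \Lambda(t_l)\big)_D\big(\pi(\hat k)^{-1}\Lambda(t_j)\pi(\hat k), T^l\big)_D .
\end{aligned}
\tag{5.138}
\]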
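Now we use the formulae (5.88) and (5.137) to calculate the $\{\cdot,\cdot\}_G$-contribution to the Poisson bracket $\{\cdot,\cdot\}_{qWZ}$ according to the decomposition (5.135). The result is
\[
\begin{aligned}
\{k\stackrel{\otimes}{,}k\}^R_G &= -(T^ik\otimes T^jk)\times\big(k^{-1}\Lambda(t_i)k, \Lambda(t_l)\big)_D\big(k^{-1}\Lambda(t_j)k, T^l\big)_D \\
&= [(k\otimes k), (T^i\otimes\Lambda(t_i))] ,
\end{aligned}
\tag{5.139}
\]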
which is by the way nothing but the Poisson–Lie bracket on $G = LG_0$ induced by the double $D = LG_0^{\mathbb C}$. The calculation giving this result follows step by step the computation leading from (5.50) to (5.52). Although the continuation of this affine story is very similar to the finite-dimensional case described in Sec. 5.1.3, some additional care is needed in the affine case because of the infinite number of the elements of the basis $(T^i, \Lambda(t_i))$ of
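D = LG0 . Indeed, we must give the meaning to the series $T^i\otimes\Lambda(t_i)$. Recall that the basis is given by (cf. (5.116) and (5.119))
\[ T^i = \big(T^\mu, B^{\hat\alpha}, C^{\hat\alpha}\big), \qquad \Lambda(t_i) = \Big(\varepsilon H^\mu,\ \frac{\varepsilon|\hat\alpha|^2}{\sqrt2}E^{\hat\alpha},\ -i\frac{\varepsilon|\hat\alpha|^2}{\sqrt2}E^{\hat\alpha}\Big), \qquad \hat\alpha\in\hat\Phi_+ . \tag{5.140} \]
A simple computation then shows that
\[ T^i\otimes\Lambda(t_i) = i\varepsilon H^\mu\otimes H^\mu + i\varepsilon\sum_{\hat\alpha\in\hat\Phi_+}|\hat\alpha|^2E^{-\hat\alpha}\otimes E^{\hat\alpha} . \tag{5.141} \]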
We wish to calculate this expression in the evaluation representation of $LG_0$. Recall that its representation space $LV_0$ is given by square-integrable maps from the loop circle into the representation space $V_0$ of some finite-dimensional (typically irreducible) representation of $G_0$. This means (e.g. for the affine root $\hat\alpha = (\alpha,n)$) that $E^{\hat\alpha}$ is to be viewed as $E^\alpha e^{in\sigma}$, where $E^\alpha\in\mathrm{End}(V_0)$ and $e^{in\sigma}$ is the "multiplication by function" operator in $\mathrm{End}(LV_0)$. Within the summation over all affine roots, we can consider the subsummation over $\hat\alpha = (\alpha,n)$, where $\alpha$ is kept fixed and $n$ takes every integer value. Then it is easy to see that in the evaluation representation the Fourier series over $n$ diverges. This divergence shows that we have to care about the analytic aspects of working with infinite-dimensional symplectic manifolds. In other words, we have to give a meaning to the divergent Fourier series (5.141).

The reader should understand that the prescription associating a well-defined function of $\sigma$ to the series (5.141) is a part of the definition of our chiral quasitriangular WZW model. Indeed, the resulting functions appear in the definition of the symplectic structure of the model. Of course, our prescription for summing the divergent series must fulfil some consistency conditions. The most important one is that the Poisson bracket (5.139) must fulfil the Jacobi identity. On top of this, we shall require that the resulting function of $\sigma$ be meromorphic. In this way, we shall find ourselves in the standard world of the r-matrices. It turns out that a prescription fulfilling both conditions exists: it is called the Abel–Poisson summation method and is based on the following observation. The series $\sum_{n>0}(a_n\cos n\sigma + b_n\sin n\sigma)$ becomes (uniformly) convergent if we replace $(a_n, b_n)$ by $(r^na_n, r^nb_n)$, where $0\le r<1$. Its sum we denote as $S_r(\sigma)$. If the limit $\lim_{r\to1}S_r(\sigma)$ exists, it is called the Abel–Poisson sum of the original series.
For example, we have (cf. [40, p. 83])
\[ \sum_{n>0}\big(e^{in\sigma} - e^{-in\sigma}\big) = i\,\mathrm{cotg}\,\tfrac12\sigma \tag{5.142} \]
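The Abel–Poisson prescription behind (5.142) can be probed numerically. The following is a minimal sketch, not part of the original derivation: the function names and the chosen values of $r$ and $\sigma$ are illustrative assumptions. It damps the divergent series $\sum_{n>0}2\sin n\sigma$ (which equals $\frac{1}{i}\sum_{n>0}(e^{in\sigma}-e^{-in\sigma})$) by $r^n$ and compares the result with $\mathrm{cotg}\,\frac12\sigma$.

```python
import math


def abel_poisson_partial(sigma: float, r: float, n_terms: int) -> float:
    """Truncated damped series sum_{n=1}^{n_terms} r**n * 2*sin(n*sigma)."""
    total = 0.0
    r_power = 1.0
    for n in range(1, n_terms + 1):
        r_power *= r
        total += r_power * 2.0 * math.sin(n * sigma)
    return total


def abel_poisson_closed(sigma: float, r: float) -> float:
    """Closed form of the damped series for 0 <= r < 1."""
    return 2.0 * r * math.sin(sigma) / (1.0 - 2.0 * r * math.cos(sigma) + r * r)


def cotg_half(sigma: float) -> float:
    """cotg(sigma/2), the claimed Abel-Poisson sum (sigma not a multiple of 2*pi)."""
    return math.cos(0.5 * sigma) / math.sin(0.5 * sigma)


if __name__ == "__main__":
    sigma = 1.0  # illustrative sample point
    for r in (0.9, 0.99, 0.999):
        print(r, abel_poisson_closed(sigma, r), cotg_half(sigma))
```

As $r\to1$ the damped sum approaches $\mathrm{cotg}\,\frac12\sigma$ for any $\sigma$ that is not a multiple of $2\pi$, where the limit develops a pole.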
in the Abel–Poisson sense. It is indeed this formula which permits us to compute $T^i\otimes\Lambda(t_i)$ in the evaluation representation
\[ T^i\otimes\Lambda(t_i) \equiv \varepsilon\hat r(\sigma-\sigma') = \varepsilon r + \varepsilon C\,\mathrm{cotg}\,\tfrac12(\sigma-\sigma') . \tag{5.143} \]
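Recall that
\[ r = \sum_{\alpha\in\Phi_+}\frac{i|\alpha|^2}{2}\big(E^{-\alpha}\otimes E^\alpha - E^\alpha\otimes E^{-\alpha}\big) ; \tag{5.144} \]
\[ C = \sum_\mu H^\mu\otimes H^\mu + \sum_{\alpha\in\Phi_+}\frac{|\alpha|^2}{2}\big(E^{-\alpha}\otimes E^\alpha + E^\alpha\otimes E^{-\alpha}\big) . \tag{5.145} \]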
Inserting (5.143) into the formula (5.139), we obtain
\[ -\{k(\sigma)\stackrel{\otimes}{,}k(\sigma')\}^R_G = \varepsilon[\hat r, (k(\sigma)\otimes k(\sigma'))] . \tag{5.146} \]
This Poisson bracket indeed obeys the Jacobi identity, since it can be easily checked that the r-matrix $\hat r(\sigma-\sigma')$ satisfies the ordinary Yang–Baxter equation with spectral parameter, i.e.
\[ [\hat r^{12}(\sigma_1-\sigma_2), \hat r^{13}(\sigma_1-\sigma_3) + \hat r^{23}(\sigma_2-\sigma_3)] + [\hat r^{13}(\sigma_1-\sigma_3), \hat r^{23}(\sigma_2-\sigma_3)] = 0 . \tag{5.147} \]
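The calculation of the bracket $\{k\stackrel{\otimes}{,}k\}_0$ in (5.135) is even more straightforward. We use (5.131) and (5.134) to arrive at
\[ \{k\stackrel{\otimes}{,}k\}_0 = (k\otimes k)\,\hat r^0_\varepsilon(\hat\phi_\kappa) , \tag{5.148} \]
where
\[ \hat r^0_\varepsilon(\hat\phi_\kappa) = \sum_{\hat\alpha\in\hat\Phi_+}\frac{\varepsilon|\hat\alpha|^2}{1 - e^{2\cdot\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi_\kappa\rangle}}\,B^{\hat\alpha}\wedge C^{\hat\alpha} . \tag{5.149} \]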
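Recall also that $\hat\phi_\kappa = \kappa\hat t_\infty + \kappa a^\mu\pi^*(t_\mu)$. Putting together (5.146) and (5.149), we arrive at the most important formula of this paper
\[ \{k\stackrel{\otimes}{,}k\}_{qWZ} = (k\otimes k)\,\hat r_\varepsilon(\hat\phi_\kappa) + \varepsilon\hat r\,(k\otimes k) , \tag{5.150} \]
where $\hat r$ is the standard affine r-matrix (5.143) and
\[ \hat r_\varepsilon(\hat\phi_\kappa) = i\varepsilon\sum_{\hat\alpha\in\hat\Phi}\frac{|\hat\alpha|^2}{2}\coth\Big(\frac{\varepsilon}{2}|\hat\alpha|^2\langle i\hat\alpha^\vee,\hat\phi_\kappa\rangle\Big)E^{\hat\alpha}\otimes E^{-\hat\alpha} . \tag{5.151} \]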
Note that here the summation goes over all roots. It is useful to note that $\hat r_\varepsilon$ is nothing but the direct affinization of the formula (5.60) for $r_\varepsilon$. We shall refer to $\hat r_\varepsilon(\hat\phi_\kappa)$ as the deformed affine dynamical r-matrix. We shall see in a moment that (5.151) in the evaluation representation does indeed fulfil the dynamical Yang–Baxter equation with spectral parameter.
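It is insightful to visualize the $\sigma$-dependence in (5.151). Recalling the explicit expressions (3.83), (3.84) for $\langle i\hat\alpha^\vee,\hat\phi_\kappa\rangle$, we can write
\[
\begin{aligned}
\hat r_\varepsilon(\hat\phi_\kappa)(\sigma-\sigma') = {}& -i\varepsilon\sum_{\mu,\,n\neq0}\coth(-\varepsilon\kappa n)\,(H^\mu\otimes H^\mu)\,e^{in(\sigma-\sigma')} \\
& - i\varepsilon\sum_{\alpha\in\Phi,\,n\in\mathbb Z}\frac{|\alpha|^2}{2}\coth\big(-\varepsilon\kappa a^\mu\langle\alpha,H^\mu\rangle - \varepsilon\kappa n\big)\,(E^\alpha\otimes E^{-\alpha})\,e^{in(\sigma-\sigma')} .
\end{aligned}
\tag{5.152}
\]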
Of course, the summation is to be taken in the Abel–Poisson sense. We use the following classical formulae [44]
\[ \sigma_{-y}(z,\tau) = \pi(\mathrm{cotg}\,\pi z + \mathrm{cotg}\,\pi y) + 4\pi\sum_{m,n>0}e^{2\pi i\tau mn}\sin2\pi(mz+ny) ; \tag{5.153} \]
\[ \rho(z,\tau) = \pi\,\mathrm{cotg}\,\pi z + 4\pi\sum_{n>0}\frac{e^{2\pi in\tau}}{1-e^{2\pi in\tau}}\sin2\pi nz , \tag{5.154} \]
where the functions $\rho(z,\tau)$, $\sigma_w(z,\tau)$ are defined as (cf. [17, 20, 21])
\[ \sigma_w(z,\tau) = \frac{\theta_1(w-z,\tau)\,\theta_1'(0,\tau)}{\theta_1(w,\tau)\,\theta_1(z,\tau)} , \qquad \rho(z,\tau) = \frac{\theta_1'(z,\tau)}{\theta_1(z,\tau)} . \tag{5.155} \]
Note that $\theta_1(z,\tau)$ is the Jacobi theta function$^{j}$
\[ \theta_1(z,\tau) = -\sum_{j=-\infty}^{\infty}e^{\pi i(j+\frac12)^2\tau + 2\pi i(j+\frac12)(z+\frac12)} , \tag{5.156} \]
the prime $'$ means the derivative with respect to the first argument $z$ and the argument $\tau$ (the modular parameter) is a nonzero complex number such that $\mathrm{Im}\,\tau>0$. Now we can sum up the Fourier series (5.152) by using the classical formulae (5.153), (5.154) and the relation (5.142). First we obtain$^{k}$
\[ \sum_{n\in\mathbb Z}e^{2\pi izn}\big(1+\coth(a+i\pi n\tau)\big) = -\frac{i}{\pi}\,\sigma_{\frac{ia}{\pi}}(z,\tau) , \tag{5.157} \]
and
\[ 1 + \sum_{n\in\mathbb Z\setminus0}e^{2\pi izn}\big(1+\coth(i\pi n\tau)\big) = -\frac{i}{\pi}\,\rho(z,\tau) . \tag{5.158} \]
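The classical ingredients used above can be cross-checked numerically. The sketch below is an illustration only: the truncation orders, the sample point $z$ and the sample modulus $\tau$ are assumptions made for the example. It evaluates $\rho = \theta_1'/\theta_1$ directly from the theta series (5.156) and compares it with the Fourier-type expansion (5.154), which is expected to hold for $|\mathrm{Im}\,z| < \mathrm{Im}\,\tau$.

```python
import cmath
import math


def theta1(z: complex, tau: complex, terms: int = 12) -> complex:
    """Truncated theta series (5.156)."""
    total = 0j
    for j in range(-terms, terms + 1):
        h = j + 0.5
        total += cmath.exp(1j * math.pi * h * h * tau + 2j * math.pi * h * (z + 0.5))
    return -total


def theta1_prime(z: complex, tau: complex, terms: int = 12) -> complex:
    """Term-by-term z-derivative of the truncated series (5.156)."""
    total = 0j
    for j in range(-terms, terms + 1):
        h = j + 0.5
        total += 2j * math.pi * h * cmath.exp(
            1j * math.pi * h * h * tau + 2j * math.pi * h * (z + 0.5)
        )
    return -total


def rho_from_theta(z: complex, tau: complex) -> complex:
    """rho(z, tau) = theta1'(z, tau) / theta1(z, tau), cf. (5.155)."""
    return theta1_prime(z, tau) / theta1(z, tau)


def rho_from_series(z: complex, tau: complex, terms: int = 60) -> complex:
    """Truncated expansion (5.154); assumes |Im z| < Im tau."""
    total = math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)
    for n in range(1, terms + 1):
        q_n = cmath.exp(2j * math.pi * n * tau)
        total += 4.0 * math.pi * q_n / (1.0 - q_n) * cmath.sin(2.0 * math.pi * n * z)
    return total


if __name__ == "__main__":
    z, tau = 0.3 + 0.05j, 0.2 + 0.9j  # illustrative sample values
    print(rho_from_theta(z, tau))
    print(rho_from_series(z, tau))
```

The same `rho_from_theta` helper can also be used to test the quasi-periodicity $\rho(z+\tau,\tau) = \rho(z,\tau) - 2\pi i$, since the theta series converges for every $z$.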
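Now by using (5.157) and (5.158), we finally arrive at
\[ \{k(\sigma)\stackrel{\otimes}{,}k(\sigma')\}_{qWZ} = (k(\sigma)\otimes k(\sigma'))\,\hat r_\varepsilon(a^\mu,\sigma-\sigma') + \varepsilon\hat r(\sigma-\sigma')\,(k(\sigma)\otimes k(\sigma')) , \tag{5.159} \]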
j We have $\theta_1(z,\tau) = \vartheta_1(\pi z,\tau)$ with $\vartheta_1$ in [44].
k The formulae (5.157) and (5.158) appear also in [17] but with several misprinted signs. Those wrong signs turn out to be innocent in the context of [17] since they conspire to give an r-matrix which also fulfils the dynamical Yang–Baxter equations (5.163). In fact, the correct and wrong r-matrices differ by the gauge transformation of type 4 (cf. [17, Sec. 4.2]).
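where $\hat r_\varepsilon(a^\mu,\sigma)$ is the Felder–Wieczerkowski [21] elliptic dynamical r-matrix given by
\[ \hat r_\varepsilon(a^\mu,\sigma) = -\frac{\varepsilon}{\pi}\,\rho\Big(\frac{\sigma}{2\pi},\frac{i\kappa\varepsilon}{\pi}\Big)H^\mu\otimes H^\mu - \frac{\varepsilon}{\pi}\sum_{\alpha\in\Phi}\frac{|\alpha|^2}{2}\,\sigma_{\frac{\varepsilon\kappa a^\mu\langle\alpha,H^\mu\rangle}{\pi i}}\Big(\frac{\sigma}{2\pi},\frac{i\kappa\varepsilon}{\pi}\Big)E^\alpha\otimes E^{-\alpha} . \tag{5.160} \]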
The (quasitriangular) braiding relation (5.159) plays the same role in the quasitriangular chiral WZW model as the braiding relation (3.99) in the standard chiral WZW theory. The description of the bracket $\{\cdot,\cdot\}_{qWZ}$ on $M_L^{WZ}$ is then completed by the following formula, which can be easily derived from (5.131) and (5.134)
\[ \{k, a^\mu\}_{qWZ} = \frac1\kappa\,kT^\mu , \qquad \{a^\mu, a^\nu\}_{qWZ} = 0 . \tag{5.161} \]
It remains to verify that our Abel–Poisson summation method did indeed give a consistent result. First of all, both r-matrices appearing in (5.159) are clearly meromorphic functions of $\sigma$. The Jacobi identity for the Poisson brackets (5.159) and (5.161) then requires (5.147) and
\[ [\hat r_\varepsilon, 1\otimes T^\mu + T^\mu\otimes1] \equiv [\hat r^{12}_\varepsilon, (T^\mu)^1 + (T^\mu)^2] = 0 ; \tag{5.162} \]
\[
\begin{aligned}
& [\hat r^{12}_\varepsilon(\sigma_1-\sigma_2), \hat r^{13}_\varepsilon(\sigma_1-\sigma_3) + \hat r^{23}_\varepsilon(\sigma_2-\sigma_3)] + [\hat r^{13}_\varepsilon(\sigma_1-\sigma_3), \hat r^{23}_\varepsilon(\sigma_2-\sigma_3)] \\
& \quad + \frac1\kappa\frac{\partial}{\partial a^\mu}\hat r^{12}_\varepsilon\,(T^\mu)^3 + \frac1\kappa\frac{\partial}{\partial a^\mu}\hat r^{23}_\varepsilon\,(T^\mu)^1 + \frac1\kappa\frac{\partial}{\partial a^\mu}\hat r^{31}_\varepsilon\,(T^\mu)^2 = 0 .
\end{aligned}
\tag{5.163}
\]
The relation (5.163) is called the dynamical Yang–Baxter equation with spectral parameter [17, 20]. It is straightforward to check that the elliptic r-matrix $\hat r_\varepsilon(a^\mu,\sigma)$ does verify both conditions (5.162) and (5.163). It is instructive to rewrite the defining qWZW Poisson brackets (5.159) and (5.161) in terms of the monodromic variables $m(\sigma)$ defined by the relation (3.103). The result is
\[ \{m(\sigma)\stackrel{\otimes}{,}m(\sigma')\}_{qWZ} = (m(\sigma)\otimes m(\sigma'))\,B_\varepsilon(a^\mu,\sigma-\sigma') + \varepsilon\hat r(\sigma-\sigma')\,(m(\sigma)\otimes m(\sigma')) , \tag{5.164} \]
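where $B_\varepsilon(a^\mu,\sigma)$ is the quasitriangular braiding matrix generalizing the matrix $B_0(a^\mu,\sigma)$ defined in (3.105) and (3.106). We find it by generalizing the computation (3.105). The result is
\[ B_\varepsilon(a^\mu,\sigma) = -\frac{i}{\kappa}\,\rho\Big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\Big)H^\mu\otimes H^\mu - \frac{i}{\kappa}\sum_{\alpha\in\Phi}\frac{|\alpha|^2}{2}\,\sigma_{a^\mu\langle\alpha,H^\mu\rangle}\Big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\Big)E^\alpha\otimes E^{-\alpha} . \tag{5.165} \]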
We observe that the quasitriangular braiding matrix is again given by the Felder r-matrix but with a different modular parameter of the elliptic functions. Indeed, we have derived (5.165) by using the following modular identities
\[ \sigma_{-\tau y}(z,\tau) = -\frac1\tau\,e^{-2\pi iyz}\,\sigma_y\Big(-\frac z\tau,-\frac1\tau\Big) ; \tag{5.166} \]
\[ \rho(z,\tau) = -\frac{2\pi iz}{\tau} - \frac1\tau\,\rho\Big(-\frac z\tau,-\frac1\tau\Big) . \tag{5.167} \]
We can conclude this section by saying that the vertex-IRF transformation can be in a sense interpreted as the modular transformation in the deformation parameter.

5.2.5. q-Kac–Moody primary fields

We return for a moment to the symplectic structure of the standard (nondeformed) chiral WZW model and we note that Eqs. (3.117) and (3.119) imply
\[ \{m(\sigma), j_L^x\}_{WZ} = x(\sigma)m(\sigma) . \tag{5.168} \]
Here $m(\sigma)$ was defined in (3.103), $x(\sigma)$ is some element of $L\mathcal G_0$ and $j_L^x$ is the corresponding component of the Kac–Moody current. The Poisson bracket (5.168) plays a very important role in the quantum theory where it becomes the commutator of two quantum fields $m$ and $j_L$. The former plays the role of the vertex operator and the latter is the Kac–Moody current. The quantized bracket (5.168) then expresses the fact that $m$ is the Kac–Moody primary field, or, in other words, the Kac–Moody tensor operator. It will be convenient to rewrite (5.168) by using the inverse vertex-IRF transformation (cf. Sec. 3.2.6). Let $j_L$ be such a $\mathcal G$-valued function on $M_L^{WZ}$ that $j_L^x = (j_L, x)_{\mathcal G}$. Then the bracket (5.168) can be rewritten in the matrix form as follows
\[ \{k(\sigma)\stackrel{\otimes}{,}j_L(\sigma')\}_{WZ} = 2\pi C\delta(\sigma-\sigma')\,(k(\sigma)\otimes1) , \tag{5.169} \]
where
\[ k(\sigma) = m(\sigma)\exp(-a^\mu\Upsilon(t_\mu)\sigma) \tag{5.170} \]
and $C$ is the Casimir element defined in (5.145). We wish to find the quasitriangular generalization of the relation (5.169). The quantity $k(\sigma)$ keeps its meaning also in the deformed case; however, the standard moment map $\hat\beta_L(\hat k\hat a)\in\hat{\mathcal G}^*$ defined on $\hat M_L$ is to be replaced by the non-Abelian Poisson–Lie moment map $\hat b_L(\hat k\hat a)\in\hat B$. Note that the group multiplication $\hat k\hat a$ (in the argument) then takes place in the Drinfeld double $\hat{\hat D}$ where $\hat a = \exp\hat\Lambda(\hat\phi)$ and $\hat\phi = a^\infty\hat t_\infty + a^\infty a^\mu\pi^*(t_\mu)$ (cf. (5.114)). What is the analogue of $j_L$? In fact, $j_L$ can be written as
\[ \pi^*j_L = \Upsilon\circ\iota^*\circ\hat\beta_L . \tag{5.171} \]
Note that $\iota^*$ is the map from $\hat{\mathcal G}^*\to\mathcal G^*$ dual to $\iota:\mathcal G\to\hat{\mathcal G}$. But we do not have a canonical map from $\hat B$ into $B$ because $\hat B$ is the non-Abelian group. There are however two natural possibilities exploiting the maps $m_{L,R}$ defined in Sec. 4.3, Convention 4.12. Both are equally good to work with and we shall choose $m_R$. The deformed case analogue of $\iota^*\circ\hat\beta_L$ will therefore be the map
\[ F_L' = m_R\circ\hat b_L . \tag{5.172} \]
It is easy to see that similarly (as in the non-deformed case), $F_L'(\hat k\hat a)$ is the invariant $B$-valued function on $\hat M_L$ with respect to the central circle action. Restriction of this invariant function on the surface $a^\infty = \kappa$ can be interpreted as the $B$-valued function on the quasitriangular model space $M_L^{WZ}$ which will be denoted as $F_L(k, a^\mu)$. The quasitriangular analogue of $\{k\stackrel{\otimes}{,}j_L\}_{WZ}$ will be now the Poisson bracket $\{k\stackrel{\otimes}{,}F_L\}_{qWZ}$. We want to calculate this bracket explicitly. By definition, we have
\[ F_L'(\hat k, a^\mu, a^\infty = \kappa) = F_L(\pi(\hat k), a^\mu) . \tag{5.173} \]
We note, moreover, that (cf. Sec. 5.2.3)
\[ a^\infty = m^\infty_L(\hat k\hat a) = (m^\infty\circ\hat b_L)(\hat k\hat a) , \tag{5.174} \]
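where the map $m^\infty:\hat B\to\mathbb R = \exp\mathrm{Span}(\hat\Lambda(\hat t_\infty))$ was introduced in Convention 4.12 of Sec. 4.3.

The Poisson bivector on $\hat M_L$ was denoted as $\hat\Sigma_L$. Because $\hat b_L(\hat k\hat a)$ is the non-Abelian moment map, we have the relation (cf. (5.30))
\[ \hat\Sigma_L(\cdot, d\hat b_L\hat b_L^{-1}) = \hat\Lambda(\hat t_\infty)\otimes\nabla^L_{\hat T^\infty} + (\nu_*\circ\Lambda)(t_i)\otimes\nabla^L_{\iota(T^i)} , \tag{5.175} \]
where $(\hat T^\infty, \iota(T^i))$ is the basis of $\hat{\mathcal G} = \widehat{L\mathcal G_0}$ and $(\hat\Lambda(\hat t_\infty), (\nu_*\circ\Lambda)(t_i))$ is the basis of $\hat{\mathcal B}$ in the sense of the double $\hat{\hat D}$. Recall also that $\hat\Lambda\circ\pi^* = \nu_*\circ\Lambda$. The vector fields $\nabla^L$ correspond to the left action of the group $\hat G = \widehat{LG_0}$ on the model space $\hat M_L = \hat G\times\hat A_+$. Because
\[ \hat b_L = \exp\big(m^\infty_L\hat\Lambda(\hat t_\infty)\big)F_L' = \exp\big(a^\infty\hat\Lambda(\hat t_\infty)\big)F_L' , \tag{5.176} \]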
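we infer
\[ \hat\Sigma_L\Big(\cdot,\ da^\infty\hat\Lambda(\hat t_\infty) + e^{a^\infty\hat\Lambda(\hat t_\infty)}dF_L'(F_L')^{-1}e^{-a^\infty\hat\Lambda(\hat t_\infty)}\Big) = \hat\Lambda(\hat t_\infty)\otimes\nabla^L_{\hat T^\infty} + (\nu_*\circ\Lambda)(t_i)\otimes\nabla^L_{\iota(T^i)} . \tag{5.177} \]
From this relation, we obtain readily
\[ \{\pi(\hat k)\stackrel{\otimes}{,}F_L'\}_{\hat M_L} = T^i\pi(\hat k)\otimes e^{-a^\infty\hat\Lambda(\hat t_\infty)}(\nu_*\circ\Lambda)(t_i)\,e^{a^\infty\hat\Lambda(\hat t_\infty)}F_L' . \tag{5.178} \]
By using the relation (5.134) and the fact that $a^\infty = \kappa$, the symplectic reduction is trivially performed to give
\[ \{k\stackrel{\otimes}{,}F_L\}_{qWZ} = \big(T^i\otimes{}^{-\kappa}\Lambda(t_i)\big)(k\otimes F_L) \equiv \varepsilon\hat r^\kappa(k\otimes F_L) . \tag{5.179} \]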
The notation means
\[ {}^{-\kappa}\Lambda(t_i) \equiv e^{-\kappa(-i\varepsilon\partial_\sigma)}\Lambda(t_i)\,e^{\kappa(-i\varepsilon\partial_\sigma)} . \tag{5.180} \]
Thus we observe the presence of yet another r-matrix in our game. It is instructive to give explicit formulae for the elements ${}^{-\kappa}\Lambda(t_i)$:
\[ {}^{-\kappa}\Lambda(t_\mu) = \Lambda(t_\mu) , \qquad {}^{-\kappa}\Lambda(b_{\hat\alpha}) = e^{-\varepsilon\kappa n}\Lambda(b_{\hat\alpha}) , \qquad {}^{-\kappa}\Lambda(c_{\hat\alpha}) = e^{-\varepsilon\kappa n}\Lambda(c_{\hat\alpha}) , \tag{5.181} \]
where $n$ is taken from $\hat\alpha = (\alpha,n)$ or from $\hat\alpha = (\mu,n)$. The $\sigma$ dependence of the matrix $\hat r^\kappa$ is therefore as follows
\[ \hat r^\kappa(\sigma-\sigma') = \hat r(\sigma-\sigma'-i\varepsilon\kappa) = r + C\,\mathrm{cotg}\,\tfrac12(\sigma-\sigma'-i\varepsilon\kappa) . \tag{5.182} \]
The formula (5.179) is the principal result of this section. It is the quasitriangular generalization of the standard primary field condition (5.169). Upon the quantization, the relation should express the crucial property that the primary field should be the tensor operator with respect to the q-current algebra.

Remark. The characterization of the tensor operators of certain quantum groups by means of suitable r-matrices was discussed in [11, 12]. Our results fit in spirit in the framework of those references.

5.2.6. q-deformed current algebra

Recall the basic relation (3.114) defining the standard chiral current algebra
\[ \{j_L^x, j_L^y\}_{WZ} = j_L^{[x,y]} + \kappa\rho(x,y) . \tag{5.183} \]
Here $x, y\in\mathcal G = L\mathcal G_0$. Recall that
\[ j_L = \kappa\,\partial_\sigma m\,m^{-1} = \kappa a^\mu k\Upsilon(t_\mu)k^{-1} + \kappa\,\partial_\sigma k\,k^{-1} . \tag{5.184} \]
The basic relation (5.183) can be cast in the following matrix form
\[ \{j_L(\sigma)\stackrel{\otimes}{,}j_L(\sigma')\}_{WZ} = \pi\delta(\sigma-\sigma')\,[C, j_L(\sigma)\otimes1 - 1\otimes j_L(\sigma')] + \kappa2\pi C\,\partial_\sigma\delta(\sigma-\sigma') , \tag{5.185} \]
where $C$ is the Casimir element defined in (5.145). Our goal is to calculate the quasitriangular analogue of the current commutator (5.185). For this, we have to evaluate the Poisson brackets of the q-currents $\{F_L\stackrel{\otimes}{,}F_L\}_{qWZ}$, $\{F_L^\dagger\stackrel{\otimes}{,}F_L^\dagger\}_{qWZ}$ and $\{(F_L^\dagger)^{-1}\stackrel{\otimes}{,}F_L\}_{qWZ}$. The reason why the knowledge of the first bracket only is not sufficient is the same as in the finite case (cf. the text between (5.68) and (5.69)). The calculation is similar to the one performed in the previous section. We start with the basic relation (5.175)
\[ \hat\Sigma_L(\cdot, d\hat b_L\hat b_L^{-1}) = \hat\Lambda(\hat t_\infty)\otimes\nabla^L_{\hat T^\infty} + (\nu_*\circ\Lambda)(t_i)\otimes\nabla^L_{\iota(T^i)} . \]
Suppose that $f(\hat k)$ is some function invariant with respect to the central circle action. Then we obtain readily (cf. (5.178))
\[ \{f(\hat k), F_L'\}_{\hat M_L} = \nabla^L_{\iota(T^i)}f(\hat k)\times\Big(e^{-a^\infty\hat\Lambda(\hat t_\infty)}(\nu_*\circ\Lambda)(t_i)\,e^{a^\infty\hat\Lambda(\hat t_\infty)}F_L'\Big) . \tag{5.186} \]
Now we take for $f$ the function $F_L'$ itself. Since $F_L' = m_R\circ\hat b_L$, we can easily calculate the derivative
\[ \nabla^L_{\iota(T^i)}F_L' = \big(\hat b_L^{-1}\iota(T^i)\hat b_L, \iota(T^p)\big)_{\hat{\hat D}}\,F_L'\,(\nu_*\circ\Lambda)(t_p) . \tag{5.187} \]
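Thus we obtain
\[ \{F_L'\stackrel{\otimes}{,}F_L'\}_{\hat M_L} = F_L'(\nu_*\circ\Lambda)(t_p)\otimes\Big(e^{-a^\infty\hat\Lambda(\hat t_\infty)}(\nu_*\circ\Lambda)(t_i)\,e^{a^\infty\hat\Lambda(\hat t_\infty)}\Big)F_L'\times\big(F_L'^{-1}({}^{-a^\infty}T^i)F_L', T^p\big)_D . \tag{5.188} \]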
Now we are ready to get down to the Poisson bracket $\{\cdot,\cdot\}_{qWZ}$ on $M_L^{WZ}$. We obtain
\[ \{F_L\stackrel{\otimes}{,}F_L\}_{qWZ} = F_L\Lambda(t_p)\otimes{}^{-\kappa}\Lambda(t_i)F_L\times\big(F_L^{-1}({}^{-\kappa}T^i)F_L, T^p\big)_D . \tag{5.189} \]
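Now we use the obvious relation
\[ F_L^{-1}({}^{-\kappa}T^i)F_L = \big(F_L^{-1}({}^{-\kappa}T^i)F_L, T^p\big)_D\,\Lambda(t_p) + \big(F_L^{-1}({}^{-\kappa}T^i)F_L, \Lambda(t_p)\big)_D\,T^p \tag{5.190} \]
to rewrite (5.189) in the form
\[
\begin{aligned}
\{F_L\stackrel{\otimes}{,}F_L\}_{qWZ} &= \big({}^{-\kappa}T^i\otimes{}^{-\kappa}\Lambda(t_i)\big)(F_L\otimes F_L) - F_LT^p\otimes{}^{-\kappa}\Lambda(t_i)F_L\times\big(F_L^{-1}({}^{-\kappa}T^i)F_L, \Lambda(t_p)\big)_D \\
&= \big({}^{-\kappa}T^i\otimes{}^{-\kappa}\Lambda(t_i)\big)(F_L\otimes F_L) - (F_L\otimes F_L)\big(T^i\otimes\Lambda(t_i)\big) .
\end{aligned}
\tag{5.191}
\]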
By using the concrete properties of the basis $T^i$, $\Lambda(t_i)$ given by (5.140), it can be easily checked that (5.191) can be rewritten as
\[ \{F_L\stackrel{\otimes}{,}F_L\}_{qWZ} = \varepsilon[\hat r, F_L\otimes F_L] . \tag{5.192} \]
Note that this formula is fully analogous to the relation (5.68) holding in the finite-dimensional (non-affine) case. The case $\{F_L^\dagger\stackrel{\otimes}{,}F_L^\dagger\}_{qWZ}$ can be calculated similarly to yield
\[ \{F_L^\dagger\stackrel{\otimes}{,}F_L^\dagger\}_{qWZ} = -\varepsilon[\hat r, F_L^\dagger\otimes F_L^\dagger] . \tag{5.193} \]
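It remains to calculate $\{(F_L^\dagger)^{-1}\stackrel{\otimes}{,}F_L\}_{qWZ}$. We proceed as before to arrive at the following counterpart of the relation (5.188)
\[
\begin{aligned}
\{(F_L'^\dagger)^{-1}\stackrel{\otimes}{,}F_L'\}_{\hat M_L} = {}& -\big(F_L'^{-1}({}^{-a^\infty}T^i)F_L', T^p\big)_D \\
& \times(F_L'^\dagger)^{-1}\nu_*\big((\Lambda(t_p))^\dagger\big)\otimes\Big(e^{-a^\infty\hat\Lambda(\hat t_\infty)}(\nu_*\circ\Lambda)(t_i)\,e^{a^\infty\hat\Lambda(\hat t_\infty)}\Big)F_L' .
\end{aligned}
\tag{5.194}
\]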
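Getting down to the Poisson bracket $\{\cdot,\cdot\}_{qWZ}$ on $M_L^{WZ}$, we obtain
\[
\begin{aligned}
\{(F_L^\dagger)^{-1}\stackrel{\otimes}{,}F_L\}_{qWZ} &= -(F_L^\dagger)^{-1}(\Lambda(t_p))^\dagger\otimes{}^{-\kappa}\Lambda(t_i)F_L\times\big({}^{-\kappa}T^i, F_LT^pF_L^{-1}\big)_D \\
&= -(F_L^\dagger)^{-1}(\Lambda(t_p))^\dagger\otimes F_LT^p + (F_L^\dagger)^{-1}(\Lambda(t_p))^\dagger\otimes{}^{-\kappa}T^iF_L\times\big({}^{-\kappa}\Lambda(t_i), F_LT^pF_L^{-1}\big)_D .
\end{aligned}
\tag{5.195}
\]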
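Now we use two obvious relations
\[ \big(F_L^{-1}\,{}^{-\kappa}\Lambda(t_i)F_L\big)^\dagger = \big(F_L^{-1}\,{}^{-\kappa}\Lambda(t_i)F_L, T^p\big)_D\,(\Lambda(t_p))^\dagger , \qquad \big({}^{-\kappa}\Lambda(t_i)\big)^\dagger = {}^{\kappa}\big((\Lambda(t_i))^\dagger\big) \tag{5.196} \]
to rewrite (5.195) in the form
\[ \{(F_L^\dagger)^{-1}\stackrel{\otimes}{,}F_L\}_{qWZ} = -\big((F_L^\dagger)^{-1}\otimes F_L\big)\big((\Lambda(t_p))^\dagger\otimes T^p\big) + \Big({}^{\kappa}\big((\Lambda(t_p))^\dagger\big)\otimes{}^{-\kappa}T^p\Big)\big((F_L^\dagger)^{-1}\otimes F_L\big) . \tag{5.197} \]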
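Using the explicit form of the basis $(T^i, \Lambda(t_i))$, we find easily that $\varepsilon\hat r = (\Lambda(t_p))^\dagger\otimes T^p$. Thus we obtain the final form of the third defining Poisson bracket of the q-current algebra
\[ \{(F_L^\dagger)^{-1}\stackrel{\otimes}{,}F_L\}_{qWZ} = \varepsilon\hat r^{2\kappa}\big((F_L^\dagger)^{-1}\otimes F_L\big) - \big((F_L^\dagger)^{-1}\otimes F_L\big)\varepsilon\hat r , \tag{5.198} \]
where
\[ \hat r^{2\kappa}(\sigma-\sigma') = \hat r(\sigma-\sigma'-2i\varepsilon\kappa) . \tag{5.199} \]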
Remark. The Poisson bracket of the type (5.198) resembles the brackets arising in the description of the structure of the so-called twisted Heisenberg double of the reference [42]. Although we do not see here any direct connection of [42] with (5.198) (since $M_L^{WZ}$ does not even have the structure of the double), we believe, nevertheless, that there is a deeper reason why similar formulae appear here and in [42]. Most probably, the double $D = LG_0^{\mathbb C}$ equipped with the qWZW bracket reduced from $\hat{\hat D}$ could be a sort of the real form of the twisted Heisenberg double of [42]. More precisely, there exists a method of generating a new (non-twisted) Heisenberg double $D_{real}$ from an old one $D_{complex} = GB$ if the latter is equipped with an involution. Indeed, $D_{real}$ is simply the subgroup of $D_{complex}$ consisting of the elements which are stable under the involution. There is a condition to fulfil, however, that the restriction of the bilinear form $(\cdot,\cdot)_D$ on $\mathrm{Lie}(D_{real})$ must be nondegenerate. Then $D_{real}$ has canonically the structure of the Heisenberg double. It can be decomposed as $D_0 = G_0B_0$ where $G_0$, $B_0$ play the role of the mutually dual isotropic Poisson–Lie groups and they themselves consist of the elements of $G$ and $B$ stable under the involution. We conjecture that the similar realification can also be made for the twisted doubles and that among the class of the twisted Heisenberg doubles of $LG_0$ we can find such $D_{complex}$ that the derived twisted double $D_{real} = LG_0^{\mathbb C}$ (given by a suitable involution to be specified) is canonically equipped with the qWZW symplectic structure. However, the fact whether the
conjecture is true or false does not have direct implications for our qWZW story and we shall not further study possible connections of our formalism with the theory of the twisted Heisenberg doubles.

There already exists in the literature (cf. [41]) the concept of the q-deformed current algebra. We now show that our construction fits in this framework due to Reshetikhin and Semenov-Tian-Shansky [41]. First, we consider the (matrix) observable
\[ L = F_LF_L^\dagger . \tag{5.200} \]
The defining commutation relations (5.192), (5.193) and (5.198) can be then equivalently written in terms of one relation only
\[
\begin{aligned}
\{L(\sigma)\stackrel{\otimes}{,}L(\sigma')\}_{qWZ} = {}& (L(\sigma)\otimes L(\sigma'))\,\varepsilon\hat r(\sigma-\sigma') + \varepsilon\hat r(\sigma-\sigma')\,(L(\sigma)\otimes L(\sigma')) \\
& - (1\otimes L(\sigma'))\,\varepsilon\hat r(\sigma-\sigma'+2i\varepsilon\kappa)\,(L(\sigma)\otimes1) \\
& - (L(\sigma)\otimes1)\,\varepsilon\hat r(\sigma-\sigma'-2i\varepsilon\kappa)\,(1\otimes L(\sigma')) .
\end{aligned}
\tag{5.201}
\]
Our formula (5.201) coincides with the defining relation of the Poisson algebra introduced by Reshetikhin and Semenov-Tian-Shansky in Sec. 1 of their paper [41]. Note that (5.201) can be derived from the symplectic structure characterized by the qWZ bracket (5.164) (see the remark at the end of this paragraph). In other words, we have constructed the dynamical system whose symmetry structure is given by the Reshetikhin and Semenov-Tian-Shansky Poisson algebra. Such a system was not known previously; therefore, its construction constitutes one of the main original results of our paper. We expect also that the future quantization of our quasitriangular WZW model will give a (so far unknown) quantum dynamical system whose symmetry structure will be characterized by the q-current algebra introduced in [41, Sec. 2].

Recall that in the undeformed WZW model, the current $j_L(\sigma)$ can be written in terms of the primary field $m(\sigma)$ as follows
\[ j_L = \kappa\,\partial_\sigma m\,m^{-1} . \tag{5.202} \]
This relation can be called the classical Knizhnik–Zamolodchikov equation [36] since its quantum analogue is nothing but the standard KZ-equation written in the operatorial form [23]. Recall that here we have used the monodromic variables $m(\sigma) = k(\sigma)\exp(-a^\mu T^\mu\sigma)$ for the description of the phase space $M_L^{WZ}$ of the undeformed (and also deformed) chiral WZW model. The quasitriangular analogue of (5.202) can be easily derived from (5.200) and from (5.172) rewritten as
\[ F_L = b_L\big(k(\sigma+i\varepsilon\kappa)\,e^{\varepsilon\kappa a^\mu H^\mu}\big) . \tag{5.203} \]
The result is simple and aesthetically appealing
\[ L(\sigma) = m(\sigma+i\varepsilon\kappa)\,m^{-1}(\sigma-i\varepsilon\kappa) . \tag{5.204} \]
This is the classical version of the q-KZ equation; as expected, it is not a differential but rather a difference equation.

Remark. It is an instructive exercise to calculate the Poisson bracket $\{L(\sigma)\stackrel{\otimes}{,}L(\sigma')\}_{qWZ}$ starting with the representation (5.204) and using the formula (5.164). In order to arrive at the formula (5.201), one needs to know the (quasi)periodic behavior of the involved elliptic functions
\[ \sigma_w(z+\tau,\tau) = \sigma_w(z,\tau)\,e^{2\pi iw} , \qquad \sigma_w(z+1,\tau) = \sigma_w(z,\tau) ; \tag{5.205} \]
\[ \rho(z+\tau,\tau) = \rho(z,\tau) - 2\pi i , \qquad \rho(z+1,\tau) = \rho(z,\tau) . \tag{5.206} \]
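5.2.7. The limit q → 1

The symplectic structure of the quasitriangular chiral WZW model is fully described$^{l}$ by the fundamental quasitriangular braiding relation (5.164)
\[ \{m(\sigma)\stackrel{\otimes}{,}m(\sigma')\}_{qWZ} = (m(\sigma)\otimes m(\sigma'))\,B_\varepsilon(a^\mu,\sigma-\sigma') + \varepsilon\hat r(\sigma-\sigma')\,(m(\sigma)\otimes m(\sigma')) , \tag{5.164} \]
where $B_\varepsilon(a^\mu,\sigma)$ is the quasitriangular braiding matrix given by
\[ B_\varepsilon(a^\mu,\sigma) = -\frac{i}{\kappa}\,\rho\Big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\Big)H^\mu\otimes H^\mu - \frac{i}{\kappa}\sum_{\alpha\in\Phi}\frac{|\alpha|^2}{2}\,\sigma_{a^\mu\langle\alpha,H^\mu\rangle}\Big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\Big)E^\alpha\otimes E^{-\alpha} . \tag{5.165} \]
We wish to show that in the limit $\varepsilon\to0$, Eq. (5.164) gives
\[ \{m(\sigma)\stackrel{\otimes}{,}m(\sigma')\}_{(q=1)WZ} = (m(\sigma)\otimes m(\sigma'))\,B_0(a^\mu,\sigma-\sigma') , \tag{3.105} \]
where
\[ B_0(a^\mu,\sigma) = -\frac{\pi}{\kappa}\Big[\eta(\sigma)(H^\mu\otimes H^\mu) - i\sum_\alpha\frac{|\alpha|^2}{2}\,\frac{\exp(i\pi\eta(\sigma)\langle\alpha,H^\mu\rangle a^\mu)}{\sin(\pi\langle\alpha,H^\mu\rangle a^\mu)}\,E^\alpha\otimes E^{-\alpha}\Big] . \tag{3.106} \]
Recall that $\eta(\sigma)$ is the function defined by
\[ \eta(\sigma) = 2\Big[\frac{\sigma}{2\pi}\Big] + 1 , \tag{5.207} \]
where $[\sigma/2\pi]$ is the largest integer less than or equal to $\sigma/2\pi$.

l The description by the nonmonodromic variables $k(\sigma)$, $a^\mu$ is given by the formulae (5.159) and (5.161).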
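Comparing the $H^\mu\otimes H^\mu$ coefficients of (5.165) and (3.106), the claimed limit requires $\rho\big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\big)\to-i\pi\eta(\sigma)$ as $\varepsilon\to0$. The sketch below illustrates this numerically from the series (5.154). It is an illustration under stated assumptions: the function names, the truncation order and the sample values of $\sigma$, $\kappa$, $\varepsilon$ are hypothetical, and the series is used only for $0<|\sigma|<2\pi$, where it converges for these arguments.

```python
import math


def eta(sigma: float) -> int:
    """eta(sigma) = 2*[sigma/(2*pi)] + 1, cf. (5.207)."""
    return 2 * math.floor(sigma / (2.0 * math.pi)) + 1


def rho_scaled(sigma: float, kappa: float, eps: float, terms: int = 50) -> complex:
    """rho(i*sigma/(2*kappa*eps), i*pi/(kappa*eps)) via the series (5.154).

    Assumes 0 < |sigma| < 2*pi and kappa*eps > 0.
    """
    x = math.pi * sigma / (2.0 * kappa * eps)  # pi*z = i*x
    # pi*cotg(pi*z) with pi*z = i*x, using cot(i*x) = -i*coth(x)
    total = -1j * math.pi / math.tanh(x)
    for n in range(1, terms + 1):
        decay = 2.0 * math.pi ** 2 * n / (kappa * eps)  # e^{2 pi i n tau} = e^{-decay}
        y = math.pi * n * sigma / (kappa * eps)         # 2*pi*n*z = i*y
        # e^{-decay} * sin(i*y) = (i/2) * (e^{y-decay} - e^{-y-decay}); both exponents are negative
        numer = 0.5j * (math.exp(y - decay) - math.exp(-y - decay))
        total += 4.0 * math.pi * numer / (1.0 - math.exp(-decay))
    return total


if __name__ == "__main__":
    kappa = 1.0  # illustrative value
    for eps in (1.0, 0.3, 0.05):
        print(eps, rho_scaled(1.0, kappa, eps), -1j * math.pi * eta(1.0))
```

Writing the summand with combined exponents avoids overflow of the hyperbolic sine at small $\varepsilon$; the printed values approach $-i\pi\eta(\sigma)$ as $\varepsilon$ decreases.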
Now we observe that the term containing $\hat r$ in (5.164) disappears in the limit $\varepsilon\to0$. Hence, it is enough to show
\[ \lim_{\varepsilon\to0}\sigma_{a^\mu\langle\alpha,H^\mu\rangle}\Big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\Big) = -\pi\,\frac{\exp(i\pi\eta(\sigma)\langle\alpha,H^\mu\rangle a^\mu)}{\sin(\pi\langle\alpha,H^\mu\rangle a^\mu)} , \tag{5.208} \]
\[ \lim_{\varepsilon\to0}\rho\Big(\frac{i\sigma}{2\kappa\varepsilon},\frac{i\pi}{\kappa\varepsilon}\Big) = -i\pi\eta(\sigma) . \tag{5.209} \]
Using the formulae (5.205) and (5.206) and the fact that $\eta(\sigma+2\pi) = \eta(\sigma)+2$, we conclude that it is enough to establish the limits (5.208), (5.209) for $\sigma\in[-\pi,\pi]$. Then we can represent the elliptic functions on the left-hand side by the series (5.153) and (5.154). For $\sigma\in[-\pi,\pi]$, the contributions of the sums over $m,n>0$ disappear in the $\varepsilon\to0$ limit and we can consider only the "cotangents". This gives immediately (5.208) and (5.209). The correct $\varepsilon\to0$ limit of the fundamental braiding relation is therefore established.

Our next task is to establish the $\varepsilon\to0$ limit of the quasitriangular Hamiltonian
\[ H_L^{qWZ} = -\frac{1}{2\kappa}(\phi,\phi)_{\mathcal G^*} - \Big(\widetilde{\mathrm{Dres}}_{\hat k}\,e^{\hat\Lambda(\hat\phi_\kappa)}\Big)^0 . \]
We remind the reader that $(\tilde b)^0\equiv m^0(\tilde b)$ and that the map $m^0:\tilde B\to\mathbb R$ was defined in Convention 4.12 of Sec. 4.3 as
\[ \hat m(\tilde b)\exp\big(m^0(\tilde b)\tilde\Lambda(\tilde t_0)\big) = \tilde b , \tag{5.210} \]
where $\hat m(\tilde b)\in\hat B$. This decomposition is unambiguous, thus defining the function $m^0$. Moreover, since the generator $\tilde\Lambda(\tilde t_0)$ itself depends on $\varepsilon$, $m^0$ also does, which we may indicate by the subscript $m^0_\varepsilon$. Now we use the same reasoning as in the finite-dimensional case of Sec. 5.1.3 showing that in the $\varepsilon\to0$ limit, the dressing action becomes the coadjoint one. Thus we have
\[ \lim_{\varepsilon\to0}\big(m^0_\varepsilon\circ\widetilde{\mathrm{Dres}}_{\hat k}\big)\big(e^{\hat\Lambda_\varepsilon(\hat\phi_\kappa)}\big) = \langle\widetilde{\mathrm{Coad}}_{\hat k}\hat\phi_\kappa, \tilde T^0\rangle = \langle\phi, k^{-1}\partial_\sigma k\rangle + \frac12\kappa\,(k^{-1}\partial_\sigma k, k^{-1}\partial_\sigma k)_{\mathcal G} , \tag{5.211} \]
where $k$ stands for $\pi(\hat k)$ for brevity and we indicated by the subscript that $\hat\Lambda$ depends on $\varepsilon$. From the relation (5.211), we then obtain immediately
\[ \lim_{\varepsilon\to0}H_L^{qWZ} = -\frac{1}{2\kappa}(\phi,\phi)_{\mathcal G^*} - \langle\phi, k^{-1}\partial_\sigma k\rangle - \frac\kappa2\,(k^{-1}\partial_\sigma k, k^{-1}\partial_\sigma k)_{\mathcal G} \equiv H_L^{WZ} . \tag{5.212} \]
Thus the standard chiral Hamiltonian (3.92) is indeed recovered in the limit $\varepsilon\to0$. We conclude that the quasitriangular chiral WZW model gives in the limit $\varepsilon\to0$ (or $q\to1$) the standard chiral WZW theory.

We know that the fundamental bracket (5.164) implies the q-current algebra bracket (5.201). Our next task is to show that in the limit $\varepsilon\to0$ (or, equivalently, $q\to1$), we recover from (5.179) the standard Kac–Moody primary field condition (5.169) and from (5.201) the ordinary current algebra bracket (5.185).
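First we rewrite (5.179) in the equivalent way using the q-current $L(\sigma)$ defined in (5.200). We obtain

$$\{k(\sigma)\stackrel{\otimes}{,}L(\sigma')\}_{qWZ} = \varepsilon\hat r(\sigma-\sigma'-i\varepsilon\kappa)(k(\sigma)\otimes L(\sigma')) - (1\otimes L(\sigma'))\,\varepsilon\hat r(\sigma-\sigma'+i\varepsilon\kappa)(k(\sigma)\otimes 1). \tag{5.213}$$

From the classical q-KZ equation (5.204), we derive

$$L(\sigma) = 1 + 2i\varepsilon\kappa\,\partial_\sigma m\,m^{-1} + O(\varepsilon^2) = 1 + 2i\varepsilon j_L(\sigma) + O(\varepsilon^2). \tag{5.214}$$

Inserting this into (5.213) and using (5.182), we obtain in the lowest order in $\varepsilon$ the desired relation (5.169)

$$\{k(\sigma)\stackrel{\otimes}{,}j_L(\sigma')\}_{WZ} = 2\pi C\delta(\sigma-\sigma')(k(\sigma)\otimes 1).$$

Here the $\delta$-function was produced as the following limit $\varepsilon\to 0^+$:

$$4\pi i\,\delta(\sigma-\sigma') = \mathrm{cotg}\,\tfrac{1}{2}(\sigma-\sigma'-i0^+) - \mathrm{cotg}\,\tfrac{1}{2}(\sigma-\sigma'+i0^+). \tag{5.215}$$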
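Now we establish the $q\to 1$ limit of the q-current algebra

$$\begin{aligned}\{L(\sigma)\stackrel{\otimes}{,}L(\sigma')\}_{qWZ} ={}& (L(\sigma)\otimes L(\sigma'))\,\varepsilon\hat r(\sigma-\sigma') + \varepsilon\hat r(\sigma-\sigma')(L(\sigma)\otimes L(\sigma'))\\ &- (1\otimes L(\sigma'))\,\varepsilon\hat r(\sigma-\sigma'+2i\varepsilon\kappa)(L(\sigma)\otimes 1)\\ &- (L(\sigma)\otimes 1)\,\varepsilon\hat r(\sigma-\sigma'-2i\varepsilon\kappa)(1\otimes L(\sigma')).\end{aligned} \tag{5.201}$$

Inserting the $\varepsilon$-expansion (5.214) into (5.201), we obtain in the lowest order $\varepsilon^2$ the correct result

$$\{j_L(\sigma)\stackrel{\otimes}{,}j_L(\sigma')\}_{WZ} = \pi\delta(\sigma-\sigma')[C,\,j_L(\sigma)\otimes 1 - 1\otimes j_L(\sigma')] + 2\pi\kappa C\,\partial_\sigma\delta(\sigma-\sigma'). \tag{5.185}$$

Here we have needed three formulae

$$8\pi i\,\partial_\sigma\delta(\sigma-\sigma') = \frac{1}{\sin^2\frac{1}{2}(\sigma-\sigma'+i0^+)} - \frac{1}{\sin^2\frac{1}{2}(\sigma-\sigma'-i0^+)}, \tag{5.216}$$

$$2\pi i\,\delta(\sigma-\sigma') = \mathrm{cotg}\,\tfrac{1}{2}(\sigma-\sigma'-i0^+) - \mathrm{cotg}\,\tfrac{1}{2}(\sigma-\sigma'), \tag{5.217}$$

$$2\pi i\,\delta(\sigma-\sigma') = \mathrm{cotg}\,\tfrac{1}{2}(\sigma-\sigma') - \mathrm{cotg}\,\tfrac{1}{2}(\sigma-\sigma'+i0^+). \tag{5.218}$$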
The relation (5.216) can be obtained by differentiating (5.215), and the remaining equalities (5.217) and (5.218) can be proved by using the Plemelj–Sokhotsky formula.

5.2.8. Quasitriangular exact solution

The simplest way to describe the classical solutions of the quasitriangular chiral WZW model consists in using the monodromic variables $m(\sigma)$ (cf. (3.103)) on the phase space $M_L^{WZ}$. It turns out that the time evolution is the same as in the nondeformed case, i.e.

$$[m(\sigma)](\tau) = m(\sigma-\tau). \tag{5.219}$$
In order to prove that, we have to combine the arguments of Secs. 3.1.3 and 3.2.6. First of all, the quasitriangular chiral master model on $\tilde G$ can be solved in the same way as its finite-dimensional counterpart (3.14) on $G_0$, with the result (in the $\tilde k$, $\tilde\phi$ variables on the affine model space $\tilde M_L$)

$$\tilde k(\tau) = \tilde k_0\exp\left(-\frac{\Upsilon(\tilde\phi_0)}{\kappa}\tau\right),\qquad \tilde\phi(\tau) = \tilde\phi_0. \tag{5.220}$$

Here the multiplication is taken in the sense of the group $\tilde G$. Now this solution can be projected on the (doubly) reduced phase space $M_L^{WZ}$ following step by step the argumentation between (3.121) and (3.124). The result is then (5.219).

Remark. It may seem astonishing that the set of classical solutions does not change under the q-deformation. We have seen the finite-dimensional version of this phenomenon already in Sec. 5.1.4. The solution of the puzzle is the same. It is the symplectic structure on the phase space that gets deformed in such a way that the $G$-action ceases to be Hamiltonian but becomes Poisson–Lie. This means that the natural dynamical variables of group theoretical origin will have modified Poisson brackets and, upon quantization, modified commutation relations. For instance, the field theoretical correlation functions change.

5.2.9. The left-right combination

Consider the topological direct product $M_L^{WZ}\times M_R^{WZ}$, where $M_L^{WZ} = LG_0\times A^1_+$ and $M_R^{WZ} = LG_0\times A^1_-$. Recall that $A^1_-\equiv -A^1_+$. The product symplectic structure on $M_L^{WZ}\times M_R^{WZ}$ is given by the symplectic form

$$\omega^{qWZ}_{L\times R} = \omega^{qWZ}_L + \omega^{qWZ}_R. \tag{5.221}$$

The symplectic form $\omega^{qWZ}_R$ differs from $\omega^{qWZ}_L$ only by the domain of definition of the variables $a^\mu$. The Hamiltonian on $M_L^{WZ}\times M_R^{WZ}$ is given by

$$H^{qWZ}_{L\times R}(k_L,k_R,a^\mu_L,a^\mu_R) = H^{qWZ}_L(k_L,a^\mu_L) + H^{qWZ}_R(k_R,a^\mu_R). \tag{5.222}$$

Now we perform a symplectic reduction by setting

$$a^\mu_L + a^\mu_R = 0. \tag{5.223}$$
We learn from the Poisson brackets (5.161) that the quantities $a^\mu_L + a^\mu_R$ are the moment maps generating the simultaneous right action of the Cartan generators $T^\mu$ on $M_L^{WZ}$ and $M_R^{WZ}$. The reduction makes sense only if the Hamiltonian $H^{qWZ}_{L\times R}$ is invariant with respect to this $\mathbb T$-action. But this is the case as we see from (1.20) and from the following chain of formulae

$$\begin{aligned}\widetilde{Dres}_{(\hat k e^{\eta_\nu T^\nu})}(e^{\hat\Lambda_\varepsilon(\hat\phi_\kappa)}) &\equiv \tilde b_L(\hat k e^{\eta_\nu T^\nu}e^{\hat\Lambda_\varepsilon(\hat\phi_\kappa)}) = \tilde b_L(\widetilde{Ad}_{(\hat k e^{\eta_\nu T^\nu})}e^{\hat\Lambda_\varepsilon(\hat\phi_\kappa)})\\ &= \tilde b_L(\widetilde{Ad}_{\hat k}\,e^{\hat\Lambda_\varepsilon(\hat\phi_\kappa)}) \equiv \widetilde{Dres}_{\hat k}(e^{\hat\Lambda_\varepsilon(\hat\phi_\kappa)}).\end{aligned} \tag{5.224}$$
Recall in this respect that (5.224) makes sense since $\mathbb T$ can be embedded as a subgroup into $\widehat{LG_0}$. We conclude that the reduced symplectic form and Hamiltonian live on the reduced phase space $((LG_0\times LG_0)/\mathbb T_{diag})\times A^1_+$. This is nothing but the phase space of the standard full left-right WZW model.

We shall not make more explicit the full left-right WZW symplectic structure. The corresponding formulae are complicated and not illuminating. In any case, the canonical quantization of the quasitriangular WZW model will proceed by quantization of its chiral components. This is fortunate, because all important Poisson brackets are written in terms of the collection of r-matrices. These r-matrices have already appeared in the literature in a different context (e.g. the KZB equation [7, 21]) and their quantum counterparts are often known, too [20]. Of course, the latter circumstance makes the quantization task more accessible.

6. Conclusions and Outlook

The most important result of this paper is the explicit description of the symplectic structure $\omega_L^{qWZ}$ and of the Hamiltonian $H_L^{qWZ}$ of the quasitriangular chiral WZW model. The q-deformation of the standard current algebra then emerged in a natural way. We have actually achieved more than this in the sense that we have built up a whole approach to q-deforming the WZW model and its derived products. In fact, we have no doubt that our construction should be easily generalizable to the gauged WZW models by performing an appropriate symplectic reduction induced by equating the non-Abelian moment maps to some fixed values. Finally, the supersymmetrization of the quasitriangular geodesical model should also lead to the SUSY version of the q-WZW theory. Among the other results obtained, we should mention the universal description of the WZW model in terms of the central biextensions. We can thus argue that WZW-like models can be constructed not only for the loop group case.
From the point of view of outlook, the most important open problem is the quantization of the quasitriangular WZW model. As in the standard case, the chiral part of the model should have the Hilbert space consisting of the set of quantized dressing orbits of the Kac–Moody group. Only the orbits with the integrable highest weight should be present (with the multiplicity 1). Those with integrable highest weight correspond to those points in the alcove for which the induced symplectic form on the dressing orbit is integral. The quantized dressing orbits should carry the
unitary representation of the q-Kac–Moody group. The q-vertex operators should be the q-Kac–Moody tensor operators fulfilling the quantum version of the relation (5.174). They should obey the quantized braiding relation of the type (5.153); the theory of quantum groupoids [46] should be relevant for this story. It is important that the Poisson brackets of the basic observables are of the r-matrix type. The explicit knowledge of the corresponding quantum R-matrices should considerably facilitate the quantization program.

Another related important task would be to find the q-free field representation of the model. Perhaps the results of [38] might be of use in the $sl_2$ case although a sort of analytic continuation in $\kappa$ is needed to arrive from their version of the q-current algebra to ours. We expect that further development of the ideas presented in this article can reveal interesting connections with results of other references, e.g. [24–26, 31].

An important question is also the status of the Virasoro generators. It is remarkable that the Virasoro group does act on the classical chiral phase space $M_L^{WZ}$ for every value of $q$. For $q = 1$ (or $\varepsilon = 0$), this action is Hamiltonian. This is not the case for a generic $q$, and it is an open problem whether the action is at least of Poisson–Lie type (probably in the combined Virasoro-Kac–Moody sense), or even of quasi-Poisson type.

Appendices

A.1. The loop group primer

In what follows, $G_0$ will always be a simple compact connected and simply connected Lie group. A group $G$ of smooth maps from the circle $S^1$ into $G_0$ is called the loop group and it is often denoted as $G = LG_0$. The group structure of $G$ is naturally given by the pointwise multiplication; the unit element $e$ is the constant map with value $e_0$, where $e_0$ is the unit element of $G_0$. The Lie algebra $\mathcal G$ of $G$ consists of smooth maps from $S^1$ into the Lie algebra $\mathcal G_0$ of $G_0$, again with the pointwise commutator.
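The invariant symmetric nondegenerate bilinear form on $\mathcal G$ is given by

$$(\xi,\eta)_{\mathcal G} = \frac{1}{2\pi}\int d\sigma\,(\xi(\sigma),\eta(\sigma))_{\mathcal G_0^{\mathbb C}}, \tag{A.1}$$

where $(\cdot,\cdot)_{\mathcal G_0^{\mathbb C}}$ is the standard Killing–Cartan form on $\mathcal G_0^{\mathbb C}$ normalized in such a way that the square of the length of the longest root is equal to two. For example, for the Lie algebra $su(N)$, $(\cdot,\cdot)_{\mathcal G_0^{\mathbb C}}$ is given by the trace taken in the fundamental representation. We shall identify the dual $\mathcal G^*$ of $\mathcal G$ with $\mathcal G$ itself via the bilinear form $(\cdot,\cdot)_{\mathcal G}$. We shall call the corresponding map $\Upsilon$. Thus

$$\langle\alpha,\xi\rangle = (\Upsilon(\alpha),\xi)_{\mathcal G},\qquad \alpha\in\mathcal G^*,\quad \xi\in\mathcal G. \tag{A.2}$$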
The construction of the standard central extension of loop groups goes as follows [33]. First one considers a group $DG_0$ of smooth maps from the unit disc Disc (in the complex plane) into $G_0$ with the usual pointwise multiplication. We can now define an extended group $\widehat{DG_0}$ whose elements are pairs $(f,\lambda)$, where $f\in DG_0$ and $\lambda\in U(1)$, and whose multiplication law reads

$$(f_1,\lambda_1)(f_2,\lambda_2) = (f_1f_2,\ \lambda_1\lambda_2\exp[2\pi i\gamma(f_1,f_2)]). \tag{A.3}$$

Here $\gamma$ is a real valued 2-cocycle on $DG_0$ given by

$$\gamma(f_1,f_2) = \frac{1}{8\pi^2}\int_{\rm Disc}(f_1^{-1}df_1\stackrel{\wedge}{,}df_2f_2^{-1})_{\mathcal G_0^{\mathbb C}}. \tag{A.4}$$
Consider now a subgroup $\partial G$ of $DG_0$ consisting of all smooth maps from Disc into $G_0$ such that their value on the boundary $\partial{\rm Disc} = S^1$ is the unit element $e_0$ of $G_0$. Any $g\in\partial G$ can be thought of as a map $g: S^2\to G_0$ by identifying the boundary $S^1$ of Disc with the north pole of $S^2$. It turns out that there is a homomorphism $\Theta:\partial G\to\widehat{DG_0}$ defined by

$$\Theta(g) = (g,\exp[-2\pi iC(g)]), \tag{A.5}$$

where

$$C(g) = \frac{1}{24\pi^2}\int_{\rm Ball}(dgg^{-1}\stackrel{\wedge}{,}dgg^{-1}\wedge dgg^{-1})_{\mathcal G_0^{\mathbb C}}. \tag{A.6}$$
Here Ball is the unit ball whose boundary is $S^2$ and we have extended the map $g: S^2\to G_0$ to a map $g: {\rm Ball}\to G_0$. It is not immediately obvious that the homomorphism $\Theta$ is correctly defined, or, in other words, that $\exp[-2\pi iC(g)]$ does not depend on the extension of $g$ to Ball. The standard argument for the independence of this term of the extension can be found, e.g. in [33]. The demonstration that $\Theta$ is indeed a homomorphism is based on the Polyakov–Wiegmann formula [39] which asserts that

$$C(g_1g_2) = C(g_1) + C(g_2) - \gamma(g_1,g_2). \tag{A.7}$$

The fact that the image $\Theta(\partial G)$ is a normal subgroup in $\widehat{DG_0}$ follows from the identity

$$C(fgf^{-1}) + \gamma(f,g) + \gamma(fg,f^{-1}) = C(g). \tag{A.8}$$

The latter identity is also a direct consequence of the Polyakov–Wiegmann formula.

The standard central extension $\hat G$ of the group $G = LG_0$ is defined as the factor group $\widehat{DG_0}/\Theta(\partial G)$. This group is a (nontrivial) circle bundle over the base space $LG_0 = \{g: S^1\to G_0\,|\,g\ \text{smooth}\}$. The projection $\pi$ is $(g,\lambda)\to g|_{S^1}$ and the center of $\hat G$ is represented by the pairs $(1,\lambda)\in\widehat{DG_0}$. The projection homomorphism from $\widehat{DG_0}$ onto $\hat G$ will be referred to as $\wp$. Now we are going to calculate the following expression

$$\frac{d}{ds}\Big|_{s=0}\hat g\,e^{s\hat x}\hat g^{-1}e^{-s\hat x}\in\hat{\mathcal G}, \tag{A.9}$$
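where $\hat g\in\hat G$ and $\hat x\in{\rm Lie}(\hat G)\equiv\hat{\mathcal G}$. From this computation, we shall extract two pieces of information: (1) what is the commutator in $\hat{\mathcal G}$; (2) what is the explicit form of the adjoint action of $\hat G$ on $\hat{\mathcal G}$.

Let us choose two representatives $\wp^{-1}\hat g$ and $e^{s\wp_*^{-1}\hat x}$ in $\widehat{DG_0}$ of the classes $\hat g$ and $e^{s\hat x}$ in $\hat G$. It is clear that

$$\hat g\,e^{s\hat x}\hat g^{-1}e^{-s\hat x} = \wp\left((\wp^{-1}\hat g)\,e^{s\wp_*^{-1}\hat x}(\wp^{-1}\hat g)^{-1}e^{-s\wp_*^{-1}\hat x}\right). \tag{A.10}$$

We shall therefore first calculate an expression

$$\frac{d}{ds}\Big|_{s=0}(\Gamma,\lambda)(e^{s\xi},e^{s\alpha})(\Gamma,\lambda)^{-1}(e^{-s\xi},e^{-s\alpha}), \tag{A.11}$$

where

$$(\Gamma,\lambda) = \wp^{-1}\hat g;\qquad (\xi,\alpha) = \wp_*^{-1}\hat x. \tag{A.12}$$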
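Using (A.3), we calculate

$$\begin{aligned}&\frac{d}{ds}\Big|_{s=0}(\Gamma,\lambda)(e^{s\xi},e^{s\alpha})(\Gamma,\lambda)^{-1}(e^{-s\xi},e^{-s\alpha}) = \frac{d}{ds}\Big|_{s=0}(\Gamma,1)(e^{s\xi},1)(\Gamma,1)^{-1}(e^{-s\xi},1)\\ &\quad= \frac{d}{ds}\Big|_{s=0}\left(\Gamma e^{s\xi},\exp\left[\frac{is}{4\pi}\int_{\rm Disc}(\Gamma^{-1}d\Gamma\stackrel{\wedge}{,}d\xi)_{\mathcal G_0^{\mathbb C}}\right]\right)\left(\Gamma^{-1}e^{-s\xi},\exp\left[\frac{is}{4\pi}\int_{\rm Disc}(d\Gamma\Gamma^{-1}\stackrel{\wedge}{,}d\xi)_{\mathcal G_0^{\mathbb C}}\right]\right)\\ &\quad= \frac{d}{ds}\Big|_{s=0}\left(\Gamma e^{s\xi}\Gamma^{-1}e^{-s\xi},\exp\left[\frac{is}{4\pi}\int_{\rm Disc}\left\{2(\Gamma^{-1}d\Gamma\stackrel{\wedge}{,}d\xi)_{\mathcal G_0^{\mathbb C}} + ([\xi,\Gamma^{-1}d\Gamma]\stackrel{\wedge}{,}\Gamma^{-1}d\Gamma)_{\mathcal G_0^{\mathbb C}}\right\}\right]\right)\\ &\quad= \frac{d}{ds}\Big|_{s=0}\left(\Gamma e^{s\xi}\Gamma^{-1}e^{-s\xi},\exp\left[-\frac{is}{2\pi}\int_{\rm Disc}d(\Gamma^{-1}d\Gamma,\xi)_{\mathcal G_0^{\mathbb C}}\right]\right)\\ &\quad= \left(\Gamma\xi\Gamma^{-1}-\xi,\ -\frac{i}{2\pi}\int_{S^1=\partial{\rm Disc}}(\Gamma^{-1}d\Gamma,\xi)_{\mathcal G_0^{\mathbb C}}\right).\end{aligned} \tag{A.13}$$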
From (A.13), one may derive the Lie algebra commutator in ${\rm Lie}(\widehat{DG_0}) = D\mathcal G_0 + i\mathbb R$. Indeed, by setting $\Gamma = e^{t\eta}$ and differentiating with respect to $t$ at $t = 0$, one obtains

$$[(\eta,i\alpha),(\xi,i\beta)] = \left([\eta,\xi],\ \frac{i}{2\pi}\int_{S^1\equiv\partial{\rm Disc}}(\eta,d\xi)_{\mathcal G_0^{\mathbb C}}\right),\qquad \alpha,\beta\in\mathbb R. \tag{A.14}$$

Finally, the commutator in the Lie algebra $\hat{\mathcal G} = {\rm Lie}(\hat G)$ is given by the same formula as (A.14) but $\eta$ and $\xi$ are to be considered as elements of $L\mathcal G_0$ rather than those of $D\mathcal G_0$. We see also that the map $\iota:\mathcal G\to\hat{\mathcal G}$ (cf. Sec. 2.1) is simply given by

$$\iota(\xi) = (\xi,0). \tag{A.15}$$

Although it should be already clear from (A.14) what the formula for the cocycle $\rho$ is, we nevertheless write it down explicitly:

$$\rho(\eta,\xi) = \frac{1}{2\pi}\int_{S^1}(\eta,d\xi)_{\mathcal G_0^{\mathbb C}}. \tag{A.16}$$
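The cocycle property of (A.16) can be verified in a small example. The sketch below is an illustration under stated assumptions, not taken from the text: it takes $\mathcal G_0 = su(2)$ with structure constants $\epsilon_{ijk}$, normalizes the invariant form to $(T_i,T_j) = \delta_{ij}$ (the overall sign and normalization do not matter for the check), and works with the complexified Fourier modes $T_ie^{im\sigma}$, for which (A.16) gives $\rho(T_ie^{im\sigma},T_je^{in\sigma}) = in\,\delta_{ij}\delta_{m+n,0}$. It then checks antisymmetry and the 2-cocycle identity $\rho([a,b],c)+\rho([b,c],a)+\rho([c,a],b) = 0$. All function names are hypothetical.

```python
from itertools import product

def eps3(i, j, k):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

# A basis element (i, m) stands for the loop T_i * exp(1j * m * sigma).
def bracket(a, b):
    """Pointwise commutator, returned as {basis element: coefficient}."""
    (i, m), (j, n) = a, b
    return {(k, m + n): eps3(i, j, k) for k in range(3) if eps3(i, j, k) != 0}

def rho(a, b):
    """rho(eta, xi) = (1/2pi) * integral over S^1 of (eta, d xi), evaluated on modes."""
    (i, m), (j, n) = a, b
    return 1j * n if (i == j and m + n == 0) else 0j

def rho_lin(vec, b):
    """Extend rho linearly in its first argument."""
    return sum(c * rho(e, b) for e, c in vec.items())

def cocycle_defect(a, b, c):
    """rho([a,b],c) + rho([b,c],a) + rho([c,a],b); vanishes for a 2-cocycle."""
    return (rho_lin(bracket(a, b), c)
            + rho_lin(bracket(b, c), a)
            + rho_lin(bracket(c, a), b))

modes = [(i, m) for i in range(3) for m in range(-2, 3)]
print(all(rho(a, b) + rho(b, a) == 0 for a, b in product(modes, repeat=2)))
print(all(cocycle_defect(a, b, c) == 0 for a, b, c in product(modes, repeat=3)))
```

The identity holds mode by mode because the three cyclic terms add up to $i\epsilon_{ijk}(m+n+p)\delta_{m+n+p,0}$, which vanishes identically.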
From (A.13), we obtain also the formula for the adjoint action

$$\hat g(\xi,i\beta)\hat g^{-1} = \left(g\xi g^{-1},\ i\beta - \frac{i}{2\pi}\int_{S^1}(g^{-1}dg,\xi)_{\mathcal G_0^{\mathbb C}}\right), \tag{A.17}$$

where $g = \pi(\hat g)$.
A.2. Cotangent bundle of a group manifold

Consider a (possibly infinite-dimensional) connected Lie group $G$ and its cotangent bundle $T^*G$. The points $K$ of $T^*G$ are couples $(P_K,F_K)$, where $P_K$ is a point in $G$ and $F_K$ is a differential 1-form at the point $P_K$; in other words, $F_K\in T^*_{P_K}G$. We shall equip $T^*G$ with the standard group structure by introducing the following product $QK$ of two elements $Q,K\in T^*G$:

$$P_{QK} = P_QP_K;\qquad F_{QK} = R^*_{P_K^{-1}}F_Q + L^*_{P_Q^{-1}}F_K. \tag{A.18}$$

Here $R^*$ and $L^*$ denote pullbacks with respect to the right and left translations by elements of $G$, respectively. The inverse element $K^{-1}$ is given by

$$P_{K^{-1}} = P_K^{-1},\qquad F_{K^{-1}} = -R^*_{P_K}L^*_{P_K}F_K. \tag{A.19}$$

The unit element $E$ fulfills

$$P_E = e,\qquad F_E = 0, \tag{A.20}$$

where $e$ is the unit element of $G$.

Remarks. (1) The projection on the base $P: T^*G\to G$ defined by $P(K) = P_K$ is a morphism of groups according to (A.18). (2) Upon the trivialization of the cotangent bundle $T^*G$ by right (or left) translations, the group law (A.18) turns out to correspond to the semidirect product of $G$ and its coalgebra $\mathcal G^*$. The latter is viewed as the Abelian group underlying the vector space $\mathcal G^*$ and $G$ acts on $\mathcal G^*$ by means of the coadjoint action. However, in what follows, we shall rather use the formula (A.18) because the trivialization breaks the natural left-right symmetry of the product.

We shall denote the Lie algebra of the group $T^*G$ as $\mathcal D$. Clearly, $\mathcal D$ can be written as a semi-direct sum of Lie algebras

$$\mathcal D = \mathcal G + \mathcal G^*, \tag{A.21}$$
where $\mathcal G$ acts on $\mathcal G^*$ in the coadjoint way and $\mathcal G^*$ is the commutative Lie subalgebra of $\mathcal D$. It turns out that there exists a non-degenerate symmetric bilinear form on $\mathcal D$ defined by

$$(x+x^*,y+y^*)_{\mathcal D} = \langle x^*,y\rangle + \langle y^*,x\rangle,\qquad x,y\in\mathcal G,\quad x^*,y^*\in\mathcal G^*. \tag{A.22}$$

Here $\langle\cdot,\cdot\rangle$ denotes the canonical pairing between the Lie coalgebra $\mathcal G^*$ and the Lie algebra $\mathcal G$. It is a matter of straightforward calculation to see that the form $(\cdot,\cdot)_{\mathcal D}$ is moreover invariant and the subalgebras $\mathcal G$ and $\mathcal G^*$ are both isotropic with respect to this form. Expressed in formulae,

$$(\mathcal G,\mathcal G)_{\mathcal D} = (\mathcal G^*,\mathcal G^*)_{\mathcal D} = 0,\qquad ([X,Y],Z)_{\mathcal D} + (Y,[X,Z])_{\mathcal D} = 0, \tag{A.23}$$

where $X,Y,Z\in\mathcal D$.

The cotangent bundle of any manifold $M$ is equipped with the canonical symplectic structure. The corresponding symplectic 2-form $\omega$ can be written as

$$\omega = d\theta, \tag{A.24}$$

where $\theta$ is a 1-form on $T^*M$ called the symplectic potential. It is defined in a point $K = (P_K,F_K)\in T^*M$ as

$$\theta = P^*F_K. \tag{A.25}$$
In words, $\theta$ is the pullback of the form $F_K$ living in $P_K\in M$ by the projection map $P: K\to P_K$.

It is now convenient to introduce a set of differential operators (vector fields) on $T^*G$. Define (as in [42])

$$\nabla^L: C^\infty(T^*G)\to C^\infty(T^*G)\otimes\mathcal D^*,\qquad \nabla^R: C^\infty(T^*G)\to C^\infty(T^*G)\otimes\mathcal D^*$$

as follows

$$\langle\nabla^L\phi,\alpha\rangle(K) = \frac{d}{ds}\Big|_{s=0}\phi(e^{s\alpha}K), \tag{A.26}$$

$$\langle\nabla^R\phi,\alpha\rangle(K) = \frac{d}{ds}\Big|_{s=0}\phi(Ke^{s\alpha}). \tag{A.27}$$

Here $\alpha\in\mathcal D$, $K\in T^*G$ and $\phi\in C^\infty(T^*G)$. Define also a linear operator $R:\mathcal D\to\mathcal D$ as follows

$$R(x+x^*) = x^*-x,\qquad x\in\mathcal G,\quad x^*\in\mathcal G^*. \tag{A.28}$$

Parenthetically, this operator is known as the classical R-matrix and the Lie algebra $\mathcal D$ is the factorizable Baxter–Lie algebra in the sense of [42]. We have now the following lemma.

Lemma A.1 (Semenov-Tian-Shansky [43]). The Poisson bracket corresponding to the symplectic form (A.24) on $T^*G$ can be written as follows

$$\{\phi,\psi\}_{T^*G} = \frac{1}{2}(\nabla^L\phi,R^*\nabla^L\psi)_{\mathcal D^*} + \frac{1}{2}(\nabla^R\phi,R^*\nabla^R\psi)_{\mathcal D^*}. \tag{A.29}$$
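Here $(\cdot,\cdot)_{\mathcal D^*}$ is the bilinear form on the dual of $\mathcal D$ induced by the (nondegenerate) bilinear form $(\cdot,\cdot)_{\mathcal D}$ and $R^*:\mathcal D^*\to\mathcal D^*$ is the map dual to $R$. It might be illuminating to write the bracket (A.29) in some basis $T^i, t_i$; $i = 1,\dots,\dim G$ of $\mathcal D$ where the $T^i$'s form the basis of $\mathcal G$ and the $t_i$'s the corresponding dual basis of $\mathcal G^*$. We obtain

$$\begin{aligned}\{\phi,\psi\}_{T^*G} ={}& \frac{1}{2}\langle\nabla^L\phi,T^i\rangle\langle\nabla^L\psi,t_i\rangle - \frac{1}{2}\langle\nabla^L\phi,t_i\rangle\langle\nabla^L\psi,T^i\rangle\\ &+ \frac{1}{2}\langle\nabla^R\phi,T^i\rangle\langle\nabla^R\psi,t_i\rangle - \frac{1}{2}\langle\nabla^R\phi,t_i\rangle\langle\nabla^R\psi,T^i\rangle,\end{aligned} \tag{A.30}$$

where the standard Einstein summation convention is used.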
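The algebraic ingredients entering the bracket (A.29), namely the form (A.22), its invariance and the isotropy of both subalgebras (A.23), and the operator $R$ of (A.28), can be checked in the simplest example. The following Python sketch is an illustration only and rests on assumptions not made in the text: it takes $\mathcal G = so(3)\cong\mathbb R^3$ with $[x,y] = x\times y$, identifies $\mathcal G^*$ with $\mathbb R^3$ via the dot product (so that the coadjoint action reads $ad^*_x b = x\times b$), and it tests the modified classical Yang–Baxter equation in the normalization $[RX,RY] = R([RX,Y]+[X,RY]) - [X,Y]$, which is one common way of expressing that $R$ is a classical R-matrix of factorizable type. All function names are hypothetical.

```python
import random

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))

# An element of D is a pair (x, a) with x in so(3) and a in so(3)^*.
def bracket(X, Y):
    """Semidirect-sum commutator: [(x, a), (y, b)] = ([x, y], ad*_x b - ad*_y a)."""
    (x, a), (y, b) = X, Y
    return (cross(x, y), sub(cross(x, b), cross(y, a)))

def form(X, Y):
    """Bilinear form (A.22): <x*, y> + <y*, x>."""
    (x, a), (y, b) = X, Y
    return dot(a, y) + dot(b, x)

def R(X):
    """Operator (A.28): R(x + x*) = x* - x."""
    x, a = X
    return (tuple(-c for c in x), a)

def plus(X, Y):
    return (add(X[0], Y[0]), add(X[1], Y[1]))

def minus(X, Y):
    return (sub(X[0], Y[0]), sub(X[1], Y[1]))

def invariance_defect(X, Y, Z):
    """([X, Y], Z)_D + (Y, [X, Z])_D, which should vanish by (A.23)."""
    return form(bracket(X, Y), Z) + form(Y, bracket(X, Z))

def yang_baxter_defect(X, Y):
    """[RX, RY] - R([RX, Y] + [X, RY]) + [X, Y] (assumed normalization)."""
    lhs = bracket(R(X), R(Y))
    rhs = minus(R(plus(bracket(R(X), Y), bracket(X, R(Y)))), bracket(X, Y))
    return minus(lhs, rhs)

def random_element(rng):
    return (tuple(rng.randint(-5, 5) for _ in range(3)),
            tuple(rng.randint(-5, 5) for _ in range(3)))

rng = random.Random(0)
samples = [random_element(rng) for _ in range(30)]
ZERO = ((0, 0, 0), (0, 0, 0))
print(all(invariance_defect(X, Y, Z) == 0
          for X in samples[:10] for Y in samples[10:20] for Z in samples[20:]))
print(all(yang_baxter_defect(X, Y) == ZERO for X in samples for Y in samples))
```

Integer components are used so that both identities can be tested exactly rather than up to rounding.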
Remark. It is important to note that the canonical Poisson bracket on $T^*G$ can be written entirely in terms of the Lie group structure of $T^*G$. This way of writing this bracket stands at the basis of our Poisson–Lie generalization of the standard WZW story.

Proof. First we realize that the left (right) trivialization of the cotangent bundle $T^*G$ gives a diffeomorphism between $T^*G$ and the direct product of two manifolds $G$ and $\mathcal G^*$. In other words, there exist two global decompositions $T^*G = G\mathcal G^* = \mathcal G^*G$, where $\mathcal G^*$ is the fiber of the cotangent bundle at the unit element $e$ of the group $G$. Thus we may write for each $K\in T^*G$

$$K = (g_L(K),0)(e,\beta_R(K)) = (e,\beta_L(K))(g_R(K),0), \tag{A.31}$$

where

$$g_L(K) = g_R(K) = P_K, \tag{A.32}$$

$$\beta_L(K) = R^*_{P_K}F_K,\qquad \beta_R(K) = L^*_{P_K}F_K. \tag{A.33}$$

Here $L^*_{P_K}$ is the pullback map of the differential forms on $G$ with respect to the left translation by the element $P_K$ and similarly $R^*_{P_K}$ is the right pullback. Instead of the somewhat cumbersome expressions (A.31), we shall rather write

$$K = g_L(K)\beta_R(K) = \beta_L(K)g_R(K). \tag{A.34}$$

It is then clear that the functions on $T^*G$ of the form $\Phi(P_K)$ and $\Psi(\beta_L(K))$ generate the whole algebra of smooth functions on $T^*G$, hence it is enough to compute the Poisson brackets between them. Even more specially, instead of arbitrary functions $\Psi$ on $\mathcal G^*$, it is sufficient to consider linear ones, i.e. the functions $\langle\beta_L,x\rangle$, where $x\in\mathcal G$.

For the case of the group manifold, there exists a convenient expression for the symplectic potential $\theta$ in terms of the invariant Maurer–Cartan forms. Recall their definitions: in what follows, the expression $\lambda_G$ ($\rho_G$) will denote the left (right) invariant $\mathcal G$-valued Maurer–Cartan form on the group $G$ defined by

$$\lambda_G(X_g) = L_{g^{-1}*}X_g,\qquad \rho_G(X_g) = R_{g^{-1}*}X_g,\qquad X_g\in T_gG. \tag{A.35}$$
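Note that the forms $\lambda_G$ and $\rho_G$ are often written as

$$\lambda_G = g^{-1}dg,\qquad \rho_G = dgg^{-1}. \tag{A.36}$$

The symplectic potential $\theta$ can be then simply expressed in the "coordinates" $(\beta,g)$ as

$$\theta = \langle\beta_L,dgg^{-1}\rangle. \tag{A.37}$$

Here we have abandoned the subscript $R$ on $g$, since anyway $g_R = g_L\equiv g$. By using the formula $d(dgg^{-1}) = dgg^{-1}\wedge dgg^{-1}$ for the exterior derivative of the Maurer–Cartan form $\rho_G$, we have

$$\omega = \langle d\beta_L\wedge dgg^{-1}\rangle + \langle\beta_L,dgg^{-1}\wedge dgg^{-1}\rangle. \tag{A.38}$$

Now pick up a basis $T^i\in\mathcal G$ and its dual basis $t_i\in\mathcal G^*$. It is also convenient to use a short-hand notation $\langle\beta_L,T^i\rangle\equiv\beta^i$. Then the form $\omega$ can be written as

$$\omega = d\beta^i\wedge R^*_{g^{-1}}t_i + \frac{1}{2}\beta^i f_i^{\ mn}R^*_{g^{-1}}t_m\wedge R^*_{g^{-1}}t_n, \tag{A.39}$$

where $f_i^{\ mn}$ are the structure constants of the Lie algebra $\mathcal G$. This expression can be readily inverted to give the corresponding Poisson tensor $\Pi$:

$$\Pi = \frac{1}{2}\beta^m f_m^{\ ij}\frac{\partial}{\partial\beta^i}\wedge\frac{\partial}{\partial\beta^j} - \frac{\partial}{\partial\beta^i}\wedge R_{g*}T^i. \tag{A.40}$$

Since we have that $\langle\nabla^L_G,x\rangle = R_{g*}x$ for $x\in\mathcal G$, we obtain from $\Pi$ the following Poisson brackets

$$\{\Phi_1(g),\Phi_2(g)\} = 0; \tag{A.41}$$

$$\{\Phi(g),\langle\beta_L,x\rangle\} = \frac{d}{ds}\Big|_{s=0}\Phi(e^{sx}g)\equiv\langle\nabla^L_G\Phi,x\rangle; \tag{A.42}$$

$$\{\langle\beta_L,x\rangle,\langle\beta_L,y\rangle\} = \langle\beta_L,[x,y]\rangle. \tag{A.43}$$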
We are going to prove now that the same set of Poisson brackets can be obtained directly from the Semenov-Tian-Shansky formula (A.29). For two functions $\Phi_{1,2}: G\to\mathbb R$, we calculate

$$\{\Phi_1(P_K),\Phi_2(P_K)\}_{T^*G} = 0. \tag{A.44}$$

This follows from the following fact

$$\langle\nabla^L\Phi_j(P_K),x^*\rangle = \frac{d}{ds}\Big|_{s=0}\Phi_j(P_{e^{sx^*}K}) = \frac{d}{ds}\Big|_{s=0}\Phi_j(P_K) = 0,\qquad j = 1,2 \tag{A.45}$$

and from its right analogue. Here $x^*$ is an element of $\mathcal G^*$ viewed as an element of $\mathcal D$. Typically, $x^*$ is $t_i$ in (A.45).
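The bracket of the type $\{\langle\beta_L,x\rangle,\langle\beta_L,y\rangle\}_{T^*G}$ for $x,y\in\mathcal G$ is more involved. In order to compute it, we need to prove the following formulae

$$\langle\nabla^L\langle\beta_R(K),x\rangle,y\rangle = \langle\nabla^R\langle\beta_L(K),x\rangle,y\rangle = 0; \tag{A.46}$$

$$\langle\nabla^L\langle\beta_L(K),x\rangle,y^*\rangle = \langle\nabla^R\langle\beta_R(K),x\rangle,y^*\rangle = \langle y^*,x\rangle; \tag{A.47}$$

$$\langle\nabla^L\langle\beta_R(K),x\rangle,y^*\rangle = \langle y^*,Ad_{P_K}x\rangle; \tag{A.48}$$

$$\langle\nabla^R\langle\beta_L(K),x\rangle,y^*\rangle = \langle y^*,Ad_{P_K^{-1}}x\rangle; \tag{A.49}$$

$$\langle\nabla^L\langle\beta_L(K),x\rangle,y\rangle = \langle\beta_L(K),[x,y]\rangle; \tag{A.50}$$

$$\langle\nabla^R\langle\beta_R(K),x\rangle,y\rangle = -\langle\beta_R(K),[x,y]\rangle, \tag{A.51}$$

where $x,y\in\mathcal G$ and $x^*,y^*\in\mathcal G^*$. We prove only the last formula; the others can be proved in full analogy. We have

$$\begin{aligned}\langle\nabla^R\langle\beta_R(K),x\rangle,y\rangle &= \frac{d}{ds}\Big|_{s=0}\langle\beta_R(Ke^{sy}),x\rangle = \frac{d}{ds}\Big|_{s=0}\langle L^*_{P_{(Ke^{sy})}}F_{Ke^{sy}},x\rangle\\ &= \frac{d}{ds}\Big|_{s=0}\langle L^*_{e^{sy}}L^*_{P_K}R^*_{e^{-sy}}F_K,x\rangle = \frac{d}{ds}\Big|_{s=0}\langle\beta_R(K),L_{e^{sy}*}R_{e^{-sy}*}x\rangle\\ &= -\langle\beta_R(K),[x,y]\rangle.\end{aligned} \tag{A.52}$$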
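The last step of (A.52) uses the fact that $L_{e^{sy}*}R_{e^{-sy}*}x = Ad_{e^{sy}}x$, whose derivative at $s = 0$ is $[y,x] = -[x,y]$. A minimal numerical sketch of this step is given below. It is an illustration under an assumption not made in the text: $\mathcal G = so(3)$ is identified with $\mathbb R^3$ so that $[x,y] = x\times y$ and $Ad_{e^{v}}$ acts as the rotation about the axis $v$ by the angle $|v|$ (Rodrigues formula). The names `rotate` and `ad_derivative` are hypothetical.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def rotate(v, x):
    """Ad_{exp(v)} x for so(3) ~ R^3: rotate x about the axis v by the angle |v|."""
    theta = math.sqrt(sum(c * c for c in v))
    if theta == 0.0:
        return tuple(x)
    n = tuple(c / theta for c in v)
    n_dot_x = sum(p * q for p, q in zip(n, x))
    n_cross_x = cross(n, x)
    return tuple(
        x[i] * math.cos(theta)
        + n_cross_x[i] * math.sin(theta)
        + n[i] * n_dot_x * (1 - math.cos(theta))
        for i in range(3)
    )

def ad_derivative(x, y, h=1e-5):
    """Central finite difference for d/ds Ad_{exp(s y)} x at s = 0."""
    forward = rotate(tuple(h * c for c in y), x)
    backward = rotate(tuple(-h * c for c in y), x)
    return tuple((p - m) / (2 * h) for p, m in zip(forward, backward))

x = (1.0, -2.0, 0.5)
y = (0.3, 0.7, -1.1)
numeric = ad_derivative(x, y)
minus_commutator = cross(y, x)  # -[x, y] = [y, x] = y × x in this identification
print(max(abs(a - b) for a, b in zip(numeric, minus_commutator)))
```

The printed number is the largest componentwise deviation between the finite-difference derivative and $-[x,y]$; it should be small compared with the size of the vectors involved.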
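Now with the help of the formulae (A.46)–(A.51), we calculate directly

$$\{\langle\beta_L(K),x\rangle,\langle\beta_L(K),y\rangle\}_{T^*G} = \langle\beta_L(K),[x,y]\rangle. \tag{A.53}$$

The remaining bracket between $\Phi: G\to\mathbb R$ and $\langle\beta_L,x\rangle$ can be directly evaluated again with the help of the formulae (A.46)–(A.51):

$$\begin{aligned}\{\Phi(P_K),\langle\beta_L(K),x\rangle\}_{T^*G} &= \frac{1}{2}\langle\nabla^L_G\Phi,T^i\rangle\langle x,t_i\rangle + \frac{1}{2}\langle\nabla^R_G\Phi,T^i\rangle\langle t_i,Ad_{P_K^{-1}}x\rangle\\ &= \frac{d}{ds}\Big|_{s=0}\Phi(e^{sx}P_K)\equiv\langle\nabla^L_G\Phi,x\rangle.\end{aligned} \tag{A.54}$$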
Here $\nabla^R_G$ is the map from $C^\infty(G)$ into $C^\infty(G)\otimes\mathcal G^*$ (cf. (A.26), (A.27)). The reader has certainly noticed that if the operators $\nabla^L$ and $\nabla^R$ appear without an index specifying the group, it means that they act on the double $D$. Otherwise we indicate the group as above: $\nabla^L_G$ or $\nabla^R_G$. We recognize in the Poisson brackets (A.44), (A.53) and (A.54) the brackets (A.41), (A.43) and (A.42) obtained by inverting the symplectic form.

For completeness, we list the brackets involving the right "currents" $\langle\beta_R(K),x\rangle$:

$$\{\langle\beta_R(K),x\rangle,\langle\beta_R(K),y\rangle\}_{T^*G} = -\langle\beta_R(K),[x,y]\rangle; \tag{A.55}$$

$$\{\langle\beta_R(K),x\rangle,\langle\beta_L(K),y\rangle\}_{T^*G} = 0; \tag{A.56}$$

$$\{\Phi(P_K),\langle\beta_R(K),x\rangle\} = \langle\nabla^R_G\Phi,x\rangle. \tag{A.57}$$

The lemma is proved.
A.3. The symplectic reduction in the dual language

The function $\langle\hat\beta_R(\hat K),\hat T^\infty\rangle = \langle\hat\beta_L(\hat K),\hat T^\infty\rangle$ defined on $T^*\hat G$ is the moment map that generates the central circle action on $T^*\hat G$. We can see it from (A.30):

$$\{\phi,\langle\hat\beta_R,\hat T^\infty\rangle\}_{T^*\hat G} = \langle\nabla^R\phi,\hat T^\infty\rangle\equiv\frac{d}{ds}\Big|_{s=0}\phi(\hat Ke^{s\hat T^\infty}) = \langle\nabla^L\phi,\hat T^\infty\rangle \tag{A.58}$$

for every function $\phi$ on $T^*\hat G$. The symplectic reduction of Sec. 2.2.3 can be performed also as follows: first fix a submanifold $M_\kappa(\hat G)$ of $T^*\hat G$ consisting of those points where the moment map $\langle\hat\beta_R,\hat T^\infty\rangle$ acquires the fixed constant value $\kappa$. Note that the same constant $\kappa$ appears in the definition of the Hamiltonian $\hat H$ given by (2.41). Now the central circle action preserves the submanifold $M_\kappa(\hat G)$, hence we can consider the space of its orbits $M_\kappa(\hat G)/U(1)$. In topologically good cases, the latter is a manifold and, in fact, it is the reduced phase space.

We can easily calculate the reduced Poisson bracket of functions $\eta$ and $\rho$ living on the reduced symplectic manifold $M_\kappa(\hat G)/U(1)$. First we take their pullbacks $\Pi^*\eta$ and $\Pi^*\rho$ with respect to the map $\Pi: M_\kappa(\hat G)\to M_\kappa(\hat G)/U(1)$ that associates to a point in $M_\kappa(\hat G)$ the corresponding central circle orbit. Those pullbacks are functions living on $M_\kappa(\hat G)$. We extend them to the whole unreduced manifold $T^*\hat G$ in such a way that they are invariant with respect to the central circle action on $T^*\hat G$. We compute the unreduced Poisson bracket of the extended functions and we restrict the result to $M_\kappa(\hat G)$. The function on $M_\kappa(\hat G)$ thus obtained is clearly invariant with respect to the central circle action (this follows from the Jacobi identity for the Poisson bracket); or, in other words, it is constant on the orbits. It is then a pullback of some function living on $M_\kappa(\hat G)/U(1)$. The latter is nothing but the reduced Poisson bracket $\{\eta,\rho\}_{red}$.

The mechanism of the symplectic reduction is perhaps even more transparent in the dual language liked by noncommutative geometers.
The role of the manifold $T^*\hat G$ is played by the algebra $A$ of smooth functions on $T^*\hat G$. The algebra of functions on the reduced phase space $M_\kappa(\hat G)/U(1)$ can be obtained in two steps. First one considers the subalgebra ${\rm Inv}(A)\subset A$ consisting of functions in $A$ whose unreduced Poisson bracket with the moment map $\langle\hat\beta_L,\hat T^\infty\rangle$ vanishes. There is a distinguished ideal $I_\kappa(A)$ in ${\rm Inv}(A)$ consisting of the functions of the form ${\rm Inv}(A)(\langle\hat\beta_L,\hat T^\infty\rangle-\kappa)$. Factorizing ${\rm Inv}(A)$ by its ideal $I_\kappa(A)$ gives the algebra of functions on the reduced symplectic manifold $M_\kappa(\hat G)/U(1)$. The reduced Poisson bracket of $\eta$ and $\rho$ as above can be computed by choosing any representative of the classes $\eta$ and $\rho$ in ${\rm Inv}(A)$ and by computing the unreduced Poisson bracket of those representatives. The last step consists in taking the class of the result.

Of course, the symplectic reduction of some dynamical system is a consistent procedure if the Hamiltonian of the unreduced system (Poisson) commutes with the moment map. In this case, the Hamiltonian is an element of ${\rm Inv}(A)$ and as such it gives rise to some function on the reduced phase space. This function is called the Hamiltonian of the reduced system. In our case, we have to show that the Hamiltonian

$$\hat H(\hat K) = -\frac{1}{2\kappa}(\iota^*(\hat\beta_L),\iota^*(\hat\beta_L))_{\mathcal G^*} - \frac{1}{2\kappa}(\iota^*(\hat\beta_R),\iota^*(\hat\beta_R))_{\mathcal G^*} \tag{2.41}$$

is an invariant function with respect to the central circle action on $T^*\hat G$. It is easy to see this since we have from (A.18) and (A.33)

$$\begin{aligned}\hat\beta_R(e^{s\hat T^\infty}\hat K) &= L^*_{\hat P_{(e^{s\hat T^\infty}\hat K)}}\hat F_{(e^{s\hat T^\infty}\hat K)} = L^*_{\hat P_{\hat K}}L^*_{e^{s\hat T^\infty}}\hat F_{(e^{s\hat T^\infty}\hat K)}\\ &= L^*_{\hat P_{\hat K}}L^*_{e^{s\hat T^\infty}}L^*_{e^{-s\hat T^\infty}}\hat F_{\hat K} = L^*_{\hat P_{\hat K}}\hat F_{\hat K} = \hat\beta_R(\hat K).\end{aligned} \tag{A.59}$$
In the same way, we may check the invariance of $\hat\beta_L$ and hence of the whole Hamiltonian $\hat H$.

We are going to show that the reduced symplectic manifold $M_\kappa(\hat G)/U(1)$ is indeed diffeomorphic to the cotangent bundle $T^*G$ of the non-extended group $G$. On the other hand, the reduced symplectic structure does not coincide with the canonical symplectic structure on $T^*G$ (unless $\kappa = 0$).

A.3.1. The map between $M_\kappa(\hat G)/U(1)$ and $T^*G$

There is a distinguished subgroup of $T^*\hat G$ that we shall denote $M_0(\hat G)$. It is formed by those elements $\hat K$ of $T^*\hat G$ that satisfy

$$\hat F_{\hat K}(L_{\hat P_{\hat K}*}\hat T^\infty) = 0 \tag{A.60}$$

or, equivalently,

$$\langle\hat\beta_R(\hat K),\hat T^\infty\rangle = 0. \tag{A.61}$$

Here the hats indicate that we are dealing with the extended group $\hat G$. The map $\hat\beta_R: T^*\hat G\to\hat{\mathcal G}^*$ is defined as in (A.33). $L_*$ is the push-forward map acting on the vector $\hat T^\infty$ which lives in the Lie algebra $\hat{\mathcal G}$ of $\hat G$ viewed as the tangent space $T_{\hat e}\hat G$ at the unit element $\hat e$ of $\hat G$. The reader may note that the condition (A.60) is equivalent to $\hat F_{\hat K}(R_{\hat P_{\hat K}*}\hat T^\infty) = 0$. This follows from the fact that the vector $L_{\hat P_{\hat K}*}\hat T^\infty$ corresponds to the right infinitesimal action of the central circle (injected in $\hat G$) at the point $\hat P_{\hat K}$ and from the fact that the extension is central, hence the left action of $U(1)$ coincides with the right action.

It is a matter of a direct check that the elements $\hat K$ fulfilling (A.60), (A.61) form a subgroup of $T^*\hat G$. We shall now show that $M_0(\hat G)$ is naturally the central extension of the group $T^*G$ by the circle group $U(1)$. In order to write down the corresponding exact sequence of group homomorphisms

$$1\to U(1)\to M_0(\hat G)\xrightarrow{\ \Pi_0\ }T^*G\to 1, \tag{A.62}$$

we have to specify the injection of $U(1)$ into $M_0(\hat G)$ and the homomorphism $\Pi_0: M_0(\hat G)\to T^*G$. The injection is clear since $\hat G$ is the subgroup of $M_0(\hat G)$ (it is formed by the elements $\hat K$ of $T^*\hat G$ with $\hat F_{\hat K} = 0$), hence we inject $U(1)$ in $\hat G$ as in (2.1) and this trivially induces the injection in (A.62).
Now the homomorphism $\Pi_0$ is constructed as follows: one first notes that the kernel of the push-forward map $\pi_*$ at some point $\hat P_{\hat K}\in\hat G$ is linearly generated precisely by the vector $L_{\hat P_{\hat K}*}\hat T^\infty$ ($= R_{\hat P_{\hat K}*}\hat T^\infty$). In other words, the condition (A.60) implies that there exists a unique 1-form $F_K$ living in the point $\pi(\hat P_{\hat K})$ such that

$$\pi^*F_K = \hat F_{\hat K}. \tag{A.63}$$

Put differently, every form $\hat F_{\hat K}$, living in the point $\hat P_{\hat K}$ and satisfying the condition (A.60), is a pullback by $\pi$ of the uniquely given form $F_K$ in the point $\pi(\hat P_{\hat K})$. But a form in a point defines an element of the group $T^*G$; we have denoted it as $K$ in our context. Now the map $\Pi_0$ is defined by associating to every $\hat K\in M_0(\hat G)$ the corresponding element $K\in T^*G$. Another way of writing this is as follows

$$P_{\Pi_0(\hat K)} = \pi(\hat P_{\hat K}); \tag{A.64}$$

$$\hat F_{\hat K} = \pi^*F_{\Pi_0(\hat K)}. \tag{A.65}$$
Lemma A.2. $\Pi_0$ is a group homomorphism.

Proof. One has to verify the validity of two relations

$$P(\Pi_0(\hat Q\hat K)) = P(\Pi_0(\hat Q))P(\Pi_0(\hat K)); \tag{A.66}$$

$$F_{\Pi_0(\hat Q\hat K)} = R^*_{P^{-1}_{\Pi_0(\hat K)}}F_{\Pi_0(\hat Q)} + L^*_{P^{-1}_{\Pi_0(\hat Q)}}F_{\Pi_0(\hat K)}. \tag{A.67}$$

The first one (A.66) is simple; one has from (A.64)

$$P(\Pi_0(\hat Q\hat K)) = \pi(\hat P(\hat Q\hat K)) = \pi(\hat P(\hat Q))\pi(\hat P(\hat K)) = P(\Pi_0(\hat Q))P(\Pi_0(\hat K)). \tag{A.68}$$

To prove the second one, we have from (A.65) and from the "hat"-version of (A.18)

$$\pi^*F_{\Pi_0(\hat Q\hat K)} = \hat F_{\hat Q\hat K} = R^*_{\hat P_{\hat K}^{-1}}\pi^*F_{\Pi_0(\hat Q)} + L^*_{\hat P_{\hat Q}^{-1}}\pi^*F_{\Pi_0(\hat K)}. \tag{A.69}$$

From the homomorphism property of $\pi:\hat G\to G$, it follows for every $\hat k\in\hat G$:

$$\pi^*R^*_{\pi(\hat k)} = R^*_{\hat k}\pi^* \tag{A.70}$$

and similarly for the left translation by $\hat k$. Combining this fact together with (A.64) and (A.65), we arrive at

$$\pi^*F_{\Pi_0(\hat Q\hat K)} = \pi^*R^*_{P^{-1}_{\Pi_0(\hat K)}}F_{\Pi_0(\hat Q)} + \pi^*L^*_{P^{-1}_{\Pi_0(\hat Q)}}F_{\Pi_0(\hat K)}. \tag{A.71}$$

But this implies the relation (A.67) because the pullback $\pi^*$ of a non-zero form on $G$ would be a non-zero form on $\hat G$.

We shall now again consider the submanifold $M_\kappa(\hat G)$ of the group manifold $T^*\hat G$ formed by all points $\hat K\in T^*\hat G$ fulfilling the condition

$$\hat F_{\hat K}(L_{\hat P_{\hat K}*}\hat T^\infty) = \langle\hat\beta_R(\hat K),\hat T^\infty\rangle = \kappa. \tag{A.72}$$
For $\kappa = 0$, we recover from $M_\kappa(\hat G)$ the manifold $M_0(\hat G)$ defined above, hence our notation is consistent. Note, however, that if $\kappa\neq 0$, the manifold $M_\kappa(\hat G)$ is not a subgroup of $T^*\hat G$.

We can construct a natural diffeomorphism relating $M_\kappa(\hat G)$ and $M_0(\hat G)$. We proceed as follows: first consider a one-parameter subgroup of $T^*\hat G$ consisting of those points $\hat N(s)\in T^*\hat G$ that fulfil

$$\hat P_{\hat N(s)} = \hat e,\qquad \hat F_{\hat N(s)} = s\hat t_\infty, \tag{A.73}$$

where $s\in\mathbb R$ and $\hat t_\infty$ is the 1-form in $\hat e$ satisfying

$$\hat t_\infty(\hat T^\infty) = 1,\qquad \hat t_\infty(\iota(\mathcal G)) = 0. \tag{A.74}$$

The conditions (A.74) determine $\hat t_\infty$ unambiguously.

Remark. It is easy to check that the group $\hat N(s)$ normalizes the group $M_0(\hat G)$ (i.e. $\hat NM_0(\hat G)\hat N^{-1}\subset M_0(\hat G)$). This means in our context that $T^*\hat G$ is naturally a semi-direct product of $\hat N$ and $M_0(\hat G)$.

The diffeomorphism relating $M_\kappa(\hat G)$ and $M_0(\hat G)$ is now simply given by

$$\hat K_0\to\hat N(\kappa)\hat K_0,\qquad \hat K_0\in M_0(\hat G). \tag{A.75}$$

It is evident that this diffeomorphism commutes with the central circle action, hence the spaces of the $U(1)$ orbits on $M_\kappa(\hat G)$ and on $M_0(\hat G)$ are also diffeomorphic. Finally, from the exact sequence (A.62), it follows that the space of orbits of $M_0(\hat G)$ is nothing but $T^*G$.

A.3.2. The reduced Poisson bracket on $M_\kappa(\hat G)/U(1)$

Let us now compute the reduced Poisson bracket on $M_\kappa(\hat G)/U(1)$. Since the latter is diffeomorphic to $T^*G$, it is sufficient to determine the Poisson brackets of a distinguished set of functions on $T^*G$ of the form $\Phi(P_K)$ and $\langle\beta_R(K),x\rangle$. Here $\Phi: G\to\mathbb R$, $K\in T^*G$ and $x\in\mathcal G$. The functions of this special form generate (via the diffeomorphism above) the whole algebra of functions on $M_\kappa(\hat G)/U(1)$. Recall that the canonical Poisson brackets of those functions are given by the equations (A.41)–(A.43). However, the reduced Poisson brackets of the same quantities are different, as the following theorem shows.

Theorem A.3. The reduced symplectic structure on $M_\kappa(\hat G)/U(1)\cong T^*G$ is fully determined by the following Poisson brackets

$$\{\Phi_1(P_K),\Phi_2(P_K)\}_{red} = 0; \tag{A.76}$$

$$\{\Phi(P_K),\langle\beta_L(K),x\rangle\}_{red} = \frac{d}{ds}\Big|_{s=0}\Phi(e^{sx}P_K)\equiv\langle\nabla^L_G\Phi,x\rangle; \tag{A.77}$$

$$\{\langle\beta_L(K),x\rangle,\langle\beta_L(K),y\rangle\}_{red} = \langle\beta_L(K),[x,y]\rangle + \kappa\rho(x,y). \tag{A.78}$$

Recall that $\rho(x,y)$ is the cocycle characterizing the central extension (2.2).
Remarks. (1) The reduced brackets (A.76) and (A.77) are the same as the canonical brackets (A.41) and (A.42), respectively. However, the reduced bracket (A.78) differs from the corresponding canonical bracket by the cocycle term. We thus see that (unless $\kappa = 0$) the reduced symplectic structure does not coincide with the canonical one on $T^*G$.

(2) It may seem that adding the cocycle in (A.78) does not represent a “big” change of the Poisson bracket. However, the reader may check as an exercise that the reduced bracket of the type $\{\langle\beta_R, x\rangle, \langle\beta_R, y\rangle\}_{red}$ is already much more complicated than its canonical counterpart (A.55). (We do not list this calculation here since we shall not need it in this paper.) The bracket of the left currents $\beta_L$ is simpler than that of the right currents $\beta_R$ because of the left-right asymmetry in the map relating $M_\kappa(\hat G)/U(1)$ and $T^*G$. The choice of this map is not canonical; we have chosen it in this way in order to make contact with the standard description of the WZW symplectic structure in [6]. If we change this map (it is also possible to make a left-right symmetric choice), the reduced Poisson brackets of the left and right currents $\beta_{L,R}$ change but the theory does not change. Indeed, $\beta_{L,R}$ would then correspond to different functions on $M_\kappa(\hat G)/U(1)$ if the diffeomorphism has changed. We stress that the natural dynamical variables of the problem are $\hat\beta_{L,R}$ and they get expressed differently in terms of the observables on $T^*G$ under the change of the diffeomorphism.

(3) Nothing in the classical construction forces the level $\kappa$ to be an integer. This constraint will appear at the quantum level. It can be understood intuitively, since $\kappa$ is the “momentum” conjugate to the angle variable parametrizing the central circle.

In order to prove the theorem, we shall need the following lemma.

Lemma A.4. The pullback $\Pi^*\langle\beta_L, x\rangle$ of the function $\langle\beta_L, x\rangle$ on $T^*G$ via the map $\Pi : M_\kappa(\hat G) \to M_\kappa(\hat G)/U(1)$ is given by the following formula
$$(\Pi^*\langle\beta_L, x\rangle)(\hat K) = \langle\hat\beta_L(\hat K), \iota(x)\rangle, \qquad \hat K \in M_\kappa(\hat G), \quad x \in \mathcal{G}. \tag{A.79}$$
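Remark. The map $\Pi : M_\kappa(\hat G) \to M_\kappa(\hat G)/U(1)$ has been defined at the beginning of Sec. A.3; the statement of the lemma makes sense since we have established in Sec. A.3.1 that $M_\kappa(\hat G)/U(1)$ and $T^*G$ are diffeomorphic.

Proof. First we have to prove two important relations
$$\langle\beta_L(\Pi_0(\hat K)), x\rangle = \langle\hat\beta_L(\hat K), \iota(x)\rangle, \qquad \hat K \in M_0(\hat G) \tag{A.80}$$
and
$$\langle\hat\beta_L(\hat N(s)\hat K), \iota(x)\rangle = \langle\hat\beta_L(\hat K), \iota(x)\rangle, \qquad s \in \mathbb{R}, \quad \hat K \in T^*\hat G, \quad x \in \mathcal{G}. \tag{A.81}$$
Indeed, the first relation is implied by (A.33), (2.3), (A.64), (A.70) and (A.65) as follows
$$\begin{aligned}
\langle\beta_L(\Pi_0(\hat K)), x\rangle &= \langle R^*_{P_{\Pi_0(\hat K)}} F_{\Pi_0(\hat K)}, \pi_*(\iota(x))\rangle = \langle F_{\Pi_0(\hat K)}, R_{\pi(\hat P_{\hat K})*}\pi_*(\iota(x))\rangle \\
&= \langle F_{\Pi_0(\hat K)}, \pi_* R_{\hat P_{\hat K}*}\iota(x)\rangle = \langle\pi^* F_{\Pi_0(\hat K)}, R_{\hat P_{\hat K}*}\iota(x)\rangle \\
&= \langle\hat F_{\hat K}, R_{\hat P_{\hat K}*}\iota(x)\rangle = \langle R^*_{\hat P_{\hat K}}\hat F_{\hat K}, \iota(x)\rangle = \langle\hat\beta_L(\hat K), \iota(x)\rangle.
\end{aligned} \tag{A.82}$$
The second relation in turn follows from (A.18), (A.33) and (A.74)
$$\begin{aligned}
\langle\hat\beta_L(\hat N(s)\hat K), \iota(x)\rangle &= \langle R^*_{\hat P_{\hat N(s)\hat K}}\hat F_{\hat N(s)\hat K}, \iota(x)\rangle \\
&= \langle R^*_{\hat P_{\hat K}}(s R^*_{\hat P_{\hat K}^{-1}}\hat t_\infty + \hat F_{\hat K}), \iota(x)\rangle \\
&= s\langle\hat t_\infty, \iota(x)\rangle + \langle\hat\beta_L(\hat K), \iota(x)\rangle = \langle\hat\beta_L(\hat K), \iota(x)\rangle.
\end{aligned} \tag{A.83}$$
Now we have by definition
$$(\Pi^*\langle\beta_L, x\rangle)(\hat K) = \langle\beta_L(\Pi_0(\hat N(-\kappa)\hat K)), x\rangle. \tag{A.84}$$
Combining (A.80) and (A.81), it follows that
$$(\Pi^*\langle\beta_L, x\rangle)(\hat K) = \langle\hat\beta_L(\hat K), \iota(x)\rangle. \tag{A.85}$$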
The lemma is proved.

Proof of Theorem A.3. Consider now an arbitrary pair of functions $\phi, \psi$ on $T^*G$. In order to calculate their reduced Poisson bracket, we first have to pull them back to functions on $M_\kappa(\hat G)$ via our map $\Pi : M_\kappa(\hat G) \to M_\kappa(\hat G)/U(1)$. We have, for instance,
$$(\Pi^*\phi)(\hat K) = \phi(\Pi_0(\hat N(-\kappa)\hat K)), \qquad \hat K \in M_\kappa(\hat G). \tag{A.86}$$
Now we have to extend the functions $\Pi^*\phi$ and $\Pi^*\psi$ to the whole manifold $T^*\hat G$ in such a way that they are invariant with respect to the central circle action. The resulting functions can be referred to as $\phi_{ext}$, $\psi_{ext}$ and can be conveniently chosen as
$$\phi_{ext}(\hat K) = \phi(\Pi_0(\hat N(-\langle\beta_R(\hat K), \hat T^\infty\rangle)\hat K)), \qquad \hat K \in T^*\hat G \tag{A.87}$$
and in the same way for $\psi_{ext}$. Now we should compute the unreduced canonical bracket of $\phi_{ext}$ and $\psi_{ext}$. First we calculate the extensions of two particular functions on $T^*G$. The first one is $(\Phi \circ P)(K) \equiv \Phi(P_K)$, where $\Phi : G \to \mathbb{R}$, and the second is $\langle\beta_L(K), x\rangle$, where $x \in \mathcal{G}$. We obtain from (A.64) and (A.87)
$$(\Phi \circ P)_{ext}(\hat K) = \Phi(P_{\Pi_0(\hat N(-\langle\beta_R(\hat K), \hat T^\infty\rangle)\hat K)}) = \Phi(\pi(\hat P_{\hat N(-\langle\beta_R(\hat K), \hat T^\infty\rangle)\hat K})) = \Phi(\pi(\hat P_{\hat K})) \tag{A.88}$$
and from (A.80), (A.81) and (A.87)
$$\begin{aligned}
\langle\beta_L, x\rangle_{ext}(\hat K) &= \langle\beta_L(\Pi_0(\hat N(-\langle\beta_R(\hat K), \hat T^\infty\rangle)\hat K)), x\rangle \\
&= \langle\hat\beta_L(\hat N(-\langle\beta_R(\hat K), \hat T^\infty\rangle)\hat K), \iota(x)\rangle = \langle\hat\beta_L(\hat K), \iota(x)\rangle.
\end{aligned} \tag{A.89}$$
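Now it is easy to calculate the unreduced bracket $\{\langle\beta_L, x\rangle_{ext}, \langle\beta_L, y\rangle_{ext}\}_{T^*\hat G}$. We have from (A.89), (A.43) and (2.7)
$$\begin{aligned}
\{\langle\beta_L, x\rangle_{ext}, \langle\beta_L, y\rangle_{ext}\}_{T^*\hat G}(\hat K) &= \{\langle\hat\beta_L(\hat K), \iota(x)\rangle, \langle\hat\beta_L(\hat K), \iota(y)\rangle\}_{T^*\hat G} \\
&= \langle\hat\beta_L(\hat K), [\iota(x), \iota(y)]\rangle \\
&= \langle\hat\beta_L(\hat K), \iota([x, y]) + \rho(x, y)\hat T^\infty\rangle \\
&= \langle\hat\beta_L(\hat K), \iota([x, y])\rangle + \rho(x, y)\langle\hat\beta_L(\hat K), \hat T^\infty\rangle.
\end{aligned} \tag{A.90}$$
The resulting function is clearly invariant with respect to the central circle action. From Lemma A.4 and Eq. (A.72), it follows that its restriction to $M_\kappa(\hat G)$ is given by the following expression
$$\langle\hat\beta_L(\hat K), \iota([x, y])\rangle + \kappa\rho(x, y) = (\Pi^*\langle\beta_L, [x, y]\rangle)(\hat K) + \kappa\rho(x, y), \qquad \hat K \in M_\kappa(\hat G). \tag{A.91}$$
Hence we obtain for the reduced Poisson bracket
$$\{\langle\beta_L(K), x\rangle, \langle\beta_L(K), y\rangle\}_{red} = \langle\beta_L(K), [x, y]\rangle + \kappa\rho(x, y). \tag{A.92}$$
This is precisely the relation (A.78). The relation (A.76) is very easy to prove because from (A.88) and (A.44), we have immediately
$$\{(\Phi_1 \circ P)_{ext}(\hat K), (\Phi_2 \circ P)_{ext}(\hat K)\}_{T^*\hat G} = \{\Phi_1(\pi(\hat P_{\hat K})), \Phi_2(\pi(\hat P_{\hat K}))\}_{T^*\hat G} = 0. \tag{A.93}$$
It remains to verify the relation (A.77). It follows directly from the following computation (cf. (A.42))
$$\begin{aligned}
\{(\Phi \circ P)_{ext}(\hat K), \langle\beta_L, x\rangle_{ext}(\hat K)\}_{T^*\hat G} &= \{\Phi(\pi(\hat P_{\hat K})), \langle\hat\beta_L(\hat K), \iota(x)\rangle\}_{T^*\hat G} \\
&= \frac{d}{ds}\Phi(\pi(e^{s\iota(x)}\hat P_{\hat K}))\Big|_{s=0} = \frac{d}{ds}\Phi(e^{sx}\pi(\hat P_{\hat K}))\Big|_{s=0}.
\end{aligned} \tag{A.94}$$
The theorem is proved.

A.4. Proof of Lemma 5.8

With the notation of Sec. 5.2.2, the proposition to be proved reads

Lemma 5.8. It holds that
$$\hat m_L(\widetilde{Ad}_{\hat k}\hat a) = \hat b_L(\widehat{Ad}_{\hat k}\hat a), \qquad \tilde g_R(\widetilde{Ad}_{\hat k}\hat a) = \hat g_R(\widehat{Ad}_{\hat k}\hat a). \tag{A.95}$$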
Proof. Consider any element $(\bar l, \lambda) \in \widehat{DG_0^C}$. The elements of $DG_0^C$ can be viewed as smooth maps from an interval $[0, 1]$ parametrized by $r$ (the radius of the disc)
into the loop group $LG_0^C$. Since the loop group $LG_0^C$ is diffeomorphic to the direct product of the manifolds $LG_0$ and $L_+G_0^C$, it follows that $\bar l$ can be uniquely decomposed as
$$\bar l = XA, \qquad X \in D_+G_0^C, \quad A \in DG_0. \tag{A.96}$$
Here $D_+G_0^C$ is the subgroup of $DG_0^C$ consisting of the elements of the form $X(\sigma, r)$, where for every fixed $r = r_0$, $X(\sigma, r_0) \in L_+G_0^C$.

Consider a $\Theta^C$-representative $\bar K \in \mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$ of an element $\tilde{\tilde K} \in \tilde{\tilde D}$;
$$\bar K = (\bar l, e^{ip+P}, w + is), \tag{A.97}$$
where $(\bar l, e^{i\lambda+L}) \in \widehat{DG_0^C}$, $\bar l \in DG_0^C$ and $p, P, w, s \in \mathbb{R}$. As a consequence of the decomposition (A.96), $\bar K$ also can be decomposed uniquely as
$$\bar K = (\bar b, e^t, w)\,\tilde *\,(\bar g, e^{i\phi}, is), \tag{A.98}$$
where $\bar g \in DG_0$, $\bar b \in D_+G_0^C$, $\phi, t \in \mathbb{R}$ and the product $\tilde *$ is that of $\mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$. By applying the homomorphism $\hat\wp_C$ (cf. Sec. 4.4.3) to both sides of (A.98), we obtain
$$\tilde{\tilde K} = (\hat\wp_C(\bar b, e^t), w)(\hat\wp_C(\bar g, e^{i\phi}), is). \tag{A.99}$$
Now we prove that
$$(\hat\wp_C(\bar b, e^t), w) = \tilde b_L(\tilde{\tilde K}), \qquad (\hat\wp_C(\bar g, e^{i\phi}), is) = \tilde g_R(\tilde{\tilde K}). \tag{A.100}$$
This will follow immediately from the uniqueness of the decomposition $\tilde{\tilde D} = \tilde B\tilde G$, if we demonstrate that
$$\text{(i)}\ \ (\hat\wp_C(\bar b, e^t), w) \in \tilde B, \qquad \text{(ii)}\ \ (\hat\wp_C(\bar g, e^{i\phi}), is) \in \tilde G. \tag{A.101}$$
The statement (ii) is a direct consequence of the way $\tilde G$ is embedded in $\tilde{\tilde D}$ (Lemma 4.18 and its proof); the statement (i) requires a bit more work, however. Actually we must show that there exists $b \in L_+G_0^C$ such that
$$\hat\wp_C(\bar b, e^t) = \hat\wp_C(\bar b, e^t). \tag{A.102}$$
Recall that $\bar b$ was defined in Lemma 4.19 as the map from the Disc into $G_0^C$ whose boundary is $b \in L_+G_0^C$. Let us show that $b$ given by
$$b = (\Pi_0^C \circ \hat\wp_C)((\bar b, e^t)) \tag{A.103}$$
solves (A.102). This amounts to showing that
$$C^C(\bar b\,\bar b^{-1}) = 0. \tag{A.104}$$
The last equation follows from the general fact that $C^C$ vanishes on $D_+G_0^C \cap \partial G^C$. Indeed, recall that if $\bar l \in \partial G^C \cap D_+G_0^C$, then $\bar l$ can be interpreted as a map from the D-Riemann sphere $S^2$ into $G_0^C$ (cf. the discussion of the WZW term in Sec. 4.4 and in Appendix A.1) and extended to a map $\bar l_{ext}$ from the unit Ball wrapped by
$S^2$ into $G_0^C$. Now the map $\bar l_{ext}$ can also be interpreted as a map from a certain half-disc into $LG_0^C$. This half-disc is diffeomorphic to the space of orbits of the azimuthal rotation around the axis connecting the northern with the southern pole of the Ball and it can be clearly identified with the half-disc bounded by the north-south axis of rotation and by the half of the Greenwich Meridian. Points of each such orbit are parametrized by the loop parameter $\sigma$ and the map $\bar l_{ext}$ restricted to any such orbit is naturally an element of $LG_0^C$. It is crucial to remark that the loop group $LG_0^C$ can be diffeomorphically decomposed as a product of $LG_0$ and $L_+G_0^C$, hence also the group of smooth maps from the half-disc (with appropriate boundary conditions) into $LG_0^C$ can be decomposed accordingly. If we decompose the map $\bar l_{ext}$ in this way, we observe that the $L_+G_0^C$ part of this decomposition is still a map that extends the original element $\bar l \in \partial G^C \cap D_+G_0^C$ into the Ball. We refer to such an extension as $\bar l^+_{ext}$ and we recall that it can be understood as a map from the half-disc into $L_+G_0^C$. Thus we can calculate $C^C(\bar l)$ by using the extension $\bar l^+_{ext}$ in the defining formula (4.132). The result is that the integral over the loop variable $\sigma$ can be converted into an equatorial contour integral on the $\sigma$-Riemann sphere, where the integrated function in the $z$-variable is everywhere holomorphic on the southern hemisphere. The contour can therefore be shrunk to a point without encountering any singularity, hence $C^C(\bar l) = 0$.

We recapitulate what we have learned so far: having an element $\tilde{\tilde K} \in \tilde{\tilde D}$, we can find $\tilde b_L(\tilde{\tilde K})$ and $\tilde g_R(\tilde{\tilde K})$ by picking any $\Theta^C$-representative $\bar K \in \mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$, decomposing $\bar K$ as in (A.98) and evaluating
$$\tilde b_L(\tilde{\tilde K}) = (\hat\wp_C(\bar b, e^t), w), \qquad \tilde g_R(\tilde{\tilde K}) = (\hat\wp_C(\bar g, e^{i\phi}), is).$$
Similarly, we can also prove that
$$\hat b_L(\hat{\hat K}) = (\wp_C(\bar b, 1), w), \qquad \hat g_R(\hat{\hat K}) = (\wp_C(\bar g, e^{i\phi}), 0). \tag{A.105}$$
Here $\bar K = (\bar l, e^{ip}, w)$ is any $\Theta^C_R$-representative of $\hat{\hat K} \in \hat{\hat D}$ with a decomposition
$$\bar K = (\bar b, 1, w)\,\hat *\,(\bar g, e^{i\phi}, 0). \tag{A.106}$$
Of course, $(\bar l, e^{ip}) \in {}_R\widehat{DG_0^C}$, $\bar l \in DG_0^C$, $\bar g \in DG_0$, $\bar b \in D_+G_0^C$, $p, w, \phi \in \mathbb{R}$ and the product $\hat *$ is taken in $\mathbb{R} \times_Q {}_R\widehat{DG_0^C}$. Now we need the formula for $\hat m_L \equiv \hat m \circ \tilde b_L$. Clearly,
$$\hat m_L(\tilde{\tilde K}) = ((\Pi_0^C \circ \hat\wp_C)(\bar b, e^t), w). \tag{A.107}$$
In particular, we can immediately realize that $\hat m_L(\tilde{\tilde K})$ does not depend on $t$.
We have shown so far how to calculate the maps $\hat m_L$, $\tilde b_L$, $\hat b_L$, $\tilde g_R$ and $\hat g_R$ by using the “Iwasawa” decomposition (A.96) of the $\Theta^C$-representatives of $\tilde{\tilde K}$ (or $\Theta^C_R$-representatives of $\hat{\hat K}$). We can now view the elements $\hat k \in \hat G$, $\hat a \in \hat B$ as elements of $\hat{\hat D}$ but also as elements of $\tilde{\tilde D}$. Their representatives can be chosen as
$$\bar k = (\bar k, e^{i\psi}, 0), \qquad \bar a = (\bar a, 1, a_\infty). \tag{A.108}$$
Now it is crucial to notice that $\bar k$, $\bar a$ make sense as elements of $\mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$ and also as elements of $\mathbb{R} \times_Q {}_R\widehat{DG_0^C}$. If we want to calculate the quantities $\tilde b_L(\widetilde{Ad}_{\hat k}\hat a)$ and $\hat b_L(\widehat{Ad}_{\hat k}\hat a)$, we should understand $\bar k$, $\bar a$ respectively in the former and the latter sense.

Consider now any $n$ elements $\bar K_j$, $j = 1, \ldots, n$ of $\mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$ of the form
$$\bar K_j = (\bar l_j, e^{ip_j}, w_j). \tag{A.109}$$
As we already know, we can view $\bar K_j$ of this form also as elements of $\mathbb{R} \times_Q {}_R\widehat{DG_0^C}$. Now we denote by $\tilde *$ the product in $\mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$ and by $\hat *$ the one in $\mathbb{R} \times_Q {}_R\widehat{DG_0^C}$. Using (4.108) and (4.129), we arrive immediately at
$$\bar K_1\,\hat *\,\bar K_2\,\hat * \cdots \hat *\,\bar K_n = (1, e^{t(\bar K_1, \bar K_2, \ldots, \bar K_n)}, 0)\,\tilde *\,(\bar K_1\,\tilde *\,\bar K_2\,\tilde * \cdots \tilde *\,\bar K_n), \tag{A.110}$$
where the left-hand side is viewed as an element of $\mathbb{R}^2 \times_{S,Q} \widehat{DG_0^C}$ and $t(\cdot, \ldots, \cdot)$ is some real function whose explicit form is not needed for the proof of the lemma. From Eq. (A.110) for $\bar K_1 = \bar k$, $\bar K_2 = \bar a$ and $\bar K_3 = \bar k^{-1}$, we conclude that
$$\widehat{Ad}_{\bar k}\bar a = (1, e^{t(\bar k, \bar a, \bar k^{-1})}, 0)\,\tilde *\,(\widetilde{Ad}_{\bar k}\bar a). \tag{A.111}$$
Now we have the “Iwasawa” decomposition (A.106)
$$\widehat{Ad}_{\bar k}\bar a = (\bar b, 1, w)\,\hat *\,(\bar g, e^{i\phi}, 0), \tag{A.112}$$
for some $\bar b \in D_+G_0^C$, $\bar g \in DG_0$, $\phi, w \in \mathbb{R}$. From (A.110) and (A.111), we obtain
$$\widetilde{Ad}_{\bar k}\bar a = (\bar b, e^{t'(\bar k, \bar a)}, w)\,\tilde *\,(\bar g, e^{i\phi}, 0), \tag{A.113}$$
where $t'(\bar k, \bar a)$ is some real function whose form is irrelevant for us. Now from (A.100), (A.105), (A.107), (A.111)–(A.113), we conclude immediately
$$\hat m_L(\widetilde{Ad}_{\hat k}\hat a) = \hat b_L(\widehat{Ad}_{\hat k}\hat a), \qquad \tilde g_R(\widetilde{Ad}_{\hat k}\hat a) = \hat g_R(\widehat{Ad}_{\hat k}\hat a).$$
The lemma is proved.
References

[1] A. Alekseev, Y. Kosmann-Schwarzbach and E. Meinrenken, Quasi-Poisson manifolds, math.DG/0006168.
[2] A. Alekseev and A. Malkin, Commun. Math. Phys. 162 (1994) 147.
[3] A. Alekseev and S. Shatashvili, Commun. Math. Phys. 128 (1990) 197.
[4] A. Alekseev and I. Todorov, Nucl. Phys. B421 (1994) 413.
[5] O. Babelon, Phys. Lett. B215 (1988) 523.
[6] J. Balog, L. Fehér and L. Palla, On the chiral WZNW phase space, exchange r-matrices and Poisson–Lie groupoids, hep-th/9912173.
[7] D. Bernard, Nucl. Phys. B303 (1988) 77.
[8] M. Blau and G. Thompson, Lectures on 2d gauge theories: Topological aspects and path integral techniques, in Proc. 1993 Trieste Summer School on High Energy Physics and Cosmology, eds. E. Gava et al. (World Scientific, Singapore, 1994), p. 175, hep-th/9310144.
[9] M. Blau and G. Thompson, Commun. Math. Phys. 171 (1995) 639.
[10] B. Blok, Phys. Lett. B233 (1989) 359.
[11] A. G. Bytsko, Tensor operators in R-matrix approach, q-alg/9512030.
[12] A. G. Bytsko and V. Schomerus, Commun. Math. Phys. 191 (1998) 87.
[13] L. Caneschi and M. Lysiansky, Nucl. Phys. B505 (1997) 701.
[14] M. Chu, P. Goddard, I. Halliday, D. Olive and A. Schwimmer, Phys. Lett. B266 (1991) 71.
[15] M. Chu and P. Goddard, Nucl. Phys. B445 (1995) 145.
[16] L. Dabrowski, T. Krajewski and G. Landi, Int. J. Mod. Phys. B14 (2000) 2367.
[17] P. Etingof and A. Varchenko, Geometry and classification of solutions of the Classical Dynamical Yang–Baxter Equation, q-alg/9703040.
[18] L. D. Faddeev, Commun. Math. Phys. 132 (1990) 131.
[19] F. Falceto and K. Gawędzki, J. Geom. Phys. 11 (1993) 251.
[20] G. Felder, Conformal field theory and integrable systems associated to elliptic curves, in Proc. Int. Congress of Mathematicians (Zurich, 1994).
[21] G. Felder and C. Wieczerkowski, Commun. Math. Phys. 176 (1996) 133.
[22] J. Figueroa-O'Farrill and S. Stanciu, Phys. Lett. B327 (1994) 40–46.
[23] P. Furlan, L. Hadjiivanov and I. Todorov, Nucl. Phys. B474 (1996) 497.
[24] E. Frenkel and N. Reshetikhin, Quantum affine algebras and deformations of the Virasoro and W-algebras, q-alg/9505025.
[25] I. Frenkel and N. Reshetikhin, Commun. Math. Phys. 146 (1992) 1.
[26] C. Fronsdal, Exact deformations of quantum groups; applications to the affine case, q-alg/9602034.
[27] K. Gawędzki, Lectures on CFT, preprint IHES-P-97-2 (1997).
[28] K. Gawędzki, Conformal field theory: A case study, hep-th/9904145.
[29] K. Gawędzki, Commun. Math. Phys. 139 (1991) 201.
[30] P. Goddard, A. Kent and D. Olive, Commun. Math. Phys. 103 (1986) 105.
[31] M. Jimbo, H. Konno, S. Odake and J. Shiraishi, Quasi-Hopf twistors for elliptic quantum groups, q-alg/9712029.
[32] A. Medina and P. Revoy, Ann. Scient. ENS 18 (1985) 553, in French.
[33] J. Mickelsson, Current algebras and groups (New York, Plenum Press, 1989).
[34] C. Klimčík and P. Ševera, Phys. Lett. B351 (1995) 455; C. Klimčík, Nucl. Phys. (Proc. Suppl.) B46 (1996) 116; P. Ševera, Minimálne plochy a dualita, Diploma thesis (Slovak, 1995).
[35] C. Klimčík and P. Ševera, T-duality and the moment map, preprint IHES-P-96-70, hep-th/9610198 (Cargèse 1996); Quantum fields and quantum space time, 323–329.
[36] V. G. Knizhnik and A. B. Zamolodchikov, Nucl. Phys. B247 (1984) 83.
[37] J.-H. Lu and A. Weinstein, J. Diff. Geom. 31 (1990) 510.
[38] S. Lukyanov and S. Shatashvili, Phys. Lett. B298 (1993) 111.
[39] A. Polyakov and P. B. Wiegmann, Phys. Lett. B311 (1983) 549.
[40] A. Pressley and G. Segal, Loop groups (Oxford, Clarendon Press, 1986).
[41] N. Yu. Reshetikhin and M. A. Semenov-Tian-Shansky, Lett. Math. Phys. 19 (1990) 133.
[42] M. Semenov-Tian-Shansky, Theor. Math. Physics 93 (1992) 302.
[43] M. Semenov-Tian-Shansky, Publ. RIMS 21 (Kyoto Univ., 1985), p. 1237.
[44] E. Whittaker and G. Watson, A course of modern analysis (Cambridge, Cambridge University Press, 1969), p. 489.
[45] E. Witten, Commun. Math. Phys. 92 (1984) 455.
[46] P. Xu, Quantum groupoids, math.QA/9905192.
[47] D. Zhelobenko and A. Stern, Representations of Lie groups (Moscow, Nauka, 1983, in Russian), p. 117.
September 6, 2004 14:54 WSPC/148-RMP
00215
Reviews in Mathematical Physics Vol. 16, No. 6 (2004) 809–822 © World Scientific Publishing Company
CONTINUITY OF A CLASS OF ENTROPIES AND RELATIVE ENTROPIES
JAN NAUDTS Departement Natuurkunde, Universiteit Antwerpen Universiteitsplein 1, 2610 Antwerpen, Belgium [email protected] Received 31 October 2003 Revised 26 July 2004 The present paper studies continuity of generalized entropy functions and relative entropies defined using the notion of a deformed logarithmic function. In particular, two distinct definitions of relative entropy are discussed. As an application, all considered entropies are shown to satisfy Lesche’s stability condition. The entropies of Tsallis’ nonextensive thermostatistics are taken as examples. Keywords: Entropy; relative entropy; divergence; information content; Lesche’s stability condition; generalized thermostatistics.
1. Introduction

The discrete entropy functional
$$I_0(p) = -\sum_k p_k \log(p_k) \le \infty \tag{1}$$
is not continuous in the total variation norm
$$\|p - q\|_1 = \sum_k |p_k - q_k| \tag{2}$$
in case the number of microstates $k$ is infinite. This means that a small change in probability distribution may cause an arbitrarily large change in entropy. This discontinuity has been identified recently [1] as an essential characteristic of information content in natural languages. But its occurrence can make it difficult to obtain a reliable estimate of entropy from experimental observation. In many cases the probabilities $p_k$ are defined over a finite index set $k = 1, 2, \ldots, N$. Then uniform continuity holds and a useful estimate, called Lesche’s stability condition [2], exists; see expression (55). The inequality was already known before since Fannes [3] proved the quantum version of the inequality about ten years earlier. However, Lesche formulated the inequality as a condition which is satisfied by (1) but not by the alpha-entropies of Rényi [4]. Recently [5], it has been shown that the q-entropies
of Tsallis’ non-extensive thermostatistics also satisfy Lesche’s condition. Here, we generalize this proof to a large class of entropy functions, and formulate a more general continuity estimate (38).

It has long been known that in (1) the natural logarithm may be replaced by an arbitrary increasing function $f(x)$. The entropy of the discrete probability distribution function (pdf) $p$ then reads
$$\tilde I(p) = -\sum_k p_k f(p_k). \tag{3}$$
In the terminology of [6–8] these are quasi-entropies. It is clear that for general functions $f(x)$ not much can be said about continuity of entropy or relative entropy. It is obvious that $f(x)$ is required to share some of the properties of the natural logarithm. A class of functions satisfying such extra conditions has been introduced recently [9]. They have been used as the basis for a broad generalization of thermostatistics [10, 11]. The present paper focuses on entropy functionals occurring in this generalized thermostatistics.

A possible generalization of relative entropy, also called divergence [12], is f-divergence [13, 14], defined by
$$I(p\|q) = \sum_k q_k f(p_k/q_k), \tag{4}$$
with $f(x)$ a convex function, defined for $x > 0$, strictly convex at $x = 1$. The ratio $p_k/q_k$ can be seen as the discrete Radon–Nikodym derivative of $p$ with respect to $q$. The latter has been the basis for a systematic generalization to the context of quantum mechanics; see [7, Chap. 5]. Alternative expressions of the form
$$D(p\|q) = \sum_k \left[f(p_k) - f(q_k) - (p_k - q_k) f'(q_k)\right], \tag{5}$$
with $f'(x)$ the derivative of $f(x)$, are called divergences of the Bregman type in the mathematics literature. In the original definition [15] the pdfs $p$ and $q$ are interchanged. Then (4) and (5) are identical in the case $f(x) = x\log(x)$. Hence, in the standard theory there is no need to make a difference between the two forms.

To clarify why both are needed let us remark that mean entropy, in contrast with dynamical entropy, is negative relative entropy with respect to some reference state. If the number $N$ of microstates is finite then entropy is relative entropy with respect to uniform probabilities $q_k = 1/N$
$$-I(p\|q) = -\frac{1}{N}\sum_k f(N p_k). \tag{6}$$
The continuum limit of (6) becomes
$$-I(p\|q) \to -\int_0^1 dk\, f(\rho(k)) \tag{7}$$
for any probability measure $p$ with density function $\rho(x)$ with respect to the Lebesgue measure $dx$ of $[0, 1]$. This continuum limit makes clear why a definition of relative entropy of the form (4) is needed. In what follows, the definition of generalized entropy that will be used is
$$I(p) = -\sum_k f(p_k). \tag{8}$$
By omitting the factors $N$ from (6) the explicit dependence on the number of microstates disappears and the expression is of the form (3). In particular, if $f(x) = x\log(x)$ then $I(p)$ coincides with $I_0(p)$.

There exist also situations where a divergence of the form (5) is needed. In (generalized) statistical mechanics relative entropy $D(p\|q)$ measures the difference in free energy between an arbitrary pdf $p$ and the equilibrium pdf $q$. The quantity $-f'(q_k)$ equals the energy of the $k$th microstate divided by temperature (up to a constant term). Hence,
$$-\sum_k p_k \ln_\kappa(q_k) - I(p) \tag{9}$$
is the (non-equilibrium) free energy of $p$ divided by temperature $T$ (again up to a constant term). Then (5) expresses that free energy as a function of the pdf $p$ is minimal at equilibrium $p = q$.

In information theory the linking identity connects average code length, entropy and divergence
$$\langle\kappa, p\rangle = I(p) + D(p\|q). \tag{10}$$
See e.g. [1]. Here, divergence measures the redundancy of the code $\kappa$ against the pdf $p$. From (10) follows
$$\langle\kappa, p\rangle - \langle\kappa, q\rangle = I(p) + D(p\|q) - I(q), \tag{11}$$
which can be identified with (5), provided that the average code length is given by
$$\langle\kappa, p\rangle = -\sum_k p_k f'(q_k) + C, \tag{12}$$
with $C$ a suitably chosen constant.

The paper is organized as follows. The next section gives a short review of deformed exponentials and logarithms. Sections 3–5 discuss the definitions of entropy and relative entropy. Continuity estimates for entropy and relative entropy are given in Sec. 6. Finally, Lesche’s stability condition is discussed in Secs. 7 and 8. The paper concludes with a short discussion of results, followed by appendices containing proofs of inequalities.
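The statement in the Introduction that (4) and (5) coincide for f(x) = x log(x), and the linking identity (10), can be checked numerically. The following Python sketch is an illustration only and not part of the original paper: the function names, the two sample distributions and the value C = 1 (the constant that makes (12) agree with (10) for this particular f) are assumptions made here.

```python
import math

def f(x):
    # f(x) = x log(x), extended continuously by f(0) = 0
    return x * math.log(x) if x > 0 else 0.0

def f_prime(x):
    # derivative of x log(x)
    return math.log(x) + 1.0

def f_divergence(p, q):
    # expression (4): sum_k q_k f(p_k / q_k)
    return sum(qk * f(pk / qk) for pk, qk in zip(p, q))

def bregman_divergence(p, q):
    # expression (5): sum_k [f(p_k) - f(q_k) - (p_k - q_k) f'(q_k)]
    return sum(f(pk) - f(qk) - (pk - qk) * f_prime(qk) for pk, qk in zip(p, q))

def entropy(p):
    # expression (8): I(p) = -sum_k f(p_k)
    return -sum(f(pk) for pk in p)

def code_length(p, q, C=1.0):
    # expression (12); C = 1 is an assumption valid for f(x) = x log(x)
    return -sum(pk * f_prime(qk) for pk, qk in zip(p, q)) + C

# two arbitrary sample distributions (hypothetical data)
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

print("f-divergence (4):      ", f_divergence(p, q))
print("Bregman divergence (5):", bregman_divergence(p, q))
print("code length (12):      ", code_length(p, q))
print("I(p) + D(p||q):        ", entropy(p) + bregman_divergence(p, q))
```

For other convex f the two divergences generally differ, which is the reason the paper keeps both definitions.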
2. Deformed Exponentials and Logarithms

In [9], a deformed logarithm is defined as a strictly increasing concave function, defined for all $x > 0$, vanishing for $x = 1$. Following [11] it is written as
$$\ln_\phi(x) = \int_1^x dy\, \frac{1}{\phi(y)} \tag{13}$$
with $\phi(y)$ a strictly positive increasing function. For convenience, the integral of $\ln_\phi(x)$ is denoted
$$F_\phi(x) = \int_1^x dy\, \ln_\phi(y) = \int_1^x dy\, \frac{x - y}{\phi(y)}. \tag{14}$$
The possible divergence of $\ln_\phi(x)$ at $x = 0$ should be mild enough so that $F_\phi(0)$ is finite. The inverse function is the deformed exponential $\exp_\phi(x)$ and is defined on the range of $\ln_\phi(x)$, which may be less than the whole real line. If needed, the domain of definition is extended by putting $\exp_\phi(x) = 0$ if $x$ is too small, and $\exp_\phi(x) = +\infty$ if $x$ is too large.

For further use the notion of deduced logarithmic function $\omega_\phi(x)$, associated with $\ln_\phi(x)$, is needed. It is defined by
$$\begin{aligned}
\omega_\phi(x) &= (x - 1)F_\phi(0) - xF_\phi(1/x) \\
&= x\int_0^{1/x} dy\, \left(-\ln_\phi(y) - F_\phi(0)\right) \\
&= \int_0^{1/x} dy\, \frac{xy - 1}{\phi(y)}.
\end{aligned} \tag{15}$$
It is again a deformed logarithm provided that
$$\int_0^1 dx\, \ln_\phi(1/x) < +\infty. \tag{16}$$
The name of κ-deformed logarithm is used in [9] and, with a more restricted meaning, in [16]. To avoid confusion this name is used in the present paper only with the latter restricted meaning. Its origin is the kappa-distribution, which is a generalization of the Maxwell distribution. This distribution is given by
$$\rho(v) = A\left(1 + \frac{1}{2\kappa}\beta\frac{v^2}{v_0^2}\right)^{-1-\kappa} \tag{17}$$
and can be written as $\rho(v) = A\exp_\phi(-(1/2)\beta v^2/v_0^2)$ with the deformed logarithm $\ln_\phi(x)$ defined by
$$\ln_\phi(x) = \kappa\left(1 - x^{-1/(1+\kappa)}\right) = \frac{\kappa}{1+\kappa}\int_1^x dy\, y^{-(2+\kappa)/(1+\kappa)}, \qquad \kappa > 0. \tag{18}$$
As a simple example of deformed exponential and logarithmic functions, consider the piecewise linear functions determined by the values
$$\ln_\phi(a^n) = n, \qquad \exp_\phi(n) = a^n, \qquad n \in \mathbb{Z}, \tag{19}$$
with $a > 0$ any base number. But the function $\ln_\phi(x) = -1 + \sqrt{x}$ is also a deformed logarithm. Its inverse is given by $\exp_\phi(x) = 0$ if $x \le -1$, and $\exp_\phi(x) = (1 + x)^2$ otherwise.

3. Entropy

The entropy $I_\phi(p)$ of a discrete pdf $p$ is defined by means of the deduced logarithmic function $\omega_\phi(x)$, rather than by the deformed logarithm $\ln_\phi(x)$. The reason for doing so is that the derivative of $\omega_\phi(x)$ exists and can be calculated in terms of $\ln_\phi(x)$ while not much is known in general about the derivative $1/\phi(x)$ of the function $\ln_\phi(x)$. The definition of entropy functional reads
$$I_\phi(p) = \sum_k p_k\,\omega_\phi(1/p_k) \le +\infty. \tag{20}$$
Note that the function $x\omega_\phi(1/x)$ is non-negative and goes to zero in the limit $x \to 0$. Hence the expression is well-defined. Basic properties are $I_\phi(p) \ge 0$ and
$$I_\phi(\lambda p + (1 - \lambda)q) \ge \lambda I_\phi(p) + (1 - \lambda)I_\phi(q), \qquad 0 \le \lambda \le 1, \tag{21}$$
i.e. entropy $I_\phi(p)$ is a concave function of the pdf $p$. From the definition of the deduced logarithmic function $\omega_\phi(x)$ it follows that
$$\begin{aligned}
I_\phi(p) &= \sum_k \left[(1 - p_k)F_\phi(0) - F_\phi(p_k)\right] \\
&= -F_\phi(0) - \sum_k \int_0^{p_k} dx\, \ln_\phi(x).
\end{aligned} \tag{22}$$
In particular, $I_\phi(p)$ is of the form (8) with
$$f(x) = F_\phi(x) - (1 - x)F_\phi(0). \tag{23}$$
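The equivalence of the three expressions (20), (22) and (8) with (23) can be illustrated with the deformed logarithm $\ln_\phi(x) = -1 + \sqrt{x}$ mentioned at the end of Sec. 2, for which $F_\phi$ has a closed form. The Python sketch below is an illustration under these assumptions; the function names and the sample pdf are not from the paper.

```python
import math

def ln_phi(x):
    # deformed logarithm ln_phi(x) = -1 + sqrt(x)
    return -1.0 + math.sqrt(x)

def F_phi(x):
    # closed form of (14): integral of ln_phi(y) for y from 1 to x
    return -(x - 1.0) + (2.0 / 3.0) * (x ** 1.5 - 1.0)

def omega_phi(x):
    # deduced logarithm, first line of (15)
    return (x - 1.0) * F_phi(0.0) - x * F_phi(1.0 / x)

def entropy_via_omega(p):
    # definition (20); terms with p_k = 0 contribute nothing
    return sum(pk * omega_phi(1.0 / pk) for pk in p if pk > 0)

def entropy_via_F(p):
    # second line of (22): -F(0) - sum_k int_0^{p_k} ln_phi(x) dx
    return -F_phi(0.0) - sum(F_phi(pk) - F_phi(0.0) for pk in p)

def entropy_via_f(p):
    # form (8) with f given by (23)
    return -sum(F_phi(pk) - (1.0 - pk) * F_phi(0.0) for pk in p)

p = [0.5, 0.3, 0.2, 0.0]  # hypothetical sample pdf
print(entropy_via_omega(p), entropy_via_F(p), entropy_via_f(p))
```

For this choice of $\ln_\phi$ a short calculation gives $x\omega_\phi(1/x) = (2/3)\,x(1 - \sqrt{x})$, which is non-negative on $[0, 1]$, in line with the remark following (20).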
Let us discuss some examples. If $\ln_\phi(x)$ is replaced by the natural logarithm $\log(x)$ then the entropy is denoted $I_0(p)$ and is given by the well-known expression (1).

As a further example, consider entropy in the context of Tsallis’ non-extensive thermodynamics [17]. Fix a number $\kappa$ between $-1$ and $1$, not equal to $0$. A deformed logarithm is defined by
$$\ln_\phi(x) = (1 + \kappa^{-1})(x^\kappa - 1) = \int_1^x dy\, \frac{1 + \kappa}{y^{1-\kappa}}. \tag{24}$$
Note that this definition differs from the definition of q-logarithm found in the Tsallis literature [18], which coincides with the deduced logarithm
$$\omega_\phi(x) = (1/\kappa)(1 - x^{-\kappa}). \tag{25}$$
A short calculation yields the entropy functional
$$I_\phi(p) = \frac{1}{\kappa}\left(1 - \sum_k p_k^{1+\kappa}\right). \tag{26}$$
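The "short calculation" leading from (24) to (25) and (26) can be reproduced numerically. The sketch below, written for illustration only, evaluates the deduced logarithm from the general definition (15) using the closed form of $F_\phi$ for the logarithm (24), and compares it with (25) and (26). The function names and sample values are assumptions.

```python
def F_phi(x, kappa):
    # closed form of (14) for the deformed logarithm (24)
    return (x ** (1.0 + kappa) - 1.0) / kappa - (1.0 + 1.0 / kappa) * (x - 1.0)

def omega_generic(x, kappa):
    # deduced logarithm from the first line of (15)
    return (x - 1.0) * F_phi(0.0, kappa) - x * F_phi(1.0 / x, kappa)

def omega_closed(x, kappa):
    # expression (25)
    return (1.0 - x ** (-kappa)) / kappa

def entropy_generic(p, kappa):
    # definition (20) with the generic deduced logarithm
    return sum(pk * omega_generic(1.0 / pk, kappa) for pk in p if pk > 0)

def entropy_closed(p, kappa):
    # expression (26)
    return (1.0 - sum(pk ** (1.0 + kappa) for pk in p)) / kappa

p = [0.5, 0.3, 0.2]  # hypothetical sample pdf
for kappa in (0.5, -0.5):
    print(kappa, entropy_generic(p, kappa), entropy_closed(p, kappa))
```

Both signs of $\kappa$ in the allowed range $-1 < \kappa < 1$ are exercised, since the text admits either.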
This entropy functional was studied long ago by Havrda and Charvát [19] and by Daróczy [20]. It is a monotonic function of Rényi’s alpha-entropies [4]. It is the starting point of Tsallis’ thermostatistics. In the latter context it is common to use the parameter $q = 1 + \kappa$ instead of $\kappa$. In the present paper the symbols $p$, $q$, and $r$ are used for pdfs.

As a final example, consider the κ-deformed logarithm introduced by Kaniadakis [16, 21]
$$\ln_\kappa(x) = \frac{1}{2\kappa}\left(x^\kappa - x^{-\kappa}\right). \tag{27}$$
The parameter $\kappa$ should satisfy $-1 < \kappa < 1$ to guarantee concavity of the deformed logarithm. The inverse function reads
$$\exp_\kappa(x) = \left(\kappa x + \sqrt{1 + \kappa^2x^2}\right)^{1/\kappa}. \tag{28}$$
The corresponding entropy functional is obtained directly from (22). The result is
$$I_\kappa(p) = \frac{1}{2\kappa(1 + \kappa)}\left(1 - \sum_k p_k^{1+\kappa}\right) + \frac{1}{2\kappa(1 - \kappa)}\left(\sum_k p_k^{1-\kappa} - 1\right). \tag{29}$$
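Two claims about the Kaniadakis logarithm can be checked numerically: that (28) inverts (27), and that inserting (27) into (22) gives (29). The Python sketch below is an illustration only; the function names and sample values are assumptions, and the closed form of $F_\phi$ used here follows from integrating (27) term by term.

```python
import math

def ln_kappa(x, kappa):
    # expression (27)
    return (x ** kappa - x ** (-kappa)) / (2.0 * kappa)

def exp_kappa(x, kappa):
    # expression (28)
    return (kappa * x + math.sqrt(1.0 + kappa ** 2 * x ** 2)) ** (1.0 / kappa)

def F_kappa(x, kappa):
    # integral of ln_kappa(y) for y from 1 to x, integrated term by term
    a = (x ** (1.0 + kappa) - 1.0) / (1.0 + kappa)
    b = (x ** (1.0 - kappa) - 1.0) / (1.0 - kappa)
    return (a - b) / (2.0 * kappa)

def entropy_via_22(p, kappa):
    # first line of (22)
    F0 = F_kappa(0.0, kappa)
    return sum((1.0 - pk) * F0 - F_kappa(pk, kappa) for pk in p)

def entropy_via_29(p, kappa):
    # expression (29)
    s_plus = sum(pk ** (1.0 + kappa) for pk in p)
    s_minus = sum(pk ** (1.0 - kappa) for pk in p)
    return ((1.0 - s_plus) / (2.0 * kappa * (1.0 + kappa))
            + (s_minus - 1.0) / (2.0 * kappa * (1.0 - kappa)))

p = [0.5, 0.3, 0.2]  # hypothetical sample pdf
kappa = 0.3
print(exp_kappa(ln_kappa(2.5, kappa), kappa))
print(entropy_via_22(p, kappa), entropy_via_29(p, kappa))
```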
4. Relative Entropy

Let $q$ be a pdf for which $q_k > 0$ holds for all $k$ (this condition can be omitted if the deformed logarithm is such that $\omega_\phi(0)$ is finite). From (4) it follows that the relative entropy of the pdf $p$, given $q$, is defined by
$$I_\phi(p\|q) = -\sum_k p_k\,\omega_\phi(q_k/p_k). \tag{30}$$
Note that, using the definition of $\omega_\phi$, one obtains
$$I_\phi(p\|q) = \sum_k \int_{q_k}^{p_k} dx\, \ln_\phi(x/q_k). \tag{31}$$
Expression (30) is of the form (4) with $f(x)$ given by (23). In particular, this means that the divergence $I_\phi(p\|q)$, considered here, is a special case of the f-divergence of [13, 14], with functions $f$ which are strictly convex and have a concave derivative. Many properties of f-divergence are known; see [22]. In particular, one has $I_\phi(p\|q) \ge 0$ and $I_\phi(p\|q) = 0$ implies $p = q$. Also, $I_\phi(p\|q)$ is jointly convex in $p$ and $q$.

For the example of Tsallis’ entropy functional one obtains, using (25),
$$I_\phi(p\|q) = \frac{1}{\kappa}\sum_k p_k\left[\left(\frac{p_k}{q_k}\right)^\kappa - 1\right]. \tag{32}$$
This expression has been introduced in the context of Tsallis’ thermostatistics independently by several authors [23–25]. However, the definition was known before in the context of Rényi’s alpha-entropies; see [26].
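The passage from the general definition (30) to the Tsallis expression (32), together with the positivity property quoted above, can be illustrated as follows. This Python sketch is not part of the paper; the function names and the sample pdfs are assumptions.

```python
def omega(x, kappa):
    # deduced logarithm (25) of the Tsallis case
    return (1.0 - x ** (-kappa)) / kappa

def relative_entropy_generic(p, q, kappa):
    # definition (30): -sum_k p_k omega(q_k / p_k), skipping p_k = 0
    return -sum(pk * omega(qk / pk, kappa) for pk, qk in zip(p, q) if pk > 0)

def relative_entropy_tsallis(p, q, kappa):
    # expression (32)
    return sum(pk * ((pk / qk) ** kappa - 1.0) for pk, qk in zip(p, q)) / kappa

# hypothetical sample pdfs with q_k > 0 for all k
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

for kappa in (0.5, -0.5):
    print(kappa,
          relative_entropy_generic(p, q, kappa),
          relative_entropy_tsallis(p, q, kappa))
```

Note that, unlike the entropy, the relative entropy is not symmetric in its two arguments, so the order of `p` and `q` matters.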
If $\ln_\phi(x)$ has a unique derivative $\ln'_\phi(x) = 1/\phi(x)$ in the point $x = 1$ and the probabilities $p_k$ depend on parameters $\theta^i$ then the generalized Fisher information metric [27], defined by $I_\phi(p + dp\|p) = I_\phi(p\|p + dp) = (1/2)g_{ij}(p)d\theta^i d\theta^j$, becomes
$$g_{ij}(p) = \ln'_\kappa(1)\sum_k p_k\frac{\partial\log(p_k)}{\partial\theta^i}\frac{\partial\log(p_k)}{\partial\theta^j}. \tag{33}$$
Note that this expression does not depend on the actual choice of deformed logarithm, except through the prefactor $\ln'_\kappa(1)$.

5. Alternative Definition of Divergence

So far, definition (30) seems quite satisfactory. However, as discussed in the introduction, there is a need for an alternative definition of the form (5). By modification of (31) one obtains
$$\begin{aligned}
D_\phi(p\|q) &= \sum_k \int_{q_k}^{p_k} dx\, \left(\ln_\phi(x) - \ln_\phi(q_k)\right) \\
&= \sum_k \left[F_\phi(p_k) - F_\phi(q_k) - (p_k - q_k)\ln_\phi(q_k)\right] \\
&= I_\phi(q) - I_\phi(p) - \sum_k (p_k - q_k)\ln_\phi(q_k).
\end{aligned} \tag{34}$$
This expression is of the form (5) with $f(x)$ given by (23). Positivity of $D_\phi(p\|q)$ follows immediately because $\ln_\phi(x)$ is an increasing function of $x$. Equality $D_\phi(p\|q) = 0$ implies that $p = q$. Convexity in the first argument is straightforward.

For the example of Tsallis’ entropy one obtains
$$D_\phi(p\|q) = \frac{1}{\kappa}\sum_k p_k\left(p_k^\kappa - q_k^\kappa\right) - \sum_k (p_k - q_k)q_k^\kappa, \tag{35}$$
which is definitely different from (32).

If the probabilities $p_k$ depend on parameters $\theta^i$ then the generalized Fisher information metric becomes
$$g_{ij}(p) = \sum_k \ln'_\phi(p_k)\frac{\partial p_k}{\partial\theta^i}\frac{\partial p_k}{\partial\theta^j}. \tag{36}$$
Indeed, one has
$$\begin{aligned}
D_\phi(p + dp\|p) &= \sum_k \int_{p_k}^{p_k + dp_k} dx\, \left(\ln_\phi(x) - \ln_\phi(p_k)\right) \\
&= \sum_k \int_{p_k}^{p_k + dp_k} dx\, \ln'_\phi(p_k)(x - p_k) + \cdots \\
&= \frac{1}{2}\sum_k \ln'_\phi(p_k)\left(dp_k\right)^2 + \cdots,
\end{aligned} \tag{37}$$
and similarly for Dφ (p||p+dp). In contrast with (33) the metric tensor (36) depends in a non-trivial way on the deformed logarithm lnφ . 6. Continuity Estimates of Entropy and of Relative Entropy In Appendix A it is proved that |Iφ (p) − Iφ (q)| ≤ − =
XZ k
X k
|pk −qk |
lnφ (x)dx 0
[Fφ (0) − Fφ (|pk − qk |)]
≡ d(p, q) ≤ +∞ .
(38)
The right-hand side of (38) defines a metric $d(p,q)$. In particular, it satisfies the triangle inequality. Note that the distance between two pdfs may be infinite. This is not a problem since one can always define a new metric by $d_M(x,y) = \min\{d(x,y), M\}$, with $M$ a fixed positive constant. The two metrics $d$ and $d_M$ define the same topology. If $\ln_\phi(x)$ is taken to be the natural logarithm $\ln(x)$ then (38) becomes
\[ |I_0(p) - I_0(q)| \le \|p - q\|_1 - \sum_k |p_k - q_k|\,\ln(|p_k - q_k|). \tag{39} \]
More generally, take $\ln_\phi$ equal to the logarithm (24), used in the Tsallis context. Then (38) becomes
\[ |I_\phi(p) - I_\phi(q)| \le (1 + \kappa^{-1})\,\|p - q\|_1 - \kappa^{-1}\sum_k |p_k - q_k|^{1+\kappa}. \tag{40} \]
Differences in relative entropy can be estimated in a similar way as for entropy differences. One finds (see Appendix A)
\[ |I_\phi(p\|r) - I_\phi(q\|r)| \le d(p,q) + h_r(p,q), \tag{41a} \]
\[ |D_\phi(p\|r) - D_\phi(q\|r)| \le d(p,q) + e_r(p,q), \tag{41b} \]
with $d(p,q)$ as before, and with
\[ h_r(p,q) = \sum_k |p_k - q_k|\,\ln_\phi(1/r_k), \qquad e_r(p,q) = -\sum_k |p_k - q_k|\,\ln_\phi(r_k). \tag{42} \]
The right-hand side of (41) is the sum of two distances, each satisfying the triangle inequality. Take q = r in (41) to obtain an upper bound for Iφ (p||q), respectively Dφ (p||q).
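The bounds (39) and (40) can be probed numerically. The following is a minimal sketch, not part of the original argument: it draws random pdfs and records the largest amount by which the entropy difference exceeds the stated right-hand side. It assumes that $I_0$ is the Shannon entropy $-\sum_k p_k \ln p_k$ and that the Tsallis-type entropy is normalized as $I_\phi(p) = \kappa^{-1}(1 - \sum_k p_k^{1+\kappa})$; this normalization is inferred from (40), and all function names are hypothetical.

```python
import numpy as np

def shannon(p):
    """I_0(p) = -sum_k p_k ln p_k, with the convention 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def tsallis(p, kappa):
    """Assumed normalization: I_phi(p) = (1 - sum_k p_k^(1+kappa)) / kappa."""
    p = np.asarray(p, dtype=float)
    return (1.0 - np.sum(p ** (1.0 + kappa))) / kappa

def bound_39(p, q):
    """Right-hand side of (39)."""
    d = np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))
    nz = d > 0
    return d.sum() - np.sum(d[nz] * np.log(d[nz]))

def bound_40(p, q, kappa):
    """Right-hand side of (40)."""
    d = np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))
    return (1.0 + 1.0 / kappa) * d.sum() - np.sum(d ** (1.0 + kappa)) / kappa

def worst_violation(trials=2000, n=6, kappa=0.5, seed=0):
    """Largest value of (entropy difference - bound); non-positive if the bounds hold."""
    rng = np.random.default_rng(seed)
    worst39 = worst40 = -np.inf
    for _ in range(trials):
        p = rng.dirichlet(np.ones(n))
        q = rng.dirichlet(np.ones(n))
        worst39 = max(worst39, abs(shannon(p) - shannon(q)) - bound_39(p, q))
        worst40 = max(worst40,
                      abs(tsallis(p, kappa) - tsallis(q, kappa)) - bound_40(p, q, kappa))
    return worst39, worst40
```

Calling `worst_violation()` returns a pair of numbers; non-positive values are consistent with (39) and (40) on the sampled pdfs, which is of course weaker than the proof in Appendix A.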
7. A General Continuity Condition

The right-hand side of (38) resembles the entropy of a distribution with elements $|p_k - q_k|$. Introduce therefore the symmetric difference $p\Delta q$ of two distinct pdfs $p$ and $q$ by
\[ (p\Delta q)_k = \frac{|p_k - q_k|}{\|p - q\|_1}. \tag{43} \]
Note that $p\Delta q$ is again a pdf. Its elements satisfy $(p\Delta q)_k \le 1/2$. This implies that
\[ I_\phi(p\Delta q) \ge -F_\phi(0) - \ln_\phi(1/2). \tag{44} \]
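As a concrete illustration of definition (43), the short sketch below computes $p\Delta q$ for one hand-picked pair of pdfs and exposes the two properties quoted above (normalization and elements at most 1/2). The function name `sym_diff` and the sample vectors are illustrative choices, not taken from the paper.

```python
import numpy as np

def sym_diff(p, q):
    """Symmetric difference (43): |p_k - q_k| / ||p - q||_1 for distinct pdfs p, q."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    d = np.abs(p - q)
    norm = d.sum()
    if norm == 0:
        raise ValueError("p and q must be distinct")
    return d / norm

p = np.array([0.5, 0.3, 0.2, 0.0])
q = np.array([0.1, 0.3, 0.2, 0.4])
x = sym_diff(p, q)  # only the first and last components differ
```

The bound $(p\Delta q)_k \le 1/2$ reflects the fact that, since both pdfs sum to one, the total excess of $p$ over $q$ equals the total excess of $q$ over $p$.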
In Appendix B it is shown that (38) implies, if $\|p - q\|_1 \le 1$, that
\[ |I_\phi(p) - I_\phi(q)| \le \frac{F_\phi(0) - F_\phi(\|p - q\|_1)}{F_\phi(0)}\,\bigl[F_\phi(0) + I_\phi(p\Delta q)\bigr]. \tag{45} \]
If $\|p - q\|_1 = 1$, then this inequality coincides with (38). Take $\ln_\phi$ equal to the logarithm (24), used in the Tsallis context. Then (45) becomes
\[ |I_\phi(p) - I_\phi(q)| \le \frac{1}{\kappa}\bigl[(1+\kappa)\|p - q\|_1 - \|p - q\|_1^{1+\kappa}\bigr]\,\bigl[1 + I_\phi(p\Delta q)\bigr]. \tag{46} \]
This is less sharp than (40) which can be written as
\[ |I_\phi(p) - I_\phi(q)| \le \frac{1}{\kappa}(1+\kappa)\|p - q\|_1 + \|p - q\|_1^{1+\kappa}\Bigl[I_\phi(p\Delta q) - \frac{1}{\kappa}\Bigr]. \tag{47} \]
In combination with (44), (45) shows that the entropy functional $I_\phi(p)$ satisfies the following condition.

Condition 1. For each $\epsilon > 0$ there exists $\delta > 0$ such that
\[ |I(p) - I(q)| \le \epsilon\, I(p\Delta q) \tag{48} \]
holds for all pdfs $p$ and $q$ satisfying $p \ne q$ and $\|p - q\|_1 \le \delta$.

To show the relevance of this condition one consequence is highlighted. Note that $(\lambda p + (1-\lambda)q)\Delta q$ does not depend on $\lambda$ in the range $0 < \lambda \le 1$. Hence Condition 1 implies that for each $\epsilon > 0$ there exists $\delta > 0$ such that
\[ |I(\lambda p + (1-\lambda)q) - I(\mu p + (1-\mu)q)| \le \epsilon\, I(p\Delta q) \tag{49} \]
holds for distinct pairs $p$ and $q$, and for all $\lambda$ and $\mu$ between 0 and 1, satisfying $|\lambda - \mu|\,\|p - q\|_1 \le \delta$. This result implies uniform continuity of entropy on the segment $(p, q)$, provided $I(p\Delta q)$ is finite.
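The $\lambda$-independence used in this section follows from $(\lambda p + (1-\lambda)q) - q = \lambda(p - q)$, so the factor $\lambda$ cancels in the normalization of the symmetric difference. A small sketch with hypothetical helper names and arbitrarily chosen sample pdfs makes this visible:

```python
import numpy as np

def sym_diff(p, q):
    """Symmetric difference of two distinct pdfs: |p_k - q_k| / ||p - q||_1."""
    d = np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))
    return d / d.sum()

def mix(lam, p, q):
    """Convex combination lam * p + (1 - lam) * q."""
    return lam * np.asarray(p, dtype=float) + (1.0 - lam) * np.asarray(q, dtype=float)

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.2, 0.3, 0.5])
reference = sym_diff(p, q)

# Deviation of (lam*p + (1-lam)*q) symmetric-difference q from p symmetric-difference q.
deviations = [np.max(np.abs(sym_diff(mix(lam, p, q), q) - reference))
              for lam in (0.01, 0.25, 0.5, 1.0)]
```

All entries of `deviations` vanish up to floating-point rounding, for any $\lambda$ in $(0, 1]$.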
8. Lesche's Stability Condition

Assume now that the number of microstates is finite, equal to $N$ (i.e., the index $k$ of the pdfs $p$ and $q$ runs from 1 to $N$). Introduce the notation
\[ I^{\max}(N) = \max\{I(p) : p_k = 0 \text{ for } k > N\}. \tag{50} \]
Lesche [2] showed twenty years ago that $I_0(p)$ satisfies the following condition.

Condition 2. For each $\epsilon > 0$ there exists $\delta > 0$ such that
\[ |I(p) - I(q)| \le \epsilon\, I^{\max}(N) \tag{51} \]
holds for all pdfs $p$ and $q$ satisfying $\|p - q\|_1 \le \delta$ and $p_k = q_k = 0$ for $k > N$.

It is clear that an entropy function $I(p)$ satisfying Condition 1 also satisfies Condition 2. For fixed $N$ these conditions imply uniform continuity, which is a rather trivial statement because a continuous function on a compact set is automatically uniformly continuous. In addition, (51) specifies how the estimate depends on the number of nonzero components $N$. In the remainder of this section some inequalities, used in the literature to prove Lesche's condition, are shown to follow from (38). In Appendix C, it is shown that (38) implies that
\[ |I_\phi(p) - I_\phi(q)| \le N F_\phi(0) - N F_\phi(N^{-1}\|p - q\|_1) = -N\int_0^{\|p - q\|_1/N} dx\,\ln_\phi(x) = \|p - q\|_1\,\bigl[F_\phi(0) + \omega_\phi(N/\|p - q\|_1)\bigr]. \tag{52} \]
It is difficult to bound $\omega_\phi(N/\|p - q\|_1)$ by $I_\phi^{\max}(N) = \omega_\phi(N)$ in the general case using only that $\omega_\phi(x)$ is a concave increasing function. However, in the case that the deformed logarithm is given by (24), one has
\[ \omega_\phi(N/\|p - q\|_1) = \frac{1}{\kappa}\bigl(1 - \|p - q\|_1^\kappa\bigr) + \|p - q\|_1^\kappa\,\omega_\phi(N). \tag{53} \]
This can be used to write (52) in the following form
\[ |I_\phi(p) - I_\phi(q)| \le (1 + \kappa^{-1})\,\|p - q\|_1 + \bigl[-\kappa^{-1} + I_\phi^{\max}(N)\bigr]\,\|p - q\|_1^{1+\kappa}. \tag{54} \]
This is the result obtained recently by Abe [5]. It implies that $I_\phi(p)$ satisfies Condition 2. In the limit $\kappa = 0$, (54) becomes
\[ |I_0(p) - I_0(q)| \le \bigl[1 + I_0^{\max}(N)\bigr]\,\|p - q\|_1 - \|p - q\|_1 \ln(\|p - q\|_1). \tag{55} \]
This is the expression obtained originally by Lesche [2]. Fannes$^{a}$ [3] showed that, if $\|p - q\|_1 \le 1/3$, then one has the slightly stronger inequality
\[ |I_0(p) - I_0(q)| \le I_0^{\max}(N)\,\|p - q\|_1 - \|p - q\|_1 \ln(\|p - q\|_1). \tag{56} \]
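Lesche's bound (55) can be checked on random samples in the same spirit as the estimates of Section 6. The sketch below assumes that $I_0$ is the Shannon entropy and uses the standard fact $I_0^{\max}(N) = \ln N$ (attained by the uniform distribution); the function names are hypothetical.

```python
import numpy as np

def shannon(p):
    """I_0(p) = -sum_k p_k ln p_k, with the convention 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def lesche_bound(eps, n_states):
    """Right-hand side of (55) with I_0^max(N) = ln N and eps = ||p - q||_1 > 0."""
    return (1.0 + np.log(n_states)) * eps - eps * np.log(eps)

def max_gap(trials=2000, n_states=5, seed=1):
    """Largest value of |I_0(p) - I_0(q)| minus the bound (55) over random pdfs."""
    rng = np.random.default_rng(seed)
    worst = -np.inf
    for _ in range(trials):
        p = rng.dirichlet(np.ones(n_states))
        q = rng.dirichlet(np.ones(n_states))
        eps = np.abs(p - q).sum()
        worst = max(worst, abs(shannon(p) - shannon(q)) - lesche_bound(eps, n_states))
    return worst
```

A non-positive return value of `max_gap()` is consistent with (55) on the sampled pairs. The sharper inequality (56) carries the extra restriction $\|p - q\|_1 \le 1/3$ and is not tested here.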
9. Discussion

The present paper considers a large class of entropy functionals. Their definition is based on the concept of deformed logarithms. These entropies have nice enough properties to enable the proof of useful estimates. Only discrete pdfs have been considered. Expressions for continuous distributions and for quantum probabilities are found in [28].

For each entropy functional $I_\phi(p)$ there exists a metric $d(p,q)$ bounding the difference $|I_\phi(p) - I_\phi(q)|$; see inequality (38). The difference of relative entropies $|I_\phi(p\|r) - I_\phi(q\|r)|$ is bounded by the sum of two distances, the distance $d(p,q)$ mentioned above, and a distance $h_r(p,q)$ which depends on the pdf $r$; see (41, 42).

An alternative definition of relative entropy $D_\phi(p\|q)$ has been proposed. It satisfies similar properties as $I_\phi(p\|q)$, but serves other goals. It is used in generalized statistical physics to measure changes in free energy. In information theory it is a measure of redundancy.

Although the proof of (38) is rather elementary, the result can be used to show that all entropy functionals considered in the present paper satisfy Lesche's stability condition (Condition 2 of the paper), as well as a stronger version of the inequality (Condition 1 of the paper). The proof is shorter and more transparent than that of [5].

Acknowledgment

I thank Dr. P. Harremoës and Prof. H. Hasegawa for providing some of the references to the literature.

Appendix A

Here we prove the inequalities (38) and (41). Consider
\[ I_\phi(p) - I_\phi(q) = -\sum_k \int_{q_k}^{p_k} dx\,\ln_\phi(x). \tag{A.1} \]

$^{a}$ The quantum mechanical entropy of a state with density matrix $\rho$ is defined by $I_0(\rho) = -\mathrm{Tr}\,\rho\log\rho$. The inequality $|I_0(\rho) - I_0(\sigma)| \le \lambda\log N - \lambda\log\lambda$ holds with $\lambda = \mathrm{Tr}|\rho - \sigma|$ whenever $\lambda \le 1/3$. Here, $N$ is the dimension of the Hilbert space and Tr is the trace. The inequality is quoted in [7], together with a proof, in Proposition 1.8, be it with quite different notations.
If $p_k < q_k$ then the contribution is negative and may be omitted when trying to obtain an upper bound. Hence one gets immediately, using Heaviside's function $\theta(x)$,
\[ I_\phi(p) - I_\phi(q) \le -\sum_k \theta(p_k - q_k)\int_0^{p_k - q_k} dx\,\ln_\phi(x) \le -\sum_k \int_0^{|p_k - q_k|} dx\,\ln_\phi(x). \tag{A.2} \]
This proves (38). To prove (41) note that from (31) follows
\[ I_\phi(p\|r) - I_\phi(q\|r) = \sum_k \int_{q_k}^{p_k} dx\,\ln_\phi(x/r_k). \tag{A.3} \]
Assume $p_k < q_k$ and write the $k$th term as
\[ -\int_{p_k}^{q_k} dx\,\ln_\phi(x/r_k). \tag{A.4} \]
It increases when $-\ln_\phi(x/r_k)$ is replaced by $-\ln_\phi(x)$. Hence the sum of all these terms is less than $d(p,q)$. On the other hand, if $p_k \ge q_k$ then the factor $\ln_\phi(x/r_k)$ in the $k$th term can be replaced by $\ln_\phi(1/r_k)$, which yields the bound
\[ \int_{q_k}^{p_k} dx\,\ln_\phi(x/r_k) \le (p_k - q_k)\,\ln_\phi(1/r_k). \tag{A.5} \]
The sum of these terms is bounded by $h_r(p,q)$. This finishes the proof of (41a). In the case of the alternative definition of divergence one has
\[ D_\phi(p\|r) - D_\phi(q\|r) = -I_\phi(p) + I_\phi(q) - \sum_k (p_k - q_k)\,\ln_\phi(r_k). \tag{A.6} \]
Hence, in this case the estimate is straightforward.
Appendix B

Here, expression (45) is derived. Note that any increasing concave function $g(x)$, satisfying $g(0) \ge 0$, also satisfies
\[ g(\lambda x)\,g(y) \le g(x)\,g(\lambda y) \tag{B.1} \]
for all $\lambda$, $x$, and $y$, for which $0 \le \lambda \le 1$ and $0 < x < y$ hold. Apply this result with $g(x) = F_\phi(0) - F_\phi(x)$ (which is increasing on $0 \le x \le 1$), $\lambda = \|p - q\|_1$, $x = |p_k - q_k|/\|p - q\|_1$, and $y = 1$. Note that the assumption $\|p - q\|_1 \le 1$ is needed here. There follows, using $F_\phi(1) = 0$,
\[ \bigl[F_\phi(0) - F_\phi(|p_k - q_k|)\bigr]\,F_\phi(0) \le \bigl[F_\phi(0) - F_\phi(|p_k - q_k|/\|p - q\|_1)\bigr]\,\bigl[F_\phi(0) - F_\phi(\|p - q\|_1)\bigr]. \tag{B.2} \]
Using (38) this implies (45).
Appendix C

Here, inequality (52) is proved. Because $F_\phi(x)$ is convex one has for any $x$ and $a > 0$
\[ F_\phi(x) \ge F_\phi(a) + (x - a)\,\ln_\phi(a). \tag{C1} \]
Therefore (38) implies
\[ |I_\phi(p) - I_\phi(q)| \le N F_\phi(0) - N F_\phi(a) - \bigl(\|p - q\|_1 - N a\bigr)\,\ln_\phi(a). \tag{C2} \]
The optimal choice of $a$ is $a = N^{-1}\|p - q\|_1$. This implies (52).

References
[1] P. Harremoës and F. Topsøe, Maximum entropy fundamentals, Entropy 3 (2001) 191–226.
[2] B. Lesche, Instabilities of Rényi entropies, J. Stat. Phys. 27 (1982) 419–423.
[3] M. Fannes, A continuity property of the energy density for spin lattice systems, Commun. Math. Phys. 31 (1973) 291–294.
[4] A. Rényi, On the foundations of information theory, Rev. Int. Stat. Inst. 33 (1965) 1–14.
[5] S. Abe, Stability of Tsallis entropy and instabilities of Rényi and normalized Tsallis entropies: A basis for q-exponential distributions, Phys. Rev. E66 (2002) 046134, arXiv:cond-mat/0206078.
[6] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986) 57–65.
[7] M. Ohya and D. Petz, Quantum Entropy and Its Use (Springer-Verlag, 1993).
[8] D. Petz, Monotonicity of quantum relative entropy revisited, Rev. Math. Phys. 15(1) (2003) 79–91.
[9] J. Naudts, Deformed exponentials and logarithms in generalized thermostatistics, Physica A316 (2002) 323–334, arXiv:cond-mat/0203489.
[10] J. Naudts, Generalized thermostatistics and mean-field theory, Physica A332 (2004) 279–300, arXiv:cond-mat/0211444.
[11] J. Naudts, Generalized thermostatistics based on deformed exponential and logarithmic functions, Physica A340 (2004) 32–40, arXiv:cond-mat/0311438.
[12] S. Kullback and R. Leibler, On information and sufficiency, Ann. Math. Stat. 22 (1951) 79–86.
[13] I. Csiszár, Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten, Magyar Tud. Akad. Mat. Kutató Int. Közl. 8 (1963) 85–108.
[14] I. Csiszár, A class of measures of informativity of observation channels, Per. Math. Hung. 2(1–4) (1972) 191–213.
[15] L. M. Bregman, The relaxation method of finding a common point of convex sets and its application to the solution of problems in convex programming, USSR Comp. Math. Math. Phys. 7 (1967) 200–217.
[16] G. Kaniadakis and A. M. Scarfone, A new one parameter deformation of the exponential function, NEXT2001 Meeting, Physica A305 (2002) 69–75, cond-mat/0109537.
[17] C. Tsallis, Possible Generalization of Boltzmann-Gibbs Statistics, J. Stat. Phys. 52 (1988) 479–487.
[18] C. Tsallis, What are the numbers that experiments provide? Quimica Nova 17 (1994) 468.
[19] J. Havrda and F. Charvat, Quantification methods of classification processes: Concepts of structural entropy, Kybernetika 3 (1967) 30–35.
[20] Z. Daróczy, Generalized information functions, Inform. Control 16 (1970) 36–51.
[21] G. Kaniadakis, Nonlinear kinetics underlying generalized statistics, Physica A296 (2001) 405–425.
[22] Inequalities for Csiszár f-Divergence in Information Theory, ed. S. S. Dragomir, RGMIA Monographs, Victoria University, 2000; http://rgmia.vu.edu.au/monographs/.
[23] S. Abe, q-Deformed Entropies and Fisher Metrics, in Proceedings of The 5th International Wigner Symposium August 25-29, 1997, Vienna, Austria, eds. P. Kasperkovitz and D. Grau (World Scientific, Singapore, 1998), p. 66.
[24] C. Tsallis, Generalized entropy-based criterion for consistent testing, Phys. Rev. E58(2) (1998) 1442–1445.
[25] M. Shiino, H-theorem with generalized relative entropies and the Tsallis statistics, J. Phys. Soc. Jpn. 67(11) (1998) 3658–3660.
[26] H. Hasegawa, α-divergence of the non-commutative information geometry, Rep. Math. Phys. 33(1/2) (1993) 87–93.
[27] S. Amari, Differential-geometrical methods in statistics, Lecture Notes in Statistics 28 (1985).
[28] J. Naudts, Non-unique way to generalize the Boltzmann-Gibbs distribution, arXiv:cond-mat/0303051.
October 15, 2004 17:23 WSPC/148-RMP
00218
Reviews in Mathematical Physics Vol. 16, No. 7 (2004) 823–849 c World Scientific Publishing Company
DEFORMATIONS OF LOOP ALGEBRAS AND CLASSICAL INTEGRABLE SYSTEMS: FINITE-DIMENSIONAL HAMILTONIAN SYSTEMS
T. SKRYPNYK
Bogoliubov Institute for Theoretical Physics, Metrologichna st. 14-b, Kiev 03143, Ukraine
[email protected]
Received 16 January 2003
Revised 5 June 2004

We construct a family of infinite-dimensional quasigraded Lie algebras that can be viewed as deformations of the graded loop algebras and admit the Kostant–Adler scheme. Using them we obtain new integrable hamiltonian systems admitting Lax-type representations with the spectral parameter.

Keywords: Quasigraded Lie algebras; Lax representation; integrable systems.
Mathematical Subject Classification 2000: 17B65, 81R12, 37J35, 70H06
1. Introduction

The special importance of graded loop algebras and their central extensions [1] in the theory of integrable hamiltonian systems, both finite-dimensional [2–6] and infinite-dimensional [8–12], has been well known since the early eighties. The pioneering work on the subject belongs to Reyman and Semenov-Tian-Shansky [2], who were the first to realize the possibility of applying the so-called Kostant–Adler scheme [4] to loop algebras and developed the corresponding general theory (see [6, 7]). The basic property of the loop algebras that permits their use in the theory of integrable systems is that they are graded [1, 9]. This is what makes the Kostant–Adler scheme applicable to them [2–7].

In the present paper we construct a new class of infinite-dimensional Lie algebras that might be used in the theory of integrable systems. Unlike the loop algebras they are not graded but possess the weaker property of a quasigradation [34]. The constructed quasigraded Lie algebras are continuous multiparametric deformations of the graded loop algebras $\tilde{\mathfrak{g}}$. The parameters of the deformation are the matrix elements of a certain matrix $A$. We denote the corresponding infinite-dimensional Lie algebras by $\tilde{\mathfrak{g}}_A$.$^{a}$

$^{a}$ Note that, despite the similar notation, the algebras $\tilde{\mathfrak{g}}_A$ are not connected with the Kac–Moody algebras $\mathfrak{g}(A)$.

The Lie algebras $\tilde{\mathfrak{g}}_A$ were already presented in our previous short paper [18]. They generalize our previous semi-geometric construction of the special Lie algebras $\tilde{\mathfrak{g}}_H$ on the higher genus curves $H$ [14–17]. The algebras $\tilde{\mathfrak{g}}_H$ are in turn a direct generalization of the special elliptic Lie algebras of Holod [25–27]. The Lie algebras $\tilde{\mathfrak{g}}_A$ admit the decomposition $\tilde{\mathfrak{g}}_A = \tilde{\mathfrak{g}}_A^+ + \tilde{\mathfrak{g}}_A^-$, which is the main ingredient of the Kostant–Adler scheme. It is also necessary to mention the papers of Golubchik and Sokolov [22–24], where (specially realized) subalgebras $\tilde{\mathfrak{g}}_A^-$ were independently constructed in the spirit of ideas of Cherednik [20] as a possible complementary subalgebra to the Lie algebra of formal Taylor series in the Lie algebra of Laurent power series $\tilde{\mathfrak{g}}(\lambda^{-1}, \lambda)$, and the paper of Bordag and Yanovsky [21], where the case $\mathfrak{g} = so(4)$ was considered. The possibility of defining the complementary Lie subalgebra $\tilde{\mathfrak{g}}_A^+$ and the corresponding "large" Lie algebra $\tilde{\mathfrak{g}}_A$, and the quasigraded character of all these Lie algebras, were not noticed or acknowledged there.

In the present paper we study the properties of the Lie algebras $\tilde{\mathfrak{g}}_A$ for all classical matrix Lie algebras $\mathfrak{g}$ and arbitrary matrix $A$. We construct their coadjoint representations and an infinite set of their invariants. Following the standard procedure [6] we introduce "direct-difference" Lie–Poisson brackets into the linear spaces $\tilde{\mathfrak{g}}_A^*$. As a result we obtain an infinite number of functions on the dual spaces of our algebras that commute with respect to the "direct-difference" Lie–Poisson brackets. That permits us to develop the theory of integrable systems (both finite and infinite dimensional) based entirely on the algebras $\tilde{\mathfrak{g}}_A$.

We concentrate our attention on the theory of finite-dimensional hamiltonian systems connected with the algebras $\tilde{\mathfrak{g}}_A$. In order to obtain these systems in the framework of our construction we use the fact that the algebras $\tilde{\mathfrak{g}}_A$ are quasigraded.
This property permits us to define an infinite sequence of ideals of finite co-dimension in the algebra $\tilde{\mathfrak{g}}_A$ equipped with the "direct difference" bracket. As a result we obtain a large number of commuting functions on the dual space of each quotient algebra of finite dimension, and, hence, an infinite sequence of integrable equations of Euler–Arnold type. We study properties of these systems in detail. We explicitly construct a Lie–Poisson bracket on each quotient algebra and determine which of the obtained commuting functions are Casimir invariants and which define non-trivial hamiltonian flows. We construct Lax-type representations with the spectral parameters for these hamiltonian systems. They coincide with the "multiparametric deformation" of the usual Lax representation. We explicitly construct a hierarchy of $M$-operators that correspond to the constructed commuting integrals.

In order to show the effectiveness of the application of the algebras $\tilde{\mathfrak{g}}_A$ to the theory of finite-dimensional integrable systems we devote a substantial part of the present article to a consideration of interesting examples. We consider in detail the constructed hamiltonian systems in the quotient spaces of a small quasigrade. Among them there are several types of known series of integrable systems and a number of new series. The simplest series of integrable systems we have obtained are generalized tops. Our tops coincide with well-known ones [28–30] but the constructed Lax-type representations with spectral parameters are new for the cases $\mathfrak{g} = gl(n), sp(n)$ (for the case $\mathfrak{g} = so(n)$ our Lax-type representation can be transformed to the Lax representation for the generalized $so(n)$ tops constructed in [39]). Other examples give us a new series of integrable systems that cannot be obtained using loop algebras. These systems exist for all the series of classical Lie algebras. They generalize two Steklov integrable cases connected with the $so(3)$ algebra [31, 33].

Contrary to loop algebras our algebras contain a lot of free parameters that determine their structure constants. These parameters can be varied in order to obtain new interesting examples of hamiltonian systems. In this way, considering special degenerations of the algebras $\tilde{\mathfrak{g}}_A$, i.e. sending some of the parameters of deformations to zero, we have obtained other integrable hamiltonian systems. In particular, we have constructed analogs of the generalized Clebsch systems for all classical series of simple Lie algebras. In the case $\mathfrak{g} = so(n)$ they coincide with the generalized Clebsch systems discovered in [35]. Moreover, using the same degeneration of the algebras $\tilde{\mathfrak{g}}_A$ we obtain a "spin generalization" of the generalized Clebsch systems, i.e. integrable cases of the generalized Clebsch systems interacting with generalized tops. The last system seems to be unknown even in the case of small rank algebras. We obtain explicit formulas for Lax pairs for all of the above systems.

At the end of the paper we consider conditions that should be imposed on the matrix $A$ in order to provide completeness of the constructed families of the commuting integrals. We also consider some algebro-geometric objects connected with the procedure of integration of the obtained Euler–Arnold equations such as "spectral" curves, Baker–Akhiezer functions etc.

2. Integrable Systems via Lie Algebras: General Scheme

In this section we will recall the main facts from the Lie-algebraic approach to the theory of finite-dimensional hamiltonian systems based on the so-called Kostant–Adler scheme [4]. Let $\tilde{\mathfrak{g}}$ be some Lie algebra (finite or infinite dimensional) with Lie brackets $[\ ,\ ]$. Let $\tilde{\mathfrak{g}}^*$ be its dual space, $\{\ ,\ \}$ the standard Lie–Poisson bracket on $\tilde{\mathfrak{g}}^*$, and $I(\tilde{\mathfrak{g}}^*)$ the ring of invariants of the coadjoint representation of $\tilde{\mathfrak{g}}$. Moreover let $\tilde{\mathfrak{g}}$ possess the following special property: $\tilde{\mathfrak{g}} = \tilde{\mathfrak{g}}_+ + \tilde{\mathfrak{g}}_-$, where $\tilde{\mathfrak{g}}_\pm$ are subalgebras of $\tilde{\mathfrak{g}}$ and $+$ denotes the direct sum of linear subspaces. Then, as is well-known [6], it is possible to introduce into the linear space $\tilde{\mathfrak{g}}$ new Lie brackets, defined as follows:
\[ [X, Y]_0 = [X_+, Y_+] - [X_-, Y_-], \]
where $X = X_+ + X_-$, $Y = Y_+ + Y_-$, $X_\pm, Y_\pm \in \tilde{\mathfrak{g}}_\pm$, $X, Y \in \tilde{\mathfrak{g}}$. We will denote the linear space $\tilde{\mathfrak{g}}$ with the bracket $[\ ,\ ]_0$ by $\tilde{\mathfrak{g}}^0$. Let $\{\ ,\ \}_0$ be the Lie–Poisson bracket on $\tilde{\mathfrak{g}}^*$ that corresponds to the bracket $[\ ,\ ]_0$. The basic theorem in the Lie algebraic approach to the theory of integrable systems is the so-called Kostant–Adler Theorem [6]:
Theorem 2.1. (i) The algebra of functions $I(\tilde{\mathfrak{g}}^*)$ is commutative with respect to the bracket $\{\ ,\ \}_0$ on $\tilde{\mathfrak{g}}^*$. (ii) Hamiltonian equations of motion defined by the function $I \in I(\tilde{\mathfrak{g}}^*)$ with respect to the bracket $\{\ ,\ \}_0$ have the following form:
\[ \frac{dL}{dt} = \pm\,\mathrm{ad}^*_M L, \]
where $L \in \tilde{\mathfrak{g}}^*$, $M = \pm P_\pm \nabla I$, and $\nabla I$ is the algebra-valued gradient of $I$.

This theorem is basic for applications in the theory of integrable systems. It provides hamiltonian systems on $(\tilde{\mathfrak{g}}^0)^*$ possessing many mutually commuting integrals of motion, namely the generators of the ring $I(\tilde{\mathfrak{g}}^*)$. In the case when the algebra $\tilde{\mathfrak{g}}^*$ is infinite-dimensional, or in the case when the number of independent generators of $I(\tilde{\mathfrak{g}}^*)$ is less than the dimension of the generic coadjoint orbit in $(\tilde{\mathfrak{g}}^0)^*$, in order to obtain integrable finite-dimensional hamiltonian systems via this scheme it is necessary to use the following corollary of Theorem 2.1:

Corollary 2.1. Let $J$ be an ideal of the algebra $\tilde{\mathfrak{g}}^0$ of finite co-dimension. Then (i) the algebra of functions $I((\tilde{\mathfrak{g}}/J)^*)$ is commutative with respect to the restriction of the bracket $\{\ ,\ \}_0$ on $(\tilde{\mathfrak{g}}/J)^*$; (ii) Hamiltonian equations of motion defined by the function $I \in I((\tilde{\mathfrak{g}}/J)^*)$ with respect to the bracket $\{\ ,\ \}_0$ have the following form:
\[ \frac{dL}{dt} = \pm\,\mathrm{ad}^*_M L, \]
where $L \in (\tilde{\mathfrak{g}}/J)^*$, $M = \pm P_\pm (\nabla I)|_{L \in (\tilde{\mathfrak{g}}/J)^*}$, and $(\nabla I)|_{L \in (\tilde{\mathfrak{g}}/J)^*}$ is a restriction of the algebra-valued gradient of $I \in I(\tilde{\mathfrak{g}}^*)$ onto $(\tilde{\mathfrak{g}}/J)^*$.

A surprisingly large number of finite-dimensional integrable systems are obtained via Corollary 2.1. The majority of them are obtained using graded loop algebras [1, 2, 6]. It is known [2, 6] that the finite-dimensional quotients of the loop algebras coincide with phase spaces of many interesting integrable systems of Euler–Arnold type. The main property that makes Theorem 2.1 and Corollary 2.1 applicable to the loop algebras is their property of being graded.
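The modified bracket $[X, Y]_0 = [X_+, Y_+] - [X_-, Y_-]$ introduced at the beginning of this section can be illustrated on a finite-dimensional toy example. The sketch below is only an illustration under an assumed splitting, not the loop-algebra setting of the paper: it takes $gl(n)$ with $\tilde{\mathfrak{g}}_+$ the upper triangular matrices (including the diagonal) and $\tilde{\mathfrak{g}}_-$ the strictly lower triangular ones, and evaluates the Jacobi expression for the new bracket. All function names are hypothetical.

```python
import numpy as np

def split(X):
    """X = X_plus + X_minus: upper triangular part (with diagonal), strictly lower part."""
    return np.triu(X), np.tril(X, k=-1)

def comm(X, Y):
    """Ordinary matrix commutator."""
    return X @ Y - Y @ X

def bracket0(X, Y):
    """[X, Y]_0 = [X_+, Y_+] - [X_-, Y_-] for the triangular splitting of gl(n)."""
    Xp, Xm = split(X)
    Yp, Ym = split(Y)
    return comm(Xp, Yp) - comm(Xm, Ym)

def jacobi_residual(bracket, X, Y, Z):
    """Max-norm of [[X,Y],Z] + [[Y,Z],X] + [[Z,X],Y] for the given bracket."""
    J = (bracket(bracket(X, Y), Z)
         + bracket(bracket(Y, Z), X)
         + bracket(bracket(Z, X), Y))
    return np.max(np.abs(J))
```

For random matrices the residual vanishes up to rounding, because both triangular parts are subalgebras and the mixed brackets are zero by construction.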
In the next sections we will show that not only graded loop algebras but also special quasigraded Lie algebras can be used in order to produce integrable systems via the Kostant–Adler scheme, and consider corresponding examples.

3. Deformation of Loop Algebras and Integrable Systems

3.1. Quasigraded Lie algebra $\tilde{\mathfrak{g}}_A$

Let us consider classical matrix Lie algebras $\mathfrak{g}$ of the type $gl(n)$, $so(n)$ and $sp(n)$ over the field $K$ of the complex or real numbers. We will realize the algebra $so(n)$ as the algebra of skew-symmetric matrices, $so(n) = \{X \in gl(n) \mid X = -X^{\top}\}$, and the algebra $sp(n)$ as the following matrix algebra: $sp(n) = \{X \in gl(n) \mid X = sX^{\top}s\}$, where $n \in 2\mathbb{Z}$, $s \in so(n)$ and $s^2 = -1$. We introduce into the space $\tilde{\mathfrak{g}} = \mathfrak{g} \otimes P(\lambda, \lambda^{-1})$ new Lie brackets:
\[ [X \otimes p(\lambda), Y \otimes q(\lambda)] = [X, Y] \otimes p(\lambda)q(\lambda) - [X, Y]_A \otimes \lambda p(\lambda)q(\lambda), \tag{1} \]
where $p(\lambda), q(\lambda) \in \mathrm{Pol}(\lambda, \lambda^{-1})$ and $[X, Y]_A \equiv XAY - YAX$.
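The matrix part of the bracket (1) can be explored numerically. The sketch below evaluates the spectral parameter at a fixed number `lam` and checks two things on random data: the Jacobi expression for the combination $[X,Y] - \lambda [X,Y]_A$, and the fact that $[X,Y]_A$ of two skew-symmetric matrices is again skew-symmetric when $A$ is symmetric. The helper names are hypothetical, and a random test is of course no substitute for a proof.

```python
import numpy as np

def bracket_A(X, Y, A):
    """[X, Y]_A = X A Y - Y A X."""
    return X @ A @ Y - Y @ A @ X

def pencil(X, Y, A, lam):
    """[X, Y] - lam * [X, Y]_A: matrix part of bracket (1) at a fixed number lam."""
    return (X @ Y - Y @ X) - lam * bracket_A(X, Y, A)

def jacobi_residual(bracket, X, Y, Z):
    """Max-norm of the cyclic Jacobi sum for the given bracket."""
    J = (bracket(bracket(X, Y), Z)
         + bracket(bracket(Y, Z), X)
         + bracket(bracket(Z, X), Y))
    return np.max(np.abs(J))

def random_skew(rng, n):
    """Random element of so(n)."""
    M = rng.normal(size=(n, n))
    return M - M.T

def random_sym(rng, n):
    """Random symmetric matrix."""
    M = rng.normal(size=(n, n))
    return M + M.T
```

The Jacobi identity holds here because $X(1 - \lambda A)Y$ is an associative product, so its commutator is automatically a Lie bracket on $gl(n)$.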
The following proposition holds true:

Proposition 3.1. Let the matrix $A \in \mathrm{Mat}(n)$ have the following form: (i) $A = A^{\top}$ for $\mathfrak{g} = so(n)$, (ii) $A = -sA^{\top}s$ for $\mathfrak{g} = sp(n)$, (iii) $A$ is arbitrary for $\mathfrak{g} = gl(n)$. Then the brackets (1) are correctly defined Lie brackets on $\mathfrak{g} \otimes P(\lambda, \lambda^{-1})$.

Remark 3.1. The Lie brackets $[\ ,\ ]$ and $[\ ,\ ]_A$ were used in the context of the theory of consistent Lie–Poisson brackets in [13] and [36]. In the present paper we use these two consistent brackets on the finite-dimensional Lie algebras $\mathfrak{g}$ in order to define a new Lie structure on the infinite-dimensional space $\tilde{\mathfrak{g}} = \mathfrak{g} \otimes P(\lambda, \lambda^{-1})$ (see also [18, 19]). These two brackets were used also in the papers [23] and [24] in order to induce a new algebraic structure into the space $\tilde{\mathfrak{g}}^- = \mathfrak{g} \otimes P(\lambda^{-1})$, but the possibility of extending this structure to the whole space $\tilde{\mathfrak{g}}$ was not acknowledged in these papers.

Definition 3.1. We will denote the linear space $\tilde{\mathfrak{g}}$ equipped with the Lie bracket given by (1) by $\tilde{\mathfrak{g}}_A$ and the linear space $\mathfrak{g}$ with the brackets $[\ ,\ ]_A$ by $\mathfrak{g}_A$.

From the explicit form of the brackets (1) one can easily see that the constructed Lie algebras $\tilde{\mathfrak{g}}_A$ may be viewed as multiparametric deformations of the loop algebras (the parameters being the matrix elements $A_{ij}$ of the matrix $A$). When $A_{ij} \to 0$ the algebras $\tilde{\mathfrak{g}}_A$ coincide with the ordinary loop algebras $\tilde{\mathfrak{g}}$.

Remark 3.2. The algebras $\tilde{\mathfrak{g}}_A$ can be realized also in the space of special matrix-valued functions of $\lambda$ with an ordinary "non-deformed" Lie bracket. In particular in the case of the diagonal matrices $A$ they can be realized as the special quasigraded Lie algebras on the higher genus curves (see [14–17], see also [22] where such a realization was constructed for the subalgebra $\tilde{\mathfrak{g}}_A^-$). They can also be realized as special quasigraded subalgebras of $gl(n)$-loop algebras, which contrary to the graded subalgebras of loop algebras are not isomorphic to the corresponding loop algebras. Nevertheless we consider the realization in the space $\mathfrak{g} \otimes \mathrm{Pol}(\lambda, \lambda^{-1})$ with the "deformed" bracket to be the most convenient.

Let $X_{ij}$ denote basic elements of the matrix Lie algebras $\mathfrak{g}$. Let $X_{ij}^m \equiv X_{ij} \otimes \lambda^m$ be the natural basis in $\tilde{\mathfrak{g}}_A$. The commutation relations (1) in this basis have the form:
\[ [X_{ij}^r, X_{kl}^m] = \sum_{p,q} C_{ij,kl}^{pq}\, X_{pq}^{r+m} - \sum_{p,q} C_{ij,kl}^{pq}(A)\, X_{pq}^{r+m+1}, \tag{2} \]
where $C_{ij,kl}^{pq}$ and $C_{ij,kl}^{pq}(A)$ are the structure constants of the Lie algebras $\mathfrak{g}$ and $\mathfrak{g}_A$.

Remark 3.3. From the explicit form of the commutation relations (2) it is evident that the algebra $\tilde{\mathfrak{g}}_A$ is not graded but possesses a weaker property, which will be called [34] quasigrading:
\[ \tilde{\mathfrak{g}} = \sum_{j \in \mathbb{Z}} \mathfrak{g}_j, \qquad \mathfrak{g}_j = \mathfrak{g} \otimes \lambda^j, \qquad [\mathfrak{g}_i, \mathfrak{g}_j] \subset \mathfrak{g}_{i+j} + \mathfrak{g}_{i+j+1}. \tag{3} \]
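The quasigrading (3) can be made tangible with a small data structure. In the sketch below an element of $\mathfrak{g} \otimes \mathrm{Pol}(\lambda, \lambda^{-1})$ is stored as a dictionary mapping a degree to a coefficient matrix, and the bracket is implemented directly from (1). The representation and the names `bracket`, `add` and `max_norm` are illustrative assumptions; the matrices are taken in $gl(n)$ with arbitrary $A$.

```python
import numpy as np

def bracket(x, y, A):
    """Bracket (1) for elements stored as {degree: coefficient matrix}."""
    out = {}
    for r, X in x.items():
        for m, Y in y.items():
            # [X, Y] lands in degree r + m, -[X, Y]_A in degree r + m + 1.
            out[r + m] = out.get(r + m, 0) + (X @ Y - Y @ X)
            out[r + m + 1] = out.get(r + m + 1, 0) - (X @ A @ Y - Y @ A @ X)
    return out

def add(*elements):
    """Sum of several elements given as {degree: matrix} dictionaries."""
    out = {}
    for e in elements:
        for d, M in e.items():
            out[d] = out.get(d, 0) + M
    return out

def max_norm(x):
    """Largest absolute entry over all coefficient matrices."""
    return max((np.max(np.abs(M)) for M in x.values()), default=0.0)
```

For homogeneous elements of degrees $i$ and $j$ the result has components only in degrees $i+j$ and $i+j+1$, which is exactly the inclusion stated in (3).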
The following proposition holds true:
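Proposition 3.2. (i) The algebra $\tilde{\mathfrak{g}}_A$ admits a decomposition into the sum of two subalgebras $\tilde{\mathfrak{g}}_A = \tilde{\mathfrak{g}}_A^+ + \tilde{\mathfrak{g}}_A^-$, where $\tilde{\mathfrak{g}}_A^+ = \mathrm{Span}_K\{X_{ij}^m \mid m \ge 0\}$ and $\tilde{\mathfrak{g}}_A^- = \mathrm{Span}_K\{X_{ij}^m \mid m < 0\}$. (ii) The subspaces
\[ J_{s,p} = \sum_{m=-\infty}^{-s-1} \mathfrak{g} \otimes \lambda^m + \sum_{m=p}^{\infty} \mathfrak{g} \otimes \lambda^m \]
are ideals in the Lie algebra $\tilde{\mathfrak{g}}_A^0 = \tilde{\mathfrak{g}}_A^+ \ominus \tilde{\mathfrak{g}}_A^-$.

Proof. The proposition is proved by direct verification. Both items of the proposition follow from the property of $\tilde{\mathfrak{g}}_A$ and $\tilde{\mathfrak{g}}_A^\pm$ to be quasigraded (3).

From Proposition 3.2 it follows that the algebra $\tilde{\mathfrak{g}}_A$ fits into the framework of the Kostant–Adler scheme. We have only to find coadjoint invariants of $\tilde{\mathfrak{g}}_A$. For this purpose in the next subsection we will explicitly describe the dual space $\tilde{\mathfrak{g}}_A^*$ of $\tilde{\mathfrak{g}}_A$.

3.2. Dual space $\tilde{\mathfrak{g}}_A^*$ and invariants of coadjoint representation

Let us define the pairing between $\tilde{\mathfrak{g}}_A$ and $\tilde{\mathfrak{g}}_A^*$ in the following standard way:
\[ \langle X, L \rangle = \mathrm{res}_{\lambda=0}\, \mathrm{Tr}\bigl(X(\lambda)L(\lambda)\bigr). \]
The generic element of the dual space $L(\lambda) \in \tilde{\mathfrak{g}}_A^*$ under this choice of pairing has the form:
\[ L(\lambda) = \sum_{k \in \mathbb{Z}} \sum_{i,j=1}^{n} l_{ij}^{(k)}\, \lambda^{-(k+1)} X_{ij}^*. \]
Using the explicit form of the adjoint representation and the above pairing it is easy to deduce that the coadjoint action of $\tilde{\mathfrak{g}}_A$ on $\tilde{\mathfrak{g}}_A^*$ has the following form:
\[ \mathrm{ad}^*_{X(\lambda)} \circ L(\lambda) = A(\lambda)X(\lambda)L(\lambda) - L(\lambda)X(\lambda)A(\lambda), \quad \text{where } A(\lambda) = 1 - \lambda A. \tag{4} \]
Remark 3.4. Note that the coadjoint action of the algebras $\tilde{\mathfrak{g}}_A$ does not coincide with the adjoint one.

From the explicit form of the coadjoint action (4), the next proposition follows:

Proposition 3.3. Let $L(\lambda)$ be the generic element of $\tilde{\mathfrak{g}}_A^*$. Then the functions
\[ I_k^m(L(\lambda)) = \frac{1}{m}\, \mathrm{res}_{\lambda=0}\, \lambda^{-(k+1)}\, \mathrm{Tr}\bigl(L(\lambda)A(\lambda)^{-1}\bigr)^m, \quad \text{where } k \in \mathbb{Z},\ m \in \overline{1,n}, \tag{5} \]
are invariants of the coadjoint representation.

Remark 3.5. The matrix $A(\lambda)^{-1} \equiv (1 - \lambda A)^{-1}$ has to be understood as a power series in $\lambda$ in the neighborhood of 0 or $\infty$: $A(\lambda)^{-1} = (1 + A\lambda + A^2\lambda^2 + \cdots)$ or $A(\lambda)^{-1} = -(A^{-1}\lambda^{-1} + A^{-2}\lambda^{-2} + \cdots)$.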
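The mechanism behind Proposition 3.3 can be seen at a fixed numerical value of $\lambda$: under a variation $\delta L = A(\lambda)XL - LXA(\lambda)$ of the form (4), the matrix $LA(\lambda)^{-1}$ changes by a commutator with $A(\lambda)X$, so traces of its powers do not change to first order. The sketch below verifies this pointwise statement on random matrices. It is a simplified illustration: it works with ordinary matrices instead of Laurent series and does not take the residue in (5). Function names are hypothetical.

```python
import numpy as np

def coadjoint(X, L, A_lam):
    """Variation of L of the form (4): A(lam) X L - L X A(lam)."""
    return A_lam @ X @ L - L @ X @ A_lam

def invariant(L, A_lam, m):
    """(1/m) Tr (L A(lam)^{-1})^m at a fixed value of lam."""
    B = L @ np.linalg.inv(A_lam)
    return np.trace(np.linalg.matrix_power(B, m)) / m

def derivative_along(L, dL, A_lam, m):
    """Directional derivative of `invariant` at L in the direction dL."""
    A_inv = np.linalg.inv(A_lam)
    B = L @ A_inv
    return np.trace(np.linalg.matrix_power(B, m - 1) @ dL @ A_inv)
```

Since the identity holds for every value of $\lambda$ at which $A(\lambda)$ is invertible, it holds coefficient by coefficient in the expansion, which is the content of (5).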
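3.3. Two Lie–Poisson structures on $\tilde{\mathfrak{g}}_A^*$

Let us define Poisson structures in the space $\tilde{\mathfrak{g}}_A^*$. It can be done using the pairing $\langle\ ,\ \rangle$ described above. It determines the Lie–Poisson bracket on $P(\tilde{\mathfrak{g}}_A^*)$ in the standard way:
\[ \{F(L(\lambda)), G(L(\lambda))\} = \bigl\langle L(\lambda),\ [\nabla F(L(\lambda)), \nabla G(L(\lambda))] - \lambda[\nabla F(L(\lambda)), \nabla G(L(\lambda))]_A \bigr\rangle, \tag{6} \]
where
\[ \nabla F(L(\lambda)) = \sum_{m \in \mathbb{Z}} \sum_{i,j=1}^{n} \frac{\partial F}{\partial l_{ij}^{(m)}}\, X_{ij}\lambda^m, \qquad \nabla G(L(\lambda)) = \sum_{m \in \mathbb{Z}} \sum_{k,l=1}^{n} \frac{\partial G}{\partial l_{kl}^{(m)}}\, X_{kl}\lambda^m. \]
For the coordinate functions $l_{ij}^{(m)}$ this bracket has the following form:
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\} = \sum_{p,q} C_{ij,kl}^{pq}\, l_{pq}^{(n+m)} - \sum_{p,q} C_{ij,kl}^{pq}(A)\, l_{pq}^{(n+m+1)}. \tag{7} \]
It determines in the space of functions $\{l_{ij}^{(n)}\}$ the structure of a Lie algebra isomorphic to $\tilde{\mathfrak{g}}_A$.

Let us now introduce into the space $\tilde{\mathfrak{g}}_A^*$ a new Poisson bracket $\{\ ,\ \}_0$ [6], which is a Lie–Poisson bracket for the algebra $\tilde{\mathfrak{g}}_A^0$. Explicitly, this bracket has the form:
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\}_0 = \{l_{ij}^{(n)}, l_{kl}^{(m)}\}, \quad n, m \ge 0, \]
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\}_0 = -\{l_{ij}^{(n)}, l_{kl}^{(m)}\}, \quad n, m < 0, \tag{8} \]
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\}_0 = 0, \quad m < 0,\ n \ge 0 \ \text{ or } \ n < 0,\ m \ge 0. \tag{9} \]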
3.4. Poisson subspaces $M_{s,p}(A)$

Let us consider the following finite-dimensional subspaces of $(\tilde{\mathfrak{g}}_A^0)^*$:
\[ M_{s,p}(A) = (\tilde{\mathfrak{g}}_A^0 / J_{p,s})^*, \quad s, p \ge 0, \]
where the spaces $J_{p,s} \subset \tilde{\mathfrak{g}}_A^0$ are defined as in Proposition 3.2. From Proposition 3.2 it follows that the subspaces $J_{p,s}$ are ideals in $\tilde{\mathfrak{g}}_A^0$, i.e. the bracket $\{\ ,\ \}_0$ is correctly restricted to $M_{s,p}(A)$. In other words the space $M_{s,p}(A)$ is isomorphic to the dual space of the finite-dimensional Lie algebra $\mathfrak{g}_{s,p} = \tilde{\mathfrak{g}}_A^0 / J_{p,s}$. The Lax operator $L(\lambda)$ in the finite-dimensional subspace $M_{s,p}(A)$ has the form:
\[ L(\lambda) = \sum_{k=-s}^{p-1} L^{(k)} \lambda^{-(k+1)} = \sum_{k=-s}^{p-1} \sum_{i,j} l_{ij}^{(k)}\, \lambda^{-(k+1)} X_{ji}. \]
The Lie–Poisson bracket on $M_{s,p}(A)$ among the coordinate functions $l_{ij}^{(k)}$ is written as follows:
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\}_0 = \sum_{p,q} C_{ij,kl}^{pq}\, l_{pq}^{(n+m)} - \sum_{p,q} C_{ij,kl}^{pq}(A)\, l_{pq}^{(n+m+1)}, \quad \text{when } n, m \ge 0,\ n+m+1 < p, \]
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\}_0 = -\sum_{p,q} C_{ij,kl}^{pq}\, l_{pq}^{(n+m)} + \sum_{p,q} C_{ij,kl}^{pq}(A)\, l_{pq}^{(n+m+1)}, \quad \text{when } n, m < 0,\ n+m+1 > -s, \]
\[ \{l_{ij}^{(n)}, l_{kl}^{(m)}\}_0 = 0 \quad \text{in other cases}. \]

3.5. Hamiltonian and "deformed" Lax equations on $M_{s,p}(A)$

Now let $L(\lambda) \in M_{s,p}(A)$. Let $I(L(\lambda))$ be the restriction of the invariant function $I$ on $\tilde{\mathfrak{g}}_A^*$ onto $M_{s,p}(A)$. We can write the corresponding hamiltonian equations of motion in the form:
\[ \frac{dL_{ij}(\lambda)}{dt} = \{L_{ij}(\lambda), I(L(\lambda))\}_0. \tag{10} \]
The following theorem will be basic for all our subsequent considerations.

Theorem 3.1. (i) The polynomial functions $\{I_k^m(L(\lambda))\}$ commute with respect to the restriction of the bracket $\{\ ,\ \}_0$ on $M_{s,p}(A)$. (ii) Let the hamiltonian $I(L(\lambda))$ belong to the set $\{I_k^m|_{M_{s,p}(A)}\}$. Then the equations of motion (10) can be written in the "deformed" Lax form:
\[ \frac{dL(\lambda)}{dt} = A(\lambda) M L(\lambda) - L(\lambda) M A(\lambda), \tag{11} \]
where $M = \pm(P_\pm \nabla I)|_{M_{s,p}(A)}$ and $\nabla I$ is defined as above.

Proof. Item (i) of the theorem follows directly from item (i) of Corollary 2.1, item (ii) of Proposition 3.2, and Proposition 3.3. Item (ii) of the theorem follows from item (ii) of Corollary 2.1 and the explicit form of the coadjoint action (4).

Remark 3.6. Note that the restriction onto the quotient space $M_{s,p}(A)$ in the definition of the $M$ operator should be made after taking the matrix gradient of $I$.

Remark 3.7. It is possible to transform the "deformed" Lax equations (11) to the form of the usual Lax equations using the above mentioned realizations of the algebra $\tilde{\mathfrak{g}}_A$. Nevertheless we prefer to work with the Lax equations in the "deformed" form (11), because in this case the corresponding $L$–$M$ pairs are the simplest.

4. Integrable Systems in Finite-Dimensional Quotients

Now we will consider hamiltonian systems on the general finite-dimensional subspace $M_{s,p}(A)$ and discuss some of their general properties. For this purpose we will describe more explicitly the set of functions that mutually commute with respect to the brackets $\{\ ,\ \}_0$. Their form depends on our definition of $A^{-1}(\lambda)$.
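When we decompose $A^{-1}(\lambda)$ in a power series in the neighborhood of zero and of infinity we obtain different integrals. We denote them $I^{r\pm}(\lambda)$:

$$I^{r+}(\lambda) \equiv \frac{1}{r}\,\mathrm{tr}\big(L(\lambda)(1 + A\lambda + A^2\lambda^2 + \cdots)\big)^r \equiv \sum_{n=-rp}^{\infty} I^{r+}_n \lambda^n, \tag{12}$$

$$I^{r-}(\lambda) \equiv \frac{1}{r}\,\mathrm{tr}\big(L(\lambda)(A^{-1}\lambda^{-1} + A^{-2}\lambda^{-2} + \cdots)\big)^r \equiv \sum_{n=-\infty}^{r(s-2)} I^{r-}_n \lambda^n. \tag{13}$$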
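It is not difficult to obtain the following formulas for the integrals $I^{r\pm}_n$:

$$I^{r+}_n = \frac{1}{r} \sum_{m_1, m_2, \ldots, m_r = -p}^{s-1}\ \sum_{\substack{k_1, k_2, \ldots, k_r = 0 \\ k_1 + \cdots + k_r = n - (m_1 + \cdots + m_r)}} \mathrm{tr}\big(L^{(-m_1-1)} A^{k_1} L^{(-m_2-1)} A^{k_2} \cdots L^{(-m_r-1)} A^{k_r}\big), \tag{14}$$

$$I^{r-}_n = \frac{1}{r} \sum_{m_1, m_2, \ldots, m_r = -p}^{s-1}\ \sum_{\substack{k_1, k_2, \ldots, k_r = 1 \\ k_1 + \cdots + k_r = n + (m_1 + \cdots + m_r)}} \mathrm{tr}\big(L^{(-m_1-1)} A^{-k_1} L^{(-m_2-1)} A^{-k_2} \cdots L^{(-m_r-1)} A^{-k_r}\big). \tag{15}$$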
Not all integrals of these two infinite sets are independent after the restriction onto the finite-dimensional space $M_{s,p}(A)$. Besides, some of them coincide with the Casimir functions of the considered algebras. Nevertheless in all examples the family of independent non-trivial integrals (including higher-order integrals) is large enough to provide integrability of the considered systems. These integrals generate non-trivial hamiltonian flows that, due to the results of the previous section, can be written in the Lax-type form:

$$\frac{dL(\lambda)}{dt_k} = A(\lambda) M^{r\pm}_k L(\lambda) - L(\lambda) M^{r\pm}_k A(\lambda), \tag{16}$$
where $M^{r\pm}_k$ are the $M$-operators that correspond to the integrals $I^{r\pm}_k(L)$. Let us calculate them explicitly using formula (11):

$$M^{r+}_k = P_-\,\lambda^{-(k+1)} \big((1 + A\lambda + A^2\lambda^2 + \cdots) L(\lambda)\big)^{r-1} (1 + A\lambda + A^2\lambda^2 + \cdots), \tag{17}$$

$$M^{r-}_k = P_+\,\lambda^{-(k+1)} \big(A^{-1}\lambda^{-1}(1 + A^{-1}\lambda^{-1} + \cdots) L(\lambda)\big)^{r-1} \times A^{-1}\lambda^{-1}(1 + A^{-1}\lambda^{-1} + \cdots). \tag{18}$$

The expression for the operators $M^{r\pm}_k$ gives us the possibility to distinguish the Casimir functions from the non-trivial integrals. The following proposition holds true:

Proposition 4.1. The functions $I^{r-}_k$, $r(s-2) \ge k > (r-1)(s-2) - 2$, and $I^{r+}_l$, $-rp \le l < -(r-1)p$, are Casimir functions of the bracket $\{\ ,\ \}_0$ restricted to $M_{s,p}(A)$. The other functions determine non-trivial hamiltonian flows with respect to the bracket $\{\ ,\ \}_0$ restricted to $M_{s,p}(A)$.
Proof. We apply direct verification. Indeed, from Eqs. (17) and (18) it follows that the $M$-operators corresponding to the integrals $I^{r\pm}_k$ with the values of $k$ described in the proposition are identically equal to zero. Hence they determine trivial hamiltonian flows on $M_{s,p}(A)$ and coincide with the Casimir functions. On the other hand, it is easy to see that the other integrals possess non-trivial $M$-operators which are finite polynomials in $\lambda$. From the Lax-type representation (16) it follows that the corresponding Lax equations and hamiltonian flows are non-trivial on $M_{s,p}(A)$.

The hamiltonian systems obtained possess a nice symmetry. It is given by the following proposition:

Proposition 4.2. Let $\det A \neq 0$. Then the hamiltonian systems on $M_{0,s}(A^{-1})$ with the algebra of integrals generated by the functions $I^{r\pm}_k$ coincide with the hamiltonian systems on $M_{s,0}(A)$ with the algebra of integrals generated by the functions $I^{r\mp}_k$.

Proof. To prove this proposition it is necessary to show that $M_{0,s}(A^{-1})$ and $M_{s,0}(A)$ are isomorphic as Poisson spaces and that the corresponding algebras of integrals coincide. Since the Poisson spaces $M_{0,s}(A^{-1})$ and $M_{s,0}(A)$ are dual spaces to quotient algebras of $\tilde{\mathfrak g}^+_{A^{-1}}$ and $\tilde{\mathfrak g}^-_A$, we will prove their isomorphism if we show that the algebra $\tilde{\mathfrak g}^-_A$ is isomorphic to the algebra $\tilde{\mathfrak g}^+_{A^{-1}}$. But this follows from the explicit substitution of variables $\tilde X^{n} = A^{-1/2} X^{-n-1} A^{-1/2}$, $n < 0$, which maps generators of $\tilde{\mathfrak g}^-_A$ into generators of $\tilde{\mathfrak g}^+_{A^{-1}}$. Now, it is easy to see that the functions $I^{r-}_k$ on $M_{s,0}(A)$ are mapped under the isomorphism described above to the functions $I^{r+}_k$ on $M_{0,s}(A^{-1})$ and vice versa.

Since the structure of the algebras $\tilde{\mathfrak g}_A$ and of the corresponding Lie–Poisson brackets depends on the deformation matrix $A$, we will distinguish two types of classical integrable systems obtained with their help. The first type includes systems connected with non-degenerate matrices $A$. It is evident that for systems of this type both generating functions $I^{r+}(\lambda)$ and $I^{r-}(\lambda)$ are correctly defined. The second type is connected with degenerate matrices $A$. For systems of the second type it is necessary to regularize the functions $I^{r-}_n$. This procedure will be worked out in the next subsections on concrete examples. We will restrict ourselves to the consideration of the generating functions of the second-order integrals, $H^{\pm}(\lambda) \equiv I^{2\pm}(\lambda)$, which we call simply “hamiltonians”. In the majority of cases interesting examples of integrable finite-dimensional systems arise in the subspaces $M_{s,p}(A)$ with small $s$ and $p$. That is why in all our examples we restrict ourselves to the cases $s, p \le 2$.

4.1. Generalized tops

Let us consider hamiltonian systems in the space $M_{1,0}(A)$ in the case when the matrix $A$, which determines the structure of $\mathfrak g_A$, is non-degenerate. The Lax matrix
$L(\lambda) \in M_{1,0}(A)$ has the following form: $L(\lambda) \equiv L^{(-1)} = \sum_{i,j=1}^{n} l^{(-1)}_{ij} X_{ji}$. The Lie–Poisson bracket on $M_{1,0}(A) \simeq \mathfrak g^*_A$ is written as follows:

$$\{l^{(-1)}_{ij}, l^{(-1)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}(A)\, l^{(-1)}_{pq}. \tag{19}$$
Let us show that in the case when the matrix $A$ is non-degenerate the Lie–Poisson bracket (19) is isomorphic to the standard Lie–Poisson bracket on $\mathfrak g^*$. The following proposition is true:

Proposition 4.3. Let the spectral-parameter-independent matrix $L$ be defined as follows: $L \equiv A^{-1/2} L^{(-1)} A^{-1/2}$. Then the Lie–Poisson bracket between its matrix elements $l_{ij}$ and $l_{kl}$ will have the standard form:

$$\{l_{ij}, l_{kl}\} = \sum_{p,q} C^{pq}_{ij,kl}\, l_{pq}.$$
Proof. It is enough to take into consideration that the bracket (19) is the Lie–Poisson bracket for the matrix Lie algebra $\mathfrak g_A$ with the commutation relations $[X, Y]_A = XAY - YAX$. Hence the substitution $\tilde X = A^{-1/2} X A^{-1/2}$ defines the isomorphism between $\mathfrak g_A$ and $\mathfrak g$. The same transformation transforms the corresponding Lie–Poisson bracket to the standard form.

Let us now consider a set of mutually commuting hamiltonians on $\mathfrak g^*_A \simeq \mathfrak g^*$, obtained in the framework of our construction. Using the general formulas (14) for the commuting second-order hamiltonians we obtain the following expressions:

$$H^+_n(L^{(-1)}) = \frac{1}{2} \sum_{k=0}^{n} \mathrm{tr}\big(L^{(-1)} A^{k} L^{(-1)} A^{n-k}\big), \tag{20}$$

$$H^-_{-(n+2)}(L^{(-1)}) = \frac{1}{2} \sum_{k=0}^{n} \mathrm{tr}\big(L^{(-1)} A^{-k-1} L^{(-1)} A^{-(n-k+1)}\big). \tag{21}$$
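The substitution used in the proof of Proposition 4.3 and the sums (20), (21) can be sanity-checked numerically. The following is only a sketch under stated assumptions: $A$ is taken diagonal with positive entries (so that $A^{-1/2}$ is real), the matrices are random elements of $gl(n)$, and the helper names `bracket_A`, `tilde`, `H_plus`, `H_minus` are hypothetical, introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
a = rng.uniform(0.5, 2.0, size=n)   # assumption: A diagonal with positive entries
A = np.diag(a)
A_inv = np.diag(1.0 / a)
A_inv_sqrt = np.diag(a ** -0.5)
mp = np.linalg.matrix_power


def bracket_A(X, Y):
    """Deformed commutator [X, Y]_A = X A Y - Y A X."""
    return X @ A @ Y - Y @ A @ X


def tilde(X):
    """Substitution X -> A^{-1/2} X A^{-1/2} from the proof of Proposition 4.3."""
    return A_inv_sqrt @ X @ A_inv_sqrt


def H_plus(L, n_idx):
    """Formula (20): H^+_n = 1/2 sum_{k=0}^n tr(L A^k L A^{n-k})."""
    return 0.5 * sum(
        np.trace(L @ mp(A, k) @ L @ mp(A, n_idx - k)) for k in range(n_idx + 1)
    )


def H_minus(L, n_idx):
    """Formula (21): H^-_{-(n+2)} = 1/2 sum_{k=0}^n tr(L A^{-k-1} L A^{-(n-k+1)})."""
    return 0.5 * sum(
        np.trace(L @ mp(A_inv, k + 1) @ L @ mp(A_inv, n_idx - k + 1))
        for k in range(n_idx + 1)
    )


X, Y, L = (rng.normal(size=(n, n)) for _ in range(3))

# The substitution intertwines the ordinary commutator with [ , ]_A.
iso_error = np.max(np.abs(bracket_A(tilde(X), tilde(Y)) - tilde(X @ Y - Y @ X)))

# n = 1 in (20) and (21) collapses to single traces by cyclicity of the trace.
h_plus_1_error = abs(H_plus(L, 1) - np.trace(A @ L @ L))
h_minus_3_error = abs(H_minus(L, 1) - np.trace(A_inv @ L @ A_inv @ A_inv @ L))
```

All three error quantities should vanish up to floating-point rounding; the check says nothing about the orthogonal or symplectic reductions, which would need $X$, $Y$, $L$ restricted accordingly.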
Let us write down the explicit form of the first two functions of the sets $H^+(\lambda)$, $H^-(\lambda)$:

$$H^+_0(L^{(-1)}) = \frac{1}{2}\,\mathrm{tr}(L^{(-1)})^2, \qquad H^+_1(L^{(-1)}) = \mathrm{tr}(A L^{(-1)} L^{(-1)}),$$

$$H^-_{-2}(L^{(-1)}) = \frac{1}{2}\,\mathrm{tr}(A^{-1} L^{(-1)} A^{-1} L^{(-1)}), \qquad H^-_{-3}(L^{(-1)}) = \mathrm{tr}(A^{-1} L^{(-1)} A^{-2} L^{(-1)}).$$

The $M$-operators that correspond to the above hamiltonians are given by the expressions:

$$M^+_0 = \lambda^{-1} L^{(-1)}, \qquad M^+_1 = \lambda^{-2} L^{(-1)} + \lambda^{-1}(A L^{(-1)} + L^{(-1)} A), \qquad M^-_{-2} = 0, \qquad M^-_{-3} = A^{-1} L^{(-1)} A^{-1}.$$

This yields that the function $H^-_{-2}(L)$ is a Casimir function of the bracket (19). The other functions are commuting integrals that generate non-trivial hamiltonian
flows on the corresponding coadjoint orbits. For the hamiltonian of these systems, which we will hereafter call generalized tops, we take the function $H^+_0(L^{(-1)})$. There exist several types of generalized tops on simple (reductive) Lie algebras [28, 29, 37]. To identify our systems on $\mathfrak g^*_A \simeq \mathfrak g^*$ with the known ones it is necessary to take into consideration the restriction on the form of the matrix $A$ that follows from our approach. Comparing our formulas with the results of [28, 29] and [37] we deduce that the following statement holds true:

Proposition 4.4. The set of commuting integrals (20) coincides with the algebra of integrals of the generalized tops in the case of the so-called “complex series” if $\mathfrak g = gl(n)$, or “normal series” if $\mathfrak g = so(n)$.

Remark 4.1. As follows from the above proposition, the integrable systems obtained in this subsection could also be obtained using loop algebras. Although the sets of commuting integrals obtained using these two approaches coincide, the Lax-type representations are different.

Example 4.1. Let us consider the case $\mathfrak g = so(3)$. Without loss of generality we put $A = \mathrm{diag}(a_1, a_2, a_3)$, where $a_i \neq 0$. Putting $X_k = \epsilon_{ijk} X_{ij}$ we can write the Lax operator in the form $L = \sum_{k=1}^{3} l^{(-1)}_k X_k$. The Poisson bracket between the coordinate functions is the following:

$$\{l^{(-1)}_i, l^{(-1)}_j\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_k.$$

Independent commuting integrals are the functions:

$$H^+_0(L^{(-1)}) = \sum_{k=1}^{3} (l^{(-1)}_k)^2, \qquad H^+_1(L^{(-1)}) = \sum_{k=1}^{3} a_k (l^{(-1)}_k)^2.$$

Changing the variables,

$$l_k = \frac{(a_k)^{1/2}}{(a_1 a_2 a_3)^{1/2}}\, l^{(-1)}_k,$$

we obtain the standard $so(3)$ bracket:

$$\{l_i, l_j\}_0 = \epsilon_{ijk}\, l_k.$$

In these standard coordinates we obtain for the hamiltonians the following expressions:

$$H^+_0(l_k) = (a_1 a_2 a_3) \sum_{k=1}^{3} a_k^{-1} l_k^2, \qquad H^+_1(l_k) = (a_1 a_2 a_3) \sum_{k=1}^{3} l_k^2.$$
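The change of variables in Example 4.1 can be checked numerically. The sketch below assumes positive $a_k$ (so the square roots are real) and an arbitrary sample point; the array names (`u` for the coordinates $l^{(-1)}_k$, `P_def`, `P_std`) are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Levi-Civita symbol eps[i, j, k]
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

a = np.array([1.3, 2.1, 0.7])     # assumption: positive a_k
u = np.array([0.4, -1.2, 0.9])    # sample values of l_k^{(-1)}

c = np.sqrt(a / np.prod(a))       # l_k = c_k * l_k^{(-1)}
l = c * u

# Poisson tensors: deformed {u_i, u_j} = eps_ijk a_k u_k, standard {l_i, l_j} = eps_ijk l_k
P_def = np.einsum("ijk,k->ij", eps, a * u)
P_std = np.einsum("ijk,k->ij", eps, l)

# Since l is linear in u, {l_i, l_j} = c_i c_j {u_i, u_j}.
bracket_error = np.max(np.abs(np.outer(c, c) * P_def - P_std))

# Hamiltonians before and after the change of variables
H0_def, H1_def = np.sum(u**2), np.sum(a * u**2)
H0_std = np.prod(a) * np.sum(l**2 / a)
H1_std = np.prod(a) * np.sum(l**2)

# {H0, H1} in the deformed bracket, and the flow generated by H1
grad_H0, grad_H1 = 2 * u, 2 * a * u
commutator = grad_H0 @ P_def @ grad_H1
flow_of_H1 = P_def @ grad_H1      # vanishes: H1 is a Casimir
```

Here `flow_of_H1` vanishes identically, which matches the identification of $H^+_1$ with the Casimir invariant, while `P_def @ grad_H0` is generically non-zero.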
We see that, modulo multiplication by constants, they coincide with the hamiltonian and the Casimir invariant of the usual Euler top, where the $a_k$ stand for the components of the inertia tensor.

4.2. Generalized Steklov–Liapunov systems

Let us consider hamiltonian systems in the space $M_{2,0}(A)$ in the case when the matrix $A$, which determines the structure of $\mathfrak g_A$, is non-degenerate. The Lax operator will have the following form:

$$L(\lambda) \equiv L^{(-1)} + \lambda L^{(-2)} = \sum_{i,j=1}^{n} \big(l^{(-1)}_{ij} X_{ji} + \lambda\, l^{(-2)}_{ij} X_{ji}\big).$$
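The Lie–Poisson bracket on $M_{2,0}(A) \simeq (\mathfrak g_A + \mathfrak g_A)^*$ will be the following:

$$\{l^{(-1)}_{ij}, l^{(-1)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}(A)\, l^{(-1)}_{pq} - \sum_{p,q} C^{pq}_{ij,kl}\, l^{(-2)}_{pq},$$

$$\{l^{(-1)}_{ij}, l^{(-2)}_{kl}\}_0 = -\sum_{p,q} C^{pq}_{ij,kl}(A)\, l^{(-2)}_{pq}, \tag{22}$$

$$\{l^{(-2)}_{ij}, l^{(-2)}_{kl}\}_0 = 0.$$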
In the case of non-degenerate matrices $A$ the Lie–Poisson bracket (22) can be transformed to the standard form of the Lie–Poisson bracket of the semi-direct sum $(\mathfrak g + \mathfrak g)^*$, where the second summand is considered to be abelian. The following proposition holds true:

Proposition 4.5. Let the matrices $L$ and $P$ be defined as follows:

$$L = A^{-1/2}\Big(L^{(-1)} + \frac{1}{2}\big(L^{(-2)} A^{-1} + A^{-1} L^{(-2)}\big)\Big) A^{-1/2}, \qquad P = A^{-1/2} L^{(-2)} A^{-1/2}.$$

Then the Lie–Poisson bracket between their matrix elements $l_{ij}$ and $p_{kl}$ is a standard Lie–Poisson bracket on $(\mathfrak g + \mathfrak g)^*$:

$$\{l_{ij}, l_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}\, l_{pq}, \qquad \{l_{ij}, p_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}\, p_{pq}, \qquad \{p_{ij}, p_{kl}\}_0 = 0.$$
Proof. It is enough to take into consideration that the bracket (22) is the Lie–Poisson bracket for the matrix Lie algebra $\mathfrak g \otimes \mathcal{A}$, where $\mathcal{A} = \mathrm{Pol}(\lambda^{-1})/\lambda^{-2}\mathrm{Pol}(\lambda^{-1})$, with the commutation relations:

$$[X\lambda^{-1}, Y\lambda^{-1}] = -\lambda^{-1}[X, Y]_A + \lambda^{-2}[X, Y], \qquad [X\lambda^{-1}, Y\lambda^{-2}] = -\lambda^{-2}[X, Y]_A, \qquad [X\lambda^{-2}, Y\lambda^{-2}] = 0.$$

It is easy to show that the substitution

$$\tilde X_1 = A^{-1/2}\Big(X\lambda^{-1} + \frac{1}{2}\big(A^{-1} X \lambda^{-2} + X \lambda^{-2} A^{-1}\big)\Big) A^{-1/2}, \qquad \tilde X_2 = A^{-1/2}(X\lambda^{-2}) A^{-1/2}$$

maps these relations to the standard relations of the semi-direct sum:

$$[\tilde X_1, \tilde Y_1] = \widetilde{[X, Y]}_1, \qquad [\tilde X_1, \tilde Y_2] = \widetilde{[X, Y]}_2, \qquad [\tilde X_2, \tilde Y_2] = 0.$$

The same transformation maps the corresponding Lie–Poisson bracket into the standard form.

Let us now consider the set of mutually commuting hamiltonians on $(\mathfrak g_A + \mathfrak g_A)^* \simeq (\mathfrak g + \mathfrak g)^*$, obtained in the framework of our construction. From the general formulas for the second-order integrals (14) one can easily calculate
all second-order integrals. We write down here explicit expressions of the first two functions of the sets $H^+(\lambda)$, $H^-(\lambda)$:

$$H^+_0 = \frac{1}{2}\,\mathrm{tr}(L^{(-1)})^2, \qquad H^+_1 = \mathrm{tr}(L^{(-1)} L^{(-2)}) + \mathrm{tr}(A L^{(-1)} L^{(-1)}),$$

$$H^-_0 = \frac{1}{2}\,\mathrm{tr}(A^{-1} L^{(-2)} A^{-1} L^{(-2)}), \qquad H^-_{-1} = \mathrm{tr}\big(A^{-2} L^{(-2)} A^{-1} L^{(-2)} + A^{-1} L^{(-2)} A^{-1} L^{(-1)}\big).$$

The corresponding $M$-operators are:

$$M^+_0 = \lambda^{-1} L^{(-1)}, \qquad M^+_1 = \lambda^{-2} L^{(-1)} + \lambda^{-1}\big(L^{(-2)} + (A L^{(-1)} + L^{(-1)} A)\big), \qquad M^-_0 = M^-_{-1} = 0.$$

From the expressions for the $M$-operators it follows that the functions $H^-_0(L)$, $H^-_{-1}(L)$ are Casimir functions of the bracket (22). The other functions are commuting integrals that generate non-trivial hamiltonian flows on the corresponding coadjoint orbits. We will take for the hamiltonian of the generalized Steklov system the function $H^+_0$. In the (standard) coordinates introduced above on $(\mathfrak g_A + \mathfrak g_A)^* \simeq (\mathfrak g + \mathfrak g)^*$ it is written as:

$$H^+_0 = \frac{1}{2}\,\mathrm{tr}\Big(A^{-1}\big(L - \tfrac{1}{2}(P A^{-1} + A^{-1} P)\big)\Big)^2. \tag{23}$$

The Casimir functions $H^-_0$, $H^-_{-1}$ pass to the second-order Casimirs of the semidirect product:

$$H^-_0 = \frac{1}{2}\,\mathrm{tr}\, P^2, \qquad H^-_{-1} = \mathrm{tr}(LP). \tag{24}$$

In order to show that the corresponding hamiltonian system is actually the direct generalization of the ordinary Steklov system we will consider the following example:
Example 4.2. Let us consider the case $\mathfrak g = so(3)$. Without loss of generality we put $A = \mathrm{diag}(a_1, a_2, a_3)$, $a_i \neq 0$. Putting $X_k = \epsilon_{ijk} X_{ij}$ we have the Lax operator in the form:

$$L = \sum_{k=1}^{3} \big(l^{(-1)}_k + \lambda\, l^{(-2)}_k\big) X_k.$$

The Lie–Poisson brackets between the coordinate functions are given by:

$$\{l^{(-1)}_i, l^{(-1)}_j\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_k - \epsilon_{ijk}\, a_k\, l^{(-2)}_k, \qquad \{l^{(-1)}_i, l^{(-2)}_j\}_0 = \epsilon_{ijk}\, a_k\, l^{(-2)}_k, \qquad \{l^{(-2)}_i, l^{(-2)}_j\}_0 = 0.$$
Non-trivial commuting integrals are the following functions:

$$H^+_0(L^{(-1)}) = \sum_{k=1}^{3} (l^{(-1)}_k)^2, \qquad H^+_1(L^{(-1)}) = \sum_{k=1}^{3} \big(l^{(-1)}_k l^{(-2)}_k + a_k (l^{(-1)}_k)^2\big).$$

Changing the variables,

$$l_k = \frac{(a_k)^{1/2}}{(a_1 a_2 a_3)^{1/2}} \left( l^{(-1)}_k + \frac{1}{2} \left( \sum_{i=1}^{3} a_i^{-1} - a_k^{-1} \right) l^{(-2)}_k \right), \qquad p_k = \frac{(a_k)^{1/2}}{(a_1 a_2 a_3)^{1/2}}\, l^{(-2)}_k,$$

we obtain the standard bracket on $so(3) + \mathbb{R}^3$:

$$\{l_i, l_j\}_0 = \epsilon_{ijk}\, l_k, \qquad \{l_i, p_j\}_0 = \epsilon_{ijk}\, p_k, \qquad \{p_i, p_j\}_0 = 0.$$

In these standard coordinates we obtain for the hamiltonian $H^+_0$ the following expression:

$$H^+_0 = (a_1 a_2 a_3) \sum_{k=1}^{3} a_k^{-1} \Big(l_k - \tfrac{1}{2}\big(a_1^{-1} + a_2^{-1} + a_3^{-1} - a_k^{-1}\big) p_k\Big)^2.$$

We see that, modulo multiplication by constants and the replacement $a_k \to a_k^{-1}$, it coincides with the hamiltonian of the Steklov–Liapunov system in the form of Kötter [38]. The hamiltonians $H^-_0$ and $H^-_{-1}$ are the standard Casimir functions of $so(3) + \mathbb{R}^3$:

$$H^-_0 = \sum_{k=1}^{3} p_k^2, \qquad H^-_{-1} = \sum_{k=1}^{3} p_k l_k.$$
4.3. Generalized interacting tops

Let us consider hamiltonian systems in the space $M_{1,1}(A)$, in the case of non-degenerate matrices $A$. The Lax matrix is written as:

$$L(\lambda) \equiv \lambda^{-1} L^{(0)} + L^{(-1)} = \sum_{i,j=1}^{n} \big(\lambda^{-1} l^{(0)}_{ij} X_{ji} + l^{(-1)}_{ij} X_{ji}\big).$$

The Lie–Poisson bracket on $M_{1,1}(A) \simeq (\mathfrak g \oplus \mathfrak g_A)^*$ is easily shown to be the following:

$$\{l^{(-1)}_{ij}, l^{(-1)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}(A)\, l^{(-1)}_{pq}, \qquad \{l^{(0)}_{ij}, l^{(0)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}\, l^{(0)}_{pq}, \qquad \{l^{(0)}_{ij}, l^{(-1)}_{kl}\}_0 = 0. \tag{25}$$

Let us now consider the set of mutually commuting hamiltonians on $(\mathfrak g \oplus \mathfrak g_A)^*$, obtained in the framework of our construction. The general formulas for
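the commuting hamiltonians are the following:

$$H^+_n = \frac{1}{2} \sum_{m_1, m_2 = -1}^{0}\ \sum_{\substack{k_1, k_2 = 0 \\ k_1 + k_2 = n - (m_1 + m_2)}} \mathrm{tr}\big(L^{(-m_1-1)} A^{k_1} L^{(-m_2-1)} A^{k_2}\big), \tag{26}$$

$$H^-_{-n} = \frac{1}{2} \sum_{m_1, m_2 = -1}^{0}\ \sum_{\substack{k_1, k_2 = 1 \\ k_1 + k_2 = n + (m_1 + m_2)}} \mathrm{tr}\big(L^{(-m_1-1)} A^{-k_1} L^{(-m_2-1)} A^{-k_2}\big). \tag{27}$$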
Let us write down the expressions of the first two functions of the sets $H^+(\lambda)$, $H^-(\lambda)$:

$$H^+_{-2} = \frac{1}{2}\,\mathrm{tr}(L^{(0)})^2, \qquad H^+_{-1} = \mathrm{tr}(L^{(0)} L^{(-1)}) + \mathrm{tr}(A L^{(0)} L^{(0)}),$$

$$H^-_{-2} = \frac{1}{2}\,\mathrm{tr}(A^{-1} L^{(-1)} A^{-1} L^{(-1)}), \qquad H^-_{-3} = \mathrm{tr}(A^{-1} L^{(0)} A^{-1} L^{(-1)}) + \mathrm{tr}(A^{-2} L^{(-1)} A^{-1} L^{(-1)}).$$

The corresponding $M$-operators are:

$$M^+_{-2} = 0, \qquad M^+_{-1} = \lambda^{-1} L^{(0)}, \qquad M^-_{-2} = 0, \qquad M^-_{-1} = A^{-1} L^{(-1)} A^{-1}.$$

The functions $H^{\pm}_{-2}(L)$ are easily shown to be Casimir functions of the bracket (25). The other functions are commuting integrals that generate non-trivial hamiltonian flows on the corresponding coadjoint orbits. We will take for the hamiltonian of the generalized interacting tops the functions $H^+_{-1}$ or $H^-_{-3}$. Now let us transform the Lie–Poisson bracket (25) to the form of the standard Lie–Poisson bracket on $(\mathfrak g \oplus \mathfrak g)^*$. The following proposition holds true:
Proposition 4.6. Let the matrices $L$ and $M$ be defined as follows: $M = L^{(0)}$, $L = A^{-1/2} L^{(-1)} A^{-1/2}$. Then the Lie–Poisson bracket between their matrix elements $l_{ij}$ and $m_{kl}$ is a standard Lie–Poisson bracket on $(\mathfrak g \oplus \mathfrak g)^*$:

$$\{l_{ij}, l_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}\, l_{pq}, \qquad \{m_{ij}, m_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}\, m_{pq}, \qquad \{l_{ij}, m_{kl}\}_0 = 0.$$
The proof of this proposition follows directly from the proof of Proposition 4.3. In the above standard coordinates on $(\mathfrak g \oplus \mathfrak g)^*$ the chosen hamiltonians of the generalized interacting tops are written as:

$$H^+_{-1} = \mathrm{tr}(L A^{1/2} M A^{1/2}) + \mathrm{tr}(A M^2), \qquad H^-_{-3} = \mathrm{tr}(L A^{-1/2} M A^{-1/2}) + \mathrm{tr}(A^{-1} L^2). \tag{28}$$

The Casimir functions $H^+_{-2}$, $H^-_{-2}$ pass to the standard second-order Casimirs of the direct sum:

$$H^+_{-2} = \frac{1}{2}\,\mathrm{tr}(M^2), \qquad H^-_{-2} = \frac{1}{2}\,\mathrm{tr}(L^2). \tag{29}$$

To illustrate this let us now consider the following example:
Example 4.3. Let us consider the case $\mathfrak g = so(3)$. Without loss of generality we put $A = \mathrm{diag}(a_1, a_2, a_3)$, where $a_i \neq 0$. The Lax operator is $L = \sum_{k=1}^{3} \big(\lambda^{-1} l^{(0)}_k + l^{(-1)}_k\big) X_k$. The Lie–Poisson brackets between the coordinate functions are the following:

$$\{l^{(-1)}_i, l^{(-1)}_j\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_k, \qquad \{l^{(0)}_i, l^{(0)}_j\}_0 = \epsilon_{ijk}\, l^{(0)}_k, \qquad \{l^{(-1)}_i, l^{(0)}_j\}_0 = 0.$$

There are two non-trivial commuting integrals:

$$H^+_{-1}(L^{(-1)}) = \left( \sum_{k=1}^{3} 2\, l^{(0)}_k l^{(-1)}_k - a_k (l^{(0)}_k)^2 \right) + (\mathrm{tr}\, A)\, H^+_{-2},$$

$$H^-_{-3}(L^{(-1)}) = (a_1 a_2 a_3)^{-1} \left( \sum_{k=1}^{3} 2 a_k\, l^{(0)}_k l^{(-1)}_k - (l^{(-1)}_k)^2 \right) + (\mathrm{tr}\, A^{-1})\, H^-_{-2}.$$

Changing the variables, $m_k = l^{(0)}_k$, $l_k = \dfrac{(a_k)^{1/2}}{(a_1 a_2 a_3)^{1/2}}\, l^{(-1)}_k$, we obtain the standard bracket:

$$\{l_i, l_j\}_0 = \epsilon_{ijk}\, l_k, \qquad \{m_i, m_j\}_0 = \epsilon_{ijk}\, m_k, \qquad \{m_i, l_j\}_0 = 0.$$
The hamiltonians $H^{\pm}_{-2}$ in these coordinates are the Casimir functions of $so(3) + so(3)$:

$$H^-_{-2} = \sum_{k=1}^{3} l_k^2, \qquad H^+_{-2} = \sum_{k=1}^{3} m_k^2.$$

In the standard coordinates we obtain for the non-trivial integrals the following expressions:

$$H^+_{-1} = \sum_{k=1}^{3} \Big( 2 (a_1 a_2 a_3)^{1/2}\, a_k^{-1/2}\, l_k m_k - a_k m_k^2 \Big),$$

$$H^-_{-3} = \sum_{k=1}^{3} \Big( 2 (a_1 a_2 a_3)^{-1/2}\, a_k^{1/2}\, l_k m_k - a_k^{-1} l_k^2 \Big).$$
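That the two non-trivial integrals of Example 4.3 are in involution can be checked numerically in the standard coordinates. The sketch below assumes positive $a_k$ and uses the standard bracket $\{l_i, l_j\}_0 = \epsilon_{ijk} l_k$, $\{m_i, m_j\}_0 = \epsilon_{ijk} m_k$, $\{m_i, l_j\}_0 = 0$, for which $\{F, G\} = l \cdot (\nabla_l F \times \nabla_l G) + m \cdot (\nabla_m F \times \nabla_m G)$; the function names are hypothetical.

```python
import numpy as np


def bracket(grad_f, grad_g, l, m):
    """Lie-Poisson bracket on (so(3) + so(3))^* from the pairs (grad_l, grad_m)."""
    (fl, fm), (gl, gm) = grad_f, grad_g
    return l @ np.cross(fl, gl) + m @ np.cross(fm, gm)


def grad_H_plus(a, l, m):
    """Gradient of sum_k (2 alpha_k l_k m_k - a_k m_k^2), alpha_k = (a1 a2 a3 / a_k)^{1/2}."""
    alpha = np.sqrt(np.prod(a) / a)
    return 2 * alpha * m, 2 * alpha * l - 2 * a * m


def grad_H_minus(a, l, m):
    """Gradient of sum_k (2 beta_k l_k m_k - l_k^2 / a_k), beta_k = (a_k / (a1 a2 a3))^{1/2}."""
    beta = np.sqrt(a / np.prod(a))
    return 2 * beta * m - 2 * l / a, 2 * beta * l


rng = np.random.default_rng(1)
a = rng.uniform(0.5, 2.0, size=3)   # assumption: positive a_k
l = rng.normal(size=3)
m = rng.normal(size=3)

commutator = bracket(grad_H_plus(a, l, m), grad_H_minus(a, l, m), l, m)

# The quadratic Casimirs sum l_k^2 and sum m_k^2 commute with everything.
casimir_l = bracket((2 * l, np.zeros(3)), grad_H_plus(a, l, m), l, m)
casimir_m = bracket((np.zeros(3), 2 * m), grad_H_minus(a, l, m), l, m)
```

All three brackets should vanish up to rounding at any sample point; a single random point is of course only a spot check, not a proof.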
They coincide with the integrals of the interacting $so(3)$-tops discovered by Veselov [33].

4.4. Generalized Clebsch systems

In the next subsections we will consider hamiltonian systems obtained from the algebras $\tilde{\mathfrak g}_A$ in the case $\det A = 0$. In this subsection we will consider hamiltonian systems on the space $M_{1,0}(A)$ and obtain a generalization of the Clebsch systems for all series of classical Lie algebras. For the sake of simplicity we will restrict ourselves to the case of minimal degeneracy of $A$. Taking into account the restrictions on the form of $A$, we obtain that minimally degenerate matrices $A$ have matrix rank equal to $n - 1$ in the case of $so(n)$ and $gl(n)$, and $n - 2$ in the case of $sp(n)$. The Lax operator in all these cases has the same form as in the case of non-degenerate matrices $A$:

$$L(\lambda) \equiv L^{(-1)} = \sum_{i,j=1}^{n} l^{(-1)}_{ij} X_{ji}.$$

The Lie–Poisson structure in the space $M_{1,0}(A)$ has the following form:

$$\{l^{(-1)}_{ij}, l^{(-1)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}(A)\, l^{(-1)}_{pq}. \tag{30}$$
Let us identify this Lie–Poisson bracket with the standard one. For this purpose we need to transform the matrix $A$ to some standard form. The following proposition holds true:

Proposition 4.7. Let the matrix $A$ be as described above. Then by inner automorphisms of the algebras $\tilde{\mathfrak g}_A$ the matrix $A$ can be transformed to the form $A = \mathrm{diag}(A', 0)$, where $A' \in \mathrm{Mat}(n-1)$ in the case of $gl(n)$ or $so(n)$, and $A' \in \mathrm{Mat}(n-2)$ in the case of $sp(n)$.

Let us now transform the bracket (30) to the standard form. For this purpose we represent the matrix $L^{(-1)}$ in block form:

$$L^{(-1)} = \begin{pmatrix} L^{(-1)}_{11} & L^{(-1)}_{12} \\ L^{(-1)}_{21} & L^{(-1)}_{22} \end{pmatrix},$$

where $L^{(-1)}_{11} \in \mathrm{Mat}(m)$, $L^{(-1)}_{22} \in \mathrm{Mat}(n - m)$, $m = n - 1$ in the case of $gl(n)$ and $so(n)$, and $m = n - 2$ in the case of $sp(n)$. By direct calculation we prove the following proposition:

Proposition 4.8. Let the blocks $L_{ij}$, $i, j = 1, 2$, of the block spectral-parameter-independent matrix $L$ be defined as follows: $L_{11} = (A')^{-1/2} L^{(-1)}_{11} (A')^{-1/2}$, $L_{12} = (A')^{-1/2} L^{(-1)}_{21}$, $L_{21} = L^{(-1)}_{12} (A')^{-1/2}$, $L_{22} = L^{(-1)}_{22}$. Then the Lie–Poisson bracket between the matrix elements $l_{ij}$ of the matrix $L$ has the form of the Lie–Poisson bracket of:

(i) the euclidean Lie algebra $e(n-1)$ if the algebra $\mathfrak g$ is equal to $so(n)$;

(ii) the Lie algebra $gl(n-1) + H^{2n-1}$, where $H^{2n-1}$ is the Heisenberg algebra in the space $\mathbb{R}^{2n-1}$ constituted by the elements of $L_{12}$, $L_{21}$ and the center $L_{22}$, if the algebra $\mathfrak g$ is equal to $gl(n)$;

(iii) the Lie algebra $sp(n-1) + N^{2n-1}$, where $N^{2n-1}$ is the nilpotent algebra in the space $\mathbb{R}^{2n-1}$ constituted by the elements of $L_{12}$, $L_{21}$ and the center $L_{22}$, if the algebra $\mathfrak g$ is equal to $sp(n)$.

Remark 4.2. In order to obtain precise analogs of the Clebsch system we will hereafter put $L_{22} \equiv I L^{(-1)} I = 0$, considering the algebra $gl(n-1) + \mathbb{R}^{2n-2}$ instead of $gl(n-1) + H^{2n-1}$ and the algebra $sp(n-1) + \mathbb{R}^{2n-4}$ instead of $sp(n-1) + N^{2n-1}$.

Let us now consider the commuting second-order integrals (hamiltonians) that can be obtained within the framework of our construction. For this purpose we
will consider block-diagonal non-degenerate matrices $\mathcal{A}$ of the type $\mathcal{A} = A + aI$, where $A$ is the matrix described above, $I = \mathrm{diag}(0, 0, \ldots, 0, 1)$ in the case of $gl(n)$ and $so(n)$, and $I = \mathrm{diag}(0, 0, \ldots, 0, 1, 1)$ in the case of $sp(n)$. By the very definition $A = \lim_{a \to 0} \mathcal{A}$. We will consider hamiltonians of the systems connected with $\tilde{\mathfrak g}_{\mathcal{A}}$ and obtain the corresponding hamiltonians on $\tilde{\mathfrak g}_A$ as a limiting case of the hamiltonians on $\tilde{\mathfrak g}_{\mathcal{A}}$. For the latter algebras we are in the setting of Sec. 4.1 and can use the previously obtained formulas. The commuting hamiltonians of the series $H^+_n(L^{(-1)})$ on $\tilde{\mathfrak g}_A$ are obtained directly from the hamiltonians $H^+_n(L^{(-1)})$ on $\tilde{\mathfrak g}_{\mathcal{A}}$ (see (20)) and have the same form as in the case of non-degenerate matrices:

$$H^+_n(L^{(-1)}) = \frac{1}{2} \lim_{a \to 0} \sum_{k=0}^{n} \mathrm{tr}\big(L^{(-1)} \mathcal{A}^{k} L^{(-1)} \mathcal{A}^{n-k}\big) = \frac{1}{2} \sum_{k=0}^{n} \mathrm{tr}\big(L^{(-1)} A^{k} L^{(-1)} A^{n-k}\big). \tag{31}$$
The commuting integrals of the series $H^-(\lambda)$ contain the expression $\mathcal{A}^{-1}$ and in the limit $a \to 0$ should be regularized in an appropriate way. We will illustrate this on the example of the first three functions of the set $H^-(\lambda)$, i.e. the functions $H^-_{-2}$, $H^-_{-3}$, $H^-_{-4}$. In order to obtain non-singular but non-trivial integrals in the limit $a \to 0$ we will consider linear combinations of the above three integrals. Taking into account that $\mathcal{A}^{-1} = A^{-1} + a^{-1} I$ we obtain:

$$H'_{-2}(L^{(-1)}) = \lim_{a \to 0} a^2 H^-_{-2}(L^{(-1)}) = \frac{1}{2}\,\mathrm{tr}(I L^{(-1)} I L^{(-1)}),$$

$$H'_{-3}(L^{(-1)}) = \lim_{a \to 0} \big(a^2 H^-_{-3}(L^{(-1)}) - 2a H^-_{-2}(L^{(-1)})\big) = \mathrm{tr}(I L^{(-1)} A^{-1} L^{(-1)}),$$

$$H'_{-4}(L^{(-1)}) = \lim_{a \to 0} \big(a^2 H^-_{-4}(L^{(-1)}) - 2a H^-_{-3}(L^{(-1)}) + H^-_{-2}(L^{(-1)})\big) = \frac{1}{2}\,\mathrm{tr}(A^{-1} L^{(-1)} A^{-1} L^{(-1)}) - \mathrm{tr}(A^{-2} L^{(-1)} I L^{(-1)}).$$

Here, slightly abusing the language, we have introduced the notation $A^{-1} \equiv \mathrm{diag}((A')^{-1}, 0)$. Let us now consider the hamiltonians in the above standard coordinates. We obtain:

$$H'_{-2} = \frac{1}{2}\,\mathrm{tr}(I L^{(-1)} I L^{(-1)}) = 0, \qquad H'_{-3} = \mathrm{tr}(L I L), \qquad H'_{-4} = \frac{1}{2}\,\mathrm{tr}\big((1_n - I) L^2\big) - \mathrm{tr}(A^{-1} L I L).$$

In an analogous way for the hamiltonian $H^+_0(L)$ we have the following expression:

$$H^+_0(L^{(-1)}) = \frac{1}{2}\,\mathrm{tr}(A L A L) + \mathrm{tr}(A L I L).$$
Let us now calculate the corresponding $M$-operators. They have the following form:

$$M'_{-2} \equiv 0, \qquad M'_{-3} = \lim_{a \to 0} a^2 M^-_{-3} = \lim_{a \to 0} a^2 \big(\mathcal{A}^{-1} L^{(-1)} \mathcal{A}^{-1}\big) = I L^{(-1)} I = 0,$$

$$M'_{-4} = \lim_{a \to 0} a^2 \big(M^-_{-4} - 2a^{-1} M^-_{-3}\big) = -\big(I L^{(-1)} A + A L^{(-1)} I\big), \qquad M^+_0 = L^{(-1)}.$$

From this it follows that (after the restriction $H'_{-2} = 0$ is imposed) the function $H'_{-3}$ becomes a Casimir function. The functions $H'_{-4}$ and $H^+_0$ generate non-trivial hamiltonian flows on each coadjoint orbit of $\mathfrak g_A$. For the hamiltonian of the generalized Clebsch system, in order to be in good agreement with the $\mathfrak g = so(4)$ case, we will take the function $H'_{-4}$.
Example 4.4. Let us consider the case $\mathfrak g = so(4)$. Putting $A = \mathrm{diag}(a_1, a_2, a_3, 0)$ we obtain the usual Clebsch system. Let us consider this in detail. The Lax operator has the form:

$$L = \sum_{i,j=1}^{3} l^{(-1)}_{ij} X_{ij} + \sum_{i=1}^{3} l^{(-1)}_{i4} X_{i4}.$$

Introducing the coordinates $l^{(-1)}_k = \epsilon_{ijk}\, l^{(-1)}_{ij}$ we obtain that the Lie–Poisson brackets between the coordinate functions have the following form:

$$\{l^{(-1)}_i, l^{(-1)}_j\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_k, \qquad \{l^{(-1)}_i, l^{(-1)}_{j4}\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_{k4}, \qquad \{l^{(-1)}_{i4}, l^{(-1)}_{j4}\}_0 = 0.$$

As follows from what was said above, the function $H'_{-3}(L^{(-1)}) = \sum_{k=1}^{3} a_k^{-1} (l^{(-1)}_{k4})^2$ is a Casimir function of this bracket. The non-trivial commuting integrals have the following form:

$$H^+_0(L^{(-1)}) = \sum_{k=1}^{3} (l^{(-1)}_k)^2 + \sum_{k=1}^{3} (l^{(-1)}_{k4})^2,$$

$$H'_{-4}(L^{(-1)}) = (a_1 a_2 a_3)^{-1} \sum_{k=1}^{3} a_k (l^{(-1)}_k)^2 - \sum_{k=1}^{3} a_k^{-2} (l^{(-1)}_{k4})^2.$$

The change of variables

$$l_k = \frac{(a_k)^{1/2}}{(a_1 a_2 a_3)^{1/2}}\, l^{(-1)}_k, \qquad x_k = \frac{l^{(-1)}_{k4}}{a_k^{1/2}}$$

gives us the standard $e(3)$ bracket:

$$\{l_i, l_j\}_0 = \epsilon_{ijk}\, l_k, \qquad \{l_i, x_j\}_0 = \epsilon_{ijk}\, x_k, \qquad \{x_i, x_j\}_0 = 0.$$

In these standard coordinates we obtain for the hamiltonians the following expressions:

$$H^+_0(L^{(-1)}) = \sum_{k=1}^{3} \frac{a_1 a_2 a_3}{a_k}\, l_k^2 + \sum_{k=1}^{3} a_k x_k^2, \qquad H'_{-4}(L^{(-1)}) = \sum_{k=1}^{3} l_k^2 - \sum_{k=1}^{3} a_k^{-1} x_k^2.$$
They coincide with the usual hamiltonians of the ordinary Clebsch system [32].
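The involutivity of the two Clebsch integrals of Example 4.4 in the standard $e(3)$ coordinates can be spot-checked numerically. The sketch below assumes positive $a_k$ and the bracket $\{l_i, l_j\}_0 = \epsilon_{ijk} l_k$, $\{l_i, x_j\}_0 = \epsilon_{ijk} x_k$, $\{x_i, x_j\}_0 = 0$, which gives $\{F, G\} = l \cdot (\nabla_l F \times \nabla_l G) + x \cdot (\nabla_l F \times \nabla_x G + \nabla_x F \times \nabla_l G)$; the names are hypothetical.

```python
import numpy as np


def e3_bracket(grad_f, grad_g, l, x):
    """Lie-Poisson bracket on e(3)^* from the gradient pairs (grad_l, grad_x)."""
    (fl, fx), (gl, gx) = grad_f, grad_g
    return l @ np.cross(fl, gl) + x @ (np.cross(fl, gx) + np.cross(fx, gl))


rng = np.random.default_rng(2)
a = rng.uniform(0.5, 2.0, size=3)   # assumption: positive a_k
l = rng.normal(size=3)
x = rng.normal(size=3)

# H_0^+ = sum (a1 a2 a3 / a_k) l_k^2 + sum a_k x_k^2
grad_H0 = (2 * np.prod(a) / a * l, 2 * a * x)
# H'_{-4} = sum l_k^2 - sum x_k^2 / a_k
grad_H4 = (2 * l, -2 * x / a)
# Casimirs of e(3): sum x_k^2 and sum l_k x_k
grad_C1 = (np.zeros(3), 2 * x)
grad_C2 = (x, l)

commutator = e3_bracket(grad_H0, grad_H4, l, x)
casimir_checks = [
    e3_bracket(grad_C, grad_H, l, x)
    for grad_C in (grad_C1, grad_C2)
    for grad_H in (grad_H0, grad_H4)
]
```

Together with the two Casimirs this gives four commuting functions on the six-dimensional space $e(3)^*$, which is the count needed for integrability on the generic four-dimensional orbits.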
4.5. Spin generalization of the Clebsch systems

In this subsection we will consider hamiltonian systems in the space $M_{1,1}(A) \simeq \mathfrak g^* \oplus \mathfrak g^*_A$ in the case $\det A = 0$. We will call the corresponding hamiltonian systems “spin generalizations of the Clebsch systems”. In the case of the space $M_{1,1}(A)$ the Lax operator is written as:

$$L(\lambda) \equiv \lambda^{-1} L^{(0)} + L^{(-1)} = \sum_{i,j=1}^{n} \big(\lambda^{-1} l^{(0)}_{ij} X_{ji} + l^{(-1)}_{ij} X_{ji}\big).$$

The Lie–Poisson bracket on $M_{1,1}(A) \simeq (\mathfrak g \oplus \mathfrak g_A)^*$ will be the following:

$$\{l^{(-1)}_{ij}, l^{(-1)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}(A)\, l^{(-1)}_{pq}, \qquad \{l^{(0)}_{ij}, l^{(0)}_{kl}\}_0 = \sum_{p,q} C^{pq}_{ij,kl}\, l^{(0)}_{pq}, \qquad \{l^{(0)}_{ij}, l^{(-1)}_{kl}\}_0 = 0. \tag{32}$$
Let us again consider the simplest degeneracy of the matrix $A$ and transform the Lie–Poisson bracket (32) to the standard form. The following proposition holds true:

Proposition 4.9. (i) If the Lie algebra $\mathfrak g$ is equal to $so(n)$ then the corresponding Lie algebra $\mathfrak g \oplus \mathfrak g_A$ is isomorphic to the Lie algebra $so(n) \oplus e(n-1)$.

(ii) If the Lie algebra $\mathfrak g$ is equal to $gl(n)$ then the Lie algebra $\mathfrak g \oplus \mathfrak g_A$ is isomorphic to the Lie algebra $gl(n) \oplus (gl(n-1) + H^{2n-1})$.

(iii) If the Lie algebra $\mathfrak g$ is equal to $sp(n)$ then the Lie algebra $\mathfrak g \oplus \mathfrak g_A$ is isomorphic to the Lie algebra $sp(n) \oplus (sp(n-1) + N^{2n-1})$.

Proof. It follows from the explicit substitution of the variables. Indeed, let $M \equiv L^{(0)}$ and let the matrix $L$ be defined as in the previous subsection. Then it is easy to see that the matrix elements of these two matrices are the desired standard coordinates on $\mathfrak g \oplus \mathfrak g_A$.

Remark 4.3. As in the previous case of the generalized Clebsch systems we may put $L_{22} \equiv I L^{(-1)} I = 0$, considering the Lie algebra $gl(n) \oplus (gl(n-1) + \mathbb{R}^{2n-2})$ instead of $gl(n) \oplus (gl(n-1) + H^{2n-1})$ and the Lie algebra $sp(n) \oplus (sp(n-1) + \mathbb{R}^{2n-4})$ instead of $sp(n) \oplus (sp(n-1) + N^{2n-1})$.

Let the matrices $A$, $A'$, $I$, $\mathcal{A}$ be defined as in the previous subsection. We will consider our systems as the limiting case of the systems on $M_{1,1}(\mathcal{A})$ defined with the help of the non-degenerate matrices $\mathcal{A} \equiv A + aI$. The commuting hamiltonians from the series $H^+(L(\lambda))$ in the limit $a \to 0$ will have the same form as in the non-degenerate case:

$$H^+_n(L^{(-1)}) = \frac{1}{2} \sum_{m_1, m_2 = -1}^{0}\ \sum_{\substack{k_1, k_2 = 0 \\ k_1 + k_2 = n - (m_1 + m_2)}} \mathrm{tr}\big(L^{(-m_1-1)} A^{k_1} L^{(-m_2-1)} A^{k_2}\big).$$

The commuting integrals from the series $H^-(L(\lambda))$ contain the matrix $\mathcal{A}^{-1}$. To be correctly defined in the limit $a = 0$ they require a kind of regularization. We will
again demonstrate this regularization on the example of the first three functions of the set $H^-(\lambda)$. More explicitly, instead of the hamiltonians $H^-_{-2}(L)$, $H^-_{-3}(L)$, $H^-_{-4}(L)$, which are singular in the case of degenerate matrices $A$, we will consider the following hamiltonians:

$$H'_{-2}(L(\lambda)) = \lim_{a \to 0} a^2 H^-_{-2}(L(\lambda)) = \frac{1}{2}\,\mathrm{tr}(I L^{(-1)} I L^{(-1)}),$$

$$H'_{-3}(L(\lambda)) = \lim_{a \to 0} \big(a^2 H^-_{-3}(L(\lambda)) - 2a H^-_{-2}(L(\lambda))\big) = \mathrm{tr}(I L^{(-1)} I L^{(0)}) - \mathrm{tr}(I L^{(-1)} A^{-1} L^{(-1)}),$$

$$H'_{-4}(L(\lambda)) = \lim_{a \to 0} \big(a^2 H^-_{-4}(L(\lambda)) - 2a H^-_{-3}(L(\lambda)) + H^-_{-2}(L(\lambda))\big)$$

$$= \frac{1}{2}\,\mathrm{tr}(A^{-1} L^{(-1)} A^{-1} L^{(-1)}) + \frac{1}{2}\,\mathrm{tr}(I L^{(0)} I L^{(0)}) - \big(\mathrm{tr}(A^{-2} L^{(-1)} I L^{(-1)}) + \mathrm{tr}(A^{-1} L^{(0)} I L^{(-1)}) + \mathrm{tr}(A^{-1} L^{(0)} I L^{(-1)})\big).$$

The corresponding $M$-operators have the following form:

$$M'_{-2} \equiv 0, \qquad M'_{-3} = \lim_{a \to 0} a^2 M^-_{-3} = \lim_{a \to 0} a^2 \big(\mathcal{A}^{-1} L^{(-1)} \mathcal{A}^{-1}\big) = I L^{(-1)} I,$$

$$M'_{-4} = \lim_{a \to 0} a^2 \big(M^-_{-4} - 2a^{-1} M^-_{-3}\big) = \lambda I L^{(-1)} I + \big(I L^{(0)} I - (I L^{(-1)} A^{-1} + A^{-1} L^{(-1)} I)\big).$$

In the standard coordinates introduced above the obtained hamiltonians are written as:

$$H'_{-2}(L^{(-1)}) = \frac{1}{2}\,\mathrm{tr}(I L^{(-1)} I L^{(-1)}) = 0, \qquad H'_{-3}(L^{(-1)}) = \mathrm{tr}(I L^{(-1)} A^{-1} L^{(-1)}) = \mathrm{tr}(L I L),$$

$$H'_{-4}(L^{(-1)}) = \frac{1}{2}\big(\mathrm{tr}((1_n - I) L^2) + \mathrm{tr}(I M I M)\big) - \big(\mathrm{tr}(A^{-1} L I L) + \mathrm{tr}(M A^{-1/2} L I) + \mathrm{tr}(M I L A^{-1/2})\big).$$

After the restriction $H'_{-2} = 0$ is imposed, the function $H'_{-3}$ becomes a Casimir function. The function $H'_{-4}$ generates a non-trivial hamiltonian flow on each coadjoint orbit of $\mathfrak g \oplus \mathfrak g_A$ and will be chosen as the hamiltonian of the spin generalization of the Clebsch system. In order to show that the corresponding systems are indeed spin generalizations of the Clebsch systems, we will consider the following example:
Example 4.5. Let us consider the case $\mathfrak g = so(4)$. Putting $A = \mathrm{diag}(a_1, a_2, a_3, 0)$ we obtain a spin generalization of the usual Clebsch system. Let us consider it in detail. The Lax operator is written as follows:

$$L = \lambda^{-1} \sum_{i,j=1}^{3} l^{(0)}_{ij} X_{ij} + \lambda^{-1} \sum_{i=1}^{3} l^{(0)}_{i4} X_{i4} + \sum_{i,j=1}^{3} l^{(-1)}_{ij} X_{ij} + \sum_{i=1}^{3} l^{(-1)}_{i4} X_{i4}.$$

Introducing the coordinates $l^{(-1)}_k = \epsilon_{ijk}\, l^{(-1)}_{ij}$, $l^{(0)}_k = \epsilon_{ijk}\, l^{(0)}_{ij}$ we obtain that the Lie–Poisson brackets between the coordinate functions have the following form:

$$\{l^{(-1)}_i, l^{(-1)}_j\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_k, \qquad \{l^{(-1)}_i, l^{(-1)}_{j4}\}_0 = \epsilon_{ijk}\, a_k\, l^{(-1)}_{k4}, \qquad \{l^{(-1)}_{i4}, l^{(-1)}_{j4}\}_0 = 0,$$

$$\{l^{(0)}_i, l^{(0)}_j\}_0 = \epsilon_{ijk}\, l^{(0)}_k, \qquad \{l^{(0)}_i, l^{(0)}_{j4}\}_0 = \epsilon_{ijk}\, l^{(0)}_{k4}, \qquad \{l^{(0)}_{i4}, l^{(0)}_{j4}\}_0 = \epsilon_{ijk}\, l^{(0)}_k,$$

$$\{l^{(0)}_i, l^{(-1)}_j\}_0 = 0, \qquad \{l^{(0)}_i, l^{(-1)}_{j4}\}_0 = 0, \qquad \{l^{(-1)}_i, l^{(0)}_{j4}\}_0 = 0, \qquad \{l^{(0)}_{i4}, l^{(-1)}_{j4}\}_0 = 0.$$
The functions $H'_{-3}(L^{(-1)}) = \sum_{k=1}^{3} a_k^{-1} (l^{(-1)}_{k4})^2$ and $H^+_{-2}(L^{(-1)}) = \sum_{k=1}^{3} (l^{(0)}_k)^2 + \sum_{k=1}^{3} (l^{(0)}_{k4})^2$ are Casimir functions of this bracket. The hamiltonian of our system has the following form:

$$H'_{-4}(L^{(-1)}) = (a_1 a_2 a_3)^{-1} \sum_{k=1}^{3} a_k (l^{(-1)}_k)^2 - \sum_{k=1}^{3} a_k^{-2} (l^{(-1)}_{k4})^2 + 2 \sum_{k=1}^{3} a_k^{-1}\, l^{(-1)}_{k4}\, l^{(0)}_{k4}.$$

The change of variables

$$l_k = \frac{(a_k)^{1/2}}{(a_1 a_2 a_3)^{1/2}}\, l^{(-1)}_k, \qquad x_k = \frac{l^{(-1)}_{k4}}{a_k^{1/2}}, \qquad m_k = l^{(0)}_k, \qquad y_k = l^{(0)}_{k4}$$

transforms the above bracket to the standard $so(4) \oplus e(3)$ form:

$$\{l_i, l_j\}_0 = \epsilon_{ijk}\, l_k, \qquad \{l_i, x_j\}_0 = \epsilon_{ijk}\, x_k, \qquad \{x_i, x_j\}_0 = 0,$$

$$\{m_i, m_j\}_0 = \epsilon_{ijk}\, m_k, \qquad \{m_i, y_j\}_0 = \epsilon_{ijk}\, y_k, \qquad \{y_i, y_j\}_0 = \epsilon_{ijk}\, m_k,$$

$$\{m_i, l_j\}_0 = \{m_i, x_j\}_0 = \{l_i, y_j\}_0 = \{x_i, y_j\}_0 = 0.$$

In the standard coordinates we obtain for our hamiltonian the following expression:

$$H'_{-4}(L^{(-1)}) = \sum_{k=1}^{3} l_k^2 - \sum_{k=1}^{3} a_k^{-1} x_k^2 + 2 \sum_{k=1}^{3} a_k^{-1/2} x_k y_k.$$
Taking into account that $so(4) \simeq so(3) \oplus so(3)$, and introducing the corresponding coordinates of the direct sum, $t_k \equiv \frac{1}{2}(m_k + y_k)$, $s_k \equiv \frac{1}{2}(m_k - y_k)$:

$$\{t_i, t_j\}_0 = \epsilon_{ijk}\, t_k, \qquad \{s_i, s_j\}_0 = \epsilon_{ijk}\, s_k, \qquad \{t_i, s_j\}_0 = 0,$$

we obtain for our hamiltonian the following formula:

$$H'_{-4}(L^{(-1)}) = \sum_{k=1}^{3} l_k^2 - \sum_{k=1}^{3} a_k^{-1} x_k^2 + \sum_{k=1}^{3} a_k^{-1/2} x_k (t_k - s_k).$$

This is the hamiltonian of the Clebsch system interacting with two $so(3)$ “spins”. Finally, putting $t_k$ or $s_k$ equal to zero we obtain the hamiltonian of the Clebsch system that interacts with the spin $\vec{s} \in so(3)$:

$$H'_{-4}(L^{(-1)}) = \sum_{k=1}^{3} l_k^2 - \sum_{k=1}^{3} a_k^{-1} x_k^2 \pm \sum_{k=1}^{3} a_k^{-1/2} x_k s_k.$$
4.6. Completeness of the constructed families of integrals

The proof of the fact that the functions of the series $I^{r\pm}(\lambda)$ form a complete family of integrals of motion on all the spaces $M_{s,p}(A)$ is, in general, a complicated problem. A direct computational proof of this statement is possible only in the small-rank cases where the independent integrals have degree not higher than two in the generators of the algebra (see [27] for the $so(3)$ case). Nevertheless it turns out that it is possible to prove complete integrability of the constructed hamiltonian systems on the spaces $M_{s,p}(A)$ with $s + p \le 2$, provided that the matrix $A$ determining the deformation of the loop algebra $L(\mathfrak g)$ is sufficiently generic. The following theorem holds true:

Theorem 4.1. Let us consider a hamiltonian system on the space $M_{s,p}(A)$, where $s + p \le 2$ and $\mathfrak g = so(n)$ or $\mathfrak g = gl(n)$. Let the matrix $A$ be a regular element of $gl(n)$. Then the hamiltonian systems on $M_{s,p}(A) \simeq (\mathfrak g_{s,p})^*$ with the hamiltonian $H \in \{I^{r\pm}_k\}$ are completely integrable on the generic coadjoint orbits of $G_{s,p}$, where $G_{s,p}$ is the Lie group of the Lie algebra $\mathfrak g_{s,p}$.

Proof. In order to prove the theorem it is necessary to show that among the functions $H \in \{I^{r\pm}_k\}$ restricted to the space $M_{s,p}(A)$ there are $\frac{1}{2}(\dim \mathfrak g_{s,p} + \mathrm{ind}\, \mathfrak g_{s,p})$ functionally independent integrals. We will rely on the following proposition:

Proposition 4.10. (i) Let the matrix $A$ be a regular element of $gl(n)$. Then the hamiltonian systems on $M_{0,1}(A)$ with the hamiltonian $H \in \{I^{r\pm}_k\}$ are completely integrable on the generic coadjoint orbits of $G_{0,1}$, and the complete families of functionally independent mutually commuting integrals can be chosen among the functions $\{I^{r+}_k\}$.

(ii) Let the matrix $A$ be a regular element of $gl(n)$ and $\det A \neq 0$. Then the hamiltonian systems on $M_{1,0}(A)$ with the hamiltonian $H \in \{I^{r\pm}_k\}$ are completely integrable on the generic coadjoint orbits of $G_{1,0}$, and the complete families of functionally independent mutually commuting integrals can be chosen among the functions $\{I^{r-}_k\}$.

Proof. Let us prove item (i). It is easy to see that $\mathfrak g_{0,1} \simeq \mathfrak g$ for all matrices $A$. On the other hand, by direct verification one can prove that in this case the family of integrals $\{I^{r+}_k(L, A)\}$ coincides with the family of integrals constructed with the help of the “shift of the argument” procedure on the constant covector $A$ applied to the Casimir functions $I^p(L)$. Now the proof of item (i) follows from the fact that the set of integrals constructed with the help of the “shift of the argument” procedure is complete on generic coadjoint orbits in the case of a regular “shift-matrix” $A$ [37]. The proof of item (ii) follows from item (i) and the explicit form of the isomorphism between the Lie algebras of Poisson functions on $M_{0,1}(A^{-1}) \simeq \mathfrak g^*$ and $M_{1,0}(A) \simeq \mathfrak g^*_A$ (see Proposition 4.2), which maps the set of functions $\{I^{r+}_k\}$ on $M_{0,1}(A)$ onto the set of functions $\{I^{r-}_k\}$ on $M_{1,0}(A^{-1})$.
Let us proceed with the proof of the theorem and assume that A is a regular element of gl(n) and $\det A \neq 0$. Now let us consider integrable systems on $M_{1,1}(A) \simeq (g \oplus g_A)^*$. It follows from the above proposition that there are $\frac{1}{2}(\dim g + \mathrm{ind}\, g)$ independent functions $\{I_k^{r+}\}$ considered as functions on $g^*$ and $\frac{1}{2}(\dim g_A + \mathrm{ind}\, g_A)$ independent functions $\{I_k^{r-}\}$ considered as functions on $g_A^*$. Hence we have $\frac{1}{2}(\dim g_{1,1} + \mathrm{ind}\, g_{1,1})$ independent functions on $M_{1,1}(A) \simeq g_{1,1}^*$. That proves integrability of the constructed hamiltonian systems in the case of the spaces $M_{1,1}(A)$. To prove also integrability of the constructed hamiltonian systems on $M_{2,0}(A)$ and $M_{0,2}(A)$ it is enough to notice that the integrals $\{I_k^{r\pm}\}$ on $M_{2,0}(A)$ and $M_{0,2}(A)$ are given by the same expressions as on $M_{1,1}(A)$ and that $\dim g_{1,1} = \dim g_{2,0} = \dim g_{0,2}$, $\mathrm{ind}\, g_{1,1} = \mathrm{ind}\, g_{2,0} = \mathrm{ind}\, g_{0,2}$.

Now, let $\det A = 0$. It is evident that in order for the matrix A to be a regular element of gl(n) it should have matrix rank equal to n − 1. As was shown in the previous subsection, such matrices can be written in the form $A = \mathrm{diag}(A', 0)$, where $A' \in gl(n-1)$. Let α be a real or complex number that does not coincide with any eigenvalue of A. Then the matrix $\tilde{A} = A - \alpha 1_n$ is a regular element of gl(n) and $\det \tilde{A} \neq 0$. The matrices A and $\tilde{A}$ determine the same algebras of integrals. To prove this it is enough to notice that $I^r(A(\lambda)) \sim I^r(\tilde{A}(\lambda'))$, where $\lambda' = \frac{\lambda}{1 - \alpha\lambda}$. Hence in this case there is the same number of independent commuting integrals as in the case of the non-degenerate matrices A. Now, to prove that this is sufficient for complete integrability on the generic coadjoint orbit it is enough to notice that in the considered degenerate case the algebras $g_{s,p}$, $s + p \le 2$, have the same indices and dimensions as in the case of the non-degenerate matrices A.

Remark 4.4. In the course of the proof of the above theorem we have essentially used the fact that for the cases g = so(n) and g = gl(n) the obtained hamiltonian systems in the spaces $M_{0,1}(A)$ coincide with the known ones and their integrability has been proved (see [37] and references therein). Unfortunately, the same proof cannot be given for the case of the algebras sp(n). In this case the obtained hamiltonian systems on $M_{0,1}(A)$ are not those of the “complex” or “normal” series and we cannot rely on the classical results of [37].

4.7. Remarks on the spectral curve and integration procedure

At the end of the paper we want to make several comments on the explicit procedure of the integration of the obtained hamiltonian equations of motion. There is a standard scheme of construction of solutions of the integrable hamiltonian equations admitting Lax representations [38, 40]. It is based on the notion of a spectral (determinant) curve and its Baker–Ahiezer functions. It relies on the possibility of reconstructing L operators using Baker–Ahiezer functions on the spectral curve. Now we will show how to modify the definition of a spectral curve and the Baker–Ahiezer function in order to integrate Lax equations written in the “deformed” form. The following proposition holds true:
Proposition 4.11. Let $L(\lambda) \in M_{s,p}(A)$ satisfy the “deformed” Lax equations (11). Then
(i) the spectral curve for the “deformed” Lax equations (11) coincides with the following curve:
$$R_{s,p}(\lambda, \mu) \equiv \det(L(\lambda) - \mu A(\lambda)) = 0 , \qquad (33)$$
(ii) the “deformed” eigenvalue problem for the Lax matrix has the following form:
$$L(\lambda)\psi = \mu A(\lambda)\psi , \qquad (34)$$
(iii) if $t_k$ is the time variable that corresponds to the hamiltonian $H_k$ and the M operator $M_k$, then the function ψ satisfies the following equation:
$$\frac{\partial \psi}{\partial t_k} = M_k(\lambda) A(\lambda) \psi , \qquad (35)$$
(iv) if $\psi_j \equiv \psi(t_1, \ldots, t_k; \gamma_j)$, where $\gamma_j = (\lambda, \mu_j) \in R_{s,p}$ and $\mu_j$ is a “deformed” eigenvalue of $L(\lambda)$, is a solution of Eqs. (34) and (35), then the solution $L(t_1, \ldots, t_k; \lambda)$ of Eq. (11) is given by the formula:
$$L(\lambda) = A(\lambda) \Psi \hat{\mu} \Psi^{-1} , \qquad (36)$$
where Ψ is the matrix whose columns coincide with the vector functions $\psi_j$, and $\hat{\mu} = \mathrm{diag}(\mu_1, \ldots, \mu_n)$.

Proof. In order to prove this proposition we will use a functional realization of the algebra $\tilde{g}_A$ and its dual space $\tilde{g}^*_A$ different from the one used in the main body of the article. Namely, let us make the substitution $\tilde{M}(\lambda) = M(\lambda)A(\lambda)$, $\tilde{L}(\lambda) = A(\lambda)^{-1} L(\lambda)$ in Eq. (11). This provides a realization of $\tilde{g}_A$ as a special subalgebra of the gl(n) loop algebra and transforms Eq. (16) to the form of the standard Lax equation. After this, all the statements of the proposition follow from the standard theorems for the ordinary Lax equations [38, 40].

Remark 4.5. Note that the algebraic curve $R_{s,p}$ differs from the one that could be obtained with the help of the Kostant–Adler scheme and ordinary loop algebras [6]. In particular, the curve $R_{s,p}$ is never hyper-elliptic.

Acknowledgments

The author expresses gratitude to P. I. Holod for many helpful discussions. The research was partially supported by INTAS Young Scientist Fellowship Nr 03-552233.

References

[1] V. Kac, Infinite-Dimensional Lie Algebras (Mir, Moscow, 1993).
[2] A. G. Reyman and M. A. Semenov-Tian-Shansky, Invent. Math. 54 (1979) 81.
[3] A. G. Reyman, Zapiski LOMI 95 (1980) 3.
[4] B. Kostant, Adv. Math. 34 (1979) 195.
[5] M. Adler and P. van Moerbeke, Commun. Math. Phys. 83 (1982) 83.
[6] A. G. Reyman and M. A. Semenov-Tian-Shansky, VINITI: Fundam. Trends 6 (1989) 145.
[7] A. G. Reyman and M. A. Semenov-Tian-Shansky, Integrable system. Theoretically-group approach (R&C Dynamics, Izhevsk, 2003), p. 351.
[8] V. G. Drinfeld and V. V. Sokolov, J. Sov. Math. 30 (1985) 1975.
[9] M. F. de Groot, T. J. Hollowood and J. L. Miramontes, Commun. Math. Phys. 145(1) (1992) 57.
[10] L. Feher, J. Harnad and I. Marshall, Commun. Math. Phys. 154 (1993) 181.
[11] F. Delduc, L. Feher and L. Gallot, J. Phys. A31 (1998) 5545.
[12] L. Tahtadjan and L. Faddejev, Hamiltonian Approach in the Theory of Solitons (Nauka, Moscow, 1986).
[13] I. L. Cantor and D. E. Persits, Closed stacks of Poisson brackets, in Proc. IX USSR Conf. in Geometry (Shtinitsa, Kishinev, 1988), p. 141.
[14] P. I. Holod and T. V. Skrypnyk, Naukovi Zapysky NAUKMA, Ser. Phys.-Math. Sciences 18 (2000) 20.
[15] T. V. Skrypnyk, Lie algebras on hyperelliptic curves and finite-dimensional integrable systems, in Proc. XXIII Int. Colloquium on the Group Theoretical Methods in Physics, Dubna, Russia, 31 July–5 August 2000; nlin.SI-0010005 (e-print).
[16] T. V. Skrypnyk, J. Math. Phys. 42(9) (2001) 4570.
[17] T. V. Skrypnyk and P. I. Holod, J. Phys. A: Mathematical and General 34(9) (2001) 1123.
[18] T. Skrypnyk, Czech. J. Phys. 52(11) (2002) 1283.
[19] T. Skrypnyk, Czech. J. Phys. 53(11) (2003) 1119.
[20] I. V. Cherednik, Funct. Anal. Appl. 17(3) (1983) 93.
[21] L. Bordag and A. Yanovsky, J. Phys. A28 (1995) 4007.
[22] I. Z. Golubchik and V. V. Sokolov, Theoret. Math. Phys. 124(1) (2000) 62.
[23] I. Z. Golubchik and V. V. Sokolov, Funct. Anal. Appl. 34(4) (2000) 75.
[24] I. Z. Golubchik and V. V. Sokolov, Funct. Anal. Appl. 36(3) (2002) 172.
[25] P. I. Holod, Doklady Academy of Sciences of Ukrainian SSR 276(5) (1984) 5.
[26] P. I. Holod, Theoret. Math. Phys. 70(1) (1987) 18.
[27] P. I. Holod, Doklady Academy of Sciences of the USSR 292(5) (1987) 1087.
[28] A. S. Mishchenko and A. T. Fomenko, Izv. AN SSSR, Ser. Mat. 42 (1978) 396.
[29] A. S. Mishchenko and A. T. Fomenko, Trudy Seminara po Tenz. i Vect. Analizu 19 (1979) 3.
[30] S. V. Manakov, Funct. Anal. Appl. 10(4) (1976) 93.
[31] V. A. Steklov, Acta of the Harkov University (Harkov, 1893), p. 234.
[32] A. Clebsch, Math. Ann. (1870) 238.
[33] A. P. Veselov, Doklady Academy of Sciences of USSR 276(3) (1984) 590.
[34] I. M. Krichever and S. P. Novikov, Funct. Anal. Appl. 21(2) (1987) 46.
[35] A. M. Perelomov, Funct. Anal. Appl. 15(2) (1981) 83.
[36] A. V. Bolsinov, Trudy Seminara po Tenz. i Vect. Analizu 23 (1988) 18.
[37] A. T. Fomenko, Symplectic Geometry: Methods and Applications (Moscow University Press, Moscow, 1988).
[38] B. A. Dubrovin, I. M. Krichever and S. P. Novikov, VINITI, Fund. Trends 4 (1985) 179.
[39] Yu. Fedorov, Math. Notes 54(1) (1993) 94.
[40] E. D. Belokolos, A. I. Bobenko, V. Z. Enolski, A. R. Its and V. B. Matveev, Algebro-Geometric Approach to Nonlinear Integrable Equations (Springer, Berlin, Heidelberg, New York, 1994).
October 18, 2004 14:30 WSPC/148-RMP
00219
Reviews in Mathematical Physics Vol. 16, No. 7 (2004) 851–907 c World Scientific Publishing Company
ZERO MODES IN A SYSTEM OF AHARONOV–BOHM FLUXES
V. A. GEYLER
Department of Mathematics, Mordovian State University, Bolshevistskaya 68, Saransk 430000, Russia

P. ŠŤOVÍČEK
Department of Mathematics, Faculty of Nuclear Science, Czech Technical University, Trojanova 13, 120 00 Prague, Czech Republic

Received 4 February 2004
Revised 10 September 2004

We study zero modes of two-dimensional Pauli operators with Aharonov–Bohm fluxes in the case when the solenoids are arranged in periodic structures like chains or lattices. We also consider perturbations to such periodic systems which may be infinite and irregular but they are always supposed to be sufficiently scarce.

Keywords: Aharonov–Bohm flux; Pauli operator; Aharonov–Casher decomposition; zero mode.
1. Introduction

The appearance of zero modes (wave functions at zero energy which are ground states for a positive quantum Hamiltonian) belongs to the most interesting phenomena in systems with topologically non-trivial configuration spaces; see the discussion and an extensive bibliography in [1]. Zero modes of the Dirac and Pauli operators are of great importance in many places in quantum field theory and mathematical physics [2–4]. They are the ingredients for the computation of the index of these operators and play a key role in understanding anomalies. One of the best known examples for such operators is the Pauli Hamiltonian of a two-dimensional charged particle moving in a magnetic field perpendicular to a plane and penetrating the plane in a bounded domain. In this case the field defines a vector bundle with a non-trivial connection and zero modes appear at sufficiently high strength of the field [5]. More precisely, it is easy to prove that the dimension d of the space of zero modes is d = ⌊|Φ|⌋ where Φ is the total flux of the magnetic field measured in magnetic flux quanta, and for a real x, x ≥ 0, ⌊x⌋ denotes the lower integer part of x (⌊0⌋ = 0, ⌊n⌋ = n − 1 for n ≥ 1 integer, and otherwise ⌊x⌋ = [x], the integer part of x). It is worth noting that in the three-dimensional case the appearance
and the degeneracy of zero modes is a more subtle fact (see e.g. [6–11] and the discussion therein). In the current paper we restrict our consideration to two-dimensional systems only. More precisely, we consider Pauli operators which are Hamiltonians of an electron confined to a plane and subjected to a perpendicular time-independent magnetic field which is the sum of a uniform field and an additional field contributed by a (finite or infinite) array of singular flux tubes or, in other words, by an array of solenoids of zero width. We focus on zero modes in such systems. In more detail, the aim of the paper is to find conditions for the appearance of zero modes in systems placed in a magnetic field with an infinite array of Aharonov–Bohm vortices. It has been shown in [12] on the physical level of rigor that zero modes occur if Aharonov–Bohm vortices are arranged in a periodic plane lattice provided that not all magnetic fluxes involved have integer values. In this paper we present a rigorous proof and show that under the same condition imposed on the flux, the result is true for a chain of Aharonov–Bohm solenoids or, more generally, for a uniformly discrete union of such chains. Moreover, the zero modes are retained if one adds to such a periodic structure of Aharonov–Bohm solenoids a not necessarily regular array of solenoids having sufficiently low density. This stability of zero modes for the Hamiltonian that we call $H_{max}$ (its definition is discussed in Sec. 2) shows that their origin differs from that for localized states in the so-called Aharonov–Bohm cages [13, 14]; the latter are destroyed by arbitrarily small period modulations [15]. The main results of the paper are obtained with the help of a version of the Aharonov–Casher ansatz [5]. This version was proposed by Dubrovin and Novikov in [16] who employed it for an explicit construction of ground states of periodic magnetic Schrödinger operators (see Novikov’s review paper [17]).
In our case, this ansatz reduces the problem of finding zero modes to some estimates for entire functions. The mechanism of appearance of zero modes in the considered cases is close to that for a two-dimensional system in a uniform magnetic field in the presence of an infinite array of point scatterers [38, 18–22]. An interesting physical consequence of our result is the occurrence of oscillations of the type “localization–delocalization” in periodic systems of Aharonov–Bohm solenoids placed in a varying uniform magnetic field (Theorem 8.16). Another interesting result described in Theorems 8.8 and 8.16 is related to the problem of absolute continuity of the spectrum of the Schrödinger operator with periodic vector potential A. This absolute continuity has been proved for a wide class of potentials A [23–28]. An example of a vector potential A for which the corresponding Schrödinger operator has eigenvalues in its spectrum was given in [29] but only for dimensions higher than 3. Our results give such an example in dimension 2. The paper is organized as follows. In Sec. 2 we point out some aspects regarding the history and the background of the problem. In Sec. 3 we briefly discuss the gauge invariance in the case when the magnetic field is a distribution. In Sec. 4 we introduce several basic examples of models with Aharonov–Bohm
fluxes; some of them are the main subject of this paper and are studied in detail in the sequel. Section 5 is devoted to a rigorous definition of the Pauli operator with Aharonov–Bohm fluxes. In Sec. 6 we discuss the elimination of integer-valued Aharonov–Bohm fluxes. In Sec. 7 we recall the Aharonov–Casher ansatz which makes it possible to construct ground states of the Pauli operator using the theory of analytic functions. The main results of the paper are contained in Secs. 8 and 9. In Sec. 8 we study zero modes of the Pauli operator with an infinite periodic system of Aharonov–Bohm solenoids. In Sec. 9 we address the question of perturbations of such periodic structures caused by translations and additions of Aharonov–Bohm solenoids. The subsystem formed by solenoids affected by the perturbation may be infinite and irregular but we always suppose that it is sufficiently scarce. Here we also discuss some examples of irregular Aharonov–Bohm systems. For the reader’s convenience we have included three appendices. In the first appendix we collect some basic definitions and auxiliary results concerning lattices. In the second appendix we recall some basic notions and results from the theory of analytic functions related to the growth of entire functions. The third appendix is devoted to the Weierstrass σ-function.

2. Additional Comments on the History and the Background of the Problem

There are many interesting and important physical problems related to systems involving Aharonov–Bohm fluxes. Since the publication of the original paper due to Aharonov and Bohm [30] the physics of a magnetic flux in an infinitely thin solenoid (called Aharonov–Bohm flux or Aharonov–Bohm vortex) has been investigated both from theoretical and experimental points of view [31, 32]. The physical origin of the Aharonov–Bohm effect remains a subject of theoretical investigation to this day [33].
On the other hand, the motion of a charged particle (an electron, a hole or a composite fermion) in a plane perpendicular to a uniform magnetic field has found an important application in physics of the quantum Hall effect [34, 35]. The most striking feature of the Hamiltonian of such a system is the Landau quantization of the spectrum which consists of highly degenerated equidistant energy levels; this makes quantum Hall phenomena possible. Moreover, it is of interest to know how the quantum Hall system is altered by various defects, in particular, by impurities or by inhomogeneities of the magnetic field. Additional Aharonov–Bohm fluxes appear to be a minimal modification of the uniform magnetic field, while general inhomogeneous magnetic fields are extremely difficult to handle [36, 37]. Similarly, a minimal perturbation of the quantum Hall system is given by a point perturbation of the Landau operator (i.e. the Schrödinger operator with a uniform magnetic field) [38]. As shown below, both modifications require the operator extension theory for a correct construction of the corresponding Hamiltonian [39]. The vector potential of a system of Aharonov–Bohm solenoids has a strong singularity at the points where the plane intersects the solenoids. Therefore the
differential operator defining the Hamiltonian is not essentially self-adjoint on its natural domain. This is true both in the non-relativistic case (for the Schrödinger and Pauli operators) and the relativistic one (for the Dirac operator). The boundary conditions for Schrödinger operators with an Aharonov–Bohm vortex as well as the corresponding self-adjoint extensions (i.e. Hamiltonians describing a spinless nonrelativistic quantum particle) are considered in many papers; let us mention e.g. [40–43]. The multi-solenoid case is more difficult because of the rotational symmetry violation. This case was treated by means of the Krein resolvent formula in [44], and for an infinite chain of solenoids in [45]; different approaches are presented in [46–48]. The problem of defining the boundary conditions in the presence of a uniform background field has been investigated in [49, 50]. In the relativistic case, the problem of defining the appropriate Dirac operator is discussed e.g. in [51–53], and in the presence of a uniform component in the recent articles [54–57]. In all the mentioned papers, the spectral or scattering properties of the derived Hamiltonians are studied as well.

On sufficiently smooth functions from $L^2(\mathbb{R}^2) \otimes \mathbb{C}^2 = L^2(\mathbb{R}^2; \mathbb{C}^2)$ the two-dimensional Pauli operator for a charged particle with the spin s and the gyromagnetic ratio g acts as a formal differential operator [58]
$$\hat{H} \equiv \hat{H}(\mathbf{A}) = \frac{1}{2m^*}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)^2 - \hat{\mu}B \qquad (1)$$
where e and $m^*$ are the charge and the mass of the particle, respectively, $\mathbf{A} = (A_x, A_y)$ is the vector potential of a magnetic field $\mathbf{B} = B\mathbf{e}_z$, $B = \partial_x A_y - \partial_y A_x$, $\hat{\mu}$ is the magnetic moment operator, $\hat{\mu} = g s \mu_B \hat{s}_z$, with $\mu_B$ being the Bohr magneton, $\mu_B = -|e|\hbar/(2m^* c)$, and
$$\hat{s}_z = \frac{1}{2}\sigma_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
(we consider the motion of a particle in the plane $\mathbb{R}^2$ canonically embedded in the space $\mathbb{R}^3$). In general, the non-relativistic limit of the Dirac equation leads to the value g = 2, and the main part of our work deals with this value of the gyromagnetic ratio. In the case of an Aharonov–Bohm solenoid B is proportional to the Dirac delta function, δ(r), and therefore the operator (1) takes the form of the Schrödinger operator perturbed at a point and with a finite coupling constant α standing in front of the “δ-potential”. On the other hand, it is well known that in the two-dimensional case under consideration the expression (1) defines a non-trivial perturbation of the operator
$$\hat{H}_0 \equiv \hat{H}_0(\mathbf{A}) = \frac{1}{2m^*}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf{A}\right)^2 \qquad (2)$$
only if α is in some sense “infinitesimal” [39] (we suppose that appropriate boundary conditions defining $\hat{H}_0$ are chosen). This problem has been analyzed in [59] in detail
for an arbitrary positive value of g. To get around it, a solenoid of finite radius R is considered with a rotationally symmetric magnetic flux inside the solenoid but otherwise having an arbitrary profile (including the magnetic flux supported on the surface of the infinite cylinder), and the limit R → 0 is discussed. In addition to [59] let us also mention papers [60–68]. Of course, the same approach is useful when a uniform component of the field is present or in the case of the Dirac operator (see [69–71] and references therein). In the most important case when the gyromagnetic ratio g equals 2 the Pauli operator has remarkable supersymmetry properties which make it possible to use the Aharonov–Casher decomposition [5]. As a result, we have a convenient definition of the Pauli operator with a singular potential by means of a quadratic form (see Sec. 5). More precisely, in this case we have, as usual, two natural quadratic forms associated to the expression (1): the minimal and the maximal one (with the definition of the magnetic Schrödinger operator taken from [72]). These forms provide us with two natural types of self-adjoint operators denoted $H^{\pm}_{max}$ and $H^{\pm}_{min}$ and playing the role of Pauli operators with Aharonov–Bohm solenoids (the sign ± stands for spin up and spin down supersymmetric partners, respectively). There is an important distinction between the operators $H_{min}$ and $H_{max}$. As follows from the definitions, both operators $H^{\pm}_{min}$ coincide with the Friedrichs extension of the symmetric operator defined by expression (2) with the vector potential A corresponding to a system of Aharonov–Bohm fluxes. Therefore this extension (denoted simply by $H_{min}$) may be interpreted as the Hamiltonian of a “spinless” particle moving in a system of Aharonov–Bohm fluxes (this corresponds to physical problems for an electron when the spin–orbit coupling can be neglected and spin splitting is taken into account with the help of the perturbation theory [58]).
Such a Hamiltonian has been considered e.g. in [41, 59]. On the other hand, the operators $H^{\pm}_{max}$ do not coincide in general, which indicates that they directly take into account the energy of the spin–orbit interaction and therefore they may be regarded as the Pauli operators of the system under consideration. In the present article we concentrate mainly on zero modes of $H_{max}$. Note that boundary conditions defining the Hamiltonian $H_{max}$ are given in [42, 43] (in the case of a single solenoid) and in [73] (in the two-solenoid case). For a finite system of Aharonov–Bohm solenoids, the existence problem of zero-energy eigenfunctions was considered in [74–78]. In this case the number d of linearly independent zero modes depends on the fractional parts of fluxes in the individual solenoids, {x} = x − [x], rather than only on the total flux Φ in the system. This phenomenon is a consequence of the gauge invariance properties for the Aharonov–Bohm fluxes (see e.g. papers [79–82]). In the case when the considered magnetic field has a “regular” component in addition to the magnetic field of Aharonov–Bohm solenoids the appearance of zero modes has been analyzed in [83, 84]. The results of [84] are applicable also to the case when an infinite number of Aharonov–Bohm solenoids is present in the system but the total magnetic flux is necessarily finite (moreover, after some gauge transformation the total variation of the flux
must be finite). On the other hand, it is clear that the thermodynamic limit of a bounded system with a fixed density of Aharonov–Bohm fluxes is a system with an infinite number of Aharonov–Bohm solenoids and with an infinite total flux. An example of a system of this kind is the quasi-two-dimensional system with columnar defects in a uniform magnetic field directed along the defect axis [85–87] or the GaAs/AlGaAs heterostructure coated with a film of type-II superconductor [88] (in the latter case the Aharonov–Bohm fluxes are arranged in a honeycomb lattice, the so-called Abrikosov lattice).

As for the spectral properties of the operator $H_{min}$, they have been investigated recently in detail by Melgaard, Ouhabaz and Rozenblum [89]. In particular, these authors proved with the help of results from [90] and [91] that $H_{min}$ has no zero modes at least for periodic lattices of Aharonov–Bohm solenoids, and therefore it differs from $H^{+}_{max}$ and $H^{-}_{max}$ for generic values of magnetic fluxes (and is not even unitarily equivalent to these operators). Let us note that it is possible to extend this result to a chain of Aharonov–Bohm solenoids.

3. The Pauli Operator with a Singular Magnetic Field

In what follows we consider the motion of an electron with the gyromagnetic ratio g = 2, therefore
$$\hat{H} = \frac{\hbar^2}{2m^*}\left[\left(i\partial_x + \frac{e}{c\hbar}A_x\right)^2 + \left(i\partial_y + \frac{e}{c\hbar}A_y\right)^2 - \frac{e}{c\hbar}\sigma_z B\right] . \qquad (3)$$
Let us denote for simplicity
$$\frac{e}{c\hbar}B = b , \qquad \frac{e}{c\hbar}\mathbf{A} = \mathbf{a} , \qquad (4)$$
so that $\partial_x a_y - \partial_y a_x = b$. In order to employ the dimensionless units we shall consider the operator
$$H \equiv H(\mathbf{a}) = \frac{2m^*}{\hbar^2}\hat{H}(\mathbf{A}) . \qquad (5)$$
Introducing a quantum of the magnetic flux,
$$\Phi_0 = \frac{2\pi c\hbar}{e} , \qquad (6)$$
we also have
$$\mathbf{a} = \frac{2\pi}{\Phi_0}\mathbf{A} , \qquad b = \frac{2\pi}{\Phi_0}B , \qquad (7)$$
$$H \equiv H(\mathbf{a}) = (i\partial_x + a_x)^2 + (i\partial_y + a_y)^2 - \sigma_z b . \qquad (8)$$
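The passage from (1) to the dimensionless form (8) rests on a one-line rescaling of the kinetic term. The following sketch spells it out for the x-component, using only the notation (4) and (5); the spin term is taken over from (3) as given and is not rederived here.

```latex
\frac{1}{2m^*}\Bigl(\frac{\hbar}{i}\partial_x - \frac{e}{c}A_x\Bigr)^{2}
  = \frac{\hbar^{2}}{2m^*}\Bigl(-i\partial_x - \frac{e}{c\hbar}A_x\Bigr)^{2}
  = \frac{\hbar^{2}}{2m^*}\,\bigl(i\partial_x + a_x\bigr)^{2},
\qquad a_x = \frac{e}{c\hbar}A_x .
```

The same identity holds for the y-component, and multiplying by $2m^*/\hbar^2$ as in (5) gives the kinetic part of (8).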
The operator H (and respectively the operator $\hat{H}$) decomposes into a sum of two scalar operators,
$$H^{\pm} \equiv H^{\pm}(\mathbf{a}) = (i\partial_x + a_x)^2 + (i\partial_y + a_y)^2 \mp b , \qquad (9)$$
(respectively $\hat{H}^{\pm}(\mathbf{A}) \equiv \hat{H}^{\pm}$) acting in $L^2(\mathbb{R}^2)$. We admit the vector potential a to have singular points; more precisely, we assume that
$$a_x, a_y \in L^1_{loc}(\mathbb{R}^2) \cap C^{\infty}(\mathbb{R}^2 \setminus \Omega) \qquad (10)$$
where Ω is a discrete subset (possibly finite or empty) in $\mathbb{R}^2$. Consequently, the magnetic field $b = \partial_x a_y - \partial_y a_x$ is, in general, a distribution in $\mathbb{R}^2$ whose singular support is contained in Ω. Expressions (1) and (9) represent symmetric operators with the domain $C_0^{\infty}(\mathbb{R}^2 \setminus \Omega)$; these operators will be denoted $\hat{H}^{\pm}(\mathbf{A}, \Omega)$ and $H^{\pm}(\mathbf{a}, \Omega)$, respectively. If the singular support of B coincides with Ω (in this case Ω is determined by the vector potential A) we shall simply write $\hat{H}^{\pm}(\mathbf{A})$ and $H^{\pm}(\mathbf{a})$.

It is important to note that also in the case when b is a distribution the operator $H^{\pm}(\mathbf{a})$ depends, up to unitary equivalence, only on b. More precisely, we have the following proposition.

Theorem 3.1 (Gauge Invariance of the Operator $H^{\pm}(\mathbf{a})$). Let $\mathbf{a}$ and $\tilde{\mathbf{a}}$ be vector potentials with the same magnetic field b (i.e. $\mathbf{a}, \tilde{\mathbf{a}} \in L^1_{loc}(\mathbb{R}^2; \mathbb{R}^2) \cap C^{\infty}(\mathbb{R}^2 \setminus \Omega; \mathbb{R}^2)$ and $\partial_x a_y - \partial_y a_x = \partial_x \tilde{a}_y - \partial_y \tilde{a}_x = b$ in the sense of distributions). Then the operators $H^{\pm}(\mathbf{a}, \Omega)$ and $H^{\pm}(\tilde{\mathbf{a}}, \Omega)$ are unitarily equivalent. In more detail, there exists a real-valued function f belonging to $C^{\infty}(\mathbb{R}^2 \setminus \Omega)$ such that $\tilde{\mathbf{a}} = \mathbf{a} + \mathrm{grad}\, f$, and $H^{\pm}(\tilde{\mathbf{a}}, \Omega) = W^{-1} H^{\pm}(\mathbf{a}, \Omega) W$ where W is the unitary operator acting via multiplication by the function exp(−if).

Of course, this theorem is well known in the case when the field b is a function (not a distribution). In the case when b is a distribution the theorem is a consequence of the following lemma whose elementary proof was communicated to us by K. V. Pankrashkin.

Lemma 3.2. Assume that $\mathbf{a} \in L^1_{loc}(\mathbb{R}^2) \cap C^{\infty}(\mathbb{R}^2 \setminus \Omega)$ and the equality $\partial_x a_y - \partial_y a_x = 0$ holds true in $\mathbb{R}^2$ in the sense of distributions. Let ω ∈ Ω and let Q be a rectangle containing ω but no other points from Ω. Then
$$\int_{\partial Q} a_x\,dx + a_y\,dy = 0 . \qquad (11)$$

Proof. Let us choose functions $\varphi, \psi \in C_0^{\infty}(\mathbb{R}^2)$ so that $\omega \notin \mathrm{supp}\,\varphi$, $\varphi(x, y) = 1$ in some neighborhood of the boundary ∂Q, $\psi(x, y) = \varphi(x, y)$ on $\mathbb{R}^2 \setminus Q$ and $\psi(x, y) = 1$ on Q. Using the Green formula we obtain
$$\int_{\partial Q} a_x\,dx + a_y\,dy = \int_{\partial Q} \varphi a_x\,dx + \varphi a_y\,dy = \iint_{Q} \bigl(\partial_x(\varphi a_y) - \partial_y(\varphi a_x)\bigr)\,dx\,dy$$
$$= \iint_{Q \cap \mathrm{supp}(\varphi)} \varphi\,(\partial_x a_y - \partial_y a_x)\,dx\,dy + \iint_{Q} (a_y \partial_x \varphi - a_x \partial_y \varphi)\,dx\,dy$$
$$= \iint_{\mathbb{R}^2} \bigl(a_y \partial_x(\varphi - \psi) - a_x \partial_y(\varphi - \psi)\bigr)\,dx\,dy = 0 .$$
Here we have used the fact that the expression $\partial_x a_y - \partial_y a_x$ represents a smooth function on $\mathbb{R}^2 \setminus \Omega$ which necessarily vanishes on this domain.

Proof of Theorem 3.1. From Lemma 3.2 we derive in a standard manner that if $\partial_x a_y - \partial_y a_x = 0$ on $\mathbb{R}^2$ in the sense of distributions then there exists a real-valued function $f \in C^{\infty}(\mathbb{R}^2 \setminus \Omega)$ such that $\mathbf{a} = \mathrm{grad}\, f$ on $\mathbb{R}^2 \setminus \Omega$. Consequently, if $\mathbf{a}$ and $\tilde{\mathbf{a}}$ obey the assumptions of the theorem then for some function $f \in C^{\infty}(\mathbb{R}^2 \setminus \Omega)$ we have $\tilde{\mathbf{a}} = \mathbf{a} + \mathrm{grad}\, f$. Let us denote by W the operator acting via multiplication by the function exp(−if). Clearly, W is a well defined unitary operator in $L^2(\mathbb{R}^2)$. Moreover, W leaves invariant the subspace $C_0^{\infty}(\mathbb{R}^2 \setminus \Omega)$. A simple computation shows that $W^{-1} H^{\pm}(\mathbf{a}, \Omega) W = H^{\pm}(\tilde{\mathbf{a}}, \Omega)$. Hence the operators $H^{\pm}(\mathbf{a}, \Omega)$ and $H^{\pm}(\tilde{\mathbf{a}}, \Omega)$ are unitarily equivalent.

Remark 3.3. Clearly, if $\mathbf{a} = \mathrm{grad}\, f$ in the sense of distributions then $\partial_x a_y - \partial_y a_x = 0$ in the same sense.

Remark 3.4. A proposition analogous to that of Theorem 3.1 is also valid for the operator $\hat{H}^{\pm}(\mathbf{A}, \Omega)$. Namely, if $\partial_x A_y - \partial_y A_x = \partial_x \tilde{A}_y - \partial_y \tilde{A}_x = B$ then $\tilde{\mathbf{A}} = \mathbf{A} + \mathrm{grad}\, f$ and $\hat{H}^{\pm}(\tilde{\mathbf{A}}, \Omega) = W^{-1} \hat{H}^{\pm}(\mathbf{A}, \Omega) W$ where $W = \exp(-(ie/c\hbar) f)$.

Owing to the gauge invariance it is possible to require the vector potential A to have some additional properties. For example, the vector potential A can frequently be chosen so that it fulfills the Lorentz gauge condition
$$\mathrm{div}\, \mathbf{A} = 0 . \qquad (12)$$
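The “simple computation” invoked in the proof of Theorem 3.1 can be spelled out as follows. This is only a sketch on test functions $\psi \in C_0^{\infty}(\mathbb{R}^2 \setminus \Omega)$, written for the x-component, with W the multiplication by exp(−if) as in the theorem.

```latex
(i\partial_x + a_x)\bigl(e^{-if}\psi\bigr)
  = e^{-if}\bigl(i\partial_x\psi + (\partial_x f)\,\psi + a_x\psi\bigr)
  = e^{-if}\,(i\partial_x + \tilde{a}_x)\,\psi ,
\qquad \tilde{a}_x = a_x + \partial_x f ,
```

so that $W^{-1}(i\partial_x + a_x)W = i\partial_x + \tilde{a}_x$, and similarly for y. Since grad f has zero curl on $\mathbb{R}^2 \setminus \Omega$, the term ∓b in (9) is unchanged, which gives $W^{-1}H^{\pm}(\mathbf{a}, \Omega)W = H^{\pm}(\tilde{\mathbf{a}}, \Omega)$.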
4. Basic Examples

In this section we recall several basic examples of magnetic fields fulfilling condition (10). At the same time, we introduce the necessary notation. The majority of results presented in the current paper concern Examples 5, 6 and 7. In what follows it will be convenient to identify the Euclidean plane $\mathbb{R}^2$ with the complex plane $\mathbb{C}$ and to work with the complex coordinates $z = x + iy$ and $\bar{z} = x - iy$.

Example 1. The homogeneous field

In this case B = const by definition and one can set
$$A_x = -\frac{B}{2}\,y , \qquad A_y = \frac{B}{2}\,x$$
(the symmetric gauge). In the complex coordinates we have
$$A_x = \frac{B}{2}\,\mathrm{Im}\,\bar{z} , \qquad A_y = \frac{B}{2}\,\mathrm{Re}\,\bar{z} .$$
In this example b = 2πξ where ξ is the number of magnetic flux quanta through a unit area in $\mathbb{R}^2$ (the flux density). The Lorentz gauge condition (12) is obviously fulfilled.
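As a quick sanity check of Example 1, the following Python sketch (an illustration only, not part of the original argument) evaluates the symmetric gauge numerically and confirms that $\partial_x A_y - \partial_y A_x = B$ and div A = 0, and that the complex-coordinate form agrees with the Cartesian one. The function names and the finite-difference step h are assumptions made for this example.

```python
def symmetric_gauge(x, y, B):
    """Cartesian form of Example 1: A_x = -(B/2) y, A_y = (B/2) x."""
    return -0.5 * B * y, 0.5 * B * x


def symmetric_gauge_complex(x, y, B):
    """Complex form of Example 1: A_x = (B/2) Im(conj z), A_y = (B/2) Re(conj z)."""
    zbar = complex(x, -y)
    return 0.5 * B * zbar.imag, 0.5 * B * zbar.real


def curl_and_div(potential, x, y, B, h=1e-3):
    """Central differences for (d_x A_y - d_y A_x, d_x A_x + d_y A_y)."""
    ax_xp, ay_xp = potential(x + h, y, B)
    ax_xm, ay_xm = potential(x - h, y, B)
    ax_yp, ay_yp = potential(x, y + h, B)
    ax_ym, ay_ym = potential(x, y - h, B)
    curl = (ay_xp - ay_xm) / (2 * h) - (ax_yp - ax_ym) / (2 * h)
    div = (ax_xp - ax_xm) / (2 * h) + (ay_yp - ay_ym) / (2 * h)
    return curl, div


if __name__ == "__main__":
    # Arbitrary sample point and field strength (hypothetical values).
    curl, div = curl_and_div(symmetric_gauge, 0.3, -1.2, B=2.5)
    print("curl A =", curl, " div A =", div)
```

Because the symmetric gauge is linear in x and y, the central differences are exact up to rounding, so the check does not depend on the choice of h.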
Example 2. The magnetic field of an Aharonov–Bohm solenoid

Here B(r) = Φδ(r) where Φ is the magnetic flux through the solenoid. In this case one can set
$$A_x = -\frac{\Phi}{2\pi}\,\frac{y}{r^2} , \qquad A_y = \frac{\Phi}{2\pi}\,\frac{x}{r^2} .$$
Equivalently,
$$a_x = \theta\,\mathrm{Im}\,\frac{1}{z} , \qquad a_y = \theta\,\mathrm{Re}\,\frac{1}{z} ,$$
where $\theta = \Phi/\Phi_0$ is the number of magnetic flux quanta through the Aharonov–Bohm solenoid. Actually, it is well known that
$$\Delta \ln(|z|) = 2\pi\delta(z) .$$
In the local coordinates we have
$$B = \frac{\partial}{\partial x}A_y - \frac{\partial}{\partial y}A_x = \frac{\Phi}{2\pi}\left(\frac{\partial}{\partial x}\,\mathrm{Re}\,\frac{1}{z} - \frac{\partial}{\partial y}\,\mathrm{Im}\,\frac{1}{z}\right) = \frac{\Phi}{2\pi}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\ln|z| = \Phi\delta(z) .$$
The vector potential a can also be written as
$$\mathbf{a} = \theta\,\mathrm{sgrad}\,\ln|z| . \qquad (13)$$
Here and everywhere in what follows sgrad stands for the symplectic gradient,
$$\mathrm{sgrad}\, f = \left(-\frac{\partial}{\partial y}f ,\ \frac{\partial}{\partial x}f\right) . \qquad (14)$$
Hence b = 2πθδ(z). The equality div a = 0 trivially follows from (13).

Example 3. An arbitrary system of Aharonov–Bohm solenoids

Let now Ω be a discrete subset of the plane $\mathbb{R}^2$ and let $(\Phi_\omega)_{\omega\in\Omega}$ be an arbitrary family of real numbers with indices from Ω. We shall consider a system of Aharonov–Bohm fluxes intersecting the plane in the points from the set Ω and perpendicular to the plane. The number $\Phi_\omega$ equals the flux in the solenoid passing through the point ω ∈ Ω. Then
$$b = \frac{2\pi}{\Phi_0}B = 2\pi\sum_{\omega\in\Omega}\theta_\omega\,\delta(z - \omega)$$
where, of course, $\theta_\omega = \Phi_\omega/\Phi_0$ is the number of magnetic flux quanta through the solenoid ω. For a vector potential a fulfilling the Lorentz gauge condition one can choose a meromorphic function M(z) with the following properties:
(1) M(z) has simple poles only,
(2) the set of poles of M(z) coincides with Ω,
(3) the residue of M(z) at the point ω equals $\theta_\omega$.
V. A. Geyler & P. Šťovíček
According to the Mittag–Leffler theorem such a function always exists. The computations carried out in Example 2 (jointly with the Cauchy–Riemann conditions) show that one can set ax (z, z¯) = Im M (z) ,
ay (z, z¯) = Re M (z) .
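As a numerical illustration of the choice $a = (\operatorname{Im} M, \operatorname{Re} M)$, the following sketch takes a finite set of solenoids, for which one may assume $M(z) = \sum_j \theta_j/(z-\omega_j)$, and compares the result with a finite-difference symplectic gradient of $\varphi = \sum_j \theta_j \ln|z-\omega_j|$. The function names, the sample poles and fluxes, and the step size are assumptions made for this illustration only.

```python
import math

def phi(z, poles, fluxes):
    """Scalar function phi(z) = sum_j theta_j * ln|z - omega_j|."""
    return sum(t * math.log(abs(z - w)) for w, t in zip(poles, fluxes))

def meromorphic_M(z, poles, fluxes):
    """M(z) with simple poles at omega_j and residues theta_j."""
    return sum(t / (z - w) for w, t in zip(poles, fluxes))

def potential_from_M(z, poles, fluxes):
    """Vector potential a = (Im M, Re M)."""
    m = meromorphic_M(z, poles, fluxes)
    return m.imag, m.real

def sgrad_phi(z, poles, fluxes, h=1e-6):
    """Symplectic gradient (-d(phi)/dy, d(phi)/dx) by central differences."""
    dphi_dx = (phi(z + h, poles, fluxes) - phi(z - h, poles, fluxes)) / (2 * h)
    dphi_dy = (phi(z + 1j * h, poles, fluxes) - phi(z - 1j * h, poles, fluxes)) / (2 * h)
    return -dphi_dy, dphi_dx
```

Evaluating both functions at a point away from the poles should give the same pair of numbers up to the finite-difference error.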
The operator $H^\pm(a)$ will be also denoted by the symbol $H^\pm(\Omega,\Theta)$ where $\Theta = (\theta_\omega)_{\omega\in\Omega}$. The couple $(\Omega,\Theta)$ determines the operator $H^\pm(\Omega,\Theta)$ unambiguously up to unitary equivalence.

Example 4. An arbitrary system of Aharonov–Bohm solenoids with fluxes taking a finite number of values

Separately we consider the case when the number of mutually different fluxes in the family $(\Phi_\omega)_{\omega\in\Omega}$ is finite (equivalently, the family $(\theta_\omega)_{\omega\in\Omega}$ contains only a finite number of mutually different numbers $\theta_\omega$). We start from the case when all the involved solenoids carry the same flux: $\theta_\omega = \theta$, $\forall \omega \in \Omega$. In this case we always set
\[ M(z) = \theta\, \frac{W'(z)}{W(z)} . \]
Here the function $W(z)$ differs from the Weierstrass canonical product $W_\Omega(z)$ related to the set $\Omega$ only by a multiplier $\exp(g(z))$ where $g(z)$ is an entire function. Obviously, the set of poles of the function $W'(z)/W(z)$ coincides with $\Omega$, all the poles are simple and all the residues are equal to 1. Thus one can set
\[ a = \theta \operatorname{sgrad} \ln(|W(z)|) . \tag{15} \]
Actually, locally we have
\[ \frac{\partial}{\partial x} \ln(|W(z)|) = \frac{1}{2}\left( \frac{\partial}{\partial z} + \frac{\partial}{\partial \bar z} \right) \ln\bigl( W(z)\, \overline{W}(\bar z) \bigr) = \operatorname{Re} \frac{W'(z)}{W(z)} , \]
and analogously,
\[ \frac{\partial}{\partial y} \ln(|W(z)|) = -\operatorname{Im} \frac{W'(z)}{W(z)} . \]
In general, let $\Omega_1, \dots, \Omega_N$ be mutually disjoint discrete (possibly empty) sets, and let $\theta_j$, $j = 1, \dots, N$, be (not necessarily distinct) real numbers. The vector potential $a$ is defined unambiguously, up to gauge equivalence, by the expression
\[ a = \sum_{j=1}^{N} \theta_j \operatorname{sgrad} \ln(|W_j(z)|) = \operatorname{sgrad} \ln\Bigl( \prod_{j=1}^{N} |W_j(z)|^{\theta_j} \Bigr) \tag{16} \]
where $W_j$ is an entire function having simple zeros only and with the zero set being equal to $\Omega_j$. The function $W_j$ differs from the Weierstrass canonical product related to the set $\Omega_j$ only by a multiplier of the form $\exp(g_j(z))$ where $g_j$ is an arbitrary entire function. An Aharonov–Bohm potential of the form (16) will be called a potential of finite type. The operator $H^\pm(a)$ will be also denoted by the symbols $H^\pm(\Omega_1, \dots, \Omega_N; \theta_1, \dots, \theta_N)$ or $H^\pm((\Omega_j); (\theta_j))$.
The most important particular cases of potentials of finite type are those for which the Aharonov–Bohm field is invariant with respect to a discrete group $\Lambda$ which is formed by motions of the Euclidean plane $\mathbb{R}^2$ and whose action on $\Omega$ is co-finite. First of all we shall be interested in the case when the group $\Lambda$ is formed by parallel translations. Up to isomorphism, there exist just three groups of this type in the plane and they are characterized by their rank $r$ ($r = 0, 1, 2$).

(1) $r = 0$. In this case $\Lambda = \{0\}$ and the set $\Omega$ is finite.

(2) $r = 1$. In this case $\Lambda$ is isomorphic to $\mathbb{Z}$ and has the form $\Lambda = \{k\omega_0;\ k \in \mathbb{Z}\}$ where $\omega_0$ is a nonzero vector from $\mathbb{R}^2$. The set $\Omega$ has the form $\Omega = K + \Lambda$ where $K$ is a finite subset of the "elementary strip" $F = \{x \in \mathbb{R}^2;\ 0 \le x\cdot\omega_0 < |\omega_0|^2\}$ (or, in the complex coordinates, $F = \{z \in \mathbb{C};\ 0 \le \operatorname{Re} \bar z\omega_0 < |\omega_0|^2\}$). Since each $\omega \in \Omega$ is uniquely expressible in the form $\omega = \kappa + \lambda$, with $\kappa \in K$ and $\lambda \in \Lambda$, every $\Lambda$-invariant family $\Theta$ is unambiguously determined by its subfamily $\Theta_K = (\theta_\kappa)_{\kappa\in K}$.

(3) $r = 2$. In this case $\Lambda$ is isomorphic to $\mathbb{Z}^2$ and has the form $\Lambda = \{k_1\omega_1 + k_2\omega_2;\ k_1, k_2 \in \mathbb{Z}\}$ where $\omega_1, \omega_2$ are linearly independent vectors from $\mathbb{R}^2$. The set $\Omega$ has the form $\Omega = K + \Lambda$ where $K$ is a finite subset of the elementary cell $F = \{t_1\omega_1 + t_2\omega_2;\ 0 \le t_1, t_2 < 1\}$. We shall assume that the basis $\omega_1, \omega_2$ is positively oriented so that $\omega_1 \wedge \omega_2 = \operatorname{Im} \bar\omega_1\omega_2 > 0$. This expression is nothing but the area $S = S_\Lambda$ of the elementary cell $F$ of the lattice $\Lambda$.

We shall discuss each of these cases separately.

Example 5. A finite number of Aharonov–Bohm solenoids

Let $\Lambda = \{0\}$. In this case the set $\Omega$ is finite, $\Omega = \{\omega_1, \dots, \omega_n\}$, and
\[ b = 2\pi \sum_{j=1}^{n} \theta_j\, \delta(z - \omega_j) . \]
As a rule, the vector potential in this case will be chosen in the form
\[ a = \sum_{j=1}^{n} \theta_j \operatorname{sgrad} \ln(|z - \omega_j|) . \]
The operator $H^\pm(a)$ will be also denoted by $H^\pm(\omega_1, \dots, \omega_n; \theta_1, \dots, \theta_n)$.

Example 6. A chain of Aharonov–Bohm solenoids

Assume now that the rank of $\Lambda$ equals 1. Firstly we consider the case when $K$ contains only one element. Without loss of generality we assume that $K = \{0\}$. Then $\Omega = \Lambda$, $\theta_\omega = \theta$ for all $\omega$, and
\[ W_\Omega(z) = z \prod_{k\in\mathbb{Z},\, k\neq 0} \Bigl( 1 - \frac{z}{k\omega_0} \Bigr) e^{z/k\omega_0} = z \prod_{k=1}^{\infty} \Bigl( 1 - \frac{z^2}{k^2\omega_0^2} \Bigr) = \frac{\omega_0}{\pi} \sin\frac{\pi z}{\omega_0} . \]
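The product formula for the one-atom chain can be checked numerically. The sketch below truncates the infinite product after a finite number of factors and compares it with the closed form carrying the normalisation constant $\omega_0/\pi$, which is what the classical product expansion of the sine gives. The function names and the truncation length are assumptions of this illustration; the truncated product converges only like $1/N$, so the agreement is approximate.

```python
import cmath

def truncated_product(z, omega0, terms):
    """z * prod_{k=1}^{terms} (1 - z^2 / (k^2 omega0^2))."""
    value = z
    for k in range(1, terms + 1):
        value *= 1 - (z * z) / (k * k * omega0 * omega0)
    return value

def closed_form(z, omega0):
    """(omega0 / pi) * sin(pi z / omega0)."""
    return omega0 / cmath.pi * cmath.sin(cmath.pi * z / omega0)
```

Since the two expressions differ from $\sin(\pi z/\omega_0)$ only by a nonzero constant factor, the potential obtained from the logarithmic derivative is the same for either normalisation.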
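Therefore one can set
\[ W(z) = \sin\frac{\pi z}{\omega_0} . \]
Consequently,
\[ a = \theta \operatorname{sgrad} \ln\Bigl| \sin\frac{\pi z}{\omega_0} \Bigr| , \]
which means that
\[ a_x = \operatorname{Im}\Bigl[ \frac{\pi\theta}{\omega_0} \operatorname{ctg}\frac{\pi z}{\omega_0} \Bigr] , \qquad a_y = \operatorname{Re}\Bigl[ \frac{\pi\theta}{\omega_0} \operatorname{ctg}\frac{\pi z}{\omega_0} \Bigr] . \]
Generally, $\Omega = K + \Lambda$ with an arbitrary finite subset $K \subset F$, and we have
\[ B = \sum_{\kappa\in K} \Phi_\kappa \sum_{\lambda\in\Lambda} \delta(z - \lambda - \kappa) . \]
Then the vector potential reads
\[ a = \operatorname{sgrad} \sum_{\kappa\in K} \theta_\kappa \ln\Bigl| \sin\frac{\pi(z - \kappa)}{\omega_0} \Bigr| . \]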
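Example 7. A lattice of Aharonov–Bohm solenoids

Assume now that the rank of $\Lambda$ equals 2 which means that $\Lambda$ is a two-dimensional lattice. Again, we shall start from the case when $K = \{0\}$, hence $\Omega = \Lambda$. In this case $W_\Omega(z)$ coincides with the Weierstrass $\sigma$-function of the lattice $\Lambda$,
\[ \sigma(z; \omega_1, \omega_2) \equiv \sigma(z) = z \prod_{\omega\in\Omega\setminus\{0\}} \Bigl( 1 - \frac{z}{\omega} \Bigr) \exp\Bigl( \frac{z}{\omega} + \frac{z^2}{2\omega^2} \Bigr) . \]
At the same time,
\[ \frac{\sigma'(z)}{\sigma(z)} = \zeta(z) = \zeta(z; \omega_1, \omega_2) \]
is the Weierstrass $\zeta$-function of the lattice $\Lambda$. Thus
\[ a = \theta \operatorname{sgrad} \ln(|\sigma(z)|) = \theta\,(\operatorname{Im}\zeta(z), \operatorname{Re}\zeta(z)) . \]
In the general case $\Omega = K + \Lambda$ with an arbitrary finite subset $K \subset F$. Then the magnetic field takes the form
\[ B = \sum_{\kappa\in K} \Phi_\kappa \sum_{\lambda\in\Lambda} \delta(z - \lambda - \kappa) . \]
One can set
\[ a = \operatorname{sgrad} \sum_{\kappa\in K} \theta_\kappa \ln(|\sigma(z - \kappa)|) . \]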
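Remark 4.1. In all the examples with an Aharonov–Bohm potential of finite type, and also in the case of a homogeneous magnetic field, there exists a function $\varphi(x, y)$ such that $a = \operatorname{sgrad} \varphi$ or, equivalently,
\[ \Delta\varphi = b . \tag{17} \]
Namely, one can respectively set in Examples 1, 2, 4, 5, 6 and 7:
\[ \varphi(x, y) = \frac{1}{2}\pi\xi (x^2 + y^2) = \frac{1}{2}\pi\xi |z|^2 , \]
\[ \varphi(x, y) = \frac{1}{2}\theta \ln(x^2 + y^2) = \theta \ln(|z|) , \]
\[ \varphi(x, y) = \sum_{j=1}^{N} \theta_j \ln|W_j(z)| = \ln\Bigl( \prod_{j=1}^{N} |W_j(z)|^{\theta_j} \Bigr) , \]
\[ \varphi(x, y) = \sum_{j=1}^{n} \theta_j \ln(|z - \omega_j|) , \]
\[ \varphi(x, y) = \sum_{\kappa\in K} \theta_\kappa \ln\Bigl| \sin\frac{\pi(z - \kappa)}{\omega_0} \Bigr| = \ln\Bigl( \prod_{\kappa\in K} \Bigl| \sin\frac{\pi(z - \kappa)}{\omega_0} \Bigr|^{\theta_\kappa} \Bigr) , \]
\[ \varphi(x, y) = \sum_{\kappa\in K} \theta_\kappa \ln(|\sigma(z - \kappa; \omega_1, \omega_2)|) = \ln\Bigl( \prod_{\kappa\in K} |\sigma(z - \kappa; \omega_1, \omega_2)|^{\theta_\kappa} \Bigr) . \]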
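The relation $\Delta\varphi = b$ can be illustrated with a five-point finite-difference Laplacian. The sketch below, with hypothetical helper names and an assumed step size, treats the homogeneous field, where $b = 2\pi\xi$ everywhere, and a single solenoid, where $b$ vanishes away from the origin.

```python
import math

def laplacian(phi, x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of phi at (x, y)."""
    return (phi(x + h, y) + phi(x - h, y) + phi(x, y + h) + phi(x, y - h)
            - 4.0 * phi(x, y)) / (h * h)

def phi_homogeneous(xi):
    """phi = (1/2) pi xi (x^2 + y^2) for a homogeneous field of flux density xi."""
    return lambda x, y: 0.5 * math.pi * xi * (x * x + y * y)

def phi_solenoid(theta):
    """phi = theta ln|z| for a single solenoid at the origin carrying flux theta."""
    return lambda x, y: 0.5 * theta * math.log(x * x + y * y)
```

The delta-function contribution at the solenoid itself is of course not visible to a pointwise check of this kind.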
Let us note that in the general case when $B$ is a $\Lambda$-periodic continuous field the solution of Eq. (17) is expressible in the form
\[ \varphi(z) = \frac{1}{2\pi} \iint_F \ln(|\sigma(z - z')|)\, b(z')\, dx' dy' , \tag{18} \]
where $F$ is an elementary cell of the lattice $\Lambda$ [17]. Actually, we have already seen that
\[ \Delta \ln(|\sigma(z)|) = 2\pi \sum_{\lambda\in\Lambda} \delta(z - \lambda) . \]
Therefore a formal computation yields
\[ \Delta\varphi(z) = \sum_{\lambda\in\Lambda} \iint_F b(z')\, \delta(z - z' - \lambda)\, dx' dy' . \tag{19} \]
For every $z \in \mathbb{C}$ there exists a unique $\lambda_0 \in \Lambda$ such that $z \in F + \lambda_0$, i.e. $z - \lambda_0 \in F$. Then the summands in Eq. (19) with $\lambda \neq \lambda_0$ vanish and we have
\[ \Delta\varphi(z) = \iint_F b(z')\, \delta(z' - (z - \lambda_0))\, dx' dy' = b(z - \lambda_0) = b(z) . \]
In the case of a lattice formed by Aharonov–Bohm solenoids formula (18) still makes sense and it again yields $\varphi(z) = \theta \ln(|\sigma(z)|)$.
Let us note that the Lorentz gauge condition (12) follows from the equality $a = \operatorname{sgrad} \varphi$. In the sequel, the main results will be derived for Hamiltonians corresponding to three types of systems of Aharonov–Bohm solenoids. Namely, the set $\Omega$ formed by the intersection points of solenoids with the plane may be
(1) a finite set,
(2) a chain or a finite union of chains,
(3) a lattice or a finite union of lattices.
These systems will be called regular.

5. A Rigorous Definition of the Pauli Operator as a Self-Adjoint Operator

Let us return to the symmetric operators $H^\pm = H^\pm(a, \Omega)$ defined in Eq. (9) while assuming that condition (10) is satisfied. Let us introduce the momentum operators
\[ P_x \equiv P_x(a, \Omega) = -i\partial_x - a_x , \qquad P_y \equiv P_y(a, \Omega) = -i\partial_y - a_y . \tag{20} \]
By virtue of condition (10) these operators can be considered as symmetric operators in $L^2(\mathbb{R}^2)$ with the domain $C_0^\infty(\mathbb{R}^2\setminus\Omega)$. Following Aharonov–Casher [5] we define the operators
\[ T_\pm \equiv T_\pm(a, \Omega) = P_x \pm iP_y , \tag{21} \]
or $T_+ = -2i\partial_{\bar z} - A(z, \bar z)$, $T_- = -2i\partial_z - \bar A(z, \bar z)$ where $A = a_x + ia_y$. Then the following equalities hold true on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$:
\[ T_+ T_- = H^- , \qquad T_- T_+ = H^+ . \tag{22} \]
By a straightforward computation one can verify a simple but important lemma.

Lemma 5.1. The commutation relations
\[ [P_x, P_y] = ib , \qquad [T_-, T_+] = -2b , \tag{23} \]
are valid on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$. In particular, if $\operatorname{supp} B \subset \Omega$ (including the case when $B$ corresponds to a system of Aharonov–Bohm solenoids) then the operators $P_x$ and $P_y$ (respectively $T_+$ and $T_-$) commute on the domain $C_0^\infty(\mathbb{R}^2\setminus\Omega)$.

From the obvious inclusions
\[ T_\pm^* \supset T_\mp \tag{24} \]
we immediately deduce that the operators $T_\pm$ are closable and therefore the self-adjoint operators
\[ H^\pm_{\min} \equiv H^\pm_{\min}(a, \Omega) = T_\pm^* \bar T_\pm \tag{25} \]
are well defined (see, e.g., [92, Theorem X.25]). The associated quadratic forms $h^\pm_{\min}$ are closures of positive forms defined on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$ by the expressions
\[ \langle T_\pm\varphi \,|\, T_\pm\psi \rangle , \tag{26} \]
respectively.
On the other side, let us consider a quadratic form defined on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$ by the relation
\[ s^\pm(\varphi, \psi) = \langle P_x\varphi \,|\, P_x\psi \rangle + \langle P_y\varphi \,|\, P_y\psi \rangle \mp \langle b\varphi \,|\, \psi \rangle . \tag{27} \]
By a straightforward computation using relation (23) one can show the following lemma.

Lemma 5.2. The quadratic forms $h^\pm_{\min}$ and $s^\pm$ coincide on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$.

In particular, if the support of $B$ is contained in $\Omega$ then the quadratic forms $h^+_{\min}$ and $h^-_{\min}$ coincide on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$ and therefore they are necessarily equal.

Corollary 5.3. If $B$ is a distribution with a support contained in $\Omega$ then the operators $H^+_{\min}$ and $H^-_{\min}$ coincide. In particular, for the vector potential $a$ of a system of the Aharonov–Bohm vortices supported on a set $\Omega \subset \mathbb{R}^2$ these operators coincide with the Friedrichs extension of the symmetric operator defined on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$ by the differential expression
\[ H_0(a) = (i\partial_x + a_x)^2 + (i\partial_y + a_y)^2 . \tag{28} \]
In view of Lemma 5.2 we shall sometimes simply write $H_{\min}$ instead of $H^\pm_{\min}$. The operator $H_{\min}$ has been investigated in detail in [89].

Jointly with the operator $H_{\min}(a, \Omega)$ let us consider the operators
\[ H^\pm_{\max} \equiv H^\pm_{\max}(a, \Omega) = \bar T_\mp T_\mp^* \tag{29} \]
with the associated quadratic forms defined on $D(H^\pm_{\max})$ by the expressions
\[ h^\pm_{\max}(\varphi, \psi) = \langle T_\mp^*\varphi \,|\, T_\mp^*\psi \rangle , \tag{30} \]
respectively.

The definitions of $H^\pm_{\max}(a, \Omega)$ and $H^\pm_{\min}(a, \Omega)$ in principle depend on the choice of the discrete set $\Omega$. If $\Omega$ coincides with the singular support of $b$, however, we shall simply write, similarly as in Sec. 3, $H^\pm_{\min}(a)$ and $H^\pm_{\max}(a)$ since in that case the vector potential $a$ determines $\Omega$ unambiguously.

If the field $B$ is sufficiently regular and $\Omega = \emptyset$ then the operator $H^\pm_{\min}$ coincides with the operator $H^\pm_{\max}$ [72]. This is not true, however, for operators with Aharonov–Bohm fluxes (see [74], [89]). Since in this case $H^\pm_{\min}$ is defined by expression (28) and is independent of spin, this operator is the Schrödinger operator of a spinless particle in the presence of the Aharonov–Bohm fluxes (or the Schrödinger operator of a particle with spin when interaction of the spin with the field can be neglected). On the other hand, $H^\pm_{\max}$ are defined by expression (9), they depend on the spin and may be considered as the Pauli operators for an electron with the gyromagnetic ratio $g = 2$. Below we are interested in the properties of ground states of the operator $H^\pm_{\max}$.

For the analysis of operators $H^\pm_{\max}$ the following description of the operators $T_\pm^*$ will be useful. Namely, owing to condition (10) the differential operators $-i\partial_x - a_x$
and $-i\partial_y - a_y$ are well defined on the space of distributions $D'(\mathbb{R}^2\setminus\Omega)$. Consequently, the operators $T_\pm$ defined on $C_0^\infty(\mathbb{R}^2\setminus\Omega)$ can be naturally extended to linear mappings $\tilde T_\pm$ defined on $D'(\mathbb{R}^2\setminus\Omega)$. Using the fact that $L^2(\mathbb{R}^2)$ is naturally embedded into $D'(\mathbb{R}^2\setminus\Omega)$ we get the following lemma.

Lemma 5.4. The operator $T_\pm^*$ is a restriction of $\tilde T_\mp$ to the domain
\[ \{ f \in L^2(\mathbb{R}^2);\ \tilde T_\mp f \in L^2(\mathbb{R}^2) \} . \]
Using this observation we can prove the following lemma.

Lemma 5.5. Let $C$ be the operator of complex conjugation, $Cf = \bar f$. Then
\[ C H^\pm_{\max}(a, \Omega) = H^\mp_{\max}(-a, \Omega)\, C \quad \text{and} \quad C H^\pm_{\min}(a, \Omega) = H^\mp_{\min}(-a, \Omega)\, C . \]

Corollary 5.6. The operators $H^+_{\max}(a, \Omega)$ and $H^-_{\max}(-a, \Omega)$ have the same spectra. In particular, they have the same eigenvalues with equal multiplicities. An analogous proposition holds true for the couple of operators $H^+_{\min}(a, \Omega)$ and $H^-_{\min}(-a, \Omega)$.
6. Elimination of Aharonov–Bohm Solenoids with Integer Fluxes

In this section we consider a vector potential $\tilde a$ of the form
\[ \tilde a = a + a_{AB} \]
where $a_{AB}$ is a vector potential corresponding to a system of Aharonov–Bohm solenoids intersecting the plane in the points of $\Omega$. We describe here briefly the "gauge-periodicity" of the operators with the vector potential $\tilde a$; details can be found e.g. in [79–82]. First we shall assume that the considered solenoids carry equal fluxes of the value $\theta_{AB}$. In this case we set $a_{AB} = \theta_{AB} \operatorname{sgrad} \ln(|W(z)|)$ (cf. Example 4 in Sec. 4). Let $\theta_{AB}$ be an integer. Then the function
\[ g(z, \bar z) = \exp\Bigl( \theta_{AB} \ln\frac{W(z)}{|W(z)|} \Bigr) = \exp\bigl( i\theta_{AB} \arg(W(z)) \bigr) \]
is well defined and continuous in the domain $\mathbb{C}\setminus\Omega$. Clearly, $|g(z, \bar z)| = 1$, $\forall z \in \mathbb{C}\setminus\Omega$, and, moreover, $g \in C^\infty(\mathbb{R}^2\setminus\Omega)$.

Lemma 6.1. If $\theta_{AB}$ is an integer then the following relations hold true:
\[ g^{-1} P_x(\tilde a, \Omega)\, g = P_x(a, \Omega) , \qquad g^{-1} P_y(\tilde a, \Omega)\, g = P_y(a, \Omega) . \]
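The role of the integrality assumption can be illustrated in the simplest case of one solenoid at the origin, i.e. under the assumption $W(z) = z$. The sketch below (hypothetical function names, finite-difference derivatives) checks the identity $-i \operatorname{grad} g = g\, a_{AB}$ numerically and shows that $g$ is continuous across the negative real axis only for an integer exponent.

```python
def gauge_factor(z, power):
    """g = (z / |z|)^power, i.e. exp(i * power * arg z) on the principal branch."""
    return (z / abs(z)) ** power

def a_AB(z, power):
    """Aharonov-Bohm potential power * (Im(1/z), Re(1/z)) of a solenoid at 0."""
    w = power / z
    return w.imag, w.real

def minus_i_grad_g(z, power, h=1e-6):
    """Components of -i grad g computed by central differences."""
    dg_dx = (gauge_factor(z + h, power) - gauge_factor(z - h, power)) / (2 * h)
    dg_dy = (gauge_factor(z + 1j * h, power) - gauge_factor(z - 1j * h, power)) / (2 * h)
    return -1j * dg_dx, -1j * dg_dy
```

For a non-integer exponent the function `gauge_factor` jumps across the branch cut of the argument, so multiplication by it no longer preserves $C_0^\infty(\mathbb{R}^2\setminus\Omega)$.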
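Proof. It suffices to show that
\[ -i \operatorname{grad} g = g\, a_{AB} . \tag{31} \]
Actually, we have
\[ \frac{\partial}{\partial x} \ln\frac{W(z)}{|W(z)|} = \Bigl( \frac{\partial}{\partial z} + \frac{\partial}{\partial \bar z} \Bigr) \Bigl( \ln(W(z)) - \frac{1}{2}\bigl( \ln(W(z)) + \ln(\overline{W(z)}) \bigr) \Bigr) = i \operatorname{Im} \frac{W'(z)}{W(z)} = i\theta_{AB}^{-1} a_{AB,x} . \]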
Analogously,
\[ \frac{\partial}{\partial y} \ln\frac{W(z)}{|W(z)|} = i \operatorname{Re} \frac{W'(z)}{W(z)} = i\theta_{AB}^{-1} a_{AB,y} . \]
Relation (31) obviously follows from these equalities.

Assume now that $a_{AB}$ is a vector potential corresponding to an Aharonov–Bohm field of finite type whose singular support coincides with $\Omega = \Omega_1 \cup \dots \cup \Omega_n$, and with an array of fluxes denoted by $\Theta = (\theta_j)_{1\le j\le n}$. Let $(m_j)_{1\le j\le n}$ be an arbitrary array of integers and let $\tilde a_{AB}$ be another Aharonov–Bohm potential of finite type defined by the same array of sets $(\Omega_j)$ but with the fluxes $\tilde\theta_j = \theta_j + m_j$.

Theorem 6.2. Assume that $a$, $a_{AB}$ and $\tilde a_{AB}$ have the same meaning as described above. Then $H^\pm_{\min}(a + a_{AB}, \Omega)$ (respectively, $H^\pm_{\max}(a + a_{AB}, \Omega)$) is unitarily equivalent to the operator $H^\pm_{\min}(a + \tilde a_{AB}, \Omega)$ (respectively, $H^\pm_{\max}(a + \tilde a_{AB}, \Omega)$).

Proof. Let $T_{1\pm}$, $T_{2\pm}$ be the operators corresponding to the vector potentials $a + a_{AB}$ and $a + \tilde a_{AB}$, respectively, as described in Sec. 5. By construction,
\[ D(T_{1\pm}) = D(T_{2\pm}) = C_0^\infty(\mathbb{R}^2\setminus\Omega) . \]
Applying repeatedly Lemma 6.1 one can show that there exists a unitary operator $U$ such that
\[ U C_0^\infty(\mathbb{R}^2\setminus\Omega) = C_0^\infty(\mathbb{R}^2\setminus\Omega) \quad \text{and} \quad U^{-1} T_{2\pm} U = T_{1\pm} . \]
From the unitarity of $U$ it follows that
\[ U^{-1} \bar T_{2\pm} U = \bar T_{1\pm} \quad \text{and} \quad U^{-1} T_{2\pm}^* U = T_{1\pm}^* . \]
Consequently,
\[ U^{-1} H^\pm_{\min}(a + \tilde a_{AB}, \Omega)\, U = U^{-1} T_{2\pm}^* \bar T_{2\pm} U = T_{1\pm}^* \bar T_{1\pm} = H^\pm_{\min}(a + a_{AB}, \Omega) \]
and
\[ U^{-1} H^\pm_{\max}(a + \tilde a_{AB}, \Omega)\, U = U^{-1} \bar T_{2\mp} T_{2\mp}^* U = \bar T_{1\mp} T_{1\mp}^* = H^\pm_{\max}(a + a_{AB}, \Omega) . \]
This shows the theorem.
Corollary 6.3. If all fluxes $\theta_j$ are integers then the operator $H^\pm_{\min}(a + a_{AB}, \Omega)$ (respectively, $H^\pm_{\max}(a + a_{AB}, \Omega)$) is unitarily equivalent to the operator $H^\pm_{\min}(a, \Omega)$ (respectively, $H^\pm_{\max}(a, \Omega)$).

Let us formulate separately the two most important cases of this corollary. The first one is based on the fact that both the operators $H^\pm_{\min}(0, \Omega)$ and $H^\pm_{\max}(0, \Omega)$ do not depend on the choice of the discrete set $\Omega$ and coincide with the Laplace operator $-\Delta$.

Corollary 6.4. Let $\Omega$ be a discrete set which is invariant with respect to a co-finite action of a lattice $\Lambda$ of rank $r$, $0 \le r \le 2$. Assume that $a = 0$ and $a_{AB}$ is a vector potential corresponding to a system of Aharonov–Bohm solenoids supported on the set $\Omega$ and such that all fluxes are integers. Then each of the operators $H^\pm_{\min}(a_{AB}, \Omega)$ and $H^\pm_{\max}(a_{AB}, \Omega)$ is unitarily equivalent to the Laplace operator $-\Delta$.

Proof. Since the action is co-finite we are again in the situation when $\Omega$ splits into a finite union $\Omega = \Omega_1 \cup \dots \cup \Omega_n$. Hence one can apply Corollary 6.3. The unitary operator induced by multiplication with the function $g$ acts locally in the form sense [93] and therefore each of the operators $H^\pm_{\min}(0, \Omega)$ and $H^\pm_{\max}(0, \Omega)$ is a point perturbation of $-\Delta$ supported on the set $\Omega$. The perturbed operator is clearly positive and local in the form sense [94]. On the other hand, every nontrivial point perturbation in the two-dimensional case is known to have a strictly negative infimum of the quadratic form over unit vectors [39].

Since the minimum of spectrum in the case of a periodic point perturbation of the Landau operator is strictly smaller than the minimum of spectrum of the unperturbed operator [38] the following corollary is also true.

Corollary 6.5. Let $a$ be a vector potential of a nonzero homogeneous magnetic field and assume again that the discrete set $\Omega$ is invariant with respect to a co-finite action of a lattice $\Lambda$ of rank $r$, $0 \le r \le 2$. Then for $b > 0$, each of the operators $H^+_{\min}(a + a_{AB}, \Omega)$ and $H^+_{\max}(a + a_{AB}, \Omega)$ is unitarily equivalent to the Landau operator $H^+(a)$. For $b < 0$, an analogous statement is true for the operators $H^-_{\min}(a + a_{AB}, \Omega)$ and $H^-_{\max}(a + a_{AB}, \Omega)$.

To simplify the discussion to follow we shall assume once and for all that an appropriate gauge transformation has been applied so that the values of all involved Aharonov–Bohm fluxes belong to the interval $[0, 1[$. If there are some zero values then $\Omega$ is strictly larger than the singular support of $b$. As shown by Corollaries 6.4 and 6.5, the zero values can be eliminated in some particular cases. We shall proceed in our simplifications even further. If not said otherwise, we assume everywhere in what follows that the values of Aharonov–Bohm fluxes belong to the interval $]0, 1[$ and, consequently, the singular support of $b$ coincides with $\Omega$.
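7. The Ground States (Zero Modes) of the Pauli Operator

It follows immediately from the definition of the operators $H^\pm_{\max}$ and $H^\pm_{\min}$ that they are nonnegative. Consequently, if the equation
\[ H^\pm_{\min}\psi = 0 \tag{32} \]
or the equation
\[ H^\pm_{\max}\psi = 0 \tag{33} \]
has a solution in $L^2(\mathbb{R}^2)$ then this solution $\psi_\pm$ (called zero mode) is a ground state of the corresponding operator. Since the equality $H^\pm_{\min}\psi = 0$ implies $\langle H^\pm_{\min}\psi \,|\, \psi \rangle = 0$, i.e. the equality $\|\bar T_\pm\psi\|^2 = 0$, Eq. (32) is equivalent to the equality
\[ \bar T_\pm\psi = 0 . \tag{34} \]
Analogously, Eq. (33) is equivalent to the equality
\[ T_\mp^*\psi = 0 , \tag{35} \]
or, which is the same, to the condition
\[ \tilde T_\pm\psi = 0 , \qquad \psi \in L^2(\mathbb{R}^2) . \tag{36} \]
Suppose that the vector potential $a$ was chosen to have the form $a = \operatorname{sgrad}\varphi$ where $\varphi$ satisfies the equation $\Delta\varphi = b$ in the sense of distributions. We shall seek a solution of Eq. (36) in the form
\[ \psi_\pm(x, y) = \exp(\mp\varphi(x, y))\, f(x, y) = \exp(\mp\varphi(z, \bar z))\, f(z, \bar z) , \tag{37} \]
where $f$ has to be chosen so that $\psi_\pm \in L^2(\mathbb{R}^2)$ (the Aharonov–Casher ansatz). In the space of distributions $D'(\mathbb{R}^2\setminus\Omega)$ we have
\[ T_-^*\psi_+ = \tilde T_+\psi_+ = \exp(-\varphi)\bigl( i(\partial_x\varphi - a_y) f + (-\partial_y\varphi - a_x) f - i(\partial_x f + i\partial_y f) \bigr) = -2i \exp(-\varphi)\, \frac{\partial f}{\partial \bar z} \]
and
\[ T_+^*\psi_- = \tilde T_-\psi_- = \exp(\varphi)\bigl( i(-\partial_x\varphi + a_y) f + (-\partial_y\varphi - a_x) f - i(\partial_x f - i\partial_y f) \bigr) = -2i \exp(\varphi)\, \frac{\partial f}{\partial z} . \]
From here we deduce that the relation
\[ H^+_{\max}\psi_+ = 0 , \qquad \psi_+ \in L^2(\mathbb{R}^2) , \tag{38} \]
is equivalent to the condition
\[ \frac{\partial f}{\partial \bar z} = 0 \quad (z \in \mathbb{C}\setminus\Omega) , \qquad \exp(-\varphi) f \in L^2(\mathbb{R}^2) . \tag{39} \]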
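Analogously, the relation
\[ H^-_{\max}\psi_- = 0 , \qquad \psi_- \in L^2(\mathbb{R}^2) , \tag{40} \]
is equivalent to the condition
\[ \frac{\partial f}{\partial z} = 0 \quad (z \in \mathbb{C}\setminus\Omega) , \qquad \exp(\varphi) f \in L^2(\mathbb{R}^2) . \tag{41} \]
This shows the following theorem due to Aharonov and Casher [5].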
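The local part of the Aharonov–Casher ansatz can be illustrated numerically for a finite set of solenoids, assuming $\varphi = \sum_j \theta_j \ln|z - \omega_j|$ and $a = \operatorname{sgrad}\varphi$. The sketch below applies $T_+ = P_x + iP_y$ by finite differences to $\exp(-\varphi) f$; the result should vanish away from the solenoids for holomorphic $f$ and not otherwise. All function names, the sample data and the step size are assumptions of this illustration, and square integrability is not tested here.

```python
import math

def phi(z, poles, fluxes):
    """phi(z) = sum_j theta_j * ln|z - omega_j|."""
    return sum(t * math.log(abs(z - w)) for w, t in zip(poles, fluxes))

def vector_potential(z, poles, fluxes):
    """a = sgrad(phi) = (Im M, Re M) with M = sum_j theta_j / (z - omega_j)."""
    m = sum(t / (z - w) for w, t in zip(poles, fluxes))
    return m.imag, m.real

def apply_T_plus(psi, z, poles, fluxes, h=1e-5):
    """(P_x + i P_y) psi at z, with P = -i grad - a and central differences."""
    d_dx = (psi(z + h) - psi(z - h)) / (2 * h)
    d_dy = (psi(z + 1j * h) - psi(z - 1j * h)) / (2 * h)
    a_x, a_y = vector_potential(z, poles, fluxes)
    p_x = -1j * d_dx - a_x * psi(z)
    p_y = -1j * d_dy - a_y * psi(z)
    return p_x + 1j * p_y

def ansatz(f, poles, fluxes):
    """Aharonov-Casher ansatz psi_+ = exp(-phi) * f."""
    return lambda z: math.exp(-phi(z, poles, fluxes)) * f(z)
```

Replacing $f(z) = z^2 + 1$ by $f = \bar z$ in such a check produces a value of modulus $2\exp(-\varphi)$, in agreement with the formula $-2i\exp(-\varphi)\,\partial f/\partial\bar z$.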
Theorem 7.1. Assume that a vector potential $a$ is expressed in the form $a = \operatorname{sgrad}\varphi$ where $\varphi$ satisfies the equation $\Delta\varphi = b$ in the sense of distributions. Then solutions of the equation $H^+_{\max}\psi = 0$ in $L^2(\mathbb{R}^2)$ are exactly those functions from $L^2(\mathbb{R}^2)$ which have the form $\psi_+(z, \bar z) = \exp(-\varphi(z, \bar z)) f(z)$ where $f$ is a holomorphic function in the domain $\mathbb{C}\setminus\Omega$. Similarly, solutions of the equation $H^-_{\max}\psi = 0$ in $L^2(\mathbb{R}^2)$ are exactly those functions from $L^2(\mathbb{R}^2)$ which have the form $\psi_-(z, \bar z) = \exp(\varphi(z, \bar z)) f(\bar z)$ where $f$ is a holomorphic function in the domain $\mathbb{C}\setminus\Omega$.

Let us point out an interesting consequence of the theorem.

Proposition 7.2. Assume that both operators $H^+_{\max}$ and $H^-_{\max}$ have zero modes. Then they are distinct. In particular, the set $C_0^\infty(\mathbb{C}\setminus\Omega)$ is not a core for at least one of them.

Proof. Let $\psi$ be a zero mode of $H^+_{\max}$. Suppose that this operator coincides with $H^-_{\max}$. Then $\psi$ is a zero mode for $H^-_{\max}$ as well. Using notation of Theorem 7.1 we have $\psi = \exp(-\varphi) f = \exp(\varphi) g$ where $f$ is holomorphic in the domain $\mathbb{C}\setminus\Omega$ and $g$ is antiholomorphic in the same domain. Since $\varphi$ is real it holds true that $|\psi|^2 = f\bar g$. Taking into account that $\bar g$ is holomorphic the last equality implies that $f\bar g$ is a constant function and hence the same is true for $|\psi|^2$. Since $\psi \in L^2(\mathbb{R}^2)$ it follows that $\psi = 0$, a contradiction.

8. Zero Modes of the Operators $H^\pm_{\max}$ with Aharonov–Bohm Potential of Finite Type

8.1. Formulation of the problem

In this section we shall study ground states of the operator $H^\pm_{\max}(a, \Omega)$ for an Aharonov–Bohm potential $a$ of finite type determined by mutually disjoint discrete sets $\Omega_1, \dots, \Omega_n$ such that $\Omega = \Omega_1 \cup \dots \cup \Omega_n$, and by fluxes (not necessarily distinct) $\theta_1, \dots, \theta_n$ (cf. Example 4 from Sec. 4). Recall that we assume that $0 < \theta_j < 1$, for all $j$. We can rephrase the formulation of the problem. Namely, according to Theorems 6.2 and 7.1 we have to study square integrability of a function $\psi$ having the form
\[ \psi(z, \bar z) = f(z) \prod_{j=1}^{n} |W_j(z)|^{-\theta_j + m_j} \tag{42} \]
where the numbers $m_j$ are integers, $f(z)$ is holomorphic or antiholomorphic in the domain $\mathbb{C}\setminus\Omega$, and the functions $W_j$ determine the potential $a$ according to formula (16). In this section the following lemma will be useful.

Lemma 8.1. Assume that a function $f(z)$ is expressible in an annulus $r_1 < |z| < r_2$ as a Laurent series
\[ f(z) = \sum_{n=-\infty}^{\infty} a_n z^n . \]
Then for $r_1 < r < r_2$ and arbitrary $n \in \mathbb{Z}$ it holds true that
\[ \int_{|z|=r} |f(z)|\, |dz| \ge 2\pi |a_n|\, r^{n+1} . \tag{43} \]
In particular, if $r_1 = 0$ and $n > -2$ then
\[ \iint_{|z|<r} |f(z)|\, dx\, dy \ge \frac{2\pi |a_n|}{n+2}\, r^{n+2} , \tag{44} \]
if $r_1 = 0$ and $a_n \neq 0$ for some $n \le -2$ then
\[ \iint_{|z|<r} |f(z)|\, dx\, dy = \infty , \tag{45} \]
if $r_2 = \infty$ and $a_n \neq 0$ for some $n \ge -2$ then
\[ \iint_{|z|>r} |f(z)|\, dx\, dy = \infty . \tag{46} \]

Proof. The proof immediately follows from the simple estimate
\[ \int_{|z|=r} |f(z)|\, |dz| = r \int_0^{2\pi} |f(re^{i\varphi})|\, d\varphi \ge r \Bigl| \int_0^{2\pi} e^{-in\varphi} f(re^{i\varphi})\, d\varphi \Bigr| = 2\pi |a_n|\, r^{n+1} . \]
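Estimate (43) is easy to test numerically for a Laurent polynomial. The sketch below, with a hypothetical coefficient dictionary and a uniform sampling of the circle, approximates the circle integral of $|f|$ and provides the lower bound $2\pi|a_n| r^{n+1}$ for comparison; for a single monomial the two quantities coincide.

```python
import cmath
import math

def circle_integral_abs(coeffs, r, samples=4096):
    """Approximate the integral of |f(z)| |dz| over |z| = r, where
    f(z) = sum_n coeffs[n] * z**n, using equally spaced sample points."""
    total = 0.0
    for k in range(samples):
        z = r * cmath.exp(2j * math.pi * k / samples)
        total += abs(sum(a * z ** n for n, a in coeffs.items()))
    return total * 2 * math.pi * r / samples

def lower_bound(a_n, n, r):
    """Right-hand side of (43): 2 pi |a_n| r^(n+1)."""
    return 2 * math.pi * abs(a_n) * r ** (n + 1)
```

Integrating the bound over the radius gives (44), and the divergence statements (45) and (46) follow from the behaviour of $r^{n+2}$ as $r \to 0$ or $r \to \infty$.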
An application of Lemma 8.1 yields the following auxiliary result.

Lemma 8.2. Assume that in (42) it holds $m_j = 0$, for all $j$, i.e.
\[ \psi(z, \bar z) = f(z) \prod_{j=1}^{n} |W_j(z)|^{\beta_j} \]
where $-1 < \beta_j < 0$, for each $j$, and $f(z)$ is holomorphic in the domain $\mathbb{C}\setminus\Omega$. If $\psi \in L^2(\mathbb{R}^2)$ then $f$ is an entire function.

Proof. Assume that $\omega \in \Omega_k$ and $r > 0$ is sufficiently small so that $D(\omega, r) \cap \Omega = \{\omega\}$ where $D(\omega, r)$ is the disc with radius $r$ centered at the point $\omega$. Since $W_j(\omega) \neq 0$, for $j \neq k$, and $\omega$ is a simple zero of $W_k(z)$ one can assume that $r$ is small enough so that it holds
\[ \prod_{j=1}^{n} |W_j(z)|^{\beta_j} \ge c > 0 , \]
for $0 < |z - \omega| < r$.
Now one can show that $\omega$ cannot be a pole nor an essential singularity of $f$. Otherwise $\omega$ would be a pole of even order or an essential singularity of $f^2$. In any case Lemma 8.1 implies that
\[ \iint_{D(\omega,r)} |\psi(z, \bar z)|^2\, dx\, dy \ge c^2 \iint_{D(\omega,r)} |f(z)|^2\, dx\, dy = \infty , \]
a contradiction.

8.2. Finite number of Aharonov–Bohm fluxes

We start from the simplest case, i.e. from the Hamiltonian $H^\pm_{\max}(\Omega, \Theta)$ corresponding to a finite number of Aharonov–Bohm fluxes (Example 5 from Sec. 4). Then $\Omega = \{\omega_1, \dots, \omega_n\}$ is a finite set, $\Theta = (\theta_1, \dots, \theta_n)$, $0 < \theta_j < 1$. In this case zero modes may occur under suitable assumptions on fluxes $\theta_j$. More precisely, the following theorem is true [74].

Theorem 8.3. A sufficient and necessary condition for the operators $H^+_{\max}(\Omega, \Theta)$ and $H^-_{\max}(\Omega, \Theta)$ to have zero modes is
\[ \sum_{j=1}^{n} \theta_j > 1 , \tag{47} \]
in the former case, and
\[ \sum_{j=1}^{n} \theta_j < n - 1 \tag{48} \]
in the latter case.

Proof. Let us start from the operator $H^+_{\max}$. We have to find a nonzero function $f$ which is holomorphic in the domain $\mathbb{C}\setminus\Omega$ and such that the function
\[ \psi(z, \bar z) = f(z) \prod_{j=1}^{n} |z - \omega_j|^{-\theta_j} \tag{49} \]
is square integrable. Suppose that condition (47) is satisfied. Taking for $f$ a constant function it is easy to verify that in that case we get a square integrable function $\psi$. Conversely, assume that $\psi$ is square integrable but condition (47) is false. Then from Eq. (49) one easily deduces that $f$ cannot be a nonzero constant. Furthermore, from the equality
\[ f(z) = \psi(z, \bar z) \prod_{j=1}^{n} |z - \omega_j|^{\theta_j} , \]
we find that there exists a constant $c_1 > 0$ such that
\[ |f(z)| \le c_1 (1 + |z|)\, |\psi(z, \bar z)| \quad \text{on } \mathbb{C} . \]
Consequently, if $r > 1$ then
\[ \iint_{|z|<r} |f(z)|^2\, dx\, dy \le c_2 r^2 \quad \text{where} \quad c_2 = 4c_1^2 \iint_{\mathbb{R}^2} |\psi(z, \bar z)|^2\, dx\, dy . \]
From inequality (43) in Lemma 8.1 it follows that $f(z)$ is a constant function. This contradiction proves the theorem in the case of the operator $H^+_{\max}$. To prove the theorem in the case of the operator $H^-_{\max}$ one can either modify the above argument or apply Corollary 5.6.

8.3. A chain of Aharonov–Bohm fluxes

Here we show that the Hamiltonians $H^\pm_{\max}$ corresponding to a finite union of chains of Aharonov–Bohm fluxes have infinitely many zero modes. In this subsection we use the notation from Example 6 in Sec. 4. The proof uses the following elementary estimate. Since for $z = x + iy$, $x, y \in \mathbb{R}$, we have $|\sin(z)|^2 = \operatorname{ch}^2(y) - \cos^2(x) = \operatorname{sh}^2(y) + \sin^2(x)$, it holds true that $|\operatorname{sh}(y)| \le |\sin(z)| \le \operatorname{ch}(y)$. Hence
\[ \frac{e^{|y|} - 1}{2} \le |\sin(z)| \le e^{|y|} . \tag{50} \]

Theorem 8.4. Let a uniformly discrete set $\Omega$ be expressible as a disjoint union of a finite number of chains $\Omega_1, \dots, \Omega_n$, and let the chain $\Omega_j = K_j + \Lambda_j$ carry Aharonov–Bohm fluxes $(\theta_\kappa)_{\kappa\in K_j}$ ($j = 1, \dots, n$, $0 < \theta_\kappa < 1$). Then the Hamiltonians $H^\pm_{\max}(\Omega)$ have infinitely degenerate zero modes.
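The elementary estimate (50), together with the identity $|\sin z|^2 = \operatorname{sh}^2 y + \sin^2 x$ it rests on, can be spot-checked on a grid of complex points. The helper names and the small floating-point tolerance below are assumptions of this sketch.

```python
import cmath
import math

def sine_modulus_identity_gap(z):
    """|sin z|^2 - (sinh(y)^2 + sin(x)^2), which should vanish up to rounding."""
    x, y = z.real, z.imag
    return abs(cmath.sin(z)) ** 2 - (math.sinh(y) ** 2 + math.sin(x) ** 2)

def sine_bounds_hold(z, tol=1e-12):
    """Check (e^{|y|} - 1)/2 <= |sin z| <= e^{|y|} up to a rounding tolerance."""
    y = abs(z.imag)
    lower = (math.exp(y) - 1.0) / 2.0
    upper = math.exp(y)
    modulus = abs(cmath.sin(z))
    return lower - tol <= modulus <= upper + tol
```

The lower bound is what makes negative powers of $|\sin|$ decay exponentially away from the real axis, which is the mechanism used in the proof of Theorem 8.4.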
Proof. In the proof we shall consider only the operator $H^+_{\max}$. Using Lemma A.1 we may assume that each chain $\Omega_j$ is contained in a line $L_j$ and that it holds $L_j \neq L_k$ for $j \neq k$. Then the Bravais lattice $\Lambda_j$ of the chain $\Omega_j$ has the form $\Lambda_j = \{k\omega_j;\ k \in \mathbb{Z}\}$, $\omega_j \in \mathbb{C}$, $\omega_j \neq 0$. Without loss of generality we can suppose that $\omega_1 > 0$ and $\kappa_1 = 0$. Hence $L_1 = \mathbb{R}$ and $L_j = \omega_j\mathbb{R} + \kappa_j$ ($j = 2, \dots, n$) where $\kappa_j$ is a fixed element from $K_j$. For each line $L_j$ we shall construct a strip $P_j$ with border lines parallel to $L_j$ and containing $L_j$ in its interior. Furthermore, let $Q$ be a sufficiently large disk centered at 0 such that outside $Q$ the strips $P_j$ do not intersect each other. It suffices to show that there exists an infinite number of linearly independent entire functions $f(z)$ for which the function
\[ \psi(z, \bar z) = f(z)\, g_1(z, \bar z) \cdots g_n(z, \bar z) , \quad \text{with} \quad g_j(z, \bar z) = \prod_{\kappa\in K_j} \Bigl| \sin\frac{\pi(z - \kappa)}{\omega_j} \Bigr|^{-\theta_\kappa} , \tag{51} \]
is square integrable. We shall show that this condition is satisfied for any function
\[ f(z) = \frac{\sin(\alpha z)}{z} \quad \text{where} \quad 0 < \alpha < \theta := \frac{\pi}{\omega_1} \sum_{\kappa\in K_1} \theta_\kappa . \tag{52} \]
The verification follows from a series of claims.
(A) $\psi \in L^2(Q)$,
(B) each function $g_j$ is bounded outside the strip $P_j$,
(C) the function $g_1(z, \bar z) \sin(\alpha z)$ is bounded outside the strip $P_1$.
Claim (A) follows from the fact that $f$ is bounded on $Q$ and that the functions $g_j$ have square integrable singularities. Moreover, only finitely many singularities are contained in $Q$. Claims (B) and (C) are consequences of the inequalities in (50) and condition (52). To complete the proof it remains to show that
(D) $\psi \in L^2(P_j\setminus Q)$, $\forall j = 1, \dots, n$,
(E) $\psi \in L^2(\mathbb{R}^2\setminus(P_1 \cup \dots \cup P_n))$.
To show Claim (D) notice that (B) and (C) imply the estimate
\[ |\psi(z, \bar z)| \le \frac{c_j\, |g_j(z, \bar z)|}{|z|} , \]
valid on $P_j\setminus Q$ with some constant $c_j > 0$, and that the function $g_j(z, \bar z)$ is periodic along the line $L_j$. For $j = 1$ one uses also that $\sin(\alpha z)$ is bounded on the strip $P_1$. To show Claim (E) let us point out that the inequality
\[ |\psi(z, \bar z)| \le c' \Bigl| \frac{\sin(\alpha z)}{z} \Bigr|\, |g_1(z, \bar z)| \]
holds true on $\mathbb{C}\setminus(P_2 \cup \dots \cup P_n)$ with some constant $c' > 0$, as it follows from (B). From inequalities (50) one derives that
\[ |\psi(z, \bar z)| \le c''\, \frac{\exp((\alpha - \theta)|y|)}{|z|} \]
on $\mathbb{C}\setminus(P_1 \cup P_2 \cup \dots \cup P_n)$. Finally, condition (52) implies that $\psi \in L^2(\mathbb{R}^2\setminus(P_1 \cup \dots \cup P_n))$.

Under more restrictive conditions on the fluxes $\theta_\kappa$ the assumption on the uniform discreteness can be dropped. Since every chain is a union of one-atom chains we can confine ourselves to such chains. Moreover, it is clear that a union of chains need not be a uniformly discrete set only in the case when among the chains in question there are at least two contained in the same line. Consequently, it suffices to analyze the case when the chains are contained in a single line, say, in the real
axis $\mathbb{R}$. Suppose that $\Omega_j = \kappa_j + \Lambda_j$ where $\Lambda_j = \{\omega_j k;\ k \in \mathbb{Z}\}$, $\kappa_j \in \mathbb{R}$, $\omega_j > 0$ ($j = 1, \dots, n$), are mutually disjoint one-atom chains and $\theta_1, \dots, \theta_n$ ($0 < \theta_j < 1$) are the corresponding Aharonov–Bohm fluxes.

Theorem 8.5. Assume that all chains $\Omega_j$, $j = 1, \dots, n$, are contained in $\mathbb{R}$. Then the Hamiltonian $H^+_{\max} = H^+_{\max}(\Omega_1, \dots, \Omega_n; \theta_1, \dots, \theta_n)$ has an infinitely degenerate zero mode if one of the following conditions is satisfied:
(i) $\theta_1 + \dots + \theta_n < 1$,
(ii) $\dfrac{\theta_1}{\omega_1} + \dots + \dfrac{\theta_n}{\omega_n} > \dfrac{1}{\omega_1} + \dots + \dfrac{1}{\omega_n} - \dfrac{1}{\min_j \omega_j}$,
(iii) $n = 2$.

Proof. (i) In this case one can choose numbers $p_j > 1$ so that $p_j\theta_j < 1$, $\forall j = 1, \dots, n$, and $\sum_j p_j^{-1} = 1$ (e.g., $p_j^{-1} = \theta_j/(\theta_1 + \dots + \theta_n)$). Let us consider the functions
\[ g_j(z, \bar z) = \frac{\sin \alpha_j(z - \kappa_j)}{z - \kappa_j}\, \Bigl| \sin\frac{\pi(z - \kappa_j)}{\omega_j} \Bigr|^{-\theta_j} , \]
where
\[ 0 < \alpha_j < \min_j \frac{\pi\theta_j}{\omega_j} . \]
Set $\psi = g_1 \cdots g_n$. It suffices to show that $\psi$ is square integrable. From Jensen's inequality it follows that
\[ |\psi|^2 \le \frac{|g_1|^{2p_1}}{p_1} + \dots + \frac{|g_n|^{2p_n}}{p_n} . \tag{53} \]
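Inequality (53) is an instance of the weighted arithmetic–geometric mean (Young) inequality $\prod_j x_j \le \sum_j x_j^{p_j}/p_j$ for $x_j \ge 0$ and $\sum_j p_j^{-1} = 1$, applied to $x_j = |g_j|^2$. The sketch below, with hypothetical function names, builds the exponents suggested in the proof and evaluates the gap between the two sides.

```python
import math

def young_exponents(fluxes):
    """Exponents p_j with 1/p_j = theta_j / (theta_1 + ... + theta_n)."""
    total = sum(fluxes)
    return [total / t for t in fluxes]

def young_gap(values, exponents):
    """sum_j x_j^{p_j} / p_j - prod_j x_j, nonnegative when sum_j 1/p_j = 1."""
    weighted_sum = sum(x ** p / p for x, p in zip(values, exponents))
    return weighted_sum - math.prod(values)
```

With this choice $p_j\theta_j = \theta_1 + \dots + \theta_n$, so condition (i) is exactly what guarantees $p_j\theta_j < 1$ for every $j$.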
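Recalling that $p_j\theta_j < 1$, $p_j > 1$, and repeating the considerations from the proof of Theorem 8.4 one can show that each summand on the RHS of formula (53) is integrable.

Let us now discuss condition (ii). One can assume that $\min_j \omega_j = \omega_1$ and $\kappa_1 = 0$. We shall consider a function $\psi$ of the form
\[ \psi(z, \bar z) = \frac{\sin(\alpha z)}{z}\, g_1(z, \bar z) \cdots g_n(z, \bar z) \tag{54} \]
where
\[ g_1(z, \bar z) = \Bigl| \sin\frac{\pi z}{\omega_1} \Bigr|^{-\theta_1} , \qquad g_j(z, \bar z) = \Bigl| \sin\frac{\pi(z - \kappa_j)}{\omega_j} \Bigr|^{1-\theta_j} , \tag{55} \]
for $j = 2, \dots, n$, and $\alpha$ obeys the condition
\[ 0 < \alpha < \theta := \frac{\pi}{\omega_1} - \pi \sum_{j=1}^{n} \frac{1 - \theta_j}{\omega_j} . \tag{56} \]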
Let $T$ be a strip parallel to the real line $\mathbb{R}$ and containing $\mathbb{R}$ in its interior. One again concludes from inequalities (50) that outside the strip $T$ it holds true that
$$|\psi(z,\bar z)|\le c_1\,\frac{\exp((\alpha-\theta)|y|)}{|z|}.$$
Furthermore, inside $T$ we can use the estimate
$$|\psi(z,\bar z)|\le c_2\left|\frac{\sin(\alpha z)}{z}\right||g_1(z,\bar z)|.$$
Therefore one can use a similar reasoning as in the proof of Theorem 8.4 to show that $\psi\in L^2(\mathbb{R}^2)$.

Finally, let us discuss condition (iii). If $\omega_1=\omega_2$ then one can refer to Theorem 8.4. In the opposite case we shall assume that $\omega_1<\omega_2$. If $\theta_1+\theta_2<1$ then we apply condition (i) from the theorem. If not then we have
$$\frac{\theta_1}{\omega_1}+\frac{\theta_2}{\omega_2}>\frac{\theta_1+\theta_2}{\omega_2}\ge\frac{1}{\omega_2}=\frac{1}{\omega_1}+\frac{1}{\omega_2}-\frac{1}{\min\omega_j},$$
and we can apply condition (ii).

According to Lemma 5.5 we can reformulate the result for the operator $H^-_{\max}$ as follows.

Theorem 8.6. Assume that all chains $\Omega_j$, $j=1,\dots,n$, are contained in $\mathbb{R}$. Then the Hamiltonian $H^-_{\max}=H^-_{\max}(\Omega_1,\dots,\Omega_n;\theta_1,\dots,\theta_n)$ has an infinitely degenerate zero mode if one of the following conditions is satisfied:

(i) $\theta_1+\dots+\theta_n>n-1$,
(ii) $\dfrac{\theta_1}{\omega_1}+\dots+\dfrac{\theta_n}{\omega_n}<\dfrac{1}{\min_j\omega_j}$,
(iii) $n=2$.

8.4. A lattice of Aharonov–Bohm fluxes

Let us now consider the Hamiltonian $H^\pm_{\max}$ for a lattice of Aharonov–Bohm solenoids $\Omega=K+\Lambda$ where $\Lambda$ is the Bravais lattice of the crystallographic lattice $\Omega$ with a basis $\{\omega_1,\omega_2\}$ (cf. Example 7 in Sec. 4). To analyze this case we shall use the Weierstrass σ-function $\sigma(z)\equiv\sigma(z;\omega_1,\omega_2)$. Let us introduce, following [95, 96], the modified Weierstrass σ-function $\tilde\sigma(z)$,
$$\tilde\sigma(z)=e^{-\nu z^2}\sigma(z)\tag{57}$$
where
$$\nu=\frac{i}{4S}\,(\eta_1\bar\omega_2-\eta_2\bar\omega_1),\qquad \eta_j=2\zeta\!\left(\frac{\omega_j}{2}\right),\qquad S=\operatorname{Im}(\bar\omega_1\omega_2),$$
and $\zeta(z)$ is the Weierstrass ζ-function (cf. Appendix C). We shall need the following lemma. The number $\mu$, occurring in the formulation of the lemma and depending on the lattice $\Lambda$, is defined by
$$\mu=\frac{\pi}{2S}.\tag{58}$$

Lemma 8.7. Let $\alpha_j$, $j=1,\dots,n$, be real numbers such that $0<\alpha_j<1$, let $\beta$ be an arbitrary real number, and let $a_j$, $j=1,\dots,n$, be an arbitrary array of complex numbers no two of which are congruent modulo $\Lambda$. If the condition
$$\beta<\mu\sum_{j=1}^n\alpha_j$$
is satisfied then
$$\exp(\beta|z|^2)\prod_{j=1}^n|\tilde\sigma(z-a_j)|^{-\alpha_j}\in L^2(\mathbb{R}^2).$$
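Proof. We shall consider a shifted elementary cell of the lattice $\Lambda$,
$$L_\varepsilon=\{(t_1+\varepsilon)\omega_1+(t_2+\varepsilon)\omega_2;\ 0\le t_1,t_2<1\},$$
where $\varepsilon>0$ is chosen so that the interior of $L_\varepsilon$ contains exactly one zero of each of the functions $\tilde\sigma(z-a_j)$, and hence exactly one pole of each function $1/\tilde\sigma(z-a_j)$. But in that case,
$$\int_{L_\varepsilon}\prod_{j=1}^n|\tilde\sigma(z-a_j)|^{-2\alpha_j}\,dx\,dy<\infty.\tag{59}$$
Let $\rho(z,\bar z)$ be a function defined by the formula
$$|\sigma(z)|^2=\exp(\nu z^2+\bar\nu\bar z^2+2\mu z\bar z)\,\rho(z,\bar z).\tag{60}$$
From formula (60) we deduce that
$$I:=\iint_{L_\varepsilon}\prod_{j=1}^n|\rho(z-a_j,\bar z-\bar a_j)|^{-\alpha_j}\,dx\,dy<\infty.\tag{61}$$
Since the function $\rho(z,\bar z)$ is $\Lambda$-periodic and $|\tilde\sigma(z)|^2=\exp(2\mu|z|^2)\rho(z,\bar z)$ (see Lemma C.1) it holds true that
$$\begin{aligned}
\iint_{\mathbb{R}^2}\exp(2\beta|z|^2)\prod_{j=1}^n|\tilde\sigma(z-a_j)|^{-2\alpha_j}\,dx\,dy
&=\sum_{\lambda\in\Lambda}\iint_{L_\varepsilon+\lambda}\exp\Bigl(2\Bigl(\beta-\mu\sum_j\alpha_j\Bigr)|z|^2\Bigr)\prod_{j=1}^n|\rho(z-a_j,\bar z-\bar a_j)|^{-\alpha_j}\,dx\,dy\\
&\le I\sum_{\lambda\in\Lambda}\sup\Bigl\{\exp\Bigl(2\Bigl(\beta-\mu\sum_j\alpha_j\Bigr)|z|^2\Bigr);\ z\in L_\varepsilon+\lambda\Bigr\}<\infty.
\end{aligned}$$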
Theorem 8.8. Let $\Omega=K+\Lambda$ be a lattice of Aharonov–Bohm solenoids with an array of fluxes $\Theta=(\theta_\kappa)_{\kappa\in K}$, $0<\theta_\kappa<1$. Then each of the operators $H^\pm_{\max}(\Omega;\Theta)$ has an infinitely degenerate zero mode.

Proof. We shall confine ourselves to the case of the operator $H^+_{\max}$. Let us consider the function
$$\psi(z,\bar z)=f(z)\prod_{\kappa\in K}|\tilde\sigma(z-\kappa)|^{-\theta_\kappa}.\tag{62}$$
According to Lemma 8.7, $\psi\in L^2(\mathbb{R}^2)$ if $f$ is an arbitrary polynomial.

Remark 8.9. Owing to Theorem 8.8, we can describe an interesting example related to the question of absolutely continuous spectrum for the Pauli operator $H^\pm_{\max}(a)$ with a magnetic field $b=\partial_x a_y-\partial_y a_x$ which is supposed to be periodic with respect to a lattice $\Lambda=\{k_1\omega_1+k_2\omega_2;\ k_1,k_2\in\mathbb{Z}\}$, with $S=\operatorname{Im}(\bar\omega_1\omega_2)>0$. If the vector potential $a$ is "sufficiently regular" and the flux of the field $b$ through the elementary cell equals zero then the spectrum of the operators $H^\pm_{\max}(a)$ is purely absolutely continuous (see [23–28] and others). The same result is true for Schrödinger operators with "sufficiently regular" periodic vector potentials in the space $L^2(\mathbb{R}^d)$ for any $d\ge2$. In the case $d\ge3$, N. D. Filonov described in [29] an example showing that the assumptions on the vector potential stated in [28] and other papers cannot be essentially weakened. Theorem 8.8 shows that two-dimensional Pauli operators with a singular two-periodic magnetic field may have (infinitely degenerate) eigenvalues. In more detail, let us take, for example, a set $K$ containing two elements, $K=\{\kappa_1,\kappa_2\}$, and suppose that $\theta\equiv\theta_{\kappa_1}=-\theta_{\kappa_2}\in\,]0,1[$. Then by Theorem 8.8 both operators $H^\pm_{\max}(a)$ have an eigenvalue, namely the number zero. According to Example 7 from Sec. 4, the corresponding vector potential $a$ reads $a_x=\theta\operatorname{Im}(\zeta(z-\kappa_1)-\zeta(z-\kappa_2))$, $a_y=\theta\operatorname{Re}(\zeta(z-\kappa_1)-\zeta(z-\kappa_2))$. Owing to quasi-periodicity of the Weierstrass function $\zeta(z)$, $\zeta(z+\omega_j)=\zeta(z)+\eta_j$ where $\eta_j=2\zeta(\omega_j/2)$, the vector potential $a$ is $\Lambda$-periodic.

Now we state an analog of Theorem 8.4 for lattices of Aharonov–Bohm fluxes. In view of the example described in Remark A.2, we cannot here repeat the arguments from the proof of Theorem 8.4 but the properties of the modified Weierstrass σ-function simplify matters considerably.

Theorem 8.10. Let a uniformly discrete set $\Omega$ be expressible as a disjoint union of a finite number of lattices $\Omega_1,\dots,\Omega_n$, and let the lattice $\Omega_j=K_j+\Lambda_j$ carry Aharonov–Bohm fluxes $(\theta_\kappa)_{\kappa\in K_j}$ ($j=1,\dots,n$, $0<\theta_\kappa<1$). Then the Hamiltonians $H^\pm_{\max}(\Omega)$ have infinitely degenerate zero modes.

Proof. In the proof we shall consider only the operator $H^+_{\max}$. Without loss of generality we suppose that each $K_j$ is a singleton: $K_j=\{\kappa_j\}$, and we shall write $\theta_j$ instead of $\theta_{\kappa_j}$. By the hypothesis, there is a sufficiently small disk $D$ centered at 0
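such that for $\omega_1,\omega_2\in\Omega$, $\omega_1\ne\omega_2$, the sets $D+\omega_1$ and $D+\omega_2$ are disjoint. Denote $L_j:=D+\kappa_j+\Lambda_j$; then for every $j$ there exists $c_j>0$ such that $|\tilde\sigma(z-\kappa_j)|^{-\theta_j}\le c_j$ for $z\notin L_j$. It is clear that
$$\prod_{j=1}^n|\tilde\sigma(z-a_j)|^{-\theta_j}\le\sum_{j=1}^n\Bigl(\prod_{k\ne j}c_k\Bigr)|\tilde\sigma(z-a_j)|^{-\theta_j}.$$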
Now we can refer to Lemma 8.7.
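The product estimate used in this proof is elementary: at any point at most one factor can exceed its bound $c_j$, because the sets $L_j$ are mutually disjoint. The sketch below checks the corresponding inequality for plain non-negative numbers; the function name and the sample values are assumptions made only for illustration.

```python
import math


def product_bound_holds(values, bounds):
    """Check prod_j values[j] <= sum_j (prod_{k != j} bounds[k]) * values[j]."""
    lhs = math.prod(values)
    rhs = sum(
        math.prod(b for k, b in enumerate(bounds) if k != j) * v
        for j, v in enumerate(values)
    )
    return lhs <= rhs * (1 + 1e-12)


# Only the first factor exceeds its bound: the estimate holds.
print(product_bound_holds([100.0, 0.5, 0.2], [1.0, 1.0, 1.0]))
# Two factors exceed their bounds at once: the estimate can fail.
print(product_bound_holds([100.0, 100.0], [1.0, 1.0]))
```

The second call shows why the disjointness of the neighbourhoods $D+\omega$ is needed: without it two singular factors could be large at the same point.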
Analogs of Theorems 8.5 and 8.6 are also valid and can be proved by the same method. In more detail, let $\Omega_j$, $j=1,\dots,n$, be mutually disjoint simple crystallographic lattices, $\Omega_j=\kappa_j+\Lambda_j$ where $\kappa_j\in\mathbb{C}$, $\Lambda_j=\{k_1\omega_1^{(j)}+k_2\omega_2^{(j)};\ k_1,k_2\in\mathbb{Z}\}$. Furthermore, $S_j=\operatorname{Im}(\bar\omega_1^{(j)}\omega_2^{(j)})$ designates the area of the elementary cell of the Bravais lattice $\Lambda_j$.

Theorem 8.11. The Hamiltonian $H^+_{\max}=H^+_{\max}(\Omega_1,\dots,\Omega_n;\theta_1,\dots,\theta_n)$ has an infinitely degenerate zero mode if one of the following conditions is satisfied:

(i) $\theta_1+\dots+\theta_n<1$,
(ii) $\dfrac{\theta_1}{S_1}+\dots+\dfrac{\theta_n}{S_n}>\dfrac{1}{S_1}+\dots+\dfrac{1}{S_n}-\dfrac{1}{\min_j S_j}$,
(iii) $n=2$ and $S_1\ne S_2$.

Theorem 8.12. The Hamiltonian $H^-_{\max}=H^-_{\max}(\Omega_1,\dots,\Omega_n;\theta_1,\dots,\theta_n)$ has an infinitely degenerate zero mode if one of the following conditions is satisfied:

(i) $\theta_1+\dots+\theta_n>n-1$,
(ii) $\dfrac{\theta_1}{S_1}+\dots+\dfrac{\theta_n}{S_n}<\dfrac{1}{\min_j S_j}$,
(iii) $n=2$ and $S_1\ne S_2$.

8.5. Superposition of a homogeneous magnetic field with a field corresponding to Aharonov–Bohm solenoids

Here we consider a perturbation of a homogeneous magnetic field by the field corresponding to a system of Aharonov–Bohm solenoids, i.e. we consider a vector potential $a$ of the form $a=a_0+a_{AB}$ where $a_0$ is the vector potential of a homogeneous magnetic field $b_0=2\pi\xi_0$ with a flux density $\xi_0$, $a_0=\pi\xi_0(-y,x)$, and $a_{AB}$ is the vector potential of a system of Aharonov–Bohm fluxes. We shall suppose that the potential $a_{AB}$ is of finite type. In that case we have a finite family of mutually disjoint discrete subsets in the complex plane, $\Omega_1,\dots,\Omega_n$, and at each point of the set $\Omega_j$ ($j=1,\dots,n$) there is a flux of magnitude $\theta_j$ ($0<\theta_j<1$) intersecting the plane.

Suppose for definiteness that $b_0>0$. Then the operator $H^+_{\max}(a_0)$ has an infinitely degenerate zero mode (the lowest Landau level shifted by the value $-b_0$)
while the ground state of the operator $H^-_{\max}(a_0)$ is strictly positive (this is the lowest Landau level shifted by the value $b_0$). Thus the latter operator has no zero mode.

Intuitively, the results proved below in Theorems 8.13 and 8.16 mean that if the set $\Omega=\Omega_1\cup\dots\cup\Omega_n$ has a finite density then a superposition with the potential $a_{AB}$ does not remove the zero mode from the spectrum of the operator $H^+_{\max}$, and a zero mode cannot occur in the spectrum of the operator $H^-_{\max}$ provided $b_0$ is sufficiently large. Moreover, if this set has zero density then the same statement about zero modes of the operators $H^+_{\max}$ and $H^-_{\max}$ is true for any $b_0>0$. In the case when $\Omega$ is a lattice, a superposition with the potential $a_{AB}$ does not remove the zero mode from the spectrum of $H^+_{\max}$ for any $b_0>0$ but a zero mode may occur in the spectrum of $H^-_{\max}$ for particular values of fluxes $\theta_j$. An attentive reader can effortlessly guess what happens for $b_0<0$.

Let $\alpha$ be an arbitrary positive number. For any $r>0$ we denote
$$S(r)=\sum_{\omega\in\Omega,\ 0<|\omega|\le r}\omega^{-\alpha},\tag{63}$$
$$T(r)=\sum_{\omega\in\Omega,\ 0<|\omega|\le r}|\omega|^{-\alpha},\tag{64}$$
$$n(r)=\#\{\omega\in\Omega;\ |\omega|\le r\}.\tag{65}$$
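To see how the three quantities (63)–(65) can behave differently, one may evaluate them for a concrete set. The sketch below takes $\Omega$ to be the square lattice $\mathbb{Z}+i\mathbb{Z}$, which is chosen here purely for illustration, and restricts $\alpha$ to positive integers so that $\omega^{-\alpha}$ needs no choice of branch. The function names mirror the symbols of the text; `lattice_points` is a hypothetical helper.

```python
def lattice_points(r):
    """Points m + i*n of the square lattice Z + iZ with modulus at most r."""
    k = int(r)
    return [
        complex(m, n)
        for m in range(-k, k + 1)
        for n in range(-k, k + 1)
        if m * m + n * n <= r * r
    ]


def S(r, alpha=2):
    """Sum (63): omega^(-alpha) over nonzero lattice points with |omega| <= r."""
    return sum(1 / w ** alpha for w in lattice_points(r) if w != 0)


def T(r, alpha=2):
    """Sum (64): |omega|^(-alpha) over nonzero lattice points with |omega| <= r."""
    return sum(abs(w) ** (-alpha) for w in lattice_points(r) if w != 0)


def n(r):
    """Counting function (65): number of lattice points with |omega| <= r."""
    return len(lattice_points(r))


for r in (5, 10, 20):
    print(r, n(r) / r ** 2, abs(S(r)), T(r))
```

For this lattice the terms of $S(r)$ with $\alpha=2$ cancel in pairs $\omega$, $i\omega$, so $S(r)$ stays bounded, whereas $T(r)$ with $\alpha=2$ keeps growing (roughly like $2\pi\log r$) and $n(r)/r^2$ approaches the density of the lattice.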
Theorem 8.13. Suppose that $a=a_0+a_{AB}$ and that the Aharonov–Bohm vector potential $a_{AB}$ is of finite type. Let the following conditions be satisfied: (a) for any $\alpha>2$, the sums $T(r)$ are uniformly bounded, (b) $n(r)=O(r^2)$, (c) the sums $S(r)$ are uniformly bounded for $\alpha=2$. Then, for sufficiently large $b_0>0$, the Hamiltonian $H^+_{\max}(a)$ has an infinitely degenerate zero mode and $H^-_{\max}(a)$ has no zero mode. If for $\alpha=2$ the sums $T(r)$ are uniformly bounded then, for any $b_0>0$, the Hamiltonian $H^+_{\max}(a)$ has an infinitely degenerate zero mode and $H^-_{\max}(a)$ has no zero mode. In the case when $b_0<0$ the same claim remains true when interchanging the roles of $H^+_{\max}(a)$ and $H^-_{\max}(a)$.

Proof. Let us consider the operator $H^+_{\max}(a)$. In view of Theorem 7.1, we can assume that its zero mode, if any, has the form
$$\psi(z,\bar z)=f(z)\exp\Bigl(-\frac12\pi\xi_0|z|^2\Bigr)\prod_{j=1}^n|W_j|^{-\theta_j}\tag{66}$$
where $W_j(z)=W_{\Omega_j}(z)$ is the Weierstrass canonical product for the set $\Omega_j$ (see Appendix B) and $f(z)$ is a nonzero entire function (cf. Lemma 8.2). Let assumptions (a), (b), (c) be satisfied. Then, according to the Borel theorem and to the Lindelöf theorem (cf. Theorems B.1 and B.4 in the Appendix), every function $W_j(z)$ has order 2 and a finite type, i.e. $|W_j(z)|\le a_j\exp(c_j|z|^2)$ with some constants
$a_j,c_j>0$. It follows that for
$$\frac{b_0}{4}>c_1(1-\theta_1)+\dots+c_n(1-\theta_n)\tag{67}$$
the function (66) is square integrable if we set $f(z)=p(z)\prod_{j=1}^n W_j(z)$ where $p(z)$ is an arbitrary polynomial. If the functions $T(r)$ are uniformly bounded for $\alpha=2$ then $p_\Omega\le1$ (see (B.12)) and, according to Theorem B.5 from the Appendix, the functions $W_j(z)$ are of minimal type and so the constants $c_j$ can be chosen arbitrarily small. Consequently, the restriction on the field $b_0>0$ is not necessary anymore.

In the case of the operator $H^-_{\max}(a)$ we have to discuss the function
$$\psi(z,\bar z)=f(z)\exp\Bigl(\frac12\pi\xi_0|z|^2\Bigr)\prod_{j=1}^n|W_j|^{\theta_j-1}.\tag{68}$$
If assumptions (a), (b), (c) are satisfied then $|W_j(z)|^{\theta_j-1}\ge a_j\exp(c_j(\theta_j-1)|z|^2)$ with some constants $a_j,c_j>0$. Consequently, for $b_0$ obeying (67), $R>0$ sufficiently large and for some $c>0$ we have the inequality $|\psi(z)|^2\ge c|f(z)|^2$ if $|z|\ge R$. But then, as one deduces from Lemma 8.1, $\psi$ is not square integrable, hence $H^-_{\max}(a)$ has no zero modes.

Obviously, changing the sign of $b_0$ means that $H^+_{\max}(a)$ and $H^-_{\max}(a)$ interchange their roles in the above considerations.

Remark 8.14. If conditions (a), (b), (c) from Theorem 8.13 are satisfied then $\Omega$ has finite density, i.e. $\limsup_{r\to\infty}n(r)/r^2<\infty$. If, in addition, the sums $T(r)$ are uniformly bounded for $\alpha=2$ then the density of the set $\Omega$ is zero (see inequalities (B.17)).

Remark 8.15. All assumptions of Theorem 8.13 are fulfilled if every set $\Omega_j$ is either finite or a union of chains.

Let now $\Omega$ be a lattice. Using the above introduced notation we write $\Omega_j=\kappa_j+\Lambda$ where $\Lambda=\{k_1\omega_1+k_2\omega_2;\ k_1,k_2\in\mathbb{Z}\}$. Suppose that $S=\operatorname{Im}(\bar\omega_1\omega_2)>0$ ($S$ is the area of an elementary cell in the lattice $\Lambda$). Let $\eta_0=\xi_0S$ designate the flux of the homogeneous component of the field through the elementary cell of the lattice $\Lambda$.

Theorem 8.16. Suppose that $a=a_0+a_{AB}$ and that $\Omega$ is a lattice. Let $b_0>0$. Then the Hamiltonian $H^+_{\max}(a)$ has an infinitely degenerate zero mode. The inequality
$$\eta_0+\sum_{j=1}^n\theta_j<n$$
is a sufficient and necessary condition for $H^-_{\max}(a)$ to have a zero mode, and if it is fulfilled then the zero mode is infinitely degenerate.
Let $b_0<0$. Then the Hamiltonian $H^-_{\max}(a)$ has an infinitely degenerate zero mode. The inequality
$$|\eta_0|<\sum_{j=1}^n\theta_j$$
is a sufficient and necessary condition for $H^+_{\max}(a)$ to have a zero mode, and if it is fulfilled then the zero mode is infinitely degenerate.

Proof. We shall start the proof from the operator $H^+_{\max}$. In analogy with the proof of Theorem 8.8 we consider the function
$$\psi(z,\bar z)=f(z)\exp\Bigl(-\frac{\pi\eta_0}{2S}|z|^2\Bigr)\prod_{j=1}^n|\tilde\sigma(z-\kappa_j)|^{-\theta_j}\tag{69}$$
(recall that $\eta_0=\xi_0S$). From Lemma 8.7 we immediately deduce that for an arbitrary polynomial $f$ it holds true that $\psi\in L^2(\mathbb{R}^2)$, hence the zero mode is infinitely degenerate.

Let us now turn to the operator $H^-_{\max}$. According to Theorem 6.2 and Lemma 8.2, the ground state of the operator $H^-_{\max}$, if any, has the form
$$\psi(z,\bar z)=f(\bar z)\exp\Bigl(\frac{\pi\eta_0}{2S}|z|^2\Bigr)\prod_{j=1}^n|\tilde\sigma(z-\kappa_j)|^{\theta_j-1}\tag{70}$$
where $f(z)$ is an entire function. Using formula (C.17) from Appendix C (where $\mu=\pi/2S$) we get
$$\begin{aligned}
|\psi(z,\bar z)|^2&=|f(\bar z)|^2\exp\Bigl(\frac{\pi\eta_0}{S}|z|^2\Bigr)\prod_{j=1}^n\exp\Bigl(\frac{\pi}{S}(\theta_j-1)|z|^2\Bigr)\times\prod_{j=1}^n|\rho(z-\kappa_j,\bar z-\bar\kappa_j)|^{2(\theta_j-1)}\\
&=|f(\bar z)|^2\exp(c|z|^2)\prod_{j=1}^n|\rho(z-\kappa_j,\bar z-\bar\kappa_j)|^{2(\theta_j-1)}
\end{aligned}\tag{71}$$
where
$$c=\frac{\pi}{S}\Bigl(\eta_0+\sum_{j=1}^n(\theta_j-1)\Bigr).$$
The condition $\eta_0+\sum_{j=1}^n\theta_j<n$ is equivalent to the condition $c<0$. But in that case the membership $\psi\in L^2(\mathbb{R}^2)$ can be proved as in Lemma 8.7. Conversely, assume that $c\ge0$ and that there exists a non-zero entire function $f(z)$ such that $\psi\in L^2(\mathbb{R}^2)$. Then from Eq. (71) we derive that $|f(\bar z)|^2\le c_1|\psi(z,\bar z)|^2$ with some constant $c_1$. Consequently, $f\in L^2(\mathbb{R}^2)$ which contradicts Lemma 8.1.
In the case $b_0<0$ one can either repeat the above reasoning or apply Corollary 5.6 while noticing that $\{-x\}=1-\{x\}$ for any $x\in\mathbb{R}$ which is not an integer.

Remark 8.17. Similarly to the case $b_0=0$ (cf. Remark 8.9), both operators $H^\pm_{\max}(a)$ may have localized states also when the total flux through the elementary cell is zero. Suppose, for example, that $b_0>0$, $0<\theta_1<1$, $0<\eta_0+\theta_1<1$ and $\theta_2=1-\eta_0-\theta_1$. Then $\eta_0+\theta_1+\theta_2=1<2$ and the assumption of the theorem is satisfied.

Remark 8.18. Theorem 8.16 shows that, for $b_0>0$, an oscillation of the type "localization–delocalization" occurs after adding an Aharonov–Bohm flux to a system with the Hamiltonian $H^-_{\max}$.

9. Conservation of Zero Modes under Translations and Additions of Aharonov–Bohm Solenoids. Irregular Aharonov–Bohm Systems

9.1. Translation and addition of finitely many Aharonov–Bohm solenoids

Up to now we have investigated zero modes of regular Aharonov–Bohm systems (in the sense of the definition given at the end of Sec. 4), with Theorem 8.13 representing the only exception. The proof of this theorem suggests that one should expect zero modes also in the case when the homogeneous component of the magnetic field is absent provided the perturbation corresponds to a (in general, irregular) "sufficiently scarce" system. Further we shall consider such scarce perturbations applied to systems of chains or lattices of Aharonov–Bohm solenoids. Before addressing this question we shall prove that the zero mode of the Hamiltonian corresponding to a system of solenoids of finite type does not disappear if a finite number of solenoids are moved or if one joins a finite number of solenoids to the system.

In the following theorems, $a$ designates a potential of finite type corresponding to a system of Aharonov–Bohm solenoids which is determined by an array of mutually disjoint discrete sets, $\Omega_1,\dots,\Omega_n$, and by an array of fluxes, $\theta_1,\dots,\theta_n$ ($0<\theta_j<1$). $\Omega$ designates the union $\Omega=\Omega_1\cup\dots\cup\Omega_n$.

Theorem 9.1. In addition to the above introduced notation let $K_j=\{\omega_{1j},\dots,\omega_{n_j,j}\}$ be a finite subset of $\Omega_j$ and let $K_j'=\{\omega_{1j}',\dots,\omega_{n_j,j}'\}$ be a finite subset of $\mathbb{C}$ such that the sets $\Omega_j'=(\Omega_j\setminus K_j)\cup K_j'$, $j=1,\dots,n$, are mutually disjoint. If the operator $H^\pm_{\max}(a)$ has a zero mode then the operator $H^\pm_{\max}(a')$ determined by the array $(\Omega_1',\dots,\Omega_n';\theta_1,\dots,\theta_n)$ also has a zero mode with the same multiplicity as that of the zero mode for the operator $H^\pm_{\max}(a)$.
Proof. We shall confine ourselves to the discussion of the operator $H^+_{\max}(a)$. The zero modes of this operator can be written in the form
$$\psi(z,\bar z)=f(z)\prod_{j=1}^n|W_j(z)|^{-\theta_j}$$
where $f$ is an entire function and $W_j(z)=W_{\Omega_j}(z)$ is the Weierstrass canonical product for $\Omega_j$. Then the function
$$\tilde\psi(z,\bar z)=f(z)\prod_{j=1}^n\Bigl(|W_j(z)|^{-\theta_j}\prod_{k=1}^{n_j}\left|\frac{z-\omega_{kj}}{z-\omega_{kj}'}\right|^{\theta_j}\Bigr)$$
represents a zero mode of $H^+_{\max}(a')$. One has only to verify that $\tilde\psi\in L^2(\mathbb{R}^2)$. This is indeed the case because the additional singularities at the points $\omega_{kj}'$ are square integrable and outside a compact set $\tilde\psi$ differs from $\psi$ by a bounded factor.

This argument clearly shows that the multiplicity of the zero mode for the operator $H^\pm_{\max}(a')$ is not smaller than the multiplicity of the zero mode for the operator $H^\pm_{\max}(a)$. Since the operators play an equivalent role in the assumptions the converse is also true.
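The square integrability of a singularity of the type $|z-\omega|^{-\theta}$ with $0<\theta<1$ comes down to the elementary polar-coordinate integral $\iint_{|z|\le R}|z|^{-2\theta}\,dx\,dy=\pi R^{2-2\theta}/(1-\theta)$. The following sketch compares this closed form with a simple midpoint rule; the function names and the number of quadrature steps are choices made here for illustration only.

```python
import math


def disc_integral_exact(theta, radius=1.0):
    """Closed form of the integral of |z|^(-2*theta) over the disc |z| <= radius."""
    return math.pi * radius ** (2 - 2 * theta) / (1 - theta)


def disc_integral_midpoint(theta, radius=1.0, steps=20_000):
    """Midpoint rule for 2*pi * integral_0^radius r^(1 - 2*theta) dr."""
    h = radius / steps
    total = sum(((i + 0.5) * h) ** (1 - 2 * theta) for i in range(steps))
    return 2 * math.pi * h * total


for theta in (0.25, 0.5):
    print(theta, disc_integral_exact(theta), disc_integral_midpoint(theta))
```

The closed form stays finite for every $\theta<1$ and blows up only as $\theta\to1$, which is why fluxes in the open interval $]0,1[$ produce square integrable local singularities.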
Theorem 9.2. Assume that additionally to the considered system of solenoids there are given a finite set $\Omega'=\{\omega_1',\dots,\omega_m'\}\subset\mathbb{C}$ not intersecting $\Omega$ and a corresponding family of fluxes $\{\theta_1',\dots,\theta_m'\}$ ($0<\theta_j'<1$). Let $a'$ be the vector potential determined by the array of sets $\Omega_1,\dots,\Omega_n,\Omega'$, and by the array of fluxes $\theta_1,\dots,\theta_n,\theta_1',\dots,\theta_m'$. If the operator $H^\pm_{\max}(a)$ has a zero mode then the operator $H^\pm_{\max}(a')$ also has a zero mode whose multiplicity is not smaller than the multiplicity of the zero mode for the operator $H^\pm_{\max}(a)$.

Proof. We shall confine ourselves to the discussion of the operator $H^+_{\max}(a)$. The zero modes of this operator can be written in the form
$$\psi(z,\bar z)=f(z)\prod_{j=1}^n|W_j(z)|^{-\theta_j}$$
where $f$ is an entire function and $W_j(z)=W_{\Omega_j}(z)$ is the Weierstrass canonical product for $\Omega_j$. It turns out that the function
$$\tilde\psi(z,\bar z)=f(z)\prod_{j=1}^n|W_j(z)|^{-\theta_j}\prod_{k=1}^m|z-\omega_k'|^{-\theta_k'}$$
is a zero mode of $H^+_{\max}(a')$. One has only to verify that $\tilde\psi\in L^2(\mathbb{R}^2)$. This again follows from the fact that the singularities at the points $\omega_k'$ are square integrable and the function $\tilde\psi$ differs from $\psi$ by a bounded factor outside a compact set.
9.2. Additional notation

Up to the end of the current section, $\Omega_j$ ($j=1,\dots,n$) designates either an array of mutually disjoint chains or an array of mutually disjoint lattices, $\Omega_j=K_j+\Lambda_j$ where $\Lambda_j$ is a Bravais lattice of rank 1 or 2 and $K_j=\{\kappa_{1j},\dots,\kappa_{m_j,j}\}$ is a finite set. To each set $\Omega_j$ we relate an array of Aharonov–Bohm fluxes $\Theta_j=(\theta_{kj})_{1\le k\le m_j}$. By $\Omega$ we denote the union $\Omega=\Omega_1\cup\dots\cup\Omega_n$. In addition, we shall consider another array of discrete subsets in the plane, $\Omega_1',\dots,\Omega_m'$, whose members are mutually disjoint as well as disjoint from the sets $\Omega_1,\dots,\Omega_n$, and we relate to these additional sets an array of fluxes $\Theta'=(\theta_j')_{1\le j\le m}$. Finally, we consider a discrete set $\Omega_0'\subset\Omega$ whose points are supposed to be removed from $\Omega$. Set $\Omega'=\Omega_0'\cup\Omega_1'\cup\dots\cup\Omega_m'$. Furthermore, $W_j(z)$ is the Weierstrass canonical product for the set $\Omega_j'$, and $V_{kj}(z)$ is the Weierstrass canonical product for the set $\Omega_0'\cap(\kappa_{kj}+\Lambda_j)$. By $\tau'$ we denote the convergence exponent of the set $\Omega'$ and by $p'$ its genus. The symbol $n'(r)$ designates the number of points of the set $\Omega'$ contained in the disc $|z|\le r$. The symbol $a'$ designates the vector potential for the perturbed system of solenoids determined by the discrete sets $\Omega_1\setminus\Omega_0',\dots,\Omega_n\setminus\Omega_0'$, $\Omega_1'\cup\dots\cup\Omega_m'$, and by the array of fluxes obtained by concatenating the arrays $\Theta_1,\dots,\Theta_n$ and $\Theta'$. Let us note that we still assume that $0<\theta_j<1$, $0<\theta_j'<1$. Thus the set $\Omega'$ representing the total perturbation of the original set $\Omega$ need not be finite nor regular. On the other hand, we shall always suppose that the set $\Omega'$ is sufficiently scarce by imposing restrictive assumptions on its genus $p'$.

9.3. Addition of solenoids to a union of Aharonov–Bohm chains

In Theorem 8.4 it has been shown, roughly, that if $\Omega$ is a finite union of chains then the Hamiltonians $H^\pm_{\max}(\Omega)$ have infinitely many zero modes. Below we show that this property survives provided the perturbation $\Omega'$ is sufficiently scarce and at least two chains are not parallel.

Theorem 9.3. Let $\Omega_1,\dots,\Omega_n$ be chains whose union is a uniformly discrete set. Suppose that among the chains there are at least two which are not parallel. Furthermore, suppose that the genus of $\Omega'$ fulfills $p'=0$. Then the Hamiltonians $H^\pm_{\max}(a')$ have infinitely degenerate zero modes.

Proof. By virtue of Lemma A.1 one can assume that every chain $\Omega_j=K_j+\Lambda_j$, with $\Lambda_j=\{k\omega_j;\ k\in\mathbb{Z}\}$, $\omega_j\ne0$, is contained in a line $L_j$ and that different chains are contained in different lines. We shall suppose, without loss of generality, that $L_1$ and $L_2$ are not parallel and that $L_1$ coincides with the real line $\mathbb{R}$, with $0\in\Omega_1$. Let us consider a function $\psi$ of the form
$$\psi(z,\bar z)=f(z)\prod_{j=1}^n g_j(z,\bar z)\prod_{k=1}^m|W_k(z)|^{1-\theta_k'}\tag{72}$$
where
$$g_j(z,\bar z)=\prod_{k=1}^{m_j}\left|\sin\frac{\pi(z-\kappa_{kj})}{\omega_j}\right|^{-\theta_{kj}}|V_{kj}(z)|^{\theta_{kj}}.\tag{73}$$
To prove the theorem it suffices to find an infinite, linearly independent family of entire functions $f(z)$ such that $\psi\in L^2(\mathbb{R}^2)$. Let us show that the functions
$$f(z)=\frac{\sin(\alpha z)}{z}\tag{74}$$
with sufficiently small $\alpha>0$ suit this condition. To this end we shall need the following lemma.

Lemma 9.4. Assume that $\alpha_1,\alpha_2,a_1,a_2\in\mathbb{C}$ fulfill $\alpha_1\alpha_2\ne0$, $\alpha_1/\alpha_2\notin\mathbb{R}$. Then for every $\varepsilon>0$ there exist constants $\tilde c,c_1,c_2,\gamma_1,\gamma_2>0$ such that the inequality
$$c_1e^{\gamma_1|z|}\le|\sin(\alpha_1(z-a_1))\,\sin(\alpha_2(z-a_2))|\le c_2e^{\gamma_2|z|}\tag{75}$$
holds true whenever $|z|\ge\tilde c$ and the distance from $z$ to each of the lines $L_1=\alpha_1^{-1}\mathbb{R}+a_1$ and $L_2=\alpha_2^{-1}\mathbb{R}+a_2$ is greater than $\varepsilon$.

Proof of Lemma 9.4. Let $L$ be a line written in the form $L=\alpha^{-1}\mathbb{R}+a$ where $\alpha,a\in\mathbb{C}$, $\alpha\ne0$. Then inequality (50) implies that
$$\frac{e^{|\alpha|d}-1}{2}\le|\sin(\alpha(z-a))|\le e^{|\alpha|d}\tag{76}$$
where $d$ is the distance from the point $z$ to the line $L$. Indeed, set $\alpha=|\alpha|e^{i\varphi}$. Then $|\operatorname{Im}\alpha(z-a)|=|\alpha|d$ where $d$ is the distance from $e^{i\varphi}(z-a)$ to $\mathbb{R}$. But, at the same time, $d$ is the distance from $z$ to $e^{-i\varphi}\mathbb{R}+a=\alpha^{-1}\mathbb{R}+a$. From inequality (76) we deduce that for every $\varepsilon>0$ one can find a constant $c>0$ such that
$$|\sin(\alpha(z-a))|\ge ce^{|\alpha|d}\tag{77}$$
whenever the distance $d$ from $z$ to $L$ is greater than $\varepsilon$. Indeed, it suffices to choose $c=(1-e^{-|\alpha|\varepsilon})/2$.

Let us now reconsider the lines $L_1$ and $L_2$ from the lemma. Since $\alpha_1/\alpha_2\notin\mathbb{R}$ the lines intersect each other in a point $v\in\mathbb{C}$. Let us denote by $\varphi$ the angle between the lines $L_1$ and $L_2$ ($0<\varphi<\pi$), and by $\theta$ the angle between the vector $z-v$ and the line $L_1$. Then the distances $d_1$ and $d_2$ from the point $z$ to the lines $L_1$ and $L_2$ are respectively equal to
$$d_1=|z-v||\sin\theta|,\qquad d_2=|z-v||\sin(\theta-\varphi)|.$$
From inequality (76) we get
$$|\sin(\alpha_1(z-a_1))\,\sin(\alpha_2(z-a_2))|\le e^{|\alpha_1|d_1+|\alpha_2|d_2}\le e^{\max(|\alpha_1|,|\alpha_2|)(d_1+d_2)}.$$
Notice that $d_1+d_2\le2|z-v|\le2|z|+2|v|$, hence
$$|\sin(\alpha_1(z-a_1))\,\sin(\alpha_2(z-a_2))|\le c_2e^{\gamma_2|z|}$$
where
$$\gamma_2=2\max(|\alpha_1|,|\alpha_2|),\qquad c_2=e^{|v|\gamma_2}$$
(this is true for any $z\in\mathbb{C}$). Using inequality (77) we can relate to every $\varepsilon>0$ a constant $c>0$ such that if $d_1,d_2>\varepsilon$ then
$$|\sin(\alpha_1(z-a_1))\,\sin(\alpha_2(z-a_2))|\ge c^2e^{\min(|\alpha_1|,|\alpha_2|)(d_1+d_2)}.$$
On the other hand,
$$\begin{aligned}
d_1+d_2&\ge|z-v|(\sin^2\theta+\sin^2(\theta-\varphi))=|z-v|(1-\cos\varphi\cos(\varphi-2\theta))\\
&\ge|z-v|(1-|\cos\varphi|)\ge|z|(1-|\cos\varphi|)-|v|(1-|\cos\varphi|).
\end{aligned}$$
From here we deduce that the inequality
$$|\sin(\alpha_1(z-a_1))\,\sin(\alpha_2(z-a_2))|\ge c_1e^{\gamma_1|z|}$$
holds true for
$$c_1=c^2e^{-\min(|\alpha_1|,|\alpha_2|)|v|(1-|\cos\varphi|)},\qquad\gamma_1=\min(|\alpha_1|,|\alpha_2|)(1-|\cos\varphi|).$$
This proves the lemma.

Proof of Theorem 9.3 (Continued). Let us return to the proof of Theorem 9.3. From the assumptions of the theorem ($p'=0$) it follows that the functions $V_{kj}(z)$ and $W_j(z)$ are of growth $(1,0)$ (see Appendix B). This implies that for every $\varepsilon>0$ there exists a constant $c_\varepsilon>0$ such that
$$\prod_{l=1}^m|W_l(z)|^{1-\theta_l'}\prod_{j=1}^n\prod_{k=1}^{m_j}|V_{kj}(z)|^{\theta_{kj}}\le c_\varepsilon\exp(\varepsilon|z|).\tag{78}$$
Let $P_j$ be a strip with border lines parallel to $L_j$ and containing $L_j$ in its interior, and let $Q$ be a sufficiently large disk centered at 0. By virtue of Lemma 9.4, the disk $Q$ can be chosen so that, for $z\notin Q\cup P_1\cup P_2$, the inequality
$$|\psi(z,\bar z)|\le c_1|f(z)||\tilde g_1^B(z,\bar z)||\tilde g_2^B(z,\bar z)|\prod_{j=3}^n|\tilde g_j^A(z,\bar z)|\tag{79}$$
holds true with some constant $c_1>0$. Here we have set
$$\tilde g_j^A(z,\bar z)=\prod_{k=1}^{m_j}\left|\sin\frac{\pi(z-\kappa_{kj})}{\omega_j}\right|^{-\theta_{kj}}$$
and
$$\tilde g_j^B(z,\bar z)=\left|\sin\frac{\pi(z-\kappa_{1j})}{\omega_j}\right|^{-\beta_j}\prod_{k=2}^{m_j}\left|\sin\frac{\pi(z-\kappa_{kj})}{\omega_j}\right|^{-\theta_{kj}},$$
with $0<\beta_j<\theta_{1j}$.
On the other hand, one can choose $Q$ so that, for $z\in P_2\setminus Q$, we have the inequality $|z|\le c'd_1$ where $d_1$ is the distance from $z$ to the line $L_1$, and $c'>0$ does not depend on $z$. Then from inequality (77) we deduce that $Q$ can be replaced by a larger disk such that the inequality
$$|\psi(z,\bar z)|\le c_1'|f(z)||\tilde g_1^B(z,\bar z)|\prod_{j=2}^n|\tilde g_j^A(z,\bar z)|,\tag{80}$$
with a constant $c_1'>0$, holds true for $z\in P_2\setminus Q$. An analogous assertion is true when interchanging the strips $P_1$ and $P_2$. Finally, for any choice of the disk $Q$, the inequality
$$|\psi(z,\bar z)|\le c_1''|f(z)|\prod_{j=1}^n|\tilde g_j^A(z,\bar z)|\tag{81}$$
holds true in the interior of $Q$. Formulas (79), (80) and (81) make it possible to complete the proof by arguing in the same way as in the proof of Theorem 8.4.

In the case when all chains are parallel we have a somewhat weaker result.

Theorem 9.5. Let $\Omega_1,\dots,\Omega_n$ be parallel chains whose union is a uniformly discrete set. Assume that the convergence exponent of $\Omega'$ satisfies either $\tau'<1/2$ or $\tau'\le1/2$ and $n'(r)=o(r^{1/2})$. Then the Hamiltonians $H^\pm_{\max}(a')$ have infinitely degenerate zero modes.

Remark 9.6. Under the assumptions of Theorem 9.5 it holds true that $p'=0$ but this equality does not imply the assumptions of the theorem.

Proof. By virtue of Lemma A.1 one can assume that every chain $\Omega_j=K_j+\Lambda_j$, $\Lambda_j=\{k\omega_j;\ k\in\mathbb{Z}\}$, is contained in a line $L_j$, with different chains being contained in different lines, and that all lines are parallel to the real axis. Hence one can assume that $\omega_j>0$ for all $j$ and that all lines $L_j$ are contained in a half-plane $\operatorname{Im}z>a$ where $a>0$. Let us consider a function $\psi$ of the form
$$\psi(z,\bar z)=f(z)\prod_{j=1}^n g_j(z,\bar z)\prod_{k=1}^m|W_k(z)|^{1-\theta_k'}\tag{82}$$
where
$$g_j(z,\bar z)=\prod_{k=1}^{m_j}\left|\sin\frac{\pi(z-\kappa_{kj})}{\omega_j}\right|^{-\theta_{kj}}|V_{kj}(z)|^{\theta_{kj}}.\tag{83}$$
To prove the theorem it suffices to find an infinite, linearly independent family of entire functions $f(z)$ such that $\psi\in L^2(\mathbb{R}^2)$. Let us show that in this case the functions
$$f(z)=\frac{\sin(\alpha z)}{\sin(\sqrt{\pi\alpha z})\,\sin(\sqrt{-\pi\alpha z})},\tag{84}$$
with sufficiently small $\alpha>0$, will do. Here the function $f(z)$ is well defined and analytic in the upper half-plane provided the usual branch of the square root has been chosen. If $\operatorname{Im}z>0$ then $\sqrt z\sqrt{-z}=-iz$. Since $\sin(\sqrt z)/\sqrt z=1-z/3!+\cdots$ is in fact an entire function, also the function $f(z)$ extends as an entire function.

We start the verification from several preliminary observations. The first one follows from inequality (50).

(A) For every $\varepsilon>0$ there exists $c>0$ such that
$$|\sin(\sqrt z)|\ge\frac13\exp(|z|^{1/2})\quad\text{for }|z|\ge c,\ \varepsilon\le|\arg z|\le\pi,$$
$$|\sin(\sqrt{-z})|\ge\frac13\exp(|z|^{1/2})\quad\text{for }|z|\ge c,\ 0\le|\arg z|\le\pi-\varepsilon.$$

Suppose that $n\in\mathbb{N}$ and $0<\delta<\pi/4$. Let us denote
$$B_n(\delta)=\{z\in\mathbb{C};\ |\sqrt{\pi z}-\pi n|<\delta\}.$$

(B) For $n\ne m$ the sets $B_n(\delta)$ and $B_m(\delta)$ are disjoint.

Indeed, suppose that $n>m$. If $z\in B_n(\delta)\cap B_m(\delta)$ then there exist $u,v\in\mathbb{C}$, $|u|,|v|<\delta$, such that $\pi z=(\pi n+u)^2=(\pi m+v)^2$. Hence $\pi^2(n-m)(n+m)=2\pi(mv-nu)+v^2-u^2$. At the same time it holds true that $\pi^2(n-m)(n+m)\ge\pi^2(n+m)$ and
$$|2\pi(mv-nu)+v^2-u^2|\le2\pi\delta(m+n)+2\delta^2<4\pi\delta(m+n)<\pi^2(n+m).$$
Set
$$Q(\varepsilon)=\{z\in\mathbb{C};\ |\sin(\sqrt{\pi z})|<\varepsilon\}.$$
Let us denote by $Q_n(\varepsilon)$ the connected component of the set $Q(\varepsilon)$ containing the point $\pi n^2$, and by $U(\varepsilon)$ the connected component of the set $\{w\in\mathbb{C};\ |\sin(w)|<\varepsilon\}$ containing 0. Observe that (B) implies the following claim.

(C) For sufficiently small $\varepsilon>0$ the connected components $Q_n(\varepsilon)$ are mutually disjoint.

To complete the proof we shall need the following two lemmas.

Lemma 9.7. For every $\delta>0$ there exists a constant $c>0$ such that
$$\left|\frac{\sin(z)}{\sin(\sqrt{\pi z})}\right|\le c\max(|\sqrt{\pi z}|+1,\ |\sin(z)|)$$
on the strip $|\operatorname{Im}z|\le\delta$.
Proof of Lemma 9.7. The equality $|\sin(z)|^2=\sin^2(x)+\operatorname{sh}^2(y)$, with $z=x+iy$, $x,y\in\mathbb{R}$, implies that $|\sin z|\le(x^2+c_\delta^2y^2)^{1/2}\le c_\delta|z|$ for $|\operatorname{Im}z|\le\delta$ where we have set $c_\delta=\operatorname{sh}(\delta)/\delta$. Let us choose $\varepsilon>0$ small enough so that the sets $Q_n(\varepsilon)$ are mutually disjoint, $|\sqrt{\pi z}|>\pi n-1$ for $z\in Q_n(\varepsilon)$ and $|\sin(w)|\ge|w|/2$ on $U(\varepsilon)$. For $z\in\mathbb{C}\setminus Q(\varepsilon)$, the desired inequality is valid with $c=\varepsilon^{-1}$. If $z\in Q_n(\varepsilon)$ then $\sqrt{\pi z}-\pi n\in U(\varepsilon)$ and therefore
$$|\sin(\sqrt{\pi z})|=|\sin(\sqrt{\pi z}-\pi n)|\ge|\sqrt{\pi z}-\pi n|/2.$$
Furthermore, if $|\operatorname{Im}z|\le\delta$ then
$$|\sin(z)|=|\sin(z-\pi n^2)|\le c_\delta|z-\pi n^2|.$$
Consequently we have, for $z\in Q_n(\varepsilon)$ and $|\operatorname{Im}z|\le\delta$,
$$\left|\frac{\sin(z)}{\sin(\sqrt{\pi z})}\right|\le2c_\delta\frac{|z-\pi n^2|}{|\sqrt{\pi z}-\pi n|}\le\frac{2c_\delta}{\pi}(|\sqrt{\pi z}|+\pi n)\le\frac{4c_\delta}{\pi}(|\sqrt{\pi z}|+1).$$
This proves the lemma.
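The two elementary facts used at the start of this proof, the identity $|\sin z|^2=\sin^2x+\operatorname{sh}^2y$ and the bound $|\sin z|\le c_\delta|z|$ on the strip $|\operatorname{Im}z|\le\delta$, are easy to confirm numerically. The sketch below does this on a sample grid; the value $\delta=1.5$, the grid and the function names are assumptions chosen only for illustration.

```python
import cmath
import math


def sin_modulus_squared(z):
    """|sin z|^2 from the identity sin^2(x) + sinh^2(y) with z = x + iy."""
    return math.sin(z.real) ** 2 + math.sinh(z.imag) ** 2


def strip_constant(delta):
    """The constant c_delta = sinh(delta) / delta from the proof of Lemma 9.7."""
    return math.sinh(delta) / delta


delta = 1.5
c_delta = strip_constant(delta)
# Sample points of the strip |Im z| <= delta.
samples = [complex(0.37 * i, delta * k / 10) for i in range(-40, 41) for k in range(-10, 11)]
identity_ok = all(
    abs(abs(cmath.sin(z)) ** 2 - sin_modulus_squared(z)) < 1e-9 for z in samples
)
bound_ok = all(abs(cmath.sin(z)) <= c_delta * abs(z) + 1e-12 for z in samples)
print(identity_ok, bound_ok)
```

The bound is sharp on the imaginary axis at $y=\pm\delta$, which is why a small rounding tolerance is added in the comparison.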
Lemma 9.8. For any $b>0$ there exists $\varepsilon>0$ such that
$$\iint_{Q(\varepsilon)}\left|\frac{\sin(z)}{\sin(\sqrt{\pi z})}\right|^2\exp(-b|z|^{1/2})\,dx\,dy<\infty$$
where one chooses the principal branch of the square root on $\mathbb{C}\setminus\mathbb{R}_-$.

Proof of Lemma 9.8. We choose $\varepsilon$ small enough so that Claim (C) is true. In the integral
$$I_n=\iint_{Q_n(\varepsilon)}\left|\frac{\sin(z)}{\sin(\sqrt{\pi z})}\right|^2\exp(-b|z|^{1/2})\,dx\,dy$$
we apply the substitution $w=\sqrt{\pi z}-\pi n$, i.e. $z=(w+\pi n)^2/\pi$, where $w=u+iv$, $u,v\in\mathbb{R}$. Since $dx\wedge dy=(4/\pi^2)|w+\pi n|^2\,du\wedge dv$ and $Q_n(\varepsilon)$ is mapped onto $U(\varepsilon)$ we get
$$I_n=\frac{4}{\pi^2}\iint_{U(\varepsilon)}\left|\frac{\sin\bigl(2nw+\frac{w^2}{\pi}\bigr)}{\sin(w)}\right|^2|w+\pi n|^2\exp\Bigl(-\frac{b}{\sqrt\pi}|w+\pi n|\Bigr)\,du\,dv.$$
One can assume that $\varepsilon$ is sufficiently small so that $|\sin(w)|\ge|w|/2$ on $U(\varepsilon)$. Furthermore, there clearly exists a constant $c>0$, depending on $\varepsilon$ but independent of $n$, such that $|\sin(2nw+(w^2/\pi))|<c\operatorname{sh}(2n|w|)$ on $U(\varepsilon)$. Thus we get
$$I_n\le\frac{16c^2}{\pi^2}\iint_{U(\varepsilon)}\frac{\operatorname{sh}^2(2n|w|)}{|w|^2}\,|w+\pi n|^2\exp\Bigl(-\frac{b}{\sqrt\pi}|w+\pi n|\Bigr)\,du\,dv.$$
By modifying the constant in front of the integral we can simplify this inequality,
$$I_n\le c'n^2\exp(-b\sqrt\pi\,n)\iint_{U(\varepsilon)}\frac{\operatorname{sh}^2(2n|w|)}{|w|^2}\,du\,dv.$$
Here again the constant $c'>0$ does not depend on $n$. If $\sup\{|w|;\ w\in U(\varepsilon)\}<b\sqrt\pi/4$ then
$$\sum_{n=0}^\infty I_n\le c'\sum_{n=0}^\infty n^2\exp(-b\sqrt\pi\,n)\iint_{U(\varepsilon)}\frac{\operatorname{sh}^2(2n|w|)}{|w|^2}\,du\,dv<\infty.$$
This proves the lemma.
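The substitution $z=(w+\pi n)^2/\pi$ used in this proof rests on three identities: the area factor $|dz/dw|^2=(4/\pi^2)|w+\pi n|^2$, the reduction $|\sin z|=|\sin(2nw+w^2/\pi)|$, and $|z|^{1/2}=|w+\pi n|/\sqrt\pi$. They can be spot-checked as follows; the sample points, the finite-difference step and the function names are illustrative assumptions.

```python
import cmath
import math


def z_of_w(w, n):
    """The substitution z = (w + pi*n)^2 / pi from the proof of Lemma 9.8."""
    return (w + math.pi * n) ** 2 / math.pi


def area_factor(w, n):
    """Real Jacobian |dz/dw|^2 = (4 / pi^2) * |w + pi*n|^2 of the map w -> z."""
    return 4 * abs(w + math.pi * n) ** 2 / math.pi ** 2


def reduced_sine(w, n):
    """sin(2nw + w^2/pi), which agrees with sin(z) up to the sign (-1)^n."""
    return cmath.sin(2 * n * w + w * w / math.pi)


h = 1e-6  # step of the central difference approximating dz/dw
for n in range(1, 6):
    for w in (0.1 + 0.05j, -0.2 + 0.1j, 0.05 - 0.15j):
        z = z_of_w(w, n)
        derivative = (z_of_w(w + h, n) - z_of_w(w - h, n)) / (2 * h)
        print(
            n,
            abs(abs(derivative) ** 2 - area_factor(w, n)) < 1e-5 * area_factor(w, n),
            abs(abs(cmath.sin(z)) - abs(reduced_sine(w, n))) < 1e-9,
            abs(math.sqrt(abs(z)) - abs(w + math.pi * n) / math.sqrt(math.pi)) < 1e-12,
        )
```

The second identity holds because $z=w^2/\pi+2nw+\pi n^2$ and a shift of the argument by $\pi n^2$ changes the sine only by a sign.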
Proof of Theorem 9.5 (Continued). Let us return to the proof of Theorem 9.5. We denote $A_1=\{z\in\mathbb{C};\ \operatorname{Re}z\ge0\}$, $A_2=\{z\in\mathbb{C};\ \operatorname{Re}z\le0\}$, and we shall show that $\psi\in L^2(A_j)$, $j=1,2$. More precisely, we shall prove only the membership $\psi\in L^2(A_1)$; the property $\psi\in L^2(A_2)$ can be shown analogously. We split the set $A_1$ into a union $A_1=P_1\cup P_2$ with $P_1=\{z\in A_1;\ 0\le\operatorname{Im}z\le b\}$, $P_2=A_1\setminus P_1$. The bound $b$ is chosen so that the strip $P_1$ contains the set $\Omega$ in its interior.

From the assumptions it follows that the functions $V_{kj}(z)$ and $W_j(z)$ are of growth $(1/2,0)$, i.e. for every $\varepsilon>0$ there exists a constant $C_\varepsilon>0$ such that $\max_{j,k}\{|V_{kj}(z)|,|W_k(z)|\}\le C_\varepsilon\exp(\varepsilon|z|^{1/2})$ (cf. Appendix B). Let us denote
$$G(z,\bar z)=\frac{1}{\sin(\sqrt{-\pi\alpha z})}\prod_{l=1}^m|W_l(z)|^{1-\theta_l'}\prod_{j=1}^n\prod_{k=1}^{m_j}|V_{kj}(z)|^{\theta_{kj}}.\tag{85}$$
Hence
$$\psi(z,\bar z)=\frac{\sin(\alpha z)}{\sin(\sqrt{\pi\alpha z})}\,G(z,\bar z)\prod_{j=1}^n\tilde g_j(z,\bar z)$$
where
$$\tilde g_j(z,\bar z)=\prod_{k=1}^{m_j}\left|\sin\frac{\pi(z-\kappa_{kj})}{\omega_j}\right|^{-\theta_{kj}}.$$
According to Claim (A) there exist constants $c_1,c_2>0$ such that
$$|G(z,\bar z)|\le c_1\exp(-c_2|z|^{1/2}),\tag{86}$$
for all $z\in A_1$. With the aid of Lemma 9.7 and formula (86) we can estimate the function $\psi$ on the strip $P_1$,
$$|\psi(z,\bar z)|\le c'(|\sqrt{\pi\alpha z}|+1)\exp(-c_2|z|^{1/2})\prod_{j=1}^n\tilde g_j(z,\bar z).$$
The singularities of $\psi$ in $P_1$ are square integrable and therefore, similarly as in the proof of Theorem 8.4, we obtain

(1) $\psi\in L^2(P_1)$.

Since $\Omega\subset P_1$, on $P_2\cap\alpha^{-1}Q(\varepsilon)$ we have the estimate
$$|\psi(z,\bar z)|\le c''\left|\frac{\sin(\alpha z)}{\sin(\sqrt{\pi\alpha z})}\right|\exp(-c_2|z|^{1/2}).$$
If $\varepsilon$ is small enough then Lemma 9.8 implies that

(2) $\psi\in L^2(P_2\cap\alpha^{-1}Q(\varepsilon))$.

Finally, for $z\in P_2\setminus\alpha^{-1}Q(\varepsilon)$ we have $|\sin(\sqrt{\pi\alpha z})|\ge\varepsilon$ and hence the inequality
$$|\psi(z,\bar z)|\le c'''|\sin(\alpha z)|\prod_{j=1}^n\tilde g_j(z,\bar z)$$
holds true. We conclude, similarly as in the proof of Theorem 8.4, that if
$$0 < \alpha < \sum_{j=1}^{n} \sum_{k=1}^{m_j} \frac{\pi\theta_{kj}}{\omega_j}$$
then

(3) $\psi \in L^2(P_2 \setminus Q_\alpha(\varepsilon))$.

This concludes the proof of the theorem.

9.4. Addition of solenoids to an Aharonov–Bohm lattice

The case when the Aharonov–Bohm fluxes are arranged in a lattice $\Omega$ has been discussed in Theorem 8.8. It turns out that for a sparse perturbation the result stated in the theorem is still true.

Theorem 9.9. Suppose that $\Omega$ is a lattice, i.e. $\Omega_j = \kappa_j + \Lambda$ where $\Lambda$ is a Bravais lattice of rank 2. Suppose further that the genus of the set $\Omega'$ fulfills $p' \le 1$. Then the Hamiltonians $H^{\pm}_{\max}(a')$ have infinitely degenerate zero modes.

Proof. Let $\tilde W(z)$ be the Weierstrass canonical product for the set $\Omega'$, let $\varrho'$ be the growth order of $\tilde W(z)$, and let $\tau'$ be the convergence exponent of $\Omega'$. Then $\varrho' = \tau' \le p' + 1 \le 2$ (see Appendix B). If $\varrho' < 2$ then $\tilde W(z)$ is of growth $(2, 0)$. If $\varrho' = 2 = p' + 1$ then, by Theorem B.5(b), $\tilde W(z)$ is of minimal type. Consequently, also in the latter case $\tilde W(z)$ is of growth $(2, 0)$. This means that for any $c > 0$ there exist $a > 0$ and $R > 0$ such that for all $z$, $|z| > R$, it holds true that
$$|\tilde W(z)| \le a \exp\left(c|z|^2\right) . \tag{87}$$
The same observation is clearly true for any subset of $\Omega'$. In particular, the functions $V_j(z)$ and $W_\ell(z)$ are of growth $(2, 0)$ and obey estimates similar to (87).

Zero modes of $H^{+}_{\max}(a')$ are gauge equivalent to functions of the form
$$\psi(z, \bar z) = f(z) \prod_{j=1}^{n} |\tilde\sigma(z - \kappa_j)|^{-\theta_j} \prod_{j=1}^{n} |V_j(z)|^{\theta_j} \prod_{\ell=1}^{m} |W_\ell(z)|^{1-\theta'_\ell} .$$
From Lemma 8.7 and from the estimate (87) with sufficiently small $c > 0$ it follows that $\psi$ is square integrable if $f(z)$ is an arbitrary polynomial.

9.5. Irregular systems of Aharonov–Bohm solenoids

All the preceding results were concerned with Aharonov–Bohm systems $\Omega$ with bounded density, i.e. for which $\limsup_{r\to\infty} n(r)/r^2 < \infty$ (cf. Eq. (65)). Here we show that zero modes may occur also in systems with infinite density, more precisely, in systems for which $\liminf_{r\to\infty} n(r)/r^2 = \infty$. Moreover, we shall present examples
of systems $\Omega$ with arbitrarily large convergence exponent $\tau_\Omega$. Let us fix a natural number $N \ge 2$ and set
$$\Omega_N = \{e^{\pi i k/N} m^{1/N};\ m \in \mathbb{N},\ k = 0, 1, \ldots, 2N - 1\} .$$
Obviously, the convergence exponent $\tau$ for the set $\Omega_N$ equals $N$. In particular, for $N > 2$ we have $\lim_{r\to\infty} n(r)/r^2 = \infty$. Let $\theta$ be an arbitrary number from the interval $]0, 1[$. Then the vector potential $a$ of the Aharonov–Bohm system determined by the couple $(\Omega_N, \theta)$ reads $a(z, \bar z) = \theta\, \operatorname{sgrad} \ln(|W(z)|)$ where
$$W(z) = \frac{\sin(\pi z^N)}{z^{N-1}} .$$
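As an illustration of the set $\Omega_N$ and of its density, the following Python sketch enumerates the points of $\Omega_N$ in a disc of integer radius $r$ and checks that they are zeros of $\sin(\pi z^N)$. The function names are hypothetical, and the sketch assumes that $\mathbb{N}$ starts at $m = 1$. Under that assumption the count is $n(r) = 2N r^N$, so $\ln n(r)/\ln r = N + \ln(2N)/\ln r$ tends to $N$, and $n(r)/r^2$ grows without bound for $N > 2$.

```python
import cmath
import math

def omega_N(N, r):
    """Points exp(pi*i*k/N) * m**(1/N) of Omega_N with m <= r**N.

    r is assumed to be a positive integer, so that |z| <= r is
    equivalent to m <= r**N without any floating-point comparison.
    """
    points = []
    for m in range(1, r ** N + 1):
        for k in range(2 * N):
            points.append(cmath.exp(1j * math.pi * k / N) * m ** (1.0 / N))
    return points

def max_residual(N, r):
    """Largest |sin(pi * z^N)| over the generated points (near zero)."""
    return max(abs(cmath.sin(math.pi * z ** N)) for z in omega_N(N, r))

if __name__ == "__main__":
    N = 3
    for r in (2, 4, 8):
        n_r = len(omega_N(N, r))
        print(r, n_r, n_r / r ** 2, math.log(n_r) / math.log(r))
```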
Theorem 9.10. The Hamiltonians $H^{\pm}_{\max}(a)$ have infinitely degenerate zero modes.

Proof. It is sufficient to show that for $0 < \alpha < \pi\theta$ the function
$$\psi(z, \bar z) = \frac{\sin(\alpha z^N)}{z^N}\, |W(z)|^{-\theta}$$
is square integrable. Set
$$S = \left\{z \in \mathbb{C};\ -\frac{\pi}{2N} < \arg z < \frac{3\pi}{2N}\right\} .$$
Then
$$\iint_{\mathbb{R}^2} |\psi(z, \bar z)|^2\, dx\,dy = N \iint_{S} |\psi(z, \bar z)|^2\, dx\,dy$$
and therefore it suffices to verify that
$$\iint_{S} |\psi(z, \bar z)|^2\, dx\,dy < \infty . \tag{88}$$
Let us make a substitution of the integration variable in (88), $w = z^N$ where $w = u + iv$, $u, v \in \mathbb{R}$. Since $du \wedge dv = N^2 |z|^{2N-2}\, dx \wedge dy$ we can rewrite the integral as
$$\iint_{S} |\psi(z, \bar z)|^2\, dx\,dy = \frac{1}{N^2} \iint_{\mathbb{R}^2} \frac{1}{|w|^{\beta}}\, |\sin(\alpha w)|^2\, |\sin(\pi w)|^{-2\theta}\, du\,dv \tag{89}$$
where
$$\beta = 4 - 2\theta - 2\,\frac{1-\theta}{N} .$$
Since $2 - 2\theta - \beta > -2$ and $\beta > 1$ one can show that the integral (89) is finite using a reasoning as in the proof of Theorem 8.4.
Acknowledgments

V.A.G. was supported by the Grants of RFBR (no. 02-01-00804), DFG–RAS (no. 436 RUS 113/572/0-2) and INTAS (no. 00-257). P.Š. gratefully acknowledges the support of the Ministry of Education of the Czech Republic under the research plan MSM210000018.
Appendix A. Lattices

Here we collect basic definitions and some auxiliary results about lattices. Let $E$ be a finite-dimensional real Euclidean space with dimension $d$. A discrete subgroup $\Lambda$ of the additive group $E$ is called a Bravais lattice. For any Bravais lattice $\Lambda$ there exist linearly independent vectors $\omega_1, \ldots, \omega_r \in E$ such that
$$\Lambda = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2 + \cdots + \mathbb{Z}\omega_r .$$
The array $(\omega_j)_{1\le j\le r}$ is called a basis of the Bravais lattice $\Lambda$. The integer $r$ does not depend on the choice of basis and is called the rank of the lattice $\Lambda$. To every basis $(\omega_j)_{1\le j\le r}$ one relates the elementary cell $F$, $F \subset E$, formed by all $x \in E$ whose orthogonal projection $x'$ onto the linear span $L$ of the lattice $\Lambda$ has a decomposition
$$x' = t_1\omega_1 + t_2\omega_2 + \cdots + t_r\omega_r , \quad \text{with } 0 \le t_j < 1 \text{ for all } j.$$
If $r = d$ (the rank of the lattice $\Lambda$ is the maximal possible) then $F$ is a convex parallelepiped. In the opposite case $F$ is not even bounded.

A non-empty discrete subset $\Omega \subset E$ is called a crystal with the Bravais lattice $\Lambda$ if it is invariant with respect to the action of $\Lambda$ on $E$ and has a finite number of orbits. Obviously, every crystal $\Omega$ can be written in the form $\Omega = K + \Lambda$ where $K \subset E$ is a finite set whose number of elements equals the number of orbits. Without loss of generality we may assume that $K \subset F$. Conversely, every set of the form $\Omega = K + \Lambda$ is a crystal. If $|K| = 1$ then the crystal is called mono-atomic or simple. In the general case when $|K| = n$ the crystal $\Omega$ is called $n$-atomic. If $r = 1$ then $\Omega$ is called a chain (a simple chain if in addition $|K| = 1$). If $r = d$ then $\Omega$ is called a lattice (more precisely, a crystal lattice) in the space $E$. In other words, such a crystal is a discrete subset $\Omega \subset E$ whose group of parallel translations acts co-compactly on $E$. Let us note that in our definition we do not exclude the case $r = 0$. If so then $\Lambda = \{0\}$ and a crystal with the Bravais lattice $\Lambda$ is simply a finite subset of $E$.

It is worth noticing that a crystal $\Omega$ is always a uniformly discrete subset of $E$. This means that there exists a constant $c > 0$ such that $|\omega' - \omega''| \ge c$ whenever $\omega', \omega'' \in \Omega$, $\omega' \ne \omega''$. We shall need the following lemma.

Lemma A.1. Assume that $\dim E = 1$ and that $\Omega_1, \ldots, \Omega_n$ are chains in $E$. The union $\Omega = \Omega_1 \cup \cdots \cup \Omega_n$ is a chain if and only if $\Omega$ is a uniformly discrete set.

Proof. We only need to prove that this condition is sufficient. Moreover, it suffices to consider the case $n = 2$. The general case then follows by mathematical induction. Let us write
$$\Omega_j = K_j + \Lambda_j \quad (j = 1, 2) ,$$
where $\Lambda_j$ is the Bravais lattice of $\Omega_j$. Let us identify $E$ with $\mathbb{R}$. Then $\Lambda_j = \mathbb{Z}\omega_j$, with $\omega_j > 0$.
We shall show that the number $p = \omega_1/\omega_2$ is rational. Actually, in the opposite case the set $\mathbb{Z}\omega_1 + \mathbb{Z}\omega_2$ would be dense in $\mathbb{R}$. Let us choose $\kappa_1 \in K_1$ and $\kappa_2 \in K_2$. We can find a sequence $n_k\omega_2 - m_k\omega_1$ ($m_k, n_k \in \mathbb{Z}$) converging to $\kappa_1 - \kappa_2$ and such that $\kappa_1 - \kappa_2 \ne n_k\omega_2 - m_k\omega_1$ for all $k$. Obviously, this contradicts the assumption that the set $\Omega$ is uniformly discrete. Hence $p = N/M$, with $N, M \in \mathbb{N}$. Then $M\omega_1 = N\omega_2$ and therefore $\Omega$ is invariant with respect to the lattice $\Lambda$ with the basis vector $M\omega_1$. It is easy to see that the number of orbits of the group $\Lambda$ in $\Omega$ is finite. But this means that $\Omega$ is a chain.

Remark A.2. For $\dim E > 1$ an analog of Lemma A.1 is false. Actually, already for $\dim E = 2$ it can happen that a union of two simple lattices is uniformly discrete but not a lattice. This is demonstrated by the following example. Let $E = \mathbb{R}^2$. Let $\Lambda_1$ be a Bravais lattice with the basis $\omega_1 = e_1$, $\omega_2 = e_2$, and let $\Lambda_2$ be a Bravais lattice with the basis $\omega'_1 = \sqrt{2}\,e_1$, $\omega'_2 = e_2$. Consider the crystal lattices $\Omega_1 = \Lambda_1$ and $\Omega_2 = \kappa + \Lambda_2$ where $\kappa = (1/2)e_2$. Obviously, $\Omega = \Omega_1 \cup \Omega_2$ is a uniformly discrete set. Let $\Lambda$ be a group of parallel translations acting on $\Omega$. Suppose that the number of orbits of the group $\Lambda$ in $\Omega$ is finite. Then there exist $n, m \in \mathbb{Z}$, $n \ne m$, such that $n\omega_1$ and $m\omega_1$ belong to the same orbit. Hence $k\omega_1 \in \Lambda$ for some $k \in \mathbb{Z}$, $k \ne 0$. But $\kappa + k\omega_1 \notin \Omega$.

Appendix B. Auxiliary Results from the Theory of Analytic Functions

Here we recall some results from the theory of analytic functions that are necessary for our presentation; for the details see [101, 102, 106]. For an entire function $f$ we set
$$M_f(r) \equiv M(r) = \max_{|z|=r} |f(z)| = \max_{|z|\le r} |f(z)| . \tag{B.1}$$
The order (more precisely, the growth order) of an entire function $f$ is the number
$$\varrho_f \equiv \varrho = \inf\{\alpha;\ \exists R_\alpha > 0,\ \forall r > R_\alpha,\ M(r) < \exp(r^\alpha)\} , \tag{B.2}$$
or, equivalently,
$$\varrho_f = \limsup_{r\to\infty} \frac{\ln\ln(M(r))}{\ln(r)} . \tag{B.3}$$
Let us note that if $f(z) = \sum_{n=0}^{\infty} a_n z^n$ then
$$\varrho_f = \limsup_{n\to\infty} \frac{\ln(n)}{\ln(|a_n|^{-1/n})} . \tag{B.4}$$
If $\varrho_f < \infty$ then one says that $f$ is a function of finite order. For a function of finite order $\varrho$ the number
$$\varsigma_f \equiv \varsigma = \inf\{K > 0;\ \exists R_K > 0,\ \forall r > R_K,\ M(r) < \exp(K r^{\varrho})\} \tag{B.5}$$
is well defined and is called the type of the function $f$. The type can be equivalently defined by the formula
$$\varsigma_f = \limsup_{r\to\infty} \frac{\ln M(r)}{r^{\varrho}} . \tag{B.6}$$
Moreover, $\varsigma_f$ can be expressed in terms of the Taylor coefficients $a_n$,
$$(\varsigma_f\, e\, \varrho_f)^{1/\varrho} = \limsup_{n\to\infty} n^{1/\varrho} |a_n|^{1/n} . \tag{B.7}$$
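Formula (B.7) can be checked in the same spirit. The following sketch, again an illustrative assumption with a hypothetical function name, solves (B.7) for the type in the case $f(z) = e^z$, where $a_n = 1/n!$, $\varrho = 1$, and the expected type is $\varsigma = 1$.

```python
import math

def type_estimate_exp(n, rho=1.0):
    """Solve (sigma*e*rho)^(1/rho) = n^(1/rho) * |a_n|^(1/n) for sigma.

    Uses a_n = 1/n!, so ln|a_n|^(1/n) = -ln(n!)/n; the computation is
    done in logarithms to avoid overflow of n!.
    """
    log_rhs = math.log(n) / rho - math.lgamma(n + 1) / n
    return math.exp(rho * log_rhs) / (math.e * rho)

if __name__ == "__main__":
    for n in (10, 10**3, 10**6):
        print(n, type_estimate_exp(n))
```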
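If $\varrho_f < \infty$ and $\varsigma_f = 0$ then $f$ is called a function of minimal type. A function $f$ of finite order $\varrho$ and of finite type $\varsigma$ obeys the estimate
$$|f(z)| \le \exp((\varsigma + \varepsilon)|z|^{\varrho}) \tag{B.8}$$
for arbitrary $\varepsilon > 0$ and $|z|$ greater than a constant $R_\varepsilon > 0$. Conversely, if the estimate
$$|f(z)| \le c \exp(\varsigma_1 |z|^{\varrho_1}) \tag{B.9}$$
is fulfilled with a constant $c > 0$ then the function $f$ has both a finite order and a finite type, and it holds $\varrho_f \le \varrho_1$ and $\varsigma_f \le \varsigma_1$. A couple of numbers $(\varrho_1, \varsigma_1)$, $0 \le \varrho_1, \varsigma_1 \le \infty$, determines the growth of a function $f(z)$ if $\varrho_f \le \varrho_1$ and $\varrho_f = \varrho_1$ implies $\varsigma_f \le \varsigma_1$. Functions of growth $(1, \varsigma_1)$, with $\varsigma_1 < \infty$, are said to have exponential growth.

The order and the type of an entire function on one side and the distribution of its zeroes on the other side are deeply related. Let us arrange all nonzero elements of a discrete set $\Omega \subset \mathbb{C}$ in a sequence $\Omega^* = (\omega_k)_{k\ge 1}$ which is ascending in the absolute value and ascending in the argument ($0 \le \arg z < 2\pi$) in the case of equal absolute values. The convergence exponent of the set $\Omega$ (or of the sequence $\Omega^*$) is the number
$$\tau_\Omega \equiv \tau = \inf\left\{\alpha > 0;\ \sum_{k=1}^{\infty} \frac{1}{|\omega_k|^{\alpha}} < \infty\right\} , \tag{B.10}$$
or, equivalently,
$$\tau_\Omega = \limsup_{k\to\infty} \frac{\ln(k)}{\ln(|\omega_k|)} . \tag{B.11}$$
If $\tau_\Omega$ is finite then the number
$$p_\Omega \equiv p = \begin{cases} \max\left\{n \in \mathbb{N};\ \sum_{k=1}^{\infty} \dfrac{1}{|\omega_k|^{n}} = \infty\right\} & \text{if } \Omega \text{ is infinite} , \\ -\infty & \text{if } \Omega \text{ is finite} , \end{cases} \tag{B.12}$$
is well defined and is called the genus of the set $\Omega$ (or of the sequence $\Omega^*$). For $\tau_\Omega = \infty$ we set $p_\Omega = \infty$. For $r > 0$ we set
$$n_\Omega(r) \equiv n(r) = \#\{\omega \in \Omega;\ |\omega| \le r\} . \tag{B.13}$$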
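The counting function (B.13) is easy to evaluate for a concrete set. The sketch below is an illustration under an assumption not made in the text: it takes $\Omega$ to be the square lattice $\mathbb{Z} + i\mathbb{Z}$ and an integer radius $r$. For this set $n(r)/r^2$ approaches $\pi$ (the inverse area of the elementary cell times $\pi$), so the density is bounded. The function name is hypothetical.

```python
import math

def counting_function(r):
    """n(r) = #{(a, b) in Z^2 : a^2 + b^2 <= r^2} for an integer r >= 0.

    The point 0 is included, in agreement with (B.13).
    """
    count = 0
    for a in range(-r, r + 1):
        # largest b >= 0 with a^2 + b^2 <= r^2, computed exactly
        b_max = math.isqrt(r * r - a * a)
        count += 2 * b_max + 1
    return count

if __name__ == "__main__":
    for r in (1, 10, 100, 1000):
        n_r = counting_function(r)
        print(r, n_r, n_r / r ** 2)
```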
For a non-empty set $\Omega$ the formula
$$\tau_\Omega = \limsup_{r\to\infty} \frac{\ln n(r)}{\ln(r)} , \tag{B.14}$$
shows that the convergence exponent of a discrete set characterizes its density [106, Theorem 2.5.8]. On the other hand, let $\Omega_f \equiv \Omega$ be the zero set of an entire function $f$. Then the following fundamental inequality is valid (the Hadamard theorem, see for example [106, Theorem 2.5.18]):
$$\tau_\Omega \le \varrho_f . \tag{B.15}$$
If $\varrho_f$ is not an integer then $\tau_\Omega = \varrho_f$ [106, Theorem 2.9.1]. From (B.15) we deduce that
$$n(r) = O(r^{\varrho+\varepsilon}) \tag{B.16}$$
for any $\varepsilon > 0$. If $\varrho_f > 0$ and $\varsigma_f < \infty$ then a stronger estimate is valid [106, Theorem 2.5.13]:
$$L \equiv \limsup_{r\to\infty} r^{-\varrho} n(r) \le e\varrho_f\varsigma_f , \qquad l \equiv \liminf_{r\to\infty} r^{-\varrho} n(r) \le \varrho_f\varsigma_f . \tag{B.17}$$
Moreover, if $\tau_\Omega > 0$ then $L e^{l/L} \le e\varrho_f\varsigma_f$, in particular, $L + l \le e\varrho_f\varsigma_f$. For $\varrho_f < \infty$ it can never happen that $n_\Omega(r) = o(r^{\varrho-\varepsilon})$ [106, Theorem 2.9.3]. The following theorem is due to Lindelöf (see [106, Theorems 2.9.5 and 2.10.1]).

Theorem B.1. (1) Assume that $\varrho \equiv \varrho_f < \infty$ is not an integer. An entire function $f(z)$ is of finite type if and only if $n_\Omega(r) = O(r^{\varrho})$, and it is of minimal type if and only if $n_\Omega(r) = o(r^{\varrho})$.
(2) Assume that $\varrho_f$ is a positive integer. The function $f(z)$ is of finite type if and only if $n_\Omega(r) = O(r^{\varrho})$ and the sums
$$S(r) = \sum_{0 < |\omega_k| \le r} \omega_k^{-\varrho}$$
are bounded.

An entire function with simple zeroes is determined by its zero set up to a multiplier $e^{g(z)}$ where $g(z)$ is an entire function. Furthermore, for an arbitrary discrete set $\Omega \subset \mathbb{C}$ there exists an entire function $f(z)$ with simple zeroes whose zero set coincides with $\Omega$. Denote by $E(u, p)$ the Weierstrass canonical multiplier, with $u \in \mathbb{C}$ and with $p \in \mathbb{N}$,
$$E(u, p) = (1 - u) \exp\left(u + \frac{u^2}{2} + \cdots + \frac{u^p}{p}\right)$$
(by definition, $E(u, 0) = 1 - u$). Let $\Omega^* = (\omega_k)_{k\ge 1}$ be, as above, the sequence formed by all nonzero elements of the set $\Omega$ appropriately enumerated. Let us denote by
$\chi_\Omega \equiv \chi$ an integer which is equal to 1 if $0 \in \Omega$ and to 0 in the opposite case. The Weierstrass canonical product associated to $\Omega$ is by definition an entire function $W_\Omega(z)$ defined by the infinite product
$$z^{\chi} \prod_{k=1}^{\infty} E(z/\omega_k, p_\Omega) \tag{B.18}$$
if the convergence exponent of $\Omega$ is finite, and by the infinite product
$$z^{\chi} \prod_{k=1}^{\infty} E(z/\omega_k, k) \tag{B.19}$$
in the opposite case.
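A concrete instance of the product (B.18) may help. The sketch below is an illustration under assumptions not made in the text: it takes $\Omega = \mathbb{Z}$, for which $\chi = 1$ and the genus is $p_\Omega = 1$, and it truncates the product after the zeros $\pm 1, \ldots, \pm K$. The function names are hypothetical. Since $E(z/k, 1)E(-z/k, 1) = 1 - z^2/k^2$, the partial products approach the classical value $\sin(\pi z)/\pi$.

```python
import cmath
import math

def E(u, p):
    """Weierstrass canonical multiplier (1 - u) exp(u + u^2/2 + ... + u^p/p)."""
    return (1 - u) * cmath.exp(sum(u ** k / k for k in range(1, p + 1)))

def canonical_product_Z(z, K):
    """Partial product z * prod E(z/w, 1) over w = 1, -1, 2, -2, ..., K, -K."""
    value = complex(z)
    for k in range(1, K + 1):
        value *= E(z / k, 1) * E(-z / k, 1)
    return value

if __name__ == "__main__":
    z = 0.3 + 0.4j
    print(canonical_product_Z(z, 20000), cmath.sin(math.pi * z) / math.pi)
```

The truncation error decays only like $|z|^2/K$, so a fairly large $K$ is needed even for moderate accuracy.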
Theorem B.2 (Weierstrass, Hadamard). The infinite product defining the function $W_\Omega(z)$ converges absolutely and locally uniformly. Consequently, $W_\Omega(z)$ is an entire function and its zero set coincides with $\Omega$. Moreover, the zero set of an entire function $f(z)$ with simple zeroes only equals $\Omega$ if and only if the function $f(z)$ is of the form
$$f(z) = e^{g(z)} W_\Omega(z) \tag{B.20}$$
where $g(z)$ is an entire function.

The growth order of the function $W_\Omega(z)$ equals the convergence exponent of the set $\Omega$. Moreover, the following theorem is true.

Theorem B.3 (Hadamard). If the function $f$ from relation (B.20) has a finite order $\varrho_f$ then $g(z)$ is a polynomial with a degree not exceeding $[\varrho_f]$.

Theorem B.4 (Borel Theorem). Conversely, if $\tau_\Omega < \infty$ and $f(z)$ is a function written in the form (B.20) where $g(z)$ is a polynomial of degree $n$ then $f(z)$ has a finite order $\varrho_f = \max(\tau, n)$. If either $\tau_\Omega < n$ or the series $\sum_{k=1}^{\infty} |\omega_k|^{-\tau}$ is convergent then the function $f(z)$ is of finite type.

The genus of a function $f(z)$ having the form (B.20), where $g(z)$ is a polynomial of degree $n$, is the integer $q_f \equiv q = \max(n, p_\Omega)$. The following theorem is a useful completion of Theorem B.1 due to Lindelöf [106, Theorem 2.10.3].

Theorem B.5. Under the assumptions of Theorem B.1 let $\varrho_f$ be a positive integer. A function $f(z)$ written in the form (B.20), where $g(z)$ is a polynomial, is of minimal type if and only if one of the following conditions is satisfied:
(a) $n_\Omega(r) = o(r^{\varrho})$, $p_\Omega = \varrho_f$, and
$$\sum_{k=1}^{\infty} \omega_k^{-\varrho} = -\varrho_f \alpha_0$$
where $\alpha_0$ is the coefficient (possibly vanishing) standing at $z^{\varrho}$ in the polynomial $g(z)$,
(b) $p_\Omega = \varrho_f - 1$ and $\alpha_0 = 0$.
In particular, if $q_f < \varrho_f$ then $f(z)$ is of minimal type.

We shall also need the following particular case of the Mittag–Leffler theorem (see [101, II.7.3.2] or [102]).

Theorem B.6. For an arbitrary discrete subset $\Omega$ of the complex plane $\mathbb{C}$ and for an arbitrary sequence of complex numbers $(\theta_\omega)_{\omega\in\Omega}$ there exists a meromorphic function $M(z)$ obeying the following conditions:
(1) $M(z)$ has only simple poles,
(2) the set of poles of the function $M(z)$ coincides with $\Omega$,
(3) the residue of $M(z)$ at the point $\omega$ equals $\theta_\omega$.

Appendix C. The Weierstrass σ-Function and Related Functions

In our approach an important role is played by the order $\varrho$ and by the type $\varsigma$ of the Weierstrass σ-function $\sigma(z)$,
$$\sigma(z) \equiv \sigma(z; \omega_1, \omega_2) = z \prod_{\omega\in\Lambda\setminus\{0\}} \left(1 - \frac{z}{\omega}\right) \exp\left(\frac{z}{\omega} + \frac{z^2}{2\omega^2}\right) .$$
It is easy to see that the convergence exponent of any lattice in the plane equals 2. Actually, the series
$$\sideset{}{'}\sum_{n_1, n_2 = -\infty}^{+\infty} \frac{1}{|n_1\omega_1 + n_2\omega_2|^{\alpha}} \tag{C.1}$$
(the dash indicates, as usual, that the summand with indices $n_1 = n_2 = 0$ is omitted) converges if and only if $\alpha > 2$. Hence, by the Borel theorem, $\varrho = 2$. Since for $\alpha = 2$ the series (C.1) diverges the Borel theorem does not say anything about the type of the function $\sigma(z)$ (apart from the fact that it is finite). The type of this function has been found in the general case by A. M. Perelomov [95]. In order to make our presentation self-contained we reproduce below some details from his derivation.

Let us start from recalling the notation
$$\zeta(z) = \sigma'(z)/\sigma(z) , \qquad \eta_j = 2\zeta\left(\frac{\omega_j}{2}\right) \tag{C.2}$$
and the fact that the σ-function is quasi-periodic in the following sense:
$$\sigma(z + \omega_j) = -\sigma(z) \exp\left(\eta_j\left(z + \frac{\omega_j}{2}\right)\right) . \tag{C.3}$$
Recall also that $S = \operatorname{Im}(\bar\omega_1\omega_2)$ designates the area of the elementary cell.

Lemma C.1 ([95]). The function $|\sigma(z)|^2$ can be expressed in the form
$$|\sigma(z)|^2 = \exp(\nu z^2 + \bar\nu\bar z^2 + 2\mu z\bar z)\,\rho(z, \bar z) \tag{C.4}$$
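where $\rho$ is a $\Lambda$-periodic function,
$$\nu = \frac{i}{4S}(\eta_1\bar\omega_2 - \eta_2\bar\omega_1) , \qquad \mu = \frac{\pi}{2S} . \tag{C.5}$$

Proof. From (C.3) we obtain
$$|\sigma(z + \omega_j)|^2 = |\sigma(z)|^2 \exp\left(2\operatorname{Re}\left(\eta_j\left(z + \frac{\omega_j}{2}\right)\right)\right) . \tag{C.6}$$
On the other hand, the function $\rho$ defined by equality (C.4), with $\nu \in \mathbb{C}$ and $\mu \in \mathbb{R}$, is periodic if and only if
$$|\sigma(z + \omega_j)|^2 = \exp\left(2\nu z\omega_j + 2\bar\nu\bar z\bar\omega_j + 2\mu(z\bar\omega_j + \bar z\omega_j) + \nu\omega_j^2 + \bar\nu\bar\omega_j^2 + 2\mu\omega_j\bar\omega_j\right) |\sigma(z)|^2 . \tag{C.7}$$
Comparing (C.6) to (C.7) and taking into account the equality
$$2\operatorname{Re}\left(\eta_j\left(z + \frac{\omega_j}{2}\right)\right) = \eta_j z + \bar\eta_j\bar z + \frac{\eta_j\omega_j}{2} + \frac{\bar\eta_j\bar\omega_j}{2} ,$$
we arrive at the system
$$\nu\omega_j + \mu\bar\omega_j = \frac{1}{2}\eta_j , \quad j = 1, 2 , \qquad \nu\omega_j^2 + \bar\nu\bar\omega_j^2 + 2\mu\omega_j\bar\omega_j = \frac{1}{2}(\eta_j\omega_j + \bar\eta_j\bar\omega_j) , \quad j = 1, 2 . \tag{C.8}$$
The first couple of equations in (C.8) gives
$$\nu = \frac{1}{2}\,\frac{\eta_1\bar\omega_2 - \eta_2\bar\omega_1}{\omega_1\bar\omega_2 - \bar\omega_1\omega_2} , \qquad \mu = \frac{1}{2}\,\frac{\omega_1\eta_2 - \omega_2\eta_1}{\omega_1\bar\omega_2 - \bar\omega_1\omega_2} . \tag{C.9}$$
Since $\omega_1\bar\omega_2 - \bar\omega_1\omega_2 = -2iS$ and by virtue of the Legendre relation
$$\eta_1\omega_2 - \eta_2\omega_1 = 2\pi i \tag{C.10}$$
we find that relations (C.9) and (C.5) are equivalent. Using (C.9) and the fact that $\mu$ is real one can check that the second couple of equations in (C.8) is satisfied identically.

Lemma C.2 ([95]). The type of the function $\sigma(z; \omega_1, \omega_2)$ is given by the equality
$$\varsigma = |\nu| + \mu = \frac{1}{4S}\left(|\eta_1\bar\omega_2 - \eta_2\bar\omega_1| + 2\pi\right) . \tag{C.11}$$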
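Formulas (C.5), (C.9), (C.10) and (C.11) can be cross-checked numerically on the two lattices for which the values of $\eta_1$, $\eta_2$ are quoted in Remark C.4. The sketch below is an illustration only; the function names and the unit lattice constants are assumptions. It solves the linear system for $\nu$ and $\mu$ as in (C.9) and compares the result with $\mu = \pi/2S$, with the identity (C.10), and with $\nu = 0$ for the square and hexagonal lattices.

```python
import cmath
import math

def nu_mu(w1, w2, eta1, eta2):
    """Solve nu*w_j + mu*conj(w_j) = eta_j/2 (j = 1, 2), cf. (C.9)."""
    det = w1 * w2.conjugate() - w1.conjugate() * w2
    nu = 0.5 * (eta1 * w2.conjugate() - eta2 * w1.conjugate()) / det
    mu = 0.5 * (w1 * eta2 - w2 * eta1) / det
    return nu, mu

def square_lattice(a=1.0):
    """Periods and eta-values as quoted in Remark C.4 (a is an assumed scale)."""
    w1, w2 = complex(a), 1j * a
    return w1, w2, math.pi / w1, -1j * math.pi / w1

def hexagonal_lattice(k=1.0):
    """Periods and eta-values as quoted in Remark C.4 (k is an assumed scale)."""
    w1 = k * cmath.exp(-1j * math.pi / 3)
    w2 = k * cmath.exp(1j * math.pi / 3)
    pref = 2 * math.pi / (math.sqrt(3) * (w1 + w2))
    return w1, w2, pref * cmath.exp(1j * math.pi / 3), pref * cmath.exp(-1j * math.pi / 3)

def report(w1, w2, eta1, eta2):
    S = (w1.conjugate() * w2).imag      # area of the elementary cell
    nu, mu = nu_mu(w1, w2, eta1, eta2)
    identity = eta1 * w2 - eta2 * w1    # should equal 2*pi*i, cf. (C.10)
    sigma_type = abs(nu) + mu.real      # the type according to (C.11)
    return S, nu, mu, identity, sigma_type

if __name__ == "__main__":
    for lattice in (square_lattice(), hexagonal_lattice()):
        print(report(*lattice))
```

For both lattices the computed type reduces to $\mu = \pi/2S$ because $\nu$ vanishes.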
Proof. Let us rewrite (C.4) as follows,
$$|\sigma(z)|^2 = \exp\left((\nu + \bar\nu)(x^2 - y^2) + 2i(\nu - \bar\nu)xy + 2\mu(x^2 + y^2)\right) \rho(z, \bar z) . \tag{C.12}$$
The quadratic form occurring in the exponent, $2\operatorname{Re}(\nu)(x^2 - y^2) - 4\operatorname{Im}(\nu)xy$, can be diagonalized with the aid of a rotation of the coordinate system. Eigenvalues of the corresponding symmetric matrix are $\lambda_1 = -\lambda_2 = 2|\nu|$. Set $\varepsilon = e^{i\varphi}$ where $\varphi$ is
the angle of the rotation. Since the quadratic form $x^2 + y^2$ is rotationally invariant we have
$$|\sigma(\varepsilon z)| = \exp\left((\mu + |\nu|)x^2 + (\mu - |\nu|)y^2\right) \rho^{1/2}(\varepsilon z, \bar\varepsilon\bar z) . \tag{C.13}$$
Owing to the periodicity of the function $\rho$ it holds true that
$$\max_{|z|=r} |\sigma(z)| = \max_{|z|=r} |\sigma(\varepsilon z)| \le c \exp\left((\mu + |\nu|)r^2\right) . \tag{C.14}$$
Consequently, $\varsigma \le |\nu| + \mu$. To show the opposite inequality it suffices to construct a sequence $z_k$ such that $|z_k| \to \infty$ and
$$|\sigma(\varepsilon z_k)| \ge c \exp\left((\mu + |\nu| - \delta_k)|z_k|^2\right) , \tag{C.15}$$
where $\delta_k \downarrow 0$ and $c > 0$ is a fixed constant. First we note that, by the uniqueness theorem for analytic functions in a real variable, there exists a point $z_0$ such that $\rho(z_0, \bar z_0) \ne 0$. Then there exists $c > 0$ such that $|\rho(z, \bar z)| > c$ on a neighborhood $V$ of $z_0$. This gives the choice of $c$. Further we consider the canonical mapping $h : \mathbb{R}^2 \to \mathbb{R}^2/\Lambda$. Two cases are possible: either the image $h(z_0 + \mathbb{R})$ is a closed curve in the torus $T = \mathbb{R}^2/\Lambda$ or this image is dense in $T$. In the former case there exists a sequence $\lambda_k \in \mathbb{R}$ such that $\lambda_k \to \infty$ and
$$h(z_0 + \lambda_k) = h(z_0) , \tag{C.16}$$
in the latter case condition (C.16) should be replaced by $h(z_0 + \lambda_k) \to h(z_0)$. In both cases condition (C.15) holds true with $z_k = z_0 + \lambda_k$.

Following [95] we introduce the function
$$\tilde\sigma(z) = e^{-\nu z^2}\sigma(z) .$$
Lemma C.1 implies the equality
$$|\tilde\sigma(z)|^2 = \exp(2\mu|z|^2)\,\rho(z, \bar z) . \tag{C.17}$$

Lemma C.3 ([95]). Let $f(z)$ be an entire function whose zero set coincides with $\Lambda = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2$, with all zeroes being simple. Then the order $\varrho_f$ is at least 2, $\varrho_f \ge 2$, and if $\varrho_f = 2$ then the type $\varsigma_f$ is at least $\mu$, $\varsigma_f \ge \mu = \pi/2S$. Moreover, in the case of the function $\tilde\sigma(z)$ the minimal values are achieved both for the order $\varrho$ and the type $\varsigma$, i.e. $\varrho_{\tilde\sigma} = 2$ and $\varsigma_{\tilde\sigma} = \mu = \pi/2S$.

Proof. Since the function $\sigma(z)$ is expressed as a Weierstrass canonical product its order equals the convergence exponent $\tau_\Lambda = 2$. Let us consider the entire function $f(z) = e^{-\alpha z^2}\sigma(z)$, with $\alpha \in \mathbb{C}$. Then
$$|f(z)|^2 = \exp\left(2\operatorname{Re}(\nu - \alpha)(x^2 - y^2) - 4\operatorname{Im}(\nu - \alpha)xy + 2\mu(x^2 + y^2)\right) \rho(z, \bar z) . \tag{C.18}$$
It is clear that the order of the function $f(z)$ equals 2, and similarly as in the proof of Lemma C.2, the type of $f$ equals $|\nu - \alpha| + \mu$. Obviously, the smallest type (namely, $\mu$) is achieved for $\alpha = \nu$. In particular, the function $\tilde\sigma(z)$ is of order 2 and its type equals $\mu$. Conversely, suppose that the zero set of an entire function $f(z)$ coincides with $\Lambda$ and that all zeroes of $f(z)$ are simple. Since $\tau_\Lambda = 2$ the Hadamard theorem implies that $\varrho_f \ge 2$. Suppose that $\varrho_f = 2$. We can write $f(z)$ in the form $f(z) = e^{g(z)}\sigma(z)$. By the Hadamard theorem, $g(z) = az^2 + bz + c$. If $a = 0$ then the type of $f(z)$ equals the type of $\sigma(z)$, if $a \ne 0$ then the type of $f(z)$ equals the type of $\exp(az^2)\sigma(z)$. In both cases the type of $f(z)$ is greater than or equal to $\mu$.

Remark C.4. If $\Lambda$ is a square or hexagonal lattice then $\nu = 0$ and, consequently, $\tilde\sigma(z) = \sigma(z)$. Actually, in the former case we can suppose that $\omega_1 > 0$, $\omega_2 = i\omega_1$. Then $\eta_1 = \pi/\omega_1$, $\eta_2 = -\pi i/\omega_1$ [104, 18.14.8 and 18.14.10], hence $\nu = 0$. In the latter case we can suppose that $\omega_1 = ke^{-i\pi/3}$, $\omega_2 = ke^{i\pi/3}$, with $k > 0$. Then
$$\eta_1 = \frac{2\pi e^{i\pi/3}}{\sqrt{3}(\omega_1 + \omega_2)} , \qquad \eta_2 = \frac{2\pi e^{-i\pi/3}}{\sqrt{3}(\omega_1 + \omega_2)}$$
[104, 18.13.16 and 18.13.19]. In this case, too, $\nu = 0$.

Remark C.5. There exist lattices for which $\nu \ne 0$ and, consequently, $\tilde\sigma(z) \ne \sigma(z)$. It suffices to consider a lattice with $\eta_2 = 0$ (such a lattice exists, see [104, 18.3.10]). Then, by the Legendre relation (C.10), $|\eta_1\omega_2| = 2\pi$ and hence $|\nu| = \pi/2S$. This means that the type of the σ-function for such a lattice equals $\pi/S$. Since $\nu$ depends on $(\omega_1, \omega_2)$ continuously any value of $|\nu|$ lying between 0 and $\pi/2S$ is realized by a suitable lattice.

References

[1] P. W. Brouwer, E. Racine, A. Furusaki, Y. Hatsugai, Y. Morita and C. Mudry, Zero-modes in the random hopping model, Phys. Rev. B66 (2002) 014204-1–11.
[2] M. F. Atiyah and I. M. Singer, The index of elliptic operators, Ann. Math. 87 (1968) 484–530.
[3] R. Jackiw and C. Rebbi, Solitons with fermion number 1/2, Phys. Rev. D13 (1977) 3398–3409.
[4] R. Jackiw and C. Rebbi, Spinor analysis of Yang–Mills theory, Phys. Rev. D16 (1977) 1052–1060.
[5] Y. Aharonov and A. Casher, Ground state of a spin 1/2 charged particle in a two-dimensional magnetic field, Phys. Rev. A19 (1979) 2461–2462.
[6] C. Adam, B. Muratori and C. Nash, Zero modes of the Dirac operator in three dimensions, Phys. Rev. D60 (1999) 125001-1–8.
[7] C. Adam, B. Muratori and C. Nash, Degeneracy of zero modes of the Dirac operator in three dimensions, Phys. Lett. B485 (2000) 314–318.
[8] C. Adam, B. Muratori and C. Nash, Multiple zero modes of the Dirac operator in three dimensions, Phys. Rev. D62 (2000) 085026-1–9.
[9] C. Adam, B. Muratori and C. Nash, Zero modes in finite range magnetic fields, Mod. Phys. Lett. A15 (2000) 1577–1581.
[10] C. Adam, B. Muratori and C. Nash, Chern–Simons action for zero modes supporting gauge fields in three dimensions, Phys. Rev. D67 (2003) 087703-1–3.
[11] L. Erdős and J. P. Solovej, On the kernel of Spin^c Dirac operators on S^3 and R^3, Adv. Math. 16 (2000) 111–119.
[12] V. A. Geyler and E. N. Grishanov, Zero modes in a periodic system of Aharonov–Bohm solenoids (in Russian), Pis’ma Zh. Eksper. Teor. Fiz. 75 (2002) 425–427. (English transl. in JETP Letters 75 (2002) 354–356.)
[13] J. Vidal, R. Mosseri and B. Douçot, Aharonov–Bohm cages in two-dimensional structures, Phys. Rev. Lett. 81 (1998) 5888–5891.
[14] J. Vidal, P. Butaud, B. Douçot and R. Mosseri, Disorder and interactions in Aharonov–Bohm cages, Phys. Rev. B64 (2001) 155306-1–16.
[15] G.-Y. Oh, Effect of field modulation on Aharonov–Bohm cages in a two-dimensional bipartite periodic lattice, Phys. Rev. B62 (2000) 4567–4572.
[16] B. A. Dubrovin and S. P. Novikov, Ground states of the two-dimensional electron (in Russian), Zh. Eksper. Teoret. Fiz. 79 (1980) 1006–1016. (English transl. in Soviet Phys. JETP 52 (1980) 902–910.)
[17] S. P. Novikov, Two-dimensional Schrödinger operators in periodic fields, Itogi Nauki i Tekhniki: Sovremennye Problemy Mat., vol. 23, VINITI, Moscow, 1983, pp. 3–23. (English transl. in J. Soviet Math. 28(1) (1985) 1–20.)
[18] Y. Avishai, R. M. Redheffer and Y. B. Band, Electron states in a magnetic field and random impurity potential: use of the theory of entire functions, J. Phys. A25 (1992) 3883–3889.
[19] Y. Avishai and R. M. Redheffer, Two-dimensional electronic systems in a strong magnetic field, Phys. Rev. B47 (1993) 2089–2100.
[20] Y. Avishai, M. Ya. Azbel and S. A. Gredeskul, Electron in a magnetic field interacting with point impurities, Phys. Rev. B48 (1993) 17280–17295.
[21] V. A. Geyler and V. A. Margulis, Point perturbation-invariant solutions of the Schrödinger equation with a magnetic field (in Russian), Matem. Zametki 60 (1996) 768–773. (English transl. in Math. Notes 60 (1996) 575–580.)
[22] S. A. Gredeskul, M. Zusman, Y. Avishai and M. Ya. Azbel’, Spectral properties and localization of an electron in a two-dimensional system with point scatters in a magnetic field, Phys. Rep. 288 (1997) 223–257.
[23] M. Sh. Birman and T. A. Suslina, The two-dimensional periodic magnetic Hamiltonian is absolutely continuous (in Russian), Algebra i Analiz 9(1) (1997) 32–48. (English transl. in St. Petersburg Math. J. 9(1) (1998) 21–32.)
[24] M. Sh. Birman and T. A. Suslina, Absolute continuity of the two-dimensional periodic magnetic Hamiltonian with discontinuous vector-valued potential (in Russian), Algebra i Analiz 10(4) (1998) 1–36. (English transl. in St. Petersburg Math. J. 10(4) (1999) 579–601.)
[25] M. Sh. Birman, R. G. Shterenberg and T. A. Suslina, Absolute continuity of the spectrum of a two-dimensional Schrödinger operator with potential supported on a periodic system of curves (in Russian), Algebra i Analiz 12(6) (2000) 140–177. (English transl. in St. Petersburg Math. J. 12(6) (2001) 983–1012.)
[26] P. Kuchment and S. Levendorskiĭ, On the spectra of periodic elliptic operators, Trans. Amer. Math. Soc. 354 (2002) 537–569.
[27] A. Morame, Absence of singular spectrum for a perturbation of a two-dimensional Laplace–Beltrami with periodic electromagnetic potential, J. Phys. A31 (1998) 7593–7601.
[28] A. Sobolev, Absolute continuity of the periodic magnetic Schrödinger operator, Inv. Math. 137 (1999) 85–112.
[29] N. D. Filonov, Second order elliptic equation of divergence form having a compactly supported solution, J. Math. Sci. 106 (2001) 3078–3086.
[30] Y. Aharonov and D. Bohm, Significance of electromagnetic potentials in the quantum theory, Phys. Rev. 115 (1959) 485–491.
[31] S. Olariu and I. I. Popescu, The quantum effects of electromagnetic fluxes, Rev. Mod. Phys. 57 (1985) 339–436.
[32] J. Hamilton, Aharonov–Bohm and other cyclic phenomena. Springer-Verlag, New York etc., 1987 (Springer Tracts in Modern Physics 139).
[33] Y. Aharonov and T. Kaufherr, The effect of a magnetic flux line in quantum theory, Phys. Rev. Lett. 92 (2004) 070404-1–4.
[34] K. von Klitzing, G. Dorda and M. Pepper, New method for high-accuracy determination of the fine-structure constant based on quantized Hall resistance, Phys. Rev. Lett. 45 (1980) 494–497.
[35] R. E. Prange and S. M. Girvin (eds.), The Quantum Hall Effect (Springer-Verlag, New York etc., 1987).
[36] H.-P. Thienel, Quantum mechanics of an electron in a homogeneous magnetic field and a singular magnetic flux tube, Ann. Phys. 280 (2000) 140–162.
[37] H.-P. Thienel, Supersymmetrische Quantenmechanik eines Elektrons im homogenen Magnetfeld mit singularem magnetischem Flußschlauch. Shaker Verlag, Aachen, 1998.
[38] V. A. Geyler, The two-dimensional Schrödinger operator with a uniform magnetic field and its perturbation by periodic zero-range potentials, Algebra i Analiz 3(3) (1991) 1–48. (English transl. in St. Petersburg Math. J. 3 (1992) 489–532.)
[39] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, Solvable Models in Quantum Mechanics (Springer-Verlag, Berlin, 1988).
[40] S. N. M. Ruijsenaars, The Aharonov–Bohm effect and scattering theory, Ann. Phys. 146 (1983) 1–34.
[41] J. Audretsch, U. Jasper and V. D. Skarzhinsky, A pragmatic approach to the problem of the self-adjoint extension of Hamilton operators with Aharonov–Bohm potential, J. Phys. A28 (1995) 2359–2367.
[42] R. Adami and A. Teta, On the Aharonov–Bohm Hamiltonian, Lett. Math. Phys. 43 (1998) 43–54.
[43] L. Dąbrovski and P. Šťovíček, Aharonov–Bohm effect with δ-type interaction, J. Math. Phys. 39 (1998) 47–62.
[44] P. Šťovíček, Krein’s formula approach to the multisolenoid Aharonov–Bohm effect, J. Math. Phys. 32 (1991) 2114–2122.
[45] P. Šťovíček, Scattering on a finite chain of vortices, Duke Math. J. 76 (1994) 303–332.
[46] Y. Nambu, The Aharonov–Bohm problem revisited, Nucl. Phys. B579 (2000) 590–616.
[47] H. T. Ito and H. Tamura, Aharonov–Bohm effect in scattering by point-like magnetic fields at large separation, Ann. Henri Poincaré 2 (2001) 309–359.
[48] H. T. Ito and H. Tamura, Aharonov–Bohm effect in scattering by a chain of point-like magnetic fields, Asymptotic Anal. 34 (2003) 199–240.
[49] P. Exner, P. Šťovíček and P. Vytřas, Generalised boundary conditions for the Aharonov–Bohm effect combined with a homogeneous magnetic field, J. Math. Phys. 43 (2002) 2151–2168.
[50] T. Mine, The Aharonov–Bohm solenoids in a constant magnetic field. mp_arc/04-80.
[51] Ph. de Sousa Gerbert, Fermions in an Aharonov–Bohm field and cosmic strings, Phys. Rev. D40 (1989) 1346–1349.
[52] C. R. Hagen, Aharonov–Bohm scattering of particles with spin, Phys. Rev. Lett. 64 (1990) 503–506.
[53] C. G. Beneventano, M. De Francia and E. M. Santangelo, Dirac fields in the background of a magnetic flux string and spectral boundary conditions, Int. J. Mod. Phys. A14 (1999) 4749–4761.
[54] O. Ogurisu, Generalized boundary conditions of a spin 1/2 particle for the Aharonov–Bohm effect combined with a homogeneous magnetic field. mp_arc/02-495.
[55] S. P. Gavrilov, D. M. Gitman and A. A. Smirnov, Dirac equation in magnetic-solenoid field, Eur. Phys. J. C32 (2003) 119–142.
[56] S. P. Gavrilov, D. M. Gitman and A. A. Smirnov, Green functions of the Dirac equation with magnetic-solenoid field, J. Math. Phys. 45 (2004) 1873–1886.
[57] S. P. Gavrilov, D. M. Gitman, A. A. Smirnov and B. L. Voronov, Dirac fermions in a magnetic-solenoid field. Focus Math. Phys. Res. (Nova Science Publishers, New York, 2004), pp. 131–168.
[58] L. D. Landau and E. M. Lifshitz, Quantum Mechanics: Non-Relativistic Theory (Pergamon Press, Oxford, 1977).
[59] S. A. Voropaev and M. Bordag, The role of boundary conditions in the Aharonov–Bohm effect for particles with spin (in Russian), Zh. Eksper. Teor. Fiz. 105 (1994) 241–249. (English transl. in JETP 78 (1994) 127–131.)
[60] M. Bordag and S. Voropaev, Charged particle with magnetic moment in the Aharonov–Bohm potential, J. Phys. A26 (1993) 7637–7649.
[61] R. M. Cavalcanti, E. S. Fraga and C. A. A. de Carvalho, Electron localization by a magnetic vortex, Phys. Rev. B56 (1997) 9243–9246.
[62] R. M. Cavalcanti and C. A. A. de Carvalho, Bound state of a spin-1/2 charged particle in a magnetic flux tube, J. Phys. A31 (1998) 7061–7063.
[63] F. A. B. Coutinho, Y. Nogami, J. Fernando Perez and F. M. Toyama, Self-adjoint extension of the Hamiltonian for a charged spin-1/2 particle in the Aharonov–Bohm field, J. Phys. A31 (1998) 7061–7063.
[64] C. R. Hagen, Effect of nongauge potentials on the spin-1/2 Aharonov–Bohm problem, Phys. Rev. D48 (1993) 5935–5939.
[65] D. K. Park, Green’s-function approach to two- and three-dimensional delta-function potentials and application to the spin-1/2 Aharonov–Bohm problem, J. Math. Phys. 36 (1995) 5463–5474.
[66] D. K. Park and J. G. Oh, Self-adjoint extension approach to the spin-1/2 Aharonov–Bohm–Coulomb problem, Phys. Rev. D50 (1994) 7715–7720.
[67] Y. Park and Y. Yoon, Exactly solvable 1/r^2 extended Pauli–Hamiltonian for a point magnetic vortex system, Progr. Theor. Phys. 95 (1996) 261–271.
[68] D. K. Park and S.-K. Yoo, Equivalence of renormalization with self-adjoint extension in Green’s function formalism. hep-th/9712134.
[69] P. Exner, M. Hirokawa and O. Ogurisu, Anomalous Pauli electron states for magnetic field with tails, Lett. Math. Phys. 50 (1999) 103–114.
[70] H. Tamura, Resolvent convergence in norm for Dirac operator with Aharonov–Bohm field. mp_arc/03-199.
[71] H. Tamura, Scattering of Dirac particles by electromagnetic fields with small support in two dimensions and effect from scalar potentials. mp_arc/03-372.
[72] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger operators with applications to quantum mechanics and global geometry (Springer-Verlag, Berlin etc., 1987).
[73] V. A. Geyler and P. Šťovíček, On the Pauli operator for the Aharonov–Bohm effect with two solenoids, J. Math. Phys. 45 (2004) 51–75.
October 18, 2004 14:30 WSPC/148-RMP
906
00219
ˇˇtov´ıˇ V. A. Geyler & P. S cek
[74] A. Arai, Properties of the Dirac–Weyl operator with a strongly singular gauge potential, J. Math. Phys. 34 (1993) 915–935. [75] A. Arai, Representation-theoretic aspects of two-dimensional quantum systems in singular vector potentials: Canonical commutation relations, quantum algebras, and reduction to lattice quantum systems, J. Math. Phys. 39 (1998) 2476–2498. [76] A. Arai, Canonical commutation relations, the Weierstrass zeta function, and infinite dimensional Hilbert space representations of the quantum group Uq (sl2 ), J. Math. Phys. 37 (1996) 4203–4218. [77] A. Arai, Representation of canonical commutation relations in a gauge theory, the Aharonov–Bohm effect, and the Dirac–Weyl operator, J. Nonlinear Math. Phys. 2 (1995) 247–262. [78] A. Arai, Gauge theory on a non-simply connected domain and representations of canonical commutation relations, J. Math. Phys. 36 (1995) 2569–2580. [79] B. Helffer, Effet d’Aharanov–Bohm pour un ´etat born´e de equation de Schr¨ odinger, Comm. Math. Phys. 119 (1988) 315–329. [80] B. Helffer, T. Hoffmann-Ostenhof and N. Nadirashvili, Periodic Schr¨ odinger operators and Aharonov–Bohm Hamiltonians, Moscow Math. J. 3 (2003) 45–62. [81] B. Helffer, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof and M. P. Owen, Nodal sets for groundstates of Schr¨ odinger operators with zero magnetic field in non simply connected domains, Commun. Math. Phys. 202 (1999) 629–649. [82] B. Helffer, M. Hoffmann-Ostenhof, T. Hoffmann-Ostenhof and M. P. Owen, Nodal sets, multiplicity and superconuctivity in non simply connected domains, Lect. Notes Phys. 62 (2000) 62–86. [83] M. Hirokawa and O. Ogurisu, Ground state of a spin 1/2 charged particle in a two-dimensional magnetic field, J. Math. Phys. 42 (2001) 3334–3343. [84] L. Erd˝ os and V. Vougalter, Pauli operator and Aharonov–Casher theorem for measure valued magnetic fields, Commun. Math. Phys. 225 (2002) 399–421. [85] Yu. I. Latyshev, O. Laborde, P. Monceau and S. 
Klaumunzer, Aharonov–Bohm effect on charge density wave (CDW) moving through columnar defects in NbSe3 , Phys. Rev. Lett. 78 (1997) 919–922. [86] Yu. I. Latyshev, O. Laborde, Th. Fournier and P. Monceau, Sliding and quantum interference of charge-density waves moving through columnar defects in NbSe3 , Phys. Rev. B60 (1999) 14019–14024. [87] Yu. I. Latyshev, Quantum interference of a moving charge density wave on columnar defects containing magnetic flux, (Russian) Uspekhi Fiz. Nauk 169 (1999) 924–926 (English transl. in Physics-Uspekhi 42 (1999) 830–832.) [88] S. J. Bending, K. von Klitzing and K. Ploog, Weak localization in a distributions of magnetic flux tubes, Phys. Rev. Lett. 65 (1990) 1060–1063. [89] M. Melgaard, E.-M. Ouhabaz and G. Rozenblum, Spectral properties of perturbed multivortex Aharonov–Bohm Hamiltonian. mp arc/03–527. [90] A. A. Balinsky, Hardy type inequality for Aharonov–Bohm magnetic potentials with multiple singularities, Math. Res. Lett. 10 (2003) 169–176. [91] A. Laptev and T. Weidl, Hardy inequalities for magnetic Dirichlet forms, in Oper. Theory: Advanced and Appl. 108 “Mathematical Results in Quantum Mechanics /QMath.7”, eds. J. Dittrich, P. Exner and M. Tater, Birkh¨ aser-Verlag, Basel, 1999, pp. 299-305. [92] M. Reed and B. Simon, Methods of Modern Mathematical Physics II (Academic Press, New York, 1975). [93] Yu. G. Shondin, Semibounded local Hamiltonians in R4 for a Laplacian perturbed on curves with corner points, Teor. Mat. Fiz. 106 (1996) 179–199. (English transl. in Theor. Math. Phys. 106 (1996) 151–166.)
October 18, 2004 14:30 WSPC/148-RMP
00219
Zero Modes in a System of Aharonov–Bohm Fluxes
907
[94] K. V. Pankrashkin, Locality of quadratic forms for point perturbations of Schr¨ odinger operators, Matem. Zametki 70 (2001) 425–433. (English transl. in Math. Notes 70 (2001) 384–391.) [95] A. M. Perelomov, On the completeness of a system of coherent states (in Russian), Teoret. i matemat. fizika 6 (1971) 213–224. (English transl. in Theor. Math. Phys. 6 (1971) 156–164; see also math-ph/0210005.) [96] A. M. Perelomov, Generalized Coherent States and Their Applications (SpringerVerlag, Berlin, 1986). [97] A. Erdelyi, W. Magnus, F. Oberthettinger and F. G. Tricomi, Higher Transcendental Functions: Vols. 1–3 (McGraw–Hill, New York, 1953–1955). [98] A. Hurwitz and R. Courant, Allgemeine Funktionentheorie und Elliptische Funktionen (Springer-Verlag, Berlin, 1984). [99] N. Dunford and J. T. Schwartz, Linear Operators 2: Spectral Theory (Interscience, New York, 1963). [100] T. Kato, Perturbation theory of linear operators (Springer-Verlag, New York, 1966). [101] A. I. Markushevich, Theory of analytic functions (Nauka, Moscow, 1976) (in Russian). [102] A. I. Markushevich, Theory of functions of a complex variables (Prentice Hall, Englewood-Cliffs., N.J., 1965). [103] M. A. Naimark, Linear Differential Operators (F. Ungar Publishing Co., New York, 1967). [104] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables (Dover Publications, New York, 1965). [105] E. C. Titchmarsh, Eigenfunction Expansions Associated with Second-Order Differential Equations (Clarendon, Oxford, 1962). [106] R. P. Boas, Entire functions (Academic Press, New York, 1954).
October 15, 2004 11:10 WSPC/148-RMP
00216
Reviews in Mathematical Physics Vol. 16, No. 7 (2004) 909–960 © World Scientific Publishing Company
LOCAL FIELDS IN BOUNDARY CONFORMAL QFT
ROBERTO LONGO
Dipartimento di Matematica, Università di Roma “Tor Vergata”, 00133 Roma, Italy
[email protected]

KARL-HENNING REHREN
Institut für Theoretische Physik, Universität Göttingen, 37077 Göttingen, Germany
[email protected]

Received 11 June 2004
Revised 17 August 2004

Dedicated to Detlev Buchholz on the occasion of his 60th birthday

Conformal quantum field theory on the half-space $x > 0$ of Minkowski space-time (“boundary CFT”) is analyzed from an algebraic point of view, clarifying in particular the algebraic structure of local algebras and the bi-localized charge structure of local fields. The field content and the admissible boundary conditions are characterized in terms of a non-local chiral field algebra.

Keywords: Quantum field theory; boundary conditions; operator algebras; modular theory.

Mathematics Subject Classifications 2000: 81R15, 81T05, 81T40
1. Introduction

We study local fields in relativistic boundary conformal QFT (BCFT) on the half-plane $M_+ = \{(t, x) : x > 0\}$. These theories possess a conserved and traceless stress-energy tensor, subject to a boundary condition at the boundary $x = 0$. As is well known, conservation and vanishing of the trace imply that the components $T_L = \frac{1}{2}(T_{00} + T_{01})$ and $T_R = \frac{1}{2}(T_{00} - T_{01})$ are chiral fields, $T_L = T_L(t + x)$, $T_R = T_R(t - x)$. The boundary condition is the absence of energy flow across the boundary,

\[ T_{01}(t, x = 0) = 0 \quad\Leftrightarrow\quad T_L = T_R \equiv T . \tag{1.1} \]

It follows that the components $T_{10} = T_{01}$, $T_{11} = T_{00}$ of the stress-energy tensor are of the form

\[ T_{00}(t, x) = T(t + x) + T(t - x) , \qquad T_{01}(t, x) = T(t + x) - T(t - x) , \tag{1.2} \]

i.e. bi-local expressions in terms of the chiral field $T$ (cf. Fig. 1).
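The step from conservation and tracelessness to chirality, and from there to (1.2), can be spelled out as a short check. The sketch below assumes the metric signature $(+,-)$ and a symmetric stress-energy tensor, so that conservation reads $\partial_t T_{0\nu} = \partial_x T_{1\nu}$; these conventions are assumptions made for illustration and are not fixed by the text.

```latex
% Assumptions (not stated in the text): signature (+,-), T_{\mu\nu} symmetric.
% Tracelessness: T_{11} = T_{00}.  Conservation: \partial_t T_{0\nu} = \partial_x T_{1\nu}.
\[
\begin{aligned}
\partial_t T_{00} &= \partial_x T_{01}, \qquad \partial_t T_{01} = \partial_x T_{00}, \\
(\partial_t - \partial_x)\,(T_{00} + T_{01}) &= 0
   \quad\Longrightarrow\quad T_L = T_L(t+x), \\
(\partial_t + \partial_x)\,(T_{00} - T_{01}) &= 0
   \quad\Longrightarrow\quad T_R = T_R(t-x), \\
T_{00} &= T_L + T_R = T(t+x) + T(t-x) \quad\text{(using } T_L = T_R \equiv T\text{)}, \\
T_{01} &= T_L - T_R = T(t+x) - T(t-x).
\end{aligned}
\]
% At x = 0 the last line gives T_{01}(t,0) = T(t) - T(t) = 0, which is (1.1).
```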
[Figure 1: the half-plane $M_+$ with axes $t$ and $x$; a point $(t, x)$ and the labels $t + x$ and $t - x$.]
Fig. 1. A point in the half-space $M_+$. A canonical field localized at $(t, x)$ is a bi-local linear combination of chiral fields localized at $t + x$ and $t - x$.
Apart from the stress-energy tensor, the theory may contain further chiral fields, such as currents, subject to an appropriate boundary condition; e.g. for a conserved current with $j_L = \frac{1}{2}(j_0 + j_1) = j_L(t + x)$, $j_R = \frac{1}{2}(j_0 - j_1) = j_R(t - x)$, the vanishing of the charge flow across the boundary gives

\[ j_1(t, x = 0) = 0 \quad\Leftrightarrow\quad j_L = j_R \equiv j , \tag{1.3} \]

and

\[ j_0(t, x) = j(t + x) + j(t - x) , \qquad j_1(t, x) = j(t + x) - j(t - x) . \tag{1.4} \]
It is crucial to contrast the bi-local forms (1.2), (1.4) of the chiral fields in boundary CFT with the situation in 2D Minkowski space CFT, where, e.g., the stress-energy tensor has the chiral decomposition

\[ T_{00}(t, x) = T(t + x) \otimes 1 + 1 \otimes T(t - x) , \qquad T_{01}(t, x) = T(t + x) \otimes 1 - 1 \otimes T(t - x) , \tag{1.5} \]

where $T_L = T \otimes 1$ and $T_R = 1 \otimes T$ are two independent (left and right) chiral fields. A boundary CFT contains only one chiral algebra with an appropriate identification between left and right movers. Consequently, the representation space is a direct sum of representations of the chiral algebra, rather than of tensor products of representations of two chiral algebras. This ought to be ascribed to the fact that the imposition of boundary conditions and the ensuing breakdown of symmetry have such drastic consequences on the ground state fluctuations (Casimir effect) that states respecting the boundary conditions cannot be realized in the Hilbert space of states without boundary conditions (see, e.g. [25]).

Let us point out, however, that locally the two situations, with $T_L = T_R = T$ and with $T_L = T \otimes 1$ and $T_R = 1 \otimes T$ independent, are algebraically indistinguishable: for instance, in the latter case the commutator $[T_L(t_1 + x_1) \pm T_R(t_1 - x_1), T_L(t_2 + x_2) \pm T_R(t_2 - x_2)]$ involves only $\delta$-function contributions at $t_1 + x_1 = t_2 + x_2$ and at $t_1 - x_1 = t_2 - x_2$, while the commutator $[T(t_1 + x_1) \pm T(t_1 - x_1), T(t_2 + x_2) \pm T(t_2 - x_2)]$ has additional contributions at $t_1 + x_1 = t_2 - x_2$ and at $t_1 - x_1 = t_2 + x_2$. But within a wedge region $M_+ \supset W : x > |t|$ ($\Leftrightarrow t - x < 0 < t + x$), the latter contributions are ineffective. The same holds for any time translate of $W$.

A slightly stronger version of this algebraic indistinguishability is the following. It has been shown that the chiral stress-energy tensor satisfies the split property: namely, for every pair of intervals $J < I$ which do not touch (thus allowing one to smooth out the UV singularities), there exists a state $\varphi$ in the vacuum Hilbert space $H_0$ of $T$ (depending on $I$ and $J$; in particular not the vacuum state) which has no correlation between $T(u_1)$ and $T(u_2)$ when $u_1 \in I$ and $u_2 \in J$. In other words, $\varphi$ factorizes on products of $T(u_i)$ with $u_i \in I \cup J$ according to

\[ \varphi\Big(\prod_k T(u_k)\Big) = \varphi\Big(\prod_{i:\, u_i \in I} T(u_i)\Big) \cdot \varphi\Big(\prod_{j:\, u_j \in J} T(u_j)\Big) . \tag{1.6} \]
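The statement that the additional commutator contributions are ineffective inside the wedge $W$ follows from a sign argument, sketched here for two points $(t_1, x_1), (t_2, x_2) \in W$. The translation parameter $\tau$ in the last comment is a label introduced only for this illustration.

```latex
% Both points lie in W:  t_i - x_i < 0 < t_i + x_i  (i = 1, 2).
\[
\begin{aligned}
t_1 + x_1 &> 0 > t_2 - x_2
   &&\Longrightarrow\quad t_1 + x_1 \neq t_2 - x_2 , \\
t_1 - x_1 &< 0 < t_2 + x_2
   &&\Longrightarrow\quad t_1 - x_1 \neq t_2 + x_2 .
\end{aligned}
\]
% The mixed delta-function supports are therefore never reached within W, and only
% the contributions at t_1 + x_1 = t_2 + x_2 and t_1 - x_1 = t_2 - x_2 survive,
% exactly as in the tensor product case (1.5).
% For the time translate of W by \tau, replace 0 by \tau in both lines.
```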
This implies that, for every double-cone $O$ not touching the boundary (hence $t - x$ and $t + x$ belong to non-touching intervals as before), there is a state $\varphi$ such that products of $T_{\mu\nu}(t, x)$ given by (1.2) in the boundary CFT with $(t, x) \in O$ have the same expectation values in the state $\varphi$ as the same products of $T_{\mu\nu}(t, x)$ given by (1.5) in the 2D Minkowski space CFT have in the state $\varphi \otimes \varphi$. This property exhibits the local “decoupling” of left and right chiral components. Exactly as the split property fails when the intervals $I$ and $J$ touch, the decoupling of left- and right-movers breaks down at the boundary in BCFT.

We shall assume the split property for all chiral fields of a boundary CFT. This property is known to be related to phase space properties of the CFT (existence of $\mathrm{Tr}\, \exp(-\beta L_0)$) [9, 1], and it has been established for large classes of chiral models ([45] and references therein).

Our aim in the present article is to understand the structure of local fields in boundary CFT which do not decompose in the manner of (1.2) or (1.4). These non-chiral fields have to satisfy local commutativity with the chiral fields and with each other, and transform covariantly under the conformal (Möbius) group generated by the chiral stress-energy tensor $T$.

The crucial observation will be that non-chiral local fields in BCFT arise from non-local chiral fields by an algebraic construction (explained in detail in Sec. 2). This construction also gives rise to a model-independent explanation (Sec. 5) for an observation due to Cardy [11] concerning the structure of correlation functions. Cardy has shown that $n$-point functions of primary local fields in boundary CFT satisfy the same differential equations in the $2n$ variables $t_i \pm x_i$ as chiral $2n$-point conformal blocks of an associated two-dimensional Minkowski space CFT, and are therefore particular combinations of the latter.
For example, the 4-point function of the order parameter in the critical Ising model in the full plane factorizes as

\[
\begin{aligned}
\langle \Omega, \sigma(t_1, x_1)\sigma(t_2, x_2)\sigma(t_3, x_3)\sigma(t_4, x_4)\Omega \rangle
 ={}& F(t_1 + x_1, \dots, t_4 + x_4) \cdot F(t_1 - x_1, \dots, t_4 - x_4) \\
 &+ G(t_1 + x_1, \dots, t_4 + x_4) \cdot G(t_1 - x_1, \dots, t_4 - x_4)
\end{aligned}
\tag{1.7}
\]

(where the chiral 4-point conformal blocks $F$ and $G$ correspond to intermediate states in the vacuum sector and in the “energy” sector, respectively), whereas both

\[ \langle \Omega, \phi_0(t_1, x_1)\phi_0(t_2, x_2)\Omega \rangle \propto F(t_1 + x_1, t_1 - x_1, t_2 + x_2, t_2 - x_2) \tag{1.8} \]

and

\[ \langle \Omega, \phi_1(t_1, x_1)\phi_1(t_2, x_2)\Omega \rangle \propto G(t_1 + x_1, t_1 - x_1, t_2 + x_2, t_2 - x_2) \tag{1.9} \]

are 2-point functions of local fields on the half-plane $M_+$. Expressed in terms of exchange fields [42] (“generalized chiral creation and annihilation operators”), we have the operator factorization

\[ \sigma(t, x) = a(t + x) \otimes a(t - x) + b(t + x) \otimes b(t - x) + \text{h.c.} \tag{1.10} \]

on the Hilbert space $[H_0 \otimes H_0] \oplus [H_{1/16} \otimes H_{1/16}] \oplus [H_{1/2} \otimes H_{1/2}]$, where $H_h$ are the three sectors of the stress-energy tensor with $c = \frac{1}{2}$, the exchange fields $a : H_0 \to H_{1/16}$ and $b : H_{1/16} \to H_{1/2}$ and their adjoints interpolate among the three sectors of the chiral stress-energy tensor, and $F = \langle a^* a a^* a \rangle$, $G = \langle a^* b^* b a \rangle$. In contrast, (1.8) and (1.9) are 2-point functions of local fields on the half-plane, given by

\[ \phi_0(t, x) \propto a^*(t + x)\, a(t - x) \tag{1.11} \]

defined on $H_0$, and

\[ \phi_1(t, x) \propto b(t + x)\, a(t - x) + a^*(t + x)\, b^*(t - x) \tag{1.12} \]

defined on $H_0 \oplus H_{1/2}$. The local commutativity at space-like distance of the combinations (1.11), (1.12) can be directly checked in terms of the exchange (braid group) commutation relations among $a$, $b$ and their adjoints [42].$^a$ In this calculation, the specific ordering $t_1 - x_1 < t_2 - x_2 < t_2 + x_2 < t_1 + x_1$ (or $1 \leftrightarrow 2$) is crucial. In particular, the combinations $\phi_0$ and $\phi_1$ given by (1.11), (1.12) on the entire plane would fail to be local fields.

We learn from this explicit example that the local fields in boundary CFT carry a bi-localized product of charges of the chiral algebra, rather than a tensor product of left and right charges, as in Minkowski space CFT. Moreover, they interpolate in very specific ways among the charged sectors of the chiral algebra, and these structures determine the scaling behavior of the fields as $x \to 0$. For example, the field $\phi_0$ has a singular behavior $\propto x^{-\frac{2}{16}}$ as $x \to 0$, while the field $\phi_1$ vanishes $\propto x^{\frac{1}{2} - \frac{2}{16}}$ at the boundary.$^b$ Thus, we also see that the choice of a boundary condition is related to the bi-localized charge structure of the local fields. We shall investigate the origin of this charge structure in the general case.

$^a$ More precisely, while any linear combination $\phi_1$ of the two terms in (1.12) satisfies local commutativity with itself, it does so with $\phi_1^*$ only if $\phi_1$ is a multiple of a hermitean field. ((1.12) is hermitean up to a phase due to the exchange commutation relations $b(t - x)a(t + x) = \omega\, b(t + x)a(t - x)$ and $a^*(t - x)b^*(t + x) = \omega\, a^*(t + x)b^*(t - x)$ with $\omega = \exp(-i\frac{3}{8}\pi)$ [42].) On the other hand, any two hermitean combinations differ only by a unitary Klein transformation; thus up to a global phase and unitary similarity, the combination (1.12) is unique as a local quantum field.

For these purposes, we look at boundary CFT from the algebraic point of view [21]. The algebraic point of view emphasizes the representation theoretic features of a QFT, especially charges and their composition [12], rather than kinematical features such as analytic properties of correlation functions. The DHR theory of superselection sectors [12] asserts that all information about charges (superselection sectors), their composition (“fusion”, operator product expansions), and their interchange (“statistics”, commutation relations) is encoded in a braided $C^*$ tensor category (the DHR category for short), in terms of local observable quantities. This theory has been developed further into a powerful tool, useful for explicit computational purposes especially in the chiral setting. For example, the classification of local and non-local extensions of a given local QFT has been cast into a problem of classification of Q-systems ([33], see Sec. 4 and Appendix A) within the DHR category. Q-systems are an efficient tool to control the algebraic consistency of commutation relations, operator product expansions, and charge conjugation of primary and descendant fields at one stroke. Under the natural assumption of “complete rationality” (Sec. 2), the classification of irreducible chiral extensions in CFT has thus been shown to be a finite-dimensional problem with finitely many solutions (see Sec. 3.2). In the case $c < 1$, a complete classification has been obtained along these lines [28]. Furthermore, the existence of exchange fields as in (1.10)–(1.12) with numerical braid group commutation relations and their operator product expansion could be established from general principles in the algebraic approach [15, 16].
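The boundary exponents quoted for $\phi_0$ and $\phi_1$ can be reproduced by simple arithmetic. The sketch below assumes the usual leading operator product behavior $(u - v)^{h_k - 2h}$ for two chiral exchange fields carrying a charge of dimension $h$ that fuse into a channel of dimension $h_k$; this power law is an assumption made for illustration and is not derived in the text.

```latex
% Ising weights: h = 1/16 for the charge carried by a and b, h_0 = 0, h_{1/2} = 1/2.
% Coalescing arguments: u = t + x, v = t - x, so u - v = 2x.
\[
\begin{aligned}
\phi_0 &\propto a^*(t+x)\,a(t-x): & h_k &= 0, &
   (2x)^{\,0 - 2\cdot\frac{1}{16}} &\propto x^{-\frac{2}{16}} = x^{-\frac{1}{8}}, \\
\phi_1 &\propto b(t+x)\,a(t-x) + \dots: & h_k &= \tfrac{1}{2}, &
   (2x)^{\,\frac{1}{2} - 2\cdot\frac{1}{16}} &\propto x^{\frac{1}{2}-\frac{2}{16}} = x^{\frac{3}{8}} .
\end{aligned}
\]
% Hence \phi_0 diverges and \phi_1 vanishes as x -> 0, in line with the exponents above.
```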
Rather than the local fields, say $\phi(t, x)$, the prime objects in the algebraic approach to QFT are the von Neumann algebras of local observables generated by the fields smeared with localized test functions, say

\[ A(O) := \{\phi(f), \phi(f)^* : \operatorname{supp} f \subset O\}'' \tag{1.13} \]

for open space-time regions $O$. The properties of the assignment $O \mapsto A(O)$ (the net of local algebras) are axiomatized such that their generation by fields as in (1.13) becomes in fact obsolete and need not be assumed at all. In our case, the chiral fields $T(u)$, ($j(u)$, $\dots$) generate a chiral net of local von Neumann algebras

\[ I \mapsto A(I) , \qquad I = (a, b) \subset \mathbb{R} \tag{1.14} \]

on the vacuum Hilbert space $H_0$. In fact, $A$ extends to a net over the intervals of the circle (embedding $\mathbb{R}$ into $S^1$ by means of a Cayley transformation).

$^b$ In the general case, one argues as follows. As $x \to 0$, the variables $t + x$ and $t - x$ coalesce. Thus, the scaling behavior is controlled by the operator product expansion, and depends on the particular fusion channel selected by the bi-localization formula.
The chiral fields of a boundary CFT generate a net

\[ O \mapsto A_+(O) . \tag{1.15} \]

According to the prescription (1.13), $A_+(O)$ is generated by chiral fields smeared in the variable $t + x$ over the interval $I$ and in the variable $t - x$ over the interval $J$, where $O = I \times J$, $I > J$, is an open double-cone in $M_+$. The bi-local structures (1.2), (1.4) etc. translate into the form of the local algebras (cf. Fig. 2)

\[ A_+(O) = A(I) \vee A(J) , \qquad (O = I \times J,\ I > J) . \tag{1.16} \]

The (searched for) non-chiral local fields of the boundary CFT will generate a net of local algebras in their vacuum representation

\[ O \mapsto B_+(O) , \tag{1.17} \]

with $B_+(O)$ containing $A_+(O)$ and consequently commuting with $A_+(\hat{O})$ as $\hat{O}$ is space-like from $O$. But while $A_+$ is defined on the vacuum representation space $H_0$ of the chiral CFT $A$, the boundary CFT $B_+$ will in general be defined on a larger Hilbert space $H_B \supset H_0$, i.e. one has

\[ \pi(A_+(O)) \subset B_+(O) . \tag{1.18} \]
Since we assume covariance and positive energy throughout, the representation $\pi$ of $A$ on $H_B$ is a positive-energy representation, containing the vacuum representation $\pi_0$, with irreducible decomposition $\pi \simeq \bigoplus_s n^s \cdot \pi_s$, $n^0 = 1$.

[Figure 2: a double-cone $O = I \times J$, with the intervals $I$ and $J$ marked.]
Fig. 2. A double-cone in the half-space $M_+$. An observable in $A_+(O)$ is generated by chiral observables localized in $I$ and $J$.

1.1. Outline of results

We take the structure (1.14)–(1.18) as the characteristic structure of algebraic boundary conformal QFT, irrespective of whether the nets are generated as in (1.13) by any specific set of generating local fields. Our main results will be the following (for any fixed chiral CFT $A$), referring to the body of the article for more detailed qualifications of the statements.

In Sec. 2, we provide several general results about the structure of local boundary CFT’s $B_+$. Every maximal local boundary CFT $B_+$ can be recovered from its “restriction to the boundary” (Proposition 2.9). More precisely, $B_+(O)$ equals

\[ B_+(O) = B(L) \cap B(K)' \tag{1.19} \]
(see Fig. 5), where $I \mapsto B(I)$ is some (possibly non-local) extension of the chiral CFT $I \mapsto A(I)$, defined on the same Hilbert space $H_B$ as $B_+$. Structural features of the latter (reviewed and developed in Sec. 3, where Tomita’s Modular Theory [43] plays a crucial role) are exploited to infer structural features of boundary CFT. We shall refer to the (re)construction of the boundary CFT from a (non-local) chiral theory as (boundary) induction. These results show that the classification of (non-local) chiral extensions of a local chiral theory (e.g., in terms of Q-systems) at the same time provides a classification of boundary CFT’s.

On the other hand, every (non-local) chiral extension $B$ of $A$ determines a local CFT $B_2^\alpha$ on two-dimensional Minkowski space-time with left and right chiral observables $A \otimes A$ (henceforth referred to as the $\alpha$-induction construction). The Hilbert space of $B_2^\alpha$ carries the representation $\pi_2 \simeq \bigoplus_{\sigma\tau} Z_{[\sigma][\tau]} \cdot \pi_\sigma \otimes \pi_{\bar\tau}$ of $A \otimes A$, where the matrix $Z$ with indices in the set of irreducible sectors of $A$ is a modular invariant determined by the chiral extension $B$ [40, Corollary 1.6].

In Sec. 4, we discuss the relation between these two constructions of 2D nets (boundary induction for the half-space versus $\alpha$-induction for Minkowski space). Indeed, the local inclusions $\pi(A_+(O)) \subset B_+(O)$ and $\pi_2(A(I) \otimes A(J)) \subset B_2^\alpha(O)$ are algebraically isomorphic (Theorem 4.1). In this sense, the boundary CFT constitutes a representation of the local degrees of freedom of the Minkowski space theory $B_2^\alpha$ which is consistent with the chiral boundary condition (1.1) and its generalizations such as (1.3). But the representation spaces of $B_+$ and of $B_2^\alpha$ are very different, one being a direct sum of sectors of $A$, the other being a direct sum of tensor products of sectors.

Therefore, in spite of the algebraic isomorphism, the bi-localized charge structure of the local fields on the half-space must be structurally different from the tensor product charge structure of the local fields in the plane, as is clearly exemplified by (1.11) or (1.12) versus (1.10). We derive an explicit formula for the local charged fields (Proposition 5.1) exhibiting their bi-localized charge structure in terms of non-local chiral exchange operators. The charge of a field $\phi(t, x)$ is a product (not a tensor product) of two chiral charges localized at $t + x$ and $t - x$, respectively. This structure, and as a consequence the behavior of the charged fields and their correlations close to the boundary, is determined by the non-local chiral extension $B$, i.e. the choice of $B$ “determines the boundary conditions”.
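As an illustration of the contrast between the two representation spaces, consider the Ising example of the introduction. The sketch below assumes that the trivial extension $B = A$ corresponds to the diagonal matrix $Z_{[\sigma][\tau]} = \delta_{\sigma\tau}$; this identification is an assumption made for the example (it is not stated in this section), and all three Ising sectors are self-conjugate.

```latex
% Sectors of the c = 1/2 stress-energy tensor: h = 0, 1/16, 1/2 (all self-conjugate).
% Assumption for this example: B = A gives Z_{[\sigma][\tau]} = \delta_{\sigma\tau}.
\[
\pi_2 \;\simeq\; \bigoplus_{\sigma\tau} Z_{[\sigma][\tau]}\, \pi_\sigma \otimes \pi_{\bar\tau}
      \;=\; \pi_0 \otimes \pi_0 \;\oplus\; \pi_{1/16} \otimes \pi_{1/16}
            \;\oplus\; \pi_{1/2} \otimes \pi_{1/2} ,
\]
% acting on [H_0 \otimes H_0] \oplus [H_{1/16} \otimes H_{1/16}] \oplus [H_{1/2} \otimes H_{1/2}],
% the Hilbert space of (1.10).  By contrast, the half-plane fields (1.11) and (1.12)
% act on the direct sums H_0 and H_0 \oplus H_{1/2}, with no tensor products.
```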
In this sense, the natural reasoning where one would impose the boundary conditions first, and then attempt to construct local fields subject to these conditions, is inverted. This avoids the problem with the usual strategy, that a consistent set of boundary conditions must be chosen in the first place, while it is not a priori clear what “consistent” would mean. As our analysis demonstrates implicitly, the algebraic constraints on the local fields to be constructed are highly involved: they consist in (i) the Q-system describing the algebraic structure of the inclusion $A_+(O) \subset B_+(O)$, and (ii) the representation of this algebraic structure on a Hilbert space $H_B$. From these data, which most sensitively depend on the DHR structure of the underlying chiral net $A$, the boundary conditions emerge, so that it is very unlikely that it should be possible to “guess” the consistent sets of boundary conditions without further specific insight. For this reason, we consider the present top-down strategy

chiral extension → boundary condition

much more effective, since it is completely under control in the algebraic framework.

In Sec. 6 we show that along with a given (non-local) chiral extension $B$, there is a whole family of non-local chiral extensions $B_a$, all associated with the same Minkowski space theory $B_2^\alpha$, and hence a family of boundary CFT nets $B_{a,+}$, which are all locally isomorphic, but whose local fields exhibit different bi-localized charge structures and satisfy different boundary conditions, in the sense just explained. The multiplicities of the Hilbert spaces $H_a \equiv H_{B_a} = \bigoplus_s n^s_a \cdot H_s$ are the diagonal elements of a “nimrep” (non-negative integer matrix representation) of the fusion rules of $A$:

\[ n^s \cdot n^t = \sum_u N^u_{st}\, n^u \quad\text{with}\quad n^s_{aa} = n^s_a . \tag{1.20} \]
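A small worked instance of (1.20) may help. The sketch below takes the Ising fusion rules and uses the fusion matrices themselves as the nimrep, $(n^s)_{ab} = N^b_{sa}$, which is the natural candidate for the case $B = A$; identifying this regular representation with the nimrep of the text is an assumption made for illustration. Rows and columns are ordered as $(0, \tfrac{1}{16}, \tfrac{1}{2})$.

```latex
% Ising fusion rules:
%   1/16 x 1/16 = 0 + 1/2,   1/16 x 1/2 = 1/16,   1/2 x 1/2 = 0.
\[
n^{0} = \begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&1 \end{pmatrix}, \qquad
n^{1/16} = \begin{pmatrix} 0&1&0\\ 1&0&1\\ 0&1&0 \end{pmatrix}, \qquad
n^{1/2} = \begin{pmatrix} 0&0&1\\ 0&1&0\\ 1&0&0 \end{pmatrix}.
\]
% Check of (1.20):
\[
n^{1/16} \cdot n^{1/16}
  = \begin{pmatrix} 1&0&1\\ 0&2&0\\ 1&0&1 \end{pmatrix}
  = n^{0} + n^{1/2}, \qquad
n^{1/16} \cdot n^{1/2} = n^{1/16}, \qquad
n^{1/2} \cdot n^{1/2} = n^{0}.
\]
% Diagonal elements n^s_{aa} = n^s_a:
%   a = 0    : (1, 0, 0)  ->  H_a = H_0
%   a = 1/16 : (1, 0, 1)  ->  H_a = H_0 \oplus H_{1/2}
%   a = 1/2  : (1, 0, 0)  ->  H_a = H_0
```

Under this assumption the spaces $H_0$ and $H_0 \oplus H_{1/2}$ reappear as the representation spaces of the fields $\phi_0$ and $\phi_1$ in (1.11) and (1.12).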
We include in Sec. 7 some preliminary remarks on the relation to the modular structure of partition functions and boundary states.

The structural analysis pursued in this article generalizes closely related previous analyses in complementary approaches. In the context of critical phenomena in Statistical Mechanics, Cardy has already discussed [11] the case $B = A$ (in our terminology), leading to the set of boundary conditions being labeled by the sectors of $A$. The same situation was investigated by Felder, Fröhlich, Fuchs and Schweigert [14] from the perspective of three-dimensional topological field theory. Fuchs, Runkel and Schweigert [19] proceeded to construct the coefficients of all $2n$-point conformal blocks as in (1.8), (1.9) in a combinatorial manner, where a condition very similar to our Eq. (5.12) was crucial to ensure locality. Behrend et al. [10] have concentrated on graph theoretic aspects of the pertinent fusion algebras, and on A-D-E classification aspects in the case of $SU(2)$ current algebras; see also [46] for a review. Fuchs and Schweigert [17] have studied the generalization in which (in our terminology) $A$ is a subtheory (not necessarily of orbifold type) of a chiral theory $B$ which is itself local. They also emphasized the role of $\alpha$-induction.
This case is known to give rise to block diagonal modular invariant matrices $Z_{st}$ [6]. The same authors [18] have further developed the purely categorical aspects characteristic of boundary CFT, no longer referring to the underlying physical postulates. In fact, these structures fit most naturally in the general setting of tensor categories as exposed, e.g., in [18, 30, 37].

In comparison to such a considerable gain of mathematical generality (where quantum physics remains hardly visible), the motivation and ambition of our work is more limited. On the other hand, we study and explain specific representation theoretic issues which in the other frameworks are not or even cannot be addressed. For these issues, operator algebraic methods are most powerful.

We emphasize that in our approach the prominent principle is Locality. In other approaches [46], inspired by Statistical Mechanics or String Theory rather than Quantum Field Theory, Modular Invariance of the partition function is taken instead as a first principle, required in order to guarantee that the theory can be consistently defined on arbitrary Riemann surfaces. It is well known, however, that these principles, although closely related to each other, cannot be precisely mapped onto each other [41].

In fact, we do not assume diffeomorphism invariance but only Möbius invariance. Assuming diffeomorphism invariance (i.e. the algebraic implementation of localized diffeomorphisms by suitable chiral observables) would allow some stronger results. For example (for an explanation of the notions, see the beginning of the next section), it was shown in [35] that strong additivity would be automatic in a split net of finite $\mu$-index, and that the $\mu$-index coincides with the dimension of the DHR category.

2. Algebraic Boundary Conformal QFT

We work with a fixed chiral conformal net $I \mapsto A(I)$ over the intervals of the real axis [20], e.g., a Virasoro net with $c < 1$ or a non-abelian current algebra (affine Kac–Moody) chiral net. In this article, $A$ is assumed to be completely rational [29]. This condition combines rationality (finitely many superselection sectors, each with finite statistics [12, 15]), strong additivity (“irrelevance of points for smearing”, i.e. the algebras of two adjacent intervals $(a, b)$ and $(b, c)$ generate the algebra of the full interval $(a, c)$; this property is equivalent to Haag duality of the chiral theory on the real line), and the split property (statistical independence of local algebras $A(I)$ and $A(J)$ when $I$ and $J$ are finitely separated, and as a consequence $A(I) \vee A(J)$ is isomorphic to $A(I) \otimes A(J)$, cf. the discussion around (1.6); this property is guaranteed, e.g., if $\exp(-\beta L_0)$ is a trace class operator for all $\beta$ in the vacuum representation [9, 1]). Most of the common models of chiral CFT are completely rational [32, 45], but abelian current algebras as well as stress tensors with $c \geq 1$ without further fields are excluded by the assumption of rationality.

Completely rational chiral theories enjoy very interesting properties concerning the structure of their superselection sectors. For example, the DHR statistics is
non-degenerate (besides the vacuum sector, no sector has trivial monodromy with every other sector) [29, Corollary 37], and thus gives rise to a unitary representation of the modular group SL(2, Z) in terms of the statistics [16, Corollary 5.2], turning the DHR category into a modular category [44]. Moreover, in completely rational theories, the dimension of the DHR category (the sum of the squares of the dimension of all irreducible superselection sectors), equals the “µ-index” (the von Neumann subfactor index of the inclusion A(E) ⊂ A(E 0 )0 where E is the union of two disconnected intervals and E 0 its complement on the circle [29, Theorem 33]). 2.1. Geometric preliminaries on the half-space M+ Before turning to QFT on the half-space M+ ≡ {(t, x) ∈ R2 : x > 0}, let us mention some elementary geometric properties of this space: (i) A double-cone within the half-space M+ is an open region of the form O = I × J ≡ {(t, x): t + x ∈ I, t − x ∈ J} whose closure is contained in M+ (cf. Fig. 3). Let L ⊂ R be a bounded open interval, and J < K < I the three subintervals (ordered as indicated) obtained by removing two points from L (cf. Figs. 2 and 5). There is a bijection between the configurations of four intervals I, J, K, L obtained this way and the double-cones O within M+ , such that O = I × J. By default, I, J, K, L and O will always refer to such a configuration. Only if necessary, we shall write IO , JO , KO , LO in order to indicate this convenient parametrization of double-cones within M+ . (ii) A left wedge is a region of the form WL = {(t, x): |t − t0 | < x0 − x} for some (t0 , x0 ) ∈ M+ ; it is “spanned” by the interval I = (t0 − x0 , t0 + x0 ). A right wedge is a region of the form WR = {(t, x): |t − t0 | < x − x0 } for some (t0 , x0 ) ∈ M+ . The causal complement of a left wedge is a right wedge, and vice versa (cf. Fig. 3). 
The causal complement of a double-cone is the union of a left wedge $W_L = O_<$ and a right wedge $W_R = O_>$. A double-cone $\hat O$ belongs to the left causal complement $O_<$ of $O$ ($\hat O < O$) if and only if $L_{\hat O} \subset K_O$, and to the right causal complement $O_>$ ($\hat O > O$) if and only if $L_O \subset K_{\hat O}$.
Fig. 3. Double-cone and wedge regions in $M_+$ and their causal complements. Left panel: a double-cone $O$ and its causal complement $O_< \cup O_>$. Right panel: a left wedge $W_L$ and a right wedge $W_R$.
Locality of a net $B_+$ on $M_+$ means that $B_+(\hat O)$ commutes with $B_+(O)$ in both cases.

(iii) The covering $G := \widetilde{PSL(2, \mathbb{R})}$ of the Möbius group, acting on the universal covering of the compactification $S^1$ of $\mathbb{R}$, induces an action on a certain covering of $M_+ \subset \mathbb{R} \times \mathbb{R}$. The subgroups of translations and of dilations act on $\mathbb{R}$, and the induced actions are the time translations and the dilations of $M_+$, respectively.

2.2. Local algebras in boundary CFT

Definition 2.1. A given chiral net $A$ defines two different local nets over the open double-cones within $M_+$, namely the trivial boundary CFT
$$O \mapsto A_+(O) := A(I) \vee A(J) \qquad (2.1)$$
and its dual
$$O \mapsto A_+^{\mathrm{dual}}(O) := A(L) \cap A(K)'. \qquad (2.2)$$
As emphasized by the notation, $A_+^{\mathrm{dual}}$ is the dual net associated with $A_+$:
$$A_+^{\mathrm{dual}}(O) := A_+(O')' \qquad (2.3)$$
where $A_+(O') := \bigvee_{\hat O \subset O'} A_+(\hat O) \equiv A_+(O_<) \vee A_+(O_>)$ is the algebra generated by all observables of $A_+$ localized in double-cones at space-like separation from $O$.
Remarks. (1) Both nets $A_+$ and $A_+^{\mathrm{dual}}$ are represented on the same Hilbert space $H_0$, the vacuum Hilbert space of $A$. The observables of the trivial BCFT are bilocal expressions in the chiral observables, as described in the introduction.

(2) The dual net is local because, if $O_1$ and $O_2$ are space-like separated within $M_+$, then $L_2 \subset K_1$ (or $1 \leftrightarrow 2$), hence $A_+^{\mathrm{dual}}(O_1) \subset A(K_1)'$ and $A_+^{\mathrm{dual}}(O_2) \subset A(L_2)$ commute. It follows that $A_+^{\mathrm{dual}}$ is its own dual net (Haag duality).

(3) The inclusion
$$A_+(O) \subset A_+^{\mathrm{dual}}(O) \qquad (2.4)$$
is the “two-interval subfactor” extensively discussed in [29]. Apart from $A(I)$ and $A(J)$, the algebra $A_+^{\mathrm{dual}}(O)$ contains all unitary “charge transporters” $u : \rho^I \to \rho^J$ where $\rho^I$, $\rho^J$ are (equivalent) DHR endomorphisms of $A$ localized in $I$ and $J$, respectively,$^{c}$ and these elements generate $A_+^{\mathrm{dual}}(O)$. The algebraic isomorphism class of the two-interval subfactor (2.4) does not depend on the pair of intervals, and thus on $O$.

(4) We observe that
$$\bigvee_{O:\, L_O \subset L} A_+(O) = \bigvee_{O:\, L_O \subset L} A_+^{\mathrm{dual}}(O) = A(L), \qquad (2.5)$$
because the intervals $I_O$ and $J_O$, as $O$ varies as specified, cover all of $L$.
$^{c}$ For details on DHR theory in the chiral setting, see [15, 16]. The notation $t : \rho \to \sigma$ means the intertwining property $t\rho(a) = \sigma(a)t$ for all $a \in A$. We shall also write $t \in \mathrm{Hom}(\rho, \sigma)$.
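The interval bookkeeping of Sec. 2.1, which is used in Remarks (2) and (4), can be checked numerically. The following is a minimal sketch, not part of the original construction: the class `DoubleCone` and the helper names are hypothetical, and a double-cone is stored by the four endpoints $a < b < c < d$ of $J = (a, b)$, $K = (b, c)$, $I = (c, d)$, $L = (a, d)$. It tests the criterion $L_{\hat O} \subset K_O$ for $\hat O < O$ and compares it with space-like separation of corner points, using $(\Delta t)^2 - (\Delta x)^2 = \Delta u \, \Delta v$ in the light-cone coordinates $u = t + x$, $v = t - x$.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DoubleCone:
    """Double-cone O = I x J within M_+ (hypothetical helper class).

    Stored via endpoints a < b < c < d with
    J = (a, b), K = (b, c), I = (c, d), L = (a, d).
    """
    a: float
    b: float
    c: float
    d: float

    def __post_init__(self):
        # b < c keeps the closure of O inside x > 0.
        if not (self.a < self.b < self.c < self.d):
            raise ValueError("need a < b < c < d")

    @property
    def J(self):
        return (self.a, self.b)

    @property
    def K(self):
        return (self.b, self.c)

    @property
    def I(self):
        return (self.c, self.d)

    @property
    def L(self):
        return (self.a, self.d)

    def corners(self):
        """Corners of the closure in light-cone coordinates (u, v) = (t + x, t - x)."""
        return list(product(self.I, self.J))


def is_subinterval(inner, outer):
    """inner is contained in outer (intervals given as endpoint pairs)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def left_of(o_hat, o):
    """o_hat < o: o_hat lies in the left causal complement of o, i.e. L_{o_hat} within K_o."""
    return is_subinterval(o_hat.L, o.K)


def spacelike_separated(o1, o2):
    """Two double-cones are space-like separated iff one is left of the other."""
    return left_of(o1, o2) or left_of(o2, o1)


def minkowski_spacelike(p, q):
    """Points p, q in (u, v) coordinates are space-like iff du * dv < 0."""
    (u1, v1), (u2, v2) = p, q
    return (u1 - u2) * (v1 - v2) < 0


if __name__ == "__main__":
    O = DoubleCone(0.0, 1.0, 3.0, 4.0)
    O_hat = DoubleCone(1.2, 1.5, 2.0, 2.8)
    print(left_of(O_hat, O))
    print(all(minkowski_spacelike(p, q)
              for p in O_hat.corners() for q in O.corners()))
```

The interval criterion and the pointwise test agree in this example because every $u$-coordinate of $\hat O$ lies below $I_O$ while every $v$-coordinate lies above $J_O$, so $\Delta u \, \Delta v < 0$ throughout.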
The trivial BCFT $A_+$ and its dual $A_+^{\mathrm{dual}}$ are special cases$^{d}$ of boundary conformal quantum field theories in the sense of the following definition.

Definition 2.2. A boundary CFT (BCFT) associated with $A$ is a local, isotonous net $O \mapsto B_+(O)$ over the double-cones within the half-space $M_+$, represented on a Hilbert space $H_B$ such that:

(i) There is a unitary representation $U$ of the covering $G = \widetilde{PSL(2, \mathbb{R})}$ of the Möbius group with positive generator for the subgroup of translations, such that
$$U(g) B_+(O) U(g)^* = B_+(gO) \qquad (2.6)$$
whenever the conformal transformation $g \in G$ takes the double-cone $O = I_O \times J_O$ within $M_+$ into another double-cone $gO := gI_O \times gJ_O$ within $M_+$$^{e}$ (i.e. in particular for all translations and dilations), with a unique invariant vector $\Omega \in H_B$ (the vacuum vector).

(ii) There is a representation $\pi$ of $A$ on $H_B$ such that $B_+(O)$ contains $\pi(A_+(O))$, and
$$U(g) \pi(A_+(O)) U(g)^* = \pi(A_+(gO)) \qquad (2.7)$$
whenever $O$ and $gO$ are double-cones within $M_+$.

(iii) “Joint irreducibility”: For each double-cone $O$, the von Neumann algebra $B_+(O) \vee \pi(A_+)''$ is irreducible on $H_B$, i.e. equals $B(H_B)$. Here, $\pi(A_+)$ is the $C^*$-algebra generated by all double-cone algebras $\pi(A_+(O))$, and $\pi(A_+)''$ is its weak closure, i.e. the von Neumann algebra generated by all interval algebras $\pi(A(I))$.

Comments. (1) By Remark (4) following Definition 2.1, the covariance condition in (ii) is equivalent to
$$U(g) \pi(A(I)) U(g)^* = \pi(A(gI)) \qquad (2.8)$$
whenever $I$ and $gI$ are intervals in $\mathbb{R}$. As a consequence, $\pi$ extends to a positive-energy representation of the chiral net $A$ on the circle.

(2) Joint irreducibility (iii) implies irreducibility of the net $B_+$ on $H_B$ with $\Omega$ the unique $U$-invariant vector, cf. [20]. On the other hand, $\Omega$ being the unique $U$-invariant vector and cyclic for $B_+$ implies the irreducibility of $B_+$ by Proposition 3.3 below (choosing $U$ the subgroup of time translations). The covariance and spectrum condition (i) implies that the vacuum vector is in fact cyclic and separating for every local algebra $B_+(O)$ (Reeh–Schlieder property).

(3) Joint irreducibility also implies that the local inclusions $\pi(A_+(O)) \subset B_+(O)$ have trivial relative commutant.

$^{d}$ The latter is sometimes called “the Cardy case” in the literature [14].

$^{e}$ $G$ being a covering group, this means more precisely the following: $g$ is represented by a path $g_t \in PSL(2, \mathbb{R})$ connecting the identity with $p(g) \in PSL(2, \mathbb{R})$, such that $g_t O$ lies within $M_+$ for all $t$.
(4) Joint irreducibility is automatic if the representation $U(g)$ belongs to $\pi(A_+)''$, e.g., if the stress-energy tensor of the BCFT coincides with that of the chiral theory $A$, see e.g., [39, 2].

(5) In general, a BCFT net $B_+$ does not contain the dual net $\pi(A_+^{\mathrm{dual}})$, nor is it relatively local with respect to $\pi(A_+^{\mathrm{dual}})$ (see Proposition 2.7 for a characterization of this case; clearly, the former property would imply the latter).

Definition 2.3. The dual net $O \mapsto B_+^{\mathrm{dual}}(O)$ of a boundary CFT $O \mapsto B_+(O)$ is defined by
$$B_+^{\mathrm{dual}}(O) := B_+(O')' \equiv B_+(O_<)' \cap B_+(O_>)'. \qquad (2.9)$$
Here, $B_+(O_<)$, $B_+(O_>)$ are the von Neumann algebras generated by all $B_+(\hat O)$ as $\hat O$ belongs to the left or right causal complement of $O$, respectively. By locality of $B_+$, $B_+^{\mathrm{dual}}(O)$ contains $B_+(O)$. We shall see below (Proposition 2.10, using Modular Theory [43]) that $B_+$ is in fact wedge dual, i.e. the algebra of a right wedge $W_R$ is the commutant of the algebra of the corresponding left wedge $W_R = W_L'$, and vice versa. This means that (2.9) may be rewritten as
$$B_+^{\mathrm{dual}}(O) := B_+(O_>') \cap B_+(O_<'). \qquad (2.10)$$
In particular, the dual net $B_+^{\mathrm{dual}}$ is again local, and consequently it is its own dual net (i.e. it is Haag dual). The notation is consistent since $A_+^{\mathrm{dual}}$ is indeed the dual net of $A_+$.
2.3. The non-local chiral net associated with a BCFT

We now turn to the description of a BCFT in terms of a chiral net, which is in general non-local.

Definition 2.4. A boundary CFT $O \mapsto B_+(O)$ generates a chiral net $I \mapsto B^{\mathrm{gen}}(I)$ (the associated boundary net) on $H_B$, by
$$B^{\mathrm{gen}}(I) := \bigvee_{O \subset W_L} B_+(O) \equiv B_+(W_L) \qquad (2.11)$$
where $W_L$ is the left wedge spanned by $I$ (cf. Fig. 4). By the above Remark (4) following Definition 2.1, this definition associates the original net $A$ with both $A_+$ and $A_+^{\mathrm{dual}}$.

Proposition 2.5. (i) The boundary net $B^{\mathrm{gen}}$ generated from $B_+$ is isotonous, and it is covariant:
$$U(g) B^{\mathrm{gen}}(I) U(g)^* = B^{\mathrm{gen}}(gI) \qquad (2.12)$$
Fig. 4. Double-cones for “generation”. The observables of the associated chiral boundary net localized in $I$ are generated by BCFT observables localized in double-cones $O \subset W_L$.
whenever $I \subset \mathbb{R}$, $gI \subset \mathbb{R}$.$^{f}$ It acts irreducibly on $H_B$. $B^{\mathrm{gen}}$ extends $\pi(A)$ and is relatively local with respect to $\pi(A)$:
$$\pi(A(I)) \subset B^{\mathrm{gen}}(I) \subset \pi(A(I'))'. \qquad (2.13)$$
(ii) There is a consistent family of vacuum-preserving conditional expectations $E^I : B^{\mathrm{gen}}(I) \to A(I)$.

(iii) The local subfactors $\pi(A(I)) \subset B^{\mathrm{gen}}(I)$ are irreducible and have finite index. The index is independent of $I$.

In general, the boundary net $B^{\mathrm{gen}}$ is a non-local chiral net. For if $I_1$ and $I_2$ are disjoint, then the double-cones contributing to the definition (2.11) of the corresponding algebras $B^{\mathrm{gen}}(I_1)$ and $B^{\mathrm{gen}}(I_2)$ are pairwise time-like separated, and observables of $B_+$ need not satisfy time-like commutativity. Non-local chiral nets have been studied before, e.g., in [1]. We shall review and extend their general structure theory in Sec. 3. In the remainder of the present section, we shall freely use these results.

Proof of Proposition 2.5. (i) Isotony, covariance, the extension property and relative locality are elementary. Irreducibility of the net $B^{\mathrm{gen}}$ follows from irreducibility of $B_+$.

(ii) The statement will be proven in the next section (Proposition 3.5(i)).

(iii) Irreducibility of the local subfactors $A(I) \subset B^{\mathrm{gen}}(I)$ follows from joint irreducibility. Namely, the von Neumann algebra $\pi(A(I)) \vee B^{\mathrm{gen}}(I)'$ contains $\pi(A(I)) \vee \pi(A(I'))$ hence $\pi(A_+)$ by strong additivity, and $B_+(O)$ for some $O$ with $I \subset K_O$, hence it equals $B(H_B)$. Thus $\pi(A(I))' \cap B^{\mathrm{gen}}(I) = \mathbb{C} \cdot 1$.

$^{f}$ In the same sense as explained in the footnote to Definition 2.2(ii).
Irreducibility implies finiteness of the index because $A$ is completely rational, by the same argument as in [28, Proposition 2.3] (i.e. each irreducible subsector $\rho$ can arise in $\pi$ with multiplicity bounded by $d(\rho)$). Its independence of the interval follows as in [33, Corollary 4.2].

Proposition 2.5 means that the extension $\pi(A) \subset B^{\mathrm{gen}}$ defines what was called a quantum field theoretical net of subfactors in [33]. We shall rather use the terminology chiral extension here. Since conditional expectations have the abstract properties of an average (“non-commutative integration”), the existence of a consistent family of vacuum-preserving conditional expectations of $B(I)$ to $A(I)$ was viewed in [33] as a “generalized symmetry which is unbroken in the vacuum state”.

It should be emphasized that a consistent family of (vacuum-preserving) conditional expectations cannot be expected in general for the double-cone algebras $A_+(O) \subset B_+(O)$, because the modular automorphism group of $B_+(O)$ acts non-geometrically and therefore does not preserve the subalgebra $A_+(O)$. In the case of $B_+ = A_+^{\mathrm{dual}}$, the failure can be seen directly: here the cyclic subspace of $A_+$ coincides with the full Hilbert space, and the corresponding projection is the unit operator. On the other hand, the unique conditional expectation of $A_+^{\mathrm{dual}}(O)$ to $A_+(O)$ [29] does not preserve the vacuum. Likewise, there cannot be a global conditional expectation, because, e.g., $E^{\hat O}$ is trivial on $A_+^{\mathrm{dual}}(O)$ whenever $I_{\hat O}$ or $J_{\hat O}$ contains $L_O$ because $A_+^{\mathrm{dual}}(O) \subset A(L_O) \subset A_+(\hat O)$, while $E^O$ is non-trivial on the same algebra. In this sense, the (generalized) symmetry allows one to determine the subalgebras $\pi(A_+(W)) \subset B_+(W)$ associated with wedges as fixed-point algebras, but the same does not hold for the subalgebras $\pi(A_+(O)) \subset B_+(O)$ associated with double-cones. (This is completely analogous to compact symmetry groups acting on field algebras associated with connected and disconnected regions in four dimensions [12].)
As a consequence, the techniques and results of [33] do not apply directly to algebraic boundary CFT, considered as the net of subfactors $O \mapsto A_+(O) \subset B_+(O)$. Instead, as a consequence of Proposition 2.5, these techniques do apply to the associated boundary extension $A(I) \subset B^{\mathrm{gen}}(I)$, and we shall elaborate in Secs. 4 and 5 how they indirectly provide the desired insight into the structure of the net of algebras on the half-space and its representations. A central result of [33] is the following generation property (for further explanations, see Sec. 4 and Appendix A).

Corollary 2.6 ([33]). For each interval $I$, the “dual canonical” endomorphism of $A(I)$ associated with the local subfactor $A(I) \subset B^{\mathrm{gen}}(I)$ extends to a DHR endomorphism $\theta^I$ of $A$ localized in $I$. The algebra $B^{\mathrm{gen}}(I)$ is generated by its subalgebra $\pi(A(I))$ and a “canonical” isometry $v^I \in B^{\mathrm{gen}}(I)$ which is an intertwiner for $\theta^I$, i.e. one has $\pi(\theta^I(a)) v^I = v^I \pi(a)$ for all $a \in A$.

This property can be used to obtain:
Proposition 2.7. If $B_+$ is relatively local with respect to $\pi(A_+^{\mathrm{dual}})$, then $B^{\mathrm{gen}} = A$, and $B_+$ lies between $A_+$ and $A_+^{\mathrm{dual}}$.

Proof. Let $O = I \times J$, $J < I$, and $K$ and $L$ as described in the beginning of Sec. 2.1. Assume that $\pi(A_+^{\mathrm{dual}}(O))$ commutes with $B_+(\hat O)$ whenever $\hat O$ belongs to the left causal complement of $O$, i.e. whenever $L_{\hat O} \subset K$. Then $\pi(A_+^{\mathrm{dual}}(O))$ commutes with $B^{\mathrm{gen}}(K)$. Every unitary charge transporter $u : \rho^I \to \rho^J$ belongs to $A_+^{\mathrm{dual}}(O)$, thus $\pi(u)$ commutes with $B^{\mathrm{gen}}(K)$. By Corollary 2.6, $v^K \in B^{\mathrm{gen}}(K)$ satisfies $v^K \pi(u) = \pi(\theta^K(u)) v^K$ while by locality $v^K \pi(u) = \pi(u) v^K$, hence $\pi(\theta^K(u)) v^K = \pi(u) v^K$. As an equation in $B^{\mathrm{gen}}(L)$, this implies [33] $\theta(u) = u$. By [16], this implies that the sectors $[\rho]$ and $[\theta]$ have trivial monodromy, and as $[\rho]$ was arbitrary, $[\theta]$ has trivial monodromy with every DHR sector. But by [29], the braiding of the completely rational net $A$ is non-degenerate, hence $[\theta]$ must be trivial. This in turn implies $B^{\mathrm{gen}} = A$ [33]. The last statement will follow from Proposition 2.9(ii), according to which $B^{\mathrm{gen}} = A$ implies $B_+^{\mathrm{dual}} = A_+^{\mathrm{dual}}$.

By the definition of the boundary net $B^{\mathrm{gen}}$ and locality of $B_+$, we obviously have $B_+(O) \subset B^{\mathrm{gen}}(L) \cap B^{\mathrm{gen}}(K)'$. This suggests the following definition of a local boundary CFT induced by a given (possibly non-local) chiral net:

Definition 2.8. If $I \mapsto B(I)$ is an irreducible chiral extension of $I \mapsto A(I)$ (possibly non-local, but relatively local with respect to $A$), then the induced net is defined by (cf. Fig. 5)
$$O \mapsto B_+^{\mathrm{ind}}(O) := B(L) \cap B(K)'. \qquad (2.14)$$
Let us discuss to which extent Definition 2.8 is the converse of Definition 2.4, i.e. to which extent a boundary CFT can be reconstructed from its boundary net.
Fig. 5. Intervals for “induction”. The observables of the induced BCFT localized in $O$ belong to $B(L)$ and commute with $B(K)$.
Proposition 2.9. (i) The induced net (2.14) is a boundary CFT associated with $A$, defined on the Hilbert space of $B$; e.g., in the special case $B = A$, the induced net is the dual net $A_+^{\mathrm{dual}}$.

(ii) If $B$ is a chiral extension of $A$, then the boundary net $(B_+^{\mathrm{ind}})^{\mathrm{gen}}$ generated by the induced net $B_+^{\mathrm{ind}}$ is again $B$. Conversely, if $B_+$ is a boundary CFT, then its boundary net $B^{\mathrm{gen}}$ induces the dual net $B_+^{\mathrm{dual}}$ associated with $B_+$, i.e. $(B^{\mathrm{gen}})_+^{\mathrm{ind}} = B_+^{\mathrm{dual}}$. In other words, we have: $\mathrm{gen} \circ \mathrm{ind} = \mathrm{id}$, $\mathrm{ind} \circ \mathrm{gen} = \mathrm{dual}$, implying $\mathrm{dual} \circ \mathrm{ind} = \mathrm{ind}$ and $\mathrm{dual} \circ \mathrm{dual} = \mathrm{dual}$.

(iii) Every induced net $B_+^{\mathrm{ind}}$ is self-dual (Haag dual).

Proof. (i) $B_+^{\mathrm{ind}}$ contains $\pi(A_+)$ and is local by definition. The covariance properties (2.6) and (2.7) follow from covariance of the chiral net $B$. Joint irreducibility is automatic because $\pi(A(I)) \subset B(I)$ has finite index by virtue of irreducibility of $\pi(A(I)) \subset B(I)$ and complete rationality [28], which implies that $U(g)$ belongs to the von Neumann algebra $\pi(A)''$ generated by all $\pi(A(I))$, cf. Comment (4) after Definition 2.2.

(ii) $(B_+^{\mathrm{ind}})^{\mathrm{gen}}(L)$ is generated by the algebras $B(L) \cap B(K)'$ as $K$ varies within $L$, so its commutant is the intersection of the algebras $B(L)' \vee B(K)$ as $K$ varies. For any fixed $K_0 \subset L$, by the split property for the net $B$ (Proposition 3.6), $B(L)' \vee B(K_0)$ is naturally isomorphic to $B(L)' \otimes B(K_0)$. Now, as $K$ varies within $K_0$, the intersection $\bigcap_{K \subset K_0} B(K)$ is trivial (this follows from “triviality at a point”, Proposition 3.2(ii)), hence $\bigcap_{K \subset K_0} B(L)' \otimes B(K) = B(L)' \otimes \mathbb{C}1$. It follows that $\bigcap_{K \subset K_0} B(L)' \vee B(K)$ equals $B(L)' \vee \mathbb{C}1 = B(L)'$, and $(B_+^{\mathrm{ind}})^{\mathrm{gen}}(L) = \bigvee_K B(L) \cap B(K)' = B(L)$.

Conversely, if $B_+$ is given, then by definition
$$B^{\mathrm{gen}}(K) = B_+(O_<) \qquad (2.15)$$
since $O_<$ is the left wedge spanned by $K$. We shall show next (Proposition 2.10) that boundary CFT nets satisfy wedge duality. Hence, because the right wedge $O_>$ is the causal complement of the left wedge spanned by $L$, we have also
$$B_+(O_>) = B^{\mathrm{gen}}(L)'. \qquad (2.16)$$
This implies $B_+^{\mathrm{dual}}(O) = B^{\mathrm{gen}}(L) \cap B^{\mathrm{gen}}(K)' = (B^{\mathrm{gen}})_+^{\mathrm{ind}}(O)$.

(iii) is obvious from (ii).
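The two implications stated in Proposition 2.9(ii) are a purely formal consequence of the two basic relations. As a short worked derivation, treating gen, ind and dual as maps between boundary CFTs and chiral extensions and using only $\mathrm{gen} \circ \mathrm{ind} = \mathrm{id}$ and $\mathrm{ind} \circ \mathrm{gen} = \mathrm{dual}$:

```latex
\begin{align*}
\mathrm{dual} \circ \mathrm{ind}
  &= (\mathrm{ind} \circ \mathrm{gen}) \circ \mathrm{ind}
   = \mathrm{ind} \circ (\mathrm{gen} \circ \mathrm{ind})
   = \mathrm{ind} \circ \mathrm{id}
   = \mathrm{ind}, \\
\mathrm{dual} \circ \mathrm{dual}
  &= \mathrm{dual} \circ (\mathrm{ind} \circ \mathrm{gen})
   = (\mathrm{dual} \circ \mathrm{ind}) \circ \mathrm{gen}
   = \mathrm{ind} \circ \mathrm{gen}
   = \mathrm{dual}.
\end{align*}
```

The same bookkeeping gives $\mathrm{gen} \circ \mathrm{dual} = (\mathrm{gen} \circ \mathrm{ind}) \circ \mathrm{gen} = \mathrm{gen}$, i.e. a boundary CFT and its dual net generate the same boundary net.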
As the examples (Definition 2.1) of the trivial BCFT and its dual show, there is no bijection between boundary CFT’s and their boundary nets; but Proposition 2.9(ii) means that there is a bijection between Haag-dual boundary CFT’s and their boundary nets. Yet, the non-Haag-dual boundary CFT’s being subtheories of the Haag-dual ones, the previous results show that a classification of boundary CFT’s essentially reduces to a classification of (non-local) chiral extensions. The following facts (some of which anticipate results from Sec. 5) provide some non-trivial examples for the results in this section.
The chiral theory $A$ of the stress-energy tensor with $c = \frac{1}{2}$ has one non-trivial chiral extension $B$, the CAR algebra of a chiral real Fermi field $\psi$ on $H_B = H_0 \oplus H_{1/2}$. The local algebras $A_+^{\mathrm{dual}}(O)$ of the dual net are generated by $A_+(O)$, the operators $(\psi(f)\psi(g))|_{H_0}$ with $\mathrm{supp}\, f \subset I$, $\mathrm{supp}\, g \subset J$, and the field $\phi_0$ (Eq. (1.11)) smeared within $O$. (In Sec. 5 it will become clear that this characterization is equivalent to the one given in Remark (3) following Definition 2.1.) On the other hand, the local algebras $B_+^{\mathrm{ind}}(O)$ are generated by $\pi(A_+(O))$, $\psi(f)\psi(g)$ and the field $\phi_1$ (Eq. (1.12)). The subtheory on $H_0$ with local algebras $B_+(O) = (\mathrm{CAR}(I) \otimes \mathrm{CAR}(J))^{\mathrm{even}}$ generated by $\pi(A_+)$ and $\psi(f)\psi(g)$ is not Haag dual. While the field $\phi_0$ may well be “lifted” to $H_{1/2}$, it is impossible to do so such that it also locally commutes with $\phi_1$.

In Sec. 4, we shall show how to “compute” the intersection $B_+^{\mathrm{ind}}(O) = B(L) \cap B(K)'$ from the non-local chiral extension $B$ of $A$ in the general case (in terms of the DHR category of the local chiral net $A$), and to obtain algebraic invariants for the inclusions $\pi(A_+(O)) \subset B_+^{\mathrm{ind}}(O)$ from the chiral subfactors $\pi(A(I)) \subset B(I)$.

2.4. General results: duality and split property for wedges

We proceed with several general structure results about boundary CFT’s. In the sequel, for $W$ a left wedge, $\Delta_W^{is}$, $J_W$ are the modular data associated with the von Neumann algebra $B_+(W)$ and the vacuum state [43, Chap. VI, Theorem 1.19], and $\Lambda_W(s)$ is the one-parameter subgroup of the (covered) Möbius group $G$ preserving $W$, defined as follows: let $\lambda(s) : u \mapsto \exp(-s)u$ be the scale transformations. Then $\Lambda_W(s) := g\lambda(s)g^{-1}$ where $g \in PSL(2, \mathbb{R})$ maps $(0, \infty)$ to the interval $I$ of the real line which spans $W$. We denote by $r_W$ the inversion which maps $I$ to its complement on the circle, i.e. $r_W = hrh^{-1}$ where $r$ is the inversion $u \mapsto 1/u$, and $h \in G$ maps $(-1, 1)$ to $I$.
We denote by the same symbol the (densely defined) transformations of Minkowski space or the half-space $M_+$, induced by acting simultaneously on $t + x$ and $t - x$. Finally, $\Gamma$ is the group generated by $r$ and $G$. Then we prove

Proposition 2.10. (i) Every boundary CFT $B_+$ satisfies wedge duality
$$B_+(W') = B_+(W)' \qquad (2.17)$$
where $W$ is a left wedge and $W'$ its causal complement.

(ii) Every boundary CFT $B_+$ has the Bisognano–Wichmann property:
$$\Delta_W^{is} = U(\Lambda_W(-2\pi s)) \qquad (2.18)$$
for every left wedge $W$. There exists an (anti-)unitary representation $\tilde U$ of the group $\Gamma$ on $H_B$ extending the representation $U$ of $G$ such that
$$J_W = \tilde U(r_W). \qquad (2.19)$$
In particular, $J_W \tilde U(g) J_W = \tilde U(r_W g r_W)$ ($g \in \Gamma$) and
$$J_W B_+^{\mathrm{dual}}(O) J_W = B_+^{\mathrm{dual}}(r_W O) \qquad (2.20)$$
whenever $O$ and $r_W O$ are double-cones within $M_+$.
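The condition that both $O$ and $r_W O$ lie within $M_+$ is restrictive. The following minimal numerical sketch assumes $W$ is the left wedge spanned by $(-1, 1)$, so that $r_W = r : u \mapsto 1/u$ acts simultaneously on the light-cone coordinates $u = t + x$ and $v = t - x$. The function names are hypothetical. It compares this action with the closed form $(t, x) \mapsto (t/(t^2 - x^2),\, -x/(t^2 - x^2))$ and shows which points of $M_+$ are mapped back into $M_+$.

```python
import math


def ray_inversion(t, x):
    """Apply u -> 1/u to both u = t + x and v = t - x, then return (t, x)."""
    u, v = t + x, t - x
    if u == 0 or v == 0:
        raise ZeroDivisionError("singular on the light cone t^2 = x^2")
    u_new, v_new = 1.0 / u, 1.0 / v
    return (u_new + v_new) / 2.0, (u_new - v_new) / 2.0


def ray_inversion_closed_form(t, x):
    """Closed form (t, x) -> (t / (t^2 - x^2), -x / (t^2 - x^2))."""
    s = t * t - x * x
    if s == 0:
        raise ZeroDivisionError("singular on the light cone t^2 = x^2")
    return t / s, -x / s


def stays_in_half_space(t, x):
    """True if the image of (t, x) has positive spatial coordinate."""
    return ray_inversion(t, x)[1] > 0


if __name__ == "__main__":
    inside = (0.5, 2.0)   # x > |t|
    outside = (2.0, 0.5)  # 0 < x < |t|
    print(stays_in_half_space(*inside), stays_in_half_space(*outside))
```

For a point with $x > |t|$ one has $t^2 - x^2 < 0$, so the image has spatial coordinate $-x/(t^2 - x^2) > 0$; for $0 < x < |t|$ the sign flips and the image leaves $M_+$.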
Proof. The first part (2.18) of (ii) is proven in Proposition 3.5(ii) where $B(I) \equiv B_+(W)$, $A(I) \equiv A_+(W)$, using Proposition 2.5(iii).

Turning to (i), we note that wedge duality holds for $A_+$, because it is equivalent to Haag duality on the real line for $A$, which is in turn equivalent to strong additivity. Let $W$ be a left wedge and $W'$ its causal complement. Consider the inclusions
$$\pi(A_+(W')) \subset B_+(W') \subset B_+(W)'. \qquad (2.21)$$
The subfactor
$$\pi(A_+(W')) = J_W \pi(A_+(W)) J_W \subset J_W B_+(W) J_W = B_+(W)' \qquad (2.22)$$
is irreducible with finite index, because $\pi(A_+(W)) = \pi(A(I)) \subset B^{\mathrm{gen}}(I) = B_+(W)$ is irreducible with finite index. Clearly, $B_+(W)'$ is globally stable under $\mathrm{Ad}\, U(\Lambda_W(s))$, and the same is true for $A_+(W')$ by strong additivity of $A$. Due to the rigidity of intermediate subfactors in subfactors with finite index [32], the intermediate algebra $B_+(W')$ in (2.21) must also be globally stable under $\mathrm{Ad}\, U(\Lambda_W(s))$. Thus, $B_+(W')$ is a von Neumann subalgebra of $B_+(W)'$ cyclic on the vacuum, which, thanks to (2.18), is in addition invariant under the modular automorphisms of $B_+(W)'$. By modular theory [43, Chap. IX, Theorem 4.2], this algebra must coincide with $B_+(W)'$. This proves wedge duality.

Turning to the second part of (ii), we infer from Proposition 3.2(iv) that $J_W U(g) J_W = U(r_W g r_W)$ ($g \in G$), thus (2.19) defines a representation. Furthermore, $\mathrm{Ad}\, J_W$ acts covariantly on interval algebras: $\mathrm{Ad}\, J_W B^{\mathrm{gen}}(K) = B^{\mathrm{gen}}(r_W K)$, hence on left wedge algebras: $\mathrm{Ad}\, J_W B_+(W_L) = B_+(r_W W_L)$, hence on right wedges by wedge duality: $\mathrm{Ad}\, J_W B_+(W_R) = B_+(r_W W_R)$. As $B_+^{\mathrm{dual}}(O)$ is defined as an intersection of wedge algebras, it follows that $\mathrm{Ad}\, J_W$ acts covariantly on the dual net as stated in (2.20).

For example, if $W$ is spanned by the interval $(-1, 1)$, then $r_W : u \mapsto 1/u$ induces the ray inversion $(t, x) \mapsto \left(\frac{t}{t^2 - x^2},\, -\frac{x}{t^2 - x^2}\right)$. This map maps only the region $x > |t|$ of $M_+$ into $M_+$. Thus, (2.20) makes sense only for double-cones $I \times J$ such that $I \subset \mathbb{R}_+$ and $J \subset \mathbb{R}_-$. On other double-cones, $J_W$ acts non-geometrically.

Remark. If $B_+$ is not Haag dual, one may consistently define $\bar B_+(O)$ by $J_W B_+(r_W O) J_W$ (choosing $W$ such that $r_W O$ belongs to $M_+$). This defines another BCFT intermediate between $A_+(O)$ and $B_+^{\mathrm{dual}}(O)$.

Proposition 2.11. Every boundary CFT $B_+$ satisfies the split property for wedges. That is, if $O$ is a double-cone within $M_+$ and $W_L = O_<$ and $W_R = O_>$ the associated pair of left and right wedges, then the inclusion $B_+^{\mathrm{ind}}(W_L) \subset B_+^{\mathrm{ind}}(W_R)'$ is split, or equivalently $B_+^{\mathrm{ind}}(W_L) \vee B_+^{\mathrm{ind}}(W_R)$ is naturally isomorphic to the tensor product $B_+^{\mathrm{ind}}(W_L) \otimes B_+^{\mathrm{ind}}(W_R)$. In particular, this implies the split property for double-cones $O_1$, $O_2$ whenever $O_1 \subset O_<$ and $O_2 \subset O_>$, i.e. $B_+^{\mathrm{ind}}(O_1) \vee B_+^{\mathrm{ind}}(O_2)$ is naturally isomorphic to the tensor product $B_+^{\mathrm{ind}}(O_1) \otimes B_+^{\mathrm{ind}}(O_2)$.
Proof. The inclusion $A \subset B^{\mathrm{gen}}$ has finite index (Proposition 2.5(iii)). Thus $B^{\mathrm{gen}}$ is split by Proposition 3.6, i.e. the inclusion $B^{\mathrm{gen}}(K) \subset B^{\mathrm{gen}}(L)$ is split. Now by definition, $B_+(W_L) = B^{\mathrm{gen}}(K)$, and by wedge duality (Proposition 2.10(i)), $B_+(W_R)' = B^{\mathrm{gen}}(L)$. This proves the claim.

Proposition 2.12. Let $B$ be a chiral extension of $A$, and $B_+^{\mathrm{ind}}$ the induced BCFT net. Then

(i) The index of $\pi(A_+(O)) \subset B_+^{\mathrm{ind}}(O)$ equals the µ-index $\mu_A$ of $A$ (i.e. the index of the two-interval subfactor $\mu_A = [A(L) \cap A(K)' : A(I) \vee A(J)] = [A_+^{\mathrm{dual}}(O) : A_+(O)]$ which coincides with the dimension of the DHR category of $A$ [29]; in particular, it is independent of $O$). This index is thus the same for each chiral extension.

(ii) The induced net $B_+^{\mathrm{ind}}$ satisfies strong additivity.
Proof. (i) Let $\lambda = [B(I) : \pi(A(I))]$ denote the index of the chiral extension. $\lambda$ is independent of $I$ and finite (Proposition 2.5(iii)). We want to compute the index of $\pi(A(I) \vee A(J)) \subset B(K)' \cap B(L)$, which equals the index of the commutant $\pi(A(I))' \cap \pi(A(J))' \supset B(K) \vee B(L)'$. Using the notation
$$N_1 \stackrel{\alpha}{\subset} N_2 \qquad (2.23)$$
to indicate that a subfactor $N_1 \subset N_2$ has index $[N_2 : N_1] = \alpha$, we shall prove the indices displayed in the square of inclusions
$$\begin{array}{ccc}
\pi(A(I))' \cap \pi(A(J))' & \supset & B(K) \vee B(L)' \\
\cup\ {\scriptstyle \lambda^2 \mu_A} & & \cup\ {\scriptstyle \lambda} \\
\pi(A(K)) \vee \pi(A(L')) & \stackrel{\lambda}{\subset} & \pi(A(K)) \vee B(L)'
\end{array} \qquad (2.24)$$
from which the desired index of the subfactor in the top row follows to be $\mu_A$ by the multiplicativity of the index, as claimed in the statement.

The inclusion in the left column is the two-interval subfactor of the chiral net $A$ in the representation $\pi$ on $H_B$, whose index has been computed in [29, Lemma 42] as follows: $\pi$ is unitarily equivalent to a DHR endomorphism $\theta$ of $A$ in its vacuum representation [33] (see also Sec. 4), where $\theta$ has dimension $d(\theta) = \lambda$; thus we may as well consider the subfactor $\theta(A(K) \vee A(L')) \subset \theta(A(I) \vee A(J))'$ on $H_0$. Choosing $\theta$ to be localized in $K$, we get $\theta(A(K)) \vee A(L') \subset A(K) \vee A(L') \subset (A(I) \vee A(J))'$. The former inclusion has index $d(\theta)^2 = \lambda^2$, and the latter is the two-interval subfactor of index $\mu_A$. Thus, the index in the left column equals $\lambda^2 \mu_A$.

The indices in the right column and bottom row equal $[B(K) : \pi(A(K))]$ and $[B(L)' : \pi(A(L'))]$, respectively, by the split property for $B$ (Proposition 3.6). The former is $\lambda$ by definition, while the latter equals $\lambda$ because in $\pi(A(L')) \subset B(L)' \subset \pi(A(L))'$ the second inclusion has index $\lambda$ by definition, while the total inclusion is the one-interval subfactor of $A$ in the representation $\pi$ which has dimension $\lambda^2$ [29, Lemma 42].
Thus the index in the top row equals $\mu_A$. The statement (ii) now follows exactly as in [32, Lemma 23].

2.5. Superselection structure of boundary CFT

In the remainder of this section, we discuss DHR sectors (= superselection sectors in the sense of [12]) for Haag-dual boundary CFTs. Müger has shown [36] that in Minkowski space-time, the split property for wedges implies the absence of nontrivial sectors. We obtain here a similar result on the half-space.

A DHR sector of a boundary CFT $B_+$ is defined as an equivalence class of positive-energy representations $\pi$ subject to the selection criterion that $\pi$ cannot be distinguished from the (defining) vacuum representation by measurements within the causal complement of any double-cone $O$ within $M_+$. Assuming Haag duality for $B_+$, by standard arguments [12] one finds that superselection sectors can be represented by localized and transportable endomorphisms (DHR endomorphisms) $\rho$ of the net $B_+$. This means that for any given double-cone $O$, $\rho$ can be chosen within its unitary equivalence class to act like the identity map on the algebra of the causal complement $B_+(O') = B_+(O_<) \vee B_+(O_>)$, and the unitary charge transporters which intertwine equivalent such endomorphisms localized in different regions belong to $B_+$.$^{g}$

It is obvious that every DHR endomorphism of $B_+$ defines a localized and transportable endomorphism (in the obvious sense) of the boundary net $B^{\mathrm{gen}}$; but the converse is not true. For example, the DHR endomorphisms $\rho$ of the chiral net $A$ (which is the boundary net generated by $A_+^{\mathrm{dual}}$) localized in, say, the interval $K$ act non-trivially on the charge transporters “across” $K$ [16]. However, such charge transporters do belong to $A_+^{\mathrm{dual}}(O_>)$, so $\rho$ is not localized as an endomorphism of $A_+^{\mathrm{dual}}$. Indeed, the following result shows that the dual net $A_+^{\mathrm{dual}}$, and more generally any Haag-dual boundary CFT, does not possess any nontrivial DHR sectors at all.
$^{g}$ Unlike double-cones in Minkowski space-time, any given pair of double-cones $O_1$, $O_2$ within $M_+$ is not always contained in another double-cone within $M_+$. However, given $O_1$ and $O_2$, one can choose an auxiliary $O_3$ such that $O_1 \cup O_3$ and $O_2 \cup O_3$ are each contained in some double-cone within $M_+$. The charge transporter from $O_1$ to $O_2$ may then be obtained as a composition of two charge transporters from $O_1$ to $O_3$ and from $O_3$ to $O_2$.

Let $E = O_1 \cup O_2$ be the union of two causally disjoint double-cones within $M_+$ which do not touch (i.e. whose closures are disjoint); we may assume that $O_1$ belongs to the left causal complement of $O_2$, $O_1 < O_2$. Then the causal complement $E'$ of $E$ is the union of the left wedge $W_L = O_{1<}$, the right wedge $W_R = O_{2>}$ and a double-cone $O = O_{1>} \cap O_{2<}$. We consider the inclusion $B_+(E) \subset B_+(E')'$, where $B_+(E)$ and $B_+(E')$ are defined by additivity.

Proposition 2.13. If $B_+ \supset A_+$ is a boundary CFT net, then the index $\mu_{B_+}$ of the inclusion
$$B_+(E) \subset B_+(E')' \qquad (2.25)$$
is independent of $E$ and equals
$$\mu_{B_+} = [B_+^{\mathrm{dual}} : B_+]^3 = \left(\frac{\mu_A}{[B_+ : \pi(A_+)]}\right)^3 \qquad (2.26)$$
where the indices of the extensions $[B_+^{\mathrm{dual}} : B_+] := [B_+^{\mathrm{dual}}(O) : B_+(O)]$ and $[B_+ : \pi(A_+)] := [B_+(O) : \pi(A_+(O))]$ are independent of $O$. In particular, $\mu_{A_+} = \mu_A^3$.
Corollary 2.14. (i) When $B_+$ is Haag dual, then $\mu_{B_+} = 1$, and $B_+$ satisfies Haag duality also for disconnected regions of the form $E = O_1 \cup O_2$ as above (i.e. (2.25) is an equality).

(ii) A Haag-dual boundary CFT net $B_+$ has no nontrivial DHR sectors.

(iii) When $B_+$ is not Haag dual, then $B_+^{\mathrm{dual}}$ is a field net for $B_+$ in the sense of [12], i.e. for every sector of $B_+$ represented by a DHR endomorphism $\rho$, there is a nontrivial operator in $B_+^{\mathrm{dual}}$ which intertwines $\rho$ with the identity.

Proof of Proposition 2.13. The independence of the indices on the various regions (of a given topology) follows as in [29, 33]. We denote by $B$ the induced boundary net, and write $[B : A] =: \lambda$ and $[B_+ : \pi(A_+)] =: \lambda_+$. We shall show that
$$\begin{array}{ccc}
B_+(E) & \stackrel{\mu_{B_+}}{\subset} & B_+(E')' \\
\cup\ {\scriptstyle \lambda_+^2} & & \cap\ {\scriptstyle \lambda^2 \lambda_+} \\
\pi(A_+(E)) & \stackrel{\lambda^2 \mu_A^3}{\subset} & \pi(A_+(E'))'
\end{array} \qquad (2.27)$$
which implies
$$\mu_{B_+} = \left(\frac{\mu_A}{\lambda_+}\right)^3 \qquad (2.28)$$
by multiplicativity of the index.

Bottom row of (2.27): $A_+(E) = A(J_2) \vee A(J_1) \vee A(I_1) \vee A(I_2)$ is a four-interval algebra of the chiral net $A$, and so is $A_+(E')$ by strong additivity of $A$. Thus we have the four-interval subfactor in the representation $\pi$ whose index is computed, as in Proposition 2.12, with the help of [29, Lemma 42] to be $d(\theta)^2 \mu_A^3 = \lambda^2 \mu_A^3$.

Left column of (2.27): $[B_+(E) : \pi(A_+(E))] = \lambda_+^2$. We have
$$B_+(E) = B_+(O_1) \vee B_+(O_2) \supset \pi(A_+(E)) = \pi(A_+(O_1)) \vee \pi(A_+(O_2)) \qquad (2.29)$$
so, by the split property for $B_+$ (Proposition 2.11), $[B_+(E) : \pi(A_+(E))] = [B_+(O_1) : \pi(A_+(O_1))] \cdot [B_+(O_2) : \pi(A_+(O_2))]$ where each factor equals $\lambda_+$.

Right column of (2.27): $[B_+(E') : \pi(A_+(E'))] = \lambda^2 \lambda_+$. The computation is analogous to the previous one, but here $E' = W_L \cup O \cup W_R$ has 3 connected components. The double-cone contributes a factor $\lambda_+$ as before, while the two wedges contribute a factor $[B_+(W) : \pi(A_+(W))] = [B(I) : \pi(A(I))] = \lambda$ each.

This proves the various indices in (2.27) and hence the formula (2.28). By Proposition 2.12, $\mu_A = [B_+^{\mathrm{dual}}(O) : \pi(A_+(O))] = [B_+^{\mathrm{dual}}(O) : B_+(O)][B_+(O) : \pi(A_+(O))]$ gives $[B_+^{\mathrm{dual}}(O) : B_+(O)] = \mu_A / \lambda_+$. This proves (2.26).
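The index computations in the proofs of Propositions 2.12 and 2.13 both use one step: in a square of inclusions, the index of the composite inclusion is the product of the indices along either path. A minimal sketch of this arithmetic follows; the function names are hypothetical. The sample values $\mu_A = 4$ and $\lambda = 2$ are assumptions for illustration, chosen to match the $c = \frac{1}{2}$ example (three sectors of dimensions $1, 1, \sqrt{2}$, and the Fermi field extension on $H_0 \oplus H_{1/2}$). They are not taken from the proofs above.

```python
from fractions import Fraction


def missing_index(total, *known):
    """Solve total = (product of known indices) * missing for the missing index."""
    prod = Fraction(1)
    for k in known:
        prod *= Fraction(k)
    return Fraction(total) / prod


def top_row_of_2_24(lam, mu_A):
    """Square (2.24): left column lam^2 * mu_A = bottom row lam * right column lam * top row."""
    return missing_index(lam**2 * mu_A, lam, lam)


def mu_index_of_bcft(lam, lam_plus, mu_A):
    """Square (2.27): bottom row lam^2 * mu_A^3
    = left column lam_plus^2 * top row * right column lam^2 * lam_plus."""
    return missing_index(lam**2 * mu_A**3, lam_plus**2, lam**2 * lam_plus)


if __name__ == "__main__":
    mu_A = 4  # assumed sample value (illustration only)
    # Induced net of an extension with lam = 2: double-cone index equals mu_A.
    print(top_row_of_2_24(2, mu_A))
    # Trivial BCFT A_+ (lam = lam_plus = 1).
    print(mu_index_of_bcft(1, 1, mu_A))
    # Haag-dual case (lam_plus = mu_A), here for lam = 2.
    print(mu_index_of_bcft(2, mu_A, mu_A))
```

In the second square the factor $\lambda^2$ cancels between the bottom row and the right column, which is why (2.28) depends only on $\mu_A$ and $\lambda_+$. Using `Fraction` keeps the arithmetic exact for rational inputs.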
Proof of Corollary 2.14. The statement (i) is obvious from the proposition. The proof for the absence of non-trivial sectors in the Haag-dual case is exactly as (iii) ⇒ (ii) in [29, Corollary 32] (see footnote h), using Haag duality of B+ for disconnected regions of the form E. If u : ρ1 → ρ2 is a unitary intertwiner from ρ1 localized in O1 to ρ2 localized in O2, then u belongs to $B_+(E')' = B_+(E)$. Thanks to the split property for B+ (Proposition 2.11), there is a conditional expectation E : B+(E) = B+(O1) ∨ B+(O2) → B+(O1) such that E(u) ≠ 0. This is a non-trivial local intertwiner from $\rho_1|_{B_+(O_1)}$ to $\mathrm{id}|_{B_+(O_1)}$, hence a global intertwiner from ρ1 to id thanks to strong additivity for B+ (Proposition 2.12). Thus every sector contains the identity sector, which implies the claim. Similarly, when B+ is not Haag dual, and ρi are a pair of equivalent DHR endomorphisms of B+ as before, then the charge transporter u : ρ1 → ρ2 belongs to $B_+^{\rm dual}(E)$, and E(u) : ρ1 → id belongs to $B_+^{\rm dual}(O_1)$, as asserted.

Remark. To prevent misconceptions of the statement (ii) of Corollary 2.14, it should be pointed out that a Haag-dual BCFT can well have non-trivial positive-energy representations; e.g., every positive-energy representation of A defines a positive-energy representation of $A_+^{\rm dual}$. But these representations are not localized in double-cones as required for DHR representations.

3. Non-Local Chiral CFT

This section reviews and generalizes known structural theorems about non-local chiral CFT, and also contains several new results. While the section is logically independent of BCFT, its results bear important implications for BCFT. They are freely used in other sections.

3.1. General structure: Covariance and modular symmetry

In [1], the covariant transformation law for a non-local chiral CFT
$$U(g)\,B(I)\,U(g)^* = B(gI), \qquad (g \in G) \tag{3.1}$$
was assumed to hold globally, i.e. for every interval of the circle and without restriction on g ∈ G. It was shown that this implies the rotation of the circle by 4π to be represented by U(4π) = 1, hence the conformal Hamiltonian L0 has half-integer spectrum and the net B is (at least weakly) graded local. In our setting, this restriction is too narrow. Depending on the spectrum of L0 on $H_B$, (3.1) holds for the induced net only locally as indicated in (2.12), admitting “more non-local” induced boundary CFTs than graded local ones. In order to generalize the analysis in [1], we first note:

Footnote h: There is some unfortunate misnumbering of the implications proven in [29, Corollary 32]. (i) ⇒ (ii) should read (iii) ⇒ (ii), and (ii) ⇒ (iii) should read (i) ⇒ (iii), while (iii) ⇒ (i) is trivial.
Lemma 3.1. Let I ↦ B(I) be a net of von Neumann algebras defined on the intervals I ⊂ R, and U a representation of $G = \widetilde{PSL(2,\mathbb{R})}$ on the same Hilbert space such that (3.1) holds whenever I and gI belong to R. Then, identifying R with $S^1 \setminus \{-1\}$ by means of a Cayley transformation, B extends to a net defined on the intervals of the (universal) covering S of $S^1$ for which (3.1) holds globally.

Sketch of the Proof. Use (3.1) as a definition of the algebra B(gI) on the right-hand side whenever the conditions on I and g are not met, i.e. whenever gI belongs to S but not to the natural embedding of R into S. Validity of (3.1) in the restricted sense ensures that this definition is consistent.

Depending on the theory, the resulting net may enjoy a periodicity of the form B(I + N · 2π) = B(I) for some N ∈ N, in which case it may as well be considered as a net on the N-fold covering of $S^1$. For example, N = 1 if B is local, and N = 2 if it is Z2-graded local. To the theory on the covering, the analysis of [20, 1] may be applied, giving the same conclusions except those which assume that the rotation by 2π takes an interval into itself and hence Ad U(2π) is an automorphism of B(I); e.g., the above-mentioned triviality of U(4π). Thus, we have

Proposition 3.2 ([20, 1]). Let I ↦ B(I) be a chiral net defined on a covering S of the circle, satisfying the standard assumptions: B(I) are von Neumann algebras on a Hilbert space H, I1 ⊂ I2 implies B(I1) ⊂ B(I2), there is a unitary representation U of $G = \widetilde{PSL(2,\mathbb{R})}$ on H such that (3.1) holds globally on S, the rotation subgroup has a positive generator, and there is a U-invariant vector Ω ∈ H (the vacuum) cyclic for $\bigvee_I B(I)$ and separating for $\bigcap_I B(I)$. Then one has
(i) Reeh–Schlieder property: Ω is cyclic and separating for each B(I).
(ii) Irreducibility and Triviality at a point: $\bigvee_I B(I) = B(H)$ and $\bigcap_I B(I) = \bigcap_{I \ni x} B(I) = \mathbb{C}1$.
(iii) Additivity and Continuity: If I and $I_k$ are open intervals such that $I \subset \bigcup_k I_k$, then $B(I) \subset \bigvee_k B(I_k)$, and if $\bar I$ denotes the closure of I and $\bar I \supset \bigcap_k I_k$, then $B(\bar I) \supset \bigcap_k B(I_k)$.
(iv) Modular Covariance: For any interval I ⊂ S, the modular automorphisms $\mathrm{Ad}\,\Delta_I^{it}$ of the von Neumann algebra B(I) with respect to the vacuum vector Ω [43, Chap. IV, Theorem 1.19] act geometrically by
$$\Delta_I^{it}\, B(J)\, \Delta_I^{-it} = B(\Lambda_I(-2\pi t)J), \qquad (J \subset S,\ t \in \mathbb{R}) \tag{3.2}$$
where $\Lambda_I$ is the I-preserving one-parameter subgroup of G which is conjugate to the scale transformations of R. Moreover, if r is the reflection of S induced by x ↦ −x, and Γ the group generated by G and r, then U extends to an (anti-)unitary representation of Γ by setting $U(r_I) = J_I$, where $J_I$ is the modular conjugation of (B(I), Ω) and $r_I$ is the unique reflection in Γ conjugate to r which has the boundary points of I as fix-points.
(v) The unitaries
$$z(t) := U(\Lambda_I(2\pi t))\,\Delta_I^{it} \tag{3.3}$$
do not depend on I and form a one-parameter group in the center of the gauge group (see footnote i).
(vi) Bisognano–Wichmann property: Provided B is local or Z2-graded local (fermionic) (see footnote j), then the central cocycle z(t) in (v) is trivial:
$$z(t) = 1, \quad\text{i.e.}\quad \Delta_I^{it} = U(\Lambda_I(-2\pi t)). \tag{3.4}$$
(vii) If z(t) = 1, then one has the following equivalences: CΩ are the only U-invariant vectors ⇔ B(I) are factors ⇔ B is irreducible, i.e. $\bigvee_I B(I) = B(H)$ ⇔ $\bigcap_I B(I) = \mathbb{C}1$. In this case, if B(I) ≠ C1, the factors are of type $\mathrm{III}_1$.
Proof. As in [20, 1]. (ii) is proved by various instances of the subsequent Proposition 3.3: Choosing $M = \bigvee_I B(I)$ and U the subgroup of translations, gives irreducibility. Choosing $M = \bigvee_I B(I)'$ and U the translations, gives $\bigcap_I B(I) = \mathbb{C}\cdot 1$. Choosing $M = M_x \equiv \bigvee_{I \ni x} B(I)$ and U the subgroup of special conformal transformations preserving the point x, gives triviality at the point. (Note that every vector which is invariant under the time translations or under the special transformations is automatically also invariant under the full conformal group, and hence is a multiple of Ω. Note also that isotony ensures invariance of $M_x$ under the special conformal transformations although their action does not preserve R.) We have used
Proposition 3.3. Let M be a von Neumann algebra on a Hilbert space H, v a cyclic vector and U a one-parameter unitary group implementing automorphisms of M. If U has a positive generator, and v is the unique U-invariant vector, then M = B(H).

Proof. Let E denote the projection on C · v. Because E is one-dimensional and v is cyclic, the algebra E ∨ M generated by E and M contains every one-dimensional projection, hence coincides with B(H). On the other hand, by positivity of the generator [8], the spectrum condition implies that U belongs to M, and consequently its spectral projection E also belongs to M. Hence E ∨ M equals M.

Footnote i: The gauge group consists of all unitaries V on $H_B$ such that V Ω = Ω and $V B(I) V^* = B(I)$ for all I.
Footnote j: This assumption will be substantially relaxed in the next subsection (Proposition 3.5).

For later use, we record a simple fact.

Proposition 3.4. If I ↦ B(I) is a non-local net, then a net I ↦ C(I) ⊂ B(I) relatively local with respect to B is local. There is a unique maximal such net. The
maximal net is covariant under the covariance group of B, and its local algebras C(I) are globally stable under the gauge group of B.

Proof. Locality of C is obvious. Existence and uniqueness of the maximal net hold because any two nets C1 and C2 within B and relatively local with respect to B generate C1 ∨ C2 with the same properties. The stability and covariance statements follow from uniqueness.

3.2. Non-local extensions

In this subsection we assume that the chiral net I ↦ B(I) contains a covariant net of subfactors I ↦ π(A(I)) ⊂ B(I) which is relatively local with respect to B (in particular, A is local). Then we have

Proposition 3.5. (i) There is a family of vacuum-preserving conditional expectations $E^I : B(I) \to A(I)$.
(ii) If the local subfactors π(A(I)) ⊂ B(I) are irreducible with finite index, then the central cocycle (3.3) is trivial, z(t) = 1, i.e. the Bisognano–Wichmann property (3.4) holds.
(iii) If z(t) = 1, then the family of local conditional expectations is consistent, i.e. $E^I$ restricts to $E^{\hat I}$ on $B(\hat I)$ whenever $\hat I \subset I$, so that there is a global vacuum-preserving conditional expectation E : B → A which maps B(I) onto A(I). E is implemented by the projection onto the cyclic subspace π(A)Ω.

Proof. (i) Consider the maximal net I ↦ C(I) given by Proposition 3.4. By Propositions 3.2(v) and 3.4, C(I) is globally stable under the modular automorphism group [43, Chap. VI, Theorem 1.19] associated with B(I) and the vacuum. By Takesaki’s Theorem [43, Chap. IX, Theorem 4.2], there exists a vacuum-preserving conditional expectation of B(I) onto C(I). On the other hand, because C is local, we may apply [1] or Proposition 3.2(vi) to the net C to conclude that z(t) is trivial on the cyclic subspace of C.
Because U restricts to the covariance representation of C which in turn restricts to that of A, A(I) is globally stable under the modular automorphism group (3.4) associated with C(I) and the vacuum, so there is a vacuum-preserving conditional expectation of C(I) onto A(I). Composition of the two expectations gives an expectation $E^I$ of B(I) onto A(I).

(ii) By Proposition 3.2, the central cocycle z(s) given by (3.3) is a vacuum-preserving unitary one-parameter group on $H_B$ whose adjoint action globally preserves B(I). We have to show that z(s) is trivial. Because there is a vacuum-preserving conditional expectation of B(I) onto A(I), the modular automorphisms of B(I) restrict [43] to the modular automorphisms of A(I) and the vacuum. Because A is local, z(s) is trivial on the cyclic subspace of A(I) (the vacuum subrepresentation of A in $\pi_B$). Hence Ad z(s) is a one-parameter group of automorphisms of B(I) acting trivially on π(A(I)). Thus, its fix-point
subalgebra $B(I)^z$ is intermediate between π(A(I)) and B(I), and the index $[B(I) : B(I)^z]$ is finite because [B(I) : π(A(I))] is finite. But the fix-point index is the order of the quotient group of R by the subgroup which acts trivially on B(I). This number can be either 1 or ∞. The latter being excluded, the fix-point subalgebra must be all of B(I). Hence, the automorphic action of Ad z(s) is trivial, i.e. z(t) commutes with B(I). Since the vacuum is cyclic for B(I) and z(s) preserves the vacuum, z(s) itself must be trivial. This proves (ii).

(iii) Because z(t) = 1, the modular automorphism group of B gen (I) coincides with the subgroup of Möbius transformations preserving I, hence it globally preserves A(I). Again by Takesaki’s Theorem [43, Chap. IX, Theorem 4.2], it follows that $E^I$ is implemented by the projection on the subspace π(A(I))Ω. By the Reeh–Schlieder Theorem, this projection does not depend on I. This implies consistency.

In the remainder of this subsection, we shall explain the characterization of non-local chiral extensions I ↦ B(I) ⊃ A(I) in terms of Q-systems within the DHR category of the chiral net I ↦ A(I).

In [33], a structural analysis of local and non-local extensions of quantum field theories in the algebraic framework has been developed. The main tool was the notion of a Q-system [31], characterizing a subfactor N ⊂ M of finite index. For a brief review on Q-systems, see Appendix A. A Q-system consists of a set of algebraic relations which, in the case of quantum field theory, amount to the statement that the (non-local) fields of the extension form a closed algebra under multiplication and conjugation, and satisfy local commutation relations with the chiral fields. This interpretation of a Q-system is made more transparent if the relations are reformulated in terms of charged intertwiners (cf. Appendix A).
The central result in [33] is that a Q-system (θ, w, x) within the DHR category of a local net A determines a relatively local net B which extends A: θ is required to be a DHR endomorphism of the net O ↦ A(O) localized in some region $O_0$, and consequently the isometries w and x belong to $A(O_0)$. The Q-system therefore determines a positive-energy representation π ≃ θ of the net A on a Hilbert space $H_B$, and the local subfactor $\pi(A(O_0)) \subset B(O_0)$ on $H_B$. The latter can then be “transported” to a covariant net of subfactors O ↦ [π(A(O)) ⊂ B(O)] equipped with a consistent family of conditional expectations preserving the vacuum. (Imposing the additional eigenvalue condition ε(θ, θ)x = x would ensure O ↦ B(O) to be a local net.)

In the case of (non-local) chiral extensions A(I) ⊂ B(I) at hand, A being completely rational implies that only finitely many (equivalence classes of) endomorphisms θ can appear in an irreducible Q-system: the argument is as in [28, Proposition 2.3], using the fact that the multiplicity $n_s$ of each irreducible subsector $[\rho_s]$ of θ is bounded by the square of its dimension [24, p. 39]. In particular, the index of the local subfactors A(I) ⊂ B(I) is finite (and so the stronger bound
$n_s \le d(\rho_s)$ [33, Corollary 4.6] applies). Moreover, it was shown in [23, Theorem 2.4] that each θ can arise only in finitely many inequivalent Q-systems. This means that the classification problem of Q-systems in the DHR category of a (completely) rational CFT is a finite problem with finitely many solutions, and thus, fixing A, there exist only finitely many non-local chiral extensions B.

Examples for Q-systems within the DHR category of a local net were given for local and non-local chiral extensions of chiral nets, and for local two-dimensional extensions of subnets $A_L \otimes A_R$ consisting of two (left and right) chiral nets [33]. The main result in [40] is that there is a systematic way (the α-induction construction, using results of [6]) to associate a local Q-system, and hence a local two-dimensional extension $B_2^\alpha$ of $A_2 = A \otimes A$, with any given chiral extension B of A, see Sec. 4.

3.3. The split property

Let us now turn to the split property, which is related to phase space properties (existence of $\mathrm{Tr}\,\exp(-\beta L_0)$) in QFT [9, 1]. A commuting pair of von Neumann algebras (M1, M2) is split if there is a natural isomorphism from M1 ∨ M2 to M1 ⊗ M2, where M1 ∨ M2 denotes the von Neumann algebra generated by M1 and M2. A chiral net B is split if the pair (B(K), B(L)′) is split whenever the open interval L contains the closure of the interval K. We want to prove “upward hereditarity” of the split property.

Proposition 3.6. Let B be a Möbius covariant net on $S^1$ and A a finite index subnet such that B is relatively local with respect to A. If A is split, then B is also split.

Note that B is possibly non-local, but relative locality implies that A is local. In the case of a local net B, the result was proven in [32]. To prepare the proof of Proposition 3.6, we need Proposition 3.7 and Lemma 3.8:

Proposition 3.7. Let M1, M2 be commuting factors and $N_k \subset M_k$ finite index subfactors, k = 1, 2. If the pair (N1, N2) is split, then the pair (M1, M2) is also split.
Let $\gamma_k : M_k \to N_k$ be canonical endomorphisms and $(\gamma_k, T_k, S_k)$ the associated Q-systems. Then $M_k = N_k T_k$, and every $m^{(k)} \in M_k$ can be written as $m^{(k)} = n^{(k)} T_k$ where $n^{(k)} \in N_k$ is given by $n^{(k)} = \lambda_k \cdot E_k(m^{(k)} T_k^*)$, with $E_k$ the associated expectation from $M_k$ to $N_k$ and $\lambda_k$ the index $[M_k : N_k]$. Thus $\|n^{(k)}\| \le \lambda_k \|m^{(k)}\|$.

Lemma 3.8. With the above notations we have $N T_1 T_2 = M$ where M = M1 ∨ M2, N = N1 ∨ N2. Moreover there is a constant C > 0 such that if m ∈ M then $m = n T_1 T_2$, with n ∈ N and $\|n\| \le C\|m\|$.

Proof of Lemma 3.8. We first show the second part of the statement for m ∈ M1 · M2. Here M1 · M2 is the product of M1 and M2 which is naturally
isomorphic to the algebraic tensor product $M_1 \odot M_2$ by the Murray–von Neumann factorization lemma.

Let $m = \sum_i m_i^{(1)} m_i^{(2)}$ with $m_i^{(1)} \in M_1$, $m_i^{(2)} \in M_2$ and write $m_i^{(1)} = n_i^{(1)} T_1$, $m_i^{(2)} = n_i^{(2)} T_2$, with $n_i^{(k)} \in N_k$. The subfactor N1 ⊗ N2 of M1 ⊗ M2 has finite index and the associated Q-system is the tensor product Q-system, hence there is a constant C > 0 such that $\|\sum_i n_i^{(1)} \otimes n_i^{(2)}\| \le C \|\sum_i m_i^{(1)} \otimes m_i^{(2)}\|$ where the norms here are the spatial tensor product norms. Hence we have
$$\Big\|\sum_i n_i^{(1)} n_i^{(2)}\Big\| = \Big\|\sum_i n_i^{(1)} \otimes n_i^{(2)}\Big\| \le C\,\Big\|\sum_i m_i^{(1)} \otimes m_i^{(2)}\Big\| \le C\,\Big\|\sum_i m_i^{(1)} m_i^{(2)}\Big\|, \tag{3.5}$$
where the first equality holds because of the split property for (N1, N2) and the last inequality is due to the minimality of the spatial tensor product norm.

Now we prove the general statement. Let m ∈ M with $\|m\| \le 1$ and choose, by the Kaplansky density theorem, a net of elements $m_j \in M_1 \cdot M_2$ with $\|m_j\| \le 1$ and $m_j \to m$ weakly. We can write $m_j = n_j T_1 T_2$ where $n_j \in N$ and $\|n_j\| \le C$. With n a weak limit point of $n_j$, we then have $m = n T_1 T_2$ and $\|n\| \le C$.
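The limiting step at the end of this proof can be spelled out as follows. This is only a sketch under the assumptions already in force; the subnet index $j'$ is notation introduced here for illustration and does not appear in the original argument.

```latex
% Sketch of the weak-limit argument closing the proof of Lemma 3.8.
% j' labels a subnet (hypothetical notation, introduced for this illustration).
\begin{align*}
 & m_j = n_j T_1 T_2, \quad \|n_j\| \le C, \quad m_j \to m \ \text{weakly}
   && \text{(first part of the proof, Kaplansky)} \\
 & n_{j'} \to n \ \text{weakly}, \quad n \in N, \quad \|n\| \le C
   && \text{(the ball of radius } C \text{ in } N \text{ is weakly compact)} \\
 & n_{j'} T_1 T_2 \to n\, T_1 T_2 \ \text{weakly}
   && \text{(right multiplication by a fixed operator is weakly continuous)} \\
 & m = \lim_{j'} m_{j'} = \lim_{j'} n_{j'} T_1 T_2 = n\, T_1 T_2 .
\end{align*}
```

For general $m \neq 0$ one applies this to $m/\|m\|$, which gives the bound $\|n\| \le C\|m\|$ stated in the lemma.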
Proof of Proposition 3.7. Let Φ : M1 ⊗ M2 → M be the linear map
$$m \mapsto \Phi(m) \equiv \Phi_0(n)\, T_1 T_2 \tag{3.6}$$
where n ∈ N1 ⊗ N2 is the unique element such that $m = n \cdot T_1 \otimes T_2$ and $\Phi_0 = \Phi|_{N_1 \otimes N_2}$ is the natural isomorphism of N1 ⊗ N2 with N1 ∨ N2. By the lemma, Φ is surjective. We show that Φ is multiplicative and respects the ∗ operation. First note that if n ∈ N then
$$T_1 T_2\, n = \theta(n)\, T_1 T_2 \tag{3.7}$$
where θ is the endomorphism of N which is transformed to $\gamma_1|_{N_1} \otimes \gamma_2|_{N_2}$ under $\Phi_0$ (check this with n ∈ N1 · N2, then it holds for all n ∈ N by continuity). Let m′ ∈ M1 ⊗ M2, $m' = n' \cdot T_1 \otimes T_2$ with n′ ∈ N1 ⊗ N2. Then
$$mm' = n \cdot T_1 \otimes T_2 \cdot n' \cdot T_1 \otimes T_2 = n\,\gamma_1 \otimes \gamma_2(n') \cdot T_1^2 \otimes T_2^2 = n\,\gamma_1 \otimes \gamma_2(n') \cdot \lambda_1 E_1(T_1^2 T_1^*)\,T_1 \otimes \lambda_2 E_2(T_2^2 T_2^*)\,T_2. \tag{3.8}$$
Thus, suppressing the symbol $\Phi_0$ for simplicity,
$$\Phi(mm') = n\,\theta(n') \cdot \lambda_1 E_1(T_1^2 T_1^*)\,\lambda_2 E_2(T_2^2 T_2^*) \cdot T_1 T_2 = n\,\theta(n')\,T_1^2 T_2^2. \tag{3.9}$$
On the other hand
$$\Phi(m)\Phi(m') = n\,T_1 T_2\, n'\, T_1 T_2 = n\,\theta(n')\,T_1^2 T_2^2 \tag{3.10}$$
as desired. As for the ∗ operation, the argument is completely analogous, using the formula $T_i^* = \lambda_i E_i(T_i^{*2})\,T_i$.
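The two auxiliary formulas used in (3.8) and for the ∗ operation need no separate proof: as far as one can read off from the text, both are instances of the expansion $m^{(k)} = \lambda_k E_k(m^{(k)} T_k^*)\,T_k$ stated before Lemma 3.8. The following lines record the two substitutions as a sketch.

```latex
% Both identities are special cases of  m = \lambda_k E_k(m T_k^*) T_k,  m \in M_k.
\begin{align*}
 m = T_k^2 : &\quad T_k^2 = \lambda_k\, E_k(T_k^2\, T_k^*)\, T_k , \\
 m = T_k^* : &\quad T_k^* = \lambda_k\, E_k(T_k^*\, T_k^*)\, T_k
                          = \lambda_k\, E_k(T_k^{*2})\, T_k .
\end{align*}
```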
Thus, Φ is a ∗-homomorphism of von Neumann algebras, hence σ-weakly continuous. Since M1 ⊗ M2 is a factor, Φ is injective and (M1, M2) is a split pair.

Proof of Proposition 3.6. Let $I \subset \tilde I$ be two intervals without common end points and apply Proposition 3.7 with $N_1 = A(I)$, $N_2 = A(\tilde I')$, $M_1 = B(I)$, $M_2 = B(\tilde I)'$. We just note that $[M_2 : N_2] < \infty$ because the inclusion $A(\tilde I') \subset B(\tilde I)'$ is anti-isomorphic to
$$J_{\tilde I}\, A(\tilde I')\, J_{\tilde I} = A(\tilde I) \subset J_{\tilde I}\, B(\tilde I)'\, J_{\tilde I} = B(\tilde I) \tag{3.11}$$
where $J_{\tilde I}$ is the modular conjugation of $B(\tilde I)$ with respect to the vacuum and we are using the geometric action of $J_{\tilde I}$.

4. Charged Intertwiners in Boundary CFT

The main result of this section is a generalization of [27, Theorem 3.1 and Remark 3.2] (where B was assumed to be local). Recall that the α-induction construction [40] associates a CFT on Minkowski space with a chiral extension.

Theorem 4.1. For a given completely rational chiral net I ↦ A(I), and a given irreducible (possibly non-local) chiral extension I ↦ B(I), let $O \mapsto B_+^{\rm ind}(O)$, O = I × J ⊂ M+, be the induced Haag-dual boundary CFT net (Definition 2.8), and let $O \mapsto B_2^\alpha(O)$, O = I × J ⊂ M, be the two-dimensional local net on Minkowski space extending A ⊗ A, obtained from B by the α-induction construction. Then the local subfactors
$$A(I) \vee A(J) \subset B_+^{\rm ind}(O) \quad\text{and}\quad A(I) \otimes A(J) \subset B_2^\alpha(O) \tag{4.1}$$
are isomorphic.

Because B+(O) is intermediate between A+(O) and the dual net $B_+^{\rm dual}(O)$, we conclude
Corollary 4.2. For a given boundary CFT O ↦ B+(O), let $O \mapsto B_2^\alpha(O)$ be the local two-dimensional extension of A ⊗ A obtained by applying the α-induction construction to the boundary net B of B+. Under the isomorphism established in Theorem 4.1, the net O ↦ B+(O) corresponds to an intermediate net
$$A(I) \otimes A(J) \subset B_2(O) \subset B_2^\alpha(O), \tag{4.2}$$
with $B_2(O) = B_2^\alpha(O)$ if and only if B+ satisfies Haag duality.

In the course of the proof of the theorem, we shall “compute” the relative commutant $B_+^{\rm ind}(O) = B(L) \cap B(K)'$ by determining local operators $\psi_i \in B_+^{\rm ind}(O)$ (charged intertwiners, see below) which along with A+(O) generate $B_+^{\rm ind}(O)$. These charged intertwiners, as O varies, are the von Neumann analog of the non-chiral local Wightman fields of the boundary CFT, generalizing (1.11), (1.12). In Sec. 5, we shall further analyze their bi-localized charge structure.
In a general subfactor setup (see Appendix A for more details), the charged intertwiners for a subfactor N ⊂ M are nontrivial elements $\psi_i$ of M satisfying
$$\psi_i\, n = \varrho_i(n)\, \psi_i, \qquad (n \in N) \tag{4.3}$$
(where $\varrho_i$ are irreducible endomorphisms of N), such that every element of M has a unique expansion
$$m = \sum_i n_i \psi_i, \qquad (n_i \in N). \tag{4.4}$$
The algebra of the charged intertwiners in M
$$\psi_i \psi_j = \sum_k \Gamma^k_{ij}\, \psi_k, \qquad \psi_i^* = \sum_j \Gamma^{0*}_{ji}\, \psi_j \tag{4.5}$$
with intertwiners $\Gamma^k_{ij} : \varrho_k \to \varrho_i \varrho_j$ in N, together with some normalization conditions, is an invariant of the subfactor N ⊂ M, determined by the Q-system.

Generalizing an argument used in [29], we show in Lemma A.2 that the Q-systems (γ, v, w) in M and (θ, w, x) in N can be recovered from the system of charged intertwiners $\psi_i \in M$; in particular, two such systems $\psi_i \in M$, $\tilde\psi_i \in \tilde M$ satisfying the same algebra with the same $\varrho_i \in \mathrm{End}(N)$ and $\Gamma^k_{ij} \in N$ induce an isomorphism of subfactors N ⊂ M, $N \subset \tilde M$ by $\psi_i \leftrightarrow \tilde\psi_i$.

Applying this argument to N = A(I) ∨ A(J) ≃ A(I) ⊗ A(J) and $M = B_+^{\rm ind}(O)$, $\tilde M = B_2^\alpha(O)$, the statement of the theorem thus follows from the equivalence of the algebras of charged intertwiners which generate the respective inclusions.

First part of the proof of Theorem 4.1. We proceed in close analogy with the proof of [29, Proposition 45]. $A_2(O) = A(I) \otimes A(J)$ and $A_+(O) = A(I) \vee A(J)$ are naturally isomorphic by the split property of the chiral net A. Under this isomorphism, the Q-system $(\Theta_2, W_2, X_2)$ for $A_2(O) \subset B_2^\alpha(O)$ (given in [40], see below) turns into a Q-system (Θ, W, X) in A+(O) with
$$X = d(\Theta)^{-\frac12} \sum_{ijk} W_i\, \Theta(W_j)\, \Gamma^k_{ij}\, W_k^*. \tag{4.6}$$
By the preceding discussion and Lemma A.2, it is sufficient to show that Θ coincides with the dual canonical endomorphism $\Theta_+$ for $A_+(O) \subset B_+^{\rm ind}(O)$, and to find charged intertwiners $\psi_i \in B_+^{\rm ind}(O)$ satisfying the algebra (A.4)–(A.6) and (A.9) with $\Gamma^k_{ij}$ given by (4.17).

Without loss of generality, we choose I = (y, z) ⊂ R+ and J = −I ⊂ R− (this situation can always be attained by a conformal transformation), thus K = (−y, y), L = (−z, z) are symmetric, and put O = I × J. Then A(I) = j(A(J)) where j = Ad J is the modular conjugation [43, Chap. IV, Theorem 1.19] for A(R+) with respect to the vacuum (= PCT transformation [20]), and A+(O) = A(I) ∨ j(A(I)). We choose a system ∆ of inequivalent irreducible DHR endomorphisms $\rho_s$ localized in I, thus $\bar\rho_s = j \circ \rho_s \circ j$ are conjugates of $\rho_s$ localized in J. We have
Lemma 4.3. Every irreducible subsector of $\Theta_+$ is equivalent to some $\sigma\bar\tau$ with σ, τ ∈ ∆.

The proof of this lemma is exactly as the proof of [29, Lemma 31]. Next, we show an analog of [29, Theorem 9], which implies that Θ ≃ Θ2 indeed coincides with the dual canonical endomorphism $\Theta_+$ for $A_+(O) \subset B_+^{\rm ind}(O)$:

Proposition 4.4. The multiplicities of $[\sigma\bar\tau]$ in the dual canonical endomorphism $\Theta_+$ for $A_+(O) \subset B_+^{\rm ind}(O)$ equal $Z_{[\sigma][\tau]} = \dim \mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$, where $\alpha^\pm_\rho$ are the α-induced extensions of ρ ∈ ∆ to the chiral net B (see Appendix B).

Proof. By Definition B.1, $\alpha^\pm_\rho$ are endomorphisms of B(L), and by Proposition B.3, the global intertwiners coincide with the local intertwiners, i.e. $t\,\alpha^+_\tau(b) = \alpha^-_\sigma(b)\,t$ holds for all b ∈ B if and only if it holds for all b ∈ B(L), and in this case t belongs to B(L).

Now, for σ, τ ∈ ∆, consider the space $X_{\sigma\tau}$ of intertwiners $\psi \in B_+^{\rm ind}(O)$ satisfying
$$\psi\, a = \sigma\bar\tau(a)\, \psi, \qquad (a \in A_+(O)). \tag{4.7}$$
Then for $\psi \in X_{\sigma\tau}$, the same equation (4.7) also holds with a ∈ A(K) and a ∈ A(L′) because then $\sigma\bar\tau(a) = a$ and because $B_+^{\rm ind}(O)$ commutes with A(K) and A(L′). By strong additivity of A, (4.7) holds in fact for all a ∈ A.

Consider on the other hand the space $\mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$ of intertwiners t ∈ B satisfying
$$t\,\alpha^+_\tau(b) = \alpha^-_\sigma(b)\, t, \qquad (b \in B) \tag{4.8}$$
whose dimension is $Z_{[\sigma][\tau]}$. We claim that the maps
$$\varphi : t \mapsto \psi := t R_\tau, \qquad \varphi^{-1} : \psi \mapsto t := \sigma(\bar R_\tau^*)\,\psi \tag{4.9}$$
are isomorphisms between $\mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$ and $X_{\sigma\tau}$, which proves the proposition. Here, $R_\tau : \mathrm{id} \to \tau\bar\tau$ are the standard intertwiners in A(L) as in [20, 29], normalized such that $R_\tau^* R_\tau = d(\tau)$, and $\bar R_\tau = \kappa_\tau \cdot R_\tau$ such that by [20, Proof of Lemma 3.5] one has $\tau(\bar R^*) R = \bar R^* \bar\tau(R) = 1$. The latter normalization condition ensures that the maps (4.9) are mutually inverse. It remains to show that the image of $\mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$ belongs to $X_{\sigma\tau}$, and vice versa.

Let $t \in \mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$. Then t ∈ B(I) by Proposition B.3, and $\psi := \varphi(t) = t R_\tau$ by definition belongs to B(L) and satisfies (4.7) for all a ∈ A. The non-trivial part is to show that it also belongs to B(K)′. Because $\sigma\bar\tau$ acts trivially on A(K), ψ commutes with A(K) by (4.7). Because A(K) and v generate B(K), where (γ, v, w) is the Q-system for A(K) ⊂ B(K), it suffices to show that ψ commutes with v ∈ B(K). We compute (with Proposition B.4(i))
$$\psi v = t R_\tau v = t\,\alpha^+_\tau \alpha^+_{\bar\tau}(v)\, R_\tau = t\,\alpha^+_\tau(v)\, R_\tau = \alpha^-_\sigma(v)\, t R_\tau = v\, t R_\tau = v \psi, \tag{4.10}$$
because $\alpha^+_{\bar\tau}$ and $\alpha^-_\sigma$ act trivially on v ∈ B(K). Thus, ψ ∈ B(K)′ and $\varphi(\mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)) \subset X_{\sigma\tau}$.
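The claim made above that the normalization condition makes the maps (4.9) mutually inverse can be checked in two lines. The following is a sketch using only relations already stated: the intertwining property (4.8) restricted to a ∈ A (where $\alpha^+_\tau$ and $\alpha^-_\sigma$ reduce to τ and σ), and (4.7) extended to all a ∈ A, applied to $a = R_\tau \in A(L)$.

```latex
% Sketch: the maps \varphi and \varphi^{-1} of (4.9) are mutually inverse.
\begin{align*}
 \varphi^{-1}(\varphi(t))
   &= \sigma(\bar R_\tau^*)\, t\, R_\tau
    = t\, \tau(\bar R_\tau^*)\, R_\tau
    = t ,
   && \text{by } t\,\tau(a) = \sigma(a)\,t \text{ and } \tau(\bar R^*)R = 1, \\
 \varphi(\varphi^{-1}(\psi))
   &= \sigma(\bar R_\tau^*)\, \psi\, R_\tau
    = \sigma(\bar R_\tau^*)\, \sigma\bar\tau(R_\tau)\, \psi
    = \sigma\big(\bar R_\tau^*\, \bar\tau(R_\tau)\big)\, \psi
    = \psi ,
   && \text{by (4.7) and } \bar R^*\bar\tau(R) = 1 .
\end{align*}
```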
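Conversely, let $\psi \in X_{\sigma\tau}$. By definition, $t := \varphi^{-1}(\psi)$ belongs to B(L) and satisfies
$$t\,\tau(a) = \sigma(a)\,t, \qquad (a \in A(L)). \tag{4.11}$$
Thanks to Proposition B.3, it remains to show that t has the required intertwining property
$$t\,\alpha^+_\tau(v) = \alpha^-_\sigma(v)\,t. \tag{4.12}$$
Inserting the above definitions for $t = \varphi^{-1}(\psi)$ and for $\alpha^\pm_\rho(v)$, we have
$$t\,\alpha^+_\tau(v) = \sigma(\bar R_\tau^*)\,\psi\,\varepsilon(\theta,\tau)\,v = \sigma(\bar R_\tau^*)\,\sigma\bar\tau(\varepsilon(\theta,\tau))\,\psi v = \sigma(\varepsilon(\theta,\bar\tau)^*)\,\sigma\theta(\bar R_\tau^*)\,\psi v, \tag{4.13}$$
where $\theta = \gamma|_A$ is localized in K, and
$$\alpha^-_\sigma(v)\,t = \varepsilon(\sigma,\theta)^*\, v\,\sigma(\bar R_\tau^*)\,\psi = \varepsilon(\sigma,\theta)^*\,\theta\sigma(\bar R_\tau^*)\, v\psi = \sigma\theta(\bar R_\tau^*)\,\varepsilon(\sigma,\theta)^*\, v\psi. \tag{4.14}$$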
In (4.13), the statistics operator $\varepsilon(\theta, \bar\tau)$ is trivial because of the ordering J < K of the localizations of the endomorphisms, and in (4.14), ε(σ, θ) is trivial because K < I. Because $\psi \in X_{\sigma\tau}$ belongs to B(K)′ and v ∈ B(K), ψ commutes with v, hence (4.13) and (4.14) are equal. Thus (4.8) holds, and $\varphi^{-1}(X_{\sigma\tau}) \subset \mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$. This completes the proof of the proposition.

Proof of Theorem 4.1 (continued). The dimensions $Z_{[\sigma][\tau]}$ in Proposition 4.4 being the multiplicities of $[\sigma\bar\tau]$ in Θ, we conclude $\Theta_+ = \Theta \simeq \bigoplus Z_{[\sigma][\tau]}\, \sigma\bar\tau$. For each pair σ, τ ∈ ∆ such that $\sigma\bar\tau \prec \Theta_+$, we fix a basis of charged intertwiners
$$\psi_i := \varphi(t_i) = t_i R_{\tau_i} \tag{4.15}$$
where $t_i$ are bases of the spaces $\mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$ orthonormal with respect to their inner products $\langle t, t' \rangle = (d(\sigma)d(\tau))^{-1} \cdot R_\tau^*\, t^* t'\, R_\tau$. As σ and τ vary over ∆, we thus obtain a maximal system of charged intertwiners $\psi_i$ in $B_+^{\rm ind}(O)$, $i = 1 \cdots \sum Z_{[\sigma][\tau]}$, normalized as
$$\psi_i^* \psi_j = d(\sigma)d(\tau) \cdot \delta_{ij}. \tag{4.16}$$
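For two intertwiners $\psi_i$, $\psi_j$ built from the same pair (σ, τ), the normalization (4.16) follows directly from the definition (4.15) and the inner product just introduced. The one-line check below is a sketch of that case only; it does not address pairs belonging to different (σ, τ).

```latex
% Sketch: (4.16) for \psi_i = t_i R_\tau, \psi_j = t_j R_\tau with the same (\sigma,\tau).
\begin{align*}
 \psi_i^* \psi_j
   = R_\tau^*\, t_i^*\, t_j\, R_\tau
   = d(\sigma)d(\tau)\, \langle t_i, t_j \rangle
   = d(\sigma)d(\tau)\, \delta_{ij} .
\end{align*}
```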
We claim that these form an algebra of charged intertwiners with endomorphisms (see footnote k) $\varrho_i = \sigma_i \bar\tau_i \prec \Theta_+$ and coefficients $\Gamma^k_{ij}$ as in (4.6), i.e. those of the α-induction construction given more explicitly in (4.17) below.

Footnote k: The index i thus labels the irreducible components of $\Theta_+ \in \mathrm{End}(A_+(O))$, which may be pairwise equivalent whenever $Z_{[\sigma_i][\tau_i]} > 1$.
Footnote l: Adapting the notation of [40] to the present conventions.

Let us recall how the latter were determined. The α-induction construction [40] proceeds by the specification of a Q-system in $A(I) \otimes A(I)^{\rm opp}$, which under the isomorphism between $A(I)^{\rm opp}$ and A(J) given by $a^{\rm opp} \mapsto j(a^*)$ turns into the Q-system $(\Theta_2, W_2, X_2)$ in A(I) ⊗ A(J), determining the extension $A(I) \otimes A(J) \subset B_2^\alpha(O)$ up to isomorphism. Applying in turn the natural isomorphism A(I) ⊗ A(J) ≃ A(I) ∨ A(J), we read off [40, Sec. 3] (see footnote l)
$$\Gamma^k_{ij} = d(\Theta)^{\frac12} \sum_{ef} \zeta^k_{ij,ef} \cdot T_e\, j(T_f) \in \mathrm{Hom}(\varrho_k, \varrho_i \varrho_j) \tag{4.17}$$
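where $\varrho_i = \sigma_i \bar\tau_i \prec \Theta$, $T_e$ form orthonormal bases of $\mathrm{Hom}(\sigma_k, \sigma_i\sigma_j) \subset A(I)$, $T_f$ form orthonormal bases of $\mathrm{Hom}(\tau_k, \tau_i\tau_j) \subset A(I)$ and consequently $j(T_f)$ form orthonormal bases of $\mathrm{Hom}(\bar\tau_k, \bar\tau_i\bar\tau_j) \subset A(J)$, and the numerical coefficients $d(\Theta)^{\frac12} \cdot \zeta^k_{ij,ef}$ are the expansion coefficients of
$$t_i\, \alpha^+_{\tau_i}(t_j) \in \mathrm{Hom}(\alpha^+_{\tau_i\tau_j}, \alpha^-_{\sigma_i\sigma_j}) \tag{4.18}$$
into the basis $T_e t_k T_f^*$ of $\mathrm{Hom}(\alpha^+_{\tau_i\tau_j}, \alpha^-_{\sigma_i\sigma_j})$:
$$t_i\, \alpha^+_{\tau_i}(t_j) = d(\Theta)^{\frac12} \sum_{k,ef} \zeta^k_{ij,ef} \cdot T_e t_k T_f^*. \tag{4.19}$$
We now compute
$$\psi_i \psi_j = t_i R_{\tau_i} \cdot t_j R_{\tau_j} = t_i\, \alpha^+_{\tau_i}\alpha^+_{\bar\tau_i}(t_j) \cdot R_{\tau_i} R_{\tau_j} = t_i\, \alpha^+_{\tau_i}(t_j) \cdot R_{\tau_i} R_{\tau_j}, \tag{4.20}$$
because $\alpha^+_{\bar\tau_i}$ acts trivially on B(I), and insert (4.19) as well as [29, Eq. (15)]
$$R_{\tau_i} R_{\tau_j} = \sum_g T_g\, j(T_g) \cdot R_{\tau_k}, \tag{4.21}$$
$T_g \in \mathrm{Hom}(\tau_k, \tau_i\tau_j)$. This yields
$$\psi_i \psi_j = d(\Theta)^{\frac12} \sum_{k,ef} \zeta^k_{ij,ef} \cdot T_e t_k\, j(T_f) \cdot R_{\tau_k} = \sum_k \Gamma^k_{ij} \cdot \psi_k \tag{4.22}$$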
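because $t_k \in B(I)$ and $j(T_f) \in A(J)$ commute.

It remains to prove the second of the two defining relations (A.9) with $\Gamma^0_{ji}$ determined by (4.17). We observe that by definition of $\Gamma^0_{ji}$ only $\tau_j$ conjugate to $\tau_i$ and $\sigma_j$ conjugate to $\sigma_i$ contribute, and the sums over e and f involve only one term $T_e \in \mathrm{Hom}(\mathrm{id}, \sigma_j\sigma_i)$ and $T_f \in \mathrm{Hom}(\mathrm{id}, \tau_j\tau_i)$. If we prove (the first of) the identities
$$d(\Theta)^{\frac12} \sum_j \zeta^0_{ji,ef} \cdot t_j^* T_e = d(\tau_j) \cdot \alpha^+_{\tau_j}(t_i)\, T_f, \qquad d(\Theta)^{\frac12} \sum_j \zeta^0_{ji,ef} \cdot T_f^* t_j^* = d(\sigma_j) \cdot T_e^*\, \alpha^-_{\sigma_j}(t_i), \tag{4.23}$$
then the claim reduces to the corresponding result obtained in [29, Eqs. (11) and (12)]: namely we get (using local commutativity of $t_j \in B(I)$ with $j(T_f) \in A(J)$ as well as $j(T_f^*) R_{\tau_j} = \tau_j(j(T_f^*)) R_{\tau_j} \in \mathrm{Hom}(\bar\tau_i, \tau_j)$ and the trivial action of $\alpha^+_{\bar\tau_i}$ on $t_i$ due to Proposition B.2)
$$\begin{aligned}
\sum_j \Gamma^{0*}_{ji}\, \psi_j &= d(\Theta)^{\frac12} \sum_j \zeta^0_{ji,ef} \cdot T_e^*\, t_j\, j(T_f^*)\, R_{\tau_j} \\
&= d(\tau_j) \cdot T_f^*\, \alpha^+_{\tau_j}(t_i^*)\, j(T_f^*)\, R_{\tau_j} && \text{by (4.23)} \\
&= d(\tau_j) \cdot T_f^*\, j(T_f^*)\, R_{\tau_j}\, \alpha^+_{\bar\tau_i}(t_i^*) && \text{by Proposition B.4} \\
&= R_{\tau_i}^*\, \alpha^+_{\bar\tau_i}(t_i^*) && \text{by [29]} \\
&= R_{\tau_i}^*\, t_i^* = \psi_i^*. && \text{by Proposition B.2}
\end{aligned} \tag{4.24}$$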
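To prove (4.23), we choose $T_g \in \mathrm{Hom}(\mathrm{id}, \tau_i\tau_j)$ such that $\tau_i(T_f^*) T_g = 1$, and consequently also $T_f^* \tau_j(T_g) = 1$. Let $\tilde t := [T_e^*\, \alpha^-_{\sigma_j}(t_i T_g)]^* \in \mathrm{Hom}(\tau_i, \sigma_i)$. Then
$$d(\Theta)^{\frac12} \zeta^0_{ji,ef} = T_e^*\, \alpha^-_{\sigma_j}(t_i)\, t_j T_f = T_f^*\, \tilde t^*\, t_j T_f = d(\tau_i)^{-1} R_{\tau_j}^*\, \tilde t^*\, t_j R_{\tau_j} = d(\sigma_j) \langle \tilde t, t_j \rangle, \tag{4.25}$$
hence $\sum_j d(\Theta)^{\frac12} \zeta^0_{ji,ef} \cdot t_j^* = d(\sigma_j)\, \tilde t^*$ because $t_j$ form an orthonormal basis with respect to ⟨·, ·⟩. Inserting the definitions, we get
$$\begin{aligned}
d(\Theta)^{\frac12} \sum_j \zeta^0_{ji,ef} \cdot t_j^* T_e &= d(\sigma_j) \cdot \tilde t^* T_e = d(\sigma_j) \cdot T_e^*\, \alpha^-_{\sigma_j}(t_i T_g)\, T_e \\
&= d(\tau_j) \cdot T_f^*\, \alpha^+_{\tau_j}(T_g t_i)\, T_f = d(\tau_j) \cdot T_f^*\, \tau_j(T_g)\, \alpha^+_{\tau_j}(t_i)\, T_f \\
&= d(\tau_j) \cdot \alpha^+_{\tau_j}(t_i)\, T_f,
\end{aligned} \tag{4.26}$$
using the fact that $T_e$ and $T_f$ implement standard left-inverses for the α-induced sectors, and the trace property for standard left-inverses. Similarly,
$$\begin{aligned}
d(\Theta)^{\frac12} \sum_j \zeta^0_{ji,ef} \cdot T_f^* t_j^* &= d(\sigma_j) \cdot T_f^*\, \tilde t^* = d(\sigma_j) \cdot T_f^* T_e^*\, \alpha^-_{\sigma_j}(t_i T_g) \\
&= d(\sigma_j) \cdot T_e^*\, \alpha^-_{\sigma_j}(t_i\, \tau_i(T_f^*)\, T_g) = d(\sigma_j) \cdot T_e^*\, \alpha^-_{\sigma_j}(t_i),
\end{aligned} \tag{4.27}$$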
proving (4.23). This completes the proof of Theorem 4.1.

Proof of Corollary 4.2. Obvious.

Remark. In 2D CFT, there is a pair of maximal left and right chiral algebras such that $A_L(I) \otimes A_R(J) \subset A_L^{\max}(I) \otimes A_R^{\max}(J) \subset B_2(O)$. Under standard assumptions [39], these are given by $A_L^{\max}(I) = B_2(I \times J) \cap (1 \otimes A_R(J))'$ (independent of J) and similar for $A_R^{\max}$. In the present situation, with $A_L = A_R = A$ and $B_2 = B_2^\alpha$, the isomorphism of Theorem 4.1 identifies $A_L^{\max}(I)$ with $B(L) \cap B(L \setminus \bar I)'$. Namely, $B_+^{\rm ind}(O) \cap A(J)' = B(L) \cap (B(K) \vee A(J))'$ and $B(K) \vee A(J) = \{v^K\} \vee A(K) \vee A(J) = \{v^K\} \vee A(L \setminus \bar I) = B(L \setminus \bar I)$ by strong additivity of A. In particular, the intersection $B(L) \cap B(L \setminus \bar I)'$ does not depend on the upper boundary of the interval L and may be replaced by $\hat A_L(I) := B((a, \infty)) \cap B((b, \infty))'$ if I = (a, b). The chiral nets $I \mapsto \hat A_L(I)$ and $I \mapsto \hat A_R(I) := B((-\infty, b)) \cap B((-\infty, a))'$ thus define two local and mutually local chiral nets, both extending I ↦ A(I) within B(I), such that $A(I) \vee A(J) \subset \hat A_L(I) \vee \hat A_R(J) \subset B_+^{\rm ind}(O)$ for J < I. In the setting of [4], they correspond to the intermediate subfactors $N \subset M_\pm \subset M$.

5. Bi-Localized Charge Structure in BCFT

Our aim in this section is to establish in the algebraic framework formulas of the type (1.11), (1.12), exhibiting a separation of the left and right charges of local fields in BCFT (bi-localized charge structure). This will explain Cardy’s observation [11]
concerning the relation between n-point local correlation functions and 2n-point conformal blocks in a model-independent setting. Furthermore, it enables us to compute the specific linear coefficients which guarantee locality, in terms of the DHR structure of the underlying net A of chiral observables.

5.1. Preliminaries

Let us recall and adapt for our present purposes several results from the literature. In [12], under the name of field bundle, a “crossed product action of the DHR category on the observables” has been constructed as a first substitute for an algebra of charged fields. The fibers of this bundle were labeled by all the DHR endomorphisms. The huge redundancy has been eliminated with the “reduced field bundle” in [15, 16] where only one fiber was retained for each irreducible superselection sector. This amounts to a choice, for each irreducible sector [s], of a representative DHR endomorphism $\rho_s$ along with the representation of the observables on the Hilbert space $H_s$. As a space, $H_s$ coincides with the vacuum Hilbert space $H_0$ of the net A, but as a representation it differs in that A is represented on $H_s$ under the action of the endomorphism $\rho_s$, i.e. $\pi_s = \rho_s$. We call $\hat H$ the direct sum of the $H_s$ (which is finite because A is rational), and $\hat\pi$ the corresponding representation.

Let σ be a DHR endomorphism of A and $T_e$ an orthonormal basis of intertwiners $T_e : \rho_s \to \rho_t \sigma$, $e = 1 \cdots \dim \mathrm{Hom}(\rho_s, \rho_t\sigma)$. Then $T_e$, as an operator from $H_s$ to $H_t$, satisfies the intertwining relation
$$T_e\, \pi_s(a) = \pi_t(\sigma(a))\, T_e. \tag{5.1}$$
It is crucial that $T_e$, although an element of $A$ as an operator, must not be considered as an observable since it acts on $H_s$ in the representation $\pi_0 = \mathrm{id}$, and not in the representation $\pi_s$ pertaining to $H_s$. We emphasize this fact by our notation, and denote by $\psi_e^\sigma$ the operator on $\hat H$ which coincides with $T_e$ on the subspace $H_s$ (with values in $H_t$) and is extended by zero on its orthogonal complement in $\hat H$.$^{m}$ Thus

$$\psi_e^\sigma\,\hat\pi(a) = \hat\pi(\sigma(a))\,\psi_e^\sigma . \tag{5.2}$$
If $\sigma$ is localized in an interval $I$, then $\sigma(a) = a$ for $a \in A(I')$, hence $\psi_e^\sigma$ commutes with $\hat\pi(A(I'))$. We therefore arrive at the "reduced field net" [15, 16] of von Neumann algebras $F_{\rm red}(I)$, which are generated by $\hat\pi(A(I))$ and the charged intertwiners $\psi_e^\sigma$ with $\sigma$ localized in $I$. This net is relatively local with respect to the subnet $\hat\pi(A)$, but non-local itself. The reduced field net is covariant with respect to the unitary representation $U_{\hat\pi}$ implementing covariance of the observables. The operators $\psi_e^\sigma$ satisfy braid-group commutation relations with numerical coefficients ("R-matrices") determined by the DHR statistics operators $\varepsilon(\sigma_1, \sigma_2)$. They are bounded operator versions of the chiral exchange fields discussed in the Introduction.

$^{m}$ The same operator was denoted $F_e(1)^*$ in [16].
If $u$ is a charge transporter $u : \sigma \to \hat\sigma$ with $\hat\sigma$ localized in $\hat I$, then

$$\hat\pi(u)\,\psi_e^\sigma = \psi_e^{\hat\sigma} \tag{5.3}$$
belongs to $F_{\rm red}(\hat I)$. As discussed in [16], suitable regularized limits of $\psi_e^\sigma$ as the localization of $\sigma$ shrinks to a point $y$, behave like point-like chiral "exchange fields", generalizing $a(y)$, $b(y)$ and their adjoints displayed in (1.10). Since the commutation relations survive in the limit, the latter satisfy commutation relations with the same R-matrices as the former. Their correlations converge to (primary or descendant, depending on the details of the limit chosen) conformal blocks, whose analytical monodromy properties thus represent the R-matrices of the DHR statistics.

The reduced field net does not comply with the axioms for a (non-local) chiral extension of $A$ (in the sense of [33] or [1]) because its local algebras are not factors (and as a consequence, the vacuum vector in $H_0$ is a cyclic, but not a separating vector). However, every (non-local) field extension $B$ on a Hilbert space $H_B$ can be "embedded" in (an amplification of) the reduced field net as follows [33]. Let $(\gamma, v, w)$ be the Q-system associated with the inclusion $\pi(A) \subset B$, and $(\theta, w, x)$ be the dual Q-system. $\theta$ is a DHR endomorphism of $A$; we may write$^{n}$

$$[\theta] = \bigoplus_{s:\,\text{inequivalent irreducibles}} n^s\,[s] = \bigoplus_{p:\,\text{irreducibles}} [s(p)] . \tag{5.4}$$
Consequently $x : \theta \to \theta^2$ has an expansion of the form

$$x = \sum_{p,q,r;e} \lambda^r_{pq}(e) \cdot w_p\,\rho_p(w_q)\,T_e\,w_r^* \tag{5.5}$$
where $\rho_p \equiv \rho_{s(p)}$ are the representatives of the sectors $[s(p)]$ as before, $w_p : \rho_p \to \theta$ form a complete system of orthonormal isometries in $A$, and $T_e \in A$ form orthonormal bases of $\mathrm{Hom}(\rho_r, \rho_p\rho_q)$. The numerical coefficients $\lambda^r_{pq}(e) \in \mathbb{C}$ are "generalized Clebsch–Gordan coefficients" characteristic for the inclusion $\pi(A(I)) \subset B(I)$. Then, the charged isometry $v \in B$ can be represented in terms of operators from $F_{\rm red}$ as

$$v = \sum_{p,q,r;e} \lambda^r_{pq}(e) \cdot E^p\,\hat\pi(w_q)\,\psi_e^{\rho_q}\,E^{r*} = \sum_{p,q,r;e} \lambda^r_{pq}(e) \cdot \pi(w_q)\,E^p\,\psi_e^{\rho_q}\,E^{r*} \tag{5.6}$$
where $E^p : \hat H \to H_B$ are the partial isometries which identify the irreducible subrepresentation $H_{s(p)} \subset \hat H$ with the irreducible subrepresentation $H_p \subset H_B$, and are zero on the complement. It follows that the charged intertwiners $\psi_q := d(\theta)^{1/2} \cdot \pi(w_q^*)\,v$ of the chiral extension $B$ (cf. Appendix A) arise as the characteristic linear combinations

$$\psi_q = d(\theta)^{1/2} \sum_{p,r;e} \lambda^r_{pq}(e) \cdot E^p\,\psi_e^{\rho_q}\,E^{r*} \tag{5.7}$$
$^{n}$ In the sequel, indices $s, t, \ldots$ label the irreducible DHR sectors, while $p, q, \ldots$ label the irreducible subrepresentations of $\pi$ which may come with multiplicities: $\pi \simeq \bigoplus_p \pi_p = \bigoplus_s n^s \pi_s$. Indices $i, j, \ldots$ will label the irreducible components of $\Theta_+$ as in Sec. 4.
of charged intertwiners from $F_{\rm red}$ (possibly amplified by multiplicities of sectors $[s]$ in $H_B$). The algebras $B(I)$ generated by these linear combinations do have the vacuum as a cyclic and separating vector. Remarkably, in case $B$ is local (or graded local), then the specific linear combinations (5.7) satisfy (graded) local commutativity, although the individual summands $\psi_e^\rho$ also in this case satisfy proper braid group commutation relations.

5.2. Application to BCFT

After these preliminaries, we return to boundary CFT. We formulate the main result of this section:

Proposition 5.1. Let $\sigma$, $\bar\tau$ be irreducible DHR endomorphisms, localized in $I$ and $J$, respectively, such that $\sigma\bar\tau \prec \Theta_+$. Then the charged intertwiners $\psi_i \in B_+^{\rm ind}(O)$, $i = 1 \cdots Z_{[\sigma][\tau]}$, for the inclusion $\pi(A_+(O)) \subset B_+^{\rm ind}(O)$ can be represented as

$$\psi_i = \sum_{p,q;g,h} \varphi^p_{q,i}(g,h) \cdot E^q\,\psi_g^\sigma\,\psi_h^{\bar\tau}\,E^{p*} \tag{5.8}$$
with numerical coefficients $\varphi^p_{q,i}(g,h)$ to be specified in Corollary 5.2 below. Here, the sums over $p$ and $q$ extend over the irreducible subrepresentations of $\pi$, $h$ and $g$ stand for orthonormal bases of intertwiners $T_h : \rho_{s(p)} \to \rho_t\bar\tau$ and $T_g : \rho_t \to \rho_{s(q)}\sigma$, respectively, and the sum over the intermediate sectors $[t]$ is implicit in the sum over the "channels" $g$ and $h$.
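Proof. Let us first consider the case of the reference double-cone $O = I \times J = (y, z) \times (-z, -y)$ as discussed in Sec. 4. We recall from (4.15) that the operators $\psi_i = \varphi(t_i) = t_i\,\pi(R_\tau) \in B_+^{\rm ind}(O) \subset B$ are intertwiners $\psi_i : \mathrm{id}_B \to \alpha^-_\sigma \alpha^+_{\bar\tau}$. Equivalently (because $A$ and $v$ generate $B$), they satisfy

$$\psi_i\,\pi(a) = \pi(\sigma\bar\tau(a))\,\psi_i , \qquad (a \in A) \tag{5.9}$$

and

$$\psi_i\,v = \alpha^-_\sigma \alpha^+_{\bar\tau}(v)\,\psi_i = \pi[\sigma(\varepsilon(\theta, \bar\tau))\,\varepsilon(\sigma, \theta)^*]\,v\,\psi_i . \tag{5.10}$$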
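Now let $U : \pi \to \theta$ implement the unitary equivalence between the representation $\pi$ of $A$ on $H_B$ and the representation through the DHR endomorphism $\theta$ on $H_0$, and let $\varphi_i := \mathrm{Ad}_U(\psi_i)$. Under $\mathrm{Ad}_U$, (5.9) and (5.10) translate into

$$\varphi_i\,\theta(a) = \theta\sigma\bar\tau(a)\,\varphi_i , \tag{5.11}$$

i.e. $\varphi_i \in \mathrm{Hom}(\theta, \theta\sigma\bar\tau) \subset A$, and in addition the linear condition on $\varphi_i$

$$\varphi_i\,x = \theta[\sigma(\varepsilon(\theta, \bar\tau))\,\varepsilon(\sigma, \theta)^*]\,x\,\varphi_i , \tag{5.12}$$

because $\mathrm{Ad}_U(v) = x$ [33]. Finally, the normalization (4.16) of $\psi_i$ turns into the normalization

$$\varphi_i^*\,\varphi_j = d(\sigma)\,d(\tau) \cdot \delta_{ij} . \tag{5.13}$$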
Introducing a basis of the space $\mathrm{Hom}(\theta, \theta\sigma\bar\tau)$, we conclude:

Corollary 5.2. Consider the finite linear problem (5.12) to be solved within the DHR category of $A$, i.e. $\varphi_i \in \mathrm{Hom}(\theta, \theta\sigma\bar\tau)$. Let $\varphi_i$ be its solutions, subject to the normalization (5.13). They have an expansion

$$\varphi_i = \sum_{p,q;g,h} \varphi^p_{q,i}(g,h) \cdot w_q\,T_g\,T_h\,w_p^* \tag{5.14}$$

where $w_p = U^* E^p|_{H_{s(p)}} : \rho_p \to \theta$ are orthonormal isometries. Transformed back to $H_B$, we have (5.8), where the numerical coefficients $\varphi^p_{q,i}(g,h)$ are given by

$$\varphi^p_{q,i}(g,h) = T_h^*\,T_g^*\,w_q^*\,\varphi_i\,w_p \in \mathrm{Hom}(\rho_p, \rho_p) = \mathbb{C} . \tag{5.15}$$
This concludes the proof of Proposition 5.1 in the case of the reference double-cone $O$. Now, we may change the localization to any other double-cone $\hat O = \hat I \times \hat J$. Similarly as in (5.3), we multiply $\psi_i$ from the left with the charge transporter $\pi(U_\sigma\,\sigma(U_{\bar\tau}))$ where $U_\sigma : \sigma \to \hat\sigma$ and $U_{\bar\tau} : \bar\tau \to \hat{\bar\tau}$ with the desired localizations. From $\pi(\sigma(U_{\bar\tau}))\,E^q\,\psi_g^\sigma = E^q\,\hat\pi(\sigma(U_{\bar\tau}))\,\psi_g^\sigma = E^q\,\psi_g^\sigma\,\hat\pi(U_{\bar\tau})$ and (5.3), we conclude that (5.8) in fact holds for the charged intertwiners associated with arbitrary double-cones $\hat O$, substituting only $\hat\sigma$ for $\sigma$ and $\hat{\bar\tau}$ for $\bar\tau$. Dropping the hat symbols, we may equally well assert that the structure (5.8) holds for any double-cone. This completes the proof of Proposition 5.1 in the general case.

We note that Eq. (5.12) is quite similar to the condition in Definition 5.5 of [19].

Note that $\psi_g^\sigma$ belongs to $F_{\rm red}(I)$, and $\psi_h^{\bar\tau}$ belongs to $F_{\rm red}(J)$. We have thus geometrically separated the "left" and "right" charges of the charged intertwiners, by representing them as linear combinations of bilocalized products of charged intertwiners from $F_{\rm red}(I)$ and $F_{\rm red}(J)$. The specific coefficients $\varphi^p_{q,i}(g,h)$, arising through the solution of a linear problem in the DHR category involving the dual canonical endomorphism $\theta$ (Corollary 5.2), are algebraic invariants for the (non-local) chiral extension $\pi(A) \subset B$.

Assuming the same regularity of the point-like limits $\hat O \to (t, x)$ as in [16], we infer the convergence of n-point correlations of $\psi_i(t, x)$ to characteristic linear combinations of 2n-point conformal blocks involving the arguments $t + x$ and $t - x$. (Clearly, the limit cannot be effectuated by the action of the Möbius group. Instead, one has to use the local implementers which implement the local action of the Möbius group on $F_{\rm red}(I)$ and act trivially on $F_{\rm red}(J)$, and vice versa, to obtain a local action of $PSL(2, \mathbb{R}) \times PSL(2, \mathbb{R})$ on $F_{\rm red}(I) \vee F_{\rm red}(J)$ and hence on $B(O)$.)

The coefficients $\varphi^p_{q,i}(g,h)$ are affected neither by the transport from $O$ to $\hat O$ nor (up to some overall normalization) by the point-like limit. We conclude that Cardy's observation, originally derived from Ward identities in minimal models, is in fact a model-independent feature of boundary CFT, reflecting purely algebraic structures of the associated (non-local) chiral extension $\pi(A) \subset B$. The relative coefficients of the representation of local n-point correlation functions as linear combinations of
2n-point conformal blocks are the products of n coefficients $\varphi^p_{q,i}(g,h)$ according to the contributing channels. It should be remarked that, according to the structure (5.8), while the initial and final sectors of $\psi_i$ necessarily belong to $H_B$, the intermediate sectors $[t]$ may range over all DHR sectors of the chiral net $A$, as can be nicely seen in the examples (1.11) and (1.12). The correlation functions of boundary CFT therefore carry information also about those chiral sectors which are not present in the Hilbert space of its local fields.

6. Varying the Boundary Conditions

As we have seen, the chiral extension $I \mapsto B(I)$ of $I \mapsto A(I)$ determines not only the Hilbert space $H_B \simeq \bigoplus_s n^s H_s$ of the boundary CFT, but also the detailed charge structure of its local fields as in (5.8) and, as a consequence illustrated by the example (1.8), (1.9), the behavior of the local fields and their correlations close to the boundary $x = 0$.

In this section, we want to vary the boundary conditions by varying the (non-local) chiral extension $B$. As is well known from [6], there is a finite system of inequivalent (non-local) chiral extensions $B_a$ which all give rise to the same coupling matrix $Z_{[\sigma][\tau]}$. In the language of modular categories [17, 34, 30], these extensions correspond to Morita equivalent Frobenius algebras [19, 38]. We want to show here that they even give rise to boundary CFT's with locally isomorphic subfactors $A_+(O) \subset B^{\rm ind}_{a,+}(O)$. In view of Theorem 4.1, this means that they all share the local structure of the same Minkowski space CFT $B_2^\alpha$. Our result is essentially a corollary to a result in [5] making use of [6].
Let $\iota : A(I) \to B(I)$ be the inclusion homomorphism for a given (non-local) chiral extension $\pi(A) \subset B$, and consider the system $\mathcal{X} = \{a : A(I) \to B(I)\}$ of inequivalent irreducible subhomomorphisms of $\iota \circ \rho$ as $\rho$ ranges over the DHR endomorphisms of $A$ localized in $I$.$^{o}$ Each $a \in \mathcal{X}$ naturally gives rise to a Q-system $(\theta_a, w_a, x_a)$ (where $\theta_a = \bar a a \prec \bar\rho\,\theta\,\rho$ is a DHR endomorphism because $a \prec \iota\rho$) and hence defines an inclusion $A \subset B_a$ as in [13, 5] ("varying the iota vertex"), i.e. (non-local) chiral extensions $\pi^a(A) \subset B_a$ with inclusion homomorphisms $\iota_a$ such that $\theta_a = \bar\iota_a \iota_a$. We may call the family $B_a$ (as $a$ varies over $\mathcal{X}$) the DHR orbit of the given extension $B$. (Warning: The association $a \mapsto B_a$ is in general not injective, see below.)

$^{o}$ In the terminology of categories [18, 30, 37] (where a Q-system is a Frobenius algebra), $a \in \mathcal{X}$ are the irreducible modules of the Frobenius algebra, cf. e.g., [18, Lemma 5.24 and Chap. 6], forming the objects of the module category. If $B$ is local, the Frobenius algebra is commutative. In this case, Kirillov and Ostrik have shown [30] that the module category is again a monoidal (tensor) category. In the same situation, Böckenhauer and Evans have found [3, Theorem 3.9] a bijection between the elements $a \prec \iota\rho$ of $\mathcal{X}$ and the irreducible subendomorphisms $\beta \prec \alpha^\pm_\rho$ of $\alpha$-induced endomorphisms ($\dim \mathrm{Hom}(\iota\rho, \iota\sigma) = \dim \mathrm{Hom}(\alpha^\pm_\rho, \alpha^\pm_\sigma)$). These two results relate to each other in such a way that the monoidal product $a_1 \times a_2$ coincides with the composition of endomorphisms $\beta_1 \circ \beta_2$. In the non-local case, there is no such bijection.
Each member of the DHR orbit induces a boundary CFT $B^{\rm ind}_{a,+}$ as well as a Minkowski space CFT $B^\alpha_{a,2}$ by the $\alpha$-induction construction. Although the associated representations $\pi_a \simeq \theta_a$ in general differ from each other, the following local isomorphism holds.

Proposition 6.1. The inclusions $A_+(O) \subset B^{\rm ind}_{a,+}(O)$ are isomorphic throughout the DHR orbit. The same holds true for the inclusions $A_2(O) \subset B^\alpha_{a,2}(O)$.
Proof. We recall from Sec. 4 that the algebraic structure of the subfactors of interest is coded in the numerical coefficients $\zeta^k_{ij,ef}$ of their Q-systems. The latter, in turn, arise as expansion coefficients (4.19) of the monoidal product $t_i\,\alpha^+_{\tau_i}(t_j)$ of intertwiners $t : \alpha^+_\tau \to \alpha^-_\sigma$ between $\alpha$-induced endomorphisms of both signs. In [5, p. 21], a bijection $\beta_a$ between the intertwiner spaces $\mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma)$ and $\mathrm{Hom}(\alpha^+_{a,\tau}, \alpha^-_{a,\sigma})$ for $\alpha$-inductions to the extensions $B_a$ within a DHR orbit was established. It is therefore sufficient to show that this bijection respects the monoidal product.

Let $M = B(I)$ and $N = A(I)$. For $a \in \mathcal{X}$, let $\bar a : M \to N$ be a conjugate homomorphism, and $\bar a(M) \subset N \subset M_a$ the Jones basic construction [26] associated with the subfactor $\bar a(M) \subset N$. Let $\iota_a : N \to M_a$ be the inclusion homomorphism of $N$ into $M_a$, and $\bar\iota_a : M_a \to N$ a conjugate homomorphism such that $\bar\iota_a(M_a) = \bar a(M) \subset N$. Then $\varphi_a = \bar a^{-1} \circ \bar\iota_a : M_a \to M$ is an isomorphism.

Now, if $\rho$ and $\bar\rho$ are conjugate DHR endomorphisms of $A$ localized in $I$ and $a \prec \iota\rho|_N$, then $\theta_a = \bar\iota_a \iota_a = \bar a a$ is contained in (the restriction to $N$ of) $\bar\rho\,\bar\iota\iota\,\rho = \bar\rho\,\theta_I\,\rho$ which is again a DHR endomorphism localized in $I$. The statistics operators $\varepsilon^\pm(\tau, \theta_a)$ enter the definition of $\alpha$-induction $\alpha^\pm_{a,\tau}$, cf. Appendix B. According to [6, p. 455f], if $T \in \mathrm{Hom}(a, \iota\rho) \subset M$ is isometric, then

$$U^\pm_\tau = T^*\,\iota(\varepsilon^\pm(\tau, \rho))\,\alpha^\pm_\tau(T) \in \mathrm{Hom}(\alpha^\pm_\tau a, a\tau) \subset M \tag{6.1}$$
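is unitary, and $\varepsilon^\pm(\tau, \theta_a) = \bar a(U^\pm_\tau)\,\varepsilon^\pm(\tau, \bar a\iota)$. One finds

$$\alpha^\pm_{a,\tau} = \varphi_a^{-1} \circ \mathrm{Ad}_{U^\pm_\tau} \circ \alpha^\pm_\tau \circ \varphi_a . \tag{6.2}$$

Consequently, the bijection $\beta_a : \mathrm{Hom}(\alpha^+_\tau, \alpha^-_\sigma) \to \mathrm{Hom}(\alpha^+_{a,\tau}, \alpha^-_{a,\sigma})$ is given by

$$\beta_a(t) = \varphi_a^{-1}(U^-_\sigma\,t\,U^{+*}_\tau) . \tag{6.3}$$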
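In order to show that $\beta_a$ respects the monoidal product, we have to show that

$$\varphi_a^{-1}\big(U^-_{\sigma_1\sigma_2}\,t_1 \cdot \alpha^+_{\tau_1}(t_2)\,U^{+*}_{\tau_1\tau_2}\big) = \varphi_a^{-1}\big(U^-_{\sigma_1}\,t_1\,U^{+*}_{\tau_1}\big) \cdot \alpha^+_{a,\tau_1}\varphi_a^{-1}\big(U^-_{\sigma_2}\,t_2\,U^{+*}_{\tau_2}\big) \tag{6.4}$$

which due to (6.2) is equivalent to

$$U^-_{\sigma_1\sigma_2}\,t_1 \cdot \alpha^+_{\tau_1}(t_2)\,U^{+*}_{\tau_1\tau_2} = U^-_{\sigma_1}\,t_1\,U^{+*}_{\tau_1} \cdot U^+_{\tau_1}\,\alpha^+_{\tau_1}\big(U^-_{\sigma_2}\,t_2\,U^{+*}_{\tau_2}\big)\,U^{+*}_{\tau_1} . \tag{6.5}$$

Using "naturality" of the DHR braiding with respect to $\alpha$-induction, as expressed, e.g., in [6, Eq. (14)], we find

$$U^+_{\tau_1}\,\alpha^+_{\tau_1}(U^+_{\tau_2}) = U^+_{\tau_1\tau_2} \tag{6.6}$$

and similar for $U^-_\sigma$, which implies (6.5). This completes the proof.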
Since the structure of the local subfactors in the case of Minkowski space extensions $B_2$ of $A \otimes A$ determines the global structure (thanks to the "unbroken symmetry", i.e. existence of a global conditional expectation in this case, cf. Sec. 2), the associated two-dimensional theories $B^\alpha_{a,2}$ may in fact be considered as identical. In contrast, the boundary CFT nets $B^{\rm ind}_{a,+}$ are defined on different Hilbert spaces $H_a$ given by $\pi_a \simeq \theta_a = \bar\iota_a \iota_a$. In particular, in spite of the algebraic isomorphism of the local subfactors, the corresponding bi-localized charge structures as in Proposition 5.1 differ among different members within the DHR orbit. As a consequence, exemplified by the example (1.11) and (1.12), also the scaling behavior of the local fields towards the boundary differs.

The DHR orbit associates several BCFT's to a given one. For example, the "Cardy case" discussed in the literature [11, 14, 46] is the DHR orbit of the trivial extension $B = A$, $\iota = \mathrm{id}$, which includes $B_+ = A_+^{\rm dual}$. The elements of $\mathcal{X}$ in this case are labeled by the sectors $\iota\rho \equiv \rho$ of $A$. To be more specific, the Hilbert spaces $H_\rho$ carry the representation $\pi_\rho \simeq \theta_\rho \equiv \bar\rho\rho$ of $A$ and hence of $A_+$ and $A_+^{\rm dual}$. Thus, in the Cardy case, the members of the DHR family are just the extensions $\pi_\rho(A_+) \simeq \bar\rho\rho(A_+) \subset \bar\rho\rho(A_+^{\rm dual})$. The non-trivial charge structure of the "charged fields" of $A_+^{\rm dual}$ arises through the non-trivial action of $\bar\rho\rho$ on the charge transporters $u : \sigma^I \to \sigma^J$ (cf. Remark (3) after Definition 2.1). In the Ising model, there are three sectors $[0]$, $[\frac{1}{16}]$, and $[\frac12]$. The corresponding chiral extensions $B_s$ are, in turn, $A$ itself, CAR (cf. Sec. 2), and again $A$ itself (exemplifying the non-injectivity of the association $a \mapsto B_a$). The boundary field nets $B^{\rm ind}_{s,+}$ are generated by $A_+$ and, in turn, charged intertwiners of the structure $\phi_0$ as in (1.11), $\phi_1$ as in (1.12), and again $\phi_0$. In fact, $B^{\rm ind}_{0,+}$ and $B^{\rm ind}_{\frac12,+}$ both coincide with $A_+^{\rm dual}$. A more refined structure distinguishing between $0$ and $\frac12$ will be discussed in the next section.

The main problem, however, is the classification of the other orbits, if there are any. By the results of Sec. 2, this amounts to the classification of non-local chiral extensions $\pi(A) \subset B$, reformulated according to Appendix A as the classification of Q-systems in the DHR category of superselection sectors of $A$.$^{p}$ As explained in Sec. 3.2, this is a finite-dimensional problem and it has only finitely many solutions. Of course, complete classifications can be expected only when the chiral observables $A$ are specified, see e.g., [28].

We speculate that each DHR orbit of non-local chiral extensions $A \subset B_a$ contains a distinguished element which is local, at least if the coupling matrix $Z_{[\sigma][\tau]}$ is of type I [6]. The argument could go like this. Every element of the DHR orbit defines the same theory $B_2$ on Minkowski space by the $\alpha$-induction prescription, see above. This theory in turn has a pair of maximal chiral subalgebras $A_L^{\max} \supset \pi_L^{\max}(A)$ and $A_R^{\max} \supset \pi_R^{\max}(A)$, where $\pi_L^{\max}$ and $\pi_R^{\max}$ are determined by the "vacuum block" of the coupling matrix $Z_{[\sigma][\tau]}$. (We expect that these coincide with $\hat A_L$ and $\hat A_R$ mentioned in the end of Sec. 4.) If $Z$ is of type I, $\pi_L^{\max}$ and $\pi_R^{\max}$ are equivalent,

$^{p}$ See also [38, Theorem 1] according to which every module category arises as the module category of some Frobenius algebra.
and we may suppress the subscript. We conjecture that the local chiral extension $\pi^{\max}(A) \subset A^{\max}$ is a distinguished local element of the orbit. Thus, classification of DHR orbits of BCFT would be reduced to classification of local chiral extensions, cf. [28, 33], or of commutative Frobenius algebras [30]. We hope to return to this conjecture in a separate work.

7. Partition Functions and Modularity

We mention in this section aspects of modular invariant partition functions, as far as they can be easily derived in our framework. Let us recall, however (cf. Sec. 1), that in our approach Modular Invariance of the partition function is not a first principle. Therefore, the natural appearance of the matrix $Z$ (both as the coupling matrix of left and right chiral sectors in the Minkowski space theory $B_2^\alpha$ and as the coupling matrix for the bi-localized charge structure, cf. Proposition 4.4), its automatic modular invariance [6], and the validity of relations (7.4) and (7.7) below also in the flat space QFT framework, is a remarkable fact about the intrinsic structure of Minkowski space CFT with or without boundary.

The structure of the system $\mathcal{X} = \{a \prec \iota\rho \text{ irreducible}\}$ associated with a BCFT (cf. Sec. 6) defines a "nimrep" (non-negative integer matrix representation) of the fusion rules $[s][t] = \bigoplus_u N^u_{st}\,[u]$ of the superselection sectors. Namely, if $a$ belongs to $\mathcal{X}$, then $a \prec \iota\rho$ for some DHR endomorphism $\rho$, hence $a\rho_s \prec \iota\rho\rho_s$, and every irreducible component of $a\rho_s$ again belongs to $\mathcal{X}$. Hence

$$[a\rho_s] = \bigoplus_{b \in \mathcal{X}} n^s_{ab}\,[b] \quad\text{with}\quad \sum_b n^s_{ab}\,n^t_{bc} = \sum_u N^u_{st}\,n^u_{ac} . \tag{7.1}$$

This implies that the diagonal matrix elements $n^s_{aa}$ are the multiplicities of $\rho_s$ within $\theta_a = \bar a a$, thus

$$H_a = H_{B_a} = \bigoplus_s n^s_{aa}\,H_s \equiv \bigoplus_s n^s_{aa}\,H_s . \tag{7.2}$$
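As an illustration of (7.1) and (7.2), the following minimal sketch works out the nimrep for the Cardy-case orbit of the Ising model. It assumes (our choice, not spelled out in the text) that in the Cardy case $\mathcal{X}$ is labeled by the sectors themselves, so that $n^s_{ab} = N^b_{as}$; the sector ordering $0, \frac12, \frac1{16}$ and all function names are hypothetical.

```python
import numpy as np

# Ising sectors in the (assumed) order 0, 1/2, 1/16.
LABELS = ["0", "1/2", "1/16"]

# Fusion coefficients N[s][t, u] = N_{st}^u of the Ising model.
N = np.zeros((3, 3, 3), dtype=int)
N[0] = np.eye(3, dtype=int)                  # [0][t] = [t]
N[1] = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]     # [1/2][1/2] = [0], [1/2][1/16] = [1/16]
N[2] = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]     # [1/16][1/16] = [0] + [1/2]


def cardy_case_nimrep(fusion):
    """Assumption: X = sectors of A, so n[s][a, b] = N_{a s}^{b}."""
    return np.einsum("asb->sab", fusion)


def satisfies_nimrep_relation(n, fusion):
    """Check sum_b n^s_{ab} n^t_{bc} == sum_u N_{st}^u n^u_{ac}, cf. (7.1)."""
    lhs = np.einsum("sab,tbc->stac", n, n)
    rhs = np.einsum("stu,uac->stac", fusion, n)
    return np.array_equal(lhs, rhs)


def diagonal_multiplicities(n, a):
    """Multiplicities n^s_{aa} of H_s inside H_a, cf. (7.2)."""
    return [int(n[s][a, a]) for s in range(n.shape[0])]


n = cardy_case_nimrep(N)
for a, label in enumerate(LABELS):
    print(f"H_a for a = {label}: multiplicities {diagonal_multiplicities(n, a)}")
```

Under these assumptions the boundary label $\frac1{16}$ gives $H_a = H_0 \oplus H_{1/2}$, while the labels $0$ and $\frac12$ both give $H_a = H_0$, in line with the remark in Sec. 6 that the corresponding extensions are CAR and (twice) $A$ itself.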
In the literature on boundary CFT in Statistical Mechanics (for a review see, e.g., [46]) one discusses also theories defined on Hilbert spaces

$$H_{ab} = \bigoplus_s n^s_{ab}\,H_s . \tag{7.3}$$

This leads us to consider "non-diagonal" boundary CFT nets $O \mapsto B_{ab,+}(O) \supset \pi_{ab}(A_+(O))$ and the associated (non-local) chiral nets $I \mapsto B_{ab}(I) \supset \pi_{ab}(A(I))$ which are defined on $H_{ab}$ carrying the DHR representation $\pi_{ab} \simeq \bar a b$, for any pair $a, b \in \mathcal{X}$. These theories arise through projections of the reducible subfactors associated with $\theta = (\bar a \oplus \bar b) \circ (a \oplus b)$. If $a \neq b$, the Hilbert spaces $H_{ab}$ do not contain the vacuum vector (because $n^0_{ab} = \delta_{ab}$), so that the standard theory of chiral extensions as applied in Secs. 2–6 cannot be used. We expect nevertheless (without elaborating) that the results of the previous sections largely carry over to these theories as well, and allow one to make precise contact with the Statistical Mechanics interpretation along the following lines.
The partition function for the spectrum of the chiral conformal Hamiltonian of the boundary CFT on $H_{ab} = \bigoplus_s n^s_{ab} H_s$ is

$$Z_{ab}(\beta) = \mathrm{Tr}_{H_{ab}}\,\pi_{ab}\big(\exp(-\beta(L_0 - \tfrac{c}{24}))\big) = \sum_s n^s_{ab}\,\chi_s(\beta) . \tag{7.4}$$
$n^s$ being a nimrep of the (commutative) fusion rules, its joint spectrum is given by the matrix elements of the modular matrix $S$ (note that in the algebraic approach, complete rationality implies non-degeneracy of the braiding [29], and hence the DHR statistics defines a unitary representation of the modular group $SL(2, \mathbb{Z})$ [16]), i.e. one has the "Cardy equation"

$$n^s_{ab} = \sum_t \psi_{at}\,\frac{S_{st}}{S_{0t}}\,\psi^*_{bt} . \tag{7.5}$$
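The Cardy equation (7.5) can be checked numerically in the simplest situation. The sketch below assumes the Cardy case of the Ising model, where we take $\psi_{at} = S_{at}$ (an assumption made here for illustration), so that (7.5) reduces to the Verlinde formula; the sector ordering $0, \frac12, \frac1{16}$ and the function name are hypothetical.

```python
import numpy as np

R2 = np.sqrt(2.0)

# Modular S-matrix of the Ising model, sectors ordered 0, 1/2, 1/16.
S = 0.5 * np.array([[1.0, 1.0, R2],
                    [1.0, 1.0, -R2],
                    [R2, -R2, 0.0]])


def cardy_nimrep(psi, s_matrix):
    """n[s][a, b] = sum_t psi[a, t] * S[s, t] / S[0, t] * conj(psi[b, t]), cf. (7.5)."""
    ratio = s_matrix / s_matrix[0]      # ratio[s, t] = S_{st} / S_{0t}
    return np.einsum("at,st,bt->sab", psi, ratio, np.conj(psi))


# Cardy-case assumption: the diagonalizing matrix psi is S itself.
n = cardy_nimrep(S, S)
n_int = np.rint(n.real).astype(int)

for s, label in enumerate(["0", "1/2", "1/16"]):
    print(f"n^s for s = {label}:")
    print(n_int[s])
```

With this input the matrices $n^s$ come out as non-negative integer matrices equal to the Ising fusion matrices, as expected for a nimrep in the Cardy case. For a different orbit one would have to supply the appropriate $\psi_{at}$, which this sketch does not attempt to compute.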
Inserting this expansion in the partition function $Z_{ab}$, and taking for granted the modular transformation law of the chiral characters $\chi_s(\beta)$, one obtains

$$Z_{ab}(\beta) = \sum_t \psi^*_{bt}\,\chi_t(\hat\beta)\,\psi_{at} , \tag{7.6}$$

where $\hat\beta = 4\pi^2/\beta$ is the modular transform of the inverse temperature $\beta$. Usually [46], the right-hand side of this formula is reinterpreted as a matrix element of the conformal Hamiltonian of the Minkowski space theory between a pair of so-called "Ishibashi boundary states" $|a\rangle = \sum_t \psi_{at}\,|t\rangle$, which weakly realize the boundary condition $T_L = T_R$:

$$Z_{ab}(\beta) = \langle b|\,\exp\big(-\tfrac12 \hat\beta\,(L_0^L + L_0^R - \tfrac{c}{12})\big)\,|a\rangle . \tag{7.7}$$
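The passage from (7.4) to the dual channel with $\hat\beta = 4\pi^2/\beta$ can be tested numerically. The following sketch again assumes the Cardy case of the Ising model ($\psi = S$, $n^s_{ab} = N^b_{as}$) and uses the standard product formulas for the $c = \frac12$ Virasoro characters with $q = e^{-\beta}$. It keeps the factor $1/S_{0t}$ from (7.5) explicit in the dual expression; how that factor is distributed over the normalization conventions of (7.6) and (7.7) is not something this sketch decides. All names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

# Ising data, sectors ordered 0, 1/2, 1/16.
S = [[0.5, 0.5, SQRT2 / 2],
     [0.5, 0.5, -SQRT2 / 2],
     [SQRT2 / 2, -SQRT2 / 2, 0.0]]

# FUSION[s][a][b] = N_{a s}^{b}; in the assumed Cardy case this is n^s_{ab}.
FUSION = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[0, 1, 0], [1, 0, 0], [0, 0, 1]],
    [[0, 0, 1], [0, 0, 1], [1, 1, 0]],
]


def ising_characters(beta, terms=200):
    """Return (chi_0, chi_1/2, chi_1/16) at c = 1/2 for q = exp(-beta)."""
    plus = minus = full = 1.0
    for k in range(1, terms + 1):
        half_power = math.exp(-beta * (k - 0.5))   # q^(k - 1/2)
        plus *= 1.0 + half_power
        minus *= 1.0 - half_power
        full *= 1.0 + math.exp(-beta * k)          # 1 + q^k
    pref = math.exp(beta / 48.0)                   # q^(-c/24) with c = 1/2
    return [0.5 * pref * (plus + minus),
            0.5 * pref * (plus - minus),
            math.exp(-beta / 24.0) * full]         # q^(1/16 - 1/48) * prod(1 + q^k)


def z_direct(a, b, beta):
    """Z_ab(beta) = sum_s n^s_{ab} chi_s(beta), cf. (7.4)."""
    chi = ising_characters(beta)
    return sum(FUSION[s][a][b] * chi[s] for s in range(3))


def z_dual(a, b, beta):
    """Dual-channel form: sum_t S_at S_bt / S_0t * chi_t(4 pi^2 / beta)."""
    chi_hat = ising_characters(4.0 * math.pi ** 2 / beta)
    return sum(S[a][t] * S[b][t] / S[0][t] * chi_hat[t] for t in range(3))


for a in range(3):
    for b in range(3):
        print(a, b, z_direct(a, b, 1.0), z_dual(a, b, 1.0))
```

The two columns printed for each pair $(a, b)$ should agree to floating-point accuracy, which is just the modular transformation law $\chi_s(\beta) = \sum_t S_{st}\,\chi_t(\hat\beta)$ combined with the Cardy equation.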
These Ishibashi states, however, are linear combinations of non-normalizable vectors in the Hilbert spaces $H_t \otimes H_t$. It was pointed out, e.g., in [22] that Ishibashi states, rather than vector states on $A \otimes A$, should be considered as KMS (= Gibbs in this case) states on $A$, where the second copy of $A$ appears via Tomita's Modular Theory [43, Chap. VI, Theorem 1.19] as the commutant of $A$ in the GNS representation of the KMS state. While we have not elaborated these issues, we hope to arrive, in a future publication, at a better algebraic understanding of the structures outlined in this section.

8. Conclusion

We have classified boundary conformal quantum field theories in terms of chiral extensions of the underlying local chiral observables $A$. These extensions, which are in general non-local, are in turn classified in terms of Q-systems (Frobenius algebras) in the DHR modular category of superselection sectors of $A$. We have analyzed how general structural properties of the chiral observables are transmitted to the local algebras of the BCFT. Among other things, we have shown the absence of DHR superselection sectors of the latter (Sec. 2).
A chiral extension determines both a BCFT and a Minkowski space CFT. Well away from the boundary, these two theories are algebraically indistinguishable (Sec. 4). Only near the boundary, the breakdown of symmetry changes the algebraic structure. This effect is exhibited in the bi-localized charge structure of the local fields in the BCFT. This structure can be derived (and explicitly computed) from the superselection structure (the DHR modular category) of the chiral observables. It may be regarded as an algebraic invariant for the embedding of the latter into the full theory (Sec. 5). The bi-localized charge structure in turn determines the scaling behavior of the local fields with $x \to 0$. In this sense, the boundary "conditions" on the non-chiral fields of BCFT are in fact rather a derived feature.

BCFT's associated with the same chiral observables can be grouped into families ("DHR orbits") which are algebraically isomorphic well away from the boundary, but differ near the boundary. The members of each orbit may thus be interpreted as the different ways a Minkowski space CFT may "react" to the presence of a boundary; but they cannot (in general) be considered as different representations of the same abstract theory on the half-space (Sec. 6). Each DHR orbit is accompanied by a "nimrep" of the fusion rules of the chiral observables, which controls the modular behavior of the partition function of the conformal Hamiltonian (Sec. 7).

Boundary CFT with two boundaries [46], corresponding to a QFT on a strip $0 < x < L$, may be largely formulated along the same lines, provided $t \pm x$ are interpreted as angular coordinates of the circle, adjusted with a normalization factor $L/2\pi$, rather than Cartesian coordinates of the lightlike axes. However, the connection with "non-diagonal" BCFT (cf. Sec. 7) remains to be understood.
Acknowledgments

KHR thanks the Dipartimento di Matematica of the University of Rome "Tor Vergata" (where the idea for this work was created) and the Max Planck Institute for Physics in Munich (where part of the work was completed) for hospitality, D. Evans and Y. Kawahigashi for helpful correspondence, and J. Fuchs for interesting discussions. RL is supported in part by GNAMPA-INDAM, MIUR and EU-HPP.

Appendix A. Q-Systems and Algebras of Charged Intertwiners

We give a brief reminder of the notion of Q-system associated with a subfactor $N \subset M$ of type III von Neumann algebras, and then present a lemma concerning the generation of $M$ in terms of charged intertwiners. This lemma is the obvious generalization of an argument used in [29] in a special case.

A subfactor $N \subset M$ is irreducible if $N' \cap M = \mathbb{C} \cdot 1$. The index $[M : N]$ is the optimal bound $\lambda \geq 1$ such that there is a conditional expectation $E : M \to N$ satisfying the lower operator bound $E(m^* m) \geq \lambda^{-1} \cdot m^* m$. The dimension $d(\rho)$ of an endomorphism $\rho \in \mathrm{End}(N)$ is the square root of the index $[N : \rho(N)]$.
The condition of finite index is equivalent to the property that, with $\iota : N \to M$ the inclusion homomorphism, there is a "conjugate" homomorphism $\bar\iota : M \to N$ and a "canonical" pair of isometric intertwiners $v : \mathrm{id}_M \to \gamma := \iota\bar\iota \in \mathrm{End}(M)$ in $M$ and $w : \mathrm{id}_N \to \theta := \bar\iota\iota \in \mathrm{End}(N)$ in $N$, such that $\iota(w)^* v = \lambda^{-1/2}\,1_M$ and $\bar\iota(v)^* w = \lambda^{-1/2}\,1_N$. Then, $E(m) = \iota(w)^* \gamma(m)\,\iota(w)$ is the (unique, if $N \subset M$ is irreducible) conditional expectation. $\gamma$ and $\theta$ are the "canonical" and "dual canonical" endomorphisms associated with the subfactor, and $d(\gamma) = d(\theta) = \lambda = [M : N]$.

A Q-system in $M$ is a triple $(\rho, T, S)$ where $\rho \in \mathrm{End}(M)$ is an endomorphism of $M$, and $T$ and $S$ are isometric intertwiners $T : \mathrm{id} \to \rho$ and $S : \rho \to \rho^2$ in $M$, satisfying the relations$^{q}$

$$T^* S = \rho(T^*) S = \lambda^{-1/2} \cdot 1, \qquad SS = \rho(S) S . \tag{A.1}$$
A Q-system in $M$ determines a subfactor $N \subset M$ of index $\lambda$ in terms of data of $M$ as the image $N := E(M)$ of the conditional expectation $E : M \to N$, defined by $E(m) := T^* \rho(m) T$. Thus, the Q-system for $N \subset M$ is $(\gamma, v, \iota(w))$. Likewise, the Q-system in $N$ for $\bar\iota(M) \subset N$ (the dual Q-system for $N \subset M$) is $(\theta, w, \bar\iota(v))$.

By Jones' "basic construction" [26], a subfactor $N \subset M$ determines (up to unitary equivalence) another subfactor $M \subset M_1$, isomorphic to $\bar\iota(M) \subset N$. The basic construction applied to $\bar\iota(M) \subset N$ recovers $N \subset M$ (up to isomorphism), hence $N \subset M$ is also determined by its dual Q-system $(\theta, w, x)$ in $N$.

Given a Q-system $(\theta, w, x)$ in $N$, a concrete realization of $M$ results if one finds a representation of $N$ in a Hilbert space $H$ and an isometry $v$ in $B(H)$ such that $(\gamma, v, w)$ form a Q-system in $M := N \vee \{v\}$ where $\gamma$ extends $\theta$ by setting $\gamma(v) := x$. Then, with $\iota$ the inclusion map of $N$ into $M$ via its representation on $H$, $(\gamma, v, w)$ is the Q-system for $N \subset M$, and $(\theta, w, x)$ is the dual Q-system. We make use of this constructive scheme in Sec. 4.

The conditions on $v$ which ensure that $(\gamma, v, w)$ form a Q-system with $\gamma|_N = \theta$ and $\gamma(v) = x$, can be formulated as an algebra of charged intertwiners, as follows. Let $N \subset M$ be a subfactor of finite index, and $(\gamma, v, w)$ and $(\theta, w, x)$ be its Q-system and dual Q-system, and

$$\theta(n) = \sum_i w_i\,\varrho_i(n)\,w_i^* , \qquad (n \in N) \tag{A.2}$$

the decomposition of $\theta$ into irreducibles (choosing representatives $\varrho_i = \varrho_j$ whenever $\varrho_i$ and $\varrho_j$ are equivalent), $\varrho_0 = \mathrm{id}$, $w_0 = w$. Then the charged intertwiners

$$\psi_i := d(\theta)^{1/2} \cdot w_i^* v \in M \tag{A.3}$$

satisfy

$$\psi_i\,n = \varrho_i(n)\,\psi_i , \qquad (n \in N) ; \tag{A.4}$$

$^{q}$ In more general frameworks, such as Frobenius algebras in tensor categories [18], one has to require in addition an equivalent of $SS^* = \rho(S^*) S$; in a $C^*$ context as ours, the latter relation follows from the remaining ones [34, Sec. 6].
we say that "$\psi_i$ carry charge $\varrho_i$". The charged intertwiner for $\varrho_0 = \mathrm{id}$ is

$$\psi_0 = 1 , \tag{A.5}$$

and whenever $\varrho_i = \varrho_j$, one has the normalization

$$\psi_i^*\,\psi_j = d(\varrho_i) \cdot \delta_{ij} . \tag{A.6}$$

Together with $N$, the charged intertwiners generate $M$; more precisely, every element of $M$ has a unique expansion

$$m = \sum_k n_k\,\psi_k , \qquad (n_k = E(m\psi_k^*) \in N) . \tag{A.7}$$

As $x \in N$ is an intertwiner $x : \theta \to \theta^2$, it has a unique expansion

$$x = d(\theta)^{-1/2} \sum_{ijk} w_i\,\varrho_i(w_j)\,\Gamma^k_{ij}\,w_k^* \tag{A.8}$$

with intertwiners $\Gamma^k_{ij} : \varrho_k \to \varrho_i\varrho_j$ in $N$. Transcribing the relations of the Q-system $vv = \gamma(v)v = xv$ and $v^* = d(\theta)^{1/2}\,w^* v v^* = d(\theta)^{1/2}\,w^* \gamma(v^*) v = d(\theta)^{1/2}\,w^* x^* v$ in terms of the charged intertwiners, one arrives at

$$\psi_i\,\psi_j = \sum_k \Gamma^k_{ij}\,\psi_k , \qquad \psi_i^* = \sum_j \Gamma^{0*}_{ji}\,\psi_j . \tag{A.9}$$
We note that, by (A.6) alone, $\sum_i t_i \psi_i = 0$ with $N \ni t_i : \varrho_i \to \varrho$ implies $t_i = 0$. Indeed, multiplying the sum from the left by $\psi_j^* s^*$ ($N \ni s : \varrho_j \to \varrho$) one gets $s^* t_j = 0$. Since $s$ is arbitrary, $t_j = 0$. Therefore, the relations (A.4)–(A.6) and (A.9) among the charged intertwiners impose constraints on the "coefficients" $\Gamma^k_{ij}$, e.g., $\Gamma^k_{0j} = \delta_{jk} 1$, $\Gamma^k_{i0} = \delta_{ik} 1$, as well as $\sum_n \Gamma^n_{ij}\,\Gamma^l_{nk} = \sum_m \varrho_i(\Gamma^m_{jk})\,\Gamma^l_{im}$ (associativity of the expansion (A.9)). Furthermore, $\Gamma^0_{ij} : \mathrm{id} \to \varrho_j\varrho_i$ vanish unless $\varrho_j$ is conjugate to $\varrho_i$, and $\sum_k \Gamma^{k*}_{ij}\,\Gamma^k_{ij}\,d(\varrho_k) = d(\varrho_i)\,d(\varrho_j)$. The relation

$$\sum_{ij} \Gamma^{k*}_{ij}\,\Gamma^{k'}_{ij} = d(\theta)\,\delta_{kk'} \cdot 1 \tag{A.10}$$

is of a different status: it follows from $x^* x = 1$, but seemingly not from the relations among the charged intertwiners alone. This set of relations (A.4)–(A.6) and (A.9) in $M$ is, like the Q-system $(\gamma, v, w)$, a complete invariant for the subfactor $N \subset M$, see Lemma A.2.
Definition A.1. Let $\varrho_i$ be a finite system of pairwise either inequivalent or equal irreducible endomorphisms of $N$ of finite dimension among which $\varrho_0 = \mathrm{id}$ occurs precisely once, and $\Gamma^k_{ij} \in \mathrm{Hom}(\varrho_k, \varrho_i\varrho_j) \subset N$. An algebra of charged intertwiners for $N$ is a system of operators $\psi_i$ satisfying (i) the intertwining property (A.4) for $\varrho_i$, (ii) the normalizations as in (A.5), (A.6), and (iii) the algebra (A.9) with coefficients satisfying (A.10), where $d(\theta) := \sum_i d(\varrho_i)$.

The first statement of the following lemma summarizes the above discussion; the second statement is the converse: the algebra of charged intertwiners determines the
subfactor and its Q-system. While a special case underlies the argument leading to [29, Corollary 45], we think it appropriate to formulate the general case.

Lemma A.2. (i) An irreducible subfactor $N \subset M$ determines an algebra of charged intertwiners. In particular, the condition (A.10) on the coefficients is automatic, if they arise in this way.
(ii) Let $\psi_i \in B(H)$ be an algebra of charged intertwiners for $N$ with endomorphisms $\varrho_i \in \mathrm{End}(N)$ and coefficients $\Gamma^k_{ij} \in N$, and $M$ be the algebra generated by $N$ and $\psi_i$. Then the subfactor $N \subset M$ has Q-system $(\gamma, v, w)$ and dual Q-system $(\theta, w, x)$, where (in turn) $\theta$ is defined as in (A.2) with the help of any complete orthogonal system of isometries $w_i \in A$, $w := w_0$, $x$ is defined as in (A.8), $v := d(\theta)^{-1/2} \sum_i w_i \psi_i$, and by definition $\gamma$ extends $\theta$ by $\gamma(nv) := \theta(n)x$.
(iii) Two algebras of charged intertwiners with the same endomorphisms in $\mathrm{End}(N)$ and coefficients in $N$ give rise to isomorphic subfactors (possibly on different Hilbert spaces), with the isomorphism given by identification of the charged intertwiners.

Sketch of the proof. (ii) It is straightforward to see that the relations of the algebra of charged intertwiners ensure the following: $wn = \theta(n)w$, $x\theta(n) = \theta^2(n)x$, $vn = \theta(n)v$ $(n \in N)$; $w^*w = 1$, $v^*v = 1$, $x^*x = 1$; $\theta(w^*)x = d(\theta)^{-1/2}\,1$, $w^*x = w^*\gamma(v) = d(\theta)^{-1/2}\,1$, $w^*v = d(\theta)^{-1/2}\,1$; $vv = xv$, $xx = \theta(x)x$; and $v^* = d(\theta)^{1/2} \cdot w^*x^*v$. These include the defining relations for $(\theta, w, x)$ to be a Q-system in $N$. The missing information for $(\gamma, v, w)$ to be a Q-system is that $\gamma$ is an endomorphism. It is also straightforward to see that $\gamma$ respects products and the previous relations, and
$$\gamma(v^*) = d(\theta)^{1/2} \cdot \theta(w^*x^*)x = d(\theta)^{1/2} \cdot \theta(w^*)xx^* = x^* = \gamma(v)^* . \tag{A.11}$$
Because $\psi_i = d(\theta)^{1/2} \cdot w_i^* v$ by definition of $v$, $v$ and $N$ generate $M$, so $\gamma$ is indeed an endomorphism. Hence $(\gamma, v, w)$ is a Q-system, and $(\theta, w, x)$ its dual ($\theta = \gamma|_N$ and $x = \gamma(v)$). Finally $N = w^*\gamma(M)w$ because $w^*\gamma(vnv^*)w = d(\theta)^{-1} \cdot n$, showing that the Q-systems are indeed the Q-system associated with $N \subset M$ and its dual. The subfactor is irreducible since id is contained in $\theta$ with multiplicity one. (iii) is now obvious.

Appendix B. α-Induction

We collect a number of well-known results on α-induction [33, 3], used in the course of our arguments in this article.

Definition B.1 ([33]). Let $N$ be a factor, and $\Delta$ a set of endomorphisms $\varrho$ of $N$ equipped with a braiding $\varepsilon(\varrho_1, \varrho_2) : \varrho_1\varrho_2 \to \varrho_2\varrho_1$, giving rise to a braided $C^*$ tensor category with direct sums and subobjects. Let $N \subset M$ be an irreducible subfactor with canonical endomorphism $\gamma$, and dual canonical endomorphism $\theta = \gamma|_N$ such that $\theta \in \Delta$. Then for $\varrho \in \Delta$,
$$\alpha^+_\varrho := \gamma^{-1} \circ \mathrm{Ad}_{\varepsilon(\theta,\varrho)} \circ \varrho \circ \gamma \tag{B.1}$$
October 15, 2004 11:10 WSPC/148-RMP
00216
Local Fields in Boundary Conformal QFT
957
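extends the endomorphism $\varrho \in \Delta$ of $N$ to an endomorphism $\alpha^+_\varrho$ of $M$. (The nontrivial fact is that $\mathrm{Ad}_{\varepsilon(\theta,\varrho)} \circ \varrho \circ \gamma(M)$ belongs to $\gamma(M)$.) One has
$$\alpha^+_\varrho(n) = \varrho(n) \quad\text{and}\quad \alpha^\pm_\varrho(v) = \varepsilon(\theta, \varrho)\,v \tag{B.2}$$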
for $n \in N$ and $v$ the canonical isometry in the Q-system $(\gamma, v, w)$. The same holds true for $\alpha^-_\varrho$, replacing the braiding in (B.1), (B.2) with the opposite braiding $\varepsilon^-(\varrho_1, \varrho_2) = \varepsilon(\varrho_2, \varrho_1)^*$. The endomorphisms $\alpha^\pm_\varrho$ are invariant under inner conjugations of the Q-system (i.e. $\gamma \mapsto \mathrm{Ad}_u \circ \gamma$, $v \mapsto uv$, $w \mapsto uw$, $u \in N$ unitary).

Proposition B.2 ([33]). If $I \mapsto [A(I) \subset B(I)]$ is a (chiral) quantum field theoretical net of subfactors, then the dual canonical endomorphism of $A(I)$ for $A(I) \subset B(I)$ extends to a DHR endomorphism $\theta$ of the net $A$ with $[\theta]$ independent of $I$. With $\Delta$ the set of DHR endomorphisms and $\varepsilon(\rho_1, \rho_2)$ the DHR braiding [15], $\alpha^\pm_\rho$ can be defined as endomorphisms of the net $B$ such that (B.2) and its analog for $\alpha^-_\rho$ still hold. If $\rho$ is localized in $I$, then $\alpha^-_\rho$ (respectively $\alpha^+_\rho$) is a semi-localized endomorphism of $B$, i.e. it acts trivially on $B(K)$ whenever $I < K$ (respectively $I > K$).

Proposition B.3. Let $\rho$, $\sigma$ be localized in $I$. Then for every combination of $\varepsilon = \pm$, $\varepsilon' = \pm$, the space of global intertwiners
$$\{ t \in B : t\,\alpha^{\varepsilon}_\rho(b) = \alpha^{\varepsilon'}_\sigma(b)\,t \ \text{for all } b \in B \} \tag{B.3}$$
coincides with the space of local intertwiners
$$\{ t \in B(I) : t\,\alpha^{\varepsilon}_\rho(b) = \alpha^{\varepsilon'}_\sigma(b)\,t \ \text{for all } b \in B(I) \} . \tag{B.4}$$
Proof. Every element $b \in B$ can be written uniquely as $b = av$ with $a \in A$ and $v \in B(I)$ the charged intertwiner of the Q-system $(\gamma, v, w)$ for $A(I) \subset B(I)$ [33]; furthermore $b \in B(I)$ if and only if $a \in A(I)$. Let $t = av$ be a global intertwiner. Then $t\,\alpha^{\varepsilon}_\rho(a_1) = \alpha^{\varepsilon'}_\sigma(a_1)\,t$ implies that $a$ is a global intertwiner in $\mathrm{Hom}(\theta\rho, \sigma)$, hence $a \in A(I)$ by Haag duality of $A$, hence $t \in B(I)$ is in fact a local intertwiner. Conversely, let $t = av$ be a local intertwiner. Then $a$ is a local intertwiner, hence [20, Theorem 2.3] a global intertwiner in $\mathrm{Hom}(\theta\rho, \sigma)$. Thus, we have $t\,\alpha^{\varepsilon}_\rho(b) = \alpha^{\varepsilon'}_\sigma(b)\,t$ for all $b \in B(I)$ by assumption, and for all $b \in A(I')$ by the trivial action of the endomorphisms on $A(I')$ and locality. Since $B(I)$ and $A(I')$ generate all of $B$ by strong additivity of $A$, $t$ is a global intertwiner.

Proposition B.4 ([3, Lemmas 3.5 and 3.25]). Let $\rho$, $\sigma$, $\tau$ be DHR endomorphisms of $A$.
(i) If $T \in \mathrm{Hom}(\rho, \sigma) \subset A$, then also $T \in \mathrm{Hom}(\alpha^\pm_\rho, \alpha^\pm_\sigma) \subset B$.
(ii) If $t \in \mathrm{Hom}(\alpha^\pm_\rho, \alpha^\pm_\sigma)$, then the naturality relations $\alpha^\pm_\tau(t)\,\varepsilon(\rho, \tau) = t\,\varepsilon(\sigma, \tau)$ and $\alpha^\pm_\tau(t)\,\varepsilon(\tau, \rho)^* = t\,\varepsilon(\tau, \sigma)^*$ hold.
Note Added in Proof

Equation (A.10) is redundant in the definition of an algebra of charged intertwiners. Namely, the normalization (A.6), involutivity and antimultiplicativity of the $*$ and associativity of the product in (A.9) give relations among the coefficients $\Gamma^k_{ij} \in N$ which are respectively transcribed as $w^*x^*xw = 1$, $r^*\theta(r) = 1$, $r^*\theta(r^*)x = r^*\theta(x^*)$, and $xx = \theta(x)x$, where $r := d(\theta)^{1/2} \cdot xw : \mathrm{id} \to \theta^2$. The last three give in turn $r^*\theta(x^*) = r^*x^*$, $x^* = \theta(r^*)x$, and $xx^* = \theta(x^*)x$, from which one obtains by induction $(x^*x)^{n+1} = x^*\theta((x^*x)^n)x$. Using $w^*(x^*x)^{n+1}w = (w^*x^*xw)^{n+1} = 1$ gives $\phi((x^*x)^n) = 1$, where $\phi$ is the non-degenerate left-inverse of $\theta$ induced by $xw$. This implies $x^*x = 1$ (the transcription of (A.10)). By the above argument and [29, Proposition 45], one can define an action on $N$ of the system $(\varrho_i, \Gamma^k_{ij})$ subject to the mentioned relations, such that $N \subset M$ arises as the crossed product of $N$ by this action.

References

[1] C. D’Antoni, R. Longo and F. Radulescu, Conformal nets, maximal temperature and models from free probability, J. Oper. Theory 45 (2001) 195–208, math.OA/9810003.
[2] C. D’Antoni, K. Fredenhagen and S. Köster, Implementation of conformal covariance by diffeomorphism symmetry, Lett. Math. Phys. 67 (2004) 239–247.
[3] J. Böckenhauer and D. E. Evans, Modular invariants, graphs and α-induction for nets of subfactors, Commun. Math. Phys. 197 (1998) 361–386.
[4] J. Böckenhauer and D. E. Evans, Modular invariants from subfactors: type I coupling matrices and intermediate subfactors, Commun. Math. Phys. 213 (2000) 267–289.
[5] J. Böckenhauer and D. E. Evans, Modular invariants and subfactors, Fields Inst. Commun. 30 (2001) 11–37, math.OA/0008056.
[6] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, On α-induction, chiral generator and modular invariants for subfactors, Commun. Math. Phys. 208 (1999) 429–487.
[7] J. Böckenhauer, D. E. Evans and Y. Kawahigashi, Longo–Rehren subfactors arising from α-induction, Publ. RIMS (Kyoto) 37 (2001) 1–35, math.OA/0002154.
[8] H.-J. Borchers, Energy and momentum as observables in quantum field theory, Commun. Math. Phys. 2 (1966) 49–54.
[9] D. Buchholz, C. D’Antoni and R. Longo, Nuclear maps and modular structures II: application to quantum field theory, Commun. Math. Phys. 129 (1990) 115–138.
[10] R. E. Behrend, P. A. Pearce, V. C. Petkova and J.-B. Zuber, Boundary conditions in rational conformal field theories, Nucl. Phys. B579 (2000) 707–773.
[11] J. Cardy, Conformal invariance and surface critical behavior, Nucl. Phys. B240 (1984) 514–532.
[12] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics, 1+2, Commun. Math. Phys. 23 (1971) 199–230; ibid. 35 (1974) 49–85.
[13] D. E. Evans, Fusion rules of modular invariants, Rev. Math. Phys. 14 (2002) 709–732.
[14] G. Felder, J. Fröhlich, J. Fuchs and C. Schweigert, Correlation functions and boundary conditions in RCFT and three-dimensional topology, Compos. Math. 131 (2002) 189–237, hep-th/9912239.
[15] K. Fredenhagen, K.-H. Rehren and B. Schroer, Superselection sectors with braid group statistics, 1, Commun. Math. Phys. 125 (1989) 201–226.
[16] K. Fredenhagen, K.-H. Rehren and B. Schroer, Superselection sectors with braid group statistics, II, Rev. Math. Phys. SI1 (Special issue) (1992) 113–157.
[17] J. Fuchs and C. Schweigert, Solitonic sectors, α-induction and symmetry breaking boundaries, Phys. Lett. B490 (2000) 163–172.
[18] J. Fuchs and C. Schweigert, Category theory for conformal boundary conditions, Fields Inst. Commun. 39 (2003) 25–71, math.CT/0106050.
[19] J. Fuchs, I. Runkel and C. Schweigert, TFT construction of RCFT correlators: partition functions, Nucl. Phys. B646 (2002) 353–497.
[20] D. Guido and R. Longo, The conformal spin and statistics theorem, Commun. Math. Phys. 181 (1996) 11–35.
[21] R. Haag, Local Quantum Physics (Springer Verlag, Berlin–Heidelberg–New York, 1996).
[22] K. C. Hannabuss and M. Semplice, Boundary conformal fields and Tomita–Takesaki theory, J. Math. Phys. 44 (2003) 5517–5529.
[23] M. Izumi and H. Kosaki, On a subfactor analogue of the second cohomology, Rev. Math. Phys. 14 (2002) 733–757.
[24] M. Izumi, R. Longo and S. Popa, A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras, J. Funct. Anal. 155 (1998) 25–63.
[25] R. Jaffe, Unnatural acts: unphysical consequences of imposing boundary conditions on quantum fields, AIP Conf. Proc. 687 (2003) 3–12, hep-th/0307014.
[26] V. Jones, Index for subfactors, Invent. Math. 72 (1983) 1–25.
[27] Y. Kawahigashi, Generalized Longo–Rehren subfactors and α-induction, Commun. Math. Phys. 226 (2002) 269–287.
[28] Y. Kawahigashi and R. Longo, Classification of local conformal nets: case c < 1, to appear in Ann. Math., math-ph/0201015.
[29] Y. Kawahigashi, R. Longo and M. Müger, Multi-interval subfactors and modularity of representations in conformal field theory, Commun. Math. Phys. 219 (2001) 631–669.
[30] A. Kirillov and V. Ostrik, On a q-analog of the McKay correspondence and the ADE classification of $\widehat{sl}_2$ conformal field theories, Adv. Math. 171 (2002) 183–227, math.QA/0101219.
[31] R. Longo, A duality for Hopf algebras and for subfactors, I, Commun. Math. Phys. 159 (1994) 133–150.
[32] R. Longo, Conformal subnets and intermediate subfactors, Commun. Math. Phys. 237 (2003) 7–30.
[33] R. Longo and K.-H. Rehren, Nets of subfactors, Rev. Math. Phys. 7 (1995) 567–597.
[34] R. Longo and J. E. Roberts, A theory of dimension, K-Theory 11 (1997) 103–159.
[35] R. Longo and F. Xu, Topological sectors and dichotomy in conformal field theory, to appear in Commun. Math. Phys., math.OA/0309366.
[36] M. Müger, Superselection structure of massive quantum field theories in 1+1 dimensions, Rev. Math. Phys. 10 (1998) 1147–1170.
[37] M. Müger, From subfactors to categories and topology I: Frobenius algebras in and Morita equivalence of tensor categories, J. Pure Appl. Algebra 180 (2003) 81–157.
[38] V. Ostrik, Module categories, weak Hopf algebras and modular invariants, Transform. Groups 8 (2003) 177–206, math.QA/0111139.
[39] K.-H. Rehren, Chiral observables and modular invariants, Commun. Math. Phys. 208 (2000) 689–712.
[40] K.-H. Rehren, Canonical tensor product subfactors, Commun. Math. Phys. 211 (2000) 395–406.
[41] K.-H. Rehren, Locality and modular invariance in 2D conformal QFT, Fields Inst. Commun. 30 (2001) 341–354, math-ph/0009004.
[42] K.-H. Rehren and B. Schroer, Exchange algebra on the light-cone and order/disorder 2n-point functions in the Ising field theory, Phys. Lett. B198 (1987) 84–88.
[43] M. Takesaki, Theory of Operator Algebras II, Springer Encyclopedia of Mathematical Sciences, Vol. 125 (Springer, 2003).
[44] V. G. Turaev, Modular categories and 3-manifold invariants, Int. J. Mod. Phys. B6 (1992) 1807–1824.
[45] F. Xu, On a conjecture of Kac–Wakimoto, Publ. RIMS (Kyoto) 37 (2001) 165–190, math.RT/9904098.
[46] J.-B. Zuber, CFT, BCFT, ADE and all that, Contemp. Math. 294 (2002) 230–266, hep-th/0006151.
November 4, 2004 14:58 WSPC/148-RMP
00220
Reviews in Mathematical Physics Vol. 16, No. 8 (2004) 961–976 c World Scientific Publishing Company
NONLINEAR SURFACE SUPERCONDUCTIVITY IN THE LARGE κ LIMIT
Y. ALMOG
Faculty of Mathematics, Technion - Israel Institute of Technology, Haifa 32000, Israel

Received 26 October 2003
Revised 10 August 2004

The Ginzburg–Landau model for superconductivity is considered in two dimensions. We show, for smooth bounded domains, that the superconductivity order parameter decays exponentially fast away from the boundary as the Ginzburg–Landau parameter $\kappa$ tends to infinity. We prove this result for applied magnetic fields satisfying $h_{ex} - \kappa \gg \log\kappa/\kappa$, and therefore improve a recent result of Pan [16].

Keywords: Surface superconductivity; Ginzburg–Landau; large κ limit.
1. Introduction

Consider a planar superconducting body which is placed at a sufficiently low temperature (below the critical one) under the action of an external magnetic field. Its energy is given by the Ginzburg–Landau energy functional, which can be represented in the following dimensionless form [6]
$$E = \int_\Omega \Big( -|\Psi|^2 + \frac{|\Psi|^4}{2} + |h - h_{ex}|^2 + \Big| \frac{i}{\kappa}\nabla\Psi + A\Psi \Big|^2 \Big)\, dx\,dy \tag{1.1}$$
in which $\Psi$ is the (complex) superconducting order parameter, such that $|\Psi|$ varies from $|\Psi| = 0$ (when the material is at a normal state) to $|\Psi| = 1$ (for the purely superconducting state). The magnetic vector potential is denoted by $A$ (the magnetic field is then given by $h = \nabla \times A$), $h_{ex}$ is the constant applied magnetic field, and $\kappa$ is the Ginzburg–Landau parameter, which is a material property. Superconductors for which $\kappa < 1/\sqrt{2}$ are termed type I superconductors, and those for which $\kappa > 1/\sqrt{2}$ are termed type II. The superconductor lies in a smooth domain $\Omega$ ($\partial\Omega$ is at least $C^{2,\alpha}$) and its Gibbs free energy is given by $E$. Note that $E$ is invariant to the gauge transformation
$$\Psi \to e^{i\kappa\eta}\Psi ; \qquad A \to A + \nabla\eta . \tag{1.2}$$
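As a quick check of the invariance stated in (1.2), the following computation may be helpful. It is our own verification rather than part of the original text, and it assumes that $\eta$ is a smooth real function and that the transformation acts on the order parameter as $\Psi \to e^{i\kappa\eta}\Psi$. It shows that the gauge-covariant gradient appearing in (1.1) only acquires a phase factor:

```latex
\begin{align*}
\Big(\frac{i}{\kappa}\nabla + A + \nabla\eta\Big)\big(e^{i\kappa\eta}\Psi\big)
  &= e^{i\kappa\eta}\Big(\frac{i}{\kappa}\nabla\Psi - (\nabla\eta)\,\Psi\Big)
     + e^{i\kappa\eta}\,(A + \nabla\eta)\,\Psi \\
  &= e^{i\kappa\eta}\Big(\frac{i}{\kappa}\nabla + A\Big)\Psi .
\end{align*}
```

Taking absolute values removes the phase, $|\Psi|$ is unchanged, and $h = \nabla\times A$ is unchanged because $\nabla\times\nabla\eta = 0$, so each term of the integrand in (1.1) is invariant.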
It is known both from experiments [15] and rigorous analysis [10] that for a sufficiently strong magnetic field the normal state ($\psi \equiv 0$, $h = h_{ex}$) would prevail.
If the field is then decreased, there is a critical field, depending on the sample’s geometry, where the material would enter the superconducting state. For samples with boundaries, this field is known as the onset field and has been termed $H_{C_3}$. The simplest case in which the bifurcation from the normal state to the superconducting one was calculated is the case of a half-plane [18]. The analysis in this case is one-dimensional: the linearized Ginzburg–Landau equations were solved on $\mathbb{R}_+$. Even in this simple case the onset field is substantially larger than the bifurcation field on $\mathbb{R}$ [9]. The situation is not different in two dimensions: it was proved in [14] and [7] that the bifurcating mode in $\mathbb{R}^2_+$ is one-dimensional and that the value of $H_{C_3}$ is exactly the same as in the one-dimensional case. Similarly, the bifurcation from the normal state in $\mathbb{R}^2$ takes place when the applied magnetic field is identical with the bifurcation field for $\mathbb{R}$, which has been termed $H_{C_2}$. In addition to the difference in the values of the applied field, it was found by Saint-James and de Gennes [18] that superconductivity is concentrated at the onset near the boundary for a half-plane, i.e. $\psi$ decays exponentially fast away from the boundary. This phenomenon, which appears only in the presence of boundaries, has therefore been termed surface superconductivity. It was later proved for general two-dimensional domains with smooth boundaries [14, 7] that as the domain’s scale tends to infinity the onset field tends to de Gennes’ value, and that if the boundaries include wedges the onset field will be larger than de Gennes’ value [4, 13, 19, 12]. Surface superconductivity reflects another difference between the problems in $\mathbb{R}^2_+$ and $\mathbb{R}^2$, where the bifurcation takes place in the form of periodic solutions [1, 5, 2] known as Abrikosov’s lattices.
The transition, as the applied magnetic field decreases, from surface superconductivity to the experimentally-observed [8] Abrikosov’s lattices is not yet well understood. Rubinstein [17] conjectured that superconductivity remains limited to a neighborhood of the boundary until about $H_{C_2}$, when a new solution which is similar in the bulk to an Abrikosov lattice appears. Two recent contributions [16, 3] study the behavior of the global minimizer of the energy functional (1.1) for external fields satisfying $\kappa = H_{C_2} < h_{ex} < H_{C_3}$. In [16] the limit $\kappa \to \infty$ is considered: it is demonstrated that $\psi$ decays, in the $L^2$ sense, exponentially fast away from the boundary. The results are valid whenever $h_{ex} - \kappa \gg 1$ as $\kappa \to \infty$, and are stated for the global minimizer of (1.1). In addition the energy of the global minimizer is shown to be evenly distributed along the boundary. In [3] the large domain limit is considered: it is demonstrated for the global minimizer that both $\psi$ and $h$ tend, in the $C^\alpha$ sense, to the normal state, exponentially fast away from the boundary. The results are valid whenever $h_{ex} - \kappa \sim O(1)$ as the domain’s size tends to infinity.

In the present contribution we focus on the limit $\kappa \to \infty$. We prove that for any critical point of (1.1), $(\psi, A)$ tends to the normal state exponentially fast away from the boundary as long as $h_{ex} - \kappa \gg \log\kappa/\kappa$, which extends the validity of the results in [16]. Furthermore, we show that the magnetic field tends to a constant not only away from the boundaries but also near the boundary for this limit case.
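The Euler–Lagrange equations associated with the energy functional defined in (1.1), or the steady state Ginzburg–Landau equations, are given by
$$\Big( \frac{i}{\kappa}\nabla + A \Big)^2 \psi = \psi(1 - |\psi|^2) , \tag{1.3a}$$
$$-\nabla \times \nabla \times A = \frac{i}{2\kappa}(\psi^*\nabla\psi - \psi\nabla\psi^*) + |\psi|^2 A , \tag{1.3b}$$
and the natural boundary conditions by
$$\Big( \frac{i}{\kappa}\nabla + A \Big)\psi \cdot \hat n = 0 ; \qquad h = h_{ex} . \tag{1.4a,b}$$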
We consider two-dimensional settings where we can write $h = (0, 0, h(x, y))$ and $h_{ex} = (0, 0, h_{ex})$. In the next section we consider the global minimizer of (1.1) in smooth bounded domains as $\kappa \to \infty$. We show that for sufficiently large $\kappa$, the global minimizer of (1.1), which must solve (1.3) together with (1.4), tends exponentially fast away from the boundaries to a normal state as long as $h_{ex} - \kappa \gg \log\kappa/\kappa$. Furthermore, we show that
$$\|h - h_{ex}\|_{L^\infty[\Omega]} \le C \Big[ \frac{\log\kappa}{\kappa} + \frac{1}{\sqrt{\kappa(h_{ex} - \kappa)}} \Big] . \tag{1.5}$$
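To prove the above results we use a differential inequality which was proved in [3]. Let
$$u = h - \kappa + \frac{1}{2\kappa}\rho^2 . \tag{1.6}$$
Then
$$\nabla^2 u - \rho^2 u = \kappa |\hat J|^2 + \Big( \kappa - \frac{1}{2\kappa} \Big) \rho^4 . \tag{1.7}$$
The precise definition of $\hat J$ will not concern us. We shall be interested only in its property
$$|\hat J|^2 \rho^2 = |\nabla u|^2 , \tag{1.8}$$
which is proved in [3]. Finally, in Sec. 3 we briefly discuss a few key points which are not mentioned in Sec. 2.

2. Exponential Rate of Decay

We prove here the following theorem:

Theorem 2.1. Let $\lambda = \sqrt{\kappa(h_{ex}(\kappa) - \kappa)}$, and let $(\psi, A) = (\psi(\lambda, \kappa), A(\lambda, \kappa))$ denote a solution of (1.3) and (1.4). Then, $\exists \lambda_0 > 0$, $\kappa_0 > 0$, $\beta > 0$, and $\tilde h_\lambda$ such that for every $\kappa > \kappa_0$ and $\lambda > \lambda_0 (\log\kappa)^{1/2}$ we have
$$|D^\alpha \psi| \le C_\alpha \kappa^\alpha e^{-\beta\lambda d(x,\partial\Omega)} \quad \text{for all } \alpha \ge 0 \text{ and } x \in \Omega \tag{2.1a}$$
$$|h - \tilde h_\lambda| \le C e^{-\beta\lambda d(x,\partial\Omega)} \tag{2.1b}$$
$$|D^\alpha (h - \tilde h_\lambda)| \le C_\alpha \kappa^{\alpha-1} e^{-\beta\lambda d(x,\partial\Omega)} \quad \text{for all } \alpha \ge 1 \text{ and } x \in \Omega \tag{2.1c}$$
$$|\tilde h_\lambda - h_{ex}| \le C \frac{\log\kappa}{\kappa} . \tag{2.1d}$$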
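The hypothesis of Theorem 2.1 is phrased in terms of $\lambda$, whereas the abstract and the introduction phrase it in terms of the applied field. The short computation below is a reading aid only; it uses nothing beyond the definition of $\lambda$ in the theorem and assumes $h_{ex} > \kappa$, so that $\lambda$ is real and positive. It shows that the two formulations agree and identifies the length scale of the decay in (2.1a):

```latex
\begin{align*}
\lambda^2 = \kappa\,(h_{ex} - \kappa)
  \quad&\Longleftrightarrow\quad
  h_{ex} - \kappa = \frac{\lambda^2}{\kappa}, \\
\lambda > \lambda_0 (\log\kappa)^{1/2}
  \quad&\Longleftrightarrow\quad
  h_{ex} - \kappa > \lambda_0^2\,\frac{\log\kappa}{\kappa}, \\
e^{-\beta\lambda\, d(x,\partial\Omega)} \le e^{-1}
  \quad&\Longleftrightarrow\quad
  d(x,\partial\Omega) \ge \frac{1}{\beta\lambda}.
\end{align*}
```

In this reading, the estimates describe a boundary layer whose thickness is of order $1/\lambda$, which is consistent with the $O(1/\lambda)$ layer mentioned after Lemma 2.4.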
To prove the theorem we first need a number of auxiliary results. The first of them includes the following well-known estimates:

Lemma 2.2. Let $h_{ex} \ge \kappa$. Then, any solution of (1.3) and (1.4) must satisfy
$$\|\rho\|_{L^\infty(\Omega)} < 1 \tag{2.2a}$$
$$\|h - h_{ex}\|_{C^1(\bar\Omega)} \le C \tag{2.2b}$$
$$\Big\| \Big( \frac{i}{\kappa}\nabla + A \Big)\psi \Big\|_{L^\infty(\bar\Omega)} \le C . \tag{2.2c}$$

Proof. The proof of (2.2a) is well known and follows immediately from (1.3a) and the real part of the boundary condition (1.4a). The proof of (2.2b) and (2.2c) can be found in [11].

Lemma 2.3. Let $h_{ex} \ge \kappa$. Then, any solution of (1.3) and (1.4) satisfies, for sufficiently large $\kappa$
$$\int_\Omega \rho^4 \le \frac{C}{\kappa} \tag{2.3a}$$
$$\int_\Omega |h - h_{ex}|^2 \le C \frac{\log^2\kappa}{\kappa^2} \tag{2.3b}$$
where $C$ is independent of $\kappa$.
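Proof. We first prove (2.3a). To this end, we integrate (1.7) over $\Omega$. In view of (2.2b) we have,
$$\kappa \int_\Omega \Big| \frac{\nabla u}{\rho} \Big|^2 + \int_\Omega \rho^2 u + \Big( \kappa - \frac{1}{2\kappa} \Big) \int_\Omega \rho^4 \le \int_{\partial\Omega} \Big| \frac{\partial u}{\partial n} \Big| \le C . \tag{2.4}$$
Hence, applying (2.2b) once again, we have, since $h_{ex} \ge \kappa$
$$\kappa \int_\Omega \rho^4 \le C + \Big| \int_\Omega \rho^2 (h - h_{ex}) \Big| \le C \Big[ 1 + \Big( \int_\Omega \rho^4 \Big)^{1/2} \Big] \tag{2.5}$$
from which (2.3a) is readily verified.

To prove (2.3b) we multiply (1.3a) by $\rho^2 \bar\psi$ and integrate over $\Omega$. We obtain,
$$\int_\Omega \rho^2 \Big| \Big( \frac{i}{\kappa}\nabla + A \Big)\Psi \Big|^2 + \frac{1}{\kappa^2} \int_\Omega \rho^2 |\nabla\rho|^2 = \int_\Omega \rho^4 (1 - \rho^2) .$$
By (1.3b) we have
$$\int_\Omega |\nabla h|^2 \le \int_\Omega \rho^2 \Big| \Big( \frac{i}{\kappa}\nabla + A \Big)\Psi \Big|^2 \le \int_\Omega \rho^4 .$$
We now apply the Poincaré inequality and (1.4b) to obtain
$$\int_\Omega |h - h_{ex}|^2 \le C \int_\Omega \rho^4 \le \frac{C}{\kappa} . \tag{2.6}$$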
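In a manner similar to [7, 3] we now define a local coordinate system near $\partial\Omega$. Let $\eta$ denote the distance from the boundary, $s$ the arclength along the boundary, with some point $x_0 \in \partial\Omega$ corresponding to $s = 0$, and $\kappa_1(s)$ the curvature of $\partial\Omega$, which must be uniformly bounded in $[-L/2, L/2]$. This local coordinate system is well defined in the rectangle
$$S = \Big\{ (s, \eta) \ \Big|\ -\frac{L}{2} < s < \frac{L}{2},\ 0 < \eta < \eta_0 \Big\} \tag{2.7}$$
where $L$ denotes the arclength of $\partial\Omega$, and $\eta_0$ satisfies
$$\inf_{s \in [-L/2, L/2]} \big( 1 - \kappa_1(s)\eta_0 \big) > 0 .$$
Denote by $\Omega'_\alpha$ the domain enclosed in $\eta = \alpha$, i.e.,
$$\Omega'_\alpha = \{ x \in \Omega \mid d(x, \partial\Omega) > \alpha \} .$$
Integrating (1.7) on $\Omega'_\alpha$ yields
$$\kappa \int_{\Omega'_\alpha} \Big| \frac{\nabla u}{\rho} \Big|^2 + \int_{\Omega'_\alpha} \rho^2 u \le \int_{\partial\Omega'_\alpha} \Big| \frac{\partial u}{\partial n} \Big| \le C \Big( \int_{\partial\Omega'_\alpha} \Big| \frac{\nabla u}{\rho} \Big|^2 \Big)^{1/2} . \tag{2.8}$$
However, by (2.6) we have
$$-\int_{\Omega'_\alpha} \rho^2 u \le -\int_{\Omega'_\alpha} \rho^2 (h - h_{ex}) \le \frac{C}{\kappa} . \tag{2.9}$$
Furthermore, for every $0 < \delta < \eta_0$, there exists $\delta/2 < \alpha < \delta$ such that
$$\int_{\partial\Omega'_\alpha} \Big| \frac{\nabla u}{\rho} \Big|^2 \le \frac{C}{\delta} \int_{\Omega'_{\delta/2}} \Big| \frac{\nabla u}{\rho} \Big|^2 .$$
Combining the above with (2.8) and (2.9) yields
$$\kappa \int_{\Omega'_\delta} \Big| \frac{\nabla u}{\rho} \Big|^2 \le \frac{C}{\delta^{1/2}} \Big( \int_{\Omega'_{\delta/2}} \Big| \frac{\nabla u}{\rho} \Big|^2 \Big)^{1/2} + \frac{C}{\kappa} \le \frac{2C}{\delta^{1/2}} \Big( \int_{\Omega'_{\delta/2}} \Big| \frac{\nabla u}{\rho} \Big|^2 \Big)^{1/2}$$
which can be applied recursively to obtain
$$\int_{\Omega'_\delta} |\nabla u|^2 \le \int_{\Omega'_\delta} \Big| \frac{\nabla u}{\rho} \Big|^2 \le \frac{C}{\kappa^2 \delta} .$$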
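Moreover, it is easy to show in view of (2.4), that
$$\int_{\Omega'_\delta} |\nabla u|^2 \le \frac{C}{\kappa^2 (\delta + 1/\kappa)} . \tag{2.10}$$
We can now use the Schwarz inequality, and the local coordinate system defined in (2.7), to obtain
$$|u(s, 0) - u(s, \delta)|^2 \le \Big( \int_0^\delta \Big| \frac{\partial u}{\partial\eta} \Big| \, d\eta \Big)^2 \le \int_0^\delta \frac{d\eta}{\eta + 1/\kappa} \int_0^\delta \Big| \frac{\partial u}{\partial\eta} \Big|^2 \Big( \eta + \frac{1}{\kappa} \Big) d\eta .$$
Integrating the above with respect to $s$ we obtain
$$\int_{-L/2}^{L/2} |u(s, 0) - u(s, \delta)|^2 \, ds \le C \log(1 + \kappa\delta) \int_{-L/2}^{L/2} ds \int_0^\delta |\nabla u|^2 \Big( \eta + \frac{1}{\kappa} \Big) d\eta .$$
By changing the order of integration it is easy to show that
$$\int_{-L/2}^{L/2} ds \int_0^\delta |\nabla u|^2 \eta \, d\eta = \int_0^\delta d\eta_1 \int_{-L/2}^{L/2} ds \int_{\eta_1}^\delta |\nabla u|^2 \, d\eta .$$
However, in view of (2.10), we have
$$\int_{-L/2}^{L/2} ds \int_{\eta_1}^\delta |\nabla u|^2 \, d\eta \le C \int_{\Omega_{\eta_1} \setminus \Omega_\delta} |\nabla u|^2 \le \frac{C}{\kappa^2 (\eta_1 + 1/\kappa)} .$$
Furthermore, since
$$\int_{-L/2}^{L/2} ds \int_0^\delta |\nabla u|^2 \, d\eta \le \frac{C}{\kappa}$$
we must have
$$\int_{-L/2}^{L/2} |u(s, 0) - u(s, \delta)|^2 \, ds \le C \frac{\log^2(1 + \kappa\delta)}{\kappa^2} . \tag{2.11}$$
Utilizing (2.11) we obtain
$$\int_{\partial\Omega_\delta} |u - (h_{ex} - \kappa)|^2 \le 2 \int_{-L/2}^{L/2} |u(s, 0) - u(s, \delta)|^2 \, ds + 2 \int_{\partial\Omega} |u - (h_{ex} - \kappa)|^2 \le C \frac{\log^2(1 + \kappa\delta)}{\kappa^2} . \tag{2.12}$$
However,
$$\int_{\Omega_\delta} |u - (h_{ex} - \kappa)|^2 \le C \int_{\partial\Omega_\delta} |u - (h_{ex} - \kappa)|^2 + C \int_{\Omega_\delta} |\nabla u|^2 ,$$
and hence, with the aid of (2.12), we obtain
$$\int_{\Omega_\delta} |u - (h_{ex} - \kappa)|^2 \le C \frac{\log^2(1 + \kappa\delta)}{\kappa^2} + \frac{C}{\kappa^2 (\delta + 1/\kappa)} .$$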
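We now write,
$$\int_\Omega |h - h_{ex}|^2 \le 2 \int_{\Omega_\delta} |u - (h_{ex} - \kappa)|^2 + C \int_0^\delta d\eta \int_{\partial\Omega_\eta} |u - (h_{ex} - \kappa)|^2 + \frac{1}{2\kappa^2} \int_\Omega \rho^4 .$$
Choosing $\delta \sim O(1)$ proves (2.3b).

Lemma 2.4. Let $h_{ex} > \kappa$ and $\lambda = \sqrt{\kappa(h_{ex} - \kappa)}$. Let $\{x_\lambda\}_{\lambda \ge \lambda_0}$ denote a family of points in $\Omega$. Let $s_\lambda = d(x_\lambda, \partial\Omega)$. Then,
$$F(x_\lambda, s) \le \frac{C_\epsilon}{(\lambda s_\lambda)^{4-\epsilon}} \quad \forall\, s \le \frac{1}{2} s_\lambda ,\ \forall\, \epsilon > 0 \tag{2.13a}$$
where $F(x, r)$ is given by
$$F(x, r) = \int_{B(0,r)} \lambda^2 w^+ |\tilde J|^2 + \rho^2 (w^+)^2 + |\nabla w^+|^2 \tag{2.13b}$$
$$\tilde J = \frac{1}{\lambda} \hat J \tag{2.13c}$$
and
$$w = \frac{u}{h_{ex} - \kappa} . \tag{2.13d}$$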
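Proof. By (1.7) $w$ satisfies
$$\nabla^2 w - \rho^2 w = \lambda^2 |\tilde J|^2 + \frac{1}{\lambda^2} \Big( \kappa^2 - \frac{1}{2} \Big) \rho^4 . \tag{2.14}$$
Integrating over $B(x_\lambda, r)$ the product of (2.14) by $w^+$ we obtain
$$\int_{\partial B(x_\lambda, r)} w^+ \frac{\partial w^+}{\partial r} \ge F(x_\lambda, r) . \tag{2.15}$$
Multiplying (2.15) by $1/r$ and integrating between $s$ and $s_\lambda$ yields, in view of (2.2),
$$\int_s^{s_\lambda} \frac{1}{r} F(x_\lambda, r) \, dr \le \frac{1}{2} \int_0^{2\pi} \Big[ \big( w^+(s_\lambda, \theta) \big)^2 - \big( w^+(s, \theta) \big)^2 \Big] d\theta \le C . \tag{2.16}$$
In the following we use $C$ to denote a constant which is independent of both $\lambda$ and $x_\lambda$. As $F$ is monotonically increasing in $r$,
$$\exists\ \frac{1}{2} < \beta_0 < 1 : \ F(x_\lambda, \beta_0 s_\lambda) < C . \tag{2.17}$$
It is easy to show that $1/2 < \beta < \beta_0$ exists such that
$$\int_{\partial B_\beta} \lambda^2 w^+ |\tilde J|^2 + \rho^2 (w^+)^2 \le \frac{C}{s_\lambda} \int_{B_{\beta_0}} \lambda^2 w^+ |\tilde J|^2 + \rho^2 (w^+)^2 \tag{2.18}$$
where $B_\beta \overset{\mathrm{def}}{=} B(x, \beta s)$. Let $\xi_1, \xi_2 \in \partial B_\beta$. Then,
$$\big| (w^+)^{5/2}(\xi_1) - (w^+)^{5/2}(\xi_2) \big| \le C \int_{\partial B_\beta} (w^+)^{3/2} |\nabla w| .$$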
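By (1.8) $|\nabla w| = \rho |\tilde J|$. Hence,
$$\big| (w^+)^{5/2}(\xi_1) - (w^+)^{5/2}(\xi_2) \big| \le C \Big( \int_{\partial B_\beta} w^+ |\tilde J|^2 \Big)^{1/2} \Big( \int_{\partial B_\beta} \rho^2 (w^+)^2 \Big)^{1/2} \le \frac{C}{\lambda} \int_{\partial B_\beta} \lambda^2 w^+ |\tilde J|^2 + \rho^2 (w^+)^2 \le \frac{C}{\lambda s_\lambda} F(x_\lambda, \beta_0 s_\lambda) . \tag{2.19}$$
Let $0 < s < \beta s_\lambda$, and let $(r, \theta)$ denote a polar coordinate system centered around $x$. Then,
$$\int_0^{2\pi} \int_s^{\beta s_\lambda} (w^+)^{3/2} \Big| \frac{\partial w}{\partial r} \Big| \, dr \, d\theta \le C \Big( \int_A w^+ |\tilde J|^2 \Big)^{1/2} \Big( \int_A \frac{1}{r^2} \rho^2 (w^+)^2 \Big)^{1/2} \tag{2.20}$$
where $A \overset{\mathrm{def}}{=} B_\beta \setminus B(x, s)$. Hence,
$$\int_0^{2\pi} (w^+)^{5/2} \Big|_s^{\beta s_\lambda} d\theta \le \frac{C}{\lambda s} F(x_\lambda, \beta_0 s_\lambda) .$$
Utilizing (2.19) together with the inequality
$$|x^5 - y^5| \ge |x^4 - y^4|^{5/4} \tag{2.21}$$
and the Hölder inequality we obtain
$$\int_0^{2\pi} (w^+)^{5/2} \Big|_s^{\beta s_\lambda} d\theta + \frac{2\pi C}{\lambda s_\lambda} F(x_\lambda, \beta_0 s_\lambda) = \int_0^{2\pi} \Big\{ \Big[ \Big( (w^+)^{5/2}(\beta s_\lambda) + \frac{C}{\lambda s_\lambda} F(x_\lambda, \beta_0 s_\lambda) \Big)^{1/5} \Big]^5 - (w^+)^{5/2}(s) \Big\} d\theta$$
$$\ge \int_0^{2\pi} \Big\{ \Big[ (w^+)^{5/2}(\beta s_\lambda) + \frac{C}{\lambda s_\lambda} F(x_\lambda, \beta_0 s_\lambda) \Big]^{4/5} - (w^+)^2(s) \Big\}^{5/4} d\theta$$
$$\ge C \Big\{ \int_0^{2\pi} \Big( \Big[ (w^+)^{5/2}(\beta s_\lambda) + \frac{C}{\lambda s_\lambda} F(x_\lambda, \beta_0 s_\lambda) \Big]^{4/5} - (w^+)^2(s) \Big) d\theta \Big\}^{5/4} .$$
In view of (2.16)
$$\int_0^{2\pi} (w^+)^2 \Big|_s^{\beta s_\lambda} d\theta \ge 0 .$$
Consequently,
$$\Big\{ \int_0^{2\pi} \Big( \Big[ (w^+)^{5/2}(\beta s_\lambda) + \frac{C}{\lambda s_\lambda} F(x_\lambda, \beta_0 s_\lambda) \Big]^{4/5} - (w^+)^2(s) \Big) d\theta \Big\}^{5/4} \ge \Big\{ \int_0^{2\pi} (w^+)^2 \Big|_s^{\beta s_\lambda} d\theta \Big\}^{5/4} .$$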
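Combining the above inequalities yields
$$\int_0^{2\pi} (w^+)^2 \Big|_s^{\beta s_\lambda} d\theta \le C \Big[ \frac{F(x_\lambda, \beta_0 s_\lambda)}{\lambda s} \Big]^{4/5}$$
and by (2.16) we have
$$\int_s^{\beta s_\lambda} \frac{F(x_\lambda, r)}{r} \, dr \le C \Big[ \frac{F(x_\lambda, \beta_0 s_\lambda)}{\lambda s} \Big]^{4/5} . \tag{2.22}$$
Thus, since $F$ is monotone increasing
$$\exists\ \frac{1}{2} < \beta_1 < \beta_0 \ \text{ s.t. } \ F(x_\lambda, \beta_1 s_\lambda) \le C_1 \Big[ \frac{F(x_\lambda, \beta_0 s_\lambda)}{\lambda s_\lambda} \Big]^{4/5} .$$
It is possible to repeat the above procedure recursively (cf. [3]) to prove the existence of a monotone decreasing sequence $\{\beta_n\}_{n=1}^\infty$, which is strictly bounded from below by $1/2$, such that
$$F(x_\lambda, \beta_n s_\lambda) \le C_n \Big[ \frac{F(x_\lambda, \beta_{n-1} s_\lambda)}{\lambda s_\lambda} \Big]^{4/5} \quad \forall n \ge 1 . \tag{2.23}$$
Utilizing the above inequality together with (2.17) proves the lemma.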
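The exponent $4 - \epsilon$ in (2.13a) can be traced to the recursion (2.23). The following bookkeeping is our own sketch and is not taken from the original proof. It assumes $\lambda s_\lambda \ge 1$ and that each $C_n$ is finite, and it introduces the symbol $a_n$ (a label used only here) for the power of $1/(\lambda s_\lambda)$ gained after $n$ steps, starting from the bound (2.17):

```latex
% a_n : hypothetical label for the decay exponent after n applications of (2.23),
%       meaning F(x_\lambda, \beta_n s_\lambda) \le C_n (\lambda s_\lambda)^{-a_n}.
\begin{align*}
a_0 &= 0 && \text{(the bound (2.17))}, \\
a_n &= \tfrac{4}{5}\,\big(a_{n-1} + 1\big) && \text{(one application of (2.23))}, \\
a_n &= \sum_{k=1}^{n} \Big(\tfrac{4}{5}\Big)^{k}
     = 4\Big[1 - \Big(\tfrac{4}{5}\Big)^{n}\Big]
     \;\longrightarrow\; 4 \quad (n \to \infty).
\end{align*}
```

For a given $\epsilon > 0$ one may therefore pick $n$ with $4(4/5)^n \le \epsilon$. Since $\beta_n > 1/2$ and $F$ is monotone in $r$, the resulting bound then holds for every $s \le s_\lambda/2$, which is the form stated in (2.13a).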
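Lemma 2.4 allows us to obtain uniform convergence in $\Omega$ of $w^+$ to a constant, except for a boundary layer of $O(1/\lambda)$ size (as $\lambda \to \infty$).

Lemma 2.5. For any family of points $\{x_\lambda\}_{\lambda > \lambda_0}$
$$\exists \tilde w_\lambda : \ |w^+(x_\lambda) - \tilde w_\lambda| \le \frac{C}{\lambda^{1/2} d(x_\lambda, \partial\Omega)^{1/2}} .$$
The lemma can be proved by applying the same arguments as in the proof of Lemma 3.4 in [3]. We now find the value of the constant $\tilde w_\lambda$ by using the estimates in Lemma 2.3.

Lemma 2.6. Let $h_{ex} > \kappa$. Then,
$$|\tilde w_\lambda - 1| \le C \Big[ \frac{\log\kappa}{\lambda^2} + \frac{1}{\lambda^{1/2}} \Big] . \tag{2.24}$$

Proof. Let $x \in \Omega$ such that $\partial B(x, r) \subset \mathrm{int}(\Omega)$, where $r$ is independent of $\lambda$. By Lemma 2.3 we have
$$\|1 - w\|_{L^2[B(x,r)]} \le \|1 - w\|_{L^2[\Omega]} \le \frac{2}{h_{ex} - \kappa} \|h_{ex} - h\|_{L^2[\Omega]} + \frac{1}{\kappa(h_{ex} - \kappa)} \|\rho^4\|_{L^2[\Omega]} \le C \frac{\log\kappa}{\lambda^2} .$$
However, by the previous lemma $|w^+ - \tilde w_\lambda| \le C/\lambda^{1/2}$ in $B(x, r)$, and hence, since
$$\|\tilde w_\lambda - 1\|_{L^2[B(x,r)]} \le \|1 - w^+\|_{L^2[B(x,r)]} + \|\tilde w_\lambda - w^+\|_{L^2[B(x,r)]} \le C \Big[ \frac{\log\kappa}{\lambda^2} + \frac{1}{\lambda^{1/2}} \Big] ,$$
the lemma immediately follows.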
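To see why the limiting constant in Lemma 2.6 should be close to 1, it may help to evaluate $u$ and $w$ on the normal state. The computation below is a sketch that uses only the definitions (1.6) and (2.13d); it assumes, as the notation in (1.6)–(1.8) suggests, that $\rho$ denotes $|\psi|$:

```latex
\begin{align*}
\text{normal state } (\rho = 0,\ h = h_{ex}): \qquad
  u &= h_{ex} - \kappa + \frac{1}{2\kappa}\cdot 0 = h_{ex} - \kappa,
  \qquad w = \frac{u}{h_{ex} - \kappa} = 1, \\
\text{general solution}: \qquad
  1 - w &= \frac{(h_{ex} - \kappa) - \big(h - \kappa + \tfrac{1}{2\kappa}\rho^2\big)}{h_{ex} - \kappa}
         = \frac{(h_{ex} - h) - \tfrac{1}{2\kappa}\rho^2}{h_{ex} - \kappa}.
\end{align*}
```

The second line indicates that the size of $1 - w$ is governed by $h_{ex} - h$ and $\rho^2$, which are the two quantities estimated in Lemma 2.3.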
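We can now obtain better estimates for the rate of decay of $|w^+ - \tilde w_\lambda|$ away from the boundaries as $\lambda \to \infty$.

Lemma 2.7. Let $h_{ex} > \kappa$ and $\{x_\lambda\}_{\lambda \ge \lambda_0}$ denote a family of points such that $x_\lambda \in \Omega$. Let $\lambda s_\lambda = \lambda d(x_\lambda, \partial\Omega) \to \infty$ as $\lambda \to \infty$. Then,
$$\forall n \in \mathbb{N} \quad \exists\ \frac{1}{2} < \beta_n < 1,\ C_n > 0 : \ F(x_\lambda, \beta_n s_\lambda) \le \frac{C_n}{\lambda^n s_\lambda^n} \tag{2.25a}$$
where $F$ is defined in (2.13),
$$\exists \tilde w_\lambda : \ |w^+(x_\lambda) - \tilde w_\lambda| \le \frac{C_n}{\lambda^n s_\lambda^n} \tag{2.25b}$$
$$|\tilde w_\lambda - 1| \le C \frac{\log\kappa}{\lambda^2} . \tag{2.25c}$$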
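Proof. The proof of (2.25a) and (2.25b) is obtained by following the same line of arguments as in the proof of Lemma 3.6 in [3]. To prove (2.25c) we use (2.25b) and apply arguments of the proof of Lemma 2.6 once again.

Lemma 2.8. Let $h_{ex} > \kappa$ and $\{x_\lambda\}_{\lambda \ge \lambda_0}$ denote a family of points such that $x_\lambda \in \Omega$. Let $s_\lambda = d(x_\lambda, \partial\Omega) \to \infty$ as $\lambda \to \infty$. Then,
$$\forall n \in \mathbb{N} \quad \exists C_n > 0 : \ \int_{B(x_\lambda, s_\lambda/2)} \rho^2 \le \frac{C_n}{\lambda^n s_\lambda^n} . \tag{2.26}$$

Proof. By (2.25a)
$$\exists\ \frac{1}{2} < \beta_n < 1 : \ \int_{B(x_\lambda, \beta_n s_\lambda)} \rho^2 (w^+)^2 \le \frac{C_n}{\lambda^n s_\lambda^n} .$$
Writing
$$\|\rho\|_{L^2[B(x_\lambda, \beta_n s_\lambda)]} \le \|\rho w^+\|_{L^2[B(x_\lambda, \beta_n s_\lambda)]} + \|\rho (w^+ - \tilde w_\lambda)\|_{L^2[B(x_\lambda, \beta_n s_\lambda)]} + \|\rho (1 - \tilde w_\lambda)\|_{L^2[B(x_\lambda, \beta_n s_\lambda)]} ,$$
we obtain, in view of (2.24) and (2.25b),
$$\|\rho\|_{L^2[B(x_\lambda, \beta_n s_\lambda)]} \le \frac{C_n}{(\lambda s_\lambda)^{n/2}} . \tag{2.27}$$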
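Proof of Theorem 2.1. We prove the theorem by invoking blow up arguments. We first prove that $\exists \lambda_0$ and $\beta > 0$ such that
$$\|\psi\|_{L^2[B(x,\delta)]} \le C \delta e^{-\beta\lambda d(x,\partial\Omega)} \quad \forall\, \lambda > \lambda_0 \log^{1/2}\kappa ,\ 0 < \delta < \frac{1}{\lambda} ,\ \forall\, x \in \Omega : d(x, \partial\Omega) \ge \frac{1}{\lambda} . \tag{2.28}$$
Let
$$\Omega(\lambda, k, s) = \Big\{ x \in \Omega \ \Big|\ d(x, \partial\Omega) \ge k \frac{s}{\lambda} \Big\} .$$
We prove (2.28) by showing that
$$\exists \lambda_0, s_0 : \ \sup_{x \in \Omega(\lambda, k+1, s)} \|\psi(\kappa, \lambda)\|_{L^2[B(x,\delta)]} \le \frac{1}{2} \sup_{x \in \Omega(\lambda, k, s)} \|\psi(\kappa, \lambda)\|_{L^2[B(x,\delta)]} \quad \forall\, s > s_0 ,\ \lambda > \lambda_0 \log^{1/2}\kappa ,\ k \in \mathbb{N} ,\ 0 < \delta < \frac{1}{\lambda} . \tag{2.29}$$
Suppose, for a contradiction, that (2.29) does not hold. Then, sequences $\{\lambda_j\}_{j=1}^\infty$, $\{\kappa_j\}_{j=1}^\infty$, $\{s_j\}_{j=1}^\infty$, $\{k_j\}_{j=1}^\infty$ and $\{\delta_j\}_{j=1}^\infty$ exist such that $\kappa_j \uparrow \infty$, $\lambda_j^2/\log\kappa_j \uparrow \infty$, $s_j \uparrow \infty$, $k_j \in \mathbb{N}$, $0 < \delta_j < 1/\lambda_j$, and
$$\sup_{x \in \Omega(\lambda_j, k_j+1, s_j)} \|\psi(\kappa_j, \lambda_j)\|_{L^2[B(x,\delta_j)]} \ge \frac{1}{2} \sup_{x \in \Omega(\lambda_j, k_j, s_j)} \|\psi(\kappa_j, \lambda_j)\|_{L^2[B(x,\delta_j)]} \overset{\mathrm{def}}{=} \frac{1}{2} m_j . \tag{2.30}$$
Let
$$\tilde\psi_j \overset{\mathrm{def}}{=} \frac{\psi(\kappa_j, \lambda_j)}{m_j} .$$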
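By (2.29) there exists $x_j \in \Omega(\lambda_j, k_j + 1, s_j)$ such that $\|\tilde\psi_j\|_{L^2[B(x_j,\delta_j)]} \ge \frac{1}{2}$. Furthermore, since $B(x_j, \delta_j) \subset \Omega(\lambda_j, k_j, s_j)$ we have
$$\frac{1}{2} \le \|\tilde\psi_j\|_{L^2[B(x_j,\delta_j)]} \le 1 . \tag{2.31}$$
Define
$$f_j = \tilde\psi_j \Big( x_j + \frac{x}{\lambda_j} \Big) e^{i A_j(x_j) \cdot x/\lambda_j} ,$$
where $A_j = A(\kappa_j, \lambda_j)$. In view of (2.31) we have
$$\frac{1}{2} \le \Big\| \frac{f_j}{\lambda_j} \Big\|_{L^2[B(0, \lambda_j\delta_j)]} \le 1 . \tag{2.32}$$
Let $\tilde w_\lambda$ be the same as in (2.25c) and let
$$\tilde h_\lambda = \tilde w_\lambda (h_{ex} - \kappa) + \kappa .$$
Clearly,
$$\|h - \tilde h_\lambda\|_{L^\infty[\Omega(\lambda,k,s)]} \le \frac{C_n \lambda^2}{\kappa s^n} . \tag{2.33}$$
It is easy to show that
$$\Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} B_j \Big)^2 f_j = f_j \big( 1 - m_j^2 |f_j|^2 \big) \qquad x \in B(0, s_j) \tag{2.34a}$$
wherein
$$B_j(x) = \frac{1}{\tilde h_j} \big[ A_j(x_j + x) - A_j(x_j) \big] , \tag{2.34b}$$
and $\tilde h_j = \tilde h_{\lambda_j}$. We now define a cut-off function
$$\eta_r = \begin{cases} 1 & \text{in } B(0, r) \\ 0 & \text{in } \mathbb{R}^2 \setminus B(0, 2r) \end{cases} \qquad |\nabla\eta_r| \le \frac{C}{r} \ \text{ in } \mathbb{R}^2 .$$
Multiplying (2.34a) by $\eta_r^2 \bar f_j$, and integrating over $B(0, 2r)$ we obtain, for all $r \le s_j/2$ (cf. [14]), that
$$\int_{B(0,2r)} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} B_j \Big) \eta_r f_j \Big|^2 = \int_{B(0,2r)} \eta_r^2 |f_j|^2 \big( 1 - m_j^2 |f_j|^2 \big) + \frac{\lambda_j^2}{\kappa_j^2} \int_{B(0,2r)} |\nabla\eta_r|^2 |f_j|^2 . \tag{2.35}$$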
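Let $\hat A : \mathbb{R}^2 \to \mathbb{R}^2$ denote any vector field satisfying $\nabla \times \hat A = \hat i_z$ and $(\hat A - B) \cdot \hat n |_{\partial B(0, s_j)} = 0$. Then,
$$\int_{B(0,2r)} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} B_j \Big) \eta_r f_j \Big|^2 = \int_{B(0,2r)} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} \hat A \Big) \eta_r f_j \Big|^2 + \int_{B(0,2r)} \frac{\tilde h_j}{\kappa_j} (B_j - \hat A) \eta_r^2 \cdot \Big[ i \big( \bar f_j \nabla f_j - f_j \nabla \bar f_j \big) + 2 |f_j|^2 \frac{\tilde h_j \kappa_j}{\lambda_j^2} B_j \Big] - \Big( \frac{\tilde h_j}{\lambda_j} \Big)^2 \int_{B(0,2r)} |B_j - \hat A|^2 \eta_r^2 |f_j|^2 .$$
Clearly,
$$\eta_r^2 \Big\{ \frac{\lambda_j}{\kappa_j} i \big( \bar f_j \nabla f_j - f_j \nabla \bar f_j \big) + 2 \frac{\tilde h_j}{\lambda_j} |f_j|^2 B_j \Big\} = 2 \Re \Big\{ \eta_r \bar f_j \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} B_j \Big) \eta_r f_j \Big\} ,$$
and hence,
$$\int_{B(0,2r)} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} B_j \Big) \eta_r f_j \Big|^2 \ge \int_{B(0,2r)} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} \hat A \Big) \eta_r f_j \Big|^2 - 2 \frac{\tilde h_j}{\lambda_j} M_j \Big( \int_{B(0,2r)} \eta_r^2 |f_j|^2 \Big)^{1/2} \Big( \int_{B(0,2r)} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} B_j \Big) \eta_r f_j \Big|^2 \Big)^{1/2} - \Big( \frac{\tilde h_j}{\lambda_j} \Big)^2 M_j^2 \int_{B(0,2r)} \eta_r^2 |f_j|^2 \tag{2.36a}$$
where
$$M_j = \sup_{x \in B(0, s_j)} |B_j - \hat A| . \tag{2.36b}$$
In [14, 7] it was shown that
$$\int_{\mathbb{R}^2} \Big| \Big( \frac{i\lambda_j}{\kappa_j} \nabla + \frac{\tilde h_j}{\lambda_j} \hat A \Big) \eta_r f_j \Big|^2 \ge \frac{\tilde h_j}{\kappa_j} \int_{\mathbb{R}^2} \eta_r^2 |f_j|^2 . \tag{2.37}$$
Combining the above with (2.35) and (2.36a) we obtain
$$\Big( \frac{\tilde h_j}{\kappa_j} - 1 \Big) \int_{B(0,2r)} \eta_r^2 |f_j|^2 \le \frac{\lambda_j^2}{\kappa_j^2} \int_{B(0,2r)} |\nabla\eta_r|^2 |f_j|^2 + M_j \frac{\tilde h_j}{\lambda_j} \Big[ \int_{B(0,2r)} 2 \eta_r^2 |f_j|^2 + \frac{\lambda_j^2}{\kappa_j^2} \int_{B(0,2r)} |\nabla\eta_r|^2 |f_j|^2 \Big] + \Big( \frac{\tilde h_j}{\lambda_j} \Big)^2 M_j^2 \int_{B(0,2r)} \eta_r^2 |f_j|^2 . \tag{2.38}$$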
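By (2.33) we have
$$M_j \le \frac{C_n \lambda_j^2}{\kappa_j^2 s_j^n} \quad \forall n \in \mathbb{N} .$$
By (2.25c) we have, for sufficiently large $j$ (recall that $\lambda_j^2 \gg \log\kappa_j$),
$$|\tilde h_j - h_{ex}(\kappa_j)| \le \frac{1}{2} \big[ h_{ex}(\kappa_j) - \kappa_j \big] = \frac{1}{2} \frac{\lambda_j^2}{\kappa_j}$$
and hence by (2.38) we have
$$\int_{B(0,r)} |f_j|^2 \le C \int_{B(0,2r)} |\nabla\eta_r|^2 |f_j|^2 . \tag{2.39}$$
Choosing $r = \lambda_j \delta_j$ we obtain, by applying (2.39) $n$ consecutive times
$$\int_{B(0, \lambda_j\delta_j)} |f_j|^2 \le \frac{C^n}{2^{n(n+1)} (\lambda_j \delta_j)^{2n}} \int_{B(0, 2^n \lambda_j\delta_j)} |f_j|^2 . \tag{2.40}$$
By (2.32) we have, however,
$$\int_{B(0, 2^n \lambda_j\delta_j)} \Big| \frac{f_j}{\lambda_j} \Big|^2 \le C 2^{2n}$$
where $C$ is independent of $n$ and $j$. Consequently,
$$\int_{B(0, \lambda_j\delta_j)} \Big| \frac{f_j}{\lambda_j} \Big|^2 \le \Big[ \frac{C}{(\lambda_j \delta_j)^2\, 2^{n-1}} \Big]^n$$
which is true for all $n \in \mathbb{N}$ such that $2^n \lambda_j \delta_j \le s_j$. Substituting in the above an integer $n_j$ satisfying
$$\frac{s_j}{2} < 2^{n_j} \lambda_j \delta_j \le s_j$$
we easily obtain
$$\lim_{j \to \infty} \int_{B(0, \lambda_j\delta_j)} \Big| \frac{f_j}{\lambda_j} \Big|^2 = 0 \tag{2.41}$$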
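contradicting (2.32), and therefore proving (2.28).

In order to obtain exponential decay in $C^\alpha$ norm we first write the equation for $\phi(z) = \psi(x_0 + z/\kappa) e^{-i A(x_0) \cdot z}$ which is given by
$$\nabla^2 \phi = 2 \tilde A \cdot \big( i\nabla + \tilde A \big) \phi - \phi \big( 1 - |\phi|^2 + |\tilde A|^2 \big) \tag{2.42}$$
where
$$\tilde A(z) = A(x_0 + z/\kappa) - A(x_0) .$$
It is possible to show, using the identity (2.35) and the boundedness of $\tilde A(z)$ in $B(0, 2)$, that
$$\int_{B(0,2)} \Big| 2 \tilde A \cdot \big( i\nabla + \tilde A \big) \phi - \phi \big( 1 - |\phi|^2 + |\tilde A|^2 \big) \Big|^2 dz \le C \int_{B(0,2)} |\phi|^2 \, dz .$$
Using standard elliptic estimates we then have
$$\|\phi\|_{H^2[B(0,1)]} \le C \|\phi\|_{L^2[B(0,2)]} \tag{2.43}$$
where $C$ is independent of $\kappa$, $\lambda$, and $x_0$. Choosing $\delta = 2/\kappa$ in (2.28) we obtain
$$\|\phi\|_{L^2[B(0,2)]} \le C e^{-\beta\lambda d(x_0, \partial\Omega)} .$$
Sobolev embedding then implies
$$\|\phi\|_{L^\infty[B(0,1)]} \le C e^{-\beta\lambda d(x_0, \partial\Omega)} , \tag{2.44}$$
which proves (2.1a) for $\alpha = 0$.

We now write the equation for $\tilde A(z)$,
$$\nabla \times H = \frac{1}{\kappa^2} \Re \big\{ \bar\phi \big( i\nabla + \tilde A \big) \phi \big\}$$
where $H(z) = \nabla \times \tilde A(z)$. By (2.2c) we then have
$$\|\nabla H\|_{L^\infty[B(0,1)]} \le \frac{C}{\kappa^2} e^{-\beta\lambda d(x_0, \partial\Omega)} ,$$
and hence,
$$\|\nabla h\|_{L^\infty[B(x_0, 1/\kappa)]} \le C e^{-\beta\lambda d(x_0, \partial\Omega)} . \tag{2.45}$$
We can now integrate (2.45) to obtain
$$\exists \tilde h_\lambda : \ |h(x_0) - \tilde h_\lambda| \le C e^{-\beta\lambda d(x_0, \partial\Omega)} .$$
To prove (2.1c), and (2.1a) for $\alpha \ge 1$, we use (2.42) and (2.44) together with bootstrapping and Sobolev embedding. To prove (2.1d) we use (2.25c), which gives
$$|h_{ex} - \tilde h_\lambda| \le C \frac{\log\kappa}{\kappa} .$$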
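The last step can be spelled out in one line. The computation below is a sketch that combines the definition $\tilde h_\lambda = \tilde w_\lambda (h_{ex} - \kappa) + \kappa$ used in the proof with (2.25c) and with $\lambda^2 = \kappa (h_{ex} - \kappa)$ from Theorem 2.1; no further assumptions are made:

```latex
\begin{align*}
\tilde h_\lambda - h_{ex}
  &= \tilde w_\lambda (h_{ex} - \kappa) + \kappa - h_{ex}
   = (\tilde w_\lambda - 1)(h_{ex} - \kappa), \\
|\tilde h_\lambda - h_{ex}|
  &\le C\,\frac{\log\kappa}{\lambda^2}\,(h_{ex} - \kappa)
   = C\,\frac{\log\kappa}{\lambda^2}\cdot\frac{\lambda^2}{\kappa}
   = C\,\frac{\log\kappa}{\kappa}.
\end{align*}
```

The factor $\lambda^2$ cancels, which is why the bound (2.1d) does not depend on how far $h_{ex}$ lies above $\kappa$.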
3. Conclusion

In [16] Pan obtains that in the limit $\kappa \to \infty$
$$\int_\Omega \Big\{ |\psi|^2 + \Big| \frac{1}{\kappa} \nabla\psi - iA\psi \Big|^2 \Big\} e^{\beta \sqrt{\kappa(h_{ex} - \kappa)}\, d(x,\partial\Omega)} \, dx \le \frac{C}{\sqrt{\kappa(h_{ex} - \kappa)}} \tag{3.1}$$
whenever $h_{ex} - \kappa \gg 1$, for some $\beta > 0$ which is independent of $\kappa$. In the present contribution we extend the validity of the above result to external fields satisfying
$$h_{ex} - \kappa \gg \frac{\log\kappa}{\kappa} .$$
We also obtain, in Theorem 2.1, convergence in $C^\alpha$ norms in contrast to the above $L^2$ convergence which is proved in [16]. It should be mentioned, however, that once $L^2$ convergence is obtained, it is possible to prove (2.43) and (2.45) and then proceed using bootstrapping and Sobolev embedding. The main advantage of the results in this work is therefore the greater range of external fields for which exponential rate of decay is guaranteed. This is facilitated by better a priori estimates of the magnetic field: in [16] it is first proved that $|h - h_{ex}| \le C$ in $\Omega$, whereas here, (2.25) provides a much better estimate on the magnetic field.

In addition to (3.1) it is demonstrated in [16] that for $h_{ex} - \kappa \gg 1$ the energy of the global minimizer is evenly distributed along the boundary (for a more precise definition the reader is referred to [16]). In view of the better estimate of $h$ in the present contribution it appears reasonable to believe that the validity of this result can be extended to external fields satisfying $h_{ex} - \kappa \gg \log\kappa/\kappa$. However, since the analysis in [16] is heavily based on the assumption $h_{ex} - \kappa \gg 1$, significant modification is necessary before it can be applied to a greater range of applied magnetic fields.
Acknowledgments

This research was supported by the fund for the promotion of research at the Technion. The author thanks one of the referees for spotting a technical error in the first version.

References

[1] A. A. Abrikosov, On the magnetic properties of superconductors of the second group, Soviet Phys. JETP 5 (1957) 1175–1204.
[2] Y. Almog, On the bifurcation and stability of periodic solutions of the Ginzburg–Landau equations in the plane, SIAM J. Appl. Math. 61 (2000) 149–171.
[3] Y. Almog, Non-linear surface superconductivity for type II superconductors in the large domain limit, Arch. Ration. Mech. Anal. 165 (2002) 271–293.
[4] A. Bernoff and P. Sternberg, Onset of superconductivity in decreasing fields for general domains, J. Math. Phys. 39 (1998) 1272–1284.
[5] S. J. Chapman, Nucleation of superconductivity in decreasing fields I, European J. Appl. Math. 5 (1994) 449–468.
[6] S. J. Chapman, Asymptotic analysis of the Ginzburg–Landau model of superconductivity: reduction to a free boundary model, Quart. Appl. Math. 53 (1995) 601–627.
[7] M. del Pino, P. L. Felmer and P. Sternberg, Boundary concentration for eigenvalue problems related to the onset of the superconductivity, Commun. Math. Phys. 210 (2000) 413–446.
[8] U. Essmann and H. Träuble, The direct observation of individual flux lines in type II superconductors, Phys. Lett. A24 (1967) 526–527.
[9] V. L. Ginzburg and L. D. Landau, On the theory of superconductivity, Soviet Phys. JETP 20 (1950) 1064.
[10] T. Giorgi and D. Phillips, The breakdown of superconductivity due to strong fields for the Ginzburg–Landau model, SIAM J. Math. Anal. 30 (1999) 341–359.
[11] B. Helffer and X.-B. Pan, Upper critical field and location of surface nucleation of superconductivity, Ann. Inst. H. Poincaré Anal. Non Linéaire 20(1) (2003) 145–181.
[12] H. T. Jadallah, The onset of superconductivity in a domain with a corner, J. Math. Phys. 42 (2001) 4101–4121.
[13] H. T. Jadallah, J. Rubinstein and P. Sternberg, Phase transition curves for mesoscopic superconducting samples, Phys. Rev. Lett. 82 (1999) 2935–2938.
[14] K. Lu and X. B. Pan, Gauge invariant eigenvalue problems in $\mathbb{R}^2$ and $\mathbb{R}^2_+$, Trans. Amer. Math. Soc. 352 (2000) 1247–1276.
[15] W. Meissner and R. Ochsenfeld, Naturwissenschaften 21 (1933) 787.
[16] X. B. Pan, Surface superconductivity in applied magnetic fields above $H_{C_2}$, Commun. Math. Phys. 228 (2002) 327–370.
[17] J. Rubinstein, Six lectures on superconductivity, in Boundaries Interfaces and Transitions, CRM Proceedings and Lecture Notes, Vol. 13, ed. M. Delfour (Amer. Math. Soc., 1998), pp. 163–184.
[18] D. Saint-James and P. G. de Gennes, Onset of superconductivity in decreasing fields, Phys. Lett. 7 (1963) 306–308.
[19] V. A. Schweigert and F. M. Peeters, Influence of the confinement geometry on surface superconductivity, Phys. Rev. B60 (1999) 3084–3087.
November 4, 2004 15:1 WSPC/148-RMP
00217
Reviews in Mathematical Physics Vol. 16, No. 8 (2004) 977–1071
© World Scientific Publishing Company

SELECTION OF THE GROUND STATE FOR NONLINEAR SCHRÖDINGER EQUATIONS

A. SOFFER∗ and M. I. WEINSTEIN†

∗Department of Mathematics, Rutgers University, New Brunswick, NJ 08903, USA

†Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027, USA
and
Mathematical Sciences Research, Bell Laboratories, Murray Hill, NJ 07974, USA

Received 31 December 2001
Revised 11 August 2004

We prove for a class of nonlinear Schrödinger systems (NLS) having two nonlinear bound states that the (generic) large time behavior is characterized by decay of the excited state, asymptotic approach to the nonlinear ground state and dispersive radiation. Our analysis elucidates the mechanism through which initial conditions which are very near the excited state branch evolve into a (nonlinear) ground state, a phenomenon known as ground state selection. Key steps in the analysis are the introduction of a particular linearization and the derivation of a normal form which reflects the dynamics on all time scales and yields, in particular, nonlinear master equations. Then, a novel multiple time scale dynamic stability theory is developed. Consequently, we give a detailed description of the asymptotic behavior of the two bound state NLS for all small initial data. The methods are general and can be extended to treat NLS with more than two bound states and more general nonlinearities including those of Hartree–Fock type.

Keywords: Nonlinear scattering; soliton dynamics; NLS; Gross–Pitaevskii equation; asymptotic stability; metastability.
Contents

1. Introduction and Statement of Main Results
   1.1. Relation to other work
2. Structure of the Proof
   2.1. Discussion of time scales
3. Linear and Nonlinear Bound States
   3.1. Bound states of the unperturbed problem
   3.2. Nonlinear bound states
4. Linearization about the Ground State
   4.1. Nondegenerate basis for the discrete subspaces of $H_0$ and $H_0^*$
   4.2. Estimates for the linearized evolution operator
5. Decomposition and Modulation Equations
   5.1. Modulation equations
   5.2. Conservation laws and a priori bounds
6. Toward a Normal Form – Algebraic Reductions and Frequency Analysis
   6.1. Modulation equations
   6.2. Algebraic reductions and determination of $\tilde\Theta(t)$
      6.2.1. Determination of $\tilde\Theta(t)$
      6.2.2. Simplifications to Eqs. (6.1) and (6.2)
   6.3. Peeling off the rapid oscillations of $\alpha_1$
   6.4. Expansion of $\phi_2$
7. Normal Form and Master Equations
   7.1. Expansion of $\eta$
   7.2. Normal form and master equations
8. Stability Analysis on Different Time Scales – Overview
9. Finite Dimensional Reduction and Its Analysis on Different Time Scales
10. Decomposition and Estimation of the Dispersion
   10.1. Estimation of $\|e_0\|_\infty$
   10.2. Estimation of $[\partial e_0(t)]_{L^2_{loc};\,3/2}$
   10.3. Estimation of $[e_0(t)]_{H^1;\,0}$
11. Beginning of Proof of Proposition 9.1
12. Local and Nonlocal ODE Terms: $R_2^{(j)}(P_0,P_1)$ of Proposition 11.1
   12.1. Proof of part (1) of Proposition 12.1
   12.2. Proof of part (2) of Proposition 12.1
13. $R_1^{(j)}(\eta_b)$ terms of Proposition 11.1
14. Bootstrapping It All
15. Nongeneric Behavior
Acknowledgments
Appendix A. Notation
Appendix B. Proof of Proposition 4.2
Appendix C. A Commutator Term
References
1. Introduction and Statement of Main Results

In this paper we study the detailed dynamics of the nonlinear Schrödinger equation with a potential (NLS):

\[ i\partial_t\phi = H\phi + \lambda|\phi|^2\phi . \tag{1.1} \]

Here, $H = -\Delta + V(x)$ is a self-adjoint operator on $L^2(\mathbb{R}^3)$ and $\lambda$ is a coupling parameter, assumed real and of order one. When $V(x)$ is nonzero, Eq. (1.1) is also known as the time-dependent Gross–Pitaevskii equation.^a We assume that $V(x)$ is a smooth potential, which decays sufficiently rapidly as $|x|$ tends to infinity (short range). Finally, we assume that the operator $H$ has no zero energy resonance [24, 31], a condition which holds for generic $V$.

^a In a similar way, one can treat nonlinear terms of the form $\lambda K[|\phi|^2]$, which can be local or nonlocal. Typical examples are $K[|\phi|^2] = |\phi|^p$ and $K[|\phi|^2] = K\star|\phi|^2$ for some convolution kernel $K$ (Hartree–Fock type).
NLS is a Hamiltonian system with conserved Hamiltonian energy functional:

\[ H_{en}[\phi] = \int |\nabla\phi(x)|^2 + V(x)|\phi(x)|^2 + \frac{\lambda}{2}|\phi(x)|^4\,dx \tag{1.2} \]

and additional conserved integral

\[ N[\phi] = \int |\phi(x)|^2\,dx . \tag{1.3} \]

These conserved integrals are continuous in the $H^1(\mathbb{R}^3)$ topology. An extensive discussion of the well-posedness theory can be found in [6, 22, 52]. In particular, NLS is well-posed globally in time in the space $H^1(\mathbb{R}^3)$, for initial data, $\phi_0$, which is sufficiently small in $H^1$. Throughout this paper we shall assume that $\phi_0$ has sufficiently small $H^1$ norm. We shall use the notation:

\[ \mathcal{E}_0 \equiv \|\phi_0\|^2_{H^1} . \tag{1.4} \]
For $V(x)$ which decays sufficiently rapidly as $|x|$ tends to infinity (short range potentials) the spectrum of $H$ [35] consists of discrete spectrum, $\sigma_d(H)$, consisting of a finite number of negative point eigenvalues, and continuous spectrum, $\sigma_c(H) = [0,\infty)$. The dynamics of solutions for $\lambda = 0$ (linear Schrödinger) is very well understood. Let $\psi_{j*}$ and $E_{j*}$ denote bound states and bound state energies of the linear Schrödinger operator $H$:

\[ H\psi_{j*} = E_{j*}\psi_{j*} , \qquad \langle\psi_{j*},\psi_{k*}\rangle_{L^2} = \delta_{jk} . \tag{1.5} \]

Arbitrary initial conditions in an appropriate Hilbert space evolve, as $t$ tends to infinity, into a time-quasiperiodic part consisting of a superposition of time periodic and spatially localized states with frequencies given by the eigenvalues and a dispersive or radiative part, which decays to zero as $t$ tends to infinity in appropriate spaces, e.g. $L^p$, $p>2$, $L^2(\langle x\rangle^{-\sigma}dx)$. In order to be more precise, introduce $P_{c*}$, the orthogonal projection onto the continuous spectral subspace of $H$:

\[ P_{c*}f = f - \sum_j \langle\psi_{j*},f\rangle\psi_{j*} . \tag{1.6} \]

The solution of the linear Schrödinger equation can be expressed as

\[ e^{-iHt}\phi_0 = \sum_j \langle\psi_{j*},\phi_0\rangle\psi_{j*}e^{-iE_{j*}t} + e^{-iHt}P_{c*}\phi_0 . \tag{1.7} \]
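The time decay of the continuous spectral part of the solution can be expressed, under suitable smoothness, decay and genericity assumptions on $V(x)$, in terms of local decay estimates [24, 31]:

\[ \|\langle x\rangle^{-\sigma}e^{-iHt}P_{c*}\phi_0\|_{L^2(\mathbb{R}^3)} \le C\langle t\rangle^{-3/2}\|\langle x\rangle^{\sigma}\phi_0\|_{L^2(\mathbb{R}^3)} , \qquad \sigma\ge\sigma_0>0 , \tag{1.8} \]

and $L^1 - L^\infty$ decay estimates [25, 61]:

\[ \|e^{-iHt}P_{c*}\phi_0\|_{L^\infty(\mathbb{R}^3)} \le C|t|^{-3/2}\|\phi_0\|_{L^1(\mathbb{R}^3)} . \tag{1.9} \]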
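The two time factors in (1.8) and (1.9) can be compared numerically. The following is a minimal sketch, assuming the usual convention $\langle t\rangle = (1+t^2)^{1/2}$ for the bracket (the paper's own definition is in its Appendix A, which is not reproduced here); all function names are illustrative and not from the paper. It shows that the two rates agree for large $t$, that only $\langle t\rangle^{-3/2}$ stays finite at $t=0$, and that the rate is integrable in time.

```python
import math


def japanese_bracket(t: float) -> float:
    """Return <t> = sqrt(1 + t^2) (assumed convention for the bracket in (1.8))."""
    return math.sqrt(1.0 + t * t)


def local_decay_rate(t: float) -> float:
    """Time factor <t>^(-3/2) of the local decay estimate (1.8); finite at t = 0."""
    return japanese_bracket(t) ** -1.5


def dispersive_rate(t: float) -> float:
    """Time factor |t|^(-3/2) of the L^1 - L^infinity estimate (1.9); singular at t = 0."""
    return abs(t) ** -1.5


def integrated_local_decay(t_max: float, steps: int = 200_000) -> float:
    """Trapezoidal approximation of the integral of <t>^(-3/2) over [0, t_max]."""
    h = t_max / steps
    total = 0.5 * (local_decay_rate(0.0) + local_decay_rate(t_max))
    for k in range(1, steps):
        total += local_decay_rate(k * h)
    return total * h
```

Calling `integrated_local_decay` with increasing `t_max` gives values that stay bounded, which is the time-integrability that an exponent larger than one provides.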
For $\lambda\ne0$ the bound states of the linear problem persist, and bifurcate from the linear states at zero amplitude into branches of nonlinear bound states [38]. Of interest to us is the detailed dynamics of the nonlinear problem on short, intermediate and long time scales and, in particular, the manner in which the nonlinear bound states participate in the dynamics.

In [38] variational methods were used to establish the existence and orbital Lyapunov stability of bound states which are local minimizers of $H_{en}$ subject to fixed $N$; see also [59, 15]. This result says that initial data which is close, modulo a phase adjustment, in $H^1$ to the ground state remains $H^1$ close to a phase adjusted ground state for all time. The $H^1$ norm is closely related to the conserved Hamiltonian energy of the system and is insensitive to dispersive phenomena. Therefore, the detailed dynamics is not addressed by this result. For example, could the large time dynamics consist of a nonlinear ground state plus a small nonlinear excited state part? The main result of this paper implies that this cannot occur.

For the nonlinear problem, the simplest question to consider is the case where $H$ has only one simple eigenvalue and the norm of the solution is small. The detailed dynamics was studied in [43, 44, 33, 57]. Small norm initial data are shown to evolve into an asymptotic nonlinear ground state and a radiative decaying part.

In this paper we study the multibound state variant of this question. We consider the specific case where $H$ has two simple eigenvalues, $E_{0*}$ and $E_{1*}$. The linear Schrödinger equation then has two time-periodic solutions $\psi_{0*}e^{-iE_{0*}t}$ and $\psi_{1*}e^{-iE_{1*}t}$, with $H\psi_{j*} = E_{j*}\psi_{j*}$, $\psi_{j*}\in L^2$. Therefore [38], NLS has two branches of nonlinear bound states bifurcating from the zero state at the eigenvalues of $H$, $\Psi_{\alpha_0}e^{-iE_0t}$ and $\Psi_{\alpha_1}e^{-iE_1t}$, with $\Psi_{\alpha_j}\in L^2$ satisfying

\[ H\Psi_{\alpha_j} + \lambda|\Psi_{\alpha_j}|^2\Psi_{\alpha_j} = E_j\Psi_{\alpha_j} . \tag{1.10} \]

Here, $\alpha_j$ denotes a coordinate along the $j$th nonlinear bound state branch and

\[ E_j = E_{j*} + O(|\alpha_j|^2) . \tag{1.11} \]
In contrast to the linear behavior (1.7) our main result is the following:

Theorem 1.1. Consider NLS with $V(x)$ a short range potential supporting two bound states as described above. Furthermore, assume that the linear Schrödinger operator, $H$, has no zero energy resonance [24].

(i) Assume the initial data, $\phi(0)$, is small in the norm defined by:

\[ [\phi(0)]_X \equiv \|\langle x\rangle^{\sigma}\phi(0)\|_{H^k} , \tag{1.12} \]

where $k>2$ and $\sigma>0$ are sufficiently large. Let $\phi(t)$ solve the initial value problem for NLS.
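Assume the (generically satisfied) nonlinear Fermi golden rule resonance condition^b

\[ \Gamma_{\omega_*} \equiv \lambda^2\pi\left\langle \psi_{0*}\psi_{1*}^2,\ \delta(H-\omega_*)\,\psi_{0*}\psi_{1*}^2 \right\rangle > 0 \tag{1.13} \]

holds, where

\[ \omega_* = 2E_{1*} - E_{0*} > 0 . \tag{1.14} \]

Then, as $t\to\infty$,

\[ \phi(t) \to e^{-i\omega_j(t)}\Psi_{\alpha_j(\infty)} + e^{i\Delta t}\phi_+ , \tag{1.15} \]

in $L^2$, where either $j=0$ or $j=1$. The phase $\omega_j$ satisfies

\[ \omega_j(t) = \omega_j^\infty t + O(\log t) . \tag{1.16} \]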
Here, $\Psi_{\alpha_j(\infty)}$ is a nonlinear bound state (Sec. 3), with frequency $E_j(\infty)$ near $E_{j*}$. When $j=0$, the solution is asymptotic to a nonlinear ground state, while in the case $j=1$ the solution is asymptotic to a nonlinear excited state.

(ii) More specifically, we have the following expansion of the solution $\phi(t)$:

\[ \phi(t) = e^{-i\tilde\omega_0(t)}\Psi_{\alpha_0(t)} + e^{-i\tilde\omega_1(t)}\Psi_{\alpha_1(t)} + \pi_1 e^{-iH_0(\infty)t}P_c\tilde\phi_+(t) + R_{loc}(t) + R_{nloc}(t) , \]

where as $t\to\infty$

\[ \tilde\omega_j(t) - \omega_j(t) \to 0 , \qquad \alpha_j(t) - \alpha_j(\infty) \to 0 \tag{1.17} \]

and such that for each initial state, $\phi(0)$,

\[ |\alpha_0(\infty)|\cdot|\alpha_1(\infty)| = 0 , \qquad \tilde\phi_+(t)\to\phi_+ \ \text{in } L^2 , \]

\[ \|R_{loc}(\cdot,t)\|_2 = O(t^{-1/2}) , \qquad \|R_{nloc}(\cdot,t)\|_\infty = O(t^{-1/2}) . \]

Here, $H_0(\infty)$ is a small spatially localized perturbation of the operator $\sigma_3(-\Delta+V(x))$ and $P_c = P_c(H_0(\infty))$, the projection onto its continuous spectral part. Finally, $\pi_1$ maps the vector $(z_1,z_2)$ to $(z_1,0)$.

^b The operator $f\mapsto\delta(H-\omega_*)f$ projects $f$ onto the generalized eigenfunction of $H$ with generalized eigenvalue $\omega_*$. The expression in (1.13) is finite by local decay estimates (1.8); see e.g. [47].

Remark 1.1. (i) Theorem 1.1 implies the absence of small norm time-quasiperiodic solutions for this class of nonlinear Schrödinger equations [41]. Intuitively, one can explain why one expects only a pure state in the limit $t\to\infty$ and how the condition on $\omega_* = 2E_{1*}-E_{0*}$ arises. Our intuition is based on viewing the nonlinearity as a linear time-dependent potential; see also [41, 42]. An approximate superposition of a nonlinear ground state and excited state $\phi\sim\Psi_{\alpha_0}e^{-iE_0t}+\Psi_{\alpha_1}e^{-iE_1t}$ can be viewed as defining a self-consistent time-dependent potential:

\[ \begin{aligned} W(x,t) &= \lambda\left|\Psi_{\alpha_0}e^{-iE_0t}+\Psi_{\alpha_1}e^{-iE_1t}\right|^2 \\ &= \lambda\left|\Psi_{\alpha_0}+\Psi_{\alpha_1}e^{-i(E_1-E_0)t}\right|^2 \\ &\sim \lambda|\Psi_{\alpha_0}|^2 + 2\lambda\cos((E_1-E_0)t+\gamma)\,\Psi_{|\alpha_0|}\Psi_{|\alpha_1|} \end{aligned} \tag{1.18} \]
for $|\alpha_1|\ll|\alpha_0|$. As shown in [48], [28] and [29], data with initial conditions given by the unperturbed excited state decay exponentially on a time scale of order $\tau\sim O(|\alpha_0\alpha_1|^{-2})$, provided the forcing frequency satisfies $E_1-E_0\sim E_{1*}-E_{0*}>-E_{1*}$, or equivalently $\omega_* = 2E_{1*}-E_{0*}>0$.

(ii) Theorem 1.1 implies asymptotic stability and selection of the ground state for generic small data. Theorem 1.1(i) implies a form of asymptotic completeness.

(iii) Since we control the decay of solutions in $W^{k,\infty}$, our results imply global existence of small solutions in $H^s$ for all $s$ sufficiently large.

(iv) The asymptotic state where $|\alpha_1(\infty)|\ne0$ (and therefore $|\alpha_0(\infty)|=0$) is non-generic. This can be seen by linearization about the excited state. The linearized operator, $H_1$, is a localized perturbation of an operator having embedded eigenvalues in its continuous spectrum, under our hypothesis $\omega_*\equiv2E_{1*}-E_{0*}>0$. The connection between embedded eigenvalues in the continuous spectrum of an appropriate linear operator and the non-persistence of localized time periodic states was explored first in [41, 42]. It is well known that embedded eigenvalues in the continuous spectrum are unstable to generic perturbations; see, for example, [7, 14, 47]. In this case, the embedded eigenvalues are perturbed to complex eigenvalues, with corresponding eigenstates whose evolution is exponentially growing with time, under the condition (7.4). The perturbation to the linear operator with embedded eigenvalues is however both non-generic (in that it comes from linearization of a Hamiltonian nonlinear term about a critical point of the energy) and breaks self-adjointness with respect to the standard $L^2$ inner product. A second order perturbation theory calculation shows that if $\omega_*\equiv2E_{1*}-E_{0*}>0$, generically the embedded eigenvalue perturbs to an exponential instability [45]. This suggests the existence of an unstable manifold of solutions for the nonlinear equation. The existence of such non-generic solutions of NLS with $E_1(\infty)\ne0$ for the full nonlinear flow has recently been demonstrated [55].

(v) Theorem 1.1 is stated for the case of two nonlinear bound state branches. The technique of proof, however, can be used to consider the more general case. We expect results which are analogous to those of our main theorem, but more complicated due to the presence of: direct bound state-bound state interactions, bound state-continuum interactions and bound state-bound state interactions mediated by the continuum. Multimode Hamiltonian systems in the context of linear time almost periodic perturbations have been studied in [29].
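Remark 1.1(i) states the resonance condition in two forms: the forcing frequency $E_{1*}-E_{0*}$ must exceed $-E_{1*}$, or $\omega_* = 2E_{1*}-E_{0*}$ must be positive, so that it lies in the continuous spectrum $[0,\infty)$. A minimal sketch of this check follows; the function names and the sample energies in the usage note are hypothetical, chosen only for illustration.

```python
def omega_star(e0_star: float, e1_star: float) -> float:
    """Return omega_* = 2 E_{1*} - E_{0*}, the frequency appearing in (1.14)."""
    return 2.0 * e1_star - e0_star


def forcing_frequency_condition(e0_star: float, e1_star: float) -> bool:
    """Form used in Remark 1.1(i): E_{1*} - E_{0*} > -E_{1*}."""
    return (e1_star - e0_star) > -e1_star


def resonance_condition_holds(e0_star: float, e1_star: float) -> bool:
    """True when omega_* > 0 for two bound-state energies E_{0*} < E_{1*} < 0."""
    if not (e0_star < e1_star < 0.0):
        raise ValueError("expected two bound-state energies with E0* < E1* < 0")
    return omega_star(e0_star, e1_star) > 0.0
```

For example, the hypothetical pair $E_{0*}=-1.0$, $E_{1*}=-0.3$ satisfies the condition, while $E_{0*}=-1.0$, $E_{1*}=-0.6$ does not: in the second case $2E_{1*}-E_{0*}$ is negative and falls below the continuous spectrum.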
1.1. Relation to other work

We now wish to further put our results in context. Research on nonlinear scattering in the presence of bound states has followed two related lines.

(a) Nonlinear dispersive waves in systems with defects, potentials, etc.: Our analysis centers around nonlinear bound states which bifurcate from linear bound states of the operator, $H$, obtained by linearizing about the zero solution. These bound states exist at all sufficiently small amplitudes (measured in any $H^s$ norm, $s\ge0$). The behavior of the “bifurcation diagram” for larger amplitudes depends in a detailed way on the nonlinearity, the spatial dimension and the norm [38]. Such nonlinear bound states are also called nonlinear defect modes, nonlinear localized modes or nonlinear pinned modes. They are localized about or “pinned” to the support of the potential, $V$, and arise due to a local deviation from translation invariance or a “defect” in the homogeneous background which acts as an attractive potential well.

To get a more refined picture of the dynamics than in the $H^1$ theory, one must consider the linearized evolution about the family of nonlinear ground states. This linearized operator has continuous and discrete spectral parts inherited from the linear bound state spectral structure. In particular, the discrete spectrum contains an eigenvalue at zero corresponding to the ground state and a pair of eigenvalues (located symmetrically about zero) corresponding to the excited state. Thus, at linear order a solution infinitesimally close to the ground state formally appears to be quasiperiodic in time: a ground state plus a small excited state oscillation. However, at higher order in perturbation theory one finds nonlinear resonant coupling of the neutral oscillatory modes to the continuum and as a result these slowly damp to zero; generically, for very large time energy splits between the ground state and dispersive parts of the solution.
This mechanism for relaxation to the ground state was earlier considered for the nonlinear Klein–Gordon wave equation with a potential, where the decay of “breather-like” solutions was studied [49]. In this work, small norm solutions relax to the zero solution via resonant energy transfer out of the bound state to radiation modes and dispersive radiation of energy to infinity; the zero solution plays the role of the ground state. Results concerning special classes of initial data are considered in the work of Cuccagna [11] and Tsai and Yau [53, 54].

(b) Nonlinear dispersive translation invariant equations: A closely-related line of research focuses on the translation invariant nonlinear Schrödinger equation. Here the equation is (1.1) with $V$ taken to be identically zero and nonlinear coupling parameter $\lambda<0$. In this case, the equation has solitary wave solutions, obtainable by minimization of $H_{en}[\phi]$ subject to $N[\phi]=N_0$. For NLS in dimension $n$ with cubic nonlinearity replaced by the general power nonlinearity $|\phi|^{p-1}\phi$, we have that if $p<1+\frac{4}{n}$, the foregoing variational problem has a unique (up to translation) radially symmetric ground state solution for any $N_0>0$. In the case when $V(x)\ne0$ is a potential supporting bound states, the small $N_0$ solutions agree with the bifurcating bound states discussed above [38]. As pointed out earlier, constrained energy minimizers are $H^1$ orbitally Lyapunov stable [12, 59].

An interesting feature of solitary waves in the translation invariant case is the presence of spurious neutral oscillations. These are sometimes called internal modes [23]. To explain this, consider the linearization about a ground state solitary wave ($p<1+\frac{4}{n}$). Due to the underlying symmetry group of the equation (translation invariance, phase invariance, Galilean invariance, etc.) this linearization has a (generalized) zero eigenvalue of multiplicity related to the dimension of the equation’s symmetry group. In the nonintegrable cases ($n=1$, $p\ne3$ and $n\ge2$) the linearization has additional neutral modes. These neutral modes approach zero as $p$ approaches $1+\frac{4}{n}$; the dimension of the zero subspace jumps by two at $p=1+\frac{4}{n}$, the critical case, corresponding to the larger group of symmetries and the existence of a pseudo-conformal invariant [58].

Buslaev and Perel’man [3] considered the problem in one space dimension and showed that nonlinear resonance of these “internal modes” with the continuum is responsible for their damping on long time scales and the asymptotic stability of solitons. See also the recent work of Buslaev and Sulem [4]. Their analysis was restricted to one space dimension only in their use of explicit eigenfunction expansion methods to obtain the required local energy decay estimates. Cuccagna [10, 11] extended their results to more general nonlinearities and general space dimensions. In his analysis, the required dispersive estimates are obtained by adapting K. Yajima’s [61] approach in which the wave operators, which conjugate the linearized operator on its continuous spectral part to the constant coefficient “free” dispersive evolution, are shown to be bounded on $W^{k,p}$ spaces. This method was also used in [49].
Another feature, common to problems of type (a) and (b) is the use of the method of normal forms. In the context of nonlinear scattering, normal form ideas were used to obtain the local behavior in a neighborhood of a soliton in [3] and for the decaying breather-like state in [49]. In contrast to the normal form for finite dimensional Hamiltonian systems, resonant interaction with the continuous spectrum gives rise to a more general normal form which captures internal damping, due to energy transfer out of certain discrete modes to the continuum modes; see the discussion in the introduction to [49]. In the present work, we derive a nonlinear master equation, coupled equations for the renormalized (up to near identity transformations on the complex discrete mode amplitudes) discrete mode square amplitudes (“mode powers”), which governs generic dynamics on large intermediate and very long time scales. Normal forms of this type, expected to be valid for very long times, were derived and studied in the local analysis about the steady and “wobbling” kink-like solutions of discrete nonlinear wave equations in [26]. A key feature of the normal form of the current work is our analysis of its behavior on different time scales and the analysis of its transitional behavior across time scales, for general initial data. Finally, we point out that there are many important areas of application which motivate the study of the class of models we treated in this paper. We mention two.
At the most fundamental level the Gross–Pitaevskii equation (NLS with a potential) arises as a mean field limit model governing the interaction of a very large number of weakly interacting bosons [21, 51, 30, 9]. At a macroscopic level, it has been shown that equations of this type arise as the equation governing the evolution of the envelope of the electric field of a light pulse propagating in a medium with defects. See, for example, [17, 19, 20].

2. Structure of the Proof

We now sketch our analysis. Certain notations are defined in Appendix A. Analogous with the approach introduced in [43, 44] in the one bound state case, we represent the solution in terms of the dynamics of the bound state part, described through the evolution of the collective coordinates $\alpha_0(t)$ and $\alpha_1(t)$, and a remainder $\phi_2$, whose dynamics is controlled by a dispersive equation. In particular we have

\[ \phi(t,x) = e^{-i\int_0^t E_0(s)\,ds - i\tilde\Theta(t)}\left(\Psi_{\alpha_0(t)} + \Psi_{\alpha_1(t)} + \phi_2(t,x)\right) . \tag{2.1} \]

We substitute (2.1) into NLS and use the nonlinear equations (1.10) for $\Psi_{\alpha_j}$ to simplify. Anticipating the decay of the excited state, we center the dynamics about the ground state. We therefore obtain for $\Phi_2\equiv(\phi_2,\bar\phi_2)^T$ the equation:

\[ i\partial_t\Phi_2 = H_0(t)\Phi_2 + G(t,x,\Phi_2;\partial_t\vec\alpha(t),\partial_t\vec\alpha,\partial_t\tilde\Theta(t)) \tag{2.2} \]
where $H_0(t)$ denotes the matrix operator which is the linearization about the time-dependent nonlinear ground state $\Psi_{\alpha_0(t)}$. The idea is that in order for $\phi_2(t,x)$ to decay dispersively to zero we must choose $\alpha_0(t)$ and $\alpha_1(t)$ to evolve in such a way as to remove all secular resonance terms from $G$. Thus we require

\[ P_b(H_0(t))\Phi_2(t) = 0 , \tag{2.3} \]

where $P_b(H_0)$ and $P_c = I - P_b(H_0)$ denote the discrete and continuous spectral projections of $H_0$; see also condition (5.13). Since the discrete subspace of $H_0(t)$ is four-dimensional (consisting of a generalized null space of dimension two plus two oscillating neutral modes), (2.3) is equivalent to four orthogonality conditions implying four differential equations for $\alpha_0$, $\alpha_1$ and their complex conjugates. These equations are coupled to the dispersive partial differential equation for $\Phi_2$. At this stage we have that NLS is equivalent to a dynamical system consisting of a finite dimensional part, (6.23) and (6.24), governing $\vec\alpha_j = (\alpha_j,\bar\alpha_j)$, $j=0,1$, coupled to an infinite dimensional dispersive part governing $\Phi_2$:

\[ \begin{aligned} i\partial_t\vec\alpha &= A(t)\vec\alpha + \vec F_\alpha \\ i\partial_t\Phi_2 &= H_0(t)\Phi_2 + \vec F_\phi . \end{aligned} \tag{2.4} \]

We expect $A(t)$ and $H_0(t)$ to have limits as $t\to\pm\infty$. Our strategy is to fix $T>0$ arbitrarily large, and to study the dynamics on the interval $[0,T]$. In this we follow the strategy of [3, 11]. We shall rewrite (2.4) as:

\[ \begin{aligned} i\partial_t\vec\alpha &= A(T)\vec\alpha + (A(t)-A(T))\vec\alpha + \vec F_\alpha \\ i\partial_t\Phi_2 &= H_0(T)\Phi_2 + (H_0(t)-H_0(T))\Phi_2 + \vec F_\phi \end{aligned} \tag{2.5} \]
and implement a perturbative analysis about the time-independent reference linear, respectively, matrix and differential, operators $A(T)$ and $H_0(T)$. More specifically, we analyze the dynamics of (2.5) by using (a) the eigenvalues of $A(T)$ to calculate the key resonant terms and (b) the dispersive estimates of $e^{-iH_0(T)t}P_c(T)$ [10]. Note also that $P_c(T)\Phi_2(t)\ne\Phi_2(t)$ because $\Phi_2(t)\in\mathrm{Range}\,P_c(t)\ne\mathrm{Range}\,P_c(T)$. We therefore decompose $\Phi_2$ as:

\[ \Phi_2 = \mathrm{disc}(t;T) + \eta , \qquad \eta = P_c(T)\eta , \tag{2.6} \]

where $\mathrm{disc}(t;T)$ lies in the discrete spectral subspace of $H_0(T)$, and show that $\mathrm{disc}(t;T)$ can be controlled in terms of $\eta$.

The expected generic behavior of this system is that $\alpha_1$ and $\Phi_2$ decay with a rate $t^{-1/2}$. This slow rate actually leads to an equation for $\alpha_1$ with the character

\[ \partial_t\alpha_1 \sim \left(\frac{1}{t}\rho - \partial_t\tilde\Theta\right)\alpha_1 + \text{integrable in } t ; \]

see (6.12). Thus, $\tilde\Theta$ is chosen to satisfy $\partial_t\tilde\Theta\sim\frac{1}{t}\rho$, ensuring that $\alpha_1$ has a limit. In this way a logarithmic correction to the standard phase arises; see (1.16).

Next we explicitly factor out the rapid oscillations from $\alpha_1$ and show that, after a near identity change of variables $(\alpha_0,\alpha_1)\mapsto(\tilde\alpha_0,\tilde\beta_1)$, the modified ground and excited state amplitudes satisfy the system:

\[ \begin{aligned} i\partial_t\tilde\alpha_0 &= (c_{1022}+i\Gamma_\omega)|\tilde\beta_1|^4\tilde\alpha_0 + F_\alpha[\tilde\alpha_0,\tilde\beta_1,\eta,t] \\ i\partial_t\tilde\beta_1 &= (c_{1121}-2i\Gamma_\omega)|\tilde\alpha_0|^2|\tilde\beta_1|^2\tilde\beta_1 + F_\beta[\tilde\alpha_0,\tilde\beta_1,\eta,t] ; \end{aligned} \tag{2.7} \]

see Proposition 7.1. It follows that a nonlinear master equation governs $P_j = |\tilde\alpha_j|^2$, the power in the $j$th mode:

\[ \begin{aligned} \frac{dP_0}{dt} &= 2\Gamma P_1^2P_0 + R_0(t) \\ \frac{dP_1}{dt} &= -4\Gamma P_1^2P_0 + R_1(t) . \end{aligned} \tag{2.8} \]

Coupling to the dispersive part, $\Phi_2$, is through the source terms $R_0$ and $R_1$. The expression “master equation” is used since the role played by (2.8) is analogous to the role of master equations in the quantum theory of open systems [13].

A novel multiscale Lyapunov argument is implemented in Sec. 8 characterizing the behavior of the system (2.8) coupled to that of the dispersive part on short, intermediate and long time scales. We consider the system (2.8) on three time intervals: $I_0 = [0,t_0]$ (initial phase), $I_1 = [t_0,t_1]$ (embryonic phase) and $I_2 = [t_1,\infty)$ (selection of the ground state). For $t\ge t_0$, the terms $R_0(t)$ and $R_1(t)$ are shown (Proposition 12.1) to have the form

\[ R_0(t) \sim \frac{b_0(t_0,\mathcal{E}_0)}{\langle t\rangle^2} + \rho_0(\mathcal{E}_0,t)P_0P_1^2 \tag{2.9} \]

\[ R_1(t) \sim \frac{b_1(t_0,\mathcal{E}_0)}{\langle t\rangle^2} + \delta_m(t)\sqrt{P_0}\,P_1^m + \rho_1(\mathcal{E}_0,t)P_0P_1^2 , \tag{2.10} \]
where

\[ b_0 = O(\langle t_0\rangle^{-1}) , \qquad b_1 = O(\langle t_0\rangle^{-1/2}) . \tag{2.11} \]

$R_0(t)$ and $R_1(t)$ have parts which are local in time and nonlocal in time. The handling of nonlocal terms is explained in Sec. 8. We set (Proposition 9.2)

\[ Q_0 = P_0 - \frac{\tilde b_0}{\langle t\rangle} , \qquad Q_1 = P_1 + \frac{\tilde b_1}{\langle t\rangle} , \tag{2.12} \]

where $\tilde b_0$ and $\tilde b_1$ are positive and satisfy (2.11) as well.

2.1. Discussion of time scales
Estimates (2.9), (2.10) and the definitions (2.12) imply an effective finite-dimensional reduction to a system of equations for the “effective mode powers” $Q_0(t)$ and $Q_1(t)$, whose character on different time scales dictates the full infinite-dimensional dynamics, in a manner analogous to the role of a center manifold reduction of a dissipative system [5].

Initial phase, $t \in I_0 = [0, t_0]$: Here, $I_0$ is the maximal interval on which $Q_0(t) \le 0$. If $t_0 = \infty$, then $P_0(t) = O(\langle t\rangle^{-2})$ and the ground state decays to zero. In this case, we show in Sec. 15 that the excited state amplitude has a limit as well (which may or may not be zero). This case is non-generic.

Embryonic phase, $t \in I_1 = [t_0, t_1]$: If $t_0 < \infty$, then for $t > t_0$:
\[
\frac{dQ_0}{dt} \ge 2\Gamma' Q_0 Q_1^2, \qquad
\frac{dQ_1}{dt} \le -4\Gamma' Q_0 Q_1^2 + O(\sqrt{Q_0}\,Q_1^m), \quad m \ge 4. \tag{2.13}
\]
Therefore, $Q_0$ is monotonically increasing; the ground state grows. Furthermore, if $Q_0$ is small relative to $Q_1$, then
\[
\frac{Q_0}{Q_1} \ \text{is monotonically increasing}, \tag{2.14}
\]
in fact exponentially increasing; the ground state grows rapidly relative to the excited state.

Selection of the ground state, $t \in I_2 = [t_1, \infty)$: There exists a time $t = t_1$, $t_0 \le t_1 < \infty$, at which the $O(\sqrt{Q_0}\,Q_1^m)$ term in (2.13) is dominated by the leading (“dissipative”) term. For $t \ge t_1$ we have
\[
\frac{dQ_0}{dt} \ge 2\Gamma' Q_0 Q_1^2, \qquad
\frac{dQ_1}{dt} \le -4\Gamma' Q_0 Q_1^2. \tag{2.15}
\]
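As an informal illustration (not part of the argument), the model system obtained by replacing the inequalities in (2.15) with equalities can be integrated numerically. The sketch below is a minimal Python example under that assumption; the function name `integrate_model`, the rate value 1.0 and the initial data are hypothetical choices made only for illustration. For this model the combination $2Q_0 + Q_1$ is conserved, so the decay of $Q_1$ forces $Q_0$ to converge to a positive limit.

```python
def integrate_model(q0, q1, gamma=1.0, dt=0.01, t_end=200.0):
    """Integrate dQ0/dt = 2*gamma*Q0*Q1**2, dQ1/dt = -4*gamma*Q0*Q1**2.

    This is the equality version of (2.15), an assumed model system,
    advanced with the classical fourth-order Runge-Kutta scheme.
    """
    def rhs(a, b):
        flux = gamma * a * b * b
        return 2.0 * flux, -4.0 * flux

    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = rhs(q0, q1)
        k2 = rhs(q0 + 0.5 * dt * k1[0], q1 + 0.5 * dt * k1[1])
        k3 = rhs(q0 + 0.5 * dt * k2[0], q1 + 0.5 * dt * k2[1])
        k4 = rhs(q0 + dt * k3[0], q1 + dt * k3[1])
        q0 += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        q1 += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return q0, q1


if __name__ == "__main__":
    q0_end, q1_end = integrate_model(0.1, 1.0)
    # Q1 decays algebraically while Q0 saturates near Q0(0) + Q1(0)/2.
    print(q0_end, q1_end, 2.0 * q0_end + q1_end)
```

In this toy model $d(1/Q_1)/dt = 4\gamma Q_0$, so $Q_1$ decays like $1/t$ once $Q_0$ has saturated, which mirrors the algebraic (rather than exponential) relaxation described in the text.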
It follows that $Q_0(t) \to Q_0(\infty) > 0$ and $Q_1(t) \to 0$ as $t \to \infty$; the ground state is selected.

3. Linear and Nonlinear Bound States

In this section we introduce bound states of the linear ($\lambda = 0$) and nonlinear ($\lambda \ne 0$) Schrödinger equation (1.1).

3.1. Bound states of the unperturbed problem

Let $H = -\Delta + V(x)$. We assume that $V(x)$ is smooth and sufficiently rapidly decaying, so that $H$ defines a self-adjoint operator in $L^2$. Additionally, we assume that the spectrum of $H$ consists of a continuous spectrum extending from 0 to positive infinity and two discrete negative eigenvalues, each of multiplicity one:
\[
\sigma(H) = \{E_{0*}, E_{1*}\} \cup [0, \infty). \tag{3.1}
\]
Therefore, there exist eigenstates $\psi_{j*} \in D(H)$, $j = 0, 1$, such that
\[
H\psi_{j*} = E_{j*}\psi_{j*}. \tag{3.2}
\]
We also introduce spectral projections onto the discrete eigenstates and continuous spectral part of $H$, respectively:
\[
P_{j*} f \equiv \langle\psi_{j*}, f\rangle\psi_{j*}, \quad j = 0, 1, \qquad P_{c*} \equiv I - P_{0*} - P_{1*}.
\]

3.2. Nonlinear bound states

We seek solutions of (1.1) of the form
\[
\phi = e^{-iE_j t}\,\Psi_{E_j}. \tag{3.3}
\]
Substitution into (1.1) yields
\[
H\Psi_{E_j} + \lambda|\Psi_{E_j}|^2\Psi_{E_j} = E_j\Psi_{E_j}. \tag{3.4}
\]
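We introduce a bifurcation parameter, $\alpha_j$, for the $j$th nonlinear bound state branch and define
\[
\vec\alpha_j = (\alpha_j, \bar\alpha_j). \tag{3.5}
\]
Proposition 3.1 ([38]). For each $j = 0, 1$ we have a one-parameter family, $\Psi_{\alpha_j} = \Psi_j$, of bound states depending on the complex parameter $\alpha_j = |\alpha_j|e^{i\gamma_j}$ and defined for $|\alpha_j|$ sufficiently small:
\[
\begin{aligned}
\Psi_j(x) &\equiv \alpha_j\psi_j(x; |\alpha_j|^2) = e^{i\gamma_j}|\alpha_j|\psi_j(x; |\alpha_j|^2) \\
&= \alpha_j\left(\psi_{j*}(x) + \lambda|\alpha_j|^2\psi_j^{(1)}(x; |\alpha_j|^2)\right) \\
&= \alpha_j\left(\psi_{j*}(x) + \lambda|\alpha_j|^2\psi_j^{(1)}(x; 0) + O(\lambda^2|\alpha_j|^4)\right).
\end{aligned} \tag{3.6}
\]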
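Here,
\[
\psi_j^{(1)}(x; 0) = -(H - E_{0*})^{-1}(I - P_{E_{j*}})\psi_{j*}^3 \tag{3.7}
\]
\[
\begin{aligned}
E_j \equiv E_j(\vec\alpha_j) &\equiv E_{j*} + |\alpha_j|^2 E_j^{(1)}(|\alpha_j|^2) \\
&= E_{j*} + \lambda\left(\int\psi_{0*}^4\,dx\right)|\alpha_j|^2 + O(|\alpha_j|^4).
\end{aligned} \tag{3.8}
\]
The mapping $\vec\alpha_j \mapsto (E_j(\vec\alpha_j), \Psi_j(\cdot\,; \vec\alpha_j))$ is smooth.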
The proof uses standard bifurcation theory [32], which is based on the implicit function theorem. The analysis extends to the case of nonlocal nonlinearities. A variational approach can also be used to construct nonlinear bound states. Variational approaches, though more global, do not directly yield the information we require concerning smooth variation with respect to parameters.

Remark 3.1. $\Psi_j$ depends on $\alpha_j$ and $\bar\alpha_j$. We shall compute derivatives of the nonlinear bound states $\Psi_{\alpha_j}$ with respect to $\alpha_j$ and $\bar\alpha_j$ and use the notation:
\[
\vec\nabla_j = (\partial_{\alpha_j}, \partial_{\bar\alpha_j}) \equiv (\partial_j, \partial_{\bar j}). \tag{3.9}
\]
In what follows we shall “modulate” these bound states. That is, we shall allow $\vec\alpha$ to vary with time. For convenience, we shall use the notation:
\[
\Psi_j(t, x) = \Psi_{\alpha_j(t)}(x), \qquad E_j(t) = E_j(|\alpha_j(t)|^2).
\]

4. Linearization about the Ground State

Let $\Psi$ denote a nonlinear bound state (ground state or excited state) of (1.1); see Sec. 3. Then,
\[
H\Psi + \lambda|\Psi|^2\Psi = E\Psi. \tag{4.1}
\]
We first derive the linear stability problem. Let
\[
\phi = (\Psi + p)\,e^{-iEt}, \tag{4.2}
\]
where $p$ denotes the perturbation about $\Psi$. Substituting (4.2) into (1.1) and neglecting all terms which are nonlinear in $p$ and $\bar p$, we obtain the linearized perturbation equation
\[
i\partial_t p = \left(H - E + 2\lambda|\Psi|^2\right)p + \lambda\Psi^2\bar p. \tag{4.3}
\]
Since $\bar p$ appears explicitly in (4.3) it is natural to consider the system for
\[
\vec p = \begin{pmatrix} p \\ \bar p \end{pmatrix}, \tag{4.4}
\]
\[
i\partial_t\vec p = \mathcal{H}\vec p, \tag{4.5}
\]
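where
\[
\mathcal{H} = \sigma_3\begin{pmatrix} H - E + 2\lambda|\Psi|^2 & \lambda\Psi^2 \\ \lambda\bar\Psi^2 & H - E + 2\lambda|\Psi|^2 \end{pmatrix} \tag{4.6}
\]
and
\[
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.7}
\]
Later in this paper we shall refer specifically to the linearization about a “curve” of bound states $(\Psi_j(t), E_j(t))$ and will denote by $\mathcal{H}_j(t)$ the operator (4.6) with $E$ replaced by $E_j(t)$ and $\Psi$ replaced by $\Psi_j(t)$. Our main focus will be on the operator family
\[
\mathcal{H}_0(t) = \mathcal{H}_{E_0(t), \Psi_0(t)}. \tag{4.8}
\]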
The nonlinear bound state $\Psi$ is linearly spectrally stable if the spectrum of $\mathcal{H}$, $\sigma(\mathcal{H})$, is a subset of the real line (see footnote c). $\Psi$ is linearly dynamically stable if, in an appropriate space, all solutions of the initial value problem for (4.5) are bounded in time. That is, in some norm $e^{-i\mathcal{H}_0 t}$ is a bounded operator. Linear dynamical stability of the ground state $\Psi_0$ follows from [58]. For this result and the necessary stronger dispersive estimates on $e^{-i\mathcal{H}_0 t}$ [10], we require information on the discrete spectrum of $\mathcal{H}_0$ and the corresponding spectral subspaces. Before stating these results we observe that the operator $e^{-i\mathcal{H}_0 t}$ can be expressed in terms of the operator treated explicitly in [58, 10]. To see this, express the ground state as $\Psi_0 = |\Psi_0|e^{i\gamma}$ ($|\Psi_0| > 0$), where $\gamma = \arg\alpha$ is a constant. Set $p = e^{i\gamma}q$. Then, by (4.3) we have:
\[
i\partial_t q = \left(H - E_0 + 2\lambda|\Psi_0|^2\right)q + \lambda\Psi_0^2\,\bar q. \tag{4.9}
\]
Now let $q = u + iv$, where $u$ and $v$ are real. Then,
\[
\partial_t\begin{pmatrix} u \\ v \end{pmatrix} = J\tilde H_0\begin{pmatrix} u \\ v \end{pmatrix}, \tag{4.10}
\]
where
\[
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
\quad\text{and}\quad
\tilde H_0 = \begin{pmatrix} H_0 - E_0 - \lambda|\Psi_0|^2 & 0 \\ 0 & H_0 - E_0 - 3\lambda|\Psi_0|^2 \end{pmatrix}. \tag{4.11}
\]
Note that p(t) = π1 p~(t) = π1 e−iH0 t p~0 ˜ ˜ = eiγ π1 eJ H0 t
(4.12)
where $\pi_1(z_1, z_2) = (z_1, 0)$ and $\pi_2(z_1, z_2) = (0, z_2)$. Therefore, we have:

[Footnote c: The spectrum of $\mathcal{H}$ has the symmetries one expects for Hamiltonian systems. The mappings $\lambda \mapsto -\lambda$ and $\lambda \mapsto \bar\lambda$ send points in the spectrum to points in the spectrum. Note that if $\xi$ is an eigenvector of $\mathcal{H}$ with eigenvalue $\mu$ then $\sigma_1\xi$ is an eigenvector of $\mathcal{H}$ with eigenvalue $-\mu$. Therefore, $\xi_{-\mu} = \sigma_1\xi_\mu$.]
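Proposition 4.1. Estimates on $e^{-i\mathcal{H}_0 t}$ are equivalent to those for $e^{J\tilde H_0 t}$ and are independent of $\gamma$.

We now turn to a detailed discussion of the spectral properties of $\mathcal{H}_0$.

Proposition 4.2. Consider $\mathcal{H}_0$, the linearization about the ground state. Let $\alpha_0$ be sufficiently small.
(i) $\sigma(\mathcal{H}_0)$ is a subset of the real line.
(ii) $\sigma_{\rm discrete}(\mathcal{H}_0) = \{-\mu, 0, \mu\}$, where $0 < \mu < |E_0|$.
(iii) Zero is a generalized eigenvalue of $\mathcal{H}_0$. The generalized null space, $N_g(\mathcal{H}_0)$, is given by
\[
N_g(\mathcal{H}_0) = \mathrm{span}\left\{\sigma_3\begin{pmatrix}\Psi_0 \\ \bar\Psi_0\end{pmatrix},\ \begin{pmatrix}\partial_{E_0}\Psi_0 \\ \partial_{E_0}\bar\Psi_0\end{pmatrix}\right\}. \tag{4.13}
\]
(iv) $\pm\mu$ are simple eigenvalues. We denote their corresponding eigenfunctions by $\xi_\mu$ and $\xi_{-\mu}$. For $|\alpha_0|$ small we have the expansion:
\[
\mu = E_1 - E_0 + O(|\alpha_0|^2) \tag{4.14}
\]
\[
\xi_\mu = \begin{pmatrix} 1 \\ 0 \end{pmatrix}\psi_{1*} + \begin{pmatrix} |\alpha_0|^2 c_1(|\alpha_0|^2) \\ \bar\alpha_0^2 c_2(|\alpha_0|^2) \end{pmatrix} \tag{4.15}
\]
\[
\xi_{-\mu} = \sigma_1\xi_\mu, \tag{4.16}
\]
where $c_1(a)$ and $c_2(a)$ are real analytic functions in $a$.
(v) $\sigma(\mathcal{H}_0) - \sigma_{\rm discrete}(\mathcal{H}_0) = (-\infty, E_0] \cup [-E_0, \infty)$.

The proof of Proposition 4.2 is in Appendix B. Note that if $\omega$ is an eigenvalue of $\mathcal{H}$:
\[
\sigma_3 L g \equiv \mathcal{H}g = \omega g, \tag{4.17}
\]
then $\omega$ is an eigenvalue of $\mathcal{H}^*$ with corresponding eigenfunction $\sigma_3 g$:
\[
L\sigma_3(\sigma_3 g) = \mathcal{H}^*(\sigma_3 g) = \omega\sigma_3 g. \tag{4.18}
\]
Therefore, we have

Proposition 4.3.
(i) $\sigma(\mathcal{H}_0^*) = \sigma(\mathcal{H}_0)$.
(ii)
\[
N_g(\mathcal{H}_0^*) = \mathrm{span}\left\{\begin{pmatrix}\Psi_0 \\ \bar\Psi_0\end{pmatrix},\ \sigma_3\begin{pmatrix}\partial_{E_0}\Psi_0 \\ \partial_{E_0}\bar\Psi_0\end{pmatrix}\right\}. \tag{4.19}
\]
(iii)
\[
N(\mathcal{H}_0^* \mp \mu) = \mathrm{span}\{\sigma_3\xi_\mu, \sigma_3\xi_{-\mu}\} \equiv \{\zeta_\mu, \zeta_{-\mu}\}. \tag{4.20}
\]
Here,
\[
\zeta_\mu = \sigma_3\xi_\mu \quad\text{and}\quad \zeta_{-\mu} = -\sigma_3\xi_{-\mu}, \tag{4.21}
\]
where this choice of $\zeta_{-\mu}$ is taken so that $\langle\zeta_{-\mu}, \xi_{-\mu}\rangle = 1$ in the $|\alpha_0| \downarrow 0$ limit.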
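4.1. Nondegenerate basis for the discrete subspaces of $\mathcal{H}_0$ and $\mathcal{H}_0^*$

For fixed $E \ne E_0$, the basis of $N_g(\mathcal{H}_0^*)$ displayed in (4.19) is a natural basis due to its direct connection to the symmetries of NLS. However, this basis is degenerate and singular in the limit $E \to E_{0*}$, as we shall now see. Consider the basis of $N_g(\mathcal{H}_0)$ displayed in (4.13). Beginning with the first element of this basis, explicitly we have:
\[
\sigma_3\begin{pmatrix}\Psi_0 \\ \bar\Psi_0\end{pmatrix}
= \sigma_3\begin{pmatrix}\alpha_0\psi_0(\cdot\,; |\alpha_0|^2) \\ \bar\alpha_0\psi_0(\cdot\,; |\alpha_0|^2)\end{pmatrix}
= |\alpha_0|\,\sigma_3 G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}F_0, \tag{4.22}
\]
where
\[
\alpha_0 = |\alpha_0|e^{i\gamma_0}, \qquad G_0 = \begin{pmatrix} e^{i\gamma_0} & 0 \\ 0 & e^{-i\gamma_0}\end{pmatrix}, \tag{4.23}
\]
and
\[
F_0 \equiv \psi_0(\cdot\,; |\alpha_0|^2) = \psi_{0*} + |\alpha_0|^2\chi(\cdot, |\alpha_0|^2) = \psi_{0*} + |\alpha_0|^2\chi(\cdot, 0) + \chi_4^{(0)}. \tag{4.24}
\]
Here and subsequently we use the notation $\chi(\cdot, p)$ to denote a generic real-valued localized function of $x$ with smooth dependence on a parameter, $p$, and $\chi_k^{(j)}$ is localized in $x$ and $O(|\alpha_j|^k)$. A nonsingular element of $N_g(\mathcal{H}_0)$ is obtained by dividing out $\rho_0$. We therefore define
\[
\xi_{01} \equiv \sigma_3 G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}F_0(\rho_0^2). \tag{4.25}
\]
We now turn to the second element of the basis displayed in (4.13). First note that
\[
\frac{\partial}{\partial|\alpha_0|}\Psi_0 = e^{i\gamma_0}\frac{\partial}{\partial|\alpha_0|}\left[|\alpha_0|\psi_0(\cdot\,; |\alpha_0|^2)\right] = e^{i\gamma_0}\left[\psi_0 + |\alpha_0|^2\chi(\cdot\,; |\alpha_0|^2)\right].
\]
Differentiation of (3.4) with respect to $|\alpha_0|$ yields:
\[
(H - E)\partial_{|\alpha_0|}\Psi_0 + 2|\Psi_0|^2\partial_{|\alpha_0|}\Psi_0 + \Psi_0^2\,\partial_{|\alpha_0|}\bar\Psi_0 = (\partial_{|\alpha_0|}E)\Psi_0. \tag{4.26}
\]
Taken together with the complex conjugate of (4.26), this yields, after multiplication by $\sigma_3$:
\[
\sigma_3 H G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}F_0' = (\partial_{|\alpha_0|}E)\,\sigma_3 G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}|\alpha_0|\psi_0 = |\alpha_0|\xi_{01}. \tag{4.27}
\]
We define
\[
\xi_{02} \equiv G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}F_0', \tag{4.28}
\]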
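where
\[
F_0' = \frac{\partial}{\partial|\alpha_0|}\left[|\alpha_0|\psi_0(\cdot\,; |\alpha_0|^2)\right] = \psi_{0*} + |\alpha_0|^2\chi(x, |\alpha_0|^2). \tag{4.29}
\]
By the above calculation $\xi_{02}$ lies in the null space of $(\sigma_3 H)^2$. Therefore, the pair of vectors $\xi_{01}$ and $\xi_{02}$ spans $N_g(\mathcal{H}_0)$ and is nonsingular as $E_0 \to E_{0*}$. By a previous remark:
\[
\zeta_{01} \equiv \sigma_3\xi_{02} \quad\text{and}\quad \zeta_{02} \equiv \sigma_3\xi_{01} \tag{4.30}
\]
form a nonsingular basis for $N_g(\mathcal{H}_0^*)$. This choice of basis will facilitate a uniform description of the dynamics in a neighborhood of the origin. The above construction and Proposition 4.2 imply the following basis for the discrete subspaces of $\mathcal{H}_0$ and $\mathcal{H}_0^*$.

Proposition 4.4.
\[
N_g(\mathcal{H}_0) = \mathrm{span}\{\xi_{01}, \xi_{02}\} \tag{4.31}
\]
\[
N_g(\mathcal{H}_0^*) = \mathrm{span}\{\zeta_{01}, \zeta_{02}\} \tag{4.32}
\]
\[
N(\mathcal{H}_0 \mp \mu) = \mathrm{span}\{\xi_\mu, \xi_{-\mu}\} \tag{4.33}
\]
\[
N(\mathcal{H}_0^* \mp \mu) = \mathrm{span}\{\zeta_\mu, \zeta_{-\mu}\} = \{\sigma_3\xi_\mu, -\sigma_3\xi_{-\mu}\} \tag{4.34}
\]
\[
\langle\zeta_a, \xi_b\rangle = C_{ab}\delta_{ab} + O(|\alpha_0|^2), \tag{4.35}
\]
where $a$ and $b$ vary over the set $\{(01), (02), \mu, -\mu\}$. For $\alpha_0$ small we have the expansions
\[
\begin{aligned}
\xi_{01} &= \sigma_3 G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}F_0 = \sigma_3 G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}\left(\psi_{0*} + |\alpha_0|^2\chi_0^{(0)}\right) \\
\xi_{02} &= G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}F_0' = G_0\begin{pmatrix} 1 \\ 1 \end{pmatrix}\left(\psi_{0*} + |\alpha_0|^2\chi_0^{(0)}\right) \\
\xi_\mu &= \begin{pmatrix} 1 \\ 0 \end{pmatrix}\psi_{1*} + |\alpha_0|^2\begin{pmatrix} 1 \\ 0 \end{pmatrix}\chi_0^{(0)} + \bar\alpha_0^2\begin{pmatrix} 0 \\ 1 \end{pmatrix}\chi_0^{(0)}
\end{aligned} \tag{4.36}
\]
and $\xi_{-\mu} = \sigma_1\xi_\mu$.

Finally, we shall find it useful to note that
\[
\zeta_{01} = \sigma_3\xi_{02} = \xi_{01} + |\alpha_0|^2\chi(x; |\alpha_0|^2). \tag{4.37}
\]

4.2. Estimates for the linearized evolution operator

Theorem 4.1 (Linear dynamical stability [58]). Let
\[
M \equiv N_g(\mathcal{H}_0^*)^\perp \cap H^1 \times H^1. \tag{4.38}
\]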
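There exists $C > 0$ such that for any $f \in M$
\[
\|e^{-i\mathcal{H}_0 t}f\|_{H^1} \le C\|f\|_{H^1}. \tag{4.39}
\]
Theorem 4.2 (Dispersive estimates). Let
\[
M_1 \equiv \left[N_g(\mathcal{H}_0^*) \oplus N(\mathcal{H}^* \mp \mu)\right]^\perp \tag{4.40}
\]
and $P_c$ the associated continuous spectral projection. For any $q \ge 2$ there exists $C_{1,q} > 0$ such that
\[
\|e^{-i\mathcal{H}_0 t}P_c f\|_{L^q} \le C_{1,q}\,t^{-\frac{n}{2} + \frac{n}{q}}\|f\|_{L^p}, \tag{4.41}
\]
where $p^{-1} + q^{-1} = 1$.

The $L^1 \to L^\infty$ and, more generally, $L^p \to L^q$ estimates are known in the self-adjoint case [25, 61, 37]. The extension to matrix Hamiltonians with non-self-adjoint off-diagonal part is more complicated. The results of [10, 18, 39, 40] cover Theorem 4.2.

Remark 4.1. We shall in later sections use the notation $M(t)$ and $M_1(t)$ to denote corresponding time-dependent subspaces relative to the time-dependent operator $\mathcal{H}_0^*(t)$.

Theorem 4.3 (Local decay estimate). Let $\sigma$ be sufficiently large. Let $\omega \in \{\nu \in \mathbb{R} : |\nu| > |E_0|\}$, the interior of the continuous spectrum of $\mathcal{H}_0$. Then, for $t > 0$
\[
\|\langle x\rangle^{-\sigma}e^{-i\mathcal{H}_0 t}P_c(\mathcal{H}_0 - \omega - i0)^{-l}\langle x\rangle^{-\sigma}\|_{B(L^2)} \le C\langle t\rangle^{-\frac{3}{2}}, \quad l = 0, 1 \tag{4.42}
\]
\[
\|\langle x\rangle^{-\sigma}\tilde{\mathcal{H}}_0^k e^{-i\mathcal{H}_0 t}P_c\langle x\rangle^{-\sigma}\|_{B(H^{2k}, L^2)} \le C\langle t\rangle^{-\frac{3}{2} - k}, \tag{4.43}
\]
where $\tilde{\mathcal{H}}_0 = \mathcal{H}_0 + E_0\sigma_3$. For $t < 0$, the same estimates hold with $-i0$ replaced by $+i0$ in (4.42).

We now sketch a proof, using that $\mathcal{H}_0 = H^{\rm Diag} + \varepsilon W$, where $\varepsilon$ is small and $W$ is bounded and localized. For the propagator associated with the diagonal part, $e^{-itH^{\rm Diag}}$, we have the dispersive $L^1 \to L^\infty$ estimate with bound $t^{-\frac{3}{2}}$. This implies the bound $\langle t\rangle^{-\frac{3}{2}}$ on the propagator from $L^1 \cap L^2 \to L^2 + L^\infty$, where $\|f\|_{L^2 + L^\infty} = \min\{\|f_1\|_2 + \|f_2\|_\infty,\ f = f_1 + f_2\}$. The corresponding bound for the perturbed propagator is obtained by a bootstrap argument, as follows. Writing the Duhamel formula for $e^{-i\mathcal{H}_0 t}f$, we obtain
\[
\begin{aligned}
\|e^{-i\mathcal{H}_0 t}f\|_{L^2 + L^\infty} \le{}& C\langle t\rangle^{-\frac{3}{2}}\|f\|_{L^1 \cap L^2} + \varepsilon C_W\langle t\rangle^{-\frac{3}{2}}\sup_{0 \le s \le t}\|e^{-i\mathcal{H}_0 s}f\|_{L^2 + L^\infty} \\
&+ \|P_b(H^{\rm Diag})e^{-i\mathcal{H}_0 t}f\|_{L^2 + L^\infty}.
\end{aligned} \tag{4.44}
\]
Let $f = P_c(\mathcal{H}_0)f$. The last term of (4.44) can be rewritten as follows:
\[
\begin{aligned}
P_b(H^{\rm Diag})e^{-i\mathcal{H}_0 t}f &= P_b(H^{\rm Diag})\left(P_c(\mathcal{H}_0) - P_c(H^{\rm Diag})\right)e^{-i\mathcal{H}_0 t}f \\
&= -P_b(H^{\rm Diag})\left(P_b(\mathcal{H}_0) - P_b(H^{\rm Diag})\right)e^{-i\mathcal{H}_0 t}f.
\end{aligned} \tag{4.45}
\]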
Elementary perturbation theory gives that $P_b(\mathcal{H}_0) - P_b(H^{\rm Diag})$ is of order $\varepsilon$ times a projection onto a localized function. Therefore,
\[
\|P_b(H^{\rm Diag})e^{-i\mathcal{H}_0 t}f\|_{L^2 + L^\infty} \le \varepsilon C_W\|e^{-i\mathcal{H}_0 t}P_c(\mathcal{H}_0)f\|_{L^2 + L^\infty}. \tag{4.46}
\]
It follows that
\[
\|e^{-i\mathcal{H}_0 t}f\|_{L^2 + L^\infty} \le C_W\langle t\rangle^{-\frac{3}{2}}\|f\|_{L^1 \cap L^2}. \tag{4.47}
\]
Furthermore, we have the local decay estimate (4.42) with $l = 0$. We now sketch the proof of the case (4.42) for $l = 1$ and (4.43). For ease of presentation we sketch the proof in the context of the Schrödinger operator $H_0 = -\Delta + V$. One has the local decay estimate
\[
\|\langle x\rangle^{-\sigma}e^{-iH_0 t}P_c(H_0)\langle x\rangle^{-\sigma}\|_{B(L^2)} \le C\langle t\rangle^{-\frac{3}{2}}, \tag{4.48}
\]
where $\sigma$ is positive and sufficiently large. The key to the proof is an analysis of the resolvent, $(H_0 - \lambda)^{-1}$, near $\lambda = 0$. One uses the spectral theorem
\[
e^{-itH_0}P_c(H_0) = \int_0^\infty e^{-it\lambda}E'(\lambda)\,d\lambda \tag{4.49}
\]
and an explicit expansion of $E'(\lambda)$, which can be expressed in terms of the imaginary part of the resolvent. The expansion of the resolvent is valid in the space $B(L^2(\langle x\rangle^{\sigma}dx), L^2(\langle x\rangle^{-\sigma}dx))$ for $\sigma$ sufficiently large. Time decay is arbitrarily fast for functions with spectral support away from $\lambda = 0$, while the $t^{-\frac{3}{2}}$ decay results from the behavior near $\lambda = 0$. The analogue of (4.43) is an estimate for $e^{-itH_0}H_0^k P_c(H_0)$, which follows from the expansion of [24] applied to the formula:
\[
e^{-itH_0}H_0^k P_c(H_0) = \int_0^\infty e^{-it\lambda}\lambda^k E'(\lambda)\,d\lambda. \tag{4.50}
\]
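The way the threshold behavior near $\lambda = 0$ produces the $t^{-3/2}$ rate can be seen in a scalar toy model. The sketch below assumes, purely for illustration, a spectral density $\sqrt{\lambda}\,e^{-\lambda}$ in place of $E'(\lambda)$: the square-root behavior mimics a three-dimensional threshold, while the exponential cutoff and the function names are hypothetical and not taken from the text. For this density the integral in (4.49) equals $\Gamma(3/2)(1 + it)^{-3/2}$, which the code compares against a Simpson-rule quadrature.

```python
import cmath
import math


def density_integral_closed_form(t):
    """Closed form of int_0^inf exp(-i*t*lam) * sqrt(lam) * exp(-lam) d(lam)."""
    return math.gamma(1.5) / (1.0 + 1j * t) ** 1.5


def density_integral_simpson(t, s_max=8.0, n=40000):
    """Same integral after substituting lam = s**2 (composite Simpson rule).

    The substitution removes the square-root singularity at lam = 0;
    n must be even and s_max is large enough that exp(-s_max**2) is negligible.
    """
    def integrand(s):
        return 2.0 * s * s * cmath.exp(-(1.0 + 1j * t) * s * s)

    h = s_max / n
    total = integrand(0.0) + integrand(s_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


if __name__ == "__main__":
    for t in (0.0, 5.0, 20.0):
        exact = density_integral_closed_form(t)
        approx = density_integral_simpson(t)
        print(t, abs(exact), abs(exact - approx))
```

Since $|(1 + it)^{-3/2}| = (1 + t^2)^{-3/4}$, the toy integral decays exactly like $t^{-3/2}$ for large $t$, with the rate set entirely by the $\sqrt{\lambda}$ behavior at the threshold.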
Remark 4.2. The $L^p \to L^q$ estimates of Theorem 4.2 are later used to estimate nonlinear terms which do not include spatially localized factors. In the case of small data, which we consider here, it is also possible to use the above weaker $L^1 \cap L^2 \to L^\infty + L^2$ estimates. Indeed, the only difference is that we need to bound the $L^2$ norm (not just the $L^1$ norm) of the nonlinearity. The $L^2$ norm has even faster decay, since we can bound the $L^\infty$ norm in terms of Sobolev norms, which are controlled by the current argument.

Remark 4.3. We also point out that one can prove the $L^1 \to L^\infty$ dispersive estimate by using the $L^1 \cap L^2 \to L^\infty + L^2$ estimate and a cancellation argument which is rather involved [36].

5. Decomposition and Modulation Equations

Consider the nonlinear Schrödinger equation
\[
i\partial_t\phi = H\phi + \lambda|\phi|^2\phi. \tag{5.1}
\]
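In the regime of low energy solutions we decompose the solution of NLS in the following form:
\[
\phi(t) = e^{-i\Theta(t)}\left[\Psi_0(t) + \Psi_1(t) + \phi_2(t)\right]. \tag{5.2}
\]
Here, $\Psi_0(t)$ and $\Psi_1(t)$ represent motion along the ground state and excited state manifolds of equilibria and $\phi_2$ is a decaying correction term, lying in an appropriate dispersive subspace. The phase, $\Theta$, is divided into two parts:
\[
\Theta(t) = \Theta_0(t) + \tilde\Theta(t), \tag{5.3}
\]
where
\[
\Theta_0(t) = \int_0^t E_0(s)\,ds. \tag{5.4}
\]
Thus, $\partial_t\Theta_0(t) = E_0(t)$ is the modulated ground state energy, and $\tilde\Theta(t)$ is a “long range” logarithmic correction, which is to be derived below. We begin by setting
\[
\phi = e^{-i\Theta(t)}\left[\Psi_0(t) + \phi_1\right]. \tag{5.5}
\]
Substituting (5.5) into (5.1) yields the following equation for $\phi_1$:
\[
\begin{aligned}
i\partial_t\phi_1 ={}& (H - E_0(t))\phi_1 + 2\lambda|\Psi_0(t)|^2\phi_1 + \lambda\Psi_0^2(t)\bar\phi_1 - \partial_t\tilde\Theta(t)\phi_1 \\
&+ \left[-\partial_t\tilde\Theta + 2\lambda|\phi_1|^2\right]\Psi_0(t) + \lambda\bar\Psi_0(t)\phi_1^2 + \lambda|\phi_1|^2\phi_1 - i\partial_t\Psi_0(t).
\end{aligned} \tag{5.6}
\]
Next, we decompose $\phi_1$ into a part along the excited state manifold and a correction term:
\[
\phi_1 \equiv \Psi_1(t) + \phi_2. \tag{5.7}
\]
We have, using (5.6),
\[
\begin{aligned}
i\partial_t\phi_2 ={}& (H - E_0(t))\phi_2 + 2\lambda|\Psi_0(t)|^2\phi_2 + \lambda\Psi_0^2(t)\bar\phi_2 - \partial_t\tilde\Theta(t)\phi_2 \\
&- \left[E_{01}(t) + \partial_t\tilde\Theta(t) + i\partial_t\right]\Psi_1(t) + 2\lambda|\Psi_0(t)|^2\Psi_1(t) + \lambda\Psi_0(t)^2\bar\Psi_1(t) \\
&+ \left[-\partial_t\tilde\Theta(t) + 2\lambda|\phi_1|^2\right]\Psi_0(t) + \lambda\bar\Psi_0(t)\phi_1^2 \\
&+ \lambda\left(|\phi_1|^2\phi_1 - |\Psi_1(t)|^2\Psi_1(t)\right) - i\partial_t\Psi_0(t).
\end{aligned} \tag{5.8}
\]
Since equation (5.8) involves $\bar\phi_2$ it is natural to consider the system governing $\phi_2$ and $\bar\phi_2$. Let
\[
\Phi_2 = \begin{pmatrix}\phi_2 \\ \bar\phi_2\end{pmatrix} \tag{5.9}
\]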
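and introduce the matrix linear operator:
\[
\mathcal{H}_0(t) \equiv \sigma_3\begin{pmatrix} H - E_0(t) + 2\lambda|\Psi_0(t)|^2 & \lambda\Psi_0^2(t) \\ \lambda\bar\Psi_0^2(t) & H - E_0(t) + 2\lambda|\Psi_0(t)|^2 \end{pmatrix}. \tag{5.10}
\]
Then we have (recall that $\phi_1 = \Psi_1 + \phi_2$)
\[
i\partial_t\Phi_2 = \mathcal{H}_0(t)\Phi_2 + \mathcal{F} - \left[(E_{01}(t) + \partial_t\tilde\Theta(t))\sigma_3 + i\partial_t\right]\begin{pmatrix}\Psi_1(t) \\ \text{c.c.}\end{pmatrix} - i\partial_t\begin{pmatrix}\Psi_0(t) \\ \text{c.c.}\end{pmatrix}, \tag{5.11}
\]
where
\[
\begin{aligned}
\mathcal{F} \equiv{}& -\partial_t\tilde\Theta(t)\sigma_3\begin{pmatrix}\Psi_0 \\ \text{c.c.}\end{pmatrix} - \partial_t\tilde\Theta(t)\sigma_3\Phi_2 + \lambda\sigma_3\begin{pmatrix} 2|\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1 \\ \text{c.c.}\end{pmatrix} \\
&+ \lambda\sigma_3\begin{pmatrix} 2\Psi_0|\phi_1|^2 + \bar\Psi_0\phi_1^2 \\ \text{c.c.}\end{pmatrix} + \lambda\sigma_3\begin{pmatrix} |\phi_1|^2\phi_1 - |\Psi_1|^2\Psi_1 \\ \text{c.c.}\end{pmatrix}.
\end{aligned} \tag{5.12}
\]

5.1. Modulation equations

Motivated by the results of Theorem 4.2 on dispersive decay, we shall require that
\[
\Phi_2(t) \in M_1(t), \tag{5.13}
\]
where $M_1$ is defined in (4.40). Equivalently,
\[
P_c(\mathcal{H}_0(t))\Phi_2(t) = \Phi_2(t), \tag{5.14}
\]
where $P_c(\mathcal{H}_0(t))$ denotes the continuous spectral projection of $\mathcal{H}_0(t)$. By Proposition 4.3 this imposes four orthogonality conditions on $\Phi_2(t)$:
\[
\langle\sigma_3\xi_a(t), \Phi_2(t)\rangle = 0, \tag{5.15}
\]
where $a \in \{(01), (02), \mu, -\mu\}$. We impose (5.15) at $t = 0$ and now derive modulation equations for the coordinates $\alpha_0(t)$ and $\alpha_1(t)$ ensuring that (5.15) persists for all $t \ne 0$. To derive the modulation equations we first take the inner product of (5.11) with the adjoint vectors $\sigma_3\xi_a$ to obtain the identity:
\[
\begin{aligned}
&\left\langle\sigma_3\xi_a, i\partial_t\begin{pmatrix}\Psi_0 \\ \text{c.c.}\end{pmatrix}\right\rangle + \left\langle\sigma_3\xi_a, \left[(E_{01} + \partial_t\tilde\Theta)\sigma_3 + i\partial_t\right]\begin{pmatrix}\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\quad= \langle\mathcal{H}_0^*(t)\sigma_3\xi_a, \Phi_2\rangle + \langle\sigma_3\xi_a, \mathcal{F}\rangle + i\langle\partial_t(\sigma_3\xi_a), \Phi_2\rangle - i\partial_t\langle\sigma_3\xi_a, \Phi_2\rangle.
\end{aligned} \tag{5.16}
\]
The initial data for NLS is decomposed so that $\langle\sigma_3\xi_a(t), \Phi_2(t)\rangle = 0$ for $t = 0$. In order for this condition to persist for all time it is necessary and sufficient that the last term in (5.16) vanish, or equivalently: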
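Proposition 5.1. The condition that $P_c(\mathcal{H}_0(t))\Phi_2(t) = \Phi_2(t)$ is equivalent to the following modulation equations for the coordinates $\alpha_0$ and $\alpha_1$, which specify the dynamics along the ground state and excited state manifolds of equilibria:
\[
\begin{aligned}
&\langle\sigma_3\xi_a, i\partial_t\vec\Psi_0\rangle + \left\langle\sigma_3\xi_a, \left[\sigma_3(E_{01} + \partial_t\tilde\Theta) + i\partial_t\right]\vec\Psi_1\right\rangle \\
&\quad= -\lambda\left\langle\xi_a, \begin{pmatrix} 2|\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle + \left\langle\xi_a, \left[-\partial_t\tilde\Theta(t) + 2\lambda|\phi_1|^2\right]\begin{pmatrix}\Psi_0 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\qquad+ \lambda\left\langle\xi_a, \begin{pmatrix}\bar\Psi_0\phi_1^2 \\ \text{c.c.}\end{pmatrix}\right\rangle + \lambda\left\langle\xi_a, \begin{pmatrix} |\phi_1|^2\phi_1 - |\Psi_1|^2\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\qquad- \partial_t\tilde\Theta(t)\langle\xi_a, \Phi_2\rangle + i\langle\partial_t(\sigma_3\xi_a), \Phi_2\rangle,
\end{aligned} \tag{5.17}
\]
where $a \in \{(01), (02), \mu, -\mu\}$ and $\vec\Psi_j \equiv (\Psi_j, \bar\Psi_j)$, $j = 0, 1$.

Remark 5.1. (i) Note that the term $\langle\mathcal{H}_0^*(t)\sigma_3\xi_a, \Phi_2\rangle$ in (5.16) vanishes by the orthogonality constraint and because $\mathcal{H}_0^*(t)$ maps the discrete subspace into itself. It therefore does not appear in (5.17).
(ii) The last term in (5.17) is present due to the time-dependence of the eigenvectors $\xi_a(t)$. An important simplification of this “commutator term”, which we require for $a = (01)$, is carried out in Appendix C.

Initial data for the system (5.17), (5.11), governing $\alpha_0$, $\alpha_1$ and $\Phi_2$, are obtained as follows. Given data $\phi_0$ for NLS, we find $\alpha_0$ so as to minimize
\[
\|\phi_0 - \Psi_{\alpha_0}\|_2; \tag{5.18}
\]
see [44]. This $\alpha_0$ is used to define the initial Hamiltonian $\mathcal{H}_0(0)$. Now decompose $\phi_0$ using the biorthogonal decomposition associated with $\mathcal{H}_0(0)$. This specifies $\alpha_0(0)$, $\alpha_1(0)$ and $\Phi_2(0)$.

5.2. Conservation laws and a priori bounds

In this subsection we obtain bounds on $\alpha_0$, $\alpha_1$ and $\phi_2$ using the conservation laws of NLS, noted in the introduction. By the $L^2$ conservation law, $\mathcal{N}[\phi] = \mathcal{N}[\phi_0]$, we have
\[
\begin{aligned}
h(t) &\equiv \int\left(|\Psi_0(t)|^2 + |\Psi_1(t)|^2 + |\phi_2(t, x)|^2\right)dx \\
&= \mathcal{N}[\phi_0] - 2\Re\int\Psi_0\bar\Psi_1 - 2\Re\int\bar\Psi_0\phi_2 - 2\Re\int\bar\Psi_1\phi_2.
\end{aligned} \tag{5.19}
\]
The last three terms in (5.19) can be estimated using the following:
• $\int\Psi_0\bar\Psi_1 = O(|\alpha_0||\alpha_1|^2) + O(|\alpha_0|^2|\alpha_1|) = O(h(t)^{\frac{3}{2}})$.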
• Orthogonality relation (5.15) with $a = (01)$, $\langle\sigma_3\xi_{01}, \Phi_2\rangle = 0$, or equivalently $\Re\int\bar\Psi_0\phi_2 = 0$.
• Orthogonality relation (5.15) with $a = \mu$, or $\langle\sigma_3\xi_\mu, \Phi_2\rangle = 0$, implies $\Re\int\bar\Psi_1\phi_2 = O\left((|\alpha_1|^3 + |\alpha_1||\alpha_0|^2)\|\phi_2\|_2\right) = O(h(t)^2)$.

Therefore,
\[
\left[1 + O(h(t)^{\frac{1}{2}})\right]h(t) \le \mathcal{N}[\phi_0]. \tag{5.20}
\]
By continuity of $h(t)$, if $\mathcal{N}[\phi_0]$ is sufficiently small,
\[
h(t) \le C\mathcal{N}[\phi_0]. \tag{5.21}
\]
Furthermore, the conservation law $\mathcal{H}_{en}[\phi(t)] = \mathcal{H}_{en}[\phi_0]$ can be used to prove an $H^1$ bound on $\phi_2(t)$, provided $\|\phi_0\|_{H^1}$ is sufficiently small. In particular, substituting the decomposition (5.2) into the conserved functional $\mathcal{H}_{en}[\phi(t)]$, we find after integration by parts and interpolation of the $L^4$ term in $\mathcal{H}_{en}$:
\[
\begin{aligned}
\|\nabla\phi_2(t)\|_{L^2}^2 &\le E_0 + \int|\phi_2|\left(|\Delta\Psi_1| + |\Delta\Psi_2|\right) + \int|\nabla\Psi_1||\nabla\Psi_2| + \|V\|_\infty\|\phi_0\|_{L^2}^2 + \frac{C}{2}\|\phi_0\|_{L^2}\|\nabla\phi(t)\|_2^3 \\
&\le CE_0 + O(h(t)) + O(h(t)^{\frac{1}{2}})\|\nabla\phi_2(t)\|_{L^2}^3.
\end{aligned}
\]
Therefore, using the bound (5.21) and continuity in time, we have that if $\|\phi_0\|_{H^1}$ is sufficiently small,
\[
\int|\nabla\phi_2(t, x)|^2\,dx \le CE_0. \tag{5.22}
\]

6. Toward a Normal Form: Algebraic Reductions and Frequency Analysis
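We now embark on a detailed calculation leading to a form of this system which, though equivalent, is one to which normal form methods can be easily applied.

6.1. Modulation equations

Equations (5.17) are a coupled system for $\alpha_0$, $\alpha_1$ and their complex conjugates. It is natural to write the system as one which is nearly diagonal. This can be done by taking appropriate linear combinations of the equations in (5.17). The equation which essentially determines $\alpha_0$ can be found by adding the two equations obtained from (5.17) by setting $a = (01)$ and $a = (02)$. This gives:
\[
\begin{aligned}
&\langle\sigma_3(\xi_{01} + \xi_{02}), i\partial_t\vec\Psi_0\rangle + \left\langle\sigma_3(\xi_{01} + \xi_{02}), \left[\sigma_3(E_{01} + \partial_t\tilde\Theta) + i\partial_t\right]\vec\Psi_1\right\rangle \\
&\quad= -\lambda\left\langle\xi_{01} + \xi_{02}, \begin{pmatrix} 2|\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle + \left\langle\xi_{01} + \xi_{02}, \left[-\partial_t\tilde\Theta(t) + 2\lambda|\phi_1|^2\right]\begin{pmatrix}\Psi_0 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\qquad+ \lambda\left\langle\xi_{01} + \xi_{02}, \begin{pmatrix}\bar\Psi_0\phi_1^2 \\ \text{c.c.}\end{pmatrix}\right\rangle + \lambda\left\langle\xi_{01} + \xi_{02}, \begin{pmatrix} |\phi_1|^2\phi_1 - |\Psi_1|^2\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\qquad- \partial_t\tilde\Theta(t)\langle\xi_{01} + \xi_{02}, \Phi_2\rangle + i\langle\partial_t(\sigma_3(\xi_{01} + \xi_{02})), \Phi_2\rangle.
\end{aligned} \tag{6.1}
\]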
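The difference of the $a = (01)$ and $a = (02)$ equations is the complex conjugate of Eq. (6.1). The equation which essentially determines $\alpha_1$ is Eq. (5.17) with $a = \mu$:
\[
\begin{aligned}
&\langle\sigma_3\xi_\mu, i\partial_t\vec\Psi_0\rangle + \left\langle\sigma_3\xi_\mu, \left[\sigma_3(E_{01} + \partial_t\tilde\Theta) + i\partial_t\right]\vec\Psi_1\right\rangle \\
&\quad= -\lambda\left\langle\xi_\mu, \begin{pmatrix} 2|\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle + \left\langle\xi_\mu, \left[-\partial_t\tilde\Theta(t) + 2\lambda|\phi_1|^2\right]\begin{pmatrix}\Psi_0 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\qquad+ \lambda\left\langle\xi_\mu, \begin{pmatrix}\bar\Psi_0\phi_1^2 \\ \text{c.c.}\end{pmatrix}\right\rangle + \lambda\left\langle\xi_\mu, \begin{pmatrix} |\phi_1|^2\phi_1 - |\Psi_1|^2\Psi_1 \\ \text{c.c.}\end{pmatrix}\right\rangle \\
&\qquad- \partial_t\tilde\Theta(t)\langle\xi_\mu, \Phi_2\rangle + i\langle\partial_t(\sigma_3\xi_\mu), \Phi_2\rangle.
\end{aligned} \tag{6.2}
\]
The equation corresponding to $a = -\mu$ is the complex conjugate of this equation. Since
\[
\partial_t\begin{pmatrix}\Psi_k \\ \bar\Psi_k\end{pmatrix} = \begin{pmatrix}\partial_k\Psi_k \\ \partial_k\bar\Psi_k\end{pmatrix}\partial_t\alpha_k + \begin{pmatrix}\partial_{\bar k}\Psi_k \\ \partial_{\bar k}\bar\Psi_k\end{pmatrix}\partial_t\bar\alpha_k, \tag{6.3}
\]
we see that the equations for $\vec\alpha_j = (\alpha_j, \bar\alpha_j)^T$, $j = 0, 1$, can be expressed in the form:
\[
iM_{00}\,\partial_t\begin{pmatrix}\alpha_0 \\ \bar\alpha_0\end{pmatrix} + iM_{01}\,\partial_t\begin{pmatrix}\alpha_1 \\ \bar\alpha_1\end{pmatrix} + (E_{01} + \partial_t\tilde\Theta)N_{01}\begin{pmatrix}\alpha_1 \\ \bar\alpha_1\end{pmatrix} = F_0 \tag{6.4}
\]
\[
iM_{10}\,\partial_t\begin{pmatrix}\alpha_0 \\ \bar\alpha_0\end{pmatrix} + iM_{11}\,\partial_t\begin{pmatrix}\alpha_1 \\ \bar\alpha_1\end{pmatrix} + (E_{01} + \partial_t\tilde\Theta)N_{11}\begin{pmatrix}\alpha_1 \\ \bar\alpha_1\end{pmatrix} = F_1. \tag{6.5}
\]
Here, for $k = 0, 1$:
M0k = G0 * N01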
= G0
σ3 (ξ01 + ξ02 ),
σ3 (ξ01 + ξ02 ),
σ3 (ξ01 + ξ02 ), σ3 (ξ01 + ξ02 ),
∂k Ψk ∂k Ψk ∂k Ψk
!+
∂k Ψk ψ1
0 0
ψ1
σ3 (ξ01 + ξ02 ),
σ3 (ξ01 + ξ02 ),
∂k Ψk ∂k Ψk
∂k Ψk
∂k Ψk 0 σ3 (ξ01 + ξ02 ), ψ1 ψ1 σ3 (ξ01 + ξ02 ), 0
(6.6)
(6.7)
M1k = G0 *
N11 = G0
σ3 ξ µ ,
σ3 ξ µ ,
∂k Ψk ∂k Ψk
σ3 ξ µ ,
σ3 ξ µ ,
∂k Ψk
!+
σ3 ξ µ ,
σ3 ξ µ ,
∂k Ψk ∂k Ψk
, ∂k Ψk
(6.8)
∂k Ψk 0 σ3 ξ µ , ψ1 . ψ1 σ3 ξ µ , 0
∂k Ψk ψ1
0 0
ψ1
(6.9)
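6.2. Algebraic reductions and determination of $\tilde\Theta(t)$

To express the modulation equations (5.17) in a tractable form we shall make use of a number of notations and relations which we now list for convenience; see also Appendix A.

$\chi_k^{(j)}$ denotes a spatially localized function of order $|\alpha_j|^k$, as $|\alpha_j| \to 0$.
$O_k^{(j)}$ denotes a quantity which is of order $|\alpha_j|^k$ as $|\alpha_j| \to 0$. Both $\chi_k^{(j)}$ and $O_k^{(j)}$ are invariant under the map $\alpha_j \mapsto \alpha_j e^{i\gamma}$.
\[
\begin{aligned}
&O_k^{(0,1)} = O_{k_1}^{(0)}O_{k_2}^{(1)}, \quad k = k_1 + k_2 \\
&\phi_1 = \Psi_1 + \phi_2 \\
&\alpha_0 = |\alpha_0|e^{i\gamma_0}, \qquad \bar\alpha_0 e^{2i\gamma_0} = \alpha_0 \\
&\left(\partial_k\Psi_k,\ \partial_{\bar k}\Psi_k\right) = \left(\psi_{k*} + |\alpha_k|^2\chi_0^{(k)},\ \alpha_k^2\chi_0^{(k)}\right), \quad k = 0, 1 \\
&\xi_{01} + \xi_{02} = G_0\begin{pmatrix} F_0 + F_0' \\ F_0 - F_0'\end{pmatrix} = \begin{pmatrix} 2e^{i\gamma_0}(\psi_{0*} + \chi_2^{(0)}) \\ e^{-i\gamma_0}\chi_2^{(0)}\end{pmatrix} \\
&\xi_\mu = \begin{pmatrix}\psi_{1*} + \chi_2^{(0)} \\ \bar\alpha_0^2\chi_0^{(0)}\end{pmatrix} \\
&F_0 + F_0' = 2\psi_{0*} + |\alpha_0|^2\chi(\cdot\,; |\alpha_0|^2) = 2(\psi_{0*} + \chi_2^{(0)}) \\
&F_0 - F_0' = |\alpha_0|^2\chi(\cdot\,; |\alpha_0|^2) = \chi_2^{(0)} \\
&\langle\psi_{0*}, \psi_0^{(1)}(\cdot\,; 0)\rangle = 0.
\end{aligned} \tag{6.10}
\]
Using (6.10) in (6.1) we get:
\[
\begin{aligned}
&2i(1 + O_4^{(0)})\partial_t\alpha_0 + i\alpha_0^2 O_0^{(0)}\partial_t\bar\alpha_0 + \left\langle\sigma_3(\xi_{01} + \xi_{02}), \left[\sigma_3(E_{01} + \partial_t\tilde\Theta) + i\partial_t\right]\vec\Psi_1\right\rangle \\
&\quad= -2\left[\partial_t\tilde\Theta\langle F_0', \Psi_0\rangle - 2\lambda\langle F_0', |\phi_1|^2\Psi_0\rangle\right] \\
&\qquad+ \langle F_0 + F_0', 2\lambda|\Psi_0|^2\Psi_1 + \lambda\Psi_0^2\bar\Psi_1 + \lambda|\phi_1|^2\phi_1 - \lambda|\Psi_1|^2\Psi_1\rangle \\
&\qquad- e^{2i\gamma_0}\langle F_0 - F_0', \text{c.c.}\rangle + ie^{2i\gamma_0}\langle\partial_t[\sigma_3(\xi_{01} + \xi_{02})], \Phi_2\rangle.
\end{aligned} \tag{6.11}
\]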
6.2.1. Determination of $\tilde\Theta(t)$

We anticipate that generically $|\phi_1| \sim |\alpha_1| \sim t^{-\frac{1}{2}}$ for $t$ very large. $\alpha_0$ will have a limit as $t \to \pm\infty$ if $\partial_t\alpha_0$ is integrable. We ensure this by choosing $\partial_t\tilde\Theta$ to cancel the terms which are of order $t^{-1}$ and non-oscillatory. Thus, we choose $\tilde\Theta$ to satisfy
\[
\partial_t\tilde\Theta\langle F_0', \Psi_0\rangle - 2\lambda\langle F_0', |\phi_1|^2\Psi_0\rangle = 0. \tag{6.12}
\]
To leading order this gives:
\[
\partial_t\tilde\Theta \sim 2\lambda\langle\psi_{0*}^2, |\phi_1|^2\rangle. \tag{6.13}
\]
In this way, a logarithmic correction to the standard phase, $\int_0^t E_0(s)\,ds$, arises. Equations (6.11) and (6.12) together with Proposition C.1 imply
(0)
(0)
˜ 01 + ξ02 , Ψ ~ 1i 2i(1 + O4 )∂t α0 + iα20 O0 ∂t α0 + (E01 + ∂t Θ)hξ ~ 1i + hσ3 (ξ01 + ξ02 ), i∂t Ψ ˜ (0) , φ2 i + ∂t Θα ˜ 2 hχ(0) , φ2 i = −∂t Θhχ 0 0 0 (2)
(0)
+ λO0 (|α0 |2 α1 + α20 α1 ) + λh2ψ0∗ + χ0 , |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i (0) (0) (0) − λα20 hχ0 , |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i − ∂t |α0 |2 hχ0 , φ2 i + e2iγ0 hχ0 , φ2 i (0) (0) (6.14) + i|α0 |2 ∂t γ0 hχ0 , φ2 i + e2iγ0 hχ0 , φ2 i .
We now turn to Eq. (6.2). Using (6.10), Eq. (6.2) can be written as: (0,1)
(1 + O2
)i∂t α1 + (O(α20 ) + O(α21 ))i∂t α1 (0,1)
+ ((1 + O2
(0,1)
)α1 + α20 α1 O2
˜ + hσ3 ξµ , i∂t Ψ ~ 0i )(E01 + ∂t Θ) (0,1)
˜ ˜ = −∂t Θ(hχ, φ2 i + α20 hχ, φ2 i) − ∂t ΘO 0 (0,1)
− λO2
α0
(0)
(|α0 |2 α1 + α20 α1 ) + λO0 α0 hχ, |φ1 |2 i 2
+ λα0 hχ, φ21 i + λα30 hχ, φ1 i + λhχ, |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + λα20 hχ, |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + ih∂t (σ3 ξµ ), Φ2 i .
(6.15)
We next write the systems for α ~ 0 and α ~ 1. ˜ 3 N01 α iM00 ∂t α ~ 0 + iM01 ∂t α ~ 1 + (E01 + ∂t Θ)σ ~1 |α0 |2 α1 + α20 α1 (0) = λO0 σ3 c.c. (2) (0) hψ0∗ + χ0 , |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + α20 hχ0 , |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + λσ3 c.c.
˜ − ∂t Θ
(0)
(0)
hχ0 , φ2 i + α20 hχ0 , φ2 i
2
− ∂t |α0 | σ3
c.c.
+ i|α0 |2 ∂t γ0
(0) hχ0 , φ2 i
(0)
+ e2iγ0 hχ0 , φ2 i c.c.
(0) hχ0 , φ2 i
(0)
+ e2iγ0 hχ0 , φ2 i c.c.
(6.16)
˜ 3 N11 α iM11∂t α ~ 1 + iM10 ∂t α ~ 0 + (E01 + ∂t Θ)σ ~1 2 2 |α0 | α1 + α0 α1 (0,1) = λO0 σ3 c.c. hχ, |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + α20 hχ, |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + λσ3 c.c. 2 ! α0 hχ, |φ1 |2 i + α0 hχ, φ21 i + α30 hχ, φ1 i + λσ3 c.c. h∂t (σ3 ξµ ), Φ2 i hχ, φ2 i + α20 hχ, φ2 i ˜ σ3 +i . − ∂t Θ c.c. c.c.
(6.17)
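The matrices $M_{jk}$, $N_{j1}$, $j, k = 0, 1$, are displayed in Eqs. (6.6)–(6.9). For the (generic) case where we expect $|\alpha_0|$ to approach a nonzero limit and $\alpha_1$ and $\phi_2$ to decay to zero, we shall use the expansion:
\[
M_{00} = \begin{pmatrix} 2(1 + O_4^{(0)}) & \alpha_0^2 O_0^{(0)} \\ \bar\alpha_0^2 O_0^{(0)} & 2(1 + O_4^{(0)})\end{pmatrix}, \qquad
M_{11} = \begin{pmatrix} 1 + O_2^{(0,1)} & O(\alpha_0^2) + O(\alpha_1^2) \\ O(\bar\alpha_0^2) + O(\bar\alpha_1^2) & 1 + O_2^{(0)}\end{pmatrix}. \tag{6.18}
\]
The matrices $M_{10}$ and $M_{01}$ are higher order in $\alpha_0$ and satisfy:
\[
M_{10},\ M_{01} = \begin{pmatrix} O_2^{(0)} & \alpha_0^2 O_0^{(0)} \\ \bar\alpha_0^2 O_0^{(0)} & O_2^{(0)}\end{pmatrix}.
\]
The matrices $N_{01}$ and $N_{11}$ satisfy:
(0)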
N01 =
N11 =
(1)
O2 + O 2
(0)
(0)
(1)
(0)
α20 O0
(0)
α0 2 (O2 + O2 ) 1 + O2
(0,1)
α0 2 O 0
(1)
α20 (O2 + O2 ) (1)
O2 + O 2 ! (0,1) (0)
1 + O2
.
!
(6.19)
6.2.2. Simplifications to Eqs. (6.1) and (6.2)

(1) Since $M_{10}$ and $M_{01}$ are higher order in $\alpha_0$ we can eliminate $\partial_t\alpha_1$ from (6.16) and $\partial_t\alpha_0$ from (6.17).
(2) Note also that “commutator terms” with factors like $\partial_t|\alpha_0|^2$ or $\partial_t\alpha_0^2$ can be eliminated via redefinition of the near-identity matrix $M_{00}$ through incorporation of a higher-order correction.
(3) We can eliminate the term proportional to $|\alpha_0|^2\partial_t\gamma_0$ as follows. Consider the last two terms of (6.16). Since $\partial_t|\alpha_0|^2 = \bar\alpha_0\partial_t\alpha_0 + \alpha_0\partial_t\bar\alpha_0$, we can incorporate the second to last term of (6.16) as a higher-order correction to the near-identity matrix $M_{00}$, yielding $\widetilde{M}_{00}$. Our goal is now to eliminate $\partial_t\gamma_0$ from the equation. The system can now be written as:
\[
i\partial_t\vec\alpha_0 = \widetilde{M}_{00}^{-1}\left[\cdots + iO(|\alpha_0|^2)\partial_t\gamma_0\right]. \tag{6.20}
\]
The first component has the form:
\[
i\partial_t\alpha_0 = \text{1st component of the vector: } \widetilde{M}_{00}^{-1}(\cdots) + iO(|\alpha_0|^2)\,\partial_t\gamma_0\,\widetilde{M}_{00}^{-1}. \tag{6.21}
\]
Since $\alpha_0 = |\alpha_0|e^{i\gamma_0}$ we have
\[
i\partial_t|\alpha_0| - |\alpha_0|\partial_t\gamma_0 = e^{-i\gamma_0} \times \text{1st component of the vector: } \widetilde{M}_{00}^{-1}(\cdots) + iO(|\alpha_0|)\,|\alpha_0|\partial_t\gamma_0\,\widetilde{M}_{00}^{-1}. \tag{6.22}
\]
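The left-hand side of (6.22) rests on the elementary identity $i e^{-i\gamma_0}\partial_t\alpha_0 = i\partial_t|\alpha_0| - |\alpha_0|\partial_t\gamma_0$ for $\alpha_0 = |\alpha_0|e^{i\gamma_0}$, whose real part isolates $-|\alpha_0|\partial_t\gamma_0$. The following sketch checks this identity with a central finite difference; the particular modulus and phase functions, and the name `polar_identity_residual`, are arbitrary choices made for illustration and are not taken from the text.

```python
import cmath
import math


def polar_identity_residual(t, h=1e-5):
    """Return |i*exp(-i*g)*d(alpha)/dt - (i*dr/dt - r*dg/dt)| at time t.

    alpha(t) = r(t) * exp(i*g(t)) with an assumed smooth modulus r and
    phase g; d(alpha)/dt is approximated by a central difference.
    """
    def r(s):
        return 1.0 + 0.3 * math.sin(s)       # assumed modulus |alpha_0|(s)

    def g(s):
        return 0.5 * s + 0.1 * math.cos(s)   # assumed phase gamma_0(s)

    def alpha(s):
        return r(s) * cmath.exp(1j * g(s))

    d_alpha = (alpha(t + h) - alpha(t - h)) / (2.0 * h)
    d_r = 0.3 * math.cos(t)                  # exact derivative of r
    d_g = 0.5 - 0.1 * math.sin(t)            # exact derivative of g
    lhs = 1j * d_alpha * cmath.exp(-1j * g(t))
    rhs = 1j * d_r - r(t) * d_g
    return abs(lhs - rhs)


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        print(t, polar_identity_residual(t))
```

The residual is at the level of the finite-difference error, consistent with reading off the phase equation from the real part and the modulus equation from the imaginary part.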
By taking the real part of (6.22), for |α0 | small, we can solve for |α0 |∂t γ0 . This enables us to eliminate it from Eq. (6.16) as a higher-order term. Implementation of these simplifications leads to the following Proposition 6.1. # ˜ 3 N01 α ~1 i∂t α ~ 0 = M00 − (E01 + ∂t Θ)σ (0)
+ λO0 σ3 + λσ3
˜ − ∂t Θ
|α0 |2 α1 + α20 α1 c.c.
(2)
(0)
hψ0∗ + χ0 , |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + α20 hχ0 , |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i c.c.
(0) hχ0 , φ2 i
+
(0) α20 hχ0 , φ2 i
c.c.
i∂t α ~ 1 = A(t)~ α1 hχ, |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i + α20 hχ, |φ1 |2 φ1 − |Ψ1 |2 Ψ1 i λσ + M# 3 11 c.c.
(6.23)
2
+ λσ3
α0 hχ, |φ1 |2 i + α0 hχ, φ21 i + α30 hχ, φ1 i c.c.
˜ σ3 − ∂t Θ Here,
hχ, φ2 i + α20 hχ, φ2 i c.c.
(0,1)
E10 + O0
A=
(0,1)
−O0
!
.
(6.24)
(0,1)
|α0 |2
O0
α0 2
α20
(0,1)
−E10 − O0
|α0 |2
!
\[
A_{22} = -A_{11}, \qquad A_{21} = -\bar A_{12}, \tag{6.25}
\]
and $\phi_1 = \Psi_1 + \phi_2$. $M_{00}^{\#}$ and $M_{11}^{\#}$ are near-identity matrices, whose deviations from the identity give rise to higher-order terms which are subordinate to the leading order behavior obtained in the analysis which follows. Finally, Eqs. (6.23) and (6.24) are coupled to the equation for $\Phi_2$ given by (5.11).

6.3. Peeling off the rapid oscillations of $\alpha_1$

Fix $T > 0$ and large. We rewrite (6.24) centered about the ground state at time $t = T$:
\[
i\partial_t\vec\alpha_1 = A(T)\vec\alpha_1 + (A(t) - A(T))\vec\alpha_1 + \vec F, \tag{6.26}
\]
where
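\[
A(T) = \begin{pmatrix} E_{10}(T) + O_0^{(0,1)}(T)|\alpha_0(T)|^2 & O_0^{(0,1)}(T)\alpha_0^2(T) \\ -O_0^{(0,1)}(T)\bar\alpha_0^2(T) & -E_{10}(T) - O_0^{(0,1)}(T)|\alpha_0(T)|^2 \end{pmatrix}. \tag{6.27}
\]
Since $A(T)$ is a constant coefficient matrix it is a simple matter to obtain the fundamental matrix.

Proposition 6.2. The system
\[
i\partial_t\vec\alpha_1 = A(T)\vec\alpha_1 \tag{6.28}
\]
has a fundamental solution matrix:
\[
\begin{aligned}
X(t) &= \begin{pmatrix} c_{11}^+ & c_{12}^- \\ c_{21}^+ & c_{22}^- \end{pmatrix}\begin{pmatrix} e^{-i\lambda_+ t} & 0 \\ 0 & e^{-i\lambda_- t}\end{pmatrix} \\
&= \begin{pmatrix} 1 & O_0^{(0,1)}(T)\alpha_0(T)^2 E_{10}^{-1}(T) \\ O_0^{(0,1)}(T)\bar\alpha_0(T)^2 E_{10}^{-1}(T) & 1 \end{pmatrix}\begin{pmatrix} e^{-i\lambda_+ t} & 0 \\ 0 & e^{-i\lambda_- t}\end{pmatrix}.
\end{aligned} \tag{6.29}
\]
The eigenfrequencies, $\lambda_\pm(T)$, are given by $\lambda_+(T)$ and $\lambda_-(T) = -\lambda_+(T)$, where:
\[
\lambda_+(T) = E_{10}(T) + O_0^{(0,1)}|\alpha_0(T)|^2, \tag{6.30}
\]
where $E_{10} = E_1 - E_0$, and provided $|\alpha_0(T)|^2/E_{10}(T)$ is sufficiently small.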
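The structure behind Proposition 6.2 can be checked on a scalar stand-in for $A(T)$. The sketch below assumes a matrix of the form $\begin{pmatrix} a & d \\ -\bar d & -a\end{pmatrix}$ with real $a = E_{10} + c$; the numbers chosen for $c$ and $d$, and the function names, are hypothetical placeholders for the unspecified $O_0^{(0,1)}$ coefficients. Such a matrix has trace zero and determinant $-a^2 + |d|^2$, so its eigenvalues are $\pm\sqrt{a^2 - |d|^2}$, and the eigenvector for $\lambda_+$ has a second component of size $|d|/E_{10}$.

```python
import cmath


def eigen_frequencies(e10, c, d):
    """Eigenvalues of A = [[e10 + c, d], [-conj(d), -(e10 + c)]] (c real)."""
    a = e10 + c
    lam = cmath.sqrt(a * a - abs(d) ** 2)
    return lam, -lam


def eigenvector_plus(e10, c, d):
    """Eigenvector (1, v) of A for lambda_+, with v = -conj(d) / (lambda_+ + a)."""
    a = e10 + c
    lam, _ = eigen_frequencies(e10, c, d)
    return 1.0, -d.conjugate() / (lam + a)


if __name__ == "__main__":
    alpha0 = 0.1 * cmath.exp(0.7j)           # assumed small ground-state amplitude
    e10 = 1.0                                # assumed spectral gap E1 - E0
    c = 0.3 * abs(alpha0) ** 2               # stands in for O(1) * |alpha0|^2
    d = 0.2 * alpha0 ** 2                    # stands in for O(1) * alpha0^2
    lam_plus, lam_minus = eigen_frequencies(e10, c, d)
    print(lam_plus, lam_minus, eigenvector_plus(e10, c, d))
```

For these values $\lambda_+$ is real and differs from $E_{10}$ by an amount of order $|\alpha_0|^2$, in line with (6.30), and the off-diagonal entry of the eigenvector matrix scales like $\bar\alpha_0^2/E_{10}$, in line with (6.29).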
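We use the fundamental matrix, $X(t)$, to define a change of variables:
\[
\vec\alpha_1 \equiv X(t)\vec\beta = X(t)\begin{pmatrix}\beta_1 \\ \bar\beta_1\end{pmatrix}. \tag{6.31}
\]
Therefore,
\[
\alpha_1 = e^{-i\lambda_+ t}\beta_1 + O_0^{(0,1)}\alpha_0^2(T)E_{10}^{-1}(T)e^{-i\lambda_- t}\bar\beta_1. \tag{6.32}
\]
Then, $\vec\beta$ satisfies
\[
i\partial_t\vec\beta = X^{-1}(t)\vec F(X(t)\vec\beta(t), t). \tag{6.33}
\]
Note that since the terms linear in $\alpha_1$ have been removed by the change of variables (6.31), $\partial_t\vec\beta_1 \sim O(|\beta_1|^2)$. Note that by (6.29)
\[
X^{-1}(t) = \begin{pmatrix} e^{i\lambda_+ t} & 0 \\ 0 & e^{i\lambda_- t}\end{pmatrix}\begin{pmatrix} 1 + O_0^{(0,1)}(T)|\alpha_0(T)|^4 E_{10}^{-2}(T) & -O_0^{(0,1)}(T)\alpha_0^2(T)E_{10}^{-1}(T) \\ -O_0^{(0,1)}(T)\bar\alpha_0(T)^2 E_{10}^{-1}(T) & 1 + O_0^{(0,1)}(T)|\alpha_0(T)|^2 E_{10}^{-2}(T)\end{pmatrix}
\]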
(6.34) Written out in detail, from (6.33) and the various definitions, we have Proposition 6.3. The equation for β1 has the form i∂t β1 = λeiλ+ t hχ0 , |e−iλ+ t β1 ψ1∗ + φ2 |2 iα0 + λeiλ+ t hχ0 , (e−iλ+ t β1 ψ1∗ + φ2 )2 iα0 + λeiλ+ t hχ0 , (eiλ+ t β1 ψ1∗ + φ2 )2 iα30
3 + λ χ0 , |e−iλ+ t β1 ψ1∗ + φ2 |2 (e−iλ+ t β1 ψ1∗ + φ2 ) − e−iλ+ t |β1 |2 β1 ψ1∗ + Rβ ,
(6.35)
where
\[
R_\beta = X^{-1}(t)\left[(A(t) - A(T))X(t)\vec\beta_1 + R_1\right]. \tag{6.36}
\]
Remark 6.1.
\[
X^{-1}(t)[A(t) - A(T)]X(t) = \text{Real Symmetric Diagonal } S(t) + e^{-2i\lambda_+ t}B(t). \tag{6.37}
\]
Therefore, the non-oscillatory part involving $S(t)$ does not affect the evolution of $|\beta_1|^2$.
6.4. Expansion of φ2

The next step is to get an appropriate expansion of $\phi_2$, which upon substitution into (6.35) can be used to isolate the key resonant terms in the $\beta_1$ equation. The analogous steps are then repeated for the $\alpha_0$ equation. Finally, a near-identity change of variables is constructed which maps the system for $\alpha_0$ and $\beta_1$ to a new system (a normal form plus corrections) for which the dynamical behavior is more transparent.

$\Phi_2$ solves Eq. (5.11). We shall require dispersive decay estimates for $\Phi_2$ and these are most naturally obtained relative to a time-independent Hamiltonian. Since $\alpha_0(t)$ is expected to tend to a limit as $t\to\infty$ and since we are fixing a time interval $[0,T]$, it is natural to use as reference Hamiltonian the operator $H_0(T)$. We now make use of the linear spectral theory of Sec. 3 and decompose $\Phi_2$ into a part lying in the "discrete subspace of $H_0(T)$":
\[ N_g(H_0(T)) \oplus N(H_0(T)-\mu(T)) \oplus N(H_0(T)+\mu(T)) \tag{6.38} \]
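and a part lying in $M_1(T)$, the "dispersive subspace of $H_0(T)$"; see (4.40). Let [3, 11]
\[ \Phi_2 = k + n + \eta, \tag{6.39} \]
where
\[ k \equiv \sum_{\xi_a\in N_g(H_0(T))}\langle\sigma_3\xi_a(T),\Phi_2\rangle\,\xi_a, \tag{6.40} \]
\[ n \equiv \sum_{\xi_b\in N(H_0(T)\mp\mu(T))}\langle\sigma_3\xi_b(T),\Phi_2\rangle\,\xi_b, \tag{6.41} \]
\[ \eta \equiv P_c(T)\Phi_2 = \Phi_2 - k - n. \tag{6.42} \]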
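Since
\[ \langle\sigma_3\xi_a(t),\Phi_2(t)\rangle = 0,\quad \xi_a\in N_g(H_0(t)), \qquad \langle\sigma_3\xi_b(t),\Phi_2(t)\rangle = 0,\quad \xi_b\in N(H_0(t)\mp\mu(t)), \]
$\xi(T)$ may be replaced by $\xi(T)-\xi(t)$ in the definitions of $k$ and $n$. Inserting the expansion (6.39) into (6.40) and (6.41) and defining
\[ p_{\rm null}(t,T) = \sum_{\xi_a\in N_g(H_0(T))}\langle\sigma_3(\xi_a(T)-\xi_a(t)),\,\cdot\,\rangle\,\xi_a, \tag{6.43} \]
\[ p_{\rm neut}(t,T) = \sum_{\xi_b\in N(H_0(T)\mp\mu(T))}\langle\sigma_3(\xi_b(T)-\xi_b(t)),\,\cdot\,\rangle\,\xi_b, \tag{6.44} \]
we have that $k$ and $n$ may be expressed in terms of $\eta$ as follows:
\[ \left[ I - \begin{pmatrix} p_{\rm null}(t,T) & p_{\rm null}(t,T)\\ p_{\rm neut}(t,T) & p_{\rm neut}(t,T)\end{pmatrix}\right]\begin{pmatrix} k\\ n\end{pmatrix} = \begin{pmatrix} p_{\rm null}(t,T)\eta\\ p_{\rm neut}(t,T)\eta\end{pmatrix}. \tag{6.45} \]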
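The relation (6.45) is a linear system of the form $(I-P)x=b$ in which the operator $P$ is small when $\alpha_0(t)$ is close to $\alpha_0(T)$. As an illustration only, the following finite-dimensional sketch inverts such a system by a Neumann series. The 2×2 matrix and right-hand side are made-up numbers chosen to mimic the row structure of (6.45); they are not quantities from the paper, and `neumann_solve` is a hypothetical helper name.

```python
def mat_vec(matrix, vector):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def neumann_solve(p, b, terms=60):
    """Approximate the solution of (I - P) x = b by x = sum_{j>=0} P^j b.

    The series converges when the norm of P is smaller than one, which is the
    finite-dimensional analogue of the smallness condition in Proposition 6.4.
    """
    x = list(b)
    term = list(b)
    for _ in range(terms):
        term = mat_vec(p, term)                  # term = P^j b
        x = [xi + ti for xi, ti in zip(x, term)]
    return x


# Toy data (assumed): both columns of each row coincide, as in (6.45).
P = [[0.10, 0.10],
     [0.05, 0.05]]
b = [0.20, 0.10]
x = neumann_solve(P, b)
residual = [xi - pi - bi for xi, pi, bi in zip(x, mat_vec(P, x), b)]
print(x, residual)
```

The residual printed at the end measures how well $(I-P)x=b$ is satisfied; it shrinks geometrically with the number of terms kept.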
Therefore, we have

Proposition 6.4. There exists $\varepsilon_0>0$ such that if $|\alpha_0(t)-\alpha_0(T)|<\varepsilon_0$ then the relation (6.45) can be inverted and we have
\[ \Phi_2 = \Phi_2[\eta] = k[\eta] + n[\eta] + \eta, \tag{6.46} \]
where $\Phi_2$ is linear in $\eta$ and continuous in the weighted (local decay) norm $f\mapsto\|\langle x\rangle^{-\sigma}f\|_2$.

The statement about continuity in the weighted norm follows from the spatial localization of the generalized eigenfunctions.

Remark 6.2. Note that since $|\xi_a(T)-\xi_a(t)|\le CE_0^{1/2}|\alpha_0(T)-\alpha_0(t)|$, we have the simple estimate:
\[ |k|,\ |n| \le C|\alpha_0(T)-\alpha_0(t)|\,\|\langle x\rangle^{-\sigma}\eta(t)\|_2. \tag{6.47} \]
Anticipating that for $t$ very large, $|\alpha_0(T)-\alpha_0(t)|\sim t^{-1/2}$ and $\|\langle x\rangle^{-\sigma}\Phi_2(t)\|_2\sim t^{-1/2}$, it follows that $|k|, |n|\sim t^{-1}$. Thus, $|k|$ and $|n|$ are expected to decay faster than $\eta$.

Finally, $\eta$ satisfies the following evolution equation, obtained from (5.11) by explicitly introducing the reference Hamiltonian, $H_0(T)$, and applying the projection $P_c(T)$ to the equation:
\[ \begin{aligned}
i\partial_t\eta ={}& H_0(T)\eta + (H_0(t)-H_0(T))\Phi_2[\eta] - \partial_t\tilde\Theta\,P_c(T)\sigma_3\Phi_2[\eta]\\
&+ (E_1(t)-E_0(t))P_c(T)\sigma_3\begin{pmatrix}\Psi_1(t)\\ \text{c.c.}\end{pmatrix}
+ \lambda P_c(T)\sigma_3\begin{pmatrix}2|\Psi_0|^2\Psi_1+\Psi_0^2\overline{\Psi_1}\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2\Psi_0|\Psi_1+\pi_1\Phi_2[\eta]|^2+\overline{\Psi_0}(\Psi_1+\pi_1\Phi_2[\eta])^2\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}|\Psi_1+\pi_1\Phi_2[\eta]|^2(\Psi_1+\pi_1\Phi_2[\eta])-|\Psi_1|^2\Psi_1\\ \text{c.c.}\end{pmatrix}\\
&- iP_c(T)\begin{pmatrix}\vec\nabla_1\Psi_1(t)\cdot\partial_t\vec\alpha_1(t)+\vec\nabla_0\Psi_0(t)\cdot\partial_t\vec\alpha_0(t)\\ \text{c.c.}\end{pmatrix}.
\end{aligned} \tag{6.48} \]

7. Normal Form and Master Equations

In Sec. 4 we decomposed the solution, $\phi$, in terms of coordinates $\alpha_0(t)$ and $\alpha_1(t)$ along manifolds of nonlinear bound states and $\phi_2$, a correction which lies in a time-dependent subspace, $M_1(t)$, of continuum modes. $\phi_2(t)$ was then decomposed into its discrete ($k$ and $n$) and continuous ($\eta$) components with respect to a time-independent Hamiltonian, $H_0(T)$. We also observed that $k$ and $n$ are determined by and are expected to be more rapidly decaying than $\eta$ (Proposition 6.4 and
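Remark 6.2). Therefore, the evolution of $\phi$ is determined by $\alpha_0$, $\alpha_1$ and $\eta$. Finally, the fast oscillations of $\alpha_1$ are removed by the introduction of $\beta_1 = [X^{-1}(t)\vec\alpha_1]_1\sim e^{i\lambda_+t}\alpha_1$; compare (6.33), (6.34) and (7.37). We now seek a form of the system for $\alpha_0$ and $\beta_1$ from which the large time dynamics can be deduced. We obtain this "normal form" by first solving for $\eta$ (see (7.21), (6.48)) as a functional of $\alpha_0$, $\beta_1$ and the initial data $\eta(0)$ and then substituting an appropriate expansion (see Secs. 7 and 8) into the equations for $\alpha_0$ and $\beta_1$.

Proposition 7.1 (The Normal Form). There exists a near identity change of variables
\[ \begin{pmatrix}\tilde\alpha_0\\ \tilde\beta_1\end{pmatrix} \equiv \begin{pmatrix}\alpha_0\\ \beta_1\end{pmatrix} + \begin{pmatrix}J_\alpha[\alpha_0,\beta_1,t]\\ J_\beta[\alpha_0,\beta_1,t]\end{pmatrix} \tag{7.1} \]
where
\[ J_k[\alpha_0,\beta_1,t] = O(|\alpha_0|^2+|\beta_1|^2),\qquad k=\alpha,\beta, \tag{7.2} \]
and bounded uniformly in $t$, and such that
\[ \begin{aligned}
i\partial_t\tilde\alpha_0 &= (c_{1022}+i\Gamma_\omega)|\tilde\beta_1|^4\tilde\alpha_0 + F_\alpha[\tilde\alpha_0,\tilde\beta_1,\eta,t],\\
i\partial_t\tilde\beta_1 &= (c_{1121}-2i\Gamma_\omega)|\tilde\alpha_0|^2|\tilde\beta_1|^2\tilde\beta_1 + F_\beta[\tilde\alpha_0,\tilde\beta_1,\eta,t].
\end{aligned} \tag{7.3} \]
The properties of $F_\alpha$ and $F_\beta$ are briefly discussed in Remark 7.2 following Corollary 7.1 below and described in detail in Sec. 8. Furthermore, $\Gamma = \Gamma_{\omega_*} + O(|\alpha_0(T)|^2) > 0$, where
\[ \Gamma_{\omega_*} \equiv \lambda^2\pi\bigl\langle\psi_{0*}\psi_{1*}^2,\ \delta(H-\omega_*)\,\psi_{0*}\psi_{1*}^2\bigr\rangle > 0,\qquad \omega_* = 2E_{1*}-E_{0*}. \tag{7.4} \]
The coefficients $c_{klmn} = O_0^{(0,1)}$ are real constants multiplying monomials of the form $\alpha_0^k\bar\alpha_0^l\beta_1^m\bar\beta_1^n$.

Remark 7.1. Due to our choice of the phase correction, $\tilde\Theta(t)$ (see (6.12)), a term of the form $c_{1011}O_0^{(0,1)}|\tilde\beta_1|^2\tilde\alpha_0$ is absent from the differential equation for $\tilde\alpha_0$ in (7.3).

Now let
\[ P_0\equiv|\tilde\alpha_0|^2 \quad\text{and}\quad P_1\equiv|\tilde\beta_1|^2 \tag{7.5} \]
denote the (renormalized) ground state and excited state powers. Then, by (7.3) we have the Nonlinear Master Equation: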
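Corollary 7.1.
\[ \frac{dP_0}{dt} = 2\Gamma P_0P_1^2 + R_0, \tag{7.6} \]
\[ \frac{dP_1}{dt} = -4\Gamma P_0P_1^2 + R_1, \tag{7.7} \]
where
\[ R_0 = R_0[\tilde\alpha_0,\tilde\beta_1,\eta,t] = 2\,\Im\bigl(\overline{\tilde\alpha_0}\,F_\alpha\bigr),\qquad R_1 = R_1[\tilde\alpha_0,\tilde\beta_1,\eta,t] = 2\,\Im\bigl(\overline{\tilde\beta_1}\,F_\beta\bigr). \tag{7.8} \]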
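To get a feel for (7.6) and (7.7), one can drop the corrections $R_0$ and $R_1$ and integrate the resulting two-dimensional system numerically. The sketch below does this with a classical fourth-order Runge-Kutta step. The value $\Gamma=1$, the initial data and the step size are arbitrary illustrative choices, not values from the paper; the script prints the final powers together with the combination $2P_0+P_1$, which the truncated system conserves.

```python
def rhs(p0, p1, gamma):
    """Truncated master equations: dP0/dt = 2*Gamma*P0*P1^2, dP1/dt = -4*Gamma*P0*P1^2."""
    flux = gamma * p0 * p1 * p1
    return 2.0 * flux, -4.0 * flux


def integrate(p0, p1, gamma=1.0, t_end=200.0, n_steps=20000):
    """Advance (P0, P1) from t = 0 to t = t_end with a fixed-step RK4 scheme."""
    h = t_end / n_steps
    for _ in range(n_steps):
        a0, a1 = rhs(p0, p1, gamma)
        b0, b1 = rhs(p0 + 0.5 * h * a0, p1 + 0.5 * h * a1, gamma)
        c0, c1 = rhs(p0 + 0.5 * h * b0, p1 + 0.5 * h * b1, gamma)
        d0, d1 = rhs(p0 + h * c0, p1 + h * c1, gamma)
        p0 += h * (a0 + 2.0 * b0 + 2.0 * c0 + d0) / 6.0
        p1 += h * (a1 + 2.0 * b1 + 2.0 * c1 + d1) / 6.0
    return p0, p1


# Illustrative data (assumed): half of the power in each mode initially.
p0_end, p1_end = integrate(0.5, 1.0)
print(p0_end, p1_end, 2.0 * p0_end + p1_end)
```

In this toy run the excited state power decays slowly (algebraically in $t$) while the ground state power grows toward half of the conserved total.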
A more precise and revealing variant of Corollary 7.1 is Proposition 11.1, which is stated and proved in Sec. 8.

Remark 7.2. The terms $F_\alpha$ and $F_\beta$ are such that $R_0$ and $R_1$ are not small perturbations of the leading order terms in (7.6), (7.7) for all $t\ge0$. In fact there are three time intervals, defined in terms of transition times $t_0$ and $t_1$ (see Sec. 8), on which we consider the system (7.6), (7.7): $I_0=[0,t_0]$, $I_1=[t_0,t_1]$ and $I_2=[t_1,\infty)$. It is only for sufficiently large times ($t\in I_2$) that $R_0$ and $R_1$ are negligible. The behavior on short ($0\le t\le t_0$) and intermediate ($t_0\le t\le t_1$) time scales can be very different. We go into the details of $R_0$ and $R_1$ in Sec. 8 but wish to make some remarks at this stage which indicate our approach. If we drop the terms $R_j$ then we have a flow, which evolves in the first quadrant of the $P_0$-$P_1$ plane according to:
\[ \frac{dp_0}{dt} = 2\Gamma p_0p_1^2, \tag{7.9} \]
\[ \frac{dp_1}{dt} = -4\Gamma p_0p_1^2, \tag{7.10} \]
where solutions for typical data converge to the $p_0$ axis with a rate $\langle t\rangle^{-1}$. In order for the corrections coming from $R_0$ and $R_1$ to be small, intuitively it is sufficient that
\[ R_j \sim E_0^\rho\bigl(P_0P_1^2 + \langle t\rangle^{-3}\bigr), \tag{7.11} \]
where $E_0$ is small and $\rho>0$. This is what we show for $t\ge t_1$. For the intermediate time range, $t_0\le t\le t_1$, we show that the behavior is controlled by the system:
\[ \begin{aligned}
\frac{dP_0}{dt} &= 2\Gamma P_0P_1^2 + O(\langle t\rangle^{-3}),\\
\frac{dP_1}{dt} &= -4\Gamma P_0P_1^2 + O\bigl(\sqrt{P_0}\,P_1^m\bigr) + O(\langle t\rangle^{-3}),
\end{aligned} \tag{7.12} \]
where $m\ge3$. Therefore, for intermediate times we need to show:
\[ \begin{aligned}
R_0 &\sim E_0^\rho\bigl(P_0P_1^2 + \langle t\rangle^{-3}\bigr),\\
R_1 &\sim E_0^\rho\bigl(P_0P_1^2 + \sqrt{P_0}\,P_1^m + \langle t\rangle^{-3}\bigr).
\end{aligned} \tag{7.13} \]
What makes the analysis subtle is the dependence of $R_j$ on $P_0$ and $P_1$ in a manner which is nonlocal in time. That is,
\[ \eta \sim \int_0^t e^{-iH_0(t-s)}P_c(H_0)\,\chi\,\alpha_0^{m_1}\bar\alpha_0^{m_2}\beta_1^{m_3}\bar\beta_1^{m_4}\,ds. \tag{7.14} \]
Local in time terms are simple to dominate by the leading terms. However, nonlocal terms require careful analysis. Note in particular that, due to the "history dependence" of such terms, being expressed as time integrals from $0$ up to $t$, an analysis of the effect of such terms for $t\ge t_1$ requires use of estimates on the other time regimes $t\le t_0$ and $t_0\le t\le t_1$ as well. Furthermore, there are no decay estimates on either $P_0$ or $P_1$ in the intermediate interval $t_0\le t\le t_1$ or on the size of this interval.

The normal form of Proposition 7.1 is essentially the Poincaré–Dulac normal form, which can be constructed along the lines explicitly implemented in [49]; see also [1]. We now give a detailed outline of the procedure with explicit illustrative detail of key points concerning the treatment of resonant and nonresonant terms.

Resonant terms and removal of nonresonant terms: Here we illustrate, by way of a simple example, how non-resonant terms can be removed by near identity changes of variables. Consider the scalar ordinary differential equation
\[ A'(t) = |A(t)|^2e^{i\Omega t} \tag{7.15} \]
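where $A(t)$ is a complex valued function. We shall introduce a change of variables $A\mapsto\tilde A = A + q_2(A,\bar A,t)$, where $q_2(A,\bar A,t) = O(|A|^2)$ and $q_2(A,\bar A,t+\tfrac{2\pi}{\Omega}) = q_2(A,\bar A,t)$, which is therefore approximately the identity for $|A|$ small, and such that
\[ \tilde A'(t) = \frac{i}{\Omega}|\tilde A(t)|^2\tilde A(t) + \frac{i}{\Omega}e^{2i\Omega t}|\tilde A(t)|^2\overline{\tilde A(t)} + E_4\bigl(\tilde A(t),\overline{\tilde A(t)},t\bigr). \tag{7.16} \]
Here, $E_4$ is $2\pi/\Omega$ periodic in $t$ and
\[ E_4\bigl(\tilde A(t),\overline{\tilde A(t)},t\bigr) = O(|\tilde A(t)|^4). \tag{7.17} \]
The change of variables can be derived by elementary means. Integration of (7.15) gives:
\[ \begin{aligned}
A(t)-A(0) &= \int_0^t|A(s)|^2e^{i\Omega s}\,ds = \int_0^t|A(s)|^2\,\frac{1}{i\Omega}\frac{d}{ds}e^{i\Omega s}\,ds\\
&= \frac{1}{i\Omega}|A(s)|^2e^{i\Omega s}\Big|_0^t - \frac{1}{i\Omega}\int_0^te^{i\Omega s}\frac{d}{ds}|A(s)|^2\,ds\\
&= \frac{1}{i\Omega}|A(s)|^2e^{i\Omega s}\Big|_0^t - \frac{1}{i\Omega}\int_0^te^{i\Omega s}\,\overline{A(s)}\,|A(s)|^2e^{i\Omega s}\,ds - \frac{1}{i\Omega}\int_0^t|A(s)|^2A(s)\,ds.
\end{aligned} \tag{7.18} \]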
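The effect of moving the boundary term of (7.18) to the left-hand side can be checked numerically. The sketch below integrates (7.15) with a fixed-step Runge-Kutta scheme and compares how far $A(t)$ and the shifted variable $A(t) - (i\Omega)^{-1}|A(t)|^2e^{i\Omega t}$ move from their initial values. The amplitude, frequency, time window and function names are illustrative assumptions, not values from the paper.

```python
import cmath


def rhs(t, a, omega):
    """Right-hand side of the model ODE (7.15): A'(t) = |A(t)|^2 exp(i*Omega*t)."""
    return abs(a) ** 2 * cmath.exp(1j * omega * t)


def rk4_path(a0, omega, t_end, n_steps):
    """Return the list of (t, A(t)) samples produced by classical RK4."""
    h = t_end / n_steps
    t, a = 0.0, complex(a0)
    path = [(t, a)]
    for _ in range(n_steps):
        k1 = rhs(t, a, omega)
        k2 = rhs(t + 0.5 * h, a + 0.5 * h * k1, omega)
        k3 = rhs(t + 0.5 * h, a + 0.5 * h * k2, omega)
        k4 = rhs(t + h, a + h * k3, omega)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        path.append((t, a))
    return path


def near_identity(t, a, omega):
    """Shifted variable A - (i*Omega)^{-1} |A|^2 exp(i*Omega*t) from the boundary term of (7.18)."""
    return a - abs(a) ** 2 * cmath.exp(1j * omega * t) / (1j * omega)


def max_drifts(a0=0.01, omega=1.0, t_end=10.0, n_steps=10000):
    """Largest excursions of A and of the shifted variable from their initial values."""
    path = rk4_path(a0, omega, t_end, n_steps)
    a_start = path[0][1]
    tilde_start = near_identity(0.0, a_start, omega)
    drift_a = max(abs(a - a_start) for _, a in path)
    drift_tilde = max(abs(near_identity(t, a, omega) - tilde_start) for t, a in path)
    return drift_a, drift_tilde


print(max_drifts())
```

For small amplitude the first number is of second order in $|A|$ (the oscillatory quadratic forcing), while the second is of third order times the elapsed time, consistent with the cubic right-hand side of (7.16).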
Define $\tilde A(t) = A(t) - (i\Omega)^{-1}|A(t)|^2e^{i\Omega t}$. Then $\tilde A$ satisfies the renormalized ODE (7.16), in which the nonresonant (oscillatory) quadratic terms have been removed. The process can be repeated; by introducing further changes of variables $\tilde A\mapsto\tilde A_1 = \tilde A\,+$ (terms of higher order in $\tilde A$ and periodic in $t$), non-resonant (oscillatory) cubic terms can be removed to obtain:
\[ \tilde A_1'(t) = \frac{i}{\Omega}|\tilde A_1(t)|^2\tilde A_1(t) + ik|\tilde A_1(t)|^4\tilde A_1(t) + \dots, \tag{7.19} \]
where $k$ is real. That the coefficients in the first two terms of this normal form are purely imaginary implies that, to this order, the amplitude $|\tilde A_1(t)|$ is independent of time. This is the typical situation for the normal form of finite dimensional Hamiltonian systems, in which resonances occur between isolated discrete frequencies. We next examine resonances between discrete frequencies and the continuum of frequencies associated with the continuous spectral (dispersive) part of $H_0$. These can introduce nonconservative terms into the normal form (via coefficients with real as well as imaginary parts), which are responsible for energy transfer between discrete modes (bound states) and radiation.

Nonconservative resonant terms and energy transfer: We explain how to find the key resonant energy transfer terms, the leading terms in (7.3). These are the terms responsible for the exchange of energy among the nonlinear ground and excited states, mediated by interaction with continuum modes. We focus on the $\beta_1$ equation. Analogous considerations apply to the $\alpha_0$ equation. Equation (6.35) can be written in the following compact form:
\[ i\partial_t\beta_1 = \sum_{p,q,r}C_{pqr}\,\beta_1^p\bar\beta_1^q\,e^{-i\omega_rt}\langle\chi_{pqr},\eta\rangle + O(\eta^2) + \cdots, \tag{7.20} \]
where $C_{pqr}$ are of order 1, $\alpha_0$ or higher order in $\alpha_0$, $\omega_r\in\{\pm\lambda_\pm,\pm2\lambda_\pm,0\}$, and $\chi_{pqr}$ denote functions which are exponentially localized in space. The equation for $\alpha_0$ has a similar structure. The equation for $\eta = \eta_0+\eta_1+\eta_2$ can be formally solved, giving:
\[ \eta = O(\eta_0) + O(\eta_0^2) + O(\eta^2) + \sum_{p_1,q_1,r_1}D_{p_1q_1r_1}\,G\bigl[\beta_1^{p_1}\bar\beta_1^{q_1}e^{-i\nu_{r_1}s}\tilde\chi_{p_1q_1r_1}\bigr]. \tag{7.21} \]
Here, $G$ denotes the operator
\[ f\mapsto Gf \equiv -i\int_0^te^{-iH_0(t-s)}P_cf(s)\,ds. \tag{7.22} \]
We insert the expansion (7.21) for $\eta$ into the terms involving inner products $\langle\chi_{pqr},\eta\rangle$ in (7.20) and in the corresponding equation for $\alpha_0$. This yields a coupled system for $\alpha_0$ and $\beta_1$ which is closed up to higher order. The terms in the resulting equations are of the form
\[ \sum_{p,q,r}\ \sum_{p_1,q_1,r_1}C_{pqr}D_{p_1q_1r_1}\,\beta_1^p\bar\beta_1^q\,e^{-i\omega_rt}\bigl\langle\chi_{pqr},\ G\,\beta_1^{p_1}\bar\beta_1^{q_1}e^{-i\nu_{r_1}s}\tilde\chi_{p_1q_1r_1}\bigr\rangle. \tag{7.23} \]
We now use the integration by parts lemma:

Lemma 7.1.
\[ \begin{aligned}
\int_0^te^{iAs}f(s)\,ds &= \lim_{\delta\downarrow0}\int_0^te^{i(A\pm i\delta)s}f(s)\,ds\\
&= -i(A\pm i0)^{-1}e^{iAt}f(t) + i(A\pm i0)^{-1}f(0) + i(A\pm i0)^{-1}\int_0^te^{iAs}f'(s)\,ds.
\end{aligned} \tag{7.24} \]
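Applying this lemma, we obtain
\[ \begin{aligned}
e^{-i\omega_rt}\bigl\langle\chi_{pqr},\ G\,\beta_1^{p_1}\bar\beta_1^{q_1}e^{-i\nu_{r_1}s}\tilde\chi_{p_1q_1r_1}\bigr\rangle
={}& -e^{-i(\omega_r+\nu_{r_1})t}\beta_1^{p_1}\bar\beta_1^{q_1}\bigl\langle\chi_{pqr},\ (H_0(T)-\nu_{r_1}\mp i0)^{-1}P_c(T)\tilde\chi_{p_1q_1r_1}\bigr\rangle\\
&+ \beta_1^{p_1}(0)\bar\beta_1^{q_1}(0)\bigl\langle\chi_{pqr},\ (H_0(T)-\nu_{r_1}\mp i0)^{-1}e^{-iH_0(T)t}P_c(T)\tilde\chi_{p_1q_1r_1}\bigr\rangle\\
&+ \int_0^t\Bigl\langle\chi_{pqr},\ (H_0(T)-\nu_{r_1}\mp i0)^{-1}e^{-iH_0(T)(t-s)}e^{-i\nu_{r_1}s}P_c(T)\,\frac{d}{ds}\bigl(\beta_1^{p_1}\bar\beta_1^{q_1}\tilde\chi_{p_1q_1r_1}\bigr)\Bigr\rangle\,ds.
\end{aligned} \tag{7.25} \]
We first focus on the first term in the expansion (7.25). This contributes a resonant term, which cannot be transformed by a near identity transformation to higher order, if
\[ \omega_r + \nu_{r_1} = 0. \tag{7.26} \]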
Now consider such a resonant term. We find that in the $\beta_1$ equation they are of the form:
\[ -|\alpha_0|^{2a_1}|\beta_1|^{2b_1}\beta_1\,\bigl\langle\vec\chi,\ (H_0-\nu_{r_1}\mp i0)^{-1}P_c\vec\chi\bigr\rangle, \tag{7.27} \]
where $\nu_{r_1} = -\omega_r\in\{\pm\lambda_\pm,\pm2\lambda_\pm,0\}$. There are two cases to consider: (i) $\nu_{r_1}$ not in the continuous spectrum of $H_0$ and (ii) $\nu_{r_1}$ in the continuous spectrum of $H_0$ (see footnote d). If $\nu_{r_1}$ is not in the continuous spectrum of $H_0$ then the inner product in (7.27) does not involve a singular limit and we get the limit
\[ \bigl\langle\vec\chi,\ (H_0-\nu_{r_1})^{-1}P_c\vec\chi\bigr\rangle. \tag{7.28} \]
In this case, the coefficient of $|\alpha_0|^{2a_1}|\beta_1|^{2b_1}\beta_1$ is real. Such a term results only in a nonlinear distortion of the phase of $\beta_1$ and does not affect the amplitude. If $\nu_{r_1}$ is in the interior of the continuous spectrum of $H_0$ then the limit is singular. We choose the plus sign ($+i0$) if we study the evolution for $t>0$ and the negative sign ($-i0$) for $t<0$. This choice is related to the condition of outgoing radiation explained below; see also [49], for example. Evaluation of this singular limit gives:
\[ \bigl\langle\chi,\ (H_0-\nu_{r_1}\mp i0)^{-1}P_c\chi\bigr\rangle = \tilde\Lambda + i\tilde\Gamma, \]
where
\[ \tilde\Gamma = \pi\bigl\langle\chi,\ \delta(H_0-\nu_{r_1})\chi\bigr\rangle,\qquad \tilde\Lambda = \bigl\langle\chi,\ \text{P.V.}\,(H_0-\nu_{r_1})^{-1}\chi\bigr\rangle. \]

Footnote d. In case (ii) we consider the generic case where $\nu_{r_1}$ lies in the interior of the continuous spectrum of $H_0$.
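The splitting of a singular resolvent limit into a principal value part and a delta function part can be illustrated in one variable, where the standard distributional identity reads $(x-\nu-i0)^{-1} = \text{P.V.}\,(x-\nu)^{-1} + i\pi\delta(x-\nu)$. The sketch below replaces the spectral measure by a Gaussian density (an arbitrary stand-in, not the spectral density of $H_0$) and checks that the imaginary part of the regularized integral approaches $\pi$ times the density at $\nu$ as the regularization $\delta$ becomes small. The function names and all numerical parameters are assumptions made for the illustration.

```python
import math


def gaussian(x):
    """Stand-in for a smooth spectral density (assumed, for illustration only)."""
    return math.exp(-x * x)


def resolvent_pairing(nu, delta, density=gaussian, x_max=8.0, n=64001):
    """Trapezoid approximation of the integral of density(x) / (x - nu - i*delta) dx."""
    h = 2.0 * x_max / (n - 1)
    total = 0j
    for j in range(n):
        x = -x_max + j * h
        weight = 0.5 if j in (0, n - 1) else 1.0
        total += weight * density(x) / complex(x - nu, -delta)
    return total * h


nu = 0.5
value = resolvent_pairing(nu, delta=0.01)
# Imaginary part of the regularized integral versus pi * density(nu).
print(value.imag, math.pi * gaussian(nu))
```

The imaginary part is positive for the $-i0$ regularization, which is the one-variable counterpart of the sign of the damping coefficient discussed in the text.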
Contributions to the imaginary part are therefore responsible for a change in amplitude (here damping of $\beta_1$). Now, when does a frequency $\nu$ lie in the continuous spectrum of $H_0$? By Proposition 4.2, we must have $\nu>-E_0=|E_0|$ or $\nu<E_0$. By (6.30), $\lambda_\pm\sim\pm(E_1-E_0)$. Since $\nu$ varies over the frequencies $0,\pm\lambda_\pm,\pm2\lambda_\pm$, we find that the $\nu_{r_1}=\pm2\lambda_\pm=-\omega_r$ resonances are in the continuous spectrum and therefore are those giving rise to energy transfer, provided $2E_1-E_0>0$; see (7.4). We now embark on the details.

7.1. Expansion of η

We expand $\eta$ as follows:
\[ \eta(t) = \eta_0(t) + \eta_1(t) + \eta_2(t), \tag{7.29} \]
where $\eta_0(t)$ corresponds to the linear homogeneous evolution with initial data $\eta(0)=P_c(T)\Phi_2(0)$ and $\eta_1$ solves the inhomogeneous linear equation driven by $\alpha_0$, $\alpha_1$ and $\eta_0(t)$.

Equation for $\eta_0(t)$:
\[ i\partial_t\eta_0 = H_0(T)\eta_0,\qquad \eta_0(0) = P_c(T)\Phi_2(0). \tag{7.30} \]
Thus,
\[ \eta_0(t) = e^{-iH_0(T)t}P_c(T)\Phi_2(0). \tag{7.31} \]
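Equation for $\eta_1(t)$:
\[ \begin{aligned}
i\partial_t\eta_1 ={}& H_0(T)\eta_1 + P_c(T)[H_0(t)-H_0(T)]\Phi_2[\eta_0] - \partial_t\tilde\Theta\,P_c(T)\sigma_3\Phi_2[\eta_0]\\
&+ E_{10}(t)P_c(T)\sigma_3\begin{pmatrix}\Psi_1\\ \text{c.c.}\end{pmatrix}
+ \lambda P_c(T)\sigma_3\begin{pmatrix}2|\Psi_0|^2\Psi_1+\Psi_0^2\overline{\Psi_1}\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2\Psi_0|\Psi_1+\pi_1\Phi_2[\eta_0]|^2+\overline{\Psi_0}(\Psi_1+\pi_1\Phi_2[\eta_0])^2\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}|\Psi_1+\pi_1\Phi_2[\eta_0]|^2(\Psi_1+\pi_1\Phi_2[\eta_0])-|\Psi_1|^2\Psi_1\\ \text{c.c.}\end{pmatrix}\\
&- iP_c\begin{pmatrix}\vec\nabla_1\Psi_1(t)\cdot\partial_t\vec\alpha_1(t)+\vec\nabla_0\Psi_0(t)\cdot\partial_t\vec\alpha_0(t)\\ \text{c.c.}\end{pmatrix}.
\end{aligned} \tag{7.32} \]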
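The initial data for $\eta_1$ is $\eta_1(0)=0$.

Remark 7.3. A direct computation using the biorthogonal decomposition of the discrete subspaces of $H_0$ and $H_0^*$ of Sec. 4 yields that the second term in (6.48) is of a higher order than is explicit:
\[ E_{10}(t)P_c(T)\sigma_3\begin{pmatrix}\Psi_1\\ \text{c.c.}\end{pmatrix} = O\bigl(|\alpha_0|^2|\alpha_1|+|\alpha_1|^3\bigr)\bigl(\chi_0^{(0)}+\chi_0^{(1)}+\chi_0^{(0,1)}\bigr) \tag{7.33} \]
for $|\alpha_0|\ll1$ and $|\alpha_1|\downarrow0$.

Equation for $\eta_2(t)$:
\[ \begin{aligned}
i\partial_t\eta_2 ={}& H_0(T)\eta_2 + P_c(T)[H_0(t)-H_0(T)]\Phi_2[\eta_1+\eta_2] - \partial_t\tilde\Theta\,P_c(T)\sigma_3\Phi_2[\eta_1+\eta_2]\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2\Psi_0\bigl[\overline{(\Psi_1+\pi_1\Phi_2[\eta_0])}\,\pi_1\Phi_2[\eta_1+\eta_2]+\text{c.c.}\bigr]\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2\Psi_0|\pi_1\Phi_2[\eta_1+\eta_2]|^2+2\overline{\Psi_0}(\Psi_1+\pi_1\Phi_2[\eta_0])\pi_1\Phi_2[\eta_1+\eta_2]\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}\overline{\Psi_0}(\pi_1\Phi_2[\eta_1+\eta_2])^2\\ \text{c.c.}\end{pmatrix}
+ \lambda P_c(T)\sigma_3\begin{pmatrix}2|\Psi_1+\pi_1\Phi_2[\eta_0]|^2\pi_1\Phi_2[\eta_1+\eta_2]\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}(\pi_1\Phi_2[\eta_1+\eta_2])^2\,\overline{(\Psi_1+\pi_1\Phi_2[\eta_0])}+\overline{\pi_1\Phi_2[\eta_1+\eta_2]}\,(\Psi_1+\pi_1\Phi_2[\eta_0])^2\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2(\Psi_1+\pi_1\Phi_2[\eta_0])|\pi_1\Phi_2[\eta_1+\eta_2]|^2+|\pi_1\Phi_2[\eta_1+\eta_2]|^2\pi_1\Phi_2[\eta_1+\eta_2]\\ \text{c.c.}\end{pmatrix}.
\end{aligned} \tag{7.34} \]
We expect that $\eta_2 = O(\langle t\rangle^{-1})$. Let
\[ \eta_2 = \eta_{2a} + \eta_{2b}. \tag{7.35} \]
By construction we will show that $\eta_{2a} = O(\langle t\rangle^{-1})$ and $\eta_{2b} = O(\langle t\rangle^{-3/2})$.
\[ \begin{aligned}
i\partial_t\eta_{2a} ={}& H_0(T)\eta_{2a} + P_c(T)[H_0(t)-H_0(T)]\Phi_2[\eta_1] - \partial_t\tilde\Theta\,\sigma_3\Phi_2[\eta_1]\\
&+ 2\lambda P_c(T)\sigma_3\begin{pmatrix}\Psi_0\bigl(\overline{\Psi_1}\,\pi_1\Phi_2[\eta_1]+\Psi_1\pi_2\Phi_2[\eta_1]\bigr)+\Psi_0|\pi_1\Phi_2[\eta_1]|^2\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2\overline{\Psi_0}\,\pi_1\Phi_2[\eta_1](\Psi_1+\pi_1\Phi_2[\eta_0])+\overline{\Psi_0}(\pi_1\Phi_2[\eta_1])^2\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2|\Psi_1+\pi_1\Phi_2[\eta_0]|^2\pi_1\Phi_2[\eta_1]+(\Psi_1+\pi_1\Phi_2[\eta_0])^2\,\overline{\pi_1\Phi_2[\eta_1]}\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}2(\Psi_1+\pi_1\Phi_2[\eta_0])|\pi_1\Phi_2[\eta_1]|^2+\overline{(\Psi_1+\pi_1\Phi_2[\eta_0])}\,(\pi_1\Phi_2[\eta_1])^2\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\begin{pmatrix}|\pi_1\Phi_2[\eta_1]|^2\pi_1\Phi_2[\eta_1]\\ \text{c.c.}\end{pmatrix}.
\end{aligned} \tag{7.36} \]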
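7.2. Normal form and master equations

Using (7.33) and explicitly inserting in (7.32) the representation
\[ \alpha_1 = (X(t)\vec\beta(t))_1 = e^{-i\lambda_+(T)t}\beta_1 + O_0^{(0,1)}\alpha_0^2(T)E_{10}^{-1}(T)e^{-i\lambda_-(T)t}\bar\beta_1 \tag{7.37} \]
gives the following equation for $\eta_1$:
\[ \begin{aligned}
i\partial_t\eta_1 ={}& H_0(T)\eta_1 + P_c(T)\bigl(H_0(t)-H_0(T)\bigr)\Phi_2[\eta_0] - \partial_t\tilde\Theta\,P_c(T)\sigma_3\Phi_2[\eta_0]\\
&+ E_{10}(t)P_c(T)\sigma_3\bigl(|\tilde\alpha_0|^2\chi_0^{(0)}+|\tilde\beta_1|^2\chi_1^{(1)}\bigr)\times\begin{pmatrix}e^{-i\lambda_+(T)t}\beta_1+O_0^{(0,1)}\alpha_0^2(T)E_{10}^{-1}(T)e^{-i\lambda_-t}\bar\beta_1\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\,\psi_{0*}^2\psi_{1*}\begin{pmatrix}2|\alpha_0|^2\beta_1e^{-i\lambda_+(T)t}+\alpha_0^2\bar\beta_1e^{i\lambda_+(T)t}\\ \text{c.c.}\end{pmatrix}\\
&+ \lambda P_c(T)\sigma_3\,\psi_{0*}\psi_{1*}^2\begin{pmatrix}2\alpha_0|\beta_1|^2+\bar\alpha_0\beta_1^2e^{-2i\lambda_+(T)t}\\ \text{c.c.}\end{pmatrix} + \vec R_{\eta_1},
\end{aligned} \tag{7.38} \]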
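with initial condition $\eta_1(0)=0$. Substitution of $\eta_1$ into Eqs. (6.23) and (6.35) for $\alpha_0$ and $\beta_1$ gives rise to terms which make explicit the resonant exchange of energy between the ground state and excited state. We next isolate the key terms in the expansion of $\eta_1$ relating to this energy exchange. Let us begin with the $\beta_1$ equation, (6.35). Written out in greater detail we have:
\[ \begin{aligned}
i\partial_t\beta_1 ={}& 2\lambda\langle\psi_{0*},\psi_{1*}^3\rangle|\beta_1|^2\alpha_0e^{i\lambda_+(T)t} + 2\lambda\langle\psi_{0*}\psi_{1*}^2,\pi_2\Phi_2\rangle\beta_1\alpha_0\\
&+ \lambda\langle\psi_{0*},\psi_{1*}^3\rangle\beta_1^2\bar\alpha_0e^{-i\lambda_+(T)t} + 2\lambda\langle\psi_{0*}\psi_{1*}^2,\pi_1\Phi_2\rangle\bigl(\bar\beta_1\alpha_0e^{2i\lambda_+(T)t}+\beta_1\bar\alpha_0\bigr)\\
&+ 2\lambda\langle\psi_{0*}\psi_{1*},|\pi_1\Phi_2|^2\rangle\alpha_0e^{i\lambda_+(T)t} + \lambda\langle\psi_{0*}\psi_{1*},(\pi_1\Phi_2)^2\rangle\bar\alpha_0e^{i\lambda_+(T)t}\\
&+ \lambda\langle\psi_{1*}^3,\pi_1\Phi_2\rangle|\beta_1|^2e^{i\lambda_+(T)t} + \lambda\langle\psi_{1*}^2,(\pi_1\Phi_2)^2\rangle\bar\beta_1e^{2i\lambda_+(T)t}\\
&+ \lambda\langle\psi_{1*}^3,\pi_2\Phi_2\rangle\beta_1^2e^{-i\lambda_+(T)t} + 2\lambda\langle\psi_{1*}^2,|\pi_1\Phi_2|^2\rangle\beta_1e^{-i\lambda_+(T)t}\\
&+ \lambda\langle\psi_{1*},|\pi_1\Phi_2|^2\pi_1\Phi_2\rangle e^{i\lambda_+(T)t} + R_\beta,
\end{aligned} \tag{7.39} \]
where $R_\beta$ is defined in (6.36).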
We claim that the key term in (7.39) responsible for energy transfer is the term
\[ 2\lambda\langle\psi_{0*}\psi_{1*}^2,\pi_1\Phi_2\rangle\bar\beta_1\alpha_0e^{2i\lambda_+(T)t} \tag{7.40} \]
on the second line of (7.39). To see this we decompose $\eta_1$ into "resonant" and "nonresonant" parts
\[ \eta_1 = \eta_{1R} + \eta_{1NR}. \tag{7.41} \]
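The part $\eta_{1R}$ gives rise to the key resonant energy transfer term in the equation for $\beta_1$ and is the solution to the initial value problem
\[ i\partial_t\eta_{1R} = H_0(T)\eta_{1R} + \lambda P_c(T)\sigma_3\,\psi_{1*}^2\psi_{0*}\,\bar\alpha_0\beta_1^2e^{-2i\lambda_+(T)t}\begin{pmatrix}1\\0\end{pmatrix},\qquad \eta_{1R}(0)=0. \tag{7.42} \]
Solving (7.42) using Duhamel's principle we have
\[ \begin{aligned}
\eta_{1R}(t) &= -i\lambda\int_0^te^{-iH_0(T)(t-s)}P_c(T)\sigma_3\,\psi_{1*}^2\psi_{0*}\,\bar\alpha_0(s)\beta_1^2(s)e^{-2i\lambda_+(T)s}\begin{pmatrix}1\\0\end{pmatrix}ds\\
&= -i\lambda e^{-iH_0(T)t}\int_0^te^{i[H_0(T)-2\lambda_+(T)]s}P_c(T)\sigma_3\,\psi_{1*}^2\psi_{0*}\,\bar\alpha_0(s)\beta_1^2(s)\begin{pmatrix}1\\0\end{pmatrix}ds.
\end{aligned} \tag{7.43} \]
Recall that the continuous spectrum of $H_0(T)$ is given by points $\omega$ such that $|\omega|\ge|E_0|$. By Proposition 6.2, for $|\alpha_0|^2/E_{10}$ sufficiently small
\[ \lambda_+(T) = E_{10}(T) + O_0^{(0,1)}|\alpha_0(T)|^2. \tag{7.44} \]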
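By the hypothesis (7.4), $2E_{1*}-E_{0*}>0$, if $|\alpha_0|$ is sufficiently small then $2\lambda_+(T)$ lies in the continuous spectrum of $H_0(T)$. Therefore, (7.42) is a resonantly forced system. We expand the solution as follows. Let $\delta>0$ and set
\[ \eta_{1R}^\delta(t) = -i\lambda e^{-iH_0(T)t}\int_0^te^{i[H_0(T)-2\lambda_+(T)-i\delta]s}P_c(T)\sigma_3\,\psi_{1*}^2\psi_{0*}\,\bar\alpha_0(s)\beta_1^2(s)\begin{pmatrix}1\\0\end{pmatrix}ds. \tag{7.45} \]
Then, $\eta_{1R} = \lim_{\delta\to0}\eta_{1R}^\delta$. We now apply Lemma 7.1 to $\eta_{1R}$ with $A = H_0(T)-2\lambda_+(T)$. The result is

Proposition 7.2. The limit $\lim_{\delta\to0}\eta_{1R}^\delta = \eta_{1R}$ exists in $\mathcal S'$ and
\[ \begin{aligned}
\eta_{1R}(t) ={}& -\lambda e^{-2i\lambda_+(T)t}\,\bar\alpha_0(t)\beta_1^2(t)\,(H_0(T)-2\lambda_+(T)-i0)^{-1}P_c(T)\begin{pmatrix}1\\0\end{pmatrix}\psi_{1*}^2\psi_{0*}\\
&+ \lambda\bar\alpha_0(0)\beta_1^2(0)\,e^{-iH_0(T)t}(H_0(T)-2\lambda_+(T)-i0)^{-1}P_c(T)\begin{pmatrix}1\\0\end{pmatrix}\psi_{1*}^2\psi_{0*}\\
&+ \lambda e^{-iH_0(T)t}(H_0(T)-2\lambda_+(T)-i0)^{-1}\cdot\int_0^te^{i[H_0(T)-2\lambda_+(T)]s}P_c(T)\begin{pmatrix}1\\0\end{pmatrix}\psi_{1*}^2\psi_{0*}\,\frac{d}{ds}\bigl(\bar\alpha_0(s)\beta_1^2(s)\bigr)\,ds\\
={}& \eta_{1Ra} + \eta_{1Rb} + \eta_{1Rc}.
\end{aligned} \tag{7.46} \]
Here the phase $e^{-2i\lambda_+(T)t}$ in the first term is the one produced by Lemma 7.1 applied to (7.43), and is the phase needed to cancel $e^{2i\lambda_+(T)t}$ in (7.40).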
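We substitute (7.46) into the key term (7.40) in (7.39).
\[ 2\lambda\langle\psi_{0*}\psi_{1*}^2,\pi_1\Phi_2[\eta]\rangle\bar\beta_1\alpha_0e^{2i\lambda_+(T)t} = 2\lambda\langle\psi_{0*}\psi_{1*}^2,\pi_1\eta_{1Ra}\rangle\bar\beta_1\alpha_0e^{2i\lambda_+(T)t} + R. \tag{7.47} \]
Here, $R$ denotes rapidly dispersively decaying terms plus higher-order terms in $|\alpha_0\beta_1|$. By Proposition 7.2 and the Plemelj formula (A.3)
\[ \begin{aligned}
2\lambda\langle\psi_{0*}\psi_{1*}^2,\pi_1\eta_{1Ra}\rangle\bar\beta_1\alpha_0e^{2i\lambda_+(T)t}
&= -2\lambda^2\Bigl\langle\psi_{0*}\psi_{1*}^2,\ \pi_1(H_0(T)-2\lambda_+(T)-i0)^{-1}P_c(T)\begin{pmatrix}1\\0\end{pmatrix}\psi_{0*}\psi_{1*}^2\Bigr\rangle|\alpha_0|^2|\beta_1|^2\beta_1\\
&= -2(\Lambda+i\Gamma)|\alpha_0|^2|\beta_1|^2\beta_1,
\end{aligned} \tag{7.48} \]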
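where (using that $E_0+2E_{10} = E_0+2(E_1-E_0) = \omega_* + O(|\alpha_0(T)|^2)$, with $\omega_* = 2E_{1*}-E_{0*}$ as in (7.4)) we have
\[ \Lambda = \Lambda_{\omega_*}O_0^{(0)} = \lambda^2\bigl\langle\psi_{0*}\psi_{1*}^2,\ \text{P.V.}\,(H-\omega_*)^{-1}\psi_{0*}\psi_{1*}^2\bigr\rangle O_0^{(0)}, \tag{7.49} \]
\[ \Gamma = \Gamma_{\omega_*}O_0^{(0)} = \lambda^2\pi\bigl\langle\psi_{0*}\psi_{1*}^2,\ \delta(H-\omega_*)\psi_{0*}\psi_{1*}^2\bigr\rangle O_0^{(0)}. \tag{7.50} \]
Recall that $O_0^{(0)}$ denotes a term of the form $1+O(|\alpha_0|^2)$. Returning to the equation for $\beta_1$ we have:
\[ i\partial_t\beta_1 = (\Lambda_{\omega_*}-i\Gamma_{\omega_*})|\alpha_0|^2|\beta_1|^2\beta_1 + \cdots. \tag{7.51} \]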
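We now seek the key terms in the $\alpha_0$ equation, (6.23). Using that $\phi_1 = \Psi_1+\phi_2$ and the representation (7.37), we have
\[ \begin{aligned}
i\partial_t\alpha_0 ={}& \lambda\langle\psi_{0*}^2,\psi_{1*}^2\rangle e^{-2i\lambda_+(T)t}\beta_1^2\bar\alpha_0 + 2\lambda\langle\psi_{0*}^2\psi_{1*},\phi_2\rangle e^{-i\lambda_+(T)t}\bar\alpha_0\beta_1 + \lambda\langle\psi_{0*}^2,\phi_2^2\rangle\bar\alpha_0\\
&+ 2\lambda\langle\psi_{0*}\psi_{1*}^2,\phi_2\rangle|\beta_1|^2 + \lambda\langle\psi_{0*}\psi_{1*}^2,\bar\phi_2\rangle\beta_1^2e^{-2i\lambda_+(T)t} + \lambda\langle\psi_{0*}\psi_{1*},\phi_2^2\rangle\bar\beta_1e^{i\lambda_+(T)t}\\
&+ 2\lambda\langle\psi_{0*}\psi_{1*},|\phi_2|^2\rangle\beta_1e^{-i\lambda_+(T)t} + \lambda\langle\psi_{0*},|\phi_2|^2\phi_2\rangle + R_{\alpha_0}.
\end{aligned} \tag{7.52} \]
We first focus on the key resonant term in (7.52), which is responsible for the system settling onto the nonlinear ground state. We claim this term is:
\[ \lambda\langle\psi_{0*}\psi_{1*}^2,\bar\phi_2\rangle\beta_1^2e^{-2i\lambda_+(T)t}. \tag{7.53} \]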
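In analogy with the previous calculation, we have
\[ \lambda\langle\psi_{0*}\psi_{1*}^2,\bar\phi_2\rangle\beta_1^2e^{-2i\lambda_+(T)t} = \lambda\langle\psi_{0*}\psi_{1*}^2,\pi_2\eta_{1Ra}\rangle\beta_1^2e^{-2i\lambda_+(T)t} + R, \tag{7.54} \]
where $R$ is as above. Therefore, applying Proposition 7.2 we get
\[ \lambda\langle\psi_{0*}\psi_{1*}^2,\pi_2\Phi_2\rangle\beta_1^2e^{-2i\lambda_+(T)t} = (-\Lambda_{\omega_*}+i\Gamma_{\omega_*})|\beta_1|^4\alpha_0 + R. \tag{7.55} \]
The expression $R$ denotes terms which are of higher order, of oscillatory type, or dispersively decaying. See also Remark 7.2. In summary, we have the following system for $\alpha_0$ and $\beta_1$:

Proposition 7.3.
\[ \begin{aligned}
i\partial_t\alpha_0 &= (-\Lambda_{\omega_*}+i\Gamma_{\omega_*})|\beta_1|^4\alpha_0 + R,\\
i\partial_t\beta_1 &= 2(\Lambda_{\omega_*}-i\Gamma_{\omega_*})|\alpha_0|^2|\beta_1|^2\beta_1 + R.
\end{aligned} \tag{7.56} \]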
The proof of Proposition 7.1 follows by constructing an appropriate near-identity change of variables, transforming (7.56) to (7.3). This is implemented as in [49].

8. Stability Analysis on Different Time Scales

Overview
In Corollary 7.1 we obtained coupled power equations or nonlinear master equations governing the (renormalized) ground state and excited state square amplitudes:
\[ P_0 = |\tilde\alpha_0|^2 \quad\text{and}\quad P_1 = |\tilde\beta_1|^2. \tag{8.1} \]
If we neglect the correction terms $R_0$ and $R_1$ in (7.6) and (7.7) we obtain the simpler autonomous system of differential equations:
\[ \frac{dp_0}{dt} = 2\Gamma p_0p_1^2, \tag{8.2} \]
\[ \frac{dp_1}{dt} = -4\Gamma p_0p_1^2. \tag{8.3} \]
Note that this system is exactly solvable. Addition of twice (8.2) to (8.3) yields that along any solution trajectory:
\[ 2p_0(t) + p_1(t) = 2p_0(0) + p_1(0). \tag{8.4} \]
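As a worked illustration of how the conservation law (8.4) reduces the system to a single separable equation, write $c$ for the conserved quantity and $K$ for the constant of integration (both our shorthand, not notation from the paper). A sketch of the resulting quadrature:

```latex
\[ c \equiv 2p_0(0)+p_1(0), \qquad p_0 = \tfrac12\,(c-p_1)
   \quad\Longrightarrow\quad
   \frac{dp_1}{dt} = -2\Gamma\,(c-p_1)\,p_1^2 . \]
\[ \frac{1}{(c-p_1)\,p_1^2}
   = \frac{1}{c\,p_1^2} + \frac{1}{c^2\,p_1} + \frac{1}{c^2\,(c-p_1)}
   \quad\Longrightarrow\quad
   -\frac{1}{c\,p_1(t)} + \frac{1}{c^2}\,\ln\frac{p_1(t)}{c-p_1(t)} = -2\Gamma t + K . \]
\[ \text{As } t\to\infty:\qquad
   p_1(t) \sim \frac{1}{2\Gamma c\,t}, \qquad p_0(t)\to\frac{c}{2} . \]
```

The last line is consistent with the $\langle t\rangle^{-1}$ rate of convergence to the $p_0$ axis noted in Remark 7.2.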
This relation can be used to eliminate p1 from (8.2) or p0 from (8.3). p0 (t) and p1 (t) are thus obtained by quadrature. The dynamics of this finite dimensional reduced system anticipates that an initial state, arbitrarily close to but not exactly on the excited state branch, with energy distributed among the ground state and excited state, will evolve to a state with an increased ground state energy and no energy in the excited state. While not strictly correct, since there are nongeneric data giving rise to solutions which converge to the excited state [55], this captures the generic very large time dynamics. The correction terms R0 and R1 in (7.6) and (7.7) lead to different transient behaviors which may be quite different from that suggested by the system (8.2) and (8.3). However, we show that eventually (t ≥ t1 ), this system
dominates. Moreover, a large class of data for which the system (8.2) and (8.3) controls the behavior is that for which $P_0(0)>P_1(0)$ and the initial dispersive part is sufficiently small.

Before embarking on the details we give a brief overview of the strategy. Using the change of variables $(\alpha,\beta)\mapsto(\tilde\alpha,\tilde\beta)$ of Proposition 7.1 we have transformed away all local in time nonresonant terms. This introduces contributions to $F_\alpha$ and $F_\beta$, and therefore contributions to $R_0$ and $R_1$ in Eqs. (7.6) and (7.7), which are of two types:

(i) local in time terms depending on $\tilde\alpha_0$ and $\tilde\beta_1$, which can be absorbed by the leading terms in (7.6) and (7.7), with a small correction to the coefficient $\Gamma$, and are of order $b_0\langle t\rangle^{-2}$; see Proposition 9.1 below.

(ii) nonlocal in time functions of $\tilde\alpha_0$ and $\tilde\beta_1$ defined in terms of $\eta = \eta[\eta_0,\eta_1,\eta_2]$ in $F_\alpha$ and $F_\beta$. These contribute terms to (7.3) with the same (anticipated) time-decay rate as the leading order terms in (7.3). Correspondingly, there are nonlocal in time functions of $\tilde\alpha_0$ and $\tilde\beta_1$ which contribute to $R_j$ in Eqs. (7.6) and (7.7) and which are of the same (anticipated) decay rate as the leading order terms in (7.6) and (7.7).

The goal is to control these nonlocal terms, to the extent possible, by the leading order terms. However, due to the different behaviors of $\tilde\alpha_0$ and $\tilde\beta_1$ on different time scales the argument is somewhat tricky and we now explain our strategy. Let
\[ t_0+1 \equiv \sup_{\tau\ge0}\Bigl\{\tau:\ 0\le\tau'\le\tau,\ P_0(\tau')\le\frac{5[\eta_0]_X+E_0}{2\langle\tau-1\rangle\langle\tau'\rangle^2}\Bigr\}. \tag{8.5} \]
Propositions 11.2–12.1 and 13.1 will justify this choice, by implying the inequalities (8.7). If $t_0<\infty$, then we have the bound:
\[ P_0(t)\le\frac{5[\eta_0]_X+E_0}{2\langle t_0\rangle\langle t\rangle^2},\qquad 0\le t\le t_0. \tag{8.6} \]
Consider the system for $P_0(t)$ and $P_1(t)$, (7.6) and (7.7). Decomposing $R_j$ into local and nonlocal in time parts we have:
\[ \begin{aligned}
R_j &= R_j^l(P_0,P_1,\eta_0,\eta) + \int_0^tK(t,s)\,r_j^{nl}(s)\,ds\\
&= R_j^l(P_0,P_1,\eta_0,\eta) + \int_0^{t_0}K(t,s)\,r_j^{nl}(s)\,ds + \int_{t_0}^tK(t,s)\,r_j^{nl}(s)\,ds.
\end{aligned} \]
The terms $r_j^{nl}$ arise from nonlocal in time functions of $\tilde\alpha_0$ and $\tilde\beta_1$ (see (ii) above); explicit expressions of this type are analyzed in Sec. 11. For the local in time contributions we have the estimates:
\[ \begin{aligned}
R_0^l(P_0,P_1,\eta_0,\eta) &\le O(E_0^{\rho_0})P_0P_1^2 + \frac{b_0}{\langle t\rangle^2},\\
R_1^l(P_0,P_1,\eta_0,\eta) &\le O(E_0^{\rho_1})P_0P_1^2 + \frac{b_1}{\langle t\rangle^2} + O\bigl(\sqrt{P_0}\,P_1^m\bigr),
\end{aligned} \tag{8.7} \]
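where $E_0$ is the initial total energy and $\rho_j>0$. Therefore, for $E_0$ small
\[ \begin{aligned}
\frac{dP_0}{dt} &\ge 2\Gamma'P_0P_1^2 - \frac{b_0}{\langle t\rangle^2} + \int_{t_0}^tK_0(t,s)\,r_0^{nl}(s)\,ds,\\
\frac{dP_1}{dt} &\le -4\Gamma'P_0P_1^2 + \frac{b_1}{\langle t\rangle^2} + \int_{t_0}^tK_1(t,s)\,r_1^{nl}(s)\,ds + O\bigl(\sqrt{P_0}\,P_1^m\bigr),
\end{aligned} \]
with $\Gamma'\sim\Gamma$. The reverse inequalities hold with $\Gamma'$ replaced by $\Gamma'' = \Gamma + o(\Gamma)$. Next (Proposition 9.2) we introduce the auxiliary quantities
\[ Q_0 = P_0 - k_0\langle t\rangle^{-1} \quad\text{and}\quad Q_1 = P_1 + k_1\langle t\rangle^{-1}, \tag{8.8} \]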
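where $k_0(b_0,b_1,E_0)$ and $k_1(b_0,b_1,E_0)$ are chosen appropriately, and derive equations of the form
\[ \begin{aligned}
\frac{dQ_0}{dt} &\ge 2\Gamma''Q_0Q_1^2 + \int_{t_0}^tK_0(t,s)\,r_0^{nl}(s)\,ds,\\
\frac{dQ_1}{dt} &\le -4\Gamma''Q_0Q_1^2 + \int_{t_0}^tK_1(t,s)\,r_1^{nl}(s)\,ds + O\bigl(\sqrt{Q_0}\,Q_1^m\bigr),
\end{aligned} \tag{8.9} \]
where $\Gamma''\sim\Gamma'\sim\Gamma$. We then proceed with the following continuity argument. At $t=t_0$,
\[ \begin{aligned}
\frac{dQ_0(t_0)}{dt} &\ge 2\Gamma''Q_0(t_0)Q_1^2(t_0),\\
\frac{dQ_1(t_0)}{dt} &\le -4\Gamma''Q_0(t_0)Q_1^2(t_0) + O\bigl(\sqrt{Q_0(t_0)}\,Q_1^m(t_0)\bigr).
\end{aligned} \tag{8.10} \]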
Therefore, by continuity, the following inequalities hold for some time interval $t_0\le t\le t_{0,1}$, with $\Gamma''$ replaced by $\Gamma''/2$:
\[ \frac{dQ_0}{dt}\ge\Gamma''Q_0Q_1^2,\qquad \frac{dQ_1}{dt}\le-2\Gamma''Q_0Q_1^2 + O\bigl(\sqrt{Q_0}\,Q_1^m\bigr). \tag{8.11} \]
Let $t_*\equiv\sup\{t\ge t_0:\ \text{inequalities (8.11) hold}\}$. We show that
\[ \int_{t_0}^tK_1(t,s)\,r_1^{nl}(s)\,ds \le O(E_0^\rho)\Bigl(Q_0Q_1^2 + \frac{b_j}{\langle t\rangle^2}\Bigr) \tag{8.12} \]
and therefore, up to renormalization of $Q_j$ (adding higher-order terms of order $E_0^\rho k_j\langle t\rangle^{-1}$ to the definition of $Q_j$), use of this estimate in (8.9) implies (8.11), for $E_0$ sufficiently small. The argument can be repeated and therefore $t_* = T$.
9. Finite Dimensional Reduction and Its Analysis on Different Time Scales

We now begin our study of the generic case, where $t_0<\infty$ and the solution converges to the nonlinear ground state family as $t\to\infty$. The following three propositions concern the various time scales which enter the analysis. The first is a basic result, a normal form, which is the point of departure for our analysis on all time scales.

Proposition 9.1. Let $m\ge4$. Let
b0 = ht0 i−1 [η0 ]X + c∗ E02 ,
2 1 3 b1 = ht0 i− 2 [η0 ]X + d∗ E02 ,
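(9.1) (9.2)
for some order one constants $c_*$ and $d_*$. If for some $t_0$, positive and finite,
\[ P_0(t_0)\ge\frac{3b_0(t_0,[\eta_0]_X)}{\langle t_0\rangle}, \tag{9.3} \]
then for $t\ge t_0$
\[ \frac{dP_0}{dt}\ge2(1-\delta_1)\Gamma P_0P_1^2 - \frac{b_0}{\langle t\rangle^2} + J_0, \tag{9.4} \]
\[ \frac{dP_1}{dt}\le-4(1-\delta_1)\Gamma P_0P_1^2 + O\bigl(\sqrt{P_0}\,P_1^m\bigr) + \frac{b_1}{\langle t\rangle^2} + J_1, \tag{9.5} \]
where $J_0$ and $J_1$ are nonlocal in time terms, which have the form:
\[ J_j = \int_{t_0}^tK(t,s)\,r_j^{nl}(s)\,ds. \tag{9.6} \]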
The terms encompassed in Jj are derived and estimated in the coming sections.
Remark 9.1. The reverse inequalities of (9.4) and (9.5) hold as well with a different constant $\delta_2\sim\delta_1$.

The proof of Proposition 9.1 will be given following the estimates on the remainder terms, $R_i$ (Proposition 11.1).

Proposition 9.2. Assume that $t_0$ is positive and finite as in Proposition 9.1. Then, there exist $k_0 = k_0(b_0,b_1,E_0)$ and $k_1 = k_1(b_0,b_1,E_0)$, such that for $t\ge t_0$ the auxiliary functions
\[ Q_0(t)\equiv P_0(t)-\frac{k_0}{\langle t\rangle},\qquad Q_1(t)\equiv P_1(t)+\frac{k_1}{\langle t\rangle} \tag{9.7} \]
satisfy
\[ \begin{aligned}
\frac{dQ_0}{dt} &\ge 2\Gamma'Q_0Q_1^2 + J_0 + \frac{E_0^2c_*}{\langle t_0\rangle\langle t\rangle^2},\\
\frac{dQ_1}{dt} &\le -4\Gamma'Q_0Q_1^2 + O\bigl(\sqrt{Q_0}\,Q_1^m\bigr) + J_1 - \frac{E_0^2d_*}{\langle t_0\rangle^{1/2}\langle t\rangle^2},
\end{aligned} \tag{9.8} \]
for some positive free constants $c_*$ and $d_*$. In particular, $Q_0(t)$ is monotonically increasing for $t\ge t_0$.

The next result shows that $c_*$ and $d_*$ can be chosen to control the terms $J_j$.

Proposition 9.3 (Monotonicity of $Q_0$ for $\tau\ge t_0$). There exist $c_*$ and $d_*$ of order one, such that for $t\ge t_0$
\[ \begin{aligned}
\frac{dQ_0}{dt} &\ge 2\Gamma'Q_0Q_1^2,\\
\frac{dQ_1}{dt} &\le -4\Gamma'Q_0Q_1^2 + O\bigl(\sqrt{Q_0}\,Q_1^m\bigr).
\end{aligned} \tag{9.9} \]
The above three propositions are established in the next two sections. We complete the current section by working out the consequences of the finite dimensional reduction (9.9). The next proposition shows that even if $Q_0$ is very small at some stage, it will eventually become large relative to $Q_1$ and the $O(\sqrt{Q_0}\,Q_1^m)$ term in (9.8) will become negligible.

Proposition 9.4. Assume $t\ge t_0$ and suppose for some $r>0$ that
\[ \frac{Q_0}{Q_1}\le E_0^r. \tag{9.10} \]
Then,
\[ \frac{Q_0(t)}{Q_1(t)}\ \text{is increasing for } t\ge t_0 \tag{9.11} \]
and there exists $t_1$ such that
\[ \frac{Q_0(t_1)}{Q_1(t_1)} = E_0^r. \tag{9.12} \]
Furthermore, for $t\ge t_1$:
\[ \frac{dQ_0}{dt}\ge2\Gamma'Q_0Q_1^2,\qquad \frac{dQ_1}{dt}\le-4\Gamma'Q_0Q_1^2. \tag{9.13} \]
Finally, for $t\ge t_1$:
\[ Q_1(t)\le\frac{Q_1(t_1)}{1+4\Gamma''Q_1(t_1)\bigl(\inf_{[t_1,T]}|Q_0|\bigr)\cdot(t-t_1)}. \tag{9.14} \]
Proof of Proposition 9.2. Define
\[ Q_0\equiv P_0-k_0\langle t\rangle^{-1},\qquad Q_1\equiv P_1+k_1\langle t\rangle^{-1}, \tag{9.15} \]
where $k_0, k_1>0$ are to be appropriately chosen. Note that
\[ Q_0(t)\le P_0(t) \quad\text{and}\quad Q_1(t)\ge P_1(t). \tag{9.16} \]
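Using (9.4) and (9.5) and some estimation we deduce a simplified system for $Q_0$ and $Q_1$. We calculate, omitting the terms $J_0$ and $J_1$, which are carried along passively. We begin with $Q_0$.
\[ \begin{aligned}
\frac{dQ_0}{dt} &= \frac{dP_0}{dt}+\frac{k_0}{\langle t\rangle^2}\\
&\ge 2\Gamma P_0P_1^2-\frac{b_0}{\langle t\rangle^2}+\frac{k_0}{\langle t\rangle^2}\\
&= 2\Gamma\Bigl(Q_0+\frac{k_0}{\langle t\rangle}\Bigr)\Bigl(Q_1-\frac{k_1}{\langle t\rangle}\Bigr)^2-\frac{b_0}{\langle t\rangle^2}+\frac{k_0}{\langle t\rangle^2}\\
&\ge 2\Gamma Q_0Q_1^2-4\Gamma\frac{k_1}{\langle t\rangle}Q_0Q_1+2\Gamma\frac{k_1^2}{\langle t\rangle^2}Q_0+\frac{k_0-b_0}{\langle t\rangle^2}.
\end{aligned} \tag{9.17} \]
We estimate the second term on the right-hand side as follows. For any $s>0$,
\[ -4\Gamma\frac{k_1}{\langle t\rangle}Q_0Q_1\ge-2s\Gamma Q_0Q_1^2-2\Gamma Q_0\frac{k_1^2}{s\langle t\rangle^2}. \tag{9.18} \]
Therefore,
\[ \frac{dQ_0}{dt}\ge2\Gamma(1-s)Q_0Q_1^2+\Bigl[2\Gamma k_1^2Q_0\Bigl(1-\frac1s\Bigr)+k_0-b_0\Bigr]\frac{1}{\langle t\rangle^2}. \tag{9.19} \]
Now set $s=\tfrac{1}{10}$ and assume
\[ k_1\ge b_1. \tag{9.20} \]
Then, using that $k_1 = 10b_1$, we have
\[ \frac{dQ_0}{dt}\ge2\cdot\frac{9}{10}\Gamma Q_0Q_1^2+\bigl[k_0-b_0-18\Gamma b_1^2Q_0\bigr]\frac{1}{\langle t\rangle^2}. \tag{9.21} \]
If
\[ k_0\equiv b_0+18\Gamma b_1^2\sup_{t_0\le t\le T}Q_0+\frac{E_0^2c_*}{\langle t_0\rangle}, \tag{9.22} \]
we have by (9.21) and (9.28)
\[ \frac{dQ_0}{dt}\ge2\cdot\frac{9}{10}\Gamma Q_0Q_1^2+\frac{E_0^2c_*}{\langle t_0\rangle\langle t\rangle^2}. \tag{9.23} \]
In particular, $Q_0$ is increasing for $t\ge t_0$. We now turn to $Q_1$; we have
\[ \begin{aligned}
\frac{dQ_1}{dt} &= \frac{dP_1}{dt}-\frac{k_1}{\langle t\rangle^2}\\
&\le -4\Gamma P_0P_1^2+O\bigl(\sqrt{P_0}\,P_1^m\bigr)+\frac{b_1}{\langle t\rangle^2}-\frac{k_1}{\langle t\rangle^2}\\
&= -4\Gamma\Bigl(Q_0+\frac{k_0}{\langle t\rangle}\Bigr)\Bigl(Q_1-\frac{k_1}{\langle t\rangle}\Bigr)^2+\frac{b_1-k_1}{\langle t\rangle^2}+O\bigl(\sqrt{P_0}\,P_1^m\bigr)\\
&\le -4\Gamma Q_0\Bigl(Q_1-\frac{k_1}{\langle t\rangle}\Bigr)^2+\frac{b_1-k_1}{\langle t\rangle^2}+O\bigl(\sqrt{P_0}\,P_1^m\bigr)\\
&= -4\Gamma Q_0Q_1^2+8\Gamma\frac{k_1}{\langle t\rangle}Q_0Q_1-4\Gamma Q_0\frac{k_1^2}{\langle t\rangle^2}+\frac{b_1-k_1}{\langle t\rangle^2}+O\bigl(\sqrt{P_0}\,P_1^m\bigr).
\end{aligned} \tag{9.24} \]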
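The second term on the right-hand side is estimated as follows. For any $r>0$ we have, since $2ab\le ra^2+r^{-1}b^2$, that
\[ 8\Gamma\frac{k_1}{\langle t\rangle}Q_0Q_1\le\frac{4\Gamma k_1rQ_0}{\langle t\rangle^2}+\frac{4\Gamma k_1Q_0Q_1^2}{r}. \tag{9.25} \]
Therefore,
\[ \frac{dQ_1}{dt}\le-4\Gamma\Bigl(1-\frac{k_1}{r}\Bigr)Q_0Q_1^2+4\Gamma k_1Q_0\frac{r-k_1}{\langle t\rangle^2}+\frac{b_1-k_1}{\langle t\rangle^2}+O\bigl(\sqrt{P_0}\,P_1^m\bigr). \tag{9.26} \]
Let
\[ r\equiv20k_1 \quad\text{and}\quad b_1\equiv k_1(1-76\Gamma k_1Q_0)-\frac{E_0^2d_*}{\langle t_0\rangle^{1/2}}, \tag{9.27} \]
which is consistent with the constraint (9.20). Then,
\[ \frac{dQ_1}{dt}\le-4\cdot\frac{19}{20}\Gamma Q_0Q_1^2-\frac{E_0^2d_*}{\langle t_0\rangle^{1/2}\langle t\rangle^2}+O\bigl(\sqrt{P_0}\,P_1^m\bigr). \tag{9.28} \]
By (9.8), $Q_0$ is increasing for $t\ge t_0$. Therefore,
\[ Q_0(t_0)\ge\frac{1}{100}\frac{k_0}{\langle t_0\rangle}\ \Longrightarrow\ P_0(t)\le Q_0(t)+\frac{k_0}{\langle t\rangle} = Q_0(t)+100\,\frac{k_0}{100\langle t\rangle}\le Q_0(t)+100\,Q_0(t_0)\le101\,Q_0(t). \]
Also, by definition $P_1\le Q_1$ and so
\[ \frac{dQ_1}{dt}\le-4\cdot\frac{19}{20}\Gamma Q_0Q_1^2+O\bigl(\sqrt{Q_0}\,Q_1^m\bigr). \]
This completes the proof of Proposition 9.2.

Proof of Proposition 9.4.
\[ \begin{aligned}
\frac{d}{dt}\frac{Q_0}{Q_1} = \frac{\dot Q_0Q_1-\dot Q_1Q_0}{Q_1^2} &\ge\frac{1}{Q_1^2}\Bigl(2\Gamma Q_0Q_1^2\,Q_1+4\Gamma Q_0^2Q_1^2-CQ_0^{3/2}Q_1^m\Bigr)\\
&\ge\frac{1}{Q_1^2}\Bigl(2\Gamma Q_0Q_1^3-CQ_0^{1/2}Q_1^{-1/2}Q_1^{m+1/2}+4\Gamma Q_0^2Q_1^2\Bigr).
\end{aligned} \tag{9.29} \]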
$Q_0/Q_1$
≤ E0r then implies that d Q0 1 r/2 m+1/2 ≥ 2 2ΓQ0 Q31 − E0 Q1 + 4ΓQ20 Q21 dt Q1 Q1 2 Q0 Q0 ≥Γ Q21 + 4ΓQ21 Q1 Q1 Q0 Q21 , ≥Γ Q1
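(9.30)

for $m + \frac{1}{2} > 3$ and $|Q_1| \le E_0 \ll 1$. Hence
$$
\left.\frac{Q_0}{Q_1}\right|_{t} \ge \left.\frac{Q_0}{Q_1}\right|_{t_0}\exp\left(\Gamma\int_{t_0}^{t}Q_1^2(s)\,ds\right).
$$
Since $Q_1 > 0$, either $Q_1 \downarrow 0$ or $Q_0/Q_1$ grows exponentially with $t \ge t_0$. In either case, there exists $t_1 \ge t_0$ such that, for $t = t_1$, $\left.\frac{Q_0}{Q_1}\right|_{t_1} = E_0^r$. Now, whenever
$$
\frac{Q_0}{Q_1} \ge E_0^r, \tag{9.31}
$$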
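we have
$$
\begin{aligned}
\frac{dQ_1}{dt} &\le -4\Gamma Q_0Q_1^2 + O\left(\sqrt{Q_0}\,Q_1^m\right) \\
&\le -4\Gamma Q_0Q_1^2 + Q_0Q_1^{m-1/2}E_0^{-r/2} \\
&\le -4\Gamma Q_0\left(Q_1^2 - CE_0^{-r/2}E_0^{m-5/2}Q_1^2\right) \\
&\le -4\Gamma Q_0Q_1^2
\end{aligned} \tag{9.32}
$$
for $m > \frac{5}{2} + \frac{r}{2}$. Therefore, by (9.31) we have $\left.\frac{dQ_1}{dt}\right|_{t_1} < 0$. Since $Q_0$ is increasing for $t \ge t_1 \ge t_0$, the inequality $Q_0/Q_1 \ge E_0^r$ persists and (9.32) holds for all $t \ge t_1$. Hence
$$
\frac{dQ_1}{dt} \le -4\Gamma'Q_0Q_1^2 \quad\text{and}\quad \frac{dQ_0}{dt} \ge 2\Gamma'Q_0Q_1^2. \tag{9.33}
$$
Finally, for $t \ge t_1$
$$
\frac{dQ_1}{dt} \le -4\cdot\frac{19}{20}\Gamma\left(\inf_{[t_0,T]}Q_0\right)Q_1^2. \tag{9.34}
$$
Solving this scalar inequality gives
$$
Q_1(t) \le \frac{Q_1(t_1)}{1 + 4\Gamma''Q_1(t_1)|Q_0(t_1)|(t-t_1)}. \tag{9.35}
$$
Since $P_1 \le Q_1$, $P_1(t)$ decays to zero like $\langle t-t_1\rangle^{-1}$. This completes the proof of Proposition 9.4.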
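The comparison argument behind (9.33)–(9.35) can be illustrated numerically. The sketch below is only a toy model, not the system analyzed in the paper: it assumes that the inequalities in (9.33) hold with equality and with a single constant `gamma`, and it drops all correction terms. The function names `integrate_two_mode` and `comparison_bound` are hypothetical. Since $Q_0$ is nondecreasing in this model, $Q_1$ should stay below the solution of the scalar comparison equation, and $2Q_0 + Q_1$ should be conserved.

```python
def integrate_two_mode(q0, q1, gamma, t_end, steps):
    """Classical RK4 for the toy system
    dQ0/dt = 2*gamma*Q0*Q1**2,  dQ1/dt = -4*gamma*Q0*Q1**2.
    Returns a list of (t, Q0, Q1) tuples, starting at t = 0."""
    def rhs(a, b):
        flux = gamma * a * b * b
        return 2.0 * flux, -4.0 * flux

    h = t_end / steps
    t = 0.0
    traj = [(t, q0, q1)]
    for _ in range(steps):
        k1 = rhs(q0, q1)
        k2 = rhs(q0 + 0.5 * h * k1[0], q1 + 0.5 * h * k1[1])
        k3 = rhs(q0 + 0.5 * h * k2[0], q1 + 0.5 * h * k2[1])
        k4 = rhs(q0 + h * k3[0], q1 + h * k3[1])
        q0 += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        q1 += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += h
        traj.append((t, q0, q1))
    return traj


def comparison_bound(q0_start, q1_start, gamma, elapsed):
    """Solution of dy/dt = -4*gamma*q0_start*y**2 with y(0) = q1_start,
    i.e. the right-hand side of a bound of the form (9.35)."""
    return q1_start / (1.0 + 4.0 * gamma * q0_start * q1_start * elapsed)


if __name__ == "__main__":
    gamma, q0, q1 = 1.0, 0.1, 0.5  # illustrative values only
    for t, a, b in integrate_two_mode(q0, q1, gamma, 200.0, 20000)[::4000]:
        print(t, a, b, comparison_bound(q0, q1, gamma, t))
```

In this toy model the $1/(t - t_1)$ decay of $Q_1$ and the saturation of $Q_0$ are both visible in the printed trajectory; the constants carry no meaning for the actual problem.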
Selection of the Ground State for Nonlinear Schr¨ odinger Equations
10. Decomposition and Estimation of the Dispersion

In this section, we revisit the decomposition of the dispersive part, $\eta$, which satisfies Eq. (6.48). Here, we decompose $\eta$ in a manner suitable for consideration of the solution on the various time scales.

Proposition 10.1.
$$
\eta(t) = \eta_0(t) + e_0(t) + \eta_b(t) = e^{i\sigma_3\left[\int_t^T\left(E_0(t')-E_0(T)\right)dt' + \tilde\Theta(t)\right]}\left(\tilde\eta_0(t) + \tilde e_0(t) + \tilde\eta_b(t)\right). \tag{10.1}
$$
The three terms can be described as follows:

(i) $\eta_0(t)$ is a dispersive wave generated by the data $\eta(0) = \eta_{in}$:
$$
(i\partial_t - H_0)\tilde\eta_0 = 0, \qquad \eta(0) = \eta_{in}, \tag{10.2}
$$
(ii) $e_0(t)$ is driven principally by $\eta_0(t)$:
$$
(i\partial_t - H_0)\tilde e_0 = P_cS^{(0)}[e_0,\eta_b;\eta_0], \qquad e_0(0) = 0, \tag{10.3}
$$
and (iii) $\eta_b(t)$, which is driven by bound state dynamics:
$$
(i\partial_t - H_0)\tilde\eta_b = P_cS^{(b)}[e_0,\eta_b;\eta_0], \qquad \eta_b(0) = 0. \tag{10.4}
$$
We display expressions for S (0) and S (b) with the detail required in our analysis. We let χ denote a generic exponentially localized function of position, x. ei[
RT t
˜ (E0 (t0 )−E0 (T ))dt0 +Θ(t)] (0)
(0)
S (0)
(0)
≡ S0 + S1 e0 + S2 e20 + e30 ∼ α20 (t) − α20 (T ) χη0 + η03 + α0 α1 χη0 + α0 χη02 + α0 α1 + α21 χe0 + (α0 + α1 )χ(η0 + ηb )e0 + η0 ηb e0 + α20 (t) − α20 (T ) χe0 + (α0 + α1 )χe20 + (η0 + ηb )e20 + e30
ei[
RT t
˜ (E0 (t )−E0 (T ))dt +Θ(t)] 0
(b)
0
(b)
S (b)
(b)
≡ S0 + S1 ηb + S2 ηb2 + ηb3 ∼ α20 (t) − α20 (T ) χηb ~ ~ 0 Ψ0 (t) · ∂t α ∇1 Ψ1 (t) · ∂t α ~ 1 (t) + ∇ ~ 0 (t) + iPc (T ) c.c. Ψ1 (t) + E1 (t) − E0 (t) Pc (T )σ3 c.c.
(10.5)
+ |α0 |2 |α1 | + |α0 ||α1 |2 χ
+ α0 α1 χηb + (α0 + α1 ) χηb2 + η0 + e0 ηb2
+ |α0 |χη0 ηb + (η02 + e20 )ηb + ηb3 .
(10.6)
Remark 10.1. Due to the decay of $E_0(t) - E_0(T)$ and $\partial_t\tilde\Theta(t)$, integration by parts implies that the phase factors multiplying $S^{(0)}$ and $S^{(b)}$ contribute only at higher, negligible order. Also note that $\eta_0$, $e_0$ and $\eta_b$ satisfy the same energy ($H^s$) and dispersive ($L^p$, local decay) estimates as $\tilde\eta_0$, $\tilde e_0$ and $\tilde\eta_b$, since these functions differ only by a time-dependent phase. These properties will be used repeatedly below and in Secs. 11–13. By (5.22) we have the following $H^1$ bounds on $e_0$ and $\eta_b$ in terms of one another:
$$
\|e_0(t)\|_{H^1} \le E_0 + \|\eta_0\|_{H^1} + \|\eta_b(t)\|_{H^1} \tag{10.7}
$$
$$
\|\eta_b(t)\|_{H^1} \le E_0 + \|\eta_0\|_{H^1} + \|e_0(t)\|_{H^1}. \tag{10.8}
$$
Using Duhamel's formula, both the $e_0$ and $\eta_b$ equations can be written as equivalent integral equations:
$$
e_0(t) = -i\int_0^t e^{-iH_0(t-s)}P_cS^{(0)}(s)\,ds \tag{10.9}
$$
$$
\eta_b(t) = -i\int_0^t e^{-iH_0(t-s)}P_cS^{(b)}(s)\,ds. \tag{10.10}
$$
Therefore, in both cases we must estimate an expression of the form:
$$
w(t) = \int_0^t e^{-iH_0(t-s)}P_cS(s)\,ds. \tag{10.11}
$$
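As a sanity check on the sign and phase conventions in a Duhamel representation of the type (10.9), one can test a scalar caricature. In the sketch below the operator $H_0$ is replaced by a real number `omega`, the projection $P_c$ is dropped, and the source is the hypothetical choice $S(s) = e^{-s}$; none of this is taken from the paper. The quadrature value of $-i\int_0^t e^{-i\omega(t-s)}S(s)\,ds$ is compared with the closed-form solution of $i\,w' = \omega w + S$, $w(0) = 0$.

```python
import cmath
import math


def duhamel(omega, source, t, n=20000):
    """Trapezoid approximation of -i * int_0^t exp(-i*omega*(t-s)) * source(s) ds."""
    h = t / n
    total = 0j
    for j in range(n + 1):
        s = j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(-1j * omega * (t - s)) * source(s)
    return -1j * h * total


def closed_form(omega, t):
    """Exact solution of i*w' = omega*w + exp(-t), w(0) = 0 (scalar toy model)."""
    return -1j * (math.exp(-t) - cmath.exp(-1j * omega * t)) / (1j * omega - 1.0)


if __name__ == "__main__":
    omega, t = 2.0, 3.0  # illustrative values only
    print(duhamel(omega, lambda s: math.exp(-s), t))
    print(closed_form(omega, t))
```

The two printed values should agree to within the quadrature error, which confirms that the factor $-i$ and the sign in the exponent are mutually consistent in the scalar setting.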
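For estimation, we shall require the following class of dispersive estimates:

Proposition 10.2. (i) Let $2 \le p < \infty$, $p'$, $q \ge 2$, $q'$ and $s$ be related by:
$$
3\left(\frac{1}{q} - \frac{1}{p}\right) = s, \qquad p^{-1} + (p')^{-1} = 1, \qquad q^{-1} + (q')^{-1} = 1. \tag{10.12}
$$
Then,
$$
\|e^{-iH_0t}P_cf\|_p \le |t|^{-3\left(\frac{1}{2}-\frac{1}{q}\right)}\langle t\rangle^{-3\left(\frac{1}{q}-\frac{1}{p}\right)}\left(\|f\|_{p'} + \|\partial^sf\|_{q'}\right). \tag{10.13}
$$
(ii) Assume $q \ge 2$ and $s > 3/q$. Then,
$$
\|e^{-iH_0t}P_cf\|_\infty \le |t|^{-3\left(\frac{1}{2}-\frac{1}{q}\right)}\langle t\rangle^{-\frac{3}{q}}\left(\|f\|_1 + \|\partial^sf\|_{q'}\right). \tag{10.14}
$$
Proof of Proposition 10.2. We use the classical Sobolev inequality for functions defined on $\mathbb{R}^3$:
$$
\|f\|_p \le C\,\|\partial^sf\|_q, \qquad 3(q^{-1} - p^{-1}) = s \tag{10.15}
$$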
and the $L^p$–$L^{p'}$ estimate (Theorem 4.2)
$$
\|e^{-iH_0t}P_cf\|_p \le C_{1,p}\,t^{-\frac{3}{2}+\frac{3}{p}}\|f\|_{p'}, \qquad p^{-1} + (p')^{-1} = 1. \tag{10.16}
$$
For $|t| \ge 1$, we use (10.16). For $|t| \le 1$, we have by (10.15) that
$$
\|e^{-iH_0t}P_cf\|_p \le \|\partial^se^{-iH_0t}P_cf\|_q \le C\,\|H_0^{s/2}e^{-iH_0t}P_cf\|_q \le C\,|t|^{-3\left(\frac{1}{2}-\frac{1}{q}\right)}\|D^sf\|_{q'}, \tag{10.17}
$$
$q^{-1} + (q')^{-1} = 1$. From estimates (10.16) and (10.17) we obtain (10.13). Estimate (10.14) is obtained similarly. This completes the proof of Proposition 10.2.

We now apply Proposition 10.2 with $q = 4$ and $s = 1 > 3/4$ to the integral equation (10.11) and obtain the bound:
$$
\|w(t)\|_\infty \le \int_0^t\frac{1}{|t-s|^{\frac{3}{4}}\langle t-s\rangle^{\frac{3}{4}}}\left[\|S(s)\|_1 + \|\partial S(s)\|_{\frac{4}{3}}\right]ds. \tag{10.18}
$$
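The exponents appearing in (10.18) follow from (10.14) by arithmetic alone, which the following small sketch reproduces with exact rationals. The helper name `dispersive_exponents` is hypothetical; it simply evaluates $3(\tfrac12 - \tfrac1q)$, $3/q$ and the dual exponent $q' = q/(q-1)$.

```python
from fractions import Fraction


def dispersive_exponents(q):
    """Return (short-time exponent, long-time exponent, dual exponent q')
    for the L^infinity estimate (10.14) with a given q >= 2."""
    q = Fraction(q)
    short_time = 3 * (Fraction(1, 2) - 1 / q)  # power of |t|
    long_time = 3 / q                          # power of <t>
    q_dual = q / (q - 1)                       # from 1/q + 1/q' = 1
    return short_time, long_time, q_dual


if __name__ == "__main__":
    short_time, long_time, q_dual = dispersive_exponents(4)
    print(short_time, long_time, q_dual)
    # s = 1 must exceed 3/q for part (ii) of Proposition 10.2 to apply
    print(1 > long_time)
```

For $q = 4$ both powers equal $3/4$, so the kernel in (10.18) has an integrable singularity at $s = t$ and decays like $|t-s|^{-3/2}$ for large $|t-s|$, and the derivative is measured in $L^{4/3}$.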
We shall use this bound as the first step in estimating $\|e_0(t)\|_\infty$ and $\|\eta_b(t)\|_\infty$. More specifically, our estimation strategy seeks a closed system of inequalities for the following:

norms of $e_0$: 1. $[e_0]_{\infty;\frac{3}{2}}$, 2. $[\partial e_0]_{L^2_{loc};\frac{3}{2}}$, 3. $[e_0]_{H^1;0}$

norms of $\eta_b$: 4. $[\eta_b]_{\infty;0}$, 5. $[\eta_b]_{H^1;0}$,

in terms of norms of the initial data $[\phi(0)]_X$.

10.1. Estimation of $\|e_0\|_\infty$

By (10.18), we have
$$
\|e_0(t)\|_\infty \le \int_0^t\frac{1}{|t-s|^{\frac{3}{4}}\langle t-s\rangle^{\frac{3}{4}}}\left[\|S^{(0)}(s)\|_1 + \|\partial S^{(0)}(s)\|_{\frac{4}{3}}\right]ds, \tag{10.19}
$$
so it suffices to bound $\|S^{(0)}\|_1$ and $\|\partial S^{(0)}\|_{\frac{4}{3}}$.

$\|S^{(0)}\|_1$: For any $t \ge 0$,
(0)
3
(0)
3
kS1 e0 k1 ≤ Chsi− 2 [e0 ]∞; 23 3
√
3 E0 [η0 (0)]2X ≤ C(E0 )D(η(0))hsi− 2 1 1 E0 + E02 [ηb ]∞;0 + E02 [η0 ]∞;0 + [η0 ]2∞; 3
kS0 k1 ≤ Chsi− 2 E0 [η0 (0)]X +
2
3
+ Chsi− 2 [e0 ]∞; 32 [η0 ]∞; 23 [ηb ]∞;0 ≤ Chsi− 2 √ (0) kS2 e20 k1 ≤ Chsi−3 [e0 ]2∞; 3 C E0 + [η0 ]∞; 23 + [ηb ]∞;0 2
3
ke30 k1 ≤ Chsi− 2 [e0 ]∞; 23 [e0 ]2H 1 ;0 .
(10.20)
k∂S (0) k 34 : For any t ≥ 0, 3 1 1 3 (0) 2 k∂S0 k 34 ≤ Chsi− 2 E0 [∂η0 ]∞; 32 + E02 [η0 ]∞; 23 [η0 ]H 1 ;0 + E04 [η0 ]∞, 3 2 1 3 3 (0) k∂S1 e0 k 43 ≤ Chsi− 2 E0 [η0 ]∞; 32 + E0 [e0 ]L2loc ; 32 + Chsi− 2 E02 [ηb ]∞;0 [η0 ]∞; 23 1 1 + E02 [e0 ]∞, 32 [ηb ]H 1 ;0 + E02 [ηb ]∞;0 [∂e0 ]L2loc ; 32 1 1 3 (0) (10.21) k∂S2 e20 k 43 ≤ Chsi− 2 [e0 ]∞; 32 E02 [e0 ]∞; 32 + E02 [e0 ]H 1 ;0 3 + Chsi− 2 [e0 ]∞; 32 [η0 ]H 1 ;0 + [ηb ]H 1 ;0 [e0 ]H 1 ;0 3 1 2 2 + [e0 ]∞; 3 [e0 ]L2 ;0 [η0 ]H 1 ;0 + [ηb ]H 1 ;0 2
3
k∂e30 k 34 ≤ Chsi− 2 [e0 ]∞; 32 [e0 ]2H 1 ;0 .
Using the bounds (10.20) and (10.21) in (10.19) implies Proposition 10.3. 1
3
ke0 (t)k∞ ≤ hti− 2 C(E0 , [η0 (0)]X ) + E02 [ηb (t)]∞;0 + [η0 (0)]X [ηb (t)]∞;0 [e0 ]∞; 23 1
3
+ hti− 2 E02 + [η0 (0)]X + [e0 (t)]2H 1 ;0 + [ηb (t)]∞; 32
[e0 ]∞; 23 + [e0 ]2∞; 3
1 3 + hti− 2 [e0 ]∞; 32 [e0 ]2H 1 ;0 + E0 [e0 ]L2loc ; 32 + E02 [e0 ]2∞; 3 . 2
2
(10.22)
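10.2. Estimation of $[\partial e_0(t)]_{L^2_{loc};\frac{3}{2}}$

$$
\begin{aligned}
\|\partial e_0(t)\|_{L^2_{loc}} &\sim \|\chi\partial e_0(t)\|_2 \quad (\chi \text{ localized}) \\
&\sim \|\chi\langle H_0\rangle^{\frac{1}{2}}e_0(t)\|_2 \quad \left(\partial\langle H_0\rangle^{-\frac{1}{2}} \in B(L^2)\right) \\
&\le \int_0^t\|\chi\langle H_0\rangle^{\frac{1}{2}}e^{-iH_0(t-s)}P_cS^{(0)}(s)\|_2\,ds \quad \text{(by (10.9))}.
\end{aligned} \tag{10.23}
$$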
For the purpose of continuing the estimation, we regard $S^{(0)}$ as consisting of terms of two types: (i) terms having spatially localized (exponentially) functions of $x$ as a factor, coming from $\Psi_j$, $j = 0, 1$, and (ii) terms like $e_0^3$ or $\eta_b\eta_0e_0$ and others which are not of this type. It is convenient to refer to such terms as $S^{(0)}_{LOC}$ and $S^{(0)}_{NLOC}$ below. From (10.23) we have:
$$
\|\partial e_0(t)\|_{L^2_{loc}} \le \int_0^t\|\chi\langle H_0\rangle^{\frac{1}{2}}e^{-iH_0(t-s)}P_cS^{(0)}_{LOC}(s)\|_2\,ds
$$
+ ≤
Z
Z t
t−1
+ 0
C
t t−1
1
ht − si 2
(0)
kχhH0 i 2 e−iH0 (t−s) Pc SNLOC (s)k2 ds (0)
3
0
Z
khxiσ SLOC kds
+ + ≤
Z
Z Z
t−1 0
+C
(0)
1
khH0 i 2 e−iH0 (t−s) Pc SNLOC (s)k2 ds C
0t0
(0)
1
ke−iH0 (t−s) Pc hH0 i 2 SNLOC (s)k∞ ds
t t−1
(0)
3
ht − si 2 Z t−1
khxiσ SLOC kds 1
(0)
k∂SNLOC (s)k∞ ds +
3
|t − s| 2
0
Z
t t−1
(0)
k∂SNLOC (s)k2 ds
= A +B+C.
(10.24)
We estimate the integrands of B and C; that of A is similar. 3 |integrand of B| ≤ Chti− 2 [e0 ]∞; 23 + [η0 ]∞; 32 [e0 ]2H 1 ;0 + [ηb ]2H 1 ;0 + [η0 ]2H 1 ;0
(10.25)
3
|integrand of C| ≤ hti− 2 [e0 ]2∞; 3 [e0 ]H 1 ;0 + [ηb ]H 1 ;0 2
+ [e0 ]∞; 23 [η0 ]∞; 23 [ηb ]H 1 ;0
+ [η0 ]W 1,∞ ; 32 [ηb ]∞;0 [e0 ]H 1 ;0 + [η0 ]∞; 32 [ηb ]H 1 ;0 + [e0 ]∞; 23 [e0 ]H 1 ;0 + [η0 ]2W 1,∞ ; 3 [e0 ]H 1 ;0 .
2
Substituting these bounds into (10.24), we have: Proposition 10.4. 3
k∂e0 (t)kL2loc ; 32 ≤ Chti− 2 [e0 ]∞; 23 + [η0 (0)]X
[e0 ]∞; 32 + [e0 ]H 1 ;0 .
(10.26)
10.3. Estimation of [e0 (t)]H 1 ;0 1
ke0 (t)kH 1 ∼ kH02 e0 (t)k2 ≤ (0)
Z
t 0
k∂S (0) (s)k2 ds
(10.27)
1
3
k∂S0 k2 ≤ Chsi− 2 E0 [η0 ]W 1,∞ ; 32 + E02 [η0 ]2∞; 3 2
+ [η0 ]H 1 ;0 [η0 ]2∞; 3 2 (1)
+ E0 [η0 ]W 1,∞ ; 23
3
k∂S0 e0 k2 ≤ Chsi− 2 · · · (2)
3
1
1
k∂S0 e20 k2 ≤ Chsi− 2 E02 [e0 ]2∞; 3 + E02 [e0 ]∞; 32 [η0 ]H 1 ;0 2
+ Chsi
− 32
[e0 ]2∞; 3 2
[η0 ]H 1 ;0 + [ηb ]H 1 ;0
3
+ Chsi− 2 [e0 ]∞; 32 [η0 ]W 1,∞ ; 32 ([η0 ]2;0 + [ηb ]2;0 ) k∂e30 k2 ≤ Chsi−3 [e0 ]2∞; 3 [e0 ]H 1 ;0 . 2
Proposition 10.5. For t ≥ 0, ke0 (t)kH 1 ≤ C(E0 , [η0 (0)]X ) + [e0 ]∞; 23 [η0 (0)]X + [e0 ]H 1 ;0 .
(10.28)
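We now turn to the estimation of $\|\eta_b(t)\|_\infty$ and $\|\eta_b(t)\|_{H^1}$. By (10.18) we have
$$
\|\eta_b(t)\|_\infty \le \int_0^t\frac{1}{|t-s|^{\frac{3}{4}}\langle t-s\rangle^{\frac{3}{4}}}\left[\|S^{(b)}(s)\|_1 + \|\partial S^{(b)}(s)\|_{\frac{4}{3}}\right]ds, \tag{10.29}
$$
where $S^{(b)}$ is given by (10.6).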
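Proposition 10.6. In terms of $Q_j$, $j = 0, 1$ we have
$$
S^{(b)} = S_0^{(b)} + S_1^{(b)}\eta_b + S_2^{(b)}\eta_b^2 + \eta_b^3 \tag{10.30}
$$
where
$$
\begin{aligned}
S_0^{(b)} &\sim Q_0^{\frac{1}{2}}Q_1\chi + Q_0Q_1^{\frac{1}{2}}\chi \\
S_1^{(b)}\eta_b &\sim Q_0^{\frac{1}{2}}Q_1^{\frac{1}{2}}\chi\eta_b + (\eta_0^2 + e_0^2)\eta_b + Q_0^{\frac{1}{2}}\chi\eta_0\eta_b \\
S_2^{(b)}\eta_b^2 &\sim \left(Q_0^{\frac{1}{2}} + Q_1^{\frac{1}{2}}\right)\chi\eta_b^2 + (\eta_0 + e_0)\eta_b^2.
\end{aligned}
$$
We now proceed with estimates of $S^{(b)}$ in $L^1$ and in $W^{1,\frac{4}{3}}$. Beginning with $\|S^{(b)}\|_1$ we have:
$$
\begin{aligned}
\|S^{(b)}\|_1 &\le C\left[Q_0^{\frac{1}{2}}Q_1 + Q_0Q_1^{\frac{1}{2}} + Q_0^{\frac{1}{2}}Q_1^{\frac{1}{2}}\|\eta_b\|_\infty + \|\eta_b\|_\infty\left(\|\eta_0\|_2^2 + \|e_0\|_2^2\right)\right] \\
&\quad + C\left[Q_0^{\frac{1}{2}}\|\eta_0\|_\infty\|\eta_b\|_2 + \left(Q_0^{\frac{1}{2}} + Q_1^{\frac{1}{2}}\right)\|\eta_b\|_\infty^2 + \|\eta_b\|_\infty\left(\|\eta_0\|\,\|\eta_b\|_2 + \|e_0\|_2\|\eta_b\|_2\right)\right] \\
&\quad + C\|\eta_b\|_\infty\|\eta_b\|_2^2.
\end{aligned} \tag{10.31}
$$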
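We now turn to $\|\partial S^{(b)}\|_{\frac{4}{3}}$:
$$
\begin{aligned}
\|\partial S^{(b)}\|_{\frac{4}{3}} &\le C\left(Q_0^{\frac{1}{2}}Q_1 + Q_0Q_1^{\frac{1}{2}} + Q_0^{\frac{1}{2}}Q_1^{\frac{1}{2}}\|\eta_b\|_{H^1}\right) \\
&\quad + C\left(\|(\eta_0^2 + e_0^2)\partial\eta_b\|_{\frac{4}{3}} + \|\eta_0\partial\eta_0\eta_b\|_{\frac{4}{3}} + Q_0^{\frac{1}{2}}\|\eta_b\|_2\left(\|\eta_0\|_\infty + \|\partial\eta_0\|_\infty\right)\right) \\
&\quad + C\left(Q_0^{\frac{1}{2}}\|\eta_0\|_\infty\|\partial\eta_b\|_2 + \|(\eta_0 + e_0)\eta_b\partial\eta_b\|_{\frac{4}{3}} + \|(\partial\eta_0 + \partial e_0)\eta_b^2\|_{\frac{4}{3}} + \|\eta_b^2\partial\eta_b\|_{\frac{4}{3}}\right).
\end{aligned} \tag{10.32}
$$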
To further estimate k∂S (b) k 34 we use the following estimates of individual terms: 1
kη02 ∂ηb k 34 ≤ kη0 k∞ kη0 k22 kηb kH 1 1
3
2 ke20 ∂ηb k 34 ≤ kη0 k∞ ke0 k22 kηb kH 1
1
1
2 kη0 k22 kηb k2 kη0 ∂η0 ηb k 34 ≤ k∂η0 k∞ kη0 k∞ 1
1
1 2
1 2
2 kη0 ηb ∂ηb k 34 ≤ kηb k∞ kη0 k∞ kη0 k22 kηb kH 1
ke0 ηb ∂ηb k ≤ kηb k∞ ke0 k∞ ke0 k2 kηb k 4 3
(10.33)
H1
1
2 k∂η0 k4 kηb k2 k∂η0 ηb2 k 34 ≤ kηb k∞ k∂η0 k∞ 3
1
2 k∂e0 ηb2 k 34 ≤ ke0 kH 1 kηb k∞ kηb k22
kηb2 ∂ηb k 34 ≤ kηb k∞ k∂ηb k2 kηb k2 .
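Recall that $\|\eta_b(t)\|_{H^1}$ can be estimated in terms of $\|e_0(t)\|_{H^1}$; see (10.8). Since $\eta_b$ is driven by the bound state amplitudes ($Q_0$ and $Q_1$), which have different behaviors on the intervals $I_j$, we now estimate $\eta_b(t)$ separately on $I_0 = [0, t_0]$, $I_1 = [t_0, t_1]$ and $I_2 = [t_1, \infty)$. We now introduce appropriate norms on different time scales. Define
$$
\begin{aligned}
M_{I_0}(t) &\equiv \sup_{0\le t\wedge t_0}\langle t\rangle^{\frac{3}{2}}\|e_0(t)\|_\infty + \sup_{0\le t\wedge t_0}\langle t\rangle\|\eta_b(t)\|_\infty + \sup_{0\le t\wedge t_0}\|e_0(t)\|_{H^1} \\
&\quad + \sup_{0\le t\wedge t_0}\langle t\rangle^{\frac{3}{2}}\|\partial e_0(t)\|_{L^2_{loc}} + \sup_{0\le t\wedge t_0}\langle t\rangle^2|Q_0(t)| + \sup_{0\le t\wedge t_0}|Q_1(t)|
\end{aligned} \tag{10.34}
$$
$$
\begin{aligned}
M_{I_1}(t) &\equiv \sup_{t_0\le t\wedge t_1}\langle t\rangle^{\frac{3}{2}}\|e_0(t)\|_\infty + \sup_{t_0\le t\wedge t_1}\|\eta_b(t)\|_\infty + \sup_{t_0\le t\wedge t_1}\|e_0(t)\|_{H^1} \\
&\quad + \sup_{t_0\le t\wedge t_1}\langle t\rangle^{\frac{3}{2}}\|\partial e_0(t)\|_{L^2_{loc}} + \sup_{t_0\le t\wedge t_1}|Q_0(t)| + \sup_{t_0\le t\wedge t_1}|Q_1(t)|
\end{aligned} \tag{10.35}
$$
$$
\begin{aligned}
M_{I_2}(t) &\equiv \sup_{t_1\le t\wedge T}\langle t\rangle^{\frac{3}{2}}\|e_0(t)\|_\infty + \sup_{t_1\le t\wedge T}\langle t-t_1\rangle^{\frac{1}{2}}\|\eta_b(t)\|_\infty + \sup_{t_1\le t\wedge T}\|e_0(t)\|_{H^1} \\
&\quad + \sup_{t_1\le t\wedge T}\langle t\rangle^{\frac{3}{2}}\|\partial e_0(t)\|_{L^2_{loc}} + \sup_{t_1\le t\wedge T}|Q_0(t)| \\
&\quad + \sup_{t_1\le t\wedge T}\langle t-t_1\rangle\,\Gamma'|Q_0(t_1)||Q_1(t_1)||Q_1(t)|.
\end{aligned} \tag{10.36}
$$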
Remark 10.2. By Propositions 10.3–10.5, the e0 contributions to the norms MIk are controlled in terms of the initial conditions. Therefore, to control MIk , it suffices to bound Qj and ηb . The above estimates can be used together with the bounds on Qj of Sec. 9 to obtain the following three propositions, which give bounds for ηb on the intervals I0 , I1 and I2 .
Proposition 10.7. ($\eta_b(t)$ for $t \in I_0$) Assume $t \in I_0 = [0, t_0]$, i.e. $Q_0(t) \le C([\eta_0]_X + E_0)\langle t\rangle^{-2}$ for $t \in [0, t_0]$. Then, for $\|\phi(0)\|_X$ sufficiently small
$$
\|\eta_b(t)\|_\infty \le C\left(\|\phi(0)\|_X, M_{I_0}(t_0)\right)\langle t\rangle^{-1}. \tag{10.37}
$$
Proposition 10.8. ($\eta_b(t)$ for $t \in I_1$) Assume $t \in I_1 = [t_0, t_1]$. Then, for $\|\phi(0)\|_X$ sufficiently small
$$
\|\eta_b(t)\|_\infty \le C\left(\|\phi(0)\|_X, M_{I_0}(t_0), M_{I_1}(t_1)\right). \tag{10.38}
$$
Proposition 10.9. ($\eta_b(t)$ for $t \in I_2$) Assume $t \in I_2 = [t_1, T]$. Then, for $\|\phi(0)\|_X$ sufficiently small
$$
\|\eta_b(t)\|_\infty \le C\left(\|\phi(0)\|_X, M_{I_0}(t_0), M_{I_1}(t_1), M_{I_2}(T)\right)\langle t-t_1\rangle^{-\frac{1}{2}}. \tag{10.39}
$$
In our estimates of Sec. 11, we shall use the following result to estimate the size of correction terms in the system for $Q_0$ and $Q_1$ for $t \ge t_0$, where $Q_0$ is monotonically increasing.

Proposition 10.10. Let $\zeta = \zeta(x, t)$, $x \in \mathbb{R}^3$, $t \ge t_0$, with $\zeta(x, t_0) = 0$ satisfy the following dispersive equation
$$
i\partial_t\zeta = H_0\zeta + P_c\left(S_1(t) + S_2(t)\zeta + \chi\zeta^2 + \zeta^3\right), \tag{10.40}
$$
where for all $k \ge 0$ and $j = 1, 2$:
$$
\|S_j(t)\|_{H^k} = O\left(C([\phi(0)]_X)\,Q_0^{\frac{1}{2}}(t)\right), \qquad \partial_tQ_0 \ge 0 \text{ for } t \ge t_0, \text{ and } Q_0 \le E_0. \tag{10.41}
$$
Suppose kζ(t)kH 1 ≤ C([φ(0)]X , MI1 (t1 ), MI2 (T )) for all t0 ≤ t ≤ T, where [φ(0)]X is sufficiently small. Then, kζ(t)k∞ ≤ C [φ(0)]X , MI1 (t1 ) , MI2 (T ) Q0 (t)1/2 . ¯ Corollary 10.1. Let p > 6. E0m = supt0 ≤t 1 (t0 ), then Z t 1 (s) 1/2 ds . kζkp ≤ cQ0 (t) ht − si3/2−3/p 0
Corollary 10.2. Let p > 6. 1/2
|hχ, ζi| ≤ cQ0 (t)
Z
t 0
1 (s) ds . ht − si3/2−3/p
which follows by using kχe−iHt Pc χgk ¯ 2 ≤ chti−3/2 kgk2 . Corollary 10.3. 1/2
k∂ k ζk∞ ≤ CQ0 (t)
Z
t 0
1 (s) ds . ht − si3/2
Proof. This result follows from applying ∂ k to the equation for ζ and estimating, as above, in Lp for any α and p > 6. By the Sobolev inequality this implies control derivatives in L∞ . Remark 10.3. In our applications of Proposition 10.10 and its corollaries, 1 (s) 1
will be given by the source terms depending on Q0 and Q12 . For t > t1 , Q1 = O(ht − t1 i−1 ). Therefore, since the lowest order term in Q1 contributing to 1 (s) is O(Q1 ) (see Eq. (7.32)), it follows that for t > t1 3
1
kηb kW k,∞ = O(ht − t1 i− 2 ) + O(ht − t1 i−1 ) + O(ht − t1 i− 2 ) .
(10.42)
Since η = O(ηb ) + O(η0 ), the conclusion of the main theorem for large t, t > t1 , follows. Namely, for 1
kηkW k,∞ = O(ht − t1 i− 2 ) + e−iH0 t Pc [η(0) + E02 ] ,
(10.43)
where the non-free wave part is coming from spatially-localized source terms.

11. Beginning of Proof of Proposition 9.1

The key to Proposition 9.1 is the following more detailed version of Corollary 7.1:

Proposition 11.1. Let $t \ge 0$. The equations for $P_0$, $P_1$ can be written in the following form
$$
\begin{aligned}
\frac{dP_0}{dt} &= 2\Gamma P_0P_1^2 + R_0^{(0)}[\eta_0] + R_1^{(0)}[\eta_b] + R_2^{(0)}[P_0, P_1] \\
\frac{dP_1}{dt} &= -4\Gamma P_0P_1^2 + R_0^{(1)}[\eta_0] + R_1^{(1)}[\eta_b] + R_2^{(1)}[P_0, P_1]
\end{aligned} \tag{11.1}
$$
where

(i) $R_0^{(i)}[\eta_0]$ are $\eta_0$-dependent terms only, both local and nonlocal in time, and may also depend on $\tilde\alpha_0$, $\tilde\beta_1$.

(ii) $\eta_b$ is the bound state driven part of the dispersion; see Sec. 5.

(iii) $R_2^{(0)}$ depends only on $P_0$, $P_1$, $t$, but not on $\eta$; it is formally linear in $P_0$, of high order in $P_1$, and contains both local and nonlocal in time terms.

(iv) $R_2^{(1)}$ depends only on $P_0$, $P_1$, $t$, but not on $\eta$. It is of high order in $P_1$, local and nonlocal terms included, but has terms which are linear in $\tilde\alpha_0 \sim \sqrt{P_0}$.

The proof uses repeated application of near-identity transformations of the variables $\tilde\alpha_0$, $\tilde\beta_1$, derivable by integration by parts (see the discussion of resonant and nonresonant terms in Sec. 7) and the decomposition of $\eta$ given in (10.1).

Remark 11.1. As noted in Sec. 10, additional terms which arise when replacing $\eta_0$, $e_0$ and $\eta_b$ by $\tilde\eta_0$, $\tilde e_0$ and $\tilde\eta_b$ are treated perturbatively. This is the case because integration by parts of these terms gives rise to an extra factor from the derivative of the phase, which is $O(E_0(t) - E_0(T)) + O(\partial_t\tilde\Theta) = O(t^{-1})$. Throughout Secs. 11–13 we shall use this fact as well as the fact that dispersive and energy estimates of
these sets of functions are the same because they differ by a purely time-varying phase. The proof is long, so we break it up into three parts, which are presented in three different sections. The following is an overview.

Part 1: The terms $R_0^{(0)}[\eta_0]$ and $R_0^{(1)}[\eta_0]$ are forcing terms in the ODE dynamics, which are driven by the dispersive part of the initial conditions. They are studied and estimated in this section; Proposition 11.2.

Part 2: The terms $R_1^{(0)}[\eta_b]$ and $R_1^{(1)}[\eta_b]$ are studied and estimated in Sec. 13; Proposition 13.1.

Part 3: The terms $R_2^{(0)}[P_0, P_1]$ and $R_2^{(1)}[P_0, P_1]$ are studied and estimated in Sec. 12; Proposition 12.1.

In Parts 1–3, we require estimates for all $t \ge 0$. On $I_0 = \{t : 0 \le t \le t_0\}$ we use the a priori bound on $P_0(t)$, implied by the definition of $t_0$; see Eq. (8.5). For $t > t_0$, we use the monotonicity property Q.

Monotonicity property Q
$$
Q_0 \text{ and } \frac{Q_0}{Q_1} \text{ are monotonically increasing}, \tag{11.2}
$$
where $Q_0$ and $Q_1$ are the modified bound state energies related to $P_0$ and $P_1$ (see (9.7), (9.1), (9.2)). This monotonicity property is shown to hold at $t = t_0$ and is then shown to continue for all time, $t$, by a continuity argument; see Sec. 14.

Since there are many terms, we focus on those which are most problematic, namely, those which are nonlocal in time and of slowest time-decay rate. These calculations are very lengthy and before embarking on them we present a calculation, related to the normal form discussion in Sec. 7, which is repeated in order to exploit rapid oscillations in time.

Expansion of oscillatory integrals, resonances and improved time decay

In deriving and estimating the terms $R_j^{(i)}$ in (11.1), we must frequently expand and/or estimate terms of the form:
$$
\left\langle\chi_1, \int_0^te^{-iH_0(t-s)}P_c\chi_2(s)e^{i\Omega_1s}\,ds\right\rangle. \tag{11.3}
$$
Here, $\chi_1$ and $\chi_2(s)$ are localized functions of $x$, $\chi_1$ is independent of $s$ and $\chi_2(s)$ depends on $s$ through its dependence on $\alpha_0(s)$, $\beta_1(s)$ or $\eta_0(s)$. Recall that $H_0$, defined in (4.6), is of the form
$$
H_0 = \tilde H_0 + E_0\sigma_3 \tag{11.4}
$$
and $\tilde H_0 = \sigma_3(-\Delta)$ plus a matrix potential which decays to zero rapidly as $|x|$ tends to infinity. In (11.3) we would like to integrate by parts, exploiting the oscillation of frequency $E_0$. However, “peeling off” these oscillations is a little tricky because $\sigma_3$ does not commute with $\tilde H_0$. We handle this as follows, using Theorem 4.3.
Recall that
$$
ZH_0W = \sigma_3(-\Delta - E_0), \qquad Ze^{-iH_0t}P_cW = e^{-i\sigma_3(-\Delta-E_0)t} \tag{11.5}
$$
where $W$ denotes the wave operator
$$
W = \lim_{t\to\infty}e^{-iH_0t}e^{-i\sigma_3(-\Delta-E_0)t}, \tag{11.6}
$$
Z = W −1 . By (11.5), we can rewrite (11.3) as Z χ1 , e−iH0 t
t 0
W eiσ3 (−∆−E0 )s ZPc eiΩ1 s χ2 (s)ds
Z = χ1 , e−iH0 t Z = χ1 , e−iH0 t Z = χ1 , e−iH0 t Z = χ1 , e−iH0 t
t 0 t 0 t 0
W e−iσ3 (−∆−E0 )s ZPc eiΩ1 s χ2 (s)ds
∼ − χ1 ,
Z
0
t
W eiσ3 ∆s Z · W eiσ3 E0 s ZPc eiΩ1 s χ2 (s)ds
W eiσ3 ∆s Z · W ei(σ3 E0 +IΩ1 )s ZPc χ2 (s)ds
t
We 0
Z ∼ − χ1 , e−iH0 t
t 0
iσ3 ∆s
d Z · (−i) W (σ3 E0 + IΩ1 )−1 ei(σ3 E0 +IΩ1 )s ZPc χ2 (s)ds ds
W σ3 ∆eiσ3 ∆s Z · W (σ3 E0 + IΩ1 )−1 ei(σ3 E0 +IΩ1 )s ZPc χ2 (s)ds
˜ 0 (σ3 E0 + IΩ1 )−1 eiΩ1 s Pc χ2 (s)ds eiH0 (t−s) H
.
(11.7)
In the previous string of equations we have used the notation $f \sim g$ to mean equality up to terms which are local in time. Note that $\sigma_3E_0 + I\Omega_1$ is invertible since its determinant is $\Omega_1^2 + E_0^2$. We can therefore carry out this procedure any finite number of times to arrange, up to local in time terms, an expression which involves an operator of the form $\chi\exp(iH_0(t-s))\tilde H_0^k\chi$, $k \ge 1$, where $\chi$ is spatially localized. Therefore, the enhanced local decay estimate (4.43) of Theorem 4.3 applies. We shall use these observations, together with the detailed dependence of $\chi_2(s)$ on $\alpha_0$, $\beta_1$ etc., to control certain terms in $R_j^{(i)}$.

Part 1: Estimation of $R_0^{(0)}[\eta_0]$ and $R_0^{(1)}[\eta_0]$
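Proposition 11.2. Assume either $t \le t_0$ or monotonicity property Q on $[t_0, t_0 + \delta_*]$. Then,
$$
\begin{aligned}
\left|R_0^{(0)}[\eta_0]\right|, \left|R_0^{(1)}[\eta_0]\right| &\le b_i(t_0, [\eta_0]_X)\,\langle t\rangle^{-2} + O(E_0)\,2\Gamma P_0P_1^2 \\
b_0(t_0, [\eta_0]_X) &= O\left(\langle t_0\rangle^{-1}[\eta_0]_X\right) \\
b_1(t_0, [\eta_0]_X) &= O\left(\langle t_0\rangle^{-\frac{1}{2}}[\eta_0]_X^{2/3}\right).
\end{aligned} \tag{11.8}
$$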
Proof of Proposition 11.2.

Proof of estimate (11.8) for $R_0^{(0)}[\eta_0]$: The key terms are those in the $P_0$ equation, which are decaying most slowly with $t$. These are linear in $\eta_0$, since $\|\eta_0\|_\infty = O(t^{-3/2})$. We focus on the most difficult terms. These are nonlocal in $\eta_0(t)$. Recall Eq. (7.52) for $\alpha_0$ and that the equation for $P_0$, (7.6), is derived from the $\tilde\alpha_0$ equation (related to (7.52) by a near-identity change of variables) by multiplication by $\tilde\alpha_0$ and taking the imaginary part of the $\partial_t\alpha_0$ equation.

We consider the following representative “most problematic” terms in $R_0^{(0)}[\eta_0]$, whose estimation introduces the necessary methods for treating them all:
$$
(T1)\quad O(\tilde\alpha_0)\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\Psi_0(s)\Psi_1(s)\bar\eta_0(s)\,ds\right\rangle O\left(\tilde\alpha_0\tilde\beta_1e^{-i\lambda_+t} + |\tilde\beta_1|^2\right) \tag{11.9}
$$
(also with $\Psi_0\Psi_1\bar\eta_0$ replaced by $\Psi_0\bar\Psi_1\eta_0$, $\bar\Psi_0\Psi_1\eta_0$)
$$
(T2)\quad O(\tilde\alpha_0)O(\tilde\alpha_0\tilde\beta_1)\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\eta_b^2\eta_0\,ds\right\rangle. \tag{11.10}
$$
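Estimation of T1: Since the time-integral is bounded by $O(E_0)\langle t\rangle^{-3/2}[\eta_0]_X$, by the Cauchy–Schwarz inequality the second term in (11.9) is bounded as follows:
$$
\left|O(\tilde\alpha_0)O(|\tilde\beta_1|^2)\left\langle\chi, \int_0^t\cdots\right\rangle\right| \le O(E_0)P_0P_1^2 + O(E_0)\langle t\rangle^{-3}\|\eta_0\|_X^2. \tag{11.11}
$$
We now control the first term in (11.9), $O(\tilde\alpha_0\tilde\beta_1)O(\tilde\alpha_0)$. We argue that the key contribution from this term which must be bounded is of the form:
$$
O(\tilde\alpha_0)\,\tilde\alpha_0\tilde\beta_1e^{-i\lambda_+t}\,\frac{\partial}{\partial t}\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\Psi_0(s)\Psi_1(s)\bar\eta_0(s)\,ds\right\rangle. \tag{11.12}
$$
To see this, consider the term in the $\tilde\alpha_0$ equation which corresponds to the first term in (11.9):
$$
\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\Psi_0(s)\Psi_1(s)\bar\eta_0(s)\,ds\right\rangle O\left(\tilde\alpha_0\tilde\beta_1e^{-i\lambda_+t}\right). \tag{11.13}
$$
Next we integrate with respect to $t$, and integrate by parts, making use of the oscillatory exponential factor. The result is a boundary term, which can be subsumed in the definition of $\tilde\alpha_0$, by a near identity transformation, followed by a time-integral 0 to $t$. The latter contributes terms to the $P_0$ equation (which has been modified due to the slight redefinition of $\tilde\alpha_0$) of the following type:
$$
O(\tilde\alpha_0)O\left(\tilde\beta_1\partial_t\tilde\alpha_0 + \tilde\alpha_0\partial_t\tilde\beta_1\right)\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\Psi_0\Psi_1\bar\eta_0\,ds\right\rangle \tag{11.14}
$$
$$
O(\tilde\alpha_0)O(\tilde\alpha_0\tilde\beta_1)\,\frac{\partial}{\partial t}\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\Psi_0\Psi_1\bar\eta_0\,ds\right\rangle. \tag{11.15}
$$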
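Since $\partial_t\tilde\alpha_0 = O(|\tilde\alpha_0|^2|\tilde\beta_1|)$ and $\partial_t\tilde\beta_1 = O(|\tilde\alpha_0||\tilde\beta_1|^2)$, (11.14) is bounded by
$$
O(E_0)\,O\left(\tilde\alpha_0^3\tilde\beta_1^2\right)\langle t\rangle^{-\frac{3}{2}}[\eta_0]_X, \tag{11.16}
$$
where we have used
$$
\|\eta_0\|_\infty \le C[\eta_0]_X\langle t\rangle^{-3/2}. \tag{11.17}
$$
By the Cauchy–Schwarz inequality this is bounded by $O(E_0)\left(2\Gamma P_0P_1^2 + [\eta_0]_X^2\langle t\rangle^{-3}\right)$. Therefore, the contribution from this first term satisfies the estimate (11.8). Obtaining a bound on (11.15) is more involved. We use the local decay estimate of Theorem 4.3:
$$
\|\chi e^{-iH_0t}P_c\tilde H_0f\|_2 \le \langle t\rangle^{-5/2}\|\partial^k\langle x\rangle^\sigma f\|_2. \tag{11.18}
$$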
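The expression (11.15) is bounded by:
$$
\begin{aligned}
&\left|O(\tilde\alpha_0)O(\tilde\alpha_0\tilde\beta_1)\left[\langle\chi, \Psi_0\Psi_1\eta_0\rangle - i\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\tilde H_0\Psi_0\Psi_1\bar\eta_0\,ds\right\rangle\right]\right| \\
&\quad\le O(E_0)\left(\Gamma Q_0Q_1^2 + \langle t\rangle^{-3}\right) + O(\tilde\alpha_0^2\tilde\beta_1)[\eta_0]_X\int_0^t\frac{1}{\langle t-s\rangle^{5/2}\langle s\rangle^{\frac{3}{2}}}|\tilde\alpha_0||\tilde\beta_1|\,ds.
\end{aligned} \tag{11.19}
$$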
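The latter integral requires detailed estimation on different time scales. To estimate the last integral, we split the range of integration into three regions:
$$
I_0 \equiv \{s : 0 \le s \le t_0\}, \qquad I_1 \equiv \{s : t_0 < s \le t_1\}, \qquad I_2 \equiv \{s : t_1 < s \le T\}. \tag{11.20}
$$
Estimate on $I_0$: Assume $t \le t_0$. Recall that by (8.5), $|\tilde\alpha_0(s)| \le O(E_0)\langle s\rangle^{-1}$. Using this we have
$$
\begin{aligned}
&O(\tilde\alpha_0^2\tilde\beta_1)[\eta_0]_X\int_0^t\frac{1}{\langle t-s\rangle^{5/2}\langle s\rangle^{\frac{3}{2}}}|\tilde\alpha_0||\tilde\beta_1|\,ds \\
&\quad= O\left(|\tilde\alpha_0|^{\frac{3}{2}}\right)4\Gamma\,\tilde\alpha_0^{\frac{1}{2}}\tilde\beta_1\,\frac{[\eta_0]_X}{4\Gamma}\int_0^t\frac{1}{\langle t-s\rangle^{5/2}\langle s\rangle^{\frac{3}{2}}}|\tilde\alpha_0||\tilde\beta_1|\,ds \\
&\quad\le O\left(E_0^{\frac{3}{4}}\right)\left[2\Gamma P_0P_1^2 + [\eta_0]_X^{4/3}\langle t\rangle^{-\frac{5}{2}\cdot\frac{4}{3}}\right],
\end{aligned}
$$
where we have used that $ab \le a^4/4 + 3b^{\frac{4}{3}}/4$.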
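The elementary inequality used here is Young's inequality $ab \le a^p/p + b^q/q$ with conjugate exponents $p = 4$ and $q = 4/3$. As a quick numerical sanity check (a sketch only; the helper name `young_gap` is hypothetical), one can evaluate the gap between the two sides on a grid of nonnegative values and confirm that it vanishes when $a^4 = b^{4/3}$:

```python
def young_gap(a, b, p=4.0):
    """Return a**p/p + b**q/q - a*b for a, b >= 0, where 1/p + 1/q = 1.
    Young's inequality states that this quantity is nonnegative."""
    q = p / (p - 1.0)  # conjugate exponent; q = 4/3 when p = 4
    return a**p / p + b**q / q - a * b


if __name__ == "__main__":
    grid = [0.25 * k for k in range(21)]
    print(min(young_gap(a, b) for a in grid for b in grid))
    print(young_gap(2.0, 8.0))  # equality case: 2**4 == 8**(4/3)
```

With $p = 4$ the second term is $3b^{4/3}/4$, which is the form used in the estimate on $I_0$.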
Estimate on $I_1$: Let $t$ be such that $t_0 \le t \le t_1$. We break the integral into an integral over $[0, t_0]$ plus an integral over $[t_0, t]$. Recall the definitions of $Q_j(t)$, $j = 0, 1$ in terms of $|\tilde\alpha_j(t)|$ displayed in (9.7). Using that
• for $s \in I_0$, $|\tilde\alpha_0(s)| \le O(E_0)\langle s\rangle^{-1}$, and
• for $s \in I_1$, $Q_0$ is increasing and $Q_0(s) \le E_0^rQ_1(s)$,
we have Z
t0
Z t
1 |β˜ (s)| |˜ α0 (s)|ds 5/2 hsi3/2 1 ht − si 0 t0 Z t ˜ |β1 (s)||˜ α0 (s)| ds ≤ I0 type bound + O(|˜ α0 |2 |β˜1 |) [η0 ]X 5/2 hsi3/2 t0 ht − si Z t 1 3 ˜ ˜ ≤ I0 type bound + O(|˜ α0 | |β1 |kβ1 k∞ )[η0 ]X ds 5/2 hsi3/2 ht − si t0
O(|˜ α0 (t)|2 |β˜1 (t)|)[η0 ]X
+
1 3 α0 | · |˜ α0 | · |˜ α0 ||β˜1 | [η0 ]X hti− 2 ds = I0 type bound + O E02 |˜
r 1 3 = I0 type bound + O E0 · Q02 · E02 Q1 [η0 ]X hti− 2 r+2
= I0 type bound + O E0 2
r+2
= I0 type bound + O E0 2 which is a bound of the type in (11.8).
1
3
Q02 Q1 hti− 2
(Q0 Q21 + hti−3 ) ,
(11.21)
Estimate on I2 : |˜ α0 ||β˜1 | ds 5/2 hsi3/2 0 ht − si Z t0 Z t1 Z t 2˜ = O(˜ α0 β1 (t)[η0 ]X ) + +
O(˜ α20 β˜1 (t))[η0 ]X
Z
t
|˜ α0 ||β˜1 | ds ht − si5/2 hsi3/2 0 t0 t1 Z t |˜ α0 ||β˜1 | = I0 & I1 type bounds + O(˜ α20 β˜1 (t))[η0 ]X ds . (11.22) 5/2 hsi3/2 t1 ht − si
The latter integral must be treated differently from the previous terms. For $t \in I_1$, we used that $Q_0$ is monotonically increasing and bounded by a small constant times $Q_1$ to treat terms perturbatively. On $I_2$, $Q_0$ dominates $Q_1$ (which decays) and we must use a different argument. We return to the expression from which the last term in (11.22) is derived:
$$
O(\tilde\alpha_0)O(\tilde\alpha_0\tilde\beta_1)\left[\langle\chi, \Psi_0\Psi_1\eta_0\rangle - i\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\tilde H_0\Psi_0\Psi_1\bar\eta_0\,ds\right\rangle\right]. \tag{11.23}
$$
We need to expand and estimate the time integral:
$$
\int_{t_1}^te^{-iH_0(t-s)}P_c\tilde H_0\,\tilde\alpha_0(s)\tilde\beta_1(s)e^{-i\lambda_+s}\bar\eta_0(s)\,ds = e^{-iH_0t}\int_{t_1}^te^{i(H_0-\lambda_+)s}P_cH_0\,\tilde\alpha_0(s)\tilde\beta_1(s)\bar\eta_0(s)\,ds \tag{11.24}
$$
which we do using integration by parts. We carry this out, then take the inner product of the result with a localized function, $\chi$, and then finally multiply by
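$O(|\tilde\alpha_0|^2|\tilde\beta_1|)$. The result is of order
$$
\begin{aligned}
&|\tilde\alpha_0|^3|\tilde\beta_1|^2\|\eta_0(t)\|_\infty + |\tilde\alpha_0|^3|\tilde\beta_1|\left|\left\langle\chi, e^{-iH_0t}\tilde H_0\chi\right\rangle\right| \\
&\quad+ O(|\tilde\alpha_0|^2|\tilde\beta_1|)\left|\int_{t_1}^te^{-iH_0(t-s)}P_c(H_0-\lambda_+)^{-1}\tilde H_0\chi e^{-i\lambda_+s}\frac{d}{ds}\left[\tilde\alpha_0(s)\tilde\beta_1(s)\bar\eta_0(s)\right]ds\right|.
\end{aligned} \tag{11.25}
$$
The first two terms in (11.25) are bounded as follows:
$$
\begin{aligned}
|\tilde\alpha_0|^3|\tilde\beta_1|^2\|\eta_0(t)\|_\infty &= O(E_0)\left(|\tilde\alpha_0||\tilde\beta_1|^2\right)\|\eta_0(t)\|_\infty \\
&\le O(E_0)\left(Q_0Q_1^2 + \|\eta_0(t)\|_\infty^2\right) \\
&\le O(E_0)\left(Q_0Q_1^2 + \langle t\rangle^{-3}\right) \\
|\tilde\alpha_0|^3|\tilde\beta_1|\left|\left\langle\chi, e^{-iH_0t}\tilde H_0\chi\right\rangle\right| &\le C|\tilde\alpha_0|^{5/2}\left(|\tilde\alpha_0|^{1/2}|\tilde\beta_1|\right)\langle t\rangle^{-5/2} \\
&\le O\left(E_0^{5/2}\right)\left(Q_0Q_1^2 + \langle t\rangle^{-\frac{5}{2}\cdot\frac{4}{3}}\right) \\
&\le O\left(E_0^{5/2}\right)\left(Q_0Q_1^2 + \langle t\rangle^{-3}\right).
\end{aligned} \tag{11.26}
$$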
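We now turn to the nonlocal term in (11.25), which we denote $I_1$:
$$
\begin{aligned}
I_1 &\sim O(|\tilde\alpha_0|^2|\tilde\beta_1|)\int_{t_1}^te^{-iH_0(t-s)}P_c(H_0-\lambda_+)^{-1}\tilde H_0\chi e^{-i\lambda_+s} \\
&\quad\times\left[\partial_s\tilde\alpha_0(s)\tilde\beta_1(s)\bar\eta_0(s) + \tilde\alpha_0(s)\partial_s\tilde\beta_1(s)\bar\eta_0(s) + \tilde\alpha_0(s)\tilde\beta_1(s)\partial_s\bar\eta_0(s)\right].
\end{aligned} \tag{11.27}
$$
Using that
$$
\partial_t\tilde\alpha_0 \sim O(\tilde\beta_1^2\tilde\alpha_0), \qquad \partial_t\tilde\beta_1 \sim O(\tilde\alpha_0\tilde\beta_1^2), \qquad\text{and}\qquad \partial_t\eta_0 = \left(-iH_0 + O(\langle t\rangle^{-1})\right)\eta_0, \tag{11.28}
$$
(see also Remark 11.1) we have
$$
\begin{aligned}
I_1 &\sim O(|\tilde\alpha_0|^2|\tilde\beta_1|)\int_{t_1}^te^{-iH_0(t-s)}P_c(H_0-\lambda_+)^{-1}\tilde H_0\chi e^{-i\lambda_+s} \\
&\quad\times\left[\tilde\alpha_0\tilde\beta_1^3\bar\eta_0 + \tilde\alpha_0^2\tilde\beta_1^2\bar\eta_0(s) + \tilde\alpha_0(s)\tilde\beta_1(s)\tilde H_0\bar\eta_0(s)\right] \\
&= I_{1a} + I_{1b} + I_{1c}.
\end{aligned} \tag{11.29}
$$
Each of the three terms $I_{1j}$, $j = a, b, c$ satisfies a bound of the form:
$$
|I_{1j}| \le CE_0^2[\eta_0(0)]_X\frac{1}{\langle t_0\rangle}\frac{1}{\langle t-t_1\rangle^2}. \tag{11.30}
$$
We illustrate this by estimating $I_{1a}$; the other two terms are estimated similarly.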
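Using that $|\tilde\alpha_0|, |\tilde\beta_1| = O(E_0^{\frac{1}{2}})$ we have
$$
\begin{aligned}
|I_{1a}| &\le |\tilde\alpha_0|^2|\tilde\beta_1|\int_{t_1}^t\frac{1}{\langle t-s\rangle^{\frac{5}{2}}}|\tilde\beta_1|^3|\tilde\alpha_0||\eta_0|\,ds \\
&\le E_0^2[\eta_0(0)]_X\left(\sup_{t\ge t_1}\langle t-t_1\rangle^{\frac{1}{2}}|\tilde\beta_1(t)|\right)^3\int_{t_1}^t\frac{ds}{\langle t-s\rangle^{\frac{5}{2}}\langle s-t_1\rangle^{\frac{3}{2}}\langle s\rangle^{\frac{3}{2}}}.
\end{aligned}
$$
Separate estimation of the contributions from the intervals $[t_1, \frac{1}{2}(t+t_1)]$ and $[\frac{1}{2}(t+t_1), t]$ yields the bound:
$$
|I_{1a}| \le CE_0^2[\eta_0(0)]_X\left(\sup_{t\ge t_1}\langle t-t_1\rangle^{\frac{1}{2}}|\tilde\beta_1(t)|\right)^3\frac{1}{\langle t_0\rangle}\frac{1}{\langle t-t_1\rangle^2}. \tag{11.31}
$$
Estimation of T2: Consider the term
$$
T2 = O(\tilde\alpha_0(t))O(\tilde\alpha_0(t)\tilde\beta_1(t))\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\eta_b^2(s)\eta_0(s)\,ds\right\rangle. \tag{11.32}
$$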
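The time decay claimed in (11.30) and (11.31) rests on a convolution-type integral of the form $\int_{t_1}^t\langle t-s\rangle^{-5/2}\langle s-t_1\rangle^{-3/2}\langle s\rangle^{-3/2}\,ds$. The sketch below probes it numerically. It assumes the convention $\langle x\rangle = (1+x^2)^{1/2}$, which is an assumption here since this section does not restate the definition, and it compares the integral with $\langle t_1\rangle^{-1}\langle t-t_1\rangle^{-2}$, which is no larger than $\langle t_0\rangle^{-1}\langle t-t_1\rangle^{-2}$ when $t_0 \le t_1$. The function names are hypothetical, and the sample values are illustrative only.

```python
import math


def bracket(x):
    """Assumed convention <x> = sqrt(1 + x**2)."""
    return math.sqrt(1.0 + x * x)


def time_integral(t, t1, n=4000):
    """Midpoint-rule approximation of
    int_{t1}^{t} <t-s>**(-5/2) * <s-t1>**(-3/2) * <s>**(-3/2) ds."""
    h = (t - t1) / n
    total = 0.0
    for j in range(n):
        s = t1 + (j + 0.5) * h
        total += bracket(t - s) ** -2.5 * bracket(s - t1) ** -1.5 * bracket(s) ** -1.5
    return h * total


def ratio(t, t1):
    """Integral divided by the comparison quantity <t1>**(-1) * <t-t1>**(-2)."""
    return time_integral(t, t1) * bracket(t1) * bracket(t - t1) ** 2


if __name__ == "__main__":
    for t1 in (0.0, 1.0, 10.0, 100.0):
        for gap in (0.1, 1.0, 2.0, 10.0, 100.0, 1000.0):
            print(t1, gap, ratio(t1 + gap, t1))
```

On these samples the ratio stays of order one, which is consistent with splitting the integral at the midpoint $\frac{1}{2}(t+t_1)$; it is a numerical illustration, not a proof of the bound.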
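$t \in I_1 \equiv [t_0, t_1]$: For $t \in I_1$,
$$
|\tilde\alpha_0|^2|\tilde\beta_1| \le cE_0^{\frac{r}{2}}|\tilde\alpha_0||\tilde\beta_1|^2; \quad\text{Proposition 9.4}. \tag{11.33}
$$
Therefore,
$$
\begin{aligned}
|T2| &\le cE_0^{\frac{r}{2}}|\tilde\alpha_0||\tilde\beta_1|^2\left|\left\langle\chi, \int_0^te^{-iH_0(t-s)}P_c\eta_b^2\eta_0\,ds\right\rangle\right| \\
&\le cE_0^{\frac{r}{2}}|\tilde\alpha_0||\tilde\beta_1|^2\langle t\rangle^{-3/2}\sup_{0\le s\le t}\langle s\rangle^{\frac{3}{2}}\|\eta_b^2(s)\eta_0(s)\|_{W^{k,2}\cap L^1} \\
&\le cE_0^{\frac{r}{2}}|\tilde\alpha_0||\tilde\beta_1|^2\langle t\rangle^{-3/2}\|\eta_b\|_{W^{k,2}}^2[\eta_0]_X \\
&\le cE_0^\rho[\eta_0]_X|\tilde\alpha_0||\tilde\beta_1|^2\langle t\rangle^{-3/2} \\
&\le cE_0^\rho[\eta_0]_X\left(P_0P_1^2 + \langle t\rangle^{-3}\right).
\end{aligned}
$$
$t \in I_2 \equiv [t_1, \infty)$: To see the relevant terms for $t > t_1$, we integrate by parts and obtain, besides easily estimable local terms and terms with faster time decay,
$$
O(\tilde\alpha_0^2\tilde\beta_1)\left\langle\chi, \int_0^te^{-iH_0(t-s)}\tilde H_0P_c\eta_b^2\eta_0\,ds\right\rangle, \qquad t > t_1. \tag{11.34}
$$
Consider the contribution to the integral in (11.34) coming from $s \in [0, t_0]$:
$$
O\left(\tilde\alpha_0^2(t)\beta_1(t)\right)\left\langle\chi, \int_0^{t_0}e^{-iH_0(t-s)}\tilde H_0P_c\eta_b^2\eta_0\,ds\right\rangle, \qquad t > t_1 > t_0. \tag{11.35}
$$
Consider the inner product
$$
R(t) = \left\langle\chi, \int_0^{t_0}e^{-iH_0(t-s)}\tilde H_0P_c\eta_b^2\eta_0\,ds\right\rangle. \tag{11.36}
$$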
We have
$$
|(11.35)| \le C|\tilde\alpha_0(t)|^2|\tilde\beta_1(t)||R(t)| \sim CQ_0(t)Q_1^{\frac{1}{2}}(t)|R(t)| \tag{11.37}
$$
and therefore it suffices to prove that
$$
|R(t)| \le CQ_1^{\frac{3}{2}}(t). \tag{11.38}
$$
To prove (11.38), recall that for t ≥ t1 dQ0 ΓQ0 Q21 dt dQ1 −2ΓQ0 Q21 , dt
(11.39)
and E0r Q1 (t1 ) = Q0 (t1 ) .
(11.40)
Therefore, Q1 (t)
2Q1 (t1 ) . Rt 1 + Γ t1 Q0 (s)ds
(11.41)
Therefore, to establish (11.38) we need: |R(t)| ≤ C
2Q1 (t1 ) Rt 1 + Γ t1 Q0 (s)ds
! 32
.
(11.42)
We consider two cases Case 1: Γ sups∈[t1 ,t] |Q0 (s)||t − t1 | ≤ 1, and Case 2: Γ sups∈[t1 ,t] |Q0 (s)||t − t1 | ≥ 1, where it suffices to prove |R(t)| ≤
3Q1 (t1 ) Rt Γ t1 Q0 (s)ds
! 32
.
(11.43)
In Case 2 we prove the bound on R(t), (11.38), while in Case 1 we prove that the 3 3 expression (11.34) of order ht − t1 i− 2 ht − t0 i− 2 . We first handle Case 2, the bound (11.43). From (11.39) we have d (Q0 + 2Q1 ) 0 , dt
t ≥ t1
Q0 (t) + 2Q1 (t) Q0 (t1 ) + 2Q1 (t1 ) = (2 + E0r )Q1 (t1 ) .
(11.44)
Therefore, since Q1 (t)
2Q1 (t1 ) Rt 1 + Γ t1 Q0 (s)ds
(11.45)
we have, as t → ∞ Q0 (t) (2 + E0r )Q1 (t1 ) − 2Q1 (t) → (2 + E0r )Q1 (t1 ) .
(11.46)
Therefore, for t > t1 (t − t1 large enough) (1 + E0r )Q1 (t1 ) ≤ Q0 (t) ≤ (2 + E0r )Q1 (t1 )
(11.47)
and therefore Q1 (t1 ) Q1 (t1 ) ≥ Rt 1 + Γ(2 + E0r )|t − t1 |Q1 (t1 ) Q (s)ds 0 t1
1+Γ
≥
1 1 1 1 ≥ . 3 Γ|t − t1 | 3 Γ|t − t0 |
(11.48)
On the other hand, Z |R(t)| ≤ C χ,
t0 0
1
˜ 2 3 kH0 ηb (s)η0 (s)kL1 ds
ht − si 2
≤ C sup kηb (τ )k2H 2 [η0 ]X τ ∈[0,t0 ]
≤ CE02 [η0 ]X
1 3
ht − t0 i 2
Z
t0
1 3
0
3
ht − si 2 hsi 2
ds
.
(11.49)
The bounds (11.49) and (11.48) imply (11.43). We now turn to Case 1. In this case, Γ sups∈[t1 ,t] Q0 (s)|t − t1 | ≤ 1, Q0 (t) ≤
1 1 Γ |t − t1 |
and therefore by (11.34) and (11.50) 1 |(11.34)| ≤ E02 [η0 ]X sup hs − t1 i 2 Q1 (s) s∈[t1 ,t]
(11.50)
1 1 1 . |t − t1 | ht − t1 i 12 ht − t0 i 23
(11.51)
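We begin by noting that by Proposition 10.10 and its corollaries, for $p > 6$
$$
\begin{aligned}
\|\eta_b\|_{W^{k,p}} &\le O(E_0)\langle t\rangle^{-1}, && t \le t_0, \\
\|\eta_b\|_{W^{k,p}} &\le O(E_0^\rho)\,Q_0(t)^{1/2}, && t_0 < t \le t_1, \\
\|\eta_b\|_\infty &= O\left(\langle t-t_1\rangle^{-\frac{1}{2}}\right), && t \ge t_1.
\end{aligned} \tag{11.52}
$$
For $t < t_1$, the previous arguments with the known estimates on $\eta_b$, and the facts that $|\tilde\alpha_0(t)| \le O(E_0)/\langle t\rangle$ on $I_0$ and $Q_0(t) \le E_0^rQ_1$ on $I_1$ imply the necessary bounds. Collecting all these, we have
$$
\left|O(\tilde\alpha_0^2\tilde\beta_1)\left\langle\chi, \int_0^te^{iH_0(t-s)}P_c\tilde H_0\eta_b^2\eta_0\,ds\right\rangle\right| \le O(E_0)\left[2\Gamma Q_0Q_1^2 + \langle t_0\rangle^{-1}\langle t\rangle^{-2}\right].
$$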
Proof of estimate (11.8) for $R_0^{(1)}$: The key terms to consider in the $P_1$ equation are the slowest decaying nonlocal terms; see (7.39) and (7.32). Since $b_1$ enters as $b_1^2$ in (9.22), we need to bound $R_0^{(1)}$ by $O(E_0)[\langle t_0\rangle^{-1/2}\langle t\rangle^{-2} + 2\Gamma P_0P_1^2]$ for $t>t_0$. The slowest decaying nonlocal term in the $\beta_1$ equation arises from the balance:
\[
i\partial_t\beta_1 \sim \lambda\langle \psi_{1*}^3, \pi_1\Phi_2\rangle\,|\beta_1|^2 e^{i\lambda_+(T)t}. \tag{11.53}
\]
Since the $P_1$ equation is obtained by multiplying by $\beta_1$ and taking the imaginary part, we must estimate:
\[
\lambda\langle \psi_{1*}^3, \pi_1\Phi_2\rangle\,|\beta_1|^2\beta_1 e^{i\lambda_+(T)t}, \tag{11.54}
\]
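the leading order nonlocal part of which is
\[
\begin{aligned}
&\sim |\tilde\beta_1|^2\tilde\beta_1 \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c\, \Psi_0\Psi_1\eta_0\,ds \right\rangle e^{i\lambda_+t} \\
&\sim |\tilde\beta_1|^2\tilde\beta_1 \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c\, \tilde\alpha_0\tilde\beta_1 e^{-i\lambda_+s}\chi\,\eta_0(s)\,ds \right\rangle e^{i\lambda_+t}.
\end{aligned}
\]
Integrating by parts (using the oscillatory factor $e^{i\lambda_+t}$) and removing the nonresonant local in time terms by near identity transformations, the key term is
\[
|\tilde\beta_1|^2\beta_1 \left\langle \chi, \int_0^t e^{-iH_0(t-s)} \tilde H_0 P_c\, \tilde\alpha_0\tilde\beta_1\,\chi\, e^{i\lambda_+s}\eta_0(s)\,ds \right\rangle e^{i\lambda_+t}. \tag{11.55}
\]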
Suppose t > t1 . Since we have no factor of α ˜ 0 outside the integral (local in time α0 (t) factor), the estimate for I0 = {0 < s < t0 } is the critical part and requires the O(hti−1 ) bound on I0 and Theorem 4.3: Z t −iH0 (t−s) ˜ −iλ+ s ˜ χ, e H0 P c α ˜ 0 β1 χη0 e ds 0
Z t1 Z t ds ˜ · · · ds |˜ α (s)|| β (s)| + + 0 1 ht − si5/2 hsi3/2 t1 t0 0 Z t0 ds ≤ O(E0 )[η0 ]X 5/2 ht − si hsihsi3/2 0 Z t Z t1 ds ˜ |˜ α (t )|| β (s)| + · · · ds + [η0 ]X 0 1 1 5/2 hsi3/2 t1 t0 ht − si ≤ O(E0 ) hti−5/2 [η0 ]X + O(E0 )[η0 ]X |˜ α0 (t)|hti−3/2 3 1 + [η0 ]X |˜ α0 (t)|ht − t1 i− 2 hti− 2 .
≤ C[η0 ]X
Z
t0
Note that we can extract the factor |˜ α(t)| from the integral for s ≥ t0 since in this range Q0 (t) is monotonically increasing and Q0 (t) ∼ P0 (t) ∼ |˜ α0 (t)| with correction
terms which are rapidly decaying in time and which are therefore dominated by the first term. Multiplication by O(|β˜1 |3 ) prefactor gives the bound " # 3 1 −5/2 −3/2 2 ˜ 2 C(E0 , [η0 ]X ) sup hs − t1 i |β1 (s)| hti ht − t1 i + Q0 Q . (11.56) 1
s≥t1
This completes the proof of Proposition 11.2.

12. Local and Nonlocal ODE Terms: $R_2^{(j)}(P_0,P_1)$ of Proposition 11.1

In this section we prove estimates on the terms $R_2^{(0)}(P_0,P_1)$ and $R_2^{(1)}(P_0,P_1)$ of Proposition 11.1.

Proposition 12.1. Assume either $t\le t_0$ or monotonicity property Q on $[t_0,t_0+\delta_*]$. Then, for $m\ge 4$,
(i) $|R_2^{(0)}(P_0,P_1)| \le O(E_0^\rho)\,\Gamma P_0P_1^2 + O(\langle t\rangle^{-3})$ $\;(P_0,P_1\ll 1)$,
(ii) $|R_2^{(1)}(P_0,P_1)| \le O(E_0^\rho)\,\Gamma P_0P_1^2 + O(\alpha_0)P_1^{2m} + O(\langle t\rangle^{-3})$, $\;|\alpha_0|\sim\sqrt{P_0}$.
12.1. Proof of part (1) of Proposition 12.1 The most problematic terms are nonlocal, slowest decaying. The terms which are linear in η in the α0 equation contribute the slowest terms; these are nonlocal in time, t. We have to consider terms arising in Eq. (7.52) of the type: hχ, ηie−iλ+ t α0 β1 , hχ, ηi|β1 |2 , hχ, ηiβ12 e−2iλ+ t ,
(12.1)
where φ2 ∼ η and η(t) = η0 (t) + e0 (t) + ηb (t); see (10.1). As calculated earlier, the last term is resonant and its contribution is +2ΓP0 P12 to the P0 equation; see (7.55). The leading ODE terms in η are the source terms of the type in (7.32): |Ψ0 |2 Ψ1 , Ψ20 Ψ1 , Ψ0 |Ψ1 |2 , Ψ0 Ψ21 .
(12.2)
Recall that the $P_0$ equation is obtained from the $\tilde\alpha_0$ equation by multiplication by $\tilde\alpha_0$ and taking the imaginary part. Solving for $\eta_b$ and plugging the source term contributions into the $\langle\chi,\eta\rangle$ terms in the $P_0$ equation gives, apart from the resonant term, terms of the type
\[
\alpha_0\, O(\bar\alpha_0\beta_1 e^{i\lambda_+t} + |\beta_1|^2) \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \left( |\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1 + \Psi_0|\Psi_1|^2 + \bar\Psi_0\Psi_1^2 \right) ds \right\rangle. \tag{12.3}
\]
For $t>t_1$ the slowest terms are $O(|\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1)$ source terms. For $t_0<t<t_1$ the problematic terms are $O(\Psi_0|\Psi_1|^2 + \bar\Psi_0\Psi_1^2)$ source terms, since on this interval
$\beta_1$ does not necessarily decay. Since $\Psi_1 \sim e^{-i\lambda_+t}\beta_1$ + higher order terms, we can integrate by parts the $O(|\Psi_0|^2\Psi_1 + \Psi_0^2\bar\Psi_1)$ terms in (12.3) to get arbitrary order $\beta_1^m$ terms, which are nonlocal. The remaining terms are local and their lowest order term is of the order
\[
\alpha_0\, O(\alpha_0 e^{i\lambda_+t}\beta_1 + |\beta_1|^2) \left\langle \chi,\; O(|\alpha_0|^2\beta_1 e^{i\lambda_+t} + \alpha_0^2\bar\beta_1 e^{-i\lambda_+t}) \right\rangle. \tag{12.4}
\]
The nonresonant local terms can be transformed to higher order by a near identity change of variables giving
\[
O(|\alpha_0|^3\beta_1^{2m}e^{i\Omega t}), \quad \Omega>0 \text{ and } m \text{ arbitrarily large},
\]
while resonant terms are of the type already derived but of higher order (bounded by $O(E_0)Q_0Q_1^2$). We are therefore left to consider the following nonlocal terms
\[
\bar\alpha_0\, O\!\left( \bar\alpha_0\beta_1 e^{-i\lambda_+t} + |\beta_1|^2 \right) \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c\, \alpha_0\beta_1^{2m} e^{i\Omega_m s}\tilde\chi\,ds \right\rangle. \tag{12.5}
\]
Consider the first term in (12.5). Due to the oscillatory factor e−iλ+ t in the O(¯ α0 β1 e−iλ+ t ) term we can, by a near-identity transformation, in the α0 equation, remove this term in exchange for one higher order Z t Z t −iH0 (t−s) 2 iΩs 2 3 e Pc · · · · · · + O(¯ α0 β1 e )∂t χ, O(|α0 | β1 ) χ, 0
0
+ higher order + local terms , and by further near identity transformations, we are left with the following slowest term Z t O(|α0 |2 β1 )∂tm χ, e−iH0 (t−s) Pc · · · ds . (12.6) 0
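All other terms are of order $o(E_0)O(P_0P_1^2)$, $o(E_0)\langle t\rangle^{-3}$ or higher order, for $E_0$ small.
\[
\partial_t \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c\, \alpha_0\beta_1^{2m}\tilde\chi\,ds \right\rangle
= \left\langle \chi, \alpha_0(t)\beta_1^{2m}(t)\tilde\chi \right\rangle
- \left\langle \chi,\; i\int_0^t e^{-iH_0(t-s)} \tilde H_0 P_c\, \chi\,\tilde\alpha_0\beta_1^{2m} e^{i\Omega s}\,ds \right\rangle.
\]
The first term is local and will contribute $o(1)O(P_0P_1^2)$. It remains to estimate the nonlocal term. Note that
\[
\int_0^t e^{-iH_0(t-s)} e^{i\omega s}\chi\,\tilde\alpha_0\beta_1^{2m}\,ds
= e^{-i\bar\omega t} \int_0^t e^{-i(H_0-\bar\omega)(t-s)} e^{i(\omega+\bar\omega)s}\chi\,\tilde\alpha_0\beta_1^{2m}\,ds.
\]
Consider now the second term in (12.5). Using the oscillatory factor $e^{-i\bar\omega t}$, one can transform the $O(\bar\alpha_0|\beta_1|^2)\langle\chi,\int\cdots\rangle$ to higher order. We now turn to the second term in (12.5). First, let us consider the case where $t\ge t_1$:
\[
O(\bar\alpha_0|\beta_1|^2)\,\partial_t^m \left\langle \chi, \int_0^t e^{-iH_0(t-s)} \chi\,\tilde\alpha_0\beta_1^{2m}\,ds \right\rangle. \tag{12.7}
\]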
We consider (12.7) for $t\ge t_1$. For this we require the following:

Lemma 12.1. For $t>t_1$,
\[
Q_1(t) \ge Q_1(t_1)\left[ 1 + 4(\Gamma_0+\delta)\,Q_1(t_1)Q_0(t_1)(t-t_1) \right]^{-1}. \tag{12.8}
\]

Proof of Lemma 12.1. For $t\ge t_1$
\[
\frac{dQ_1}{dt} \ge -4(\Gamma_0+\delta)\,Q_0Q_1^2. \tag{12.9}
\]
Therefore,
\[
-\frac{dQ_1^{-1}}{dt} \ge -4(\Gamma_0+\delta)Q_0(t) \ge -4(\Gamma_0+\delta)Q_0(t_1)
\quad\text{or}\quad
\frac{dQ_1^{-1}}{dt} \le 4(\Gamma_0+\delta)Q_0(t_1) \tag{12.10}
\]
since $Q_0(t_1)\le Q_0(t)$ ($Q_0\uparrow$ for $t>t_0$). Integrating from $t_1$ to $t$, we get
\[
Q_1(t)^{-1} - Q_1(t_1)^{-1} \le 4(\Gamma_0+\delta)Q_0(t_1)(t-t_1), \tag{12.11}
\]
which is equivalent to (12.8). This completes the proof of Lemma 12.1.

Estimation of (12.7) for $t\ge t_1$: Carrying out the differentiation in (12.7) we find that it suffices to bound terms of the type:
\[
O\!\left(\alpha_0|\beta_1|^2\right)\langle\chi,P_c\chi\rangle\,\alpha_0\beta_1^{2m} = O(E_0)Q_0Q_1^2,
\qquad
O(\alpha_0|\beta_1|^2)\left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^m \chi\,\tilde\alpha_0\beta_1^{2m}\,ds \right\rangle. \tag{12.12}
\]
We now consider (12.12), which we break into the sum of three integrals:
\[
O(\bar\alpha_0|\beta_1|^2)\left\langle \chi, \left( \int_0^{t_0} + \int_{t_0}^{t_1} + \int_{t_1}^{t} \right) e^{-iH_0(t-s)} P_c \tilde H_0^k \chi\,\tilde\alpha_0\beta_1^{2m}\,ds \right\rangle. \tag{12.13}
\]
$\int_{t_1}^{t}$: By the local decay estimate for $e^{-iH_0t}H_0^kP_c$ of Theorem 4.3, the integral in (12.13) is bounded above by:
\[
\int_{t_1}^{t} \frac{ds}{\langle t-s\rangle^{k}}\,|\alpha_0\beta_1^{2m}|. \tag{12.14}
\]
This in turn is bounded above by
\[
Q_0^{1/2}(t) \int_{t_1}^{t} \frac{ds}{\langle t-s\rangle^{k}}\,Q_1^{m}(s), \tag{12.15}
\]
since $Q_0$ is increasing for $t\ge t_0$. We now aim to further bound (12.15) by "extracting" powers of $Q_1(t)$ from under the integral. Recall that for $s\ge t_1$
\[
\begin{aligned}
Q_1(s) &\le \frac{Q_1(t_1)}{1+2\Gamma Q_0(t_1)Q_1(t_1)|s-t_1|}
= \frac{1}{2\Gamma Q_0(t_1)}\,\frac{Q_1(t_1)Q_0(t_1)\,2\Gamma}{1+2\Gamma Q_0(t_1)Q_1(t_1)|s-t_1|} \\
&\le \min\{\,Q_1(t_1),\; 2\Gamma^{-1}Q_0^{-1}(t_1)\langle s-t_1\rangle^{-1}\}
\end{aligned}
\tag{12.16}
\]
and
\[
Q_0^{-1}(t_1) \equiv Q_1(t_1)^{-1}E_0^{-r} \quad\text{and}\quad Q_1 \le E_0. \tag{12.17}
\]
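The decay bound (12.16) can be illustrated on the leading-order model system $\dot Q_0 = 2\Gamma Q_0Q_1^2$, $\dot Q_1 = -4\Gamma Q_0Q_1^2$, that is, the ODEs of the text with all remainder terms dropped (this truncation is an assumption made only for the sketch). The code below integrates the model with a classical Runge-Kutta step and compares $Q_1$ with the right-hand side of (12.16); all function names and the sample initial values are hypothetical.

```python
def rhs(q0, q1, gamma):
    """Leading-order model: dQ0/dt = 2*G*Q0*Q1^2, dQ1/dt = -4*G*Q0*Q1^2."""
    flux = gamma * q0 * q1 * q1
    return 2.0 * flux, -4.0 * flux


def rk4_step(q0, q1, gamma, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    a0, a1 = rhs(q0, q1, gamma)
    b0, b1 = rhs(q0 + 0.5 * h * a0, q1 + 0.5 * h * a1, gamma)
    c0, c1 = rhs(q0 + 0.5 * h * b0, q1 + 0.5 * h * b1, gamma)
    d0, d1 = rhs(q0 + h * c0, q1 + h * c1, gamma)
    return (q0 + h * (a0 + 2.0 * b0 + 2.0 * c0 + d0) / 6.0,
            q1 + h * (a1 + 2.0 * b1 + 2.0 * c1 + d1) / 6.0)


def integrate(q0, q1, gamma, h, steps):
    """Return [(tau, Q0, Q1)], where tau = t - t1 is time since the start."""
    path = [(0.0, q0, q1)]
    for n in range(1, steps + 1):
        q0, q1 = rk4_step(q0, q1, gamma, h)
        path.append((n * h, q0, q1))
    return path


def bound_12_16(tau, q0_start, q1_start, gamma):
    """Right-hand side of the first inequality in (12.16)."""
    return q1_start / (1.0 + 2.0 * gamma * q0_start * q1_start * tau)


if __name__ == "__main__":
    # Sample data with Q0(t1) = E0^r * Q1(t1) for E0^r = 0.1 (illustrative only).
    path = integrate(0.01, 0.1, 1.0, 0.5, 4000)
    for tau, q0, q1 in path[::500]:
        print(tau, q0, q1, bound_12_16(tau, 0.01, 0.1, 1.0))
```

In this model $2Q_0 + Q_1$ is conserved, $Q_0$ increases and $Q_1$ decreases, so $\dot Q_1 \le -4\Gamma Q_0(t_1)Q_1^2$ and the bound follows by integrating $d(Q_1^{-1})/dt$.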
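We write
\[
\begin{aligned}
Q_1^{m}(s) = Q_1^{m-\bar k}(s)\,Q_1^{\bar k}(s)
&\le Q_1^{m-\bar k}(s)\,\frac{1}{\langle s-t_1\rangle^{\bar k}}\,\frac{1}{(2\Gamma Q_0(t_1))^{\bar k}} \\
&= Q_1^{m-\bar k}(s)\,\frac{E_0^{-r\bar k}}{\langle s-t_1\rangle^{\bar k}}\,\frac{1}{(2\Gamma Q_1(t_1))^{\bar k}} \\
&\le Q_1^{m-2\bar k}(s)\,E_0^{-r\bar k}\,\langle s-t_1\rangle^{-\bar k},
\end{aligned}
\tag{12.18}
\]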
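where we have used that $Q_1(t_1)\ge Q_1(s)$. We consider separately the cases $t\ge 2t_1$ and $t_1\le t\le 2t_1$.

$t\ge 2t_1$: Take $\bar k = 2$ and $m>2(r+2)$. Then, the integral in (12.15) is bounded by
\[
E_0^\rho\,Q_0(t)^{1/2}\,\langle t\rangle^{-\min(k,\bar k)}, \quad \rho>0. \tag{12.19}
\]
This implies, for $k$ sufficiently large,
\[
\left| O(\alpha_0|\beta_1|^2) \left\langle \chi, \int_0^t \cdots \right\rangle \right| \le O(E_0^\rho)\left[ Q_0Q_1^2 + \langle t\rangle^{-4} \right], \quad \rho>0. \tag{12.20}
\]

$t_1\le t<2t_1$: Let $t = t_1+2M$, $M=(t-t_1)/2$ and rewrite the integral as
\[
\int_{t_1}^{t} \frac{ds}{\langle t-s\rangle^{k}}\,|\alpha_0\beta_1^{2m}|
= \left( \int_{t_1}^{t_1+M} + \int_{t_1+M}^{t_1+2M} \right) \frac{ds}{\langle t-s\rangle^{k}}\,|\alpha_0\beta_1^{2m}|. \tag{12.21}
\]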
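In a manner similar to the previous estimate, using that $Q_1$ is decreasing we have, by (12.18):
\[
\begin{aligned}
\int_{t_1}^{t_1+M} \frac{ds}{\langle t-s\rangle^{k}}\,Q_1(s)^{m/2}\langle s-t_1\rangle^{-\bar k}
&\le c\,\langle t-(t_1+M)\rangle^{-k}\,Q_1(t_1)^{m/2} \\
&\le c\,\langle t-t_1\rangle^{-k}\,Q_1(t_1)\,Q_1(t_1)^{\frac m2-1} \\
&\le c\,\langle t-t_1\rangle^{-k}\left[ 1+4(\Gamma_0+\delta)Q_1(t_1)Q_0(t_1)|t-t_1| \right] Q_1(t)\,Q_1(t_1)^{m/2-1} \\
&\le O(E_0)\,Q_1(t)
\end{aligned}
\]
for $k\ge 1$. Hence
\[
O(\alpha_0|\beta_1|^2)\,Q_0^{1/2} \int_{t_1}^{t_1+M} \cdots \le O(E_0)\,Q_0Q_1^2. \tag{12.22}
\]
Furthermore, by (12.18) the integral over $[t_1+M,t_1+2M]$ is bounded by
\[
\int_{t_1+M}^{t_1+2M} \frac{ds}{\langle t-s\rangle^{k}\langle s-t_1\rangle^{\bar k}}\,Q_1(s)^{m/2}
\le c\,Q_1(t_1)^{m/2}\langle M\rangle^{-\bar k}
\le c\,Q_1(t_1)^{m/2}\langle t-t_1\rangle^{-\bar k} \tag{12.23}
\]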
and, as above, using the upper bound (12.8) for $Q_1(t_1)$ in terms of $Q_1(t)$ we have
\[
\int_{t_1+M}^{t_1+2M} \cdots \le O(E_0)\,Q_1(t) \tag{12.24}
\]
for $\bar k\ge 1$. Therefore, for all $t>t_1$ the nonlocal (and local) ODE terms in the $Q_0$ equation are bounded by
\[
O(E_0)\left[ Q_0Q_1^2 + \langle t\rangle^{-3} \right], \tag{12.25}
\]
provided we control the integral $\int_0^{t_1}\cdots\,ds$.

$\int_0^{t_1}$: Consider
\[
I(t) \equiv Q_0^{1/2}Q_1 \int_0^{t_1} \frac{ds}{\langle t-s\rangle^{k}}\,Q_0^{1/2}\,Q_1^{m-1}\,Q_1. \tag{12.26}
\]
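By the mean value theorem, $Q_1(s) = Q_1(s)-Q_1(t)+Q_1(t) = \dot Q_1(\bar s)|t-s| + Q_1(t)$, where $s\le\bar s\le t$. Then,
\[
I(t) \le Q_0^{1/2}(t)\,Q_1^2(t) \int_0^{t_1} \frac{ds}{\langle t-s\rangle^{k}}\,Q_0^{1/2}(s)\,Q_1^{m-1}(s) + I_1(t), \tag{12.27}
\]
where
\[
I_1(t) \equiv c\,Q_0^{1/2}(t)\,Q_1(t) \int_0^{t_1} \frac{ds}{\langle t-s\rangle^{k-1}}\,Q_0^{1/2}(s)\,Q_1^{m-1}(s)\,Q_0^{1/2}(\bar s)\,Q_1(\bar s), \tag{12.28}
\]
where we have used
\[
\dot Q_1 = O\!\left( Q_0^{1/2}Q_1 \right) + \text{h.o.t.} \tag{12.29}
\]
If $\bar s\le t_0$, then using that $Q_0^{1/2}(\bar s) \le E_0^\rho\langle\bar s\rangle^{-1} \le E_0^\rho\langle s\rangle^{-1}$, we have the bound
\[
I_1(t) \le O(E_0^\rho)\,Q_0^{1/2}(t)\,Q_1(t)\,\langle t\rangle^{-2} \le O(E_0^\rho)\left[ Q_0(t)Q_1^2(t) + \langle t\rangle^{-4} \right]. \tag{12.30}
\]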
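The second inequality in (12.30) is an instance of the elementary bound $ab\le\tfrac12(a^2+b^2)$. A short worked version, with $a$ and $b$ introduced here only as shorthand:

```latex
\[
a = Q_0^{1/2}(t)\,Q_1(t), \qquad b = \langle t\rangle^{-2},
\]
\[
Q_0^{1/2}(t)\,Q_1(t)\,\langle t\rangle^{-2} = ab
\le \tfrac12\left(a^2+b^2\right)
= \tfrac12\left( Q_0(t)\,Q_1^2(t) + \langle t\rangle^{-4} \right).
\]
```

The same device converts mixed products into the two target quantities $Q_0Q_1^2$ and a pure power of $\langle t\rangle^{-1}$ elsewhere in this section.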
If $\bar s>t_0$, then since $Q_0(s)$ is monotonically increasing for $s\ge t_0$, we have
\[
I_1(t) \le \frac{Q_0(t)Q_1(t)}{\langle t-t_1\rangle} \int_0^{t_1} \frac{ds}{\langle t-s\rangle^{k-1}}\,Q_0^{1/2}(s)\,Q_1^{m-1}(s)\,Q_1(\bar s). \tag{12.31}
\]
We now expand the latter factor of $Q_1$ in the integrand using the mean value theorem. Specifically, there exists $s'$ with $t_0<\bar s\le s'\le t_1$ such that:
\[
\begin{aligned}
Q_1(\bar s) &= Q_1(t_1) + \dot Q_1(s')(s'-t_1) \\
&= Q_1(t_1) + O\!\left( Q_0^{1/2}(s')\,Q_1(s') \right)|s'-t_1| \\
&\le Q_1(t_1) + O\!\left( Q_0^{1/2}(t_1)\,Q_1(s') \right)|\bar s-t_1|,
\end{aligned}
\tag{12.32}
\]
where the last inequality follows by monotonicity of $Q_0$.
Substitution of (12.32) into the integrand in (12.31) gives:
\[
I_1(t) \le c\,E_0^\rho\,\frac{Q_0(t)Q_1(t)}{\langle t-t_1\rangle}\,Q_1^{1/2}(t_1) \int_0^{t_1} \frac{ds}{\langle t-s\rangle^{k-3}}\,Q_0^{1/2}(s)\,Q_1^{m-1}(s), \tag{12.33}
\]
where we have used (12.17) to replace $Q_0(t_1)$ by $Q_1(t_1)$. A higher-order term (one proportional to $Q_1(t_1)$) is subsumed by the constant, $c$. From (12.8) we have
\[
Q_1(t_1) \le \left( 1 + 4(\Gamma_0+\delta)Q_1(t_1)Q_0(t_1)(t-t_1) \right) Q_1(t). \tag{12.34}
\]
Case 1: $4(\Gamma_0+\delta)Q_1(t_1)Q_0(t_1)(t-t_1) \le 100$. Here, $Q_1(t_1)\le 101\,Q_1(t)$, and therefore
\[
I_1(t) \le c\,E_0^\rho\,\frac{Q_0(t)\,Q_1^{3/2}(t)}{\langle t-t_1\rangle} \int_0^{t_1} \frac{ds}{\langle t-s\rangle^{k-3}}\,Q_0^{1/2}(s)\,Q_1^{m-1}(s). \tag{12.35}
\]
We now split the integral in (12.35) as $\int_0^{t_1} = \int_0^{t_0} + \int_{t_0}^{t_1}$. Using the $\langle s\rangle^{-1}$ decay of $Q_0^{1/2}$ for $s\le t_0$ we have
\[
\int_0^{t_0} \cdots \le E_0^\rho\,\langle s\rangle^{-1}, \tag{12.36}
\]
and using the monotonicity of $Q_0$ for $t_1\ge s\ge t_0$ and the relation $Q_0(t_1) = E_0^rQ_1(t_1) \le 101\,E_0^rQ_1(t)$ we have
\[
\int_{t_0}^{t_1} \cdots\,ds \le E_0^\rho\,Q_0^{1/2}(t_1) = \sqrt{101}\,E_0^{\rho-\frac r2}\,Q_1^{1/2}(t). \tag{12.37}
\]
Therefore, choosing $m$ sufficiently large, we get
\[
I_1(t) \le c'\left[ E_0^{\rho_1}Q_0(t)\,Q_1^{3/2}(t)\,\langle t\rangle^{-1} + E_0^{\rho_2}Q_0(t)\,Q_1^2(t) \right] \le c''E_0^{\rho_3}\left( Q_0Q_1^2 + \langle t\rangle^{-4} \right), \tag{12.38}
\]
where $\rho_3 = \min\{\rho_1,\rho_2\}$.

Case 2: $4(\Gamma_0+\delta)Q_1(t_1)Q_0(t_1)(t-t_1) \ge 100$. Then, (12.8) and monotonicity of $Q_0$ for $t\ge t_0$ imply
\[
\frac{1}{\langle t-t_1\rangle} \le 4(\Gamma_0+\delta)\,Q_0(t)\,Q_1(t). \tag{12.39}
\]
This gives
\[
I_1(t) \le C\,E_0^\rho\,Q_0(t)\,Q_1^2(t). \tag{12.40}
\]
We now turn to (12.7) for $t\in[t_0,t_1]$. It suffices to estimate the nonlocal terms
\[
O(\alpha_0|\beta_1|^2) \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^m \chi\,\tilde\alpha_0\beta_1^{2m}\,ds \right\rangle \tag{12.41}
\]
for $t_0\le t\le t_1$. In this region $Q_0$ and $(Q_0/Q_1)$ are increasing functions. Also $Q_0(t)\le E_0^rQ_1(t)$. The main difficulty is the need to pull a factor of $Q_1(t)$ out of the nonlocal term. To this end we use the following proposition:
Proposition 12.2. There exists a constant $\delta>0$ such that for $t\ge t_0$
\[
2\Gamma \int_{t_0}^{t} Q_0Q_1^{m}\,ds \le \delta\,Q_0(t)\,Q_1^{m-2}(t) + Q_0^{1/2}(t)\,Q_1^{m-2}(t) + \text{h.o.t.} \tag{12.42}
\]
Corollary 12.1. Z t t0
3 ds m− 1 1/2 Q0 (s)Q1m−1 (s) ≤ C(1 + δ)Q08 Q1 2 . 3/2 ht − si
(12.43)
The corollary follows from Proposition 12.2 by Hölder's inequality.

Proof of Proposition 12.2. Recall that
\[
\frac{dQ_0}{dt} = 2\Gamma Q_0Q_1^2 + R_0, \qquad
\frac{dQ_1}{dt} \le -4\Gamma Q_0Q_1^2 + O\!\left( Q_0^{1/2}Q_1^{m} \right). \tag{12.44}
\]
Claim.
Z
1
Q02
t t0
R0 ds ≤
O(E0ρ )
1+
Z
t t0
Q0 (s)Q2m 1 (s)ds
.
(12.45)
Proof of Claim. The leading order term of R0 , in the variable α0 , which is nonlocal, is a term of the form (12.5). From this term we have after integration by parts to obtain e−iH0 (t−s) H0 from (12.5), Z t0 Z t Z t 1 ds 1/2 0 R0 ≤ Q02 (t) Q1 (t0 ) Q Qm 1 dsdt 5/2 0 t0 ht − si t0 t0 # "Z 0 Z t Z t0 t 1 1 ds 0 0 2m 2 + Q1 (t )dt Q0 Q1 ds ≤ Q0 (t) 4−δ ht − t0 i1+δ0 t0 t0 t0 ht − si 1 2
≤ cQ0 (t)
Z
t0 t0
h 0 0 Q1 (t0 )dt0 ht − t0 i−3+δ + ht − t0 i−3+δ
0 −1−δ 0
+ ht − t i
Z
t0 t0
Q0 Q2m 1 ds
#
Z t 1 ≤ cQ02 (t) O(E0 ) 1 + Q0 Q2m ds , 1 t0
thus proving the claim. We first rewrite the right-hand side of (12.42) and use the differential equation for Q0 and the above claim to integrate by parts: Z t 2Γ Q0 Qm 1 ds t0
=
Z
t t0
2ΓQ0 Q21 Q1m−2 ds
Z s d Q0 − R0 ds = ds t0 t0 Z Z s t m−2 0 = Q1 (t ) Q0 − R0 − Z
t
1053
Q1m−2
t
Z s d m−2 (s) Q0 (s) − R0 ds Q t0 ds t0 t0 t0 Z t m−2 m−2 m−2 = Q1 (t)Q0 (t) − Q1 (t0 )Q0 (t0 ) − Q1 (t) R0 (s)ds t0
+ (m − 2)
Z
+ (m − 2)
Z
t t0 t t0
1 dt0 Q1m−1 (t0 )Q0 (t0 ) 4ΓQ0 (t0 ) + O Q02 (t0 )Q1m−2 (t0 )
Z 1 dt0 Q1m−1 (t0 ) 4ΓQ0 (t0 ) + O Q02 (t0 )Q1m−2 (t0 )
t0
t0
R0 (s)ds .
The first term on the right-hand side is of the form we want (the right-hand side of (12.42)). The second term is negative so we can drop it. The third term is bounded by $O(E_0)\,Q_1^{m-2}(t)\,Q_0^{1/2}(t)$ by the claim above, plus $O\!\left(\int_{t_0}^{t}Q_0Q_1^{m}\,ds\right)$, which is of the above form and can be absorbed by the left-hand side by smallness of $E_0$, where we used $t_0<s<t_1$, $Q_0\le E_0^rQ_1$. This completes the proof of Proposition 12.2.

Using the proposition and its corollary it is easy to bound, for $t\ge t_0$,
\[
O(\alpha_0|\beta_1|^2 + |\alpha_0|^2\beta_1)\,\partial_t^k \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c\, \chi\,\tilde\alpha_0\beta_1^{2m}\,ds \right\rangle \tag{12.46}
\]
by $O(E_0^\rho)\left[ Q_0Q_1^2 + \langle t\rangle^{-3} \right]$ + h.o.t.

It remains to consider $t\le t_0$. In this case we only know that $|\tilde\alpha_0|\le k_0\langle t\rangle^{-1}$. The first term of (12.5)
\[
\bar\alpha_0\, O\!\left( \bar\alpha_0\beta_1 e^{-i\lambda_+t} \right) \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c\, \alpha_0\beta_1^{2m}\tilde\chi\,ds \right\rangle \tag{12.47}
\]
is bounded above by $O(\langle t\rangle^{-3})$, due to dispersive estimates and the decay of $Q_0$ for $t\le t_0$. In the second term of (12.5),
\[
O(\alpha_0|\beta_1|^2)\,\partial_t^k \left\langle \chi, \int_0^t \cdots \right\rangle
\sim O\!\left( Q_0^{1/2}(t)\,Q_1(t) \right) \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^k\, \tilde\chi\,\tilde\alpha_0\tilde\beta_1^{2m} e^{i\Omega s}\,ds \right\rangle, \quad \Omega\ne 0, \tag{12.48}
\]
we need to pull $Q_0^{1/2}(t)\,Q_1(t)$ out of the nonlocal (integral) term. By dispersive estimates, we have the bound
\[
O\!\left( Q_0^{1/2}(t)\,Q_1(t) \right) \int_0^t \frac{1}{\langle t-s\rangle^{\frac32+k}}\,|\tilde\alpha_0(s)|\,Q_1^{m}(s)\,ds. \tag{12.49}
\]
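To pull the $Q_1$ term we proceed as earlier. By the mean value theorem, there exists $\bar s$, with $s\le\bar s\le t\le t_0$ such that
\[
\begin{aligned}
Q_1(s) &= Q_1(t) + Q_1(s) - Q_1(t) = Q_1(t) - \dot Q_1(\bar s)(t-s) \\
&= Q_1(t) + O\!\left( Q_0^{1/2}(\bar s)\,Q_1(\bar s) \right)(t-s) \\
&= Q_1(t) + O\!\left( E_0^\rho\langle\bar s\rangle^{-1} \right)\langle t-s\rangle,
\end{aligned}
\]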
where we have used that Q˙ 1 ∼ Q02 Q1 for t ≤ t0 . Therefore, the expression in (12.49) satisfies a bound of order Z t 1 1 Q02 (t)Q21 (t) |˜ α0 (s)|Q1m−1 (s)ds 3 +k 0 ht − si 2 Z t 1 1 1 0 1 s)Qm s) Qm−1 (s)ds Q02 (s)O Q02 (¯ + Q02 (t)Q1 (t) 1 1 (¯ 1 +k 0 ht − si 2 Z t 1 1 1 2 2 Q02 (s)Q1m−1 (s)ds = Q0 (t)Q1 (t) 3 +k 0 ht − si 2 1
+ E0ρ Q02 (t)Q1 (t)
1 hti2
(12.50)
where the last term, which is bounded by $E_0^\rho\left( Q_0Q_1^2 + \langle t\rangle^{-4} \right)$, is obtained using the decay of $Q_0(s)$ for $s\le t_0$.

It remains to estimate the second to last term in (12.50). Estimating the convolution, using the $\langle s\rangle^{-1}$ decay of $Q_0(s)$, we obtain the bound $O\!\left( Q_0^{1/2}(t)\,Q_1^2(t) \right)\langle t\rangle^{-1}$, which is not bounded by the desired $O\!\left( E_0^\rho( Q_0Q_1^2 + \langle t\rangle^{-3}) \right)$. We will obtain the desired bound by turning to an earlier expression, derivable from (12.48). The expression we must consider is
\[
O\!\left( Q_0^{1/2}(t)\,Q_1^2(t) \right) \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^k\, \tilde\chi\,\tilde\alpha_0(s)\,\tilde\beta_1^{2m-2}(s)\,e^{i\Omega s}\,ds \right\rangle, \quad \Omega\ne 0. \tag{12.51}
\]
We will show that this term is of order $E_0^\rho\,Q_0(t)\,Q_1^2(t)\,\langle t\rangle^{-3/2}$, which implies the desired bound. We proceed as follows. First, by Eq. (7.3) of Proposition 7.1 the equation for $\tilde\alpha_0$ may be rewritten as:
\[
i\partial_t\tilde\alpha_0(t) = (c+i\Gamma_\omega)|\tilde\beta_1(T)|^4\,\tilde\alpha_0(t) + (c+i\Gamma_\omega)\left( |\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4 \right)\tilde\alpha_0(t) + F_\alpha. \tag{12.52}
\]
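Introducing $\tilde\alpha_0^{\#}(t)$ via the equation
\[
\tilde\alpha_0(t) \equiv e^{-i(c+i\Gamma_\omega)|\tilde\beta_1(T)|^4 t}\,\tilde\alpha_0^{\#}(t), \tag{12.53}
\]
we have
\[
i\partial_t\tilde\alpha_0^{\#}(t) = (c+i\Gamma_\omega)\left( |\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4 \right)\tilde\alpha_0^{\#}(t) + e^{i(c+i\Gamma_\omega)|\tilde\beta_1(T)|^4 t}\,F_\alpha. \tag{12.54}
\]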
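The passage from (12.52) to (12.54) is an integrating-factor computation. A sketch, writing $\kappa \equiv (c+i\Gamma_\omega)|\tilde\beta_1(T)|^4$ (a shorthand introduced here, not in the text), so that (12.53) reads $\tilde\alpha_0 = e^{-i\kappa t}\tilde\alpha_0^{\#}$:

```latex
\begin{align*}
i\partial_t\tilde\alpha_0
  &= i\partial_t\!\left( e^{-i\kappa t}\tilde\alpha_0^{\#} \right)
   = \kappa\,e^{-i\kappa t}\tilde\alpha_0^{\#} + e^{-i\kappa t}\,i\partial_t\tilde\alpha_0^{\#}, \\
\text{while (12.52) gives}\quad
i\partial_t\tilde\alpha_0
  &= \kappa\,e^{-i\kappa t}\tilde\alpha_0^{\#}
   + (c+i\Gamma_\omega)\left( |\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4 \right) e^{-i\kappa t}\tilde\alpha_0^{\#}
   + F_\alpha .
\end{align*}
Subtracting the common term and multiplying by $e^{i\kappa t}$:
\[
i\partial_t\tilde\alpha_0^{\#}
 = (c+i\Gamma_\omega)\left( |\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4 \right)\tilde\alpha_0^{\#}
 + e^{i\kappa t} F_\alpha .
\]
```

The linear term with the frozen coefficient $|\tilde\beta_1(T)|^4$ is thereby removed, leaving only the difference $|\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4$ as a coefficient.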
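The integral in (12.51) can be written, keeping only leading order terms, as
\[
\begin{aligned}
&\left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^k\, \tilde\chi\,\tilde\alpha_0^{\#}(s)\,\tilde\beta_1^{2m-2}(s)\,e^{i\Omega s}\,ds \right\rangle \\
&\quad= \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^k\, \tilde\chi_1\,\dot{\tilde\alpha}_0^{\#}(s)\,\tilde\beta_1^{2m-2}(s)\,e^{i\Omega s}\,ds \right\rangle + \cdots \\
&\quad= \left\langle \chi, \int_0^t e^{-iH_0(t-s)} P_c \tilde H_0^k\, \tilde\chi_1 \left( |\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4 \right) \tilde\alpha_0^{\#}(s)\,\tilde\beta_1^{2m-2}(s)\,e^{i\Omega s}\,ds \right\rangle \\
&\quad\le O(E_0^2) \int_0^t \frac{ds}{\langle t-s\rangle^{\frac32+k}\,\langle s\rangle} \left|\, |\tilde\beta_1(t)|^4 - |\tilde\beta_1(T)|^4 \right| + \cdots.
\end{aligned}
\tag{12.55}
\]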
It suffices to show

Proposition 12.3.
\[
|P_1(t) - P_1(s)| \le \frac{C}{\langle s\rangle^{1/2}}, \quad 0\le s\le t. \tag{12.56}
\]
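For the proof we turn to the $P_1$ equation (7.7):
\[
\frac{dP_1}{dt} = -4\Gamma P_0P_1^2 + R_1, \qquad
R_1 = R_1[\tilde\alpha_0,\tilde\beta_1,\eta,t] = 2\,\Im(\tilde\beta_1F_\beta). \tag{12.57}
\]
The key term in $\Im(\tilde\beta_1F_\beta)$ is of the form $\tilde\beta_1^{m}\tilde\alpha_0^{\#}e^{i\Omega t}$, $\Omega\ne 0$. We have, after two integrations by parts,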
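\[
\begin{aligned}
|P_1(t) - P_1(s)| &\le \left| \int_s^t \tilde\beta_1^{m}(s')\,\tilde\alpha_0^{\#}(s')\,e^{i\Omega s'}\,ds' \right| + \cdots \\
&= \left| \left[ \tilde\beta_1^{m}(s')\,\tilde\alpha_0^{\#}(s')\,\frac{1}{i\Omega}e^{i\Omega s'} \right]_s^t - \int_s^t \frac{d}{ds'}\!\left( \tilde\beta_1^{m}(s')\,\tilde\alpha_0^{\#}(s') \right) \frac{1}{i\Omega}e^{i\Omega s'}\,ds' \right| + \cdots \\
&= O\!\left( \frac{1}{\langle s\rangle} \right) + O\!\left( \frac{\log s}{\langle s\rangle} \right) + O(E_0) \int_s^t \left( P_1(t) - P_1(s') \right)^2 \tilde\alpha_0^{\#}(s')\,e^{i\Omega s'}\,ds' \\
&\le O\!\left( \frac{1}{\langle s\rangle} \right) + O\!\left( \frac{\log s}{\langle s\rangle} \right) + O(E_0) \sup_{s''} \langle s''\rangle^{1/2}|P_1(t) - P_1(s'')| \int_s^t \frac{ds'}{\langle s'\rangle^{2}} \\
&= O\!\left( \frac{1}{\langle s\rangle} \right) + O\!\left( \frac{\log s}{\langle s\rangle} \right) + O\!\left( \frac{E_0}{\langle s\rangle} \right) \sup_{s''} \langle s''\rangle^{1/2}|P_1(t) - P_1(s'')|.
\end{aligned}
\tag{12.58}
\]
Multiplication by $\langle s\rangle^{1/2}$ and taking the supremum over $s\le t$ implies Proposition 12.3 and therewith the proof of part (1) of Proposition 12.1.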
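12.2. Proof of part (2) of Proposition 12.1

The proof is similar to that of part (1) but simpler, since we can allow for the nonlocal terms to be controlled, in addition, by terms of order $Q_0^{1/2}Q_1^{m}$. The leading contributions are again nonlocal, linear $\eta_b$ contributions: From (7.39) we read
\[
\begin{aligned}
i\partial_t\beta_1 \sim \cdots &+ 2\lambda\langle\chi, \bar\Psi_1\eta_b + \Psi_1\eta_b\rangle\,e^{i\lambda_+t}\alpha_0 + \lambda e^{i\lambda_+t}\,2\langle\chi,\Psi_1\eta_b\rangle\,\bar\alpha_0 \\
&+ \lambda e^{i\lambda_+t}\langle\chi, \Psi_1^2\eta_b + 2|\Psi_1|^2\eta_b\rangle + \text{h.o.t.} + X^{-1}(t)\left( A(t) - A(T) \right)X(t)\,\beta_1.
\end{aligned}
\tag{12.59}
\]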
The first two terms on the right-hand side of (12.59) are easily seen to be of order O(α0 )P12m or O(E0ρ )P0 P12 , by integration by parts over the ODE source terms in ηb , as before. The third term contributes to the P1 equations, after normal form transformations and remaining resonant terms: Z t 2 iλ+ t n −iH0 (t−s) 2 ¯ = β1 e ∂t χ, |β1 | e Pc α0 |β1 | χds ˜ 0
and higher-order/similar terms. The leading term, after integration by parts of the integral term (note that H −1 Pc is bounded): Z t 2m −iH0 (t−s) n ˜ =β¯1 eiλ+ t |β1 |2 χ, e P H α ˜ β χds ˜ c 0 0 1 0
≤ c|β1 |3
Z
t
0
ds 1/2 Q (s)Qm 1 . ht − sik 0
To this end we repeat the argument of part (1) to estimate the above integral by 1
Q02 (t)Q1m−1 (t) + O(t−3 ) + h.o.t.(ODE) . The main new type of term we need to control comes from the last term on the term Rβ in (6.35) and (6.36), coming from the difference A(t) − A(T ). This term contributes to the P1 equation terms like 2 iωt 2 = |β1 |2 α20 (t) − α20 (T ) α ¯ 0 (t) , = e β1 (P0 (t) − P0 (T ))O(α20 (T )) . (12.60)
The term with phase eiωt can be integrated by parts, and gives higher-order terms. The term without a phase requires the estimate of Z T dα0 2 2 α0 (t) − α0 (T ) = 2 α0 ds . ds t
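Using that
\[
\frac{d\alpha_0}{ds} = c\,\bar\alpha_0\beta_1^2 e^{-2i\lambda_+t} + \text{h.o.t.}
\]
we can repeatedly integrate by parts to get
\[
\alpha_0^2(t) - \alpha_0^2(T) = \text{local terms} + \int_t^T O(\alpha_0^2)\,\beta_1^{2m}\,ds + \text{h.o.t.}
\]
where local terms $= O(\bar\alpha_0P_1)$ and higher. Clearly, then
\[
|\beta_1|^2\,\bar\alpha_0^2(t)\,[\text{local terms} + \text{h.o.t.}] \le O(E_0)\left[ P_0P_1^2 + \langle t\rangle^{-3} \right].
\]
So, we need to estimate
\[
|\beta_1|^2\,\bar\alpha_0^2(t) \int_t^T O(\alpha_0^2)\,\beta_1^{2m}\,ds \quad\text{by}\quad O(E_0^\rho)\,Q_0Q_1^2 + \text{h.o.t.} \tag{12.61}
\]
For $t\ge t_1$, $|\beta_1|^2\le Q_1$ is monotonic decreasing. Hence
\[
\left| \int_t^T O(\alpha_0^2)\,\beta_1^{2m}\,ds \right| \le |\beta_1(t)|^2 \int_t^T Q_0\,Q_1^{m-1}\,ds.
\]
For $t>t_1$, $Q_1\downarrow$ is bounded by
\[
Q_1(t) \le Q_1(t_1)\left( 1 + 2\Gamma Q_0(t_1)Q_1(t_1)(t-t_1) \right)^{-1} \le c\,\langle t-t_1\rangle^{-1}Q_0(t_1)^{-1} = c\,Q_1(t_1)^{-1}E_0^{-r}\langle t-t_1\rangle^{-1}
\]
since $Q_0(t_1) = E_0^rQ_1(t_1)$. So,
\[
\left( Q_1(t)^{1+r}\,Q_1(t) \right)^{k} \le O(E_0)\,\langle t-t_1\rangle^{-k}.
\]
Hence, for $k>1$, or for $m>2+r+1 = r+3$,
\[
\int_t^T Q_0\,Q_1^{m-1} \le O(E_0)
\]
so
\[
\int_t^T O(\alpha_0^2)\,\beta_1^{2m} \le O(E_0)\,Q_0Q_1^2.
\]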
If t > t0 , t < t1 , Q0 (t) ≤ E0r Q1 (t); therefore Z T Z T T 1 m−1 2 m−3 Q0 Q1 ds = Q0 Q1 Q1 ds = Q0 (s) + R0 (s) Q1m−3 (s) s=t 2Γ t t Z s Z T 1 dQ1 Q0 + ds . − R0 (s)ds Q1m−4 (m − 3) 2Γ ds t0 t
The first term on the right-hand side is local and high order. The second term on the right-hand side is bounded by Z T Z t1 Z t1 Q20 Q1m−2 ds + Q20 Qm−2 ds c (· · ·) + h.o.t. ≤ O(E0 )Q21 + O(E0ρ )hti−3 + c 1 t1
t
t
≤ O(E0 )[Q21 + hti−3 ] + cE0r Hence, Z T t
Q0 Q1m−1 ds ≤ O(E0 )[Q21 + hti−3 ] + O(E0 )
which implies that Z
T t
Z
T t
t
t0
t1 t
Q0 Q1m−1 ds
Q0 Q1m−1 ds ≤ O(E0 )[Q21 + hti−3 ]
for all t ≥ t0 . For 0 < t < t0 , we need to estimate Z |β1 |2 α ¯ 20 (t)
Z
Q0 Q1m−1 ds .
(12.62)
(12.63)
O(α20 )β12m ds .
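Using that for $t<t_0$, $|\alpha_0(t)|^2 \le k_0\langle t\rangle^{-1} \le E_0\langle t_0\rangle^{-1}\langle t\rangle^{-1}$, the above expression is bounded by
\[
O(E_0)\,k_0(t_0)\,\langle t\rangle^{-1}\,k_0(t_0)\ln\frac{t_0}{\langle t\rangle}
\le O(E_0)\,k_0(t_0)\,\langle t\rangle^{-1}\langle t\rangle^{-1}\,\frac{\langle t\rangle}{\langle t_0\rangle}\ln\frac{t_0}{\langle t\rangle}
\le O(E_0)\,k_0(t_0)\,\langle t\rangle^{-2}.
\]
This completes the proof of Proposition 12.1.

13. $R_1^{(j)}(\eta_b)$ terms of Proposition 11.1

Proposition 13.1. Assume either $t\le t_0$ or monotonicity property Q on $[t_0,t_0+\delta_*]$. Then, the terms $R_1^{(i)}[\eta_b]$, $i=0,1$ in (11.1) satisfy the estimates:
\[
\left| R_1^{(i)}[\eta_b] \right| \le O(E_0)\left[ 2\Gamma Q_0Q_1^2 + \langle t_0\rangle^{-1}\langle t\rangle^{-2}\left( \mathrm{Poly}[P_0(0),P_1(0),P_0(T),P_1(T)] \right) \right] \tag{13.1}
\]
where Poly$[\cdots]$ stands for polynomial in the bracketed variables.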
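Proof. The contributions to $R_1^{(i)}(\eta_b)$ come from linear and nonlinear terms in $\eta_b$ in the $P_i$ equations. Consider first the nonlinear contributions: In the $P_0$ equations we have terms like (7.41)
\[
\bar\alpha_0^2\langle\chi,\eta_b^2\rangle, \quad
\bar\alpha_0\bar\beta_1 e^{i\lambda_+t}\langle\chi,\eta_b^2\rangle, \quad
e^{i\lambda_+t}\bar\alpha_0\beta_1\langle\chi,|\eta_b|^2\rangle, \quad
\langle\chi,|\eta_b|^2\eta_b\rangle\,\bar\alpha_0. \tag{13.2}
\]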
Since for $t\in I_0, I_2$ we have time decay of either $\alpha_0$ or $\beta_1$ respectively, the main contribution is when $t\in I_1 = [t_0,t_1]$. For $t\in I_2$, the bound on $\eta_b$ we have is
\[
\|\eta_b(t)\|_{H^k} \le c\,Q_0^{1/2}(t) \int_0^t \frac{ds}{\langle t-s\rangle^{3/2}}\,Q_1^{m}(s).
\]
The second and third terms can be integrated by parts leaving terms of the type
\[
O(\beta_1^2)\,O(\bar\alpha_0)\,\partial_t^m\langle\chi, |\eta_b|^2 + \eta_b^2\rangle.
\]
We also need to integrate by parts the two other terms. For this we need to pull out a phase factor from the leading nonlocal term. Pulling a phase as in the proof of Proposition 12.1 we are left with estimating terms of the type
\[
O(\alpha_0^2 + \alpha_0\beta_1)\langle\chi,\eta_b\,\partial_t^k\eta_b\rangle + \langle\chi,|\eta_b|^2\partial_t^k\eta_b\rangle\,O(\alpha_0).
\]
As in the estimates of Proposition 10.3, for $t\in I_1$,
\[
\|\partial_t^k\eta_b\|_{H^s} \le C\,Q_0^{1/2}(t) \int_0^t \frac{ds}{\langle t-s\rangle^{k'}}\,Q_1^{m}(s)
\]
with $k'$ large for $k$ large. To this end we use the following.
Proposition 13.2. For t ∈ I1 : Z t ds ¯ Qm (s) ≤ O(E0 )Q21 (t) . k0 1 ht − si t0 Proof of Proposition 13.2. Z t Z t Z t ds ds ds m−1 ¯ ¯ m ¯ Q (s) = Q1 (t) Q + (Q1 (s) − Q1 (t))Qm−1 . 1 k0 1 k0 1 ht − si ht − si ht − sik0 t0 t0 t0 The second term on the right-hand side can be estimated, using that Q˙ 1 = 1 −ΓQ0 Q21 + O(Q02 Qm 1 ) and monotonicity of Q0 , as follows: (s ≤ ξ ≤ t) Z t Z t ds ds ¯ m−1 ¯ ˙ ≤ C Q2 (ξ)Qm−1 (s) Q (ξ)Q1 (s) ≤ CQ0 (t) 1 k0 −1 1 k0 −1 1 ht − si ht − si t0 t0 Z t ds 1/2 ¯ + CQ0 (t) Qm (ξ)Qm−1 (s) 1 k0 −1 1 t0 ht − si Z t ds ¯ Q2 (ξ)Qm−1 ≤ O(E0 )Q1 (t) 1 ht − sik0 −1 1 t0 Z t ds 1/2 ¯ Qm−1 (ξ)Qm−1 . + O(E0 )Q1 (t) 1 k0 −1 1 ht − si t0
Repeating this argument, we have Z t Z t ds ds j ¯ m ¯ Q (s) ≤ O(E )Q (t) Qm−2j 0 0 k0 1 k0 −2j 1 ht − si ht − si t0 t0 ≤ O(E0 )Qj1 (t) for k 0 − 2j > 1, which proves the Proposition 13.2. This Proposition together with the estimate 1/2
kηb kW k,∞ ≤ O(E0 )Q0
(13.3)
implies that (0) R1 ≤ O(E0 ) Q0 Q21 + higher order terms .
(13.4)
(1)
The estimates of R1 are similar. It remains to estimate the linear ηb terms in the P0 , P1 equations. The leading order source term of ηb was estimated in Proposition 12.1. It remains to estimate the higher-order corrections. To this end we need to estimate terms of the following type appearing in the P0 equation, (11.1): Z t −iH0 (t−s) 2 2 e Pc ψ0 ηb ds and α ¯ 0 (|β1 | + α0 β1 ) χ, 0
|β1 |2 α0 + α ¯ 0 α 0 β1
χ,
Z
t 0
e−iH0 (t−s) Pc |ηb |2 ηb ds
,
and similar terms in the $P_1$ equation. Again we focus on $t\in I_1$. Since for $0\le s\le t_0$, $\alpha_0$ and $\eta_b$ are of order
\[
\frac{O(E_0)}{\langle s\rangle},
\]
clearly these contributions are of order
\[
O(E_0)\left[ Q_0Q_1^2 + \langle t_0\rangle^{-3} \right],
\]
so it remains to estimate the $s$-integrals above on $I_1$.

But on $I_1$, $\|\eta_b\| \le O(E_0)\,Q_0^{1/2}(t) \le O(E_0)\,Q_1^{1/2}(t)$ and since $Q_0$ is monotonic increasing on $I_1$, the above nonlocal terms are bounded by
\[
O(E_0)\,Q_0^{3/2}(t).
\]
So,
\[
\begin{aligned}
O(|\beta_1|^2\bar\alpha_0)\,O(E_0)\,Q_0^{3/2}(t) &\le O(E_0)\,Q_1^2Q_0, \\
O(|\alpha_0|^2\beta_1)\,O(E_0)\,Q_0^{3/2}(t) &\le O(E_0)\,Q_1^2Q_0,
\end{aligned}
\]
since $Q_0\le O(E_0)\,Q_1$ on $I_1$.
14. Bootstrapping It All

We assume that $t_0<\infty$, where $t_0$ is given by (8.5). Consider the equations for $P_0$ and $P_1$, (11.1) displayed in Proposition 11.1. Explicit in (11.1) are terms which
(i) are driven by the dispersive part of the initial data: $R_0^{0}[\eta_0]$ and $R_0^{1}[\eta_0]$,
(ii) encompass interactions of the two bound states and dispersive waves: $R_1^{0}[\eta_b]$, and
(iii) encompass interactions between bound states: $R_2^{0}[P_0,P_1]$.

By Proposition 11.2 the $R_0^{j}[\eta_0]$, $j=0,1$ terms satisfy:
\[
R_0^{j}[\eta_0] = O\!\left( b_j(t_0,[\eta_0]_X,E_0) \right)\langle t\rangle^{-2} + O(E_0)\,2\Gamma P_0P_1^2. \tag{14.1}
\]
Therefore, by Proposition 9.2 and its proof (see Eq. (9.22)), it is natural to introduce the functions:
\[
Q_0(t) = P_0 - \frac{k_0}{\langle t\rangle}, \qquad Q_1(t) = P_1 + \frac{k_1}{\langle t\rangle} \tag{14.2}
\]
where
k0 = b0 + O(E0 )b21 ,
(14.3)
b0 = ht0 i−1
(14.4)
k1 = 10b1 [η0 ]X + c∗ E02 ,
2 1 3 b1 = ht0 i− 2 [η0 ]X + d∗ E02 ,
(14.5)
where the constants c∗ and d∗ are to be chosen. We find, for any m ≥ 4 and all t ≥ 0: E 2 c∗ dQ0 ≥ 2ΓQ0 Q21 + R10,# [ηb ] + R20,# [Q0 , Q1 ] + 0 2 dt ht0 ihti
p dQ1 E02 d∗ ≤ −4ΓQ0 Q21 + R11,# [ηb ] + R21,# [Q0 , Q1 ] + O(E0 ; m) Q0 Qm . 1 1 − dt ht0 i 2 hti2 (14.6) The analogous reverse inequalities hold as well with slightly different constants. By the definition of t0 , Q0 (t0 ) > 0. Furthermore, using the energy estimate on the bound state amplitudes and (14.2) of Sec. 5.2, we have Q0 (t) + Q1 (t) ≤ CE0 ,
t > 0.
(14.7)
We now introduce a set of norms. The norm of $q(t)\equiv(Q_0(t),Q_1(t),\eta_b(t))$ is defined as
\[
\|q(t)\|_Y = |Q_0|_{y_0}(t) + |Q_1|_{y_1}(t) + \|\eta_b\|_{y_2}(t). \tag{14.8}
\]
The norm, $\|q(t)\|_Y$, encodes all the estimates for $Q_0$, $Q_1$ and $\eta_b$ in the intervals $I_0$, $I_1$ and $I_2$ through the following:
\[
|Q_0|_{y_0}(t) \equiv \sup_{0\le s\le t}|Q_0(s)| + \sup_{0\le s\le\min\{t,t_0\}} \langle t_0\rangle\langle s\rangle\,|Q_0(s)| \tag{14.9}
\]
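\[
|Q_1|_{y_1}(t) \equiv \sup_{0\le s\le t}|Q_1(s)| + \sup_{t_1\le s\le t} |s-t_1|\,Q_1(s)\,\Gamma_0\,Q_0(t_1)\,Q_1(t_1) \tag{14.10}
\]
\[
\|\eta_b\|_{y_2}(t) \equiv \sup_{0\le s\le\min\{t,t_0\}} \langle s\rangle\,\|\eta_b(s)\|_{W^{k,\infty}}
+ \sup_{t_0\le s\le\min\{t,t_1\}} \|\eta_b(s)\|_{W^{k,\infty}}
+ \sup_{t_1\le s\le t} \langle s-t_1\rangle^{1/2}\,\|\eta_b(s)\|_{W^{k,\infty}}. \tag{14.11}
\]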
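To make the weighted suprema in (14.9) and (14.10) concrete, here is a minimal sketch that evaluates them on finitely many samples. It assumes the convention $\langle x\rangle = (1+x^2)^{1/2}$ and treats a supremum over an empty $s$-range as zero, as the text stipulates for these definitions; the function names and sample data are hypothetical.

```python
import math


def bracket(x):
    """Japanese bracket <x> = sqrt(1 + x^2) (assumed convention)."""
    return math.sqrt(1.0 + x * x)


def sup(values):
    """Supremum over a finite sample; an empty range contributes zero."""
    vals = list(values)
    return max(vals) if vals else 0.0


def norm_y0(samples, t, t0):
    """|Q0|_{y0}(t) from samples [(s, Q0(s)), ...], following (14.9)."""
    plain = sup(abs(q) for s, q in samples if 0.0 <= s <= t)
    weighted = sup(bracket(t0) * bracket(s) * abs(q)
                   for s, q in samples if 0.0 <= s <= min(t, t0))
    return plain + weighted


def norm_y1(samples, t, t1, gamma0, q0_t1, q1_t1):
    """|Q1|_{y1}(t) from samples [(s, Q1(s)), ...], following (14.10)."""
    plain = sup(abs(q) for s, q in samples if 0.0 <= s <= t)
    weighted = sup(abs(s - t1) * q * gamma0 * q0_t1 * q1_t1
                   for s, q in samples if t1 <= s <= t)
    return plain + weighted


if __name__ == "__main__":
    samples = [(0.0, 1.0), (1.0, 0.5), (2.0, 0.25)]
    print(norm_y0(samples, t=2.0, t0=1.0))
    print(norm_y1(samples, t=2.0, t1=1.0, gamma0=1.0, q0_t1=2.0, q1_t1=0.5))
```

The second term of each norm carries the decay rate expected on the corresponding interval, so a bound on the norm packages a uniform bound together with that rate.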
In these definitions we use the convention that terms for which the $s$-range is empty are set to zero. By the $H^1$ a priori bounds
\[
\sup_{0\le s\le t}|Q_0(s)| \le E_0\left( 1 + E_0^2\|\eta_b\|_\infty \right) \tag{14.12}
\]
and by definition of t0 , (8.5), for t0 < ∞, sup ht0 ihsi|Q0 (s)| ≤
0≤s≤t0
1 E0 + [η0 ]X . 2
(14.13)
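In terms of these norms, we have bounds on $R_i^{j,\#}$. By Proposition 13.1
\[
|R_1^{0,\#}[\eta_b]| + |R_1^{1,\#}[\eta_b]| \le O(E_0^\rho)\,\|q(t)\|_Y^{l_1}\,Q_0Q_1^2 + C\,\frac{E_0^\rho\,\|q(t)\|_Y^{l_2}}{\langle t_0\rangle\langle t\rangle^2}. \tag{14.14}
\]
By Proposition 12.1
\[
|R_2^{0,\#}[Q_0,Q_1]| \le O(E_0^\rho)\,Q_0Q_1^2 + C\,\frac{E_0^\rho + \|q(t)\|_Y^{l}}{\langle t_0\rangle\langle t\rangle^2} \tag{14.15}
\]
\[
|R_2^{1,\#}[Q_0,Q_1]| \le O(E_0^\rho)\,Q_0Q_1^2 + C\,\frac{E_0^\rho + \|q(t)\|_Y^{l}}{\langle t_0\rangle^{1/2}\langle t\rangle^2} \tag{14.16}
\]
where $l\ge 2$.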
By the definition of $I_0$ ($0\le t\le t_0$) and Propositions 11.2, 12.1 and 13.1, we have estimates (14.14)–(14.16). Therefore, for an appropriate choice of $c_*$ and $d_*$ we have for $0\le t\le t_0$
\[
\begin{aligned}
\frac{dQ_0}{dt} &\ge 2\Gamma_0Q_0Q_1^2 + \frac{c_*}{2}\,\frac{E_0^2}{\langle t_0\rangle\langle t\rangle^2}, \\
\frac{dQ_1}{dt} &\le -4\Gamma_0Q_0Q_1^2 + O(E_0;m)\,\sqrt{Q_0}\,Q_1^{m} - \frac{d_*}{2}\,\frac{E_0^2}{\langle t_0\rangle^{1/2}\langle t\rangle^2},
\end{aligned}
\tag{14.17}
\]
where $m\ge 4$. Note that by definition of $I_0$, $Q_0(t)<0$ for $t\in I_0$. By continuity, (14.17) holds for $t_0\le t\le t_0+\delta$, for some $\delta>0$. It follows, using that $Q_0(t_0)>0$ and Propositions 9.2–9.4, that (11.2) (Monotonicity Property Q) holds on $t_0\le t\le t_0+\delta$. Therefore, by Propositions 11.2, 12.1 and 13.1 the terms $J_0$ and $J_1$ in (9.8) both satisfy the bound
\[
|J_k| \le \frac{O(E_0^{2+\rho})}{\langle t_0\rangle\langle t\rangle^2} + E_0\,Q_0Q_1^2, \quad \rho>0, \quad k=0,1. \tag{14.18}
\]
Therefore, for $E_0$ sufficiently small (14.17) holds with $c_*$, $d_*$ replaced by $c_*/2$, $d_*/2$. Define
\[
T_* = \sup\{\, t\ge t_0 : \text{(14.17) holds for some } c_*>0 \text{ and } d_*>0 \,\}. \tag{14.19}
\]
For $t\in[0,T_*)$, $\|q(t)\|_Y$ is small. We claim that $T_*=\infty$. Suppose $t_0\le T_*<\infty$. Then, for $t\in[t_0,T_*)$ we have, by (14.17), (14.18) and the above argument using Propositions 9.2–9.4, that the monotonicity property (11.2) holds at $t=T_*$ and slightly beyond. Thus, the a priori bounds on $J_0$, $J_1$ of Propositions 11.2, 12.1 and 13.1, the $\eta_b$-bounds of Propositions 10.7–10.9, and the smallness of $Q_0$ and $Q_1$ imply persistence of the inequalities (14.17), with perhaps a slightly smaller choice of positive constants $c_*$ and $d_*$. This implies that $T_*=\infty$.

15. Nongeneric Behavior

Recall that $t_0$ is defined by (8.5) and consider the case where $t_0=\infty$. We would like to show that
\[
P_0(t)\to 0 \text{ as } t\to\infty, \qquad P_1(t) \text{ has a limit}.
\]
The following is a consequence of the definition of $t_0$.

Proposition 15.1. Assume $t_0=\infty$. Then, $P_0(t) = O\!\left( [\eta_0]_X\langle t\rangle^{-2} \right)$. Therefore, $\alpha_0\to 0$ and the ground state decays.

Proposition 15.2. Let $t_0=\infty$. Then, $\beta_1$ has a limit as $t\to\infty$.

States with this (nongeneric) behavior were constructed in [55].

Proof of Proposition 15.2. Equation (7.7), together with the above estimate
\[
P_0(t) = O(\eta_0)\langle t\rangle^{-2} \tag{15.1}
\]
implies
\[
\frac{dP_1}{dt} = \left( -4\Gamma P_0P_1^2 + O(\eta_0)\langle t\rangle^{-3/2} \right)\left( 1 + O(P_0) + O(P_1) \right) + \Re\!\left( c\,e^{i\lambda_+t}|\beta_1|^2\beta_1\alpha_0 \right) + \text{h.o.t.} \tag{15.2}
\]
To show that $P_1$ has a limit, we show that $\int_0^t\partial_sP_1(s)\,ds$ has a limit. All terms other than the $O(\alpha_0)$ term on the right-hand side are absolutely integrable since $P_0 = O(\langle t\rangle^{-2})$. It is left to integrate the $O(|\beta_1|^2\bar\beta_1\alpha_0)$ term. For $T$ given, let $\beta_T^2\equiv\beta_1(T)^2$. Then, Eq. (7.52) reads
\[
2i\partial_t\alpha_0 = \lambda\langle\psi_{0*},\Psi_1(t)^2\rangle\,\alpha_0 + \text{integrable in } t. \tag{15.3}
\]
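Using the expression for $\Psi_1(t) = \alpha_1\psi_1(\cdot,|\alpha_1|^2)$:
\[
\begin{aligned}
2i\partial_t\alpha_0 &= \lambda\langle\psi_{0*},\alpha_1(t)^2\psi_{1*}^2\rangle\,\bar\alpha_0 + \text{h.o.t.} \\
&= \lambda\langle\psi_{0*},\psi_{1*}^2\rangle\,e^{-2i\lambda_+t}\beta_1(t)^2\,\bar\alpha_0(t) + O(\alpha_0^2(T)\bar\beta_1) + \text{h.o.t.} \\
&\equiv \tilde\lambda e^{-2i\lambda_+t}\beta_T^2\,\bar\alpha_0(t) + \tilde\lambda e^{-2i\lambda_+t}\left[ \beta_1^2(t) - \beta_1^2(T) \right]\bar\alpha_0(t) + O\!\left( \frac{1}{T^2} \right) + \text{h.o.t.}
\end{aligned}
\tag{15.4}
\]
where $\tilde\lambda\equiv\lambda\langle\psi_{0*}\psi_{1*}^2\rangle$ and $\beta_T^2\equiv\beta_1^2(T)$. Solving the homogeneous part of (15.4):
\[
2i\partial_t\hat\alpha_0 = \tilde\lambda e^{-2i\lambda_+t}\beta_T^2\,\hat\alpha_0(t) \tag{15.5}
\]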
we have, using the Ansatz α0 (t) = A(t)ei(λ−a)t + B(t)ei(λ+a)t with ˜ −2iλ+ t [β 2 (t) − β 2 (T )]Ae ¯ iθ(t) + h.o.t. A˙ ∼ λe 1 1 and a similar equation for B(t). We have dP1 = −4ΓP0 P12 + < ceiθA t |β1 |2 β¯1 A + eiθB t |β1 |2 β¯1 B + h.o.t. , dt θA , θB 6= 0. Integration of the above equation, integration by parts (twice) of A and B, implies Z T 1 1 ¯ 0 )dt0 + h.o.t.hti 12 |β1 |2 β¯1 ht0 i−1 ht0 i[β12 (t0 ) − β12 (T )]2 A(t hti 2 |P1 (t) − P1 (T )| ≤ Chti 2 t
≤ CE0m
Z
T t
1
ht0 i− 2 −1 dt0
!
1
sup ht 2 i|P1 (t0 ) − P1 (T )|
0≤t0 ≤T
2
1
+ suphti 2 h.o.t. t
1
⇒ |P1 (t) − P1 (T )| ≤ Chti− 2
˙ which implies integrability of A(t) and limit of P1 (t). Acknowledgments This work was supported in part by grants from the National Science Foundation. Appendix A. Notation
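$P_{c*}$: projection onto the continuous spectral part of the self-adjoint operator $H$.

$\langle f, g\rangle = \int \bar f g$.

$$\begin{pmatrix} a \\ \text{c.c.}\end{pmatrix} = \begin{pmatrix} a \\ \bar a\end{pmatrix}. \tag{A.1}$$

For $j = 1, 2$, let $\pi_j : \mathbb{C}^2 \to \mathbb{C}^1$ be defined by
$$\pi_1\begin{pmatrix} z \\ w\end{pmatrix} = z, \qquad \pi_2\begin{pmatrix} z \\ w\end{pmatrix} = w.$$

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0\end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1\end{pmatrix}. \tag{A.2}$$

Plemelj identities:
$$(x \mp i0)^{-1} = \text{P.V.}\, x^{-1} \pm i\pi\delta(x). \tag{A.3}$$

$\vec\nabla_j = (\partial_{\alpha_j}, \partial_{\bar\alpha_j})$.

$\chi(x, p)$ denotes a real-valued localized function of $x$ which depends smoothly on a parameter $p$.

$\chi_k^{(j)}$ denotes a spatially localized function of order $|\alpha_j|^k$ as $|\alpha_j| \to 0$.

$O_k^{(j)}$ denotes a quantity which is of order $|\alpha_j|^k$ as $|\alpha_j| \to 0$. Both $\chi_k^{(j)}$ and $O_k^{(j)}$ are invariant under the map $\alpha_j \mapsto \alpha_j e^{i\gamma}$.

$O_k^{(0,1)} = O_{k_1}^{(0)} O_{k_2}^{(1)}$, $k = k_1 + k_2$.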
Appendix B. Proof of Proposition 4.2

Parts (i), (iii) and (v) of Proposition 4.2 follow from [58]. We now prove parts (ii) and (iv) by a perturbation argument about the case $\alpha_0 = 0$.

Consider the eigenvalue problem $H_0\vec f = \mu\vec f$. Since $\alpha_0$ is assumed small, it is natural to make explicit the leading order and perturbation terms. Thus we have
$$H_0\vec f = \sigma_3\left[(H - E_{0*}) - E_0^{(1)}|\alpha_0|^2 I + \lambda\psi_0^2\begin{pmatrix} 2|\alpha_0|^2 & \alpha_0^2 \\ \bar\alpha_0^2 & 2|\alpha_0|^2\end{pmatrix}\right]\vec f = \mu\vec f. \tag{B.1}$$
Recall that $E_0^{(1)}$ and $\psi_0$ are defined in Proposition 3.1. The zeroth order problem ($\alpha_0 = 0$) is
$$\sigma_3(H - E_{0*})\vec f_0 = \mu_0\vec f_0, \tag{B.2}$$
which has two linearly independent solutions:
$$\mu_0 = E_{1*} - E_{0*}, \qquad \vec f_0 = \begin{pmatrix} 1 \\ 0\end{pmatrix}\psi_{1*}, \tag{B.3}$$
$$\mu_0 = -(E_{1*} - E_{0*}), \qquad \sigma_1\vec f_0. \tag{B.4}$$
We develop the perturbation theory of (B.3). That of the second is completely analogous.
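For $\alpha_0$ small we define the perturbations about the zeroth order eigenstates via
$$\vec f = \vec f_0 + \vec f_1, \tag{B.5}$$
$$\mu = E_{1*} - E_{0*} + \mu_1. \tag{B.6}$$
Substitution into (B.1) yields
$$\begin{aligned}
[\sigma_3(H - E_{0*}) - (E_{1*} - E_{0*})I]\vec f_1 &= |\alpha_0|^2 E_0^{(1)}\sigma_3\vec f_0 - \lambda\psi_0^2\sigma_3\begin{pmatrix} 2|\alpha_0|^2 & \alpha_0^2 \\ \bar\alpha_0^2 & 2|\alpha_0|^2\end{pmatrix}\vec f_0 + \mu_1\vec f_0\\
&\quad + |\alpha_0|^2 E_0^{(1)}\sigma_3\vec f_1 - \lambda\psi_0^2\sigma_3\begin{pmatrix} 2|\alpha_0|^2 & \alpha_0^2 \\ \bar\alpha_0^2 & 2|\alpha_0|^2\end{pmatrix}\vec f_1 + \mu_1\vec f_1.
\end{aligned} \tag{B.7}$$
We consider, individually, the first and second equations of the system (B.7), governing $f_{1j} = \pi_j\vec f_1$, $j = 1, 2$. The first component of (B.7) is
$$(H - E_{1*})f_{11} = |\alpha_0|^2\big(E_0^{(1)} - 2\lambda\psi_0^2\big)\psi_{1*} + \mu_1\psi_{1*} + |\alpha_0|^2\big(E_0^{(1)} - 2\lambda\psi_0^2\big)f_{11} - \lambda\alpha_0^2\psi_0^2 f_{12} + \mu_1 f_{11}. \tag{B.8}$$
Let $\nu_* = 2E_{0*} - E_{1*}$. The second component of (B.7) is
$$(H - \nu_*)f_{12} = -\lambda\bar\alpha_0^2\psi_0^2\psi_{1*} + |\alpha_0|^2 E_0^{(1)} f_{12} - \lambda\psi_0^2\big(\bar\alpha_0^2 f_{11} + 2|\alpha_0|^2 f_{12}\big) - \mu_1 f_{12}. \tag{B.9}$$
We wish to make the dependence of $\vec f_1$ on $\alpha_0$ and $\bar\alpha_0$ explicit. Define
$$\mu_1 = |\alpha_0|^2\tilde\mu_1, \qquad f_{11} = |\alpha_0|^2\tilde f_{11}, \qquad f_{12} = \bar\alpha_0^2\tilde f_{12}. \tag{B.10}$$
Equations (B.8) and (B.9) reduce to the following system for $\tilde f_{11}$ and $\tilde f_{12}$:
$$(H - E_{1*})\tilde f_{11} = \big(E_0^{(1)} - 2\lambda\psi_0^2\big)\psi_{1*} + \tilde\mu_1\psi_{1*} + |\alpha_0|^2\big(E_0^{(1)} - 2\lambda\psi_0^2\big)\tilde f_{11} - \lambda|\alpha_0|^2\psi_0^2\tilde f_{12} + |\alpha_0|^2\tilde\mu_1\tilde f_{11}, \tag{B.11}$$
$$(H - \nu_*)\tilde f_{12} = -\lambda\psi_0^2\psi_{1*} + |\alpha_0|^2 E_0^{(1)}\tilde f_{12} - \lambda|\alpha_0|^2\psi_0^2\big(\tilde f_{11} + 2\tilde f_{12}\big) - |\alpha_0|^2\tilde\mu_1\tilde f_{12}. \tag{B.12}$$
We seek a solution to the system (B.11), (B.12):
$$|\alpha_0|^2 \mapsto \big(\tilde f_{11}(|\alpha_0|^2), \tilde f_{12}(|\alpha_0|^2), \tilde\mu_1(|\alpha_0|^2)\big) \in L^2 \times L^2 \times \mathbb{R}, \tag{B.13}$$
defined in a neighborhood of $\alpha_0 = 0$. For $\alpha_0 = 0$ the system (B.11), (B.12) reduces to
$$(H - E_{1*})\tilde f_{11}^0 = \big(E_0^{(1)}(0) - 2\lambda\psi_{0*}^2\big)\psi_{1*} + \tilde\mu_1\psi_{1*}, \tag{B.14}$$
$$(H - \nu_*)\tilde f_{12}^0 = -\lambda\psi_{0*}^2\psi_{1*}. \tag{B.15}$$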
Note that $\nu_* < 0$ is not in the spectrum of $H$. Therefore, (B.15) is solvable for $\tilde f_{12}^0$ and we have
$$\tilde f_{12}^0 = -\lambda(H - \nu_*)^{-1}\psi_{0*}^2\psi_{1*}. \tag{B.16}$$
Since $(H - E_{1*})\psi_{1*} = 0$, (B.14) is solvable if and only if its right-hand side is orthogonal to $\psi_{1*}$. This determines $\tilde\mu_1(0)$:
$$\tilde\mu_1(0) = -E_0^{(1)}(0) + 2\lambda\langle\psi_{0*}^2, \psi_{1*}^2\rangle, \tag{B.17}$$
and now (B.14) can be solved for $\tilde f_{11}^0$.

To solve in a neighborhood of $\alpha_0 = 0$ we proceed as follows. Rewrite (B.12) as
$$(H + |\alpha_0|^2 W_{12} - \nu_*)\tilde f_{12} = -\lambda\psi_0^2\psi_{1*} - \lambda|\alpha_0|^2\psi_0^2\tilde f_{11}, \tag{B.18}$$
where $W_{12}$ is a multiplication operator defined by
$$W_{12}(|\alpha_0|^2, \tilde\mu_1) = -E_0^{(1)} + 2\lambda\psi_0^2 + \tilde\mu_1. \tag{B.19}$$
For $\tilde\mu_1$ in a fixed compact set and $\alpha_0$ sufficiently small, the operator $H + |\alpha_0|^2 W_{12} - \nu_*$ has a bounded inverse, $B(|\alpha_0|^2)$. Thus,
$$\tilde f_{12} \equiv \tilde f_{12}[\tilde f_{11}, |\alpha_0|^2] = -\lambda B(|\alpha_0|^2)\psi_0^2\psi_{1*} - \lambda|\alpha_0|^2 B(|\alpha_0|^2)\psi_0^2\tilde f_{11}. \tag{B.20}$$
Substitution of (B.20) into (B.11) yields the following closed equation for $\tilde f_{11}$:
$$(H - E_{1*})\tilde f_{11} = \big(E_0^{(1)} - 2\lambda\psi_0^2\big)\psi_{1*} + \tilde\mu_1\psi_{1*} + |\alpha_0|^2 W_{11}\tilde f_{11}, \tag{B.21}$$
where the operator $W_{11}$ is defined by
$$W_{11}(|\alpha_0|^2, \tilde\mu_1) = \big(E_0^{(1)} - 2\lambda\psi_0^2\big) + \tilde\mu_1 + \psi_0^2\tilde f_{12}[\cdot, |\alpha_0|^2]. \tag{B.22}$$
Setting the inner product of the right-hand side of (B.21) with $\psi_{1*}$ equal to zero gives the solvability condition for (B.21):
$$\tilde\mu_1 = 2\lambda\langle\psi_0^2, \psi_{1*}^2\rangle - E_0^{(1)} - |\alpha_0|^2\langle\psi_{1*}, W_{11}\tilde f_{11}\rangle. \tag{B.23}$$
The system (B.21), (B.23) is of the form
$$F(\tilde f_{11}, \tilde\mu_1, s) = 0, \tag{B.24}$$
with the solution $\tilde f_{11} = \tilde f_{11}^0$, $\tilde\mu_1 = \tilde\mu_1(0)$, $s = 0$ defined by (B.17). Furthermore, the Jacobian of $F(\tilde f_{11}, \tilde\mu_1, s)$ with respect to $(\tilde f_{11}, \tilde\mu_1)$, evaluated at $(\tilde f_{11}^0, \tilde\mu_1(0), 0)$, is given by
$$\begin{pmatrix} H - E_{1*} & -\psi_{1*} \\ 0 & I\end{pmatrix}, \tag{B.25}$$
which maps $H^2 \times \mathbb{R}$ one to one and onto $L^2 \times \{\langle g, \psi_{1*}\rangle : g \in L^2\}$. Therefore, by the implicit function theorem [32], we have a real analytic curve of solutions $s \mapsto (\tilde f_{11}(s), \tilde\mu_1(s), s)$, defined in a neighborhood of $s = 0$ and coinciding with $(\tilde f_{11}^0, \tilde\mu_1(0), 0)$ for $s = 0$. The family of solutions we seek is obtained by restriction to $s = |\alpha_0|^2 \ge 0$. This completes the proof of Proposition 4.2.
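The solvability step in the proof above (the eigenvalue correction is fixed by requiring orthogonality to the unperturbed eigenfunction) has a simple finite-dimensional analogue. The following is only an illustrative sketch, not the operator setting of Proposition 4.2: the 2×2 matrix, the values of `E0`, `E1` and `W`, and all function names are hypothetical. It checks that the first-order shift $\langle e_1, W e_1\rangle$ reproduces the exact perturbed eigenvalue up to an error of order $s^2$.

```python
import math

def eigenvalues_2x2(a, b, d):
    """Return the eigenvalues (ascending) of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius, mean + radius

# Hypothetical unperturbed levels and symmetric perturbation (illustrative values only).
E0, E1 = -2.0, -1.0
W = [[0.3, 0.5], [0.5, -0.7]]

def exact_excited_level(s):
    """Upper eigenvalue of diag(E0, E1) + s * W, computed in closed form."""
    _, upper = eigenvalues_2x2(E0 + s * W[0][0], s * W[0][1], E1 + s * W[1][1])
    return upper

def first_order_excited_level(s):
    """E1 + s * <e1, W e1>: the shift fixed by orthogonality to the unperturbed eigenvector e1."""
    return E1 + s * W[1][1]

def first_order_error(s):
    """Absolute difference between the exact level and its first-order approximation."""
    return abs(exact_excited_level(s) - first_order_excited_level(s))

if __name__ == "__main__":
    for s in (1e-1, 1e-2, 1e-3):
        print(s, first_order_error(s))
```

Reducing `s` by a factor of ten should reduce the error by roughly a factor of one hundred, which is the quadratic remainder one expects from an analytic curve of solutions.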
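Appendix C. A Commutator Term

In this section we record a calculation of a "commutator term" appearing in the modulation equations of Sec. 5.

Proposition C.1.
$$i\langle\partial_t(\sigma_3\xi_{01}), \Phi_2\rangle = i\partial_t(|\alpha_0|^2)\left\langle F_0' G_0\begin{pmatrix} 1 \\ 1\end{pmatrix}, \Phi_2\right\rangle + \partial_t\gamma_0\,|\alpha_0|^2\left\langle\chi G_0\begin{pmatrix} 1 \\ 1\end{pmatrix}, \Phi_2\right\rangle, \tag{C.1}$$
$$i\langle\partial_t(\sigma_3\xi_{02}), \Phi_2\rangle = i\partial_t(|\alpha_0|^2)\left\langle F_0''\sigma_3 G_0\begin{pmatrix} 1 \\ 1\end{pmatrix}, \Phi_2\right\rangle + (\partial_t\gamma_0)|\alpha_0|^2\left\langle\chi\sigma_3 G_0\begin{pmatrix} 1 \\ 1\end{pmatrix}, \Phi_2\right\rangle. \tag{C.2}$$

Proof. By direct computation from (4.23),
$$\partial_t G_0(t) = i(\partial_t\gamma_0)\sigma_3 G_0(t). \tag{C.3}$$
Note also that by (4.37)
$$\sigma_3 G_0 F_0\begin{pmatrix} 1 \\ 1\end{pmatrix} = \xi_{01} = 2\zeta_{01} + |\alpha_0|^2\sigma_3 G_0\chi(x; |\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix},$$
$$G_0 F_0'\begin{pmatrix} 1 \\ 1\end{pmatrix} = \xi_{02} = \frac12\zeta_{02} + |\alpha_0|^2 G_0\chi(x; |\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix}.$$
Using these relations we have for $j = 1$ that
$$\begin{aligned}
\partial_t(\sigma_3\xi_{01}(t)) &= \partial_t(|\alpha_0|^2) G_0 F_0'(|\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix} + i(\partial_t\gamma_0)\sigma_3 G_0 F_0(|\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix}\\
&= \partial_t(|\alpha_0|^2) G_0 F_0'(|\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix} + 2i(\partial_t\gamma_0)\zeta_{01}(t) + (\partial_t\gamma_0)|\alpha_0|^2\chi(x; |\alpha_0|^2) G_0\begin{pmatrix} 1 \\ 1\end{pmatrix}.
\end{aligned}$$
Substitution into the inner product $\langle\partial_t(\sigma_3\xi_{01}(t)), \Phi_2\rangle$ and using the constraint $\langle\zeta_{01}(t), \Phi_2\rangle = 0$ yields the result for $j = 1$. For $j = 2$,
$$\begin{aligned}
\partial_t(\sigma_3\xi_{02}(t)) &= \partial_t(|\alpha_0|^2)\sigma_3 G_0 F_0''(|\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix} + i(\partial_t\gamma_0) G_0 F_0'(|\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix}\\
&= \partial_t(|\alpha_0|^2)\sigma_3 G_0 F_0''(|\alpha_0|^2)\begin{pmatrix} 1 \\ 1\end{pmatrix} + \frac{i}{2}(\partial_t\gamma_0)\zeta_{02}(t) + \partial_t\gamma_0\,|\alpha_0|^2\chi(x; |\alpha_0|^2)\sigma_3 G_0\begin{pmatrix} 1 \\ 1\end{pmatrix}.
\end{aligned}$$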
Substitution into the inner product $\langle\partial_t(\sigma_3\xi_{02}(t)), \Phi_2\rangle$ and using the constraint $\langle\zeta_{02}(t), \Phi_2\rangle = 0$ yields the result for $j = 2$. This completes the proof of Proposition C.1.

References

[1] V. I. Arnol’d, Geometric Methods in the Theory of Ordinary Differential Equations (Springer-Verlag, New York, 1983).
[2] V. S. Buslaev and G. S. Perel’man, Scattering for the nonlinear Schrödinger equation: states close to a soliton, St. Petersburg Math. J. 4 (1993) 1111–1142.
[3] V. S. Buslaev and G. S. Perel’man, On the stability of solitary waves for nonlinear Schrödinger equation, Nonlinear evolution equations, Amer. Math. Soc. Transl. Ser. 2 164 (Amer. Math. Soc., Providence, RI, 1995), pp. 75–98.
[4] V. S. Buslaev and C. Sulem, On asymptotic stability of solitary waves for nonlinear Schrödinger equations, Ann. Inst. H. Poincaré Anal. Non Linéaire 20 (2003) 419–475.
[5] J. Carr, Applications of Centre Manifold Theory (Springer-Verlag, New York, 1981).
[6] T. Cazenave, An Introduction to the Nonlinear Schrödinger Equation, Textos de Métodos Matemáticos 26 (Instituto de Matemática, UFRJ, Rio De Janeiro, 1989).
[7] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators (Springer-Verlag, Berlin Heidelberg, New York, 1987).
[8] E. Coddington and N. Levinson, Theory of Ordinary Differential Equations (McGraw-Hill, New York, 1955).
[9] J. Fröhlich, T.-P. Tsai and H.-T. Yau, The point-particle limit of the nonlinear Hartree equation, Commun. Math. Phys. 225 (2002) 223–274.
[10] S. Cuccagna, Stabilization of solutions to nonlinear Schrödinger equations, Commun. Pure Appl. Math. 54(9) (2001) 1110–1145.
[11] S. Cuccagna, On asymptotic stability of ground states of nonlinear Schrödinger equations, preprint.
[12] T. Cazenave and P.-L. Lions, Orbital stability of standing waves for some nonlinear Schrödinger equations, Commun. Math. Phys. 85 (1982) 549–561.
[13] E. B. Davies, Quantum Theory of Open Systems (Academic Press, 1976).
[14] M. Grillakis, Analysis of the linearization around a critical point of an infinite-dimensional Hamiltonian system, Commun. Pure Appl. Math. 43(3) (1990) 299–333.
[15] M. Grillakis, J. Shatah and W. Strauss, Stability theory of solitary waves in the presence of symmetry. I, J. Funct. Anal. 74(1) (1987) 160–197.
[16] J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields (Springer-Verlag, 1983).
[17] R. H. Goodman, M. I. Weinstein and P. J. Holmes, Nonlinear propagation of light in one-dimensional periodic structures, J. Nonlinear Sci. 11(2) (2001) 123–168.
[18] M. Goldberg and W. Schlag, Dispersive estimates for Schrödinger operators in dimensions one and three, to appear in Commun. Math. Phys.
[19] R. H. Goodman, P. J. Holmes and M. I. Weinstein, Strong NLS-Soliton Defect Interactions, submitted to Physica D.
[20] R. H. Goodman, R. E. Slusher and M. I. Weinstein, Stopping light on a defect, J. Opt. Soc. Am. B19 (2002) 1635–1652.
[21] K. Hepp, The classical limit for quantum mechanical correlation functions, Commun. Math. Phys. 35 (1974) 265–277.
[22] T. Kato, On nonlinear Schrödinger equations, Ann. Inst. H. Poincaré Phys. Théor. 46, 113–129.
[23] Yu. S. Kivshar, D. E. Pelinovsky, T. Cretegny and M. Peyrard, Internal modes of solitary waves, Phys. Rev. Lett. 80 (1998) 5032.
[24] A. Jensen and T. Kato, Spectral properties of Schrödinger operators and time-decay of wave functions, Duke Math. J. 46 (1979) 583–611.
[25] J.-L. Journé, A. Soffer and C. D. Sogge, Decay estimates for Schrödinger operators, Commun. Pure Appl. Math. 44 (1991) 573–604.
[26] P. G. Kevrekidis and M. I. Weinstein, Dynamics of lattice kinks, Physica D 142 (2000) 113–152.
[27] M. Kwong, Uniqueness of positive solutions of $\Delta u - u + u^p = 0$, Arch. Ration. Mech. Anal. 105 (1989) 243–266.
[28] E. Kirr and M. I. Weinstein, Parametrically excited Hamiltonian partial differential equations, SIAM J. Math. Anal. 33 (2001) 16–52.
[29] E. Kirr and M. I. Weinstein, Metastable states in parametrically excited multimode Hamiltonian partial differential equations, Commun. Math. Phys. 236 (2003) 335–372.
[30] E. H. Lieb, R. Seiringer and J. Yngvason, A rigorous derivation of the Gross–Pitaevskii energy functional for a two-dimensional Bose gas, Commun. Math. Phys. 224 (2001) 17–31.
[31] M. Murata, Rate of decay of local energy and spectral properties of elliptic operators, Japan J. Math. 6 (1980) 77–127.
[32] L. Nirenberg, Topics in Nonlinear Functional Analysis, Courant Institute Lecture Notes (1974).
[33] C.-A. Pillet and C. E. Wayne, Invariant manifolds for a class of dispersive, Hamiltonian, partial differential equations, J. Differential Equations 141 (1997) 310–326.
[34] M. Reed and B. Simon, Functional Analysis, Modern Methods of Mathematical Physics, Volume 1 (Academic Press, 1972).
[35] M. Reed and B. Simon, Analysis of Operators, Modern Methods of Mathematical Physics, Volume 4 (Academic Press, 1978).
[36] I. Rodnianski, W. Schlag and A. Soffer, Dispersive analysis of the charge transfer models, to appear in Commun. Pure Appl. Math.
[37] I. Rodnianski and W. Schlag, Time decay for solutions of Schrödinger equations with rough and time-dependent potentials, Invent. Math. 155(3) (2004) 451–513.
[38] H. A. Rose and M. I. Weinstein, On the bound states of the nonlinear Schrödinger equation with a linear potential, Physica D 30 (1988) 207–218.
[39] W. Schlag, Stable manifolds for orbitally unstable NLS, http://xxx.lanl.gov/pdf/math.AP/0405435.
[40] W. Schlag, private communication.
[41] I. M. Sigal, Nonlinear wave and Schrödinger equations I. Instability of time-periodic and quasiperiodic solutions, Commun. Math. Phys. 153 (1993) 297–320.
[42] I. M. Sigal, General characteristics of nonlinear dynamics, in Spectral and Scattering Theory; Proceedings of the Taniguchi international workshop, ed. M. Ikawa (Marcel Dekker, Inc. New York–Basel–Hong Kong, 1994), pp. 197–217.
[43] A. Soffer and M. I. Weinstein, Multichannel nonlinear scattering theory for nonintegrable equations, Oleron Proceedings, Springer Lecture Notes in Physics, Vol. 342, eds. T. Balaban, C. Sulem and P. Lochak (1989), pp. 312–327.
[44] A. Soffer and M. I. Weinstein, Multichannel nonlinear scattering theory for nonintegrable equations I & II, Commun. Math. Phys. 133 (1990) 119–146; J. Differential Equations 98 (1992) 376–390.
[45] A. Soffer and M. I. Weinstein, 1995–1996 unpublished notes.
[46] A. Soffer and M. I. Weinstein, Dynamic theory of quantum resonances and perturbation theory of embedded eigenvalues, in Proceedings of Conference on Partial Differential Equations and Applications, CRM Lecture Notes, Vol. 12, eds. P. Greiner, V. Ivrii, L. Seco and C. Sulem, University of Toronto, June 1995 (AMS, 1997), pp. 277–282.
[47] A. Soffer and M. I. Weinstein, Time dependent resonance theory, Geom. Funct. Anal. 8 (1998) 1086–1128.
[48] A. Soffer and M. I. Weinstein, Nonautonomous Hamiltonians, J. Stat. Physics 93 (1998) 359–391.
[49] A. Soffer and M. I. Weinstein, Resonances and radiation damping in Hamiltonian partial differential equations, Invent. Math. 136 (1999) 9–74.
[50] A. Soffer and M. I. Weinstein, Ionization and scattering for short lived potentials, Lett. Math. Phys. 48 (1999) 339–352.
[51] H. Spohn, Kinetic equations from Hamiltonian dynamics, Rev. Mod. Phys. 52 (1980) 569–615.
[52] C. Sulem and P.-L. Sulem, The Nonlinear Schrödinger Equation, Self-Focusing and Wave Collapse (Springer, 1999).
[53] T.-P. Tsai and H.-T. Yau, Asymptotic dynamics of nonlinear Schrödinger equations: Resonance dominated and radiation dominated solutions, Commun. Pure Appl. Math. 55 (2002) 153–216.
[54] T.-P. Tsai and H.-T. Yau, Relaxation of excited states in nonlinear Schrödinger equations, Int. Math. Res. Not. 31 (2002) 1629–1673.
[55] T.-P. Tsai and H.-T. Yau, Stable directions for excited states of nonlinear Schrödinger equations, Commun. Partial Differential Equations 27 (2002) 2363–2402.
[56] A. Vanderbauwhede and G. Iooss, Center manifold theory in infinite dimensions, Dynamics Reported 2 (1990).
[57] R. Weder, Center manifold for nonintegrable nonlinear Schrödinger equations on the line, Commun. Math. Phys. 215(2) (2000) 343–356.
[58] M. I. Weinstein, Modulational stability of ground states of nonlinear Schrödinger equations, SIAM J. Math. Anal. 16 (1985) 472–491.
[59] M. I. Weinstein, Lyapunov stability of ground states of nonlinear dispersive evolution equations, Commun. Pure Appl. Math. 39 (1986) 51–68.
[60] V. Weisskopf and E. Wigner, Berechnung der natürlichen Linienbreite auf Grund der Diracschen Lichttheorie, Z. Phys. 63 (1930) 54–73.
[61] K. Yajima, $W^{k,p}$-continuity of wave operators for Schrödinger operators, J. Math. Soc. Japan 47 (1995) 551–581.
December 23, 2004 10:23 WSPC/148-RMP
00221
Reviews in Mathematical Physics Vol. 16, No. 9 (2004) 1073–1114 © World Scientific Publishing Company
ONE-PARTICLE SUBSPACE OF THE GLAUBER DYNAMICS GENERATOR FOR CONTINUOUS PARTICLE SYSTEMS
YURI KONDRATIEV∗, ROBERT MINLOS† and ELENA ZHIZHINA‡
∗Faculty of Mathematics, University of Bielefeld, D-33501 Bielefeld; BiBoS, University of Bielefeld, D-33501 Bielefeld
†IPPI, RAS, Moscow
‡IPPI, RAS, Moscow
∗[email protected] †[email protected] ‡[email protected]
Received 27 March 2004
Revised 23 August 2004
We consider a Glauber-type stochastic dynamics of continuous particle systems in $\mathbb{R}^d$. We construct a one-particle invariant subspace of the generator of this dynamics in the high temperature and low density regime. We prove that, under some additional assumptions on the decay of the potential, the restriction of the generator to the one-particle subspace is unitarily equivalent to the operator of multiplication by a bounded smooth real-valued function. As a consequence we estimate the spectral gap of the generator and find the second gap between the one-particle branch and the rest of the spectrum.

Keywords: Gibbs measures; equilibrium spatial birth and death processes; K-transform; spectral analysis of the generator.

Mathematics Subject Classifications (2000): 82C21, 60J25
1. Introduction

An equilibrium stochastic dynamics is usually constructed as a stationary Markov process with a given stationary measure. As such a measure we typically use an equilibrium (Gibbs) state of the infinite particle system under consideration. The best-known example to both physicists and mathematicians is the so-called Glauber stochastic dynamics for the Ising model, which is of fundamental importance in theoretical and mathematical physics; see, e.g., the lectures of Martinelli [18] for historical comments and an extensive bibliography. As a generalization of the Glauber dynamics we can consider several types of symmetric Markov processes associated with lattice systems of different nature (for a review see the recent lectures of Albeverio [1]). Generators of these processes are self-adjoint operators in $L^2$-spaces with respect to equilibrium measures, and in several particular models they possess additional nice properties; e.g., they satisfy the Poincaré inequality (that implies a
spectral gap estimate), log-Sobolev inequality, etc. Moreover, in the uniqueness regime and for some specific models, stochastic dynamics generators admit a more detailed spectral analysis. In particular, the problem of the existence of invariant subspaces and spectral representations of generators in these subspaces was studied in [3, 10, 19, 20].

In the case of continuous systems the techniques of equilibrium dynamics are considerably less developed. Starting with the pioneering works by Lang [15, 16], the so-called gradient stochastic dynamics of continuous infinite particle systems was constructed for different classes of interaction potentials, see [2] and references therein. This dynamics can be considered as a diffusion on the configuration space of the system. Due to a simple general argument, we cannot expect any specific spectral information concerning its generator except self-adjointness and positivity. This situation concerns not only the particular case of the gradient stochastic dynamics but should also be expected, in general, for symmetric diffusions on configuration spaces.

To obtain richer spectral properties of Markov generators we need to introduce other types of random evolutions in continuum. One natural possibility here is to consider pure jump type Markov processes on configuration spaces, in particular spatial birth-and-death processes, in which particles do not move but only appear and disappear in the position space of the system. In the case of systems in bounded domains (i.e., in finite volumes in the context of mathematical physics models) these processes admit a quite complete analysis of the existence and uniqueness problems [8, 23]. A special case of general spatial birth-and-death processes in finite volumes corresponding to Gibbs states (as invariant measures) was introduced in [6]. This process was called Glauber-type dynamics for continuous systems, and we will also use this terminology in our paper.
The Glauber-type dynamics in infinite volumes associated with the Poisson measure (or free systems without interaction) has been constructed by Surgailis in [26, 27]. In [6] the generator of Glauber dynamics in a finite volume was studied. More precisely, the authors considered a positive finite range pair potential φ and activity z > 0 which satisfy the condition of the low activity–high temperature (LAHT) regime. Then, with any finite volume Λ ⊂ Rd and a boundary condition η outside Λ, one may associate the finite volume Gibbs measure µΛ,η . The authors considered a Dirichlet form which corresponds to the Glauber dynamics in Λ. They showed that the generator HΛ,η of this Dirichlet form has a spectral gap (0, GΛ,η ), GΛ,η > 0, and, furthermore, the infimum of GΛ,η over all finite volumes Λ and boundary conditions η is positive. This result was extended in [13] to the case of general positive potentials and the infinite volume dynamics and, moreover, an explicit estimate of the spectral gap was shown. To produce this estimate, a coercive identity approach developed in [4] for infinite dimensional diffusion generators was applied. Similar results were obtained using other techniques in [28]. In [28] the hard core case was also considered and
an exponential convergence was established for the finite volume process started from any initial configuration.

In this paper we study more detailed spectral properties of the infinite volume Glauber dynamics generators of continuous gas for positive potentials and in the LAHT regime. Namely, Theorem 2.1 gives the existence of a one-particle invariant subspace for the generator. Moreover, we show a separation of the spectrum in this subspace from the rest of the generator's spectrum in the orthogonal complement to the one-particle subspace. Under an additional assumption on the decay of the potential at infinity, we prove in Theorem 2.2 that the generator in the one-particle subspace is unitarily equivalent to the operator of multiplication by a smooth bounded real-valued function. Moreover, if this decay is exponential, then this function is analytic and the generator has a Lebesgue spectrum in this subspace. The technique used in the analysis of the Glauber dynamics generator is based on an essential modification of an approach developed in [3, 10, 11, 19, 20] for the lattice case.

Note that in the study of the Glauber dynamics for continuous systems, the assumption of positivity of the potential plays an absolutely crucial role. Under general conditions on the potential (e.g. of Ruelle type [25]) one can construct an equilibrium stochastic dynamics of the considered form that gives the Glauber generator as a self-adjoint operator in the corresponding Hilbert space [13]. But additional properties of the generator (including essential self-adjointness, spectral gap, etc.) are shown in the papers mentioned above (which used essentially different techniques) only for positive potentials. We consider this situation as an important open problem in the stochastic dynamics of continuous systems.
We believe that the case of general potentials (even in the LAHT regime) needs essentially new technical ideas for the analysis and a deeper understanding of the nature of spatial birth-and-death processes in general.

2. Glauber Dynamics for Continuous Particle Systems

2.1. Configuration space

The configuration space $\Gamma := \Gamma_{\mathbb{R}^d}$ over $\mathbb{R}^d$, $d \in \mathbb{N}$, is defined as the set of all subsets of $\mathbb{R}^d$ which are locally finite:
$$\Gamma_{\mathbb{R}^d} := \{\gamma \subset \mathbb{R}^d \mid |\gamma_\Lambda| < \infty \text{ for each compact } \Lambda \subset \mathbb{R}^d\},$$
where $|\cdot|$ denotes the cardinality of a set and $\gamma_\Lambda := \gamma \cap \Lambda$. The space $\Gamma$ can be endowed with a topology, namely the weakest topology on $\Gamma$ with respect to which all maps $\Gamma \ni \gamma \mapsto \langle f, \gamma\rangle := \sum_{x\in\gamma} f(x)$, $f \in D$, are continuous. Here, $D := C_0^\infty(\mathbb{R}^d)$ is the space of all infinitely differentiable real-valued functions on $\mathbb{R}^d$ with compact support. We will denote by $B(\Gamma)$ the Borel σ-algebra on $\Gamma$ generated by this topology. Let us consider on $(\Gamma, B(\Gamma))$ the Poisson measures $\pi_z$ with activity $z > 0$, defined by the following conditions:
(1) for any family of mutually disjoint bounded domains $\Lambda_1, \ldots, \Lambda_k$, $\Lambda_j \subset \mathbb{R}^d$, the random variables $N_{\Lambda_j}(\gamma) = |\gamma_{\Lambda_j}|$, $j = 1, \ldots, k$, are independent;
(2) each random variable $N_{\Lambda_j}$ has the Poisson distribution
$$\Pr(N_{\Lambda_j}(\gamma) = n) = \frac{z^n|\Lambda_j|^n}{n!}e^{-z|\Lambda_j|}, \qquad j = 1, \ldots, k,$$
where $|\Lambda_j|$ is the volume of $\Lambda_j$ in $\mathbb{R}^d$ [22].

2.2. Gibbs measures

Now, we proceed to consider a Gibbs reconstruction of the Poisson measure $\pi_z$ by a formal Hamiltonian
$$H(\gamma) = \sum_{x,y\in\gamma}\phi(x - y)$$
with the inverse temperature $\beta$. Properties of the pair interaction potential $\phi(u)$ will be formulated below. The Gibbs measure $\mu_{\beta,z}$ is constructed as a limit when $\Lambda \nearrow \mathbb{R}^d$ of finite volume Gibbs measures corresponding to empty boundary conditions:
$$\mu_{\beta,z} = \lim_{\Lambda\nearrow\mathbb{R}^d}\mu^\Lambda_{\beta,z}. \tag{2.1}$$
The measure $\mu^\Lambda_{\beta,z}$ is defined by the following density with respect to the Poisson measure:
$$\frac{d\mu^\Lambda_{\beta,z}}{d\pi_z} = \frac{1}{Z_\Lambda}\exp\{-\beta H(\gamma_\Lambda)\}, \qquad \Lambda \subset \mathbb{R}^d,$$
where $\gamma_\Lambda = \gamma \cap \Lambda$, and $Z_\Lambda$ is the normalizing factor. We will consider below general assumptions on the pair potential $\phi(u)$ and on the parameters $\beta$, $z$ guaranteeing the existence of the limit (2.1), see for instance [24].

2.3. Equilibrium spatial birth-and-death processes

We will consider a stationary Markov process on the state space $\Gamma$ with the invariant measure $\mu_{\beta,z}$. A generator of the corresponding stochastic semigroup in the functional space $L^2(\Gamma, d\mu_{\beta,z})$ has the form
$$(H_{b,d}F)(\gamma) = \sum_{x\in\gamma} d(x, \gamma\setminus x)\big(F(\gamma\setminus x) - F(\gamma)\big) + \int_{\mathbb{R}^d} b(x, \gamma)\big(F(\gamma\cup x) - F(\gamma)\big)\,dx, \tag{2.2}$$
where $d : \mathbb{R}^d \times \Gamma \to \mathbb{R}_+$, $b : \mathbb{R}^d \times \Gamma \to \mathbb{R}_+$ are the death and birth rates, respectively. A general condition of symmetry for the operator $H_{b,d}$ in the space $L^2(\Gamma, \mu_{\beta,z})$ can be written as the following condition on the functions $d$ and $b$:
$$b(x, \gamma) = z e^{-\beta E(x,\gamma)} d(x, \gamma),$$
where $E(x, \gamma)$ is the relative energy of interaction between a particle located at $x$ and the configuration $\gamma$:
$$E(x, \gamma) := \begin{cases}\sum_{y\in\gamma}\phi(x - y), & \text{if } \sum_{y\in\gamma}|\phi(x - y)| < \infty,\\ +\infty, & \text{otherwise},\end{cases}$$
cf. [7]. In what follows we consider an equilibrium birth-and-death generator
$$(HF)(\gamma) = \sum_{x\in\gamma}\big(F(\gamma\setminus x) - F(\gamma)\big) + z\int_{\mathbb{R}^d} e^{-\beta E(x,\gamma)}\big(F(\gamma\cup x) - F(\gamma)\big)\,dx, \tag{2.3}$$
corresponding to the following rates:
$$b(x, \gamma) = z e^{-\beta E(x,\gamma)}, \qquad d(x, \gamma) \equiv 1.$$
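To make the rates entering (2.3) concrete, the following minimal sketch evaluates the relative energy $E(x,\gamma)$ and the birth rate $b(x,\gamma) = ze^{-\beta E(x,\gamma)}$ for a finite configuration. The tent-shaped pair potential `phi` and all function names are assumptions chosen purely for illustration; the paper does not prescribe a specific potential. For a finite configuration the sum defining $E$ is always finite, so the $+\infty$ branch does not occur here.

```python
import math

def phi(u, radius=1.0, height=2.0):
    """Hypothetical nonnegative finite-range pair potential (a 'tent' of given height)."""
    r = math.sqrt(sum(c * c for c in u))
    return height * (1.0 - r / radius) if r < radius else 0.0

def relative_energy(x, gamma):
    """E(x, gamma) = sum over y in gamma of phi(x - y), for a finite configuration gamma."""
    return sum(phi(tuple(a - b for a, b in zip(x, y))) for y in gamma)

def birth_rate(x, gamma, z, beta):
    """Birth rate b(x, gamma) = z * exp(-beta * E(x, gamma))."""
    return z * math.exp(-beta * relative_energy(x, gamma))

def death_rate(x, gamma):
    """Death rate d(x, gamma) = 1 for every particle, as in the Glauber-type generator."""
    return 1.0

if __name__ == "__main__":
    gamma = [(0.0, 0.0), (3.0, 0.0)]   # a two-point configuration in the plane
    x = (0.5, 0.0)                      # candidate position for a new particle
    print(relative_energy(x, gamma), birth_rate(x, gamma, z=0.1, beta=0.5))
```

Because the assumed potential is nonnegative, the birth rate never exceeds the activity `z`, and it equals `z` exactly when the new point does not interact with the configuration.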
As was shown in [13], under general conditions on the potential $\phi$ and the parameters $\beta$, $z$ (the conditions are given below), there exists a stationary Markov process $\{\gamma(t), t \in \mathbb{R}\}$ on $\Gamma$ with the stationary measure $\mu_{\beta,z}$, such that the generator (2.3) of the process can be extended to a self-adjoint operator in $L^2(\Gamma, \mu_{\beta,z})$, which is equivalent to the reversibility of the process $\gamma(t)$. This process is called the equilibrium Glauber dynamics which corresponds to the Gibbs measure $\mu_{\beta,z}$.

Let us remark that the Gibbs measure (2.1) is invariant with respect to the space translations on $\Gamma$:
$$\tau_s\gamma = \gamma + s = \{x_i + s,\ x_i \in \gamma\}, \qquad x_i, s \in \mathbb{R}^d.$$
We denote by $U_s$ the corresponding unitary group of the operators of the space translations acting in $L^2(\Gamma, d\mu_{\beta,z})$:
$$(U_s F)(\gamma) = F(\tau_s^{-1}\gamma). \tag{2.4}$$
It is easy to see that the operators $U_s$ commute with the generator $H$ (2.3).

2.4. Conditions on the pair potential φ and parameters β, z

We consider conditions on the pair potential $\phi$ and parameters $\beta$, $z$ which guarantee the existence of the Gibbs measure (2.1) as well as the existence of a one-particle invariant subspace of the generator, see Theorems 2.1 and 2.2 below.

(Ia) (Integrability):
$$\tilde C(\beta) := \int_{\mathbb{R}^d}|1 - e^{-\beta\phi(u)}|^{1/2}\,du < +\infty.$$
(Ib) (Positivity): $\phi(u) \ge 0$ for all $u \in \mathbb{R}^d$.
(Ic) (Low activity–high temperature regime): We assume that the parameter of the model
$$\varepsilon = z\tilde C(\beta) < \varepsilon_0$$
is small enough.

In the second group we collect more restrictive additional conditions on the potential $\phi$ and the parameters $\beta$, $z$. We need these conditions in Theorem 2.2 to study in more detail the spectral properties of the Glauber generator on the one-particle invariant subspace.

(IIa) We assume that the function $\phi(u)$ satisfies the following estimate:
$$0 \le \phi(u) \le \frac{D}{(1 + |u|)^{2s}} \equiv D\nu^2(u) \quad \text{as } |u| > 1,$$
with $s > d$ and an absolute constant $D$.
(IIb) $z < z_0$, where $z_0$ is a constant depending on the functions $\phi(u)$ and $\nu(u)$; see the relation (2.18) below.

It is easy to see that condition (IIa) immediately implies the integrability condition (Ia).

2.5. Main results

We recall the definition of the one-particle subspace, see for instance [17]. We call a subspace $\mathcal{H}_1 \subset L^2(\Gamma, d\mu_{\beta,z})$ a one-particle invariant subspace of the generator $H$ if:
(1) it is invariant with respect to the generator $H$ and the unitary group of the operators $\{U_s, s \in \mathbb{R}^d\}$;
(2) there is a unitary transformation of $\mathcal{H}_1$ to $L^2(\mathbb{R}^d, dp)$ such that the operators $H_1 = H|_{\mathcal{H}_1}$ and $U_s^{(1)} = U_s|_{\mathcal{H}_1}$ are unitarily equivalent to the operators of multiplication by a function:
$$\tilde H_1 f(p) = m(p)f(p), \qquad \tilde U_s^{(1)} f(p) = e^{i(s,p)}f(p), \tag{2.5}$$
with $(s, p) = \sum_{i=1}^d s_i p_i$, $s \in \mathbb{R}^d$, $p \in \mathbb{R}^d$.
Remark 2.1. In the physical literature, when $H$ is the operator of the energy of a physical system, the subspace $\mathcal{H}_1$ is associated with states of "quasi-particles". Here $p$ is the "quasi-momentum" of the "quasi-particle", and the function $m(p)$, referred to as the "dispersion", is the energy of the "quasi-particle" with "quasi-momentum" $p$.

We now state the main result of our paper. Let us denote by $G_0 = \{\Psi(\gamma) \equiv c\} \subset L^2(\Gamma, \mu_{\beta,z})$ the subspace of constants. It is easy to see that $G_0$ is an invariant subspace of the operator $H$, and the corresponding eigenvalue is equal to 0.

Theorem 2.1. Let conditions (Ia)–(Ic) hold. Then the space $L^2(\Gamma, \mu_{\beta,z})$ can be decomposed into a direct orthogonal sum of subspaces invariant with respect to the operator $H$ and the operators $\{U_s, s \in \mathbb{R}^d\}$:
$$L^2(\Gamma, \mu_{\beta,z}) = G_0 \oplus G_1 \oplus G_2.$$
Let $H_k = H|_{G_k}$, $k = 0, 1, 2$, be the restrictions of the operator $H$ to the corresponding invariant subspaces $G_0$, $G_1$, $G_2$, and let $\sigma_k = \sigma(H_k)$ be their spectra. Then
$$\sigma_0 = \{0\}, \qquad \sigma_1 \subset [-1 - \gamma_1, -1 + \gamma_1], \qquad \sigma_2 \subset (-\infty, -2 + \gamma_2], \tag{2.6}$$
where $\gamma_1 = 3\varepsilon$, $\gamma_2 = 30\varepsilon$ are small for $\varepsilon$ small enough.

Corollary 2.1. For $\varepsilon$ small enough, the spectrum of $H$ is decomposed into at least three separate parts. As follows from (2.6), at least two gaps exist in the spectrum of $H$. The first spectral gap is the gap between 0 and $\sigma_1$, which is estimated by $1 - \gamma_1(\varepsilon)$. The second is the gap between $\sigma_1$ and $\sigma_2$.

Remark 2.2. The invariant subspace $G_1$ turns out to be the one-particle subspace, as will be seen from Theorem 2.2.

Remark 2.3. To make our reasoning more accurate we impose unified conditions on the parameters of the model for both theorems. The integrability condition in the form (Ia) is crucial in the proof of Theorem 2.2, but Theorem 2.1 holds under more general assumptions:
$$C(\beta) := \int_{\mathbb{R}^d}|1 - e^{-\beta\phi(u)}|\,du < +\infty, \qquad \phi(u) \ge 0,$$
and $\varepsilon = zC(\beta)$ is small enough.
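As a hedged numerical illustration of the quantities above, the following sketch computes $C(\beta)$ and $\tilde C(\beta)$ by a midpoint rule and then evaluates the spectral enclosures of (2.6). The dimension $d = 1$, the tent-shaped potential `phi`, and all function names are assumptions made only for this example. The second-gap bound $1 - \gamma_1 - \gamma_2$ is simply the distance between the two enclosing sets in (2.6), and all bounds are meaningful only when $\varepsilon$ is below the unspecified threshold $\varepsilon_0$.

```python
import math

def phi(u, height=1.0, radius=1.0):
    """Hypothetical nonnegative finite-range potential on R (d = 1): a tent of given height."""
    return height * max(0.0, 1.0 - abs(u) / radius)

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def C(beta):
    """C(beta): integral of |1 - exp(-beta*phi(u))| over the support [-1, 1] of phi."""
    return midpoint_integral(lambda u: abs(1.0 - math.exp(-beta * phi(u))), -1.0, 1.0)

def C_tilde(beta):
    """C~(beta): integral of |1 - exp(-beta*phi(u))|**(1/2) over [-1, 1], as in (Ia)."""
    return midpoint_integral(
        lambda u: math.sqrt(abs(1.0 - math.exp(-beta * phi(u)))), -1.0, 1.0
    )

def spectral_bounds(eps):
    """Enclosures from (2.6) with gamma1 = 3*eps and gamma2 = 30*eps."""
    gamma1, gamma2 = 3.0 * eps, 30.0 * eps
    return {
        "sigma1": (-1.0 - gamma1, -1.0 + gamma1),
        "sigma2_upper": -2.0 + gamma2,
        "first_gap_lower_bound": 1.0 - gamma1,
        "second_gap_lower_bound": 1.0 - gamma1 - gamma2,
    }

if __name__ == "__main__":
    beta, z = 1.0, 0.01
    eps = z * C_tilde(beta)
    print(C(beta), C_tilde(beta), eps, spectral_bounds(eps))
```

Since $|1 - e^{-\beta\phi}| \le 1$ for $\phi \ge 0$, the square root can only increase the integrand, so $\tilde C(\beta) \ge C(\beta)$; this is why the condition on $\varepsilon = zC(\beta)$ in Remark 2.3 is the weaker one.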
To obtain more detailed information about the spectral properties of the generator on the first invariant subspace we have to impose some additional restrictions on the potential φ(u). Theorem 2.2. If the potential φ and the parameters β, z meet conditions (Ia)–(Ic) and (IIa)–(IIb), then there is a unitary map W : G1 → L2 (Rd , dp) , (1)
which transforms the operators H1 , Us ˜ 1 f (p) = m(p)f (p) , H
into multiplication operators p ∈ Rd ,
f ∈ L2 (Rd , dp) ,
d X ˜ (1) f (p) = ei(p,s) f (p), (p, s) = U pj sj . s j=1
Here m(p) ∈ derivative.
Cb1 (Rd )
is a bounded smooth real-valued function on Rd with bounded
Corollary 2.2. Under exponential decay of the potential, i.e., when there exists a > 0 such that
$$|\phi(u)| < C e^{-a|u|} \quad \text{for all } |u| > R ,$$
the operator H1 in G1 has a Lebesgue spectrum.
December 23, 2004 10:23 WSPC/148-RMP
1080
00221
Y. Kondratiev, R. Minlos & E. Zhizhina
Proof. Following the proof of Lemma 4.2 one can see that the function V(u) decays exponentially under exponential decay of the potential. Consequently, the Fourier transform of V(u) is an analytic function in a strip
$$\{\xi = p + iq, \quad p, q \in \mathbb{R}^d, \quad |q| < \mathrm{const}\},$$
and the same holds for the function m(p). For any f ∈ L2(Rd, dp) the spectral measure σf(λ) corresponding to the spectral resolution of the operator $\tilde H_1$ can be written as follows:
$$\sigma_f(\lambda) = \int_{p:\, m(p) < \lambda} |f(p)|^2\, dp .$$
Then
$$\frac{d\sigma_f}{d\lambda} = \int_{m(p) = \lambda} |f(p)|^2\, dp ,$$
where the last integration is over the Gelfand–Leray measure, and this measure exists for a.e. λ due to the analyticity of the function m(p). In addition, it follows from the Fubini theorem that the derivative $d\sigma_f/d\lambda$ is integrable.

Corollary 2.3. Let the potential φ decay exponentially, and let the function F ∈ L2(Γ, dµβ,z) ∩ L1(Γ, dµβ,z) have a non-zero projection on the one-particle invariant subspace G1. Then the following asymptotic formula holds as t → ∞:
$$\langle F(\gamma(t)), F(\gamma(0))\rangle_P \equiv \langle F(\gamma(t)) \cdot F(\gamma(0))\rangle_P - \langle F(\gamma)\rangle^2_{\mu_{\beta,z}} = \frac{e^{-gt}}{t^{d/2}}\,(C_F + o(1)) . \tag{2.7}$$
Here P is the distribution of the process on Γ with the generator (2.3), $C_F$ is a constant depending on the function F, and g = − sup{σ1} is the first spectral gap between 0 and σ1 = σ(H1).

The proof of the asymptotics (2.7) follows the standard reasoning, see for example [10]. The proof is based on the statement of Theorem 2.2, the spectral theorem and the Laplace method. In particular, any function F(γ) of the form
$$F(\gamma) = \sum_{x \in \gamma} f(x),$$
with an integrable function f ∈ L1(Rd), meets the conditions of Corollary 2.3. As an important example we consider the function
$$F_\Lambda(\gamma) = \frac{1}{|\Lambda|} \sum_{x \in \gamma} \chi_\Lambda(x),$$
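where χΛ(x) is the characteristic function of a finite volume Λ. Then the following asymptotics holds for the decay of the density correlation function as t → ∞:
$$\begin{aligned}
\langle F_\Lambda(\gamma(t)), F_\Lambda(\gamma(0))\rangle_P
&= \Bigl\langle \frac{1}{|\Lambda|}\sum_{x\in\gamma(t)} \chi_\Lambda(x),\ \frac{1}{|\Lambda|}\sum_{x\in\gamma(0)} \chi_\Lambda(x) \Bigr\rangle_P \\
&= \frac{1}{|\Lambda|^2} \Bigl\langle \sum_{x\in\gamma(t)} \chi_\Lambda(x) \cdot \sum_{y\in\gamma(0)} \chi_\Lambda(y) \Bigr\rangle_P - \rho_1^2
= \frac{e^{-gt}}{t^{d/2}}\,(C + o(1)) ,
\end{aligned}$$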
where ρ1 is the one-point correlation function.

2.6. The space of quasi-observables

Let us consider the space of finite configurations
$$\Gamma_0 := \bigsqcup_{n=0}^{\infty} \Gamma_0^{(n)},$$
where $\Gamma_0^{(n)} := \Gamma^{(n)}_{\mathbb{R}^d} := \{\gamma \in \Gamma_0 : |\gamma| = n\}$ is the space of n-point subsets in Rd for n ∈ N, and $\Gamma_0^{(0)} := \{\emptyset\}$. For n ∈ N, there is a natural bijection between the space $\Gamma_0^{(n)}$ and the symmetrization $\widetilde{(\mathbb{R}^d)^n}/S_n$ of the set $\widetilde{(\mathbb{R}^d)^n} := \{(x_1, \ldots, x_n) \in \mathbb{R}^{dn} : x_i \neq x_j \text{ if } i \neq j\}$ under the permutation group $S_n$ over {1, . . . , n} acting on $\widetilde{(\mathbb{R}^d)^n}$ by permuting the coordinate index. This bijection induces a topology on Γ0, the Borel σ-algebra B(Γ0) on Γ0, and the corresponding measure on Γ0, the so-called Lebesgue–Poisson measure
$$\lambda(A) := \frac{|\tilde A|}{n!}, \qquad A \subset \Gamma_0^{(n)}, \quad n \in \mathbb{N},$$
where $\tilde A \subset \widetilde{(\mathbb{R}^d)^n}$ is the pre-image of $A \subset \Gamma_0^{(n)}$ under the mapping $\widetilde{(\mathbb{R}^d)^n} \to \Gamma_0^{(n)}$, $|\tilde A|$ is the dn-dimensional volume of the subset $\tilde A$; and λ(∅) = 1.

By analogy we can consider configurations in a finite domain Λ ⊂ Rd. Then $\Gamma_\Lambda^{(n)}$ denotes the set of all n-point subsets of Λ, n ∈ N, and $\Gamma_\Lambda^{(0)} = \{\emptyset\}$. We denote by Bbs(Γ0) the space of all complex-valued bounded B(Γ0)-measurable functions with bounded support, i.e.,
$$G\big|_{\Gamma_0 \setminus \bigsqcup_{n=0}^{N} \Gamma_\Lambda^{(n)}} \equiv 0$$
for some N ∈ N and some bounded domain Λ ⊂ Rd.
For any G ∈ Bbs(Γ0) we define a function KG : Γ → C on the space Γ (the so-called K-transform) as follows:
$$(KG)(\gamma) := \sum_{\substack{\eta \subset \gamma \\ |\eta| < \infty}} G(\eta) . \tag{2.8}$$
Note that for every G ∈ Bbs (Γ0 ) the sum in (2.8) has only a finite number of terms different from zero and thus KG is a well-defined function on Γ. Moreover,
if G ∈ Bbs(Γ0), then KG is a local function:
$$(KG)(\gamma) = (KG)(\gamma_\Lambda), \qquad \gamma_\Lambda = \gamma \cap \Lambda ,$$
and the function KG is polynomially bounded:
$$|(KG)(\gamma)| \le L\,(1 + |\gamma_\Lambda|)^N \quad \text{for all } \gamma \in \Gamma ,$$
where the bounded domain Λ ⊂ Rd and N ∈ N are defined by the function G, and $L = \sup_{\xi \in \Gamma_0} |G(\xi)|$. The inverse mapping of the K-transform is defined by
$$(K^{-1}F)(\eta) := \sum_{\xi \subset \eta} (-1)^{|\eta \setminus \xi|} F(\xi), \qquad \eta \in \Gamma_0 .$$
The functions of the form (2.8) are known as summator functions. Summator functions form a commutative algebra: the product of two summator functions is again a summator function. For every G1, G2 ∈ Bbs(Γ0) we have
$$(KG_1) \cdot (KG_2) = K(G_1 \star G_2), \tag{2.9}$$
where the ⋆-convolution is defined on B(Γ0)-measurable functions by
$$(G_1 \star G_2)(\eta) := \sum_{\substack{(\eta_1, \eta_2, \eta_3): \\ \eta_1 \cup \eta_2 \cup \eta_3 = \eta}} G_1(\eta_1 \cup \eta_2)\, G_2(\eta_2 \cup \eta_3), \qquad \eta \in \Gamma_0 , \tag{2.10}$$
and G1 ⋆ G2 ∈ Bbs(Γ0), see [12]. Here the summation in (2.10) is over all triples (η1, η2, η3) of mutually disjoint subsets of η, which may be empty, with η1 ∪ η2 ∪ η3 = η.

In the context of models of infinite particle systems, elements of Γ represent locations of particles in space. The state of such a system is described by a probability measure µ on Γ, and functions F on Γ are considered as observables of the system: they represent physical quantities which can be measured. The measured values correspond to the expectation values
$$\int_\Gamma F(\gamma)\, d\mu(\gamma) .$$
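As a sanity check of (2.8)–(2.10), the following minimal Python sketch evaluates the K-transform, its inverse and the ⋆-convolution on a small finite configuration. The points are represented by integers, and the functions G1, G2 are toy choices made only for illustration; they are assumptions of this sketch and are not taken from the model.

```python
from itertools import combinations

def subsets(config):
    """All subsets of a finite configuration, returned as frozensets."""
    items = sorted(config)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def k_transform(G, gamma):
    """(KG)(gamma): sum of G(eta) over all eta contained in gamma, cf. (2.8)."""
    return sum(G(eta) for eta in subsets(gamma))

def k_inverse(F, eta):
    """(K^{-1}F)(eta): alternating sum of F(xi) over all xi contained in eta."""
    return sum((-1) ** (len(eta) - len(xi)) * F(xi) for xi in subsets(eta))

def star(G1, G2):
    """Star-convolution (2.10): sum over disjoint triples whose union is eta."""
    def conv(eta):
        total = 0
        for eta2 in subsets(eta):
            rest = eta - eta2
            for eta1 in subsets(rest):
                eta3 = rest - eta1
                total += G1(eta1 | eta2) * G2(eta2 | eta3)
        return total
    return conv

# Toy quasi-observables (hypothetical): they vanish on large configurations.
def G1(eta):
    return 1 + sum(eta) if len(eta) <= 1 else 0

def G2(eta):
    return len(eta) + 2 * sum(eta) if len(eta) <= 2 else 0

gamma = frozenset({0, 1, 2, 3})
lhs = k_transform(G1, gamma) * k_transform(G2, gamma)
rhs = k_transform(star(G1, G2), gamma)
print(lhs, rhs)  # the two values coincide, in agreement with (2.9)
```

Integer arithmetic is used throughout, so the comparison of both sides of (2.9) is exact in this toy setting.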
In contrast to the described situation, points from Γ0 do not have a direct physical interpretation related to particle configurations. The space Γ0 and the K-transform are rather mathematical tools, useful for an alternative description of infinite particle systems via collections of finite-dimensional objects (e.g., correlation functions). We will consider functions G on Γ0 as quasi-observables of infinite particle systems. In fact, typical observables needed in statistical physics can be constructed by applying the K-transform to suitable quasi-observables.

2.7. Correlation functions

Let us consider a probability measure µ defined on (Γ, B(Γ)) with finite moments of all orders for γΛ = γ ∩ Λ (Λ ⊂ Rd is a bounded domain):
$$\int_\Gamma |\gamma_\Lambda|^n\, d\mu(\gamma) < \infty \quad \text{for all } n \in \mathbb{N} .$$
Then one can define a unique σ-finite measure ϱ = ϱ(µ) on (Γ0, B(Γ0)) such that
$$\int_\Gamma (KG)(\gamma)\, d\mu(\gamma) = \int_{\Gamma_0} G(\eta)\, d\varrho(\eta) \tag{2.11}$$
for all G ∈ Bbs(Γ0). We call ϱ the correlation measure corresponding to µ. In the case when ϱ is absolutely continuous with respect to the Lebesgue–Poisson measure there exists the Radon–Nikodym derivative
$$\varrho_\mu(\eta) = \frac{d\varrho}{d\lambda}(\eta) ,$$
and the function ϱµ(η), η ∈ Γ0, is called the correlation function of the measure µ. In our case of the Gibbs measure, under the above assumptions on the potential and the parameters of the model, the correlation function exists and, moreover, meets the following Ruelle bound, see [24, 25]:
$$\varrho_\mu(\eta) < z^{|\eta|} .$$

2.8. An auxiliary Hilbert space H and a reduced generator L

Using formulas (2.8) and (2.9) we have the following representation for the scalar product of functions KG1, KG2 with G1, G2 from Bbs(Γ0):
$$\begin{aligned}
(KG_1, KG_2)_{L^2(\Gamma, \mu_{\beta,z})} &= \int_\Gamma (KG_1)(\gamma) \cdot (KG_2)(\gamma)\, d\mu_{\beta,z}(\gamma) \\
&= \int_\Gamma K(G_1 \star G_2)(\gamma)\, d\mu_{\beta,z}(\gamma) = \int_{\Gamma_0} (G_1 \star G_2)(\eta)\, \varrho_\mu(\eta)\, d\lambda(\eta) .
\end{aligned} \tag{2.12}$$
Since equality (2.12) determines a positive quadratic form (G1, G2) in the space Bbs(Γ0), we can accept the relation
$$(G_1, G_2) = \int_{\Gamma_0} (G_1 \star G_2)(\eta)\, \varrho_\mu(\eta)\, d\lambda(\eta), \qquad G_1, G_2 \in B_{bs}(\Gamma_0),$$
as a new scalar product. The closure of Bbs(Γ0) with respect to this scalar product is denoted by H. It was shown in [5, 12] that the K-transform can be extended to a unitary operator
$$K : H \to L^2(\Gamma, \mu_{\beta,z}) . \tag{2.13}$$
Representations (2.8) and (2.13) imply that the unitary group of the space translations {Us, s ∈ Rd} in H transforms under K to the unitary group {Us} acting in L2(Γ, dµβ,z) by formula (2.4), and we preserve the same notation {Us} for the operators in H.
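Direct calculations, see for instance [14], give the representation for the unitary image $L := K^{-1}HK$ of the Glauber generator H acting in the Hilbert space H:
$$(LG)(\eta) = -|\eta|\, G(\eta) + z \sum_{\gamma \subseteq \eta} \int_{\mathbb{R}^d} G(\gamma \cup x) \prod_{y \in \eta \setminus \gamma} \bigl(e^{-\beta\phi(x-y)} - 1\bigr) \prod_{y' \in \gamma} e^{-\beta\phi(x-y')}\, dx . \tag{2.14}$$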
We call the operator (2.14) the reduced generator, and in what follows we will study the spectral properties of the operator L in the space H. An advantage of this representation is the following. The operator L acts on functions defined on Γ0 or, what is the same, on sequences of symmetric functions of finite numbers of variables. This makes it possible to apply well-known techniques for operators in Fock-type spaces to the case under consideration.

2.9. Main results in terms of the reduced generator L

We formulate here the main results in terms of the auxiliary Hilbert space H and the operator L. As follows from the unitary property (2.13) of the K-transform, the statements of Theorems 2.1’ and 2.2’ below are equivalent to Theorems 2.1 and 2.2 respectively. Let H0 ⊂ H be the one-dimensional subspace generated by the “vacuum” vector Φ0:
$$\Phi_0(\eta) = \begin{cases} 1, & \eta = \emptyset ; \\ 0, & \eta \neq \emptyset . \end{cases} \tag{2.15}$$
It is easy to see that LΦ0 = 0.

Theorem 2.1’. Let assumptions (Ia)–(Ic) be valid. Then the space H can be decomposed into a direct orthogonal sum
$$H = \hat H_0 \oplus \hat H_1 \oplus \hat H_2 \tag{2.16}$$
of the subspaces $\hat H_0 = H_0$, $\hat H_1$, $\hat H_2$ invariant with respect to the operator L and the unitary group {Us, s ∈ Rd} of the space translations. Let $L_k = L|_{\hat H_k}$, k = 0, 1, 2, be the restrictions of the operator L to the corresponding subspaces $\hat H_0$, $\hat H_1$, $\hat H_2$, and let σk = σ(Lk) be their spectra. Then
$$\sigma_0 = \{0\}, \qquad \sigma_1 \subset [-1-\gamma_1,\, -1+\gamma_1], \qquad \sigma_2 \subset (-\infty,\, -2+\gamma_2], \tag{2.17}$$
where γ1 = 3ε and γ2 = 30ε are small for small enough ε.

Remark 2.4. It is well known that both spaces L2(Γ, dµ0,z) and L2(Γ0, dλz) are isomorphic to the symmetric Fock space $F^{(sym)}(L^2(\mathbb{R}^d, dx))$, see for instance [9]. Using the Fock space structure it is easy to find that the spectrum of the generator of the free dynamics, when β = 0 or φ ≡ 0, is the set of non-positive integers {0, −1, −2, . . .}. In this case Charlier polynomials are eigenfunctions of the
free generator, and we can construct a complete decomposition of L2(Γ, dµ0,z) into a direct sum of invariant subspaces using Charlier polynomials, see [9, 12].

Recall that
$$\nu(u) = \nu(|u|) = \frac{1}{(1 + |u|)^s}, \qquad u \in \mathbb{R}^d, \quad s > d,$$
and we denote
$$K_\nu = \int \nu(u)\, du ;$$
$$\bar C(\beta) := \max_{u \in \mathbb{R}^d} \frac{|e^{-\beta\phi(u)} - 1|^{1/2}}{\nu(u)} < \infty ; \qquad \hat C := \max\{1, \bar C(\beta)\} .$$
The existence of a bounded constant $\bar C(\beta)$ follows from condition (IIa).

Theorem 2.2’. If the potential φ and the parameters β, z meet conditions (Ia)–(Ic) and (IIa)–(IIb) with
$$z < z_0 := \frac{1}{(e\hat C + K_\nu)^2} , \tag{2.18}$$
then there is a unitary map
$$W : \hat H_1 \to L^2(\mathbb{R}^d, dp),$$
which transforms the operators $L_1$, $U_s^{(1)}$ into multiplication operators
$$\tilde L_1 f(p) = m(p) f(p), \qquad \tilde U_s^{(1)} f(p) = e^{i(p,s)} f(p), \qquad (p,s) = \sum_{j=1}^{d} p_j s_j, \qquad p \in \mathbb{R}^d, \ f \in L^2(\mathbb{R}^d, dp).$$
Here $m(p) \in C_b^1(\mathbb{R}^d)$ is a bounded smooth real-valued function on Rd with bounded derivative.

3. Proof of Theorem 2.1’: The Construction of the One-Particle Subspace of the Generator

We denote the set of all continuous functions on Γ0 with bounded support by Cbs(Γ0), and let us consider the following norm in the space Cbs(Γ0):
$$\|G\|_M = \int_{\Gamma_0} \sup_{\eta} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta|} (|\eta| + |\xi|)\, |G(\eta \cup \xi)| \Bigr) M^{|\xi|}\, d\xi + |G(\emptyset)| , \tag{3.1}$$
where G ∈ Cbs(Γ0) and dξ := dλ(ξ) is the Lebesgue–Poisson measure on Γ0. We take a constant M such that M > 4z. We denote by L the closure of Cbs(Γ0) with respect to the norm (3.1). Let us notice that the Banach space L and the norm (3.1) are invariant with respect to the operators Ut of the space translations:
$$U_t G \in L, \qquad \|U_t G\|_M = \|G\|_M \tag{3.2}$$
for any G ∈ L and any t ∈ Rν.
Lemma 3.1. Let
$$M > 4z , \tag{3.3}$$
then L ⊂ H, the space L is dense in H, and
$$\|G\|_H \le \|G\|_M , \qquad G \in L . \tag{3.4}$$
For the proof of Lemma 3.1, see Sec. 5.

We denote by DL ⊂ H the domain of the operator L in H. Let us consider the following set of functions:
$$\tilde D_L = \{G \in L \cap D_L : LG \in L\} .$$
Then $\tilde D_L$ is the domain of L as an operator acting in L. Since Cbs(Γ0) ⊂ DL and Cbs(Γ0) ⊂ $\tilde D_L$, the set $\tilde D_L$ is dense in L.

For any k = 0, 1, 2, . . . , we define the following spaces of functions:
$$L_k = \{G \in L : G(\eta) = 0 \text{ when } |\eta| \neq k\} ,$$
$$L_{\ge k} = \bigoplus_{j \ge k} L_j = \{G \in L : G(\eta) = 0, \ |\eta| < k\} ,$$
$$L_{\le k} = \bigoplus_{j \le k} L_j = \{G \in L : G(\eta) = 0, \ |\eta| > k\} .$$
All these subspaces are closed in L. By analogy we can define subspaces Hk, H≥k, H≤k ⊂ H, which are also closed in the space H. The decomposition of L into a direct sum
$$L = L_{\le 1} \oplus L_{\ge 2} \tag{3.5}$$
implies the following matrix representation for the operator L:
$$L = \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix}, \tag{3.6}$$
where L11 : L≤1 → L≤1, L12 : L≥2 → L≤1, etc. We will construct a subspace $\hat L_{\le 1}$, invariant with respect to the operators L and Ut, as the graph of a bounded operator S : L≤1 → L≥2:
$$\hat L_{\le 1} = \{G + SG ;\ G \in L_{\le 1}\} \tag{3.7}$$
(see the general description of this approach in [17, 10]). The condition of the invariance of the subspace $\hat L_{\le 1}$ with respect to L can be rewritten as the following equation for the operator S:
$$L_{21} + L_{22} S = S (L_{11} + L_{12} S) . \tag{3.8}$$

Lemma 3.2. For all small enough ε the operator L22 is invertible in L≥2, and the norm of the operator $L_{22}^{-1}$ has the upper bound
$$|||L_{22}^{-1}|||_M < \tfrac{1}{2}(1 + 3\varepsilon) , \tag{3.9}$$
where |||·|||M means the operator norm generated by the norm ‖·‖M in the Banach space L. For the proof of Lemma 3.2, see Sec. 5.

We can now rewrite the relation (3.8) in the form
$$S = -L_{22}^{-1} L_{21} + L_{22}^{-1} S L_{11} + L_{22}^{-1} S L_{12} S . \tag{3.10}$$
To prove the existence of the operator S we have to estimate the norms of the operators in Eq. (3.10).

Lemma 3.3. For small enough ε we have
$$|||L_{11}|||_M < 1 + 3\varepsilon , \tag{3.11}$$
$$|||L_{12}|||_M < \varepsilon , \tag{3.12}$$
$$|||L_{21}|||_M < 4\varepsilon . \tag{3.13}$$
For the proof of Lemma 3.3, see Sec. 5.

We denote by F(S) the right-hand side of (3.10) and consider the mapping S → F(S) in the space O1,2 of bounded linear operators acting from L≤1 to L≥2. Let Bϱ ⊂ O1,2 be the ball of radius ϱ in the space O1,2:
$$B_\varrho = \{S \in O_{1,2} : |||S|||_M < \varrho\} .$$
Then the estimates (3.9) and (3.11)–(3.13) imply the following result.

Lemma 3.4. For small enough ε the ball B8ε is invariant with respect to F:
$$F B_{8\varepsilon} \subseteq B_{8\varepsilon} , \tag{3.14}$$
and the mapping F(S) is a contraction on B8ε:
$$|||F(S_1) - F(S_2)|||_M \le c\, |||S_1 - S_2|||_M , \qquad S_1, S_2 \in B_{8\varepsilon} , \tag{3.15}$$
with 0 < c < 1. For the proof of Lemma 3.4, see Sec. 5.

Lemma 3.4 implies the existence and uniqueness of the solution S of the equation (3.10) with norm
$$|||S|||_M < 8\varepsilon . \tag{3.16}$$
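The fixed-point argument behind Lemma 3.4 can be illustrated in the simplest possible setting, where the four blocks of (3.6) are replaced by real numbers. The values of l11, l12, l21, l22 and eps below are hypothetical and chosen only to mimic the sizes appearing in (3.9) and (3.11)–(3.13); this is a toy sketch, not the operator-valued construction itself.

```python
def iterate_graph_map(l11, l12, l21, l22, tol=1e-14, max_iter=200):
    """Iterate S -> F(S) = (-l21 + S*l11 + S*l12*S) / l22, the scalar form of (3.10)."""
    s = 0.0
    for _ in range(max_iter):
        s_next = (-l21 + s * l11 + s * l12 * s) / l22
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    raise RuntimeError("fixed-point iteration did not converge")

# Hypothetical scalar stand-ins for the blocks of the matrix (3.6).
eps = 0.01
l11, l12, l21, l22 = -1.0, eps, 3 * eps, -2.0

s = iterate_graph_map(l11, l12, l21, l22)

# Residual of the invariance equation (3.8): L21 + L22*S - S*(L11 + L12*S).
residual = l21 + l22 * s - s * (l11 + l12 * s)
print(s, residual)
```

In this scalar model the limit s makes the vector (1, s) an eigenvector of the 2×2 matrix, which is the finite-dimensional counterpart of the invariance of the graph (3.7).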
Thus we have constructed the subspace $\hat L_{\le 1}$ of the form (3.7), which is invariant with respect to the operator L. We denote by $L_1 = L|_{\hat L_{\le 1}}$ the restriction of L to this invariant subspace.

Remark 3.1. Notice that the subspace $\hat L_{\le 1}$ is also invariant with respect to the operators of the space translations Ut, t ∈ Rν. This follows from the fact that the operators Lij, i, j = 1, 2, commute with Ut, so that for any t the operator $\bar S_t = U_t^{-1} S U_t$ also satisfies Eq. (3.10). Since $|||\bar S_t|||_M \le |||S|||_M$ and we proved the uniqueness of the
solution of (3.10) with a small norm (3.16), we have $\bar S_t = S$. Consequently, we have the commutation relation UtS = SUt, which implies the invariance of the subspace $\hat L_{\le 1}$ under the transformation Ut:
$$U_t \hat L_{\le 1} \subseteq \hat L_{\le 1} .$$

We shall now construct the second, “supplementary” subspace $\hat L_{\ge 2}$, invariant with respect to L, in the form
$$\hat L_{\ge 2} = \{G + TG ;\ G \in L_{\ge 2}\} \tag{3.17}$$
of the graph of a bounded operator T : L≥2 → L≤1. The condition of the invariance of the subspace (3.17) is equivalent to the following equation for the operator T:
$$T = L_{12} L_{22}^{-1} + L_{11} T L_{22}^{-1} - T L_{21} T L_{22}^{-1} . \tag{3.18}$$
Using the same reasoning and the same bounds as above, we prove the existence of a unique solution T of Eq. (3.18) with a small norm
$$|||T|||_M < 8\varepsilon . \tag{3.19}$$
Thus, we construct the subspace $\hat L_{\ge 2}$ of the form (3.17), which is invariant with respect to L and Ut (the last statement can be proved along the same lines as above).

Lemma 3.5. The following decomposition into a direct sum of invariant subspaces holds for any small enough ε:
$$L = \hat L_{\le 1} + \hat L_{\ge 2} . \tag{3.20}$$
For the proof of Lemma 3.5, see Sec. 5. We denote $L_2 = L|_{\hat L_{\ge 2}}$.

Lemma 3.6. Let ε be small enough. Then the operator L2 is invertible in $\hat L_{\ge 2}$ and
$$|||L_2^{-1}|||_M \le \tfrac{1}{2}(1 + 14\varepsilon) . \tag{3.21}$$
For the proof of Lemma 3.6, see Sec. 5.

As follows from our constructions, the space $\hat L_{\le 1}$ contains the one-dimensional invariant subspace L0 = {Φ0}, such that LL0 = 0. We denote by $\hat L_1$ the following subspace of $\hat L_{\le 1}$:
$$\hat L_1 = H_0^{\perp} \cap \hat L_{\le 1} , \tag{3.22}$$
where $H_0^{\perp}$ is the orthogonal complement in H to H0, and $\hat L_1$ is invariant with respect to the operators L and Ut as an intersection of two invariant subspaces. The representations (3.7) and (3.22) imply that the subspace $\hat L_1$ can be determined again as a graph
$$\hat L_1 = \{G_1 + S' G_1 ;\ G_1 \in L_1\} \tag{3.23}$$
of an operator S′ : L1 → L≥2 ⊕ L0, where
$$S' G_1 = S|_{L_1} G_1 + C_0(G_1) \Phi_0 \in L_{\ge 2} \oplus L_0 , \qquad G_1 \in L_1 ,$$
the operator S is defined above in (3.8), and
$$C_0(G_1) = -(G_1 + S|_{L_1} G_1, \Phi_0)_H = -\varrho_1 \int G_1(x)\, dx - \int_{|\eta| \ge 2} (S|_{L_1} G_1)(\eta)\, \varrho(\eta)\, d\eta .$$
It is easy to see that the functional C0(G1), G1 ∈ L1, is a bounded linear functional on L1:
$$|C_0(G_1)| < (\varepsilon + |||S|||_M)\, \|G_1\|_M . \tag{3.24}$$
Then the estimates (3.16) and (3.24) imply that for small enough ε
$$|||S'|||_M \le 17\varepsilon . \tag{3.25}$$
Thus for any small enough ε we established the decomposition
$$\hat L_{\le 1} = L_0 + \hat L_1 ,$$
which implies the following decomposition of L into a direct sum of invariant subspaces:
$$L = L_0 + \hat L_1 + \hat L_{\ge 2} . \tag{3.26}$$
The decomposition of L into a direct sum
$$L = L_1 + (L_{\ge 2} + L_0) \tag{3.27}$$
implies the following matrix representation for the operator L:
$$L = \begin{pmatrix} L'_{11} & L'_{12} \\ L'_{21} & L'_{22} \end{pmatrix},$$
where L′11 : L1 → L1, L′12 : (L≥2 + L0) → L1, etc. Moreover, the condition of the invariance of the space $\hat L_1$ (3.23) with respect to the operator L can be written as follows:
$$L'_{21} + L'_{22} S' = S' (L'_{11} + L'_{12} S') ,$$
where the operator S′ is defined in (3.23). Let us consider the projection operator
$$P_1 : \hat L_1 \to L_1 , \qquad P_1 G = G_1 \in L_1 ,$$
where G1 is the L1-component of the function G, and the inverse operator
$$P_1^{-1} : L_1 \to \hat L_1 , \qquad P_1^{-1} G_1 = G_1 + S G_1 + C_0(G_1) \Phi_0 = G_1 + S' G_1 .$$
Then, as follows from (3.25), the operators P1 and $P_1^{-1}$ are bounded in the norm of the space L. Consequently, the operator
$$L_1 = L|_{\hat L_1} = P_1^{-1} (L'_{11} + L'_{12} S') P_1$$
is similar to the operator $L'_1 = L'_{11} + L'_{12} S'$ acting in the space L1. Moreover, since LΦ0 = 0, SΦ0 = 0 and for any G ∈ L≥2 we have LG ∈ L≥1, then
$$L'_{12} S' = L_{12} (S|_{L_1}) ,$$
where the operators L12 and S were defined above in (3.6) and (3.8). Thus, the operator L1 is similar to the operator
$$L'_{11} + L_{12} (S|_{L_1}) ,$$
acting in L1.

Finally, we introduce the subspaces $\hat H_1$, $\hat H_{\ge 2}$ as the closures in H of the subspaces $\hat L_1$, $\hat L_{\ge 2}$ respectively.

Lemma 3.7. The subspaces $\hat H_1$ and $\hat H_{\ge 2}$ are invariant with respect to the operators L and Ut. Together with the invariant subspace H0 they give the orthogonal decomposition (2.16) of the space H. In addition, the spectra of L on the corresponding subspaces meet the condition (2.17).

For the proof of Lemma 3.7, see Sec. 5. Lemma 3.7 is the final step in the proof of Theorem 2.1’.

4. Proof of Theorem 2.2’

Let us denote by $H_1^{(r)}$ the closure of the space L1 = {G1(x)} of functions defined on one-point configurations with respect to the scalar product
$$(G_1, G_2)_r = (G_1 + S' G_1,\ G_2 + S' G_2)_H . \tag{4.1}$$
Then
$$|(G_1, G_2)_r| \le (1 + |||S'|||_M)^2\, \|G_1\|_M \|G_2\|_M .$$

Lemma 4.1. The scalar product (4.1) can be written as follows:
$$(G_1, G_2)_r = \varrho_1 \int G_1(x) G_2(x)\, dx + \int\!\!\int G_1(x) G_2(y)\, s(x - y)\, dx\, dy , \tag{4.2}$$
where s(u) is a real-valued even integrable function with
$$\int |s(u)|\, du < C \varrho_1 . \tag{4.3}$$
Here C < 1 is a constant, and $\varrho_1 = \varrho|_{\Gamma_0^{(1)}}$ is the one-point correlation function.

For the proof of Lemma 4.1, see Sec. 5. The operator
$$P^{-1} : H_1^{(r)} \to \hat L_1 , \qquad P^{-1} G_1 = G_1 + S' G_1 ,$$
is an isometric operator from $H_1^{(r)}$ to $\hat L_1$:
$$(P^{-1} G_1, P^{-1} G_2)_H = (G_1, G_2)_r .$$
Since
$$(P^{-1})^{-1} = P , \qquad P : \hat L_1 \to H_1^{(r)} , \qquad P (G_1 + S' G_1) = G_1 ,$$
there exists the unitary operator
$$P^{-1} : H_1^{(r)} \to \hat H_1 .$$
Using the representation
$$\hat L_1 = L'_{11} + L'_{12} S' = P L_1 P^{-1} \tag{4.4}$$
we obtain that the operator $\hat L_1$ can be extended to a self-adjoint bounded operator in $H_1^{(r)}$, which is unitarily equivalent to the operator L1 in $\hat H_1$. Thus, our goal is to study spectral properties of the operator $\hat L_1$ in $H_1^{(r)}$.

Lemma 4.2. The operator $\hat L_1$ acts in $H_1^{(r)}$ by the formula
$$(\hat L_1 G)(x) = -G(x) + \int V(x - y)\, G(y)\, dy , \tag{4.5}$$
where V(u) is an even integrable function with
$$\int |V(u)|\, du < 2\varepsilon \tag{4.6}$$
and
$$\int |u| \cdot |V(u)|\, du < \infty . \tag{4.7}$$
For the proof of Lemma 4.2, see Sec. 5.

The representations (4.2) and (4.3) imply that the Fourier transform
$$F : H_1^{(r)} \to L^2(\mathbb{R}^d, d\mu) , \qquad (FG)(p) = \tilde G(p) = \frac{1}{(2\pi)^{d/2}} \int G(x)\, e^{i(x,p)}\, dx , \qquad p \in \mathbb{R}^d ,$$
is a unitary mapping from $H_1^{(r)}$ to L2(Rd, dµ) with
$$d\mu = (\varrho_1 + \tilde s(p))\, dp , \qquad \varrho_1 + \tilde s(p) > a \varrho_1 , \qquad a > 0 .$$
Here dp is the Lebesgue measure, and $\tilde s(p)$ is the Fourier transform of the function s(u) defined in (4.2). Moreover, (4.5) implies that the operator $\hat L_1$ is unitarily equivalent to the operator of multiplication by a function:
$$(\tilde L_1 \tilde G)(p) = m(p)\, \tilde G(p) , \qquad \tilde L_1 = F \hat L_1 F^{-1} ,$$
with
$$m(p) = -1 + \tilde V(p) , \qquad \tilde V(p) = (FV)(p) . \tag{4.8}$$
Using (4.6) and (4.7) we have that m(p) is a bounded smooth function with $|\tilde V(p)| \le 2\varepsilon$. Then we can consider a unitary mapping
$$U = W F P : \hat H_1 \to L^2(\mathbb{R}^d, dp) ,$$
where
$$W : L^2(\mathbb{R}^d, d\mu) \to L^2(\mathbb{R}^d, dp) , \qquad (Wf)(p) = (\varrho_1 + \tilde s(p))^{1/2} f(p) , \qquad f \in L^2(\mathbb{R}^d, d\mu) ,$$
and
$$(U L_1 U^{-1} g)(p) = m(p)\, g(p) , \qquad g \in L^2(\mathbb{R}^d, dp) .$$
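Here m(p) is the same function as in (4.8), since W commutes with the operator of multiplication by a function. Theorem 2.2’ is completely proved.

5. Proofs of Lemmas

5.1. Proof of Lemma 3.1

We prove (3.4) first for the functions G ∈ L such that G(∅) = 0. Using the estimate on the correlation function
$$\rho(\eta_1 \cup \eta_2 \cup \eta_3) < z^{|\eta_1| + |\eta_2| + |\eta_3|} ,$$
we have:
$$\begin{aligned}
\|G\|_H^2 &= \int\!\!\int\!\!\int G(\eta_1 \cup \eta_2)\, G(\eta_2 \cup \eta_3)\, \rho(\eta_1 \cup \eta_2 \cup \eta_3)\, d\eta_1\, d\eta_2\, d\eta_3 \\
&\le \int\!\!\int\!\!\int |G(\eta_2 \cup \eta_3)|\, z^{|\eta_3|} \bigl(\tfrac{1}{3}\bigr)^{|\eta_2|} |G(\eta_1 \cup \eta_2)|\, (3z)^{|\eta_2|} z^{|\eta_1|}\, d\eta_1\, d\eta_2\, d\eta_3 \\
&\le \int \sup_{\eta} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta|} |G(\eta \cup \eta_3)|\, (|\eta| + |\eta_3|) \Bigr) M^{|\eta_3|} \bigl(\tfrac{z}{M}\bigr)^{|\eta_3|} d\eta_3 \cdot \int \sum_{\substack{\eta_1 \subseteq \varepsilon \\ \varepsilon = \eta_1 \cup \eta_2}} \bigl(\tfrac{3z}{M}\bigr)^{|\eta_2|} \bigl(\tfrac{z}{M}\bigr)^{|\eta_1|} |G(\varepsilon)|\, M^{|\varepsilon|}\, d\varepsilon .
\end{aligned} \tag{5.1}$$
Here we apply the well-known formula [22]
$$\int\!\!\int F(\xi_1 \cup \xi_2)\, \varphi_1(\xi_1)\, \varphi_2(\xi_2)\, d\xi_1\, d\xi_2 = \int F(\xi) \sum_{\xi_1 \subseteq \xi} \varphi_1(\xi_1)\, \varphi_2(\xi \setminus \xi_1)\, d\xi . \tag{5.2}$$
Using the equality
$$\sum_{\substack{\eta_1 \cup \eta_2 = \varepsilon \\ \eta_1 \cap \eta_2 = \emptyset}} \bigl(\tfrac{3z}{M}\bigr)^{|\eta_1|} \bigl(\tfrac{z}{M}\bigr)^{|\eta_2|} = \bigl(\tfrac{3z}{M} + \tfrac{z}{M}\bigr)^{|\varepsilon|} = \bigl(\tfrac{4z}{M}\bigr)^{|\varepsilon|}$$
and the condition M ≥ 4z (together with the apparent inequality $\tfrac{z}{M} \le \tfrac{1}{3}$) we have that the expression (5.1) can be estimated from above by
$$\|G\|_M \cdot \int \bigl(\tfrac{z}{M} + \tfrac{3z}{M}\bigr)^{|\varepsilon|} |G(\varepsilon)|\, M^{|\varepsilon|}\, d\varepsilon \le \|G\|_M^2 .$$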
Here we also use the evident estimate
$$\int |G(\varepsilon)|\, M^{|\varepsilon|}\, d\varepsilon \le \|G\|_M .$$
Let us now consider the general case, when the function G ∈ L can be represented as a sum G = gΦ0 + G1, where Φ0 is the “vacuum” vector defined by (2.15), g = G(∅), and G1(∅) = 0. Then using the Cauchy–Schwarz–Bunyakovskii inequality, we have
$$\begin{aligned}
\|G\|_H^2 &= (g\Phi_0 + G_1,\ g\Phi_0 + G_1)_H = g^2 (\Phi_0, \Phi_0)_H + 2g (\Phi_0, G_1)_H + (G_1, G_1)_H \\
&\le g^2 + 2|g|\, \|G_1\|_M + \|G_1\|_M^2 = (|g| + \|G_1\|_M)^2 = \|G\|_M^2 .
\end{aligned}$$
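The combinatorial equality used in the proof above, in which the sum over all splittings of a configuration into two disjoint parts collapses to a single power, can be checked directly for small configurations. The sketch below uses exact rational arithmetic; the values z = 1 and M = 5 are hypothetical and chosen only so that M > 4z holds.

```python
from fractions import Fraction
from itertools import combinations

def split_sum(points, a, b):
    """Sum of a^{|eta1|} * b^{|eta2|} over all ordered splittings of `points`
    into disjoint parts eta1, eta2 with eta1 ∪ eta2 = points."""
    pts = list(points)
    n = len(pts)
    total = Fraction(0)
    for r in range(n + 1):
        for _eta1 in combinations(pts, r):   # eta2 is the complement of eta1
            total += a ** r * b ** (n - r)
    return total

# Hypothetical parameter values satisfying M > 4z.
z, M = Fraction(1), Fraction(5)
a, b = 3 * z / M, z / M

for n in range(6):
    assert split_sum(range(n), a, b) == (a + b) ** n

print(split_sum(range(4), a, b), (4 * z / M) ** 4)
```

Since 4z/M < 1 here, each such factor is at most 1, which is how the condition on M enters the estimate of (5.1).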
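Thus, we proved the estimate (3.4) together with the inclusion L ⊂ H. Since the space Cbs(Γ0) of continuous functions on Γ0 with bounded support is contained in L, and Cbs(Γ0) is dense in H, the space L is dense in H. Lemma 3.1 is completely proved.

5.2. Proof of Lemma 3.2

We consider the following decomposition of the operator L into the sum of two operators:
$$L = L^0 + L^1 , \tag{5.3}$$
where
$$(L^0 G)(\eta) = -|\eta|\, G(\eta) \tag{5.4}$$
is a “free” generator, and the “perturbation” $L^1$ is given as
$$(L^1 G)(\eta) = z \sum_{\gamma \subseteq \eta} \int G(\gamma \cup x) \prod_{y \in \eta \setminus \gamma} \kappa_\beta(x - y) \prod_{y' \in \gamma} e^{-\beta\varphi(x - y')}\, dx . \tag{5.5}$$
Here $\kappa_\beta(u) = e^{-\beta\varphi(u)} - 1$, and as usual, we assume that $\prod_{y \in \emptyset} f(y) = 1$. By analogy with the matrix representation (3.6) associated with the decomposition (3.5) for the operator L we get matrix representations
$$L^j = \begin{pmatrix} L^j_{11} & L^j_{12} \\ L^j_{21} & L^j_{22} \end{pmatrix}, \qquad j = 0, 1 ,$$
for each of the operators $L^0$ and $L^1$. Consequently, we can write $L_{22}^{-1}$ as
$$L_{22}^{-1} = (L^0_{22} + L^1_{22})^{-1} = \bigl(E_{\ge 2} + (L^0_{22})^{-1} L^1_{22}\bigr)^{-1} (L^0_{22})^{-1} \tag{5.6}$$
with the identity operator E≥2 acting in L≥2.

Let us estimate the norm of the operator $(L^0_{22})^{-1} L^1_{22}$. It follows from (3.1), (5.4) and (5.5), and the notation γ = γ1 ∪ γ2, η = η1 ∪ η2 (γ ≠ ∅), that
$$\begin{aligned}
\|(L^0_{22})^{-1} L^1_{22} G\|_M
&\le z \int_{\Gamma_0} \int_{\mathbb{R}^\nu} \sup_{\eta_1} \Biggl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} \frac{|\eta_1| + |\eta_2|}{|\eta_1| + |\eta_2|} \sum_{\gamma_1 \subseteq \eta_1} \sum_{\gamma_2 \subseteq \eta_2} |G(\gamma_1 \cup \gamma_2 \cup x)| \prod_{y \in (\eta_1 \cup \eta_2) \setminus \gamma} |\kappa_\beta(x - y)| \prod_{y \in \gamma_1 \cup \gamma_2} e^{-\beta\varphi(x - y)} \Biggr) dx\, M^{|\eta_2|}\, d\eta_2 \\
&\le z \int_{\Gamma_0} \int_{\mathbb{R}^\nu} \sup_{\eta_1} \Biggl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} \sum_{\gamma_2 \subseteq \eta_2} \sup_{\tilde\gamma \subseteq \eta_1} |G(\tilde\gamma \cup \gamma_2 \cup x)| \Bigl( \sum_{\gamma_1 \subseteq \eta_1} \prod_{y \in \eta_1 \setminus \gamma_1} |\kappa_\beta(x - y)| \prod_{y \in \gamma_1} e^{-\beta\varphi(x - y)} \Bigr) \\
&\qquad\qquad \times \Bigl( \prod_{y' \in \eta_2 \setminus \gamma_2} |\kappa_\beta(x - y')| \prod_{y' \in \gamma_2} e^{-\beta\varphi(x - y')} \Bigr) \Biggr) dx\, M^{|\eta_2|}\, d\eta_2 .
\end{aligned} \tag{5.7}$$
Using that the potential ϕ ≥ 0 is non-negative, we have
$$\prod_{y \in \gamma_2} e^{-\beta\varphi(x - y)} \le 1 ,$$
and
$$\begin{aligned}
\sum_{\gamma_1 \subseteq \eta_1} \prod_{y \in \eta_1 \setminus \gamma_1} |\kappa_\beta(x - y)| \prod_{y \in \gamma_1} e^{-\beta\varphi(x - y)}
&= \sum_{\gamma_1 \subseteq \eta_1} \prod_{y \in \eta_1 \setminus \gamma_1} \bigl(1 - e^{-\beta\varphi(x - y)}\bigr) \prod_{y \in \gamma_1} e^{-\beta\varphi(x - y)} \\
&= \prod_{y \in \eta_1} \bigl(1 - e^{-\beta\varphi(x - y)} + e^{-\beta\varphi(x - y)}\bigr) = 1
\end{aligned}$$
for any η1. Thus we can continue (5.7) as follows:
$$\le z \int_{\Gamma_0} \int_{\mathbb{R}^\nu} \sum_{\gamma_2 \subseteq \eta_2} \sup_{\eta_1} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} \Bigl\{ \sup_{\tilde\gamma \subseteq \eta_1} |G(\tilde\gamma \cup \gamma_2 \cup x)| \Bigr\} \Bigr) \prod_{y \in \eta_2 \setminus \gamma_2} |\kappa_\beta(x - y)|\, dx\, M^{|\eta_2|}\, d\eta_2 . \tag{5.8}$$
Let us notice that for any non-negative f(γ):
$$\sup_{\eta_1} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} \Bigl\{ \sup_{\tilde\gamma \subseteq \eta_1} f(\tilde\gamma) \Bigr\} \Bigr) \le \sup_{\eta_1} \Bigl( \sup_{\tilde\gamma \subseteq \eta_1} \Bigl\{ \bigl(\tfrac{1}{3}\bigr)^{|\tilde\gamma|} f(\tilde\gamma) \Bigr\} \Bigr) = \sup_{\eta_1} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} f(\eta_1) \Bigr) .$$
Consequently, the expression (5.8) has the following upper bound:
$$\le z \int_{\Gamma_0} \int_{\mathbb{R}^\nu} \sum_{\gamma_2 \subseteq \eta_2} \sup_{\eta_1} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} |G(\eta_1 \cup \gamma_2 \cup x)| \Bigr) \prod_{y \in \eta_2 \setminus \gamma_2} |\kappa_\beta(x - y)|\, dx\, M^{|\eta_2|}\, d\eta_2 . \tag{5.9}$$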
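The identity used above to pass from (5.7) to (5.8), namely that the sum over subsets γ1 ⊆ η1 of the products of the factors 1 − e^{−βϕ} and e^{−βϕ} equals 1, is the expansion of a product of terms (1 − a) + a. A minimal Python sketch with exact rational arithmetic follows; the numbers a_y stand for the values e^{−βϕ(x−y)} ∈ (0, 1] and are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def subset_expansion(weights):
    """Sum over all subsets gamma1 of the index set of
    prod_{y not in gamma1} (1 - a_y) * prod_{y in gamma1} a_y."""
    n = len(weights)
    total = Fraction(0)
    for r in range(n + 1):
        for chosen in combinations(range(n), r):
            term = Fraction(1)
            for i, a in enumerate(weights):
                term *= a if i in chosen else (1 - a)
            total += term
    return total

# Hypothetical values of exp(-beta * phi(x - y)) for the points y of eta1.
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(9, 10), Fraction(1)]
print(subset_expansion(weights))
```

The same check with an empty list of weights returns 1 as well, in line with the convention that an empty product equals 1.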
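We apply again the equality (5.2) to the formula (5.9) with
$$\varphi_1(\gamma_2) = \sup_{\eta_1} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} |G(\eta_1 \cup \gamma_2 \cup x)| \Bigr) M^{|\gamma_2|} , \qquad \varphi_2(\eta_2 \setminus \gamma_2) = \prod_{y \in \eta_2 \setminus \gamma_2} |\kappa_\beta(x - y)|\, M^{|\eta_2 \setminus \gamma_2|} , \qquad F(\eta_2) = 1 ;$$
then (5.8) can be rewritten as
$$\begin{aligned}
& z \int_{\mathbb{R}^\nu} \int_{\Gamma_0} \int_{\Gamma_0} \sup_{\eta_1} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta_1|} |G(\eta_1 \cup \gamma_2 \cup x)| \Bigr) \prod_{y \in \gamma_1} |\kappa_\beta(x - y)|\, M^{|\gamma_1|} M^{|\gamma_2|}\, d\gamma_1\, d\gamma_2\, dx \\
&\qquad = \frac{z}{M}\, e^{M C(\beta)} \int_{\Gamma_0} \sup_{\eta} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta|} |G(\eta \cup \tilde\gamma)| \Bigr) |\tilde\gamma|\, M^{|\tilde\gamma|}\, d\tilde\gamma \le \frac{z}{M}\, e^{M C(\beta)}\, \|G\|_M ,
\end{aligned}$$
where $\tilde\gamma = \gamma_2 \cup x$. In the last step we use (5.2) and the following calculation:
$$\int_{\Gamma_0} \prod_{y \in \gamma_1} |\kappa_\beta(x - y)|\, M^{|\gamma_1|}\, d\gamma_1 = 1 + \sum_{n=1}^{\infty} M^n \frac{1}{n!} \int_{\mathbb{R}^\nu} \cdots \int_{\mathbb{R}^\nu} \prod_{i=1}^{n} |\kappa_\beta(y_i)|\, dy_1 \cdots dy_n = e^{M C(\beta)} , \tag{5.10}$$
with $C(\beta) = \int_{\mathbb{R}^\nu} |\kappa_\beta(y)|\, dy$. Taking $M = 1/\tilde C(\beta)$ we have $M C(\beta) < M \tilde C(\beta)$ and
$$|||(L^0_{22})^{-1} L^1_{22}|||_M \le \frac{z}{M}\, e^{M C(\beta)} < \varepsilon \cdot e . \tag{5.11}$$
Since (5.4) implies the estimate
$$|||(L^0_{22})^{-1}|||_M \le \tfrac{1}{2} ,$$
then from (5.6) and (5.11) we finally have
$$|||(L_{22})^{-1}|||_M \le \frac{1}{2} \cdot \frac{1}{1 - \varepsilon e} < \frac{1}{2}(1 + 3\varepsilon)$$
for all small enough ε. Lemma 3.2 is proved.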
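The last inequality of the proof, 1/(1 − εe) < 1 + 3ε, holds only for small ε, and the admissible range can be made explicit by elementary algebra: it is equivalent to ε < (3 − e)/(3e). The Python sketch below checks this numerically; the sample values of ε are arbitrary, and the explicit threshold is a remark of this sketch rather than a statement from the proof.

```python
import math

def neumann_bound(eps):
    """Bound obtained from (5.6) and (5.11): (1/2) * 1 / (1 - eps * e)."""
    return 0.5 / (1.0 - eps * math.e)

def claimed_bound(eps):
    """Right-hand side of (3.9): (1/2) * (1 + 3 * eps)."""
    return 0.5 * (1.0 + 3.0 * eps)

# 1/(1 - e*eps) < 1 + 3*eps  <=>  (3 - e) > 3*e*eps  <=>  eps < (3 - e)/(3e)
threshold = (3.0 - math.e) / (3.0 * math.e)

for eps in (0.001, 0.01, 0.03):
    print(eps, neumann_bound(eps) < claimed_bound(eps))
print("threshold:", threshold)
```

Above the threshold the comparison fails, so in this form the phrase "small enough ε" in Lemma 3.2 can be read as ε below roughly 0.034.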
5.3. Proof of Lemma 3.3

5.3.1. Operator L11

Functions G ∈ L≤1 have the norm
$$\|G\|_M = \tfrac{1}{3} \sup_{y} |G_1(y)| + M \int |G_1(x)|\, dx + |G_0| ,$$
where
$$G(\eta) = \begin{cases} G_0 , & \eta = \emptyset , \\ G_1(x) , & \eta = \{x\} . \end{cases}$$
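The function L11G has the following components:
$$(L_{11} G)_0 = z \int G_1(x)\, dx , \qquad (L_{11} G)_1(y) = -G_1(y) + z \int G_1(x) \bigl(e^{-\beta\varphi(x - y)} - 1\bigr)\, dx . \tag{5.12}$$
Then using the estimate |κβ(x − y)| ≤ 1, we have:
$$\begin{aligned}
\|L_{11} G\|_M &= \int_{\Gamma_0} \sup_{\eta} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta|} |(L_{11} G)_1(\eta \cup \xi)|\, (|\eta| + |\xi|) \Bigr) M^{|\xi|}\, d\xi + |(L_{11} G)_0| \\
&\le \tfrac{1}{3} \sup_{y} \Bigl( |G_1(y)| + z \int |G_1(x)|\, |\kappa_\beta(x - y)|\, dx \Bigr) + M \int |G_1(x)|\, dx \\
&\qquad + M z \int\!\!\int |G_1(x)|\, |\kappa_\beta(x - y)|\, dx\, dy + z \int |G_1(x)|\, dx \\
&\le \|G\|_M + \Bigl( 2 \frac{z}{M} + z C(\beta) \Bigr) \int |G_1(x)|\, M\, dx \le (1 + 3\varepsilon)\, \|G\|_M .
\end{aligned}$$
Thus,
$$|||L_{11}|||_M \le 1 + 3\varepsilon .$$

5.3.2. Operator L12

Let us consider the components of the operator L12:
$$(L_{12} G)_0 = 0 , \qquad (L_{12} G)_1(y) = z \int G_2(x, y)\, e^{-\beta\varphi(x - y)}\, dx , \qquad G \in L_{\ge 2} ,$$
where G2 ∈ L2 is the two-point configuration component of G ∈ L≥2. Then
$$\begin{aligned}
\|L_{12} G\|_M &\le z \Bigl( \tfrac{1}{3} \sup_{y} \int |G_2(y, x)|\, e^{-\beta\varphi(x - y)}\, dx + \int\!\!\int |G_2(y, x)|\, e^{-\beta\varphi(x - y)} M\, dx\, dy \Bigr) \\
&\le \frac{z}{M} \Bigl( \tfrac{1}{3} \sup_{y} \int |G_2(y, x)|\, M\, dx + \int\!\!\int |G_2(y, x)|\, M^2\, dx\, dy \Bigr) .
\end{aligned} \tag{5.13}$$
On the other hand, the expression (3.1) implies
$$\|G_2\|_M = 2 \bigl(\tfrac{1}{3}\bigr)^2 \sup_{x, y} |G_2(y, x)| + \tfrac{2}{3} \int \sup_{y} |G_2(y, x)|\, M\, dx + \int\!\!\int |G_2(y, x)|\, M^2\, dx\, dy \le \|G\|_M , \tag{5.14}$$
where G ∈ L≥2, G2 ∈ L2. Comparing (5.13) and (5.14) we have
$$\|L_{12} G\|_M \le \frac{z}{M}\, \|G_2\|_M \le \varepsilon\, \|G\|_M ,$$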
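and
$$|||L_{12}|||_M \le \varepsilon .$$

5.3.3. Operator L21

Let G = (G0, G1) ∈ L≤1, then
$$(L_{21} G)(\eta) = z \int G_1(x) \prod_{y \in \eta} \kappa_\beta(x - y)\, dx , \qquad |\eta| \ge 2 , \tag{5.15}$$
and for $M = 1/\tilde C(\beta)$ we have:
$$\begin{aligned}
\|L_{21} G\|_M &\le z \int_{\Gamma_0} \sup_{\eta:\, |\eta| + |\xi| \ge 2} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{|\eta|} (|\eta| + |\xi|) \int |G_1(x)| \prod_{y \in \eta} |\kappa_\beta(x - y)| \prod_{y' \in \xi} |\kappa_\beta(x - y')|\, dx \Bigr) M^{|\xi|}\, d\xi \\
&\le z \int |G_1(x)|\, dx\ \sup_{x} \int_{\Gamma_0} \sup_{n:\, n + |\xi| \ge 2} \Bigl( \bigl(\tfrac{1}{3}\bigr)^{n} (n + |\xi|) \Bigr) \prod_{y \in \xi} |\kappa_\beta(x - y)|\, M^{|\xi|}\, d\xi \\
&\le \frac{z}{M} \int |G_1(x)|\, M\, dx \Bigl( \sup_{n \ge 2} \bigl(\tfrac{1}{3}\bigr)^{n} n + C(\beta) M \sup_{n \ge 1} \bigl(\tfrac{1}{3}\bigr)^{n} (n + 1) + \sum_{k \ge 2} \frac{(C(\beta) M)^k}{k!} \sup_{n \ge 0} \bigl(\tfrac{1}{3}\bigr)^{n} (n + k) \Bigr) \\
&\le \frac{z}{M} \int |G_1(x)|\, M\, dx \Bigl( 2 \bigl(\tfrac{1}{3}\bigr)^2 + \tfrac{2}{3} C(\beta) M + \sum_{k=2}^{\infty} \bigl(k + \tfrac{1}{3}\bigr) \frac{(C(\beta) M)^k}{k!} \Bigr) \\
&\le 4\varepsilon \int |G_1(x)|\, M\, dx \le 4\varepsilon\, \|G\|_M .
\end{aligned}$$
Here we use that
$$\sup_{n > 0} \bigl(\tfrac{1}{3}\bigr)^{n} n = \tfrac{1}{3}$$
and
$$\tfrac{1}{3}\, e^{M C(\beta)} < \tfrac{e}{3} < 1 .$$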
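The elementary suprema entering the estimate of the operator L21 can be verified directly. The sketch below is only a numerical sanity check: it truncates the range of n and the series in k (both truncations are assumptions of the sketch, justified by the fact that (n + k)/3^n decreases in n for n ≥ 1 and that the series converges factorially), and x stands for the product C(β)M, which is taken to be at most 1.

```python
from fractions import Fraction
from math import factorial

def sup_weight(k, n_min, n_max=60):
    """max of (n + k) / 3^n over n_min <= n <= n_max (exact rationals)."""
    return max(Fraction(n + k, 3 ** n) for n in range(n_min, n_max + 1))

def bracket(x, k_max=40):
    """2*(1/3)^2 + (2/3)*x + sum_{k=2}^{k_max} (k + 1/3) * x^k / k!"""
    tail = sum((k + 1.0 / 3.0) * x ** k / factorial(k)
               for k in range(2, k_max + 1))
    return 2.0 / 9.0 + 2.0 * x / 3.0 + tail

print(sup_weight(0, 1))   # sup over n >= 1 of n / 3^n
print(sup_weight(0, 2))   # sup over n >= 2 of n / 3^n
print(sup_weight(1, 1))   # sup over n >= 1 of (n + 1) / 3^n
print(bracket(1.0))       # value of the bracket at C(beta) * M = 1
```

For x ≤ 1 the bracket stays below 4, which is consistent with the constant in (3.13) provided the prefactor z/M does not exceed ε.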
Lemma 3.3 is completely proved.

5.4. Proof of Lemma 3.4

Using (3.9) with (3.11)–(3.13) we have for all small enough ε and any S ∈ B8ε:
$$\begin{aligned}
|||F(S)|||_M &\le |||L_{22}^{-1}|||_M\, |||L_{21}|||_M + |||L_{22}^{-1}|||_M\, |||L_{11}|||_M\, |||S|||_M + |||L_{22}^{-1}|||_M\, |||L_{12}|||_M\, |||S|||_M^2 \\
&\le \tfrac{1}{2}(1 + 3\varepsilon)\, 4\varepsilon + \tfrac{1}{2}(1 + 3\varepsilon)^2\, 8\varepsilon + \tfrac{1}{2}(1 + 3\varepsilon)\, \varepsilon\, (8\varepsilon)^2 < 8\varepsilon ,
\end{aligned}$$
which proves the inclusion (3.14). Further,
$$F(S_1) - F(S_2) = L_{22}^{-1} (S_1 - S_2) L_{11} + L_{22}^{-1} (S_1 - S_2) L_{12} S_1 + L_{22}^{-1} S_2 L_{12} (S_1 - S_2) ;$$
consequently, using again (3.9), (3.11)–(3.13), we have for any S1, S2 ∈ B8ε
$$|||F(S_1) - F(S_2)|||_M \le \Bigl( \tfrac{1}{2}(1 + 3\varepsilon)^2 + (1 + 3\varepsilon)\, 8\varepsilon^2 \Bigr) \cdot |||S_1 - S_2|||_M .$$
Since for small enough ε
$$c = \tfrac{1}{2}(1 + 3\varepsilon)^2 + (1 + 3\varepsilon)\, 8\varepsilon^2 = \tfrac{1}{2} + O(\varepsilon) < 1 ,$$
the inequality (3.15) is proved.

5.5. Proof of Lemma 3.5

To prove (3.20) we have to find for any G ∈ L functions g≤1 ∈ L≤1 and g≥2 ∈ L≥2 such that
$$G = (g_{\le 1} + S g_{\le 1}) + (g_{\ge 2} + T g_{\ge 2}) , \tag{5.16}$$
and to prove that the decomposition (5.16) is unique. The decomposition (5.16) is equivalent to the following relations:
$$g_{\le 1} + T g_{\ge 2} = G_{\le 1} , \qquad g_{\ge 2} + S g_{\le 1} = G_{\ge 2} , \tag{5.17}$$
where G≤1 ∈ L≤1 and G≥2 ∈ L≥2 are the components of the function G ∈ L. Then (5.17) implies that
$$G_{\le 1} - T G_{\ge 2} = g_{\le 1} - T S g_{\le 1} ;$$
consequently,
$$g_{\le 1} = (E_{\le 1} - T S)^{-1} (G_{\le 1} - T G_{\ge 2}) ,$$
and analogously,
$$g_{\ge 2} = (E_{\ge 2} - S T)^{-1} (G_{\ge 2} - S G_{\le 1}) .$$
Since for small enough ε the operators TS in L≤1 and ST in L≥2 have small norms, the functions g≤1, g≥2 are uniquely defined. Lemma 3.5 is proved.
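The formulas for g≤1 and g≥2 obtained in the proof can be tested in a scalar toy model, where S and T are small numbers and the two components of G are single rationals. All numerical values below are hypothetical; the sketch only confirms that the two expressions solve the system (5.17).

```python
from fractions import Fraction

def decompose(G_low, G_high, S, T):
    """Solve g_low + T*g_high = G_low and g_high + S*g_low = G_high
    using the explicit formulas from the proof of Lemma 3.5."""
    g_low = (G_low - T * G_high) / (1 - T * S)
    g_high = (G_high - S * G_low) / (1 - S * T)
    return g_low, g_high

# Hypothetical scalar stand-ins for the operators S, T and the components of G.
S, T = Fraction(4, 100), Fraction(-3, 100)
G_low, G_high = Fraction(3, 2), Fraction(-2)

g_low, g_high = decompose(G_low, G_high, S, T)
print(g_low + T * g_high == G_low, g_high + S * g_low == G_high)
```

The formulas require 1 − TS to be invertible, which in the operator setting is guaranteed by the smallness of the norms of S and T.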
5.6. Proof of Lemma 3.6

As follows from the construction (3.17) of the invariant subspace $\hat L_{\ge 2}$, any function G ∈ $\hat L_{\ge 2}$ can be decomposed into a sum of two components:
$$G = G_{\le 1} + G_{\ge 2} , \qquad G_{\le 1} \in L_{\le 1} , \quad G_{\ge 2} \in L_{\ge 2} , \quad G_{\le 1} = T G_{\ge 2} .$$
We denote by P≥2 the projection operator P≥2 : $\hat L_{\ge 2}$ → L≥2, acting as follows:
$$P_{\ge 2} G = G_{\ge 2} , \qquad G = G_{\le 1} + G_{\ge 2} \in \hat L_{\ge 2} ,$$
and the inverse operator $P_{\ge 2}^{-1}$ : L≥2 → $\hat L_{\ge 2}$, acting by the formula
$$P_{\ge 2}^{-1} G_{\ge 2} = G_{\ge 2} + T G_{\ge 2} \in \hat L_{\ge 2} , \qquad G_{\ge 2} \in L_{\ge 2} .$$
According to the construction of the invariant subspace $\hat L_{\ge 2}$, the operator L2 can be represented in the following form:
$$L_2 = P_{\ge 2}^{-1} (L_{22} + L_{21} T) P_{\ge 2} , \tag{5.18}$$
where the operators L22, L21 were defined by (3.6). A representation analogous to (5.18) is valid for the inverse operator $L_2^{-1}$:
$$L_2^{-1} = P_{\ge 2}^{-1} (L_{22} + L_{21} T)^{-1} P_{\ge 2} . \tag{5.19}$$
Since
$$(L_{22} + L_{21} T)^{-1} = (E_{\ge 2} + L_{22}^{-1} L_{21} T)^{-1} L_{22}^{-1} ,$$
then using estimates (3.9), (3.13) and (3.19) we have
$$|||L_{22}^{-1} L_{21} T|||_M < 16\varepsilon^2 (1 + 3\varepsilon)$$
and consequently,
$$|||(L_{22} + L_{21} T)^{-1}|||_M < \tfrac{1}{2}(1 + 4\varepsilon) \tag{5.20}$$
for small enough ε. The norm (3.1) and the estimate (3.19) on the norm of T imply that
$$|||P_{\ge 2}|||_M \le 1 , \qquad |||P_{\ge 2}^{-1}|||_M \le 1 + 8\varepsilon . \tag{5.21}$$
Finally, the estimate (3.21) follows from (5.19), (5.20) and (5.21). Lemma 3.6 is proved.

5.7. Proof of Lemma 3.7

The proof is based on the following proposition.

Proposition 5.1 ([10]). Let L be a Banach space with a norm ‖·‖L such that L ⊂ H is a dense subset of a Hilbert space H, and for any f ∈ L
$$\|f\|_H \le \|f\|_L .$$
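Let $L$ be a self-adjoint operator in $\mathcal{H}$ such that $L\mathcal{L} \subset \mathcal{L}$ and the restriction $L|_{\mathcal{L}}$ is a bounded operator in $\mathcal{L}$. Then $L$ is a bounded operator in $\mathcal{H}$, and $\|L\|_{\mathcal{H}} \le \|L\|_{\mathcal{L}}$.

Using our constructions above we obtain that the subspace $\hat{\mathcal{H}}_1 = \hat{\mathcal{L}}_1$ is invariant for the operators $L$ and $U_t$, so that the restriction $L|_{\mathcal{H}_1}$ is a bounded self-adjoint operator in $\mathcal{H}_1$. Analysis of the operator $L_{11}$, see (5.12), shows that the operator $L^0_{11}$, acting in $\mathcal{L}_1$, has the form
$$(L^0_{11}G_1)(y) = -G_1(y) + z\int G_1(x)\kappa_\beta(x - y)\,dx, \qquad G_1 \in \mathcal{L}_1.$$
Now (3.12), (3.16) imply that the operator $L_{12}S|_{\mathcal{L}_1}$ has a small norm:
$$|||L_{12}(S|_{\mathcal{L}_1})|||_M \le 8\varepsilon^2,$$
hence the operator $L^0_{11} + L_{12}(S|_{\mathcal{L}_1})$ can be rewritten as
$$L^0_{11} + L_{12}(S|_{\mathcal{L}_1}) = -E + R,$$
where
$$(RG_1)(y) = z\int G_1(x)\kappa_\beta(x - y)\,dx + \big(L_{12}(S|_{\mathcal{L}_1})G_1\big)(y),$$
and
$$|||R|||_M \le \varepsilon + 8\varepsilon^2 < 2\varepsilon$$
for small enough $\varepsilon$. Using the estimates on the norms of the operators $P_1$ and $P_1^{-1}$:
$$|||P_1|||_M \le 1, \qquad |||P_1^{-1}|||_M \le 1 + 17\varepsilon,$$
we have for small enough $\varepsilon$
$$|||L_1 + E_{\hat{\mathcal{L}}_1}|||_M = |||P_1^{-1}RP_1|||_M \le |||R|||_M \cdot |||P_1^{-1}|||_M \le 2\varepsilon(1 + 17\varepsilon) < 3\varepsilon.$$
Now the proposition implies that
$$\|L_1 + E_{\hat{\mathcal{H}}_1}\|_{\mathcal{H}} \le |||L_1 + E_{\hat{\mathcal{L}}_1}|||_M \le 3\varepsilon,$$
which gives the location of the spectrum $\sigma_1$ in (2.17) with $\gamma_1(\varepsilon) = 3\varepsilon$.

Applying similar reasoning to the operator $L^{-1}|_{\hat{\mathcal{H}}_{\ge 2}}$ in the invariant subspace $\hat{\mathcal{H}}_{\ge 2}$, together with the estimate (3.21), we obtain that for small enough $\varepsilon$ the spectrum $\sigma_2$ of the operator $L_2$ is bounded from above by the value $-2 + \gamma_2(\varepsilon)$ with $\gamma_2(\varepsilon) = 30\varepsilon$. Thus, we have proved the inclusions (2.17).

The last step is to prove the decomposition (2.16). Since for small enough $\varepsilon$ the spectra $\sigma_0$, $\sigma_1$, $\sigma_2$ do not overlap, the subspaces $\mathcal{H}_0$, $\mathcal{H}_1$, $\mathcal{H}_{\ge 2}$ are mutually orthogonal. Let us prove that the sum (2.16) gives a complete decomposition of the space $\mathcal{H}$. We know that according to the decomposition (3.26) any function $G \in \mathcal{L}$ has a representation of the form
$$G = G_0 + G_1 + G_{\ge 2}, \qquad G_0 \in \mathcal{L}_0, \quad G_1 \in \hat{\mathcal{L}}_1, \quad G_{\ge 2} \in \hat{\mathcal{L}}_{\ge 2}. \tag{5.22}$$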
Any component of the decomposition (5.22) equals the orthogonal projection of $G$ to the corresponding invariant subspace
$$G_0 = P_{\mathcal{H}_0}G, \qquad G_1 = P_{\mathcal{H}_1}G, \qquad G_{\ge 2} = P_{\mathcal{H}_{\ge 2}}G,$$
so that all vectors $G_0$, $G_1$, $G_{\ge 2}$ are mutually orthogonal and
$$\|G\|^2_{\mathcal{H}} = \|P_{\mathcal{H}_0}G\|^2_{\mathcal{H}} + \|P_{\mathcal{H}_1}G\|^2_{\mathcal{H}} + \|P_{\mathcal{H}_{\ge 2}}G\|^2_{\mathcal{H}}.$$
This equality holds for a dense set in $\mathcal{H}$, consequently it is true for any element from $\mathcal{H}$, which is equivalent to the decomposition (2.16). Lemma 3.7 is completely proved.

Recall that Theorem 2.2' holds under assumptions (IIa)–(IIb), and assumption (IIa) implies in particular that
$$|\kappa_\beta(u)|^{1/2} := |e^{-\beta\phi(u)} - 1|^{1/2} < \frac{\bar{C}(\beta)}{(1 + |u|)^s} \tag{5.23}$$
with a constant $\bar{C}(\beta)$. The proofs of the two next lemmas are based on decay properties of the function $S(\xi; x)$ and the correlations
$$\hat{\varrho}(\xi_1; \xi_2) = \varrho(\xi_1 \cup \xi_2) - \varrho(\xi_1)\varrho(\xi_2), \qquad \xi_1 \ne \emptyset, \quad \xi_2 \ne \emptyset,$$
(see Propositions 5.2 and 5.3 below).

Proposition 5.2. Let the potential $\phi$ satisfy condition (IIa), and let the parameter $\varepsilon = z\tilde{C}(\beta)$ be small enough. Then the operator $S|_{\mathcal{L}_1}$ can be written as
$$(SG)(\xi) = \int_{\mathbb{R}^d} \tilde{S}(\xi, x)G(x)\,dx, \qquad |\xi| \ge 2, \quad G \in \mathcal{L}_1, \tag{5.24}$$
where the kernel $\tilde{S}(\xi, x)$ for any given $x$ belongs to the space $\mathcal{L}_{\ge 2}$ and meets the following estimate
$$|\tilde{S}(\xi; x)| \le M\,T(\xi - x)\,\nu\Big(\max_{y \in \xi}|y - x|\Big) \tag{5.25}$$
with
$$T(\eta) \in \mathcal{L}_{\ge 2} \quad \text{and} \quad \|T\|_M \le 8\varepsilon. \tag{5.26}$$

Proposition 5.3. Let $z < z_0$ and $\varepsilon = z\tilde{C}(\beta)$ be small enough, then
$$|\hat{\varrho}(\xi_1; \xi_2)| < \frac{\bar{C}(\beta)e}{1 - u}\,\sqrt{z}\,\sqrt{|\xi_1| \cdot |\xi_2|}\;z^{\frac{3}{4}(|\xi_1| + |\xi_2|)}\,\nu(d(\xi_1, \xi_2)), \tag{5.27}$$
where the constant $\bar{C}(\beta)$ is defined in (5.23), and $u = \sqrt{z}\,e\,\hat{C} < 1$ with $\hat{C} = \max\{1, \bar{C}(\beta)\}$.
For the proofs of Propositions 5.2 and 5.3 see the Appendix.
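5.8. Proof of Lemma 4.1
5.8.1. The representation for the scalar product
Let us denote
$$K_j(\xi) = (G_j + S'G_j)(\xi) \in \mathcal{H}, \qquad G_j \in \mathcal{L}_1, \quad j = 1, 2,$$
so that
$$K_j(\xi) = \begin{cases} -(G_j + S|_{\mathcal{L}_1}G_j, \Phi_0)_{\mathcal{H}}, & |\xi| = 0, \\ G_j(x), & |\xi| = 1, \\ (S'G_j)(\xi) = (S|_{\mathcal{L}_1}G_j)(\xi) = \int \tilde{S}(\xi; x)G_j(x)\,dx, & |\xi| \ge 2. \end{cases} \tag{5.28}$$
Here $\tilde{S}(\xi; x)$ is the kernel of the operator $S|_{\mathcal{L}_1}$ (3.10). Then using the formula for the scalar product in $\mathcal{H}$ we have
$$(G_1 + S'G_1, G_2 + S'G_2)_{\mathcal{H}} = \int K_1(\xi_1)K_2(\xi_2)\varrho(\xi_1 \cup \xi_2)\,d\xi_1 d\xi_2 + \int_{\xi_2 \ne \emptyset} K_1(\xi_1 \cup \xi_2)K_2(\xi_2 \cup \xi_3)\varrho(\xi_1 \cup \xi_2 \cup \xi_3)\,d\xi_1 d\xi_2 d\xi_3. \tag{5.29}$$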
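We can rewrite the first term as follows
$$\int K_1(\xi_1)K_2(\xi_2)\big(\varrho(\xi_1 \cup \xi_2) - \varrho(\xi_1)\varrho(\xi_2)\big)\,d\xi_1 d\xi_2 + \int K_1(\xi_1)\varrho(\xi_1)\,d\xi_1 \int K_2(\xi_2)\varrho(\xi_2)\,d\xi_2. \tag{5.30}$$
Since
$$\int K_j(\xi)\varrho(\xi)\,d\xi = K_j(\emptyset) + (G_j + S|_{\mathcal{L}_1}G_j, \Phi_0)_{\mathcal{H}} = 0, \qquad j = 1, 2, \tag{5.31}$$
the expression in (5.30) equals
$$\begin{aligned} \int K_1(\xi_1)K_2(\xi_2)\hat{\varrho}(\xi_1; \xi_2)\,d\xi_1 d\xi_2 = {} & \int G_1(x)G_2(y)\hat{\varrho}(x; y)\,dx\,dy + \int G_1(x)S(\xi; y)G_2(y)\hat{\varrho}(x; \xi)\,dx\,dy\,d\xi \\ & + \int S(\xi; x)G_1(x)G_2(y)\hat{\varrho}(\xi; y)\,dx\,dy\,d\xi \\ & + \int S(\xi_1; x)S(\xi_2; y)G_1(x)G_2(y)\hat{\varrho}(\xi_1; \xi_2)\,dx\,dy\,d\xi_1 d\xi_2. \end{aligned} \tag{5.32}$$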
We denote
$$t^{(1)}(x - y) = \hat{\varrho}(x; y) + \int S(\xi; y)\hat{\varrho}(x; \xi)\,d\xi + \int S(\xi; x)\hat{\varrho}(\xi; y)\,d\xi + \int S(\xi_1; x)S(\xi_2; y)\hat{\varrho}(\xi_1; \xi_2)\,d\xi_1 d\xi_2, \tag{5.33}$$
then from (5.30)–(5.32) we have for the first term in (5.29)
$$\int K_1(\xi_1)K_2(\xi_2)\varrho(\xi_1 \cup \xi_2)\,d\xi_1 d\xi_2 = \int\!\!\int t^{(1)}(x - y)G_1(x)G_2(y)\,dx\,dy. \tag{5.34}$$
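By analogy we obtain a similar representation for the second term in (5.29)
$$\int_{\xi_2 \ne \emptyset} K_1(\xi_1 \cup \xi_2)K_2(\xi_2 \cup \xi_3)\varrho(\xi_1 \cup \xi_2 \cup \xi_3)\,d\xi_1 d\xi_2 d\xi_3 = \varrho_1 \int G_1(x)G_2(x)\,dx + \int t^{(2)}(x - y)G_1(x)G_2(y)\,dx\,dy \tag{5.35}$$
with
$$\begin{aligned} t^{(2)}(x - y) = {} & \int \big(S(x \cup \xi; y)\varrho(x \cup \xi) + S(y \cup \xi; x)\varrho(y \cup \xi)\big)\,d\xi \\ & + \int_{|\xi_1 \cup \xi_2| \ge 2,\ |\xi_2 \cup \xi_3| \ge 2,\ \xi_2 \ne \emptyset} S(\xi_1 \cup \xi_2; x)S(\xi_2 \cup \xi_3; y)\varrho(\xi_1 \cup \xi_2 \cup \xi_3)\,d\xi_1 d\xi_2 d\xi_3. \end{aligned} \tag{5.36}$$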
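5.8.2. The estimates on the function $s(u) = t^{(1)}(u) + t^{(2)}(u)$
The function $t^{(1)}(x - y)$. As follows from the Kirkwood–Salsburg equations (see Appendix)
$$\varrho_1 = z\Big(1 + \int_{\Gamma_0 \setminus \emptyset} \varrho(\eta) \prod_{y \in \eta} \kappa_\beta(x - y)\,d\eta\Big) > z\big(1 - C(\beta)z\,e^{C(\beta)z}\big) > z(1 - \varepsilon e^{\varepsilon}),$$
where $zC(\beta) < z\tilde{C}(\beta) = \varepsilon$, and $\varepsilon$ is small enough. Thus
$$z(1 - \varepsilon e^{\varepsilon}) < \varrho_1 < z. \tag{5.37}$$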
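The lower bound on $\varrho_1$ can be traced through a short computation. The sketch below rests on three assumptions that are not spelled out at this point of the text: that $C(\beta) = \int |\kappa_\beta(u)|\,du$, that $d\eta$ is the Lebesgue–Poisson measure on $\Gamma_0$ (so that the $n$-point part carries a factor $1/n!$), and that $\varrho(\eta) \le z^{|\eta|}$.

```latex
% Assumptions (ours): C(\beta) = \int |\kappa_\beta(u)| du,
% d\eta = Lebesgue--Poisson measure, \varrho(\eta) <= z^{|\eta|}.
\begin{align*}
\Big| \int_{\Gamma_0 \setminus \emptyset} \varrho(\eta)
      \prod_{y \in \eta} \kappa_\beta(x - y)\, d\eta \Big|
  &\le \sum_{n \ge 1} \frac{z^n}{n!}
       \Big( \int |\kappa_\beta(u)|\, du \Big)^{n}
   = e^{z C(\beta)} - 1 \\
  &\le z C(\beta)\, e^{z C(\beta)} .
\end{align*}
% The last step uses e^t - 1 = \int_0^t e^s ds <= t e^t for t >= 0.
% Since t -> t e^t is increasing and z C(\beta) < \varepsilon:
\[
\varrho_1 \ge z \big( 1 - z C(\beta) e^{z C(\beta)} \big)
          > z \big( 1 - \varepsilon e^{\varepsilon} \big).
\]
```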
For the first term $\hat{\varrho}(x; y) = \hat{\varrho}(x - y)$ in (5.33) we have, using (5.27),
$$\int \hat{\varrho}(x)\,dx < z^2\,\frac{\bar{C}(\beta)e}{1 - u}\,K_\nu < \frac{z\,\bar{C}(\beta)e\,K_\nu}{(1 - u)(1 - \varepsilon e^{\varepsilon})}\,\varrho_1 = D_1\varrho_1. \tag{5.38}$$
Here $K_\nu = \int \nu(x)\,dx$ and $u = \sqrt{z}\,e\,\hat{C}$. It is easy to see that for small enough $\varepsilon$ and
$$z < z_0 = \frac{1}{(e\hat{C} + K_\nu)^2},$$
we get
$$D_1 = \frac{\bar{C}(\beta)e\,K_\nu\,z}{(1 - \varepsilon e^{\varepsilon})(1 - u)} < 1.$$
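The second and third terms in (5.33) can be estimated in a similar way. Using (5.37), (5.25), (5.27) and the relation $u = \sqrt{z}\,e\,\hat{C}$ we have in the case $|\xi| \ge 2$:
$$\begin{aligned} \Big|\int S(\xi; x)\hat{\varrho}(\xi; y)\,d\xi\Big| & \le \int M\,T(\xi - x)\,\nu\Big(\max_{\tilde{y} \in \xi}|x - \tilde{y}|\Big)\frac{\bar{C}(\beta)e}{1 - u}\sqrt{|\xi|}\,\sqrt{z}\,z^{\frac{3}{4}(|\xi| + 1)}\nu(d(\xi, y))\,d\xi \\ & \le \frac{uz}{1 - u}\,\frac{z}{M}\,\nu(|x - y|)\int T(\xi)M^{|\xi|}\sqrt{|\xi|}\,d\xi \\ & \le \frac{u\varrho_1}{(1 - u)(1 - \varepsilon e^{\varepsilon})}\,\frac{z}{M}\,\nu(|x - y|)\,\|T\|_M \le 8\varepsilon^2\,\frac{u\varrho_1}{(1 - u)(1 - \varepsilon e^{\varepsilon})}\,\nu(|x - y|). \end{aligned}$$
Finally,
$$\int \Big|\int S(\xi; x)\hat{\varrho}(\xi; y)\,d\xi\Big|\,d(x - y) \le D_2\varepsilon^2\varrho_1 \tag{5.39}$$
with a constant $D_2$. The last term in (5.33) is estimated as follows
$$\Big|\int S(\xi_1; x)S(\xi_2; y)\hat{\varrho}(\xi_1; \xi_2)\,d\xi_1 d\xi_2\Big| < \frac{u}{1 - u}\,\frac{z^3}{M^2}\,\nu(|x - y|)\,\|T\|^2_M,$$
and again,
$$\int \Big|\int S(\xi_1; x)S(\xi_2; y)\hat{\varrho}(\xi_1; \xi_2)\,d\xi_1 d\xi_2\Big|\,d(x - y) < D_3\varepsilon^4\varrho_1 \tag{5.40}$$
with a constant $D_3$. The estimates (5.38), (5.39) and (5.40) imply that
$$\int |t^{(1)}(x)|\,dx \le (D_1 + D_4\varepsilon^2)\varrho_1, \tag{5.41}$$
where $D_1 < 1$ and $\varepsilon$ is small enough.

The function $t^{(2)}(x - y)$. Using the properties (5.25) and (5.26) of the function $S(\xi; x)$, the estimate on the correlation functions $\varrho(\xi) \le z^{|\xi|}$, and (5.37) we have
$$\int \Big|\int S(x \cup \xi; y)\varrho(x \cup \xi)\,d\xi\Big|\,d(x - y) \le D_5\varepsilon^2\varrho_1. \tag{5.42}$$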
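The second term in (5.36) meets the following bound
$$\begin{aligned} \Big|\int_{\xi_2 \ne \emptyset} S(\xi_1 \cup \xi_2; x)S(\xi_2 \cup \xi_3; y)\varrho(\xi_1 \cup \xi_2 \cup \xi_3)\,d\xi_1 d\xi_2 d\xi_3\Big| & \le \nu(|x - y|)\int_{\xi_2 \ne \emptyset} T(\xi_1 \cup \xi_2)T(\xi_2 \cup \xi_3)M^2 z^{|\xi_1| + |\xi_2| + |\xi_3|}\,d\xi_1 d\xi_2 d\xi_3 \\ & \le D_6\|T\|^2_M\,\nu(|x - y|)\,\varrho_1. \end{aligned} \tag{5.43}$$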
Here we use the same reasoning as in Lemma 3.1 in the case when $|\xi_1 \cup \xi_2| \ge 2$ and $|\xi_2 \cup \xi_3| \ge 2$. Finally from (5.42) and (5.43) it follows that
$$\int |t^{(2)}(x)|\,dx \le D_7\varepsilon^2\varrho_1. \tag{5.44}$$
The estimates (5.41) and (5.44) imply (4.3) for all small enough $\varepsilon$. Lemma 4.1 is completely proved.

5.9. Proof of Lemma 4.2
The representations (4.4), (5.28) and the relation $S(\eta; x) = S(\eta - x; 0)$ for the kernel of the operator $S|_{\mathcal{L}_1}$ imply that
$$\begin{aligned} (\hat{L}_1G)(x) & = -G(x) + z\int \kappa_\beta(x - y)G(y)\,dy + z\int S_2(y - x, z - x; 0)G(y)e^{-\beta\phi(y - z)}\,dy\,dz \\ & = -G(x) + \int V(x - y)G(y)\,dy, \end{aligned}$$
where
$$V(u) = z\kappa_\beta(u) + z\int S_2(u, v; 0)e^{-\beta\phi(u - v)}\,dv,$$
and $S_2(y - x, z - x; 0) = S_2(y, z; x)$ is a two-point component of the operator $S$, corresponding to the configuration $\eta = \{y, z\}$. Using the properties of the operator $S$, see Proposition 5.2, we have for small enough $\varepsilon$:
$$\begin{aligned} \int |V(u)|\,du & \le z\int |\kappa_\beta(u)|\,du + z\int\!\!\int |S_2(u, v; 0)|\,du\,dv \\ & \le z\tilde{C}(\beta) + z\int\!\!\int M\,T(u, v)\,du\,dv \\ & \le z\tilde{C}(\beta) + \frac{z}{M}|||T|||_M \le \varepsilon + \varepsilon|||S|||_B \le 2\varepsilon, \end{aligned}$$
and
$$\begin{aligned} \int |u||V(u)|\,du & \le z\int |u||\kappa_\beta(u)|\,du + z\int\!\!\int |u||S_2(u, v; 0)|e^{-\beta\phi(u - v)}\,du\,dv \\ & \le z\max_u\big(|u||\kappa_\beta(u)|^{1/2}\big)\int |\kappa_\beta(u)|^{1/2}\,du + zM\int\!\!\int T(u, v)|u|\nu(|u|)\,du\,dv \\ & \le z\tilde{C}(\beta)K + \frac{z}{M}K_1\|T\|_M \le \varepsilon(K + 8\varepsilon K_1), \end{aligned}$$
with
$$K = \max_u |u| \cdot |\kappa_\beta(u)|^{1/2}, \qquad K_1 = \max_u |u|\nu(|u|).$$
Lemma 4.2 is proved.
A. Appendix
A.1. Decay of the matrix elements of the operator S. Proposition 5.2
Let us consider the space $B$ of operators $S : \mathcal{L}_1 \to \mathcal{L}_{\ge 2}$ of the form
$$(SG_1)(\xi) = \int \tilde{S}(\xi, u)G_1(u)\,du, \qquad |\xi| \ge 2,$$
satisfying the following estimate:
$$|\tilde{S}(\xi, u)| \le M\,T(\xi - u)\,\nu\Big(\max_{y \in \xi}|y - u|\Big), \tag{A.1}$$
where $T(\xi) \in \mathcal{L}_{\ge 2}$ and $\|T\|_M < \infty$. We introduce the following norm in $B$:
$$|||S|||_B = \inf \|T\|_M,$$
where the infimum is taken over all $T(\xi) \in \mathcal{L}_{\ge 2}$ meeting the estimate (A.1). We prove below that the solution $S$ of the equation
$$S = -L_{22}^{-1}L_{21} + L_{22}^{-1}SL_{11} + L_{22}^{-1}SL_{12}S \tag{A.2}$$
belongs to $B$ and $|||S|||_B \le 8\varepsilon$. To do this we use Eq. (A.2) and show that:
(1) $L_{22}^{-1}S \in B$, if $S \in B$,
(2) $L_{21} \in B$,
(3) $SL_{11} \in B$, if $S \in B$,
(4) $SL_{12}S \in B$, if $S \in B$.
Then we apply the same reasoning as above in Lemma 3.4 to prove the existence of the fixed point of Eq. (A.2) with a small norm.

(1) Using the representation (5.6) we have
$$L_{22}^{-1} = \big(E_{\ge 2} + (L^0_{22})^{-1}L^1_{22}\big)^{-1}(L^0_{22})^{-1}. \tag{A.3}$$
Obviously,
$$|||(L^0_{22})^{-1}S|||_B \le \tfrac{1}{2}|||S|||_B, \tag{A.4}$$
and since $L^0_{22}$ is a diagonal operator, we have to estimate the norm of $(L^0_{22})^{-1}L^1_{22}S$:
$$\begin{aligned} \big((L^0_{22})^{-1}L^1_{22}S\big)(\eta, u) & = \frac{z}{|\eta|}\sum_{\gamma \subseteq \eta}\int \tilde{S}(\gamma \cup x, u)\prod_{y \in \eta \setminus \gamma}\kappa_\beta(x - y)\prod_{y' \in \gamma}e^{-\beta\phi(x - y')}\,dx \\ & \le \frac{z}{|\eta|}M\sum_{\gamma \subseteq \eta}\int T(\gamma \cup x - u)\,\nu\Big(\max_{y \in \gamma \cup x}|y - u|\Big)\prod_{y \in \eta \setminus \gamma}|\kappa_\beta(x - y)|\prod_{y' \in \gamma}e^{-\beta\phi(x - y')}\,dx. \end{aligned} \tag{A.5}$$
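Denote by $\bar{y} \in \eta$ a point of the configuration $\eta$ such that
$$\max_{y \in \eta}|y - u| = |\bar{y} - u|.$$
Then if $\bar{y} \in \gamma \subseteq \eta$,
$$\max_{y \in \gamma \cup x}|y - u| \ge |\bar{y} - u|,$$
and since the function $\nu(r)$ is decreasing on $\mathbb{R}_+$, we have
$$\nu\Big(\max_{y \in \gamma \cup x}|y - u|\Big) \le \nu(|\bar{y} - u|). \tag{A.6}$$
If $\bar{y} \in \eta \setminus \gamma$, then
$$\max_{y \in \gamma \cup x}|y - u| \ge |x - u|,$$
and
$$|\kappa_\beta(x - \bar{y})|^{1/2}\,\nu(|x - u|) \le \bar{C}(\beta)\,\nu(|\bar{y} - u|). \tag{A.7}$$
Here we use the inequality (5.23) and properties of the metric
$$\rho(x_1, x_2) = -\ln \nu(|x_1 - x_2|) = s\ln(1 + |x_1 - x_2|), \qquad x_1, x_2 \in \mathbb{R}^d.$$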
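The inequality (A.7) can be unpacked in two steps. The following sketch reads $\nu(r) = (1 + r)^{-s}$ off the formula for the metric above and assumes $s > 0$; the triangle-inequality argument is our reconstruction of what "properties of the metric" refers to.

```latex
% Step 1: (5.23) with u = x - \bar y, and \nu(r) = (1+r)^{-s}:
\[
|\kappa_\beta(x - \bar y)|^{1/2} < \bar C(\beta)\, \nu(|x - \bar y|).
\]
% Step 2: sub-multiplicativity of \nu along a triangle.
% Write a = |x - \bar y|, b = |x - u|, so |\bar y - u| <= a + b.
\begin{align*}
\nu(a)\,\nu(b) &= \big( (1 + a)(1 + b) \big)^{-s}
               \le (1 + a + b)^{-s} \\
              &\le (1 + |\bar y - u|)^{-s}
               = \nu(|\bar y - u|).
\end{align*}
% Combining both steps gives (A.7):
\[
|\kappa_\beta(x - \bar y)|^{1/2}\, \nu(|x - u|)
  \le \bar C(\beta)\, \nu(|\bar y - u|).
\]
```

Equivalently, $\rho(x_1, x_2) = -\ln\nu(|x_1 - x_2|)$ satisfies the triangle inequality, which is the metric property used here.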
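Thus, (A.6) and (A.7) imply that we have the following estimate from above on (A.5):
$$\begin{aligned} & \Big(\frac{z}{|\eta|}\sum_{\gamma \subseteq \eta:\ \bar{y} \in \gamma}\int T(\gamma \cup x - u)\prod_{y \in \eta \setminus \gamma}|\kappa_\beta(x - y)|\prod_{y' \in \gamma}e^{-\beta\phi(x - y')}\,dx\Big)M\,\nu(|\bar{y} - u|) \\ & \quad + \Big(\frac{z}{|\eta|}\sum_{\gamma \subseteq \eta:\ \bar{y} \notin \gamma}\int T(\gamma \cup x - u)|\kappa_\beta(x - \bar{y})|^{1/2}\prod_{y \in (\eta \setminus \bar{y}) \setminus \gamma}|\kappa_\beta(x - y)|\prod_{y' \in \gamma}e^{-\beta\phi(x - y')}\,dx\Big)\bar{C}(\beta)M\,\nu(|\bar{y} - u|) \\ & = \big(F_1(\eta - u) + \bar{C}(\beta)F_2(\eta - u)\big)M\,\nu(|\bar{y} - u|). \end{aligned} \tag{A.8}$$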
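We denote the expressions in the first and second brackets in (A.8) as $F_1(\eta - u)$ and $F_2(\eta - u)$ respectively:
$$\begin{aligned} F_1(\eta - u) & := \frac{z}{|\eta|}\sum_{\gamma \subseteq \eta:\ \bar{y} \in \gamma}\int T(\gamma \cup x - u)\prod_{y \in \eta \setminus \gamma}|\kappa_\beta(x - y)|\prod_{w \in \gamma}e^{-\beta\phi(x - w)}\,dx \\ & = \frac{z}{|\eta - u|}\sum_{\gamma \subseteq \eta - u:\ \bar{y} - u \in \gamma}\int T(\gamma \cup x')\prod_{y \in (\eta - u) \setminus \gamma}|\kappa_\beta(x' - y)|\prod_{w \in \gamma}e^{-\beta\phi(x' - w)}\,dx', \end{aligned}$$
with $x' = x - u$, and
$$F_2(\eta - u) := \frac{z}{|\eta - u|}\sum_{\gamma \subseteq \eta - u:\ \bar{y}' \notin \gamma}\int T(\gamma \cup x')|\kappa_\beta(x' - \bar{y}')|^{1/2}\prod_{y \in ((\eta - u) \setminus \bar{y}') \setminus \gamma}|\kappa_\beta(x' - y)|\prod_{w \in \gamma}e^{-\beta\phi(x' - w)}\,dx'.$$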
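Using similar reasoning as in Lemma 3.2, we get the estimate on the norm of $F_1$:
$$\|F_1\|_M \le \frac{z}{M}\,e^{MC(\beta)}\|T\|_M \le \varepsilon e\|T\|_M. \tag{A.9}$$
Since $\tilde{C}(\beta) > C(\beta)$, then $MC(\beta) < 1$. Let us consider the norm of $F_2(\eta)$:
$$\begin{aligned} \|F_2\|_M & = z\int \sup_{\eta_1}\Big(\frac{1}{3}\Big)^{|\eta_1|}\sum_{\gamma_1 \subseteq \eta_1 \setminus \bar{y}}\ \sum_{\gamma_2 \subseteq \eta_2 \setminus \bar{y}}\int |T(\gamma_1 \cup \gamma_2 \cup x)||\kappa_\beta(x - \bar{y})|^{1/2} \\ & \qquad \times \prod_{y \in (\eta_1 \setminus \bar{y}) \setminus \gamma_1}|\kappa_\beta(x - y)|\prod_{y \in (\eta_2 \setminus \bar{y}) \setminus \gamma_2}|\kappa_\beta(x - y)|\prod_{y' \in \gamma_1 \cup \gamma_2}e^{-\beta\phi(x - y')}\,dx\,M^{|\eta_2|}\,d\eta_2 \\ & \le z\int \sup_{\eta_1}\Big(\frac{1}{3}\Big)^{|\eta_1|}\sum_{\gamma_2 \subseteq \eta_2 \setminus \bar{y}}\int \sup_{\gamma_1 \subseteq \eta_1 \setminus \bar{y}}|T(\gamma_1 \cup \gamma_2 \cup x)| \\ & \qquad \times \Big(\sum_{\gamma_1 \subseteq \eta_1 \setminus \bar{y}}\ \prod_{y \in (\eta_1 \setminus \bar{y}) \setminus \gamma_1}|\kappa_\beta(x - y)|\prod_{y' \in \gamma_1}e^{-\beta\phi(x - y')}\Big)\prod_{y \in \eta_2 \setminus \gamma_2}|\kappa_\beta(x - y)|^{1/2}\,dx\,M^{|\eta_2|}\,d\eta_2 \\ & \le z\int\!\!\int \sum_{\gamma_2 \subseteq \eta_2 \setminus \bar{y}}\Big(\sup_{\eta_1}\Big(\frac{1}{3}\Big)^{|\eta_1|}\sup_{\gamma_1 \subseteq \eta_1 \setminus \bar{y}}|T(\gamma_1 \cup \gamma_2 \cup x)|\Big)\prod_{y \in \eta_2 \setminus \gamma_2}|\kappa_\beta(x - y)|^{1/2}\,dx\,M^{|\eta_2|}\,d\eta_2 \\ & \le z\int\!\!\int \sum_{\gamma_2 \subseteq \eta_2}\Big(\sup_{\eta_1}\Big(\frac{1}{3}\Big)^{|\eta_1|}|T(\eta_1 \cup \gamma_2 \cup x)|\Big)\prod_{y \in \eta_2 \setminus \gamma_2}|\kappa_\beta(x - y)|^{1/2}M^{|\eta_2|}\,dx\,d\eta_2 \\ & \le \frac{z}{M}\,e^{M\tilde{C}(\beta)}\|T\|_M = \varepsilon e\|T\|_M. \end{aligned} \tag{A.10}$$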
Here we again use the same reasoning as in Lemma 3.2. Now we have from (A.5), (A.8)–(A.10) that
$$|||(L^0_{22})^{-1}L^1_{22}S|||_B \le C_1\varepsilon|||S|||_B,$$
with $C_1 = e(1 + \bar{C}(\beta))$. Finally the representation (A.3) implies that for small enough $\varepsilon$
$$|||L_{22}^{-1}S|||_B \le \tfrac{1}{2}(1 + 2C_1\varepsilon)|||S|||_B. \tag{A.11}$$

(2) The representation (5.15) for the operator $L_{21}$ implies that the corresponding kernel $\tilde{L}_{21}(\xi, u)$ has the form
$$\tilde{L}_{21}(\xi, u) = z\prod_{y \in \xi}\kappa_\beta(u - y),$$
and it can be estimated as follows:
$$\begin{aligned} |\tilde{L}_{21}(\xi, u)| = z\prod_{y \in \xi}|\kappa_\beta(u - y)| & < z\,\Big|\kappa_\beta\Big(\max_{y \in \xi}|u - y|\Big)\Big|^{1/2}\prod_{y \in \xi}|\kappa_\beta(u - y)|^{1/2} \\ & \le \frac{z}{M}\,M\,\bar{C}(\beta)\,\nu\Big(\max_{y \in \xi}|u - y|\Big)\prod_{y \in \xi}|\kappa_\beta(u - y)|^{1/2}. \end{aligned}$$
Using reasoning analogous to that in Lemma 3.3 we have for the function
$$T(\xi) = \frac{z}{M}\,\bar{C}(\beta)\prod_{y \in \xi}|\kappa_\beta(y)|^{1/2}, \qquad |\xi| \ge 2,$$
the following estimate on the norm:
$$\|T\|_M \le 4\bar{C}(\beta)\frac{z}{M}.$$
Thus,
$$|||L_{21}|||_B \le 4\bar{C}(\beta)\varepsilon, \tag{A.12}$$
where $\varepsilon$ is small enough.

(3) It follows from (5.12) that the kernel of the operator $SL_{11}$ has the form:
$$\widetilde{SL_{11}}(\xi, u) = -\tilde{S}(\xi, u) + z\int \tilde{S}(\xi, y)\kappa_\beta(y - u)\,dy.$$
This representation together with estimate (A.7) implies that
$$|||SL_{11}|||_B \le |||S|||_B + z\tilde{C}(\beta)\bar{C}(\beta)|||S|||_B.$$
Hence,
$$|||SL_{11}|||_B \le (1 + \bar{C}(\beta)\varepsilon)|||S|||_B. \tag{A.13}$$

(4) Using the representation for the operator $L_{12}$, see the proof of Lemma 3.3, we have
$$\widetilde{SL_{12}S}(\xi, u) = z\int \big(\tilde{S}(\xi, y) + \tilde{S}(\xi, x)\big)\tilde{S}(\{y; x\}; u)e^{-\beta\phi(x - y)}\,\frac{dx\,dy}{2},$$
and the estimate (A.1) implies
$$\begin{aligned} \big|\widetilde{SL_{12}S}(\xi, u)\big| & = \Big|z\int \big(\tilde{S}(\xi, y) + \tilde{S}(\xi, x)\big)\tilde{S}(\{y, x\}; u)e^{-\beta\phi(x - y)}\,\frac{dx\,dy}{2}\Big| \\ & \le zM^2\int \Big(T(\xi - y)\,\nu\Big(\max_{v \in \xi}|v - y|\Big) + T(\xi - x)\,\nu\Big(\max_{v \in \xi}|v - x|\Big)\Big) \\ & \qquad \times T(y - u, x - u)\,\nu(\max\{|y - u|; |x - u|\})\,\frac{dx\,dy}{2} \\ & \le zM^2\,\nu\Big(\max_{v \in \xi}|v - u|\Big)\int \big(T(\xi - y) + T(\xi - x)\big)T(y - u, x - u)\,\frac{dx\,dy}{2} \\ & \le z\,\nu\Big(\max_{v \in \xi}|v - u|\Big)\sup_y T(\xi - y)\int T(y', x')M^2\,dx'\,dy' \\ & \le z\,\nu\Big(\max_{v \in \xi}|v - u|\Big)T'(\xi)\|T\|_M, \end{aligned}$$
so that
$$|||SL_{12}S|||_B \le \varepsilon|||S|||^2_B. \tag{A.14}$$
The estimates (A.11)–(A.14) imply that for small enough $\varepsilon$ the right-hand side of Eq. (A.2) is a contraction on the ball $B_{8\varepsilon} = \{S \in B : |||S|||_B < 8\varepsilon\}$. Consequently, there exists a unique solution $S \in B$ of Eq. (A.2) with the norm $|||S|||_B < 8\varepsilon$. Since $|||S|||_M \le |||S|||_B$ for $S \in B$, this solution coincides with the operator $S$ constructed above in Lemma 3.4, which gives the invariant subspace $\hat{\mathcal{L}}_{\le 1}$. Proposition 5.2 is completely proved.
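How the four estimates combine can be sketched as follows. The map $\Phi$ and the radii $r$, $r'$ are names introduced here for illustration only, and the bilinear form of (A.14) used for the difference is an assumption: the text states (A.14) only for $SL_{12}S$.

```latex
% \Phi denotes the right-hand side of (A.2) (our notation):
\[
\Phi(S) := -L_{22}^{-1} L_{21} + L_{22}^{-1} S L_{11}
           + L_{22}^{-1} S L_{12} S .
\]
% With r = |||S|||_B, the estimates (A.11)-(A.14) give
\[
|||\Phi(S)|||_B \le \tfrac12 (1 + 2 C_1 \varepsilon)
  \Big( 4 \bar C(\beta)\varepsilon
        + (1 + \bar C(\beta)\varepsilon)\, r
        + \varepsilon r^2 \Big).
\]
% Difference of two images, with r' = |||S'|||_B:
\[
\Phi(S) - \Phi(S') = L_{22}^{-1} \Big( (S - S') L_{11}
   + (S - S') L_{12} S + S' L_{12} (S - S') \Big).
\]
% Assumption: |||S L_{12} S'|||_B <= \varepsilon |||S|||_B |||S'|||_B
% (bilinear version of (A.14)). Then
\[
|||\Phi(S) - \Phi(S')|||_B \le \tfrac12 (1 + 2 C_1 \varepsilon)
  \big( 1 + \bar C(\beta)\varepsilon + \varepsilon (r + r') \big)\,
  |||S - S'|||_B .
\]
% For r, r' < 8\varepsilon the prefactor is at most
% (1/2)(1 + 2 C_1 \varepsilon)(1 + \bar C(\beta)\varepsilon + 16\varepsilon^2),
% which tends to 1/2 as \varepsilon -> 0.
```

The decisive feature in this sketch is the factor $\tfrac{1}{2}$ from (A.11), which keeps the Lipschitz constant below 1 for small $\varepsilon$.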
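A.2. Decay of the correlation functions. Proposition 5.3
We define
$$\delta_{\xi_1}(\xi_2) = \frac{\hat{\varrho}(\xi_1, \xi_2)}{\varrho(\xi_1)} = \varrho(\xi_2|\xi_1) - \varrho(\xi_2), \tag{A.15}$$
where
$$\varrho(\xi_2|\xi_1) = \frac{\varrho(\xi_1 \cup \xi_2)}{\varrho(\xi_1)}$$
is the correlation function of a conditional ensemble (under the condition that $\xi_1$ belongs to the configuration $\eta \in \Gamma$). The correlation functions satisfy the Kirkwood–Salsburg equation, see for instance [24]
$$\varrho(\xi) = z\,e^{-\beta E(\xi', \bar{x})}\int_{\Gamma_0}\varrho(\xi' \cup \eta)\prod_{y \in \eta}\kappa_\beta(\bar{x} - y)\,d\eta, \quad \xi \ne \emptyset, \qquad \varrho(\emptyset) = 1.$$
Here $\bar{x} = \bar{x}(\xi) \in \xi$ is some point of the finite configuration $\xi$, and
$$\xi' = \xi \setminus \{\bar{x}\}, \qquad E(\xi', \bar{x}) = \sum_{y \in \xi'}\phi(\bar{x} - y).$$
Then the conditional correlation functions $\varrho(\xi_2|\xi_1)$ satisfy a similar equation (see [21])
$$\varrho(\xi|\xi_1) = z\,e^{-\beta E(\xi', \bar{x})}e^{-\beta E(\xi_1, \bar{x})}\int_{\Gamma_0}\varrho(\xi' \cup \eta|\xi_1)\prod_{y \in \eta}\kappa_\beta(\bar{x} - y)\,d\eta, \qquad \xi \ne \emptyset,$$
together with the upper bound:
$$\varrho(\xi|\xi_1) \le z^{|\xi|}.$$
Consequently, we get the following equation for $\delta_{\xi_1}(\xi)$:
$$\delta_{\xi_1}(\xi) = \Phi_{\xi_1}(\xi) + z\,e^{-\beta E(\xi', \bar{x})}\int_{\Gamma_0}\delta_{\xi_1}(\xi' \cup \eta)\prod_{y \in \eta}\kappa_\beta(\bar{x} - y)\,d\eta, \qquad \delta_{\xi_1}(\emptyset) = 0, \tag{A.16}$$
with
$$\Phi_{\xi_1}(\xi) = z\,e^{-\beta E(\xi', \bar{x})}\big(e^{-\beta E(\xi_1, \bar{x})} - 1\big)\int_{\Gamma_0}\varrho(\xi' \cup \eta|\xi_1)\prod_{y \in \eta}\kappa_\beta(\bar{x} - y)\,d\eta.$$
We define
$$d(\xi_1, \xi_2) = \min_{x \in \xi_1,\ y \in \xi_2}|x - y|.$$
Using the evident inequality
$$1 - ab < (1 - a) + (1 - b) \quad \text{as} \quad 0 \le a, b < 1,$$
and the estimate (5.23) we have
$$|e^{-\beta E(\xi_1, \bar{x})} - 1| \le \sum_{y \in \xi_1}|\kappa_\beta(\bar{x} - y)| \le \sum_{y \in \xi_1}|\kappa_\beta(\bar{x} - y)|^{1/2} \le |\xi_1|\,\bar{C}(\beta)\,\nu(d(\bar{x}, \xi_1)).$$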
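The "evident inequality" and its use in the first step of this chain can be spelled out. The sketch below assumes that every factor $a_y = e^{-\beta\phi(\bar{x} - y)}$ lies in $[0, 1]$, which holds for instance for a non-negative potential $\phi$; this restriction is our assumption for the illustration and is not stated at this point of the text.

```latex
% Two factors, 0 <= a, b <= 1:
\[
1 - ab = (1 - a) + a\,(1 - b) \le (1 - a) + (1 - b).
\]
% By induction over the number of factors, for a_1, ..., a_n in [0,1]:
\[
1 - \prod_{i=1}^{n} a_i \le \sum_{i=1}^{n} (1 - a_i).
\]
% Apply this with a_y = e^{-\beta\phi(\bar x - y)}, y \in \xi_1
% (assumed to lie in [0,1]), so that 1 - a_y = |\kappa_\beta(\bar x - y)|:
\[
\big| e^{-\beta E(\xi_1, \bar x)} - 1 \big|
  = 1 - \prod_{y \in \xi_1} e^{-\beta\phi(\bar x - y)}
  \le \sum_{y \in \xi_1} |\kappa_\beta(\bar x - y)| .
\]
% Under the same assumption |\kappa_\beta| <= 1, hence
% |\kappa_\beta| <= |\kappa_\beta|^{1/2}, which is the second step.
```

The last step of the chain then follows from (5.23), since $\nu$ is decreasing and $|\bar{x} - y| \ge d(\bar{x}, \xi_1)$ for every $y \in \xi_1$.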
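It implies that
$$\Phi_{\xi_1}(\xi) \le z^{|\xi|}e^{zC(\beta)}\,\bar{C}(\beta)\,\nu(d(\xi, \xi_1)) \cdot |\xi_1|.$$
Let us consider a space $D_{\xi_1}$ of functions on $\Gamma_0$ such that
$$|\Psi(\xi)| < C z^{\frac{|\xi|}{2}}\,\nu(d(\xi, \xi_1)), \quad \xi \ne \emptyset, \qquad \Psi(\emptyset) = 0. \tag{A.17}$$
We put
$$\|\Psi\|_{D_{\xi_1}} = \inf C,$$
where the infimum is taken over all $C$ from the inequality (A.17). Then $\Phi_{\xi_1}(\xi) \in D_{\xi_1}$, and
$$\|\Phi_{\xi_1}\|_{D_{\xi_1}} \le \bar{C}(\beta)e\sqrt{z}\,|\xi_1|. \tag{A.18}$$
We introduce an operator $A$ in $D_{\xi_1}$ of the following form
$$(A\Psi)(\emptyset) = 0, \qquad (A\Psi)(\xi' \cup x) = z\,e^{-\beta E(\xi', x)}\int_{\Gamma_0}\Psi(\xi' \cup \eta)\prod_{y \in \eta}\kappa_\beta(x - y)\,d\eta,$$
then Eq. (A.16) can be written as follows:
$$\delta_{\xi_1} = \Phi_{\xi_1} + A\delta_{\xi_1}. \tag{A.19}$$
Using the estimate (A.7) we have:
$$|||A|||_{D_{\xi_1}} \le \sqrt{z}\,\hat{C}\,e^{\sqrt{z}\tilde{C}(\beta)} \le \sqrt{z}\,e\,\hat{C}, \qquad \text{with} \quad \hat{C} = \max\{1, \bar{C}(\beta)\},$$
when
$$\sqrt{z}\,\tilde{C}(\beta) \le 1. \tag{A.20}$$
Let us remark that under the condition that $\varepsilon = z\tilde{C}(\beta)$ is small enough, the bound (A.20) is valid since
$$\sqrt{z}\,\tilde{C}(\beta) = \sqrt{\varepsilon}\,\sqrt{\tilde{C}(\beta)}.$$
If
$$u \equiv \sqrt{z}\,e\,\hat{C} < 1,$$
then Eq. (A.19) has a unique solution $\delta_{\xi_1}(\xi) \in D_{\xi_1}$ with the norm
$$\|\delta_{\xi_1}\|_{D_{\xi_1}} < \frac{1}{1 - u}\|\Phi_{\xi_1}\|_{D_{\xi_1}}, \tag{A.21}$$
and (A.17), (A.18), (A.21) imply that
$$\frac{|\hat{\varrho}(\xi_1, \xi)|}{\varrho(\xi_1)} = |\delta_{\xi_1}(\xi)| \le \frac{\bar{C}(\beta)e}{1 - u}\,|\xi_1|\,z^{\frac{|\xi|}{2} + \frac{1}{2}}\,\nu(d(\xi, \xi_1)),$$
and consequently,
$$|\hat{\varrho}(\xi_1, \xi)| \le \frac{\bar{C}(\beta)e}{1 - u}\,|\xi_1|\,\sqrt{z}\,z^{\frac{|\xi|}{2} + |\xi_1|}\,\nu(d(\xi, \xi_1)). \tag{A.22}$$
Using the symmetry property we have
$$|\hat{\varrho}(\xi_1, \xi)| \le \frac{\bar{C}(\beta)e}{1 - u}\,|\xi|\,\sqrt{z}\,z^{\frac{|\xi_1|}{2} + |\xi|}\,\nu(d(\xi, \xi_1)). \tag{A.23}$$
Finally, (A.22) and (A.23) imply the estimate (5.27).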
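The last step, from (A.22) and (A.23) to a bound of the form (5.27), can be made explicit with a geometric mean. In the sketch below $A_0$ is our own abbreviation for the common prefactor; it does not appear in the paper.

```latex
% Common prefactor (our abbreviation):
\[
A_0 := \frac{\bar C(\beta)\, e}{1 - u}\, \sqrt{z}\;
       \nu(d(\xi, \xi_1)).
\]
% (A.22) and (A.23) read
\[
|\hat\varrho(\xi_1, \xi)| \le A_0\, |\xi_1|\, z^{\frac{|\xi|}{2} + |\xi_1|},
\qquad
|\hat\varrho(\xi_1, \xi)| \le A_0\, |\xi|\, z^{\frac{|\xi_1|}{2} + |\xi|}.
\]
% If a quantity is bounded by both p and q (p, q >= 0), it is bounded
% by min(p, q) <= sqrt(p q). The exponents of z average to
%   ( |\xi|/2 + |\xi_1| + |\xi_1|/2 + |\xi| ) / 2
%     = (3/4) ( |\xi_1| + |\xi| ),
% hence
\[
|\hat\varrho(\xi_1, \xi)| \le
  A_0\, \sqrt{|\xi_1| \cdot |\xi|}\;
  z^{\frac34 (|\xi_1| + |\xi|)} ,
\]
% which has the form of the right-hand side of (5.27) with \xi_2 = \xi.
```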
Acknowledgment
The work is partially supported by DFG grants 436 RUS 113/485 and 436 RUS 113/747, INTAS grant "Stochastic Analysis", RFBR grant 02-01-00444, Scientific School grant 934.2003.1.

References
[1] S. Albeverio, Lectures on Stochastic Dynamics, 30th Saint Flour Summer School, Lect. Notes in Math. 1816 (Springer Verlag, Berlin, 2003).
[2] S. Albeverio, Yu. G. Kondratiev and M. Röckner, Analysis and geometry on configuration spaces. The Gibbsian case, J. Funct. Anal. 157 (1998) 242–291.
[3] N. Angelescu, R. A. Minlos and V. A. Zagrebnov, The lower spectral branch of the generator of the stochastic dynamics for the classical Heisenberg model, in On Dobrushin's way: From probability theory to statistical physics, eds. R. A. Minlos, S. Shlosman and Yu. M. Suhov, Amer. Math. Soc. Trans. 198(2) (2002) 1–11.
[4] Yu. M. Berezansky and Yu. G. Kondratiev, Spectral Methods in Infinite Dimensional Analysis, Vols. 1, 2 (Kluwer Academic Publisher, Dordrecht–Boston–London, 1995).
[5] Yu. M. Berezansky, Yu. G. Kondratiev, T. Kuna and E. Lytvynov, On a spectral representation for correlation measures in configuration space analysis, Methods Funct. Anal. Topology 5(4) (1999) 87–100.
[6] L. Bertini, N. Cancrini and F. Cesi, The spectral gap for a Glauber-type dynamics in a continuous gas, Ann. Inst. H. Poincaré Probab. Statist. 38 (2002) 91–108.
[7] E. Glötzl, Time reversible and Gibbsian point processes. I. Markovian spatial birth and death processes on a general phase space, Math. Nachr. 102 (1981) 217–222.
[8] R. A. Holley and D. W. Stroock, Nearest neighbor birth and death processes on the real line, Acta Math. 140 (1978) 103–154.
[9] Y. Ito and I. Kubo, Calculus on Gaussian and Poisson white noises, Nagoya Math. J. 111 (1988) 41–84.
[10] Yu. G. Kondratiev and R. A. Minlos, One-particle subspaces in the stochastic XY model, J. Stat. Phys. 87(3/4) (1997) 613–642.
[11] Yu. G. Kondratiev, R. A. Minlos and E. A. Zhizhina, Lower branches of the spectrum of Hamiltonians for infinite quantum system with compact spin, Transactions of Moscow Math. Soc. Vol. 60, Moscow 1998, pp. 259–302.
[12] Yu. G. Kondratiev and T. Kuna, Harmonic analysis on configuration spaces I. General theory, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 5 (2002) 201–233.
[13] Yu. Kondratiev and E. Lytvynov, Glauber dynamics of continuous particle systems, SFB-611 preprint, to appear in Ann. Inst. H. Poincaré (2003).
[14] Yu. G. Kondratiev and M. J. Oliveira, Invariant measures for Glauber dynamics of continuous systems, BiBoS-Preprint (2003).
[15] R. Lang, Unendlichdimensionale Wienerprozesse mit Wechselwirkung, Z. Wahrsch. Verw. Gebiete 38 (1977) 55–72.
[16] R. Lang, Unendlichdimensionale Wienerprozesse mit Wechselwirkung, Z. Wahrsch. Verw. Gebiete 39 (1977) 277–299.
[17] V. A. Malyshev and R. A. Minlos, Linear Infinite-Particle Operators (American Mathematical Society Publ., 1995).
[18] F. Martinelli, Lectures on Glauber Dynamics for Discrete Spin Models, Saint Flour Summer School, Lect. Notes in Math. 1717 (Springer Verlag, Berlin, 1997), pp. 93–191.
[19] R. A. Minlos, Invariant subspaces of Ising stochastic dynamics (for small β), Markov Processes and Related Fields 2(2) (1996) 263–284.
[20] R. A. Minlos and Yu. M. Suhov, On the spectrum of the generator of an infinite system of interacting diffusions, Commun. Math. Phys. 206 (1999) 463–489.
[21] R. A. Minlos, Limiting Gibbs distributions, Funktsional'nyj Analiz i Ego Prilozhenija 1(2) (1967) 60–73.
[22] R. A. Minlos, Lectures on statistical physics, Uspechi Mat. Nauk 23(1) (1968) 133–190.
[23] C. Preston, Spatial birth-and-death processes, in Proceedings of the 40th Session of the International Statistical Institute (Warsaw, 1975), Vol. 2, Bull. Inst. Internat. Statist. 46 (1975) 371–391.
[24] D. Ruelle, Statistical Mechanics. Rigorous Results (Benjamins, Amsterdam, 1969).
[25] D. Ruelle, Superstable interaction in classical statistical mechanics, Commun. Math. Phys. 18 (1970) 127–159.
[26] D. Surgailis, On multiple Poisson stochastic integrals and associated Markov semigroups, Probab. Math. Statist. 3 (1984) 217–239.
[27] D. Surgailis, On Poisson multiple stochastic integrals and associated equilibrium Markov processes, in Theory and Application of Random Fields (Bangalore, 1982), Lecture Notes in Control and Inform. Sci. 49 (Springer, Berlin, 1983), pp. 233–248.
[28] L. Wu, Estimate of spectral gap for continuous gas, preprint (2003).
December 23, 2004 10:56 WSPC/148-RMP
00223
Reviews in Mathematical Physics Vol. 16, No. 9 (2004) 1115–1189
© World Scientific Publishing Company

REVISITING THE CHARGE TRANSPORT IN QUANTUM HALL SYSTEMS

TOHRU KOMA
Department of Physics, Gakushuin University
Mejiro, Toshima-ku, Tokyo 171-8588, Japan
[email protected]

Received 19 February 2004
Revised 29 September 2004

We re-examine the charge transport induced by a weak electric field in two-dimensional quantum Hall systems in a finite, periodic box at very low temperatures. Our model covers random vector and electrostatic potentials and electron–electron interactions. The resulting linear response coefficients consist of the time-independent term σxy corresponding to the Hall conductance and the linearly time-dependent term γsy · t in the transverse and longitudinal directions s = x, y in a slow switching limit for adiabatically applying the initial electric field. The latter terms γsy · t are due to the acceleration of the electrons by the uniform electric field in the finite and isolated system, and so the time-independent term σyy corresponding to the diagonal conductance which generates dissipation of heat always vanishes. The well-known topological argument yields the integral and fractional quantization of the averaged Hall conductance σxy over gauge parameters under the assumption that there exists a spectral gap above the ground state. In addition to this fact, we show that the averaged acceleration coefficients γsy vanish under the same assumption. In the non-interacting case, the spectral gap between the neighboring Landau levels persists if the vector and the electrostatic potentials together satisfy a certain condition, and then the Hall conductance σxy without averaging exhibits the exact integral quantization with the vanishing acceleration coefficients in the infinite volume limit. We also estimate their finite size corrections. In the interacting case, the averaged Hall conductance σxy for a non-integer filling of the electrons is quantized to a fraction not equal to an integer under the assumption that the potentials satisfy certain conditions in addition to the gap assumption. We also discuss the relation between the fractional quantum Hall effect and the Atiyah–Singer index theorem for non-Abelian gauge fields.

Keywords: Charge transport; linear response theory; quantum Hall effect; geometric invariants; non-Abelian gauge fields.
Contents
1. Introduction
2. The Model and the Main Results
3. Derivation of the Linear Response Coefficients
4. The System with Translation Invariance
5. The Linear Response Coefficients Averaged over the Gauge Parameters
5.1. Proof of Theorem 2.2
5.2. Fractional quantization and Atiyah–Singer index theorem
6. The Non-Interacting Case
6.1. The single-electron Landau Hamiltonian
6.2. The general electron gases
7. The Interacting Case
7.1. Boundedness of the Hall conductance σxy(φ)
7.2. Fractional quantization of the Hall conductance σxy(φ)
A. Differentiability of the Ground-State Wavefunctions
B. Proof of Proposition 6.2
C. Proof of Proposition 6.4
D. Proofs of Theorems 7.2 and 7.5
E. Proofs of Theorems 7.6, 7.8 and 7.12
F. Estimate of the Ground State Energies $E^{(N)}_{0,\mu}(\phi)$
References
1. Introduction
The linear response theory [1, 2] for charge transport successfully elucidates some aspects of the quantum Hall effect observed experimentally [3, 4] in two-dimensional electron gases in a strong magnetic field. In particular, it was found that the integral quantization of the Hall conductance is a consequence of the topological nature of the Hall conductance [5, 6]. However, the derivation of the linear response formulas for conductance from the first principle is still an unsolved problem [7, 8]. Actually it is very hard to take into account the effect of the reservoir explicitly. Needless to say, there have been many, varied arguments, each employing some simplifying feature. For example, the infinite volume formalism without taking the infinite volume limit from a sequence of finite volumes and the adiabatic (slowly varying) switching of the external electric field to avoid accelerating the electrons have been often used instead of coupling the corresponding finite system to a reservoir.^a Thus the issue is still left somewhat hanging although it has been debated again and again.

^a For recent attempts to justify the linear response formulas, see [8, 9].

Apart from the problem of the validity of the linear response formulas, the quantized Hall conductance was identified with a topological invariant of a certain fiber bundle [10] by using the resulting linear response formulas as mentioned above. The topological argument for the Hall conductance was first introduced into a quantum Hall system of non-interacting electron gas in a periodic potential [5, 6]. As a result, it was shown that the Hall conductance is quantized to an integer under the assumption that there exists a spectral gap above the unique ground state. The integer of the quantization is equal to the filling factor of the Landau levels when the periodic potential is weak. However, one cannot expect the appearance of the conductance plateaus for varying the filling of the electrons because of the absence of disorder.^b

^b The appearance of the Hall conductance plateaus was discussed with localization estimates in [7, 8, 11, 12]. We will discuss this issue in the next paper [13].

Soon after these articles, this topological argument^c was extended to a quantum Hall system with disorder and with electron–electron interactions [18–21]. Instead of the crude Hall conductance with fixed gauge parameters, the Hall conductance was averaged over gauge parameters, and it was shown that the averaged Hall conductance is quantized to an integer under the assumption that there exists a spectral gap above the ground state. The integer of the quantization is equal to the filling factor of the Landau levels in the case of the non-interacting electron gas with weak single-body potentials [11] as well as the above case with a periodic potential. Surprisingly the averaged Hall conductance does not have any finite size correction to the exact integral quantization even though the system has disorder. Clearly one cannot expect that the crude Hall conductance is exactly quantized for any finite system.

^c See also related articles [14–17].

In order to explain the fractional quantization [22–24] of the Hall conductance, this topological argument needs some ad hoc assumptions on the degeneracy of the ground state in addition to the assumption on averaging the Hall conductance [19, 25–27]. In fact, if the ground state is non-degenerate with an excitation gap, then the averaged Hall conductance always shows integral quantization. The explicit value of the fraction of the quantized Hall conductance is not determined by the topological argument alone even if the dimension of the sector of the degenerate ground state is given. A degenerate ground state with an excitation gap is expected to appear only for a fractional filling p/q of the electrons [28–30]. Actually, the existence of a spectral gap for a non-integer filling of the electrons implies a degenerate ground state for a certain model [31]. In the real experiments, the fraction of the quantized Hall conductance is observed to be equal to the fractional filling p/q. Under the assumption on the spectral gap, it was proved without relying on the topological argument that the Hall conductance is proportional to the filling factor of the electrons for certain interacting electron gases [32]. However, the existence of the spectral gap itself has not yet been proved for any interacting electron gas. Besides, the relation between the fractionally quantized Hall conductance and the filling factor of the electrons is still unclear. We should remark that a set of the possible quantized values of the Hall conductance can be derived from a mathematical argument relying on the universality. See [33].

We should also note another topological approach by Bellissard et al. [8]. For an infinite volume quantum Hall system of a non-interacting electron gas with disorder, the quantized Hall conductance is identified with a Fredholm index of a certain operator that arises in Connes' theory of non-commutative geometry [34]. (See also [7, 12, 35, 36].) In comparison to the above topological approach, Bellissard's
framework has the advantage that it does not need the assumption on averaging the Hall conductance over gauge parameters. However, it has not yet been extended to interacting quantum Hall electron gases. In this paper, we study a two-dimensional N -electron system in a uniform magnetic field perpendicular to the two-dimensional plane in which the electrons are confined. For simplicity we assume that the electrons do not have the spin degreesof-freedom, although we can treat a similar system with both the spin degrees-offreedom and multiple layers in the same way. The explicit form of the Hamiltonian of the system is given by (2.1) in the next section. The model covers a wide class of potentials including a random vector potential, a random electrostatic potential and an electron–electron interaction. In order to measure an induced current as a response to an external electric field, we apply a time-dependent vector potential Aex (t) = (0, α(t)), where the function α(t) of time t is given by (2.14) in the next section. For t ∈ [−T, 0] with a large positive T , the corresponding electric field is adiabatically switched on, and for t ≥ 0, the electric field becomes (0, F ) with the constant strength F . We consider the finite, isolated system of an Lx × Ly rectangular box, and impose periodic boundary conditions. Thus we will not consider a reservoir, and clearly the system does not exhibit any dissipation of heat. In this sense, we cannot measure the conductance. But the quantum Hall systems in the real experiments show negligibly small dissipation. Correspondingly the system we consider in this paper is expected to show weak acceleration of the electrons. Actually we will show that the acceleration is weak in a certain sense, and the constant Hall current flow is dominant. 
From these results, if the system is connected with a reservoir, then the acceleration of the electrons is expected to be further suppressed, and the linear response coefficient corresponding to the Hall conductance can be identified with the realistic one. Thus we believe that it is useful in future studies to re-examine the charge transport in such a finite, isolated quantum Hall system.

Let us describe our results. The precise statement of the main results will be given in the next section. The linear response coefficients are given by

$$\sigma_{\mathrm{tot},sy} = \lim_{F\to 0}\frac{j_{\mathrm{ind},s}}{F}, \tag{1.1}$$

where jind,s are the induced currents in the transverse direction s = x and the longitudinal direction s = y to the electric field. We obtain the generic forms of the coefficients for time t ≥ 0 as follows:

$$\sigma_{\mathrm{tot},xy} = \sigma_{xy} + \gamma_{xy}\cdot t + \delta\sigma_{xy} \tag{1.2}$$

and

$$\sigma_{\mathrm{tot},yy} = \gamma_{yy}\cdot t + \delta\sigma_{yy}. \tag{1.3}$$

The first term σxy in the right-hand side of (1.2) is constant in time t, and corresponds to the Hall conductance. As mentioned above, the time-independent term σyy corresponding to the diagonal conductance is absent. Instead of that, the linearly time-dependent terms γsy · t appear, i.e. there exist terms corresponding to
the acceleration of the electrons by the external electric field. The appearance of the acceleration term γxy · t in the transverse direction is due to the disorder scattering of the electrons accelerated in the longitudinal direction. When the system couples to a reservoir, we can expect that the acceleration terms γsy · t disappear, and instead of them, the diagonal conductance σyy appears. The remaining two terms δσsy are the corrections depending on the initial switching process for adiabatically applying the external electric field. We prove that these two terms are negligibly small for slow switching.

In the case without a uniform magnetic field, we demonstrate in Sec. 4 that the velocity of the electrons increases in time for an interacting electron gas with translation invariance. More precisely, we have

$$\sigma_{xy} = 0, \qquad \gamma_{xy} = 0, \qquad\text{and}\qquad \gamma_{yy} = \frac{ne^2}{m_e}, \tag{1.4}$$

where n is the density of the electrons, and −e and me are, respectively, the charge and the mass of the electron. For an interacting electron gas with translation invariance in a uniform magnetic field [37], we also demonstrate

$$\sigma_{xy} = -\frac{e^2}{h}\,\nu \qquad\text{and}\qquad \gamma_{sy} = 0 \quad\text{for } s = x, y, \tag{1.5}$$

where h is the Planck constant, and ν is the filling factor of the Landau levels.

Using the explicit expression of the linear response coefficients, we focus on the unsolved issues about the charge transport of the quantum Hall effect. For this purpose, we assume that there exists an excitation gap above the sector of the (quasi)degenerate ground state. Since our Hall conductance σxy is the same as the standard one, the well-known topological argument yields the fractional quantization of the Hall conductance [19, 25–27] as

$$\overline{\sigma_{xy}} = -\frac{e^2}{h}\,\frac{p}{q}, \tag{1.6}$$

under the assumption on the gap, where $\overline{\cdots}$ denotes the average over gauge parameters, the integer p is given by the geometrical invariant^d called the first Chern number, and the integer q is the dimension of the sector of the ground state. Under the same assumption, we prove

$$\overline{\gamma_{sy}} = 0 \quad\text{for } s = x, y. \tag{1.7}$$

Thus the acceleration of the electrons is absent in the sense of the average, and so we can expect that, for fixed gauge parameters, the acceleration of the electrons is weak.

^d In the following, we will use the term “geometrical” instead of “topological” because we will not consider any deformation of a manifold nor a change of the local coordinates.

In the non-interacting case, assume an integer filling ν = ℓ of the Landau levels. Then we prove that a spectral gap exists above the unique ground state for certain
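weak potentials, and that the Hall conductance σxy and the acceleration coefficients γsy satisfy

$$\left|\sigma_{xy} + \frac{e^2}{h}\,\ell\right| \le \mathrm{const.}\times\max\{L_x^{-1}, L_y^{-1}\} \tag{1.8}$$

and

$$|\gamma_{sy}| \le \mathrm{const.}\times\max\{L_x^{-1}, L_y^{-1}\} \quad\text{for } s = x, y, \tag{1.9}$$
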
where Lx, Ly are the system sizes. This Hall conductance σxy is not averaged over the gauge parameters, and the result gives the upper bound for the finite size correction^e to the quantized value −(e²/h)ℓ. The second inequality for γsy implies weak acceleration of the electrons. In particular, it vanishes in the infinite volume.

In the general case of the interacting electron gas, we can remove neither the assumption on the existence of a spectral gap above the sector of the ground state nor the assumption on the average over the gauge parameters. As for the assumption on the average, for general values of the gauge parameters, we cannot expect an exact fractional quantization of the Hall conductance as in (1.6), and finite-size corrections to the quantized Hall conductance should appear. Besides, the fraction p/q cannot be determined by the geometrical argument alone. But we can get the following result: in addition to the assumption on the existence of the spectral gap, if the potentials satisfy certain technical assumptions, then the fraction p/q must satisfy the bound

$$\nu(1-\delta) \le \frac{p}{q} \le \nu(1+\delta), \tag{1.10}$$

where ν is the filling factor of the electrons, and δ is a positive number which is determined by certain norms of the single-body potentials. In order to clarify the meaning of the bound (1.10), consider the situation where the interval [ν(1 − δ), ν(1 + δ)] does not include any integer. This situation is indeed realized for a non-integer filling factor ν and for weak single-body potentials. Then the fraction p/q must be a non-integer, and the degeneracy q of the ground states must be greater than 1. Thus the fractional quantization of the Hall conductance occurs with a degenerate ground state for a non-integer filling.

^e The upper bound for the finite size correction would not be optimal [38]. But the inequality (1.8) gives the mathematically rigorous upper bound!

The present paper is organized as follows: in Sec. 2, we give the precise definition of the model and describe our main theorems in a mathematically rigorous manner. In Sec. 3, the linear response coefficients are derived, starting from the basic Schrödinger equation with a time-dependent gauge field which gives a constant electric field for the time t ≥ 0. We check in Sec. 4 that the linear response coefficients so obtained are physically reasonable ones for certain translationally invariant systems. Section 5 is devoted to the proofs of the fractional quantization (1.6) of the averaged Hall conductance $\overline{\sigma_{xy}}$ and of the vanishing (1.7) of the averaged acceleration coefficients $\overline{\gamma_{sy}}$ in the most general setting. We also discuss the
relation between the fractional quantum Hall effect and the Atiyah–Singer index theorem for non-Abelian gauge fields [39–42]. The non-interacting electron gases with disorder are treated in Sec. 6. As a result, we obtain the bound (1.8) for the Hall conductance and the bound (1.9) for the acceleration coefficients. In Sec. 7, the interacting electron gases with disorder are treated, and we prove the bound (1.10) for the fraction of the quantized Hall conductance (1.6). Appendices A–F are devoted to the details of technical calculations and proofs of propositions and theorems.

2. The Model and the Main Results

Consider a two-dimensional N-electron system in a uniform magnetic field (0, 0, B) perpendicular to the x-y plane in which the electrons are confined. For simplicity we assume that the electrons do not have the spin degrees-of-freedom, although we can treat a similar system with the spin degrees-of-freedom or with multiple layers in the same way. The Hamiltonian is given by

$$H_0^{(N)} = \sum_{j=1}^{N}\left\{\frac{1}{2m_e}\,[\mathbf{p}_j + e\mathbf{A}(\mathbf{r}_j) + \boldsymbol{\phi}]^2 + W(\mathbf{r}_j)\right\} + \sum_{1\le i<j\le N} W^{(2)}(\mathbf{r}_i - \mathbf{r}_j), \tag{2.1}$$

where −e and me are, respectively, the charge and the mass of the electron, and rj = (xj, yj) is the jth Cartesian coordinate of the N electrons. As usual, we define

$$p_{x,j} = -i\hbar\frac{\partial}{\partial x_j} \qquad\text{and}\qquad p_{y,j} = -i\hbar\frac{\partial}{\partial y_j} \tag{2.2}$$

with the Planck constant ħ. The system is defined on a rectangular box,

$$T := \left[-\frac{L_x}{2}, \frac{L_x}{2}\right]\times\left[-\frac{L_y}{2}, \frac{L_y}{2}\right], \tag{2.3}$$

with the periodic boundary conditions. The vector potential A = (Ax, Ay) consists of two parts as A = AP + A0, where A0(r) = (−By, 0) gives the uniform magnetic field and the vector potential AP satisfies the periodic boundary conditions

$$\mathbf{A}_P(x, y) = \mathbf{A}_P(x + L_x, y) = \mathbf{A}_P(x, y + L_y). \tag{2.4}$$

We have also introduced the gauge parameters φ = (φx, φy) ∈ Tg, where the space Tg ⊂ R² of the gauge parameters φ is defined as

$$T_g := [0, \Delta\phi_x]\times[0, \Delta\phi_y] \quad\text{with}\quad \Delta\phi_s := \frac{2\pi\hbar}{L_s}, \quad s = x, y. \tag{2.5}$$

We call Tg the gauge torus. As we will see in Sec. 5, the Hall conductance σxy of the present system can be expressed in terms of a geometric invariant on the gauge torus Tg. We assume AP ∈ C¹(R², R²), i.e. the components are continuously differentiable on
R². Further we assume that the single-body potential W and the electron–electron interaction W^(2) satisfy the following conditions: the periodic boundary conditions

$$W(x + L_x, y) = W(x, y + L_y) = W(x, y) \tag{2.6}$$

and

$$W^{(2)}(x + L_x, y) = W^{(2)}(x, y + L_y) = W^{(2)}(x, y), \tag{2.7}$$

and the boundedness^f

$$\|W\|_\infty < w_0 < \infty \qquad\text{and}\qquad \|W^{(2)}\|_\infty < w_0^{(2)} < \infty \tag{2.8}$$

with the positive constants $w_0$ and $w_0^{(2)}$ which are independent of the number N of the electrons and of the system sizes Lx, Ly; the interaction W^(2) is invariant under the interchange of two electrons’ coordinates as

$$W^{(2)}(-x, -y) = W^{(2)}(x, y). \tag{2.9}$$

^f Let f be a complex-valued function on R², and let |f(x, y)| ≤ C for some C except for a subset of Lebesgue measure zero in R². Then the norm ‖f‖∞ is given by the smallest such C. If f is a continuous function, then ‖f‖∞ = max |f(x, y)|.

For our purpose of accelerating the electrons on the torus T by the electric field, it is convenient to choose the periodic boundary conditions (2.10) and (2.11) below for the wavefunctions. Then the magnetic flux piercing the torus must be quantized [31, 32] so that the Hamiltonian is self-adjoint, or we need additional conditions for the wavefunctions. See [37] for other choices of boundary conditions with a non-quantized flux. In the present paper, we choose the flux quantization condition, $L_xL_y = 2\pi M\ell_B^2$, with a sufficiently large positive integer M, where $\ell_B$ is the so-called magnetic length defined as $\ell_B := \sqrt{\hbar/(eB)}$. The number M is exactly equal to the number of the states in a single Landau level of the single-electron Hamiltonian in the simple uniform magnetic field with no single-body potential and with no electric field. For simplicity we take M even. We define the filling factor by ν = N/M. We assume ν < ν0 with a positive constant ν0 which is independent of Lx, Ly and N. In other words, the filling factor ν converges to a finite constant in an infinite volume limit.

The above condition $L_xL_y = 2\pi M\ell_B^2$ for the sizes Lx, Ly is convenient for imposing the following periodic boundary conditions: for an N-electron wavefunction Φ^(N), we impose periodic boundary conditions

$$t_j^{(x)}(L_x)\,\Phi^{(N)}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = \Phi^{(N)}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) \quad\text{for } j = 1, 2, \ldots, N, \tag{2.10}$$

and

$$t_j^{(y)}(L_y)\,\Phi^{(N)}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = \Phi^{(N)}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) \quad\text{for } j = 1, 2, \ldots, N, \tag{2.11}$$

where t^(x)(· · ·) and t^(y)(· · ·) are magnetic translation operators [43] defined as

$$t^{(x)}(x')f(x, y) = f(x - x', y), \qquad t^{(y)}(y')f(x, y) = \exp\!\left[\frac{i y' x}{\ell_B^2}\right] f(x, y - y') \tag{2.12}$$

for a function f on R², and a subscript j of an operator indicates that the operator acts on only the jth coordinate of the function.^g

In order to get the expression of the current induced by an external electric field, we introduce a time-dependent vector potential [44] into the Hamiltonian (2.1) as

$$H^{(N)}(t) = \sum_{j=1}^{N}\left\{\frac{1}{2m_e}[p_{x,j} + eA_x(\mathbf{r}_j) + \phi_x]^2 + \frac{1}{2m_e}[p_{y,j} + eA_y(\mathbf{r}_j) + \phi_y + e\alpha(t)]^2\right\} + \sum_{j=1}^{N} W(\mathbf{r}_j) + \sum_{1\le i<j\le N} W^{(2)}(\mathbf{r}_i - \mathbf{r}_j), \tag{2.13}$$

where the additional vector potential is given by A_ex(t) = (0, α(t)) with

$$\alpha(t) = -Ft\times\begin{cases} e^{\eta t}, & t \le 0;\\ 1, & t > 0,\end{cases} \tag{2.14}$$

with a small positive parameter η. The corresponding electric field is oriented along the y direction with the constant strength F for all t ≥ 0. Namely we apply the electric field adiabatically from the initial time t = t0 = −T with a large T, and observe the currents of the system for the time t ≥ 0. Therefore we will consider those quantities for the time t ≥ 0 only in the following.

Throughout the present paper, we will consider the following two situations: (i) the ground state of the Hamiltonian $H_0^{(N)}$ of (2.1) is exactly q-fold degenerate (q = 1, 2, . . .), with a small excitation energy gap which tends to zero in the infinite volume limit; (ii) the ground state is q-fold quasidegenerate with a uniform excitation energy gap which persists in the infinite volume limit in the sense of Assumption 2.1 below. In both situations, we take the initial state of the system at the time t = t0 = −T as

$$\omega_0(\cdots) := \frac{1}{q}\sum_{\mu=1}^{q}\left\langle \Phi_{0,\mu}^{(N)}(\phi),\, (\cdots)\,\Phi_{0,\mu}^{(N)}(\phi)\right\rangle, \tag{2.15}$$

where $\Phi_{0,\mu}^{(N)}(\phi)$ are the q eigenvectors of the ground state of the Hamiltonian $H_0^{(N)}$ of (2.1). Namely we assume that the system is at a low temperature such that the corresponding inverse temperature β satisfies the condition $\Delta\mathcal{E} \ll \beta^{-1} \ll \Delta E$, where ΔE is the excitation energy gap and $\Delta\mathcal{E} = \max_{\mu,\mu'}|E_{0,\mu}^{(N)}(\phi) - E_{0,\mu'}^{(N)}(\phi)|$ with the energy eigenvalue $E_{0,\mu}^{(N)}(\phi)$ of the ground state eigenvector $\Phi_{0,\mu}^{(N)}(\phi)$. In the corresponding realistic situation, the transition between the degenerate ground states frequently occurs with finite probabilities owing to an external thermal perturbation. As a consequence, all of the ground states are equally mixed as in the assumption of the initial state (2.15).

^g Throughout the present paper, we use this convention.

But, when a symmetry breaking occurs at zero temperature, those transition probabilities become negligibly small in a large
volume. In that situation, the assumption of the initial state (2.15) may be physically unnatural. Instead of the mixed state (2.15), we might have to take one of the symmetry breaking pure ground states as an initial state. But we can expect that all of the pure ground states give the same current because the broken symmetry is the translational symmetry [31, 32]. Here we stress that we cannot justify this argument. To summarize, in both of the cases, we can expect that the assumption of the initial state (2.15) leads to the realistic, correct current for the present quantum Hall system.

In general, it is believed that the existence of an energy gap above the ground state for an integral or a fractional filling of the electrons is essential to both integral and fractional quantization of the Hall conductance. In addition, the degeneracy [19, 27] of the ground state is essential to the fractional quantization because the unique ground state with an energy gap always yields an integral quantization of the conductance by the well-known topological argument [19, 20]. However, as Tao and Haldane [26] pointed out, one cannot expect the exact degeneracy of the ground state because the randomness of the potential(s) always lifts the degeneracy for a finite system. Therefore we require the following assumption on the quasidegeneracy of the ground state with the excitation energy gap in the quantum Hall case:

Assumption 2.1. For any φ ∈ Tg, the ground state of the Hamiltonian $H_0^{(N)}$ of (2.1) is q-fold degenerate in the sense that

$$\max_{\mu,\mu'\in\{1,2,\ldots,q\}}\left|E_{0,\mu}^{(N)}(\phi) - E_{0,\mu'}^{(N)}(\phi)\right| \to 0 \quad\text{as } L_x, L_y \to \infty, \tag{2.16}$$

where $E_{0,\mu}^{(N)}(\phi)$, µ = 1, 2, . . . , q, are the energy eigenvalues of the ground state. Besides, there exists a uniform energy gap ΔE above the degenerate ground state in the sense that

$$\inf_{\phi\in T_g} E_1^{(N)}(\phi) > \Delta E + \sup_{\phi\in T_g}\,\max_{\mu\in\{1,2,\ldots,q\}} E_{0,\mu}^{(N)}(\phi), \tag{2.17}$$

where $E_1^{(N)}(\phi)$ is the energy of the first excited state, and ΔE is a positive constant which is independent of the number N of the electrons and of the system sizes Lx, Ly.

This assumption is justified for the non-interacting case with the potentials AP and W satisfying the condition (2.36) in Theorem 2.3 below. Unfortunately, for the interacting case, we cannot justify the assumption of the gap. We call the subspace of the (quasi)degenerate ground state the sector of the ground state. We remark that the dimension q of the sector of the ground state may depend on the system sizes Lx, Ly and on the number N of the electrons.

The state of the system at the time t ≥ t0 is given by

$$\omega(\cdots; t) := \omega_0\!\left([U^{(N)}(t, t_0)]^\dagger (\cdots)\, U^{(N)}(t, t_0)\right). \tag{2.18}$$

Here U^(N)(t, t0) is the time evolution operator for the Schrödinger equation,

$$i\hbar\frac{\partial}{\partial t}\Psi^{(N)}(t) = H^{(N)}(t)\Psi^{(N)}(t), \tag{2.19}$$

with the time-dependent Hamiltonian H^(N)(t) of (2.13). Namely the solution of the equation is written as Ψ^(N)(t) = U^(N)(t, t0)Ψ^(N)(t0) in terms of the operator U^(N)(t, t0) with the initial vector Ψ^(N)(t0).

Now we define the total velocity operator as

$$v_{\mathrm{tot},i}(t) := \begin{cases} \dfrac{1}{m_e}\displaystyle\sum_{j=1}^{N}[p_{x,j} + eA_x(\mathbf{r}_j) + \phi_x], & i = x; \qquad (2.20)\\[2ex] \dfrac{1}{m_e}\displaystyle\sum_{j=1}^{N}[p_{y,j} + eA_y(\mathbf{r}_j) + \phi_y + e\alpha(t)], & i = y. \qquad (2.21)\end{cases}$$

Then the total current density is given by

$$\mathbf{j}_{\mathrm{tot}}(t) := -\frac{e}{L_xL_y}\,\omega(\mathbf{v}_{\mathrm{tot}}(t); t), \tag{2.22}$$

where vtot(t) = (vtot,x(t), vtot,y(t)). This total current density consists of the initial current density j0 and the induced current density jind(t) due to the external electric field as

$$\mathbf{j}_{\mathrm{tot}}(t) = \mathbf{j}_0 + \mathbf{j}_{\mathrm{ind}}(t), \tag{2.23}$$

where

$$\mathbf{j}_0 = -\frac{e}{L_xL_y}\,\omega_0(\mathbf{v}_{\mathrm{tot}}^{(0)}) \tag{2.24}$$

with the velocity operator,

$$\mathbf{v}_{\mathrm{tot}}^{(0)} = \frac{1}{m_e}\sum_{j=1}^{N}[\mathbf{p}_j + e\mathbf{A}(\mathbf{r}_j) + \boldsymbol{\phi}], \tag{2.25}$$

without the vector potential α(t) giving the external electric field. The initial current density j0 is not necessarily vanishing because the persistent current may exist owing to the presence of the vector potentials.

The linear response coefficients are given by

$$\sigma_{\mathrm{tot},sy}(t; \phi, \eta, T) := \lim_{F\to 0}\frac{j_{\mathrm{ind},s}(t)}{F} \tag{2.26}$$

in the s = x, y directions, where we have written jind(t) = (jind,x(t), jind,y(t)). As we will show in Sec. 3, these coefficients for the time t ≥ 0 have the expressions,

$$\sigma_{\mathrm{tot},xy}(t; \phi, \eta, T) = \sigma_{xy}(\phi) + \gamma_{xy}(\phi)\cdot t + \delta\sigma_{xy}(t; \phi, \eta, T), \tag{2.27}$$

and

$$\sigma_{\mathrm{tot},yy}(t; \phi, \eta, T) = \gamma_{yy}(\phi)\cdot t + \delta\sigma_{yy}(t; \phi, \eta, T), \tag{2.28}$$

where σxy(φ), γxy(φ) and γyy(φ) are all independent of the time t. The remaining two terms δσsy(t; φ, η, T) for s = x, y are due to the initial switching process for adiabatically applying the electric field in the time −T ≤ t ≤ 0, and so the two terms are negligibly small under the slow switching conditions ηT ≫ 1 and η ≪ ΔE/ħ. In particular, they vanish in the slow switching limit ηT → ∞ and η → 0. See the bound (4.4) and the equality (4.25) in Sec. 4 for certain translationally invariant systems, the bound (6.27) in Sec. 6 for the non-interacting electron gas and the bound (7.5) in Sec. 7 for the interacting electron gas.

The term σxy(φ) corresponds to the Hall conductance. For simplicity, we will call σxy(φ) the Hall conductance. As mentioned in the Introduction, the time-independent term σyy(φ) corresponding to the diagonal conductance does not appear [7, 9, 45], and instead of that, there appear linearly time-dependent terms, γsy(φ) · t. In particular, if the system is translationally invariant in both x and y directions with no uniform magnetic field, then the total velocity of the electrons is proportional to the time t owing to the uniform electric field. (See Sec. 4.) Therefore we will call γsy(φ) the acceleration coefficient. On the other hand, when a finite energy gap appears above the sector of the ground state as in a band insulator or in the quantum Hall case, we can expect that both the acceleration coefficients γsy(φ) will vanish in the infinite volume limit. Actually this statement holds for the non-interacting case as we will see in Theorem 2.3 below. However, we could not obtain a similar theorem for the interacting case except for a trivial case without disorder in Sec. 4. Of course, we cannot expect that the acceleration coefficients γsy(φ) are exactly vanishing for a finite volume in a generic situation with disorder because there may exist a nonvanishing current due to the scattering of the electrons by the disorder.
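The decomposition (2.27) and (2.28) suggests a simple way to read off the two time-independent coefficients from sampled values of σtot,sy(t): once the switching corrections δσsy are negligible, σtot,sy(t) is a straight line in t whose intercept is σsy (zero for s = y) and whose slope is γsy. The following Python sketch illustrates this with an ordinary least-squares fit. The sample values and the helper names `linear_fit` and `sigma_tot_model` are hypothetical and serve only as an illustration; they are not part of the analysis of this paper.

```python
def linear_fit(ts, ys):
    """Ordinary least-squares fit ys ~ a + b * ts; returns (a, b)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def sigma_tot_model(t, sigma, gamma):
    """Right-hand side of (2.27) with the switching correction neglected."""
    return sigma + gamma * t


# Hypothetical numbers, in units where e^2/h = 1 (assumption for illustration).
sigma_xy = -1.0 / 3.0   # constant Hall term
gamma_xy = 0.02         # small acceleration coefficient

ts = [0.1 * k for k in range(11)]                       # times t >= 0
samples = [sigma_tot_model(t, sigma_xy, gamma_xy) for t in ts]

intercept, slope = linear_fit(ts, samples)
print("fitted sigma_xy:", intercept)
print("fitted gamma_xy:", slope)
```

In this idealized setting the intercept reproduces the Hall term and the slope reproduces the acceleration coefficient; with a non-negligible δσsy the fit would only be approximate.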
If the Hall conductance σxy(φ) alone is needed, then it is enough to measure the linear response coefficients at the time t = 0 because the effect of the Lorentz force alone persists at t = 0 without any acceleration of the electrons by the potentials of the system.

We define the averaged Hall conductance and the averaged acceleration coefficients over the gauge parameters φ on the gauge torus Tg as

$$\overline{\sigma_{xy}(\phi)} = \frac{1}{\Delta\phi_x\Delta\phi_y}\int_{T_g} d\phi_x\, d\phi_y\, \sigma_{xy}(\phi) \tag{2.29}$$

and

$$\overline{\gamma_{sy}(\phi)} = \frac{1}{\Delta\phi_x\Delta\phi_y}\int_{T_g} d\phi_x\, d\phi_y\, \gamma_{sy}(\phi) \quad\text{for } s = x, y. \tag{2.30}$$

Under Assumption 2.1, the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ can be written as

$$\overline{\sigma_{xy}(\phi)} = \frac{e^2}{h}\,\frac{I}{q} \tag{2.31}$$

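in terms of the geometric invariant [5, 6, 8, 19, 20, 27],

$$I = \frac{1}{2\pi i}\int_{T_g} d\phi_x\, d\phi_y\, \mathrm{tr}\,\mathcal{F}(\phi), \tag{2.32}$$

with the curvature $\mathcal{F}(\phi)$ given by

$$\mathcal{F}(\phi) = \frac{\partial}{\partial\phi_x}\mathcal{A}_y(\phi) - \frac{\partial}{\partial\phi_y}\mathcal{A}_x(\phi) + [\mathcal{A}_x(\phi), \mathcal{A}_y(\phi)], \tag{2.33}$$
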
where tr stands for the trace of the matrix, and $\mathcal{A}_s(\phi)$ for s = x, y are the connections on the gauge torus Tg, i.e. each $\mathcal{A}_s(\phi)$ takes a q × q matrix value in the Lie algebra of the unitary group U(q) of q × q matrices [39]. As we will see in Sec. 5, the connections $\mathcal{A}_s(\phi)$ are written in terms of the ground state wavefunctions. In the language of the non-Abelian gauge theory [42], the curvature $\mathcal{F}(\phi)$ corresponds to the field strength tensor or the “electromagnetic” field, and the connections $\mathcal{A}_s(\phi)$ correspond to the gauge fields on the torus Tg. Thus the averaged Hall conductance can be written in terms of the geometric invariant for the non-Abelian gauge fields on the torus. Since the geometric invariant I called the first Chern number takes an integer value, the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ is integrally or fractionally quantized. On the other hand, we can prove that the averaged acceleration coefficients $\overline{\gamma_{sy}(\phi)}$ are exactly vanishing. We summarize the well-known result on the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ and our result on the averaged acceleration coefficients $\overline{\gamma_{sy}(\phi)}$ as the following theorem in the most general setting for the present model:

Theorem 2.2. Suppose that, for a finite volume and a filling factor ν of the Landau levels, there exists a uniform gap above the sector of the ground state in the sense of Assumption 2.1. Then there exists an integer p such that the averaged Hall conductance for the finite volume is quantized as

$$\overline{\sigma_{xy}(\phi)} = -\frac{e^2}{h}\,\frac{p}{q} \tag{2.34}$$

with the dimension q of the sector of the ground state, and the averaged acceleration coefficients are vanishing as

$$\overline{\gamma_{sy}(\phi)} = 0 \quad\text{for both } s = x, y \text{ directions.} \tag{2.35}$$
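The integrality of the first Chern number behind (2.34) can be illustrated numerically in a much simpler setting than the present many-body problem. The following Python sketch is only a toy example under stated assumptions: it replaces the q-dimensional sector of the ground state on the gauge torus by the lower band of a two-band lattice model (the Qi–Wu–Zhang model, chosen here purely for illustration and not used in this paper), and it evaluates the Chern number with the lattice link-variable method of Fukui, Hatsugai and Suzuki rather than with the integral formula (2.32). All function names are hypothetical, and the overall sign depends on orientation conventions.

```python
import cmath
import math


def lower_band_state(kx, ky, m):
    """Normalized lower-band eigenvector of H = d . sigma for the toy model
    d = (sin kx, sin ky, m + cos kx + cos ky)."""
    dx, dy = math.sin(kx), math.sin(ky)
    dz = m + math.cos(kx) + math.cos(ky)
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Two equivalent (gauge-related) expressions for the eigenvector with
    # eigenvalue -r; pick the one with the larger norm to avoid 0/0.
    cand_a = (complex(dz - r, 0.0), complex(dx, dy))
    cand_b = (complex(dx, -dy), complex(-(dz + r), 0.0))
    norm_a = math.sqrt(abs(cand_a[0]) ** 2 + abs(cand_a[1]) ** 2)
    norm_b = math.sqrt(abs(cand_b[0]) ** 2 + abs(cand_b[1]) ** 2)
    vec, norm = (cand_a, norm_a) if norm_a >= norm_b else (cand_b, norm_b)
    return (vec[0] / norm, vec[1] / norm)


def inner(u, v):
    """Hermitian inner product <u, v> of two-component vectors."""
    return u[0].conjugate() * v[0] + u[1].conjugate() * v[1]


def chern_number(m, n=24):
    """Sum of plaquette Berry phases on an n x n grid of the torus, / (2 pi)."""
    ks = [2.0 * math.pi * i / n for i in range(n)]
    states = [[lower_band_state(kx, ky, m) for ky in ks] for kx in ks]
    total = 0.0
    for i in range(n):
        for j in range(n):
            u00 = states[i][j]
            u10 = states[(i + 1) % n][j]
            u11 = states[(i + 1) % n][(j + 1) % n]
            u01 = states[i][(j + 1) % n]
            # Gauge-invariant product of overlaps around one plaquette.
            loop = inner(u00, u10) * inner(u10, u11) * inner(u11, u01) * inner(u01, u00)
            total += cmath.phase(loop)
    return total / (2.0 * math.pi)


c_gapped_nontrivial = chern_number(1.0)   # toy parameter inside 0 < |m| < 2
c_gapped_trivial = chern_number(3.0)      # toy parameter with |m| > 2
print(c_gapped_nontrivial, c_gapped_trivial)
```

Because every link overlap enters two neighbouring plaquettes with opposite orientation, the summed phases add up to an integer multiple of 2π, so the returned value is an integer up to rounding. This mirrors, in a toy setting, why a gap condition alone forces the quantization in (2.34).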
The proof will be given in Sec. 5. In such a general setting, we cannot determine the fraction p/q, which is expected to be equal to the filling factor ν of the Landau levels [32]. We also remark that, for general fixed values of the gauge parameters φ ∈ Tg, we cannot expect the same exact results about the fractional quantization of the Hall conductance and the vanishing acceleration coefficients without any finite size correction.

For the quantum Hall case without the electron–electron interaction, we can obtain much stronger results as follows:

Theorem 2.3. Assume W^(2) = 0, i.e. no electron–electron interaction, and assume an integer filling factor ν = ℓ of the Landau levels with ℓ ∈ N. Further we assume
that the vector potential AP and the electrostatic potential W satisfy the condition,

$$\hbar\omega_c > \sqrt{\frac{2\hbar\omega_c}{m_e}}\; e\,\||\mathbf{A}_P|\|_\infty\left(\sqrt{\ell + 1/2} + \sqrt{\ell - 1/2}\right) + \frac{e^2}{2m_e}\left(\||\mathbf{A}_P|\|_\infty\right)^2 + \|W^+\|_\infty + \|W^-\|_\infty, \tag{2.36}$$

where ωc is the cyclotron frequency given by ωc = eB/me, $|\mathbf{A}_P| = \sqrt{\mathbf{A}_P\cdot\mathbf{A}_P}$, and W± = max{±W, 0}. Then there exists a uniform gap above the unique ground state in the sense of Assumption 2.1, and the following bounds are valid:

$$\left|\sigma_{xy}(\phi_0) + \frac{e^2}{h}\,\ell\right| \le C\max\{L_x^{-1}, L_y^{-1\}}, \tag{2.37}$$

and

$$|\gamma_{sy}(\phi_0)| \le C'\max\{L_x^{-1}, L_y^{-1}\} \tag{2.38}$$

for any gauge parameters φ0 ∈ Tg, and for s = x, y, where the positive constants C and C′ are independent of the number N of the electrons and of the system sizes Lx, Ly.

The proof will be given in Sec. 6. Thus the Hall conductance shows the integral quantization with the finite size correction, and the acceleration coefficients are vanishing in the infinite volume limit. However, the bounds for the finite size corrections would not be optimal [38] in comparison with the precision of the quantization and the weakness of the dissipation of heat in realistic quantum Hall systems. Perhaps, if possible, we should take account of the effect of the self-averaging about the disorder in the realistic systems.

In order to explain the appearance of the Hall conductance plateaus, the gap condition in Theorem 2.3 must be replaced with localization estimates. Namely we must show that the quantized Hall conductance does not change when varying the Fermi level within the localization regime. For infinite volume systems, the Hall conductance formula by Bellissard shows the plateaus [7, 8, 12]. Kunz [11] also discussed the plateaus in an infinite volume system by combining the topological argument with certain assumptions on localization. This issue for finite volume systems will be discussed in [13].

For the interacting case, we could not estimate the corresponding finite size corrections. But we obtained the following theorem about the relation between the fraction p/q of the quantization and the filling factor ν of the electrons:

^h The set C^k(S) denotes k times continuously differentiable functions on the set S.

Theorem 2.4. Assume AP = 0, and that the electrostatic potential W and the electron–electron interaction W^(2) satisfy W ∈ C³(R²) and W^(2) ∈ C¹(R²), respectively.^h Further we assume that, for a finite volume and a filling factor ν of the Landau levels, there exists a uniform gap ΔE above the sector of the ground state
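with the degeneracy q in the sense of Assumption 2.1. Then the fraction p/q of the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ of (2.34) satisfies

$$\nu(1-\delta) \le \frac{p}{q} \le \nu(1+\delta), \tag{2.39}$$

where the positive number δ is given by

$$\delta = \frac{2\ell_B^4\,\hbar\omega_c}{(\Delta E)^3}\,\max_{\substack{m,n\ge 0;\\ m+n=2}}\left\|\frac{\partial^{m+n}W}{\partial x^m\,\partial y^n}\right\|_\infty^2. \tag{2.40}$$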
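The way the bound (2.39) is used can be made concrete with a few lines of Python. The sketch below is only an illustration under stated assumptions: the helper names `integers_in_interval` and `fraction_must_be_noninteger` and the numerical values of ν and δ are hypothetical, and the code merely checks whether the closed interval [ν(1 − δ), ν(1 + δ)] contains an integer.

```python
import math


def integers_in_interval(nu, delta):
    """All integers k with nu * (1 - delta) <= k <= nu * (1 + delta)."""
    low = nu * (1.0 - delta)
    high = nu * (1.0 + delta)
    return list(range(math.ceil(low), math.floor(high) + 1))


def fraction_must_be_noninteger(nu, delta):
    """True if the bound (2.39) excludes every integer value of p/q."""
    return len(integers_in_interval(nu, delta)) == 0


# Hypothetical examples: a fractional filling with a weak potential,
# and an integer filling with the same delta.
print(fraction_must_be_noninteger(1.0 / 3.0, 0.1))  # interval [0.3, 0.3667]
print(fraction_must_be_noninteger(1.0, 0.1))        # interval [0.9, 1.1]
```

When the first check succeeds, p/q cannot be an integer, so the degeneracy q of the sector of the ground state is necessarily greater than 1.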
The proof will be given in Sec. 7.2. As mentioned in the Introduction, if the interval [ν(1 − δ), ν(1 + δ)] does not include any integer, then the number p/q must be equal to a purely fractional number, i.e. a non-integer. In addition, the dimension q of the sector of the ground state must be greater than 1. In the weak potential limit δ → 0, we obtain the desired result,

$$\overline{\sigma_{xy}(\phi)} = -\frac{e^2}{h}\,\nu, \quad\text{with } \nu = p/q, \tag{2.41}$$

from the result (2.34) with the inequality (2.39). Therefore one can expect the fractional quantization of the Hall conductance [31, 32] when a spectral gap appears above the sector of the ground state for the fractional filling factor ν = p/q. But the Hall conductance would vanish in the exceptional case with a very strong periodic potential.^i

In the case with AP ≠ 0, we can also obtain very similar results to Theorem 2.4. But we need stronger assumptions on the potentials AP, W and W^(2). See Sec. 7.2 for the details.

^i When a spectral gap exists owing to a strong periodic potential irrespective of the uniform magnetic field, the Hall conductance must vanish in the weak limit of the magnetic field. In such a situation, the integer p in the fractionally quantized Hall conductance must be equal to zero because the Hall conductance is a continuous function of the uniform magnetic field. Thus the Hall conductance vanishes owing to the strong periodic potential [46].

3. Derivation of the Linear Response Coefficients

In this section, we derive the expressions for the linear response coefficients by using the perturbation theory [47] in a mathematically rigorous manner. We denote by $\Phi_{0,\mu}^{(N)}(\phi)$ the ground state eigenvectors of the Hamiltonian $H_0^{(N)}$ of (2.1) with the eigenvalue $E_{0,\mu}^{(N)}(\phi)$, and denote by $\Phi_{n,\mu}^{(N)}(\phi)$ with n ≥ 1 the eigenvectors of the excited states with the energy eigenvalue $E_n^{(N)}(\phi)$ and with the subscript µ for the degeneracy. In the following, we will often use the abbreviations $\Phi_{n,\mu}^{(N)}$, $E_{0,\mu}^{(N)}$, $E_n^{(N)}$, dropping the φ dependence if there is no confusion.

To begin with, we rewrite the Hamiltonian H^(N)(t) of (2.13) as

$$H^{(N)}(t) = H_0^{(N)} + \Delta H_0^{(N)}(t) + H_{\mathrm{per}}^{(N)}(t) \tag{3.1}$$

with the diagonal part,

$$\Delta H_0^{(N)}(t) = \sum_n Q(E_n^{(N)})\,H_{\min}^{(N)}(t)\,Q(E_n^{(N)}) + N\frac{e^2}{2m_e}[\alpha(t)]^2, \tag{3.2}$$

of the perturbation and the off-diagonal part,

$$H_{\mathrm{per}}^{(N)}(t) = H_{\min}^{(N)}(t) - \sum_n Q(E_n^{(N)})\,H_{\min}^{(N)}(t)\,Q(E_n^{(N)}), \tag{3.3}$$

where $H_{\min}^{(N)}(t)$ is the minimal coupling with the external electric field, i.e.

$$H_{\min}^{(N)}(t) = \frac{e}{m_e}\sum_{j=1}^{N}\alpha(t)\,[p_{y,j} + eA_y(\mathbf{r}_j)], \tag{3.4}$$

and $Q(E_0^{(N)})$ is the projection operator onto the subspace spanned by the ground state eigenvectors $\Phi_{0,\mu}^{(N)}$ of the unperturbed Hamiltonian $H_0^{(N)}$, and $Q(E_n^{(N)})$ for n ≥ 1 is the projection operator onto the eigenspace spanned by the excited state eigenvector(s) of the Hamiltonian $H_0^{(N)}$ with the eigenvalue $E_n^{(N)}$.

Consider the time-dependent Schrödinger equation

$$i\hbar\frac{\partial}{\partial t}\Psi^{(N)}(t) = H^{(N)}(t)\Psi^{(N)}(t) \tag{3.5}$$

with the Hamiltonian H^(N)(t) of (2.13). The solution Ψ^(N)(t) can be written as Ψ^(N)(t) = U^(N)(t, t0)Ψ^(N)(t0) by using the time evolution operator U^(N)(t, s), with an initial vector Ψ^(N)(t0) at the initial time t = t0. We denote by $U_0^{(N)}(t, s)$ the time evolution operator for the Hamiltonian $H_0^{(N)} + \Delta H_0^{(N)}(t)$.

Let Ψ^(N)(t) be a solution of the Schrödinger equation (3.5). Note that

$$\begin{aligned}\frac{\partial}{\partial s}\,U_0^{(N)}(t, s)\Psi^{(N)}(s) &= \frac{i}{\hbar}\,U_0^{(N)}(t, s)\left[H_0^{(N)} + \Delta H_0^{(N)}(s)\right]\Psi^{(N)}(s) + U_0^{(N)}(t, s)\frac{\partial}{\partial s}\Psi^{(N)}(s)\\ &= \frac{i}{\hbar}\,U_0^{(N)}(t, s)\left[H_0^{(N)} + \Delta H_0^{(N)}(s) - H^{(N)}(s)\right]\Psi^{(N)}(s)\\ &= -\frac{i}{\hbar}\,U_0^{(N)}(t, s)H_{\mathrm{per}}^{(N)}(s)\Psi^{(N)}(s).\end{aligned} \tag{3.6}$$

Integrating this on the time s from t0 to t, one obtains

$$\Psi^{(N)}(t) - U_0^{(N)}(t, t_0)\Psi^{(N)}(t_0) = -\frac{i}{\hbar}\int_{t_0}^{t} ds\; U_0^{(N)}(t, s)H_{\mathrm{per}}^{(N)}(s)\Psi^{(N)}(s). \tag{3.7}$$

From the definition of U^(N)(t, s), this can be rewritten as

$$\left[U^{(N)}(t, t_0) - U_0^{(N)}(t, t_0)\right]\Psi^{(N)}(t_0) = -\frac{i}{\hbar}\int_{t_0}^{t} ds\; U_0^{(N)}(t, s)H_{\mathrm{per}}^{(N)}(s)\Psi^{(N)}(s). \tag{3.8}$$

Since both U^(N)(t, s) and $U_0^{(N)}(t, s)$ are bounded, one has the following lemma [47]:
Lemma 3.1. In the strong sense, $U^{(N)}(t, s) \to U_0^{(N)}(t, s)$ as F → 0, uniformly in t, s in any finite interval.
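The content of Lemma 3.1 can be visualized in a finite-dimensional toy model. The following Python sketch is an assumption-laden illustration and not the Hamiltonian of this paper: it takes a two-level system with the time-independent unperturbed part H0 = diag(0, ω), an off-diagonal perturbation F σx playing the role of the perturbation of order F, units with ħ = 1, and one fixed initial vector. The function names are hypothetical. It evaluates the norm of the difference between the full and the unperturbed evolution of that vector for decreasing F.

```python
import cmath
import math


def evolve_unperturbed(psi, omega, t):
    """Apply exp(-i H0 t) with H0 = diag(0, omega), hbar = 1."""
    return (psi[0], cmath.exp(-1j * omega * t) * psi[1])


def evolve_full(psi, omega, strength, t):
    """Apply exp(-i H t) with H = H0 + strength * sigma_x, using
    H = a I + b . sigma and
    exp(-i H t) = e^{-i a t} [cos(|b| t) I - i sin(|b| t) (b . sigma) / |b|]."""
    a = omega / 2.0
    bz, bx = -omega / 2.0, strength
    norm_b = math.hypot(bx, bz)
    c, s = math.cos(norm_b * t), math.sin(norm_b * t)
    phase = cmath.exp(-1j * a * t)
    m00 = c - 1j * s * bz / norm_b
    m01 = -1j * s * bx / norm_b
    m10 = -1j * s * bx / norm_b
    m11 = c + 1j * s * bz / norm_b
    return (phase * (m00 * psi[0] + m01 * psi[1]),
            phase * (m10 * psi[0] + m11 * psi[1]))


def deviation(strength, omega=1.0, t=2.0, psi=(1.0 + 0j, 0.0 + 0j)):
    """Norm of [U(t, 0) - U0(t, 0)] psi for the toy model."""
    full = evolve_full(psi, omega, strength, t)
    free = evolve_unperturbed(psi, omega, t)
    return math.sqrt(abs(full[0] - free[0]) ** 2 + abs(full[1] - free[1]) ** 2)


deviations = [deviation(f) for f in (0.1, 0.01, 0.001)]
print(deviations)
```

In this toy model the deviation shrinks roughly in proportion to F, which is the finite-dimensional analogue of the convergence stated in the lemma; it is of course not a substitute for the operator-theoretic argument of [47].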
In Eq. (3.7), we take the initial state at t0 = −T as

$$\Psi^{(N)}(t_0 = -T) = U_0^{(N)}(-T, 0)\,\Phi^{(N)} \tag{3.9}$$

with a vector Φ^(N) in the domain of the Hamiltonian $H_0^{(N)}$ of (2.1). Then Eq. (3.7) becomes

$$\Psi^{(N)}(t) = U_0^{(N)}(t, 0)\,\Phi^{(N)} - \frac{i}{\hbar}\int_{-T}^{t} ds\; U_0^{(N)}(t, s)H_{\mathrm{per}}^{(N)}(s)\Psi^{(N)}(s). \tag{3.10}$$

The following theorem can be obtained in the same way as in [47, Chap. IX, Sec. 2, Theorem 2.19]:

Theorem 3.2. Let $\Phi^{(N)}$ be a vector in the domain of the Hamiltonian $H_0^{(N)}$ of (2.1). Then
\[
\Psi^{(N)}(t) = U_0^{(N)}(t,0)\Phi^{(N)} - \frac{i}{\hbar}\int_{-T}^{t} ds\, U_0^{(N)}(t,s) H_{\rm per}^{(N)}(s) U_0^{(N)}(s,0)\Phi^{(N)} + o(F) ,
\tag{3.11}
\]
where $o(F)$ denotes a vector $\Psi_R$ with the norm $\|\Psi_R\|$ satisfying $\|\Psi_R\|/F \to 0$ as $F \to 0$.

Proof. Since
\[
\Psi^{(N)}(s) = U^{(N)}(s,-T)\Psi^{(N)}(-T)
\tag{3.12}
\]
and
\[
U_0^{(N)}(s,0)\Phi^{(N)} = U_0^{(N)}(s,-T)U_0^{(N)}(-T,0)\Phi^{(N)} = U_0^{(N)}(s,-T)\Psi^{(N)}(-T) ,
\tag{3.13}
\]
it is sufficient to show that
\[
H_{\rm per}^{(N)}(s)\bigl[U^{(N)}(s,-T) - U_0^{(N)}(s,-T)\bigr]\Psi^{(N)}(-T) = o(F) .
\tag{3.14}
\]
Note that
\[
H_{\rm per}^{(N)}(s)U^{(N)}(s,-T)\Psi^{(N)}(-T) = H_{\rm per}^{(N)}(s)[S(s)]^{-1}\tilde U^{(N)}(s,-T)S(-T)\Psi^{(N)}(-T)
\tag{3.15}
\]
and
\[
H_{\rm per}^{(N)}(s)U_0^{(N)}(s,-T)\Psi^{(N)}(-T) = H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1}\tilde U_0^{(N)}(s,-T)\tilde S_0(-T)\Psi^{(N)}(-T) ,
\tag{3.16}
\]
where
\[
\tilde U(t,s) = S(t)U(t,s)[S(s)]^{-1} ,
\tag{3.17}
\]
\[
S(t) = \frac{i}{\hbar}H^{(N)}(t) + \lambda_0 ,
\tag{3.18}
\]
\[
\tilde U_0(t,s) = \tilde S_0(t)U_0(t,s)[\tilde S_0(s)]^{-1} ,
\tag{3.19}
\]
and
\[
\tilde S_0(t) = \frac{i}{\hbar}\bigl[H_0^{(N)} + \Delta H_0^{(N)}(t)\bigr] + \lambda_0 .
\tag{3.20}
\]
Here $\lambda_0$ is some real constant so that both $[S(t)]^{-1}$ and $[\tilde S_0]^{-1}$ exist, and the operators $\tilde U(t,s)$ and $\tilde U_0(t,s)$ are well defined and bounded [48]. Formally one has
\[
\frac{d}{dr}\bigl[\tilde U(t,r)U(r,s)\bigr] = -\tilde U(t,r)\Bigl[\frac{d}{dr}S(r)\Bigr][S(r)]^{-1}U(r,s) .
\tag{3.21}
\]
Integrating this on the time $r$ from $s$ to $t$, one obtains$^{j}$
\[
\tilde U(t,s) = U(t,s) + \int_s^t dr\, \tilde U(t,r)\Bigl[\frac{d}{dr}S(r)\Bigr][S(r)]^{-1}U(r,s) .
\tag{3.22}
\]
Further one has
\[
[S(s)]^{-1} = [\tilde S_0(s)]^{-1} - \frac{i}{\hbar}[S(s)]^{-1}H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1} .
\tag{3.23}
\]
Using these two formulas, the right-hand side of (3.15) can be evaluated as
\[
H_{\rm per}^{(N)}(s)[S(s)]^{-1}\tilde U^{(N)}(s,-T)S(-T)\Psi^{(N)}(-T) = H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1}U^{(N)}(s,-T)\tilde S_0(-T)\Psi^{(N)}(-T) + o(F) .
\tag{3.24}
\]
Similarly one has
\[
\tilde U_0(t,s) = U_0(t,s) + \int_s^t dr\, \tilde U_0(t,r)\Bigl[\frac{d}{dr}\tilde S_0(r)\Bigr][\tilde S_0(r)]^{-1}U_0(r,s) .
\tag{3.25}
\]
Hence the right-hand side of (3.16) can be evaluated as
\[
H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1}\tilde U_0^{(N)}(s,-T)\tilde S_0(-T)\Psi^{(N)}(-T) = H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1}U_0^{(N)}(s,-T)\tilde S_0(-T)\Psi^{(N)}(-T) + o(F) .
\tag{3.26}
\]
Combining (3.15), (3.16), (3.24) and (3.26), one has
\[
H_{\rm per}^{(N)}(s)\bigl[U^{(N)}(s,-T) - U_0^{(N)}(s,-T)\bigr]\Psi^{(N)}(-T) = H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1}\bigl[U^{(N)}(s,-T) - U_0^{(N)}(s,-T)\bigr]\tilde S_0(-T)\Psi^{(N)}(-T) + o(F) .
\tag{3.27}
\]
This right-hand side is of $o(F)$ from Lemma 3.1 because the operator $H_{\rm per}^{(N)}(s)[\tilde S_0(s)]^{-1}$ is bounded and already of order $F$.

We take the initial state $\omega_0$ at the time $t = t_0 = -T$ as
\[
\omega_0(\cdots) = \frac{1}{q}\sum_{\mu=1}^{q}\bigl\langle \Phi_{0,\mu}^{(N)}, (U_0^{(N)}(-T,0))^\dagger (\cdots) U_0^{(N)}(-T,0)\Phi_{0,\mu}^{(N)}\bigr\rangle ,
\tag{3.28}
\]
$^{j}$ For simplicity we have given the formal derivation here although the resulting integral equation (3.22) is justified [48].
where the ground state vectors $\Phi_{0,\mu}^{(N)}$ are normalized. Using the projection $Q(E_0^{(N)})$ onto the sector of the ground state, we have
\[
\begin{aligned}
\omega_0(a) &= \frac{1}{q}\sum_{\mu=1}^{q}\bigl\langle \Phi_{0,\mu}^{(N)}, (U_0^{(N)}(-T,0))^\dagger a\, U_0^{(N)}(-T,0)\Phi_{0,\mu}^{(N)}\bigr\rangle \\
&= \frac{1}{q}\,{\rm Tr}\, Q(E_0^{(N)})(U_0^{(N)}(-T,0))^\dagger a\, U_0^{(N)}(-T,0) \\
&= \frac{1}{q}\,{\rm Tr}\, Q(E_0^{(N)})\, a
\end{aligned}
\tag{3.29}
\]
for any observable $a$ in the domain. Here Tr stands for the trace on the Hilbert space, and we have used the fact that $Q(E_0^{(N)})$ commutes with the unitary operator $U_0^{(N)}(t,s)$. Starting from this initial state is physically natural as we discussed in the preceding section. With this initial condition, the total current at time $t$ is given by
\[
j_{\rm tot}(t) = -\frac{e}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\bigl\langle \Psi_{0,\mu}^{(N)}(t), v_{\rm tot}(t)\Psi_{0,\mu}^{(N)}(t)\bigr\rangle ,
\tag{3.30}
\]
where $\Psi_{0,\mu}^{(N)}(t)$ is the solution of the time-dependent Schrödinger equation (3.5) with the initial state $U_0^{(N)}(-T,0)\Phi_{0,\mu}^{(N)}$ for $\mu = 1, 2, \ldots, q$, and the total velocity operator $v_{\rm tot}(t)$ is given by (2.21). This total current $j_{\rm tot}(t)$ can be decomposed into two parts as
\[
j_{\rm tot}(t) = j_0 + j_{\rm ind}(t)
\tag{3.31}
\]
with the current,
\[
j_0 = -\frac{e}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\bigl\langle \Phi_{0,\mu}^{(N)}, [U_0^{(N)}(t,0)]^\dagger v_{\rm tot}^{(0)} U_0^{(N)}(t,0)\Phi_{0,\mu}^{(N)}\bigr\rangle ,
\tag{3.32}
\]
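where the velocity operator $v_{\rm tot}^{(0)}$ is given by (2.25). The current $j_0$ of (3.32) is independent of the time $t$, and is equal to the initial current or the persistent current without the external electric field. Actually it can be rewritten as
\[
j_0 = -\frac{e}{L_xL_y}\frac{1}{q}\,{\rm Tr}\, v_{\rm tot}^{(0)} Q(E_0^{(N)})
\tag{3.33}
\]
in the same way as in the above. Clearly the rest of the current, $j_{\rm ind}(t) = (j_{{\rm ind},x}(t), j_{{\rm ind},y}(t))$, is the induced current by the external electric field. Using Theorem 3.2, the Hall current (the $x$ component of $j_{\rm ind}(t)$) induced by the electric field is given by
\[
\begin{aligned}
j_{{\rm ind},x}(t) &= \frac{e}{L_xL_y}\frac{i}{\hbar}\frac{1}{q}\sum_{\mu=1}^{q}\int_{-T}^{t} ds\, \bigl\langle U_0^{(N)}(t,0)\Phi_{0,\mu}^{(N)}, v_{{\rm tot},x}^{(0)} U_0^{(N)}(t,s)H_{\rm per}^{(N)}(s)U_0^{(N)}(s,0)\Phi_{0,\mu}^{(N)}\bigr\rangle + {\rm c.c.} + o(F) \\
&= \frac{e^2}{L_xL_y}\frac{i}{\hbar}\frac{1}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1,\mu'}\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},x}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle \\
&\qquad\times \exp\Bigl[\frac{i}{\hbar}(E_{0,\mu}^{(N)} - E_n^{(N)})t\Bigr]\int_{-T}^{t} ds\, \exp\Bigl[-\frac{i}{\hbar}(E_{0,\mu}^{(N)} - E_n^{(N)})s\Bigr]\alpha(s) + {\rm c.c.} + o(F) ,
\end{aligned}
\tag{3.34}
\]
where $\Phi_{n,\mu'}^{(N)}$ is the normalized eigenvector of the Hamiltonian $H_0^{(N)}$ with the energy eigenvalue $E_n^{(N)}$ for $n \ge 1$, and c.c. stands for the complex conjugate of the first part. Note that
\[
\begin{aligned}
\frac{1}{F}\int_{-T}^{t} ds\, \exp\Bigl[-\frac{i}{\hbar}(E_{0,\mu}^{(N)} - E_n^{(N)})s\Bigr]\alpha(s)
&= \Bigl[\frac{i\hbar T}{E_{0,\mu}^{(N)} - E_n^{(N)} + i\hbar\eta} - \frac{\hbar^2}{(E_{0,\mu}^{(N)} - E_n^{(N)} + i\hbar\eta)^2}\Bigr] e^{-\eta T}\exp\bigl[i(E_{0,\mu}^{(N)} - E_n^{(N)})T/\hbar\bigr] \\
&\quad + \hbar^2\Bigl[\frac{1}{(E_{0,\mu}^{(N)} - E_n^{(N)})^2} - \frac{1}{(E_{0,\mu}^{(N)} - E_n^{(N)} + i\hbar\eta)^2}\Bigr] \\
&\quad - \Bigl[\frac{i\hbar t}{E_{0,\mu}^{(N)} - E_n^{(N)}} + \frac{\hbar^2}{(E_{0,\mu}^{(N)} - E_n^{(N)})^2}\Bigr]\exp\bigl[-i(E_{0,\mu}^{(N)} - E_n^{(N)})t/\hbar\bigr]
\end{aligned}
\tag{3.35}
\]
for the time $t \ge 0$. In the following, we will consider only the time $t \ge 0$. Substituting (3.35) into (3.34), we obtain the linear response coefficient,
\[
\sigma_{{\rm tot},xy}(t;\eta,T) := \lim_{F\to 0}\frac{j_{{\rm ind},x}(t)}{F} = \sigma_{xy} + \gamma_{xy}\cdot t + \delta\sigma_{xy}(t;\eta,T) ,
\tag{3.36}
\]
in the $x$ direction, where the first term which we call the Hall conductance is given by
\[
\sigma_{xy} = -\frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1,\mu'}\Biggl[\frac{\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},x}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle}{(E_{0,\mu}^{(N)} - E_n^{(N)})^2} - {\rm c.c.}\Biggr] ,
\tag{3.37}
\]
and the acceleration coefficient $\gamma_{xy}$ of the second term which is linear in the time $t$ is
\[
\gamma_{xy} = \frac{e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1,\mu'}\Biggl[\frac{\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},x}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle}{E_{0,\mu}^{(N)} - E_n^{(N)}} + {\rm c.c.}\Biggr] .
\tag{3.38}
\]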
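The third term $\delta\sigma_{xy}(t;\eta,T)$ is given by
\[
\delta\sigma_{xy}(t;\eta,T) = \frac{ie^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1,\mu'}\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},x}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle\, M(t, E_{0,\mu}^{n};\eta,T) + {\rm c.c.}
\tag{3.39}
\]
with $E_{0,\mu}^{n} = E_{0,\mu}^{(N)} - E_n^{(N)}$ and
\[
M(t,E;\eta,T) = \Bigl\{\Bigl[\frac{iT}{E + i\hbar\eta} - \frac{\hbar}{(E + i\hbar\eta)^2}\Bigr]e^{-\eta T}e^{iET/\hbar} + \frac{\hbar}{E^2} - \frac{\hbar}{(E + i\hbar\eta)^2}\Bigr\}\, e^{iEt/\hbar} .
\tag{3.40}
\]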
Similarly the induced current in the $y$ direction is
\[
j_{{\rm ind},y}(t) = \frac{Ne^2}{L_xL_ym_e}Ft + \frac{e}{L_xL_y}\frac{i}{\hbar}\frac{1}{q}\sum_{\mu=1}^{q}\int_{-T}^{t} ds\, \bigl\langle U_0^{(N)}(t,0)\Phi_{0,\mu}^{(N)}, v_{{\rm tot},y}^{(0)} U_0^{(N)}(t,s)H_{\rm per}^{(N)}(s)U_0^{(N)}(s,0)\Phi_{0,\mu}^{(N)}\bigr\rangle + {\rm c.c.} + o(F) ,
\tag{3.41}
\]
and the linear response coefficient is given by
\[
\sigma_{{\rm tot},yy}(t;\eta,T) := \lim_{F\to 0}\frac{j_{{\rm ind},y}(t)}{F} = \gamma_{yy}\cdot t + \delta\sigma_{yy}(t;\eta,T) ,
\tag{3.42}
\]
where
\[
\gamma_{yy} = \frac{e^2}{L_xL_y}\Biggl[\frac{N}{m_e} + \frac{2}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1,\mu'}\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\frac{1}{E_{0,\mu}^{(N)} - E_n^{(N)}}\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle\Biggr]
\tag{3.43}
\]
and $\delta\sigma_{yy}(t;\eta,T)$ is given by replacing the velocity operator $v_{{\rm tot},x}^{(0)}$ with $v_{{\rm tot},y}^{(0)}$ in $\delta\sigma_{xy}(t;\eta,T)$.

4. The System with Translation Invariance

In this section, we check the validity of our linear response formulas in the special case without the vector potential $A_P$ and without the electrostatic potential $W$. The Hamiltonian without the external electric field is given by
\[
H_0^{(N)} = \sum_{j=1}^{N}\frac{1}{2m_e}\bigl[(p_{x,j} - eBy_j + \phi_x)^2 + (p_{y,j} + \phi_y)^2\bigr] + \sum_{1\le i<j\le N} W^{(2)}(r_i - r_j) .
\tag{4.1}
\]
Clearly the system has translation invariance [37] in both $x$ and $y$ directions because the electron–electron interaction $W^{(2)}$ is a function of the relative coordinate only. We assume $W^{(2)} \in C^1(\mathbb{R}^2)$. In this situation with $B \ne 0$, the well-known results are obtained as
\[
\sigma_{xy} = -\frac{e^2}{h}\nu ,
\tag{4.2}
\]
and
\[
\gamma_{xy} = \gamma_{yy} = 0 ,
\tag{4.3}
\]
where $\nu$ is the filling factor of the Landau level, i.e. $\nu = N/M$ with the number $N$ of the electrons and with the number $M$ of the single-electron states in a single Landau level without the electron–electron interaction. Further we prove the bounds,
\[
|\delta\sigma_{sy}| \le \frac{e^2}{h}\nu(1 + \omega_cT)e^{-\eta T} + \frac{e^2}{h}\nu\Bigl(2 + \frac{\eta}{\omega_c}\Bigr)\frac{\eta}{\omega_c} \quad \text{for } s = x, y .
\tag{4.4}
\]
Clearly the first term on the right-hand side vanishes in the limit of large $T$, and the second term also vanishes in the limit of small $\eta$. Further we also stress that the right-hand side is independent of the system sizes, $L_x$, $L_y$, and of the number $N$ of the electrons for a fixed filling factor $\nu$.

In order to give proofs of (4.2) and (4.3), we note that
\[
v_{{\rm tot},x}^{(0)} = -\frac{im_e}{\hbar eB}\bigl[v_{{\rm tot},y}^{(0)}, H_0^{(N)}\bigr] ,
\tag{4.5}
\]
\[
v_{{\rm tot},y}^{(0)} = \frac{im_e}{\hbar eB}\bigl[v_{{\rm tot},x}^{(0)}, H_0^{(N)}\bigr]
\tag{4.6}
\]
and
\[
\frac{im_e}{\hbar eB}\bigl[v_{{\rm tot},x}^{(0)}, v_{{\rm tot},y}^{(0)}\bigr] = \frac{N}{m_e} .
\tag{4.7}
\]
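Let $\Phi$ be a vector in the domain of $H_0^{(N)}$, i.e. $\|H_0^{(N)}\Phi\| < \infty$. Then one has
\[
\begin{aligned}
\bigl\langle (E_{0,\mu}^{(N)} - H_0^{(N)})\Phi, v_{{\rm tot},x}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle
&= \bigl\langle v_{{\rm tot},x}^{(0)}\Phi, H_0^{(N)}\Phi_{0,\mu}^{(N)}\bigr\rangle - \bigl\langle H_0^{(N)}\Phi, v_{{\rm tot},x}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle \\
&= \lim_{n\to\infty}\Bigl[\bigl\langle v_{{\rm tot},x}^{(0)}\Phi, H_0^{(N)}\Psi_\mu^{(n)}\bigr\rangle - \bigl\langle H_0^{(N)}\Phi, v_{{\rm tot},x}^{(0)}\Psi_\mu^{(n)}\bigr\rangle\Bigr] \\
&= \lim_{n\to\infty}\bigl\langle \Phi, [v_{{\rm tot},x}^{(0)}, H_0^{(N)}]\Psi_\mu^{(n)}\bigr\rangle \\
&= \frac{\hbar eB}{im_e}\bigl\langle \Phi, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle ,
\end{aligned}
\tag{4.8}
\]
where $\Psi_\mu^{(n)} \in C^\infty(T)$ is an approximate vector$^{k}$ such that $\|(H_0^{(N)} + \lambda)(\Psi_\mu^{(n)} - \Phi_{0,\mu}^{(N)})\| \to 0$ as $n \to \infty$ with a positive constant $\lambda$ satisfying $H_0^{(N)} + \lambda > 0$, and we have used the commutation relation (4.6). Using this and the commutation relation (4.7), one has
\[
\begin{aligned}
\frac{2}{q}\sum_{\mu=1}^{q}\Bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},y}^{(0)}\frac{1 - Q(E_0^{(N)})}{E_{0,\mu}^{(N)} - H_0^{(N)}} v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\Bigr\rangle
&= \frac{1}{q}\sum_{\mu=1}^{q}\frac{im_e}{\hbar eB}\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},y}^{(0)}[1 - Q(E_0^{(N)})]v_{{\rm tot},x}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle + {\rm c.c.} \\
&= \frac{1}{q}\sum_{\mu=1}^{q}\frac{im_e}{\hbar eB}\bigl\langle \Phi_{0,\mu}^{(N)}, [v_{{\rm tot},y}^{(0)}, v_{{\rm tot},x}^{(0)}]\Phi_{0,\mu}^{(N)}\bigr\rangle = -\frac{N}{m_e} .
\end{aligned}
\tag{4.9}
\]

$^{k}$ One can easily find such a vector $\Psi_\mu^{(n)}$ by using the Fourier expansion in terms of the eigenvectors (6.7) of the single-electron Landau Hamiltonian in Sec. 6.1 below.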
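Substituting this into the expression (3.43) of $\gamma_{yy}$, one gets $\gamma_{yy} = 0$. In the same way, one can easily obtain $\gamma_{xy} = 0$ by using the expression (3.38) for $\gamma_{xy}$. For the rest of the Hall conductance $\sigma_{xy}$ of (4.2), one has
\[
\begin{aligned}
\frac{1}{q}\sum_{\mu=1}^{q}\Bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},x}^{(0)}\frac{1 - Q(E_0^{(N)})}{(E_{0,\mu}^{(N)} - H_0^{(N)})^2} v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\Bigr\rangle - {\rm c.c.}
&= -\frac{1}{q}\sum_{\mu=1}^{q}\Bigl(\frac{m_e}{\hbar eB}\Bigr)^2\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},y}^{(0)}[1 - Q(E_0^{(N)})]v_{{\rm tot},x}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle + {\rm c.c.} \\
&= \frac{1}{q}\sum_{\mu=1}^{q}\Bigl(\frac{m_e}{\hbar eB}\Bigr)^2\bigl\langle \Phi_{0,\mu}^{(N)}, [v_{{\rm tot},x}^{(0)}, v_{{\rm tot},y}^{(0)}]\Phi_{0,\mu}^{(N)}\bigr\rangle = -\frac{iN}{\hbar eB} .
\end{aligned}
\tag{4.10}
\]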
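For orientation, the arithmetic that turns the value $-iN/(\hbar eB)$ of (4.10) and the prefactor of (3.37) into (4.2) can be spelled out. This is only a sketch: it assumes the standard magnetic length $\ell_B = \sqrt{\hbar/(eB)}$ behind the relation $L_xL_y = 2\pi M\ell_B^2$ used later in this section.

```latex
\begin{align*}
\sigma_{xy}
  &= -\frac{i\hbar e^2}{L_xL_y}\Bigl(-\frac{iN}{\hbar eB}\Bigr)
   = -\frac{e^2 N}{eB\,L_xL_y} \\
  &= -\frac{e^2 N}{eB\cdot 2\pi M\hbar/(eB)}
   = -\frac{e^2}{2\pi\hbar}\,\frac{N}{M}
   = -\frac{e^2}{h}\,\nu .
\end{align*}
```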
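Substituting (4.10) into the expression (3.37) of $\sigma_{xy}$, the desired result (4.2) is obtained.

Next let us give a proof of the bounds (4.4). To this end, we recall the expression of $\delta\sigma_{sy}(t;\eta,T)$ as
\[
\delta\sigma_{sy}(t;\eta,T) = \frac{ie^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1,\mu'}\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},s}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle\, M(t, E_{0,\mu}^{n};\eta,T) + {\rm c.c.}
\tag{4.11}
\]
with $E_{0,\mu}^{n} = E_{0,\mu}^{(N)} - E_n^{(N)}$ and
\[
M(t,E;\eta,T) = \Bigl\{\Bigl[\frac{iT}{E + i\hbar\eta} - \frac{\hbar}{(E + i\hbar\eta)^2}\Bigr]e^{-\eta T}e^{iET/\hbar} + \frac{\hbar}{E^2} - \frac{\hbar}{(E + i\hbar\eta)^2}\Bigr\}\, e^{iEt/\hbar} .
\tag{4.12}
\]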
For a generic filling factor $\nu$, we cannot expect the existence of a system-size-independent energy gap above the sector of the ground state. In fact the gap might become small for a large volume of the system. Therefore we must treat carefully the denominators of the fractions that appear in the expression of $M(t,E;\eta,T)$.

To begin with, we note that
\[
\frac{1}{E^2} - \frac{1}{(E + i\hbar\eta)^2} = \frac{2i\hbar\eta}{E(E + i\hbar\eta)^2} - \frac{\hbar^2\eta^2}{E^2(E + i\hbar\eta)^2} .
\tag{4.13}
\]
Using this identity, $L_xL_y = 2\pi M\ell_B^2$ and $\nu = N/M$, we have
\[
\delta\sigma_{sy}(t;\eta,T) = i\frac{e^2}{h}\nu\bigl[(A_1 + A_2\omega_cT)e^{-\eta T} + (2A_3 + A_4\eta/\omega_c)\eta/\omega_c\bigr] + {\rm c.c.} ,
\tag{4.14}
\]
where
\[
A_j = \frac{\hbar eB}{N}\frac{1}{q}\sum_{\mu=1}^{q}\sum_{n\ge 1;\mu'}\bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},s}^{(0)}\Phi_{n,\mu'}^{(N)}\bigr\rangle\, M_j(E_{0,\mu}^{n})\,\bigl\langle \Phi_{n,\mu'}^{(N)}, v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle
\tag{4.15}
\]
with
\[
M_1(E) = -\frac{1}{(E + i\hbar\eta)^2}\exp[iE(t + T)/\hbar] ,
\tag{4.16}
\]
\[
M_2(E) = \frac{i}{\hbar\omega_c}\frac{1}{E + i\hbar\eta}\exp[iE(t + T)/\hbar] ,
\tag{4.17}
\]
\[
M_3(E) = \frac{i\hbar\omega_c}{E(E + i\hbar\eta)^2}e^{iEt/\hbar}
\tag{4.18}
\]
and
\[
M_4(E) = -\frac{\hbar^2\omega_c^2}{E^2(E + i\hbar\eta)^2}e^{iEt/\hbar} .
\tag{4.19}
\]
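The splitting of $M$ into the four pieces $M_1, \ldots, M_4$ can be checked numerically. The sketch below is only an illustration: the function names and the sample parameter values are made up, and it assumes that the factor $e^{iEt/\hbar}$ in (4.12) multiplies the whole bracket, which is the reading consistent with (4.16)–(4.19). Under that assumption $M = \hbar[(M_1 + M_2\omega_cT)e^{-\eta T} + (2M_3 + M_4\eta/\omega_c)\eta/\omega_c]$.

```python
import cmath

def m_total(t, e, eta, big_t, hbar):
    """M(t, E; eta, T) of (4.12), with exp(iEt/hbar) multiplying the whole bracket."""
    z = e + 1j * hbar * eta
    early = (1j * big_t / z - hbar / z**2) * cmath.exp(-eta * big_t) \
        * cmath.exp(1j * e * big_t / hbar)
    late = hbar / e**2 - hbar / z**2
    return (early + late) * cmath.exp(1j * e * t / hbar)

def m_parts(t, e, eta, big_t, omega_c, hbar):
    """The four functions M_1..M_4 of (4.16)-(4.19)."""
    z = e + 1j * hbar * eta
    phase_t_plus_T = cmath.exp(1j * e * (t + big_t) / hbar)
    phase_t = cmath.exp(1j * e * t / hbar)
    m1 = -phase_t_plus_T / z**2
    m2 = 1j * phase_t_plus_T / (hbar * omega_c * z)
    m3 = 1j * hbar * omega_c * phase_t / (e * z**2)
    m4 = -(hbar * omega_c)**2 * phase_t / (e**2 * z**2)
    return m1, m2, m3, m4

# Arbitrary sample values (illustrative units only).
t, e, eta, big_t, omega_c, hbar = 0.3, -1.7, 0.05, 4.0, 2.0, 0.7

lhs = m_total(t, e, eta, big_t, hbar)
m1, m2, m3, m4 = m_parts(t, e, eta, big_t, omega_c, hbar)
rhs = hbar * ((m1 + m2 * omega_c * big_t) * cmath.exp(-eta * big_t)
              + (2 * m3 + m4 * eta / omega_c) * eta / omega_c)
print(abs(lhs - rhs))
```

The overall factor $\hbar$ is absorbed into the prefactor $\hbar eB/N$ of (4.15) together with $L_xL_y = 2\pi M\ell_B^2$.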
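We can prove $|A_j| \le 1/2$ for $j = 1, 2, 3, 4$. The desired bound (4.4) follows from these bounds for $A_j$. Since all of $A_j$ can be treated in the same way, we shall give a proof for $|A_1| \le 1/2$ only. In the same way as above, one has
\[
\begin{aligned}
A_1 &= -\frac{\hbar eB}{N}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},s}^{(0)} e^{i\hat\theta}\frac{1 - Q(E_0^{(N)})}{(E_{0,\mu}^{(N)} - H_0^{(N)} + i\hbar\eta)^2} v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\Bigr\rangle \\
&= -\frac{im_e}{N}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},s}^{(0)} e^{i\hat\theta}\frac{1 - Q(E_0^{(N)})}{(E_{0,\mu}^{(N)} - H_0^{(N)} + i\hbar\eta)^2}(E_{0,\mu}^{(N)} - H_0^{(N)}) v_{{\rm tot},x}^{(0)}\Phi_{0,\mu}^{(N)}\Bigr\rangle \\
&= -\frac{m_e^2}{\hbar eBN}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl\langle \Phi_{0,\mu}^{(N)}, v_{{\rm tot},s}^{(0)} e^{i\hat\theta}\frac{1 - Q(E_0^{(N)})}{(E_{0,\mu}^{(N)} - H_0^{(N)} + i\hbar\eta)^2}(E_{0,\mu}^{(N)} - H_0^{(N)})^2 v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\Bigr\rangle ,
\end{aligned}
\tag{4.20}
\]
where we have written $\hat\theta = (E_{0,\mu}^{(N)} - H_0^{(N)})(t + T)/\hbar$. Let $\Phi$ be a vector in the domain of $v_{{\rm tot},x}^{(0)}$. Then one has
\[
\begin{aligned}
\bigl\langle \Phi, [1 - Q(E_0^{(N)})]v_{{\rm tot},y}^{(0)}\Phi_{0,\mu}^{(N)}\bigr\rangle
&= -\frac{im_e}{\hbar eB}\lim_{n\to\infty}\bigl\langle \Phi, [1 - Q(E_0^{(N)})][H_0^{(N)}, v_{{\rm tot},x}^{(0)}]\Psi_\mu^{(n)}\bigr\rangle \\
&= -\frac{im_e}{\hbar eB}\lim_{n\to\infty}\bigl\langle \Phi, [1 - Q(E_0^{(N)})](H_0^{(N)} - E_{0,\mu}^{(N)})v_{{\rm tot},x}^{(0)}\Psi_\mu^{(n)}\bigr\rangle ,
\end{aligned}
\tag{4.21}
\]
where $\Psi_\mu^{(n)}$ is the approximate vector which was introduced for proving (4.8), and we have used the commutation relation (4.6). Substituting this into the expression (4.20) of $A_1$ and then applying the Schwarz inequality, we obtain
\[
|A_1| \le \frac{m_e^3}{(\hbar eB)^2N}\max_{s}\lim_{n\to\infty}\frac{1}{q}\sum_{\mu=1}^{q}\bigl\langle v_{{\rm tot},s}^{(0)}\Psi_\mu^{(n)}, [1 - Q(E_0^{(N)})](H_0^{(N)} - E_{0,\mu}^{(N)})v_{{\rm tot},s}^{(0)}\Psi_\mu^{(n)}\bigr\rangle .
\tag{4.22}
\]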
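Note that, for a symmetric operator $A$, one has formally
\[
\begin{aligned}
\sum_{\mu=1}^{q}\bigl\langle A\Phi_{0,\mu}^{(N)}, [1 - Q(E_0^{(N)})](H_0^{(N)} - E_{0,\mu}^{(N)})A\Phi_{0,\mu}^{(N)}\bigr\rangle
&= \sum_{\mu=1}^{q}\bigl\langle A\Phi_{0,\mu}^{(N)}, (H_0^{(N)} - E_{0,\mu}^{(N)})A\Phi_{0,\mu}^{(N)}\bigr\rangle \\
&= \frac{1}{2}\sum_{\mu=1}^{q}\Bigl[\bigl\langle A\Phi_{0,\mu}^{(N)}, (H_0^{(N)} - E_{0,\mu}^{(N)})A\Phi_{0,\mu}^{(N)}\bigr\rangle + \bigl\langle (H_0^{(N)} - E_{0,\mu}^{(N)})A\Phi_{0,\mu}^{(N)}, A\Phi_{0,\mu}^{(N)}\bigr\rangle\Bigr] \\
&= \frac{1}{2}\sum_{\mu=1}^{q}\Bigl[\bigl\langle A\Phi_{0,\mu}^{(N)}, [H_0^{(N)}, A]\Phi_{0,\mu}^{(N)}\bigr\rangle + \bigl\langle [H_0^{(N)}, A]\Phi_{0,\mu}^{(N)}, A\Phi_{0,\mu}^{(N)}\bigr\rangle\Bigr] \\
&= \frac{1}{2}\sum_{\mu=1}^{q}\bigl\langle \Phi_{0,\mu}^{(N)}, [A, [H_0^{(N)}, A]]\Phi_{0,\mu}^{(N)}\bigr\rangle ,
\end{aligned}
\tag{4.23}
\]
where we have used
\[
\sum_{\mu=1}^{q}\sum_{\mu'=1}^{q}\bigl\langle A\Phi_{0,\mu}^{(N)}, \Phi_{0,\mu'}^{(N)}\bigr\rangle(E_{0,\mu'}^{(N)} - E_{0,\mu}^{(N)})\bigl\langle \Phi_{0,\mu'}^{(N)}, A\Phi_{0,\mu}^{(N)}\bigr\rangle = 0 .
\tag{4.24}
\]
We can justify this formal identity (4.23) for $A = v_{{\rm tot},s}^{(0)}$ by using the approximate vector $\Psi_\mu^{(n)}$. Combining this observation, the commutation relations (4.5)–(4.7), and the bound (4.22), we obtain $|A_1| \le 1/2$.

In the case with $B = 0$, one has
\[
\sigma_{xy} = \delta\sigma_{xy}(t;\eta,T) = \delta\sigma_{yy}(t;\eta,T) = 0
\tag{4.25}
\]
and
\[
\gamma_{xy} = 0 , \qquad \gamma_{yy} = \frac{Ne^2}{L_xL_ym_e}
\tag{4.26}
\]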
under the same assumptions as in the above. Clearly these imply that the total velocity of the electrons is proportional to the time $t$. The derivation is not hard: from the assumptions, we have
\[
\bigl[v_{{\rm tot},s}^{(0)}, H_0^{(N)}\bigr] = 0 \quad \text{for } s = x, y , \quad \text{and} \quad \bigl[v_{{\rm tot},x}^{(0)}, v_{{\rm tot},y}^{(0)}\bigr] = 0 .
\tag{4.27}
\]
All of these operators commute with each other. This implies that all of the matrix elements in the expressions (3.37), (3.38), (3.43) and (4.11) of the coefficients vanish. As a result, we get the desired results.

5. The Linear Response Coefficients Averaged over the Gauge Parameters

In this section, we treat the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ of (2.29) and the averaged acceleration coefficients $\overline{\gamma_{sy}(\phi)}$ of (2.30). As is well known, the "topological" argument [5, 6, 8, 11, 14, 19, 20, 26, 27] yields the integral and fractional quantization of the Hall conductance on the assumption of the excitation energy gap above the ground state. Following the argument, the fractional quantization (2.34) of the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ will be proved in the most general setting of the present paper. In addition, we will prove that the averaged acceleration coefficients are exactly vanishing, i.e. $\overline{\gamma_{sy}(\phi)} = 0$ for $s = x, y$. Thus we will give the proof of Theorem 2.2 in Sec. 5.1 below. In Sec. 5.2, we will also discuss the geometric property of the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ as a geometric invariant for non-Abelian gauge fields on the gauge torus $T_g$. In particular, the integer $I$ of the fractional quantized Hall conductance (2.31) is equal to an index of a Pauli–Dirac operator coupled to the gauge fields. In other words, the Hall conductance of the interacting electron gas is closely related to the ground state property of a single-electron system coupled to the gauge fields.

The Hamiltonian of the quantum Hall system without the external electric field is given by
\[
H_0^{(N)}(\phi) = \sum_{j=1}^{N}\Bigl\{\frac{1}{2m_e}[p_j + eA(r_j) + \phi]^2 + W(r_j)\Bigr\} + \sum_{1\le i<j\le N} W^{(2)}(r_i - r_j) .
\tag{5.1}
\]
Since the gauge parameters $\phi = (\phi_x, \phi_y)$ play an important role in the following proof, we write the parameter dependence explicitly as $\Phi_{0,\mu}^{(N)} = \Phi_{0,\mu}^{(N)}(\phi)$ and $E_{0,\mu}^{(N)} = E_{0,\mu}^{(N)}(\phi)$ for the (quasi)degenerate ground state, and $\Phi_{n,\mu}^{(N)} = \Phi_{n,\mu}^{(N)}(\phi)$ and $E_n^{(N)} = E_n^{(N)}(\phi)$ for the excited states throughout this section. We also write the velocity operator as
\[
v_{\rm tot}^{(0)}(\phi) = \sum_{j=1}^{N}\frac{1}{m_e}[p_j + eA(r_j) + \phi] .
\tag{5.2}
\]
Further, throughout this section, we require Assumption 2.1 for the proof of Theorem 2.2.
5.1. Proof of Theorem 2.2

To begin with, we prepare some tools to rewrite the expressions of $\sigma_{xy}(\phi)$ and $\gamma_{sy}(\phi)$. The derivative of the projection $Q(E_0^{(N)}(\phi))$ onto the sector of the ground state becomes
\[
\begin{aligned}
Q_i(E_0^{(N)}(\phi)) := \frac{\partial}{\partial\phi_i}Q(E_0^{(N)}(\phi))
&= \frac{\partial}{\partial\phi_i}\frac{1}{2\pi i}\int_\Gamma dz\, \frac{1}{z - H_0^{(N)}(\phi)} \\
&= \frac{1}{2\pi i}\int_\Gamma dz\, \frac{1}{z - H_0^{(N)}(\phi)}\, v_{{\rm tot},i}^{(0)}(\phi)\, \frac{1}{z - H_0^{(N)}(\phi)} ,
\end{aligned}
\tag{5.3}
\]
where we have used the integral representation of the projection
\[
Q(E_0^{(N)}(\phi)) = \frac{1}{2\pi i}\int_\Gamma dz\, \frac{1}{z - H_0^{(N)}(\phi)}
\tag{5.4}
\]
with the resolvent $[z - H_0^{(N)}(\phi)]^{-1}$. Here the closed path $\Gamma$ encircles all of the ground state energy eigenvalues $E_{0,\mu}^{(N)}(\phi)$ which are isolated from the rest of the spectrum. We can take the path $\Gamma$ to be independent of $\phi$ because of Assumption 2.1. Using the operator $Q_i(E_0^{(N)}(\phi))$, the Hall conductance (3.37) can be rewritten as
\[
\sigma_{xy}(\phi) = -\frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\,{\rm Tr}\, Q(E_0^{(N)}(\phi))\bigl[Q_x(E_0^{(N)}(\phi)), Q_y(E_0^{(N)}(\phi))\bigr] ,
\tag{5.5}
\]
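and for $\gamma_{xy}$ of (3.38) and $\gamma_{yy}$ of (3.43), we have
\[
\gamma_{xy}(\phi) = \frac{e^2}{L_xL_y}\frac{1}{q}\,{\rm Tr}\, v_{{\rm tot},x}^{(0)}(\phi)\bigl[Q_y(E_0^{(N)}(\phi))Q(E_0^{(N)}(\phi)) + Q(E_0^{(N)}(\phi))Q_y(E_0^{(N)}(\phi))\bigr]
\tag{5.6}
\]
and
\[
\gamma_{yy}(\phi) = \frac{e^2}{L_xL_y}\Bigl\{\frac{N}{m_e} + \frac{1}{q}\,{\rm Tr}\, v_{{\rm tot},y}^{(0)}(\phi)\bigl[Q_y(E_0^{(N)}(\phi))Q(E_0^{(N)}(\phi)) + Q(E_0^{(N)}(\phi))Q_y(E_0^{(N)}(\phi))\bigr]\Bigr\} ,
\tag{5.7}
\]
where Tr stands for the trace on the Hilbert space. The expression of the Hall conductance with the use of the derivative of a projection was used in [14, 20, 27]. Further $\gamma_{sy}(\phi)$ can be rewritten as
\[
\gamma_{sy}(\phi) = \frac{e^2}{L_xL_y}\frac{1}{q}\frac{\partial}{\partial\phi_y}\,{\rm Tr}\, v_{{\rm tot},s}^{(0)}(\phi)Q(E_0^{(N)}(\phi)) \quad \text{for } s = x, y ,
\tag{5.8}
\]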
where we have used the identity,
\[
Q_s(E_0^{(N)}(\phi)) = Q_s(E_0^{(N)}(\phi))Q(E_0^{(N)}(\phi)) + Q(E_0^{(N)}(\phi))Q_s(E_0^{(N)}(\phi)) \quad \text{for } s = x, y ,
\tag{5.9}
\]
which is derived by differentiation of $Q(E_0^{(N)}(\phi)) = Q(E_0^{(N)}(\phi))^2$. We also note that
\[
Q(E_0^{(N)}(\phi))Q_s(E_0^{(N)}(\phi))Q(E_0^{(N)}(\phi)) = 0 \quad \text{for } s = x, y ,
\tag{5.10}
\]
which is easily derived from (5.9).

Proposition 5.1. Under Assumption 2.1, there exist orthonormal vectors $\hat\Phi_{0,\mu}^{(N)}(\phi)$, $\mu = 1, 2, \ldots, q$ such that the sector of the (quasi)degenerate ground state is spanned by the vectors $\hat\Phi_{0,\mu}^{(N)}(\phi)$, $\mu = 1, 2, \ldots, q$, and that all the vectors $\hat\Phi_{0,\mu}^{(N)}(\phi)$, $\mu = 1, 2, \ldots, q$, are infinitely differentiable with respect to the gauge parameters $\phi$ on the gauge torus $T_g$.

This proposition is essentially due to T. Kato. But, in his book [47], he treated only the case with a single variable. For the reader's convenience, we give the proof of Proposition 5.1 in Appendix A along his line although the extension to two variables is not so difficult. In terms of these vectors $\hat\Phi_{0,\mu}^{(N)}(\phi)$, the acceleration coefficients $\gamma_{sy}(\phi)$ of (5.8) are written as
\[
\gamma_{sy}(\phi) = \frac{e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\frac{\partial}{\partial\phi_y}\bigl\langle \hat\Phi_{0,\mu}^{(N)}(\phi), v_{{\rm tot},s}^{(0)}(\phi)\hat\Phi_{0,\mu}^{(N)}(\phi)\bigr\rangle \quad \text{for } s = x, y .
\tag{5.11}
\]
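The two projection identities (5.9) and (5.10) hold for any smooth family of projections, which can be illustrated in finite dimensions. The sketch below is a toy check only: the matrices `h0` and `v`, the one-parameter family `h_of`, and the use of a central finite difference in place of the exact derivative are all assumptions made for illustration.

```python
import numpy as np

def ground_projector(h, q):
    """Projector onto the span of the q lowest eigenvectors of a Hermitian matrix h."""
    _, vecs = np.linalg.eigh(h)
    low = vecs[:, :q]
    return low @ low.conj().T

rng = np.random.default_rng(1)
n, q = 6, 2
h0 = np.diag([0.0, 0.1, 2.0, 3.0, 4.0, 5.0])   # two quasi-degenerate low levels, then a gap
a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
v = (a + a.conj().T) / 2
v /= np.linalg.norm(v, 2)                       # spectral norm 1, so the gap stays open

def h_of(phi):
    """Toy one-parameter family standing in for H_0(phi)."""
    return h0 + phi * v

phi, step = 0.1, 1e-4
proj = ground_projector(h_of(phi), q)
# Central finite difference as a stand-in for Q_s = dQ/dphi.
dproj = (ground_projector(h_of(phi + step), q)
         - ground_projector(h_of(phi - step), q)) / (2 * step)

err_59 = np.linalg.norm(dproj - dproj @ proj - proj @ dproj)   # identity (5.9)
err_510 = np.linalg.norm(proj @ dproj @ proj)                  # identity (5.10)
print(err_59, err_510)
```

Both residuals are limited only by the finite-difference error, which is of order `step**2` here.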
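Next let us rewrite the Hall conductance $\sigma_{xy}(\phi)$ of (5.5) in terms of the vectors $\hat\Phi_{0,\mu}^{(N)}(\phi)$. Note that
\[
\frac{\partial}{\partial\phi_i}\hat\Phi_{0,\mu}^{(N)}(\phi) = \frac{\partial}{\partial\phi_i}\bigl[Q(E_0^{(N)}(\phi))\hat\Phi_{0,\mu}^{(N)}(\phi)\bigr] = Q_i(E_0^{(N)}(\phi))\hat\Phi_{0,\mu}^{(N)}(\phi) + Q(E_0^{(N)}(\phi))\frac{\partial}{\partial\phi_i}\hat\Phi_{0,\mu}^{(N)}(\phi) .
\tag{5.12}
\]
Therefore one has
\[
\bigl[1 - Q(E_0^{(N)}(\phi))\bigr]\frac{\partial}{\partial\phi_i}\hat\Phi_{0,\mu}^{(N)}(\phi) = Q_i(E_0^{(N)}(\phi))\hat\Phi_{0,\mu}^{(N)}(\phi) .
\tag{5.13}
\]
Using this identity (5.13), the Hall conductance $\sigma_{xy}(\phi)$ of (5.5) can be written as
\[
\begin{aligned}
\sigma_{xy}(\phi) &= -\frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl[\Bigl\langle \frac{\partial}{\partial\phi_x}\hat\Phi_{0,\mu}^{(N)}(\phi), \bigl[1 - Q(E_0^{(N)}(\phi))\bigr]\frac{\partial}{\partial\phi_y}\hat\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle - (x \leftrightarrow y)\Bigr] \\
&= -\frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl[\Bigl\langle \frac{\partial}{\partial\phi_x}\hat\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_y}\hat\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle - (x \leftrightarrow y)\Bigr] \\
&= -\frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl[\frac{\partial}{\partial\phi_x}\Bigl\langle \hat\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_y}\hat\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle - (x \leftrightarrow y)\Bigr] ,
\end{aligned}
\tag{5.14}
\]
where we have used the identity
\[
\Bigl\langle \frac{\partial}{\partial\phi_y}\hat\Phi_{0,\mu}^{(N)}(\phi), \hat\Phi_{0,\mu'}^{(N)}(\phi)\Bigr\rangle + \Bigl\langle \hat\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_y}\hat\Phi_{0,\mu'}^{(N)}(\phi)\Bigr\rangle = \frac{\partial}{\partial\phi_y}\bigl\langle \hat\Phi_{0,\mu}^{(N)}(\phi), \hat\Phi_{0,\mu'}^{(N)}(\phi)\bigr\rangle = \frac{\partial}{\partial\phi_y}\delta_{\mu,\mu'} = 0
\tag{5.15}
\]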
for getting the second equality. The Hall conductance expressed in terms of the derivative of wavefunctions was first introduced in [5]. Following Kunz [11], we shall introduce a gauge transformation as
\[
\tilde\Phi_{0,\mu}^{(N)}(\phi) = G^{(N)}(\phi)\hat\Phi_{0,\mu}^{(N)}(\phi)
\tag{5.16}
\]
with
\[
G^{(N)}(\phi) = \prod_{j=1}^{N}\exp\Bigl[\frac{i}{\hbar}(x_j\phi_x + y_j\phi_y)\Bigr] .
\tag{5.17}
\]
Then the Hamiltonian is transformed as
\[
\tilde H_0^{(N)}(\phi) := G^{(N)}(\phi)H_0^{(N)}(\phi)\bigl[G^{(N)}(\phi)\bigr]^{-1} = \sum_{j=1}^{N}\Bigl\{\frac{1}{2m_e}[p_j + eA(r_j)]^2 + W(r_j)\Bigr\} + \sum_{1\le i<j\le N} W^{(2)}(r_i - r_j) .
\tag{5.18}
\]
The expression of the right-hand side does not include the gauge parameters $\phi$ explicitly but the Hamiltonian indeed depends on $\phi$ through the boundary conditions. Namely the boundary conditions are twisted with the angles $\phi$. Here we stress that the boundary condition in the $s$ direction becomes periodic for the special values $\phi_s = m\Delta\phi_s$ of the gauge parameter with an integer $m$ for $s = x, y$, where $\Delta\phi_s = 2\pi\hbar/L_s$ are given in (2.5). Clearly the sector of the ground state has this periodicity from Assumption 2.1 on the ground state. Therefore the ground state vectors $\tilde\Phi_{0,\mu}^{(N)}(\phi)$ must satisfy the relations,
\[
\tilde\Phi_{0,\mu}^{(N)}(\Delta\phi_x, \phi_y) = \sum_{\mu'=1}^{q} C_{\mu,\mu'}^{(x)}(\phi_y)\tilde\Phi_{0,\mu'}^{(N)}(0, \phi_y)
\tag{5.19}
\]
and
\[
\tilde\Phi_{0,\mu}^{(N)}(\phi_x, \Delta\phi_y) = \sum_{\mu'=1}^{q} C_{\mu,\mu'}^{(y)}(\phi_x)\tilde\Phi_{0,\mu'}^{(N)}(\phi_x, 0) ,
\tag{5.20}
\]
where $C^{(x)}(\phi_y)$ and $C^{(y)}(\phi_x)$ are $q \times q$ unitary matrices depending on $\phi_y$ and $\phi_x$, respectively.

In terms of the vectors $\tilde\Phi_{0,\mu}^{(N)}(\phi)$, the acceleration coefficients $\gamma_{sy}(\phi)$ of (5.11) can be written as
\[
\gamma_{sy}(\phi) = \frac{e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\frac{\partial}{\partial\phi_y}\bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi), v_{{\rm tot},s}^{(0)}(0)\tilde\Phi_{0,\mu}^{(N)}(\phi)\bigr\rangle \quad \text{for } s = x, y ,
\tag{5.21}
\]
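where $v_{{\rm tot},s}^{(0)}(0) = \sum_{j=1}^{N}[p_{s,j} + eA_s(r_j)]/m_e$. Combining this with the relation (5.20), the averaged acceleration coefficients $\overline{\gamma_{sy}(\phi)}$ of (2.30) vanish as
\[
\overline{\gamma_{sy}(\phi)} = \frac{1}{\Delta\phi_x\Delta\phi_y}\frac{e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\int_0^{\Delta\phi_x} d\phi_x\Bigl[\bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi_x, \Delta\phi_y), v_{{\rm tot},s}^{(0)}(0)\tilde\Phi_{0,\mu}^{(N)}(\phi_x, \Delta\phi_y)\bigr\rangle - \bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi_x, 0), v_{{\rm tot},s}^{(0)}(0)\tilde\Phi_{0,\mu}^{(N)}(\phi_x, 0)\bigr\rangle\Bigr] = 0
\tag{5.22}
\]
for $s = x, y$. Here we have used the unitarity of $C^{(y)}(\phi_x)$. Similarly the Hall conductance $\sigma_{xy}(\phi)$ of (5.14) becomes
\[
\begin{aligned}
\sigma_{xy}(\phi) = -\frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\Biggl[&\Bigl\langle \frac{\partial}{\partial\phi_x}\tilde\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_y}\tilde\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle - \Bigl\langle \frac{\partial}{\partial\phi_y}\tilde\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_x}\tilde\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle \\
&- \frac{i}{\hbar}\frac{\partial}{\partial\phi_x}\Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi), \sum_{i=1}^{N} y_i\,\tilde\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle + \frac{i}{\hbar}\frac{\partial}{\partial\phi_y}\Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi), \sum_{j=1}^{N} x_j\,\tilde\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle\Biggr] .
\end{aligned}
\tag{5.23}
\]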
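By averaging over the gauge parameters $\phi$, $\overline{\sigma_{xy}(\phi)}$ of (2.29) is written as
\[
\overline{\sigma_{xy}(\phi)} = \frac{e^2}{h}\frac{1}{2\pi i}\frac{1}{q}\sum_{\mu=1}^{q}\int_{T_g} d\phi_x d\phi_y\Bigl[\Bigl\langle \frac{\partial}{\partial\phi_x}\tilde\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_y}\tilde\Phi_{0,\mu}^{(N)}(\phi)\Bigr\rangle - (x \leftrightarrow y)\Bigr] ,
\tag{5.24}
\]
where we have used the relations (5.19) and (5.20). Thus one gets the geometrically invariant form of the Hall conductance. To see this more explicitly, we write
\[
A_{\mu,\mu',s}(\phi) = \Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial}{\partial\phi_s}\tilde\Phi_{0,\mu'}^{(N)}(\phi)\Bigr\rangle \quad \text{for } s = x, y ,
\tag{5.25}
\]
and
\[
F(\phi) = \frac{\partial}{\partial\phi_x}A_y(\phi) - \frac{\partial}{\partial\phi_y}A_x(\phi) + [A_x(\phi), A_y(\phi)] ,
\tag{5.26}
\]
which are, respectively, the connections and the curvature in the language of differential geometry [39]. In the language of the corresponding non-Abelian gauge theory [42], these correspond to the gauge fields and the field strength tensor. The connections $A_s(\phi)$ are the $q \times q$ matrices with the matrix elements $A_{\mu,\mu',s}(\phi)$. Then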
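the geometric invariant on the gauge torus$^{l}$ $T_g$ is given by
\[
I = \frac{1}{2\pi i}\int_{T_g} d\phi_x d\phi_y\, {\rm tr}\, F(\phi) ,
\tag{5.27}
\]
where tr is the trace of $q \times q$ matrix. Since ${\rm tr}[A_x(\phi), A_y(\phi)] = 0$, one has
\[
\overline{\sigma_{xy}(\phi)} = \frac{e^2}{h}\frac{I}{q} .
\tag{5.28}
\]
In passing, we remark that the connection $A_s(\phi)$ is not necessarily periodic at the boundaries of the rectangular region $T_g$. If the connection $A_s(\phi)$ satisfies the periodic boundary conditions, then the quantity $I$ vanishes, i.e. the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ becomes zero.

Following the way of computing a geometric invariant in [6, 11], we shall show that the geometric invariant $I$ takes an integer value, i.e. the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ is quantized to a rational number $p/q$. For this purpose, we rewrite the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ of (5.24) as
\[
\begin{aligned}
\overline{\sigma_{xy}(\phi)} &= \frac{e^2}{h}\frac{1}{2\pi i}\frac{1}{q}\sum_{\mu=1}^{q}\int_0^{\Delta\phi_y} d\phi_y\Bigl[\Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\Delta\phi_x, \phi_y), \frac{\partial}{\partial\phi_y}\tilde\Phi_{0,\mu}^{(N)}(\Delta\phi_x, \phi_y)\Bigr\rangle - \Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(0, \phi_y), \frac{\partial}{\partial\phi_y}\tilde\Phi_{0,\mu}^{(N)}(0, \phi_y)\Bigr\rangle\Bigr] \\
&\quad - \frac{e^2}{h}\frac{1}{2\pi i}\frac{1}{q}\sum_{\mu=1}^{q}\int_0^{\Delta\phi_x} d\phi_x\Bigl[\Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi_x, \Delta\phi_y), \frac{\partial}{\partial\phi_x}\tilde\Phi_{0,\mu}^{(N)}(\phi_x, \Delta\phi_y)\Bigr\rangle - \Bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi_x, 0), \frac{\partial}{\partial\phi_x}\tilde\Phi_{0,\mu}^{(N)}(\phi_x, 0)\Bigr\rangle\Bigr] .
\end{aligned}
\tag{5.29}
\]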
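In order to compute this right-hand side, we introduce $q \times q$ matrices $\theta^{(x)}(\phi_y)$ and $\theta^{(y)}(\phi_x)$ as
\[
C^{(x)}(\phi_y) = \exp\bigl[i\theta^{(x)}(\phi_y)\bigr] \quad \text{and} \quad C^{(y)}(\phi_x) = \exp\bigl[i\theta^{(y)}(\phi_x)\bigr] .
\tag{5.30}
\]
Using these and the relations (5.19) and (5.20) again, one has
\[
\begin{aligned}
\overline{\sigma_{xy}(\phi)} &= \frac{e^2}{h}\frac{1}{2\pi i}\frac{1}{q}\sum_{\mu=1}^{q}\Bigl[\int_0^{\Delta\phi_y} d\phi_y\Bigl(\frac{\partial}{\partial\phi_y}i\theta^{(x)}(\phi_y)\Bigr)_{\mu,\mu} - \int_0^{\Delta\phi_x} d\phi_x\Bigl(\frac{\partial}{\partial\phi_x}i\theta^{(y)}(\phi_x)\Bigr)_{\mu,\mu}\Bigr] \\
&= \frac{e^2}{2\pi h}\frac{1}{q}\sum_{\mu=1}^{q}\bigl[\theta_{\mu,\mu}^{(x)}(\Delta\phi_y) - \theta_{\mu,\mu}^{(x)}(0) - \theta_{\mu,\mu}^{(y)}(\Delta\phi_x) + \theta_{\mu,\mu}^{(y)}(0)\bigr] \\
&= \frac{e^2}{2\pi h}\frac{1}{q}\,{\rm tr}\bigl[\theta^{(x)}(\Delta\phi_y) - \theta^{(x)}(0) - \theta^{(y)}(\Delta\phi_x) + \theta^{(y)}(0)\bigr] ,
\end{aligned}
\tag{5.31}
\]
where we have used
\[
\bigl\langle \tilde\Phi_{0,\mu}^{(N)}(\phi), \tilde\Phi_{0,\mu'}^{(N)}(\phi)\bigr\rangle = \delta_{\mu,\mu'} .
\tag{5.32}
\]

$^{l}$ It is necessary to impose the periodic boundary conditions on the rectangular region $T_g$ for identifying it as a torus.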
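On the other hand, from the relations (5.19), (5.20) and (5.30), one has
\[
\tilde\Phi_0^{(N)}(\Delta\phi_x, \Delta\phi_y) = \exp\bigl[i\theta^{(x)}(\Delta\phi_y)\bigr]\tilde\Phi_0^{(N)}(0, \Delta\phi_y) ,
\tag{5.33}
\]
\[
\tilde\Phi_0^{(N)}(\Delta\phi_x, 0) = \exp\bigl[i\theta^{(x)}(0)\bigr]\tilde\Phi_0^{(N)}(0, 0) ,
\tag{5.34}
\]
\[
\tilde\Phi_0^{(N)}(\Delta\phi_x, \Delta\phi_y) = \exp\bigl[i\theta^{(y)}(\Delta\phi_x)\bigr]\tilde\Phi_0^{(N)}(\Delta\phi_x, 0) ,
\tag{5.35}
\]
and
\[
\tilde\Phi_0^{(N)}(0, \Delta\phi_y) = \exp\bigl[i\theta^{(y)}(0)\bigr]\tilde\Phi_0^{(N)}(0, 0) ,
\tag{5.36}
\]
where $\tilde\Phi_0^{(N)}(\phi_x, \phi_y)$ is the $q$ component vector whose $\mu$th component is $\tilde\Phi_{0,\mu}^{(N)}(\phi_x, \phi_y)$. These four equations yield
\[
\exp\bigl[i\theta^{(x)}(\Delta\phi_y)\bigr]\exp\bigl[-i\theta^{(x)}(0)\bigr]\exp\bigl[-i\theta^{(y)}(\Delta\phi_x)\bigr]\exp\bigl[i\theta^{(y)}(0)\bigr] = 1 ,
\tag{5.37}
\]
where we have used the relation (5.32). Taking the determinant of both sides of this equation and using $\det\exp[i\theta^{(s)}(\cdots)] = \exp[i\,{\rm tr}\,\theta^{(s)}(\cdots)]$ for $s = x, y$, one has
\[
{\rm tr}\bigl[\theta^{(x)}(\Delta\phi_y) - \theta^{(x)}(0) - \theta^{(y)}(\Delta\phi_x) + \theta^{(y)}(0)\bigr] = -2\pi p \quad \text{with an integer } p .
\tag{5.38}
\]
Owing to this relation, the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ of (5.31) must satisfy
\[
\overline{\sigma_{xy}(\phi)} = -\frac{e^2}{h}\frac{p}{q} \quad \text{with an integer } p .
\tag{5.39}
\]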
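The determinant step leading from (5.37) to (5.38) can be illustrated with small random matrices. The following sketch is not taken from the paper: the Hermitian matrices standing in for $\theta^{(x)}(\Delta\phi_y)$, $\theta^{(x)}(0)$ and $\theta^{(y)}(\Delta\phi_x)$ are random, and the fourth unitary is chosen so that the product condition (5.37) holds by construction.

```python
import numpy as np

def exp_i(theta):
    """exp(i * theta) for a Hermitian matrix theta, via its eigendecomposition."""
    w, vecs = np.linalg.eigh(theta)
    return (vecs * np.exp(1j * w)) @ vecs.conj().T

def random_hermitian(rng, q):
    a = rng.normal(size=(q, q)) + 1j * rng.normal(size=(q, q))
    return (a + a.conj().T) / 2

rng = np.random.default_rng(2)
q = 3
theta_x_top = random_hermitian(rng, q)     # stands in for theta^(x)(Delta phi_y)
theta_x_zero = random_hermitian(rng, q)    # stands in for theta^(x)(0)
theta_y_right = random_hermitian(rng, q)   # stands in for theta^(y)(Delta phi_x)

# det exp(i theta) = exp(i tr theta)
det_gap = abs(np.linalg.det(exp_i(theta_x_top))
              - np.exp(1j * np.trace(theta_x_top).real))

# Choose exp(i theta^(y)(0)) so that the four-fold product in (5.37) equals 1.
w_unitary = np.linalg.inv(exp_i(theta_x_top) @ exp_i(-theta_x_zero) @ exp_i(-theta_y_right))
eigen_phases = np.angle(np.linalg.eigvals(w_unitary))   # one choice of tr theta^(y)(0)

trace_sum = (np.trace(theta_x_top).real - np.trace(theta_x_zero).real
             - np.trace(theta_y_right).real + eigen_phases.sum())
p = -trace_sum / (2 * np.pi)   # should be an integer, as in (5.38)
print(det_gap, p)
```

Whatever branch is chosen for the logarithm of the fourth unitary, the combination of traces is forced to be an integer multiple of $2\pi$.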
5.2. Fractional quantization and Atiyah–Singer index theorem

In this subsection, we will show that the integer $-p$ of the fractional quantization (5.39) of the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ is equal to an index of a Pauli–Dirac operator $D$ with the gauge field $A(\phi) = (A_x(\phi), A_y(\phi))$ on the gauge torus $T_g$. This is nothing but a special case of the Atiyah–Singer index theorem. Namely the first Chern number $I$ of (5.27) is equal to the index, ${\rm ind}\, D$, of the Pauli–Dirac operator as we will see in Theorem 5.5 below. The corresponding system described by the Hamiltonian $H = -D^2$ is equivalent to a single-electron system with spin-1/2 and with $q$ flavors in the gauge field $A(\phi)$ on the torus $T_g$, and the index, $-p = {\rm ind}\, D$, is equal to the difference in the degeneracy between the up-spin and the down-spin ground states. Thus the integer $p$ of the fractionally quantized Hall conductance of the interacting electrons is closely related to the ground state property of the single electron coupled to the gauge field $A(\phi)$ determined by the ground state of the original interacting electron system.

Throughout the present subsection, for simplicity, we will write $\tilde\Phi_\mu(\phi) = \tilde\Phi_{0,\mu}^{(N)}(\phi)$, $\mu = 1, 2, \ldots, q$, for the ground state wavefunctions, by dropping the subscript 0 and the superscript $(N)$. Then the matrix elements of the gauge field $A(\phi) = (A_x(\phi), A_y(\phi))$ are written in terms of $\tilde\Phi_\mu(\phi)$ as
\[
A_{\mu,\nu,s}(\phi) = \bigl\langle \tilde\Phi_\mu(\phi), \partial_s\tilde\Phi_\nu(\phi)\bigr\rangle \quad \text{for } s = x, y ,
\tag{5.40}
\]
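where we have written
\[
\partial_s = \frac{\partial}{\partial\phi_s} \quad \text{for } s = x, y .
\tag{5.41}
\]
We define gauge transformations as
\[
G_{\mu,\nu}^{(x)}(\phi_x, \phi_y) := \bigl\langle \tilde\Phi_\mu(\phi_x + \Delta\phi_x, \phi_y), \tilde\Phi_\nu(\phi_x, \phi_y)\bigr\rangle \quad \text{for small } \phi_x
\tag{5.42}
\]
and
\[
G_{\mu,\nu}^{(y)}(\phi_x, \phi_y) := \bigl\langle \tilde\Phi_\mu(\phi_x, \phi_y + \Delta\phi_y), \tilde\Phi_\nu(\phi_x, \phi_y)\bigr\rangle \quad \text{for small } \phi_y .
\tag{5.43}
\]
These are $q \times q$ unitary matrices. Actually,
\[
\begin{aligned}
\sum_{\alpha=1}^{q} G_{\mu,\alpha}^{(x)}(\phi)\, G_{\nu,\alpha}^{(x)}(\phi)^*
&= \sum_{\alpha=1}^{q}\bigl\langle \tilde\Phi_\mu(\phi_x + \Delta\phi_x, \phi_y), \tilde\Phi_\alpha(\phi)\bigr\rangle\bigl\langle \tilde\Phi_\alpha(\phi), \tilde\Phi_\nu(\phi_x + \Delta\phi_x, \phi_y)\bigr\rangle \\
&= \bigl\langle \tilde\Phi_\mu(\phi_x + \Delta\phi_x, \phi_y), \tilde\Phi_\nu(\phi_x + \Delta\phi_x, \phi_y)\bigr\rangle = \delta_{\mu,\nu} .
\end{aligned}
\tag{5.44}
\]
Similarly, one has
\[
\sum_{\alpha=1}^{q} G_{\mu,\alpha}^{(y)}(\phi)\, G_{\nu,\alpha}^{(y)}(\phi)^* = \delta_{\mu,\nu} .
\tag{5.45}
\]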
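Lemma 5.2. The gauge field $A(\phi) = (A_x(\phi), A_y(\phi))$ satisfies
\[
A_s(\phi_x + \Delta\phi_x, \phi_y) = G^{(x)}(\phi)A_s(\phi)G^{(x)}(\phi)^* + G^{(x)}(\phi)\partial_sG^{(x)}(\phi)^* \quad \text{for small } \phi_x
\tag{5.46}
\]
and
\[
A_s(\phi_x, \phi_y + \Delta\phi_y) = G^{(y)}(\phi)A_s(\phi)G^{(y)}(\phi)^* + G^{(y)}(\phi)\partial_sG^{(y)}(\phi)^* \quad \text{for small } \phi_y .
\tag{5.47}
\]

Proof. From the definitions (5.40), (5.42) and (5.43) for the gauge field $A(\phi)$ and the gauge transformations $G^{(s)}(\phi)$ for $s = x, y$, we have
\[
\begin{aligned}
A_{\mu,\nu,s}(\phi_x + \Delta\phi_x, \phi_y)
&= \bigl\langle \tilde\Phi_\mu(\phi_x + \Delta\phi_x, \phi_y), \partial_s\tilde\Phi_\nu(\phi_x + \Delta\phi_x, \phi_y)\bigr\rangle \\
&= \sum_{\alpha,\beta=1}^{q}\bigl\langle \tilde\Phi_\mu(\phi_x + \Delta\phi_x, \phi_y), \tilde\Phi_\alpha(\phi)\bigr\rangle\Bigl\langle \tilde\Phi_\alpha(\phi), \partial_s\Bigl[\tilde\Phi_\beta(\phi)\bigl\langle \tilde\Phi_\beta(\phi), \tilde\Phi_\nu(\phi_x + \Delta\phi_x, \phi_y)\bigr\rangle\Bigr]\Bigr\rangle \\
&= \sum_{\alpha,\beta}\Bigl[G_{\mu,\alpha}^{(x)}(\phi)A_{\alpha,\beta,s}(\phi)G_{\nu,\beta}^{(x)}(\phi)^* + G_{\mu,\alpha}^{(x)}(\phi)\delta_{\alpha,\beta}\,\partial_sG_{\nu,\beta}^{(x)}(\phi)^*\Bigr] \\
&= \bigl[G^{(x)}(\phi)A_s(\phi)G^{(x)}(\phi)^*\bigr]_{\mu,\nu} + \bigl[G^{(x)}(\phi)\partial_sG^{(x)}(\phi)^*\bigr]_{\mu,\nu} .
\end{aligned}
\tag{5.48}
\]
In the same way, the other relation (5.47) can be obtained.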
1148
00223
T. Koma
We define covariant derivatives ∇s as ∇s := ∂s + As (φ) for s = x, y .
(5.49)
These covariant derivatives ∇s act on a vector field f (φ) = (f1 (φ), f2 (φ), . . . , fq (φ)) on the torus Tg . We require that such a vector field f (φ) is transformed as q X
fµ (φx + ∆φx , φy ) =
α=1
and fµ (φx , φy + ∆φy ) =
q X
α=1
(x) Gµ,α (φ)fα (φ) for small φx
(y) Gµ,α (φ)fα (φ)
for small φy ,
(5.50)
(5.51)
by the gauge transformation G (s) (φ). Then one has Lemma 5.3. Assume the vector field f (φ) is continuously differentiable with respect to the gauge parameters φ = (φx , φy ). Then f (φ) satisfies (∇s f )(φx + ∆φx , φy ) = G (x) (φ)(∇s f )(φx , φy )
for small φx ,
(5.52)
(∇s f )(φx , φy + ∆φy ) = G (y) (φ)(∇s f )(φx , φy )
for small φy .
(5.53)
and
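Proof. From the transformation (5.50) for the vector field $f(\phi)$, one has
\[
\partial_sf_\mu(\phi_x + \Delta\phi_x, \phi_y) = \sum_{\alpha=1}^{q}\bigl[\partial_sG_{\mu,\alpha}^{(x)}(\phi)\cdot f_\alpha(\phi) + G_{\mu,\alpha}^{(x)}(\phi)\partial_sf_\alpha(\phi)\bigr]
\tag{5.54}
\]
for small $\phi_x$. From the transformation (5.50) and Lemma 5.2, one has
\[
\begin{aligned}
\sum_{\alpha=1}^{q} A_{\mu,\alpha,s}(\phi_x + \Delta\phi_x, \phi_y)f_\alpha(\phi_x + \Delta\phi_x, \phi_y)
&= \bigl[G^{(x)}(\phi)A_s(\phi)f(\phi)\bigr]_\mu + \sum_{\alpha=1}^{q}\bigl[G^{(x)}(\phi)\partial_sG^{(x)}(\phi)^*\bigr]_{\mu,\alpha}\bigl[G^{(x)}(\phi)f(\phi)\bigr]_\alpha \\
&= \bigl[G^{(x)}(\phi)A_s(\phi)f(\phi)\bigr]_\mu - \bigl[\partial_sG^{(x)}(\phi)\cdot f(\phi)\bigr]_\mu ,
\end{aligned}
\tag{5.55}
\]
where we have used the identity
\[
G^{(s)}(\phi)\partial_tG^{(s)}(\phi)^* + \partial_tG^{(s)}(\phi)\cdot G^{(s)}(\phi)^* = 0 \quad \text{for } s, t = x, y ,
\tag{5.56}
\]
for getting the second equality. Combining (5.54) and (5.55), the relation (5.52) is derived. The other relation is also derived in the same way.

We introduce the Pauli–Dirac operator $D$ as
\[
D = \sigma_x\nabla_x + \sigma_y\nabla_y ,
\tag{5.57}
\]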
where $\sigma_x$ and $\sigma_y$ are the Pauli matrices given by
\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} .
\tag{5.58}
\]
Clearly the Pauli–Dirac operator $D$ is written as
\[
D = \begin{pmatrix} 0 & D^- \\ D^+ & 0 \end{pmatrix} ,
\tag{5.59}
\]
where
\[
D^+ = \nabla_x + i\nabla_y \quad \text{and} \quad D^- = \nabla_x - i\nabla_y .
\tag{5.60}
\]
The Hamiltonian of the corresponding system is given by
\[
H := -D^2 = \begin{pmatrix} -D^-D^+ & 0 \\ 0 & -D^+D^- \end{pmatrix} .
\tag{5.61}
\]
Note that
\[
D^-D^+ = (\nabla_x - i\nabla_y)(\nabla_x + i\nabla_y) = \nabla_x^2 + \nabla_y^2 + i\nabla_x\nabla_y - i\nabla_y\nabla_x = \nabla_x^2 + \nabla_y^2 + i[\nabla_x, \nabla_y] .
\tag{5.62}
\]
Since the commutator in the last line is equal to the curvature $F$ as
\[
[\nabla_x, \nabla_y] = \partial_xA_y - \partial_yA_x + [A_x, A_y] =: F ,
\tag{5.63}
\]
one has
\[
D^-D^+ = \Delta + iF
\tag{5.64}
\]
with the Laplacian,
\[
\Delta = \nabla_x^2 + \nabla_y^2 = (\partial_x + A_x)^2 + (\partial_y + A_y)^2 .
\tag{5.65}
\]
In the same way, one has
\[
D^+D^- = \Delta - iF .
\tag{5.66}
\]
Therefore the Hamiltonian $H$ becomes
\[
H = -\Delta - iF\sigma_z \quad \text{with} \quad \sigma_z := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
\tag{5.67}
\]
Namely the system is equivalent to the single electron with spin-1/2 and $q$ flavors in the "magnetic field" $F$ in two dimensions. Here we stress that the number $q$ of the flavors is equal to the degeneracy of the ground state of the original interacting electron system. We define the index of the Pauli–Dirac operator $D$ as
\[
{\rm ind}\, D := \dim\ker D^+ - \dim\ker D^- .
\tag{5.68}
\]
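The algebra behind (5.59)–(5.67) uses only the Pauli matrices and the commutator of the two covariant derivatives, so it can be checked with finite matrices. In the sketch below, random anti-Hermitian matrices `nabla_x` and `nabla_y` stand in for $\nabla_x$ and $\nabla_y$; this is an assumption made purely for illustration and says nothing about the actual operators on the torus.

```python
import numpy as np

def random_antihermitian(rng, n):
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a - a.conj().T) / 2

rng = np.random.default_rng(3)
n = 5
nabla_x = random_antihermitian(rng, n)   # toy stand-in for the covariant derivative in x
nabla_y = random_antihermitian(rng, n)   # toy stand-in for the covariant derivative in y

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]])
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

dirac = np.kron(sigma_x, nabla_x) + np.kron(sigma_y, nabla_y)   # (5.57)
hamiltonian = -dirac @ dirac                                    # (5.61)

d_plus = nabla_x + 1j * nabla_y                                 # (5.60)
d_minus = nabla_x - 1j * nabla_y
laplacian = nabla_x @ nabla_x + nabla_y @ nabla_y               # (5.65)
curvature = nabla_x @ nabla_y - nabla_y @ nabla_x               # (5.63)

# Right-hand side of (5.67): -Laplacian - i F sigma_z
expected = -np.kron(np.eye(2), laplacian) - 1j * np.kron(sigma_z, curvature)
print(np.allclose(hamiltonian, expected))
```

The upper-left and lower-right blocks of `hamiltonian` are $-D^-D^+$ and $-D^+D^-$, as in (5.61).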
In the standard way, one has

Lemma 5.4. For any positive constant $\beta$, the following equality is valid:
\[
{\rm ind}\, D = {\rm Tr}\, \sigma_z e^{-\beta H} . \tag{5.69}
\]

Proof. Let $\psi$ be an eigenvector of the operator $D_- D_+$ with the eigenvalue $\lambda \ne 0$, i.e.,
\[
D_- D_+ \psi = \lambda\psi . \tag{5.70}
\]
Then the vector $D_+\psi$ is the eigenvector of the operator $D_+ D_-$ with the same eigenvalue $\lambda$. Actually, one has
\[
\|D_+\psi\|^2 = \langle D_+\psi, D_+\psi \rangle = -\langle \psi, D_- D_+\psi \rangle = -\lambda\|\psi\|^2 \ne 0 \tag{5.71}
\]
and
\[
D_+ D_- (D_+\psi) = D_+ (D_- D_+\psi) = \lambda D_+\psi . \tag{5.72}
\]
Let $\psi_1, \psi_2$ be eigenvectors of $D_- D_+$ with the eigenvalue $\lambda$, and assume the two vectors are orthogonal to each other. Then
\[
\langle D_+\psi_1, D_+\psi_2 \rangle = -\langle \psi_1, D_- D_+\psi_2 \rangle = -\lambda \langle \psi_1, \psi_2 \rangle = 0 . \tag{5.73}
\]
We denote by $d_+(\lambda)$ the dimension of the eigenspace of the operator $D_- D_+$ with the eigenvalue $\lambda$, and by $d_-(\lambda)$ the dimension of the eigenspace of $D_+ D_-$ with $\lambda$. From the above observations, one has $d_+(\lambda) \le d_-(\lambda)$ for $\lambda \ne 0$. Further $d_-(\lambda) \le d_+(\lambda)$ for $\lambda \ne 0$ in the same way. Combining these two inequalities, one has $d_+(\lambda) = d_-(\lambda)$ for $\lambda \ne 0$. From this fact, one gets
\[
{\rm Tr}\, \sigma_z e^{-\beta H} = {\rm Tr}\, e^{\beta D_- D_+} - {\rm Tr}\, e^{\beta D_+ D_-} = \dim\ker D_- D_+ - \dim\ker D_+ D_- . \tag{5.74}
\]
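The cancellation of the nonzero eigenvalues in the proof of Lemma 5.4 has a finite-dimensional analogue that is easy to test. The sketch below is an illustration under our own assumptions, not the paper's construction: a real $m \times n$ matrix `A` plays the role of $D_+$, so that $A^{T}A$ and $AA^{T}$ stand in for $-D_- D_+$ and $-D_+ D_-$, and the matrix exponential is evaluated by a truncated Taylor series. By rank-nullity, the difference of the two heat traces should equal $\dim\ker A - \dim\ker A^{T} = n - m$ for every $\beta > 0$.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of two (possibly rectangular) matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def expm_taylor(m, terms=80):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]
        result = [[r + t for r, t in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]
    return result


def heat_trace_difference(a, beta):
    """Tr exp(-beta A^T A) - Tr exp(-beta A A^T)."""
    at = transpose(a)
    ata = [[-beta * x for x in row] for row in matmul(at, a)]
    aat = [[-beta * x for x in row] for row in matmul(a, at)]
    return trace(expm_taylor(ata)) - trace(expm_taylor(aat))


# Hypothetical 2 x 3 example: kernel of A is one-dimensional, kernel of A^T is trivial.
A = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]
kernel_dimension_difference = len(A[0]) - len(A)

for beta in (0.1, 0.5, 1.0):
    print(beta, heat_trace_difference(A, beta))
```

The printed differences agree with `kernel_dimension_difference` independently of `beta`, which mirrors the $\beta$-independence asserted in (5.69).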
The right-hand side of (5.74) is equal to the index of the operator $D$ by the definition (5.68).

The following theorem is a special case of the Atiyah–Singer index theorem [40, 41]:

Theorem 5.5.
\[
{\rm ind}\, D = \frac{1}{2\pi i} \int_{T_g} d\phi_x\, d\phi_y\ {\rm tr}\, F . \tag{5.75}
\]
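Proof. We shall prove this statement along the same line as in [49]. First let us introduce the complete system of the orthonormal functions as
\[
\psi(k_x, k_y) := \frac{1}{\sqrt{\Delta\phi_x \Delta\phi_y}} \exp[i(k_x\phi_x + k_y\phi_y)] \quad \text{with } k_x = \frac{2\pi n_x}{\Delta\phi_x} ,\ k_y = \frac{2\pi n_y}{\Delta\phi_y} \tag{5.76}
\]
with $n_x, n_y \in \mathbb{Z}$. Using this system of the functions, one has
\[
\begin{aligned}
{\rm Tr}\, e^{\beta(\Delta \pm iF)} &= \sum_{k_x, k_y} {\rm tr}\, \bigl\langle \psi(k_x, k_y), e^{\beta(\Delta \pm iF)} \psi(k_x, k_y) \bigr\rangle \\
&= \frac{1}{\Delta\phi_x \Delta\phi_y} \sum_{k_x, k_y} {\rm tr} \int_{T_g} d\phi_x\, d\phi_y\, \exp\bigl[ \beta\bigl( \Delta(k) \pm iF \bigr) \bigr] ,
\end{aligned} \tag{5.77}
\]
where we have written
\[
\Delta(k) = (\nabla_x + ik_x)^2 + (\nabla_y + ik_y)^2 \tag{5.78}
\]
and have used
\[
e^{-ik_s\phi_s} \partial_s e^{ik_s\phi_s} = \partial_s + ik_s \quad \text{for } s = x, y . \tag{5.79}
\]
Moreover, by using Duhamel's formula
\[
e^{-t(X+Y)} = e^{-tX} + \int_0^t ds\, e^{-(t-s)X}\, Y e^{-s(X+Y)} , \tag{5.80}
\]
one has
\[
\begin{aligned}
&{\rm Tr}\, e^{\beta(\Delta + iF)} - {\rm Tr}\, e^{\beta(\Delta - iF)} \\
&\quad = \frac{1}{i\Delta\phi_x \Delta\phi_y} \sum_{k_x, k_y} {\rm tr} \int_{T_g} d\phi_x\, d\phi_y \int_0^\beta d\beta'\, e^{(\beta - \beta')\Delta(k)} F \bigl[ e^{\beta'(\Delta(k) + iF)} + e^{\beta'(\Delta(k) - iF)} \bigr] \\
&\quad = \frac{\beta}{i\Delta\phi_x \Delta\phi_y} \sum_{k_x, k_y} {\rm tr} \int_{T_g} d\phi_x\, d\phi_y \int_0^1 ds\, e^{(1-s)\beta\Delta(k)} F \bigl[ e^{s\beta(\Delta(k) + iF)} + e^{s\beta(\Delta(k) - iF)} \bigr] \\
&\quad = \frac{\beta}{i\Delta\phi_x \Delta\phi_y} \sum_{\tilde k_x, \tilde k_y} {\rm tr} \int_{T_g} d\phi_x\, d\phi_y \int_0^1 ds\, e^{(1-s)\tilde\Delta(\tilde k)} F \bigl[ e^{s(\tilde\Delta(\tilde k) + i\beta F)} + e^{s(\tilde\Delta(\tilde k) - i\beta F)} \bigr] ,
\end{aligned} \tag{5.81}
\]
where we have introduced the variable $s$ as $\beta' = \beta s$ with $0 \le s \le 1$, and we have written
\[
\tilde k_x = \sqrt{\beta}\, k_x \quad \text{and} \quad \tilde k_y = \sqrt{\beta}\, k_y , \tag{5.82}
\]
and
\[
\tilde\Delta(\tilde k) = \bigl( \sqrt{\beta}\, \nabla_x + i\tilde k_x \bigr)^2 + \bigl( \sqrt{\beta}\, \nabla_y + i\tilde k_y \bigr)^2 . \tag{5.83}
\]
Here, since the sum of the operators
\[
\frac{\beta}{\Delta\phi_x \Delta\phi_y} \sum_{\tilde k_x, \tilde k_y} \exp\bigl[ \tilde\Delta(\tilde k) \bigr] \tag{5.84}
\]
is uniformly bounded in $\beta$, the right-hand side of (5.81) converges to
\[
\frac{1}{(2\pi)^2 i} \int d\tilde k_x\, d\tilde k_y\ {\rm tr} \int_{T_g} d\phi_x\, d\phi_y\, \exp\bigl[ -\bigl( \tilde k_x^2 + \tilde k_y^2 \bigr) \bigr] \times 2F = \frac{1}{2\pi i} \int_{T_g} d\phi_x\, d\phi_y\ {\rm tr}\, F \tag{5.85}
\]
in the limit $\beta \downarrow 0$. Combining this observation, (5.64), (5.66), Lemma 5.4 and (5.74), one obtains the desired relation,
\[
{\rm ind}\, D = {\rm Tr}\, e^{\beta(\Delta + iF)} - {\rm Tr}\, e^{\beta(\Delta - iF)} = \frac{1}{2\pi i} \int_{T_g} d\phi_x\, d\phi_y\ {\rm tr}\, F . \tag{5.86}
\]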
6. The Non-Interacting Case

As a demonstration, we first treat the simplest model of the quantum Hall system, and determine the explicit forms of the functions $\theta^{(s)}(\phi)$, $s = x, y$, introduced in Sec. 5.1. As a result, the well-known integral quantization of the Hall conductance is obtained. In the next subsection, we will treat the general non-interacting case, and give the proof of Theorem 2.3.

6.1. The single-electron Landau Hamiltonian

The single-electron Hamiltonian in two dimensions with the uniform magnetic field only is given by
\[
H_0(\phi) = \frac{1}{2m_e} \bigl[ (p_x - eBy + \phi_x)^2 + (p_y + \phi_y)^2 \bigr] \tag{6.1}
\]
with the gauge parameters $\phi = (\phi_x, \phi_y) \in \mathbb{R}^2$. The eigenvectors on $\mathbb{R}^2$ are given by
\[
\varphi_{n,k}(x, y; \phi) = e^{ikx} \exp\Bigl[ -\frac{i}{\hbar} \phi_y y \Bigr] v_n(y - y(k, \phi_x)) , \tag{6.2}
\]
where $k$ is the real wavenumber in the $x$ direction, and
\[
v_{n,k}(y; \phi_x) := v_n(y - y(k, \phi_x)) := N_n \exp\bigl[ -(y - y(k, \phi_x))^2 / (2\ell_B^2) \bigr]\, \wp_n\bigl[ (y - y(k, \phi_x)) / \ell_B \bigr] \tag{6.3}
\]
with
\[
y(k, \phi_x) := \frac{1}{eB} (\hbar k + \phi_x) . \tag{6.4}
\]
Here $\wp_n$ is the $n$th Hermite polynomial, and the normalization constant $N_n$ is taken to satisfy
\[
\int_{-\infty}^{+\infty} dy\, |v_{n,k}(y; \phi_x)|^2 = 1 . \tag{6.5}
\]
Now consider the $L_x \times L_y$ rectangular box $T = [-L_x/2, L_x/2] \times [-L_y/2, L_y/2]$ satisfying $L_x L_y = 2\pi M \ell_B^2$ with a sufficiently large positive and even integer $M$. We impose the periodic boundary conditions
\[
\varphi(x, y; \phi) = t^{(x)}(L_x) \varphi(x, y; \phi) , \quad \varphi(x, y; \phi) = t^{(y)}(L_y) \varphi(x, y; \phi) \tag{6.6}
\]
for the wavefunctions $\varphi$. Then the complete system of the eigenvectors [31, 32] of the Hamiltonian satisfying the periodic boundary conditions is given by
\[
\varphi^{\rm P}_{n,k}(x, y; \phi) = L_x^{-1/2} \sum_{\ell=-\infty}^{+\infty} e^{i(k + \ell K)x} \exp\Bigl[ -\frac{i}{\hbar} \phi_y (y - \ell L_y) \Bigr] v_n(y - y(k, \phi_x) - \ell L_y) \tag{6.7}
\]
for $k = 2\pi m / L_x$ with $m = -M/2 + 1, \ldots, M/2 - 1, M/2$, and with $K = L_y / \ell_B^2$. The energy eigenvalue is given by $(n + 1/2)\hbar\omega_c$ for $n = 0, 1, 2, \ldots$. By the gauge transformation
\[
G(x, y; \phi) = \exp\Bigl[ \frac{i}{\hbar} (x\phi_x + y\phi_y) \Bigr] \tag{6.8}
\]
corresponding to the transformation (5.17) of $N$ body, the eigenvectors are transformed as
\[
\tilde\varphi^{\rm P}_{n,k}(x, y; \phi) = G(x, y; \phi) \varphi^{\rm P}_{n,k}(x, y; \phi) = L_x^{-1/2} \sum_{\ell=-\infty}^{+\infty} e^{i(k + \ell K)x} \exp\Bigl[ \frac{i}{\hbar} (x\phi_x + \ell L_y \phi_y) \Bigr] v_n(y - y(k, \phi_x) - \ell L_y) . \tag{6.9}
\]
In order to determine the explicit forms of the phases $\theta^{(x)}$ and $\theta^{(y)}$ in (5.30), we first show that
\[
\tilde\varphi^{\rm P}_{n,k}(x, y; \Delta\phi_x, \phi_y) = \tilde\varphi^{\rm P}_{n,k+2\pi/L_x}(x, y; 0, \phi_y) \quad \text{for } k \ne k_{\max} , \tag{6.10}
\]
\[
\tilde\varphi^{\rm P}_{n,k_{\max}}(x, y; \Delta\phi_x, \phi_y) = \exp\Bigl[ -\frac{i}{\hbar} L_y \phi_y \Bigr] \tilde\varphi^{\rm P}_{n,k_{\min}}(x, y; 0, \phi_y) , \tag{6.11}
\]
and
\[
\tilde\varphi^{\rm P}_{n,k}(x, y; \phi_x, \Delta\phi_y) = \tilde\varphi^{\rm P}_{n,k}(x, y; \phi_x, 0) \quad \text{for any } k , \tag{6.12}
\]
where $k_{\max} := \pi M / L_x$ and $k_{\min} := 2\pi(-M/2 + 1)/L_x$. This last relation (6.12) is immediately obtained from the expression (6.9) of $\tilde\varphi^{\rm P}_{n,k}$. By the definition (6.4), one has
\[
y(k, \Delta\phi_x) = y(k + 2\pi/L_x, 0) \quad \text{for } k \ne k_{\max} . \tag{6.13}
\]
Combining this with the expression (6.9), one obtains the first relation (6.10). For the largest wavenumber $k_{\max} = \pi M / L_x$, one has
\[
k_{\max} + \frac{2\pi}{L_x} = k_{\min} + K , \tag{6.14}
\]
where we have used the relations $K = L_y / \ell_B^2$ and $L_x L_y = 2\pi M \ell_B^2$. Substituting this into (6.4), one gets
\[
y(k_{\max}, \Delta\phi_x) = y(k_{\min}, 0) + L_y . \tag{6.15}
\]
Combining (6.14) and (6.15) with the expression (6.9), the second relation (6.11) is obtained.

Let us consider the ground state $\tilde\Phi_0^{(N)}(\phi)$ with $N = \ell M$ electrons, where $\ell$ is a positive integer and $M$ is the number of the states in a single Landau level. All the states in the lowest $\ell$ Landau levels are occupied with the $\ell M$ electrons and the rest of the higher Landau levels are all empty. Clearly the ground state is unique, i.e. $q = 1$, and the energy gap $\hbar\omega_c$ appears above the ground state. Combining (5.19), (5.30) and the above results, (6.10) and (6.11), one gets
\[
\theta^{(x)}(\phi_y) = -\frac{L_y \phi_y}{\hbar}\, \ell + \delta^{(x)} \tag{6.16}
\]
for the $N = \ell M$ electrons ground state $\tilde\Phi_0^{(N)}(\phi)$. Here $\delta^{(x)}$ is a real constant which is independent of $\phi$. Combining (5.20), (5.30) and (6.12), one obtains
\[
\theta^{(y)}(\phi_x) = 0 \quad \text{for all } \phi_x . \tag{6.17}
\]
From (5.31), (6.16) and (6.17), the averaged Hall conductance is given by
\[
\overline{\sigma_{xy}(\phi)} = -\frac{e^2}{h}\, \ell . \tag{6.18}
\]
Since the present system is translationally invariant, the non-averaged Hall conductance is also quantized to the same integer as we already obtained the more general result (4.2).
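The wavenumber bookkeeping behind (6.13)–(6.15) and the linear phase (6.16) can be checked with a few lines of arithmetic. The sketch below rests on assumptions that are not restated in this section: units with $\hbar = e = B = 1$, the magnetic length $\ell_B^2 = \hbar/(eB)$, and gauge periods $\Delta\phi_s = 2\pi\hbar/L_s$ (the value that makes (6.13) hold). The sizes `M` and `LX` are arbitrary illustrative choices.

```python
import math

HBAR = 1.0      # assumed units
E_CHARGE = 1.0
B_FIELD = 1.0
L_B = math.sqrt(HBAR / (E_CHARGE * B_FIELD))  # assumed: l_B^2 = hbar / (e B)

M = 8                                   # even number of states per Landau level
LX = 4.0
LY = 2 * math.pi * M * L_B**2 / LX      # enforces Lx * Ly = 2 pi M l_B^2
K = LY / L_B**2
DELTA_PHI_X = 2 * math.pi * HBAR / LX   # assumed gauge period in phi_x
DELTA_PHI_Y = 2 * math.pi * HBAR / LY   # assumed gauge period in phi_y


def guiding_center(k, phi_x):
    """y(k, phi_x) of (6.4)."""
    return (HBAR * k + phi_x) / (E_CHARGE * B_FIELD)


wavenumbers = [2 * math.pi * m / LX for m in range(-M // 2 + 1, M // 2 + 1)]
k_min, k_max = wavenumbers[0], wavenumbers[-1]

# (6.13): a full gauge period in phi_x moves each k to its neighbour.
shift_errors = [abs(guiding_center(k, DELTA_PHI_X)
                    - guiding_center(k + 2 * math.pi / LX, 0.0))
                for k in wavenumbers[:-1]]

# (6.14): the largest wavenumber wraps around to the smallest one plus K.
wrap_error = abs((k_max + 2 * math.pi / LX) - (k_min + K))

# (6.15): the corresponding guiding center is shifted by exactly Ly.
center_error = abs(guiding_center(k_max, DELTA_PHI_X)
                   - (guiding_center(k_min, 0.0) + LY))


def theta_x(phi_y, filled_levels, offset=0.0):
    """Phase (6.16) for `filled_levels` occupied Landau levels."""
    return -LY * phi_y * filled_levels / HBAR + offset


# The phase winds by -2 pi per filled level over one gauge period in phi_y.
winding = (theta_x(DELTA_PHI_Y, 2) - theta_x(0.0, 2)) / (2 * math.pi)
print(max(shift_errors), wrap_error, center_error, winding)
```

With two filled levels the winding evaluates to $-2$, the same integer that appears, up to the factor $e^2/h$, in (6.18) for $\ell = 2$.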
6.2. The general electron gases

Consider the general non-interacting case, i.e. $W^{(2)} = 0$ in the Hamiltonian (5.1). The corresponding single electron Hamiltonian is given by
\[
H(\phi) = \frac{1}{2m_e} [p + eA(r) + \phi]^2 + W(r) \tag{6.19}
\]
with the vector potential $A = A_0 + A_{\rm P}$ and with the periodic boundary conditions (6.6). Consider the ground state $\Phi_0^{(N)}(\phi)$ with $N = \ell M$ electrons, where $\ell$ is a positive integer and $M$ is the number of the states in a single Landau level. This is the same situation as in the preceding subsection except for the potentials. The aim of this subsection is to prove all the statements of Theorem 2.3.

First of all, we show that the gap condition (2.17) is valid if the vector potential $A_{\rm P}$ and the electrostatic potential $W$ satisfy the condition (2.36) in Theorem 2.3. For this purpose, we first rewrite the single electron Hamiltonian (6.19) as
\[
H(\phi) = H_0(\phi) + \frac{e}{2m_e} A_{\rm P}(r) \cdot [p + eA_0(r) + \phi] + \frac{e}{2m_e} [p + eA_0(r) + \phi] \cdot A_{\rm P}(r) + \frac{e^2}{2m_e} |A_{\rm P}(r)|^2 + W(r) , \tag{6.20}
\]
where $H_0(\phi)$ is given by (6.1). Using the Schwarz inequality, one has
\[
\Bigl| \Bigl\langle \psi, \frac{e}{2m_e} A_{\rm P} \cdot (p + eA_0) \psi \Bigr\rangle \Bigr| \le \frac{e}{2m_e} \sqrt{\langle \psi, |A_{\rm P}|^2 \psi \rangle} \sqrt{\langle \psi, (p + eA_0)^2 \psi \rangle} \le \frac{e}{2m_e} \| |A_{\rm P}| \|_\infty \sqrt{\langle \psi, (p + eA_0)^2 \psi \rangle} \tag{6.21}
\]
for the vector $\psi$ in the domain of the Hamiltonian. From this inequality, the energy expectation can be evaluated as
\[
\langle \psi, H(\phi)\psi \rangle \le \langle \psi, H_0(\phi)\psi \rangle + \frac{\sqrt{2}\, e}{\sqrt{m_e}} \| |A_{\rm P}| \|_\infty \sqrt{\langle \psi, H_0(\phi)\psi \rangle} + \frac{e^2}{2m_e} \| |A_{\rm P}| \|_\infty^2 + \| W^+ \|_\infty \tag{6.22}
\]
and
\[
\langle \psi, H(\phi)\psi \rangle \ge \langle \psi, H_0(\phi)\psi \rangle - \frac{\sqrt{2}\, e}{\sqrt{m_e}} \| |A_{\rm P}| \|_\infty \sqrt{\langle \psi, H_0(\phi)\psi \rangle} - \| W^- \|_\infty . \tag{6.23}
\]
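To get a feeling for the two bounds (6.22) and (6.23), one can evaluate their right-hand sides at the unperturbed Landau energies $(n + 1/2)\hbar\omega_c$ and ask whether the lower bound for level $n + 1$ still lies above the upper bound for level $n$. The sketch below is only an illustration under stated assumptions: energies are measured in units of $\hbar\omega_c$, the parameter `a` abbreviates $e\,\| |A_{\rm P}| \|_\infty / \sqrt{m_e}$ in units of $\sqrt{\hbar\omega_c}$, and the function names are hypothetical.

```python
import math


def energy_upper_bound(h0, a, w_plus):
    """Right-hand side of (6.22) as a function of h0 = <psi, H0 psi>."""
    return h0 + math.sqrt(2.0) * a * math.sqrt(h0) + 0.5 * a**2 + w_plus


def energy_lower_bound(h0, a, w_minus):
    """Right-hand side of (6.23) as a function of h0 = <psi, H0 psi>."""
    return h0 - math.sqrt(2.0) * a * math.sqrt(h0) - w_minus


def landau_energy(n, hbar_omega_c=1.0):
    return (n + 0.5) * hbar_omega_c


def gap_above_band(n, a, w_plus, w_minus, hbar_omega_c=1.0):
    """True if the bound for level n + 1 stays strictly above that for level n."""
    top = energy_upper_bound(landau_energy(n, hbar_omega_c), a, w_plus)
    bottom = energy_lower_bound(landau_energy(n + 1, hbar_omega_c), a, w_minus)
    return bottom > top


# Weak perturbation: the gap above the lowest band survives.
print(gap_above_band(0, a=0.1, w_plus=0.05, w_minus=0.05))
# Strong perturbation: the two bounds overlap and no gap can be concluded.
print(gap_above_band(0, a=1.0, w_plus=0.5, w_minus=0.5))
```

A `False` result does not show that the gap closes; it only means that these particular bounds are too weak to exclude it.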
Let us denote by $E^{\rm edge}_{n,+}$ and $E^{\rm edge}_{n,-}$ the upper and lower edges of the Landau band with the index $n$, respectively. From the standard argument about the min–max principle (see footnote m), one has
\[
E^{\rm edge}_{n,+} \le (n + 1/2)\hbar\omega_c + \frac{\sqrt{2}\, e}{\sqrt{m_e}} \| |A_{\rm P}| \|_\infty \sqrt{(n + 1/2)\hbar\omega_c} + \frac{e^2}{2m_e} \| |A_{\rm P}| \|_\infty^2 + \| W^+ \|_\infty \tag{6.24}
\]
for $n = 0, 1, 2, \ldots$. For the lower edge, we assume
\[
\frac{e}{\sqrt{2m_e}} \| |A_{\rm P}| \|_\infty \le \sqrt{\frac{1}{2} \hbar\omega_c} . \tag{6.25}
\]
Then the right-hand side of the bound (6.23) is a strictly monotone increasing function of the expectation $\langle \psi, H_0(\phi)\psi \rangle$. Therefore, the same argument yields
\[
E^{\rm edge}_{n,-} \ge (n + 1/2)\hbar\omega_c - \frac{\sqrt{2}\, e}{\sqrt{m_e}} \| |A_{\rm P}| \|_\infty \sqrt{(n + 1/2)\hbar\omega_c} - \| W^- \|_\infty \tag{6.26}
\]
for $n = 0, 1, 2, \ldots$. If this right-hand side with the index $n + 1$ is strictly larger than the right-hand side of (6.24) with the index $n$, then there exists a spectral gap above the Landau band with the index $n$, i.e. $E^{\rm edge}_{n+1,-} > E^{\rm edge}_{n,+}$. This gap condition can be written as the desired form (2.36) with $\ell = n + 1$. Clearly the condition (2.36) is stronger than the above (6.25) for the vector potential $A_{\rm P}$. Therefore we have no need to take into account the condition (6.25).

m See, for example, Sec. XIII.1 of the book [50] by M. Reed and B. Simon.

Next we check that the corrections $\delta\sigma_{sy}(t; \phi, \eta, T)$ in (2.27) and (2.28) to the dominant parts of the linear response coefficients are negligibly small for the slow switching process. Because of the excitation energy gap above the ground state, one can easily prove the bound,
\[
|\delta\sigma_{sy}(t; \phi, \eta, T)| \le \frac{e^2}{h} \nu \bigl[ (C_1 + C_2 \omega_c T) e^{-\eta T} + (C_3 + C_4 \eta/\omega_c)\, \eta/\omega_c \bigr] , \tag{6.27}
\]
by using the expressions (4.14)–(4.19). Here all the positive constants $C_j$, $j = 1, 2, 3, 4$, are independent of the system sizes $L_x, L_y$ and of the number $N$ of the electrons for a fixed filling factor $\nu$ of the electrons.

In the rest of this section, we derive two bounds, (2.37) and (2.38), in Theorem 2.3. To begin with, we rewrite the Hall conductance $\sigma_{xy}(\phi)$ of (3.37) as
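\[
\sigma_{xy}(\phi) = -\frac{i\hbar e^2}{L_x L_y} \sum_{m,n:\, E_m \le E_{\rm F} < E_n} \frac{\bigl\langle \psi_m, v_x^{(0)} \psi_n \bigr\rangle \bigl\langle \psi_n, v_y^{(0)} \psi_m \bigr\rangle - (x \leftrightarrow y)}{(E_m - E_n)^2} \tag{6.28}
\]
in terms of the eigenvector $\psi_n$ of the single-electron Hamiltonian $H(\phi)$ of (6.19) with the energy eigenvalue $E_n$. Here $E_{\rm F}$ is the Fermi energy, and
\[
v^{(0)} = \frac{1}{m_e} [p + eA(r)] . \tag{6.29}
\]
We have dropped $\phi$ in $v^{(0)}$ by relying on the orthogonality between $\psi_m$ and $\psi_n$ with $E_m \ne E_n$. Further we introduce some operators as follows:
\[
Q(E \le E_{\rm F}; \phi) = \frac{1}{2\pi i} \int_\gamma R(z; \phi)\, dz \quad \text{with the resolvent } R(z; \phi) = \frac{1}{z - H(\phi)} \tag{6.30}
\]
and
\[
Q_i(E \le E_{\rm F}; \phi) = \frac{1}{2\pi i} \int_\gamma R(z; \phi)\, v_i^{(0)} R(z; \phi)\, dz , \tag{6.31}
\]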
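where we have chosen the closed path $\gamma$ so that the operator $Q(E \le E_{\rm F}; \phi)$ is the projection onto the subspace spanned by all the levels below the Fermi energy $E_{\rm F}$. Here the path $\gamma$ is taken to be independent of the gauge parameters $\phi$ by the gap condition (2.17). Note that
\[
\begin{aligned}
&{\rm Tr} [Q(E \le E_{\rm F}; \phi) Q_x(E \le E_{\rm F}; \phi) Q_y(E \le E_{\rm F}; \phi)] \\
&\quad = \sum_{n:\, E_n \le E_{\rm F}} \langle \psi_n, Q_x(E \le E_{\rm F}; \phi) Q_y(E \le E_{\rm F}; \phi) \psi_n \rangle \\
&\quad = \sum_{n:\, E_n \le E_{\rm F}} \Bigl\langle \psi_n, \frac{1}{2\pi i} \int_\gamma dz_1 \frac{1}{z_1 - E_n} v_x^{(0)} \frac{1}{z_1 - H(\phi)}\ \frac{1}{2\pi i} \int_\gamma dz_2 \frac{1}{z_2 - H(\phi)} v_y^{(0)} \frac{1}{z_2 - E_n} \psi_n \Bigr\rangle \\
&\quad = \sum_{n:\, E_n \le E_{\rm F}} \sum_m \bigl\langle \psi_n, v_x^{(0)} \psi_m \bigr\rangle \bigl\langle \psi_m, v_y^{(0)} \psi_n \bigr\rangle \Bigl[ \frac{1}{2\pi i} \int_\gamma dz \frac{1}{z - E_n} \frac{1}{z - E_m} \Bigr]^2 \\
&\quad = \sum_{n:\, E_n \le E_{\rm F}}\ \sum_{m:\, E_m > E_{\rm F}} \bigl\langle \psi_n, v_x^{(0)} \psi_m \bigr\rangle \bigl\langle \psi_m, v_y^{(0)} \psi_n \bigr\rangle \frac{1}{(E_n - E_m)^2} .
\end{aligned} \tag{6.32}
\]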
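The last step of (6.32) rests on the value of the scalar contour integral $\frac{1}{2\pi i}\oint_\gamma dz\, (z - E_n)^{-1} (z - E_m)^{-1}$, which equals $1/(E_n - E_m)$ when only $E_n$ lies inside $\gamma$ and vanishes when both energies lie inside. The following sketch checks this numerically; the circular contour, the sample energies, and the trapezoidal quadrature are illustrative assumptions, not part of the proof.

```python
import cmath
import math


def contour_integral(e_n, e_m, center=0j, radius=1.0, points=512):
    """(1 / 2 pi i) * closed integral of dz / ((z - e_n)(z - e_m)) over a circle."""
    total = 0j
    for j in range(points):
        angle = 2.0 * math.pi * j / points
        phase = cmath.exp(1j * angle)
        z = center + radius * phase
        dz = 1j * radius * phase * (2.0 * math.pi / points)
        total += dz / ((z - e_n) * (z - e_m))
    return total / (2j * math.pi)


# Occupied level inside the contour, empty level outside: value 1 / (E_n - E_m).
inside_outside = contour_integral(0.2, 3.0)
# Two distinct occupied levels inside the contour: the two residues cancel.
both_inside = contour_integral(0.2, -0.5)
# Coinciding levels (m = n): a pure double pole integrates to zero.
double_pole = contour_integral(0.2, 0.2)

print(inside_outside, 1.0 / (0.2 - 3.0))
print(abs(both_inside), abs(double_pole))
```

This is why only the pairs with $E_n \le E_{\rm F} < E_m$ survive in the final line of (6.32), each with the weight $1/(E_n - E_m)^2$.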
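Combining (6.32) with the expression (6.28), the Hall conductance $\sigma_{xy}(\phi)$ can be rewritten as
\[
\sigma_{xy}(\phi) = -\frac{i\hbar e^2}{L_x L_y} {\rm Tr}\, Q(E \le E_{\rm F}; \phi) \bigl[ Q_x(E \le E_{\rm F}; \phi) Q_y(E \le E_{\rm F}; \phi) - (x \leftrightarrow y) \bigr] . \tag{6.33}
\]

Theorem 6.1. The following inequality is valid:
\[
|\sigma_{xy}(\phi + \delta\phi) - \sigma_{xy}(\phi)| \le C \max_{i=x,y} |\delta\phi_i| , \tag{6.34}
\]
where $C$ is a positive constant which is independent of the system sizes $L_x, L_y$ and of the number $N$ of the electrons.

This theorem follows from Proposition 6.2 with the expression (6.33) of $\sigma_{xy}(\phi)$.

Proposition 6.2. The following bound is valid:
\[
\begin{aligned}
&\bigl| {\rm Tr} [Q(E \le E_{\rm F}; \phi + \delta\phi) Q_i(E \le E_{\rm F}; \phi + \delta\phi) Q_j(E \le E_{\rm F}; \phi + \delta\phi)] \\
&\quad - {\rm Tr} [Q(E \le E_{\rm F}; \phi) Q_i(E \le E_{\rm F}; \phi) Q_j(E \le E_{\rm F}; \phi)] \bigr| \le C N \max_{\ell=x,y} |\delta\phi_\ell|
\end{aligned} \tag{6.35}
\]
for $i, j = x, y$. Here $C$ is a positive constant which is independent of the system sizes $L_x, L_y$ and of the number $N$ of the electrons.

The proof is given in Appendix B. Fix $\phi_0 \in [0, \Delta\phi_x] \times [0, \Delta\phi_y]$. Then we have
\[
\sigma_{xy}(\phi_0) - \overline{\sigma_{xy}(\phi)} = \frac{1}{\Delta\phi_x \Delta\phi_y} \int_0^{\Delta\phi_x} d\phi_x \int_0^{\Delta\phi_y} d\phi_y\, [\sigma_{xy}(\phi_0) - \sigma_{xy}(\phi)] . \tag{6.36}
\]
Using Theorem 6.1, the difference can be evaluated as
\[
\bigl| \sigma_{xy}(\phi_0) - \overline{\sigma_{xy}(\phi)} \bigr| \le C \max\bigl\{ L_x^{-1}, L_y^{-1} \bigr\} . \tag{6.37}
\]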
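The step from (6.36) to (6.37) uses only the Lipschitz bound (6.34) and the fact that the averaging box shrinks with the system size. The toy computation below illustrates that mechanism; `toy_response` is a hypothetical smooth function standing in for $\sigma_{xy}(\phi)$, its Lipschitz constant 2 plays the role of $C$, and the box widths $2\pi\hbar/L_s$ with $\hbar = 1$ are an assumption about the gauge periods.

```python
import math

LIPSCHITZ_C = 2.0  # Lipschitz constant of toy_response in the max norm


def toy_response(phi_x, phi_y):
    """Hypothetical smooth stand-in for sigma_xy(phi)."""
    return math.sin(phi_x) + 0.5 * math.cos(2.0 * phi_y)


def box_average(func, width_x, width_y, samples=50):
    """Midpoint-rule average of func over [0, width_x] x [0, width_y]."""
    total = 0.0
    for i in range(samples):
        for j in range(samples):
            total += func((i + 0.5) * width_x / samples,
                          (j + 0.5) * width_y / samples)
    return total / samples**2


def deviation_and_bound(length_x, length_y, hbar=1.0):
    """Return |f(phi_0) - average| and the bound C * max(box widths)."""
    width_x = 2.0 * math.pi * hbar / length_x
    width_y = 2.0 * math.pi * hbar / length_y
    phi_0 = (0.3 * width_x, 0.7 * width_y)  # arbitrary fixed point in the box
    deviation = abs(toy_response(*phi_0)
                    - box_average(toy_response, width_x, width_y))
    bound = LIPSCHITZ_C * max(width_x, width_y)
    return deviation, bound


for size in (10.0, 100.0, 1000.0):
    print(size, deviation_and_bound(size, size))
```

Each tenfold increase of the system size shrinks the bound by a factor of ten, which is the $\max\{L_x^{-1}, L_y^{-1}\}$ behaviour in (6.37).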
As shown in Sec. 5, the averaged Hall conductance is quantized as $\overline{\sigma_{xy}(\phi)} = -e^2 p/h$ because the present ground state is non-degenerate, i.e. $q = 1$. Hence we obtain
\[
\Bigl| \sigma_{xy}(\phi_0) + \frac{e^2}{h} p \Bigr| \le C \max\bigl\{ L_x^{-1}, L_y^{-1} \bigr\} \tag{6.38}
\]
with an integer $p$. In order to determine the integer $p$, consider the Hamiltonian
\[
H(\phi, \lambda) = \frac{1}{2m_e} [p + eA_0(r) + e\lambda_A A_{\rm P}(r) + \phi]^2 + \lambda_W W(r) \tag{6.39}
\]
with the parameters $\lambda = (\lambda_A, \lambda_W) \in [0, 1]^2$. This Hamiltonian $H(\phi, \lambda)$ continuously connects $H(\phi)$ with $H_0(\phi)$ of (6.1) by varying $\lambda$ from $(1, 1)$ to $(0, 0)$ continuously. Then, in the same way as in the proof of Proposition 6.2, one can prove that the averaged Hall conductance $\overline{\sigma_{xy}(\phi)}$ for the Hamiltonian $H(\phi, \lambda)$ is a continuous function of the parameters $\lambda$. As shown in Sec. 6.1, $\overline{\sigma_{xy}(\phi)} = -e^2 \ell/h$ for $\lambda = (0, 0)$.
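Therefore the integer $p$ must be equal to the filling factor $\ell$ of the Landau levels. Consequently we obtain the bound (2.37).

Next consider the acceleration coefficients $\gamma_{sy}(\phi)$. In terms of the projection operator $Q(E \le E_{\rm F}; \phi)$, these can be expressed as
\[
\begin{aligned}
\gamma_{sy}(\phi) &= \frac{e^2}{L_x L_y} \frac{\partial}{\partial\phi_y} {\rm Tr} \bigl[ v_s^{(0)}(\phi) Q(E \le E_{\rm F}; \phi) \bigr] \\
&= \frac{e^2}{L_x L_y} \Bigl\{ \delta_{s,y} \frac{1}{m_e} {\rm Tr}\, Q(E \le E_{\rm F}; \phi) + {\rm Tr} \bigl[ v_s^{(0)}(\phi) Q_y(E \le E_{\rm F}; \phi) \bigr] \Bigr\} \\
&= \frac{e^2}{L_x L_y} \Bigl\{ \delta_{s,y} \frac{N}{m_e} + {\rm Tr} \bigl[ v_s^{(0)}(\phi) Q_y(E \le E_{\rm F}; \phi) \bigr] \Bigr\} .
\end{aligned} \tag{6.40}
\]

Theorem 6.3. The following inequality is valid:
\[
|\gamma_{sy}(\phi + \delta\phi) - \gamma_{sy}(\phi)| \le C \max_{i=x,y} |\delta\phi_i| , \tag{6.41}
\]
where the positive constant $C$ is independent of the time $t$, the number $N$ of the electrons, and the system sizes $L_x, L_y$.

Owing to the expression (6.40), this theorem follows from the following proposition:

Proposition 6.4. The following inequality is valid:
\[
\bigl| {\rm Tr} \bigl[ v_s^{(0)}(\phi + \delta\phi) Q_y(E \le E_{\rm F}; \phi + \delta\phi) \bigr] - {\rm Tr} \bigl[ v_s^{(0)}(\phi) Q_y(E \le E_{\rm F}; \phi) \bigr] \bigr| \le C N \max_{i=x,y} |\delta\phi_i| , \tag{6.42}
\]
where the positive constant $C$ is independent of the time $t$, the number $N$ of the electrons, and of the system sizes $L_x, L_y$.

The proof is given in Appendix C. Fix $\phi_0 \in [0, \Delta\phi_x] \times [0, \Delta\phi_y]$. Then
\[
\gamma_{sy}(\phi_0) - \overline{\gamma_{sy}(\phi)} = \frac{1}{\Delta\phi_x \Delta\phi_y} \int_0^{\Delta\phi_x} d\phi_x \int_0^{\Delta\phi_y} d\phi_y\, \bigl[ \gamma_{sy}(\phi_0) - \gamma_{sy}(\phi) \bigr] . \tag{6.43}
\]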
Using Theorem 6.3 and the fact $\overline{\gamma_{sy}(\phi)} = 0$ which was shown in Sec. 5, we have the bound (2.38).

7. The Interacting Case

In this section we study the interacting case in detail. In Sec. 5, we obtained that the averaged Hall conductance is quantized as
\[
\overline{\sigma_{xy}(\phi)} = -\frac{e^2}{h} \frac{p}{q} \tag{7.1}
\]
with integers $p, q$. Here the integer $q$ is the dimension of the sector of the ground state. Since the Hall conductance $\sigma_{xy}(\phi)$ is a continuous function of the gauge parameters $\phi$, we can find special gauge parameters $\phi_0 \in T_g$ satisfying
\[
\sigma_{xy}(\phi_0) = -\frac{e^2}{h} \frac{p}{q} , \tag{7.2}
\]
by using the mean value theorem for integrals. Namely the Hall conductance is exactly quantized for the special value $\phi = \phi_0$ of the gauge parameters. But we cannot necessarily expect the same exact quantization for general fixed values of the gauge parameters in $T_g$. Besides, the value $\phi_0$ may strongly depend on the potentials, the system sizes $L_x, L_y$ and the number $N$ of the electrons although $\phi_0$ tends to zero as $L_x, L_y \to \infty$ (see footnote n). As we treated the non-interacting case in the preceding section, we want to resolve the following two issues for the interacting case:
• estimating the finite size correction for $\sigma_{xy}(\phi)$ for fixed gauge parameters $\phi \ne \phi_0$;
• elucidating the relation between the integer $p$ and the filling factor $\nu$ of the Landau levels.
Unfortunately we cannot estimate the finite size correction because of a certain technical difficulty. Therefore we focus on the second issue only in this paper.

7.1. Boundedness of the Hall conductance $\sigma_{xy}(\phi)$

Firstly we prove that the Hall conductance $\sigma_{xy}(\phi)$ is uniformly bounded in the number $N$ of the electrons and the system sizes $L_x, L_y$ for any fixed filling factor $\nu$ under the assumptions below. If the dimension $q$ of the sector of the ground state is also uniformly bounded in $L_x, L_y, N$ in addition to the boundedness of $\sigma_{xy}(\phi)$, then there exists a sequence $\{(L_x^{(i)}, L_y^{(i)})\}_i$ of the system sizes going to infinity and two integers $p^{(\infty)}, q^{(\infty)}$ such that
\[
\sigma_{xy}^{(\infty)} := \lim_{L_x^{(i)}, L_y^{(i)} \to \infty} \sigma_{xy}(\phi_0) = -\frac{e^2}{h} \frac{p^{(\infty)}}{q^{(\infty)}} . \tag{7.3}
\]
The quantization of the Hall conductance occurs in the infinite volume limit although the number $p^{(\infty)}/q^{(\infty)}$ may be equal to an integer. Unfortunately we cannot determine the explicit values of the integers $p^{(\infty)}$ and $q^{(\infty)}$ for the infinite volume ground state for a given filling factor $\nu$.

n The space $T_g$ itself contracts into the single point $(0, 0)$ in the infinite volume limit from the definition (2.5) of the gauge torus $T_g$.

In order to prove the boundedness of the Hall conductance $\sigma_{xy}(\phi)$, we need some technical assumptions:

Assumption 7.1. The electrostatic potential $W$ and the electron–electron interaction $W^{(2)}$ of the present model satisfy higher differentiability as $W \in C^2(\mathbb{R}^2)$ and $W^{(2)} \in C^1(\mathbb{R}^2)$. The norms,
\[
\Bigl\| \frac{\partial^2 W}{\partial x^2} \Bigr\|_\infty \quad \text{and} \quad \Bigl\| \frac{\partial^2 W}{\partial y^2} \Bigr\|_\infty , \tag{7.4}
\]
are bounded uniformly in the sizes $L_x, L_y$ of the system.

Theorem 7.2. Suppose $A_{\rm P} = 0$, and require Assumption 7.1 in addition to the assumptions (including Assumption 2.1) in Sec. 2. Then the Hall conductance $\sigma_{xy}(\phi)$ of (3.37) is uniformly bounded in the number $N$ of the electrons and in the system sizes $L_x, L_y$ for any fixed filling factor $\nu$.

The proof is given in Appendix D. In the same way, we can prove the boundedness of the acceleration coefficients $\gamma_{sy}(\phi)$ of (3.38) and (3.43), and get the bound
\[
|\delta\sigma_{sy}(t; \phi, \eta, T)| \le \frac{e^2}{h} \nu \bigl[ (C_1 + C_2 \omega_c T) e^{-\eta T} + (C_3 + C_4 \eta/\omega_c)\, \eta/\omega_c \bigr] , \tag{7.5}
\]
for the corrections $\delta\sigma_{sy}(t; \phi, \eta, T)$ to the dominant parts of the linear response coefficients (2.27) and (2.28) under the same assumptions. Here all the positive constants $C_j$, $j = 1, 2, 3, 4$, are independent of the system sizes $L_x, L_y$ and of the number $N$ of the electrons for a fixed filling factor $\nu$ of the electrons.

Next consider the case with $A_{\rm P} \ne 0$. We write the $z$ component of the magnetic field for the vector potential $A_{\rm P}$ as
\[
B_{{\rm P},z} = \frac{\partial A_{{\rm P},y}}{\partial x} - \frac{\partial A_{{\rm P},x}}{\partial y} . \tag{7.6}
\]
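Assumption 7.3. The magnetic field $B_{{\rm P},z}$ for the vector potential $A_{\rm P}$ satisfies $B_{{\rm P},z} \in C^2(\mathbb{R}^2)$, and the norms,
\[
\| B_{{\rm P},z} \|_\infty , \quad \Bigl\| \frac{\partial B_{{\rm P},z}}{\partial x} \Bigr\|_\infty \quad \text{and} \quad \Bigl\| \frac{\partial B_{{\rm P},z}}{\partial y} \Bigr\|_\infty , \tag{7.7}
\]
are bounded uniformly in the sizes $L_x, L_y$ of the system.

Assumption 7.4. The electron–electron interaction $W^{(2)}$ is non-negative (repulsive) and satisfies the decay condition,
\[
W^{(2)}(x, y) \le W_0^{(2)} \bigl\{ 1 + [{\rm dist}(x, y)/r_0]^2 \bigr\}^{-\gamma/2} \quad \text{with the constants } W_0^{(2)} > 0,\ \gamma > 2,\ r_0 > 0 , \tag{7.8}
\]
where the distance is given by
\[
{\rm dist}(x, y) := \sqrt{ \min_{m \in \mathbb{Z}} |x - mL_x|^2 + \min_{n \in \mathbb{Z}} |y - nL_y|^2 } . \tag{7.9}
\]

For the case with $A_{\rm P} \ne 0$, we have the following theorem: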
Theorem 7.5. Require Assumptions 7.1, 7.3 and 7.4 in addition to the assumptions (including Assumption 2.1) in Sec. 2. Then the Hall conductance $\sigma_{xy}(\phi)$ of (3.37) is uniformly bounded in the number $N$ of the electrons and in the system sizes $L_x, L_y$ for any fixed filling factor $\nu$ of the electrons.

The proof is given in Appendix D. Under the same assumptions, we can also prove the boundedness of the acceleration coefficients $\gamma_{sy}(\phi)$, and get the same bound for $\delta\sigma_{sy}(t; \phi, \eta, T)$ as (7.5) with different constants $C_j$.

7.2. Fractional quantization of the Hall conductance $\sigma_{xy}(\phi)$

The rational number $p/q$ in the right-hand side of (7.1) or (7.2) may be equal to an integer, i.e. the number $p$ may equal a multiple of $q$. But, from the result (4.2), we can expect that the Hall conductance exhibits purely fractional quantization as
\[
\sigma_{xy}(\phi_0) = -\frac{e^2}{h} \nu \tag{7.10}
\]
for a fractional filling factor $\nu = p/q \notin \mathbb{N}$ when a spectral gap exists above the sector of the ground state with weak disorder. Next we shall show that this expectation holds under certain assumptions. When the system is translationally invariant in one direction [27], we can obtain the desired result as follows:

Theorem 7.6. In addition to the assumptions (including Assumption 2.1) in Sec. 2, assume that the electrostatic potential $W$ is a function of the single variable $x$ only or $y$ only as $W(x)$ or $W(y)$, and assume $A_{\rm P} = 0$, $W \in C^1(\mathbb{R})$ and $W^{(2)} \in C^1(\mathbb{R}^2)$. Then there exist gauge parameters $\phi_0 \in T_g$ such that
\[
\sigma_{xy}(\phi_0) = -\frac{e^2}{h} \frac{p}{q} \tag{7.11}
\]
for the fractional filling factor $\nu = p/q$ of the Landau levels.

The proof is given in Appendix E. In order to proceed to more generic situations, we write
\[
D^{(m,n)} := \frac{\partial^{m+n}}{\partial x^m \partial y^n} \tag{7.12}
\]
for non-negative integers $m, n$.

Assumption 7.7. The electrostatic potential $W$ and the electron–electron interaction $W^{(2)}$ satisfy higher differentiability as $W \in C^3(\mathbb{R}^2)$ and $W^{(2)} \in C^1(\mathbb{R}^2)$.

Theorem 7.8. Suppose $A_{\rm P} = 0$, i.e. $B_{{\rm P},z} = 0$, and require Assumption 7.7 in addition to the assumptions (including Assumption 2.1) in Sec. 2. Then the Hall conductance $\overline{\sigma_{xy}(\phi)}$ averaged over the gauge parameters $\phi$ satisfies the bound,
\[
-\frac{e^2}{h} \nu(1 + \delta) \le \overline{\sigma_{xy}(\phi)} \le -\frac{e^2}{h} \nu(1 - \delta) , \tag{7.13}
\]
with
\[
\delta = 2\ell_B^4 \frac{\hbar\omega_c}{(\Delta E)^3} \max_{m+n=2} \| D^{(m,n)} W \|_\infty^2 . \tag{7.14}
\]
Here $\nu = N/M$ is the fixed filling factor of the electrons, and the definition of the energy gap $\Delta E$ is given in Assumption 2.1.

The proof is given in Appendix E. We can obtain a bound similar to (7.13) for the non-averaged Hall conductance $\sigma_{xy}(\phi)$. See Appendix E.

Corollary 7.9. Under the same assumptions as in the above theorem, there exist gauge parameters $\phi_0 \in T_g$ such that
\[
\sigma_{xy}(\phi_0) = -\frac{e^2}{h} \frac{p}{q} \tag{7.15}
\]
with the integers $p, q$ satisfying
\[
\nu(1 - \delta) \le \frac{p}{q} \le \nu(1 + \delta) \tag{7.16}
\]
with the same $\delta$ as in the theorem.
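The window (7.16) can be explored with exact rational arithmetic. The sketch below evaluates $\delta$ from (7.14), tests whether the window $[\nu(1 - \delta), \nu(1 + \delta)]$ contains an integer, and lists the reduced fractions $p/q$ with bounded denominator that fall inside it. The cutoff `q_max` and all numerical values are hypothetical; the paper only assumes that $q$ is the dimension of the ground-state sector.

```python
import math
from fractions import Fraction


def disorder_delta(l_b, hbar_omega_c, gap, max_second_derivative):
    """delta of (7.14); max_second_derivative is max over m + n = 2 of ||D^(m,n) W||."""
    return 2 * l_b**4 * hbar_omega_c * max_second_derivative**2 / gap**3


def allowed_window(nu, delta):
    """Closed interval (7.16) for p / q."""
    return nu * (1 - delta), nu * (1 + delta)


def window_contains_integer(nu, delta):
    low, high = allowed_window(nu, delta)
    return math.ceil(low) <= math.floor(high)


def admissible_fractions(nu, delta, q_max):
    """Reduced fractions p / q with 1 <= q <= q_max lying inside the window."""
    low, high = allowed_window(nu, delta)
    found = set()
    for q in range(1, q_max + 1):
        for p in range(math.ceil(low * q), math.floor(high * q) + 1):
            found.add(Fraction(p, q))
    return sorted(found)


nu = Fraction(1, 3)
delta = Fraction(1, 10)
print(allowed_window(nu, delta))
print(window_contains_integer(nu, delta))
print(admissible_fractions(nu, delta, q_max=5))
```

For $\nu = 1/3$ and $\delta = 1/10$ the window contains no integer, so any $p/q$ compatible with (7.16) is a genuine fraction; with denominators up to 5 only $1/3$ itself qualifies.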
Consider again the situation where the interval $[\nu(1 - \delta), \nu(1 + \delta)]$ does not include any integer. Then the number $p/q$ must be equal to a purely fractional number, i.e. a non-integer. In addition to this condition, if the number $\delta$ and the dimension $q$ of the sector of the ground state are uniformly bounded in the sizes $L_x, L_y$ of the system, then there exists a sequence $\{(L_x^{(i)}, L_y^{(i)})\}_i$ of the system sizes going to infinity and two integers $p^{(\infty)}, q^{(\infty)}$ such that
\[
\sigma_{xy}^{(\infty)} := \lim_{L_x^{(i)}, L_y^{(i)} \to \infty} \sigma_{xy}(\phi_0) = -\frac{e^2}{h} \frac{p^{(\infty)}}{q^{(\infty)}} , \tag{7.17}
\]
and that $\sigma_{xy}^{(\infty)}$ satisfies the bound derived from (7.16). Therefore the number $p^{(\infty)}/q^{(\infty)}$ equals a purely fractional number also in the infinite volume limit.

In order to get a similar bound for the Hall conductance in the case with the vector potential $A_{\rm P} \ne 0$, we need stronger assumptions as follows:

Assumption 7.10. The magnetic field $B_{{\rm P},z}$ for the vector potential $A_{\rm P}$ and the electrostatic potential $W$ of the present model satisfy higher differentiability as $B_{{\rm P},z} \in C^4(\mathbb{R}^2)$ and $W \in C^4(\mathbb{R}^2)$.

Assumption 7.11. The electron–electron interaction $W^{(2)}$ of the present model satisfies $W^{(2)} \in C^2(\mathbb{R}^2)$, and the following two conditions:
\[
\ell_B \Bigl[ \Bigl| \frac{\partial W^{(2)}}{\partial x}(r) \Bigr| + \Bigl| \frac{\partial W^{(2)}}{\partial y}(r) \Bigr| \Bigr] \le \alpha_{\rm int} W^{(2)}(r) \tag{7.18}
\]
and
\[
\ell_B^2 \Bigl[ \Bigl| \frac{\partial^2 W^{(2)}}{\partial x^2}(r) \Bigr| + \Bigl| \frac{\partial^2 W^{(2)}}{\partial y^2}(r) \Bigr| \Bigr] \le \alpha_{\rm int} W^{(2)}(r) \tag{7.19}
\]
for any $r$, with a positive constant $\alpha_{\rm int}$ which is independent of the sizes $L_x, L_y$ of the system.

Theorem 7.12. In addition to the assumptions (including Assumption 2.1) in Sec. 2, we require Assumptions 7.4, 7.10 and 7.11. Then there exists a positive number $\delta$ such that $\delta$ is a continuous function of the norms, $\| D^{(k,\ell)} B_{{\rm P},z} \|_\infty$ for $k + \ell \le 4$ and $\| D^{(m,n)} W \|_\infty$ for $m + n \le 3$, and satisfies $\delta = 0$ for the special point with $A_{\rm P} = 0$ and $W = 0$, and that the Hall conductance $\sigma_{xy}(\phi)$ satisfies the bound,
\[
-\frac{e^2}{h} \nu(1 + \delta) \le \sigma_{xy}(\phi) \le -\frac{e^2}{h} \nu(1 - \delta) , \tag{7.20}
\]
where $\nu = N/M$ is the fixed filling factor of the electrons.

The proof is given in Appendix E. Clearly we get the following corollary similar to Corollary 7.9:

Corollary 7.13. Under the same assumptions as in the above theorem, there exist gauge parameters $\phi_0 \in T_g$ such that
\[
\sigma_{xy}(\phi_0) = -\frac{e^2}{h} \frac{p}{q} \tag{7.21}
\]
with integers $p, q$ satisfying
\[
\nu(1 - \delta) \le \frac{p}{q} \le \nu(1 + \delta) \tag{7.22}
\]
with the same $\delta$ as in the above theorem.

For this case, we can also make the same remarks as those after Corollary 7.9.

Appendix A. Differentiability of the Ground-State Wavefunctions

Following Kato (see [47, Chap. II, Sec. 4.2]), we give the proof of Proposition 5.1 in this appendix. For this purpose, it is enough to construct an operator-valued function $U_g(\phi)$ of the gauge parameters $\phi \in T_g$ with the following two conditions:
• The inverse $U_g^{-1}(\phi)$ exists and both $U_g(\phi)$ and $U_g^{-1}(\phi)$ are infinitely differentiable with respect to $\phi$;
• $U_g(\phi) Q(E_0^{(N)}(0)) U_g^{-1}(\phi) = Q(E_0^{(N)}(\phi))$.
From this second property, one obtains that, if the vectors $\hat\Phi_{0,\mu}^{(N)}(0)$, $\mu = 1, 2, \ldots, q$, span the sector of the degenerate ground state for $\phi = 0$, then the vectors $\hat\Phi_{0,\mu}^{(N)}(\phi) = U_g(\phi) \hat\Phi_{0,\mu}^{(N)}(0)$, $\mu = 1, 2, \ldots, q$, also span the sector of the degenerate ground state for any given $\phi \in T_g$. We begin with the following lemma:
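Lemma A.1. Let $z \notin \sigma(H_0^{(N)}(\phi))$, i.e. the complex number $z$ is not in the spectrum $\sigma(H_0^{(N)}(\phi))$ of the Hamiltonian $H_0^{(N)}(\phi)$. Then
\[
\Bigl\| \frac{1}{m_e} [p_{s,j} + eA_s(r_j) + \phi_s] \frac{1}{z - H_0^{(N)}(\phi)} \Bigr\| \le \sqrt{ \frac{2}{m_e\, {\rm dist}(z, \sigma(H_0^{(N)}(\phi)))} \Bigl[ 1 + \frac{|z| + N\|W\|_\infty + N(N-1)\|W^{(2)}\|_\infty/2}{{\rm dist}(z, \sigma(H_0^{(N)}(\phi)))} \Bigr] } \tag{A.1}
\]
for $s = x, y$ and $j = 1, 2, \ldots, N$.

Proof. Let $\Phi$ be an $N$-electron vector with norm one. Then one has
\[
\begin{aligned}
&\Bigl\langle \Phi, \frac{1}{z^* - H_0^{(N)}(\phi)} \Bigl\{ \frac{1}{m_e} [p_{s,j} + eA_s(r_j) + \phi_s] \Bigr\}^2 \frac{1}{z - H_0^{(N)}(\phi)} \Phi \Bigr\rangle \\
&\quad \le \frac{2}{m_e} \Bigl\langle \Phi, \frac{1}{z^* - H_0^{(N)}(\phi)} \Bigl[ H_0^{(N)}(\phi) + N\|W\|_\infty + \frac{N(N-1)}{2} \|W^{(2)}\|_\infty \Bigr] \frac{1}{z - H_0^{(N)}(\phi)} \Phi \Bigr\rangle \\
&\quad = \frac{2}{m_e} \sum_n \bigl| \bigl\langle \Phi, \Phi_n^{(N)}(\phi) \bigr\rangle \bigr|^2 \frac{1}{|z - E_n^{(N)}(\phi)|^2} \Bigl[ E_n^{(N)}(\phi) + N\|W\|_\infty + \frac{N(N-1)}{2} \|W^{(2)}\|_\infty \Bigr] ,
\end{aligned} \tag{A.2}
\]
where $\Phi_n^{(N)}(\phi)$ are the eigenvectors of $H_0^{(N)}(\phi)$ with the eigenvalue $E_n^{(N)}(\phi)$ counting degenerate eigenvalues a number of times equal to their multiplicity. Note that
\[
\begin{aligned}
&\frac{1}{|z - E_n^{(N)}(\phi)|^2} \Bigl| E_n^{(N)}(\phi) + N\|W\|_\infty + \frac{N(N-1)}{2} \|W^{(2)}\|_\infty \Bigr| \\
&\quad = \frac{1}{|z - E_n^{(N)}(\phi)|^2} \Bigl| E_n^{(N)}(\phi) - z + z + N\|W\|_\infty + \frac{N(N-1)}{2} \|W^{(2)}\|_\infty \Bigr| \\
&\quad = \Bigl| -\frac{1}{z^* - E_n^{(N)}(\phi)} + \frac{1}{|z - E_n^{(N)}(\phi)|^2} \Bigl[ z + N\|W\|_\infty + \frac{N(N-1)}{2} \|W^{(2)}\|_\infty \Bigr] \Bigr| \\
&\quad \le \frac{1}{{\rm dist}(z, \sigma(H_0^{(N)}(\phi)))} + \frac{|z| + N\|W\|_\infty + N(N-1)\|W^{(2)}\|_\infty/2}{{\rm dist}(z, \sigma(H_0^{(N)}(\phi)))^2} .
\end{aligned} \tag{A.3}
\]
Substituting this into (A.2), the desired bound (A.1) is obtained.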
For any $z \notin \sigma(H_0^{(N)}(\phi))$, one has
\[
\frac{\partial}{\partial\phi_s} R^{(N)}(z; \phi) = R^{(N)}(z; \phi) \sum_{j=1}^{N} \frac{1}{m_e} [p_{s,j} + eA_s(r_j) + \phi_s] R^{(N)}(z; \phi) , \tag{A.4}
\]
where we have written
\[
R^{(N)}(z; \phi) = \frac{1}{z - H_0^{(N)}(\phi)} . \tag{A.5}
\]
Here, since the product $[p_{s,j} + eA_s(r_j) + \phi_s] R^{(N)}(z; \phi)$ of the two operators is bounded owing to the above Lemma A.1, the resolvent $R^{(N)}(z; \phi)$ is infinitely differentiable with respect to $\phi$. Therefore, by the integral representation (5.4), the projection $Q(E_0^{(N)}(\phi))$ is also infinitely differentiable with respect to $\phi$. We introduce abbreviations as
\[
P(\phi) = Q(E_0^{(N)}(\phi)) , \tag{A.6}
\]
and
\[
P_s(\phi) = \frac{\partial}{\partial\phi_s} Q(E_0^{(N)}(\phi)) \quad \text{for } s = x, y , \tag{A.7}
\]
and write the commutators for them as
\[
F^{(s)}(\phi) = [P_s(\phi), P(\phi)] . \tag{A.8}
\]
Note that, for $s = x, y$,
\[
P_s(\phi) P(\phi) + P(\phi) P_s(\phi) = P_s(\phi) , \tag{A.9}
\]
and
\[
P(\phi) P_s(\phi) P(\phi) = 0 . \tag{A.10}
\]
These are equivalent to Eqs. (5.9) and (5.10), respectively. Combining the latter with (A.8), one has
\[
P(\phi) F^{(s)}(\phi) = -P(\phi) P_s(\phi) , \quad F^{(s)}(\phi) P(\phi) = P_s(\phi) P(\phi) . \tag{A.11}
\]
Further, from these and (A.9), one obtains
\[
P_s(\phi) = [F^{(s)}(\phi), P(\phi)] . \tag{A.12}
\]
Now introduce the ordinary differential equation,
\[
\frac{d}{d\phi_x} X_+ = F^{(x)}(\phi_x, 0) X_+ , \tag{A.13}
\]
for the unknown operator-valued function $X_+ = X_+(\phi_x)$ of $\phi_x$. Here we have fixed the argument $\phi_y$ to zero in $F^{(x)}$. Let $X_+ = U_g^{(x)}(\phi_x)$ be the unique solution satisfying the initial condition $U_g^{(x)}(0) = 1$. Since $F^{(x)}(\phi_x, 0)$ is infinitely differentiable with respect to $\phi_x$, the solution $U_g^{(x)}(\phi_x)$ is also infinitely differentiable. The existence and the infinite differentiability of the unique solution follow from the standard theory for differential equations. Further introduce the ordinary differential equation,
\[
\frac{d}{d\phi_x} X_- = -X_- F^{(x)}(\phi_x, 0) , \tag{A.14}
\]
for the unknown operator-valued function $X_- = X_-(\phi_x)$. Let $X_- = V_g^{(x)}(\phi_x)$ be the unique solution satisfying the initial condition $V_g^{(x)}(0) = 1$. Note that
\[
\begin{aligned}
\frac{d}{d\phi_x} \bigl[ V_g^{(x)}(\phi_x) U_g^{(x)}(\phi_x) \bigr]
&= \frac{dV_g^{(x)}}{d\phi_x}(\phi_x) U_g^{(x)}(\phi_x) + V_g^{(x)}(\phi_x) \frac{dU_g^{(x)}}{d\phi_x}(\phi_x) \\
&= -V_g^{(x)}(\phi_x) F^{(x)}(\phi_x, 0) U_g^{(x)}(\phi_x) + V_g^{(x)}(\phi_x) F^{(x)}(\phi_x, 0) U_g^{(x)}(\phi_x) = 0 .
\end{aligned} \tag{A.15}
\]
Hence $V_g^{(x)} U_g^{(x)}$ is a constant and
\[
V_g^{(x)}(\phi_x) U_g^{(x)}(\phi_x) = V_g^{(x)}(0) U_g^{(x)}(0) = 1 \quad \text{for all } \phi_x . \tag{A.16}
\]
Similarly one has
\[
\begin{aligned}
\frac{d}{d\phi_x} \bigl[ U_g^{(x)}(\phi_x) V_g^{(x)}(\phi_x) \bigr]
&= \frac{dU_g^{(x)}}{d\phi_x}(\phi_x) V_g^{(x)}(\phi_x) + U_g^{(x)}(\phi_x) \frac{dV_g^{(x)}}{d\phi_x}(\phi_x) \\
&= F^{(x)}(\phi_x, 0) U_g^{(x)}(\phi_x) V_g^{(x)}(\phi_x) + U_g^{(x)}(\phi_x) \bigl[ -V_g^{(x)}(\phi_x) F^{(x)}(\phi_x, 0) \bigr] \\
&= \bigl[ F^{(x)}(\phi_x, 0), U_g^{(x)}(\phi_x) V_g^{(x)}(\phi_x) \bigr] .
\end{aligned} \tag{A.17}
\]
This is also an ordinary differential equation for $Z(\phi_x) = U_g^{(x)}(\phi_x) V_g^{(x)}(\phi_x)$ with the initial condition $Z(0) = U_g^{(x)}(0) V_g^{(x)}(0) = 1$. Clearly $Z(\phi_x) = 1$ satisfies the equation and the initial condition. Combining this with the uniqueness of the solution, the following relation must hold:
\[
U_g^{(x)}(\phi_x) V_g^{(x)}(\phi_x) = 1 \quad \text{for all } \phi_x . \tag{A.18}
\]
By combining this with (A.16), one gets $V_g^{(x)} = (U_g^{(x)})^{-1}$.

Consider the operator-valued function $P(\phi_x, 0) U_g^{(x)}(\phi_x)$, and by differentiation, one has
\[
\begin{aligned}
\frac{d}{d\phi_x} \bigl[ P(\phi_x, 0) U_g^{(x)}(\phi_x) \bigr]
&= P_x(\phi_x, 0) U_g^{(x)}(\phi_x) + P(\phi_x, 0) \frac{dU_g^{(x)}}{d\phi_x}(\phi_x) \\
&= P_x(\phi_x, 0) U_g^{(x)}(\phi_x) + P(\phi_x, 0) F^{(x)}(\phi_x, 0) U_g^{(x)}(\phi_x) \\
&= \bigl[ P_x(\phi_x, 0) + P(\phi_x, 0) F^{(x)}(\phi_x, 0) \bigr] U_g^{(x)}(\phi_x) \\
&= F^{(x)}(\phi_x, 0) P(\phi_x, 0) U_g^{(x)}(\phi_x) ,
\end{aligned} \tag{A.19}
\]
where we have used the relation (A.12). Thus the function $X_+ = P(\phi_x, 0) U_g^{(x)}(\phi_x)$ is a solution of the differential equation (A.13) with the initial condition $P(0, 0) U_g^{(x)}(0) = P(0, 0)$. By the uniqueness of the solution, $P(\phi_x, 0) U_g^{(x)}(\phi_x)$ must coincide with $U_g^{(x)}(\phi_x) P(0, 0)$ which has the same initial condition $X_+(0) = P(0, 0)$. Namely one has
\[
P(\phi_x, 0) U_g^{(x)}(\phi_x) = U_g^{(x)}(\phi_x) P(0, 0) . \tag{A.20}
\]
Next consider the ordinary differential equation, d Y+ = F (y) (φx , φy )Y+ , dφy
(A.21)
for the unknown operator-valued function $Y_+ = Y_+(\phi_y;\phi_x)$ of $\phi_y$ and with the parameter $\phi_x$. Let $Y_+ = U_g^{(y)}(\phi_x,\phi_y)$ be the unique solution satisfying the initial condition $U_g^{(y)}(\phi_x,0) = 1$. The solution $U_g^{(y)}(\phi_x,\phi_y)$ is infinitely differentiable with respect to both $\phi_x$ and $\phi_y$ because of the infinite differentiability of $F^{(y)}$. Further introduce the ordinary differential equation,

$$\frac{d}{d\phi_y}Y_- = -Y_-F^{(y)}(\phi_x,\phi_y), \tag{A.22}$$

for the unknown operator-valued function $Y_- = Y_-(\phi_y;\phi_x)$ of $\phi_y$ and with the parameter $\phi_x$. Let $Y_- = V_g^{(y)}(\phi_x,\phi_y)$ be the unique solution satisfying the initial condition $V_g^{(y)}(\phi_x,0) = 1$. Then one has $V_g^{(y)} = (U_g^{(y)})^{-1}$ in the same way as above. Consider the function $P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y)$. By differentiation, one has

$$\begin{aligned}
\frac{\partial}{\partial\phi_y}\left[P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y)\right]
&= P_y(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y) + P(\phi_x,\phi_y)\frac{\partial U_g^{(y)}}{\partial\phi_y}(\phi_x,\phi_y) \\
&= \left[P_y(\phi_x,\phi_y) + P(\phi_x,\phi_y)F^{(y)}(\phi_x,\phi_y)\right]U_g^{(y)}(\phi_x,\phi_y) \\
&= F^{(y)}(\phi_x,\phi_y)P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y),
\end{aligned} \tag{A.23}$$
where we have used the relation (A.12). This equation may be treated as the ordinary differential equation (A.21) for the function $Y_+(\phi_y;\phi_x) = P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y)$ of $\phi_y$ and with the parameter $\phi_x$. The initial condition is $P(\phi_x,0)U_g^{(y)}(\phi_x,0) = P(\phi_x,0)$. By the uniqueness of the solution, $P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y)$ must coincide with $U_g^{(y)}(\phi_x,\phi_y)P(\phi_x,0)$, which has the same initial condition $U_g^{(y)}(\phi_x,0)P(\phi_x,0) = P(\phi_x,0)$. Namely one has

$$P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y) = U_g^{(y)}(\phi_x,\phi_y)P(\phi_x,0). \tag{A.24}$$
Combining this with (A.20), one gets

$$\begin{aligned}
P(\phi_x,\phi_y)U_g^{(y)}(\phi_x,\phi_y)U_g^{(x)}(\phi_x)
&= U_g^{(y)}(\phi_x,\phi_y)P(\phi_x,0)U_g^{(x)}(\phi_x) \\
&= U_g^{(y)}(\phi_x,\phi_y)U_g^{(x)}(\phi_x)P(0,0).
\end{aligned} \tag{A.25}$$

Let $U_g(\phi) = U_g^{(y)}(\phi)U_g^{(x)}(\phi_x)$. Then the operator $U_g(\phi)$ is invertible and both the operator $U_g(\phi)$ and its inverse are infinitely differentiable with respect to $\phi$. Besides, the above result is rewritten in the desired form as

$$P(\phi) = U_g(\phi)P(0,0)U_g^{-1}(\phi). \tag{A.26}$$
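The intertwining relation (A.20) can be checked numerically in a small example. The following Python sketch is only an illustration under stated assumptions: the rank-one projection family `projection(phi)` on a two-dimensional space and the generator $F = [P', P]$ (Kato's choice, which satisfies a relation of the form $P' + PF = FP$) are hypothetical stand-ins and not the operators of the present paper. The sketch integrates $dU/d\phi = F(\phi)U$ with $U(0) = 1$ by a fourth-order Runge-Kutta scheme and measures the deviation of $P(\phi)U(\phi)$ from $U(\phi)P(0)$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b, s=1.0):
    """Return a + s * b for 2x2 matrices."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

def projection(phi):
    """Hypothetical projection onto the unit vector (cos phi, sin phi)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c * c, c * s], [c * s, s * s]]

def d_projection(phi):
    """Derivative of projection(phi) with respect to phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [[-2 * c * s, c * c - s * s], [c * c - s * s, 2 * c * s]]

def generator(phi):
    """Assumed generator F = [P', P] = P'P - PP'."""
    p, dp = projection(phi), d_projection(phi)
    return add(matmul(dp, p), matmul(p, dp), -1.0)

def transport(phi_end, steps=2000):
    """Integrate dU/dphi = F(phi) U, U(0) = 1, with classical RK4."""
    u = [[1.0, 0.0], [0.0, 1.0]]
    h = phi_end / steps
    for n in range(steps):
        phi = n * h
        k1 = matmul(generator(phi), u)
        k2 = matmul(generator(phi + h / 2), add(u, k1, h / 2))
        k3 = matmul(generator(phi + h / 2), add(u, k2, h / 2))
        k4 = matmul(generator(phi + h), add(u, k3, h))
        incr = add(add(k1, k2, 2.0), add(k4, k3, 2.0))  # k1 + 2k2 + 2k3 + k4
        u = add(u, incr, h / 6)
    return u

def intertwining_error(phi):
    """Largest entry of P(phi) U(phi) - U(phi) P(0)."""
    u = transport(phi)
    lhs = matmul(projection(phi), u)
    rhs = matmul(u, projection(0.0))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    print(intertwining_error(1.0))
```

In this toy case the generator is the constant rotation generator, so $U(\phi)$ is a rotation and the printed deviation is at the level of the integration error.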
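Appendix B. Proof of Proposition 6.2

In order to prove Proposition 6.2, we need some lemmas.

Lemma B.1. Let $z \notin \sigma(H(\phi))$. Then

$$\left\|v_i^{(0)}(\phi)R(z;\phi)\right\| \le \sqrt{\frac{2}{m_e}\left[\frac{1}{\operatorname{dist}(z,\sigma(H(\phi)))} + \frac{|z| + \|W\|_\infty}{\operatorname{dist}(z,\sigma(H(\phi)))^2}\right]} \tag{B.1}$$

with the velocity operator $v_i^{(0)}(\phi) = [p_i + eA_i(r) + \phi_i]/m_e$ for $i = x, y$.

Proof. For any vector $\psi$, one has

$$\begin{aligned}
\left\|v_i^{(0)}(\phi)\frac{1}{z - H(\phi)}\psi\right\|^2
&= \left\langle\psi, \frac{1}{z^* - H(\phi)}[v_i^{(0)}(\phi)]^2\frac{1}{z - H(\phi)}\psi\right\rangle \\
&\le \frac{2}{m_e}\left\langle\psi, \frac{1}{z^* - H(\phi)}\left[H(\phi) + \|W\|_\infty\right]\frac{1}{z - H(\phi)}\psi\right\rangle \\
&= \frac{2}{m_e}\sum_n |\langle\psi,\psi_n\rangle|^2\,\frac{1}{z^* - E_n}\left[E_n + \|W\|_\infty\right]\frac{1}{z - E_n}.
\end{aligned} \tag{B.2}$$

Here, since

$$\begin{aligned}
\left|\frac{1}{z^* - E_n}\left[E_n + \|W\|_\infty\right]\frac{1}{z - E_n}\right|
&= \left|\frac{1}{z^* - E_n}\left[E_n - z + z + \|W\|_\infty\right]\frac{1}{z - E_n}\right| \\
&= \left|-\frac{1}{z^* - E_n} + \frac{z + \|W\|_\infty}{|z - E_n|^2}\right| \\
&\le \frac{1}{\operatorname{dist}(z,\sigma(H(\phi)))} + \frac{|z| + \|W\|_\infty}{\operatorname{dist}(z,\sigma(H(\phi)))^2},
\end{aligned} \tag{B.3}$$

the desired bound is obtained.

Lemma B.2. Let $z \notin \sigma(H(\phi + \delta\phi)) \cup \sigma(H(\phi))$. Then

$$\|R(z;\phi + \delta\phi) - R(z;\phi)\| \le C\max_{i=x,y}|\delta\phi_i|, \tag{B.4}$$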
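where the positive constant $C$ depends on $\operatorname{dist}(z, \sigma(H(\phi + \delta\phi)) \cup \sigma(H(\phi)))$.

Proof. From

$$H(\phi + \delta\phi) - H(\phi) = \frac{1}{m_e}\delta\phi\cdot[p + eA(r) + \phi] + \frac{1}{2m_e}(\delta\phi)^2, \tag{B.5}$$

one has

$$R(z;\phi + \delta\phi) - R(z;\phi) = \sum_{i=x,y} R(z;\phi + \delta\phi)v_i^{(0)}(\phi)R(z;\phi)\,\delta\phi_i + \frac{1}{2m_e}R(z;\phi + \delta\phi)R(z;\phi)(\delta\phi_x^2 + \delta\phi_y^2). \tag{B.6}$$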
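The identity (B.6) and the bound (B.1) can be illustrated in a deliberately simple toy model. The sketch below assumes a free particle in one dimension with $W = 0$, so that on a plane wave of momentum $k$ the Hamiltonian acts as the number $(k + \phi)^2/(2m_e)$ and the velocity as $(k + \phi)/m_e$; the momenta, the mass `M_E = 1.0` and the spectral parameter `z` are hypothetical values chosen for illustration, not quantities taken from the paper.

```python
import math

M_E = 1.0  # electron mass in arbitrary units (illustrative assumption)

def energy(k, phi):
    """Eigenvalue of the toy Hamiltonian on the plane wave with momentum k."""
    return (k + phi) ** 2 / (2.0 * M_E)

def velocity(k, phi):
    """Eigenvalue of the toy velocity operator (k + phi) / m_e."""
    return (k + phi) / M_E

def resolvent(z, k, phi):
    """Resolvent R(z; phi) = 1 / (z - H(phi)) on the plane wave k."""
    return 1.0 / (z - energy(k, phi))

def identity_b6_error(z, momenta, phi, dphi):
    """Largest mismatch between the two sides of (B.6) over all momenta."""
    worst = 0.0
    for k in momenta:
        r_new = resolvent(z, k, phi + dphi)
        r_old = resolvent(z, k, phi)
        lhs = r_new - r_old
        rhs = (r_new * velocity(k, phi) * r_old * dphi
               + r_new * r_old * dphi ** 2 / (2.0 * M_E))
        worst = max(worst, abs(lhs - rhs))
    return worst

def lemma_b1_sides(z, momenta, phi):
    """Return (||v R||, right-hand side of (B.1)) for the toy model, W = 0."""
    dist = min(abs(z - energy(k, phi)) for k in momenta)
    lhs = max(abs(velocity(k, phi) * resolvent(z, k, phi)) for k in momenta)
    rhs = math.sqrt((2.0 / M_E) * (1.0 / dist + abs(z) / dist ** 2))
    return lhs, rhs

if __name__ == "__main__":
    ks = [-2.0, -1.0, 0.0, 1.0, 2.0]
    z = 0.3 + 0.5j
    print(identity_b6_error(z, ks, 0.1, 0.05))
    print(lemma_b1_sides(z, ks, 0.1))
```

Because every operator is diagonal in this model, the operator norm is simply the maximum over the chosen momenta; the general case requires the spectral argument of the proof of Lemma B.1.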
The norm is evaluated as

$$\begin{aligned}
\|R(z;\phi + \delta\phi) - R(z;\phi)\|
&\le \sum_{i=x,y}\|R(z;\phi + \delta\phi)\|\,\left\|v_i^{(0)}(\phi)R(z;\phi)\right\|\,|\delta\phi_i| \\
&\quad + \frac{1}{2m_e}\|R(z;\phi + \delta\phi)\|\,\|R(z;\phi)\|\,(\delta\phi_x^2 + \delta\phi_y^2).
\end{aligned} \tag{B.7}$$

Consequently, from Lemma B.1, the bound (B.4) is obtained.

Lemma B.3. The following bound is valid:

$$\|Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\| \le C\max_{i=x,y}|\delta\phi_i| \tag{B.8}$$
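with a positive constant $C$.

Proof. By definition,

$$\begin{aligned}
\|Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\|
&= \left\|\frac{1}{2\pi i}\int_\gamma [R(z;\phi + \delta\phi) - R(z;\phi)]\,dz\right\| \\
&\le \frac{|\gamma|}{2\pi}\max_{z\in\gamma}\|R(z;\phi + \delta\phi) - R(z;\phi)\|,
\end{aligned} \tag{B.9}$$

where $|\gamma|$ denotes the total length of the path $\gamma$. Therefore the inequality (B.8) follows from Lemma B.2.

Lemma B.4. The following bound is valid:

$$\|Q_i(E \le E_F;\phi + \delta\phi) - Q_i(E \le E_F;\phi)\| \le C\max_{j=x,y}|\delta\phi_j| \tag{B.10}$$

with a positive constant $C$.

Proof. Since

$$\int_\gamma [R(z;\phi)]^2\,dz = 0, \tag{B.11}$$

one has another expression

$$Q_i(E \le E_F;\phi) = \frac{1}{2\pi i}\int_\gamma R(z;\phi)v_i^{(0)}(\phi')R(z;\phi)\,dz \tag{B.12}$$

with $v_i^{(0)}(\phi') = [p_i + eA_i(r) + \phi'_i]/m_e$ for $i = x, y$ and for any $\phi'$. Using this expression (B.12), one has

$$\begin{aligned}
Q_i(E \le E_F;\phi + \delta\phi) - Q_i(E \le E_F;\phi)
&= \frac{1}{2\pi i}\int_\gamma [R(z;\phi + \delta\phi) - R(z;\phi)]\,v_i^{(0)}(\phi + \delta\phi)R(z;\phi + \delta\phi)\,dz \\
&\quad + \frac{1}{2\pi i}\int_\gamma R(z;\phi)\,v_i^{(0)}(\phi + \delta\phi)[R(z;\phi + \delta\phi) - R(z;\phi)]\,dz.
\end{aligned} \tag{B.13}$$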
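The norm can be evaluated as

$$\begin{aligned}
\|Q_i(E \le E_F;\phi + \delta\phi) - Q_i(E \le E_F;\phi)\|
&\le \frac{|\gamma|}{2\pi}\max_{z\in\gamma}\|R(z;\phi + \delta\phi) - R(z;\phi)\|\,\left\|v_i^{(0)}(\phi + \delta\phi)R(z;\phi + \delta\phi)\right\| \\
&\quad + \frac{|\gamma|}{2\pi}\max_{z\in\gamma}\|R(z;\phi)\|\,\left\|v_i^{(0)}(\phi + \delta\phi)[R(z;\phi + \delta\phi) - R(z;\phi)]\right\|.
\end{aligned} \tag{B.14}$$

Using Lemmas B.1 and B.2, the first term on the right-hand side can be evaluated, and one can get the desired bound for the first term. Thus it is sufficient to evaluate the second term. Using the identity (B.6), one has

$$\begin{aligned}
\left\|v_i^{(0)}(\phi + \delta\phi)[R(z;\phi + \delta\phi) - R(z;\phi)]\right\|
&\le \sum_{j=x,y}\left\|v_i^{(0)}(\phi + \delta\phi)R(z;\phi + \delta\phi)\right\|\,\left\|v_j^{(0)}(\phi)R(z;\phi)\right\|\,|\delta\phi_j| \\
&\quad + \frac{1}{2m_e}\left\|v_i^{(0)}(\phi + \delta\phi)R(z;\phi + \delta\phi)\right\|\,\|R(z;\phi)\|\,(\delta\phi_x^2 + \delta\phi_y^2).
\end{aligned} \tag{B.15}$$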
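Combining this with Lemma B.1, one can get the desired bound also for the second term on the right-hand side of (B.14).

Now we shall give the proof of Proposition 6.2. Since all the cases can be treated in the same way, we treat the case with $i = x$, $j = y$ only. Note that

$$\begin{aligned}
&\operatorname{Tr}[Q(E \le E_F;\phi + \delta\phi)Q_x(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)] - \operatorname{Tr}[Q(E \le E_F;\phi)Q_x(E \le E_F;\phi)Q_y(E \le E_F;\phi)] \\
&\quad = \operatorname{Tr}[\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\}Q_x(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)] \\
&\qquad + \operatorname{Tr}[Q(E \le E_F;\phi)\{Q_x(E \le E_F;\phi + \delta\phi) - Q_x(E \le E_F;\phi)\}Q_y(E \le E_F;\phi + \delta\phi)] \\
&\qquad + \operatorname{Tr}[Q(E \le E_F;\phi)Q_x(E \le E_F;\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}].
\end{aligned} \tag{B.16}$$

Consider first the first term on the right-hand side. Using the identity,

$$Q_i(E \le E_F;\phi) = Q(E \le E_F;\phi)Q_i(E \le E_F;\phi) + Q_i(E \le E_F;\phi)Q(E \le E_F;\phi), \tag{B.17}$$

which is the non-interacting version of (5.9), one has

$$\begin{aligned}
\operatorname{Tr}[\delta Q_\le\, Q_x(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)]
&= \sum_{m:E_m \le E_F}\left\langle\tilde\psi_m, Q_x(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)\,\delta Q_\le\,\tilde\psi_m\right\rangle \\
&\quad + \sum_{m:E_m \le E_F}\left\langle\tilde\psi_m, Q_y(E \le E_F;\phi + \delta\phi)\,\delta Q_\le\, Q_x(E \le E_F;\phi + \delta\phi)\tilde\psi_m\right\rangle,
\end{aligned} \tag{B.18}$$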
where the vectors $\tilde\psi_m$ are the energy eigenvectors of the single electron Hamiltonian with $\phi + \delta\phi$, and we have written $\delta Q_\le = Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)$. Immediately,

$$|\operatorname{Tr}[\delta Q_\le\, Q_x(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)]| \le 2N\|Q_x(E \le E_F;\phi + \delta\phi)\|\,\|Q_y(E \le E_F;\phi + \delta\phi)\|\,\|\delta Q_\le\|, \tag{B.19}$$

where $N$ is the number of electrons. Here the operators $Q_i(E \le E_F;\phi + \delta\phi)$ are bounded because of Lemma B.1 and (B.12). Combining these observations with Lemma B.3, one gets

$$|\operatorname{Tr}[\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\}Q_x(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)]| \le CN\max_{i=x,y}|\delta\phi_i| \tag{B.20}$$

with a positive constant $C$. In a similar way, one has

$$\begin{aligned}
&|\operatorname{Tr}[Q(E \le E_F;\phi)\{Q_x(E \le E_F;\phi + \delta\phi) - Q_x(E \le E_F;\phi)\}Q_y(E \le E_F;\phi + \delta\phi)]| \\
&\quad \le \sum_{m:E_m \le E_F}\left|\left\langle\psi_m, \{Q_x(E \le E_F;\phi + \delta\phi) - Q_x(E \le E_F;\phi)\}Q_y(E \le E_F;\phi + \delta\phi)\psi_m\right\rangle\right| \\
&\quad \le N\|Q_x(E \le E_F;\phi + \delta\phi) - Q_x(E \le E_F;\phi)\|\,\|Q_y(E \le E_F;\phi + \delta\phi)\| \\
&\quad \le CN\max_{i=x,y}|\delta\phi_i|,
\end{aligned} \tag{B.21}$$

where the vectors $\psi_m$ are the energy eigenvectors of the single electron Hamiltonian with $\phi$, and $C$ is a positive constant, and we have used Lemma B.4 for getting the last inequality. In the same way,

$$|\operatorname{Tr}[Q(E \le E_F;\phi)Q_x(E \le E_F;\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}]| \le CN\max_{i=x,y}|\delta\phi_i| \tag{B.22}$$
with a positive constant $C$. Combining these three inequalities with (B.16), one can obtain the desired bound (6.35) in the case with $i = x$, $j = y$.

Appendix C. Proof of Proposition 6.4

Since $\operatorname{Tr} Q_y(E \le E_F;\phi) = 0$, it is sufficient to show

$$\left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi) - \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi)\right| \le CN\max_{i=x,y}|\delta\phi_i| \tag{C.1}$$

with some positive constant $C$. Using the identity (B.17), one has

$$\begin{aligned}
&\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi) - \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi) \\
&\quad = \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\{Q_y(E \le E_F;\phi + \delta\phi)Q(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)Q(E \le E_F;\phi)\} \\
&\qquad + \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\{Q(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)Q_y(E \le E_F;\phi)\}.
\end{aligned} \tag{C.2}$$
We begin with estimating the first term on the right-hand side. Note that

$$\begin{aligned}
[\text{1st term on r.h.s. of (C.2)}]
&= \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}Q(E \le E_F;\phi + \delta\phi) \\
&\quad + \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi)Q(E \le E_F;\phi)\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\} \\
&\quad + \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q(E \le E_F;\phi)Q_y(E \le E_F;\phi)\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\},
\end{aligned} \tag{C.3}$$

where we have used the identity (B.17) again. This first term on the right-hand side is rewritten as

$$\begin{aligned}
&\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}Q(E \le E_F;\phi + \delta\phi) \\
&\quad = \sum_{m:E_m \le E_F}\left\langle\tilde\psi_m, v_s^{(0)}(\phi + \delta\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}\tilde\psi_m\right\rangle,
\end{aligned} \tag{C.4}$$

where the vectors $\tilde\psi_m$ are the energy eigenvectors of the single-electron Hamiltonian with the gauge parameters $\phi + \delta\phi$. From Lemma B.1, (B.13) and (B.15), one has

$$\left\|v_s^{(0)}(\phi + \delta\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}\right\| \le C\max_{i=x,y}|\delta\phi_i|, \tag{C.5}$$

where the positive constant $C$ is independent of the number $N$ of the electrons and the system sizes $L_x$, $L_y$. Using this bound for (C.4), one obtains

$$\left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}Q(E \le E_F;\phi + \delta\phi)\right| \le CN\max_{i=x,y}|\delta\phi_i|. \tag{C.6}$$

The second term on the right-hand side of (C.3) is rewritten as

$$\begin{aligned}
&\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi)Q(E \le E_F;\phi)\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\} \\
&\quad = \sum_{m:E_m \le E_F}\left\langle\psi_m, \delta Q_\le\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi)\psi_m\right\rangle,
\end{aligned} \tag{C.7}$$

where we have written $\delta Q_\le = Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)$, and $\psi_m$ are the energy eigenvectors of the single-electron Hamiltonian with the gauge parameters $\phi$. From Lemma B.1, the operator $v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi)$ is bounded, and the difference $Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)$ was already estimated in Lemma B.3. Therefore

$$\left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q_y(E \le E_F;\phi)Q(E \le E_F;\phi)\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\}\right| \le CN\max_{i=x,y}|\delta\phi_i|, \tag{C.8}$$
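where the positive constant $C$ is independent of the number $N$ of the electrons and the system sizes $L_x$, $L_y$. Similarly the third term on the right-hand side of (C.3) is evaluated as

$$\begin{aligned}
&\left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q(E \le E_F;\phi)Q_y(E \le E_F;\phi)\{Q(E \le E_F;\phi + \delta\phi) - Q(E \le E_F;\phi)\}\right| \\
&\quad \le \sum_{m:E_m \le E_F}\left|\left\langle\psi_m, Q_y(E \le E_F;\phi)\,\delta Q_\le\, v_s^{(0)}(\phi + \delta\phi)\psi_m\right\rangle\right| \\
&\quad \le \|Q_y(E \le E_F;\phi)\|\,\|\delta Q_\le\| \times \left\{\frac{N}{m_e}|\delta\phi_s| + \sum_{m:E_m \le E_F}\sqrt{\left\langle\psi_m, [v_s^{(0)}(\phi)]^2\psi_m\right\rangle}\right\} \\
&\quad \le C\max_{i=x,y}|\delta\phi_i|\left[\frac{N}{m_e}|\delta\phi_s| + N\sqrt{\frac{2}{m_e}(E_F + \|W\|_\infty)}\right].
\end{aligned} \tag{C.9}$$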
Thus all of the terms on the right-hand side of (C.3) have been evaluated. Next consider the second term on the right-hand side of (C.2). It can be rewritten as

$$\begin{aligned}
[\text{2nd term on r.h.s. of (C.2)}]
&= \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\,\delta Q_\le\, Q_y(E \le E_F;\phi + \delta\phi)Q(E \le E_F;\phi + \delta\phi) \\
&\quad + \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\,\delta Q_\le\, Q(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi) \\
&\quad + \operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q(E \le E_F;\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\},
\end{aligned} \tag{C.10}$$

where we have used the identity (B.17). From Lemma B.1 and (B.6), one has

$$\left\|v_s^{(0)}(\phi + \delta\phi)\,\delta Q_\le\right\| \le C\max_{i=x,y}|\delta\phi_i|, \tag{C.11}$$

where the positive constant $C$ is independent of the number $N$ of the electrons and the system sizes $L_x$, $L_y$. Hence, in the same way as above, the first and the second terms on the right-hand side of (C.10) are evaluated as

$$\begin{aligned}
&\left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\,\delta Q_\le\, Q_y(E \le E_F;\phi + \delta\phi)Q(E \le E_F;\phi + \delta\phi)\right| \\
&\quad + \left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)\,\delta Q_\le\, Q(E \le E_F;\phi + \delta\phi)Q_y(E \le E_F;\phi + \delta\phi)\right| \le CN\max_{i=x,y}|\delta\phi_i|
\end{aligned} \tag{C.12}$$

with some positive constant $C$. By using Lemma B.4, the third term on the right-hand side of (C.10) is evaluated as

$$\left|\operatorname{Tr}\, v_s^{(0)}(\phi + \delta\phi)Q(E \le E_F;\phi)\{Q_y(E \le E_F;\phi + \delta\phi) - Q_y(E \le E_F;\phi)\}\right| \le CN\max_{i=x,y}|\delta\phi_i| \tag{C.13}$$
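in the same way as in (C.9). Combining (C.2), (C.3), (C.6), (C.8), (C.9), (C.10), (C.12) and (C.13), the desired bound (C.1) is obtained.

Appendix D. Proofs of Theorems 7.2 and 7.5

In order to give the proofs of Theorems 7.2 and 7.5, we first recall the Hall conductance $\sigma_{xy}(\phi)$ of (3.37) which can be expressed as

$$\sigma_{xy}(\phi) = \frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}v_{\mathrm{tot},x}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle - \text{c.c.}\right]. \tag{D.1}$$

Using the Schwarz inequality, the matrix element is estimated as

$$\begin{aligned}
&\left|\sum_{\mu=1}^{q}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}v_{\mathrm{tot},x}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right| \\
&\quad \le \frac{1}{(\Delta E)^2}\prod_{s=x,y}\sqrt{\sum_{\mu=1}^{q}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},s}^{(0)}(\phi)[1 - Q(E_0^{(N)}(\phi))]v_{\mathrm{tot},s}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle}.
\end{aligned} \tag{D.2}$$

Because of the factor $1/(L_xL_y)$ on the right-hand side of (D.1), it is sufficient to show that this right-hand side is of order $N$. But a simple estimate yields order $N^2$ because of the two total velocity operators in the matrix elements. In order to reduce the order $N^2$ to $N$, we use the method introduced in [51]. See also [31, 52]. Let $A$ be a symmetric operator. In the same way as in (4.23), one has formally

$$\begin{aligned}
\sum_{\mu=1}^{q}\left\langle A\Phi_{0,\mu}^{(N)}(\phi), [1 - Q(E_0^{(N)}(\phi))]A\Phi_{0,\mu}^{(N)}(\phi)\right\rangle
&\le \frac{1}{\Delta E}\sum_{\mu=1}^{q}\left\langle A\Phi_{0,\mu}^{(N)}(\phi), [H_0^{(N)}(\phi) - E_{0,\mu}^{(N)}(\phi)]A\Phi_{0,\mu}^{(N)}(\phi)\right\rangle \\
&= \frac{1}{2\Delta E}\sum_{\mu=1}^{q}\left\langle\Phi_{0,\mu}^{(N)}(\phi), [A, [H_0^{(N)}(\phi), A]]\Phi_{0,\mu}^{(N)}(\phi)\right\rangle.
\end{aligned} \tag{D.3}$$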
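The chain (D.3) can be made concrete in a finite-dimensional toy example. The sketch below assumes a hypothetical three-level Hamiltonian that is already diagonal, with a non-degenerate ground state ($q = 1$) and spectral gap $\Delta E = E_1 - E_0$, together with an arbitrary real symmetric matrix `A`; none of these numbers come from the paper. It evaluates the three quantities in (D.3) for the ground state, namely the weight of $A\Phi_0$ outside the ground state, the energy-weighted sum divided by $\Delta E$, and the double-commutator expectation divided by $2\Delta E$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Hypothetical spectrum: ground state energy 0, gap 1.5 to the first excited level.
ENERGIES = [0.0, 1.5, 4.0]
H = [[ENERGIES[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

# Hypothetical symmetric observable.
A = [[0.3, 1.0, -0.5],
     [1.0, -0.2, 0.7],
     [-0.5, 0.7, 0.1]]

def d3_quantities():
    """Return the three expressions of (D.3) for the ground state e_0."""
    gap = ENERGIES[1] - ENERGIES[0]
    # <A e0, (1 - Q) A e0>: components of A e0 outside the ground state.
    left = sum(A[n][0] ** 2 for n in (1, 2))
    # (1 / gap) <A e0, (H - E0) A e0>.
    middle = sum((ENERGIES[n] - ENERGIES[0]) * A[n][0] ** 2 for n in (1, 2)) / gap
    # (1 / (2 gap)) <e0, [A, [H, A]] e0>.
    double_comm = commutator(A, commutator(H, A))
    right = double_comm[0][0] / (2.0 * gap)
    return left, middle, right

if __name__ == "__main__":
    print(d3_quantities())
```

The point of the last form is that the double commutator of a sum of one-body operators with the Hamiltonian is again a sum of at most $N$ local terms, which is what reduces the naive order $N^2$ to order $N$.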
We stress that this formal inequality can be justified in the same way as in Sec. 4. From the two inequalities (D.2) and (D.3), it is sufficient to show that the expectation values of the double commutators $[v_{\mathrm{tot},s}^{(0)}(\phi), [H_0^{(N)}(\phi), v_{\mathrm{tot},s}^{(0)}(\phi)]]$ for the ground state are of order $N$. Let us calculate those double commutators. Note that

$$\left[v_{\mathrm{tot},x}^{(0)}(\phi), H_0^{(N)}(\phi)\right] = -\frac{i\hbar eB}{m_e}v_{\mathrm{tot},y}^{(0)}(\phi) - \frac{i\hbar eB}{m_e}I_y(\phi), \tag{D.4}$$

and

$$\left[v_{\mathrm{tot},y}^{(0)}(\phi), H_0^{(N)}(\phi)\right] = \frac{i\hbar eB}{m_e}v_{\mathrm{tot},x}^{(0)}(\phi) - \frac{i\hbar eB}{m_e}I_x(\phi), \tag{D.5}$$

where

$$I_x(\phi) = \frac{1}{eB}\sum_{j=1}^{N}\frac{\partial W}{\partial y}(r_j) - \frac{1}{2B}\sum_{j=1}^{N}\left[B_{P,z}(r_j)v_{x,j}^{(0)}(\phi) + v_{x,j}^{(0)}(\phi)B_{P,z}(r_j)\right] \tag{D.6}$$

and

$$I_y(\phi) = \frac{1}{eB}\sum_{j=1}^{N}\frac{\partial W}{\partial x}(r_j) + \frac{1}{2B}\sum_{j=1}^{N}\left[B_{P,z}(r_j)v_{y,j}^{(0)}(\phi) + v_{y,j}^{(0)}(\phi)B_{P,z}(r_j)\right] \tag{D.7}$$

with the velocity operator,

$$v_j^{(0)}(\phi) = \frac{1}{m_e}[p_j + eA(r_j) + \phi]. \tag{D.8}$$

Using the commutation relation (D.4), we have

$$\left[v_{\mathrm{tot},x}^{(0)}(\phi), [H_0^{(N)}(\phi), v_{\mathrm{tot},x}^{(0)}(\phi)]\right] = \frac{i\hbar eB}{m_e}\left[v_{\mathrm{tot},x}^{(0)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\right] + \frac{i\hbar eB}{m_e}\left[v_{\mathrm{tot},x}^{(0)}(\phi), I_y(\phi)\right]. \tag{D.9}$$
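The commutator of the first term on the right-hand side is calculated as

$$\left[v_{\mathrm{tot},x}^{(0)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\right] = -\frac{i\hbar eB}{m_e^2}N - \frac{i\hbar e}{m_e^2}\sum_{j=1}^{N}B_{P,z}(r_j), \tag{D.10}$$

and the commutator of the second term becomes

$$\begin{aligned}
\left[v_{\mathrm{tot},x}^{(0)}(\phi), I_y(\phi)\right]
&= -\frac{i\hbar}{m_eeB}\sum_{j=1}^{N}\frac{\partial^2W}{\partial x^2}(r_j) - \frac{i\hbar e}{m_e^2B}\sum_{j=1}^{N}[B_{P,z}(r_j)]^2 - \frac{i\hbar e}{m_e^2}\sum_{j=1}^{N}B_{P,z}(r_j) \\
&\quad - \frac{i\hbar}{2m_eB}\sum_{j=1}^{N}\left[v_{y,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial x}(r_j) + \frac{\partial B_{P,z}}{\partial x}(r_j)v_{y,j}^{(0)}(\phi)\right].
\end{aligned} \tag{D.11}$$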
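Substituting these into the right-hand side of (D.9), we have

$$\begin{aligned}
\left[v_{\mathrm{tot},x}^{(0)}(\phi), [H_0^{(N)}(\phi), v_{\mathrm{tot},x}^{(0)}(\phi)]\right]
&= \frac{\hbar^2e^2B^2}{m_e^3}N + \frac{\hbar^2}{m_e^2}\sum_{j=1}^{N}\frac{\partial^2W}{\partial x^2}(r_j) + \frac{\hbar^2e^2}{m_e^3}\sum_{j=1}^{N}[B_{P,z}(r_j)]^2 + \frac{2\hbar^2e^2B}{m_e^3}\sum_{j=1}^{N}B_{P,z}(r_j) \\
&\quad + \frac{\hbar^2e}{2m_e^2}\sum_{j=1}^{N}\left[v_{y,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial x}(r_j) + \frac{\partial B_{P,z}}{\partial x}(r_j)v_{y,j}^{(0)}(\phi)\right].
\end{aligned} \tag{D.12}$$

Similarly we have

$$\begin{aligned}
\left[v_{\mathrm{tot},y}^{(0)}(\phi), [H_0^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)]\right]
&= \frac{\hbar^2e^2B^2}{m_e^3}N + \frac{\hbar^2}{m_e^2}\sum_{j=1}^{N}\frac{\partial^2W}{\partial y^2}(r_j) + \frac{\hbar^2e^2}{m_e^3}\sum_{j=1}^{N}[B_{P,z}(r_j)]^2 + \frac{2\hbar^2e^2B}{m_e^3}\sum_{j=1}^{N}B_{P,z}(r_j) \\
&\quad - \frac{\hbar^2e}{2m_e^2}\sum_{j=1}^{N}\left[v_{x,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial y}(r_j) + \frac{\partial B_{P,z}}{\partial y}(r_j)v_{x,j}^{(0)}(\phi)\right]
\end{aligned} \tag{D.13}$$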
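by using the commutation relations (D.5) and (D.10). From the expressions (D.12) and (D.13) for the double commutators, it is sufficient to evaluate the ground state expectation values of the last sums on both the right-hand sides. Since all the terms in the summands can be treated in the same way, we consider only the second term in the summand of the last sum in (D.13). Using the Schwarz inequality, we have

$$\begin{aligned}
\left|\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), \frac{\partial B_{P,z}}{\partial y}(r_j)v_{x,j}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right|
&\le \left[\sum_{i=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), \left(\frac{\partial B_{P,z}}{\partial y}(r_i)\right)^2\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right]^{1/2} \\
&\quad \times \left[\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), [v_{x,j}^{(0)}(\phi)]^2\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right]^{1/2} \\
&\le \left(\frac{2}{m_e}\right)^{1/2}\left\|\frac{\partial B_{P,z}}{\partial y}\right\|_\infty N^{1/2}\left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), H_0^{(N)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle + N\|W\|_\infty\right]^{1/2} \\
&= \left(\frac{2}{m_e}\right)^{1/2}\left\|\frac{\partial B_{P,z}}{\partial y}\right\|_\infty N^{1/2}\left[E_{0,\mu}^{(N)}(\phi) + N\|W\|_\infty\right]^{1/2}.
\end{aligned} \tag{D.14}$$

To get the second inequality, we have used the inequality,

$$\frac{m_e}{2}\sum_{j=1}^{N}[v_{x,j}^{(0)}(\phi)]^2 \le H_0^{(N)}(\phi) + N\|W\|_\infty \tag{D.15}$$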
which follows from the assumption $W^{(2)} \ge 0$. By relying on the decay assumption (7.8) for the interaction $W^{(2)}$, we can prove [31] that the energy eigenvalue $E_{0,\mu}^{(N)}(\phi)$ is of order $N$. See Appendix F for the details. Consequently the right-hand side of the last line in (D.14) is of order $N$.

Appendix E. Proofs of Theorems 7.6, 7.8 and 7.12

We begin with rewriting the Hall conductance $\sigma_{xy}(\phi)$ of (D.1). Using the commutation relation (D.5), the summand in (D.1) can be written as

$$\begin{aligned}
&\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}v_{\mathrm{tot},x}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle - \text{c.c.} \\
&\quad = -\frac{2im_e}{\hbar eB}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)}v_{\mathrm{tot},y}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle \\
&\qquad + \left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}I_x(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle - \text{c.c.}\right]
\end{aligned} \tag{E.1}$$

in the same way as in (4.21), where c.c. stands for the complex conjugate.

First we prove Theorem 7.6 by using the above expression (E.1). We assume $A_P = 0$, i.e. $B_{P,z} = 0$. We treat only the case where the electrostatic potential $W$ is a function of the single variable $x$ only, because the other case, where $W$ is a function of the single variable $y$ only, can be treated in the same way. From the assumptions and the expression (D.6) of $I_x(\phi)$, one has $I_x(\phi) = 0$. Further we have

$$2\sum_{\mu=1}^{q}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)}v_{\mathrm{tot},y}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle = \frac{\partial}{\partial\phi_y}\operatorname{Tr}\left[v_{\mathrm{tot},y}^{(0)}(\phi)Q(E_0^{(N)}(\phi))\right] - \frac{Nq}{m_e} \tag{E.2}$$
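in the same way as in Sec. 5. Combining these observations, (D.1) and (E.1), we obtain the desired result,

$$\sigma_{xy}(\phi) = -\frac{e^2}{h}\frac{N}{M} = -\frac{e^2}{h}\nu. \tag{E.3}$$

Now let us return to the general setting. We rewrite the right-hand side of (E.1) further. In the same way, we have

$$\begin{aligned}
&\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},y}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}I_x(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle - \text{c.c.} \\
&\quad = -\frac{im_e}{\hbar eB}\left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},x}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)}I_x(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle + \text{c.c.}\right] \\
&\qquad + \left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), I_x(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}I_y(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle - \text{c.c.}\right],
\end{aligned} \tag{E.4}$$

and

$$\begin{aligned}
&\sum_{\mu=1}^{q}\left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{\mathrm{tot},x}^{(0)}(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)}I_x(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle + \text{c.c.}\right] \\
&\quad = \frac{\partial}{\partial\phi_x}\operatorname{Tr}\left[I_x(\phi)Q(E_0^{(N)}(\phi))\right] + \frac{1}{m_eB}\sum_{j=1}^{N}\operatorname{Tr}\left[B_{P,z}(r_j)Q(E_0^{(N)}(\phi))\right].
\end{aligned} \tag{E.5}$$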
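By combining (D.1), (E.1), (E.2), (E.4) and (E.5), the averaged Hall conductance can be written as

$$\sigma_{xy}(\phi) = -\frac{e^2}{h}\frac{N}{M} + \frac{e^2}{h}\frac{1}{MBq}\sum_{j=1}^{N}\operatorname{Tr}\left[B_{P,z}(r_j)Q(E_0^{(N)}(\phi))\right] + \Delta\sigma_{xy}(\phi), \tag{E.6}$$

where

$$\Delta\sigma_{xy}(\phi) = \frac{i\hbar e^2}{L_xL_y}\frac{1}{q}\sum_{\mu=1}^{q}\left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), I_x(\phi)\frac{1 - Q(E_0^{(N)}(\phi))}{[E_{0,\mu}^{(N)}(\phi) - H_0^{(N)}(\phi)]^2}I_y(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle - \text{c.c.}\right]. \tag{E.7}$$

The sum on the right-hand side of (E.6) is easily evaluated as

$$\left|\frac{e^2}{h}\frac{1}{MBq}\sum_{j=1}^{N}\operatorname{Tr}\left[B_{P,z}(r_j)Q(E_0^{(N)}(\phi))\right]\right| \le \frac{e^2}{h}\frac{N}{M}\frac{\|B_{P,z}\|_\infty}{B}. \tag{E.8}$$

Next we estimate the right-hand side of (E.7). In the same way as in Appendix D, we have

$$|\Delta\sigma_{xy}(\phi)| \le \left(\frac{\hbar\omega_c}{\Delta E}\right)^3\frac{e^2}{h}\frac{N}{M}\,\delta'(\phi) \tag{E.9}$$

with

$$\delta'(\phi) = \frac{m_e}{(\hbar\omega_c)^2N}\prod_{s=x,y}\sqrt{\frac{1}{q}\sum_{\mu=1}^{q}\left\langle\Phi_{0,\mu}^{(N)}(\phi), \left[I_s(\phi), [H_0^{(N)}(\phi), I_s(\phi)]\right]\Phi_{0,\mu}^{(N)}(\phi)\right\rangle}. \tag{E.10}$$

In the rest of this appendix, we will show

$$\delta'(\phi) \le \tilde\delta', \tag{E.11}$$

where $\tilde\delta'$ is independent of $\phi$, the number $N$ of the electrons and the sizes $L_x$, $L_y$ of the system. Moreover, $\tilde\delta'$ is a continuous function of the norms $\|D^{(m,n)}B_{P,z}\|_\infty$, $\|D^{(m,n)}W\|_\infty$ and satisfies $\tilde\delta' = 0$ at the special point with $A_P = 0$ and $W = 0$. Therefore $\tilde\delta'$ becomes small for weak potentials $A_P$ and $W$. Combining (E.6), (E.8), (E.9) and (E.11), we have the desired result,

$$-\frac{e^2}{h}\nu(1 + \delta) \le \sigma_{xy}(\phi) \le -\frac{e^2}{h}\nu(1 - \delta), \tag{E.12}$$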
with

$$\delta = \frac{\|B_{P,z}\|_\infty}{B} + \left(\frac{\hbar\omega_c}{\Delta E}\right)^3\tilde\delta'. \tag{E.13}$$
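To see how the two contributions in (E.13) enter the window (E.12), the following Python sketch evaluates $\delta$ and the resulting lower and upper bounds in units of $e^2/h$. The function name `hall_conductance_window` and all input numbers are hypothetical and serve only as an illustration; in particular the values used for $\|B_{P,z}\|_\infty/B$, $\hbar\omega_c/\Delta E$ and $\tilde\delta'$ are not taken from the paper.

```python
def hall_conductance_window(nu, bp_over_b, hw_over_gap, delta_tilde):
    """Evaluate (E.13) and the bounds (E.12) in units of e^2 / h.

    nu          -- filling factor N / M
    bp_over_b   -- ratio ||B_{P,z}||_inf / B
    hw_over_gap -- ratio (hbar * omega_c) / (Delta E)
    delta_tilde -- the constant written as delta-tilde-prime in (E.11)
    """
    delta = bp_over_b + hw_over_gap ** 3 * delta_tilde
    lower = -nu * (1.0 + delta)
    upper = -nu * (1.0 - delta)
    return delta, lower, upper

if __name__ == "__main__":
    # Hypothetical inputs: filling 1/3, a 1% magnetic perturbation, a gap
    # equal to half the cyclotron energy, and delta_tilde = 0.001.
    print(hall_conductance_window(1.0 / 3.0, 0.01, 2.0, 0.001))
```

Note the cubic dependence on $\hbar\omega_c/\Delta E$: for a fixed perturbation strength, a smaller spectral gap widens the window quickly.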
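In passing, we remark that we can obtain a bound similar to (E.12) for the non-averaged Hall conductance $\sigma_{xy}(\phi)$ by using (D.4) and (D.10) for the first term on the right-hand side of (E.1).

In order to prove the bound (E.11), it is sufficient to estimate the ground state expectation values of the double commutators $[I_s(\phi), [H_0^{(N)}(\phi), I_s(\phi)]]$ in (E.10). To this end, let us calculate the double commutators. In the following, we will consider only the case with $s = x$ because we can treat the case with $s = y$ exactly in the same way. Note that

$$\begin{aligned}
[H_0^{(N)}(\phi), I_x(\phi)]
&= \frac{m_e}{2eB}\sum_{j=1}^{N}\left[[v_{x,j}^{(0)}(\phi)]^2 + [v_{y,j}^{(0)}(\phi)]^2, \frac{\partial W}{\partial y}(r_j)\right] \\
&\quad - \frac{m_e}{4B}\sum_{j=1}^{N}\left[[v_{x,j}^{(0)}(\phi)]^2 + [v_{y,j}^{(0)}(\phi)]^2, B_{P,z}(r_j)v_{x,j}^{(0)}(\phi) + v_{x,j}^{(0)}(\phi)B_{P,z}(r_j)\right] \\
&\quad - \frac{1}{2B}\sum_{j=1}^{N}\left[W(r_j), B_{P,z}(r_j)v_{x,j}^{(0)}(\phi) + v_{x,j}^{(0)}(\phi)B_{P,z}(r_j)\right] \\
&\quad - \frac{1}{2B}\sum_{i,j}\sum_{\ell=1}^{N}\left[W^{(2)}(r_i - r_j), B_{P,z}(r_\ell)v_{x,\ell}^{(0)}(\phi) + v_{x,\ell}^{(0)}(\phi)B_{P,z}(r_\ell)\right] \\
&= \sum_{s=1}^{3}J_x^{(0,s)} + J_x^{(x)}(\phi) + \sum_{s=1}^{3}J_x^{(y,s)}(\phi) + J_x^{(xx)}(\phi) + J_x^{(xy)}(\phi),
\end{aligned} \tag{E.14}$$

where the operators in the last line are given by

$$J_x^{(0,1)} = -\frac{i\hbar}{m_eB}\sum_{j=1}^{N}B_{P,z}(r_j)\frac{\partial W}{\partial x}(r_j), \tag{E.15}$$

$$J_x^{(0,2)} = -\frac{i\hbar^3}{4m_e^2B}\sum_{j=1}^{N}\left[\frac{\partial^3B_{P,z}}{\partial x^3}(r_j) + \frac{\partial^3B_{P,z}}{\partial x\partial y^2}(r_j)\right], \tag{E.16}$$

$$J_x^{(0,3)} = -\frac{i\hbar}{m_eB}\sum_{i,j}[B_{P,z}(r_i) - B_{P,z}(r_j)]\frac{\partial W^{(2)}}{\partial x}(r_i - r_j), \tag{E.17}$$

$$J_x^{(x)}(\phi) = -\frac{i\hbar}{2eB}\sum_{j=1}^{N}\left[v_{x,j}^{(0)}(\phi)\frac{\partial^2W}{\partial x\partial y}(r_j) + \frac{\partial^2W}{\partial x\partial y}(r_j)v_{x,j}^{(0)}(\phi)\right], \tag{E.18}$$

$$J_x^{(y,1)}(\phi) = -\frac{i\hbar}{2eB}\sum_{j=1}^{N}\left[v_{y,j}^{(0)}(\phi)\frac{\partial^2W}{\partial y^2}(r_j) + \frac{\partial^2W}{\partial y^2}(r_j)v_{y,j}^{(0)}(\phi)\right], \tag{E.19}$$

$$J_x^{(y,2)}(\phi) = -\frac{i\hbar e}{2m_e}\sum_{j=1}^{N}\left[v_{y,j}^{(0)}(\phi)B_{P,z}(r_j) + B_{P,z}(r_j)v_{y,j}^{(0)}(\phi)\right], \tag{E.20}$$

$$J_x^{(y,3)}(\phi) = -\frac{i\hbar e}{2m_eB}\sum_{j=1}^{N}\left[v_{y,j}^{(0)}(\phi)[B_{P,z}(r_j)]^2 + [B_{P,z}(r_j)]^2v_{y,j}^{(0)}(\phi)\right], \tag{E.21}$$

$$J_x^{(xx)}(\phi) = \frac{i\hbar}{B}\sum_{j=1}^{N}v_{x,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial x}(r_j)v_{x,j}^{(0)}(\phi), \tag{E.22}$$

and

$$J_x^{(xy)}(\phi) = \frac{i\hbar}{2B}\sum_{j=1}^{N}\left[v_{x,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial y}(r_j)v_{y,j}^{(0)}(\phi) + v_{y,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial y}(r_j)v_{x,j}^{(0)}(\phi)\right]. \tag{E.23}$$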
Hence the double commutator becomes

$$\begin{aligned}
\left[I_x(\phi), [H_0^{(N)}(\phi), I_x(\phi)]\right]
&= \sum_{s=1}^{3}[I_x(\phi), J_x^{(0,s)}] + [I_x(\phi), J_x^{(x)}(\phi)] + \sum_{s=1}^{3}[I_x(\phi), J_x^{(y,s)}(\phi)] \\
&\quad + [I_x(\phi), J_x^{(xx)}(\phi)] + [I_x(\phi), J_x^{(xy)}(\phi)].
\end{aligned} \tag{E.24}$$

In order to prove the boundedness of $\delta'(\phi)$ of (E.10), we shall show that the ground state expectation values for all the commutators on this right-hand side are of order $N$.

The three commutators in the first sum on the right-hand side of (E.24) are calculated as

$$[I_x(\phi), J_x^{(0,1)}] = \frac{\hbar^2}{m_e^2B^2}\sum_{j=1}^{N}\left\{[B_{P,z}(r_j)]^2\frac{\partial^2W}{\partial x^2}(r_j) + B_{P,z}(r_j)\frac{\partial B_{P,z}}{\partial x}(r_j)\frac{\partial W}{\partial x}(r_j)\right\}, \tag{E.25}$$

$$[I_x(\phi), J_x^{(0,2)}] = \frac{\hbar^4}{4m_e^3B^2}\sum_{j=1}^{N}B_{P,z}(r_j)\left[\frac{\partial^4B_{P,z}}{\partial x^4}(r_j) + \frac{\partial^4B_{P,z}}{\partial x^2\partial y^2}(r_j)\right], \tag{E.26}$$

and

$$\begin{aligned}
[I_x(\phi), J_x^{(0,3)}]
&= \frac{\hbar^2}{m_e^2B^2}\sum_{i,j}[B_{P,z}(r_i) - B_{P,z}(r_j)]^2\frac{\partial^2W^{(2)}}{\partial x^2}(r_i - r_j) \\
&\quad + \frac{\hbar^2}{m_e^2B^2}\sum_{i,j}\left[B_{P,z}(r_i)\frac{\partial B_{P,z}}{\partial x}(r_i) - B_{P,z}(r_j)\frac{\partial B_{P,z}}{\partial x}(r_j)\right]\frac{\partial W^{(2)}}{\partial x}(r_i - r_j).
\end{aligned} \tag{E.27}$$
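Clearly the first two are of order $N$ from the assumptions. The ground state expectation value of the last one is evaluated as

$$\begin{aligned}
\left|\left\langle\Phi_{0,\mu}^{(N)}(\phi), [I_x(\phi), J_x^{(0,3)}]\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right|
&\le \frac{4\hbar^2}{m_e^2}\left(\frac{\|B_{P,z}\|_\infty}{B}\right)^2\sum_{i,j}\left\langle\Phi_{0,\mu}^{(N)}(\phi), \left|\frac{\partial^2W^{(2)}}{\partial x^2}(r_i - r_j)\right|\Phi_{0,\mu}^{(N)}(\phi)\right\rangle \\
&\quad + \frac{2\hbar^2}{m_e^2B}\frac{\|B_{P,z}\|_\infty}{B}\left\|\frac{\partial B_{P,z}}{\partial x}\right\|_\infty\sum_{i,j}\left\langle\Phi_{0,\mu}^{(N)}(\phi), \left|\frac{\partial W^{(2)}}{\partial x}(r_i - r_j)\right|\Phi_{0,\mu}^{(N)}(\phi)\right\rangle \\
&\le \frac{4\alpha_{\mathrm{int}}\hbar\omega_c}{m_e}\frac{\|B_{P,z}\|_\infty}{B}\left[\frac{\|B_{P,z}\|_\infty}{B} + \frac{\ell_B}{2B}\left\|\frac{\partial B_{P,z}}{\partial x}\right\|_\infty\right] \\
&\quad \times \sum_{i,j}\left\langle\Phi_{0,\mu}^{(N)}(\phi), W^{(2)}(r_i - r_j)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle,
\end{aligned} \tag{E.28}$$

where we have used the assumptions (7.18) and (7.19). Further, by using the inequality

$$\sum_{i,j}\left\langle\Phi_{0,\mu}^{(N)}(\phi), W^{(2)}(r_i - r_j)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle \le \left\langle\Phi_{0,\mu}^{(N)}(\phi), H_0^{(N)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle + N\|W\|_\infty = E_{0,\mu}^{(N)}(\phi) + N\|W\|_\infty, \tag{E.29}$$

we have

$$\left|\left\langle\Phi_{0,\mu}^{(N)}(\phi), [I_x(\phi), J_x^{(0,3)}]\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right| \le \frac{4\alpha_{\mathrm{int}}\hbar\omega_c}{m_e}\frac{\|B_{P,z}\|_\infty}{B}\left[\frac{\|B_{P,z}\|_\infty}{B} + \frac{\ell_B}{2B}\left\|\frac{\partial B_{P,z}}{\partial x}\right\|_\infty\right]\left[E_{0,\mu}^{(N)}(\phi) + N\|W\|_\infty\right]. \tag{E.30}$$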
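Since the ground state energy $E_{0,\mu}^{(N)}(\phi)$ is of order $N$ as shown in Appendix F, this right-hand side is of order $N$. We remark that, when $A_P = 0$, we do not need the assumptions (7.18) and (7.19) because all the operators $J_x^{(0,s)}$ are vanishing.

Note that

$$\begin{aligned}
[I_x(\phi), J_x^{(x)}(\phi)]
&= \frac{\hbar^2}{m_ee^2B^2}\sum_{j=1}^{N}\left[\frac{\partial^2W}{\partial x\partial y}(r_j)\right]^2 + \frac{\hbar^2}{2m_eeB^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), B_{P,z}(r_j)\frac{\partial^3W}{\partial x^2\partial y}(r_j)\right\} \\
&\quad - \frac{\hbar^2}{2m_eeB^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), \frac{\partial B_{P,z}}{\partial x}(r_j)\frac{\partial^2W}{\partial x\partial y}(r_j)\right\},
\end{aligned} \tag{E.31}$$

$$\begin{aligned}
[I_x(\phi), J_x^{(y,1)}(\phi)]
&= \frac{\hbar^2}{m_ee^2B^2}\sum_{j=1}^{N}\left[\frac{\partial^2W}{\partial y^2}(r_j)\right]^2 + \frac{\hbar^2}{2m_eeB^2}\sum_{j=1}^{N}\left\{v_{y,j}^{(0)}(\phi), B_{P,z}(r_j)\frac{\partial^3W}{\partial x\partial y^2}(r_j)\right\} \\
&\quad - \frac{\hbar^2}{2m_eeB^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), \frac{\partial B_{P,z}}{\partial y}(r_j)\frac{\partial^2W}{\partial y^2}(r_j)\right\} \\
&\quad + \frac{\hbar^2}{m_e^2B}\sum_{j=1}^{N}B_{P,z}(r_j)\frac{\partial^2W}{\partial y^2}(r_j) + \frac{\hbar^2}{m_e^2B^2}\sum_{j=1}^{N}[B_{P,z}(r_j)]^2\frac{\partial^2W}{\partial y^2}(r_j),
\end{aligned} \tag{E.32}$$

$$\begin{aligned}
[I_x(\phi), J_x^{(y,2)}(\phi)]
&= \frac{\hbar^2}{m_e^2B}\sum_{j=1}^{N}B_{P,z}(r_j)\frac{\partial^2W}{\partial y^2}(r_j) + \frac{\hbar^2e}{2m_e^2B}\sum_{j=1}^{N}\left\{v_{y,j}^{(0)}(\phi), B_{P,z}(r_j)\frac{\partial B_{P,z}}{\partial x}(r_j)\right\} \\
&\quad - \frac{\hbar^2e}{2m_e^2B}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), B_{P,z}(r_j)\frac{\partial B_{P,z}}{\partial y}(r_j)\right\} \\
&\quad + \frac{\hbar^2e^2}{m_e^3}\sum_{j=1}^{N}[B_{P,z}(r_j)]^2 + \frac{\hbar^2e^2}{m_e^3B}\sum_{j=1}^{N}[B_{P,z}(r_j)]^3,
\end{aligned} \tag{E.33}$$

and

$$\begin{aligned}
[I_x(\phi), J_x^{(y,3)}(\phi)]
&= \frac{\hbar^2}{m_e^2B^2}\sum_{j=1}^{N}[B_{P,z}(r_j)]^2\frac{\partial^2W}{\partial y^2}(r_j) + \frac{\hbar^2e}{m_e^2B^2}\sum_{j=1}^{N}\left\{v_{y,j}^{(0)}(\phi), [B_{P,z}(r_j)]^2\frac{\partial B_{P,z}}{\partial x}(r_j)\right\} \\
&\quad - \frac{\hbar^2e}{2m_e^2B^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), [B_{P,z}(r_j)]^2\frac{\partial B_{P,z}}{\partial y}(r_j)\right\} \\
&\quad + \frac{\hbar^2e^2}{m_e^3B}\sum_{j=1}^{N}[B_{P,z}(r_j)]^3 + \frac{\hbar^2e^2}{m_e^3B^2}\sum_{j=1}^{N}[B_{P,z}(r_j)]^4,
\end{aligned} \tag{E.34}$$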
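where $\{X, Y\} = XY + YX$ for operators $X$, $Y$. Since the terms including the velocity operators $v_{s,j}^{(0)}(\phi)$ on the right-hand sides can be estimated in the same way as in Appendix D, we can get the desired estimates of order $N$ for their ground state expectation values.

Note that

$$\begin{aligned}
[I_x(\phi), J_x^{(xx)}(\phi)]
&= -\frac{\hbar^2}{m_eeB^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), \frac{\partial B_{P,z}}{\partial x}(r_j)\frac{\partial^2W}{\partial x\partial y}(r_j)\right\} \\
&\quad + \frac{\hbar^2}{m_eB^2}\sum_{j=1}^{N}v_{x,j}^{(0)}(\phi)\left\{2\left[\frac{\partial B_{P,z}}{\partial x}(r_j)\right]^2 - B_{P,z}(r_j)\frac{\partial^2B_{P,z}}{\partial x^2}(r_j)\right\}v_{x,j}^{(0)}(\phi) \\
&\quad - \frac{\hbar^4}{2m_e^3B^2}\sum_{j=1}^{N}\left\{\left[\frac{\partial^2B_{P,z}}{\partial x^2}(r_j)\right]^2 + \frac{\partial B_{P,z}}{\partial x}(r_j)\frac{\partial^3B_{P,z}}{\partial x^3}(r_j)\right\}.
\end{aligned} \tag{E.35}$$

Clearly the ground state expectation values for the first and third sums can be evaluated in the same way as above. For the second sum, we have

$$\begin{aligned}
&\left|\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{x,j}^{(0)}(\phi)\left\{2\left[\frac{\partial B_{P,z}}{\partial x}(r_j)\right]^2 - B_{P,z}(r_j)\frac{\partial^2B_{P,z}}{\partial x^2}(r_j)\right\}v_{x,j}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right| \\
&\quad \le \left[2\left\|\frac{\partial B_{P,z}}{\partial x}\right\|_\infty^2 + \|B_{P,z}\|_\infty\left\|\frac{\partial^2B_{P,z}}{\partial x^2}\right\|_\infty\right]\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), [v_{x,j}^{(0)}(\phi)]^2\Phi_{0,\mu}^{(N)}(\phi)\right\rangle \\
&\quad \le \frac{2}{m_e}\left[2\left\|\frac{\partial B_{P,z}}{\partial x}\right\|_\infty^2 + \|B_{P,z}\|_\infty\left\|\frac{\partial^2B_{P,z}}{\partial x^2}\right\|_\infty\right]\left[\left\langle\Phi_{0,\mu}^{(N)}(\phi), H_0^{(N)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle + N\|W\|_\infty\right] \\
&\quad = \frac{2}{m_e}\left[2\left\|\frac{\partial B_{P,z}}{\partial x}\right\|_\infty^2 + \|B_{P,z}\|_\infty\left\|\frac{\partial^2B_{P,z}}{\partial x^2}\right\|_\infty\right]\left[E_{0,\mu}^{(N)}(\phi) + N\|W\|_\infty\right].
\end{aligned} \tag{E.36}$$

Thus the corresponding contribution is of order $N$.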
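Finally let us compute the ground state expectation value of $[I_x(\phi), J_x^{(xy)}(\phi)]$. In order to make this task easier, we decompose the operator $I_x(\phi)$ into two parts as

$$I_x(\phi) = I_x^{(1)} + I_x^{(2)}(\phi) \tag{E.37}$$

with

$$I_x^{(1)} = \frac{1}{eB}\sum_{j=1}^{N}\frac{\partial W}{\partial y}(r_j), \tag{E.38}$$

and

$$I_x^{(2)}(\phi) = -\frac{1}{2B}\sum_{j=1}^{N}\left[B_{P,z}(r_j)v_{x,j}^{(0)}(\phi) + v_{x,j}^{(0)}(\phi)B_{P,z}(r_j)\right]. \tag{E.39}$$

Note that

$$\begin{aligned}
[I_x^{(1)}, J_x^{(xy)}(\phi)]
&= -\frac{\hbar^2}{2m_eeB^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), \frac{\partial B_{P,z}}{\partial y}(r_j)\frac{\partial^2W}{\partial y^2}(r_j)\right\} \\
&\quad - \frac{\hbar^2}{2m_eeB^2}\sum_{j=1}^{N}\left\{v_{y,j}^{(0)}(\phi), \frac{\partial B_{P,z}}{\partial y}(r_j)\frac{\partial^2W}{\partial x\partial y}(r_j)\right\},
\end{aligned} \tag{E.40}$$

and

$$[I_x^{(2)}(\phi), J_x^{(xy)}(\phi)] = K_x^{(xx)}(\phi) + K_x^{(xy)}(\phi) + K_x^{(x)}(\phi) + K_x^{(0)} \tag{E.41}$$

with

$$K_x^{(xx)}(\phi) = \frac{\hbar^2}{m_eB^2}\sum_{j=1}^{N}v_{x,j}^{(0)}(\phi)\left[\frac{\partial B_{P,z}}{\partial y}(r_j)\right]^2v_{x,j}^{(0)}(\phi), \tag{E.42}$$

$$\begin{aligned}
K_x^{(xy)}(\phi)
&= -\frac{\hbar^2}{2m_eB^2}\sum_{j=1}^{N}\left[v_{x,j}^{(0)}(\phi)B_{P,z}(r_j)\frac{\partial^2B_{P,z}}{\partial x\partial y}(r_j)v_{y,j}^{(0)}(\phi) + (x \leftrightarrow y)\right] \\
&\quad + \frac{\hbar^2}{2m_eB^2}\sum_{j=1}^{N}\left[v_{x,j}^{(0)}(\phi)\frac{\partial B_{P,z}}{\partial x}(r_j)\frac{\partial B_{P,z}}{\partial y}(r_j)v_{y,j}^{(0)}(\phi) + (x \leftrightarrow y)\right],
\end{aligned} \tag{E.43}$$

$$K_x^{(x)}(\phi) = -\frac{\hbar^2e}{2m_e^2B}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), B_{P,z}(r_j)\frac{\partial B_{P,z}}{\partial y}(r_j)\right\} - \frac{\hbar^2e}{2m_e^2B^2}\sum_{j=1}^{N}\left\{v_{x,j}^{(0)}(\phi), [B_{P,z}(r_j)]^2\frac{\partial B_{P,z}}{\partial y}(r_j)\right\}, \tag{E.44}$$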
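and

$$\begin{aligned}
K_x^{(0)}
&= -\frac{\hbar^4}{4m_e^3B^2}\sum_{j=1}^{N}\left[\frac{\partial B_{P,z}}{\partial y}(r_j)\frac{\partial^3B_{P,z}}{\partial x^2\partial y}(r_j) + 2\frac{\partial^2B_{P,z}}{\partial x^2}(r_j)\frac{\partial^2B_{P,z}}{\partial y^2}(r_j)\right] \\
&\quad - \frac{\hbar^4}{4m_e^3B^2}\sum_{j=1}^{N}\left[\frac{\partial^2B_{P,z}}{\partial x\partial y}(r_j)\right]^2.
\end{aligned} \tag{E.45}$$

Hence all the contributions except that for $K_x^{(xy)}(\phi)$ can be evaluated in the same way as above. We shall show that the ground state expectation value of $K_x^{(xy)}(\phi)$ is of order $N$, too. Using the Schwarz inequality, we have

$$\begin{aligned}
&\left|\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{x,j}^{(0)}(\phi)B_{P,z}(r_j)\frac{\partial^2B_{P,z}}{\partial x\partial y}(r_j)v_{y,j}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle\right| \\
&\quad \le \sqrt{\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{x,j}^{(0)}(\phi)[B_{P,z}(r_j)]^2v_{x,j}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle} \\
&\qquad \times \sqrt{\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), v_{y,j}^{(0)}(\phi)\left[\frac{\partial^2B_{P,z}}{\partial x\partial y}(r_j)\right]^2v_{y,j}^{(0)}(\phi)\Phi_{0,\mu}^{(N)}(\phi)\right\rangle} \\
&\quad \le \|B_{P,z}\|_\infty\left\|\frac{\partial^2B_{P,z}}{\partial x\partial y}\right\|_\infty\sqrt{\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), [v_{x,j}^{(0)}(\phi)]^2\Phi_{0,\mu}^{(N)}(\phi)\right\rangle} \\
&\qquad \times \sqrt{\sum_{j=1}^{N}\left\langle\Phi_{0,\mu}^{(N)}(\phi), [v_{y,j}^{(0)}(\phi)]^2\Phi_{0,\mu}^{(N)}(\phi)\right\rangle}.
\end{aligned} \tag{E.46}$$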
This right-hand side is of order $N$ by the same method as above. All the rest of the contributions on the right-hand side of (E.43) can be treated in the same way. Consequently $\delta'(\phi)$ of (E.10) has been proved to be bounded uniformly in the number $N$ of the electrons.

Consider the special case with $A_P = 0$ which corresponds to the situation of Theorem 7.8. Then all the contributions except for the first sum in (E.31) and for the first sum in (E.32) are vanishing, and a similar result can be obtained for the double commutator $[I_y(\phi), [H_0^{(N)}(\phi), I_y(\phi)]]$. As a result, we have the bound,

$$\delta'(\phi) \le \frac{2\ell_B^4}{(\hbar\omega_c)^2}\max_{m+n=2}\left\|D^{(m,n)}W\right\|_\infty^2. \tag{E.47}$$

Combining this with (E.6) and (E.9), we obtain the desired result (7.13) with (7.14).
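For orientation, the right-hand side of (E.47) can be evaluated for concrete numbers. The Python sketch below is a rough illustration under stated assumptions: it uses $\ell_B = \sqrt{\hbar/(eB)}$ and $\omega_c = eB/m_e$ with the free electron mass (an effective mass could be passed through the hypothetical `mass` argument), and the value chosen for $\max_{m+n=2}\|D^{(m,n)}W\|_\infty$, corresponding to a potential energy varying by about 1 meV over 100 nm, is invented for the example.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J s
E_CHARGE = 1.602176634e-19  # elementary charge in C
M_E = 9.1093837015e-31      # free electron mass in kg

def magnetic_length(b_field):
    """Magnetic length l_B = sqrt(hbar / (e B)) in metres."""
    return math.sqrt(HBAR / (E_CHARGE * b_field))

def cyclotron_energy(b_field, mass=M_E):
    """Cyclotron energy hbar * omega_c = hbar e B / m in joules."""
    return HBAR * E_CHARGE * b_field / mass

def delta_prime_bound(b_field, max_second_derivative, mass=M_E):
    """Right-hand side of (E.47).

    max_second_derivative -- assumed value of max_{m+n=2} ||D^(m,n) W||_inf,
                             in J / m^2, with W a potential energy.
    """
    l_b = magnetic_length(b_field)
    hw = cyclotron_energy(b_field, mass)
    return 2.0 * l_b ** 4 * max_second_derivative ** 2 / hw ** 2

if __name__ == "__main__":
    # Hypothetical potential: about 1 meV variation over a 100 nm length scale.
    d2w = 1.0e-3 * E_CHARGE / (100e-9) ** 2
    print(delta_prime_bound(10.0, d2w))
```

Since $\ell_B^4/(\hbar\omega_c)^2 = m_e^2/(e^4B^4)$, the bound decreases as $B^{-4}$ for a fixed potential, which the function reproduces when the field is doubled.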
Appendix F. Estimate of the Ground State Energies $E_{0,\mu}^{(N)}(\phi)$

In this appendix, we show that all of the ground state energies $E_{0,\mu}^{(N)}(\phi)$ are of order $N$. Let $\Psi_0^{(N)}(\phi)$ be a ground state vector of the Landau Hamiltonian,

$$H_L^{(N)}(\phi) = \sum_{j=1}^{N}\frac{1}{2m_e}\left[(p_{x,j} - eBy_j + \phi_x)^2 + (p_{y,j} + \phi_y)^2\right], \tag{F.1}$$

for the non-interacting $N$ electrons. Then one has

$$E_{0,\mu}^{(N)}(\phi) \le \left\langle\Psi_0^{(N)}(\phi), H_0^{(N)}(\phi)\Psi_0^{(N)}(\phi)\right\rangle + \Delta E(\phi) \quad \text{for } \mu = 1, 2, \ldots, q, \tag{F.2}$$

where $H_0^{(N)}(\phi)$ is the Hamiltonian (5.1) of the present system, and $\Delta E(\phi)$ is given by

$$\Delta E(\phi) = \max_{\mu,\mu'}\left\{E_{0,\mu}^{(N)}(\phi) - E_{0,\mu'}^{(N)}(\phi)\right\}. \tag{F.3}$$

Using the eigenvectors $\varphi_{n,k}^{P}(\phi)$ of (6.7) for the single-electron Hamiltonian $H_0(\phi)$ of (6.1), the ground state expectation value for $H_0^{(N)}(\phi)$ can be written as

$$\begin{aligned}
\left\langle\Psi_0^{(N)}(\phi), H_0^{(N)}(\phi)\Psi_0^{(N)}(\phi)\right\rangle
&= \sum_{n,k}\left\langle\varphi_{n,k}^{P}(\phi), H(\phi)\varphi_{n,k}^{P}(\phi)\right\rangle + \sum_{1\le i<j\le N}\left\langle\Psi_0^{(N)}(\phi), W^{(2)}(r_i - r_j)\Psi_0^{(N)}(\phi)\right\rangle \\
&\le \sum_{n,k}\left[E_{n,k} + \frac{\sqrt{2}\,e}{\sqrt{m_e}}\,\||A_P|\|_\infty\sqrt{E_{n,k}} + \frac{e^2}{2m_e}(\||A_P|\|_\infty)^2 + \|W^+\|_\infty\right] \\
&\quad + \sum_{1\le i<j\le N}\left\langle\Psi_0^{(N)}(\phi), W^{(2)}(r_i - r_j)\Psi_0^{(N)}(\phi)\right\rangle,
\end{aligned} \tag{F.4}$$

where $E_{n,k} = (n + 1/2)\hbar\omega_c$, the Hamiltonian $H(\phi)$ is given by (6.19), and we have used the inequality (6.22). Clearly the first sum on the right-hand side of the inequality is of order $N$, and so it is enough to estimate the ground state expectation value of the electron–electron interaction energy. But this quantity is of order $N$ in the same way as in [32, Appendix G].

References

[1] H. Nakano, A method of calculation of electrical conductivity, Progr. Theoret. Phys. 15 (1956) 77–79.
[2] R. Kubo, Statistical-mechanical theory of irreversible processes, I. General theory and simple applications to magnetic and conduction problems, J. Phys. Soc. Jpn. 12 (1957) 570–586.
[3] K. von Klitzing, G. Dorda and M. Pepper, New method for high accuracy determination of the fine structure constant based on quantized Hall resistance, Phys. Rev. Lett. 45 (1980) 494–497.
[4] S. Kawaji and J. Wakabayashi, Temperature dependence of transverse and Hall conductivities of silicon MOS inversion layers under strong magnetic fields, in Physics in High Magnetic Fields, eds. S. Chikazumi and N. Miura (Springer, Berlin, Heidelberg, New York, 1981), pp. 284–287.
[5] D. J. Thouless, M. Kohmoto, M. P. Nightingale and M. den Nijs, Quantized Hall conductance in a two-dimensional periodic potential, Phys. Rev. Lett. 49 (1982) 405–408.
[6] M. Kohmoto, Topological invariant and the quantization of the Hall conductance, Ann. Phys. 160 (1985) 343–354.
[7] M. Aizenman and G. M. Graf, Localization bounds for an electron gas, J. Phys. A31 (1998) 6783–6806.
[8] J. Bellissard, A. Van Elst and H. Schulz-Baldes, The noncommutative geometry of the quantum Hall effect, J. Math. Phys. 35 (1994) 5373–5451.
[9] A. Elgart and B. Schlein, Adiabatic charge transport and the Kubo formula for Landau type Hamiltonians, Comm. Pure Appl. Math. 57 (2004) 590–615.
[10] B. A. Dubrovin and S. P. Novikov, Ground states in a periodic field. Magnetic Bloch functions and vector bundles, Soviet Math. Dokl. 22 (1980) 240–244; Ground states of a two-dimensional electron in a periodic magnetic field, Soviet Phys. JETP 52 (1980) 511–516; S. P. Novikov, Magnetic Bloch function and vector bundles. Typical dispersion laws and their quantum numbers, Soviet Math. Dokl. 23 (1981) 298–303.
[11] H. Kunz, The quantum Hall effect for electrons in a random potential, Commun. Math. Phys. 112 (1987) 121–145.
[12] J. E. Avron, R. Seiler and B. Simon, Quantum Hall effect and the relative index for projections, Phys. Rev. Lett. 65 (1990) 2185–2188; Charge deficiency, charge transport and comparison of dimensions, Commun. Math. Phys. 159 (1994) 399–422.
[13] T. Koma, The width of the Hall conductance plateaus, in preparation.
[14] J. E. Avron, R. Seiler and B. Simon, Homotopy and quantization in condensed matter physics, Phys. Rev. Lett. 51 (1983) 51–53.
[15] B. Simon, Holonomy, the quantum adiabatic theorem, and Berry’s phase, Phys. Rev. Lett. 51 (1983) 2167–2170.
[16] I. Dana, Y. Avron and J. Zak, Quantised Hall conductance in a perfect crystal, J. Phys. C18 (1985) L679–L683.
[17] K. Ishikawa, Topological phenomena in two-dimensional electron systems, in Proc. 3rd Int. Symp. Foundations of Quantum Mechanics, Tokyo, 1989, pp. 70–79; N. Imai, K. Ishikawa, T. Matsuyama and I. Tanaka, Field theory in a strong magnetic field and the quantum Hall effect: Integer Hall effect, Phys. Rev. B42 (1990) 10610–10640.
[18] Q. Niu and D. J. Thouless, Quantised adiabatic charge transport in the presence of substrate disorder and many-body interaction, J. Phys. A17 (1984) 2453–2462.
[19] Q. Niu, D. J. Thouless and Y.-S. Wu, Quantized Hall conductance as a topological invariant, Phys. Rev. B31 (1985) 3372–3377.
[20] J. E. Avron and R. Seiler, Quantization of the Hall conductance for general multiparticle Schrödinger Hamiltonians, Phys. Rev. Lett. 54 (1985) 259–262.
[21] J. E. Avron, R. Seiler and L. G. Yaffe, Adiabatic theorems and applications to the quantum Hall effect, Commun. Math. Phys. 110 (1987) 33–49.
[22] D. C. Tsui, H. L. Störmer and A. C. Gossard, Two-dimensional magnetotransport in the extreme quantum limit, Phys. Rev. Lett. 48 (1982) 1559–1562.
[23] H. L. Störmer, A. M. Chang, D. C. Tsui, J. C. M. Hwang, A. C. Gossard and W. Wiegmann, Fractional quantization of the Hall effect, Phys. Rev. Lett. 50 (1983) 1953–1956.
[24] R. Willett, J. P. Eisenstein, H. L. Störmer, D. C. Tsui, A. C. Gossard and J. H. English, Observation of an even-denominator quantum number in the fractional quantum Hall effect, Phys. Rev. Lett. 59 (1987) 1776–1779.
[25] R. Tao and Y.-S. Wu, Gauge invariance and fractional quantum Hall effect, Phys. Rev. B30 (1984) 1097–1098.
[26] R. Tao and F. D. M. Haldane, Impurity effect, degeneracy, and topological invariant in the quantum Hall effect, Phys. Rev. B33 (1986) 3844–3850.
[27] J. E. Avron and L. G. Yaffe, Diophantine equation for the Hall conductance of interacting electrons on a torus, Phys. Rev. Lett. 56 (1986) 2084–2087.
[28] D. Yoshioka, B. I. Halperin and P. A. Lee, Ground state of two-dimensional electrons in strong magnetic fields and 1/3 quantized Hall effect, Phys. Rev. Lett. 50 (1983) 1219–1222; The ground state of the 2D electrons in a strong magnetic field and the anomalous quantized Hall effect, Surf. Sci. 142 (1984) 155–162.
[29] D. Yoshioka, Ground state of the two-dimensional charged particles in a strong magnetic field and the fractional quantum Hall effect, Phys. Rev. B29 (1984) 6833–6839.
[30] W. P. Su, Ground-state degeneracy and fractionally charged excitations in the anomalous quantum Hall effect, Phys. Rev. B30 (1984) 1069–1072.
[31] T. Koma, Spectral gaps of quantum Hall systems with interactions, J. Stat. Phys. 99 (2000) 313–381.
[32] T. Koma, Insensitivity of quantized Hall conductance to disorder and interactions, J. Stat. Phys. 99 (2000) 383–459.
[33] J. Fröhlich and T. Kerler, Universality in quantum Hall systems, Nucl. Phys. B354 (1991) 369–417; J. Fröhlich and A. Zee, Large scale physics of the quantum Hall fluid, ibid. B364 (1991) 517–540; J. Fröhlich and U. M. Studer, Gauge invariance in non-relativistic many body theory, Int. J. Mod. Phys. B6 (1992) 2201–2208; U(1) × SU(2)-gauge invariance of non-relativistic quantum mechanics, and generalized Hall effects, Commun. Math. Phys. 148 (1992) 553–600; Gauge invariance and current algebra in non-relativistic many-body theory, Rev. Mod. Phys. 65 (1993) 733–802; J. Fröhlich and E. Thiran, Integral quadratic forms, Kac–Moody algebras, and fractional quantum Hall effect, J. Stat. Phys. 76 (1994) 209–283; J. Fröhlich, T. Kerler, U. M. Studer and E. Thiran, Structuring the set of incompressible quantum Hall fluids, Nucl. Phys. B453 (1995) 670–704; J. Fröhlich, U. M. Studer and E. Thiran, A classification of quantum Hall fluid, J. Stat. Phys. 86 (1997) 821–897; J. Fröhlich, B. Pedrini, C. Schweigert and J. Walcher, Universality in quantum Hall systems: Coset construction of incompressible states, J. Stat. Phys. 103 (2001) 527–567.
[34] A. Connes, Noncommutative Geometry (Academic Press, San Diego, 1994).
[35] J. Xia, Geometric invariants of the quantum Hall effect, Commun. Math. Phys. 119 (1988) 29–50.
[36] F. Nakano, Calculation of the Hall conductivity by adiabatic approximation, J. Math. Sci. Univ. Tokyo 4 (1997) 351–371; Calculation of the Hall conductivity by Abel limit, Ann. Inst. H. Poincaré 69 (1998) 441–455.
[37] F. Chandelier, Y. Georgelin, T. Masson and J.-C. Wallet, Quantum Hall conductivity in a Landau type model with a realistic geometry, Ann. Phys. 305 (2003) 60–78.
[38] Q. Niu and D. J. Thouless, Quantum Hall effect with realistic boundary conditions, Phys. Rev. B35 (1987) 2188–2197.
[39] M. F. Atiyah, Geometry of Yang–Mills Fields, Lezioni Fermiane, Accad. Naz. Lincei & Scuola Norm. Sup. Pisa, 1979.
[40] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators with Application to Quantum Mechanics and Global Geometry (Springer-Verlag, 1987).
[41] P. B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah–Singer Index Theorem, 2nd edn. (CRC Press, Boca Raton, 1995).
[42] S. Weinberg, The Quantum Theory of Fields, Vol. II, Modern Applications (Cambridge University Press, 1996).
[43] J. Zak, Magnetic translation group, Phys. Rev. 134 (1964) A1602–A1606; Magnetic translation group. II. Irreducible representations, ibid. 134 (1964) A1607–A1611.
[44] D. A. Greenwood, The Boltzmann equation in the theory of electrical conduction in metals, Proc. Phys. Soc. 71 (1958) 585–596.
[45] F. Nakano and M. Kaminaga, Absence of transport under a slowly varying potential in disordered systems, J. Stat. Phys. 97 (1999) 917.
[46] S. Nakamura and J. Bellissard, Low energy bands do not contribute to quantum Hall effect, Commun. Math. Phys. 31 (1990) 283–305.
[47] T. Kato, Perturbation Theory for Linear Operators, 2nd edn. (Springer, Berlin, Heidelberg, New York, 1980).
[48] T. Kato, Linear Evolution Equations of “Hyperbolic” Type I, J. Fac. Sci. Univ. Tokyo, Sec. I A17 (1970) 241–258; Linear Evolution Equations of “Hyperbolic” Type II, J. Math. Soc. Japan. 25 (1973) 648–666.
[49] K. Fujikawa, Path Integral for Gauge Theories with Fermions, Phys. Rev. D21 (1980) 2848–2858.
[50] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. IV, Analysis of Operators (Academic Press, New York, 1978).
[51] P. Horsch and W. von der Linden, Spin-correlations and low lying excited states of the spin-1/2 Heisenberg antiferromagnet on a square lattice, Z. Phys. B72 (1988) 181–193.
[52] T. Koma and H. Tasaki, Symmetry breaking and finite-size effects in quantum many-body systems, J. Stat. Phys. 76 (1994) 745–803.
December 23, 2004 10:36 WSPC/148-RMP
00222
Reviews in Mathematical Physics Vol. 16, No. 9 (2004) 1191–1225 © World Scientific Publishing Company
DENSITY OF STATES FOR GUE THROUGH SUPERSYMMETRIC APPROACH
M. DISERTORI
Theoretische Physik, ETH Zürich, CH-8093, Zürich, Switzerland
[email protected]
Received 7 July 2004
Revised 6 October 2004
The supersymmetric approach has proved to be a powerful tool for the study of random systems where classical techniques do not seem to apply. It also seems promising for rigorous analysis. In this context, we consider the GUE density of states, and show that, using the supersymmetric approach, we can rigorously re-derive the results obtained by classical techniques (orthogonal polynomials), in all energy regions (inside the spectrum, at the edge and outside the spectrum).
Keywords: Random matrix theory; supersymmetric approach.
1. Introduction
Random-matrix theory (RMT) appears in a large number of models both in physics and mathematics, and deals with the statistical properties of large matrices with randomly-distributed elements. It was initially introduced by Wigner to describe the statistics of energy levels in complex many-body systems, such as heavy nuclei, complex atoms and molecules (many of the early papers are collected in a book by Porter [1], see also [2]). Later it proved relevant in mesoscopic physics for the study of disordered conductors and systems that are chaotic in the classical limit (see [3–5] for reviews on the subject), in 2D quantum gravity, conformal field theory and QCD (see [6–9] for reviews). In mathematics RMT appears in a variety of problems, such as the zeros of the Riemann Zeta function, random permutations, random tiling and growth processes (among the many titles see [10–13] and references therein). For a review containing the history and main applications in physics of RMT see [14]. For recent developments and applications both in mathematics and physics see [15, Chap. 1] and the papers in [16] (special edition on random matrix theory). The mathematical basis of RMT was developed in the 1960s, notably by Wigner, Dyson, Mehta and Gaudin (see the book by Mehta [15], the classical reference on the subject). In particular Dyson introduced the classification of random matrix ensembles according to their invariant properties under time reversal [17]. In this context the three most important ensembles are GOE, GUE and GSE. They consist of the
Gaussian distributed $N \times N$ random matrices which are respectively real symmetric (GOE), complex Hermitian (GUE) and real quaternion (GSE). The corresponding probability distributions are invariant under orthogonal, unitary and symplectic transformations respectively. The classical mathematical tool for the study of RMT is orthogonal polynomials. For an introduction to this important technique see [15, Chaps. 5–7; 18, Chap. 5], or, for more advanced material, [19, 20]. This technique applies in all situations when the probability distribution is invariant under the symmetry group (respectively orthogonal, unitary or symplectic), $P(H) \propto \exp[-\operatorname{tr} V(H)]$, where $V(\cdot)$ is some polynomial. In such cases the symmetry group can be integrated out exactly and the joint probability density for the eigenvalues can be written explicitly. In some multi-matrix models, though $P(H)$ is no longer invariant, it is still possible to integrate out the symmetry group using the Harish-Chandra–Itzykson–Zuber formula. This formula was originally derived for the unitary group [21, 22], and was later extended to other symmetries (see [23] and references therein). It applies to probability distributions of the form $P(H_1, H_2) \propto \exp[-\operatorname{tr}(V_1(H_1) + V_2(H_2))] \exp[\operatorname{tr}(H_1 H_2)]$, or for any chain of matrices. In the context of mesoscopic physics, however, the probability distribution typically breaks the symmetry strongly, so the orthogonal polynomial technique cannot be applied. In these cases a technique widely applied by physicists is the supersymmetric approach. This method, based on a seminal work by Wegner [24, 25], was built in a systematic way by Efetov [26, 27]. It has proved to be a powerful tool for the study of random systems where classical techniques do not seem to apply. For an introduction to the method with some applications see [28], and also [29, 30]. The supersymmetric approach also seems promising for a rigorous analysis (see [31–33]).
In this paper we reconsider the simple case of the averaged density of states for the GUE ensemble. This is defined as [15] the set of $N \times N$ Hermitian random matrices $H = H^{+}$ with probability distribution
$$P(H) = \mathcal{N} e^{-\frac{N}{2} \operatorname{Tr} H^2} = \mathcal{N} \prod_{1 \le i < j \le N} e^{-N H_{ij}^* H_{ij}} \prod_{i=1}^{N} e^{-\frac{N}{2} H_{ii}^2} \tag{1.1}$$
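where $\mathcal{N}$ is the normalization factor. The density of states $\rho_N(E)$ is defined by
$$\rho_N(E) =: \frac{1}{N} \operatorname{Tr} \delta(E - H) = -\frac{1}{\pi N} \lim_{\varepsilon \to 0^+} \operatorname{Im} \operatorname{Tr} \frac{1}{E - H + i\varepsilon} . \tag{1.2}$$
The averaged density of states is then
$$\bar\rho_N(E) =: \langle \rho_N(E) \rangle = -\frac{1}{\pi N} \lim_{\varepsilon \to 0^+} \operatorname{Im} \Big\langle \operatorname{Tr} \frac{1}{E - H + i\varepsilon} \Big\rangle = -\frac{1}{\pi} \lim_{\varepsilon \to 0^+} \operatorname{Im} \Big\langle \Big[ \frac{1}{E - H + i\varepsilon} \Big]_{00} \Big\rangle \tag{1.3}$$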
where in the second line we applied translation invariance. Note that for any function $f(H)$ we will denote the average of $f$ with respect to this probability measure as
$$\langle f \rangle =: \int dH \, P(H) f(H) . \tag{1.4}$$
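It is known that $\bar\rho_N(E)$ tends to the so-called semicircle law when $N \to \infty$ [15]
$$\bar\rho_N(E) = \rho_{SC}(E) + O\Big(\frac{1}{N}\Big) , \tag{1.5}$$
where
$$\rho_{SC}(E) = \begin{cases} \dfrac{1}{\pi} \sqrt{1 - \dfrac{E^2}{4}} & |E| \le 2 \\ 0 & |E| > 2 . \end{cases} \tag{1.6}$$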
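As a quick sanity check of (1.6), the sketch below integrates the semicircle density numerically and confirms that it is normalized and has unit second moment; the latter matches $\langle \frac{1}{N}\operatorname{Tr} H^2 \rangle = 1$, which one reads off from the variances in (1.1). The substitution $E = 2\sin\varphi$ and the function names are choices made here for illustration, not taken from the paper.

```python
import math

def rho_sc(E):
    """Semicircle density (1.6)."""
    return math.sqrt(1.0 - E * E / 4.0) / math.pi if abs(E) <= 2.0 else 0.0

def semicircle_moment(k, steps=2000):
    """Integral of E**k * rho_sc(E) over [-2, 2].

    Uses the midpoint rule in phi with E = 2 sin(phi), which removes the
    square-root behaviour at the endpoints (illustrative choice).
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = -math.pi / 2.0 + (i + 0.5) * h
        E = 2.0 * math.sin(phi)
        # dE = 2 cos(phi) dphi
        total += E ** k * rho_sc(E) * 2.0 * math.cos(phi)
    return total * h

norm = semicircle_moment(0)      # total mass of the density
second = semicircle_moment(2)    # second moment
```

Odd moments vanish by symmetry, so the zeroth and second moments are the first non-trivial checks.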
Moreover the analysis of the corrections for large $N$ to the dominant behavior shows that, for energies near the edge, more precisely $|E| = 2 - x$, $x = t/N^{2/3}$ and $t = O(1)$, $\bar\rho_N(E)$ is given by
$$\bar\rho_N(t) = N^{-\frac{1}{3}} \big[ \mathrm{Ai}''(-t)\mathrm{Ai}(-t) - \mathrm{Ai}'(-t)\mathrm{Ai}'(-t) \big] + O\big( N^{-\frac{2}{3}} \big) \tag{1.7}$$
where $\mathrm{Ai}(x)$ is the Airy function ([15, Chap. 18; 34, 35]). Finally, in the region outside the spectrum, $|E| = 2 + x$ with $1/N^{2/3} < x < 1$, we have ([15, Chap. 18])
$$\bar\rho_N(2 + x) \le K \frac{1}{N x} e^{-c N x^{3/2}} . \tag{1.8}$$
We will use the supersymmetric approach to recover the limit $N \to \infty$ and the corrections to the asymptotic behavior for large $N$ in the whole spectrum. The computations in this case are simple but already show the richness of the problem and the advantages of the technique. Note that in the limit $N \to \infty$ we recover the density of states (without average) $\lim_{N \to \infty} \bar\rho_N(E) = \rho_\infty(E)$ as the density of states is a non-random variable in that limit. Basically supersymmetry is an algebraic tool that allows us to write the averaged density of states (or in general the quantity under study) as a functional integral where a saddle point analysis can be applied. In the case of the density of states for GUE we are reduced to an integral over only two variables (from the $N^2$ variables at the beginning). This integral is then analyzed differently depending on whether we are in the bulk of the spectrum, at the edge or outside the spectrum. Saddle point analysis shows that only four saddle points survive for large $N$. As there is a pole, only two of them can be reached by contour deformation without generating remainder terms. Inside the spectrum, the integrand then presents a double well configuration. The dominant well gives the expected asymptotic expression for the DOS, that is the semicircle law. The second well is suppressed by a factor $1/N$ and hence disappears in the limit. Nevertheless it contains an oscillating factor $\exp[iN f(E)]$ where $f(E)$ is some function of the energy $E$. When we compute derivatives of $\bar\rho_N(E)$ with respect to the energy, the contribution from the
second well grows and eventually (for derivatives of order two at least) diverges in the limit $N \to \infty$. When the edge of the spectrum is approached, the saddles tend to merge and the Hessian becomes very small. The leading contribution is no longer a Gaussian (quadratic term) but an Airy function (cubic term). Finally, outside the spectrum, the saddle points are real, and the only contribution to the imaginary part comes from the region near the pole. But this is far from the saddle points, so the integrand is very small. Note that the deformed contour must always be far enough from the pole. For this purpose inside the bulk a simple translation to the saddle point is sufficient. However, when we get near the edge, additional rotations must be performed.

2. Results
For the model defined above we prove the following theorems.
Theorem 1. The averaged density of states (1.3), for $|E| < 2 - \eta$ and $N$ large, has the following behavior
$$\bar\rho_N(E) = \rho_{SC}(E) + \frac{1}{N} \frac{1}{\pi |C|} \operatorname{Im} \Big( K_N \frac{\mathcal{E}}{C} \Big) + O\big( N^{-\frac{3}{2}} \big) \tag{2.1}$$
where $\rho_{SC}(E)$ is defined in (1.6), $\mathcal{E} = E/2 - i\sqrt{1 - E^2/4}$, $C = 1 - \mathcal{E}^2$, $K_N$ is a phase factor
$$K_N = \Big( \frac{\mathcal{E}}{\mathcal{E}^*} \Big)^{N} e^{-\frac{N}{2} [ \mathcal{E}^2 - (\mathcal{E}^*)^2 ]} \tag{2.2}$$
and finally $0 < \eta \ll 1$ is some constant.
A consequence of this result is that $\bar\rho_N(E)$ is not smooth when $N \to \infty$, because of the strong phase oscillations that already appear in the first correction.
Theorem 2. The averaged density of states (1.3), for $|E| = 2 - t/N^{2\beta}$ (where $0 < \beta < \frac{1}{3}$ and $t = O(1)$) and $N$ large, has the following behavior
$$\bar\rho_N(E) = \rho_{SC}(E) + O\big( N^{-(1 - 2\beta)} \big) . \tag{2.3}$$
Note that for $\beta = \frac{1}{3}$
$$\rho_{SC}(E) = O(N^{-\beta}) = O\big( N^{-(1 - 2\beta)} \big) \tag{2.4}$$
therefore the semicircle law is no longer a good approximation for the averaged density. Actually in this region an Airy kernel appears and we have the following result:
Theorem 3. The averaged density of states (1.3), for $|E| = 2 \pm t N^{-\frac{2}{3}}$, $t = O(1)$ and $N$ large, has the following behavior
$$\bar\rho_N(t) = N^{-\frac{1}{3}} \big[ \mathrm{Ai}''(-t)\mathrm{Ai}(-t) - \mathrm{Ai}'(-t)\mathrm{Ai}'(-t) \big] + O\big( N^{-\frac{2}{3}} \big) \tag{2.5}$$
where $\mathrm{Ai}(x)$ is the Airy function defined by
$$\frac{1}{2\pi i} \int_L dv \, e^{-v z + \frac{v^3}{3}} = \mathrm{Ai}(z) \tag{2.6}$$
where the contour $L$ starts at $v = -\infty$ with phase $-\frac{\pi}{2} + \delta < \phi < -\frac{\pi}{6} - \delta$ and ends at $v = +\infty$ with phase $+\frac{\pi}{2} - \delta > \phi > +\frac{\pi}{6} + \delta$, $1 > \delta > 0$.
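The bulk statement of Theorem 1 can be probed numerically. The following sketch compares the exact finite-$N$ density with $\rho_{SC}(E) + \frac{1}{N\pi|C|}\operatorname{Im}(K_N \mathcal{E}/C)$, where $\mathcal{E} = E/2 - i\sqrt{1-E^2/4}$, $C = 1 - \mathcal{E}^2$ and $K_N = (\mathcal{E}/\mathcal{E}^*)^N e^{-\frac{N}{2}[\mathcal{E}^2 - (\mathcal{E}^*)^2]}$. It assumes the standard representation of the GUE density as a sum of squared orthonormal Hermite functions (the Christoffel-Darboux form of the orthogonal-polynomial result); all function names are illustrative.

```python
import cmath
import math

def rho_exact(N, E):
    """Exact averaged density: (1/N) sqrt(N/2) sum_{k<N} phi_k(x)^2, x = E sqrt(N/2).

    phi_k are the orthonormal Hermite functions, generated by the stable
    three-term recurrence (assumed standard representation).
    """
    x = E * math.sqrt(N / 2.0)
    phi_prev = 0.0
    phi = math.pi ** -0.25 * math.exp(-x * x / 2.0)
    total = 0.0
    for k in range(N):
        total += phi * phi
        phi_prev, phi = phi, (math.sqrt(2.0 / (k + 1)) * x * phi
                              - math.sqrt(k / (k + 1.0)) * phi_prev)
    return math.sqrt(N / 2.0) * total / N

def rho_sc(E):
    """Semicircle law."""
    return math.sqrt(max(0.0, 1.0 - E * E / 4.0)) / math.pi

def rho_first_correction(N, E):
    """Semicircle law plus the oscillating 1/N term of the bulk expansion."""
    cal_e = complex(E / 2.0, -math.sqrt(1.0 - E * E / 4.0))
    c = 1.0 - cal_e ** 2
    k_n = (cal_e / cal_e.conjugate()) ** N * cmath.exp(
        -0.5 * N * (cal_e ** 2 - cal_e.conjugate() ** 2))
    return rho_sc(E) + (k_n * cal_e / c).imag / (N * math.pi * abs(c))

N = 20
errors = {E: abs(rho_exact(N, E) - rho_first_correction(N, E))
          for E in (0.0, 0.5, 1.0)}
gap_to_semicircle = abs(rho_exact(N, 0.0) - rho_sc(0.0))
```

At $E = 0$ the correction reduces to $-(-1)^N/(4\pi N)$, so the density at the centre lies below the semicircle for even $N$ and above it for odd $N$.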
Theorem 4. The averaged density of states (1.3), for $E = 2 + x$ ($x \ge N^{-2/3}$) and $N$ large, has the following behavior
$$|\bar\rho_N(E)|_{|E| > 2} \le K \frac{1}{N \big( y - \sqrt{y^2 - 1} \big) (y^2 - 1)} \, e^{-N 2 [ y \sqrt{y^2 - 1} ] + 2N \ln ( y + \sqrt{y^2 - 1} )} \tag{2.7}$$
where $y = |E|/2$. In particular, when $y$ is near 1 (that is $E \simeq 2$), $y = 1 + x/2$, we have
$$\bar\rho_N(2 + x) \le K \frac{1}{N x} e^{-c N x^{3/2}} \tag{2.8}$$
where $c > 0$ is some constant.

3. Supersymmetric Approach
We introduce the representation we are going to work on. By algebraic tools we can reduce an integral over $N^2$ variables (the starting average) to an integral over two variables. The formula we obtain is (3.1) below. We then perform a few steps of integration by parts to reduce the formula to an even simpler expression. In order to stress the main properties of these formulas, we also consider the case when no observable is present (we take the average of one). This also allows us to check the validity of this representation. The final expressions we work on are (3.28) and (3.2).
Lemma 1. Using the supersymmetric approach the averaged matrix element in (1.3) can be written as
$$\Big\langle \Big[ \frac{1}{E_\varepsilon - H} \Big]_{jk} \Big\rangle = \delta_{jk} \frac{N}{2\pi} \int da \, db \, e^{-\frac{N}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^N}{(E_\varepsilon - a)^{N+1}} \Big[ 1 - \frac{N+1}{N} \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] \tag{3.1}$$
where $a$ and $b$ are real variables and we define $E_\varepsilon = E + i\varepsilon$. With no observable we have
$$1 = \langle 1 \rangle = \frac{N}{2\pi} \int da \, db \, e^{-\frac{N}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^N}{(E_\varepsilon - a)^N} \Big[ 1 - \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] . \tag{3.2}$$
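Proof. The matrix element of $(E_\varepsilon - H)^{-1}$ in (1.3) can be written as a functional integral
$$\Big[ \frac{1}{E_\varepsilon - H} \Big]_{jk} = -i \det\Big[ -i \frac{(E_\varepsilon - H)}{2\pi} \Big] \int dS^* dS \, e^{i S^{+} (E_\varepsilon - H) S} S_j S_k^* \tag{3.3}$$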
where the determinant is the normalization factor, we defined
$$S = \begin{pmatrix} S_1 \\ \vdots \\ S_N \end{pmatrix} , \qquad S^{+} = (S_1^*, \ldots, S_N^*) \tag{3.4}$$
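and $S_1, \ldots, S_N$ are complex variables. In order to have the $H$ dependence only in the argument of the exponential we introduce the anticommuting variables $\chi_1, \chi_1^*, \ldots, \chi_N, \chi_N^*$. We summarize the properties of such variables in the Appendix. The determinant can then be written as
$$\det\Big[ -i \frac{(E_\varepsilon - H)}{2\pi} \Big] = \int d\chi^* d\chi \, e^{i \chi^{+} (E_\varepsilon - H) \chi} \tag{3.5}$$
where we define
$$\chi = \begin{pmatrix} \chi_1 \\ \vdots \\ \chi_N \end{pmatrix} , \qquad \chi^{+} = (\chi_1^*, \ldots, \chi_N^*) . \tag{3.6}$$
Therefore we can write
$$\Big[ \frac{1}{E_\varepsilon - H} \Big]_{jk} = -i \int d\Phi^* d\Phi \, \exp[ i \Phi^{+} (E_\varepsilon - H) \Phi ] \, S_j S_k^* \tag{3.7}$$
where we introduce the superfield $\Phi_1, \ldots, \Phi_N$ ($i = 1, \ldots, N$)
$$\Phi_i = \begin{pmatrix} S_i \\ \chi_i \end{pmatrix} , \qquad \Phi_i^{+} = (S_i^*, \chi_i^*) . \tag{3.8}$$
These superfields can be seen as components of a supervector $\Phi$
$$\Phi = \begin{pmatrix} \Phi_1 \\ \vdots \\ \Phi_N \end{pmatrix} , \qquad \Phi^{+} = ( \Phi_1^{+}, \ldots, \Phi_N^{+} ) . \tag{3.9}$$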
Here we have adopted the conventions in the review by Mirlin [28]. We summarize supersymmetric formalism and notation in the Appendix. Now we can perform the average over $H$. For this purpose we note that the averages with respect to the real and imaginary parts decouple and can be performed separately. The result is
$$\langle \exp[ -i \Phi^{+} H \Phi ] \rangle = \exp\Big[ -\frac{1}{2N} \sum_{ij} (\Phi_i^{+} \Phi_j)(\Phi_j^{+} \Phi_i) \Big] . \tag{3.10}$$
To convert this quartic interaction into a quadratic one we perform a Hubbard–Stratonovich transformation. We remark that
$$\sum_{ij} (\Phi_i^{+} \Phi_j)(\Phi_j^{+} \Phi_i) = A^2 - B^2 + 2 P^* P \tag{3.11}$$
where
$$A = \sum_i S_i^* S_i , \qquad B = \sum_i \chi_i^* \chi_i , \qquad P = \sum_i S_i^* \chi_i , \qquad P^* = \sum_i S_i \chi_i^* \tag{3.12}$$
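where $A$ and $B$ are commuting variables, and $P$ and $P^*$ are anticommuting ones. Now
$$\begin{aligned} \exp\Big[ -\frac{1}{2N} A^2 \Big] &= \sqrt{\frac{N}{2\pi}} \int da \, e^{-\frac{N}{2} a^2 - i a A} \\ \exp\Big[ +\frac{1}{2N} B^2 \Big] &= \sqrt{\frac{N}{2\pi}} \int db \, e^{-\frac{N}{2} b^2 + b B} \\ \exp\Big[ -\frac{1}{N} P^* P \Big] &= \frac{2\pi}{N} \int d\rho^* d\rho \, e^{-N \rho^* \rho - i \rho^* P - i P^* \rho} \end{aligned} \tag{3.13}$$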
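where $a$ and $b$ are real variables, and $\rho$ and $\rho^*$ are anticommuting ones. Therefore
$$\langle \exp[ -i \Phi^{+} H \Phi ] \rangle = \int da \, db \, d\rho^* d\rho \, e^{-\frac{N}{2}(a^2 + b^2 + 2\rho^* \rho) - i \Phi^{+} R \Phi} \tag{3.14}$$
where
$$\Phi^{+} R \Phi := \sum_{i=1}^{N} \Phi_i^{+} R \Phi_i , \qquad R := \begin{pmatrix} a & \rho^* \\ \rho & ib \end{pmatrix} . \tag{3.15}$$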
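The two commuting (bosonic) Hubbard–Stratonovich identities, $e^{-A^2/2N} = \sqrt{N/2\pi}\int da\, e^{-\frac{N}{2}a^2 - iaA}$ and $e^{+B^2/2N} = \sqrt{N/2\pi}\int db\, e^{-\frac{N}{2}b^2 + bB}$, are ordinary Gaussian integrals and can be checked by quadrature. The sketch below treats $A$ and $B$ as plain real numbers, which is an assumption made for illustration (in the text $B$ is bilinear in Grassmann variables and the identity is understood as a terminating power series); the helper names are hypothetical.

```python
import cmath
import math

def trapezoid(g, cutoff=12.0, steps=24000):
    """Trapezoidal rule on [-cutoff, cutoff]; very accurate for Gaussian integrands."""
    h = 2.0 * cutoff / steps
    total = 0.5 * (g(-cutoff) + g(cutoff))
    for i in range(1, steps):
        total += g(-cutoff + i * h)
    return total * h

def hs_a(N, A):
    """sqrt(N/2pi) * integral of exp(-N a^2/2 - i a A) over real a."""
    pref = math.sqrt(N / (2.0 * math.pi))
    return pref * trapezoid(lambda a: cmath.exp(-0.5 * N * a * a - 1j * a * A))

def hs_b(N, B):
    """sqrt(N/2pi) * integral of exp(-N b^2/2 + b B) over real b."""
    pref = math.sqrt(N / (2.0 * math.pi))
    return pref * trapezoid(lambda b: math.exp(-0.5 * N * b * b + b * B))

N, A, B = 4, 1.3, 0.7
lhs_a, rhs_a = math.exp(-A * A / (2.0 * N)), hs_a(N, A)
lhs_b, rhs_b = math.exp(+B * B / (2.0 * N)), hs_b(N, B)
```

The $a$ integral carries the factor $-iaA$ because the quartic term in $A$ enters with a negative sign, while the $b$ integral has a real source term because $B^2$ enters with a positive sign.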
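$R$ is actually a supermatrix, containing both commuting and anticommuting variables. For such a matrix we can define the notion of transpose, complex conjugate, determinant and trace and it can be shown that the usual properties of the vector and matrix algebra hold (see the Appendix). Using this formalism we have
$$-i \int d\Phi^* d\Phi \, e^{i \Phi^{+} (E_\varepsilon - R) \Phi} S_j S_k^* = \delta_{jk} \Big[ \frac{1}{E_\varepsilon - R} \Big]_{11} \Big[ \frac{1}{\operatorname{Sdet}(E_\varepsilon - R)} \Big]^{N} \tag{3.16}$$
where
$$\operatorname{Sdet}(E_\varepsilon - R) = \frac{(E_\varepsilon - a)}{(E_\varepsilon - ib)} \Big[ 1 - \rho^* \rho \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] \tag{3.17}$$
$$\Big[ \frac{1}{E_\varepsilon - R} \Big]_{11} = \frac{1}{(E_\varepsilon - a)} \Big[ 1 - \rho^* \rho \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big]^{-1} . \tag{3.18}$$
Therefore
$$\Big\langle \Big[ \frac{1}{E - H + i\varepsilon} \Big]_{jk} \Big\rangle = \delta_{jk} \int da \, db \, d\rho^* d\rho \, e^{-\frac{N}{2}(a^2 + b^2 + 2\rho^* \rho)} \Big[ \frac{(E_\varepsilon - ib)}{(E_\varepsilon - a)} \Big]^{N} \frac{1}{(E_\varepsilon - a)} \Big[ 1 - \frac{\rho^* \rho}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big]^{-(N+1)} . \tag{3.19}$$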
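The integration over the anticommuting variables can be performed exactly. Using the property $\rho^2 = (\rho^*)^2 = 0$ we observe that
$$\Big[ 1 - \frac{\rho^* \rho}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big]^{-1} = \exp\Big[ \frac{\rho^* \rho}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] \tag{3.20}$$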
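therefore the integration over $\rho$ and $\rho^*$ reduces to the following expression
$$\int d\rho^* d\rho \, \exp\Big[ -N \rho^* \rho \Big( 1 - \frac{N+1}{N} \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big) \Big] = \frac{N}{2\pi} \Big[ 1 - \frac{N+1}{N} \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] . \tag{3.21}$$
Inserting this result in (3.19) we complete the proof of (3.1). The derivation of (3.2) is done in the same way.
We conclude this section with a couple of remarks.
Remark 1. Note that from (3.1) we can recover the classical formula for $\bar\rho_N(E)$ in terms of orthogonal polynomials:
$$\bar\rho_N(E) = \frac{\sqrt{N}}{\sqrt{2\pi}} \, e^{-\frac{N E^2}{2}} \, \frac{1}{N! \, 2^N} \Big[ H_N^2(x) - H_{N-1}(x) H_{N+1}(x) \Big] \Big|_{x = \frac{E \sqrt{N}}{\sqrt{2}}} \tag{3.22}$$
where $H_N(x)$ is the Hermite polynomial
$$H_N(x) = e^{x^2} \Big( -\frac{d}{dx} \Big)^{N} e^{-x^2} . \tag{3.23}$$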
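The orthogonal-polynomial formula can be evaluated directly. The sketch below implements $\bar\rho_N(E) = \sqrt{N/2\pi}\, e^{-NE^2/2}\,[H_N^2(x) - H_{N-1}(x)H_{N+1}(x)]/(N!\,2^N)$ at $x = E\sqrt{N/2}$, generating the physicists' Hermite polynomials by the standard recurrence $H_{k+1} = 2xH_k - 2kH_{k-1}$, and checks normalization and second moment by quadrature. It is meant for small $N$ only, since the raw polynomials grow quickly; function names are illustrative.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 0.0, 1.0  # H_{-1} := 0, H_0 = 1
    for k in range(n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def rho_hermite(N, E):
    """Averaged GUE density of states from the Hermite-polynomial formula."""
    x = E * math.sqrt(N / 2.0)
    bracket = hermite(N, x) ** 2 - hermite(N - 1, x) * hermite(N + 1, x)
    prefactor = math.sqrt(N / (2.0 * math.pi)) * math.exp(-N * E * E / 2.0)
    return prefactor * bracket / (math.factorial(N) * 2.0 ** N)

def integrate(f, lo, hi, steps=4000):
    """Composite trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, steps))
    return total * h

N = 5
mass = integrate(lambda E: rho_hermite(N, E), -6.0, 6.0)
second_moment = integrate(lambda E: E * E * rho_hermite(N, E), -6.0, 6.0)
```

For $N = 1$ the formula collapses to the Gaussian $e^{-E^2/2}/\sqrt{2\pi}$, which gives an independent check of the prefactors.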
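This connection with orthogonal polynomials has been remarked in the review by Kalisch and Braak [36] and can be seen by observing that the integrals over $a$ and $b$ in (3.1) are actually integral representations for Hermite polynomials. For the $b$ integral we have
$$\int db \, e^{-\frac{N}{2} b^2} (E - ib)^M = \sqrt{\frac{2\pi}{N}} \Big( \frac{1}{2N} \Big)^{\frac{M}{2}} H_M\Big( \frac{E \sqrt{N}}{\sqrt{2}} \Big) \tag{3.24}$$
where after taking $\varepsilon = 0$ the integral is real. For the $a$ integral we have
$$\lim_{\varepsilon \to 0^+} \operatorname{Im} \int da \, e^{-N \frac{a^2}{2}} \frac{1}{(E_\varepsilon - a)^M} = \frac{1}{2i} \oint_{\mathcal{C}} da \, e^{-N \frac{a^2}{2}} \frac{1}{(E - a)^M} = -\frac{\pi}{(M-1)!} \Big( \frac{N}{2} \Big)^{\frac{M-1}{2}} e^{-\frac{N E^2}{2}} H_{M-1}\Big( \frac{E \sqrt{N}}{\sqrt{2}} \Big) \tag{3.25}$$
where $\mathcal{C}$ is a (counterclockwise) contour in the complex plane around the singularity $a = E$.
Remark 2. The identity (3.2) can be checked directly by induction on $N$.^a We start by rescaling the $a$ and $b$ variables and the energy too (the result should be true for any $E$, therefore it does not depend on any rescaling of $E$). Then (3.2) can be written as
$$I_N = \frac{1}{2\pi} \int da \, db \, e^{-\frac{1}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^N}{(E_\varepsilon - a)^N} \Big[ 1 - \frac{N}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] . \tag{3.26}$$
^a We thank G. M. Graf for explaining to us this nice argument.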
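Note that $I_0 = 1$, therefore we only have to verify the induction step. We assume $I_N = 1$ and we want to check that $I_{N+1} = 1$ too.
$$\begin{aligned} I_{N+1} &= \frac{1}{2\pi} \int da \, db \, e^{-\frac{1}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^{N+1}}{(E_\varepsilon - a)^{N+1}} \Big[ 1 - \frac{N+1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] \\ &= \frac{1}{2\pi} \int da \, db \, e^{-\frac{1}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^{N}}{(E_\varepsilon - a)^{N}} \Big[ \frac{(E_\varepsilon - ib)}{(E_\varepsilon - a)} - \frac{N+1}{(E_\varepsilon - a)^2} \Big] \\ &= \frac{1}{2\pi} \int da \, db \, e^{-\frac{1}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^{N}}{(E_\varepsilon - a)^{N}} \Big[ \frac{(E_\varepsilon - ib)}{(E_\varepsilon - a)} - \frac{a}{(E_\varepsilon - a)} \Big] \\ &= \frac{1}{2\pi} \int da \, db \, e^{-\frac{1}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^{N}}{(E_\varepsilon - a)^{N}} \Big[ 1 - \frac{ib}{(E_\varepsilon - a)} \Big] \\ &= \frac{1}{2\pi} \int da \, db \, e^{-\frac{1}{2}(a^2 + b^2)} \frac{(E_\varepsilon - ib)^{N}}{(E_\varepsilon - a)^{N}} \Big[ 1 - \frac{N}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] = I_N \end{aligned} \tag{3.27}$$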
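where in lines 3 and 5 we have applied integration by parts with respect to $a$ and $b$ respectively.
Remark 3. By integration by parts it is easy to see that (3.1) can be written as
$$\Big\langle \Big[ \frac{1}{E_\varepsilon - H} \Big]_{jk} \Big\rangle = \delta_{jk} \frac{N}{2\pi} \int da \, db \, e^{-\frac{N}{2}(a^2 + b^2)} \, a \, \frac{(E_\varepsilon - ib)^N}{(E_\varepsilon - a)^N} \Big[ 1 - \frac{1}{(E_\varepsilon - a)(E_\varepsilon - ib)} \Big] \tag{3.28}$$
where the observable contribution is only in the factor $a$. In the following we will use this expression.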
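The normalization identity of Remark 2, $I_N = 1$ for the rescaled integral $I_N = \frac{1}{2\pi}\int da\,db\, e^{-\frac{1}{2}(a^2+b^2)} \frac{(E_\varepsilon - ib)^N}{(E_\varepsilon - a)^N}\big[1 - \frac{N}{(E_\varepsilon - a)(E_\varepsilon - ib)}\big]$, can also be confirmed numerically for a finite $\varepsilon$. The sketch below uses a plain two-dimensional trapezoidal sum; the value $E_\varepsilon = 0.5 + i$, the cutoff and the grid size are arbitrary choices made here so that the pole stays well away from the real axis.

```python
import math

def I_N(N, E_eps, cutoff=8.0, steps=240):
    """Two-dimensional trapezoidal sum for the rescaled integral I_N.

    The Gaussian weight makes the endpoint contributions negligible, so all
    grid points are given equal weight.
    """
    h = 2.0 * cutoff / steps
    pts = [-cutoff + i * h for i in range(steps + 1)]
    weights = [math.exp(-p * p / 2.0) for p in pts]
    total = 0j
    for a, wa in zip(pts, weights):
        da = E_eps - a
        for b, wb in zip(pts, weights):
            db = E_eps - 1j * b
            total += wa * wb * (db / da) ** N * (1.0 - N / (da * db))
    return total * h * h / (2.0 * math.pi)

E_eps = 0.5 + 1.0j  # illustrative energy with a finite imaginary part
values = [I_N(N, E_eps) for N in range(4)]
```

Each entry of `values` should be close to $1 + 0i$; the cancellation between the two terms in the bracket is what the induction argument encodes.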
4. Saddle Point Analysis
We compute the saddle points for (3.28). We will see that both $a$ and $b$ have two saddle points (4.3). Note that the saddle points for (3.2) are the same, as the observable does not enter in the saddle analysis. We then deform the integration contour in order to pass through the saddles. For the $a$ variable there is a singularity at $a = E_\varepsilon$. If we cross this singularity when moving the contour we have to compute a residue. In the GUE case this will give a Hermite polynomial as shown. However, in more general cases it is not easy to compute such a residue, therefore in this presentation we will avoid crossing the singularity. We will then consider only the contribution from one saddle in $a$. We remark that for $b$ there is no singularity, therefore we must consider the contributions from both saddles.
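4.1. Saddle points
For $N$ large, the dominant term in (3.28) is
$$e^{-N f} = e^{-\frac{N}{2}[a^2 + b^2]} \Big[ \frac{E_\varepsilon - ib}{E_\varepsilon - a} \Big]^{N} \tag{4.1}$$
where
$$f = \frac{a^2 + b^2}{2} + \ln(E_\varepsilon - a) - \ln(E_\varepsilon - ib) . \tag{4.2}$$
The saddle points are then
$$a_s = \begin{cases} \mathcal{E} \\ \mathcal{E}^* \end{cases} + O\Big( \frac{1}{N} \Big) + O(\varepsilon) , \qquad b_s = \begin{cases} -i\mathcal{E} \\ -i\mathcal{E}^* \end{cases} + O\Big( \frac{1}{N} \Big) + O(\varepsilon) \tag{4.3}$$
where
$$\mathcal{E} = E_r - i E_i = \frac{E}{2} - i \sqrt{1 - \frac{E^2}{4}} . \tag{4.4}$$
The following relations are true:
$$\mathcal{E} \mathcal{E}^* = 1 ; \qquad E - \mathcal{E} = \mathcal{E}^* . \tag{4.5}$$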
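The saddle-point equations follow from $\partial f/\partial a = a - 1/(E - a) = 0$ and $\partial f/\partial b = b + i/(E - ib) = 0$ at $\varepsilon = 0$. The short sketch below checks, for a few sample energies chosen here for illustration, that $\mathcal{E}$, $\mathcal{E}^*$, $-i\mathcal{E}$ and $-i\mathcal{E}^*$ solve these equations and that the relations $\mathcal{E}\mathcal{E}^* = 1$ and $E - \mathcal{E} = \mathcal{E}^*$ hold.

```python
import math

def cal_e(E):
    """Saddle value E/2 - i sqrt(1 - E^2/4) for |E| < 2."""
    return complex(E / 2.0, -math.sqrt(1.0 - E * E / 4.0))

def df_da(a, E):
    """Derivative of f = (a^2 + b^2)/2 + ln(E - a) - ln(E - ib) with respect to a."""
    return a - 1.0 / (E - a)

def df_db(b, E):
    """Derivative of the same f with respect to b."""
    return b + 1j / (E - 1j * b)

residuals = []
for E in (-1.5, 0.0, 0.7, 1.9):
    e = cal_e(E)
    es = e.conjugate()
    residuals += [
        abs(df_da(e, E)), abs(df_da(es, E)),            # a-saddles
        abs(df_db(-1j * e, E)), abs(df_db(-1j * es, E)),  # b-saddles
        abs(e * es - 1.0), abs(E - e - es),             # algebraic relations
    ]
worst = max(residuals)
```

Both saddle equations reduce to the quadratic $z^2 - Ez + 1 = 0$, whose roots have product 1 and sum $E$; this is the content of the two relations.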
4.1.1. Spectrum
Note that the saddle point (and as a consequence also the observable) has a nonzero imaginary part only for $|E| < 2$. Therefore we expect to find the density of states concentrated in the region $|E| \le 2$ when $N \to \infty$. In this region we can take the limit $\varepsilon = 0$ for any fixed $N$. The Hessian at the two saddles is
$$\mathrm{Hess}_{\mathcal{E}} = C = 1 - \mathcal{E}^2 = 2 \Big( 1 - \frac{E^2}{4} \Big) + i E \sqrt{1 - \frac{E^2}{4}} \tag{4.6}$$
$$\mathrm{Hess}_{\mathcal{E}^*} = C^* = 1 - \mathcal{E}^{*2} = 2 \Big( 1 - \frac{E^2}{4} \Big) - i E \sqrt{1 - \frac{E^2}{4}} . \tag{4.7}$$
The real part is positive for any $|E| \le 2$ and tends to zero as $E \to \pm 2$. In the region $E \simeq \pm 2$ the quadratic terms go to zero and the leading contributions are cubic. Therefore we have Airy type behavior. For $|E| > 2$ the saddles are real and we get exponential decay in $N$.
4.2. Contour deformation in the bulk
We consider $|E| < 2$ and we want to deform the integration contour in order to
• pass through the saddle points and
• avoid the singularity that appears for $a$ when $\varepsilon \to 0^+$ (to avoid computing residues).
The easiest thing is to perform a translation $a \to a + a_s$, $b \to b + b_s$. This corresponds to a translation along the real and the imaginary axis. As there is one pole, of order $N$, at $a = E_\varepsilon$ we have to choose $a_s = \mathcal{E}$ for the translation in $a$ (see Fig. 1). For $b$ there is no singularity, hence we will compute the contributions from both saddles. Note that both saddles for $b$ have the same imaginary part, therefore by translating the contour in the imaginary direction we can pass through both saddles (see Fig. 2). We will see that the dominant contribution comes from the saddle $a_s = \mathcal{E}$, $b_s = -i\mathcal{E}$. Nevertheless the second saddle has an oscillating factor $\exp[iN F(E)]$ for some function $F(E)$. When computing high-order derivatives in $E$ this factor becomes dominant and diverges for $N \to \infty$. Therefore we perform the following translations
$$a \to a + \mathcal{E} , \qquad b \to b - i\mathcal{E} . \tag{4.8}$$
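Fig. 1. $a$ has a singularity in $E_\varepsilon$ and two saddles; the contour is chosen so as not to cross the singularity and the limit $L \to \infty$ is taken. (Figure not reproduced: contour in the complex $a$ plane for $E > 0$, axes $\operatorname{Re}(a)$ and $\operatorname{Im}(a)$.)
Fig. 2. $b$ has two saddles but no singularity, the contour passes through both saddles; the limit $L \to \infty$ is taken. (Figure not reproduced: contour in the complex $b$ plane for $E > 0$, axes $\operatorname{Re}(b)$ and $\operatorname{Im}(b)$.)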
Note that the boundary terms (see Figs. 1 and 2) disappear thanks to the exponential decay. After that the saddle points are $a = 0$, $b = 0$ (dominant saddle) and $a = 0$, $b = 2E_i$. Note that now we can take $\varepsilon = 0$ as the imaginary part of $\mathcal{E}$ ensures that there is no pole.
4.3. Optimization of the contour
We recall that when performing saddle point analysis in the complex plane, one should choose an integration contour passing through the saddle such that along that contour the function is maximum in absolute value only at the saddle point. Actually we have taken the easiest possible contour (simple translation), therefore we have to check that the maximum in absolute value along this contour is really at the saddle. We then have to study the minimum of
$$\operatorname{Re} f = \operatorname{Re}[ f_1(a) + f_2(b) ] \tag{4.9}$$
where
$$f_1(a) = \frac{a^2}{2} + \mathcal{E} a + \ln(\mathcal{E}^* - a) \tag{4.10}$$
$$f_2(b) = \frac{b^2}{2} - i\mathcal{E} b - \ln(\mathcal{E}^* - ib) . \tag{4.11}$$
Lemma 2. The real part of $f_1(a)$ has a unique minimum (at $a = 0$) only for energies $|E| \le 4\sqrt{2}/3$. For energies $4\sqrt{2}/3 < |E| < 2$ a second minimum appears at $a = E/4 + (1/2)\sqrt{E^2 \frac{9}{4} - 8} \to 1$ when $|E| \to 2$ (for $E = 2$ there is a singularity at $a = 1$ as $\varepsilon = 0$). This second minimum at $|E| = 4\sqrt{2}/3$ is nearly as high as the first one and becomes dominant as soon as the energy grows. On the other hand, the real part of $f_2(b)$ has two minima of height 0 at the points $b = 0$ and $b = 2E_i$, for any energy $|E| < 2$. These are exactly the positions of the two saddles.
Proof. The expressions for $\operatorname{Re} f_1$ and $\operatorname{Re} f_2$ are
$$\operatorname{Re} f_1 = \frac{a^2}{2} + E_r a + \frac{1}{2} \ln(1 - 2 E_r a + a^2) \tag{4.12}$$
$$\operatorname{Re} f_2 = \frac{b^2}{2} - E_i b - \frac{1}{2} \ln(1 - 2 E_i b + b^2) . \tag{4.13}$$
To study the minima we have to compute the first derivative. For $f_1$ we have
$$\operatorname{Re} f_1' = a + \frac{E}{2} + \frac{1}{2} \frac{(-E + 2a)}{(1 - E a + a^2)} = \frac{a}{(1 - E a + a^2)} \Big[ a^2 - \frac{E}{2} a + 2 \Big( 1 - \frac{E^2}{4} \Big) \Big] . \tag{4.14}$$
In order to have only one saddle the discriminant for the last parenthesis must be negative
$$\Delta = E^2 \frac{9}{4} - 8 \le 0 \qquad \text{if} \qquad E^2 \le \frac{32}{9} . \tag{4.15}$$
For larger energies a second minimum appears at $E/4 + (1/2)\sqrt{\Delta}$. Repeating the same analysis we see that $\operatorname{Re} f_2$ has two minima of the same height exactly at the two saddle points.
Therefore, for energies inside $|E|^2 \le 32/9$ the translation we performed is good, but for larger energies the singularity for $a$ reappears. In order to approach the boundary region we must change the contour to keep away from the singularity. This will be considered in Sec. 5.1.2. It will turn out that by performing an additional rotation on $a$ we can solve the problem. The equation to study is then
$$-\frac{1}{\pi} \operatorname{Im} \frac{N}{2\pi} \int da \, db \, e^{-\frac{N}{2}[(a^2 + b^2) + 2\mathcal{E}(a - ib)]} (a + \mathcal{E}) \frac{(\mathcal{E}^* - ib)^N}{(\mathcal{E}^* - a)^N} \Big[ 1 - \frac{1}{(\mathcal{E}^* - a)(\mathcal{E}^* - ib)} \Big] = \rho_{SC}(E) - \frac{1}{\pi} \operatorname{Im} R_N(E) \tag{4.16}$$
where $\rho_{SC}(E)$ is defined in (1.6) and
$$R_N(E) = \frac{N}{2\pi} \int da \, db \, e^{-\frac{N}{2}[(a^2 + b^2) + 2\mathcal{E}(a - ib)]} \, a \, \frac{(\mathcal{E}^* - ib)^N}{(\mathcal{E}^* - a)^N} \Big[ 1 - \frac{1}{(\mathcal{E}^* - a)(\mathcal{E}^* - ib)} \Big] . \tag{4.17}$$
Note that in the first term we applied (3.2). Therefore we only have to analyze the remainder $R_N(E)$.
Fig. 3. Behavior of $|e^{-f_1(a)}|$ at different energies (panels $E = 0$, $E = 1.9$, $E = 1.95$); note that when $|E|$ gets near to 2 a second maximum appears, due to the singularity. (Plots not reproduced.)
Fig. 4. Behavior of $|e^{-f_2(b)}|$ at different energies (panels $E = 0$, $E = 0.5$, $E = 1.7$); note that when $|E|$ gets near 2 the two maxima tend to collapse. (Plots not reproduced.)
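The statements of Lemma 2 are easy to reproduce numerically. The sketch below scans $\operatorname{Re} f_1(a) = \frac{a^2}{2} + \frac{E}{2}a + \frac{1}{2}\ln(1 - Ea + a^2)$ and $\operatorname{Re} f_2(b) = \frac{b^2}{2} - E_i b - \frac{1}{2}\ln(1 - 2E_i b + b^2)$ on a grid and lists their local minima; the grid, the sample energies and the helper names are illustrative choices, and positive $E$ is assumed.

```python
import math

def re_f1(a, E):
    """Real part of f1 along the translated a contour (real a)."""
    return a * a / 2.0 + (E / 2.0) * a + 0.5 * math.log(1.0 - E * a + a * a)

def re_f2(b, E):
    """Real part of f2 along the translated b contour (real b)."""
    e_i = math.sqrt(1.0 - E * E / 4.0)
    return b * b / 2.0 - e_i * b - 0.5 * math.log(1.0 - 2.0 * e_i * b + b * b)

def local_minima(g, lo=-3.0, hi=3.0, steps=6000):
    """Grid points where g is strictly smaller than at both neighbours."""
    h = (hi - lo) / steps
    xs = [lo + i * h for i in range(steps + 1)]
    vals = [g(x) for x in xs]
    return [xs[i] for i in range(1, steps)
            if vals[i] < vals[i - 1] and vals[i] < vals[i + 1]]

threshold = 4.0 * math.sqrt(2.0) / 3.0                    # about 1.8856
below = local_minima(lambda a: re_f1(a, 1.8))             # E below the threshold
above = local_minima(lambda a: re_f1(a, 1.95))            # E above the threshold
predicted = 1.95 / 4.0 + 0.5 * math.sqrt(2.25 * 1.95 ** 2 - 8.0)
b_minima = local_minima(lambda b: re_f2(b, 1.0))
```

For $E = 1.95$ the second minimum of $\operatorname{Re} f_1$ is already lower than the value at $a = 0$, which is the situation shown in the last panel of Fig. 3.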
Remark. After the translation we have
$$|\mathcal{E}^* - a| \ge E_i , \qquad |\mathcal{E}^* - ib| \ge |E_r| . \tag{4.18}$$
5. Proof of Theorem 1
5.1. Bulk region
We first consider the energy region $|E| \le 2 - \eta$ for some small constant $0 < \eta \ll 1$. In this region we want to prove that
$$R_N(E) = -\frac{1}{N} \frac{1}{|C|} K_N \frac{\mathcal{E}}{C} + O\big( N^{-\frac{3}{2}} \big) . \tag{5.1}$$
Actually we have to cut this region in two parts:
• far from the edge: this corresponds to the energy region $\{E \mid \eta < |E| < 4\sqrt{2}/3\}$, where the contour translation we made is enough;
• near the edge: this corresponds to the energy region $\{E \mid 4\sqrt{2}/3 < |E| < 2 - \eta\}$, where an additional rotation of the contour must be performed.
5.1.1. Region far from the edge
In the following we assume the additional restriction $|E| > \eta$. This is only for technical reasons. Actually, for $E = 0$ a singularity seems to appear at $b = 1$. This is not a real singularity as any factor $1/(\mathcal{E}^* - ib)$ is always compensated by another factor in the numerator. At the end of this section we will show how to treat the case $E \simeq 0$. We partition the integration region for $b$ in three regions:
$$\int db = \sum_{i=1}^{3} \int_{I_i} db \tag{5.2}$$
December 23, 2004 10:36 WSPC/148-RMP
00222
Density of States for GUE through Supersymmetric Approach
1205
where I1 := {b | |b| ≤ N −α , } 1 I2 := b | |b − 2Ei | ≤ α , N 1 1 I3 := b | |b| > α , |b − 2Ei | > α , N N
(5.3)
where $\frac12>\alpha>\frac13$ is fixed. These intervals select the region around the first saddle ($I_1$), the second saddle ($I_2$), and far from both ($I_3$).

• Contribution from the dominant saddle. We consider the contribution from the first saddle. Therefore we restrict the integration for b to $I_1$ defined above.

Lemma 3. The contribution for N large to (4.17) restricted to $I_1$ is
$$[R_N(E)]_{I_1} = 0 + O\left(N^{-3/2}\right). \tag{5.4}$$

Proof. We expand around the saddle and extract the Hessian:
$$\frac{a^2}{2}+\mathcal{E}a+\ln(\mathcal{E}^*-a) = C\frac{a^2}{2}-a^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}^*-ta)^3}+\ln\mathcal{E}^* \tag{5.5}$$
$$\frac{b^2}{2}-\mathcal{E}ib-\ln(\mathcal{E}^*-ib) = C\frac{b^2}{2}+(ib)^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}^*-tib)^3}-\ln\mathcal{E}^* \tag{5.6}$$
where we remember that $\mathcal{E}$, $\mathcal{E}^*$ and $C$ were defined in (4.3), (4.4) and (4.6). Therefore (4.17) becomes
$$\frac{N}{2\pi}\int da\int_{I_1} db\; e^{-\frac{NC}{2}(a^2+b^2)+N[V(a)-V(ib)]}\; a\left[1-\frac{1}{(\mathcal{E}^*-a)(\mathcal{E}^*-ib)}\right] \tag{5.7}$$
where we define
$$V(x) := x^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}^*-tx)^3}. \tag{5.8}$$
Now we perform a few steps of perturbative expansion in a and b in order to extract the contribution 1/N. Note that
$$1-\frac{1}{(\mathcal{E}^*-a)(\mathcal{E}^*-ib)} = [C-(a+ib)\mathcal{E}^3]-\int_0^1 ds\,(1-s)\frac{d^2}{ds^2}\frac{1}{(\mathcal{E}^*-as)(\mathcal{E}^*-ibs)} \tag{5.9}$$
$$e^{N[V(a)-V(ib)]} = 1+N[V(a)-V(ib)]+\int_0^1 ds\,(1-s)\frac{d^2}{ds^2}e^{Ns[V(a)-V(ib)]} \tag{5.10}$$
$$[V(a)-V(ib)] = \frac{\mathcal{E}^3}{3}[a^3-(ib)^3]+\int_0^1 ds\,(1-s)^3\left[\frac{a^4}{(\mathcal{E}^*-sa)^4}-\frac{(ib)^4}{(\mathcal{E}^*-sib)^4}\right]. \tag{5.11}$$
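Inserting these equations we can write
$$[R_N(E)]_{I_1} = R_1+R_2+R_3+R_4 \tag{5.12}$$
where the four terms are:
$$R_1 = \frac{N}{2\pi}\int da\int_{I_1} db\; e^{-\frac{NC}{2}(a^2+b^2)}\,aC = 0 \tag{5.13}$$
$$R_2 = \frac{N}{2\pi}\int da\int_{I_1} db\; e^{-\frac{NC}{2}(a^2+b^2)}\left\{a[-a-ib]\mathcal{E}^3+CaN\frac{\mathcal{E}^3}{3}[a^3-(ib)^3]\right\} = 0 \tag{5.14}$$
$$R_3 = -\frac{N}{2\pi}\int da\int_{I_1} db\; e^{-\frac{NC}{2}(a^2+b^2)+N[V(a)-V(ib)]}\,a\int_0^1 ds\,(1-s)\frac{d^2}{ds^2}\frac{1}{(\mathcal{E}^*-as)(\mathcal{E}^*-ibs)}, \tag{5.15}$$
$$R_4 = \frac{N}{2\pi}\int da\int_{I_1} db\; e^{-\frac{NC}{2}(a^2+b^2)}\Big\{-a(a+ib)\mathcal{E}^3N\frac{\mathcal{E}^3}{3}[a^3-(ib)^3]+a[C-(a+ib)\mathcal{E}^3]\int_0^1 ds\,(1-s)\frac{d^2}{ds^2}e^{Ns[V(a)-V(ib)]}$$
$$\qquad\qquad+\,a[C-(a+ib)\mathcal{E}^3]N\int_0^1 ds\,(1-s)^3\left[\frac{a^4}{(\mathcal{E}^*-sa)^4}-\frac{(ib)^4}{(\mathcal{E}^*-sib)^4}\right]\Big\}. \tag{5.16}$$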
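Note that $R_1$ and $R_2$ are exactly zero. In order to bound $R_3$ and $R_4$ we apply the following estimates:
$$\left|e^{-Nf_1(a)}\right| \le e^{-N\,\mathrm{Re}\,C\,f_a\frac{a^2}{2}}\quad\forall\,a,\qquad \left|e^{NV(ib)}\right| \le e^{O(N^{1-3\alpha})}\le K\quad\forall\,b\in I_1 \tag{5.17}$$
where $0<f_a\ll1$, $\alpha>1/3$ and K is some constant. Finally note that
$$\left|\frac{d^2}{ds^2}\frac{1}{(\mathcal{E}^*-as)(\mathcal{E}^*-ibs)}\right| \le K\left[|a|^2+|ab|+|b|^2\right] \tag{5.18}$$
where we applied (4.18). Therefore by inserting the absolute value in the integrals we have
$$|R_3| \le K\frac{N}{2\pi}\int da\,db\; e^{-\frac{N2\mathcal{E}_i^2}{2}(f_aa^2+b^2)}\left[|a^3|+|ab^2|+|a^2b|\right] = O\left(N^{-3/2}\right) \tag{5.19}$$
$$|R_4| \le K\frac{N}{2\pi}\int da\,db\; e^{-\frac{N2\mathcal{E}_i^2}{2}(f_aa^2+b^2)}\left[N|a|^5+|a|^6N+|a|^7N^2\right]+\cdots = O\left(N^{-3/2}\right) \tag{5.20}$$
where $\mathrm{Re}\,C = 2\mathcal{E}_i^2$.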
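The exact vanishing of $R_2$ rests on a cancellation between Gaussian moments in the a variable: the term $-a^2\mathcal{E}^3$ is compensated by $CN\frac{\mathcal{E}^3}{3}a^4$, while the terms odd in a or b vanish by symmetry. The sketch below checks this cancellation by plain trapezoid quadrature, under the assumption $\mathcal{E} = E/2-i\sqrt{1-E^2/4}$, $C = 1-\mathcal{E}^2$; the helper names are hypothetical and only the a-integral is examined.

```python
import cmath
import math

def gaussian_moment(k, N, C, half_width=12.0, steps=20001):
    """Trapezoid approximation of the integral of a^k * exp(-N*C*a^2/2) over real a."""
    L = half_width / math.sqrt(N * C.real)   # cut-off: integrand is ~exp(-72) there
    h = 2.0 * L / (steps - 1)
    total = 0j
    for j in range(steps):
        a = -L + j * h
        w = 0.5 if j in (0, steps - 1) else 1.0
        total += w * a ** k * cmath.exp(-0.5 * N * C * a * a)
    return total * h

def r2_a_part(N, E):
    """Return (combination, m2): the a-dependent part of R2 and the second moment."""
    cal_e = complex(E / 2.0, -math.sqrt(1.0 - E * E / 4.0))  # assumed saddle
    C = 1.0 - cal_e * cal_e
    m2 = gaussian_moment(2, N, C)
    m4 = gaussian_moment(4, N, C)
    combo = -cal_e ** 3 * m2 + C * N * (cal_e ** 3 / 3.0) * m4
    return combo, m2

combo, m2 = r2_a_part(40, 1.0)
print(abs(combo), abs(m2))
```

The first printed number should be negligible compared with the second, which reflects $\langle a^4\rangle = 3\langle a^2\rangle/(NC)$ for the complex Gaussian weight.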
• Contribution from the second saddle. We consider the contribution from the second saddle. Therefore we restrict b to the region $I_2$ defined in (5.3). Then we have the following result:

Lemma 4. The contribution for N large to (4.17) restricted to $I_2$ is
$$[R_N(E)]_{I_2} = 0-\frac{1}{N}K_N\frac{1}{|C|}\frac{\mathcal{E}}{C}+O\left(N^{-3/2}\right) \tag{5.21}$$
where $\mathcal{E}$ is defined in (4.4), $C = 1-\mathcal{E}^2$ and $K_N$ is the phase factor defined in (2.2).

Proof. First we perform a real translation on b in order to be centered around the second saddle
$$b\to b+2\mathcal{E}_i \tag{5.22}$$
so that $I_2$ is an interval centered on zero. The integral (4.17) becomes
$$[R_N(E)]_{I_2} = e^{2Ni\mathcal{E}_i\mathcal{E}_r}\frac{N}{2\pi}\int da\int_{|b|\le N^{-\alpha}} db\; e^{-\frac{N}{2}[(a^2+b^2)+2\mathcal{E}a-2\mathcal{E}^*ib]}\; a\,\frac{(\mathcal{E}-ib)^N}{(\mathcal{E}^*-a)^N}\left[1-\frac{1}{(\mathcal{E}^*-a)(\mathcal{E}-ib)}\right]. \tag{5.23}$$
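As before we expand around the saddle:
$$\frac{a^2}{2}+\mathcal{E}a+\ln(\mathcal{E}^*-a) = C\frac{a^2}{2}-a^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}^*-ta)^3}+\ln\mathcal{E}^* \tag{5.24}$$
$$\frac{b^2}{2}-\mathcal{E}^*ib-\ln(\mathcal{E}-ib) = C^*\frac{b^2}{2}+(ib)^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}-tib)^3}-\ln\mathcal{E}. \tag{5.25}$$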
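Therefore (4.17) becomes
$$K_N\frac{N}{2\pi}\int da\int_{|b|\le N^{-\alpha}} db\; e^{-\frac{N}{2}(Ca^2+C^*b^2)+N[V_1(a)-V_2(ib)]}\; a\left[1-\frac{1}{(\mathcal{E}^*-a)(\mathcal{E}-ib)}\right] \tag{5.26}$$
where we define
$$V_1(x) := x^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}^*-tx)^3} \tag{5.27}$$
$$V_2(x) := x^3\int_0^1 dt\,\frac{(1-t)^2}{(\mathcal{E}-tx)^3}. \tag{5.28}$$
Note that
$$\left[1-\frac{1}{(\mathcal{E}^*-a)(\mathcal{E}-ib)}\right]_{a=b=0} = 1-\mathcal{E}\mathcal{E}^* = 0 \tag{5.29}$$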
therefore the perturbative expansion starts directly with a term of order $a^2$. Repeating the same analysis as for the first saddle we can extract the contribution of order 1/N:
$$[R_N(E)]_{I_2} = R_1+R_2 \tag{5.30}$$
where
$$R_1 = -K_N\frac{N}{2\pi}\int da\int_{|b|\le N^{-\alpha}} db\; e^{-\frac{N}{2}(a^2C+b^2C^*)}\,a[a\mathcal{E}+ib\mathcal{E}^*] = -K_N\frac{1}{N}\frac{1}{|C|}\frac{\mathcal{E}}{C}+O\left(e^{-N^{1-2\alpha}}\right) \tag{5.31}$$
$$R_2 = -K_N\frac{N}{2\pi}\int da\int_{|b|\le N^{-\alpha}} db\; e^{-\frac{N}{2}(a^2C+b^2C^*)}\Big\{a\,e^{N[V_1(a)-V_2(ib)]}\int_0^1 ds\,(1-s)\frac{d^2}{ds^2}\frac{1}{(\mathcal{E}^*-as)(\mathcal{E}-ibs)}$$
$$\qquad\qquad+\,a[a\mathcal{E}+ib\mathcal{E}^*]\int_0^1 ds\,\frac{d}{ds}e^{Ns[V_1(a)-V_2(ib)]}\Big\} = O\left(N^{-3/2}\right) \tag{5.32}$$
where the remainder $R_2$ is bounded as in the section above.
• Contribution from the region far from both saddles. Finally we consider the integral on the region $I_3$ defined in (5.3). For the a integral we apply (5.17), while for the b integral we have
$$\left|e^{-Nf_2(b)}\right| \le e^{-\frac{N}{2}b^2Cf_b}\; e^{-cNN^{-2\alpha}}\; e^{NO(f_b)} \tag{5.33}$$
where the last factor comes from the second saddle. In order to make it small we take $f_b = c/N$ for $0<c$ some small constant. This means we have lost most of the mass for b. The exponential factor $\exp[-cNN^{-2\alpha}]$ appears because we are far from both saddles. Note that this factor is small only if α < 1/2. Therefore the integral is bounded by
$$\left|[R_N(E)]_{I_3}\right| \le KN\,e^{-cN^{1-2\alpha}}\int da\int_{I_3} db\; e^{-\frac12[NCf_aa^2+Ccb^2]}\,|a| \le KN\,e^{-cN^{1-2\alpha}}. \tag{5.34}$$
• Center of the spectrum. For $E\simeq0$ we can perform the same analysis as before, except in the region near $b=\mathcal{E}_i\simeq1$. Note that this is far from both saddles, therefore we do not have to perform any perturbative expansion, as we only want to ensure that we have a small factor. To see that in reality there are no poles we write the integral over b as a difference of two integrals where no singularity appears:
$$\int_{|b-1|\le1/2} db\; e^{-\frac{N}{2}[b^2-2b]}\,i^N(1-b)^N-\int_{|b-1|\le1/2} db\; e^{-\frac{N}{2}[b^2-2b]}\,\frac{i^{N-1}(1-b)^{N-1}}{(i-a)} \tag{5.35}$$
where we take the approximation E = 0. It is easy to see that b = 1 is actually a minimum and that these integrals are bounded by a factor exp −cN . The integral over a is then performed as before.
5.1.2. Approaching the edge
In order to lift the restriction $|E|<4\sqrt{2}/3$ we have to change the integration contour for a. Note that the problem arises because, when E gets near ±2, the contour we have chosen gets near the real axis, where for ε = 0 there is a singularity. We could translate again in the imaginary direction to keep it away, but then we would no longer pass through the saddle point. The second easiest thing to do after a translation is a rotation, so we try to rotate the contour. The rotation must satisfy two properties:
• we must keep far from the singularity even when |E| → 2,
• the integrand after the rotation must be maximum in absolute value only at a = 0, that is at the saddle point.

Therefore we consider the closed contour $\mathcal{C} = [-L,L]\cup A_L\cup[-L,L]e^{-i\theta\sigma}\cup A_{-L}$ (see Fig. 5), where $A_L$ and $A_{-L}$ are arcs connecting the interval [−L, L] with its rotated image, $0<\theta<\frac{\pi}{4}$, and σ = sign(E) is inserted in order to ensure we do not cross the singularity when moving the contour. The constraint θ < π/4 ensures that the real part of the quadratic term in the exponential does not change sign. Therefore when L → ∞ the contributions from $A_L$ and $A_{-L}$ vanish and we remain with
$$R_N(E) = \frac{N}{2\pi}\int da\,db\; e^{-\frac{N}{2}[(a^2e^{-2i\theta\sigma}+b^2)+2\mathcal{E}(ae^{-i\theta\sigma}-ib)]}\,e^{-i\theta\sigma}\; a\,\frac{(\mathcal{E}^*-ib)^N}{(\mathcal{E}^*-ae^{-i\theta\sigma})^N}\left[1-\frac{1}{(\mathcal{E}^*-ae^{-i\theta\sigma})(\mathcal{E}^*-ib)}\right]. \tag{5.36}$$
Note that the denominator is
$$(\mathcal{E}^*-ae^{-i\theta\sigma}) = \mathcal{E}_r-a\cos\theta+i(\mathcal{E}_i+a\sigma\sin\theta). \tag{5.37}$$

Fig. 5. Rotation for the a contour; the limit L → ∞ is taken.
Now we can write $\mathcal{E} = \sigma e^{-i\sigma\phi}$, where $\phi = \arcsin\mathcal{E}_i$ and $\frac{\pi}{2}\ge\phi\ge0$. Note that, as |E| is near 2, the phase φ is small. Then
$$\left|\mathcal{E}^*-ae^{-i\theta\sigma}\right| \ge 1-[\cos(\theta+\phi)]^2 > 0 \tag{5.38}$$
for π > θ + φ > 0. Now we require that $\mathrm{Re}\,f_1(ae^{-i\theta\sigma})$ is minimum only at a = 0. This means we have to study
$$\mathrm{Re}\,f_1 = \frac{a^2}{2}\cos2\theta+a\sigma\cos(\theta+\phi)+\frac12\ln[1-2\sigma\cos(\theta+\phi)a+a^2]. \tag{5.39}$$
In order to study the minimum we have to compute the first derivative:
$$\mathrm{Re}\,f_1' = a\cos2\theta+\sigma\cos(\theta+\phi)+\frac12\frac{(-2\sigma\cos(\theta+\phi)+2a)}{(1-2\sigma\cos(\theta+\phi)a+a^2)}$$
$$\qquad = \frac{a}{(1-2\sigma\cos(\theta+\phi)a+a^2)}\left\{a^2\cos2\theta-2a\sigma\cos(\theta+\phi)\left[\cos2\theta-\frac12\right]+2\left[\cos^2\theta-\cos^2(\theta+\phi)\right]\right\}. \tag{5.40}$$
This expression becomes (4.14) for θ = 0. Note that $\cos^2\theta-\cos^2(\theta+\phi)>0$ for φ ≠ 0. In order to have only one saddle the discriminant for the last parenthesis must be negative for all energies near |E| = 2:
$$\Delta = \left[\cos2\theta-\frac12\right]^2\cos^2(\theta+\phi)-2\cos2\theta\left[\cos^2\theta-\cos^2(\theta+\phi)\right]. \tag{5.41}$$
For $|E|\simeq2$ we have $\phi\simeq0$ and
$$\Delta \simeq \left[\cos2\theta-\frac12\right]^2\cos^2\theta. \tag{5.42}$$
Therefore
$$\Delta<0\quad\forall\; 4\sqrt{2}/3<|E|<2 \;\Leftrightarrow\; \cos2\theta-\frac12 = 0, \tag{5.43}$$
which means $\theta = \frac{\pi}{6}$. For this choice Δ becomes
$$\Delta = -\left[\cos^2\theta-\cos^2(\theta+\phi)\right] < 0 \tag{5.44}$$
for all φ > 0 and (5.38) becomes
$$\left|\mathcal{E}^*-ae^{-i\theta\sigma}\right| \ge \frac14+O(\phi) > 0. \tag{5.45}$$
Now we can perform the analysis exactly as in the previous section. In order to bound the a integral note that (5.17) still applies, but with a different mass
$$\left|e^{-Nf_1(ae^{-i\sigma\theta})}\right| \le e^{-N\mathcal{E}_if_a\frac{a^2}{2}} \tag{5.46}$$
for some $0<f_a\ll1$. We remark that in (5.17) the mass was $\mathrm{Re}\,C = 2\mathcal{E}_i^2\ll\mathcal{E}_i$ when we get near to the edge. Therefore the rotation gives a larger mass.
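The role of the threshold $4\sqrt{2}/3$ and of the angle θ = π/6 can be illustrated by evaluating the discriminant (5.41) directly. This is a minimal sketch, assuming $\cos\phi = |E|/2$ (which follows from $\mathcal{E} = \sigma e^{-i\sigma\phi}$ with $\mathcal{E}_r = E/2$); the function names are illustrative.

```python
import math

def phi_of_energy(E):
    # assumption: cal_e = sigma*exp(-i*sigma*phi), so cos(phi) = |E|/2
    return math.acos(abs(E) / 2.0)

def discriminant(theta, phi):
    """Discriminant (5.41) of the quadratic factor in Re f1'."""
    c2 = math.cos(theta + phi) ** 2
    return ((math.cos(2.0 * theta) - 0.5) ** 2 * c2
            - 2.0 * math.cos(2.0 * theta) * (math.cos(theta) ** 2 - c2))

edge = 4.0 * math.sqrt(2.0) / 3.0
for E in (1.5, 1.8, edge, 1.95, 1.999):
    phi = phi_of_energy(E)
    print(f"E={E:.4f}  Delta(theta=0)={discriminant(0.0, phi):+.5f}  "
          f"Delta(theta=pi/6)={discriminant(math.pi / 6.0, phi):+.5f}")
```

Without rotation the discriminant changes sign at $|E| = 4\sqrt{2}/3$, while for θ = π/6 it stays negative in the sampled energies up to the edge.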
6. Energies in the Boundary Region. Proof of Theorem 2

The analysis above works well only for energies such that $\mathcal{E}_i>\eta$ for η > 0 some small constant. This means we cannot get too near to the edge of the spectrum. When $\mathcal{E}_i = O(N^{-\beta})$ for some β > 0 the Hessian becomes very small and starts to affect the power counting (until now it has been treated as a constant). To see that we consider the new energy variable
$$E = \sigma\left(2-\frac{t}{N^{2\beta}}\right). \tag{6.1}$$
In terms of this new variable $\mathcal{E}_i = \sqrt{t}\,N^{-\beta}+O(N^{-3\beta})$. Inserting this in the two dominant terms for $\bar\rho_N(E)$ we have
$$\rho_{SC}(E) = O(N^{-\beta}),\qquad \frac{1}{N}K_N\frac{1}{|C|}\frac{\mathcal{E}}{C} = O\left(N^{-(1-2\beta)}\right). \tag{6.2}$$
In this section we consider only the region β < 1/3. Here the dominant contribution is still the semicircle law and the perturbative contributions can be computed as usual. On the other hand, in order to bound the remainders we will have to perform a rotation also in the b integral. Note that the dominant contribution (semicircle law) has the same size as the first-order correction for β = 1/3. Therefore for β ≥ 1/3 the perturbative analysis we have applied until now is no longer valid (the perturbation is at least as large as the dominant term) and we have to modify it. We will see this in the next section.

6.1. Rotation for b

Before the rotation the Hessian for b at the saddle is C or C*. This means that the mass is $\mathrm{Re}\,C = 2\mathcal{E}_i^2 = O(N^{-2\beta})$ (as for a before the rotation). But $|C| = O(N^{-\beta})$, therefore by rotating the contour we can get a larger mass term and a better decay. As in the case of a, the rotation must leave the mass term positive and the maximum in absolute value must still be only on the two saddles. If we consider the rotated contour parametrized by $be^{-i\theta\sigma}$ with $b\in\mathbb{R}$, the mass terms around the two saddles are
$$\mathrm{Hess}_{\mathcal{E}} = \mathrm{Re}\,e^{-i2\theta\sigma}C = 2\mathcal{E}_i^2\cos2\theta+E\mathcal{E}_i\sigma\sin2\theta \simeq \sin2\theta\,\sqrt{t}\,N^{-\beta}$$
$$\mathrm{Hess}_{\mathcal{E}^*} = \mathrm{Re}\,e^{-i2\theta\sigma}C^* = 2\mathcal{E}_i^2\cos2\theta-E\mathcal{E}_i\sigma\sin2\theta \simeq -\sin2\theta\,\sqrt{t}\,N^{-\beta}. \tag{6.3}$$
Note that
$$\mathrm{Hess}_{\mathcal{E}}>0\quad\forall\; 0<\theta<\frac{\pi}{2},\qquad \mathrm{Hess}_{\mathcal{E}^*}>0\quad\forall\; -\frac{\pi}{2}<\theta<0. \tag{6.4}$$
Therefore we take the contour as in Fig. 6. When L → ∞ we see that, as for the a integral, the contributions from the two arcs $A_L$ and $A_{-L}$ vanish. Therefore we remain with the three intervals $I_1$ (rotation around the first saddle), $I_2$ (rotation around the second saddle) and $I_3$ (region between the saddles).
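The scalings quoted in this subsection are easy to probe numerically. The sketch below assumes, as before, $\mathcal{E} = E/2-i\sqrt{1-E^2/4}$ and $C = 1-\mathcal{E}^2$ (the defining equations are outside this section), and compares the rotated masses $\mathrm{Re}\,e^{-i2\theta\sigma}C$ and $\mathrm{Re}\,e^{-i2\theta\sigma}C^*$ with the closed forms in (6.3); the function and key names are hypothetical.

```python
import cmath
import math

def boundary_quantities(N, beta, t, theta, sigma=1):
    """Evaluate E_i and the rotated Hessians for E = sigma*(2 - t/N^(2*beta))."""
    E = sigma * (2.0 - t / N ** (2.0 * beta))
    e_i = math.sqrt(1.0 - E * E / 4.0)
    cal_e = complex(E / 2.0, -e_i)           # assumed form of the saddle
    C = 1.0 - cal_e * cal_e
    rot = cmath.exp(-2j * theta * sigma)
    return {
        "e_i": e_i,
        "e_i_leading": math.sqrt(t) * N ** (-beta),
        "hess_e": (rot * C).real,
        "hess_e_star": (rot * C.conjugate()).real,
        "hess_e_closed": 2 * e_i ** 2 * math.cos(2 * theta) + E * e_i * sigma * math.sin(2 * theta),
        "hess_e_star_closed": 2 * e_i ** 2 * math.cos(2 * theta) - E * e_i * sigma * math.sin(2 * theta),
        "unrotated_mass": C.real,
    }

q = boundary_quantities(N=10 ** 6, beta=0.25, t=1.0, theta=0.3)
print(q)
```

For these sample values the rotated mass is of order $N^{-\beta}$, much larger than the unrotated mass $2\mathcal{E}_i^2$ of order $N^{-2\beta}$.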
Fig. 6. Rotation for the b contour.
6.2. Rotation for b around the first saddle (contour I1)

We consider the contour $(-\infty,\mathcal{E}_i]e^{-i\theta\sigma}$ with θ > 0. We remember that $\mathcal{E}_i$ is the point between the two saddles where the minimum occurs. The expression for $f_2(b)$ becomes
$$\mathrm{Re}\,f_2(be^{i\theta\sigma}) = \frac{b^2}{2}\cos(2\theta)-b\sin(\theta+\phi)-\frac12\ln[1-2b\sin(\theta+\phi)+b^2]. \tag{6.5}$$
The equation for the minimum is
$$\frac{b}{b^2-2b\sin(\theta+\phi)+1}\left\{b^2\cos2\theta-b(2\cos2\theta+1)\sin(\theta+\phi)+2\left[\sin^2(\theta+\phi)-\sin^2\theta\right]\right\}. \tag{6.6}$$
For θ = 0 this expression takes value zero in three points: b = 0, $2\mathcal{E}_i$ (minima) and $\mathcal{E}_i$ (relative maximum). For θ ≠ 0, we want to ensure that b = 0 is still a maximum in the interval $(-\infty,\mathcal{E}_i]$. We write down the solutions
$$b = \frac{1}{2x_1}\left[x_2\pm\sqrt{x_2^2-4x_1x_3}\right] \tag{6.7}$$
where
$$x_1 = \cos2\theta,\qquad x_2 = (2\cos2\theta+1)\sin(\theta+\phi),\qquad x_3 = 2\left[\sin^2(\theta+\phi)-\sin^2\theta\right]. \tag{6.8}$$
Both solutions are on the positive real axis if $x_2/x_1>0$ and $x_1x_3>0$. This is true only if cos(2θ) > 0, that means θ < π/4. Now the new minimum position is
$$\frac{1}{2x_1}\left[x_2-\sqrt{x_2^2-4x_1x_3}\right] \simeq \frac{1}{4x_1x_2}\,4x_1x_3 = O(\phi) = O(\mathcal{E}_i) \tag{6.9}$$
therefore the minimum is still at a distance $O(\mathcal{E}_i)$ from zero, so it is basically unchanged. Note that on this interval there is only one maximum, therefore we can extract a small fraction $f_b\mathcal{E}_i$ of the quadratic decay to ensure our bounds, for some constant $0<f_b\ll1$.

6.3. Rotation for b around the second saddle (contour I2)

For $b\ge\mathcal{E}_i$ we translate to the second saddle and then we perform the rotation. Therefore we actually consider the contour $[-\mathcal{E}_i,\infty)e^{+i\theta\sigma}$. The equations are the same as before except that φ → −φ. That is why we also have to change the sign of θ.

6.4. Region around the minimum (contour I3)

Finally, to close the contour, we must add the interval parametrized by $b-i\sigma\mathcal{E}_i\sin\theta$ with $\mathcal{E}_i(1-\cos\theta)\le b\le\mathcal{E}_i(1+\cos\theta)$.

6.5. Bound

Now we can perform the same analysis as before. In the regions $|b|\le N^{-\alpha}$ and $|b-2\mathcal{E}_i|\le N^{-\alpha}$, for α > 1/3, the analysis is as before. In the regions far from both saddles we have three different bounds depending on whether we are on $I_1$, $I_2$ or $I_3$:
$$\left|e^{-Nf_2(be^{-i\theta\sigma})}\right|_{b\le\mathcal{E}_i,\,|b|>N^{-\alpha}} \le e^{-\frac12f_bN\,\mathrm{Re}[Ce^{-2i\sigma\theta}]b^2}\; e^{-cN\,\mathrm{Re}[Ce^{-2i\sigma\theta}]b^2}\Big|_{b=N^{-\alpha}} \le e^{-\frac12f_bN^{1-\beta}b^2}\; e^{-cN^{1-\beta-2\alpha}} \tag{6.10}$$
where $0<f_b\ll1$, $\alpha>\frac13$, $\beta<\frac13$,
$$\left|e^{-Nf_2((b-2\mathcal{E}_i)e^{i\theta\sigma})}\right|_{b\ge\mathcal{E}_i,\,|b-2\mathcal{E}_i|>N^{-\alpha}} \le e^{-\frac12f_bN\,\mathrm{Re}[C^*e^{-2i\sigma\theta}](b-2\mathcal{E}_i)^2}\; e^{-cN\,\mathrm{Re}[C^*e^{-2i\sigma\theta}](b-2\mathcal{E}_i)^2}\Big|_{b-2\mathcal{E}_i=N^{-\alpha}} \le e^{-\frac12f_bN^{1-\beta}(b-2\mathcal{E}_i)^2}\; e^{-cN^{1-\beta-2\alpha}} \tag{6.11}$$
and for the region near $|b| = \mathcal{E}_i$ ($I_3$), expanding in $\mathcal{E}_i$ we find that the linear and quadratic terms vanish, and the leading contribution is cubic:
$$\left|e^{-Nf_2(be^{-i\theta\sigma})}\right|_{b=\mathcal{E}_i,\,0<\theta\ll1} \le e^{-cN\mathcal{E}_i^3} = e^{-cN^{1-3\beta}}. \tag{6.12}$$
Note that, in order to get an exponentially small factor, we must have $1/3<\alpha<\frac{1-\beta}{2}$ and 1 − 3β > 0. This is possible only for β < 1/3. Therefore we can bound the remainder terms and the result is
$$R_N(E) = O\left(N^{-(1-2\beta)}\right). \tag{6.13}$$
7. Edge Region. Proof of Theorem 3

When we consider $|E| = 2-tN^{-2\beta}$, β = 1/3 and t ≤ 1, we enter the edge region, where the semicircle law approximation is no longer true and an Airy kernel is expected. Here the Hessian is very tiny and becomes zero at the edge. The two saddles for a and b tend to merge, so we can no longer treat them separately. So instead of translating to one saddle we translate to a point near to both saddles. A good choice is to take the point where the Hessian is exactly zero. This will coincide with the two saddles at |E| = 2. We will see that the error we make choosing this point instead of the saddles is negligible for β = 1/3. Then we will have to perform a rotation.

7.1. Translation point

We study the second derivative of f (4.2):
$$\frac{d^2}{da^2}f = 1-\frac{1}{(E_\varepsilon-a)^2},\qquad \frac{d^2}{db^2}f = 1-\frac{1}{(E_\varepsilon-ib)^2}. \tag{7.1}$$
This is zero at the points E − a = ±1 and E − ib = ±1. As we want to ensure that this point tends to the saddle when |E| goes to 2 we have to choose
$$a = (E-\sigma),\qquad ib = (E-\sigma) \tag{7.2}$$
where σ is the sign of E. For E = ±2 these coincide indeed with the saddle points. To simplify the expressions, in the following we assume E is near 2, so that σ = 1. Then we perform the translation
$$a\to a+(E-1),\qquad b\to b-i(E-1). \tag{7.3}$$
As we are very near to the edge, instead of E we use the new variable
$$E = 2-x,\qquad x = t\,N^{-\frac23}. \tag{7.4}$$
Therefore E − 1 = 1 − x. After the translation f becomes
$$f = f_1(a)+f_2(b),\qquad f_1(a) = \frac{a^2}{2}+a(1-x)+\ln(1-a),\qquad f_2(b) = \frac{b^2}{2}-ib(1-x)-\ln(1-ib) \tag{7.5}$$
and the integral to study becomes
$$-\frac{1}{\pi}\mathrm{Im}\,\frac{N}{2\pi}\int da\,db\; e^{-\frac{N}{2}[(a^2+b^2)+2(1-x)(a-ib)]}\,(a+1-x)\,\frac{(1+i\varepsilon-ib)^N}{(1+i\varepsilon-a)^N}\left[1-\frac{1}{(1+i\varepsilon-a)(1+i\varepsilon-ib)}\right]$$
$$\qquad = -\frac{1}{\pi}\mathrm{Im}\,\frac{N}{2\pi}\int da\,db\; e^{-\frac{N}{2}[(a^2+b^2)+2(1-x)(a-ib)]}\,a\,\frac{(1+i\varepsilon-ib)^N}{(1+i\varepsilon-a)^N}\left[1-\frac{1}{(1+i\varepsilon-a)(1+i\varepsilon-ib)}\right] \tag{7.6}$$
where we applied (3.2) and Im(1 − x) = 0. Note that we cannot take ε = 0 yet, as the a translation is real.
7.2. Rotation for a

In order to take ε = 0 we have to perform a rotation. We will see that the correct rotated contour is parametrized by $ae^{-i\theta}$, where $\theta = \frac{\pi}{6}+\delta$ for a ≥ 0, $\theta = \frac{\pi}{6}-\delta$ for a < 0, and 0 < δ < 1. Of course, as we did not translate to a saddle, the maximum in absolute value will not occur at a = 0. We will see that nevertheless the error we make is negligible. In order to check we have to study the first derivative of $\mathrm{Re}\,f_1(a)$:
$$\mathrm{Re}\,f_1(a) = \frac{a^2}{2}\cos2\theta+a(1-x)\cos\theta+\frac12\ln[1-2a\cos\theta+a^2] \tag{7.7}$$
$$\mathrm{Re}\,f_1'(a) = \frac{1}{A}\left\{a\left[a^2\cos2\theta-2a\cos\theta\left(\cos2\theta-\frac{1-x}{2}\right)+2x\cos^2\theta\right]-x\cos\theta\right\} \tag{7.8}$$
where $A = [1-2a\cos\theta+a^2]>0$. The last term x cos θ appears because we are not passing through the saddle. Let us take a ≥ 0. We choose δ > 0 such that $\cos2\theta-\frac{1-x}{2}<0$. Then the first term has three zeros, one at a = 0 and two negative points. But, as we have constrained a ≥ 0, the only minimum is actually at a = 0. The same argument is true for a < 0. In this second case we must ensure that $\cos2\theta-\frac{1-x}{2}>0$. The sign of δ is chosen to ensure that the additional zeros are always outside the integration domain (negative when we restrict to a ≥ 0 and positive when a < 0). Now we have to take into account the error term, which is of order x. Therefore the true minimum is not at a = 0 but at $a = O(\sqrt{x})$. If we insert this in $\mathrm{Re}\,Nf_1(a)$, the dominant contribution comes from the linear and cubic terms
$$-N\cos\theta\,xa+NO(a^3) = O\left(Nx^{\frac32}\right) = O(1). \tag{7.9}$$
Note that the result is true only for $x = O(N^{-2/3})$. The integration contour is shown in Fig. 7. As before, the contributions from the boundary terms vanish when L → ∞. We also have to check that we can extract a fraction $fa^2/2$ to ensure convergence. If we study the minimum for $\mathrm{Re}[f_1(a)-f\frac{a^2}{2}]$ we find that for any f > 0 a new minimum appears at a distance f from zero. Therefore the maximal fraction f we can afford with a negligible error is $f = O(N^{-1/3})$.

7.3. Rotation for b

Note that the b integral is well defined even if we do not perform any rotation. But in that case we can save only a very tiny fraction of the mass to ensure convergence. After the rotation we must check that the minimum is near b = 0 and that the error we make is negligible. Again we will have to perform a different rotation depending on the sign of b:
$$b\to b\,e^{+i\delta},\quad b\ge0,\qquad b\to b\,e^{-i\delta},\quad b<0, \tag{7.10}$$
and 0 < δ < 1.
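The closed form (7.8) for the derivative along the rotated a contour can be cross-checked against a finite difference of (7.7). This is only a sanity-check sketch; the function names and the sample values of a, θ and x are arbitrary choices, not taken from the paper.

```python
import math

def re_f1(a, theta, x):
    """Re f1 on the rotated contour, equation (7.7)."""
    return (0.5 * a * a * math.cos(2 * theta)
            + a * (1 - x) * math.cos(theta)
            + 0.5 * math.log(1 - 2 * a * math.cos(theta) + a * a))

def re_f1_prime(a, theta, x):
    """Closed form (7.8) for d/da Re f1."""
    A = 1 - 2 * a * math.cos(theta) + a * a
    bracket = (a * a * math.cos(2 * theta)
               - 2 * a * math.cos(theta) * (math.cos(2 * theta) - (1 - x) / 2)
               + 2 * x * math.cos(theta) ** 2)
    return (a * bracket - x * math.cos(theta)) / A

def central_difference(a, theta, x, h=1e-5):
    return (re_f1(a + h, theta, x) - re_f1(a - h, theta, x)) / (2 * h)

theta = math.pi / 6 + 0.1   # pi/6 + delta with an illustrative delta
for a in (0.05, 0.3, 1.5):
    print(a, re_f1_prime(a, theta, 0.01), central_difference(a, theta, 0.01))
```

The two columns should agree to the accuracy of the finite difference; at a = 0 the closed form reduces to −x cos θ, the small offset caused by not passing through the saddle.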
Fig. 7. Rotation for the a contour.
To check, we have to study the first derivative of $\mathrm{Re}\,f_2(b)$ after the rotation:
$$\mathrm{Re}\,f_2(be^{i\sigma_b\delta}) = \frac{b^2}{2}\cos2\delta+b(1-x)\sin\sigma_b\delta-\frac12\ln[1+2b\sin\sigma_b\delta+b^2] \tag{7.11}$$
where $\sigma_b = \mathrm{sign}(b)$,
$$\mathrm{Re}\,f_2'(b) = \frac{1}{B}\left\{b\left[b^2\cos2\delta+2b\sin\sigma_b\delta\left(\cos2\delta+\frac{1-x}{2}\right)-2x\sin^2\delta\right]-x\sin\sigma_b\delta\right\} \tag{7.12}$$
where $B = [1+2b\sin\sigma_b\delta+b^2]$. We see that for b ≥ 0 there is a minimum at b = O(x) and one on the negative axis. The error we make assuming the minimum is at zero instead of at b = O(x) is $NO(xb)+NO(b^3) = O(N^{-1/3})$. The additional term −x sin δ shifts the minimum to a point $b = O(\sqrt{x})$. Again, the error we make approximating this point by zero is $NO(xb)+NO(b^3) = O(1)$, as in the a integral. The same holds for b < 0. Now, as in the a integral, we can extract at most a fraction $f = O(\sqrt{x}) = O(N^{-1/3})$ of the quadratic term $b^2/2$. The integration contour is shown in Fig. 8.

Fig. 8. Rotation for the b contour.

7.4. Perturbative expansion

The integral (7.6) after all these rotations becomes
$$\bar\rho_N(E) = -\frac{1}{\pi}\mathrm{Im}\,\frac{N}{2\pi}\int da\,db\; e^{-i\theta+i\delta\sigma_b}\; e^{N\left(ae^{-i\theta}x+\frac{a^3}{3}e^{-3i\theta}\right)}\; e^{-N\left(ibe^{i\delta\sigma_b}x+\frac{(ib)^3}{3}e^{3i\delta\sigma_b}\right)}$$
$$\qquad\times\; e^{N[V(ae^{-i\theta})-V(ibe^{i\delta\sigma_b})]}\; ae^{-i\theta}\left[-(ae^{-i\theta}+ibe^{i\delta\sigma_b})+R(ae^{-i\theta},be^{i\delta\sigma_b})\right] \tag{7.13}$$
where
$$V(y) = \int_0^1 ds\,(1-s)^3\frac{y^4}{(1-ys)^4} \tag{7.14}$$
$$R(a,b) = -\int_0^1 ds\,(1-s)\frac{d^2}{ds^2}\frac{1}{(1-as)(1-ibs)}, \tag{7.15}$$
we have taken ε = 0 and we have expanded around a, b = 0.

7.5. Dominant contribution
The dominant contribution is given by
$$-\frac{1}{\pi}\mathrm{Im}\,\frac{N}{2\pi}\int da\,db\; e^{-i\theta+i\delta\sigma_b}\; e^{N\left(ae^{-i\theta}x+\frac{a^3}{3}e^{-3i\theta}\right)}\; e^{-N\left(ibe^{i\delta\sigma_b}x+\frac{(ib)^3}{3}e^{3i\delta\sigma_b}\right)}\; ae^{-i\theta}\left[-(ae^{-i\theta}+ibe^{i\delta\sigma_b})\right]. \tag{7.16}$$
Now note that
$$a^3e^{-3i\theta} = \left(ae^{+i\frac{\pi}{2}-i\sigma_a\delta}\right)^3 = v_1^3,\qquad -(ib)^3e^{3i\sigma_b\delta} = \left(be^{-i\frac{\pi}{2}+i\sigma_b\delta}\right)^3 = v_2^3 \tag{7.17}$$
where $v_1$ and $v_2$ belong to L, that is the path in the complex plane with $\mathrm{Ph}(v) = -\frac{\pi}{2}+\delta$ and $\mathrm{Ph}(v) = \frac{\pi}{2}-\delta$ (see Fig. 9). Therefore the integral above is equal to
$$\frac{1}{\pi}\mathrm{Im}\; e^{-i\frac23\pi}\,\frac{N}{2\pi i}\int_L dv_1\int_L dv_2\; e^{Nxv_1e^{-i\frac23\pi}+N\frac{v_1^3}{3}}\; e^{Nxv_2+N\frac{v_2^3}{3}}\left[-v_1^2e^{-i\frac43\pi}+v_1v_2e^{-i\frac23\pi}\right]. \tag{7.18}$$
Now we insert $x = tN^{-\frac23}$ and we rescale $v_1$ and $v_2$:
$$\frac{1}{\pi}\mathrm{Im}\; e^{-i\frac23\pi}\,\frac{N^{-\frac13}}{2\pi i}\int_L dv_1\int_L dv_2\; e^{v_1e^{-i\frac23\pi}t+\frac{v_1^3}{3}}\; e^{v_2t+\frac{v_2^3}{3}}\left[-v_1^2e^{-i\frac43\pi}+v_1v_2e^{-i\frac23\pi}\right]. \tag{7.19}$$

Fig. 9. L contour.

We note that
$$\frac{1}{2\pi i}\int_L dv\; e^{-vx+\frac{v^3}{3}} = \mathrm{Ai}(x) \tag{7.20}$$
is the so-called Airy function and satisfies
$$\mathrm{Ai}''(x) = x\,\mathrm{Ai}(x),\qquad \mathrm{Ai}(x)+e^{-i\frac23\pi}\mathrm{Ai}\left(xe^{-i\frac23\pi}\right)+e^{i\frac23\pi}\mathrm{Ai}\left(xe^{i\frac23\pi}\right) = 0. \tag{7.21}$$
Using the relations above we have
$$\bar\rho_N(E) = \frac{1}{N^{1/3}}\left[\mathrm{Ai}''(-t)\mathrm{Ai}(-t)-\mathrm{Ai}'(-t)^2\right]. \tag{7.22}$$
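The two Airy relations in (7.21) can be verified numerically without any special-function library. The sketch below evaluates Ai through its Maclaurin series, which is an assumption of this illustration and not the contour representation (7.20) used in the text; the constants Ai(0) and Ai'(0) are the standard tabulated values, and the helper name `airy_ai` is hypothetical. The series is adequate only for moderate |z|.

```python
import cmath
import math

AI_AT_0 = 0.355028053887817      # Ai(0)
AI_PRIME_AT_0 = -0.258819403792807  # Ai'(0)

def airy_ai(z, terms=80):
    """Ai(z) from the power series Ai = Ai(0)*f + Ai'(0)*g, with f'' = z f, g'' = z g."""
    z = complex(z)
    z3 = z ** 3
    f_term, g_term = 1 + 0j, z
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= z3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= z3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return AI_AT_0 * f_sum + AI_PRIME_AT_0 * g_sum

def rotation_identity(x):
    """Left-hand side of the three-term relation in (7.21)."""
    w = cmath.exp(2j * math.pi / 3)
    return airy_ai(x) + w.conjugate() * airy_ai(x * w.conjugate()) + w * airy_ai(x * w)

def ode_residual(x, h=1e-4):
    """Finite-difference estimate of Ai''(x) - x*Ai(x)."""
    second = (airy_ai(x + h) - 2 * airy_ai(x) + airy_ai(x - h)) / (h * h)
    return second - x * airy_ai(x)

print(airy_ai(1.0).real)
print(abs(rotation_identity(1.3)), abs(ode_residual(-0.7)))
```

Both residuals should be small: the rotation identity holds to rounding error, the differential equation to the accuracy of the finite difference.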
7.6. Remainder

The remainder is given by
$$\mathrm{Rem} = \frac{N}{2\pi}\int da\,db\; e^{-i\theta+i\delta\sigma_b}\; e^{N\left(ae^{-i\theta}x+\frac{a^3}{3}e^{-3i\theta}\right)}\; e^{-N\left(ibe^{i\delta\sigma_b}x+\frac{(ib)^3}{3}e^{3i\delta\sigma_b}\right)}\,[R_1+R_2] \tag{7.23}$$
where
$$R_1 = e^{N[V(ae^{-i\theta})-V(ibe^{i\delta\sigma_b})]}\; ae^{-i\theta}\,R(ae^{-i\theta},be^{i\delta\sigma_b}) \tag{7.24}$$
$$R_2 = -ae^{-i\theta}\left[(ae^{-i\theta}+ibe^{i\delta\sigma_b})\right]\int_0^1 ds\; N[V(ae^{-i\theta})-V(ibe^{i\delta\sigma_b})]\,e^{Ns[V(ae^{-i\theta})-V(ibe^{i\delta\sigma_b})]}. \tag{7.25}$$
To perform the bounds we apply
$$\left|e^{N\left(ae^{-i\theta}x+\frac{a^3}{3}e^{-3i\theta}\right)}\; e^{-N\left(ibe^{i\delta\sigma_b}x+\frac{(ib)^3}{3}e^{3i\delta\sigma_b}\right)}\; e^{Ns[V(ae^{-i\theta})-V(ibe^{i\delta\sigma_b})]}\right| \le e^{-N\sqrt{x}\,\frac12(a^2+b^2)}\; e^{-N^{\frac23}\frac12(a^2+b^2)}. \tag{7.26}$$
Therefore, using the fact that $R(a,b) = O(a^2)$ and $V(y) = O(y^4)$, the remainder (7.23) is bounded by
$$\mathrm{Rem} = O\left(N^{-\frac23}\right). \tag{7.27}$$
8. Outside the Spectrum. Proof of Theorem 4

We remark that the analysis above (Airy kernel) works also for t < 0, that is in the tails outside the spectrum. In particular, using the asymptotic expansions for the Airy functions, we see that
$$\bar\rho_N\left(2+t/N^{2/3}\right) \le K\frac{1}{N^{1/3}}\frac{1}{t}\,e^{-\frac23t^{3/2}} = K\frac{1}{Nx}\,e^{-\frac23Nx^{3/2}} \tag{8.1}$$
where $x = E-2 = t/N^{2/3}$ and t > 0. Also for large energies outside the spectrum, E ≥ 2 + x, $x\ge N^{-2/3}$, we expect exponential decay.

8.1. Saddle points

Actually in this region the saddle points for a are both real, and the saddle points for b are pure imaginary:
$$a_s = \frac{E}{2}\pm\sqrt{\frac{E^2}{4}-1},\qquad b_s = -i\left[\frac{E}{2}\pm\sqrt{\frac{E^2}{4}-1}\right]. \tag{8.2}$$
It turns out that the maximum is at
$$a_{\max} = \sigma\left[\frac{|E|}{2}-\sqrt{\frac{E^2}{4}-1}\right] \tag{8.3}$$
while the minimum is at
$$a_{\min} = \sigma\left[\frac{|E|}{2}+\sqrt{\frac{E^2}{4}-1}\right]. \tag{8.4}$$
Here σ is the sign of E. For b we have two imaginary saddle points. We will see that we have to translate to $-i\left[\frac{E}{2}-\sqrt{\frac{E^2}{4}-1}\right]$. In this case, after the translation, the maximum in absolute value is only at the saddle. If we translate to $-i\left[\frac{E}{2}+\sqrt{\frac{E^2}{4}-1}\right]$ this is not true.
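A quick numerical check of (8.2) to (8.4) is sketched below. It assumes that the a-dependent weight has the form $e^{-Na^2/2}(E-a)^{-N}$, as in the integrals that follow, so that the saddle condition is $g'(a) = a-1/(E-a) = 0$ with $g(a) = a^2/2+\ln(E-a)$; the function names are illustrative.

```python
import math

def outer_saddles(E):
    """Real saddle points (8.3), (8.4) for |E| > 2: returns (a_max, a_min)."""
    delta = math.sqrt(E * E / 4.0 - 1.0)
    sigma = 1.0 if E > 0 else -1.0
    return sigma * (abs(E) / 2.0 - delta), sigma * (abs(E) / 2.0 + delta)

def g_prime(a, E):
    # derivative of g(a) = a^2/2 + ln(E - a); assumed form of the exponent per unit N
    return a - 1.0 / (E - a)

for E in (2.5, -2.5, 3.0):
    a_max, a_min = outer_saddles(E)
    print(E, a_max, a_min, g_prime(a_max, E), g_prime(a_min, E), a_max * a_min)
```

Both points solve $a^2-Ea+1 = 0$, so their product equals 1.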
8.2. Integration contour

We deform the integral over a only near the pole, that is
$$\int da\cdots = \int_{|E-a|>R} da\cdots+\int_{C(0,\pi)} da\cdots \tag{8.5}$$
where C(0, π) is the semicircle $a = E-Re^{i\theta}$ with 0 ≤ θ ≤ π (see Fig. 10). The radius R will be chosen later. Now, in the first integral |E − a| > R, therefore there is no pole and we can take ε = 0. At this point the integral is real. Moreover the b integral too is real:
$$\left[\int db\; e^{-\frac{b^2}{2}}(E-ib)^N\right]^* = \int db\; e^{-\frac{b^2}{2}}(E+ib)^N = \int db\; e^{-\frac{b^2}{2}}(E-ib)^N \tag{8.6}$$
where in the last term we performed the change of variable b → −b. Therefore $\int db\int_{|E-a|>R}da\cdots$ is real and gives no contribution to the density. Moreover we remark that
$$\lim_{\varepsilon\to0}\mathrm{Im}\int da\; e^{-N\frac{a^2}{2}}\frac{1}{(E_\varepsilon-a)^M} = \frac{1}{2i}\oint_C da\; e^{-N\frac{a^2}{2}}\frac{1}{(E-a)^M} \tag{8.7}$$
where C is a closed contour around the pole. Therefore we have only to analyze
$$\frac{N}{2\pi}\oint_C da\int db\; e^{-\frac{N}{2}(a^2+b^2)}\; a\,\frac{(E_\varepsilon-ib)^N}{(E_\varepsilon-a)^N}\left[1-\frac{1}{(E_\varepsilon-a)(E_\varepsilon-ib)}\right]$$
$$\qquad = \frac{N}{2\pi}\int_{-\pi}^{\pi}(-iRe^{i\theta})\,d\theta\int db\; e^{-\frac{N}{2}[(E-Re^{i\theta})^2+b^2]}\left(E-Re^{i\theta}\right)\frac{(E-ib)^N}{(Re^{i\theta})^N}\left[1-\frac{1}{(Re^{i\theta})(E-ib)}\right] \tag{8.8}$$
where C is now a circle of radius R around the pole.

Fig. 10. Integration contour for a; note that $\Delta = \sqrt{\frac{E^2}{4}-1}$.
8.3. Choice of R

We want to choose R so that in absolute value the integrand is small. If we take $R = \frac{|E|}{2}-\sqrt{\frac{E^2}{4}-1}$ then for θ = 0 (if E > 0) or θ = π (if E < 0) $E-Re^{i\theta}$ corresponds to the point of minimum $a_{\min}$. We then have to check that the integrand remains small for any 0 < θ ≤ π. Therefore we have to study the minimum of
$$f(\theta) = \frac12\mathrm{Re}\left(E-Re^{i\theta}\right)^2 = \frac12\left(E^2+R^2\cos2\theta-2ER\cos\theta\right). \tag{8.9}$$
The first derivative is
$$-R^2\sin2\theta+ER\sin\theta = R\sin\theta(-2R\cos\theta+E) = 2R\sin\theta\left(\frac{E}{2}-R\cos\theta\right). \tag{8.10}$$
Now, applying R ≤ |E|/2 for any |E| ≥ 2 we have that, for any −π ≤ θ ≤ π,
$$f'(\theta)>0\ \text{ if } E>2,\qquad f'(\theta)<0\ \text{ if } E<-2. \tag{8.11}$$
Therefore the maximum of the integrand along C is at θ = 0 (if E > 0) or θ = π (if E < 0). The two terms in the integral over a are then bounded by
$$\left|\oint_C da\; e^{-N\frac{a^2}{2}}\,a\,\frac{1}{(E_\varepsilon-a)^N}\frac{1}{(E_\varepsilon-a)^\alpha}\right| \le K\frac{1}{\sqrt{N}}\frac{1}{\sqrt{R\left(\frac{|E|}{2}-R\right)}}\,R^{1-\alpha}\,e^{-N\frac12(E-R)^2}\frac{1}{R^N} \tag{8.12}$$
where K > 0 is some constant, and α = 0, 1 depending on which term in $1-\frac{1}{(E-a)(E-ib)}$ we consider. We have also used the bound R < 1. In order to extract the correct prefactor $1/\sqrt{N}$ we separated the regions where $\cos\theta\simeq\pm1$ and $\cos\theta\simeq0$, and in each we have extracted a fraction of the exponential decay $\exp\left[-fNR\left(\frac{|E|}{2}-R\right)\theta^2\right]$.
8.4. Integration for b

We translate b to the saddle:
$$b\to b-i\left[\frac{E}{2}-\sqrt{\frac{E^2}{4}-1}\right] = b-iR. \tag{8.13}$$
It is easy to see that we can extract a fraction $f(1-R^2)$ of the mass to ensure exponential decay, where 0 < f < 1 is some constant. The result is
$$\left|\int db\; e^{-\frac{N}{2}b^2}(E_\varepsilon-ib)^N\frac{1}{(E_\varepsilon-ib)^\alpha}\right| \le K\,e^{\frac{N}{2}R^2+N\ln(|E|-R)}\frac{1}{(|E|-R)^\alpha}\frac{1}{\sqrt{N(1-R^2)}} \tag{8.14}$$
where K is some constant and α = 0, 1.
8.5. Perturbative steps

It is not hard to see that the dominant contribution to the integral comes from the region $\theta\simeq0$ and $b\simeq0$. Restricting to this region and performing a few perturbative steps we see that the first non-zero contribution must have a factor $b^2$ or $\theta^2$ in the numerator, therefore we get an additional factor $1/(NR(|E|/2-R))$, where we also apply $1-R^2\propto R\left(\frac{|E|}{2}-R\right)$. Putting the a and b integrals together,
$$|\bar\rho_N(E)|_{|E|>2} \le K\frac{R}{N\left[R\left(\frac{|E|}{2}-R\right)\right]^2}\,e^{-N\left\{\frac12[(|E|-R)^2-R^2]-\ln(|E|-R)+\ln R\right\}}. \tag{8.15}$$
Inserting the expression for R we get
$$|\bar\rho_N(E)|_{|E|>2} \le K\frac{1}{N\left(y-\sqrt{y^2-1}\right)(y^2-1)}\,e^{-N2\left[y\sqrt{y^2-1}\right]+2N\ln\left[y+\sqrt{y^2-1}\right]} \tag{8.16}$$
where y = |E|/2. Note that when y is near 1 (that is $E\simeq2$), y = 1 + x/2, we have
$$\bar\rho_N(2+x) \le K\frac{1}{Nx}\,e^{-cNx^{3/2}} \tag{8.17}$$
where c > 0 is some constant. This result is consistent with the behavior we get from the Airy kernel.

Acknowledgments

We thank T. Spencer for many discussions and suggestions related to this paper.

Appendix A. Supersymmetric Formalism

We summarize the conventions and notations we adopted in this work (they are based on the review by Mirlin [28]).

A.1. Fermionic variables

We say a set of variables $\chi_1,\ldots,\chi_N$ with their complex conjugates $\chi_1^*,\ldots,\chi_N^*$ is fermionic if they satisfy the following properties for any i, j:
$$\chi_i\chi_j = -\chi_j\chi_i,\qquad \chi_i^*\chi_j = -\chi_j\chi_i^*,\qquad \chi_i^*\chi_j^* = -\chi_j^*\chi_i^*, \tag{A.1}$$
$$(\chi_i^*)^* = -\chi_i,\qquad (\chi_i\chi_j)^* = \chi_i^*\chi_j^*, \tag{A.2}$$
$$\int d\chi_i\,1 = \int d\chi_i^*\,1 = 0,\qquad \int d\chi_i\,\chi_i = \int d\chi_i^*\,\chi_i^* = \frac{1}{\sqrt{2\pi}}. \tag{A.3}$$
With these definitions we introduce a vector and its adjoint as usual:
$$\chi = \begin{pmatrix}\chi_1\\ \vdots\\ \chi_N\end{pmatrix},\qquad \chi^+ = (\chi_1^*,\ldots,\chi_N^*). \tag{A.4}$$
Now $\chi^+\chi$ is a real commuting variable and
$$\int\prod_i d\chi_i^*\,d\chi_i\; e^{-\chi^+M\chi} = \det\frac{M}{2\pi} \tag{A.5}$$
for any matrix M.
A.2. Supervectors and supermatrices

A supervector is defined as
$$\Phi = \begin{pmatrix}S_1\\ \vdots\\ S_N\\ \chi_1\\ \vdots\\ \chi_N\end{pmatrix},\qquad \Phi^+ = (S_1^*,\ldots,S_N^*,\chi_1^*,\ldots,\chi_N^*) \tag{A.6}$$
where $S_i$ are the commuting and $\chi_i$ are the anticommuting components. Similarly a supermatrix is a matrix with both commuting and anticommuting entries
$$M = \begin{pmatrix}a & \sigma\\ \rho & b\end{pmatrix} \tag{A.7}$$
where a and b are ordinary matrices while σ and ρ have anticommuting elements. We identify the element of a supermatrix by four indices $M_{ij}^{\alpha\beta}$, where α, β specify in which sector we are: (0, 0) corresponds to a (boson–boson); (1, 1) corresponds to b (fermion–fermion); (0, 1) corresponds to σ (boson–fermion); (1, 0) corresponds to ρ (fermion–boson). The indices (i, j) identify the matrix element inside each sector. For example $M_{ij}^{00} = a_{ij}$. The notions equivalent to trace and determinant are supertrace and superdeterminant:
$$\mathrm{Str}\,M = \mathrm{Tr}\,a-\mathrm{Tr}\,b,\qquad \mathrm{Sdet}\,M = \det(a-\sigma b^{-1}\rho)\det b^{-1}. \tag{A.8}$$
With these definitions we have
$$\mathrm{Str}\ln M = \ln\mathrm{Sdet}\,M \tag{A.9}$$
$$\int d\Phi^*\,d\Phi\; e^{-\Phi^+M\Phi} = \mathrm{Sdet}\,M^{-1} \tag{A.10}$$
$$\int d\Phi^*\,d\Phi\;\Phi_{\alpha,k}\Phi^*_{\beta,l}\,e^{-\Phi^+M\Phi} = (M^{-1})^{\alpha\beta}_{kl}\,\mathrm{Sdet}\,M^{-1}. \tag{A.11}$$
Note that some properties are different from those of the usual matrices, in particular:
$$\mathrm{Sdet}\,cM = \mathrm{Sdet}\,M\quad\text{for any constant } c. \tag{A.12}$$
1224
00222
M. Disertori
Finally from these formulas one can derive the inverse of the supermatrix M (A.7):
$$M^{-1} = \begin{pmatrix}(a-\sigma b^{-1}\rho)^{-1} & -(a-\sigma b^{-1}\rho)^{-1}\sigma b^{-1}\\ -b^{-1}\rho(a-\sigma b^{-1}\rho)^{-1} & b^{-1}\left[1+\rho(a-\sigma b^{-1}\rho)^{-1}\sigma b^{-1}\right]\end{pmatrix}. \tag{A.13}$$

References

[1] C. E. Porter, Statistical theory of spectra: Fluctuations (Academic, New York, 1965).
[2] T. A. Brody, J. Flores, J. B. French, P. A. Mello, A. Pandey and S. S. M. Wong, Random-matrix physics: spectrum and strength fluctuations, Rev. Mod. Phys. 53 (1981) 385.
[3] C. W. J. Beenakker, Random-matrix theory of quantum transport, Rev. Mod. Phys. 69 (1997) 731; cond-mat/9612179.
[4] F. Haake, Quantum Signatures of Chaos, 2nd edn. (Springer, Berlin, 1999).
[5] H.-J. Stoeckmann, Quantum Chaos: An Introduction (Cambridge University Press, Cambridge, 1999).
[6] P. Di Francesco, P. Ginsparg and J. Zinn-Justin, 2-D gravity and random matrices, Phys. Rep. 254(1) (1995); hep-th/9306153.
[7] R. Dijkgraaf, Intersection Theory, Integrable Hierarchies and Topological Field Theory, Lectures given at the Cargese Summer School on “New Symmetry Principles in Quantum Field Theory”, July 16–27, 1991; hep-th/9201003.
[8] J. Ambjorn, Fluctuating geometries in statistical mechanics and field theory, Lectures presented at the 1994 Les Houches Summer School; hep-th/9411179.
[9] J. J. M. Verbaarschot, The infrared limit of the QCD Dirac spectrum and applications of chiral random matrix theory to QCD, Lectures given at the APCTP-RCNP Joint International School on Physics of Hadrons, Osaka 1998; hep-th/9902157; J. J. M. Verbaarschot and T. Wettig, Ann. Rev. Nucl. Part. Sci. 50 (2000) 343; hep-th/0003017.
[10] P. M. Bleher and A. R. Its (eds.), Random Matrix Models and Their Applications, Vol. 40 (MSRI Publications).
[11] N. M. Katz and P. Sarnak, Zeroes of zeta functions and symmetry, Bull. Am. Math. Soc. 36(1) (1999) 1.
[12] N. K. Johansson, Shape fluctuations and random matrices, Commun. Math. Phys. 209 (2000) 437.
[13] C. A. Tracy and H. Widom, Distribution functions for largest eigenvalues and their applications, in Proceedings of the International Congress of Mathematicians, Vol. I, Beijing, 2002 (Higher Ed. Press, Beijing, 2002), pp. 587–596; math-ph/0210034.
[14] T. Guhr, A. Müller-Groeling and H. A. Weidenmüller, Random-matrix theories in quantum physics: common concepts, Phys. Rep. 299 (1998) 189; cond-mat/9707301.
[15] M. L. Mehta, Random Matrices, revised and enlarged edition (Academic Press, 1991).
[16] P. J. Forrester, N. C. Snaith and J. J. M. Verbaarschot, J. Phys. A36 (2003) R1 (special edition on random matrix theory).
[17] F. J. Dyson, Statistical theory of the energy levels of complex systems, J. Math. Phys. 3 (1962), I p. 140, II p. 157, III p. 166; F. J. Dyson, A Brownian motion model for the eigenvalues of a random matrix, J. Math. Phys. 3 (1962) p. 1191; F. J. Dyson, The threefold way. Algebraic structure of symmetry groups and ensembles in quantum mechanics, J. Math. Phys. 3 (1962) p. 1200.
[18] P. Deift, Orthogonal Polynomials and Random Matrices: A Riemann–Hilbert Approach, Courant Lecture Notes (1999).
[19] C. A. Tracy and H. Widom, Correlation functions, cluster functions, and spacing distributions for random matrices, J. Stat. Phys. 92(5–6) (1998) 809; solv-int/9804004.
[20] C. A. Tracy and H. Widom, Introduction to random matrices, in Geometric and Quantum Aspects of Integrable Systems (Scheveningen, 1992), Lecture Notes in Phys., 424 (Springer, Berlin, 1993), pp. 103–130; hep-th/9210073.
[21] Harish-Chandra, Proc. Nat. Acad. Sci. 42 (1956) 252.
[22] C. Itzykson and J. B. Zuber, The planar approximation II, J. Math. Phys. 21(3) (1980) 411.
[23] E. Brézin and S. Hikami, An extension of the Harish-Chandra–Itzykson–Zuber integral, Commun. Math. Phys. 235(1) (2003) 125; math-ph/0208002.
[24] F. J. Wegner, The mobility edge problem: Continuous symmetry and a conjecture, Z. Phys. B35 (1979) 207.
[25] L. Schäfer and F. J. Wegner, Disordered system with n orbitals per site: Lagrange formulation, hyperbolic symmetry, and Goldstone modes, Z. Phys. B38 (1980) 113.
[26] K. B. Efetov, Supersymmetry and theory of disordered metals, Adv. Phys. 32(1) (1983) 53.
[27] K. B. Efetov, Supersymmetry in Disorder and Chaos (Cambridge University Press, 1997).
[28] A. Mirlin, Statistics of energy levels and eigenfunctions in disordered and chaotic systems: Supersymmetric approach, in Proceedings of the International School of Physics "Enrico Fermi", Course CXLIII, eds. G. Casati, I. Guarneri and U. Smilansky (IOS Press, Amsterdam, 2000), pp. 223–298; cond-mat/0006421.
[29] Y. V. Fyodorov, Basic features of Efetov's supersymmetric approach, in Mesoscopic Quantum Physics, Les Houches, Session LXI, 1994 (Elsevier Science, 1995), p. 493.
[30] J. A. Zuk, Introduction to the supersymmetry method for the Gaussian random-matrix ensembles, cond-mat/9412060.
[31] F. Constantinescu, G. Felder, K. Gawedzki and A. Kupiainen, Analyticity of density of states in a gauge-invariant model for disordered electronic systems, J. Stat. Phys. 48(3/4) (1987) 365.
[32] A. Klein, The supersymmetric replica trick and smoothness of the density of states for random Schrödinger operators, in Operator Theory: Operator Algebras and Applications, Part 1, p. 315; Proceedings of Symposia in Pure Mathematics 51, Part 1.
[33] M. Disertori, H. Pinson and T. Spencer, Density of states for random band matrices, Commun. Math. Phys. 232 (2002) 83; math-ph/0111047.
[34] C. A. Tracy and H. Widom, Level spacing distribution and the Airy kernel, Commun. Math. Phys. 159 (1994) 151; hep-th/9211141.
[35] C. A. Tracy and H. Widom, Fredholm determinants, differential equations and matrix models, Commun. Math. Phys. 163 (1994) 33; hep-th/9306042.
[36] F. Kalisch and D. Braak, Exact density of states for finite Gaussian random matrix ensembles via supersymmetry, J. Phys. A35 (2002) 9957; cond-mat/0201585.
January 12, 2005 14:5 WSPC/148-RMP
00225
Reviews in Mathematical Physics Vol. 16, No. 10 (2004) 1227–1258 © World Scientific Publishing Company
A QUANTIZATION OF BOX-BALL SYSTEMS
R. INOUE∗, A. KUNIBA† and M. OKADO‡
∗Research Institute for Mathematical Sciences, Kyoto University, Kyoto 606-8502, Japan
†Institute of Physics, University of Tokyo, Tokyo 153-8902, Japan
‡Division of Mathematical Science, Graduate School of Engineering Science, Osaka University, Osaka 560-8531, Japan
∗[email protected] †[email protected] ‡[email protected]
Received 8 May 2004
Revised 26 October 2004
An L operator is presented related to an infinite dimensional limit of the fusion R matrices for $U_q(A^{(1)}_{n-1})$ and $U_q(D^{(1)}_n)$. It is factorized into the local propagation operators which quantize the deterministic dynamics of particles and antiparticles in the soliton cellular automata known as the box-ball systems and their generalizations. Some properties of the dynamical amplitudes are also investigated.
Keywords: Box-ball system; L operator; Yang–Baxter equation; vertex model.
1. Introduction
The discovery of the box-ball systems [1–3] and their connection to the crystal basis theory [4–6] has led to a new parallelism across the integrable systems of three origins: quantum, ultradiscrete and classical [7]. They are a class of two-dimensional vertex models in statistical mechanics, one-dimensional soliton cellular automata and discrete soliton equations. The fundamental objects that govern the local dynamics in these systems are the triad of quantum R, combinatorial R and tropical R, all satisfying the Yang–Baxter equation. They are a finite-dimensional matrix, a bijection among finite sets and a birational map, which are characterized as the intertwiners of $U_q$ modules, crystals and geometric crystals, respectively. The box-ball systems ($g_n = A^{(1)}_{n-1}$) and their generalizations to the $g_n$ automata [4, 8] are associated with the combinatorial R, which arises both as the $q \to 0$ limit of the quantum R and as the ultradiscretization of the tropical R [9].
An interesting feature in these automata is the factorization of time evolution into a product of propagation operators of particles and antiparticles with fixed color [10, 11]. This is a consequence of the factorization of the combinatorial R shown in [12]. Our aim in this paper is to elucidate a similar factorization for the relevant quantum R, and thereby to launch an integrable quantization of the deterministic dynamics of particles and antiparticles in the generalized box-ball systems.
To illustrate the idea, consider for example the quantum affine algebra $U_q(A^{(1)}_{n-1})$ and its irreducible finite dimensional representation $V_m$ of m-fold symmetric tensors. The quantum R matrix for $V_m \otimes V_1$ (2.3) gives rise to the commuting transfer matrix $T_m(z)$ acting on $\cdots \otimes V_1 \otimes V_1 \otimes \cdots$, which reduces, at q = 0, to the time evolution of the box-ball system with capacity m carrier [13]. One can naturally extract an L operator, a Weyl algebra valued matrix, from the $m \to \infty$ limit of the R matrix in the vicinity of the lowest weight vector. See (2.13) and (2.14) for example. More general L operators can be constructed similarly corresponding to the m generic situation. The limit considered here is motivated by the box-ball systems and has a special feature in that the resulting L admits the factorization as in Proposition 2.2. Each operator $K_i$ appearing there encodes the amplitudes for a local propagation of color i particles as depicted in Fig. 2. At q = 0, it reduces to the deterministic dynamics in the box-ball system [2].
Sections 2.1–2.6 are devoted to an exposition of these observations. Sections 2.7 and 2.8 are concerned with some properties of the dynamical amplitudes and the implication of the Bethe ansatz, respectively. In Sec. 3 we establish parallel results on the $D^{(1)}_n$ case. The calculation of the fusion $R \in \mathrm{End}(V_m \otimes V_1)$ is more involved than for $A^{(1)}_{n-1}$. It is done in the limit $m \to \infty$ in Appendix A. The L operator is given in Sec. 3.3 and factorized in Sec. 3.4. The propagation operators describe the amplitudes of pair creation and annihilation of particles and antiparticles as depicted in Fig. 8. A quantized $D^{(1)}_n$ automaton is presented in Sec. 3.5 with a few basic properties.
The fusion construction of the R matrices and their matrix elements for $A^{(1)}_{n-1}$ given in Sec. 2 are not new. They have been included for the sake of self-containedness. The content of this paper may be regarded as a generalization of the one in [12] for q = 0. It will be interesting to investigate the present results in the light of the works [14–16].
2. $A^{(1)}_{n-1}$ Case
2.1. R matrix R(z) and its fusion $R^{(m,1)}(z)$
We recall the standard fusion construction [17]. Let $V = \mathbb{C}v_1 \oplus \cdots \oplus \mathbb{C}v_n$ be the vector representation of the quantum affine algebra $U_q = U_q(A^{(1)}_{n-1})$ without the derivation operator. Here $v_1$ is the highest weight vector and our convention of the coproduct is $\Delta(e_i) = e_i \otimes 1 + t_i \otimes e_i$, $\Delta(f_i) = f_i \otimes t_i^{-1} + 1 \otimes f_i$ for the Chevalley generators. The R matrix $R(z) \in \mathrm{End}(V \otimes V)$ reads
\[
R(z) = a(z)\sum_i E_{ii} \otimes E_{ii} + b(z)\sum_{i \neq j} E_{ii} \otimes E_{jj} + c(z)\Bigl(z\sum_{i < j} + \sum_{i > j}\Bigr) E_{ji} \otimes E_{ij}, \tag{2.1}
\]
\[
a(z) = 1 - q^2 z, \qquad b(z) = q(1 - z), \qquad c(z) = 1 - q^2,
\]
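The Yang–Baxter equation for (2.1) can be confirmed by a direct computation. The following Python sketch, which is an illustration and not part of the original text, builds R(z) with exact rational arithmetic and compares both sides of $R_{23}(z'/z)R_{13}(z')R_{12}(z) = R_{12}(z)R_{13}(z')R_{23}(z'/z)$ on every basis vector of $V \otimes V \otimes V$. The function names `r_matrix`, `apply_r` and `yang_baxter_holds`, the 0-based labels of the basis vectors, and the sample values of q, z, z' are our own choices.

```python
from fractions import Fraction
from itertools import product

def r_matrix(n, q, z):
    """R(z) of (2.1) as {(i2, j2, i, j): w}, meaning R(v_i (x) v_j) contains w * v_i2 (x) v_j2."""
    a, b, c = 1 - q * q * z, q * (1 - z), 1 - q * q
    R = {}
    for i, j in product(range(n), repeat=2):
        if i == j:
            R[(i, i, i, i)] = a                   # a(z) E_ii (x) E_ii
        else:
            R[(i, j, i, j)] = b                   # b(z) E_ii (x) E_jj
            # E_ji (x) E_ij maps v_i (x) v_j to v_j (x) v_i; weight c z if i < j, c if i > j
            R[(j, i, i, j)] = c * z if i < j else c
    return R

def apply_r(R, slots, vec):
    """Apply R to the tensor slots (s, t) of vec, a dict {basis tuple: coefficient}."""
    s, t = slots
    out = {}
    for basis, coeff in vec.items():
        for (i2, j2, i, j), w in R.items():
            if basis[s] == i and basis[t] == j:
                new = list(basis)
                new[s], new[t] = i2, j2
                key = tuple(new)
                out[key] = out.get(key, 0) + coeff * w
    return out

def yang_baxter_holds(n, q, z, zp):
    """Check R23(z'/z) R13(z') R12(z) = R12(z) R13(z') R23(z'/z) on all basis vectors."""
    Rz, Rzp, Rratio = r_matrix(n, q, z), r_matrix(n, q, zp), r_matrix(n, q, zp / z)
    for basis in product(range(n), repeat=3):
        v = {basis: Fraction(1)}
        lhs = apply_r(Rratio, (1, 2), apply_r(Rzp, (0, 2), apply_r(Rz, (0, 1), v)))
        rhs = apply_r(Rz, (0, 1), apply_r(Rzp, (0, 2), apply_r(Rratio, (1, 2), v)))
        if any(lhs.get(k, 0) != rhs.get(k, 0) for k in set(lhs) | set(rhs)):
            return False
    return True

print(yang_baxter_holds(3, Fraction(1, 3), Fraction(2), Fraction(5)))
```

Exact fractions are used so that the comparison is an identity check rather than a floating-point approximation.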
where $E_{ij}$ is the matrix unit acting as $E_{ij}v_k = \delta_{jk}v_i$. It satisfies the Yang–Baxter equation $R_{23}(z'/z)R_{13}(z')R_{12}(z) = R_{12}(z)R_{13}(z')R_{23}(z'/z)$. The matrix $\check{R}(z) = PR(z)$ commutes with $\Delta(U_q)$, where P denotes the transposition of the components.
Let $V_m$ be the irreducible $U_q$ module spanned by the m-fold q-symmetric tensors. We take $V_1 = V$ and realize the space $V_m$ as the quotient $V^{\otimes m}/A$, where $A = \sum_j V^{\otimes j} \otimes \mathrm{Im}\,PR(q^{-2}) \otimes V^{\otimes m-2-j}$. It is easy to see $\mathrm{Im}\,PR(q^{-2}) = \mathrm{Ker}\,PR(q^2) = \bigoplus_{i < j} \mathbb{C}(v_i \otimes v_j - q v_j \otimes v_i)$. For $n \ge i_1 \ge \cdots \ge i_m \ge 1$, we write the vector $(v_{i_1} \otimes \cdots \otimes v_{i_m} \bmod A) \in V_m$ as $x = [x_1, \ldots, x_n]$, where $x_i$ is the number of the letter i in the sequence $i_1, \ldots, i_m$. Thus, $x_i \in \mathbb{Z}_{\ge 0}$ and $x_1 + \cdots + x_n = m$ holds. Due to the Yang–Baxter equation, the operator
\[
\frac{R_{1,m+1}(zq^{m-1})\,R_{2,m+1}(zq^{m-3}) \cdots R_{m,m+1}(zq^{-m+1})}{a(zq^{m-3})\,a(zq^{m-5}) \cdots a(zq^{-m+1})} \tag{2.2}
\]
can be restricted to $\mathrm{End}(V_m \otimes V)$. As a result we get an m by 1 fusion R matrix $R^{(m,1)}(z) \in \mathrm{End}(V_m \otimes V)$, which reads explicitly as
\[
R^{(m,1)}(z)(x \otimes v_j) = \sum_k w_{jk}[x|y]\,(y \otimes v_k), \tag{2.3}
\]
\[
w_{jk}[x|y] = \begin{cases}
q^{m-x_k} - q^{x_k+1}z & j = k, \\
(1 - q^{2x_k})\,q^{x_{k+1}+x_{k+2}+\cdots+x_{j-1}}\,z & j > k, \\
(1 - q^{2x_k})\,q^{m-(x_j+x_{j+1}+\cdots+x_k)} & j < k.
\end{cases} \tag{2.4}
\]
It is customary to attach the matrix element $w_{jk}[x|y]$ with a diagram like Fig. 1. Here $y = [y_i]$ is specified by the weight conservation as
\[
y_i = x_i + \delta_{ij} - \delta_{ik} \tag{2.5}
\]
in terms of x, j and k. At q = 0, the matrix element $w_{jk}[x|y]$ is nonzero if and only if $x \otimes v_j \simeq v_k \otimes y$ in the combinatorial R: $B_m \otimes B_1 \simeq B_1 \otimes B_m$, where it takes the value $z^H$, with 1 − H = winding number [18]. The fusion R matrix $R^{(m,1)}(z)$ reduces to R(z) in (2.1) for m = 1, and it satisfies the Yang–Baxter equation in $\mathrm{End}(V_m \otimes V \otimes V)$:
\[
R_{23}(z'/z)\,R^{(m,1)}_{13}(z')\,R^{(m,1)}_{12}(z) = R^{(m,1)}_{12}(z)\,R^{(m,1)}_{13}(z')\,R_{23}(z'/z). \tag{2.6}
\]
[Figure: a vertex with a horizontal arrow from x (left) to y (right) and a vertical arrow from j (top) to k (bottom).]
Fig. 1. Diagram for $w_{jk}[x|y]$.
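To make the case distinction in (2.4) concrete, the matrix elements can be coded directly. The sketch below is only an illustration: the function name `w` and the list encoding of $x = [x_1, \ldots, x_n]$ are our own, and the sample values of q and z are arbitrary. It also shows that for m = 1 the weights reduce to a(z), b(z), c(z)z and c(z) of (2.1).

```python
from fractions import Fraction

def w(j, k, x, q, z):
    """Matrix element w_jk[x|y] of (2.4); j and k are 1-based, x = [x_1, ..., x_n]."""
    m = sum(x)
    xk = x[k - 1]
    if j == k:
        return q ** (m - xk) - q ** (xk + 1) * z
    if j > k:
        # exponent x_{k+1} + ... + x_{j-1}
        return (1 - q ** (2 * xk)) * q ** sum(x[k:j - 1]) * z
    # j < k: exponent m - (x_j + ... + x_k)
    return (1 - q ** (2 * xk)) * q ** (m - sum(x[j - 1:k]))

q, z = Fraction(1, 2), Fraction(3)

# m = 1: x is a single letter and the weights are those of R(z) in (2.1)
print(w(1, 1, [1, 0, 0], q, z) == 1 - q * q * z)      # a(z)
print(w(2, 2, [1, 0, 0], q, z) == q * (1 - z))        # b(z)
print(w(2, 1, [1, 0, 0], q, z) == (1 - q * q) * z)    # c(z) z
print(w(1, 2, [0, 1, 0], q, z) == 1 - q * q)          # c(z)
```

A weight with $x_k = 0$ vanishes automatically through the factor $1 - q^{2x_k}$, which is how the constraint $y_k \ge 0$ in (2.5) is enforced.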
The R matrix $R^{(1,m)}(z) \in \mathrm{End}(V \otimes V_m)$ is similarly obtained as $R^{(1,m)}(z)(v_j \otimes x) = \sum_k \bar{w}_{jk}[x|y]\,(v_k \otimes y)$, where
\[
\bar{w}_{jk}[x|y] = \begin{cases}
q^{m-x_k} - q^{x_k+1}z & j = k, \\
(1 - q^{2x_k})\,q^{m-(x_k+x_{k+1}+\cdots+x_j)} & j > k, \\
(1 - q^{2x_k})\,q^{x_{j+1}+x_{j+2}+\cdots+x_{k-1}}\,z & j < k.
\end{cases}
\]
The inversion relation
\[
PR^{(1,m)}(z^{-1})\,PR^{(m,1)}(z) = (1 - q^{m+1}z)(1 - q^{m+1}z^{-1})\,\mathrm{Id} \tag{2.7}
\]
is valid.
2.2. L operator L(z)
Now we extract an L operator L(z) from a certain limit of $R^{(m,1)}(z)$. We illustrate the idea along the n = 3 case. The 3 by 3 matrix $(w_{ji}[x|y])_{1 \le i,j \le 3}$ with y chosen as (2.5) looks as
\[
\begin{pmatrix}
q^{x_2+x_3} - q^{x_1+1}z & (1 - q^{2x_1})z & (1 - q^{2x_1})q^{x_2}z \\
(1 - q^{2x_2})q^{x_3} & q^{x_1+x_3} - q^{x_2+1}z & (1 - q^{2x_2})z \\
1 - q^{2x_3} & (1 - q^{2x_3})q^{x_1} & q^{x_1+x_2} - q^{x_3+1}z
\end{pmatrix}.
\]
Throughout the paper we assume that |q| < 1. Consider the limit $m \to \infty$ with $x_1$ and $x_2$ kept fixed. Namely we take $x_3 \to \infty$ and stay in the vicinity of the lowest weight vector of $V_m$ as m goes to infinity. The above matrix simplifies to
\[
\begin{pmatrix}
-q^{x_1+1}z & (1 - q^{2x_1})z & (1 - q^{2x_1})q^{x_2}z \\
0 & -q^{x_2+1}z & (1 - q^{2x_2})z \\
1 & q^{x_1} & q^{x_1+x_2}
\end{pmatrix}. \tag{2.8}
\]
In the limit, the constraint $x_1 + x_2 \le m$ becomes void and the vector $x = [x_1, x_2, x_3] \in V_m$ gets effectively labeled as $[x_1, x_2]$ with arbitrary $x_1, x_2 \in \mathbb{Z}_{\ge 0}$. For generic (nonzero) $x_1$ and $x_2$, the (1, 2) element $(1 - q^{2x_1})z$ in (2.8), for example, is the matrix element of the transition $[x_1, x_2] \to [x_1 - 1, x_2 + 1]$ in view of (2.5). Similarly the (2, 3) element $(1 - q^{2x_2})z$ is the one for $[x_1, x_2] \to [x_1, x_2 - 1]$. Introducing the operators $P_2$ and $Q_2$ that act on $[x_1, x_2]$ as $P_2[x_1, x_2] = q^{x_2}[x_1, x_2]$ and $Q_2[x_1, x_2] = [x_1, x_2 + 1]$, the (2, 3) element of (2.8) is represented as $zQ_2^{-1}(1 - P_2^2)$. With the similar operators $P_1$ and $Q_1$ concerning the coordinate $x_1$, the matrix (2.8) is presented as
\[
\begin{pmatrix}
-zqP_1 & zQ_1^{-1}(1 - P_1^2)Q_2 & zQ_1^{-1}(1 - P_1^2)P_2 \\
0 & -zqP_2 & zQ_2^{-1}(1 - P_2^2) \\
Q_1 & P_1Q_2 & P_1P_2
\end{pmatrix}, \tag{2.9}
\]
where operators are all commutative except $P_iQ_i = qQ_iP_i$.
Motivated by these observations, we prepare for general n the Weyl algebra generated by the pairs $P_i^{\pm 1}, Q_i^{\pm 1}$ $(1 \le i \le n - 1)$ under the relations
\[
Q_iQ_j = Q_jQ_i, \quad P_iQ_j = q^{\delta_{ij}}Q_jP_i, \quad P_iP_j = P_jP_i, \quad Q_iQ_i^{-1} = Q_i^{-1}Q_i = 1, \quad P_iP_i^{-1} = P_i^{-1}P_i = 1. \tag{2.10}
\]
We actually consider a slight generalization of (2.9) containing parameters $a_1, \ldots, a_{n-1}$. Let $\mathcal{A}$ be the subalgebra of the Weyl algebra generated by
\[
P_i, \quad Q_i, \quad R_i = Q_i^{-1}(1 - a_iP_i^2), \qquad 1 \le i \le n - 1. \tag{2.11}
\]
We also use the subsidiary symbol $P'_i = -a_iqP_i$. The previous discussion corresponds to the $\forall a_i = 1$ case. The combination $R_i \in \mathcal{A}$ introduced here should not be confused with the R matrix. Then we define the operator $L(z) \in \mathcal{A} \otimes \mathrm{End}(V)$ by
\[
L(z) = \begin{pmatrix} L_{11}(z) & \cdots & L_{1n}(z) \\ \vdots & \ddots & \vdots \\ L_{n1}(z) & \cdots & L_{nn}(z) \end{pmatrix},
\]
where $L_{ij}(z) \in \mathcal{A}$ is given by ($P_{i,j} = P_iP_{i+1} \cdots P_j$ for $i \le j$)
\[
L(z)_{ii} = \begin{cases} zP'_i & i < n, \\ P_{1,n-1} & i = n, \end{cases}
\qquad
L(z)_{ij} = \begin{cases}
zR_iP_{i+1,j-1}Q_j & i < j < n, \\
zR_iP_{i+1,n-1} & i < j = n, \\
P_{1,j-1}Q_j & j < i = n, \\
0 & j < i < n.
\end{cases} \tag{2.12}
\]
This is an operator interpretation of $w_{ji}[x|y]$ (2.4) in the limit $x_n \to \infty$ deformed with $a_1, \ldots, a_{n-1}$. See (2.24). For example for $A^{(1)}_1$ and $A^{(1)}_2$, they read
\[
L(z) = \begin{pmatrix} zP'_1 & zR_1 \\ Q_1 & P_1 \end{pmatrix}, \qquad
L(z) = \begin{pmatrix} zP'_1 & zR_1Q_2 & zR_1P_2 \\ 0 & zP'_2 & zR_2 \\ Q_1 & P_1Q_2 & P_{1,2} \end{pmatrix}. \tag{2.13}
\]
The latter agrees with (2.9) when $\forall a_i = 1$. For $A^{(1)}_3$ one has
\[
L(z) = \begin{pmatrix}
zP'_1 & zR_1Q_2 & zR_1P_2Q_3 & zR_1P_{2,3} \\
0 & zP'_2 & zR_2Q_3 & zR_2P_3 \\
0 & 0 & zP'_3 & zR_3 \\
Q_1 & P_1Q_2 & P_{1,2}Q_3 & P_{1,3}
\end{pmatrix}. \tag{2.14}
\]
Our convention is $L(z)(\alpha \otimes v_j) = \sum_i (L_{ij}(z)\alpha) \otimes v_i$ for $\alpha \in \mathcal{A}$. Similarly we let $\overset{1}{L}(z), \overset{2}{L}(z) \in \mathcal{A} \otimes \mathrm{End}(V \otimes V)$ denote the operators acting as $\overset{1}{L}(z)(\alpha \otimes v_i \otimes v_j) = \sum_k (L_{ki}(z)\alpha) \otimes v_k \otimes v_j$ and $\overset{2}{L}(z)(\alpha \otimes v_i \otimes v_j) = \sum_k (L_{kj}(z)\alpha) \otimes v_i \otimes v_k$. As an analogue of the Yang–Baxter equation (2.6), we have
Proposition 2.1.
\[
R(z_2/z_1)\,\overset{2}{L}(z_2)\,\overset{1}{L}(z_1) = \overset{1}{L}(z_1)\,\overset{2}{L}(z_2)\,R(z_2/z_1) \in \mathcal{A} \otimes \mathrm{End}(V \otimes V).
\]
In Sec. 2.3, this will be proved based on the factorization of L(z).
2.3. Factorization of L(z)
Let us introduce the operators $K_i \in \mathcal{A} \otimes \mathrm{End}(V)$ for $1 \le i \le n - 1$ by
\[
K_i = ((K_i)_{j,k})_{1 \le j,k \le n}, \quad (K_i)_{i,i} = P'_i, \quad (K_i)_{i,n} = R_i, \quad (K_i)_{n,i} = Q_i, \quad (K_i)_{n,n} = P_i, \quad (K_i)_{j,j} = 1 \ (j \neq i, n). \tag{2.15}
\]
The other elements are zero. The $K_i$ with $\forall a_i = 1$ will be interpreted as the local propagation operator in the quantized box-ball system in Sec. 2.6. We also introduce an n by n matrix $D(z) = z\,\mathrm{diag}(1, \ldots, 1, z^{-1})$, which acts on V only.
Proposition 2.2. $L(z) = D(z)K_1K_2 \cdots K_{n-1}$.
For example the latter in (2.13) is expressed as
\[
\begin{pmatrix} zP'_1 & zR_1Q_2 & zR_1P_2 \\ 0 & zP'_2 & zR_2 \\ Q_1 & P_1Q_2 & P_1P_2 \end{pmatrix}
= \mathrm{diag}(z, z, 1)
\begin{pmatrix} P'_1 & 0 & R_1 \\ 0 & 1 & 0 \\ Q_1 & 0 & P_1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & P'_2 & R_2 \\ 0 & Q_2 & P_2 \end{pmatrix}.
\]
Proof. Denote the n by n matrix L(z = 1) defined by (2.12) by $L_n$. We are to show $K_1K_2 \cdots K_{n-1} = L_n$ for $A^{(1)}_{n-1}$. This is done by induction on n. The case n = 3 is checked in the above. Suppose the equality is valid for n. Then from the structure of the matrices $K_i$, one can evaluate $K_1K_2 \cdots K_n$ for $A^{(1)}_n$ as the product of $K_1$ and the rest as
\[
\begin{pmatrix} P'_1 & 0 & R_1 \\ 0 & \mathbb{1}_{n-1} & 0 \\ Q_1 & 0 & P_1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \ \cdots \ 0 \\ \begin{matrix} 0 \\ \vdots \\ 0 \end{matrix} & L_n^{+} \end{pmatrix}
= L_{n+1}. \tag{2.16}
\]
Here $L_n^{+}$ is $L_n$ with all the constituent operators $X_i$ $(X = P, P', Q, R)$ replaced by $X_{i+1}$, and $\mathbb{1}_{n-1}$ is the identity matrix of size n − 1. It is straightforward to verify this identity.
Remark 2.3. Elements of $\mathcal{A}$ contained in any single $L_{ij}(z)$ (2.12) are all commutative. As a result, the identity (2.16) holds under any interchange of $P_i$, $P'_i$, $Q_i$ and $R_i$ on both sides.
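Proposition 2.2 for n = 3 can also be checked mechanically. The sketch below is our own illustration under stated assumptions: it takes $\forall a_i = 1$ and z = 1, and lets the operators act on basis vectors $[m_1, m_2]$ as in Sec. 2.2, namely $P_i$ multiplies by $q^{m_i}$, $Q_i$ raises $m_i$ by one and $R_i = Q_i^{-1}(1 - P_i^2)$. All helper names (`diag`, `shift`, `compose`, `product_entry`) are hypothetical.

```python
from fractions import Fraction

q = Fraction(1, 3)   # sample value with |q| < 1, chosen only for this check

def diag(i, f):
    """Operator multiplying the basis vector [m1, m2] by f(m_i)."""
    return lambda s: {m: c * f(m[i]) for m, c in s.items() if c * f(m[i]) != 0}

def shift(i, d):
    """Operator sending m_i to m_i + d."""
    return lambda s: {m[:i] + (m[i] + d,) + m[i + 1:]: c for m, c in s.items()}

def compose(*ops):
    """compose(A, B)(s) = A(B(s))."""
    def op(s):
        for f in reversed(ops):
            s = f(s)
        return s
    return op

P = lambda i: diag(i, lambda m: q ** m)
Pp = lambda i: diag(i, lambda m: -q ** (m + 1))                          # P'_i = -q P_i
Q = lambda i: shift(i, +1)
R = lambda i: compose(shift(i, -1), diag(i, lambda m: 1 - q ** (2 * m)))  # Q_i^{-1}(1 - P_i^2)
ONE, ZERO = (lambda s: dict(s)), (lambda s: {})

K1 = [[Pp(0), ZERO, R(0)], [ZERO, ONE, ZERO], [Q(0), ZERO, P(0)]]
K2 = [[ONE, ZERO, ZERO], [ZERO, Pp(1), R(1)], [ZERO, Q(1), P(1)]]
L = [[Pp(0), compose(R(0), Q(1)), compose(R(0), P(1))],
     [ZERO, Pp(1), R(1)],
     [Q(0), compose(P(0), Q(1)), compose(P(0), P(1))]]

def product_entry(A, B, a, b, s):
    """Apply the (a, b) entry of the matrix product A B to the state s."""
    out = {}
    for c in range(3):
        for m, coeff in A[a][c](B[c][b](s)).items():
            out[m] = out.get(m, 0) + coeff
    return {m: v for m, v in out.items() if v != 0}

ok = all(product_entry(K1, K2, a, b, {(m1, m2): Fraction(1)}) == L[a][b]({(m1, m2): Fraction(1)})
         for a in range(3) for b in range(3) for m1 in range(3) for m2 in range(3))
print(ok)
```

Since D(1) is the identity, the comparison at z = 1 is exactly the statement $K_1K_2 = L_3$ used as the base of the induction.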
Let us make use of the factorization to prove Proposition 2.1. We first define $\sigma_1, \ldots, \sigma_{n-1}, \sigma \in \mathrm{End}(V)$ by
\[
\sigma_i v_j = \begin{cases} v_{i+1} & \text{if } j = i, \\ v_i & \text{if } j = i + 1, \\ v_j & \text{otherwise}, \end{cases}
\qquad \sigma = \sigma_{n-1}\sigma_{n-2} \cdots \sigma_1.
\]
Thus $\sigma v_j = v_{j-1}$ is valid for indices in $\mathbb{Z}/n\mathbb{Z}$. Consider the following gauge transformation of $K_i$:
\[
S_i = \sigma_i\sigma_{i+1} \cdots \sigma_{n-1}\,K_i\,\sigma_{n-1}\sigma_{n-2} \cdots \sigma_{i+1}, \qquad 1 \le i \le n - 1. \tag{2.17}
\]
The components of $S_i \in \mathcal{A} \otimes \mathrm{End}(V)$ are given by
\[
S_i = ((S_i)_{j,k})_{1 \le j,k \le n}, \quad (S_i)_{i,i+1} = P_i, \quad (S_i)_{i+1,i+1} = R_i, \quad (S_i)_{i,i} = Q_i, \quad (S_i)_{i+1,i} = P'_i, \quad (S_i)_{j,j} = 1 \ (j \neq i, i + 1). \tag{2.18}
\]
The other components are zero. Note that Proposition 2.2 is rewritten as
\[
L(z) = D(z)\,\sigma S_1S_2 \cdots S_{n-1}. \tag{2.19}
\]
Now Proposition 2.1 is a corollary of the formula (2.19) and
Lemma 2.4.
\[
R(z_2/z_1)\,(D(z_1)\sigma \otimes D(z_2)\sigma) = (D(z_1)\sigma \otimes D(z_2)\sigma)\,R(z_2/z_1),
\]
\[
R(z)\,\overset{2}{S}_i\,\overset{1}{S}_i = \overset{1}{S}_i\,\overset{2}{S}_i\,R(z), \qquad 1 \le i \le n - 1.
\]
Proof. The first relation is directly confirmed. It is enough to check the latter at two distinct values of z. It is trivially valid at z = 1 and easily checked at $z = q^{-2}$.
Remark 2.5. If $a_i = 1$, the property $K_i^2 = \mathbb{1}_n$ is valid for $1 \le i \le n - 1$. This is a remnant of the inversion relation (2.7). It implies $L(z)^{-1} = K_{n-1} \cdots K_1D(z^{-1})$. The formula (2.19) was known at q = 0 as a factorization of combinatorial R [12], where $S_i$ appeared as the Weyl group operator on crystal basis.
For $A^{(1)}_1$, the L operator here can also be obtained by specializing the q generic case of the one in [19]. The case $\forall a_i = 0$ has appeared in the quantized Lotka–Volterra model for $A^{(1)}_{n-1}$ [20].
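The involution property $K_i^2 = \mathbb{1}_n$ of Remark 2.5 is easy to test in the simplest case. The following sketch is an illustration under our own conventions: it treats $A^{(1)}_1$ with $a_1 = 1$, encodes a vector $[m] \otimes v_j$ as the pair `(m, j)`, and lets $P_1$, $Q_1$, $R_1$ act on $[m]$ by $q^m[m]$, $[m+1]$ and $(1 - q^{2m})[m-1]$ respectively. The function name `K` and the dictionary encoding are assumptions of this sketch.

```python
from fractions import Fraction

def K(state, q):
    """Local operator K_1 for A_1^(1) with a_1 = 1.

    state maps (m, j) to a coefficient, where m >= 0 is the carrier load and
    j = 1 (ball) or j = 2 (empty box). Uses K(alpha (x) v_j) = sum_i (K_ij alpha) (x) v_i.
    """
    out = {}
    for (m, j), c in state.items():
        if j == 1:
            terms = [((m, 1), -q ** (m + 1) * c),    # (K)_{1,1} = P'_1 = -q P_1
                     ((m + 1, 2), c)]                # (K)_{2,1} = Q_1
        else:
            terms = [((m, 2), q ** m * c)]           # (K)_{2,2} = P_1
            if m > 0:
                terms.append(((m - 1, 1), (1 - q ** (2 * m)) * c))   # (K)_{1,2} = R_1
        for key, val in terms:
            out[key] = out.get(key, 0) + val
    return {k: v for k, v in out.items() if v != 0}

q = Fraction(2, 5)
start = {(3, 1): Fraction(1)}
print(K(K(start, q), q) == start)
```

The cancellation relies on $R_1Q_1 = 1 - q^2P_1^2$, which follows from $P_1Q_1 = qQ_1P_1$.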
2.4. Quantized box-ball system: Space of states
Consider the formal infinite tensor product of $V = \mathbb{C}v_1 \oplus \cdots \oplus \mathbb{C}v_n$:
\[
\cdots \otimes V \otimes V \otimes V \otimes \cdots = \bigoplus \mathbb{C}(\cdots \otimes v_{j_{-1}} \otimes v_{j_0} \otimes v_{j_1} \otimes \cdots). \tag{2.20}
\]
An element of the form $c(\cdots \otimes v_{j_{-1}} \otimes v_{j_0} \otimes v_{j_1} \otimes \cdots)$ will be called a monomial (a monic monomial if c = 1). The space of states of our quantized box-ball system is the subspace of (2.20) given by
\[
\mathcal{P} = \Bigl\{ \sum_{p:\ \text{monic monomial}} c_p\,p \ \Bigm|\ \text{conditions (i) and (ii)} \Bigr\}, \tag{2.21}
\]
where (i) $\sum_{k \in \mathbb{Z}} |j_k - n| < \infty$ for any $p = \cdots \otimes v_{j_{-1}} \otimes v_{j_0} \otimes v_{j_1} \otimes \cdots$ appearing in the sum, (ii) there exists $N \in \mathbb{Z}$ such that $\lim_{q \to 0} q^N \sum_p c_p\,p = 0$. Monomials can be classified according to the numbers $w_1, \ldots, w_{n-1}$ of occurrence of the letters 1, ..., n − 1 in the set $\{j_k\}$. Consequently one has the direct sum decomposition:
\[
\mathcal{P} = \bigoplus \mathcal{P}_{w_1,w_2,\ldots,w_{n-1}}, \tag{2.22}
\]
where the sum runs over $(w_1, \ldots, w_{n-1}) \in \mathbb{Z}_{\ge 0}^{n-1}$. We have $\mathcal{P}_{0,\ldots,0} = \mathbb{C}p_{\mathrm{vac}}$, where $p_{\mathrm{vac}} = \cdots \otimes v_n \otimes v_n \otimes \cdots$. The local state $v_{j_k} \in V$ is regarded as the kth box containing a ball with color $j_k$ if $j_k \neq n$, and the empty box if $j_k = n$. The space of states of the box-ball system is the totality of the monomials in the above sense. The space of states $\mathcal{P}$ of our quantized box-ball system consists of linear superpositions thereof.
2.5. Time evolution
We set $\forall a_i = 1$ in the remainder of Sec. 2. Then the following provides an $\mathcal{A}$ module $\mathcal{M}$:
\[
\mathcal{M} = \bigoplus_{m_1,\ldots,m_{n-1} \in \mathbb{Z}_{\ge 0}} \mathbb{C}[m_1, \ldots, m_{n-1}],
\]
\[
P_i[\ldots, m_i, \ldots] = q^{m_i}[\ldots, m_i, \ldots], \qquad Q_i[\ldots, m_i, \ldots] = [\ldots, m_i + 1, \ldots], \tag{2.23}
\]
\[
R_i[\ldots, m_i, \ldots] = (1 - q^{2m_i})[\ldots, m_i - 1, \ldots],
\]
where the right-hand side of the last formula is to be understood as 0 at $m_i = 0$. The space $\mathcal{M}$ will be regarded as a quantization of the carrier known in the box-ball systems. By construction, for $x = [x_1, \ldots, x_{n-1}] \in \mathcal{M}$ one has
\[
L(z)(x \otimes v_j) = \sum_k W_{jk}[x|y]\,(y \otimes v_k), \qquad
W_{jk}[x|y] = \lim_{x_n \to \infty} w_{jk}[x_1, \ldots, x_{n-1}, x_n | y_1, \ldots, y_{n-1}, y_n], \tag{2.24}
\]
where y is determined from (2.5) in terms of j, k and x.
According to the standard construction of transfer matrices in two-dimensional solvable vertex models [21, 22], the time evolution $T(z): \mathcal{P} \to \mathcal{P}$ is constructed as a composition of local L operators as
\[
T(z) = \bigl( \cdots \overset{1}{L}(z)\,\overset{0}{L}(z)\,\overset{-1}{L}(z) \cdots \bigr)_{0,0}. \tag{2.25}
\]
Here $\overset{k}{L}(z) \in \mathrm{End}(\mathcal{M} \otimes \mathcal{P})$ signifies the representation of the L operator:
\[
\overset{k}{L}(z)\bigl(m \otimes (\cdots \otimes v_{j_{k-1}} \otimes v_{j_k} \otimes v_{j_{k+1}} \otimes \cdots)\bigr)
= \sum_i L_{ij_k}(z)m \otimes (\cdots \otimes v_{j_{k-1}} \otimes v_i \otimes v_{j_{k+1}} \otimes \cdots), \tag{2.26}
\]
where $L_{ij_k}(z)m$ for $m \in \mathcal{M}$ is specified by (2.23). The symbol $(\cdots)_{0,0}$ in (2.25) stands for the element in $\mathrm{End}(\mathcal{P})$ that is attached to the transition $[0, \ldots, 0] \mapsto [0, \ldots, 0]$ in the $\mathcal{M}$ part. By the definition T(z) preserves the weight subspace $\mathcal{P}_{w_1,w_2,\ldots,w_{n-1}}$ and acts homogeneously on it as
\[
T(z)p = z^{w_1+\cdots+w_{n-1}}\,T(1)p \quad \text{for } p \in \mathcal{P}_{w_1,w_2,\ldots,w_{n-1}}. \tag{2.27}
\]
Therefore the commutativity $T(z)T(z') = T(z')T(z)$ is trivially valid. Henceforth we concentrate on T = T(z = 1), and T(p) for $p \in \mathcal{P}$ is to be understood as T(1)p.
2.6. Factorized dynamics
The time evolution T admits a simple description as the product of propagation operators. Set
\[
\mathcal{K}_i = \bigl( \cdots \overset{1}{K}_i\,\overset{0}{K}_i\,\overset{-1}{K}_i \cdots \bigr)_{0,0} \in \mathrm{End}(\mathcal{P}), \qquad 1 \le i \le n - 1, \tag{2.28}
\]
where the representation $\overset{k}{K}_i \in \mathrm{End}(\mathcal{M} \otimes \mathcal{P})$ is specified from $K_i$ (2.15) in the same way as $\overset{k}{L}(z)$ was done via L(z). To interpret $\mathcal{K}_i$ pictorially, we attach the following diagrams to the local operator $K_i$.
[Figure: five vertices, one per row of the table. In each vertex the vertical arrow carries the box state from top to bottom and the horizontal arrow carries the carrier load $m_i$ from left to right.]
Operator on M    Box (top)   Box (bottom)   Carrier load          Amplitude
P_i              n           n              m_i -> m_i            q^{m_i}
R_i              n           i              m_i -> m_i - 1        1 - q^{2 m_i}
Q_i              i           n              m_i -> m_i + 1        1
-q P_i           i           i              m_i -> m_i            -q^{m_i + 1}
1                j           j              m_i -> m_i            1
Fig. 2. Diagram for $K_i$ ($j \neq i, n$).
Here $m_i \in \mathbb{Z}_{\ge 0}$ is a coordinate in $[m_1, \ldots, m_{n-1}] \in \mathcal{M}$. The horizontal and vertical arrows correspond to $\mathcal{M}$ and V, respectively. The diagrams depict the interaction between the local box and the quantum carrier containing $m_i$ balls of color i. The carrier coming from the left encounters the local box whose state is specified on the top. It picks up or puts down a color i ball, or does nothing, and proceeds to the right leaving the box in the state given in the bottom with the listed amplitudes. The first line in the figure gives the operators acting on $\mathcal{M}$ that yield the amplitudes on the last line. For example one has
\[
K_i([\ldots, m_i, \ldots] \otimes v_n) = (P_i[\ldots, m_i, \ldots]) \otimes v_n + (R_i[\ldots, m_i, \ldots]) \otimes v_i
= q^{m_i}[\ldots, m_i, \ldots] \otimes v_n + (1 - q^{2m_i})[\ldots, m_i - 1, \ldots] \otimes v_i.
\]
The second term describes unloading whereas the first term is just a passage. It is easy to see that at q = 0, $K_i$ reduces to the deterministic operator which coincides with the local interaction between a carrier and a box [13] in the conventional box-ball system [2, 1]. Now the composition (2.28) is expressed as Fig. 3.
[Figure: the carrier enters from the far left with load 0, passes the boxes with top states ..., n, n, $j_0$, $j_1$, n, n, ... and bottom states ..., n, n, $i_0$, $i_1$, n, n, ..., takes the intermediate loads $s_0, s_1, s_2, \ldots$, and leaves to the far right with load 0.]
Fig. 3. Diagram for $\mathcal{K}_i$ ($j \neq i, n$).
The amplitude of $\mathcal{K}_i$ assigned with the transition from $(\cdots \otimes v_{j_0} \otimes v_{j_1} \otimes \cdots)$ to $(\cdots \otimes v_{i_0} \otimes v_{i_1} \otimes \cdots)$ is obtained as the product of all the amplitudes attached to the local vertices in Fig. 3 according to the rule specified in Fig. 2. The calculation involves an infinite product, which is well defined for elements in $\mathcal{P}$. See Sec. 2.7 for examples of computations of the amplitudes.
Theorem 2.6. The time evolution of the quantized box-ball system admits a factorization into propagation operators as
\[
T = \mathcal{K}_1 \cdots \mathcal{K}_{n-1}. \tag{2.29}
\]
Proof. This is a consequence of the definitions (2.25), (2.28) and the factorization of the L operator established in Proposition 2.2.
At q = 0, Theorem 2.6 reduces to the original description of the time evolution in the box-ball system [2] as the composition of finer processes that move balls with a fixed color.
2.7. Some properties of amplitudes
For simplicity we concentrate on the $A^{(1)}_1$ case in the remainder of Sec. 2, where one only has one kind of ball and $T = \mathcal{K}_1$. However, by virtue of Theorem 2.6, all the essential statements are equally valid for general $A^{(1)}_{n-1}$ under an appropriate resetting. In particular, Proposition 2.7 and Proposition 2.9 remain valid not only for T but also for $\mathcal{K}_i$ for any $1 \le i \le n - 1$.
Let us write the action of the time evolution of a monic monomial $p \in \mathcal{P}$ as $T(p) = \sum_{p'} A_{p',p}\,p'$, where the sum is taken over monic monomials $p' \in \mathcal{P}$. We then define the transposition ${}^tT$ of T by ${}^tT(p) = \sum_{p'} A_{p,p'}\,p'$.
Proposition 2.7.
\[
{}^tT = T^{-1}. \tag{2.30}
\]
Proof. In view of Remark 2.5, the inverse $T^{-1} = \mathcal{K}_1^{-1}$ is obtained by reversing the horizontal arrows in Fig. 2 and sending the carrier from the right to the left correspondingly in Fig. 3. By using this fact, one can verify the claim. See also Remark 2.13.
Let ( , ) be the inner product such that $(p, p') = \delta_{p,p'}$ for all the monic monomials p and p'. It is well defined on a subset of $\mathcal{P} \times \mathcal{P}$. Then Proposition 2.7 tells that (T(r), T(s)) = (r, s) for (r, s) belonging to the subset. This property leads to a family of q-series identities. In fact one has $\sum_p A_{p,r}A_{p,s} = \delta_{rs}$ for any monic monomials r and s. Pick the monomial $p = \cdots \otimes v_2 \otimes v_1 \otimes v_2 \otimes \cdots$ for instance. Then the left-hand side of (T(p), T(p)) = 1, the sum of squared amplitudes, is calculated as
\[
(-q)^2 + \sum_{k \ge 0} \bigl(q^k(1 - q^2)\bigr)^2 = 1.
\]
Similarly for the monomial $p = \cdots \otimes v_2 \otimes v_1 \otimes v_1 \otimes v_2 \otimes \cdots$, the contributions to (T(p), T(p)) = 1 are grouped into the four cases as in Fig. 4, which add up to 1. Here the symbols • and ◦ stand for a ball $v_1$ and an empty box $v_2$, respectively. The symbol ··· represents an array of empty boxes of the specified number. In each group, the upper configuration is p and the lower one is a monomial occurring in T(p).
[Figure: four groups of configurations with their squared amplitudes.]
(a) Both balls stay: $(-q)^4$.
(b) The left ball stays and the right ball is carried past k empty boxes: $(-q)^2 \sum_{k \ge 0} (q^k)^2(1 - q^2)^2 = q^2(1 - q^2)$.
(c) The left ball is picked up, the right ball stays, and the carried ball is put down after k empty boxes: $(-q^2)^2 \sum_{k \ge 0} (q^k)^2(1 - q^2)^2 = q^4(1 - q^2)$.
(d) Both balls are picked up and put down after $k_1$ and a further $k_2$ empty boxes: $\sum_{k_1,k_2 \ge 0} (q^{2k_1+k_2})^2(1 - q^2)^2(1 - q^4)^2 = (1 - q^2)(1 - q^4)$.
Fig. 4. Squared amplitudes for T(p).
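The sums of squared amplitudes quoted above can be reproduced by sending the carrier through a finite window of boxes with the local rules of Fig. 2. The following Python sketch is an illustration only: it treats the $A^{(1)}_1$ case, encodes a ball as 1 and an empty box as 0, truncates the infinite array to a finite window (so the sums are approximate up to terms of order $q^{\text{window}}$), and uses floating-point arithmetic. The function name `evolve` and the encoding are our own assumptions.

```python
def evolve(config, q):
    """One time step T = K_1 for A_1^(1) on a finite window.

    config is a tuple of 0/1 (1 = ball v_1, 0 = empty box v_2). Returns a dict
    {new configuration: amplitude}, keeping only the histories in which the
    carrier enters and leaves the window empty.
    """
    states = {(0, ()): 1.0}     # (carrier load m, boxes already written) -> amplitude
    for box in config:
        new = {}
        for (m, done), amp in states.items():
            if box == 1:
                moves = [(m + 1, 0, 1.0),                 # Q_1: pick the ball up
                         (m, 1, -q ** (m + 1))]           # -q P_1: leave the ball
            else:
                moves = [(m, 0, q ** m)]                  # P_1: pass the empty box
                if m > 0:
                    moves.append((m - 1, 1, 1 - q ** (2 * m)))   # R_1: put a ball down
            for m2, out, weight in moves:
                key = (m2, done + (out,))
                new[key] = new.get(key, 0.0) + amp * weight
        states = new
    return {done: amp for (m, done), amp in states.items() if m == 0}

q = 0.5
one_ball = evolve((1,) + (0,) * 40, q)
two_balls = evolve((1, 1) + (0,) * 60, q)
print(sum(a * a for a in one_ball.values()))
print(sum(a * a for a in two_balls.values()))
```

Both printed values should be close to 1, with the deviation controlled by the window length.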
So far we have considered the quadratic form ( , ). Now we turn to a linear one. We use the standard notation
\[
(z)_m = (z; q)_m = (1 - z)(1 - zq) \cdots (1 - zq^{m-1}), \qquad
\begin{bmatrix} m \\ k \end{bmatrix} = \frac{(q)_m}{(q)_k(q)_{m-k}}.
\]
For $t \le \min(l, m)$, let $\beta_{m,t,l}$ be the sum of all the amplitudes for l successive vacant boxes to acquire t balls during the passage of a carrier containing m balls. Namely, it is the sum of the amplitudes for Fig. 5 over $1 \le i_1 < i_2 < \cdots < i_t \le l$.
[Figure: a carrier with m balls passes l vacant boxes numbered 1, ..., l, puts balls down in the boxes $i_1, \ldots, i_t$ and leaves with m − t balls.]
Fig. 5. $\beta_{m,t,l}$.
Lemma 2.8.
\[
\beta_{m,t,l} = q^{(m-t)(l-t)}(1 - q^{2m})(1 - q^{2m-2}) \cdots (1 - q^{2(m-t+1)}) \begin{bmatrix} l \\ t \end{bmatrix}. \tag{2.31}
\]
Proof. The contribution from Fig. 5 is
\[
(1 - q^{2m})(1 - q^{2m-2}) \cdots (1 - q^{2(m-t+1)}) \times q^{m(i_1-1)+(m-1)(i_2-i_1-1)+\cdots+(m-t+1)(i_t-i_{t-1}-1)+(m-t)(l-i_t)}. \tag{2.32}
\]
The claim follows by summing this over $1 \le i_1 < i_2 < \cdots < i_t \le l$.
Let $\mathcal{P}_{\mathrm{fin}}$ be the subspace of $\mathcal{P}$ spanned by the superpositions of monomials $\sum_p c_p\,p$ in which $\sum_p c_p$ exists. For instance, monomials are elements of $\mathcal{P}_{\mathrm{fin}}$. Consider the linear function $\mathcal{N}: \mathcal{P}_{\mathrm{fin}} \to \mathbb{C}$ that takes value 1 on all the monic monomials.
Proposition 2.9. T preserves $\mathcal{N}$, i.e. $\mathcal{N}(T(p)) = \mathcal{N}(p)$ for any $p \in \mathcal{P}_{\mathrm{fin}}$.
For example for $p = \cdots v_2 \otimes v_1 \otimes v_2 \otimes \cdots$, one has
\[
\mathcal{N}(T(p)) = -q + \sum_{k \ge 0} q^k(1 - q^2) = 1.
\]
For $p = \cdots v_2 \otimes v_1 \otimes v_1 \otimes v_2 \otimes \cdots$ considered in Fig. 4, one has
\[
\mathcal{N}(T(p)) = (-q)^2 - q\sum_{k \ge 0} q^k(1 - q^2) - q^2\sum_{k \ge 0} q^k(1 - q^2) + \sum_{k_1,k_2 \ge 0} q^{2k_1+k_2}(1 - q^2)(1 - q^4) = 1.
\]
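The closed form (2.31) of Lemma 2.8 can be compared with the defining sum over drop positions $1 \le i_1 < \cdots < i_t \le l$. The sketch below is our own illustration, with hypothetical function names `qpoch`, `qbinom`, `beta_closed` and `beta_brute`; it uses the local amplitudes $q^{\text{load}}$ for passing an empty box and $1 - q^{2\,\text{load}}$ for putting a ball down, and exact rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations

def qpoch(z, q, m):
    """(z; q)_m = (1 - z)(1 - z q) ... (1 - z q^(m-1))."""
    out = Fraction(1)
    for s in range(m):
        out *= 1 - z * q ** s
    return out

def qbinom(m, k, q):
    """q-binomial coefficient (q)_m / ((q)_k (q)_(m-k))."""
    return qpoch(q, q, m) / (qpoch(q, q, k) * qpoch(q, q, m - k))

def beta_closed(m, t, l, q):
    """Right-hand side of (2.31)."""
    pref = Fraction(1)
    for s in range(t):
        pref *= 1 - q ** (2 * (m - s))
    return q ** ((m - t) * (l - t)) * pref * qbinom(l, t, q)

def beta_brute(m, t, l, q):
    """Sum of the amplitudes (2.32) over all choices of the t boxes receiving a ball."""
    total = Fraction(0)
    for drops in combinations(range(1, l + 1), t):
        amp, load = Fraction(1), m
        for box in range(1, l + 1):
            if box in drops:
                amp *= 1 - q ** (2 * load)   # put one ball down
                load -= 1
            else:
                amp *= q ** load             # pass the empty box
        total += amp
    return total

q = Fraction(1, 3)
print(beta_closed(3, 2, 4, q) == beta_brute(3, 2, 4, q))
```

The agreement reflects the fact that the sum over the gaps between drop positions is the generating function of partitions in a t by (l − t) box.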
The remainder of Sec. 2.7 is devoted to a proof of Proposition 2.9. We begin by introducing a map $\Phi_m$ for $m \in \mathbb{Z}_{\ge 0}$, which is a slight generalization of T. We set $\Phi_0 = T$. For $m \ge 1$, $\Phi_m$ acts on $\mathcal{P}_{\mathrm{fin}} \setminus \{p_{\mathrm{vac}}\}$ as follows. Pick any monic monomial $p \in \mathcal{P}_{\mathrm{fin}} \setminus \{p_{\mathrm{vac}}\}$ and decompose it uniquely as $p = p_{\mathrm{left}} \otimes p_{\mathrm{right}}$, so that $p_{\mathrm{left}}$ is free of balls and the leftmost component of $p_{\mathrm{right}}$ is a ball. Let $p'_{\mathrm{right}}$ be the linear combination of the monic monomials generated by the penetration of the carrier initially containing m balls through $p_{\mathrm{right}}$ to the right. See Fig. 6.
[Figure: the carrier containing m balls enters $p_{\mathrm{right}}$ (upper row, starting with a ball) from the left and produces $p'_{\mathrm{right}}$ (lower row).]
Fig. 6. $p'_{\mathrm{right}}$.
We set $\Phi_m(p) = p_{\mathrm{left}} \otimes p'_{\mathrm{right}}$ and extend it linearly to the map $\Phi_m: \mathcal{P}_{\mathrm{fin}} \setminus \{p_{\mathrm{vac}}\} \to \mathcal{P}_{\mathrm{fin}} \setminus \{p_{\mathrm{vac}}\}$. It is a direct sum of the action $\mathcal{P}_{\mathrm{fin},N} \to \mathcal{P}_{\mathrm{fin},N+m}$ over $N \in \mathbb{Z}_{\ge 1}$, where the notation $\mathcal{P}_{\mathrm{fin},N}$ is the n = 2 case of (2.22) restricted to $\mathcal{P}_{\mathrm{fin}}$. Proposition 2.9 is obvious for $p = p_{\mathrm{vac}}$. Since $\mathcal{N}$ is linear, the other case follows from the m = 0 case of
Proposition 2.10. For any monic monomial $p \in \mathcal{P}_N$,
\[
\mathcal{N}(\Phi_m(p)) = (1 + q)(1 + q^2) \cdots (1 + q^m) \tag{2.33}
\]
is valid for any $m \ge 0$ and $N \ge 1$.
The right-hand side depends on m but not on N, hence it will be denoted by $\alpha_m$. Note that $\alpha_m = \beta_{m,m,\infty}$.
Proof. We show (2.33) by induction on N. For N = 1, the relevant configurations either accommodate a ball or not just below the initial one. The former contributes $-q^{m+1}\beta_{m,m,\infty}$ to $\mathcal{N}(\Phi_m(p))$ and the latter does $\beta_{m+1,m+1,\infty}$. The two contributions indeed sum up to $\alpha_m$. Assume the claim for N. In the monic monomial $p \in \mathcal{P}_{N+1}$, suppose there are l empty boxes between the leftmost ball and its nearest neighbor. The configurations that accommodate t balls in the l boxes are classified into the two cases in Fig. 7. Accordingly we have the recursion relation
\[
\mathcal{N}(\Phi_m(p)) = -q^{m+1}\sum_{t=0}^{\min(l,m)} \beta_{m,t,l}\,\mathcal{N}(\Phi_{m-t}(\tilde{p})) + \sum_{t=0}^{\min(l,m+1)} \beta_{m+1,t,l}\,\mathcal{N}(\Phi_{m+1-t}(\tilde{p})), \tag{2.34}
\]
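[Figure: two cases. Upper: the leftmost ball stays, the carrier keeps m balls, puts t balls into the l empty boxes and continues with m − t balls. Lower: the leftmost ball is picked up, the carrier holds m + 1 balls, puts t balls into the l empty boxes and continues with m + 1 − t balls.]
Fig. 7. Two kinds of contributions to $\mathcal{N}(\Phi_m(p))$.
where $\tilde{p} \in \mathcal{P}_N$ is the monic monomial obtained by removing the leftmost ball from p. Thus we are done if
\[
\alpha_m = -q^{m+1}\sum_{t=0}^{\min(l,m)} \beta_{m,t,l}\,\alpha_{m-t} + \sum_{t=0}^{\min(l,m+1)} \beta_{m+1,t,l}\,\alpha_{m+1-t}
\]
is shown. This is a corollary of Lemma 2.11.
Lemma 2.11. Let $l, m \in \mathbb{Z}_{\ge 0}$. Then
\[
\alpha_m = \sum_{t=0}^{\min(l,m)} \beta_{m,t,l}\,\alpha_{m-t}.
\]
Proof. We are to show
\[
1 = \sum_{t=0}^{\min(l,m)} q^{(l-t)(m-t)} \frac{(q)_l(q)_m}{(q)_t(q)_{l-t}(q)_{m-t}}.
\]
Since both sides are symmetric with respect to l and m, we assume with no loss of generality that $l \le m$. Applying the q-binomial identity $(z; q)_t = \sum_{s=0}^{t} \begin{bmatrix} t \\ s \end{bmatrix} (-z)^s q^{s(s-1)/2}$, we expand the factor $(q)_m/(q)_{m-t} = (q^{m-t+1}; q)_t$. Then the right-hand side becomes
\[
\sum_{t=0}^{l}\sum_{s=0}^{t} (-1)^s q^{(m-t)(l-t+s)+s(s+1)/2} \frac{(q)_l}{(q)_{l-t}(q)_s(q)_{t-s}}.
\]
By eliminating t by setting t = s + i, this is written as
\[
\sum_{i=0}^{l} \begin{bmatrix} l \\ i \end{bmatrix} q^{(m-i)(l-i)} \sum_{s=0}^{l-i} \begin{bmatrix} l - i \\ s \end{bmatrix} (-q^{i-l+1})^s q^{s(s-1)/2}.
\]
The q-binomial identity tells us that the sum over s is equal to $(q^{i-l+1}; q)_{l-i} = \delta_{il}$.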
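The identity established in this proof is a finite statement about rational functions of q, so it can be spot-checked exactly. The sketch below is only an illustration; the function names `qfact` (for $(q)_m = (q; q)_m$) and `lemma_sum` are our own, and the value of q is arbitrary.

```python
from fractions import Fraction

def qfact(m, q):
    """(q)_m = (1 - q)(1 - q^2) ... (1 - q^m)."""
    out = Fraction(1)
    for s in range(1, m + 1):
        out *= 1 - q ** s
    return out

def lemma_sum(l, m, q):
    """Sum over t of q^((l-t)(m-t)) (q)_l (q)_m / ((q)_t (q)_(l-t) (q)_(m-t))."""
    return sum(
        q ** ((l - t) * (m - t)) * qfact(l, q) * qfact(m, q)
        / (qfact(t, q) * qfact(l - t, q) * qfact(m - t, q))
        for t in range(min(l, m) + 1)
    )

q = Fraction(2, 7)
print(all(lemma_sum(l, m, q) == 1 for l in range(6) for m in range(6)))
```

The check also makes the symmetry under the exchange of l and m, used in the proof, visible at a glance.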
Remark 2.12. Set $u = \sum_p p$, where the sum extends over all the monic monomials in $P_N$ for any $N \ge 0$. Then Propositions 2.7 and 2.9 imply that $T(u) = u$. Conversely, this property and Proposition 2.7 imply Proposition 2.9 since $N(p) = (p, u) = (T(p), T(u)) = (T(p), u) = N(T(p))$.

2.8. Bethe ansatz

Consider the commuting family of transfer matrices $T_m(z)$ ($m \in \mathbb{Z}_{\ge 1}$) constructed from the fusion R matrix $R^{(m,1)}(z)$ (2.3). Normalize them so that $T_m(z) p_{\rm vac} = p_{\rm vac}$. Then the time evolution $T$ of our quantized box-ball system belongs to the family as $T = T_\infty(1)$. It therefore shares the eigenvectors with the simplest one $T_1(z)$, which corresponds to the well-known six-vertex model [21]. A slight peculiarity here is that we work on $P$, which implies an infinite system from the outset under a fixed boundary condition. The Bethe ansatz result is adapted to such a circumstance as follows:
\[
T_m(z)\,|\xi_1,\ldots,\xi_N\rangle_B = \lambda_m(z,\xi_1)\cdots\lambda_m(z,\xi_N)\,|\xi_1,\ldots,\xi_N\rangle_B ,
\]
\[
|\xi_1,\ldots,\xi_N\rangle_B = \sum_{i_1<\cdots<i_N} C_{i_1,\ldots,i_N}(\xi_1,\ldots,\xi_N)\,|i_1,\ldots,i_N\rangle ,
\]
\[
C_{i_1,\ldots,i_N}(\xi_1,\ldots,\xi_N) = \sum_{P\in S_N} \mathrm{sign}(P) \Bigl(\prod_{j<k} A_{P_j,P_k}\Bigr)\, \xi_{P_1}^{i_1}\cdots\xi_{P_N}^{i_N} ,
\]
\[
A_{j,k} = q\eta_j - q^{-1}\eta_k , \qquad \eta_i = \frac{1-q\xi_i}{\xi_i-q} , \qquad \lambda_m(z,\xi_i) = \frac{q^m+\eta_i z}{1+q^m\eta_i z} ,
\]
where $N$ is an arbitrary nonnegative integer, $|\cdots\rangle_B \in P_N$ is the joint Bethe eigenvector, and $|i_1,\ldots,i_N\rangle$ is the monic monomial describing the ball configuration at positions $i_1,\ldots,i_N$. The sum over $P$ runs over the symmetric group $S_N$, and $\mathrm{sign}(P) = \pm 1$ denotes the signature of $P$. The above result holds for $q \in \mathbb{R}$ such that $-1 < q < 1$ and $z \in \mathbb{C}$ such that $|zq| < 1$. The parameters $\xi_1,\ldots,\xi_N$ should be all distinct for the Bethe vector not to vanish. They are to be taken from $\exp(\sqrt{-1}\,\mathbb{R})$ to match the condition (ii) in (2.21), but are otherwise arbitrary and not constrained by any Bethe equation. One sees that $\lambda_m(z,\xi_i)$ tends to $\eta_i z$ in the limit $q^m \to 0$, in agreement with (2.27) with $n = 2$. The one-particle eigenvalue $\lambda_m(z) = \lambda_m(z,\xi_i)$ satisfies the degenerate T system $\lambda_m(zq)\lambda_m(zq^{-1}) = \lambda_{m+1}(z)\lambda_{m-1}(z)$. Except for the obvious $N = 1$ case, it is not known to us whether the property $T(u) = u$ in Remark 2.12 can be deduced from the Bethe ansatz result quoted here.

Remark 2.13. In terms of $T_m(z)$ considered here and its transposition defined similarly to Sec. 2.7, Proposition 2.7 is the $m \to \infty$ case of ${}^tT_m(z^{-1}) = T_m(z)^{-1}$ derivable from the inversion relation (2.7).
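The two statements made in Sec. 2.8 about the one-particle eigenvalue, namely the degenerate T system and the limit $q^m \to 0$, can be checked directly from the closed formula for $\lambda_m$. The sketch below is illustrative only: the function names are hypothetical, and a rational value of $\xi$ is used purely so that the arithmetic is exact (the text takes $\xi$ on the unit circle, which does not affect these algebraic identities).

```python
from fractions import Fraction

def eta(q, xi):
    """eta = (1 - q*xi) / (xi - q), as in Sec. 2.8."""
    return (1 - q * xi) / (xi - q)

def lam_t(t, z, e):
    """One-particle eigenvalue with q^m replaced by a free variable t."""
    return (t + e * z) / (1 + t * e * z)

def lam(m, z, q, e):
    """lambda_m(z) = (q^m + eta z) / (1 + q^m eta z)."""
    return lam_t(q**m, z, e)

q, xi, z = Fraction(1, 3), Fraction(2), Fraction(1, 2)  # illustrative test values
e = eta(q, xi)

# Degenerate T system: lambda_m(zq) lambda_m(z/q) = lambda_{m+1}(z) lambda_{m-1}(z)
t_system_ok = all(
    lam(m, z * q, q, e) * lam(m, z / q, q, e) == lam(m + 1, z, q, e) * lam(m - 1, z, q, e)
    for m in range(1, 6)
)
print(t_system_ok)

# Limit q^m -> 0, taken by setting t = 0
print(lam_t(Fraction(0), z, e) == e * z)
```

Treating $q^m$ as the independent variable `t` makes the limit an exact evaluation at `t = 0` rather than a numerical approximation at large `m`.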
January 12, 2005 14:5 WSPC/148-RMP
1242
00225
R. Inoue, A. Kuniba and M. Okado
3. $D_n^{(1)}$ Case

3.1. R matrix R(z)

Let $J = \{1, 2, \ldots, n, -n, -n+1, \ldots, -1\}$ be the set equipped with an order $1 \prec 2 \prec \cdots \prec n \prec -n \prec \cdots \prec -2 \prec -1$. In the following, elements of $2n \times 2n$ matrices with indices from $J$ are arranged in increasing order with respect to $\prec$ from the top left. We use the notation
\[
\bar{i} = \begin{cases} i & i > 0 , \\ i + 2n & i < 0 , \end{cases} \qquad \xi = q^{2n-2} . \tag{3.1}
\]
Let $V = \oplus_{\mu\in J}\,\mathbb{C}v_\mu$ be the vector representation of $U_q(D_n^{(1)})$. The R matrix $R(z) \in \mathrm{End}(V \otimes V)$ was obtained in [23, 24]. Here we start with the following convention:
\[
R(z) = a(z)\sum_{k} E_{kk}\otimes E_{kk} + b(z)\sum_{j\neq k} E_{jj}\otimes E_{kk} + c(z)\Bigl( z\sum_{j\prec k} E_{kj}\otimes E_{jk} + \sum_{j\succ k} E_{kj}\otimes E_{jk} \Bigr) + (z-1)(1-q)\sum_{j,k} f_{jk}(z)\,E_{jk}\otimes E_{-j,-k} , \tag{3.2}
\]
where the sums extend over $J$ and $E_{ij}v_k = \delta_{jk}v_i$. Here
\[
a(z) = (1-q^2z)(1-\xi z) , \qquad b(z) = q(1-z)(1-\xi z) , \qquad c(z) = (1-q^2)(1-\xi z) ,
\]
\[
f_{jk}(z) = \begin{cases} q + \xi z & j = k , \\ (1+q)(-1)^{j+k} q^{\bar{k}-\bar{j}} & j \prec k , \\ (1+q)(-1)^{j+k} q^{\bar{k}-\bar{j}}\,\xi z & j \succ k . \end{cases} \tag{3.3}
\]
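To make the convention (3.2)–(3.3) concrete, the following sketch assembles the matrix elements of $R(z)$ over exact rationals and compares a few of them with closed forms. It is an illustration under stated assumptions, not the authors' code: the function name `build_R`, the dictionary layout, and the test values are hypothetical, and the sums in (3.2) are read as $j \neq k$, $j \prec k$, $j \succ k$ with $z$ attached to the $j \prec k$ part. The closed forms used for comparison are the $m = 1$ cases of (A.2) and (A.7) in Appendix A, together with the elementary fact that at $z = 1$ the formula collapses to a multiple of the transposition $v_j \otimes v_k \mapsto v_k \otimes v_j$.

```python
from fractions import Fraction

def build_R(n, q, z):
    """Elements of R(z) as {((out1, out2), (in1, in2)): value}; also returns J and xi."""
    J = list(range(1, n + 1)) + list(range(-n, 0))    # 1, ..., n, -n, ..., -1
    pos = {mu: r for r, mu in enumerate(J)}           # position in the order "prec"
    bar = lambda i: i if i > 0 else i + 2 * n         # the map i -> bar(i) of (3.1)
    xi = q ** (2 * n - 2)
    a = (1 - q * q * z) * (1 - xi * z)
    b = q * (1 - z) * (1 - xi * z)
    c = (1 - q * q) * (1 - xi * z)

    def f(j, k):
        if j == k:
            return q + xi * z
        sign = -1 if (j + k) % 2 else 1
        val = (1 + q) * sign * q ** (bar(k) - bar(j))
        return val if pos[j] < pos[k] else val * xi * z

    R = {}
    def add(out, inp, val):
        R[(out, inp)] = R.get((out, inp), 0) + val

    for j in J:
        for k in J:
            if j == k:
                add((k, k), (k, k), a)                                 # a(z) E_kk x E_kk
            else:
                add((j, k), (j, k), b)                                 # b(z) E_jj x E_kk
                add((k, j), (j, k), c * z if pos[j] < pos[k] else c)   # c(z) E_kj x E_jk
            add((j, -j), (k, -k), (z - 1) * (1 - q) * f(j, k))         # f_jk E_jk x E_{-j,-k}
    return J, R, xi

n, q, z = 4, Fraction(1, 3), Fraction(2, 5)   # illustrative test values
J, R, xi = build_R(n, q, z)
i = 2
print(R[((i, -i), (i, -i))] == (1 - z) * (q**2 - xi * z))
print(R[((-i, i), (i, -i))] == (1 - q**2) * (1 - xi * z + q**(2 * i - 2) * (z - 1)) * z)
```

The dictionary keys are read as "output pair, input pair", so `R[((k, j), (j, k))]` is the coefficient of $v_k \otimes v_j$ in $R(z)(v_j \otimes v_k)$.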
The R matrix satisfies the Yang–Baxter equation. We denote by $\sigma$ the automorphism of $V$ acting as $\sigma v_{\pm 1} = v_{\mp 1}$, $\sigma v_{\pm n} = v_{\mp n}$, and $\sigma v_\mu = v_\mu$ for $\mu \neq \pm 1, \pm n$.

3.2. Fusion R matrix and its limit

As in the $A_{n-1}^{(1)}$ case, we set $V_1 = V$ and realize the space $V_m$ of the $m$-fold q-symmetric tensors as the quotient $V^{\otimes m}/A$, where $A = \sum_j V^{\otimes j} \otimes \mathrm{Im}\,PR(q^{-2}) \otimes V^{\otimes m-2-j}$. The basis of $\mathrm{Im}\,PR(q^{-2})$ can be taken as
\[
v_i\otimes v_j - q\,v_j\otimes v_i \quad \text{for } i \prec j,\ i \neq \pm j ,
\]
\[
v_1\otimes v_{-1} - q^2 v_{-1}\otimes v_1 , \qquad v_n\otimes v_{-n} - v_{-n}\otimes v_n ,
\]
\[
v_j\otimes v_{-j} - v_{-j}\otimes v_j - q\,v_{-j-1}\otimes v_{j+1} + q^{-1} v_{j+1}\otimes v_{-j-1} \quad \text{for } 1 \le j \le n-1 .
\]
A vector of the form $v_{i_1}\otimes v_{i_2}\otimes\cdots\otimes v_{i_m}$ is called normal ordered if $-1 \succeq i_1 \succeq \cdots \succeq i_m \succeq 1$ and the sequence $i_1,\ldots,i_m$ does not contain the letters $n$ and $-n$ simultaneously. The set of normal ordered vectors $v_{i_1}\otimes v_{i_2}\otimes\cdots\otimes v_{i_m}$ mod $A$ form the basis of $V_m$. We label them as $x = [x_1,\ldots,x_n,x_{-n},\ldots,x_{-1}]$, where $x_i \in \mathbb{Z}_{\ge 0}$
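is the number of the letter $i$ in the sequence $i_1,\ldots,i_m$. Thus $x_1+\cdots+x_{-1} = m$ and $x_n x_{-n} = 0$ hold in accordance with the label in [25]. In $V^{\otimes m}$ normal ordering is done according to the local rule mod $\mathrm{Im}\,PR(q^{-2})$:
\[
\begin{aligned}
v_1\otimes v_{-1} &= q^2 v_{-1}\otimes v_1 , \\
v_i\otimes v_j &= q\,v_j\otimes v_i \qquad i \prec j,\ i \neq \pm j , \\
v_j\otimes v_{-j} &= q^2 v_{-j}\otimes v_j - (1-q^2)\sum_{i=1}^{j-1} (-q)^{j-i}\, v_{-i}\otimes v_i \qquad 2 \le j \le n-1 , \\
v_n\otimes v_{-n} &= v_{-n}\otimes v_n = -\sum_{i=1}^{n-1} (-q)^{n-i}\, v_{-i}\otimes v_i .
\end{aligned} \tag{3.4}
\]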
Then the fusion R matrix $R^{(m,1)}(z)$ is the restriction of the operator (2.2) to $\mathrm{End}(V_m \otimes V)$. For $x \in V_m$ and $\mu \in J$ we set
\[
R^{(m,1)}(z)(x\otimes v_\mu) = \sum_{\nu\in J,\ y\in V_m} w_{\mu\nu}[x|y]\,(y\otimes v_\nu) . \tag{3.5}
\]
Due to the weight conservation the matrix element $w_{\mu\nu}[x|y]$ is zero unless
\[
\mathrm{wt}(x) + \mathrm{wt}(v_\mu) = \mathrm{wt}(y) + \mathrm{wt}(v_\nu) , \tag{3.6}
\]
where the weights may be regarded as elements in $\mathbb{Z}^n$ by
\[
\mathrm{wt}([x_1,\ldots,x_n,x_{-n},\ldots,x_{-1}]) = (x_1-x_{-1},\ldots,x_n-x_{-n}) , \qquad
\mathrm{wt}(v_\mu) = (0,\ldots,0,\overset{|\mu|\text{th}}{\pm 1},0,\ldots,0) \ \text{ for } \pm\mu > 0 . \tag{3.7}
\]
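The weight bookkeeping in (3.6)–(3.7) is easy to mechanize. The sketch below is illustrative: the dictionary encoding of the label $x$ and the helper names are assumptions, and the sample transitions are of the kind $x \to x + (j) - (k)$ and $x \to x - (-n) + (-k)$ that appear in Proposition 3.1.

```python
def wt_state(x, n):
    """wt([x_1..x_n, x_{-n}..x_{-1}]) = (x_1 - x_{-1}, ..., x_n - x_{-n}), cf. (3.7)."""
    return tuple(x.get(i, 0) - x.get(-i, 0) for i in range(1, n + 1))

def wt_vec(mu, n):
    """wt(v_mu): +1 or -1 in the |mu|-th slot according to the sign of mu."""
    return tuple((1 if mu > 0 else -1) if i == abs(mu) else 0
                 for i in range(1, n + 1))

def shift(x, plus=(), minus=()):
    """Copy of x with one letter added per entry of `plus` and removed per entry of `minus`."""
    y = dict(x)
    for i in plus:
        y[i] = y.get(i, 0) + 1
    for i in minus:
        y[i] = y.get(i, 0) - 1
    return y

def conserved(x, mu, y, nu, n):
    """True when wt(x) + wt(v_mu) = wt(y) + wt(v_nu), i.e. condition (3.6)."""
    lhs = tuple(a + b for a, b in zip(wt_state(x, n), wt_vec(mu, n)))
    rhs = tuple(a + b for a, b in zip(wt_state(y, n), wt_vec(nu, n)))
    return lhs == rhs

n = 4
x = {1: 2, 2: 1, -4: 5, -3: 1, -2: 2, -1: 1}   # a sample label with x_n = 0
print(conserved(x, 2, shift(x, plus=[2], minus=[1]), 1, n))     # mu = 2, nu = 1
print(conserved(x, 4, shift(x, plus=[-1], minus=[-4]), 1, n))   # mu = n, nu = 1
print(conserved(x, 1, x, 2, n))                                 # violates (3.6)
```

Missing keys are treated as zero occupation, which keeps the constraint $x_n x_{-n} = 0$ implicit in the sample data rather than enforced by the code.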
Leaving the calculation of $w_{\mu\nu}[x|y]$ in the general case aside, we present the result for the limit
\[
W_{\mu\nu}[x|y] := \lim_{x_{-n}\to\infty} w_{\mu\nu}[x|y] . \tag{3.8}
\]
Note that one necessarily has $x_n = y_n = 0$ by the weight reason. Therefore $x$ appearing in $W_{\mu\nu}[x|y]$ is to be understood as the array $(x_1,\ldots,x_{n-1},x_{-n+1},\ldots,x_{-1})$ that does not contain the $\pm n$ components, and the same applies to $y$ as well. For positive integers $j$ and $k$ such that $j \le k$ we use the symbols
\[
x_{j,k} = x_j + x_{j+1} + \cdots + x_k , \qquad x_{-j,-k} = x_{-j} + x_{-j-1} + \cdots + x_{-k} .
\]
They are to be understood as zero for $j > k$. Derivation of $W_{\mu\nu}[x|y]$ is outlined in Appendix A. We summarize the result in the following proposition.

Proposition 3.1. Suppose $j, k, l \in \{1, 2, \ldots, n-1\}$. The nonzero matrix elements $W_{\mu\nu}[x|y]$ are exhausted by the following list:
\[
\begin{aligned}
W_{\pm j,\pm j}[x|x] &= -z q^{x_j+x_{-j}+1} , \\
W_{j<k}[x|\cdots] &= \cdots , \\
W_{j>k}[x|x+(j)-(k)] &= z(1-q^{2x_k})\, q^{x_{k+1,j-1}+x_{-k}} , \\
W_{j,k}[x|x-(l)-(-l)+(j)+(-k)]_{l<\min(j,k)} &= (-1)^{k+l+1} z(1-q^{2x_l})(1-q^{2x_{-l}})\, q^{k-l-1+x_{l+1,j-1}+x_{-l-1,-k+1}} , \\
W_{-j>-k}[x|x+(-j)-(-k)] &= z(1-q^{2x_{-k}})\, q^{x_j+x_{-j-1,-k+1}} , \\
W_{-j<-k}[x|x+(k)-(j)] &= (-1)^{j+k} z(1-q^{2x_j})\, q^{j-k+x_{k+1,j-1}+x_{-k}} , \\
W_{-j,-k}[x|x+(l)+(-l)-(j)-(-k)]_{l<\min(j,k)} &= (-1)^{j+l+1} z(1-q^{2x_j})(1-q^{2x_{-k}})\, q^{j-l-1+x_{l+1,j-1}+x_{-l-1,-k+1}} , \\
W_{-j,k}[x|x-(j)+(-k)] &= (-1)^{j+k} z^2(1-q^{2x_j})\, q^{j+k-2+x_{1,j-1}+x_{-1,-k+1}} , \\
W_{j,-k}[x|x+(j)-(-k)] &= (1-q^{2x_{-k}})\, q^{x_{1,j-1}+x_{-1,-k+1}} , \\
W_{n,k}[x|x-(-n)+(-k)] &= (-1)^{n+k} z^2 q^{n+k-2+x_{1,n-1}+x_{-1,-k+1}} , \\
W_{n,-k}[x|x-(-n)+(k)] &= (-1)^{n+k} z\, q^{n-k+x_{k+1,n-1}+x_{-k}} , \\
W_{n,-k}[x|x+(l)+(-l)-(-n)-(-k)]_{l\,\cdots} &\ \cdots \\
W_{-n,-k}[x|x+(-n)-(-k)] &= (1-q^{2x_{-k}})\, q^{x_{1,n-1}+x_{-1,-k+1}} , \\
W_{j,n}[x|x-(-j)+(-n)] &= (-1)^{j+n} z(1-q^{2x_{-j}})\, q^{n-j+x_j+x_{-j-1,-n+1}} , \\
W_{j,n}[x|x-(l)-(-l)+(j)+(-n)]_{l<j} &= (-1)^{l+n+1} z(1-q^{2x_l})(1-q^{2x_{-l}})\, q^{n-l-1+x_{l+1,j-1}+x_{-l-1,-n+1}} , \\
W_{-j,n}[x|x-(j)+(-n)] &= (-1)^{j+n} z^2(1-q^{2x_j})\, q^{n+j-2+x_{1,j-1}+x_{-1,-n+1}} , \\
W_{j,-n}[x|x+(j)-(-n)] &= q^{x_{1,j-1}+x_{-1,-n+1}} , \\
W_{-j,-n}[x|x-(-n)+(-j)] &= z\, q^{x_j+x_{-j-1,-n+1}} , \\
W_{-j,-n}[x|x-(j)-(-n)+(l)+(-l)]_{l<j} &= (-1)^{j+l+1} z(1-q^{2x_j})\, q^{j-l-1+x_{l+1,j-1}+x_{-l-1,-n+1}} , \\
W_{n,n}[x|x] &= z^2 q^{2n-2+x_{1,n-1}+x_{-1,-n+1}} , \\
W_{-n,-n}[x|x] &= q^{x_{1,n-1}+x_{-1,-n+1}} , \\
W_{n,-n}[x|x-2(-n)+(l)+(-l)] &= (-1)^{n+l+1} z\, q^{n-l-1+x_{l+1,n-1}+x_{-l-1,-n+1}} , \\
W_{-n,n}[x|x+2(-n)-(l)-(-l)] &= (-1)^{n+l+1} z(1-q^{2x_l})(1-q^{2x_{-l}})\, q^{n-l-1+x_{l+1,n-1}+x_{-l-1,-n+1}} .
\end{aligned}
\]
Here the notation $y = x + (l) + (-l) - (j) - (-k)$ for example means that $y$ is obtained from $x$ by setting $x_l \to x_l + 1$, $x_{-l} \to x_{-l} + 1$, $x_j \to x_j - 1$, $x_{-k} \to x_{-k} - 1$. Since $x_{-n}$ becomes irrelevant in the limit (3.8), $(-n)$ in the argument of $W_{\mu\nu}$ may just be dropped. It has been included in the above formulas as a reminder of the conservation of the number of components. The matrix elements of the form $W_{\mu\nu}[x|x-(\lambda)\pm\cdots]$ with any $\lambda \in \{\pm 1,\ldots,\pm(n-1)\}$ contain the factor $1 - q^{2x_\lambda}$ as they should.

3.3. L operator L(z)

We consider the Weyl algebra generated by $P_\mu^{\pm 1}, Q_\mu^{\pm 1}$ with $\mu \in J\setminus\{\pm n\}$ under the same relation as (2.10). The subalgebra of the Weyl algebra generated by $P_\mu$, $Q_\mu$ and $R_\mu = Q_\mu^{-1}(1 - a_\mu P_\mu^2)$ with $\mu \in J\setminus\{\pm n\}$ will again be denoted by $A$, where $a_\mu$ is a parameter. We define the L operator $L(z) = (L_{\mu\nu}(z))_{\mu,\nu\in J} \in A \otimes \mathrm{End}(V)$ so that $L_{\mu\nu}(z) \in A$ with $\forall a_\mu = 1$ becomes the operator version of $W_{\nu\mu}[x|y]$ in Proposition 3.1. See (3.15). To present it explicitly, we assume $1 \le j, k, l \le n-1$ in this subsection. We set $P'_\mu = -q a_\mu P_\mu$ and use the symbols
\[
P_{j,k} = P_j P_{j+1}\cdots P_k , \qquad P_{-j,-k} = P_{-j}P_{-j-1}\cdots P_{-k} ,
\]
\[
P'_{j,k} = P'_j P'_{j+1}\cdots P'_k , \qquad P'_{-j,-k} = P'_{-j}P'_{-j-1}\cdots P'_{-k}
\]
for $j \le k$. For $j > k$ they should be understood as 1. Then $L_{\mu\nu}(z) \in A$ reads as follows:
\[
\begin{aligned}
L_{jj}(z) &= zP'_jP_{-j} + z\sum_{l=1}^{j-1} R_{-l}P'_{-l-1,-j+1}Q_{-j}R_lP_{l+1,j-1}Q_j , \\
L_{-j,-j}(z) &= zP_jP'_{-j} + z\sum_{l=1}^{j-1} Q_{-l}P_{-l-1,-j+1}R_{-j}Q_lP'_{l+1,j-1}R_j , \\
L_{k>j}(z) &= zR_{-j}P'_{-j-1,-k+1}Q_{-k}P'_j + z\sum_{l=1}^{j-1} R_{-l}P'_{-l-1,-k+1}Q_{-k}R_lP_{l+1,j-1}Q_j , \\
L_{k<j}(z) &= zP_{-k}R_kP_{k+1,j-1}Q_j + z\sum_{l=1}^{k-1} R_{-l}P'_{-l-1,-k+1}Q_{-k}R_lP_{l+1,j-1}Q_j , \\
L_{-k<-j}(z) &= zQ_{-j}P_{-j-1,-k+1}R_{-k}P_j + z\sum_{l=1}^{j-1} Q_{-l}P_{-l-1,-k+1}R_{-k}Q_lP'_{l+1,j-1}R_j , \\
L_{-k>-j}(z) &= zP'_{-k}Q_kP'_{k+1,j-1}R_j + z\sum_{l=1}^{k-1} Q_{-l}P_{-l-1,-k+1}R_{-k}Q_lP'_{l+1,j-1}R_j , \\
L_{k,-j}(z) &= z^2P'_{-1,-k+1}Q_{-k}P'_{1,j-1}R_j , \\
L_{-k,j}(z) &= P_{-1,-k+1}R_{-k}P_{1,j-1}Q_j ,
\end{aligned}
\]
\[
\begin{aligned}
L_{k,n}(z) &= z^2P'_{-1,-k+1}Q_{-k}P'_{1,n-1} , \\
L_{-k,n}(z) &= zP'_{-k}Q_kP'_{k+1,n-1} + z\sum_{l=1}^{k-1} Q_{-l}P_{-l-1,-k+1}R_{-k}Q_lP'_{l+1,n-1} , \\
L_{k,-n}(z) &= zP_{-k}R_kP_{k+1,n-1} + z\sum_{l=1}^{k-1} R_{-l}P'_{-l-1,-k+1}Q_{-k}R_lP_{l+1,n-1} , \\
L_{-k,-n}(z) &= P_{-1,-k+1}R_{-k}P_{1,n-1} , \\
L_{n,j}(z) &= zR_{-j}P'_{-j-1,-n+1}P'_j + z\sum_{l=1}^{j-1} R_{-l}P'_{-l-1,-n+1}R_lP_{l+1,j-1}Q_j , \\
L_{n,-j}(z) &= z^2P'_{-1,-n+1}P'_{1,j-1}R_j , \\
L_{-n,j}(z) &= P_{-1,-n+1}P_{1,j-1}Q_j , \\
L_{-n,-j}(z) &= zP_jQ_{-j}P_{-j-1,-n+1} + z\sum_{l=1}^{j-1} Q_{-l}P_{-l-1,-n+1}Q_lP'_{l+1,j-1}R_j , \\
L_{n,n}(z) &= z^2P'_{1,n-1}P'_{-1,-n+1} , \\
L_{-n,n}(z) &= z\sum_{l=1}^{n-1} Q_lP_{-l-1,-n+1}Q_{-l}P'_{l+1,n-1} , \\
L_{n,-n}(z) &= z\sum_{l=1}^{n-1} R_{-l}P'_{-l-1,-n+1}R_lP_{l+1,n-1} , \\
L_{-n,-n}(z) &= P_{1,n-1}P_{-1,-n+1} .
\end{aligned}
\]
In these formulas, the operators $P_\mu$, $Q_\mu$, $R_\mu$ and $P'_\mu$ appearing in a single summand always have distinct indices, hence their ordering does not matter.

3.4. Factorization of L(z)

For $\mu \in J\setminus\{\pm n\}$, let $K_\mu = ((K_\mu)_{\lambda,\nu})_{\lambda,\nu\in J} \in A \otimes \mathrm{End}(V)$ be the operator having the elements
\[
\begin{aligned}
(K_\mu)_{-n,\mu} &= (K_\mu)_{-\mu,n} = Q_\mu , \\
(K_\mu)_{\mu,-n} &= (K_\mu)_{n,-\mu} = R_\mu , \\
(K_\mu)_{-n,-n} &= (K_\mu)_{-\mu,-\mu} = P_\mu , \\
(K_\mu)_{\mu,\mu} &= (K_\mu)_{n,n} = P'_\mu , \\
(K_\mu)_{\nu,\nu} &= 1 \qquad \nu \neq \pm\mu, \pm n .
\end{aligned} \tag{3.9}
\]
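All the other elements are zero. Here $R_\mu = Q_\mu^{-1}(1 - a_\mu P_\mu^2)$ and $P'_\mu = -q a_\mu P_\mu$ as in Sec. 3.3. We also introduce $S_\mu, \bar{S}_\mu \in A \otimes \mathrm{End}(V)$ for $\mu = 0,\ldots,n$ as follows. First we specify $S_1,\ldots,S_{n-1}$ by
\[
\begin{aligned}
(S_\mu)_{\mu,\mu} &= (S_\mu)_{-\mu-1,-\mu-1} = Q_\mu , \\
(S_\mu)_{\mu+1,\mu+1} &= (S_\mu)_{-\mu,-\mu} = R_\mu , \\
(S_\mu)_{\mu,\mu+1} &= (S_\mu)_{-\mu-1,-\mu} = P_\mu , \\
(S_\mu)_{\mu+1,\mu} &= (S_\mu)_{-\mu,-\mu-1} = P'_\mu , \\
(S_\mu)_{\nu,\nu} &= 1 \qquad \nu \neq \pm\mu, \pm(\mu+1) ,
\end{aligned} \tag{3.10}
\]
where the other elements are zero. Then $\bar{S}_\mu \in A \otimes \mathrm{End}(V)$ with $1 \le \mu \le n-1$ is obtained from $S_\mu$ by replacing $P_\mu$, $Q_\mu$, $R_\mu$ and $P'_\mu$ with $P_{-\mu}$, $Q_{-\mu}$, $R_{-\mu} = Q_{-\mu}^{-1}(1 - a_{-\mu}P_{-\mu}^2)$ and $P'_{-\mu} = -q a_{-\mu}P_{-\mu}$, respectively. Finally the remaining ones are determined by
\[
S_0 = \sigma S_1\sigma , \qquad \bar{S}_0 = \sigma\bar{S}_1\sigma , \qquad S_n = \sigma S_{n-1}\sigma , \qquad \bar{S}_n = \sigma\bar{S}_{n-1}\sigma , \tag{3.11}
\]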
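where $\sigma = \sigma^{-1}$ is defined at the end of Sec. 3.1. The operators $K_\mu$ and $S_\nu, \bar{S}_\nu$ are connected via a gauge transformation analogous to (2.17). To explain it we prepare the Weyl group operators $\sigma_0,\ldots,\sigma_n \in \mathrm{End}(V)$ which act as identity except
\[
\begin{aligned}
\sigma_0 &: v_1 \leftrightarrow v_{-2} , \quad v_{-1} \leftrightarrow v_2 , \\
\sigma_i &: v_i \leftrightarrow v_{i+1} , \quad v_{-i} \leftrightarrow v_{-i-1} \qquad 1 \le i \le n-1 , \\
\sigma_n &: v_{n-1} \leftrightarrow v_{-n} , \quad v_{-n+1} \leftrightarrow v_n .
\end{aligned}
\]
In terms of the sequences
\[
(i_{2n-2},\ldots,i_2,i_1) = (n, n-2, n-3, \ldots, 2, 0, 1, 2, \ldots, n-2, n) ,
\]
\[
(\mu_{2n-2},\ldots,\mu_2,\mu_1) = (-n+1,\ldots,-2,-1,1,2,\ldots,n-1) ,
\]
the gauge transformation is given by
\[
K_{\mu_k} = \begin{cases} \sigma_{i_1}\cdots\sigma_{i_k} S_{i_k} \sigma_{i_{k-1}}\cdots\sigma_{i_1} & 1 \le k \le n-1 , \\ \sigma_{i_1}\cdots\sigma_{i_k} \bar{S}_{i_k} \sigma_{i_{k-1}}\cdots\sigma_{i_1} & n \le k \le 2n-2 . \end{cases} \tag{3.12}
\]
We note the relations
\[
\sigma = \sigma_{i_1}\cdots\sigma_{i_{2n-2}} , \qquad \sigma S_i\sigma = S_i , \qquad \sigma\bar{S}_i\sigma = \bar{S}_i \qquad 1 \le i \le n-1 . \tag{3.13}
\]
Define the diagonal matrices
\[
\begin{aligned}
d(z) &= z\,\mathrm{diag}(z^{-1},\overbrace{1,\ldots,1}^{2n-2},z) , \\
D(z) &= \sigma_{i_1}\cdots\sigma_{i_{n-1}}\, d(z)\, \sigma_{i_{n-1}}\cdots\sigma_{i_1} = z\,\mathrm{diag}(\overbrace{1,\ldots,1}^{n-1},z,z^{-1},\overbrace{1,\ldots,1}^{n-1}) .
\end{aligned} \tag{3.14}
\]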
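Since the $\sigma_i$ are permutations of the basis $\{v_\mu\}$, the first relation in (3.13) and the conjugation formula (3.14) can be verified by composing permutations. The following sketch is illustrative only; the function names and the representation of $d(z)$ and $D(z)$ by the exponent of $z$ on each basis vector are assumptions.

```python
def weyl_ops(n):
    """Permutations of J realizing sigma_0, ..., sigma_n, as dicts mu -> image."""
    J = list(range(1, n + 1)) + list(range(-n, 0))
    def perm(*swaps):
        p = {mu: mu for mu in J}
        for a, b in swaps:
            p[a], p[b] = b, a
        return p
    ops = {0: perm((1, -2), (-1, 2)), n: perm((n - 1, -n), (-n + 1, n))}
    for i in range(1, n):
        ops[i] = perm((i, i + 1), (-i, -i - 1))
    return J, ops

def index_sequence(n):
    """(i_1, ..., i_{2n-2}), i.e. the sequence of the text read from right to left."""
    return [n] + list(range(n - 2, 0, -1)) + [0] + list(range(2, n - 1)) + [n]

def product(ops, seq, J):
    """Action of ops[seq[0]] ops[seq[1]] ... on basis labels (rightmost factor first)."""
    out = {}
    for mu in J:
        v = mu
        for i in reversed(seq):
            v = ops[i][v]
        out[mu] = v
    return out

def check(n):
    J, ops = weyl_ops(n)
    seq = index_sequence(n)
    # sigma swaps v_1 <-> v_{-1} and v_n <-> v_{-n}, fixing everything else
    sigma = {mu: (-mu if abs(mu) in (1, n) else mu) for mu in J}
    ok_sigma = product(ops, seq, J) == sigma
    # d(z): exponent of z on each basis vector (0 at v_1, 2 at v_{-1}, 1 otherwise)
    d_exp = {mu: 1 for mu in J}
    d_exp[1], d_exp[-1] = 0, 2
    g_inv = product(ops, list(reversed(seq[:n - 1])), J)   # sigma_{i_{n-1}} ... sigma_{i_1}
    D_exp = {mu: d_exp[g_inv[mu]] for mu in J}
    expected = {mu: 1 for mu in J}
    expected[n], expected[-n] = 2, 0                       # z * z at v_n, z * z^{-1} at v_{-n}
    return ok_sigma and D_exp == expected

print(all(check(n) for n in (3, 4, 5, 6)))
```

The conjugated diagonal entry at $v_\mu$ is read off as $d$ evaluated at the preimage of $\mu$, which is why only the inverse permutation `g_inv` is needed.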
Proposition 3.2. The L operator in Sec. 3.3 is factorized as
\[
L(z) = K_{-n+1}\cdots K_{-1}\, D(z)\, K_1\cdots K_{n-1} .
\]
Equivalently it is also expressed as
\[
L(z) = \sigma\bar{S}_{i_{2n-2}}\cdots\bar{S}_{i_n}\, d(z)\, S_{i_{n-1}}\cdots S_{i_1} = \bar{S}_{n-1}\bar{S}_{n-2}\cdots\bar{S}_2\bar{S}_1\, \sigma d(z)\, S_1S_2\cdots S_{n-2}S_n .
\]
The equivalence of the first and the second expressions is due to (3.12) and (3.14). The second and the third are connected by (3.11) and (3.13). The first expression is proved in Appendix B.

Proposition 3.3. The L operator and the R matrix (3.2) satisfy the same RLL relation as in Proposition 2.1.

Proposition 3.3 is a corollary of Proposition 3.2 and the following:

Lemma 3.4.
\[
\begin{aligned}
R(z_2/z_1)\,(\sigma d(z_1)\otimes\sigma d(z_2)) &= (\sigma d(z_1)\otimes\sigma d(z_2))\,R(z_2/z_1) , \\
R(z)\,\overset{2}{S}_\mu \overset{1}{S}_\mu &= \overset{1}{S}_\mu \overset{2}{S}_\mu\, R(z) , \qquad 1 \le \mu \le n , \\
R(z)\,\overset{2}{\bar{S}}_\mu \overset{1}{\bar{S}}_\mu &= \overset{1}{\bar{S}}_\mu \overset{2}{\bar{S}}_\mu\, R(z) , \qquad 1 \le \mu \le n .
\end{aligned}
\]

Proof. The first relation is straightforward to check. Next consider the second relation with $1 \le \mu \le n-1$. Comparing the R matrices (2.1) and (3.2), we find that the contributions proportional to $a(z)$, $b(z)$ and $c(z)$ on both sides are equal due to Lemma 2.4 for the $A_{n-1}^{(1)}$ case. Thus we are to show the equality with $R(z)$ replaced with $\sum_{j,k} f_{jk}(z)E_{jk}\otimes E_{-j,-k}$. It is easily checked at $z = 0$ and $z = \xi^{-1}$ for example, which suffices since $f_{jk}(z)$ is linear in $z$. Then the second relation with $\mu = n$ follows from the $\mu = n-1$ case by using $S_n = (\sigma d(z))^{-1}S_{n-1}\,\sigma d(z)$. The third relation can be shown similarly.

As in Remark 2.5, if $a_\mu = 1$, the property $K_\mu^2 = 1_{2n}$ holds for any $\mu \in J\setminus\{\pm n\}$.

3.5. Quantized $D_n^{(1)}$ automaton

Here we set up the quantized $D_n^{(1)}$ automaton. It is a system of particles and antiparticles on a one-dimensional lattice whose dynamics is governed by the L operator constructed in Sec. 3.3. In the limit $q \to 0$, the dynamics become deterministic and the system reduces to the $D_n^{(1)}$ automaton [10, 4]. Since our results are parallel with those in Secs. 2.4–2.7, we shall only give a brief sketch and omit the details.
The space of states $P$ is given by (2.21), where $V$ is now understood as the $2n$-dimensional vector representation $V = \mathbb{C}v_1\oplus\cdots\oplus\mathbb{C}v_{-1}$. The condition (ii) remains the same while the condition (i) is replaced by $\sum_{k\in\mathbb{Z}} |j_k + n| < \infty$. Monomials $\cdots\otimes v_{j_{-1}}\otimes v_{j_0}\otimes v_{j_1}\otimes\cdots$ can be classified according to the numbers $w_1,\ldots,w_n,w_{-n+1},\ldots,w_{-1}$ of occurrence of the letters $1,\ldots,n,-n+1,\ldots,-1$ in the set $\{j_k\}$. Consequently one has the direct sum decomposition $P = \oplus P_{w_1,\ldots,w_{-1}}$ analogous to (2.22), where $P_{0,\ldots,0} = \mathbb{C}p_{\rm vac}$ with $p_{\rm vac} = \cdots\otimes v_{-n}\otimes v_{-n}\otimes\cdots$. The local state $v_{j_k} \in V$ is regarded as the $k$th box containing a particle of color $j_k$ if $j_k \in \{\pm 1,\ldots,\pm(n-1)\}$. Particles having colors with opposite signs are regarded as antiparticles of each other. The case $j_k = -n$ is interpreted as an empty box, while $j_k = n$ represents a bound state of a particle and an antiparticle.

To formulate the time evolution, we assume $\forall a_\mu = 1$ from now on, and consider the space of the quantum carrier, namely, the $A$ module $M$ defined similarly to (2.23). The difference now is that we need $2n-2$ coordinates and to set $M = \oplus\,\mathbb{C}[m_1,\ldots,m_{n-1},m_{-n+1},\ldots,m_{-1}]$. Then the actions of $P_\mu$, $Q_\mu$, $R_\mu$ and $P'_\mu = -qP_\mu$ are again given by (2.23) by simply extending the index $i$ to $\mu = \pm 1,\ldots,\pm(n-1)$. By construction we have
\[
L(z)(x\otimes v_\mu) = \sum_{\nu\in J,\ y\in M} W_{\mu\nu}[x|y]\,(y\otimes v_\nu) \tag{3.15}
\]
for $x \in M$. Here the sum over $y$ is taken under the constraint (3.6), where the weight wt should now be understood as (3.7) without the $n$th component.

The time evolution $T(z) : P \to P$ is also given by the same formula (2.25), where $(\cdots)_{0,0}$ now signifies the element in $\mathrm{End}(P)$ corresponding to the transition from $[\overbrace{0,\ldots,0}^{2n-2}]$ to itself in the $M$ part. From (3.14) one has $T(z)p = z^{w_1+\cdots+w_{n-1}+2w_n+w_{-n+1}+\cdots+w_{-1}}\,T(1)p$ for $p \in P_{w_1,\ldots,w_{-1}}$. The power of $z$ is the total number of particles and antiparticles, for $v_n$ represents a bound state of a particle and an antiparticle. As it turns out, the total number is conserved, which implies the commutativity $T(z)T(z') = T(z')T(z)$. We concentrate on $T = T(1)$ henceforth.

The propagation operators $\mathcal{K}_\mu$ for $\mu = \pm 1,\ldots,\pm(n-1)$ are defined in the same way as (2.28) as the product of $K_\mu$ acting locally. This time the local interaction and their amplitudes implied by (3.9) are depicted in Fig. 8. Here $m_\mu \in \mathbb{Z}_{\ge 0}$ is a coordinate in $[m_1,\ldots,m_{n-1},m_{-n+1},\ldots,m_{-1}] \in M$, meaning the number of color $\mu$ particles on the carrier. The top five diagrams are essentially the same as Fig. 2 for the $A_{n-1}^{(1)}$ case, where color $\mu$ particles on the carrier (horizontal line) behave according to the presence or absence of another color $\mu$ particle in a local box. (The empty box $-n$ here corresponds to $n$ in the $A_{n-1}^{(1)}$ case.) The bottom four vertices are new. The second one there is the pair annihilation of a color $\mu$ particle on the carrier and the antiparticle $-\mu$ in the box to form the bound state $n$. The third one is the pair creation of $\mu$ and $-\mu$ from the bound state $n$. At $q = 0$ the amplitudes for $P_\mu$ with $m > 0$, $R_\mu$ with $m = 0$ and $P'_\mu$ vanish and the other ones become 1. As a result they reduce to the deterministic rule that agrees with the one in [10].

[Fig. 8. Diagram for $K_\mu$ ($\nu \neq \pm\mu, \pm n$). Vertex types shown: $P_\mu$, $R_\mu$, $Q_\mu$, $P'_\mu$ and 1; amplitudes shown: $q^{m_\mu}$, $1 - q^{2m_\mu}$, $-q^{m_\mu+1}$ and 1.]

As a parallel result with Theorem 2.6, the time evolution of the quantized $D_n^{(1)}$ automaton admits the factorization into the propagation operators.

Theorem 3.5. $T = \mathcal{K}_{-n+1}\cdots\mathcal{K}_{-1}\mathcal{K}_1\cdots\mathcal{K}_{n-1}$.

This is a consequence of Proposition 3.2. It extends a part of the earlier result at $q = 0$ based on the crystal basis theory [12, 10], where the time evolutions of a class of soliton cellular automata were factorized.

Finally we state properties of the amplitude for $T$. Define the transposition ${}^tT$ of $T$, the subspace $P_{\rm fin}$ and the linear function $N : P_{\rm fin} \to \mathbb{C}$ in the same manner as Sec. 2.7.

Proposition 3.6. Propositions 2.7 and 2.9 are both also valid for the quantized $D_n^{(1)}$ automaton.

Proof. In view of the factorization of $T$, it is enough to show the claim for any one of the propagation operators, say $\mathcal{K}_1$. Namely ${}^t\mathcal{K}_1 = \mathcal{K}_1^{-1}$ and $N(\mathcal{K}_1(p)) = N(p)$. Then without loss of generality one may restrict the space of states to $P_{w_1,\ldots,w_{-1}}$ with all $w_\mu$ being zero except $w_{\pm 1}$ and $w_n$. Let $\pi$ be the map that embeds the local states into that for $A_1^{(1)}$ as $\pi(v_1) = \pi(v_n) = \bullet$ (a ball) and $\pi(v_{-1}) = \pi(v_{-n}) = \circ$ (an empty box), where we have used the notation in Sec. 2.7 for $A_1^{(1)}$. Further let $\phi$ be the map sending the pair of local states for $A_1^{(1)}$ and $D_n^{(1)}$ to that for the latter as
\[
\begin{aligned}
\phi(\circ,v_1) &= v_{-n} , & \phi(\bullet,v_1) &= v_1 , \\
\phi(\circ,v_{-1}) &= v_{-1} , & \phi(\bullet,v_{-1}) &= v_n , \\
\phi(\circ,v_{-n}) &= v_{-n} , & \phi(\bullet,v_{-n}) &= v_1 , \\
\phi(\circ,v_n) &= v_{-1} , & \phi(\bullet,v_n) &= v_n .
\end{aligned}
\]
The componentwise action of these maps will also be denoted by the same symbol. For example, if $p = \cdots\otimes v_{-n}\otimes v_{-1}\otimes v_{-n}\otimes\cdots$ and $p' = \cdots\otimes\circ\otimes\bullet\otimes\circ\otimes\cdots$
in the corresponding position, one has $\pi(p) = \cdots\otimes\circ\otimes\circ\otimes\circ\otimes\cdots$ and $\phi(p',p) = \cdots\phi(\circ,v_{-n})\otimes\phi(\bullet,v_{-1})\otimes\phi(\circ,v_{-n})\otimes\cdots = \cdots\otimes v_{-n}\otimes v_n\otimes v_{-n}\otimes\cdots$. Denoting the propagation operator for $A_1^{(1)}$ by $\mathcal{K}_1^A$, one has the embedding $\mathcal{K}_1(p) = \phi(\mathcal{K}_1^A(\pi(p)), p)$. With the aid of this relation, the statements are reduced to the $A_1^{(1)}$ case established in Sec. 2.7.

Appendix A. Proof of Proposition 3.1

The simplifying feature of the limit $x_{-n} \to \infty$ (3.8) is that one can decompose $w_{\mu\nu}[x|y]$ into three parts effectively. To see this suppose $x \in V_m$ is in normal order
\[
(v_{-1})^{\otimes x_{-1}}\otimes\cdots\otimes(v_{-n+1})^{\otimes x_{-n+1}}\otimes(v_{-n})^{\otimes x_{-n}}\otimes(v_{n-1})^{\otimes x_{n-1}}\otimes\cdots\otimes(v_1)^{\otimes x_1} .
\]
Application of (2.2) for $D_n^{(1)}$ to this generates a variety of vectors $y = v_{j_1}\otimes\cdots\otimes v_{j_m}$. However in the limit $x_{-n} \to \infty$ under consideration, the vectors $v_1,\ldots,v_n$ are not allowed to appear in the left side of the segment $v_{-n}\otimes\cdots\otimes v_{-n}$ since they acquire the factor of order $q^{x_{-n}}$ in the course of normal ordering. See (3.4). Similarly, $v_{-1},\ldots,v_{-n+1}$ are forbidden to show up in the right side of $v_{-n}\otimes\cdots\otimes v_{-n}$. In this way $W_{\mu\nu}[x|y]$ is effectively decomposed into the right, left and the infinitely large central parts, where the allowed indices are limited to $\{1,\ldots,n-1\}$, $\{-1,\ldots,-n+1\}$ and $-n$, respectively.

Taking the situation into account, we derive $W_{\mu\nu}[x|y]$ (3.8) in three steps. In Step 1, we compute all the matrix elements $w_{\mu\nu}[x|y]$ for $x$ of the form $x = i^m = \overbrace{v_i\otimes\cdots\otimes v_i}^{m}$, which serves as a building block for general $x$. In Step 2, we obtain the limits of $w_{\mu\nu}[x|y]$ that are relevant to the three parts separately. In Step 3, we glue the three parts together.

Step 1.

Lemma A.1. All the matrix elements of the form $w_{j,k}[i^m|y]$ are zero except the following:
\[
w_{i,i}[i^m|i^m] = (1-q^{m+1}z)(1-q^{m-1}\xi z) \qquad \forall i , \tag{A.1}
\]
\[
w_{-i,-i}[i^m|i^m] = (q^{m-1}-z)(q^{m+1}-\xi z) \qquad \forall i , \tag{A.2}
\]
\[
w_{j,j}[i^m|i^m] = q(q^{m-1}-z)(1-q^{m-1}\xi z) \qquad \forall i ,\ j \neq \pm i , \tag{A.3}
\]
\[
w_{j,i}[i^m|i^{m-1},j] = (1-q^{2m})(1-q^{m-1}\xi z) \times \begin{cases} 1 & i \succ j,\ j \neq \pm i \\ z & i \prec j,\ j \neq \pm i \end{cases} \qquad \forall i , \tag{A.4}
\]
\[
w_{-i,j}[i^m|-j,i^{m-1}] = (-1)^{i+j+1}(1-q^{2m})(q^{m-1}-z)\,q^{\bar{j}+\bar{i}-2} \times \begin{cases} z & 1 \preceq j \prec -i \\ \xi^{-1} & -i \prec j \preceq -1 \end{cases} \qquad \forall i , \tag{A.5}
\]
\[
w_{-i,i}[i^m|i^{m-2},j,-j] = (-1)^{i+j+1} q^{n-j-1}(1-q^{2m})(1-q^{2m-2}\xi)\,z \qquad i = \pm n ,\ 1 \le j \le n-1 , \tag{A.6}
\]
\[
w_{-i,i}[i^m|-i,i^{m-1}] = \begin{cases} (1-q^{2m})\bigl(1-q^{m-1}\xi z + q^{2i-1-m}(z-q^{m-1})\bigr)z & 1 \le i \le n-1 \\ (1-q^{2m})\bigl(1-q^{m-1}\xi z + q^{2i+1+m}\xi(z-q^{m-1})\bigr) & -n+1 \preceq i \preceq -1 \end{cases} \qquad i \neq \pm n , \tag{A.7}
\]
\[
w_{-i,i}[i^m|i^{m-2},j,-j] = \begin{cases} (-1)^{i+j+1} q^{i-j-1}(1-q^{2m})(1-q^{2m-2})(1-q^{m-1}\xi z)\,z & 1 \le i \le n-1 ,\ 1 \le j < i \\ (-1)^{i+j} q^{i-j+1}\xi(1-q^{2m})(1-q^{2m-2})(q^{m-1}-z) & -n+1 \preceq i \preceq -1 ,\ 1 \le j < |i| . \end{cases} \tag{A.8}
\]
In these formulas for $w_{j,k}[i^m|y]$, $y$ should be understood as a normal-ordered vector in $V_m$ having the specified contents of the letters.

Sketch of the proof. The first four, (A.1)–(A.4), are straightforward to check. The other formulas (A.5)–(A.8) are shown in this order by induction on $m$. Here we illustrate it for (A.8). Let us write the R-matrix (3.2) as $R(z) = \sum_{i,j,k,l} r[i,k;j,l](z)\,E_{ji}\otimes E_{lk}$. For simplicity $w_{j,k}[i^m|y](z)$ will be denoted by $w_{j,k}[y](z)$. We treat the case $1 \le j < i \le n-1$. The result (A.8) for $m = 2$ can be checked directly. Assume (A.1)–(A.8) up to $m$. The fusion construction leads to the following recursion relation for $m \ge 3$:
\[
\begin{aligned}
w_{-i,i}[i^{m-1},j,-j](z)\,a(zq^{m-2}) ={}& q\,r[i,i;i,i](zq^m)\,w_{-i,i}[i^{m-2},j,-j](zq^{-1}) \\
&+ \sum_{\alpha\neq i,\ \alpha=j+1}^{n-1} (-1)^{1+\alpha+j}(1-q^2)\,q^{\alpha-j+m-1}\, r[i,\alpha;\alpha,i](zq^m)\,w_{-i,\alpha}[i^{m-1},-\alpha](zq^{-1}) \\
&+ (-1)^{i+j+1}(1-q^2)\,q^{i-j+m-1}\, r[i,i;i,i](zq^m)\,w_{-i,i}[i^{m-1},-i](zq^{-1}) \\
&+ q^{m+1}\, r[i,j;j,i](zq^m)\,w_{-i,j}[i^{m-1},-j](zq^{-1}) \\
&+ r[i,-j;-j,i](zq^m)\,w_{-i,-j}[i^{m-1},j](zq^{-1}) \\
&+ (-1)^{n+j+1} q^{n-j+m-1}\, r[i,-n;-n,i](zq^m)\,w_{-i,-n}[i^{m-1},n](zq^{-1}) \\
&+ r[i,n;n,i](zq^m)\,w_{-i,n}[i^{m-1},-n](zq^{-1}) ,
\end{aligned}
\]
where the underlined factors come from the normal-ordering. To check that (A.1)–(A.8) satisfy this is easy.

Step 2. As explained in the beginning of the appendix, we investigate the three parts that constitute the limit $W_{\mu\nu}[x|y]$ separately. First we consider the right part.

Lemma A.2. Set $w'_{\mu\nu}[x|y] = w'_{\mu\nu}[x|y](z) = w_{\mu\nu}[x|y]/a(z)$ and $m_1 = x_{1,n-1}$. Suppose $x$ and $y$ have the form $x = [x_1,\ldots,x_{n-1},0,\ldots,0]$ and $y = [y_1,\ldots,y_{n-1},y_{-n},0,\ldots,0]$, respectively. Then the nonzero case of the limit $\lim_{z\to\infty} w'_{\mu\nu}[x|y]$ is given by
\[
\begin{aligned}
w'_{\pm j,\pm j}[x|x] &\to q^{-m_1\pm x_j} && (1 \le j \le n) , \\
w'_{j,k}[x|x+(j)-(k)] &\to -(1-q^{2x_k})\,q^{-m_1-1+x_{k+1,j-1}} && (1 \le k < j \le n-1) , \\
w'_{-j,-k}[x|x-(j)+(k)] &\to (-1)^{j+k}(1-q^{2x_j})\,q^{-m_1+j-k-x_{j,k}} && (1 \le j < k \le n-1) , \\
w'_{-n,k}[x|x+(-n)-(k)] &\to -(1-q^{2x_k})\,q^{-1-x_{1,k}} && (1 \le k \le n-1) , \\
w'_{-j,n}[x|x-(j)+(-n)] &\to (-1)^{j+n}(1-q^{2x_j})\,q^{-m_1+j-n-x_{j,n-1}} && (1 \le j \le n-1) .
\end{aligned}
\]

Sketch of the proof. We illustrate the derivation of the second case. From the fusion construction one gets
\[
w'_{j,k}[x|x+(j)-(k)](zq^{m_1-1}) = q^{x_{k+1,j-1}}\, \frac{w_{j,k}[k^{x_k}|k^{x_k-1},j](zq^{x_k-1})}{a(zq^{2(x_k-1)})} \times \Biggl( \prod_{i=1}^{k-1} \frac{w_{k,k}[i^{x_i}|i^{x_i}](zq^{2x_k+2x_{1,i-1}+x_i-1})}{a(zq^{2x_k+2(x_{1,i}-1)})} \prod_{i=k+1}^{n-1} \frac{w_{k,k}[i^{x_i}|i^{x_i}](zq^{2x_{1,i-1}+x_i-1})}{a(zq^{2(x_{1,i}-1)})} \Biggr) ,
\]
where the factor $q^{x_{k+1,j-1}}$ is due to normal ordering. Substituting (A.3) and (A.4), one finds that this tends to the desired form in the limit $z \to \infty$.

Next we deal with the central part.

Lemma A.3. Nonzero limit $q^{x_{-n}} \to 0$ of $w_{\mu\nu}[(-n)^{x_{-n}}|y]$ is given by
\[
\begin{aligned}
w_{n,n}[(-n)^{x_{-n}}|(-n)^{x_{-n}}] &\to \xi z^2 , \\
w_{-n,-n}[(-n)^{x_{-n}}|(-n)^{x_{-n}}] &\to 1 , \\
w_{n,-n}[(-n)^{x_{-n}}|-j,(-n)^{x_{-n}-2},j] &\to (-1)^{j+n+1} q^{n-j-1} z , \\
w_{n,j}[(-n)^{x_{-n}}|-j,(-n)^{x_{-n}-1}] &\to (-1)^{j+n} q^{n+j-2} z^2 , \\
w_{n,-j}[(-n)^{x_{-n}}|(-n)^{x_{-n}-1},j] &\to (-1)^{j+n} q^{n-j} z , \\
w_{\pm j,\pm j}[(-n)^{x_{-n}}|(-n)^{x_{-n}}] &\to -qz , \\
w_{j,-n}[(-n)^{x_{-n}}|(-n)^{x_{-n}-1},j] &\to 1 , \\
w_{-j,-n}[(-n)^{x_{-n}}|-j,(-n)^{x_{-n}-1}] &\to z ,
\end{aligned}
\]
where $1 \le j \le n-1$.
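Several entries of Lemma A.3 follow from (A.1)–(A.4) by putting $i = -n$, $m = x_{-n}$ and letting $q^m \to 0$. The sketch below illustrates this for the first, second, sixth, seventh and eighth entries; it is not taken from the paper. The function name, the dictionary keys, and the device of replacing $q^m$ by an independent variable `t` (so that the limit becomes an evaluation at `t = 0`) are assumptions made for the illustration.

```python
from fractions import Fraction

def lemma_A1(t, q, z, xi):
    """Elements (A.1)-(A.4) of Lemma A.1 with q^m replaced by the free variable t."""
    return {
        "A1":       (1 - t * q * z) * (1 - t / q * xi * z),    # w_{i,i}[i^m | i^m]
        "A2":       (t / q - z) * (t * q - xi * z),            # w_{-i,-i}[i^m | i^m]
        "A3":       q * (t / q - z) * (1 - t / q * xi * z),    # w_{j,j}[i^m | i^m], j != +-i
        "A4_above": (1 - t * t) * (1 - t / q * xi * z),        # w_{j,i}[i^m | i^{m-1}, j], i above j
        "A4_below": (1 - t * t) * (1 - t / q * xi * z) * z,    # same element, i below j
    }

n = 4
q, z = Fraction(1, 3), Fraction(2, 5)      # illustrative test values
xi = q ** (2 * n - 2)
lim = lemma_A1(Fraction(0), q, z, xi)      # the limit q^{x_{-n}} -> 0

# With i = -n: -i = n, every j in {1, ..., n-1} lies below -n in the order,
# and every -j lies above it.
print(lim["A1"] == 1)             # w_{-n,-n} -> 1
print(lim["A2"] == xi * z**2)     # w_{n,n} -> xi z^2
print(lim["A3"] == -q * z)        # w_{+-j,+-j} -> -q z
print(lim["A4_above"] == 1)       # w_{j,-n} -> 1
print(lim["A4_below"] == z)       # w_{-j,-n} -> z
```

The remaining entries of the lemma involve (A.5) and (A.6) and are not covered by this check.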
Proof. Straightforward calculation based on Lemma A.1.

Finally for the left part, the following is verified similarly to Lemma A.2.

Lemma A.4. Suppose $x$ and $y$ have the form $x = [0,\ldots,0,x_{-n+1},\ldots,x_{-1}]$ and $y = [0,\ldots,0,y_{-n},y_{-n+1},\ldots,y_{-1}]$. Then the nonzero case of the limit $\lim_{z\to 0} w_{\mu\nu}[x|y]$ is given by
\[
\begin{aligned}
w_{\pm j,\pm j}[x|x] &\to q^{m_2\pm x_{-j}} && (1 \le j \le n) , \\
w_{j,k}[x|x-(-j)+(-k)] &\to (-1)^{j+k+1}(1-q^{2x_{-j}})\,q^{m_2+k-j-1+x_{-j-1,-k+1}} && (1 \le j < k \le n) , \\
w_{-j,-k}[x|x+(-j)-(-k)] &\to (1-q^{2x_{-k}})\,q^{m_2-x_{-k,-j}} && (1 \le k < j \le n) ,
\end{aligned}
\]
where $m_2 = x_{-1,-n+1}$.

Step 3. We demonstrate the gluing procedure with two examples. First we derive the 4th case in Proposition 3.1, $W_{i,l}[x|x+(i)-(j)-(-j)+(-l)]$. This is calculated as the simple product of the three parts:
\[
w'_{i,j}[x|x+(i)-(j)](zq^{-m+m_1})\; w_{j,j}[(-n)^{x_{-n}}|(-n)^{x_{-n}}](zq^{m_1-m_2}) \times w_{j,l}[x'|x'-(-j)+(-l)](zq^{m-m_2}) ,
\]
which is nonzero for $1 \le j \le \min(i,l)$. For $j < i < l$, it is calculated by multiplying the 2nd in Lemma A.2, the 6th of Lemma A.3 and the 2nd of Lemma A.4, leading to
\[
-(1-q^{2x_j})\,q^{-m_1-1+x_{j+1,i-1}} \times (-zq^{1+m_1-m_2}) \times (-1)^{j+l+1}(1-q^{2x_{-j}})\,q^{m_2+l-j-1+x_{-j-1,-l+1}} = (-1)^{j+l+1} z(1-q^{2x_j})(1-q^{2x_{-j}})\,q^{l-j-1+x_{j+1,i-1}+x_{-j-1,-l+1}} .
\]
This agrees with the sought result. Second we consider the 9th case in Proposition 3.1, $W_{i,-k}[x|x+(i)-(-k)]$. This matrix element is obtained by collecting several contributions as
\[
\Bigl( \underline{q^{x_{i+1,n-1}}}\, w'_{i,i}[x|x](zq^{-m+m_1})\, w_{i,-n}[(-n)^{x_{-n}}|(-n)^{x_{-n}-1},i](zq^{m_1-m_2}) + \sum_{j=1}^{i-1} \underline{q^{x_{j+1,n-1}+1}}\, w'_{i,j}[x|x+(i)-(j)](zq^{-m+m_1})\, w_{j,-n}[(-n)^{x_{-n}}|(-n)^{x_{-n}-1},j](zq^{m_1-m_2}) \Bigr) \times w_{-n,-k}[x'|x'-(-k)+(-n)](zq^{m-m_2}) ,
\]
where we have set $x = [x_1,\ldots,x_{n-1},0,\ldots,0]$ and $x' = [0,\ldots,0,x_{-n+1},\ldots,x_{-1}]$. The underlined factors come from normal ordering. In the limit $x_{-n} \to \infty$, this is
evaluated by using the first two of Lemma A.2, the 7th of Lemma A.3 and the last of Lemma A.4 as
\[
\Bigl( q^{x_{i,n-1}} - \sum_{j=1}^{i-1} (1-q^{2x_j})\,q^{x_{j+1,i-1}+x_{j+1,n-1}} \Bigr)(1-q^{2x_{-k}})\,q^{m_2-x_{-k,-n+1}-m_1} .
\]
Since $x_{j+1,i-1}+x_{j+1,n-1} = 2x_{j+1,i-1}+x_{i,n-1}$, the sum over $j$ telescopes to $(1-q^{2x_{1,i-1}})\,q^{x_{i,n-1}}$, so the bracket equals $q^{2x_{1,i-1}+x_{i,n-1}}$. With $m_1 = x_{1,n-1}$ and $m_2 = x_{-1,-n+1}$, this leads to the result $(1-q^{2x_{-k}})\,q^{x_{1,i-1}+x_{-1,-k+1}}$.

Appendix B. Proof of Proposition 3.2

Let $L_n\begin{bmatrix} P' & R \\ Q & P \end{bmatrix}$ be the L operator $L(z)$ for $A_{n-1}^{(1)}$ with $z = 1$ defined in (2.12). The L operator with $P_i$ and $P'_i$ interchanged for all $i \in \{1,\ldots,n-1\}$ will be denoted by $L_n\begin{bmatrix} P & R \\ Q & P' \end{bmatrix}$. A similar convention is applied also for the other interchanges like $R_i \leftrightarrow Q_i$, etc. A matrix $\bar{L}_n[\cdots]$ is the one obtained from $L_n[\cdots]$ by changing $X_i$ ($X = P, P', Q, R$) into $X_{-i}$ for all $i \in \{1,\ldots,n-1\}$. Matrices $L_n^+[\cdots]$ and $\bar{L}_n^+[\cdots]$ are the ones obtained from $L_n[\cdots]$ and $\bar{L}_n[\cdots]$ respectively by the replacement $X_{\pm i} \to X_{\pm(i+1)}$ for all $i \in \{1,\ldots,n-1\}$. For any square matrix $M$ we let $\tilde{M}$ denote the one obtained by reversing the order of rows and columns simultaneously.

Lemma B.1.
P10
R1 1ln−1
Q1
P1
P10
R1
1ln−1 Q1
t
L+ n ˜+ L n
P1 P −1 1 P R t ¯+ Ln Q P0 R−1 + P0 Q P −1 t˜ ¯ L n R P 1
Here
1
R−1
P0 Q
P
Q
R
0
P
R = Ln+1 P
1
Q−1 1ln−1 0 P−1
Q−1 1ln−1 0 P−1
˜ n+1 =L
P0
R
Q
P
P
Q
R
P0
P t¯ = Ln+1 Q
0 P t ˜¯ = L n+1 R
,
,
R P0 Q P
.
.
means the transposition.
Proof. The first relation is just (2.16). The second relation is obtained from the first one by taking $\tilde{\ }$ and the interchanges $P \leftrightarrow P'$, $Q \leftrightarrow R$. See Remark 2.3. The third relation follows from the first one by ${}^t\,\bar{\ }$ and $P \leftrightarrow P'$. The last one follows from the third one by $\tilde{\ }$ and $P \leftrightarrow P'$, $Q \leftrightarrow R$.
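Lemma B.2.
\[
K_1\cdots K_{n-1} = \rho \begin{pmatrix} L_n\begin{bmatrix} P' & R \\ Q & P \end{bmatrix} & \\ & \tilde{L}_n\begin{bmatrix} P & Q \\ R & P' \end{bmatrix} \end{pmatrix} \rho , \tag{B.1}
\]
\[
K_{-n+1}\cdots K_{-1} = \begin{pmatrix} {}^t\bar{L}_n\begin{bmatrix} P & R \\ Q & P' \end{bmatrix} & \\ & {}^t\tilde{\bar{L}}_n\begin{bmatrix} P' & Q \\ R & P \end{bmatrix} \end{pmatrix} , \tag{B.2}
\]
where $\rho \in \mathrm{End}(V)$ denotes the interchange $v_n \leftrightarrow v_{-n}$.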
Proof. We use induction on n. The n = 3 case is checked by a direct calculation. Assume (B.1) and (B.2) are fulfilled up to n. Then the left-hand side of (B.1) for n + 1 is K1 K2 · · · Kn 0 P1 1ln−1 = Q 1
R1 P10 P1 1ln−1 Q1
P10
1 R1 ρ
0
R
Q
P
P
˜+ L n
P
Q
R
P0
P1
R1 1ln−1
Q1 = ρ
L+ n
1
R1
P1 P10 1ln−1 Q1
L+ n
0
R
Q
P
P
1
ρ
˜+ L n
P
Q
R
P0
P1
1
ρ.
Owing to the first two relations in Lemma B.1, this coincides with the right-hand side of (B.1) for n + 1. Similarly the induction assumption leads to the following expression for the left-hand side of (B.2) for n + 1: K−n K−n+1 · · · K−1
=
1 t ¯+ Ln
P
R
Q
0
P
+ t˜ ¯n L
P0
Q
R
P
1
P−1 R−1
Q−1 1ln−1 0 P−1
P−1 1ln−1 R−1
. Q−1 0 P−1
Again the product can be computed by using the latter two relations in Lemma B.1, yielding the right-hand side of (B.2) for $n + 1$. This completes the induction.

Proof of Proposition 3.2. The product $K_{-n+1}\cdots K_{-1}D(z)K_1\cdots K_{n-1}$ can be calculated by using Lemma B.2, (3.14) and (2.12). The result agrees with the $L(z)$ defined in Sec. 3.3.

Acknowledgments

The authors thank Taichiro Takagi and Yasuhiko Yamada for discussion. A. K. thanks Murray Batchelor, Vladimir Bazhanov, Vladimir Mangazeev and Sergey Sergeev for their warm hospitality at the Australian National University during his stay in March 2004. A. K. and M. O. are partially supported by Grant-in-Aid for Scientific Research JSPS No. 15540363 and No. 14540026, respectively, from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

References

[1] D. Takahashi and J. Satsuma, A soliton cellular automaton, J. Phys. Soc. Jpn. 59 (1990) 3514–3519.
[2] D. Takahashi, On some soliton systems defined by using boxes and balls, in Proceedings of the International Symposium on Nonlinear Theory and Its Applications (NOLTA '93), (1993), pp. 555–558.
[3] T. Tokihiro, D. Takahashi, J. Matsukidaira and J. Satsuma, From soliton equations to integrable cellular automata through a limiting procedure, Phys. Rev. Lett. 76 (1996) 3247–3250.
[4] G. Hatayama, A. Kuniba and T. Takagi, Soliton cellular automata associated with crystal bases, Nucl. Phys. B577[PM] (2000) 619–645.
[5] G. Hatayama, K. Hikami, R. Inoue, A. Kuniba, T. Takagi and T. Tokihiro, The $A_M^{(1)}$ automata related to crystals of symmetric tensors, J. Math. Phys. 42 (2001) 274–308.
[6] K. Fukuda, M. Okado and Y. Yamada, Energy functions in box ball systems, Int. J. Mod. Phys. A15 (2000) 1379–1392.
[7] A. Kuniba, M. Okado, T. Takagi and Y. Yamada, Tropical R and tau functions, Commun. Math. Phys. 245 (2004) 491–517.
[8] G. Hatayama, A. Kuniba, M. Okado, T. Takagi and Y. Yamada, Scattering rules in soliton cellular automata associated with crystal bases, Contemporary Math. 297 (2002) 151–182.
[9] A. Kuniba, M. Okado, T. Takagi and Y. Yamada, Geometric crystal and tropical R for $D_n^{(1)}$, Int. Math. Res. Notices 48 (2003) 2565–2620.
[10] G. Hatayama, A. Kuniba and T. Takagi, Simple algorithm for factorized dynamics of $g_n$-automaton, J. Phys. A: Math. Gen. 34 (2001) 10697–10705.
[11] A. Kuniba, T. Takagi and A. Takenouchi, Factorization, reduction and embedding in integrable cellular automata, J. Phys. A37 (2004) 1691–1709.
[12] G. Hatayama, A. Kuniba and T. Takagi, Factorization of combinatorial R matrices and associated cellular automata, J. Stat. Phys. 102 (2001) 843–863.
[13] D. Takahashi and J. Matsukidaira, Box and ball system with a carrier and ultradiscrete modified KdV equation, J. Phys. A30 (1997) L733–L739.
[14] S. M. Khoroshkin and V. N. Tolstoy, Universal R-matrix for quantized (super) algebras, Commun. Math. Phys. 141 (1991) 599–617. [15] A. N. Kirillov and N. Yu. Reshetikhin, q-Weyl group and a multiplicative formula for universal R-matrices, Commun. Math. Phys. 134 (1990) 421–431. [16] Ya. S. Soibelman, Quantum Weyl group and some of its applications, Rend. Circ. Mat. Palermo Suppl. 26 (1991) 233–235. [17] P. P. Kulish, N. Yu. Reshetikhin and E. K. Sklyanin, Yang–Baxter equations and representation theory. I, Lett. Math. Phys. 5 (1981) 393–403. [18] A. Nakayashiki and Y. Yamada, Kostka polynomials and energy functions in solvable lattice models, Selecta Math. New Ser. 3 (1997) 547–599. [19] V. V. Bazhanov and Yu. G. Stroganov, Chiral Potts model as a descendant of the six-vertex model, J. Stat. Phys. 59 (1990) 799–817. [20] K. Hikami, R. Inoue and Y. Komori, Crystallization of the Bogoyavlensky lattice, J. Phys. Soc. Jpn. 68 (1999) 2234–2240. [21] R. J. Baxter, Exactly Solved Models in Statistical Mechanics (Academic Press, London, 1982). [22] V. E. Korepin, N. M. Bogoliubov and A. G. Izergin, Quantum Inverse Scattering Method and Correlation Functions (Cambridge University Press, Cambridge, 1993). [23] V. V. Bazhanov, Integrable quantum systems and classical Lie algebras, Commun. Math. Phys. 113 (1987) 471–503. [24] M. Jimbo, Quantum R matrix for the generalized Toda system, Commun. Math. Phys. 102 (1986) 537–547. [25] S.-J. Kang, M. Kashiwara and K. C. Misra, Crystal bases of Verma modules for quantum affine Lie algebras, Compositio Math. 92 (1994) 299–325.
January 18, 2005 10:1 WSPC/148-RMP
00224
Reviews in Mathematical Physics Vol. 16, No. 10 (2004) 1259–1290 © World Scientific Publishing Company
THE BETHE–SOMMERFELD CONJECTURE FOR THE 3-DIMENSIONAL PERIODIC LANDAU OPERATOR
DANIEL M. ELTON
Department of Mathematics and Statistics, Lancaster University, Lancaster, LA1 4YF, United Kingdom
[email protected]

Received 3 January 2004
Revised 27 October 2004

The 3-dimensional Schrödinger operator $H$ corresponding to a uniform magnetic field and a periodic electric potential $V$ is considered. Under the condition of magnetic flux rationality, and with a mild regularity assumption on $V$, it is shown that the number of gaps in the spectrum of $H$ is finite.

Keywords: Bethe–Sommerfeld conjecture; Schrödinger operator; uniform magnetic field.

Mathematics Subject Classifications (2000): Primary 35J10, 81Q10; Secondary 35P15, 35P20
1. Introduction

The 3-dimensional periodic Landau operator is the Schrödinger operator representing a uniform magnetic field and periodic electric potential in $\mathbb{R}^3$. This operator can be written as

$$H = (-i\nabla - A)^2 + V \tag{1.1}$$
where $\nabla = (\nabla_x, \nabla_y, \nabla_z)$ is the usual gradient operator on $\mathbb{R}^3$, $A$ is a magnetic (real vector) potential for which the corresponding magnetic field $B = \nabla \times A$ is constant, and $V$ is an electric (real scalar) potential which is periodic with respect to some lattice $\Gamma \subset \mathbb{R}^3$. If one assumes that the magnetic field satisfies the flux rationality condition (see below) then the spectrum of $H$ has a band-gap structure; that is, it is formed of closed intervals (the spectral bands) which are possibly separated by open intervals free of spectrum (the gaps).

In this paper we consider the Bethe–Sommerfeld conjecture for the operator $H$, which states that the number of gaps in the spectrum of $H$ is finite. As originally formulated in the 1930s by H. Bethe and A. Sommerfeld, this conjecture applied to the 3-dimensional periodic Schrödinger operator (that is, the operator (1.1) with $A = 0$). More generally, the conjecture for the $n$-dimensional
periodic Schrödinger operator has received considerable attention from a number of authors; it has been verified for $n = 2$ in [15, 4], for $n = 3$ in [20], and for $n = 2, 3, 4$ in [7]. For dimensions $n \ge 5$ the conjecture has been established only for rational lattices $\Gamma$; see [19]. This work has naturally led to the investigation of the periodic polyharmonic oscillator $(-\Delta)^l + V$ in $\mathbb{R}^n$ (here $l \in \mathbb{R}^+$; taking $l = 1$ corresponds to the usual periodic Schrödinger operator). Conditions on $n$ and $l$ for which the Bethe–Sommerfeld conjecture is known to hold have been steadily relaxed by various authors; it was established for $2l > n$ in [18, 19], for $4l > n + 1$ in [8] (see also [9]), and for $8l > n + 3$ in [13, 14]. More recent work in [17] has extended the conjecture to operators of the form $P(D) + V$ where $P(D)$ belongs to a class of higher-order constant coefficient elliptic operators with convex symbols.

The Bethe–Sommerfeld conjecture has been relatively less studied for magnetic Schrödinger operators. For the Schrödinger operator with a periodic electric and magnetic potential the finiteness of the number of gaps has been proved only for $n = 2$; see [12]. The situation for a uniform magnetic field is quite different; indeed for $n = 2$ and $V = 0$ the spectrum consists of discrete regularly-spaced values (the well-known Landau levels). An elementary perturbation argument then shows that the spectrum of the 2-dimensional equivalent of the operator (1.1) must have infinitely many gaps, at least for small $V$. The 3-dimensional Schrödinger operator corresponding to a uniform magnetic field (that is, the operator (1.1)) was considered in [5]; here it was shown that the Bethe–Sommerfeld conjecture is true provided the potential $V$ is sufficiently small (in operator norm). The aim of this paper is to establish the same result without any restriction on the size of $V$.
The band-gap picture for the spectrum of periodic operators can be obtained from the Bloch or Floquet analysis for such operators (see [11, 16]). This analysis cannot be applied directly to the operator (1.1) since the magnetic potential $A$ corresponding to a uniform magnetic field is not itself periodic. However an analogue of the Bloch analysis can be applied if one assumes the magnetic field satisfies the flux rationality condition ([22, 2, 3]). This condition is a restriction on the relationship between the lattice $\Gamma$ and the (constant) magnetic field $B$; more precisely, if we choose a basis $\{e_1, e_2, e_3\}$ for $\Gamma$ and let

$$\Omega = \{\, t_1 e_1 + t_2 e_2 + t_3 e_3 \mid t_1, t_2, t_3 \in [0, 1) \,\}$$
denote the corresponding unit cell, then the flux rationality condition is the requirement that the vector $|\Omega| B / 2\pi$ has rational coordinates with respect to $\{e_1, e_2, e_3\}$ (here $|\Omega|$ denotes the volume of $\Omega$). This assumption is clearly independent of the choice of basis $\{e_1, e_2, e_3\}$ for $\Gamma$.

Our main result requires a mild regularity condition on the potential $V$ which is most easily stated using the Fourier coefficients of $V$ relative to the lattice $\Gamma$. Let $\tilde\Gamma$ denote the dual lattice of $\Gamma$ (so $mx \in 2\pi\mathbb{Z}$ for all $m \in \tilde\Gamma$ and $x \in \Gamma$). Then we can write

$$V(x) = \frac{1}{|\Omega|^{1/2}} \sum_{m \in \tilde\Gamma} \hat V(m) e^{imx}, \quad x \in \mathbb{R}^3 \tag{1.2}$$

where

$$\hat V(m) = \frac{1}{|\Omega|^{1/2}} \int_\Omega e^{-imx'} V(x') \, d^3x', \quad m \in \tilde\Gamma.$$

The regularity condition on $V$ is the assumption that

$$\sum_{m \in \tilde\Gamma} |m|^\delta |\hat V(m)| < +\infty \quad \text{for some } \delta > 0. \tag{1.3}$$
(This condition requires that $V$ is a “bit more” than continuous.) The main result of the paper can now be stated as follows.

Theorem 1.1. Suppose $B$ is a constant magnetic field satisfying the flux rationality condition and $V$ is a potential satisfying condition (1.3). Then the spectrum of the operator $H$ given by (1.1) contains a half-infinite interval; that is, $[\Lambda, +\infty) \subseteq \sigma(H)$ for some constant $\Lambda$. In particular, the spectrum of $H$ contains only finitely many gaps. The constant $\Lambda$ can be chosen to depend continuously on the potential $V$, the lattice $\Gamma$ and the magnetic field intensity $|B|$.

In Sec. 3 we present a more detailed version of this result which contains information about the extent to which the spectral bands of $H$ must overlap (see Theorem 3.1). The analogue of the Bloch analysis for $H$ reduces this operator to a direct integral of operators with discrete spectra; the result is summarised in Theorem 2.1 (the proof of which is essentially a standard calculation and is deferred until Sec. 4). After this point our argument bears a formal similarity to that developed in a number of papers including [10, 13, 14] and [17]; this involves the combination of a careful analysis of the counting function for the unperturbed fibre operator appearing in the direct integral (essentially following the approach of [5]), together with a bound on how much the potential $V$ can alter this counting function. The first part of Sec. 3 deals with the argument at this level. Also following ideas in the works mentioned above, the analysis of the effect of the potential on the counting function of the fibre operator is carried out by splitting the potential into a part of (controlled) finite rank and a part which does not alter the counting function locally; this process is carried out in Sec. 3.1 with Secs. 3.2 and 3.3 dealing with the two resultant parts. The proofs of some technical estimates required for the latter part are given separately in Sec. 5.

1.1. Notation

In general boldface lowercase letters will denote elements of $\mathbb{R}^3$ while tildes over lowercase letters will indicate elements of $\mathbb{R}^2$. Where the same letter is used a relationship of the form $\mathbf{m} = (m_1, m_2, m_3) = (m_1, \tilde m)$ with $\tilde m = (m_2, m_3)$ is intended. Juxtaposition of elements of $\mathbb{R}^3$ or $\mathbb{R}^2$ will imply the usual scalar product; when thinking of elements of $\mathbb{R}^3$ as row vectors, notation of the form $\mathbf{m}\mathbf{x}^T$ will also be used.
Throughout this paper $C$ will denote an arbitrary positive constant which can vary from line to line. A subscript will be added if we want to keep track of a particular constant while notation of the form $C(V, \Gamma)$ will be used to indicate that a constant can be chosen to depend continuously on the specified parameters. By continuous dependence on $V$ we mean continuous dependence on $\delta$ and the value of the sum appearing in (1.3). By continuous dependence on the lattice $\Gamma$ we mean continuous dependence on the basis vectors of this lattice (or, equivalently, the basis vectors of the dual lattice). In Sec. 3.2 we will use $[x]$ to denote the greatest integer not exceeding $x \in \mathbb{R}$. Also, in Sec. 5 $\Gamma(x)$ is the usual gamma function (which has nothing to do with the lattice $\Gamma$!).

2. Reduction of the Operator

In this section we establish a unitary equivalence between $H$ and an operator written as the direct integral of operators with discrete spectra (see Theorem 2.1). We have used a reduction that essentially results in a mixture of the magnetic analogues of the x-representation and p-representation of a periodic operator (for periodic Schrödinger operators see [16]; [22] discusses the magnetic analogue of the x-representation (referring to it as the qk-representation) while a description of the magnetic analogue of the p-representation can be found in [5]). Underlying the reduction is a discrete symmetry group for $H$; in our case this is a subgroup of the group of magnetic translations introduced in [21]. However our construction (given in Sec. 4) is direct and does not make explicit use of this symmetry group.

Before stating the reduction result we choose convenient values for several quantities and introduce some useful notation. In order that the flux rationality condition is satisfied the magnetic field $B$ must be parallel to an element of the lattice $\Gamma$. We make this assumption for the remainder of the paper.
Furthermore, by taking an alternative basis if necessary, we may assume that $B$ is parallel to the basis vector $e_3 \in \Gamma$. By rotating the coordinate system in $\mathbb{R}^3$, and replacing $e_1$ by $-e_1$ if necessary, we may assume that $e_3$ is in the direction of the z-axis, $e_2$ lies in the yz-plane with a positive y component and $e_1$ has a positive x component; that is, we may assume

$$e_{3x} = e_{3y} = e_{2x} = 0, \quad e_{3z}, e_{2y}, e_{1x} > 0.$$

It follows that

$$B = (0, 0, \beta),$$

where, without loss of generality, we may assume $\beta > 0$. Although there are many choices of a corresponding magnetic potential such choices are all gauge equivalent and lead to unitarily equivalent operators (1.1); in particular, the choice of $A$ has no effect on the spectrum of $H$. For convenience we choose the magnetic potential

$$A = (0, \beta x, 0).$$

Let $\{f_1, f_2, f_3\}$ denote the basis of the dual lattice $\tilde\Gamma$ which is defined by $e_i f_j^T = 2\pi\delta_{ij}$. Our assumptions on $\{e_1, e_2, e_3\}$ immediately imply $f_{1y} = f_{1z} = f_{2z} = 0$, $e_{1x} f_{1x} = e_{2y} f_{2y} = e_{3z} f_{3z} = 2\pi$ and $f_{1x}, f_{2y}, f_{3z} > 0$. Now the $e_3$ component of $B$ is $\beta / e_{3z}$ while $|\Omega| = e_{1x} e_{2y} e_{3z} = (2\pi)^2 e_{3z} / (f_{1x} f_{2y})$. It follows that the flux rationality assumption is equivalent to the condition

$$\beta = \frac{f_{1x} f_{2y}\, p}{2\pi q} \quad \text{for some } p, q \in \mathbb{N}. \tag{2.1}$$

Define two $3 \times 3$ matrices

$$E = \begin{pmatrix} e_1 \\ e_2 \\ e_3 \end{pmatrix} = \begin{pmatrix} e_{1x} & e_{1y} & e_{1z} \\ 0 & e_{2y} & e_{2z} \\ 0 & 0 & e_{3z} \end{pmatrix} \quad \text{and} \quad F = \begin{pmatrix} f_1^T & f_2^T & f_3^T \end{pmatrix} = \begin{pmatrix} f_{1x} & f_{2x} & f_{3x} \\ 0 & f_{2y} & f_{3y} \\ 0 & 0 & f_{3z} \end{pmatrix}.$$
The dual basis condition becomes $EF = 2\pi I_3 = FE$. Now introduce a change of coordinates which transforms the lattice $\Gamma$ into the usual cubic one $(2\pi\mathbb{Z})^3$; that is, for the point $\mathbf{x} \in \mathbb{R}^3$ define new coordinates $x_1, x_2, x_3 \in \mathbb{R}$ by

$$\mathbf{x} = (x, y, z) = \frac{1}{2\pi}(x_1 e_1 + x_2 e_2 + x_3 e_3);$$

in particular $x_i = \mathbf{x} f_i^T$. Let $\partial_i = \partial/\partial x_i$ denote differentiation with respect to $x_i$ for $i = 1, 2, 3$ and set $D = -i(\partial_1, \partial_2, \partial_3)^T$. It is straightforward to check that $(-i\nabla)^T = FD$. We also set

$$\mathcal{A} = \frac{1}{2\pi} E A^T = \frac{1}{2\pi} \begin{pmatrix} \beta e_{1y} x \\ \beta e_{2y} x \\ 0 \end{pmatrix} = \begin{pmatrix} \beta_1 x_1 \\ \beta_2 x_1 \\ 0 \end{pmatrix} \quad \text{where } \beta_i := \frac{\beta e_{1x} e_{iy}}{4\pi^2}, \ i = 1, 2,$$

so $A^T = F\mathcal{A}$. With respect to the new coordinates our operator thus becomes

$$H = H_0 + V = (-i\nabla - A)^2 + V = (D - \mathcal{A})^T G (D - \mathcal{A}) + V \tag{2.2}$$

where $G$ is the (metric) matrix $G = F^T F$ and, by a slight abuse of notation, we are now regarding $V$ as a function of $x_1, x_2, x_3$. In particular,

$$V(x_1, x_2, x_3) = \sum_{m \in \mathbb{Z}^3} V_m e^{i(m_1 x_1 + m_2 x_2 + m_3 x_3)} \tag{2.3}$$

where the $V_m$ are related to the Fourier coefficients of $V$ (see (1.2)) by

$$V_m = |\Omega|^{-1/2} \hat V(m_1 f_1 + m_2 f_2 + m_3 f_3), \quad m \in \mathbb{Z}^3.$$
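The lattice bookkeeping of this section can be checked numerically. The following is a minimal sketch, not part of the original argument: the basis matrix `E` and the integers `p`, `q` are hypothetical values chosen for illustration, and the helper names are ours. It builds the dual basis as $F = 2\pi E^{-1}$, fixes $\beta$ through (2.1), and confirms that the $e_3$-coordinate of $|\Omega| B / 2\pi$ is $p/q$.

```python
import math

def dual_basis(E):
    """Return F = 2*pi*E^{-1} for an upper-triangular 3x3 matrix E with rows e1, e2, e3."""
    (a, b, c), (_, d, e), (_, _, f) = E
    # Inverse of an upper-triangular matrix, written out explicitly.
    inverse = [
        [1 / a, -b / (a * d), (b * e - c * d) / (a * d * f)],
        [0.0, 1 / d, -e / (d * f)],
        [0.0, 0.0, 1 / f],
    ]
    return [[2 * math.pi * v for v in row] for row in inverse]

def matmul(A, B):
    """Plain 3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Hypothetical lattice basis in the normalised (upper-triangular) position.
E = [[2.0, 0.3, -0.4],
     [0.0, 1.5, 0.7],
     [0.0, 0.0, 1.2]]
F = dual_basis(E)          # columns are f1^T, f2^T, f3^T
EF = matmul(E, F)          # should equal 2*pi*I_3

p, q = 3, 5                                        # hypothetical flux integers
beta = F[0][0] * F[1][1] * p / (2 * math.pi * q)   # condition (2.1)

# e3-coordinate of |Omega| B / (2*pi) for B = (0, 0, beta) = (beta / e3z) e3.
volume = E[0][0] * E[1][1] * E[2][2]
flux = volume * (beta / E[2][2]) / (2 * math.pi)

# beta_i = beta * e1x * e_iy / (4*pi^2), the coefficients of the transformed potential.
beta_1 = beta * E[0][0] * E[0][1] / (4 * math.pi ** 2)
beta_2 = beta * E[0][0] * E[1][1] / (4 * math.pi ** 2)
```

With these definitions `flux` agrees with `p / q` up to rounding, and `beta_2` agrees with `beta / (F[0][0] * F[1][1])`, which follows from $e_{1x} f_{1x} = e_{2y} f_{2y} = 2\pi$.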
The component of $V$ corresponding to $m = 0$ is a constant term which causes a simple shift in the spectrum of $H$; we may thus assume $V_0 = 0$. For the remainder of the paper we will work with the new coordinate system on $\mathbb{R}^3$; in particular, we will write $x = (x_1, x_2, x_3)$ to denote the coordinates of a point $x \in \mathbb{R}^3$ with respect to the basis $\{e_1, e_2, e_3\}$.

We will make use of a metric on $\mathbb{Z}^3$ associated to the dual lattice $\tilde\Gamma$; more precisely, define

$$|m|_{\tilde\Gamma} = |m_1 f_1 + m_2 f_2 + m_3 f_3| = \left[ (f_{1x} m_1 + f_{2x} m_2 + f_{3x} m_3)^2 + (f_{2y} m_2 + f_{3y} m_3)^2 + (f_{3z} m_3)^2 \right]^{1/2} \tag{2.4}$$

for all $m \in \mathbb{Z}^3$. This metric is equivalent to the usual Euclidean metric so there exists a constant $K_{\tilde\Gamma} \ge 1$ such that

$$K_{\tilde\Gamma}^{-1} |m| \le |m|_{\tilde\Gamma} \le K_{\tilde\Gamma} |m|, \quad m \in \mathbb{Z}^3. \tag{2.5}$$

In particular, the regularity condition (1.3) for $V$ can be rewritten as

$$\sum_{m \in \mathbb{Z}^3} |m|^\delta |V_m| < +\infty \quad \text{for some } \delta > 0. \tag{2.6}$$
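Define $\tilde b, \tilde c \in \mathbb{R}^2$ and $\alpha, \gamma \in \mathbb{R}$ by

$$\tilde b = \frac{1}{f_{1x}} (f_{2x}, f_{3x}), \qquad \tilde c = \frac{1}{f_{2y} \beta_2} (f_{2y}, f_{3y}) = \frac{f_{1x}}{\beta} (f_{2y}, f_{3y}), \tag{2.7}$$

$$\alpha = \frac{f_{2y} \beta_2}{f_{1x}} = \frac{\beta}{f_{1x}^2} \quad \text{and} \quad \gamma = f_{3z}. \tag{2.8}$$

For any $m \in \mathbb{Z}^3$ let $\tilde U_m$ denote the unitary operator on $L^2(\mathbb{R})$ given by

$$\tilde U_m \phi(x) = e^{i(m_1 + \tilde b \tilde m)x} \phi(x + \tilde c \tilde m). \tag{2.9}$$

For any $p \in \mathbb{N}$ (which will come from (2.1)) set

$$\mathcal{H}_p = L^2(\mathbb{R}) \otimes \ell^2(\mathbb{Z}) \otimes \mathbb{C}^p. \tag{2.10}$$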
The space $\ell^2(\mathbb{Z}) \otimes \mathbb{C}^p$ consists of square summable sequences indexed by $W_p := \mathbb{Z} \times \{0, \dots, p - 1\}$. Let $\psi_{m,j}$ denote the sequence with a 1 in the $(m, j)$th position and 0's elsewhere so that $\{\psi_{m,j} \mid (m, j) \in W_p\}$ defines an orthonormal basis for $\ell^2(\mathbb{Z}) \otimes \mathbb{C}^p$.

Theorem 2.1. Suppose the flux rationality condition (2.1) is satisfied. Then the operator $H = H_0 + V$ given by (2.2) (with $V$ given by (2.3)) is unitarily equivalent to a direct integral

$$\int_{[0,1)^3}^{\oplus} H(k) \, dk$$

acting on the space

$$\int_{[0,1)^3}^{\oplus} \mathcal{H}_p \, dk.$$

The action of the fibre operator $H(k) = H_0(k) + V(k)$ on $\mathcal{H}_p$ is given by

$$H_0(k)(\phi \otimes \psi_{m,j}) = (\tilde H_{m,k_3} \phi) \otimes \psi_{m,j} \tag{2.11}$$

and

$$V(k)(\phi \otimes \psi_{m,j}) = \sum_{(m',j') \in W_p} (\tilde V_{m,j}^{m',j'} \phi) \otimes \psi_{m',j'} \tag{2.12}$$

for all $(m, j) \in W_p$ and appropriate $\phi \in L^2(\mathbb{R})$, where

$$\tilde H_{m,k_3} = f_{1x}^2 \left( -\frac{d^2}{dx^2} + (\alpha x)^2 \right) + \gamma^2 (m + k_3)^2 \tag{2.13}$$

and

$$\tilde V_{m,j}^{m',j'} = \sum_{m_1, m_2 \in \mathbb{Z}} e^{i\mu k} e^{i\nu} \, V_{(m_1,\, m_2 p + j' - j,\, m' - m)} \, \tilde U_{(m_1,\, m_2 p + j' - j,\, m' - m)} \tag{2.14}$$

for some $\mu = \mu(m_1, m_2, m', m, j', j) \in \mathbb{R}^3$ and $\nu = \nu(m_1, m_2, m', m, j', j) \in \mathbb{R}$.
The operator $\tilde H_{m,k_3}$ (acting on $L^2(\mathbb{R})$) is simply a shifted harmonic oscillator. Well-known calculations show that the spectrum of $\tilde H_{m,k_3}$ consists of simple eigenvalues given by

$$\Lambda_{k_3}(m, n) := f_{1x}^2 (\alpha(2n + 1)) + \gamma^2 (m + k_3)^2 = \beta(2n + 1) + \gamma^2 (m + k_3)^2, \quad n \in \mathbb{N}_0.$$

A corresponding orthonormal eigenbasis can be chosen as

$$\phi_n(x) = \frac{\alpha^{1/4}}{\sqrt{n!\, 2^n \sqrt{\pi}}} \, e^{-\alpha x^2/2} \, \mathrm{He}_n(\sqrt{\alpha}\, x), \tag{2.15}$$

where $\mathrm{He}_n$ is the $n$th Hermite polynomial (n.b., the eigenfunctions can be chosen to be independent of $m$ and $k_3$). The spectrum of the operator $H_0(k)$ (acting on the space $\mathcal{H}_p$) consists of the eigenvalues

$$\Lambda_{k_3}(m, n), \quad n \in \mathbb{N}_0, \ m \in \mathbb{Z} \tag{2.16}$$

each with multiplicity $p$ (it is possible that $\Lambda_{k_3}(m, n)$ may attain the same value for different values of $n$ and $m$; in this case the multiplicity of the eigenvalue is given by $p$ times the number of different possible choices of $n$ and $m$). A corresponding orthonormal eigenbasis is given as

$$\{\phi_n \otimes \psi_{m,j} \mid n \in \mathbb{N}_0, \ (m, j) \in W_p\}. \tag{2.17}$$
Applying standard theory for operators expressed as a direct integral we will obtain an expression for $\sigma(H)$ in terms of the eigenvalues of $H(k)$ (see (3.1) in Sec. 3 below). It then becomes a critical issue to understand how the eigenvalues of the operator $H_0(k)$ get perturbed by the potential $V(k)$. We deal with this by considering matrix elements of $V(k)$ relative to the eigenbasis (2.17). Owing to the explicit nature of $V(k)$ given by (2.12) and (2.14), we are then led to the study of the matrix elements of the operator $\tilde U_m$ relative to the orthonormal basis $\{\phi_n \mid n \in \mathbb{N}_0\}$ for $L^2(\mathbb{R})$. For each $n, n' \in \mathbb{N}_0$ and $m \in \mathbb{Z}^3$ set

$$U_m^{n,n'} = \langle \tilde U_m \phi_n, \phi_{n'} \rangle. \tag{2.18}$$

From (2.9) it is clear that $\tilde U_m^* = e^{i(m_1 + \tilde b \tilde m)\tilde c \tilde m} \, \tilde U_{-m}$. Together with the fact that $\tilde U_m$ is unitary we then get

$$|U_m^{n,n'}| = |U_{-m}^{n',n}| \le 1 \tag{2.19}$$

and

$$\sum_{n' \in \mathbb{N}_0} |U_m^{n,n'}|^2 = \|\tilde U_m \phi_n\|^2 = 1, \tag{2.20}$$
with a similar result holding when the roles of $n$ and $n'$ are swapped. More detailed estimates will also be needed for the $U_m^{n,n'}$'s; these are summarised in the next result, the proof of which is deferred until Sec. 5.

Proposition 2.1. Let $n, n' \in \mathbb{N}_0$ and $m \in \mathbb{Z}^3$. If $|m| \le (\beta^{1/2} / 3K_{\tilde\Gamma})(1 + n' + n)^{1/6}$ then

$$|U_m^{n,n'}| \le 6(1 + |n' - n|)^{-1/3}. \tag{2.21}$$

Under the additional assumptions $m_3 = 0$ and $m \ne 0$ we also have

$$|U_m^{n,n'}| \le C(1 + n' + n)^{-1/6}, \tag{2.22}$$

where $C = 4 \max\{1, \beta^{1/6} K_{\tilde\Gamma}^{1/3}\}$.
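The basic properties (2.19) and (2.20) of the matrix elements can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it treats the phase frequency `a` (standing for $m_1 + \tilde b \tilde m$) and the shift `c` (standing for $\tilde c \tilde m$) as free hypothetical parameters, takes $\alpha = 1$ by default, evaluates the normalised eigenfunctions of (2.15) through the standard three-term recurrence, and approximates the inner products by an equally spaced quadrature on a truncated interval. It does not test the sharper bounds of Proposition 2.1.

```python
import cmath
import math

def hermite_functions(t, n_max):
    """Normalised Hermite functions psi_0..psi_{n_max} at t via the three-term recurrence."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * t * t)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * t * psi[0])
    for n in range(1, n_max):
        psi.append(math.sqrt(2.0 / (n + 1)) * t * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi

def matrix_elements(n, n_max, a, c, alpha=1.0, half_width=15.0, step=0.01):
    """Approximate <U phi_n, phi_k> for k = 0..n_max, where (U phi)(x) = exp(i a x) phi(x + c).

    phi_k(x) = alpha^(1/4) psi_k(sqrt(alpha) x).  The integrand is negligible at the
    ends of [-half_width, half_width], so an equally weighted sum is used.
    """
    s = math.sqrt(alpha)
    scale = alpha ** 0.25
    sums = [0j] * (n_max + 1)
    steps = int(round(2 * half_width / step))
    for i in range(steps + 1):
        x = -half_width + i * step
        moved = scale * hermite_functions(s * (x + c), n)[n]   # phi_n(x + c)
        weight = cmath.exp(1j * a * x) * moved
        basis = hermite_functions(s * x, n_max)
        for k in range(n_max + 1):
            sums[k] += weight * scale * basis[k]
    return [v * step for v in sums]

# Hypothetical parameters: a plays the role of m_1 + b~m~, c the role of c~m~.
a, c = 0.7, 0.5
U = matrix_elements(2, 40, a, c)            # row n = 2, columns n' = 0..40
row_sum = sum(abs(u) ** 2 for u in U)       # truncated version of (2.20)
largest = max(abs(u) for u in U)            # compare with (2.19)

# Symmetry in (2.19): replacing m by -m flips the signs of both a and c.
forward = matrix_elements(2, 3, a, c)[3]    # U_m^{2,3}
backward = matrix_elements(3, 3, -a, -c)[2] # U_{-m}^{3,2}
```

For these values the truncated sum `row_sum` is numerically indistinguishable from 1, because the components of $\tilde U_m \phi_2$ along $\phi_{n'}$ decay rapidly in $n'$, and `abs(forward)` matches `abs(backward)`.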
3. The Main Theorem in More Detail

Let $\lambda_j(k)$, $j \in \mathbb{N}$ denote the sequence of eigenvalues of the operator $H(k) = H_0(k) + V(k)$ from Theorem 2.1, arranged in non-decreasing order counting multiplicities. For each $j \in \mathbb{N}$, $\lambda_j(k)$ is continuous in $k$ (n.b., from (2.14), (2.6) and (2.19) it is straightforward to check that $V(k)$ is norm continuous in $k$) so its range

$$I_j := \{\lambda_j(k) \mid k \in [0, 1)^3\}$$

is an interval in $\mathbb{R}$. Using Theorem 2.1 and standard theory for operators expressed as a direct integral (see [11, 16]) we can now write

$$\sigma(H) = \bigcup_{j \in \mathbb{N}} I_j; \tag{3.1}$$

the intervals $I_j$ are the bands forming the spectrum of $H$. To characterise the length of these bands and the degree to which they overlap we introduce two functions of the spectral parameter $\lambda \in \mathbb{R}$; the overlapping function is defined as

$$\zeta(\lambda) = \max_{j \in \mathbb{N}} \sup\{\eta \mid [\lambda - \eta, \lambda + \eta] \subseteq I_j\}$$

and gives the length of the largest interval centred at $\lambda$ which is contained entirely within a single band, while the multiplicity of overlapping is defined as

$$m(\lambda) = \#\{j \mid \lambda \in I_j\}$$

and gives the number of bands containing $\lambda$.

Theorem 3.1. Suppose the flux rationality condition (2.1) is fulfilled and $V$ satisfies (1.3) (or equivalently (2.6)). Then there exists $\lambda_0 = \lambda_0(V, \Gamma, \beta) > 0$ such that

$$\zeta(\lambda) \ge C(V, \Gamma, \beta) \quad \text{and} \quad m(\lambda) \ge C(V, \Gamma, \beta)\, p \lambda^{1/2}$$

for all $\lambda \ge \lambda_0$.
Theorem 1.1 clearly follows from this result, the proof of which will be given at the end of the present section.

Throughout the remainder of Sec. 3 we use $k = (k_1, k_2, k_3)$ to denote the fibre parameter which takes values in $[0, 1)^3$. The dependence of various quantities (such as the fibre operator $H$) on $k$ or its components will not always be made explicit. Any estimates written in this way are assumed to be uniform as the appropriate components of $k$ range over $[0, 1)$. We will also use the notation $\langle f \rangle$ to denote the average of a quantity $f = f(k)$ in the third component of $k$; that is,

$$\langle f \rangle = \int_0^1 f(k) \, dk_3$$

which clearly depends on $k_1, k_2$.

To prove Theorem 3.1 we shall relate $\zeta(\lambda)$ and $m(\lambda)$ to the counting function of $H$ which can be defined as

$$N_H(\lambda) = \#\{j \mid \lambda_j(k) \le \lambda\}.$$

Now set

$$N^+(\lambda) = \sup_{k_3 \in [0,1)} N_H(\lambda) \quad \text{and} \quad N^-(\lambda) = \inf_{k_3 \in [0,1)} N_H(\lambda).$$

It is straightforward to check that

$$\zeta(\lambda) \ge \sup\{\eta \mid N^-(\lambda + \eta) < N^+(\lambda - \eta)\} \tag{3.2}$$

and

$$m(\lambda) \ge N^+(\lambda) - N^-(\lambda) \tag{3.3}$$

for any $(k_1, k_2) \in [0, 1)^2$ (see [19, 20] for more details). Using (2.16) it is easy to see that the counting function of the unperturbed fibre operator $H_0$ is given by

$$N_{H_0}(\lambda) = p \, \#\{(n, m) \in \mathbb{N}_0 \times \mathbb{Z} \mid \Lambda_{k_3}(m, n) \le \lambda\}. \tag{3.4}$$

Thus the computation of $N_{H_0}(\lambda)$ is reduced to counting the number of integer points inside a parabolic region. The information we require about $N_{H_0}(\lambda)$ is summarised in the next two results.

Lemma 3.1. We have

$$\left| \frac{1}{p} \langle N_{H_0}(\lambda) \rangle - \frac{2}{3\beta\gamma} \lambda^{3/2} \right| \le C(\Gamma, \beta)$$

for all $\lambda \ge 0$.

This result is established as Lemma 5 in [5] with a weaker remainder estimate. Although the required modification to the proof given in [5] is minimal, a complete proof of Lemma 3.1 is given in Sec. 3.2. The next lemma appears as Eq. (2.34) in [5] and will not be proved here.
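Lemma 3.2. There exists $\lambda_0 = \lambda_0(\Gamma, \beta) > 0$ such that

$$\big\langle |N_{H_0}(\lambda) - \langle N_{H_0}(\lambda) \rangle| \big\rangle \ge C(\Gamma, \beta)\, p \lambda^{1/2}$$

for all $\lambda \ge \lambda_0$.

In order to make use of the previous results we need to estimate how much the addition of the fibre potential $V$ will alter the counting function of $H_0$. Assuming only that $V$ is a bounded periodic operator, a basic estimate of the form $Cp\|V\|\lambda^{1/2}$ can be obtained by an obvious perturbation argument; an estimate of this form was used in [5] to obtain Theorem 1.1 when $\|V\|$ is sufficiently small. The next result summarises the most important new work in this paper which is required to consider $V$ of arbitrary size; its proof is the main topic of Secs. 3.1 to 3.3.

Proposition 3.1. Suppose $V$ satisfies (2.6) and set $\varepsilon = \min\{1, \delta\}/36 > 0$. Then there exists $\lambda_0 = \lambda_0(V, \Gamma, \beta) > 0$ such that

$$\big\langle |N_H(\lambda) - N_{H_0}(\lambda)| \big\rangle \le C(\Gamma, \beta)\, p \lambda^{1/2 - \varepsilon}$$

for all $\lambda \ge \lambda_0$.

Proof of Theorem 3.1. In this proof estimates are assumed to hold for all $\lambda \ge \lambda_0$, where $\lambda_0$, together with the general constant $C$, may vary from line to line but can be chosen to depend on $V$, $\Gamma$ and $\beta$ continuously. Now

$$|\langle N_H(\lambda) \rangle - \langle N_{H_0}(\lambda) \rangle| \le \big\langle |N_H(\lambda) - N_{H_0}(\lambda)| \big\rangle$$

so Lemma 3.1 and Proposition 3.1 imply

$$\left| \frac{1}{p} \langle N_H(\lambda) \rangle - \frac{2}{3\beta\gamma} \lambda^{3/2} \right| \le C\lambda^{1/2 - \varepsilon}. \tag{3.5}$$

On the other hand, averaging the inequality

$$|N_H(\lambda) - \langle N_H(\lambda) \rangle| \ge |N_{H_0}(\lambda) - \langle N_{H_0}(\lambda) \rangle| - |N_H(\lambda) - N_{H_0}(\lambda)| - |\langle N_H(\lambda) \rangle - \langle N_{H_0}(\lambda) \rangle|$$

leads to

$$\begin{aligned}
\big\langle |N_H(\lambda) - \langle N_H(\lambda) \rangle| \big\rangle
&\ge \big\langle |N_{H_0}(\lambda) - \langle N_{H_0}(\lambda) \rangle| \big\rangle - \big\langle |N_H(\lambda) - N_{H_0}(\lambda)| \big\rangle - |\langle N_H(\lambda) \rangle - \langle N_{H_0}(\lambda) \rangle| \\
&\ge \big\langle |N_{H_0}(\lambda) - \langle N_{H_0}(\lambda) \rangle| \big\rangle - 2 \big\langle |N_H(\lambda) - N_{H_0}(\lambda)| \big\rangle.
\end{aligned} \tag{3.6}$$

Coupled with Lemma 3.2 and Proposition 3.1 we can thus find $C_1 = C_1(V, \Gamma, \beta) > 0$ such that

$$\big\langle |N_H(\lambda) - \langle N_H(\lambda) \rangle| \big\rangle \ge 4 C_1 p \lambda^{1/2}.$$

Together with (3.5) we then get

$$\pm \frac{1}{p} N^\pm(\lambda) \ge \pm \frac{2}{3\beta\gamma} \lambda^{3/2} + C_1 \lambda^{1/2}. \tag{3.7}$$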
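The passage from the averaged lower bound to (3.7) can be spelled out as follows. This is a sketch under the reading that the bound immediately preceding (3.7) is an average over $k_3$; the auxiliary function $g$ and its positive and negative parts $g_\pm$ are notation introduced only for this illustration.

```latex
\begin{align*}
g(k_3) &:= N_H(\lambda) - \langle N_H(\lambda)\rangle, \qquad \langle g\rangle = 0
  \;\Longrightarrow\; \langle g_+\rangle = \langle g_-\rangle
  = \tfrac12 \langle |g| \rangle \ge 2 C_1 p \lambda^{1/2}, \\
N^+(\lambda) &= \langle N_H(\lambda)\rangle + \sup_{k_3} g
  \ge \langle N_H(\lambda)\rangle + \langle g_+\rangle
  \ge \langle N_H(\lambda)\rangle + 2 C_1 p \lambda^{1/2}, \\
N^-(\lambda) &= \langle N_H(\lambda)\rangle + \inf_{k_3} g
  \le \langle N_H(\lambda)\rangle - \langle g_-\rangle
  \le \langle N_H(\lambda)\rangle - 2 C_1 p \lambda^{1/2}.
\end{align*}
```

Inserting (3.5) and enlarging $\lambda_0$ so that $C\lambda^{1/2-\varepsilon} \le C_1 \lambda^{1/2}$ then yields both signs of (3.7).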
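Estimates for the multiplicity of overlapping now follow from (3.3):

$$m(\lambda) \ge N^+(\lambda) - N^-(\lambda) \ge 2 C_1 p \lambda^{1/2}.$$

On the other hand, if $0 \le \eta \le \lambda^{1/4}$ then (3.7) implies

$$\begin{aligned}
\pm \frac{1}{p} N^\pm(\lambda \mp \eta)
&\ge \pm \frac{2}{3\beta\gamma} (\lambda \mp \eta)^{3/2} + C_1 (\lambda \mp \eta)^{1/2} \\
&\ge \pm \frac{2}{3\beta\gamma} \lambda^{3/2} + \left( C_1 - \frac{1}{\beta\gamma} \eta \right) \lambda^{1/2} - C.
\end{aligned}$$

Therefore

$$N^+(\lambda - \eta) - N^-(\lambda + \eta) \ge 2 \left( C_1 - \frac{1}{\beta\gamma} \eta \right) p \lambda^{1/2} - Cp$$

so (3.2) now gives $\zeta(\lambda) \ge C$. This completes the proof of Theorem 3.1.

3.1. Splitting the potential

Let $\varepsilon > 0$ be as given in Proposition 3.1. For any $\lambda \ge 1$ and $k \in \mathbb{R}$ define a set

$$P_{\lambda,k} = P_{\lambda,k}^p \cup P_{\lambda,k}^e \cup P_{\lambda,k}^c \subset \mathbb{R} \times [0, \infty)$$

where, using the eigenvalue function $\Lambda_k(\cdot, \cdot)$, we put

$$P_{\lambda,k}^p = \{(x, y) \in \mathbb{R} \times [0, \infty) \mid |\Lambda_k(x, y) - \lambda| < \lambda^{-\varepsilon}\},$$

$$P_{\lambda,k}^e = \{(x, y) \in \mathbb{R} \times [0, \infty) \mid \beta(2y + 1) \le \lambda^{1/3}, \ |\gamma^2 (x + k)^2 - \lambda| \le 2\lambda^{1/2}\}$$

and

$$P_{\lambda,k}^c = \{(x, y) \in \mathbb{R} \times [0, \infty) \mid |\gamma(x + k)| \le \lambda^{1/6}, \ |\Lambda_k(x, y) - \lambda| \le \lambda^{1/6}\}.$$

Set $\Omega_{\lambda,k} = P_{\lambda,k} \cap (\mathbb{Z} \times \mathbb{N}_0)$, and define $\Omega_{\lambda,k}^p$, $\Omega_{\lambda,k}^e$ and $\Omega_{\lambda,k}^c$ similarly. Let $P_{\lambda,k_3}$ denote the orthogonal projection onto the subspace of $\mathcal{H}_p$ spanned by the eigenvectors of $H_0(k)$ contained in

$$\{\phi_n \otimes \psi_{m,j} \mid (m, n) \in \Omega_{\lambda,k_3}, \ j = 0, \dots, p - 1\}.$$

It follows immediately that

$$\operatorname{rank} P_{\lambda,k_3} = p(\#\Omega_{\lambda,k_3}). \tag{3.8}$$

Proposition 3.2. We have $\langle \operatorname{rank} P_{\lambda,k_3} \rangle \le C(\Gamma, \beta)\, p \lambda^{1/2 - \varepsilon}$ for all $\lambda \ge 1$.

Define a new projection $Q_{\lambda,k_3} = I - P_{\lambda,k_3}$ and set

$$W_{\lambda,k_3} = Q_{\lambda,k_3} V Q_{\lambda,k_3}. \tag{3.9}$$

As the next result shows $W_{\lambda,k_3}$ can be regarded as the part of the potential $V$ which does not have a large effect on the eigenvalues of $H_0$ near $\lambda$.

Proposition 3.3. There exists $\lambda_0 = \lambda_0(V, \Gamma, \beta) \ge 1$ so that $N_{H_0 + W_{\lambda,k_3}}(\lambda) = N_{H_0}(\lambda)$ for all $\lambda \ge \lambda_0$.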
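The proofs of Propositions 3.2 and 3.3 are given in Secs. 3.2 and 3.3 respectively. Using these results we can now deal with Proposition 3.1.

Proof of Proposition 3.1. Using (3.9) we can write

$$H = H_0 + V = H_0 + W_{\lambda,k_3} + K_{\lambda,k_3} \tag{3.10}$$

where $K_{\lambda,k_3} = P_{\lambda,k_3} V Q_{\lambda,k_3} + Q_{\lambda,k_3} V P_{\lambda,k_3} + P_{\lambda,k_3} V P_{\lambda,k_3}$. Together with Proposition 3.3 and (3.10), it follows that

$$|N_H(\lambda) - N_{H_0}(\lambda)| = |N_H(\lambda) - N_{H_0 + W_{\lambda,k_3}}(\lambda)| \le \operatorname{rank} K_{\lambda,k_3} \le 3 \operatorname{rank} P_{\lambda,k_3}$$

for all $\lambda \ge \lambda_0$. Using Proposition 3.2 we then get

$$\big\langle |N_H(\lambda) - N_{H_0}(\lambda)| \big\rangle \le 3 \langle \operatorname{rank} P_{\lambda,k_3} \rangle \le C(\Gamma, \beta)\, p \lambda^{1/2 - \varepsilon}$$

for all $\lambda \ge \lambda_0$. This completes the proof.

3.2. Counting points

Remark 3.1. Let $M_0 \subseteq \mathbb{R} \times [0, \infty)$ and suppose $M_k \subseteq \mathbb{R} \times [0, \infty)$ is defined by translation for all $k \in \mathbb{R}$; that is, $M_k = M_0 - (k, 0)$. Then the number of integer points in $M_k$ (that is, $\#(M_k \cap (\mathbb{Z} \times \mathbb{N}_0))$) is a periodic function of $k$ (with period 1) and has an average value given by

$$\langle \#(M_k \cap (\mathbb{Z} \times \mathbb{N}_0)) \rangle = \sum_{n \in \mathbb{N}_0} L_n$$

where $L_n$ is the length of the line segment $M_k \cap (\mathbb{R} \times \{n\})$ (which is independent of $k$); to see this let $\chi_n : \mathbb{R} \to \{0, 1\}$ denote the characteristic function of $M_0 \cap (\mathbb{R} \times \{n\})$ so

$$\langle \#(M_k \cap (\mathbb{Z} \times \{n\})) \rangle = \int_0^1 \left( \sum_{m \in \mathbb{Z}} \chi_n(m + k) \right) dk = \int_{\mathbb{R}} \chi_n(x) \, dx = L_n.$$

For any $\lambda, k \in \mathbb{R}$ define a set

$$L_{\lambda,k} = \{(x, y) \in \mathbb{R} \times [0, \infty) \mid \Lambda_k(x, y) \le \lambda\}$$

so (3.4) can be rewritten as

$$N_{H_0}(\lambda) = p \, \#(L_{\lambda,k_3} \cap (\mathbb{Z} \times \mathbb{N}_0)). \tag{3.11}$$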
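The averaging principle of Remark 3.1 and the leading term in Lemma 3.1 can be illustrated with a short computation. The sketch below is an illustration only: it sets $p = 1$, uses hypothetical values of $\lambda$, $\beta$ and $\gamma$, and the function names are ours. It counts the integer points in (3.11) for sampled values of $k_3$, compares the sampled average with the sum of segment lengths, and compares that sum with $\frac{2}{3\beta\gamma}\lambda^{3/2}$.

```python
import math

def count_points(lam, k3, beta, gamma):
    """#{(n, m) in N0 x Z : beta*(2n+1) + gamma^2*(m+k3)^2 <= lam}."""
    total, n = 0, 0
    while beta * (2 * n + 1) <= lam:
        r = math.sqrt(lam - beta * (2 * n + 1)) / gamma   # need |m + k3| <= r
        total += max(0, math.floor(r - k3) - math.ceil(-r - k3) + 1)
        n += 1
    return total

def exact_average(lam, beta, gamma):
    """Average of count_points over k3 in [0, 1): the sum of the segment lengths L_n."""
    total, n = 0.0, 0
    while beta * (2 * n + 1) <= lam:
        total += 2.0 * math.sqrt(lam - beta * (2 * n + 1)) / gamma
        n += 1
    return total

def sampled_average(lam, beta, gamma, samples=2000):
    """Midpoint-rule average of count_points over k3 in [0, 1)."""
    return sum(count_points(lam, (i + 0.5) / samples, beta, gamma)
               for i in range(samples)) / samples

# Hypothetical parameter values chosen for illustration.
lam, beta, gamma = 100.0, 1.0, 1.0
exact = exact_average(lam, beta, gamma)
sampled = sampled_average(lam, beta, gamma)
leading = 2.0 / (3.0 * beta * gamma) * lam ** 1.5
```

Here `sampled` approximates `exact` (the discrepancy comes only from the finite sampling of $k_3$), and `exact - leading` stays bounded as `lam` grows, in line with Lemma 3.1.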
Proof of Lemma 3.1. In this proof, O(1) denotes a quantity whose absolute value can be bounded by C(β) for all λ ≥ 0.
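For any $n \in \mathbb{N}_0$ the length of the line segment $L_{\lambda,k_3} \cap (\mathbb{R} \times \{n\})$ is non-zero only if $\lambda \ge \beta(2n + 1)$ and is equal to $2\gamma^{-1}(\lambda - \beta(2n + 1))^{1/2}$ when this is the case. Thus $N_{H_0}(\lambda) = 0$ for $\lambda \in [0, \beta)$ while for $\lambda \ge \beta$ Remark 3.1 and (3.11) imply

$$\frac{\gamma}{p} \langle N_{H_0}(\lambda) \rangle = \sum_{n=0}^{[(\lambda - \beta)/2\beta]} 2(\lambda - \beta(2n + 1))^{1/2}. \tag{3.12}$$

Now, for any twice-differentiable function $f$ we have

$$\int_0^1 f(x) \, dx - \frac{1}{2}(f(0) + f(1)) = \int_0^1 \frac{x(x - 1)}{2} f''(x) \, dx.$$

This allows us to rewrite (3.12) in the form

$$\frac{\gamma}{p} \langle N_{H_0}(\lambda) \rangle = E + I + R$$

where

$$\begin{aligned}
E &= \frac{1}{2} \left( 2(\lambda - \beta)^{1/2} + 2\big(\lambda - \beta(2[(\lambda - 3\beta)/2\beta] + 1)\big)^{1/2} \right) + 2\big(\lambda - \beta(2[(\lambda - \beta)/2\beta] + 1)\big)^{1/2} \\
&= \lambda^{1/2} + O(1),
\end{aligned}$$

$$\begin{aligned}
I &= \int_0^{[(\lambda - 3\beta)/2\beta]} 2(\lambda - \beta(2y + 1))^{1/2} \, dy \\
&= \frac{2}{3\beta} \left( (\lambda - \beta)^{3/2} - \big(\lambda - \beta(2[(\lambda - 3\beta)/2\beta] + 1)\big)^{3/2} \right) \\
&= \frac{2}{3\beta} (\lambda - \beta)^{3/2} + O(1) \\
&= \frac{2}{3\beta} \lambda^{3/2} - \lambda^{1/2} + O(1)
\end{aligned}$$

and

$$|R| \le \frac{1}{4} \int_0^{[(\lambda - 3\beta)/2\beta]} \left| \frac{d^2}{dy^2} \, 2(\lambda - \beta(2y + 1))^{1/2} \right| dy = O(1).$$

The result follows.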
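The one-step trapezoid identity used in the proof of Lemma 3.1 follows from two integrations by parts. The derivation below is a brief sketch assuming $f$ is twice continuously differentiable on $[0, 1]$.

```latex
\begin{align*}
\int_0^1 \frac{x(x-1)}{2} f''(x)\,dx
 &= \Big[ \tfrac{x(x-1)}{2} f'(x) \Big]_0^1 - \int_0^1 \Big( x - \tfrac12 \Big) f'(x)\,dx \\
 &= - \Big[ \Big( x - \tfrac12 \Big) f(x) \Big]_0^1 + \int_0^1 f(x)\,dx \\
 &= \int_0^1 f(x)\,dx - \tfrac12 \big( f(0) + f(1) \big).
\end{align*}
```

Applying this on each interval $[j, j+1]$ with $f(y) = 2(\lambda - \beta(2y+1))^{1/2}$ and summing over $j$ expresses the sum up to $n = [(\lambda - 3\beta)/2\beta]$ as an integral plus endpoint terms plus a remainder controlled by $\int |f''|$, since $|x(x-1)/2| \le 1/8$ on $[0, 1]$; the final summand $n = [(\lambda - \beta)/2\beta]$ is kept separately.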
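Proof of Proposition 3.2. From (3.8)

$$\frac{1}{p} \langle \operatorname{rank} P_{\lambda,k_3} \rangle = \langle \#\Omega_{\lambda,k_3} \rangle \le \langle \#\Omega_{\lambda,k_3}^p \rangle + \langle \#\Omega_{\lambda,k_3}^e \rangle + \langle \#\Omega_{\lambda,k_3}^c \rangle.$$

Now

$$L_{\lambda - \lambda^{-\varepsilon}, k_3} \subset L_{\lambda + \lambda^{-\varepsilon}, k_3} \quad \text{and} \quad P_{\lambda,k_3}^p \subseteq L_{\lambda + \lambda^{-\varepsilon}, k_3} \setminus L_{\lambda - \lambda^{-\varepsilon}, k_3}$$

so (3.11) implies

$$\#\Omega_{\lambda,k_3}^p \le \frac{1}{p} N_{H_0}(\lambda + \lambda^{-\varepsilon}) - \frac{1}{p} N_{H_0}(\lambda - \lambda^{-\varepsilon}).$$

Lemma 3.1 then gives

$$\langle \#\Omega_{\lambda,k_3}^p \rangle \le \frac{2}{3\beta\gamma} (\lambda + \lambda^{-\varepsilon})^{3/2} - \frac{2}{3\beta\gamma} (\lambda - \lambda^{-\varepsilon})^{3/2} + C(\Gamma, \beta) \le \frac{2}{\beta\gamma} \lambda^{1/2 - \varepsilon} + C(\Gamma, \beta).$$

Next note that $P_{\lambda,k_3}^e \cap (\mathbb{R} \times \{n\})$ is empty if $n \ge \lambda^{1/3}/(2\beta)$ and consists of two line segments each with length

$$\sqrt{\lambda + 2\lambda^{1/2}} - \sqrt{\lambda - 2\lambda^{1/2}} \le 4$$

otherwise (where the second square root should be replaced with 0 if $\lambda \in [1, 4]$). Remark 3.1 now implies

$$\langle \#\Omega_{\lambda,k_3}^e \rangle \le 8([(2\beta)^{-1} \lambda^{1/3}] + 1) \le 4(\beta^{-1} + 2) \lambda^{1/3}.$$

Finally, for each $n \in \mathbb{N}_0$ let $S_n$ denote the rectangular strip with height 1 and base $P_{\lambda,k_3}^c \cap (\mathbb{R} \times \{n\})$; that is,

$$S_n = \{(x, n + w) \mid (x, n) \in P_{\lambda,k_3}^c, \ w \in [0, 1)\}.$$

Clearly the length of $P_{\lambda,k_3}^c \cap (\mathbb{R} \times \{n\})$ is just $|S_n|$, the area of $S_n$. Since the $S_n$ are disjoint for distinct $n$, Remark 3.1 now gives

$$\langle \#\Omega_{\lambda,k_3}^c \rangle = \sum_{n \in \mathbb{N}_0} |S_n| = |S| \quad \text{where } S := \bigcup_{n \in \mathbb{N}_0} S_n.$$

On the other hand it is straightforward to see that

$$S \subseteq \{(x, y) \in \mathbb{R} \times [0, \infty) \mid |\gamma(x + k_3)| \le \lambda^{1/6}, \ \Lambda_{k_3}(x, y) - \lambda \in [-\lambda^{1/6}, \lambda^{1/6} + 2\beta]\}.$$

Combining the previous two equations we now get

$$\langle \#\Omega_{\lambda,k_3}^c \rangle \le \frac{2}{\gamma} \lambda^{1/6} \, \frac{1}{2\beta} (2\lambda^{1/6} + 2\beta) \le \frac{2(1 + \beta)}{\beta\gamma} \lambda^{1/3}.$$

The result follows from the above estimates and the fact that $\varepsilon < 1/6$ (so $\lambda^{1/3} \le \lambda^{1/2 - \varepsilon}$).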
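3.3. Estimating norms

Using the orthonormal basis $\{\psi_{m,j} \mid (m, j) \in W_p\}$ for $\ell^2(\mathbb{Z}) \otimes \mathbb{C}^p$ we can write the fibre space $\mathcal{H}_p = L^2(\mathbb{R}) \otimes \ell^2(\mathbb{Z}) \otimes \mathbb{C}^p$ as an orthogonal direct sum

$$\mathcal{H}_p = \bigoplus_{(m,j) \in W_p} L^2(\mathbb{R}) \otimes \psi_{m,j}. \tag{3.13}$$

Let $T_{m,j}$ denote the orthogonal projection of $\mathcal{H}_p$ onto $L^2(\mathbb{R}) \otimes \psi_{m,j}$.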
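Now suppose $\lambda \notin \sigma(H_0(k))$ and let $R_0(\lambda) = (H_0 - \lambda)^{-1}$ denote the resolvent of $H_0$. Since the operator $Q_{\lambda,k_3} |R_0(\lambda)|^{1/2}$ is diagonal with respect to the direct sum (3.13) we can write

$$Q_{\lambda,k_3} |R_0(\lambda)|^{1/2} (\phi \otimes \psi_{m,j}) = \tilde R_m \phi \otimes \psi_{m,j} \tag{3.14}$$

for each $(m, j) \in W_p$, where $\tilde R_m$ is a self-adjoint operator on $L^2(\mathbb{R})$ (to ease notational congestion the dependence of $\tilde R_m$ on $\lambda$ and $k_3$ has not been indicated explicitly; also, as will soon become apparent, this operator does not depend on $j$). The action of $\tilde R_m$ is easiest to specify in terms of the orthonormal basis $\{\phi_n \mid n \in \mathbb{N}_0\}$ for $L^2(\mathbb{R})$ (see (2.15)). Firstly note that

$$|R_0(\lambda)|^{1/2} (\phi_n \otimes \psi_{m,j}) = |\Lambda_{k_3}(m, n) - \lambda|^{-1/2} (\phi_n \otimes \psi_{m,j})$$

while

$$Q_{\lambda,k_3} (\phi_n \otimes \psi_{m,j}) = \begin{cases} \phi_n \otimes \psi_{m,j} & \text{if } n \in \Omega_{\gamma^2 (m + k_3)^2 - \lambda} \\ 0 & \text{otherwise} \end{cases}$$

where, for any $a \in \mathbb{R}$, we define a set

$$\Omega_a = \left\{ n \in \mathbb{N}_0 \;\middle|\; |\beta(2n + 1) + a| \ge \lambda^{-\varepsilon}, \ \beta(2n + 1) > \lambda^{1/3} \text{ if } |a| \le 2\lambda^{1/2} \text{ and } |\beta(2n + 1) + a| > \lambda^{1/6} \text{ if } |a + \lambda| \le \lambda^{1/3} \right\}.$$

Hence

$$\tilde R_m \phi_n = \begin{cases} |\Lambda_{k_3}(m, n) - \lambda|^{-1/2} \phi_n & \text{if } n \in \Omega_{\gamma^2 (m + k_3)^2 - \lambda} \\ 0 & \text{otherwise.} \end{cases} \tag{3.15}$$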
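The three conditions defining $\Omega_a$ mirror the three excluded regions $P^p$, $P^e$ and $P^c$ of Sec. 3.1, and can be written as a direct membership test. The following sketch is illustrative only: the function names and the numerical values are hypothetical, and floating-point comparisons are used as they stand, so inputs lying exactly on a threshold may be classified either way.

```python
def in_omega(n, a, lam, beta, eps):
    """Membership test for the index set Omega_a defined before (3.15)."""
    shifted = beta * (2 * n + 1) + a   # equals Lambda - lam when a = gamma^2 (m + k3)^2 - lam
    if abs(shifted) < lam ** (-eps):
        return False                   # too close to lam (region P^p)
    if abs(a) <= 2 * lam ** 0.5 and beta * (2 * n + 1) <= lam ** (1 / 3):
        return False                   # low oscillator level (region P^e)
    if abs(a + lam) <= lam ** (1 / 3) and abs(shifted) <= lam ** (1 / 6):
        return False                   # small longitudinal part (region P^c)
    return True

def reduced_resolvent_weight(n, m, k3, lam, beta, gamma, eps):
    """Eigenvalue of R~_m on phi_n as in (3.15): 0 outside Omega_a, |Lambda - lam|^(-1/2) inside."""
    a = gamma ** 2 * (m + k3) ** 2 - lam
    if not in_omega(n, a, lam, beta, eps):
        return 0.0
    return abs(beta * (2 * n + 1) + a) ** -0.5

# Hypothetical values: lam = 4096 gives lam^(1/2) = 64 and lam^(1/6) close to 4.
lam, beta, gamma, eps = 4096.0, 1.0, 1.0, 1.0 / 36.0
weight = reduced_resolvent_weight(10, 0, 0.0, lam, beta, gamma, eps)
```

For example, with these values `in_omega(3, 0.0, lam, beta, eps)` is `False` (second condition), `in_omega(10, -21.0, lam, beta, eps)` is `False` (first condition), and `in_omega(10, 0.0, lam, beta, eps)` is `True`.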
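Lemma 3.3. For any $0 \ne m \in \mathbb{Z}^3$, $l \in \mathbb{Z}$ and $\lambda \ge 1$ we have an estimate

$$\|\tilde R_{l + m_3} \tilde U_m \tilde R_l\| \le B(m, \lambda)$$

where the bound $B(m, \lambda)$ is given as

$$B(m, \lambda) = C(\Gamma, \beta) \lambda^{\varepsilon} \begin{cases} \lambda^{-1/18} & \text{if } |m| \le c\lambda^{1/18} \\ 1 & \text{if } |m| > c\lambda^{1/18} \end{cases} \tag{3.16}$$

with $c = \min\{\beta^{1/3} (3K_{\tilde\Gamma})^{-1}, \gamma^{-1}\}$; in particular, $B(m, \lambda)$ is independent of $l$.

Proof. For any $a \in \mathbb{R}$ partition the set $\Omega_a$ as $\Omega_a = \Omega_a^< \cup \Omega_a^>$ where

$$\Omega_a^{\lessgtr} = \{n \in \Omega_a \mid |\beta(2n + 1) + a| \lessgtr \lambda^{1/9}\},$$

with cases of equality to be included in $\Omega_a^>$. Straightforward arguments show that

$$\sum_{n \in \Omega_a^<} \frac{1}{|\beta(2n + 1) + a|} \le C_1 \lambda^{\varepsilon}, \tag{3.17}$$

$$\sum_{n \in \Omega_a^>} \frac{1}{|\beta(2n + 1) + a|^2} \le C_2 \lambda^{-1/9} \tag{3.18}$$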
and

$$\sum_{|n| \le N} \frac{1}{(1 + |n|)^{2/3}} \le C_3 N^{1/3} \tag{3.19}$$
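for all $a \in \mathbb{R}$ and $N \ge 1$, where $C_1, C_2, C_3$ are independent of $a$ and $N$, and depend continuously on $\beta$. Using the orthonormal basis $\{\phi_n \mid n \in \mathbb{N}_0\}$ for $L^2(\mathbb{R})$, together with (3.15) and (2.18), we can estimate
\[ \bigl\| \tilde{R}^{\,l+m_3}\, \tilde{U}_m\, \tilde{R}^{\,l} \bigr\|^2 \le \sum_{n,n' \in \mathbb{N}_0} \bigl| \langle \tilde{U}_m \tilde{R}^{\,l}\phi_n,\ \tilde{R}^{\,l+m_3}\phi_{n'} \rangle \bigr|^2 = \sum_{\substack{n \in \Omega_a \\ n' \in \Omega_{a'}}} \frac{|U_m^{n,n'}|^2}{|\beta(2n+1)+a|\,|\beta(2n'+1)+a'|} \tag{3.20} \]
where $a = \gamma^2(l+k_3)^2 - \lambda$ and $a' = \gamma^2(l+m_3+k_3)^2 - \lambda$. For each $n, n' \in \mathbb{N}_0$ let $S_{n,n'}$ denote the term appearing in the sum on the right-hand side of (3.20). We will now estimate this sum by using the partitions of $\Omega_a$ and $\Omega_{a'}$ to split it into four parts.

Part (i): $n \in \Omega_a^{>}$ and $n' \in \Omega_{a'}^{>}$. Using Hölder's inequality, (2.20) and (3.18) we get
\[ \sum_{\substack{n \in \Omega_a^{>} \\ n' \in \Omega_{a'}^{>}}} S_{n,n'} \le \Biggl( \sum_{\substack{n \in \Omega_a^{>} \\ n' \in \Omega_{a'}^{>}}} \frac{|U_m^{n,n'}|^2}{|\beta(2n+1)+a|^2} \Biggr)^{1/2} \Biggl( \sum_{\substack{n \in \Omega_a^{>} \\ n' \in \Omega_{a'}^{>}}} \frac{|U_m^{n,n'}|^2}{|\beta(2n'+1)+a'|^2} \Biggr)^{1/2} \]
\[ \le \Biggl( \sum_{n \in \Omega_a^{>}} \frac{1}{|\beta(2n+1)+a|^2} \Biggr)^{1/2} \Biggl( \sum_{n' \in \Omega_{a'}^{>}} \frac{1}{|\beta(2n'+1)+a'|^2} \Biggr)^{1/2} \le C_2 \lambda^{-1/9}. \]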
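Part (ii): $n \in \Omega_a^{>}$ and $n' \in \Omega_{a'}^{<}$. Using (2.20) and (3.17) we get
\[ \sum_{\substack{n \in \Omega_a^{>} \\ n' \in \Omega_{a'}^{<}}} S_{n,n'} \le \sum_{n' \in \Omega_{a'}^{<}} \frac{1}{|\beta(2n'+1)+a'|} \Biggl( \max_{n \in \Omega_a^{>}} \frac{1}{|\beta(2n+1)+a|} \sum_{n \in \Omega_a^{>}} |U_m^{n,n'}|^2 \Biggr) \le \lambda^{-1/9} \sum_{n' \in \Omega_{a'}^{<}} \frac{1}{|\beta(2n'+1)+a'|} \le C_1 \lambda^{\varepsilon - 1/9}. \]

Part (iii): $n \in \Omega_a^{<}$ and $n' \in \Omega_{a'}^{>}$. This case can be treated by an argument similar to that used in the previous case, resulting in the same estimate.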
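Part (iv): $n \in \Omega_a^{<}$ and $n' \in \Omega_{a'}^{<}$. We now deal with the most subtle part of the sum (3.20). The argument depends critically on the value of $m$ and will be split into three sub-cases.

Case (i): $|m| > c\lambda^{1/18}$. Using the basic estimate (2.19) with (3.17) we have
\[ \sum_{\substack{n \in \Omega_a^{<} \\ n' \in \Omega_{a'}^{<}}} S_{n,n'} \le \Biggl( \sum_{n \in \Omega_a^{<}} \frac{1}{|\beta(2n+1)+a|} \Biggr) \Biggl( \sum_{n' \in \Omega_{a'}^{<}} \frac{1}{|\beta(2n'+1)+a'|} \Biggr) \le C_1^2 \lambda^{2\varepsilon}. \]

Now assume $|m| \le c\lambda^{1/18}$ for the remaining two sub-cases. If $|a| \le 2\lambda^{1/2}$ then $\beta(2n+1) > \lambda^{1/3}$ (by the definition of $\Omega_a$). On the other hand, if $|a| > 2\lambda^{1/2}$ then
\[ \beta(2n+1) \ge |a| - |\beta(2n+1)+a| \ge 2\lambda^{1/2} - \lambda^{1/9} \ge \lambda^{1/3}. \]
A similar argument also gives $\beta(2n'+1) \ge \lambda^{1/3}$. It follows that
\[ \lambda^{1/3} \le \beta(n'+n+1) \ \Rightarrow\ |m| \le c\lambda^{1/18} \le \beta^{1/2}(3K_{\tilde\Gamma})^{-1}(n'+n+1)^{1/6}. \tag{3.21} \]

Case (ii): $m_3 = 0$. Using (2.22) from Proposition 2.1 we get
\[ |U_m^{n,n'}| \le C_4 (1+n'+n)^{-1/6} \le C_4 \beta^{1/6} \lambda^{-1/18} \]
where $C_4 = 4\max\{1, \beta^{1/6} K_{\tilde\Gamma}^{1/3}\}$. Coupling this estimate with (3.17) we now get
\[ \sum_{\substack{n \in \Omega_a^{<} \\ n' \in \Omega_{a'}^{<}}} S_{n,n'} \le C_4^2 \beta^{1/3} \lambda^{-1/9} \Biggl( \sum_{n \in \Omega_a^{<}} \frac{1}{|\beta(2n+1)+a|} \Biggr) \Biggl( \sum_{n' \in \Omega_{a'}^{<}} \frac{1}{|\beta(2n'+1)+a'|} \Biggr) \le C_1^2 C_4^2 \beta^{1/3} \lambda^{2\varepsilon - 1/9}. \]

Case (iii): $m_3 \ne 0$. We may obviously assume $\Omega_a^{<} \ne \emptyset$ which implies $|a+\lambda| > \lambda^{1/3}$ (recall the definitions of $\Omega_a$ and $\Omega_a^{<}$). Therefore
\[ |l+k_3| = \gamma^{-1}(a+\lambda)^{1/2} > \gamma^{-1}\lambda^{1/6}. \]
With a similar argument the assumption $\Omega_{a'}^{<} \ne \emptyset$ leads to $|l+m_3+k_3| > \gamma^{-1}\lambda^{1/6}$. It follows that
\[ |a'-a| = \gamma^2 \bigl| (l+m_3+k_3)^2 - (l+k_3)^2 \bigr| = \gamma^2 \bigl( |l+m_3+k_3| + |l+k_3| \bigr) \bigl|\, |l+m_3+k_3| - |l+k_3| \,\bigr| \ge 2\gamma\lambda^{1/6} \bigl|\, |l+m_3+k_3| - |l+k_3| \,\bigr|. \]
If $l+m_3+k_3$ and $l+k_3$ have the same sign then
\[ \bigl|\, |l+m_3+k_3| - |l+k_3| \,\bigr| = |m_3| \ge 1 \]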
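since $m_3 \ne 0$. On the other hand, if $l+m_3+k_3$ and $l+k_3$ have opposite signs then
\[ \bigl|\, |l+m_3+k_3| - |l+k_3| \,\bigr| \ge |2(l+k_3) + m_3| \ge 2|l+k_3| - |m_3| \ge 2\gamma^{-1}\lambda^{1/6} - \gamma^{-1}\lambda^{1/18} \ge \gamma^{-1}, \]
where the second last inequality uses the fact that $|m_3| \le |m| \le c\lambda^{1/18} \le \gamma^{-1}\lambda^{1/18}$. The above estimates combine to ensure that
\[ N := (4\beta)^{-1}|a'-a| \ge C_5 \lambda^{1/6} \quad\text{with}\quad C_5 := (2\beta)^{-1}\min\{\gamma, 1\}. \]
If $n \in \Omega_a^{<}$ and $n' \in \Omega_{a'}^{<}$ satisfy $|n'-n| \ge N$ then (2.21) from Proposition 2.1 implies
\[ |U_m^{n,n'}| \le 6(1+|n'-n|)^{-1/3} \le 6N^{-1/3} \le 6 C_5^{-1/3} \lambda^{-1/18}. \]
Coupling this with (3.17) we now get
\[ \sum_{\substack{n \in \Omega_a^{<},\ n' \in \Omega_{a'}^{<} \\ |n'-n| \ge N}} S_{n,n'} \le 36 C_5^{-2/3} \lambda^{-1/9} \Biggl( \sum_{n \in \Omega_a^{<}} \frac{1}{|\beta(2n+1)+a|} \Biggr) \Biggl( \sum_{n' \in \Omega_{a'}^{<}} \frac{1}{|\beta(2n'+1)+a'|} \Biggr) \le 36 C_1^2 C_5^{-2/3} \lambda^{2\varepsilon - 1/9}. \]
Now suppose $n \in \Omega_a^{<}$ and $n' \in \Omega_{a'}^{<}$ satisfy $|n'-n| < N$. It follows that
\[ \frac{1}{|\beta(2n+1)+a|\,|\beta(2n'+1)+a'|} = \left| \frac{1}{2\beta(n'-n) + (a'-a)} \right| \left| \frac{1}{\beta(2n+1)+a} - \frac{1}{\beta(2n'+1)+a'} \right| \le \frac{1}{2\beta} N^{-1} \left( \frac{1}{|\beta(2n+1)+a|} + \frac{1}{|\beta(2n'+1)+a'|} \right). \]
Coupling this with (2.21) from Proposition 2.1, (3.17) and (3.19) we now get
\[ \sum_{\substack{n \in \Omega_a^{<},\ n' \in \Omega_{a'}^{<} \\ |n'-n| < N}} S_{n,n'} \le 18\beta^{-1}N^{-1} \Biggl( \sum_{n \in \Omega_a^{<}} \frac{1}{|\beta(2n+1)+a|} \sum_{\substack{n' \in \mathbb{Z} \text{ with} \\ |n'-n| < N}} \frac{1}{(1+|n'-n|)^{2/3}} + \sum_{n' \in \Omega_{a'}^{<}} \frac{1}{|\beta(2n'+1)+a'|} \sum_{\substack{n \in \mathbb{Z} \text{ with} \\ |n'-n| < N}} \frac{1}{(1+|n'-n|)^{2/3}} \Biggr) \]
\[ \le 36 C_1 C_3 \beta^{-1} N^{-2/3} \lambda^{\varepsilon} \le 36 C_1 C_3 C_5^{-2/3} \beta^{-1} \lambda^{\varepsilon - 1/9}. \]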
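This completes the consideration of the various parts of the sum (3.20) and with it the result.

The next step toward proving Proposition 3.3 is to obtain a norm estimate for the operator
\[ S := |R_0(\lambda)|^{1/2}\, W_{\lambda,k_3}\, |R_0(\lambda)|^{1/2} = |R_0(\lambda)|^{1/2}\, Q_{\lambda,k_3} V Q_{\lambda,k_3}\, |R_0(\lambda)|^{1/2}. \]

Lemma 3.4. We have $\|S\| \le C(V,\Gamma,\beta)\lambda^{-\varepsilon}$ for all $\lambda \ge 1$.

Proof. Since $S$ is self-adjoint we have
\[ \|T_{m,j} S T_{m',j'}\| = \|(T_{m,j} S T_{m',j'})^{*}\| = \|T_{m',j'} S T_{m,j}\| \]
for all $(m,j), (m',j') \in W_p$. It follows that we can use the orthogonal direct sum decomposition (3.13) to estimate $\|S\|$ as
\[ \|S\| \le \sup_{(m,j) \in W_p} \sum_{(m',j') \in W_p} \|T_{m',j'} S T_{m,j}\|. \tag{3.22} \]
From (2.12) and (3.14) we see that
\[ T_{m',j'} S T_{m,j}(\phi \otimes \psi_{m,j}) = \bigl( \tilde{R}^{m'}\, \tilde{V}^{m',j'}_{m,j}\, \tilde{R}^{m} \phi \bigr) \otimes \psi_{m',j'} \]
for all $(m,j), (m',j') \in W_p$. Using (2.14) and Lemma 3.3 we then get
\[ \|T_{m',j'} S T_{m,j}\| = \bigl\| \tilde{R}^{m'}\, \tilde{V}^{m',j'}_{m,j}\, \tilde{R}^{m} \bigr\| \le \sum_{m_1,m_2 \in \mathbb{Z}} |V_{(m_1,\, m_2 p + j' - j,\, m' - m)}| \, \bigl\| \tilde{R}^{m'}\, \tilde{U}_{(m_1,\, m_2 p + j' - j,\, m' - m)}\, \tilde{R}^{m} \bigr\| \]
\[ \le \sum_{m_1,m_2 \in \mathbb{Z}} |V_{(m_1,\, m_2 p + j' - j,\, m' - m)}| \, B\bigl( (m_1,\, m_2 p + j' - j,\, m' - m), \lambda \bigr) \]
(recall that $V_0 = 0$ by assumption). Therefore
\[ \sum_{(m',j') \in W_p} \|T_{m',j'} S T_{m,j}\| \le \sum_{(m',j') \in W_p} \sum_{m_1,m_2 \in \mathbb{Z}} |V_{(m_1,\, m_2 p + j' - j,\, m' - m)}| \, B\bigl( (m_1,\, m_2 p + j' - j,\, m' - m), \lambda \bigr) = \sum_{m \in \mathbb{Z}^3} |V_m| B(m,\lambda). \]
The final expression is independent of $(m,j) \in W_p$ so (3.22) and the definition of $B(m,\lambda)$ (see (3.16)) now lead us to
\[ \|S\| \le \sum_{m \in \mathbb{Z}^3} |V_m| B(m,\lambda) \le C(\Gamma,\beta)\lambda^{\varepsilon} \Biggl( \lambda^{-1/18} \sum_{|m| \le c\lambda^{1/18}} |V_m| + \sum_{|m| > c\lambda^{1/18}} |V_m| \Biggr). \]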
However the regularity condition (2.6) satisfied by $V$ implies
\[ \sum_{|m| \le c\lambda^{1/18}} |V_m| \le \sum_{m \in \mathbb{Z}^3} |m|^{\delta} |V_m| \le C(V,\Gamma) \]
and
\[ \sum_{|m| > c\lambda^{1/18}} |V_m| \le c^{-\delta}\lambda^{-\delta/18} \sum_{m \in \mathbb{Z}^3} |m|^{\delta} |V_m| \le C(V,\Gamma,\beta)\lambda^{-\delta/18}. \]
The result now follows from the fact that $-\min\{1,\delta\}/18 = -2\varepsilon$.

Proof of Proposition 3.3. Since the counting function is upper semi-continuous it suffices to prove the result under the assumption that $\lambda \notin \sigma(H_0)$. It follows that the operator $|H_0 - \lambda|^{1/2}$ has a bounded inverse given by $|R_0(\lambda)|^{1/2}$. Defining $T$ to be the unitary operator satisfying $H_0 - \lambda = T|H_0 - \lambda|$ we can now write
\[ H_0 + gW_{\lambda,k_3} - \lambda = T|H_0 - \lambda|^{1/2} \bigl( I + gT^{*}|R_0(\lambda)|^{1/2} W_{\lambda,k_3} |R_0(\lambda)|^{1/2} \bigr) |H_0 - \lambda|^{1/2} \]
for all $g \in \mathbb{R}$ (n.b., $T^{*}$ commutes with $|R_0(\lambda)|^{1/2}$). Thus the operator $H_0 + gW_{\lambda,k_3} - \lambda$ has a bounded inverse whenever $g \in [0,1]$ and
\[ \bigl\| |R_0(\lambda)|^{1/2} W_{\lambda,k_3} |R_0(\lambda)|^{1/2} \bigr\| \le \frac{1}{2}. \tag{3.23} \]
However Lemma 3.4 allows us to find $\lambda_0 = \lambda_0(V,\Gamma,\beta) \ge 1$ such that (3.23) is satisfied for all $\lambda \ge \lambda_0$. Hence $\lambda \notin \sigma(H_0 + gW_{\lambda,k_3})$ for all $g \in [0,1]$ and $\lambda \ge \lambda_0$. Now the operator $H_0$ has compact resolvent whilst $W_{\lambda,k_3}$ is self-adjoint and bounded. It follows that the spectrum of $H_0 + gW_{\lambda,k_3}$ consists of real eigenvalues depending continuously on $g$. Since none of these eigenvalues can cross $\lambda$ as $g$ varies from 0 to 1 we must have $N_{H_0 + W_{\lambda,k_3}}(\lambda) = N_{H_0}(\lambda)$ whenever $\lambda \ge \lambda_0$. This completes the proof.

4. Proof of Theorem 2.1

The reduction of the operator $H$ will be achieved in three steps. The first step is to apply a Gelfand (or Zak) transformation in the variables $x_2$ and $x_3$ (n.b., $H$ is periodic with respect to these variables). Define
\[ \Phi : L^2(\mathbb{R}^3) \to \int^{\oplus}_{[0,1)^2} L^2(\mathbb{R} \times S^1 \times S^1)\, d\tilde{k} \]
by
\[ \Phi u(x; \tilde{k}) = e^{-i\tilde{k}\tilde{x}} \sum_{\tilde{m} \in \mathbb{Z}^2} e^{-2\pi i \tilde{k}\tilde{m}}\, u(x + 2\pi(0, \tilde{m})). \]
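Standard arguments (see [16]) show that $\Phi$ is a well-defined unitary operator with inverse
\[ \Phi^{*}\phi(x + 2\pi(0,\tilde{m})) = \int_{[0,1)^2} e^{i\tilde{k}(\tilde{x} + 2\pi\tilde{m})}\, \phi(x; \tilde{k})\, d\tilde{k}. \]
Furthermore we get
\[ \Phi(H_0 + V)\Phi^{*} = \int^{\oplus}_{[0,1)^2} \bigl( H_0(\tilde{k}) + V \bigr)\, d\tilde{k} \]
where $V$ is now considered to be a function on $\mathbb{R} \times S^1 \times S^1$ and
\[ H_0(\tilde{k}) = \bigl( D + (0,\tilde{k})^{T} - A \bigr)^{T} G \bigl( D + (0,\tilde{k})^{T} - A \bigr), \]
the operator $H_0$ with $D_i$ replaced by $D_i + k_i$ for $i = 2, 3$ and considered to be acting on $L^2(\mathbb{R} \times S^1 \times S^1)$.

The second step of the reduction is effected by the operator
\[ \Psi : \int^{\oplus}_{[0,1)^2} L^2(\mathbb{R} \times S^1 \times S^1)\, d\tilde{k} \to \int^{\oplus}_{[0,1)^2} L^2(\mathbb{R}) \otimes \ell^2(\mathbb{Z}^2)\, d\tilde{k} \]
defined by
\[ \Psi u(x_1, \tilde{m}; \tilde{k}) = \frac{1}{2\pi}\, e^{ia(\tilde{m})}\, e^{i\tilde{b}(\tilde{m}+\tilde{k})x_1} \int_{S^1 \times S^1} e^{-i\tilde{m}\tilde{x}}\, u\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}), \tilde{x}; \tilde{k} \bigr)\, d\tilde{x} \]
where $\tilde{b}$, $\tilde{c}$ are given by (2.7) and
\[ a(\tilde{m}) = \frac{1}{2\beta}\, m_2 \bigl( f_{2x} f_{2y} m_2 + 2 f_{2x} f_{3y} m_3 \bigr) = \frac{1}{2} b_2 c_2 m_2^2 + b_2 c_3 m_2 m_3. \tag{4.1} \]
A routine calculation shows that $\Psi$ is well defined and satisfies $\langle \Psi\phi, \Psi\psi \rangle = \langle \phi, \psi \rangle$ for Schwartz class functions. We can then extend $\Psi$ by continuity to get the required unitary operator. Furthermore, the inverse of $\Psi$ is given by
\[ \Psi^{*}\phi(x; \tilde{k}) = \frac{1}{2\pi} \sum_{\tilde{m} \in \mathbb{Z}^2} e^{-ia(\tilde{m})}\, e^{-i\tilde{b}(\tilde{m}+\tilde{k})(x_1 - \tilde{c}(\tilde{m}+\tilde{k}))}\, e^{i\tilde{m}\tilde{x}}\, \phi\bigl( x_1 - \tilde{c}(\tilde{m}+\tilde{k}), \tilde{m}; \tilde{k} \bigr) \]
for any $\phi(x_1, \tilde{m}; \tilde{k})$ in $L^2(\mathbb{R}) \otimes \ell^2(\mathbb{Z}^2)$.

Write $y_1 = x_1 - \tilde{c}(\tilde{m}+\tilde{k})$ so $x_1 \phi(y_1, \tilde{m}; \tilde{k}) = \bigl( y_1 + \tilde{c}(\tilde{m}+\tilde{k}) \bigr) \phi(y_1, \tilde{m}; \tilde{k})$ and
\[ -i\frac{\partial}{\partial x_1} \Bigl( e^{-i\tilde{b}(\tilde{m}+\tilde{k})y_1} \phi(y_1, \tilde{m}; \tilde{k}) \Bigr) = e^{-i\tilde{b}(\tilde{m}+\tilde{k})y_1} \Bigl( -i\frac{\partial}{\partial y_1} - \tilde{b}(\tilde{m}+\tilde{k}) \Bigr) \phi(y_1, \tilde{m}; \tilde{k}). \]
On the other hand $-i(\partial/\partial x_i)\, e^{i\tilde{m}\tilde{x}} = m_i\, e^{i\tilde{m}\tilde{x}}$ for $i = 2, 3$. Since $A = (\beta_1 x_1, \beta_2 x_1, 0)^{T}$ it follows that
\[ \Psi \bigl( D + (0,\tilde{k})^{T} - A \bigr) \Psi^{*} = \begin{pmatrix} D_1 - \beta_1\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}) \bigr) - \tilde{b}(\tilde{m}+\tilde{k}) \\ m_2 + k_2 - \beta_2\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}) \bigr) \\ m_3 + k_3 \end{pmatrix}. \]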
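Now
\[ f_{1x}\beta_1 + f_{2x}\beta_2 = \frac{\beta e_{1x}}{4\pi^2} \bigl( f_{1x} e_{1y} + f_{2x} e_{2y} \bigr) = 0 \]
(consider the 12 components of the matrix equation $FE = 2\pi I_3$). Coupled with our choice for $\tilde{b}$ and $\tilde{c}$ (see (2.7)) we now get
\[ \Psi F \bigl( D + (0,\tilde{k})^{T} - A \bigr) \Psi^{*} = \begin{pmatrix} f_{1x} D_1 - (f_{1x}\beta_1 + f_{2x}\beta_2)\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}) \bigr) - f_{1x}\tilde{b}(\tilde{m}+\tilde{k}) + (f_{2x}, f_{3x})(\tilde{m}+\tilde{k}) \\ (f_{2y}, f_{3y})(\tilde{m}+\tilde{k}) - f_{2y}\beta_2\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}) \bigr) \\ f_{3z}(m_3 + k_3) \end{pmatrix} = \begin{pmatrix} f_{1x} D_1 \\ -f_{2y}\beta_2 x_1 \\ f_{3z}(m_3 + k_3) \end{pmatrix}. \]
Hence
\[ \Psi H_0(\tilde{k}) \Psi^{*} \phi(x_1, \tilde{m}; \tilde{k}) = \Bigl( f_{1x}^2 \bigl( D_1^2 + (\alpha x_1)^2 \bigr) + \gamma^2 (m_3 + k_3)^2 \Bigr) \phi(x_1, \tilde{m}; \tilde{k}) \tag{4.2} \]
where $\alpha$ and $\gamma$ are given by (2.8).

To determine the effect of $\Psi$ on the potential $V$ we first collect some terms in (2.3) and write $V$ as
\[ V(x) = \sum_{\tilde{m} \in \mathbb{Z}^2} e^{i\tilde{m}\tilde{x}}\, V_{\tilde{m}}(x_1) \]
where
\[ V_{\tilde{m}}(x_1) := \sum_{m_1 \in \mathbb{Z}} V_m\, e^{im_1 x_1} \tag{4.3} \]
is a periodic function with period $2\pi$. Then
\[ (V\Psi^{*}\phi)(x; \tilde{k}) = \frac{1}{2\pi} \sum_{\tilde{n}, \tilde{m} \in \mathbb{Z}^2} e^{i\tilde{m}\tilde{x}}\, V_{\tilde{m}-\tilde{n}}(x_1)\, e^{-ia(\tilde{n})}\, e^{-i\tilde{b}(\tilde{n}+\tilde{k})(x_1 - \tilde{c}(\tilde{n}+\tilde{k}))}\, \phi\bigl( x_1 - \tilde{c}(\tilde{n}+\tilde{k}), \tilde{n}; \tilde{k} \bigr) \]
and so
\[ (\Psi V \Psi^{*}\phi)(x_1, \tilde{m}; \tilde{k}) = \frac{1}{2\pi}\, e^{ia(\tilde{m})}\, e^{i\tilde{b}(\tilde{m}+\tilde{k})x_1} \int_{S^1 \times S^1} e^{-i\tilde{m}\tilde{x}}\, (V\Psi^{*}\phi)\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}), \tilde{x}; \tilde{k} \bigr)\, d\tilde{x} \]
\[ = \sum_{\tilde{n} \in \mathbb{Z}^2} e^{i(a(\tilde{m}) - a(\tilde{n}))}\, e^{-i\tilde{b}(\tilde{n}+\tilde{k})\tilde{c}(\tilde{m}-\tilde{n})} \times e^{i\tilde{b}(\tilde{m}-\tilde{n})x_1}\, V_{\tilde{m}-\tilde{n}}\bigl( x_1 + \tilde{c}(\tilde{m}+\tilde{k}) \bigr)\, \phi\bigl( x_1 + \tilde{c}(\tilde{m}-\tilde{n}), \tilde{n}; \tilde{k} \bigr). \tag{4.4} \]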
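The final step in the reduction process makes use of the flux rationality condition (2.1) to essentially turn the variable $m_2$ into a parameter. Define
\[ \Theta : \int^{\oplus}_{[0,1)^2} L^2(\mathbb{R}) \otimes \ell^2(\mathbb{Z}^2)\, d\tilde{k} \to \int^{\oplus}_{[0,1)^3} \mathcal{H}_p\, dk \]
(with $\mathcal{H}_p$ as given in (2.10)) by
\[ \Theta\phi(x, m, j; k) = \frac{1}{(2\pi)^{1/2}} \sum_{m_2 \in \mathbb{Z}} e^{2\pi i m_2 k_1}\, \phi\bigl( x, (m_2 p + j, m); \tilde{k} \bigr) \tag{4.5} \]
for $x \in \mathbb{R}$, $(m,j) \in W_p$ and $k \in [0,1)^3$. Once again it is straightforward to check that $\Theta$ is a well-defined unitary operator. Furthermore, using (4.2) it is easy to verify that $\Theta\Psi H_0(\tilde{k})\Psi^{*}\Theta^{*} = H_0(k)$ (where $H_0(k)$ is as given by (2.11) and (2.13); n.b., the operator $\Psi H_0(\tilde{k})\Psi^{*}$ is independent of the first component of $\tilde{m}$).

We now consider the effect of $\Theta$ on $V$. If $n_2, m, n_2', m' \in \mathbb{Z}$ and $j, j' = 0, \dots, p-1$ we shall use the notation $\tilde{m}_j$ and $\tilde{m}_j'$ to mean
\[ \tilde{m}_j = (n_2 p + j, m) \quad\text{and}\quad \tilde{m}_j' = (n_2' p + j', m'). \]
Combining (4.4) and (4.5) we then get
\[ (\Theta\Psi V\Psi^{*}\phi)(x, m', j'; k) = \frac{1}{(2\pi)^{1/2}} \sum_{n_2', m \in \mathbb{Z}} \sum_{j=0}^{p-1} \sum_{n_2 \in \mathbb{Z}} e^{2\pi i n_2' k_1}\, e^{i(a(\tilde{m}_j') - a(\tilde{m}_j))}\, e^{-i\tilde{b}(\tilde{m}_j + \tilde{k})\tilde{c}(\tilde{m}_j' - \tilde{m}_j)} \]
\[ \times\, e^{i\tilde{b}(\tilde{m}_j' - \tilde{m}_j)x}\, V_{\tilde{m}_j' - \tilde{m}_j}\bigl( x + \tilde{c}(\tilde{m}_j' + \tilde{k}) \bigr)\, \phi\bigl( x + \tilde{c}(\tilde{m}_j' - \tilde{m}_j), \tilde{m}_j; \tilde{k} \bigr). \tag{4.6} \]
Now $V_{\tilde{m}_j' - \tilde{m}_j}(x)$ is $2\pi$ periodic in $x$ whilst the flux rationality condition (2.1) implies
\[ c_2 n_2 p = \frac{f_{1x} f_{2y}}{\beta}\, n_2 p = 2\pi q n_2 \in 2\pi\mathbb{Z}. \]
Hence
\[ V_{\tilde{m}_j' - \tilde{m}_j}\bigl( x + \tilde{c}(\tilde{m}_j' + \tilde{k}) \bigr) = V_{\tilde{m}_j' - \tilde{m}_j}\bigl( x + \tilde{c}((n_2' - n_2)p + j' + k_2,\ m' + k_3) \bigr). \]
On the other hand, our definition of $a$ (see (4.1)) gives
\[ a(\tilde{m}) - a(\tilde{n}) - \tilde{b}\tilde{n}\,\tilde{c}(\tilde{m} - \tilde{n}) = b_2 c_2 \Bigl( \tfrac{1}{2}(m_2 + n_2)(m_2 - n_2) - n_2(m_2 - n_2) \Bigr) + b_2 c_3 \bigl( m_2 m_3 - n_2 n_3 - n_2(m_3 - n_3) \bigr) - b_3 n_3 \tilde{c}(\tilde{m} - \tilde{n}) \]
\[ = \tfrac{1}{2} b_2 c_2 (m_2 - n_2)^2 + b_2 c_3 (m_2 - n_2) m_3 - b_3 n_3 \tilde{c}(\tilde{m} - \tilde{n}) \]
for any $\tilde{m}, \tilde{n} \in \mathbb{Z}^2$. It follows that
\[ a(\tilde{m}_j') - a(\tilde{m}_j) - \tilde{b}\tilde{m}_j\,\tilde{c}(\tilde{m}_j' - \tilde{m}_j) =: \sigma\bigl( (n_2' - n_2)p + j' - j,\ m',\ m \bigr) \]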
depends on $n_2$ and $n_2'$ only through the difference $n_2' - n_2$. Introducing the new summation variable $m_2 = n_2' - n_2$ we can now rewrite (4.6) as
\[ (\Theta\Psi V\Psi^{*}\phi)(x, m', j'; k) = \sum_{m_2, m \in \mathbb{Z}} \sum_{j=0}^{p-1} e^{2\pi i m_2 k_1}\, e^{-i\tilde{b}\tilde{k}\,\tilde{c}(m_2 p + j' - j,\ m' - m)}\, e^{i\sigma(m_2 p + j' - j,\ m',\ m)}\, e^{i\tilde{b}(m_2 p + j' - j,\ m' - m)x} \]
\[ \times\, V_{(m_2 p + j' - j,\ m' - m)}\bigl( x + \tilde{c}(m_2 p + j' + k_2,\ m' + k_3) \bigr) \times \frac{1}{(2\pi)^{1/2}} \sum_{n_2 \in \mathbb{Z}} e^{2\pi i n_2 k_1}\, \phi\bigl( x + \tilde{c}(m_2 p + j' - j,\ m' - m),\ (n_2 p + j, m);\ \tilde{k} \bigr). \]
Comparison with (4.5) shows that the last line in the previous expression is just
\[ (\Theta\phi)\bigl( x + \tilde{c}(m_2 p + j' - j,\ m' - m),\ m,\ j;\ k \bigr). \]
In particular the transformed potential $\Theta\Psi V\Psi^{*}\Theta^{*}$ can be seen to be a fibred operator (that is, $k$ appears only as a parameter). Using (4.3) and the unitary operator $\tilde{U}_m$ introduced in (2.9) we can now write
\[ (\Theta\Psi V\Psi^{*}\Theta^{*}\phi)(x, m', j'; k) = \sum_{\substack{(m,j) \in W_p \\ m_1, m_2 \in \mathbb{Z}}} e^{i\mu k}\, e^{i\nu}\, V_{(m_1,\ m_2 p + j' - j,\ m' - m)} \bigl( \tilde{U}_{(m_1,\ m_2 p + j' - j,\ m' - m)} \phi \bigr)(x, m, j; k) \]
where $\mu = \mu(m_1, m_2, m', m, j', j) \in \mathbb{R}^3$ and $\nu = \nu(m_1, m_2, m', m, j', j) \in \mathbb{R}$ are rather messy functions whose explicit form is fortunately not required. It is now easy to see that the operator $\Theta\Psi V\Psi^{*}\Theta^{*}$ is the same as the operator $V(k)$ given by (2.12) and (2.14).

5. Estimates for the Matrix Element $U_m^{n,n'}$

The objective of this section is to prove Proposition 2.1. The method involves exploiting an explicit formula for $|U_m^{n,n'}|$ that can be given in terms of special functions. Firstly we list an identity for Hermite polynomials (see [6, p. 844, 7.377]); for any $0 \le n \le n'$ and $y, z \in \mathbb{C}$ we have
\[ \int_{\mathbb{R}} e^{-x^2}\, \mathrm{He}_n(x+y)\, \mathrm{He}_{n'}(x+z)\, dx = 2^{n'} \sqrt{\pi}\, n!\, z^{n'-n}\, L_n^{(n'-n)}(-2yz), \tag{5.1} \]
where $L_n^{(n'-n)}$ is the generalised Laguerre polynomial.

Lemma 5.1. For any $0 \le n \le n'$ and $m \in \mathbb{Z}^3$ we have
\[ |U_m^{n,n'}| = \sqrt{\frac{n!}{n'!}}\, (\sqrt{2}\rho)^{n'-n}\, e^{-\rho^2}\, L_n^{(n'-n)}(2\rho^2) \]
where
\[ \rho = \frac{1}{2}\beta^{-1/2} \bigl( |m|_{\tilde\Gamma}^2 - (f_{3z} m_3)^2 \bigr)^{1/2}. \tag{5.2} \]
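Proof. Introduce the complex number
\[ \sigma = \frac{\sqrt{\alpha}\,\tilde{c}\tilde{m}}{2} - i\,\frac{m_1 + \tilde{b}\tilde{m}}{2\sqrt{\alpha}}. \]
From (2.9) and (2.15) we get
\[ U_m^{n,n'} = \langle \tilde{U}_m \phi_n, \phi_{n'} \rangle = \frac{\sqrt{\alpha}\, 2^{-(n+n')/2}}{\sqrt{n!\, n'!\, \pi}} \int_{\mathbb{R}} e^{i(m_1 + \tilde{b}\tilde{m})x}\, e^{-\alpha(x + \tilde{c}\tilde{m})^2/2}\, e^{-\alpha x^2/2}\, \mathrm{He}_n\bigl( \sqrt{\alpha}(x + \tilde{c}\tilde{m}) \bigr)\, \mathrm{He}_{n'}(\sqrt{\alpha}x)\, dx \]
\[ = \frac{2^{-(n+n')/2}}{\sqrt{n!\, n'!\, \pi}}\, e^{\sigma^2 - \alpha(\tilde{c}\tilde{m})^2/2} \int_{\mathbb{R}} e^{-x^2}\, \mathrm{He}_n(x - \sigma + \sqrt{\alpha}\,\tilde{c}\tilde{m})\, \mathrm{He}_{n'}(x - \sigma)\, dx \]
\[ = \sqrt{\frac{n!}{n'!}}\, 2^{(n'-n)/2}\, (-\sigma)^{n'-n}\, e^{\sigma^2 - \alpha(\tilde{c}\tilde{m})^2/2}\, L_n^{(n'-n)}\bigl( -2\sigma(\sigma - \sqrt{\alpha}\,\tilde{c}\tilde{m}) \bigr) \]
where the last line follows from (5.1). Now $\operatorname{Re}(\sigma) = \sqrt{\alpha}\,\tilde{c}\tilde{m}/2$ so
\[ \operatorname{Re}\bigl( \sigma^2 - \alpha(\tilde{c}\tilde{m})^2/2 \bigr) = -|\sigma|^2 \quad\text{and}\quad -2\sigma(\sigma - \sqrt{\alpha}\,\tilde{c}\tilde{m}) = -2\sigma(-\bar{\sigma}) = 2|\sigma|^2. \]
On the other hand, (2.7), (2.8) and (2.4) give us
\[ |\sigma|^2 = \frac{1}{4\beta} \bigl| (f_{2y}, f_{3y})\tilde{m} - i\bigl( f_{1x} m_1 + (f_{2x}, f_{3x})\tilde{m} \bigr) \bigr|^2 = \frac{1}{4\beta} \bigl( |m|_{\tilde\Gamma}^2 - (f_{3z} m_3)^2 \bigr) = \rho^2. \]
The result follows.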
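The closed form in Lemma 5.1 is easy to evaluate numerically. The following is a minimal Python sketch, not part of the original argument: the function names and the truncation `cutoff` are illustrative assumptions, the Laguerre polynomial is evaluated by its standard three-term recurrence, and the case $n' < n$ is handled through the symmetry $|U_m^{n,n'}| = |U_{-m}^{n',n}|$ (with $\rho$ unchanged under $m \mapsto -m$). Since $\tilde{U}_m$ is unitary, the squared moduli along a row should sum to 1, which gives a simple consistency check.

```python
import math


def laguerre(n, a, x):
    """Generalised Laguerre polynomial L_n^{(a)}(x) by the three-term recurrence
    (k+1) L_{k+1} = (2k+1+a-x) L_k - (k+a) L_{k-1}."""
    prev, cur = 0.0, 1.0  # seed value (unused for k = 0) and L_0
    for k in range(n):
        prev, cur = cur, ((2 * k + 1 + a - x) * cur - (k + a) * prev) / (k + 1)
    return cur


def abs_matrix_element(n, n_prime, rho):
    """|U^{n,n'}| from the formula of Lemma 5.1, symmetrised in (n, n')."""
    lo, hi = min(n, n_prime), max(n, n_prime)
    d = hi - lo
    # sqrt(lo!/hi!) computed through log-gamma to avoid overflow
    log_pref = 0.5 * (math.lgamma(lo + 1) - math.lgamma(hi + 1))
    x = 2.0 * rho * rho
    return (math.exp(log_pref - rho * rho)
            * (math.sqrt(2.0) * rho) ** d
            * abs(laguerre(lo, d, x)))


def row_norm_squared(n, rho, cutoff=200):
    """Truncated sum over n' of |U^{n,n'}|^2 (cutoff is an illustrative choice)."""
    return sum(abs_matrix_element(n, k, rho) ** 2 for k in range(cutoff))


if __name__ == "__main__":
    for n, rho in [(0, 1.3), (3, 0.7), (5, 0.2)]:
        print(n, rho, row_norm_squared(n, rho))
```

For moderate $n$ and $\rho$ the truncated row sums agree with 1 to rounding error, consistent with the basic estimate $|U_m^{n,n'}| \le 1$ used throughout this section.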
Throughout the remainder of this section we will assume $m \in \mathbb{Z}^3$ is fixed and $\rho \ge 0$ is given by (5.2). To develop the estimates appearing in Proposition 2.1 we need to consider several different cases for the relative values of $n$ and $n'$.

Lemma 5.2. If $1 < n'^{2/3} \le n' - n$ then
\[ |U_m^{n,n'}| \le \exp\Bigl( \rho^2 - \frac{1}{18}(n'+n+1)^{1/3} \Bigr) \quad\text{and}\quad |U_m^{n,n'}| \le \exp\Bigl( \rho^2 - \frac{1}{15}(n'-n)^{1/2} \Bigr). \]

Remark 5.1. For any $x > 0$ we have $\Gamma(x+1/2)^2 \le x\Gamma(x)^2$. Indeed, if $s, t \ge 0$ then $s^{1/2}t^{1/2} \le (s+t)/2$ so
\[ \bigl( \Gamma(x+1/2) \bigr)^2 = \int_{\mathbb{R}_+^2} (st)^{x-1/2} e^{-(s+t)}\, ds\, dt \le \frac{1}{2} \int_{\mathbb{R}_+^2} (st)^{x-1}(s+t) e^{-(s+t)}\, ds\, dt = \frac{1}{2} \bigl( \Gamma(x+1)\Gamma(x) + \Gamma(x)\Gamma(x+1) \bigr) = x\Gamma(x)^2. \]
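The inequality of Remark 5.1 can be sanity-checked numerically. The sketch below is only an illustration on an arbitrarily chosen grid of sample points (the grid and the helper name `gamma_gap` are assumptions, not part of the paper); it works with logarithms of the Gamma function to avoid overflow.

```python
import math


def gamma_gap(x):
    """log(x * Gamma(x)^2) - log(Gamma(x + 1/2)^2).

    Remark 5.1 asserts that this quantity is non-negative for every x > 0.
    """
    return math.log(x) + 2.0 * math.lgamma(x) - 2.0 * math.lgamma(x + 0.5)


# Illustrative grid: x = 0.01, 0.02, ..., 50.00
samples = [0.01 * k for k in range(1, 5001)]
worst = min(gamma_gap(x) for x in samples)

if __name__ == "__main__":
    print("smallest gap on the grid:", worst)
```

On this grid the gap stays strictly positive and shrinks roughly like $1/(4x)$ for large $x$, which matches the fact that the inequality becomes tight only asymptotically.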
Proof. Firstly we recall a special function identity (see [6, p. 741, 6.643.4]):
\[ e^{-2\rho^2} (\sqrt{2}\rho)^{n'-n} L_n^{(n'-n)}(2\rho^2) = \frac{1}{n!} \int_0^{\infty} e^{-t}\, t^{(n+n')/2}\, J_{n'-n}(2\rho\sqrt{2t})\, dt \]
where $J_{n'-n}$ is the Bessel function of order $n'-n$. Since $|J_{n'-n}(\cdot)| \le 1$ (see [1, 9.1.60] or (5.10) below) we may now use Lemma 5.1 to get
\[ |U_m^{n,n'}| \le \frac{e^{\rho^2}}{\sqrt{n!\, n'!}} \int_0^{\infty} e^{-t}\, t^{(n+n')/2}\, dt = e^{\rho^2}\, \frac{\Gamma\bigl( \frac{n+n'}{2} + 1 \bigr)}{\sqrt{n!\, n'!}}. \tag{5.3} \]
Choose $k \in \mathbb{N}_0$ so that $n'-n = 2k$ or $n'-n = 2k+1$ depending on the parity of $n'-n$. Since $n'-n > 1$ (by assumption) and $n'-n \in \mathbb{N}_0$ we must have $k \ge 1$. If $n'-n$ is even our hypothesis immediately gives $n^{2/3} < n'^{2/3} \le n'-n = 2k \le 3k$. On the other hand, if $n'-n$ is odd then $n'^{2/3} \le n'-n = 2k+1 \le 3k$ while
\[ n'^{2/3} - n^{2/3} \ge \frac{2}{3}\, n'^{-1/3}(n'-n) \ge \frac{2}{3}(n'-n)^{1/2} \ge 1 \]
since $n'-n = 2k+1 \ge 3$. For both parities of $n'-n$ we thus get
\[ n^{2/3} \le 2k, \qquad k \ge 1, \qquad n'^{2/3} \le 3k, \qquad n'-n \le 3k. \tag{5.4} \]

Claim 5.1. We have
\[ \frac{\Gamma\bigl( \frac{n+n'}{2} + 1 \bigr)^2}{n!\, n'!} \le \prod_{j=1}^{k} \frac{n+j}{n+k+j}. \tag{5.5} \]
From Remark 5.1 we get
\[ \Gamma\Bigl( n+k+\frac{3}{2} \Bigr)^2 \le (n+k+1)\Gamma(n+k+1)^2 \le (n+2k+1)\Gamma(n+k+1)^2 = \frac{(n+2k+1)!}{(n+2k)!}\, \Gamma(n+k+1)^2. \]
It follows that
\[ \frac{\Gamma\bigl( n+k+\frac{3}{2} \bigr)^2}{(n+2k+1)!\, n!} \le \frac{\Gamma(n+k+1)^2}{(n+2k)!\, n!} = \frac{(n+k)!}{(n+2k)!} \cdot \frac{(n+k)!}{n!}. \]
The first two terms in this equation are simply the left-hand side of (5.5) when $n'-n$ is odd and even respectively. On the other hand, the third term is easily seen to be the right-hand side of (5.5). The claim follows.

From (5.4) we have $n \le (2k)^{3/2}$ so
\[ \frac{k}{n+j} \ge \frac{k}{(2k)^{3/2} + k} \ge \frac{1}{(2\sqrt{2}+1)k^{1/2}} \]
for $j = 1, \dots, k$. Since $k^{1/2} \ge 1$ we also have $(1 + x/k^{1/2})^{k^{1/2}} \ge 1 + x$ for any $x \ge 0$. Combining these new inequalities we now get
\[ \prod_{j=1}^{k} \frac{n+k+j}{n+j} \ge \Bigl( 1 + \frac{1}{(2\sqrt{2}+1)k^{1/2}} \Bigr)^{k} \ge e^{\mu k^{1/2}} \tag{5.6} \]
where $\mu = \ln\bigl( 1 + 1/(2\sqrt{2}+1) \bigr)$. However the inequalities (5.4) imply
\[ \mu k^{1/2} \ge \mu \bigl( 3^{3/2} + 2^{3/2} + 1 \bigr)^{-1/3} (n'+n+1)^{1/3} \ge \frac{1}{9}(n'+n+1)^{1/3} \]
and
\[ \mu k^{1/2} \ge \mu\, 3^{-1/2} (n'-n)^{1/2} \ge \frac{2}{15}(n'-n)^{1/2}. \]
Combining these estimates with (5.3), (5.5) and (5.6) completes the result.
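The last two inequalities in the proof of Lemma 5.2 rest on explicit numerical constants, and the margins are narrow. The short Python sketch below simply evaluates them; the variable names are illustrative and nothing beyond the stated constants is assumed.

```python
import math

# mu = ln(1 + 1/(2*sqrt(2) + 1)), the constant appearing in (5.6)
mu = math.log(1.0 + 1.0 / (2.0 * math.sqrt(2.0) + 1.0))

# Coefficient of (n' + n + 1)^{1/3}; the proof needs this to be at least 1/9
c_sum = mu * (3.0 ** 1.5 + 2.0 ** 1.5 + 1.0) ** (-1.0 / 3.0)

# Coefficient of (n' - n)^{1/2}; the proof needs this to be at least 2/15
c_diff = mu / math.sqrt(3.0)

if __name__ == "__main__":
    print("mu     =", mu)
    print("c_sum  =", c_sum, " vs 1/9  =", 1.0 / 9.0)
    print("c_diff =", c_diff, " vs 2/15 =", 2.0 / 15.0)
```

Both coefficients exceed their targets by well under one percent, which explains the specific exponents $1/18$ and $1/15$ in the statement of Lemma 5.2 (the extra factor $1/2$ comes from taking the square root in (5.3) and (5.5)).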
Laguerre polynomials can be expressed in terms of the confluent hypergeometric function; using [1, 22.5.54] we get
\[ L_n^{(n'-n)}(2\rho^2) = \binom{n'}{n} M(-n,\ n'-n+1,\ 2\rho^2). \]
The confluent hypergeometric function can, in turn, be written as a pointwise absolutely convergent series of Bessel functions; from [1, 13.3.7] we get
\[ M(-n,\ n'-n+1,\ 2\rho^2) = (n'-n)!\, e^{\rho^2} \bigl( \rho^2(n'+n+1) \bigr)^{-(n'-n)/2} \times \sum_{j=0}^{\infty} A_j \Bigl( \frac{\rho}{(n'+n+1)^{1/2}} \Bigr)^{j} J_{n'-n+j}\bigl( 2\rho\sqrt{n'+n+1} \bigr), \]
where
\[ A_0 = 1, \qquad A_1 = 0, \qquad A_2 = \frac{1}{2}(n'-n+1) \tag{5.7} \]
and, for $j \ge 2$,
\[ (j+1)A_{j+1} = (j+n'-n)A_{j-1} - (n'+n+1)A_{j-2}. \tag{5.8} \]
It follows from Lemma 5.1 that
\[ |U_m^{n,n'}| \le \sqrt{F_{n',n}} \sum_{j=0}^{\infty} |A_j|\, \frac{\rho^{j}}{(n'+n+1)^{j/2}}\, \bigl| J_{n'-n+j}\bigl( 2\rho\sqrt{n'+n+1} \bigr) \bigr|, \tag{5.9} \]
where
\[ F_{n',n} := \frac{n'!}{n!} \Bigl( \frac{2}{n'+n+1} \Bigr)^{n'-n}. \]
The next two results give estimates for the constants appearing in (5.9).

Lemma 5.3. Suppose $n' \ge 2$ and $0 \le n'-n \le n'^{2/3}$. Then $|A_j| \le (n'+n+1)^{j/3}$.
Proof. Set $k = n'-n$ and $l = n'+n+1$ so
\[ 0 \le k \le n'^{2/3} \le (n'+n+1)^{2/3} = l^{2/3} \]
while $n' \ge 2$ and $n \ge 0$ so $l \ge 3$. We have $A_0 = 1 = l^0$, $A_1 = 0 \le l^{1/3}$ and $k, 1 \le l^{2/3}$ so $A_2 = \frac{1}{2}(k+1) \le l^{2/3}$. Now let $J \ge 2$ and suppose the result holds for $j \le J$. Since
\[ A_{J+1} = \frac{J+k}{J+1} A_{J-1} - \frac{l}{J+1} A_{J-2} \]
we then get
\[ |A_{J+1}| \le \frac{J+k}{J+1}\, l^{(J-1)/3} + \frac{l}{J+1}\, l^{(J-2)/3} = l^{(J+1)/3}\, \frac{(J+k)l^{-2/3} + 1}{J+1}. \]
Now $kl^{-2/3} \le 1$ while
\[ l \ge 3 \ \Rightarrow\ l^{-2/3} \le 3^{-2/3} \le \tfrac{1}{2} \ \Rightarrow\ J(1 - l^{-2/3}) \ge 1 \ \ (\text{as } J \ge 2) \ \Rightarrow\ 1 + Jl^{-2/3} \le J. \]
Thus $(J+k)l^{-2/3} + 1 \le J+1$. Therefore $|A_{J+1}| \le l^{(J+1)/3}$ and the result follows by induction.

Lemma 5.4. If $0 \le n \le n'$ then $F_{n',n} \le 1$.

Proof. We have
\[ F_{n',n} = \frac{n'(n'-1)\cdots(n+1)}{\frac{1}{2}(n'+n+1)\cdots\frac{1}{2}(n'+n+1)}, \]
where the numerator and denominator both contain $n'-n$ terms. Now set $k = \frac{1}{2}(n'-n-1)$ and $l = \frac{1}{2}(n'+n+1)$ so $k \le l$ while
\[ F_{n',n} = \frac{(l+k)}{l}\, \frac{(l+k-1)}{l} \cdots \frac{(l-k+1)}{l}\, \frac{(l-k)}{l}. \]
If $n'-n$ is odd this can be rearranged as
\[ F_{n',n} = \frac{(l+k)(l-k)}{l^2}\, \frac{(l+k-1)(l-k+1)}{l^2} \cdots \frac{l}{l}, \]
while if $n'-n$ is even we get
\[ F_{n',n} = \frac{(l+k)(l-k)}{l^2}\, \frac{(l+k-1)(l-k+1)}{l^2} \cdots \frac{(l+\frac{1}{2})(l-\frac{1}{2})}{l^2}. \]
The result now follows from the fact that
\[ \frac{(l+k')(l-k')}{l^2} = \frac{l^2 - k'^2}{l^2} \le 1 \]
for any $0 \le k' \le l$.
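Both Lemma 5.3 and Lemma 5.4 concern explicitly computable quantities, so they can be checked on small cases with exact rational arithmetic. The following Python sketch is illustrative only: the function names and the ranges tested are assumptions, the coefficients $A_j$ are generated from the initial values and the three-term recursion stated before (5.9), and the bound $|A_j| \le l^{j/3}$ is tested in the equivalent integer form $|A_j|^3 \le l^{j}$.

```python
from fractions import Fraction
import math


def series_coefficients(n, n_prime, count):
    """First `count` coefficients A_0, A_1, ... for given n <= n'.

    Uses A_0 = 1, A_1 = 0, A_2 = (n' - n + 1)/2 and
    (j + 1) A_{j+1} = (j + n' - n) A_{j-1} - (n' + n + 1) A_{j-2} for j >= 2.
    """
    k, l = n_prime - n, n_prime + n + 1
    a = [Fraction(1), Fraction(0), Fraction(k + 1, 2)]
    for j in range(2, count - 1):
        a.append((Fraction(j + k) * a[j - 1] - l * a[j - 2]) / (j + 1))
    return a[:count]


def prefactor(n, n_prime):
    """F_{n',n} = (n'!/n!) * (2/(n' + n + 1))^(n' - n) as an exact fraction."""
    ratio = Fraction(math.factorial(n_prime), math.factorial(n))
    return ratio * Fraction(2, n_prime + n + 1) ** (n_prime - n)


if __name__ == "__main__":
    n, n_prime = 6, 8
    l = n_prime + n + 1
    for j, a_j in enumerate(series_coefficients(n, n_prime, 8)):
        print(j, a_j, float(abs(a_j)), l ** (j / 3))
    print("F =", float(prefactor(n, n_prime)))
```

Using `Fraction` keeps the recursion free of rounding error, so a failed comparison would indicate a genuine counterexample rather than floating-point noise.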
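We also need some estimates for the Bessel function terms appearing in (5.9). Unfortunately the required estimates (given in Lemma 5.6) have not been found in an accessible form in a standard reference; we derive them here using elementary arguments.

Lemma 5.5. For any $x, \varepsilon > 0$ and $n \in \mathbb{R}$
\[ \bigl| \{ \theta \in [0,\pi] \mid |x\cos(\theta) - n| < \varepsilon \} \bigr| \le \pi \Bigl( \frac{\varepsilon}{x} \Bigr)^{1/2}. \]

Proof. Set $\delta = \varepsilon/x$, $y = n/x$ and
\[ \Omega_{y,\delta} = \operatorname{Cos}^{-1}([y-\delta, y+\delta]). \]
We must show $|\Omega_{y,\delta}| \le \pi\delta^{1/2}$ which is obviously true when $\delta \ge 1$. Now, for fixed $\delta < 1$, $|\Omega_{y,\delta}|$ is maximised with $y = 1-\delta$ giving $|\Omega_{y,\delta}| \le \operatorname{Cos}^{-1}(1-2\delta)$. With $z = \pi\delta^{1/2}$ we have $0 \le z < \pi$ so $\cos(z) \le 1 - 2z^2/\pi^2$. Since $\operatorname{Cos}^{-1}$ is decreasing it follows that
\[ \operatorname{Cos}^{-1}(1-2\delta) = \operatorname{Cos}^{-1}\Bigl( 1 - \frac{2z^2}{\pi^2} \Bigr) \le z = \pi\delta^{1/2}, \]
completing the result.

Lemma 5.6. (i) For any $n \in \mathbb{N}_0$ and $x > 0$ we have $|J_n(x)| \le 3x^{-1/3}$. (ii) For any $n \in \mathbb{N}_0$ and $0 < x \le n/2$ we have $|J_n(x)| \le 2n^{-1}$.

Proof. Define a function by $f(\theta) = x\sin(\theta) - n\theta$ so we have the following integral representation for the Bessel function $J_n$ (see [1, 9.1.21]):
\[ J_n(x) = \frac{1}{\pi} \int_0^{\pi} \cos(f(\theta))\, d\theta. \tag{5.10} \]
Part (i). Set
\[ \Omega_0 = \{ \theta \in [0,\pi] \mid |f'(\theta)| < x^{1/3} \} \quad\text{and}\quad \Omega_1 = [0,\pi] \setminus \Omega_0 \]
so $J_n(x) = (I_0 + I_1)/\pi$ where $I_k = \int_{\Omega_k} \cos(f(\theta))\, d\theta$ for $k = 0, 1$. Lemma 5.5 gives
\[ |I_0| \le |\Omega_0| \le \pi x^{-1/3}. \]
On the other hand
\[ I_1 = \left[ \frac{\sin(f(\theta))}{f'(\theta)} \right]_{\partial\Omega_1} + \int_{\Omega_1} \frac{f''(\theta)}{(f'(\theta))^2} \sin(f(\theta))\, d\theta. \]
Now $f''(\theta) = -x\sin(\theta) \le 0$ on $[0,\pi]$ while $(f'(\theta))^2 > 0$ on $\Omega_1$. Thus
\[ \left| \int_{\Omega_1} \frac{f''(\theta)}{(f'(\theta))^2} \sin(f(\theta))\, d\theta \right| \le -\int_{\Omega_1} \frac{f''(\theta)}{(f'(\theta))^2}\, d\theta = \left[ \frac{1}{f'(\theta)} \right]_{\partial\Omega_1}. \tag{5.11} \]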
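Furthermore $f'(\theta)$ is decreasing on $[0,\pi]$ so $\Omega_0$ (is empty or) consists of a single interval. Hence $\partial\Omega_1 \setminus \{0,\pi\}$ contains at most 2 points. Since $f(0) = 0$ and $f(\pi) = -n\pi$ we then get
\[ |I_1| \le \left| \left[ \frac{\sin(f(\theta))}{f'(\theta)} \right]_{\partial\Omega_1} \right| + \left| \left[ \frac{1}{f'(\theta)} \right]_{\partial\Omega_1} \right| \le 6 \max_{\theta \in \Omega_1} \frac{1}{|f'(\theta)|} \le 6x^{-1/3}. \tag{5.12} \]
Combining (5.11) and (5.12) we now get
\[ |J_n(x)| \le \frac{1}{\pi}|I_0| + \frac{1}{\pi}|I_1| \le \frac{1}{\pi}(\pi + 6)x^{-1/3} \le 3x^{-1/3}, \]
completing the first part.

Part (ii). Since $x \le n/2$ we have $|f'(\theta)| = |n - x\cos(\theta)| \ge n/2$. Combined with the facts that $f(0) = 0$, $f(\pi) = -n\pi$ and $|f''(\theta)| \le x \le n/2$ we now get
\[ |J_n(x)| = \frac{1}{\pi} \left| \left[ \frac{\sin(f(\theta))}{f'(\theta)} \right]_0^{\pi} + \int_0^{\pi} \frac{f''(\theta)}{(f'(\theta))^2} \sin(f(\theta))\, d\theta \right| \le \frac{1}{\pi} \Bigl( 0 + \frac{2\pi}{n} \Bigr) = \frac{2}{n}, \]
completing the result.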
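The two bounds of Lemma 5.6 can be explored numerically straight from the integral representation (5.10). The sketch below is an illustration under stated assumptions: it evaluates (5.10) with the composite trapezoidal rule (a choice made here because the integrand extends to a smooth even $2\pi$-periodic function, for which that rule converges very quickly), and the helper names, step count and test grid are arbitrary choices rather than anything taken from the paper.

```python
import math


def bessel_j(n, x, steps=512):
    """Approximate J_n(x) = (1/pi) * int_0^pi cos(x sin(t) - n t) dt
    by the composite trapezoidal rule with `steps` subintervals."""
    h = math.pi / steps
    # Endpoint contributions at t = 0 and t = pi, each with weight 1/2
    total = 0.5 * (math.cos(0.0) + math.cos(x * math.sin(math.pi) - n * math.pi))
    for i in range(1, steps):
        t = i * h
        total += math.cos(x * math.sin(t) - n * t)
    return total * h / math.pi


def check_bounds(max_order=10, max_x=20.0, dx=0.25):
    """Check |J_n(x)| <= 3 x^(-1/3) everywhere on the grid, and
    |J_n(x)| <= 2/n on the part of the grid where 0 < x <= n/2."""
    points = int(max_x / dx)
    for n in range(max_order + 1):
        for k in range(1, points + 1):
            x = k * dx
            value = abs(bessel_j(n, x))
            if value > 3.0 * x ** (-1.0 / 3.0):
                return False
            if n >= 1 and x <= n / 2.0 and value > 2.0 / n:
                return False
    return True


if __name__ == "__main__":
    print("J_0(1) ~", bessel_j(0, 1.0))
    print("J_1(1) ~", bessel_j(1, 1.0))
    print("bounds hold on the grid:", check_bounds())
```

On this grid the first bound holds with a wide margin, so the constant 3 in part (i) appears to be far from optimal; what matters for the later estimates is only the decay rate $x^{-1/3}$.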
Lemma 5.7. Suppose $n' \ge 2$, $0 \le n'-n \le n'^{2/3}$ and $\rho \le \frac{1}{4}(n'+n+1)^{1/6}$. Then
\[ |U_m^{n,n'}| \le 4(2\rho)^{-1/3}(n'+n+1)^{-1/6} \quad\text{and}\quad |U_m^{n,n'}| \le 6(n'-n+1)^{-1/3}. \]

Proof. Combining (5.9) with Lemmas 5.3, 5.4 and 5.6(i) we obtain the estimate
\[ |U_m^{n,n'}| \le \sqrt{F_{n',n}} \sum_{j=0}^{\infty} |A_j|\, \frac{\rho^{j}}{(n'+n+1)^{j/2}}\, \bigl| J_{n'-n+j}\bigl( 2\rho\sqrt{n'+n+1} \bigr) \bigr| \le 3(2\rho)^{-1/3}(n'+n+1)^{-1/6} \sum_{j=0}^{\infty} \rho^{j}(n'+n+1)^{-j/6} \le 4(2\rho)^{-1/3}(n'+n+1)^{-1/6}, \]
where the last line follows from the fact that $\rho(n'+n+1)^{-1/6} \le 1/4$. We thus have the first part of the result. Furthermore, under the additional assumption that $n'-n \le 4\rho(n'+n+1)^{1/2}$ we obtain
\[ (2\rho)^{-1/3} \le 2^{1/3}(n'-n)^{-1/3}(n'+n+1)^{1/6} \ \Rightarrow\ |U_m^{n,n'}| \le 2^{7/3}(n'-n)^{-1/3}. \]
However $|U_m^{n,n'}| \le 1$ (see (2.19)) so
\[ (n'-n+1)\,|U_m^{n,n'}|^3 \le (2^{7/3})^{3} + 1 \ \Rightarrow\ |U_m^{n,n'}| \le 6(n'-n+1)^{-1/3}. \]
On the other hand if $n'-n > 4\rho(n'+n+1)^{1/2}$ Lemma 5.6(ii) implies
\[ \bigl| J_{n'-n+j}\bigl( 2\rho\sqrt{n'+n+1} \bigr) \bigr| \le 2(n'-n+j)^{-1} \le 2(n'-n)^{-1/3} \]
for all $j \in \mathbb{N}_0$. Combining this with (5.9) and Lemmas 5.3 and 5.4 in an argument similar to that used above, we obtain the estimate $|U_m^{n,n'}| \le 8(n'-n)^{-1/3}/3$. Since we also have $|U_m^{n,n'}| \le 1$ it follows that
\[ |U_m^{n,n'}| \le \bigl( 1 + (8/3)^3 \bigr)^{1/3} (n'-n+1)^{-1/3} \le 3(n'-n+1)^{-1/3}, \]
completing the second part of the result.

Proof of Proposition 2.1. The right-hand sides of estimates (2.21) and (2.22) are symmetric in $n, n'$ while $|U_m^{n,n'}| = |U_{-m}^{n',n}|$ (see (2.19)). It follows that we may assume $n' \ge n$. Furthermore $|U_m^{n,n'}| \le 1$ (see (2.19)) which allows (2.21) and (2.22) to be verified directly when $n' = 0, 1$. We now assume $n' \ge 2$. Before splitting the remainder of the proof into two cases we firstly note that (5.2) and (2.5) imply
\[ \rho \le \frac{1}{2}\beta^{-1/2}|m|_{\tilde\Gamma} \le \frac{1}{2}\beta^{-1/2}K_{\tilde\Gamma}|m| \le \frac{1}{6}(1+n'+n)^{1/6}. \tag{5.13} \]

Case (i): $n'-n \le n'^{2/3}$. The second estimate in Lemma 5.7 immediately gives (2.21). On the other hand, if $m_3 = 0$ and $m \ne 0$ then
\[ 2\beta^{1/2}\rho = |m|_{\tilde\Gamma} \ge K_{\tilde\Gamma}^{-1}|m| \ge K_{\tilde\Gamma}^{-1} \ \Rightarrow\ (2\rho)^{-1/3} \le \beta^{1/6}K_{\tilde\Gamma}^{1/3} \]
so (2.22) follows from the first estimate in Lemma 5.7.

Case (ii): $n'-n \ge n'^{2/3}$. Thus $n' \ge n+1$ which leads to
\[ (n'-n)^{1/2} \ge 2^{-1/3}(n'+n+1)^{1/3} \ge 2^{-1/3}\, 36\rho^2 \]
with the help of (5.13). It follows that $\rho^2 - \frac{1}{15}(n'-n)^{1/2} \le -\frac{1}{32}(n'-n)^{1/2}$ so the second estimate from Lemma 5.2 gives
\[ |U_m^{n,n'}| \le \exp\Bigl( -\frac{1}{32}(n'-n)^{1/2} \Bigr) \le \frac{32}{e}(n'-n)^{-1/2} \]
(n.b., $e^{-x} \le 1/(ex)$ for any $x > 0$). Now $|U_m^{n,n'}| \le 1$ so
\[ (1+n'-n)\,|U_m^{n,n'}|^3 \le 1 + \Bigl( \frac{32}{e} \Bigr)^{2} \ \Rightarrow\ |U_m^{n,n'}| \le 6(1+n'-n)^{-1/3}, \]
which is just (2.21).

From (5.13) we get $\rho^2 - \frac{1}{18}(1+n'+n)^{1/3} \le -\frac{1}{36}(1+n'+n)^{1/3}$ so the first estimate from Lemma 5.2 and the fact that $|U_m^{n,n'}| \le 1$ give
\[ |U_m^{n,n'}|^2 \le |U_m^{n,n'}| \le \exp\Bigl( -\frac{1}{36}(1+n'+n)^{1/3} \Bigr) \le \frac{36}{e}(1+n'+n)^{-1/3}. \]
Estimate (2.22) follows.

Acknowledgments

The author wishes to thank A. V. Sobolev for initial discussions on this problem and A. B. Pushnitski for several useful ideas relating to the use of special functions in Sec. 5.
References

[1] M. Abramowitz and I. A. Stegun (eds.), Handbook of Mathematical Functions (Dover Publications, New York, 1972).
[2] J. Avron and R. Seiler, On the quantum Hall effect, J. Geom. Phys. 1(3) (1984) 13–23.
[3] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators with Applications to Quantum Mechanics and Global Geometry (Springer-Verlag, Berlin–New York, 1987).
[4] B. E. J. Dahlberg and E. Trubowitz, A remark on two dimensional periodic potentials, Comment. Math. Helvetici 57 (1982) 130–134.
[5] V. A. Geĭler, V. A. Margulis and I. I. Chuchaev, Spectrum structure for the three-dimensional periodic Landau operator, Algebra i Analiz 8(3) (1996) 104–124; English transl., St. Petersburg Math. J. 8(3) (1997) 447–461.
[6] I. S. Gradshteyn and I. M. Ryzhik, A. Jeffrey (ed.), Tables of Integrals, Series and Products, 5th edn. (Academic Press, London, 1994).
[7] B. Helffer and A. Mohamed, Asymptotics of the density of states for the Schrödinger operator with periodic electric potential, Duke Math. J. 92 (1998) 1–60.
[8] Yu. E. Karpeshina, Analytic perturbation theory for a periodic potential, Izv. Akad. Nauk SSSR Ser. Mat. 53(1) (1989) 45–65; English transl., Math. USSR-Izv. 34(1) (1990) 43–64.
[9] Yu. E. Karpeshina, Perturbation Theory for the Schrödinger Operator with a Periodic Potential, Lecture Notes in Math., 1663 (Springer, Berlin, 1997).
[10] Yu. E. Karpeshina, On the density of states for the periodic Schrödinger operator, Ark. Math. 38 (2000) 111–137.
[11] P. Kuchment, Floquet Theory for Partial Differential Equations, Oper. Theory Adv. Appl., 60 (Birkhäuser, Basel, 1993).
[12] A. Mohamed, Asymptotic of the density of states for the Schrödinger operator with periodic electromagnetic potential, J. Math. Phys. 38(8) (1997) 4023–4051.
[13] L. Parnovski and A. V. Sobolev, On the Bethe–Sommerfeld conjecture for the polyharmonic operator, Duke Math. J. 107 (2001) 209–238.
[14] L. Parnovski and A. V. Sobolev, Lattice points, perturbation theory and the periodic polyharmonic operator, Ann. Henri Poincaré 2 (2001) 573–581.
[15] V. N. Popov and M. Skriganov, A remark on the spectral structure of the two dimensional Schrödinger operator with a periodic potential, Zap. Nauchn. Sem. LOMI AN SSSR 109 (1981) 131–133; 181, 183–184 (Russian).
[16] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV: Analysis of Operators (Academic Press, New York, 1975).
[17] Z. Shen, On the Bethe–Sommerfeld conjecture for higher-order elliptic operators, Math. Ann. 326 (2003) 19–41.
[18] M. Skriganov, Finiteness of the number of gaps in the spectrum of the multidimensional polyharmonic operator with a periodic potential, Mat. Sb. (N.S.) 113 (155) (1980) 133–145, 176; English transl., Math. USSR-Sb. 41 (1982) 115–125.
[19] M. Skriganov, Geometrical and arithmetical methods in the spectral theory of multidimensional periodic operators, Trudy Mat. Inst. Steklov. 171(2) (1985) 3–122; English transl., Proc. Steklov Math. Inst. 171 (Amer. Math. Soc., Providence, 1987).
[20] M. Skriganov, The spectrum band structure of the three-dimensional Schrödinger operator with periodic potential, Inv. Math. 80 (1985) 107–121.
[21] J. Zak, Magnetic translation group, Phys. Rev. (2) 134(6) (1964), A1602–A1611.
[22] J. Zak, The kq-representation in the dynamics of electrons in solids, Solid State Phys. 27 (1972) 1–62.
January 18, 2005 10:2 WSPC/148-RMP
00226
Reviews in Mathematical Physics Vol. 16, No. 10 (2004) 1291–1348 © World Scientific Publishing Company
CAUSAL PERTURBATION THEORY IN TERMS OF RETARDED PRODUCTS, AND A PROOF OF THE ACTION WARD IDENTITY
MICHAEL DÜTSCH∗
Institut für Theoretische Physik, Universität Zürich, CH-8057 Zürich, Switzerland [email protected]
KLAUS FREDENHAGEN
II. Institut für Theoretische Physik, Universität Hamburg, D-22761 Hamburg, Germany [email protected]
Received 6 August 2004 Revised 24 November 2004 In the framework of perturbative algebraic quantum field theory a local construction of interacting fields in terms of retarded products is performed, based on earlier work of Steinmann [42]. In our formalism the entries of the retarded products are local functionals of the off-shell classical fields, and we prove that the interacting fields depend only on the action and not on terms in the Lagrangian which are total derivatives, thus providing a proof of Stora’s “Action Ward Identity” [45]. The theory depends on free parameters which flow under the renormalization group. This flow can be derived in our local framework independently of the infrared behavior, as was first established by Hollands and Wald [32]. We explicitly compute non-trivial examples for the renormalization of the interaction and the field. Keywords: Axiomatic approach; renormalization; renormalization group evolution of parameters; perturbation theory.
∗ Work supported by the Deutsche Forschungsgemeinschaft.

1. Introduction

Among the various, essentially equivalent formulations of quantum field theory (QFT) the algebraic formulation [28, 27] seems to be the most appealing one from the conceptual point of view, but on the other side the least accessible one from the computational point of view. In perturbation theory, which is still the most successful method for making contact between theory and experiment, the approach towards a construction of interacting fields in terms of operators on a Hilbert space was popular in the early years, see e.g. [34]. But later it was largely abandoned in favor of a direct determination of Green functions. This is partially due to some
complications which arise in the perturbative expansion of Wightman functions compared to the time-ordered functions. Moreover, if the ultimate goal of QFT is the computation of the S-matrix, the approach via time-ordered functions is more direct, in view of the LSZ-formulas, and the approach via Wightman functions seems to be a detour, which is conceptually nice but unimportant for practitioners. There are, however, several reasons for a revision of this prevailing attitude. One is the desire to understand QFT on generic curved backgrounds. There, no general asymptotic condition a ` la LSZ exists; moreover, even the concepts of vacuum and particles lose their distinguished meaning, and one is forced to base the theory on the algebra of quantum fields (see e.g. [7] and references cited therein). Another reason is the connection to the classical limit. The algebra of perturbative quantum fields may be understood in terms of deformation quantization of the underlying Poisson algebra of free classical fields [13, 14], and one may hope that deformation quantization applies also to non-perturbative fields. But even in the traditional QFT on Minkowski space the algebraic formulation has great advantages because it completely separates the UV-problem from the IR-problem and, in principle, allows a consistent treatment of situations (like in QED [12]) where an S-matrix, strictly speaking, does not exist. Actually, the perturbative construction of the algebras of quantum fields is possible, e.g. using the Bogoliubov–Epstein–Glaser approach of causal perturbation theory [3, 18]. There, the system of time-ordered products of Wick polynomials of free quantum fields is recursively constructed, and interacting fields are given in terms of Bogoliubov’s formula [3] R R −1 δ def 1 T ei gL T ei gL+hA |h=0 , (1.1) AR gL (x) = i δh(x)
where g and h are test functions and A is a polynomial in the basic fields and their partial derivatives. The Taylor series expansion of the interacting field with respect to the interaction defines the retarded products

$$R_{n,1}\bigl(L(x_1),\dots,L(x_n),A(x)\bigr) = \frac{\delta^n}{\delta g(x_1)\cdots\delta g(x_n)}\,A_{\int gL}(x), \tag{1.2}$$

which can, by Bogoliubov's formula, be expressed in terms of time-ordered and anti time-ordered products. It is desirable, however, to have a direct construction of the retarded products, without the detour via the time-ordered products. In particular, the approach to the classical limit simplifies enormously, since the retarded products are power series in ħ, whereas the time-ordered products are Laurent series (see the simplifications in [14] compared to [13]).

Such a construction was performed by Steinmann [42] (see also his recent book on QED [43]). We review his construction with some modifications. The most important one is the restriction to localized interactions (the support of the coupling function g in (1.1) is bounded) as it is characteristic for causal perturbation theory in the sense of Bogoliubov [3], Epstein and Glaser [18]. Therefore we do not have
to discuss the asymptotic structure. Instead we use the algebraic adiabatic limit introduced in [7]. This limit relies on the observation that the algebraic structure of observables localized in a certain region does not depend on the behavior of the interaction outside of the region. This allows a construction of interacting fields but not of the S-matrix (Sec. 5). In this way we obtain a unified theory of massive and massless theories as well as of theories on curved spacetime (the latter are not discussed in the present paper).

We also allow interaction terms with derivatives. Traditionally, in causal perturbation theory one uses, as arguments of time-ordered and retarded products, on-shell fields, i.e. fields which are subject to the free field equations^a [18, 40, 16]. The inclusion of derivative couplings then leads to complications, and a consistent treatment requires a somewhat involved formalism [16]. Moreover, a change of the splitting between free and interaction terms in the Lagrangian requires a major effort. We therefore prefer to use off-shell fields, thereby following a suggestion of Stora [45]. A natural question is then whether in this framework derivatives commute with time-ordered and retarded products. This amounts to the problem whether the interacting fields depend only on the action S and not on how it is written as an integral $S=\int dx\,g(x)L(x)$, i.e. whether total derivatives in gL can be ignored. The corresponding Ward identity was termed Action Ward Identity by Stora [45]. We will show in this paper that the Action Ward Identity can indeed be fulfilled.

We even go a step further: the values of our retarded products are also off-shell fields, in contrast with the literature. For the physical predictions this makes no difference, and we gain technical simplifications.

The dependence of the theory on the renormalization conditions is usually analyzed in the adiabatic limit. In the algebraic adiabatic limit we can perform this analysis completely locally, thus avoiding all infrared problems (Sec. 5). Actually, such an analysis was already given by Hollands and Wald [32] for theories on curved spacetimes where the traditional adiabatic limit makes no sense, in general. In case of the scaling transformations we illustrate the formalism by computing (to lowest non-trivial order) examples for the renormalization of the interaction and the field.

Hollands and Wald [31] also introduce a new concept of scaling transformation which applies to fields on generic curved spacetimes. This concept already entails important consequences for massive theories on Minkowski space. We therefore adopt this point of view: among the axioms of causal perturbation theory we require smooth mass dependence^b for m ≥ 0 and almost homogeneous scaling. These conditions ensure that renormalization depends only on the short distance behavior of the theory, in agreement with the principle of locality.
a An exception is Sec. 4 of the Epstein–Glaser paper [18]: to interpret it consistently the arguments of time-ordered products must be off-shell fields.
b In even dimensions this cannot be satisfied if the ∗-product is defined with respect to the usual two-point function of the free field; a modification of the ∗-product is necessary, see (2.13).
A main question in perturbative QFT is whether symmetry with respect to a certain group G can be maintained in the process of renormalization. In Appendix C we prove that this is possible if all finite dimensional representations of G are completely reducible. In the case of compact groups and for Lorentz invariance we complete this existence result by giving a construction of a symmetric renormalization.

2. Axioms for Retarded Products

We consider for notational simplicity the theory of a real scalar field on d-dimensional Minkowski space $\mathbb{M}$, d > 2. The classical configuration space $\mathcal{C}$ is the space $C^\infty(\mathbb{M},\mathbb{R})$ and the field ϕ is the evaluation functional on this space: $(\partial^a\varphi)(x)(h)=\partial^a h(x)$, $a\in\mathbb{N}_0^d$. Let $\mathcal{F}$ be the set of all functionals F on $\mathcal{C}$ with values in the formal power series in ħ which have the form

$$F(\varphi)=\sum_{n=0}^{N}\int dx_1\cdots dx_n\,\varphi(x_1)\cdots\varphi(x_n)\,f_n(x_1,\dots,x_n),\qquad N<\infty, \tag{2.1}$$

where the $f_n$ are $\mathbb{C}[[\hbar]]$-valued distributions with compact support, which are symmetric under permutations of the arguments and whose wave front sets satisfy the condition

$$\mathrm{WF}(f_n)\cap\bigl(\mathbb{M}^n\times(V_+^n\cup V_-^n)\bigr)=\emptyset \tag{2.2}$$

and $f_0\in\mathbb{C}[[\hbar]]$ (see [13, Sec. 5.1] and [14, Sec. 4]). The value of the functional F(ϕ) for the argument $h\in\mathcal{C}$ is obtained by substituting h for ϕ everywhere on the right side of (2.1): F(ϕ)(h) = F(h). An important example is (for a given $k\in\mathbb{N}$)

$$f_n(x_1,\dots,x_n)=\begin{cases}0 & \text{for } n\neq k,\\[2pt] \int dx\,f(x)\prod_{j=1}^{k}\partial^{a_j}\delta(x_j-x) & \text{for } n=k\end{cases} \tag{2.3}$$

(where $f\in\mathcal{D}(\mathbb{M})$ and $a_j\in\mathbb{N}_0^d$), which gives $F(\varphi)=(-1)^{\sum_j|a_j|}\bigl(\prod_{j=1}^{k}\partial^{a_j}\varphi\bigr)(f)$.

$\mathcal{F}$ is a ∗-algebra with the classical product $(F_1\cdot F_2)(h):=F_1(h)\cdot F_2(h)$, where $F^*$ is obtained from F (2.1) by complex conjugation of all $f_n$. We introduce the functional

$$\omega_0:\ \begin{cases}\mathcal{F}\to\mathbb{C}[[\hbar]]\\ F\mapsto F(0)\equiv f_0\end{cases} \tag{2.4}$$

which will be interpreted as the "vacuum state". $\mathcal{F}(\mathcal{O})$ denotes the space of functionals localized in the spacetime region $\mathcal{O}$, i.e. which depend only on ϕ(x) for $x\in\mathcal{O}$,

$$\mathcal{F}(\mathcal{O})=\Bigl\{F\in\mathcal{F}\ \Big|\ \operatorname{supp}\frac{\delta F}{\delta\varphi}\subset\mathcal{O}\Bigr\}.$$

Here, the functional derivatives of a polynomial functional F (2.1) are $\mathcal{F}$-valued distributions given by

$$\frac{\delta^kF}{\delta\varphi(x_1)\cdots\delta\varphi(x_k)}=\sum_{n=k}^{N}\frac{n!}{(n-k)!}\int dy_1\cdots dy_{n-k}\,\varphi(y_1)\cdots\varphi(y_{n-k})\,f_n(x_1,\dots,x_k,y_1,\dots,y_{n-k}). \tag{2.5}$$
Let $\Delta_+^{(m)}$ be the 2-point function of the free scalar field with mass m. On $\mathcal{F}$ we define an m-dependent associative product by

$$(F\star_m G)(\varphi)=\sum_{n=0}^{\infty}\frac{\hbar^n}{n!}\int dx_1\cdots dx_n\,dy_1\cdots dy_n\,\frac{\delta^nF}{\delta\varphi(x_1)\cdots\delta\varphi(x_n)}\,\prod_{i=1}^{n}\Delta_+^{(m)}(x_i-y_i)\,\frac{\delta^nG}{\delta\varphi(y_1)\cdots\delta\varphi(y_n)}, \tag{2.6}$$

which induces a product $\star_m:\mathcal{F}(\mathcal{O})\times\mathcal{F}(\mathcal{O})\to\mathcal{F}(\mathcal{O})$. The condition (2.2) on the coefficients $f_n$ guarantees that the product is well defined, i.e. the pointwise product of distributions on the right side of (2.6) exists and the "coefficients" of $(F\star_m G)$ satisfy again (2.2). $\star_m$ corresponds to a ⋆-product in the sense of deformation quantization [1] (see also [29, 5]), and it may be interpreted as Wick's Theorem for "off-shell fields" (i.e. fields which are not restricted by any field equation). The corresponding algebras are denoted by $\mathcal{A}^{(m)}$ and $\mathcal{A}^{(m)}(\mathcal{O})$, respectively. For ħ = 0 the product reduces to the classical product. Since we understand the functionals F, G and their product $(F\star_m G)$ as formal power series in ħ, each equation must hold individually in each order of ħ; in particular renormalization (see Sec. 3) has to be done in this sense.

The algebra of Wick polynomials is obtained by dividing out the ideal $\mathcal{J}^{(m)}$ generated by the field equation,

$$\mathcal{J}^{(m)}=\Bigl\{F(\varphi)=\sum_{n=1}^{N}\int dx_1\cdots dx_n\,\varphi(x_1)\cdots\varphi(x_n)\,(\Box_{x_1}+m^2)f_n(x_1,\dots,x_n)\ \Big|\ f_n\ \text{as above}\Bigr\}. \tag{2.7}$$
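Returning briefly to the product (2.6), its lowest non-trivial instance can be written out explicitly. The following is a small worked example of our own (the test functions $f,g\in\mathcal{D}(\mathbb{M})$ and the choice $F=\varphi(f)$, $G=\varphi(g)$ are assumptions made for illustration). Since $\delta\varphi(f)/\delta\varphi(x)=f(x)$ and all higher functional derivatives vanish, only the terms n = 0 and n = 1 of (2.6) contribute:

```latex
\begin{align*}
\varphi(f)\star_m\varphi(g)
  &= \varphi(f)\,\varphi(g)
   + \hbar\int dx\,dy\; f(x)\,\Delta_+^{(m)}(x-y)\,g(y),\\
[\varphi(f),\varphi(g)]_{\star_m}
  &= \hbar\int dx\,dy\; f(x)\,
     \bigl(\Delta_+^{(m)}(x-y)-\Delta_+^{(m)}(y-x)\bigr)\,g(y).
\end{align*}
```

The correction to the classical product is of order ħ, in line with the remark that the product reduces to the classical one at ħ = 0.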
The quotient algebra $\mathcal{F}_0^{(m)}\equiv\mathcal{F}/\mathcal{J}^{(m)}$ can be, for each fixed value of ħ > 0, faithfully represented on Fock space by identifying the classical product of fields $\pi\bigl(\int dx_1\cdots dx_n\,\varphi(x_1)\cdots\varphi(x_n)f_n(x_1,\dots,x_n)\bigr)$ with the normally ordered product $\int dx_1\cdots dx_n\,{:}\varphi(x_1)\cdots\varphi(x_n){:}\,f_n(x_1,\dots,x_n)$, where π is the canonical surjection $\pi:\mathcal{F}\to\mathcal{F}_0^{(m)}$ ([14, Theorem 4.1]). $\omega_0$ (2.4) induces a state on $\mathcal{F}_0^{(m)}$ which corresponds to the Fock vacuum. $\mathcal{F}_0^{(m)}(\mathcal{O})$ denotes the image of $\mathcal{F}(\mathcal{O})$ under π. Let $\mathcal{C}_0^{(m)}\subset\mathcal{C}$ be the space of smooth solutions of the free field equation. Since $F\in\mathcal{J}^{(m)}\Leftrightarrow F|_{\mathcal{C}_0^{(m)}}=0$, the canonical surjection π can alternatively be viewed as the restriction of the functionals $F\in\mathcal{F}$ to $\mathcal{C}_0^{(m)}$ (for details see [15]).

We are particularly interested in local functionals. We call a functional $F\in\mathcal{F}$ local if

$$\frac{\delta^2F}{\delta\varphi(x)\,\delta\varphi(y)}=0\qquad\text{for}\qquad x\neq y.$$
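To illustrate this definition, here are two simple functionals of our own choosing (with $h,f,g\in\mathcal{D}(\mathbb{M})$ assumed for the example); only the first one is local:

```latex
\begin{align*}
F_1&=\int dx\,\varphi(x)^2\,h(x): &
\frac{\delta^2F_1}{\delta\varphi(x)\,\delta\varphi(y)}&=2\,h(x)\,\delta(x-y),\\
F_2&=\varphi(f)\,\varphi(g): &
\frac{\delta^2F_2}{\delta\varphi(x)\,\delta\varphi(y)}&=f(x)\,g(y)+f(y)\,g(x).
\end{align*}
```

The second derivative of $F_1$ is supported on the diagonal x = y, whereas that of $F_2$ is a smooth function which, for generic f and g, does not vanish for x ≠ y.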
Local functionals are of the form

$$F=\sum_{i=1}^{N}\int dx\,A_i(x)\,h_i(x)\equiv\sum_{i=1}^{N}A_i(h_i), \tag{2.8}$$

where the $A_i$ are polynomials of the field ϕ and its derivatives and the $h_i$ are test functions with compact support, $h_i\in\mathcal{D}(\mathbb{M})$. The set of local functionals will be denoted by $\mathcal{F}_{\rm loc}$.

Remark 2.1. There exist faithful Hilbert space representations of the off-shell fields. For example, a faithful representation π of the algebra $\mathcal{F}$ (with the classical product) is obtained by interpreting $\mathcal{F}$ as a vector space and defining the representation by left-multiplication: π(F)G := FG ($F,G\in\mathcal{F}$). A possible scalar product reads

$$\langle F,G\rangle:=\omega_0(F^*\star_g G), \tag{2.9}$$

where $\star_g$ is the ⋆-product (2.6) with $\Delta_+^{(m)}$ replaced by the 2-point function $\Delta_+^{[g]}$ of a generalized free field with weight function $g\in\mathcal{D}(\mathbb{R}_+)$, i.e. $\Delta_+^{[g]}(y)=\int dm^2\,g(m^2)\,\Delta_+^{(m)}(y)$. (Note that the smoothness of g excludes the case of a free field, $\Delta_+^{[g]}=\Delta_+^{(m_0)}$ for some $m_0$, in which (2.9) would be degenerate.)

Compared with their Hilbert space representations, the algebras $\mathcal{F}$ and $\mathcal{F}_0^{(m)}$ are more flexible and more convenient^c; and, as is demonstrated by this paper, they provide all necessary information.

We want to construct, for any pair of local functionals $F,G\sim\hbar^0$, the quantum field theoretical operator $F_{G/\hbar}$ ("interacting field") which corresponds to F under the interaction term^d G/ħ. $F_G$ should be a formal power series in G where each term is an element of $\mathcal{F}(\mathcal{O})$ if $F,G\in\mathcal{F}(\mathcal{O})$. Here we deviate essentially from the usual formalism of perturbative QFT: there the interacting fields are Fock space operators (which means in our algebraic formulation that they are elements of $\mathcal{F}_0^{(m)}$). Motivated by the study of the Peierls bracket [36, 15], we define them to be unrestricted functionals ("off-shell fields"). This strongly simplifies the proof of the "Main Theorem of perturbative renormalization" (Sec. 4.2) and e.g. the formulation of the renormalization conditions "Covariance" and "Field Independence" given below. At ħ = 0, the restriction $F_G|_{\mathcal{C}_0^{(m)}}$ is the (perturbative) classical retarded field as constructed in [15] (see also [11, 36]). We require the following properties, which may be motivated by their validity in classical field theory (see [15]):

Initial condition: For G = 0 we obtain the original functional, $F_0=F$.

c For example one does not need to care about domains of unbounded operators.
d Mostly we will set ħ = 1.
Causality: Fields are not influenced by interactions which take place later: $F_{G+H}=F_G$ if there is a Cauchy surface such that F is localized in its past and H in its future.

GLZ Relation: The Poisson bracket $\{F_{G/\hbar},H_{G/\hbar}\}\overset{\rm def}{=}\frac{i}{\hbar}\,[F_{G/\hbar},H_{G/\hbar}]_{\star_m}$ satisfies the GLZ relation [23, 42, 15]

$$\{F_{G/\hbar},H_{G/\hbar}\}=\frac{d}{d\lambda}\bigl(F_{(G+\lambda H)/\hbar}-H_{(G+\lambda F)/\hbar}\bigr)\Big|_{\lambda=0}.$$

Due to the GLZ relation the interacting fields depend on the ∗-product (2.6) and with that they depend on the mass m of the free field.

Steinmann [42] discovered that by these conditions an inductive construction of the perturbative expansion of $F_G$ (i.e. of the retarded products (1.2)) can be done up to local functionals which could be added in every order. But these undetermined terms correspond to the renormalization ambiguities which are there anyhow in perturbative quantum field theory. One may reduce these ambiguities by prescribing normalization conditions which are satisfied in classical field theory [15]. We impose the following conditions:

Unitarity: Complex conjugation induces an involution $F\mapsto F^*$ of the algebra which after restriction to $\mathcal{C}_0^{(m)}$ becomes the formal adjoint operation on Fock space. We require

$$(F_G)^*=F^*_{G^*}.$$

This condition implies that a real interaction G leads formally to a unitary S-matrix and hermitian interacting fields (if $F^*=F$).

Covariance: The Poincaré group $\mathcal{P}_+^\uparrow$ has a natural automorphic action β on $\mathcal{F}$. We require

$$\beta_L(F_G)=\beta_L(F)_{\beta_L(G)}\qquad\forall L\in\mathcal{P}_+^\uparrow.$$

(See [8] and [31] for the formulation of covariance on curved spacetime.) In addition, global inner symmetries (in our case the field parity α : ϕ ↦ −ϕ) should be preserved,

$$\alpha(F_G)=\alpha(F)_{\alpha(G)}. \tag{2.10}$$

Field Independence: A coherent prescription for the renormalization of polynomials in the basic fields and all sub-polynomials can be obtained by the following condition

$$\Bigl\langle h,\frac{\delta}{\delta\varphi}\Bigr\rangle F_G=\Bigl\langle h,\frac{\delta F}{\delta\varphi}\Bigr\rangle_G+\frac{d}{d\lambda}\Big|_{\lambda=0}F_{G+\lambda\langle h,\frac{\delta G}{\delta\varphi}\rangle},\qquad h\in\mathcal{D}(\mathbb{M}) \tag{2.11}$$
(where $\langle h,\frac{\delta H}{\delta\varphi}\rangle\equiv\int dx\,h(x)\,\frac{\delta H}{\delta\varphi(x)}$ for $H\in\mathcal{F}$). Below it will turn out that this condition is equivalent to the natural generalization to our "off-shell formalism" of the causal Wick expansion given in [18, Sec. 4]. (The latter is equivalent to the condition N3 in [12], which is also called "relation to time-ordered products of sub-polynomials" in [16].)

Field equation: The renormalization ambiguities can be used to fulfill the Yang–Feldman equation "off-shell":

$$\varphi_G(x)=\varphi(x)-\int dy\,\Delta_m^{\rm ret}(x-y)\,\Bigl(\frac{\delta G}{\delta\varphi(y)}\Bigr)_G. \tag{2.12}$$

The latter condition may be enforced by requiring the validity of all local identities which hold classically as a consequence of the field equation. This was termed Master Ward Identity in [15]. Since the Master Ward Identity cannot always be fulfilled we do not impose it here.

Smoothness in the mass m ≥ 0: The classical interacting fields depend smoothly on the mass of the free fields in the range m ≥ 0. In the quantum case, in even dimensional spacetime, this is no longer true even for the free fields because of^e logarithmic singularities of the two-point function $\Delta_m^+$ (see Appendix A). One can remedy this defect by passing to an equivalent star product $\star_{m,\mu}$ which is defined by (2.6) with $\Delta_m^{+\,(d)}$ replaced by

$$H_m^{\mu\,(d)}(x)\equiv\Delta_m^{+\,(d)}(x)-\log(m^2/\mu^2)\,\gamma(x),\qquad\gamma(x)\equiv m^{d-2}\,h^{(d)}(m^2x^2). \tag{2.13}$$

(This procedure essentially follows [30].) μ > 0 is an additional mass parameter. $h^{(d)}$ is analytic and it is chosen such that $H_m^{\mu\,(d)}$ is smooth in m ≥ 0. Explicitly we choose $h^{(2l+1)}\equiv 0$ and $h^{(4+2k)}(y)\equiv\pi^{-k}f^{(k)}(y)$, where f is given in Appendix A by (A.9). With this particular choice $H_m^{\mu\,(d)}$ solves the Klein–Gordon equation, but we point out that this is not necessary for the purposes of this paper,^f cf. [30].

e We are indebted to Stefan Hollands for pointing out to us that this fact invalidates our treatment of the scaling behavior in an older version of this manuscript.
f In order that $\star_{m,\mu}$ induces a well defined ⋆-product on $\mathcal{F}_0^{(m)}\equiv\mathcal{F}/\mathcal{J}^{(m)}$ it is needed that $H_m^{\mu\,(d)}$ solves the Klein–Gordon equation. However, we always work with off-shell fields (i.e. with $\mathcal{F}$).

To show that the product $\star_{m,\mu}$ is equivalent to $\star_m$, we introduce the transformation

$$\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}=1+\sum_{k=1}^{\infty}\frac{1}{k!}\bigl(\log(m/\mu)\cdot\Gamma\bigr)^k, \tag{2.14}$$

where Γ is the operator

$$\Gamma\equiv\Gamma^{(m)}\equiv\int dx\,dy\,\gamma(x-y)\,\frac{\delta^2}{\delta\varphi(x)\,\delta\varphi(y)}. \tag{2.15}$$

By using

$$e^{\lambda\varphi(f)}\star_m e^{\lambda\varphi(g)}=e^{\lambda\varphi(f+g)}\,e^{\lambda^2(f,\Delta_m^+g)} \tag{2.16}$$
(understood as formal power series in λ), the same formula for $(\star_{m,\mu},H_m^\mu)$ and

$$\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}e^{\lambda\varphi(f)}=e^{\lambda\varphi(f)}\,\Bigl(\frac{m}{\mu}\Bigr)^{\lambda^2\int dx\,dy\,\gamma(x-y)f(x)f(y)}, \tag{2.17}$$

we find that $(m/\mu)^\Gamma$ intertwines between $\star_{m,\mu}$ and $\star_m$:

$$\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}\bigl(F\star_{m,\mu}G\bigr)=\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}(F)\ \star_m\ \Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}(G),\qquad F,G\in\mathcal{F}. \tag{2.18}$$

As mentioned after (2.7), the normally ordered product ${:}\varphi(f)^n{:}_m$ (where normal ordering is done with $\Delta_m^+$) agrees with $\pi((\varphi(f))^n)$. The modified normally ordered product ${:}\varphi(f)^n{:}_{m,\mu}$ (i.e. normal ordering is done with $H_m^\mu$) agrees with $\pi\bigl((m/\mu)^{-\Gamma}(\varphi(f))^n\bigr)$. With that (2.17) yields that the two different kinds of Wick powers are related by

$${:}e^{\lambda\varphi(x)}{:}_{m,\mu}=\Bigl(\frac{m}{\mu}\Bigr)^{-\lambda^2\gamma(0)}\,{:}e^{\lambda\varphi(x)}{:}_m. \tag{2.19}$$

We may now introduce interacting fields with respect to the modified ∗-product,

$$(F_G)^{(m,\mu)}:=\Bigl(\frac{m}{\mu}\Bigr)^{-\Gamma}\biggl(\Bigl(\bigl(\tfrac{m}{\mu}\bigr)^{\Gamma}(F)\Bigr)^{(m)}_{(\frac{m}{\mu})^{\Gamma}(G)}\biggr), \tag{2.20}$$

where $F_G^{(m)}$ denotes the interacting field with respect to the usual ∗-product (2.6). Note that at m = 0 the two kinds of interacting fields agree: $F_G^{(0,\mu)}=F_G^{(0)}$. Since Γ is local (i.e. $\operatorname{supp}\frac{\delta(\Gamma F)}{\delta\varphi}\subset\operatorname{supp}\frac{\delta F}{\delta\varphi}$), field independent (in the sense that γ is field independent), Poincaré invariant and commutes with the ∗-operation (since γ is Poincaré invariant and real), the modified interacting fields satisfy the same conditions as the original interacting fields (where, of course, in the GLZ relation the commutator with respect to the modified ∗-product has to be used). Our smoothness condition now takes the following form: we require that the maps

$$m\mapsto(F_G)^{(m,\mu)},\qquad F,G\in\mathcal{F}_{\rm loc},\quad\mu>0, \tag{2.21}$$

are smooth (in the sense of one-sided derivatives at m = 0).^g

g Due to (2.13) the interacting fields depend only on m², and setting $(F_G)^{(-m,\mu)}:=(F_G)^{(m,\mu)}$ the map (2.21) can be extended to m < 0. It is even possible to require that $(F_G)^{(m,\mu)}$ is smooth in m² (which is a stronger condition than smoothness in m). However, note that this footnote is not valid for spinor fields.

Remark 2.2. (1) The fact that the ⋆-products $\star_{m,\mu}$ are equivalent for different values of μ (which follows immediately from (2.18)) may also be formulated in
the following way. Introduce new fields by

$$\varphi^{\otimes n}(x_1,\dots,x_n)_{m,\mu}=\Bigl(\frac{m}{\mu}\Bigr)^{-\Gamma}\varphi(x_1)\cdots\varphi(x_n).$$

Every functional $F\in\mathcal{F}$ may be expanded in any of these fields,

$$F=\sum_n\int dx_1\cdots dx_n\,f_n^{m,\mu}(x_1,\dots,x_n)\,\varphi^{\otimes n}(x_1,\dots,x_n)_{m,\mu},$$

with suitable coefficients $f_n^{m,\mu}$. The different ⋆-products then arise when the ⋆-product $\star_m$ is expressed in terms of the coefficients:

$$\begin{aligned}F\star_m G&=\sum_{n,k}\int dx_1\,dy_1\cdots f_n^{m,\mu}(x_1,\dots)\,g_k^{m,\mu}(y_1,\dots)\cdot\Bigl(\frac{m}{\mu}\Bigr)^{-\Gamma}\bigl(\varphi(x_1)\cdots\star_{m,\mu}\varphi(y_1)\cdots\bigr)\\&=\sum\int f^{m,\mu}\cdot\prod H_m^\mu\cdot g^{m,\mu}\cdot(\varphi^{\otimes})_{m,\mu}.\end{aligned}$$

Hence, the choice of μ can be understood as the choice of a basis for $\mathcal{A}^{(m)}\equiv(\mathcal{F},\star_m)$.

(2) The introduction of the modified ∗-product and modified interacting fields (2.20) can be avoided by requiring that the function $m\mapsto F_G^{(m)}$ is almost smooth for m ↓ 0 for all $F,G\in\mathcal{F}_{\rm loc}$. Here a function $\mathbb{R}_+\ni m\mapsto f(m)$ is called almost smooth for m ↓ 0 if for any fixed mass parameter μ > 0 there exist polynomials $p_{k,\mu}$, $k\in\mathbb{N}_0$, such that for each $n\in\mathbb{N}_0$

$$m^{-n}\Bigl(f(m)-\sum_{k\le n}m^k\,p_{k,\mu}\Bigl(\log\frac{m}{\mu}\Bigr)\Bigr)\to 0 \tag{2.22}$$

for m ↓ 0. Note that the polynomials $p_{k,\mu}$ are uniquely determined by this condition. Then the scaling expansion (3.10) has to be generalized correspondingly. We do not go this way, because the treatment of the renormalization ambiguities is much simpler for the modified interacting fields: the map $D_H^{(m)}$ of the Main Theorem (i.e. Theorem 4.2), which gives a finite renormalization of the modified interacting fields, is free of (log m)-terms (4.14); but the corresponding $D^{(m)}$ of the original interacting fields contains such terms (4.15).

Scaling: Under a simultaneous scaling of the coordinates and the mass, $(x,m)\mapsto(\rho x,\rho^{-1}m)$, the interacting classical fields transform homogeneously. This can no longer be maintained for the quantized theory, together with the requirement of smoothness at m = 0, even for free fields (in even dimensions) since $H_m^\mu$ does not scale homogeneously (A.12). However, we will show that the retarded products can be normalized such that they scale almost homogeneously (i.e. up to logarithmic terms).

The last two normalization conditions were first imposed in [31] in the more general context of renormalization on curved spacetime. In the traditional literature (e.g.
[18, 42, 7]) instead the weaker requirement was used that "renormalization should not make the interacting fields more singular" (in the UV-region), see footnote m.

The listed conditions can be translated in a straightforward way into conditions on the retarded products $R_{n,1}$ which are by definition (1.2) the Taylor coefficients of the interacting field with respect to the interaction,

$$A(f)_{L(g)}=\sum_{n=0}^{\infty}\frac{1}{n!}\,R_{n,1}\bigl(L(g)^{\otimes n},A(f)\bigr)\equiv R\bigl(e_\otimes^{L(g)},A(f)\bigr), \tag{2.23}$$

with $g,f\in\mathcal{D}(\mathbb{M})$ and $A,L\in\mathcal{P}$, where $\mathcal{P}$ is the algebra of polynomials in the classical field ϕ and its partial derivatives (with respect to pointwise multiplication). We use (2.23) for both kinds of interacting fields, $F_G^{(m)}$ and $F_G^{(m,\mu)}$, and by writing R (or $R_{n,1}$) we mean both kinds of retarded products, $R^{(m)}$ and $R^{(m,\mu)}$. The retarded product $R_{n-1,1}$ is a linear map from $\mathcal{F}_{\rm loc}^{\otimes n}$ into $\mathcal{F}$ which is symmetric in the first n − 1 variables. In the last expression of (2.23) the sequence $R\equiv(R_{n-1,1})_{n\in\mathbb{N}}$ is viewed as a map

$$R:\ T\mathcal{F}_{\rm loc}\to\mathcal{F},\qquad\text{where}\quad T\mathcal{F}_{\rm loc}\overset{\rm def}{=}\bigoplus_{n=0}^{\infty}\mathcal{F}_{\rm loc}^{\otimes n} \tag{2.24}$$

and $R_{-1,1}\overset{\rm def}{=}0$, which is extended by linearity to formal power series. It is sometimes advantageous to interpret $R_{n-1,1}$ as an $\mathcal{F}$-valued distribution in n variables on the test function space $\mathcal{D}(\mathbb{M},\mathcal{P})$ (cf. [15]). In particular our retarded products are multilinear in the fields $A\in\mathcal{P}$. We also use the symbolic notation $R_{n-1,1}(x_1,\dots,x_n)$ for a distribution which takes values in $\mathcal{F}\otimes\mathcal{P}_n'$, where $\mathcal{P}_n'$ is the dual vector space of $\mathcal{P}^{\otimes n}$. After insertion of fields $A_1,\dots,A_n\in\mathcal{P}$ we obtain $\mathcal{F}$-valued distributions on $\mathcal{D}(\mathbb{M}^n)$ which are symbolically written as $R_{n-1,1}(A_1(x_1),\dots,A_n(x_n))$.

Action Ward Identity (AWI): Since the retarded products depend only on the functionals, derivatives may be shifted from the test functions to the fields and vice versa. Hence, the associated distributions must satisfy the Action Ward Identity

$$\partial_\mu^x R_{n-1,1}(\dots,A_k(x),\dots)=R_{n-1,1}(\dots,\partial_\mu A_k(x),\dots). \tag{2.25}$$
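In smeared form, (2.25) expresses that a total derivative may be moved between field and test function. The following short computation is a sketch of our own for a single argument (the test function $h\in\mathcal{D}(\mathbb{M})$ is notation introduced here, and the remaining arguments are suppressed); it uses only (2.25) and an integration by parts, which produces no boundary term because h has compact support:

```latex
\begin{align*}
R_{n-1,1}\bigl(\dots,(\partial_\mu A_k)(h),\dots\bigr)
  &= \int dx\; h(x)\,\partial^x_\mu R_{n-1,1}\bigl(\dots,A_k(x),\dots\bigr)\\
  &= -\int dx\; (\partial_\mu h)(x)\,R_{n-1,1}\bigl(\dots,A_k(x),\dots\bigr)
   = -\,R_{n-1,1}\bigl(\dots,A_k(\partial_\mu h),\dots\bigr).
\end{align*}
```

This matches the identity $(\partial_\mu A_k)(h)=-A_k(\partial_\mu h)$ between the functionals themselves.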
Symmetry: Motivated by (2.23) we require that $R_{n-1,1}$ is symmetric in the first (n − 1) factors. This property is reflected in the convention for the lower indices of $R_{n-1,1}$.

Initial condition:

$$R_{0,1}(F)=F.$$

Causality:

$$\operatorname{supp}R_{n-1,1}\subset\{(x_1,\dots,x_n)\in\mathbb{M}^n\mid x_i\in x_n+\bar V_-,\ \forall i=1,\dots,n-1\}. \tag{2.26}$$
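The support condition (2.26) is a purely geometric statement about point configurations, and it can be tested numerically for a given tuple of points. The sketch below is illustrative only: the function names, the tolerance and the metric signature (+, −, …, −) are our assumptions, not notation from the text.

```python
from typing import Sequence

Point = Sequence[float]  # (x^0, x^1, ..., x^{d-1}); signature (+, -, ..., -) assumed


def minkowski_square(v: Point) -> float:
    """Return v^2 = (v^0)^2 - |spatial part|^2."""
    return v[0] ** 2 - sum(c ** 2 for c in v[1:])


def in_closed_past_cone(v: Point, tol: float = 1e-12) -> bool:
    """True if v lies in the closed backward light cone: v^0 <= 0 and v^2 >= 0."""
    return v[0] <= tol and minkowski_square(v) >= -tol


def in_causal_support(points: Sequence[Point]) -> bool:
    """Check x_i - x_n in the closed past cone for all i < n, as required by (2.26)."""
    *earlier, last = points
    return all(
        in_closed_past_cone([a - b for a, b in zip(x, last)])
        for x in earlier
    )
```

A configuration with one argument spacelike or in the future of the last point lies outside the allowed support, so the retarded product has to vanish there.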
GLZ relation:

$$R_{n-1,1}(\dots,y,z)-R_{n-1,1}(\dots,z,y)=\hbar\,J_{n-2,2}(\dots,y,z) \tag{2.27}$$
where $J_{n-2,2}$ is an algebra valued distribution in n variables, n ≥ 2, which is defined by

$$J_{n-2,2}\bigl(A_1(x_1),\dots,A_{n-2}(x_{n-2}),B(y),C(z)\bigr)\overset{\rm def}{=}\sum_{I\subset\{1,\dots,n-2\}}\Bigl\{R_{|I|,1}\bigl(A_i(x_i),i\in I,B(y)\bigr),\,R_{|I^c|,1}\bigl(A_i(x_i),i\in I^c,C(z)\bigr)\Bigr\}. \tag{2.28}$$

Here, $I^c$ is the complement of I in {1, …, n − 2}. Obviously, $J_{n-2,2}$ is symmetric in the first n − 2 variables and antisymmetric in the last 2 variables. The properties of $J_{n-2,2}$ stated in the following lemma are necessary conditions for the GLZ relation and the causality of $R_{n-1,1}$. But actually, they are already fulfilled as a consequence of the definition of J (2.28), and the GLZ relation and Causality of the retarded products to lower orders [42].

Lemma 2.3. (a) J satisfies the Jacobi identity

$$J_{n-2,2}(\dots,x,y,z)+\mathrm{cycl}(x,y,z)=0. \tag{2.29}$$

(b) The support of $J_{n-2,2}$ is contained in the set

$$\{x_i\in x_n+\bar V_-,\ i=1,\dots,n-1\}\cup\{x_i\in x_{n-1}+\bar V_-,\ i=1,\dots,n-2,n\}. \tag{2.30}$$
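Before turning to the proof, the combinatorics of the definition (2.28) may be made concrete. The sketch below only enumerates the index splits (I, I^c) and prints the corresponding brackets symbolically; it does not evaluate any distribution, and the function names and the string format are hypothetical choices of ours.

```python
from itertools import combinations


def index_splits(n):
    """Yield (I, I_c) for all subsets I of {1, ..., n-2}, as in the sum (2.28)."""
    indices = tuple(range(1, n - 1))
    for size in range(len(indices) + 1):
        for subset in combinations(indices, size):
            complement = tuple(i for i in indices if i not in subset)
            yield subset, complement


def format_factor(indices, last):
    """Symbolic name of R_{|I|,1}(x_I, last) for a given index tuple."""
    args = [f"x{i}" for i in indices] + [last]
    return f"R_{{{len(indices)},1}}({', '.join(args)})"


def bracket_terms(n):
    """List the 2^(n-2) brackets {R(x_I, y), R(x_{I^c}, z)} contributing to J_{n-2,2}."""
    return [
        f"{{{format_factor(subset, 'y')}, {format_factor(complement, 'z')}}}"
        for subset, complement in index_splits(n)
    ]
```

For n = 2 the sum consists of the single bracket of $R_{0,1}(y)$ and $R_{0,1}(z)$, which by the initial condition is the bracket of the two fields themselves.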
Proof (cf. [42]). (a) We start with the Jacobi identity of the Poisson bracket and use the notation $x_M\equiv(x_m\mid m\in M)$, where M ⊂ {1, …, n − 1}. So we know

$$\sum_{I\sqcup H\sqcup L=\{1,\dots,n-1\}}\bigl\{\{R(x_I,x),R(x_H,y)\},R(x_L,z)\bigr\}+\mathrm{cycl}(x,y,z)=0, \tag{2.31}$$

where ⊔ means the disjoint union. The sum over all decompositions of K ≡ I ⊔ H (= fixed) of the inner Poisson bracket is equal to $J_{|K|,2}(x_K,x,y)$, |K| ≤ n − 1, which splits into $R(x_K,x,y)-R(x_K,y,x)$ due to the validity of the GLZ relation to lower orders. With that we obtain

$$\begin{aligned}0&=\sum_{K\sqcup L=\{1,\dots,n-1\}}\bigl\{\bigl(R(x_K,x,y)-R(x_K,y,x)\bigr),R(x_L,z)\bigr\}+\mathrm{cycl}(x,y,z)\\&=J_{n,2}(x_1,\dots,x_{n-1},x,y,z)+\mathrm{cycl}(x,y,z).\end{aligned} \tag{2.32}$$

(b) By definition of J and the support properties of R it follows that $J_{n-2,2}(x_1,\dots,x_{n-2},y,z)$ vanishes if one of the first n − 2 arguments is not in the past of {y, z}. It remains to show that it vanishes also for (y − z)² < 0. If one of the first n − 2 arguments is different from y and z, and is in the past of, say y, then by the Jacobi identity J has to vanish. If, on the other hand, all arguments $x_i$ are sufficiently near to either y or z, then they are space-like to the other point, hence all retarded products in the definition of J vanish up to those where all arguments in the first factor
are near to y and all arguments in the second factor are near to z. But then the Poisson bracket of these retarded products vanishes, since by assumption the retarded products are localized at their arguments.^h

An immediate consequence of the Initial condition and the GLZ relation is

$$R_{n-1,1}(F_1,\dots,F_n)=O(\hbar^{\,n-1})\qquad\text{if}\qquad F_1,\dots,F_n\sim\hbar^0. \tag{2.33}$$

Field Independence: The condition (2.11) translates into

$$\frac{\delta}{\delta\varphi(x)}R_{n-1,1}(F_1,\dots,F_n)=\sum_{l=1}^{n}R_{n-1,1}\Bigl(F_1,\dots,\frac{\delta F_l}{\delta\varphi(x)},\dots,F_n\Bigr). \tag{2.34}$$

This condition determines the retarded product on the left side in terms of the retarded products on the right side, up to its value at ϕ = 0, i.e. its vacuum expectation value. Since by definition the functionals $F_i$ are polynomials in ϕ, one obtains the finite Taylor expansion^i

$$\begin{aligned}R_{n-1,1}(F_1,\dots,F_n)={}&\sum_{l_1\cdots l_n}\frac{1}{l_1!\cdots l_n!}\int dx_{11}\cdots dx_{1l_1}\cdots dx_{n1}\cdots dx_{nl_n}\\&\cdot\,\omega_0\Bigl(R_{n-1,1}\Bigl(\frac{\delta^{l_1}F_1}{\delta\varphi(x_{11})\cdots\delta\varphi(x_{1l_1})},\dots,\frac{\delta^{l_n}F_n}{\delta\varphi(x_{n1})\cdots\delta\varphi(x_{nl_n})}\Bigr)\Bigr)\\&\cdot\,\varphi(x_{11})\cdots\varphi(x_{1l_1})\cdots\varphi(x_{n1})\cdots\varphi(x_{nl_n}),\end{aligned} \tag{2.35}$$

where the coefficients $\omega_0(R_{n-1,1}(\cdots))$ are restricted by the other axioms. The retarded product on the right side is well-defined, because, due to $F_k\in\mathcal{F}_{\rm loc}$, the support of $\delta^lF_k/\delta\varphi(x_1)\cdots\delta\varphi(x_l)$ is contained in the total diagonal $x_1=x_2=\cdots=x_l$. After integrating out the corresponding δ-distributions the right side of (2.35) is a sum of terms of the form

$$\int d^{nd}x\;\omega_0\Bigl(R_{n-1,1}\bigl(A_1(x_1),\dots,A_n(x_n)\bigr)\Bigr)\prod_{i=1}^{n}\prod_{j_i=1}^{l_i}\partial^{a_{ij_i}}\varphi(x_i)\,h_i(x_i) \tag{2.36}$$

with $h_i\in\mathcal{D}(\mathbb{M})$, $A_i\in\mathcal{P}$ and multi-indices $a_{ij_i}\in\mathbb{N}_0^d$. Due to translation invariance of the vacuum, the coefficients in this expansion depend on the relative coordinates only. In particular, their wave front set satisfies the condition (2.2) on the admissible coefficients in $\mathcal{F}$.

The condition (2.34) that the retarded product is independent of ϕ is the only axiom which relies on the fact that one perturbs around a theory with an action which is of second order in the field. Indeed, in the general case, the retarded products of classical field theory depend on the action only via its second functional derivative, see [15, Proposition 1].

h The proof of the last fact given by Steinmann [42] is much more involved since he does not assume the localization property of the retarded products.
i In (2.35) and (2.36) the product of fields is the classical product, (ϕ(x)ϕ(y))(h) = h(x)h(y), and not the ⋆-product (2.6).
The restriction of (2.35) to $\mathcal{C}_0^{(m)}$ is the causal Wick expansion of Epstein and Glaser [18]. In particular the "coefficients" $\omega_0(R_{n-1,1}(\cdots))$ are exactly the same as in the on-shell formalism of [18].

Smoothness in the mass m ≥ 0: In terms of the retarded products our requirement reads that

$$R^{(m,\mu)}_{n-1,1}(F_1,\dots,F_n)=\Bigl(\frac{m}{\mu}\Bigr)^{-\Gamma}R^{(m)}_{n-1,1}\Bigl(\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}(F_1),\dots,\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma}(F_n)\Bigr) \tag{2.37}$$

depends smoothly on m for all m ≥ 0. By this we mean that $R^{(m,\mu)}_{n-1,1}(F_1,\dots,F_n)(h)$ is smooth as a function of m for all $F_1,\dots,F_n\in\mathcal{F}_{\rm loc}$ and all field configurations $h\in\mathcal{C}$, and that the derivatives are of the form G(h) with $G\in\mathcal{F}$.^j In particular this smoothness implies that $R^{(0,\mu)}_{n-1,1}$ can be obtained by the limit $\lim_{m\downarrow 0}R^{(m,\mu)}_{n-1,1}$ in the sense of distributions on $\mathcal{D}(\mathbb{M}^n)$.

Scaling: As an introduction and for later purpose we first define "almost homogeneous scaling" for a distribution under rescaling of the coordinates (cf. [31]):

Definition 2.4. A distribution $t\in\mathcal{D}'(\mathbb{R}^k)$ (or $\mathcal{D}'(\mathbb{R}^k\setminus\{0\})$) scales almost homogeneously with degree $D\in\mathbb{R}$ and power $N\in\mathbb{N}_0$ if

$$\Bigl(\sum_{r=1}^{k}z_r\,\partial_{z_r}+D\Bigr)^{N+1}t(z_1,\dots,z_k)=0 \tag{2.38}$$

and N is the minimal natural number with this property. For N = 0 the scaling is homogeneous with degree D.

The condition (2.38) is equivalent to

$$0=(\rho\,\partial_\rho)^{N+1}\bigl(\rho^D\,t(\rho z_1,\dots,\rho z_k)\bigr)=\frac{\partial^{N+1}}{\partial(\log\rho)^{N+1}}\bigl(\rho^D\,t(\rho z_1,\dots,\rho z_k)\bigr). \tag{2.39}$$
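As a numerical illustration of Definition 2.4 in the form (2.39), consider the one-variable function t(z) = log(z)/z² on z > 0; this example, the step size and the helper names are our own assumptions. A function of s = log ρ whose (N+1)-th derivative vanishes is a polynomial of degree at most N in s, so its (N+1)-th finite difference vanishes as well, which is what the sketch tests.

```python
import math


def t(z):
    """Illustrative one-variable example: t(z) = log(z) / z**2 for z > 0."""
    return math.log(z) / z ** 2


def rescaled(func, degree, z, s):
    """Return rho**degree * func(rho * z) with rho = exp(s), i.e. as a function of s = log(rho)."""
    rho = math.exp(s)
    return rho ** degree * func(rho * z)


def finite_difference(f, s, order, h=0.1):
    """Forward difference of the given order in s; it annihilates polynomials of lower degree."""
    return sum(
        (-1) ** (order - j) * math.comb(order, j) * f(s + j * h)
        for j in range(order + 1)
    )


def scaled_example(s, z=1.5):
    """rho^2 * t(rho * z) for the example above, which equals (s + log z) / z**2."""
    return rescaled(t, 2, z, s)
```

Here the second difference of `scaled_example` vanishes up to rounding while the first difference equals h/z², so the example scales almost homogeneously with degree D = 2 and power N = 1. Replacing t by 1/z² would make the first difference vanish already, which is the homogeneous case N = 0.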
Hence, t scales almost homogeneously with degree D and power N if $\rho^D\,t(\rho z_1,\dots,\rho z_k)$ is a polynomial of log ρ with degree N.

j The latter condition is necessary since the distributions occurring in the representation (2.1) of G have to satisfy the wave front set condition (2.2).

To formulate almost homogeneous scaling for the retarded products under simultaneous rescalings of the coordinates and the mass m we introduce some tools. The mass dimension of a monomial in $\mathcal{P}$ is fixed by the conditions

$$\dim(\partial^a\varphi)=\frac{d-2}{2}+|a|\qquad\text{and}\qquad\dim(A_1A_2)=\dim(A_1)+\dim(A_2) \tag{2.40}$$

for all monomials $A_1,A_2\in\mathcal{P}$. This introduces a grading for $\mathcal{P}$:

$$\mathcal{P}=\bigoplus_j\mathcal{P}_j, \tag{2.41}$$

where $\mathcal{P}_j$ is the linear span of all monomials with mass dimension j. The mass dimension of $A=\sum_jA_j$, with $A_j\in\mathcal{P}_j$, is the maximum of the contributing
j's. We also introduce the set of all field polynomials which are homogeneous in the mass dimension,

$$\mathcal{P}_{\rm hom}\overset{\rm def}{=}\bigcup_j\mathcal{P}_j. \tag{2.42}$$
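The grading by mass dimension defined in (2.40) to (2.42) is simple bookkeeping, which the following sketch spells out. The encoding is an assumption of ours: a monomial is represented by the list of derivative orders |a_i| of its factors, and a polynomial by a list of such monomials.

```python
from fractions import Fraction


def dim_factor(d, derivative_order):
    """Mass dimension of one factor: (d - 2)/2 + |a|, cf. (2.40)."""
    return Fraction(d - 2, 2) + derivative_order


def dim_monomial(d, derivative_orders):
    """Mass dimension of a product of factors, given their derivative orders |a_i|."""
    return sum((dim_factor(d, a) for a in derivative_orders), Fraction(0))


def dim_polynomial(d, monomials):
    """Mass dimension of a sum of monomials: the maximum over the contributing terms."""
    return max(dim_monomial(d, m) for m in monomials)


def is_homogeneous(d, monomials):
    """True if all monomials share one mass dimension, i.e. the sum lies in a single P_j."""
    return len({dim_monomial(d, m) for m in monomials}) == 1
```

In d = 4, for instance, the field has dimension 1, so a quartic monomial without derivatives and the square of a first derivative both have dimension 4, and their sum is homogeneous; adding a quadratic term without derivatives (dimension 2) gives a polynomial that is no longer in $\mathcal{P}_{\rm hom}$.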
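A scaling transformation $\sigma_\rho$ is introduced (in analogy to [32]) as an automorphism of $\mathcal{F}$ (considered as an algebra with the classical product) by

$$\sigma_\rho(\varphi(x))=\rho^{\frac{2-d}{2}}\,\varphi(\rho^{-1}x). \tag{2.43}$$

Note $\omega_0\circ\sigma_\rho=\omega_0$. Due to $\rho^{d-2}\Delta_+^{(\rho^{-1}m)}(\rho x)=\Delta_+^{(m)}(x)$, $\sigma_\rho$ is also an algebra isomorphism from $\mathcal{A}^{(\rho^{-1}m)}$ to $\mathcal{A}^{(m)}$. However, denoting by $\mathcal{A}^{(m,\mu)}$ the algebra $(\mathcal{F},\star_{m,\mu})$, $\sigma_\rho$ is an isomorphism from $\mathcal{A}^{(\rho^{-1}m,\rho^{-1}\mu)}$ to $\mathcal{A}^{(m,\mu)}$, but not from $\mathcal{A}^{(\rho^{-1}m,\mu)}$ to $\mathcal{A}^{(m,\mu)}$, since $\rho^{d-2}H^{\rho^{-1}\mu\,(d)}_{\rho^{-1}m}(\rho x)=H^{\mu\,(d)}_m(x)$. For m = 0 the coordinates are scaled only and $\sigma_\rho$ is an automorphism of $\mathcal{A}^{(m=0)}=\mathcal{A}^{(m=0,\mu)}$. For $A\in\mathcal{P}_{\rm hom}$, we obtain

$$\rho^{\dim(A)}\,\sigma_\rho(A(\rho x))=A(x). \tag{2.44}$$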
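As a quick consistency check of (2.44), here is a worked example of our own for the two monomials $A=\varphi^2$ and $A=\partial_\mu\varphi$. It assumes that $\sigma_\rho$ acts on derivatives of the field by differentiating (2.43) with respect to x, which produces one extra factor $\rho^{-1}$ per derivative:

```latex
\begin{align*}
\sigma_\rho\bigl(\varphi^2(\rho x)\bigr)
  &= \bigl(\rho^{\frac{2-d}{2}}\varphi(x)\bigr)^2
   = \rho^{-(d-2)}\,\varphi^2(x),
  & \dim(\varphi^2) &= d-2,\\
\sigma_\rho\bigl((\partial_\mu\varphi)(\rho x)\bigr)
  &= \rho^{\frac{2-d}{2}-1}\,(\partial_\mu\varphi)(x)
   = \rho^{-d/2}\,(\partial_\mu\varphi)(x),
  & \dim(\partial_\mu\varphi) &= \tfrac{d}{2}.
\end{align*}
```

In both cases multiplication by $\rho^{\dim(A)}$ gives back A(x), in agreement with (2.40) and (2.44).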
So, with the identification given by $\sigma_\rho$, they scale homogeneously with degree given by their mass dimension. By inserting the definitions one finds $\sigma_\rho\,\Gamma^{(\rho^{-1}m)}\,\sigma_\rho^{-1}=\Gamma^{(m)}$ and hence

$$\sigma_\rho\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma^{(\rho^{-1}m)}}\sigma_\rho^{-1}=\Bigl(\frac{m}{\mu}\Bigr)^{\Gamma^{(m)}}. \tag{2.45}$$

The property

$$\sigma_\rho\bigl(\sigma_\rho^{-1}F\star_0\sigma_\rho^{-1}G\bigr)=F\star_0G \tag{2.46}$$

of the ⋆-product at m = 0 cannot be maintained for the retarded products, i.e.

$$R^{(0)}_{n-1,1\,\rho}:=\sigma_\rho\circ R^{(0)}_{n-1,1}\circ(\sigma_\rho^{-1})^{\otimes n} \tag{2.47}$$

will differ from $R^{(0)}_{n-1,1}$, in general. But one can reach that $R^{(0)}_{n-1,1}$ scales almost homogeneously with degree zero, i.e. $R^{(0)}_{n-1,1\,\rho}$ has polynomial behavior in log ρ, in the sense that for all $F_1,\dots,F_n\in\mathcal{F}_{\rm loc}$, $R^{(0)}_{n-1,1\,\rho}(F_1,\dots,F_n)$ is a polynomial in log ρ.

In the massive case, scaling relates retarded products for different masses; therefore, the condition of homogeneity gives no restriction at a fixed mass. Our condition of almost homogeneous scaling states that

$$R^{(m)}_{n-1,1\,\rho}:=\sigma_\rho\circ R^{(\rho^{-1}m)}_{n-1,1}\circ(\sigma_\rho^{-1})^{\otimes n}, \tag{2.48}$$

or equivalently

$$R^{(m,\mu)}_{n-1,1\,\rho}:=\sigma_\rho\circ R^{(\rho^{-1}m,\mu)}_{n-1,1}\circ(\sigma_\rho^{-1})^{\otimes n}=\Bigl(\frac{m}{\rho\mu}\Bigr)^{-\Gamma^{(m)}}\circ R^{(m)}_{n-1,1\,\rho}\circ\Bigl(\Bigl(\frac{m}{\rho\mu}\Bigr)^{\Gamma^{(m)}}\Bigr)^{\otimes n} \tag{2.49}$$

(where (2.45) is used), has polynomial behavior in log ρ. We will see that this condition, together with the smoothness requirement, imposes non-trivial
restrictions also in the massive case. As mentioned above, in even dimensions, the smoothness condition requires the transition to an equivalent $\star$-product which depends on an additional mass parameter $\mu$.

It is obvious how the remaining conditions on the interacting fields (Unitarity, Covariance and Field equation) read in terms of the retarded products. With regard to the Field equation note that the retarded propagator is the same for the $R^{(m)}$-products and the $R^{(m,\mu)}$-products:
$$R^{(m,\mu)}_{1,1}(\varphi(y), \varphi(x)) = \Delta_m(x-y)\,\Theta(x^0 - y^0) = \Delta^{\mathrm{ret}}_m(x-y) = R^{(m)}_{1,1}(\varphi(y), \varphi(x)) . \tag{2.50}$$
Hence, the Yang–Feldman equation (2.12) has precisely the same form for both kinds of interacting fields.

From Bogoliubov's definition of interacting fields (1.1) it follows that there is a unique correspondence between retarded products and time-ordered products, see e.g. [18]. So the axioms given in this section can equivalently be formulated in terms of time-ordered products. This is done in Appendix E. In addition we give there an explicit formula which describes the time-ordered products in terms of retarded products.

3. Construction of the Retarded Products

Our procedure is based on the strategies developed in [42, 44] and [7].

3.1. Inductive step outside of the total diagonal

If the retarded products with fewer than $n$ factors are given, we can define the distribution $J_{n-2,2}$. But, by the GLZ relation, Symmetry and Causality, $R_{n-1,1}$ is already fixed by $J_{n-2,2}$ outside of the total diagonal $\Delta_n = \{(x_1, \ldots, x_n) \in \mathbb{M}^n \mid x_1 = \cdots = x_n\}$. Namely, if not all points coincide, we may separate them into two non-empty sets which are in the past and the future, respectively, of a suitable Cauchy surface. If the last argument $x_n$ is in the past, then $R_{n-1,1}(x_1, \ldots, x_n)$ vanishes due to the support properties of retarded products. If, on the other hand, $x_n$ is in the future and $x_k$ for some $k \neq n$ is in the past, then the retarded product vanishes if the arguments $x_k$ and $x_n$ are permuted, hence in this case we find
$$R_{n-1,1}(x_1, \ldots, x_{n-1}, x_n) = J_{n-2,2}(x_1, \ldots \hat{k} \ldots, x_{n-1}, x_k, x_n) \quad \text{if} \quad x_k \notin x_n + \bar{V}_+ . \tag{3.1}$$
Moreover, it is only the totally symmetric part $S_n$ of $R_{n-1,1}$ which is not completely fixed by lower orders,
$$S_n(x_1, \ldots, x_n) \overset{\mathrm{def}}{=} \frac{1}{n} \sum_{k=1}^{n} R_{n-1,1}(x_{k+1}, \ldots, x_n, x_1, \ldots, x_k) . \tag{3.2}$$
This follows again from the GLZ relation which yields
$$R_{n-1,1}(x_1, \ldots, x_n) = S_n(x_1, \ldots, x_n) + \frac{1}{n} \sum_{k=1}^{n-1} J_{n-2,2}(x_{k+1}, \ldots, x_{n-1}, x_1, \ldots, x_k, x_n) .$$

We now want to prove the existence of $R_{n-1,1}$. In this subsection we give the first step: we check that the above findings define a distribution on the complement of the total diagonal which we denote by $R^\circ_{n-1,1}$ (footnote k). For this purpose, in view of (3.1) and the sheaf theorem on distributions, it is sufficient to check that in case two of the first $n-1$ arguments, say $x, y$, are different from the last argument $z$ and both in the past of $z$, then
$$J_{n-2,2}(\ldots, x, y, z) = J_{n-2,2}(\ldots, y, x, z) .$$
But by the Jacobi identity (2.29), the difference is equal to $J_{n-2,2}(\ldots, z, x, y)$ which vanishes since $z$ is neither in the past of $x$ nor $y$. So, $R^\circ_{n-1,1}$ is well defined by (3.1) and fulfils the axioms Symmetry and Causality by construction.

In the next step we check that $R^\circ_{n-1,1}$ satisfies all other axioms. The only non-trivial points are the GLZ relation and the Scaling. Concerning the former we have to prove that
$$R^\circ_{n-1,1}(\ldots, x, y, z) - R^\circ_{n-1,1}(\ldots, x, z, y) = J_{n-2,2}(\ldots, x, y, z)$$
holds whenever the points $x, y, z$ are not identical. If $y \neq z$ then $y \notin z + \bar{V}_+$ or $y \notin z + \bar{V}_-$. In these cases the assertion follows from the construction (3.1) of $R^\circ_{n-1,1}$. So it remains to treat the case $x \neq y \wedge x \neq z$, i.e. when $x \notin (y + \bar{V}_\epsilon) \cup (z + \bar{V}_{\epsilon'})$ with $\epsilon, \epsilon' \in \{+, -\}$. In the case $\epsilon = \epsilon' = -$ all terms in the GLZ relation vanish (by construction of $R^\circ_{n-1,1}$ and due to the support properties of $J$). In the case $\epsilon = -$ and $\epsilon' = +$ we analogously find $R^\circ(\ldots, x, z, y) = 0$ and $J(\ldots, z, x, y) = 0$. So, by the Jacobi identity the assertion becomes $R^\circ(\ldots, x, y, z) = J(\ldots, y, x, z)$ which is the construction (3.1) of $R^\circ_{n-1,1}$. The case $\epsilon = +$ and $\epsilon' = -$ is analogous. Finally, for $\epsilon = \epsilon' = +$ we apply again the Jacobi identity to the right side and find two terms which are just by (3.1) the retarded products on the left side.

We turn to the scaling. We show that $\sigma_\rho \circ J^{(\rho^{-1}m)}_{n-2,2} \circ (\sigma_\rho^{-1})^{\otimes n}$ has polynomial behavior in $\log\rho$. This then implies the same property for $R^{\circ(m)}_{n-1,1}$, since the causal relations are scale invariant. (And from that it follows that $R^{\circ(m,\mu)}_{n-1,1}$ also scales almost homogeneously, analogously to (2.49).) By definition, $J_{n-2,2}$ is a sum of $\star$-products of retarded products $R_{k-1,1}$ and $R_{n-k-1,1}$, $k = 1, \ldots, n-1$, which scale by assumption with a polynomial behavior in $\log\rho$. The statement now follows from the fact that $\sigma_\rho$ is a $\star$-algebra isomorphism from $\mathcal{A}^{(\rho^{-1}m)}$ to $\mathcal{A}^{(m)}$.

Footnote k: The construction of $R^\circ_{n-1,1}$ (from the retarded products of lower orders) can be done for $R^{\circ(m)}_{n-1,1}$ as well as for $R^{\circ(m,\mu)}_{n-1,1}$, the results are related by (2.37).
3.2. Extension to the total diagonal; the Action Ward Identity

We now come to the main step in renormalization, namely the extension of the symmetric part $S^\circ_n$ of $R^\circ_{n-1,1}$ (3.2) to a distribution on $\mathbb{M}^n$. For the construction of $R^\circ_{n-1,1}$ the normalization conditions (i.e. the axioms Action Ward Identity, Covariance, Field Independence, Unitarity, Field equation, Smoothness in $m \geq 0$ and Scaling) have not been needed, but they give guidance on how to do the extension of $S^\circ_n$ and reduce the non-uniqueness drastically. In particular the expansion (2.35) and (2.36) (i.e. the axiom Field Independence) and Covariance for translations simplify the problem to the extension of the symmetric part $s^\circ_n(A_1, \ldots)(x_1 - x_n, \ldots)$ of the distribution (footnote l) $r^\circ_{n-1,1}(A_1, \ldots)(x_1 - x_n, \ldots)$ from $\mathcal{D}(\mathbb{R}^{d(n-1)} \setminus \{0\})$ to $\mathcal{D}(\mathbb{R}^{d(n-1)})$. This is the crucial problem of perturbative renormalization, since it is this step which is non-unique and which is the source of anomalies. Since the Smoothness axiom applies for the $R^{(m,\mu)}$-products, but not for the $R^{(m)}$-products, the extension is done for $S_n^{\circ(m,\mu)}$. From the resulting $S_n^{(m,\mu)}$, the extension $S_n^{(m)}$ of $S_n^{\circ(m)}$ is obtained by (2.37).

The basic idea to fulfill the Action Ward Identity goes as follows: since $\partial^\mu_{x_l} s_n(\ldots, A_l, \ldots)$ is an extension of $\partial^\mu_{x_l} s^\circ_n(\ldots, A_l, \ldots) = s^\circ_n(\ldots, \partial^\mu A_l, \ldots)$ we may define $s_n(\ldots, \partial^\mu A_l, \ldots) \overset{\mathrm{def}}{=} \partial^\mu_{x_l} s_n(\ldots, A_l, \ldots)$, provided $s_n(\ldots, A_l, \ldots)$ was already constructed. We are now going to show that this can be done without running into inconsistencies. Namely, the fields in $\mathcal{P}$ are of the form
$$A(x) = \sum_{n=0}^{N} p_n(\partial^1, \ldots, \partial^n)\,\varphi(x_1) \cdots \varphi(x_n)\big|_{x_1 = \cdots = x_n = x}$$
with polynomials $p_n$ in the derivatives $\partial^k_\mu = \frac{\partial}{\partial x_k^\mu}$, $k = 1, \ldots, n$, $\mu = 0, \ldots, d-1$, which are symmetric in the upper indices $k = 1, \ldots, n$. The polynomials $p_n$ are uniquely determined by $A$.

Now let $\partial_\mu = \sum_{k=1}^{n} \partial^k_\mu$ denote the derivatives with respect to the center of mass coordinates and $\partial^{ij}_\mu = \partial^i_\mu - \partial^j_\mu$, $1 \leq i < j \leq n$, the relative derivatives. The crucial observation is now that the vector space $\mathcal{P}_n$ of all symmetric polynomials $p_n$ is isomorphic to the tensor product of the space $\mathcal{P}^{\mathrm{com}}$ of polynomials $p(\partial)$ of the center of mass derivatives and the space $\mathcal{P}_n^{\mathrm{rel}}$ of symmetric polynomials $p_n(\partial^{ij}, 1 \leq i < j \leq n)$ of the relative derivatives. (Symmetry is meant with respect to permutations $\sigma(\partial^{ij}) := \partial^{\sigma(i)\sigma(j)}$, $\sigma \in S_n$, where $\partial^{ij} \equiv -\partial^{ji}$.) The argument is straightforward for the unsymmetrized polynomials. Namely, the independent variables $\partial^{in}$, $i = 1, \ldots, n-1$, generate a polynomial algebra $\tilde{\mathcal{P}}_n^{\mathrm{rel}}$, and the linear map
$$\alpha : \begin{cases} \mathcal{P}^{\mathrm{com}} \otimes \tilde{\mathcal{P}}_n^{\mathrm{rel}} \to \bigvee \{\partial^1, \ldots, \partial^n\} \\ \alpha(\partial \otimes 1) = \sum_{i=1}^{n} \partial^i , \quad \alpha(1 \otimes \partial^{in}) = \partial^i - \partial^n \end{cases} \tag{3.3}$$
is an isomorphism onto the polynomial algebra generated by $\partial^i$, $i = 1, \ldots, n$.

Footnote l: Using translation invariance of $\omega_0$ we denote $\omega_0(H(A_1(x_1), \ldots, A_n(x_n)))$ by $h(A_1, \ldots, A_n)(x_1 - x_n, \ldots)$ for $H = R_{n-1,1}, R^\circ_{n-1,1}, S_n, S^\circ_n$ etc.
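This isomorphism intertwines the actions of the permutation group which are induced by the permutation of indices (on $\tilde{\mathcal{P}}_n^{\mathrm{rel}}$ the action is given by $\sigma(\partial^{in}) = \partial^{\sigma(i)n} - \partial^{\sigma(n)n}$ with $\partial^{nn} = 0$) and thus restricts to an isomorphism of the invariant subspaces. Interpreting $\mathcal{P}_n^{\mathrm{rel}}$ as a subspace of $\tilde{\mathcal{P}}_n^{\mathrm{rel}}$ (by the obvious identification $\partial^{ij} \equiv \partial^{in} - \partial^{jn}$), the invariant subspace of $\mathcal{P}^{\mathrm{com}} \otimes \tilde{\mathcal{P}}_n^{\mathrm{rel}}$ is just $\mathcal{P}^{\mathrm{com}} \otimes \mathcal{P}_n^{\mathrm{rel}}$. We therefore use the space of balanced derivatives of fields [10]
$$\mathcal{P}_{\mathrm{bal}} \overset{\mathrm{def}}{=} \big\{ p_n(\partial^{ij}, 1 \leq i < j \leq n)\,\varphi(x_1) \cdots \varphi(x_n)\big|_{x_1 = \cdots = x_n = x} \;\big|\; p_n \in \mathcal{P}_n^{\mathrm{rel}},\ n \in \mathbb{N} \big\} \tag{3.4}$$
and obtain the isomorphism of vector spaces
$$\mathcal{P} \cong \mathcal{P}^{\mathrm{com}}(\partial) \otimes \mathcal{P}_{\mathrm{bal}} .$$
In other words, every $A \in \mathcal{P}$ can uniquely be written as
$$A = \sum_{i=1}^{N} p_i(\partial) B_i , \qquad p_i(\partial) \in \mathcal{P}^{\mathrm{com}}(\partial) , \quad B_i \in \mathcal{P}_{\mathrm{bal}} , \quad N < \infty .$$
Applying this result to (2.8) we obtain

Proposition 3.1. Let $F$ be a local functional. Then there exists a unique $h \in \mathcal{D}(\mathbb{M}, \mathcal{P}_{\mathrm{bal}})$ with $F = \int dx\, h(x)$.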
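As an illustration of the decomposition into center of mass derivatives acting on balanced fields, consider the simplest case $n = 2$. The particular fields below are chosen here only as an example and are not taken from the text. With $\partial^1 = \frac{1}{2}(\partial + \partial^{12})$ and $\partial^2 = \frac{1}{2}(\partial - \partial^{12})$, a direct computation with the Leibniz rule gives:

```latex
\begin{align*}
\varphi\,\partial_\mu\varphi
  &= \tfrac{1}{2}\,\partial_\mu(\varphi^2) , \\
\partial_\mu\varphi\,\partial_\nu\varphi
  &= \tfrac{1}{4}\,\partial_\mu\partial_\nu(\varphi^2)
   - \tfrac{1}{4}\,B_{\mu\nu} , \\
\varphi\,\partial_\mu\partial_\nu\varphi
  &= \tfrac{1}{4}\,\partial_\mu\partial_\nu(\varphi^2)
   + \tfrac{1}{4}\,B_{\mu\nu} , \\
B_{\mu\nu}(x)
  &:= \big(\partial^{12}_\mu\partial^{12}_\nu\,\varphi(x_1)\varphi(x_2)\big)\big|_{x_1=x_2=x}
   = 2\,\varphi\,\partial_\mu\partial_\nu\varphi
   - 2\,\partial_\mu\varphi\,\partial_\nu\varphi .
\end{align*}
```

Here $\varphi^2$ and $B_{\mu\nu}$ are balanced fields, and the prefactors $\partial_\mu$, $\partial_\mu\partial_\nu$ are center of mass derivatives. Once an extension has been fixed for these two balanced fields, the Action Ward Identity determines it for the three fields on the left.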
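The Action Ward Identity can now simply be fulfilled by performing the extension first only for balanced fields $B \in \mathcal{P}_{\mathrm{bal}}$ and by using the AWI and linearity for the definition of the extension for general fields $A \in \mathcal{P}$. By construction this yields, in every entry, a linear map from $\mathcal{P}^{\mathrm{com}}(\partial) \otimes \mathcal{P}_{\mathrm{bal}} \cong \mathcal{P}$ to distributions on $\mathbb{M}$.

Next we recall from the literature the main statements about the extension of a distribution $t^\circ \in \mathcal{D}'(\mathbb{R}^k \setminus \{0\})$ to $t \in \mathcal{D}'(\mathbb{R}^k)$ and give some completions. The existence and uniqueness of $t$ can be answered in terms of Steinmann's scaling degree [42] of $t^\circ$. The latter is defined by
$$\mathrm{sd}(f) \overset{\mathrm{def}}{=} \inf\big\{ \delta \in \mathbb{R} \;\big|\; \lim_{\rho \downarrow 0} \rho^\delta f(\rho x) = 0 \big\} , \qquad f \in \mathcal{D}'(\mathbb{R}^k) \ \text{or} \ f \in \mathcal{D}'(\mathbb{R}^k \setminus \{0\}) . \tag{3.5}$$
We immediately see that a distribution which scales almost homogeneously with degree $D$ and an arbitrary power $N < \infty$ has scaling degree $D$. In addition one easily verifies
$$\mathrm{sd}(\partial^a f) \leq \mathrm{sd}(f) + |a| , \qquad \mathrm{sd}(x^b f) \leq \mathrm{sd}(f) - |b| \qquad \text{and} \qquad \mathrm{sd}(\partial^a \delta^{(k)}) = k + |a| , \tag{3.6}$$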
where $a, b \in \mathbb{N}_0^k$, $|a| \equiv \sum_{j=1}^{k} a_j$. Returning to the extension of $t^\circ$, the definition (3.5) implies immediately: $\mathrm{sd}(t) \geq \mathrm{sd}(t^\circ)$. We are looking for extensions which do not increase the scaling degree (footnote m).

Theorem 3.2 ([7]). Let $t^\circ \in \mathcal{D}'(\mathbb{R}^k \setminus \{0\})$.
(a) If $\mathrm{sd}(t^\circ) < k$, there exists a unique extension $t \in \mathcal{D}'(\mathbb{R}^k)$ with $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$.
(b) If $k \leq \mathrm{sd}(t^\circ) < \infty$ there exist several extensions $t \in \mathcal{D}'(\mathbb{R}^k)$ with $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$. Given a particular solution $t_0$, the most general solution reads
$$t = t_0 + \sum_{|a| \leq \mathrm{sd}(t^\circ) - k} C_a\, \partial^a \delta^{(k)} \tag{3.8}$$
with arbitrary constants $C_a \in \mathbb{C}$.
(c) If $\mathrm{sd}(t^\circ) = \infty$ there exists no extension $t \in \mathcal{D}'(\mathbb{R}^k)$.

Case (c) is mentioned for mathematical completeness only. It does not appear in our construction, because $r^\circ_{n+1,1}$ scales almost homogeneously with finite degree. However, there are distributions $t^\circ$ with $\mathrm{sd}(t^\circ) = \infty$, e.g. $t^\circ(x) = e^{1/|x|}$.

The proof of (a) and (b) given in [7] is based on [18] and it is constructive ("W-extension"). It is sketched in Appendix B. The W-extension has the disadvantage that in general it does not maintain $\mathcal{L}_+^\uparrow$-covariance for $\mathrm{sd}(t^\circ) > k$. However, by a finite renormalization (which does not increase the scaling degree) Lorentz covariance can be restored (see [18, 42, 44, 40, 4] and also Appendix D). The W-extension breaks in general also the property of almost homogeneous scaling. However, Hollands and Wald have given a (finite) renormalization prescription to restore also this symmetry, in detail (footnote n):

Proposition 3.3. Let $t^\circ \in \mathcal{D}'(\mathbb{R}^k \setminus \{0\})$ scale almost homogeneously with degree $D \in \mathbb{R}$ and power $N \in \mathbb{N}$ under coordinate rescalings (2.38). Then $t^\circ$ has an extension $t$ to a distribution on $\mathbb{R}^k$ which also scales almost homogeneously with degree $D$ and with power $N$, if $D \notin \mathbb{N}_0 + k$, and with power $(N+1)$ or $N$, if $D \in \mathbb{N}_0 + k$. The extension is unique if $D \notin \mathbb{N}_0 + k$, otherwise, the most general solution reads: (particular solution) $+ \sum_{|a| = D - k} C_a\, \partial^a \delta^{(k)}$, where the $C_a$'s are arbitrary constants.
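To get a feeling for how much freedom (3.8) and Proposition 3.3 leave, one can count the multi-indices $a \in \mathbb{N}_0^k$ that occur. The following is a minimal sketch; the function names are hypothetical and chosen here for illustration. It counts the constants $C_a$ before any further normalization condition (Lorentz covariance, Symmetry, Unitarity) is imposed, so it gives only an upper bound on the number of free parameters that survive. It assumes $k \geq 1$.

```python
from itertools import product
from math import comb


def count_multi_indices(k: int, omega: int, exact: bool = False) -> int:
    """Count multi-indices a in N_0^k (k >= 1) with |a| <= omega.

    With exact=True only |a| == omega is counted, which is the case
    relevant for almost homogeneous scaling (Proposition 3.3).
    """
    if omega < 0:
        return 0  # no delta-terms are admitted: the extension is unique
    if exact:
        return comb(omega + k - 1, k - 1)
    return comb(omega + k, k)


def brute_force(k: int, omega: int, exact: bool = False) -> int:
    """Direct enumeration, used only to cross-check the closed formulas."""
    total = 0
    for a in product(range(omega + 1), repeat=k):
        s = sum(a)
        if (s == omega) if exact else (s <= omega):
            total += 1
    return total


# Example: k = 4 and sd(t°) = 6, i.e. omega = sd(t°) - k = 2.
print(count_multi_indices(4, 2))              # constants C_a with |a| <= 2
print(count_multi_indices(4, 2, exact=True))  # constants C_a with |a| == 2
```

For $k = 4$ and $\omega = 2$ this gives 15 constants for $|a| \leq 2$ and 10 for $|a| = 2$; the additional symmetry requirements discussed in the text reduce these numbers further.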
Footnote m: In a large part of the literature (e.g. [3, 18, 40]) our axioms "Smoothness in $m \geq 0$" and "Scaling" are replaced by the weaker requirement
$$\mathrm{sd}\big(r_{n-1,1}(A_1, \ldots, A_n)(x_1 - x_n, \ldots)\big) \leq \sum_{j=1}^{n} \dim(A_j) , \qquad \forall A_j \in \mathcal{P}_{\mathrm{hom}} , \tag{3.7}$$
or an analogous normalization condition. For the extension problem this amounts (nearly always) to the requirement $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$. We point out that (3.7) is a condition on the behavior under rescalings of the coordinates in the UV-region only. In contrast, our Scaling axiom is with respect to simultaneous rescalings of the coordinates and the mass, and it must hold for all $x$ and for all $m \geq 0$.

Footnote n: This is a version of Lemma 4.1 in the second paper of [31], which follows from the proof given in that paper. The "improved Epstein–Glaser renormalization" of [24] maintains the almost homogeneous scaling directly.
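Now we assume that $t^\circ \equiv t^{(m)\circ} \in \mathcal{D}'(\mathbb{R}^{dr} \setminus \{0\})$ is smooth in $m \geq 0$ and scales almost homogeneously with degree $D$ and power $N$ under simultaneous rescalings of the coordinates and the mass $m$:
$$(\rho \partial_\rho)^{N+1} \big( \rho^D\, t^{(\rho^{-1}m)\circ}(\rho y) \big) = 0 . \tag{3.9}$$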
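Condition (3.9) says that, for fixed $y$, the rescaled distribution is a polynomial in $\log\rho$ of degree at most $N$, because $\rho\partial_\rho$ acts as $d/d(\log\rho)$. The following sketch makes this concrete for a toy example at $m = 0$ which is not taken from the text: $t(y) = \log(\mu^2 y^2)^n / (y^2)^2$ with degree $D = 4$, where $y^2 > 0$ is treated as a plain positive number (a Euclidean stand-in that ignores the $i0$ prescription). All function names are hypothetical.

```python
import math


def rho_d_rho(coeffs):
    """Apply rho * d/d(rho) to p(rho) = sum_j coeffs[j] * (log rho)**j."""
    return [j * coeffs[j] for j in range(1, len(coeffs))]


def scaling_power(coeffs, tol=1e-12):
    """Smallest N >= 0 such that (rho d/d rho)^(N+1) annihilates the polynomial."""
    current = list(coeffs)
    applications = 0
    while any(abs(c) > tol for c in current):
        current = rho_d_rho(current)
        applications += 1
    return max(applications - 1, 0)


def log_example_coeffs(y2, mu2, n):
    """Coefficients in log(rho) of rho**4 * t(rho * y) for the toy
    t(y) = log(mu2 * y2)**n / y2**2, using log(mu2 * rho**2 * y2) = c + 2 log(rho)."""
    c = math.log(mu2 * y2)
    return [math.comb(n, j) * c ** (n - j) * 2 ** j / y2 ** 2 for j in range(n + 1)]


def evaluate(coeffs, rho):
    """Evaluate the polynomial in log(rho) at a given rho > 0."""
    log_rho = math.log(rho)
    return sum(cj * log_rho ** j for j, cj in enumerate(coeffs))


coeffs = log_example_coeffs(y2=2.0, mu2=1.0, n=2)
print(scaling_power(coeffs))   # power N of the toy example with n = 2
print(scaling_power([1.0]))    # a homogeneously scaling term has power 0
```

In this toy model each factor of $\log(\mu^2 y^2)$ raises the power by one, which matches the statement in the setting-sun example later in this section that the power is the power of $\log(\mu^2 \cdots)$.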
• For $m = 0$ the scaling of $m$ is trivial and the proposition can directly be applied.
• For $m > 0$ the Smoothness in $m \geq 0$ of $t^\circ \equiv t^{(m)\circ}$ ensures the existence of the Taylor expansion (footnote o)
$$t^{(m)\circ}(y) = \sum_{l=0}^{D-dr} \frac{m^l}{l!}\, u^\circ_l(y) + m^{[D]-dr+1}\, t^{(m)\circ}_{\mathrm{red}}(y) \tag{3.10}$$
(where $[D]$ is the integer part of $D$) with
$$u^\circ_l(y) \overset{\mathrm{def}}{=} \frac{\partial^l t^{(m)\circ}(y)}{\partial m^l}\bigg|_{m=0} , \tag{3.11}$$
where the "reduced part" $t^{(m)\circ}_{\mathrm{red}}$ is smooth in $m \geq 0$. The almost homogeneous scaling of $t^{(m)\circ}$ (3.9) and the definition of $u^\circ_l$ imply
$$(\rho \partial_\rho)^{N+1} \big( \rho^{D-l}\, u^\circ_l(\rho y) \big) = 0 . \tag{3.12}$$
Hence, by the proposition, $u^\circ_l$ has an extension $u_l$ with $(\rho \partial_\rho)^{N+2} \big( \rho^{D-l}\, u_l(\rho y) \big) = 0$. For the reduced part we find that $m^{[D]-dr+1}\, t^{(m)\circ}_{\mathrm{red}}$ scales almost homogeneously with degree $D$, because all other terms in (3.10) have this property. This gives
$$\rho^{D-[D]+dr-1}\, t^{(m)\circ}_{\mathrm{red}}(\rho y) = t^{(\rho m)\circ}_{\mathrm{red}}(y) + \sum_{j=1}^{N} (\log\rho)^j\, l_j^{(\rho m)}(y) \tag{3.13}$$
for some $l_j^{(m)} \in \mathcal{D}'(\mathbb{R}^{dr} \setminus \{0\})$ which are smooth in $m \geq 0$. Since also $t^{(m)\circ}_{\mathrm{red}}$ is smooth in $m \geq 0$, we conclude
$$\lim_{\rho \to 0} \rho^{dr}\, t^{(m)\circ}_{\mathrm{red}}(\rho y) = 0 , \qquad \text{i.e.} \qquad \mathrm{sd}\big(t^{(m)\circ}_{\mathrm{red}}\big) < dr . \tag{3.14}$$
With that and with part (a) of Theorem 3.2 the reduced part $t^{(m)\circ}_{\mathrm{red}}$ can be uniquely extended. Due to the latter, the resulting $t^{(m)}_{\mathrm{red}}$ is also smooth in $m \geq 0$ and also the scaling property (3.13) is maintained. Namely, $(\rho \partial_\rho)^{N+1} \rho^{D-[D]+dr-1}\, t^{(\rho^{-1}m)}_{\mathrm{red}}(\rho y)$ has support in $\{0\}$ and its scaling degree is less than $dr$ for each fixed $\rho > 0$. Putting together the extensions of the individual terms we get
$$t^{(m)}(y) \overset{\mathrm{def}}{=} \sum_{l=0}^{D-dr} \frac{m^l}{l!}\, u_l(y) + m^{[D]-dr+1}\, t^{(m)}_{\mathrm{red}}(y) . \tag{3.15}$$

Footnote o: This is the "scaling expansion" of Hollands and Wald (given in the second paper of [31]) in the particularly simple case of Minkowski space.
By construction this is an extension of $t^{(m)\circ}$ with the wanted smoothness and scaling properties. It is unique for $D \notin \mathbb{N}_0 + dr$. For $D \in \mathbb{N}_0 + dr$ the most general solution is obtained by adding
$$\sum_{l=0}^{D-dr} m^l \sum_{|a| = D-dr-l} C_{l,a}\, \partial^a \delta^{(dr)} \tag{3.16}$$
to a particular solution, with arbitrary constants $C_{l,a}$. Note that the undetermined polynomial (3.16) scales even homogeneously (with degree $D$). If we would require almost homogeneous scaling only (and not smoothness in $m \geq 0$), terms with $l < 0$ would be admitted in (3.16). An extension with such terms increases the scaling degree: $\mathrm{sd}(t) > \mathrm{sd}(t^\circ)$, cf. footnote m.

If $t^{(m)\circ}$ (3.9) is additionally Lorentz invariant, a slight modification of the usual proofs of Lorentz invariance [18, 42, 44, 40, 4] yields that $t^{(m)}$ (3.15) can be chosen to be also Lorentz invariant (Appendix D). The conditions Unitarity and Symmetry can easily be included, too (see e.g. [18]).

With this general knowledge about the extension of a distribution to a point we return to the extension of $s_n^{(m,\mu)\circ}$. For $A_1, \ldots, A_n \in \mathcal{P}_{\mathrm{bal}} \cap \mathcal{P}_{\mathrm{hom}}$ the distribution $s_n^{(m,\mu)\circ}(A_1, \ldots, A_n)$ fulfills (3.9) with degree $D = \sum_j \dim(A_j)$ and some power $N < \infty$, and we can proceed as follows:

Step 1. We first extend the distributions $s_n^{(m,\mu)\circ}$ for homogeneous (2.42) balanced fields only, by applying the above given procedure, including the finite renormalizations which restore Lorentz covariance, almost homogeneous scaling (3.9), Symmetry, Unitarity and maintain Smoothness in $m \geq 0$. Furthermore, the global inner symmetries (in our case the field parity (2.10)) can be preserved, cf. Appendix D.

Step 2. From that we construct $s_n^{(m,\mu)}$ for all fields by using linearity and the Action Ward Identity.

Step 3. Finally we construct $S_n^{(m,\mu)}$ by means of the Taylor expansion (2.35) and (2.36),
$$S_n^{(m,\mu)}(A_1(x_1), \ldots, A_n(x_n)) \overset{\mathrm{def}}{=} \sum_{l_1, \ldots} \frac{1}{l_1! \cdots} \int dx_{11} \cdots dx_{1 l_1} \cdots \; s_n^{(m,\mu)}\!\left( \frac{\delta^{l_1} A_1(x_1)}{\delta\varphi(x_{11}) \cdots \delta\varphi(x_{1 l_1})} , \ldots \right) \varphi(x_{11}) \cdots \varphi(x_{1 l_1}) \cdots . \tag{3.17}$$

By construction $S_n^{(m,\mu)}$ is linear in the fields and fulfils the axioms Covariance with respect to translations and Field Independence. The properties of $s_n^{(m,\mu)}$ established in Steps 1 and 2 imply that $S_n^{(m,\mu)}$ fulfils the corresponding axioms. In particular, by using $\partial_y \frac{\delta A(y)}{\delta\varphi(x)} = \frac{\delta\, \partial A(y)}{\delta\varphi(x)}$, we see from (3.17) that $S_n^{(m,\mu)}$ satisfies the AWI.

From a particular solution, the general solution is obtained by adding an arbitrary local polynomial of the form (3.16) to $s_n^{(m,\mu)}(A_1, \ldots, A_n)$ which respects also
linearity in the fields, Symmetry, Unitarity and the AWI. $R^{(m)}_{n-1,1}$ is obtained from $R^{(m,\mu)}_{n-1,1}$ by (2.37).

The construction given so far yields the most general solution $R^{(m,\mu)}$ and $R^{(m)}$ of the axioms of Sec. 2 except the Field equation (2.12). (Since the latter has precisely the same form for both kinds of retarded products the following procedure applies to both kinds and the results are related by (2.37).) Due to the expansion (2.35) and (2.36) the Field equation is equivalent to
$$r_{n-1,1}(F_1, \ldots, F_{n-1}, \varphi(h)) = -\int dx\, h(x) \int dy\, \Delta^{\mathrm{ret}}(x-y) \sum_{k=1}^{n-1} r_{n-2,1}\!\left( F_1, \ldots \hat{k} \ldots, F_{n-1}, \frac{\delta F_k}{\delta\varphi(y)} \right) \tag{3.18}$$
for all $n \geq 2$, $F_1, \ldots, F_{n-1} \in \mathcal{F}_{\mathrm{loc}}$, and $h \in \mathcal{D}(\mathbb{M})$. The right side gives an extension of $r^\circ_{n-1,1}(F_1, \ldots, F_{n-1}, \varphi(h))$, because the Field equation holds outside the total diagonal. It is Lorentz covariant, symmetric in the first $(n-1)$ factors, unitary, smooth in $m \geq 0$, scales almost homogeneously (even with power $\leq (n-2)$) and respects the AWI. From (3.18) and the inductively known $\omega_0(J_{n-2,2}(F_1, \ldots, F_{n-1}, \varphi(h)))$ we obtain $r_{n-1,1}(F_1, \ldots, \varphi(h), \ldots, F_{n-1})$ by using the GLZ relation and the Symmetry in the first $(n-1)$ factors (footnote p). By construction this yields an extension of $r^\circ_{n-1,1}(F_1, \ldots, \varphi(h), \ldots, F_{n-1})$ which also satisfies all axioms. With that $s_n(\ldots, \varphi(h))$ (3.2) is uniquely determined in terms of the inductively known $r_{n-2,1}$.

So, in order to fulfill the Field equation we modify Step 1 as follows: $s_n(A_1, \ldots, A_{n-1}, \varphi)$, $A_1, \ldots, A_{n-1} \in \mathcal{P}_{\mathrm{bal}}$, is uniquely given by the Field equation in the just described way and fulfils the required properties. However, the construction of $s_n(A_1, \ldots, A_n)$ remains unchanged if $A_1, \ldots, A_n$ are all of at least second order in $\varphi$ and its partial derivatives. (If at least one factor is a C-number the retarded product vanishes and hence also $s_n$, see [12, Lemma 2.3, part (C)].) Finally Steps 2 and 3 are done as before. Summing up we have proved:

Theorem 3.4. There exist retarded products which fulfil all axioms of Sec. 2.

Example (Setting-sun diagram $r^{(m,\mu)}_{1,1}(\varphi^3, \varphi^3)$ for $d = 4$). The explicit calculation of a diagram usually requires somewhat less work if the extension is done directly for $r^\circ$ (and not for its symmetric part $s^\circ$). By using the GLZ relation and Causality we obtain
$$r^{\circ(m,\mu)}(y) \equiv r^{\circ(m,\mu)}_{1,1}(\varphi^3, \varphi^3)(y) = -6i \big( H^\mu_m(y)^3 - H^\mu_m(-y)^3 \big)\, \Theta(-y^0) . \tag{3.19}$$
From (A.8) we can read off the first terms of the Taylor expansion in $m^2$ of $H^\mu_m(y)$:
$$H^\mu_m(y) = D^+(y) + m^2 \Big( \log\big(-\mu^2 (y^2 - i y^0 0)\big)\, f(0) + F(0) \Big) + h_m(y) , \tag{3.20}$$

Footnote p: The result is given in [12, Lemma 2.3, part (B)].
where $D^+(y) \equiv -\big(4\pi^2 (y^2 - i y^0 0)\big)^{-1}$ is the massless two-point function and $h_m(y)$ is of order $O(m^4)$ and has scaling degree $\mathrm{sd}(h_m) < 0$. With that we get the scaling expansion (3.10) of (3.19):
$$r^{\circ(m,\mu)}(y) = u^\circ_0(y) + \frac{m^2}{2}\, u^{\circ(\mu)}_2(y) + r^{\circ(m,\mu)}_{\mathrm{red}}(y) , \tag{3.21}$$
where
$$u^\circ_0(y) = -6i \big( D^+(y)^3 - D^+(-y)^3 \big)\, \Theta(-y^0) ,$$
$$u^{\circ(\mu)}_2(y) = -36i \Big( D^+(y)^2 \big( \log(-\mu^2 (y^2 - i y^0 0))\, f(0) + F(0) \big) - (y \to -y) \Big)\, \Theta(-y^0) . \tag{3.22}$$
The power of the almost homogeneous scaling (3.9) (with degree 6) is the power of $\log(\mu^2 \cdots)$. It is different for the individual terms: it is 0 for $u^\circ_0$, 1 for $u^\circ_2$ and 3 for $r^{\circ(m,\mu)}_{\mathrm{red}}$ respectively. In contrast to the reduced part $r^{\circ(m,\mu)}_{\mathrm{red}}$, the renormalization of $u^\circ_0$ and $u^{\circ(\mu)}_2$ is non-trivial and it increases the power of the almost homogeneous scaling by 1. The extension of these two terms is given in Appendix B by using differential renormalization. An alternative method, which relies on the Källén–Lehmann representation, is applied to the massless fish and setting-sun diagram in Appendix C.

We now focus on the power $N$ of the almost homogeneous scaling (3.9). The preceding example shows that, in the scaling expansion of $r^{\circ(m,\mu)}$ (or $s^{\circ(m,\mu)}$), the terms for which $N$ may be increased in the extension, are not the terms with the maximal value of $N$. The proof of part (ii) of the following proposition is based on this observation.

Proposition 3.5. (i) If the number $d$ of spacetime dimensions is odd, the power $N$ of the almost homogeneous scaling of $R^{(m)}_{n-1,1} \equiv R^{(m,\mu)}_{n-1,1}$ is smaller than $n$, i.e. $R^{(m)}_{n-1,1\,\rho}$ (2.48) is a polynomial of $(\log\rho)$ with degree less than $n$.
(ii) For $d = 4$ the power $N$ of the almost homogeneous scaling (3.9) of
$$r^{(m,\mu)}_{n-1,1}(A_1, \ldots, A_n) , \qquad \text{with} \qquad A_j = \prod_{s=1}^{l_j} \partial^{a_{js}} \varphi \tag{3.23}$$
and
$$\sum_{j=1}^{n} \dim(A_j) \leq 4(n-1) + 3 , \tag{3.24}$$
is bounded by
$$N \leq \frac{1}{2} \sum_{j=1}^{n} l_j \quad \left( \leq \frac{1}{2} \sum_{j=1}^{n} \dim(A_j) \right) . \tag{3.25}$$
Note that $\frac{1}{2} \sum_{j=1}^{n} l_j$ is the number of (internal) lines in the corresponding Feynman diagram. Due to the expansion (2.35) and (2.36) and the fact that $\prod_j \partial^{a_j} \varphi(x_{i_j})$ scales homogeneously, part (ii) implies that the power $N$ of the almost homogeneous scaling of $R^{(m,\mu)}_{n-1,1}(A_1, \ldots, A_n)$ (with $A_j$ of the mentioned kind) is also bounded by (3.25). The restriction (3.24) on the $A_j$'s is e.g. satisfied for interacting fields $A_{\mathcal{L}(g)}$ if $\dim(A) \leq 3$ and $\mathcal{L}$ is renormalizable by power counting, cf. Sec. 4.1.

Proof. Part (i): $R_{0,1}(A) = A$ scales homogeneously (2.44). Following our inductive construction, one verifies that an increase of $N$ may happen in the extension $s^\circ_n \to s_n$ only. Hence, by Proposition 3.3, $N$ is increased at most by 1 in each inductive step.

Part (ii): We give the proof for $A_j = \varphi^{l_j}$ only. The generalization to field polynomials with derivatives is straightforward, it gives only notational complications. In our inductive construction of the $R^{(m,\mu)}$-products $N$ may now be increased also in the construction of $j^{(m,\mu)}_{n-2,2}$, since the GLZ relation uses the modified star product $\star_{m,\mu}$. The vacuum expectation value of the GLZ relation reads
$$\omega_0\Big( \big[ R^{(m,\mu)}\big(\varphi^{l_1}(x_1), \ldots, \varphi^{l_k}(x_k)\big) ,\, R^{(m,\mu)}\big(\varphi^{l_{k+1}}(y_1), \ldots, \varphi^{l_n}(y_{n-k})\big) \big]_{\star_{m,\mu}} \Big)$$
$$= \sum \cdots\; r^{(m,\mu)}\big(\varphi^{l_1 - p_1}(x_1), \ldots, \varphi^{l_k - p_k}(x_k)\big) \cdot r^{(m,\mu)}\big(\varphi^{l_{k+1} - p_{k+1}}(y_1), \ldots, \varphi^{l_n - p_n}(y_{n-k})\big) \cdot \left( \prod_{j=1}^{p} H^\mu_m(x_{i_j} - y_{i_j}) - \prod_{j=1}^{p} H^\mu_m(y_{i_j} - x_{i_j}) \right) , \tag{3.26}$$
where $p \equiv \frac{1}{2}(p_1 + \cdots + p_n)$. By using the inductive assumption and the fact that $H^\mu_m$ scales almost homogeneously with power 1 (A.12) (footnote q) we find that for each term (on the right side) $N$ is bounded by
$$\frac{1}{2} \sum_{j=1}^{k} (l_j - p_j) + \frac{1}{2} \sum_{j=k+1}^{n} (l_j - p_j) + p = \frac{1}{2} \sum_{j=1}^{n} l_j . \tag{3.27}$$

Footnote q: An essential ingredient of the generalization to field polynomials with derivatives is that partial derivatives $\partial^a H^\mu_m$ ($a \in \mathbb{N}_0^4$) also scale almost homogeneously with power 1.

We turn to the extension $s^\circ_n \to s_n$. The scaling degree of each term in (3.26) is bounded by
$$\mathrm{sd}(\cdots) \leq \sum_{j=1}^{n} \dim(A_j) \leq 4(n-1) + 3 . \tag{3.28}$$
• If $p = 1$ the extension is trivial and, hence, the power $N$ is not increased in this step.
• If $p \geq 2$ and the scaling degree is $\geq 0$, the extension may increase $N$ by 1. The terms on the right side of (3.26) with the maximal value of $N$ have the property that $-m^2 f(m^2 y^2) \log(m^2/\mu^2)$ is substituted for $H^\mu_m(y)$ (A.11) in all $H^\mu_m$'s. Therefore, the scaling degree of these terms is lowered by $2p \geq 4$, i.e. it is $< 4(n-1)$, and we are in the case of trivial extension. We conclude that in the extension $s_n^{\circ(m,\mu)}(\varphi^{l_1}, \ldots) \to s_n^{(m,\mu)}(\varphi^{l_1}, \ldots)$ the corresponding value of $N$ is not increased, i.e. $N$ is still bounded by (3.27).

In $d = 4 + 2k$ ($k \in \mathbb{N}$) spacetime dimensions analogous bounds on the power $N$ of the almost homogeneous scaling of the $R^{(m,\mu)}$-products can be derived by the same method.

4. Non-Uniqueness

4.1. Counting the indeterminate parameters before the adiabatic limit

In contrast to the literature we count the indeterminate parameters without performing the adiabatic limit.

The interacting fields $A_{\mathcal{L}(g)}(x) = \sum_{n=0}^{\infty} \frac{1}{n!} R_{n,1}\big((\mathcal{L}(g))^{\otimes n}; A(x)\big)$ are left with an indefiniteness coming from the extension of the symmetric part of $r^\circ_{n,1}$ to the origin (in relative coordinates). In general the normalization conditions only restrict this indefiniteness; they do not remove it completely. Let $\mathcal{L} \in \mathcal{P}_{\mathrm{bal}}$ and let $N(\mathcal{L}, A, n)$ be the number of indeterminate parameters (i.e. the constants $C_a$ in (3.8) or (B.6)) in $R_{n,1}\big((\mathcal{L}(g))^{\otimes n}; A(x)\big)$ coming from the inductive step $(n-1, 1) \to (n, 1)$. This number depends on the choice of the normalization conditions. In the following we presume the axioms given in Sec. 2 except Lorentz covariance, Unitarity and Field equation. We will prove

Proposition 4.1. Let $\mathcal{L} \in \mathcal{P}_{\mathrm{bal}}$.
(a) $N(\mathcal{L}, A, n)$ is bounded in $n$ $\forall A \in \mathcal{P}$ fixed, iff $\dim(\mathcal{L}) \leq d$.
(b) For all $A \in \mathcal{P}$ there exists $n(A)$ such that $N(\mathcal{L}, A, n) = 0$ $\forall n > n(A)$, iff $\dim(\mathcal{L}) < d$.

An interaction $\mathcal{L}$ with the property (a) of $n \mapsto N(\mathcal{L}, A, n)$ is called "renormalizable by power counting".
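The dichotomy of Proposition 4.1 can be read off from the bound $\omega \leq \dim(U) + n(\dim(\mathcal{L}) - d)$ of (4.4) in the proof below, since indeterminate constants only occur for $\omega \geq 0$. The following sketch evaluates this bound; the function names are hypothetical, and the last function returns only the crude estimate that follows from (4.4), not necessarily the sharp value of $n(A)$.

```python
import math


def omega(dims_u, dim_u, d):
    """omega(U_1,...,U_n;U) = sum_j dim(U_j) + dim(U) - d*n, cf. (4.4)."""
    return sum(dims_u) + dim_u - d * len(dims_u)


def omega_bound(n, dim_l, dim_u, d):
    """Upper bound dim(U) + n*(dim(L) - d) on omega at order n, cf. (4.4)."""
    return dim_u + n * (dim_l - d)


def power_counting(dim_l, d):
    """Return (property (a), property (b)) of Proposition 4.1."""
    return (dim_l <= d, dim_l < d)


def last_order_with_freedom(dim_l, dim_a, d):
    """Largest n for which the bound (4.4) still allows omega >= 0.

    Only meaningful for dim(L) < d; returns None otherwise.
    """
    if dim_l >= d:
        return None
    return math.floor(dim_a / (d - dim_l))


# d = 4: phi^4 (dim 4), phi^3 (dim 3), phi^6 (dim 6)
print(power_counting(4, 4))  # (a) holds, (b) fails
print(power_counting(3, 4))  # (a) and (b) hold
print(power_counting(6, 4))  # neither holds
print(last_order_with_freedom(dim_l=3, dim_a=2, d=4))
```

For $\dim(\mathcal{L}) = d$ the bound is independent of $n$, which is why boundedness of $N(\mathcal{L}, A, n)$ in that case needs the separate counting argument given in the proof.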
In the literature (also in causal perturbation theory [18, 40, 41]) the counting of indeterminate parameters is done in terms of the S-matrix in the adiabatic limit (i.e. $\sum_n \frac{1}{n!} \lim_{g \to 1} T_n\big((\mathcal{L}(g))^{\otimes n}\big)$), and the corresponding version of the proposition can be proved rather easily, see e.g. [3, Sec. 28.1]. It does not make an essential difference that we count in terms of retarded products. But, since we do not perform the adiabatic limit, our discussion is more involved.

Proof. Due to
$$\frac{\delta A(x)}{\delta\varphi(y)} = \sum_a \frac{\partial A}{\partial(\partial^a \varphi)}(x)\, (-1)^{|a|}\, \partial^a \delta(y - x) \tag{4.1}$$
we understand by the sub-polynomials $U$ of $A \in \mathcal{P}$ all non-vanishing polynomials $U \equiv \frac{\partial^k A}{\partial(\partial^{a_1}\varphi) \cdots \partial(\partial^{a_k}\varphi)}$, $k \in \mathbb{N}_0$, $a_j \in \mathbb{N}_0^d$, and we write $U \subset A$. Ignoring the AWI, the indefiniteness of $R_{n,1}\big((\mathcal{L}(g))^{\otimes n}; A(x)\big)$ is precisely the indefiniteness of all C-number distributions $r_{n,1}(U_1, \ldots, U_n; U)$, $U_1, \ldots, U_n \subset \mathcal{L}$ and $U \subset A$, due to the expansion (2.35) and (2.36). Note
$$0 \leq \dim(U_j) \leq \dim(\mathcal{L}) , \qquad \text{and} \qquad 0 \leq \dim(U) \leq \dim(A) . \tag{4.2}$$
The indefiniteness of $r_{n,1}(U_1, \ldots, U_n; U)(x_1 - x, \ldots, x_n - x)$ is a polynomial (3.16)
$$\sum_{|a| \leq \omega(U_1, \ldots, U_n; U)} C^n_a(U_1, \ldots, U_n; U)\, \partial^a \delta(x_1 - x, \ldots, x_n - x) \tag{4.3}$$
which is invariant under permutations of $(U_1, x_1), \ldots, (U_n, x_n), (U, x)$, where
$$\omega(U_1, \ldots, U_n; U) = \sum_{j=1}^{n} \dim(U_j) + \dim(U) - dn \leq \dim(U) + n(\dim(\mathcal{L}) - d) . \tag{4.4}$$
For $m = 0$ the sum (4.3) runs over $|a| = \omega$ only; this simplifies the proof. With (4.2)–(4.4) the only non-obvious statement of the proposition is that for an interaction $\mathcal{L}$ with $\dim(\mathcal{L}) = d$ the function $n \mapsto N(\mathcal{L}, A, n)$ is bounded $\forall A \in \mathcal{P}$. This statement holds even true if the AWI is not required, and we are now going to prove it under this weaker supposition. The boundedness (in $n$) of $\omega(U_1, \ldots, U_n; U)$ alone, i.e.
$$\omega(U_1, \ldots, U_n; U) = \dim(U) - \sum_{j=1}^{n} (d - \dim(U_j)) \leq \dim(U) \leq \dim(A) \qquad \forall n \in \mathbb{N} \tag{4.5}$$
and $\forall U_j \subset \mathcal{L}$, $U \subset A$, does not imply the assertion, because
(i) the number of terms $r_{n,1}(U_1, \ldots, U_n; U)$, $U_j \subset \mathcal{L}$, $U \subset A$, is increasing with $n$;
(ii) the number of indices $a \in \mathbb{N}_0^{dn}$ with $|a| \leq \omega_0$, $\omega_0$ fixed, see (4.3), is also increasing with $n$. (Hence, one might expect that there is e.g. an increasing number of constants $C^n_a(\mathcal{L}, \ldots, \mathcal{L}, A)$.)

Now let $A \in \mathcal{P}$ be fixed. (i) is no problem, since there are indeterminate parameters in the $r_{n,1}(U_1, \ldots, U_n; U)$ for $\omega(U_1, \ldots, U_n; U) \geq 0$ (4.5) only. Hence, we solely need to consider
$$\mathcal{R}_n \overset{\mathrm{def}}{=} \{ r_{n,1}(U_1, \ldots, U_l, \mathcal{L}, \ldots, \mathcal{L}; U) \mid U_1, \ldots, U_l \subset \mathcal{L},\ U \subset A \} , \tag{4.6}$$
where $l$ is given by $l \cdot \dim(\varphi) = \dim(A)$. However, the number of elements of $\mathcal{R}_n$ is constant for all $n \geq l$. To invalidate the objection (ii) let $U_1, \ldots, U_l$ and $U$ be fixed. Because
$$r_{n,1}(U_1, \ldots, U_l, \mathcal{L}, \ldots, \mathcal{L}; U)(y_1, \ldots, y_n) , \qquad y_j \overset{\mathrm{def}}{=} x_j - x , \tag{4.7}$$
is symmetrical in $y_{l+1}, \ldots, y_n$, the number of constants $C^n_a(U_1, \ldots, U_l, \mathcal{L}, \ldots, \mathcal{L}; U)$ with
$$a \in \mathbb{N}_0^{dn} , \qquad |a| \leq \dim(U) - \sum_{j=1}^{l} (d - \dim(U_j)) \tag{4.8}$$
is bounded in $n$. We use here a modified version of the fact that the number of coefficients in the symmetrical polynomials $P(z_1, \ldots, z_m)$ ($z_j \in \mathbb{R}$), $m \in \mathbb{N}$, of a fixed degree becomes independent of the number $m$ of variables $z_j$, if $m$ is big enough (footnote r). Summing up we find that $N(\mathcal{L}, A, n)$ is bounded in $n$ for any fixed $A \in \mathcal{P}$.

Footnote r: This becomes obvious by listing the symmetrical polynomials in $z_1, \ldots, z_m$ which are homogeneous of degree $k$ for the lowest values of $k$: e.g. for $k = 4$ and for all $m \geq 4$ a basis is given by
$$P_1 = C_1\, S z_1^4 , \quad P_2 = C_2\, S z_1^3 z_2 , \quad P_3 = C_3\, S z_1^2 z_2^2 , \quad P_4 = C_4\, S z_1^2 z_2 z_3 , \quad P_5 = C_5\, S z_1 z_2 z_3 z_4 , \tag{4.9}$$
where $S f(z_1, \ldots, z_m) \equiv \frac{1}{m!} \sum_{\pi \in S_m} f(z_{\pi 1}, \ldots, z_{\pi m})$.

4.2. Main theorem of perturbative renormalization

It is one of the main insights of renormalization theory that the ambiguities of the renormalization process can be absorbed in a redefinition of the parameters of the given model. In causal perturbation theory this was termed Main Theorem of Renormalization [44]. Different versions of this theorem may be found in [46, 22, 3, 44, 38, 25]. But there, in contrast to the formulation of renormalization in terms of the action functional, the parameters of a model are test functions. Therefore, the renormalization group which governs the change of parameters is more complicated, and it is only in the adiabatic limit that the more standard version of the renormalization group will be recovered. Fortunately, the algebraic adiabatic limit [7] is sufficient for this purpose, as was first shown by Hollands and Wald [32]. In this way, one finds an intrinsically local construction of the renormalization group which is suited for theories on curved spacetime and for theories with a bad infrared behavior.

Here we give a slightly streamlined proof of the Main Theorem in the framework of retarded products. In Sec. 5 we discuss the consequences for the algebraic adiabatic limit.

Theorem 4.2. (i) Let $R, \hat{R} : T\mathcal{F}_{\mathrm{loc}} \to \mathcal{F}$ be linear maps which satisfy the axioms Symmetry, Initial Condition, Causality and GLZ for retarded products. Then there exists a unique symmetric linear map
$$D : T\mathcal{F}_{\mathrm{loc}} \to \mathcal{F}_{\mathrm{loc}} \tag{4.10}$$
January 18, 2005 10:2 WSPC/148-RMP
00226
Causal Perturbation Theory in Terms of Retarded Products
1319
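with $D(1) = 0$ such that for all $F, S \in \mathcal{F}_{\rm loc}$ the following intertwining relation holds (in the sense of formal power series in $\lambda$)
\[ \hat R\big(e_\otimes^{\lambda S}, F\big) = R\Big(e_\otimes^{D(e_\otimes^{\lambda S})},\, D\big(e_\otimes^{\lambda S} \otimes F\big)\Big). \tag{4.11} \]
Moreover, $D$ satisfies the conditions
(a) $D(F) = F$, $F \in \mathcal{F}_{\rm loc}$.
(b)
\[ \mathrm{supp}\,\frac{\delta D(F_1 \otimes \dots \otimes F_n)}{\delta\varphi} \subset \bigcap_i \mathrm{supp}\,\frac{\delta F_i}{\delta\varphi}, \quad F_i \in \mathcal{F}_{\rm loc}. \tag{4.12} \]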
(ii) If in addition, $R$ and $\hat R$ satisfy some of the conditions Covariance, Unitarity, Field Independence and Field Equation, then $D$ has the corresponding properties, i.e. it is covariant, hermitian,^s has no explicit dependence on $\varphi$,
\[ \frac{\delta D(e_\otimes^{F})}{\delta\varphi} = D\Big(\frac{\delta F}{\delta\varphi} \otimes e_\otimes^{F}\Big), \tag{4.13} \]
and (under the condition Field Independence) fulfills $D(e_\otimes^{\lambda S} \otimes \varphi(h)) = \varphi(h)$, respectively.

(iii) If there are two families $R^{(m,\mu)}$, $\hat R^{(m,\mu)}$ as above, which are smooth in $m \ge 0$ and satisfy the axiom Scaling, then the corresponding intertwining family $D_H^{(m)}$ is also smooth in $m$, and it is independent^t of $\mu$ and invariant under scaling:
\[ \sigma_\rho\, D_H^{(\rho^{-1}m)}\big(e_\otimes^{\sigma_\rho^{-1}F}\big) = D_H^{(m)}\big(e_\otimes^{F}\big). \tag{4.14} \]
If $R^{(m)}$, $\hat R^{(m)}$ are related to $R^{(m,\mu)}$, $\hat R^{(m,\mu)}$ by (2.37), then the corresponding intertwining $D^{(m)}$ is related to $D_H^{(m)}$ also by (2.37) and fulfils
\[ \sigma_\rho \circ D_n^{(\rho^{-1}m)} \circ (\sigma_\rho^{-1})^{\otimes n} = \rho^{-\Gamma^{(m)}} \circ D_n^{(m)} \circ \big(\rho^{\Gamma^{(m)}}\big)^{\otimes n}. \tag{4.15} \]

(iv) Conversely, given $R$ and $D$ as above, Eq. (4.11) gives a new retarded product $\hat R$ with the pertinent properties. If there are families $R^{(m,\mu)}$ and $D_H^{(m)}$ as above, then the corresponding family $\hat R^{(m,\mu)}$ is smooth in $m \ge 0$ and satisfies the Scaling axiom.

The identity (4.11) states that the (finite) renormalization $R \to \hat R$ can be absorbed in the renormalizations
• $\lambda S \to D(e_\otimes^{\lambda S})$ of the interaction and
• $F \to D(e_\otimes^{\lambda S} \otimes F)$ of the field.
It is crucial that the renormalization of the interaction is independent of the field $F$.

^s i.e. $D(F^{\otimes n})^* = D((F^*)^{\otimes n})$.
^t For this reason we write $D_H^{(m)}$ instead of $D^{(m,\mu)}$.

Looking in (4.11) at the terms of $n$th order in $\lambda$ and using the polarization
January 18, 2005 10:2 WSPC/148-RMP
1320
00226
M. Dütsch & K. Fredenhagen
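identity we find that (4.11) is equivalent to
\[ \hat R(F_1 \otimes \dots \otimes F_n) = \sum_{n \in I \subset \{1,\dots,n\}} \; \sum_{P \in \mathrm{Part}(I^c)} R\Big( \bigotimes_{T \in P} D(F_T) \otimes D(F_I) \Big) \tag{4.16} \]
with $F_J = \bigotimes_{j\in J} F_j$ for an ordered index set $J$. It is instructive to write Eq. (4.16) in lowest orders:
\begin{align*}
(n=1)\quad & F \equiv \hat R(F) = R(D(F)) \equiv D(F), \\
(n=2)\quad & \hat R(F_1\otimes F_2) = R(F_1 \otimes F_2) + D(F_1\otimes F_2), \\
(n=3)\quad & \hat R(F_1 \otimes F_2 \otimes F_3) = R(F_1\otimes F_2\otimes F_3) \\
& \quad + R(D(F_1\otimes F_2)\otimes F_3) + R(F_1 \otimes D(F_2\otimes F_3)) + R(F_2\otimes D(F_1\otimes F_3)) \\
& \quad + D(F_1\otimes F_2\otimes F_3). \tag{4.17}
\end{align*}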
We see that the difference between the retarded products in order (1, 1) propagates to higher orders which gives the terms in the second last line. These terms are localized on partial diagonals and express the change of normalization of sub-diagrams. The term in the last line is localized on the total diagonal $\Delta_3$ and originates from the freedom of normalization of retarded products in the inductive step from (1, 1) to (2, 1). Note that the term with $I = \{1,\dots,n\}$ on the right-hand side of (4.16) gives, in view of $R_{0,1} = \mathrm{id}$, a definition of $D_n \overset{\rm def}{=} D|_{\mathcal{F}_{\rm loc}^{\otimes n}}$ in terms of $\hat R_{n-1,1}$, $R_{k,1}$, $D_k$ for $k = 1,\dots,n-1$. It is here that our formalism seems to be superior to previous formulations. Namely, if the retarded products take their values only on shell, $R_{0,1}$ is no longer the identity but the canonical surjection $\pi$ with respect to the ideal generated by the free field equation. Then the definition of $D_n$ requires a choice of representatives. Without the Action Ward Identity, such a choice is rather artificial, and we are not aware of any place in the literature where this problem is treated in full generality.
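The index structure of (4.16) can be made explicit by a small enumeration. The following Python sketch is only an illustration and not part of the original argument: the function names `set_partitions`, `d_factor` and `terms_4_16`, as well as the string encoding of the terms (with `x` standing for the tensor product), are assumptions made here. It lists all pairs $(I, P)$ with $n \in I \subset \{1,\dots,n\}$ and $P$ a partition of the complement $I^c$, and simplifies the output using $D(F) = F$ and $R_{0,1} = \mathrm{id}$.

```python
from itertools import combinations


def set_partitions(items):
    """Yield all partitions of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # put `first` into a block of its own ...
        yield [[first]] + partition
        # ... or add it to each existing block in turn
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]


def d_factor(block):
    """Render D(F_T); a single factor is left unchanged because D(F) = F."""
    body = " x ".join(f"F{j}" for j in sorted(block))
    return body if len(block) == 1 else f"D({body})"


def terms_4_16(n):
    """Return the terms on the right-hand side of (4.16) as strings."""
    terms = []
    others = list(range(1, n))  # indices 1..n-1 may or may not belong to I
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            index_set = list(extra) + [n]  # I always contains n
            complement = [j for j in others if j not in extra]
            for partition in set_partitions(complement):
                factors = [d_factor(block) for block in partition]
                factors.append(d_factor(index_set))
                if not partition:
                    # empty complement: R_{0,1} = id, the term is D(F_I) itself
                    terms.append(factors[0])
                else:
                    terms.append("R(" + " x ".join(factors) + ")")
    return terms


for n in (1, 2, 3):
    print(n, terms_4_16(n))
```

For $n = 3$ the sketch reproduces the five terms of (4.17). The number of terms in order $n$ is the number of set partitions of an $n$-element set, since each term corresponds to the partition $P \cup \{I\}$ of $\{1,\dots,n\}$.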
Proof of the Theorem. Part (iv): it is straightforward to check that every $D$ with the properties described in the theorem defines a new retarded product $\hat R$ via Eq. (4.11) (or equivalently (4.16)).

Part (i): from Eq. (4.16) one immediately sees that $\hat R_{0,1} = \mathrm{id}$ is equivalent to $D_1 = \mathrm{id}$, and that $\hat R_{n-1,1}$ is determined by the $D_l$'s of order $l \le n$ and by $R$. Vice versa, $D_n$ is uniquely given in terms of $R$, $\hat R$ and the lower order $D$'s, and obviously it is linear. If $D$ satisfies the properties mentioned in part (i) (or parts (i)–(iii) respectively), then this holds true also for its truncation $D^{(n)}$, which is defined by $D^{(n)}_l = D_l$ for $l \le n$ and $D^{(n)}_l = 0$ for $l > n$. Following part (iv), $D^{(n)}$ determines
a retarded product $\hat R^{(n)}$ with the pertinent properties, which coincides with $\hat R$ in order $(k, 1)$ for $k < n$. From (4.16) we see that
\[ D_n = \hat R_{n-1,1} - \hat R^{(n-1)}_{n-1,1}, \tag{4.18} \]
i.e. $D_n$ is the difference between two possible extensions of retarded product at order $(n-1, 1)$ and is therefore symmetric and localized on the diagonal. The latter implies $\mathrm{ran}\, D_n \subset \mathcal{F}_{\rm loc}$.

Parts (ii)–(iii): additional properties of the retarded products imply directly the corresponding properties of $D_n$. In particular,
(Field Equation) we know that the Field Equation (3.18) determines $R_{n-1,1}(\dots, \varphi(h))$ uniquely in terms of $R_{n-2,1}$; with that (4.18) implies $D_n(\dots \otimes \varphi(h)) = 0$;
(Scaling) from (4.18) and the almost homogeneous scaling of $\hat R_{n-1,1}$ and $\hat R^{(n-1)}_{n-1,1}$ we conclude that $\omega_0\big(D^{(m)}_{H\,n}(A_1(h_1)\otimes\dots\otimes A_n(h_n))\big)$ must be of the form (3.16) for $A_1,\dots,A_n \in \mathcal{P}_{\rm bal}\cap\mathcal{P}_{\rm hom}$ and hence it scales homogeneously with degree $\sum_i \dim(A_i)$; this yields (4.14) for $D^{(m)}_{H\,n}$. The corresponding coefficients $C_{l,a}$ in (3.16) are independent of $\mu$ because they are dimensionless; this shows explicitly that $D^{(m)}_H$ is independent of $\mu$. Finally (4.15) is obtained from (4.14) analogously to (2.49).

We now want to get a more explicit expression for $D^{(m)}_H$ under the assumptions Smoothness in $m$ and Scaling.
Proposition 4.3. Let $(D^{(m)}_H)_{m\ge 0}$ be a family of symmetric, linear maps from $\mathcal{T}\mathcal{F}_{\rm loc} \to \mathcal{F}_{\rm loc}$ which are local (in the sense of (4.12)) and independent of $\varphi$ (4.13). Assume that the family is scale invariant and smooth in $m$. Then it admits the expansion
\[ D^{(m)}_{H\,n}\big(A_1(h_1)\otimes\dots\otimes A_n(h_n)\big) = \sum_{l\in\mathbb{N}_0,\; a\in(\mathbb{N}_0^d)^n} m^l\, d_{n,l,a}(A_1\otimes\dots\otimes A_n)\Big(\prod_{i=1}^n \partial^{a_i} h_i\Big) \tag{4.19} \]
with $A_i \in \mathcal{P}_{\rm bal}$ and $h_i \in \mathcal{D}(\mathbb{M})$, where $d_{n,l,a}$ are linear symmetric maps $\mathcal{P}_{\rm bal}^{\otimes n} \to \mathcal{P}_{\rm bal}$ which are homogeneous in the sense that tensor products of homogeneous fields are mapped onto homogeneous fields such that the mass dimensions satisfy the relation
\[ \dim\big(d_{n,l,a}(A_1\otimes\dots\otimes A_n)\big) = \sum_{i=1}^n \dim(A_i) - l - |a| - d(n-1). \tag{4.20} \]
In particular, $d_{n,l,a}$ vanishes on tensor products of fields $A_i$, if the right-hand side is negative. Hence the sum in (4.19) is finite.

Note that we have on the right side of (4.19) the pointwise product of the test functions: $\prod_i (\partial^{a_i} h_i(x))$.
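The finiteness of the sum in (4.19) is a pure counting statement that follows from (4.20). The following Python sketch is illustrative only; the function name `allowed_terms` and its arguments are assumptions made here, and the field dimensions have to be supplied by hand. It lists all pairs $(l, |a|)$ for which the right-hand side of (4.20) is non-negative.

```python
def allowed_terms(field_dims, d, massless=False):
    """List (l, |a|, dim) with dim = sum_i dim(A_i) - l - |a| - d(n-1) >= 0, cf. (4.20)."""
    n = len(field_dims)
    budget = sum(field_dims) - d * (n - 1)  # dimension available at l = |a| = 0
    terms = []
    for l in range(0, max(budget, 0) + 1):
        if massless and l > 0:
            break  # in massless models only l = 0 contributes (Remark 4.4(1))
        for abs_a in range(0, budget - l + 1):
            terms.append((l, abs_a, budget - l - abs_a))
    return terms


# two factors of dimension 4 in d = 4, massless case
print(allowed_terms([4, 4], d=4, massless=True))
```

With two factors of dimension 4 in $d = 4$ (for instance $\varphi^4 \otimes \varphi^4$, assuming $\dim\varphi = 1$) the massless case leaves $|a| \le 4$. The counting gives only an upper bound: further conditions such as Lorentz covariance or field parity can remove some of the listed terms.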
Remark 4.4. (1) In massless models solely the terms with $l = 0$ contribute in (4.19).
(2) If our requirements Smoothness in $m \ge 0$ and Scaling are replaced by the upper bound (3.7) on the scaling degree for a fixed mass $m$, then there is an analogous expansion, $D_n(A_1(h_1)\otimes\cdots) = \sum_a d_{n,a}(A_1\otimes\cdots)\big(\prod_i \partial^{a_i}h_i\big)$, in terms of linear symmetric maps $d_{n,a}$ which are no longer homogeneous, but still satisfy the bound $\dim(d_{n,a}(A_1\otimes\cdots)) \le \sum_i \dim(A_i) - |a| - d(n-1)$. This change of the axioms causes only minor and obvious modifications of the applications given in Sec. 5.

Proof. By the field independence $D^{(m)}_{H\,n}$ admits a Taylor expansion in $\varphi$ where the coefficients are vacuum expectation values of functional derivatives of its entries (analogously to (2.35)). Because of locality (4.12) the coefficients are supported on the total diagonal and are thus derivatives of the $\delta$-distribution in the relative coordinates. Smoothness in $m$ implies the existence of a Taylor expansion in $m$ around $m = 0$. By integrating out the $\delta$-distribution, reordering the sums and partial integration we can write $D^{(m)}_{H\,n}$ in the form (4.19) with $d_{n,l,a}(A_1\cdots) \in \mathcal{P}_{\rm bal}$. Since $D^{(m)}_{H\,n}$ is symmetric and multi-linear in the fields $A_1,\dots,A_n$, the $d_{n,l,a}$'s must satisfy the corresponding properties. The homogeneous scaling (4.14) implies $d_{n,l,a}(A_1\cdots) \in \mathcal{P}_{\rm hom}$ for $A_1,\dots \in \mathcal{P}_{\rm hom}$; and by using (2.44), which can equivalently be written as
\[ \sigma_\rho^{-1} A(h) = \rho^{\dim(A)} A(h^{(\rho)}) \quad\text{with}\quad h^{(\rho)}(x) \overset{\rm def}{=} \rho^{-d}\, h(\rho^{-1}x), \tag{4.21} \]
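we obtain
\begin{align*}
\sigma_\rho D^{(\rho^{-1}m)}_{H\,n}\big(\sigma_\rho^{-1}A_1(h_1)\otimes\cdots\big)
&= \rho^{\sum_i \dim(A_i)}\, \sigma_\rho D^{(\rho^{-1}m)}_{H\,n}\big(A_1(h_1^{(\rho)})\otimes\cdots\big) \\
&= \rho^{\sum_i \dim(A_i)} \sum_{l,a} \Big(\frac{m}{\rho}\Big)^{l} \int dx\, \big(\sigma_\rho\, d_{n,l,a}(A_1\otimes\cdots)(x)\big) \Big(\prod_i \partial^{a_i} h_i^{(\rho)}\Big)(x) \\
&= \sum_{l,a} \rho^{\sum_i \dim(A_i) - l + d - \dim(d_{n,l,a}(\dots)) - dn - |a|}\; m^l \int dy\, d_{n,l,a}(\cdots)(y) \Big(\prod_i \partial^{a_i} h_i\Big)(y), \tag{4.22}
\end{align*}
where we have set $y \equiv \rho^{-1}x$. Since this expression agrees with $D^{(m)}_{H\,n}(A_1(h_1)\otimes\cdots)$ the exponent of $\rho$ must vanish; this yields (4.20).

Remark 4.5. It is instructive to formulate the theorem for the S-matrix as the generating functional of the time-ordered products: $\mathbf{S}(\lambda S) = T(e_\otimes^{i\lambda S})$. Let $R$, $\hat R$ and $D$ be as given in the theorem. Then the corresponding time-ordered products $T$ and $\hat T$ according to (E.6) and the associated S-matrices are related by
\[ \hat{\mathbf{S}}(\lambda S) \equiv \hat T\big(e_\otimes^{i\lambda S}\big) = T\Big(e_\otimes^{iD(e_\otimes^{\lambda S})}\Big) \equiv \mathbf{S}\big(D(e_\otimes^{\lambda S})\big). \tag{4.23} \]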
By linearity the map $D$ of Theorem 4.2 can be extended to formal power series, i.e. to a map $\mathcal{T}(\mathcal{F}_{\rm loc})[[\lambda]] \to \mathcal{F}_{\rm loc}[[\lambda]]$. Let $S(\lambda) \in \mathcal{F}_{\rm loc}[[\lambda]]$ with $S(0) = 0$. The renormalization of the interaction can be considered as a bijective analytic map
\[ S(\lambda) \to Z(S(\lambda)) \overset{\rm def}{=} D\big(e_\otimes^{S(\lambda)}\big). \tag{4.24} \]
(When the $R^{(m,\mu)}$-products are meant we write $Z_H^{(m)}(\cdot) := D_H^{(m)}(e_\otimes^{\,\cdot})$.) With that admissible renormalizations of the interaction can be composed and give again an admissible renormalization. In detail, let $Z_{l,j}(\cdot) = D_{l,j}(e_\otimes^{\,\cdot})$, $(l, j) = (1, 2), (2, 3)$ be given, with $D_{l,j}$ satisfying the properties of the theorem. Moreover let $\mathbf{S}_1$ be an S-matrix fulfilling the axioms. Then, $\mathbf{S}_2(S) \overset{\rm def}{=} \mathbf{S}_1(Z_{1,2}(S))$ and $\mathbf{S}_3(S) \overset{\rm def}{=} \mathbf{S}_2(Z_{2,3}(S)) = \mathbf{S}_1(Z_{1,2}(Z_{2,3}(S)))$ satisfy also the axioms (due to part (iv) of the theorem) and, hence, $Z_{1,2} \circ Z_{2,3}$ is a renormalization of the interaction with the properties given in parts (i)–(iii) of the theorem.

So we infer that the nonuniqueness in the construction of retarded products is governed by a group, which we may call the “Stueckelberg–Petermann renormalization group” $\mathcal{R}$. It is the group of all analytic bijections $Z$ of the space of formal power series $S(\lambda)$ with values in $\mathcal{F}_{\rm loc}$ and with $S(0) = 0$ (thus $Z(0) = 0$), which satisfy the conditions
(i) $Z$ preserves the first order term, i.e.
\[ \frac{d}{d\lambda} Z(S(\lambda))\Big|_{\lambda=0} = \frac{d}{d\lambda} S(\lambda)\Big|_{\lambda=0}. \]
(ii) $Z$ is real, i.e. $Z(S(\lambda)^*) = Z(S(\lambda))^*$.
(iii) $Z$ is local: it preserves the localization region,
\[ \mathrm{supp}\,\frac{\delta Z(S(\lambda))}{\delta\varphi} = \mathrm{supp}\,\frac{\delta S(\lambda)}{\delta\varphi}, \]
and it is additive on sums of terms with disjoint localizations, $Z(S_1(\lambda) + S_2(\lambda)) = Z(S_1(\lambda)) + Z(S_2(\lambda))$ if $\mathrm{supp}\,\frac{\delta S_1(\lambda)}{\delta\varphi} \cap \mathrm{supp}\,\frac{\delta S_2(\lambda)}{\delta\varphi} = \emptyset$.
(iv) $Z$ is Poincaré invariant.
(v) $Z$ does not explicitly depend on $\varphi$.
(vi) $Z$ acts trivially on $\varphi$ in the sense that
\[ Z(S(\lambda) + \lambda\varphi(h)) = Z(S(\lambda)) + \lambda\varphi(h). \]
(vii) In the preceding conditions it is not assumed that the retarded products satisfy the axioms Smoothness and Scaling. If the latter are included, $Z$ is replaced by a family $Z_H = (Z_H^{(m)})_{m\ge 0}$, where each component $Z_H^{(m)}$ must fulfil (i)–(vi) and additionally it must hold that
\[ Z_H^{(\rho m)} = \sigma_\rho \circ Z_H^{(m)} \circ \sigma_\rho^{-1} \]
and that $Z_H^{(m)}$ is smooth in $m \ge 0$.
In terms of $Z$ and its first derivative $Z'$ the transformation formula (4.11) of the retarded products reads
\[ \hat F_S = \big(Z'(S)F\big)_{Z(S)} \tag{4.25} \]
where
\[ Z'(S)F := \frac{d}{d\tau} Z(S + \tau F)\Big|_{\tau=0}. \tag{4.26} \]
5. The Algebraic Adiabatic Limit and the Renormalization Group

Of particular interest are certain factor groups of the Stueckelberg–Petermann renormalization group $\mathcal{R}$ introduced in the preceding section. Up to now our renormalization group transformations act on explicitly spacetime dependent interaction Lagrangians $g = \sum g_i L_i \in \mathcal{D}(\mathbb{M}, \mathcal{P}_{\rm bal})$. We want to extract from this information the action of the renormalization group on constant Lagrangians $L \in \mathcal{P}_{\rm bal}$. This requires a kind of adiabatic limit. The usual adiabatic limit $g \to k$ (where $k$ is constant) needs a good infrared behavior (see e.g. [18, 19, 2, 40]). We therefore work in the algebraic adiabatic limit.

To explain the algebraic adiabatic limit let $\mathcal{O} \subset \mathbb{M}$ be a causally closed, open region. Let $S \in \mathcal{F}_{\rm loc}$. Due to Proposition 3.1 there is a unique function $g \in \mathcal{D}(\mathbb{M}, \mathcal{P}_{\rm bal})$ with $\int g = S$; we may therefore write $A(h)_g$ instead of $A(h)_S$ for the interacting fields and $Z(g)$, $Z'(g)$ for $Z(\int g)$ (4.24) and $Z'(\int g)$ (4.26). We introduce the algebra $\mathcal{A}_g(\mathcal{O})$ of interacting fields belonging to $\mathcal{O}$ as the sub-algebra of $\mathcal{A}^{(m)}$ which is generated by the interacting fields $A(h)_g$ with $A \in \mathcal{P}$ and $h \in \mathcal{D}(\mathcal{O})$. Note that $\mathcal{A}_g(\mathcal{O})$ depends on the chosen retarded products. For $L \in \mathcal{P}_{\rm bal}$ we define
\[ \mathcal{G}_L(\mathcal{O}) \overset{\rm def}{=} \{ g \in \mathcal{D}(\mathbb{M},\mathcal{P}_{\rm bal}) \mid g(x) = L \text{ for all } x \text{ in a neighborhood of the closure of } \mathcal{O} \}. \]
The algebraic adiabatic limit relies on the following observation [7]^u: for any $g_1, g_2 \in \mathcal{G}_L(\mathcal{O})$ there exists a set $\mathrm{Aut}_{g_1,g_2}$ of automorphisms $\alpha$ of $\mathcal{A}^{(m)}[[\lambda]]$ with
\[ \alpha\big(A(h)_{g_1}\big) = A(h)_{g_2} \quad \forall \alpha \in \mathrm{Aut}_{g_1,g_2},\ \forall h \in \mathcal{D}(\mathcal{O}),\ \forall A \in \mathcal{P}. \tag{5.1} \]
This has the important consequence that the algebraic structure of $\mathcal{A}_g(\mathcal{O})$ is independent of the choice of $g \in \mathcal{G}_L(\mathcal{O})$.

Following [7, 13] we may formalize the algebraic adiabatic limit in the following way: consider the bundle of algebras $\mathcal{A}_g(\mathcal{O})$ over the space of compactly-supported Lagrangians $g \in \mathcal{G}_L(\mathcal{O})$ where $L \in \mathcal{P}_{\rm bal}$ is a constant Lagrangian. A section $B = (B_g)$ is called covariantly constant if it holds that
\[ \alpha(B_{g_1}) = B_{g_2} \quad \forall \alpha \in \mathrm{Aut}_{g_1,g_2}. \]

^u An alternative proof of (5.1) which is based on our axioms for retarded products (Sec. 2) is given in [14]. It deals with on-shell valued retarded products; however it applies also to our $\mathcal{F}$-valued retarded products because it uses the Symmetry, Causality and GLZ relation only.
In particular the interacting fields are covariantly constant sections. The local algebra $\mathcal{A}_L(\mathcal{O})$ is now defined as the algebra of covariantly constant sections of the bundle introduced above. To get a net of local algebras in the sense of the Haag–Kastler axioms [27, 28] one has in addition to fix the embeddings into algebras of larger regions. Let $\mathcal{O}_1 \subset \mathcal{O}_2$. Then we define the embeddings of algebras $i_{\mathcal{O}_2\mathcal{O}_1} : \mathcal{A}_L(\mathcal{O}_1) \to \mathcal{A}_L(\mathcal{O}_2)$ by restricting a section $B$ to Lagrangians $g \in \mathcal{G}_L(\mathcal{O}_2)$. It is easy to see that these embeddings satisfy the consistency condition $i_{\mathcal{O}_3\mathcal{O}_2} \circ i_{\mathcal{O}_2\mathcal{O}_1} = i_{\mathcal{O}_3\mathcal{O}_1}$ for $\mathcal{O}_1 \subset \mathcal{O}_2 \subset \mathcal{O}_3$. Moreover, the net is covariant under Poincaré transformations provided the Lagrangian $L$ is Lorentz invariant [13]. It also satisfies the condition of local commutativity as a consequence of the conditions GLZ relation and Causality.

We may also look for local fields associated to the net. By definition, a local field associated to the net is a family of distributions $(A_{\mathcal{O}})$ with values in $\mathcal{A}_L(\mathcal{O})$ such that $i_{\mathcal{O}_2\mathcal{O}_1}(A_{\mathcal{O}_1}(h)) = A_{\mathcal{O}_2}(h)$ if $\mathrm{supp}\, h \subset \mathcal{O}_1$ and which transform covariantly under Poincaré transformations [8]. Examples for local fields are given in terms of the classical fields $A \in \mathcal{P}$ by the sections
\[ (A_{\mathcal{O}}(h))_g := (A(h))_g, \quad g \in \mathcal{G}_L(\mathcal{O}). \]
It is an open question whether there are other local fields. This amounts to a determination of the Borchers class for perturbatively-defined interacting field theories.

We are now going to investigate what happens with the Stueckelberg–Petermann renormalization group $\mathcal{R}$ in the algebraic adiabatic limit. For this purpose we insert a $g \in \mathcal{G}_L(\mathcal{O})$ into (4.19) and find
\[ Z_H(\lambda g) = D_H\big(e_\otimes^{\lambda\int g}\big) \in \mathcal{G}_{z(\lambda L)}(\mathcal{O}) \tag{5.2} \]
with
\[ z(A) = \sum_{n,l} \frac{1}{n!}\, m^l\, d_{n,l,0}(A^{\otimes n}), \quad A \in \lambda\mathcal{P}_{\rm bal}[[\lambda]]. \tag{5.3} \]
(The terms $a \ne 0$ with derivatives of the test functions in (4.19) do not contribute to $z$ because $g|_{\mathcal{O}} = \text{constant}$.) Hence, the renormalization group transformations $Z_H \in \mathcal{R}$ induce transformations
\[ z : \lambda\mathcal{P}_{\rm bal}[[\lambda]] \to \lambda\mathcal{P}_{\rm bal}[[\lambda]] : \quad z(\lambda L) = \lambda L + O(\lambda^2). \tag{5.4} \]
Since $Z_H$ is invertible this holds also for $z$. The map $\gamma : Z_H \mapsto z$ is a homomorphism of $\mathcal{R}$ to the renormalization group in the adiabatic limit $\mathcal{R}_{\rm adlim} := \gamma(\mathcal{R})$. As one
might expect, the kernel of this homomorphism acts trivially on the local nets. Actually, this holds already if the given Lagrangian is left invariant, in detail:

Theorem 5.1. Let $\mathcal{A}_L$ and $\hat{\mathcal{A}}_L$ be two local nets which are defined by renormalization prescriptions $R^{(m,\mu)}$ and $\hat R^{(m,\mu)}$ which are related by a renormalization group transformation $Z_H$ (4.24) such that $\gamma(Z_H)(\lambda L) = \lambda L$. Then the nets are equivalent, i.e. there exist isomorphisms $\beta_{\mathcal{O}} : \mathcal{A}_L(\mathcal{O}) \to \hat{\mathcal{A}}_L(\mathcal{O})$ with $\beta_{\mathcal{O}_2} \circ i_{\mathcal{O}_2\mathcal{O}_1} = \hat i_{\mathcal{O}_2\mathcal{O}_1} \circ \beta_{\mathcal{O}_1}$.

Proof. Since $\hat F_g = (Z_H'(g)F)_{Z_H(g)}$ (4.25) where $Z_H'(g)$ is an invertible linear transformation on the space of local functionals, the algebras $\hat{\mathcal{A}}_g(\mathcal{O})$ and $\mathcal{A}_{Z_H(g)}(\mathcal{O})$ coincide. In addition note that $g \in \mathcal{G}_L(\mathcal{O})$ is equivalent to $Z_H(g) \in \mathcal{G}_L(\mathcal{O})$. We are now going to show that a section $B$ in $\mathcal{A}_L(\mathcal{O})$ can be mapped to a section $\beta_{\mathcal{O}}(B)$ in $\hat{\mathcal{A}}_L(\mathcal{O})$ by
\[ \beta_{\mathcal{O}}(B)_g = B_{Z_H(g)}. \]
Obviously, this map is an isomorphism of the algebras of sections. It remains to prove that $B$ is covariantly constant if and only if $\beta_{\mathcal{O}}(B)$ is. But this follows from the fact that $Z_H'(g)F = D_H(e_\otimes^{\int g} \otimes F)$ is independent of the choice of $g \in \mathcal{G}_L(\mathcal{O})$ if $F$ is localized within $\mathcal{O}$. Therefore the conditions on the intertwining isomorphisms $\alpha$ are identical: $\widehat{\mathrm{Aut}}_{g_1,g_2}(\mathcal{O}) = \mathrm{Aut}_{Z_H(g_1),Z_H(g_2)}(\mathcal{O})$.
Due to this theorem, $\gamma(Z_H) = z$ is the renormalization of the interaction in the algebraic adiabatic limit. From (4.20) and (5.3) we immediately find
\[ \dim(L) \le d \;\Rightarrow\; \dim(z(\lambda L)) \le d. \tag{5.5} \]

Example (Renormalization of $L = \lambda\varphi^4$ in $d = 4$ dimensions). If $R^{(m,\mu)}$ and $\hat R^{(m,\mu)}$ satisfy the field parity (2.10), this holds also for the corresponding map $D_H$ of Theorem 4.2 and hence for $z$ (5.3). Taking this, Lorentz covariance, Unitarity, (5.4) and (5.5) into account we obtain^v
\[ z(\lambda\varphi^4) = \lambda\big[(1+a)\varphi^4 + b\big((\partial\varphi)^2 - \varphi\Box\varphi\big) + m^2 c\,\varphi^2 + m^4 e\,\mathbf{1}\big], \tag{5.6} \]
where $a, b, c, e \in \lambda\mathbb{R}[[\lambda]]$. The term $m^4 e\,\mathbf{1}$ is irrelevant because $R^{(m,\mu)}_{n,1}(\cdots\mathbf{1}\cdots) = 0$ for $n \ge 1$ (see e.g. [12, Lemma 1(C)]). $a$ describes coupling constant renormalization, $b$ and $c$ wave function and mass renormalization. As usual, the latter renormalizations can be absorbed in a redefinition of the free theory, so that there is only one free parameter left.

^v According to (3.4) there is (up to multiplication with a constant) precisely one Lorentz invariant balanced field with two factors $\varphi$ and two derivatives, namely $\partial_{12}^{\mu}\,\partial^{12}_{\mu}\,\varphi(x_1)\varphi(x_2)|_{x_1=x_2=x} = 2\big(\varphi\Box\varphi - (\partial^\mu\varphi)\partial_\mu\varphi\big)(x)$.

Remark 5.2. The invariance of the Lagrangian is sufficient but not necessary for the invariance of the net (in the example (5.6) the parameter $e$ is irrelevant for the
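structure of the net). We plan to determine the corresponding subgroup of $\mathcal{R}_{\rm adlim}$ [6].

The field renormalization in the algebraic adiabatic limit is a map
\[ z^{(1)} : \lambda\mathcal{P}_{\rm bal}[[\lambda]] \times \mathcal{P}_{\rm bal} \to \mathcal{P}[[\lambda]] : \quad (\lambda L, A) \mapsto z^{(1)}(\lambda L)A = A + O(\lambda) \tag{5.7} \]
which is determined by the requirement that the transformation formula (4.11) or (4.25) takes the simpler form
\[ \big(z^{(1)}(\lambda L)A\big)(h)_{z(\lambda L)} = \widehat{A(h)}_{\lambda L}, \quad \forall h \in \mathcal{D},\ \forall A, L \in \mathcal{P}_{\rm bal}, \tag{5.8} \]
in the algebraic adiabatic limit. This condition has a unique solution: in the formula (4.19) for $D_{H\,n}\big((\lambda\int g)^{\otimes n} \otimes A(h)\big)$ we specialize to $g \in \mathcal{G}_L(\mathcal{O})$ and $h \in \mathcal{D}(\mathcal{O})$ and perform partial integrations. This yields
\[ z^{(1)}(\lambda L)A = \sum_{n,l,\;a\in\mathbb{N}_0^d} \frac{1}{n!}\, m^l\, (-1)^{|a|}\, \partial^a d_{n+1,l,(0,\dots,0,a)}\big((\lambda L)^{\otimes n} \otimes A\big). \tag{5.9} \]
Hence, the algebraic adiabatic limit simplifies the field renormalization $A(h) \mapsto D_H\big(e_\otimes^{\lambda\int g} \otimes A(h)\big)$ to the linear map $\mathcal{P}_{\rm bal} \ni A \mapsto z^{(1)}(\lambda L)A \in \mathcal{P}[[\lambda]]$, i.e. the test function remains unchanged. By using the definition
\[ z^{(1)}(\lambda L)\,\partial^a A := \partial^a z^{(1)}(\lambda L)A, \quad A \in \mathcal{P}_{\rm bal}, \tag{5.10} \]
and linearity we extend $z^{(1)}$ to a map $\mathcal{P}_{\rm bal}[[\lambda]] \times \mathcal{P} \to \mathcal{P}[[\lambda]]$ and with that the relation (5.8) holds even for all $A \in \mathcal{P}$. From (4.20) and (5.9) we conclude
\[ \dim(L) \le d \;\Rightarrow\; \dim\big(z^{(1)}(\lambda L)A\big) = \dim(A) \tag{5.11} \]
(where $d$ is the number of spacetime dimensions). In a massless model with $L \in \mathcal{P}_d$ (2.41) we find
\[ z(\lambda L) \in \mathcal{P}_d \quad\text{and}\quad A \in \mathcal{P}_j \Rightarrow z^{(1)}(\lambda L)A \in \mathcal{P}_j, \tag{5.12} \]
due to Remark 4.4(1). We point out that $z$ and $z^{(1)}$ are uniquely determined by $R^{(m,\mu)}$ and $\hat R^{(m,\mu)}$ and that they are universal, i.e. independent of $\mathcal{O}$.

Example (Renormalization of $\varphi$ and $\varphi^2$ in a massless model with $L \in \mathcal{P}_d$). If the retarded products satisfy the Field equation, the renormalization of $\varphi$ is the identity, due to Theorem 4.2(ii). Here we do not use this assumption, instead we work out some consequences of (5.12). Since $\mathcal{P}_1 = \{a\varphi \mid a \in \mathbb{R}\}$ the renormalization of $\varphi$ has the simple form
\[ \varphi \to z^{(1)}_\varphi\,\varphi, \quad z^{(1)}_\varphi \in \mathbb{R}[[\lambda]]. \tag{5.13} \]
For $A = \varphi^2$ the renormalized field is of the form
\[ z^{(1)}(\lambda L)\varphi^2 = z^{(1)}_0\,\varphi^2 + z^{(1)}_{1\mu}\,\partial^\mu\varphi, \quad z^{(1)}_0,\, z^{(1)}_{1\mu} \in \mathbb{R}[[\lambda]]. \tag{5.14} \]
If $L$, $R^{(m,\mu)}$ and $\hat R^{(m,\mu)}$ are Lorentz covariant, the right side of (5.14) must also have this symmetry, i.e. $z^{(1)}_{1\mu} = 0$. (Alternatively, the latter follows also if $L$ is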
even in $\varphi$ and $R^{(m,\mu)}$ and $\hat R^{(m,\mu)}$ preserve the field parity (2.10).) With that the renormalization of $\varphi^2$ is “diagonal”, too. Usually $z^{(1)}_0$ (5.14) is non-trivial and it cannot be related to $(z^{(1)}_\varphi)^2$ (5.13).

Scaling transformations. In Sec. 3 we have shown that one can fulfil almost homogeneous scaling of $R^{(m,\mu)}_{n,1}$ for all $m \ge 0$. The results of this section yield far-reaching additional information about the connection of $R^{(m,\mu)}_\rho$ (2.49) and $R^{(m,\mu)}$ (cf. [32] and [25]). The basic observation is the following: if $R^{(m,\mu)}$ fulfils the axioms given in Sec. 2 (Lorentz covariance, global inner symmetries and the Field equation may be excluded or included), then the same axioms hold true for $R^{(m,\mu)}_\rho$, too, as can be verified straightforwardly. Therefore, there exists a sequence $D^{(m)}_{H\,\rho}$ (4.10) with the properties mentioned in the Main Theorem. So, the scaling transformations on a given renormalization prescription induce a one-parameter subgroup of the renormalization group $\mathcal{R}_{\rm adlim}$, which may be called “Gell–Mann–Low Renormalization Group”.

With the scaling as renormalization transformation we are now going to compute to lowest non-trivial order the renormalization of the fields $\varphi^2$, $\varphi^3$, $\varphi^4$ and of the interaction $\lambda L$ for $L = \varphi^4$ (in $d = 4$ dimensions) and $m = 0$. We assume that $R \equiv R^{(0)} \equiv R^{(0,\mu)}$ fulfils all axioms of Sec. 2.

Renormalization of $\varphi^2$. By (4.16) and the expansion (2.35) and (2.36) we obtain
\begin{align*}
D_\rho\big(\varphi^4(x_1) \otimes \varphi^2(x)\big) &= R_\rho\big(\varphi^4(x_1), \varphi^2(x)\big) - R\big(\varphi^4(x_1), \varphi^2(x)\big) \\
&= 6\big[\rho^4\, r(\varphi^2,\varphi^2)(\rho(x_1-x)) - r(\varphi^2,\varphi^2)(x_1-x)\big]\varphi^2(x_1) \\
&= \frac{6}{(2\pi)^2}\,\log\rho\;\delta(x_1-x)\,\varphi^2(x_1), \tag{5.15}
\end{align*}
where we use a symbolic notation for $D_\rho$ and the result (C.7) of Appendix C. The tree diagram of $R(\varphi^4, \varphi^2)$ does not contribute, because it scales homogeneously. Going over to the algebraic adiabatic limit the formula (5.9) yields
\[ z^{(1)}_\rho(\lambda\varphi^4)\,\varphi^2 = \Big(1 + \lambda\,\frac{6}{(2\pi)^2}\,\log\rho + O(\lambda^2)\Big)\varphi^2, \tag{5.16} \]
independently of the normalization of the fish diagram $r(\varphi^2, \varphi^2)$.

Renormalization of $\varphi^3$. Analogous to (5.15) we get
\begin{align*}
D_\rho\big(\varphi^4(x_1) \otimes \varphi^3(x)\big) &= 18\big[\rho^4\, r(\varphi^2,\varphi^2)(\rho(x_1-x)) - r(\varphi^2,\varphi^2)(x_1-x)\big]\varphi(x)\varphi^2(x_1) \\
&\quad + 4\big[\rho^6\, r(\varphi^3,\varphi^3)(\rho(x_1-x)) - r(\varphi^3,\varphi^3)(x_1-x)\big]\varphi(x_1) \\
&= \frac{18}{(2\pi)^2}\,\log\rho\;\delta(x_1-x)\,\varphi^3(x_1) + \frac{3}{2(2\pi)^4}\,(\log\rho)\;\Box\delta(x_1-x)\,\varphi(x_1), \tag{5.17}
\end{align*}
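where we have inserted the results (C.7) and (C.13) of Appendix C. In the algebraic adiabatic limit this gives
\[ z^{(1)}_\rho(\lambda\varphi^4)\,\varphi^3 = \Big(1 + \lambda\,\frac{18}{(2\pi)^2}\,\log\rho + O(\lambda^2)\Big)\varphi^3 + \Big(\lambda\,\frac{3}{2(2\pi)^4}\,\log\rho + O(\lambda^2)\Big)\Box\varphi \tag{5.18} \]
by means of (5.9). Terms other than $\varphi^3$ and $\Box\varphi$ do not appear to higher orders either, due to (5.12), the maintenance of the field parity and Lorentz invariance. So, the field renormalization of $\varphi^3$ is non-diagonal. However, since $z^{(1)}_\rho(\lambda\varphi^4)\Box\varphi = \Box\varphi$ (due to the validity of the Field equation and (5.10)), the field $\big(\varphi^3 + \frac{1}{12(2\pi)^2}\Box\varphi\big)$ is an eigenvector of $z^{(1)}_\rho(\lambda\varphi^4)$ with eigenvalue $\big(1 + \lambda\frac{18}{(2\pi)^2}\log\rho + O(\lambda^2)\big)$.

Renormalization of $L = \varphi^4$. We continue Example (5.6) for the particular case of the scaling transformations and $m = 0$. Since the corresponding tree diagrams scale homogeneously we obtain
\begin{align*}
D_\rho\big(\varphi^4(x_1) \otimes \varphi^4(x)\big) &= 36\big[\rho^4\, r(\varphi^2,\varphi^2)(\rho(x_1-x)) - r(\varphi^2,\varphi^2)(x_1-x)\big]\varphi^2(x_1)\varphi^2(x) \\
&\quad + 16\big[\rho^6\, r(\varphi^3,\varphi^3)(\rho(x_1-x)) - r(\varphi^3,\varphi^3)(x_1-x)\big]\varphi(x_1)\varphi(x) \\
&\quad + \big[\rho^8\, r(\varphi^4,\varphi^4)(\rho(x_1-x)) - r(\varphi^4,\varphi^4)(x_1-x)\big]\mathbf{1} \\
&= \frac{36}{(2\pi)^2}\,(\log\rho)\,\delta(x_1-x)\,\varphi^2(x_1)\varphi^2(x) \\
&\quad + \frac{6}{(2\pi)^4}\,(\log\rho)\,(\Box\delta)(x_1-x)\,\varphi(x_1)\varphi(x) + \cdots\,\Box^2\delta(x_1-x)\,\mathbf{1} \tag{5.19}
\end{align*}
by using (C.7) and (C.13); the form of the last term follows from Lorentz invariance and that it must scale homogeneously with degree 8. In the algebraic adiabatic limit we get the following renormalization of the interaction:
\[ z_\rho(\lambda\varphi^4) = \Big(\lambda + \lambda^2\,\frac{18}{(2\pi)^2}\,\log\rho + O(\lambda^3)\Big)\varphi^4 + \Big(-\lambda^2\,\frac{3}{2(2\pi)^4}\,\log\rho + O(\lambda^3)\Big)\big((\partial^\mu\varphi)\partial_\mu\varphi - \varphi\Box\varphi\big), \tag{5.20} \]
where we apply (5.3) and take into account that $z$ takes values in $\mathcal{P}_{\rm bal}$. Due to (5.12) there is no mass renormalization, i.e. the constant $c$ in (5.6) vanishes to all orders.

Field Renormalization of $\varphi^4$. From $D_\rho(\varphi^4 \otimes \varphi^4)$ (5.19) we can also read off the field renormalization of $\varphi^4$ to first order in $\lambda$:
\[ z^{(1)}_\rho(\lambda\varphi^4)\,\varphi^4 = \varphi^4 + \lambda\Big(\frac{36}{(2\pi)^2}\,\log\rho\;\varphi^4 + \frac{6}{(2\pi)^4}\,\log\rho\;\varphi\Box\varphi\Big) + O(\lambda^2). \tag{5.21} \]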
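The eigenvector statement following (5.18) can be checked with elementary arithmetic. The following Python sketch is only an illustration under stated assumptions: it truncates $z^{(1)}_\rho(\lambda\varphi^4)$ at first order in $\lambda$, represents it as a $2\times 2$ matrix on the basis $(\varphi^3, \Box\varphi)$ of the two fields appearing in (5.18), and uses sample values for $\lambda$ and $\log\rho$. The helper names `z1_matrix`, `apply_matrix` and `eigen_residual` are hypothetical.

```python
import math


def z1_matrix(lam, log_rho):
    """First-order truncation of z_rho^(1)(lam * phi^4) on the basis (phi^3, box phi)."""
    a = 18.0 / (2 * math.pi) ** 2 * log_rho        # phi^3 -> phi^3 coefficient in (5.18)
    b = 3.0 / (2 * (2 * math.pi) ** 4) * log_rho   # phi^3 -> box phi coefficient in (5.18)
    # column 0: image of phi^3; column 1: image of box phi (not renormalized)
    return [[1 + lam * a, 0.0],
            [lam * b, 1.0]]


def apply_matrix(matrix, vec):
    """Multiply a 2x2 matrix with a 2-component coefficient vector."""
    return [sum(matrix[i][j] * vec[j] for j in range(2)) for i in range(2)]


def eigen_residual(lam, log_rho):
    """Components of (z - eigenvalue) applied to phi^3 + box phi / (12 (2 pi)^2)."""
    vec = [1.0, 1.0 / (12 * (2 * math.pi) ** 2)]
    eigenvalue = 1 + lam * 18.0 / (2 * math.pi) ** 2 * log_rho
    image = apply_matrix(z1_matrix(lam, log_rho), vec)
    return [image[i] - eigenvalue * vec[i] for i in range(2)]


print(eigen_residual(lam=0.1, log_rho=0.7))
```

Both components of the residual vanish up to rounding, because the ratio of the two coefficients in (5.18) is $\frac{3}{2(2\pi)^4} \big/ \frac{18}{(2\pi)^2} = \frac{1}{12(2\pi)^2}$, independently of $\lambda$ and $\rho$.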
6. Outlook

The construction of a renormalized perturbative quantum field theory, in the sense of algebraic quantum field theory [27], was carried through without ever meeting infrared problems. In particular, the renormalization group (in the sense of Stueckelberg and Petermann) could be constructed in purely local terms. This is at variance with standard techniques of perturbation theory which typically rely on global properties.

Given the algebra of interacting fields, one may then, in a second step, look for states of interest, for instance vacuum or particle states. This amounts to performing the adiabatic limit in the conventional sense and was done for massive theories by Epstein and Glaser [18, 19] and, on the basis of retarded products, by Steinmann [42]. (For QED see [2] for the construction of the vacuum state and [43] for the analysis of scattering.) One may then relate the global renormalization parameters, such as masses and coupling constants at e.g. zero momentum, to the local parameters involved in our construction.

One may also look for other situations, for instance at finite temperature or with non-trivial boundary conditions. Then, other global parameters are of interest, but the local parameters remain the same. On a generic curved spacetime it seems that a completely local procedure is by far the best way to construct perturbative quantum fields; in particular the large ambiguity of renormalization in theories without translation invariance has recently been removed (up to a few parameters) by requiring the generally covariant locality principle [31, 32, 8].

In connection with the renormalization group the following topics will be studied in a subsequent paper [6].
• The map $D$ of the Main Theorem (Theorem 4.2) scales homogeneously for the modified interacting fields only. For this reason most applications of this theorem given in Secs. 4 and 5 are restricted to the modified interacting fields. These results can be translated into statements about the original interacting fields by the transformation formula (2.37) (which holds also for $(D^{(m)}, D^{(m,\mu)})$).
• The generator of the Gell-Mann–Low Renormalization Group (i.e. the subgroup of $\mathcal{R}_{\rm adlim}$ induced by the scaling transformations on a given renormalization prescription $R$) is related to the $\beta$ function. The Gell-Mann–Low subgroups belonging to different renormalization prescriptions $R$ and $\hat R$ are conjugate to each other. The generator starts with a term of second order which is universal.
• The absorption of the $b$- and $c$-terms of (5.6) in a redefinition of the free theory (wave function and mass renormalization) requires that the physical predictions are independent of the splitting of the action in a free and an interacting part (where the free part is always quadratic in $\partial^a\varphi$ ($a \in \mathbb{N}_0^d$)). It turns out that the latter is an additional (re)normalization condition, which is part of the “Principle of Perturbative Agreement” required by Hollands and Wald [33].
• The scaling transformations are the bridge to Wilson’s renormalization group. This has to be investigated as well as the connection to the Buchholz–Verch scaling limit.
• Hollands and Wald made a corresponding analysis for curved spacetimes [32]. But the present formalism is not yet fully adapted to general Lorentzian spacetimes.

Appendix A. Mass Dependence of the Two-Point Function

To investigate the mass dependence of the two-point function $\Delta^{+(d)}_m(x_1 - x_2) \equiv \omega_0(\varphi(x_1) \star_m \varphi(x_2))$ in $d$ dimensions we compute the Fourier transformation^w
\[ \Delta^{+(d)}_m(y) = \frac{1}{(2\pi)^{d-1}} \int d^dp\; \Theta(p^0)\,\delta(p^2 - m^2)\, e^{-ipy} \tag{A.1} \]
in the sense of distributions. We perform the $p^0$-integration and use
\[ \int d^{d-1}\vec p\;\cdots = |S_{d-3}| \int_0^\infty dp\, p^{d-2} \int_0^\pi d\theta\,(\sin\theta)^{d-3}\cdots, \tag{A.2} \]
where
\[ |S_k| = \frac{2\pi^{\frac{k+1}{2}}}{\Gamma\big(\frac{k+1}{2}\big)} \tag{A.3} \]
is the surface of the unit ball in $\mathbb{R}^{k+1}$. With $y \equiv (t, \vec y)$, $r \equiv |\vec y|$ this gives
\[ \Delta^{+(d)}_m(y) = \frac{|S_{d-3}|}{(2\pi)^{d-1}} \int_0^\pi d\theta\,(\sin\theta)^{d-3} \int_0^\infty dp\, p^{d-2}\, e^{ipr\cos\theta}\, \frac{e^{-i\omega t}}{2\omega}\bigg|_{\omega=\sqrt{p^2+m^2}}. \tag{A.4} \]
It is well known that $\Delta^{+(d)}_m$ is the limit of a function which is analytic in the forward tube $\mathbb{R}^d - iV_+^{(d)}$. (This is e.g. a consequence of the Wightman axioms.) Taking additionally Lorentz covariance into account we conclude that $\Delta^{+(d)}_m$ is of the form
\[ \Delta^{+(d)}_m(y) = \lim_{\epsilon\to 0} f(y^2 - iy^0\epsilon), \tag{A.5} \]
where $f(z)$ is analytic for $z \in \mathbb{C}\setminus U$ for some $U \subset \mathbb{R}$. With that it suffices to compute $\Delta^{+(d)}_m$ for $y = (t, \vec 0)$, $t > 0$. Namely, for $d = 3$ we obtain
\begin{align*}
\Delta^{+(3)}_m(t, \vec 0) &= \frac{1}{4\pi} \int_0^\infty dp\, p\, \frac{e^{-i\omega t}}{\omega}\bigg|_{\omega=\sqrt{p^2+m^2}} = \frac{i}{4\pi}\frac{\partial}{\partial t} \int_0^\infty dp\, p\, \frac{e^{-i\omega t}}{\omega^2} \\
&= \frac{i}{4\pi}\frac{\partial}{\partial t} \int_{mt}^\infty du\, \frac{e^{-iu}}{u} = \frac{-i}{4\pi}\,\frac{e^{-imt}}{t}. \tag{A.6}
\end{align*}
Due to (A.5) $\Delta^{+(3)}_m(y)$ is obtained for arbitrary $y$ by replacing $it$ (in the latter formula) by $\sqrt{-(y^2 - iy^0 0)}$. This gives
\[ \Delta^{+(3)}_m(y) = \frac{1}{4\pi\sqrt{-(y^2 - iy^0 0)}}\; e^{-m\sqrt{-(y^2 - iy^0 0)}}. \tag{A.7} \]
^w An analogous (unpublished) computation of the commutator function by Rehren and M. D. was very helpful for writing essential parts of this Appendix.
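The angular decomposition (A.2) together with the formula (A.3) can be sanity-checked numerically: integrating the polar angle against $|S_{d-3}|$ must reproduce the full angular volume $|S_{d-2}|$ of the $(d-1)$-dimensional momentum space. The following Python sketch is an illustration only; the function names `sphere_area` and `polar_integral` and the simple midpoint quadrature are choices made here, not part of the original computation.

```python
import math


def sphere_area(k):
    """|S_k| = 2 pi^((k+1)/2) / Gamma((k+1)/2), cf. (A.3)."""
    return 2 * math.pi ** ((k + 1) / 2) / math.gamma((k + 1) / 2)


def polar_integral(power, steps=20000):
    """Midpoint rule for the integral of sin(theta)**power over [0, pi]."""
    h = math.pi / steps
    return h * sum(math.sin((i + 0.5) * h) ** power for i in range(steps))


for d in range(3, 8):
    full = sphere_area(d - 2)                           # angular volume in d-1 dimensions
    split = sphere_area(d - 3) * polar_integral(d - 3)  # decomposition used in (A.2)
    print(d, full, split)
```

For $d = 3$ and $d = 4$ the two columns reduce to the familiar values $2\pi$ and $4\pi$.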
Analogously, for $d = 4$ one obtains
\[ \Delta^{+(4)}_m(y) = \frac{-1}{4\pi^2(y^2 - iy^0 0)} + \log\big(-m^2(y^2 - iy^0 0)\big)\, m^2 f(m^2y^2) + m^2 F(m^2y^2), \tag{A.8} \]
where $f$ and $F$ are analytic functions, see e.g. [3, Sec. 15.1]. $f$ can be expressed in terms of the Bessel function $J_1$ of order 1, namely
\[ f(z) \equiv \frac{1}{8\pi^2\sqrt z}\, J_1(\sqrt z) = \sum_{k=0}^\infty C_k z^k, \quad C_k \in \mathbb{R}; \tag{A.9} \]
and $F$ is given by a power series
\[ F(z) \equiv -\frac{1}{4\pi} \sum_{k=0}^\infty \{\psi(k+1) + \psi(k+2)\}\, \frac{(-z/4)^k}{k!\,(k+1)!}, \tag{A.10} \]
where the Psi-function is related to the Gamma-function by $\psi(x) \equiv \Gamma'(x)/\Gamma(x)$. We see that $\Delta^{+(3)}_m$ is smooth in $m \ge 0$, but $\Delta^{+(4)}_m$ is not smooth at $m = 0$ (it is only continuously differentiable)! However,
\[ H^{\mu\,(4)}_m(y) \equiv \Delta^{+(4)}_m(y) - m^2 f(m^2y^2)\,\log(m^2/\mu^2) \tag{A.11} \]
(where $\mu > 0$ is a fixed mass parameter) is smooth in $m \ge 0$. In addition,
• $H^{\mu\,(4)}_m(y) - \Delta^{+(4)}_m(y)$ is a smooth function of $y$, i.e. the wave front sets of $H^{\mu\,(4)}_m(y)$ and $\Delta^{+(4)}_m(y)$ agree and, hence, $(H^{\mu\,(4)}_m(y))^k$, $k \in \mathbb{N}$ exists;
• the antisymmetric part of $H^{\mu\,(4)}_m$ is the same as for $\Delta^{+(4)}_m$ (namely $= i\Delta^{(4)}_m/2$, where $\Delta^{(d)}_m$ is the commutator function);
• $H^{\mu\,(4)}_m$ is Poincaré invariant;
• $H^{\mu\,(4)}_m$ satisfies the Klein–Gordon equation since $(\Box_y + m^2) f(m^2y^2) = 0$ (by using Bessel's differential equation);
• $H^{\mu\,(4)}_m$ does not scale homogeneously:
\[ \rho^2 H^{\mu\,(4)}_{\rho^{-1}m}(\rho y) - H^{\mu\,(4)}_m(y) = \log(\rho)\, 2m^2 f(m^2y^2); \tag{A.12} \]
• and $H^{\mu\,(4)}_{m=0} = \Delta^{+(4)}_{m=0}$.

To investigate the mass dependence of the two-point function in dimensions $d \ge 5$ we derive a recursion relation which expresses $\Delta^{+(d+2)}_m$ in terms of $\Delta^{+(d)}_m$. From (A.4) we find
\begin{align*}
(\partial_r^2 - \partial_t^2 - m^2)\,\Delta^{+(d)}_m(y) &= \frac{|S_{d-3}|}{(2\pi)^{d-1}} \int_0^\pi d\theta\,(\sin\theta)^{d-1} \int_0^\infty dp\, p^{d}\, e^{ipr\cos\theta}\, \frac{e^{-i\omega t}}{2\omega} \\
&= (2\pi)^2\, \frac{|S_{d-3}|}{|S_{d-1}|}\, \Delta^{+(d+2)}_m(y). \tag{A.13}
\end{align*}
By using $\Box^{(d)}_y = (\partial_t^2 - \partial_r^2 - \frac{d-2}{r}\partial_r + \text{derivatives with respect to the angles of } \vec y)$ (the latter vanish in $\Delta^{+(d)}_m$) and $(\Box^{(d)} + m^2)\Delta^{+(d)}_m = 0$ we obtain
\[ \Delta^{+(d+2)}_m(y) = \frac{-1}{2\pi r}\,\partial_r \Delta^{+(d)}_m(y). \tag{A.14} \]
Because of $\rho^{d-2}\, \Delta^{+(d)}_{\rho^{-1} m}(\rho y) = \Delta^{+(d)}_m(y)$ and Poincaré invariance, $\Delta^{+(d)}_m$ is of the form
$$\Delta^{+(d)}_m(y) = m^{d-2}\, F^{(d)}\bigl(m^2 (y^2 - iy^0 0)\bigr) . \tag{A.15}$$
With that we obtain
$$\Delta^{+(d+2)}_m(y) = \frac{1}{2\pi (y^2 - iy^0 0)}\, (m\partial_m + 2 - d)\, \Delta^{+(d)}_m(y) . \tag{A.16}$$
The explicit formulas for $\Delta^{+(3)}_m$, $H^{\mu\,(4)}_m$, $\Delta^{+(4)}_m$ and the recursion relation (A.16) imply that

• (in odd dimensions) $\Delta^{+(2l+1)}_m$ is smooth in $m \ge 0$;
• (in even dimensions) $\Delta^{+(2l)}_m$ contains a term which behaves as $m^{2(l-1)} \log(m^2/\mu^2)$ for $m \to 0$;
• $$H^{\mu\,(4+2k)}_m(y) \equiv \Delta^{+(4+2k)}_m - \pi^{-k}\, m^{2(k+1)}\, f^{(k)}(m^2 y^2)\, \log(m^2/\mu^2) \tag{A.17}$$
(where $f^{(k)}$ is the $k$th derivative of $f$ (A.9)) is smooth in $m \ge 0$ and has the same properties as $H^{\mu\,(4)}_m$: its antisymmetric part is $= i\Delta^{(d)}_m/2$ (where $d \equiv 4 + 2k$), it is Poincaré invariant, $\mathrm{WF}(H^{\mu\,(d)}_m) = \mathrm{WF}(\Delta^{+(d)}_m)$, it solves the Klein–Gordon equation (since $(\Box^{(4+2k)}_y + m^2)\, f^{(k)}(m^2 y^2) = 0$), it scales almost homogeneously with degree $(d - 2)$ and power 1, and $H^{\mu\,(d)}_{m=0} = \Delta^{+(d)}_{m=0}$. Due to the statements about the antisymmetric part, Poincaré invariance and the wave front set, $H^{\mu\,(d)}_m$ can be used for the definition (2.6) of the $*$-product.
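The two recursion relations (A.14) and (A.16) can be cross-checked numerically in the simplest case $d = 3$. The sketch below is only an illustration and makes several assumptions that are not part of the text: it restricts to spacelike points $y = (0, \vec{y}\,)$ with $r = |\vec{y}\,|$, so that (A.7) reduces to $e^{-mr}/(4\pi r)$ and $y^2 = -r^2$; it replaces derivatives by central differences; and the function names are hypothetical.

```python
import math


def delta3(m: float, r: float) -> float:
    """(A.7) at a spacelike point y = (0, vec y) with |vec y| = r, where sqrt(-y^2) = r."""
    return math.exp(-m * r) / (4.0 * math.pi * r)


def delta5_from_A14(m: float, r: float, h: float = 1e-5) -> float:
    """(A.14) with d = 3: -(1 / (2 pi r)) d/dr Delta^{+(3)}, central difference in r."""
    d_dr = (delta3(m, r + h) - delta3(m, r - h)) / (2.0 * h)
    return -d_dr / (2.0 * math.pi * r)


def delta5_from_A16(m: float, r: float, h: float = 1e-5) -> float:
    """(A.16) with d = 3 and y^2 = -r^2: (m d/dm + 2 - d) Delta^{+(3)} / (2 pi y^2)."""
    d_dm = (delta3(m + h, r) - delta3(m - h, r)) / (2.0 * h)
    return (m * d_dm + (2 - 3) * delta3(m, r)) / (2.0 * math.pi * (-r * r))


def delta5_closed(m: float, r: float) -> float:
    """Closed form obtained by differentiating by hand: (1 + m r) e^{-m r} / (8 pi^2 r^3)."""
    return (1.0 + m * r) * math.exp(-m * r) / (8.0 * math.pi ** 2 * r ** 3)
```

Both routes give the same function, and the closed form is visibly smooth in $m \ge 0$, in line with the statement about odd dimensions.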
Appendix B. Extension of a Distribution to a Point

We review in this Appendix the proofs of Theorem 3.2 and Proposition 3.3 (given in [7] and [31] respectively) and add some completions. Similar or related techniques can be found in the older works [26, 42, 18].

In Theorem 3.2(a) the extension is obtained by the following limit: let $\chi$ be a smooth function on $\mathbb{R}^k$ such that $0 \le \chi \le 1$, $\chi(x) = 0$ for $|x| < 1$ and $\chi(x) = 1$ for $|x| > 2$. One can show that the following limit
$$(t, h) \stackrel{\text{def}}{=} \lim_{\rho\to\infty}\, (t^\circ(x), \chi(\rho x) h(x)) \tag{B.1}$$
(note $\chi(\rho x) h(x) \in \mathcal{D}(\mathbb{R}^k\setminus\{0\})$) exists, and that the so-defined $t$ fulfils $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$.

The construction of the extensions in the most interesting case $k \le \mathrm{sd}(t^\circ) < \infty$ (Theorem 3.2(b)) proceeds as follows. Let
$$\omega \stackrel{\text{def}}{=} \mathrm{sd}(t^\circ) - k , \qquad \mathcal{D}_\omega(\mathbb{R}^k) \stackrel{\text{def}}{=} \{h \in \mathcal{D}(\mathbb{R}^k) \mid \partial^a h(0) = 0 \ \ \forall |a| \le \omega\} . \tag{B.2}$$
We will see that $t^\circ$ has a unique extension $t_\omega$ to $\mathcal{D}_\omega \equiv \mathcal{D}_\omega(\mathbb{R}^k)$ and that each projector $W$ from $\mathcal{D} \equiv \mathcal{D}(\mathbb{R}^k)$ onto $\mathcal{D}_\omega$ yields an extension $t \in \mathcal{D}'(\mathbb{R}^k)$ (with $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$) by $(t, h) \stackrel{\text{def}}{=} (t_\omega, W h)$ (which is called "$W$-extension"). There are many possibilities to construct such a projector $W$ or, equivalently, to choose a corresponding complementary space $E = \mathrm{ran}(1 - W)$ of $\mathcal{D}_\omega$ in $\mathcal{D}$.
(By "complementary space" we mean: $\mathcal{D} = \mathcal{D}_\omega \oplus E$.) The following lemma gives a parametrization of these possibilities in terms of a set of functions:

Lemma B.1 ([7]). (a) For any set of functions
$$\{w_a \in \mathcal{D} \mid a \in \mathbb{N}_0^k,\ |a| \le \omega,\ \partial^b w_a(0) = \delta_a^b \ \ \forall b \in \mathbb{N}_0^k\} , \tag{B.3}$$
the linear map
$$W : \mathcal{D} \to \mathcal{D} : \quad W h(x) = h(x) - \sum_{|a|\le\omega} (\partial^a h)(0)\, w_a(x) \tag{B.4}$$
is a projector onto $\mathcal{D}_\omega$.
(b) Conversely, given a projector $W$ from $\mathcal{D}$ onto $\mathcal{D}_\omega$ (or equivalently a complementary space $E$ of $\mathcal{D}_\omega$ in $\mathcal{D}$), then there exist functions $(w_a)_a$ with the properties (B.3), such that $W$ can be expressed in terms of the $(w_a)_a$ by (B.4).

An example for the functions $(w_a)_{|a|\le\omega}$ is $w_a(x) = \frac{x^a}{a!}\, w(x)$ where $w \in \mathcal{D}(\mathbb{M})$ and $w|_U \equiv 1$ for some neighborhood $U$ of $x = 0$.
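To make Lemma B.1(a) concrete, the following sketch builds the map (B.4) in one dimension ($k = 1$) with the example functions $w_a(x) = \frac{x^a}{a!} w(x)$ and $\omega \le 2$. It is an illustration under stated assumptions, not the construction used in [7]: the particular bump function `w`, the finite-difference estimate of the derivatives at the origin and all function names are choices made here.

```python
import math


def smooth_step(t: float) -> float:
    """C-infinity step: 0 for t <= 0 and 1 for t >= 1."""
    def s(u: float) -> float:
        return math.exp(-1.0 / u) if u > 0.0 else 0.0
    return s(t) / (s(t) + s(1.0 - t))


def w(x: float) -> float:
    """Bump function with w = 1 for |x| <= 1 and w = 0 for |x| >= 2."""
    return 1.0 - smooth_step(abs(x) - 1.0)


def jet_at_zero(h, step: float = 1e-3):
    """Approximate (h(0), h'(0), h''(0)) by central differences."""
    h0, hp, hm = h(0.0), h(step), h(-step)
    return [h0, (hp - hm) / (2.0 * step), (hp - 2.0 * h0 + hm) / step ** 2]


def W(h, omega: int = 2):
    """(B.4) for k = 1 with w_a(x) = x^a / a! * w(x), a = 0..omega (omega <= 2 only)."""
    coeffs = jet_at_zero(h)[: omega + 1]

    def Wh(x: float) -> float:
        taylor = sum(c * x ** a / math.factorial(a) for a, c in enumerate(coeffs))
        return h(x) - taylor * w(x)

    return Wh


def h(x: float) -> float:
    """Sample test function: equals exp(x) near 0 and vanishes for |x| >= 2."""
    return math.exp(x) * w(x)
```

Applying `W` to `h` removes the Taylor jet up to order $\omega$ at the origin, and applying `W` a second time changes nothing up to discretization error, which is the projector property.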
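Proof.$^{x}$ (a) is obvious. To prove (b) we first show that there exists a basis $(w_a)_{|a|\le\omega}$ of the vector space $E = \mathrm{ran}(1 - W)$ with $\partial^b w_a(0) = \delta_a^b$. The decomposition $\mathcal{D} = \mathcal{D}_\omega \oplus E$ induces a decomposition of the dual space $\mathcal{D}' = \mathcal{D}_\omega' \oplus \mathcal{D}_\omega^\perp$ by the prescriptions $(f_2, h_1) = 0 \wedge (f_1, h_2) = 0$, $\forall f_2 \in \mathcal{D}_\omega'$, $h_1 \in E$, $f_1 \in \mathcal{D}_\omega^\perp$, $h_2 \in \mathcal{D}_\omega$. A basis of $\mathcal{D}_\omega^\perp$ is given by $(\partial^a \delta)_{|a|\le\omega}$. We define $((-1)^{|a|} w_a)_{|a|\le\omega}$ to be the dual basis (in $E$), and it obviously has the properties (B.3).

So, for any $h \in \mathcal{D}$, $(1 - W)h$ can be written as $(1 - W)h(x) = \sum_a c^a w_a(x)$ with $c^a \in \mathbb{C}$, and we find $\partial^b h(0) = \partial^b (1 - W)h(0) = c^b$, $|b| \le \omega$. Hence, $W h(x) = h(x) - \sum_a (\partial^a h)(0)\, w_a(x)$.

We split any $h \in \mathcal{D}$ into $h = h_1 + h_2$, $h_1 = \sum_{|a|\le\omega} (\partial^a h)(0)\, w_a \in E$, $h_2 \in \mathcal{D}_\omega$. $h_2$ has the form
$$W h(x) \equiv h_2(x) = \sum_{|a|=[\omega]+1} x^a g_a(x) , \quad \text{with } g_a \in \mathcal{D} . \tag{B.5}$$
This decomposition of $h_2$ is non-unique in general, however, we will see that this does not matter. From (3.6) we recall $\mathrm{sd}(x^b f) \le \mathrm{sd}(f) - |b|$, $\forall f \in \mathcal{D}'(\mathbb{R}^k)$ or $\mathcal{D}'(\mathbb{R}^k\setminus\{0\})$. For $|a| = [\omega] + 1$ we find $\mathrm{sd}(x^a t^\circ) \le \omega + k - ([\omega] + 1) < k$. Therefore, from Theorem 3.2(a) we know that $x^a t^\circ \in \mathcal{D}'(\mathbb{R}^k\setminus\{0\})$ has a unique extension $\overline{x^a t^\circ} \in \mathcal{D}'(\mathbb{R}^k)$. Now we define $t \in \mathcal{D}'(\mathbb{R}^k)$ by
$$(t, h) \stackrel{\text{def}}{=} \sum_{|a|=[\omega]+1} (\overline{x^a t^\circ}, g_a) + \sum_{|a|\le\omega} C_a\, (\partial^a h)(0) , \quad h \in \mathcal{D}(\mathbb{R}^k) , \tag{B.6}$$

Footnote x: The idea of this proof is given in [7].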
where the $C_a \in \mathbb{C}$ are arbitrary constants. By means of (B.1) we find for the first term
$$(t, W h) = \sum_{|a|=[\omega]+1} (\overline{x^a t^\circ}, g_a) = \lim_{\rho\to\infty}\, (t^\circ(x), \chi(\rho x)\, W h(x)) . \tag{B.7}$$
Hence, $\sum_{|a|=[\omega]+1} (\overline{x^a t^\circ}, g_a)$ is independent of the choice of the decomposition (B.5). Obviously, $t$ is an extension of $t^\circ$, and in [7] it is proved that $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$. $t|_{\mathcal{D}_\omega} \equiv t_\omega$ is uniquely fixed (B.7), but $t|_E$ is completely arbitrary, namely $(t, w_a) = C_a$ can be arbitrarily chosen. This is the freedom of normalization in perturbative renormalization (3.8), which gives rise to the renormalization group (see Sec. 4).

In Theorem 3.2(c) there exists a linear functional $\bar{t}$ on $\mathcal{D}(\mathbb{R}^k)$ which fulfils $\bar{t}(h) = (t^\circ, h)$ $\forall h \in \mathcal{D}(\mathbb{R}^k\setminus\{0\})$, according to the Hahn–Banach theorem. But $\bar{t}$ is not continuous, i.e. it is not a distribution.

It is useful to know that given an extension $t$ of $t^\circ$, there exists a projector $W$ from $\mathcal{D}$ onto $\mathcal{D}_\omega$ such that $t = t \circ W$ (i.e. all constants $C_a$ in (B.6) vanish); in detail:
Lemma B.2. Let $\omega \stackrel{\text{def}}{=} \mathrm{sd}(t^\circ) - k$ and let an extension $t$ of $t^\circ$ be given with $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$.
(a) Then there exists a complementary space $E$ of $\mathcal{D}_\omega$ in $\mathcal{D}$ with $t|_E = 0$.
(b) There exist functions $w_a \in \mathcal{D}$, $a \in \mathbb{N}_0^k$, $|a| \le \omega$ with $\partial^b w_a(0) = \delta_a^b$ and $t = t \circ W$, where $W$ is given in terms of the $(w_a)_a$ by (B.4).

Proof. By Lemma B.1(b), the statement (b) is a consequence of (a). To prove (a) let $E_1$ be a complementary space of $\mathcal{D}_\omega$ in $\mathcal{D}$ and $W_1$ the corresponding projector on $\mathcal{D}_\omega$. We choose a $g \in \mathcal{D}_\omega$ with $(t, g) = 1$. Now we set
$$E \stackrel{\text{def}}{=} \{k - (t, k)\, g \mid k \in E_1\} . \tag{B.8}$$
Obviously $E$ is a vector space and it holds that $t|_E = 0$. To see $\mathcal{D} = \mathcal{D}_\omega + E$ we decompose any $h \in \mathcal{D}$ into $h = h_1 + h_2$, $h_1 \in E_1$, $h_2 \in \mathcal{D}_\omega$. Then, $h = (h_1 - (t, h_1) g) + (h_2 + (t, h_1) g)$, $(h_1 - (t, h_1) g) \in E$, $(h_2 + (t, h_1) g) \in \mathcal{D}_\omega$. It remains to show $E \cap \mathcal{D}_\omega = \{0\}$. Let $l \in E \cap \mathcal{D}_\omega$. So, $l = k - (t, k) g$ for some $k \in E_1$, and on the other hand $l = W_1 l = W_1 k - (t, k) W_1 g = -(t, k) g$. We find $k = 0$ and hence $l = 0$.

Obviously the $W$-extension (B.7) is a non-local renormalization prescription: it depends on $t^\circ|_{\mathcal{D}(U)}$ where $U := \cup_{|a|\le\omega}\, \mathrm{supp}\, w_a$. In contrast the condition of almost homogeneous scaling ensures that the extension depends on the short distance behavior of $t^\circ$ only. We now prove that the latter condition can be maintained, following to a large extent [31].

Proof of Proposition 3.3. Let $t_1$ be any extension of $t^\circ$ with $\mathrm{sd}(t_1) = \mathrm{sd}(t^\circ) = D$. Since $t^\circ$ scales almost homogeneously (with power $N$) the support of $(x\partial_x + D)^{N+1} t_1(x)$ must be contained in $\{0\}$. In addition it holds that
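$\mathrm{sd}((x\partial_x + D)^{N+1} t_1) = \mathrm{sd}(t_1) = D$, and hence
$$\Bigl(\sum_r x_r \partial_{x_r} + D\Bigr)^{N+1} t_1(x) = \sum_{|a|\le D-k} C_a\, \partial^a \delta^{(k)}(x) . \tag{B.9}$$
We will frequently use
$$\Bigl(\sum_r x_r \partial_{x_r} + D\Bigr)\, \partial^a \delta^{(k)}(x) = (D - k - |a|)\, \partial^a \delta^{(k)}(x) . \tag{B.10}$$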
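For the reader's convenience, here is a short derivation of (B.10); it is added as a reading aid and is not taken from the source. Pairing with a test function $h \in \mathcal{D}(\mathbb{R}^k)$ and integrating by parts, one finds

```latex
\begin{align*}
\Bigl(\sum_r x_r \partial_{x_r}\, \partial^a \delta^{(k)},\, h\Bigr)
  &= -\Bigl(\partial^a \delta^{(k)},\, \sum_r \partial_{x_r}(x_r h)\Bigr)
   = -(-1)^{|a|}\, \partial^a\Bigl(k\,h + \sum_r x_r \partial_{x_r} h\Bigr)(0) \\
  &= -(-1)^{|a|}\,(k + |a|)\,(\partial^a h)(0)
   = -(k + |a|)\,\bigl(\partial^a \delta^{(k)},\, h\bigr) ,
\end{align*}
```

where the step to the second line uses the Leibniz rule in the form $\partial^a (x_r\, \partial_{x_r} h)(0) = a_r\, (\partial^a h)(0)$, summed over $r$. Adding $D\, \partial^a \delta^{(k)}$ on both sides gives the factor $(D - k - |a|)$.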
• For $D \notin \mathbb{N}_0 + k$ we may set
$$t \stackrel{\text{def}}{=} t_1 - \sum_{|a|\le D-k} \frac{C_a}{(D - k - |a|)^{N+1}}\, \partial^a \delta^{(k)} . \tag{B.11}$$
This is an extension which maintains even the power of the almost homogeneous scaling.
• If $D \in \mathbb{N}_0 + k$ the subtraction of the $\partial^a \delta$-terms in (B.11) does not work for $|a| = D - k$. We can only perform the finite renormalization
$$t \stackrel{\text{def}}{=} t_1 - \sum_{|a| < D-k} \frac{C_a}{(D - k - |a|)^{N+1}}\, \partial^a \delta^{(k)} . \tag{B.12}$$
With that
$$\Bigl(\sum_r x_r \partial_{x_r} + D\Bigr)^{N+1} t = \sum_{|a| = D-k} C_a\, \partial^a \delta^{(k)} . \tag{B.13}$$
However, applying the operator $(x\partial_x + D)$ once more we get zero, i.e. $t$ scales almost homogeneously with power $\le N + 1$.

The statements about the uniqueness of $t$ are obvious, because $\partial^a \delta^{(k)}$ scales homogeneously with degree $(k + |a|)$.

How does one find an extension $t$ of $t^\circ$ (with $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$) in practice? For $\mathrm{sd}(t^\circ) < k$ this is trivial: $t$ is given by the same formula, the domain may be extended by continuity (B.1). But for $k \le \mathrm{sd}(t^\circ) < \infty$ the map $W$ (B.4) gets complicated in explicit calculations. (Exceptions are the purely massive theories in which one may choose $w_a = \frac{x^a}{a!} \notin \mathcal{D}$ in (B.3) and (B.4); this gives the "central solution" of Epstein and Glaser [18].) A construction of $R_{1,1}$ (or equivalently $T_2$) is given in Appendix C. It uses the Källén–Lehmann representation of the commutator of two Wick polynomials, and hence, it is unclear how to generalize this method to higher orders.

It seems that differential renormalization [21, 35, 39] is a practicable way to trace back the case $k \le \mathrm{sd}(t^\circ) < \infty$ to the trivial case $\mathrm{sd}(t^\circ) < k$ in arbitrarily high orders. The idea is to write $t^\circ$ as a derivative of a distribution $f^\circ \in \mathcal{D}'(\mathbb{R}^k\setminus\{0\})$ with $\mathrm{sd}(f^\circ) < k$; more precisely
$$t^\circ = D f^\circ \quad \text{where} \quad D = \sum_{|a|=l} C_a\, \partial^a \quad (C_a \in \mathbb{C}) \tag{B.14}$$
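such that $\mathrm{sd}(f^\circ) = \mathrm{sd}(t^\circ) - l < k$. Let $f \in \mathcal{D}'(\mathbb{R}^k)$ be the unique extension of $f^\circ$ with $\mathrm{sd}(f) = \mathrm{sd}(f^\circ)$. Then,
$$t \stackrel{\text{def}}{=} D f \tag{B.15}$$
solves the extension problem with $\mathrm{sd}(t) = \mathrm{sd}(t^\circ)$. The non-uniqueness of $t$ shows up in the non-uniqueness of $f^\circ$: one may add to $f^\circ$ a distribution $g^\circ \in \mathcal{D}'(\mathbb{R}^k\setminus\{0\})$ with $\mathrm{sd}(g^\circ) = \mathrm{sd}(f^\circ)$ and $D g^\circ = 0$ on $\mathbb{R}^k\setminus\{0\}$. The (unique) extension $g$ of $g^\circ$ (with $\mathrm{sd}(g) = \mathrm{sd}(g^\circ)$) fulfils $\mathrm{supp}\, D g \subset \{0\}$ and, hence, the addition of $g^\circ$ can change $t$ (B.15) only by a local term. For example let $\mathrm{sd}(t^\circ) = k$ and $D = \Box$. Then, $g$ is of the form: $g = \alpha D^{\mathrm{ret}} + (\text{solution of the homogeneous differential equation})$ $(\alpha \in \mathbb{C})$, and this yields $t_{\mathrm{new}} \equiv D(f + g) = t_{\mathrm{old}} + \alpha\, \delta$.

Example. Differential renormalization of the massless fish and setting-sun diagram
$$\begin{aligned}
r_0^\circ(y) &= j_0(y)\, \Theta(-y^0) , & j_0(y) &\equiv \frac{1}{(y^2 - iy^0 0)^2} - \frac{1}{(y^2 + iy^0 0)^2} , \\
r_1^\circ(y) &= j_1(y)\, \Theta(-y^0) , & j_1(y) &\equiv \frac{1}{(y^2 - iy^0 0)^3} - \frac{1}{(y^2 + iy^0 0)^3} ,
\end{aligned} \tag{B.16}$$
and of
$$r_2^\circ(y) = j_2(y)\, \Theta(-y^0) , \qquad j_2(y) \equiv \frac{\log(-\mu^2 (y^2 - iy^0 0))}{(y^2 - iy^0 0)^2} - (y \to -y) \tag{B.17}$$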
for $d = 4$, cf. Appendix C. These $r^\circ$-distributions appear in the Example (3.19)–(3.22).$^{y}$ In agreement with Lemma 2.3(b) the $j_l$'s have support in $\bar{V}_+ \cup \bar{V}_-$. We are looking for distributions $J_l$ with
$$\mathrm{sd}(J_l) < 4 , \qquad \mathrm{supp}\, J_l \subset (\bar{V}_+ \cup \bar{V}_-) \qquad \text{and} \qquad j_l = D_l J_l \tag{B.18}$$
where $D_l$ is a power of the wave operator. Due to the lowered scaling degree of $J_l$, the product $J_l(y)\, \Theta(-y^0)$ exists in $\mathcal{D}'(\mathbb{R}^4)$ and one easily verifies that
$$r_l(y) \stackrel{\text{def}}{=} D_l \bigl(J_l(y)\, \Theta(-y^0)\bigr) \tag{B.19}$$
is a Lorentz invariant extension of $r_l^\circ$ with the same scaling degree. With some trial and error one finds
$$\begin{aligned}
j_0(y) &= \Box_y \left[ \frac{-\log(-\mu^2 (y^2 - iy^0 0))}{4\, (y^2 - iy^0 0)} - (y \to -y) \right] , \\
j_1(y) &= \Box_y \Box_y \left[ \frac{-\log(-\mu^2 (y^2 - iy^0 0))}{32\, (y^2 - iy^0 0)} - (y \to -y) \right] , \\
j_2(y) &= \Box_y \left[ \frac{-(\log(-\mu^2 (y^2 - iy^0 0)))^2 - 2 \log(-\mu^2 (y^2 - iy^0 0))}{8\, (y^2 - iy^0 0)} - (y \to -y) \right] .
\end{aligned} \tag{B.20}$$

Footnote y: We omit constant pre-factors.
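The identities (B.20) can be spot-checked numerically away from the light cone. The sketch below rests on assumptions made here for illustration: it considers only the first term of each bracket (the $(y \to -y)$ term is analogous), it evaluates at a timelike point with $y^0 > 0$, where $\log(-\mu^2 (y^2 - iy^0 0)) = \log(\mu^2 s) + i\pi$ with $s = y^2 > 0$, and it uses that on functions of $s$ alone the wave operator in $d = 4$ acts as $4 s f''(s) + 8 f'(s)$. The double wave operator in $j_1$ is checked in two single steps. All names are hypothetical.

```python
import math


def radial_box(f, s: float, h: float = 1e-3) -> complex:
    """Wave operator in d = 4 on a function of s = y^2 only: 4 s f''(s) + 8 f'(s)."""
    d1 = (f(s + h) - f(s - h)) / (2.0 * h)
    d2 = (f(s + h) - 2.0 * f(s) + f(s - h)) / h ** 2
    return 4.0 * s * d2 + 8.0 * d1


MU = 1.0  # arbitrary scale, chosen here for the check only


def L(s: float) -> complex:
    """log(-mu^2 (y^2 - i y^0 0)) at a timelike point with y^0 > 0 and y^2 = s."""
    return math.log(MU ** 2 * s) + 1j * math.pi


def J0(s: float) -> complex:
    """First term of the bracket in the j_0 line of (B.20)."""
    return -L(s) / (4.0 * s)


def J1(s: float) -> complex:
    """First term of the bracket in the j_1 line of (B.20)."""
    return -L(s) / (32.0 * s)


def J2(s: float) -> complex:
    """First term of the bracket in the j_2 line of (B.20)."""
    return (-L(s) ** 2 - 2.0 * L(s)) / (8.0 * s)
```

At a sample point such as $s = 2$ one finds, up to discretization error, $\Box J_0 = 1/s^2$, $\Box J_1 = 1/(8 s^2)$ together with $\Box\, (1/(8 s^2)) = 1/s^3$, and $\Box J_2 = \log(\cdot)/s^2$, in agreement with (B.16) and (B.17).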
In the cases of the fish and the setting-sun diagram a scale $\mu > 0$ is introduced; this cannot be avoided by using other methods of renormalization either, cf. Appendix C. If we replaced $(-\mu^2)$ by $\mu^2$ in $J_0$ and $J_1$ the relation $j_l = D_l J_l$ would still hold, but $J_0$ and $J_1$ would have support in $\{y \mid y^2 \le 0\}$. This alternative possibility to fulfil $j_l = D_l J_l$ reflects the peculiarity that $j_0$ and $j_1$ vanish on $\mathcal{D}(\{y \mid y^2 > 0\})$. All $J_l$'s scale almost homogeneously with degree 2 and the corresponding power is the power of $\log(\mu^2 \cdots)$. We explicitly see that in all three examples the extension increases this power by 1. For the breaking of homogeneous scaling of the fish and setting-sun diagram we obtain
$$\rho^4 r_0(\rho y) - r_0(y) = i\pi \log\rho\; \Box_y \bigl(\Theta(-y^0)\, \delta(y^2)\bigr) = i 2\pi^2 \log\rho\; \delta(y) , \tag{B.21}$$
$$\rho^6 r_1(\rho y) - r_1(y) = \frac{i\pi}{8} \log\rho\; \Box_y \Box_y \bigl(\Theta(-y^0)\, \delta(y^2)\bigr) = \frac{i\pi^2}{4} \log\rho\; \Box\delta(y) , \tag{B.22}$$
where we use $\Theta(-y^0)\, \delta(y^2) \sim D^{\mathrm{ret}}(-y)$ and $\Box D^{\mathrm{ret}} = \delta$. In Appendix C these results are obtained by means of another method of renormalization, which is more straightforward.
Remark B.3. The construction in the proof of Theorem 3.2 yields also an extension $t$ of the given $t^\circ$ if one works in (B.2), (B.5) and (B.6) with an $\omega$ which is strictly greater than $(\mathrm{sd}(t^\circ) - k)$; but then it holds generically that $\mathrm{sd}(t) = \omega + k > \mathrm{sd}(t^\circ)$. (This is called an "over-subtracted" extension.)

Appendix C. Extension of Two-Point Functions

In this Appendix the number of spacetime dimensions is $d = 4$. The $x$-space method which we give here to renormalize the fish diagram
$$r_0(y) \stackrel{\text{def}}{=} r_{1,1}(\varphi^2, \varphi^2)(y) , \qquad y \equiv x_1 - x , \tag{C.1}$$
and the setting-sun diagram
$$r_1(y) \stackrel{\text{def}}{=} r_{1,1}(\varphi^3, \varphi^3)(y) , \tag{C.2}$$
can be used for arbitrary first-order terms $r_{1,1}(A_1, A_2)$, $A_1, A_2 \in \mathcal{P}$. (See footnote l for the notation.) We treat here the massless case, which needs additional care to avoid IR-divergences.

We first compute the fish diagram by following our inductive construction of Sec. 3. A straightforward calculation yields
$$j_0(y) \stackrel{\text{def}}{=} \omega_0\bigl([\varphi^2(x_1), \varphi^2(x)]_\star\bigr) = 2 \bigl(D^+(y)^2 - D^+(-y)^2\bigr) = \frac{i}{2 (2\pi)^2} \int_0^\infty dm^2\, \Delta_m(y) , \tag{C.3}$$
which is the Källén–Lehmann representation. We recall $\Delta_m(y) = \Delta^{\mathrm{ret}}_m(y) - \Delta^{\mathrm{ret}}_m(-y)$ with $(\Box + m^2)\, \Delta^{\mathrm{ret}}_m = \delta$ and $\mathrm{supp}\, \Delta^{\mathrm{ret}}_m \subset \bar{V}_+$. According to the axioms Causality, GLZ relation and Scaling the required distribution $r_0$ is determined by
$$\mathrm{supp}\, r_0 \subset \bar{V}_- , \qquad (r_0, h) = -i (j_0, h) \quad \forall h \in \mathcal{D}(\bar{V}_-\setminus\{0\}) \tag{C.4}$$
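and
$$(\rho\partial_\rho + 4)^2\, r_0(\rho y) = 0 . \tag{C.5}$$
If we replace $\Delta_m(y)$ in (C.3) by $-\Delta^{\mathrm{ret}}_m(-y)$ the $m^2$-integral becomes UV-divergent. However, for $\mu > 0$ one easily verifies that
$$r_0^{\,\mu}(y) \stackrel{\text{def}}{=} \frac{1}{2 (2\pi)^2}\, (\Box_y - \mu^2) \int_0^\infty dm^2\, \frac{\Delta^{\mathrm{ret}}_m(-y)}{m^2 + \mu^2} \tag{C.6}$$
exists as a distribution and solves (C.4). To avoid an IR-divergence, we need to introduce a scale $\mu > 0$, which breaks homogeneous scaling (because of $\rho^4 r_0^{\,\mu}(\rho y) = r_0^{\,\rho\mu}(y)$). To verify the scaling requirement (C.5) we compute
$$\begin{aligned}
\rho^4 r_0^{\,\mu}(\rho y) - r_0^{\,\mu}(y) &= r_0^{\,\rho\mu}(y) - r_0^{\,\mu}(y) \\
&= \frac{1}{2 (2\pi)^2} \left[ \bigl((\Box_y - \mu^2) + (1 - \rho^2)\mu^2\bigr) \int dm^2\, \frac{\Delta^{\mathrm{ret}}_m(-y)}{m^2 + \rho^2\mu^2} - (\Box_y - \mu^2) \int dm^2\, \frac{\Delta^{\mathrm{ret}}_m(-y)}{m^2 + \mu^2} \right] \\
&= \frac{1}{2 (2\pi)^2} \left[ \int dm^2\, (\Box_y - \mu^2)\, \Delta^{\mathrm{ret}}_m(-y) \left( \frac{1}{m^2 + \rho^2\mu^2} - \frac{1}{m^2 + \mu^2} \right) + (1 - \rho^2)\mu^2 \int dm^2\, \frac{\Delta^{\mathrm{ret}}_m(-y)}{m^2 + \rho^2\mu^2} \right] \\
&= \frac{1}{(2\pi)^2}\, \log\rho\; \delta(y) .
\end{aligned} \tag{C.7}$$
Hence, (C.5) is indeed satisfied for all $\mu > 0$. So we get explicitly the breaking of homogeneous scaling without really computing the integral (C.6). As a byproduct the calculation (C.7) shows explicitly that the choice of $\mu > 0$ is precisely the choice of the indeterminate parameter $C$ in the general solution $r_0(y) + C\delta(y)$.$^{z}$

We proceed analogously for the setting-sun diagram:
$$j_1(y) \stackrel{\text{def}}{=} \omega_0\bigl([\varphi^3(x_1), \varphi^3(x)]_\star\bigr) = 6 \bigl(D^+(y)^3 - D^+(-y)^3\bigr) = \frac{3i}{16 (2\pi)^4} \int_0^\infty dm^2\, m^2\, \Delta_m(y) . \tag{C.9}$$
Obviously,
$$r_1^{\,\mu_1,\mu_2}(y) = \frac{-3}{16 (2\pi)^4}\, (\Box_y - \mu_1^2)(\Box_y - \mu_2^2) \int_0^\infty dm^2\, \frac{m^2\, \Delta^{\mathrm{ret}}_m(-y)}{(m^2 + \mu_1^2)(m^2 + \mu_2^2)} \tag{C.10}$$

Footnote z: Note that
$$r_0^{\,\mu\nu}(y) \equiv \frac{-1}{2 (2\pi)^2}\, (-\Box_y + \mu^2)(-\Box_y + \nu^2) \int_0^\infty dm^2\, \frac{\Delta^{\mathrm{ret}}_m(-y)}{(m^2 + \mu^2)(m^2 + \nu^2)} = r_0^{\,\mu}(y) + \frac{1}{2 (2\pi)^2} \int_0^\infty dm^2\, \frac{1}{(m^2 + \mu^2)(m^2 + \nu^2)} \cdot (\Box - \mu^2)\, \delta(y) \tag{C.8}$$
solves also (C.4), but it violates (C.5): the term $\sim \Box\delta(y)$ scales homogeneously with degree 6 (instead of 4).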
(where $\mu_1, \mu_2 > 0$) solves (C.4). But we are looking for solutions $r_1$ of (C.4) which additionally scale almost homogeneously with degree 6 and power $\le 1$, i.e. $(\rho\partial_\rho + 6)^2\, r_1(\rho y) = 0$. For $r_1^{\,\mu,\mu}$ the breaking of homogeneous scaling is equal to
$$\begin{aligned}
\rho^6 r_1^{\,\mu,\mu}(\rho y) - r_1^{\,\mu,\mu}(y) &= \bigl(r_1^{\,\rho\mu,\rho\mu}(y) - r_1^{\,\mu,\rho\mu}(y)\bigr) + \bigl(r_1^{\,\mu,\rho\mu}(y) - r_1^{\,\mu,\mu}(y)\bigr) \\
&= \frac{-3}{16 (2\pi)^4} \Bigl[ -\Box_y \delta(y)\, (\log\rho^2) + \delta(y)\, \mu^2 (\rho^2 - 1) \Bigr] ,
\end{aligned} \tag{C.11}$$
where the method (C.7) is used twice. We see that $r_1^{\,\mu,\mu}$ violates our scaling condition and, by a generalization of the calculation (C.11), one finds that this holds true even for all $r_1^{\,\mu_1,\mu_2}$, $(\mu_1, \mu_2) \in \mathbb{R}_+ \times \mathbb{R}_+$. However, from the result (C.11) we read off that
$$r_1(y) \stackrel{\text{def}}{=} r_1^{\,\mu,\mu}(y) + \frac{3}{16 (2\pi)^4}\, \mu^2\, \delta(y) + C_2\, \Box\delta(y) \tag{C.12}$$
(where $C_2 \in \mathbb{R}$ is arbitrary) fulfils our requirements. (C.12) is the most general solution which is additionally Lorentz invariant and unitary. For the breaking of homogeneous scaling we obtain
$$\rho^6 r_1(\rho y) - r_1(y) = \frac{3}{8 (2\pi)^4}\, (\log\rho)\, \Box\delta(y) . \tag{C.13}$$
A general fact shows up in the results (C.7) and (C.13) (which is also valid for $m > 0$): the breaking of homogeneous scaling is independent of the normalization (i.e. of the choice of $\mu$ in (C.6) and $C_2$ in (C.12)), because the undetermined polynomial $\sum_{|a|+l=\omega} C_{a,l}\, m^l\, \partial^a \delta(y)$ scales homogeneously.

Appendix D. Maintenance of Symmetries in the Extension of Distributions

In contrast to a large part of the literature, in this Appendix we work with our normalization conditions Smoothness in $m \ge 0$ and Scaling, instead of the upper bound (3.7) on the scaling degree. However, by obvious modifications, the procedure given here can just as well be based on the latter normalization condition.

We investigate the question whether symmetries can be maintained in the process of renormalization. Or in mathematical terms: given a $t^\circ \equiv t^{(m)\circ} \in \mathcal{D}'(\mathbb{R}^k\setminus\{0\})$ which is smooth in $m \ge 0$ and scales almost homogeneously with degree $D$ and power $N$ (3.9), does there exist an extension $t \equiv t^{(m)} \in \mathcal{D}'(\mathbb{R}^k)$ with the same symmetries and smoothness (in $m$) as $t^\circ$ and which scales almost homogeneously with degree $D$ and power $\le (N + 1)$?

Existence of a symmetric extension. Let $V$ be a representation of a group $G$ on $\mathcal{D}(\mathbb{R}^k)$ under which $\mathcal{D}(\mathbb{R}^k\setminus\{0\})$ and $t^\circ$ are invariant,
$$(t^\circ, V(g) h) = (t^\circ, h) \qquad \forall h \in \mathcal{D}(\mathbb{R}^k\setminus\{0\}) , \quad g \in G . \tag{D.1}$$
We denote by $V^T$ the transposed representation of $G$ on $\mathcal{D}'(\mathbb{R}^k)$
$$(V^T(g) s, h) = (s, V(g^{-1}) h) , \qquad s \in \mathcal{D}'(\mathbb{R}^k) , \tag{D.2}$$
and we additionally assume that smoothness in $m \ge 0$ and the scaling behavior (3.9) are maintained under $V^T(g)$, $\forall g \in G$. Let $t \in \mathcal{D}'(\mathbb{R}^k)$ be an arbitrary extension of $t^\circ$ with the required smoothness and scaling properties. For $h \in \mathcal{D}(\mathbb{R}^k\setminus\{0\})$ we know
$$(V^T(g) t, h) = (t, V(g^{-1}) h) = (t^\circ, V(g^{-1}) h) = (V^T(g) t^\circ, h) . \tag{D.3}$$
So, $V^T(g) t$ is an extension of $V^T(g) t^\circ = t^\circ$. With (3.16) we conclude$^{aa}$
$$l(g) \stackrel{\text{def}}{=} V^T(g) t - t \;\in\; \mathcal{D}_\omega^\perp(\mathbb{R}^k) \stackrel{\text{def}}{=} \Bigl\{ \sum_{|a|+l=\omega} m^l\, C_{l,a}\, \partial^a \delta^{(k)} \;\Big|\; C_{l,a} \in \mathbb{R} \text{ or } \mathbb{C} \Bigr\} \tag{D.4}$$
where $\omega \stackrel{\text{def}}{=} D - k$. For an $\tilde{l} \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$ we find $V^T(g)\tilde{l} = V^T(g)(t + \tilde{l}) - V^T(g) t \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$, since both terms are an extension of $t^\circ$. Hence,
$$G \ni g \to \pi(g) \stackrel{\text{def}}{=} V^T(g)|_{\mathcal{D}_\omega^\perp(\mathbb{R}^k)} \tag{D.5}$$
is a sub-representation of $V^T$. We are searching for an $l_0 \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$ such that $t + l_0$ is invariant,
$$V^T(g)(t + l_0) = t + l(g) + V^T(g) l_0 = t + l_0 , \tag{D.6}$$
i.e. $l_0$ must fulfil
$$l(g) = l_0 - \pi(g) l_0 , \qquad \forall g \in G . \tag{D.7}$$
From (D.4) it follows that $l(g)$ has the property
$$l(gh) = V^T(gh) t - t = V^T(g)\bigl(V^T(h) t - t\bigr) + V^T(g) t - t = \pi(g) l(h) + l(g) . \tag{D.8}$$
A solution of such an equation is called a "cocycle". If $l(g)$ is of the form (D.7) for some $l_0$, it is called a "coboundary", and such an $l(g)$ solves automatically the cocycle equation (D.8). The space of the cocycles modulo the coboundaries is called the cohomology of the group with respect to the representation $\pi$. Summing up, the invariance of $t^\circ$ with respect to the representation $V^T$ of $G$ can be maintained in the extension if the cohomology of $G$ with respect to $\pi$ (D.5) is trivial.

We are now going to show that this supposition holds true if all finite dimensional representations of $G$ are completely reducible. For this purpose we consider the restriction of $V^T(g)$ to the space $(\mathbb{C} \cdot t) \oplus \mathcal{D}_\omega^\perp(\mathbb{R}^k)$. From (D.4) we see that this is a finite dimensional representation of $G$, which may be identified with the matrix representation
$$g \to \bar{\pi}(g) = \begin{pmatrix} 1 & 0 \\ l(g) & \pi(g) \end{pmatrix} . \tag{D.9}$$

Footnote aa: The case $m = 0$ is included by using $m^l|_{m=0} = \delta_{l,0}$.
Due to the complete reducibility of $\bar{\pi}$, there exists a 1-dimensional invariant subspace $U$ which is complementary to the representation space of $\pi$ (D.5). Such a subspace is of the form $U = \mathbb{C} \cdot \begin{pmatrix} 1 \\ l_0 \end{pmatrix}$. So there exists an $l_0 \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$ with
$$\bar{\pi}(g) \begin{pmatrix} 1 \\ l_0 \end{pmatrix} = \begin{pmatrix} 1 \\ l(g) + \pi(g) l_0 \end{pmatrix} \in \mathbb{C} \cdot \begin{pmatrix} 1 \\ l_0 \end{pmatrix} . \tag{D.10}$$
Hence, $\bar{\pi}|_U = 1$, which means that $l_0$ solves (D.7).

For the Lorentz group $L_+^\uparrow$ all finite dimensional representations are completely reducible and, hence, Lorentz invariance can be maintained in perturbative renormalization. However, for the scaling transformations one has to consider the representations of $\mathbb{R}_+$ as a multiplicative group. They are not always completely reducible. An example for a reducible but not completely reducible representation is
$$\mathbb{R}_+ \ni \rho \mapsto \begin{pmatrix} 1 & 0 \\ \ln\rho & 1 \end{pmatrix} . \tag{D.11}$$
The existence of such representations can be understood as the reason for the breaking of homogeneous scaling.

Example. The action of massless QED is invariant with respect to the following $U(1)$ transformations
$$U(1)_V : \psi(x) \to e^{i\alpha(x)}\, \psi(x) , \qquad U(1)_A : \psi(x) \to e^{i\beta(x)\gamma_5}\, \psi(x) , \tag{D.12}$$
$\alpha(x), \beta(x) \in [0, 2\pi)$, and $\bar{\psi}$ is transformed correspondingly. According to Noether's Theorem the corresponding currents
$$j^\mu_{V\, g\mathcal{L}} = -(\bar{\psi}\gamma^\mu \psi)_{g\mathcal{L}} \qquad \text{and} \qquad j^\mu_{A\, g\mathcal{L}} = -(\bar{\psi}\gamma^\mu \gamma^5 \psi)_{g\mathcal{L}} \tag{D.13}$$
are conserved in classical field theory. In QFT the just derived result implies that invariance of the retarded products with respect to $U(1)_V \times U(1)_A$ transformations (D.12) can be realized, because this group is compact. But it is well known that conservation of $j^\mu_{A\, \mathcal{L}}$ is not compatible with conservation of $j^\mu_{V\, \mathcal{L}}$; this is the axial anomaly. So, in general Noether's Theorem cannot be fulfilled in QFT, see however [9].

Construction of a Lorentz-invariant extension. There remains the question of how to find the solution $l_0$ of Eq. (D.7) (if it exists). For compact groups this can be done in the following way: we set
$$l_0 \stackrel{\text{def}}{=} \int_G dg\; l(g) , \tag{D.14}$$
where $dg$ is the uniquely determined measure on $G$ which has norm 1 and is invariant under left- and right-translations (Haar-measure). To verify that (D.14) solves (D.7) we use the cocycle equation:
$$l_0 - \pi(g) l_0 = \int_G dh\, \bigl(l(h) - \pi(g) l(h)\bigr) = \int_G dh\, \bigl(l(h) - l(gh) + l(g)\bigr) . \tag{D.15}$$
Due to the translation invariance of the Haar-measure, the integrals over the first two terms cancel, and since the measure is normalized we indeed obtain (D.7).

For the Lorentz-group this method fails. Epstein and Glaser [18] use the fact that the renormalized distributions are boundary values of functions which are analytic in a certain region of the complexified Minkowski space and which are additionally invariant under the complex Lorentz group. It suffices then to take into account the invariance with respect to the compact subgroup $SO(4)$.

We give here an alternative method which has some resemblance with [42] and [4]: the strategy is to construct the projector $P$ on the space $\mathcal{D}_{\mathrm{inv}}$ of all invariant vectors in the representation $\bar{\pi}$ (D.9). Then, with $t$ being the above-used arbitrary extension of $t^\circ$ (with the mentioned smoothness and scaling properties), the definition
$$t_{\mathrm{inv}} \stackrel{\text{def}}{=} P t , \tag{D.16}$$
yields an $L_+^\uparrow$-invariant distribution. And it is also an extension of $t^\circ$ with the required properties, because $(t_{\mathrm{inv}} - t) \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$. The latter is shown below.

To find $P$ note that for a Casimir operator $C$ (in any representation) the invariant vectors $\mathcal{D}_{\mathrm{inv}}$ are a subspace of its kernel $C^{-1}(0)$, because $C$ is built from the elements of the Lie algebra. Below we will find a Casimir operator $C_0$ of the Lorentz group with $\mathcal{D}_{\mathrm{inv}} = C_0^{-1}(0)$ in each finite dimensional representation. With that our method relies on the fact that the operator $c^{-1}(c\mathbf{1} - C_0)$ annihilates the eigenvectors (of $C_0$) to the eigenvalue $c \ne 0$, and on $C_0^{-1}(0)$ it is $= \mathbf{1}$. Therefore, in each finite dimensional representation of $L_+^\uparrow$ and in particular for $\bar{\pi}$ (D.9), the operator
$$P \stackrel{\text{def}}{=} \prod_{c \ne 0} \frac{c\mathbf{1} - \bar{\pi}(C_0)}{c} \tag{D.17}$$
($c$ runs through all eigenvalues $\ne 0$ of $C_0$) is a projector on $\mathcal{D}_{\mathrm{inv}}$. To show $(t_{\mathrm{inv}} - t) \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$ we write
$$t_{\mathrm{inv}} = P t = P(t + l_0) - P l_0 = t + l_0 - P l_0 , \tag{D.18}$$
where we use that $t + l_0$ is invariant (D.6). It follows from (D.9) and (D.17) that the upper right coefficient of the matrix $P$ is $= 0$. Hence, $P l_0 \in \mathcal{D}_\omega^\perp(\mathbb{R}^k)$, and this gives the assertion.

To determine a Casimir operator $C_0$ of the Lorentz group with $C_0^{-1}(0) = \mathcal{D}_{\mathrm{inv}}$ in each finite dimensional representation, we first note that there are two quadratic Casimirs,
$$C_0 = \vec{L}^2 - \vec{M}^2 \qquad \text{and} \qquad C_1 = \vec{L} \cdot \vec{M} , \tag{D.19}$$
where $\vec{L}$ denotes the infinitesimal rotations and $\vec{M}$ the infinitesimal Lorentz-boosts. On $\mathcal{D}(\mathbb{R}^4)$ it holds that
$$\vec{L} = \frac{1}{i}\, \vec{x} \times \vec{\partial} , \qquad \vec{M} = \frac{1}{i}\, (x^0 \vec{\partial} - \vec{x}\, \partial_0) . \tag{D.20}$$
The irreducible finite dimensional representations of the Lorentz group $L_+^\uparrow$ are those irreducible finite dimensional representations of $SL(2, \mathbb{C})$ which represent the matrix $-\mathbf{1}$ by $\mathbf{1}$. The irreducible finite dimensional representations of $SL(2, \mathbb{C})$ are indexed by two spin quantum numbers $j_1, j_2 \in \frac{1}{2}\mathbb{N}_0$ and have the form
$$\pi_{j_1 j_2}(A) \bigl(\xi^{\otimes 2j_1} \otimes \eta^{\otimes 2j_2}\bigr) = (A\xi)^{\otimes 2j_1} \otimes ((A^*)^{-1}\eta)^{\otimes 2j_2} \tag{D.21}$$
with $\xi, \eta \in \mathbb{C}^2$. For $j_1 + j_2 \in \mathbb{N}_0$ this yields a representation of the Lorentz group. We denote by $\vec{L}_i$, $\vec{M}_i$ the representations of $\vec{L}$ and $\vec{M}$ on the left ($i = 1$) and the right factor ($i = 2$) respectively. In the fundamental representation of $SL(2, \mathbb{C})$ (i.e. $(j_1, j_2) = (\frac{1}{2}, 0)$) we have
$$\vec{L} = \frac{1}{2}\, \vec{\sigma} , \qquad \vec{M} = \frac{i}{2}\, \vec{\sigma} , \tag{D.22}$$
and in the conjugated representation (i.e. $(j_1, j_2) = (0, \frac{1}{2})$) it holds that
$$\vec{L} = \frac{1}{2}\, \vec{\sigma} , \qquad \vec{M} = -\frac{i}{2}\, \vec{\sigma} . \tag{D.23}$$
So we have
$$\vec{M}_1 = i\vec{L}_1 , \qquad \vec{M}_2 = -i\vec{L}_2 , \tag{D.24}$$
and the validity of these relations goes over to all representations $\pi_{j_1, j_2}$, $j_1, j_2 \in \frac{1}{2}\mathbb{N}_0$ (D.21). With that we obtain for the Casimir operators
$$C_0 = (\vec{L}_1 + \vec{L}_2)^2 - (\vec{M}_1 + \vec{M}_2)^2 = 2\vec{L}_1^2 + 2\vec{L}_2^2 = 2 \bigl(j_1 (j_1 + 1) + j_2 (j_2 + 1)\bigr) \tag{D.25}$$
and
$$C_1 = (\vec{L}_1 + \vec{L}_2) \cdot (\vec{M}_1 + \vec{M}_2) = i\vec{L}_1^2 - i\vec{L}_2^2 = i \bigl(j_1 (j_1 + 1) - j_2 (j_2 + 1)\bigr) . \tag{D.26}$$
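The eigenvalue formulas (D.25) and (D.26) are easy to tabulate. The sketch below is an illustration only: the cut-off at spin 2 and all names are choices made here. It lists the representations with $j_1 + j_2 \in \mathbb{N}_0$ and collects those on which $C_0$, respectively $C_1$, vanishes, using exact rational arithmetic.

```python
from fractions import Fraction


def c0(j1: Fraction, j2: Fraction) -> Fraction:
    """Eigenvalue of C_0 on pi_{j1 j2} according to (D.25)."""
    return 2 * (j1 * (j1 + 1) + j2 * (j2 + 1))


def c1_over_i(j1: Fraction, j2: Fraction) -> Fraction:
    """Eigenvalue of C_1 divided by i according to (D.26)."""
    return j1 * (j1 + 1) - j2 * (j2 + 1)


half = Fraction(1, 2)
spins = [n * half for n in range(0, 5)]  # 0, 1/2, 1, 3/2, 2 (arbitrary cut-off)

# Representations of the Lorentz group: j1 + j2 must be an integer.
lorentz_reps = [(a, b) for a in spins for b in spins if (a + b).denominator == 1]

kernel_c0 = [(a, b) for (a, b) in lorentz_reps if c0(a, b) == 0]
kernel_c1 = [(a, b) for (a, b) in lorentz_reps if c1_over_i(a, b) == 0]
```

Within this range `kernel_c0` contains only $(0, 0)$, whereas `kernel_c1` contains every representation with $j_1 = j_2$; this is why $C_0$ rather than $C_1$ is suited for building the projector (D.17).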
We find indeed that $C_0$ vanishes on the trivial representation $j_1 = j_2 = 0$ only. However, the kernel of $C_1$ is much bigger.

As an example let us consider a Lorentz invariant distribution $t_\omega$ on $\mathcal{D}_\omega(\mathbb{R}^4)$, $\omega = 2$. Let
$$W = 1 - \sum_{|a|\le\omega} (-1)^{|a|}\, |w_a\rangle \langle \partial^a \delta|$$
be a projector on $\mathcal{D}_\omega(\mathbb{R}^4)$. The representation of the Lorentz group on $\mathcal{D}_\omega(\mathbb{R}^4)^\perp = \{\sum_{|a|\le 2} C_a\, \partial^a \delta \mid C_a\} \subset \mathcal{D}'(\mathbb{R}^4)$ has the irreducible sub-representations
$$(j_1, j_2) = (0, 0),\ \bigl(\tfrac{1}{2}, \tfrac{1}{2}\bigr),\ (1, 1)$$
with the eigenvalues $c = 0, 3, 8$ of the Casimir operator $C_0$ (D.25). The latter reads
$$C_0 = -\frac{1}{2}\, (x_\mu \partial_\nu - x_\nu \partial_\mu)(x^\mu \partial^\nu - x^\nu \partial^\mu) = (x^\mu \partial_\mu)^2 + 2 x^\mu \partial_\mu - x^2\, \Box$$
by using (D.19) and (D.20). Following (D.16) and (D.17) a Lorentz invariant extension of $t_\omega$ is obtained by
$$t_{\mathrm{inv}} := P (t_\omega \circ W) \qquad \text{with} \qquad P := \Bigl(1 - \frac{1}{3}\, C_0\Bigr) \Bigl(1 - \frac{1}{8}\, C_0\Bigr) .$$

Appendix E. Time-Ordered Products

One can define time-ordered products ("$T$-products") $(T_n)_{n\in\mathbb{N}}$ by a direct translation of the axioms for retarded products (given in Sec. 2) with the following modifications:

• $T_n$ is required to be symmetrical in all factors.
• Causality is expressed by causal factorization:
$$T(A_1(x_1), \ldots, A_n(x_n)) = T(A_1(x_1), \ldots, A_k(x_k)) \star T(A_{k+1}(x_{k+1}), \ldots, A_n(x_n)) \tag{E.1}$$
if $\{x_1, \ldots, x_k\} \cap (\{x_{k+1}, \ldots, x_n\} + \bar{V}_-) = \emptyset$.
• Among the defining properties of $T$-products there is none corresponding to the GLZ relation (2.27). The information corresponding to (2.27) is (1.1) and (1.2), i.e. the definition of retarded products as coefficients of the interacting fields which are obtained from the $T$-products by Bogoliubov's formula. (In [12, Proposition 2] we start from the $T$-products and derive the GLZ relation (2.27). This derivation uses only (1.1) and (1.2), and linearity and Symmetry are needed in order that (1.2) determines $R_{n,1}$ also for non-diagonal entries.)

The generating functional of the $T$-products is the (local) S-matrix $S(F) \stackrel{\text{def}}{=} T(e_\otimes^{iF})$, $F \in \mathcal{F}_{\mathrm{loc}}$. These defining properties of $T$-products are equivalent to our defining properties of retarded products in the sense of the unique correspondence
$$(T_n)_{n\in\{1,2,\ldots,N+1\}} \leftrightarrow (R_{n,1})_{n\in\{0,1,\ldots,N\}} \tag{E.2}$$
given by (1.1) and (1.2). Using the anti-chronological products $(\bar{T}_n)_{n\in\mathbb{N}}$, which are defined$^{bb}$ by $\bar{T}(e_\otimes^{-iF}) \stackrel{\text{def}}{=} S(F)^{-1}$, the correspondence (E.2) can be written more explicitly:
$$R_{n,1}(F_1 \otimes \cdots \otimes F_n ; F) \stackrel{\text{def}}{=} i^n \sum_{I \subset \{1,\ldots,n\}} (-1)^{|I|}\, \bar{T}_{|I|}(\otimes_{l\in I} F_l)\; T_{|I^c|+1}\bigl((\otimes_{j\in I^c} F_j) \otimes F\bigr) . \tag{E.3}$$
This formula can also be used to construct inductively the $T$-products from the retarded products: it yields $T_{n+1}$ in terms of $R_{n,1}$ and $T_l$, $\bar{T}_k$ with $k, l \le n$.

Footnote bb: Note that $\bar{T}_m$ is uniquely determined in terms of the $T$-products $T_l$, $1 \le l \le m$.

Alternatively, to obtain a direct formula for $T_n$ in terms of the $\{R_{l,1} \mid 0 \le l \le n - 1\}$, we write Bogoliubov's formula (1.1) in the form
$$F_{\lambda F} = -i\, S(\lambda F)^{-1}\, \frac{d}{d\lambda}\, S(\lambda F) . \tag{E.4}$$
This differential equation is solved by the Dyson series
$$S(\lambda F) = 1 + \sum_{k=1}^{\infty} i^k \int_0^{\lambda} d\lambda_k \int_0^{\lambda_k} d\lambda_{k-1} \cdots \int_0^{\lambda_2} d\lambda_1\; F_{\lambda_1 F} \cdots F_{\lambda_{k-1} F}\, F_{\lambda_k F} . \tag{E.5}$$
For $\lambda = 1$ the term of $n$th order in $F$ reads
$$T_n(F^{\otimes n}) = \sum_{k=1}^{n} i^{k-n} \sum_{l_1 + \cdots + l_k = n-k} \frac{n!}{l_1!\, l_2! \cdots l_k!} \cdot \frac{1}{(l_1 + 1)(l_1 + l_2 + 2) \cdots (l_1 + l_2 + \cdots + l_k + k)} \cdot R_{l_1,1}(F^{\otimes l_1}, F) \cdots R_{l_k,1}(F^{\otimes l_k}, F) , \tag{E.6}$$
where we have used $R_{l,1}((\lambda F)^{\otimes l}, F) = \lambda^l R_{l,1}(F^{\otimes l}, F)$ and computed the $\lambda$-integrals. (E.6) agrees with [20, formula$^{cc}$ (55)], which is derived there in a different way.

Footnote cc: We thank Christian Brouder for showing us this formula.

Acknowledgments

We profited from discussions with Dan Grigore, Henning Rehren, Othmar Steinmann and Raymond Stora. In particular the correspondence with Raymond Stora was very helpful to find the proof of the AWI. We particularly thank Romeo Brunetti for useful hints and careful reading of the manuscript, and Stefan Hollands for helpful comments and for pointing out an error in an earlier version. While working on this paper M. D. was mainly at the University of Göttingen; he thanks Karl-Henning Rehren and Detlev Buchholz for their warm hospitality.

References

[1] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization, Ann. Phys. 111(61) (1978) 111.
[2] P. Blanchard and R. Sénéor, Green's functions for theories with massless particles (in perturbation theory), Ann. Inst. H. Poincaré A23 (1975) 147.
[3] N. N. Bogoliubov and D. V. Shirkov, Introduction to the Theory of Quantized Fields (Interscience Publishers, Inc., New York, 1959).
[4] K. Bresser, G. Pinter and D. Prange, The Lorentz invariant extension of scalar theories, hep-th/9903266; D. Prange, Lorentz Covariance in Epstein–Glaser Renormalization, hep-th/9904136.
[5] C. Brouder, B. Fauser, A. Frabetti and R. Oeckel, Quantum field theory and Hopf algebra cohomology, J. Phys. A37 (2004) 5895–5927.
[6] R. Brunetti, M. Dütsch and K. Fredenhagen, in progress.
[7] R. Brunetti and K. Fredenhagen, Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds, Commun. Math. Phys. 208 (2000) 623.
[8] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle: A new paradigm for local quantum physics, Commun. Math. Phys. 237 (2003) 31–68.
[9] D. Buchholz, S. Doplicher and R. Longo, On Noether's theorem in quantum field theory, Ann. Phys. 170 (1986) 1.
[10] D. Buchholz, I. Ojima and H. Roos, Thermodynamic properties of non-equilibrium states in quantum field theory, Ann. Phys. 297 (2002) 219–242.
[11] B. S. DeWitt, The spacetime approach to quantum field theory, in Relativity, Groups, and Topology II: Les Houches 1983, eds. B. S. DeWitt and R. Stora, part 2, North-Holland, New York (1984) 381–738.
[12] M. Dütsch and K. Fredenhagen, A local (perturbative) construction of observables in gauge theories: the example of QED, Commun. Math. Phys. 203 (1999) 71.
[13] M. Dütsch and K. Fredenhagen, Algebraic quantum field theory, perturbation theory, and the loop expansion, Commun. Math. Phys. 219 (2001) 5.
[14] M. Dütsch and K. Fredenhagen, Perturbative algebraic field theory, and deformation quantization, Fields Institute Communications 30 (2001) 151–160.
[15] M. Dütsch and K. Fredenhagen, The master Ward identity and generalized Schwinger–Dyson equation in classical field theory, Commun. Math. Phys. 243 (2003) 275–314.
[16] M. Dütsch and F.-M. Boas, The master Ward identity, Rev. Math. Phys. 14 (2002) 977–1049.
[17] M. Dütsch, T. Hurth, K. Krahe and G. Scharf, Causal construction of Yang–Mills theories. II, N. Cimento A107 (1994) 375.
[18] H. Epstein and V. Glaser, The role of locality in perturbation theory, Ann. Inst. H. Poincaré A19 (1973) 211.
[19] H. Epstein and V. Glaser, Adiabatic limit in perturbation theory, in Renormalization Theory, eds. G. Velo and A. S. Wightman (1976), pp. 193–254.
[20] H. Epstein, V. Glaser and R. Stora, General properties of the n-point functions in local quantum field theory, in Structural Analysis of Collision Amplitudes, Les Houches, eds. R. Balian and D. Iagolnitzer (1975).
[21] D. Z. Freedman, K. Johnson and J. I. Latorre, Differential regularization and renormalization: a new method of calculation in quantum field theory, Nucl. Phys. B371 (1992) 353–414.
[22] M. Gell-Mann and F. E. Low, Quantum electrodynamics at small distances, Phys. Rev. 95 (1954) 1300–1312.
[23] V. Glaser, H. Lehmann and W. Zimmermann, Field operators and retarded functions, Nuovo Cimen. 6 (1957) 1122.
[24] J. M. Gracia-Bondia, Improved Epstein–Glaser renormalization in coordinate space I. Euclidean framework, Math. Phys. Anal. Geom. 6 (2003) 59; J. M. Gracia-Bondia and S. Lazzarini, Improved Epstein–Glaser renormalization in coordinate space II. Lorentz invariant framework, J. Math. Phys. 44 (2003) 3863.
[25] D. R. Grigore, Scale invariance in the causal approach to renormalization theory, Ann. Phys. (Leipzig) 10 (2001) 473.
[26] W. Güttinger, Generalized functions and dispersion relations in physics, Fortschr. Physik 14 (1966) 483–602; W. Güttinger and A. Rieckers, Spectral representations of Lorentz invariant distributions and scale transformation, Commun. Math. Phys. 7 (1968) 190–217.
[27] R. Haag, Local Quantum Physics: Fields, Particles and Algebras, 2nd edn. (Springer-Verlag, Berlin, 1996).
[28] R. Haag and D. Kastler, An algebraic approach to field theory, J. Math. Phys. 5 (1964) 848.
[29] A. C. Hirshfeld and P. Henselder, Star products and perturbative quantum field theory, Ann. Phys. 298 (2002) 382–393.
[30] S. Hollands, Algebraic approach to the 1/N expansion in quantum field theory, Rev. Math. Phys. 16 (2004) 509–558.
[31] S. Hollands and R. M. Wald, Local Wick polynomials and time-ordered-products of quantum fields in curved spacetime, Commun. Math. Phys. 223 (2001) 289; S. Hollands and R. M. Wald, Existence of local covariant time-ordered-products of quantum fields in curved spacetime, Commun. Math. Phys. 231 (2002) 309–345.
[32] S. Hollands and R. M. Wald, On the renormalization group in curved spacetime, Commun. Math. Phys. 237 (2003) 123–160.
[33] S. Hollands and R. M. Wald, Conservation of the stress tensor in interacting quantum field theory in curved spacetimes, gr-qc/0404074.
[34] G. Källen, Formal integration of the equations of quantum theory in the Heisenberg representation, Ark. Fysik 2 (1950) 371.
[35] J. I. Latorre, C. Manuel and X. Vilasis-Cardona, Systematic differential renormalization to all orders, Ann. Phys. (N.Y.) 231 (1994) 149.
[36] D. M. Marolf, The generalized Peierls bracket, Ann. Phys. (N.Y.) 236 (1994) 392.
[37] R. Peierls, The commutation laws of relativistic field theory, Proc. Roy. Soc. (London) A214 (1952) 143.
[38] G. Pinter, Finite renormalizations in the Epstein–Glaser framework and renormalization of the S-matrix of Φ⁴-theory, Ann. Phys. (Leipzig) 10 (2001) 333.
[39] D. Prange, Epstein–Glaser renormalization and differential renormalization, J. Phys. A32 (1999) 2225.
[40] G. Scharf, Finite Quantum Electrodynamics. The causal approach, 2nd edn. (Springer-Verlag, 1995).
[41] G. Scharf, Quantum Gauge Theories: A True Ghost Story (John Wiley and Sons, 2001).
[42] O. Steinmann, Perturbation Expansions in Axiomatic Field Theory, Lecture Notes in Physics 11 (Springer-Verlag, Berlin-Heidelberg-New York, 1971).
[43] O. Steinmann, Perturbative QED and Axiomatic Field Theory (Springer-Verlag, 2000).
[44] R. Stora, Differential Algebras in Lagrangean Field Theory, ETH-Zürich Lectures, January-February 1993; G. Popineau and R. Stora, A pedagogical remark on the main theorem of perturbative renormalization theory, unpublished preprint (1982).
[45] R. Stora, Pedagogical experiments in renormalized perturbation theory, in conference Theory of Renormalization and Regularization, Hesselberg, Germany (2002), http://wwwthep.physik.uni-mainz.de/∼scheck/Hessbg02.html; and private communication.
[46] E. C. G. Stueckelberg and A. Petermann, La normalisation des constantes dans la theorie des quanta, Helv. Phys. Acta 26 (1953) 499–520.
REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 16 (2004)
Alama, S. & Bronsard, L., On the second critical field for a Ginzburg–Landau model with ferromagnetic interactions, 2 (2004) 147
Almog, Y., Nonlinear surface superconductivity in the large κ limit, 8 (2004) 961
Baro, M., Kaiser, H.-Chr., Neidhardt, H. & Rehberg, J., A quantum transmitting Schrödinger–Poisson system, 3 (2004) 281
Bronsard, L., see Alama, S., 2 (2004) 147
Comte, M. & Sauvageot, M., On the Hessian of the energy form in the Ginzburg–Landau model of superconductivity, 4 (2004) 421
Contucci, P., Giardinà, C. & Pulé, J., The thermodynamic limit for finite dimensional classical and quantum disordered systems, 5 (2004) 629
Damak, M. & Georgescu, V., Self-adjoint operators affiliated to C*-algebras, 2 (2004) 257
Damianou, P. A., Multiple Hamiltonian structure of Bogoyavlensky–Toda lattices, 2 (2004) 175
Dimock, J., Markov quantum fields on a manifold, 2 (2004) 243
Disertori, M., Density of states for GUE through supersymmetric approach, 9 (2004) 1191
Dorlas, T. C. & Pulé, J. V., The invariant measures at weak disorder for the two-line Anderson model, 5 (2004) 639
Dutkay, D. E., Positive definite maps, representations and frames, 4 (2004) 451
Dütsch, M. & Fredenhagen, K., Causal perturbation theory in terms of retarded products, and a proof of the action Ward identity, 10 (2004) 1291
Elton, D. M., The Bethe–Sommerfeld conjecture for the 3-dimensional periodic Landau operator, 10 (2004) 1259
Exner, P. & Kondej, S., Strong-coupling asymptotic expansion for Schrödinger operators with a singular interaction supported by a curve in R³, 5 (2004) 559
Fredenhagen, K., see Dütsch, M., 10 (2004) 1291
Georgescu, V., see Damak, M., 2 (2004) 257
Geyler, V. A. & Šťovíček, P., Zero modes in a system of Aharonov–Bohm fluxes, 7 (2004) 851
Giardinà, C., see Contucci, P., 5 (2004) 629
Goswami, D., Twisted entire cyclic cohomology, J-L-O cocycles and equivariant spectral triples, 5 (2004) 583
Häfner, D. & Nicolas, J.-P., Scattering of massless Dirac fields by a Kerr black hole, 1 (2004) 29
Hollands, S., Algebraic approach to the 1/N expansion in quantum field theory, 4 (2004) 509
Inoue, R., Kuniba, A. & Okado, M., A quantization of box-ball systems, 10 (2004) 1227
Jensen, A. & Nenciu, G., Erratum: a unified approach to resolvent expansions at thresholds, 5 (2004) 675
Kaiser, H.-Chr., see Baro, M., 3 (2004) 281
Kishimoto, A., Core symmetries of a flow, 4 (2004) 479
Klimčík, C., Quasitriangular WZW model, 6 (2004) 679
Koma, T., Revisiting the charge transport in quantum Hall systems, 9 (2004) 1115
Kondej, S., see Exner, P., 5 (2004) 559
Kondratiev, Y., Minlos, R. & Zhizhina, E., One-particle subspace of the Glauber dynamics generator for continuous particle systems, 9 (2004) 1073
Köster, S., Local nature of Coset models, 3 (2004) 353
Kuniba, A., see Inoue, R., 10 (2004) 1227
Longo, R. & Rehren, K.-H., Local fields in boundary conformal QFT, 7 (2004) 909
Minlos, R., see Kondratiev, Y., 9 (2004) 1073
Morosi, C. & Pizzocchero, L., On approximate solutions of semilinear evolution equations, 3 (2004) 383
Mück, M., Construction of metastable states in quantum electrodynamics, 1 (2004) 1
Naudts, J., Continuity of a class of entropies and relative entropies, 6 (2004) 809
Neidhardt, H., see Baro, M., 3 (2004) 281
Nenciu, G., see Jensen, A., 5 (2004) 675
Nicolas, J.-P., see Häfner, D., 1 (2004) 29
Okado, M., see Inoue, R., 10 (2004) 1227
Pickrell, D., The radial part of the zero-mode Hamiltonian for sigma models with group target space, 5 (2004) 603
Pizzocchero, L., see Morosi, C., 3 (2004) 383
Pulé, J. V., see Dorlas, T. C., 5 (2004) 639
Pulé, J., see Contucci, P., 5 (2004) 629
Rehberg, J., see Baro, M., 3 (2004) 281
Rehren, K.-H., see Longo, R., 7 (2004) 909
Rennie, A., Nonunital spectral triples associated to degenerate metrics, 1 (2004) 125
Sauvageot, M., see Comte, M., 4 (2004) 421
Sengupta, A. N., Connections over two-dimensional cell complexes, 3 (2004) 331
Skrypnyk, T., Deformations of loop algebras and classical integrable systems: finite-dimensional Hamiltonian systems, 7 (2004) 823
Soffer, A. & Weinstein, M. I., Selection of the ground state for nonlinear Schrödinger equations, 8 (2004) 977
Šťovíček, P., see Geyler, V. A., 7 (2004) 851
Weinstein, M. I., see Soffer, A., 8 (2004) 977
Zhizhina, E., see Kondratiev, Y., 9 (2004) 1073