This book provides a thorough introduction to the theory of classical integrable systems, discussing the various approaches to the subject and explaining their interrelations. The book begins by introducing the central ideas of the theory of integrable systems, based on Lax representations, loop groups and Riemann surfaces. These ideas are then illustrated with detailed studies of model systems. The connection between isomonodromic deformation and integrability is discussed, and integrable field theories are covered in detail. The KP, KdV and Toda hierarchies are explained using the notion of Grassmannian, vertex operators and pseudo-differential operators. A chapter is devoted to the inverse scattering method and three complementary chapters cover the necessary mathematical tools from symplectic geometry, Riemann surfaces and Lie algebras. The book contains many worked examples and is suitable for use as a textbook on graduate courses. It also provides a comprehensive reference for researchers already working in the field.

Olivier Babelon has been a member of the Centre National de la Recherche Scientifique (CNRS) since 1978. He works at the Laboratoire de Physique Théorique et Hautes Energies (LPTHE) at the University of Paris VI–Paris VII. His main fields of interest are particle physics, gauge theories and integrable systems.

Michel Talon has been a member of the CNRS since 1977. He works at the LPTHE at the University of Paris VI–Paris VII. He is involved in the computation of radiative corrections and anomalies in gauge theories and integrable systems.

Denis Bernard has been a member of the CNRS since 1988. He currently works at the Service de Physique Théorique de Saclay. His main fields of interest are conformal field theories and integrable systems, and other aspects of statistical field theories, including statistical turbulence.
CAMBRIDGE MONOGRAPHS ON MATHEMATICAL PHYSICS
General editors: P. V. Landshoff, D. R. Nelson, S. Weinberg
Introduction to Classical Integrable Systems

OLIVIER BABELON
Laboratoire de Physique Théorique et Hautes Energies, Universités Paris VI–VII

DENIS BERNARD
Service de Physique Théorique de Saclay, Gif-sur-Yvette

MICHEL TALON
Laboratoire de Physique Théorique et Hautes Energies, Universités Paris VI–VII
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press, The Edinburgh Building, Cambridge, United Kingdom
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org — Information on this title: www.cambridge.org/9780521822671

© O. Babelon, D. Bernard & M. Talon 2003

This book is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2003

ISBN-13 978-0-511-07050-1 eBook (EBL)
ISBN-10 0-511-07050-0 eBook (EBL)
ISBN-13 978-0-521-82267-1 hardback
ISBN-10 0-521-82267-X hardback
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this book, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.
Contents
1 Introduction 1

2 Integrable dynamical systems 5
2.1 Introduction 5
2.2 The Liouville theorem 7
2.3 Action–angle variables 10
2.4 Lax pairs 11
2.5 Existence of an r-matrix 13
2.6 Commuting flows 17
2.7 The Kepler problem 17
2.8 The Euler top 19
2.9 The Lagrange top 20
2.10 The Kowalevski top 22
2.11 The Neumann model 23
2.12 Geodesics on an ellipsoid 25
2.13 Separation of variables in the Neumann model 27

3 Synopsis of integrable systems 32
3.1 Examples of Lax pairs with spectral parameter 33
3.2 The Zakharov–Shabat construction 35
3.3 Coadjoint orbits and Hamiltonian formalism 41
3.4 Elementary flows and wave function 49
3.5 Factorization problem 54
3.6 Tau-functions 59
3.7 Integrable field theories and monodromy matrix 62
3.8 Abelianization 65
3.9 Poisson brackets of the monodromy matrix 72
3.10 The group of dressing transformations 74
3.11 Soliton solutions 79

4 Algebraic methods 86
4.1 The classical and modified Yang–Baxter equations 86
4.2 Algebraic meaning of the classical Yang–Baxter equations 89
4.3 Adler–Kostant–Symes scheme 92
4.4 Construction of integrable systems 94
4.5 Solving by factorization 96
4.6 The open Toda chain 97
4.7 The r-matrix of the Toda models 100
4.8 Solution of the open Toda chain 105
4.9 Toda system and Hamiltonian reduction 109
4.10 The Lax pair of the Kowalevski top 115

5 Analytical methods 124
5.1 The spectral curve 125
5.2 The eigenvector bundle 130
5.3 The adjoint linear system 138
5.4 Time evolution 142
5.5 Theta-functions formulae 145
5.6 Baker–Akhiezer functions 149
5.7 Linearization and the factorization problem 153
5.8 Tau-functions 154
5.9 Symplectic form 156
5.10 Separation of variables and the spectral curve 162
5.11 Action–angle variables 164
5.12 Riemann surfaces and integrability 167
5.13 The Kowalevski top 169
5.14 Infinite-dimensional systems 175

6 The closed Toda chain 178
6.1 The model 178
6.2 The spectral curve 181
6.3 The eigenvectors 182
6.4 Reconstruction formula 184
6.5 Symplectic structure 191
6.6 The Sklyanin approach 193
6.7 The Poisson brackets 196
6.8 Reality conditions 200

7 The Calogero–Moser model 206
7.1 The spin Calogero–Moser model 206
7.2 Lax pair 208
7.3 The r-matrix 210
7.4 The scalar Calogero–Moser model 214
7.5 The spectral curve 216
7.6 The eigenvector bundle 218
7.7 Time evolution 220
7.8 Reconstruction formulae 221
7.9 Symplectic structure 223
7.10 Poles systems and double-Bloch condition 226
7.11 Hitchin systems 232
7.12 Examples of Hitchin systems 239
7.13 The trigonometric Calogero–Moser model 244

8 Isomonodromic deformations 249
8.1 Introduction 249
8.2 Monodromy data 251
8.3 Isomonodromy and the Riemann–Hilbert problem 262
8.4 Isomonodromic deformations 264
8.5 Schlesinger transformations 270
8.6 Tau-functions 272
8.7 Riccati equation 277
8.8 Sato's formula 278
8.9 The Hirota equations 280
8.10 Tau-functions and theta-functions 282
8.11 The Painlevé equations 290

9 Grassmannian and integrable hierarchies 299
9.1 Introduction 299
9.2 Fermions and GL(∞) 303
9.3 Boson–fermion correspondence 308
9.4 Tau-functions and Hirota bilinear identities 311
9.5 The KP hierarchy and its soliton solutions 314
9.6 Fermions and Grassmannians 316
9.7 Schur polynomials 322
9.8 From fermions to pseudo-differential operators 328
9.9 The Segal–Wilson approach 331

10 The KP hierarchy 338
10.1 The algebra of pseudo-differential operators 338
10.2 The KP hierarchy 341
10.3 The Baker–Akhiezer function of KP 344
10.4 Algebro-geometric solutions of KP 348
10.5 The tau-function of KP 352
10.6 The generalized KdV equations 355
10.7 KdV Hamiltonian structures 359
10.8 Bihamiltonian structure 363
10.9 The Drinfeld–Sokolov reduction 364
10.10 Whitham equations 370
10.11 Solution of the Whitham equations 379

11 The KdV hierarchy 382
11.1 The KdV equation 382
11.2 The KdV hierarchy 386
11.3 Hamiltonian structures and Virasoro algebra 392
11.4 Soliton solutions 394
11.5 Algebro-geometric solutions 398
11.6 Finite-zone solutions 408
11.7 Action–angle variables 414
11.8 Analytical description of solitons 419
11.9 Local fields 425
11.10 Whitham's equations 433

12 The Toda field theories 443
12.1 The Liouville equation 443
12.2 The Toda systems and their zero-curvature representations 445
12.3 Solution of the Toda field equations 447
12.4 Hamiltonian formalism 454
12.5 Conformal structure 456
12.6 Dressing transformations 463
12.7 The affine sinh-Gordon model 467
12.8 Dressing transformations and soliton solutions 471
12.9 N-soliton dynamics 474
12.10 Finite-zone solutions 481

13 Classical inverse scattering method 486
13.1 The sine-Gordon equation 486
13.2 The Jost solutions 487
13.3 Inverse scattering as a Riemann–Hilbert problem 496
13.4 Time evolution of the scattering data 497
13.5 The Gelfand–Levitan–Marchenko equation 498
13.6 Soliton solutions 502
13.7 Poisson brackets of the scattering data 505
13.8 Action–angle variables 510

14 Symplectic geometry 516
14.1 Poisson manifolds and symplectic manifolds 516
14.2 Coadjoint orbits 522
14.3 Symmetries and Hamiltonian reduction 525
14.4 The case M = T*G 532
14.5 Poisson–Lie groups 534
14.6 Action of a Poisson–Lie group on a symplectic manifold 538
14.7 The groups G and G* 540
14.8 The group of dressing transformations 542

15 Riemann surfaces 545
15.1 Smooth algebraic curves 545
15.2 Hyperelliptic curves 547
15.3 The Riemann–Hurwitz formula 549
15.4 The field of meromorphic functions of a Riemann surface 549
15.5 Line bundles on a Riemann surface 551
15.6 Divisors 553
15.7 Chern class 554
15.8 Serre duality 554
15.9 The Riemann–Roch theorem 556
15.10 Abelian differentials 559
15.11 Riemann bilinear identities 560
15.12 Jacobi variety 562
15.13 Theta-functions 563
15.14 The genus 1 case 567
15.15 The Riemann–Hilbert factorization problem 568

16 Lie algebras 571
16.1 Lie groups and Lie algebras 571
16.2 Semi-simple Lie algebras 574
16.3 Linear representations 580
16.4 Real Lie algebras 583
16.5 Affine Kac–Moody algebras 587
16.6 Vertex operator representations 592

Index 599
1 Introduction
The aim of this book is to introduce the reader to classical integrable systems. Because the subject has been developed by several schools having different perspectives, it may appear fragmented at first sight. We develop here the thesis that it has a profound unity and that the various approaches are simply changes of point of view on the same underlying reality. The more one understands each approach, the more one sees their unity. In the end one gets a very small set of interconnected methods. This fundamental fact sets the tone of the book. We hope in this way to convey to the reader the extraordinary beauty of the structures emerging in this field, which have illuminated many other branches of theoretical physics.

The field of integrable systems was born together with Classical Mechanics, with a quest for exact solutions to Newton's equations of motion. It turned out that, apart from the Kepler problem which was solved by Newton himself, only a handful of other cases were found after two centuries of hard investigation. In the nineteenth century, Liouville finally provided a general framework characterizing the cases where the equations of motion are "solvable by quadratures". All examples previously found indeed pertained to this setting. The subject stayed dormant until the second half of the twentieth century, when Gardner, Greene, Kruskal and Miura invented the Classical Inverse Scattering Method for the Korteweg–de Vries equation, which had been introduced in fluid mechanics. Soon afterwards, the Lax formulation was discovered, and the connection with integrability was unveiled by Faddeev, Zakharov and Gardner. This was the signal for a revival of the field, leading to an enormous number of results, and truly general structures emerged which organized the subject. More recently, the extension of these results to Quantum Mechanics has already led to remarkable results and is still a very active field of research.
Let us give a general overview of the ideas we present in this book. They all find their roots in the notion of Lax pairs. It consists of presenting the equations of motion of the system in the form $\dot L(\lambda) = [M(\lambda), L(\lambda)]$, where the matrices L(λ) and M(λ) depend on the dynamical variables and on a parameter λ called the spectral parameter, and [ , ] denotes the commutator of matrices. The importance of Lax pairs stems from the following simple remark: the Lax equation is an isospectral evolution equation for the Lax matrix L(λ). It follows that the curve defined by the equation det(L(λ) − µI) = 0 is time-independent. This curve, called the spectral curve, can be seen as a Riemann surface. Its moduli contain the conserved quantities. This immediately introduces the two main structures into the theory: groups enter through the Lie algebra involved in the commutator [M, L], while complex analysis enters through the spectral curve. As integrable systems are rather rare, one naturally expects strong constraints on the matrices L(λ) and M(λ). Constructing consistent Lax matrices may be achieved by appealing to factorization problems in appropriate groups. Taking into account the spectral parameter promotes this group to a loop group. The factorization problem may then be viewed as a Riemann–Hilbert problem, a central tool of this subject. In the group theoretical setting, solving the equations of motion amounts to solving the factorization problem. In the analytical setting, solutions are obtained by considering the eigenvectors of the Lax matrix. At any point of the spectral curve there exists an eigenvector of L(λ) with eigenvalue µ. This defines an analytic line bundle L on the spectral curve with prescribed Chern class. The time evolution is described as follows: if L(t) is the line bundle at time t, then $L(t)L(0)^{-1}$ is of Chern class 0, i.e. is a point on the Jacobian of the spectral curve. It is a beautiful result that this point evolves linearly on the Jacobian.
As a consequence, one can express the dynamical variables in terms of theta-functions defined on the Jacobian of the spectral curve. The two methods are related as follows: the factorization problem in the loop group defines transition functions for the line bundle L. The framework can be generalized by replacing the Lax matrix by the first-order differential equation $(\partial_\lambda - M_\lambda(\lambda))\Psi = 0$, where $M_\lambda(\lambda)$ depends rationally on λ. The solution Ψ acquires non-trivial monodromy when λ describes a loop around a pole of $M_\lambda$. The isomonodromy problem consists of finding all $M_\lambda$ with prescribed monodromy data. The solutions depend, in general, on a number of continuous parameters. The deformation equations with respect to these parameters form an integrable system. The theta-functions of the isospectral approach are then promoted to more general objects called the tau-functions.
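The isospectral mechanism behind this construction is easy to check numerically. The sketch below is our own toy illustration, not an example from the book: a 3-site open Toda chain in Flaschka variables, for which the equations of motion take exactly the Lax form $\dot L = [M, L]$, so the eigenvalues of L must stay fixed along the flow.

```python
import numpy as np

# Toy check of isospectral Lax evolution (illustrative, not from the book):
# the 3-site open Toda chain in Flaschka variables (a1, a2; b1, b2, b3), with
#   L = tridiag(a; b; a),  M = (upper a-diagonal) - (lower a-diagonal),
# for which Ldot = [M, L] is equivalent to  adot_i = a_i (b_{i+1} - b_i),
#                                           bdot_i = 2 (a_i^2 - a_{i-1}^2).
def lax_matrix(a, b):
    return np.diag(b) + np.diag(a, 1) + np.diag(a, -1)

def rhs(y):
    a, b = y[:2], y[2:]
    adot = a * (b[1:] - b[:-1])
    a_ext = np.concatenate(([0.0], a, [0.0]))      # a_0 = a_n = 0 (open ends)
    bdot = 2.0 * (a_ext[1:]**2 - a_ext[:-1]**2)
    return np.concatenate([adot, bdot])

def rk4(y, dt, steps):
    # classical fourth-order Runge-Kutta integration of ydot = rhs(y)
    for _ in range(steps):
        k1 = rhs(y); k2 = rhs(y + dt/2*k1)
        k3 = rhs(y + dt/2*k2); k4 = rhs(y + dt*k3)
        y = y + dt/6*(k1 + 2*k2 + 2*k3 + k4)
    return y

y0 = np.array([0.7, 0.3, 0.1, -0.4, 0.5])          # (a1, a2, b1, b2, b3)
L0 = lax_matrix(y0[:2], y0[2:])
y1 = rk4(y0, 0.01, 1000)
L1 = lax_matrix(y1[:2], y1[2:])
spec0 = np.sort(np.linalg.eigvalsh(L0))
spec1 = np.sort(np.linalg.eigvalsh(L1))
print(np.max(np.abs(spec1 - spec0)))               # tiny: the spectrum is conserved
```

The same check works for any Lax pair: only the spectral invariants of L, i.e. the spectral curve, are constants of the motion, while the eigenvectors evolve.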
One can study the behaviour around each singularity of the differential operator quite independently. In the group theoretical version, the above extension of the framework corresponds to centrally extending the loop groups. Around a singularity the most general extended group is the group GL(∞), which corresponds to the KP hierarchy. It can be represented in a fermionic Fock space. Fermionic monomials acting on the vacuum yield decomposed vectors, which describe an infinite Grassmannian introduced by Sato. In this setting, the time flows are induced by the action of commuting one-parameter subgroups, and the tau-function is defined on the Grassmannian, i.e. the orbit of the vacuum, and characterizes it. Finally the Plücker equations of the Grassmannian are identified with the equations of motion, written in the bilinear Hirota form.

We have tried, as much as possible, to make the book self-contained, and to ensure that each chapter can be studied quite independently. Generally, we first explain methods and then show how they can be applied to particular examples, even though this does not correspond to the historical development of the subject. In Chapter 2 we introduce the classical definition of integrable systems through the Liouville theorem. We present the Lax pair formulation, and describe the symplectic structure which is encoded into the so-called r-matrix form. In Chapter 3 we explain how to construct Lax pairs with spectral parameter, for finite and infinite-dimensional systems. The Lax matrix may be viewed as an element of a coadjoint orbit of a loop group. This introduces immediately a natural symplectic structure and a factorization problem in the loop group. We also introduce, at this early stage, the notion of tau-functions. In Chapter 4 we discuss the abstract group theoretical formulation of the theory. We then describe the analytical aspects of the theory in Chapter 5.
In this setting, the action variables are g moduli of the spectral curve, a Riemann surface of genus g, and the angle variables are g points on it. We illustrate the general constructions by the examples of the closed Toda chain in Chapter 6, and the Calogero model in Chapter 7. The following two Chapters, 8 and 9, describe respectively the isomonodromic deformation problem and the infinite Grassmannian. Soliton solutions are obtained using vertex operators. Chapters 10 and 11 are devoted to the classical study of the KP and KdV hierarchies. We develop and use the formalism of pseudo-differential operators which allows us to give simple proofs of the main formal properties. Finite-zone solutions of KdV allow us to make contact with integrable systems of finite dimensionality and soliton solutions. In the next Chapter, 12, we study the class of Toda and sine-Gordon field theories. We use this opportunity to exhibit the relations between
their conformal and integrable properties. The sine-Gordon model is presented in the framework of the Classical Inverse Scattering Method in Chapter 13. This very ingenious method is exploited to solve the sine-Gordon equation. The last three chapters may be viewed as mathematical appendices, provided to help the reader. First we present the basic facts of symplectic geometry, which is the natural language to speak about Classical Mechanics and integrable systems. Since mathematical tools from Riemann surfaces and Lie groups are used almost everywhere, we have written two chapters presenting them in a concise way. We hope that they will be useful at least as an introduction and to fix notations. Let us say briefly how we have limited our discussion. First, we chose to remain consistently at a relatively elementary mathematical level, and have been obliged to exclude some important developments which require more advanced mathematics. We put the emphasis on methods and we have not tried to make an exhaustive list of integrable systems. Another aspect of the theory we have touched only very briefly, through the Whitham equations, is the study of perturbations of integrable systems. All these subjects are very interesting by themselves, but the present book is big enough! A most active field of recent research is concerned with quantum integrable systems or the closely related field of exactly soluble models in statistical mechanics. When writing this book we always had the quantum theory present in mind, and have introduced all classical objects which have a well-known quantum counterpart, or are semi-classical limits of quantum objects. This explains our emphasis on Hamiltonian methods, Poisson brackets, classical r-matrices, Lie–Poisson properties of dressing transformations and the method of separation of variables.
Although there is nothing quantum in this book, a large part of the apparatus necessary to understand the literature on quantum integrable systems is in fact present. The bibliography for integrable systems would fill a book by itself. We have made no attempt to provide one. Instead, we give, at the end of each chapter, a short list of references which complements and enhances the material presented in the chapter, and we highly encourage the reader to consult them. Of course these references are far from complete, and we apologize to the numerous authors who have contributed to the domain and whose due credit is not acknowledged. Finally, we want to thank our many colleagues from whom we learned so much and with whom we have discussed many parts of this book.
2 Integrable dynamical systems
We introduce the definition of integrable systems through the Liouville theorem, i.e. systems for which n conserved quantities in involution are known on a phase space of dimension 2n. The Liouville theorem asserts that the equations of motion can then be solved by quadrature. The notion of Lax matrix is introduced. This is a matrix whose elements are dynamical and whose time evolution is isospectral, a central object in the theory. It is also shown that the Poisson brackets of the elements of the Lax matrix are expressed in the so-called r-matrix form. Finally, we present some historical examples of integrable systems which are solved by the method of separation of variables. This leads to linearization of the time evolution on the Jacobian of Riemann surfaces, another recurring theme in the book.

2.1 Introduction

In Classical Mechanics the state of the system is specified by a point in phase space. This is generally a space of even dimension with coordinates of position q_i and momentum p_i. The Hamiltonian is a function on phase space, denoted H(p_i, q_i). The equations of motion are a first-order differential system taking the Hamiltonian form:

$$\dot q_i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q_i} \qquad (2.1)$$

Here and in the following, a dot will refer to a time derivative. For any function F(p, q) on phase space, this implies that F(p(t), q(t)) obeys:

$$\dot F \equiv \frac{dF}{dt} = \{H, F\}$$
where for any functions F and G the Poisson bracket {F, G} is defined as:

$$\{F, G\} \equiv \sum_i \left( \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} - \frac{\partial G}{\partial p_i}\frac{\partial F}{\partial q_i} \right)$$

For the coordinates p_i, q_i themselves we have

$$\{q_i, q_j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{p_i, q_j\} = \delta_{ij} \qquad (2.2)$$
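These definitions are easy to verify symbolically. The sketch below is illustrative (the symbols and the two-oscillator Hamiltonian are our choice, anticipating the harmonic oscillator example discussed next); note the sign convention of the bracket above, which gives {p_i, q_j} = δ_ij.

```python
import sympy as sp

# Symbolic sanity check of the Poisson bracket and of relations (2.2),
# using two decoupled harmonic oscillators (illustrative example).
p1, p2, q1, q2, w1, w2 = sp.symbols('p1 p2 q1 q2 omega1 omega2')
ps, qs = [p1, p2], [q1, q2]

def pb(F, G):
    # {F, G} = sum_i (dF/dp_i dG/dq_i - dG/dp_i dF/dq_i)
    return sum(sp.diff(F, p)*sp.diff(G, q) - sp.diff(G, p)*sp.diff(F, q)
               for p, q in zip(ps, qs))

F1 = (p1**2 + w1**2*q1**2)/2
F2 = (p2**2 + w2**2*q2**2)/2
H = F1 + F2

print(pb(p1, q1), pb(q1, q2))        # 1 0  (canonical relations (2.2))
print(sp.simplify(pb(F1, F2)))       # 0   (the two energies are in involution)
print(sp.simplify(pb(H, F1)))        # 0   (F1 is conserved under the H-flow)
```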
The quantity H(p, q) is automatically conserved under time evolution, (d/dt)H(p, q) = 0, so that the motion takes place on the subvariety of phase space defined by H = E constant. Historically, it proved very difficult to find dynamical systems such that eqs. (2.1) could be solved exactly. However, there is a general framework where the explicit solutions can be obtained by solving a finite number of algebraic equations and computing a finite number of integrals, i.e. the solution is obtained by quadratures. These dynamical systems are the Liouville integrable systems that we will consider in this book. A dynamical system on a phase space of dimension 2n is Liouville integrable if one knows n independent functions F_i on the phase space which Poisson commute, that is {F_i, F_j} = 0. The Hamiltonian is assumed to be a function of the F_i.

In order to understand the geometry of the situation, let us discuss a very simple example: the harmonic oscillator. The phase space is of dimension 2 and the Hamiltonian is $H = \frac{1}{2}(p^2 + \omega^2 q^2)$ with Poisson bracket {p, q} = 1. The phase space is fibred into ellipses H = E, except for the point (0, 0) which is a stationary point. An adapted coordinate system ρ, θ is given by:

$$p = \rho\cos(\theta), \qquad q = \frac{\rho}{\omega}\sin(\theta) \qquad (2.3)$$

and the non-vanishing Poisson bracket is {ρ, θ} = ω/ρ. In these coordinates the flow reads:

$$\rho = \sqrt{2E}, \qquad \theta = \omega t + \theta_0$$

i.e. the flow takes place on the above ellipse. This can be straightforwardly generalized to a direct sum of n harmonic oscillators with

$$H = \frac{1}{2}\sum_{i=1}^{n} \left( p_i^2 + \omega_i^2 q_i^2 \right)$$

and Poisson bracket eq. (2.2). We do have n conserved quantities in involution, $F_i = \frac{1}{2}(p_i^2 + \omega_i^2 q_i^2)$, and the level manifold M_f, i.e. the set of points of phase space such that F_i = f_i, is an n-dimensional real torus, which is explicitly a cartesian product of n topological circles. The motion takes place on these tori, which foliate the phase space. We can introduce n angles θ_i as above which evolve linearly in time with frequency ω_i. An orbit of the dynamical flow is dense on the torus when the ω_i are rationally independent.

For Liouville integrable systems, we shall assume that the conserved quantities are well-behaved, so that the n-dimensional surfaces M_f defined by F_i = f_i are generically regular and foliate the phase space. This does not preclude the existence of singular points such as p_i = q_i = 0 in the above example of the harmonic oscillator. In this setting we are now going to prove the Liouville theorem and show that the geometry of the situation is analogous to that of the harmonic oscillator example.

2.2 The Liouville theorem

We consider a dynamical Hamiltonian system with phase space M of dimension 2n. Introduce canonical coordinates p_i, q_i such that the non-degenerate Poisson bracket reads as in eq. (2.2). As usual, a non-degenerate Poisson bracket on M is equivalent to the data of a non-degenerate closed 2-form ω, dω = 0, defined on M, called the symplectic form, see Chapter 14. In the canonical coordinates the symplectic form reads

$$\omega = \sum_j dp_j \wedge dq_j$$
Let H be the Hamiltonian of the system.

Definition. The system is Liouville integrable if it possesses n independent conserved quantities F_i, i = 1, ..., n, {H, F_j} = 0, in involution:

$$\{F_i, F_j\} = 0$$

The independence means that at generic points (i.e. anywhere except on a set of measure zero) the dF_i are linearly independent, or that the tangent space of the surface F_i = f_i exists everywhere and is of dimension n. There cannot be more than n independent quantities in involution, otherwise the Poisson bracket would be degenerate. In particular, the Hamiltonian H is a function of the F_i.

The Liouville theorem. The solution of the equations of motion of a Liouville integrable system is obtained by "quadrature".

Proof. Let $\alpha = \sum_i p_i\, dq_i$ be the canonical 1-form and $\omega = d\alpha = \sum_i dp_i \wedge dq_i$ be the symplectic 2-form on the phase space M. We will construct a canonical transformation (p_i, q_i) → (F_i, ψ_i) such that the conserved quantities F_i are among the new coordinates:

$$\omega = \sum_i dp_i \wedge dq_i = \sum_i dF_i \wedge d\psi_i$$

If we succeed in doing that, the equations of motion become trivial:

$$\dot F_j = \{H, F_j\} = 0, \qquad \dot\psi_j = \{H, \psi_j\} = \frac{\partial H}{\partial F_j} = \Omega_j \qquad (2.4)$$

The Ω_j depend only on F and so are constant in time. In these coordinates, the solution of the equations of motion reads:

$$F_j(t) = F_j(0), \qquad \psi_j(t) = \psi_j(0) + t\,\Omega_j$$

To construct this canonical transformation, we exhibit its so-called generating function S. Let M_f be the level manifold F_i(p, q) = f_i. Suppose that on M_f we can solve for p_i, p_i = p_i(f, q), and consider the function

$$S(F, q) \equiv \int_{m_0}^{m} \alpha = \int_{q_0}^{q} \sum_i p_i(f, q)\, dq_i$$

where the integration path is drawn on M_f and goes from the point of coordinates (p(f, q_0), q_0) to the point (p(f, q), q), where q_0 is some reference value. Suppose that this function exists, i.e. that it does not depend on the path from m_0 to m; then $p_j = \frac{\partial S}{\partial q_j}$. Defining ψ_j by

$$\psi_j = \frac{\partial S}{\partial F_j}$$

we have

$$dS = \sum_j \left( \psi_j\, dF_j + p_j\, dq_j \right)$$

Since d²S = 0 we deduce that $\omega = \sum_j dp_j \wedge dq_j = \sum_j dF_j \wedge d\psi_j$. This shows that if S is a well-defined function, then the transformation is canonical. To show that S exists, we must prove that it is independent of the integration path. By Stokes' theorem, we have to prove that:

$$d\alpha\big|_{M_f} = \omega\big|_{M_f} = 0$$
Fig. 2.1. A leaf M_f of the foliation of phase space.

Let X_i be the Hamiltonian vector field associated with F_i, defined by dF_i = ω(X_i, ·),

$$X_i = \sum_k \left( \frac{\partial F_i}{\partial q_k}\frac{\partial}{\partial p_k} - \frac{\partial F_i}{\partial p_k}\frac{\partial}{\partial q_k} \right)$$

These vector fields are tangent to the manifold M_f because the F_j are in involution,

$$X_i(F_j) = \{F_i, F_j\} = 0$$

Since the F_j are assumed to be independent functions, the tangent space to the submanifold M_f is generated at each point m ∈ M_f by the vectors X_i|_m (i = 1, ..., n). But then

$$\omega(X_i, X_j) = dF_i(X_j) = 0$$

and we have proved that ω|_{M_f} = 0, and therefore S exists. We have effectively obtained the solution of the equations of motion through one quadrature (to calculate the function S) and some "algebraic manipulation" (to express the p as functions of q and F).

Remark 1. From the closedness of α on M_f, the function S is unchanged by continuous deformations of the path (m_0, m). However, if M_f has non-trivial cycles, which is generically the case, S is a multivalued function defined in a neighbourhood of M_f. The variation over a cycle

$$\Delta_{\rm cycle}\, S = \oint_{\rm cycle} \alpha$$

is a function of F only. This induces a multivaluedness of the variables ψ_j: $\Delta_{\rm cycle}\,\psi_j = \frac{\partial}{\partial F_j}\Delta_{\rm cycle}\, S$. For instance, in the case of harmonic oscillators, we see that above each point (q_1, ..., q_n) we have 2^n points on the M_f level surface, due to the independent choices of sign in $p_i = \pm\sqrt{2f_i - \omega_i^2 q_i^2}$. So we have many choices for the path of integration, reflecting the topology of the torus.

Remark 2. The definition we have given of a Liouville integrable system requires some care. Given any Hamiltonian H, the Darboux theorem, see Chapter 14, implies that we can always find locally a system of canonical coordinates on phase space (P_1, ..., P_n; Q_1, ..., Q_n), with H = P_1, hence fulfilling the hypothesis of the Liouville theorem. For integrable systems we require that the conserved quantities be globally defined on a sufficiently large open set, and that the surfaces F_i = f_i be well-behaved and foliate the phase space. This is not generally the case for the P_i constructed by the Darboux theorem. Moreover, in all known examples, the conserved quantities are even algebraic functions of canonical coordinates on some open domain, and the solutions of the equations of motion are analytic.

Remark 3. Using the Poisson-commuting functions F_i, one can solve simultaneously the n time evolution equations dF/dt_i = {F_i, F}, since:

$$\frac{\partial}{\partial t_i}\frac{\partial}{\partial t_j} F - \frac{\partial}{\partial t_j}\frac{\partial}{\partial t_i} F = \{F_i, \{F_j, F\}\} - \{F_j, \{F_i, F\}\} = \{\{F_i, F_j\}, F\} = 0$$

Since the Hamiltonian vector fields are well-defined and linearly independent everywhere, the flows define a locally free (no fixed points) and transitive (goes everywhere) action of a small open set in R^n on the surface M_f. Assuming that M_f is connected and compact, the flows extend to all values of the times t_i and fill the whole surface M_f, hence we have a surjective action of R^n on M_f. The stabilizer of a point is a discrete Abelian subgroup of R^n since the action is locally free, so it is of the form Z^n. Thus M_f appears as the quotient of R^n by Z^n, i.e. a torus. This refinement, due to Arnold, of the Liouville theorem shows that, under suitable global hypotheses, the phase space is indeed foliated by n-dimensional tori, called the Liouville tori. It is remarkable that for small perturbations of integrable systems there still exist Liouville tori "almost everywhere".
This is the content of the famous Kolmogorov–Arnold–Moser (KAM) theorem.
2.3 Action–angle variables

As already noticed in the proof of the Liouville theorem, the level manifold $M_f$ has non-trivial cycles. Under suitable compactness and connectivity conditions, the $M_f$ are $n$-dimensional tori $T^n$. This points to the introduction of angle variables to describe the motion along the cycles. The torus $T^n$ is isomorphic to a product of $n$ circles $C_i$. We may choose special angular coordinates on $M_f$ dual to the $n$ fundamental cycles $C_i$ (see eq. (2.5)).
The action variables $I_j$ are defined as the integrals of the canonical 1-form over the cycles $C_j$:
$$I_j = \frac{1}{2\pi}\oint_{C_j}\alpha$$
The $I_j$ are functions of the constants of motion $F_j$ and we suppose they are independent, so that if the values of $I_j$ $(j = 1,\dots,n)$ are known, then $M_f$ is determined. Let us consider the canonical transformation generated by the same function as above:
$$S(I, q) = \int_{m_0}^{m}\alpha$$
but expressed in terms of the variables $I_i$ instead of $F_i$. Denoting by $\theta_j$ the variable conjugate to $I_j$, the canonical transformation generated by $S$ is defined by
$$p_k = \frac{\partial S}{\partial q_k}, \qquad \theta_k = \frac{\partial S}{\partial I_k}$$
The variables $\theta_k$ are canonically conjugate to the action variables $I_j$. We show that they can be regarded as normalized angular variables on the cycles $C_j$. That is,
$$\frac{1}{2\pi}\oint_{C_j} d\theta_k = \delta_{jk} \qquad (2.5)$$
By definition of $\theta_k$,
$$\oint_{C_j} d\theta_k = \frac{\partial}{\partial I_k}\oint_{C_j} dS, \qquad dS = \sum_i\Bigl(\frac{\partial S}{\partial q_i}\,dq_i + \frac{\partial S}{\partial I_i}\,dI_i\Bigr)$$
Since on the manifold $M_f$ we have $dI_i = 0$, this gives
$$\oint_{C_j} d\theta_k = \frac{\partial}{\partial I_k}\oint_{C_j}\sum_i\frac{\partial S}{\partial q_i}\,dq_i = \frac{\partial}{\partial I_k}\oint_{C_j}\alpha = 2\pi\,\delta_{jk}$$
This proves that the $\theta_k$ are angle variables.

2.4 Lax pairs

The new concept which emerged from the modern studies of integrable systems is the notion of Lax pairs. A Lax pair $L, M$ consists of two matrices, functions on the phase space of the system, such that the Hamiltonian evolution equations, eq. (2.1), may be written as
$$\frac{dL}{dt} \equiv \dot L = [M, L] \qquad (2.6)$$
2 Integrable dynamical systems
Here, $[M, L] = ML - LM$ denotes the commutator of the matrices $M$ and $L$. The immediate interest in the existence of such a pair lies in the fact that it allows for an easy construction of conserved quantities. Indeed, the solution of eq. (2.6) is of the form
$$L(t) = g(t)\,L(0)\,g(t)^{-1}$$
where the invertible matrix $g(t)$ is determined by the equation
$$M = \frac{dg}{dt}\,g^{-1}$$
It follows that if $I(L)$ is a function of $L$ invariant under conjugation $L \to gLg^{-1}$, then $I(L(t))$ is a constant of the motion. Such functions are functions of the eigenvalues of $L$. We say that the evolution equation (2.6) is isospectral, which means that the spectrum of $L$ is preserved by the time evolution.

Remark 1. Recall that integrability of the system in the sense of Liouville demands that (i) the number of independent conserved quantities equals the number of degrees of freedom, and that (ii) these conserved quantities are in involution.

Remark 2. A Lax pair is by no means unique. Even the size of the matrices may be changed. There is also a natural gauge group acting on the Lax pair:
$$L \longrightarrow gLg^{-1}, \qquad M \longrightarrow gMg^{-1} + \frac{dg}{dt}\,g^{-1}$$
where $g$ is an invertible matrix, a function on phase space.
Let us present some simple examples showing that the equations of motion can indeed be recast into Lax form.

Example 1. For any integrable system in the sense of Liouville, one may construct a Lax pair in a tautological way. Consider a finite-dimensional Hamiltonian system, with $n$ degrees of freedom, Poisson bracket $\{\,,\,\}$ and Hamiltonian $H$. Suppose it is integrable in the sense of Liouville, which means that it possesses $n$ independent integrals of the motion $F_i$, $i = 1,\dots,n$, in involution. The Liouville theorem states that there exists, at least locally, a system of conjugate coordinates $I_i, \theta_i$, $i = 1,\dots,n$, where the $I_j$ are functions of the $F_i$ only. In these coordinates, the equations of motion take the very simple form
$$\dot I_j = 0, \qquad \dot\theta_j = \frac{\partial H}{\partial I_j} \qquad (2.7)$$
Introduce the Lie algebra generated by $\{H_i, E_i,\ i = 1,\dots,n\}$ with relations
$$[H_i, H_j] = 0, \qquad [H_i, E_j] = 2\,\delta_{ij}\,E_j, \qquad [E_i, E_j] = 0$$
This Lie algebra has a natural representation by $2n\times 2n$ matrices. Set:
$$L = \sum_{j=1}^{n}\bigl(I_j H_j + 2 I_j\theta_j E_j\bigr), \qquad M = -\sum_{j=1}^{n}\frac{\partial H}{\partial I_j}\,E_j$$
The equation $\dot L = [M, L]$ is then equivalent to eq. (2.7). Thus $L, M$ form a Lax pair. However, this construction is useless since it requires the knowledge of the action–angle variables to build the Lax pair, but if these are known, there is no need for a Lax pair any more.

Example 2. As a second example we exhibit a Lax pair for the harmonic oscillator. Let:
$$L = \begin{pmatrix} p & \omega q \\ \omega q & -p \end{pmatrix}, \qquad M = \begin{pmatrix} 0 & -\omega/2 \\ \omega/2 & 0 \end{pmatrix} \qquad (2.8)$$
We check immediately that the Lax equation, eq. (2.6), is equivalent to the equations of motion $\dot q = p$, $\dot p = -\omega^2 q$. Let us observe that the Hamiltonian $H$ can be written as $\frac14\,\mathrm{Tr}\,L^2$. This example can be generalized to $n$ independent harmonic oscillators by writing the Lax matrices $L, M$ in a block-diagonal form where each block is a two-by-two matrix as above. Now the conserved quantities are $\mathrm{Tr}\,L^{2p} = 2\sum_i (2F_i)^p$, with $2F_i = p_i^2 + \omega_i^2 q_i^2$, and $\mathrm{Tr}\,L^{2p+1} = 0$, so that they are equivalent to the collection of the $F_i$.

2.5 Existence of an r-matrix

A Lax pair provides us with conserved quantities without referring to a Poisson structure. The notion of Liouville integrability requires the knowledge of a Poisson structure together with the involution property of the conserved quantities. We shall now present the general form of the Poisson brackets between the matrix elements of the Lax matrix which ensures the involution property of the conserved quantities. Suppose we are given a Lax pair $L, M$ of $N\times N$ matrices, and suppose that the matrix $L$ can be diagonalized:
$$L = U\Lambda U^{-1} \qquad (2.9)$$
The matrix elements λk of the diagonal matrix Λ are the conserved quantities. We will not consider here the question of the independence of these quantities.
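The isospectral character of the Lax evolution is easy to check numerically for the harmonic oscillator pair (2.8). The sketch below (not from the original text; the frequency, step size and initial data are arbitrary test values) integrates the equations of motion and verifies that $\mathrm{Tr}\,L^2$, which fixes the spectrum here since $\mathrm{Tr}\,L = 0$, stays constant and equals $4H$.

```python
# Sanity check of the harmonic oscillator Lax pair, eq. (2.8):
# L = [[p, w q], [w q, -p]] evolves isospectrally under q' = p, p' = -w^2 q.
# Since Tr L = 0, the spectrum is fixed by Tr L^2 = 2(p^2 + w^2 q^2) = 4 H.
w = 1.3                      # arbitrary frequency

def rhs(q, p):
    return p, -w * w * q

def rk4(q, p, h):
    # one classical Runge-Kutta step for (q, p)
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + h/2*k1q, p + h/2*k1p)
    k3q, k3p = rhs(q + h/2*k2q, p + h/2*k2p)
    k4q, k4p = rhs(q + h*k3q, p + h*k3p)
    return (q + h/6*(k1q + 2*k2q + 2*k3q + k4q),
            p + h/6*(k1p + 2*k2p + 2*k3p + k4p))

def tr_L2(q, p):
    return 2 * (p*p + w*w*q*q)

q, p = 0.7, -0.4             # arbitrary initial data
H0 = 0.5 * (p*p + w*w*q*q)
c0 = tr_L2(q, p)
for _ in range(2000):
    q, p = rk4(q, p, 0.005)
drift = abs(tr_L2(q, p) - c0)
```

The drift in $\mathrm{Tr}\,L^2$ is at the level of the integrator error, as the isospectral property requires.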
Let us first introduce some notation. Let $E_{ij}$ be the canonical basis of the $N\times N$ matrices, $(E_{ij})_{kl} = \delta_{ik}\delta_{jl}$. We can write
$$L = \sum_{ij} L_{ij}\,E_{ij}$$
The components $L_{ij}$ of the Lax matrix are functions on the phase space. We can evaluate the Poisson brackets $\{L_{ij}, L_{kl}\}$ and gather the results as follows. Let
$$L_1 \equiv L\otimes 1 = \sum_{ij} L_{ij}\,(E_{ij}\otimes 1), \qquad L_2 \equiv 1\otimes L = \sum_{ij} L_{ij}\,(1\otimes E_{ij})$$
The index 1 or 2 means that the matrix $L$ sits in the first or second factor of the tensor product. Similarly, for $T$ living in the tensor product of two copies of the $N\times N$ matrices, we set
$$T = T_{12} = \sum_{ij,kl} T_{ij,kl}\,E_{ij}\otimes E_{kl}, \qquad T_{21} = \sum_{ij,kl} T_{ij,kl}\,E_{kl}\otimes E_{ij}$$
More generally, when we have tensor products with more copies of the $N\times N$ matrices, we denote by $L_\alpha$ the embedding of $L$ in the $\alpha$ position, e.g. $L_3 = 1\otimes 1\otimes L\otimes 1\otimes\cdots$, and by $T_{\alpha\beta}$ the embedding of $T$ in the $\alpha$ and $\beta$ positions. We shall also denote by $\mathrm{Tr}_\alpha$ the partial trace over the space in the $\alpha$ position of a tensor product. For example
$$\mathrm{Tr}_1\,T_{12} = \sum_{ij,kl} T_{ij,kl}\,\mathrm{Tr}(E_{ij})\,E_{kl}$$
Define $\{L_1, L_2\}$ as the matrix of Poisson brackets between the elements of $L$:
$$\{L_1, L_2\} = \sum_{ij,kl}\{L_{ij}, L_{kl}\}\,E_{ij}\otimes E_{kl}$$
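This tensor notation can be made concrete with a small numerical sketch (not from the original text; the matrix entries are arbitrary): $L_1 = L\otimes 1$ and $L_2 = 1\otimes L$ are built with a Kronecker product, and the partial trace $\mathrm{Tr}_1$ is checked on both.

```python
# Illustration of the tensor notation: for N = 2, build L1 = L (x) 1 and
# L2 = 1 (x) L as 4x4 matrices and take the partial trace Tr_1.
def kron(A, B):
    m = len(B)
    n = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n)]
            for i in range(n)]

def tr1(T, N):
    # Tr_1: trace over the first tensor factor, leaving an N x N matrix
    return [[sum(T[i * N + k][i * N + l] for i in range(N))
             for l in range(N)] for k in range(N)]

one = [[1, 0], [0, 1]]
L = [[2, 5], [5, -2]]        # an arbitrary traceless "Lax" matrix
L1 = kron(L, one)            # L in the first factor
L2 = kron(one, L)            # L in the second factor
T1 = tr1(L1, 2)              # = (Tr L) * identity, which vanishes here
T2 = tr1(L2, 2)              # = Tr(1) * L = 2 L
```

The two partial traces behave as the definitions require: $\mathrm{Tr}_1(L\otimes 1) = (\mathrm{Tr}\,L)\,1$ and $\mathrm{Tr}_1(1\otimes L) = N\,L$.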
For an integrable system, the Poisson brackets between the elements of the Lax matrix $L$ can be written in the following very special form:

Proposition. The involution property of the eigenvalues of $L$ is equivalent to the existence of a function $r_{12}$ on the phase space such that:
$$\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2] \qquad (2.10)$$
Proof. Assume first that the eigenvalues of $L$ Poisson commute, $\{\lambda_i, \lambda_j\} = 0$. Recall that $L$ is diagonalized by $U$, eq. (2.9). Since $U$ is a function on phase space, we compute directly the Poisson brackets $\{L_1, L_2\} = \{U_1\Lambda_1 U_1^{-1}, U_2\Lambda_2 U_2^{-1}\}$ using the Leibniz rule. We get nine terms. Out of these, four terms involve the Poisson brackets $\{U_1, U_2\}$. Introducing the quantity $k_{12} = \{U_1, U_2\}U_1^{-1}U_2^{-1}$, these terms can be written as
$$[[k_{12}, L_2], L_1] = \tfrac12[[k_{12}, L_2], L_1] - \tfrac12[[k_{21}, L_1], L_2]$$
Four other terms involve $\{\Lambda_1, U_2\}$ and $\{U_1, \Lambda_2\}$. Introducing $q_{12} = U_2\{U_1, \Lambda_2\}U_1^{-1}U_2^{-1}$ we can write them as $[q_{12}, L_1] - [q_{21}, L_2]$. Putting all this together, we get
$$\{L_1, L_2\} = U_1 U_2\{\Lambda_1, \Lambda_2\}U_1^{-1}U_2^{-1} + [r_{12}, L_1] - [r_{21}, L_2]$$
where $r_{12} = q_{12} + \tfrac12[k_{12}, L_2]$. This proves one part of the equivalence when $\{\Lambda_1, \Lambda_2\} = 0$. Conversely, suppose we have eq. (2.10). Then, in any matrix representation,
$$\{L_1^n, L_2^m\} = [a_{12}^{n,m}, L_1] + [b_{12}^{n,m}, L_2] \qquad (2.11)$$
with
$$a_{12}^{n,m} = \sum_{p=0}^{n-1}\sum_{q=0}^{m-1} L_1^{n-p-1} L_2^{m-q-1}\,r_{12}\,L_1^p L_2^q, \qquad b_{12}^{n,m} = -\sum_{p=0}^{n-1}\sum_{q=0}^{m-1} L_1^{n-p-1} L_2^{m-q-1}\,r_{21}\,L_1^p L_2^q$$
Taking the trace of eq. (2.11), and using that the trace of a commutator is zero, we get that the quantities $\mathrm{Tr}(L^n)$ are in involution. This is equivalent to the involution of the eigenvalues $\lambda_k$ of $L$. Although simple to prove, this proposition is important for developing formal aspects of integrable systems, since it allows us to control the Poisson brackets of the Lax matrix. The Jacobi identity on the Poisson bracket, see Chapter 14, yields the following constraint on $r$:
$$[L_1,\,[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{32}, r_{13}] + \{L_2, r_{13}\} - \{L_3, r_{12}\}] + \text{cyc. perm.} = 0 \qquad (2.12)$$
where cyc. perm. means cyclic permutations of the tensor indices 1, 2, 3. In a sense, solving this equation amounts to classifying integrable Hamiltonian systems.
If $r$ happens to be constant, the only remaining terms in eq. (2.12) are the first ones. In particular, the Jacobi identity is satisfied if a constant $r$-matrix satisfies
$$[r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{32}, r_{13}] = 0$$
When $r$ is antisymmetric, $r_{12} = -r_{21}$, this is called the classical Yang–Baxter equation. This case will be extensively studied in Chapter 4.

Remark 1. The form of the bracket is preserved by gauge transformations. If $\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2]$ and $L' = gLg^{-1}$, then there exists a matrix function $r'_{12}$ such that
$$\{L'_1, L'_2\} = [r'_{12}, L'_1] - [r'_{21}, L'_2]$$
The function $r'_{12}$ can be expressed in terms of the function $r_{12}$ and of the Poisson brackets between $g$ and the Lax matrix $L$:
$$r'_{12} = g_1 g_2\Bigl(r_{12} + g_1^{-1}\{g_1, L_2\} + \tfrac12[u_{12}, L_2]\Bigr)g_1^{-1}g_2^{-1} \qquad (2.13)$$
where $u_{12} = g_1^{-1}g_2^{-1}\{g_1, g_2\}$.

Remark 2. In the form (2.10) the antisymmetry property of the bracket is explicit, although $r$ has no special symmetry property. Furthermore, we have the freedom to redefine $r$ by
$$r_{12} \longrightarrow r_{12} + [\sigma_{12}, L_2] \qquad (2.14)$$
where $\sigma$ is symmetric, without changing the Poisson bracket.
Example. Let us give an example of this construction in the simple case of the harmonic oscillator. The Lax matrix $L$ is given in eq. (2.8) and we introduce the action–angle coordinates $\rho, \theta$ as in eq. (2.3). In these coordinates the matrix $L$ is diagonalized by:
$$U = U^{-1} = \begin{pmatrix} \cos\frac{\theta}{2} & \sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & -\cos\frac{\theta}{2} \end{pmatrix}$$
Since $\{U_1, U_2\} = 0$, we have $r_{12} = q_{12}$, which is easily computed to be:
$$r_{12} = \frac{\omega}{2\rho^2}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\otimes L$$
It is easy to verify that this $r$-matrix indeed satisfies eq. (2.10). Let us notice that it is a dynamical $r$-matrix, which means that it depends explicitly on the dynamical variables.
2.6 Commuting flows

The Poisson brackets, eq. (2.10), are equivalent to the involution of the eigenvalues of $L$. An equivalent set of commuting Hamiltonians, $H_n$, is given by the traces of the powers of the Lax matrix:
$$H_n = \mathrm{Tr}\,(L^n) \qquad (2.15)$$
The Hamiltonians $H_n$ are in involution, $\{H_n, H_m\} = 0$, since they are symmetric polynomials in the eigenvalues. Furthermore, we show that the time evolution of the Lax matrix $L$ with Hamiltonian $H_n$ naturally takes the Lax form.

Proposition. Suppose that $\{L_1, L_2\} = [r_{12}, L_1] - [r_{21}, L_2]$. If we take $H_n = \mathrm{Tr}\,(L^n)$ as Hamiltonians, then the equations of motion admit a Lax representation:
$$\frac{dL}{dt_n} \equiv \{H_n, L\} = [M_n, L], \qquad \text{with } M_n = -n\,\mathrm{Tr}_1\bigl(L_1^{n-1} r_{21}\bigr) \qquad (2.16)$$
Proof. Set $m = 1$ in eq. (2.11), and take the trace over the first space, to get $dL/dt_n = [M_n, L]$ with $M_n = -n\,\mathrm{Tr}_1(L_1^{n-1} r_{21})$.

Note that the matrices $M_n$ are unchanged under the transformation eq. (2.14). However, the matrices $M_n$ are not unique, since adding any matrix commuting with $L$ does not change the equations of motion.

We close this chapter by presenting some of the few historical examples of integrable systems which were known at the end of the nineteenth century. Of course all systems with one degree of freedom (phase space of dimension 2) are integrable, since the Hamiltonian $H$ is conserved. We discuss below more sophisticated examples with higher-dimensional phase spaces.

2.7 The Kepler problem

The first historical integrable system is the Kepler two-body problem. In the centre of mass frame, the equations of motion take the form:
$$\frac{d^2 x_i}{dt^2} = -\frac{\partial V(r)}{\partial x_i}, \qquad r = \sqrt{x_1^2 + x_2^2 + x_3^2}$$
In the traditional Kepler problem, $V(r) = C/r$, but we will consider any centrally symmetric potential $V(r)$. This is a Hamiltonian system with
$$H = \frac12\sum_{i=1}^{3} p_i^2 + V(r)$$
and Poisson brackets $\{p_i, x_j\} = \delta_{ij}$. The phase space is of dimension 6, and we have to exhibit three commuting conserved quantities. Due to central symmetry, the angular momentum $\vec J = (J_1, J_2, J_3)$,
$$J_{ij} = x_i p_j - x_j p_i = \epsilon_{ijk} J_k$$
is conserved. Here $\epsilon_{ijk}$ is the totally antisymmetric Levi-Civita tensor. The three components $J_i$ are conserved but do not Poisson commute. However, a set of three independent Poisson commuting quantities is provided by $H$, $J_3 \equiv J_{12}$, and $J^2 \equiv J_{12}^2 + J_{23}^2 + J_{13}^2$. At this point, one may follow the standard solution, which takes advantage of the conservation of $\vec J$, and restrict oneself to the plane perpendicular to $\vec J$ where the motion takes place. Here we prefer to show the Liouville theorem at work, and use only the three commuting conserved quantities. Due to the spherical symmetry of the problem, it is convenient to use spherical coordinates:
$$x_1 = r\sin\theta\cos\phi, \qquad x_2 = r\sin\theta\sin\phi, \qquad x_3 = r\cos\theta$$
We introduce the conjugate momenta $p_r, p_\theta, p_\phi$ by writing the canonical 1-form $\alpha = \sum_i p_i\,dx_i = p_r\,dr + p_\theta\,d\theta + p_\phi\,d\phi$. In these coordinates the conserved quantities read:
$$H = \frac12 p_r^2 + \frac{1}{2r^2}\,p_\theta^2 + \frac{1}{2r^2\sin^2\theta}\,p_\phi^2 + V(r), \qquad J^2 = p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}, \qquad J_3 = p_\phi \qquad (2.17)$$
On the surface $M_f$ corresponding to fixed values of the conserved quantities, we solve for the momenta in terms of the position variables, yielding:
$$p_r = \sqrt{2\bigl(H - V(r)\bigr) - \frac{J^2}{r^2}}, \qquad p_\theta = \sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}, \qquad p_\phi = J_3$$
Note that on $M_f$, $p_r$ depends only on $r$, $p_\theta$ only on $\theta$ and $p_\phi$ only on $\phi$ (it is constant). The variables $r, \theta, \phi$ are then called separated variables. The 1-form $\alpha$ restricted to $M_f$ is then obviously closed. The action $S$ appearing in the Liouville theorem reads:
$$S = \int^{r}\sqrt{2\bigl(H - V(r)\bigr) - \frac{J^2}{r^2}}\;dr + \int^{\theta}\sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}\;d\theta + \int^{\phi} J_3\,d\phi$$
The angle variables corresponding to our action variables are given by
$$\psi_H = \frac{\partial S}{\partial H}, \qquad \psi_{J^2} = \frac{\partial S}{\partial J^2}, \qquad \psi_{J_3} = \frac{\partial S}{\partial J_3}$$
and have simple time evolution with respective frequencies $(1, 0, 0)$ by eq. (2.4). Hence $\psi_{J^2}$ and $\psi_{J_3}$ remain constant, while $\psi_H = t - t_0$. This gives the standard formula for the Kepler motion:
$$t - t_0 = \int^{r}\frac{dr}{\sqrt{2\bigl(H - V(r)\bigr) - \frac{J^2}{r^2}}}$$
Note that the constancy of $\psi_{J_3}$ implies:
$$\dot\phi = \frac{J_3}{\sin^2\theta\,\sqrt{J^2 - \dfrac{J_3^2}{\sin^2\theta}}}\;\dot\theta$$
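The three commuting integrals used in this construction are easy to verify numerically. The following sketch (not part of the original text; it takes $V(r) = -1/r$, i.e. $C = -1$, and arbitrary initial data) integrates the Cartesian equations and checks that $H$ and all three components of $\vec J$, hence also $J^2$ and $J_3$, are conserved.

```python
import math

# Numerical check of the Kepler integrals of motion for V(r) = -1/r.
def accel(x):
    r = math.sqrt(sum(c * c for c in x))
    return [-c / r**3 for c in x]

def add(u, v, c):
    return [a + c * b for a, b in zip(u, v)]

def rk4(x, p, h):
    k1x, k1p = p, accel(x)
    k2x, k2p = add(p, k1p, h/2), accel(add(x, k1x, h/2))
    k3x, k3p = add(p, k2p, h/2), accel(add(x, k2x, h/2))
    k4x, k4p = add(p, k3p, h), accel(add(x, k3x, h))
    x = [a + h/6*(b + 2*c + 2*d + e) for a, b, c, d, e in zip(x, k1x, k2x, k3x, k4x)]
    p = [a + h/6*(b + 2*c + 2*d + e) for a, b, c, d, e in zip(p, k1p, k2p, k3p, k4p)]
    return x, p

def energy(x, p):
    r = math.sqrt(sum(c * c for c in x))
    return 0.5 * sum(c * c for c in p) - 1.0 / r

def angmom(x, p):
    return [x[1]*p[2] - x[2]*p[1], x[2]*p[0] - x[0]*p[2], x[0]*p[1] - x[1]*p[0]]

x, p = [1.0, 0.0, 0.2], [0.0, 0.9, 0.1]     # arbitrary bound orbit
H0, J0 = energy(x, p), angmom(x, p)
for _ in range(4000):
    x, p = rk4(x, p, 0.002)
dH = abs(energy(x, p) - H0)
dJ = max(abs(a - b) for a, b in zip(angmom(x, p), J0))
```

Both drifts stay at the level of the integrator error over several orbital periods.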
This relation, in turn, implies the conservation of $J_1$ and $J_2$:
$$J_1 = -J_3\cot\theta\cos\phi - \sin\phi\,\sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}, \qquad J_2 = -J_3\cot\theta\sin\phi + \cos\phi\,\sqrt{J^2 - \frac{J_3^2}{\sin^2\theta}}$$
so that the motion takes place in the plane perpendicular to $\vec J$, as expected. It is worth noticing that the present approach is the one which prevails in Quantum Mechanics, where the three components of $\vec J$ cannot be measured simultaneously.

2.8 The Euler top

We consider a rotating solid body attached to a fixed point. The Euler top corresponds to the case where there is no external force. It is very convenient to consider the equations of motion in a frame rotating with the body, as discovered by Euler. We choose the moving frame with origin at the fixed point of the top (that is, the point where the top is attached), and the axes being the principal inertia axes which diagonalize the inertia tensor computed with respect to the fixed point, $I_{ij} = \int(\vec x^{\,2}\delta_{ij} - x_i x_j)\,\rho(\vec x)\,d\vec x$, with $\rho(\vec x)$ the mass density. Let $\vec J$ be the angular momentum of the top seen in the moving frame. We have $\vec J = I\cdot\vec\omega$, where $I = \mathrm{Diag}(I_1, I_2, I_3)$ and $\vec\omega$ is the rotation vector of the moving frame. We shall assume that the principal moments of inertia $I_i$ are all different. The equation of motion reads:
$$\frac{d\vec J}{dt} = -\vec\omega\wedge\vec J$$
It expresses the conservation of $\vec J$ in the absolute frame. This can be recast into the Hamiltonian framework by defining the Poisson brackets:
$$\{J_i, J_j\} = \epsilon_{ijk} J_k$$
where $\epsilon_{ijk}$ is the usual antisymmetric tensor. The Hamiltonian reads:
$$H = \frac12\sum_{i=1}^{3}\frac{J_i^2}{I_i}$$
This Poisson bracket is degenerate because $J^2$ Poisson commutes with everything. One must choose a symplectic leaf to get a well-defined Hamiltonian system. This is achieved by fixing the value of $J^2$ to a numerical value. Then the phase space is of dimension 2, and the system is integrable with conserved quantity $H$. Note that the trajectories are immediately obtained as the intersection of the sphere $J_1^2 + J_2^2 + J_3^2 = J^2$ and the ellipsoid $J_1^2/I_1 + J_2^2/I_2 + J_3^2/I_3 = 2H$. Using these relations to compute $J_2$ and $J_3$ in terms of $J_1$ and substituting into the equation of motion of $J_1$, i.e. $\dot J_1 = (I_3^{-1} - I_2^{-1})\,J_2 J_3$, yields an equation of the form $\dot J_1^2 = \alpha + \beta J_1^2 + \gamma J_1^4$, so that $J_1$ is an elliptic function of $t$.

2.9 The Lagrange top

When the top is in a gravitational field its weight has to be taken into account and the problem is more complicated. Let us assume that the rotating frame has its origin at the fixed point. The problem is integrable only in special cases. One case, found by Lagrange, is when two inertia moments are equal, for example $I_1 = I_2$, and the centre of mass is located at a position $(x_1 = 0, x_2 = 0, x_3 = h)$ with respect to the rotating frame. This situation is achieved when the top has an axis of symmetry (around the third axis) and is attached to a point on this axis. For any top in a gravitational field, the equations of motion in the rotating frame take the form:
$$\frac{d\vec J}{dt} = -\vec\omega\wedge\vec J + \vec h\wedge\vec P, \qquad \frac{d\vec P}{dt} = -\vec\omega\wedge\vec P \qquad (2.18)$$
where $\vec P$ is the weight of the top, which is constant in the absolute frame, and $\vec h$ is the vector from the fixed point to the centre of mass, which is constant in the rotating frame. From these equations one can check the conservation of three quantities: $P^2$, $\vec J\cdot\vec P$, and the energy
$$H = \frac12\,(\vec J\cdot I^{-1}\vec J) - \vec P\cdot\vec h$$
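These conservation laws, and in the Lagrange case the extra integral $\vec J\cdot\vec h$ derived below, can be tested numerically on eqs. (2.18). In the sketch below (not from the original text; inertia moments, $\vec h$ and initial data are arbitrary values with $I_1 = I_2$ and $\vec h$ along the third axis), all four quantities are conserved by the flow.

```python
# Check that the flow (2.18) preserves P^2, J.P, H, and (Lagrange case
# I1 = I2, h along the third axis) the extra integral J.h.
I = [2.0, 2.0, 1.0]          # arbitrary, I1 = I2
h = [0.0, 0.0, 0.5]          # centre of mass on the symmetry axis

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rhs(J, P):
    w = [J[i] / I[i] for i in range(3)]      # omega = I^{-1} J
    wJ, hP, wP = cross(w, J), cross(h, P), cross(w, P)
    return [-a + b for a, b in zip(wJ, hP)], [-a for a in wP]

def rk4(J, P, dt):
    def ad(u, v, c): return [a + c * b for a, b in zip(u, v)]
    k1 = rhs(J, P)
    k2 = rhs(ad(J, k1[0], dt/2), ad(P, k1[1], dt/2))
    k3 = rhs(ad(J, k2[0], dt/2), ad(P, k2[1], dt/2))
    k4 = rhs(ad(J, k3[0], dt), ad(P, k3[1], dt))
    J = [a + dt/6*(b + 2*c + 2*d + e) for a, b, c, d, e in zip(J, k1[0], k2[0], k3[0], k4[0])]
    P = [a + dt/6*(b + 2*c + 2*d + e) for a, b, c, d, e in zip(P, k1[1], k2[1], k3[1], k4[1])]
    return J, P

def invariants(J, P):
    H = 0.5 * sum(J[i]**2 / I[i] for i in range(3)) - dot(P, h)
    return dot(P, P), dot(J, P), H, dot(J, h)

J, P = [0.4, -0.3, 0.8], [0.1, 0.2, -0.9]    # arbitrary initial data
c0 = invariants(J, P)
for _ in range(2000):
    J, P = rk4(J, P, 0.005)
drift = max(abs(a - b) for a, b in zip(invariants(J, P), c0))
```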
In order to formulate the equations of motion in a Hamiltonian framework, let us introduce the following Poisson brackets between the six dynamical quantities $J_i$ and $P_i$:
$$\{J_i, J_j\} = \epsilon_{ijk} J_k, \qquad \{J_i, P_j\} = \epsilon_{ijk} P_k, \qquad \{P_i, P_j\} = 0$$
The Hamilton equations of motion are precisely eqs. (2.18). The Poisson structure is degenerate, i.e. the two conserved quantities $P^2$ and $\vec J\cdot\vec P$ are in the centre. Hence the symplectic leaves are of dimension 4. The Hamiltonian $H$ provides one conserved quantity defined on the leaves, and that is all in general. Using the particular hypothesis of the Lagrange top, namely the rotational symmetry around the third axis, it is easy to see that $\vec J\cdot\vec h$ is a second independent conserved quantity. Indeed, multiplying the first of eqs. (2.18) by $\vec h$ we find
$$\frac{d}{dt}\,\vec h\cdot\vec J = -\vec h\cdot(\vec\omega\wedge\vec J) = -h_3\,(\omega_1 J_2 - \omega_2 J_1) = -h_3\,\omega_1\omega_2\,(I_2 - I_1) = 0$$
where we have used that $\vec h$ is along the third axis. Since $\vec h$ is constant, this quantity Poisson commutes with the Hamiltonian, hence the system is integrable. To solve this integrable system we describe the top by the Euler angles $(\theta, \phi, \psi)$. Recall that the rotation vector and the weight vector can be expressed in the moving frame as:
$$\begin{pmatrix}\omega_1\\ \omega_2\\ \omega_3\end{pmatrix} = \begin{pmatrix}\dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi\\ \dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi\\ \dot\phi\cos\theta + \dot\psi\end{pmatrix}, \qquad \begin{pmatrix}P_1\\ P_2\\ P_3\end{pmatrix} = \begin{pmatrix}P\sin\theta\sin\psi\\ P\sin\theta\cos\psi\\ P\cos\theta\end{pmatrix}$$
The two quantities in the centre of the Poisson bracket are $P^2$ and $\vec P\cdot\vec J \equiv P J_z = P\bigl((I_1\sin^2\theta + I_3\cos^2\theta)\,\dot\phi + I_3\cos\theta\,\dot\psi\bigr)$. Moreover, the Lagrange conserved quantity reads $\vec h\cdot\vec J \equiv h J_3 = h I_3(\dot\phi\cos\theta + \dot\psi)$. This allows us to eliminate $\dot\phi$ and $\dot\psi$ in the Hamiltonian:
$$H = \frac12 I_1(\sin^2\theta\,\dot\phi^2 + \dot\theta^2) + \frac12 I_3(\dot\phi\cos\theta + \dot\psi)^2 - P\cos\theta$$
We get a one-dimensional system in the variable $\theta$ with Hamiltonian:
$$H = \frac12 I_1\dot\theta^2 + \frac{(J_z - J_3\cos\theta)^2}{2 I_1\sin^2\theta} + \frac{J_3^2}{2 I_3} - P\cos\theta$$
It follows that $\theta$ is an elliptic function of the time $t$. Note that the choice of the Euler angle coordinates has disentangled the dynamics, since $\phi$ and $\psi$ no longer appear. This is related to the symmetry of the problem with respect to both the vertical axis and the axis of the top.
2.10 The Kowalevski top
There is another, much more hidden, case of integrability of the top, which was discovered by S. Kowalevski. As before, we consider the motion of the top in a moving frame with origin at the fixed point of the top. Assume that the moments of inertia obey $I_1 = I_2 = 2I_3$. Assume further that the centre of mass is in the plane $x_3 = 0$, but away from the origin, so that the top has no rotational symmetry. However, we are free to choose the inertia axes up to a rotation around the third one, hence to assume that the centre of mass is on the first axis. We introduce the traditional notation
$$\vec\omega = \begin{pmatrix}p\\ q\\ r\end{pmatrix}, \qquad \vec P = \begin{pmatrix}\gamma_1\\ \gamma_2\\ \gamma_3\end{pmatrix}, \qquad \vec h = \begin{pmatrix}h\\ 0\\ 0\end{pmatrix}$$
and we write eqs. (2.18) in components, with $c_0 = h/I_3$:
$$2\dot p = qr, \qquad 2\dot q = -pr - c_0\gamma_3, \qquad \dot r = c_0\gamma_2$$
$$\dot\gamma_1 = r\gamma_2 - q\gamma_3, \qquad \dot\gamma_2 = p\gamma_3 - r\gamma_1, \qquad \dot\gamma_3 = q\gamma_1 - p\gamma_2$$
The Hamiltonian and Poisson brackets are the same as in the Lagrange case. Again the Hamiltonian
$$H = \frac{I_3}{2}\,(2p^2 + 2q^2 + r^2) - h\gamma_1$$
and the following quantities
$$P^2 = \gamma_1^2 + \gamma_2^2 + \gamma_3^2, \qquad \vec P\cdot\vec J = I_3\,(2p\gamma_1 + 2q\gamma_2 + r\gamma_3)$$
are conserved. The last two are in the centre of the Poisson bracket, so the symplectic leaves are of dimension 4, and we need one further conserved quantity to prove integrability. To introduce it naturally, consider $z = p + iq$ and $\xi = \gamma_1 + i\gamma_2$. The equations of motion give:
$$2\dot z = -irz - ic_0\gamma_3, \qquad \dot\xi = -ir\xi + i\gamma_3 z$$
We can eliminate $\gamma_3$ by considering the combination $z^2 + c_0\xi$, which obeys:
$$\frac{d}{dt}(z^2 + c_0\xi) = -ir\,(z^2 + c_0\xi), \qquad \frac{d}{dt}(\bar z^2 + c_0\bar\xi) = ir\,(\bar z^2 + c_0\bar\xi)$$
where the second equation is obtained from the first one by complex conjugation. It is then clear that $|z^2 + c_0\xi|^2$ is conserved. In terms of the original variables, we have obtained the Kowalevski conserved quantity:
$$K = (p^2 - q^2 + c_0\gamma_1)^2 + (2pq + c_0\gamma_2)^2 \qquad (2.19)$$
Note that the conditions I1 = I2 = 2I3 are essential in this calculation. The solution of this model has been obtained by S. Kowalevski and is considerably more involved than in the previous cases. The main steps of the solution will be presented in Chapters 4 and 5.
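As a numerical check of this hidden conservation law, the sketch below (not from the original text; $c_0$ and the initial data are arbitrary test values) integrates the component equations above and verifies that $K$ stays constant.

```python
# Check that the Kowalevski quantity K, eq. (2.19), is conserved by the
# component equations of motion. c0 and initial data are test values.
c0 = 1.0

def rhs(s):
    p, q, r, g1, g2, g3 = s
    return (0.5 * q * r,
            0.5 * (-p * r - c0 * g3),
            c0 * g2,
            r * g2 - q * g3,
            p * g3 - r * g1,
            q * g1 - p * g2)

def rk4(s, h):
    def ad(u, k, c): return tuple(a + c * b for a, b in zip(u, k))
    k1 = rhs(s)
    k2 = rhs(ad(s, k1, h/2))
    k3 = rhs(ad(s, k2, h/2))
    k4 = rhs(ad(s, k3, h))
    return tuple(a + h/6*(b + 2*c + 2*d + e)
                 for a, b, c, d, e in zip(s, k1, k2, k3, k4))

def K(s):
    p, q, r, g1, g2, g3 = s
    return (p*p - q*q + c0*g1)**2 + (2*p*q + c0*g2)**2

s = (0.3, -0.2, 0.5, 0.6, 0.0, 0.8)    # (p, q, r, gamma_1..3), arbitrary
K0 = K(s)
for _ in range(2000):
    s = rk4(s, 0.005)
drift = abs(K(s) - K0)
```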
2.11 The Neumann model

This model deals with the motion of a particle on a sphere $S^{N-1}$ submitted to harmonic forces with generically different frequencies in each direction. It was first introduced by Neumann. An easy formulation is achieved by introducing a Lagrange multiplier, $\Lambda$, and writing the Lagrangian:
$$L = \sum_{k=1}^{N}\frac12\,(\dot x_k^2 - a_k x_k^2) + \frac12\,\Lambda\Bigl(\sum_{l=1}^{N} x_l^2 - 1\Bigr)$$
The equations of motion are:
$$\ddot x_k = -a_k x_k + \Lambda x_k, \qquad \sum_l x_l^2 = 1$$
To compute $\Lambda$, we multiply by $x_k$ and sum over $k$, which yields $\Lambda = -\sum_k(\dot x_k^2 - a_k x_k^2)$, where we used that the constraint implies $\sum_k(x_k\ddot x_k + \dot x_k^2) = 0$. This leads to the non-linear Newton equations of motion for the particle:
$$\ddot x_k = -a_k x_k - x_k\sum_l(\dot x_l^2 - a_l x_l^2) \qquad (2.20)$$
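The preservation of the constraints by this flow can be illustrated numerically. The sketch below (not from the original text; $N = 3$, with arbitrary frequencies $a_k$) starts from data satisfying $\sum_k x_k^2 = 1$ and $\sum_k x_k\dot x_k = 0$ and checks that both conditions hold after integrating eq. (2.20).

```python
import math

# Integrate the Neumann equations (2.20) and monitor the constraints
# sum x_k^2 = 1 and sum x_k xdot_k = 0. a_k and initial data are test values.
a = [0.5, 1.0, 2.0]

def rhs(x, v):
    lam = sum(vi * vi - ai * xi * xi for vi, ai, xi in zip(v, a, x))
    return v, [-ai * xi - xi * lam for ai, xi in zip(a, x)]

def rk4(x, v, h):
    def ad(u, k, c): return [p + c * q for p, q in zip(u, k)]
    k1x, k1v = rhs(x, v)
    k2x, k2v = rhs(ad(x, k1x, h/2), ad(v, k1v, h/2))
    k3x, k3v = rhs(ad(x, k2x, h/2), ad(v, k2v, h/2))
    k4x, k4v = rhs(ad(x, k3x, h), ad(v, k3v, h))
    x = [p + h/6*(b + 2*c + 2*d + e) for p, b, c, d, e in zip(x, k1x, k2x, k3x, k4x)]
    v = [p + h/6*(b + 2*c + 2*d + e) for p, b, c, d, e in zip(v, k1v, k2v, k3v, k4v)]
    return x, v

r3, r2 = math.sqrt(3), math.sqrt(2)
x = [1/r3, 1/r3, 1/r3]                 # on the sphere: sum x_k^2 = 1
v = [0.3/r2, -0.3/r2, 0.0]             # tangent: sum x_k v_k = 0
for _ in range(2500):
    x, v = rk4(x, v, 0.002)
c1 = abs(sum(xi * xi for xi in x) - 1.0)
c2 = abs(sum(xi * vi for xi, vi in zip(x, v)))
```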
Conversely, if we start with initial conditions satisfying $\sum_k x_k^2 = 1$ and $\sum_k x_k\dot x_k = 0$, these conditions are preserved by the time evolution.

It is important for us to cast this model in the Hamiltonian formulation. This is achieved by introducing a larger phase space and then reducing by a symmetry. Consider a $2N$-dimensional phase space with coordinates $x_n, y_n$, $n = 1,\dots,N$, and canonical Poisson brackets
$$\{x_n, y_m\} = \delta_{nm} \qquad (2.21)$$
and introduce the "angular momentum" antisymmetric matrix
$$J_{kl} = x_k y_l - x_l y_k$$
and the Hamiltonian:
$$H = \frac14\sum_{k\neq l} J_{kl}^2 + \frac12\sum_k a_k x_k^2 \qquad (2.22)$$
We shall assume in the following that $a_1 < a_2 < \cdots < a_N$. The Hamiltonian equations are, with $X = (x_k)$, $Y = (y_k)$, and the diagonal constant matrix $L_0 = (a_k\delta_{kl})$:
$$\dot X = -JX, \qquad \dot Y = -JY - L_0 X \qquad (2.23)$$
The Hamiltonian and the symplectic form have a symmetry
$$Y \to Y + \lambda X, \qquad X \to X$$
and we can perform a Hamiltonian reduction under this symmetry group (see Chapter 14). The moment map is given by $M = \frac12\sum_k x_k^2$, since $\{M, y_k\} = x_k$, $\{M, x_k\} = 0$. We fix the moment to $M = \frac12$. The reduced phase space is then obtained by taking the quotient by the stability group of the moment, which is here the whole symmetry group. This amounts to imposing some gauge condition, e.g. $(X, Y) = 0$. The reduced phase space has the correct dimension $2N - 2$ for a point on a sphere.

Remark. The reduced equations of motion are equivalent to the equations of motion of the Neumann model. Indeed, the reduced system is characterized by the conditions ${}^tXX = 1$ and ${}^tXY = 0$, but the equations of motion (2.23) do not preserve the second condition. We need to perform simultaneously a time-dependent gauge transformation $Y \to Y + \lambda(t)X$ to keep the motion on the gauge surface. Writing
$$0 = \frac{d}{dt}(X, Y + \lambda X) = (-JX,\,Y + \lambda X) + (X,\,-JY - L_0 X + \dot\lambda X - \lambda JX) = -(X, L_0 X) + \dot\lambda$$
since $J$ is antisymmetric, gives $\dot\lambda = (X, L_0 X)$. The equation of motion for $Y' = Y + \lambda X$ on the gauge surface is thus:
$$\dot Y' = (-JY - L_0 X) - \lambda JX + \dot\lambda X = -JY' - L_0 X + (X, L_0 X)X$$
Since $J' = J$ we have $\dot X = -JX = Y'$ and
$$\dot Y' = -(Y', Y')X - L_0 X + (X, L_0 X)X$$
so that eliminating $Y'$ we finally get:
$$\ddot X = -L_0 X - \bigl((\dot X, \dot X) - (X, L_0 X)\bigr)X$$
which is identical to eq. (2.20).
The Liouville integrability of this system is a consequence of the existence of $(N-1)$ independent quantities in involution, first found by K. Uhlenbeck:
$$F_k = x_k^2 + \sum_{l\neq k}\frac{J_{kl}^2}{a_k - a_l}, \qquad \sum_k F_k = 1 \qquad (2.24)$$
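Since (2.23) is the Hamiltonian flow of (2.22), the conservation of the $F_k$ can be checked directly on the unreduced system. The sketch below (not from the original text; $N = 3$ with arbitrary $a_k$, $X$, $Y$) integrates (2.23) and verifies that each $F_k$ is constant, and also that $H = \frac12\sum_k a_k F_k$ as an identity.

```python
# Check conservation of the Uhlenbeck integrals (2.24) along the
# unreduced flow (2.23). a_k, X, Y below are arbitrary test data.
a = [0.5, 1.0, 2.0]
N = 3

def Jmat(X, Y):
    return [[X[k]*Y[l] - X[l]*Y[k] for l in range(N)] for k in range(N)]

def rhs(X, Y):
    J = Jmat(X, Y)
    JX = [sum(J[k][l] * X[l] for l in range(N)) for k in range(N)]
    JY = [sum(J[k][l] * Y[l] for l in range(N)) for k in range(N)]
    return [-u for u in JX], [-u - ak * xk for u, ak, xk in zip(JY, a, X)]

def rk4(X, Y, h):
    def ad(u, k, c): return [p + c * q for p, q in zip(u, k)]
    k1 = rhs(X, Y)
    k2 = rhs(ad(X, k1[0], h/2), ad(Y, k1[1], h/2))
    k3 = rhs(ad(X, k2[0], h/2), ad(Y, k2[1], h/2))
    k4 = rhs(ad(X, k3[0], h), ad(Y, k3[1], h))
    X = [p + h/6*(b + 2*c + 2*d + e) for p, b, c, d, e in zip(X, k1[0], k2[0], k3[0], k4[0])]
    Y = [p + h/6*(b + 2*c + 2*d + e) for p, b, c, d, e in zip(Y, k1[1], k2[1], k3[1], k4[1])]
    return X, Y

def F(X, Y):
    J = Jmat(X, Y)
    return [X[k]**2 + sum(J[k][l]**2 / (a[k] - a[l])
                          for l in range(N) if l != k) for k in range(N)]

def H(X, Y):
    J = Jmat(X, Y)
    return 0.25 * sum(J[k][l]**2 for k in range(N) for l in range(N)) \
           + 0.5 * sum(ak * xk * xk for ak, xk in zip(a, X))

X, Y = [0.6, 0.0, 0.8], [0.1, -0.5, 0.3]
F0, H0 = F(X, Y), H(X, Y)
for _ in range(2000):
    X, Y = rk4(X, Y, 0.005)
drift = max(abs(u - v) for u, v in zip(F(X, Y), F0))
check = abs(H0 - 0.5 * sum(ak * fk for ak, fk in zip(a, F0)))
```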
Notice that the Hamiltonian of the Neumann model can be expressed in terms of the $F_k$ as
$$H = \frac12\sum_k a_k F_k$$
Alternatively, we can implement the Hamiltonian reduction by considering functions of $X$ and $Y$ invariant under $Y \to Y + \lambda X$. Such invariant functions are functions of $X$ and $J$. Given $X$ and $J$, an antisymmetric rank 2 matrix whose image contains $X$, we can always find a vector $Y$, up to the above symmetry, such that $J_{kl} = x_k y_l - x_l y_k$. The equations of motion can be written in terms of the two gauge invariant matrices $J = X\,{}^tY - Y\,{}^tX$ and $K = X\,{}^tX$:
$$\dot K = -[J, K], \qquad \dot J = [L_0, K]$$
That eq. (2.23) implies the above is a simple computation. Conversely, knowing $K$ it is easy to compute $X$, since $K$ is a projector on $X$, whose length is 1, and then one can compute $Y$ knowing $J$, up to a gauge.

2.12 Geodesics on an ellipsoid

The Neumann problem was shown by Moser to contain, in particular, the geodesic motion on an ellipsoid, which was found to be integrable by Jacobi. Consider the Hamiltonian belonging to the Neumann conserved quantities, but different from the usual Neumann Hamiltonian:
$$H = \sum_k\frac{F_k}{a_k} = Q(X, X) - Q(X, X)\,Q(Y, Y) + Q^2(X, Y)$$
where we have defined the quadratic form:
$$Q(X, Y) = \sum_k\frac{x_k y_k}{a_k}$$
The Hamiltonian $H$ is of course conserved, so we can restrict ourselves to the surface $H = 0$. Note that if one defines the quantity $\xi$, invariant under the gauge transformation $Y \to Y + \lambda X$, by
$$\xi = Y - \frac{Q(X, Y)}{Q(X, X)}\,X$$
we have $H = Q(X, X)\,(1 - Q(\xi, \xi))$, so that the condition $H = 0$ is equivalent to the condition $Q(\xi, \xi) = 1$. The flow of $H$, for $H = 0$, leaves $\xi$ on the ellipsoid $Q(\xi, \xi) = 1$.
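The identity $H = Q(X,X)(1 - Q(\xi,\xi))$ follows from expanding $Q(\xi,\xi)$ and using the definition of $\xi$; a short numerical check (not from the original text; all vectors and weights are arbitrary test data):

```python
# Verify H = Q(X,X) (1 - Q(xi,xi)) for the geodesic Hamiltonian,
# with arbitrary test vectors X, Y and weights a_k.
a = [0.5, 1.0, 2.0]

def Q(U, V):
    return sum(u * v / ak for u, v, ak in zip(U, V, a))

X, Y = [0.6, 0.0, 0.8], [0.1, -0.5, 0.3]
H = Q(X, X) - Q(X, X) * Q(Y, Y) + Q(X, Y)**2
s = Q(X, Y) / Q(X, X)
xi = [y - s * x for x, y in zip(X, Y)]
gap = abs(H - Q(X, X) * (1 - Q(xi, xi)))
```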
We want to show that the trajectories of $\xi$ are geodesics on the ellipsoid. We compute the time evolution of $\xi$, and for this we remark that, since $\xi$ is gauge invariant, one can use the unreduced equations of motion:
$$\dot x_k = \frac{\partial H}{\partial y_k} = -2Q(X, X)\,\frac{y_k}{a_k} + 2Q(X, Y)\,\frac{x_k}{a_k} = -2Q(X, X)\,\frac{\xi_k}{a_k}$$
$$\dot y_k = -\frac{\partial H}{\partial x_k} = -2\bigl(1 - Q(Y, Y)\bigr)\frac{x_k}{a_k} - 2Q(X, Y)\,\frac{y_k}{a_k}$$
Note that, since $\xi$ is gauge independent, so is its time derivative. Denoting $s = Q(X, Y)/Q(X, X)$ we then have:
$$\dot\xi_k = -2\Bigl(1 - Q(Y, Y) + \frac{Q^2(X, Y)}{Q(X, X)}\Bigr)\frac{x_k}{a_k} - \dot s\,x_k$$
On the surface $H = 0$ this becomes simply:
$$\frac{d\xi_k}{dt} = -\dot s\,x_k \quad\Longrightarrow\quad \frac{d\xi_k}{ds} = -x_k$$
Since $\sum_k x_k^2 = 1$ the length of the vector $d\xi/ds$ is 1, which means that $s$ is the length parameter on the trajectory of $\xi$. Next we compute:
$$\frac{d^2\xi_k}{ds^2} = -\frac{1}{\dot s}\frac{dx_k}{dt} = \frac{2Q(X, X)}{\dot s}\,\frac{\xi_k}{a_k} \qquad (2.25)$$
The vector with components $\xi_k/a_k$ is the gradient of $Q(\xi, \xi)$, hence is normal to the ellipsoid $Q(\xi, \xi) = 1$. Equation (2.25) shows that the second derivative of $\xi$ with respect to the length parameter $s$ is normal to the surface. This characterizes geodesics, as we now show. Indeed, to find geodesics on the surface $f(\xi) = 0$, we have to minimize the arc length:
$$\int\Bigl(\sqrt{\dot\xi\cdot\dot\xi} + \Lambda f(\xi)\Bigr)\,dt$$
where $\Lambda$ is a Lagrange parameter. The Euler–Lagrange equation reads:
$$\frac{d}{dt}\,\frac{\dot\xi}{\sqrt{\dot\xi\cdot\dot\xi}} = \Lambda\,\nabla f(\xi)$$
So the derivative of the normalized velocity vector is perpendicular to the surface. The geodesic motion on an ellipsoid was solved originally by Jacobi by introducing ellipsoidal coordinates which separate the variables for this problem. As they separate the variables of the Neumann model as well, we now explain this method for the Neumann model.
2.13 Separation of variables in the Neumann model

Following Jacobi and Neumann, we introduce $(N-1)$ parameters on the sphere, $\zeta_1,\dots,\zeta_{N-1}$. They are the roots of the equation:
$$u(\zeta) \equiv \sum_k\frac{x_k^2}{\zeta - a_k} = 0$$
This equation is invariant under $x_k \to \lambda x_k$, so that $\zeta_1 < \zeta_2 < \cdots < \zeta_{N-1}$ are indeed defined on the sphere. Conversely, by definition of the $\zeta_j$, we have for $x \in S^{N-1}$:
$$u(\zeta) = \frac{\prod_j(\zeta - \zeta_j)}{\prod_k(\zeta - a_k)} \quad\Longrightarrow\quad x_k^2 = \frac{\prod_j(a_k - \zeta_j)}{\prod_{l\neq k}(a_k - a_l)} \qquad (2.26)$$
Considering the graph of $u(\zeta)$ it is easy to see that:
$$a_1 < \zeta_1 < a_2 < \zeta_2 < a_3 < \cdots < \zeta_{N-1} < a_N$$
and we have a bijection between this domain $D$ of the $\zeta_j$ and the "quadrant" $x_k > 0\ \forall k$ of the sphere. The $(\zeta_j)$ define an orthogonal system of coordinates on the sphere, of ellipsoidal type. Let us consider, for each root $\zeta_j$, the vector:
$$\frac{\partial x}{\partial\zeta_j} = \frac12\,v_j, \qquad v_j = \Bigl(\frac{x_1}{\zeta_j - a_1},\dots,\frac{x_N}{\zeta_j - a_N}\Bigr) \qquad (2.27)$$
Since $\zeta_j$ solves $u(\zeta) = 0$, we have $x\cdot v_j = 0$. Moreover,
$$v_j\cdot v_{j'} = \sum_k\frac{x_k^2}{(\zeta_j - a_k)(\zeta_{j'} - a_k)} = -\frac{u(\zeta_j) - u(\zeta_{j'})}{\zeta_j - \zeta_{j'}} = 0, \qquad \text{if } j\neq j'$$
Therefore the vectors $v_j$ are $(N-1)$ orthogonal vectors in the tangent plane to the sphere $S^{N-1}$ at the point $x$. As a byproduct, since $v_j^2 = -u'(\zeta_j)$, we also get the metric tensor:
$$g_{jj'} = \frac{\partial x}{\partial\zeta_j}\cdot\frac{\partial x}{\partial\zeta_{j'}} = -\frac14\,\delta_{jj'}\,u'(\zeta_j)$$
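These ellipsoidal coordinates are easy to realize numerically: the roots $\zeta_j$ interlace the $a_k$ and eq. (2.26) reconstructs the $x_k^2$. The sketch below (not from the original text; $N = 3$ with arbitrary $a_k$ and $x$) finds the roots by bisection on the intervals $(a_j, a_{j+1})$.

```python
import math

# Ellipsoidal coordinates on the sphere: find the roots zeta_j of
# u(zeta) = sum_k x_k^2 / (zeta - a_k) = 0 by bisection in (a_j, a_{j+1}),
# then reconstruct x_k^2 from eq. (2.26).
a = [0.5, 1.0, 2.0]
x = [0.2, 0.3, 0.6]
norm = math.sqrt(sum(c * c for c in x))
x = [c / norm for c in x]                 # normalize: point on the sphere

def u(z):
    return sum(c * c / (z - ak) for c, ak in zip(x, a))

zeta = []
for j in range(len(a) - 1):
    lo, hi = a[j] + 1e-12, a[j + 1] - 1e-12   # u -> +inf at lo, -inf at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if u(mid) > 0:
            lo = mid
        else:
            hi = mid
    zeta.append(0.5 * (lo + hi))

def xk2(k):
    # right-hand side of eq. (2.26)
    num = 1.0
    for z in zeta:
        num *= a[k] - z
    den = 1.0
    for l, al in enumerate(a):
        if l != k:
            den *= a[k] - al
    return num / den

err = max(abs(xk2(k) - x[k]**2) for k in range(len(a)))
interlace = all(a[j] < zeta[j] < a[j + 1] for j in range(len(zeta)))
```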
To compute the momenta conjugate to the variables $\zeta_j$, we consider the canonical 1-form $\alpha = \sum_k y_k\,dx_k$ associated with the Poisson bracket eq. (2.21). We write it as $\alpha = \sum_j p_j\,d\zeta_j$. One gets
$$p_j = \sum_k y_k\,\frac{\partial x_k}{\partial\zeta_j} = \frac12\,y\cdot v_j$$
These $(N-1)$ equations determine $y$ up to a vector proportional to $x$, which does not affect the value of $J_{kl}$. A solution is $y = \frac12\sum_j g^{jj}p_j v_j$, which easily gives:
$$J_{kl} = -\frac12\,(a_k - a_l)\sum_j v_j^k\,v_j^l\,g^{jj}\,p_j$$
With this, we can compute the conserved quantities $F_k$, eq. (2.24), in terms of the new canonical coordinates $\zeta_j, p_j$:
$$F_k = x_k^2 + \frac14\sum_{jj'}\Bigl(a_k\,v_j\cdot v_{j'} - \sum_l a_l\,v_j^l\,v_{j'}^l\Bigr)v_j^k\,v_{j'}^k\,g^{jj}\,g^{j'j'}\,p_j\,p_{j'}$$
Noting that $v_j\cdot v_{j'} = 4\,g_{jj}\,\delta_{jj'}$ and $\sum_l a_l\,v_j^l\,v_{j'}^l = 4\,\zeta_j\,g_{jj}\,\delta_{jj'}$, we obtain:
$$F_k = x_k^2\Bigl(1 - \sum_j\frac{g^{jj}\,p_j^2}{\zeta_j - a_k}\Bigr)$$
It is convenient to introduce the generating function of the $F_k$:
$$H(\lambda) \equiv \sum_k\frac{F_k}{\lambda - a_k} = \frac{\prod_n(\lambda - b_n)}{\prod_k(\lambda - a_k)} \qquad (2.28)$$
for appropriate $b_n$, $n = 1,\dots,N-1$, where we have used $\sum_k F_k = 1$ to normalize the leading coefficient of the numerator. By a simple calculation we find:
$$H(\lambda) = u(\lambda)\Bigl(1 - \sum_j\frac{g^{jj}\,p_j^2}{\zeta_j - \lambda}\Bigr) \qquad (2.29)$$
Following the general strategy of the Liouville theorem, we express the momenta $p_j$ in terms of the conserved quantities $F_k$ and the $\zeta_j$. We have:
$$g^{jj}\,p_j^2 = \lim_{\lambda\to\zeta_j}\frac{\lambda - \zeta_j}{u(\lambda)}\,H(\lambda) \quad\Longrightarrow\quad p_j^2 = -\frac14\,H(\zeta_j)$$
where we have taken into account the value of the metric tensor and eq. (2.26). Notice that, on $M_f$, the momentum $p_j$ is a function of $\zeta_j$ only, so that the coordinates $\zeta_j$ form a set of separated variables for the Neumann model.
The function $S = \sum_j\int^{\zeta_j} p_j\,d\zeta_j$ reads
$$S = \frac12\sum_j\int^{\zeta_j}\sqrt{-H(\zeta)}\;d\zeta = \frac12\sum_j\int^{\zeta_j}\sqrt{-\frac{\prod_n(\zeta - b_n)}{\prod_k(\zeta - a_k)}}\;d\zeta \qquad (2.30)$$
and is a sum of terms, one for each separated variable. Choosing as independent action variables the $(N-1)$ independent quantities $b_n$ instead of the $N$ dependent quantities $F_k$, the conjugate variables $\psi_n$ are:
$$\psi_n = \frac{\partial S}{\partial b_n} = -\frac14\sum_j\int^{\zeta_j}\frac{\sqrt{-H(\zeta)}}{\zeta - b_n}\,d\zeta \qquad (2.31)$$
By the Liouville theorem, the time evolution of the $\psi_n$ under the Hamiltonian $H(\lambda)$ is linear:
$$\dot\psi_n = \frac{\partial H(\lambda)}{\partial b_n} = -\frac{H(\lambda)}{\lambda - b_n} = -\frac14\sum_j\frac{\sqrt{-H(\zeta_j)}}{\zeta_j - b_n}\,\dot\zeta_j \qquad (2.32)$$
where the last equality is obtained by differentiating eq. (2.31) with respect to time. Equation (2.32) gives the evolution of the variables $\zeta_j$. This can be formulated in a more geometrical way by introducing the polynomial of degree $(2N-1)$:
$$P(\zeta) = \prod_{i=1}^{N}(\zeta - a_i)\,\prod_{n=1}^{N-1}(\zeta - b_n)$$
We can rewrite eq. (2.32) as the set of $N-1$ equations:
\[ \sum_j \frac{Q_n(\zeta_j)\, d\zeta_j}{\sqrt{-P(\zeta_j)}} = 4\, \frac{Q_n(\lambda)}{\prod_i (\lambda - a_i)}\, dt, \qquad n = 1, \dots, N-1 \]
with
\[ Q_n(\lambda) = \prod_{m \neq n} (\lambda - b_m) \]
Since the $Q_n(\lambda)$ are $N-1$ linearly independent polynomials of degree $N-2$, this system is equivalent to the following one:
\[ \sum_j \frac{\zeta_j^k\, d\zeta_j}{\sqrt{-P(\zeta_j)}} = 4\, \frac{\lambda^k}{\prod_i (\lambda - a_i)}\, dt, \qquad k = 0, \dots, N-2 \tag{2.33} \]
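The step from the $Q_n$ system to eq. (2.33) rests on the $Q_n(\lambda)$ being $N-1$ linearly independent polynomials of degree $N-2$. A minimal numerical check, with hypothetical values of the $b_n$:

```python
import numpy as np

# Sketch: the Q_n(lambda) = prod_{m != n}(lambda - b_m) are N-1 linearly
# independent polynomials of degree N-2, hence span the same space as the
# monomials lambda^k, k = 0..N-2.  The b_n below are illustrative.
b = np.array([0.5, 1.5, 3.0, 5.0])               # N-1 = 4 distinct values
Q = np.array([np.poly(np.delete(b, n)) for n in range(len(b))])
assert Q.shape == (len(b), len(b))               # coefficient matrix
assert np.linalg.matrix_rank(Q) == len(b)        # full rank: independent
```

The rank condition fails exactly when two $b_n$ coincide, which is the non-generic case excluded here.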
The quantities $\zeta^k\, d\zeta / \sqrt{-P(\zeta)}$, $k = 0, \dots, N-2$, are the Abelian differentials of the first kind on the hyperelliptic Riemann surface of genus $g = N-1$ defined by the equation $s^2 + P(\zeta) = 0$, see Chapter 15. The sums appearing in the left-hand side are thus Abel sums, and describe a point in the Jacobian of the curve. Equation (2.33) shows that this point moves linearly in time. This relationship between integrability, separation of variables, Riemann surfaces and linear flows on their Jacobian will reappear in a much broader context in the following chapters.
3 Synopsis of integrable systems
In this chapter, we introduce Lax pairs with spectral parameters. These are Lax matrices $L(\lambda)$ and $M(\lambda)$ depending analytically on a parameter $\lambda$. The study of the analytical properties of the Lax equation $\dot L(\lambda) = [M(\lambda), L(\lambda)]$ yields considerable insight into its structure and quickly introduces many of the major objects and concepts which will be developed in depth in the subsequent chapters. The first important result is that the possible forms of $M(\lambda)$ are completely determined by eq. (3.15). This form of $M(\lambda)$ is such that the commutator $[M(\lambda), L(\lambda)]$ has the same polar structure as $L(\lambda)$. The Lax equation then has a natural interpretation as a flow on a coadjoint orbit of a loop group. This in turn has the important consequence of introducing a symplectic structure into the theory, allowing us to connect with Liouville integrability. Moreover, this geometric interpretation of the Lax equation lends itself to a solution by factorization in a loop group, which is a Riemann–Hilbert problem. Studying the analytic structure of $M(\lambda)$, we are led to consider an infinite family of elementary flows, depending on the order of the poles. This introduces a connection between time flows and the spectral parameter dependence, which finds a striking expression in Sato's formula expressing the wave function in terms of tau-functions, eq. (3.61). The same ideas are exploited to analyse field theories. Here the Lax equation is replaced by a zero curvature equation for a Lax connection depending on a spectral parameter. We show that the role of the Lax matrix is played by the so-called monodromy matrix. Starting from a linear Poisson bracket in the r-matrix form for the Lax connection, we get a quadratic Poisson bracket for the monodromy matrix. The factorization problem allows us to define a group of transformations, the dressing group, acting on the solutions of the equations of motion. It is shown that this
action is a Lie–Poisson action whose generator is the monodromy matrix. Finally, we use simple dressing elements to produce the so-called soliton solutions.

3.1 Examples of Lax pairs with spectral parameter

We begin our study of Lax pairs depending on a complex parameter $\lambda$, called the spectral parameter, by giving a few examples.

Example 1. Our first example is provided by the Euler top, see section (2.8) in Chapter 2. In this case, a Lax pair appears naturally. Let us introduce the $3 \times 3$ matrices $J_{ij} = \epsilon_{ijk} J_k$ and $\Omega_{ij} = \epsilon_{ijk} \omega_k$. Then the equation of motion $\frac{d\vec J}{dt} = -\vec\omega \wedge \vec J$ can be recast in matrix form:
\[ \frac{dJ}{dt} = [\Omega, J] \]
This is a Lax pair with $L = J$ and $M = \Omega$, but unfortunately the conserved quantities $\operatorname{Tr} L^n$ either vanish or are functions of $\vec J^{\,2}$, and therefore the Hamiltonian is not included in this set of conserved quantities (recall that $\vec J^{\,2}$ is in the centre of the Poisson bracket). To cure this problem some modifications are needed. Let us introduce a diagonal matrix $I = \mathrm{Diag}(I_1, I_2, I_3)$ with entries $I_k = \frac{1}{2}(I_i + I_j - I_k)$, where $(i, j, k)$ is a cyclic permutation of $(1, 2, 3)$ and the $I_k$ on the right-hand side are the moments of inertia of the top. With these notations we have
\[ J = I\Omega + \Omega I \]
We assume that all $I_j$ are different and we set:
\[ L(\lambda) = I^2 + \frac{1}{\lambda} J, \qquad M(\lambda) = \lambda I + \Omega \tag{3.1} \]
where $\lambda$ is a free arbitrary parameter, the so-called spectral parameter. To check that the Lax equation gives back the equations of motion, we compute:
\[ \dot L(\lambda) - [M(\lambda), L(\lambda)] = [J, I] + [I^2, \Omega] + \frac{1}{\lambda}\big( \dot J + [J, \Omega] \big) \]
The first two terms cancel, while the vanishing of the $1/\lambda$ term gives the equations of motion. This Lax pair is much better than the previous one because:
\[ \operatorname{Tr} L^2(\lambda) = \operatorname{Tr} I^4 - \frac{2}{\lambda^2}\, \vec J^{\,2} \]
\[ \operatorname{Tr} L^3(\lambda) = \operatorname{Tr} I^6 + \frac{3}{\lambda^2}\, \operatorname{Tr}\big( I^2 J^2 \big) = \operatorname{Tr} I^6 - \frac{3}{\lambda^2} \Big( \frac{1}{2} \Big( \sum_k I_k^2 \Big) \vec J^{\,2} - 2\, I_1 I_2 I_3\, H \Big) \]
where in the last expression $I_1, I_2, I_3$ denote the moments of inertia of the top;
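A quick numerical sanity check of this bookkeeping — the relation $J = I\Omega + \Omega I$, the cancellation $[J, I] + [I^2, \Omega] = 0$, and the appearance of the Hamiltonian in $\operatorname{Tr} L^3(\lambda)$ — can be done in a few lines. All numerical values below are illustrative assumptions:

```python
import numpy as np

# Sketch: Euler-top Lax-pair identities.  I_top are the moments of
# inertia I_1, I_2, I_3 of the top; the diagonal matrix I has entries
# (I_i + I_j - I_k)/2 for (i,j,k) cyclic.  Illustrative values only.
I_top = np.array([2.0, 3.0, 4.0])
Jv = np.array([0.7, -1.2, 0.5])               # angular momentum J_k
om = Jv / I_top                               # omega_k = J_k / I_k
hat = lambda v: np.array([[0, v[2], -v[1]],
                          [-v[2], 0, v[0]],
                          [v[1], -v[0], 0]])  # X_ij = eps_ijk x_k
J, Om = hat(Jv), hat(om)
Imat = np.diag([(I_top[1] + I_top[2] - I_top[0]) / 2,
                (I_top[2] + I_top[0] - I_top[1]) / 2,
                (I_top[0] + I_top[1] - I_top[2]) / 2])

# J = I Omega + Omega I, and [J, I] + [I^2, Omega] = 0
assert np.allclose(J, Imat @ Om + Om @ Imat)
assert np.allclose(J @ Imat - Imat @ J + Imat**2 @ Om - Om @ Imat**2, 0)

# Tr(I^2 J^2) = -(1/2)(sum_k I_k^2) J.J + 2 I_1 I_2 I_3 H,  H = J.omega/2
H = 0.5 * Jv @ om
lhs = np.trace(Imat**2 @ J @ J)
rhs = -0.5 * np.sum(I_top**2) * (Jv @ Jv) + 2 * np.prod(I_top) * H
assert np.isclose(lhs, rhs)
```

The last identity is exactly why $\operatorname{Tr} L^3(\lambda)$ contains the Hamiltonian with a non-vanishing coefficient proportional to $I_1 I_2 I_3$.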
hence we now do have the Hamiltonian among the conserved quantities of the form $\operatorname{Tr} L^n(\lambda)$. The important new point is that the Lax matrix depends on a spectral parameter $\lambda$, and this was necessary to generate the proper conserved quantities. Furthermore, the Lax equation holds true identically in $\lambda$.

Example 2. As a second example we consider the Lagrange top. The matrices $L(\lambda)$ and $M(\lambda)$ are written as $4 \times 4$ matrices in block form,
\[ L(\lambda) = \begin{pmatrix} 0 & I\,{}^t h + \lambda^{-2}\,{}^t P \\ I h + \lambda^{-2} P & \lambda^{-1} J \end{pmatrix}, \qquad M(\lambda) = \begin{pmatrix} 0 & \lambda\,{}^t h \\ \lambda h & \Omega \end{pmatrix} \tag{3.2} \]
where the $3 \times 3$ matrices $J$ and $\Omega$ are as in the previous example, and $h$ and $P$ are $3 \times 1$ matrices corresponding to the vectors $\vec h$ and $\vec P$ of the Lagrange top, see section (2.9) in Chapter 2. Moreover, $I$ stands for the two equal moments of inertia of the top, $I = I_1 = I_2$. Let us write the Lax equation $0 = \dot L - [M, L]$, or:
\[ 0 = \begin{pmatrix} 0 & I\,{}^t h \Omega - {}^t h J + \lambda^{-2}\big( {}^t \dot P + {}^t P \Omega \big) \\ -I \Omega h + J h + \lambda^{-2}\big( \dot P - \Omega P \big) & \lambda^{-1}\big( \dot J + [J, \Omega] + P\,{}^t h - h\,{}^t P \big) \end{pmatrix} \]
Due to the Lagrange condition $I = I_1 = I_2$ we have $I \Omega h = J h$, and the vanishing of the other elements reduces to the equations of motion $\dot J + [J, \Omega] + P\,{}^t h - h\,{}^t P = 0$ and $\dot P = \Omega P$.

Example 3. Finally, consider the Neumann model. As we have seen in section (2.11) in Chapter 2, the equations of motion for gauge invariant quantities are:
\[ \dot K = -[J, K], \qquad \dot J = [L_0, K] \]
To recast these two relations into the Lax form we introduce the matrices
\[ L(\lambda) = L_0 + \frac{1}{\lambda} J - \frac{1}{\lambda^2} K, \qquad M(\lambda) = -\frac{1}{\lambda} K \tag{3.3} \]
and compute:
\[ \dot L(\lambda) - [M(\lambda), L(\lambda)] = \frac{1}{\lambda}\big( \dot J - [L_0, K] \big) - \frac{1}{\lambda^2}\big( \dot K + [J, K] \big) \]
The Lax equation with spectral parameter is equivalent to the vanishing of the two coefficients $\dot J - [L_0, K]$ and $\dot K + [J, K]$. Hence the Lax equation is equivalent to the equations of motion of the Neumann model. There also exists a Lax pair with spectral parameter for the Kowalevski top; however, its construction is more involved and will be discussed in Chapter 4.
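The Neumann computation above is purely algebraic, so it can be checked with random matrices of the right symmetry type ($J$ antisymmetric, $K$ symmetric, $L_0$ diagonal — illustrative data, not the actual Neumann phase space):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Sketch: the Lax pair (3.3) with random illustrative data.  The Lax
# equation holds identically in lambda iff the 1/lambda and 1/lambda^2
# coefficients vanish separately: Jdot = [L0, K], Kdot = -[J, K].
L0 = np.diag(rng.normal(size=N))
A = rng.normal(size=(N, N)); J = A - A.T      # antisymmetric
B = rng.normal(size=(N, N)); K = B + B.T      # symmetric

comm = lambda X, Y: X @ Y - Y @ X
Jdot, Kdot = comm(L0, K), -comm(J, K)         # equations of motion

lam = 0.37                                    # generic spectral parameter
L = L0 + J / lam - K / lam**2
M = -K / lam
Ldot = Jdot / lam - Kdot / lam**2
assert np.allclose(Ldot, comm(M, L))          # Lax equation at this lambda
```

Since the check passes for a generic value of $\lambda$ and both sides are Laurent polynomials in $1/\lambda$ of degree two, it holds identically in $\lambda$.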
3.2 The Zakharov–Shabat construction

Given an integrable system, there does not yet exist a useful algorithm to construct a Lax pair. There does exist, however, a general procedure, due to Zakharov and Shabat, to construct consistent Lax pairs giving rise to integrable systems. This is a general method to construct matrices $L(\lambda)$ and $M(\lambda)$, depending on a spectral parameter $\lambda$, such that the Lax equation
\[ \partial_t L(\lambda) = [M(\lambda), L(\lambda)] \tag{3.4} \]
is equivalent to the equations of motion of an integrable system. The method consists of specifying the analytical properties of the matrices $L(\lambda)$ and $M(\lambda)$, $\lambda \in \mathbb{C}$. We consider here systems with a finite number of degrees of freedom. The main result is eq. (3.15), expressing the possible forms of the matrix $M$ in the Lax pair. We will end the section by showing that the previous examples do fit into this framework.

We first introduce a notation. For any matrix-valued rational function $f(\lambda)$ with poles of order $n_k$ at points $\lambda_k$ at finite distance, we can decompose $f(\lambda)$ as
\[ f(\lambda) = f_0 + \sum_k f_k(\lambda), \qquad \text{with} \quad f_k(\lambda) = \sum_{r=-n_k}^{-1} f_{k,r}\, (\lambda - \lambda_k)^r \]
with $f_0$ a constant. The quantity $f_k(\lambda)$ is called the polar part at $\lambda_k$. When there is no ambiguity about the pole we are considering, we will often use the alternative notation $f_-(\lambda) \equiv f_k(\lambda)$. Around one of the points $\lambda_k$, $f(\lambda)$ may be decomposed as follows:
\[ f(\lambda) = f(\lambda)_+ + f(\lambda)_- \tag{3.5} \]
with $f(\lambda)_+$ regular at the point $\lambda_k$ and $f(\lambda)_- = f_k(\lambda)$ being the polar part.

Let us now consider matrices $L(\lambda)$ and $M(\lambda)$ of dimension $N \times N$. We will assume that the matrices $L(\lambda)$ and $M(\lambda)$ are rational functions of the parameter $\lambda$. Let $\{\lambda_k\}$ be the set of their poles, namely the poles of $L(\lambda)$ and those of $M(\lambda)$. With the above notations, assuming no pole at infinity, we can write quite generally:
\[ L(\lambda) = L_0 + \sum_k L_k(\lambda), \qquad \text{with} \quad L_k(\lambda) \equiv \sum_{r=-n_k}^{-1} L_{k,r}\, (\lambda - \lambda_k)^r \tag{3.6} \]
and
\[ M(\lambda) = M_0 + \sum_k M_k(\lambda), \qquad \text{with} \quad M_k(\lambda) \equiv \sum_{r=-m_k}^{-1} M_{k,r}\, (\lambda - \lambda_k)^r \tag{3.7} \]
Here $n_k$ and $m_k$ refer to the order of the poles at the corresponding point $\lambda_k$. The coefficients $L_{k,r}$ and $M_{k,r}$ are matrices. We will assume that the positions of the poles $\lambda_k$ are constants, independent of time. The Lax equation (3.4), with $L(\lambda)$ and $M(\lambda)$ given by eqs. (3.6, 3.7), must hold identically in $\lambda$. Looking at eq. (3.4) we see that the pole $\lambda_k$ in the left-hand side is a priori of order $n_k$, while in the right-hand side it is potentially of order $n_k + m_k$. Hence we have two types of equation. The first type does not contain time derivatives and comes from setting to zero the coefficients of the poles of order greater than $n_k$ in the right-hand side of the equation. This will be interpreted as $m_k$ constraint equations on $M_k$. The equations of the second type are obtained by matching the coefficients of the poles of order less than or equal to $n_k$ on both sides of the equation. These equations contain time derivatives and are thus the true dynamical equations.

Proposition. Assuming that $L(\lambda)$ has distinct eigenvalues in a neighbourhood of $\lambda_k$, one can perform a regular similarity transformation $g^{(k)}(\lambda)$ diagonalizing $L(\lambda)$ in a vicinity of $\lambda_k$:
\[ L(\lambda) = g^{(k)}(\lambda)\, A^{(k)}(\lambda)\, g^{(k)-1}(\lambda) \tag{3.8} \]
where $A^{(k)}(\lambda)$ is diagonal and has a pole of order $n_k$ at $\lambda_k$. As a result, the decomposition of $L(\lambda)$ and $M(\lambda)$ in polar parts reads:
\[ L = L_0 + \sum_k L_k, \qquad \text{with} \quad L_k = \big( g^{(k)} A^{(k)} g^{(k)-1} \big)_- \tag{3.9} \]
\[ M = M_0 + \sum_k M_k, \qquad \text{with} \quad M_k = \big( g^{(k)} B^{(k)} g^{(k)-1} \big)_- \tag{3.10} \]
where $B^{(k)}(\lambda)$ has a pole of order $m_k$ at $\lambda_k$. Moreover, the Lax equation implies that $B^{(k)}(\lambda)$ is diagonal.

Proof. If $\lambda_k$ is a pole of $L(\lambda)$, demanding that $L(\lambda)$ has distinct eigenvalues in a neighbourhood of $\lambda_k$ means that $L_{k,-n_k}$ has distinct eigenvalues. Then the matrix $Q(\lambda) = (\lambda - \lambda_k)^{n_k} L(\lambda)$, which is regular at $\lambda_k$, can be diagonalized in a vicinity of $\lambda_k$ with a regular matrix $g^{(k)}(\lambda)$. This proves eq. (3.8). Then, defining $B^{(k)}(\lambda)$ by
\[ M(\lambda) = g^{(k)}(\lambda)\, B^{(k)}(\lambda)\, g^{(k)-1}(\lambda) + \partial_t g^{(k)}(\lambda)\, g^{(k)-1}(\lambda) \tag{3.11} \]
the Lax equation becomes:
\[ \dot A^{(k)}(\lambda) = [B^{(k)}(\lambda), A^{(k)}(\lambda)] \]
This implies $\dot A^{(k)} = 0$ as expected (because the commutator with a diagonal matrix has no element on the diagonal), and, moreover, if we assume that the diagonal elements of $A^{(k)}$ are all distinct, this equation implies that $B^{(k)}$ is also diagonal. Finally, the term $\partial_t g^{(k)}\, g^{(k)-1}$ is regular and does not contribute to the singular part $M_k$ of $M$ at $\lambda_k$. Hence $M_k = (g^{(k)} B^{(k)} g^{(k)-1})_-$, which only depends on $B^{(k)}_-$.

It is worth noting that the first $n$ coefficients of the expansion of $g^{(k)}(\lambda)$ only depend on the first $n$ coefficients of the expansion of $Q(\lambda)$. The matrix $g^{(k)}(\lambda)$ is defined up to right multiplication by an arbitrary analytic diagonal matrix. Note that this simultaneous diagonalization of $L(\lambda)$ and $M(\lambda)$ works around any point where $L(\lambda)$ has distinct eigenvalues.

This proposition clarifies the structure of the Lax pair. Only the singular parts of $A^{(k)}$ and $B^{(k)}$ contribute to $L_k$ and $M_k$. The independent parameters in $L(\lambda)$ are thus $L_0$, the singular diagonal matrices $A^{(k)}_-$ of the form
\[ A^{(k)}_- = \sum_{r=-n_k}^{-1} A_{k,r}\, (\lambda - \lambda_k)^r \tag{3.12} \]
and jets of regular matrices $g^{(k)}$ of order $n_k - 1$, defined up to right multiplication by a regular diagonal matrix $d^{(k)}(\lambda)$:
\[ g^{(k)} = \sum_{r=0}^{n_k - 1} g_{k,r}\, (\lambda - \lambda_k)^r \tag{3.13} \]
From these data, we can reconstruct the Lax matrix $L(\lambda)$ by defining $L = L_0 + \sum_k L_k$ with
\[ L_k \equiv \big( g^{(k)} A^{(k)}_- g^{(k)-1} \big)_- \tag{3.14} \]
Then around each $\lambda_k$ one can diagonalize $L(\lambda) = g^{(k)} A^{(k)} g^{(k)-1}$. This yields an extension of the matrices $A^{(k)}_-$ and $g^{(k)}$ to complete series $A^{(k)}$ and $g^{(k)}$ in $(\lambda - \lambda_k)$. Finally, to define $M(\lambda) = M_0 + \sum_k M_k$, we choose a set of polar matrices $(B^{(k)}(\lambda))_-$ and use the series $g^{(k)}$ to define $M_k$ by eq. (3.10). In the vicinity of a singularity, $L(\lambda)$ and $M(\lambda)$ can be simultaneously diagonalized if the Lax equation holds true. In this diagonal gauge, the
Lax equation simply states that the matrix $A^{(k)}(\lambda)$ is conserved and that $B^{(k)}(\lambda)$ is diagonal. When we transform these results into the original gauge, we get the general solution of the non-dynamical constraints on $M(\lambda)$:

Proposition. Let $L(\lambda)$ be a Lax matrix of the form eq. (3.6). The general form of the matrix $M(\lambda)$ such that the orders of the poles match on both sides of the Lax equation is $M = M_0 + \sum_k M_k$ with
\[ M_k = \big( P^{(k)}(L, \lambda) \big)_- \tag{3.15} \]
where $P^{(k)}(L, \lambda)$ is a polynomial in $L(\lambda)$ with coefficients rational in $\lambda$, and $(\ )_-$ denotes the singular part at $\lambda = \lambda_k$.

Proof. It is easy to show that this is indeed a solution. We have to check that the order of the poles is correct. Let us look at what happens around $\lambda = \lambda_k$. Using a beautiful argument first introduced by Gelfand and Dickey, we write:
\[ [M_k, L]_- = \Big[ \big( P^{(k)}(L, \lambda) \big)_-, L \Big]_- = \Big[ P^{(k)}(L, \lambda) - \big( P^{(k)}(L, \lambda) \big)_+, L \Big]_- = -\Big[ \big( P^{(k)}(L, \lambda) \big)_+, L \Big]_- \]
where we used that a polynomial in $L$ commutes with $L$. From this we see that the order of the pole at $\lambda_k$ is less than $n_k$. To show that this is the general solution, recall eqs. (3.8, 3.10). Since $A^{(k)}(\lambda)$ is a diagonal $N \times N$ matrix with all its elements distinct in a vicinity of $\lambda_k$, its powers $0$ up to $N-1$ span the space of diagonal matrices, and one can write
\[ B^{(k)} = P^{(k)}(A^{(k)}, \lambda) \tag{3.16} \]
where $P^{(k)}(A^{(k)}, \lambda)$ is a polynomial of degree $N-1$ in $A^{(k)}$. The coefficients of $P^{(k)}$ are rational combinations of the matrix elements of $A^{(k)}$ and $B^{(k)}$, hence admit Laurent expansions in $\lambda - \lambda_k$ in a vicinity of $\lambda_k$. Inserting eq. (3.16) into eq. (3.10) one gets $M_k = \big( P^{(k)}(L, \lambda) \big)_-$. Moreover, in this formula the Laurent expansions of the coefficients of $P^{(k)}$ can be truncated at some positive power of $\lambda - \lambda_k$, since a high enough power cannot contribute to the singular part, yielding a polynomial with coefficients Laurent polynomials in $\lambda - \lambda_k$.

It is important to realize that the dynamical variables are the matrix elements of the Lax matrix, or the matrix elements of the $L_{k,r}$. Choosing
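The Gelfand–Dickey cancellation can be made explicit in the simplest case: one simple pole, $L(\lambda) = L_0 + L_1/\lambda$, and $P(L) = L^2$. Then $M = (L^2)_-$ a priori produces poles of order $3$ in $[M, L]$, and the coefficients of $\lambda^{-3}$ and $\lambda^{-2}$ cancel identically. A numerical sketch with random illustrative matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 3
L0, L1 = rng.normal(size=(N, N)), rng.normal(size=(N, N))
comm = lambda X, Y: X @ Y - Y @ X

# L(lam) = L0 + L1/lam.  With P(L) = L^2 one has
# L^2 = L0^2 + (L0 L1 + L1 L0)/lam + L1^2/lam^2, so the polar part is
# M = (L0 L1 + L1 L0)/lam + L1^2/lam^2.  The lam^-3 and lam^-2
# coefficients of [M, L] cancel, leaving a simple pole only.
c3 = comm(L1 @ L1, L1)                                   # lam^-3 term
c2 = comm(L1 @ L1, L0) + comm(L0 @ L1 + L1 @ L0, L1)     # lam^-2 term
assert np.allclose(c3, 0) and np.allclose(c2, 0)
```

So the constraint equations of the first type are automatically satisfied by the ansatz (3.15), as the proposition asserts.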
the number and the order of the poles of the Lax matrix amounts to specifying a particular model. Choosing the polynomials $P^{(k)}(L, \lambda)$ amounts to specifying the dynamical flows. The above propositions give the general form of $M(\lambda)$ as far as the matrix structure and the $\lambda$-dependence are concerned. One should keep in mind, however, that the coefficients of the polynomials $P^{(k)}(L, \lambda)$ are a priori functions of the matrix elements of $L$ and require further characterization in order to get an integrable system. In the setting of the next section these coefficients will be constants.

Remark 1. If $\lambda_k$ is a pole of $M(\lambda)$ and not a pole of $L(\lambda)$, one can redefine $M(\lambda)$, without changing the Lax equations of motion, so as to eliminate the singularities of $M(\lambda)$ at $\lambda_k$. Indeed, redefining $M(\lambda) \to M(\lambda) - P^{(k)}(L, \lambda)$ does not change the Lax equation. The new $M(\lambda)$ is regular at $\lambda_k$. Of course we cannot eliminate the poles common to $L(\lambda)$ and $M(\lambda)$ by this procedure.
Remark 2. The Lax equation is invariant under similarity transformations,
\[ L \to L' = g L g^{-1}, \qquad M \to M' = g M g^{-1} + \partial_t g\, g^{-1} \tag{3.17} \]
If this similarity transformation is independent of $\lambda$, it will not spoil the analytic properties of $L(\lambda)$ and $M(\lambda)$. We can use the gauge freedom eq. (3.17) to diagonalize $L_0$,
\[ L_0 = \mathrm{Diag}(a_1, \dots, a_N) \]
Consistency of eq. (3.4) then requires $M_0$ to be diagonal also, and thus $\dot L_0 = [M_0, L_0] = 0$. Hence $M_0$ is a polynomial $P$ of $L_0$, so that replacing $M(\lambda) \to M(\lambda) - P(L(\lambda))$ gets rid of $M_0$.
Remark 3. For Lax matrices $L(\lambda)$ and $M(\lambda)$ rational functions of $\lambda$, we can easily compare the number of variables to the number of equations contained in eq. (3.4). The variables are the matrices $L_0$, $L_{k,r}$ and $M_0$, $M_{k,r}$. A naive counting, assuming that $L(\lambda)$ and $M(\lambda)$ are generic and independent matrices, gives in units of $N^2$:
\[ \text{number of variables} = 2 + \sum_k (n_k + m_k) = 2 + l + m \]
\[ \text{number of equations} = 1 + \sum_k (n_k + m_k) = 1 + l + m \]
where $l$ and $m$ are the total order (degree of the divisor) of the poles of $L(\lambda)$ and $M(\lambda)$ respectively. Therefore there is one more variable than the number of equations, which reflects the gauge invariance eq. (3.17) of the Lax equation. If we assume, however, that $\lambda$ belongs to a Riemann surface of higher genus $g$, the situation is very different. Indeed, suppose that $L(\lambda)$ and $M(\lambda)$ have poles of total multiplicity $l$ and $m$ respectively. Let us count the number of meromorphic functions on which $L(\lambda)$ and $M(\lambda)$ can be expanded with constant matrix coefficients. By the
Riemann–Roch theorem, $L(\lambda)$ can be expanded on a basis of $(l - g + 1)$ independent meromorphic functions (in the generic case), and $M(\lambda)$ on $(m - g + 1)$ functions. So we have
\[ \text{number of variables} = 2 + l + m - 2g \]
Similarly, the commutator $[M(\lambda), L(\lambda)]$ has poles of total multiplicity $l + m$ and can be expanded on a basis of $(l + m - g + 1)$ independent functions. Therefore
\[ \text{number of equations} = 1 + l + m - g \]
So (number of equations $-$ number of variables) $= g - 1$. Taking into account the gauge symmetry of the Lax equation, we see that the number of equations is always greater than the number of unknowns when $g > 0$. This shows that if $\lambda$ belongs to a Riemann surface of genus $g \geq 1$ (and such systems exist, at least for $g = 1$, see Chapter 7), one has to consider a non-generic situation.
Remark 4. As we already mentioned, the dynamical variables are the matrix elements of the $L_{k,r}$. In most cases this is too general, and we may try to impose more restrictions on the matrix elements of the $L_{k,r}$. For instance, assuming that the $\lambda_k$ are all real, one can clearly impose that all the matrices $L_k$ and $M_k$ are anti-Hermitian. Another simple example is provided by the Neumann model, whose Lax matrices (3.3) are such that ${}^t L(-\lambda) = L(\lambda)$, ${}^t M(-\lambda) = -M(\lambda)$. More generally, we may define the action of a reduction group $R$ and impose that $L(\lambda)$ and $M(\lambda)$ are invariant under $R$. This may be done, for example, by demanding that the Lax pair $L(\lambda)$ and $M(\lambda)$ satisfies:
\[ R^{-1} L(r(\lambda))\, R = L(\lambda), \qquad R^{-1} M(r(\lambda))\, R = M(\lambda) \tag{3.18} \]
for some representations $R$ and $r(\lambda)$ of $R$. This type of restriction is always compatible with the Lax equation. It provides a way to lower the number of degrees of freedom. An example of this procedure can be found later in this chapter, see eq. (3.84).
We end this section by illustrating these constructions on the examples of the Euler and Lagrange tops and of the Neumann model. We verify that the matrices $L(\lambda)$ and $M(\lambda)$ are indeed related as in eq. (3.15).

Example 1. Let us consider the Euler top. We see that $L(\lambda)$, eq. (3.1), has a pole at $0$ and $M(\lambda)$ has a pole at $\infty$. Let us apply the above procedure to remove this pole. There exists a polynomial $P(x) = \alpha x^2 + \beta x + \gamma$ such that $P(I^2) = I$. We will need the coefficient $\alpha = -1/I_1 I_2 I_3$. Redefining $M(\lambda)$ to $M(\lambda) - \lambda P(L(\lambda))$ one gets $M = M_0 - (\alpha/\lambda) J^2$ with $M_0 = \Omega - \alpha (I^2 J + J I^2) - \beta J$. One can check that $M_0 = 0$. (Hint: for $i \neq j$ compute $(I_i - I_j)(M_0)_{ij}$ using $P(I_i^2) = I_i$.) Hence for the Euler top we can choose
\[ M(\lambda) = -\frac{\alpha}{\lambda} J^2 \tag{3.19} \]
We see that this new $M(\lambda)$ is such that $M(\lambda) = -\alpha (\lambda L^2)_-$. The Lax matrix of the Euler top $L(\lambda) = I^2 + \lambda^{-1} J$ is of the form $L_0 + L_-$ with
$L_0$ diagonal and non-dynamical. The eigenvalues of $J$ are $(0, i \sqrt{\vec J^{\,2}}, -i \sqrt{\vec J^{\,2}})$, which are non-dynamical since $\vec J^{\,2}$ belongs to the centre of the Poisson bracket and has been fixed to a numerical value.

Example 2. For the Lagrange top, $L(\lambda)$, eq. (3.2), has a pole at $0$ and $M(\lambda)$ has a pole at $\infty$. One can remove this pole by redefining $M(\lambda) \to M(\lambda) - I^{-1} \lambda L(\lambda)$. Notice, however, that since the eigenvalues of $L_0$ are degenerate, one cannot express $M_0$ as a polynomial in $L_0$. The new $M(\lambda)$ can be expressed as $M = M_0 + M_-$ with $M_-(\lambda) = -I^{-1} (\lambda L(\lambda))_-$. For the Lagrange top the Lax matrix is again of the form $L_0 + L_-$, where $L_0$ is obviously non-dynamical. For the singular part one gets, since $J$ is antisymmetric, $A_- = \lambda^{-2}\, \mathrm{Diag}\big( \sqrt{\vec P^{\,2}}, -\sqrt{\vec P^{\,2}}, 0, 0 \big)$, which again belongs to the centre of the Poisson bracket and is non-dynamical.

Example 3. Finally, let us consider the Neumann model. We see from eq. (3.3) that we have $M(\lambda) = (\lambda L(\lambda))_-$, where we project on the singular part at $\lambda = 0$. The Lax matrix of the Neumann model is of the form $L = L_0 + L_-$ with $L_0$ a numerical diagonal matrix, and the singular part $L_-$ at the only pole $\lambda = 0$ is given by $L_- = \lambda^{-1} J - \lambda^{-2} K$. This is a rank $2$ matrix whose image is spanned by the vectors $x$ and $y$. It is easy to diagonalize in this subspace, and one gets the singular diagonal part $A_- = \lambda^{-2}\, \mathrm{Diag}(1, 0, \dots, 0)$. It is again a numerical matrix.
We have seen in these three examples that the singular parts $A^{(k)}_-$ of the matrix $A^{(k)}(\lambda)$ are independent of the dynamical variables. We will show in the next section that, in this case, eq. (3.14) admits an important interpretation as a coadjoint orbit.

3.3 Coadjoint orbits and Hamiltonian formalism

In this section we show that the Zakharov–Shabat construction, when the matrices $A^{(k)}_-$ are non-dynamical, can be interpreted in terms of coadjoint orbits. This introduces a natural symplectic structure into the problem and gives a Hamiltonian interpretation to the Lax equation. This also allows us to compute the Poisson brackets of the matrix elements of the Lax matrix in terms of an r-matrix.

We first recall some notions about adjoint and coadjoint actions of Lie algebras and Lie groups, see Chapter 14. Let $G$ be a connected Lie group with Lie algebra $\mathcal{G}$. The group $G$ acts on $\mathcal{G}$ by the adjoint action, denoted $\mathrm{Ad}$:
\[ X \longrightarrow (\mathrm{Ad}\, g)(X) = g X g^{-1}, \qquad g \in G,\ X \in \mathcal{G} \]
Similarly, the coadjoint action of $G$ on the dual $\mathcal{G}^*$ of the Lie algebra $\mathcal{G}$ (i.e. the vector space of linear forms on the Lie algebra) is defined by:
\[ (\mathrm{Ad}^* g \cdot \Xi)(X) = \Xi\big( \mathrm{Ad}\, g^{-1}(X) \big), \qquad g \in G,\ \Xi \in \mathcal{G}^*,\ X \in \mathcal{G} \]
The infinitesimal versions of these actions provide actions of the Lie algebra $\mathcal{G}$ on $\mathcal{G}$ and $\mathcal{G}^*$, denoted $\mathrm{ad}$ and $\mathrm{ad}^*$ respectively, and given by:
\[ \mathrm{ad}\, X(Y) = [X, Y], \qquad X, Y \in \mathcal{G} \]
\[ \mathrm{ad}^* X \cdot \Xi\,(Y) = -\Xi([X, Y]), \qquad X, Y \in \mathcal{G},\ \Xi \in \mathcal{G}^* \]
To see how these notions relate to our problem, let us first consider a Lax matrix with only one polar singularity at $\lambda = 0$:
\[ L(\lambda) = \big( g(\lambda)\, A_-(\lambda)\, g^{-1}(\lambda) \big)_- \tag{3.20} \]
with $A_-(\lambda) = \sum_{r=-n}^{-1} A_r \lambda^r$, and $g(\lambda)$ has a regular expansion around $\lambda = 0$. Let $G$ be the loop group of invertible matrix-valued power series expansions around $\lambda = 0$. The elements of $G$ are regular series $g(\lambda) = \sum_{r=0}^{\infty} g_r \lambda^r$. The product law is the pointwise product: $(gh)(\lambda) = g(\lambda) h(\lambda)$. Formally, the Lie algebra $\mathcal{G}$ of $G$ consists of elements of the form $X = \sum_{r=0}^{\infty} X_r \lambda^r$. Its Lie bracket is given by the pointwise commutator. The dual $\mathcal{G}^*$ of $\mathcal{G}$ can be identified with the set of polar matrices $\Xi(\lambda) = \sum_{r \geq 1} \Xi_r \lambda^{-r}$, where the sum contains a finite but arbitrarily large number of terms, by the pairing:
\[ \langle \Xi, X \rangle \equiv \operatorname{Tr} \operatorname{Res}_{\lambda=0} \big( \Xi(\lambda) X(\lambda) \big) = \sum_r \operatorname{Tr} \big( \Xi_{r+1} X_r \big) \]
where $\operatorname{Res}_{\lambda=0}$ is defined to be the coefficient of $\lambda^{-1}$. The coadjoint action of $G$ on $\mathcal{G}^*$ is defined by $\big( (\mathrm{Ad}^* g) \cdot \Xi \big)(X) = \Xi(g^{-1} X g)$ for $\Xi \in \mathcal{G}^*$ and any $X \in \mathcal{G}$. Using the above model for $\mathcal{G}^*$, and since $\langle \Xi, g^{-1} X g \rangle = \langle g \Xi g^{-1}, X \rangle = \langle (g \Xi g^{-1})_-, X \rangle$, we get
\[ (\mathrm{Ad}^* g) \cdot \Xi(\lambda) = \big( g \cdot \Xi \cdot g^{-1} \big)_- \]
This is precisely eq. (3.20). The Lax matrix can thus be interpreted as belonging to the coadjoint orbit of the element $A_-(\lambda)$ of $\mathcal{G}^*$ under the loop group $G$. With this interpretation, the Lax equation reads:
\[ \dot L = \mathrm{ad}^* M \cdot L = [M, L] \tag{3.21} \]
This shows that the equation of motion is a flow on the coadjoint orbit.
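The residue pairing and the coadjoint formula $(\mathrm{Ad}^* g) \cdot \Xi = (g\, \Xi\, g^{-1})_-$ can be verified on truncated series. The sketch below uses random illustrative matrices, with the truncation orders chosen so that the check $\langle (\mathrm{Ad}^* g) \cdot \Xi, X \rangle = \langle \Xi, g^{-1} X g \rangle$ is exact:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 2
m = lambda: rng.normal(size=(N, N))

# Sketch: residue pairing <Xi, X> = sum_r Tr(Xi_{r+1} X_r) and the
# coadjoint action (Ad*g).Xi = (g Xi g^-1)_-, on truncated series.
g0, g1 = m(), m()                 # g(lam)  = g0 + g1 lam + ...
Xi1, Xi2 = m(), m()               # Xi(lam) = Xi1/lam + Xi2/lam^2
X0, X1 = m(), m()                 # X(lam)  = X0 + X1 lam

h0 = np.linalg.inv(g0)            # g^-1(lam) = h0 + h1 lam + ...
h1 = -h0 @ g1 @ h0

# polar part of g Xi g^-1: coefficients of lam^-2 and lam^-1
ad2 = g0 @ Xi2 @ h0
ad1 = g0 @ Xi1 @ h0 + g1 @ Xi2 @ h0 + g0 @ Xi2 @ h1

# regular part of g^-1 X g: coefficients of lam^0 and lam^1
c0 = h0 @ X0 @ g0
c1 = h0 @ X0 @ g1 + h0 @ X1 @ g0 + h1 @ X0 @ g0

pair = lambda A1, A2, B0, B1: np.trace(A1 @ B0) + np.trace(A2 @ B1)
assert np.isclose(pair(ad1, ad2, X0, X1), pair(Xi1, Xi2, c0, c1))
```

Dropping the regular part of $g\, \Xi\, g^{-1}$ costs nothing in the pairing, which is why $\mathcal{G}^*$ can be modelled by polar series alone.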
Coadjoint orbits in $\mathcal{G}^*$ are equipped with the canonical Kostant–Kirillov symplectic structure. Choosing two linear functions $h_1(\Xi) = \Xi(X)$ and $h_2(\Xi) = \Xi(Y)$ with $X, Y \in \mathcal{G}$, so that $dh_1 = X$ and $dh_2 = Y$, the Kostant–Kirillov Poisson bracket reads:
\[ \{ \Xi(X), \Xi(Y) \} = \Xi([X, Y]) \]
where the right-hand side is the linear function $\Xi \to \Xi([X, Y])$. This Poisson bracket is very natural, but one has to be aware that it is degenerate. The kernel is the set of $\mathrm{Ad}^*$-invariant functions.

Let us specialize this construction to our case. We identify $\mathcal{G}^*$ with series expansions singular at $\lambda = 0$, using the linear form induced by $\operatorname{Tr} \operatorname{Res}_{\lambda=0}$. We parametrize the orbit of the element $A_-(\lambda)$ by the group element $g(\lambda)$. Consider the 1-form $\alpha$ on the group given by
\[ \alpha = -\operatorname{Tr} \operatorname{Res}_{\lambda=0} \big( A_-\, g^{-1} \delta g \big) \]
The pullback on the group of the Kostant–Kirillov symplectic form reads (see Chapter 14):
\[ \omega = \delta \alpha = \operatorname{Tr} \operatorname{Res}_{\lambda=0} \big( A_-\, g^{-1} \delta g \wedge g^{-1} \delta g \big) \tag{3.22} \]
This interpretation of $L(\lambda)$ as a coadjoint orbit assumes that $A_-(\lambda)$ is not a dynamical variable.

This construction can be extended to the multi-pole case. We consider the direct sum of the loop algebras $\mathcal{G}_k$ around $\lambda = \lambda_k$:
\[ \mathcal{G} \equiv \bigoplus_k \mathcal{G}_k \]
An element of this Lie algebra has the form of a multiplet
\[ X(\lambda) = (X_1(\lambda), X_2(\lambda), \dots) \]
where $X_k(\lambda)$, defined around $\lambda_k$, is of the form $X_k(\lambda) = \sum_{n \geq 0} X_{k,n} (\lambda - \lambda_k)^n$. The Lie bracket is such that $[X_k(\lambda), X_l(\lambda)] = 0$ if $k \neq l$. The group $G$ is the direct product of the groups $G_k$ of regular invertible matrices at $\lambda_k$:
\[ G \equiv (G_1, G_2, \dots) \tag{3.23} \]
The dual $\mathcal{G}^*$ of this Lie algebra consists of multiplets
\[ \Xi = (\Xi_1(\lambda), \Xi_2(\lambda), \dots) \]
where $\Xi_k(\lambda)$ around $\lambda_k$ is of the form $\Xi_k(\lambda) = \sum_{r \geq 1} \Xi_{k,r} (\lambda - \lambda_k)^{-r}$. In this sum the number of terms is finite but arbitrary. The pairing is simply
\[ \langle \Xi, X \rangle \equiv \sum_k \langle \Xi_k, X_k \rangle = \sum_k \operatorname{Tr} \operatorname{Res}_{\lambda_k} \big( \Xi_k(\lambda) X_k(\lambda) \big) \]
The coadjoint action of $G$ on $\mathcal{G}^*$ is given by the usual formula: if $g = (g_1, g_2, \dots) \in G$ and $\Xi = (\Xi_1, \Xi_2, \dots) \in \mathcal{G}^*$,
\[ (\mathrm{Ad}^* g) \cdot \Xi(\lambda) = \big( (g_1 \Xi_1 g_1^{-1})_-,\ (g_2 \Xi_2 g_2^{-1})_-,\ \dots \big) \]
A coadjoint orbit consists of elements $\Xi_k$ with a fixed maximal order of the pole. Then, we can interpret eq. (3.9) as the coadjoint orbit of the element $((A_1)_-, (A_2)_-, \dots)$. Alternatively, we can consider the function on $\mathcal{G}^*$
\[ L(\lambda) = L_0 + \sum_k \Xi_k \tag{3.24} \]
with poles at the points $\lambda_k$. Given this function we can recover the $\Xi_k$ by extracting the polar parts. The constant matrix $L_0$ is added to match the formula for the Lax matrix, eq. (3.9). By choice it is assumed to be invariant under the coadjoint action. The pairing can be rewritten as
\[ \langle L, X \rangle = \sum_k \operatorname{Tr} \operatorname{Res}_{\lambda_k} \big( L(\lambda) X_k(\lambda) \big) \]
Note that only $\Xi_k$ contributes to the residue at $\lambda_k$, and the formula is compatible with the matrix $L_0$ being invariant under the coadjoint action.

It is interesting to compute the dimension of this coadjoint orbit. In the formula $L_k = (g^{(k)} A^{(k)}_- g^{(k)-1})_-$, the matrices $A^{(k)}_-$ characterize the orbit and are non-dynamical. The dynamical variables are the jets of order $(n_k - 1)$ of the $g^{(k)}$, which gives $N^2 n_k$ parameters. But $L_k$ is invariant under $g^{(k)} \to g^{(k)} d^{(k)}$ with $d^{(k)}$ a jet of diagonal matrices of the same order. Hence the dimension of the $L_k$ orbit is $(N^2 - N) n_k$, and the dimension of the orbit is the even number:
\[ \dim \mathcal{M} = (N^2 - N) \sum_k n_k \]
In the multi-pole case, the pullback of the symplectic form reads:
\[ \omega = \sum_k \operatorname{Tr} \operatorname{Res}_{\lambda_k} \big( A^{(k)}_-\, g^{(k)-1} \delta g^{(k)} \wedge g^{(k)-1} \delta g^{(k)} \big) \tag{3.25} \]
We can now use this symplectic form to evaluate the Poisson brackets of the elements of the Lax matrix. To write them we use the tensor notation of section (2.5) in Chapter 2 and show that they take the r-matrix form. We assume that each $L_k(\lambda)$ is a generic element of an orbit of the loop group $GL(N)[\lambda]$, that $L_0$ and the $A^{(k)}_-$ are non-dynamical, and that the symplectic form is given by eq. (3.25).
We can now use this symplectic form to evaluate the Poisson brackets of the elements of the Lax matrix. To write them we use the tensor notation of section (2.5) in Chapter 2 and show that they take the r-matrix form. We assume that each Lk (λ) is a generic element of an orbit of the (k) loop group GL(N )[λ], that L0 and the A− are non-dynamical, and the symplectic form is given by eq. (3.25).
3.3 Coadjoint orbits and Hamiltonian formalism
45
Proposition. With the symplectic structure eq. (3.25), the Poisson brackets of the matrix elements of L(λ) can be written as: C12 {L1 (λ), L2 (µ)} = − (3.26) , L1 (λ) + L2 (µ) λ−µ with C12 = i,j Eij ⊗ Eji , where the Eij are the canonical basis matrices. The commutator in the right-hand side of eq. (3.26) is the usual matrix commutator. Proof. Let us first assume that we have only one-pole and L = (gA− g −1 )− . Because we are dealing with a Kostant–Kirillov bracket for the loop algebra of gl(N ), we can immediately write the Poisson bracket of the Lax matrix using the defining relation {L(X), L(Y )} = L([X, Y ]). Using L(X) = Tr Resλ=0 (L(λ)X(λ)), this gives: {L(X), L(Y )} = Tr Resλ=0 (L(λ)[X(λ), Y (λ)])
(3.27)
By definition of the notation {L1 , L2 }, we have: {L(X), L(Y )} = {L1 (λ), L2 (µ)} , X(λ) ⊗ Y (µ) where , = Tr12 Resλ Resµ . We need to factorize X(λ)⊗Y (µ) in eq. (3.27). To this end, we introduce a Casimir operator C12 = Eα ⊗ Eα∗ ∈ G ⊗ G ∗ α
where Eα and Eα∗ are two dual bases of G and G ∗ respectively. We choose n Eij = λn Eij ,
∗n Eij = λ−n−1 Eji ,
n≥0
m = δ δ δ so that under the pairing Tr Res we have E ∗ nij , Ekl ik jl nm . The Casimir operator is such that for Y ∈ G, we have Y1 = C12 (Y2 ). Then we want to write L, [X, Y ] = [C12 , L1 ], X ⊗ Y , however, this formula n , L]⊗E ∗n and does not make sense as it stands because [C12 , L1 ] = α [Eij ij n ∈ G while L ∈ G ∗ and the commutator is not defined. To overcome Eij this problem we embed G and its dual G ∗ into the full loop algebra G˜ n , n ∈ Z. generated by Eij G˜ = G + G ∗ (3.28)
Note that in this sum, G and G ∗ do not commute. Let us compute C12 , assuming |λ| < |µ|: ∞ C12 λn , =− C12 (λ, µ) = C12 n+1 µ λ−µ n=0
C12 =
i,j
Eij ⊗ Eji
46
3 Synopsis of integrable systems
We can now write L(λ)[X(λ), Y (λ)] = [C12 (λ, µ), L(λ) ⊗ 1] , X(λ) ⊗ Y (µ) . Consider the rational function of λ: ϕ(λ) = {L1 (λ), L2 (µ)} − [C12 (λ, µ), L(λ) ⊗ 1]. By inspection ϕ contains only negative powers of µ, and we have ϕ, X(λ) ⊗ Y (µ) = 0. Hence ϕ contains only positive powers of λ and is regular at λ = 0. It has a pole at λ = µ, due to the form of C(λ, µ). We remove this pole by subtracting to ϕ the quantity [C12 (λ, µ), 1 ⊗ L(µ)] which contains only positive powers of λ and is therefore in the kernel of ·, X(λ) ⊗ Y (µ) . The pole at λ = µ disappears since [C12 , L(µ) ⊗ 1 + 1 ⊗ L(µ)] = 0. The redefined ϕ is regular everywhere and vanishes for λ → ∞, hence vanishes identically. This proves eq. (3.26) in the one-pole case. We can now study the multi-pole situation occuring in eq. (3.9). Consider L = L0 + k=1 Lk . Each Lk lives in a coadjoint orbit as above equipped with its own symplectic structure. From eq. (3.25) they have vanishing mutual Poisson brackets {Lj1 , Lk2 } = 0 for j, k = 0, . . . , N and j = k. We assume further that L0 does not contain dynamical variables {L01 , L02 } = 0,
{L01 , Lk2 } = 0
(here the indices 1 and 2 refer to the tensorial notation). Then since C12 /(λ − µ) is independent of the pole λk , it is obvious that the r-matrix relations for each orbit combine by addition to give eq. (3.26) for the complete Lax matrix L(λ).
Remark. The quantity C12 =
Eij ⊗ Eji
(3.29)
i,j
often occurs when calculating r-matrices. It is called the tensor Casimir of gl(N ). Its main properties are [C12 , g ⊗ g] = 0,
Tr2 C12 g2 = g1 ,
∀g ∈ GL(N )
(3.30)
This proposition shows that the generic Zakharov–Shabat system, equipped with this symplectic structure, is an integrable Hamiltonian system (the precise counting of independent conserved quantities will be done in Chapter 5). It also gives us a very simple formula for the r-matrix specifying the Poisson bracket of L(λ): r12 (λ, µ) = −r21 (µ, λ) = −
C12 (λ − µ)
(3.31)
47
3.3 Coadjoint orbits and Hamiltonian formalism
The Jacobi identity is satisfied because this r-matrix verifies the classical Yang–Baxter equation (see eq. (2.12) in Chapter 2): [r12 , r13 ] + [r12 , r23 ] + [r13 , r23 ] = 0 where rij stands for rij (λi , λj ). Note that r12 is antisymmetric: r12 (λ1 , λ2 ) = −r21 (λ2 , λ1 ). As in Chapter 2, these Poisson brackets for the Lax matrix ensure that one can define commuting quantities. The associated equations of motion take the Lax form. Proposition. The functions on phase space:
H (n) (λ) ≡ Tr Ln (λ) are in involution. The equations of motion associated with H (n) (µ) can be written in the Lax form with M = k Mk : n−1 L (λ) Mk (λ) = −n (3.32) λ−µ k Proof. The quantities H (n) (λ) are in involution because {Tr Ln (λ), Tr Lm (µ)} = nmTr12 {L1 (λ), L2 (µ)}Ln−1 (λ)Lm−1 (µ) 1 2 nm n−1 =− (µ) + [C12 , Lm (λ)) = 0 Tr12 ([C12 , Ln1 (λ)]Lm−1 2 (µ)]L1 2 λ−µ where we have used that the trace of a commutator vanishes. Similarly, we have: C12 n−1 (n) ˙ (µ), L1 (λ) L(λ) = {H (µ), L(λ)} = nTr2 L λ−µ 2 Performing the trace and remembering that Tr2 (C12 M2 ) = M1 , we get ˙ L(λ) = [M (n) (λ, µ), L(λ)],
M (n) (λ, µ) = n
Ln−1 (µ) λ−µ
(3.33)
This M (n) (λ, µ) has a pole at λ = µ and is otherwise regular. According to the general procedure we can remove this pole by subtracting some polynomial in L(λ) without changing the equations of motion. Obviously one can redefine: M (n) (λ, µ) → M (n) (λ, µ) − n
Ln−1 (λ) Ln−1 (λ) − Ln−1 (µ) = −n λ−µ λ−µ
48
3 Synopsis of integrable systems
This new M has poles at all λk andis regular at λ = µ. Decomposing it into its polar parts, we write M = k Mk with n−1 L (λ) Mk (λ) = −n λ−µ k This is of the form eq. (3.15) with P (k) (L, λ) = −
n Ln−1 (λ) λ−µ
(3.34)
Notice that the coefficients of the polynomial P (k) (L, λ) are pure numerical constants. Example 1. In the case of the Euler top L(λ) = I 2 + λ1 J. The singular part satisfies t L− (−λ) = L− (λ). This is not preserved by a general coadjoint action L− (λ) = (g(λ)L− (λ)g −1 (λ))− . To overcome this problem we consider the subgroup of matrices satisfying t g −1 (−λ) = g(λ) which may be called graded orthogonal. Its Lie algebra consists of matrices X(λ) such that t X(−λ) = −X(λ). Its dual under the pairing Tr Res consists of matrices L(λ) such that t L(−λ) = L(λ), and having an expansion in a finite sum of strictly negative powers of λ. The matrix L− (λ) is an orbit under this coadjoint action. The symplectic structure on this orbit is obtained by applying eq. (3.27):
1 {Jij , Jkl } = − δjk Jil − δjl Jik + δil Jjk − δik Jjl (3.35) 2 The computation of the r-matrix of the Euler top is similar to the proof of eq. (3.26), except that the loop algebra being different, the Casimir operator has to be recomputed. We keep the factor −1/2 in eq. (3.35) in order to match the general considerations. Of course if this factor is omitted we must multiply the final r-matrix by −2. A basis Eα of the graded loop algebra is given for n = 0, 1, . . . by: (Eij − Eji )λ2n , i < j,
(Eij + Eji )λ2n+1 , i < j,
Eii λ2n+1
The dual basis Eα∗ under Tr Res is given respectively for n = 0, 1, . . . by: 1 1 − (Eij − Eji )λ−2n−1 , i < j, (Eij + Eji )λ−2n−2 , i < j, Eii λ−2n−2 2 2 Then one gets for C12 = α Eα ⊗ Eα∗ : 1 1 1 1 C12 (λ, µ) = − Eij ⊗ Eji − Eij ⊗ Eij 2λ−µ 2λ+µ ij
ij
3.4 Elementary flows and wave function
49
This implies that poles at λ = ±µ appear in ϕ(λ) = {L1 (λ), L2 (µ)} − [C12 (λ, µ), L(λ) ⊗ 1]. To cancel these poles we now subtract [C21 (µ, λ), 1 ⊗ L(µ)] which only contains positive powers of λ. Indeed, the residue at λ = µ is [C12 , L1 (µ) + L2 (µ)], with C12 = ij Eij ⊗ Eji , and therefore vanishes as previously. The residue at λ = −µ reads [D12 , L1 (−µ)−L2 (µ)] with D12 = ij Eij ⊗ Eij . Now L(−µ) = t L(µ) and one checks that for any matrix A the commutator [D12 , A1 − t A2 ] vanishes. Hence ϕ(λ) − [C21 (µ, λ), 1 ⊗ L(µ)] = 0. Strictly speaking, one should have done this calculation on the polar part L− (λ) and added the L0 part afterwards. However, since the full Lax matrix satisfies t L(−λ) = L(λ), the reasoning is actually valid for the full Lax matrix. Finally one gets: {L1 (λ) , L2 (µ)} = [r12 (λ, µ), L1 (λ)] − [r21 (µ, λ), L2 (µ)] with r12 (λ, µ) = C12 (λ, µ). Note that this is a two-poles r-matrix with poles at λ = ±µ. Example 2. We consider next the Neumann model. Recall that the Lax matrix reads L(λ) = L0 + λ1 J − λ12 K. As in the Euler top it satisfies t L(−λ) = L(λ). Hence we are dealing with the graded orthogonal group t g −1 (−λ) = g(λ). Let us check that the matrix L (λ) is an orbit under − the coadjoint action. As a matter of fact: (g(λ)L− (λ)g −1 (λ))− = −
1 1 g0 K t g 0 + (g0 J t g 0 − g1 K t g 0 + g0 K t g 1 ) 2 λ λ
with g(λ) = g0 +λg1 +. . .. Recalling that K = X t X and J = X t Y −Y t X, we see that this is exactly of the same form as L− (λ) with X → g0 X,
Y → g0 Y + g1 X
One can check that the Kostant–Kirillov bracket on this orbit reproduces the canonical Poisson bracket on the variables X and Y . It follows that the Neumann model has the same r-matrix as the Euler top. 3.4 Elementary flows and wave function We have found that the time evolution of a Zakharov–Shabat system is given by matrices M (λ) of the form eq. (3.10). This leaves an infinite number of choices for M (λ). We introduce an infinite number of elementary times corresponding to these choices. We will show that these flows are pairwise commuting. This defines a so-called integrable hierarchy. (k) The elementary flows correspond to diagonal matrices B− having a single pole of order n at λk in matrix diagonal entry α. Here, and in the
50
3 Synopsis of integrable systems
following, we use the multi-index i = (k, n, α). We thus define matrices Mi by:
Mi ≡ g (k) ξi g (k)−1 ,
ξi ≡ ξ(k,n,α) =
k
1 Eαα (λ − λk )n
(3.36)
We call ti = t(k,n,α) the time variable associated with Mi through the Lax equation: (3.37) ∂ti L = [Mi , L] A general flow is a linear combination of these elementary ones. Note that Mi (λ), which is a priori defined around λk , is a rational fonction of λ, with only a polar part at λk , hence is defined in the whole λ-plane. The Lax equation, eq. (3.37), has a meaning in the whole λ-plane and defines the time evolution of the quantities locally defined at λk , such as g (k ) , with respect to the times associated with λk . We will need some notations. Let: ξ(k,n,α) t(k,n,α) (3.38) ξ (k) (λ, t) = n,α
ξ(λ, t) =
k
ξ (k) (λ, t) =
ξi ti
(3.39)
i=(k,n,α)
These are generating functions with coefficients rational in λ. The function ξ (k) (λ, t) involves all the times above the singularity λk , while ξ(λ, t) involves all the times of the hierarchy. It is easy to find the Hamiltonians generating these elementary flows. Proposition. The Hamiltonian generating the flow ti is A(k) (λ)Eαα dλ Tr(A(k) (λ)Eαα ) Hi = = Tr Resλk (λ − λk )n (λ − λk )n Γ(k) 2iπ (k)
where A(k) (λ) is the diagonal form of L(λ), with singular part A− (λ), and Γ(k) a small contour around λk . Proof. Let us introduce the differential dHi (L) defined by δHi = δL, dHi for any variation δL of the Lax matrix. To compute it we start from eq. (3.8), written as A(k) = g (k)−1 Lg (k) , so that δA(k) = g (k)−1 δLg (k) + [A(k) , g (k)−1 δg (k) ]. Hence δTr(A(k) (λ)Eαα ) = Tr(δL g (k) Eαα g (k)−1 ), where we used that [Eαα , A(k) ] = 0. We get the formula: dHi (L) = (λ − λk )−n g (k) Eαα g (k)−1
(3.40)
3.4 Elementary flows and wave function
51
Next, ∂ti L(µ) = {Hi , L(µ)}, so we have:
∂ti L(µ) = Tr1 Resλ=λk dHi (λ) ⊗ 1[r12 (λ, µ), L1 (λ) + L2 (µ)] = Tr1 Resλ=λk (r12 (λ, µ)[L(λ), dHi (λ)]1 ) + Tr1 Resλ=λk (r12 (λ, µ)dHi (λ) ⊗ 1), L2 (µ) with r12 (λ, µ) given by eq. (3.31). In the first term, [L(λ), dHi (λ)] is proportional to g (k) [A(k) , Eαα ]g (k)−1 and vanishes since A(k) and Eαα are both diagonal. In the second term we expand r12 (λ, µ) in positive powers of (λ − λk )/(µ − λk ) to get something polar in (µ − λk ): ∞ (λ − λk )m Resλ=λk dHi (λ) Tr1 Resλ=λk (r12 (λ, µ)dHi (λ) ⊗ 1) = (µ − λk )m+1 m=0 Eαα (k) −1 g = Mi (µ) = (dHi (µ))k = g (k) (µ − λk )n k m In the last step we used that for any function f (λ) = +∞ m=−∞ λ fm , one has the identity ∞
Res(λm f (λ))µ−m−1 =
m=0
∞
f−m−1 µ−m−1 = (f (µ))−
m=0
Comparing eq. (3.36) and eq. (3.40) we find the useful relation: dHi (L) = g (k) ξi g (k)−1 ,
Mi = (dHi )−
(3.41)
We now verify directly that the flows ∂ti defined by eqs. (3.37) and (3.36) all commute. This amounts to showing that [L, ∂ti Mj −∂tj Mi −[Mi , Mj ]] = 0. We get even the stronger result: Proposition. The matrices Mi defining the time evolution ti satisfy the zero curvature condition ∂ti Mj − ∂tj Mi − [Mi , Mj ] = 0
(3.42)
As a consequence, the flows defined by eqs. (3.36, 3.37) are all commuting. Proof. Let i = (k, n, α) and j = (k , n , α ). Diagonalizing L = g (k ) A(k ) g (k )−1 around λk , the Lax equation gives
(k )
∂ti g (k ) = Mi g (k ) + g (k ) di
(3.43)
52
3 Synopsis of integrable systems (k )
where di is an unknown diagonal matrix. This equation holds true in (k) a vicinity of λk . If i = (k, n, α) and k = k, this implies that di = −ξi + regular, because ∂ti g (k) is regular and Mi = (g (k) ξi g (k)−1 )− , while (k ) if k = k, we only conclude that di is regular around k . Note that g (k) is known only up to a right multiplication by a regular diagonal matrix (k ) (k ) g (k) → g (k) d(k) and this changes di → di − d(k )−1 ∂ti d(k ) . From eqs. (3.36) and (3.43), we get (k)
∂tj Mi = [Mj , g (k) ξi g (k)−1 ]k + (g (k) [dj , ξi ]g (k)−1 )k Since the commutator in the second term involves diagonal matrices, it vanishes. Hence we have ∂tj Mi = [Mj , g (k) ξi g (k)−1 ]k . Let us assume first that λk = λk . Then Mj is regular at λk and only (g (k) ξi g (k)−1 )k = Mi contributes to the polar part of the above commutator, yielding ∂tj Mi = [Mj , Mi ]k Similarly we have ∂ti Mj = [Mi , Mj ]k The zero curvature condition follows because [Mi , Mj ] is a rational function with poles only at λk and λk and vanishes at infinity, so that [Mi , Mj ] = [Mi , Mj ]k + [Mi , Mj ]k Assume next that λk = λk . We still have ∂tj Mi = [Mj , g (k) ξi g (k)−1 ]k ,
∂ti Mj = [Mi , g (k) ξj g (k)−1 ]k
where now all projections are at λk . But Mi − g (k) ξi g (k)−1 = O(1) and Mj −g (k) ξj g (k)−1 = O(1) so that [Mi −g (k) ξi g (k)−1 , Mj −g (k) ξj g (k)−1 ]k = 0, or [Mi , Mj ] − [g (k) ξi g (k)−1 , Mj ]k − [Mi , g (k) ξj g (k)−1 ]k + [g (k) ξi g (k)−1 , g (k) ξj g (k)−1 ]k = 0 The last term vanishes because [ξi , ξj ] = 0. From this the zero curvature condition readily follows. When parametrizing L(λ) as in eq. (3.9), the dynamical variables are the g (k) (λ), modulo gauge transformations consisting of right multiplication by regular diagonal matrices. Let us write the equations of motion on the variables g (k) (λ).
53
3.4 Elementary flows and wave function
Proposition. There exists a gauge choice such that the equations of motion read, for each i = (k, n, α):
∂ti g (k ) = Mi g (k ) − g (k ) ∂ti ξ (k ) (λ, t)δkk (k )
(3.44)
= g (k )−1 (∂ti − Mi )g (k ) is Proof. Equation (3.43) means that di the gauge transform of Mi by g (k ) . The zero curvature equation being (k ) (k ) invariant under gauge transformation implies that ∂ti dj − ∂tj di − (k )
[di
(k )
, dj
] = 0 for any indices (i, j, k ). Since the matrices di
(k)
are diag(k )
= onal, the commutator vanishes, and the condition implies that di ∂ti h(k ) for some diagonal matrix h(k ) . We can now use the freedom g (k ) → g (k ) d(k ) to suppress the regular part of h(k ) around λk by (k ) choosing d(k ) = exp(h+ ). This is a choice of gauge. We already no(k )
(k )
around λk is fixed, di = 0 if ticed that the singular part of di (k ) (k ) = −ξi if k = k , this determines di completely: k = k and di ) (k) (k di = −∂ti ξ (λ, t)δkk . The set of Lax equations, eq. (3.37), for all the times ti is what is called an integrable hierarchy. Written on the variables g (k) , eq. (3.44) reads in detail, when i = (k , n, α) and k = k:
∂ti g (k) = g (k) ξi g (k)−1 g (k) − g (k) ∂ti ξ (k) = − g (k) ξi g (k)−1 g (k) (3.45) −
and when i =
(k , n, α)
and
+
k
= k:
∂ti g (k) = Mi g (k)
(3.46)
In this equation, Mi , which is a rational function of λ with only one-pole at λk , is regular around λk . The zero curvature condition also allows to introduce the “wave function”: Definition. The wave function Ψ(λ; t1 , t2 , . . .), is a matrix function depending on all the times simultaneously, and satisfying ∂ti Ψ = Mi Ψ,
Ψ(λ, t)|t=0 = 1
(3.47)
Locally around each λk we have Ψ(λ, t) = g (k) (λ, t)eξ
(k) (λ,t)
g (k)−1 (λ, 0)
(3.48)
The compatibility conditions of eqs. (3.47) are precisely the zero curvature (k) equations, eqs. (3.42). Equation (3.48) follows because g (k) (λ, t)eξ (λ,t) (k)−1 is easily seen to satisfy eq. (3.47). Multiplying on the right by g (λ, 0) enforces the initial condition Ψ(λ, t)|t=0 = 1.
54
3 Synopsis of integrable systems 3.5 Factorization problem
We now show that solving the hierarchy amounts to solving a factorization problem in a loop group. This is in fact solving a Riemann–Hilbert factorization problem. These two aspects, group theory and analytic properties, will be fundamental in Chapters 4 and 5. At the end of the section we make contact with the wave function. In the construction orbits, we introduced a loop algebra of coadjoint n at λ = 0. Its dual space G ∗ was G of elements X = n≥0 Xn λ regular identified with the set of elements Ξ = n<0 Ξn λn , regular at λ = ∞. It so happens that G ∗ is itself a loop algebra, and in the computation of the ˜ r-matrix we had to embed G and G ∗ into a single larger loop algebra G, see eq. (3.28): G˜ = G + G ∗ To adapt the notations to a more algebraic setting, in this section we ˜ and by G− the subalgebra G ∗ . Any denote by G+ the subalgebra G of G, ˜ ˜ element X ∈ G can be decomposed uniquely as ˜ = X+ − X− , X
X± ∈ G±
At the group level this corresponds, formally, to decomposing an element ˜ = exp(G) ˜ as g˜ ∈ G −1 g˜ = g− g+ ,
g± ∈ exp(G± )
To give a meaning to this formal factorization we interpret it as a Riemann–Hilbert problem. To do this, we introduce a small contour Γ ˜ ∈ G˜ is then viewed as a matrix valued around λ = 0. An element X ˜ function X(λ) defined on Γ. An element X+ ∈ G+ is an invertible matrix valued function X+ (λ) on Γ which can be analytically extended inside Γ. An element X− ∈ G− is a matrix valued function X− (λ) on Γ which can ˜ is an invertbe analytically extended outside Γ. An element g˜(λ) ∈ exp(G) ible matrix valued function on Γ. The Riemann–Hilbert problem consists of factorizing g˜(λ) as a product of two matrices g± (λ) analytic inside and outside the contour respectively. The existence of such a decomposition is ensured by the following: Theorem. Let Γ be the closed contour |λ| = 1 in the λ-plane and g˜(λ) a matrix defined on Γ. There exist two matrices g± (λ), with g+ (λ) analytic inside Γ and g− (λ) analytic outside Γ such that det g± = 0 in their respective domain of definition, and: −1 (λ)Λ(λ)g+ (λ) g˜(λ) = g−
(3.49)
3.5 Factorization problem
55
Here Λ(λ) is a diagonal matrix with entries of the form λki . The ki are integers, uniquely determined up to order by g˜(λ), and called the indices. This solution to the factorization problem is unique if we require g− (λ)|λ=∞ = 1. This theorem, which amounts to a classification of holomorphic vector bundles on the Riemann sphere, has a long history and has been proved by D. Hilbert, G. Birkhoff, A. Grothendieck and many others. For a sketch of the proof, see Chapter 15. Remark 1. To understand the occurence of indices, consider the scalar case, and assume that g˜ is analytic in a ring surrounding Γ. The factorization problem is easily solved taking logarithms, but one has to be careful about the multivaluedness of log g˜. Suppose that it jumps by 2ikπ when λ describes Γ. Then the function log g˜(λ)−k log(λ) is analytic and monovalued in the ring, hence can be expanded in a Laurent series: log g˜(λ) = k log(λ) + X+ (λ) − X− (λ) where X+ (λ) is the series of positive powers of λ in this Laurent expansion, and converges in the disc |λ| ≤ 1, while X− (λ) is the series of strictly negative powers of λ, which converges for |λ| ≥ 1. Then g± = exp X± are uniquely determined and non-vanishing. The integer k is the index. In the case where g˜ is a priori given as an exponential of an element of G˜ the index naturally vanishes.
We treat first the case of one-pole which we assume to be located at λk = 0. We call g+ the element g (k) around λk = 0 appearing in equations of the hierarchy, eq. (3.45). Since we have only one-pole, the full set of equations reads, with i = (n, α), −1 ∂ti g+ = − g+ ξi g+ g (3.50) + + Proposition. For small enough time, the solution of the system eq. (3.50) is obtained by solving the factorization problem −1 g− (λ, t)g+ (λ, t) = eξ(λ,t) g+ (λ, 0)e−ξ(λ,t)
(3.51)
Note that in the right-hand side, the time dependence is explicit. Proof. We want to show that g+ (λ, t), defined through the factorization problem eq. (3.51), satisfies eq. (3.50). Taking the time derivative of eq. (3.51) with respect to tj , and multiplying on the left by g− ≡ g− (λ, t) −1 −1 ≡ g+ (λ, t), we get with, ξj = ∂tj ξ: and on the right by g+ −1 −1 −1 −1 + ∂tj g+ g+ = g− ξj g− − g+ ξj g+ −∂tj g− g−
56
3 Synopsis of integrable systems Identifying the + and − parts, we have: −1 −1 −1 −1 −1 = − g+ ξj g+ , ∂ g g = −g ξ g + g ξ g ∂tj g+ g+ t − − j + j − − + j + −
The first equation is just eq. (3.50). This is a remarkable result, as it shows that the solution of the integrable hierarchy is reduced to solving a factorization problem in group theory. This will be promoted to an abstract algebraic setting for integrable systems in Chapter 4. Remark 2. In eq. (3.51) we assumed that there are no indices. This is certainly true for small enough times because in the equivalent formulation eq. (3.52), the righthand side is close to the identity for t small. As times get larger, indices may jump and may cause singularities in the solution. In Chapter 5 we will show that, indeed, poles appears at finite complex time.
The factorization problem, eq. (3.51), can be stated in a slightly different but equivalent way. Since ξ(λ, t) contains only a polar part, −1 (λ, t) ξ(λ, t) ∈ G− and it can be reabsorbed in g− (λ, t). Let us define θ− and θ+ (λ, t) by −1 −1 θ− (λ, t) = e−ξ(λ,t) g− (λ, t),
−1 θ+ (λ, t) = g+ (λ, t)g+ (λ, 0)
Equation (3.51) now takes the form −1 −1 (λ, t)θ+ (λ, t) = g+ (λ, 0)e−ξ(λ,t) g+ (λ, 0) θ−
(3.52)
Since g+ (λ, 0) is the matrix which diagonalizes the Lax matrix L(λ, 0) = −1 (λ, 0), we can formulate the factorization problem in the g+ (λ, 0)A(λ)g+ form −1 (λ, t) θ+ (λ, t) = e− i dHi (L(λ,0)) ti θ− where dHi is the differential of the Hamiltonian generating the flow ti given in eq. (3.40). It is this last formulation which lends itself to an easy generalization to the multi-pole case. We introduce small contours Γ(k) around each pole λk . As for the one-pole case, we define the group G+ as the set of invertible matrices analytic inside all the contours Γ(k) , and the group G− as the set of invertible matrices analytic outside these contours, and normalized to the identity at λ = ∞. It is important to understand that the group G+ and the group G, the direct product of the groups Gk appearing in eq. (3.23), may be identified. Indeed, an element of G+ is known if we give its restrictions, g (k) , to the interiors of the Γ(k) . Hence an element of G+ can be viewed as a multiplet of elements g (k) analytic inside Γ(k) . This is a group homomorphism. The Lie algebra of G+ is then identified
57
3.5 Factorization problem
with G. Similarly, the Lie algebra of G− can be identified with G ∗ , as a vector space, using the non-degenerate pairing: dλ dλ (k) X− , X+ = Tr(X− X+ ) = Tr(X− X+ ) Γ 2iπ Γk 2iπ k
In the multi-pole case,the Riemann–Hilbert problem can be formulated ˜ of as follows. As in the one-pole case, we introduce the loop group G ˜ is a collection of elements matrices defined on Γ. An element g˜(λ) ∈ G g˜(k) (λ) given on the contours Γ(k) . The Riemann–Hilbert problem now consists of factorizing elements g˜(λ) as follows: −1 (λ)g+ (λ) = g˜(k) (λ), g−
λ ∈ Γ(k)
(3.53)
where g+ (λ) is analytic inside all the Γ(k) and g− (λ) is analytic outside all the Γ(k) . As in eq. (3.49), non-trivial indices may occur, but this does not happen for g˜(k) (λ) close enough to the identity. This problem is solved by reducing it recursively to Riemann–Hilbert (1) be the solution problems with only one contour. Indeed, let h−1 − h+ = g for the contour Γ(1) with h− (∞) = 1. We seek the complete solution in the form g = f h− outside Γ(1) and g = f h+ inside Γ(1) . On Γ(1) we get f−−1 f+ = 1 while on Γ(k) for k ≥ 2 we get f−−1 f+ = h− g˜(k) h−1 − , so that f is obtained by solving a modified Riemann–Hilbert problem on the contours Γ(2) and so on. Proposition. The solution of the hierarchy equations eq. (3.37) is obtained by solving the multi-pole Riemann–Hilbert problem −1 (λ, t)θ+ (λ, t) = e− θ−
i
dHi (L(λ,0)) ti
(3.54)
In the right-hand side the sum extends over all times of the hierarchy. The Lax matrix at time t is reconstructed from the initial condition L(λ, 0) by −1 −1 (λ, t) = θ− (λ, t)L(λ, 0)θ− (λ, t) L(λ, t) = θ+ (λ, t)L(λ, 0)θ+
(3.55)
Proof. We first have to check that eq. (3.55) is consistent. The formula with θ+ gives a definition of L(λ, t) with λ inside the Γ(k) , while the formula with θ− gives a definition of L(λ, t) with λ outside the Γ(k) . They −1 coincide for λ ∈ Γ(k) since by the factorization problem θ− (λ, t)θ+ (λ, t) commutes with L(λ, 0). Since θ± are regular inside their respective domain of definition, the singularities of L(λ, t) are the same as the singularities of L(λ, 0). Taking the derivative of eq. (3.54) with respect to ti , we get for λ ∈ Γ(k) −1 −1 ∂ti θ+ (t)θ+ (t) − ∂ti θ− (t)θ− (t)
= −θ− (t) dHi (L(0))e−
i
dHi (L(0)) ti
−1 θ+ (t)
−1 (t) = −dHi (L(t)) = −θ− (t)dHi (L(0)) θ−
58
3 Synopsis of integrable systems
Hence we need to decompose dHi (L(t)) ∈ G˜ into its + and − parts. The solution of this factorization problem is (see eq. (3.41)) −1 (t) = Mi , ∂ti θ− (t)θ−
−1 ∂ti θ+ (t)θ+ (t) = −dHi (L) + Mi
Indeed Mi is analytic everywhere outside λk , while −dHi (L) − Mi is analytic inside all the Γ(l) . Since the decomposition is unique, this is the solution. Using the expression of L in terms of θ− (t) we get −1 , L] = [Mi , L] ∂ti L = [∂ti θ− θ−
With the solution θ± of the factorization problem, we can reconstruct the wave function Ψ defined in eq. (3.47). Recall that around λk the wave function admits the expansion eq. (3.48). The matrix g (k) (λ, t) in this formula diagonalizes L(λ, t) and, in view of eq. (3.55), it is related to θ+ (λ, t) by g (k) (λ, t) = θ+ (λ, t)g (k) (λ, 0)d(k) (λ, t), where d(k) (λ, t) is a diagonal matrix regular inside Γ(k) . Note that d(k) (λ, t)|t=0 is the identity matrix. To find it for other values of t, we write its time evolution: ∂tj ξ (k) (λ, t) = g (k)−1 (λ, t) dHj (L(λ, t)) g (k) (λ, t) − d(k)−1 ∂tj d(k) By eq. (3.40), we have g (k)−1 (λ, t) dHj (L(λ, t)) g (k) (λ, t) = ξj (λ). The time evolution of d(k) is therefore dictated by the simple equation d(k)−1 ∂tj d(k) = −∂tj ξ (k) (λ, t) + ξj (λ) whose solution is d(k) (λ, t) = eξ(λ,t)−ξ
(k) (λ,t)
Inserting into eq. (3.48), Ψ(λ, t) = θ+ (λ, t)g (k) (λ, 0)eξ(λ,t) g (k)−1 (λ, 0), or:
Ψ(λ, t) = θ+ (λ, t)e
i
dHi (L(λ,0))ti
(3.56)
This provides a formula for Ψ(λ, t) inside the contours Γ(k) . Outside these contours we simply have Ψ(λ, t) = θ− (λ, t)
(3.57)
because, by the factorization problem, the two expressions coincide on the Γ(k) . Hence, we have found a global expression for the wave function. For completeness, we write the relation between the wave function and the Lax matrix L(λ, t)Ψ(λ, t) = Ψ(λ, t)L(λ, 0)
3.6 Tau-functions
59
The wave function is analytic in λ except at the λk where it has essential singularities. This could be expected from Poincar´e’s theorem on differential equations, applied to the system eq. (3.47). The time dependence of the essential singularities is also extremely simple: they are exponentials with linear time dependence, see eq. (3.56). This fact will be important in Chapter 5 where the function Ψ(λ, t) will be reconstructed from its analytical properties. 3.6 Tau-functions We now introduce the important concept of tau-functions. The elementary flows induce a coupling between the λ dependence and the corresponding time dependence, which is best expressed in terms of tau-functions. The existence of tau-functions relies on the following main observation. Proposition. Let d = i dti ∂ti . In the gauge of eq. (3.44), the 1-form Tr Resλk (g (k)−1 ∂λ g (k) dξ (k) ) (3.58) Υ≡− k
is closed, dΥ = 0. As a consequence we can define a function τ (t1 , t2 , . . .) of the infinite series of times t(k,n,α) for all k, n, α such that Υ = d log τ or: Tr Resλk (g (k)−1 ∂λ g (k) ξi ) (3.59) ∂ti log τ = − k
Proof. The evolution equations eq. (3.44)can be written as dg (k) = Mg (k) − g (k) dξ (k) , where we have set M = i Mi dti . Hence by differentiating eq. (3.58), we have Tr Resλ=λk (d∂λ ξ (k) ∧ dξ (k) − ∂λ M ∧ g (k) dξ (k) g (k)−1 ) dΥ = k
The first term vanishes, because the order of the pole at λk of Tr (d∂λ ξ (k) ∧ dξ (k) ) is at least 3. To transform the second term, we notice that M has the same polar part as g (k) dξ (k) g (k)−1 at λk so that we can write g (k) dξ (k) g (k)−1 = M + N (k) where N (k) is regular at λk . We get: Tr Resλ=λk (∂λ M ∧ (M + N (k) )) dΥ = − k
We now show that: 1 Tr Resλ=λk (∂λ M ∧ N (k) ) = − Tr Resλ=λk (∂λ M ∧ M) 2
(3.60)
60
3 Synopsis of integrable systems
This implies that dΥ = − 12 k Tr Resλ=λk (∂λ M ∧ M) vanishes because Tr(∂λ M ∧ M) is a rational 1-form on the λ Riemann sphere, hence the sum of its residues vanishes. Equation (3.60) results from two local computations around λk . First we have: Tr Resλ=λk (∂λ (M + N (k) ) ∧ (M + N (k) )) = Tr Resλ=λk (∂λ (g (k) dξ (k) g (k)−1 ) ∧ (g (k) dξ (k) g (k)−1 )) = Tr Resλ=λk (d∂λ ξ (k) ∧ dξ (k) ) = 0 Next, using that the residue of a derivative of a function of λ vanishes, 0 = Tr Resλ=λk ∂λ (N (k) ∧ M), we obtain Tr Resλ=λk (∂λ N (k) ∧ M) = Tr Resλ=λk (∂λ M ∧ N (k) ). Inserting this relation into the left-hand side of the above formula yields eq. (3.60). The tau-function has remarkable and beautiful implications in the theory of integrable systems. Here we shall only present its fundamental relation to the wave-function Ψ. Around a singularity λk , we can expand Ψ as Ψ = g (k) (λ, t)eξ
(k) (λ,t)
g (k)−1 (λ, 0),
(k)
g (k) (λ) = g0 h(k) (λ) (k)
with h(k) (λ) = 1 + O(λ − λk ). We will relate the matrix elements hβα (λ) to tau-functions. Theorem. The matrix h(k) (t, λ) can be expressed in the form:
(k) τ t − [λ]α h(k) αα (t, λ) = τ (t)
(k) (k) τβα t − [λ]α (k) hβα (t, λ) = (λ − λk ) , α = β τ (t)
(3.61)
(k)
where the notation t − [λ]α means the shift of the time t(k,n,α) t − [λ](k) α ≡ t(k ,n,γ) − δkk δγα
(λ − λk )n n
Equation (3.61) is a famous formula discovered by Sato. Its main feature is that the λ dependence in the numerator is expressed by translations of the appropriate time variables. More information on tau-functions, including (k) expressions of τβα (t) in terms of τ (t), will be given in Chapter 8. The proof is long and will be done in several steps.
61
3.6 Tau-functions Lemma 1. Let f (λ) = ∞
λ
n
λ
−n
f (λ)
+
∞
j=0 fj λ
= λ∂λ f (λ),
n=1
j.
We have ∞
λn Res(λ−n f (λ)) = λf (λ)
(3.62)
n=1
Proof. Indeed, ∞
j ∞ ∞ ∞ ∞ λn λ−n f (λ) + = fj λj = fj λj = jfj λj = λ∂λ f (λ)
n=1
n=1 j=n
j=1 n=0
j=1
This shows the first equality. The second one is simpler: ∞
λn Res(λ−n f (λ)) =
n=1
∞
λn fn−1 = λf (λ)
n=1
Let us consider the vicinity of a given pole which we assume to be at the origin λk = 0 (the case λk = 0 is simply recovered by translating λ in the formulae below). (k) Introduce the generating differential operator of time derivatives ∇α : ∂ ∂ ∇(k) (λ − λk )n = λn (3.63) α = ∂t(k,n,α) ∂t(k,n,α) n>0
n>0
(k)
Lemma 2. The action of ∇α on the tau-function is given by: (k)−1 ∇(k) ∂λ h(k) ) α log τ = −λ Tr (Eαα h
(3.64)
Its action on h(k) is given by: h(k)−1 ∇α(k) h(k) = −λ[h(k)−1 ∂λ h(k) , Eαα ] − h(k)−1 Eαα h(k) + Eαα
(3.65)
Proof. In the definition of log τ , eq. (3.59), we can replace g (k) by h(k) (k) because g0 is independent of λ. Hence, using eq. (3.62), −∇(k) α log τ
=
∞
λn Tr Res(λ−n Eαα h(k)−1 ∂λ h(k) ) = λTr(Eαα h(k)−1 ∂λ h(k) )
n=1
To prove the second formula, we start from the equations of the hierarchy, eq. (3.45), with ξi = Eαα λ−n . They read:
(k) (k) ∂ti g0 = −g0 h(k) Eαα λ−n h(k)−1 0
(k) (k) −n (k)−1 ∂ti h = − h Eαα λ h h(k) ++
62
3 Synopsis of integrable systems
where for any series f (λ) = i≥0 fi λi we have defined (f (λ))0 = f0 and f (λ)++ = i≥1 fi λi . n −n f (λ)) = f (λ) − f , we get Using eq. (3.62) again and ∞ 0 0 n=1 λ (λ (k) =− ∇(k) α h
∞
λn h(k) Eαα λ−n h(k)−1
n=1
++
h(k)
= −λ∂λ (h(k) Eαα h(k)−1 ) + h(k) Eαα h(k)−1 − Eαα h(k) This is equivalent to eq. (3.65). We can now give the proof of the theorem: Proof. Let us take the matrix element β, α of eq. (3.65). Separating the cases β = α and β = α, we can write it as: (k) (k)−1 ∂λ h(k) )βα − (h(k)−1 Eαα h(k) − Eαα )βα (h(k)−1 ∇(k) α h )βα = −λ(h
+ δβα λ(h(k)−1 ∂λ h(k) )αα Multiplying on the left by h(k) and remembering that, by eq. (3.64), we (k) have λ(h(k)−1 ∂λ h(k) )αα = −∇α log τ , we obtain (k)
(k)
(k)
(k) (k) (k) ∇(k) α hβα = −λ∂λ hβα − (Eαα h )βα + (h Eαα )βα − (∇α log τ )hβα
If β = α we set Xαα = τ hαα and if β = α, we set Xβα = λ−1 τ hβα . The (k)
(k)
(k)
(k)
(k)
(k)
(k)
equation becomes (∇α + λ∂λ )Xβα = 0. This means that Xβα (t, λ) is of (k)
(k)
(k)
the form Xβα (t, λ) = τβα (. . . , t(k,n,α) − λn /n, . . .). Since hαα (λ)|λ=0 = 1, (k)
the function ταα (t) is in fact equal to τ (t). 3.7 Integrable field theories and monodromy matrix For a system with a finite number of degrees of freedom, we have seen that a Lax matrix could be interpreted as a coadjoint orbit. It is possible to adapt this interpretation to field theory by properly choosing the Lie algebra involved. We shall consider two-dimensional field theory on a cylinder with space variable x ∈ [0, 2π] and time variable t ∈ [−∞, +∞]. To introduce the space variable x, we consider the loop algebra G of (periodic) maps from the circle S 1 to the some Lie algebra G, i.e. maps S 1 → G. The simplest case corresponds to choosing G to be the algebra of N × N matrices, but more frequently, it will be an element of a loop algebra with spectral parameter λ as in the finite-dimensional case. So we are dealing
3.7 Integrable field theories and monodromy matrix
63
with double loop algebras. In order to introduce some structure in the x direction, we consider the central extension of the x–loop algebra (see Chapter 16): G = G + CK i (x) + ci K reads by definition: The commutator of two elements Xi = X 1 (x), X 2 (x)] + [X1 , X2 ] = [X
2π
2 (x))dx K 1 (x)∂x X (X
0
where ( , ) is an invariant non-degenerate bilinear form on G (such as Tr or Tr Res). The dual space G∗ of G can be identified with the space of pairs of elements of the form Ξ = (Ξ(x), ζ) with pairing 2π Ξ(X) = (Ξ(x), X(x))dx + ζc 0
The coadjoint action is defined as usual, (ad* X · Ξ)(Y) = −Ξ([X, Y]), and takes the form

(ad* X · Ξ)(Y) = −∫_0^{2π} (Ξ(x), [X̃(x), Ỹ(x)]) dx − ζ ∫_0^{2π} (X̃(x), ∂_x Ỹ(x)) dx
             = ∫_0^{2π} (−[Ξ(x), X̃(x)] + ζ ∂_x X̃(x), Ỹ(x)) dx

so that

ad* X · Ξ = (−[Ξ(x), X̃(x)] + ζ ∂_x X̃(x), 0)    (3.66)
We see that the element ζ is invariant by coadjoint action, and we will choose orbits with ζ = 1. In this setting, the Lax equation eq. (3.21) reads, for L = (U, 1) and M = V:

∂_t U − ∂_x V − [V, U] = 0    (3.67)

This is a zero curvature condition. Alternatively, one can say that the variable x behaves like one of the times of finite-dimensional systems, see eq. (3.42). It is important, however, to realise that in the field theory case the construction of commuting quantities is more complicated, because we have to construct functions invariant under the coadjoint action eq. (3.66). For this, the right object to consider is the so-called monodromy matrix, which we now introduce. The zero curvature condition (3.67) expresses the compatibility condition of the associated linear system

(∂_x − U) Ψ = 0,    (∂_t − V) Ψ = 0    (3.68)
3 Synopsis of integrable systems
The matrices U and V can be thought of as the x and t components of a connection. This connection will be called the Lax connection. Given U and V, the linear system (3.68) determines the matrix Ψ up to multiplication on the right by a constant matrix, which we can fix by requiring Ψ(λ, 0, 0) = 1. This Ψ will be called the wave function. Choosing a path γ from the origin to the point (x, t), the wave function can be written symbolically as

Ψ(x, t) = ←exp ∫_γ (U dx + V dt)    (3.69)

where ←exp denotes the path-ordered exponential. This is just the parallel transport along the curve γ with the connection (U, V). Since the Lax connection satisfies the zero curvature relation (3.67), the value of the path-ordered exponential is independent of the choice of this path. In particular, if γ is the path x ∈ [0, 2π] at fixed time t, we call Ψ(2π, t) the monodromy matrix T(λ, t):

T(λ, t) ≡ ←exp ∫_0^{2π} U(λ, x, t) dx    (3.70)
where we assume that U(λ, x, t) and V(λ, x, t) depend on a spectral parameter λ.

Proposition. Assume that all fields are periodic in x with period 2π. Let T(λ, t) be the monodromy matrix and let

H^(n)(λ) = Tr (T^n(λ, t))    (3.71)
Then H^(n)(λ) is independent of time. Hence traces of powers of the monodromy matrix generate conserved quantities.

Proof. Thinking of the path-ordered exponential on [a, b] as

←exp ∫_a^b U(x) dx ∼ (1 + δx U(x_n)) · · · (1 + δx U(x_1))

with a subdivision x_1 = a < x_2 < · · · < x_n = b such that x_{i+1} − x_i = δx → 0, we get (all exponentials are path-ordered):

∂_t T(λ, t) = ∫_0^{2π} dx e^{∫_x^{2π} U dx'} U̇(λ, x) e^{∫_0^x U dx'}
           = ∫_0^{2π} dx e^{∫_x^{2π} U dx'} (∂_x V + [V, U]) e^{∫_0^x U dx'}
           = ∫_0^{2π} dx ∂_x ( e^{∫_x^{2π} U dx'} V e^{∫_0^x U dx'} )
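The discretized ordered product used at the start of this proof is also a practical way to compute T(λ). As a consistency check, for an x-independent U the ordered product of the factors (1 + δx U(x_i)) must converge to the ordinary exponential e^{2πU}. A small sketch (the matrix U0 and step count are arbitrary choices, not from the text):

```python
import numpy as np

def transport(U_of_x, n=20000, a=0.0, b=2 * np.pi):
    """Path-ordered exponential <-exp int_a^b U(x) dx as the ordered product
    ... (1 + dx U(x_2)) (1 + dx U(x_1)), later factors multiplying on the left."""
    dx = (b - a) / n
    T = np.eye(2)
    for i in range(n):
        xi = a + (i + 0.5) * dx
        T = (np.eye(2) + dx * U_of_x(xi)) @ T
    return T

U0 = np.array([[0.3, 0.8], [0.5, -0.3]])      # constant test connection
T = transport(lambda x: U0)

# For constant U the ordered product must approach expm(2*pi*U0);
# build that exponential from the eigendecomposition.
w, S = np.linalg.eig(2 * np.pi * U0)
expm = (S * np.exp(w)) @ np.linalg.inv(S)
err = np.abs(T - expm).max()
```

With n steps the product differs from the exponential by O(1/n), consistent with the first-order accuracy of the factors (1 + δx U).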
Performing the integral,

∂_t T(λ, t) = V(λ, 2π, t) T(λ, t) − T(λ, t) V(λ, 0, t)    (3.72)
So, if the fields are periodic, we have V(λ, 2π, t) = V(λ, 0, t) and the relation becomes

∂_t T(λ, t) = [V(λ, 0, t), T(λ, t)]

This is a Lax equation. It implies that H^(n)(λ) is time-independent. Expanding in λ we obtain an infinite set of conserved quantities. It is the monodromy matrix which plays the role of the Lax matrix in the field theoretical context.

3.8 Abelianization

We now discuss the analogue of the Zakharov–Shabat construction for field theory. We consider the linear system eq. (3.68) where U(λ, x, t) and V(λ, x, t) are matrices depending in a rational way on a parameter λ, with poles at constant values λ_k:

U = U_0 + Σ_k U_k    with    U_k = Σ_{r=−n_k}^{−1} U_{k,r} (λ − λ_k)^r    (3.73)
V = V_0 + Σ_k V_k    with    V_k = Σ_{r=−m_k}^{−1} V_{k,r} (λ − λ_k)^r    (3.74)

The compatibility condition of the linear system (3.68) is the zero curvature condition (3.67). We demand that it holds identically in λ. These conditions are always compatible, since by the same naive counting argument as for finite-dimensional systems there is one more variable than the number of equations. The origin of this indeterminacy is the same: the zero curvature condition is invariant under gauge transformations. If the gauge transformation is independent of λ, it will not spoil the analytic properties of U and V. Notice that eq. (3.67) implies that U_0 and V_0 are pure gauge, i.e. there exists a group valued function h such that

U_0 = ∂_x h h^{−1}    and    V_0 = ∂_t h h^{−1}    (3.75)
Remark 1. Using a λ independent gauge transformation periodic in x, we can always choose a gauge in which U0 is constant diagonal and V0 = 0. To show this we start from eq. (3.75). Writing that U0 (x) is periodic implies that ∂x (h−1 (x)h(x+2π)) = 0. We change basis so that h−1 (x)h(x + 2π) is diagonal and denote it by exp(2πP ) with
P diagonal. Hence we can write h(x, t) = h̃(x, t) exp(P x) with h̃(x + 2π, t) = h̃(x, t), and we gauge transform by h̃. Then we have Ũ_0 = P and Ṽ_0 = x ∂_t P. But Ṽ_0 is periodic in x, hence ∂_t P = 0.
As for finite-dimensional systems, we first make a local analysis around each pole λ_k in order to understand solutions of eq. (3.67). We show that around each singularity λ_k one can perform a gauge transformation bringing simultaneously U(λ) and V(λ) to a diagonal form. The important new feature we want to emphasize, as compared to the finite-dimensional case, is that this construction is local in x. Let us assume that the pole is located at λ = 0. The rational functions U(λ, x, t), V(λ, x, t) can be expanded in a Laurent series in a neighbourhood of this pole:

U(λ, x, t) = Σ_{r=−n}^{∞} U_r(x, t) λ^r,    V(λ, x, t) = Σ_{r=−m}^{∞} V_r(x, t) λ^r    (3.76)
We have:

Proposition. There exists a local, periodic, gauge transformation

∂_x − U = g(∂_x − A)g^{−1},    ∂_t − V = g(∂_t − B)g^{−1}    (3.77)
where g(λ), A(λ) and B(λ) are formal series in λ:

g = Σ_{r=0}^{∞} g_r λ^r,    A = Σ_{r=−n}^{∞} A_r λ^r,    B = Σ_{r=−m}^{∞} B_r λ^r

such that the matrices A(λ) and B(λ) are diagonal. Moreover ∂_t A(λ) − ∂_x B(λ) = 0.

Proof. Let g_0 be the matrix diagonalizing the leading term in eq. (3.76),

U_{−n} = g_0 A_{−n} g_0^{−1}

Let ∂_x − U = g_0(∂_x − Ũ)g_0^{−1}, with Ũ = Σ_{r=−n}^{∞} Ũ_r λ^r. Since λ = 0 is a pole of U and g_0 is regular, we have Ũ_{−n} = A_{−n}. We set g = g_0 h. Equation (3.77) becomes ∂_x − Ũ = h(∂_x − A)h^{−1}, or ∂_x h − Ũh + hA = 0. Expanding in powers of λ, we get

∂_x h_l − Σ_{r=−n}^{l} (Ũ_r h_{l−r} − h_{l−r} A_r) = 0,    l = −n, . . . , ∞    (3.78)
Of course h_l = 0 if l < 0, and h_0 = 1. In the sum, we separate the first and the last term:

∂_x h_l − [A_{−n}, h_{l+n}] − Σ_{r=−n+1}^{l−1} (Ũ_r h_{l−r} − h_{l−r} A_r) − Ũ_l + A_l = 0,    l = −n, . . . , ∞    (3.79)

Projecting this equation on the diagonal matrices, we determine A_l in terms of A_k, k < l, and h_k, k < l + n. (The term [A_{−n}, h_{l+n}] does not contribute to the diagonal since A_{−n} is itself diagonal.) Similarly, the off-diagonal part of this equation determines the off-diagonal part of h_{l+n} in terms of the same variables. We can make the solution unique by requiring, for instance, diag(h_{l+n}) = 0. Therefore h and A are determined recursively. Note that this is a purely algebraic computation, so g and A are periodic functions of x and are algebraic functions of the matrix elements of U and their derivatives. Under the gauge transformation g we obviously have ∂_t A − ∂_x B = [B, A], or ∂_t A_l − ∂_x B_l = Σ_{r=−n}^{∞} [B_{l−r}, A_r]. If B is regular at λ = 0, the expansion of B starts at B_0 and ∂_t A_{−n} = [B_0, A_{−n}]. Hence the off-diagonal part of the commutator is zero and therefore B_0 is diagonal. If, however, B is singular, the expansion of B starts at B_{−m} and the most singular term in the commutator is λ^{−n−m}[B_{−m}, A_{−n}], and this has to vanish because n + m > max(n, m). Hence B_{−m} is diagonal. We finish the proof by induction on l. Assume B_r is diagonal up to B_{l+n−1}. Then ∂_t A_l − ∂_x B_l = Σ_{r=−n}^{l+m} [B_{l−r}, A_r] = [B_{l+n}, A_{−n}], hence B_{l+n} is diagonal.

It is important to notice that this procedure only requires local computations. There is no differential equation to be integrated when recursively diagonalizing the Lax connection around its poles. As for finite-dimensional systems, we can reconstruct all the matrices U_k and V_k, and therefore the Lax connection, from simple data:

U = U_0 + Σ_k U_k,    with U_k ≡ (g^(k) A_−^(k) g^(k)−1)_−    (3.80)
V = V_0 + Σ_k V_k,    with V_k ≡ (g^(k) B_−^(k) g^(k)−1)_−    (3.81)
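The recursion (3.78)–(3.79) is straightforward to implement. A sketch for a 2 × 2 connection with a simple pole (n = 1), taking U independent of x so that the ∂_x h_l terms drop out, and the leading coefficient already diagonal (so g_0 = 1); the matrices, orders and tolerances are arbitrary illustrative choices:

```python
import numpy as np

# U(lam) = U_m1/lam + U_0 + U_1*lam, with the leading coefficient already diagonal.
U_m1 = np.diag([1.0, -1.0])
U_0 = np.array([[0.2, 0.5], [0.3, -0.1]])
U_1 = np.array([[0.1, -0.4], [0.2, 0.3]])
K = 8                                    # highest order of A computed

def U_coeff(r):
    return {-1: U_m1, 0: U_0, 1: U_1}.get(r, np.zeros((2, 2)))

a = np.diag(U_m1)                        # eigenvalues of the leading term
A = {-1: U_m1}                           # diagonal series A(lam) = sum_l A_l lam^l
h = {0: np.eye(2)}                       # g = g0 h with g0 = Id; diag(h_l) = 0 for l >= 1

for l in range(0, K + 1):
    # R collects the known terms of eq. (3.79) at order l (x-independent case)
    R = U_coeff(l) + sum(U_coeff(r) @ h[l - r] - h[l - r] @ A[r] for r in range(0, l))
    A[l] = np.diag(np.diag(R))           # diagonal part of the equation gives A_l
    hh = np.zeros((2, 2))                # off-diagonal part gives h_{l+1}
    for i in range(2):
        for j in range(2):
            if i != j:
                hh[i, j] = -R[i, j] / (a[i] - a[j])
    h[l + 1] = hh

# Check: the diagonal of A(lam) reproduces the eigenvalues of U(lam) for small lam.
lam = 0.05
A_series = sum(A[l] * lam ** l for l in range(-1, K + 1))
eig_exact = np.sort(np.linalg.eigvals(U_m1 / lam + U_0 + U_1 * lam).real)
eig_series = np.sort(np.diag(A_series))
err = np.abs(eig_exact - eig_series).max()
```

Since g is constant in x here, ∂_x − U = g(∂_x − A)g^{−1} reduces to U = g A g^{−1}, so the diagonal of A(λ) must match the eigenvalue branches of U(λ) order by order, which is what the final check confirms.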
Remark 2. Let us check that the reconstruction formulae (3.80, 3.81) for U and V are such that the order of the poles in the commutator appearing in the zero curvature condition matches the order of the poles in the derivative terms. Indeed the polar part at λk of the commutator is [Uk , V ]− + [Vk , U ]− and can be higher than the order of the poles of the derivatives ∂t Uk and ∂x Vk . So we have to show that the formula
ensures that the poles of these commutators, which are naively of order n_k + m_k, are actually of order less than max(n_k, m_k). Indeed, consider for example the commutator [U_k, V]_−. Similarly as for finite-dimensional systems, we may write it as:

[U_k, V]_− = [(g^(k) A_−^(k) g^(k)−1)_−, V]_−    (3.82)
          = [g^(k) A_−^(k) g^(k)−1 − (g^(k) A_−^(k) g^(k)−1)_+, V]_−
          = [g^(k) A_−^(k) g^(k)−1, g^(k) ∂_t g^(k)−1]_− − [(g^(k) A_−^(k) g^(k)−1)_+, V]_−

In the last line we use the fact that ∂_t − V is diagonalized by g^(k), i.e. V = −g^(k) ∂_t g^(k)−1 + g^(k) B^(k) g^(k)−1 with B^(k) diagonal. All the terms of the last line have a pole of order at most n_k. Similarly one shows that the order of the pole of [V_k, U]_− is at most m_k. This shows that the constraints hidden in eq. (3.67) are solved by the formulas (3.80, 3.81).
We now use this diagonal gauge to compute the conserved quantities.

Proposition. The quantities Q^(k)(λ) = ∫_0^{2π} A^(k)(λ, x, t) dx are local conserved quantities of the field theory. They are related to eq. (3.71) by

H^(n)(λ) = Tr exp ( n ∫_0^{2π} A^(k)(λ, x, t) dx ) = Tr exp ( n Q^(k)(λ) )
Proof. Around each pole, in the diagonal gauge, the zero curvature condition reduces to

∂_t A^(k)(λ, x, t) − ∂_x B^(k)(λ, x, t) = 0

It is the equation of conservation of a current. The charge Q^(k)(λ) = ∫_0^{2π} A^(k)(λ, x, t) dx is conserved because

∂_t Q^(k)(λ) = ∫_0^{2π} ∂_x B^(k)(λ, x, t) dx = B^(k)(λ, 2π, t) − B^(k)(λ, 0, t) = 0

where we have used the fact that A^(k)(λ, x, t) and B^(k)(λ, x, t) are local in terms of the coefficients of the Lax connection and are therefore periodic in x. Expanding Q^(k)(λ) in powers of λ − λ_k produces an infinite number of conserved quantities. Under a gauge transformation, U → ^gU = g^{−1}Ug − g^{−1}∂_x g, the monodromy matrix is transformed into ^gT(λ, t) with

^gT(λ, t) = g^{−1}(2π, t) T(λ, t) g(0, t)
Thus, if g(x, t) is periodic, g(2π, t) = g(0, t), one has

Tr (^gT(λ, t)) = Tr (T(λ, t))

In the diagonal gauge around λ = λ_k, the monodromy matrix is easily computed and one gets:

H^(n)(λ) = Tr exp ( n ∫_0^{2π} A^(k)(λ, x, t) dx )
There is no problem of ordering in the exponential since the matrices A^(k) are diagonal.

Remark 3. The abelianization procedure can be used to give a local expression of the wave function,

Ψ(λ, x, t) = g^(k)(λ, x, t) e^{∫_{(0,0)}^{(x,t)} (A^(k)(λ) dx + B^(k)(λ) dt)} g^(k)−1(λ, 0, 0)
As for finite-dimensional systems, this shows that Ψ(λ, x, t) possesses essential singularities at the points λk . By the Poincar´e theorem on differential equations, these are the only singularities of Ψ(λ, x, t).
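The gauge invariance Tr(^gT) = Tr(T) for periodic g has an exact lattice analogue that is easy to test: if the transport across each cell is a matrix L_i, a gauge transformation acts as L_i → g_{i+1}^{−1} L_i g_i, and periodicity g_N = g_0 makes the monodromy transform by conjugation, T → g_0^{−1} T g_0, leaving all traces of powers unchanged. A small numerical sketch (all matrices are arbitrary stand-ins, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 7
L = [rng.normal(size=(3, 3)) for _ in range(N)]           # cell transport matrices
g = [np.eye(3) + 0.3 * rng.normal(size=(3, 3)) for _ in range(N)]
g.append(g[0])                                            # periodicity: g_N = g_0

def monodromy(Ls):
    T = np.eye(3)
    for Li in Ls:
        T = Li @ T                                        # ordered product along x
    return T

T = monodromy(L)
Lg = [np.linalg.inv(g[i + 1]) @ L[i] @ g[i] for i in range(N)]
Tg = monodromy(Lg)                                        # equals g_0^{-1} T g_0

tr = [np.trace(np.linalg.matrix_power(T, n)) for n in (1, 2, 3)]
trg = [np.trace(np.linalg.matrix_power(Tg, n)) for n in (1, 2, 3)]
```

The intermediate gauge factors telescope in the ordered product, which is why only the endpoint values g_0, g_N survive; periodicity then turns the transformation into a similarity.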
We now give some examples of two-dimensional field theories having a zero curvature representation.

Example 1. The first example is the non-linear σ-model. For simplicity, we look for a Lax connection in which U and V have only one simple pole, at two different points, and U_0 = V_0 = 0. Choosing these points to be at λ = ±1, we can thus parametrize U and V as:

U = (1/(λ − 1)) J_x,    V = −(1/(λ + 1)) J_t    (3.83)

with J_x and J_t taking values in some Lie algebra. Decomposing the zero curvature condition [∂_x − U, ∂_t − V] = 0 over its simple poles gives two equations: ∂_t J_x − ½[J_x, J_t] = 0 and ∂_x J_t + ½[J_x, J_t] = 0. Taking the difference implies that [∂_t + J_t, ∂_x + J_x] = 0. Thus J is a pure gauge and there exists g such that J_t = g^{−1}∂_t g and J_x = g^{−1}∂_x g. Taking now the sum of the two equations implies ∂_t J_x + ∂_x J_t = 0, or equivalently,

∂_t (g^{−1}∂_x g) + ∂_x (g^{−1}∂_t g) = 0
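The step "J is a pure gauge" relies on the identity that any J_µ = g^{−1}∂_µ g automatically satisfies [∂_t + J_t, ∂_x + J_x] = 0. For the particular choice g = e^{xA} e^{tB} one finds J_t = B and J_x = e^{−tB} A e^{tB}, and the flatness identity ∂_t J_x − ∂_x J_t + [J_t, J_x] = 0 can be verified numerically for arbitrary A, B (all values below are illustrative):

```python
import numpy as np

def expm(M, terms=40):
    """Matrix exponential by its (rapidly convergent) Taylor series."""
    out, P = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        P = P @ M / k
        out = out + P
    return out

A = np.array([[0.0, 1.0], [0.5, 0.0]])
B = np.array([[0.3, -0.2], [0.4, -0.3]])

def J_x(t):                        # J_x = g^{-1} d_x g for g = e^{xA} e^{tB}
    return expm(-t * B) @ A @ expm(t * B)

J_t = B                            # J_t = g^{-1} d_t g, independent of x and t here

t, h = 0.37, 1e-5
dJx_dt = (J_x(t + h) - J_x(t - h)) / (2 * h)    # central difference in t
flat = dJx_dt + J_t @ J_x(t) - J_x(t) @ J_t     # d_t J_x - d_x J_t + [J_t, J_x], with d_x J_t = 0
resid = np.abs(flat).max()
```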
This is the field equation of the so-called non-linear sigma model, with x, t as light-cone coordinates.

Example 2. Another important example is the sinh-Gordon model. It also has a two-pole Lax connection, one pole at λ = 0, the other at λ = ∞. Moreover, we require that in the light-cone coordinates x_± = x ± t, U(λ, x_±) has a simple pole at λ = 0 and V(λ, x_±) a simple pole at λ = ∞. The most general 2 × 2 system of this form is:

(∂_{x+} − U)Ψ = 0,    U = U_0 + λ^{−1} U_1
(∂_{x−} − V)Ψ = 0,    V = V_0 + λ V_1

The matrices U_i, V_i are taken to be traceless, so they contain 12 parameters. One can reduce this number by imposing a symmetry condition under a discrete group, as in eq. (3.18). Namely, we consider the group Z_2 acting by:

Ψ(λ) → σ_z Ψ(−λ) σ_z,    σ_z = ( 1   0
                                  0  −1 )    (3.84)

and we demand that Ψ be invariant by this action. This restriction means that the wave function belongs to the twisted loop group. It follows that σ_z U(−λ) σ_z = U(λ) and σ_z V(−λ) σ_z = V(λ). We still have the possibility of performing a gauge transformation by an element g independent of λ, in order to preserve the pole structure of the connection, and commuting with the action of Z_2, i.e. g diagonal. This gauge freedom can be used to set (V_0)_{ii} = 0. The symmetry condition then gives:

U = ( u_0        λ^{−1}u_1
      λ^{−1}u_2   −u_0    ),    V = ( 0     λv_1
                                      λv_2   0   )

In this gauge, the zero curvature equation reduces to:

∂_{x−} u_0 − u_1 v_2 + v_1 u_2 = 0    (3.85)
∂_{x−} u_1 = 0,    ∂_{x−} u_2 = 0    (3.86)
∂_{x+} v_1 − 2 v_1 u_0 = 0,    ∂_{x+} v_2 + 2 v_2 u_0 = 0    (3.87)

From eq. (3.86) we have u_1 = α(x_+), u_2 = β(x_+). We set u_0 = ∂_{x+}ϕ. Then, from eq. (3.87), we have v_1 = γ(x_−) exp 2ϕ and v_2 = δ(x_−) exp(−2ϕ). Finally, eq. (3.85) becomes:

∂_{x+}∂_{x−} ϕ + β(x_+)γ(x_−) e^{2ϕ} − α(x_+)δ(x_−) e^{−2ϕ} = 0

This is the sinh-Gordon equation. The arbitrary functions α(x_+), β(x_+) and γ(x_−), δ(x_−) are irrelevant: they can be absorbed into a redefinition
of the field ϕ and a change of the coordinates x_+, x_−. Taking them as constants, equal to m, we finally get

U = ( ∂_{x+}ϕ   mλ^{−1}
      mλ^{−1}   −∂_{x+}ϕ ),    V = ( 0          mλe^{2ϕ}
                                     mλe^{−2ϕ}   0       )    (3.88)

Hence the Lax connection of the sinh-Gordon model is naturally recovered from two-pole systems with Z_2 symmetry. This construction generalizes to other Lie algebras, the reduction group being generated by the Coxeter automorphism, and yields the Toda field theories.

Remark 4. There is a relation between the linear system eq. (3.88) and what are called Bäcklund transformations. These transformations produce new solutions of a non-linear partial differential equation from old ones. Assume that ϕ satisfies a second order non-linear partial differential equation. The Bäcklund transformation requires that the new function ϕ̃ is obtained by solving a first order system:

∂_{x+}ϕ̃ = P(ϕ̃, ϕ, ∂_{x+}ϕ, ∂_{x−}ϕ),
∂_{x−}ϕ̃ = Q(ϕ̃, ϕ, ∂_{x+}ϕ, ∂_{x−}ϕ)
Of course the compatibility condition of this system puts strong constraints on P and Q. We say the transformation is auto-Bäcklund if ϕ̃ satisfies the same equation as ϕ. In the sinh-Gordon case, the transformation is defined by:

∂_{x+}(ϕ̃ + ϕ) = 2mλ^{−1} sinh(ϕ̃ − ϕ),    ∂_{x−}(ϕ̃ − ϕ) = 2mλ sinh(ϕ̃ + ϕ)    (3.89)
where λ is an arbitrary parameter. The compatibility condition reads:

∂_{x+}∂_{x−}ϕ̃ = −∂_{x+}∂_{x−}ϕ + 4m² cosh(ϕ̃ − ϕ) sinh(ϕ̃ + ϕ)
            = ∂_{x+}∂_{x−}ϕ + 4m² cosh(ϕ̃ + ϕ) sinh(ϕ̃ − ϕ)

and reduces to the sinh-Gordon equation ∂_{x+}∂_{x−}ϕ = 2m² sinh(2ϕ). Moreover, we then find ∂_{x+}∂_{x−}ϕ̃ = 2m² sinh(2ϕ̃), so if ϕ solves the sinh-Gordon equation, so does the transformed field ϕ̃. The relation between eq. (3.89) and the linear system eq. (3.88) is obtained by setting e^{ϕ̃} = e^{ϕ} u/v. The Bäcklund transformation then reads:

u(−∂_{x+}v + (∂_{x+}ϕ) v + mλ^{−1}u) + v(∂_{x+}u + (∂_{x+}ϕ) u − mλ^{−1}v) = 0
u(∂_{x−}v − mλ e^{2ϕ} u) + v(−∂_{x−}u + mλ e^{−2ϕ} v) = 0

Requiring the vanishing of the four terms in parentheses yields exactly the linear system:

(∂_{x+} − U) (v, u)^T = 0,    (∂_{x−} − V) (v, u)^T = 0

where the connection is given in eq. (3.88). Conversely, if we have a solution (u, v) of the linear system associated with ϕ, then e^{ϕ̃} = e^{ϕ} u/v is a solution of the Bäcklund transformation. In general the relation between ϕ̃ and ϕ is non-local. However, when expanding ϕ̃ in formal powers of either λ or 1/λ, each term of the infinite series is a local function of ϕ. This remark can be used to deduce an infinite set of local conserved currents
in the sinh-Gordon model. Indeed, from the defining relations (3.89) of the Bäcklund transformation, we see that the current (J_{x+}, J_{x−}) with components

J_{x+} = λ^{−1} cosh(ϕ̃ − ϕ),    J_{x−} = −λ cosh(ϕ̃ + ϕ)

is conserved: ∂_{x−}J_{x+} + ∂_{x+}J_{x−} = 0. Expanding it in power series of either λ or 1/λ gives two infinite series of local conserved currents.
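The sinh-Gordon Lax connection can be tested end-to-end numerically: with U, V as in eq. (3.88), compatibility of the linear system means ∂_{x−}U − ∂_{x+}V + [U, V] = 0, and this must hold once ϕ solves ∂_{x+}∂_{x−}ϕ = 2m² sinh(2ϕ). One explicit solution (valid for x_+ + x_− < 0) is ϕ = 2 artanh(e^{2m(x_+ + x_−)}). A sketch, with one derivative taken by central differences (the sample point, mass and spectral parameter are arbitrary choices):

```python
import numpy as np

m = 0.7

def phi(xp, xm):                  # exact solution of d+ d- phi = 2 m^2 sinh(2 phi)
    return 2.0 * np.arctanh(np.exp(2.0 * m * (xp + xm)))

def dphi_dxp(xp, xm):             # analytic d(phi)/dx+
    E = np.exp(2.0 * m * (xp + xm))
    return 4.0 * m * E / (1.0 - E * E)

def U(xp, xm, lam):               # Lax connection, eq. (3.88)
    return np.array([[dphi_dxp(xp, xm), m / lam],
                     [m / lam, -dphi_dxp(xp, xm)]])

def V(xp, xm, lam):
    e = np.exp(2.0 * phi(xp, xm))
    return np.array([[0.0, m * lam * e], [m * lam / e, 0.0]])

xp, xm, lam, h = -0.3, -0.4, 1.3, 1e-5
dU_dxm = (U(xp, xm + h, lam) - U(xp, xm - h, lam)) / (2 * h)
dV_dxp = (V(xp + h, xm, lam) - V(xp - h, xm, lam)) / (2 * h)
Um, Vm = U(xp, xm, lam), V(xp, xm, lam)
F = dU_dxm - dV_dxp + Um @ Vm - Vm @ Um      # zero curvature residual
resid = np.abs(F).max()
```

The (1,1) entry of the residual is exactly ∂_{x+}∂_{x−}ϕ − 2m² sinh(2ϕ), while the off-diagonal entries cancel identically, so the residual vanishes up to finite-difference error.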
3.9 Poisson brackets of the monodromy matrix

As we just saw, the zero curvature equation leads to the construction of an infinite set of conserved currents. We want to compute the Poisson brackets of the conserved charges associated with these conserved currents. For this we will compute the Poisson brackets of the matrix elements of the monodromy matrix. In order to do it we assume the existence of an r-matrix relation such that:

{U_1(λ, x), U_2(µ, y)} = [r_12(λ − µ), U_1(λ, x) + U_2(µ, y)] δ(x − y)    (3.90)

We assume that r is a non-dynamical r-matrix, as in eq. (3.31). We say that the Poisson bracket eq. (3.90) is ultralocal due to the presence of the term δ(x − y) only. This hypothesis actually covers a large class of integrable field theories. Since we are computing Poisson brackets, let us fix the time t and consider the transport matrix from x to y:

T(λ; y, x) = ←exp ∫_x^y U(λ, z) dz
In particular the monodromy matrix is T(λ) = T(λ; 2π, 0). The matrix elements [T]_{ij} of T(λ; y, x) are functions on phase space. As in Chapter 2, section (2.5), we use the tensor notation to arrange the table of their Poisson brackets.

Proposition. If eq. (3.90) holds, we have the fundamental Sklyanin relation for the transport matrix:

{T_1(λ; y, x), T_2(µ; y, x)} = [r_12(λ, µ), T_1(λ; y, x) T_2(µ; y, x)]    (3.91)

As a consequence, the traces of powers of the monodromy matrix, H^(n)(λ) = Tr (T^n(λ)), generate Poisson commuting quantities:

{H^(n)(λ), H^(m)(µ)} = 0    (3.92)
Proof. Let us first prove the relation (3.91) for the Poisson brackets of the transport matrices. Notice that λ is attached to T1 and µ to T2 , so that
there is no ambiguity if we do not write explicitly the λ and µ dependence. The transport matrix T(y, x) satisfies the differential equations

∂_x T(y, x) + T(y, x) U(x) = 0,    ∂_y T(y, x) − U(y) T(y, x) = 0    (3.93)

Since Poisson brackets satisfy the Leibniz rule, we have

{T_1(y, x), T_2(y, x)} = ∫_x^y ∫_x^y du dv T_1(y, u) T_2(y, v) {U_1(u), U_2(v)} T_1(u, x) T_2(v, x)    (3.94)

Replacing {U_1(u), U_2(v)} by eq. (3.90), and using the differential equations satisfied by T(y, x), this yields:

{T_1(y, x), T_2(y, x)}
 = ∫_x^y ∫_x^y du dv δ(u − v) [ T_1(y, u) T_2(y, v) r_12 (∂_u + ∂_v)(T_1(u, x) T_2(v, x))
   + (∂_u + ∂_v)(T_1(y, u) T_2(y, v)) r_12 T_1(u, x) T_2(v, x) ]
 = ∫_x^y dz ∂_z ( T_1(y, z) T_2(y, z) r_12 T_1(z, x) T_2(z, x) )

Integrating this exact derivative gives the relation (3.91). Let us now show that the traces of the monodromy matrix H^(n)(λ) generate Poisson commuting quantities. Equation (3.91) implies

{T_1^n(λ), T_2^m(µ)} = [r_12(λ, µ), T_1^n(λ) T_2^m(µ)]

We take the trace of this relation. In the left-hand side we use the fact that Tr_12(A ⊗ B) = Tr(A) Tr(B) and get {H^(n)(λ), H^(m)(µ)}. The right-hand side gives zero because it is the trace of a commutator. Let us emphasize that it is the integration process involved in the transport matrix which leads from the linear Poisson bracket eq. (3.90) to the quadratic Sklyanin Poisson bracket eq. (3.91). The proposition shows that we may take as Hamiltonian any element of the family generated by H^(n)(µ). We show that the corresponding equations of motion take the form of a zero curvature condition.

Proposition. Taking H^(n)(µ) as Hamiltonian, we have

U̇(λ, x) ≡ {H^(n)(µ), U(λ, x)} = ∂_x V^(n)(λ, µ, x) + [V^(n)(λ, µ, x), U(λ, x)]    (3.95)
where

V^(n)(λ, µ; x) = n Tr_1 ( T_1(µ; 2π, x) r_12(µ, λ) T_1(µ; x, 0) T_1^{n−1}(µ; 2π, 0) )

This provides the equations of motion for a hierarchy of times, when we expand in µ.

Proof. To simplify the notation, we do not explicitly write the λ, µ dependence as above, noting that µ is attached to the tensorial index 1 and λ to the tensorial index 2. We have:

{T_1(2π, 0), U_2(x)} = ∫_0^{2π} dy T_1(2π, y) {U_1(y), U_2(x)} T_1(y, 0)
                    = T_1(2π, x) [r_12, U_1(x) + U_2(x)] T_1(x, 0)

Expanding the commutator we get four terms:

{T_1(2π, 0), U_2(x)} = T_1(2π, x) · r_12 · U_1(x) T_1(x, 0) + T_1(2π, x) · r_12 · U_2(x) T_1(x, 0)
                     − T_1(2π, x) U_1(x) · r_12 · T_1(x, 0) − T_1(2π, x) U_2(x) · r_12 · T_1(x, 0)

In the terms containing U_2(x), U_2 commutes with the factors T_1 since they act in different tensor spaces; in the terms containing U_1(x), we use the differential equations (3.93), U_1(x) T_1(x, 0) = ∂_x T_1(x, 0) and T_1(2π, x) U_1(x) = −∂_x T_1(2π, x). This gives

{T_1(2π, 0), U_2(x)} = ∂_x V_12(x) + [V_12(x), U_2(x)]

where we have introduced V_12(x) = T_1(2π, x) · r_12 · T_1(x, 0). From this we get {T_1^n(2π, 0), U_2(x)} = ∂_x V_12^(n)(x) + [V_12^(n)(x), U_2(x)] with V_12^(n)(x) = Σ_{i=0}^{n−1} T_1^{n−i−1} V_12(x) T_1^i, the powers being those of T_1(2π, 0). Taking the trace over the first space, remembering that H^(n)(µ) = Tr T^n(µ), and setting V^(n)(λ, µ, x) = Tr_1 V_12^(n)(x), we find eq. (3.95).

3.10 The group of dressing transformations

We now introduce a very important notion, the group of dressing transformations, which is related to the Zakharov–Shabat construction. These transformations provide a way to construct new solutions of the field equations of motion from old ones. It defines a group action on the space of classical solutions of the model, and therefore on the phase space of the model. Dressing transformations are special non-local gauge transformations preserving the analytical structure of the Lax connection. These transformations are intimately related to the Riemann–Hilbert problem which we have discussed in the section on factorization.
We choose a contour Γ in the λ-plane such that none of the poles λ_k of the Lax connection lies on Γ. We will take for Γ the sum of contours Γ^(k), each one surrounding a pole λ_k as in the factorization problem. To define the dressing transformation, we pick a group valued function g(λ) ∈ G̃ on Γ. From the Riemann–Hilbert problem, eqs. (3.49, 3.53), g(λ) can be factorized as:

g(λ) = g_−^{−1}(λ) g_+(λ)

where g_+(λ) and g_−(λ) are analytic inside and outside the contour Γ respectively. In the following discussion we assume that g(λ) is close enough to the identity so that there are no indices. Let U, V be a solution of the zero curvature equation eq. (3.67) with the prescribed singularities specified in eqs. (3.73, 3.74). Let Ψ ≡ Ψ(λ; x, t) be the solution of the linear system (3.68) normalized by Ψ(λ; 0, 0) = 1. We set:

θ(λ; x, t) = Ψ(λ; x, t) · g(λ) · Ψ(λ; x, t)^{−1}    (3.96)

At each space–time point (x, t), we perform a λ decomposition of θ(λ; x, t) according to the Riemann–Hilbert problem as:

θ(λ; x, t) = θ_−^{−1}(λ; x, t) · θ_+(λ; x, t)    (3.97)
with θ_+ and θ_− analytic inside and outside the contour Γ respectively. Then,

Proposition. The following function, defined for λ on the contour Γ,

Ψ^g(λ; x, t) = θ_±(λ; x, t) · Ψ(λ; x, t) · g_±^{−1}(λ)    (3.98)

extends to a function Ψ_+^g, defined inside Γ except at the points λ_k where it has essential singularities, and a function Ψ_−^g defined outside Γ. On Γ we have (Ψ_−^g)^{−1} Ψ_+^g |_Γ = 1. So Ψ_±^g define a unique function Ψ^g which is normalized by Ψ^g(λ; 0, 0) = 1 and is a solution of the linear system (3.68) with Lax connection U^g and V^g given by

U^g(λ; x, t) = θ_± · U · θ_±^{−1} + ∂_x θ_± · θ_±^{−1}    (3.99)
V^g(λ; x, t) = θ_± · V · θ_±^{−1} + ∂_t θ_± · θ_±^{−1}    (3.100)
The matrices U^g and V^g, which satisfy the zero curvature equation (3.67), are meromorphic functions on the whole complex λ-plane with the same analytic structure as the components U(λ) and V(λ) of the original Lax connection.
Proof. First it follows directly from the definitions of g_± and θ_± that for λ on Γ,

θ_+(λ; x, t) · Ψ(λ; x, t) · g_+^{−1}(λ) = θ_−(λ; x, t) · Ψ(λ; x, t) · g_−^{−1}(λ)

so that the two expressions on the right-hand side of eq. (3.98) with the + and − signs are equal, and effectively define a unique function Ψ^g on Γ. It is clear that this function can be extended into two functions Ψ_±^g respectively defined inside and outside this contour by:

Ψ_±^g = θ_± · Ψ · g_±^{−1}

These functions have the same essential singularities as Ψ at the points λ_k. By construction, they are such that (Ψ_−^g)^{−1} Ψ_+^g |_Γ = 1. We may use Ψ_±^g to define the Lax connection U_±^g, V_±^g inside and outside the contour Γ. Explicitly:

U_±^g = ∂_x Ψ_±^g · (Ψ_±^g)^{−1} = ∂_x θ_± · θ_±^{−1} + θ_± U θ_±^{−1}
V_±^g = ∂_t Ψ_±^g · (Ψ_±^g)^{−1} = ∂_t θ_± · θ_±^{−1} + θ_± V θ_±^{−1}

Since (Ψ_−^g)^{−1} Ψ_+^g |_Γ = 1, we see that U_+^g coincides with U_−^g on the contour Γ, and similarly V_+^g = V_−^g for λ ∈ Γ; hence the pairs U_±^g, V_±^g define a connection U^g, V^g on the whole λ-plane. Since θ_± are regular in their respective domains of definition, we see that U^g, V^g have the same singularities as U, V.

This proposition effectively states that the dressing transformations (3.98) map solutions of the equations of motion into new solutions. Given a solution U, V of the zero curvature equation with the prescribed pole structure and an element of the loop group G̃, we produce a new solution of the zero curvature equation with the same analytical structure. But, since this analytic structure is the main information which specifies the model, we have produced a new solution of the equations of motion. Thus, the dressing transformations Ψ → Ψ^g act on the solution space. They form a group, called the dressing group and denoted by G_R. This group is modeled on G̃ but is not isomorphic to it since its composition law is different. Indeed,
h • g = (h− g− )−1 (h+ g+ )
(3.101)
77
3.10 The group of dressing transformations
Representing the elements of the dressing group by the pairs (g− , g+ ) and (h− , h+ ) we may write the composition law as: (h− , h+ ) • (g− , g+ ) = (h− g− , h+ g+ ). In particular the plus and minus components commute. −1 g+ and h = h−1 Proof. Consider two elements g = g− − h+ and transform g g h successively Ψ → Ψ → (Ψ ) ; we have: g −1 Ψg = θ± Ψ g±
with
g θ± = (ΨgΨ−1 )±
hg (Ψg )h = θ± Ψg h−1 ±
with
hg θ± = (Ψg hΨg−1 )± (3.102)
The factorization of (Ψg hΨg−1 ) can be written as follows: hg −1 hg g g (θ− ) θ+ ≡ Ψg hΨg−1 = θ− Ψ (h− g− )−1 (h+ g+ ) Ψ−1 θ+
−1
or, equivalently, hg g θ± = (Ψ (h− g− )−1 (h+ g+ ) Ψ−1 )± θ±
Inserting this formula into eq. (3.102) proves the multiplication law for the dressing transformations. The Lie algebra GR of the dressing group is composed of the two commuting subalgebras G± that we have introduced in the section on factorization. Recall that G− consists of maps X− (λ) extendable ouside Γ, while G+ = ⊕k Gk , [Gk , Gk ] = 0, k = k , consists of a collection of maps Xk (λ) ˜ but in the regular inside Γ(k) . As a vector space, GR is isomorphic to G, dressing Lie algebra, [G− , G+ ] = 0. The infinitesimal form of the dressing transformation, eq. (3.98), for ˜ = X+ − X− , with X± ∈ G± is: any X ˜ −1 )± Ψ − Ψ X± δX˜ Ψ = (ΨXΨ
(3.103)
We end these general considerations on dressing transformations by clarifying their relation to the Poisson structure of the theory. We shall assume the Poisson bracket eq. (3.90) for the Lax connection with r12 (λ, µ) = −
C12 , λ−µ
C12 =
Eij ⊗ Eji
ij
The Poisson bracket of the wave function is thus, from eqs. (3.69, 3.91): {Ψ1 (λ; x), Ψ2 (µ; x)} = [r12 (λ, µ), Ψ1 (λ; x)Ψ2 (µ; x)]
(3.104)
The r-matrix is related to the factorization problem in the loop algebra ˜ ˜ G˜ whose elements are maps X(λ) defined on Γ. Recall that X(λ) can be
78
3 Synopsis of integrable systems
˜ = X+ − X− with X± ∈ G± . Its component X− (λ) can decomposed as X be computed by: dµ ˜ 2 (µ), λ outside Γ X− (λ) = Tr2 r12 (λ, µ)X Γ 2iπ Its component X+ (λ) = (X1 (λ), X2 (λ), . . .) reads: dµ ˜ 2 (µ), λ inside Γ(k) Tr2 r12 (λ, µ)X Xk (λ) = Γ 2iπ ˜ k , where X ˜ k denotes the We have to verify that (Xk − X− )|Γ(k) = X ˜ component of X(λ) on Γ(k) . Recalling the formula 1 1 − = −2iπδ(x) x + i0 x − i0 and taking λ± be two values of λ pinching the contour Γ(k) from inside and outside, we can write: r12 (λ+ , µ) − r12 (λ− , µ) = 2iπC12 δ(λ − µ)
(3.105)
with δ(λ − µ) the Dirac measure. Then we have:
dµ ˜ 2,l (µ) (Xk − X− )(λ)|Γ(k) = Tr2 (r12 (λ+ , µ) − r12 (λ− , µ))X Γ(l) 2iπ l
˜ k (λ) =X Introducing two maps R± acting on the loop algebra G˜ by dµ ± ˜ ˜ 2 (µ) R (X)(λ) = X± (λ) = Tr2 r12 (λ± , µ)X Γ 2iπ
(3.106)
˜ = X+ − X− . This is equivalent to R+ − R− = Id, we have shown that X the identity operator. With this result at hand we can spell out the Poisson property of dressing transformations. The action eq. (3.103) does not naively preserve the Poisson brackets eq. (3.104). However, a good Poisson action is recovered if the dressing group itself is equipped with a non-trivial Poisson structure. It is then called a Poisson–Lie group and the action a Lie–Poisson action. The theory of Poisson–Lie groups and Lie–Poisson actions is sketched in Chapter 14. Here we only need to know that infinitesimal actions are generated by so-called non-Abelian Hamiltonians T . This means that there exists a function on phase space, T , taking value in the dual group, such that for any function f on phase space δX f = X, T −1 {T, f }
(3.107)
3.11 Soliton solutions
79
Here X is an element of the Lie algebra of the Poisson–Lie group acting on the manifold, and T −1 {T, f } belongs to the dual of this Lie algebra; is the pairing, see Chapter 14. In the Abelian case, writing T = exp P, we get δX f = X, {P, f } = {H(X), f }, where H(X) = P(X). This is the standard formula showing that the action is symplectic in this case. In our situation X is an element of the Lie algebra of the dressing group ˜ GR = (G+ , G− ) and T an element of the loop group exp G. Proposition. The action of dressing transformations is a Lie–Poisson action. The non–Abelian generator is the monodromy matrix. Proof. Introduce the monodromy matrix T (µ) = Ψ(µ, 2π). Using the ultralocality property, its Poisson bracket with the wave function is, for 0 ≤ x ≤ 2π: {Ψ1 (λ, x), T2 (µ)} = T2 (µ)Ψ−1 2 (µ, x)[r12 (λ, µ), Ψ1 (λ, x)Ψ2 (µ, x)] In this formula, we can freely replace λ on a contour Γ(k) by either of the two values λ± pinching it from inside and outside Γ(k) . This is because for µ outside Γ(k) , this replacement has no effect since r12 (λ, µ) is regular, while for µ on Γ(k) , by eq. (3.105), the difference is the product of the Dirac measure δ(λ − µ) and the commutator [C12 , Ψ1 (λ)Ψ2 (λ)] which vanishes. ˜ ∈ GR we have: Therefore, for any X
dµ ˜ 2 (µ)T −1 (µ){Ψ1 (λ± ), T2 (µ)} Tr2 X 2 Γ 2iπ
˜ ˜ −1 (λ) Ψ(λ) − Ψ(λ, x) R± (X)(λ) = R± ΨXΨ where the two signs ± give the same answer. Since the maps R± project on the subalgebras G± , see eq. (3.106), this reads:
dµ ˜ 2 (µ)T −1 (µ){Ψ1 (λ± , x), T2 (µ)} = δ ˜ Ψ(λ, x) Tr2 X 2 X Γ 2iπ where δX˜ Ψ(λ, x) is the infinitesimal form of the dressing transformation, eq. (3.103). Comparing with eq. (3.107), this proves that T (µ) is the non– Abelian Hamiltonian and shows that the action is Lie–Poisson. 3.11 Soliton solutions In general, a matrix Riemann–Hilbert problem like eq. (3.49) cannot be solved explicitly by analytical methods. This statement applies to the fundamental solution of the Riemann–Hilbert problem, i.e. the one satisfying the conditions det θ± = 0. However, once the fundamental solution
80
3 Synopsis of integrable systems
is known, new solutions “with zeroes” can easily be constructed from it. This can be used to produce new solutions to the equations of motion. Starting from a trivial vacuum solution, we obtain in this way the so-called soliton solutions. To define the Riemann–Hilbert problem with zeroes, we first introduce a definition. We say that a matrix function θ(λ) has a zero at the point λ0 if det θ(λ0) = 0 and if in the vicinity of this point

θ(λ) = F0 + (λ − λ0) F1 + O((λ − λ0)^2),    θ^{-1}(λ) = C0/(λ − λ0) + C1 + O(λ − λ0)

Since θ(λ)θ^{-1}(λ) = θ^{-1}(λ)θ(λ) = Id, we have F0 C0 = C0 F0 = 0 and C0 F1 + C1 F0 = Id. In particular

Ker F0 = Im C0,    Ker C0 = Im F0
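These relations are easy to check on a toy example. The following sketch uses a hypothetical 2 × 2 matrix function (not taken from the text) with a zero at λ0, and verifies the expansions and the algebraic relations numerically:

```python
import numpy as np

# Hypothetical example: theta(lam) = [[lam - lam0, 1], [0, 1]], det theta(lam0) = 0.
lam0 = 0.7

F0 = np.array([[0.0, 1.0], [0.0, 1.0]])   # theta(lam0)
F1 = np.array([[1.0, 0.0], [0.0, 0.0]])   # next Taylor coefficient
C0 = np.array([[1.0, -1.0], [0.0, 0.0]])  # residue of theta^{-1} at lam0
C1 = np.array([[0.0, 0.0], [0.0, 1.0]])   # constant term of theta^{-1}

# Check the expansions against theta and its inverse at a nearby point.
lam = lam0 + 1e-6
theta = np.array([[lam - lam0, 1.0], [0.0, 1.0]])
assert np.allclose(theta, F0 + (lam - lam0) * F1)
assert np.allclose(np.linalg.inv(theta), C0 / (lam - lam0) + C1)

# Relations forced by theta . theta^{-1} = theta^{-1} . theta = Id:
assert np.allclose(F0 @ C0, 0) and np.allclose(C0 @ F0, 0)
assert np.allclose(C0 @ F1 + C1 @ F0, np.eye(2))
print("zero-structure relations verified")
```

Here Ker F0 is spanned by (1, 0), which is also the image of C0, in agreement with the relations above.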
Let now Γ be a closed contour in the λ-plane. As in the previous section Γ could be a sum of small contours Γ(k) around each point λk, but here we can also consider more general contours provided no point λk sits on them. Let g(λ) be a matrix defined on Γ, and consider the Riemann–Hilbert problem

g(λ) = θ−^{-1}(λ) θ+(λ)    (3.108)

where θ+(λ) is analytic inside Γ and has N zeroes located at the points µ1, ..., µN, and θ−^{-1}(λ) is analytic outside Γ and has N zeroes at the points λ1, ..., λN. We emphasize that it is θ−^{-1}(λ) which has zeroes, and not θ−(λ). Let us fix the two sets of subspaces:

Vn = Im θ−^{-1}(λ)|λ=λn,    Wn = Ker θ+(λ)|λ=µn
Then we have:

Proposition. The choice of the subspaces Vn, Wn specifies uniquely the solution of the factorization problem eq. (3.108), up to a left multiplication by a constant matrix. This factorization problem is called a Riemann–Hilbert problem with zeroes.

Proof. Suppose θ± and θ̃± are two solutions of the Riemann–Hilbert problem. Then the function χ = θ̃+ θ+^{-1} = θ̃− θ−^{-1} is a meromorphic function in the whole complex plane. Its possible poles are located at λn, µn. But around such a pole, let us say µn, we have

θ+(λ) = Fn + O(λ − µn),    θ+^{-1}(λ) = Cn/(λ − µn) + O(1)
θ̃+(λ) = F̃n + O(λ − µn),    θ̃+^{-1}(λ) = C̃n/(λ − µn) + O(1)
Since Ker Fn = Wn = Im Cn and Ker F̃n = Wn = Im C̃n, we see that around λ = µn, χ = F̃n Cn/(λ − µn) + O(1) is regular because F̃n Cn = 0. The same analysis holds true at λn, and so χ(λ) is regular everywhere, hence a constant.

There is a simple way to construct the solution of the Riemann–Hilbert problem with zeroes from its fundamental solution. We begin by adding a pair of zeroes at (µN, λN).

Proposition. Let θ+ and θ−^{-1} be the solution of the Riemann–Hilbert problem eq. (3.108) with zeroes at µN and λN respectively, such that WN = Ker θ+(µN) and VN = Im θ−^{-1}(λN) are fixed. Let θ̃± be a solution of the Riemann–Hilbert problem without zeroes at these points. Then

θ+(λ) = χ0 ( 1 − (µN − λN)/(λ − λN) PN ) θ̃+(λ)
θ−^{-1}(λ) = θ̃−^{-1}(λ) ( 1 − (λN − µN)/(λ − µN) PN ) χ0^{-1}    (3.109)

where χ0 is a constant matrix and PN is a projector such that

Im PN = θ̃+(µN) WN,    Ker PN = θ̃−(λN) VN
Proof. Introduce the matrix χ(λ) = θ̃+ θ+^{-1} = θ̃− θ−^{-1} as above. It is a meromorphic function in the λ-plane with possible simple poles at λN and µN, so we can parametrize χ as

χ = χ0 + χ1/(λ − µN),    χ^{-1} = χ2 + χ3/(λ − λN)

From the condition χ(λ)χ^{-1}(λ) = 1, we find χ2 = χ0^{-1} and the two relations

χ1 χ0^{-1} + (1/(µN − λN)) χ1 χ3 = 0,    χ0 χ3 + (1/(λN − µN)) χ1 χ3 = 0

Adding these two equations gives χ0 χ3 = −χ1 χ0^{-1}. Let us define

PN = χ0 χ3/(λN − µN) = −χ1 χ0^{-1}/(λN − µN)

This matrix is a projector, PN^2 = PN, because χ1 χ3 = (λN − µN) χ1 χ0^{-1}. We can rewrite χ(λ) in terms of PN:

χ(λ) = ( 1 − (λN − µN)/(λ − µN) PN ) χ0 ;    χ^{-1}(λ) = χ0^{-1} ( 1 − (µN − λN)/(λ − λN) PN )
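The key algebraic fact is that, for any projector PN, the two rational factors above are inverse to each other. A small numerical sketch (with arbitrary, hypothetical values of λN, µN, a rank-one projector and an invertible constant χ0) confirms this:

```python
import numpy as np

lamN, muN = 1.5, -0.8                       # hypothetical positions of the zero pair
v, n = np.array([1.0, 0.5]), np.array([0.3, -1.2])
P = np.outer(v, n) / (v @ n)                # generic rank-one projector
chi0 = np.array([[1.0, 0.3], [-0.2, 1.1]]) # arbitrary invertible constant matrix

def chi(lam):
    return (np.eye(2) - (lamN - muN) / (lam - muN) * P) @ chi0

def chi_inv(lam):
    return np.linalg.inv(chi0) @ (np.eye(2) - (muN - lamN) / (lam - lamN) * P)

assert np.allclose(P @ P, P)                # P is a projector
for lam in [0.3, 2.2, -3.1]:                # sample points away from lamN, muN
    assert np.allclose(chi(lam) @ chi_inv(lam), np.eye(2))
    assert np.allclose(chi_inv(lam) @ chi(lam), np.eye(2))
print("chi(lam) chi^{-1}(lam) = Id verified")
```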
from which eqs. (3.109) follow. Next we demand WN = Ker θ+(µN) = Ker (1 − PN)θ̃+(µN), so that (1 − PN)θ̃+(µN)WN = 0, or Ker (1 − PN) = Im PN = θ̃+(µN)WN. Similarly, VN = Im θ−^{-1}(λN) = Im θ̃−^{-1}(λN)(1 − PN), or, what is the same, VN = θ̃−^{-1}(λN) Ker PN, so that Ker PN = θ̃−(λN)VN.

Repeated application of this result allows one to build a solution of the Riemann–Hilbert problem with zeroes at µ1, ..., µN, λ1, ..., λN:

θ+(λ) = χN ( 1 − (µN − λN)/(λ − λN) PN ) ... χ1 ( 1 − (µ1 − λ1)/(λ − λ1) P1 ) θ̃+(λ)
θ−^{-1}(λ) = θ̃−^{-1}(λ) ( 1 − (λ1 − µ1)/(λ − µ1) P1 ) χ1^{-1} ... ( 1 − (λN − µN)/(λ − µN) PN ) χN^{-1}

Here θ̃±(λ) refers to the fundamental solution of the Riemann–Hilbert problem.

We now extend the method of dressing transformations to the case of a Riemann–Hilbert problem with zeroes.

Proposition. Given a Lax connection satisfying the zero curvature condition and the associated wave function Ψ(λ, x, t), and given vector spaces Vn(0), Wn(0), we define

Vn(x, t) = Ψ(λn, x, t) Vn(0),    Wn(x, t) = Ψ(µn, x, t) Wn(0)    (3.110)
We use Vn(x, t), Wn(x, t) to define a Riemann–Hilbert problem with zeroes at λ1, λ2, ... and µ1, µ2, .... Then for any g(λ) = g−^{-1}(λ) g+(λ) on Γ, the transformation Ψ → Ψ^g,

Ψ^g = θ± Ψ g±^{-1},    θ−^{-1} θ+ = Ψ g Ψ^{-1}

is a dressing transformation, i.e. it preserves the analytic structure of the Lax connection.

Proof. We start with the linear system

(∂x − U(λ, x, t))Ψ = 0,    (∂t − V(λ, x, t))Ψ = 0

and dress it with a solution with zeroes of the Riemann–Hilbert problem, according to eqs. (3.99, 3.100):

U^g = θ± · U · θ±^{-1} + ∂x θ± · θ±^{-1},    V^g = θ± · V · θ±^{-1} + ∂t θ± · θ±^{-1}

In general, the components of the dressed connection will have simple poles at the points µn, λn. We must require that the residues of these
poles vanish. At λ = µn we have

θ+(λ) = Fn + O(λ − µn),    θ+^{-1}(λ) = Cn/(λ − µn) + O(1)

Requiring that the residue at λ = µn vanishes yields

Fn (∂x − U|λ=µn) Cn = 0,    Fn (∂t − V|λ=µn) Cn = 0

This means that the space Wn = Ker Fn = Im Cn should be invariant under the action of the operators ∂x − U|λ=µn and ∂t − V|λ=µn. Similarly, at λ = λn we have

θ−^{-1}(λ) = Fn + O(λ − λn),    θ−(λ) = Cn/(λ − λn) + O(1)

and setting the residues to zero gives

Cn (∂x − U|λ=λn) Fn = 0,    Cn (∂t − V|λ=λn) Fn = 0

This means that the space Vn = Im Fn = Ker Cn should be invariant under the action of the operators ∂x − U|λ=λn and ∂t − V|λ=λn. The simplest solution is to choose the spaces Vn(x, t), Wn(x, t) as in eq. (3.110).
The interest of this procedure is that it yields non-trivial results even if the Riemann–Hilbert problem is trivial, i.e. g(λ) = Id. Then its fundamental solution is also trivial, θ̃±(λ) = Id, and the solutions with zeroes are constructed by purely algebraic means. The resulting θ±(λ) are rational functions of λ. To make this method effective, we need a simple solution of the zero curvature condition ∂t U − ∂x V − [V, U] = 0 to start with. Simple solutions can be found in the form

U = A(λ, x),  V = B(λ, t),  [A, B] = 0

Then Ψ = exp( ∫_0^x A dx′ + ∫_0^t B dt′ ). The solutions obtained by dressing this simple type of solution by the trivial Riemann–Hilbert problem with zeroes are called soliton solutions.

Example. Let us illustrate this construction on the non-linear sigma model. There, one has (see eq. (3.83))

U = Jx/(λ − 1),    V = −Jt/(λ + 1)
We consider the case of 2 × 2 matrices. The non-linear sigma model field is related to Jx, Jt by Jx = g^{-1}∂x g, Jt = g^{-1}∂t g. The matrix g can be easily related to the solution Ψ of the linear system: g = Ψ^{-1}(λ = 0, x, t). A simple solution of the equations of motion is

Jx = a σ3,  Jt = b σ3,  Ψ0 = exp[ ( ax/(λ−1) − bt/(λ+1) ) σ3 ],  g0 = e^{−(ax+bt) σ3}

We want to dress this solution by solving a Riemann–Hilbert problem with zeroes at λ1 and µ1. According to the general formulae, we have

θ+(λ) = χ0 ( 1 − (µ1 − λ1)/(λ − λ1) P ),    θ−^{-1}(λ) = ( 1 − (λ1 − µ1)/(λ − µ1) P ) χ0^{-1}

with χ0 a constant matrix and P a projector. It can be parametrized by two vectors w and n:

Pij = wi nj / (w · n)

The spaces V1 and W1 are defined by

V1 = Im θ−^{-1}(λ)|λ=λ1 = Im (1 − P) χ0^{-1} = Ker P
W1 = Ker θ+(λ)|λ=µ1 = Ker χ0 (1 − P) = Im P

Hence W1 is spanned by the vector w, and V1 is spanned by the vector n⊥ perpendicular to n. The invariance properties eq. (3.110) are ensured if we set

w(x, t) = Ψ(x, t, µ1) w(0),    n⊥(x, t) = Ψ(x, t, λ1) n⊥(0)
In components, this reads

w1(x, t) = e^{ ax/(µ1−1) − bt/(µ1+1) } w1(0),    w2(x, t) = e^{ −ax/(µ1−1) + bt/(µ1+1) } w2(0)

and

n1(x, t) = e^{ −ax/(λ1−1) + bt/(λ1+1) } n1(0),    n2(x, t) = e^{ ax/(λ1−1) − bt/(λ1+1) } n2(0)

These formulae completely determine the projector P(x, t). The dressed field is then reconstructed by

g(x, t) = Ψ^{-1}|λ=0 = Ψ0^{-1}|λ=0 θ−^{-1}|λ=0 = g0(x, t) ( 1 + (λ1 − µ1)/µ1 P(x, t) ) χ0^{-1}

The generalization to the N-soliton case is clear.
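The one-soliton construction can be checked numerically. The sketch below uses hypothetical values for a, b, λ1, µ1 and the initial vectors (with χ0 = Id); it builds the vacuum wave function, propagates w and n⊥, forms the projector P(x, t), and verifies both that P is a projector and that θ−^{-1}(λ)θ+(λ) = Id, as required for the trivial Riemann–Hilbert problem:

```python
import numpy as np

a_c, b_c = 0.6, 0.3             # constants a, b of the vacuum solution (assumed values)
lam1, mu1 = 2.0, -1.7           # zeroes of the dressing factors (assumed values)
w0 = np.array([1.0, 0.8])       # w(0)
nperp0 = np.array([0.4, -1.1])  # n_perp(0)

def psi0(lam, x, t):
    """Vacuum wave function Psi_0 = exp[(a x/(lam-1) - b t/(lam+1)) sigma_3]."""
    phi = a_c * x / (lam - 1.0) - b_c * t / (lam + 1.0)
    return np.diag([np.exp(phi), np.exp(-phi)])

def projector(x, t):
    w = psi0(mu1, x, t) @ w0             # spans Im P = W_1
    nperp = psi0(lam1, x, t) @ nperp0    # spans Ker P = V_1
    n = np.array([nperp[1], -nperp[0]])  # n is perpendicular to n_perp
    return np.outer(w, n) / (w @ n)

x, t = 0.5, 1.2
P = projector(x, t)
assert np.allclose(P @ P, P)

# For the trivial RH problem the product theta_-^{-1} theta_+ must be Id.
for lam in [0.1, 3.3, -0.6]:
    f_minus = np.eye(2) - (lam1 - mu1) / (lam - mu1) * P
    f_plus = np.eye(2) - (mu1 - lam1) / (lam - lam1) * P
    assert np.allclose(f_minus @ f_plus, np.eye(2))
print("one-soliton projector and factorization checks passed")
```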
References

[1] I.M. Gelfand and L.A. Dickey, Fractional powers of operators and Hamiltonian systems. Funct. Anal. Appl. 10 (1976) 259.
[2] V.E. Zakharov and A.B. Shabat, Integration of non-linear equations of mathematical physics by the method of inverse scattering. II. Funct. Anal. Appl. 13 (1979) 166.
[3] M. Adler, On a trace functional for formal pseudodifferential operators and the symplectic structure of the Korteweg–de Vries type equations. Inv. Math. 50 (1979) 219.
[4] M. Jimbo and T. Miwa, Monodromy preserving deformation of linear ordinary differential equations with rational coefficients. III. Physica D 4 (1981) 26–46.
[5] M. Semenov-Tian-Shansky, What is a classical r-matrix? Funct. Anal. Appl. 17 (1983) 17.
[6] L.D. Faddeev, Integrable Models in 1 + 1 Dimensional Quantum Field Theory. Les Houches Lectures 1982. Elsevier Science Publishers (1984).
[7] M. Semenov-Tian-Shansky, Dressing transformations and Poisson group actions. Publ. RIMS 21 (1985) 1237.
[8] L.D. Faddeev and L.A. Takhtajan, Hamiltonian Methods in the Theory of Solitons. Springer (1986).
[9] L.A. Dickey, Soliton Equations and Hamiltonian Systems. World Scientific (1991).
[10] L.A. Dickey, On the τ-function of matrix hierarchies of integrable equations. J. Math. Phys. 32 (1991) 2996–3002.
4 Algebraic methods
We abstract the group-theoretical setting of integrable systems. In this framework, the Lax matrix is viewed as a coadjoint orbit of a Lie algebra G. When the r-matrix is non-dynamical the Jacobi identity simplifies and one can use it to define a second Lie algebra structure on G. Hence G has the structure of a Lie bi-algebra and, conversely, such a structure defines an r-matrix. One can then build dynamical systems admitting Lax representations and conserved quantities in involution. Furthermore, the solution of the equations of motion is reduced to a factorization problem in Lie group theory. We illustrate these constructions, in the case of finite-dimensional Lie groups, with the open Toda chain model, which we solve completely by algebraic methods. Finally, we demonstrate the versatility of the algebraic setting by constructing the Lax pair with spectral parameter for the Kowalevski top.

4.1 The classical and modified Yang–Baxter equations

In Chapter 2, we have computed the Poisson brackets of the entries Lij of the Lax matrix L = Σ_{ij} Lij Eij, and have shown that they can be expressed with an r-matrix (see section (2.5) in Chapter 2):
{L1 , L2 } = [r12 , L1 ] − [r21 , L2 ]
(4.1)
with r12 = Σ_{ij,kl} rij,kl Eij ⊗ Ekl, where Eij is the canonical basis of gl(N). So r12 can be viewed as an element of gl(N) ⊗ gl(N). We can generalize this setup immediately by considering a Lie algebra G equipped with a non-degenerate invariant scalar product ( , ), also denoted by Tr ( ). We will use a basis {Ta} of G, and denote the matrix of scalar products by gab = (Ta, Tb) = Tr (Ta Tb) and its inverse by g^{ab}.
The proper interpretation of the Lax matrix L is as an element of G*, i.e. a linear form on G. It can also be viewed as an element of G, since the invariant scalar product allows us to identify G and its dual G*:

X ∈ G −→ L(X) ≡ (L, X)

With L viewed as an element of G, eq. (4.1) shows that r12 is an element of G ⊗ G. The nice structural aspects, however, appear when one interprets the r-matrix as a linear map R : G −→ G. If r12 = Σ_{ab} r^{ab} Ta ⊗ Tb, then

R(X) = Σ_{ab} r^{ab} Ta (Tb, X) = Tr2 (r12 X2)    (4.2)
The r-matrix relation, eq. (4.1), can be presented in a dual form: {L(X), L(Y )} = L([X, Y ]R )
(4.3)
with the R-bracket, [ , ]R , defined as [X, Y ]R = [R(X), Y ] + [X, R(Y )]
(4.4)
To prove these formulae, take the invariant scalar product of both sides of eq. (4.1) by X ⊗ Y . On the left-hand side we get {L(X), L(Y )}, while on the right-hand side we get ([R(Y ), L], X) − ([R(X), L], Y ), which is equal to (L, [X, Y ]R ) by invariance of the scalar product. In this dual formalism, the Jacobi identity, eq. (2.12) in Chapter 2, becomes an equation on R: L( [X, J(Y, Z)] + cyc.perm.) = 0
(4.5)
where cyc.perm. means cyclic permutation of (X, Y, Z) and the quantity J(Y, Z) is defined as

J(Y, Z) = {L(Y), R(Z)} − {L(Z), R(Y)} + [R(Y), R(Z)] − R([Y, Z]R)

If the R matrix is a constant on phase space (independent of the dynamical variables), the Jacobi identity becomes:

L( [X, [R(Y), R(Z)] − R([Y, Z]R)] + cyc.perm.) = 0

A particular way to fulfil this equation is to set

[R(X), R(Y)] − R( [X, R(Y)] + [R(X), Y] ) = −(1/4) [X, Y]
(4.6)
so that it reduces to the Jacobi identity in G. Equation (4.6) is called the modified Yang–Baxter equation and will be extensively studied below. The factor 1/4 is conventional and can be changed by a rescaling of R.

Example. The simplest example of an equation of the type eq. (4.3) is provided by the case in which the map R is proportional to the identity map. It corresponds to the Kostant–Kirillov bracket on G*, which is defined by {L(X), L(Y)}K = L([X, Y]) for any X, Y in G. Comparing with eq. (4.3) we see that this bracket corresponds to RK = (1/2) Id. The modified Yang–Baxter equation (4.6) is satisfied with this value of RK. Under dualization, the Poisson bracket can be written in the form (4.1) with r12 = (1/2) C12, where C12 is the tensor Casimir of G,

C12 = Σ_{ab} g^{ab} Ta ⊗ Tb    (4.7)

Recall that the tensor Casimir has the two main properties:

[C12, X1 + X2] = 0,    X1 = Tr2 (C12 X2),    X ∈ G
where we used the tensorial notations of section (2.5) in Chapter 2. Before studying the modified Yang–Baxter equation, we would like to explain its relation with another important equation appearing in this domain: the classical Yang–Baxter equation. For any r in G ⊗ G it reads: [r12 , r13 ] + [r12 , r23 ] + [r13 , r23 ] = 0
(4.8)
This equation is important because it is the classical limit of the quantum Yang–Baxter equation, which is one of the main tools in the study of many quantum integrable models. In dualized form eq. (4.8) reads

[R(X), R(Y)] − R( [X, R(Y)] − [^tR(X), Y] ) = 0
(4.9)
where we defined ^tR(X) = Tr2 (r21 X2). Notice the subtle difference between the left-hand sides of eq. (4.6) and eq. (4.9). The two expressions agree if ^tR = −R, i.e. if the r-matrix is antisymmetric: r12 = −r21. The relation between the solutions of the modified Yang–Baxter equation and of the classical Yang–Baxter equation is as follows:

Proposition. Let R be an antisymmetric solution of the modified Yang–Baxter equation, then R± = R ± (1/2) Id both satisfy the classical Yang–Baxter equation.
Proof. We compute

R±([X, Y]R) = R([X, Y]R) ± (1/2)[X, Y]R = [R±(X), R±(Y)]    (4.10)

The statement follows from the observation that [X, Y]R = [R∓(X), Y] + [X, R±(Y)] and the fact that R∓ = −^tR± when R is antisymmetric. Viewed as an element of G ⊗ G, R± corresponds to r12^± with

r12^± = r12 ± (1/2) C12
where r12 is the dualized form of the antisymmetric solution of the modified Yang–Baxter equation, and C12 is the tensor Casimir element in G ⊗ G. 4.2 Algebraic meaning of the classical Yang–Baxter equations We now undertake an abstract study of the bracket eq. (4.3), for a generic linear mapping R : G −→ G solution of the modified Yang–Baxter equation. This will naturally lead us to introduce a factorization problem in the Lie group associated with G. Proposition. Let R be a solution of the modified Yang–Baxter equation (4.6), then the antisymmetric bracket [ , ]R satisfies the Jacobi identity. It thus defines a second Lie algebra structure on G. Proof. We have to prove that the bracket [ , ]R satisfies the Jacobi identity: [X, [Y, Z]R ]R + [Z, [X, Y ]R ]R + [Y, [Z, X]R ]R = 0 Expanding the external R-brackets, we get: [R(X), [Y, Z]R ] + [R(Z), [X, Y ]R ] + [R(Y ), [Z, X]R ] + [X, R([Y, Z]R )] + [Z, R([X, Y ]R )] + [Y, R([Z, X]R )] = 0 Developing the terms in the first line and using the Jacobi identity on the Lie bracket [ , ], we can rewrite them as −[X, [R(Y ), R(Z)]] − [Y, [R(Z), R(X)]] − [Z, [R(X), R(Y )]] They combine with the terms in the second line to yield: [X, [R(Y ), R(Z)] − R([Y, Z]R )] + cyclic permutations = 0
This is satisfied if R is a solution of the modified Yang–Baxter equation because the original Lie bracket [ , ] obeys the Jacobi identity.

We are then in a very special situation where the vector space G is equipped with two Lie algebra structures defined by the two brackets [ , ] and [ , ]R. This is called a Lie bi-algebra. We will denote by GR the Lie algebra with underlying vector space G but with Lie bracket [ , ]R. The relation between these two Lie algebra structures is described by the following:

Proposition. Let us define R± = R ± (1/2) Id, K± = Ker (R∓) and G± = Im (R±), then
(i) R± : GR −→ G are Lie algebra homomorphisms,
(ii) G± ⊂ G are Lie subalgebras of G,
(iii) K± ⊂ GR are ideals of GR and G± ≃ GR/K∓.

Proof. This proposition is a straightforward consequence of the modified Yang–Baxter equation written as in eq. (4.10). Given the maps R+ and R−, we thus construct two subalgebras G± ⊂ G. Since R+ − R− = Id, any element X ∈ G can be written, perhaps not uniquely, as X = X+ − X−
with X± = R± (X) ∈ G± = Im (R± )
(4.12)
Note that the bracket [ , ]R takes the simple form: [X, Y ]R = [X+ , Y+ ] − [X− , Y− ]
(4.13)
We define the Lie algebra G+ ⊕ G− as the Cartesian product of G+ and G− in which [G+ , G− ] = 0. We can embed GR into G+ ⊕G− by the map X → (R+ (X), R− (X)). From eq. (4.13) we see that GR is a subalgebra of G+ ⊕ G− through this embedding. The question then arises to determine the image G˜R of GR in G+ ⊕G− . This is the object of the next two propositions. Proposition. (i) K± ⊂ G± are ideals in G± , hence G± /K± are Lie algebras. (ii) The mapping ν : G+ /K+ −→ G− /K− defined by ν : R+ X −→ R− X is a Lie algebra isomorphism. Proof. To prove the first part of the proposition, we remark that K± ⊂ G± since, for X ∈ K± , we have by definition RX = ± 12 X, so that X = ±R± X. For the same reason, on K± we have [ , ]R = ±[ , ] and K± are indeed subalgebras in G± . To prove that they are ideals, we consider X ∈ K± , Y ∈ G± . We can write Y = R± Z, then [X, Y ] = [X, R± Z] = [X, RZ] ± 12 [X, Z] = [X, RZ] + [RX, Z] = [X, Z]R
but [X, Z]R ∈ K± since K± are ideals in GR. To prove the second part, let us denote by X± the equivalence class X± = R± X [mod K±]. First, ν : G+/K+ −→ G−/K− is well-defined, since an element of K+ is mapped to 0. The mapping ν is surjective because it is induced by the surjective mapping G+ → G− given by R+X → R−X. It is injective because if R−X ∈ K− one has R+(X) = 0 by definition of K−. Finally we prove that ν is a Lie algebra isomorphism, i.e. [ν(x), ν(y)] = ν([x, y]) for any x, y of the form x = R+X, y = R+Y. Recalling that [R±X, R±Y] = R±[X, Y]R and the definition of the Lie algebra bracket on G±/K±, we have:

ν([R+X, R+Y]) = ν(R+[X, Y]R) = R−[X, Y]R = [R−X, R−Y] = [ν(R+X), ν(R+Y)]
Finally, we have the following important result:

Proposition. Consider the two maps

GR −→ G+ ⊕ G− : X → (R+(X), R−(X))    and    G+ ⊕ G− −→ G : (X+, X−) → X+ − X−

The first map is an injective Lie algebra homomorphism. Let G̃R be its image. The second map, when restricted to G̃R, is bijective. Finally, G̃R is characterized as the set of couples (R+X, R−Y) such that ν(R+X) = R−Y.

Proof. The first map is injective and the second one is surjective because R+ − R− = Id. The elements of G̃R are of the form (R+X, R−X), hence satisfy the condition ν(R+X) = R−X. Conversely, let us start from a pair (R+X, R−Y) such that ν(R+X) = R−Y. This means R−Y = R−X, hence there exists K− ∈ K− such that R−(X − Y) = K− = −R−K−. It follows that X − Y + K− = K+ belongs to Ker R− = K+. Then we have R+X = R+X̃, R−Y = R−X̃ with X̃ = X + K− = Y + K+, so that the point (R+X, R−Y) belongs to G̃R.

This proposition shows that we can decompose uniquely any element X ∈ G as X = X+ − X− with X± ∈ G± and ν(X+) = X−. This will be the basis for the factorization problem below. Another important consequence of these results is that the algebraic structure we have exhibited is equivalent to the existence of an r-matrix.
This is the starting point of the classification theorems of the solutions of the Yang–Baxter equation by Belavin–Drinfeld and Semenov-Tian-Shansky. In the next sections we content ourselves with giving simple examples of this algebraic setting. In the Adler–Kostant–Symes scheme ν is trivial, and in the open Toda chain ν is non-trivial.

4.3 Adler–Kostant–Symes scheme

A class of solutions of the modified Yang–Baxter equation is produced by the Adler–Kostant–Symes scheme. Let G be a Lie algebra and assume that we have two Lie subalgebras A and B such that, as a vector space, G is the direct sum

G = A + B

Note that A and B are Lie subalgebras, but they are not assumed to commute ([A, B] ≠ 0). We denote by PA the projection on A along B and PB = 1 − PA.

Proposition. The linear map R = (1/2)(PA − PB) satisfies the modified Yang–Baxter equation.

Proof. By definition we have [X, Y]R = [PA X, PA Y] − [PB X, PB Y], where we used PA + PB = 1, thus:

R([X, Y]R) = (1/2) ( [PA X, PA Y] + [PB X, PB Y] )

Since

[RX, RY] = (1/4) ( [PA X, PA Y] − [PA X, PB Y] − [PB X, PA Y] + [PB X, PB Y] )

we obtain

[RX, RY] − R([X, Y]R) = −(1/4) [X, Y]

This is the modified Yang–Baxter equation.

This construction is a particular example of the general discussion. With the notations of the previous section, R+ = R + 1/2 = PA and R− = R − 1/2 = −PB, and G+ = K+ = A, G− = K− = B. Thus G±/K± = {0}, so we have the decomposition of GR as a direct sum of Lie algebras:

GR = A ⊕ B

In the Lie algebra GR, A and B commute. It is then clear that any element X ∈ G can be decomposed as X = X+ − X− with X+ = PA X and X− = −PB X.
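For G = gl(3) with A the upper-triangular matrices and B the strictly lower-triangular ones (a standard choice matching the scheme, used here as an illustrative assumption), the proposition can be verified numerically:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 3

def PA(X):  # projection on upper-triangular matrices (subalgebra A)
    return np.triu(X)

def PB(X):  # projection on strictly lower-triangular matrices (subalgebra B)
    return np.tril(X, -1)

def R(X):
    return 0.5 * (PA(X) - PB(X))

def com(X, Y):
    return X @ Y - Y @ X

# Check the modified Yang-Baxter equation on random matrices.
for _ in range(5):
    X, Y = rng.normal(size=(N, N)), rng.normal(size=(N, N))
    lhs = com(R(X), R(Y)) - R(com(X, R(Y)) + com(R(X), Y))
    assert np.allclose(lhs, -0.25 * com(X, Y))
print("modified Yang-Baxter equation holds for R = (PA - PB)/2")
```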
Example. Let G̃ be the loop algebra G̃ = G ⊗ C[[λ, λ^{-1}]] with G a finite-dimensional simple Lie algebra. Elements of G̃ are linear combinations of elements X ⊗ λ^n with X ∈ G and n ∈ Z. The commutation relations in G̃ are:

[X ⊗ λ^n, Y ⊗ λ^m] = [X, Y] ⊗ λ^{n+m}
(4.14)
The two subalgebras A = G+, B = G− are spanned by elements of the form X ⊗ λ^n with n ≥ 0 and n < 0 respectively. Clearly [A, B] ≠ 0 for the Lie algebra structure of G̃. For any X ∈ G and any formal power series f(λ) = Σ_{n∈Z} fn λ^n, the maps R± are defined by:

R+(X ⊗ f(λ)) = X ⊗ f+(λ),    R−(X ⊗ f(λ)) = −X ⊗ f−(λ)    (4.15)

where f+(λ) = Σ_{n≥0} fn λ^n and f−(λ) = Σ_{n<0} fn λ^n denote the regular and singular parts of f(λ) around λ = 0 respectively. Since by definition f(λ) = f+(λ) + f−(λ), we have R+ − R− = 1. Elements of the subalgebras G± are of the form X ⊗ f±(λ) with f±(λ) regular (resp. singular) at the origin. We have K± = Ker R∓ = G±. Thus G±/K± = {0} and the map ν is trivial. The decomposition eq. (4.12) for X ⊗ f(λ) consists of writing X ⊗ f(λ) as the sum of two functions, one analytic around the origin and the other analytic around infinity. Introducing the scalar product in G̃:

(X(λ), Y(λ)) = Res Tr (X(λ)Y(λ)) = ∮ dλ/(2iπ) Tr (X(λ)Y(λ))    (4.16)

where Tr ( ) is the invariant bilinear form on G, we can write the maps R± in a more operatorial way: for any X(λ) in G̃

(R± X)(λ) = ∮_{Γ±} dµ/(2iπ) Tr2 (C12 X2(µ)) / (µ − λ)

where C12 is the tensor Casimir for G given in eq. (4.7). The integration contours Γ± are, for R+, a path enclosing λ and µ = 0, and for R−, a path enclosing µ = 0 but not λ. These formulae are of the form eq. (4.2) with r12^± ∈ G̃ ⊗ G̃ given by r12^+ = −C12/(λ − µ) = C12 Σ_{n=0}^∞ λ^n/µ^{n+1}, where we expand in powers of λ/µ because |λ| < |µ|. Similarly r12^− = −C12/(λ − µ) = −C12 Σ_{n=0}^∞ µ^n/λ^{n+1}. Finally we get r12 = (r12^+ + r12^−)/2 in the form:

r12(λ, µ) = −C12/(λ − µ)

This is the r-matrix that we met in Chapter 3.
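The splitting into regular and singular parts is straightforward to implement. A minimal sketch, representing f(λ) by a dictionary of its Laurent coefficients (the Lie algebra factor X is a spectator and is omitted), verifies R+ − R− = Id:

```python
# Represent an element X (x) f(lambda) of the loop algebra by a dict {n: f_n}
# of Laurent coefficients; the Lie-algebra factor X plays no role here.

def R_plus(f):
    return {n: c for n, c in f.items() if n >= 0}   # keep the regular part f_+

def R_minus(f):
    return {n: -c for n, c in f.items() if n < 0}   # minus the singular part f_-

f = {-2: 1.0, -1: 3.0, 0: -2.0, 1: 0.5, 4: 7.0}
diff = {n: R_plus(f).get(n, 0.0) - R_minus(f).get(n, 0.0) for n in f}
assert diff == f          # R_+ - R_- = Id on the loop algebra
print("R_+ - R_- = Id verified on Laurent coefficients")
```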
4.4 Construction of integrable systems
We now show that the setting of section (4.2) can be used to construct integrable systems. Starting from a Lie algebra structure on G one can construct a Poisson bracket on its dual G ∗ . Better, starting from a Lie bialgebra, we have two Lie algebra structures on G, namely [ , ] and [ , ]R , and one obtains two Poisson brackets { , } and { , }R on G ∗ . As we know, the functions on G ∗ , invariant under the action [ , ], are in the centre of the Poisson structure { , } and cannot provide useful Hamiltonians. However, they will not be in the centre of the second Poisson structure { , }R and we will show that they then provide commuting Hamiltonians. Moreover, the equations of motion take the Lax form. We start from an r-matrix solution of the modified Yang–Baxter equation. On G, we have two Lie algebra structures, one associated with the original Lie bracket [ , ] and one corresponding to the bracket [ , ]R . We denote by GR the Lie algebra equipped with this second bracket. Similarly, we denote by G and GR the corresponding simply connected Lie groups. On G, we have two adjoint actions adX(Y ) = [X, Y ],
adR X(Y ) = [X, Y ]R = [RX, Y ] + [X, RY ]
and also two coadjoint actions of G on G ∗ : ad∗ X · L(Y ) = −L([X, Y ]),
ad∗R X · L(Y ) = −L([X, Y ]R )
One can express ad*R in terms of ad*:

ad*R X · L = ad* (RX) · L + R* ad* X · L
(4.17)
We then have two Poisson brackets on F(G ∗ ), { , } and { , }R , which are the Kostant–Kirillov brackets for the two Lie algebra structures, i.e. {f1 , f2 } (L) = L([df1 , df2 ]),
{f1 , f2 }R (L) = L([df1 , df2 ]R )
For linear functions f1 (L) = L(X) and f2 (L) = L(Y ) the second Poisson bracket becomes exactly eq. (4.3). Thus the Poisson bracket structure of the Lax matrix is given by the Kostant–Kirillov bracket for the algebra GR . In accordance, we view L as an element of a coadjoint orbit of GR . Theorem. (i) The Ad∗ -invariant functions on G ∗ are in involution with respect to both Poisson brackets.
(ii) Choosing an Ad*-invariant Hamiltonian H, the equation of motion on G* with respect to the Poisson bracket { , }R may be written in the two equivalent forms

dL/dt = −ad*R dH · L    (4.18)
dL/dt = ad* M · L,  with M = −R(dH)    (4.19)
Proof. Let us choose two ad*-invariant functions on G*, f1 and f2, i.e. ad* dfi · L = 0 (for i = 1, 2). As shown in Chapter 14, f1 and f2 are in the kernel of { , }, and a fortiori {f1, f2} = 0. On the other hand,

{f1, f2}R (L) = L([df1, df2]R) = L([Rdf1, df2] + [df1, Rdf2]) = ad* df2 · L(Rdf1) − ad* df1 · L(Rdf2) = 0

This proves the first part of the proposition. For a general function f on phase space, the equation of motion reads ḟ = {H, f}R, where H is the Hamiltonian, which is chosen to be an ad*-invariant function on G*. By definition of df we have ḟ = L̇(df). Thus:

L̇(df) = {H, f}R (L) = L([dH, df]R) = −ad*R dH · L(df)

Since this is true for any f we get eq. (4.18). We may also use eq. (4.17) to express ad*R in terms of ad*. We get ad*R dH · L = −ad* M · L with M = −RdH, where we have used the invariance of H, which implies ad* dH · L = 0. This proves eq. (4.19).

This theorem allows one to build an integrable system, provided we have enough Ad*-invariant functions, from the algebraic data G, GR. In the case where G is equipped with an invariant non-degenerate quadratic form ( , ), G and its dual G* may be identified, and ad* becomes ad. The condition for a function on G* to be ad*-invariant may then be written as [dH, L] = 0. Moreover, we see that the equation of motion of L, eq. (4.19), takes the form of a Lax equation:

dL/dt = [M, L],    M = −R(dH)    (4.20)
Remark 1. Notice that M is not uniquely defined, as we can add to it anything which commutes with L, in particular any polynomial in L. Remark 2. By construction the Lax matrix L lies on a coadjoint orbit of GR . However, the Lax form of the equation of motion shows that the evolution also takes place on an orbit of G. Hence the flow occurs on the intersection of orbits of G and GR .
4.5 Solving by factorization
We consider a Hamiltonian system with a Lax matrix L ∈ G*, with G* the dual of G. We assume that G is equipped with an invariant bilinear form, so that G* and G are identified. We also assume that the Poisson bracket of L is given as above by {L(X), L(Y)}R = L([X, Y]R), where R is a solution of the modified Yang–Baxter equation. As explained in the previous section, the Hamiltonians Hn in involution, {Hn, Hm}R = 0, are taken among the Ad*-invariant functions on G* ≃ G, for example Hn = Tr (L^n). Let us choose one of these functions as the Hamiltonian H. The equations of motion have a Lax representation, eq. (4.20). Since H is an invariant function, one has [L, dH] = 0 and we can write as well, using R± = R ± (1/2) Id:

∂L/∂t = [M+, L] = [M−, L],    M± = −R±(dH)    (4.21)

The fact that the equations of motion admit two Lax representations, with M± ∈ G± and (M+, M−) ∈ G̃R, is the key point of the following discussion.

Proposition. The solution of the equations of motion with Hamiltonian H is given, for t small enough, by:

L(t) = θ+(t) L0 θ+^{-1}(t) = θ−(t) L0 θ−^{-1}(t)

where θ+(t) and θ−(t) are the solutions of the factorization problem

exp(−t dH(L0)) = θ−^{-1}(t) θ+(t)    (4.22)

with θ± ∈ G±, ν(θ+) = θ−.

Proof. We want to solve

dL/dt = [M+, L],    dL/dt = [M−, L]

where M± are defined in eq. (4.21). Since M± ∈ G± we know that there are θ+(t) ∈ G+ and θ−(t) ∈ G− such that θ̇+ θ+^{-1} = M+ and θ̇− θ−^{-1} = M−, with initial value θ±(0) = 1. Then we have:

L(t) = θ+(t) L0 θ+^{-1}(t) = θ−(t) L0 θ−^{-1}(t)

Using M+ − M− = −dH we get:

dH(L) = −θ̇+ θ+^{-1} + θ̇− θ−^{-1} = −θ− (d/dt)(θ−^{-1} θ+) θ+^{-1}
From the invariance of H by coadjoint action on L, we see that dH(L) = dH(θ− · L0 · θ−^{-1}) = θ− · dH(L0) · θ−^{-1}, which implies:

(d/dt)(θ−^{-1} θ+) · (θ−^{-1} θ+)^{-1} = −dH(L0)

yielding θ−^{-1}(t) θ+(t) = exp(−t dH(L0)). We are led to the algebraic problem of decomposing the group element exp(−t dH(L0)), whose time dependence is explicit, as a product of elements in G±, the Lie groups corresponding to G±. Since there is a unique decomposition at the Lie algebra level, see section (4.2), the solution of this factorization problem exists and is unique, at least in the vicinity of the unit element. The solution of this algebraic problem yields the solution of the evolution equation, at least for small t.
Remark 1. The factorization problem in the Proposition is of the same form as in eq. (3.54) of Chapter 3. The reconstruction formula for the L matrix is also identical to eq. (3.55) and the proofs are parallel. Hence the present discussion is a more algebraic presentation of the material in Chapter 3.
Remark 2. Viewing L as a matrix, a set of Ad*-invariant functions is provided by the traces of the powers of L: Hn = Tr (L^n). In this case, we have, as in eq. (3.33) in Chapter 3:

dHn(L) = n Tr2 ( L2^{n−1} C12 )  and  M± = −n Tr2 ( L2^{n−1} r12^± )

with C12 the tensor Casimir.
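For G = gl(N) with the triangular splitting of the Adler–Kostant–Symes scheme, the factorization of exp(−t dH(L0)) into a unit lower-triangular and an upper-triangular factor is an LU decomposition, so the whole solution scheme can be sketched numerically. The following is an illustrative example (not from the text) with H = (1/2) Tr L², so dH(L0) = L0; it checks that the factorized L(t) satisfies the Lax equation L̇ = [L, PA(L)] and stays isospectral:

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by Taylor series (adequate for the small matrices here)."""
    out, term = np.eye(len(A)), np.eye(len(A))
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def lu_nopivot(M):
    """Doolittle LU without pivoting: M = L U, L unit lower triangular."""
    n = len(M)
    L, U = np.eye(n), M.astype(float).copy()
    for j in range(n - 1):
        for i in range(j + 1, n):
            L[i, j] = U[i, j] / U[j, j]
            U[i, :] = U[i, :] - L[i, j] * U[j, :]
    return L, U

# A tridiagonal "Toda-like" initial Lax matrix; H = Tr(L^2)/2, so dH(L0) = L0.
L0 = np.array([[0.3, 1.0, 0.0],
               [1.0, -0.1, 0.7],
               [0.0, 0.7, -0.2]])

def L_of_t(t):
    Lfac, U = lu_nopivot(expm(-t * L0))   # exp(-t dH(L0)) = theta_-^{-1} theta_+
    return U @ L0 @ np.linalg.inv(U)      # L(t) = theta_+ L0 theta_+^{-1}

t, eps = 0.4, 1e-4
Lt = L_of_t(t)
dL = (L_of_t(t + eps) - L_of_t(t - eps)) / (2 * eps)  # finite-difference dL/dt
PA = np.triu(Lt)                                      # R_+ = projection on upper triangular
assert np.allclose(dL, Lt @ PA - PA @ Lt, atol=1e-5)  # Lax equation holds
assert np.allclose(sorted(np.linalg.eigvals(Lt).real),
                   sorted(np.linalg.eigvals(L0).real))  # isospectral flow
print("factorization solves the Lax equation; spectrum conserved")
```

For small t no pivoting is needed since exp(−t L0) is close to the identity, which mirrors the "for t small enough" caveat of the proposition.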
4.6 The open Toda chain

In this section we introduce the Toda chains. They are integrable systems associated with Lie algebras. The open chains that we consider here are associated with finite-dimensional Lie algebras, and provide simple examples to illustrate the algebraic constructions presented above. In particular, the solution of the model will be achieved by solving the factorization problem. In contrast, the study of the closed Toda chain requires the consideration of loop algebras, and will be presented in Chapter 6.

We first review some notations for Lie algebras introduced in Chapter 16. Let G be a finite-dimensional simple Lie algebra of rank r, equipped with an invariant scalar product denoted by ( , ) or Tr ( ). We choose a Cartan subalgebra with an orthonormal basis {Hi} (not to be confused with the Hamiltonians!). We have the Cartan decomposition of G:

G = N− ⊕ H ⊕ N+,    with N± = ⊕_{β positive} G±β
98
4 Algebraic methods
where G±β are one-dimensional, generated by the root vectors E±β for any positive root β. We will need the commutation relations:
where Hα =
[Hi , Hj ] = 0 [H, E±α ] = ±α(H) E±α [Eα , E−α ] = (Eα , E−α ) Hα ,
i α(Hi )Hi .
We will often use the constants nα such that n2α (Eα , E−α ) = 1
Toda chains are associated with any simple Lie algebra G as follows. Consider two elements in the Cartan subalgebra:

q = Σ_{i=1}^{r} q_i H_i ,   p = Σ_{i=1}^{r} p_i H_i ,   r = rank G

The coefficients (q_i, p_i), i = 1, ..., r are the coordinates of a (2r)-dimensional phase space with Poisson brackets

{p_i, q_j} = (1/2) δ_{ij}   (4.23)

where the factor 1/2 is introduced to simplify later formulae. The Hamiltonian of the open Toda chain is by definition:

H = (p, p) + 2 Σ_{α simple} exp(2α(q))
The equations of motion are then:

dq/dt = {H, q} = p ,   dp/dt = {H, p} = −2 Σ_{α simple} H_α exp(2α(q))

or, combining the two,

d²q/dt² = −2 Σ_{α simple} H_α exp(2α(q))   (4.24)
In the usual sl(N+1) case, the Lie algebra of traceless (N+1)×(N+1) matrices, the Cartan algebra can be taken as the traceless diagonal matrices. The simple root vectors E_{α_i} are the canonical matrices E_{i,i+1}, and E_{−α_i} = E_{i+1,i} for i = 1, ..., N. In this case it is not very convenient to use an orthonormal basis of the Cartan algebra under the scalar product (X, Y) = Tr(XY). Instead we write an element of the Cartan subalgebra as q = Σ_{i=1}^{N+1} q_i E_{ii} with the constraint Σ_i q_i = 0. In this description the simple roots are α_i(q) = q_i − q_{i+1}. The dynamical variables of the open Toda chain are the two elements q = Σ_{i=1}^{N+1} q_i E_{ii} and p = Σ_{i=1}^{N+1} p_i E_{ii} with the constraints Σ_i q_i = 0 and Σ_i p_i = 0. These constraints are not compatible with the canonical Poisson bracket, eq. (4.23), so we use instead the Dirac bracket:

{p_i, q_j} = (1/2)(δ_{ij} − 1/(N+1)) ,   {p_i, p_j} = {q_i, q_j} = 0

The Hamiltonian

H = Σ_{i=1}^{N+1} p_i² + 2 Σ_{i=1}^{N} e^{2(q_i − q_{i+1})}

generates the equations of motion:

q̇_i = p_i ,  i = 1, ..., N+1
ṗ_1 = −2 e^{2(q_1 − q_2)}
ṗ_i = −2 e^{2(q_i − q_{i+1})} + 2 e^{2(q_{i−1} − q_i)} ,  i = 2, ..., N
ṗ_{N+1} = 2 e^{2(q_N − q_{N+1})}

The particular form of the equations for ṗ_1 and ṗ_{N+1} explains the terminology "open" Toda chain. The equations of motion of the open Toda chain can be written in Lax form and the system is integrable.

Proposition. The equations of motion of the open Toda chain admit a Lax pair representation L̇ = [M, L] with:

L = p + Σ_{α simple} n_α e^{α(q)} (E_α + E_{−α}) ,   M = − Σ_{α simple} n_α e^{α(q)} (E_α − E_{−α})
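These equations are easy to integrate numerically. The sketch below (plain numpy, with an ad hoc RK4 step) checks, for sl(3), that the spectral invariants Tr L² and Tr L³ of the Lax matrix stay constant along the flow, as the Lax form requires:

```python
import numpy as np

def lax(q, p):
    # L = p + sum_i e^{q_i - q_{i+1}} (E_{i,i+1} + E_{i+1,i}) for sl(N+1)
    L = np.diag(p).astype(float)
    for i in range(len(q) - 1):
        L[i, i + 1] = L[i + 1, i] = np.exp(q[i] - q[i + 1])
    return L

def rhs(y, N1):
    q, p = y[:N1], y[N1:]
    dq, dp = p.copy(), np.zeros(N1)
    for i in range(N1 - 1):          # open-chain forces
        f = 2.0 * np.exp(2.0 * (q[i] - q[i + 1]))
        dp[i] -= f
        dp[i + 1] += f
    return np.concatenate([dq, dp])

def rk4(y, dt, N1):
    k1 = rhs(y, N1); k2 = rhs(y + dt / 2 * k1, N1)
    k3 = rhs(y + dt / 2 * k2, N1); k4 = rhs(y + dt * k3, N1)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

N1 = 3                               # sl(3): three sites
q = np.array([0.3, 0.0, -0.3]); p = np.array([0.5, -0.2, -0.3])
y = np.concatenate([q, p])
L0 = lax(q, p)
for _ in range(2000):
    y = rk4(y, 1e-3, N1)
L = lax(y[:N1], y[N1:])
# the invariant polynomials Tr L^n are conserved
assert abs(np.trace(L @ L) - np.trace(L0 @ L0)) < 1e-6
assert abs(np.trace(L @ L @ L) - np.trace(L0 @ L0 @ L0)) < 1e-6
```

The conserved traces are equivalent to the conservation of the eigenvalues of L, which is the content of the isospectral Lax evolution.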
(4.25)

Proof. We compute successively the time derivative of L and the commutator [M, L]. We have

dL/dt = dp/dt + Σ_α n_α α(dq/dt) e^{α(q)} (E_α + E_{−α})

[M, L] = Σ_α n_α α(p) e^{α(q)} (E_α + E_{−α}) − Σ_α Σ_β n_α n_β e^{(α+β)(q)} [E_α − E_{−α}, E_β + E_{−β}]

Since [E_{±α}, E_{±β}] is antisymmetric in the exchange of α and β, the second term reduces to

− Σ_α Σ_β n_α n_β e^{(α+β)(q)} ([E_α, E_{−β}] − [E_{−α}, E_β])

Since α and β are simple roots, we have [E_α, E_{−β}] = δ_{αβ} (E_α, E_{−α}) H_α and therefore, using n_α² (E_α, E_{−α}) = 1,

[M, L] = −2 Σ_α e^{2α(q)} H_α + Σ_α n_α α(p) e^{α(q)} (E_α + E_{−α})

from which the result follows. It is important to notice that this calculation is performed using only the Lie algebra structure of G and never refers to a representation. It thus extends naturally to the case of a Kac–Moody algebra. This Lax representation provides conserved quantities, namely any invariant polynomial in L. To count the independent conserved quantities, we recall the Chevalley theorem, which states that, for a simple Lie algebra, the ring of invariant polynomials is freely generated by r = rank G primitive polynomials which are of degree n_i, i = 1, ..., r, the so-called exponents of the Lie algebra. We thus have r independent conserved quantities at our disposal for a phase space of dimension 2r. It remains to show that they are in involution. This is the aim of the next section.

4.7 The r-matrix of the Toda models

Since the r-matrix of the Toda chain is the canonical example of a solution of the classical Yang–Baxter equation associated with simple Lie algebras, and plays an important role in the following as well as in many integrable models, we present its computation in detail.

Proposition. Let L be the Lax matrix (4.25), with p and q satisfying the Poisson bracket eq. (4.23). Then there exists a unique antisymmetric r-matrix r_{12} = −r_{21} ∈ G ⊗ G, independent of p and q, such that {L_1, L_2} = [r_{12}, L_1 + L_2]:

r_{12} = (1/2) Σ_{α positive} (E_α ⊗ E_{−α} − E_{−α} ⊗ E_α) / (E_α, E_{−α})   (4.26)
This implies that the conserved Hamiltonians are in involution.

Proof. We have

{L_1, L_2} = (1/2) Σ_{α simple} n_α e^{α(q)} [H_α ⊗ (E_α + E_{−α}) − (E_α + E_{−α}) ⊗ H_α]

and

[r_{12}, L_1 + L_2] = [r_{12}, p ⊗ 1 + 1 ⊗ p] + Σ_{α simple} n_α e^{α(q)} [r_{12}, (E_α + E_{−α}) ⊗ 1 + 1 ⊗ (E_α + E_{−α})]

Looking for r_{12} independent of p and q, one can identify the coefficients of p and q in these equations, and we find for the r-matrix the conditions:

[r_{12}, H_i ⊗ 1 + 1 ⊗ H_i] = 0

[r_{12}, (E_α + E_{−α}) ⊗ 1 + 1 ⊗ (E_α + E_{−α})] = (1/2) (H_α ⊗ (E_α + E_{−α}) − (E_α + E_{−α}) ⊗ H_α)

when α is a simple root. Taking the commutator of the second relation with H ⊗ 1 + 1 ⊗ H, and using the first one, we see that it splits into two independent equations involving E_α and E_{−α} separately, i.e.:

[r_{12}, H_i ⊗ 1 + 1 ⊗ H_i] = 0   (4.27)

[r_{12}, E_{±α} ⊗ 1 + 1 ⊗ E_{±α}] = (1/2) (H_α ⊗ E_{±α} − E_{±α} ⊗ H_α)   (α simple)

To solve these equations, we notice that the Casimir element C_{12} in G ⊗ G obeys [C_{12}, 1 ⊗ X + X ⊗ 1] = 0 for all X ∈ G. Therefore, eq. (4.27) tells us that r_{12} is determined only up to the addition of a multiple of C_{12}. Next we observe that if r_{12} is a solution, then −r_{21} is another solution, hence (r_{12} − r_{21})/2 is an antisymmetric solution. So we can assume without loss of generality that r_{12} is antisymmetric. Note that the Casimir element is symmetric, and therefore does not affect such a solution. To find r_{12}, we dualize the equations using R(X) = Tr_2(r_{12} · 1 ⊗ X). Denoting (X, Y) = Tr(XY), we obtain for α simple:

[R(X), H_i] − R([X, H_i]) = 0   (4.28)

[R(X), E_{±α}] − R([X, E_{±α}]) = (1/2) ((E_{±α}, X) H_α − (H_α, X) E_{±α})   (4.29)

The first equation implies R(H) ⊂ H and R(G_{±β}) ⊂ G_{±β}. The most general form of r_{12} compatible with these requirements and the antisymmetry property is:

r = Σ_{ij} r_{ij} H_i ⊗ H_j + Σ_{β positive} r_β (E_β ⊗ E_{−β} − E_{−β} ⊗ E_β) / (E_β, E_{−β})

If X ∈ H, eq. (4.29) reduces to

±α(X) (r_α − 1/2) E_{±α} = α(R(X)) E_{±α}   (α simple)
which implies R(X) = 0 for all X ∈ H, i.e. r_{ij} = 0, and r_α = 1/2 for α a simple root. Finally, for β any positive root, we have R(E_β) = r_β E_β, and eq. (4.29) gives

r_β [E_β, E_α] − r_{β+α} [E_β, E_α] = 0

Therefore we deduce that r_β = 1/2 for all positive roots β. Gathering all this information, we obtain:

R(H) = 0   (4.30)

R(E_{±β}) = ± (1/2) E_{±β}   (β positive)   (4.31)

which is equivalent to eq. (4.26). This result proves the integrability of the Toda chain models. We have found a constant antisymmetric r-matrix. Let us see how it fits into the general framework explained above.

Proposition. Consider the r-matrix eq. (4.26). Define the map R(X) = Tr_2(r_{12} · 1 ⊗ X). It satisfies the modified Yang–Baxter equation:

[R(X), R(Y)] − R([X, Y]_R) = −(1/4) [X, Y]   (4.32)

Proof. This is checked by a case by case analysis, using eqs. (4.30, 4.31).
The antisymmetric r-matrix r_{12} is a solution of the modified Yang–Baxter equation (4.6). Hence r_{12}^± = r_{12} ± (1/2) C_{12} (C_{12} is the Casimir) are solutions of the classical Yang–Baxter equations:

[r_{12}^±, r_{13}^±] + [r_{12}^±, r_{23}^±] + [r_{13}^±, r_{23}^±] = 0

For simple Lie algebras, C_{12} reads:

C_{12} = Σ_i H_i ⊗ H_i + Σ_{α positive} (E_α ⊗ E_{−α} + E_{−α} ⊗ E_α) / (E_α, E_{−α})

Thus, the matrices r_{12}^± are given explicitly by:

r_{12}^+ = (1/2) Σ_i H_i ⊗ H_i + Σ_{α positive} E_α ⊗ E_{−α} / (E_α, E_{−α})   (4.33)

r_{12}^- = −(1/2) Σ_i H_i ⊗ H_i − Σ_{α positive} E_{−α} ⊗ E_α / (E_α, E_{−α})   (4.34)
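For sl(2), with E_α = E, E_{−α} = F and the orthonormal Cartan generator H/√2, these formulas are concrete 4×4 matrices and the classical Yang–Baxter equation can be checked numerically. A sketch (the basis conventions are ours):

```python
import numpy as np

# sl(2) in the fundamental representation
H = np.array([[1., 0.], [0., -1.]])
E = np.array([[0., 1.], [0., 0.]])
F = np.array([[0., 0.], [1., 0.]])
I = np.eye(2)

# (E, F) = Tr(EF) = 1; the orthonormal Cartan generator is H/sqrt(2),
# so sum_i H_i x H_i = (1/2) H x H.  Eq. (4.33) then reads:
r_plus = 0.25 * np.kron(H, H) + np.kron(E, F)

def embed(r, pos):
    """Embed a two-site tensor r into the three-site space (C^2)^{x3}."""
    P = np.zeros((4, 4))                       # swap operator on C^2 x C^2
    for a in range(2):
        for b in range(2):
            P[2 * a + b, 2 * b + a] = 1.0
    if pos == (1, 2):
        return np.kron(r, I)
    if pos == (2, 3):
        return np.kron(I, r)
    if pos == (1, 3):                          # conjugate r_{12} by the swap of sites 2,3
        S = np.kron(I, P)
        return S @ np.kron(r, I) @ S
    raise ValueError(pos)

r12, r13, r23 = (embed(r_plus, t) for t in [(1, 2), (1, 3), (2, 3)])
comm = lambda a, b: a @ b - b @ a
cybe = comm(r12, r13) + comm(r12, r23) + comm(r13, r23)
assert np.allclose(cybe, 0.0)                  # classical Yang-Baxter equation holds
```

The same check with r_{12} alone fails, as it must: the antisymmetric r-matrix satisfies only the modified Yang–Baxter equation (4.32).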
The general structural results of the algebraic approach allow one to understand the Lax pair of the open Toda chain, starting from this Toda r-matrix. The action of R^±(X) = Tr_2(r_{12}^± · 1 ⊗ X) = R(X) ± (1/2)X on the generators of G is given by

R^+(H) = (1/2) H        R^-(H) = −(1/2) H
R^+(E_α) = E_α          R^-(E_α) = 0           (4.35)
R^+(E_{−α}) = 0         R^-(E_{−α}) = −E_{−α}

for α any positive root. The subalgebras G_± = Im R^± are just the two standard Borel subalgebras of G, G_± = B_± = H ⊕ N_±. The subalgebras K_± = Ker R^∓ are K_± = N_±. Therefore G_±/K_± ≃ H, and the isomorphism ν : H → H is just ν(H) = −H, H ∈ H. In this example the isomorphism ν is non-trivial. The image G̃_R of G_R in B_+ ⊕ B_- consists of the couples (X_+, X_-) such that the components of X_+ and X_- on the Cartan subalgebra H are opposite. Any element X ∈ G can be decomposed uniquely as:

X = X_+ − X_- ,   with X_± ∈ B_± ,   X_+|_H = −X_-|_H

In the simple Lie group G, the corresponding decomposition is a factorization problem, g = g_-^{-1} g_+, where g_± ∈ exp(B_±) and the factors of g_± on the Cartan torus are inverse to each other, i.e. g_± = h^{±1} n_± with h ∈ exp H and n_± ∈ exp N_±. Let us write explicitly the R-bracket for the antisymmetric r-matrix eq. (4.26):

[H_i, H_j]_R = 0 ,   [H_i, E_{±α}]_R = (1/2) α(H_i) E_{±α} ,   [E_α, E_{−β}]_R = 0
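For sl(N+1) matrices the splitting X = X_+ − X_- is elementary: X_+ is the strictly upper part plus half the diagonal, and −X_- is the strictly lower part plus half the diagonal. A short numpy illustration (conventions as just described):

```python
import numpy as np

def split(X):
    """Decompose X = X_plus - X_minus with X_plus in the upper Borel B_+,
    X_minus in the lower Borel B_-, and opposite Cartan (diagonal) parts."""
    D = np.diag(np.diag(X))
    X_plus = np.triu(X, k=1) + D / 2.0        # in B_+
    X_minus = -(np.tril(X, k=-1) + D / 2.0)   # in B_-
    return X_plus, X_minus

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 3))
Xp, Xm = split(X)
assert np.allclose(Xp - Xm, X)                        # X = X_+ - X_-
assert np.allclose(np.diag(Xp), -np.diag(Xm))         # opposite Cartan parts
assert np.allclose(Xp, np.triu(Xp)) and np.allclose(Xm, np.tril(Xm))
```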
We identify G and G* via the scalar product ( , ). We can write a generic element of G* as

L = Σ_i x_i H_i + Σ_{α positive} [ x_α E_α / (E_α, E_{−α}) + x_{−α} E_{−α} / (E_α, E_{−α}) ]

Notice that x_i = x_i(L) = (L, H_i) and x_{±α} = x_{±α}(L) = (L, E_{±α}). Next, in order to construct an integrable system, we must choose an orbit of G_R in G*. This is the object of the following:

Proposition. The elements of G* of the form

L = Σ_i x_i H_i + Σ_{α simple} x_α (E_α + E_{−α}) / (E_α, E_{−α})   (4.36)

describe an orbit under the coadjoint action of G_R.
Proof. We must show that this form of L is stable under an infinitesimal transformation ad*_R X, ∀X ∈ G. Recall that

ad*_R X · L = ad*(RX) · L + R* ad* X · L

Using the identification of G and G*, and the fact that R is antisymmetric, we have

ad*_R X · L = −[RX, L] + R[X, L]

If X = H ∈ H, we have

ad*_R H · L = R[H, L] = (1/2) Σ_{α simple} α(H) x_α (E_α + E_{−α}) / (E_α, E_{−α})

This shows that ad*_R H only modifies the x_α coordinates of L. If X = E_β ∈ G_β, with β positive, we have

ad*_R E_β · L = (R − 1/2) [E_β, L]

Since L contains only roots of height ≥ −1, we have

[E_β, L] ∈ H ⊕ (⊕_{α positive} G_α)

and therefore (R − 1/2)[E_β, L] ∈ H because G_α ⊂ Ker(R − 1/2) for all α positive. Similarly ad*_R E_{−β} · L = (R + 1/2)[E_{−β}, L] ∈ H. This shows that ad*_R E_{±β} only modifies the x_i coordinates of L. This result gives a description of the orbit in terms of the coordinates x_i, x_α. Computing the Kostant–Kirillov bracket with respect to G_R we get on the orbit (4.36):

{x_i, x_j}_R = 0 ,   {x_i, x_α}_R = (1/2) α(H_i) x_α ,   {x_α, x_β}_R = 0

Indeed, since dx_i = H_i, dx_{±α} = E_{±α}, the Poisson bracket { , }_R is given by:

{x_i, x_{±α}}_R = (L, [dx_i, dx_{±α}]_R) = (1/2) α(H_i) (L, E_{±α}) = (1/2) α(H_i) x_{±α}

and similarly {x_α, x_{−β}}_R = 0, {x_α, x_β}_R = C_{α,β} x_{α+β}, where C_{α,β} are the structure constants of G. Moreover, on the orbit (4.36) we have x_{α+β} = 0 since only simple roots appear. The orbit being a symplectic manifold, we know by the theorem of Darboux, see Chapter 14, that there exists (locally) a set of canonical
coordinates p_i, q_i such that {p_i, q_j}_R = (1/2) δ_{ij}. Here these coordinates are given by:

x_i = p_i ,   x_α = n_α (E_α, E_{−α}) exp(α(q))

Inserting this parametrization into eq. (4.36) one recovers the Lax matrix of the open Toda chain. The model is completely specified once we choose a Hamiltonian H ∈ I(G*). Taking H = (L, L) yields:

M = −R(dH) = −2R(L) = − Σ_{α simple} n_α e^{α(q)} (E_α − E_{−α})

which is the second element of the Lax pair of the open Toda chain.

4.8 Solution of the open Toda chain

To solve the equations of motion of the open Toda chain, we have to solve the factorization problem eq. (4.22). Specifically,

e^{−t dH(L_0)} = θ_-(t)^{-1} θ_+(t) ;   L(t) = θ_+(t) L_0 θ_+^{-1}(t) = θ_-(t) L_0 θ_-^{-1}(t)   (4.37)

with L_0 the Lax matrix at time t = 0. The factorization problem is specified by the r-matrix of the Toda chain. It is a Gauss decomposition with the diagonal part equally shared between the two factors. Even though this is a purely algebraic problem, it may not be an easy task to perform explicitly. Fortunately, if G is a finite-dimensional simple Lie algebra, there exists a suitable parametrization of the Lax operator, found by Kostant, in which the factorization can be done. We will explain this method in the case of G = sl(N+1). We write the Lax matrix of the Toda chain as:

L_Toda = e^{−ad q} E_- + p + e^{ad q} E_+ ,   with E_± = Σ_{α simple} n_α E_{±α}

Note that if we assign degree 0 to the generators of the Cartan subalgebra H and degree 1 or −1 to the generators E_α or E_{−α} with α simple, then E_- is of degree −1, p of degree 0 and E_+ of degree +1. The Lax matrix is a traceless symmetric Jacobi matrix, i.e. L_{ij} = 0 except for j = i, i±1. We will parametrize the space of traceless symmetric (N+1)×(N+1) Jacobi matrices, which is of dimension 2N, with two diagonal matrices: Ω ∈ sl(N+1), and ∆ ∈ SL(N+1). To this end, we introduce the matrix w which is the unique matrix such that det w = 1, wHw^{-1} ⊂ H and

w E_{i,i+1} w^{-1} = E_{N+2−i, N+1−i}   (4.38)
Explicitly, its matrix elements are [w]_{ij} = (−1)^{N/2} δ_{N+2−i, j}. The action eq. (4.38) represents the action of the longest element of the Weyl group. Its main property in what follows is the relation

w^{-1} E_- w = E_+   (4.39)

Lemma. Let Ω be a traceless diagonal matrix, and ∆ be a diagonal matrix with determinant 1. Considering the unique element u ∈ N_- = exp(N_-) which diagonalizes E_- + Ω:

E_- + Ω = u Ω u^{-1}   (4.40)

and introducing the unique solution b_± ∈ B_± = exp(B_±) of the factorization problem

b_-^{-1} b_+ = w^{-1} u ∆ u^{-1}

where we require that the diagonal is equally shared between b_+ and b_-^{-1} as in the previous section, we have

b_+ (E_- + Ω) b_+^{-1} = b_- (E_+ + w^{-1} Ω w) b_-^{-1}

Proof. From the factorization condition, we have b_+ = b_- w^{-1} u ∆ u^{-1}. Then

b_+ (E_- + Ω) b_+^{-1} = b_- w^{-1} u ∆ u^{-1} (E_- + Ω) u ∆^{-1} u^{-1} w b_-^{-1} = b_- w^{-1} (E_- + Ω) w b_-^{-1} = b_- (E_+ + w^{-1} Ω w) b_-^{-1}

because u ∆ u^{-1} commutes with E_- + Ω = u Ω u^{-1}. Note that the factorization problem is non-trivial due to the presence of w, which does not belong to N_-. As a straightforward consequence of this Lemma, we have the:

Proposition. Choose Ω and ∆ as above; then the matrix L defined by

L = b_+ (E_- + Ω) b_+^{-1} = b_- (E_+ + w^{-1} Ω w) b_-^{-1}   (4.41)

is a symmetric traceless Jacobi matrix.

Proof. The first expression of L shows that it expands on degrees ≥ −1, while the second form shows that it expands on degrees ≤ 1, hence L is a Jacobi matrix. To show that it is symmetric, write b_± = h^{±1} n_± with n_± ∈ N_± and h in the Cartan torus; we have implemented the condition that b_± have inverse components on the Cartan torus. We write h = exp(−q) for
q ∈ H. The E_{−α} term in L is given by exp(−ad q) E_{−α} = exp(α(q)) E_{−α}. Similarly, the E_α term is given by exp(ad q) E_α = exp(α(q)) E_α. The two coefficients are equal. Finally, since Ω is traceless, so is L. The fact that we have at our disposal 2N parameters in Ω and ∆ indicates that we can parametrize all symmetric traceless Jacobi matrices in this way. The diagonal matrices Ω, ∆ form a system of coordinates on the space of symmetric Jacobi matrices. In the sl(N+1) case we have, since h = exp(−q),

h_{i+1}² = e^{2(q_i − q_{i+1})} h_i²   (4.42)

where h_i are the entries of the diagonal matrix h. Moreover, denoting Ω = diag(ω_i), we have explicitly

u_{ij} = Π_{l=j+1}^{i} 1/(ω_j − ω_l) ,   u^{-1}_{ij} = Π_{l=j}^{i−1} 1/(ω_i − ω_l)   (4.43)

This is obtained by writing u Ω = (E_- + Ω) u, i.e. (ω_j − ω_i) u_{ij} = u_{i−1,j}. The solution of this recursion relation with the boundary condition u_{ii} = 1 yields eq. (4.43). To compute u^{-1} we proceed in the same way with the equation Ω u^{-1} = u^{-1} (E_- + Ω). Coming back to the open Toda chain, we parametrize the Lax matrix L(t) by the two matrices Ω(t) and ∆(t). We call L_0 = L(t = 0) the Lax matrix at time t = 0, and let Ω_0 and ∆_0 denote its coordinates. Note that the matrix elements of Ω(t) are the eigenvalues of L(t), due to eqs. (4.40, 4.41), and so are constant in time: Ω(t) = Ω_0. Similarly, the matrix u(t) = u is independent of time because it depends only on the matrix elements of Ω(t) by eq. (4.43).
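The product formula (4.43) is easy to check numerically. The sketch below builds u for sl(3) and verifies both the diagonalization (4.40) and the triangular structure (a test of the formula, with our normalization n_α = 1, so E_- has ones on the subdiagonal):

```python
import numpy as np

def u_matrix(omega):
    """Lower unit-triangular u with u_{ij} = prod_{l=j+1}^{i} 1/(w_j - w_l),
    solving (E_- + Omega) u = u Omega, eq. (4.43)."""
    n = len(omega)
    u = np.eye(n)
    for i in range(n):
        for j in range(i):
            u[i, j] = np.prod([1.0 / (omega[j] - omega[l])
                               for l in range(j + 1, i + 1)])
    return u

omega = np.array([1.3, 0.2, -1.5])            # distinct eigenvalues, trace 0
Om = np.diag(omega)
Em = np.diag(np.ones(2), k=-1)                # E_- for sl(3): subdiagonal of ones
u = u_matrix(omega)
assert np.allclose((Em + Om) @ u, u @ Om)     # E_- + Omega = u Omega u^{-1}
assert np.allclose(u, np.tril(u))             # u lies in N_-
```

The assertion is exactly the recursion (ω_j − ω_i) u_{ij} = u_{i−1,j} solved row by row, so it fails as soon as two ω_i coincide — the parametrization requires distinct eigenvalues.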
where θ± have inverse components on the Cartan torus and θ+ (t) = b+ (t)b−1 + (0),
θ− (t) = b− (t)b−1 − (0)
Proof. The result follows by writing in eq. (4.41) the time-independent matrix Ω(t) in terms of L0 . We have (E− + Ω0 ) = b−1 + (0)L0 b+ (0) and also (0)L b (0). The components of θ± on the Cartan E+ + w−1 Ω0 w = b−1 0 − − −1 ±1 torus are (h(t)h (0)) and are clearly inverse to each other.
We now find the time evolution of the coordinate ∆(t) by requiring that eq. (4.37) holds with the above θ_±(t). This yields the:

Proposition. The time evolution of the Toda chain, in the coordinates Ω(t), ∆(t), reads

Ω(t) = Ω_0 ,   ∆(t) = ∆_0 e^{−t dH(Ω_0)}   (4.44)

Proof. The first assertion has already been proved. To prove the second one, we start from exp(−t dH(L_0)) = θ_-(t)^{-1} θ_+(t). The left-hand side is easy to evaluate using eq. (4.41):

exp(−t dH(L_0)) = b_+(0) u exp(−t dH(Ω_0)) u^{-1} b_+^{-1}(0)   (4.45)

On the other hand, using the explicit form of θ_±(t), the right-hand side is equal to:

θ_-^{-1}(t) θ_+(t) = b_-(0) b_-^{-1}(t) b_+(t) b_+^{-1}(0) = b_-(0) w^{-1} u ∆(t) u^{-1} b_+^{-1}(0)

where in the second expression we used the factorization problem at time t for b_±(t). However, at t = 0 the same problem yields b_-(0) = b_+(0) u ∆_0^{-1} u^{-1} w. Plugging into the previous expression we get:

θ_-^{-1}(t) θ_+(t) = b_+(0) u ∆_0^{-1} ∆(t) u^{-1} b_+^{-1}(0)

Comparing with eq. (4.45), we get ∆_0^{-1} ∆(t) = exp(−t dH(Ω_0)). This proposition solves the equations of motion of the open Toda chain since it gives the time evolution of the coordinates used to write the Lax matrix. We can give a more explicit form of the solution of the equations of motion in terms of determinants. To do that, consider the matrix

B = b_-^{-1} b_+ = n_-^{-1} h² n_+ = w^{-1} u ∆ u^{-1}
We want to compute h² directly. If we have a representation V of sl(N+1) with a highest weight vector |v⟩ and its dual V* with the dual highest weight vector ⟨v|, we have ⟨v|B|v⟩ = ⟨v|h²|v⟩, because by definition n_+|v⟩ = |v⟩ and ⟨v|n_- = ⟨v|. Since h lives in a space of dimension r = rank G, using r such representations with independent highest weights one completely reconstructs h. In the sl(N+1) case we can take the N representations Λ^k C^{N+1}, k = 1, ..., N, where Λ denotes the wedge product. Let |1⟩, ..., |N+1⟩ be the canonical basis of C^{N+1}. Then the highest weight vector |v_k⟩ of Λ^k C^{N+1} and its dual ⟨v_k| are:

|v_k⟩ = |1⟩ ∧ ··· ∧ |k⟩ ,   ⟨v_k| = ⟨1| ∧ ··· ∧ ⟨k|

Since by definition, in the representation Λ^k C^{N+1}, B|v_k⟩ = B|1⟩ ∧ ··· ∧ B|k⟩, it is easy to check that n_+|v_k⟩ = |v_k⟩. Moreover, we have for any (N+1)×(N+1) matrix B = (B_{ij}):

⟨v_k|B|v_k⟩ = det (B_{ij})_{1≤i,j≤k}

Proposition. Let |v_k⟩ be the highest weight vectors of sl(N+1) defined above. Let B = w^{-1} u ∆(t) u^{-1} and define the N functions of t:

τ_k(t) ≡ ⟨v_k|B|v_k⟩

Then the solution of the open Toda chain is given by

e^{2(q_i − q_{i+1})} = τ_{i+1}(t) τ_{i−1}(t) / τ_i²(t)   (4.46)

Proof. Let h_i be the matrix elements of the diagonal (N+1)×(N+1) matrix h in b_-^{-1} b_+ = n_-^{-1} h² n_+. The Toda position coordinates are given by eq. (4.42). According to the previous analysis we have τ_k = ⟨v_k|B|v_k⟩ = ⟨v_k|h²|v_k⟩, because B = n_-^{-1} h² n_+, so that τ_k = h_1² ··· h_k². Setting τ_0 = τ_{N+1} = 1 by convention, we deduce

h_i² = τ_i / τ_{i−1}

Let us summarize the way to find the solution: given the initial data L_0 one computes the coordinates (Ω_0, ∆_0) and lets these coordinates evolve according to eq. (4.44). Then one computes u and evaluates the τ_i by the determinant formula. Finally, the Toda coordinates q_i are given by (4.46). All these steps are simple algebraic computations.
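The factorization solution can also be checked numerically in the defining representation. For symmetric L the Gauss-type factorization of exp(−t L_0) can be traded for a QR decomposition — an orthogonal/triangular variant of the same idea, a choice of ours rather than the shared-diagonal convention of the text. Conjugating L_0 by the orthogonal factor then reproduces the Lax flow:

```python
import numpy as np
from scipy.linalg import expm

def toda_L(q, p):
    L = np.diag(p).astype(float)
    for i in range(len(q) - 1):
        L[i, i + 1] = L[i + 1, i] = np.exp(q[i] - q[i + 1])
    return L

def M_of(L):
    # M = (strictly lower part of L) - (strictly upper part of L)
    return np.tril(L, -1) - np.triu(L, 1)

def rk4(L, dt):
    f = lambda A: M_of(A) @ A - A @ M_of(A)   # Lax equation L' = [M, L]
    k1 = f(L); k2 = f(L + dt / 2 * k1); k3 = f(L + dt / 2 * k2); k4 = f(L + dt * k3)
    return L + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

q0 = np.array([0.4, 0.0, -0.4]); p0 = np.array([0.3, 0.1, -0.4])
L0 = toda_L(q0, p0)
t, n = 0.7, 700
# direct integration of the Lax equation
L = L0.copy()
for _ in range(n):
    L = rk4(L, t / n)
# factorization solution: exp(-t L0) = Q R, then L(t) = Q^T L0 Q
Q, R = np.linalg.qr(expm(-t * L0))
D = np.diag(np.sign(np.diag(R)))              # fix the sign ambiguity of QR
Q = Q @ D
L_fact = Q.T @ L0 @ Q
assert np.allclose(L, L_fact, atol=1e-6)
```

The agreement of the two computations is the numerical counterpart of eq. (4.37): the whole time evolution is encoded in the factorization of an explicitly known group element.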
4.9 Toda system and Hamiltonian reduction

Complementary to the algebraic approach to integrable systems presented in the previous sections, there exists a more geometrical approach, advocated by Olshanetsky and Perelomov, which we want to present on the example of the open Toda chain. The idea is to start from the geodesic motion on a Lie group G and to perform a suitable Hamiltonian reduction, which leads to the open Toda chain.
Geodesics on the group G correspond to left translations of one-parameter groups (the tangent vector is transported parallel to itself), so d/dt (g^{-1} ġ) = 0. The solution is of the form g(t) = g(0) exp(Xt), for X in the Lie algebra of G. Of course this leaves open the problem of effectively computing the exponential, which was the main concern of the previous section. We first recall some notions about Lie groups and symmetric spaces, see Chapter 16. Let G be a complex simple Lie group with Lie algebra G. Let {H_i} be the generators of a Cartan subalgebra H, and let {E_{±α}} be the corresponding root vectors, chosen to form a Weyl basis, i.e. all the structure constants are real, see Chapter 16. The real normal form of G is the real Lie algebra G_0 generated over R by the H_i and E_{±α}. Let σ be the Cartan involution: σ(H_i) = −H_i, σ(E_{±α}) = −E_{∓α}. The fixed points of σ form a Lie subalgebra K of G_0 generated by the {E_α − E_{−α}}. We have the decomposition:

G_0 = K ⊕ M

where M is the real vector space generated by the {E_α + E_{−α}} and the {H_i}. Notice that, due to the choice of the real normal form, A = H ∩ G_0 is a maximal Abelian subalgebra of G_0 and it is entirely contained in M. Finally, we need the real nilpotent subalgebras N_± generated respectively by the {E_{±α}}. Let G_0 be the connected Lie group corresponding to the Lie algebra G_0, and similarly K corresponding to K and N_± corresponding to N_±. The Cartan algebra A exponentiates to A. The coset space G_0/K is a symmetric space of the non-compact type. The connected Lie group G_0 admits the following Iwasawa decomposition:

G_0 ≃ N_+ × A × K   as a manifold

that is, any element g in G_0 can be written uniquely as g = nQk.

Remark. In the case G_0 = SL(N+1, R), let σ : g → ᵗg^{-1} be the canonical automorphism of G. At the Lie algebra level, σ reads σ(X) = −ᵗX. The subgroup K is the group of orthogonal matrices with determinant 1, and the symmetric space G_0/K can be viewed as the space of symmetric matrices with determinant 1. Finally, A is the set of diagonal matrices, and N_+ is the group of upper triangular matrices with 1 on the diagonal. To show that any g can be written as nQk, recall the Gram–Schmidt orthogonalization procedure. Starting from a basis (f_1, ..., f_{N+1}) one constructs inductively an orthogonal basis by setting u_1 = f_1, u_2 = f_2 − f_1 (f_2, f_1)/(f_1, f_1), etc., that is (u_1, ..., u_{N+1}) = (f_1, ..., f_{N+1}) n with n ∈ N_+. One then normalizes this basis by setting (û_1, ..., û_{N+1}) = (u_1, ..., u_{N+1}) Q, with Q the diagonal matrix with entries
Q_j = 1/√(u_j, u_j). Applying this procedure to the basis f_j of columns of g^{-1}, we have (f_1, ..., f_{N+1}) g = (e_1, ..., e_{N+1}), where e_j is the canonical basis of R^{N+1}. On the other hand:

(f_1, ..., f_{N+1}) n = (u_1, ..., u_{N+1}) ,   (u_1, ..., u_{N+1}) Q = (û_1, ..., û_{N+1}) ,   (û_1, ..., û_{N+1}) k = (e_1, ..., e_{N+1})

for a unique orthogonal matrix k. Comparing, we get the unique decomposition g = nQk.
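Numerically, the g = nQk decomposition for SL(N+1, R) is an RQ factorization with the triangular factor further split into its unit-triangular and diagonal parts. A sketch using scipy (conventions as in the Remark; the size-3 example and sign handling are ours):

```python
import numpy as np
from scipy.linalg import rq

def iwasawa(g):
    """Split g = n @ Q @ k: n unit upper triangular, Q positive diagonal,
    k orthogonal (Iwasawa decomposition for SL(N+1, R))."""
    T, k = rq(g)                       # g = T k, T upper triangular, k orthogonal
    s = np.sign(np.diag(T))            # absorb signs so that diag(T) > 0
    T, k = T * s, (k.T * s).T          # T -> T diag(s),  k -> diag(s) k
    Q = np.diag(np.diag(T))
    n = T @ np.linalg.inv(Q)
    return n, Q, k

rng = np.random.default_rng(1)
g = rng.standard_normal((3, 3))
d = np.linalg.det(g)
g = g * np.sign(d) / abs(d) ** (1 / 3)   # normalize to det g = +1 (odd size)
n, Q, k = iwasawa(g)
assert np.allclose(n @ Q @ k, g)
assert np.allclose(k @ k.T, np.eye(3))                        # k orthogonal
assert np.allclose(n, np.triu(n)) and np.allclose(np.diag(n), 1.0)
assert (np.diag(Q) > 0).all()
```

This is Gram–Schmidt in matrix form: the RQ routine orthogonalizes the rows, and the positive diagonal Q is exactly the normalization step of the Remark.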
We next recall some generalities about the symplectic structure on the cotangent bundle of a Lie group, and Hamiltonian reduction, see Chapter 14. The tangent space of G at g ∈ G consists of vectors g · X, where X ∈ G, the Lie algebra of G. The cotangent bundle N = T*G is identified with G × G* by p ∈ T_g*(G) → (g, ξ), where p(g · X) = ξ(X). Using the invariant bilinear form on G, we can identify G and G*, so that ξ(X) = (ξ, X). If (δg, δξ) is a vector tangent to T*G at the point (g, ξ), the canonical 1-form is given by:

α(δg, δξ) = 2 ξ(g^{-1} · δg)   (4.47)

thereby providing the symplectic form ω = dα. The factor 2 has been introduced to match the conventions of the previous sections. Notice that α and ω are invariant under both left and right translations. One can then attempt to reduce this phase space using Lie subgroups H_L and H_R of G with Lie algebras H_L and H_R, acting respectively on the left and on the right on T*G by:

( (h_L, h_R), (g, ξ) ) → (h_L g h_R^{-1}, h_R ξ h_R^{-1})
We are interested in the computation of Poisson brackets of functions on phase space. It is important to remember that, if we have a group action on a manifold M, (g, m) → gm, the action on functions on M reads f → g · f, where g · f(m) = f(g^{-1} m). In our case we can take as elementary functions the collection of the matrix elements of g in a faithful representation of G, and the components of ξ. Any other function can be expressed in these coordinates. So it is enough to give the Poisson brackets of these elementary functions. The general theory, see Chapter 14, shows that these Poisson brackets are expressed as:

{ξ(X), ξ(Y)} = (1/2) ξ([X, Y]) ,   {ξ(X), g} = (1/2) g X ,   {g, g} = 0   (4.48)
In these equations g is an abuse of notation for the matrix ρ(g) in some faithful representation. What eq. (4.48) really means is that {ξ(X), ρ(g)_{ij}} = (1/2) ρ(gX)_{ij}, and {ρ(g)_{ij}, ρ(g)_{kl}} = 0. On these elementary functions, g and ξ(Y), the infinitesimal group action δ_{(X_L, X_R)} reads:

δ_{(X_L, X_R)} g = −X_L g + g X_R ,   δ_{(X_L, X_R)} ξ(Y) = −([X_R, ξ], Y)

It is straightforward to check, using the Poisson brackets (4.48), that the generating Hamiltonians are:

H_{X_L} = −2 ξ(g^{-1} X_L g) ,   H_{X_R} = 2 ξ(X_R)

where we recall that for any function f, we have by definition of the Hamiltonian H_X generating the group action, δ_X f = {H_X, f}. This means that the moments, defined by H_X = (P, X), are:

P^L(g, ξ) = −2 P_{H_L*} (g ξ g^{-1}) ,   P^R(g, ξ) = 2 P_{H_R*} (ξ)   (4.49)

where we have introduced the projectors P_{H_{L,R}*} on the spaces H_{L,R}* in G*, obtained by taking the restriction of ξ ∈ G* to H_{L,R}, the Lie algebras of H_{L,R}.
The geodesic motion on G is a Hamiltonian system on T*G whose Hamiltonian is H = (ξ, ξ). Indeed, one finds, using eq. (4.48), ξ̇ = 0 and g^{-1} ġ = ξ as equations of motion. Notice that H is bi-invariant, so one can reduce by the group actions on the left and on the right. To get the Toda chain we shall perform the reduction of the geodesic motion on T*G_0 by the action of the group N_+ on the left and K on the right, according to the Iwasawa decomposition. More precisely, this is achieved by a suitable choice of the momentum μ = (μ^L, μ^R). We take:

P_{K*}(ξ) = μ^R = 0   (4.50)

−2 P_{N_+*}(g ξ g^{-1}) = μ^L ≡ −2 Σ_{α simple} E_{−α}   (4.51)

In the following we shall identify N_+* with N_-, and K* with K, through the invariant scalar product. The isotropy group G_μ of the momentum μ is N_+ × K. This is obvious for the right component since μ^R = 0. The isotropy group of μ^L is by definition the set of elements g ∈ N_+ such that:

(μ^L, g^{-1} X g) = (μ^L, X) ,   ∀X ∈ N_+

Since μ^L only contains roots of degree −1 (see section (4.8)), the only contribution to (μ^L, X) comes from X^{(1)}, the degree one component of X. But
(g^{-1} X g)^{(1)} = X^{(1)} for all g ∈ N_+, because, for Y ∈ N_+, exp(Y) X exp(−Y) = X + [Y, X] + ···, and Y is of degree ≥ 1. Hence the isotropy group of μ^L is N_+ itself. The reduced phase space is F_μ = N_μ/G_μ, where N_μ is the level manifold defined by the equations (4.50, 4.51). Let us compute its dimension. Let d = dim G_0 and r = dim A. Then we have:

dim K = dim N_+ = (d − r)/2 ,   dim T*G_0 = 2d

The dimension of the submanifold N_μ is dim N_μ = 2d − dim K − dim N_+ = d + r, and the dimension of the reduced phase space is dim F_μ = dim N_μ − dim K − dim N_+ = 2r, which is the correct dimension of the phase space of the Toda chain. Since the isotropy group is the whole of N_+ × K, any point (g, ξ) of N_μ can be brought to the form (g, ξ) → (Q, L) with Q ∈ A (due to the Iwasawa decomposition) by the action of the isotropy group. We shall identify G_0 and G_0* under the Killing form for convenience. Equation (4.50) implies that L ∈ M, which is orthogonal to K. Thus we can write:

L = p + Σ_{α positive} l_α (E_α + E_{−α}) ,   p ∈ A

Inserting this form into eq. (4.51) and setting Q = exp(q), we get:

P_{N_-}(Q L Q^{-1}) = Σ_{α>0} l_α e^{−α(q)} E_{−α} = Σ_{α simple} E_{−α}

hence l_α = exp(α(q)) for α simple and l_α = 0 otherwise. We have obtained the Lax matrix of the Toda chain:

L = p + Σ_{α simple} e^{α(q)} (E_α + E_{−α})   (4.52)

The set of the (Q, L) is obviously a (2r)-dimensional subvariety S of N_μ. For any point (g, ξ) in N_μ one can write uniquely g = nQk and ξ = k^{-1} L k. To compute the reduced symplectic form in the coordinates (Q, L), it is enough to evaluate the canonical 1-form:

α = 2 ξ(g^{-1} δg) = 2 L(Q^{-1} δQ) = 2 L(δq) = 2 Σ_i p_i δq_i
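The momentum constraint (4.51) that fixes l_α = e^{α(q)} can be verified in two lines for sl(3) (our normalization (E_α, E_{−α}) = 1, so the constraint reads "subdiagonal of Q L Q^{-1} equals ones"):

```python
import numpy as np

# sl(3) check: with L = p + sum_i e^{q_i - q_{i+1}} (E_{i,i+1} + E_{i+1,i})
# and Q = diag(e^{q_i}), the strictly lower part of Q L Q^{-1} is exactly
# E_{21} + E_{32}, i.e. the momentum constraint (4.51) is satisfied.
q = np.array([0.7, -0.2, -0.5])
p = np.array([0.1, 0.3, -0.4])
L = np.diag(p).astype(float)
for i in range(2):
    L[i, i + 1] = L[i + 1, i] = np.exp(q[i] - q[i + 1])
Q = np.diag(np.exp(q))
lower = np.tril(Q @ L @ np.linalg.inv(Q), k=-1)   # the P_{N_-} projection
assert np.allclose(lower, np.eye(3, k=-1))        # subdiagonal of ones
```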
This completes the identification of the reduced geodesic flow with the open Toda chain. An interesting feature of this approach is that it allows us to compute naturally the r-matrix of the Toda model. We refer to Chapter 14 for the exposition of the general method. The function L(X) (for any X ∈ M) on the reduced phase space has a uniquely defined extension on T*G_0, invariant under the action of the group N_+ × K:

F_X(g, ξ) = (ξ, k^{-1} X k)   (4.53)

where k = k(g) is uniquely determined by the Iwasawa decomposition g = nQk. In this situation one has simply the reduced Poisson bracket:

{L(X), L(Y)} = {F_X, F_Y} = {(k ξ k^{-1}, X), (k ξ k^{-1}, Y)}

We evaluate this bracket with the help of eqs. (4.48), and then restrict to n = k = e. For a function f : G → G we have:

{ξ(X), f(g)} = (1/2) d/dt f(g e^{tX})|_{t=0}

Applying this to the function k = k(g), we get:

{ξ(X), k(g)} = (1/2) ∇k(X) ,   where ∇k(X) = ∂/∂t k(Q exp(tX))|_{t=0}

We then immediately obtain:

{L(X), L(Y)} = (1/2) ( L , [X, Y] − [X, ∇k(Y)] − [∇k(X), Y] )

Notice that [X, Y] ∈ K because X, Y ∈ M, hence L([X, Y]) = 0, because L ∈ M. From this equation we get an r-matrix for the Toda system given by:

R′X = −(1/2) ∇k(X) ∈ K

We can compute this r-matrix as follows. Given an element X ∈ M, we write the Iwasawa decomposition of Q e^{tX} as:

Q e^{tX} = e^{t X_+} · Q e^{t X_a} · e^{t X_K}

This uniquely defines the quantities (a priori depending on t) X_+ ∈ N_+, X_a ∈ A, and X_K ∈ K. Letting t tend to zero, we see that ∇k(X) = X_K and moreover

Q X = X_+ Q + X_a Q + Q X_K   ⇒   X = Q^{-1} X_+ Q + X_a + X_K   (4.54)
Any X ∈ M can be written in the form X = Σ_{α>0} x_α (E_α + E_{−α}) + Σ_i x_i H_i, which we rewrite as

X = 2 Σ_{α>0} x_α E_α + Σ_i x_i H_i − Σ_{α>0} x_α (E_α − E_{−α})   (4.55)

Noticing that Q^{-1} X_+ Q ∈ N_+, we get by comparison of eqs. (4.54) and (4.55), Q^{-1} X_+ Q = 2 Σ_{α>0} x_α E_α, X_a = Σ_i x_i H_i, and finally

R′X = −(1/2) ∇k(X) = −(1/2) X_K = (1/2) Σ_{α>0} x_α (E_α − E_{−α})

By dualization, R′X = Tr_2(r′_{12} · 1 ⊗ X), we obtain the r-matrix:

r′_{12} = (1/4) Σ_{α>0} (E_α − E_{−α}) ⊗ (E_α + E_{−α}) / (E_α, E_{−α})   (4.56)

We want to relate r′_{12} to the r-matrix r_{12} we computed in eq. (4.26). For this, we recall that L and r_{12} satisfy the equation {L_1, L_2} = [r_{12}, L_1] − [r_{21}, L_2]. Applying the automorphism σ_1 = σ ⊗ 1 to this equation and noting that σ(L) = −L, we get {L_1, L_2} = [σ_1 r_{12}, L_1] + [σ_1 r_{21}, L_2]. Adding the two equations, we get a similar r-matrix relation:

{L_1, L_2} = [ (1/2)(r_{12} + σ_1 r_{12}), L_1 ] − [ (1/2)(r_{21} − σ_1 r_{21}), L_2 ]

It is straightforward to check that:

(1/2)(r_{12} + σ_1 r_{12}) = r′_{12} ,   (1/2)(r_{21} − σ_1 r_{21}) = r′_{21}

hence we recover eq. (4.56).

4.10 The Lax pair of the Kowalevski top

Applying the algebraic setting of this chapter to classical Lie groups and low-dimensional coadjoint orbits, one can find a rich variety of integrable systems. This pragmatic approach was rewarded by the discovery of a Lax pair with spectral parameter for the Kowalevski top by Reyman and Semenov-Tian-Shansky. We now explain this construction. Consider a Lie group G with an involutive automorphism σ. This involution defines a linear involution on the Lie algebra G that we also call σ. Let K be the subgroup of fixed points of σ, and K its Lie algebra. We have

G = K ⊕ R
where σ = +1 on K and σ = −1 on R. Since σ is a Lie algebra automorphism of order 2, we have [K, K] ⊂ K,
[K, R] ⊂ R,
[R, R] ⊂ K
By exponentiation, the vector space R is a representation space for the Lie group K acting by conjugation. From these data we first construct the loop algebra G˜ = G ⊗ C[[λ, λ−1 ]] with the commutation relations eq. (4.14), and then the twisted loop algebra G˜σ as the set of fixed points of the induced involution σ (X(λ)) = (σ · X)(−λ). Elements of the twisted loop algebra are of the form X(λ) = Xn λn with Xn ∈ K for n even, and Xn ∈ R for n odd. As in section (4.3), this algebra decomposes as a sum of two subalgebras: G˜σ = G˜σ + + G˜σ − where elements of G˜σ + have vanishing or positive powers of λ while elements of G˜σ − have strictly negative powers of λ. Applying the Adler– Kostant–Symes scheme, we define a pair of matrices R± whose actions are given by R+ (X(λ)) = −X+ (λ) R− (X(λ)) = X− (λ) As compared with eq. (4.15), we have introduced a minus sign to match the conventions of Chapter 2. Hence we have two Lie algebra brackets on G˜σ , the original one, and the one in which G˜σ ± commute, defining the Lie algebra G˜σ R , with which ˜R. is associated the Lie group G The invariant bilinear form on G˜σ is still given byeq. (4.16). It allows us to write elements of the dual G˜σ∗ as series L(λ) = n ln λn−1 with ln ∈ K for n even and ln ∈ R for n odd. On this dual we consider the Poisson algebra defined by the Poisson brackets { , }R . To construct an integrable system we select a particularly simple orbit: Proposition. Let A and B be two fixed elements of R. (i) The set of elements L(λ; ξ, k) = A + ξλ−1 + (k −1 Bk) λ−2 ,
ξ ∈ K, k ∈ K
(4.57)
is a coadjoint orbit of G̃_R.

(ii) Setting L = l_{−1} λ^{−2} + l_0 λ^{−1} + l_1, with l_0 ∈ K and l_{±1} ∈ R, the Poisson brackets of this Lax matrix are given for x, y ∈ K, z ∈ R by:

{(l_0, x), (l_0, y)}_R = −(l_0, [x, y]),   {(l_0, x), (l_{−1}, z)}_R = −(l_{−1}, [x, z])   (4.58)
All other Poisson brackets vanish. In particular, l_1 is in the centre of the Poisson algebra.

(iii) Equation (4.57) shows that there is a map from T*(K) to the orbit. This is a Poisson map.

Proof. We first show that the elements of the form L = l_{−1} λ^{−2} + l_0 λ^{−1} + l_1 form a coadjoint orbit of G̃_R. It is enough to show that such elements are stable under the coadjoint action of G̃^σ_R. For X = \sum_n x_n λ^n we have by eq. (4.13) (ad*_R X · L)(Y) = L([X, Y]_R) = −L([X_+, Y_+] + [X_−, Y_−]), so that:

(ad*_R X · L, Y) = −([X_+, L], Y_+) + ([X_−, L], Y_−)
= (−[x_0, l_{−1}] λ^{−2} − ([x_0, l_0] + [x_1, l_{−1}]) λ^{−1}, Y_+)

We see that under this coadjoint action l_1 remains invariant, l_{−1} is transformed by the coadjoint action of K, and l_0 is transformed to a generic element of K:

δl_1 = 0,
δl−1 = −[x0 , l−1 ],
δl0 = −[x0 , l0 ] − [x1 , l−1 ]
Next we compute the Poisson brackets of L induced by the Kostant–Kirillov bracket { , }_R. In the formula {(L, X), (L, Y)}_R = (L, [X, Y]_R), we choose X = x_p λ^p, p = −1, 0, 1, and similarly for Y, to probe the three components of L. For example {(l_1, x_{−1}), (l_0, y_0)}_R = (L, [x_{−1}, y_0]_R) = 0 because x_{−1} and y_0 lie in the two subspaces of G̃^σ which commute in G̃^σ_R. Similarly, {(l_0, x_0), (l_{−1}, y_1)}_R = −(L, [x_0, y_1] λ) = −(l_{−1}, [x_0, y_1]), etc.

To show that these Poisson brackets are the same as those of T*(K), recall the parametrization (k, ξ) of the generic element of the cotangent bundle T*(K) and the Poisson bracket formulae (with an extra minus sign compared to Chapter 14):

{ξ(x), ξ(y)} = −ξ([x, y]),
{ξ(x), k} = −kx,
{k, k} = 0
(4.59)
The first of eq. (4.59) reproduces the Kostant–Kirillov bracket of K as in the first of eq. (4.58) with the identification ξ = l_0. The second of eq. (4.59) is equivalent to {ξ(x), k^{−1} B k} = −[k^{−1} B k, x], which coincides with the second of eq. (4.58) since (l_{−1}, [x, z]) = ([l_{−1}, x], z) and l_{−1} = k^{−1} B k. It follows that the map from T*(K) to the orbit preserves the symplectic structure.

In the general Adler–Kostant–Symes construction, any function invariant under the coadjoint action of G̃^σ yields an integrable flow on our coadjoint orbit. Such functions are given by Res Tr(λ^m L^n(λ)) for any m, n. In particular, taking m = 1 and n = 2 we get the Hamiltonian:

H = −Tr(ξ^2) − 2 Tr(k^{−1} B k A)
(4.60)
where the minus sign is appropriate to get a positive Hamiltonian when K is a compact group. Recall that K acts on the left and on the right on T*(K) by ((h_L, h_R), (k, ξ)) → (h_L k h_R^{−1}, h_R ξ h_R^{−1}), so this Hamiltonian is invariant under the subgroup of K_L of elements which stabilize B, h_L^{−1} B h_L = B, and under the subgroup of K_R of elements which stabilize A, h_R^{−1} A h_R = A. More generally, under these special subgroups, the Lax matrix L is invariant under h_L and gets conjugated under h_R, so that any Hamiltonian of the hierarchy is invariant. One can then perform a Hamiltonian reduction under the action of such subgroups. These reduced Hamiltonians are in involution using the reduced Poisson bracket, because reduced Poisson brackets coincide with unreduced ones for invariant functions.

This scheme is particularly interesting when G is a real Lie group and σ is a Cartan involution. Then K is a maximal compact subgroup of G, see Chapter 16, so that G/K is a symmetric space. The situation which leads to the Kowalevski top is obtained by considering G = SO(p, q) and K = SO(p) × SO(q) with p ≥ q. The Kowalevski case corresponds to p = 3, q = 2. The group SO(p, q) is the real Lie group of pseudo-orthogonal transformations of R^{p+q} leaving invariant the metric x_1^2 + · · · + x_p^2 − x_{p+1}^2 − · · · − x_{p+q}^2. Elements of its Lie algebra may be represented as matrices of the form:

X = \begin{pmatrix} X_p & D \\ {}^tD & X_q \end{pmatrix},   (X_p, X_q) ∈ so(p) ⊕ so(q)   (4.61)

where D is an arbitrary (p, q) matrix. The subalgebra K consists of block diagonal matrices (X_p, X_q), while the subspace R consists of the off-diagonal terms. Let us write the L matrix eq. (4.57) in this context, with k = (k_p, k_q) ∈ SO(p) × SO(q) and ξ = (ξ_p, ξ_q) ∈ so(p) ⊕ so(q):

L = \begin{pmatrix} 0 & a \\ {}^ta & 0 \end{pmatrix} + \begin{pmatrix} ξ_p & 0 \\ 0 & ξ_q \end{pmatrix} λ^{−1} + \begin{pmatrix} 0 & k_p^{−1} b k_q \\ k_q^{−1}\,{}^tb\, k_p & 0 \end{pmatrix} λ^{−2}   (4.62)

Here, we have written the matrices A, B ∈ R in eq. (4.57) in the form

A = \begin{pmatrix} 0 & a \\ {}^ta & 0 \end{pmatrix},   B = \begin{pmatrix} 0 & b \\ {}^tb & 0 \end{pmatrix}

where a and b are rectangular p × q matrices. We have also computed k^{−1} B k explicitly.
To obtain the Kowalevski top we choose a specific orbit, i.e. we specify B. This amounts to choosing the rectangular p × q matrix b, which we take as:

b = \begin{pmatrix} 1_q \\ 0_{p−q} \end{pmatrix}
where 1_q means the q × q identity matrix, and 0_{p−q} means the (p − q) × q zero matrix. The subgroup of K_L which stabilizes B consists of matrices of the form:

h_L = (\tilde h_q, h_q),   \tilde h_q ≡ \begin{pmatrix} h_q & 0 \\ 0 & 1_{p−q} \end{pmatrix} ∈ SO(p)   for h_q ∈ SO(q)

where the map h_q → \tilde h_q embeds SO(q) into SO(p) and the map h_q → h_L embeds the group SO(q) into SO(p) × SO(q). Hence the reduction subgroup identifies with SO(q). At the Lie algebra level, it is realized into the Lie algebra so(p) ⊕ so(q) by pairs of matrices X_L = (\tilde X_q, X_q).

Using these embeddings we can write k_p^{−1} b k_q = r b, where r = k_p^{−1} \tilde k_q ∈ SO(p). The quantity r is invariant under the action of the reduction group since h_q · (k_p, k_q) = (\tilde h_q k_p, h_q k_q). Moreover, the action of K_L leaves ξ invariant, so we have the manifestly invariant expression for L:

L = \begin{pmatrix} 0 & a \\ {}^ta & 0 \end{pmatrix} + \begin{pmatrix} ξ_p & 0 \\ 0 & ξ_q \end{pmatrix} λ^{−1} + \begin{pmatrix} 0 & r b \\ {}^tb\, r^{−1} & 0 \end{pmatrix} λ^{−2},   r = k_p^{−1} \tilde k_q

The moment map is given by eq. (4.49). For any X_L in the Lie algebra of the reduction group it is given by:

(P^L(k, ξ), X_L) = (k_p ξ_p k_p^{−1}, \tilde X_q) + (k_q ξ_q k_q^{−1}, X_q)
= (k_q ξ_q k_q^{−1} + π_q(k_p ξ_p k_p^{−1}), X_q)

where X_q ∈ so(q), k = (k_p, k_q), ξ = (ξ_p, ξ_q), and π_q(M) is the projection operator which restricts the p × p matrix M to its q × q upper left corner. This projector appears because \tilde X_q is a matrix with X_q in the upper left q × q corner and 0 in the lower right (p − q) × (p − q) corner. We choose to reduce on the surface of zero momentum, which is given by k_q ξ_q k_q^{−1} + π_q(k_p ξ_p k_p^{−1}) = 0, that is:

ξ_q = −\frac{1}{2} π_q(J),
J = 2 r^{−1} ξ_p r ∈ so(p)
(4.63)
The factor of 2 in the definition of J is introduced for later convenience. We still need to quotient by the stability group of the moment, i.e. the whole group SO(q). However, since we deal only with invariant quantities like L, r, the Poisson brackets on the reduced phase space can be computed by simply using the Poisson brackets on the unreduced phase
space, see Chapter 14. From eq. (4.59), we easily compute the Poisson brackets that we will need later:

{ξ_p(r^{−1} X r), ξ_p(r^{−1} Y r)} = ξ_p(r^{−1} [X, Y] r),   {ξ_p(X), r} = X r,
{ξq (X), r} = −rX
(4.64)
The Hamiltonian eq. (4.60) takes the form

H = −Tr(ξ^2) − 4 Tr({}^tb F),   F = r^{−1} a
(4.65)
Using the explicit form of b, we have Tr({}^tb F) = Tr π_q(F), while Tr(ξ^2) = Tr(ξ_p^2) + Tr(ξ_q^2). By eq. (4.63), we have Tr(ξ_p^2) = (1/4) Tr(J^2) and Tr(ξ_q^2) = (1/4) Tr π_q(J)^2. In particular when (p = 3, q = 2) we use that J is a 3 × 3 antisymmetric matrix, which we write as J_{ij} = ε_{ijk} J_k, and we express the 3 × 2 matrix F as F = (Γ, Γ′) for two vectors Γ, Γ′ of components γ_i, γ′_i, i = 1, 2, 3. Then Tr π_q(F) = γ_1 + γ′_2, and Tr π_q(J)^2 = −2 J_3^2, while Tr J^2 = −2 \vec J^2, so the Hamiltonian eq. (4.60) takes the form:

H = \frac{1}{2}(J_1^2 + J_2^2 + 2 J_3^2) − 4(γ_1 + γ′_2)

The Poisson brackets eq. (4.64) translate into the following non-zero Poisson brackets for the quantities J_i, γ_i, γ′_i:

{J_i, J_j} = ε_{ijk} J_k,
{J_i, γ_j} = ε_{ijk} γ_k,
{J_i, γ′_j} = ε_{ijk} γ′_k
We recover exactly the dynamical data of the Kowalevski top, except that it is generalized to a situation with two external forces Γ and Γ′. The special Kowalevski case corresponds to Γ′ = 0.

It is now clear that J is the angular momentum. The matrix r can be identified as the rotation relating the absolute and the moving frames, so that Ω = −r^{−1} \dot r is the rotation vector. We have r^{−1} \dot r = r^{−1} {H, r} = −r^{−1} {Tr(ξ_p^2) + Tr(ξ_q^2), r}, where in the last equality we dropped the F term in the Hamiltonian, eq. (4.65), because {r_1, r_2} = 0. Using eq. (4.64), we get r^{−1} \dot r = −2 r^{−1} ξ_p r + 2 ξ_q = −J + 2 ξ_q. It follows that, on the zero momentum surface, we have:

Ω = J + π_q(J)

For (p = 3, q = 2) we denote Ω_{ij} = ε_{ijk} ω_k and we see that J_k = I_k ω_k with I_1 = I_2 = 1 and I_3 = 1/2. Finally, F = r^{−1} a represents the external forces in the rotating frame. These forces are constant in the absolute frame and given by the constant matrix a.
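The Kowalevski-type equations of motion that follow from this Hamiltonian and these brackets can be re-derived numerically. The sketch below is an illustration (not from the book): it assumes the convention ḟ = {H, f} and the standard vector form of the brackets above, J̇ = J × ∂H/∂J + γ × ∂H/∂γ + γ′ × ∂H/∂γ′ and γ̇ = γ × ∂H/∂J, with p = J_1, q = J_2, r = 2J_3.

```python
import numpy as np

rng = np.random.default_rng(1)
J = rng.normal(size=3)                         # (J1, J2, J3)
g, gp = rng.normal(size=3), rng.normal(size=3) # gamma, gamma'

p_, q_, r_ = J[0], J[1], 2 * J[2]              # p = J1, q = J2, r = 2*J3
omega = np.array([p_, q_, r_])                 # dH/dJ for H = (J1^2+J2^2+2J3^2)/2 - 4(g1+gp2)
dH_dg = np.array([-4.0, 0.0, 0.0])
dH_dgp = np.array([0.0, -4.0, 0.0])

# fdot = {H, f} in vector form for the brackets {Ji,Jj}=eps Jk, {Ji,gj}=eps gk, {Ji,g'j}=eps g'k
Jdot = np.cross(J, omega) + np.cross(g, dH_dg) + np.cross(gp, dH_dgp)
gdot = np.cross(g, omega)
gpdot = np.cross(gp, omega)

# Kowalevski-type equations of motion with two forces
assert np.isclose(2 * Jdot[0], q_ * r_ + 8 * gp[2])    # 2 p' =  qr + 8 gamma'_3
assert np.isclose(2 * Jdot[1], -p_ * r_ - 8 * g[2])    # 2 q' = -pr - 8 gamma_3
assert np.isclose(2 * Jdot[2], -8 * (gp[0] - g[1]))    #   r' = -8(gamma'_1 - gamma_2)
assert np.allclose(gdot, [r_ * g[1] - q_ * g[2], p_ * g[2] - r_ * g[0], q_ * g[0] - p_ * g[1]])
print("equations of motion verified")
```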
It is convenient to conjugate the Lax matrix on the zero momentum surface by the block diagonal (p + q) × (p + q) matrix D = Diag(r, 1_q), so that the expression of the rotated Lax matrix is:

\tilde L ≡ D^{−1} L D = \begin{pmatrix} 0 & F \\ {}^tF & 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} J & 0 \\ 0 & −π_q(J) \end{pmatrix} λ^{−1} + \begin{pmatrix} 0 & b \\ {}^tb & 0 \end{pmatrix} λ^{−2}   (4.66)

In this way the Lax matrix is naturally expressed only in terms of quantities defined in the moving frame. The Lax equation describing the motion of the matrix \tilde L can be written in the form:

\frac{d}{dt} \tilde L(λ) = [\tilde M(λ), \tilde L(λ)],   \tilde M(λ) = \begin{pmatrix} Ω & 0 \\ 0 & 0 \end{pmatrix} + \frac{2}{λ}\begin{pmatrix} 0 & b \\ {}^tb & 0 \end{pmatrix}

In fact we have \tilde M(λ) = 2(λ\tilde L(λ))_− − D^{−1} \dot D; the first term follows from the general theory, and the second term is produced by the conjugation by D in \tilde L. Explicitly, this Lax equation reads:
F˙ = ΩF,
\dot J = −[J, Ω] − 4(F\,{}^tb − b\,{}^tF),   \frac{d}{dt} π_q(J) = 4({}^tF b − {}^tb F)
which in components gives:

2\dot p = q r + 8 γ′_3        \dot γ_1 = r γ_2 − q γ_3        \dot γ′_1 = r γ′_2 − q γ′_3
2\dot q = −p r − 8 γ_3        \dot γ_2 = p γ_3 − r γ_1        \dot γ′_2 = p γ′_3 − r γ′_1
\dot r = −8(γ′_1 − γ_2)       \dot γ_3 = q γ_1 − p γ_2        \dot γ′_3 = q γ′_1 − p γ′_2
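The matrix equations of motion can be checked directly against the Lax equation for the rotated Lax matrix of eq. (4.66). The sketch below is an independent numerical verification (names are illustrative); it builds L̃(λ) and M̃(λ) from random antisymmetric J and random F, applies the matrix equations of motion, and confirms dL̃/dt = [M̃, L̃] at several values of λ.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 2
b = np.vstack([np.eye(q), np.zeros((p - q, q))])  # b = (1_q ; 0)
piq = lambda Mp: Mp[:q, :q]                        # pi_q: upper-left q x q block
emb = lambda Mq: b @ Mq @ b.T                      # embed a q x q block into p x p

J = rng.normal(size=(p, p)); J = J - J.T           # J in so(3)
F = rng.normal(size=(p, q))                        # F = (gamma, gamma')
Omega = J + emb(piq(J))                            # Omega = J + pi_q(J), embedded

# matrix equations of motion
Fdot = Omega @ F
Jdot = -(J @ Omega - Omega @ J) - 4 * (F @ b.T - b @ F.T)

zpp, zpq, zqp, zqq = (np.zeros((p, p)), np.zeros((p, q)),
                      np.zeros((q, p)), np.zeros((q, q)))

def Ltil(lam):
    return (np.block([[zpp, F], [F.T, zqq]])
            + np.block([[J / 2, zpq], [zqp, -piq(J) / 2]]) / lam
            + np.block([[zpp, b], [b.T, zqq]]) / lam**2)

def Ltildot(lam):
    return (np.block([[zpp, Fdot], [Fdot.T, zqq]])
            + np.block([[Jdot / 2, zpq], [zqp, -piq(Jdot) / 2]]) / lam)

def Mtil(lam):
    return (np.block([[Omega, zpq], [zqp, zqq]])
            + 2 * np.block([[zpp, b], [b.T, zqq]]) / lam)

for lam in (0.7, 1.3, -2.1):
    L_, M_ = Ltil(lam), Mtil(lam)
    assert np.allclose(Ltildot(lam), M_ @ L_ - L_ @ M_)
print("Lax equation reproduces the equations of motion")
```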
These equations generalize the equations of the Kowalevski top given in Chapter 2, which are recovered if Γ′ = 0 and c_0 = 8 or h = 4.

Finally, it is convenient to switch to a more compact representation of the so(3, 2) Lie algebra by 4 × 4 matrices instead of 5 × 5 matrices. We first consider the complex Lie algebra so(5, C) and view so(3, 2) as one of its non-compact real forms, obtained using the conjugation X → η \bar X η where η = Diag(1, 1, 1, −1, −1) is the metric left invariant by so(3, 2). Any element of the Lie algebra so(3, 2), in the vector representation, can be written as in eq. (4.61) with:

X_p = \begin{pmatrix} 0 & x_3 & −x_2 \\ −x_3 & 0 & x_1 \\ x_2 & −x_1 & 0 \end{pmatrix},   X_q = \begin{pmatrix} 0 & y \\ −y & 0 \end{pmatrix},   D = (Γ, Γ′)

This matrix is mapped to an antisymmetric matrix of so(5, C) by a change of basis, namely conjugation by Diag(1, 1, 1, i, i). In this basis the matrix reads:

\tilde X = \begin{pmatrix} X_p & i D \\ −i\,{}^tD & X_q \end{pmatrix}   (4.67)
Let T_{µν} = E_{µν} − E_{νµ} for µ < ν be the standard basis of antisymmetric matrices, obeying the so(5) algebra:

[T_{µν}, T_{ρσ}] = T_{µσ} δ_{νρ} − T_{ρν} δ_{µσ} + T_{σν} δ_{µρ} − T_{µρ} δ_{νσ}

It is well known, and easy to check, that this algebra can be realized with Γ matrices satisfying the anticommutation relations {Γ_µ, Γ_ν} = 2 δ_{µν}, µ, ν = 1, . . . , 5. This is called the spinorial representation of so(5, C). The T_{µν} are given by T_{µν} = (1/4)[Γ_µ, Γ_ν]. It remains to notice that the Γ_µ can be represented by 4 × 4 matrices as follows:

Γ_j = σ_1 ⊗ σ_j,   Γ_4 = σ_2 ⊗ σ_0,   Γ_5 = σ_3 ⊗ σ_0
where σ_0 = 1 and the σ_j are the 2 × 2 Pauli matrices:

σ_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},   σ_2 = \begin{pmatrix} 0 & −i \\ i & 0 \end{pmatrix},   σ_3 = \begin{pmatrix} 1 & 0 \\ 0 & −1 \end{pmatrix}   (4.68)
obeying the relations σ_i^2 = 1 and σ_i σ_j = i ε_{ijk} σ_k for i ≠ j. The matrix \tilde X in eq. (4.67) expands on the T_{µν} as

\tilde X = x_3 T_{12} − x_2 T_{13} + x_1 T_{23} + y T_{45} + i \sum_{j=1}^{3} (γ_j T_{j4} + γ′_j T_{j5})

Plugging the representation of T_{µν} in terms of the Γ_µ, we get:

\tilde X = \frac{1}{2} \begin{pmatrix} i\,x·σ − γ·σ & i y σ_0 − i γ′·σ \\ i y σ_0 + i γ′·σ & i\,x·σ + γ·σ \end{pmatrix}
We finally write the Lax pair in this 4 × 4 representation:

\tilde L = \frac{1}{2}\begin{pmatrix} −γ·σ & −i γ′·σ \\ i γ′·σ & γ·σ \end{pmatrix} + \frac{1}{4λ}\begin{pmatrix} i J·σ & −i J_3 σ_0 \\ −i J_3 σ_0 & i J·σ \end{pmatrix} + \frac{1}{2λ^2}\begin{pmatrix} −σ_1 & −iσ_2 \\ iσ_2 & σ_1 \end{pmatrix}

\tilde M = \frac{i}{2}\begin{pmatrix} ω·σ & 0 \\ 0 & ω·σ \end{pmatrix} + \frac{1}{λ}\begin{pmatrix} −σ_1 & −iσ_2 \\ iσ_2 & σ_1 \end{pmatrix}
(4.69)
It is a simple computation to check directly that the Lax equation with this Lax pair reproduces the equations of motion of the Kowalevski top.
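The Clifford and so(5) relations underlying this spinorial representation are also easy to verify by machine. The sketch below (an illustration; it takes Γ_5 = σ_3 ⊗ σ_0, but any choice satisfying the anticommutation relations would serve) builds the Γ_µ from Pauli-matrix Kronecker products and checks both {Γ_µ, Γ_ν} = 2δ_{µν} and the T_{µν} commutation relations quoted above.

```python
import numpy as np

s0 = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# spinorial representation: Gamma_j = s1 x s_j, Gamma_4 = s2 x s0, Gamma_5 = s3 x s0
G = [np.kron(s1, s) for s in (s1, s2, s3)] + [np.kron(s2, s0), np.kron(s3, s0)]

# Clifford relations {Gamma_mu, Gamma_nu} = 2 delta_{mu nu}
for m in range(5):
    for n in range(5):
        acomm = G[m] @ G[n] + G[n] @ G[m]
        assert np.allclose(acomm, 2 * (m == n) * np.eye(4))

# T_{mu nu} = [Gamma_mu, Gamma_nu]/4 satisfy the so(5) commutation relations
T = [[(G[m] @ G[n] - G[n] @ G[m]) / 4 for n in range(5)] for m in range(5)]
d = lambda a, c: float(a == c)
for m in range(5):
    for n in range(5):
        for r in range(5):
            for s in range(5):
                lhs = T[m][n] @ T[r][s] - T[r][s] @ T[m][n]
                rhs = (T[m][s] * d(n, r) - T[r][n] * d(m, s)
                       + T[s][n] * d(m, r) - T[m][r] * d(n, s))
                assert np.allclose(lhs, rhs)
print("so(5) spinor representation verified")
```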
References

[1] M. Toda, Vibration of a chain with non-linear interaction. J. Phys. Soc. Japan 22 (1967) 431.
[2] B. Kostant, The solution to the generalized Toda lattice and representation theory. Adv. Math. 34 (1979) 195–338.
[3] M. Adler, On a trace functional for formal pseudodifferential operators and the symplectic structure of the Korteweg–de Vries type equations. Inv. Math. 50 (1979) 219.
[4] E.K. Sklyanin, On the complete integrability of the Landau–Lifchitz equation. Preprint LOMI E-3-79, Leningrad, 1979.
[5] W. Symes, Systems of Toda type, inverse spectral problems and representation theory. Inv. Math. 59 (1980) 13.
[6] M. Olshanetsky and A. Perelomov, Classical integrable finite-dimensional systems related to Lie algebras. Phys. Rep. 71 (1981) 313–400.
[7] A. Belavin and V. Drinfeld, On the solutions of the classical Yang–Baxter equation for simple Lie algebras. Funct. Anal. Appl. 16, 3 (1982) 1–29.
[8] V. Drinfeld, Hamiltonian structures on Lie groups, Lie bialgebras, and the geometrical meaning of the Yang–Baxter equations. Dokl. Akad. Nauk SSSR 268 (1983) 285–287.
[9] D. Olive and N. Turok, Algebraic structure of Toda systems. Nucl. Phys. B220 (1983) 491–507.
[10] M. Semenov-Tian-Shansky, What is a classical r-matrix? Funct. Anal. Appl. 17, 4 (1983) 17.
[11] L. Ferreira and D. Olive, Non-compact symmetric spaces and the Toda molecule equations. Comm. Math. Phys. 99 (1985) 365–384.
[12] A. Bobenko, A. Reyman and M. Semenov-Tian-Shansky, The Kowalevski top 99 years later: a Lax pair, generalizations and explicit solutions. Comm. Math. Phys. 122 (1989) 321–354.
[13] A. Reyman and M. Semenov-Tian-Shansky, Group-theoretical methods in the theory of finite-dimensional integrable systems. Encyclopaedia of Mathematical Sciences, 16. Springer-Verlag (1990).
5 Analytical methods
In this chapter, we present the general ideas for solving the Lax equations with spectral parameter. The spectral curve Γ is the characteristic equation for the eigenvalues of the Lax matrix: det(L(λ) − µ) = 0. Since the Lax equation \dot L(λ) = [M(λ), L(λ)] is isospectral, the eigenvalues of L(λ) are time-independent and so is the spectral curve. At any point of the spectral curve there exists, by definition, an eigenvector of L(λ) with eigenvalue µ. We explain how we can reconstruct the eigenvector from its analyticity properties on Γ. In particular, all the dynamical information is contained in the divisor of its poles, which we call the dynamical divisor. The time evolution of this divisor is equivalent to a linear flow on the Jacobian of Γ. We give three proofs of this result. The first one proceeds by explicitly computing the time evolution of the image of the dynamical divisor by the Abel map in the Jacobian. The second one uses a special type of functions on the Riemann surface, the Baker–Akhiezer functions, to reconstruct the eigenvectors explicitly. Finally, the linearization property also follows very directly by properly interpreting the group theoretical factorization method in its Riemann–Hilbert incarnation. As a result, one can express the dynamical variables in terms of θ functions defined on the Jacobian of the spectral curve. We then show that the symplectic structure can be nicely written in terms of coordinates of the points of the dynamical divisor, hence exhibiting the interplay between analytical data and separation of variables. We finally present the application of these ideas to sketch the solution of the Kowalevski top.
5.1 The spectral curve

Let us consider an N × N Lax matrix L(λ), depending, as in Chapter 3, rationally on a spectral parameter λ ∈ C with poles at points λ_k:

L(λ) = L_0 + \sum_k L_k(λ)   (5.1)

where L_0 is independent of λ and L_k(λ) is the polar part of L(λ) at λ_k, i.e. L_k = \sum_{r=−n_k}^{−1} L_{k,r} (λ − λ_k)^r. The analytical method of solution of integrable systems is based on the study of the eigenvector equation:

(L(λ) − µ 1) Ψ(λ, µ) = 0
(5.2)
where Ψ(λ, µ) is the eigenvector with eigenvalue µ. The characteristic equation for the eigenvalue problem (5.2) is: Γ : Γ(λ, µ) ≡ det(L(λ) − µ 1) = 0
(5.3)
This defines an algebraic curve in C^2 which is called the spectral curve. We are considering here the smooth compact curve obtained from this equation by the desingularization procedure explained in Chapter 15, even if we do not mention it explicitly. A point on Γ is a pair (λ, µ) satisfying eq. (5.3). If N is the dimension of the Lax matrix, the equation of the curve is of the form:

Γ : Γ(λ, µ) ≡ (−µ)^N + \sum_{q=0}^{N−1} r_q(λ) µ^q = 0   (5.4)
The coefficients rq (λ) are polynomials in the matrix elements of L(λ) and therefore have poles at λk . Since the Lax equation L˙ = [M, L] is isospectral, these coefficients are time-independent and are related to the action variables. From eq. (5.4) we see that the spectral curve appears as an N -sheeted covering of the Riemann sphere. To a given point λ on the Riemann sphere there correspond N points on the curve whose coordinates are (λ, µ1 ), . . . , (λ, µN ), where the µi are the solutions of the algebraic equation Γ(λ, µ) = 0. By definition µi are the eigenvalues of L(λ). Our goal is to determine the analytical properties of the eigenvector Ψ(λ, µ) and see how much of L(λ) can be reconstructed from them. The result is that one can reconstruct L(λ) up to global (independent of λ) similarity transformations. This is not too surprising since the analytical properties of L(λ) and the spectral curve are invariant under global gauge
transformations consisting of similarity transformations by constant invertible matrices. So from analyticity we can only hope to recover the system where global gauge transformations have been factored away. In general, we may fix the gauge by diagonalizing L(λ) for one value of λ. To be specific, we choose to diagonalize at λ = ∞, i.e. we diagonalize the coefficient L_0:

L_0 = \lim_{λ→∞} L(λ) = diag(a_1, . . . , a_N)
(5.5)
We assume for simplicity that all the a_i are different. Then on the spectral curve we have N points above λ = ∞:

Q_i ≡ (λ = ∞, µ_i = a_i)

In the gauge (5.5) there remains a residual action which consists of conjugating the Lax matrix by constant diagonal matrices. Generically, these transformations form a group of dimension N − 1 and we will have to factor it out.

Before doing complex analysis on Γ, one has to determine its genus. A general strategy is as follows. As we have seen, Γ is an N-sheeted covering of the Riemann sphere. There is a general formula expressing the genus g of an N-sheeted covering of a Riemann surface of genus g_0 (in our case g_0 = 0). It is the Riemann–Hurwitz formula:

2g − 2 = N(2g_0 − 2) + ν
(5.6)
where ν is the branching index of the covering, see Chapter 15. Let us assume for simplicity that the branch points are all of order 2. To compute ν we observe that this is the number of values of λ where Γ(λ, µ) has a double root in µ. This is also the number of zeroes of ∂_µΓ(λ, µ) on the surface Γ(λ, µ) = 0. But ∂_µΓ(λ, µ) is a meromorphic function on Γ, and therefore the number of its zeroes is equal to the number of its poles, and it is enough to count the poles. These poles can only be located where the matrix L(λ) itself has a pole. So we are down to a local analysis around the points of Γ such that L(λ) has a pole.

Let us apply this idea to the matrix (5.1). Above a pole λ_k, we have N branches of the form µ_j = l_j/(λ − λ_k)^{n_k} + · · ·, where the l_j are the eigenvalues of L_{k,−n_k}, that are assumed all distinct. On such a branch we have ∂_µΓ(λ, µ)|_{(λ,µ_j(λ))} = \prod_{i≠j} (µ_j(λ) − µ_i(λ)), which thus has a pole of order (N − 1) n_k. Summing on all branches, the total order of the poles over λ_k is N(N − 1) n_k. Summing on all poles λ_k of L(λ), we see that the total branching index is ν = N(N − 1) \sum_k n_k. This gives:

g = \frac{N(N − 1)}{2} \sum_k n_k − N + 1
For consistency of the method it is important to observe that the genus is related to the dimension of the phase space and to the number of action variables occurring as independent parameters in eq. (5.4), which should also be half the dimension of the phase space. As we have seen in Chapter 3, L_k = (g_k · (A_k)_− · g_k^{−1})_−, where the (A_k)_− characterize the orbit and are non-dynamical. The g_k are defined modulo right multiplication by diagonal matrices, and we have in addition to quotient by global gauge transformations.

Proposition. The phase space M has dimension 2g and there are g proper action variables in eq. (5.4).

Proof. The dynamical variables are the jets of order (n_k − 1) of the g_k. This gives N^2 n_k parameters. But L_k is invariant under g_k → g_k d_k with d_k a jet of diagonal matrices of the same order. Hence the dimension of the L_k orbit is (N^2 − N) n_k. We also have the residual global gauge invariance by diagonal matrices acting as g_k → d g_k, or L(λ) → d L(λ) d^{−1}. This preserves the diagonal form of L_0. The orbits of this action are of dimension (N − 1), since the identity does not act. The phase space M is obtained by Hamiltonian reduction by this action (see Chapter 14). First one fixes the momentum, yielding (N − 1) conditions, and then one takes the quotient by the stabilizer of the momentum, which is here the whole group since it is Abelian. As a result, the dimension of the phase space is reduced by 2(N − 1), yielding:

dim M = (N^2 − N) \sum_k n_k − 2(N − 1) = 2g
Let us now count the number of independent coefficients in eq. (5.4). It is clear that r_j(λ) is a rational function of λ. The value of r_j at ∞ is known since µ_j → a_j. Note that r_j is the symmetric function σ_j(µ_1, . . . , µ_N), where the µ_i are the eigenvalues of L(λ). Above λ = λ_k, they can be written as

µ_j = \sum_{n=1}^{n_k} \frac{c_n^{(j)}}{(λ − λ_k)^n} + regular   (5.7)

where all the coefficients c_1^{(j)}, . . . , c_{n_k}^{(j)} are fixed and non-dynamical because they are the matrix elements of the diagonal matrices (A_k)_−, while the regular part is dynamical. We see that r_j(λ) has a pole of order j n_k at λ = λ_k, and so can be expressed using j \sum_k n_k parameters, namely the coefficients of all these poles. Summing over j we have altogether \frac{1}{2} N(N + 1) \sum_k n_k parameters. They are not all independent however, because in eq. (5.7) the coefficients c_n^{(j)} are non-dynamical. This implies that
the n_k highest order terms in r_j are fixed, which yields N n_k constraints on the coefficients of r_j. We are left with \frac{1}{2} N(N − 1) \sum_k n_k parameters, that is g + N − 1 parameters. It remains to take the symplectic quotient by the action of constant diagonal matrices. We assume that the system is equipped with the Poisson bracket (3.26) of Chapter 3. Consider the Hamiltonians H_n = (1/n) Res_{λ=∞} Tr(L^n(λ)) dλ, i.e. the term in 1/λ in Tr(L^n(λ)). These are functions of the r_j(λ) in eq. (5.4). We show that they are the generators of the diagonal action. First we have:

Res_{λ=∞} Tr(L^n(λ)) dλ = n Res_{λ=∞} Tr(L_0^{n−1} \sum_k L_k(λ)) dλ = n Res_{λ=∞} Tr(L_0^{n−1} L(λ)) dλ
(5.8)
since all L_k(λ) are of order 1/λ at ∞. Using the Poisson bracket we get

{H_n, L(µ)} = −Res_{λ=∞} Tr_1 \left( L_0^{n−1} ⊗ 1 \left[ \frac{C_{12}}{λ − µ}, L(λ) ⊗ 1 + 1 ⊗ L(µ) \right] \right) dλ

The term L(λ) ⊗ 1 in the commutator does not contribute because the L_0 part produces a vanishing contribution by cyclicity of the trace and all other terms are of order at least 1/λ^2. The term 1 ⊗ L(µ) yields −[L_0^{n−1}, L(µ)], which is the coadjoint action of a diagonal matrix on L(µ). Since L_0 is generic, the L_0^n generate the space of all diagonal matrices, so we get exactly N − 1 generators H_1, . . . , H_{N−1}. In the Hamiltonian reduction procedure, the H_n are the moments of the group action and are to be set to fixed (non-dynamical) values. Hence when the system is properly reduced we are left with exactly g action variables.

Example. Let us consider the example of the Neumann model. Recall the Lax matrix (see Chapter 3):

L(λ) = L_0 + \frac{1}{λ} J − \frac{1}{λ^2} K

The spectral curve can be computed as follows:

det(L(λ) − µ) = det\left( (L_0 − µ) + \frac{1}{λ} J − \frac{1}{λ^2} K \right)
= det(L_0 − µ) det\left( 1 + (L_0 − µ)^{−1} \left( \frac{1}{λ} J − \frac{1}{λ^2} K \right) \right)
= \prod_i (a_i − µ) \left( 1 + \frac{1}{λ^2} \sum_k \frac{F_k}{µ − a_k} \right)   (5.9)
where the conserved quantities F_k are given by:

F_k = x_k^2 + \sum_{j≠k} \frac{J_{kj}^2}{a_k − a_j}
(5.10)
This is because J is a projector of rank 2 and K is a projector of rank 1. The matrix P = (L_0 − µ)^{−1}(\frac{1}{λ} J − \frac{1}{λ^2} K) is of rank 2. Its image is spanned by the two vectors v_1 = (L_0 − µ)^{−1} X and v_2 = (L_0 − µ)^{−1} Y, while its kernel is the (N − 2)-dimensional space orthogonal to X and Y, which is generically supplementary to the image. We have P = 0 on the kernel, and:

P v_1 = \left( \frac{1}{λ} V(µ) − \frac{1}{λ^2} U(µ) \right) v_1 − \frac{1}{λ} U(µ) v_2
P v_2 = \left( \frac{1}{λ} (W(µ) + 1) − \frac{1}{λ^2} V(µ) \right) v_1 − \frac{1}{λ} V(µ) v_2

where the functions U(µ), V(µ), W(µ) are defined as:

U(µ) = \sum_i \frac{x_i^2}{a_i − µ},   V(µ) = \sum_i \frac{x_i y_i}{a_i − µ},   W(µ) = −1 + \sum_i \frac{y_i^2}{a_i − µ}   (5.11)
From this it follows that:

det(1 + P) = 1 − \frac{1}{λ^2} \left( V^2(µ) − U(µ) W(µ) \right) = 1 + \frac{1}{λ^2} \sum_k \frac{F_k}{µ − a_k}   (5.12)
which yields the result. Incidentally this proves formula (2.24) in Chapter 2. Since we have already found an r-matrix for the Neumann system, this proves its integrability. The spectral curve can be written in the form:

λ^2 = −\frac{\sum_k F_k \prod_{i≠k} (µ − a_i)}{\prod_i (µ − a_i)} = −\frac{\prod_i (µ − b_i)}{\prod_i (µ − a_i)}   (5.13)

Performing the birational transformation (see Chapter 15) λ′ = λ \prod_i (µ − a_i), we get:

λ′^2 = −\prod_{i=1}^{N} (µ − a_i) \prod_{i=1}^{N−1} (µ − b_i)   (5.14)
which is a hyperelliptic curve of genus g = N − 1. Note that the phase space is of dimension 2(N − 1) and that we have (N − 1) independent conserved quantities, namely the N quantities Fk modulo the relation
\sum_k F_k = 1. Let us remark that in this case we do not quotient by the diagonal action. This is because the diagonal action does not preserve the particular form of the Lax matrix in terms of the vectors X and Y. In this case the Hamiltonians H_n = Tr(L_0^{n−1} J) identically vanish.

To illustrate the discussion of N-sheeted coverings, it is instructive to consider the covering projection (λ, µ) → λ, which allows us to see the spectral curve as an N-sheeted covering of the Riemann sphere of the variable λ. To compute the branching index of this covering we have to find the total number of poles of ∂_µΓ(λ, µ). Such poles can only occur when λ = ∞ or µ = ∞. First, above λ = ∞ we have the N points µ = a_i. These are not branch points and the local parameter is 1/λ. Around such a point we have by eq. (5.9):
(λ, µ) → Qi = (∞, ai ),
µ = ai − Fi /λ2 + O(1/λ4 )
(5.15)
hence ∂_µΓ = λ^2 \prod_{j≠i} (a_i − a_j) + O(1). We thus have N double poles at these points. When µ → ∞ we have, by eq. (5.9), λ^2 ≈ −1/µ → 0 and λ is again a local parameter,

(λ, µ) → (0, ∞),   µ = −\frac{1}{λ^2} + O(1)   (5.16)

At this point we have ∂_µΓ = −(−1/λ^2)^{N−2} + · · ·, hence we have a pole of order 2N − 4. So the branching index is ν = 4N − 4. This yields g = N − 1. Note that Γ(λ, µ) is a very non-generic polynomial of degree 2N − 1 and the orbit to which L(λ) belongs is a very low-dimensional one; nevertheless all the numbers fit nicely.
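The determinant identity behind eq. (5.12) can be tested numerically. The sketch below is an illustration: it uses the standard Neumann variables with J_{kj} = x_k y_j − x_j y_k (our assumption for the angular momentum entries, as in Chapter 3), checks that Σ_k F_k = 1 on the sphere x·x = 1, and verifies V² − UW = −Σ_k F_k/(µ − a_k) at a generic point µ.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5
a = np.sort(rng.normal(size=N))          # distinct a_i (generically)
x = rng.normal(size=N)
y = rng.normal(size=N)
x = x / np.linalg.norm(x)                # Neumann constraint x.x = 1

Jm = np.outer(x, y) - np.outer(y, x)     # J_kj = x_k y_j - x_j y_k
Fc = np.array([x[k]**2 + sum(Jm[k, j]**2 / (a[k] - a[j]) for j in range(N) if j != k)
               for k in range(N)])
assert np.isclose(Fc.sum(), 1.0)         # sum of the conserved quantities

mu = 0.7351                              # a generic point
U = np.sum(x**2 / (a - mu))
V = np.sum(x * y / (a - mu))
W = -1.0 + np.sum(y**2 / (a - mu))

# the identity behind eq. (5.12): V^2 - U W = -sum_k F_k/(mu - a_k)
assert np.isclose(V**2 - U * W, -np.sum(Fc / (mu - a)))
print("Neumann spectral curve identity verified")
```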
5.2 The eigenvector bundle

The aim of this chapter is to present the general procedure for solving integrable models using analytical properties on the spectral curve. In order to simplify the exposition we shall assume that all functions or matrices are generic. We wish to examine how the Lax matrix can be reconstructed from the analytic data characterizing its eigenvectors. This analysis will exhibit the special role played by the divisor D of the poles at finite distances of the eigenvector Ψ(P). This divisor contains all the dynamical information.

Let P be a point on the spectral curve. We assume that P = (λ, µ) is not a branch point, so that all eigenvalues of L(λ) are distinct and the eigenspace at P is one-dimensional. Let Ψ(P) be an eigenvector, and
ψ_j(P) its N components:

Ψ(P) = \begin{pmatrix} ψ_1(P) \\ \vdots \\ ψ_N(P) \end{pmatrix}

Since the normalization of the eigenvector Ψ(λ, µ) is arbitrary, one has to make a choice before making a statement about its analytical properties. We choose to normalize it such that its first component is equal to one, i.e.

ψ_1(P) = 1,
at any point P ∈ Γ.
It is then clear that the ψ_j(P) depend locally analytically on P. As a matter of fact:

Proposition. With the above normalization, the components of the eigenvector Ψ(P) at the point P = (λ, µ) are meromorphic functions on the spectral curve Γ.

Proof. For a generic point P on the curve Γ, i.e. for a pair P = (λ, µ) satisfying eq. (5.3), there exists a unique eigenvector Ψ(P) of the matrix L(λ) normalized by the condition ψ_1(P) = 1. The un-normalized components ψ_i(P) can be taken as suitable minors ∆_i(P) of the matrix L(λ) − µ1, and are thus meromorphic functions on Γ. After dividing by ∆_1(λ, µ) to normalize the first component, all the other components ψ_j(P) are still meromorphic functions on Γ.

With each point P = (λ, µ) on Γ we associate a meromorphic eigenvector Ψ(P). At a branch point, however, special care must be taken since there could be several eigenvectors associated with that point. We show that, for a generic Lax matrix, the eigenspaces are one-dimensional even at a branch point P. Moreover, the eigenspaces around P admit a unique analytic continuation at P, irrespective of the branch chosen.

Proposition. With each point P in Γ we associate the eigenspace at P. This allows us to define an analytic line bundle that we call the eigenvector bundle.

Proof. The first point is to show that the eigenspace at P is of dimension 1 at each point of Γ, even at a branch point, in the generic case. Consider the matrix A ≡ L(λ) − µ. The fact that we are on a branch point of the curve Γ is expressed by two algebraic equations in the coefficients of A: Γ(λ, µ) = 0 and ∂_µΓ(λ, µ) = 0. To say that the kernel of
A at P is of dimension ≥ 2 means that the dimension of its image is ≤ (N − 2). Let us show that this implies at least three algebraic equations on the coefficients of A. Let v_1, . . . , v_N be the columns of A. We assume for simplicity that the kernel of A at P is of dimension 2 and that v_1, . . . , v_{N−2} are independent. First we impose that v_1, . . . , v_N are linearly dependent, producing one condition det A = 0, which is the equation of the spectral curve. Then we impose that v_1, . . . , v_{N−1} are dependent on v_1, . . . , v_{N−2}. This is expressed by the vanishing of two (N − 1) × (N − 1) minors. One of these conditions is equivalent to ∂_µΓ(λ, µ) = 0, but there remains another independent one, hence the variety of such matrices is of codimension 1. Another way of saying this is that when a matrix has coinciding eigenvalues it can be put in the Jordan form, but is not diagonalizable in general.

We now construct an abstract line bundle starting from the dimension 1 eigenspace E_P at each point P, see Chapter 15. We call e_1, . . . , e_N the canonical basis of the ambient space in which the eigenvectors live. Define N open sets U_i on the curve Γ by the conditions U_i = {P ∈ Γ | there exists V ∈ E_P such that V_i = (V, e_i) ≠ 0}, meaning that the eigenspace E_P is not perpendicular to e_i. Obviously the U_i form an open covering of Γ. On each intersection U_i ∩ U_j we define transition functions t_{ij}(P) by t_{ij}(P) = V_i/V_j, where V is any non-zero eigenvector at P. The quotient is independent of the choice of V, and the components V_i, V_j do not vanish on U_i ∩ U_j. In view of the argument of the previous proposition it is clear that t_{ij}(P) is analytic with respect to the point P on Γ and non-vanishing. Finally, the cocycle condition t_{ij} t_{jk} = t_{ik} is trivially satisfied on U_i ∩ U_j ∩ U_k. Hence these transition functions define an analytic line bundle which we call the eigenvector bundle. Any meromorphic section of this bundle can be described as a collection (V_1(P), . . . , V_N(P)), where V_i(P) is defined and meromorphic on U_i and V_i(P) = t_{ij}(P) V_j(P) on U_i ∩ U_j. One can see this collection as a P-dependent vector lying in the eigenspace E_P with components V_i(P) on e_i.
Remark 1. Alternatively one can define normalized eigenvectors ψ^{(i)}(P) with ψ_i^{(i)}(P) = 1. Then U_i is the open set where ψ^{(i)}(P) remains finite. The transition function t_{ij}(P) = ψ_i^{(j)}(P) = 1/ψ_j^{(i)}(P) has no zero nor pole on U_i ∩ U_j.
Remark 2. It may be useful to understand the situation at branch points on the simple example of 2 × 2 matrices. Let
$$L(\lambda) = \begin{pmatrix} a(\lambda) & b(\lambda) \\ c(\lambda) & d(\lambda) \end{pmatrix}$$
which has eigenvalues $\mu_\pm(\lambda) = \frac{1}{2}(a(\lambda) + d(\lambda)) \pm \frac{1}{2}\sqrt{\Delta(\lambda)}$ with $\Delta(\lambda) = (a(\lambda) - d(\lambda))^2 + 4b(\lambda)c(\lambda)$. The corresponding normalized eigenvectors are:
$$\Psi_\pm = \begin{pmatrix} 1 \\ \psi_\pm \end{pmatrix}, \qquad \psi_\pm = \frac{d(\lambda) - a(\lambda)}{2b(\lambda)} \pm \frac{\sqrt{\Delta(\lambda)}}{2b(\lambda)}$$
Assume that λ0 is a root of ∆(λ) = 0. It is obvious that when λ → λ0, Ψ+ and Ψ− tend smoothly to the same limit except if one has also b(λ0) = 0. If b(λ0) ≠ 0 one can express L(λ0) in the basis given by Ψ(λ0) and (0, 1), getting
$$L(\lambda_0) \to \begin{pmatrix} \frac{1}{2}(a(\lambda_0) + d(\lambda_0)) & b(\lambda_0) \\ 0 & \frac{1}{2}(a(\lambda_0) + d(\lambda_0)) \end{pmatrix}$$
from which it is obvious that L(λ0) is of the Jordan form and has just one eigenvector. If, however, b(λ0) = 0 then we also have a(λ0) = d(λ0). Assuming that d(λ) − a(λ) and b(λ) vanish to first order in λ − λ0, then (d(λ) − a(λ))/2b(λ) tends to some limit ψe. We see that $\psi_\pm \sim \psi_e \pm \sqrt{c(\lambda)/b(\lambda)}$. Hence if c(λ0) ≠ 0 we still have only one eigenvector, of the form (0, 1), while if c(λ) also vanishes to first order in λ − λ0 the matrix L(λ0) is diagonalizable, and the eigenvectors Ψ± tend generically to different limits at λ0. However, in this case we have ∆ ∼ (λ − λ0)², so that the corresponding point (λ0, µ0) of the spectral curve is not a branch point, but a singular point. Upon desingularization it blows up to two points and the two values of Ψ are perfectly natural. Of course this analysis clearly covers what happens at a branch point of order 2 in the general case.
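The coalescence mechanism of this remark is easy to see numerically. The following is a minimal sketch (assuming numpy; the polynomial entries a, b, c, d are invented for illustration): at a root λ0 of ∆(λ), the two eigenvalues merge, and since b(λ0) ≠ 0 there, the matrix L(λ0) has a one-dimensional eigenspace, i.e. it is of Jordan type.

```python
import numpy as np

# Hypothetical polynomial entries of a 2x2 Lax matrix (illustration only).
def L(lam):
    return np.array([[1.0 + lam, 2.0],
                     [3.0 + lam, -lam]])

def Delta(lam):
    M = L(lam)
    return (M[0, 0] - M[1, 1])**2 + 4 * M[0, 1] * M[1, 0]

# Here Delta(lam) = (1 + 2*lam)**2 + 8*(3 + lam) = 4*lam**2 + 12*lam + 25.
lam0 = np.roots([4, 12, 25])[0]      # a (complex) branch point of the covering
M0 = L(lam0)
mu0 = (M0[0, 0] + M0[1, 1]) / 2      # the doubly degenerate eigenvalue

assert abs(Delta(lam0)) < 1e-10
# b(lam0) = 2 != 0, so L(lam0) is a Jordan block: rank(L - mu0*1) = 1,
# i.e. a single eigendirection instead of two.
assert np.linalg.matrix_rank(M0 - mu0 * np.eye(2), tol=1e-8) == 1
```

Perturbing λ slightly away from λ0 splits µ± again and the two eigenvectors Ψ± reappear, tending to the common limit as λ → λ0.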
We now compute the Chern class of the eigenvector bundle. To do that we view Ψ(P) as a meromorphic section of our bundle in the above way, i.e. as the collection defined respectively on U1, . . . , UN: (ψ1(P) = 1, ψ2(P), . . . , ψN(P)). Notice that this section does not vanish because ψi(P) does not vanish on Ui by definition. We compute the number of poles of this section, which yields the Chern class.

The number of poles of the normalized eigenvectors cannot be deduced by simply counting the number of zeroes of minors. Indeed, let $\widehat\Delta(\lambda, \mu)$ be the matrix of cofactors of (L(λ) − µ1), which, by definition, is such that $(L(\lambda) - \mu 1)\,\widehat\Delta = \Gamma(\lambda, \mu)\,1$. Therefore at P = (λ, µ) ∈ Γ, the matrix $\widehat\Delta(P)$ is a matrix of rank 1, since the kernel of (L(λ) − µ1) is of dimension 1. Hence, for P ∈ Γ the matrix elements of $\widehat\Delta(P)$ are of the form $\alpha_i(P)\beta_j(P)$, and the components of the normalized eigenvector are
$$\psi_i(P) = \frac{\alpha_i(P)\beta_1(P)}{\alpha_1(P)\beta_1(P)} = \frac{\alpha_i(P)}{\alpha_1(P)}$$
We thus expect cancellations to occur when we take the ratio of the minors, and we cannot deduce the number of poles of the normalized eigenvector by simply counting the number of zeroes of the first minor.

Proposition. We say that the vector Ψ(P) possesses a pole if one of its components has a pole. The number of poles of the normalized vector Ψ(P) is:
$$m = g + N - 1 \tag{5.17}$$
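The rank-one property of the cofactor matrix, and the fact that its columns are all proportional to the eigenvector, can be illustrated numerically. A sketch assuming numpy, with a random matrix standing in for L(λ) − µ1 at a point of the curve:

```python
import numpy as np

def adjugate(A):
    """Matrix of cofactors, transposed, so that A @ adjugate(A) = det(A) * I."""
    n = A.shape[0]
    C = np.zeros_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, 0), j, 1)
            C[j, i] = (-1)**(i + j) * np.linalg.det(minor)
    return C

rng = np.random.default_rng(0)
L = rng.normal(size=(4, 4))
mu = np.linalg.eigvals(L)[0]              # a simple eigenvalue of L
A = (L - mu * np.eye(4)).astype(complex)  # plays the role of L(lambda) - mu*1

adj = adjugate(A)
assert np.allclose(A @ adj, np.zeros((4, 4)), atol=1e-8)   # det(A) = 0 here
assert np.linalg.matrix_rank(adj, tol=1e-8) == 1           # entries alpha_i * beta_j
# any non-zero column of adj is an eigenvector of L with eigenvalue mu
v = adj[:, np.argmax(np.abs(adj).sum(axis=0))]
assert np.allclose(L @ v, mu * v, atol=1e-8)
```

Dividing such a column by its first component reproduces the normalization ψ1 = 1 used in the text.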
Proof. Let us introduce the function W(λ) of the complex variable λ defined by:
$$W(\lambda) = \det{}^2\,\widehat\Psi(\lambda)$$
where $\widehat\Psi(\lambda)$ is the matrix of eigenvectors of L(λ) defined as follows:
$$\widehat\Psi(\lambda) = \begin{pmatrix} \psi_1(P_1) & \psi_1(P_2) & \cdots & \psi_1(P_N) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_N(P_1) & \psi_N(P_2) & \cdots & \psi_N(P_N) \end{pmatrix} \tag{5.18}$$
where the points Pi are the N points above λ. In this formula ψ1(Pj) = 1. Changing the normalization of the eigenvectors Ψ(Pj) amounts to multiplying $\widehat\Psi(\lambda)$ on the right by a diagonal matrix. By definition $\widehat\Psi(\lambda)$ is the matrix diagonalizing L(λ). The function W(λ) is well-defined as a rational function of λ on the Riemann sphere since the square of the determinant does not depend on the order of the Pj. It has a double pole where Ψ(P) has a simple pole. To count its poles, we count its zeroes. We show that W(λ) has a simple zero for values of λ corresponding to a branch point of the covering, therefore m = ν/2. Recall from eq. (5.6) that the number of branch points is ν = 2(N + g − 1).

First notice that W(λ) only vanishes at branch points, where there are at least two identical columns. Indeed, let $P_i = (\lambda, \mu_i)$ be the N points above λ. Then the Ψ(Pi) are the eigenvectors of L(λ) corresponding to the eigenvalues µi and are thus linearly independent when all the µi are different. Therefore W(λ) cannot vanish at such a point. The other possibility for the vanishing of W(λ) would be that the vector Ψ(P) itself vanishes at some point (all components have a common zero at this point), but this is impossible because the first component is always 1. Let us assume now that λ0 corresponds to a branch point, which is generically of order 2. At such a point W(λ) has a simple zero. Indeed, let z be an analytical parameter on the curve around the branch point. The covering projection P → λ gets expressed as λ = λ0 + λ1 z² + O(z³). The determinant vanishes to order z, hence W vanishes to order z², which is precisely proportional to λ − λ0. A similar analysis can be performed if the branch point is of higher order.

We now need to examine the behaviour of the eigenvector around λ = ∞. At the N points Qi above λ = ∞, the eigenvectors are proportional to the canonical vectors ei, (ei)k = δik, since L(λ = ∞) is diagonal, cf. eq. (5.5).
While this is compatible with the normalization ψ1 (P ) = 1 at the
point Q1, it is not compatible at the points Qi, i ≥ 2, if the proportionality factor remains finite. The situation is described more precisely by the following:

Proposition. The k-th component ψk(P) of Ψ(P) has a simple pole at Qk and vanishes at Q1 for k = 2, 3, . . . , N.

Proof. Around Qk (λ = ∞, µ = ak), k = 1, . . . , N, the eigenspace of L(λ) is spanned by a vector of the form Vk(λ) = ek + O(1/λ). The first component of Vk is Vk1 = δ1k + O(1/λ). To get the normalized Ψ one has to divide Vk by Vk1. So we get:
$$\Psi(P)\big|_{P\sim Q_1} = \begin{pmatrix} 1 \\ O(1/\lambda) \\ \vdots \\ O(1/\lambda) \end{pmatrix}, \qquad \Psi(P)\big|_{P\sim Q_k} = \begin{pmatrix} 1 \\ O(1) \\ \vdots \\ O(\lambda) \\ \vdots \\ O(1) \end{pmatrix}, \quad k \ge 2 \tag{5.19}$$
where O(λ), in the k-th slot, is the announced pole of the k-th component of Ψ(P)|P∼Qk.

The previous proposition shows that fixing the gauge by imposing that L(λ) is diagonal at λ = ∞ introduces N − 1 poles at the positions Qi, i = 2, . . . , N. The location of these poles is independent of time, and is really part of the choice of the gauge condition. These poles do not contain any dynamical information. Only the positions of the other g poles have a dynamical significance. Let D be the divisor of these dynamical poles. We call it the dynamical divisor. Recall that the vector Ψ(P) possesses a pole if one of its components has a pole. Therefore the two previous propositions tell us that the divisor of the k-th component of the eigenvector Ψ(P) is bigger than (−D + Q1 − Qk). This information is enough to reconstruct the eigenvectors and the Lax matrix.

Proposition. Let D be a generic divisor on Γ of degree g. Up to normalization, there is a unique meromorphic function ψk(P) with divisor (ψk) ≥ −D + Q1 − Qk.

Proof. This is a direct application of the Riemann–Roch theorem, since ψk is required to have g + 1 poles and one prescribed zero. Hence it is generically unique apart from multiplication by a constant ψk → dk ψk.
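The dimension count behind this uniqueness can be spelled out. A sketch of the standard argument: writing $\ell(D')$ for the dimension of the space of meromorphic functions $f$ with $(f) \ge -D'$, and applying Riemann–Roch to $D' = D + Q_k - Q_1$, whose degree is $g + 1 - 1 = g$:

```latex
% Riemann--Roch on the curve \Gamma of genus g, for D' = D + Q_k - Q_1:
\ell(D') - \ell(K_\Gamma - D') \;=\; \deg D' - g + 1 \;=\; g - g + 1 \;=\; 1 .
% For generic D the divisor K_Gamma - D' is non-special, so ell(K_Gamma - D') = 0,
\ell(D') \;=\; 1 ,
% i.e. psi_k exists and is unique up to the multiplicative constant d_k.
```

The prescribed zero at $Q_1$ is encoded in the coefficient $-1$ of $Q_1$ in $D'$, and genericity of $D$ is exactly what makes $K_\Gamma - D'$ non-special.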
Equipped with these functions ψk(P) for k = 2, . . . , N we construct a vector function with values in $\mathbb{C}^N$:
$$\Psi(P) = \begin{pmatrix} 1 \\ \psi_2(P) \\ \vdots \\ \psi_N(P) \end{pmatrix}$$
The normalization ambiguity of the ψk translates into left multiplication of the vector Ψ(P) by a constant diagonal matrix d = diag(1, d2, . . . , dN). We have constructed a line bundle on the Riemann surface Γ, which is the line bundle associated with the divisor −D − Q2 − · · · − QN of degree −(g + N − 1) (see Chapter 15). In fact we have constructed an embedding of this line bundle into Γ × $\mathbb{C}^N$.

Theorem. Given the spectral curve Γ, such that above the points λk the N branches satisfy eq. (5.7), there exists a unique matrix L(λ), rational in λ, such that
$$(L(\lambda) - \mu 1)\,\Psi(P) = 0$$
This matrix has poles at the points λk and satisfies the boundary condition limλ→∞ L(λ) = diag(a1, . . . , aN).

Proof. Consider the matrix $\widehat\Psi(\lambda)$ whose columns are the vectors Ψ(Pi), where Pi = (λ, µi) are the N points above λ, cf. eq. (5.18). This matrix depends on the ordering of the columns, i.e. on the ordering of the points Pi. However, the matrix
$$L(\lambda) = \widehat\Psi(\lambda)\cdot\widehat\mu\cdot\widehat\Psi^{-1}(\lambda) \tag{5.20}$$
does not depend on this ordering and is a well-defined function on the base curve. Here $\widehat\mu$ is the diagonal matrix $\widehat\mu = \mathrm{diag}(\mu_1, \ldots, \mu_N)$. One has to examine the poles of the right-hand side of eq. (5.20). At a generic branch point two columns of the matrix $\widehat\Psi$ coalesce and its determinant has a simple zero with respect to the local parameter. These zeroes are the only zeroes of $\det\widehat\Psi(\lambda)$. This is because the meromorphic function $W(\lambda) = (\det\widehat\Psi)^2(\lambda)$ is a function of λ and has 2(N + g − 1) poles, since $\widehat\Psi$ has (g + N − 1) poles. The function W(λ) has the same number of zeroes and poles, hence also has 2(N + g − 1) zeroes. At the branch points it behaves like z² ∼ (λ − λb) (where z is a local parameter), hence has a simple zero. Thus the branch points contribute ν = 2(N + g − 1) zeroes, which are all the zeroes of W(λ).
We now show that, at a branch point, the matrix (5.20) is regular. Recall that if $\widehat\Delta$ is the matrix of cofactors of $\widehat\Psi$ we have $\widehat\Psi\cdot\widehat\Delta = \det\widehat\Psi\;1$, so:
$$L(\lambda) = \frac{1}{\det\widehat\Psi}\,\widehat\Psi\cdot\widehat\mu\cdot\widehat\Delta \tag{5.21}$$
At the branch point, $\widehat\Psi\cdot\widehat\Delta = 0$, thus $\mathrm{Im}\,\widehat\Delta = \mathrm{Ker}\,\widehat\Psi$, which is one-dimensional. We may assume without loss of generality that the two eigenvectors that coalesce are the first two columns of $\widehat\Psi$. So, at the branch point, the kernel of $\widehat\Psi$ is spanned by e1 − e2, where the ei are the canonical base vectors, (ei)j = δij. Since the first two diagonal elements of $\widehat\mu$ also become equal, we see that $\widehat\mu$ acts as a scalar on $\mathrm{Im}\,\widehat\Delta$, so that $\widehat\Psi\cdot\widehat\mu\cdot\widehat\Delta = 0$ there. Hence the numerator of eq. (5.21) has a simple zero at the branch point, cancelling the simple pole coming from the determinant. Therefore the matrix L(λ) is a rational function of the parameter λ. It has poles only at the projections of the points where µ has poles, i.e. at the points λk, see eq. (5.7). At λ = ∞, the leading part of $\widehat\Psi(\lambda)$ is diagonal since it is dominated by the functions ψk(P) with P approaching Qk. Therefore at infinity L(λ) goes to $\widehat\mu|_{\lambda=\infty}$ = diag(a1, . . . , aN).

This theorem is a crucial step of this method of resolution. It says that, once the spectral curve has been given, which amounts to giving the values of the integrals of motion, all remaining dynamical data are encoded in the divisor D. In other words, this theorem teaches us that the dynamical variables are the action variables and the points of this divisor. It should be emphasized, however, that Ψ(P) is defined up to left multiplication by diagonal matrices ψk → dk ψk. On the Lax matrix L(λ) this amounts to a conjugation by a constant diagonal matrix. Hence the object we reconstruct is actually the Hamiltonian reduction of the dynamical system by this group of diagonal matrices, as emphasized at the beginning of this chapter.

Remark. It is worth comparing the reconstruction formula (5.20) for the Lax matrix with the local analysis of Section 3.2 in Chapter 3. Recall that there we explained that the pair of matrices L(λ) and M(λ) could be diagonalized simultaneously, locally around each pole λk. Explicitly, the diagonalization formula (3.8) for L(λ) was L(λ) = gk Ak gk⁻¹ with gk and Ak power series in (λ − λk) and Ak diagonal. Of course gk is determined up to right multiplication by a diagonal matrix. The expression (5.20) of L(λ) in terms of the eigenvectors is simply a global version of this local statement.

Example. Let us illustrate the analytical properties of the eigenvector bundle on the example of the Neumann model. There is a simple
description of the eigenvectors in this case. Indeed, from eq. (3.3) in Chapter 3,
$$L(\lambda)\Psi = \mu\Psi \;\Longrightarrow\; \Psi = -(L_0 - \mu)^{-1}\left(\frac{1}{\lambda}J\Psi - \frac{1}{\lambda^2}K\Psi\right) \tag{5.22}$$
Since JΨ = (Y · Ψ)X − (X · Ψ)Y and KΨ = (X · Ψ)X, we see that Ψ is known once we know its projections on X and Y. Projecting eq. (5.22) on X and Y one gets a 2 × 2 system:
$$\begin{pmatrix} 1 - \frac{1}{\lambda}V(\mu) - \frac{1}{\lambda^2}U(\mu) & \frac{1}{\lambda}U(\mu) \\[4pt] -\frac{1}{\lambda}\bigl(1 + W(\mu)\bigr) - \frac{1}{\lambda^2}V(\mu) & 1 + \frac{1}{\lambda}V(\mu) \end{pmatrix} \begin{pmatrix} X\cdot\Psi \\ Y\cdot\Psi \end{pmatrix} = 0$$
The vanishing of the determinant of this linear system is precisely the equation of the spectral curve:
$$\lambda^2 = V^2(\mu) - U(\mu)W(\mu)$$
Solving this system for X · Ψ and Y · Ψ, and inserting back into eq. (5.22), one gets:
$$\frac{\psi_k}{\psi_1} = \frac{a_1 - \mu}{a_k - \mu}\cdot\frac{(\lambda - V(\mu))x_k + U(\mu)y_k}{(\lambda - V(\mu))x_1 + U(\mu)y_1}$$
Let us check the general results on these explicit formulae. Recalling the expansion (5.15), we see that (a1 − µ)/(ak − µ) has a double zero at Q1 (λ = ∞, µ = a1) and a double pole at Qk (λ = ∞, µ = ak). Consider the meromorphic function φk = (λ − V(µ))xk + U(µ)yk. It has poles at the points Qi. Using eq. (5.15) we see that it has double poles at Qi, i ≠ k, and a simple pole at Qk. In fact, near Qi one has $-V(\mu)x_k + U(\mu)y_k = \lambda^2 x_i J_{ki}/F_i + O(1)$. This shows that ψk/ψ1 has a simple zero at Q1 and a simple pole at Qk. To find the other poles of ψk/ψ1 we study the zeroes of φk, which has (2N − 1) poles and therefore (2N − 1) zeroes. Among them, N are common to all the functions φk and cancel in ψk/ψ1. Indeed, a common zero is such that λ − V(µ) = U(µ) = 0. By eq. (5.12) the points satisfying these two equations are on the spectral curve. The equation U(µ) = 0 has (N − 1) roots µj at finite µ, and λj = V(µj) selects one of the two points above µj. In addition we have the point λ = 0, µ = ∞, which is a simple zero in view of eq. (5.16). Finally, ψk/ψ1 has a zero at Q1, a pole at Qk, and g = N − 1 poles at finite distance depending on the dynamical data, in agreement with the general considerations of this section.

5.3 The adjoint linear system

In view of eq. (5.20) for the Lax matrix, it is important to compute $\widehat\Psi^{-1}(\lambda)$ in an efficient way. This is achieved by introducing the solution Ψ⁺(P) of
the adjoint linear system:
$$\Psi^+(P)\,(L(\lambda) - \mu 1) = 0 \tag{5.23}$$
Here Ψ⁺(P) is a row vector. The precise relation between Ψ⁺(P) and the matrix $\widehat\Psi^{-1}(\lambda)$ is provided by the following:

Proposition. Let Ψ⁺(P) be a solution of the adjoint system (5.23). The inverse of the matrix $\widehat\Psi(\lambda)$ defined in eq. (5.18) is the matrix whose rows are the N row vectors
$$\Psi^{(-1)}(P_j) = \frac{\Psi^+(P_j)}{\langle \Psi^+(P_j)\Psi(P_j)\rangle} \tag{5.24}$$
with Pj the N points above λ and $\langle V W\rangle = \sum_i V_i W_i$.
Proof. One has to show that
$$\sum_{k=1}^{N} \frac{\psi_k^+(P_j)}{\langle \Psi^+(P_j)\Psi(P_j)\rangle}\,\psi_k(P_i) = \delta_{ij}$$
where Pi and Pj are two points of Γ above the same λ. This is obvious for i = j, and for i ≠ j, Ψ(Pi) and Ψ⁺(Pj) are orthogonal. Indeed, computing Ψ⁺(Pj)L(λ)Ψ(Pi) in two different ways we find $\mu(P_j)\,\langle \Psi^+(P_j)\Psi(P_i)\rangle = \langle \Psi^+(P_j)\Psi(P_i)\rangle\,\mu(P_i)$, hence the scalar product vanishes if µ(Pi) and µ(Pj) are different. So we get:
$$\widehat\Psi^{-1}(\lambda) = \begin{pmatrix} \Psi^+(P_1)/\langle \Psi^+(P_1)\Psi(P_1)\rangle \\ \vdots \\ \Psi^+(P_N)/\langle \Psi^+(P_N)\Psi(P_N)\rangle \end{pmatrix} \tag{5.25}$$
We now use this relation between Ψ⁺(P) and $\widehat\Psi^{-1}(\lambda)$ to reconstruct them from their analyticity properties. One may perform on Ψ⁺(P) the same analysis as for the vector Ψ(P). Normalizing the first component of Ψ⁺(P) to 1, one sees that the vector Ψ⁺(P) has g poles at a divisor D⁺ and (N − 1) simple poles at Q2, . . . , QN above λ = ∞. Moreover, ψk⁺(P), k ≥ 2, has a zero at Q1. Our first task is to relate the divisor D⁺ to D.

Proposition. Let Ψ⁺(P) be the solution of eq. (5.23) normalized by ψ1⁺(P) = 1. The differential form
$$\Omega \equiv \frac{d\lambda}{\langle \Psi^+(P)\Psi(P)\rangle} \tag{5.26}$$
is an Abelian differential of the second kind with a double pole at Q1 and zeroes at D and D⁺. Conversely, there is a unique differential Ω of the second kind with a double pole at the point Q1 and having among its zeroes the g points of D. Its g other zeroes are then completely fixed and define D⁺. Its image under the Abel map is given by:
$$A(D^+) = -A(D) + A(B) - 2\sum_{j=2}^{N} A(Q_j) \tag{5.27}$$
where B is the divisor of branch points of the covering (λ, µ) → λ.

Proof. Consider the meromorphic function f(P) = ⟨Ψ⁺(P)Ψ(P)⟩. It has 2(g + N − 1) poles coming from the poles of Ψ and Ψ⁺. Their divisor is $D + D^+ + 2\sum_{j=2}^{N} Q_j$. Therefore it also has 2(g + N − 1) zeroes, which are in fact the branch points of the covering, as we now see (recall that the covering has 2(g + N − 1) branch points by the Riemann–Hurwitz formula). So let P = (λ0, µ0) be a branch point and consider two points P1 and P2 above the same λ close to λ0 on the two sheets of the covering that coalesce at P. Because Ψ⁺(P1) and Ψ(P2) are dual eigenvectors corresponding to different eigenvalues, they are orthogonal:
$$\langle \Psi^+(P_1)\Psi(P_2)\rangle = 0 \tag{5.28}$$
The assertion then follows by continuity, since Ψ(P2) → Ψ(P) and Ψ⁺(P1) → Ψ⁺(P) (recall that the line bundles are analytic at P). At a branch point P we have (µ − µ0)² ∼ (λ − λ0), so µ − µ0 is a local parameter and dλ ∼ (µ − µ0)dµ vanishes at P. Moreover, dλ has double poles at the points Q1, . . . , QN above λ = ∞. We see that Ω is regular at the branch points, has a double pole at Q1, and has zeroes at D + D⁺.

Recall that given a point P ∈ Γ and a divisor D = (γ1, . . . , γg) of degree g on Γ, the Abel map with base point P0 is defined by:
$$A_j(P) = \int_{P_0}^{P} \omega_j \qquad\text{and}\qquad A_j(D) = \sum_{i=1}^{g} \int_{P_0}^{\gamma_i} \omega_j \tag{5.29}$$
where ωj is a normalized basis of Abelian differentials. Applying Abel's theorem to the meromorphic function Ω/dλ one gets eq. (5.27). Conversely, assume that we have two such forms Ω and Ω′, with a double pole at Q1 and divisors of zeroes D + D⁺ and D + D′⁺ respectively. Their quotient is a meromorphic function with a divisor of poles of degree g, i.e. generically a constant. Hence the differential Ω is unique.

The outcome of this proposition is that Ω is uniquely characterized by its behaviour at infinity and its zeroes at the points of D. Therefore, given
the dynamical divisor D, we know the form Ω and we can find the divisor D⁺ as the complementary set of zeroes of Ω. This information on the divisor D⁺ can now be used to reconstruct the vector Ψ⁺(P). Its components ψk⁺(P) are uniquely determined up to normalization once we know the divisor D⁺. Here, however, we have no freedom in these normalizations since we must preserve the orthogonality conditions (5.28). Let $\tilde\psi_k^+(P)$ be any choice of such meromorphic functions and let $\psi_k^+(P) = (1/c_k)\,\tilde\psi_k^+(P)$. We want to determine the constants ck. We require that ψk⁺(P) satisfies an orthogonality relation of the form ⟨Ψ⁺(Pj)Ψ(Pi)⟩ = f(Pj)δij. This means that the matrices of elements $\psi_i^+(P_j)/f(P_j)$ and $\psi_j(P_i)$ are inverse to each other. By uniqueness of the inverse matrix we also have:
$$\sum_k \frac{1}{f(P_k)}\,\psi_i^+(P_k)\psi_j(P_k) = \delta_{ij}, \qquad\text{or}\qquad \sum_k \frac{1}{f(P_k)}\,\tilde\psi_i^+(P_k)\psi_j(P_k) = c_i\,\delta_{ij}$$
We have an independent characterization of f(P). By eq. (5.26), f⁻¹(P)dλ = Ω(P), and therefore f(P) is known from its analyticity properties. This allows us to compute ci as follows. Consider on the Riemann surface the form $\tilde\psi_i^+(P)\psi_j(P)\Omega(P)$. For i = j, it has a double pole at Qi, and for i ≠ j, it has simple poles at Qi and Qj and no other singularity. This is because the poles at D and D⁺ in Ψ and $\tilde\Psi^+$ cancel against the zeroes of Ω, and the double pole of Ω at Q1 combines with the zeroes of Ψ and $\tilde\Psi^+$ at this point. Finally we define a form on the Riemann sphere λ by:
$$\omega_{ij} = \sum_k \tilde\psi_i^+(P_k)\psi_j(P_k)\Omega(P_k)$$
where the N points Pk are the points above λ. If there are no branch points among the Pk, λ is a local parameter around each Pk and ωij = gij(λ)dλ. If there is a branch point a short computation shows that this still holds. Since ωij is regular for finite λ, the function gij(λ) is in fact a polynomial in λ. Moreover, ωij has poles of order at most 1 for i ≠ j and 2 for i = j at λ = ∞. Since dλ has a double pole at ∞, this implies ωij = 0 for i ≠ j and ωii = ci dλ for some constants ci. We have obtained the orthogonality relations:
$$\omega_{ij} = c_i\,\delta_{ij}\,d\lambda, \qquad c_i = \lim_{P\to Q_i}\tilde\psi_i^+(P)\psi_i(P)\,\frac{\Omega}{d\lambda}(P) \tag{5.30}$$
These orthogonality relations show that the inverse of the matrix $\widehat\Psi$ is given by:
$$\bigl(\widehat\Psi^{-1}\bigr)_{ij} = \frac{1}{c_j}\,\tilde\psi_j^+(P_i)\,\frac{\Omega(P_i)}{d\lambda} \tag{5.31}$$
It is worth noticing that this expression is invariant under a change of normalization of the components of $\tilde\Psi^+$: $\tilde\psi_j^+ \to d_j\tilde\psi_j^+$ yields cj → dj cj and dj cancels. On the other hand it transforms appropriately under a change of normalization of the components of Ψ. One then gets Ω = dλ/⟨Ψ⁺Ψ⟩ if one sets the k-th component of Ψ⁺ to the invariant value $(1/c_k)\tilde\psi_k^+$. Let us summarize the situation in a proposition:

Proposition. Given an effective divisor D of degree g, the functions ψk(P) with divisor ≥ −(D + Qk − Q1) are unique up to normalization. There is a unique form Ω having a double pole at Q1 and vanishing on D. It has g other zeroes at an effective divisor D⁺ of degree g. There exists a unique set of functions ψk⁺(P) of divisor ≥ −(D⁺ + Qk − Q1) such that $\sum_k \psi_i^+(P_k)\psi_j(P_k)\Omega(P_k) = \delta_{ij}\,d\lambda$. The inverse of the matrix $\widehat\Psi$ is $(\widehat\Psi^{-1})_{ij} = \psi_j^+(P_i)\,\Omega(P_i)/d\lambda$.

5.4 Time evolution

The aim of this section is to solve the equations of motion by looking at the time evolution of the dynamical divisor D. The outcome is the beautiful fact that the dynamical flow linearizes on the Jacobian of the spectral curve. Recall that the time evolution is governed by the Lax equation:
$$\frac{d}{dt}L(\lambda) = [M(\lambda), L(\lambda)] \tag{5.32}$$
As we have seen in Chapter 3, the matrix M(λ) is of the form
$$M = \sum_k M_k, \qquad M_k(\lambda) = \bigl(P^{(k)}(L, \lambda)\bigr)_-$$
where P⁽ᵏ⁾(L, λ) is a polynomial in L(λ) with constant rational coefficients in λ, and (P⁽ᵏ⁾(L, λ))₋ denotes its polar part at λk. Suppose that, at time t, we have made the analysis of the previous section and built the normalized eigenvector Ψ(t, P). If Ψ(t, P) is an eigenvector of L(λ) with eigenvalue µ, the Lax equation (5.32) implies (L(λ) − µ)(dΨ/dt − MΨ) = 0. It follows that
$$\frac{d}{dt}\Psi(t, P) = \bigl(M(\lambda) - C(t, P)\,1\bigr)\,\Psi(t, P) \tag{5.33}$$
where C(t, P) is a scalar function. Normalizing the eigenvector Ψ(t, P) such that its first component equals 1 gives:
$$C(t, P) = \sum_j M_{1j}(\lambda)\,\psi_j(t, P)$$
By the analysis of the previous section, the normalized eigenvector Ψ(t, P) has poles at the dynamical divisor D(t) and at the N − 1 points Qk, k = 2, . . . , N. Consider the function N(t, dt, P) ≡ 1 + C(t, P)dt, with dt infinitesimal:
$$N(t, dt, P) = 1 + dt \sum_j M_{1j}(\lambda)\,\psi_j(t, P)$$
One can rewrite eq. (5.33) in the equivalent form:
$$N(t, dt, P)\,\Psi(t + dt, P) = \bigl(1 + dt\,M(\lambda)\bigr)\cdot\Psi(t, P) + O(dt^2) \tag{5.34}$$
We see that the meromorphic function N(t, dt, P) of P ∈ Γ normalizes the eigenvector whose time evolution is naturally induced by the Lax equation Ψ̇ = MΨ. The divisor of this meromorphic function reads:
$$(N) = D(t + dt) + \sum_{k,i}\sum_{\alpha=1}^{m_k} P^{\alpha}_{k,i} - D(t) - \sum_{k,i} m_k\, P_{k,i}$$
From eq. (5.34) we see that N cancels the poles of Ψ(t + dt, P) at D(t + dt) and produces the poles of Ψ(t, P) at D(t). The poles at Q2, . . . , QN are the same on both sides and do not appear in N. Moreover, since Mk(λ) has a pole of order mk at λk, N has poles of order mk at the N points Pk,i above λk. Finally, N has extra zeroes $P^{\alpha}_{k,i}$ to match the number of its poles. Since dt is small, and N = 1 for dt = 0, the zeroes are close to the poles, D(t + dt) is close to D(t), and on each sheet i of the covering there are exactly mk zeroes $P^{\alpha}_{k,i}$ close to the pole Pk,i of order mk.

Theorem. Let γj(t), j = 1, . . . , g, be the points of the dynamical divisor D(t). Let ω be any holomorphic differential on Γ. The time evolution of the points γj(t) induced by the Lax equation L̇(λ) = [M(λ), L(λ)] with $M(\lambda) = \sum_k \bigl(P^{(k)}(L, \lambda)\bigr)_-$ is such that:
$$\frac{d}{dt}\sum_{j=1}^{g}\int^{\gamma_j(t)}\omega = \sum_k\sum_{i=1}^{N} \operatorname{Res}_{P_{k,i}}\,\omega\,P^{(k)}(\mu, \lambda) \tag{5.35}$$
where the points Pk,i are the N points above λk. Notice that the right-hand side is independent of time.

Proof. Since N(t + dt, P) is a meromorphic function, Abel's theorem tells us that:
$$\sum_{j=1}^{g}\int_{\gamma_j(t)}^{\gamma_j(t+dt)}\omega + \sum_{k,i,\alpha}\int_{P_{k,i}}^{P^{\alpha}_{k,i}(dt)}\omega = 0 \tag{5.36}$$
for any holomorphic differential ω. This equation will give us the time evolution of the divisor as in eq. (5.35) if we can evaluate the second sum. For this we need the following lemma:

Lemma. Consider a point P, a holomorphic differential ω in a neighbourhood V of P, and an analytic function u on V with a pole of order m at P. Consider ε ∈ C small enough so that the m points Pα(ε), where u(Pα(ε)) + ε⁻¹ = 0, belong to V. Then
$$\lim_{\varepsilon\to 0}\frac{1}{\varepsilon}\sum_{\alpha=1}^{m}\int_{P}^{P_\alpha(\varepsilon)}\omega = -\operatorname{Res}_P(\omega u)$$

Proof. Let ω = dσ with σ(P) = 0. For any path π enclosing the zeroes Pα(ε) of u + ε⁻¹ and the point P, we have
$$\frac{1}{\varepsilon}\sum_{\alpha=1}^{m}\int_{P}^{P_\alpha(\varepsilon)}\omega = \frac{1}{\varepsilon}\sum_\alpha \sigma(P_\alpha(\varepsilon)) = \frac{1}{\varepsilon}\sum_\alpha \operatorname{Res}_{P_\alpha(\varepsilon)}\Bigl(\sigma\,\frac{u'}{u + \varepsilon^{-1}}\Bigr) = \frac{1}{2i\pi}\oint_\pi \frac{u'}{1 + \varepsilon u}\,\sigma\,dz$$
We used that the integrand is regular at P because σ(P) = 0. When ε tends to zero, the right-hand side tends to
$$\frac{1}{2i\pi}\oint_\pi u'\sigma\,dz = -\frac{1}{2i\pi}\oint_\pi u\,\omega = -\operatorname{Res}_P(u\omega)$$

Returning to the proof of the Theorem, we decompose the second sum in eq. (5.36) as a sum of terms associated with each pole Pk,i of M(λ) on Γ. The points $P^{\alpha}_{k,i}$, close to Pk,i, are by definition solutions of $1 + dt\,u_k(P^{\alpha}_{k,i}) = 0$ with $u_k = \sum_j (M_k)_{1j}\psi_j$. Thus, by the Lemma and eq. (5.36), one finds:
$$\sum_{j=1}^{g}\int_{\gamma_j(t)}^{\gamma_j(t+dt)}\omega = dt\sum_{k,i}\operatorname{Res}_{P_{k,i}}\,\omega\sum_j (M_k)_{1j}\,\psi_j(t, P) \tag{5.37}$$
Recall that Mk(λ) is the polar part of P⁽ᵏ⁾(L, λ), i.e.
$$M_k(\lambda) = \bigl(P^{(k)}(L, \lambda)\bigr)_- = P^{(k)}(L, \lambda) - \bigl(P^{(k)}(L, \lambda)\bigr)_+$$
with (P⁽ᵏ⁾(L, λ))₊ regular at λk, hence not contributing to eq. (5.37). Therefore,
$$\operatorname{Res}_{P_{k,i}}\,\omega\sum_j (M_k)_{1j}\,\psi_j(t, P) = \operatorname{Res}_{P_{k,i}}\,\omega\,\bigl(P^{(k)}(L, \lambda)\Psi(t, P)\bigr)_1 = \operatorname{Res}_{P_{k,i}}\,\omega\,P^{(k)}(\mu, \lambda)\,\psi_1(t, P) = \operatorname{Res}_{P_{k,i}}\,\omega\,P^{(k)}(\mu, \lambda)$$
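Underlying this whole construction is the isospectral character of the Lax flow: dL/dt = [M, L] is solved, for a constant M, by conjugation, so the spectrum — and hence the spectral curve carrying the integrals of motion — does not move. A toy numerical sketch with invented matrices, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
Lax0 = A + A.T                    # toy symmetric "Lax matrix" (invented)
B = rng.normal(size=(4, 4))
M = B - B.T                       # constant antisymmetric M

# For constant M, L(t) = exp(tM) L(0) exp(-tM) solves dL/dt = [M, L].
t = 0.7
w, V = np.linalg.eig(M)
g = (V @ np.diag(np.exp(t * w)) @ np.linalg.inv(V)).real  # g = exp(tM), orthogonal
Lax_t = g @ Lax0 @ g.T

ev0 = np.sort(np.linalg.eigvalsh(Lax0))
ev1 = np.sort(np.linalg.eigvalsh(Lax_t))
assert np.allclose(ev0, ev1, atol=1e-8)   # the spectrum is conserved
```

What does evolve is the eigenvector line bundle — in the text, the divisor D(t) — and the theorem above says that this motion is linear on the Jacobian.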
where we have used the fact that Ψ(t, P) is an eigenvector and the normalization ψ1(t, P) = 1.

Equation (5.35) can alternatively be written in terms of Abel's map.

Theorem. The flow induced by the Lax equation (5.32) on the eigenvector bundle is a linear flow on the Jacobian of the spectral curve:
$$A(D(t)) - A(D(0)) = -t\,U^{(M)} \tag{5.38}$$
with
$$U_j^{(M)} = -\sum_{k,i}\operatorname{Res}_{P_{k,i}}\,\omega_j\,P^{(k)}(\mu, \lambda) \tag{5.39}$$
5.5 Theta-functions formulae

Since the motion linearizes on the Jacobian Jac(Γ), it is natural to express the solution in terms of Riemann's theta-functions. We use the notations and results of Chapter 15; in particular K is the vector of Riemann's constants. We first recall the way to parametrize meromorphic functions on a Riemann surface Γ in terms of theta-functions on its Jacobian. For e in an open set of the divisor Θ = {z ∈ Jac(Γ) | θ(z) = 0}, the function $\theta(e + \int_x^y \omega)$ does not vanish identically in y. By Riemann's theorem it has g zeroes y1, . . . , yg for given x. Since e ∈ Θ, one of these zeroes is x and we choose y1 = x. We now show that y2, . . . , yg are independent of x. Indeed, by Riemann's theorem we have A(y1) + · · · + A(yg) = A(x) − e − K, so that y2, . . . , yg are determined by A(y2) + · · · + A(yg) = −e − K. This equation is independent of x. As a side remark note that for such a yj, j ≥ 2, we have $\theta(-e + \int_{y_j}^{x}\omega) = 0$ for all x. This means that some translate of the curve Γ, embedded into the Jacobian by the Abel map, is entirely contained in Θ, hence one has to be careful in the choice of the vector e. To use this result to construct meromorphic functions on the Riemann surface, notice that the building block $\theta(e + \int_{x_1}^{y}\omega)/\theta(e + \int_{x_2}^{y}\omega)$ has a zero at y = x1 and a pole at y = x2. The extra (g − 1) zeroes of the numerator and denominator cancel. We then assemble such blocks so that the product has no monodromy.

Explicit expressions for the matrix $\widehat\Psi$ and its inverse are easily written. Let D(t) = (γ1(t), . . . , γg(t)) be the dynamical divisor. Let U⁽ᴹ⁾ be the g-dimensional vector eq. (5.39) with components $U_j^{(M)}$. Then,
$$\psi_k(t, P) = d_k\,\frac{\theta\bigl(A(P) - A(Q_k) + A(Q_1) - \zeta_{D(t)}\bigr)}{\theta\bigl(A(P) - \zeta_{D(t)}\bigr)}\;\frac{\theta\bigl(e + \int_{Q_1}^{P}\omega\bigr)}{\theta\bigl(e + \int_{Q_k}^{P}\omega\bigr)} \tag{5.40}$$
In this equation ζD = A(D) + K. Equation (5.38) implies ζD(t) = ζD(0) − tU⁽ᴹ⁾. The dk are constants (d1 = 1) related to the residual diagonal action of the gauge group. Note that the sum of the arguments of the theta-functions in the numerator is equal to the sum of the arguments in the denominator, so that the whole expression has no monodromy when the point P loops around non-trivial cycles of Γ. Hence eq. (5.40) defines a meromorphic function on Γ with the correct zeroes and poles. Similarly one has:
$$\psi_k^+(t, P) = \frac{1}{c_k}\,\frac{\theta\bigl(A(P) - A(Q_k) + A(Q_1) - \zeta_{D^+(t)}\bigr)}{\theta\bigl(A(P) - \zeta_{D^+(t)}\bigr)}\;\frac{\theta\bigl(e^+ + \int_{Q_1}^{P}\omega\bigr)}{\theta\bigl(e^+ + \int_{Q_k}^{P}\omega\bigr)} \tag{5.41}$$
Here D⁺ is the divisor given by eq. (5.27) and the normalization constants ck are defined in eq. (5.30). We now compute them. To do that we express the meromorphic function Ω/dλ in terms of theta-functions. Let B be the set of branch points of the covering (λ, µ) → λ. We decompose the set of 2(g + N − 1) points of B into four subsets B0, B0′, B1, B1′ such that card B0 = card B0′ = N − 1 and card B1 = card B1′ = g. This decomposition is arbitrary but does not affect the final formulae. Then we can write ($\prod_{B_i}$ means $\prod_{b_k\in B_i}$):
$$\frac{\Omega}{d\lambda}(P) = \frac{\theta\bigl(A(P) - \zeta_{D(t)}\bigr)\,\theta\bigl(A(P) - \zeta_{D^+(t)}\bigr)\,\prod_{j=2}^{N}\theta\bigl(e + \int_{Q_j}^{P}\omega\bigr)\,\theta\bigl(e^+ + \int_{Q_j}^{P}\omega\bigr)}{\theta\bigl(A(P) - \zeta_{B_1}\bigr)\,\theta\bigl(A(P) - \zeta_{B_1'}\bigr)\,\prod_{B_0}\theta\bigl(e + \int_{b_k}^{P}\omega\bigr)\,\prod_{B_0'}\theta\bigl(e^+ + \int_{b_k}^{P}\omega\bigr)} \tag{5.42}$$
This expression of Ω/dλ has the correct zeroes and poles, and has no monodromy in view of eq. (5.27). To compute ck we can now use eq. (5.30), yielding:
$$c_k = d_k \times \frac{\theta\bigl(A(Q_1) - \zeta_{D(t)}\bigr)\,\theta\bigl(A(Q_1) - \zeta_{D^+(t)}\bigr)\,\prod_{j\neq k}\theta\bigl(e + \int_{Q_j}^{Q_k}\omega\bigr)\,\theta\bigl(e^+ + \int_{Q_j}^{Q_k}\omega\bigr)}{\prod_{B_0}\theta\bigl(e + \int_{b_k}^{Q_k}\omega\bigr)\,\prod_{B_0'}\theta\bigl(e^+ + \int_{b_k}^{Q_k}\omega\bigr)\,\theta\bigl(A(Q_k) - \zeta_{B_1}\bigr)\,\theta\bigl(A(Q_k) - \zeta_{B_1'}\bigr)} \tag{5.43}$$
−1 as products This gives an expression of the elements of the matrix Ψ of theta-functions, through eq. (5.31). Example. Let us apply the above formalism to find the solution of the Neumann model. In this case since t L(λ) = L(−λ) the normalized Ψ+ (λ, µ) is equal to t Ψ(−λ, µ). The transformation (λ, µ) → (−λ, µ) is
just the hyperelliptic involution σ on the spectral curve of the Neumann model. The fixed points of σ are the (2g + 2) points (λ, µ) lying above λ = ∞, namely the Qj, and above λ = 0, namely the point P∞ (λ = 0, µ = ∞) and the N − 1 points of coordinates (λ = 0, µ = βi), see eq. (5.14). We take the point P∞ as base point of the Abel map, and note that the hyperelliptic involution changes the sign of the Abelian differentials, which are of the form p(µ)dµ/λ, so that for any point P we get A(σ(P)) = −A(P) modulo periods. The branch points B of the covering (λ, µ) → λ are solutions of ∂µΓ(λ, µ) = 0, i.e. $\sum_k F_k/(\mu - a_k)^2 = 0$. This equation has (2N − 2) roots at finite distance, each one giving rise to two points related by the hyperelliptic involution. Hence the set B is globally invariant under σ and one can choose B0′ = σ(B0) and B1′ = σ(B1) in the above construction. Considering eq. (5.27), we see that D⁺ = σ(D). Indeed, since $B = \sum_i (b_i + \sigma(b_i))$ with σ(bi) ≠ bi, we have $A(B) = \sum_i \bigl(A(b_i) + A(\sigma(b_i))\bigr) = 0$ up to periods. Similarly, since σ(Qj) = Qj we have A(Qj) = −A(Qj), so that 2A(Qj) = 0 modulo periods. This shows that A(Qj) is a half-period on the Jacobian torus. Finally, we get A(D⁺) = −A(D) modulo periods, so that D⁺ = σ(D) since D is generic. From this we understand that the requirement ψk⁺(t, P) = ψk(t, σ(P)) fixes the constants dk and e⁺ in the expressions (5.40) and (5.41). Let us compare for instance the theta-functions θ(A(P) − A(Qk) + A(Q1) − ζD⁺(t)) and θ(A(σ(P)) + A(Qk) − A(Q1) − ζD(t)). The sum of the two arguments vanishes modulo periods, because in the hyperelliptic case the vector of Riemann's constants K is some half-period, hence these two theta-functions have the same zeroes.
Applying a similar argument to the other theta-functions and choosing e⁺ = −e, we see that:
$$\psi_k^+(t, P) = \frac{1}{c_k}\,\frac{\theta\bigl(-A(P) + A(Q_k) - A(Q_1) - \zeta_{D(t)}\bigr)}{\theta\bigl(-A(P) - \zeta_{D(t)}\bigr)}\;\frac{\theta\bigl(e - \int_{Q_1}^{P}\omega\bigr)}{\theta\bigl(e - \int_{Q_k}^{P}\omega\bigr)} \tag{5.44}$$
Starting from the expressions (5.40) for ψk, (5.44) for ψk⁺ and (5.42) for Ω/dλ, one computes ck according to eq. (5.30). Then the relation ψk⁺(t, P) = ψk(t, σ(P)) fixes the constant dk.

To compute the solution of the Neumann model, note that the diagonal element Lkk of the Lax matrix reads Lkk(λ) = ak − xk²/λ². On the other hand, from the reconstruction formula eq. (5.20), we have $L_{kk}(\lambda) = \sum_i \psi_k(P_i)\,\mu(P_i)\,(\psi^{(-1)})_k(P_i)$, where the Pi are the N points above λ. When λ → 0 only µ(P∞) diverges, as −1/λ², while all the other terms remain finite. Hence:
$$x_k^2 = \psi_k(P_\infty)\,(\psi^{(-1)})_k(P_\infty)$$
5 Analytical methods
Note that this formula implies $\sum_k x_k^2 = 1$, as it should be in the Neumann model, and moreover this expression is independent of the constant $d_k$. Inserting the above expressions we immediately find:
$$x_k^2(t) = \alpha_k\;\frac{\theta(A(Q_k) - A(Q_1) + \zeta_{D(t)})\;\theta(A(Q_k) - A(Q_1) - \zeta_{D(t)})}{\theta(A(Q_1) - \zeta_{D(t)})\;\theta(A(Q_1) + \zeta_{D(t)})}$$
where $\alpha_k$ is given by ratios of theta-functions completely independent of the dynamical divisor $D(t)$. It depends only on the geometry of the spectral curve. It is convenient to express this result in terms of theta-functions with characteristics $\theta[\eta](z)$, which are essentially translates of the theta-function by half-periods:
$$\theta[\eta](z) = e^{i\pi\left({}^t\eta' B\eta' + 2\,{}^t\eta'(z+\eta'')\right)}\;\theta(z+\eta) \tag{5.45}$$
where $\eta = B\eta' + \eta''$ is a half-period, i.e. $\eta', \eta'' \in (\mathbb{Z}/2)^g$. Note that the $A(Q_i)$ are half-periods. By redefining $\zeta_D \to \zeta_D + A(Q_1)$ one gets rid of $A(Q_1)$ in the theta-functions. Indeed the first factors in the numerator and denominator get translated by the period $2A(Q_1)$. This produces exponential factors whose dependence on $\zeta_D$ cancels. Similarly, the dependence on $D$ in the exponential factor in eq. (5.45) cancels as well, this time between the two factors in the numerator, and separately between the two factors in the denominator of $x_k^2$. Finally $A(Q_k) = \int_{P_\infty}^{Q_k}\omega = B\eta'_{2k-1} + \eta''_{2k-1}$ is an even non-singular characteristic. Hence one gets
$$x_k^2(t) = \alpha_k\;\frac{\theta^2[\eta_{2k-1}](\zeta_{D(t)})}{\theta^2[0](\zeta_{D(t)})}$$
One could in principle evaluate the coefficients $\alpha_k$ directly by using the very special properties of theta-functions on hyperelliptic curves. However, we have a short cut by appealing to the Frobenius formula, only valid on hyperelliptic curves, which states that:
$$\sum_{k=1}^{g+1}\frac{\theta^2[\eta_{2k-1}](z)}{\theta^2[0](z)}\,\frac{\theta^2[\eta_{2k-1}](0)}{\theta^2[0](0)} = 1$$
so we finally get:
$$x_k^2(t) = \frac{\theta^2[\eta_{2k-1}](z_0 - Ut)}{\theta^2[0](z_0 - Ut)}\,\frac{\theta^2[\eta_{2k-1}](0)}{\theta^2[0](0)} \tag{5.46}$$
Here $U$ is obtained by applying eq. (5.35) with $P^{(0)}(\mu,\lambda) = \lambda\mu$ (recall that for the Neumann model there is only a singularity at $\lambda = 0$ and $M(\lambda) = (\lambda L(\lambda))_-$, see Chapter 3). This yields:
$$U_j = -\sum_{P_i}\mathrm{Res}_{P_i}\,(\lambda\mu)\,\omega_j(P) = -\mathrm{Res}_{P_\infty}\,(\lambda\mu)\,\omega_j(P) = \omega_j(P_\infty)$$
where the $P_i$ are the $N$ points above $\lambda = 0$, and we used that, above $\lambda = 0$, the product $\lambda\mu$ has only a simple pole, at $P_\infty$ ($\lambda = 0$, $\mu = \infty$). The $\omega_j$ are the normalized Abelian differentials.

5.6 Baker–Akhiezer functions

Baker–Akhiezer functions are special functions with essential singularities on Riemann surfaces. With them, we have a very natural parametrization of the eigenvectors of the linear system. We also get a very simple proof of the linearization theorem.

Definition. Let $P_1, \ldots, P_l$ be points on a Riemann surface $\Gamma$ of genus $g$. Let $w_i(P)$, with $w_i(P_i) = 0$, be local parameters around these points. Let $S_i(P) = \sum_{r=-m_i}^{-1} S_{i,r}\,w_i^r$ be some singular parts around the $P_i$. Let $D$ be a divisor on $\Gamma$. A Baker–Akhiezer function, $\Psi_{BA}(P)$, defined with these data, is a function such that:

(1) it is meromorphic on $\Gamma$ outside the points $P_i$, with the divisor of its poles and zeroes satisfying $(\Psi_{BA}) + D \geq 0$;

(2) for $P \to P_i$ the product $\Psi_{BA}(P)\,e^{-S_i(w_i(P))}$ is analytic.

It is important to keep in mind the data involved in the definition of the Baker–Akhiezer functions. First we need a set of punctures $P_i$ on the Riemann surface. Second, a set of local parameters $w_i(P)$ allowing one to define a set of singular parts $S_i$ in the neighbourhood of each puncture. Notice that this definition is not invariant under a change of local parameters $w_i$. Assuming for the moment that such functions exist, a few remarks are in order.

Remark 1. If Baker–Akhiezer functions associated with a given set of data exist, they form a vector space. However, the sum of two Baker–Akhiezer functions with different singular parts $S_i$ is not a Baker–Akhiezer function.
Remark 2. The ratio of two Baker–Akhiezer functions associated with a given set of singular parts is a meromorphic function. This allows one to use standard analysis on Riemann surfaces to study Baker–Akhiezer functions.

Remark 3. Even though Baker–Akhiezer functions are not meromorphic functions, they have the same number of poles and zeroes. Indeed, for a Baker–Akhiezer function $f$, the differential form $d(\log f)$ is a meromorphic form. The sum of its residues is the number of zeroes minus the number
of poles of f and has to vanish. Essential singularities do not contribute because around Pi we have d(log f ) = dSi + regular and dSi has no residue.
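A minimal illustration (ours, not taken from the general construction that follows): take $\Gamma = \mathbb{CP}^1$ of genus $g = 0$, a single puncture $P_1 = \{z = 0\}$ with local parameter $w_1 = z$ and singular part $S_1 = t/z$, and let $D$ be the single point $z = \gamma$. The corresponding Baker–Akhiezer functions are

```latex
\Psi_{BA}(z) \;=\; \Big(c_0 + \frac{c_1}{z-\gamma}\Big)\, e^{t/z},
\qquad c_0, c_1 \in \mathbb{C},
```

which are meromorphic away from $z = 0$ with $(\Psi_{BA}) + D \geq 0$, while $\Psi_{BA}\,e^{-t/z}$ is analytic at $z = 0$. They form a vector space, here of dimension $\deg D - g + 1 = 2$, in agreement with Remark 4 below.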
We now give a fundamental formula expressing the Baker–Akhiezer functions in terms of Riemann theta-functions. Recall that a differential of the second kind is a meromorphic differential whose poles have order $\geq 2$ and vanishing residues. See Chapter 15 for more details. Let $\Omega^{(S)}$ be the unique Abelian differential of the second kind, normalized with vanishing $a$-periods, and with singular part at the points $P_i$ of the form $dS_i(w_i(P))$. Thus, near the points $P_i$,
$$\Omega^{(S)} = d\Big(\sum_{r=-m_i}^{-1} S_{i,r}\,w_i^r\Big) + \text{regular}$$
Denote by $2i\pi U^{(S)}$ the vector of $b$-periods of $\Omega^{(S)}$. Its $g$ components $U_j^{(S)}$, $j = 1, \ldots, g$, are:
$$U_j^{(S)} = \frac{1}{2i\pi}\oint_{b_j}\Omega^{(S)} \tag{5.47}$$
Proposition. If $D = \sum_{i=1}^g \gamma_i$ is a generic divisor of degree $g$, the following expression defines a Baker–Akhiezer function with $D$ as divisor of poles:
$$\Psi_{BA}(P) = \mathrm{const.}\;\frac{\theta(A(P) + U^{(S)} - \zeta)}{\theta(A(P) - \zeta)}\,\exp\Big(\int_{P_0}^{P}\Omega^{(S)}\Big) \tag{5.48}$$
Here $\zeta = A(D) + K$, where $K$ is the vector of Riemann's constants and $A$ denotes the Abel map with base point $P_0$, cf. eq. (5.29).

Proof. It is enough to check that the function defined by the formula (5.48) is well-defined (i.e., it does not depend on the path of integration between $P_0$ and $P$) and has the desired analytical properties. Indeed, when $P$ describes some $a$-cycle, nothing happens because the theta-functions are $a$-periodic and $\Omega^{(S)}$ is normalized. If $P$ describes the $b_j$-cycle the quotient of theta-functions is multiplied by $\exp(-2i\pi U_j^{(S)})$ (see Chapter 15) while the exponential factor changes by $\exp(2i\pi U_j^{(S)})$, so that $\Psi_{BA}$ is well-defined. Clearly it has the right poles if $\deg D = g$.

Remark 4. For a generic divisor $D$ of degree $\geq g$, the dimension of the vector space of Baker–Akhiezer functions is equal to $\deg(D) - g + 1$. In particular, for $\deg D = g$ the above formula gives the unique Baker–Akhiezer function having poles at $D$ up to
a constant. If we have two Baker–Akhiezer functions, their ratio is a meromorphic function with d = deg D poles. By the Riemann–Roch theorem the dimension of the space of such functions is d − g + 1.
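For instance (an elementary genus-zero check of this count, ours): on $\Gamma = \mathbb{CP}^1$, the meromorphic functions with at most simple poles at $d$ distinct finite points $p_1, \ldots, p_d$ are

```latex
f(z) \;=\; c_0 + \sum_{i=1}^{d}\frac{c_i}{z - p_i},
```

a vector space of dimension $d + 1 = d - g + 1$ with $g = 0$.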
It is worth noticing that generically we get a non-trivial Baker–Akhiezer function with only $g$ poles, while to get a non-trivial meromorphic function we generically need $(g+1)$ poles. To understand why Baker–Akhiezer functions arise naturally in the construction of the eigenvectors, let us consider the unnormalized eigenvector $\Psi_{un}(t,P)$, whose time evolution is governed by the equation
$$\partial_t \Psi_{un}(t,P) = M(\lambda)\,\Psi_{un}(t,P) \tag{5.49}$$
The normalized eigenvector $\Psi(t,P)$ and the unnormalized eigenvector $\Psi_{un}(t,P)$ are related by multiplication by a scalar function: $\Psi_{un}(t,P) = f(t,P)\Psi(t,P)$ and $(\Psi_{un})_1(t,P) = f(t,P)$. Taking the first component of eq. (5.49), one gets:
$$\dot f = C f \qquad\text{with}\qquad C = \sum_j M_{1j}\,\psi_j$$
where $C$ is the same object appearing in eq. (5.33). Let us describe the singularities of $f(t,P)$. Note that $C(P)$ has poles where $\Psi$ has poles, or at points above the poles $\lambda_k$ of $M(\lambda)$ (recall that in general the poles of $M(\lambda)$ are a subset of the poles of $L(\lambda)$). Consider first the points $P_{k,i}$, $i = 1, \ldots, N$, above a point $\lambda_k$. In the vicinity of $P_{k,i}$ we have:
$$\dot f = (M\Psi)_1\, f = \Big[\Big(\big(P^{(k)}(L,\lambda)\big)_- + \text{regular}\Big)\Psi\Big]_1 f = \Big(\big(P^{(k)}(\mu,\lambda)\big)_- + \text{regular}\Big)\,f$$
where we have used the fact that $\Psi$ is an eigenvector of $L(\lambda)$ with eigenvalue $\mu$ and $\Psi_1 = 1$. The quantity $\dot f/f$ has poles at the points $(\lambda,\mu)$ where $(P^{(k)}(\mu,\lambda))_-$ is singular. The projection $(\ )_-$ is computed using the local parameter $\lambda$. Notice that $(P^{(k)}(\mu(\lambda),\lambda))_-$ is independent of time, therefore $f(t,P)$ has an essential singularity at $P_{k,i}$ of the form:
$$f(t,P) = e^{\,t\,(P^{(k)}(\mu,\lambda))_-} \times \text{regular}$$
Let us now consider the poles of C coming from Ψ. First at λ = ∞, while Ψ has poles, M vanishes so that C is regular, and nothing special happens for f . At a point γ(t) of the dynamical divisor D(t) we have
$\psi_i \sim \alpha_i(t)/(\lambda - \gamma(t))$ and $C \sim r(t)/(\lambda - \gamma(t))$. Comparing the second order pole on both sides of eq. (5.33), we find $r(t) = -\dot\gamma(t)$. Thus $\partial_t \log f = \partial_t \log(\lambda - \gamma) + \text{regular}$, showing that $f(t,P)$ vanishes at the points of the dynamical divisor $D(t)$. Finally, let us remark that the poles of $f(t,P)$ are independent of time. Indeed, assuming that $f(t,P)$ has a pole of order $k$ at a point $\gamma(t)$, which by the previous argument is not a pole of $C$, the orders of the poles on both sides of the equation $\dot f = Cf$ are different if $\dot\gamma \neq 0$. Considering the solution such that $f(t=0) = 1$, we see that the divisor of its zeroes is the dynamical divisor $D(t)$ and the divisor of its poles is $D(0)$. Moreover, $f$ has essential singularities at the points $P_{k,i}$ with prescribed singular parts, hence $f$ is the unique Baker–Akhiezer function with these essential singularities and the $g$ poles corresponding to the divisor $D(0)$, so that:
$$f(t,P) = \frac{\theta(A(P) + tU^{(M)} - \zeta_{D(0)})}{\theta(A(P) - \zeta_{D(0)})}\,\exp\Big(t\int_{P_0}^{P}\Omega^{(M)}\Big) \tag{5.50}$$
The linear time dependence in the theta-function in the numerator of this equation arises from the form of the singular exponential and the requirement that there is no monodromy. This provides another quick proof of the linearization of the flow on the Jacobian. If $D(t)$ is the divisor of the zeroes of $f(t,P)$, by the Riemann theorem it satisfies:
$$A(D(t)) - A(D_0) = -t\,U^{(M)} = -\frac{t}{2i\pi}\oint_{b}\Omega^{(M)} \tag{5.51}$$
This shows that the flow is linear on the Jacobian! On the other hand we know that $A(D(t))$ is given by eq. (5.38), which coincides with eq. (5.51) because, by Riemann's bilinear identity,
$$U_j^{(M)} = \frac{1}{2i\pi}\oint_{b_j}\Omega^{(M)} = -\sum_k\sum_{i=1}^{N}\mathrm{Res}_{P_{k,i}}\,\omega_j\,P^{(k)}(\mu,\lambda)$$
with $\omega_j$ the normalized Abelian differentials; see Chapter 15. We finally give the expression of the components of the unnormalized eigenvector $\Psi_{un}(t,P) = f(t,P)\Psi(t,P)$:
$$(\Psi_{un})_k(t,P) = d_k\,\exp\Big(t\int_{P_0}^{P}\Omega^{(M)}\Big)\;\frac{\theta(A(P) - A(Q_k) + A(Q_1) - \zeta_{D(t)})}{\theta(A(P) - \zeta_{D(0)})}\;\frac{\theta(e + \int_{Q_1}^{P}\omega)}{\theta(e + \int_{Q_k}^{P}\omega)}$$
In the product f (t, P )ψk (t, P ), the zeroes of f cancel the dynamical poles of ψk which are replaced by the constant poles of f .
Remark 5. As explained in Chapter 3, different functions P (k) (L, λ) correspond to different dynamical flows. Therefore, different Abelian differentials of the second kind with poles at the points above λk correspond to different time flows. In other words, all the different dynamics are encoded into the singular differentials Ω(M ) .
5.7 Linearization and the factorization problem

We show that the solution of integrable systems by factorization can be interpreted as the time evolution of the eigenvector bundle. This gives a third very short proof that the flows linearize on the Jacobian of the spectral curve. We consider small disks $U_k$ around the poles $\lambda_k$ of $L(\lambda)$ and define the open set $U_+$ as the union of the $U_k$, while $U_-$ is an open set slightly larger than the complement of $U_+$ in the complex plane. On these open sets we have defined in Chapter 3 a factorization problem
$$\theta_-^{-1}(\lambda,t)\,\theta_+(\lambda,t) = e^{-\sum_i t_i\,dH_i(L(\lambda,0))}$$
where $\theta_\pm$ are analytic and invertible in $U_\pm$ respectively. Recall that we have shown that the matrix $\widehat\Psi(t)\widehat\Psi^{-1}(0)$ has different expressions on the patches $U_+$ and $U_-$, given by eqs. (3.56, 3.57). Multiplying these equations on the right by $\widehat\Psi(0)$ we get, on $U_+$ and $U_-$ respectively:
$$\widehat\Psi(t) = \theta_+(\lambda,t)\,\widehat\Psi(0)\,e^{\sum_i t_i\,dH_i(\widehat\mu)}, \qquad \widehat\Psi(t) = \theta_-(\lambda,t)\,\widehat\Psi(0)$$
where $\widehat\mu$ is the diagonal matrix of eigenvalues of $L(\lambda,0)$. We used that $dH(L) = \widehat\Psi\,dH(\widehat\mu)\,\widehat\Psi^{-1}$ because $H(L)$ is an ad-invariant function. We can interpret this matrix equation in $\lambda$ as a vector equation on the Riemann surface $\Gamma$. First we lift each disk $U_k$ around $\lambda_k$ to $N$ disks around the $P_{k,i}$, and still define $U_+$ as the union of these disks, and $U_-$ as an open set in $\Gamma$ containing the closure of the complement of $U_+$. Each column of these matrix equations can be viewed as a vector equation at a point $P$ above $\lambda$:
$$\Psi(t,P)\,e^{-\sum_i t_i\,dH_i(\mu(P))} = \theta_+(\lambda,t)\,\Psi(0,P), \qquad \Psi(t,P) = \theta_-(\lambda,t)\,\Psi(0,P) \tag{5.52}$$
valid on the open sets $U_\pm$ respectively. The vector $\Psi(0,P)$ is a section of the eigenvector bundle, $E_0$, at time $t = 0$. As explained in Chapter 15, $(\theta_+(\lambda,t)\Psi(0,P),\,\theta_-(\lambda,t)\Psi(0,P))$ defines a section of a line bundle isomorphic to $E_0$, due to the regularity properties of the matrices $\theta_\pm$. In the left-hand sides of eqs. (5.52), we write $\Psi(t,P) = f(t,P)\Psi_m(t,P)$, where $\Psi_m(t,P)$ is meromorphic with first component equal to 1, and $f(t,P)$ is the Baker–Akhiezer function (5.50). Note
that, by definition, $\Psi_m(t,P)$ is a section of the eigenvector bundle $E_t$. We introduce the line bundle $F_t$ with transition function $e^{-\sum_i t_i\,dH_i(\mu(P))}$ on $U_+ \cap U_-$, which possesses the section $(f\,e^{-\sum_i t_i\,dH_i(\mu(P))},\,f)$ on $(U_+, U_-)$, since the first term is regular in $U_+$ by eq. (5.52). Recall that the product of bundles admits the product of sections. It is now clear that $E_t \simeq E_0 \otimes F_t$. The bundle $F_t$ is of Chern class 0 because $E_t$ and $E_0$ have the same Chern class, at least for $t$ small. Hence $F_t$ defines a point in the Jacobian $\mathrm{Jac}(\Gamma)$. This point moves linearly in time since the addition law on the Jacobian corresponds to taking the product of transition functions.

5.8 Tau-functions

We now wish to relate the formula for the Baker–Akhiezer function to the so-called tau-function (see Chapter 3). Let us consider for simplicity the case of only one singular point $P_\infty$, and let $z$ be a local parameter for a point $P$ in the vicinity of $P_\infty$ such that $z(P_\infty) = 0$. The general case is similar but more cumbersome to present. In order to be able to describe at once all possible singular parts, we introduce an infinite set of elementary time variables $t_k$. Denote by $\Omega^{(k)}$ the normalized differential of the second kind with singular part $d(z^{-k})$ at $P_\infty$, and denote by $U^{(k)}$ its $b$-periods,
$$U_j^{(k)} = \frac{1}{2i\pi}\oint_{b_j}\Omega^{(k)}$$
Proposition. Let $\psi_{BA}(P)$ be the Baker–Akhiezer function with divisor $D$ of poles of degree $g$ and singular part $\xi(t,z) = \sum_k t_k z^{-k}$ at $P_\infty$ (this should be understood in the sense of formal series), normalized such that:
$$\psi_{BA}(P) = e^{\xi(t,z)}\,(1 + O(z)), \qquad z \sim 0$$
Define $t - [z] = \{t_k - \tfrac{1}{k}z^k\}$. Then we have, in the vicinity of $P_\infty$:
$$\psi_{BA}(P) = e^{\xi(t,z)}\,\frac{\tau(t-[z])}{\tau(t)} \tag{5.53}$$
The function $\tau(t)$ may be expressed in terms of Riemann's theta-function,
$$\tau(t) = e^{\alpha(t)+\beta(t,t)}\,\theta\Big(A(P_\infty) + \sum_k t_k U^{(k)} - \zeta\Big) \tag{5.54}$$
Here $\alpha(t)$ and $\beta(t,t)$ are a linear and a quadratic form in the times $t$ respectively, and $\zeta = A(D) + K$, where $A(D)$ is the Abel map and $K$ the vector of Riemann's constants.
Proof. From eq. (5.48), the normalized Baker–Akhiezer function can be written as:
$$\psi(P) = e^{\xi(t,z)}\,\exp\Big(\sum_k t_k\int_{P_\infty}^{P}\big(\Omega^{(k)} - d\,z^{-k}\big)\Big)\;\frac{\theta(A(P) + \sum_k t_k U^{(k)} - \zeta)\;\theta(A(P_\infty) - \zeta)}{\theta(A(P) - \zeta)\;\theta(A(P_\infty) + \sum_k t_k U^{(k)} - \zeta)} \tag{5.55}$$
Indeed, eq. (5.48) contains $\int_{P_0}^{P}\Omega$, which can be written $\xi(t,z) + \int_{P_\infty}^{P}(\Omega - d\xi) + C^{st}$, where $\Omega - d\xi = \sum_k t_k(\Omega^{(k)} - d\,z^{-k})$ is regular at $P_\infty$. Moreover, eq. (5.55) is obviously correctly normalized. On the other hand, if $\tau(t)$ is assumed to be of the form (5.54), we have
$$\frac{\tau(t-[z])}{\tau(t)} = e^{-\alpha([z]) - 2\beta(t,[z]) + \beta([z],[z])}\;\frac{\theta\big(A(P_\infty) + \sum_k (t_k - \frac{z^k}{k})U^{(k)} - \zeta\big)}{\theta\big(A(P_\infty) + \sum_k t_k U^{(k)} - \zeta\big)}$$
We need to compare eq. (5.53) and eq. (5.55). Recall that near $P_\infty$, $\sum_k t_k\int_{P_\infty}^{P}(\Omega^{(k)} - d\,z^{-k}) = \sum_k t_k b_k(z)$ is regular. So we first choose $\alpha(t)$ and $\beta(t,t)$ such that:
$$\beta(t,[z]) = -\frac{1}{2}\sum_k t_k\,b_k(z), \qquad \alpha([z]) = \beta([z],[z]) + \log\frac{\theta(A(P) - \zeta)}{\theta(A(P_\infty) - \zeta)}$$
This defines $\alpha$ and $\beta$ uniquely and consistently. For this, we must check that the coefficients $\beta_{kj}$ of $\beta$ are symmetric. We have $\beta_{kj} = -\frac{1}{2}\,j\,b_{kj}$, where $b_k(z) = \sum_j b_{kj}z^j$. We apply the Riemann bilinear identity, eq. (15.8) in Chapter 15, to the two normalized second kind Abelian differentials $\Omega^{(k)}$ and $\Omega^{(l)}$. The left-hand side in this identity vanishes because the integrals over $a$-cycles vanish, while the right-hand side yields $k\,b_{lk} = l\,b_{kl}$. This choice takes care of the exponential prefactor and two of the theta-functions in eq. (5.55). To deal with the other two theta-functions in (5.55) we Taylor expand the Abel map $A(P) - A(P_\infty)$ around $P_\infty$. Writing $\omega_j = \sum_{i=0}^{\infty} c_i^{(j)} z^i\,dz$ in the vicinity of the point $P_\infty$ and Taylor expanding using Riemann bilinear identities, one deduces:
$$A_j(P) - A_j(P_\infty) = \sum_{i=1}^{\infty} c_{i-1}^{(j)}\,\frac{z^i}{i} = -\sum_{k=1}^{\infty}\frac{z^k}{k}\,\frac{1}{2\pi i}\oint_{b_j}\Omega^{(k)} = -\sum_{k=1}^{\infty}\frac{z^k}{k}\,U_j^{(k)}$$
Using this relation we get:
$$A(P) + \sum_k t_k U^{(k)} - \zeta = A(P_\infty) + \sum_k\Big(t_k - \frac{z^k}{k}\Big)U^{(k)} - \zeta \tag{5.56}$$
Gathering all this we obtain eq. (5.53).
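It may help to see this formula at work in the simplest degenerate setting (a rational, genus-zero example of ours, not taken from the text): $\tau(t) = t_1$ is a tau-function of the KP hierarchy, and the recipe (5.53) produces

```latex
\psi(t,z) \;=\; e^{\xi(t,z)}\,\frac{\tau(t-[z])}{\tau(t)}
\;=\; e^{\xi(t,z)}\,\frac{t_1 - z}{t_1}
\;=\; e^{\xi(t,z)}\Big(1 - \frac{z}{t_1}\Big),
```

which indeed has the normalization $e^{\xi(t,z)}(1 + O(z))$, with a pole in $t_1$ playing the role of the (degenerate) divisor.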
The formula (5.53) relating Baker–Akhiezer functions to tau-functions is usually called the Sato formula, cf. eq. (3.61) in Chapter 3. It may easily be generalized to several punctures; see Chapter 8 for more details. It shows that the local parameter $z$ can be generated from translations on the infinite set of times. The left-hand side of eq. (5.56) gives a convergent expression for the formal series of the right-hand side. Moreover, the Baker–Akhiezer function provides a global meaning to Sato's formula in this case.

5.9 Symplectic form

Our aim is to express the symplectic form inherited from the coadjoint orbit structure in terms of the dynamical divisor. We consider a rational Lax matrix of the form $L(\lambda) = L_0 + \sum_k L_k(\lambda)$, where each $L_k$ may be written as:
$$L_k(\lambda) = \big(g_k\,(A_k)_-\,g_k^{-1}\big)_-$$
with $(A_k)_-$ diagonal matrices. Locally around $\lambda_k$, $L(\lambda)$ can be diagonalized (see Chapter 3):
$$L(\lambda) = g_k\,A_k\,g_k^{-1}$$
Both matrices $A_k$ and $g_k$ depend on $\lambda$. $A_k$ has poles at $\lambda_k$ but $g_k$ is regular. By definition $L_0$ is non-dynamical. The variables $(A_k)_-$ are also chosen to be non-dynamical, and specify the coadjoint orbit. The dynamical variables are the matrix elements of the jets $g_{(k)}$, cf. eq. (3.13) in Chapter 3. The pullback on the loop group of the Kirillov symplectic form on the coadjoint orbit is:
$$\omega = \sum_k \mathrm{Res}_{\lambda_k}\,\mathrm{Tr}\Big((A_k)_-\,g_k^{-1}\delta g_k \wedge g_k^{-1}\delta g_k\Big)\,d\lambda$$
The dynamical variables $g_{(k)}$ and $g_{(k')}$ Poisson commute for $k \neq k'$. We have seen that the Lax pair description of a dynamical system naturally provides coordinates on phase space, namely $g = \mathrm{genus}(\Gamma)$ independent action variables $F_i$, which parametrize the spectral curve $\Gamma$, and $g$ points $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ on the spectral curve, which we called the dynamical divisor. It is important to express the symplectic form in these coordinates. The phase space appears as a fibred space whose base is the space of moduli of the spectral curve, explicitly described as coefficients of the equation $\Gamma(\lambda,\mu) = 0$ of the spectral curve, and the fibre at a given $\Gamma$ is the Jacobian of the curve $\Gamma(\lambda,\mu) = 0$. On this space we introduce a
differential $\delta$ which varies the dynamical variables $F_i, \lambda_{\gamma_i}, \mu_{\gamma_i}$ subjected to the constraint $\Gamma_{\{F_i\}}(\lambda_{\gamma_i}, \mu_{\gamma_i}) = 0$. We will need an auxiliary fibre bundle above the same base whose fibre is $\Gamma \times \mathrm{Jac}(\Gamma)$. We extend $\delta$ to this space by keeping the previous definition on the $\mathrm{Jac}(\Gamma)$ part, and on the $\Gamma$ part we differentiate any function of $F_i, \lambda, \mu$ with $\Gamma_{\{F_i\}}(\lambda,\mu) = 0$ by keeping $\lambda$ constant. This definition makes sense because the bundle of curves is given by the family of equations $\Gamma(\lambda,\mu) = 0$, where the coefficients of $\Gamma$ depend on the moduli which parametrize the base space. This provides a universal definition of the meromorphic function $\lambda$ on the whole family of curves. So differentiating on the bundle of curves, keeping $\lambda$ constant, provides a horizontal direction, i.e. a connection. Explicitly, for a function $f(P; F_i)$ we take $\lambda$ as a local parameter, then $\delta f = \sum_i \partial_{F_i} f\,\delta F_i$. At a branch point, however, the local parameter is $\mu$, and we have:
$$\delta f = \partial_\mu f\,\delta\mu + \sum_i \partial_{F_i} f\,\delta F_i \tag{5.57}$$
To compute $\delta\mu$ we differentiate the equation $\Gamma_{\{F_i\}}(\lambda,\mu) = 0$ at $\lambda$ constant, getting:
$$\delta\mu = -\frac{1}{\partial_\mu\Gamma_{\{F_i\}}(\lambda,\mu)}\,\sum_i \partial_{F_i}\Gamma_{\{F_i\}}(\lambda,\mu)\,\delta F_i \tag{5.58}$$
At a branch point of the covering $(\lambda,\mu) \to \lambda$ we have $\partial_\mu\Gamma_{\{F_i\}}(\lambda,\mu) = 0$, hence the differential $\delta f$ acquires a pole even though $f$ is regular. Note, however, that if $f(P)$ depends on $P$ only through $\lambda(P)$, $\delta f$ is regular at the branch points. Recall that at each point $P = (\lambda,\mu)$ on $\Gamma$ a column eigenvector $\Psi(P)$ of the Lax matrix is defined, up to normalization, and that we have defined a dual line eigenvector $\Psi^{(-1)}(P)$ such that $\Psi^{(-1)}(P)\Psi(P) = 1$. This allows us to define a 3-form $K$ on our extended fibre bundle. We regard it as a 1-form on $\Gamma$ whose coefficients are 2-forms on phase space:
$$K = K_1 + K_2 \tag{5.59}$$
$$K_1 = \Psi^{(-1)}(P)\,\delta L(\lambda) \wedge \delta\Psi(P)\,d\lambda, \qquad K_2 = \Psi^{(-1)}(P)\,\delta\mu \wedge \delta\Psi(P)\,d\lambda$$
Of course $\Psi(P)$ is defined, knowing the dynamical divisor, up to multiplication by a diagonal matrix independent of $P$. We normalize the eigenvectors at $\infty$ so that
$$\psi_i(Q_j) = \lambda\,\delta_{ij} + O(1), \qquad i, j = 2, \ldots, N \tag{5.60}$$
Proposition. Define the 2-form on phase space $\omega = \sum_{k,i}\mathrm{Res}_{P_{k,i}} K$, where the $P_{k,i}$ are the points above the poles $\lambda_k$ of $L(\lambda)$. Then we have:
$$\omega = 2\sum_{i=1}^{g}\delta\lambda_{\gamma_i} \wedge \delta\mu_{\gamma_i} \tag{5.61}$$
where $(\lambda_{\gamma_i}, \mu_{\gamma_i})$ are the coordinates of the points of the dynamical divisor $D$.

Proof. The sum of the residues of $K$, seen as a form on $\Gamma$, vanishes. The poles of $K$ are located at four different places. First the dynamical poles of $\Psi$, then the poles at the $P_{k,i}$ coming from $L$ and $\mu$, next the poles above $\lambda = \infty$ coming from $\Psi$ and $d\lambda$, and finally the poles at the branch points of the covering coming from the poles of $\Psi^{(-1)}$ and from eq. (5.58). Let us compute the residues at the dynamical poles $(\gamma_1, \ldots, \gamma_g)$. We write the coordinates of these points as $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ for $i = 1, \ldots, g$. Near such a point we can choose $\lambda$ as a universal local parameter and $\Psi = 1/(\lambda - \lambda_{\gamma_i}) \times \Psi_{reg}$, hence:
δλγi dλ (Ψ + O(1)) , so that K1 ∼ Ψ(−1) δLΨ ∧ δλγi λ − λ γi λ − λ γi
Since (L−µ)Ψ = 0 and Ψ(−1) (L−µ) = 0, we have (δL−δµ)Ψ+(L−µ)δΨ = 0. Multiplying by Ψ(−1) we get Ψ(−1) δLΨ = δµ, therefore: Resγi K1 = δµ|γi ∧ δλγi
(5.62)
Here δµis to be seen as a meromorphic function on Γ given by eq. (5.58), that is j ∂Fj Γ|γi δFj + ∂µ Γ|γi δµ|γi = 0. However, varying Γ(λγi , µγi ) = 0 we obtain j ∂Fj ΓδFj + ∂λ Γδλγi + ∂µ Γδµγi = 0. Comparing these equations we get: ∂λ Γ -δλγi δµ|γi = δµγi + ∂µ Γ γi
and the second term does not contribute to the wedge product in eq. (5.62). The contribution of K2 is exactly the same. So we finally get: Resγi K = 2δµγi ∧ δλγi
(5.63)
We now show that there are no residues at the branch points due to the proper choice of K2 . Let us look at the term K1 . At a branch point b, Ψ(−1) has a simple pole, δL is regular, δΨ has a simple pole due to eq. (5.58) and the form dλ has a simple zero, hence K1 has a simple pole
5.9 Symplectic form
159
at b. To compute its residue it is enough to keep the polar part in δΨ, i.e. to replace δΨ by ∂µ Ψδµ (recall that µ is a good local parameter around b). We get: Resb K1 = Resb Ψ(−1) δL∂µ Ψ ∧ δµ dλ = Resb Ψ(−1) (δL − δµ)∂µ Ψ ∧ δµ dλ where in the last equation we have used the antisymmetry of the wedge product to replace δL by δL − δµ. Using again the eigenvector equation (L−µ)Ψ = 0, and varying the point (λ, µ) on the curve around b, one gets (L − µ)∂µ Ψ = Ψ −
dλ dL Ψ dµ dλ
(5.64)
where dλ/dµ vanishes at the branch point. We then differentiate with δ and multiply on the left by Ψ(−1) to get: Resb Ψ(−1) (δL − δµ)∂µ Ψ ∧ δµ dλ = Resb Ψ(−1) δΨ ∧ δµ dλ dλ dL (−1) − Resb Ψ δ Ψ ∧ δµ dλ dµ dλ It is easy to see that the first term is exactly cancelled by the term Resb K2 . The second term gives a non-vanishing contribution Resb
δµb ∧ δµ dλ µ − µb
(5.65)
To show it, note that the quantity ζ = (dλ/dµ)(dL/dλ)Ψ vanishes at b = (λb , µb ). Writing ζ = (µ−µb )ζ1 , we get δζ = −δµb /(µ−µb )ζ+δµ ζ1 +ζ2 with ζ2 regular. The ζ1 term does not contribute due to the antisymmetry of the wedge product and the ζ2 term has no residue. Using eq. (5.64) we dλ dL have Ψ(−1) dµ dλ Ψ = 1 yielding eq. (5.65). This contribution is exactly cancelled by the contribution of a new form K3 : K3 = δ (log ∂µ Γ) ∧ δµ dλ We will see that K3 has poles only at the branch points. At the branch point b, ∂µ Γ has a zero, so we write ∂µ Γ = (µ − µb )S with S regular. The contribution of the point b to K3 is: Resb
δ∂µ Γ ∧ δµdλ ∂µ Γ
The variation of ∂µ Γ reads δ∂µ Γ = δ(µ − µb ) S + (µ − µb )δS. The second term does not contribute to the residue because S is regular, while the
160
5 Analytical methods
variation δµ cancels due to the antisymmetry of the wedge product, and we are left with the contribution of δµb which exactly cancels eq. (5.65). We now compute the residues above λ = ∞. Recall that we consider a reduced Hamiltonian system under the action of diagonal matrices. Recall the normalization of the eigenvectors at ∞, eq. (5.60). Notice that L = L0 + O(1/λ), where L0 is non-dynamical so δL0 = 0, and that µ = ai + O(1/λ) around Qi hence δL and δµ are O(1/λ). Moreover, Ψ(−1) vanishes at Qi and dλ has a double pole. Altogether K1 and K2 are regular at Qi since (δΨ)(Qi ) = O(1) due to the normalization condition. Finally, K3is also regular since, on the sheet µ = µi (λ), one can write ∂µ Γ = j=i (µi − µj ) yielding δ log ∂µ Γ = O(1/λ). Hence δ log ∂µ Γ ∧ δµ = O(1/λ2 ) has a double zero which compensates the double pole of dλ at infinity. All this shows that K has no residues above λ = ∞. It remains to show that K3 has no other poles. Obviously, K3 is regular at the points of the dynamical divisor and does not contribute to the residues at these points. To compute the residue of K3 at the points Pk,i above λk , we note that if ∂µ Γ has a pole of some order m, it can be written ∂µ Γ = c(λ)/(λ − λk )m , where c(λ) is regular and non-vanishing. Since δλ = 0 and δλk = 0 we get δ (log ∂µ Γ) = δ log c(λ) which is regular. At λk we remark that δµ is regular on all sheets above λk . This is because, due to the form of L(λ), we have µ = (Ak )− + regular. Since (Ak )− characterizes the coadjoint orbit and is not dynamical, one has to take δ(Ak )− = 0. Hence K3 has no residue. Proposition. The form eq. (5.61) is given by: ω=2
g i=1
δλγi ∧ δµγi = 2
Resλk Tr (Ak )− gk−1 δgk ∧ gk−1 δgk dλ (5.66)
k
where (λγi , µγi ), i = 1, . . . , g, are the coordinates of the points of the dynamical divisor D. This shows that ω is the symplectic form on the orbit. Proof. Let us compute the residues at the poles λk of K1 , where only Lk contributes. Recall the local diagonalization theorem of Chapter 3, eq. (3.8), which allows us to write the Lax matrix as L = gk Ak gk−1 around λ = λk . Thus locally around λk we may identify the matrix Ψ(λ) with gk . −1 −1 More precisely, by eq. (5.20), we have Ψ(λ) = gk dk and Ψ (λ) = d−1 k gk with dk a diagonal matrix. The residues are obtained by integrating over small circles surrounding each of the N points Pk,i above λk . We can choose these small circles so that they project on the base λ on a single
5.9 Symplectic form
161
small circle surrounding λk . Then we get N i=1
N 1 Ψ(−1) (Pi )δL(λ) ∧ δΨ(Pi ) dλ 2iπ Ck,i i=1
1 −1 (λ)δL(λ) ∧ δ Ψ(λ)dλ Tr Ψ (5.67) = 2iπ Ck
ResPk,i K1 =
−1 (λ) is equal to the matrix whose rows where we used the fact that Ψ (−1) (Pi ). The trace has been reconstructed in eq. (5.67) are the vectors Ψ because Ψ(Pi ), i = 1, . . . , N , form a basis of eigenvectors. Using the iden tification of Ψ(λ) in terms of gk gives: Resλk K1 = Resλk Tr
−1 −1 −1 −1 d−1 δg ∧ (δg g (A ) g − g (A ) g δg g d + g δd ) dλ − − k k k k k k k k k k k k k k = −2 Resλk Tr (Ak )− gk−1 δgk ∧ gk−1 δgk + gk−1 δgk [(Ak )− , δdk d−1 k ] dλ The last term vanishes because it involves the commutator of two diagonal matrices. Finally, K2 is regular at λk because, as we already remarked, δµ is regular on all the sheets above λk . This proposition means that the coordinates (λγi , µγi ) of the point γi of the dynamical divisor are canonical coordinates. Remark 1. This result shows the nice interplay between the analytical and the group-theoretical approaches to integrable systems. We are able to show that (λγi , µγi ) are canonical coordinates using only the fact that L parametrizes a coadjoint orbit, specified by constant matrices (Ak )− and L0 . Remark 2. In practice, to perform one has to compute at each this calculation,
Ψ −1 dλ. In the residue at λk , only pole of L the quantity ωk = Resλk Tr δL ∧ δ Ψ δLk appears. From eq. (5.20) one has Ψ −1 )jet , Lk ] δLk = [(δ Ψ
(5.68)
Ψ −1 )jet up where ()jet is the expansion to order nk − 1. This equation determines (δ Ψ to a quantity commuting with Lk . It is easy to see that
Ψ −1 , Lk ] ∧ δ Ψ Ψ −1 dλ ωk = Resλk Tr [δ Ψ is not affected by this ambiguity, using the antisymmetry of the wedge product and the cyclicity of the trace.
162
5 Analytical methods
Example. We consider the example of the Neumann model. The above analysis can be applied to this model, except for one feature. As we have already stressed, there is no residual action of diagonal matrices on L, hence one has to pay special attention to the residues above λ = ∞. The Ψ −1 dλ. sum of residues at the poles above λ = ∞ is Res∞ Tr δL ∧ δ Ψ Ψ −1 is regular at ∞, while dλ has a pole of order One can see that δ Ψ 2. Since δL has a zero of order 1, one generally gets a residue. However Ψ −1 is constrained by δΨ Ψ −1 , L] + Ψδ µΨ −1 δL = [δ Ψ At λ = ∞ the second term vanishes and L tends to D, hence the order 0 Ψ −1 is diagonal. The leading term in δL is 1/λ δJ (see eq. (3.3) term in δ Ψ in Chapter 3) and has no diagonal element, consequently the considered µ trace vanishes. Similarly the term involving K2 has no residue because δ vanishes to order 2 at λ = ∞. To compute the residue at λ = 0 (a second order pole of L) we remark that the jet: Ψ −1 = δX t X − Xδ t X + λ δY t X + Xδ t Y − δX t Y − Y δ t X + O(λ2 ) δΨ solves eq. (5.68) and has the correct symmetry properties. One gets the expression of the symplectic form of the Neumann model in terms of the dynamical divisor: ω=4
N
δyi ∧ δxi = 2
i=1
N −1
δλγj ∧ δµγj
(5.69)
j=1
5.10 Separation of variables and the spectral curve Let us call Fi , i = 1, . . . , g, the action variables which are also the moduli of the spectral curve. For fixed Fi , we have seen that the motion takes place on the Jacobian Jac (Γ{Fi } ). When varying initial conditions, the {Fi } will eventually vary and we get a foliation of the (complexified) phase space in terms of the Jacobian tori of Γ{Fi } . So we are back to the situation described in Liouville’s theorem, cf. Chapter 2. Let us check that the symplectic form does vanish when we restrict ourself to one of the tori of the foliation. We view Jac (Γ) as the g th symmetric product Γg . Solving the equation Γ(λγj , µγj ; {Fi }) = 0, one has µγj = µγj ({Fi }, λγj ) which depends on λγj only and not on the other λ. The symplectic form can then be written as: g g ∂µγj δFi ∧ δλγj , α = δµγj ∧ δλγj = µγj δλγj ω = δα = ∂Fi j=1
i,j
j=1
5.10 Separation of variables and the spectral curve
163
If we restrict ourselves to a level manifold Fi = fi , we have δFi = 0 and ω|f = 0. Let us explain why the conjugate variables (λγj , µγj ) form a set of separated variables. The construction is similar to the method used for proving the Liouville theorem. Consider the function m λγj S({Fi }, {λγj }) = α= µ(λ)dλ m0
j
λ0
The integration contour is drawn on the level manifold Fi = fi . Just as in the Liouville case, this function does not depend on local variations of the integration path. It is explicitly separated since it can be written as a sum of functions each depending on only one variable λγj : S({Fi }, {λγj }) = Sj ({Fi }, λγj ) j
Since
∂Sj ∂λγj
= µγj and since the point (λγj , µγj ) belongs to the curve Γ
with equation Γ(λ, µ) = 0, each function Sj is a solution of the differential equation: ∂Sj Γ λγj , ; {Fi } = 0, k = 1, . . . , g (5.70) ∂λγj Of course the coefficients of the function Γ(λ, µ) depend on the values of the integrals of motion Fi . This is an equation of the form N −1 ∂Sj N ∂Sj q − + rq (λγj ) =0 ∂λγj ∂λγj q=1
where the coefficients rq (λ) are defined in eq. (5.4). Remark. Equation (5.70) plays an important role in the quantum case. It is the separated Schroedinger equation also known as the Baxter equation in some cases. The commuting Hamiltonians {Fi } are functions of the 2g coordinates λγj , µγj . To find them we write that the curve Γ(λ, µ; {Fi }) = 0 passes through the g points (λγj , µγj ) of the dynamical divisor D. Hence the equations of the Liouville torus Fi = fi in these coordinates read Γ(λγj , µγj ; {fi }) = 0
(5.71)
The standard Hamilton–Jacobi equation is obtained by setting µγj = ∂λγj S, where S is the action. Due to the form of eq. (5.71), it is clear
164
5 Analytical methods
that one can take S({λγj }) = j s(λγj ), where the unique function s(λ) obeys the one-variable equation Γ(λ, ∂λ s; {fi }) = 0. This shows that the Hamilton–Jacobi equation separates into g identical one-variable equations, using the variables λγj . This is a particularly striking example of separation of variables. Remark. It is sometime advantageous to consider λγj as a function of µγj . Defining S =
µγ
∂S λγj dµγj , we get λγj = ∂µ . The relation between S and S is γj simply a Legendre transform: S = j µγj λγj − S. j
j
µ0
Example. In the Neumann model, we see that eq. (2.30) in Chapter 2 is exactly of the form 1 µγj λγj dµγj S= 2 j
where the points (λγj , µγj ) belongs to the spectral curve eq. (5.13). So the results of Chapter 2 are particular cases of the general theory explained in this chapter. Moreover, we see in eq. (5.14) that the spectral curve depends on g = N −1 dynamical moduli bi , while the ai are non-dynamical. Asking that a curve of the form eq. (5.13) passes through the g points (λγj , µγj ) determines the symmetric functions of the coefficients bi in terms of the (λγj , µγj ). In fact, setting P (µ) = i (µ − bi ), we find the conditions P (µγj ) = −λ2γj i (µγj − ai ). By the Lagrange interpolation formula we reconstruct P (µ): 2 i (µγj − ai ) (µ − µγj ) − λγj (µ − µγk ) (5.72) P (µ) = k=j (µγj − µγk ) j
j
k=j
Using the canonical Poisson bracket eq. (5.69), it is a simple exercise to check that {P (µ), P (µ )} = 0, as it should be. 5.11 Action–angle variables So far, we dealt with complexified dynamical sytems. We found that the phase space of this complexified system can be viewed as a fibration where the base is the moduli space of the spectral curve and the fibre is the Jacobian of the spectral curve corresponding to the specific values of the moduli parameters. This is very similar to the situation in the Liouville theorem, but the Liouville tori are real tori of dimension g, while the Jacobian has real dimension 2g. We need to choose a real slice of this complex phase space.
5.11 Action–angle variables
165
This can be done as follows. On Γ we choose a canonical basis of 2g cycles ak , bk . The cycles ak are non-intersecting. Once they are chosen, we can adapt the basis of Abelian differentials of the first kind ωk such that they . are normalized by ak ωl = δkl . The real slice can be defined by restricting the g points of the dynamical divisor D to move along these g non-intersecting cycles, each point on a different cycle. This obviously is a product of g real circles. One has to be aware that a real slice has in general several connected components, and the above description applies to one of them. Finally, explicit models correspond to specific cycles and not only to homology classes of cycles. The g angle variables are given by θk =
g i=1
γi
m0
ωk =
g i=1
λγi
σk (λ)dλ
(5.73)
λ0
where the integration paths are taken along the cycles ak and the Abelian differentials ωk are written in terms of the local parameter λ as ωk = σk (λ)dλ. With these assumptions the angles have real periods. The angles being defined, one may find the conjugated action variables. To do this, we need a Lemma. Lemma. The conditions characterizing the dynamical moduli can be summarized into the single statement: δ(µdλ) is a regular form
(5.74)
where δ is the differentiation with respect to the moduli, keeping λ constant. Proof. Taking the variation at λ constant produces poles at the branch points of the covering which are cancelled by corresponding zeroes of dλ. The form µdλ has poles at finite distance where L(λ) has poles. Around a pole λk , we have (k)
Lk (λ) = (g (k) (λ)A− (λ)g (k)−1 (λ))− (k)
and we assume that the diagonal polar part A− (λ) is non-dynamical. Hence the singular part of µ is kept fixed under δ, and δ(µdλ) is regular at λk . At λ = ∞, dλ has a double pole. The dominant term of µdλ is ai dλ when µ → Qi = (∞, ai ), and is kept fixed because we assume that L0 is non-diagonal. The subdominant term is also kept fixed because of the reduction by the group of conjugation by diagonal matrices.
166
5 Analytical methods
The Hamiltonians Hn generating this group action are given in eq. (5.8). Setting µi = ai + bλi + · · ·, we have Hn = an−1 bi i i
After Hamiltonian reduction, these quantities are to be kept fixed. So both ai and bi are non-dynamical and δ(µdλ) is regular at infinity. We emphasize that all the conditions specifying the non-dynamical variables in L(λ) are accounted for by eq. (5.74). Under these conditions, we have seen at the beginning of this chapter that the counting of parameters leaves a phase space of dimension 2g. The g action variables are now easily constructed: Proposition. Assume that δ(µdλ) is regular. Then we have: ω=
g
δµγi ∧ δλγi =
i=1
g
δIi ∧ δθi
(5.75)
i=1
where the action variable Ik , canonically conjugated to θk , is Ik = µdλ
(5.76)
ak
on the basis of holomorphic Proof. By eq. (5.74), δ(µdλ) decomposes Abelian differentials: δ(µdλ) = i αi ωi . To find the coefficients αi , we integrate both sides on the cycles al . We get δ(µdλ) = δ µdλ = δIl (5.77) αl = al
al
Hence we have, with ωk = σk (λ)dλ: δIk ωk = δIk σk (λ)dλ δ(µdλ) = k
k
Since the variations are taken at λ constant so that δ(µdλ) = δ(µ)dλ, and since δµ decomposes on the δIk by eq. (5.58), we have ∂µ ∂µ δIk , = σk (λ) δµ = ∂Ik ∂Ik k
By the definition of the angular variables in eq. (5.73) one has, using δIk : δσi (λ) = k ∂σ∂Ii (λ) k g g λγ j ∂2µ σi (λγj )δλγj + dλ δIk δθi = ∂Ii ∂Ik λ0 j=1
j=1
k
167
5.12 Riemann surfaces and integrability Finally, we obtain: ω = δµγi ∧ δλγi = δIi ∧ σi (λγj )δλγj i
=
i
δIi ∧ δθi −
i,j
j
λγj
λ0
∂2µ dλ δIk = δIi ∧ δθi ∂Ii ∂Ik i
k
where the second term vanishes because ∂Ii ∂Ik µ is symmetrical in the indices i, k and δIi ∧ δIk is antisymmetric. This shows that the Ii are canonically conjugated to the θi . At the level of Poisson brackets, we have {Ii , Ij } = 0,
{Ii , θj } = δij ,
{θi , θj } = 0
5.12 Riemann surfaces and integrability We are now in a position to clarify the link between integrable systems and Riemann surfaces. Let Γ be a Riemann surface of genus g and let λ be a meromorphic function on it. We assume that λ takes each value N times. Any other meromorphic function µ on Γ is related to λ by an algebraic relation Γ(λ, µ) = 0. One can choose µ such that this relation is irreducible. Then the field of meromorphic functions on Γ is the field of rational functions of λ and µ. The choice of these functions allows us to present Γ as an N sheeted covering of the Riemann sphere by (λ, µ) → λ. We can interpret Γ as the spectral curve of a Lax matrix L(λ) in the following way. Let Q1 , Q2 , . . . , QN be the N points above λ = ∞, µ(Qi ) = ai . Choose a divisor D of g points on Γ. From these data, we construct N linearly independent meromorphic functions, ψ1 = 1 and ψk with a zero at Q1 and poles at D + Qk for k = 2, . . . , N . This determines ψk uniquely up to multiplication by a constant ck . Let Pi = (λ, µi ) be the N points ij = ψi (Pj ) and µ = diag(µi ), and above λ. Define the N × N matrices Ψ let µ −1 L=Ψ Ψ This matrix is a rational function of λ because it is a rational function of λ, µ1 , . . . , µN , invariant by permutations of the µj . It tends to the diagonal matrix diag(ai ) at ∞, and Γ is the spectral curve of L(λ). Note that L is defined only up to conjugation by diagonal matrices due to the undeterminacy in the normalization of the functions ψk .
168
5 Analytical methods
We now introduce time evolutions such that Γ is time-independent, but the divisor D depends on time. This is enough to assert the existence of a rational Lax equation ˙ L(λ) = [M (λ), L(λ)],
˙ Ψ −1 M (λ) = Ψ
We are thus exactly in the situation of the Zakharov–Shabat construction. To relate to Liouville integrable systems, we have to introduce a symplectic structure on the dynamical variables. We have seen that imposing coadjoint orbit structure at the poles of L(λ) automatically yields integrable systems once we have performed the Hamiltonian reduction by the diagonal group action of dimension N − 1. This produces a dynamical system of dimension 2g. The g angle variables are given by the dynamical divisor, which evolves linearly on Jac(Γ), and the g action variables are contained in the moduli of the curve. The conditions we impose on the moduli, coming from the coadjoint orbit structure and the Hamiltonian reduction, can be written in a very concise way: δ(µdλ) is a holomorphic differential where δ is the differential with respect to the dynamical moduli. This means that the polar parts of µdλ are non-dynamical. Since δµdλ = g ω k=1 k δIk , we see that we have exactly g dynamical modules. In this setting, the standard symplectic form on the variables γi = (λγi , µγi ) is equal to the Kirillov symplectic form on L(λ), as we have shown . θi = γi in eq. (5.61). Moreover, due to eq. (5.75), the angle variables ω are canonically conjugated to the action variables I = j i i ai µdλ, i.e. g g δµγi ∧ δλγi = ωK = δIi ∧ δθi (5.78) i=1
i=1
We want to emphasize the meaning of this result. Starting from a Riemann surface Γ(λ, µ) = 0, we specify g dynamical moduli F1 , . . . , Fg , by imposing that δ(µdλ) be regular. We take g arbitrary points γi = (λγi , µγi ) and impose the symplectic structure ω = i δµγi ∧ δλγi on these data. We determine the g moduli Fi by solving the g equations meaning that the curve passes through the points γi : Γ(λγi , µγi ; F1 , . . . , Fg ) = 0 This determines Fi as symmetric functions of the λj , µj . The beautiful result is that these functions Poisson commute, {Fi , Fj } = 0, because, by eq. (5.78), the action variables, Ii , Poisson commute and they are
5.13 The Kowalevski top
169
independent functions of the Fj . See eq. (5.72) for an example of this situation. Remark 1.
The above construction can be generalized by imposing conditions
such as
δµ dλ (n,m) = ωk δIk n m µ λ g
k=1
(n,m) Ik .
for g modules This will modify the symplectic form as well. An example of this is given in Chapter 6.
Remark 2. If the Riemann surface Γ can be viewed as covering of the Riemann sphere in different ways, one can construct Lax matrices L(λ) of different sizes in the above way. In particular if Γ is hyperelliptic, one can construct a 2 × 2 Lax matrix.
Remark 3. Lax matrices with elliptic dependence on the spectral parameter can be viewed as particular cases of this setup when the covering of the Riemann sphere λ : Γ → S factorizes as λ : Γ → T → S, where T is the torus. The rational Lax matrix has a size twice as big as the elliptic one 5.13 The Kowalevski top We now briefly discuss the algebro-geometric solution of the Kowalevski top. It is more convenient to start from a slightly different Lax matrix from the one in eq. (4.69) in Chapter 4, obtained by conjugation L(λ) → λP −1 L(λ)P : iσ0 + σ3 σ1 + σ2 P = σ0 + iσ3 iσ1 − iσ2 The new Lax matrix reads: 0 −λ ξ2 0 i λ ξ1 L(λ) = 1 2 z λ γ3 1 2 1 −λ γ3 − z2 2
1 z2 2 −λ γ3 −J3 1 −2 − λ ξ2 λ
λ γ3 1 − z1 2 1 2 + λ ξ1 λ J3
(5.79)
where we have used Kowalevski’s variables z1 = J1 + iJ2 , z2 = J1 − iJ2 , ξ1 = γ1 + iγ2 , ξ2 = γ1 − iγ2 . We restrict ourselves to the pure Kowalevski case γ = 0. In this basis the Lax matrix satisfies the symmetry properties: −1 t L(−λ) = −Σ−1 L(λ) Σ1 , t L(λ) = −Σ−1 1 2 L(λ) Σ2 , L(−λ) = Σ3 L(λ) Σ3 (5.80)
170
5 Analytical methods
where the matrices Σ1 , Σ2 , Σ3 are given by: σ1 0 σ2 0 , Σ2 = , Σ1 = 0 σ1 0 σ2
Σ3 =
σ3 0
0 σ3
The matrix P has been chosen to simplify the expression of these symmetries and in particular to diagonalize Σ3 . The first of eqs. (5.80) expresses the fact that L(λ) belongs to a twisted loop algebra. The second one says that L(λ) belongs to sp(4), which is well-known to be isomorphic to so(3, 2), while the third is a combination of the first two. The equation of the spectral curve Γ : det (L(λ) − µ) = 0 reads: 2 1 λ 2 1 λ4 λ2 K 4 γ − H + 2 µ2 + (γ 2 )2 + ((J · γ )2 − Hγ 2 ) + =0 µ − 2 4 λ 16 16 256 (5.81) The Hamiltonians H and K in this formula are given by: 1 2 1 (J + J22 + 2J32 ) − 4γ1 = z1 z2 + J32 − 2(ξ1 + ξ2 ) 2 1 2 K = (J12 − J22 + 8γ1 )2 + (2J1 J2 + 8γ2 )2 = (z12 + 8ξ1 )(z22 + 8ξ2 ) H =
while γ 2 and J · γ are in the centre of the Poisson algebra. Note that the coordinates λ and µ on the spectral curve appear only through λ2 and µ2 , which is a consequence of the symmetries eqs. (5.80). It will be necessary in the following to have a clear picture of the solutions µ(λ) of eq. (5.81) around λ = 0 and λ = ∞. Around λ = ∞, we have four branches: J · γ γ 2 1 µ= λ + i + O( ) (5.82) 2 2 λ 4 γ where , are independent signs. Around λ = 0, we get two branches with µ → 0 and two branches with µ → ∞: √ 1 H K (5.83) λ + O(λ3 ), µ = − λ + O(λ3 ) µ= 16 λ 8 Of course all these branches properly exchange under the symmetries λ → −λ and µ → −µ. At this point it is important to recall that Γ is defined as the desingularization of the curve defined by eq. (5.81). We are going to study Γ by considering it as successive coverings of simpler curves. Setting λ2 = z in eq. (5.81) yields a curve C of equation: z 2 1 z2 z K 1 4 µ2 + (γ 2 )2 + ((J · γ )2 − Hγ 2 ) + µ − γ − H + =0 2 4 z 16 16 256
5.13 The Kowalevski top
171
and Γ is a two-sheeted cover of C. Setting µ2 = y we get the curve E of equation: z2 z 2 1 z K 1 y + (γ 2 )2 + ((J · γ )2 − Hγ 2 ) + y2 − γ − H + =0 2 4 z 16 16 256 and C is a two-sheeted branched cover of E. First, E is an elliptic curve of genus 1. Indeed, setting t = 1/z and Y = ty − 14 γ 2 + H8 t − 12 t2 , the equation of E takes the form Y 2 = tP3 (t), where P3 (t) is the polynomial of degree 3: 2 H 1 3 H 2 (J · γ )2 γ 2 K t− P3 (t) = t − t + + − 4 8 64 4 256 16 The four branch points are obtained for Y = 0, so there is a branch point at t = 0 (or z = ∞) and three branch points at t (or z) finite. We now study the covering C → E, coming from µ → y = µ2 . This two-sheeted covering can only be branched at y = 0 and y = ∞. The meromorphic function y on E takes each value three times because, given y, z is determined by a third degree equation. Hence it has three zeroes and three poles. Setting y = 0 in the equation of E, one gets 1 2 2 1 K 2 (γ ) + t((J · γ )2 − Hγ 2 ) + t =0 16 16 256 yielding two points (y = 0, t = t1 ) and (y = 0, t = t2 ), where t1 , t2 are the two roots of this second degree equation. The third point with y = 0 and the three points with y = ∞ occur when t = 0 and t = ∞. For t → ∞ we have two points P1 , P2 on the curve E corresponding to the branches: H K 1 1 1 P1 : y = t − , P2 : y = (5.84) +O +O 2 4 t 256 t t Since t is a good local parameter at ∞, P1 is a pole of y, while P√ 2 provides the third zero of y. For t → 0, a good local parameter on E is t and we find two branches: P3 : y =
γ 2 1 J · γ 1 √ + O(1) ±i 4 t 4 t
(5.85)
showing that y has a double pole at this point P3 (t = 0, y = ∞) on E. Of these six poles and zeroes of y only four are branch points of the covering C → E. This is because at the point P3 the equation of C is singular. Since C is the desingularized curve, P3 blows up to two points P˜3 and P˜3 of C, and the point P3 has two pre-images as its neighbours. On the other
172
5 Analytical methods
hand, P1 and P2 are branch points and have just one pre-image each, P˜1 and P˜2 on C. Using the Riemann–Hurwitz formula 2g−2 = N (2g0 −2)+ν, where g0 = 1, N = 2 and ν = 4, we find that C has genus 3. Finally, we study the covering Γ → C, coming from λ → z = λ2 which can possibly be ramified only at z = 0, ∞. Generically, given z, there are four values of µ satisfying the equation of C, so that the meromorphic function z has four zeroes and four poles. We have already obtained the four branches of Γ above λ = ∞ in eq. (5.82) which correspond by definition to four points on the smooth curve Γ. They project on the two branches of C given by eq. (5.85), hence the covering at the two points P˜3 and P˜3 is unbranched. Similarly, above λ = 0 we have the four √ branches of Γ given √ in eq. (5.83). The two points of Γ, Q1 (µ ∼ λ K/16) and Q2 (µ ∼ −λ K/16), project on P˜2 , and the two points Q3 (µ ∼ 1/λ) and Q4 (µ ∼ −1/λ) project on P˜1 , as seen from eq. (5.84). So the covering is unbranched at these points. This exhausts the zeroes and poles of z, hence the covering Γ → C is unbranched. Applying the Riemann–Hurwitz formula with g0 = 3, N = 2, ν = 0, we find that the genus of the spectral curve Γ is equal to 5. We see that, in the case of the Kowalevski top, the Jacobian of the spectral curve is of dimension 5 while the Liouville tori are of dimension 2. This non-generic situation is related to the symmetry properties, eqs. (5.80), of the Lax matrix L(λ) as we now show. Consider the eigenvector Ψ(λ, µ) satisfying L(λ)Ψ = µΨ at the point (λ, µ) of the spectral curve, normalized such that the first component is equal to 1. According to the general discussion Ψ has g + N − 1 = 8 poles on the spectral curve. It is easy to get the following expansions for the eigenvector at the four reads points Q1 , Q2 , Q3 , Q4 , above λ = 0. The matrix Ψ(λ) 1 1 1 1 −iζ + O(λ) i zz12 + O(λ) −i zz12 + O(λ) iζ + O(λ) −iz2 4i 1 ζλ + O(λ2 ) iz2 ζλ + O(λ2 ) − 4i 1 + O(1) + O(1) 4
− 14 z1 λ + O(λ2 )
4
− 14 z1 λ
z2 λ
z2 λ
− z42 λ1 + O(1) − z42 λ1 + O(1) (5.86) The column i corresponds to an expansion at the point Qi . We have denoted ζ = (z12 + 8ξ1 )/(z22 + 8ξ2 ). Note that the eigenvector Ψ(λ, µ) has simple poles at Q3 and Q4 . This can be understood in the context of the general analysis of this chapter. Indeed, choosing a basis where the constant coefficient of 1/λ in L(λ) (which plays the role of L0 in the general discussion) is diagonal, we expect poles at the points above λ = 0. However, because we have two degenerate vanishing eigenvalues, we have only two-poles instead of the expected three. They are on the sheets corresponding to the non-vanishing
173
5.13 The Kowalevski top
eigenvalues, hence at the two points Q3 , Q4 . When returning to our basis, we have to make linear combinations of the last two components of Ψ and this explains our formulae. Recall that the Lax matrix obeys eq. (5.80) so that if L(λ)Ψ(λ, µ) = µΨ(λ, µ) then L(−λ)Σ3 Ψ(λ, µ) = µΣ3 Ψ(λ, µ). Since Σ3 is diagonal and the first component of Ψ is equal to one, we get: Ψ(−λ, µ) = Σ3 Ψ(λ, µ)
(5.87)
We have seen that Γ is a two-sheeted unbranched cover of C. The two sheets are exchanged under the involution τ : (λ, µ) → (−λ, µ). Note that τ exchanges the points Q1 , Q2 and also Q3 , Q4 . It exchanges the corresponding sheets in a vicinity of λ = 0. On the explicit solution, eq. (5.86), we have Ψ(−λ) = Σ3 Ψ(λ)Σ 1 . The matrix Σ1 on the right accounts for the exchange of the sheets under τ . As a result of eq. (5.87) the meromorphic functions ψi (P ), i = 2, 3, 4 on the curve Γ obey the symmetry properties: ψ2 (τ · P ) = −ψ2 (P ),
ψ3 (τ · P ) = ψ3 (P ),
ψ4 (τ · P ) = −ψ4 (P )
In particular the divisor of the poles of Ψ is invariant under the involution τ . Two of these poles are the points Q3 , Q4 which are exchanged by this involution. The remaining six poles thus come in three pairs (γj , τ · γj ), j = 1, 2, 3. These pairs can be seen as points on C, also denoted by γi . The dynamical divisor on C, D(t) = 3i=1 γi (t), is of degree 3 and moves linearly on the dimension 3 Jacobian of the curve C, as we now show. Considering the Lax pair L, M given in eq. (4.69) in Chapter 4 and remembering that L has been rescaled, L → λL, we see that the polar parts of L and M at λ = 0 are related by M− = 1/2 L− . This relation is not affected by the further similarity L → P −1 LP that we have used in this section. Proceeding exactly as in eq. (5.35), we get: τ (γi (t)) 3 4 γi (t) 1 d ω+ ω = ResQi µω (5.88) dt 2 i=1
i=3
where Q3 , Q4 are the two points on Γ with λ = 0 and µ = ∞, and ω is any Abelian differential of the first kind on Γ. In particular, choosing for ω the pullback on Γ of an Abelian differential ω on C, the right-hand side becomes twice the residue at the point P˜1 on C of the form ( 12 µω). This is because in the vicinity of the corresponding points µ(−λ) = −µ(λ), and the pullback of a form on C has a local expression σ(λ)dλ with σ(−λ) = −σ(λ). Note that non-vanishing higher order Hamiltonians in the Kowalevski hierarchy are traces of an even power of L ultimately
174
5 Analytical methods
yielding an odd power of µ in eq. (5.88), so the same conclusion applies to all these flows. Similarly, the left-hand side doubles between a pair of corresponding points γi , by definition of a pullback. Finally, we get: d dt 3
i=1
γi (t)
ω = ResP˜1
1 µω 2
(5.89)
Since C is of genus 3, we have three independent forms ω and this proves that the flow is linear on the Jacobian of C. One can view the functions ψi (P ) as functions on C in the following way. First ψ3 (P ) is a well-defined meromorphic function on C (since it is even under τ ) with three poles at D, one-pole on P˜1 and a zero at P˜2 , and is therefore uniquely determined. The functions ψ2 (P ) and ψ4 (P ) require special treatment since they are odd under τ√ , hence are multivalued on C. However, we can consider the function λ = z defined on Γ which is odd under τ and the functions λψ2 (P ) and λψ4 (P ) which, being even under τ , yield well-defined meromorphic functions on C. These functions are uniquely characterized by their analyticity properties: λψ2 (P ) has three simple poles at D, two simple zeroes at P˜1 and P˜2 and simple poles at P˜3 and P˜3 , while λψ4 (P ) has three simple poles at D, a double zero at P˜2 , and two simple poles at P˜3 and P˜3 . Hence we have shown that one can work only with √ the genus 3 curve C, together with the extra multivalued function λ = z. We still have three points in D, while we need only two degrees of freedom. A further restriction is provided by the other symmetry (λ, µ) → (λ, −µ) induced by the second eq. (5.80). Note that the righthand side of eq. (5.89) contains only odd powers of µ for the general Kowalevski flow. At the point P˜1 (which is a branch point of C → E) µ is a good local parameter. Assume that ω is the pullback on a form on E, hence has a local expression σ(µ)dµ with σ(µ) odd, then µ2k+1 σ(µ) is even, and has no 1/µ term, so that the right-hand side of eq. (5.89) vanishes. Since E is of genus 1, we get one condition which restricts the flow. We finally see that the flow occurs on a two-dimensional subvariety of Jac (C), the so-called Prym variety of the covering C → E. It is defined as the subvariety such that any tangent vector is in the kernel of the pullback to C of any Abelian form of E. 
Here the action of ω on a tangent vector to Jac (C) is defined by the left-hand side of eq. (5.89). We have recovered a four-dimensional phase space for the Kowalevski top. At this point one can solve the equations of motion with theta-functions on Jac (C) and reduce them to two-dimensional theta-functions by using the Prym condition. Finally, Kowalevski has directly solved the system by using a curve of genus 2. For a study of these approaches and their relations we refer to the literature.
5.14 Infinite-dimensional systems
175
5.14 Infinite-dimensional systems In the field theory case, we can use the previous constructions to find particular classes of solutions to the field equations, called finite-zone solutions. The equations we have to solve are the first order differential system: (∂x − U (λ))Ψ = 0 (∂t − V (λ))Ψ = 0
(5.90) (5.91)
whose compatibility conditions are equivalent to the field equations. The situation is very different as compared to the finite-dimensional case. As we saw in Chapter 3, the analogue of the spectral curve is det(T (λ) − µ) = 0
(5.92)
where T (λ) is the monodromy matrix of the linear system (5.90, 5.91). This equation does not define an algebraic curve of finite genus. This had to be expected since, in field theory, we need an infinite number of action variables, which is incompatible with the finite genus of the spectral curve. Thus we cannot directly apply the previous construction. However, if we restrict our goal to finding only particular solutions to eqs. (5.90, 5.91), then the knowledge acquired in this chapter becomes directly applicable. In fact, the two equations (5.90, 5.91) are exactly of the type of eq. (5.33), whose solution was built in terms of Baker–Akhiezer functions. One can adapt this construction to solve them simultaneously. The idea consists of interpreting the two equations (5.90, 5.91) as evolution equations with respect to two different “times” for a system with a finite number of degrees of freedom associated with some Lax matrix L(λ). This Lax matrix should satisfy: [∂x − U (λ), L(λ)] = 0 [∂t − V (λ), L(λ)] = 0
(5.93)
To exhibit such Lax matrices, we consider the higher order flows as described in eq. (3.95) of Chapter 3. They provide a family of compatible linear equations (∂ti − Vi )Ψ = 0 for i = 1, 2, 3, . . . , where we have identified t1 = x, V1 = U and t2 = t, V2 = V . Since these equations are compatible they satisfy a zero-curvature condition: Fij ≡ ∂ti Vj − ∂tj Vi − [Vi , Vj ] = 0,
∀i, j = 1, . . . , ∞
We now look for particular solutions which are stationary for some given time tn , i.e. ∂tn Vi = 0 for all i. The zero-curvature conditions Fni = 0
176
5 Analytical methods
reduce to a system of Lax equations: dL = [Mi , L], dti
i = 1, . . . , ∞
with L = Vn , Mi = Vi
This is an integrable hierarchy for a finite-dimensional dynamical system described by the Lax matrix L. Taking n larger and larger, the genus of the corresponding spectral curve usually increases and we get families of solutions involving more and more parameters. We give an example of this procedure in Chapter 11. The methods of this chapter deal in fact with systems with a finite number of degrees of freedom. In Chapter 13 we present the inverse scattering method which deals directly with systems with an infinite number of degrees of freedom. References [1] Sophie Kowalevski, Sur le probl`eme de la rotation d’un corps solide autour d’un point fixe. Acta Mathematica 12 (1889) 177–232. [2] B. Dubrovin, V. Matveev and S. Novikov, Non-linear equations of Korteweg–de Vries type, finite-zone linear operators, and Abelian varieties. Russian Math. Surveys 31 (1976) 59–146. [3] P. van Moerbeke and D. Mumford, The spectrum of difference operators and algebraic curves. Acta. Math. 143 (1979) 93–154. [4] A.G. Reyman and M.A. Semenov-Tian-Shansky, Reduction of Hamiltonian systems, affine Lie algebras and Lax equations. Inventiones Mathematicae 54 (1979) 81–100. [5] M. Adler and P. van Moerbeke, Linearization of Hamiltonian systems, Jacobi varieties and representation theory. Advances in Mathematics 38 (1980) 318–379. [6] A.G. Reyman and M.A. Semenov-Tian-Shansky, Reduction of Hamiltonian systems, affine Lie algebras and Lax equations II. Inventiones Mathematicae 63 (1981) 425–432. [7] D. Mumford, Tata Lectures on Theta Vols. I and II, Birkhauser (1983–1984). [8] A. Bobenko, A. Reyman and M. Semenov-Tian-Shansky, The Kowalevski top 99 years later: a Lax pair, generalizations and explicit solutions. Comm. Math. Phys. 122 (1989) 321–354.
5.14 Infinite-dimensional systems
177
[9] E. Horozov and P. van Moerbeke, The full geometry of Kowalevski’s top and (1,2)-Abelian surfaces. Comm. Pure Appl. Math. 42 (1989) 357–407. [10] B.A. Dubrovin, I.M. Krichever and S.P. Novikov, Integrable Systems I. Encyclopedia of Mathematical Sciences, Dynamical Systems IV, Springer (1990) 173–281. [11] A. Beauville, Jacobienne des courbes spectrules et syst`emes hamiltoniens compl`etement int´egrables. Act. Math. 164 (1990) 211–235. [12] I.M. Krichever and D.H. Phong, On the integrable geometry of soliton equations and N=2 supersymmetric gauge theories. J. Diff. Geom. 45 (1997) 349–389.
6 The closed Toda chain
In contrast to open Toda chains, the closed Toda chain is associated with loop algebras. This introduces a spectral parameter into the theory. The aim of this chapter is to construct the general solution of the closed Toda chain by means of the analytical method. We shall do this in two ways. The first method, based on an (n+1)×(n+1) Lax matrix, follows closely Chapter 5. This canonical example illustrates the general constructions of this chapter, for instance, linearization of the flows on the Jacobian of the spectral curve, separation of variables and the corresponding Hamilton– Jacobi equations. The second method, which is based on 2 × 2 matrices, can be regarded as a lattice version of the field theoretical considerations of Chapter 3. A monodromy matrix is introduced which satisfies a quadratic Poisson bracket. This provides short cuts which in general are not available, and makes contact with Sklyanin’s method of separation of variables. We take advantage of this particular example to discuss the reality conditions, which are frequently quite subtle.
6.1 The model We consider a chain of (n + 1) points with positions qi and momenta pi with equations of motion given by: q˙i = pi ,
p˙i = 2e2(qi−1 −qi ) − 2e2(qi −qi+1 ) ,
i = 1, . . . , n + 1
(6.1)
The fact that the chain is closed is implemented by setting (p0 , q0 ) = (pn+1 , qn+1 ) and (pn+2 , qn+2 ) = (p1 , q1 ), which gives sense to the above equations for i = 1 and i = n + 1. Alternatively, one can view the points 178
6.1 The model
179
as sitting on a circle. This is a Hamiltonian system with canonical Poisson brackets {pi , qj } = 12 δij and Hamiltonian: H= p2i + 2 exp 2(qi − qi+1 ) (6.2) i
i
The system has translational symmetry qi → qi + a and one can eliminate n+1 the centre of mass motion by imposing the two conditions i=1 pi = 0 and n+1 i=1 qi = 0, so we are left with a phase space of dimension 2n. In contrast to the open Toda chain which is associated with the finite-dimensional Lie algebra sl(n + 1), the closed Toda chain is associated with the infinite-dimensional Kac–Moody algebra. In fact we con˜ n+1 ≡ sider only its loop representation with vanishing central charge: sl −1 sln+1 ⊗ C(λ, λ ). Its rank is (n + 1), see Chapter 16 for more details. The generators E±αi associated with the simple roots are represented by Eαi = Ei,i+1 , E−αi = Ei+1,i , for i = 1, . . . , n, together with Eαn+1 = λEn+1,1 , and E−αn+1 = λ−1 E1,n+1 , where Ejk are the (n + 1) × (n + 1) canonical matrices (Ejk )mn = δjm δkn . The elements of the Cartan subalgebra are represented by (n + 1) × (n + 1) diagonal traceless matrices. Applying the general construction of the Toda models, see Chapter 4, the Lax pair for the closed Toda chain is given by: L(λ) =
n+1
pi Eii +
i=1 n+1
M (λ) = −
n+1
ai (Eαi + E−αi )
(6.3)
i=1
ai (Eαi − E−αi )
i=1
where the coefficients ai are given by ai = exp (qi − qi+1 ), i = 1, . . . , n, and an+1 = exp (qn+1 − q1 ). Note that (qi − qi+1 ) = αi (q), where q is the traceless diagonal matrix i qi Eii and αi is the simple root associated with the root vector Ei,i+1 . The quantities ai satisfy the condition n+1 i=1 ai = 1. Explicitly, the Lax pair reads: a1 0 ... λ−1 an+1 p1 a1 p2 a2 ... 0 . .. .. . . . . ai−1 pi ai 0 L(λ) = 0 (6.4) . .. .. .. . . 0 an ... an−1 pn λan+1 ... 0 an pn+1
180
6 The closed Toda chain M (λ) =
0 a1 .. .
−a1 0
0 −a2 .. .
0 .. .
ai−1
0 −λan+1
... ...
... ... 0
−ai .. . an−1 0
0 an
λ−1 an+1 0 .. . 0 (6.5) .. . −an 0
The matrices L(λ) and M(λ) have poles at λ = 0 and λ = ∞. The equations of motion eq. (6.1) are equivalent to the Lax equation:

L̇(λ) = [M(λ), L(λ)]

as one can check easily by an explicit computation. Notice that the correct equations of motion for q_1 and q_{n+1} are obtained thanks to the λ-dependent terms in L(λ) and M(λ). As usual, the Lax equation ensures that the Tr(L^p(λ)) are conserved quantities; in particular the Hamiltonian eq. (6.2) reads H = Tr(L²(λ)), which is independent of λ.

As for all Toda models, the Poisson bracket of the Lax matrix can be written in terms of an r-matrix. This implies that the closed Toda chain is integrable. There are two natural r-matrices, r^±, which can be computed according to the general formula (4.34) in Chapter 4. In particular:

r_{12}(λ, λ′) = ρ(λ) ⊗ ρ(λ′) [ (1/2) Σ_i H_i ⊗ H_i + Σ_{α>0} E_α ⊗ E_{−α} ]

where ρ(λ) is the loop representation given above with spectral parameter λ. To compute r^+, we recall that the positive roots in s̃l_{n+1} are the λⁿE_{ij} with (i < j, n ≥ 0) or (i ≥ j, n > 0), and the corresponding negative roots are the λ⁻ⁿE_{ji}. We immediately get:

r⁺_{12}(λ, λ′) = (1/2) ((λ + λ′)/(λ′ − λ)) Σ_i E_{ii} ⊗ E_{ii} + (λ′/(λ′ − λ)) Σ_{i<j} E_{ij} ⊗ E_{ji} + (λ/(λ′ − λ)) Σ_{i>j} E_{ij} ⊗ E_{ji}    (6.6)

This expression is valid for |λ| < |λ′|. The formula for r⁻(λ, λ′) is the same as for r⁺(λ, λ′) but valid in the region |λ| > |λ′|. So we consider in general the rational function r_{12}(λ, λ′) which is the extension of the right-hand side of eq. (6.6). Notice that r_{12}(λ, λ′) = −r_{21}(λ′, λ), so that

{L_1(λ), L_2(λ′)} = [r_{12}(λ, λ′), L_1(λ) + L_2(λ′)]
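These algebraic facts are easy to probe numerically. The following sketch (our own code and variable names, not from the book) builds L(λ) of eq. (6.4) for random centre-of-mass data and checks two properties used repeatedly in this chapter: the traces Tr L^p(λ) are λ-independent, and ᵗL(λ) = L(1/λ).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3                                       # n degrees of freedom, (n+1) sites
q = rng.normal(size=n + 1); q -= q.mean()   # sum q_i = 0
p = rng.normal(size=n + 1); p -= p.mean()   # sum p_i = 0
a = np.exp(q - np.roll(q, -1))              # a_i = exp(q_i - q_{i+1}), cyclic; prod(a) = 1

def lax(lam):
    """Lax matrix L(lambda) of eq. (6.4), 0-indexed."""
    L = np.diag(p).astype(complex)
    for i in range(n):
        L[i, i + 1] = L[i + 1, i] = a[i]
    L[0, n] += a[n] / lam                   # lambda^{-1} a_{n+1} corner
    L[n, 0] += a[n] * lam                   # lambda a_{n+1} corner
    return L

# Tr L^2(lambda) is lambda-independent: the corner entries only enter through
# the product (lambda^{-1} a_{n+1}) (lambda a_{n+1}) = a_{n+1}^2.
t2 = [np.trace(lax(lam) @ lax(lam)) for lam in (0.7, 2.0 + 1.0j)]
assert np.isclose(t2[0], t2[1])

# transposition symmetry used in the reconstruction proof: L(lam)^T = L(1/lam)
lam = 1.3 + 0.4j
assert np.allclose(lax(lam).T, lax(1 / lam))
print("Lax matrix checks passed")
```

The same `lax` helper is reused in the later sketches of this chapter.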
Using the dualization Tr Res, we define from r^±_{12} two maps R^±. Let us define M_± = −2R^±(L):

M_+ = − Σ_i p_i H_i − 2 Σ_{i=1}^{n+1} a_i E_{α_i}
M_− = Σ_i p_i H_i + 2 Σ_{i=1}^{n+1} a_i E_{−α_i}

From eqs. (6.4, 6.5) we have:

M(λ) = (1/2)(M_+(λ) + M_−(λ)),    L(λ) = (1/2)(−M_+(λ) + M_−(λ))
so that the Lax equation can be written:

L̇(λ) = [M(λ), L(λ)] = [M_+(λ), L(λ)] = [M_−(λ), L(λ)]

6.2 The spectral curve

The spectral curve Γ is the smooth algebraic curve defined by:

Γ :  det(L(λ) − µ) = 0    (6.7)

For a fixed λ, this is the equation for the eigenvalues of L(λ). By expanding the determinant we see that it is of the form:

Γ(λ, µ) ≡ (λ + λ⁻¹) − 2t(µ) = 0    (6.8)

where 2t(µ) = µ^{n+1} − (Σ_{i=1}^{n+1} p_i) µⁿ + ··· is a polynomial of degree (n+1). The spectral curve is a hyperelliptic curve since it can be written as:

s² = t²(µ) − 1,  with s = λ − t(µ)    (6.9)

Let us compute the genus of the curve Γ. The polynomial t²(µ) is of degree 2n+2 and generically the equation t²(µ) − 1 = 0 has no double roots, so the genus of the curve Γ is g = n. This is equal to the number of degrees of freedom, i.e. half the dimension of phase space. In the following, we shall always assume that we are in this generic situation. Notice that the hyperelliptic curve eq. (6.9) is very special, because the polynomial of degree 2n+2 in the right-hand side is expressed in terms of a polynomial of degree n+1. This also shows that the number of action variables is precisely g = n when Σ_{i=1}^{n+1} p_i = 0.
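The form (6.8) of the curve can be tested directly: the sketch below (our own code, with random data) checks that det(µ − L(λ)) + λ + λ⁻¹ does not depend on λ, which identifies it with 2t(µ), and that the roots λ of λ + λ⁻¹ = 2t(µ) indeed give points of Γ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
q = rng.normal(size=n + 1); q -= q.mean()
p = rng.normal(size=n + 1); p -= p.mean()
a = np.exp(q - np.roll(q, -1))              # a_i = exp(q_i - q_{i+1}), cyclic

def lax(lam):
    L = np.diag(p).astype(complex)
    for i in range(n):
        L[i, i + 1] = L[i + 1, i] = a[i]
    L[0, n] += a[n] / lam
    L[n, 0] += a[n] * lam
    return L

def two_t(mu, lam):
    # eq. (6.8): det(mu - L(lam)) = 2t(mu) - lam - 1/lam, so this returns 2t(mu)
    return np.linalg.det(mu * np.eye(n + 1) - lax(lam)) + lam + 1 / lam

mu = 0.83
assert np.allclose(two_t(mu, 0.5), two_t(mu, 2.0 - 1.0j))   # lambda drops out

# solving lam + 1/lam = 2t(mu) gives a point (lam, mu) on Gamma:
tt = two_t(mu, 1.0)
lam_plus = tt / 2 + np.emath.sqrt((tt / 2) ** 2 - 1)
assert abs(np.linalg.det(lax(lam_plus) - mu * np.eye(n + 1))) < 1e-8
print("spectral curve checks passed")
```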
Let us see how we can recover this result from the general analysis in Chapter 5, when looking at the curve as an (n+1)-sheeted cover of the λ-plane. We recall the Riemann–Hurwitz formula for computing the genus: 2g − 2 = (2g_0 − 2)(n+1) + ν, where g_0 = 0 and ν is the number of branch points. To find ν, we count the zeroes of ∂Γ(λ, µ)/∂µ = −2 ∂t(µ)/∂µ. This is a polynomial of degree n in µ, independent of λ. Hence we have n zeroes at finite distance. To each value of µ correspond two values of λ, and the total contribution of the branch points at finite distance is 2n. We now look at µ = ∞, which corresponds to λ = ∞ or λ = 0. For λ = ∞, we have λ ∼ µ^{n+1}, which means that we have a branch point of order n+1, contributing n to ν. Similarly, λ = 0 is a branch point of order n+1, also contributing n to ν. Adding everything, we find ν = 4n and g = n.

We will call P⁺ and P⁻ the two points above λ = ∞ and λ = 0 respectively. In the neighbourhood of P^± the local parameter is µ⁻¹ and we have, by direct expansion of eq. (6.8):

P⁺ :  λ = µ^{n+1} ( 1 − µ⁻¹ Σ_{j=1}^{n+1} p_j + O(µ⁻²) )    (6.10)

P⁻ :  λ = µ^{−n−1} ( 1 + µ⁻¹ Σ_{j=1}^{n+1} p_j + O(µ⁻²) )    (6.11)
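The expansion (6.10) can also be checked numerically. In the sketch below (our own code) the centre-of-mass condition is deliberately not imposed, so that the Σ p_j term is visible; the large root of λ + λ⁻¹ = 2t(µ) is compared with the two leading terms of (6.10) at a large value of µ.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
q = 0.3 * rng.normal(size=n + 1)             # modest q's keep the a_i close to 1
p = rng.normal(size=n + 1)                   # no centre-of-mass condition: sum(p) != 0
a = np.exp(q - np.roll(q, -1))

def lax(lam):
    L = np.diag(p).astype(complex)
    for i in range(n):
        L[i, i + 1] = L[i + 1, i] = a[i]
    L[0, n] += a[n] / lam
    L[n, 0] += a[n] * lam
    return L

mu = 40.0
# 2t(mu) = det(mu - L(lam)) + lam + 1/lam, independent of the lambda used
tt = np.linalg.det(mu * np.eye(n + 1) - lax(1.0)) + 2.0
lam_plus = tt / 2 + np.emath.sqrt((tt / 2) ** 2 - 1)    # branch through P+

# eq. (6.10): lam = mu^{n+1}(1 - mu^{-1} sum p_j + O(mu^{-2})) near P+
approx = mu ** (n + 1) * (1 - p.sum() / mu)
assert abs(lam_plus / approx - 1) < 0.05                # residual error is O(mu^-2)
print("P+ expansion of eq. (6.10) reproduced")
```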
6.3 The eigenvectors

Equation (6.7) is the condition for µ to be an eigenvalue of L(λ). Therefore, with each point P on Γ one can associate an (n+1)-dimensional eigenvector Ψ(P):

(L(λ) − µ)Ψ(P) = 0,    P = (λ, µ) ∈ Γ

Writing this equation explicitly for Ψ(P) = (ψ_1, ..., ψ_{n+1})ᵀ, we find the following system of linear equations:

p_1ψ_1 + a_1ψ_2 + λ⁻¹a_{n+1}ψ_{n+1} = µψ_1
a_{i−1}ψ_{i−1} + p_iψ_i + a_iψ_{i+1} = µψ_i,    i = 2, ..., n    (6.12)
λa_{n+1}ψ_1 + a_nψ_n + p_{n+1}ψ_{n+1} = µψ_{n+1}
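The content of this linear system can be checked directly: in the sketch below (our own code), an eigenvector of L(λ) is extended by the quasi-periodicity ψ_{i+n+1} = λψ_i and fed through the periodic three-term recursion into which the system collapses (this is the Bloch-wave form written below as eq. (6.13)).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
q = rng.normal(size=n + 1); q -= q.mean()
p = rng.normal(size=n + 1); p -= p.mean()
a = np.exp(q - np.roll(q, -1))          # a_i = exp(q_i - q_{i+1}), 0-indexed, cyclic

lam = 1.9
L = np.diag(p).astype(complex)
for i in range(n):
    L[i, i + 1] = L[i + 1, i] = a[i]
L[0, n] += a[n] / lam
L[n, 0] += a[n] * lam

mus, vecs = np.linalg.eig(L)
mu, psi = mus[0], vecs[:, 0]            # a point (lam, mu) of Gamma and its eigenvector

# extend psi over three periods using the Bloch condition psi_{j+n+1} = lam psi_j
psi_ext = np.concatenate([psi / lam, psi, psi * lam])
a_ext, p_ext = np.tile(a, 3), np.tile(p, 3)

# all of (6.12) is one periodic recursion a_{i-1}psi_{i-1} + p_i psi_i + a_i psi_{i+1} = mu psi_i
for i in range(n + 1, 2 * (n + 1)):
    lhs = a_ext[i - 1] * psi_ext[i - 1] + p_ext[i] * psi_ext[i] + a_ext[i] * psi_ext[i + 1]
    assert np.isclose(lhs, mu * psi_ext[i])
print("all components satisfy the Bloch recursion")
```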
We extend the definition of the coefficients a_i, p_i by periodicity, a_{i+n+1} = a_i, p_{i+n+1} = p_i, and introduce a second order difference operator D:

(DΨ)_i ≡ a_{i−1}ψ_{i−1} + p_iψ_i + a_iψ_{i+1}

This operator is a discrete version of a Schroedinger operator with periodic potential. Equations (6.12) are then equivalent to:

(DΨ)_i = µψ_i,    with ψ_{i+n+1} = λψ_i    (6.13)

Thus, Ψ is a Bloch wave for the difference operator D with Bloch momentum λ. We choose the normalization condition ψ_0(P) = 1 for all P ∈ Γ, or alternatively ψ_{n+1}(P) = λ. This is slightly different from the convention of Chapter 3, where we normalized ψ_1(P) = 1, but it will prove to be more convenient in the following. We need to know the analyticity properties of Ψ at infinity.

Proposition. Let us normalize ψ_0(P) = 1. Then, at the points P⁺ and P⁻ above λ = ∞ and λ = 0, the eigenvector Ψ(P) behaves as:

ψ_i(P) = e^{q_i − q_0} µ^i ( 1 − µ⁻¹ Σ_{j=0}^{i−1} p_j + O(µ⁻²) ),    P ∼ P⁺    (6.14)

ψ_i(P) = e^{−q_i + q_0} µ^{−i} ( 1 + µ⁻¹ Σ_{j=1}^{i} p_j + O(µ⁻²) ),    P ∼ P⁻    (6.15)

where the q_i are the Toda position parameters (q_0 = q_{n+1}, p_0 = p_{n+1}).

Proof. It is easy to check that this is consistent with eq. (6.12). The result then follows by the uniqueness of the eigenvector.

From the general theory, see eq. (5.17) in Chapter 5, we expect g + (n+1) − 1 = 2n poles for the eigenvector. From eq. (6.14), we see that we have a fixed pole of order n at P⁺. There remain g = n poles at finite distance. These are the dynamical poles. We shall denote by γ_1, ..., γ_n their positions. Note that their number equals the number of degrees of freedom. Recall from Chapter 5 that these dynamical poles contain all the relevant information to reconstruct the eigenvector.

We now consider the time evolution of the eigenvector. The Lax equation implies

(d/dt) Ψ(t, P) = (M(λ) − C(t, P)·1) Ψ(t, P)
with C(t, P) some scalar function, determined by imposing the condition ψ_0(t, P) = 1, or equivalently ψ_{n+1}(t, P) = λ, which yields C(t, P) = Σ_j M_{n+1,j} ψ_j. Alternatively, one can use the natural time evolution

(d/dt) Ψ(t, P) = M(λ) Ψ(t, P)    (6.16)

but then, for t ≠ 0, the eigenvector Ψ(t, P) is a Baker–Akhiezer function. It has n poles at finite distance, independent of time, and essential singularities at the points P^± given by:

Proposition. Let Ψ(t, P) be the eigenvector evolving according to the natural evolution eq. (6.16). Then

ψ_i(t, µ) = e^{q_i(t)} e^{−µt} µ^i (1 + O(µ⁻¹)),    P → P⁺    (6.17)
ψ_i(t, µ) = e^{−q_i(t)} e^{µt} µ^{−i} (1 + O(µ⁻¹)),    P → P⁻    (6.18)

Proof. The time evolution of the eigenvector, with (LΨ)_i = µψ_i, is:

(d/dt) ψ_i = (MΨ)_i = ((L + M_+)Ψ)_i = ((−L + M_−)Ψ)_i

Let us see what happens near P⁺. Since Ψ(t, P) is also a solution of (L(λ) − µ)Ψ(t, P) = 0, the results of eqs. (6.14, 6.15) still apply, with q_i(t) now depending on t. Hence we can write

ψ_i(t, P) = f(t, P) e^{q_i(t) − q_0(t)} µ^i (1 + O(µ⁻¹))

There is an extra multiplicative factor f(t, P), independent of i, because we relax the condition ψ_0 = 1 for t ≠ 0. Writing

ψ̇_{n+1} = ḟ λ = −((L − M_−)Ψ)_{n+1} = −µλf + p_{n+1}λf + 2a_nψ_n

we get ḟ = (−µ + p_{n+1} + O(µ⁻¹)) f, hence f = exp(−µt + q_0(t))(1 + O(µ⁻¹)), where we used p_{n+1} = q̇_0. The analysis near P⁻ is similar, using M = M_+ + L.

6.4 Reconstruction formula

Starting from a curve of the particular form eq. (6.8), we reconstruct the eigenvector Ψ(P) from its analyticity properties. From this the whole closed Toda chain model and its solution is obtained. Consider first what happens at time t = 0. Let the algebro-geometrical data be specified by the curve eq. (6.8), with a divisor of g = n points
on it, D = γ_1 + ··· + γ_g, and two punctures which are the points P^± at infinity. From the previous section, the divisor of the component ψ_i(P) of Ψ(P) is:

(ψ_i) = −D + i P⁻ − i P⁺    (6.19)

Applying the Riemann–Roch theorem with deg(D + i P⁺ − i P⁻) = g, we see that there exists a unique meromorphic function, up to a proportionality constant, having this divisor. This function is not constant for i > 0, since it vanishes at P⁻. We can fix the proportionality constant (up to a sign) by requiring that the coefficients of µ^{∓i} at P^± are inverse to each other. Denote these coefficients by e^{±(q_i − q_0)}. Note that this eliminates the residual gauge invariance by diagonal matrices which is present in the general theory. The function ψ_i(P) then has the form, for all i ≥ 0:

ψ_i(µ) = e^{q_i − q_0} µ^i ( 1 − µ⁻¹ξ_i⁺ + O(µ⁻²) ),    P → P⁺    (6.20)
ψ_i(µ) = e^{−q_i + q_0} µ^{−i} ( 1 − µ⁻¹ξ_i⁻ + O(µ⁻²) ),    P → P⁻    (6.21)

Here the ξ_i^± are just Taylor coefficients. Of course, since ψ_i(P) is a meromorphic function, it also possesses g extra zeroes. We now show that these properties imply that the functions ψ_i(P) constructed with these analyticity requirements are solutions of eq. (6.13):

Proposition. Let ψ_i(P) be the unique meromorphic function having simple poles at the g points γ_i, a pole of order i at P⁺ and a zero of order i at P⁻, normalized as in eqs. (6.20, 6.21). Then:

(i) The ψ_i(P) satisfy the Schroedinger equation (D − µ)Ψ = 0, eq. (6.13), with coefficients a_i, p_i given by:

a_i = exp(q_i − q_{i+1}),    p_i = ξ⁺_{i+1} − ξ⁺_i    (6.22)

(ii) The functions ψ_i(P) are quasi-periodic:

ψ_{n+1+i}(P) = λ(P) ψ_i(P)    (6.23)

Therefore, the ψ_i(P) are the components of the eigenvector of a Lax matrix L(λ) of the form eq. (6.4).

Proof. Let us consider the function ((D − µ)Ψ)_i(P). Since the coefficients of (D − µ) are regular outside P^±, it possesses g poles at the γ_i. Consider now the behaviour at P^±. We have:

((D − µ)Ψ)_i = µ^{i+1} ( a_i e^{q_{i+1}−q_0} − e^{q_i−q_0} ) − µ^i ( a_i e^{q_{i+1}−q_0} ξ⁺_{i+1} − e^{q_i−q_0}(p_i + ξ⁺_i) ) + O(µ^{i−1}),    P → P⁺

((D − µ)Ψ)_i = µ^{−i+1} ( a_{i−1} e^{q_0−q_{i−1}} − e^{q_0−q_i} ) + O(µ^{−i}),    P → P⁻
If we choose the coefficients a_i and p_i as in eqs. (6.22), the function ((D − µ)Ψ)_i has a pole of order (i−1) at P⁺ and a zero of order i at P⁻. Therefore its divisor is greater than −D′ ≡ −D + i P⁻ − (i−1) P⁺, with deg D′ = g − 1. Thus, by the Riemann–Roch theorem, (D − µ)Ψ = 0. Notice that the coefficients a_i and p_i are chosen precisely in order to decrease the degree of this divisor by one unit.

Let us now prove the periodicity relation ψ_{n+1+i}(P) = λψ_i(P). We apply the Riemann–Roch theorem to the functions on each side of this equation. The divisor of the function λ is (λ) = (n+1)P⁻ − (n+1)P⁺. The poles at finite distance of ψ_{n+1+i} and ψ_i are located at the same positions γ_1, ..., γ_g. So we have:

(ψ_{n+1+i}) ≥ −D + (n+1+i)P⁻ − (n+1+i)P⁺
(λψ_i) ≥ −D + i P⁻ − i P⁺ + (n+1)P⁻ − (n+1)P⁺

Notice that these divisors are both greater than the divisor −D + (n+1+i)P⁻ − (n+1+i)P⁺, which is associated, by the Riemann–Roch theorem, with a one-dimensional space of functions. The two functions are therefore proportional. The proportionality constant is determined by comparing the behaviour at P⁺ or P⁻, and is found to be equal to one.

Let us now consider what happens for t ≠ 0. We define, for all integers i ≥ 0, the function ψ_i(t, P) as the Baker–Akhiezer function with g poles at the same positions γ_1, ..., γ_n as above, and with essential singularities at P^± given by:

ψ_i(t, µ) = e^{q_i(t)} e^{−µt} µ^i ( 1 − µ⁻¹ξ_i⁺(t) + ··· ),    P → P⁺    (6.24)
ψ_i(t, µ) = e^{−q_i(t)} e^{µt} µ^{−i} ( 1 − µ⁻¹ξ_i⁻(t) + ··· ),    P → P⁻    (6.25)

with the normalization parameters q_i(t) obtained by requiring that the expansions at P⁺ and P⁻ start with inverse coefficients. The Taylor coefficients ξ_i^±(t) are then fixed. By the Riemann–Roch theorem this function exists and is unique. Moreover, we have ψ_{n+1+i}(t, P) = λψ_i(t, P) by the same method as above. Taking into account the expansions eqs. (6.10, 6.11) of λ and the expansions eqs. (6.24, 6.25) of ψ_i(t, P) near P^±, it follows that the Taylor coefficients ξ_i^±(t) are periodic, i.e. ξ^±_{i+n+1}(t) = ξ_i^±(t), when Σ_i p_i = 0 (centre of mass system).

Proposition. The Baker–Akhiezer functions defined above satisfy the eigenvalue equation and the evolution equation:

L(λ)Ψ(t, P) = µ Ψ(t, P)
(d/dt) Ψ(t, P) = M(λ) Ψ(t, P)
with M(λ) defined in eq. (6.3), where a_i(t) = exp(q_i(t) − q_{i+1}(t)).
Proof. The proof of the eigenvalue equation and of the quasi-periodicity property (6.23) is the same as for the initial eigenvector at t = 0. To prove the evolution equation, let us consider, for i = 1, ..., n, the expression:

E_i ≡ (d/dt) ψ_i + (a_iψ_{i+1} − a_{i−1}ψ_{i−1})

Since the poles of the ψ_i at finite distance are independent of time, and the a_i are constant on the Riemann surface, E_i has the same g poles at finite distance as ψ_i. Its behaviour at infinity is easily obtained as:

E_i = −e^{−µt} e^{q_i} µ^i ( ξ⁺_{i+1} − ξ⁺_i − q̇_i ) + O(µ^{i−1}),    P → P⁺    (6.26)
E_i = e^{µt} e^{−q_i} µ^{−i} ( ξ⁻_{i−1} − ξ⁻_i − q̇_i ) + O(µ^{−i+1}),    P → P⁻

By the Riemann–Roch theorem E_i is proportional to ψ_i, and we write E_i = d_i(t)ψ_i. Using the quasi-periodicity property eq. (6.23) we can restate this result as:

Ψ̇ = M Ψ + dΨ    (6.27)

where M is the matrix given in eq. (6.5) and d is the time-dependent diagonal matrix d = Diag(d_1, ..., d_{n+1}), which is constant on the Riemann surface. Differentiating the relation (L(λ) − µ)Ψ = 0 with respect to time we get, using eq. (6.27), (L̇ − [M, L] − [d, L])Ψ = 0. For any value of λ which is not a branch point of the covering (λ, µ) → λ, we have n+1 independent vectors Ψ(λ, µ_k) (see Chapter 5) for which this equation is true; hence we get the matrix equation L̇(λ) − [M(λ), L(λ)] = [d, L(λ)], which remains true for all values of λ by analytic continuation. In particular it is true when λ → 1/λ. Note that ᵗL(λ) = L(1/λ) and ᵗM(λ) = −M(1/λ), so taking the transpose of the above equation evaluated at 1/λ and comparing it with the original equation, we get [d, L(λ)] = 0. This implies that d is proportional to the identity matrix, d = δ(t)I. Then, comparing eq. (6.26) with E_i = δ(t)ψ_i, we find q̇_i = ξ⁺_{i+1} − ξ⁺_i + δ(t). In the centre of mass system, Σ_{i=0}^{n} q_i = 0, so that δ(t) = 0 by the periodicity of ξ_i⁺.

At this point we have completely reconstructed the closed Toda chain, starting from an appropriate spectral curve. Moreover, the procedure also provides the solution of the equations of motion. To get explicit expressions for q_i(t) we need explicit formulae for the Baker–Akhiezer functions. This is done using Riemann's theta-functions.

Let us fix a canonical set of cycles (a_i, b_j) on Γ, a base point Q_0 on Γ, and a set of holomorphic Abelian differentials ω_j, dual to the a-cycles
(see Chapter 15). Let Ω^{(i)} be the meromorphic differentials, analytic on Γ outside the points P^±, obeying the following normalization conditions:

∮_{a_k} Ω^{(i)} = 0,    Ω^{(i)}(P) = ±( µ^{i−1} + O(µ⁻²) ) dµ,    P → P^±    (6.28)

The notation assumes that some multivalued primitives ∫_{Q_0}^{P} Ω^{(i)} have been chosen for these differentials. Since the local parameter around P^± is 1/µ, Ω^{(i)} has poles of order (i+1) at the points P^±. Note that the Ω^{(i)}, i ≥ 1, are Abelian differentials of the second kind, whereas Ω^{(0)} is a normalized Abelian differential of the third kind. Let us also define, for each i, the g-dimensional vector U^{(i)} whose components are the b-periods of the form Ω^{(i)}:

U_k^{(i)} = (1/2πi) ∮_{b_k} Ω^{(i)}    (6.29)

Given these data, the Baker–Akhiezer functions defined in eqs. (6.24, 6.25) are expressed as follows:

Proposition. Let A(P) be the Abel map, A_k(P) = ∫_{Q_0}^{P} ω_k. Then the Baker–Akhiezer function ψ_i(t, P) has the following expression:

ψ_i(P) = r_i(t) exp( i ∫_{Q_0}^{P} Ω^{(0)} − t ∫_{Q_0}^{P} Ω^{(1)} ) · θ(A(P) + i U^{(0)} − tU^{(1)} − ζ_0) / θ(A(P) − ζ_0)    (6.30)

where r_i(t) is independent of P on Γ, and

ζ_0 = Σ_{i=1}^{g} A(γ_i) + K    (6.31)

with K the vector of Riemann constants.

Proof. First, one checks that this function is well-defined on Γ, i.e. it does not depend on the path of integration between Q_0 and P. This is done using the formulae of Chapter 15 on theta functions. Then one checks that it has the right poles at finite distance. They are given by the zeroes of the theta-function θ(A(P) − ζ_0); by Riemann's theorem, these are located at the points γ_1, ..., γ_g because we chose the vector ζ_0 according to eq. (6.31). Finally, we check that it has the right behaviour in the neighbourhood of the points at infinity P^±. The theta functions are regular at infinity. Therefore, the behaviour at infinity is governed
by the differentials Ω^{(0)} and Ω^{(1)}. From eq. (6.28), we deduce that when P → P^±:

∫_{Q_0}^{P} Ω^{(0)} = ± log µ + O(1)
∫_{Q_0}^{P} Ω^{(i)} = ± µ^i/i + O(1),    i ≥ 1

These expressions hold modulo periods but, as we have seen, this does not affect the global expression. Therefore, when P → P^±:

exp( i ∫_{Q_0}^{P} Ω^{(0)} − t ∫_{Q_0}^{P} Ω^{(1)} ) = const. µ^{±i} e^{∓µt} ( 1 + O(µ⁻¹) )

This proves the result.

In eq. (6.30), the coefficient r_i(t) is fixed, up to a sign, by the requirement that the leading terms in the expansions of ψ_i(P) at P^± are inverse to each other. In the following we only need to consider ratios of the values of ψ_i(P) in the vicinity of the points P⁺ and P⁻, in which r_i(t) cancels out. So we do not give the value of r_i(t).

Proposition. Let τ_i(t) be the n tau-functions defined by:

τ_i(t) = e^{(1/2)β_0 i²} θ( i U^{(0)} − tU^{(1)} − ζ′_0 )    (6.32)

where β_0 and ζ′_0 are constants. Then the solution of the equations of motion of the closed Toda chain is given by:

e^{2(q_i(t) − q_{i+1}(t))} = τ_{i+1}(t) τ_{i−1}(t) / τ_i²(t)    (6.33)
Proof. From eqs. (6.24, 6.25), we have:

ψ_i(P → P⁺) / ψ_i(P → P⁻) = e^{2q_i} µ^{2i} e^{−2µt} (1 + O(µ⁻¹)),    µ → ∞

hence, substituting the expression (6.30) of ψ_i, we obtain an expression of the form:

e^{2q_i} = e^{β_0 i + β_1 t} × [ θ(A(P⁺) + i U^{(0)} − tU^{(1)} − ζ_0) θ(A(P⁻) − ζ_0) ] / [ θ(A(P⁺) − ζ_0) θ(A(P⁻) + i U^{(0)} − tU^{(1)} − ζ_0) ]
The exponential prefactor comes from lim_{P→P^±} ( ∫_{Q_0}^{P} Ω^{(0)} ∓ log µ ) = β_0^± and lim_{P→P^±} ( ∫_{Q_0}^{P} Ω^{(1)} ∓ µ ) = β_1^±. Then β_k = β_k⁺ − β_k⁻, and the singular part cancels with µ^{2i} e^{−2µt}. Taking the quotient of these expressions for i and i+1 gives:

e^{2(q_i − q_{i+1})} = e^{−β_0} × [ θ(i U^{(0)} − tU^{(1)} − ζ_0 + A(P⁺)) θ((i+1)U^{(0)} − tU^{(1)} − ζ_0 + A(P⁻)) ] / [ θ(i U^{(0)} − tU^{(1)} − ζ_0 + A(P⁻)) θ((i+1)U^{(0)} − tU^{(1)} − ζ_0 + A(P⁺)) ]

To end the proof of eq. (6.33), we show that:

A(P⁻) − A(P⁺) = U^{(0)}

This is a direct consequence of Riemann's bilinear relations which, for a normalized Abelian differential of the third kind like Ω^{(0)}, with residue −1 at P⁺ and +1 at P⁻, imply:

U_k^{(0)} ≡ (1/2iπ) ∮_{b_k} Ω^{(0)} = ∫_{P⁺}^{P⁻} ω_k = A_k(P⁻) − A_k(P⁺)

Then the result follows by defining ζ′_0 = ζ_0 − A(P⁺) − U^{(0)} and inserting the definition of the tau-function.

The explicit formula of the Baker–Akhiezer function in terms of theta-functions also shows that the Abel map linearizes the dynamics. Indeed, consider the component ψ_0(t, P). It has poles at the fixed positions γ_1, ..., γ_n, and zeroes at n other points γ_1(t), ..., γ_n(t). At t = 0, γ_i(t = 0) = γ_i, since ψ_0(t = 0) is equal to one. The points γ_i(t) are the zeroes of the theta-function in the numerator of eq. (6.30) taken for i = 0. Thus, by Riemann's theorem,

Σ_{i=1}^{n} ( A(γ_i(t)) − A(γ_i) ) = tU^{(1)}    (6.34)

This flow is linear.

Remark. The formula (6.30) can be generalized by considering all the flows associated with the other conserved Hamiltonians. If we denote by t_p the time associated with these Hamiltonians, the generalized tau-functions are obtained by replacing tU^{(1)} → Σ_p t_p U^{(p)}.
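The theta-functions entering eqs. (6.30)–(6.33) are the g-dimensional series θ(z|B) = Σ_{m∈Z^g} exp(iπ mᵀBm + 2πi mᵀz). A truncated evaluator is easy to write; the sketch below (our own code, with an illustrative period matrix that is *not* computed from a Toda spectral curve) checks the two quasi-periodicity properties that make formulas like (6.30) well-defined on Γ.

```python
import itertools
import numpy as np

def theta(z, B, cutoff=6):
    """Riemann theta series, truncated to |m_k| <= cutoff.

    theta(z|B) = sum_{m in Z^g} exp(i*pi* m.B.m + 2i*pi* m.z);
    Im(B) must be positive definite for convergence.
    """
    z = np.atleast_1d(z)
    g = len(z)
    total = 0.0 + 0.0j
    for m in itertools.product(range(-cutoff, cutoff + 1), repeat=g):
        m = np.array(m)
        total += np.exp(1j * np.pi * (m @ B @ m) + 2j * np.pi * (m @ z))
    return total

# an illustrative g = 2 period matrix and argument (hypothetical values)
B = np.array([[1.0j, 0.3], [0.3, 1.5j]])
z = np.array([0.1 + 0.05j, -0.2])

# quasi-periodicity: theta(z + e_k) = theta(z), and
# theta(z + B e_k) = exp(-i*pi*B_kk - 2i*pi*z_k) * theta(z)
e0 = np.array([1.0, 0.0])
assert np.isclose(theta(z + e0, B), theta(z, B))
assert np.isclose(theta(z + B[:, 0], B),
                  np.exp(-1j * np.pi * B[0, 0] - 2j * np.pi * z[0]) * theta(z, B))
print("theta quasi-periodicity verified")
```

For an actual Toda solution one would have to supply the period matrix, the vectors U^{(0)}, U^{(1)} and ζ′_0 computed from the curve (6.8), which is beyond this sketch.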
6.5 Symplectic structure

We now want to prove that the coordinates (λ_{γ_i}, µ_{γ_i}) of the points of the dynamical divisor form a set of separated canonical coordinates. Recall that the Poisson bracket is the standard canonical one:

{q_i, q_j} = {p_i, p_j} = 0,    {p_i, q_j} = (1/2) δ_{ij}    (6.35)

Proposition. Let Γ be the spectral curve eq. (6.7), and let (λ_{γ_i}, µ_{γ_i}), i = 1, ..., g, be the g points of the dynamical divisor D. Then

ω = 2 Σ_i δq_i ∧ δp_i = Σ_i ( δλ_{γ_i} / λ_{γ_i} ) ∧ δµ_{γ_i}    (6.36)
Proof. According to the general procedure explained in Chapter 5, we start from the 2-form K on phase space, with values in the 1-forms on Γ, defined by:

K = K_1 + K_2 + K_3    (6.37)

K_1 = ⟨ Ψ^{(−1)}(P) δL(λ) ∧ δΨ(P) ⟩ dλ/λ
K_2 = ⟨ Ψ^{(−1)}(P) δµ ∧ δΨ(P) ⟩ dλ/λ
K_3 = δ( log ∂_µ Γ(λ, µ) ) ∧ δµ dλ/λ

Here Γ(λ, µ) = 0 is the equation of the spectral curve Γ. We have included the form K_3 in the definition of K, although its role is auxiliary. More importantly, notice the factor 1/λ as compared to the analysis of Chapter 5. This is necessary to get the right Poisson bracket on the variables q_i, p_i. We write that the sum of the residues of the form K, seen as a 1-form on Γ, vanishes. The poles of K are located at three different places: first the dynamical poles of Ψ, then the poles at P^± coming from Ψ, L and dλ/λ, and finally the poles at the branch points of the covering, coming from the poles of Ψ^{(−1)} and from the fact that the δ differential is taken at fixed λ. The evaluation of the residues at the dynamical poles and the branch points is exactly as in Chapter 5, yielding the sum 2 Σ_k δµ_{γ_k} ∧ δλ_{γ_k}/λ_{γ_k}. So we are left with the computation of the residues at P^±. The residue at the point P⁺, for instance, is obtained by integrating over a small contour enclosing it. But the Riemann surface, seen as a branched cover of the λ-sphere, has a branch point of order n+1 at P⁺.
So, a closed contour around P⁺ runs over the n+1 sheets of the covering before returning to its starting point. Therefore, we can write:

Res_{P⁺} K = (1/2iπ) ∮ K(P) = (1/2iπ) ∮ Σ_i K(P_i)

where, in the last expression, the points P_i are the n+1 points of the contour over the point on the base with coordinate λ. This sum is independent of the order of the sheets and is therefore a 1-form on the base; the integral is also taken on the base. The sum Σ_i K_1(P_i) can be written as a trace, since the vectors Ψ(P_i), with P_i the (n+1) points over λ, form a basis of eigenvectors of L(λ). As in Chapter 5, it is convenient to introduce the (n+1) × (n+1) matrix Ψ̂(λ) whose columns are the (n+1) vectors Ψ(P_i). We thus have:

Res_{P⁺}(K_1) = −(1/2iπ) ∮ Tr( δΨ̂(λ) Ψ̂^{−1}(λ) ∧ δL(λ) ) dλ/λ

The matrix δΨ̂(λ) Ψ̂^{−1}(λ) does not depend on the order of the sheets and is therefore a meromorphic function of λ, which we now calculate.

Lemma.

( δΨ̂ Ψ̂^{−1}(λ) )_{ij} = −(δq_i − δq_0) δ_{ij} + δη_i ( (1/a_i) δ_{i,j−1} + (λ/a_{n+1}) δ_{i,n+1} δ_{j,1} ) + O(λ²),    P → P⁻    (6.38)

( δΨ̂ Ψ̂^{−1}(λ) )_{ij} = (δq_i − δq_0) δ_{ij} − δξ_i ( (1/a_{i−1}) δ_{i,j+1} + (1/(λa_{n+1})) δ_{i,1} δ_{j,n+1} ) + O(λ⁻²),    P → P⁺    (6.39)

where η_i = Σ_{j=1}^{i} p_j and ξ_i = Σ_{j=0}^{i−1} p_j.

Proof. Let us first consider the behaviour near P⁻. From eq. (6.15), we have:

Ψ̂_{ij} = ψ_i(P_j) = e^{−q_i + q_0} µ_j^{−i} ( 1 + η_i µ_j^{−1} + O(µ_j^{−2}) )

( Ψ̂^{−1} )_{ij} = (1/(n+1)) e^{q_j − q_0} µ_i^{j} ( 1 + O(µ_i^{−1}) )

where µ_j^{−1} is the local coordinate of the point P_j above λ, i.e. µ_j = λ^{−1/(n+1)} α_j + ···, with α_j an (n+1)-th root of unity.
Taking variations, and limiting ourselves to O(µ⁻²), we can set δµ_j = 0 (using eq. (6.11), where Σ_j p_j = 0), and obtain:

δΨ̂_{ij} = −(δq_i − δq_0) Ψ̂_{ij} + δη_i Ψ̂_{ij} µ_j^{−1} + O(µ_j^{−2})

Hence:

( δΨ̂ Ψ̂^{−1}(λ) )_{ij} = −(δq_i − δq_0) δ_{ij} + δη_i Σ_k Ψ̂_{ik} µ_k^{−1} ( Ψ̂^{−1} )_{kj}

Note that this shows that Ψ̂^{−1} needs to be computed to leading order only. The last term is equal to:

Σ_k Ψ̂_{ik} µ_k^{−1} ( Ψ̂^{−1} )_{kj} = ( e^{q_j − q_i}/(n+1) ) Σ_k µ_k^{−i+j−1} = ( e^{q_j − q_i}/(n+1) ) λ^{(i−j+1)/(n+1)} Σ_k α_k^{−i+j−1}

The last sum is over the roots of unity. It vanishes unless i − j + 1 ≡ 0 mod (n+1), which is possible only if i = j − 1 (i = 1, ..., n), or i = n+1, j = 1. Evaluating these two types of terms yields eq. (6.38). The analysis at P⁺ is similar.

We now finish the proof of the proposition by analysing the residues of K_1 at P^±. It is clear that only the terms written in eqs. (6.38, 6.39) contribute to the residue. Indeed, if one keeps terms of order µ⁻², the same computation as above yields contributions to δΨ̂ Ψ̂^{−1} which are non-vanishing only if i − j ≡ ±2 mod (n+1), and these cannot contribute to the trace with δL. The contributions to the residues at P^± of the first terms in eqs. (6.38, 6.39) add up to 2 Σ_i δp_i ∧ δq_i (remember that dλ/λ has residue ∓1 at P^±). Similarly, the contributions of the second terms at P^± are also equal and add up to 2 Σ_i δp_i ∧ δq_i. Finally, since δµ has a simple zero at P^±, the forms K_2 and K_3 are regular at these points.

The quantities (log λ_{γ_k}, µ_{γ_k}) are therefore canonical coordinates. Since the points (λ_{γ_k}, µ_{γ_k}) of the dynamical divisor lie on the spectral curve, we have the n equations Γ(λ_{γ_k}, µ_{γ_k}) = 0, where Γ(λ, µ) = 0 is the equation of the spectral curve:

λ_{γ_k} + λ_{γ_k}^{−1} = 2t(µ_{γ_k}),    for k = 1, ..., n
As explained in Chapter 5, this implies that the variables (λ_{γ_k}, µ_{γ_k}) are separated and, furthermore, allows us to construct the action–angle variables.

6.6 The Sklyanin approach

We now introduce an equivalent description of the Toda chain. This approach can be viewed as a lattice version of the constructions of integrable
field theories of Chapter 3. It is based on the use of a 2 × 2 transfer matrix whose Poisson brackets are quadratic, as in eq. (3.91) in Chapter 3. This provides a very simple way to find the separated canonical variables and the corresponding separated Hamilton–Jacobi equations. The linearization of the flow on the Jacobian of the spectral curve is also obtained in this approach.

We first introduce 2 × 2 matrices by replacing the linear second order system eq. (6.13) by the linear 2 × 2 first order system:

( ψ_i , ψ_{i+1} )ᵀ = T_i(µ) ( ψ_{i−1} , ψ_i )ᵀ    (6.40)

where the T_i are given by:

T_i(µ) = | 0              1              |    (6.41)
         | −a_{i−1}/a_i   −(p_i − µ)/a_i |

The solution of eq. (6.40), with ψ_0 = λ⁻¹ψ_{n+1}, is:

( ψ_i , ψ_{i+1} )ᵀ = T_i(µ) T_{i−1}(µ) ··· T_1(µ) ( ψ_0 , ψ_1 )ᵀ

Each matrix T_i(µ) may be viewed as an elementary transport matrix on a small segment at position i, and the product of such matrices is a transport matrix from site 1 to site i. The periodicity condition, ψ_{n+1+i} = λψ_i, translates into the eigenvalue problem:

T(µ) ( ψ_0 , ψ_1 )ᵀ = λ ( ψ_0 , ψ_1 )ᵀ

where T(µ) is the monodromy matrix defined by:

T(µ) = T_{n+1}(µ) T_n(µ) ··· T_1(µ) ≡ | A(µ)  B(µ) |    (6.42)
                                      | C(µ)  D(µ) |

Here A(µ), B(µ), C(µ), D(µ) are polynomials in µ of degrees deg A = n−1, deg B = deg C = n, and deg D = n+1. Since det T_i(µ) = a_{i−1}/a_i, one has det T(µ) = 1, and the characteristic equation det(T(µ) − λ) = 0 reads:

λ² − 2t(µ)λ + 1 = 0,    with 2t(µ) = Tr T(µ) = A(µ) + D(µ)    (6.43)
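The equivalence between the 2 × 2 and (n+1) × (n+1) formulations is easy to test. The sketch below (our own code, following eqs. (6.41)–(6.43)) builds the monodromy matrix, checks det T(µ) = 1, and verifies that a root λ of λ² − 2t(µ)λ + 1 = 0 with 2t(µ) = Tr T(µ) makes det(L(λ) − µ) vanish.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
q = rng.normal(size=n + 1); q -= q.mean()
p = rng.normal(size=n + 1); p -= p.mean()
a = np.exp(q - np.roll(q, -1))           # a_i = exp(q_i - q_{i+1}); a[-1] is cyclic

def T_site(i, mu):
    """Elementary transport matrix, eq. (6.41), 0-indexed and cyclic."""
    return np.array([[0.0, 1.0],
                     [-a[i - 1] / a[i], (mu - p[i]) / a[i]]])

def monodromy(mu):
    T = np.eye(2)
    for i in range(n + 1):               # T(mu) = T_{n+1} ... T_1
        T = T_site(i, mu) @ T
    return T

mu = 0.6
T = monodromy(mu)
assert np.isclose(np.linalg.det(T), 1.0)          # det T_i = a_{i-1}/a_i telescopes to 1

# eq. (6.43): lam^2 - 2t(mu) lam + 1 = 0 with 2t = Tr T reproduces det(L(lam) - mu) = 0
two_t = np.trace(T)
lam = two_t / 2 + np.emath.sqrt((two_t / 2) ** 2 - 1)
L = np.diag(p).astype(complex)
for i in range(n):
    L[i, i + 1] = L[i + 1, i] = a[i]
L[0, n] += a[n] / lam
L[n, 0] += a[n] * lam
assert np.isclose(np.linalg.det(L - mu * np.eye(n + 1)), 0.0, atol=1e-8)
print("monodromy and Lax spectral curves agree")
```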
Clearly t(µ) is a polynomial in µ of degree (n+1): t(µ) = (1/2)µ^{n+1} + ···. Moreover, eq. (6.43) also expresses the existence of an eigenvector of L(λ), hence is equivalent to eq. (6.7). By using T(µ) instead of L(λ) we have exchanged the roles played by µ and λ. The spectral curve Γ is now presented as a two-sheeted covering of the µ-plane. Above each point µ there are two points, corresponding to the two roots λ^± = t(µ) ± √(t²(µ) − 1) of eq. (6.43), such that λ⁺λ⁻ = 1. In particular, above µ = ∞ we have the two points P⁺ (λ = ∞) and P⁻ (λ = 0). In the following, we shall choose the normalization condition ψ_0(P) = 1.

It is useful to introduce a standard basis of solutions of the linear system eq. (6.40), which we denote by χ_i^{(0)}, χ_i^{(1)}, specified by the boundary conditions:

( χ_0^{(0)}, χ_1^{(0)} )ᵀ = ( 1, 0 )ᵀ,    ( χ_0^{(1)}, χ_1^{(1)} )ᵀ = ( 0, 1 )ᵀ    (6.44)

These solutions are polynomials in µ, with deg χ_i^{(0)} = i−2 and deg χ_i^{(1)} = i−1. We can expand any other solution on this basis. In particular:

ψ_i = χ_i^{(0)} + ψ_1 χ_i^{(1)}

The coefficient ψ_1 is determined by the periodicity condition ψ_{n+1+i} = λψ_i. This is equivalent to the eigenvalue equation T(µ)(1, ψ_1)ᵀ = λ(1, ψ_1)ᵀ and gives:

ψ_i^± = χ_i^{(0)} + ( (λ^± − A(µ))/B(µ) ) χ_i^{(1)}    (6.45)

We see that the two functions ψ_i^±, corresponding to the two Bloch waves, are in fact the values of a unique meromorphic function ψ_i(P) evaluated at the two points (µ, λ^±) above µ, on Γ.

The poles at finite distance of ψ_i are thus the same as those of ψ_1, and are located above the n zeroes of B(µ). When B(µ_{γ_k}) = 0, the two eigenvalues of T(µ_{γ_k}) are A(µ_{γ_k}) and D(µ_{γ_k}), so that the numerator λ − A(µ) vanishes at one of the two points above µ_{γ_k}. Therefore the function ψ_i(P) has only one pole, at (µ_{γ_k}, λ_{γ_k} = D(µ_{γ_k})). Hence the dynamical divisor is exactly the same as in the (n+1) × (n+1) matrix approach. As we already know, the coordinates (µ_{γ_k}, log λ_{γ_k}) of these points form a set of conjugate canonical variables. We present an alternative way of deriving this result in the following section.
6.7 The Poisson brackets
We first establish an explicit formula for the Poisson brackets of the matrix elements of T(µ). It is actually more convenient to perform a gauge transformation before computing these Poisson brackets. Let

T_i(µ) → T′_i(µ) = D_i T_i(µ) D_{i−1}^{−1}

with D_i periodic, D_{n+1+i} = D_i. Since T(µ) = T_{n+1}(µ) ··· T_1(µ) is the product of the matrices T_i(µ), it gets conjugated by D_0: T(µ) → T′(µ) = D_0 T(µ) D_0^{−1}. In particular, the spectral curve det(T(µ) − λ) = 0 is preserved by such a gauge transformation. We choose the matrices D_i so that the matrices T′_i(µ) are local, i.e. T′_i(µ) only depends on the canonical variables q_i and p_i. We take D_i = Diag(d_i, d_{i+1}^{−1}) with d_i = exp(q_i); notice that a_i = d_i/d_{i+1}. The explicit expression for T′_i(µ) is:

T′_i(µ) = | 0          e^{2q_i} |    (6.46)
          | −e^{−2q_i}  µ − p_i  |

Note that det T′_i(µ) = 1 for all i. The gauge-transformed monodromy matrix is T′(µ) = T′_{n+1}(µ) ··· T′_1(µ); from now on we drop the primes and write simply T_i(µ), T(µ) for the gauge-transformed matrices.

Proposition. The Poisson brackets of the matrix elements of T(µ) are given by:

{T_1(µ), T_2(µ′)} = [ r_{12}(µ − µ′), T_1(µ) T_2(µ′) ]    (6.47)

where the r-matrix is given by

r_{12}(µ − µ′) = C_{12}/(µ − µ′),    C_{12} = Σ_{ij} E_{ij} ⊗ E_{ji}

Proof. We first prove this relation for each individual T_i(µ). Specifically,

{T_{1,i}(µ), T_{2,i}(µ′)} = [ r_{12}(µ − µ′), T_{1,i}(µ) T_{2,i}(µ′) ]    (6.48)

This is shown by a direct computation using the explicit formula (6.46) for T_i(µ). We then prove that if two Poisson-commuting matrices T_i(µ) and T_j(µ′) satisfy (6.48), then so does their product. Indeed, using the fact that the Poisson bracket and the Lie bracket are both derivations,

{T_{1,j}(µ)T_{1,i}(µ), T_{2,j}(µ′)T_{2,i}(µ′)} = T_{1,j}(µ)T_{2,j}(µ′){T_{1,i}(µ), T_{2,i}(µ′)} + {T_{1,j}(µ), T_{2,j}(µ′)}T_{1,i}(µ)T_{2,i}(µ′)
= [ r_{12}(µ − µ′), T_{1,j}(µ)T_{1,i}(µ)T_{2,j}(µ′)T_{2,i}(µ′) ]
The claim then follows by induction, because {T_{1,i}(µ), T_{2,j}(µ′)} = 0 for i ≠ j by the locality of T_i(µ). Note that this also implies the integrability of the Toda chain.

Let A(µ), B(µ), C(µ) and D(µ) be the matrix elements of T(µ):

T(µ) = | A(µ)  B(µ) |
       | C(µ)  D(µ) |

In terms of the matrix elements of the monodromy matrix introduced in the previous section, the diagonal elements A(µ) and D(µ) are unchanged by the gauge transformation, while B(µ) is multiplied by (d_0 d_1) and C(µ) is divided by (d_0 d_1). In particular one finds (recalling that, before the gauge transformation, B(µ) = (d_0/d_1)µⁿ + ···):

B(µ) = d_0² ( µⁿ − µ^{n−1} Σ_{j=1}^{n} p_j + ··· )

This is a polynomial of degree n with n zeroes, which we denote by µ_{γ_k}, k = 1, ..., n:

B(µ) = d_0² Π_{k=1}^{n} ( µ − µ_{γ_k} )
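The site-wise relation (6.48) can be verified symbolically. The sketch below (our own code, using SymPy) computes both sides of {T_1(µ), T_2(ν)} = [C_{12}/(µ−ν), T_1(µ)T_2(ν)] for the local matrix (6.46), with the canonical bracket normalized as {p, q} = 1/2, eq. (6.35).

```python
import sympy as sp

qs, ps, mu, nu = sp.symbols('q p mu nu')

def T(m):
    # gauge-transformed local matrix, eq. (6.46)
    return sp.Matrix([[0, sp.exp(2 * qs)],
                      [-sp.exp(-2 * qs), m - ps]])

def pb(f, g):
    # canonical bracket with {p, q} = 1/2, as in eq. (6.35)
    return sp.Rational(1, 2) * (sp.diff(f, ps) * sp.diff(g, qs)
                                - sp.diff(f, qs) * sp.diff(g, ps))

A, B = T(mu), T(nu)
lhs = sp.zeros(4, 4)       # {T1(mu), T2(nu)} in the tensor basis (a,c),(b,d)
T1T2 = sp.zeros(4, 4)      # T1(mu) T2(nu) = T(mu) (x) T(nu)
for a in range(2):
    for b in range(2):
        for c in range(2):
            for d in range(2):
                lhs[2 * a + c, 2 * b + d] = pb(A[a, b], B[c, d])
                T1T2[2 * a + c, 2 * b + d] = A[a, b] * B[c, d]

P = sp.Matrix([[1, 0, 0, 0], [0, 0, 1, 0],   # C12 = sum E_ij (x) E_ji = permutation
               [0, 1, 0, 0], [0, 0, 0, 1]])
r = P / (mu - nu)
rhs = r * T1T2 - T1T2 * r
assert sp.simplify(lhs - rhs) == sp.zeros(4, 4)
print("quadratic r-matrix bracket (6.48) verified at one site")
```

Since the bracket and the commutator are both derivations, the same identity then propagates to the full monodromy matrix, as in the proof above.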
As explained in the previous section, these zeroes are the µ-coordinates of the poles of the Baker functions ψ_i.

Proposition. Let µ_{γ_k} be the zeroes of B(µ), and λ_{γ_k} the values of D(µ) at µ = µ_{γ_k}:

B(µ_{γ_k}) = 0    and    λ_{γ_k} = D(µ_{γ_k})    (6.49)

The n points (λ_{γ_k}, µ_{γ_k}) are the points of the dynamical divisor. These parameters obey the following Poisson brackets:

{µ_{γ_k}, µ_{γ_k′}} = {λ_{γ_k}, λ_{γ_k′}} = 0,    {λ_{γ_k}, µ_{γ_k′}} = λ_{γ_k} δ_{kk′}    (6.50)

so that (log λ_{γ_k}, µ_{γ_k}) form a set of canonical coordinates.

Proof. Recall that the matrix T(µ) is lower triangular at µ = µ_{γ_k}, since B(µ_{γ_k}) = 0. Therefore, at µ = µ_{γ_k} we have 1 = det T(µ_{γ_k}) = A(µ_{γ_k}) D(µ_{γ_k}) and 2t(µ_{γ_k}) = A(µ_{γ_k}) + D(µ_{γ_k}), hence the points (λ_{γ_k}, µ_{γ_k}) are on the spectral curve:

λ_{γ_k} + λ_{γ_k}^{−1} = 2t(µ_{γ_k})    (6.51)
The Poisson brackets (6.47) for T(µ) imply the following:

{A(µ), A(µ′)} = {B(µ), B(µ′)} = {C(µ), C(µ′)} = {D(µ), D(µ′)} = 0

{A(µ), B(µ′)} = ( A(µ)B(µ′) − B(µ)A(µ′) ) / (µ − µ′)    (6.52)
{A(µ), D(µ′)} = ( C(µ)B(µ′) − B(µ)C(µ′) ) / (µ − µ′)    (6.53)
{B(µ), D(µ′)} = ( D(µ)B(µ′) − B(µ)D(µ′) ) / (µ − µ′)    (6.54)

The first equation directly implies that the µ_{γ_k} Poisson commute. Let P(µ) be a polynomial in µ whose coefficients are functions on phase space, and let F be an arbitrary function on phase space. Then the Poisson bracket between F and the value of P(µ) at µ_{γ_k} is:

{F, P(µ_{γ_k})} = {F, P(µ)}|_{µ=µ_{γ_k}} + {F, µ_{γ_k}} ∂_µP(µ_{γ_k})    (6.55)

We apply this to F = D(µ) and P(µ) = B(µ). Since B(µ_{γ_k}) = 0, we get:

0 = {D(µ), B(µ_{γ_k})} = {D(µ), B(µ′)}|_{µ′=µ_{γ_k}} + {D(µ), µ_{γ_k}} B′(µ_{γ_k})

where B′(µ) = ∂_µB(µ). We evaluate the first term by using eq. (6.54), obtaining

{D(µ), µ_{γ_k}} = ( λ_{γ_k} / B′(µ_{γ_k}) ) · B(µ)/(µ − µ_{γ_k})

Next we apply the same formula to F = µ_{γ_k′} and P = D, evaluated at µ = µ_{γ_k}. Since {µ_{γ_k′}, µ_{γ_k}} = 0, one gets

{µ_{γ_k′}, λ_{γ_k}} = {µ_{γ_k′}, D(µ_{γ_k})} = {µ_{γ_k′}, D(µ)}|_{µ=µ_{γ_k}}

So we have to evaluate B(µ)/(µ − µ_{γ_k′}) at µ = µ_{γ_k}, which gives δ_{kk′} B′(µ_{γ_k}). The equation {λ_{γ_k}, µ_{γ_k′}} = δ_{kk′} λ_{γ_k} follows.

Finally, we need to prove {λ_{γ_k}, λ_{γ_k′}} = 0. We compute {D(µ_{γ_k}), D(µ_{γ_k′})} by expanding this expression completely using eq. (6.55), {D(µ), D(µ′)} = 0 and {µ_{γ_k}, µ_{γ_k′}} = 0. We get:

{λ_{γ_k}, λ_{γ_k′}} = ∂_µD(µ_{γ_k′}) {D(µ), µ_{γ_k′}}|_{µ=µ_{γ_k}} − ∂_µD(µ_{γ_k}) {D(µ′), µ_{γ_k}}|_{µ′=µ_{γ_k′}}

We have already seen that each of these two terms vanishes when k ≠ k′, and they obviously cancel each other when k = k′.
In particular, the symplectic two-form ω can be written as:
$$ \omega = \sum_{k=1}^n \frac{\delta\lambda_{\gamma_k}}{\lambda_{\gamma_k}} \wedge \delta\mu_{\gamma_k} $$
where δ denotes the differential on phase space.

The fact that the equations of motion linearize on the Jacobian variety of the spectral curve may also be derived in this framework. The generating function for the conserved Hamiltonians is the trace of the monodromy matrix, Tr T(u) = 2t(u) = A(u) + D(u). Thus the equations of motion of any function F on phase space for this generic Hamiltonian t(u) are:
$$ \dot F \equiv \{t(u), F\} $$
Proposition. The equations of motion, relative to the Hamiltonian t(u), of the zeroes $\mu_{\gamma_k}$ of B(µ) are:
$$ \dot\mu_{\gamma_k} = \{t(u), \mu_{\gamma_k}\} = \frac{1}{2}\big(A(\mu_{\gamma_k}) - D(\mu_{\gamma_k})\big)\,\frac{B(u)}{(u - \mu_{\gamma_k})\,B'(\mu_{\gamma_k})} \qquad (6.56) $$
Their linearization, under the Abel map A, is given by the following relations:
$$ \dot A_j = \sum_{k=1}^n \frac{\mu_{\gamma_k}^j\,\dot\mu_{\gamma_k}}{\sqrt{t^2(\mu_{\gamma_k}) - 1}} = u^j, \qquad 0 \le j \le n-1 \qquad (6.57) $$
Proof. The Poisson bracket $\{t(u), \mu_{\gamma_k}\}$ can be computed in the same way as in the previous proposition. One obtains:
$$ \dot\mu_{\gamma_k} = \{t(u), \mu_{\gamma_k}\} = \frac{B(u)}{2(u - \mu_{\gamma_k})\,B'(\mu_{\gamma_k})}\,\big(A(\mu_{\gamma_k}) - D(\mu_{\gamma_k})\big) \qquad (6.58) $$
Hence $\dot\mu_{\gamma_k}$ is a polynomial in u of degree n − 1 which vanishes at the points $\mu_{\gamma_{k'}}$ for k' ≠ k, and takes the value $\frac{1}{2}(A(\mu_{\gamma_k}) - D(\mu_{\gamma_k})) = \sqrt{t^2(\mu_{\gamma_k}) - 1}$ at $u = \mu_{\gamma_k}$. On a hyperelliptic curve the Abelian differentials are
$$ \omega_j = \frac{\mu^j}{\sqrt{t^2(\mu) - 1}}\,d\mu, \qquad 0 \le j \le n-1 $$
so that the time derivative of the Abel map $A_j \equiv \sum_k \int^{P_k} \omega_j$ is given by:
$$ \dot A_j = \sum_k \frac{\mu_{\gamma_k}^j\,\dot\mu_{\gamma_k}}{\sqrt{t^2(\mu_{\gamma_k}) - 1}} $$
Replacing $\dot\mu_{\gamma_k}$ by its value given in eq. (6.58), we see that $\dot A_j$ is a polynomial in u of degree at most n − 1. If one evaluates it at $u = \mu_{\gamma_l}$, only the term k = l contributes in the above sum, which gives $\mu_{\gamma_l}^j$. Hence the polynomial itself is $u^j$.
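The last step of this proof is the Lagrange interpolation identity: if B(u) = ∏ₖ(u − µγk) with distinct zeroes, then Σₖ µγk^j B(u)/((u − µγk)B'(µγk)) = u^j for 0 ≤ j ≤ n − 1. A small numerical check, with hypothetical sample points standing in for the µγk:

```python
import numpy as np

mu = np.array([-1.3, 0.2, 0.9, 2.4])   # hypothetical zeroes of B
n = len(mu)
# B'(mu_k) = prod over the other zeroes of (mu_k - mu_m)
Bp = np.array([np.prod([m - mk for mk in mu if mk != m]) for m in mu])

def interp(u, j):
    # sum_k mu_k^j * B(u) / ((u - mu_k) * B'(mu_k)), with B(u) = prod(u - mu_k)
    B_u = np.prod(u - mu)
    return sum(mu[k]**j * B_u / ((u - mu[k]) * Bp[k]) for k in range(n))

for j in range(n):                      # degrees 0 <= j <= n-1
    for u in (0.31, 1.7, -2.2):
        assert abs(interp(u, j) - u**j) < 1e-9
```

This is exactly why evaluating the degree-(n−1) polynomial at u = µγl picks out µγl^j, forcing it to equal u^j.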
6.8 Reality conditions
Up to now the dynamical variables of the Toda chain were complex valued. We now come to the important and generally difficult question of choosing a real slice in the complexified phase space. This complex phase space is a fibred space whose base is parametrized by the action variables and whose fibre is the Jacobian variety of the corresponding spectral curve. While it is easy to take real action variables, it is less trivial to choose a proper real slice of the Jacobian to be identified with the Liouville torus.

Let us first remark that the equation of the spectral curve, eq. (6.8), shows that it is a hyperelliptic curve with hyperelliptic involution given by (λ, µ) → (1/λ, µ). Hence the branch points, which are invariant under this involution, are located at λ = 1/λ, i.e. λ = ±1. For such values of λ the Lax matrix L(λ), eq. (6.4), is symmetric. If, moreover, the dynamical variables $p_i$, $q_i$ are real, the matrices L(λ = ±1) are real symmetric, hence each has (n + 1) real eigenvalues, which are precisely the locations of the branch points. We denote by $\beta_0 < \beta_1 < \cdots < \beta_{2n+1}$ these 2(n + 1) real branch points, which completely characterize the curve. This implies in particular that the action variables, i.e. the coefficients of eq. (6.8), are real. Writing the equation of the spectral curve as:
$$ \lambda + \frac{1}{\lambda} = 2t(\mu) \qquad (6.59) $$
where t(µ) is a polynomial of degree (n + 1) with real coefficients, the 2n + 2 branch points are located at t(µ) = ±1. It follows that the graph of t(µ) has the form shown in Fig. 6.1 (assuming n + 1 even); for n + 1 odd, the left branch (where µ → −∞) goes to −∞. Note that for real µ the two values of λ such that $\lambda^2 - 2t(\mu)\lambda + 1 = 0$ are either real or complex conjugate with modulus one. Borrowing the terminology of solid state physics, the first case is called a forbidden zone, the second an allowed zone. We see that the forbidden zones correspond to the intervals $[\beta_i, \beta_{i+1}]$ for i = 1, 3, ..., 2n − 1 (for both n even and odd), to which is added one extra zone passing through µ = ∞, given by $\mu < \beta_0$ or $\mu > \beta_{2n+1}$. So there are n compact forbidden zones.

We now have to locate the points of the dynamical divisor in agreement with the reality conditions. Recall that they are given by the points B(µ) = 0 and λ = D(µ). We show that the zeroes of the polynomial B(µ) of degree n are real, so that the corresponding λ are also real, since D(µ) has real coefficients. This immediately implies that the zeroes of B(µ) lie in the forbidden zones.
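The forbidden/allowed dichotomy follows from the quadratic λ² − 2t(µ)λ + 1 = 0, whose roots have product 1. A short numerical sketch, with a hypothetical real polynomial t(µ):

```python
import numpy as np

# hypothetical t(mu) of degree n+1 = 3 with real coefficients
t = np.polynomial.Polynomial([0.1, -1.2, 0.0, 0.4])

def zone(mu):
    # solve lambda^2 - 2 t(mu) lambda + 1 = 0; the product of the roots is 1
    lam = np.roots([1.0, -2.0 * t(mu), 1.0])
    if abs(t(mu)) > 1:
        # forbidden zone: two real roots, inverse to each other
        return "forbidden", bool(np.isreal(lam).all())
    else:
        # allowed zone: complex conjugate pair on the unit circle
        return "allowed", bool(np.allclose(np.abs(lam), 1.0))

for mu in np.linspace(-3, 3, 61):
    kind, ok = zone(mu)
    assert ok
```

In the forbidden zones the two real λ branches merge at the βᵢ (where t(µ) = ±1), which is how the real ovals of the spectral curve close up.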
Fig. 6.1. The graph of the function t(µ).

Proposition. The points of the dynamical divisor have real coordinates, and there is exactly one such point in each of the n forbidden zones $[\beta_i, \beta_{i+1}]$, i = 1, 3, ..., 2n − 1.

Proof. We first show that the zeroes of B(µ) are real. This is because the zeroes of B(µ) are also the poles of the eigenvector of L(λ) normalized by $\psi_0 = 1$. The eigenvector equation being:
$$ \begin{pmatrix} p_0 - \mu & a_0 & & \lambda^{-1} a_n \\ a_0 & p_1 - \mu & \ddots & \\ & \ddots & \ddots & a_{n-1} \\ \lambda a_n & & a_{n-1} & p_n - \mu \end{pmatrix} \begin{pmatrix} \psi_0 \\ \psi_1 \\ \vdots \\ \psi_n \end{pmatrix} = 0 $$
the poles of the normalized Ψ are included in the zero set of the minor of $(p_0 - \mu)$, see Chapter 5. This minor is the characteristic polynomial of a real symmetric n × n matrix, hence it has n real zeroes $\mu_{\gamma_i}$.

To show that there is just one zero of B(µ) in each $[\beta_i, \beta_{i+1}]$, we show that B(µ) has the same sign as $a_0\,dt(\mu)/d\mu$ in the allowed zones, hence changes sign between $\beta_i$ and $\beta_{i+1}$, i odd, because so does $dt(\mu)/d\mu$. Since
B(µ) has n zeroes, the only possibility is that it has exactly one zero in each forbidden zone.

The analysis of the sign of $a_0\,dt(\mu)/d\mu$ is quite involved. It rests on the analysis of the two fundamental solutions, eq. (6.44), of the Schroedinger equation, eq. (6.12). In particular, recall that $(\chi_{n+1}(\mu), \chi_{n+2}(\mu))$ is related to the initial values $(\chi_0, \chi_1)$ by the monodromy matrix $\mathbf{T}(\mu)$, so that we have $A(\mu) = \chi^{(0)}_{n+1}(\mu)$, $B(\mu) = \chi^{(1)}_{n+1}(\mu)$, $C(\mu) = \chi^{(0)}_{n+2}(\mu)$, $D(\mu) = \chi^{(1)}_{n+2}(\mu)$, and $2t(\mu) = \chi^{(0)}_{n+1}(\mu) + \chi^{(1)}_{n+2}(\mu)$. Moreover, since $\det \mathbf{T}(\mu) = 1$, we get the Wronskian relation:
$$ \chi^{(0)}_{n+1}(\mu)\,\chi^{(1)}_{n+2}(\mu) - \chi^{(0)}_{n+2}(\mu)\,\chi^{(1)}_{n+1}(\mu) = 1 \qquad (6.60) $$
We then consider two solutions $\chi^{(j)}_i(\mu)$ and $\chi^{(j')}_i(\mu')$ of the Schroedinger equation corresponding to different values µ and µ':
$$ a_{i-1}\chi^{(j)}_{i-1}(\mu) + p_i\chi^{(j)}_i(\mu) + a_i\chi^{(j)}_{i+1}(\mu) = \mu\,\chi^{(j)}_i(\mu) $$
$$ a_{i-1}\chi^{(j')}_{i-1}(\mu') + p_i\chi^{(j')}_i(\mu') + a_i\chi^{(j')}_{i+1}(\mu') = \mu'\,\chi^{(j')}_i(\mu') $$
Here j, j' take the values 0, 1. We multiply the first equation by $\chi^{(j')}_i(\mu')$, the second by $\chi^{(j)}_i(\mu)$ and subtract. Adding the resulting equations for i = 1, 2, ..., n + 1 we get:
$$ a_0\big(\chi^{(j)}_0(\mu)\chi^{(j')}_1(\mu') - \chi^{(j)}_1(\mu)\chi^{(j')}_0(\mu')\big) - a_{n+1}\big(\chi^{(j)}_{n+1}(\mu)\chi^{(j')}_{n+2}(\mu') - \chi^{(j)}_{n+2}(\mu)\chi^{(j')}_{n+1}(\mu')\big) = (\mu - \mu')\,\big(\chi^{(j)}(\mu), \chi^{(j')}(\mu')\big) $$
where we have denoted $(\chi^{(j)}(\mu), \chi^{(j')}(\mu')) = \sum_{i=1}^{n+1} \chi^{(j)}_i(\mu)\,\chi^{(j')}_i(\mu')$.

Consider the four above equations for (j, j') = (0, 0), (1, 1), (0, 1), (1, 0) and multiply them respectively by $\chi^{(1)}_{n+1}(\mu)$, $-\chi^{(0)}_{n+2}(\mu)$, $\chi^{(1)}_{n+2}(\mu)$ and $-\chi^{(0)}_{n+1}(\mu)$. Adding them and taking into account eq. (6.60), we get:
$$ 2a_0\,\frac{t(\mu) - t(\mu')}{\mu - \mu'} = B(\mu)\big(\chi^{(0)}(\mu), \chi^{(0)}(\mu')\big) - C(\mu)\big(\chi^{(1)}(\mu), \chi^{(1)}(\mu')\big) + D(\mu)\big(\chi^{(0)}(\mu), \chi^{(1)}(\mu')\big) - A(\mu)\big(\chi^{(1)}(\mu), \chi^{(0)}(\mu')\big) $$
We can now set µ' = µ and obtain our final relation:
$$ 2a_0\,\frac{dt(\mu)}{d\mu} = B(\mu)\big(\chi^{(0)}(\mu), \chi^{(0)}(\mu)\big) - C(\mu)\big(\chi^{(1)}(\mu), \chi^{(1)}(\mu)\big) + \big(D(\mu) - A(\mu)\big)\big(\chi^{(0)}(\mu), \chi^{(1)}(\mu)\big) \qquad (6.61) $$
6.8 Reality conditions
203
The right-hand side is a sum over i = 1, . . . , n + 1 of quadratic forms (0) (1) 2+ in the variables χi (µ) and χi (µ), with discriminant (D(µ) − A(µ)) C(µ) 4B(µ) = 4(t2 (µ) − 1). In the allowed zones we have t2 (µ) − 1 < 0 and the quadratic form has a definite sign, which is the sign of B(µ). Equation (6.61) shows that this is also the sign of a0 dt(µ)/dµ, as asserted above. We can now describe the real part of the spectral curve. From eq. (6.59), we see that when t(µ)2 > 1, i.e. in the forbidden zones, we have two real solutions for λ, which are inverse to each other. For µ = βj , these two solutions coincide. In the allowed zones, t(µ)2 < 1, there is no real solution for λ. Hence the real slice of the spectral curve has n components Ck at finite distance, and one component which extends to ∞. See Fig. 6.2. The dynamical divisor has n points γk with just one point in each of the components Ck at finite distance. As time goes on, each of the points γk runs along the cycle Ck , hence the whole motion lies on the real torus C1 × C2 × · · · × Cn , which is the Liouville torus. For the Hamiltonian t(u), the time evolution of µγk is
Fig. 6.2. The real slice of the spectral curve.
204
6 The closed Toda chain
given by eq. (6.58) and the one of λγk is given similarly by: λ˙ k = −
1 B(u) λγ t (µγk ) u − µγk B (µγk ) k
Hence, when the point γk hits the line λ = ±1, we have µ˙ k = 0 and λ˙ k = 0, so that the point γk continues in the same direction and loops around Ck . It is interesting to relate the Liouville torus to the real slice of the Jacobian variety of the spectral curve. One an antiholomorphic can define involution of the Jacobian by sending γk to γ k , where the image of γ = (λ, µ) is γ = (λ, µ), which also lies on the spectral curve since t(µ) has real coefficients. A real slice of the Jacobian can be defined as the set of fixed points of this involution. Note that γk is invariant if the γk are real or occur in complex conjugate pairs. The various combinations of these two possibilities define several connected components of the real slice of the Jacobian. The Liouville torus is just one of these components. Note that we have discussed the reality condition for the dynamical variables ai and pi , but one should further ensure that the variables ai are positive to get real qi . Alternatively, one can view the Liouville torus as some choice of acycles on the spectral curve. Consider it as a two-sheeted covering of the complex µ-plane, with cuts between branch points as shown in Fig. 6.3. Here the a-cycles are drawn on the first sheet, so in the limit where the ellipses go to the segments [βi , βi+1 ] we see that λ goes to the two real roots of the spectral equation, thereby producing the cycles in the previous drawing. The b-cycles are also drawn. Starting from ∞, they cut the corresponding a-cycle once, then pass on the second sheet through the cut, and return to ∞ without cutting any more of the a-cycles.
Fig. 6.3. Γ as the cut µ-plane.
6.8 Reality conditions
205
References [1] H. Flaschka, The Toda lattice I: Existence of integrals. Phys. Rev. B9 (1974) 1924. [2] H. Flaschka, The Toda lattice II: Inverse scattering solution. Prog. Theor. Phys. 51 (1974) 703–716. [3] P. van Moerbeke, The spectrum of Jacobi Matrices. Invent. Math. (1976) 45–81. [4] P. van Moerbeke and D. Mumford, The spectrum of difference operators and algebraic curves. Acta. Math. 143 (1979) 93–154. [5] P. van Moerbeke, About Isospectral deformations of discrete Laplacians. Lecture Notes in Math. 755 (1979) 313–370. [6] E. Sklyanin, The quantum Toda chain. Lect. Notes Physics 226 (1985) 196–233. [7] B. Dubrovin, I. Krichever and S. Novikov, Integrable Systems. I. Encyclopedia of Mathematical Sciences, Dynamical Systems IV. Springer (1990).
7 The Calogero–Moser model
The elliptic Calogero–Moser model provides an example in which the Lax matrix is not a rational function of the spectral parameter but lives on an elliptic curve. As pointed out in Chapter 3, integrable systems whose spectral parameter belongs to Riemann surfaces of higher genus are highly non-generic. Furthermore, this model gives an example of an integrable system whose spectral curve is non-hyperelliptic. Nevertheless, most of the results obtained in the rational case extend to this case with slight but interesting adaptations. A special feature of this model is that its r-matrix explicitly contains dynamical variables. Another remarkable fact is the relation between doubly periodic solutions of the KP hierarchy and the Calogero–Moser model. Finally, we show that this model is a particular example of a general construction, due to Hitchin, which allows us to construct models with spectral parameter lying on higher genus Riemann surfaces.
7.1 The spin Calogero–Moser model The Calogero–Moser model consists of N identical particles on a line, at positions qi and momenta pi , with pairwise interactions and Hamiltonian: N 1 1 2 γ2 H= pi + 2 2 i,j=1 (qi − qj )2 i
i=j
This dynamical system is integrable, moreover it remains integrable when the potential is replaced by an elliptic potential, i.e. 1/q 2 → ℘(q) with ℘(q) the Weierstrass elliptic function defined on a two dimensional torus with periods ω1 and ω2 . 206
It is rewarding to consider a slightly generalized model, the so-called spin Calogero–Moser model, which contains extra dynamical spin variables. Let us introduce a set of dynamical variables $(q_i, p_i)$ and $(f_{ij})$, i, j = 1, ..., N, together with the Poisson brackets:
$$ \{p_i, q_j\} = \delta_{ij} \qquad (7.1) $$
$$ \{f_{ij}, f_{kl}\} = -\delta_{jk}\,f_{il} + \delta_{li}\,f_{kj} \qquad (7.2) $$
The Poisson bracket eq. (7.2) is a Kostant–Kirillov bracket for the coadjoint action of the group GL(N). The Hamiltonian reads
$$ H = \frac{1}{2}\sum_{i=1}^N p_i^2 - \frac{1}{2}\sum_{\substack{i,j=1\\ i\neq j}}^N f_{ij} f_{ji}\,V(q_i - q_j) \qquad (7.3) $$
where the potential V(q) ≡ ℘(q). The equations of motion are easily derived:
$$ \dot q_i = p_i, \qquad \dot f_{ii} = 0 $$
$$ \dot p_i = \sum_{\substack{j=1\\ j\neq i}}^N f_{ij} f_{ji}\,V'(q_{ij}), \qquad q_{ij} \equiv q_i - q_j $$
$$ \dot f_{ij} = \sum_{\substack{k=1\\ k\neq i,j}}^N f_{ik} f_{kj}\,\big[V(q_{ik}) - V(q_{jk})\big] + (f_{ii} - f_{jj})\,f_{ij}\,V(q_{ij}), \qquad i \neq j \qquad (7.4) $$
From (7.4) we see that the $f_{ii}$ are integrals of motion, and we can restrict the system to the submanifold
$$ f_{ii} = \alpha \qquad (7.5) $$
where α is a constant independent of i. In this case the last term in eqs. (7.4) vanishes. These constraints are related to a Hamiltonian reduction, and we will see that it is this reduced system which admits a Lax pair and is integrable.

Let us count the number of degrees of freedom. This is not a completely trivial matter, due to the degeneracy of the Poisson bracket eq. (7.2) and the necessary reduction to the manifold $f_{ii} = \alpha$. The symplectic leaves of the Kostant–Kirillov bracket, eq. (7.2), are the coadjoint orbits of GL(N) acting on the matrix $F = (f_{ij})$ by conjugation. These orbits are generically characterized by the eigenvalues of the matrix F, which are in the centre of the Poisson bracket. Here we shall consider matrices F of rank l, with l different non-vanishing eigenvalues. Moreover,
we shall assume that the matrix F is diagonalizable, so that the orbit is of the form
$$ \{C\nu C^{-1}\,|\,C \in GL(N)\}, \qquad \text{with } \nu = \mathrm{Diag}(\nu_1, \dots, \nu_l, 0, \dots, 0) \qquad (7.6) $$
The Hamiltonian eq. (7.3) is not invariant under the above GL(N) action, but it is preserved by special subgroups. First we have the discrete subgroup of permutation matrices, i.e. the Weyl group of GL(N), which simply operates by permutation of the N indices i. More importantly, we have the group of diagonal matrices, i.e. the Cartan torus, which operates by:
$$ f_{ij} \to d_i^{-1}\,f_{ij}\,d_j \qquad (7.7) $$
This action preserves the Hamiltonian, which only depends on $f_{ij}f_{ji}$. We consider the dynamical system on an orbit of rank l, reduced under this diagonal action, whose generators will be shown to be the $f_{ii}$.

Proposition. The dimension of the reduced phase space M is
$$ \dim M = 2\Big(Nl - \frac{l(l+1)}{2} + 1\Big) \qquad (7.8) $$
Proof. The tangent space to the orbit (7.6) at $F = (f_{ij})$ is the set of matrices U = [F, X] for X ∈ gl(N). In a basis where F is diagonal this equation reads $U_{ij} = (\nu_i - \nu_j)X_{ij}$, hence $U_{ij}$ vanishes when $\nu_i = \nu_j$ but is otherwise arbitrary. So the dimension of the orbit is $2Nl - l^2 - l$. The action of the subgroup of diagonal matrices induces a fibring of the orbits, with fibres of dimension N − 1 because the identity does not act. The moment associated with this action is the collection of diagonal elements $f_{ii}$, that is, N − 1 non-trivial moments, since on the orbit the eigenvalues of F are fixed, and so is Tr(F). Indeed, under the infinitesimal action $d_i = 1 + \epsilon_i$, $f_{ij}$ changes as $\delta f_{ij} = (\epsilon_j - \epsilon_i)f_{ij}$, and if $P = \sum_i \epsilon_i f_{ii}$, we have $\{P, f_{ij}\} = (\epsilon_j - \epsilon_i)f_{ij}$, so P is the corresponding Hamiltonian. We consider the reduced dynamical system obtained by first fixing the moments to a common value $f_{ii} = \alpha$, and then quotienting by the stabilizer of this moment, which is the whole diagonal group. We can now count the number of degrees of freedom: 2N for the $q_i$, $p_i$, plus $2Nl - l^2 - l$ for the orbit, minus 2(N − 1) due to the Hamiltonian reduction, which ends up with a phase space of dimension $2(Nl - l(l+1)/2 + 1)$.

7.2 Lax pair

The first step in proving that the reduced spin Calogero–Moser model is integrable consists of finding a Lax pair formulation. Here we will
just give the Lax pair, but in later sections we will explain methods to derive it. We will need the Lamé function Φ:
$$ \Phi(q, \lambda) = \frac{\sigma(\lambda - q)}{\sigma(\lambda)\,\sigma(q)}\,e^{\zeta(\lambda)q} \qquad (7.9) $$
where σ and ζ are defined in Chapter 15. It is an elliptic function of the parameter λ and satisfies the equation
$$ \Big(\frac{d^2}{dx^2} - 2\wp(x)\Big)\Phi(x, \lambda) = \wp(\lambda)\,\Phi(x, \lambda) \qquad (7.10) $$
The Lamé function is used to construct the Lax pair of the spin Calogero–Moser model.

Proposition. The equations of motion of the spin Calogero–Moser system are equivalent to the Lax equation
$$ \dot L(\lambda) = [M(\lambda), L(\lambda)] \qquad (7.11) $$
where the Lax matrices, with spectral parameter λ, are given by:
$$ L_{ij}(t, \lambda) = \dot q_i\,\delta_{ij} + (1 - \delta_{ij})\,f_{ij}\,\Phi(q_i - q_j, \lambda) \qquad (7.12) $$
$$ M_{ij}(t, \lambda) = -(1 - \delta_{ij})\,f_{ij}\,\Phi'(q_i - q_j, \lambda) \qquad (7.13) $$
The prime in eq. (7.13) refers to the derivative with respect to q.

Proof. The Lax equation (7.11) reads:
$$ \ddot q_i\,\delta_{ij} + (1 - \delta_{ij})\big[\dot f_{ij}\,\Phi(q_{ij}, \lambda) + f_{ij}\,\dot q_{ij}\,\Phi'(q_{ij}, \lambda)\big] = (1 - \delta_{ij})\,f_{ij}\,\dot q_{ij}\,\Phi'(q_{ij}, \lambda) + \sum_{k\neq i,j} f_{ik} f_{kj}\big[\Phi(q_{ik}, \lambda)\Phi'(q_{kj}, \lambda) - \Phi'(q_{ik}, \lambda)\Phi(q_{kj}, \lambda)\big] $$
This reduces to the equations of motion, in the case where $f_{ii} = \alpha$, if we use an identity satisfied by the Lamé function:
$$ \Phi'(x, \lambda)\Phi(y, \lambda) - \Phi'(y, \lambda)\Phi(x, \lambda) = [\wp(y) - \wp(x)]\,\Phi(x + y, \lambda) \qquad (7.14) $$
which also implies, by taking the limit y → −x:
$$ \Phi'(x, \lambda)\Phi(-x, \lambda) - \Phi'(-x, \lambda)\Phi(x, \lambda) = -\wp'(x) $$
To show eq. (7.14) we compare the analyticity and monodromy properties of both sides of the equation in the x variable, using the properties given in Chapter 15.
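The equivalence of the Lax equation with the equations of motion can be tested numerically in the rational limit, where Φ(q, λ) = 1/q − 1/λ and ℘(q) → 1/q², so that identity (7.14) holds exactly. The positions, momenta and rank-one spin matrix below are hypothetical sample data, constrained to fᵢᵢ = α as in eq. (7.5):

```python
import numpy as np

rng = np.random.default_rng(7)
N, lam, alpha = 4, 0.61, 1.0
q = np.array([0.0, 1.3, 2.9, 5.0])
p = rng.normal(size=N)
a = rng.normal(size=N) + 2.0
F = np.outer(alpha / a, a)            # rank-one F with f_ii = alpha

Phi  = lambda x: 1/x - 1/lam          # rational limit of the Lame function
dPhi = lambda x: -1/x**2              # Phi'(q, lam), derivative in q
V    = lambda x: 1/x**2               # rational limit of the potential
dV   = lambda x: -2/x**3

L = np.diag(p.astype(float))
M = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        if i != j:
            L[i, j] = F[i, j] * Phi(q[i] - q[j])
            M[i, j] = -F[i, j] * dPhi(q[i] - q[j])

# dL/dt from the equations of motion (7.4); the last term drops since f_ii = alpha
Ldot = np.zeros((N, N))
for i in range(N):
    Ldot[i, i] = sum(F[i, j] * F[j, i] * dV(q[i] - q[j]) for j in range(N) if j != i)
    for j in range(N):
        if i != j:
            fdot = sum(F[i, k] * F[k, j] * (V(q[i] - q[k]) - V(q[j] - q[k]))
                       for k in range(N) if k not in (i, j))
            Ldot[i, j] = fdot * Phi(q[i] - q[j]) + F[i, j] * (p[i] - p[j]) * dPhi(q[i] - q[j])

assert np.allclose(Ldot, M @ L - L @ M)
```

The diagonal of [M, L] reproduces ṗᵢ via the y → −x limit of (7.14), and the off-diagonal part reproduces ḟᵢⱼ via (7.14) itself.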
By a similar method one shows that the Lamé function obeys the identity:
$$ \Phi(q, \lambda)\,\Phi(-q, \lambda) = \wp(\lambda) - \wp(q) \qquad (7.15) $$
hence
$$ \frac{1}{2}\operatorname{Tr} L^2(\lambda) = H + \wp(\lambda)\sum_{i<j} f_{ij} f_{ji} $$
where H is the Hamiltonian of the spin Calogero–Moser model, and $2\sum_{i<j} f_{ij} f_{ji} = \operatorname{Tr} F^2 - N\alpha^2$ is an orbit invariant in the centre of the Poisson algebra.

Let us comment on the trigonometric and rational limits of the above formulae. The trigonometric limit is obtained when one of the periods ω → ∞; we choose the other one to be iπ. In this limit the function Φ becomes:
$$ \Phi(q, \lambda) \to (\coth q - \coth\lambda)\,e^{q\coth\lambda} $$
The exponential factor in Φ(q, λ) comes from the factor exp(ζ(λ)q), which is necessary in the elliptic case to ensure the double periodicity of Φ(q, λ) in λ. In the trigonometric case, however, this exponential factor can be eliminated by performing a similarity transformation on L(λ) without affecting the periodicity properties of the matrix elements of L(λ). So we may define
$$ L^{trigo}(\lambda) = \mathrm{Diag}(e^{-q_i\coth\lambda})\,\lim_{\omega\to\infty} L^{elliptic}(\lambda)\,\mathrm{Diag}(e^{q_i\coth\lambda}) = \sum_i p_i E_{ii} + \sum_{i\neq j} f_{ij}\,(\coth q_{ij} - \coth\lambda)\,E_{ij} \qquad (7.16) $$
The potential becomes $V(q) = 1/\sinh^2 q$. The rational limit is obtained straightforwardly from the trigonometric limit by sending the second period to ∞. The functions V(q) and Φ(q, λ) become
$$ V(q) \to \frac{1}{q^2}, \qquad \Phi(q, \lambda) \to \frac{1}{q} - \frac{1}{\lambda} $$
Remark. The diagonal action in eq. (7.7) is equivalent to conjugation of the Lax matrix by a diagonal matrix. The Hamiltonian reduction we have performed is similar to the one appearing in the general rational case, see Chapter 5.
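In the rational limit both Lamé identities can be checked in one line each, since Φ(q, λ) = 1/q − 1/λ and ℘(q) = 1/q² are elementary (sample values below are arbitrary):

```python
lam = 0.73
Phi  = lambda q: 1/q - 1/lam
dPhi = lambda q: -1/q**2
wp   = lambda q: 1/q**2           # rational limit of the Weierstrass function

for x, y in [(0.4, 1.1), (-0.9, 2.3), (1.5, -0.6)]:
    lhs = dPhi(x)*Phi(y) - dPhi(y)*Phi(x)
    rhs = (wp(y) - wp(x)) * Phi(x + y)                     # identity (7.14)
    assert abs(lhs - rhs) < 1e-12
    assert abs(Phi(x)*Phi(-x) - (wp(lam) - wp(x))) < 1e-12 # identity (7.15)
```

In the elliptic case the same identities hold with the full σ- and ζ-functions; only the verification method (matching poles and monodromies) is less elementary.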
7.3 The r-matrix

In this section we compute the r-matrix associated with the Lax matrix
$$ L(\lambda) = \sum_i p_i E_{ii} + \sum_{i\neq j} \Phi(q_{ij}, \lambda)\,f_{ij}\,E_{ij} \qquad (7.17) $$
One should emphasize that it is only the reduced system ($f_{ii} = \alpha$) which is integrable. The Lax matrix (7.17) is not a function on the reduced phase space, and so is not expected to have an r-matrix in the usual sense. This accounts for the extra term in eq. (7.18) below.

Proposition. The Poisson bracket of the Lax matrix eq. (7.17) is given by:
$$ \{L_1(\lambda), L_2(\mu)\} = [r_{12}(\lambda, \mu), L_1(\lambda)] - [r_{21}(\mu, \lambda), L_2(\mu)] + [D, r_{12}(\lambda, \mu)] \qquad (7.18) $$
where $D = \sum_i f_{ii}\,\partial/\partial q_i$ and the r-matrix is expressed as:
$$ r_{12}(\lambda, \mu) = a(\lambda, \mu)\sum_i E_{ii}\otimes E_{ii} - \sum_{i\neq j} b(q_{ij}, \lambda, \mu)\,E_{ij}\otimes E_{ji} \qquad (7.19) $$
with
$$ a(\lambda, \mu) = \zeta(\lambda - \mu) - \zeta(\lambda) + \zeta(\mu), \qquad b(q, \lambda, \mu) = \frac{\sigma(\lambda - \mu - q)}{\sigma(\lambda - \mu)\,\sigma(q)}\,e^{(\zeta(\lambda) - \zeta(\mu))q} $$
Proof. We compute the various terms of eq. (7.18) and collect them according to the number of equal matrix indices. We have (all written indices are different, and sums are implied):
$$ \{L_1(\lambda), L_2(\mu)\} = -\Phi(q_{ij}, \lambda)\Phi(q_{ji}, \mu)(f_{ii} - f_{jj})\,E_{ij}\otimes E_{ji} + \Phi'(q_{ij}, \mu)\,f_{ij}\,(E_{ii} - E_{jj})\otimes E_{ij} - \Phi'(q_{ij}, \lambda)\,f_{ij}\,E_{ij}\otimes(E_{ii} - E_{jj}) + \Phi(q_{ij}, \lambda)\Phi(q_{ki}, \mu)\,f_{kj}\,E_{ij}\otimes E_{ki} - \Phi(q_{ij}, \lambda)\Phi(q_{jk}, \mu)\,f_{ik}\,E_{ij}\otimes E_{jk} $$
Similarly one has, using the antisymmetry of the r-matrix:
$$ [r_{12}, L_1(\lambda) + L_2(\mu)] = \big[a(\lambda, \mu)\Phi(q_{ij}, \mu) + b(q_{ji}, \lambda, \mu)\Phi(q_{ij}, \lambda)\big]\,f_{ij}\,(E_{ii} - E_{jj})\otimes E_{ij} + \big[a(\lambda, \mu)\Phi(q_{ij}, \lambda) + b(q_{ij}, \lambda, \mu)\Phi(q_{ij}, \mu)\big]\,f_{ij}\,E_{ij}\otimes(E_{ii} - E_{jj}) + \big[b(q_{ij}, \lambda, \mu)\Phi(q_{kj}, \mu) - b(q_{ik}, \lambda, \mu)\Phi(q_{kj}, \lambda)\big]\,f_{kj}\,E_{ij}\otimes E_{ki} + \big[b(q_{kj}, \lambda, \mu)\Phi(q_{ik}, \lambda) - b(q_{ij}, \lambda, \mu)\Phi(q_{ik}, \mu)\big]\,f_{ik}\,E_{ij}\otimes E_{jk} $$
Finally, we have:
$$ [D, r_{12}] = -b'(q_{ij}, \lambda, \mu)\,(f_{ii} - f_{jj})\,E_{ij}\otimes E_{ji} $$
Using the relations $a(\mu, \lambda) = -a(\lambda, \mu)$ and $b(-q, \mu, \lambda) = -b(q, \lambda, \mu)$, we see that eq. (7.18) reduces to the three identities:
$$ \Phi'(q, \mu) = a(\lambda, \mu)\Phi(q, \mu) + b(-q, \lambda, \mu)\Phi(q, \lambda) $$
$$ \Phi(q, \lambda)\Phi(q', \mu) = b(q, \lambda, \mu)\Phi(q + q', \mu) - b(-q', \lambda, \mu)\Phi(q + q', \lambda) $$
$$ b'(q, \lambda, \mu) = \Phi(q, \lambda)\Phi(-q, \mu) $$
The first identity, written in terms of Weierstrass functions, reads:
$$ \frac{\sigma(\lambda - q)\,\sigma(\mu)\,\sigma(\lambda - \mu + q)}{\sigma(\lambda)\,\sigma(\mu - q)\,\sigma(\lambda - \mu)\,\sigma(q)} = \zeta(\mu - q) + \zeta(q) + \zeta(\lambda - \mu) - \zeta(\lambda) $$
One observes that both sides are elliptic functions of λ (and µ) and have the same poles and residues, hence they are equal. The second identity reads:
$$ \sigma(q + q')\,\sigma(\lambda - \mu)\,\sigma(\lambda - q)\,\sigma(\mu - q') = \sigma(\lambda - \mu - q)\,\sigma(q')\,\sigma(\lambda)\,\sigma(\mu - q - q') + \sigma(\lambda - \mu + q')\,\sigma(\mu)\,\sigma(q)\,\sigma(\lambda - q - q') $$
which is true because both sides vanish at λ = q and λ = µ, are equal for λ = 0, and have the same monodromy properties when λ is shifted by periods. Finally, the third identity is a consequence of the first one.

Notice that a(λ, µ) and b(q, λ, µ) are true elliptic functions of both λ and µ. Note also that the r-matrix is antisymmetric, i.e. $r_{12}(\lambda, \mu) = -r_{21}(\mu, \lambda)$. The most remarkable feature of this r-matrix is that it depends on the dynamical variables $q_{ij}$, hence the name dynamical r-matrix.

Remark. Equation (7.18) holds for the non-reduced dynamical system (7.3). We see that the $\operatorname{Tr} L^n(\lambda)$ are not in involution for this system. Indeed, we have:
$$ \{\operatorname{Tr} L^n(\lambda), \operatorname{Tr} L^m(\mu)\} = -nm \sum_{\substack{i,j=1\\ i\neq j}}^N \Phi(q_{ij}, \lambda)\Phi(q_{ji}, \mu)\,(f_{ii} - f_{jj})\,[L^{n-1}(\lambda)]_{ij}\,[L^{m-1}(\mu)]_{ji} \qquad (7.20) $$
As we have already seen, the $f_{ii}$ are the moments of the diagonal group which acts on L(λ) by conjugation. The quantities $\operatorname{Tr} L^n(\lambda)$ are invariant under this action. It follows that one can compute their reduced Poisson bracket on the manifold $(f_{ii} = \alpha)_{i=1,\dots,N}$ by just setting $f_{ii} = \alpha$ in eq. (7.20), see Chapter 14. Therefore the $\operatorname{Tr} L^n(\lambda)$ are in involution for the reduced system. This proves the integrability of the spin Calogero–Moser model. We will count the number of action variables later on.
Proposition. The classical r-matrix, eq. (7.19), satisfies the identity YB = 0, with
$$ YB \equiv -\{L_1, r_{23}\} + \{L_2, r_{13}\} - \{L_3, r_{12}\} + [r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}] \qquad (7.21) $$
Proof. It is done by direct calculation. It is more interesting, however, to see how this is related to the Jacobi identity. Let us call
$$ Z_{12} = [D, r_{12}], \qquad D = \sum_i f_{ii}\,\frac{\partial}{\partial q_i} $$
The Jacobi identity reads
$$ \big[L_1,\; [r_{12}, r_{23}] + [r_{12}, r_{13}] + [r_{32}, r_{13}] + \{L_2, r_{13}\} - \{L_3, r_{12}\}\big] + \text{cyclic perm.} + [r_{23}, Z_{12}] + [r_{31}, Z_{23}] + [r_{12}, Z_{31}] - [r_{32}, Z_{13}] - [r_{13}, Z_{21}] - [r_{21}, Z_{32}] + \{L_1, Z_{23}\} + \{L_2, Z_{31}\} + \{L_3, Z_{12}\} = 0 $$
Note that the term commuted with $L_1$ is not quite equal to the left-hand side of eq. (7.21). We want to show that the missing term, $\{L_1, r_{23}\}$, is produced by all the $Z_{ij}$ contributions. Using $r_{12}(\lambda, \mu) = -r_{21}(\mu, \lambda)$, we find
$$ [r_{23}, Z_{12}] + [r_{31}, Z_{23}] + [r_{12}, Z_{31}] - [r_{32}, Z_{13}] - [r_{13}, Z_{21}] - [r_{21}, Z_{32}] = -\big[D,\; [r_{12}, r_{13}] + [r_{12}, r_{23}] + [r_{13}, r_{23}]\big] $$
Moreover, we easily compute $\{L_1, D\} = -\sum_k [L_1, E_{kk}]\,\partial_{q_k}$, so that:
$$ \{L_1, Z_{23}\} = \{L_1, [D, r_{23}]\} = [D, \{L_1, r_{23}\}] + [\{L_1, D\}, r_{23}] = [D, \{L_1, r_{23}\}] - [L_1, \{L_1, r_{23}\}] $$
The Jacobi identity becomes
$$ [L_1 + L_2 + L_3,\; YB] - [D, YB] = 0 $$
where we used that YB is invariant under cyclic permutations of the indices 1, 2, 3. As a result, the Jacobi identity is satisfied if eq. (7.21) holds, thereby giving a motivation to the computation showing that YB = 0.

Remark. Let us comment on the trigonometric limit of these formulae. Using the Lax matrix, eq. (7.16), we find
$$ \{L_1(\lambda), L_2(\mu)\} = [r_{12}(\lambda, \mu), L_1(\lambda)] - [r_{21}(\mu, \lambda), L_2(\mu)] + \sum_{\substack{i,j=1\\ i\neq j}}^N \frac{f_{ii} - f_{jj}}{\sinh^2(q_i - q_j)}\,E_{ij}\otimes E_{ji} $$
where
$$ r_{12}(\lambda, \mu) = \coth(\lambda - \mu)\,C_{12} - \sum_{\substack{i,j=1\\ i\neq j}}^N \coth(q_i - q_j)\,E_{ij}\otimes E_{ji} $$
and $C_{12}$ is the Casimir element of sl(N): $C_{12} = \sum_{i,j=1}^N E_{ij}\otimes E_{ji}$.
7.4 The scalar Calogero–Moser model
The scalar Calogero–Moser model is defined by the Hamiltonian:
$$ H^{Cal} = \frac{1}{2}\sum_{i=1}^N p_i^2 + \frac{\gamma^2}{2}\sum_{\substack{i,j=1\\ i\neq j}}^N V(q_i - q_j) \qquad (7.22) $$
We show here that this model and its r-matrix can be obtained from eq. (7.18) by a Hamiltonian reduction procedure. Quite generally, we can parametrize the matrix $F = (f_{ij})$ of rank l as follows:
$$ f_{ij} = \sum_{r=1}^l b_i^r\,a_j^r \qquad (7.23) $$
The l vectors $b^r$ form a basis of the image of F, and the vectors $a^r$ form a basis of a supplementary space of the kernel of F. Moreover, the Poisson bracket, eq. (7.2), is reproduced if we set
$$ \{a_i^r, b_j^s\} = -\delta^{rs}\,\delta_{ij} $$
The equations of motion for the quantities $a_i^r$, $b_i^r$ read
$$ \dot a_i^r = \{H, a_i^r\} = -\sum_{k\neq i} V(q_{ik})\,f_{ki}\,a_k^r, \qquad \dot b_i^r = \{H, b_i^r\} = \sum_{k\neq i} V(q_{ik})\,f_{ik}\,b_k^r $$
These equations of motion reproduce eq. (7.4) on the reduced manifold $f_{ii} = \alpha$.

To recover the scalar Calogero–Moser model, we choose l = 1 in eq. (7.23), so there is only one pair of vectors a and b; we simply denote their components by $a_i$, $b_i$. On these variables, the diagonal action eq. (7.7) reads:
$$ a_i \longrightarrow d_i\,a_i, \qquad b_i \longrightarrow d_i^{-1}\,b_i \qquad (7.24) $$
The integrable system is obtained by applying the method of Hamiltonian reduction under this group. We fix the moment to $f_{ii} = a_i b_i = \alpha = \sqrt{-1}\,\gamma$, which removes N degrees of freedom. Then we have to quotient by the isotropy subgroup of the moment, which is again the group of diagonal matrices. At the end the 2N degrees of freedom $a_i$, $b_i$ are eliminated, leaving as reduced Hamiltonian the scalar Calogero–Moser model, eq. (7.22). Note that $f_{ij}f_{ji} = \alpha^2 = -\gamma^2$.

In order to perform the reduction at the level of the Lax matrix, we remark that if $g = \mathrm{Diag}(a_i^{-1})_{i=1,\dots,N}$, the matrix
$$ L^{Cal}(\lambda) = g\,L(\lambda)\,g^{-1} = \sum_i p_i E_{ii} + \alpha\sum_{i\neq j} \Phi(q_{ij}, \lambda)\,E_{ij} $$
is invariant under the diagonal action and so is a function on the reduced phase space. This is the Lax matrix of the scalar Calogero–Moser model.

Proposition. The Poisson bracket of the Lax matrix $L^{Cal}(\lambda)$ takes the r-matrix form:
$$ \{L^{Cal}_1(\lambda), L^{Cal}_2(\mu)\} = [r^{Cal}_{12}(\lambda, \mu), L^{Cal}_1(\lambda)] - [r^{Cal}_{21}(\mu, \lambda), L^{Cal}_2(\mu)] \qquad (7.25) $$
with
$$ r^{Cal}_{12}(\lambda, \mu) = a(\lambda, \mu)\sum_i E_{ii}\otimes E_{ii} - \sum_{i\neq j} b(q_{ij}, \lambda, \mu)\,E_{ij}\otimes E_{ji} - \frac{1}{2}\sum_{i\neq j} \Phi(q_{ij}, \mu)\,(E_{ii} + E_{jj})\otimes E_{ij} $$
This is a dynamical r-matrix which is no longer antisymmetric.

Proof. Since $L^{Cal}(\lambda)$ is invariant under the symmetry group, we can compute the Poisson brackets of its matrix elements directly. Using eq. (2.13) in Chapter 2, we get:
$$ r^{Cal}_{12}(\lambda, \mu) = g_1 g_2\Big[r_{12}(\lambda, \mu) + g_1^{-1}\{g_1, L_2(\mu)\} + \frac{1}{2}[u_{12}, L_2(\mu)]\Big]\,g_1^{-1} g_2^{-1} $$
where $u_{12} = g_1^{-1} g_2^{-1}\{g_1, g_2\}$ is here equal to zero. We get
$$ r^{Cal}_{12}(\lambda, \mu) = r_{12}(\lambda, \mu) - \sum_{i\neq j} \Phi(q_{ij}, \mu)\,E_{ii}\otimes E_{ij} $$
Redefining
$$ r^{Cal}_{12}(\lambda, \mu) \longrightarrow r^{Cal}_{12}(\lambda, \mu) + \Big[\frac{1}{2\alpha}\sum_i E_{ii}\otimes E_{ii},\; L^{Cal}_2(\mu)\Big] $$
does not change eq. (7.25) and yields the r-matrix of the scalar Calogero–Moser model.

Finally, we give the formula for the matrix $M^{Cal} = gMg^{-1} + \dot g\,g^{-1}$. We find $M^{Cal}_{ij} = -(\dot a_i/a_i)\,\delta_{ij} - (1 - \delta_{ij})\,\alpha\,\Phi'(q_i - q_j, \lambda)$. Using the equation of motion $\dot a_i = -\sum_{k\neq i}\alpha V(q_i - q_k)\,a_i$, we obtain
$$ M^{Cal}_{ij} = \alpha\,\delta_{ij}\sum_{k\neq i} V(q_i - q_k) - (1 - \delta_{ij})\,\alpha\,\Phi'(q_i - q_j, \lambda) $$
7 The Calogero–Moser model 7.5 The spectral curve
The spectral curve of the spin Calogero–Moser model is defined as usual: Γ : Γ(λ, µ) ≡ det (L(t, λ) − µI) = 0
(7.26)
The curve Γ is time-independent due to the Lax equation, eq. (7.11). Note that Γ is invariant under the symmetries eq. (7.7). Proposition. The equation of the spectral curve takes the form: Γ(λ, µ) ≡
N
ri (λ)µi = 0
(7.27)
i=0
where the ri (λ) are elliptic functions of λ, independent of t, which can be expanded on the Weierstrass ℘ function and its derivatives as: ri (λ) =
Ii0
+
N −i−2
Ii,s ∂λs ℘(λ)
(7.28)
s=0
In a neighbourhood of λ = 0, the function Γ(λ, µ) can be factorized as: Γ(λ, µ) =
N (µ − µi (λ));
µi (λ) = (α − νi )λ−1 + hi (λ)
(7.29)
i=1
where hi (λ) are regular functions of λ, and the νi are the eigenvalues of the matrix F = (fij ), so that νi = 0, i > l and fii = α Proof. The matrix elements Lij (t, λ) of the Lax matrix are elliptic functions of the variable λ having an essential singularity at λ = 0. The functions ri (λ), however, are meromorphic because the essential singularity in L(λ) can be gauged away near λ = 0 since we can write: ˜ λ)G−1 (t, λ), Gij = δij exp(ζ(λ)qi (t)) L(t, λ) = G(t, λ)L(t,
(7.30)
˜ ij (t, λ) are meromorphic functions of λ in a neighbourhood of the where L point λ = 0. Incorporating the constraint fii = α, we have: ˜ λ) = 1 (αI − F (t)) + O(λ0 ) L(t, λ
(7.31)
where F (t) is the matrix of elements fij (t). Therefore the elliptic functions ri (λ) have poles of degree at most N − i at the point λ = 0, so that they can be expanded as linear combinations of the function ℘(λ) and its derivatives. We can always factorize the polynomial in µ, Γ(λ, µ), around
7.5 The spectral curve
217
λ = 0. The branches µi (λ) in eq. (7.29) have simple poles, of the stated form due to eq. (7.31). In particular, since F is of rank l, the eigenvalue νi = 0 has multiplicity N − l. The coefficients Ii0 , Ii,s of this expansion are the integrals of motion of the spin Calogero–Moser model. They define the moduli of the algebraic curve Γ. We are now in a position to compute the number of action variables. Proposition. The number of action variables is half the dimension of the phase space: 1 N l − l(l + 1)/2 + 1 = dim M 2 Proof. The spectral equation Γ(λ, µ) = 0 depends on N (N + 1)/2 parameters Ii0 , Iis . However, they are not all independent. The constraints come from the conditions νi = 0, i > l and νi non-dynamical for 1 ≤ i ≤ l in eq. (7.29) . To see how they translate on the parameters Ii0 , Iis , let us in˜ µ ˜ +αλ−1 ) = Γ(λ, ˜), troduce the variable µ ˜ = µ−αλ−1 . Then we have Γ(λ, µ which can be expanded as: ˜ µ Γ(λ, ˜) =
N
˜ i (˜ Γ µ)λ−i + R(λ, µ ˜),
(7.32)
i=0
˜ i (˜ where Γ µ) are polynomials in µ ˜ and R(λ, µ ˜) = O(λ) is regular at λ = 0. ˜ µ) is N − i and that its coeffiOne can check easily that the degree of Γi (˜ cients are linear combinations of the parameters Ii0 , Iis . The conditions νi = 0, i > l, imply that ˜ i (˜ µ) = 0, i > l. Γ
(7.33)
Altogether this is equivalent to a set of (N −l)(N −l+1)/2 linear equations on the parameters Ii0 , Iis . The total number of independent parameters is therefore equal to N l − l(l − 1)/2. Next, recall that the expansion around λ = 0 of the branch µi for i = 1, . . . , l reads µi = (α − νi )λ−1 + O(1). This yields l − 1 additional relations (since the νi are constants characterizing the orbit of F , not to be counted as dynamical variables). Note that this gives only l−1 constraints, and not l, because we have N α = νi = Tr F . This condition also accounts for the fact that the elliptic function rN −1 (λ) is constant, since it cannot have a single pole of order 1. Finally, the number of independent parameters is equal to N l − l(l − 1)/2 − (l − 1) which is exactly half the dimension of the reduced phase space. We now compute the genus of the spectral curve Γ.
7 The Calogero–Moser model
Proposition. For generic values of the action variables the genus of the spectral curve is given by:
$$ g = Nl - \frac{l(l+1)}{2} + 1 \qquad (7.34) $$
Proof. The idea of the proof is the same as in Chapter 5 and uses the Riemann–Hurwitz theorem. There is a difference, however, because here the base curve is of genus 1. Equation (7.26) presents the compact Riemann surface Γ as an N-sheeted branched covering of the base curve of the variable λ. The sheets are the N roots in µ. By the Riemann–Hurwitz formula we have 2g − 2 = N(2g₀ − 2) + ν, where g₀ is the genus of the base curve, g₀ = 1, and ν is the number of branch points, i.e. the number of values of λ for which Γ(λ, µ) has a double root in µ. This is the number of zeroes of ∂_µ Γ(λ, µ) on the surface Γ(λ, µ) = 0. But ∂_µ Γ(λ, µ) is a meromorphic function on the surface, hence it has as many zeroes as poles. The poles are located above λ = 0, or µ = ∞ which is the same, and are easy to count. Let P_i be the points of Γ lying on the different sheets over the point λ = 0. In the neighbourhood of P_i the function µ has the expansion µ_i = (α − ν_i)λ⁻¹ + h_i(λ). It follows that the function ∂Γ/∂µ in the neighbourhood of P_i has the form:
$$ \partial\Gamma/\partial\mu = \prod_{j\neq i}\big[(\nu_j - \nu_i)\lambda^{-1} - (h_j(\lambda) - h_i(\lambda))\big] $$
From this, we see that on each of the l sheets (λ, µ_i(λ)) (i = 1, …, l) we have one pole of order N − 1. On each of the N − l sheets (λ, µ_i(λ)) (i = l+1, …, N) we have one pole of order l. Finally, ν = l(N − 1) + (N − l)l. Inserting this value in the Riemann–Hurwitz formula yields the result.

The last two propositions show that the number of independent action variables is exactly the genus of the spectral curve. Among these, one is the total momentum associated with translation invariance and we will have to factorize by this symmetry.

7.6 The eigenvector bundle

As in Chapter 5, we consider at any point P = (λ, µ) of the spectral curve the unique eigenvector Ψ(0, P) of L(0, λ) with eigenvalue µ, normalized by ψ₁(0, P) = 1. We want to study the analyticity properties in λ of this eigenvector. A first difference with the case of rational Lax matrices is that it has an essential singularity at the points P_i above λ = 0.
Proposition. In the neighbourhood of the point P_i the component ψ_j(0, P) has the form:
$$ \psi_j(0, P) = \exp\big[\zeta(\lambda)(q_j(0) - q_1(0))\big]\,\big(c_j^{(i)} + O(\lambda)\big) \qquad (7.35) $$
where the c^{(i)} = (c_j^{(i)}) are the eigenvectors of the matrix F corresponding to the non-zero eigenvalues ν_i for i = 1, …, l:
$$ \sum_{j=1}^{N} f_{kj}\, c_j^{(i)} = \nu_i\, c_k^{(i)} $$
while for i > l the c^{(i)} form a basis of the kernel of F.

Proof. From equation (7.30), we have Ψ(0, P) = G(0, λ)Ψ̃(0, P), where Ψ̃(0, P) is an eigenvector of L̃(0, λ). Using eq. (7.31), we have Ψ̃(0, P) = c^{(i)} + O(λ), where c^{(i)} is an eigenvector of F. More precisely, for i = 1, …, l, c^{(i)} is the unique eigenvector of F with eigenvalue ν_i ≠ 0 normalized by c₁^{(i)} = 1, while for i > l it is determined by the limit for P → P_i of the normalized eigenvector of L̃(P) corresponding to the eigenvalue µ_i(P). The degeneracy has been lifted by higher order terms in λ. Therefore we have ψ_j(0, P) = (c_j^{(i)} + O(λ)) exp(ζ(λ)q_j(0)). Normalizing ψ₁(0, P) = 1 yields the result.

We can now compute the number of poles of Ψ on Γ. The result is slightly different from the case of rational Lax matrices, where it was found to be g + N − 1.

Proposition. The number of poles of Ψ(0, P) is:
$$ m = Nl - \frac{l(l+1)}{2} = g - 1 \qquad (7.36) $$
Proof. As in the case of rational Lax matrices, let us introduce the function W(λ) of the complex variable λ defined by:
$$ W(\lambda) = \big(\det|\psi_i(M_j)|\big)^2 $$
where the M_j are the N points above λ. It is well-defined on the base curve since the squared determinant does not depend on the order of the M_j. This function has an essential singularity of the form e^{2ζ(λ)Σ_i(q_i(0) − q_1(0))} at λ = 0. This does not affect the property that the number of poles of W(λ) is equal to the number of its zeroes. This property is obtained by considering the sum of residues of W′/W on the λ-torus, and noting that
ζ′(λ) = −℘(λ) = −1/λ² + O(λ²) is elliptic and has no residue. Clearly W(λ) has a double pole where there exists a point P above λ at which Ψ(P) has a simple pole. As in the rational case, we show that W(λ) has a simple zero for values of λ corresponding to a branch point of the covering, hence m = ν/2 = g − 1, by Riemann–Hurwitz. Here lies the difference with the rational case, because g₀ = 1, see Chapter 5.

This result looks different from what we got in the case of rational Lax matrices. There, we had g + N − 1 poles for a meromorphic eigenvector at time t = 0. However, N − 1 poles were located above λ = ∞ and were not dynamical. Here all poles are dynamical and we have g − 1 of them. This is a surprising result. In fact, from eq. (7.35), we see that the components of the eigenvector ψ_i(0, P) at time t = 0 are Baker–Akhiezer functions. But generically, such a function has at least g poles, not g − 1. So, we have to admit that we are not in a generic situation. It was noted in Chapter 3 that when the genus g of the base curve is greater than or equal to 1, the consistency equations of the Lax pair become overdetermined, preventing genericity of the Lax matrix. In fact, this will be a crucial ingredient in the solution of the Calogero–Moser model.

7.7 Time evolution

The next step is to compute the time evolution of the eigenvectors. We let the eigenvector evolve according to the natural equation:
$$ \frac{d\Psi}{dt} = M\Psi \qquad (7.37) $$
We choose as initial condition the eigenvector Ψ(0, P) normalized with its first component equal to 1. Of course, at subsequent times this normalization will not hold any more. However, we know that, in this setting, the poles of Ψ(t, P) do not evolve with time, see Chapter 5.

Proposition. The coordinates ψ_j(t, P) of the vector-function Ψ(t, P) are meromorphic functions on Γ except at the points P_i above λ = 0. Their poles γ₁, …, γ_{g−1} do not depend on t. In the neighbourhood of P_i they have the form:
$$ \psi_j(t, P) = c_j^{(i)}(t, \lambda)\, \exp\big[\zeta(\lambda)(q_j(t) - q_1(0)) + m_i(\lambda)\,t\big] \qquad (7.38) $$
where the c_j^{(i)}(t, λ) are regular functions of λ for λ → 0, c_j^{(i)}(t, λ) = c_j^{(i)}(t) + O(λ). Here the c^{(i)}(t) are eigenvectors of the matrix F(t) = (f_{ij}(t)) corresponding to the eigenvalues ν_i, and:
$$ m_i(\lambda) = (-\alpha + \nu_i)\lambda^{-2} - h_i(0)\lambda^{-1} \qquad (7.39) $$
is the singular part of −λ⁻¹µ_i(λ) at λ = 0, see eq. (7.29).

Proof. Let us consider the vector Ψ̃(t, P) defined as:
$$ \Psi(t, P) = G(t, \lambda)\,\tilde\Psi(t, P) \qquad (7.40) $$
where G(t, λ) is defined in eq. (7.30), and let Ψ(t, P) evolve according to eq. (7.37). The vector Ψ̃(t, P) is an eigenvector of the matrix L̃(t, λ) and evolves according to the equation:
$$ (\partial_t - \tilde M(t, \lambda))\,\tilde\Psi(t, P) = 0, \qquad \tilde M = -G^{-1}\partial_t G + G^{-1} M G \qquad (7.41) $$
From eqs. (7.12, 7.13) it follows that:
$$ \tilde M_{ij} = -\delta_{ij}\,\zeta(\lambda)\dot q_i - (1 - \delta_{ij})\, f_{ij}\, \frac{\sigma(\lambda - q_{ij})}{\sigma(\lambda)\,\sigma(q_{ij})}\,\big[\zeta(q_{ij} - \lambda) - \zeta(q_{ij}) + \zeta(\lambda)\big] $$
so that, collecting the coefficient of the 1/λ term, we can write:
$$ \tilde M(t, \lambda) = -\lambda^{-1}\tilde L(t, \lambda) + O(\lambda^0) \qquad (7.42) $$
Hence around P_i we have:
$$ \partial_t\tilde\Psi(t, \lambda) = \Big[-\frac{1}{\lambda}\,\mu_i(\lambda) + O(1)\Big]\,\tilde\Psi(t, P) \qquad (7.43) $$
The quantity:
$$ m_i(\lambda) = -\big(\lambda^{-1}\mu_i(\lambda)\big)_- = (-\alpha + \nu_i)\lambda^{-2} - h_i(0)\lambda^{-1} $$
is independent of time because so is µ_i(λ). Integrating eq. (7.43), multiplying by G(t, λ) and normalizing ψ₁(0, P) = 1, we get the result.

7.8 Reconstruction formulae

We now reconstruct the original dynamical variables in terms of the Riemann surface Γ and the poles of the eigenvectors. From eq. (7.38) we see that the components ψ_i(t, P) of the eigenvector are Baker–Akhiezer functions. Their behaviour above λ = 0 is:
$$ \psi_i(P) = \exp\big[(q_i(t) - q_1(0))\lambda^{-1} + m_j(\lambda)t\big]\,\big(c_i^{(j)}(t) + O(\lambda)\big), \qquad P \to P_j \qquad (7.44) $$
As we already mentioned, there is a paradox with the number of poles of the Baker–Akhiezer function. Generically, such a function has g poles.
The function then exists and is unique up to normalization. In particular, its zeroes are completely determined. One way to construct a Baker– Akhiezer function with g − 1 poles is to let one of the zeroes of the generic function cancel one of its poles. Clearly, this gives a relation between the parameters defining the function, specifically the qi (t) defining the essential singularity and the moduli of the curve. Let ψi (t, P ) be the Baker–Akhiezer function with the singularities eq. (7.44) at the points Pj above λ = 0, and a divisor of g poles (γ0 , γ1 , . . . , γg−1 ). We will denote by (η0 , η1 , . . . , ηg−1 ) the divisor of its zeroes. By the general formula eq. (5.48) in Chapter 5, we have:
$$ \psi_i(t, P) = d_i(t)\, e^{(q_i(t) - q_1(0))\int^P \Omega_1 + t\int^P \Omega_2}\; \frac{\theta\big(A(P) - U_1(q_i(t) - q_1(0)) - U_2 t - \zeta\big)}{\theta\big(A(P) - \zeta\big)} $$
where A(P) is the Abel map, and Ω₁ and Ω₂ are normalized second kind Abelian differentials with singularities dλ⁻¹ and dm_j(λ) at the points P_j above λ = 0 respectively. Note that the differentials are independent of the index i of the component considered, and so are the vectors U₁ and U₂ of their b-periods. The functions d_i(t) are arbitrary normalizations. According to Riemann's theorem, the divisors of the poles and zeroes of ψ_i(t, P) satisfy:
$$ A(\eta_0) + \sum_{i=1}^{g-1} A(\eta_i) - U_1(q_i(t) - q_1(0)) - U_2 t - \zeta = -K $$
$$ A(\gamma_0) + \sum_{i=1}^{g-1} A(\gamma_i) - \zeta = -K $$
Let us assume that the zero η₀ coincides with the pole γ₀. We then get the condition:
$$ U_1 q_i(t) + U_2 t + V = -K + \sum_{i=1}^{g-1} A(\eta_i) \qquad (7.45) $$
where we have denoted V = −K − U₁q₁(0) + Σ_{i=1}^{g−1} A(γ_i), a vector in Jac(Γ) independent of time. Now the expression in the right-hand side of this equation belongs to the zero divisor of the theta-function (see Chapter 15). This is the constraint we were looking for. As a consequence we have:

Proposition. The quantities q_i(t) satisfying the equation
$$ \theta(U_1 q_i(t) + U_2 t + V) = 0 \qquad (7.46) $$
are the solutions of the equations of motion of the spin Calogero–Moser model. Moreover, the spin variables are reconstructed with the matrix C of eigenvectors of F = CνC⁻¹ with:
$$ c_i^{(j)}(t) = e^{\alpha_j q_i(t)}\,\theta(U_1 q_i(t) + U_2 t + V_j) \qquad (7.47) $$
where V_j is a constant vector.

Proof. Equation (7.46) expresses the fact that the left-hand side of eq. (7.45) belongs to the zero divisor of the theta-function. Note that all q_i(t) satisfy the same equation, which describes, for all values of time, the intersection of a straight line in the Jacobian torus with the theta divisor. To proceed to the reconstruction of the spin variables, it is enough to compute the matrix of eigenvectors C. As we have shown, the columns of C are the vectors c^{(j)} whose components are given by the limit when λ → 0 of the following expression, taken for P = (λ, µ_j(λ)):
$$ c_i^{(j)}(t) = \lim_{\lambda\to 0}\, d_i(t)\, e^{(q_i(t) - q_1(0))\left(\int^P \Omega_1 - \frac{1}{\lambda}\right) + t\left(\int^P \Omega_2 - m_j(\lambda)\right)} \times \frac{\theta\big(A(P) - U_1(q_i(t) - q_1(0)) - U_2 t - \zeta\big)}{\theta\big(A(P) - \zeta\big)} $$
The limits α_j = lim_{λ→0}(∫^P Ω₁ − 1/λ) and β_j = lim_{λ→0}(∫^P Ω₂ − m_j(λ)) are well-defined and depend on the point P_j. Note that if c_i^{(j)} → d_i c_i^{(j)} we have f_{ij} → d_i f_{ij} d_j⁻¹, and we know that we must quotient by this diagonal action. Moreover, if we change the normalization of the vector c^{(j)}, i.e. c_i^{(j)} → d_j c_i^{(j)}, the matrix F is invariant. Hence we can drop all factors that can be absorbed in these invariances, and we can choose c_i^{(j)} as in eq. (7.47) with the constant vector V_j = ζ − U₁q₁(0) − A(P_j).
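The whole reconstruction rests on the quasi-periodicity of the theta function on the Jacobian torus. In the genus 1 case this can be checked numerically with the Jacobi theta function θ₃ (a sketch using mpmath; the book's θ is its g-dimensional generalization):

```python
from mpmath import jtheta, exp, pi, mpc, fabs

tau = mpc(0, 1)              # modular parameter, Im(tau) > 0
q = exp(1j*pi*tau)           # nome
z = mpc(0.3, 0.1)

# Periodicity: theta_3(z + pi) = theta_3(z)
assert fabs(jtheta(3, z + pi, q) - jtheta(3, z, q)) < 1e-12

# Quasi-periodicity: theta_3(z + pi*tau) = q^(-1) exp(-2iz) theta_3(z);
# exponential factors of this type are absorbed by the normalizations d_i(t).
lhs = jtheta(3, z + pi*tau, q)
rhs = q**-1 * exp(-2j*z) * jtheta(3, z, q)
assert fabs(lhs - rhs) < 1e-12
```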
7.9 Symplectic structure

To compute the symplectic form we need to consider the inverse Ψ̂⁻¹(λ) of the matrix Ψ̂(λ) whose columns are the eigenvectors at the N points above λ. The matrix Ψ̂⁻¹(λ) is built by considering the adjoint system Ψ⁺(P)(L(λ) − µ) = 0. The row vector Ψ⁺(P) is reconstructed from its analytical properties, exactly like Ψ(P). The rows of the matrix Ψ̂⁻¹(λ) are the values of the row vector:
$$ \Psi^{(-1)}(P) \equiv \frac{\Psi^+(P)}{\Psi^+(P)\,\Psi(P)} $$
at the N points P_j above λ. As before, Ψ⁺(P)Ψ(P) is a meromorphic function on Γ with zeroes at the branch points of the covering µ → λ. Note that this function is regular at λ = 0; in particular the essential singularities cancel. It follows that Ψ^{(−1)}(P) has poles at the branch points, and zeroes at the divisor of poles of Ψ. We remarked already several times that the components of the eigenvector can be written as ratios of suitable minors of the matrix L(λ) − µ. In particular, the poles of Ψ(P) are obtained as the zeroes of the first minor. Note that in this minor, the variables q₁, p₁ have disappeared. This corresponds to a reduction of the system by translational symmetry, which leaves a phase space of dimension 2(g − 1), hence the g − 1 poles of Ψ(P). In particular one can choose the origin such that at initial time q₁(0) = 0.

Proposition. The symplectic form of the spin Calogero–Moser model is:
$$ \omega = \sum_{i=1}^{g-1} d\lambda_{\gamma_i} \wedge d\mu_{\gamma_i} = \sum_{i=1}^{N} dp_i \wedge dq_i - \omega_K \qquad (7.48) $$
where (λ_{γ_i}, µ_{γ_i}) are the coordinates of the dynamical divisor, and ω_K is the Kirillov symplectic form on the orbit defined by F = CνC⁻¹, i.e. ω_K = Tr(νC⁻¹δC ∧ C⁻¹δC).

Proof. As usual, we consider the form K = K₁ + K₂ + K₃ with:
$$ K_1 = \Psi^{(-1)}(P)\,\delta L(\lambda) \wedge \delta\Psi(P)\, d\lambda $$
$$ K_2 = \Psi^{(-1)}(P)\,\delta\mu \wedge \delta\Psi(P)\, d\lambda $$
$$ K_3 = \delta\big(\log \partial_\mu\Gamma\big) \wedge \delta\mu\, d\lambda $$
and write that the sum of its residues vanishes to prove the equality of the two expressions for ω. The poles of K are the poles of Ψ(P), {γ_i = (λ_{γ_i}, µ_{γ_i}), i = 1, …, g−1}, the branch points, and the points P_j above λ = 0. At the points γ_i, the computation is exactly similar to that in Chapter 5, yielding a sum of residues 2Σ_{i=1}^{g−1} δµ_{γ_i} ∧ δλ_{γ_i} coming from equal contributions of K₁ and K₂. At the branch points the residues from K₁, K₂, K₃ cancel, as in the rational case. The new features occur above λ = 0. It is easy to see that K₃ is regular above λ = 0 because δµ is regular at these points, since the coefficients of the polar part of the µ_j are non-dynamical. The sum of residues of K₁ and K₂ at the P_j can be written in matrix form as the residue of forms on the base
as follows:
$$ E_1 \equiv \sum_j \mathrm{Res}_{P_j} K_1 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\Big(\hat\Psi^{-1}\,\delta L(\lambda) \wedge \delta\hat\Psi\Big)\, d\lambda $$
$$ E_2 \equiv \sum_j \mathrm{Res}_{P_j} K_2 = -\mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\Big(\delta\hat\Psi\,\hat\mu \wedge \delta\hat\Psi^{-1}\Big)\, d\lambda $$
where µ̂ is the diagonal matrix of the µ_j(λ). Since this is a local computation, we first extract the essential singularities by setting Ψ̂ = GΨ̃ and L = GL̃G⁻¹ with G given in eq. (7.30). We get:
$$ E_1 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\Big(\big([G^{-1}\delta G, \tilde L] + \delta\tilde L\big) \wedge \big(G^{-1}\delta G + \delta\tilde\Psi\,\tilde\Psi^{-1}\big)\Big)\, d\lambda $$
$$ E_2 = -\mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\Big(G^{-1}\delta G \wedge \tilde\Psi\,\delta\hat\mu\,\tilde\Psi^{-1} + \delta\tilde\Psi \wedge \delta\hat\mu\,\tilde\Psi^{-1}\Big)\, d\lambda $$
Note that Tr([G⁻¹δG, L̃] ∧ G⁻¹δG) = 0 because G is diagonal, and Res_{λ=0} δΨ̃ ∧ δµ̂ Ψ̃⁻¹ = 0 since δµ̂ is regular at λ = 0. Collecting the remaining terms, we have:
$$ E_1 + E_2 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\Big(\delta\tilde L \wedge G^{-1}\delta G + \delta\tilde L \wedge \delta\tilde\Psi\,\tilde\Psi^{-1} + [G^{-1}\delta G, \tilde L] \wedge \delta\tilde\Psi\,\tilde\Psi^{-1} - G^{-1}\delta G \wedge \tilde\Psi\,\delta\hat\mu\,\tilde\Psi^{-1}\Big)\, d\lambda $$
Using L̃Ψ̃ = Ψ̃µ̂ and Ψ̃⁻¹L̃ = µ̂Ψ̃⁻¹, we get:
$$ \mathrm{Tr}\Big([G^{-1}\delta G, \tilde L] \wedge \delta\tilde\Psi\,\tilde\Psi^{-1}\Big) = \mathrm{Tr}\Big(G^{-1}\delta G \wedge (\tilde L\,\delta\tilde\Psi - \delta\tilde\Psi\,\hat\mu)\,\tilde\Psi^{-1}\Big) = \mathrm{Tr}\Big(G^{-1}\delta G \wedge (\tilde\Psi\,\delta\hat\mu - \delta\tilde L\,\tilde\Psi)\,\tilde\Psi^{-1}\Big) $$
where in the last step we have used L̃δΨ̃ − δΨ̃µ̂ = Ψ̃δµ̂ − δL̃Ψ̃, obtained by varying L̃Ψ̃ = Ψ̃µ̂. Hence our expression simplifies to:
$$ E_1 + E_2 = \mathrm{Res}_{\lambda=0}\, \mathrm{Tr}\Big(2\,\delta\tilde L \wedge G^{-1}\delta G + \delta\tilde L \wedge \delta\tilde\Psi\,\tilde\Psi^{-1}\Big)\, d\lambda $$
Note that (G⁻¹δG)_{ij} = λ⁻¹δq_i δ_{ij} + O(1), so that the first term contributes a residue 2Σ_i δp_i ∧ δq_i. For the second term we note that Ψ̃ = C + O(λ), in the reduced system where q₁(0) = 0, and L̃ = −λ⁻¹(F − α) + O(1), so that the residue is −Tr(δF ∧ δC C⁻¹) since δα = 0. Remembering that F = CνC⁻¹, this reads −2Tr(νC⁻¹δC ∧ C⁻¹δC). We recognize the Kostant–Kirillov form on the coadjoint orbit of ν, see Chapter 3. Writing that the sum of residues vanishes proves the proposition.
This result shows that the λ_{γ_i}, µ_{γ_i} are canonically conjugate variables. Since the (λ_{γ_i}, µ_{γ_i}) belong to the spectral curve, they form a set of separated variables.

7.10 Poles systems and double-Bloch condition

In this section we present a natural construction of the Lax pair for the spin Calogero–Moser model by relating it to the matrix KP equation. Before delving into this particular example it is illuminating to present the general context of this sort of connection. Let us consider a linear differential operator D in two variables x and t (there could be more than one time) and consider the differential equation DΨ(x, t) = 0. We will consider as an example the KP operator D = ∂_t − ∂_x² + u(x, t), see eq. (10.10) in Chapter 10. Assume, moreover, that the coefficients of D are doubly periodic meromorphic functions of the complex variable x with periods 2ω_i, i = 1, 2. We require nothing concerning the t-dependence. In general we know that for generic simply periodic potentials we can find a solution of the equation DΨ(x, t) = 0 which is quasi-periodic, i.e. Ψ(x + 2ω₁, t) = B₁Ψ(x, t). Such solutions are called Floquet or Bloch solutions. In the case of elliptic potentials, since we have a second period, it is natural to require that Ψ be double Bloch, i.e.
$$ \Psi(x + 2\omega_i, t) = B_i\,\Psi(x, t), \qquad i = 1, 2 $$
In contrast to the one-period situation, it turns out that this is in general impossible. Nevertheless, one can find double Bloch solutions for very special potentials, from which the Calogero–Moser model will automatically spring out. To understand the restrictions coming from the double Bloch condition, let us assume that the function x → Ψ(x, t) is meromorphic with N poles at positions q_i on the torus with periods 2ω_i. Applying the Riemann–Roch theorem with g = 1, one sees that such functions form a vector space of dimension N. Indeed, for any two such functions Ψ₁, Ψ₂ with the same Bloch multipliers, the quotient Ψ₂/Ψ₁ is a meromorphic function on the torus with N poles (since Ψ₁ has the same number of zeroes and poles, because Ψ₁′/Ψ₁ is elliptic), hence lives in a space of dimension N − g + 1 = N. The existence of such functions comes from their explicit construction using the Lamé function. Take any sum of the form:
$$ \Psi(x, t, z) = \sum_{i=1}^{N} c_i(t, z)\,\Phi(x - q_i, z)\, e^{kx} $$
where Φ(q, z) is the Lamé function defined in eq. (7.9). Recall that we have Φ(x + 2ω_i, z) = T_i(z)Φ(x, z), with T_i(z) = exp(2ω_iζ(z) − 2η_iz) (see Chapter 15), so that Ψ(x + 2ω_i, t, z) = B_iΨ(x, t, z) with B_i = T_i(z)exp(2kω_i). Given the two Bloch multipliers B_i, we can adjust k and z to achieve these values. So we have found an explicit basis of the N-dimensional vector space of double Bloch functions. If we now require that this function Ψ obeys the equation DΨ = 0, we impose in fact more than N conditions. This is because DΨ is a double Bloch function with the same multipliers, but differentiations and multiplication by potentials can only increase the degree of its divisor of poles. Hence DΨ lives in a space of dimension greater than N, and its vanishing requires more than N linear conditions on the N coefficients c_i. This means that for a general operator D with elliptic coefficients there are no double Bloch solutions. On the other hand, given a Riemann surface Γ of genus g, one can construct Baker–Akhiezer functions Ψ on it. It is well known that such Baker–Akhiezer functions satisfy differential equations of the form DΨ = 0 for some specific operator D; see the example below in eq. (7.49). A generic Baker–Akhiezer function depends on many parameters. It is defined first by the choice of the Riemann surface Γ, which depends on 3g − 3 complex moduli, and second by the choice of punctures P_α, local parameters w_α around P_α, and singular parts of order n_α at P_α. Only the first n_α coefficients in the expansion in w_α are relevant to the definition of the singular part. Altogether this produces a total of (3g − 3) + Σ_α(1 + n_α) parameters. In the case of one puncture, the Baker–Akhiezer function is of the generic form:
$$ \Psi(P, t) = e^{\sum_i t_i \int^P \Omega^{(i)}}\; \frac{\theta\big(A(P) + \sum_i U^{(i)} t_i + \zeta\big)}{\theta\big(A(P) + \zeta\big)} $$
where A(P) is the Abel map.
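The parameter counting just made can be compared with the 2g double Bloch conditions discussed next. A rough bookkeeping sketch (our addition; it ignores non-generic degenerations), with one puncture and singular parts of order n_α = 2, as appropriate for the two times x and t:

```python
# Parameters of a Baker-Akhiezer function: (3g - 3) moduli of the curve
# plus (1 + n_alpha) data per puncture.
def ba_parameters(g, orders):
    return (3*g - 3) + sum(1 + n for n in orders)

for g in range(2, 10):
    assert ba_parameters(g, [2]) == 3*g    # one puncture of order 2
    assert ba_parameters(g, [2]) >= 2*g    # enough room for double Bloch
```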
We assume that x is t₁, but we could take for x any combination x = Σ_i α_i t_i of the elementary times t_i of the hierarchy. The condition for such a function Ψ to be double Bloch is that 2ω_iU₁ belongs to the lattice of periods of Jac(Γ). This means 2g conditions on the parameters of the Baker–Akhiezer functions. The dimension of the parameter space of Baker–Akhiezer functions is large enough to accommodate the 2g double Bloch conditions. This provides families of differential operators possessing double Bloch solutions. In this case the overdetermined linear system on the coefficients c_i(t, z) becomes compatible. The compatibility conditions eventually take the form of a Lax equation. Let us apply this strategy to a simple example: we consider a smooth Riemann surface of genus g and l punctures P_β, β = 1, …, l, with local
parameters w_β(P) around P_β (w_β(P_β) = 0). Fix a divisor of degree g + l − 1 in general position. Then there exists a unique Baker–Akhiezer function ψ_α having poles at this divisor and behaving in the neighbourhood of each P_β as:
$$ \psi_\alpha(x, t, P) = e^{w_\beta^{-1}x + w_\beta^{-2}t}\Big(\delta_{\alpha\beta} + \sum_{s=1}^{\infty} \xi_s^{\alpha\beta}(x, t)\, w_\beta^s\Big) $$
In fact, the degree of the divisor of poles being g + l − 1, we have a vector space of functions of dimension l, and we impose a system of l linear inhomogeneous normalization conditions.

Proposition. Let |Ψ(x, t, P)⟩ be the vector with l components ψ_α(x, t, P). It satisfies the equation:
$$ \big(\partial_t - \partial_x^2 + u(x, t)\big)\,|\Psi(x, t, P)\rangle = 0 \qquad (7.49) $$
where the l × l matrix u is given by u_{αβ}(x, t) = 2∂_xξ₁^{αβ}(x, t). Such potentials are called finite-zone potentials.

Proof. In the vicinity of each puncture P_β one can write:
$$ (\partial_t - \partial_x^2)\,\psi_\alpha(x, t, P) = e^{w_\beta^{-1}x + w_\beta^{-2}t}\,\big({-2}\,\partial_x\xi_1^{\alpha\beta} + O(w_\beta)\big) $$
Since the left-hand side is meromorphic except at the P_j, has the same g + l − 1 poles as Ψ, and has an appropriate essential singularity at each puncture, it can be expanded on the ψ_β, so that one can write (∂_t − ∂_x²)ψ_α = −Σ_β u_{αβ}ψ_β, for some u_{αβ}(x, t) independent of P ∈ Γ. Comparing with the right-hand side around P_β, we find u_{αβ} = 2∂_xξ₁^{αβ}.

We now express the condition that the potential u is elliptic and obtain its precise form.

Proposition. If the finite-zone potential u is elliptic, it has necessarily the form:
$$ u(x, t) = \sum_{i=1}^{N} \rho_i(t)\,\wp(x - q_i(t)) $$
where ρ_i(t) is an l × l matrix of rank 1 of the form ρ_i = |a_i⟩⟨b_i|, with |a_i⟩ an l-vector and ⟨b_i| an l-covector.

Proof. We need the explicit form of |Ψ⟩ as a Baker–Akhiezer function on some curve Γ. Let P₁, …, P_l be the punctures and γ₁, …, γ_{g+l−1} be the poles of the Baker–Akhiezer function. There exists a unique meromorphic function h_α(P) with poles at the γ_i and such that h_α(P_β) = δ_{αβ}. In particular it has l − 1 zeroes at the P_β, β ≠ α, and g other zeroes at a
divisor D_α. Applying Abel's theorem, it follows that Z₀ + K ≡ A(P_α) − A(D_α) = Σ_β A(P_β) − Σ_i A(γ_i) is independent of α. The Baker–Akhiezer function ψ_α(P) reads:
$$ \psi_\alpha(P) = h_\alpha(P)\; \frac{\theta\big(A(P) + U_1 x + U_2 t + Z_0 - A(P_\alpha)\big)\,\theta(Z_0)}{\theta\big(A(P) + Z_0 - A(P_\alpha)\big)\,\theta\big(U_1 x + U_2 t + Z_0\big)}\; e^{x\int^P \Omega_1 + t\int^P \Omega_2} $$
aiα (t)σi (t, P ) + O(1) x − qi (t)
(7.50)
Here aiα (t) is the normalization constant independent of P . The potential uαβ (x, t) is obtained by computing the expansion of ψα (x, t, P ) around Pβ . Writing the expansion of σi (t, P ) as σi (t, P ) = − 12 (wβ bβi + O(wβ2 )) exp (wβ−1 qi (t) + wβ−2 t), we find around x = qi and P = Pβ : ξ1αβ (x, t) =
aiα (t)bβi (t) + O(1) x − qi (t)
Finally, uαβ = 2∂x ξ1αβ has double poles at x = qi with coefficient ραβ i = β aiα (t)bi (t). If we now impose that u is elliptic, the double pole gives rise to a Weierstrass function ℘(x − qi (t)) with a matrix coefficient ρi of rank 1.
230
7 The Calogero–Moser model
We now require that |Ψ is double Bloch in x, hence can also be written as: N µ µ2 |Ψ = |si (t, λ, µ) Φ(x − qi (t), λ)e− 2 x+ 4 (7.51) i=1
In this formula λ and µ are free parameters, but the double Bloch condition will turn out to specify the Riemann surface on which (λ, µ) are coordinates. Moreover, µ will become infinite at the punctures, and λ will be the local parameter required to define the essential singularity of the Baker–Akhiezer function. Finally |si , in view of eq. (7.50), is proportional to |ai (t) . We denote by ψi (t, λ, µ) the proportionality coefficient |si = ψi (t, λ, µ)|ai (t) (do not confuse ψi (t, λ, µ), i = 1, . . . , N with ψα (x, t, P ), α = 1, . . . , l). Proposition. The equation N ∂t − ∂x2 + ρi (t)℘(x − qi (t)) |Ψ = 0
(7.52)
i=1
has solutions |Ψ of the form |Ψ =
N
µ
ψi (t, λ, µ)Φ(x − qi (t), λ)e− 2 x+
µ2 t 4
|ai (t)
(7.53)
i=1
if and only if qi (t) and the quantities fij = bi |aj satisfy the equations of motion of the spin Calogero–Moser system, eqs. (7.4), and the constraints fii = 2. Proof. Inserting equation (7.53) into equation (7.52), we find the condition: N d(ψi |ai ) Φ(x − qi , λ) E ≡ − (q˙i − µ)Φ (x − qi , λ)ψi |ai dt i=1 N − Φ (x − qi , λ)ψi |ai + fji ℘(x − qj )Φ(x − qi , λ)ψi |aj = 0 j=1
where Φ = ∂x Φ and so on. The vanishing of the triple pole (x − qi )−3 gives the condition: bi |ai |ai = 2|ai (7.54) Using this condition and the Lam´e equation (7.10), we can identify the double pole (x − qi )−2 . Its vanishing gives the condition: (q˙i − µ)ψi |ai + fij Φ(qi − qj , λ)ψj |ai = 0 (7.55) j=i
7.10 Poles systems and double-Bloch condition
231
We finally identify the residue of the simple pole and obtain the condition: d fji ℘(qi −qj )ψi |aj + fij Φ (qi −qj , λ)ψj |ai = 0 − ℘(λ) ψi |ai + dt j=i
j=i
(7.56) Inserting back eqs. (7.54, 7.55, 7.56) into the expression of E one sees that E vanishes identically due to the functional equation (7.14). Hence the vector function |Ψ given by eq. (7.53) satisfies eq. (7.52) if and only if the conditions (7.54, 7.55, 7.56) are fulfilled. From eq. (7.54) it follows that the constraints fii = 2 should hold. Equation (7.55) can then be rewritten as a matrix equation for the N dimensional vector Ψ = (ψi ) (not to be confused with the l-dimensional object |Ψ ): (L(t, λ) − µI)Ψ = 0 (7.57) where the matrix L(t, λ) is given by: Lij (t, λ) = q˙i δij + (1 − δij )fij Φ(qi − qj , λ)
(7.58)
We recognize the Lax matrix of the Calogero–Moser model, eq. (7.12), so that the N -vector Ψ identifies with the eigenvector considered in section (7.6). We can rewrite equation (7.56) as: fji ℘(qi − qj )|aj (7.59) |a˙ i = −Λi |ai − j=i
where we have defined: Λi =
ψj ψ˙ i − ℘(λ) + fij Φ (qi − qj , λ) ψi ψi j=i
But this last equation can be written: (∂t − ℘(λ)I − Λ − M )Ψ = 0
(7.60)
where Λ = Diag (Λi ) and the matrix M (t, λ) is given by: Mij (t, λ) = −(1 − δij )fij Φ (qi − qj , λ)
(7.61)
We recognize the second matrix M of the Lax pair (7.13). The compatibility condition of eq. (7.57, 7.60) reads L˙ = [M + Λ + ℘(λ)I, L]. Of course the term ℘(λ)I does not contribute to the commutator. Moreover, we can get rid of the diagonal matrix Λ by performing a conjugation by a diagonal matrix on L, which amounts to quotienting out the toral action. In
232
7 The Calogero–Moser model
this way we have exactly recovered the Lax pair of the Calogero–Moser model, eq. (7.11), hence the qi and fij have to satisfy the Calogero–Moser equations of motion, in order for |Ψ to be double Bloch. In the course of the proof, λ and µ have been identified as coordinates on the spectral curve of the Calogero–Moser model, i.e. are related by the equation det (L(λ) − µ) = 0. One can see that the punctures Pβ are l among the N points above λ = 0, and λ is a local parameter around each of them. The outcome of this analysis is that the double Bloch condition singles out very specific finite-zone potentials. It amounts to an overdetermined linear system on the coefficients of the expansion of |Ψ which is equivalent to the Lax equation. In particular, we have obtained in a simple and natural way the Lax matrices of the Calogero–Moser model in eqs. (7.58, 7.61). The method is clearly general and lends itself to extensions by Baker–Akhiezer functions with different patterns of essential singularities. Note that in our construction, we have considered only two “times”, x and t, which parametrize the singularity at each puncture. This provides a whole variety of integrable systems with spectral parameter lying on a genus 1 curve. In view of the counting of parameters in Lax equations in Chapter 3 this is a notable fact.
7.11 Hitchin systems A remarkable construction, due to Hitchin, provides integrable systems with spectral parameter lying on a curve of arbitrary genus. The Calogero–Moser model can also be seen as a particular case of this construction. Let Σ be a Riemann surface of genus g, and let G be a complex semisimple Lie group. Let A be the space of type (0, 1) fields on Σ, i.e. fields of z for some local coordinate system z, with values the form A = Az¯(z, z¯)d¯ in the Lie algebra of G. We define the “gauge group” G to be the space of maps from Σ to G, so that h ∈ G is a function h(z, z¯) with values in G. The gauge group acts on A as follows: ¯ A −→ Ah ≡ h−1 Ah + h−1 ∂h
(7.62)
Note that the differences A − A form a vector space compatible with the gauge group action, so that A can be seen as an affine space with group action. We call N = A/G the orbit space of A under G. A tangent vector at the point A ∈ A is of the form X = Xz¯(z, z¯)d¯ z with values in the Lie algebra of G. A covector Φ at the point A is of the form Φ = Φz (z, z¯)dz, the pairing between vectors and covectors at the
7.11 Hitchin systems
233
point A being given by: Tr (Φz Xz¯)dzd¯ z
(Φ, X) = Σ
The gauge group acts on vectors and covectors by adjoint action, X h = h−1 Xh, Φh = h−1 Φh, and this leaves the pairing invariant. The starting point of Hitchin construction is the cotangent bundle T ∗ A whose points are pairs (A, Φ), where Φ is a cotangent vector at A ∈ A. The canonical symplectic form on this space reads: ω= Tr (δΦz ∧ δAz¯)dzd¯ z Σ
Note that this symplectic form is invariant under the gauge group action, so we can perform a Hamiltonian reduction by this group. To do that we need to compute the moment µ of this action. In the case of a cotangent bundle it is shown in Chapter 14 that the Hamiltonian generating the infinitesimal group action is given by H ≡ (µ, ) = Tr (µz z¯ )dzd¯ z = α(X (A, Φ)), ∈ Lie(G) where α is the canonical 1-form α = Σ Tr (ΦδA) and X (A, Φ) is the infinitesimal variation of the point (A, Φ) under gauge group action, namely: ¯ + [A, ], X A = ∂¯A ≡ ∂
X Φ = [Φ, ]
One gets after an integration by parts: ∂Φz z =− z µz z¯dzd¯ + [Az¯, Φz ] dzd¯ ∂ z¯ ¯ +A∧Φ+Φ∧A µ = ∂¯A Φ ≡ ∂Φ The phase space P of the Hitchin system is obtained by choosing the moment equal to 0. The stability group of this moment is therefore the whole gauge group G, so that we have: P = µ−1 (0)/G Choosing µ = 0 means that ∂¯A Φ = 0, and this has a nice geometric interpretation. A cotangent vector at a point n ∈ N , which is the class of A ∈ A under the G action, may be viewed as a linear form on TA A vanishing on vectors tangent to the fibre, that is, such that (Φ, ∂¯A ) = 0 for all . By integration by parts, this condition is equivalent to ∂¯A Φ = 0. This
7 The Calogero–Moser model
interpretation being covariant under the gauge group action, it follows that P = T*N, where N is the orbit space (avoiding non-generic orbits to obtain a good manifold). By a theorem of Narasimhan and Seshadri, the space N is known to be isomorphic to the moduli space of (stable) holomorphic G-bundles on Σ, and this implies in particular that it is finite-dimensional.

Theorem. The phase space P = T*N is of finite dimension: for g > 1 and G a semi-simple Lie group, we have

dim P = 2 dim N = 2(g − 1) dim G

Proof. Let us sketch some ideas of the proof. The first step is to relate N to holomorphic G-bundles. Given A ∈ A and a sufficiently fine covering U_α of Σ, one solves, for some C^∞ functions h_α ∈ G defined in U_α, the equation

h_α^{-1} ∂̄h_α = A_α   (7.63)

where A_α ≡ A|_{U_α}. We define a principal G-bundle by the transition functions g_{αβ} = h_α h_β^{-1} on U_α ∩ U_β. The action of G on h_α reads h_α → h_α h, so that g_{αβ} is gauge invariant, and the G-bundle is really attached to a point of N. This bundle is holomorphic, ∂̄g_{αβ} = 0, because g_{αβ}^{-1} ∂̄g_{αβ} = h_β (A_α − A_β) h_β^{-1} = 0 since A_α = A_β on U_α ∩ U_β. Of course, h_α is defined up to left multiplication by a holomorphic function f_α, but this yields an equivalent presentation of the same bundle, with transition functions g′_{αβ} = f_α g_{αβ} f_β^{-1}.

Next we remark that the associated determinant bundle has vanishing Chern class. Viewing G as a group of matrices, define the determinant bundle as a line bundle whose transition functions are det g_{αβ}. By definition of h_α we have ∂̄ log det h_α = Tr A_α = 0 because we assume G semi-simple. So det h_α is in fact holomorphic. Then det g_{αβ} = det h_α / det h_β defines a trivial holomorphic line bundle, and its Chern class vanishes.

To compute the dimension of N, one computes the dimension of the cotangent space of N at a point n ∈ N. Take a representative element A ∈ A of n.
As we have seen, the cotangent space at n identifies with the space of forms Φ of type (1, 0) satisfying ∂̄_A Φ = 0. Consider the G-bundle attached to n, defined using some choice of functions h_α as in eq. (7.63). Define on each U_α the forms Φ̃_α = h_α Φ h_α^{-1}, with values in the Lie algebra of G. By construction we have ∂̄Φ̃_α = 0 and Φ̃_α = g_{αβ} Φ̃_β g_{αβ}^{-1}, i.e. the Φ̃_α define a global holomorphic section, with values in 1-forms on Σ, of the associated bundle Ad P, the bundle with fibres the Lie algebra of G and adjoint group action. This shows that the cotangent space to N at n identifies with T*_n N = H^0(Σ, κ ⊗ Ad P_n), where P_n is the principal G-bundle attached to n, and κ is the canonical bundle.
The Riemann–Roch theorem has an extension to vector bundles which reads in our case:

dim H^0(Σ, κ ⊗ Ad P) − dim H^0(Σ, Ad P) = (g − 1) dim G − c(det Ad P)

where the determinant bundle det Ad P has transition functions det g_{αβ}, and so has vanishing Chern class, c(det Ad P) = 0. We proceed to show that, generically, the vector bundle Ad P has no global holomorphic section, i.e. dim H^0(Σ, Ad P) = 0. Indeed, if {f_α}, with patching condition f_α = g_{αβ} f_β g_{αβ}^{-1}, defines such a section, then for any integer m the quantities Tr f_α^m define a global holomorphic function on Σ, hence a constant. This means that the eigenvalues of f_α are constants, independent of α, and one can write f_α = u_α Λ u_α^{-1} for some u_α holomorphic on U_α and a constant traceless diagonal matrix Λ. The patching condition reads [Λ, u_β^{-1} g_{αβ} u_α] = 0. This implies that the matrices u_β^{-1} g_{αβ} u_α are block diagonal (blocks correspond to coincident eigenvalues of Λ). But this would mean that the fibre bundle P is decomposable, since the transition functions u_β^{-1} g_{αβ} u_α provide an equivalent description of our bundle. We exclude this situation because it is not generic. Finally, the dimension of T*_n N is (g − 1) dim G, and dim T*N = 2(g − 1) dim G.

Remark. In the case of line bundles, we have dim H^0(Σ, Ad P) = 1, since we are here considering global holomorphic functions, and the same argument shows that dim T*_n N = g.
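The dimension count just proved is easy to tabulate. The following sketch (our own illustration, not from the text) grants the two facts established above, c(det Ad P) = 0 and, generically, dim H^0(Σ, Ad P) = 0:

```python
# Dimension of the Hitchin phase space P = T*N for g > 1 and semi-simple G,
# using dim T*_n N = dim H^0(Sigma, kappa x Ad P) = (g - 1) dim G.

def dim_moduli(genus, dim_g):
    assert genus > 1
    return (genus - 1) * dim_g

def dim_phase_space(genus, dim_g):
    return 2 * dim_moduli(genus, dim_g)

# Example: G = SL(2, C) (dim G = 3) over a genus-2 surface.
print(dim_phase_space(2, 3))  # -> 6
```

For G = SL(3, C) (dim G = 8) and g = 3 one finds dim P = 32, always an even number, as befits a symplectic manifold.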
It will be useful to have a heuristic picture of the moduli. First, it is known that all vector bundles on a non-compact Riemann surface are trivial. This implies that one can describe all vector bundles on a compact Riemann surface by using a covering with only two open sets U_0 and U_∞, where U_0 is a small disc around a point, say z = 0, and U_∞ is the Riemann surface with the point z = 0 removed. The bundles are then described by giving only one transition function g_{0∞} defined on the annulus U_0 − {z = 0}. We get an equivalent bundle by changing the transition function

g_{0∞}(z) → g′_{0∞}(z) = f_0(z) g_{0∞}(z) f_∞^{-1}(z)
where f_0 is analytic non-vanishing on U_0, and f_∞ is analytic non-vanishing on U_∞, i.e. on the whole of Σ − {z = 0}. The moduli space of vector bundles is the space of transition functions g_{0∞} modulo the above redefinitions. In the case of a line bundle, we can write quite generally

g_{0∞}(z) = z^k exp( Σ_{n=−∞}^{∞} a_n z^n )   (7.64)
where k is an integer because g0∞ (z) is single valued on the annulus.
Consider now on U_∞ the function f_∞(z) = exp( Σ_{n=g+1}^{∞} a_{−n} ϕ_{−n}(z) ), where ϕ_{−n}(z) is a meromorphic function on Σ such that around z = 0 we have ϕ_{−n}(z) = z^{−n} + O(z^{−g}), and ϕ_{−n}(z) is regular everywhere else. Notice that such a function exists and is unique for n ≥ g + 1. Using this f_∞, we can get rid of all the terms z^{−n}, n ≥ g + 1, in eq. (7.64). Then using f_0(z) = exp( − Σ_{n=0}^{∞} a_n z^n ), we can get rid of all the terms z^n, n ≥ 0, as well. Hence, we are left with

g_{0∞}(z) = z^k exp( Σ_{n=1}^{g} a_{−n} z^{−n} )   (7.65)
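The mode-killing argument above can be sketched concretely (an illustration of ours; `modes` is a hypothetical dictionary n ↦ a_n of Laurent exponents):

```python
# Reduction of a line-bundle transition function z^k exp(sum_n a_n z^n)
# to the normal form (7.65): f_0 removes all modes n >= 0, f_infinity
# removes all modes n <= -(g+1), so only a_{-1}, ..., a_{-g} survive
# (together with the integer k, which is untouched).

def normal_form(modes, genus):
    """modes: dict {n: a_n}; returns the surviving moduli."""
    return {n: a for n, a in modes.items() if -genus <= n <= -1}

modes = {-4: 0.3, -2: 1.5, -1: -0.7, 0: 2.0, 3: 0.1}
print(normal_form(modes, 2))  # -> {-2: 1.5, -1: -0.7}
```

On a genus-g surface exactly g continuous parameters survive, in agreement with the Picard variety description that follows.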
So the line bundles on Σ are holomorphically classified by an integer k, the Chern class of the bundle, and g continuous moduli, which describe in fact the Picard variety of Σ (the Jacobian for k = 0). In the case of higher rank vector bundles, things are much more complicated. If Σ is a sphere, we know by Riemann–Hilbert factorization that we can always decompose a matrix g_{0∞}(z) as

g_{0∞}(z) = f_0(z) λ(z) f_−(z)^{-1}   (7.66)

where λ(z) is a diagonal matrix with diagonal elements z^{k_i}, for some integers k_i, f_0 has an expansion in z, and f_− has an expansion in 1/z. This exactly means that vector bundles on the sphere are classified by the integers k_i and have no continuous moduli. In other words, on the sphere dim N = 0. The integers k_i are holomorphic invariants of the bundle. The Chern class of the corresponding determinant bundle is Σ_i k_i. In the case of a general Riemann surface Σ, we can still use Birkhoff's theorem in the small disc around z = 0 to write the transition function on the annulus in the form eq. (7.66). Note that f_−(z) is here only defined on the annulus and cannot in general be extended to Σ − {0}. However, one can hope that a mechanism similar to the case of line bundles allows us to get rid of the powers z^{−n} for n ≥ g + 1, so that we can write the transition function as in eq. (7.65), but with k a diagonal matrix with integer entries k_i, and a_{−1}, . . . , a_{−g} matrices. If all the integers k_i vanish, one can use the freedom of redefinition of the bundle by constant matrices f_0 and f_∞ to diagonalize a_{−1}. We can still quotient by conjugation by constant diagonal matrices while preserving this form. If g > 1, there remains a space of (g − 1) dim G parameters (in fact rank G + [(g − 1) dim G − rank G]) which can plausibly be taken as coordinates on an open set of the moduli space. If g = 1, however, we are left with

g_{0∞}(z) = exp(a_{−1}/z)   (7.67)
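In the abelian (scalar) case the Birkhoff factorization (7.66) reduces to splitting a Laurent series into its non-negative and negative parts, which can be checked directly (a sketch of ours, with made-up coefficients):

```python
# Scalar Birkhoff factorization: for g(z) = z^k exp(sum_n c_n z^n), take
# f_0 = exp(sum_{n>=0} c_n z^n), lambda(z) = z^k and
# f_-(z)^{-1} = exp(sum_{n<0} c_n z^n).
import cmath

def factor(k, modes):
    """modes: {n: c_n}. Returns (f0, lam, f_minus_inv) as callables."""
    plus = {n: c for n, c in modes.items() if n >= 0}
    minus = {n: c for n, c in modes.items() if n < 0}
    f0 = lambda z: cmath.exp(sum(c * z**n for n, c in plus.items()))
    lam = lambda z: z**k
    fmi = lambda z: cmath.exp(sum(c * z**n for n, c in minus.items()))
    return f0, lam, fmi

# Check that the factorization reproduces g on the annulus:
k, modes = 2, {-2: 0.1j, -1: 0.3, 0: -0.2, 1: 0.05}
f0, lam, fmi = factor(k, modes)
z = 0.8 + 0.4j
g = z**k * cmath.exp(sum(c * z**n for n, c in modes.items()))
assert abs(f0(z) * lam(z) * fmi(z) - g) < 1e-12
```

In the matrix case the two exponential factors no longer commute, which is precisely why Birkhoff's theorem is non-trivial there.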
where a_{−1} is diagonal, so that we have rank G parameters, which is known to be the correct dimension of the moduli space in that case. We will content ourselves with this interpretation in the following.

The construction of Hitchin integrable systems on P will now be done by defining a very simple set of Poisson commuting functions on T*A, invariant under the gauge group, and by reducing them to T*N. Let P(X) be an invariant polynomial on the Lie algebra of G, i.e. P(h^{-1} X h) = P(X). Recall that the ring of such invariant polynomials is freely generated (Chevalley's theorem) by homogeneous polynomials P_i, i = 1, . . . , rank G, of degrees m_i, the so-called exponents of the Lie algebra. These numbers are such that:

Σ_{i=1}^{rank G} (2 m_i − 1) = dim G
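This identity can be checked for the classical series (a sketch of ours, assuming the standard degrees of the generating invariants: 2, . . . , n+1 for A_n; 2, 4, . . . , 2n for B_n and C_n; 2, 4, . . . , 2n−2, n for D_n):

```python
# Check of sum_i (2 m_i - 1) = dim G for the classical Lie algebras.

def degrees(series, n):
    if series == "A":            # sl(n+1), dim = n(n+2)
        return list(range(2, n + 2)), n * (n + 2)
    if series in ("B", "C"):     # so(2n+1) / sp(2n), dim = n(2n+1)
        return [2 * k for k in range(1, n + 1)], n * (2 * n + 1)
    if series == "D":            # so(2n), dim = n(2n-1)
        return [2 * k for k in range(1, n)] + [n], n * (2 * n - 1)
    raise ValueError(series)

for series in "ABCD":
    for n in range(2, 8):
        ms, dim = degrees(series, n)
        assert sum(2 * m - 1 for m in ms) == dim
print("identity verified")
```

The same list of degrees reappears below: each P_i contributes (2m_i − 1)(g − 1) Hamiltonians, so the identity is what makes the total count come out to (g − 1) dim G.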
For any given invariant polynomial P of degree m, e.g. P(X) = Tr X^m, consider the function on phase space taking values in differentials of type (m, 0) (i.e. of the form ω(z, z̄) dz^m):

(A, Φ) ∈ T*A → P(Φ)

The differential P(Φ) is holomorphic, since ∂̄P(Φ) = m Tr(∂̄Φ Φ^{m−1}) = m Tr(∂̄_A Φ Φ^{m−1}) = 0, where we have used the cyclicity of the trace. Introducing a basis of holomorphic differentials of type (m, 0), say ω_j^{(m)}, we can write

P(Φ) = Σ_j H_{P,j}(Φ) ω_j^{(m)}
The functions H_{P,j} on phase space are G-invariant, and define the Hamiltonians of the Hitchin systems. Note that the basis of differentials ω_j^{(m)} does not contain dynamical variables, since the Riemann surface Σ is a fixed parameter of the construction.

Proposition. The functions H_{P_m,j} associated with the primitive polynomials P_m, m = 1, . . . , rank G, which generate the ring of invariant polynomials, are in involution. Their number is (g − 1) dim G, so they define a Hamiltonian integrable system on P = T*N.

Proof. The functions H_{P_m,j}, seen as functions on T*A, are in involution, {H_{P_m,j}, H_{P_n,k}} = 0, because they depend only on the momenta Φ. Since the polynomials P are G-invariant, the functions H_{P,j} are gauge invariant. They are thus well-defined on the symplectic quotient and in involution there, because one can compute directly their reduced Poisson brackets,
see Chapter 14. It remains to show that the number of the Hamiltonians is half the dimension of the phase space. Let us count them. We need the number of holomorphic differentials of type (m, 0). By the Riemann–Roch theorem,

dim H^0(Σ, κ^m) − dim H^0(Σ, κ^{1−m}) = c(κ^m) + 1 − g

where c(κ^m) = m(2g − 2) and dim H^0(Σ, κ^{1−m}) = 0 because 1 − m < 0, so that dim H^0(Σ, κ^m) = (2m − 1)(g − 1). The total number of independent Hamiltonians H_{P_i,j} is therefore:

Σ_{i=1}^{rank G} dim H^0(Σ, κ^{m_i}) = Σ_{i=1}^{rank G} (2 m_i − 1)(g − 1) = (g − 1) dim G
This is half the dimension of the phase space. This counting works for g > 1. For g = 0 there are no regular differentials on the sphere, so that dim H^0(Σ, κ^m) = 0. The Hitchin construction in this case yields no interesting system. For g = 1 one has dim H^0(Σ, κ^m) = 1, since this space is spanned by dz^m, so that we find Σ_{i=1}^{rank G} 1 = rank G Hamiltonians.

In genus 0 and 1 the Hitchin construction does not provide useful dynamical systems. This is a motivation to generalize it to Riemann surfaces with marked points. Let z_k ∈ Σ be N points on the Riemann surface Σ. To each of these points we associate an element u_k in the dual of the Lie algebra of G, which we shall identify with the Lie algebra itself using the invariant bilinear form. Instead of choosing the moment in the Hamiltonian reduction equal to zero, we now choose:

µ(A, Φ) ≡ ∂̄_A Φ = 2iπ Σ_k u_k δ_{z_k}   (7.68)
where δ_{z_k} is the Dirac measure at the point z_k, represented locally around z_k by δ(z − z_k) dz dz̄. The stability group of this momentum is the subgroup of gauge transformations leaving the u_k invariant: G_{z;u} = {h ∈ G s.t. h(z_k) ∈ G_k}, where G_k is the stabilizer of u_k. The reduced phase space is as usual:

P_{z;u} ≡ µ^{-1}( 2iπ Σ_k u_k δ_{z_k} ) / G_{z;u}

In other words, we have:

P_{z;u} ≡ {(A, Φ) | ∂̄_A Φ = 2iπ Σ_k u_k δ_{z_k}} / G_{z;u}   (7.69)
The equation ∂̄_A Φ = 2iπ Σ_k u_k δ_{z_k} specifies the behaviour of Φ around the marked points. Indeed, let us parametrize A locally around z_k as A = g_k^{-1} ∂̄g_k with g_k(z_k) = 1. Then the condition that (A, Φ) belongs to P_{z;u} translates locally into the condition ∂̄(g_k Φ g_k^{-1}) = 2iπ u_k δ_{z_k}, so that g_k Φ g_k^{-1} is holomorphic in an open set around z_k excluding z_k, and behaving locally as

g_k Φ g_k^{-1} = u_k dz/(z − z_k) + O(1)   (7.70)

(we used ∂̄( dz/(z − z_0) ) = 2iπ δ_{z_0}). The construction of the commuting Hamiltonians then works as above. For P_i an invariant polynomial of degree m_i on the Lie algebra of G, one considers the function

(A, Φ) ∈ P_{z;u} → P_i(Φ)   (7.71)
which take values in the space of meromorphic forms of type (m_i, 0) with poles at the points z_k of order at most m_i. These functions are in involution and define an integrable system.

7.12 Examples of Hitchin systems

Example 1. Let us illustrate this construction by considering the Riemann sphere with N marked points z_1, . . . , z_N and fixed parameters u_k in the dual of the Lie algebra of G. We take A of the form

A = h^{-1} ∂̄h   (7.72)
with h(z, z̄) ∈ G globally defined on the sphere, so that the attached principal bundle is trivial. The condition ∂̄_A Φ = 2iπ Σ_k u_k δ_{z_k} imposes that h Φ h^{-1} possesses a simple pole at z = z_k with residue h(z_k) u_k h^{-1}(z_k). Since there are no holomorphic differentials on the sphere, the only solution is:

Φ̃ ≡ h Φ h^{-1} = Σ_k ũ_k dz/(z − z_k),   with   ũ_k = h(z_k) u_k h^{-1}(z_k)   (7.73)

In contrast to eq. (7.70), h(z_k) may be different from one. Requiring that h Φ h^{-1} is regular at infinity yields:

Σ_k ũ_k = 0   (7.74)
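The role of the constraint (7.74) can be seen numerically: as a differential, Φ̃ = f(z) dz is regular at z = ∞ only if f(z) = O(1/z²), i.e. if z f(z) → 0. A scalar sketch of ours (the numbers below are made up):

```python
# Phi(z) = sum_k u_k/(z - z_k) is regular at infinity (as a differential)
# precisely when sum_k u_k = 0; otherwise z*Phi(z) tends to that sum.
zs = [0.0, 1.0, 2.5]
us = [1.0, -0.4, -0.6]   # sums to zero

def phi(z, residues):
    return sum(u / (z - zk) for u, zk in zip(residues, zs))

big = 1e8
print(abs(big * phi(big, us)))                 # -> ~0: no pole at infinity
print(abs(big * phi(big, [1.0, 2.0, 3.0])))    # -> ~6: simple pole at infinity
```

The second call, with residues summing to 6, shows the generic behaviour that (7.74) forbids.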
The data (A, Φ) are parametrized by h through the equations (7.72, 7.73), up to left multiplication h → lh with l holomorphic, hence constant,
and together with the constraint eq. (7.74). Notice that this constraint is invariant under h → lh. Gauge transformations act on h as h → hg, where g is such that g(z_k) ∈ G_k, the stability group of u_k. Note that Φ̃ and the ũ_k are invariant under this action. Quotienting by G_{z;u} allows us to gauge h away except at the marked points. At these points only a copy of G/G_k survives. This is equivalent to the orbit O_k of u_k under the coadjoint action of G. Noting that ũ_k describes the orbit O_k, we see that:

P_{z;u} = {(ũ_k) ∈ ×_k O_k s.t. Σ_k ũ_k = 0}/G   (7.75)

where G acts on the ũ_k by ũ_k → l ũ_k l^{-1} for all k. The reduced symplectic structure is the usual symplectic structure on coadjoint orbits. Indeed, consider the canonical 1-form on T*A, i.e. 2iπα = ∫_Σ Tr(Φ δA). Using the parametrization A = h^{-1} ∂̄h, one finds 2iπα = −∫_Σ Tr( (h^{-1}δh) ∂̄_A Φ ) which, using eq. (7.68), reduces to:

α = − Σ_k Tr(u_k h_k^{-1} δh_k)
We recognize that δα is the Kostant–Kirillov symplectic form on the product of coadjoint orbits ×_k O_k, which means that the Poisson brackets read:

{Tr(ε ũ_k), Tr(ε′ ũ_{k′})} = δ_{kk′} Tr([ε, ε′] ũ_k)

We still have to quotient by the left action of G. The infinitesimal action is given by δ_ε ũ_k = [ε, ũ_k] and is generated by the Hamiltonian H_ε = Σ_k Tr(ε ũ_k), so the moment is µ = Σ_k ũ_k, and eq. (7.74) shows that the phase space can be identified with the symplectic quotient P_{z;u} = µ^{-1}(0)/G. Choosing the invariant polynomial P_2(Φ) = Tr(Φ²) gives the Hamiltonian:

P_2(Φ) = Σ_{k,l} Tr(ũ_k ũ_l)/((z − z_k)(z − z_l)) = Σ_k [ Tr(ũ_k²)/(z − z_k)² + 2 H_k/(z − z_k) ]   (7.76)

with

H_k = Σ_{l≠k} Tr(ũ_k ũ_l)/(z_k − z_l)
The Hk form a family of commuting Hamiltonians usually called the Gaudin Hamiltonians, which are very closely related to the Neumann model.
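The pole decomposition behind (7.76) is a pure partial-fraction identity and can be checked numerically in a scalar toy model, replacing the matrix residues by numbers a_k so that Tr(ũ_k ũ_l) becomes a_k a_l (a sketch of ours):

```python
# Check of the pole decomposition of P2 and of sum_k H_k = 0.
z_pts = [0.0, 1.0, 3.0]
a = [0.7, -1.2, 0.5]

def H(k):
    return sum(a[k] * a[l] / (z_pts[k] - z_pts[l])
               for l in range(len(a)) if l != k)

z = 0.4 + 0.9j
lhs = sum(a[k] * a[l] / ((z - z_pts[k]) * (z - z_pts[l]))
          for k in range(3) for l in range(3))
rhs = sum(a[k] ** 2 / (z - z_pts[k]) ** 2 + 2 * H(k) / (z - z_pts[k])
          for k in range(3))
assert abs(lhs - rhs) < 1e-12
# The Gaudin Hamiltonians are not independent: sum_k H_k = 0 by antisymmetry.
assert abs(sum(H(k) for k in range(3))) < 1e-12
```

The last assertion reflects the residual G-invariance: only the differences of positions z_k − z_l enter the H_k.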
Example 2. Let us explain how one may rederive the elliptic spin Calogero–Moser system from this general construction. Consider the torus T_τ ≡ C/(Z + τZ) of periods 1 and τ, Im τ > 0. We shall use a coordinate z on T_τ with the identifications z ∼ z + 1 and z ∼ z + τ. We consider the case with one marked point at z = 0. Let u be an element of the (dual of the) Lie algebra attached to this marked point. We need a Cartan decomposition of the Lie algebra of G, i.e. consider the basis (H_i, E_α), with H_i in the Cartan subalgebra of the Lie algebra of G and E_α the root generators. We describe any fibre bundle on the torus T_τ using two open sets: U_0, a small disc around z = 0, and U_∞ = T_τ − {0}. The transition function is g_{0∞} = h_∞ h_0^{-1}, where h_0 and h_∞ are C^∞ functions solving A = h^{-1} ∂̄h in the respective open sets U_0 and U_∞. It follows from the above discussion, eq. (7.67), that the transition function g_{0∞} is equivalent to one of the form

g_{0∞} = exp(q/z),   with q diagonal
α
i
α
i ∼ ui , and Φ α ∼ exp ( α(q) ) uα . The first condition means we see that Φ z z z i is an elliptic function with a single pole of order 1, which means that Φ i = pi , constant. The second conthat it is a constant. Hence u i = 0 and Φ dition means that Φα has an essential singularity of the Baker–Akhiezer
242
7 The Calogero–Moser model
type at z = 0. The solution is the Lam´e function: σ(z − α(q)) α(q)ζ(z) α = − Φ uα e σ(z)σ(α(q)) finally reads: The section Φ ≡Φ ∞ = Φ
pi H i +
α Eα Φ
dz
(7.77)
α
i
In the sl(n) case this is exactly of the same form as the Lax matrix of the elliptic spin Calogero–Moser model. The variables (pi , qi ), where q = i qi Hi , are the momenta and positions of the particles of the Calogero– Moser model, while the u α are the spin variables. Choosing now P2 (Φ) = Tr (Φ2 ) and using the property of Lam´e functions, eq. (7.15), we can relate P2 (Φ)to the Hamiltonian of the spin Calogero–Moser model: P2 (Φ) = uα u −α )℘(z) with H + α Tr( p2i − ℘(α(q)) Tr( uα u −α ) H= i
α
As in the previous example, taking the quotient by the gauge group leaves only the variables qi and pi which parametrize respectively the and the spin variables moduli space N and the Cartan component of Φ, u = h0 (0)uh0 (0)−1 describing the G-orbit of u, restricted by u i = 0. Moreover, since h is defined up to left multiplication by a constant diagonal matrix, and this leaves the constraint u i = 0 invariant, the phase space is: Pu = {(p, q, u ∈ Ou ) s.t. u i = 0}/H where H is the Cartan torus, acting on u by conjugation. It remains to show that our dynamical variables have the correct Poisson brackets. To do that we compute the canonical 1-form: Tr (ΦδA) = Tr (ΦδA) + Tr (ΦδA) 2iπα = Σ
U0
Σ−U0
where we use the description of the bundles on Tτ with two open sets. −1 ¯ ¯ Writing A = h−1 0 ∂h0 = h∞ ∂h∞ in the respective open sets, one computes: ¯ ∂ Tr (ΦδA) = − Tr (h−1 δh Φ) + Tr (Φh−1 0 A 0 0 δh0 ) U0 U0 ∂U0 −1 ¯ Tr (ΦδA) = − Tr (h∞ δh∞ ∂A Φ) + Tr (Φh−1 ∞ δh∞ ) Σ−U0
Σ−U0
∂(Σ−U0 )
7.12 Examples of Hitchin systems
243
First we have ∂(Σ−U0 ) = − ∂U0 . Second, since ∂¯A Φ = uδ0 , the integral Σ−U0 vanishes. We get:
1 −1 Tr Φ(h−1 δh − h δh ) α = −Tr (uh0 (0)−1 δh0 (0)) + 0 ∞ ∞ 0 2iπ ∂U0 ∞ (h∞ h−1 δh0 h−1 The contour integral can be rewritten as ∂U0 Tr Φ ∞ − 0
−1 δh∞ h−1 ∞ ) . Varying the equation h∞ h0 = g0∞ = exp (q/z) we have: δq −1 −1 = δh∞ h−1 ∞ − h∞ h0 δh0 h∞ z so that we finally get:
∞ δq Tr Φ α = −Tr (uh0 (0) z ∂U0 pi δqi − Tr (uh(0)−1 δh(0)) =− −1
1 δh0 (0)) − 2iπ
i
To evaluate the contour integral, we noticed that δq being diagonal, only in eq. (7.77) contributes to the trace. It is clear that the diagonal part of Φ the constraint u i = 0 and the quotient by the Cartan torus is a symplectic quotient of the coadjoint orbit, so finally the symplectic form reads: ω = −δα = δpi ∧ δqi + ωK i
Remark 1. The solution of the Hitchin systems can be viewed as the projection, under taking the quotient by the gauge group, of a straight line motion on T ∗ A since ˙ = 0 and A˙ = P (Φ) = Const. on this space the equations of motion are of the form Φ Remark 2. Note that the constraints ui = 0 are the same as the constraints fii = α in eq. (7.5), because the Cartan algebra for sl(N ) is generated by Eii − Ejj . occuring in eq. (7.77) is the same as the Lax matrix in Moreover, it is clear that Φ eq. (7.12). This provides some insight in the true nature of the Lax matrix, which appears as a form-valued holomorphic section of a vector bundle (depending on the dynamical data). Remark 3. The above construction naturally yields a Lax pair formulation of ˙ −1 . = hΦh−1 , we have dΦ/dt with M = hh = [M, Φ] the equations of motion. Since Φ As it was emphasized above, h is determined up to a constant (in z) diagonal matrix, ˙ → lΦl −1 and M → lM l−1 + l. the diagonal action on these data is given by Φ
7.13 The trigonometric Calogero–Moser model
In this section we present the construction, due to Olshanetsky and Perelomov, of the trigonometric spin Calogero–Moser model by Hamiltonian reduction of the geodesic motion on a symmetric space. The Lax matrix of the elliptic spin Calogero–Moser model, eq. (7.12), reads in the trigonometric limit:

L_ij = p_i δ_ij + (1 − δ_ij) f_ij (coth(q_ij) − coth(λ)) e^{q_ij coth(λ)}

In the limit λ → −∞ we get:

L_ij = p_i δ_ij + (1 − δ_ij) f_ij / sinh(q_i − q_j)
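The λ → −∞ limit just used relies on coth(λ) → −1 and the identity (coth q + 1) e^{−q} = 1/sinh q, which is easy to confirm numerically (a sketch of ours):

```python
# Check: (coth(q) - coth(lam)) * exp(q*coth(lam)) -> 1/sinh(q) as lam -> -infinity.
import math

def coth(x):
    return math.cosh(x) / math.sinh(x)

q, lam = 0.7, -20.0
val = (coth(q) - coth(lam)) * math.exp(q * coth(lam))
assert abs(val - 1.0 / math.sinh(q)) < 1e-6
```

The corrections are of order e^{2λ}, so λ = −20 already agrees to machine precision for moderate q.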
This Lax matrix, without spectral parameter, is a good Lax matrix in the case of the scalar trigonometric Calogero–Moser model, as it produces N commuting conserved quantities. In the spin case, however, it does not provide enough commuting conserved quantities. We will construct this Lax matrix by a Hamiltonian reduction procedure, starting from a finite-dimensional space. The construction is similar to that in the case of the open Toda chain, see Chapter 4. One starts from the symplectic space T*G, where G is a complex Lie group, and reduces under the action on the left by a subgroup H_L and on the right by a subgroup H_R. The general theory, see Chapter 14, shows that the Poisson brackets in the coordinates (g, ξ) are expressed as:

{ξ(X), ξ(Y)} = ξ([X, Y]),   {ξ(X), g} = g X,   {g, g} = 0   (7.78)
Recall that here g is the matrix of functions on G whose elements are the matrix elements of g in a faithful representation. As in the case of the Toda chain we reduce the geodesic motion on G. The Hamiltonian on T*G reads H_geod = (1/2) Tr(ξ²). Notice that H_geod is bi-invariant, so one can attempt to reduce this dynamical system using Lie subgroups H_L and H_R of G, acting respectively on the left and on the right on T*G. Recall that the corresponding moments are, see eq. (4.49) in Chapter 4:

P^L(g, ξ) = −P*_{H_L}( g ξ g^{-1} ),   P^R(g, ξ) = P*_{H_R}( ξ )   (7.79)
7.13 The trigonometric Calogero–Moser model
245
principal fibre bundle of total space G and base G/H, which is a symmetric space. Moreover, G acts on the left on G/H and in particular so does H itself. We shall consider the reduction of the geodesic motion on G under the product HL × HR with HL = HR = H. The reduction is achieved with an adequate choice of the momentum µ = (−µL , µR ) such that (P L , P R ) = µ. We take µR = 0 so that the isotropy group of the right action is HR itself. In this way we are in fact dealing with motions on G/H. The derivative of σ at the unit element of G is an involutive automorphism of G, also denoted by σ. Let us consider its eigenspaces H and K associated with the eigenvalues +1 and −1 respectively. We have a decomposition: G =H⊕K
(7.80)
in which H is the Lie algebra of H. Note that hKh−1 = K, for h ∈ H. We shall consider the particular case obtained by taking G = SL(N, C) and H = SU (N ). Here, we view SL(N, C) as a real Lie algebra. Counting real parameters, dim G = 2N 2 − 2 (note that det M = 1 yields two conditions) and dim H = N 2 − 1. The automorphism σ is σ(g) = (g ∗ )−1 , where ∗ means transpose and complex conjugate. The set of its fixed points is SU (N ). The symmetric space K = G/H may be identified with the space of positive Hermitian matrices. Indeed, for any matrix M ∈ SL(N, C) we can uniquely write the Cayley decomposition M = KU with U ∈ SU (N ) and K positive Hermitian. This is because M M ∗ is Hermitian and strictly −1 with V ∈ SU (N ) and D dipositive, so we can write M M ∗ = V DV √ √ agonal real positive. We define K = V DV −1 , where D is the positive square root. Then we check that K −1 M is in SU (N ). At the Lie algebra level we have the decomposition (7.80) where H is the Lie algebra of antihermitian matrices, and K is the space of Hermitian matrices. Finally, σ reads X → −X ∗ , and is an automorphism for the structure of a real Lie algebra. The dual of G is identified with G using the symmetric real invariant bilinear form (X, Y ) = Tr (XY + X ∗ Y ∗ ). The choice of the moment µL is of course of crucial importance. We will consider µL in the (dual of the) Lie algebra of H, i.e. an antihermitian matrix, such that its isotropy subgroup Hµ is a maximal proper Lie subgroup of HL , so that the phase space of the reduced system is of minimal dimension but non-trivial. As a matter of fact, the dimension of the reduced phase space is 2 dim G − dim (H × H) − dim (Hµ × H). This is because dim T ∗ G = 2 dim G, the constraint (P L , P R ) = (−µL , 0) yields dim (H ×H) equations, and we still have to quotient by Hµ ×H. To analyse the stabilizer of µ we have to solve h−1 µh = µ. In a basis where µ is diagonal, we see that if µ has N − l equal eigenvalues and the other
246
7 The Calogero–Moser model
l eigenvalues all different, the stabilizer is U (1)l × U (N − l)/U (1). This yields a reduced phase space of dimension 2N l − l(l + 1). Note that this is the dimension of the phase space of the spin Calogero–Moser model, eq. (7.8), apart from a discrepancy of 2 which corresponds to a reduction by the centre of mass motion, as we shall see. We now introduce coordinates on the group G. Using the Cayley decomposition, we can write uniquely g = KU with K Hermitian positive and U unitary. Diagonalizing K we write K = hL Qh−1 L with Q diagonal with real positive entries, and hL ∈ SU (N ). The columns of hL form the orthonormal basis of eigenvectors of K, hence are defined up to a phase. So we can write g = hL Qh−1 R , where hL , hR ∈ SU (N ) are defined up to right multiplication by the same unitary diagonal matrix. The set of points of T ∗ G with given moment (−µL , 0) is defined by the equations: −1 −1 PH (gξg −1 ) = PH (hL Qh−1 R ξhR Q hL ) = µL ,
PH (ξ) = 0
(7.81)
The second condition means that ξ ∈ K is a Hermitian traceless matrix. −1 −1 Setting L = h−1 R ξhR ∈ K, the first condition reads (hL QLQ hL , X) = −1 (µL , X), ∀X ∈ H. Equivalently (QLQ−1 , h−1 L XhL ) = (hL µL hL , −1 hL XhL ), yielding: PH (QLQ−1 ) = h−1 L µL hL q Setting Q = Diag (e i ) with i qi = 0 (hence the reduction by the centre of mass motion), we have (QLQ−1 )ij = exp (qi − qj )Lij . The projection on H amounts to take the antiHermitian part of this matrix, giving the equation sinh (qij )Lij = µ ij with µ ij = (h−1 L µL hL )ij . The solution of this equation is: Lij =
µ ij , i = j, sinh (qij )
Lii = pi ,
pi = 0,
µ ii = 0
where the pi are arbitrary real numbers. We have found: L=
i
pi Eii +
i=j
µ ij Eij sinh (qij )
We recognize the Lax matrix of the trigonometric spin Calogero–Moser model, with the spin variables µ̃_ij describing the coadjoint H-orbit of the momentum µ_L. In view of the definition of L, the ambiguity in the definition of h_L, h_R amounts to quotienting by conjugation by a diagonal unitary matrix. We still have not explored the implications of the conditions µ̃_ii = 0 on h_L. We do this for the case of the scalar model, i.e. µ_L has N − 1 equal
eigenvalues. This means that µ_L is of the form µ_L = i(VV* − αI), where V is an N-vector and α is determined so that µ_L is traceless. We then have µ̃ = i(WW* − αI), where W = h_L^{-1} V. The conditions µ̃_ii = 0 read W_i W̄_i = α for all i. Since h_L is defined up to right multiplication by a diagonal unitary matrix, we can always assume that W_i is real positive, so that W_i = √α. There always exists some fixed element h_0 ∈ SU(N) mapping V to the vector W = √α (1, . . . , 1)^t, and any other solution h_L is of the form h_L = h_0 h_W, where h_W runs over the stability group of W, which is exactly h_0^{-1} H_µ h_0, where H_µ is the stabilizer of µ. Here the element h_0 plays essentially no role, and one usually takes µ_L = i(WW* − αI). Then all solutions of the constraints (P^L, P^R) = (−µ_L, 0) are of the form (h_L, h_R).(Q, L) with Q, L as above, and h_L ∈ H_µ, as it should be. Quotienting by H_µ × H, according to the general procedure, yields the phase space of the Calogero–Moser model. A similar analysis can be performed in the case of the spin Calogero–Moser model. Finally, it is easy to see that the symplectic structure of the cotangent bundle we started with gives the symplectic structure of the spin Calogero–Moser model under reduction. It is enough to compute the canonical 1-form:

α = (ξ, g^{-1} δg) = (Q L Q^{-1}, h_L^{-1} δh_L) + (L, Q^{-1} δQ) − (L, h_R^{-1} δh_R)
Using the constraint eq. (7.81), the first term is (µ_L, δh_L h_L^{-1}), that is the Kirillov structure on the spin variables, the second term is Σ_i p_i δq_i, and the last term vanishes because the constraint P_H(ξ) = 0 also implies P_H(L) = 0.

References

[1] F. Calogero, Exactly solvable one-dimensional many-body systems. Lett. Nuovo Cimento 13 (1975) 411–415.
[2] J. Moser, Three integrable Hamiltonian systems connected with isospectral deformations. Adv. Math. 16 (1975) 197–220.
[3] M.A. Olshanetsky and A.M. Perelomov, Classical integrable finite-dimensional systems related to Lie algebras. Physics Reports 71 (1981).
[4] I.M. Krichever, Elliptic solutions of the Kadomtsev–Petviashvili equation and integrable systems of particles. Func. Anal. App. 14 (1980) no. 4, 282–290.
[5] J. Gibbons and T. Hermsen, A generalization of the Calogero–Moser system. Physica 11D (1984) 337.
[6] H. Airault, H. McKean and J. Moser, Rational and elliptic solutions of the KdV equation and a related many-body problem. Comm. Pure Appl. Math. 30 (1977) 95–125.
[7] N. Hitchin, Stable bundles and integrable systems. Duke Math. Journ. 54 (1987) 91.
[8] M.S. Narasimhan and C.S. Seshadri, Holomorphic vector bundles on a compact Riemann surface. Math. Ann. 155 (1964) 69–80.
[9] R.C. Gunning, Lectures on Vector Bundles over Riemann Surfaces, Princeton University Press (1967).
[10] O. Forster, Lectures on Riemann Surfaces, Springer (1981).
8 Isomonodromic deformations
In this chapter, we consider isomonodromic deformations of the first order linear differential operator ∂_λ − M_λ(λ). Here the problem is to determine M_λ such that solutions of this differential equation have a given monodromy. In general, the solution is not unique, and depends on a number of continuous parameters, the so-called isomonodromic deformation parameters. We show that the deformation equations with respect to these parameters form an integrable system. Ordinary integrable systems with a Lax matrix appear as particular cases of such systems, namely when the group generated by the monodromies is finite. However, the new setting is much more general. Just as solutions of Lax equations were written in terms of theta-functions, in the general case the solutions can be written in terms of new functions called tau-functions. We express the dynamical variables in terms of tau-functions and their so-called Schlesinger transforms. We show that, in terms of tau-functions, the equations of motion take the form of bilinear Hirota equations. Finally, we show that the Painlevé equations can be interpreted as isomonodromic deformation equations.

8.1 Introduction

In Chapter 3, we have seen that the Zakharov–Shabat construction yields hierarchies of integrable equations of the form:

∂_{t_i} Ψ(λ) = M_i(λ) Ψ(λ),   M_i(λ) = ( g(λ) ξ_i(λ) g^{-1}(λ) )_−   (8.1)

Here i is a multi-index i = (k, n, α), ξ_i(λ) is the diagonal matrix ξ_i(λ) = (λ − λ_k)^{−n} E_αα, g(λ) is a regular matrix, and
Ψ(λ) = g(λ)e
i ξi (λ)ti
is a matrix of size N × N . The notation ( )− means taking the polar part at λk . In Chapter 3, the wave function Ψ(λ) was defined through the 249
Lax equation L(λ)Ψ(λ) = Ψ(λ) µ, where µ is the diagonal matrix of the eigenvalues of L(λ). We have shown that the isospectral flows defined by eq. (8.1) are all commuting and satisfy the zero curvature equations:

\[
\partial_{t_i} M_j - \partial_{t_j} M_i - [M_i, M_j] = 0 \tag{8.2}
\]
In this chapter we enlarge the framework by replacing the Lax equation by a linear differential equation in the spectral parameter λ:

\[
\partial_\lambda \Psi = M_\lambda \Psi \tag{8.3}
\]
We assume that the entries of the matrix M_λ(λ) are rational functions of λ:

\[
M_\lambda(\lambda) = -A_0^{(\infty)} - \cdots - A_{n_\infty-1}^{(\infty)}\, \lambda^{n_\infty-1} + \sum_{k=1}^{K} \left( \frac{A_1^{(k)}}{\lambda - \lambda_k} + \cdots + \frac{A_{n_k+1}^{(k)}}{(\lambda - \lambda_k)^{n_k+1}} \right) \tag{8.4}
\]

This is of the form M_λ(λ) = Σ_k M_λ^{(k)}(λ), where M_λ^{(k)} is the polar part at λ_k (including ∞). The polar parts M_λ^{(k)} at the λ_k are readily obtained from the expression of M_λ, but some care must be taken concerning the polar part at ∞. To define M_λ^{(∞)} one chooses a local parameter z = 1/λ and writes the differential equation as ∂_z Ψ(z) = (M_λ^{(∞)}(z) + regular)Ψ(z), so that using eq. (8.4) one gets:

\[
M_\lambda^{(\infty)} \equiv \left( -\frac{1}{z^2}\, M_\lambda\!\left(\frac{1}{z}\right) \right)_- = A_{n_\infty-1}^{(\infty)}\, z^{-n_\infty-1} + \cdots + A_0^{(\infty)}\, z^{-2} - \Big( \sum_k A_1^{(k)} \Big)\, z^{-1} \tag{8.5}
\]

The solutions Ψ(λ) of the linear differential equation (8.3) have essential singularities at the λ = λ_k and at λ = ∞. They are otherwise analytic, but have non-trivial monodromies around the singularities. So they are in fact defined on a Riemann surface with an infinite (in general) number of sheets. This will be understood in the following. Again we will show that around each pole λ_k of M_λ, the function Ψ(λ) admits an expansion of the form

\[
\Psi(\lambda) \sim \Psi^{(k)}_{\rm asy}(\lambda) = g^{(k)}(\lambda)\, e^{\xi^{(k)}(\lambda)}
\]

where g^{(k)}(λ) = g_0^{(k)} (1 + g_1^{(k)} (λ − λ_k) + ···) is regular at λ_k and

\[
\xi^{(k)} = B_0^{(k)} \log(\lambda - \lambda_k) + \sum_{\alpha,\, n=1}^{n_k} t_{(k,n,\alpha)}\, \frac{E_{\alpha\alpha}}{(\lambda - \lambda_k)^n}, \qquad
\xi^{(\infty)} = B_0^{(\infty)} \log\frac{1}{\lambda} + \sum_{\alpha,\, n=1}^{n_\infty} t_{(\infty,n,\alpha)}\, \lambda^n\, E_{\alpha\alpha} \tag{8.6}
\]
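The bookkeeping in eq. (8.5) can be checked with sympy in a scalar (N = 1) toy case. All the numbers below are made up for illustration: one finite double pole at λ = 2 (so n_1 = 1) and a degree-one polynomial part (so n_∞ = 2).

```python
import sympy as sp

lam, z = sp.symbols('lambda z')

# hypothetical scalar data for eq. (8.4): one pole at lambda_1 = 2
A0inf, A1inf = 3, 5          # polynomial part -A0inf - A1inf*lam, n_inf = 2
A1k, A2k = 7, 11             # A_1^{(1)} and A_2^{(1)} at the double pole
M = -A0inf - A1inf*lam + A1k/(lam - 2) + A2k/(lam - 2)**2

# change of variable lambda = 1/z: the equation becomes d_z Psi = f(z) Psi
f = -M.subs(lam, 1/z)/z**2

# Laurent expansion around z = 0; eq. (8.5) predicts the three pole terms
laurent = sp.expand(sp.series(f, z, 0, 1).removeO())
```

Extracting the coefficients of z^{-3}, z^{-2}, z^{-1} from `laurent` reproduces A1inf, A0inf and −A1k, exactly as in eq. (8.5).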
While the equations look the same as in the isospectral context, there are important differences. One of them is that the quantity ξ^{(k)} now includes a logarithmic term, and another is that the expansions are now (in general) only valid in the asymptotic sense, hence the notation Ψ_asy(λ). Nevertheless, this is sufficient to show that the polar part M_λ^{(k)} of M_λ at λ_k is given by

\[
M_\lambda^{(k)} = \left( g^{(k)}\, \partial_\lambda \xi^{(k)}\, g^{(k)\,-1} \right)_-
\]
The solution Ψ(λ) of eq. (8.3) has non-trivial monodromy properties when λ makes a loop around a singularity of Mλ (λ). We call isomonodromic deformations the deformations of Mλ (λ) such that the monodromy properties of Ψ(λ) are kept fixed. Our aim is to write evolution equations which describe these deformations. The isomonodromic deformation parameters will include as before the t(k,n,α) occurring in eq. (8.6). Under these time evolutions we have:
\[
\partial_{t_i} \Psi = M_i \Psi, \qquad M_i = \left( g^{(k)}\, \partial_{t_i} \xi^{(k)}\, g^{(k)\,-1} \right)_-, \qquad i = (k, n, \alpha)
\]
Moreover, this set of commuting flows will be enlarged by varying the positions λk of the poles of Mλ (λ):
(k) (k) (k) −1 ∂λk Ψ = Mλk Ψ, Mλk = g ∂λk ξ g (8.7) −
All these flows will be interpreted as isomonodromic deformations of the differential equation ∂_λΨ = M_λΨ, and will be shown to commute.

8.2 Monodromy data

In this section, we define precisely the monodromy properties of a linear differential equation of the form eq. (8.3). Our purpose is to clarify the relation between the true solution Ψ(λ) and the local expansions Ψ_asy(λ) = g(λ) exp(ξ(λ)) around each pole λ_k. We start from a wave-function Ψ(λ), solution of the linear differential equation eq. (8.3), where M_λ(λ) is a globally defined rational function of λ. That is, it is a sum of polar parts M_λ^{(k)}(λ) at given poles λ_k, as in eq. (8.4). Here we consider the parameters A_j^{(k)} as given, and our aim is to reconstruct the asymptotic expansions from the differential equation.

We begin by recalling some basic definitions and facts concerning linear differential equations with singular points.

Definition. The point λ_k is a regular singular point of the linear differential equation eq. (8.3) if M_λ(λ) has a simple pole at λ_k. When M_λ(λ)
has a pole of order n_k + 1 ≥ 2, the point λ_k is called an irregular singular point of rank n_k. A more refined definition should include the notion of apparent singularities, removable through simple changes of variables.

Special care must be taken at the point λ = ∞. To include it in this definition we set z = 1/λ. The equation becomes:

\[
\partial_z \Psi = -\frac{1}{z^2}\, M_\lambda(1/z)\, \Psi \tag{8.8}
\]

When M_λ(λ) is rational in λ, this equation is of the same form as the original one, so that we can in fact consider it to be defined on the Riemann sphere. Assume first that M_λ(λ) is a sum of poles at finite distance, i.e. all coefficients A_j^{(∞)} = 0 in eq. (8.4). We then have M_λ(λ) = O(1/λ) at ∞, so that (1/z²) M_λ(1/z) has a simple pole at z = 0, which is thus a regular singular point. On the other hand, if A_{n_∞−1}^{(∞)} ≠ 0 then (1/z²) M_λ(1/z) ∼ 1/z^{n_∞+1} and we have an irregular singularity of rank n_∞ (in particular this is the case if we have only a constant term, corresponding to n_∞ = 1).

The behaviour of solutions at the two types of singularities is very different. Assume first that eq. (8.3) has a regular singularity at λ = 0, i.e. one can write it as λ ∂_λΨ(λ) = A(λ)Ψ(λ), with A(λ) analytic around λ = 0:

\[
A(\lambda) = A_0 + A_1 \lambda + A_2 \lambda^2 + \cdots
\]

We also assume that all eigenvalues of A_0 are different and do not differ by integers.

Theorem. Let λ = 0 be a regular singularity. There exists a fundamental matrix of solutions in some neighbourhood of λ = 0 of the form:

\[
\Psi(\lambda) = g(\lambda)\, e^{B \log \lambda}
\]

where g(λ) is analytic around λ = 0, and B = Diag(α_i), where the α_i are the eigenvalues of A_0.

The coefficients of the expansion of g(λ) are obtained by plugging a series expansion into the differential equation, and determining the coefficients recursively. Indeed, by a constant similarity transformation, we can assume that A_0 is diagonal. Setting g = 1 + g_1 λ + g_2 λ² + ··· we get the recursive system:

\[
r\, g_r - [A_0, g_r] = \sum_{i=1}^{r} A_i\, g_{r-i}
\]
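This recursion can be run symbolically. The sketch below uses sympy on a hypothetical 2×2 example in which A(λ) is truncated after its linear term; all matrices and the eigenvalues 1/3, −1/4 are made up for illustration.

```python
import sympy as sp

lam = sp.symbols('lambda')

# Hypothetical data: A0 diagonal, eigenvalues 1/3 and -1/4 never differ
# by an integer, so every recursion step is solvable.
A0 = sp.diag(sp.Rational(1, 3), sp.Rational(-1, 4))
A1 = sp.Matrix([[1, 2], [3, 4]])
A = A0 + A1*lam                 # A(lambda) truncated after the linear term

R = 4                           # number of recursion steps
g = [sp.eye(2)]                 # g_0 = 1
for r in range(1, R + 1):
    H = A1*g[r - 1]             # right-hand side; only A1 contributes here
    # solve (r - alpha_i + alpha_j)(g_r)_{ij} = H_{ij} componentwise
    gr = sp.Matrix(2, 2, lambda i, j: H[i, j]/(r - A0[i, i] + A0[j, j]))
    g.append(gr)

gser = sum((g[r]*lam**r for r in range(R + 1)), sp.zeros(2, 2))
# Psi = g(lambda) lambda^{A0} solves lam Psi' = A Psi iff
# lam g' + g A0 - A g = 0; with the truncation the residual is O(lam^{R+1})
resid = sp.expand(lam*sp.diff(gser, lam) + gser*A0 - A*gser)
low_orders = [resid.applyfunc(lambda e: e.coeff(lam, k)) for k in range(R + 1)]
```

All coefficients up to order λ^R vanish, and the leftover residual is exactly the truncation term −A_1 g_R λ^{R+1}.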
In components this equation reads (r − α_i + α_j)(g_r)_{ij} = R_{ij}^{(r−1)}, where R^{(r−1)} only depends on g_1, ..., g_{r−1}, so that g_r is uniquely determined if the eigenvalues α_i do not differ by integers. Moreover, it is easy to show that one gets a series with non-vanishing radius of convergence.

If now we have an irregular singularity at λ = 0, the situation is very different. Let us assume that the equation is of the form λ^{n+1} ∂_λΨ(λ) = A(λ)Ψ(λ) with n ≥ 1, and that the most singular term, i.e. A_0, has distinct eigenvalues. One can find a formal expansion g(λ) = g_0 + g_1 λ + ··· by plugging it into the differential equation, but one finds a system of linear equations of the form [g_r, A_0] = R^{(r−1)}, where the right-hand side depends on lower order terms. One can obtain formal solutions, but the resulting series is in general divergent. The precise meaning of Ψ ∼ Ψ_asy is that Ψ(λ) exp(−ξ(λ)) is asymptotically equal to the formal series g(λ).

Theorem. Let λ = 0 be an irregular singularity. The differential equation has a fundamental system of formal solutions in the neighbourhood of λ = 0 of the form:

\[
\Psi_{\rm asy}(\lambda) = g(\lambda)\, e^{\xi(\lambda)} \tag{8.9}
\]

where ξ(λ) = B_0 log λ + ··· is a diagonal matrix, the dots representing a polynomial in 1/λ with dominant term (1/(nλ^n)) Diag(α_1, ..., α_N), where the α_i are the eigenvalues of the matrix A_0, and g(λ), which is determined up to a right multiplication by a constant diagonal matrix, has a formal expansion of the form g(λ) = g_0 + g_1 λ + ···. In each angular sector S of angle slightly bigger than π/n with vertex at λ = 0, there exists a unique true solution Ψ(λ) which admits, in S, an asymptotic expansion given by the above formal series Ψ_asy.

Proof. We shall not give a complete proof of these theorems here (see the References, particularly Wasow) but sketch a few of the ideas involved. It is simpler to assume that the singularity is at infinity, so we set z = 1/λ.
We consider the equation z^{−q} Ψ′(z) = A(z)Ψ(z) around the singular point z = ∞, of order (q + 1). We assume that A(z) = A_0 + A_1/z + ···, and make the further simplifying assumption that all eigenvalues of A_0 are different, hence there is no restriction in taking A_0 diagonal. We show below that there is a matrix P(z) such that under the transformation Ψ(z) = P(z)W(z), the differential equation becomes W′(z) = z^q Q(z)W(z) with Q(z) diagonal. This equation is readily solved, yielding:

\[
\Psi(z) = P(z)\, e^{\int^z \zeta^q\, Q(\zeta)\, d\zeta} \tag{8.10}
\]
The matrix P(z) must obey the equation:

\[
z^{-q}\, P'(z) = -P(z)\, Q(z) + A(z)\, P(z) \tag{8.11}
\]
in which Q(z) is diagonal. Of course the transformation P(z) is defined up to multiplication on the right by a diagonal matrix, since this would leave the matrix Q(z) diagonal. We fix this ambiguity by requiring P_{kk}(z) = 1. Taking the diagonal element kk of eq. (8.11) one gets Q_{kk}(z) = (A(z)P(z))_{kk}. Plugging this into eq. (8.11) one gets a non-linear system for the N(N − 1) off-diagonal elements P_{ij}(z), i ≠ j:

\[
z^{-q}\, P'_{ij}(z) = (A(z)P(z))_{ij} - P_{ij}\, (A(z)P(z))_{jj} \tag{8.12}
\]
We first show that this equation admits a unique formal solution of the form:

\[
P(z) = 1 + \sum_{r \ge 1} \frac{P_r}{z^r}, \quad (P_r)_{kk} = 0, \qquad Q(z) = A_0 + \sum_{r \ge 1} \frac{Q_r}{z^r}
\]
Inserting into eq. (8.11), the coefficient of 1/z^r for r ≥ 1 gives:

\[
A_0 P_r - P_r A_0 = Q_r + H_r \tag{8.13}
\]
where H_r depends on the P_j, Q_j for j < r. This is solved recursively and uniquely by setting (Q_r)_{kk} = −(H_r)_{kk}, which reproduces the solution Q_{kk} = (AP)_{kk}, and (P_r)_{ij} = (H_r)_{ij}/(α_i − α_j) for i ≠ j, where we recall that A_0 = Diag(α_1, ..., α_N). Inserting these expansions for P(z) and Q(z) into eq. (8.10), we get eq. (8.9) with B_0 = −Q_{q+1} and

\[
\xi = \sum_{i=0}^{q} \frac{Q_i\, z^{q-i+1}}{q-i+1}
\]

Moreover, g(z) = P(z) exp(Σ_{i ≥ q+2} Q_i z^{q−i+1}/(q−i+1)) is of the form g = 1 + O(1/z), and this uniquely determines the expansion. Note that the most singular term in ξ is indeed of the form A_0 z^n/n with n = q + 1 and A_0 diagonal. If A_0 is not diagonal, write A_0 = g_0 D g_0^{-1}; then g = g_0 (1 + O(1/z)). Of course g_0 is determined up to a right multiplication by any constant diagonal matrix. In contrast to the regular singularity case, this expansion is, however, only valid in the asymptotic sense.

We then show that there exists a true solution in the sector S having the above formal solution as asymptotics when z → ∞. Writing P = 1 + P̃, eq. (8.12) takes the form

\[
z^{-q}\, \tilde P'(z) = f_0(z) + f_1(z, \tilde P) + f_2(z, \tilde P)
\]

where f_1(z, P̃) is linear in P̃, and f_2(z, P̃) is quadratic in P̃. Moreover, when z → ∞, f_1(z, P̃) = [A_0, P̃] + ···, hence this linear map is non-singular (since P̃ has no diagonal elements). Finally, we know this equation
has a formal solution Σ_{r≥1} P_r/z^r. It is a known theorem that there exists, in the interior of any sector of angle less than 2π, an analytic function Φ(z) with the given asymptotics Σ_{r≥1} P_r/z^r when z → ∞. We set P̃(z) = U(z) + Φ(z) and get a transformed equation for U(z) of the same form, z^{−q} U′(z) = g_0(z) + g_1(z, U(z)) + g_2(z, U(z)), but now when z → ∞ one can show that g_0(z) is asymptotic to 0, and the leading term of g_1(z, U(z)) is of the form [A_0, U(z)]. Finally, our equation can be written:

\[
z^{-q}\, U'(z) = [A_0, U(z)] + R(z, U)
\]

where R(z, U) will be treated as a perturbation. The unperturbed equation is readily solved and has the fundamental solution:

\[
V(z) = e^{\frac{z^{q+1}}{q+1} A_0}\; V_0\; e^{-\frac{z^{q+1}}{q+1} A_0}
\]

We now replace the differential equation by a system of coupled integral equations. Written in components, they are of the form:

\[
U_{ij}(z) = \int_{\gamma_{ij}} e^{\frac{(z^{q+1} - t^{q+1})(\alpha_i - \alpha_j)}{q+1}}\; t^q\, R_{ij}(t, U(t))\; dt, \qquad i \neq j
\]
The integration path γ_{ij} ends at z, but its origin may depend on ij. The origins of the paths γ_{ij} represent the N(N − 1) integration constants of the problem. So we need to specify the integration paths carefully, and it is here that the sector S appears. To be able to control the exponentials in the kernel of the integral equation, one chooses paths γ_{ij}(t) such that Re[(z^{q+1} − t^{q+1})(α_i − α_j)] < 0. Let us examine the conditions under which such a choice is possible. We use the variables ζ = z^{q+1} and τ = t^{q+1}. In the τ-plane we draw the lines Re[τ(α_i − α_j)] = 0, called Stokes lines. Let Σ be a sector of angle slightly larger than π, such that each Stokes line intersects the interior of Σ on just one half-line. The pre-image of Σ in the z-plane (under z → ζ) is a sector S:

\[
\theta_0 \le \theta \le \theta_0 + \pi/n + \delta \tag{8.14}
\]

with n = q + 1, and some small positive δ. The Stokes line Re[τ(α_i − α_j)] = 0 divides Σ into two regions where Re[τ(α_i − α_j)] is positive and negative respectively. The path γ_{ij} is taken so that its image in Σ is a straight line from ∞ to ζ, with a slope such that Re[τ(α_i − α_j)] < 0. The origins of the paths are chosen at ∞, so that U_{ij}(z) → 0 when z → ∞. The solution U_{ij}(z) is uniquely determined by this requirement, which subsequently yields uniqueness of the solution Ψ(λ) having the considered
asymptotic expansion in the sector S. With this choice of paths, one can check that U_{ij}(z) is asymptotic to 0 when z → ∞, thereby justifying the treatment of R(z, U) as a perturbation. It is clear that for t on the paths γ_{ij} one can write:

\[
\left|\, e^{\frac{(z^{q+1} - t^{q+1})(\alpha_i - \alpha_j)}{q+1}} \,\right| \le e^{-\beta\, |z^{q+1} - t^{q+1}|}
\]

for some positive fixed constant β (recall that α_i ≠ α_j for i ≠ j). From this point, one can solve the integral equation by successive approximations and prove that this yields a series for U, convergent in the sector S and asymptotic to zero. We will not reproduce this analysis here. Note that the paths occur in pairs γ_{ij} and γ_{ji} such that their images in Σ are straight lines, one on each side of the line Re[(ζ − τ)(α_i − α_j)] = 0. Moreover, we keep the slopes fixed when ζ varies in Σ, and this can be done through the whole sector Σ bounded by some Stokes lines. Note, however, that if ζ goes beyond the boundary, one of the pair of paths has to be modified (see Fig. 8.1), yielding a different value of the integral.

In general the true solution Ψ_1(λ) with asymptotic expansion Ψ_asy(λ) in the sector S_1 can be analytically continued beyond the sector S_1, but it is very important to realize that its asymptotic expansion will be different there. If we consider another sector S_2 adjacent to S_1, there exists another true solution Ψ_2 which has in S_2 the given asymptotic expansion Ψ_asy(λ). Since S_1 and S_2 overlap, there is a constant matrix S_1 such that in this overlap Ψ_2 = Ψ_1 S_1. This relation remains true in S_1 ∪ S_2 by analytic continuation. This phenomenon was first noticed by Stokes, and the matrix S_1 is called a Stokes multiplier. Of course, the origin of the Stokes phenomenon is that subdominant exponentials in one sector become dominant in the next sector.

Note that there exists a permutation matrix P such that P^{-1} S_1 P is triangular with diagonal elements equal to 1. Indeed, let P be a permutation matrix. Then ΨP results from Ψ by some permutation of columns.
We have ΨP ∼ gP exp(P^{-1} ξ P), where the matrix P^{-1} ξ P is diagonal and results from ξ by the corresponding permutation of the diagonal elements. In the smallest sector bounded by Stokes rays containing the overlap S_1 ∩ S_2, one can choose P such that Re(α_{P(1)} τ) < Re(α_{P(2)} τ) < ··· < Re(α_{P(N)} τ). Since Ψ_1 P and Ψ_2 P have the same asymptotics in the overlap, we necessarily have Ψ_2 P = Ψ_1 P S′_1, where S′_1 = P^{-1} S_1 P is lower triangular. This is because exp(−α_i τ) is asymptotically negligible with respect to exp(−α_j τ) when i > j. Moreover, comparing the asymptotics of Ψ_1 P and Ψ_2 P we see that the diagonal elements of S′_1 are equal to 1. Finally
Fig. 8.1. Here the sector Σ is bounded by the bold lines, close to the Stokes rays, i.e. half Stokes lines, labelled 23 and 12. When the point ζ crosses the boundary 12, in the lower left side of the picture, the pair of paths γ_{12} and γ_{21} has to be modified as indicated by the dashed lines. Notice that this new path cannot be continuously deformed into the previous one. The sector Σ is of angle greater than π, so its pre-image is as in eq. (8.14).

S_1 = P S′_1 P^{-1}, where S′_1 = P^{-1} S_1 P is the lower triangular matrix just constructed. Altogether the Stokes matrix S_1 depends on N(N − 1)/2 continuous parameters.

The monodromy matrix around λ = 0 is the matrix M such that

\[
\Psi_1(e^{2i\pi} \lambda) = \Psi_1(\lambda)\, M \tag{8.15}
\]
where the left-hand side means the analytic continuation of the solution Ψ_1(λ), with asymptotics Ψ_asy on S_1, around a closed contour encircling λ = 0. We can easily relate the matrix M to the Stokes multipliers as follows. Let us cover a neighbourhood of λ = 0 by 2n sectors of angle π/n + δ, denoted S_1, S_2, ..., S_{2n}. More precisely, the sector S_j is defined by (j − 1)π/n ≤ arg(λ) ≤ jπ/n + δ. First, in the sector S_2, we have Ψ_1(λ) = Ψ_2(λ) S_1^{-1} (where Ψ_2 has the asymptotics Ψ_asy in S_2) since
this is true on the overlap. By recursion, on the sector S_j we have Ψ_1(λ) = Ψ_j(λ) S_{j-1}^{-1} ··· S_1^{-1} for λ ∈ S_j. Making a complete 2π rotation around λ = 0, we get a sector S_{2n+1} which projects over S_1, and we have on this sector Ψ_1(e^{2iπ}λ) = Ψ_{2n+1}(e^{2iπ}λ) S_{2n}^{-1} ··· S_1^{-1}. But the asymptotic expansion of Ψ_{2n+1}(e^{2iπ}λ) is by definition Ψ_asy(e^{2iπ}λ) = Ψ_asy(λ) exp(2iπB_0), so we see that:

\[
\Psi_1(e^{2i\pi} \lambda) \sim \Psi_{\rm asy}(\lambda)\, e^{2i\pi B_0}\, S_{2n}^{-1} \cdots S_1^{-1} \tag{8.16}
\]

By comparing the asymptotic expansions in eqs. (8.15, 8.16) we have:

\[
M = e^{2i\pi B_0}\, S_{2n}^{-1} \cdots S_1^{-1}
\]
In the case of regular singularities, there are no Stokes matrices, and the monodromy matrix reduces to exp (2iπB0 ). We now extend these results to the case of several singular points λk , including the point ∞. Hence we consider the differential equation eq. (8.3), where Mλ (λ) is the general rational function of λ given in eq. (8.4). To describe the monodromy data of eq. (8.3), we have to patch the local descriptions around each singularity λk . At these points we have formal solutions:
\[
\Psi^{(k)}_{\rm asy}(\lambda) = g^{(k)}(\lambda)\, e^{\xi^{(k)}(\lambda)}, \qquad g^{(k)}(\lambda) = g_0^{(k)} \left( 1 + g_1^{(k)} (\lambda - \lambda_k) + \cdots \right)
\]

Since they obey eq. (8.3), one can readily identify the polar part of M_λ at λ_k. We compute ∂_λΨ · Ψ^{-1} in a sector, so that we can replace Ψ by its asymptotic expansion. We get ∂_λg g^{-1} + g ∂_λξ g^{-1} ∼ M_λ. Keeping the polar part yields an equality between a finite number of terms:

\[
M_\lambda^{(k)} = \left( \Psi^{(k)}_{\rm asy}(\lambda)\, \partial_\lambda \xi^{(k)}\, \Psi^{(k)}_{\rm asy}(\lambda)^{-1} \right)_- \tag{8.17}
\]
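A quick scalar (N = 1) sanity check of this polar-part formula: for N = 1 the conjugation is trivial and eq. (8.17) reduces to M_λ^{(k)} = (∂_λξ^{(k)})_−. The particular ξ and g below are made up for illustration (simple pole plus logarithm at λ_k = 1, regular g).

```python
import sympy as sp

lam, u = sp.symbols('lambda u')   # u = lambda - lambda_k with lambda_k = 1

# hypothetical singularity data: n_k = 1 pole term plus a log term
t1, b, c = sp.Rational(2, 3), sp.Rational(1, 5), sp.Rational(1, 7)
xi = t1/(lam - 1) + b*sp.log(lam - 1)
g = 1 + c*(lam - 1)

# for N = 1, Psi = g e^xi and M_lambda = Psi'/Psi = g'/g + xi'
M = sp.diff(g, lam)/g + sp.diff(xi, lam)

# Laurent expansion of M around lambda_k: the polar part must equal (xi')_-
ser = sp.expand(sp.series(M.subs(lam, 1 + u), u, 0, 1).removeO())
```

The pole coefficients of `ser` are −t1 at u^{-2} and b at u^{-1}, i.e. exactly the polar part of ξ′; the regular factor g contributes nothing to the pole, as the argument above asserts.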
Due to our definition of M_λ^{(∞)}, we have a similar formula at ∞, using the parameter z = λ^{-1}:

\[
M_\lambda^{(\infty)}(z) = \left( \Psi^{(\infty)}_{\rm asy}\, \partial_z \xi^{(\infty)}\, \Psi^{(\infty)\,-1}_{\rm asy} \right)_- \tag{8.18}
\]
The global monodromy problem is specified by fixing paths γ_k from a reference point, which we choose to be ∞, to each λ_k. Around λ_k there are sectors S_j^{(k)} and corresponding solutions Ψ_j^{(k)} with the given asymptotics. Starting from the solution Ψ_1^{(∞)}, we want to see how it changes when it is analytically continued around the singularities. We can continue Ψ_1^{(∞)} along the path γ_k and compare the result to Ψ_1^{(k)}. This defines matrices C^{(k)}
Fig. 8.2. The paths γ_i and the various Stokes sectors at the points λ_k.

such that Ψ_1^{(∞)}(λ) = Ψ_1^{(k)}(λ) C^{(k)}. Then the monodromy matrix of Ψ_1^{(∞)} around λ_k is

\[
M^{(k)} = C^{(k)\,-1}\, e^{2i\pi B_0^{(k)}}\, S_{2n_k}^{(k)\,-1} \cdots S_1^{(k)\,-1}\, C^{(k)} \tag{8.19}
\]
(8.20)
where K is the number of singularities at finite distance. The monodromy group is the group generated by the M(k) subjected to the above relation. Note that the determinant of each Stokes matrix is equal to 1 since there exists a permutation of the basis in which it is triangular with 1 on (k) the diagonal. It follows that det (M(k) ) = exp (2iπTr B0 ). Taking the (k) determinant of eq. (8.20) we see that k Tr B0 is an integer, where the sum includes ∞. In fact there is a stronger compatibility condition for
260
8 Isomonodromic deformations
the existence of Ψ(λ), called the Fuchs condition, stating that this integer vanishes. (k) (∞) Tr (B0 ) + Tr (B0 ) = 0 (8.21) k
To show this we first take the trace of eq. (8.18), getting: (∞)
Tr Mλ
(∞) 1 = Tr ∂z ξ (∞) = Tr B0 + ··· z −
On the other hand, considering the 1/z term in eq. (8.5), we get: (∞)
Tr Mλ
=−
1 (k) Tr A1 + · · · z k
so we get (∞)
Tr B0
=−
(k)
Tr A1
(8.22)
k
Similarly, at each singularity λk at finite distance, we take the trace of eq. (8.17): (k)
Tr Mλ
(k) = Tr ∂λ ξ (k) = Tr B0 −
1 + ··· λ − λk
Considering the 1/(λ − λk ) term in eq. (8.4), we get: (k)
Tr Mλ (k)
=
1 (k) Tr A1 + · · · λ − λk
(k)
so that Tr B0 = Tr A1 . Inserting this into eq. (8.22), we find the Fuchs relation. Our next task is to relate the coefficients of Mλ (λ) and the monodromy data. Definition. We define the monodromy data at λk to be the 2nk Stokes (k) matrices Sj and the connection matrices C (k) . We define the singularity data at λk to be the coefficients of the singular terms ξ (k) . These definitions apply to the point at ∞ as well. It is important to note that given Mλ , the monodromy data are defined only up to a group of diagonal matrices at each λk . In fact we have seen that they are uniquely defined once we have chosen an asymptotic (k) expansion around λk . More precisely, once the most singular part of Mλ
has been diagonalized, there is a canonical asymptotic expansion of the form Ψ_asy^{(k)}(λ) = (1 + O(λ − λ_k)) exp(ξ^{(k)}(λ)). Conjugating the matrix M_λ by a constant matrix, one can assume that the most singular term at ∞ is diagonal. Then there is a canonical choice of Ψ_1^{(∞)}. At the other singular points, however, one has simply Ψ_asy^{(k)}(λ) = g_0^{(k)} (1 + O(λ − λ_k)) exp(ξ^{(k)}(λ)), where g_0^{(k)} is the matrix diagonalizing the leading singularity in M_λ^{(k)}. It is defined only up to multiplication on the right by a diagonal matrix d^{(k)}. Hence the Ψ_1^{(k)} are defined only up to right multiplication by d^{(k)}. This changes ξ^{(k)} → ξ^{(k)}, S_j^{(k)} → d^{(k)\,-1} S_j^{(k)} d^{(k)}, and C^{(k)} → C^{(k)} d^{(k)}. Since the monodromy data are significant only up to this diagonal action, we define the reduced monodromy data to be the quotient set.

We are now in a position to compare the number of parameters in the matrix M_λ and in the reduced monodromy and singularity data set. Assuming that the leading singular coefficient at ∞ is diagonal, the matrix M_λ depends on N²(n_∞ + Σ_{k=1}^{K} n_k + K − 1) + N parameters. On the other hand, each Stokes matrix depends on N(N − 1)/2 parameters, and there are 2n_k such matrices at each singular point, including ∞. Similarly each ξ^{(k)} depends on (n_k + 1)N parameters, while the matrix C^{(k)} (for k = 1, ..., K) depends on N² parameters. Altogether the set of monodromy and singularity data contains N²(n_∞ + Σ_{k=1}^{K} n_k + K) + N(K + 1) parameters. These parameters are not all independent, since we have to take into account the relation eq. (8.20) between the monodromy matrices, which removes N² parameters, and we have to quotient by the diagonal group action at each λ_k, which removes NK parameters. One then gets exactly the number of free parameters in M_λ.
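This bookkeeping can be checked mechanically. The sketch below transcribes the two counts from the text into plain Python and verifies that they agree for a range of hypothetical values of (N, n_∞, {n_k}); the helper names are ours, not the text's.

```python
from itertools import product

def count_M(N, n_inf, n_ks):
    """Parameters in M_lambda, leading coefficient at infinity diagonal."""
    K = len(n_ks)
    return N**2 * (n_inf + sum(n_ks) + K - 1) + N

def count_data(N, n_inf, n_ks):
    """Reduced monodromy + singularity data parameters."""
    K = len(n_ks)
    all_n = list(n_ks) + [n_inf]                      # include infinity
    stokes = sum(2*n * N*(N - 1)//2 for n in all_n)   # 2 n_k Stokes matrices
    xi = sum((n + 1)*N for n in all_n)                # singular terms xi
    conn = K * N**2                                   # C^{(k)}, finite k only
    constraints = N**2 + N*K      # relation (8.20) and the diagonal quotient
    return stokes + xi + conn - constraints

checks = [count_M(N, ni, list(ks)) == count_data(N, ni, list(ks))
          for N in (1, 2, 3)
          for ni in (1, 2, 3)
          for ks in product((1, 2), repeat=2)]
```

The two counts coincide identically, which is the content of the paragraph above.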
It is then reasonable to expect that these two sets of data are equivalent; that is to say, for any monodromy and singularity data satisfying the appropriate consistency conditions, one can find a unique differential equation of the form eqs. (8.3, 8.4) with these given monodromy and singularity data. This is called the generalized Riemann problem, which was first studied by Birkhoff and more recently by Malgrange and Sibuya.

The uniqueness part is easy. Assume that we have two differential equations with the same monodromy and singularity data. Consider corresponding solutions Ψ(λ) and Ψ̃(λ), normalized as (1 + O(λ^{-1})) exp(ξ^{(∞)}(λ)) at ∞. Let us consider P(λ) = Ψ̃(λ) Ψ^{-1}(λ). We see that P(λ) is single valued. The singular parts cancel at λ_k, so that P(λ) is holomorphic at λ_k and therefore holomorphic everywhere. Since P(λ = ∞) = 1, we get P(λ) = 1.
The same argument applied to ∂_λΨ(λ)\, Ψ^{-1}(λ) shows that M_λ is a rational function with a pole of order n_k + 1 at λ_k.
The existence part, i.e. the reconstruction of Ψ from its monodromy and singularity data, is much more difficult. We present in the next section a sketch of this construction in the case of regular singularities, by relating it to the Riemann–Hilbert factorization problem. In the following we shall take the general result for granted, and refer to the literature for its proof. However, assuming that Ψ exists, one can write linear differential equations on Ψ which characterize deformations where the monodromy data are fixed and we vary the singularity data. The compatibility conditions of this system are a set of non-linear differential equations on the coefficients of M_λ. We will directly prove the integrability, in the Frobenius sense, of these equations. Hence there exist local solutions depending on exactly the number of deformation parameters.

8.3 Isomonodromy and the Riemann–Hilbert problem

In the same way that the Riemann–Hilbert factorization problem is fundamental for the study of isospectral integrable systems, it is also the key ingredient in the construction of Ψ with given monodromy. We shall only consider the restricted Riemann problem, i.e. find a differential equation with first order poles whose solutions have a prescribed monodromy group.

Let us fix K points λ_1, ..., λ_K (including λ_K = ∞) on the Riemann sphere, and K matrices M^{(1)}, ..., M^{(K)} whose product M^{(K)} ··· M^{(1)} = 1. We construct a multivalued function Ψ(λ) such that Ψ(λ) → Ψ(λ) M^{(k)} when λ describes a loop around λ_k. Finally, we show that det Ψ ≠ 0 and that ∂_λΨ Ψ^{-1} is rational with simple poles at the λ_k at finite distance.

Following Birkhoff, we draw a simple closed path D visiting all the λ_k, but leaving them outside, and small circles C_k around each λ_k, as in Fig. 8.3. We define matrices M_1, ..., M_{K+1} with M_1 arbitrary, and M_{k+1}^{-1} M_k = M^{(k)}. Since Π_k M^{(k)} = 1 we have M_{K+1} = M_1.
We then define a C^∞ invertible matrix M(λ) on the path D, which is constant, equal to M_k, on the segment between C_{k−1} and C_k, and which inside C_k is a C^∞ interpolation between the two constant values M_k and M_{k+1}. With these data we can consider the following Riemann–Hilbert factorization problem:

\[
U_+(\lambda) = U_-(\lambda)\, M(\lambda) \quad \text{on } D
\]

where U_+(λ) is analytic inside D and U_-(λ) is analytic outside D. As explained in Chapter 3, indices can appear in the solution of the factorization problem, see eq. (3.49). We have absorbed the diagonal matrix of
Fig. 8.3. The path D and the small circles C_k around the λ_k.

indices in U_-, which is thus allowed to have a pole or a zero at ∞; otherwise det U_± ≠ 0. Note that U_+ has an analytic continuation outside D, through the segment of D between C_{k−1} and C_k, given by U_- M_k. This is because, M_k being constant, U_-(λ) M_k is analytic outside D and coincides with U_+ on this segment, hence analytically continues U_+(λ). Similarly, the analytic continuation of U_- through the segment of D between C_k and C_{k+1} is given by U_+ M_{k+1}^{-1}. So, if we perform a loop around λ_k, U_+(λ) gets multiplied on the right by M^{(k)} = M_{k+1}^{-1} M_k. We have obtained a multivalued matrix U(λ), the analytic continuation of U_+(λ), which is analytic outside the C_k, has non-vanishing determinant there, apart possibly from a pole at ∞, and has the given monodromy properties around the C_k.

We now consider a second Riemann–Hilbert factorization problem, relative to the union of the contours C_k. Let Z_k(λ) be the matrix defined inside C_k by:

\[
Z_k(\lambda) = (\lambda - \lambda_k)^{\frac{1}{2i\pi} \log M^{(k)}}
\]

for some determination of the logarithm. Of course, the matrix Z_k(λ) undergoes a right multiplication by M^{(k)} when λ performs a loop around λ_k. We define the functions A_k(λ) on each C_k as U(λ) Z_k^{-1}(λ). Notice that A_k(λ) is single valued and analytic in a vicinity of C_k. Let us denote by A(λ) the collection of functions A_k(λ) on the C_k and consider the
factorization problem:

\[
V_+(\lambda) = V_-(\lambda)\, A(\lambda) \quad \text{on } \cup_k C_k
\]

where V_+(λ) is a set of invertible matrices V_{k+}(λ) analytic inside C_k, with V_{K+} having a possible pole at λ_K = ∞ (we have absorbed the indices here), and V_-(λ) is an invertible matrix analytic outside the C_k. Finally, on each C_k we have V_{k+}(λ) = V_-(λ) A_k(λ). With the solution of these two Riemann–Hilbert problems at hand, we define:

\[
\Psi(\lambda) = V_-(\lambda)\, U(\lambda) \quad \text{outside the } C_k
\]

The analytic continuation of this function inside C_k is V_{k+}(λ) Z_k(λ), by the definition of the second factorization problem. Note that V_{k+}(λ) Z_k(λ) is analytic inside C_k except at λ_k. Finally, Ψ(λ) has the same monodromy around C_k as Z_k(λ), hence has the prescribed monodromies M^{(k)}. It is clear that ∂_λΨ Ψ^{-1} is well-defined and analytic except at the λ_k, where it has a simple pole. This is because inside C_k:

\[
\partial_\lambda \Psi\, \Psi^{-1} = \partial_\lambda V_{k+}\, V_{k+}^{-1} + V_{k+} \left( \partial_\lambda Z_k\, Z_k^{-1} \right) V_{k+}^{-1}
\]

and ∂_λZ_k Z_k^{-1} has a simple pole at λ_k. This means that Ψ is a solution of eq. (8.3) with M_λ having only simple poles at the λ_k, thereby solving the restricted Riemann problem.

There is still a problem left: we cannot be sure that the pole at λ_K = ∞ is simple. When the corresponding monodromy matrix M^{(∞)} is diagonalizable, Birkhoff has shown that one can carefully choose Z_K so as to achieve a simple pole at ∞. The basic idea is that the exponent (1/2iπ) log M^{(k)} is only defined up to integers, and one can choose them to remove the left-over poles. It has subsequently been found, however, that the pole may remain of higher order if M^{(∞)} is not diagonalizable.

The case of irregular singularities has been treated by similar methods by Birkhoff. Alternatively, one can see higher order poles as obtained from first order poles by confluence of singularities. As an example, consider eq. (8.3) when M_λ(λ) is a sum of two nearby polar terms:

\[
M_\lambda(\lambda) = \frac{A_1}{\lambda - \frac{\epsilon}{2}} + \frac{A_2}{\lambda + \frac{\epsilon}{2}} = \frac{A_1 + A_2}{\lambda} + \frac{\epsilon}{2}\, \frac{A_1 - A_2}{\lambda^2} + O(\epsilon^2)
\]
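This confluence expansion can be verified symbolically. The sketch below uses scalar symbols A_1, A_2 as stand-ins for the matrix residues, and assumes the two simple poles sit at λ = ±ε/2 with ε the small separation.

```python
import sympy as sp

lam, eps = sp.symbols('lambda epsilon')
A1, A2 = sp.symbols('A_1 A_2')   # scalar stand-ins for the matrix residues

# two nearby simple poles at lambda = +eps/2 and lambda = -eps/2
M = A1/(lam - eps/2) + A2/(lam + eps/2)

# expand in the separation eps: a double pole appears at first order
expansion = sp.series(M, eps, 0, 2).removeO()
expected = (A1 + A2)/lam + eps*(A1 - A2)/(2*lam**2)
diff = sp.simplify(expansion - expected)
```

At order ε the coefficient of 1/λ² is (A_1 − A_2)/2, i.e. the simple poles merge into a double pole as the text describes.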
We see that it is easy to produce poles of order n by letting n simple poles move to a single point.

8.4 Isomonodromic deformations

As we have seen in the previous section, the matrix Ψ(λ) is determined by its monodromy data and its singularity data at the λ_k. These are two
independent sets of quantities, which can be specified independently. We will examine here the matrices Ψ(λ) with prescribed monodromy data S_j^{(k)}, C^{(k)}, fixed parameters B_0^{(k)}, and varying the parameters t_i in ξ^{(k)} and the singularity positions λ_k. Assuming that Ψ exists, we will show that Ψ(λ, {t_i}, {λ_k}) has to satisfy a hierarchy of equations of the form eq. (8.1) with respect to the times t_i, and new equations, which we call Schlesinger's equations, with respect to the λ_k. Notice that these are all the possible deformation parameters, since once they are fixed, together with the monodromy data, they determine the function Ψ(λ) uniquely.

Let us assume that some multivalued function Ψ(λ) exists, with given monodromy and singularity data. Specifically, suppose that we are given the positions λ_k of the singularities, some Stokes matrices S_j^{(k)} at these singularities (which satisfy the triangularity condition) and some connection matrices C^{(k)}, so that eq. (8.20) is obeyed. By hypothesis, we have an asymptotic expansion of the form Ψ_asy^{(∞)}(λ, t) = (1 + O(1/λ)) e^{ξ^{(∞)}(λ,t)} in the first sector at ∞, and asymptotic expansions in the sector S_j^{(k)} at λ_k of the form:
Ψ(λ) ∼ Ψ(k) asy (λ)Sj
(k) −1 (k)
· · · S1
C
(8.23)
where (k)
(k)
ξ Ψ(k) asy (λ) = g0 (1 + g1 (λ − λk ) + · · ·) e
(k) (λ,t)
(8.24)
With these assumptions, the matrix M_λ = ∂_λΨ Ψ^{-1} is a rational function of λ, as was noticed above. Hence it is of the form eq. (8.4), and the differential equation ∂_λΨ = M_λΨ has a solution Ψ(λ) with the above monodromy and singularity data. Denote by {τ_i} the set of variables {t_i} ∪ {λ_k}, and by d the exterior differentiation with respect to the parameters τ_i, d = Σ_i dτ_i ∂_{τ_i}.

Theorem. The monodromy data for Ψ(λ) are independent of the deformation parameters if and only if the function Ψ(λ) satisfies differential equations with respect to the deformation parameters:

\[
d\Psi = M\, \Psi \tag{8.25}
\]

where M = Σ_i M_i dτ_i is a 1-form whose coefficients M_i are rational functions of λ.
Proof. Let us consider the function Ψ(λ) obeying all the above constraints, and consider the 1-form dΨ(λ)Ψ−1 (λ), as a function of λ. This 1-form is single valued around λk because when we turn around λk , the matrix Ψ(λ) gets multiplied by the monodromy matrix eq. (8.19), which
cancels in dΨ Ψ^{−1}, because it is assumed to be independent of the deformation parameters. Moreover, in the vicinity of λk its asymptotic expansion can be computed in any sector Sj by inserting the asymptotic expansion eq. (8.23), yielding dΨ Ψ^{−1} ∼ dΨ_asy^(k) Ψ_asy^(k)−1, where we again use the independence of the monodromy data from the deformation parameters, and the known fact that one can differentiate an asymptotic expansion valid in a sector. From the explicit form of Ψ_asy^(k) we see that the singularity of the asymptotic expansion of dΨ(λ)Ψ^{−1}(λ) is a pole at λk. Explicitly, ∂ti Ψ(λ) Ψ^{−1}(λ) has a pole of order nk at λk, and ∂λk Ψ(λ) Ψ^{−1}(λ) has a pole of order nk + 1 at λk. Hence M(λ) is a rational function of λ.

Conversely, assume that Mλ(λ) is parametrized by some parameters τi and that one can write equations ∂τi Ψ = Mi(λ)Ψ, where the Mi(λ) are rational functions of λ. Considering, for example, two adjacent sectors S1, S2 at a singularity, the solution Ψ(λ) has the asymptotics Ψ_asy in S1 and Ψ_asy S in S2. Then ∂iΨ has the asymptotics ∂iΨ_asy = Mi Ψ_asy in S1. This relation on Ψ_asy remains true in all sectors. In S2, ∂iΨ has the asymptotics ∂iΨ_asy S + Ψ_asy ∂iS, but this should also be equal to MiΨ, which has the asymptotics Mi Ψ_asy S. Hence ∂iS = 0. Similarly, one shows that the connection matrices C^(k) and the B_0^(k) are independent of the deformation parameters τi.

We now change our point of view and directly study the system of equations:

    ∂λΨ = MλΨ,    ∂τi Ψ = Mi(λ)Ψ        (8.26)

with Mλ of the form eq. (8.4) and the Mi rational matrices. This system of equations is compatible if the rational matrices Mλ, Mi obey the zero-curvature conditions:

    ∂τi Mλ = ∂λMi + [Mi, Mλ]        (8.27)

    ∂τi Mj − ∂τj Mi − [Mi, Mj] = 0        (8.28)
In this case, the Frobenius theorem asserts that one can find solutions Ψ(λ) depending on the maximal number of parameters. In our situation the maximal number of isomonodromic parameters τi and compatible equations that one can introduce is just the set of times ti and the λk . We want to prove the compatibility of the system eqs. (8.27, 8.28) directly, for properly defined Mi , yielding the existence of Ψ(λ; τi ). We start from the rational matrix Mλ of the form eq. (8.4) and consider the differential equation eq. (8.3). Around λk there exists a formal solution
Ψ_asy^(k) = g^(k) exp(ξ^(k)). Hence g^(k) obeys the equation:

    ∂λ g^(k) = Mλ g^(k) − g^(k) ∂λξ^(k)

Next we define the rational matrices:

    Mi(λ) = ( g^(k) ∂τi ξ^(k) g^(k)−1 )_−        (8.29)

where τi stands for t_(k,n,α) and λk. Note that the matrices Mi are algebraic functions of the matrix elements of Mλ: to compute Mi we only need the expansion of g^(k) to some finite order, which is obtained algebraically from Mλ by the recurrence relations eq. (8.13). From the Mi we define vector fields Xi acting on Mλ by:

    Xi Mλ = ∂λMi + [Mi, Mλ]        (8.30)

It is important to notice that the polar structure of the right-hand side of this equation allows us to consider the flows Xi as acting on the coefficients A_j^(k) of Mλ in eq. (8.4) and on the λk. In particular, Xi and ∂λ commute. To show this, one has to examine the order of the pole at λ_{k′} on both sides of eq. (8.30). Let i = (k, n, α). Around λ_{k′}, k′ ≠ k, the matrix Mi is regular and eq. (8.30) is compatible with the pole structure of Mλ at λ_{k′}. Around λk, since ∂λ(A_−) = (∂λA)_− for any rational matrix A(λ), we have

    ∂λMi = ( g^(k) ∂λ∂τi ξ^(k) g^(k)−1 )_− + ( [∂λg^(k) g^(k)−1, g^(k) ∂τi ξ^(k) g^(k)−1] )_−

In the second term we can replace ∂λg^(k) g^(k)−1 = Mλ − g^(k) ∂λξ^(k) g^(k)−1 by Mλ, because g^(k) ∂λξ^(k) g^(k)−1 does not contribute to the commutator, ξ^(k) being diagonal. Similarly, writing g^(k) ∂τi ξ^(k) g^(k)−1 = Mi + ( g^(k) ∂τi ξ^(k) g^(k)−1 )_+, we get

    ∂λMi = ( g^(k) ∂λ∂τi ξ^(k) g^(k)−1 )_− − [Mi, Mλ]_− + [Mλ, ( g^(k) ∂τi ξ^(k) g^(k)−1 )_+]_−

hence

    ∂λMi + [Mi, Mλ] = ( g^(k) ∂λ∂τi ξ^(k) g^(k)−1 )_− + [Mi, Mλ]_+ + [Mλ, ( g^(k) ∂τi ξ^(k) g^(k)−1 )_+]_−

from which we see that the pole structure is the same as that of ∂τi Mλ (it is the term ∂λ∂τi ξ^(k) which controls this assertion; the action of ∂τi on the A_j^(k) will be determined later on).
We are now in a position to prove the main theorem of this section:

Theorem. The flows Xi all commute, and we can identify Xi = ∂τi.

Proof. The commutation of the flows Xi, Xj is expressed by ∂λFij − [Mλ, Fij] = 0, where:

    Fij = Xi Mj − Xj Mi − [Mi, Mj]

We show in fact a stronger result, i.e. Fij = 0. Our first task is to find the action of the flow Xi on the variables g^(k) and ξ^(k) around any pole λk. To do that, we apply Xi to eq. (8.29), getting:

    ∂λ( Xi g^(k) − Mi g^(k) ) = Mλ ( Xi g^(k) − Mi g^(k) ) − ( Xi g^(k) − Mi g^(k) ) ∂λξ^(k) − g^(k) ∂λ( Xi ξ^(k) )

Writing ( Xi g^(k) − Mi g^(k) ) = g^(k) hi and using eq. (8.29), we get for hi the linear equation:

    ∂λhi − [∂λξ^(k), hi] = −Xi ∂λξ^(k)

Since Xi ξ^(k) is diagonal, a particular solution is hi = −Xi ξ^(k). The general solution is obtained by adding to it e^{ξ^(k)} Di e^{−ξ^(k)}, for any matrix Di independent of λ. Through its definition, we see that hi has at most poles at λk, hence Di must be diagonal, otherwise essential singularities appear in e^{ξ^(k)} Di e^{−ξ^(k)}. Finally, we get:

    Xi g^(k) = Mi g^(k) − g^(k) Xi ξ^(k) + g^(k) Di

Looking at the polar part of this equation we see that ( g^(k) Xi ξ^(k) g^(k)−1 )_− = ( g^(k) ∂τi ξ^(k) g^(k)−1 )_−, from which it follows that Xi ξ^(k) = ∂τi ξ^(k), assuming that g_0^(k) is generic. Hence Xi identifies with ∂τi on ξ^(k).

We can now compute Fij. It is simpler to use a compact notation: let M be the 1-form Σ_i Mi dτi, and let δ be the vector field Σ_i dτi Xi with values in differentials. Note that on ξ^(k), δ identifies with d, so that δ²ξ^(k) = 0. The equation on g^(k) becomes, with D = Σ_i Di dτi:

    δg^(k) = M g^(k) − g^(k) δξ^(k) + g^(k) D        (8.31)

The conditions Fij = 0 read δM − M ∧ M = 0. With the help of the above equation one can compute the polar part of δM at λk. Since M^(k) = ( g^(k) δξ^(k) g^(k)−1 )_−, we get:

    δM^(k) = ( g^(k) [D, δξ^(k)] g^(k)−1 )_− + [M, g^(k) δξ^(k) g^(k)−1]_−
where for M = Σ_i Mi dτi, N = Σ_j Nj dτj, we define

    [M, N] = Σ_{i≠j} [Mi, Nj] dτi ∧ dτj

The first term vanishes because it involves commutators of diagonal matrices. To evaluate the second term we remark that M − g^(k) δξ^(k) g^(k)−1 = O(1), hence, squaring it, we get M ∧ M − [M, g^(k) δξ^(k) g^(k)−1] = O(1), since again the commutator of diagonal matrices vanishes. Taking the polar part of this relation, we conclude that [M, g^(k) δξ^(k) g^(k)−1]_− = (M ∧ M)_−, hence δM^(k) = (M ∧ M)_−. Since this is true at each pole λk, and since M is a rational function of λ, we see that Fij = 0 for all i, j.

We have shown that [Xi, Xj] = 0. The Frobenius theorem implies that one can simultaneously solve (locally) ∂τi Mλ = Xi Mλ, thereby obtaining rational matrices Mλ, Mi satisfying eqs. (8.27, 8.28). Since Xi identifies with ∂τi on ξ^(k), the flows so defined are compatible, and ∂τi identifies with Xi everywhere.

Remark 1. The term D in eq. (8.31) appeared because g^(k) is defined only up to right multiplication by a diagonal matrix. This gauge transformation did not affect the calculation above because only gauge-invariant quantities are considered. For any gauge choice, we have an equation of the form eq. (8.31) for some specific D. Now that δ is identified with d = Σ dτi ∂τi, we have d² = 0, and this implies dD = 0. Hence D = dh, so one can choose a gauge where D = 0. In this gauge the evolution equations of g^(k) read:

    dg^(k) = M g^(k) − g^(k) dξ^(k)        (8.32)

This should be compared to eq. (3.44) in Chapter 3.
Remark 2. The compatibility conditions, eqs. (8.27, 8.28), are non-linear differential equations on the coefficients of Mλ. Once we have a complete solution of these equations, one can find Ψ by solving eq. (8.26).

Example. The Schlesinger equations. Consider a differential equation ∂λΨ = MλΨ, where Mλ has only regular singularities, i.e.

    Mλ(λ) = Σ_k A^(k)/(λ − λk),    A^(k) = g_0^(k) B_0^(k) g_0^(k)−1        (8.33)

We assumed, according to the general analysis, that the matrices A^(k) are diagonalizable. Note that there is a hidden regular singularity at ∞. The asymptotic expansions of Ψ(λ) at λk are easily computed and found to be of the form:

    Ψ_asy^(k)(λ) = g_0^(k) (1 + O(λ − λk)) e^{ξ^(k)(λ)},    ξ^(k)(λ) = B_0^(k) log(λ − λk)
This means that the times ti are all set to 0, and that the only deformation parameters are the positions of the poles λk. The deformation equations ∂λk Ψ = Mλk Ψ are constructed with the help of the general formula:

    Mλk = ( Ψ_asy^(k) ∂λk ξ^(k) Ψ_asy^(k)−1 )_− = − A^(k)/(λ − λk)

The zero-curvature conditions read:

    ∂λl A^(k) = [A^(k), A^(l)]/(λk − λl),    l ≠ k        (8.34)

    ∂λk A^(k) = − Σ_{l≠k} [A^(k), A^(l)]/(λk − λl)        (8.35)
These equations are called the Schlesinger equations. One can check easily that both equations are contained in eq. (8.27), while eq. (8.28) is a direct consequence of them, in agreement with the general theory.

8.5 Schlesinger transformations

We have studied all continuous isomonodromic deformations. They are parametrized by the times ti and the λk. However, there remain discrete isomonodromic deformations. The basic remark is that, although the matrices B_0^(k) are not allowed to change continuously, a discrete change B_0^(k) → B_0^(k) + L^(k), where L^(k) is a diagonal matrix with integer entries, does not change the monodromy data. The singularity data are modified by e^{ξ^(k)(λ)} → (λ − λk)^{L^(k)} e^{ξ^(k)(λ)}, i.e. we add extra zeroes or poles at the singularities. This is a discrete analogue of the ti deformations. These transformations are called Schlesinger transformations.

So one can consider that Ψ(λ) depends not only on the continuous variables ti and λk, but also on a set of integers: a diagonal matrix with integer entries above each singularity. For the continuous isomonodromic transformations we have deformation equations dΨ = MΨ, with M a rational matrix. We show that for Schlesinger transformations one can analogously write difference equations with respect to the integers.

Proposition. The two wave-functions Ψ(λ), associated with the data B_0^(k), and Ψ′(λ), associated with the data B_0^(k) + L^(k), are related by:

    Ψ′(λ) = R(λ) Ψ(λ)        (8.36)

where the matrix R(λ) is a rational function of λ.

Proof. Consider the matrix Ψ′(λ) Ψ^{−1}(λ). The essential singularities cancel and it has no monodromy. Hence it is a rational function.
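Before turning to the discrete transformations, here is a quick numerical illustration of the Schlesinger system eqs. (8.34, 8.35), not part of the text's argument: every A^(k) evolves by commutators, so the spectrum of each A^(k) is a constant of motion along any deformation path. A minimal sketch, with randomly chosen (hypothetical) 2×2 matrices and the position λ3 moved by a Runge–Kutta integration:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = [0.0, 1.0, 2.0]                       # positions lambda_1,2,3
A = [rng.standard_normal((2, 2)) for _ in range(3)]

def comm(X, Y):
    return X @ Y - Y @ X

def rhs(l3, As):
    # derivatives of the A^(k) when only lambda_3 moves, eqs. (8.34, 8.35)
    lams = [lam[0], lam[1], l3]
    dA = [comm(As[k], As[2]) / (lams[k] - lams[2]) for k in (0, 1)]
    dA.append(-sum(comm(As[2], As[l]) / (lams[2] - lams[l]) for l in (0, 1)))
    return dA

spec0 = [np.sort_complex(np.linalg.eigvals(M)) for M in A]
l3, h = 2.0, 1e-3
for _ in range(1000):                       # move lambda_3 from 2.0 to 3.0 (RK4)
    k1 = rhs(l3, A)
    k2 = rhs(l3 + h/2, [a + h/2*b for a, b in zip(A, k1)])
    k3 = rhs(l3 + h/2, [a + h/2*b for a, b in zip(A, k2)])
    k4 = rhs(l3 + h, [a + h*b for a, b in zip(A, k3)])
    A = [a + h/6*(b + 2*c + 2*d + e) for a, b, c, d, e in zip(A, k1, k2, k3, k4)]
    l3 += h

# isospectrality: eigenvalues of every A^(k) are unchanged by the flow
for M, s0 in zip(A, spec0):
    assert np.allclose(np.sort_complex(np.linalg.eigvals(M)), s0, atol=1e-8)
```

The conservation is exact for the continuous flow (each right-hand side is a commutator with A^(k)); the small tolerance only absorbs the integration error.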
Note that the integer matrices L^(k) have to be restricted by the Fuchs condition, eq. (8.21):

    Σ_k Tr L^(k) + Tr L^(∞) = 0

To study Schlesinger transformations, it is enough to concentrate on elementary ones:

Definition. An elementary Schlesinger transformation, with data (k α | k′ α′), is associated with the matrices

    L^(l)_(kα|k′α′) = δ_{kl} E_{αα} − δ_{k′l} E_{α′α′}        (8.37)

This shifts the αth diagonal element by +1 above λk and the α′th diagonal element by −1 above λ_{k′}, in order to fulfil the Fuchs condition. In the following we restrict ourselves to Schlesinger transformations involving singularities at finite distance.

Proposition. The matrix R(λ) in eq. (8.36) associated with the elementary Schlesinger transformation eq. (8.37) is of the form:

    R(λ) = 1 − R0/(λ − λ_{k′}),    R^{−1}(λ) = 1 + R0/(λ − λk)        (8.38)

where the matrix R0 is given by:

    R0 = [ (λk − λ_{k′}) / ( g_0^(k′)−1 g_0^(k) )_{α′α} ] g_0^(k) E_{αα′} g_0^(k′)−1,    if k ≠ k′        (8.39)

    R0 = [ 1 / ( g_1^(k) )_{α′α} ] g_0^(k) E_{αα′} g_0^(k)−1,    if k = k′        (8.40)

The matrices g_i^(k) are defined in the asymptotic expansion eq. (8.24). We have R0² = (λk − λ_{k′}) R0.

Proof. The conditions determining R(λ) are

    R(λ) g^(l)(λ) = g′^(l)(λ) (λ − λl)^{L^(l)}        (8.41)

so that R(λ) has a simple pole at λ = λ_{k′}. Similarly, the inverse Schlesinger transform is obtained by changing L^(l) into −L^(l), so that R^{−1}(λ) has a simple pole at λk. Asymptotic expansion at ∞ shows that R(λ) and R^{−1}(λ) tend to 1 at ∞. This motivates eqs. (8.38), which are moreover consistent provided R0² = (λk − λ_{k′}) R0.
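Before completing the proof, the algebra of eqs. (8.38, 8.39) — the relation R0² = (λk − λ_{k′})R0 and the consistency R(λ)R^{−1}(λ) = 1 — can be sanity-checked numerically, with random (hypothetical) matrices standing in for g_0^(k), g_0^(k′):

```python
import numpy as np

rng = np.random.default_rng(2)
N, a, ap = 3, 0, 1                      # size, indices alpha, alpha'
lk, lkp = 0.7, -1.3                     # lambda_k, lambda_k'
g0k = rng.standard_normal((N, N))
g0kp = rng.standard_normal((N, N))

E = np.zeros((N, N)); E[a, ap] = 1.0    # E_{alpha alpha'}
G = np.linalg.inv(g0kp) @ g0k           # g_0^(k')^{-1} g_0^(k)
R0 = (lk - lkp) / G[ap, a] * (g0k @ E @ np.linalg.inv(g0kp))   # eq. (8.39)

I = np.eye(N)
assert np.allclose(R0 @ R0, (lk - lkp) * R0)   # R0^2 = (lam_k - lam_k') R0
z = 0.31                                        # any test point lambda
R = I - R0 / (z - lkp)                          # eq. (8.38)
Rinv = I + R0 / (z - lk)
assert np.allclose(R @ Rinv, I)
```

The first assertion uses only E_{αα′} G E_{αα′} = G_{α′α} E_{αα′}; the second then follows by partial fractions, which is the content of the consistency remark above.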
Suppose first k ≠ k′. Let us write the condition eq. (8.41) in more detail for l = k′, k. It reads:

    ( 1 − R0/(λ − λ_{k′}) ) g^(k′) = g′^(k′) ( 1 + (1/(λ − λ_{k′}) − 1) E_{α′α′} ),    l = k′

    ( 1 − R0/(λ − λ_{k′}) ) g^(k) = g′^(k) ( 1 + (λ − λk − 1) E_{αα} ),    l = k

Looking at the polar terms in the first of these equations, we get R0 g_0^(k′) = −g′_0^(k′) E_{α′α′}, or R0 = −g′_0^(k′) E_{α′α′} g_0^(k′)−1. The matrix element γα of the right-hand side of the second equation vanishes at λ = λk. So we have, using the value of R0:

    ( g_0^(k) )_{γα} + [ 1/(λk − λ_{k′}) ] ( g′_0^(k′) )_{γα′} ( g_0^(k′)−1 g_0^(k) )_{α′α} = 0

Solving for ( g′_0^(k′) )_{γα′} and inserting back into the formula for R0 yields eq. (8.39). With this expression one checks immediately that R0² = (λk − λ_{k′}) R0.

Suppose next k = k′, which implies α ≠ α′ in order to get a non-trivial transformation. The equation determining R(λ) now reads:

    ( 1 − R0/(λ − λk) ) g^(k) = g′^(k) ( Id + (λ − λk − 1) E_{αα} − E_{α′α′} + E_{α′α′}/(λ − λk) )

Comparing the terms of order (λ − λk)^{−1}, one gets R0 = −g′_0^(k) E_{α′α′} g_0^(k)−1 as before. Next, the matrix element γα of the right-hand side vanishes at λ = λk, so that, considering the terms of order (λ − λk)^0 on the left-hand side, one gets (recall that g^(k)(λ) = g_0^(k)(1 + (λ − λk) g_1^(k) + ···)):

    ( g_0^(k) )_{γα} − ( R0 g_0^(k) g_1^(k) )_{γα} = 0

This is eq. (8.40). Here one checks that R0² = 0.

8.6 Tau-functions

Consider the differential equation eq. (8.3), where Mλ is a rational function of λ depending on isomonodromic deformation parameters. At each singularity λk we have asymptotic expansions of the form Ψ(λ) ∼ g^(k)(λ) e^{ξ^(k)(λ)}. With any solution of the deformation equations, eqs. (8.25), we can associate a 1-form Υ:

    Υ = − Σ_k Res_{λ=λk} Tr( g^(k)−1 ∂λg^(k) dξ^(k) ) dλ        (8.42)

The sum is over all singularities, including ∞.
8.6 Tau-functions
Theorem. The deformation equations imply that Υ is closed: dΥ = 0. Proof. We have already proved this equation in a more restricted setting in Chapter 3. Let us repeat the proof of this important result in this more general context. Recall that d is the differential with respect to the isomonodromic deformation parameters ti and λk . Resλk Tr(g (k)−1 dg (k) g (k)−1 ∂λ g (k) dξ (k) − g (k)−1 ∂λ dg (k) dξ (k) )dλ dΥ = k
From the deformation equation eq. (8.32) we get (using that dξ (k) ∧dξ (k) = 0 since the matrix ξ (k) is diagonal): dΥ = Resλk Tr(d∂λ ξ (k) ∧ dξ (k) − ∂λ M ∧ g (k) dξ (k) g (k)−1 )dλ k
The first term vanishes because the order of the pole is at least 3. For the same reason Resλk Tr(∂λ (g (k) dξ (k) g (k)−1 ) ∧ (g (k) dξ (k) g (k)−1 ))dλ = Resλk Tr(d∂λ ξ (k) ∧ dξ (k) )dλ = 0
(8.43)
Next we write g (k) dξ (k) g (k)−1 = M + N (k) , where N (k) , is regular at λk . Then eq. (8.43) reads Resλk Tr(∂λ (M + N (k) ) ∧ (M + N (k) ))dλ = 0 Since the residue of a derivative of a function of λ vanishes, we can replace ∂λ N (k) ∧ M by ∂λ M ∧ N (k) , getting: 1 Resλk Tr(∂λ M ∧ N (k) )dλ = − Resλk Tr(∂λ M ∧ M )dλ 2 It follows that 1 Resλk Tr(∂λ M ∧(M +N (k) ))dλ = − Resλk Tr(∂λ M ∧M )dλ dΥ = − 2 k
k
But now Tr(∂λ M ∧ M )dλ is a rational 1-form on the λ Riemann sphere, hence the sum of the residues vanishes. Example. Let us give the form Υ in the Schlesinger case of regular singularities and deformation parameters λk . In that case (see eq. (8.33)): dξ = −
B (k) 0 dλk λ − λk k
274
8 Isomonodromic deformations
and we have only to keep the constant term in g (k) −1 ∂λ g (k) , yielding: (k) (k) Υ= Tr(g1 B0 )dλk k
Starting from ∂λ Ψ =
l (g
(l) ∂
λξ
(l) g (l) −1 ) Ψ, −
(k)
and expanding
(k)
Ψ = g0 (1 + (λ − λk )g1 + · · ·)eξ we get: (k)
k)
(k)
(k)
(k)−1
g0 (g1 − [B0 , g1 ])g0
=
g (l) B (l) g (l)−1 0
l=k
so that Υ=
(k)
0
0
λ k − λl
dλk − dλl 1 Tr(Ak Al ) 2 λ k − λl k=l
We can verify that this form is closed using the Schlesinger equations eqs. (8.34, 8.35). By the closedness of Υ, we can introduce a function τ ({ti }, {λk }), defined up to a multiplicative constant, by: Definition. The tau-function is defined by Υ = d log τ
(8.44)
With each solution of the deformation equations eq. (8.25), one can associate a tau-function. Hence, with the Schlesinger transformed solution we can associate a transformed tau-function. There is a simple relation between the original tau-function and its transform by an elementary Schlesinger transformation. Proposition. In the gauge eq. (8.32), we have:
(k )−1 (k) 1 g g 0 0 λk −λk k k α α (t) = τ (t) τ
α α (k) g1 αα
if k = k if k = k
(8.45)
where the left-hand side denotes the transform of the tau-function under the elementary Schlesinger transformation eq. (8.37). Proof. Let us denote by Υ the form associated with the transformed solutions. Using eq. (8.41), we can write: Υ − Υ = −E1 − E2 + E3
275
8.6 Tau-functions where we defined the expressions E1 = R−1 ∂λ Rg (l) dξ (l) g (l)−1 l
E2 =
g (l)−1 ∂λ g (l) d(ξ (l) − ξ (l) )
l
E3 =
L(l) (λ − λl )−1 dξ (l)
l
In this section we use the notation X (l) = Resλl Tr(X (l) )dλ. The term E3 vanishes because the pole is of order at least 2. Using ξ (l) = ξ (l) + L(l) log(λ − λl ), the second term E2 is equal to:
(l) (k) (k ) E2 = − Tr(g1 L(l) )dλl = − g1 dλk + g1 dλk αα
l
αα
In the above sum over l, only l = k, k contribute since otherwise L(l) vanishes. To compute the first term E1 , we split it into two parts: E1 = E 1 + E 1 E1 = R−1 ∂λ Rg (l) dξ (l) g (l)−1 l
E
1
=
R−1 ∂λ Rg (l) (dξ (l) − dξ (l) )g (l)−1
l
Using the explicit form for R(λ) we can compute (assuming λk = λk ) 1 R0 1 R0 −1 R ∂λ R = − = (λ − λk )(λ − λk ) λ k − λk λ − λk λ − λk one gets for the first term E1 : 1 1 1 E1 = − R0 g (l) dξ (l) g (l)−1 λ k − λk λ − λk λ − λk l
To evaluate this expression, we use the following identity valid for any ∞ function f (λ)i with an expansion around λl of the form f (λ) = i=−N fi (λ − λl ) : 5 1 −f− (λ)|λ=λk if λl = λk f (λ) = Resλl f0 if λl = λk λ − λk We immediately get: 6 1 Tr (R0 (g (l) dξ (l) g (l)−1 )− )|λ=λk − E1 = λ k − λk l=k
7
+Tr (R0 (g (k) dξ (k) g (k)−1 )0 ) − (k → k )
276
8 Isomonodromic deformations
To rewrite these terms, consider the equation of motion eq. (8.32) and expand it around λ = λk . We find (k) (k)−1 (k) (k) (k)−1 dg0 g0 −g0 g1 g0 dλk = Ml |λ=λk +(M (k) −g (k) dξ (k) g (k)−1 )|λ=λk l=k
Now we have Ml |λ=λk =
(g (l) dξ (l) g (l)−1 )− |λ=λk
and
(M (k) − g (k) dξ (k) g (k)−1 )|λ=λk = −(g (k) dξ (k) g (k)−1 )0 so that we can rewrite
7 −1 6 (k) (k)−1 (k) (k) (k)−1 E1 = − g0 g1 g0 dλk − (k → k ) Tr R0 dg0 g0 λk − λk Similarly, the term E 1 reads: 6 (l) 1 (l) L (l)−1 E 1 = Tr(R0 g0 g )dλl λk − λk λk − λl 0 l=k
(k) (k) (k)−1 −Tr(R0 g0 [g1 , L(k) ]g0 )dλk
7 − (k → k )
In the right-hand side, only l = k, k contribute to the sums over l because L(l) vanishes otherwise. After substituting the explicit value of R0 , they produce the contribution dλk − dλk λk − λ k The terms depending on g1 in E 1 give:
(k) (k ) dλk − g1 dλk g1 αα α α
(k )−1 (k) (k) (k ) (k )−1 (k) g0 g0 g1 dλ − g g g dλk k 1 0 0 α α α α
− (k )−1 (k) g0 g0 αα
E
they cancel with those coming from 1 and E2 . Hence, putting everything together, we get: (k) (k)−1 (k ) (k )−1 dλk − dλk dg g − dg g 0 0 − Υ − Υ = Tr R0 0 0 λk − λk λk − λk
or τ
(k )−1 (k) g0
g0
αα = d log τ λk − λ k Integrating the above equation proves eq. (8.45) for k = k . The integration constant has been normalized to 1. The case k = k is proved similarly.
d log
8.7 Riccati equation
Notice that the right-hand side of eqs. (8.45) is the product of τ(t) by the leading term in the expansion of

    G^(kk′)(λ, λ′) = [ δ_{kk′} Id − g^(k)−1(λ) g^(k′)(λ′) ] / (λ − λ′)        (8.46)

in powers of zk = λ − λk and z_{k′} = λ′ − λ_{k′}. This double expansion has only positive powers of zk and z_{k′}. This is clear when k ≠ k′, and for k = k′ the zero of the denominator is cancelled by a zero of the numerator. The matrix elements of G^(kk′)(λ, λ′) are algebraic functions of the dynamical variables occurring in Mλ. We can recast the equations of motion of the hierarchy in terms of these new variables. They take a particularly simple Riccati-type form.

Let us consider the generating function for the flows associated with the pole λl:

    ∇α^(l)(λ) = Σ_{n>0} (λ − λl)^{n−1} ∂/∂t_(l,n,α)        (8.47)

Strictly speaking, in our formalism there was a finite number of times t_(k,n,α), with n ≤ nk. We now consider, formally, differential equations ∂λΨ − MλΨ = 0, where Mλ is allowed to have poles of arbitrary order at each λk.
Proposition. The quantity G^(kk′)(λ, λ′) defined in eq. (8.46) obeys the Riccati-type equation:

    ∇α^(l)(λ″) G^(kk′)(λ, λ′) = G^(kl)(λ, λ″) Eαα G^(lk′)(λ″, λ′)
        + δ_{kl} Eαα [ G^(kk′)(λ, λ′) − G^(kk′)(λ″, λ′) ] / (λ − λ″)
        − δ_{k′l} [ G^(kk′)(λ, λ′) − G^(kk′)(λ, λ″) ] Eαα / (λ′ − λ″)        (8.48)

Similarly, the equation of motion relative to the position of the pole λl takes the form:

    ∂λl G^(kk′)(λ, λ′) = Res_{λ″=λl} [ G^(kl)(λ, λ″) ∂λl ξ^(l)(λ″) G^(lk′)(λ″, λ′) ]
        + δ_{kl} ( ∂λk ξ^(k)(λ) G^(kk′)(λ, λ′) )_+ − δ_{k′l} ( G^(kk′)(λ, λ′) ∂λ_{k′} ξ^(k′)(λ′) )_+

where ( )_+ means taking the positive-power part of the expansions around λk and λ_{k′} respectively.
Proof. We need the following identities: let f(λ) = Σ_{j=1}^{∞} fj λ^j; then we have

    Σ_{n=1}^{∞} λ′^{n−1} ( λ^{−n} f(λ) )_+ = [ f(λ) − f(λ′) ] / (λ − λ′),    Σ_{n=1}^{∞} λ′^{n−1} ( λ^{−n} f(λ) )_− = f(λ′) / (λ − λ′)
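For a polynomial truncation f(λ) = Σ_{j=1}^{J} fj λ^j, the first of these identities can be confirmed symbolically; a small sketch (x, xp stand for λ, λ′):

```python
import sympy as sp

x, xp = sp.symbols('x xp')
J = 6
fc = sp.symbols(f'f1:{J+1}')                    # coefficients f1..f6
f = sum(fc[j-1] * x**j for j in range(1, J+1))

# sum over n of xp^(n-1) * (x^-n f)_+  where (.)_+ keeps powers >= 0
lhs = 0
for n in range(1, J+1):
    plus_part = sum(fc[j-1] * x**(j-n) for j in range(n, J+1))
    lhs += xp**(n-1) * plus_part
rhs = (f - f.subs(x, xp)) / (x - xp)
assert sp.simplify(lhs - rhs) == 0
```

The identity reduces to the geometric sum Σ_{n=1}^{j} λ′^{n−1} λ^{j−n} = (λ^j − λ′^j)/(λ − λ′), applied coefficient by coefficient.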
Recalling the equations of the hierarchy expressed on the g^(k):

    ∂g^(k)(λ)/∂t_(l,n,α) = ( g^(l) Eαα (λ − λl)^{−n} g^(l)−1 )_− g^(k) − δ_{kl} g^(k) Eαα (λ − λk)^{−n}

we see that they can be recast in the form:

    ∇α^(l)(λ′) g^(k)(λ) = g^(l)(λ′) Eαα g^(l)−1(λ′) g^(k)(λ) / (λ − λ′) − δ_{kl} g^(k)(λ) Eαα / (λ − λ′)

which proves the first part of the proposition. Similarly, using the identity, valid for a function f(λ) = Σ_{n=−∞}^{∞} fn λ^n,

    f_−(λ) = Res_{λ″=0} [ f(λ″) / (λ − λ″) ]

we can write the equation of motion for g^(k) in the form:

    ∂λl g^(k)(λ) = Res_{λ″=λl} [ g^(l)(λ″) ∂λl ξ^(l)(λ″) g^(l)−1(λ″) / (λ − λ″) ] · g^(k)(λ) − δ_{kl} g^(k)(λ) ∂λk ξ^(k)(λ)

from which the second statement follows.

We will need, in the next section, the limits of eqs. (8.48) when l = k, λ″ → λ and l = k′, λ″ → λ′. We get respectively (if k ≠ k′):

    ∇α^(k)(λ) G^(kk′)(λ, λ′) = G^(kk)(λ, λ) Eαα G^(kk′)(λ, λ′) + Eαα ∂λ G^(kk′)(λ, λ′)        (8.49)

    ∇α^(k′)(λ′) G^(kk′)(λ, λ′) = G^(kk′)(λ, λ′) Eαα G^(k′k′)(λ′, λ′) − ∂λ′ G^(kk′)(λ, λ′) Eαα        (8.50)

8.8 Sato's formula

It is remarkable that the complete matrix G^(kk′)(λ, λ′) can be reconstructed from the tau-function and its elementary Schlesinger transforms. As a consequence, we can express the matrix elements of G^(kk′)(λ, λ′) as
quotients of tau-functions, as in Sato's formula. We still denote zk = λ − λk and z_{k′} = λ′ − λ_{k′}, and introduce the notation:

    t → t + [zk]α    meaning    t_(l,n,γ) → t_(l,n,γ) + δ_{kl} δ_{γα} zk^n / n

Proposition. Denote by G_{αα′}^(kk′)(λ, λ′) the matrix element αα′ of G^(kk′)(λ, λ′). We have:

    τ(t) G_{αα′}^(kk′)(λ, λ′) = τ_(kα|k′α′)( t + [zk]α − [z_{k′}]α′ ),    if (k, α) ≠ (k′, α′)

    τ(t) G_{αα}^(kk)(λ, λ′) = [ τ(t) − τ( t + [zk]α − [z′k]α ) ] / (λ − λ′)        (8.51)

where z′k = λ′ − λk.

Proof. This is a generalization of the proof of eq. (3.61) in Chapter 3. From the definition of the tau-function, eq. (8.44), we have:

    ∂ log τ / ∂t_(l,n,α) = −Res_{λl} Tr( g^(l)−1(λ) ∂λg^(l)(λ) Eαα (λ − λl)^{−n} ) dλ

Using the identity eq. (3.62) in Chapter 3, we get:

    ∇α^(k)(λ) log τ = −Tr( g^(k)−1(λ) ∂λg^(k)(λ) Eαα ) = −G_{αα}^(kk)(λ, λ)

From this it follows, using eqs. (8.49, 8.50) and the definition of G^(kk′)(λ, λ′), that:

    ( ∇α^(k)(λ) − ∂λ ) [ τ(t) G_{αα′}^(kk′)(λ, λ′) ] = 0        (8.52)

    ( ∇α′^(k′)(λ′) + ∂λ′ ) [ τ(t) G_{αα′}^(kk′)(λ, λ′) ] = 0        (8.53)

These are differential equations relating the λ-dependence to the time dependence. Their unique solution allows us to express τ(t) G^(kk′)(λ, λ′) in the form:

    τ(t) G_{αα′}^(kk′)(λ, λ′) = τ_{αα′}( t + [zk]α − [z_{k′}]α′ )

To find the functions τ_{αα′}, it is enough to compare the two sides of the equation at zk = z_{k′} = 0. But there, comparing with eq. (8.45), we find

    τ(t) G_{αα′}^(kk′)(λ, λ′) → τ_(kα|k′α′)(t)

This proves the proposition if (k, α) ≠ (k′, α′). If k = k′, the right-hand side of eq. (8.49) contains the extra term

    − [ G^(kk)(λ, λ′) − G^(kk)(λ, λ) ] Eαα / (λ − λ′)
If α ≠ α′, this does not affect eq. (8.52), but if α = α′ it becomes

    ( ∇α^(k)(λ) − ∂λ ) [ τ(t) ( (λ − λ′) G_{αα}^(kk)(λ, λ′) − 1 ) ] = 0

Using the analogous equation for λ′, we deduce that

    τ(t) [ (λ − λ′) G_{αα}^(kk)(λ, λ′) − 1 ] = τα( t + [zk]α − [z′k]α )

To find the function τα(t), we notice that τ(t)[(λ − λ′) G_{αα}^(kk)(λ, λ′) − 1] → −τ(t) when λ → λk and λ′ → λk; hence τα(t) = −τ(t). This yields the second half of the proposition.
Many remarkable relations can be extracted from this result. In particular, setting k = k′, zk = 0, and introducing the matrix h^(k)(λ) defined by g^(k)(λ) = g_0^(k) h^(k)(λ), we find:

    h_{αα′}^(k)(λ) = (λ − λk) τ_(kα|kα′)( t − [zk]α′ ) / τ(t),    α ≠ α′

    h_{αα}^(k)(λ) = τ( t − [zk]α ) / τ(t)        (8.54)

We have already met these equations in Chapter 3; they are the Sato formulae. We see that we have completely identified the functions τ_{αα′} occurring in the numerator of eq. (3.61) as the Schlesinger transforms of the tau-function in the denominator.

8.9 The Hirota equations

Hirota noticed that many integrable equations can be recast into a bilinear form in terms of tau-functions. Specifically, one introduces the Hirota differential operators Di with the definition:

    Di^n f · g = (∂/∂yi)^n f(x + y) g(x − y) |_{y=0}        (8.55)

The equations of motion then take the symbolic form P(D) τ · τ = 0, where P is a polynomial in D. For instance, the equation

    ( D1^4 + 3 D2^2 − 4 D1 D3 ) τ · τ = 0        (8.56)

is the Hirota form of the KP equation. As a matter of fact, setting

    u = −2 ∂² log τ / ∂t1²        (8.57)
we get the Kadomtsev–Petviashvili equation:

    3 ∂²u/∂t2² + ∂/∂t1 ( ∂³u/∂t1³ − 6u ∂u/∂t1 − 4 ∂u/∂t3 ) = 0        (8.58)
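Both statements can be checked symbolically on the standard one-soliton tau-function τ = 1 + e^η, η = k t1 + k² t2 + k³ t3, used here purely as an illustrative test case:

```python
import sympy as sp

t1, t2, t3, y1, y2, y3, k = sp.symbols('t1 t2 t3 y1 y2 y3 k')

def tau(a, b, c):
    # one-soliton tau-function of the KP hierarchy
    return 1 + sp.exp(k*a + k**2*b + k**3*c)

# bilinear form (8.56): apply P(D) = D1^4 + 3 D2^2 - 4 D1 D3 to tau . tau
fg = tau(t1 + y1, t2 + y2, t3 + y3) * tau(t1 - y1, t2 - y2, t3 - y3)
hirota = (sp.diff(fg, y1, 4) + 3*sp.diff(fg, y2, 2)
          - 4*sp.diff(fg, y1, y3)).subs({y1: 0, y2: 0, y3: 0})
assert sp.simplify(hirota) == 0

# eqs. (8.57, 8.58): u = -2 (log tau)_{t1 t1} solves the KP equation
u = -2*sp.diff(sp.log(tau(t1, t2, t3)), t1, 2)
kp = 3*sp.diff(u, t2, 2) + sp.diff(
    sp.diff(u, t1, 3) - 6*u*sp.diff(u, t1) - 4*sp.diff(u, t3), t1)
assert sp.simplify(kp) == 0
```

For this ansatz, the bilinear equation reduces to the dispersion relation k⁴ + 3(k²)² − 4k·k³ = 0, which holds identically.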
We show in this section that this is a general phenomenon. We have the:

Proposition. In terms of the tau-function and its elementary Schlesinger transforms, the hierarchy equations take the Hirota bilinear form.

Proof. The proof is just a rewriting of the Riccati equation eq. (8.48) in terms of tau-functions. Let us do it in a simple case (the other cases are similar). We assume for simplicity that l, k, k′ are all different. Multiplying eq. (8.48) by τ²(t), we get:

    τ²(t) ∇β^(l)(λ″) [ τ_(kα|k′α′)( t + [zk]α − [z_{k′}]α′ ) / τ(t) ]
        = τ_(kα|lβ)( t + [zk]α − [zl]β ) τ_(lβ|k′α′)( t + [zl]β − [z_{k′}]α′ )

where zl = λ″ − λl. The left-hand side can be rewritten in terms of Hirota differential operators, using the identity:

    f² ∂/∂t ( g/f ) = −( ḟ g − f ġ ) = −Dt f · g

Introducing the generating function for Hirota differential operators:

    Dα^(l)(λ′) = Σ_{n>0} (λ′ − λl)^{n−1} D_{t_(l,n,α)}

and shifting the variables t by t → t − [zk]α/2 + [z_{k′}]α′/2, we get:

    Dβ^(l)(λ″) τ_(kα|k′α′)( t − [zk]α/2 + [z_{k′}]α′/2 ) · τ( t + [zk]α/2 − [z_{k′}]α′/2 )
        = −τ_(kα|lβ)( t + [zk]α/2 + [z_{k′}]α′/2 − [zl]β ) × τ_(lβ|k′α′)( t − [zk]α/2 − [z_{k′}]α′/2 + [zl]β )

We now expand this formula in powers of zk, z_{k′} and zl, using the equation

    f(t + z) g(t − z) = e^{z Dt} f · g
It is clear that the coefficients in this expansion have the form of Hirota bilinear equations.

Exactly the same method applies to the other Riccati equations, and shows that they can be recast in Hirota form. This form is very remarkable, as it allows for some easy particular solutions and, moreover, lends itself to a beautiful geometric interpretation as Plücker relations in an infinite Grassmannian, which will be described in Chapter 9.

8.10 Tau-functions and theta-functions

In this section we show how the Lax matrix approach fits into the isomonodromy approach. We show that it corresponds to very special matrices Mλ. In that case, the tau-functions are essentially Riemann theta-functions.

We start from a rational Lax matrix L(λ) of size N × N, with poles at λk. We consider the associated spectral curve, Γ : det(L(λ) − µ) = 0, which is a compact Riemann surface presented as an N-sheeted branched covering of the Riemann sphere. For any value of the spectral parameter λ one may consider the (multivalued) N × N matrix Ψ̂(λ) given by (Ψ(P1), …, Ψ(PN)), where the Pj are the N points above λ in some order, and Ψ(Pj) is the eigenvector of L(λ) for the corresponding eigenvalue µj(λ). It has been explained in Chapter 5 that requiring the time evolution equations ∂ti Ψ = Mi Ψ implies that the components of the eigenvector are Baker–Akhiezer functions on Γ. In general, they have essential singularities at all points on Γ above the λk. Around each puncture λk, the matrix Ψ̂(λ) has an expansion of the form Ψ̂(λ) = g^(k)(λ) exp(ξ^(k)(λ)), where g^(k) is regular at λ = λk and ξ^(k) is a diagonal matrix singular at λk:

    ξ^(k)(λ) = Σ_{n,α} t_(k,n,α) Eαα / (λ − λk)^n
Here the t_(k,n,α) are the times describing all the integrable flows of the hierarchy. Moreover, Ψ(P) has g + N − 1 poles on the Riemann surface: g of them, at finite distance, form the dynamical divisor D, and the other N − 1 are at the points Qi, i = 2, …, N, above λ = ∞. Note that at λk, Ψ̂(λ) has the behaviour considered in this chapter. Hence it is natural to consider the matrix Mλ = ∂λΨ̂ · Ψ̂^{−1}.
8.10 Tau-functions and theta-functions
283
is multivalued, i.e. its columns Proof. The main point is that while Ψ(λ) undergo permutations when one performs a loop around a branch point, Ψ −1 is independent of the ordering of the columns of Ψ(λ), ∂λ Ψ· hence it is well-defined as a function of λ. Using the expansions around the punctures λk we see that the singular part of Mλ at λk is given by (g (k) ∂λ ξ (k) g (k)−1 )− . It has a finite order pole if there are a finite number of time variables. Let us consider now a branch point, and for simplicity assume that it is of order 2, that is we assume that the first two columns of Ψ(λ) coalesce for λ = λb . The corresponding eigenvalues (µ1 , µ2 , . . .) are such that µ1 , µ2 also coalesce. Locally the equation of Γ√is of the form λ−λb = (µ−µb )2 and the local parameter is z = µ − µb = λ − λb . The first two eigenvectors are just the evaluation of one meromorphic vector valued function of z on the two sheets above λ. Splitting the even and odd powers of z, we can write the matrix Ψ(λ) in the form: √ √ Ψ(λ) = (Ψe (z) + zΨo (z), Ψe (z) − zΨo (z), Ψ3 (z), . . .) 1 1 0 ··· 1 √0 0 · · · 0 z 0 ··· 1 −1 0 · · · = (Ψe (z), Ψo (z), Ψ3 (z), . . .) 0 0 1 0 0 1 .. .. .. .. .. .. . . . . . . The first matrix g(z) = (Ψe (z), Ψo (z), Ψ3 (z), . . .) is regular around z = 0. The third matrix is an inessential invertible constant matrix, and the second matrix can be identified with exp ξ(λ) with ξ(λ) = 12 log (λ − λb )E22 . This produces in Mλ a polar part (g∂λ ξg −1 )− =
1 g(λb )E22 g −1 (λb ) 2 (λ − λb )
At a pole of Ψ(λ) (above λc ) at finite distance, one column (say the j th one) has a pole. Hence one can write, up to right multiplication by a constant invertible matrix, Ψ(λ) = g(λ) exp (ξ(λ)) with ξ(λ) = − log (λ − λc )Ejj . This again yields a simple pole in Mλ : (g∂λ ξg −1 )− = −
g(λc )Ejj g −1 (λc ) (λ − λc )
Finally, above λ = ∞, in the setup of Chapter 5, we have N points Q1 , . . . , QN on Γ, and Ψ(λ) has simple poles at the N − 1 points Q2 , . . . , QN . When λ → ∞, the wave function Ψ(λ) has the asymptotic (∞) (∞) (∞) expansion g0 (1 + O(1/λ)) exp (B0 log (1/λ))C , where C (∞) is a (∞) constant invertible matrix and B0 = − N i=2 Eii . This implies that
284
8 Isomonodromic deformations
around λ = ∞ the matrix Mλ has the form: (g∂λ ξg −1 )− = −
(∞)
g0
(∞) (∞) −1 g0
B0
λ
+ O(1/λ2 )
It is now clear that Mλ is a rational function of λ vanishing at ∞ and is the sum of its polar parts at finite distance. Remark 1. Note that the last statement of the proof implies that Mλ tends to 0 at ∞, so we must have: (k) 1 1 (∞) (∞) (∞) −1 (k) (k) −1 Mλ = g ∂λ ξ g = − g0 B0 g0 +O 2 λ λ − k
where the sum over k runs over poles at finite distance. In particular, looking at the 1/λ terms and taking the trace one gets the Fuchs relation eq. (8.21) again. It is interesting (∞) to check it in this context. First we have Tr B0 = −N +1. The poles at finite distance are the branch points, each one contributing 1/2 to the trace, and the g points of D, each one contributing −1. Finally, at the punctures, we have Tr (g (k) (λ)∂λ ξ (k) g (k) −1 (λ))− = Tr ∂λ ξ (k) = O(1/λ2 ) at ∞, hence they do not contribute to the 1/λ terms. The Fuchs condition therefore reads g = ν/2 − N + 1. This is just the Riemann–Hurwitz formula.
Remark 2. The matrix Mλ(λ) has more poles than L(λ). However, at the extra poles, namely the poles of Ψ and the branch points, the singularity of Mλ is of regular type. The corresponding singularity data ξ contains only a logarithmic term. At the poles of L(λ) we have the whole singularity structure for singularities of irregular type, but without a logarithmic term. The matrix Mλ(λ) embodies the data allowing us to reconstruct Ψ. In particular, it contains data pertaining to the spectral curve, through its branch points, and data pertaining to the eigenvector bundle, through the divisor D.
It follows from the Proposition that the matrix Ψ(λ) constructed from the eigenvector bundle satisfies a differential equation (∂λ − Mλ(λ))Ψ(λ) = 0, where Mλ is a rational function of λ of the type studied in this chapter. However, Mλ is a very particular function of λ since the solution Ψ(λ) has no monodromy at the punctures, i.e. at the poles of L(λ). Its only non-trivial monodromy occurs around the branch points, and acts by permutations of corresponding columns. Globally, the monodromy group is finite, and this is a special feature of the differential equations coming from Lax equations. Finally, there are no Stokes matrices at any singularity. Hence the monodromy data, as defined above, consist only of the connection matrices C^(k), which are time-independent matrices, functions of the moduli of the spectral curve. The general theory nevertheless applies to this very special situation, and in particular the tau-functions can be computed explicitly in terms
8.10 Tau-functions and theta-functions
of theta-functions. We have already met this situation in Chapter 5, but we make here the analysis in a broader context in order to be able to understand the action of Schlesinger transformations. We will see that Schlesinger transformations reduce to very simple translations in the argument of the theta-functions, as we now show.

Let Pkα be the N points above λk. We take zk = λ − λk as local parameter around each Pkα. We introduce the singular parts

$$\xi_{k\alpha}(z_k) = l_{k\alpha}\log(z_k) + \sum_{n\ge1} t^{(k,n,\alpha)}z_k^{-n} \qquad (8.59)$$

We introduced logarithmic terms as in eq. (8.6), but we assume that the coefficients lkα are integers in order to be able to construct Baker–Akhiezer functions. These logarithmic terms will introduce extra zeroes or poles in the Baker–Akhiezer functions at the punctures, and will be useful to help us understand Schlesinger transformations. Consider Baker–Akhiezer functions with singular parts given by eq. (8.59) at each Pkα. We introduce poles at the g + N − 1 given points D = (γ1, ..., γg) (the dynamical divisor) and Q2, ..., QN (above ∞). We choose the numbers lkα such that $\sum_{k\alpha}l_{k\alpha} = 0$ so that the degree of the divisor of prescribed zeroes and poles is still g + N − 1. The dimension of the space of such functions is N, by the Riemann–Roch theorem. We fix a particular λk and consider the N sheets above it. For each α, one can define a unique Baker–Akhiezer function $\psi^{(k)}_\alpha(P)$ satisfying the N conditions:

$$\psi^{(k)}_\alpha(P) = e^{\xi_{k\beta}(z_k)}\big(\delta_{\alpha\beta} + O(z_k)\big), \qquad \text{when } P\to P_{k\beta}$$
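The Riemann–Roch dimension count invoked here can be checked mechanically. The helper below is purely illustrative: for a nonspecial divisor of degree d on a genus-g curve, dim L(D) = d − g + 1, and with d = g + N − 1 this gives exactly N:

```python
# Riemann-Roch for a nonspecial divisor D of degree d on a genus-g curve:
#   dim L(D) = d - g + 1
def dim_L(d, g):
    return d - g + 1

# the Baker-Akhiezer divisor has degree g + N - 1 (the l_{k,alpha} sum to zero,
# so the logarithmic factors do not change the degree)
for g in range(6):
    for N in range(1, 6):
        assert dim_L(g + N - 1, g) == N
print("space of Baker-Akhiezer functions has dimension N")
```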
We put these N functions in a column vector Ψ^(k)(P) and form as usual the N × N matrix

$$\widetilde\Psi^{(k)}(\lambda) = \big(\Psi^{(k)}(P_1), \ldots, \Psi^{(k)}(P_N)\big) \qquad (8.60)$$

where the Pi are the N points above λ. Note that, around our particular λk, we have the expansion

$$\widetilde\Psi^{(k)}(\lambda) = h^{(k)}(\lambda)\,e^{\xi^{(k)}(z_k)} \qquad (8.61)$$

with $h^{(k)}(\lambda) = 1 + O(z_k)$ and $\xi^{(k)}$ being the diagonal matrix $\sum_\alpha \xi_{k\alpha}E_{\alpha\alpha}$, i.e. we have chosen the normalization $g_0^{(k)} = 1$ in the expansion eq. (8.24). We are going to identify the tau-function by comparing the expression of the matrix $\widetilde\Psi^{(k)}$ in terms of theta-functions with the expression eq. (8.54).
Proposition. The Baker–Akhiezer function $\psi^{(k)}_\alpha(P)$ can be written in terms of theta-functions as:

$$\psi^{(k)}_\alpha(P) = e^{\int_{P_{k\alpha}}^{P}\Omega}\cdot
\frac{\theta(A(P)+V-K)\,\theta(A(P_{k\alpha})-A(D)-K)}{\theta(A(P_{k\alpha})+V-K)\,\theta(A(P)-A(D)-K)} \qquad (8.62)$$

where the vector V is given by:

$$V = \sum_{k,n,\alpha} t^{(k,n,\alpha)}U^{(k,n,\alpha)} + \sum_{k'\beta} l_{k'\beta}A(P_{k'\beta}) - A(P_{k\alpha}) + A(Q_1) - A(D)$$
Proof. Following the general procedure of Chapter 5, we construct

$$\psi^{(k)}_\alpha(P) = C\cdot e^{\int_{P_{k\alpha}}^{P}\Omega}\cdot\frac{\theta(A(P)+V-K)}{\theta(A(P)-A(D)-K)}$$

where C is a normalization constant, Ω is an Abelian differential chosen so that $\psi^{(k)}_\alpha$ has the required properties at the punctures and at the Qj, the theta-function at the denominator has been introduced to take care of the poles at the dynamical divisor D (K is the vector of Riemann's constants), and the vector V in the theta-function in the numerator is determined by requiring that the resulting function has no monodromy.

Let us first determine the form Ω. We write it as a sum of three pieces Ω = Ω^(t) + Ω^(l) + Ω^(q), where Ω^(t) ensures that $\psi^{(k)}_\alpha(P)$ has the correct essential singularities at the punctures. One can write

$$\Omega^{(t)} = \sum_{k',n,\beta} t^{(k',n,\beta)}\,\Omega^{(n)}_{P_{k'\beta}} \qquad (8.63)$$

where $\Omega^{(n)}_{P_{k'\beta}}$ is the normalized (i.e. the a-periods vanish) second kind Abelian differential with just one singularity at $P_{k'\beta}$ and such that around this point $\Omega^{(n)}_{P_{k'\beta}} = d(z_{k'}^{-n}) + \text{holomorphic}$. Note that the integral $\int_{P_{k\alpha}}^{P}\Omega^{(n)}_{P_{k'\beta}}$ is ill-defined. We take as its definition the unique primitive which around Pkα behaves as $z_k^{-n} + O(z_k)$ (no constant term). The form Ω^(l) is computed to produce a zero or pole of order $l_{k'\beta}$ at the puncture $P_{k'\beta}$. Noting that $\sum l_{k'\beta} = 0$, it is the unique normalized Abelian differential of the third kind with first order poles at these points with corresponding residues $l_{k'\beta}$. As in the previous case, the integral of this form with origin at Pkα is ill-defined. We take it to be the primitive behaving as $l_{k\alpha}\log(z_k) + O(z_k)$, mod 2iπ. Finally, the form Ω^(q) is introduced to get N − 1 poles at the points Q2, ..., QN and N − 1 zeroes at the points Pkβ with β ≠ α (for the special k). It is given by the unique third kind differential with residues −1 at the Qj and +1 at the Pkβ.
It is now easy to compute the monodromy of the function $\exp(\int_{P_{k\alpha}}^{P}\Omega)$. Since the differentials are normalized, there is no monodromy around the a-cycles. Around the cycle bj the monodromy is given by $\exp(\int_{b_j}\Omega)$. Using the monodromy property of theta-functions, eq. (15.14) in Chapter 15, we can cancel this monodromy by taking:

$$V_j = \frac{1}{2i\pi}\int_{b_j}\Omega - A_j(D)$$

Decomposing Ω into its three components, we note that $\int_{b_j}\Omega^{(t)}$ is a linear form in the times with coefficients independent of the indices (k, α) since all punctures enter symmetrically in its definition. One can compute more precisely the other two contributions by using Riemann's bilinear identity for third kind differentials. Let Ω3 be a third kind differential with first order poles at some points Pl with residues rl ($\sum r_l = 0$). The Riemann bilinear identity reads:

$$\int_{b_j}\Omega_3 = 2i\pi\sum_l r_l A_j(P_l)$$

Applying this to the form Ω^(l), one gets:

$$\frac{1}{2i\pi}\int_{b_j}\Omega^{(l)} = \sum_{k'\beta} l_{k'\beta}A_j(P_{k'\beta})$$

where the last sum is over all punctures. Similarly, we find:

$$\frac{1}{2i\pi}\int_{b_j}\Omega^{(q)} = \sum_{\beta\ne\alpha}A_j(P_{k\beta}) - \sum_{l=2}^{N}A_j(Q_l) = -A_j(P_{k\alpha}) + A_j(Q_1)$$

To get the last equation, note that if P1, ..., PN are the N points above some λ0 then the Abel sum $\sum_j A(P_j)$ is a constant independent of λ0. This is because the meromorphic function on Γ: f(P) = (λ − λ0)/(λ − λ1) has zeroes at the points above λ0 and poles at the points above λ1, hence these two divisors are mapped to the same point in Jac(Γ) due to the Abel theorem. In particular $\sum_\beta A_j(P_{k\beta}) = \sum_{l=1}^{N}A_j(Q_l)$.

There remains to compute the normalization constant C, such that $\psi^{(k)}_\alpha(P) = e^{\xi_{k\alpha}(z_k)}(1 + O(z_k))$ when P → Pkα. Thanks to our definition of the primitive of the form Ω, we have $\int_{P_{k\alpha}}^{P}\Omega = \xi_{k\alpha}(z_k) + O(z_k)$. Hence we need only to normalize the quotient of theta-functions, and we find the final result, eq. (8.62).
To identify the tau-function, we have to expand this formula for P → Pkβ (β may be equal to α) in powers of zk and compare the result with eq. (8.54).

Proposition. The tau-function associated with the algebro-geometric integrable system is given by:

$$\tau(t) = e^{\sigma(t,t)+\rho(t)}\,\theta(U\cdot t + W) \qquad (8.64)$$

$$W = \sum_{k'\beta} l_{k'\beta}A(P_{k'\beta}) + A(Q_1) - A(D) - K$$

where σ(t, t) is bilinear in the times ti and ρ(t) is linear.

Proof. We first look at P → Pkα. Note that, due to the explicit form of V, one can write for P close to Pkα:

$$\psi^{(k)}_\alpha(z_k) = e^{\xi_{k\alpha}(z_k)}\cdot e^{\int_{P_{k\alpha}}^{P}(\Omega-d\xi_{k\alpha})}\cdot
\frac{\theta(A(P_{k\alpha})-A(D)-K)}{\theta(A(P)-A(D)-K)}\times\frac{\theta(A(P)-A(P_{k\alpha})+U\cdot t+W)}{\theta(U\cdot t+W)}$$

which compares to $\psi^{(k)}_\alpha(z_k) = e^{\xi_{k\alpha}(z_k)}h^{(k)}_{\alpha\alpha}(z_k)$. The middle term is regular when zk → 0 and tends to one, so one can write it in the form

$$e^{\int_{P_{k\alpha}}^{P}(\Omega-d\xi_{k\alpha})}\,\frac{\theta(A(P_{k\alpha})-A(D)-K)}{\theta(A(P)-A(D)-K)} = \exp\big(t\cdot b_\alpha(z_k) + a_\alpha(z_k)\big)$$

where $t\cdot b_\alpha(z_k)$ is an expression linear in the times $t^{(k,n,\alpha)}$, and both $a_\alpha(z_k)$ and $b_\alpha(z_k)$ are of order O(zk). Considering the last term, note that it can be rewritten as

$$\frac{\theta(A(P)-A(P_{k\alpha})+U\cdot t+W)}{\theta(U\cdot t+W)} = \frac{\theta(U\cdot(t-[z])+W)}{\theta(U\cdot t+W)}$$

To see this, we Taylor expand the Abel map A(P) − A(Pkα) around Pkα. Writing the first kind differential $\omega_j = \sum_{i=0}^{\infty}c^{(j)}_i z_k^i\,dz_k$ one gets, using Riemann's bilinear identities (see eq. (15.9) in Chapter 15):

$$A_j(P) - A_j(P_{k\alpha}) = \sum_{n=1}^{\infty}\frac{z_k^n}{n}\,c^{(j)}_{n-1}
= -\sum_{n=1}^{\infty}\frac{z_k^n}{n}\,\frac{1}{2\pi i}\int_{b_j}\Omega^{(n)}_{P_{k\alpha}}
= -\sum_{n=1}^{\infty}\frac{z_k^n}{n}\,U_j^{(k,n,\alpha)}$$

This invites us to look for a tau-function of the form:

$$\tau(t) = e^{\rho(t)+\sigma(t,t)}\,\theta\Big(\sum_i t_iU^{(i)} + W\Big)$$
where ρ(t) is linear in t and σ(t, t) is a quadratic form in t. One gets a condition on ρ and σ:

$$t\cdot b_\alpha(z_k) + a_\alpha(z_k) = -\rho([z_k]_\alpha) - 2\sigma(t,[z_k]_\alpha) + \sigma([z_k]_\alpha,[z_k]_\alpha) \qquad (8.65)$$

One can always choose ρ and σ satisfying this equation provided that bα obeys the adequate symmetry property stemming from the fact that the quadratic form σ is symmetric in its arguments. Explicitly, we have the expansion

$$t\cdot b_\alpha(z_k) = \sum b_{(k',n',\alpha'),\alpha}(z_k)\,t^{(k',n',\alpha')}$$

For (k', α') ≠ (k, α) (the equality case was treated in Chapter 5) we have by eq. (8.63):

$$b_{(k',n',\alpha'),\alpha}(z_k) = \int_{P_{k\alpha}}^{P}\Omega^{(n')}_{P_{k'\alpha'}} = \sum_n b_{(k',n',\alpha'),(k,n,\alpha)}\,z_k^n$$

The condition (8.65) implies $\sigma_{(k',n',\alpha'),(k,n,\alpha)} = \frac{1}{2}n\,b_{(k',n',\alpha'),(k,n,\alpha)}$. So we must have the relation $n\,b_{(k',n',\alpha'),(k,n,\alpha)} = n'\,b_{(k,n,\alpha),(k',n',\alpha')}$ (in this equation, the function $b_{(k,n,\alpha)}$ is obtained by performing the same construction as above but starting from the privileged point $P_{k'\alpha'}$). This is a consequence of Riemann's bilinear identities: apply the identity eq. (15.8) in Chapter 15 to the second kind differentials $\Omega^{(n)}_{P_{k\alpha}}$ and $\Omega^{(n')}_{P_{k'\alpha'}}$. Since these differentials are normalized their a-periods vanish and the left-hand side of the identity vanishes. We get $\mathrm{Res}\,(b_{(k',n',\alpha')}\,db_{(k,n,\alpha)}) = 0$. There are two poles, one at Pkα and the other at $P_{k'\alpha'}$. Computing the residues yields the required relation. Altogether this shows that the quadratic form σ exists, is completely determined, and independent of the choice of the particular point Pkα. The computation of the linear form ρ(t) is then straightforward.

We are now in a position to discuss the effect of a Schlesinger transformation on this tau-function, since this only amounts to changing the integers lkα. These integers only occur in the contribution $\sum_{k'\beta}l_{k'\beta}A(P_{k'\beta})$ to the vector W in eq. (8.64), and in a term linear in lkα in ρ(t) (appearing in the exponential prefactor). We see that an elementary Schlesinger transformation (which adds a zero at Pkα and a pole at $P_{k'\alpha'}$) is obtained by changing $l_{k\alpha}\to l_{k\alpha}+1$ and $l_{k'\alpha'}\to l_{k'\alpha'}-1$. The effect of such a transformation on the theta-function is remarkably simple and amounts to a simple translation of its argument. Up to the exponential prefactor, we have:

$$\tau^{k\,k'}_{\alpha\,\alpha'}(t) = \theta\big(U\cdot t + W + A(P_{k\alpha}) - A(P_{k'\alpha'})\big)$$
Remark. This allows us to perform an interesting check of the first of eqs. (8.54). From the definitions, eqs. (8.60, 8.61), the matrix element $h^{(k)}_{\alpha\alpha}$ is obtained by evaluating $\psi^{(k)}_\alpha(P)$ around the point Pkα in eq. (8.62). The time-dependent theta-function in the numerator of this equation is then exactly what we expect from the numerator in eq. (8.54).

We have found that the tau-function in the algebro-geometric situation is essentially the Riemann theta-function. Moreover, the various matrix elements of $h^{(k)}(t, \lambda)$ are obtained by simple shifts of the arguments of the theta-function. The isomonodromic context of this chapter is a generalization of the algebro-geometric context, so that the tau-functions can be viewed as generalizations of Riemann's theta-functions and should enjoy many of their remarkable properties. In Chapter 9 another framework is proposed allowing us to directly define the tau-function, using the geometry of the infinite Grassmannian.

8.11 The Painlevé equations

An important application of the theory of isomonodromic deformations concerns the Painlevé equations, which can be interpreted as isomonodromic deformation equations, but not as isospectral deformations. The Painlevé property deals with singularities of solutions of differential equations. In this respect there is a striking difference between linear differential equations and non-linear ones. The solutions of linear differential equations have singularities (poles, branch points, essential singularities) only where the coefficients of the equation have singularities, i.e. the positions of these singularities are fixed. In contrast, the solutions of non-linear equations can develop singularities at arbitrary points, depending on initial conditions. For example $\dot y = y^2$ has the general solution $y = -1/(t-t_0)$. We call these singularities depending on the initial conditions movable singularities.
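The movable-pole phenomenon is easy to verify symbolically. The following sketch checks both that $y = -1/(t-t_0)$ solves $\dot y = y^2$ and that the pole location is fixed by the initial condition (hence "movable"):

```python
import sympy as sp

t, t0, y0 = sp.symbols('t t0 y0')
y = -1 / (t - t0)

# y solves ydot = y^2, with a pole at t = t0
assert sp.simplify(sp.diff(y, t) - y**2) == 0

# imposing y(0) = y0 places the pole at t0 = 1/y0:
# the singularity moves with the initial condition
sol = y.subs(t0, 1 / y0)
assert sp.simplify(sol.subs(t, 0) - y0) == 0
print("pole sits at t0 = 1/y0, i.e. it depends on the initial data")
```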
In general one cannot avoid movable poles; however, one can try to find equations whose solutions have no movable singularities other than poles, i.e. the branch points and essential singularities of all solutions are fixed. This is called the Painlevé property. In fact Painlevé and Gambier have classified all differential equations of the form:

$$\frac{d^2y}{dt^2} = R\Big(t, y, \frac{dy}{dt}\Big)$$

where R is a rational function of its arguments, satisfying the Painlevé property. Up to trivial redefinitions they found 50 such equations. Of all
these equations only six could not be integrated in terms of already known functions, and are listed below.

$$\text{(i)}\quad \frac{d^2y}{dt^2} = 6y^2 + t$$

$$\text{(ii)}\quad \frac{d^2y}{dt^2} = 2y^3 + ty + \alpha$$

$$\text{(iii)}\quad \frac{d^2y}{dt^2} = \frac{1}{y}\Big(\frac{dy}{dt}\Big)^2 - \frac{1}{t}\frac{dy}{dt} + \frac{1}{t}(\alpha y^2+\beta) + \gamma y^3 + \frac{\delta}{y}$$

$$\text{(iv)}\quad \frac{d^2y}{dt^2} = \frac{1}{2y}\Big(\frac{dy}{dt}\Big)^2 + \frac{3}{2}y^3 + 4ty^2 + 2(t^2-\alpha)y + \frac{\beta}{y}$$

$$\text{(v)}\quad \frac{d^2y}{dt^2} = \Big(\frac{1}{2y} + \frac{1}{y-1}\Big)\Big(\frac{dy}{dt}\Big)^2 - \frac{1}{t}\frac{dy}{dt}
+ \frac{(y-1)^2}{t^2}\Big(\alpha y + \frac{\beta}{y}\Big) + \frac{\gamma y}{t} + \frac{\delta y(y+1)}{y-1}$$

$$\text{(vi)}\quad \frac{d^2y}{dt^2} = \frac{1}{2}\Big(\frac{1}{y} + \frac{1}{y-1} + \frac{1}{y-t}\Big)\Big(\frac{dy}{dt}\Big)^2
- \Big(\frac{1}{t} + \frac{1}{t-1} + \frac{1}{y-t}\Big)\frac{dy}{dt}
+ \frac{y(y-1)(y-t)}{t^2(t-1)^2}\Big(\alpha + \frac{\beta t}{y^2} + \frac{\gamma(t-1)}{(y-1)^2} + \frac{\delta t(t-1)}{(y-t)^2}\Big)$$
All these equations can be understood in the framework of isomonodromy deformations. The fact that the Painlevé property appears in integrable systems is not an accident. The dynamical variables, i.e. the matrix elements of the $h^{(k)}$ in eq. (8.54), are ratios of tau-functions. In the algebro-geometric case the tau-functions are theta-functions, which are entire functions, so that we only have movable poles at the zeroes of one theta-function. This remark has been generalized to the isomonodromy case by Malgrange and Miwa. Here we shall only consider equations (ii) and (vi) in order to illustrate various aspects of the method.

Example 1. The Painlevé (ii) equation. We apply the general construction to the case where all matrices are of size 2 × 2, with just one singularity at λ = ∞. Moreover, we require that Ψ belongs to the group SL(2) so that Mλ belongs to the Lie algebra sl(2). We limit ourselves to n∞ = 3, so that Mλ = A0 + A1λ + A2λ², where A2 is diagonal and traceless. Altogether there are seven parameters in Mλ. The point at ∞ is an irregular singularity of order 3, so that there are six Stokes sectors, and therefore the Stokes matrices depend on six parameters. The monodromy matrix at ∞ must be equal to 1, yielding three relations, so the
monodromy data are expressed by three parameters. Finally, the singularity data depend on four parameters. Since we are looking for an ordinary non-linear differential equation we introduce only one time, t, and assume that the singularity data is given by:

$$\xi(\lambda) = \Big(\frac{\lambda^3}{3} + \frac{t\lambda}{2} + \theta\log\lambda\Big)\sigma_3, \qquad \sigma_3 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$$

Clearly one could extend this construction to a whole hierarchy of times. In the following we shall parametrize Mλ by three natural parameters and find their time dependence so that the flow is isomonodromic. We have introduced a branch point parametrized by θ (which will account for the parameter α in Painlevé (ii)), and this is consistent with the Fuchs condition since Tr(σ3) = 0. Finally, for irrational θ, any Lax pair interpretation of the differential equation is prohibited since Baker–Akhiezer functions do not have such infinitely branched points. We set $\Psi(\lambda) = g(\lambda)e^{\xi(\lambda)}$ and take

$$g(\lambda) = 1 + \frac{g_1}{\lambda} + \frac{g_2}{\lambda^2} + \frac{g_3}{\lambda^3} + \cdots$$
There is no restriction in assuming that the leading term is 1. We also impose that det g(λ) = 1, since Ψ(λ) belongs to SL(2). The gi are such that the matrix Ψ(λ) satisfies the linear differential equation ∂λΨ = MλΨ with $M_\lambda = (g\partial_\lambda\xi g^{-1})_-$. Here the symbol ( )− means taking the polynomial part of the considered expression. Finally, the isomonodromic deformation equation is ∂tΨ = MtΨ with $M_t = (g\partial_t\xi g^{-1})_-$. One gets:

$$M_\lambda = \lambda^2\sigma_3 + \lambda[g_1,\sigma_3] + \frac{t}{2}\sigma_3 + [g_2,\sigma_3] - [g_1,\sigma_3]g_1$$
$$M_t = \frac{1}{2}\lambda\sigma_3 + \frac{1}{2}[g_1,\sigma_3]$$

We see that Mλ depends only on g1, g2, and $\partial_\lambda g = M_\lambda g - g\partial_\lambda\xi$ determines the g3, g4, ... in terms of these two matrices. One finds for i ≥ 3:

$$[g_i,\sigma_3] = (i-3)g_{i-3} + \theta g_{i-3}\sigma_3 - \frac{t}{2}[g_{i-2},\sigma_3]
+ [g_1,\sigma_3]g_{i-1} + [g_2,\sigma_3]g_{i-2} - [g_1,\sigma_3]g_1g_{i-2} \qquad (8.66)$$

We parametrize the matrix gi in the form:

$$g_i = \begin{pmatrix}\Delta_i + a_i & b_i\\ c_i & \Delta_i - a_i\end{pmatrix}$$

where Δi is obtained by requiring that det g(λ) = 1, yielding Δ1 = 0, Δ2 = (a1² + b1c1)/2, Δ3 = a1a2 + (b1c2 + b2c1)/2, etc. The left-hand side
of eq. (8.66) has vanishing diagonal elements. This provides a constraint on the lower order gi. The off-diagonal elements determine bi and ci. In the case i = 3, we find one constraint:

$$b_1c_2 + b_2c_1 = \frac{\theta}{2}$$

and get the off-diagonal elements:

$$-b_3 = \frac{t}{2}b_1 + a_2b_1 + a_1b_2 + b_1\Delta_2$$
$$c_3 = -\frac{t}{2}c_1 + a_2c_1 + a_1c_2 - c_1\Delta_2$$
Similarly for i = 4 we get the constraint:

$$a_1 = -(t+4\Delta_2)b_1c_1 + 2b_2c_2 + 2a_1(b_1c_2 - b_2c_1)$$

while a2 is determined by the equation at order i = 5. Finally there are only three free parameters in g1 and g2. The equations of motion $\partial_t g = M_tg - g\partial_t\xi$ read:

$$\dot g_1 = -\frac{1}{2}[g_2,\sigma_3] + \frac{1}{2}[g_1,\sigma_3]g_1, \qquad
\dot g_2 = -\frac{1}{2}[g_3,\sigma_3] + \frac{1}{2}[g_1,\sigma_3]g_2$$
Note that since [g3, σ3] is known in terms of g1, g2, these equations close on these two matrices. In components we get:

$$\dot a_1 = -b_1c_1, \qquad \dot b_1 = b_2 + a_1b_1, \qquad \dot c_1 = -c_2 + a_1c_1$$
$$\dot b_2 = -\frac{t}{2}b_1 - a_1b_2 - 2b_1\Delta_2, \qquad \dot c_2 = \frac{t}{2}c_1 - a_1c_2 + 2c_1\Delta_2$$

It is now natural to introduce the three parameters:

$$x = \frac{b_1}{c_1}, \qquad y = a_1 + \frac{b_2}{b_1}, \qquad z = b_1c_1$$

With these parameters one has $a_1 = -tz - 2z^2 - 2zy^2 + \theta y$. Then the equations of motion give:

$$\dot x = \frac{\theta}{2}\,\frac{x}{z}, \qquad \dot y = -\frac{t}{2} - 2z - y^2, \qquad \dot z = 2zy - \frac{\theta}{2}$$

Eliminating z yields the equation:

$$\frac{d^2y}{dt^2} = 2y^3 + ty + \theta - \frac{1}{2}$$
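The elimination of z can be verified symbolically. In the sketch below (variable names Y, Z stand for y(t), z(t); they are our notation, not the book's) the second derivative of y is computed along the flow and the z-dependence cancels, leaving Painlevé (ii):

```python
import sympy as sp

t, theta, Y, Z = sp.symbols('t theta Y Z')

# the first-order system obtained above
ydot = -t/2 - 2*Z - Y**2
zdot = 2*Z*Y - theta/2

# second derivative of y along the flow:
# d(ydot)/dt = partial_t(ydot) + (d ydot/dY) ydot + (d ydot/dZ) zdot
yddot = sp.diff(ydot, t) + sp.diff(ydot, Y)*ydot + sp.diff(ydot, Z)*zdot

# Z drops out and Painleve (ii) with alpha = theta - 1/2 remains
assert sp.simplify(yddot - (2*Y**3 + t*Y + theta - sp.Rational(1, 2))) == 0
print("y'' = 2y^3 + t y + theta - 1/2")
```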
One recognizes the Painlevé (ii) equation with α = θ − 1/2. Once the solution y(t) of the Painlevé equation is known, it is easy to reconstruct the matrix Mλ(λ; t) in terms of x(t), y(t), z(t). The subtle nature of the t-dependence of Mλ(λ; t) ensuring the isomonodromy property is here particularly striking.

The Schlesinger transformations take a particularly simple form on this example. They just amount to changing θ → θn = θ + n, so that the first column of Ψ acquires a zero of order n at λ = ∞ while the second column acquires a pole of order n. Let Ψ(n, λ) be the wave-function constructed with the parameter θn; then we have for an elementary Schlesinger transformation Ψ(n + 1, λ) = R(n, λ)Ψ(n, λ), where R(n, λ) is a matrix rational in λ. Plugging in the asymptotic expansions at ∞, Ψ(n, λ) ∼ g(n, λ) exp(ξ(n, λ)), one gets $R(n,\lambda)g(n,\lambda) = g(n+1,\lambda)(\lambda^{-1}E_{11} + \lambda E_{22})$. Expanding to order λ⁻³, one finds:

$$R(n,\lambda) = \begin{pmatrix} 0 & b_{1,n+1}\\ -1/b_{1,n+1} & \lambda - y_{n+1}\end{pmatrix}$$

where $b_{1,n+1}$ and $y_{n+1}$ are the parameters entering in g(n + 1, λ). Moreover, one also gets:

$$a_{1,n+1} = \frac{c_{2,n}}{c_{1,n}}, \qquad c_{1,n+1} = c_{3,n} - c_{1,n}(\Delta_{2,n} + a_{2,n}) + c_{2,n}(a_{1,n} - a_{1,n+1}) \qquad (8.67)$$
$$b_{1,n+1} = \frac{1}{c_{1,n}}, \qquad b_{2,n+1} = -\frac{a_{1,n}}{c_{1,n}} \qquad (8.68)$$

From this we obtain the recursion relations:

$$z_{n+1} = -\frac{t}{2} - y_{n+1}^2 - z_n, \qquad y_{n+1} = -y_n + \frac{\theta_n}{2z_n}$$

It is now illuminating to introduce the tau-functions. First we compute the closed form $\Upsilon = -\mathrm{Res}_\infty\,\mathrm{Tr}\,(g^{-1}(\lambda)\partial_\lambda g\, d\xi)\,d\lambda$. Here we simply have $d\xi = \lambda\sigma_3\,dt/2$, so that we find:

$$\Upsilon = -\frac{1}{2}\mathrm{Tr}\,(g_1\sigma_3)\,dt = -a_1\,dt$$

Hence τn is defined by $\dot\tau_n/\tau_n = -a_{1,n}$. One can express the matrix elements of g(λ) in terms of tau-functions according to the general results of the previous sections. Here, however, things are so simple that it is even more straightforward to rederive the appropriate results. From eq. (8.67) we have $a_{1,n+1} = -\dot\tau_{n+1}/\tau_{n+1} = c_{2,n}/c_{1,n}$. It follows immediately that

$$y_{n+1} = -a_{1,n} + \frac{c_{2,n}}{c_{1,n}} = \frac{\dot\tau_n}{\tau_n} - \frac{\dot\tau_{n+1}}{\tau_{n+1}} = \frac{d}{dt}\log\frac{\tau_n}{\tau_{n+1}}$$
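The recursion $y_{n+1} = -y_n + \theta_n/(2z_n)$ is the classical Bäcklund transformation of Painlevé (ii). Assuming this identification, one can check symbolically that it shifts α by one: eliminating $z_n$ via $\dot y = -t/2 - 2z - y^2$ gives $2z = -(\dot y + y^2 + t/2)$, so $y_{n+1} = -y - \theta/(\dot y + y^2 + t/2)$. The sketch below verifies that if y solves PII(α) with α = θ − 1/2, the transformed function solves PII(α + 1):

```python
import sympy as sp

t, theta = sp.symbols('t theta')
y = sp.Function('y')(t)
alpha = theta - sp.Rational(1, 2)

# Painleve II, y'' = 2y^3 + t y + alpha, used as a rewrite rule
pii = {y.diff(t, 2): 2*y**3 + t*y + alpha}

# transformed solution (theta_n / (2 z_n) with z eliminated)
w = -y - theta / (y.diff(t) + y**2 + t/2)

# differentiate twice, replacing every y'' that appears along the way
w1 = w.diff(t).subs(pii)
w2 = w1.diff(t).subs(pii)

residual = sp.cancel(w2 - (2*w**3 + t*w + alpha + 1))
assert residual == 0
print("Schlesinger step shifts alpha -> alpha + 1")
```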
On the other hand, by the equations of motion we have $y_n = d\log b_{1,n}/dt$. Moreover, by eq. (8.68) we have $c_{1,n} = 1/b_{1,n+1}$. Hence we can take:

$$b_{1,n} = \frac{\tau_{n-1}}{\tau_n}, \qquad c_{1,n} = \frac{\tau_{n+1}}{\tau_n}, \qquad z_n = \frac{\tau_{n+1}\tau_{n-1}}{\tau_n^2}$$

The Hirota bilinear form of the equations of motion also follows straightforwardly. Let us write the three equations of motion:

$$\dot a_{1,n} = -z_n \;\longrightarrow\; \tau_n\ddot\tau_n - \dot\tau_n^2 - \tau_{n+1}\tau_{n-1} = 0$$
$$\dot z_n = 2y_nz_n - \frac{1}{2}\theta_n \;\longrightarrow\; \tau_{n+1}\dot\tau_{n-1} - \tau_{n-1}\dot\tau_{n+1} + \frac{1}{2}\theta_n\tau_n^2 = 0$$
$$\dot y_n = -2z_n - y_n^2 - \frac{t}{2} \;\longrightarrow\; \tau_{n-1}\ddot\tau_n + \tau_n\ddot\tau_{n-1} - 2\dot\tau_n\dot\tau_{n-1} + \frac{t}{2}\tau_n\tau_{n-1} = 0$$

In the last equation we have used the first one to simplify the result.

Example 2. The Painlevé (vi) equation. We consider the case of three regular singularities at finite distance which we put at λ = 0, 1, t, and we study the isomonodromic deformation equations with respect to the parameter t. We choose the singularity data at these three points as:

$$\xi^{(k)}(\lambda) = \begin{pmatrix}\theta_k & 0\\ 0 & 0\end{pmatrix}\log(\lambda-\lambda_k), \qquad k = 0, 1, t, \quad \lambda_0 = 0,\ \lambda_1 = 1,\ \lambda_t = t$$
A(t) Mt = g (t) ∂t ξ (t) g (t) −1 =− λ−t − We are free to make a global gauge transformation by a matrix which is constant in λ, and we use this freedom to diagonalize Mλ at ∞, so that we assume: 1 κ1 0 + O(λ−2 ), λ → ∞ Mλ (λ) = λ 0 κ2 The differential equation ∂λ Ψ = Mλ Ψ has a regular singularity at ∞, and the Fuchs condition reads κ1 + κ2 = θ0 + θ1 + θt .
We can write Mλ as:

$$M_\lambda(\lambda) = \begin{pmatrix} m_{11}(\lambda) & m_{12}(\lambda)\\ m_{21}(\lambda) & m_{22}(\lambda)\end{pmatrix} = \frac{1}{\lambda(\lambda-1)(\lambda-t)}\,A(\lambda)$$

where Aii(λ) are second degree polynomials in λ with leading term κiλ², and A12(λ) and A21(λ) are degree one polynomials in λ. We set A12(λ) = γ(λ − y), introducing the important parameter y which will end up satisfying the Painlevé (vi) equation. Taking the trace of Mλ(λ), we have

$$A_{22}(\lambda) = \theta_0(\lambda-1)(\lambda-t) + \theta_1\lambda(\lambda-t) + \theta_t\lambda(\lambda-1) - A_{11}(\lambda)$$

which eliminates the parameters in A22(λ). Moreover, we know that det A(λ) vanishes for λ = 0, 1, t, so we get three conditions on A21(λ):

$$A_{21}(\lambda) = \frac{A_{11}(\lambda)A_{22}(\lambda)}{\gamma(\lambda-y)}, \qquad \lambda = 0, 1, t$$

The first two allow us to express A21(λ) in terms of A11(λ) and A12(λ), while the last one provides, since A21(λ) is a linear function of λ, a constraint on the parameters in A11(λ):

$$A_{21}(t) = (1-t)A_{21}(0) + tA_{21}(1)$$

In order to write this constraint more explicitly, we parametrize A11(λ) by the values it takes at λ = t and λ = y, i.e. write the interpolation formula:

$$A_{11}(\lambda) = (\lambda-t)(\lambda-y)\kappa_1 + \frac{(\lambda-y)A_{11}(t) - (\lambda-t)A_{11}(y)}{t-y}$$
Substituting this into the previous equations, one observes that the constraint is quadratic in A11(y) but, unexpectedly, is linear in A11(t). We solve A11(t) in terms of A11(y), getting:

$$A_{11}(t) = -\frac{\kappa_1^2(y-t)(y-1+t)}{\kappa_1-\kappa_2}
+ \frac{\kappa_1\big((\theta_0t+\theta_1(t-1))(y-t) + 2A_{11}(y)\big)}{\kappa_1-\kappa_2}
- \frac{A_{11}(y)\big(A_{11}(y) + \theta_1 y(t-1) + \theta_0 t(y-1)\big)}{(\kappa_1-\kappa_2)\,y(y-1)}$$
The dynamical variables are now y, γ, A11(y), and we have to write the equations of motion $\partial_tM_\lambda = \partial_\lambda M_t + [M_t, M_\lambda]$. Everything can be expressed in terms of the matrix A(λ) since we have

$$M_t(\lambda) = -\frac{1}{t(t-1)(\lambda-t)}\,A(t)$$

We compute the equation of motion and afterwards we set λ = y. We get

$$\begin{pmatrix}\dot m_{11}(\lambda) & \dot m_{12}(\lambda)\\ \dot m_{21}(\lambda) & \dot m_{22}(\lambda)\end{pmatrix}_{\lambda=y}
= \frac{1}{t(t-1)(y-t)^2}\left(A(t) - \frac{[A(t), A(y)]}{y(y-1)}\right)$$

Taking the matrix element 12 of this equation and evaluating it at λ = y gives:

$$\dot y = \frac{y(y-1)(y-t)}{t(t-1)}\left(2m_{11}(y) - \frac{\theta_0}{y} - \frac{\theta_1}{y-1} - \frac{\theta_t-1}{y-t}\right)$$

Taking the matrix element 11 evaluated at λ = y gives:

$$\dot m_{11}(\lambda)\big|_{\lambda=y} = \frac{A_{11}(t)}{t(t-1)(y-t)^2} + \frac{\gamma\,m_{21}(y)}{t(t-1)}$$

We introduce the dynamical variable z = m11(y) and compute

$$\dot z = \dot m_{11}(\lambda)\big|_{\lambda=y} + (\partial_\lambda m_{11})\big|_{\lambda=y}\cdot\dot y$$

We find after some algebra:

$$t(t-1)\dot z = -(3y^2 - 2(1+t)y + t)z^2 + \big(2(\kappa_1+\kappa_2-1)y + 1 - \theta_0 - \theta_t - (\theta_0+\theta_1)t\big)z + \kappa_1(1-\kappa_2)$$

Eliminating z between the equations for ẏ and ż yields the Painlevé (vi) equation, with the following values of the parameters:

$$\alpha = \frac{1}{2}(\kappa_1-\kappa_2+1)^2, \qquad \beta = -\frac{1}{2}\theta_0^2, \qquad \gamma = \frac{1}{2}\theta_1^2, \qquad \delta = \frac{1}{2}(1-\theta_t^2)$$
It is known that the other Painlevé equations can be obtained from this one by various limiting procedures.

References

[1] G.D. Birkhoff, Collected mathematical papers, Vol. 1. Dover Publications Inc. (1968) 259–306.
[2] E.L. Ince, Ordinary differential equations. Dover Publications Inc. (1956).
[3] W. Wasow, Asymptotic expansions for ordinary differential equations. Interscience Publishers (1965).
[4] H. Flaschka and A. Newell, Monodromy and spectrum preserving deformations I. Commun. Math. Phys. 76 (1980) 65–116.
[5] M. Jimbo, T. Miwa and K. Ueno, Monodromy preserving deformations of ordinary differential equations with rational coefficients I. Physica 2D (1981) 306–352.
[6] M. Jimbo and T. Miwa, Monodromy preserving deformations of ordinary differential equations with rational coefficients II. Physica 2D (1981) 407–448.
[7] M. Jimbo and T. Miwa, Monodromy preserving deformations of ordinary differential equations with rational coefficients III. Physica 4D (1981) 26–46.
[8] T. Miwa, Painlevé property of monodromy preserving deformation equations and the analyticity of τ-functions. RIMS 17 (1981) 703–721.
[9] B. Malgrange, Sur les déformations isomonodromiques. Mathematics and Physics (Paris, 1979/1982), Progr. Math. 37, Birkhäuser, Boston, Mass. (1983).
[10] Y. Sibuya, Linear differential equations in the complex domain: problems of analytic continuation. AMS, Providence (1990).
[11] D. Anosov and A. Bolibruch, The Riemann–Hilbert problem. Aspects of Mathematics Vol. 22 (1994).
9 Grassmannian and integrable hierarchies
We learned in previous chapters that when the Lax matrix has several singularities, the situation mainly reduces to local studies around each singularity. In this chapter we consider the case of one singularity in all its generality, which amounts to the study of the Kadomtsev–Petviashvili equation. This finds a natural presentation in Sato's Grassmannian approach. We give an explicit realization of the Grassmannian in a fermionic Fock space. In this setting we obtain a remarkable formula expressing the τ-function as $\tau(t) = \langle 0|e^{\sum_i H_it_i}g|0\rangle$. Soliton solutions are then obtained by choosing for g particular elements which have a simple expression in terms of vertex operators. Hirota equations are interpreted as the Plücker equations of an infinite-dimensional Grassmannian, on which the infinite-dimensional group GL(∞) acts. The time flows are interpreted as the action of a natural infinite-dimensional Abelian subgroup. We are also able to find particularly simple tau-functions expressed as Schur polynomials. Finally, we show that the full KP hierarchy admits an elegant formulation in terms of pseudo-differential operators.
9.1 Introduction

In Chapter 3 we showed that one can simultaneously solve the hierarchy of equations:

$$\partial_{t_j}\Psi = M_j\Psi, \qquad M_j = (g\xi_jg^{-1})_-, \qquad \Psi = g\,e^{\sum_j\xi_jt_j} \qquad (9.1)$$
with g = 1 + O(1/z) regular and ξj a diagonal constant matrix singular at the unique puncture that we take at z = ∞. Here Ψ is a matrix of fundamental solutions of this system. In this chapter we denote by z the spectral parameter and the singularity is at z = ∞. The matrix 299
elements Ψαα admit a remarkable expression in terms of tau-functions, see eq. (3.61) in Chapter 3:

$$\Psi_{\alpha\alpha}(z; t_1, t_2, \ldots) = \exp\Big(\sum_{n\ge1}z^nt_{(n,\alpha)}\Big)\,\frac{\tau(t-[z^{-1}]_\alpha)}{\tau(t)} \qquad (9.2)$$
where $(t-[z^{-1}]_\alpha)$ means that the times $t_{(n,\alpha)}$ are shifted by $-z^{-n}/n$. The noticeable feature of this expression is that the z-dependence is folded into the infinitely many times $t_{(n,\alpha)}$. The local formula eq. (9.2) was also recovered when analyzing isospectral flows in Chapter 5, where we observed that, in this context, the tau-functions are essentially Riemann's theta-functions. In the isomonodromic approach of Chapter 8, it was shown that the notion of tau-function is in fact much more general than that of theta-function. We have also seen in Chapter 8 that the equations of motion can be recast as bilinear Hirota equations on the tau-function, P(D)τ · τ = 0. They form an infinite set of bilinear identities which are equivalent to the hierarchy equations.

In this chapter we elaborate on the relation between Sato's formula eq. (9.2), Hirota's bilinear equations and representation theory of infinite-dimensional affine Kac–Moody algebras. A motivation for introducing affine Kac–Moody algebras may be seen in the structure of Sato's formula, which, using $\exp(a\partial_t)f(t) = f(t+a)$, can be written as:

$$\Psi(z,t) = \frac{V(z,t)\tau(t)}{\tau(t)}, \qquad
V(z,t) = e^{\sum_{n\ge1}z^nt_n}\;e^{-\sum_{n\ge1}\frac{z^{-n}}{n}\frac{\partial}{\partial t_n}} \qquad (9.3)$$
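That the second exponential in V(z, t) implements the time shift of eq. (9.2) can be checked on a polynomial "tau-function". The sketch below (the particular f is an arbitrary test polynomial) truncates the exponential of the derivation, whose series terminates on polynomials, and compares with the shifted argument:

```python
import sympy as sp

t1, t2, z = sp.symbols('t1 t2 z')
f = t1**3 + 5*t1*t2 + t2**2          # any polynomial "tau-function"

# D = -sum_{n>=1} (z^-n / n) d/dt_n, truncated to the two times present
D = lambda h: -(z**-1)*sp.diff(h, t1) - (z**-2 / 2)*sp.diff(h, t2)

# exp(D) f = sum_k D^k f / k!  (terminates, since f is polynomial)
shifted, term = sp.S(0), f
for k in range(10):
    shifted += term
    term = D(term) / (k + 1)

# expected: f(t1 - 1/z, t2 - 1/(2 z^2)), i.e. tau(t - [z^-1])
target = f.subs({t1: t1 - 1/z, t2: t2 - sp.Rational(1, 2)/z**2},
                simultaneous=True)
assert sp.simplify(sp.expand(shifted) - sp.expand(target)) == 0
print("exp(-sum z^-n/n d/dt_n) acts as the shift t -> t - [z^-1]")
```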
The operator V(z, t) is a typical vertex operator, which is known to play a central role in the construction of representations of affine Kac–Moody algebras, see Chapter 16. A further motivation can be found in the observation that highest weight representations of affine Kac–Moody algebras by vertex operators naturally produce tau-functions obeying bilinear identities. We sketch now, in a rather informal way, the ideas allowing us to associate Hirota bilinear equations with vertex operator highest weight representations of affine Kac–Moody algebras. These ideas also underlie the fermionic constructions of the following sections. Let G be an affine Kac–Moody algebra and H a Cartan subalgebra of G. We will use freely, in this section, the notion of a group G associated with G. Let X^a be a basis of G. It will often be a Cartan–Weyl basis (Hi, Eα), where Hi is a basis of H, and Eα are the root vectors associated to the roots α, normalized by (Eα, E−α) = 1. The tensor Casimir C12 of G is
defined by:

$$C_{12} = \sum_a X^a\otimes X_a = \sum_i H^i\otimes H_i + \sum_\alpha E_\alpha\otimes E_{-\alpha}$$
where the indices are raised or lowered with the Killing form. The fundamental property of C12 is that for all g ∈ G, we have:

$$C_{12}\,g\otimes g = g\otimes g\,C_{12}$$

Let us give a proof of this property, which will be adapted in the fermionic case. We have to show that

$$\sum_a g^{-1}X^ag\otimes g^{-1}X_ag = \sum_a X^a\otimes X_a$$

but $g^{-1}X^ag = \alpha^a_dX^d$, and $g^{-1}X_bg = \beta_b^cX_c$. The Killing form is such that $(X^a, X_b) = \delta^a_b$ and is invariant under the adjoint action. Hence we have $\sum_c\alpha^a_c\beta^c_b = \delta^a_b$, which implies our property. If |Λ⟩ and |Λ'⟩ are two highest weight vectors, we have:

$$C_{12}\,|\Lambda\rangle\otimes|\Lambda'\rangle = (\Lambda,\Lambda')\,|\Lambda\rangle\otimes|\Lambda'\rangle$$

To see it, we write C12 in the Cartan–Weyl basis and use that highest weight vectors |Λ⟩ are eigenstates of Hi and are annihilated by the generators Eα for α any positive root. These two relations imply the following identity:

$$C_{12}\,g|\Lambda\rangle\otimes g|\Lambda'\rangle = (\Lambda,\Lambda')\,g|\Lambda\rangle\otimes g|\Lambda'\rangle, \qquad \forall\,g\in G \qquad (9.4)$$
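A finite-dimensional analogue makes both properties of C12 concrete. For sl2 with the trace form, the dual pairs are (e, f), (f, e) and (h, h/2), so C12 = e⊗f + f⊗e + ½ h⊗h; the choice of g and of the spin-½ module below are ours, for illustration only:

```python
import numpy as np

e = np.array([[0., 1.], [0., 0.]])
f = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])

# tensor Casimir of sl2 for the trace form <x,y> = tr(xy)
C = np.kron(e, f) + np.kron(f, e) + 0.5 * np.kron(h, h)

g = np.array([[2., 1.], [3., 2.]])      # an element of SL(2): det g = 1
gg = np.kron(g, g)
assert np.allclose(C @ gg, gg @ C)      # C12 (g x g) = (g x g) C12

# highest weight vector of the spin-1/2 module: v = (1, 0); e v = 0, h v = v
v = np.array([1., 0.])
vv = np.kron(v, v)
# C12 is the scalar (Lambda, Lambda) on |v> x |v>; with the trace form this is 1/2
assert np.allclose(C @ vv, 0.5 * vv)
print("C12 intertwines g x g and is scalar on highest weight vectors")
```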
Suppose now that the representation of G, with highest weight vector |Λ⟩, admits a vertex operator construction in terms of bosonic operators αn, n ∈ Z, satisfying [αn, αm] = nδn+m,0, see Chapter 16. The representation space is the bosonic Fock space with vacuum |Λ⟩ such that αn|Λ⟩ = 0 for n > 0. It is generated by acting on |Λ⟩ with the operators αn, n < 0. A basis consists of states of the form α−n1 · · · α−nk|Λ⟩. A vertex operator is an operator acting on the Fock space and has the typical form:
V (z) = e−(
n<0 (z
−n /n)α
n
) e−(
n>0 (z
−n /n)α
n
)
(9.5)
When we expand $V(z)$ in powers of $z$, the coefficients are polynomials in the bosonic operators $\alpha_n$. Together with the $\alpha_n$ themselves, they represent the elements $X^a$ of the Lie algebra. Note that taking for $\alpha_{-n}$, $n > 0$, the operator of multiplication by $n t_n$, and for $\alpha_n$, $n > 0$, the derivation operator $\partial/\partial t_n$, the vertex operator eq. (9.5) exactly reproduces the vertex operator appearing in eq. (9.3).
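In this differential-operator realization, the second exponential in eq. (9.5) acts on functions of the times $t_n$ as a Taylor shift $t_n \to t_n - z^{-n}/n$. This can be checked on a small example with SymPy (an illustrative sketch, not from the text; the sample polynomial is arbitrary, and only $t_1, t_2$ are kept):

```python
import sympy as sp

t1, t2, z = sp.symbols('t1 t2 z')
f = 1 + 3*t1 + t1*t2                     # a sample "tau-like" polynomial

# D = -(1/z) d/dt1 - (1/(2 z^2)) d/dt2, the exponent of the annihilation part
D = lambda g: -sp.diff(g, t1)/z - sp.diff(g, t2)/(2*z**2)

# exp(D) f, expanded as a series; it terminates because f is a polynomial
series = f + D(f) + D(D(f))/2 + D(D(D(f)))/6

# the same operator realized as the shift t_n -> t_n - z^{-n}/n
shifted = f.subs([(t1, t1 - 1/z), (t2, t2 - 1/(2*z**2))], simultaneous=True)
assert sp.simplify(series - sp.expand(shifted)) == 0
```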
9 Grassmannian and integrable hierarchies
Let us introduce the generating function $H(t) = \sum_{n>0} \alpha_n t_n$, depending on the infinite number of variables $t_n$. Note that $[H(t), H(t')] = 0$. We also need the dual vacuum $\langle\Lambda|$ such that $\langle\Lambda|\alpha_n = 0$ for $n < 0$. We define the tau-function by:

$$\tau_\Lambda^g(t) = \langle\Lambda|\, e^{H(t)}\, g\, |\Lambda\rangle, \qquad \text{with } H(t) = \sum_{n>0} \alpha_n t_n, \quad g \in G \tag{9.6}$$

This is a function on the orbit of $|\Lambda\rangle$ under $G$. Since $X^a \in \mathcal{G}$ is represented by some polynomial in the $\alpha_n$, there exist differential operators $D^a_\Lambda(t, \partial_t)$ in the variables $t_n$ such that:

$$\langle\Lambda|\, e^{H(t)}\, X^a g\, |\Lambda\rangle = D^a_\Lambda(t, \partial_t)\; \langle\Lambda|\, e^{H(t)}\, g\, |\Lambda\rangle \tag{9.7}$$

This easily follows from:

$$\langle\Lambda|\, e^{H(t)}\, \alpha_m\, g\, |\Lambda\rangle = \begin{cases} \dfrac{\partial}{\partial t_m}\, \langle\Lambda|\, e^{H(t)}\, g\, |\Lambda\rangle & \text{for } m > 0 \\[2mm] m\, t_{-m}\; \langle\Lambda|\, e^{H(t)}\, g\, |\Lambda\rangle & \text{for } m < 0 \end{cases} \tag{9.8}$$
The case $m > 0$ is obvious, while for $m < 0$ one uses $e^{H(t)}\, \alpha_m\, e^{-H(t)} = \alpha_m + m\, t_{-m}$ to bring $\alpha_m$ to the left, where it is annihilated by the dual vacuum $\langle\Lambda|$. We now multiply the fundamental bilinear relation eq. (9.4) by $e^{H(t')} \otimes e^{H(t'')}$. Let $x_n = \frac{1}{2}(t'_n + t''_n)$, $y_n = \frac{1}{2}(t'_n - t''_n)$, so that

$$e^{H(t')} \otimes e^{H(t'')} = \big(e^{H(y)} \otimes e^{-H(y)}\big) \cdot \big(e^{H(x)} \otimes e^{H(x)}\big)$$

The second factor on the right-hand side of this relation commutes with $C_{12}$, so we can push it to the right. Then, taking the scalar product with $\langle\Lambda| \otimes \langle\Lambda'|$, we get

$$\sum_a \langle\Lambda|\, e^{H(y)}\, X^a\, e^{H(x)} g\, |\Lambda\rangle\;\; \langle\Lambda'|\, e^{-H(y)}\, X_a\, e^{H(x)} g\, |\Lambda'\rangle = (\Lambda, \Lambda')\; \tau_\Lambda^g(x+y)\, \tau_{\Lambda'}^g(x-y)$$

Using now eq. (9.7) we obtain:

$$\sum_a D^a_\Lambda(y, \partial_y)\, D_a^{\Lambda'}(-y, -\partial_y)\;\; \tau_\Lambda^g(x+y)\, \tau_{\Lambda'}^g(x-y) = (\Lambda, \Lambda')\; \tau_\Lambda^g(x+y)\, \tau_{\Lambda'}^g(x-y)$$

These are Hirota-type equations. Expanding in $y$, we get an infinite number of bilinear Hirota equations, which are in fact the equations characterizing the orbits of the highest weight vectors $|\Lambda\rangle$ and $|\Lambda'\rangle$.
Clearly this construction can be applied to any vertex operator representation of an affine Kac–Moody algebra $\mathcal{G}$. The hierarchy of Hirota equations we obtain in this way depends on the affine Kac–Moody algebra, but also on the choice of the vertex operator representation. In the next section we make these ideas precise in the case of the group $GL(\infty)$.

9.2 Fermions and GL(∞)

Let $gl(\infty)$ be the Lie algebra of infinite-dimensional band matrices $M_{rs}$, with $r, s \in \mathbb{Z} + \frac{1}{2}$ (the $\frac{1}{2}$ is added for convenience), such that $M_{rs} = 0$ for $|r - s|$ large. Note that the sum and the product of two such matrices are well-defined and of the same type. Finally, $\widehat{gl}(\infty)$ is a central extension of $gl(\infty)$; it will be described later on, by displaying a representation of this algebra on some Hilbert space. We introduce fermionic operators $\beta_r$, $\beta^*_r$, $r \in \mathbb{Z} + \frac{1}{2}$, with the following anticommutation relations:

$$\{\beta_r, \beta_s\} = 0, \qquad \{\beta^*_r, \beta^*_s\} = 0, \qquad \{\beta_r, \beta^*_s\} = \delta_{r+s,0} \tag{9.9}$$
where $\{\,,\,\}$ denotes the anticommutator, i.e. $\{a, b\} = ab + ba$ for any operators $a$ and $b$. By convention, we will say that the fermions $\beta_r$ have charge $(+1)$ while the fermionic operators $\beta^*_s$ have charge $(-1)$. We introduce a vacuum vector $|0\rangle$ such that:

$$\beta_r |0\rangle = \beta^*_r |0\rangle = 0 \qquad \text{for } r \geq \tfrac{1}{2}$$

The Fock space is, by definition, generated by acting successively on the vacuum with the fermionic operators. A basis of the fermionic Fock space is the following set of states:

$$\beta^*_{-s_n} \cdots \beta^*_{-s_1} \cdot \beta_{-r_m} \cdots \beta_{-r_1}\, |0\rangle \tag{9.10}$$
with $s_j, r_j \geq \tfrac{1}{2}$. We introduce a Hermitian structure on the Fock space by defining the dual vacuum vector:

$$\langle 0|\,\beta_r = \langle 0|\,\beta^*_r = 0 \qquad \text{for } r \leq -\tfrac{1}{2}$$

such that $\langle 0|0\rangle = 1$, and the adjoint of the fermionic operators:

$$\beta_r^\dagger = \beta^*_{-r}, \qquad (\beta^*_r)^\dagger = \beta_{-r}$$

This implies that the states eq. (9.10) form an orthonormal basis of the Fock space. Since the fermions $\beta_s$ have charge $(+1)$ and the fermions $\beta^*_r$ charge $(-1)$, the state eq. (9.10) has charge $(n - m)$.
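The anticommutation relations eq. (9.9) can be realized concretely on a finite number of modes via the Jordan–Wigner construction. The following sketch (our illustration under the obvious truncation; `fermion_ops` is a hypothetical helper name, with $c_k$ standing in for an annihilator $\beta_r$, $r > 0$, and $c_k^\dagger$ for its partner $\beta^*_{-r}$) checks them numerically:

```python
import numpy as np

def fermion_ops(n):
    """n fermionic annihilation operators on a 2**n-dim Fock space (Jordan-Wigner)."""
    sz = np.diag([1.0, -1.0])
    sm = np.array([[0.0, 0.0], [1.0, 0.0]])      # single-mode lowering operator
    ops = []
    for k in range(n):
        mats = [sz] * k + [sm] + [np.eye(2)] * (n - k - 1)
        op = mats[0]
        for m in mats[1:]:
            op = np.kron(op, m)
        ops.append(op)
    return ops

def anti(a, b):
    return a @ b + b @ a

c = fermion_ops(3)
assert np.allclose(anti(c[0], c[1]), 0)              # {beta_r, beta_s} = 0
assert np.allclose(anti(c[0].T, c[1].T), 0)          # {beta*_r, beta*_s} = 0
assert np.allclose(anti(c[0], c[0].T), np.eye(8))    # {beta_r, beta*_{-r}} = 1
assert np.allclose(anti(c[0], c[1].T), 0)            # mixed, different modes: 0
```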
Let $|p\rangle$ be the vector defined by:

$$|p\rangle \equiv \beta_{-p+\frac{1}{2}} \cdots \beta_{-\frac{1}{2}}\, |0\rangle \qquad \text{for } p > 0 \tag{9.11}$$

$$|p\rangle \equiv \beta^*_{p+\frac{1}{2}} \cdots \beta^*_{-\frac{1}{2}}\, |0\rangle \qquad \text{for } p < 0$$

These states have charge $p$ and satisfy $\beta_{r-p}\, |p\rangle = \beta^*_{r+p}\, |p\rangle = 0$ for $r \geq \frac{1}{2}$. We call them charged vacuum vectors. The states with charge $p$ are obtained by acting on the vacuum $|p\rangle$ with an equal number of $\beta_s$ and $\beta^*_r$. In the following we will mostly restrict ourselves to neutral states, which are linear combinations of states of the form eq. (9.10) with the same number of $\beta$ and $\beta^*$.
We will need the notion of normal order of fermionic operators. This order consists in writing all operators $\beta_r$, $\beta^*_r$ with positive indices $r$ on the right, multiplying by the signature of the permutation involved. The important property of normal ordered products is that their vacuum expectation value vanishes. The relation between a monomial and its normal ordered form is given by Wick's theorem. For two fermions it reads:

$$\beta_r\, \beta^*_s = \;:\beta_r \beta^*_s: + \langle 0|\beta_r \beta^*_s|0\rangle, \qquad :\beta_r \beta^*_s: \;=\; -:\beta^*_s \beta_r:$$

This can be seen as follows. To bring the left-hand side to its normal ordered form we have to perform at most one anticommutation, hence adding at most a c-number, which can only be the vacuum expectation of the left-hand side, since the normal product has vanishing vacuum expectation value. This scalar is called the contraction of the two operators. For more than two operators, Wick's theorem is expressed by the induction formula:

$$\phi_1 \;:\phi_2 \cdots \phi_n: \;=\; :\phi_1 \phi_2 \cdots \phi_n: + \sum_{i=2}^{n} (-1)^i\, \langle 0|\phi_1 \phi_i|0\rangle\;\, :\phi_2 \cdots \widehat{\phi}_i \cdots \phi_n:$$

where $\phi_j$ is either some $\beta_r$ or some $\beta^*_s$, and the notation $\widehat{\phi}_i$ means omission of the factor $\phi_i$. It follows that $\langle 0|\phi_1 \phi_2 \cdots \phi_n|0\rangle$ is equal to the sum of all possible products of contractions with appropriate signs. We now construct the algebra $\widehat{gl}(\infty)$. Consider neutral bilinear operators of the form $\beta_r \beta^*_s$. They form a closed algebra under commutation. For example:

$$[\beta_r \beta^*_s,\; \beta_n \beta^*_m] = \delta_{s+n,0}\, (\beta_r \beta^*_m) - \delta_{r+m,0}\, (\beta_n \beta^*_s)$$
Let $M_{rs}$ be a band matrix, i.e. an infinite-dimensional matrix such that $M_{rs} = 0$ for $|r - s| > N$ for some given $N$. We can define the formal operator $X = \sum_{r,s} M_{rs}\, \beta_r \beta^*_{-s}$. The commutator of two such objects can be computed without ambiguity, due to the band structure of the matrices $M$ and $N$:

$$[X, Y] = \sum_{r,s,n,m} M_{rs}\, N_{nm}\, [\beta_r \beta^*_{-s},\; \beta_n \beta^*_{-m}] = \sum_{r,m} [M, N]_{rm}\;\, \beta_r \beta^*_{-m}$$

We reproduce the Lie algebra of (band) matrices. However, objects like $X$ cannot be represented on the Fock space, since their matrix elements on this space may be infinite. For example, $\langle 0|X|0\rangle = \sum_{r>0} M_{rr}$, which can be infinite. To overcome this problem we normal order the bilinear fermionic operators. Then all matrix elements are finite between states with a finite number of particles. However, this induces a modification of the commutation rules:

$$\big[\, :\beta_r \beta^*_{-s}:\,,\; :\beta_n \beta^*_{-m}:\, \big] = \delta_{s,n}\, :\beta_r \beta^*_{-m}: -\, \delta_{r,m}\, :\beta_n \beta^*_{-s}: +\, \delta_{s,n}\, \langle 0|\beta_r \beta^*_{-m}|0\rangle - \delta_{r,m}\, \langle 0|\beta_n \beta^*_{-s}|0\rangle \tag{9.12}$$

The additional c-number term is the central extension of $gl(\infty)$, and is equal to $\delta_{rm}\, \delta_{sn}\, (\theta(m) - \theta(s))$, where $\theta(m) = 1$ for $m > 0$ and vanishes for $m < 0$.

Definition. The algebra $\widehat{gl}(\infty)$ is the infinite-dimensional Lie algebra of elements of the form:

$$\widehat{gl}(\infty) = \Big\{\, X = \sum_{r,s} M_{rs}\, :\beta_r \beta^*_{-s}:\;, \quad M_{rs} = 0 \text{ if } |r - s| \gg 0 \,\Big\} \oplus \mathbb{C} \tag{9.13}$$

equipped with the Lie bracket eq. (9.12). Here $|r - s| \gg 0$ means that there exists $N > 0$ such that $M_{rs} = 0$ when $|r - s| > N$. We will need to consider the group $GL(\infty)$ associated with this Lie algebra. Its precise definition is somewhat tricky. We will adopt here a very naive point of view, and consider group elements of the form $g = e^{X_1} e^{X_2} \cdots e^{X_k}$, with $X_1, \ldots, X_k \in \widehat{gl}(\infty)$ defined by finite sums of bilinears in the fermionic operators. It is then clear that such group elements are well represented in the Fock space. Of course we will quickly encounter the need to extend this setting to infinite sums of bilinears; then one has to be very careful about convergence properties, but we will not attempt to define a general framework and refer instead to the literature. The group $GL(\infty)$ acts on the fermions by conjugation.
Proposition. For any $g \in GL(\infty)$ we have:

$$g\, \beta_s\, g^{-1} = \sum_r \beta_r\; a_{rs} \tag{9.14}$$

$$g\, \beta^*_{-s}\, g^{-1} = \sum_r \beta^*_{-r}\; (a^{-1})_{sr} \tag{9.15}$$

for some c-number matrix $a_{rs}$. With our definition of $GL(\infty)$, the matrices $a_{rs}$ have only a finite, but arbitrarily large, number of non-zero entries.

Proof. We use a matrix notation: $\beta$ is an infinite row vector with elements $\beta_s$, and $\beta^*$ is an infinite column vector with elements $\beta^*_{-s}$. We want to show that, for a given $g$,

$$g\, \beta\, g^{-1} = \beta \cdot a, \qquad g^{-1}\, \beta^*\, g = a \cdot \beta^*$$

with the same matrix $a$. Note that if $g_1 \beta g_1^{-1} = \beta a_1$ and $g_2 \beta g_2^{-1} = \beta a_2$, then $g_1 g_2\, \beta\, g_2^{-1} g_1^{-1} = \beta\, a_1 a_2$; and similarly if $g_1^{-1} \beta^* g_1 = b_1 \beta^*$ and $g_2^{-1} \beta^* g_2 = b_2 \beta^*$, then $g_2^{-1} g_1^{-1}\, \beta^*\, g_1 g_2 = b_1 b_2\, \beta^*$. So it is enough to verify that $a = b$ on simple generators. We take

$$g = e^{:\beta_r \beta^*_s:} = e^{-\langle 0|\beta_r \beta^*_s|0\rangle}\; e^{\beta_r \beta^*_s}$$

We see that the normal ordering produces simple scalar factors which do not contribute to the adjoint action, and we can omit it. If $(r + s) \neq 0$, we have $g = e^{\beta_r \beta^*_s} = 1 + \beta_r \beta^*_s$, and therefore:

$$g\, \beta_k\, g^{-1} = \beta_k + [\beta_r \beta^*_s,\, \beta_k] = \sum_l \beta_l\, \big(\delta_{kl} + \delta_{rl}\, \delta_{s+k,0}\big)$$

$$g^{-1}\, \beta^*_{-l}\, g = \beta^*_{-l} - [\beta_r \beta^*_s,\, \beta^*_{-l}] = \sum_k \big(\delta_{kl} + \delta_{rl}\, \delta_{s+k,0}\big)\; \beta^*_{-k}$$

If $(r + s) = 0$ then $g = e^{\beta_r \beta^*_{-r}}$, and

$$g\, \beta_k\, g^{-1} = \beta_k\; e^{\delta_{rk}}, \qquad g^{-1}\, \beta^*_{-k}\, g = \beta^*_{-k}\; e^{\delta_{rk}}$$

This proves the result.

It is convenient to work with generating functions $\beta(z)$ and $\beta^*(z)$, also called fermionic fields, defined by:

$$\beta(z) = \sum_{r \in \mathbb{Z}+\frac{1}{2}} \beta_r\; z^{-r-1/2} \tag{9.16}$$

$$\beta^*(z) = \sum_{r \in \mathbb{Z}+\frac{1}{2}} \beta^*_r\; z^{-r-1/2} \tag{9.17}$$

Notice that $\beta(z)$ and $\beta^*(z)$ are single-valued, i.e. $\beta(z e^{2i\pi}) = \beta(z)$, and similarly for $\beta^*(z)$.
Proposition. The fermionic fields satisfy the following "operator product expansion":

$$\beta(z)\, \beta^*(w) = \;:\beta(z)\beta^*(w): + \frac{1}{z-w}, \qquad \text{for } |z| > |w| \tag{9.18}$$

Proof. It is enough to compute the vacuum expectation value of the left-hand side. This is given by $\sum_{r,s} \langle 0|\beta_r \beta^*_s|0\rangle\, z^{-r-1/2}\, w^{-s-1/2}$. Since $\langle 0|\beta_r \beta^*_s|0\rangle = 1$ when $r = -s > 0$ and vanishes otherwise, we get

$$\langle 0|\beta(z)\beta^*(w)|0\rangle = \frac{1}{z} \sum_{i=0}^{\infty} \Big(\frac{w}{z}\Big)^i = \frac{1}{z-w} \qquad \text{when } |z| > |w|.$$
By Wick's theorem this can be generalized to any product of fields, yielding:

$$\langle 0|\; \prod_{j=1}^{N} \beta(z_j)\; \prod_{j=1}^{N} \beta^*(w_j)\; |0\rangle = (-1)^{\frac{N(N-1)}{2}}\;\; \det\Big(\frac{1}{z_i - w_j}\Big) \tag{9.19}$$
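The determinant appearing on the right-hand side of eq. (9.19) is a Cauchy determinant; the identity it satisfies (eq. (9.33) below) is easy to confirm numerically. A small NumPy check (our illustration; the points are random, shifted so that the $z_i$ and $w_j$ stay well separated):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
z = rng.standard_normal(N) + 5.0      # keep z's away from w's
w = rng.standard_normal(N)

lhs = (-1)**(N*(N-1)//2) * np.linalg.det(1.0 / (z[:, None] - w[None, :]))
num = np.prod([(z[i]-z[j])*(w[i]-w[j])
               for i in range(N) for j in range(i+1, N)])
rhs = num / np.prod(z[:, None] - w[None, :])
assert np.isclose(lhs, rhs)           # the Cauchy determinant formula
```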
A direct consequence of eqs. (9.14, 9.15) is that there exists a fermionic analogue of the tensor Casimir. This fact is the crucial point of the fermionic construction of the Hirota equations.

Proposition. Let $S_{12}$ be defined by:

$$S_{12} = \sum_{r \in \mathbb{Z}+\frac{1}{2}} \beta_r \otimes \beta^*_{-r} = \oint \frac{dz}{2i\pi}\;\; \beta(z) \otimes \beta^*(z) \tag{9.20}$$

Then

$$S_{12}\;\, g \otimes g = g \otimes g\;\, S_{12}, \qquad \forall\, g \in GL(\infty) \tag{9.21}$$

The operator $(S^\dagger S)$ is the tensor Casimir for $\widehat{gl}(\infty)$.

Proof. Equation (9.21) is equivalent to:

$$\sum_{r \in \mathbb{Z}+1/2} \beta_r\, g \otimes \beta^*_{-r}\, g = \sum_{r \in \mathbb{Z}+1/2} g\beta_r \otimes g\beta^*_{-r}, \qquad \forall\, g \in GL(\infty)$$

From eqs. (9.14, 9.15) we have $\beta_r\, g = \sum_s g\beta_s\, (a^{-1})_{sr}$ and $\beta^*_{-r}\, g = \sum_s a_{rs}\, g\beta^*_{-s}$, so that

$$\sum_r \beta_r\, g \otimes \beta^*_{-r}\, g = \sum_{r,s,k} g\beta_s \otimes g\beta^*_{-k}\;\, (a^{-1})_{sr}\, a_{rk} = \sum_s g\beta_s \otimes g\beta^*_{-s}$$
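The intertwining property eq. (9.21) has a finite-mode analogue that can be tested directly: with a finite set of fermionic modes $c_k$, the operator $S = \sum_k c_k \otimes c_k^\dagger$ commutes with $g \otimes g$ for $g$ the exponential of a bilinear in $c^\dagger c$. A numerical sketch with two Jordan–Wigner modes (our illustration, not from the text):

```python
import numpy as np
from scipy.linalg import expm

# two fermionic modes via Jordan-Wigner
sm = np.array([[0.0, 0.0], [1.0, 0.0]])
sz = np.diag([1.0, -1.0])
c = [np.kron(sm, np.eye(2)), np.kron(sz, sm)]

# S = sum_k c_k (x) c_k^dagger, the finite analogue of S12
S = sum(np.kron(ck, ck.T) for ck in c)

# g = exp(bilinear in c^dagger c), the finite analogue of an element of GL(infinity)
rng = np.random.default_rng(0)
M = rng.standard_normal((2, 2))
X = sum(M[i, j] * c[i].T @ c[j] for i in range(2) for j in range(2))
g = expm(X)
G = np.kron(g, g)
assert np.allclose(S @ G, G @ S)     # S g (x) g = g (x) g S
```

Infinitesimally this follows from $[c_k, c_i^\dagger c_j] = \delta_{ki} c_j$ and $[c_k^\dagger, c_i^\dagger c_j] = -\delta_{kj} c_i^\dagger$, whose contributions cancel in the commutator with $S$.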
Notice that the definition of the vacuum implies that $\beta_r|0\rangle \otimes \beta^*_{-r}|0\rangle = 0$ for all $r$, since either $\beta_r|0\rangle = 0$ or $\beta^*_{-r}|0\rangle = 0$. Therefore

$$S_{12}\;\, |0\rangle \otimes |0\rangle = \sum_r \beta_r|0\rangle \otimes \beta^*_{-r}|0\rangle = 0 \tag{9.22}$$

Combining eqs. (9.21, 9.22) we obtain:

$$S_{12}\;\, g|0\rangle \otimes g|0\rangle = 0 \tag{9.23}$$

for any $g \in GL(\infty)$. This relation is the fermionic analogue of the bosonic equation eq. (9.4) and will be very important in the following.

Remark. The transformations eqs. (9.14, 9.15) can alternatively be written in a more compact way by introducing the functions $A^g_n(z)$ for $n \in \mathbb{Z}$:

$$A^g_n(z) = \sum_{m \in \mathbb{Z}} a_{m+\frac{1}{2},\, n+\frac{1}{2}}\;\, z^m \tag{9.24}$$

We then have:

$$g\, \beta_s\, g^{-1} = \oint \frac{dz}{2i\pi}\;\, \beta(z)\, A^g_{s-\frac{1}{2}}(z)$$

In eq. (9.24) the sum is finite as long as we use the previous definition of the group $GL(\infty)$. As pointed out above, extending this construction so that the function $A^g_n(z)$ involves an infinite sum requires choosing an appropriate completion of $GL(\infty)$.
9.3 Boson–fermion correspondence

We now come to a truly remarkable result known as the boson–fermion correspondence. This will be the main technical tool used in the forthcoming sections. We construct bosonic operators from the fermions and conversely. The bosonic operators $H_n$ are simply bilinear in the fermionic operators.

Proposition. Define the operators $H_n$ by:

$$H(z) = \;:\beta(z)\beta^*(z): \;=\; \sum_{n \in \mathbb{Z}} z^{-n-1}\, H_n \tag{9.25}$$

or equivalently $H_n = \sum_r :\beta_r \beta^*_{-r+n}:$. They obey the following commutation relations:

$$[H_n, \beta(z)] = z^n\, \beta(z), \qquad [H_n, \beta^*(z)] = -z^n\, \beta^*(z), \qquad [H_n, H_m] = n\, \delta_{n+m,0} \tag{9.26}$$
Proof. The proof follows from standard manipulations using Wick's theorem. By definition, we have:

$$H_n = \oint \frac{dz}{2i\pi}\;\, z^n\, :\beta(z)\beta^*(z):$$

Using Wick's theorem, we may write the product $H_n\, \beta(w)$ as:

$$H_n\, \beta(w) = \oint_{|z|>|w|} \frac{dz}{2i\pi}\; z^n\, :\beta(z)\beta^*(z):\, \beta(w) = \oint \frac{dz}{2i\pi}\; z^n\, :\beta(z)\beta^*(z)\beta(w): - \oint \frac{dz}{2i\pi}\; z^n\, \frac{\beta(z)}{w - z}$$

Here the integration over $z$ is performed on a circle $C_1$ around the origin enclosing $w$, so that we can apply eq. (9.18). Similarly,

$$\beta(w)\, H_n = \oint_{|z|<|w|} \frac{dz}{2i\pi}\; z^n\, \beta(w)\, :\beta(z)\beta^*(z): = \oint \frac{dz}{2i\pi}\; z^n\, :\beta(z)\beta^*(z)\beta(w): - \oint \frac{dz}{2i\pi}\; z^n\, \frac{\beta(z)}{w - z}$$

The integral is over a circle $C_2$ around the origin not enclosing $w$. So the commutator $[H_n, \beta(w)]$ is:

$$[H_n, \beta(w)] = -\oint_{C_1 - C_2} \frac{dz}{2i\pi}\; z^n\, \frac{\beta(z)}{w - z} = \oint_{C_w} \frac{dz}{2i\pi}\; z^n\, \frac{\beta(z)}{z - w} = w^n\, \beta(w) \tag{9.27}$$

where $C_w$ is a small circle around $w$. The other relations are proved similarly.

From eq. (9.26), we see that the $H_n$ obey bosonic commutation relations. Moreover, we have $H_n^\dagger = H_{-n}$. The operator $H_0$ is both Hermitian and in the centre of the bosonic algebra. Let us now describe the bosonic Fock space. The vacuum vector $|l\rangle$ is such that:

$$H_n |l\rangle = 0 \quad \text{for } n > 0, \qquad H_0 |l\rangle = l\, |l\rangle, \quad l \in \mathbb{Z}$$

Here we choose $l \in \mathbb{Z}$ because, in terms of fermionic operators, $H_0$ has integer spectrum. Over each state $|l\rangle$ we construct a bosonic Fock space by acting with the creation operators $H_n$, $n < 0$. A basis of the bosonic Fock space consists of the states:

$$H_{-n_1} \cdots H_{-n_k}\, |l\rangle$$
We will say that these states have charge $l$, so the charge is the $H_0$ eigenvalue. In particular, the neutral bosonic Fock space is spanned by the states obtained by acting with the $H_n$ on the vacuum $|0\rangle$. We have constructed bosons from fermions; the unexpected result is that fermions can be reconstructed from bosons. We start with bosonic operators obeying the commutation relations $[H_n, H_m] = n\,\delta_{n+m,0}$. As already noticed, $H_0$ is in the centre of this algebra, so we call it "momentum" and denote it $p$. We enlarge the algebra by introducing a "position" operator $q$ such that $[p, q] = -i$. On the vacua $|l\rangle$, the operator $e^{iq}$ acts as a translation operator, $e^{iq}|l\rangle = |l+1\rangle$, compatible with $p|l\rangle = l|l\rangle$.

Proposition. Let $[p, q] = -i$ and define

$$\phi(z) = q - i\, p \log z + i \sum_{n \neq 0} \frac{H_n}{n}\, z^{-n}$$

Then the fermionic fields are reconstructed by the formulae:

$$\beta(z) = V_+(z), \qquad \beta^*(z) = V_-(z) \tag{9.28}$$

where $V_\pm(z) = \,:\exp(\pm i\phi(z)):$. The colons $:\;:$ denote the bosonic normal ordering, which consists of writing the operators $H_n$ with $n > 0$ on the right, and the $e^{\pm iq}$ factor on the left. Explicitly, the operators $V_\pm(z)$ read:

$$V_\pm(z) = e^{\pm iq}\, z^{\pm p}\; \exp\Big(\mp \sum_{n<0} \frac{z^{-n}}{n}\, H_n\Big)\; \exp\Big(\mp \sum_{n>0} \frac{z^{-n}}{n}\, H_n\Big) \tag{9.29}$$

The operators $V_\pm(z)$ are called vertex operators.

Proof. The proof relies on the following relations:

$$V_\pm(z)\, V_\pm(w) = (z-w)\;\, :e^{\pm i\phi(z)}\, e^{\pm i\phi(w)}:\,, \qquad |z| > |w|$$

$$V_\pm(z)\, V_\mp(w) = \frac{1}{z-w}\;\, :e^{\pm i\phi(z)}\, e^{\mp i\phi(w)}:\,, \qquad |z| > |w| \tag{9.30}$$
They follow immediately from the standard Campbell–Hausdorff formula $e^A e^B = e^B e^A e^{[A,B]}$ for two operators $A$ and $B$ whose commutator $[A, B]$ is a c-number. To compute the anticommutator $\{\beta_r, \beta^*_s\}$, we represent the fermionic operators as contour integrals, $\beta_r = \oint \frac{dz}{2i\pi}\, z^{r-1/2}\, :\exp(i\phi(z)):$, and similarly for $\beta^*_s$. We then use the above relations and apply contour integral manipulations as in the previous proposition:

$$\{\beta_r, \beta^*_s\} = \Big(\oint_{|z|>|w|} - \oint_{|z|<|w|}\Big)\, \frac{dz\, dw}{(2i\pi)^2}\;\, \frac{z^{r-1/2}\, w^{s-1/2}}{z-w}\;\, :e^{i\phi(z)}\, e^{-i\phi(w)}: \;=\; \oint_{C_0} \frac{dw}{2i\pi} \oint_{C_w} \frac{dz}{2i\pi}\;\, \frac{z^{r-1/2}\, w^{s-1/2}}{z-w}\;\, :e^{i\phi(z)}\, e^{-i\phi(w)}: \;=\; \delta_{r+s,0}$$

where $C_0$ and $C_w$ are small circles around $0$ and $w$ respectively. The proof that $\{\beta^*_r, \beta^*_s\} = \{\beta_r, \beta_s\} = 0$ is similar.

When describing the bosonic Fock space, we denoted by $|l\rangle$ states which are eigenstates of the momentum operator, $p|l\rangle = l|l\rangle$, and which satisfy $H_n|l\rangle = 0$ for $n > 0$. Using the bosonic construction of the fermions, one verifies that these states $|l\rangle$ also satisfy $\beta_{r-l}|l\rangle = \beta^*_{r+l}|l\rangle = 0$ for $r \geq \frac{1}{2}$. Thus they coincide with the states that we defined in eq. (9.11), i.e. the fermionic charged vacua of charge $l$. The boson–fermion correspondence may also be viewed at the level of vacuum expectation values as a consequence of Cauchy's determinant formula. Wick's theorem allows us to compute expectation values of vertex operators:

$$\langle 0|\; \prod_{j=1}^{N} :e^{i\phi(z_j)}: \;\; \prod_{j=1}^{N} :e^{-i\phi(w_j)}: \;|0\rangle = \frac{\prod_{i<j} (z_i - z_j)(w_i - w_j)}{\prod_{i,j} (z_i - w_j)} \tag{9.31}$$

Similarly we have:

$$\langle 0|\; \prod_{j=1}^{N} :e^{i\phi(z_j)}: \;|-N\rangle = \prod_{i<j} (z_i - z_j)\;\, \prod_i z_i^{-N} \tag{9.32}$$

The bosonization formula will follow if the fermionic expectation value eq. (9.19) coincides with the bosonic expectation value eq. (9.31), specifically, if we have:

$$(-1)^{\frac{N(N-1)}{2}}\; \det\Big(\frac{1}{z_i - w_j}\Big) = \frac{\prod_{i<j} (z_i - z_j)(w_i - w_j)}{\prod_{i,j} (z_i - w_j)} \tag{9.33}$$

This is nothing but the Cauchy determinant formula.

9.4 Tau-functions and Hirota bilinear identities

The tau-functions are functions on the orbit of the vacuum $|0\rangle$ under the action of the group $GL(\infty)$. To define them as in eq. (9.6), we again introduce the function $H(t)$:

$$H(t) = \sum_{n>0} t_n\, H_n$$
Definition. We define the tau-function $\tau(t; g)$ as the vacuum expectation value:

$$\tau(t; g) = \langle 0|\, e^{H(t)}\, g\, |0\rangle, \qquad g \in GL(\infty) \tag{9.34}$$

We wish to prove that the tau-functions obey Hirota equations. We repeat the argument of Section 9.1, but using the operator $S_{12}$ in place of $C_{12}$. From eq. (9.23) we get:

$$S_{12} \cdot g|0\rangle \otimes g|0\rangle = \oint \frac{dz}{2i\pi}\;\, \beta(z)\, g|0\rangle \otimes \beta^*(z)\, g|0\rangle = 0 \tag{9.35}$$

This is an identity on vectors in Fock space belonging to the orbit of the vacuum vector. It translates readily into bilinear identities on the tau-functions.

Proposition. Let $\tau(t; g)$ be the tau-function defined in eq. (9.34). It satisfies the identities:

$$\oint \frac{dz}{2i\pi}\;\, \exp\Big(\sum_{n>0} 2 z^n y_n\Big)\; \exp\Big(-\sum_{n>0} \frac{z^{-n}}{n}\, \frac{\partial}{\partial y_n}\Big)\;\; \tau(t+y; g)\, \tau(t-y; g) = 0 \tag{9.36}$$
Proof. Applying $e^{H(t')} \otimes e^{H(t'')}$ to eq. (9.35), and taking the inner product with $\langle +1| \otimes \langle -1|$, we get:

$$\oint \frac{dz}{2i\pi}\;\, \langle +1|\, e^{H(t')}\, \beta(z)\, g|0\rangle\;\; \langle -1|\, e^{H(t'')}\, \beta^*(z)\, g|0\rangle = 0$$

To compute $\langle +1|\, e^{H(t')}\, \beta(z)\, g|0\rangle$ and $\langle -1|\, e^{H(t'')}\, \beta^*(z)\, g|0\rangle$ we use the bosonization formula. This gives, as we show below:
$$\langle +1|\, e^{H(t')}\, \beta(z)\, g|0\rangle = V_+(z; t')\; \tau(t'; g)$$

$$\langle -1|\, e^{H(t'')}\, \beta^*(z)\, g|0\rangle = V_-(z; t'')\; \tau(t''; g) \tag{9.37}$$

where $V_\pm(z; t)$ are the vertex operators in their differential operator representation:

$$V_\pm(z; t) = \exp\Big(\pm \sum_{n>0} z^n t_n\Big)\; \exp\Big(\mp \sum_{n>0} \frac{z^{-n}}{n}\, \frac{\partial}{\partial t_n}\Big)$$
Indeed, inserting eq. (9.28) into the left-hand side of eq. (9.37) we get:

$$\langle +1|\, e^{H(t')}\, \beta(z)\, g|0\rangle = \langle +1|\, e^{H(t')}\, e^{iq}\, z^{p}\; \exp\Big(-\sum_{n<0} \frac{z^{-n}}{n} H_n\Big)\; \exp\Big(-\sum_{n>0} \frac{z^{-n}}{n} H_n\Big)\, g|0\rangle$$

Noticing that $\langle +1|\, e^{iq}\, z^p = \langle 0|$, and commuting $\exp\big(-\sum_{n<0} \frac{z^{-n}}{n} H_n\big)$ to the left using the Campbell–Hausdorff formula, we get, using $\langle 0|H_n = 0$ for $n < 0$:

$$\langle 0|\, e^{H(t')}\; \exp\Big(-\sum_{n<0} \frac{z^{-n}}{n} H_n\Big) = e^{\xi(t',z)}\; \langle 0|\, e^{H(t')}, \qquad \xi(t, z) = \sum_{n>0} t_n z^n$$

Hence eq. (9.37) reads:

$$e^{\xi(t',z)}\; \langle 0|\, e^{H(t')}\; \exp\Big(-\sum_{n>0} \frac{z^{-n}}{n} H_n\Big)\, g|0\rangle = \exp\Big(\sum_{n>0} z^n t'_n\Big)\; \exp\Big(-\sum_{n>0} \frac{z^{-n}}{n} \frac{\partial}{\partial t'_n}\Big)\;\, \langle 0|\, e^{H(t')}\, g|0\rangle$$

With these formulae, the bilinear identity becomes:

$$\oint \frac{dz}{2i\pi}\;\, V_+(z; t')\; V_-(z; t'')\;\; \tau(t'; g)\, \tau(t''; g) = 0 \tag{9.38}$$

which is eq. (9.36) if we use the variables $t = \frac{1}{2}(t' + t'')$ and $y = \frac{1}{2}(t' - t'')$.

The identity eq. (9.36) contains an infinite set of bilinear Hirota equations. Let us introduce the Hirota operators, see eq. (8.55) in Chapter 8, for the infinite set of variables $t_n$:

$$D = \{D_1, D_2, \ldots, D_n, \ldots\}$$

Equation (9.36) can be rewritten in terms of these operators as:

$$\oint \frac{dz}{2i\pi}\;\, e^{\sum_{n>0} 2 z^n y_n}\; e^{-\sum_{n>0} \frac{z^{-n}}{n} D_n}\; \Big(e^{\sum_{n>0} y_n D_n}\Big)\; \tau(t; g) \cdot \tau(t; g) = 0$$

Using the definition of the elementary Schur polynomials $P_k(t)$:

$$e^{\xi(t,z)} = \exp\Big(\sum_{n=1}^{\infty} t_n z^n\Big) = \sum_{k=0}^{\infty} P_k(t)\, z^k \tag{9.39}$$
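The first few elementary Schur polynomials defined by eq. (9.39) are $P_0 = 1$, $P_1 = t_1$, $P_2 = \tfrac{1}{2}t_1^2 + t_2$, $P_3 = \tfrac{1}{6}t_1^3 + t_1 t_2 + t_3$. They can be generated mechanically with SymPy (a quick check, not from the text):

```python
import sympy as sp

t = sp.symbols('t1:6')            # t1, ..., t5
z = sp.symbols('z')
xi = sum(t[n-1]*z**n for n in range(1, 6))

# expand exp(xi(t, z)) in powers of z and read off the Schur polynomials
series = sp.series(sp.exp(xi), z, 0, 6).removeO().expand()
P = [series.coeff(z, k) for k in range(6)]

assert P[0] == 1
assert P[1] == t[0]
assert sp.expand(P[2] - (t[0]**2/2 + t[1])) == 0
assert sp.expand(P[3] - (t[0]**3/6 + t[0]*t[1] + t[2])) == 0
```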
we can expand in $z$ the quantity $\exp\big(\sum_{n>0} 2 z^n y_n\big)$. Integrating over $z$ shows that the bilinear identity eq. (9.36) is equivalent to:

$$\sum_{j=0}^{\infty} P_j(2y)\, P_{j+1}(-\widetilde{D})\;\; \exp\Big(\sum_{n>0} y_n D_n\Big)\;\; \tau(t; g) \cdot \tau(t; g) = 0, \qquad \forall\, y_n \tag{9.40}$$

where $\widetilde{D} = \{D_1, \tfrac{1}{2} D_2, \ldots, \tfrac{1}{n} D_n, \ldots\}$. Expanding this equation in powers of $y$, we get an infinite set of Hirota equations called the KP hierarchy. For instance, the coefficient of $y_1^3$ gives the equation

$$(D_1^4 + 3 D_2^2 - 4 D_1 D_3)\;\, \tau \cdot \tau = 0$$

which is the Hirota form of the KP equation, see eq. (8.56) in Chapter 8.

Remark. Since the states $|l\rangle$ satisfy $\beta_{r-l}|l\rangle = \beta^*_{r+l}|l\rangle = 0$ for $r \geq \frac{1}{2}$, we also have the relation

$$S_{12} \cdot g|l\rangle \otimes g|l'\rangle = \sum_r \beta_r\, g|l\rangle \otimes \beta^*_{-r}\, g|l'\rangle = 0$$

Thus we can similarly prove more general bilinear identities for the tau-functions $\tau_l(t; g) = \langle l|\, e^{H(t)}\, g\, |l\rangle$:

$$\oint \frac{dz}{2i\pi}\;\, z^{l-l'}\;\, V_+(z; t')\; V_-(z; t'')\;\; \tau_l(t'; g)\, \tau_{l'}(t''; g) = 0, \qquad \forall\, t', t'',\; l \geq l' \tag{9.41}$$
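One can verify symbolically that a one-soliton tau-function of the form $\tau = 1 + e^{\eta}$, $\eta = \xi(t, p) - \xi(t, q)$, satisfies the KP equation in Hirota form, $(D_1^4 + 3D_2^2 - 4D_1D_3)\,\tau\cdot\tau = 0$. A SymPy sketch (our illustration; the sample momenta are arbitrary, with $|p| > |q|$):

```python
import sympy as sp

t1, t2, t3, y1, y2, y3 = sp.symbols('t1 t2 t3 y1 y2 y3')
p, q = sp.Rational(2), sp.Rational(1, 3)

def tau(a, b, c):
    # one-soliton tau: 1 + exp(eta), eta = xi(t, p) - xi(t, q), constant set to 0
    eta = (p - q)*a + (p**2 - q**2)*b + (p**3 - q**3)*c
    return 1 + sp.exp(eta)

# Hirota operators act as derivatives in y of tau(t + y) tau(t - y), at y = 0
F = tau(t1 + y1, t2 + y2, t3 + y3) * tau(t1 - y1, t2 - y2, t3 - y3)
kp = sp.diff(F, y1, 4) + 3*sp.diff(F, y2, 2) - 4*sp.diff(F, y1, 1, y3, 1)
assert sp.simplify(kp.subs({y1: 0, y2: 0, y3: 0})) == 0
```

The cancellation reflects the identity $(p-q)^4 + 3(p^2-q^2)^2 - 4(p-q)(p^3-q^3) = 0$, valid for any $p, q$.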
9.5 The KP hierarchy and its soliton solutions

To produce a solution of the KP hierarchy, we need to choose an element $g \in GL(\infty)$ and calculate the tau-function. A particularly simple choice is:

$$g_{1\text{-sol}} = e^{a\,\beta(p)\,\beta^*(q)} = 1 + a\,\beta(p)\,\beta^*(q), \qquad \text{with } |p| > |q|$$

This corresponds to the one-soliton solution. In principle one should normal order $\beta(p)\beta^*(q)$ in the exponential, but this would only amount to a multiplicative prefactor, which is irrelevant in the bilinear Hirota equations. The $N$-soliton solution is obtained by choosing the group element to be the product of $N$ one-soliton factors.

Proposition. The $N$-soliton tau-function of the KP hierarchy is given by:

$$\tau_N(t; g) = \langle 0|\, e^{H(t)}\, g_1 g_2 \cdots g_N\, |0\rangle, \qquad \text{with } g_i = 1 + a_i\, \beta(p_i)\, \beta^*(q_i)$$
Explicitly, it is equal to:

$$\tau_N(t; g) = 1 + \sum_{r=1}^{N} \;\sum_{\substack{I \subset \{1,\ldots,N\} \\ |I| = r}}\;\; \prod_{\substack{i<j \\ i,j \in I}} \frac{(p_i - p_j)(q_i - q_j)}{(p_i - q_j)(q_i - p_j)}\;\; \prod_{i \in I} e^{\eta_i} \tag{9.42}$$

where $\eta_i = \xi(t, p_i) - \xi(t, q_i) + \log \dfrac{a_i}{p_i - q_i}$, with $\xi(t, p) = \sum_n p^n t_n$.
Proof. Using the commutation relations eq. (9.26) of the fermions with $H(t)$, we get the "evolution equations" of the fermionic fields:

$$e^{H(t)}\, \beta(z)\, e^{-H(t)} = e^{\xi(t,z)}\, \beta(z) \tag{9.43}$$

$$e^{H(t)}\, \beta^*(z)\, e^{-H(t)} = e^{-\xi(t,z)}\, \beta^*(z) \tag{9.44}$$

This allows us to commute $e^{H(t)}$ to the right, where it acts trivially on the vacuum, getting:

$$\tau_N(t; g) = \langle 0|\; \prod_{i=1}^{N} \Big(1 + a_i\, e^{\xi(t,p_i) - \xi(t,q_i)}\, \beta(p_i)\, \beta^*(q_i)\Big)\; |0\rangle \tag{9.45}$$
Expanding this formula with Wick's theorem, we get the tau-function as a sum of determinants:

$$\tau_N(t; g) = 1 + \sum_i W_{ii} + \sum_{i<j} \det\,[W_{(ij)}]_{2\times 2} + \sum_{i<j<k} \det\,[W_{(ijk)}]_{3\times 3} + \cdots \tag{9.46}$$

where $[W_{(i_1 i_2 \cdots i_r)}]_{r\times r}$ is the $r \times r$ submatrix of $W$ obtained by selecting the rows and columns with indices $i_1 i_2 \cdots i_r$, e.g. $\det\,[W_{(ij)}]_{2\times 2} = W_{ii} W_{jj} - W_{ij} W_{ji}$. Here $W$ is an $N \times N$ matrix with elements:

$$W_{ij} = X_i X_j\;\, \langle 0|\, \beta(p_i)\, \beta^*(q_j)\, |0\rangle = \frac{X_i X_j}{p_i - q_j}, \qquad \log \frac{X_i^2}{a_i} = \xi(t, p_i) - \xi(t, q_i)$$

Equation (9.42) immediately follows by applying the Cauchy formula eq. (9.33):

$$\tau_N(t; g) = 1 + \sum_i e^{\eta_i} + \sum_{i<j} e^{\eta_i + \eta_j}\;\, \frac{(p_i - p_j)(q_i - q_j)}{(p_i - q_j)(q_i - p_j)} + \cdots$$

Equivalently, we can write the tau-function as a determinant:

$$\tau_N(t; g) = \det\Big(\delta_{ij} + \frac{X_i X_j}{p_i - q_j}\Big)$$
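The expansion of a determinant into sums of principal minors, which underlies this determinant representation (and eq. (9.47) below), can be checked numerically; a small NumPy sketch (our illustration, with a random $3 \times 3$ matrix):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))

def sum_principal_minors(W, k):
    """Sum of all k x k principal minors of W."""
    n = W.shape[0]
    return sum(np.linalg.det(W[np.ix_(idx, idx)])
               for idx in itertools.combinations(range(n), k))

expansion = 1 + sum(sum_principal_minors(W, k) for k in (1, 2, 3))
assert np.isclose(np.linalg.det(np.eye(3) + W), expansion)
```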
This is proved by comparing eq. (9.46) with the well-known expansion formula:

$$\det(1 + W) = 1 + \sum_i W_{ii} + \sum_{i<j} \det\,[W_{(ij)}]_{2\times 2} + \cdots \tag{9.47}$$

9.6 Fermions and Grassmannians

The previous construction can be given a more geometrical interpretation by relating it to Grassmannians. We first explain this notion in the finite-dimensional case. We then extend it to the infinite-dimensional case and make contact with the fermionic description above. The Hirota equations can be interpreted as the Plücker equations of the Grassmannian, and the time flows are induced by one-parameter subgroups of $GL(\infty)$. Let $V$ be a vector space of dimension $N$. The Grassmannian $Gr(M, V)$, also denoted $Gr(M, N)$, is the space of $M$-dimensional hyperplanes in $V$. Since a hyperplane is uniquely determined by an $M$-dimensional frame in $V$ modulo the linear transformations $GL(M)$, $Gr(M, V)$ may be defined as:

$$Gr(M, V) \equiv \{M\text{-frames in } V\}/GL(M) \tag{9.48}$$

Grassmannians are projective algebraic varieties. If $P(\Lambda^M V)$ denotes the projective space of dimension $\binom{N}{M} - 1$ on the $M$th exterior power $\Lambda^M V$ of $V$, one can embed them into $P(\Lambda^M V)$ using the Plücker coordinates and Plücker relations, as we now explain.
We define the map $\Theta$ from $Gr(M, V)$ to $P(\Lambda^M V)$ by associating with an $M$-dimensional frame $(x^{(1)}, \ldots, x^{(M)})$ its exterior product:

$$\Theta : Gr(M, V) \longrightarrow P(\Lambda^M V), \qquad (x^{(1)}, \ldots, x^{(M)}) \longmapsto \widetilde{x} = x^{(1)} \wedge \cdots \wedge x^{(M)} \tag{9.49}$$

A change of the basis vectors $x^{(k)}$ by an arbitrary element $g$ of $GL(M)$ multiplies $\widetilde{x}$ by the scalar $\det g \neq 0$, so the right-hand side is invariant in $P(\Lambda^M V)$. The group $GL(N)$ acts transitively on the set of $M$-hyperplanes. If $(e_1, \ldots, e_N)$ is a basis of $V$, we can take as reference $M$-hyperplane the one spanned by $e_1, \ldots, e_M$. A basis of any other $M$-hyperplane can be written $g e_1, \ldots, g e_M$ for some $g \in GL(N)$. Under the map $\Theta$ this yields the point $g e_1 \wedge \cdots \wedge g e_M$. In other words, the Grassmannian is the orbit of any of its points under $GL(N)$:

$$Gr(M, V) \sim \{\, g e_1 \wedge \cdots \wedge g e_M \;|\; g \in GL(N) \,\} \tag{9.50}$$

Writing $x^{(j)} = \sum_i x^i_j\, e_i$, we have:

$$\widetilde{x} = \sum_{h_1, \ldots, h_M} X^{h_1 \cdots h_M}\;\, e_{h_1} \wedge e_{h_2} \wedge \cdots \wedge e_{h_M} \tag{9.51}$$

The Plücker coordinates of the $M$-hyperplane in $P(\Lambda^M V)$ are:

$$X^{h_1 \cdots h_M} = \det\big(x^{h_i}_j\big)_{i,j = 1, \ldots, M}$$
The Grassmannian has an image in $P(\Lambda^M V)$ characterized by algebraic relations, which are called the Plücker relations.

Proposition. The image, in the projective space, of the Grassmannian is the set of points whose coordinates $X^{k_1 \cdots k_M}$ satisfy the bilinear Plücker relations, for all $k_1, \ldots, k_{M-1}$ and $h_1, \ldots, h_{M+1}$:

$$\sum_{j=1}^{M+1} (-1)^{j-1}\; X^{k_1 \cdots k_{M-1}\, h_j}\;\; X^{h_1 \cdots \widehat{h}_j \cdots h_{M+1}} = 0 \tag{9.52}$$
where the hatted index $\widehat{h}_j$ has to be omitted.

Proof. In order to prove these Plücker relations, we need a:

Lemma. Let $\widetilde{x} \in \Lambda^M V$, and define $V' = \{v^* \in V^* \;|\; i(v^*)\, \widetilde{x} = 0\}$ and $W = (V')^\perp \subset V$. Then $W$ is the smallest subspace of $V$ such that $\widetilde{x} \in \Lambda^M W$. Moreover, $W = \{i(\xi)\, \widetilde{x} \;|\; \xi \in \Lambda^{M-1} V^*\}$.

Proof. Let $w_1, \ldots, w_l$ be a basis of $W$, completed by $u_{l+1}, \ldots, u_N$ to a basis of $V$. Let $w^*_1, \ldots, w^*_l, u^*_{l+1}, \ldots, u^*_N$ be the dual basis. Then $V'$ is just the span of $u^*_{l+1}, \ldots, u^*_N$. In this basis one can write $\widetilde{x} = \widetilde{x}_1 + \widetilde{x}_2 + \cdots$, where $\widetilde{x}_1 \in \Lambda^M W$, while the other terms can be expanded with increasing numbers of the $u_k$. For instance, $\widetilde{x}_2 = \sum_j u_j \wedge \widetilde{v}_j$ with $\widetilde{v}_j \in \Lambda^{M-1} W$. By definition of $V'$, we have $i(u^*_j)\, \widetilde{x} = 0$, and this implies that all $\widetilde{x}_j = 0$ for $j \geq 2$. Hence $\widetilde{x} \in \Lambda^M W$. To show the second part of the Lemma, we use the formula

$$\langle i(\xi)\, \widetilde{x},\; v^* \rangle = \pm\, \langle i(v^*)\, \widetilde{x},\; \xi \rangle$$

so that every $v^* \in V'$ annihilates $i(\xi)\, \widetilde{x}$, i.e. $i(\xi)\, \widetilde{x} \in (V')^\perp = W$.

We now return to the proof of the Proposition. We have to characterize the elements $\widetilde{x}$ which are factorizable, that is:

$$\widetilde{x} = w_1 \wedge \cdots \wedge w_M$$

This is equivalent to the fact that $W$ is the span of $w_1, \ldots, w_M$, and so is of dimension $M$. Another way of saying this is:

$$w \wedge \widetilde{x} = 0, \qquad \forall\, w \in W$$

Taking into account the second part of the Lemma, this is equivalent to: $\widetilde{x}$ is factorizable iff

$$i(\xi)\, \widetilde{x} \wedge \widetilde{x} = 0, \qquad \forall\, \xi \in \Lambda^{M-1} V^*$$

To express this condition, it is sufficient to let $\xi$ vary over a basis of $\Lambda^{M-1} V^*$. One takes $\xi = e^*_{k_1} \wedge \cdots \wedge e^*_{k_{M-1}}$ with $k_1 < \cdots < k_{M-1}$. Using $\widetilde{x}$ in the form eq. (9.51), we get:

$$i(\xi)\, \widetilde{x} = \sum_h X^{k_1 \cdots k_{M-1}\, h}\;\, e_h$$

Similarly, writing $\widetilde{x} = \sum X^{h_1 \cdots \widehat{h}_j \cdots h_{M+1}}\, e_{h_1} \wedge \cdots \wedge \widehat{e_{h_j}} \wedge \cdots \wedge e_{h_{M+1}}$ for each omitted index, we get:

$$i(\xi)\, \widetilde{x} \wedge \widetilde{x} = \sum_{h_1 < \cdots < h_{M+1}} \Big( \sum_j (-1)^{j-1}\; X^{k_1 \cdots k_{M-1}\, h_j}\;\, X^{h_1 \cdots \widehat{h}_j \cdots h_{M+1}} \Big)\; e_{h_1} \wedge \cdots \wedge e_{h_{M+1}}$$
yielding eq. (9.52). Remark. It is easy to see that the Pl¨ucker are necessary relations. Let relations (m) us demand that a vector of the form z = z x belongs to the hyperplane (m) m spanned by the x(j) . This means 0=z∧x = Z h1 ···hM +1 eh1 ∧ · · · ∧ ehM +1 h1 <···
with Z h1 ···hM +1 =
M +1 M
j ···hM +1 hj (m) h1 ···h X mz
(−)j−1 x
j=1 m=1
We next choose z (m) as the minor of the following determinant: k1 x1 ··· xk1 M .. .. hj . z (m) x m X k1 ···kM −1 hj = det kM. −1 kM −1 = x 1 ··· x M m h hj x 1j ··· xM
(9.53)
9.6 Fermions and Grassmannians
319
Expanding this determinant with respect to the last row shows that the coefficient z (m) only depends on the indices k1 , . . . , kM −1 and not on the index hj . Inserting this formula into eq. (9.53) gives the Pl¨ ucker relations.
To introduce the infinite-dimensional Grassmannian, we first need to consider the formal limit of the above constructions to infinite-dimensional spaces V∞ . A basis of V∞ will be denoted by es with s ∈ Z + 12 . We equip V∞ with the scalar product (es , er ) = δr+s,0 . To make sense of exterior products in this context, consider formally the vacuum vector |0 which is the semi-infinite wedge vector: |0 = e 1 ∧ e 3 ∧ e 5 ∧ · · · 2
2
(9.54)
2
and consider elements which are finite perturbations of this state in the following sense: |Y = e−sn ∧ · · · ∧ e−s1 ∧ e 1 ∧ · · · ∧ er1 ∧ · · · ∧ erm ∧ · · ·
(9.55)
2
for 0 < s1 < · · · < sn and 0 < r1 < · · · < rm , where erj have been omitted in the infinite wedge product. We say that the state |Y contains a finite number of n particles and a finite number m of holes. Note that the tail of these wedge products is of the form ∧es ∧ es+1 ∧ · · · for sufficiently large positive s, so these states are finite perturbations of the vacuum. We will work in the space of linear combinations of states |Y . We now recast the fermionic constructions of the previous sections into the infinite Grassmannian language. The basis for this interpretation is to write the fermionic vacuum |0 as a Dirac sea: |0 = β 1 β 3 · · · | − ∞ 2
(9.56)
2
where | − ∞ is the charge −∞ fermionic vacuum. The dictionary between the semi-infinite wedge product and the fermionic language is as follows. We identify the fermionic state eq. (9.56) with the vacuum vector eq. (9.54). Then we identify the fermionic operators with the exterior and inner products: βr = er ∧,
βs∗ = es ¬
(9.57)
where er ∧ means taking the wedge product with er , and es ¬ means taking the interior product with es , using the scalar product (er , es ) = δr+s,0 , which also means removal of e−s . It is a direct consequence of exterior calculus that the definition eq. (9.57) forms a representation of the anticommutation relations eq. (9.9).
320
9 Grassmannian and integrable hierarchies
We define the infinite Grassmannian, or rather its image by Θ, as the orbit of the vacuum under GL(∞), as in eq. (9.50): (9.58) g|0 = (g · e 1 ) ∧ (g · e 3 ) ∧ (g · e 5 ) ∧ · · · 2 2 2 We have denoted g·es = r er ars , where ars is related to g as in eq. (9.14). Since ars is a band matrix with arr = 0 this produces a combination of finite perturbations of the vacuum state, hence belonging to the space of semi-infinite wedge products. The expression of g|0 is factorized, so naturally belongs to the infinite Grassmannian. In the fermionic language, we have: g|0 = gβ 1 g −1 · gβ 3 g −1 · · · | − ∞ 2
2
i.e. g acts on the βr by conjugation. Note that one has naturally g|−∞ = | − ∞ . To see it, we write g = eX and verify that X| − ∞ = 0. This is ∗ β ∗ |0 because X is a combination of elements βr βs∗ and | − ∞ = · · · β− 3 −1 2
2
by eq. (9.11). When s < 0 we get βs∗2 = 0 and if s > 0 we can push βs∗ to the right where it annihilates the vacuum. This may serve as a motivation for our definition of the action of GL(∞) in eq. (9.58). Note that GL(∞) acts transitively on the Grassmannian but the vacuum vector |0 has a stability subgroup. In particular, elements g which correspond to upper triangular matrices a act trivially. From eq. (9.58) we see that when a is upper triangular, i.e. ars = 0 for r < s, the vacuum vector gets multiplied by the scalar r arr = 0, so is invariant in P (ΛV ). −1 In GL(∞) we can factorize any element g = g− g+ so that g± correspond respectively to upper and lower triangular band matrices, and only g− acts effectively on the vacuum. Finally, we write the Pl¨ ucker equations of the Grassmannian and identify them with eq. (9.35). Formally an element of the infinite Grassmannian is of the form: |X = X k1 k2 ···km ··· ek1 ∧ ek2 ∧ · · · ∧ ekm ∧ · · · k1 k2 ···km ···
where ek1 ∧ · · · is a semi-infinite wedge product, i.e. there is no hole after ucker relation eq. (9.52) to this some km . The straight extension of the Pl¨ setting reads: ( X k1 k2 ···hj ··· X h1 h2 ···hj ··· = 0 j
The sum over j is finite since we cannot add a particle at position hj after km . This can be rewritten using eqs. (9.57) in the elegant form: ∗ βj ⊗ β−j |X ⊗ |X = 0 j
which is exactly eq. (9.35). Of course this simple way of stating the Plücker relation relies on the infinite number of indices. As we have already shown, the condition eq. (9.35) translates into Hirota's equations on the tau-function, which are therefore equivalent to the equations of the Grassmannian in the infinite wedge space. This is a beautiful discovery of Sato.
We can now identify the flows of the KP hierarchy as the action of an Abelian subgroup of GL(∞) in Gr(V_∞). The KP time evolutions are defined in eq. (9.34) by the action on g|0⟩ of e^{∑_{n>0} H_n t_n}. We want to compute the action of this operator on semi-infinite wedge products.

Proposition. Let S be the shift operator on V_∞ defined by:

  S : e_s → e_{s+1}

We extend S to semi-infinite wedge products by:

  S (e_{i_1} ∧ e_{i_2} ∧ ···) = (S e_{i_1}) ∧ e_{i_2} ∧ ··· + e_{i_1} ∧ (S e_{i_2}) ∧ ··· + ···

The action of H_n on the states eq. (9.55) is given by the shift operator S^n.

Proof. In the fermionic language, a semi-infinite wedge product is of the form β_{i_1} ··· β_{i_m}|l⟩, where |l⟩ is the vacuum of charge l and the indices i_1 < i_2 < ··· < i_m < l may be negative. The operator e^{H(t)} acts by conjugation on the β_{i_j} and leaves the vacuum |l⟩ invariant. At the infinitesimal level, this yields the action of H_n on this state as a finite sum of commutators ∑_j β_{i_1} ··· [H_n, β_{i_j}] ··· β_{i_m}|l⟩. This translates exactly to the action of S^n on the infinite wedge product because [H_n, β_{i_j}] = β_{i_j+n}.

One realization of the infinite-dimensional space V_∞ is the set of functions on the circle |z| = 1. A basis is given by the powers z^n for n ∈ Z. We identify the abstract basis elements e_s with the powers z^{s−1/2}. On this basis, eq. (9.24) shows that the group GL(∞) acts as:

  z^n → A^g_n(z) = ∑_m a_{m+1/2, n+1/2} z^m

In particular, when a_{rs} is lower triangular with diagonal elements equal to 1, we have:

  z^n → A^g_n(z) = ∑_{m≤n} a_{m+1/2, n+1/2} z^m = z^n + a_{n−1/2, n+1/2} z^{n−1} + ···        (9.59)

This representation will be studied in detail below. For the time being we look at the image under Θ. The vacuum vector becomes:

  |0⟩ = z^0 ∧ z^1 ∧ z^2 ∧ ···
The group GL(∞) acts on the vacuum as in eq. (9.58):

  g|0⟩ = A^g_0(z) ∧ A^g_1(z) ∧ A^g_2(z) ∧ ···        (9.60)

In this setting we can give the expression of the τ-function:

Proposition. The tau-function may be written as an infinite-dimensional determinant:

  τ(t; g) = det_{m,n≥0} ( ∮ (dz/2iπ) e^{ξ(t,z)} z^{−m−1} A^g_n(z) )        (9.61)

with ξ(t, z) = ∑_n z^n t_n.

Proof. Recall the definition of the tau-function as τ(t; g) = ⟨0|e^{H(t)} g|0⟩. The fermionic operators satisfy the "evolution equations", eqs. (9.43, 9.44). The time evolution of g|0⟩ under e^{H(t)} amounts to the substitution z^m → z^m e^{ξ(t,z)} in eq. (9.60), thus

  e^{H(t)} g|0⟩ = e^{ξ(t,z)} A^g_0(z) ∧ e^{ξ(t,z)} A^g_1(z) ∧ e^{ξ(t,z)} A^g_2(z) ∧ ···

To compute the expectation value ⟨0|e^{H(t)} g|0⟩ we have to extract the component of e^{H(t)} g|0⟩ on |0⟩. This is given by the determinant eq. (9.61).
9.7 Schur polynomials

In this section we show that a basis of semi-infinite wedge products is indexed by Young diagrams. This allows us to investigate the structure of tau-functions. In particular, we get polynomial tau-functions which turn out to be the Schur polynomials. We use the properties of Schur polynomials to prove that knowledge of the tau-function allows us to reconstruct the corresponding element of the Grassmannian, see eqs. (9.67, 9.68).
In the finite-dimensional case, a basis of Λ^M V consists of the vectors |Y⟩ = e_{k_1} ∧ ··· ∧ e_{k_M}. The notation |Y⟩ refers to Young diagrams, which are defined as follows. Recall that, for any integer p > 0, a partition of p is defined as a collection of integers n_j, j = 1, …, M, such that p = ∑_{j=1}^M n_j and 1 ≤ n_1 ≤ n_2 ≤ ··· ≤ n_M. With such a partition one associates a diagram of boxes with M lines, and n_M boxes in the first line, n_{M−1} boxes in the second line, up to n_1 boxes in the last line. This is called a Young diagram with p boxes. With the set of numbers (k_1, …, k_M) with 1 ≤ k_1 < ··· < k_M ≤ N (note that k_i ≠ k_j for i ≠ j) we associate the Young diagram Y = [n_j] with line lengths n_j = k_j − j + 1.
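The correspondence between an increasing index set (k_1 < ··· < k_M) and the partition [n_j] with n_j = k_j − j + 1 is straightforward to code; the helper below is our own illustration (the function name is not from the book).

```python
# Map a strictly increasing index set (k_1 < ... < k_M) to the Young diagram
# Y = [n_j] with n_j = k_j - j + 1, as in the text.  Illustrative helper.
def young_diagram(ks):
    ks = sorted(ks)
    # with 0-based j, k - j equals k_j - (j+1) + 1 = n_j
    rows = [k - j for j, k in enumerate(ks)]
    assert rows[0] >= 1                                      # n_1 >= 1
    assert all(a <= b for a, b in zip(rows, rows[1:]))       # 1 <= n_1 <= n_2 <= ...
    return rows

# Example: (k_1, k_2, k_3) = (1, 3, 4) gives the partition [1, 2, 2] with 5 boxes.
print(young_diagram([1, 3, 4]))  # -> [1, 2, 2]
```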
Fig. 9.1. A Young diagram Y = [k_j − j + 1], with the j-th line of length n_j = k_j − j + 1 (now 1 ≤ n_1 ≤ ··· ≤ n_M ≤ N).

These Young diagrams have at most M lines and N columns. Consider variables u_1, u_2, …, u_k, where k is arbitrarily large. Symmetric polynomials of degree d in these variables may be expressed on several bases. To construct one of them, consider the basic symmetric polynomials h_j of degree j defined by:

  ∏_{i=1}^k 1/(1 − u_i s) = ∑_{j=0}^∞ h_j(u) s^j

With any partition of d into at most k parts, namely (λ_1 ≥ λ_2 ≥ ··· ≥ λ_k ≥ 0) such that d = ∑_i λ_i, we associate the Young diagram Y with at most k lines, whose i-th line has λ_i boxes, and the symmetric polynomial of degree d:

  h_Y = ∏_i h_{λ_i}

It is known that the h_Y form an integral basis of the space of symmetric polynomials of degree d: any symmetric polynomial with integer coefficients can be expanded on this basis with integer coefficients. With the same Young diagram Y one can associate another symmetric polynomial, known as the Schur polynomial:

  S_Y(u) = det(u_i^{λ_j + k − j}) / ∆(u),    ∆(u) = ∏_{i<j} (u_i − u_j)        (9.62)

where ∆(u) = det(u_i^{k−j}) is the Vandermonde determinant. Note that the numerator and the denominator are antisymmetric polynomials, and that the denominator divides the numerator, so that S_Y is a symmetric polynomial of degree d. It can also be shown that the Schur polynomials form an integral basis of symmetric polynomials of degree d indexed by the Young diagrams. Hence we can express S_Y in terms of the h_Y, and this relation is given by the formula

  S_Y = det(h_{λ_i + j − i})

We will prove this formula below using the boson–fermion correspondence, see eq. (9.65). We can also relate these definitions to the definition of "elementary Schur polynomials" introduced in eq. (9.39). We have

  h_i(u) = P_i(t),    for t_n = (1/n) ∑_j u_j^n

This may be seen by writing:

  ∑_i h_i(u) s^i = e^{−∑_j log(1 − u_j s)} = e^{∑_i (1/i)(∑_j u_j^i) s^i} = ∑_i P_i(t) s^i

Any polynomial f(t_1, t_2, …) can be seen as a symmetric polynomial of the variables u and conversely. This is because symmetric polynomials in the u_j can be expressed on the Newton sums t_n. In particular, we shall denote by χ_Y(t) the Schur polynomial S_Y(u) expressed in terms of the variables t_n:

  χ_Y(t) = S_Y(u),    t_n = (1/n) ∑_j u_j^n

The next proposition computes the expectation value ⟨0|e^{H(t)}|Y⟩, using the boson–fermion correspondence.

Proposition. Let |Y⟩ = β_{−s_m} ··· β_{−s_1} · β^*_{−r_m} ··· β^*_{−r_1}|0⟩ be a neutral state with s_m > ··· > s_1 and r_m > ··· > r_1. Then

  ⟨0|e^{H(t)}|Y⟩ = det( P_{n_a + a − b}(t) )        (9.63)

with n_a the number of boxes in the a-th line, counted from the bottom, for a Young diagram Y = [n_a] with L = r_m + 1/2 lines and C = s_m + 1/2 columns, enclosed in a hook¹ of width m. The first m lines have length s_{m−k+1} + k − 1/2, and the first m columns have length r_{m−k+1} + k − 1/2, for k = 1, …, m. See Fig. 9.2.

Proof. Let us give the details of the proof of eq. (9.63) in the simplest case m = 1. Thus we consider the state |Y⟩ = β_{−s} β^*_{−r}|0⟩. Using the definition eq. (9.11) for the charged vacuum |−r−1/2⟩ we can write:

  |Y⟩ = β_{−s} β_{1/2} β_{3/2} ··· β_{r−1}|−r−1/2⟩

¹ This means that there is no box in the rectangle of size (L − m) × (C − m) in the lower right corner.
Fig. 9.2. The Young diagram associated with the state with m particles and m holes: the first m lines have lengths s_m + 1/2, s_{m−1} + 3/2, …, s_1 + m − 1/2, and the first m columns have lengths r_m + 1/2, r_{m−1} + 3/2, …, r_1 + m − 1/2.

Expressing the fermionic operators as contour integrals, β_k = ∮ (dz/2iπ) z^{k−1/2} β(z), we obtain

  ⟨0|e^{H(t)}|Y⟩ = ∮ ∏_{a=1}^{r+1/2} (dz_a/2iπ) z_a^{r−m_a−1/2} ⟨0|e^{H(t)} β(z_{r+1/2}) ··· β(z_1)|−r−1/2⟩

with (m_1, …, m_{r+1/2}) = (1, 2, …, r − 1/2, r + s). Using the evolution equations eq. (9.43) and the fact that H(t) annihilates |−r−1/2⟩, we find:

  ⟨0|e^{H(t)}|Y⟩ = ∮ ∏_{a=1}^{r+1/2} (dz_a/2iπ) z_a^{r−m_a−1/2} e^{ξ(t,z_a)} ⟨0|β(z_{r+1/2}) ··· β(z_1)|−r−1/2⟩

The expectation value can be evaluated using the bosonization formulae eqs. (9.28, 9.30):

  ⟨0|β(z_{r+1/2}) ··· β(z_1)|−r−1/2⟩ = ∏_{a<b} (z_a − z_b) ∏_a z_a^{−r−1/2}

Thus, because the product ∏_{a<b}(z_a − z_b) is the Vandermonde determinant det(z_a^{b−1}), we get:

  ⟨0|e^{H(t)}|Y⟩ = ∮ ∏_{a=1}^{r+1/2} (dz_a/2iπ) z_a^{−m_a−1} e^{ξ(t,z_a)} ∏_{a<b} (z_a − z_b)
                 = det( ∮ (dz/2iπ) z^{b−m_a−2} e^{ξ(t,z)} ) = det( P_{m_a−b+1}(t) )        (9.64)
Since m_a = n_a + a − 1 this proves the result eq. (9.63) for m = 1. The general case is proved similarly, starting from the representation of the state |Y⟩ = β_{−s_1} ··· β_{−s_m} · β^*_{−r_m} ··· β^*_{−r_1}|0⟩ as

  |Y⟩ = β_{−s_1} ··· β_{−s_m} · β_{1/2} ··· β̂_{r_k} ··· β_{r_m−1} β_{r_m} |−r_m−1/2⟩

where the hatted operators β_{r_k} have been omitted. This proposition allows us to associate a Young diagram Y with the state |Y⟩ ≡ β_{−s_m} ··· β_{−s_1} · β^*_{−r_m} ··· β^*_{−r_1}|0⟩ in the following way: for example, in the simple case m = 1 which has been considered in the proof of the proposition, Y is a hook diagram of width one, with the first line containing s + 1/2 boxes, followed by r − 1/2 lines with one box.
In the next proposition, we compute ⟨0|e^{H(t)}|Y⟩ in another way and get the Jacobi–Trudi identity:

Proposition. The Schur polynomial S_Y associated with the Young diagram Y, eq. (9.62), can be expressed in terms of the variables t_n as:

  χ_Y(t) = det( P_{n_a + a − b}(t) )        (9.65)

Proof. We use the same technique as before, and limit ourselves to the case m = 1. In the expression eq. (9.64) of ⟨0|e^{H(t)}|Y⟩, we insert exp ξ(t, z) = ∏_i (1 − z u_i)^{−1}. Next we use the Cauchy identity in the form:

  ∏_{i,a} 1/(1 − u_i z_a) = (1/(∆(u)∆(z))) det( 1/(1 − z_a u_i) )_{i,a}        (9.66)

where ∆(u) = ∏_{a<b} (u_a − u_b).
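As a concrete check of the two formulas for S_Y, the sketch below compares the bialternant (9.62), det(u_i^{λ_j+k−j})/∆(u), with the Jacobi–Trudi determinant det(h_{λ_i+j−i}) for λ = (2, 1) in three variables, using sympy. The exponent conventions and truncation orders are our reading of the formulas above, not code from the book.

```python
# Compare the bialternant (9.62) with the Jacobi-Trudi determinant for λ = (2,1).
import sympy as sp

u = sp.symbols('u1 u2 u3')
k = len(u)
lam = (2, 1, 0)          # the partition λ = (2, 1), padded with 0
s = sp.symbols('s')

# complete homogeneous symmetric polynomials h_0, h_1, ...  (h_j = 0 for j < 0)
ser = sp.series(sp.prod([1/(1 - uj*s) for uj in u]), s, 0, 8).removeO()
hs = [sp.expand(ser.coeff(s, j)) for j in range(8)]
h = lambda j: hs[j] if j >= 0 else sp.Integer(0)

# bialternant: det(u_i^{λ_j + k - j}) / Δ(u)
num = sp.Matrix(k, k, lambda i, j: u[i]**(lam[j] + k - 1 - j)).det()
den = sp.Matrix(k, k, lambda i, j: u[i]**(k - 1 - j)).det()   # Vandermonde
bialt = sp.cancel(num / den)

# Jacobi-Trudi determinant det(h_{λ_i + j - i}) over the nonzero parts of λ
jt = sp.Matrix(2, 2, lambda i, j: h(lam[i] + j - i)).det()

assert sp.simplify(bialt - jt) == 0
assert jt.subs({u[0]: 1, u[1]: 1, u[2]: 1}) == 8   # dim of the GL(3) irrep (2,1,0)
print("bialternant and Jacobi-Trudi agree for λ = (2,1):", sp.expand(jt))
```

The evaluation at u = (1, 1, 1) counts semistandard tableaux of shape (2, 1) with entries ≤ 3, a standard cross-check.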
Proposition. The coefficients ζ_Y of the expansion f(t) = ∑_Y ζ_Y χ_Y(t) can be recovered by:

  ζ_Y = χ_Y(∂̃_t) · f(t)|_{t=0}

where χ_Y(∂̃_t) are the differential operators obtained by replacing t_n by (1/n) ∂/∂t_n in the definition eq. (9.63) of χ_Y(t_n).

Proof. We first rewrite the Cauchy identity eq. (9.66) in the form:

  ∏_{i,j} 1/(1 − u_i y_j) = ∑_Y S_Y(u) S_Y(y)

where the sum runs over all Young diagrams. This is shown by expanding each matrix entry 1/(1 − u_i y_j) in powers of u_i y_j. The j-th column can be written as ∑_{k_j} y_j^{k_j} u_i^{k_j}, so using the multilinearity of the determinant, one gets:

  det( 1/(1 − u_i y_j) ) = ∑_{k_1, k_2, …} ∏_j y_j^{k_j} det( u_i^{k_j} )

Since det(u_i^{k_j}) is antisymmetric one can replace the sum over the unrestricted k_i by a sum over k_1 > k_2 > ···. The coefficient of det(u_i^{k_j}) becomes det(y_i^{k_j}) in this restricted sum. Dividing by ∆(u)∆(y) one gets the result. Writing now 1/(1 − u_i y_j) = exp(−log(1 − u_i y_j)) and expanding the logarithm, we find:
  ∑_Y S_Y(u) S_Y(y) = e^{∑_n (1/n)(∑_i u_i^n)(∑_j y_j^n)} = e^{∑_n n t_n H̃_n}

where t_n = (1/n) ∑_i u_i^n and H̃_n = (1/n) ∑_i y_i^n. Thus we have the algebraic identity exp(∑_n n t_n H̃_n) = ∑_Y χ_Y(t) χ_Y(H̃). Replacing the variables H̃_n by the commuting operators (1/n)H_n we finally obtain:

  e^{H(t)} = ∑_Y χ_Y(H̃) χ_Y(t),    H̃_n = (1/n) H_n

Taking the matrix element of this identity between ⟨0| and |Y⟩ and remembering eqs. (9.63, 9.65), we get:

  χ_Y(t) = ⟨0|e^{H(t)}|Y⟩ = ∑_{Y'} χ_{Y'}(t) ⟨0|χ_{Y'}(H̃)|Y⟩
yielding ⟨0|χ_{Y'}(H̃)|Y⟩ = δ_{Y'Y}. We now compute:

  χ_Y(∂̃_t) · f(t) = ∑_{Y'} ζ_{Y'} χ_Y(∂̃_t) χ_{Y'}(t) = ∑_{Y'} ζ_{Y'} χ_Y(∂̃_t) ⟨0|e^{H(t)}|Y'⟩ = ∑_{Y'} ζ_{Y'} ⟨0|χ_Y(H̃) e^{H(t)}|Y'⟩

Letting t → 0 yields the result.

In particular, for a general tau-function of the form τ(t; g) = ⟨0|e^{H(t)} g|0⟩ we insert the completeness relation 1 = ∑_Y |Y⟩⟨Y| to get:

  τ(t; g) = ∑_Y ζ_Y(g) χ_Y(t)        (9.67)
with ζ_Y(g) = ⟨Y|g|0⟩. The sum is over all Young diagrams. The previous proposition shows that conversely:

  ζ_Y(g) = χ_Y(∂̃_t) · τ(t; g)|_{t=0}        (9.68)

This allows us to reconstruct the components of g|0⟩ on the basis |Y⟩ knowing the tau-function. We see that general tau-functions corresponding to any element of GL(∞) can be expressed on Schur polynomials. There is just one Schur polynomial when g is chosen so that g|0⟩ = |Y⟩.

9.8 From fermions to pseudo-differential operators

We show that the time evolution of the wave function of the KP hierarchy can be written very concisely in terms of pseudo-differential operators. This will be the basis for the study of the KP hierarchy in the next chapter.
We start from eq. (9.61) for the tau-function. We use the fundamental differential property of the elementary Schur polynomials, which follows directly from their definition, eq. (9.39):

  ∂P_k(t)/∂t_n = ∂^n P_k(t) = P_{k−n}(t)        (9.69)

where ∂ = ∂/∂t_1. In this section we will freely use the notion of pseudo-differential operators explained in Chapter 10. Recalling that ξ(t, z) = ∑_n t_n z^n, we have z^{−n} e^{ξ(t,z)} = ∂^{−n} e^{ξ(t,z)}, and eq. (9.61) for the tau-function can be written as:

  τ(t; g) = det_{m,n≥1} ( F^g_{mn}(t) )        (9.70)
where the functions F^g_{mn}(t) are given by:

  F^g_{mn}(t) = ∂^{−n} F^g_{m0}(t) = ∮ (dz/2iπ) e^{ξ(z,t)} z^{−n−1} A^g_m(z)

We introduce now a pseudo-differential operator Φ by adding one line and one column to the determinant in eq. (9.70):

  Φ · f ≡ (1/τ(t; g)) det [ (∂^{−n} f)_{n≥0} ; (∂^{−n} F^g_{m,0}(t))_{m≥1, n≥0} ] = (1 + ∑_{n>0} w_n(t) ∂^{−n}) · f        (9.71)

From this, we define a function Ψ(t, z) by the action of Φ on e^{ξ(z,t)}, specifically:

  Ψ(t, z) = Φ e^{ξ(z,t)} = w̃(z, t) e^{ξ(z,t)};    w̃(z, t) ≡ 1 + ∑_{n>0} w_n(t) z^{−n}        (9.72)
We show that Ψ(t, z) is the Baker–Akhiezer function. For this, it is sufficient to prove that it is expressed by Sato's formula, eq. (9.77), in terms of the tau-function.

Proposition. Let τ(t; g) be defined by eq. (9.70), and let Ψ(t, z) be the function defined in eq. (9.72). We have:

  Ψ(t, z) = (τ(t − [z^{−1}]; g)/τ(t; g)) e^{ξ(t,z)} = ⟨1|e^{H(t)} β(z) g|0⟩ / ⟨0|e^{H(t)} g|0⟩        (9.73)

The second expression is just a rewriting of the first in terms of fermions.

Proof. By definition, τ(t; g) w̃(z, t) is the following determinant:

  τ(t; g) w̃(z, t) = det [ 1, z^{−1}, z^{−2}, … ; F^g_{1,0}, ∂^{−1}F^g_{1,0}, ∂^{−2}F^g_{1,0}, … ; F^g_{2,0}, ∂^{−1}F^g_{2,0}, ∂^{−2}F^g_{2,0}, … ; ··· ]

Subtracting z^{−1} times the j-th column from the (j+1)-th column reduces the first line to (1, 0, 0, …). Expanding the determinant with respect to this first line gives:

  τ(t; g) w̃(z, t) = det_{n,m≥1} ( (1 − z^{−1}∂) ∂^{−n} F^g_{m,0} )

On the other hand, since e^{ξ(ζ, t−[z^{−1}])} = (1 − ζ/z) e^{ξ(ζ,t)} for |ζ| < |z|, we have:

  F^g_{m,n}(t − [z^{−1}]) = (1 − z^{−1}∂) F^g_{m,n}(t)

hence τ(t; g) w̃ = τ(t − [z^{−1}]; g), which proves eq. (9.73).
The pseudo-differential operator Φ has coefficients depending on the times of the KP hierarchy. One can express its time evolution simply:

Proposition. We have:

  ∂Φ/∂t_n = (Φ ∂^n Φ^{−1})_+ Φ − Φ ∂^n = −(Φ ∂^n Φ^{−1})_− Φ        (9.74)

where the subscript + refers to the projection on the differential part of the operator, and the subscript − refers to the projection on the negative powers of ∂.

Proof. We introduce differential operators D_N = ∂^N + ···, which are finite order approximations of Φ:

  Φ = lim_{N→∞} D_N ∂^{−N}

The operator D_N is defined by truncating the determinant eq. (9.71) to a finite-dimensional determinant:

  D_N · f = (1/τ_N(t; g)) det [ ∂^N f, ∂^{N−1} f, …, f ; F^g_{1,0}, ∂^{−1}F^g_{1,0}, …, ∂^{−N}F^g_{1,0} ; ··· ; F^g_{N,0}, ∂^{−1}F^g_{N,0}, …, ∂^{−N}F^g_{N,0} ]

with τ_N(t; g) = det_{n,m=1,…,N} ( ∂^{−n} F^g_{m,0} ). The time evolution of D_N is easy to find:

  ∂D_N/∂t_n = (D_N ∂^n D_N^{−1})_+ D_N − D_N ∂^n

Since both sides of this equation are differential operators of order N − 1, to prove this equality it is enough to check it on N linearly independent functions. We choose them to be F^g_{p,N} = ∂^{−N} F^g_{p,0}, for p = 1, …, N, which span the kernel of D_N. We have:

  (∂D_N/∂t_n) · F^g_{p,N} = −D_N · (∂F^g_{p,N}/∂t_n) = ( −D_N ∂^n + (D_N ∂^n D_N^{−1})_+ D_N ) · F^g_{p,N}

The first equality follows by applying ∂/∂t_n to D_N F^g_{p,N} = 0. To prove the second equality we substitute ∂F^g_{p,N}/∂t_n = ∂^n F^g_{p,N} and we use again the fact that F^g_{p,N} is in the kernel of D_N to add the second term, which vanishes on F^g_{p,N}. This term is chosen so that, combined with the first one, we get a differential operator of degree N − 1. This proves the evolution equation for D_N. Taking the limit N → ∞, we get the evolution equation of Φ.
Note that eq. (9.74) has exactly the form of eq. (3.45) in Chapter 3, but the element of the loop group, g^{(k)}(λ), is here replaced by the pseudo-differential operator Φ(∂^{−1}). This formulation of the KP hierarchy will be studied in detail in Chapter 10.
9.9 The Segal–Wilson approach

Up to now, we have used the description of the Grassmannian embedded into projective space by Plücker coordinates, using the fermionic language. This has the advantage of providing a well-defined computational framework by regularizing potential infinities using normal ordering. In this section we shall look at the Grassmannian as the space of suitable subspaces of a Hilbert space, selected by imposing appropriate functional constraints.
We start again from the space V_∞ with basis z^n, n ∈ Z, which can be seen as the space of functions on the unit circle |z| = 1. With a function on the circle we associate its Fourier expansion f(z) = ∑_n a_n z^n. The space V_∞ is turned into a Hilbert space H by introducing the L² norm ∫|f|², or what amounts to the same, ∑_n |a_n|². We can decompose H as a direct sum of subspaces H^±, where H^+ is generated by z^n, n ≥ 0, and H^− by z^n, n < 0. We consider the set of subspaces W which are comparable to H^+ in the following sense:

  W ∈ Gr  iff  { pr_+ : W → H^+ is Fredholm,  pr_− : W → H^− is compact }

The fact that the projection pr_− is compact means that it is a norm limit of operators with finite-dimensional images. In our fermionic language these finite-dimensional images correspond to states with a finite number of particles. The fact that the projection pr_+ is Fredholm means that the kernel of pr_+ is finite-dimensional, and its image is closed and of finite codimension. In the fermionic language this means a finite number of holes. Let us illustrate these conditions by an example: assume that W is spanned by e_{−2}, e_1 + e_{−1}, e_3, e_4, …. Then pr_+W is spanned by e_1, e_3, e_4, … and is of codimension 2 in H^+ since e_0 and e_2 are missing. Its kernel is spanned by e_{−2} and is of dimension 1. Similarly pr_−W is spanned by e_{−1}, e_{−2} and is of finite dimension 2. Under the Plücker embedding W goes to e_{−2} ∧ (e_1 + e_{−1}) ∧ e_3 ∧ e_4 ∧ ···, which expands on two semi-infinite
wedge products. These two terms have the property Index(pr_+) = −1, where the index of the Fredholm operator pr_+ is defined by

  Index(pr_+) ≡ dim Ker pr_+ − codim Im pr_+ = (no. of particles) − (no. of holes)

Notice that this is also the common fermionic charge of all the above states. In general we have Ker pr_+ ⊂ Im pr_−, but the first one is of finite dimension while the second one may become infinite-dimensional under the limiting procedure yielding a general compact operator.
We recall some properties of compact and Fredholm operators. If u is compact and v is continuous (i.e. bounded) then uv and vu are compact. The product of a Fredholm operator and a compact operator is compact. Finally, the sum of a Fredholm operator and a compact operator is Fredholm, and the product of two Fredholm operators is Fredholm. This allows us to consider the group GL(∞) of matrices having the following block structure on the decomposition H = H^+ ⊕ H^−:

  h = ( a  b ; c  d ),    a, d Fredholm and b, c compact

The product of two such elements is of the same form. Moreover, a group element h acts on the Grassmannian, moving W to hW, where the projections for hW are given by:

  ( pr_+ ; pr_− )_{hW} = ( a  b ; c  d ) ( pr_+ ; pr_− )_W

and have therefore the required properties. A subgroup denoted by Γ_+ is of particular interest. An element of Γ_+ is given by the multiplication by an L² non-vanishing function h on the unit circle, extending to a non-vanishing analytic function h in the unit disc and normalized by h(0) = 1. In particular the expansion of h(z) has only positive powers of z. It can be represented in the block form as:

  h ∈ Γ_+,    h = ( a  b ; 0  d )

where a and d are invertible, hence Fredholm, and b is compact. Indeed, let us write h(z) = 1 + a_1 z + a_2 z² + ··· and consider its action by multiplication on g(z) = ∑_k b_k z^k. If g has only positive powers, hg has only positive powers; moreover, 1/h can be expanded on positive powers, so that a is
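The block-triangular structure of elements of Γ_+ can be seen on a finite truncation: the sketch below builds the matrix of "multiplication by h(z) = 1 + a_1 z + a_2 z²" on the modes z^n, n = −N, …, N−1, and checks that the H^+ → H^− block c vanishes while the H^− → H^− block d is triangular with 1's on the diagonal. The numerical coefficients are arbitrary choices of ours.

```python
# Finite truncation of "multiply by h(z) = 1 + 0.7 z - 0.3 z^2" on Fourier modes.
import numpy as np

N = 4
modes = list(range(-N, N))            # exponents of z^n
a = {0: 1.0, 1: 0.7, 2: -0.3}         # Taylor coefficients of h, h(0) = 1
M = np.zeros((2*N, 2*N))
for col, n in enumerate(modes):       # h(z) * z^n = sum_k a_k z^{n+k}
    for k, ak in a.items():
        if n + k in modes:
            M[modes.index(n + k), col] = ak

neg = [i for i, n in enumerate(modes) if n < 0]   # H^- block indices
pos = [i for i, n in enumerate(modes) if n >= 0]  # H^+ block indices
c = M[np.ix_(neg, pos)]               # H^+ -> H^- block
d = M[np.ix_(neg, neg)]               # H^- -> H^- block
assert np.allclose(c, 0)              # c = 0: h maps H^+ into H^+
assert np.allclose(np.diag(d), 1)     # d is triangular with 1's on the diagonal
print("block c is zero; diag(d) =", np.diag(d))
```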
invertible. It is continuous by the Schwarz inequality, and so is Fredholm. If g has only negative powers of z, we remark that

  h(z) z^{−n} = z^{−n} + a_1 z^{−n+1} + ··· + a_{n−1} z^{−1} + ∑_{j≥0} a_{n+j} z^j

We see that the H^− part induces a triangular system with 1 on the diagonal, hence is invertible, so that d is Fredholm. To show that b is compact, consider the truncation h_N = ∑_{j=0}^N a_j z^j, which is such that dim{pr_+(h_N g) | g ∈ H^−} < N, i.e. the corresponding b_N is of finite rank. We have ‖h − h_N‖ → 0 when N → ∞ because ‖(h − h_N)g‖ ≤ ‖g‖ ∑_{j=N}^∞ |a_j| by the Schwarz inequality.
In the following we restrict ourselves to spaces W which are transversal to H^−, i.e. such that pr_+ : W → H^+ is an isomorphism. Such W have charge 0, and are "small" deformations of the vacuum |0⟩. In this case one can define an operator A : H^+ → H^− as follows. With any f ∈ H^+ one can associate a unique element w ∈ W such that f = pr_+(w). We denote w = pr_+^{−1}(f) and form

  f ∈ H^+ → A(f) = pr_−(w),    w = pr_+^{−1}(f)

The operator A is compact, and we have (pr_+|_W)^{−1}(f) = w = f + Af with Af ∈ H^−.
We now want to understand the tau-function in this context. Let us fix an element W of the Grassmannian and let h ∈ Γ_+ be given by h = exp(∑_{i>0} t_i z^i). This introduces the times of the KP hierarchy. We denote

  h^{−1} = ( a  b ; 0  d )

Definition. Assuming that W is transverse to H^− so that (pr_+|_W)^{−1} exists, we set:

  τ_W(h) = det( h pr_+ h^{−1} (pr_+|_W)^{−1} ) = det( 1 + a^{−1} b A )        (9.75)
334
9 Grassmannian and integrable hierarchies
operator a−1 bA is of trace class, which is ensured if h is sufficiently regular on the unit circle. Note that the operator a : H + → H + is triangular with 1 on the diagonal, hence one can set det a = 1. If it were allowed to write det M N = det M · det N for infinite-dimensional matrices, we could content ourselves with defining τW = det(a + bA). The definition eq. (9.75) performs a regularization of this too naive expression. It is important to show that this definition agrees with the previous construction in eq. (9.34). Recall the expression eq. (9.60) of the Pl¨ ucker embedding of some element W of the Grassmannian. The two definitions agree if we identify Ags (z) = (pr+ |W )−1 (z s ) Hence A0 (z) ∧ A1 (z) ∧ · · · represents the Pl¨ ucker embedding of W . By g −1 definition the action of h on W is As (z) → eξ(t,z) Ags (z), which under the Pl¨ ucker embedding is represented by multiplication by eH(t) due to eq. (9.43). Next the projection P+ becomes |0 0| and the multiplication by h is achieved by multiplying by e−H(t) , so that: h pr+ h−1 (pr+ |W )−1 (H + ) −→ e−H(t) |0 0|eH(t) A0 (z) ∧ A1 (z) ∧ · · · Pl¨ ucker
Taking the scalar product with 0| produces the determinant of the operator h pr+ h−1 (pr+ |W )−1 . Since 0|e−H(t) |0 = 1 due to normal ordering of H, we reproduce exactly eq. (9.61). When M, N are of the form 1 + m, 1 + n with m and n of trace class, one is allowed to write det M N = det M det N . For h ∈ Γ+ one can show that a−1 bA is of trace class. In particular one obtains: τW (h1 h2 ) = τW (h1 )τh−1 W (h2 )
(9.76)
1
Indeed, τW (h1 )τh−1 W (h2 ) 1
−1 −1 −1 = det(h2 pr+ h−1 2 (pr+ |h−1 W ) ) · det(pr+ h1 (pr+ |W ) h1 ) 1
=
−1 )−1 pr+ h−1 det (h1 h2 pr+ h−1 2 (pr+ |h−1 1 (pr+ |W ) ) 1 W
= τW (h1 h2 )
This is because if f ∈ H + then w = (pr+ |W )−1 f ∈ W so that h−1 1 w ∈ + which under (pr | −1 W . Under pr this gives some g ∈ H h−1 −1 + + h W) 1
−1 −1 reproduces h−1 1 w = h1 (pr+ |W ) f .
1
To define Baker–Akhiezer functions, we assume that h ∈ Γ+ and is such that h−1 W is transverse to H − .
9.9 The Segal–Wilson approach
335
Definition. Let h(z; t) = exp(∑_{i>0} t_i z^i) = e^{ξ(z,t)} ∈ Γ_+. The Baker–Akhiezer function Ψ_W(h, z) is the unique function such that h^{−1} Ψ_W(h, z) is the inverse image under pr_+|_{h^{−1}W} of the constant function 1 ∈ H^+.

This means that h^{−1} Ψ_W(h, z) ∈ h^{−1}W and can be written as 1 + ∑_{i=1}^∞ a_i(h) z^{−i}. That is to say, the Baker–Akhiezer function is the unique function Ψ_W(h, z) ∈ W having the form:

  Ψ_W(h, z) = h(z; t) (1 + ∑_{i=1}^∞ a_i(h) z^{−i})

We see that Ψ_W has an essential singularity at z = ∞. This is one of the essential features of the Baker–Akhiezer functions. The Baker–Akhiezer function Ψ_W becomes a function of t and z and can be explicitly expressed in terms of the tau-function by means of Sato's formula:

Proposition. We have

  Ψ_W(t, z) = (τ_W(t − [z^{−1}])/τ_W(t)) e^{ξ(t,z)}        (9.77)

where the notation t − [z^{−1}] refers to the substitution of t_n by t_n − z^{−n}/n.

Proof. To avoid notational conflicts, in this proof we denote by ζ the current variable on the circle (previously denoted by z). The group element h(ζ; t − [z^{−1}]) is given by exp(∑_{n>0} (t_n − z^{−n}/n) ζ^n) = h(ζ; t) q_z, where q_z = 1 − ζ/z ∈ Γ_+. Equation (9.77) is equivalent to

  e^{−ξ(t,z)} Ψ_W(t, z) = τ_W(h q_z) / τ_W(h)

Since τ_W(h q_z) = τ_W(h) τ_{h^{−1}W}(q_z), we have to show that e^{−ξ(z,t)} Ψ_W(t, z) = τ_{h^{−1}W}(q_z). By definition, e^{−ξ(z,t)} Ψ_W(t, z) = (pr_+|_{h^{−1}W})^{−1}(1). Replacing h^{−1}W by W, we have to show the equality of the two functions of z: (pr_+|_W)^{−1}(1) = τ_W(q_z) = det(1 + a^{−1} b A), where the operators a and b are the ones appearing in the block representation of q_z^{−1} and A is the operator induced by W. Since q_z^{−1}(ζ) = ∑_{n≥0} ζ^n/z^n, we have:

  b(ζ^{−n}) = pr_+ ( ζ^{−n}/(1 − ζ/z) ) = z^{−n} q_z^{−1}(ζ)

while the action of a^{−1} on this element of H^+ is simply represented by the multiplication by q_z(ζ). Hence for any element g ∈ H^− we have a^{−1} b g = g(z) · 1 ∈ H^+, i.e. the operator is of rank 1, so that det(1 + a^{−1} b A) = 1 + Tr(a^{−1} b A). Since the image is spanned by the function 1, it is enough
to compute the action of a−1 bA on 1 to get the trace. But by definition A sends the basis element 1 on the negative power part of (pr+ |W )−1 (1) denoted by f (ζ), so that 1 + Tr(a−1 bA) = 1 + f (z). Let us explain how the algebro-geometric solutions of KP fit into this setting (see Chapter 10 for a description of these solutions). Let Γ be a compact Riemann surface of genus g and L be a line bundle on Γ, see Chapter 15. Fix a puncture x∞ on Γ and let z −1 be a local parameter around x∞ . Let D∞ be a small disc around x∞ . Sections of L locally appear as functions on D∞ . Consider also the open set Γ0 which is the complement of the disc D∞ . The two open sets D∞ and Γ0 cover Γ. With this set of data one associates an element W of the Grassmannian such that w ∈ W if w is the boundary value on the circle of a holomorphic section of the restriction of L on Γ0 . One can show that pr− is compact. To show that pr+ is Fredholm one first shows that Ker(pr+ : W → H + ) = H 0 (Γ, L∞ ), where L∞ = L−[x∞ ] is the difference of the line bundle L and the point bundle at x∞ , i.e. a section of L∞ arises from a section of L vanishing at x∞ . Recall that H 0 (Γ, L∞ ) is the set of global holomorphic sections of L∞ . A function belongs to Ker(pr+ : W → H + ) if and only if it has only strictly negative powers of z, hence extends to the interior of D∞ and vanishes at x∞ . By definition it extends to Γ0 so providing a global section of L∞ . Next we show that H + /pr+ W = H 1 (Γ, L∞ ). It is a well-known fact that the non-compact Riemann surface Γ0 has no sheaf cohomology H 1 (Γ0 , L∞ ) = 0 and it is obviously the same for D∞ . Hence any nonvanishing element in H 1 (Γ, L∞ ) comes from some analytic function φ0∞ on the annulus Γ0 ∩ D∞ which cannot be written as φ0 − φ∞ , where φ0 extends to an analytic section of L on Γ0 and φ∞ is an analytic function on D∞ vanishing at x∞ . But the part of φ0∞ with strictly negative powers of z extends uniquely to a function φ∞ vanishing at x∞ . 
So H 1 (Γ, L∞ ) is isomorphic to the set of analytic functions on S 1 with only non-negative powers of z modulo those which extend to sections of L. This is precisely the definition of H + /pr+ W . Thus we arrived at the conclusion that the Fredholm index of pr+ is given by the Riemann–Roch theorem: Index (W ) = dim H 0 (L∞ ) − dim H 1 (L∞ ) = 1 − g + c(L∞ ) = c(L) − g where c(L) is the Chern class of L and g is the genus of the Riemann surface. Recall that c(L∞ ) = c(L) − 1 (see Chapter 15). In particular, in the interesting case where W is transverse to H − the index of W vanishes, which needs c(L) = g.
In practice one takes a puncture x_∞ and a set of g points in generic position, and one considers meromorphic functions with poles at these g points and an essential singularity at x_∞. These data define a unique W transverse to H^− in the Grassmannian. Elements of W are boundary values on the circle of meromorphic functions on Γ_0 with poles only at these g points. The Baker–Akhiezer function specified by W has an essential singularity at x_∞ and g poles. It identifies with those defined in Chapter 5.

References

[1] R. Hirota, Exact solution of the Korteweg–de Vries equation for multiple collisions of solitons. Phys. Rev. Lett. 27 (1971) 1192.
[2] M. Sato and Y. Sato, Soliton equations as dynamical systems on infinite-dimensional Grassmann manifolds. Lect. Notes in Num. Appl. Anal. 5 (1982) 259, or Proc. U.S.–Japan Seminar Nonlinear PDE in Applied Science, Tokyo 1982, Ed. Lax, Fujita, 259–271, North Holland/Kinokuniya.
[3] E. Date, M. Kashiwara, M. Jimbo and T. Miwa, Transformation groups for soliton equations, in Proceedings of RIMS Symposium, Kyoto 1981. World Scientific (1983) 39–119.
[4] M. Jimbo and T. Miwa, Solitons and infinite dimensional Lie algebras. RIMS 19 (1983) 943–1001.
[5] G. Segal and G. Wilson, Loop groups and equations of KdV type. Publ. Math. I.H.E.S. 61 (1985) 5–65.
[6] V. Kac, Infinite-dimensional Lie algebras. Cambridge University Press (1985).
[7] V. Kac and A. Raina, Bombay lectures on highest weight representations of infinite-dimensional Lie algebras. World Scientific (1987).
[8] L. Dickey, On the tau-function of matrix hierarchies of integrable equations. J. Math. Phys. 32 (1991) 2996–3002.
[9] W. Fulton and J. Harris, Representation theory. Springer (1991).
[10] C. Itzykson and J.-B. Zuber, Combinatorics of the modular group II: the Kontsevich integrals. Int. J. Mod. Phys. A7 (1992) 5661–5705.
10 The KP hierarchy
In the previous chapter we showed that the equations of the KP hierarchy can be written as:

  ∂_{t_n} Φ = −(Φ ∂^n Φ^{−1})_− Φ

where Φ is a pseudo-differential operator. This is identical to the standard form of the equations of an integrable hierarchy, but we are dealing here with the algebra of pseudo-differential operators instead of a loop algebra. In this chapter, we explain this setting and investigate the corresponding hierarchy. We show that the general solution can be expressed with the Grassmannian tau-function. With any Riemann surface one can associate particular finite-zone solutions. More generally, we construct solutions corresponding to slow modulations of the algebro-geometric solutions following the Whitham procedure. We also present the reduction of this hierarchy to the generalized KdV equations, and discuss their Poisson structures. We show that these Poisson structures can be obtained by Hamiltonian reduction from the Kostant–Kirillov bracket on a Kac–Moody algebra.
10.1 The algebra of pseudo-differential operators We briefly expose the theory of pseudo-differential operators first introduced in this context by Gelfand and Dickey. The algebra of differential operators is the algebra generated by Cvalued functions of one variable x and the derivation symbol ∂, with the usual Leibnitz rule, ∂.a = a.∂ + (∂a), where (∂a) means the derivative (∂x a)(x) of the function a(x). This defines the multiplication law between the symbol∂ and the functions. An element in this algebra is a finite i sum A = N i=0 ai ∂ , with N finite but arbitrary. The coefficients ai are 338
functions of $x$. To define the algebra of pseudo-differential operators, we extend the algebra of differential operators by introducing the "integration" symbol $\partial^{-1}$ and its powers, with the following algebraic rules:
$$\partial^{-1}\partial = \partial\partial^{-1} = 1, \qquad \partial^{-1}a = \sum_{i=0}^{\infty}(-1)^i\,(\partial^i a)\,\partial^{-i-1} \tag{10.1}$$
The algebra of pseudo-differential operators consists of elements which are semi-infinite sums of the form $A = \sum_{i=-\infty}^{N} a_i\,\partial^i$. Equations (10.1) define the multiplication by $\partial^{-1}$ in the pseudo-differential algebra, since they allow us to push all $\partial^{-1}$ symbols to the right. Here we do not have to deal with the convergence questions involved in this reshuffling, since only a finite number of terms appear at each order in $\partial^{-i}$. This rule is motivated by the integration by parts formula. Symbolically we have:
$$\partial^{-1}(a\cdot u) = \int dx\,au = \int dx\,a\,\partial(\partial^{-1}u) = a\,\partial^{-1}u - \int dx\,(\partial a)\,\partial^{-1}u = a\,\partial^{-1}u - (\partial a)\,\partial^{-2}u + \int dx\,(\partial^2 a)\,\partial^{-2}u$$
and so on. The algebra of pseudo-differential operators is an associative algebra with a unit. It possesses a natural anti-homomorphism, that is $(AB)^* = B^*A^*$, which we call the formal adjoint, defined by:
$$(a\,\partial^i)^* \equiv (-\partial)^i\,a \tag{10.2}$$
for any function $a$. We summarize these facts in the following definitions. Let $P = \{A = \sum_{i=-\infty}^{N} a_i\partial^i\}$ be the set of formal pseudo-differential operators in one variable. We denote by $P_+ = \{A = \sum_{i=0}^{N} a_i\partial^i\}$ the subalgebra of differential operators, and by $P_- = \{A = \sum_{i=-\infty}^{-1} a_i\partial^i\}$ the subalgebra of integral operators. We have the direct sum decomposition of $P$ as a vector space:
$$P = P_+ \oplus P_-$$
Notice that $P$ is naturally a Lie algebra. $P_+$ and $P_-$ are Lie subalgebras, but $P_+$ and $P_-$ do not commute. For $A \in P$, we define its residue, denoted by $\mathrm{Res}_\partial A$, as the coefficient of $\partial^{-1}$ in $A$:
$$\mathrm{Res}_\partial A \equiv a_{-1}(x) \tag{10.3}$$
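The normal-ordering rules (10.1) are easy to experiment with symbolically. The sketch below (our own minimal illustration, not from the book: the helper name `psdo_mul`, the dictionary encoding `{power: coefficient}`, and the truncation scheme are all our choices) implements the generalized Leibniz rule $\partial^k a = \sum_v \binom{k}{v}(\partial^v a)\partial^{k-v}$ with sympy and checks that $\partial^{-1}\partial = \partial\partial^{-1} = 1$, and that the product is associative on a sample with polynomial coefficient.

```python
import sympy as sp

x = sp.symbols('x')

def psdo_mul(A, B, cutoff=-6):
    """Multiply pseudo-differential operators given as dicts {power: coeff},
    using d^k a = sum_v C(k,v) (d^v a) d^(k-v); sympy's binomial handles
    negative k. Powers below `cutoff` are discarded (formal truncation)."""
    out = {}
    for k, a in A.items():
        for j, b in B.items():
            v = 0
            while k - v + j >= cutoff:
                term = sp.binomial(k, v) * a * sp.diff(b, x, v)
                out[k - v + j] = sp.expand(out.get(k - v + j, 0) + term)
                v += 1
                if k >= 0 and v > k:
                    break  # Leibniz series terminates for differential operators
    return {p: c for p, c in out.items() if c != 0}

# Rule (10.1): d^-1 d = d d^-1 = 1
assert psdo_mul({-1: sp.Integer(1)}, {1: sp.Integer(1)}) == {0: 1}
assert psdo_mul({1: sp.Integer(1)}, {-1: sp.Integer(1)}) == {0: 1}

# Associativity on a sample: (d^-1 a) d == d^-1 (a d) with a = x^2
a = x**2
lhs = psdo_mul(psdo_mul({-1: sp.Integer(1)}, {0: a}), {1: sp.Integer(1)})
rhs = psdo_mul({-1: sp.Integer(1)}, psdo_mul({0: a}, {1: sp.Integer(1)}))
print(lhs)   # {0: x**2, -1: -2*x, -2: 2}
assert lhs == rhs
```

The truncation is harmless here because, as the text notes, only finitely many terms contribute at each order in $\partial^{-i}$.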
On $P$ there exists a natural linear form:

Proposition. The algebra $P$ is equipped with a linear form, denoted by $\langle\ \rangle$, called the Adler trace. It is defined for any element $A = \sum_{i=-\infty}^{N} a_i\partial^i$ by:
$$\langle A\rangle = \int dx\,\mathrm{Res}_\partial A = \int dx\,a_{-1}(x) \tag{10.4}$$
This linear form satisfies the fundamental trace property $\langle AB\rangle = \langle BA\rangle$, hence defines an ad-invariant non-degenerate scalar product on $P$ by $(A,B) = \langle AB\rangle$. Using this bilinear form we have the duality $P_+^* = P_-$.

Proof. There is a unique way of writing a pseudo-differential operator in the form $A = \sum_{i=-\infty}^{N} a_i\partial^i$, i.e. with all $\partial^i$ on the right. Let us prove the trace property. It is sufficient to verify it on operators of the form $A = a\,\partial^k$ and $B = b\,\partial^j$. If $k$ and $j$ are both positive or both strictly negative, we clearly have $\langle AB\rangle = \langle BA\rangle = 0$. Thus we take $A = a\,\partial^k$ and $B = b\,\partial^{-j-1}$ with $k, j \geq 0$. Using the relation
$$\partial^{-j-1}a = \sum_{v=0}^{\infty}(-1)^v\binom{j+v}{v}(\partial^v a)\,\partial^{-j-1-v} \tag{10.5}$$
which is shown by induction, starting from the defining relation eq. (10.1), and the identity between binomial coefficients
$$\binom{j+1+\mu}{\mu} = \sum_{\nu=0}^{\mu}\binom{j+\nu}{\nu}$$
we get:
$$\langle BA\rangle = (-1)^{k-j}\binom{k}{j}\int dx\,b\,(\partial^{k-j}a) = \binom{k}{j}\int dx\,a\,(\partial^{k-j}b)$$
Similarly, using the Leibniz rule, $AB = \sum_{v=0}^{\infty}\binom{k}{v}\,a\,(\partial^v b)\,\partial^{k-j-v-1}$, we find that $\langle AB\rangle$ is given by:
$$\langle AB\rangle = \binom{k}{k-j}\int dx\,a\,(\partial^{k-j}b)$$
Clearly, $\langle AB\rangle$ and $\langle BA\rangle$ coincide, since $\binom{k}{k-j} = \binom{k}{j}$.

The invariance of the scalar product means $(A,[B,C]) = ([A,B],C)$. It follows from the trace property. We already noticed that $P_+$ and $P_-$ are isotropic with respect to the trace, $(P_\pm, P_\pm) = 0$. To check that $P_+^* = P_-$, we consider in $P_+$ elements $a_i\partial^i$, $i = 0, 1, \ldots, \infty$. They are paired with elements $\partial^{-i-1}b_i$ in $P_-$ since:
$$\left(\partial^{-i-1}b_i\,,\,a_j\partial^j\right) = \delta_{ij}\int dx\,(a_i b_i)$$
Choosing the coefficients $a_i$ and $b_i$ in an orthonormal basis under $\int dx\,ab$, we get dual bases of $P_+$ and $P_-$.
10.2 The KP hierarchy

We introduce the KP flows by applying the Adler–Kostant–Symes construction to the Lie algebra of pseudo-differential operators. Consider the formal group $G = \exp(P_-)$, called the Volterra group. We have $G \sim 1 + P_-$ because powers of elements in $P_-$ are in $P_-$. Let $\Phi$ be an element of $G$:
$$\Phi = 1 + \sum_{i=1}^{\infty} w_i\,\partial^{-i} \in (1 + P_-) \tag{10.6}$$
The element $\Phi$ has an inverse because, writing $\Phi^{-1} = 1 + \sum_{i=1}^{\infty} w_i'\,\partial^{-i}$ and demanding $\Phi\,\Phi^{-1} = 1$, one recursively computes the coefficients $w_i'$. One finds that they are of the form $w_i' = -w_i + p_i(w_1,\ldots,w_{i-1})$, where $p_i$ is a polynomial in its arguments and their derivatives (up to order $i-2$). For example:
$$w_1' = -w_1, \qquad w_2' = -w_2 + w_1^2, \qquad w_3' = -w_3 - w_1(\partial w_1) + 2w_1w_2 - w_1^3$$
The left and right inverses are identical. In the Adler–Kostant–Symes scheme we consider the decomposition of the Lie algebra $P = P_+ + P_-$ and introduce a second Lie algebra structure on the underlying vector space, called $P_R$. In $P_R$, $P_+$ and $P_-$ commute, see eq. (4.13) in Chapter 4. The Lie algebra $P_R$ acts on the Volterra group by
$$\delta_A\Phi = (\Phi A\Phi^{-1})_+\Phi - \Phi A_+ = -(\Phi A\Phi^{-1})_-\Phi + \Phi A_-$$
for any $A = A_+ + A_- \in P_R$. We have $[\delta_A, \delta_B] = \delta_{[A,B]_R}$. See eq. (14.35) in Chapter 14. The signs are slightly different from those in that chapter because we use here the decomposition $A = A_+ + A_-$ instead of $A = A_+ - A_-$, in order to conform to the usual conventions in KP theory.
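The inverse coefficients quoted above can be verified with a truncated symbolic computation (a sketch; the multiplication routine `psdo_mul` and its encoding are our own, as in the earlier illustration):

```python
import sympy as sp

x = sp.symbols('x')
w1, w2, w3 = [sp.Function(f'w{i}')(x) for i in (1, 2, 3)]

def psdo_mul(A, B, cutoff=-3):
    # generalized Leibniz rule d^k a = sum_v C(k,v) (d^v a) d^(k-v), truncated
    out = {}
    for k, a in A.items():
        for j, b in B.items():
            v = 0
            while k - v + j >= cutoff:
                out[k - v + j] = sp.expand(
                    out.get(k - v + j, 0) + sp.binomial(k, v) * a * sp.diff(b, x, v))
                v += 1
                if k >= 0 and v > k:
                    break
    return {p: sp.expand(c) for p, c in out.items() if sp.expand(c) != 0}

# Phi = 1 + w1 d^-1 + w2 d^-2 + w3 d^-3 and the claimed inverse coefficients
Phi = {0: sp.Integer(1), -1: w1, -2: w2, -3: w3}
Phi_inv = {0: sp.Integer(1),
           -1: -w1,
           -2: -w2 + w1**2,
           -3: -w3 - w1*sp.diff(w1, x) + 2*w1*w2 - w1**3}

# Phi . Phi_inv = 1 up to order d^-3, and the left inverse coincides
assert psdo_mul(Phi, Phi_inv) == {0: 1}
assert psdo_mul(Phi_inv, Phi) == {0: 1}
```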
If $A \in P_+$, the formula simplifies to $\delta_A\Phi = -(\Phi A\Phi^{-1})_-\Phi$. The KP flows are defined by taking $A$ in the Abelian subalgebra spanned by the $\partial^k$ for $k > 0$.

Definition. For any $\Phi \in (1 + P_-)$, define the $k$-th KP flow by:
$$\partial_{t_k}\Phi = -\left(\Phi\cdot\partial^k\cdot\Phi^{-1}\right)_-\Phi \tag{10.7}$$
These flows coincide with eq. (9.74) in Chapter 9. By construction they commute, but it is instructive to check this essential commutativity property directly.

Proposition. The KP flows $\partial_{t_k}$ all commute.

Proof. Consider the pseudo-differential operators $\Theta^k = \Phi\partial^k\Phi^{-1}$. One has:
$$\partial_{t_k}\partial_{t_l}\Phi - \partial_{t_l}\partial_{t_k}\Phi = \left([\Theta^l_-, \Theta^k] - [\Theta^k_-, \Theta^l] + [\Theta^k_-, \Theta^l_-]\right)_-\Phi$$
Replace $\Theta^k = \Theta^k_- + \Theta^k_+$ and similarly for $\Theta^l$. One gets:
$$[\partial_{t_k}, \partial_{t_l}]\Phi = \left([\Theta^l_+, \Theta^k_-]_- + [\Theta^l_-, \Theta^k_+]_- + [\Theta^l_-, \Theta^k_-]_-\right)\Phi = [\Theta^l, \Theta^k]_-\,\Phi$$
The result follows because $[\Theta^l, \Theta^k] = 0$.

From $\Phi$, we construct the pseudo-differential operator:
$$Q = \Phi\cdot\partial\cdot\Phi^{-1}, \qquad Q = \partial + \sum_{i=1}^{\infty} q_{-i}\,\partial^{-i}$$
It is easy to check that there is no $\partial^0$ term. Given $Q$ one can reconstruct $\Phi$ up to some constants:

Proposition. The pseudo-differential operator $\Phi$ is determined by $Q$ up to the transformation $\Phi \to \Phi C$, where $C = 1 + \sum_{i=1}^{\infty} c_i\partial^{-i}$ and the $c_i$ are constants independent of $x$.

Proof. Obviously $\Phi \to \Phi C$, with $C$ independent of $x$, leaves $Q$ invariant. Using the expression of $\Phi^{-1}$, we find immediately $q_{-i} = -(\partial w_i) + h_i(w_1,\ldots,w_{i-1})$, where $h_i$ is a differential polynomial in its arguments. The derivatives are at most of order $(i-1)$. Conversely, $w_i$ is recursively determined by $q_{-i}$ as $w_i = \int^x (h_i - q_{-i})\,dx$, up to an integration constant. These constants can be absorbed in $C$.
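Both the absence of a $\partial^0$ term in $Q$ and the leading relation $q_{-1} = -(\partial w_1)$ can be checked on a truncation (again a sketch with our own helper names and truncation depth):

```python
import sympy as sp

x = sp.symbols('x')
w1, w2, w3 = [sp.Function(f'w{i}')(x) for i in (1, 2, 3)]

def psdo_mul(A, B, cutoff=-2):
    # d^k a = sum_v C(k,v) (d^v a) d^(k-v), truncated below `cutoff`
    out = {}
    for k, a in A.items():
        for j, b in B.items():
            v = 0
            while k - v + j >= cutoff:
                out[k - v + j] = sp.expand(
                    out.get(k - v + j, 0) + sp.binomial(k, v) * a * sp.diff(b, x, v))
                v += 1
                if k >= 0 and v > k:
                    break
    return {p: c for p, c in out.items() if c != 0}

Phi = {0: sp.Integer(1), -1: w1, -2: w2, -3: w3}
Phi_inv = {0: sp.Integer(1),
           -1: -w1,
           -2: -w2 + w1**2,
           -3: -w3 - w1*sp.diff(w1, x) + 2*w1*w2 - w1**3}

# Q = Phi . d . Phi^{-1}, truncated at d^-2
Q = psdo_mul(psdo_mul(Phi, {1: sp.Integer(1)}), Phi_inv)

assert 0 not in Q                               # no d^0 term
assert Q[1] == 1
assert sp.expand(Q[-1] + sp.diff(w1, x)) == 0   # q_{-1} = -(d w1)
```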
On the pseudo-differential operator $Q$ the KP evolution equations take the Lax form:

Proposition. The time evolutions of $Q$ are given by:
$$\partial_{t_k}Q = [\,B_k\,,\,Q\,] \qquad \text{with} \quad B_k = \big(Q^k\big)_+ \tag{10.8}$$
Moreover, the differential operators $B_k$ satisfy the zero curvature condition:
$$\partial_{t_k}B_l - \partial_{t_l}B_k - [B_k, B_l] = 0$$

Proof. We have to compute $\partial_{t_k}Q$. Using the definition eq. (10.7) written as $\partial_{t_k}\Phi = -(Q^k)_-\Phi$, we find:
$$\partial_{t_k}Q = \partial_{t_k}(\Phi\partial\Phi^{-1}) = -\big[(Q^k)_-, Q\big] = \big[(Q^k)_+, Q\big]$$
The zero-curvature condition follows from a direct computation:
$$\partial_{t_k}B_l - \partial_{t_l}B_k = \partial_{t_k}\big(Q^l\big)_+ - \partial_{t_l}\big(Q^k\big)_+ = \big[B_k, Q^l\big]_+ - \big[B_l, Q^k\big]_+$$
Therefore, using the decomposition $Q^k = \big(Q^k\big)_+ + \big(Q^k\big)_-$, we obtain:
$$\partial_{t_k}B_l - \partial_{t_l}B_k - [B_k, B_l] = \big[B_k, Q^l\big]_+ - \big[B_l, Q^k\big]_+ - [B_k, B_l] = -\big[B_k - Q^k,\, B_l - Q^l\big]_+ = -\big[(Q^k)_-, (Q^l)_-\big]_+ = 0$$

The Lax equation eq. (10.8) shows that $Q$ is the analogue of the Lax matrix $L$, and the differential operator $B_k$ is the analogue of $M_k$. The pseudo-differential operator $\Phi$ is the analogue of $g(\lambda)$ in the loop-algebra situation of Chapter 3. In particular the evolution equation eq. (10.7) is analogous to eq. (3.50) for $g$. We will keep, however, the traditional notations in this chapter. This analogy can be pursued to get conserved quantities as traces of powers of the Lax matrix, if we replace the ordinary trace by the Adler trace.

Proposition. The quantities $H_k = \langle Q^k\rangle$ are conserved.

Proof. Using eq. (10.8) we have $\partial_{t_l}H_k = \langle[B_l, Q^k]\rangle$, which vanishes due to the cyclicity of Adler's trace.

Remark 1. The equations (10.8) are consistent in the sense that $[B_k, Q] \in P_-$. This is because $[B_k, Q] = [(Q^k)_+, Q] = [Q^k - (Q^k)_-, Q] = -[(Q^k)_-, Q]$, which expands
on the negative powers $\partial^{-j}$, $j \geq 1$. Hence the Lax equations (10.8) produce non-linear equations of motion for the coefficients of $Q$, i.e. for the functions $\{q_{-i}\}$.

Remark 2. The first KP flow $\partial_1$ is identified with $\partial$, because we have $Q_+ = \partial$, so that the first flow reads:
$$\partial_1 Q = [(Q)_+, Q] = [\partial, Q] = \sum_{i=1}^{\infty}(\partial q_{-i})\,\partial^{-i}$$
Therefore, the KP time $t_1$ is naturally identified with the variable $x$ introduced in the definition of the algebra $P$.
Example. To illustrate these formulae we compute the first few equations of motion. First we have:
$$(Q^2)_+ = \partial^2 + 2q_{-1}, \qquad (Q^3)_+ = \partial^3 + 3q_{-1}\partial + 3(\partial q_{-1}) + 3q_{-2}$$
The time evolution with respect to $t_2$ reads:
$$\partial_{t_2}q_{-1} = \partial^2 q_{-1} + 2\partial q_{-2}$$
$$\partial_{t_2}q_{-2} = \partial^2 q_{-2} + 2\partial q_{-3} + 2q_{-1}\partial q_{-1}$$
$$\vdots$$
Similarly, the time evolution with respect to $t_3$ of $q_{-1}$ is given by:
$$\partial_{t_3}q_{-1} = \partial^3 q_{-1} + 3\partial^2 q_{-2} + 3\partial q_{-3} + 6q_{-1}\partial q_{-1}$$
Eliminating $q_{-2}$ and $q_{-3}$ between these equations and renaming $u = -2q_{-1}$, one gets:
$$3\,\partial_{t_2}^2 u = \partial\left(4\,\partial_{t_3}u + 6u\,\partial u - \partial^3 u\right)$$
This is the KP equation, see eq. (8.58) in Chapter 8. It is the first of an infinite hierarchy of non-linear partial differential equations for $q_{-1}, q_{-2}, \ldots$.

10.3 The Baker–Akhiezer function of KP

By analogy with the Lax situation, we consider eigenvectors of the operator $Q$, together with their time evolutions under the KP flows. That is, we look for an eigenfunction $\Psi(t,z)$ of $Q$ such that:
$$(Q - z)\Psi = 0, \qquad Q = \Phi\partial\Phi^{-1}$$
$$(\partial_{t_m} - B_m)\Psi = 0, \qquad B_m = (Q^m)_+ \tag{10.9}$$
For $m = 2$, we find
$$(\partial_{t_2} - \partial^2 + u)\Psi = 0 \tag{10.10}$$
In the algebro-geometric case such eigenfunctions were shown to be Baker–Akhiezer functions, and we will continue to call them by this name. In order to make the connection with previous expressions of the Baker–Akhiezer function, we define the action of pseudo-differential operators on exponentials:
$$\partial^i e^{zx} = z^i e^{zx}, \qquad \text{for all } i \in \mathbb{Z}$$
This extends to the action of any pseudo-differential operator by writing it first in normal form, with all $\partial^i$ on the right. This definition is compatible with the algebra structure; in particular $(\Phi_1\Phi_2)\,e^{zx} = \Phi_1\big((\Phi_2)\,e^{zx}\big)$.

Proposition. The Baker–Akhiezer function $\Psi(t,z)$ obeying eqs. (10.9) can be written as:
$$\Psi(t,z) = \Phi e^{\xi(t,z)} = (1 + w_1 z^{-1} + w_2 z^{-2} + \cdots)\,e^{\xi(t,z)} \equiv w(t,z)\,e^{\xi(t,z)} \tag{10.11}$$
where $\xi(t,z) = \sum_{i=1}^{\infty} t_i z^i$. This defines
$$w(t,z) = 1 + w_1 z^{-1} + w_2 z^{-2} + \cdots$$

Proof. The expansion of $w(t,z)$ results clearly from the definition of the action of $\Phi$ on $\exp\xi(t,z)$, noting that $t_1 = x$. Then $Q\Psi = (\Phi\partial\Phi^{-1})\Phi e^{\xi(t,z)} = z\Psi$. Similarly, using $\partial_{t_m}e^{\xi(t,z)} = \partial^m e^{\xi(t,z)} = z^m e^{\xi(t,z)}$, the evolution of $\Psi$ with respect to $t_m$ is given by:
$$\partial_{t_m}\Psi = \partial_{t_m}(\Phi e^{\xi(t,z)}) = (\partial_{t_m}\Phi)e^{\xi(t,z)} + \Phi\,\partial_{t_m}e^{\xi(t,z)} = -(Q^m)_-\Phi e^{\xi(t,z)} + \Phi\,\partial^m e^{\xi(t,z)} = \big(-(Q^m)_- + Q^m\big)\Phi e^{\xi(t,z)} = (Q^m)_+\Psi$$
In the last equality, we have written $\Phi\partial^m e^{\xi} = (\Phi\partial^m\Phi^{-1})\Phi e^{\xi} = Q^m\Psi$ and we used the decomposition $Q^m = (Q^m)_+ + (Q^m)_-$.

It is useful to introduce the adjoint Baker–Akhiezer function by:
$$\Psi^* = (\Phi^*)^{-1}e^{-\xi(t,z)}$$
where $\Phi^*$ is the formal adjoint of the pseudo-differential operator $\Phi$, defined in eq. (10.2). This adjoint function satisfies the adjoint system:

Proposition. The adjoint Baker–Akhiezer function obeys:
$$(Q^* - z)\,\Psi^* = 0, \qquad Q^* = -(\Phi^*)^{-1}\,\partial\,\Phi^*$$
$$(\partial_{t_m} + B_m^*)\,\Psi^* = 0, \qquad B_m^* = \big((Q^m)_+\big)^*$$
Proof. Since the formal adjoint is an anti-homomorphism, we have $Q^* = \big(\Phi\cdot\partial\cdot\Phi^{-1}\big)^* = -(\Phi^*)^{-1}\cdot\partial\cdot\Phi^*$. As a consequence, we have:
$$Q^*\Psi^* = -(\Phi^*)^{-1}\partial\,e^{-\xi(t,z)} = z\,\Psi^*$$
Similarly, we have:
$$\partial_{t_m}\Psi^* = \big(\partial_{t_m}(\Phi^*)^{-1}\big)e^{-\xi(t,z)} + (\Phi^*)^{-1}\partial_{t_m}e^{-\xi(t,z)} = \big((Q^m)_-^* - (Q^m)^*\big)\Psi^* = -(Q^m)_+^*\,\Psi^*$$

Note that, compared to the algebro-geometric Baker–Akhiezer functions, the puncture is at $z = \infty$, where the exponential factor $\exp\xi(t,z)$ has an essential singularity, while $w(x,z)$ is formally regular. The fact that the Baker–Akhiezer functions are solutions of the linear system eq. (10.9) implies that they satisfy an important bilinear identity:

Theorem. The following bilinear identity holds for all $(i_1,\ldots,i_m)$, $i_j \geq 0$:
$$\oint \frac{dz}{2i\pi}\,\big(\partial_{t_1}^{i_1}\cdots\partial_{t_m}^{i_m}\Psi(t,z)\big)\cdot\Psi^*(t,z) = 0 \tag{10.12}$$
It can be rewritten more compactly as:
$$\oint \frac{dz}{2i\pi}\,\Psi(t,z)\cdot\Psi^*(t',z) = 0, \qquad \forall\, t, t' \tag{10.13}$$
The integrals over $z$ are residues around $z = \infty$, i.e. integrals over big circles around $z = \infty$.

Proof. Notice that the integrands are meromorphic functions around $\infty$ because the essential singularities cancel. We first need a formula expressing the Adler residue, eq. (10.3), of a product of two pseudo-differential operators $D = \sum_i d_i\partial^i$ and $F = \sum_i f_i\partial^i$:

Lemma.
$$\oint \frac{dz}{2i\pi}\,(De^{zx})(Fe^{-zx}) = \mathrm{Res}_\partial(DF^*) \tag{10.14}$$
Proof. The left-hand side is the coefficient of $z^{-1}$ in the integrand:
$$\oint \frac{dz}{2i\pi}\,(De^{zx})(Fe^{-zx}) = \oint \frac{dz}{2i\pi}\sum_i d_i z^i\sum_j f_j(-z)^j = \sum_j (-1)^j d_{-j-1}f_j$$
Similarly, we compute the Adler residue, i.e. the coefficient of $\partial^{-1}$ in $DF^*$, using $F^* = \sum_i (-\partial)^i f_i$:
$$\mathrm{Res}_\partial(DF^*) = \mathrm{Res}_\partial\Big(\sum_{i,j} d_i\,\partial^i(-\partial)^j f_j\Big) = \sum_j (-1)^j d_{-j-1}f_j$$
Only the terms with $i + j = -1$ contribute to the residue, since $\binom{n}{n+1} = 0$ for $n \geq 0$.
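The lemma (10.14) can be tested on operators with polynomial coefficients, for which every series terminates. The sketch below uses our own helper names (`psdo_mul`, `adjoint`) and sample operators; nothing here is from the book beyond the formula being checked.

```python
import sympy as sp

x, z = sp.symbols('x z')

def psdo_mul(A, B, cutoff=-8):
    # d^k a = sum_v C(k,v) (d^v a) d^(k-v), truncated below `cutoff`
    out = {}
    for k, a in A.items():
        for j, b in B.items():
            v = 0
            while k - v + j >= cutoff:
                out[k - v + j] = sp.expand(
                    out.get(k - v + j, 0) + sp.binomial(k, v) * a * sp.diff(b, x, v))
                v += 1
                if k >= 0 and v > k:
                    break
    return {p: c for p, c in out.items() if c != 0}

def adjoint(F):
    # (f d^j)* = (-d)^j f : compose (-1)^j d^j with multiplication by f
    out = {}
    for j, f in F.items():
        for p, c in psdo_mul({j: sp.Integer(1)}, {0: f}).items():
            out[p] = sp.expand(out.get(p, 0) + sp.Integer(-1)**j * c)
    return {p: c for p, c in out.items() if c != 0}

# sample operators with polynomial coefficients
D = {-2: x**2, 1: x}       # D = x^2 d^-2 + x d
F = {1: x**3, -1: x}       # F = x^3 d + x d^-1

# left-hand side of (10.14): coefficient of z^-1 in (D e^{zx})(F e^{-zx})
lhs = sp.expand(sum(d*z**k for k, d in D.items())
                * sum(f*(-z)**j for j, f in F.items())).coeff(z, -1)

# right-hand side: Res_d (D F*)
rhs = psdo_mul(D, adjoint(F)).get(-1, 0)

assert rhs == -x**5
assert sp.expand(lhs - rhs) == 0
```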
We can now prove the identity eq. (10.12). Since $\partial_{t_m}\Psi = B_m\Psi$, where $B_m$ is a polynomial in $\partial$ (only a finite number of positive powers appear), it is sufficient to prove this equality for $(i,0,\ldots,0)$, $i \geq 0$. In this case:
$$\oint \frac{dz}{2i\pi}(\partial^i\Psi)\cdot\Psi^* = \oint \frac{dz}{2i\pi}\big(\partial^i\Phi e^{\xi(t,z)}\big)\cdot\big((\Phi^*)^{-1}e^{-\xi(t,z)}\big) = \oint \frac{dz}{2i\pi}\big(\partial^i\Phi e^{zx}\big)\cdot\big((\Phi^*)^{-1}e^{-zx}\big) = \mathrm{Res}_\partial\big(\partial^i\Phi\cdot\Phi^{-1}\big) = \mathrm{Res}_\partial(\partial^i) = 0$$
The compact expression eq. (10.13) is obtained formally by Taylor expanding around $t = t'$.

In the previous proposition, we proved that the KP equations imply that the Baker–Akhiezer functions satisfy the bilinear identities eq. (10.13). We now establish the converse statement, meaning that the whole KP hierarchy is equivalent to these bilinear identities.

Proposition. Consider two formal series:
$$\Psi = \Phi e^{\xi(t,z)}, \qquad \Psi^* = \Phi^\star e^{-\xi(t,z)}$$
i=1
{wi , wi }
where are functions of the variables {ti }. Let us assume that the bilinear identity eq. (10.13) is satisfied. Then one has Φ = (Φ∗ )−1 Φ∂Φ−1 ,
(10.15)
Moreover, defining Q = we have ∂tm Φ = the Baker–Akhiezer function of the KP hierarchy.
−(Qm )− Φ.
Hence Ψ is
Proof. We have used the notation Φ (with a different star) to avoid introducing a new letter, but at this stage it is an independent pseudodifferential operator. By definition the functions Ψ and Ψ∗ are: ∞ ∞ wi z −i eξ(t,z) , Ψ∗ = 1 + wi z −i e−ξ(t,z) Ψ= 1+ i=1
i=1
and we assume that $\oint\frac{dz}{2i\pi}\,\partial^i\Psi\cdot\Psi^* = 0$ for any $i \geq 0$. We first prove that this implies that $\Phi$ and $\Phi^*$ are inverse to each other. Indeed, using eq. (10.14), we have for any $i \geq 0$:
$$\mathrm{Res}_\partial\big(\partial^i\,\Phi\,(\Phi^\star)^*\big) = \oint\frac{dz}{2i\pi}\big(\partial^i\Phi e^{\xi(t,z)}\big)\big(\Phi^\star e^{-\xi(t,z)}\big) = \oint\frac{dz}{2i\pi}\,\partial^i\Psi\cdot\Psi^* = 0$$
where the hypothesis is used in the last step. But by construction $\Phi(\Phi^\star)^* = 1 + X$, with $X \in P_-$; therefore the above equation implies $\mathrm{Res}_\partial(\partial^i X) = 0$ for all $i \geq 0$, and thus $X = 0$, so that $\Phi^\star = (\Phi^*)^{-1}$.

Now let $Q = \Phi\partial\Phi^{-1}$ and $B_m = (Q^m)_+$. We show that $\partial_{t_m}\Phi = -(Q^m)_-\Phi$. First, observe that using $\partial_{t_m}e^{\xi(t,z)} = \partial^m e^{\xi(t,z)}$ and $\Phi\partial^m = Q^m\Phi$, we have:
$$\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)e^{\xi(t,z)} = \partial_{t_m}\big(\Phi e^{\xi(t,z)}\big) - \big(\Phi\partial_{t_m} - (Q^m)_-\Phi\big)e^{\xi(t,z)}$$
$$= \partial_{t_m}\big(\Phi e^{\xi(t,z)}\big) - \big(\Phi\partial^m - (Q^m)_-\Phi\big)e^{\xi(t,z)}$$
$$= \big(\partial_{t_m} - Q^m + (Q^m)_-\big)\Phi e^{\xi(t,z)} = \big(\partial_{t_m} - (Q^m)_+\big)\Phi e^{\xi(t,z)}$$
By hypothesis, $\Psi$ and $\Psi^*$ satisfy the bilinear identity eq. (10.13). Therefore, since $(Q^m)_+$ is a differential polynomial, we have, for any $i \geq 0$:
$$0 = \oint\frac{dz}{2i\pi}\,\partial^i\Big(\big(\partial_{t_m} - (Q^m)_+\big)\Phi e^{\xi(t,z)}\Big)\cdot\Phi^\star e^{-\xi(t,z)} = \oint\frac{dz}{2i\pi}\,\partial^i\Big(\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)e^{\xi(t,z)}\Big)\cdot\Phi^\star e^{-\xi(t,z)}$$
Equivalently, one can write:
$$0 = \mathrm{Res}_\partial\Big(\partial^i\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)(\Phi^\star)^*\Big) = \mathrm{Res}_\partial\Big(\partial^i\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)\Phi^{-1}\Big)$$
Since this is true for any $i \geq 0$, it implies $\big((\partial_{t_m}\Phi) + (Q^m)_-\Phi\big)\Phi^{-1} = 0$. Multiplying on the right by $\Phi$ proves the result.

10.4 Algebro-geometric solutions of KP

It is quite a remarkable fact that with any Riemann surface of genus $g$ one can associate a solution of the KP hierarchy. We explain this construction in this section. Let $\Gamma$ be a smooth algebraic curve of genus $g$. Fix a point $P_\infty$ on $\Gamma$ and a local coordinate $w(P) = z^{-1}$ in a neighbourhood of the puncture
$P_\infty$ ($z = \infty$). Then for each set of $g$ points $\gamma_1,\ldots,\gamma_g$ in general position there exists a unique function $\Psi(t,P)$ of the variable $P \in \Gamma$ which is meromorphic outside $P_\infty$ and has at most simple poles at the points $\gamma_s$, and in the neighbourhood of the puncture $P_\infty$ one requires:
$$\Psi(t,P) = e^{\xi(t,z)}\Big(1 + \sum_{s=1}^{\infty} w_s(t)\,z^{-s}\Big) \tag{10.16}$$
We now recall the fundamental formula expressing the Baker–Akhiezer functions in terms of Riemann theta functions (see Chapter 5). Let $\Omega^{(i)}$ be the unique normalized meromorphic differential with a pole at $P_\infty$, of the form $\Omega^{(i)} = d\big(z^i + O(z^{-1})\big)$, and holomorphic everywhere else. The normalization condition is that all its $a$-periods vanish:
$$\oint_{a_k}\Omega^{(i)} = 0$$
With it, we define a vector $U^{(i)}$ with coordinates
$$U_k^{(i)} = \frac{1}{2\pi i}\oint_{b_k}\Omega^{(i)} \tag{10.17}$$
The Baker–Akhiezer function $\Psi(t,P)$ is equal to
$$\Psi(t,P) = \frac{\theta\big(\mathcal{A}(P) + U^{(1)}x + U^{(2)}t - \zeta\big)\,\theta(\zeta)}{\theta\big(\mathcal{A}(P) - \zeta\big)\,\theta\big(U^{(1)}x + U^{(2)}t - \zeta\big)}\; e^{\sum_i t_i\int_{P_\infty}^{P}\Omega^{(i)}} \tag{10.18}$$
where $\int_{P_\infty}^{P}\Omega^{(i)}$ is the unique primitive of $\Omega^{(i)}$ behaving as $z^i + O(z^{-1})$ modulo periods in the vicinity of $P_\infty$. The vector $\zeta$ is equal to $\zeta = \mathcal{A}(D) + K$, with $D = \gamma_1 + \cdots + \gamma_g$, and $K$ is the vector of Riemann constants. Finally, $\mathcal{A}(P)$ is the Abel map with origin at $P_\infty$.

Remark. The Baker–Akhiezer function is intrinsically defined by its analyticity properties. In the above formula, the choice of $a$-cycles and $b$-cycles and the normalization of the differentials is at our disposal. Another more canonical normalization is obtained by requiring that the forms $\Omega'^{(j)}$ have pure imaginary periods on any cycle. They are obtained by the transformation
$$\Omega^{(j)} = \Omega'^{(j)} + \sum_{i=1}^{g}\alpha_i^{(j)}\,\omega_i \tag{10.19}$$
where the $\omega_i$ are the $g$ normalized holomorphic differentials. Writing these normalization conditions gives $2g$ real conditions on the $g$ complex parameters $\alpha_i^{(j)}$, which can be solved. Indeed, taking the integral of this formula over the cycle $a_i$, we get
$\alpha_i^{(j)} = -\oint_{a_i}\Omega'^{(j)}$, which is pure imaginary. Taking the integral over the cycle $b_i$, we get the unique solution
$$\alpha_i^{(j)} = 2i\pi\,(\mathrm{Im}\,B)^{-1}_{ik}\,\mathrm{Im}\,U_k^{(j)}$$
where $B$ is the matrix of $b$-periods of the $\omega_i$. With such $\Omega'^{(j)}$, we can write
$$\Psi(t,P) = e^{\sum_j t_j\int_{P_\infty}^{P}\Omega'^{(j)}}\;\varphi\Big(\sum_j t_j U'^{(j)}, P\Big) \tag{10.20}$$
where the vector $U'^{(j)}$ has $2g$ components
$$U_i'^{(j)} = -\frac{1}{2i\pi}\oint_{a_i}\Omega'^{(j)}, \qquad U_{g+i}'^{(j)} = \frac{1}{2i\pi}\oint_{b_i}\Omega'^{(j)}, \qquad i = 1,\ldots,g$$
The function $\varphi(z,P)$ is equal to
$$\varphi(z,P) = e^{2i\pi\sum_{i=1}^{g} z_i\mathcal{A}_i(P)}\;\frac{\theta\big(\mathcal{A}(P) + \sum_i(z_{i+g}I_i + z_iB_i) - \zeta\big)\,\theta(\zeta)}{\theta\big(\mathcal{A}(P) - \zeta\big)\,\theta\big(\sum_i(z_{i+g}I_i + z_iB_i) - \zeta\big)}$$
The vectors $I_i$ and $B_i$ are such that $(I_i)_j = \delta_{ij}$ and $(B_i)_j = B_{ij}$. This is obtained by plugging eq. (10.19) into eq. (10.18). Note that the function $\varphi(z,P)$ is periodic with period 1 in each of the $2g$ variables $z_i$. This form will be useful in the considerations of the last sections on Whitham equations.
The Baker–Akhiezer function automatically produces solutions of the KP hierarchy as follows. Consider eq. (10.16) for the asymptotic form of the Baker–Akhiezer function around $P_\infty$. Let us rewrite it as
$$\Psi(t,P) = \Phi\,e^{\xi(t,z)}, \qquad \Phi = 1 + \sum_{s=1}^{\infty} w_s(t)\,\partial^{-s}$$
where $\partial = \partial_{t_1} = \partial_x$ and $\xi(t,z) = \sum_i t_i z^i$. This defines the pseudo-differential operator $\Phi$. From it we define $Q = \Phi\partial\Phi^{-1}$. Then we have:

Proposition. Let $\Psi(t,P)$ be the above Baker–Akhiezer function. Then it satisfies the equations of motion of the KP hierarchy:
$$(Q - z)\Psi = 0, \qquad \big(\partial_{t_i} - (Q^i)_+\big)\Psi = 0$$

Proof. The first equation has a meaning as an expansion around $P_\infty$ and directly follows from the definition of $Q$. To prove the second equation,
consider the function $\big(\partial_{t_i} - (Q^i)_+\big)\Psi$ on $\Gamma$. It has the same analyticity properties as $\Psi$, apart from the behaviour around $P_\infty$, where we have
$$\big(\partial_{t_i} - (Q^i)_+\big)\Psi = \big(\partial_{t_i} - Q^i + (Q^i)_-\big)\Psi = O(z^{-1})\,e^{\xi(t,z)}$$
We used that $\Phi\,\partial_{t_i}e^{\xi(t,z)} = Q^i\,\Phi e^{\xi(t,z)}$. Hence the expression on the left-hand side identically vanishes by the unicity of the Baker–Akhiezer function.
Remark. We stress that this construction associates solutions of the KP hierarchy with any Riemann surface. Special curves may lead to additional interesting structures, as we have seen in Chapter 7 on Calogero–Moser systems.

We now give the global definition of the adjoint Baker–Akhiezer function. For any set of $g$ points in general position there exists a unique meromorphic differential $\Omega$ with a double pole at $P_\infty$:
$$\Omega = dz\,\big(1 + O(z^{-2})\big) \tag{10.21}$$
and zeroes at the points $\gamma_s$:
$$\Omega(\gamma_s) = 0, \qquad s = 1,\ldots,g \tag{10.22}$$
Besides the $\gamma_s$, this differential has $g$ other zeroes, which we denote by $\gamma_s^*$. The adjoint Baker–Akhiezer function is the unique function $\Psi^*(t,P)$ of the variable $P \in \Gamma$ which is meromorphic outside $P_\infty$, has at most simple poles at the points $\gamma_s^*$ (if all of them are distinct), and behaves in the neighbourhood of the puncture $P_\infty$ as
$$\Psi^*(t,P) = e^{-\xi(t,z)}\Big(1 + \sum_{s=1}^{\infty} w_s^*(t)\,z^{-s}\Big)$$
The adjoint Baker–Akhiezer function $\Psi^*(t,P)$ is equal to
$$\Psi^*(t,P) = \frac{\theta\big(\mathcal{A}(P) - U^{(1)}x - U^{(2)}t - \zeta^*\big)\,\theta(\zeta^*)}{\theta\big(\mathcal{A}(P) - \zeta^*\big)\,\theta\big(U^{(1)}x + U^{(2)}t + \zeta^*\big)}\; e^{-\sum_i t_i\int_{P_\infty}^{P}\Omega^{(i)}}$$
where $\zeta^* = \mathcal{A}(D^*) + K$, with $D^* = \gamma_1^* + \cdots + \gamma_g^*$.
Proposition. The adjoint Baker–Akhiezer function satisfies the equations:
$$(Q^* - z)\Psi^* = 0, \qquad \big(\partial_{t_i} + (Q^i)_+^*\big)\Psi^* = 0 \tag{10.23}$$
where $Q^*$ is the formal adjoint of $Q$.

Proof. Consider, for any positive integer $i$, the form $(\partial^i\Psi)\Psi^*\Omega$, where $\Omega$ is defined by eqs. (10.21, 10.22). This is a meromorphic 1-form on $\Gamma$ with a unique pole at $P_\infty$, of order $2+i$, because the poles of $\Psi$ and $\Psi^*$ are cancelled by the zeroes of $\Omega$. Moreover, the essential singularities of $\Psi$ and $\Psi^*$ at $P_\infty$ cancel. Around $P_\infty$ we have $\Psi = \Phi e^{\xi(t,z)}$ and $\Psi^* = \Phi^\star e^{-\xi(t,z)}$, where $\Phi^\star$ is defined from the expansion of $\Psi^*$. So we have
$$\big(\partial^i\Phi e^{\xi}\big)\,\Phi^\star e^{-\xi}\,\Omega = z^i\,dz\,\big(1 + O(1/z)\big)$$
Since the sum of residues of any meromorphic 1-form must vanish, we get:
$$\oint(\partial^i\Psi)\cdot\Psi^*\,\Omega = 0, \qquad \forall\, i \geq 0$$
where the integral is taken on a small circle around $P_\infty$. This means that $\Psi$ and $\Psi^*\Omega$ satisfy the bilinear identities eq. (10.13), and therefore by eq. (10.15) we have $\Phi^\star\,\Omega = (\Phi^*)^{-1}dz$. The adjoint equations of motion follow because $\Omega$ is independent of $t$. We have shown that the adjoint Baker–Akhiezer function is equal to the formal adjoint of $\Psi$, up to the factor $\Omega/dz$. This also shows that Baker–Akhiezer functions constructed from Riemann surfaces automatically satisfy the fundamental bilinear identities eq. (10.13).

10.5 The tau-function of KP

The bilinear identity eq. (10.13) allows us to express the Baker–Akhiezer function in terms of a tau-function.

Proposition. Assume that $\Psi$ and $\Psi^*$ are Baker–Akhiezer functions of the KP hierarchy satisfying eq. (10.13). Then there exists a function $\tau$ such that
$$\Psi(t,z) = \frac{\tau(t - [z^{-1}])}{\tau(t)}\,e^{\xi(t,z)}, \qquad \Psi^*(t,z) = \frac{\tau(t + [z^{-1}])}{\tau(t)}\,e^{-\xi(t,z)} \tag{10.24}$$
where $[z^{-1}] = \big(\frac{1}{z}, \frac{1}{2z^2}, \frac{1}{3z^3}, \cdots\big)$ and $\xi(t,z) = \sum_{i=1}^{\infty} t_i z^i$.
Proof. Note that for $f(z) = 1 + \sum_{i=1}^{\infty} f_i z^{-i}$ we have, by the residue theorem:
$$\oint\frac{dz}{2i\pi}\,\frac{f(z)}{1 - z/z'} = z'\big(f(z') - 1\big) \tag{10.25}$$
where $z'$ is big enough to be inside the integration contour around $\infty$. The two terms correspond to the poles at $z = z'$ and $z = \infty$. The bilinear identity applied to $t$ and $t' = t - [z_1^{-1}]$ yields:
$$0 = \oint\frac{dz}{2i\pi}\,\Psi(t,z)\,\Psi^*(t - [z_1^{-1}], z) = \oint\frac{dz}{2i\pi}\,\frac{w(t,z)\,w^*(t - [z_1^{-1}], z)}{1 - z/z_1}$$
where $w(t,z)$ is defined in eq. (10.11). To show the second equality, we used that
$$e^{-\xi(t - [z_1^{-1}],\,z)} = e^{-\xi(t,z)}\,\frac{1}{1 - z/z_1}$$
Up to now the arguments are similar to the proof of eq. (9.77) in Chapter 9. Notice that the Cauchy kernel in the right-hand side is produced by the very specific choice of shift we consider. Applying the residue formula eq. (10.25), we get $w(t,z_1)\,w^*(t - [z_1^{-1}], z_1) = 1$, or:
$$w^*(t - [z_1^{-1}], z_1) = \frac{1}{w(t,z_1)} \tag{10.26}$$
Similarly, applying the bilinear identity to $t$ and $t' = t - [z_1^{-1}] - [z_2^{-1}]$, we see that the following quantity vanishes:
$$\oint\frac{dz}{2i\pi}\,\Psi(t,z)\,\Psi^*(t - [z_1^{-1}] - [z_2^{-1}], z) = \oint\frac{dz}{2i\pi}\,\frac{w(t,z)\,w^*(t - [z_1^{-1}] - [z_2^{-1}], z)}{(1 - z/z_1)(1 - z/z_2)}$$
Since there is no residue at $\infty$, we get $w(t,z_1)\,w^*(t - [z_1^{-1}] - [z_2^{-1}], z_1) = w(t,z_2)\,w^*(t - [z_1^{-1}] - [z_2^{-1}], z_2)$. Eliminating $w^*$ using eq. (10.26), we obtain the functional equation:
$$\frac{w(t - [z_2^{-1}], z_1)}{w(t,z_1)} = \frac{w(t - [z_1^{-1}], z_2)}{w(t,z_2)}$$
We want to show that this equation implies:
$$w(t,z) = \frac{\tau(t - [z^{-1}])}{\tau(t)} \tag{10.27}$$
It is trivial to verify that this solves the equation. We now proceed to show that this is the general solution. Taking the logarithm of the functional equation, we are led to study an equation of the form
$$f(t - [u^{-1}], v) - f(t,v) = f(t - [v^{-1}], u) - f(t,u) \tag{10.28}$$
where the function $f$ is:
$$f(t,v) = \log w(t,v) = \frac{1}{v}w_1(t) + \frac{1}{v^2}\Big(w_2(t) - \frac{1}{2}w_1^2(t)\Big) + \cdots$$
Introducing the generating function of time derivatives:
$$\nabla_v = \sum_{i=1}^{\infty} v^{-i-1}\,\partial_{t_i}$$
we remark that $(\partial_v - \nabla_v)\,\phi(t - [v^{-1}]) = 0$ for any function $\phi(t)$. Applying $(\partial_v - \nabla_v)$ to eq. (10.28), we get
$$(\partial_v - \nabla_v)f(t - [u^{-1}], v) - (\partial_v - \nabla_v)f(t,v) = -(\partial_v - \nabla_v)f(t,u)$$
Expanding in $v^{-1}$ and setting $(\partial_v - \nabla_v)f(t,v) = \sum_{i=1}^{\infty}\gamma_i(t)\,v^{-i-1}$, this reads:
$$\gamma_i(t - [u^{-1}]) - \gamma_i(t) = \partial_{t_i}f(t,u) \tag{10.29}$$
Considering $F_{ij}(t) = \partial_{t_i}\gamma_j(t) - \partial_{t_j}\gamma_i(t)$, we get the condition $F_{ij}(t - [u^{-1}]) = F_{ij}(t)$. Expanding in powers of $u^{-1}$, one sees that $F_{ij}(t)$ is independent of all the time variables $t_1, t_2, \ldots$. But by construction, $F_{ij}$ is a local differential polynomial in $w_1(t), w_2(t), \ldots$ (for example $F_{12} = \partial_{t_2}w_1 - 2\partial_{t_1}w_2 + 2w_1\partial_{t_1}w_1 - \partial_1^2 w_1$, etc.). Using the equations of motion of the KP hierarchy we can replace all the $\partial_{t_k}w_l$ for $k \geq 2$ by higher derivatives of the $w_l$ with respect to $t_1 = x$. Hence we can write $F_{ij}$ as a polynomial in the $\partial^k w_l$. But the monomials in $F_{ij}$ are independent, and we know that $F_{ij}$ is constant, hence it reduces to its constant term. Since it vanishes for the particular solution $w = 0$, we see that $F_{ij} = 0$. So we can write $\gamma_i(t) = \partial_{t_i}\log\tau(t)$. Finally, inserting this into eq. (10.29) we get eq. (10.27). The formula for $\Psi^*(t,z)$ is then a straightforward consequence of eq. (10.26), where one substitutes $t \to t + [z_1^{-1}]$.

Equation (10.24) is Sato's formula. Using it, the bilinear identity eq. (10.13) can be rewritten as bilinear identities for the tau-functions. Of course they coincide with the Hirota bilinear identities we obtained using vertex operators, eq. (9.38) in Chapter 9.

Remark 1. We recall that in terms of the fermionic description of the previous chapter, the tau-function and the Baker–Akhiezer function have a particularly elegant formulation:
$$\tau(t; g) = \langle 0|e^{H(t)}g|0\rangle$$
and
$$\Psi(t,z) = \frac{\langle 1|e^{H(t)}\beta(z)g|0\rangle}{\tau(t;g)}, \qquad \Psi^*(t,z) = \frac{\langle -1|e^{H(t)}\beta^*(z)g|0\rangle}{\tau(t;g)}$$
where $g$ is an element of the group $GL(\infty)$, and $\beta(z)$ and $\beta^*(z)$ are the fermionic operators.

Remark 2. The Grassmannian formulation also shows that $\tau(t)$ is given by an infinite determinant. In particularly interesting cases, it degenerates to a finite determinant.

10.6 The generalized KdV equations

The KP hierarchy is a system of evolution equations for the infinite set of functions $w_i(t)$, or equivalently the coefficients $q_{-i}(t)$ appearing in $Q$. To reduce the system to a finite number of coefficients, we remark that since $Q$ obeys the Lax equations $\partial_{t_k}Q = [(Q^k)_+, Q]$, any power $Q^{n+1}$ also obeys the same equation. The main remark is that one can impose consistently that $Q^{n+1}$ is a differential operator, i.e. $(Q^{n+1})_- = 0$. This is because one then has $[(Q^k)_+, Q^{n+1}]_- = 0$, and so $\partial_{t_k}Q^{n+1}$ is a differential operator. Moreover, the two sides of the Lax equation are differential operators of the same order, because one can also write $\partial_{t_k}Q^{n+1} = [-(Q^k)_-, Q^{n+1}]$, which is in fact of order $\partial^{n-1}$. It follows that one can further impose that the coefficient of $\partial^n$ in $Q^{n+1}$ vanishes. To summarize, we impose that $Q^{n+1} = L$ is a differential operator:
$$L = \partial^{n+1} - \sum_{i=0}^{n-1} u_i\,\partial^i \tag{10.30}$$
With this $L$ one can write Lax equations which define flows on the finite number of functions $u_i$. These flows close on the $u_i$ because, given $L$, one can reconstruct $Q$ such that $Q^{n+1} = L$, so that $(Q^k)_+$ may be viewed as a function of the $u_i$.

Proposition. Let $L$ be the differential operator eq. (10.30). There exists a unique pseudo-differential operator $Q = \partial + q_{-1}\partial^{-1} + \cdots$ such that $Q^{n+1} = L$. We will denote it by $Q = L^{\frac{1}{n+1}}$.

Proof. If $Q = \partial + q_0 + q_{-1}\partial^{-1} + \cdots$, one first sees that $Q^{n+1} = \partial^{n+1} + (n+1)q_0\partial^n + \cdots$. Since there is no term $\partial^n$ in $L$, one has $q_0 = 0$. Then by induction one shows that:
$$(n+1)q_{-1} = -u_{n-1}, \qquad (n+1)q_{-2} = -u_{n-2} - \frac{n(n+1)}{2}\,\partial q_{-1}$$
$$(n+1)q_{-i} = -u_{n-i} + p_i(q_{-1},\ldots,q_{-i+1})$$
where $p_i$ is a differential polynomial in its arguments. Knowing the $u_i$, this system uniquely determines the $q_{-i}$ recursively.
We can rewrite the reduced KP flows directly in terms of $L$. These systems are called the generalized KdV hierarchies. The KdV hierarchy corresponds to $n = 1$, and the generalized ones to $n = 2, 3, \ldots$. It is worth writing these equations once more:

Proposition. Let $L$ be the differential operator eq. (10.30). Then the Lax equations
$$\partial_{t_k}L = \left[\big(L^{\frac{k}{n+1}}\big)_+\,,\,L\right] \tag{10.31}$$
are consistent for all $k \in \mathbb{N}$.

Proof. We introduce the pseudo-differential operator $Q = L^{\frac{1}{n+1}}$. Notice that $Q^k$, $\forall k \in \mathbb{N}$, commutes with $L$ since $LQ^k = Q^{n+1+k} = Q^kL$. Then we have:
$$\big[(Q^k)_+, L\big] = \big[Q^k, L\big] - \big[(Q^k)_-, L\big] = -\big[(Q^k)_-, L\big]$$
From the last equality, it follows that the differential operator $[(Q^k)_+, L]$ is of order less than or equal to $n-1$, so that the Lax equation eq. (10.31) is an equation on the coefficients of $L$.

Example. Let us consider the KdV case $n = 1$. The operator $L$ is the second order differential operator
$$L = \partial^2 - u$$
We first find $Q$ such that $Q^2 = L$. One has $Q^2 = \partial^2 + 2q_{-1} + (2q_{-2} + \partial q_{-1})\partial^{-1} + \cdots$, so that $q_{-1} = -\frac{1}{2}u$, $q_{-2} = \frac{1}{4}\partial u$, etc.:
$$Q = \partial - \frac{1}{2}u\,\partial^{-1} + \frac{1}{4}(\partial u)\,\partial^{-2} + \cdots$$
We again check on this simple example that all the $q_{-j}$ are recursively determined in terms of $u$ by requiring that no $\partial^{-j}$ terms occur in $Q^2$. To obtain the KdV flows, we only have to compute $(Q^k)_+$, $k = 1, 2, \ldots$. For $k = 1$, we have $(Q)_+ = \partial$, and $\partial_1 L = [\partial, L]$. This reduces to the identification $\partial_{t_1} = \partial$. For $k = 2$, we have $(Q^2)_+ = L$, and we get the trivial equation $\partial_{t_2}L = 0$. The first non-trivial case is $k = 3$. We have:
$$(Q^3)_+ = \partial^3 - \frac{3}{2}u\,\partial - \frac{3}{4}(\partial u)$$
so the Lax equation reads $\partial_{t_3}L = [(Q^3)_+, \partial^2 - u]$. This is the Korteweg–de Vries equation:
$$4\,\partial_{t_3}u = \partial^3 u - 6u\,(\partial u)$$
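The whole example can be reproduced symbolically. The sketch below (our own helper `psdo_mul`; truncating $Q$ after the $\partial^{-2}$ term, which is enough to determine $(Q^3)_+$) recovers $(Q^3)_+$ and then the KdV equation from the Lax form:

```python
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)

def psdo_mul(A, B, cutoff=-4):
    # d^k a = sum_v C(k,v) (d^v a) d^(k-v), truncated below `cutoff`
    out = {}
    for k, a in A.items():
        for j, b in B.items():
            v = 0
            while k - v + j >= cutoff:
                out[k - v + j] = sp.expand(
                    out.get(k - v + j, 0) + sp.binomial(k, v) * a * sp.diff(b, x, v))
                v += 1
                if k >= 0 and v > k:
                    break
    return {p: c for p, c in out.items() if c != 0}

# Q = L^{1/2} up to d^-2: Q = d - (u/2) d^-1 + (u'/4) d^-2 + ...
Q = {1: sp.Integer(1), -1: -u/2, -2: sp.diff(u, x)/4}
B3 = {k: c for k, c in psdo_mul(psdo_mul(Q, Q), Q).items() if k >= 0}  # (Q^3)_+
assert B3 == {3: 1, 1: sp.expand(-3*u/2), 0: sp.expand(-3*sp.diff(u, x)/4)}

# Lax equation: dt3 L = [B3, L] with L = d^2 - u, hence dt3 u = -[B3, L]
L = {2: sp.Integer(1), 0: -u}
comm = dict(psdo_mul(B3, L))
for k, c in psdo_mul(L, B3).items():
    comm[k] = sp.expand(comm.get(k, 0) - c)
comm = {k: c for k, c in comm.items() if c != 0}
assert list(comm) == [0]          # the commutator is a multiplication operator
dt3_u = -comm[0]
# 4 dt3 u = u''' - 6 u u'
assert sp.expand(4*dt3_u - (sp.diff(u, x, 3) - 6*u*sp.diff(u, x))) == 0
```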
This is the first of a hierarchy of equations obtained by taking $k = 3, 5, 7, \ldots$, called the KdV hierarchy (note that for $k$ even we get trivial equations), which will be studied in detail in Chapter 11.

We now show that the generalized KdV equations are Hamiltonian systems. The differential operator $L$ is an element of $P_+$. So we have to specify a Poisson structure on the space $\mathcal{F}(P_+)$ of functions on $P_+$. If we view $P_+$ as the dual of the Lie algebra $P_-$ through the Adler trace, there is a natural Poisson bracket on $P_+$: the Kostant–Kirillov bracket. For any functions $f$ and $g$ on $P_+$, it is defined as usual by:
$$\{f, g\}_1(L) = \big\langle L\,,\,[df, dg]\big\rangle, \qquad \forall\, L \in P_+ \tag{10.32}$$
where we understand that $df, dg \in P_-$. In particular, for any $X = \sum_{j=0}^{\infty}\partial^{-j-1}x_j \in P_-$, we define the linear function $f_X(L)$ by:
$$f_X(L) = \langle L, X\rangle \tag{10.33}$$
and we have $df_X = X \in P_-$. Therefore $\{f_X, f_Y\}_1 = f_{[X,Y]} = \langle L, [X,Y]\rangle$ for any $X, Y \in P_-$.

Proposition. Let $L \in P_+$ be the differential operator of order $n+1$ as in eq. (10.30). Define the functions of $L$:
$$H_k(L) = \frac{n+1}{n+k+1}\,\Big\langle L^{\frac{k}{n+1}+1}\Big\rangle$$
They are the conserved quantities of the generalized KdV hierarchy. Then:

(i) The quantities $H_k$ are the Hamiltonians of the generalized KdV flows under the bracket eq. (10.32):
$$\dot L = \{H_k, L\}_1 = \left[\big(L^{\frac{k}{n+1}}\big)_+, L\right] \tag{10.34}$$

(ii) The functions $H_k(L)$ are in involution with respect to this bracket.

Proof. Recall that $Q = L^{\frac{1}{n+1}}$, so that $H_k$ is proportional to $\langle Q^{k+n+1}\rangle$; these are the conserved quantities of the KP hierarchy, and so are conserved also in the generalized KdV hierarchies. We first need to compute the differential of the Hamiltonian $H_k$. Let $L$ and $\delta L$ be differential operators of the form eq. (10.30). One has, using the cyclicity of Adler's trace:
$$\big\langle(L + \delta L)^\nu\big\rangle = \langle L^\nu\rangle + \nu\,\langle L^{\nu-1}\delta L\rangle + \cdots$$
which implies $d\langle L^\nu\rangle = \nu\,(L^{\nu-1})_{(-n)}$, where the notation $(\ )_{(-n)}$ means projection on $P_-$ truncated at the first $n$ terms. This projection appears
358
10 The KP hierarchy
because δL = −δun−1 ∂ n−1 − · · · − δu0 , which is dual to elements of the form ∂ −1 x0 + · · · + ∂ −n xn under the Adler trace. Hence: k
dHk (L) = L n+1 = Qk ∈ P− (10.35) (−n)
(−n)
(k)
We define θ−(n+1) as the terms left over in the truncation: k
(k) L n+1 = dHk + θ−(n+1)
(10.36)
−
We now prove eq. (10.34). Consider the function f_X(L) = ⟨LX⟩; then

  ḟ_X = {H_k, f_X}(L) = ⟨L, [dH_k, df_X]⟩ = ⟨[L, dH_k], X⟩

where we used the invariance of the Adler trace. Since X ∈ P_−, only [L, dH_k]_+ contributes to this expression. But

  [L, dH_k]_+ = [L, (L^{k/(n+1)})_−]_+ − [L, θ^{(k)}_{-(n+1)}]_+ = [(L^{k/(n+1)})_+, L]_+

where we have used [L^{k/(n+1)}, L] = 0, and the fact that [L, θ^{(k)}_{-(n+1)}]_+ = 0. So [L, dH_k]_+ is a differential operator of order at most n − 1, and this is enough to prove eq. (10.34).

Next we show that the Hamiltonians H_k are in involution. We have:

  {H_k, H_l}_1(L) = ⟨L, [dH_k, dH_l]⟩ = ⟨[L, dH_k]_+, dH_l⟩ = ⟨[(L^{k/(n+1)})_+, L]_+, dH_l⟩

Using again the fact that [L, dH_k]_+ is of order at most n − 1, we can replace dH_l by (L^{l/(n+1)})_−, and get:

  {H_k, H_l}_1(L) = ⟨[(L^{k/(n+1)})_+, L], (L^{l/(n+1)})_−⟩ = ⟨[(L^{k/(n+1)})_+, L], L^{l/(n+1)}⟩

In the last step we used that ⟨P_+, P_+⟩ = 0 in order to replace (L^{l/(n+1)})_− by L^{l/(n+1)}. Finally, from the invariance of the trace, we obtain:

  {H_k, H_l}_1(L) = ⟨(L^{k/(n+1)})_+, [L, L^{l/(n+1)}]⟩ = 0
This proposition shows that the generalized KdV hierarchies are Hamiltonian systems. In the next section we show that there exists in fact another local Hamiltonian structure for the same hierarchy.
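The recursive determination of the q_{-j}, and the computation of (Q^k)_+, can be reproduced with a few lines of symbolic code. The sketch below (our illustration; the dict-based operator encoding is an assumption, not the book's) implements the generalized Leibniz rule ∂^m ∘ f = Σ_k C(m,k)(∂^k f)∂^{m-k} with a truncation in the order, and confirms the KdV case n = 1: Q² = ∂² − u up to the truncation, and (Q³)_+ = ∂³ − (3/2)u∂ − (3/4)(∂u).

```python
import sympy as sp

x = sp.symbols('x')
u = sp.Function('u')(x)

def compose(A, B, cutoff):
    """Compose two pseudo-differential operators given as {order: coefficient}
    dicts, using d^m o f = sum_k binomial(m, k) (d^k f) d^(m-k) (generalized
    binomial for negative m), keeping only terms of order >= cutoff."""
    C = {}
    for m, a in A.items():
        for n_, b in B.items():
            for k in range(0, m + n_ - cutoff + 1):
                order = m + n_ - k
                term = sp.binomial(m, k) * a * sp.diff(b, x, k)
                C[order] = sp.expand(C.get(order, 0) + term)
    return {o: c for o, c in C.items() if c != 0}

# Q = d - (1/2) u d^-1 + (1/4) u' d^-2 + ...   (truncated)
Q = {1: sp.Integer(1), -1: -u/2, -2: sp.diff(u, x)/4}

Q2 = compose(Q, Q, cutoff=-1)   # should give d^2 - u, with no d^-1 term left
Q3 = compose(Q, Q2, cutoff=0)   # the non-negative (differential) part of Q^3
print(Q2)   # coefficient of d^0 is -u, d^-1 term cancels
print(Q3)   # coefficient of d is -3u/2, constant term is -3u'/4
```

The cancellation of the ∂^{-1} coefficient in Q² is exactly the condition that determines q_{-2} in terms of u.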
10.7 KdV Hamiltonian structures
We will establish recursion relations between the equations of motion written with the Hamiltonians H_k and H_{k+n+1}. These relations are called Lenard recursion relations. They suggest introducing a second Poisson bracket on F(P_+) such that the generalized KdV equations can be written as Hamilton equations with respect to both Poisson brackets; such systems are called bihamiltonian systems.

Proposition. Let L be the differential operator eq. (10.30). Let us introduce two operators D_1 and D_2 acting on any X ∈ P_− by:

  D_1(X) = [L, X]_+
  D_2(X) = (LX)_+ L − L(XL)_+ − 1/(n+1) [L, (∂^{-1}[X, L]_{-1})]

Then the functions H_k(L) satisfy the following recursion relation:

  D_1(dH_{k+n+1}) = D_2(dH_k)   (10.37)
Proof. Note that for any X ∈ P_− we have 0 = ⟨[X, L]⟩ = ∫ dx [X, L]_{-1}, so that [X, L]_{-1} is a total derivative, and the object (∂^{-1}[X, L]_{-1}) appearing in the definition of D_2 is by definition a primitive of this total derivative, i.e. it is local. For example, we have [∂² − u, Σ_{i≥1} x_i ∂^{-i}]_{-1} = ∂(∂x_1 + 2x_2).

Using eq. (10.36) for dH_{k+n+1}, we have:

  D_1(dH_{k+n+1}) = [L, dH_{k+n+1}]_+ = [L, (Q^{k+n+1})_−]_+ − [L, θ^{(k+n+1)}_{-(n+1)}]_+ = −[L, (Q^{k+n+1})_+]

We used [L, θ^{(k+n+1)}_{-(n+1)}]_+ = 0 and the fact that Q^{k+n+1} = (Q^{k+n+1})_+ + (Q^{k+n+1})_− commutes with L. Now, the simple recursion relation Q^k L = Q^{k+n+1} implies:

  (Q^{k+n+1})_+ = (Q^k L)_+ = (Q^k_+ L + Q^k_− L)_+ = Q^k_+ L + (Q^k_− L)_+

Thus we obtain:

  D_1(dH_{k+n+1}) = −[L, Q^k_+ L] − [L, (Q^k_− L)_+]
    = [L, dH_k]_+ L − [L, (dH_k L)_+] − [L, (θ^{(k)}_{-(n+1)} L)_+]   (10.38)
where we used again the decomposition eq. (10.36) for dH_k. The remarkable fact is that the last term involving θ^{(k)}_{-(n+1)} can also be expressed in terms of dH_k. Indeed, defining v_0 by θ^{(k)}_{-(n+1)} = v_0 ∂^{-(n+1)} + ···, we have (θ^{(k)}_{-(n+1)} L)_+ = v_0. Also, [(L^{k/(n+1)})_−, L]_− = −[(L^{k/(n+1)})_+, L]_− = 0, and looking at the Adler residue gives:

  0 = [(L^{k/(n+1)})_−, L]_{-1} = [dH_k, L]_{-1} + [θ^{(k)}_{-(n+1)}, L]_{-1} = [dH_k, L]_{-1} − (n+1) ∂v_0

Because the residue of a commutator is a total derivative, we can integrate this equation, and write consistently that:

  v_0 = 1/(n+1) ∂^{-1}[dH_k, L]_{-1}

Inserting this result into eq. (10.38), we get:

  [L, dH_{k+n+1}]_+ = (L dH_k)_+ L − L(dH_k L)_+ − 1/(n+1) [L, (∂^{-1}[dH_k, L]_{-1})]

which is the claimed statement.

This recursion relation led Adler to conjecture that the operators D_1 and D_2 define two Poisson brackets on F(P_+), denoted by { , }_1 and { , }_2, through the equations:

  {f, g}_1 = ⟨D_1(df) dg⟩,  {f, g}_2 = ⟨D_2(df) dg⟩

More precisely, let us define as in eq. (10.33) the functions f_X(L) = ⟨LX⟩ such that df_X = X. The explicit expressions for the two brackets are then:

  {f_X, f_Y}_1(L) = ⟨LXY⟩ − ⟨LYX⟩   (10.39)
  {f_X, f_Y}_2(L) = ⟨(LX)_+(LY)_−⟩ − ⟨(XL)_+(YL)_−⟩ − 1/(n+1) ∫ dx (∂^{-1}[L, X]_{-1}) [L, Y]_{-1}

The first bracket is the Kostant–Kirillov bracket, eq. (10.32). We now prove that { , }_2 is a Poisson bracket. The antisymmetry can easily be checked using the cyclicity of the Adler trace and the isotropy of P_±. To check the Jacobi identity, we change variables, from the coefficients u_i of L, see eq. (10.30), to new variables p_j in which the Jacobi identity is obvious. This change of variables is called the Miura transformation. The Poisson bracket { , }_2 becomes very simple in the variables p_j. This is the content of the Kupershmidt–Wilson theorem.
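In the KdV case n = 1 the Miura transformation reads u = p² + p′ (see the example below), and the change of variables can be checked directly at the level of the flows: the KdV residual factorizes through the modified KdV residual. A short symbolic verification (our sketch; the modified equation 4∂_t p = ∂³p − 6p²∂p is the standard mKdV adapted to the present normalization):

```python
import sympy as sp

x, t = sp.symbols('x t')
p = sp.Function('p')(x, t)
u = p**2 + sp.diff(p, x)          # Miura transformation u = p^2 + p'

# Residuals of KdV, 4u_t = u_xxx - 6 u u_x, and of its modified version,
# 4p_t = p_xxx - 6 p^2 p_x:
Fu = 4*sp.diff(u, t) - sp.diff(u, x, 3) + 6*u*sp.diff(u, x)
Fp = 4*sp.diff(p, t) - sp.diff(p, x, 3) + 6*p**2*sp.diff(p, x)

# Classical Miura factorization: Fu = (2p + d/dx) Fp, so any solution of the
# modified equation is mapped to a KdV solution.
print(sp.simplify(Fu - (2*p*Fp + sp.diff(Fp, x))))  # expected: 0
```

The factorization shows in particular why the bracket of the p variables induces a consistent bracket for u.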
Theorem. Let us write L = L_{n+1} L_n ··· L_1, with L_i = ∂ − p_i:

  L = (∂ − p_{n+1})(∂ − p_n) ··· (∂ − p_1),  Σ_{i=1}^{n+1} p_i = 0   (10.40)

Let us define a Poisson bracket on the functions p_k(x) by:

  {p_k(x), p_l(y)} = (δ_{kl} − 1/(n+1)) δ′(x − y)   (10.41)

Then we have {f_X, f_Y}(L) = {f_X, f_Y}_2(L). This shows that { , }_2 satisfies the Jacobi identity.

Proof. Viewing f_X(L) as a function of the p_k, we have:

  {f_X, f_Y}(L) = Σ_{k,l} ∫ dx dy (δf_X/δp_k(x)) {p_k(x), p_l(y)} (δf_Y/δp_l(y))

We now check that inserting eq. (10.41) into this formula reproduces the second Poisson structure. Define the operators:

  L_{ij} = L_i L_{i-1} ··· L_j  for i ≥ j,  L_{01} = 1,  L_{n+1,n+2} = 1   (10.42)
From the expression of f_X, we have:

  δf_X/δp_k(x) = −(L_{k-1,1} X L_{n+1,k+1})_{-1}(x)

Thus,

  {f_X, f_Y}(L) = Σ_{k,l} ∫ dx dy (L_{k-1,1} X L_{n+1,k+1})_{-1}(x) {p_k(x), p_l(y)} (L_{l-1,1} Y L_{n+1,l+1})_{-1}(y)

Using the Poisson bracket of the p_k, this is rewritten as:

  {f_X, f_Y}(L) = E_1 − E_2   (10.43)

where E_1 is produced by the Kronecker delta in eq. (10.41) while E_2 is produced by the 1/(n+1) term. We have:

  E_1 = Σ_{k=1}^{n+1} ∫ dx (∂(L_{k-1,1} X L_{n+1,k+1})_{-1}) (L_{k-1,1} Y L_{n+1,k+1})_{-1}
  E_2 = 1/(n+1) Σ_{k,l=1}^{n+1} ∫ dx (∂(L_{k-1,1} X L_{n+1,k+1})_{-1}) (L_{l-1,1} Y L_{n+1,l+1})_{-1}

We can write ∂(L_{k-1,1} X L_{n+1,k+1})_{-1} = ([L_k, L_{k-1,1} X L_{n+1,k+1}])_{-1}. This is because L_k = ∂ − p_k, and one checks that for any pseudo-differential operator f = Σ_n f_n ∂^n one has [∂, f]_{-1} = (∂f_{-1}) and [p_k, f]_{-1} = 0. Thus:

  E_1 = Σ_{k=1}^{n+1} ∫ dx [ (L_{k1} X L_{n+1,k+1})_{-1} (L_{k-1,1} Y L_{n+1,k+1})_{-1} − (L_{k-1,1} X L_{n+1,k})_{-1} (L_{k-1,1} Y L_{n+1,k+1})_{-1} ]   (10.44)
  E_2 = 1/(n+1) Σ_{k,l=1}^{n+1} ∫ dx [ (L_{k1} X L_{n+1,k+1})_{-1} (L_{l-1,1} Y L_{n+1,l+1})_{-1} − (L_{k-1,1} X L_{n+1,k})_{-1} (L_{l-1,1} Y L_{n+1,l+1})_{-1} ]

We use the identity, true for all p,

  ∫ dx (U)_{-1}(V)_{-1} = ⟨U_− (∂ − p) V_−⟩ = ⟨(∂ − p) U_− V_−⟩   (10.45)
to rewrite the expression E_1 as:

  E_1 = Σ_{k=1}^{n+1} [ ⟨(L_{k1} X L_{n+1,k+1})_− L_k (L_{k-1,1} Y L_{n+1,k+1})_−⟩ − ⟨L_k (L_{k-1,1} X L_{n+1,k})_− (L_{k-1,1} Y L_{n+1,k+1})_−⟩ ]

Next replace all the terms U_− by U − U_+, to get

  E_1 = Σ_{k=1}^{n+1} [ −⟨(L_{k1} X L_{n+1,k+1})_+ L_{k1} Y L_{n+1,k+1}⟩ + ⟨(L_{k-1,1} X L_{n+1,k})_+ L_{k-1,1} Y L_{n+1,k}⟩ ]

In the sum, the terms cancel two by two, and we are left with

  E_1 = ⟨(XL)_+ Y L⟩ − ⟨(LX)_+ L Y⟩

We now turn to the summation over k in E_2. Again, terms cancel two by two and we are left with:

  E_2 = 1/(n+1) Σ_{l=1}^{n+1} ∫ dx [X, L]_{-1} (L_{l-1,1} Y L_{n+1,l+1})_{-1}
     = 1/(n+1) Σ_{l=1}^{n+1} ⟨[X, L]_{-1} L_{l-1,1} Y L_{n+1,l+1}⟩
     = 1/(n+1) Σ_{l=1}^{n+1} ⟨L_{n+1,l+1} [X, L]_{-1} L_{l-1,1} Y⟩
But we have [L, f] = Σ_{l=1}^{n+1} L_{n+1,l+1} (∂f) L_{l-1,1} for any function f. Thus the sum can finally be written as:

  E_2 = 1/(n+1) ∫ dx (∂^{-1}[X, L]_{-1}) [Y, L]_{-1}

This completes the proof that { , } = { , }_2. Since { , } obviously satisfies the Jacobi identity, so does { , }_2.

Proposition. The Hamiltonians H_k are in involution with respect to both Poisson brackets.

Proof. We already know that {H_k, H_l}_1 = 0. Using the recursion relation eq. (10.37), we have:

  {H_k, H_l}_2 = ⟨D_2(dH_k), dH_l⟩ = ⟨D_1(dH_{n+1+k}), dH_l⟩ = {H_{n+1+k}, H_l}_1 = 0
Example. For n = 1, we have L = ∂² − u with u = p² + p′. The Poisson bracket eq. (10.41) becomes {p(x), p(y)} = (1/2) δ′(x − y). This implies

  {u(x), u(y)}_2 = 1/2 [u(x)∂_x + ∂_x u(x) − ∂_x³] δ(x − y)

We recognize the Virasoro algebra (see Chapter 11).

10.8 Bihamiltonian structure

The two Poisson brackets { , }_1 and { , }_2 have a remarkable compatibility property, called the Magri compatibility condition.

Proposition. The two Poisson structures { , }_1 and { , }_2 are compatible, in the sense that for any λ_1, λ_2 the map (f, g) → λ_1{f, g}_1 + λ_2{f, g}_2 is a Poisson bracket.

Proof. The condition that the sum of two Poisson brackets {f, g}_1 = Σ_{ij} P^{(1)}_{ij} ∂_i f ∂_j g and {f, g}_2 = Σ_{ij} P^{(2)}_{ij} ∂_i f ∂_j g satisfies the Jacobi identity reads:
  Σ [ P^{(1)}_{il} ∂_l P^{(2)}_{jk} + P^{(2)}_{il} ∂_l P^{(1)}_{jk} ] ∂_i f ∂_j g ∂_k h + cyclic perm. = 0

Since no second order derivatives occur, it is sufficient to check it on the linear functions f = f_X, g = f_Y and h = f_Z. The condition becomes:

  ⟨D_1(X), d⟨D_2(Y), Z⟩⟩ + ⟨D_2(X), d⟨D_1(Y), Z⟩⟩ + cyclic perm. = 0

where d is the differential on phase space. We have:

  d⟨D_1(X), Y⟩ = [X, Y]
  d⟨D_2(X), Y⟩ = X(LY)_− − (YL)_− X + (XL)_− Y − Y(LX)_− + (YLX − XLY)_−
    + 1/(n+1) { [(∂^{-1}[Y, L]_{-1}), X] − [(∂^{-1}[X, L]_{-1}), Y] }

Inserting this into the above condition, one verifies that it is indeed satisfied.

One of the main advantages of bihamiltonian structures is that they automatically produce commuting Hamiltonians, as we now explain. Let { , }_1 and { , }_2 be two compatible Poisson brackets, and consider the linear combination

  { , }_λ = { , }_1 − λ { , }_2

Let us assume the existence of H_λ, a Casimir function of the Poisson bracket { , }_λ. This means

  {H_λ, f}_λ = 0,  ∀f   (10.46)

Suppose that we can expand H_λ = Σ_{n≥0} H_n λ^n. Then the above relation gives

  {H_0, f}_1 = 0,  {H_n, f}_1 = {H_{n-1}, f}_2,  ∀f   (10.47)

This shows that the flows generated by the H_n are related by recursion relations of the Lenard type. Moreover, the H_n are in involution with respect to both Poisson brackets { , }_{1,2}. The proof is by induction. First, by eq. (10.47), one has {H_0, H_n}_1 = 0, ∀n. Suppose that we have shown that {H_m, H_n}_1 = 0, ∀n; then we have {H_{m+1}, H_n}_1 = 0, ∀n. This is because, using the recursion relations, we have

  {H_{m+1}, H_n}_1 = {H_m, H_n}_2 = −{H_n, H_m}_2 = −{H_{n+1}, H_m}_1 = 0

Hence the H_n are in involution with respect to the first Poisson bracket. But by the recursion relations, we have {H_m, H_n}_2 = {H_{m+1}, H_n}_1 = 0, so they are also in involution with respect to the second bracket. All this means that bihamiltonian structures are consubstantial to integrable systems. The unique feature of the present situation is that both Poisson brackets are local.

10.9 The Drinfeld–Sokolov reduction

The two compatible Poisson brackets { , }_1 and { , }_2 have a nice Lie algebraic interpretation, which we now explain. They can be obtained,
through Hamiltonian reduction, from the Kostant–Kirillov bracket on coadjoint orbits of central extensions of loop algebras.

Consider the loop algebra of traceless (n+1) × (n+1) matrices U(x), with matrix elements functions of x. Let G be the central extension of this loop algebra. It consists of pairs (U(x), u), also denoted by U(x) + uK, where u is a number and K is called the central element. The commutator on G is defined as:

  [(U(x), u), (V(x), v)] ≡ ([U(x), V(x)], ω(U, V))

where in the right-hand side [U(x), V(x)] is the loop algebra commutator and ω(U, V) is a bilinear antisymmetric form. The central element K = (0, 1) commutes with everything. The Jacobi identity for this bracket reduces to the cocycle condition on ω:

  ω([U, V], W) + ω([W, U], V) + ω([V, W], U) = 0

Note that this is a linear condition on ω, so the sum of two cocycles is a cocycle. Trivial cocycles are given by Σ([U, V]), where Σ is a linear form on the loop algebra. Such a linear form can be written as Σ(U) = ∫ dx Tr(Σ(x)U(x)), where we used the natural invariant scalar product on the loop algebra:

  (U, V) = ∫ dx Tr(U(x)V(x))

The standard non-trivial cocycle is:

  ω_0(U, V) = ∫ dx Tr(U(x) ∂_x V(x))

and we can take for ω any linear combination ω(U, V) = ω_0(U, V) + Σ([U, V]).

The dual G* of G can be identified with G using the non-degenerate bilinear form (U + uK, V + vK) = (U, V) + uv. Then the coadjoint action of G on G* reads:

  ad*_{(V,v)}(U, u) = (u ∂_x V − [U + Σ, V], 0)   (10.48)

To see it, we apply the definition

  (ad*_{(V,v)}(U, u))(W, w) = −((U, u), [(V, v), (W, w)]) = −(U, [V, W]) − u ω(V, W)
    = ∫ dx Tr((−[U, V] + u ∂V − [Σ, V]) W)
We see that u is invariant under the coadjoint action eq. (10.48), so in the following we fix it to the value u = 1. The coadjoint action of (V, v) then becomes a gauge transformation on the operator ∂ − U − Σ, namely:

  ad*_{(V,v)}(U, 1) = (U′, 0),  with U′ = ∂V − [U + Σ, V]   (10.49)

By construction, any orbit of the gauge action in G* is equipped with an invariant symplectic form, the Kostant–Kirillov form. Explicitly, at the point U, the induced Poisson bracket reads:

  {f, g}(U) = (ad*_{df}(U))(dg) = (∂df − [U, df], dg) − (Σ, [df, dg])   (10.50)

where the differentials df and dg of functions on G* are viewed as elements of G. The two terms in the right-hand side of eq. (10.50) obviously satisfy the Jacobi identity separately, and so does their sum, hence they define two compatible Poisson brackets. We show below that, with a proper choice of Σ and an appropriate symplectic reduction, these two Poisson brackets reduce to the brackets { , }_1 and { , }_2 considered in the previous section, which are then compatible by construction.

We choose to reduce by a subgroup of the loop group, namely the loop group N_− of lower triangular matrices with 1 on the diagonal. The dual of its Lie algebra can be identified with the loop algebra N_+ of strictly upper triangular matrices. The Hamiltonian which generates the gauge action by (V, v) is simply H_{(V,v)}(U, u) = (U, V), as in the general theory, see Chapter 14. Alternatively, this follows from eq. (10.50) with f_V(U) = (U, V) and df_V = V. The moment at the point (U, 1) is P(U) = P_{N_+} U. To perform the Hamiltonian reduction, we must set P(U) to a fixed value µ ∈ N_+, which determines the nature of the reduced symplectic manifold. We take:

  µ = Σ_{i=1}^{n} E_{i,i+1} = E_+ ∈ N_+

where E_+ is the sum of the simple root vectors of the Lie algebra sl(n+1) in the vector representation. So far, we did not specify the form Σ(U) in the central extension we considered. We now require that Σ ∈ N_−, and more precisely:

  Σ = α Σ_0,  with Σ_0 = E_{n+1,1} ∈ N_−   (10.51)

This choice will lead to the Poisson bracket { , }_1. It has the important property that [Σ, V] = 0 for any V ∈ N_−, so that the coadjoint action of
N_− reduces to a gauge action on ∂ − U. Moreover, the stability group of µ is the whole group N_−, as we now show. The variation of the moment under the coadjoint action of N_− is given by δ_V µ = P_{N_+}(∂V − [µ + Σ, V]), where V ∈ N_−. Due to the specific form of the momentum µ, the commutator [µ, V] cannot have matrix elements above the diagonal and is killed by the projection on N_+. Similarly, ∂V is lower triangular and does not survive the projection. Finally, we recall that [Σ, V] = 0 for the specific choice of Σ we made. Altogether, δ_V µ = 0.

The matrices U(x) such that P_{N_+} U = µ have the form U(x) = B(x) + µ, where B(x) is a lower triangular matrix, including the diagonal. The reduced phase space is obtained by quotienting by the group N_−, with group action U → U′ = ∂V + [U, V], V ∈ N_−. Note that this leaves the form of U invariant. Alternatively, the reduced phase space can be identified with the set of differential operators ∂ − B − µ quotiented by the group N_− acting by gauge transformations. One can use this gauge action to bring ∂ − B − µ to either one of the two forms (note that B has (n+1) more parameters than N_−):

  D_p = ∂ − \begin{pmatrix} p_1 & 1 & & \\ & p_2 & 1 & \\ & & \ddots & 1 \\ & & & p_{n+1} \end{pmatrix},
  D_u = ∂ − \begin{pmatrix} 0 & 1 & & \\ & 0 & \ddots & \\ & & & 1 \\ u_0 & u_1 & \cdots & u_n \end{pmatrix}   (10.52)

Since we consider the loop algebra of traceless matrices, we have Σ_i p_i = 0 and u_n = 0. The sets (p_1, . . . , p_{n+1}) and (u_0, . . . , u_n) constitute two different coordinate systems on the reduced phase space F_µ = M_µ/N_−.

Note that with any point in M_µ, i.e. with any matrix differential operator D = ∂ − B − µ, one can associate a scalar differential operator of order (n+1). To do that, we consider the matrix differential equation DΨ = 0 and write the differential equation of order (n+1) induced on the first component ψ_1 of Ψ, which is of the form Lψ_1 = 0. Since the group action of N_− on Ψ leaves ψ_1 invariant, this differential equation is invariant under gauge transformations, so the coefficients of the equation are invariant functions under N_−. For the two particular forms D_p, D_u, we get:

  L = (∂ − p_{n+1}) ··· (∂ − p_1),  L = ∂^{n+1} − u_{n-1}∂^{n-1} − ··· − u_1∂ − u_0
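For n = 1 this elimination is a two-line computation; the following sketch (ours, not from the text, with p_1 = p and p_2 = −p imposed by tracelessness) eliminates ψ_2 from the first-order system in the gauge D_p and recovers L = (∂ + p)(∂ − p) = ∂² − (p² + p′):

```python
import sympy as sp

x = sp.symbols('x')
p = sp.Function('p')(x)
psi1 = sp.Function('psi1')(x)

# First-order system (d - B - mu)Psi = 0 for n = 1 in the gauge D_p:
#   psi1' = p*psi1 + psi2,   psi2' = -p*psi2    (p1 = p, p2 = -p, zero trace)
psi2 = sp.diff(psi1, x) - p*psi1          # solve the first row for psi2
residual = sp.diff(psi2, x) + p*psi2      # second row: psi2' - p2*psi2 = 0

# Induced scalar operator L = (d + p)(d - p) = d^2 - (p^2 + p'):
u = p**2 + sp.diff(p, x)                  # the Miura variable
L_psi1 = sp.diff(psi1, x, 2) - u*psi1
print(sp.simplify(residual - L_psi1))     # expected: 0
```

The invariant coefficient u = p² + p′ is precisely the Miura transformation of Section 10.7.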
It remains to express the reduced Poisson bracket in terms of the invariant operator L. We recall that the reduced bracket of invariant functions can be computed straightforwardly using the Poisson bracket on the unreduced phase space, see Chapter 14. We take as invariant functions the functions f_X = ⟨LX⟩, where X is any pseudo-differential operator in P_−, and ⟨ ⟩ is the Adler trace. Hence we have at the point (p + µ) (where p is diagonal)

  {f_X, f_Y}_reduced = (ad*_{df_X}(p + µ), df_Y)   (10.53)

Separating the ω_0 and Σ parts in the cocycle definition, we can also write {f_X, f_Y}_reduced = {f_X, f_Y}_{ω_0} + {f_X, f_Y}_Σ with

  {f_X, f_Y}_{ω_0} = (∂df_X − [p + µ, df_X], df_Y),  {f_X, f_Y}_Σ = −([Σ, df_X], df_Y)

To compute df_X and df_Y, we first need to compute the variation of L when p + µ → p + µ + b, where b is a small lower triangular matrix (including the diagonal). Writing the system (∂ − p − b − µ)Ψ = 0 in terms of ψ_1 and keeping only terms of first order in b we find, with the notations of eq. (10.42):

  δL = − Σ_{i≥j} L_{n+1,i+1} b_{ij} L_{j-1,1}

The differential df_X is defined by the relation (df_X, b) = δ⟨LX⟩, so that df_X is the upper triangular traceless matrix:

  (df_X)_{ji} = −(L_{j-1,1} X L_{n+1,i+1})_{-1} + δ_{ij}/(n+1) Σ_k (L_{k-1,1} X L_{n+1,k+1})_{-1}   (10.54)

We are now in a position to prove the main result of this section:

Proposition. One has {f_X, f_Y}_{ω_0} = {f_X, f_Y}_2 and {f_X, f_Y}_Σ = {f_X, f_Y}_1.

Proof. We start with the coadjoint action:

  ad*_{df_X}(p + µ) = ∂df_X − [p + µ + Σ, df_X]

Noting that df_Y is upper triangular in eq. (10.53), we need only keep the lower triangular part of ad*_{df_X}(p + µ). Using eq. (10.49) (with V = df_X and U = p + µ), we remark that the Σ independent term in this expression is upper triangular, so that we need only keep the diagonal part of this term, that is (df_X)_{kk}. We get:

  {f_X, f_Y}_reduced = Σ_k (∂(df_X)_{kk})(df_Y)_{kk} − ([Σ, df_X], df_Y)

The first term is just {f_X, f_Y}_{ω_0}. Substituting the expressions of df_X and df_Y, it immediately yields eq. (10.43), which shows that it coincides with the bracket { , }_2. This also shows that the Poisson bracket of the diagonal coordinates p_i(x) is given by eq. (10.41).

We now look at the Σ dependent term, which is {f_X, f_Y}_Σ, and show that it reproduces the bracket { , }_1. Due to the choice Σ = αΣ_0 we have to compute:

  ([Σ_0, df_X], df_Y) = Σ_{j=1}^{n+1} [ (df_X)_{1,j}(df_Y)_{j,n+1} − (df_Y)_{1,j}(df_X)_{j,n+1} ]

Inserting the expressions eq. (10.54) for df_X and df_Y, and noting that the terms proportional to the Kronecker deltas do not contribute, we get:

  ([Σ_0, df_X], df_Y) = Σ_{j=1}^{n+1} ∫ dx [ (X L_{n+1,j+1})_{-1} (L_{j-1,1} Y)_{-1} − (Y L_{n+1,j+1})_{-1} (L_{j-1,1} X)_{-1} ]

Using eq. (10.45), this can be rewritten in terms of Adler traces:

  ([Σ_0, df_X], df_Y) = Σ_{j=1}^{n+1} [ ⟨(X L_{n+1,j+1})_− L_j (L_{j-1,1} Y)_−⟩ − ⟨(Y L_{n+1,j+1})_− L_j (L_{j-1,1} X)_−⟩ ]

Substituting everywhere U_− = U − U_+, and using the isotropy of P_± so that ⟨U V_+⟩ = ⟨U_− V⟩, this reads:

  ([Σ_0, df_X], df_Y) = Σ_{j=1}^{n+1} [ ⟨XLY⟩ − ⟨(X L_{n+1,j})_− L_{j-1,1} Y⟩ − ⟨(X L_{n+1,j+1})_+ L_{j,1} Y⟩ ] − (X ↔ Y)

In the sum, the first term yields (n+1)⟨XLY⟩, while the second and third terms regroup themselves as

  − Σ_{j=2}^{n+1} ⟨[(X L_{n+1,j})_− + (X L_{n+1,j})_+] L_{j-1,1} Y⟩ = −n ⟨XLY⟩

and we finally get:

  −([Σ_0, df_X], df_Y) = −⟨XLY⟩ + ⟨YLX⟩ = ⟨L[X, Y]⟩ = {f_X, f_Y}_1
This construction, due to Drinfeld and Sokolov, can be generalized to Lie algebras other than sl(n+1), replacing E_+ by the sum of the simple root vectors and Σ_0 by the root vector E_{−α}, where α is the longest root. This also shows the nice interplay between different Lie algebra structures (the one induced by the algebra of pseudo-differential operators, and the Kac–Moody one) producing the same Kostant–Kirillov Poisson brackets, after suitable Hamiltonian reduction.

10.10 Whitham equations

In many cases, solutions of non-linear partial differential equations take the form of modulated wavetrains, i.e. at small scale they look like sinusoidal solutions, but at large scale the parameters of the sinusoid slowly evolve. The Whitham equations describe the slow variations of these parameters. It turns out that algebro-geometric solutions of KP are particularly well suited to Whitham analysis. In the algebro-geometric solutions, the field u = −2q_{-1} of the KP hierarchy is of the form:

  u(t) = u_0(Σ_i t_i U^{(i)}(m), m)

where the t_i are the KP time variables and m denotes the moduli of the Riemann surface, Γ, used to build the solution. The quantities U^{(i)}(m), defined in eq. (10.17), are functions of the moduli only. The vector

  V = Σ_i t_i U^{(i)}(m)   (10.55)

lives on the Jacobian of Γ, and u(t) is a pseudo-periodic function of each time t_i.

We now look for solutions of KP, close to these algebro-geometric solutions, but where the moduli m slowly evolve. To describe the slow modulation, we introduce a small parameter ε and the large scale variables T_i = εt_i, and we express the idea of a modulated wavetrain by searching for u(t) in the form

  u(t) = u_0(ε^{-1}S(T), m(T)) + ε u_1(T) + ···,  T_i = ε t_i   (10.56)

Our purpose is to find the equations for S(T) and m(T) such that u(t) in eq. (10.56) is a solution of the KP equation to first order in ε, valid over a time scale t ∼ ε^{-1}. This means that the first order term must remain uniformly bounded over a period of time ε^{-1}. To this aim, we take advantage of the fact that the time dependence of the algebro-geometric solution is entirely contained in the variable V, eq. (10.55), so
that we can write it explicitly as a function of V and m. Once this is done, we consider V and m as independent variables. We postulate the equations of motion for V = ε^{-1}S:

  ∂_{T_i} S = U^{(i)}(m(T))   (10.57)

These equations are obviously satisfied for the modulated solutions. As a consequence, we can write the time derivatives of u(t) in eq. (10.56) to order ε as

  ∂_{t_i} u(t) = U^{(i)} · ∂_V u_0 + ε (∂_{T_i} u_0 + ∂_{t_i} u_1(t))   (10.58)

where the slow time derivatives are defined by:

  ∂_{T_i} u_0 = Σ_j (∂_{T_i} m_j) ∂_{m_j} u_0

They come from the variation of the moduli only. Note that eq. (10.57) already imposes constraints on the time evolution of the moduli. Specifically, the integrability conditions of eq. (10.57) imply

  ∂_{T_i} U^{(j)} = ∂_{T_j} U^{(i)}   (10.59)

The slow modulation equations will have to be compatible with these constraints. The equations for the time evolution of the moduli we are aiming at are the Whitham equations, eqs. (10.77). The main idea of the derivation, which is rather long, is to average over the fast oscillations and retain terms involving only the slow modulations. In the algebro-geometric setting, the specific feature we use is that, the time flow being a linear motion on the Jacobian torus, by the ergodicity theorem we can replace the fast time average by an average over the torus.

Let us now start from the linear system satisfied by the Baker–Akhiezer function, and limit ourselves to the first three times:

  ∂_{t_2} Ψ = (Q²)_+ Ψ,  ∂_{t_3} Ψ = (Q³)_+ Ψ

or

  (∂_{t_2} − L)Ψ = 0,  L ≡ ∂_x² − u   (10.60)
  (∂_{t_3} − A)Ψ = 0,  A ≡ ∂_x³ − (3/2)u∂_x − v   (10.61)

where we set v = (3/2)∂_x u − 3q_{-2}, and we identify t_1 = x, T_1 = X. The compatibility condition of this system, F ≡ ∂_{t_2}A − ∂_{t_3}L − [L, A] = 0, is the KP equation.
Proposition. Let us denote by L = L_0 + εL_1 + ··· and A = A_0 + εA_1 + ··· the operators corresponding to the small perturbation eq. (10.56) of a finite-zone solution, with L_1, A_1 linear in u_1. To order ε the zero curvature condition reduces to:

  F_1 + ∂_{T_2}A_0 − ∂_{T_3}L_0 + L_0^{(1)}∂_X A_0 − A_0^{(1)}∂_X L_0 = 0   (10.62)

with L_0^{(1)} = −2∂_x, A_0^{(1)} = −3∂_x² + (3/2)u_0 and

  F_1 = ∂_{t_2}A_1 − ∂_{t_3}L_1 − [L_1, A_0] − [L_0, A_1]

Proof. We suppose that the perturbed operators satisfy the zero curvature equation. Since L = L_0 + εL_1 + ··· and A = A_0 + εA_1 + ···, one has

  F = F_0 + εF_1 + ···   (10.63)

It is important to realize, however, that the leading term F_0 also produces a correction of order ε, due to the deformation of ∂_{t_i}, see eq. (10.58). To extract this term, let us write L_0 = Σ_i l_i ∂^i, A_0 = Σ_j a_j ∂^j, with l_i, a_j functions of u_0 and its derivatives. The product is

  L_0 A_0 = Σ_{i,j,k} \binom{i}{k} l_i (∂^k a_j) ∂^{i+j-k}

The term of order ε induced by eq. (10.58) in (∂^k a_j) is εk ∂_X ∂^{k-1} a_j. Hence the first order term is equal to −εL_0^{(1)}∂_X A_0, where L_0^{(1)} = −Σ_{i>0} i l_i ∂^{i-1}. One gets the similar contribution −εA_0^{(1)}∂_X L_0 from the product A_0 L_0, with a similar definition of A_0^{(1)}.

We will get rid of F_1 in eq. (10.63) by an averaging procedure, leading to a direct determination of the variation of the moduli in terms of the slow variables. To work conveniently with these averages, we introduce a definition. For D a differential operator, we define the differential operators D^{(j)} by

  (D*f) g = Σ_{j≥0} ∂^j (f D^{(j)} g)   (10.64)
To show how this definition works, consider D = a∂^i, so that (D*f)·g = ((−1)^i ∂^i(af))·g. Let us write ∂ = ∂_1 + ∂_2, where ∂_{1,2} act respectively on the first and second factor around the dot: ∂(f·g) = ∂_1 f·g + f·∂_2 g ≡ ∂f·g + f·∂g. This is just a way to encode the Leibniz rule. Then

  (D*f)·g = (∂_2 − ∂)^i (af·g) = Σ_{j≥0} (−1)^j \binom{i}{j} ∂^j (f a ∂^{i-j} g)

So, for D = a∂^i, we get D^{(j)} = (−1)^j \binom{i}{j} a ∂^{i-j}. In particular, by linearity, we get for any differential operator D = Σ_i a_i ∂^i:

  D^{(0)} = D,  D^{(1)} = −Σ_{i>0} i a_i ∂^{i-1}

Note that the notation D^{(1)} is consistent with the notations L_0^{(1)}, A_0^{(1)} introduced earlier. We also have the identity:

  (D_1 D_2)^{(j)} = Σ_{k=0}^{j} D_1^{(k)} D_2^{(j-k)}   (10.65)

This follows from the binomial identity

  Σ_p \binom{n}{p} \binom{m-k}{j-p} = \binom{n+m-k}{j}
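Both eq. (10.64) and the product rule eq. (10.65) are easy to test on low-order operators; a symbolic sketch (our illustration, not from the text):

```python
import sympy as sp

x = sp.symbols('x')
a, b, f, g = [sp.Function(s)(x) for s in 'abfg']

# Check (10.64) for D = a d^2: (D* f) g = sum_j d^j (f D^(j) g),
# with D^(j) = (-1)^j binom(2, j) a d^(2-j), i.e. a d^2, -2a d, a.
lhs = sp.diff(a*f, x, 2) * g
rhs = (f * a * sp.diff(g, x, 2)                   # j = 0 term: f D^(0) g
       + sp.diff(f * (-2*a) * sp.diff(g, x), x)   # j = 1 term
       + sp.diff(f * a * g, x, 2))                # j = 2 term
check64 = sp.simplify(lhs - rhs)

# Check (10.65) for j = 1 with D1 = a d, D2 = b d, acting on g:
# D1 D2 = a b d^2 + a b' d, so (D1 D2)^(1) = -2 a b d - a b', while the
# right-hand side is D1^(0) D2^(1) + D1^(1) D2^(0) = (a d)(-b) + (-a)(b d).
lhs1 = -2*a*b*sp.diff(g, x) - a*sp.diff(b, x)*g
rhs1 = a*sp.diff(-b*g, x) + (-a)*(b*sp.diff(g, x))
check65 = sp.simplify(lhs1 - rhs1)
print(check64, check65)  # expected: 0 0
```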
Let Ψ_0 and Ψ*_0 be the Baker–Akhiezer functions corresponding to the exact algebro-geometric solution u_0. Recall that Ψ_0 and Ψ*_0 can be written in the form (see eq. (10.20))

  Ψ_0 = e^{Σ_i t_i P^{(i)}(P)} φ(P, V, m),  Ψ*_0 = e^{−Σ_i t_i P^{(i)}(P)} φ*(P, V, m)   (10.66)

where φ(P, V, m), φ*(P, V, m) are periodic, of period 1 in each component of V, and so bounded in V. We have introduced the notation

  P^{(i)}(P) = ∫_{P_∞}^{P} Ω^{(i)}

where the forms Ω^{(i)} have purely imaginary periods over any cycle. In these Baker–Akhiezer functions, we make the substitution V → S/ε. They satisfy the equations (∂_{t_2} − L_0)Ψ_0 = O(ε) and (∂_{t_3} − A_0)Ψ_0 = O(ε). The right-hand sides are not zero because in L_0 and A_0 we made the substitution V → S/ε.

Proposition. We have the identity

  ∂_{t_2}(Ψ*_0 A_1 Ψ_0) − ∂_{t_3}(Ψ*_0 L_1 Ψ_0) = Ψ*_0 F_1 Ψ_0 + Σ_{j≥1} ∂^j [Ψ*_0 (A_0^{(j)} L_1 − L_0^{(j)} A_1) Ψ_0] + O(ε)   (10.67)
Proof. From eq. (10.23), and the renaming of the operators Q_i as in eqs. (10.60, 10.61), we have

  ∂_{t_2}(Ψ*_0 A_1 Ψ_0) = −(L*_0 Ψ*_0) A_1 Ψ_0 + Ψ*_0 ∂_{t_2}A_1 Ψ_0 + Ψ*_0 A_1 L_0 Ψ_0 + O(ε)
    = Ψ*_0 (∂_{t_2}A_1 + [A_1, L_0]) Ψ_0 − Σ_{j≥1} ∂^j (Ψ*_0 L_0^{(j)} A_1 Ψ_0) + O(ε)

Writing the second term ∂_{t_3}(Ψ*_0 L_1 Ψ_0) in a similar way yields the result.

We now take the average of eq. (10.67) over the times t_1, t_2, t_3. This average is taken over a time scale which is large compared to 1 but small compared to 1/ε. For a quantity O, we denote this average by

  ⟨O⟩_{t_i} = 1/(2ℓ) ∫_{−ℓ}^{ℓ} O dt_i

Over this time scale, the point V describes in general an almost dense trajectory on the torus, so that the average can also be interpreted as an average on the torus. The time scales are chosen so that the moduli can be considered as constant in the averaging. In agreement with our hypothesis, we assume that L_1 and A_1 remain bounded when the t_i evolve in an interval of order O(ε^{-1}). Note that in eq. (10.67) the exponential factors cancel between Ψ_0 and Ψ*_0. Since the average of derivatives of bounded functions vanishes, only one term survives in the averaging of eq. (10.67) and we get:

  ⟨Ψ*_0 F_1 Ψ_0⟩_{t_1,t_2,t_3} = 0

Hence, by averaging eq. (10.62), we get an equation valid at order ε:

  ⟨Ψ*_0 (∂_{T_2}A_0 − ∂_{T_3}L_0 + L_0^{(1)}∂_X A_0 − A_0^{(1)}∂_X L_0) Ψ_0⟩_{t_1,t_2,t_3} = 0   (10.68)

In this equation all quantities are computed with the exact algebro-geometric solution u_0. In the following we shall drop the suffix 0. The next two propositions are devoted to the computation of the various terms in this equation.

Proposition. With the parametrization eq. (10.66) of the Baker–Akhiezer function, we have:

  ⟨Ψ* ∂_{T_3}L Ψ⟩ = ∂_{T_3}P^{(2)} ⟨φ*φ⟩ + ∂_{T_3}U_j^{(2)} ⟨φ*φ_j⟩ + ∂_{T_3}U_j^{(1)} ⟨φ* L̃^{(1)} φ_j⟩   (10.69)
  ⟨Ψ* ∂_{T_2}A Ψ⟩ = ∂_{T_2}P^{(3)} ⟨φ*φ⟩ + ∂_{T_2}U_j^{(3)} ⟨φ*φ_j⟩ + ∂_{T_2}U_j^{(1)} ⟨φ* Ã^{(1)} φ_j⟩   (10.70)
and

  ⟨Ψ* (L^{(1)}∂_X A − A^{(1)}∂_X L) Ψ⟩ = ∂_X P^{(3)} ⟨Ψ* L^{(1)} Ψ⟩ − ∂_X P^{(2)} ⟨Ψ* A^{(1)} Ψ⟩ + ∂_X U_j^{(3)} ⟨φ* L̃^{(1)} φ_j⟩ − ∂_X U_j^{(2)} ⟨φ* Ã^{(1)} φ_j⟩   (10.71)

where φ_j = ∂_{V_j}φ and L̃^{(1)} = e^{−Σ_i t_i P^{(i)}} L^{(1)} e^{Σ_i t_i P^{(i)}}, and similarly for Ã^{(1)}.

Proof. Let us choose two Riemann surfaces Γ = Γ(m) and Γ′ = Γ(m + δm). Comparison of functions defined on different Riemann surfaces requires a “connection”. This is achieved by choosing a meromorphic function on each Riemann surface and keeping it fixed. We choose to keep P^{(1)} fixed. Let Ψ and Ψ′ be corresponding Baker–Akhiezer functions on Γ and Γ′. Consider the expression

  ∂_{t_2}(Ψ* Ψ′) = −(L*Ψ*)Ψ′ + Ψ*L′Ψ′ = Ψ*L′Ψ′ − Σ_{j≥0} ∂^j(Ψ*L^{(j)}Ψ′)
    = Ψ*(L′ − L)Ψ′ − Σ_{j≥1} ∂^j(Ψ*L^{(j)}Ψ′)

Subtracting the same equation for Ψ′ = Ψ, we get

  ∂_{t_2}(Ψ*(Ψ′ − Ψ)) = Ψ*(L′ − L)Ψ′ − Σ_{j≥1} ∂^j(Ψ*L^{(j)}(Ψ′ − Ψ))

If Ψ′ = Ψ + δΨ, this gives:

  ∂_{t_2}(Ψ* δΨ) = (Ψ* δL Ψ) − Σ_{j≥1} ∂^j(Ψ* L^{(j)} δΨ)   (10.72)

Now from eq. (10.66) we have

  δΨ = e^{Σ_i t_i P^{(i)}} ( δm ∂_m φ + Σ_{i,j} t_i δU_j^{(i)} φ_j + Σ_i t_i δP^{(i)} φ )   (10.73)

where we recall that φ_j = ∂_{V_j}φ. We now average eq. (10.72). In the left-hand side, in the average over t_2, the terms which do not contain an explicit factor t_2 vanish because they are the average of a derivative of a bounded function. The terms
containing an explicit factor t_2 are treated by first averaging over t_2 on the interval [−ℓ, ℓ]:

  ⟨∂_{t_2}(t_2 f(t_1, t_2, t_3))⟩_{t_1,t_2,t_3} = ⟨ (1/2)[f(t_1, ℓ, t_3) + f(t_1, −ℓ, t_3)] ⟩_{t_1,t_3} = ⟨⟨f⟩⟩

where ⟨⟨f⟩⟩ means the average on the torus. We treat similarly the average over t_1 = x in the right-hand side. Note that since we have kept P^{(1)} fixed, there is no δP^{(1)} contribution. Interpreting δ as a small variation of the moduli m in the direction T_3, we arrive at

  ⟨Ψ* ∂_{T_3}L Ψ⟩ = ∂_{T_3}P^{(2)} ⟨φφ*⟩ + ∂_{T_3}U_j^{(2)} ⟨φ_j φ*⟩ + ∂_{T_3}U_j^{(1)} ⟨φ* L̃^{(1)} φ_j⟩

where the last term comes from the term j = 1 in eq. (10.72). In the same way, we get

  ⟨Ψ* ∂_{T_2}A Ψ⟩ = ∂_{T_2}P^{(3)} ⟨φφ*⟩ + ∂_{T_2}U_j^{(3)} ⟨φ_j φ*⟩ + ∂_{T_2}U_j^{(1)} ⟨φ* Ã^{(1)} φ_j⟩

This proves eqs. (10.69, 10.70).

To prove eq. (10.71), note that the vanishing of the curvature, F = 0, implies 0 = (F*g)f = Σ_j ∂^j(g F^{(j)} f), and we deduce that F^{(j)} = 0, ∀j. By eq. (10.65), this can be written as

  ∂_{t_3}L^{(j)} − ∂_{t_2}A^{(j)} + Σ_{k=0}^{j} [L^{(k)}, A^{(j-k)}] = 0

This relation implies, by performing the time derivatives with eqs. (10.60, 10.61), the identity

  Σ_{j≥1} ∂^{j-1} [ ∂_{t_3}(Ψ*L^{(j)}Ψ′) − ∂_{t_2}(Ψ*A^{(j)}Ψ′) ] = Σ_{j≥1} ∂^{j-1} [ Ψ*L^{(j)}(A′ − A)Ψ′ − Ψ*A^{(j)}(L′ − L)Ψ′ ]

Averaging this equation with Ψ′ = Ψ + δΨ we obtain:

  ⟨∂_{t_3}(Ψ*L^{(1)}δΨ) − ∂_{t_2}(Ψ*A^{(1)}δΨ)⟩ = ⟨Ψ*L^{(1)}δAΨ⟩ − ⟨Ψ*A^{(1)}δLΨ⟩
10.10 Whitham equations
Indeed, the order zero term, i.e. Ψ′ = Ψ, gives vanishing averages because it is always a derivative of a bounded function. The first order term (in δΨ) produces potentially dangerous terms linear in the time variables. However, the averages vanish when j ≥ 2 because we have at least two derivatives. The average finally reduces to eq. (10.71) when we interpret δ = ∂_X.

Using the results of this proposition, eq. (10.68) becomes:

  0 = (∂_{T_2}P^{(3)} − ∂_{T_3}P^{(2)}) ⟨Ψ*Ψ⟩ + ∂_X P^{(3)} ⟨Ψ*L^{(1)}Ψ⟩ − ∂_X P^{(2)} ⟨Ψ*A^{(1)}Ψ⟩   (10.74)

The terms ⟨φ*φ_j⟩ cancel because we assumed U^{(i)} = ∂_{T_i}S, so that the compatibility condition eq. (10.59) holds. For the same reason the terms ⟨φ* L̃^{(1)} φ_j⟩ and ⟨φ* Ã^{(1)} φ_j⟩ also cancel. The last step in our derivation of the Whitham equations consists of evaluating the averages ⟨Ψ*L^{(1)}Ψ⟩ and ⟨Ψ*A^{(1)}Ψ⟩.

Proposition. Let Ω^{(i)} be the second kind Abelian differentials with purely imaginary periods used to construct the Baker–Akhiezer function on Γ. We have:

  ⟨Ψ*L^{(1)}Ψ⟩ Ω^{(1)} = −⟨Ψ*Ψ⟩ Ω^{(2)}   (10.75)
  ⟨Ψ*A^{(1)}Ψ⟩ Ω^{(1)} = −⟨Ψ*Ψ⟩ Ω^{(3)}   (10.76)

Proof. Consider eq. (10.72) with δ = d now representing the differential on the curve Γ. Since δL = 0, it reduces to:

  ∂_{t_2}(Ψ* dΨ) = −∂_x(Ψ*L^{(1)}dΨ) − Σ_{j≥2} ∂^j(Ψ*L^{(j)}dΨ)

We have, recalling that dP^{(i)} = Ω^{(i)},

  dΨ = e^{Σ_i t_i P^{(i)}} ( Σ_i t_i Ω^{(i)} φ + dφ )

By averaging, the terms with ∂^j, j ≥ 2, all vanish. Treating carefully the terms linear in the times, as in the proof of the previous proposition, we get:

  Ω^{(2)} ⟨Ψ*Ψ⟩ = −Ω^{(1)} ⟨Ψ*L^{(1)}Ψ⟩

The other formula is proved similarly.
378
10 The KP hierarchy
Inserting these formulae into eq. (10.74) we get our final result: Proposition. The slow modulations obey the Whitham equations (3) Ω(2) Ω ∂T2 − (1) ∂X P (3) = ∂T3 − (1) ∂X P (2) (10.77) Ω Ω Had we kept fixed any meromorphic function on the Riemann surfaces instead of P (1) , the Whitham equation would have taken the more symmetric form
Ω(1) ∂T2 P (3) − ∂T3 P (2)
+Ω(2) ∂T3 P (1) − ∂T1 P (3)
+Ω(3) ∂T1 P (2) − ∂T2 P (1) = 0 It is important to check the consistency equations, eqs. (10.59). When the point P on Γ describes a non-trivial cycle, the forms Ω(i) (P ) do not change, but the functions P (i) (P ) change by a period U (i) . Hence the above equation implies
Ω(1) (P ) ∂T2 U (3) − ∂T3 U (2) + cyclic perm. = 0 which implies eqs. (10.59) because the Ω(i) are linearly independent. P In the KdV case, there is no time T2 . Keeping P∞ Ω(2) fixed, we get ∂T3 P (1) − ∂T1 P (3) = 0 Differentiating with respect to the point P on the Riemann surface, we get the Whitham equations in their usual form: ∂T3 Ω(1) − ∂T1 Ω(3) = 0
(10.78)
We will recover this equation in Chapter 11 where other proofs are available. In the KP case, however, the above derivation of Whitham equations, which is due to Krichever, is the only one known. Remark. Assuming that the Riemann surface Γ is generic, the forms Ω(1) and Ω(2)
have no common zero. Let us assume that
Ψ∗ Ψ and
Ψ∗ L(1) Ψ are meromorphic functions. They have respectively 2g and 2g + 1 poles, hence 2g and 2g + 1 zeroes. Looking at eq. (10.75), we see that the zeroes of
Ψ∗ Ψ are the 2g zeroes of Ω(1) . The form Ω(1) Ω= (10.79)
Ψ∗ Ψ has a double pole at ∞, is otherwise regular and has zeroes at the poles of Ψ and Ψ∗ . It coincides with the form defined in eq. (10.21).
10.11 Solution of the Whitham equations
379
10.11 Solution of the Whitham equations There is a simple method to find explicit solutions to the Whitham equations. Let us present it in the simple case of hyperelliptic curves, which is appropriate to the KdV equation. In this case the curve is of the form 2
µ = R(λ) =
2g+1
(λ − λi )
i=1
where the λi are slowly modulated. Recall that in KdV, only the odd times survive. The forms Ω(2i−1) are given by 2i − 1 λg+i−1 + P (2i−1) (λ) (2i−1) = Ω dλ (10.80) 2 R(λ) where the polynomial P (2i−1) (λ) is of degree g − 1 and chosen so that all the periods of Ω(2i−1) are pure imaginary. At infinity √ Ω(2i−1) = d(z 2i−1 + O(z −1 )), z = λ Let us introduce the normalized form T2i−1 Ω(2i−1) + Ω(n) S= i
where n is chosen at will and is a free parameter of the solution. Proposition. Let us assume that for each branch point λj , either λj is independent of the times T2i−1 or S vanishes at λj . This is a system of 2g + 1 equations for the 2g + 1 quantities λj , which allows us to express them in terms of the T2i−1 . Then ∂T2i−1 S = Ω(2i−1) It follows that the Whitham equations, eqs. (10.78), are satisfied. We have more generally: (10.81) ∂T2i−1 Ω(2j−1) = ∂T2j−1 Ω(2i−1) (2i−1) Uj = ∂T2i−1 Sj , Sj = S cj
where cj is a basis of non-trivial cycles on Γ. Proof. Let us consider the analyticity properties of ∂T2i−1 S. First, at infinity we have
S=d T2i−1 z 2i−1 + z 2n−1 + O(z −1 ) i
380
10 The KP hierarchy
hence ∂T2i−1 S = d(z 2i−1 + O(z −1 )). At finite distance, we have (i) 1 ∂T2i−1 λk (2i−1) S+ ∂T2i−1 S = Ω + ck ωk 2 λ − λk k
k
The second term in the right-hand side comes from the derivation of the factor 1/ R(λ) in eq. (10.80), while the last term comes from differentiating the polynomials P (2i−1) (λ) and P (n) . The ωk are the holomorphic differentials. The right-hand side is regular at finite distance since either ∂T2i−1 λk = 0 or S|λk = 0. Finally, all periods of ∂T2i−1 S are purely imaginary for T2i−1 real. Hence we have ∂T2i−1 S = Ω(2i−1) . This in turn implies eq. (10.81). Moreover, we have (2i−1) (2i−1) = Ω = ∂T2i−1 S = ∂T2i−1 Sj Uj cj
cj
So we have solved both eq. (10.57) and the Whitham equations, eq. (10.78). References [1] G.B. Whitham, Linear and Nonlinear Waves. Wiley (1974). [2] I.M. Gelfand and L.A. Dickey, Fractional powers of operators and Hamiltonian sytems. Funkz. Anal. Priloz. 10 (1976) 13–29. [3] F. Magri, A simple model of integrable Hamiltonian equation. J. Math. Phys. 19 (1978) 1156–1162. [4] M. Adler, On a trace functional for formal pseudo-differential operators and the symplectic structure of the Korteweg–de Vries equations. Invent. Math. 50 (1979) 219–248. [5] D.R. Lebedev and Yu.I. Manin, Hamiltonian Gelfand–Dickey operator and coadjoint representation of the Volterra group. Funkz. Analys. Priloz. 13 (1979) 40–46. [6] H. Flaschka, M.G. Forest and D.W. McLaughlin, Korteweg–de Vries equation. Comm. Pure Appl. Math. 33 (1980) 739–784. [7] A.G. Reyman and M.A. Semenov-Tian-Shansky, Family of Hamiltonian structures, hierarchy of Hamiltonians, and reduction for matrix first order differential operators. Funkz. Analys. Priloz. 14 (1980) 77–78.
10.11 Solution of the Whitham equations
381
[8] V.G. Drinfeld and V.V. Sokolov, Equations of the Korteweg–de Vries type and simple Lie algebras. Doklady AN SSSR 258 (1981) 11–16. [9] M. Jimbo and T. Miwa, Solitons and infinite-dimensional Lie algebras. RIMS 19 (1983) 943–1001. [10] I. Krichever, Method of averaging for two dimensional integrable equations. Funkz. Analys. Priloz. 22 (1988) 37–52. [11] L.A. Dickey, Soliton Equations and Hamiltonian Systems. World Scientific (1991).
11 The KdV hierarchy
In this chapter we study the Korteweg–de Vries equation, which occupies a central place in the modern theory of integrable systems. All the aspects of integrable systems discussed so far converge in this chapter to draw a particularly rich landscape. In particular, the methods of pseudodifferential operators allow us to easily discuss the formal aspects, the tau-functions yield soliton solutions, and the algebro-geometric methods yield finite-zone solutions. The soliton solutions which we obtained in the Grassmannian setting by using vertex operators are also degenerate cases of these finite-zone solutions. Finally, we use a fermionic fomalism to analyse the structure of the local fields and show that the equations of the hierarchy can be recast in a very compact form. This is used to give a new derivation of the Whitham equations in the KdV case. 11.1 The KdV equation The Korteweg–de Vries (KdV) equation was introduced historically as an approximation of the equations of hydrodynamics, describing unidimensional long waves in shallow water. In their pioneering work, Gardner, Greene, Kruskal and Miura found an unexpected connection with the inverse scattering problem of the Schroedinger equation. More recently, the Hamiltonian aspects of KdV theory connected it to conformal field theory. The KdV equation reads: 4∂t u = −6u∂x u + ∂x3 u
(11.1)
The numerical factors in front of each term in eq. (11.1) can be modified by rescaling u, x and t. The KdV equation can be written as the zero curvature condition Fxt ≡ ∂x At − ∂t Ax − [Ax , At ] = 0 382
11.1 The KdV equation
383
with the connection Ax , At , depending on a spectral parameter λ: 1 ∂x u 4λ − 2u 0 1 , At = Ax = −∂x u λ+u 0 4 4λ2 + 2λu + ∂x2 u − 2u2 (11.2) Note that ∂x − Ax = Du − λΣ0 with the notations of eqs. (10.51, 10.52) in Chapter 10. Alternatively, one can recast the KdV equation in the Lax form ∂t L = [M, L], where L and M are the following differential operators: L = ∂2 − u (11.3) 3 1 1 M = (4∂ 3 − 3u∂ − 3∂u) = (4∂ 3 − 6u∂ − 3(∂x u)) = (L 2 )+ 4 4 3
The operator ∂ acts as ∂x , and the notation (L 2 )+ refers to the pseudodifferential operator formalism introduced in Chapter 10. In the Lax equation [M, L] is the commutator of differential operators. Of course these two descriptions are not independent. To relate them, consider the linear system: Ψ Ψ (∂x − Ax ) = 0, (∂t − At ) =0 (11.4) χ χ The x-equation yields χ = ∂x Ψ and (L − λ)Ψ = 0 with L = ∂x2 − u
(11.5)
The time evolution of Ψ is given by 4∂t Ψ = ∂x u · Ψ + (4λ − 2u) ∂x Ψ. Using eq. (11.5), this may be rewritten as: (∂t − M )Ψ = 0
1 with M = (4∂ 3 − 3u∂ − 3∂u) 4
(11.6)
The compatibility condition of eqs. (11.5, 11.6) is the Lax equation ∂t L = [M, L], which is equivalent to the KdV equation. Equation (11.5) is the Schroedinger equation with potential u. The parameter λ gets an interpretation as a point of the spectrum of this operator. This is the origin of the terminology “spectral parameter”. As explained in Chapter 3, a general consequence of the zero curvature condition is the existence of non-trivial conserved quantities. This requires, however, imposing appropriate boundary conditions. We shall consider here for definiteness either potentials u(x) fast decreasing at x → ±∞, or potentials periodic under x → x + . Let us assume for definiteness that u(x) is periodic. Since Ax and At are local in u, they are also periodic. In this case conserved quantities are generated by Tr T (λ),
384
11 The KdV hierarchy
where T (λ) is the monodromy matrix associated with the linear system eq. (11.4): ←− T (λ) =exp dxAx (x, t, λ) (11.7) 0
To compute the trace Tr T (λ), we remark that it is invariant under periodic gauge transformations. We build a gauge in which the connection is diagonal, making the calculation of the monodromy matrix and its trace simple. We shall present this computation for general connections whose components Ax and At belong to the sl(2) algebra, so that it can be applied to a wider class of systems. The commutation relations of the sl(2) algebra are: [H, E± ] = ±2E± , [E+ , E− ] = H Its fundamental representation is given by: 0 1 0 1 0 , E− = H= , E+ = 0 0 1 0 −1
0 0
Proposition. Let Ax = Ah H +A− E− +A+ E+ , where Ah (x, λ), A± (x, λ) are periodic functions of x. There exists a periodic gauge transformation g(x, λ): Ax → g Ax ≡ g −1 Ax g − g −1 ∂x g such that g A− = g A+ = 0, and g
1 Ah = P (λ)H
where P (λ), independent of x, is given by: P (λ) = function v(x, λ) is a solution of the Ricatti equation: v + v2 = V
with
V = A + A 2 + A− A + ,
0
dx v(x, λ). The
A = Ah −
1 A+ 2 A+
(11.8)
Proof. The proof consists of performing successively three gauge transformations: the first one annihilates the component along E− , the second one annihilates the component along E+ and the third one is chosen to ensure that the component along H is constant. Let us perform first a gauge transformation with g = g1 = exp(f− E− ), then g
Ax = (Ah + A+ f− )H − (f− + 2Ah f− + A+ f−2 − A− )E− + A+ E+
where prime ( ) means derivative with respect to x. The coefficient of E− vanishes if f− is a solution of the Ricatti equation f− + 2Ah f− + A+ f−2 − A− = 0
385
11.1 The KdV equation A
If one sets f− = A1+ (v − A) and then A = Ah − 12 A+ , this equation be+ comes the Ricatti equation (11.8). As usual the substitution v = y /y linearizes the equation which becomes the Schroedinger equation: y − V y = 0
(11.9)
The potential V (x, λ) being periodic, one can take for y(x, λ) any one of the two quasi-periodic Bloch waves (Floquet solutions), y± (x, λ): y± (x + , λ) = exp(±P (λ)) y± (x, λ)
(11.10)
P (λ) is called the quasi-momentum. For definiteness, we shall take v = /y which is periodic. We shall, moreover, assume that the Wronskian y+ + − y y = 1. Notice that of y+ and y− is normalized by y+ y− + − y+ (, λ) dx v(x, λ) (11.11) = P (λ) = ln y+ (0, λ) 0 Similarly, let us define g2 = exp(f+ E+ ) and compute the matrix with g = g1 g2 : g
gA x
Ax = (Ah + A+ f− )H + (−f+ + 2(Ah + A+ f− )f+ + A+ )E+
We reduce g Ax to the diagonal form if we choose for f+ the periodic solution of the equation −f+ + 2(Ah + A+ f− )f+ + A+ = 0. This solution is f+ = A+ y+ y− which is also periodic. Finally, taking g3 = exp(hH), the gauge transformed matrix g Ax with g = g1 g2 g3 reads: y 1 A+ g H Ax = −h + Ah + A+ f− H = −h + + + y + 2 A+ 2 e−2P (λ)x/ ) reduces Choosing for h the periodic function h = 12 ln(A+ y+ 1 the coefficient of H to the constant P (λ)H.
Conserved quantities are obtained by looking at the trace of the monodromy matrix Tr (T (λ)). Once the connection has been diagonalized this trace is easy to compute. It follows from the previous proposition that the two eigenvalues of T (λ) are exp(±P (λ)), hence: dx v(x, λ) Tr T (λ) = 2 cosh P (λ), P (λ) = 0
The function P (λ) can serve, as well as Tr T (λ), as a generating function for the integrals of motion. To construct them we only have to solve the Ricatti equation (11.8) for v(x, λ).
386
11 The KdV hierarchy
Let us apply the above proposition to the KdV equation. In view of the expression of the KdV connection (11.2), we have Ah = 0, A− = λ + u and A+ = 1. Thus V = λ + u and the Ricatti equation reads: v + v2 = λ + u
(11.12)
The Schroedinger equation associated with the Ricatti eq. (11.12) coincides with eq. (11.5). The quantity P (λ) is the quasi-momentum of the Bloch eigenfunctions of the differential operator L = ∂ 2 − u with periodic potential u and eigenvalue λ. To obtain local conserved quantities we expand P (λ) around λ = ∞. When λ → ∞, the solution of the Ricatti equation admits the asymptotic expansion: v=
√
λ+
(−1)n vn √ , n ( λ) n≥0
2vn+1 = vn +
n
vp vn−p ,
v0 = 0, 2v1 = −u
p=0
This gives a recursion relation for computing the coefficients vn . Since its solution does not require any integration it leads to local integrals of motion. The first few coefficients are: v1 = − 12 u, v4 = 12 v3 + 18 uu ,
v2 = − 14 u , v5 = 12 v4 +
v3 = 18 (u2 − u ) 1 2 32 u
+
1 16 uu
−
1 3 16 u
The conserved quantities are given by the integral: √ 1 1 2 P (λ) = v dx = λ + √ u dx − √ u dx 2 λ (2 λ)3 0 0 0 1 2 3 (11.13) (u + 2u )dx − · · · + √ (2 λ)5 0 11.2 The KdV hierarchy We now particularize the formalism of pseudo-differential operators studied in Chapter 10 to the KdV situation. This amounts to studying the implications of the condition that L = Q2 is a second order differential operator: Q2 = Φ∂ 2 Φ−1 = L = ∂ 2 − u,
Φ=1+
∞
wi ∂ −i
1
A first consequence is that only odd times survive in the KdV hierarchy. Indeed, recalling the equations of motion of the KP hierarchy, ∂ti Φ = −(Φ∂ i Φ−1 )− Φ
387
11.2 The KdV hierarchy
we see that for i = 2j, (Φ∂ 2j Φ−1 ) = Lj is a differential operator, so that its projection ( )− vanishes. Recall that we have defined two formal Baker–Akhiezer functions, see eq. (10.11) in Chapter 10: ξ(t,z)
Ψ(t, z) = Φe
,
∗
∗ −1 −ξ(t,z)
Ψ (t, z) = (Φ )
e
,
ξ(t, z) =
∞
t2i−1 z 2i−1
i=1
where t1 = x. The function Ψ(t, z) is an eigenfunction of L with the eigenvalue λ = z2 and Ψ∗ (t, z) is its formal adjoint (see eq. (10.2) in Chapter 10). Since L is obviously formally self-adjoint, Ψ∗ (t, z) is also an eigenfunction of L with the same eigenvalue, (∂x2 − u)Ψ(t, z) = λΨ(t, z),
(∂x2 − u)Ψ∗ (t, z) = λΨ∗ (t, z)
(11.14)
The Wronskian of these two solutions is a constant that we now compute. Proposition. The Wronskian of the two Baker–Akhiezer functions Ψ and Ψ∗ is given by: W (Ψ, Ψ∗ ) ≡ Ψ (t, z)Ψ∗ (t, z) − Ψ∗ (t, z)Ψ(t, z) = 2z
(11.15)
where we have denoted Ψ ≡ ∂x Ψ. Proof. From the definition of the Baker–Akhiezer functions we see that the essential singularities cancel in W (Ψ, Ψ∗ ) and that W admits a power series expansion in z around ∞ of the form W (z) = 2z + α + β/z + · · ·. We prove that only the first term is present by showing that the residue of W (z)z i vanishes for i ≥ −1. dz i z W (z) 2iπ
dz ∂Φ∂ i ezx (Φ∗ )−1 e−zx − Φezx ∂(Φ∗ )−1 (−∂)i e−zx = 2iπ Note that the terms involving the times t3 , t5 , . . . in ξ(t, z) cancel because there are only derivatives with respect to x = t1 . Using eq. (10.14) in Chapter 10, we can rewrite this as an Adler residue in the pseudodifferential algebra:
i i dz i z W (z) = Res∂ ∂Φ∂ i Φ−1 + Φ∂ i Φ−1 ∂ = Res∂ ∂L 2 + L 2 ∂ 2iπ
388
11 The KdV hierarchy i
For i even, this vanishes because L 2 is a differential operator. For i odd we are going to show that: L 2 ∗ = −L 2 , i
i
i
i = odd
i
so that ∂L 2 +L 2 ∂ is formally self-adjoint and so cannot have a residue. It 1 1 1 is sufficient to show that L 2 ∗ = −L 2 . But L 2 = ∂ − 12 u∂ −1 + 14 u ∂ −2 + · · · 1 is the unique solution of the equation (L 2 )2 = L with leading term ∂. 1 Similarly, −L 2 is the unique solution of the same equation with leading 1 1 term −∂. Since (L 2 ∗ )2 = L∗ = L and L 2 ∗ = −∂ + · · · the result follows. We introduce the quantity S(t, λ) which will be useful in expanding the compact pseudo-differential expressions. It will also play an important role in the last two sections on Whitham theory. It is defined by ∗
S(t, λ) ≡ Ψ (t, z)Ψ(t, z) = 1 +
∞
λ−j S2j (t)
j=1
Note that the essential singularities cancel in S(t, λ) and that S(t, λ) is a function of λ = z 2 . This is because Ψ∗ (t, z) and Ψ(t, −z) are solutions of the same eq. (11.14), and have the same behaviour at z → ∞. So we have Ψ∗ (t, z) = c(z, t3 , . . .)Ψ(t, −z)
(11.16)
Inserting this into eq. (11.15) we see that c(z, t) is even in z, and so is Ψ∗ (t, z)Ψ(t, z). The aim of the following propositions is to show that the whole KdV hierarchy can be written in terms of S(t, λ). Proposition. The Baker–Akhiezer functions can be expressed as: Ψ(t, z) = S 1/2 (t, λ)eX(t,z) ,
Ψ∗ (t, z) = S 1/2 (t, λ)e−X(t,z)
(11.17)
where
z , X(t, z) = ξ(t, z) + O(1/z) S(t, λ) Proof. This parametrization obviously satisfies Ψ∗ (t, z)Ψ(t, z) = S(t, λ) and this defines X(t, z). Inserting it into eq. (11.15) yields the equation relating X(t, z) and S(t, λ). The asymptotic form of X(t, z) when z → ∞ follows by comparing the asymptotic expansions of Ψ(t, z). ∂x X(t, z) =
As a consequence, eq. (11.14) translates into an equation on S(t, λ). It is convenient to write it in the form of the Ricatti equation (11.12) with: v(t, z) =
∂x Ψ(t, z) z 1 = ∂x log S(t, λ) + Ψ(t, z) 2 S(t, λ)
(11.18)
11.2 The KdV hierarchy
389
There is a simple expression of the coefficients S2j as residues of fractional powers of L. Proposition. The coefficients S2j are the local densities of the conserved quantities of the KdV hierarchy, as computed in Chapter 10. 2j−1
S2j = Res∂ L 2 (11.19) As a consequence, the Hamiltonians of the KdV hierarchy are given by: 2 (11.20) dxS2j+2 (x) H2j−1 (L) = 2j + 1 Proof.
S2j
dz 2j−1 ∗ dz = Ψ (t, z)Ψ(t, z) = z ((Φ∗ )−1 e−zx ) · (Φ∂ 2j−1 ezx ) 2iπ 2iπ 2j−1
= Res∂ (Φ∂ 2j−1 Φ−1 ) = Res∂ L 2
where we have used eq. (10.14) in Chapter 10. Due to eq. (11.16), replacing Ψ(t, z) by Ψ∗ (t, z) in eq. (11.18) amounts to changing z → −z. In particular this shows that the coefficients v2n are derivatives with respect to x of local densities. One can compute the coefficients S2j by induction as follows. Since Ψ and Ψ∗ obey eq. (11.14), their product S(t, λ) = Ψ∗ (t, z)Ψ(t, z) obeys the third order equation: 1 1 ( ∂ 3 − u − u∂ − λ∂)S(t, λ) = 0 4 2 Expanding in z one gets ∂S2j+2 = ( 14 ∂ 3 − 12 u − u∂)S2j . This recursion relation can also be understood as the Lenard recursion relation, eq. (10.37) in Chapter 10. In fact, since k+2 k 2 2 2 Hk (L) = L 2 = dxRes∂ L 2 +1 = dxSk+3 k+2 k+2 k+2 (11.21) we get, using eq. (10.35) with n = 1, dHk = Sk+1 ∂ −1 . The Lenard relation becomes D1 (S2j+2 ∂ −1 ) = D2 (S2j ∂ −1 ) which is exactly the previous recursion relation. The first few values of S2j are: 1 S2 = − u, 2
1 3 S4 = − u + u2 , 8 8
S6 = −
5 3 5 2 5 1 u − u + (uu ) − u(iv) 16 32 16 32
390
11 The KdV hierarchy
One can recast the equations of motion of the KdV hierarchy as equations on S(t, λ). It is convenient to introduce a generating function for all the time derivatives: ∇(λ) = λ−j ∂2j−1 (11.22) j≥1
Then we have: Proposition. The equations of the KdV hierarchy are equivalent to: ∇(λ)S(t, λ ) =
S(t, λ) · ∂x S(t, λ ) − S(t, λ ) · ∂x S(t, λ) , λ − λ
Proof. In these equations, we have λ = a similar equation on Ψ(t, z): ∇(λ)Ψ(t, z ) =
z 2 , λ
=
2S(t, λ)∂x − ∂x S(t, λ) Ψ(t, z ), 2(λ − λ )
z 2 .
|λ| > |λ | (11.23)
We shall first prove
for |λ| > |λ | (11.24)
Recall that, from eq. (10.9) in Chapter 10, the equation of motion for Ψ is (11.25) ∂2j−1 Ψ(t, z) = (L(2j−1)/2 )+ Ψ(t, z) Now, there is a simple recursion relation: 2j+1
2j−1
1 L 2 = L 2 L + S2j ∂x − ∂x S2j 2 + +
(11.26)
Indeed, L being a differential operator, we have: 2j−1
2j−1
2j−1
2j+1
= L 2 L = L 2 L+ L 2 L L 2 +
+
+
−
+
To compute the plus part in the last term, we only have to keep the first two terms in the expansion of (L(2j−1)/2 )− because L is a second order differential operator. We have: 2j−1
1 −2 L 2 = S2j ∂ −1 − S2j ∂ + ··· 2 − where the first coefficient (which is the residue of the considered pseudodifferential operator) is determined by eq. (11.19), and the second coefficient is then fixed by the fact that the left-hand side is formally anti self-adjoint. The recursion relation eq. (11.26) follows immediately. It implies in turn: j−1 2j−1
1 j−i−1 L 2 = (S2i ∂ − S2i )L (11.27) 2 + i=0
391
11.2 The KdV hierarchy Using LΨ(t, z ) = λ Ψ(t, z ), we have:
∂2j−1 Ψ(t, z ) =
j−1 i=0
1 j−i−1 (S2i ∂ − S2i )λ Ψ(t, z ) 2
The computation of ∇(λ)Ψ(t, z ) is then straightforward, yielding eq. (11.24). Changing z → −z , we see that the Baker–Akhiezer function Ψ∗ (t, z ) also obeys eq. (11.24), and eq. (11.23) follows immediately for S = Ψ∗ Ψ. It is worth noticing that eq. (11.23) can be rewritten as a local conservation law: 1 S(λ) 1 = ∂x ∇(λ) S(λ ) λ − λ S(λ ) Using this formalism we now show that the conserved quantities given in eq. (11.13) are the same as the ones in eq. (11.21). This amounts to showing that v(t, z) and S(t, λ) differ by the derivative in x of a local function. Proposition. We have the relation:
dv(t, z) 1 d = S(t, λ) + ∂x 2z + ∇(λ) log S(t, λ) dz 2 dλ
(11.28)
where v = ∂x Ψ/Ψ, λ = z 2 . Proof. From eq. (11.23) we have:
S(λ) log S(t, λ ) − ∂ log S(t, λ) ∇(λ) log S(t, λ ) = ∂ x x λ − λ We substitute eq. (11.18) in this equation and take the limit λ → λ. One gets: S(t, λ) dv(t, z) 1 d + − 2z log S(t, λ) z dz z dλ We differentiate this expression with respect to x and substitute ∂x v = λ + u − v 2 to get the result. ∇(λ) log S(t, λ) = −
Remark 1. As for the KP hierarchy, there exists a tau-function τ (t) such that: Ψ(t, z) = eξ(t,z) with [z −1 ] = (. . . ,
τ (t − [z −1 ]) , τ (t)
z −2j+1 , . . .). 2j−1
S(t, λ) =
Ψ∗ (t, z) = e−ξ(t,z)
τ (t + [z −1 ]) τ (t)
It is related to the generating function S(t, λ) by:
τ (t + [z −1 ])τ (t − [z −1 ]) = 1 + ∂x ∇(λ) log τ (t) τ 2 (t)
(11.29)
392
11 The KdV hierarchy
This last formula follows by inserting the parametrization of Ψ, Ψ∗ in terms of taufunctions in eq. (11.28).
Remark 2. The Schroedinger equations, eq. (11.14), can also be translated into an equation on S(t, λ). Using the value of the Wronskian, it is straightforward to show that 2∂x2 log S(t, λ) + (∂x log S(t, λ))2 + 4λS −2 (t, λ) = 4(u + λ) (11.30)
Remark 3. We can give more information on the decomposition eq. (11.17). Using eq. (11.24), we have: S(t, λ) z ∇(λ)X(t, z ) = (11.31) λ − λ S(t, λ ) S(t, λ) − S(t, λ ) z z + , for |λ| > |λ | = λ − λ S(t, λ ) λ − λ Notice that we can expand X(t, z) as: ˜ z), X(t, z) = ξ(t, z) + X(t,
with
ξ(t, z) =
z 2j−1 t2j−1
(11.32)
j≥1
˜ z) regular at z = ∞. This decomposition follows from the fact that with X(t, z ∇(λ)ξ(z , t) = λ−λ for |λ| > |λ |.
11.3 Hamiltonian structures and Virasoro algebra Recall eq. (10.39) in Chapter 10 which defined two Poisson structures: {fX , fY }1 (L) = LXY − LY X {fX , fY }2 (L) = (LX)+ (LY )− − (XL)+ (Y L)−
1 − dx (∂ −1 [L, X]−1 ) [L, Y ]−1 2 Here L = ∂ 2 − u, X = X−1 ∂ −1 + X−2 ∂ −2 + · · · and fX (L) = LX , where denotes the Adler trace. To compute the Poisson brackets of u it is enough to take X = X(x)∂ −1 so that fX (L) = − dxu(x)X(x). The two Poisson brackets become: 5 8 dxuX, dxuY = dx(X Y − XY ) (11.33) 1
5
8
dxuX,
=−
dxuY 2
1 dxu(X Y − XY ) − 2
dxXY (11.34)
11.3 Hamiltonian structures and Virasoro algebra
393
Equivalently, we can write: {u(x), u(y)}1 = −(∂x − ∂y )δ(x − y)
(11.35)
1 1 {u(x), u(y)}2 = (u(x) + u(y))(∂x − ∂y )δ(x − y) − (∂x3 − ∂y3 )δ(x − y) 2 4 (11.36) A noticeable feature of these two Poisson brackets is that they are both local in x. Alternatively, we can expand the field u(x) in Fourier series: u(x) = k uk eikx (we chose = 2π). Taking X = e−inx and Y = e−imx , eqs. (11.33, 11.34) become respectively: i (11.37) {un , um }1 = − nδn+m,0 π
1 n3 {un , um }2 = − (n − m)un+m − δn+m,0 (11.38) 2iπ 2 The bracket { , }1 is called the Gardner–Faddeev–Zakharov bracket, while the bracket { , }2 is called the Magri–Virasoro bracket. It coincides with the Kostant–Kirillov bracket associated with the Virasoro algebra. The Lax operator L can also be written in factorized form L = (∂ + p)(∂ −p) so that u = p +p2 . This is the Miura transformation. The second Poisson bracket has a simple expression in this parametrization: 1 {p(x), p(y)}2 = (∂x − ∂y )δ(x − y) 4 or in Fourier modes p(x) = k pk eikx : {pn , pm }2 =
(11.39)
in δn+m,0 4π
It was shown in Chapter 10 that these Poisson brackets are compatible in the sense that their sum is again a Poisson bracket. Moreover, the Hamiltonians Hn of the KdV hierarchy are in involution with respect to both Poisson brackets. Let us give an example of the Hamiltonian equations of motion. Taking 1 1 2 H1 = dxu , H3 = − dx(u2 + 2u3 ) 4 16 one gets: 1 ∂t3 u = {H3 , u}1 = {H1 , u}2 = (−6uu + u ) 4 illustrating the fact that one finds the same equations of motion with the two Poisson brackets, but with different Hamiltonians.
394
11 The KdV hierarchy 11.4 Soliton solutions 1
Considering the KP hierarchy with Q = L 2 , we see that the equations i of motion ∂ti Φ = −(L 2 )− Φ imply that Φ is stationary with respect to the even times t2j . Conversely, any solution of the KP hierarchy which is stationary for any even time is such that (Q2j )− = 0, hence, in particular, Q2 = L is a differential operator. This solution is thus a solution of the KdV hierarchy. At the tau-function level, to obtain the decoupling of the even time variables it is sufficient to have:
τKP (teven , todd ) = e
n even cn tn
τKdV (todd )
(11.40)
This is because the action of a Hirota differential operator with respect to even time variables on such a KP tau-function vanishes: Dtm2k ec2k t2k τKdV (todd ) · ec2k t2k τKdV (todd ) = ∂ym ec2k (t2k +y) τKdV (todd )ec2k (t2k −y) τKdV (todd )|y=0 = 0 The Hirota equations of the KdV hierarchy are thus obtained from the Hirota equations of the KP hierarchy by simply erasing the even times. We get, for instance, the Hirota form of the KdV equation: (D14 − 4D1 D3 )τ · τ = 0
(11.41)
(compare with eq. (8.56) in Chapter 8). Setting u = −2
∂2 D12 τ · τ log τ = − τ2 ∂t21
(11.42)
we recover the KdV equation on u. Indeed, one has: D14 τ · τ , τ2 D1 D3 τ · τ −u˙ = ∂x , τ2
−u + 3u2 =
D14 τ · τ = 2τ (iv) τ − 8τ τ + 6(τ )2 D1 D3 τ · τ = 2(τ˙ τ − τ τ˙ )
Combining these expressions we see that the KdV equation is equivalent to the Hirota equation (11.41). Recall that the KP tau-functions are constructed by choosing an element g ∈ GL(∞) (see eq. (9.34) in Chapter 10): τKP (t) = 0|eH(t) g|0 with H(t) = n Hn tn and where Hn are bosonic oscillators, (not to be confused with the Hamiltonians). See Chapter 9. This tau-function
395
11.4 Soliton solutions
satisfies the bilinear identity (9.36), which reduces to the KdV Hirota bilinear identity whenever τKP (t) is of the form eq. (11.40). The main problem is to find the group elements g such that this property holds. Recall that the Lie algebra of GL(∞) consists of fermionic bilinears ∗ : (see eq. (9.13) in Chapter 9). If g of the form X = rs Mrs : βr β−s commutes with the H2k , one can push the term exp ( k t2k H2k ) in τKP to the right, where it hits the vacuum and disappears since H2k |0 = 0. In fact commutation up to a central element is sufficient since a central term would produce an exponential of a linear combination of the even times. Using eq. (9.26) in Chapter 9, we have: ∗ [Hn , βs∗ ] = −βs+n
[Hn , βr ] = βr+n ,
so that the H2k commute with X, up to central terms, if Mrs = Mr+2,s−2 . This means that the infinite band matrix Mrs is made of diagonals whose elements reproduces themselves with period 2. This characterizes the Lie ⊂ gl(∞), algebra sl(2) see Chapter 16. We have found: Proposition. The tau-function of the KdV hierarchy is given by: τKdV (t) = 0|eH(t) g|0 , with H(t) = t2k−1 H2k−1 and g ∈ SL(2) k>0
As an application, we can find the KdV soliton solutions. Recall that the KP soliton solutions are obtained for g = gi with gi = 1+ai β(pi )β ∗ (qi ), where β(p) and β ∗ (q) are the fermionic fields. Since we have: [Hn , β(z)β ∗ (w)] = (z n − wn ) : β(z)β ∗ (w) : +
z n − wn z−w
we see that gi commutes with H2k if pi = −qi . Hence we have: Proposition. The n-soliton tau-functions of the KdV hierarchy are: τn (t, g) = 0|e
H(t)
n
(1 + ai β(pi )β ∗ (−pi )) |0
(11.43)
i=1
Explicitly, they are equal to: τn (X|p) = 1 +
= 1+
n
p=1
I⊂{1,...,n} |I|=p
i
Xi +
i<j∈I
i<j
p i − pj 2 · Xi p i + pj i∈I
Xi Xj
pi − p j pi + p j
2 + ···
(11.44)
396
11 The KdV hierarchy
where ξ(p, t) = n compact form as:
odd p
nt
n,
τn (X|p) = det (1 + W ) ,
and Xi =
with
ai 2ξ(pi ,t) . 2pi e
Wij =
This can be written in
Xi
4pi pj Xj (pi + pj )
(11.45)
Proof. We have eH(t) β(pi )e−H(t) = eξ(t,pi ) β(pi ) and eH(t) β ∗ (−pi )e−H(t) = eξ(t,pi ) β ∗ (−pi ) so that pushing eH(t) to the right amounts to replacing ai → (2pi )Xi . The expression (11.44) is obtained by applying Wick’s theorem. 0| (1 + 2pi Xi β(pi )β ∗ (−pi ))|0 = 1 + 2pi Xi 0|β(pi )β ∗ (−pi )|0 i
+
i ∗
4pi pj Xi Xj 0|β(pi )β (−pi )β(pj )β ∗ (−pj )|0 + · · ·
i<j
Each of these vacuum expectation values is a determinant (see eq. (9.19) in Chapter 9). We get: 0| (1 + 2pi Xi β(pi )β ∗ (−pi ))|0 i
=1+
Xi +
i
Xi Xj det
i<j
1
+ ··· pi + pj
(11.46)
By the Cauchy formula, eq. (9.33), these determinants can be rewritten as in eq. (11.44). On the other hand, eq. (11.45) reduces to eq. (11.46) by virtue of the expansion formula, eq. (9.47) in Chapter 9. Remark 1. Using the bosonization formula of Chapter 9, one recognizes that the operator
V (p) = p β(p)β ∗ (−p)
coincides with the vertex operator defining the level one vertex representation of the algebra, (see Chapter 16). In the bosonic representation, the group elements affine sl(2) gi can be written: gi ≡ 1 + ai β(pi )β ∗ (−pi ) = 1 +
ai V (pi ) pi
Hence the soliton solutions of KdV are directly related to vacuum expectation values of vertex operators.
The one-soliton solution is obtained when τ = 1 + X, with X = 1 + 3 e(2p(x−x0 )+2p t) . One gets: u(x, t) = −
2p2 cosh2 (p(x − x0 ) + p3 t)
397
11.4 Soliton solutions
It corresponds to a bump (or rather a dip) of height $2p^2$ propagating with velocity $-p^2$. In sharp contrast with the case of linear partial differential equations, where wave packets spread out in time, here the bump preserves its shape for all times. Note that the centre of the bump is located at $X(x,t) = 1$.

Consider now the general $n$-soliton solution, eq. (11.44). We can analyse its shape asymptotically when $t \to \pm\infty$. Assume that $p_1 > p_2 > \cdots > p_n > 0$. We want to show that we have asymptotically $n$ solitons moving from right to left with velocities $-p_1^2, \ldots, -p_n^2$. Let us assume $t \to -\infty$ and consider what happens around $X_i = 1$, i.e. $x_i(t) = -p_i^2 t$. The values of the other $X_j$ when $x \sim x_i(t)$ are $X_j \sim (a_j/2p_j)\exp(2p_j(p_j^2 - p_i^2)t)$. Hence, for large negative time, if $p_j^2 < p_i^2$ then $X_j(x_i(t),t)$ is very large, while if $p_j^2 > p_i^2$ then $X_j(x_i(t),t)$ is very small. So we can split the indices $j$ into two subsets, relative to the index $i$. One subset, $I_+$, is such that $p_j^2 < p_i^2$ and corresponds to $X_j$ very large. The other one, $I_-$, is such that $p_j^2 > p_i^2$ and corresponds to $X_j$ very small. To evaluate the tau-function, eq. (11.44), when $x \sim x_i(t)$, one has to keep the terms containing the maximum number of $X_j$, $j \in I_+$. There are two such terms, yielding:
\[ \tau(x,t)\big|_{x\sim x_i(t)} \sim \prod_{j\in I_+} X_j \prod_{\substack{j<k\\ j,k\in I_+}} \frac{(p_j-p_k)^2}{(p_j+p_k)^2}\, \Bigg(1 + X_i \prod_{j\in I_+} \frac{(p_i-p_j)^2}{(p_i+p_j)^2}\Bigg) \]
where we have to keep the second term because $X_i \sim 1$. When we compute the KdV field using eq. (11.42), the prefactor disappears because it involves an exponential linear in $x$ and $t$. Hence the tau-function reduces to a one-soliton tau-function, but with parameter
\[ a_i \to a_i^{\rm in} = a_i \prod_{j\in I_+} \frac{(p_i-p_j)^2}{(p_i+p_j)^2} \]
So around $x = x_i(t)$ the KdV field $u(x,t)$ looks like a one-soliton field, and the $n$-soliton solution appears as $n$ widely separated solitons travelling to the left with velocities $-p_i^2$. At some finite times these solitons will collide and the above analysis becomes invalid. However, for $t \to +\infty$ the solitons separate again, and we get the same picture with the roles of $I_+$ and $I_-$ reversed, so that:
\[ a_i \to a_i^{\rm out} = a_i \prod_{j\in I_-} \frac{(p_i-p_j)^2}{(p_i+p_j)^2} \]
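The two-soliton case can be checked numerically. The sketch below is ours; it builds the tau function of eq. (11.44) for $n = 2$, with interaction coefficient $((p_1-p_2)/(p_1+p_2))^2$ (whose logarithm is the pairwise contribution to the phase shift), under the same assumed normalization $4u_t = u_{xxx} - 6uu_x$ as in the one-soliton check.

```python
# Numerical sketch (assumed normalization): two-soliton tau function
#   tau = 1 + X1 + X2 + c12 X1 X2,   c12 = ((p1-p2)/(p1+p2))^2
# with X_i = exp(2 p_i x + 2 p_i^3 t); u = -2 (log tau)_xx should solve
# 4 u_t = u_xxx - 6 u u_x.  We evaluate the residual at sample points.
import sympy as sp

x, t = sp.symbols('x t', real=True)
p1, p2 = sp.Rational(1), sp.Rational(3, 2)
X1 = sp.exp(2*p1*x + 2*p1**3*t)
X2 = sp.exp(2*p2*x + 2*p2**3*t)
c12 = ((p1 - p2)/(p1 + p2))**2
tau = 1 + X1 + X2 + c12*X1*X2
u = -2*sp.diff(sp.log(tau), x, 2)

residual = 4*sp.diff(u, t) - sp.diff(u, x, 3) + 6*u*sp.diff(u, x)
for xv, tv in [(0, 0), (sp.Rational(1, 3), sp.Rational(-1, 2)), (2, 1)]:
    val = residual.subs({x: xv, t: tv}).evalf(30)
    assert abs(val) < sp.Float('1e-20'), val
print("two-soliton residual vanishes at sample points")
```

Tracking the positions of the two dips for large $|t|$ in this same expression reproduces the asymptotic shifts $a_i^{\rm in} \to a_i^{\rm out}$ described above.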
11 The KdV hierarchy
It follows that the interaction introduces a phase shift:
\[ \delta_i = \log\frac{a_i^{\rm out}}{a_i^{\rm in}} = \sum_{j<i}\log\frac{(p_i-p_j)^2}{(p_i+p_j)^2} - \sum_{j>i}\log\frac{(p_i-p_j)^2}{(p_i+p_j)^2} \]
This formula shows that the delay is the sum of delays introduced by pairwise interactions. Moreover, each individual soliton keeps its form and its velocity after interactions.

Knowing the picture of the $n$-soliton solutions in the regime $t \to \pm\infty$ as a superposition of $n$ one-soliton solutions, we can easily compute the conserved quantities. Since the quantities are conserved, one can evaluate them at $t \to -\infty$. From eq. (11.29) we can write $S(x,t,\lambda) = 1 + \sum_i \partial_x\nabla(\lambda)\log\tau_i(x,t)$, where $\tau_i$ is the one-soliton tau-function with parameters $p_i$, $a_i^{\rm in}$, since for any given value of $x$ only one soliton contributes. We have $\tau_i = 1 + a_i^{\rm in}\exp(2\xi(t,p_i))$, so that
\[ \partial_{t_{2j-1}}\log\tau_i = 2p_i^{2j-1}\, \frac{a_i^{\rm in}\, e^{2\xi(t,p_i)}}{1 + a_i^{\rm in}\, e^{2\xi(t,p_i)}} \]
yielding
\[ H(\lambda) = -\int dx\, S(x,t,\lambda) = 2 + 2\sum_{j\geq 1}\lambda^{-j}\sum_i p_i^{2j-1} \]
So the complete set of conserved quantities is provided by:
\[ H_{2j-1} = \frac{4}{2j+1}\sum_{i=1}^{n} p_i^{2j+1} \tag{11.47} \]
Because the Hamiltonians $H_{2j-1}$ are conserved for any $j$, it follows that in the scattering process of solitons only permutations of the $p_i$ can occur. The scattering of solitons is completely described by this permutation and the time delays $\delta_i$.

11.5 Algebro-geometric solutions

We wish to apply the analytical methods of Chapter 5 to construct solutions of the KdV equation. As explained in that chapter, one way to get a Lax matrix compatible with the equations of the KdV hierarchy is to seek stationary solutions with respect to some higher time $t_j$. Then the zero-curvature condition, $\partial_i A_j - \partial_j A_i - [A_i, A_j] = 0$, becomes a Lax equation, since the stationarity condition with respect to the time $t_j$ means $\partial_j A_i = 0$ for all $i$. The Lax matrix is $A_j$ and its associated spectral curve
is independent of all times $t_i$. A very simple example of this situation occurs when $u$ is stationary with respect to $t_3 = t$. In that case the Lax matrix is $A_t$, given in eq. (11.2). The associated spectral curve is:
\[ \Gamma: \quad \det(A_t - \mu) = \mu^2 - \frac{1}{4}\lambda^3 + \frac{1}{4}(3u^2 - u'')\lambda + \frac{1}{16}(2uu'' - u'^2 - 4u^3) = 0 \]
The zero-curvature condition becomes the Lax equation $\partial_x A_t = [A_x, A_t]$ and reduces to the stationary KdV equation $6uu' - u''' = 0$. Integrating, one gets $3u^2 - u'' = C_1$ and $2u^3 - u'^2 = 2C_1 u + C_2$ for some constants $C_1$, $C_2$. So the spectral curve reads $\mu^2 = \lambda^3/4 - C_1\lambda/4 - C_2/16$, and is independent of $x$, as it should be. This is a genus 1 curve, so that $u$ is given by an elliptic integral.

More interesting solutions will be obtained by assuming $u$ to be stationary with respect to some higher time $t_{2j-1}$. Let us compute the matrices $A_{t_{2j-1}}$. We start from the equations of the hierarchy, eq. (11.25). Since
$L^{\frac{2j-1}{2}}$ is anti self-adjoint, for either one of the two solutions $\Psi$ and $\Psi^*$ of the KdV hierarchy we have:
\[ \partial_{t_{2j-1}}\begin{pmatrix} \Psi \\ \partial_x\Psi \end{pmatrix} = \begin{pmatrix} (L^{\frac{2j-1}{2}})_+\Psi \\ \partial_x (L^{\frac{2j-1}{2}})_+\Psi \end{pmatrix} = A_{t_{2j-1}}\begin{pmatrix} \Psi \\ \partial_x\Psi \end{pmatrix} \]
Using the identity (11.27) and $\partial_x^2\Psi = (\lambda - u)\Psi$ one gets:
\[ A_{t_{2j-1}} = \sum_{i=0}^{j-1} \lambda^{j-i-1} \begin{pmatrix} -\tfrac{1}{2}S_{2i}' & S_{2i} \\ (\lambda - u)S_{2i} - \tfrac{1}{2}S_{2i}'' & \tfrac{1}{2}S_{2i}' \end{pmatrix} \]
Notice that $A_{t_{2j-1}}$ depends only on $\lambda$, in agreement with the fact that $\Psi$ and $\Psi^*$ play the same role. In particular, for $j = 1$ one finds $A_x$, and for $j = 2$ one finds $A_t$. Writing $(L^{\frac{2j-1}{2}})_+ = (L^{\frac{2j-1}{2}}) - (L^{\frac{2j-1}{2}})_-$, we see that:
\[ A_{t_{2j-1}}\begin{pmatrix} \Psi \\ \partial_x\Psi \end{pmatrix} = \lambda^{\frac{2j-1}{2}}\begin{pmatrix} \Psi \\ \partial_x\Psi \end{pmatrix} + O(1) \]
Hence we have identified, asymptotically for $\lambda \to \infty$, the eigenvectors of $A_{t_{2j-1}}$ and the eigenvalues:
\[ \mu = \lambda^{\frac{2j-1}{2}} + O(1) \tag{11.48} \]
The matrix $A_{t_{2j-1}}$ being traceless and $2\times 2$, its associated spectral curve is a hyperelliptic curve of genus $(j-1)$, of the form $\mu^2 = R(\lambda)$ with $R(\lambda) = \prod_{k=1}^{2j-1}(\lambda - \lambda_k) = \det A_{t_{2j-1}}$. This curve is not a general hyperelliptic Riemann surface of genus $(j-1)$. In fact, since $\mu = \sqrt{R(\lambda)}$ is an eigenvalue of $A_{t_{2j-1}}(\lambda)$, it has to be of the specific form eq. (11.48), showing that $R(\lambda)$ has the very special form $R(\lambda) = \lambda^{2j-1} + C_1\lambda^{j-1} + C_2\lambda^{j-2} + \cdots$. To overcome this peculiarity, we notice that the stationarity condition can be generalized by imposing the condition:
\[ \sum_j c_j\, \partial_{t_{2j-1}} u = 0 \]
for some constant coefficients $c_j$. Then the corresponding Lax matrix becomes $\sum_j c_j A_{t_{2j-1}}$. For any time $t_{2i-1}$, the zero-curvature condition implies the Lax equation (because the $A_{t_{2j-1}}$ depend only on $u$):
\[ \partial_{t_{2i-1}}\sum_j c_j A_{t_{2j-1}} = \Big[A_{t_{2i-1}},\ \sum_j c_j A_{t_{2j-1}}\Big] \]
In the following we consider the hyperelliptic curve $\Gamma$ constructed from such a Lax matrix. It is of the generic form:
\[ \Gamma: \quad \mu^2 = R(\lambda) = \prod_{i=1}^{2g+1}(\lambda - \lambda_i) \tag{11.49} \]
The point at $\infty$ is a branch point, and a local parameter around that point is $z = \sqrt{\lambda}$. We want to construct a section $\Psi$ of the eigenvector bundle on $\Gamma$, obeying
\[ (\partial_{t_{2i-1}} - A_{t_{2i-1}})\begin{pmatrix} \Psi \\ \partial_x\Psi \end{pmatrix} = 0, \quad \forall i \]
The choice $\partial_x\Psi$ for the second component is dictated by the equation for $i = 1$, which then reduces to:
\[ (\partial_x^2 - u)\Psi = \lambda\Psi \]
So it is enough to consider $\Psi$. A consequence of eq. (11.48) is that $\Psi$ has the asymptotic behaviour at infinity $\Psi = e^{\xi(t,z)}(1 + O(z^{-1}))$, where $z = \sqrt{\lambda}$. We know (see Chapter 5) that $\Psi$ has $g + N - 1 = g + 1$ poles on $\Gamma$ ($N$ is the size of the Lax matrix). Here one of the poles is at $\infty$, because $\partial_x\Psi \sim z\Psi$ and $z = \sqrt{\lambda}$ is the local parameter at $\infty$. Hence we require that $\Psi$ has $g$ poles $(\gamma_1, \ldots, \gamma_g)$ at finite distance. Recall that the positions of these poles are independent of all the times $t_{2j-1}$. With these data we construct the Baker–Akhiezer function on $\Gamma$, which is the unique function with the following analytical properties:

• It has an essential singularity at the point $P$ at infinity:
\[ \Psi(t,z) = e^{\xi(t,z)}\Big(1 + \frac{\alpha(t)}{z} + O(1/z^2)\Big) \tag{11.50} \]
where $z = \sqrt{\lambda}$ and $\xi(t,z) = \sum_{i\geq 1} z^{2i-1} t_{2i-1}$.
• It has $g$ simple poles, independent of all times. The divisor of these poles is $D = (\gamma_1, \ldots, \gamma_g)$.

This Baker–Akhiezer function solves the KdV hierarchy equations, as the following two propositions show.

Proposition. There exists a function $u(x,t)$ such that
\[ (\partial_x^2 - u)\,\Psi = \lambda\Psi \tag{11.51} \]
Proof. Consider on $\Gamma$ the function $\partial_x^2\Psi - \lambda\Psi$. To define this object as a function on the curve, $\lambda$ is viewed as a meromorphic function on $\Gamma$. Note that $\lambda$ has only a double pole at $\infty$, and such a function exists only if $\Gamma$ is hyperelliptic. We see that $\partial_x^2\Psi - \lambda\Psi$ has the same analytical properties as $\Psi$ itself at finite distance on $\Gamma$. At infinity we have, by eq. (11.50):
\[ \partial_x^2\Psi - \lambda\Psi = e^{\xi(t,z)}\big(2\partial_x\alpha + O(1/z)\big), \qquad z = \sqrt{\lambda} \]
So it is a Baker–Akhiezer function, but with a normalization $2\partial_x\alpha$ instead of 1 at infinity. By the uniqueness theorem of such functions, we have:
\[ \partial_x^2\Psi - \lambda\Psi = u\Psi, \qquad u = 2\partial_x\alpha \tag{11.52} \]
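The asymptotic step in this proof can be checked order by order with a computer algebra system. The sketch below is ours; it truncates the expansion of $\Psi$ at $1/z^3$ and confirms that the $O(1)$ coefficient of $e^{-zx}(\partial_x^2\Psi - z^2\Psi)$ is $2\partial_x\alpha$.

```python
# Illustrative check: Psi = e^{zx}(1 + a(x)/z + b(x)/z^2 + c(x)/z^3),
# lambda = z^2, then d^2/dx^2 Psi - lambda Psi = e^{zx}(2 a'(x) + O(1/z)).
import sympy as sp

x, z = sp.symbols('x z')
a, b, c = (sp.Function(n) for n in ('a', 'b', 'c'))
Psi = sp.exp(z*x)*(1 + a(x)/z + b(x)/z**2 + c(x)/z**3)

expr = sp.diff(Psi, x, 2) - z**2*Psi
series = sp.expand(expr/sp.exp(z*x))     # strip the essential singularity
leading = series.coeff(z, 0)             # the O(1) term of the 1/z expansion
assert sp.simplify(leading - 2*sp.diff(a(x), x)) == 0
print("leading coefficient is 2*a'(x)")
```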
Having found the potential $u$, we construct the differential operator $L = \partial^2 - u$ and show that the Baker–Akhiezer function $\Psi$ obeys all the equations of the associated KdV hierarchy.

Proposition. The evolution of $\Psi$ is given by:
\[ \partial_{t_{2i-1}}\Psi = (L^{\frac{2i-1}{2}})_+\Psi \]
where $L = \partial^2 - u$ is the KdV operator constructed above.
Proof. Consider the function $\partial_{t_{2i-1}}\Psi - (L^{\frac{2i-1}{2}})_+\Psi$. It has the same analytical properties as $\Psi$ at finite distance on $\Gamma$. At infinity we have $\partial_{t_{2i-1}}\Psi = z^{2i-1}\Psi + e^{\xi}O(1/z)$ and $(L^{\frac{2i-1}{2}})_+\Psi = L^{\frac{2i-1}{2}}\Psi - (L^{\frac{2i-1}{2}})_-\Psi = z^{2i-1}\Psi + e^{\xi}O(1/z)$, where we have used $L\Psi = z^2\Psi$. Summarizing, we get:
\[ \partial_{t_{2i-1}}\Psi - (L^{\frac{2i-1}{2}})_+\Psi = e^{\xi(t,z)}O(z^{-1}), \qquad z \to \infty \]
By unicity, this Baker–Akhiezer function, which vanishes at $\infty$, vanishes identically.

Remark. Because the Schroedinger operator in eq. (11.51) is self-adjoint, the adjoint Baker–Akhiezer function $\Psi^*(P)$ is easily related to $\Psi(P)$. In fact one can choose $\Psi^*(P) = \Psi(\sigma(P))$, where $\sigma$ is the hyperelliptic involution on $\Gamma$. Note, however, that this choice does not correspond to the normalization selected by the definition of $\Psi^*(P)$ in terms of pseudodifferential operators.
The Baker–Akhiezer function $\Psi(t,z)$ can be written explicitly in terms of theta-functions, see Chapter 5. Let $\Omega^{(2j-1)}$ be the unique normalized second kind differential (all the $a$-periods vanish) with a pole of order $2j$ at infinity, such that:
\[ \Omega^{(2j-1)} = d\big(z^{2j-1} + \text{regular}\big), \qquad \text{for } z \to \infty \]
Let $U_k^{(2j-1)}$ be its $b$-periods:
\[ U_k^{(2j-1)} = \frac{1}{2i\pi}\oint_{b_k}\Omega^{(2j-1)} \]
In terms of these data we have:

Proposition. The Baker–Akhiezer function with the divisor of poles $D = (\gamma_1, \ldots, \gamma_g)$ can be expressed as:
\[ \Psi(t,P) = e^{\sum_j t_{2j-1}\int_\infty^P \Omega^{(2j-1)}}\; \frac{\theta\big(A(P) + \sum_j t_{2j-1}U^{(2j-1)} - \zeta\big)}{\theta\big(A(P) - \zeta\big)}\; \frac{\theta(\zeta)}{\theta\big(\sum_j t_{2j-1}U^{(2j-1)} - \zeta\big)} \tag{11.53} \]
where $A(P)$ is the Abel map on $\Gamma$ with base point $\infty$, and $\zeta = A(D) + K$ with $K$ the Riemann's constant vector. The KdV field, $u$, is given by the Its–Matveev formula:
\[ u(x,t) = -2\partial_x^2\log\theta\Big(\sum_j t_{2j-1}U^{(2j-1)} - \zeta\Big) + \text{const.} \tag{11.54} \]
Proof. In eq. (11.53) the integral $\int_\infty^P$ has to be understood in the following sense: for $z$ in a vicinity of $\infty$, one defines $\int_\infty^P \Omega^{(2j-1)}$ as the unique primitive of $\Omega^{(2j-1)}$ which behaves as $z^{2j-1} + O(1/z)$ (no constant term). Of course, when this is analytically continued on the Riemann surface, $b$-periods will appear. However, they will cancel out in eq. (11.53) due to the monodromy properties of theta-functions, leaving us with a well-defined normalized Baker–Akhiezer function.

The formula for the KdV field is found by using:
\[ \lambda + u = (\partial_x^2\log\Psi) + (\partial_x\log\Psi)^2 \]
Setting $\Omega^{(1)}(z) = d\big(z + \frac{\beta}{z} + O(z^{-2})\big)$, where $\beta$ does not depend on times, we have:
\[ \partial_x\log\Psi = z + \partial_x\log\theta\Big(A(P) + \sum_j t_{2j-1}U^{(2j-1)} - \zeta\Big) - \partial_x\log\theta\Big(\sum_j t_{2j-1}U^{(2j-1)} - \zeta\Big) + \frac{\beta}{z} + O(z^{-2}) \]
z −2j+1 (2j−1) t2j−1 U (2j−1) − ζ = θ (t2j−1 − −ζ )U θ A(P ) + 2j − 1 j
j
Keeping the 1/z terms, we obtain:
β 1 1 t2j−1 U (2j−1) − ζ + + O( 2 ) ∂x log Ψ = z − ∂x2 log θ z z z j
Differentiating once more with respect to x, we also get ∂x2 log Ψ = O(1/z). It follows that z 2 + u = z 2 − 2∂x2 log θ + 2β + O(1/z), proving the result. On a hyperelliptic curve Γ, one can easily express the Baker–Akhiezer function Ψ knowing the divisor of its zeroes. We concentrate first on the x dependence. The higher times t2j−1 will be considered next. Let D(x) be the divisor of the zeroes of the Baker–Akhiezer function Ψ. It is of degree g, and coincides with the divisor of the poles, D, for x = 0: D(x) ≡ {γ1 (x), . . . , γg (x)}
(11.55)
404
11 The KdV hierarchy
The points γi (x) have coordinates λγi (x) , µγi (x) = R(λγi (x) ) . In this formula, the expression R(λγi (x) ) refers to the determination of the square root corresponding to the sheet to which the point γi (x) belongs, while − R(λγi (x) ) corresponds to the other sheet. Let us define the polynomial in λ: g B(λ, x) = (λ − λγi (x) ) i=1
In terms of these data, we have: Proposition. The Baker–Akhiezer function is equal to: x R(λ) Ψ(x, λ) B(λ, x) = exp dx Ψ(x0 , λ) B(λ, x0 ) x0 B(λ, x)
(11.56)
The KdV potential u is expressed as: u=2
g i=1
λγi (x) −
2g+1
λj
(11.57)
i=1
Proof. We need to find the equations governing the x dependence of the divisor D(x). To this end we consider the function ∂x Ψ/Ψ. It is a meromorphic function on Γ, has poles at the points γi (x) and behaves like z + O(1/z) at infinity. Hence we can write R(λ) + Q(λ, x) ∂x Ψ = g (11.58) Ψ i=1 (λ − λγi (x) ) where Q(λ, x) is a polynomial of degree g − 1 in λ. We determine Q(λ, x) by requiring that ∂xΨΨ has a pole above λ = λγi (x) on the sheet µγi (x) and not on −µγi (x) . Thus we find the g conditions Q(λγi (x) , x) = µγi (x) which completely determine the polynomial: j=i (λ − λγj (x) ) µγi (x) Q(λ, x) = j=i (λγi (x) − λγj (x) ) i
On the other hand, in the vicinity of λγi (x) , we have: ∂x λγi (x) ∂x Ψ + O(1) =− Ψ λ − λγi (x)
(11.59)
11.5 Algebro-geometric solutions
405
˜ because in the vicinity of the zero λγi , we have Ψ(x, λ) ∼ (λ − λγi )Ψ. Comparing the residues of the poles at λ = λγi in eq. (11.58) and in eq. (11.59), we get the equations of motion for the divisor D(x): ∂x λγi (x) = −2
µγi (x) j=i (λγi (x) − λγj (x) )
(11.60)
One can now reconstruct the Baker–Akhiezer function itself. Indeed, inserting eq. (11.60) into eq. (11.58), we get: R(λ) 1 ∂x Ψ 1 ∂x λγi (x) = − Ψ B(λ, x) 2 λ − λγi (x) i
Integrating this formula from x0 to x gives eq. (11.56). To compute the potential u(x), we insert eq. (11.56) into (∂x2 − u)Ψ = λΨ. We get the polynomial identity 1 1 R = − BB + B 2 + (u + λ)B 2 2 4 where = ∂x . Comparing the terms in λ2g we obtain eq. (11.57). One can find the generalization of eq. (11.60) for any flow of the KdV hierarchy. Proposition. On the coordinates λγi of the divisor D(x) the equations of motion with respect to the time t2j−1 read: g −1 (1 − λγi λ ) j−1 Pj (λγi )µγi λ ∂t2j−1 λγi = −2 , Pj (λ) = i=1 (λ − λ ) 2g+1 γl −1 l=i γi i=1 (1 − λi λ ) + (11.61) where the ( )+ means taking the polynomial part in the expansion at λ = ∞. Proof. The only difference from the previous discussion is that, in the derivation of eq. (11.58), the behaviour at ∞ is replaced by: ∂t2j−1 Ψ/Ψ = z 2j−1 + O(1/z) Following the same reasoning as before, we can write: ∂t2j−1 Ψ Pj (λ) R(λ) + Qj (λ) = Ψ B(λ)
(11.62)
where $P_j(\lambda)$ is a polynomial in $\lambda$ of degree $(j-1)$ and $Q_j(\lambda)$ is a polynomial in $\lambda$ of degree $(g-1)$. The polynomial $P_j(\lambda) = \lambda^{j-1} + \cdots$ is uniquely determined by imposing the asymptotic eq. (11.62), which gives $(j-1)$ linear conditions. The solution is given by eq. (11.61). The polynomial $Q_j$ is determined as above, yielding $Q_j(\lambda_{\gamma_i}) = \mu_{\gamma_i}P_j(\lambda_{\gamma_i})$. One gets the equation of motion for the divisor $D(x)$:
\[ \partial_{t_{2j-1}}\lambda_{\gamma_i} = -2\, \frac{P_j(\lambda_{\gamma_i})\,\mu_{\gamma_i}}{\prod_{l\neq i}(\lambda_{\gamma_i} - \lambda_{\gamma_l})} \]
This can be used to give a direct proof of the linearization of the flows on the Jacobian of the hyperelliptic curve $\Gamma$. For this purpose, it is sufficient to consider the time evolutions of the Abel sums
\[ \partial_{t_{2j-1}}\sum_i \int^{\lambda_{\gamma_i}} \frac{\lambda^k\, d\lambda}{\sqrt{R(\lambda)}} = \sum_i \frac{\lambda_{\gamma_i}^k}{\sqrt{R(\lambda_{\gamma_i})}}\, \partial_{t_{2j-1}}\lambda_{\gamma_i} = -2\sum_i \frac{\lambda_{\gamma_i}^k\, P_j(\lambda_{\gamma_i})}{\prod_{l\neq i}(\lambda_{\gamma_i} - \lambda_{\gamma_l})} \]
The right-hand side is equal to the integral:
\[ \frac{-2}{2i\pi}\oint_\Upsilon \frac{\lambda^k\, P_j(\lambda)}{B(\lambda)}\, d\lambda, \qquad B(\lambda) = \prod_l (\lambda - \lambda_{\gamma_l}) \]
where $\Upsilon$ is a loop surrounding all the $\lambda_{\gamma_i}$. We deform the contour to a loop around $\infty$, so that:
\[ -2\sum_i \frac{\lambda_{\gamma_i}^k\, P_j(\lambda_{\gamma_i})}{\prod_{l\neq i}(\lambda_{\gamma_i} - \lambda_{\gamma_l})} = 2\,\mathrm{Res}_{\lambda=\infty}\, \frac{P_j(\lambda)}{B(\lambda)}\,\lambda^k\, d\lambda \]
To compute this residue at $\lambda = \infty$, we write:
\[ 2\,\mathrm{Res}_{\lambda=\infty}\, \frac{P_j(\lambda)}{B(\lambda)}\,\lambda^k\, d\lambda = \mathrm{Res}_\infty\, \frac{P_j(\lambda)\sqrt{R(\lambda)}}{B(\lambda)}\, \frac{\lambda^k\, d\lambda}{\sqrt{R(\lambda)}} \]
where the right-hand side is a residue computed on the curve $\Gamma$. The factor 2 appears because $\infty$ is a branch point of the covering $z \to \lambda = z^2$ around that point, so that the residue on $\Gamma$ has to be computed with $1/z$ as local parameter. The first factor behaves as $z^{2j-1} + O(1/z)$ by construction. So we have:
\[ \partial_{t_{2j-1}}\sum_i \int^{\gamma_i}\omega_k = \mathrm{Res}_{z'=0}\, \big(z^{2j-1} + O(z^{-2j+1})\big)\,\omega_k \]
where $z' = 1/z$ is the local parameter at $\infty$, and $\omega_k = \lambda^k d\lambda/\sqrt{R(\lambda)}$ is an unnormalized Abelian differential. This shows that the flow linearizes on the Jacobian, because the right-hand side is a constant. Since this equation is linear in $\omega_k$, it remains true for normalized Abelian differentials. One can then use Riemann's bilinear identities to evaluate the residue further. The time derivatives of the Abel map are given by:
\[ \frac{\partial}{\partial t_{2j-1}} A(D) = \mathrm{Res}_\infty\, \big(z^{2j-1} + O(z^{-2j+1})\big)\,\omega_k = -\frac{1}{2i\pi}\oint_{b_k}\Omega^{(2j-1)} = -U_k^{(2j-1)} \]
We recover exactly the expected slope of the linear flow on the Jacobian.

Remark. The equation of motion of the divisor $D$ can be recast in compact form by introducing the generating function of time derivatives, eq. (11.22). Remembering that for a function $f(\lambda) = \sum_{n=0}^{\infty} f_n\lambda^{-n}$ we have:
\[ \sum_{j=1}^{\infty} \lambda^{-j}\,\big(f(\lambda')\lambda'^{\,j-1}\big)_+ = \frac{f(\lambda)}{\lambda - \lambda'} \]
we get:
\[ \nabla(\lambda)\,\lambda_{\gamma_i} = -2\,\sqrt{\lambda}\; \frac{B(\lambda)}{(\lambda - \lambda_{\gamma_i})\sqrt{R(\lambda)}}\; \frac{\sqrt{R(\lambda_{\gamma_i})}}{\prod_{l\neq i}(\lambda_{\gamma_i} - \lambda_{\gamma_l})} \tag{11.63} \]
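The summation identity used in this remark is easy to test. The sketch below is ours; it takes a three-term $f$ with arbitrary rational coefficients and compares a truncated sum against $f(\lambda)/(\lambda - \lambda')$, the truncation error being of order $(\lambda'/\lambda)^J$.

```python
# Numeric check of  sum_{j>=1} lam^-j (f(l') l'^(j-1))_+  =  f(lam)/(lam - l')
import sympy as sp

lam_v, lp_v = sp.Rational(3), sp.Rational(1, 2)
coeffs = [sp.Rational(1), sp.Rational(-2, 3), sp.Rational(5, 7)]  # f0, f1, f2

def f(s):
    return sum(c*s**(-n) for n, c in enumerate(coeffs))

J = 60  # truncation order
lhs = sum(
    lam_v**(-j)*sum(c*lp_v**(j - 1 - n) for n, c in enumerate(coeffs) if j - 1 - n >= 0)
    for j in range(1, J + 1)
)
rhs = f(lam_v)/(lam_v - lp_v)
assert abs(sp.Float(lhs - rhs, 60)) < sp.Float('1e-40')
print("summation identity verified numerically")
```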
Finally, we show that eq. (11.57) can be generalized to a whole set of so-called trace identities.

Proposition. The generating function $S(t,\lambda)$ of the local quantities $S_{2n}$ has a simple expression in terms of the divisor $D(x)$:
\[ S(t,\lambda) = \sqrt{\lambda}\,\frac{B(\lambda,x)}{\sqrt{R(\lambda)}} = \frac{\prod_{i=1}^{g}(1 - \lambda_{\gamma_i(t)}\lambda^{-1})}{\sqrt{\prod_{j=1}^{2g+1}(1 - \lambda_j\lambda^{-1})}} \tag{11.64} \]

Proof. Recall that we have defined $S(t,\lambda) = \Psi\Psi^*$, where $\Psi$ and $\Psi^*$ are normalized such that their Wronskian is equal to $2z$. We can evaluate $S(t,\lambda)$ using eq. (11.56) for $\Psi$ and a similar equation for $\Psi^*$ with the sign of the exponential reversed, provided that we normalize these expressions to have the correct Wronskian. Using eq. (11.56), one gets $W(\Psi,\Psi^*) = 2\Psi(x_0,\lambda)\Psi^*(x_0,\lambda)\sqrt{R(\lambda)}/B(\lambda,x_0)$, and the expression of $S(t,\lambda)$ follows.
Taking the logarithm of eq. (11.64), we get:
\[ \sum_{n=1}^{\infty} \frac{1}{n}\lambda^{-n}\sum_i \lambda_{\gamma_i}^n = \frac{1}{2}\sum_{n=1}^{\infty} \frac{1}{n}\lambda^{-n}\sum_i \lambda_i^n - \log S(\lambda) \]
Identifying the powers of $\lambda^{-n}$, we find:
\[ \sum_i \lambda_{\gamma_i} = \frac{1}{2}\sum_i \lambda_i + \frac{u}{2}, \qquad \sum_i \lambda_{\gamma_i}^2 = \frac{1}{2}\sum_i \lambda_i^2 + \frac{1}{4}u^2 - \frac{1}{2}u'' \]
The first equation is eq. (11.57). We see that all the symmetric functions of the $\lambda_{\gamma_i}$ have a local expression in terms of the potential $u$.

11.6 Finite-zone solutions

The Its–Matveev formula, eq. (11.54), shows that, as a function of the real variable $x$, the potential $u(x,t)$ is almost periodic. Specifically, the argument of the theta-function is a straight line, $Y(x) = U^{(1)}x + Y_0$, which wraps densely around the Jacobian torus, and for sufficiently large $\ell$, $Y(x+\ell)$ returns arbitrarily close to $Y(x)$. This means that $Y(x+\ell) \simeq Y(x) + n + Bm$, where $n, m \in \mathbb{Z}^g$ and $B$ is the matrix of $b$-periods. The translation by $n + Bm$ does not affect the potential, because of the second order derivative in front of the logarithm. Hence $u(x+\ell) \simeq u(x)$. One can choose the moduli of the curve $\Gamma$ so that the potential is exactly periodic. This amounts to the condition:
\[ \ell\, U^{(1)} = n + Bm \tag{11.65} \]
which gives $2g$ real conditions on the $2g+1$ complex parameters $\lambda_i$, the branch points of the hyperelliptic curve $\Gamma$:
\[ \Gamma: \quad \mu^2 = R(\lambda) = \prod_{i=1}^{2g+1}(\lambda - \lambda_i) \]
If, however, the parameters $\lambda_i$ are real (as will be the case in the following), these conditions quantize all the moduli of $\Gamma$ (up to a translation $\lambda_i \to \lambda_i + \alpha$, which does not change the periods $U^{(1)}$). Under these conditions the potential $u(x)$ given by eq. (11.57) is periodic, and eq. (11.53) shows that the Baker–Akhiezer function $\Psi(x,\lambda)$ is a Bloch wave:
\[ \Psi(x+\ell,\lambda) = e^{\mathcal{P}(\lambda)}\,\Psi(x,\lambda) \tag{11.66} \]
Up to now, the potential $u(x)$ was complex. We determine below the conditions on the branch points $\lambda_i$ and the divisor $D(x)$ ensuring that $u(x)$ is real and periodic. To do this, we first derive general properties of Bloch waves for generic real periodic potentials. We will then particularize these properties to the algebro-geometric solutions. We will get in this way the very special finite-zone potentials.
Consider the Schroedinger equation $\Psi'' - u\Psi = \lambda\Psi$ with a real periodic potential $u(x)$ of period $\ell$. The space of solutions is spanned by the two solutions $y_1(x,\lambda)$ and $y_2(x,\lambda)$ with initial conditions:
\[ y_1(0,\lambda) = 1, \quad y_1'(0,\lambda) = 0, \qquad y_2(0,\lambda) = 0, \quad y_2'(0,\lambda) = 1 \]
e±P(λ) − y1 (, λ) y2 (x, λ) y2 (, λ)
(11.67)
where P(λ) is obtained by solving the equation: eP(λ) + e−P(λ) = y1 (, λ) + y2 (, λ) ≡ t(λ)
(11.68)
This shows that t(λ) = Tr T (λ) is an entire function of λ, hence eP(λ) has no pole or zero at finite distance. One can discuss the nature of the solutions of eq. (11.68) when λ is real, which also implies that t(λ) is real. When ∆(λ) ≡ t2 (λ) − 4 is negative, the quasi-momentum P(λ) is pure imaginary. In this case Ψ(x, λ) has an oscillatory behaviour in x at large scale corresponding to propagation of waves. This defines what is called the allowed zones in the spectrum. When ∆(λ) is positive, Ψ(x, λ) has an exponential behaviour at large scale, so waves cannot propagate. This defines the forbidden zones in the spectrum. Finally, when ∆(λ) = 0, we have t(λ) = ±2 and eP(λ) = ±1 (same sign). This means that Ψ(x + , λ) = ±Ψ(x, λ) so that Ψ(x, λ) is periodic or antiperiodic. This means that the periodic and antiperiodic levels are the boundaries of allowed and forbidden zones. When λ → ∞ one can as a first approximation neglect the potential √ ± λx (forbidden zone) and u. Then for λ → +∞ we have Ψ(x, λ) ∼ e
410
11 The KdV hierarchy
√ √ ±i −λx t(λ) ∼ 2 cosh λ. For λ → −∞ √ we have similarly Ψ(x, λ) ∼ e (allowed zone) and t(λ) ∼ 2 cos −λ. In fact, when u = 0 one can show that for generic potentials there are an infinite number of forbidden zones of exponentially decreasing sizes extending in the region λ → −∞. Finite-zone potentials are such that this phenomenon does not occur, i.e. there are a finite number of forbidden zones. The classical theory of Sturm–Liouville equations gives a rather detailed information on the poles of the Bloch waves and the boundaries of the zones. We recall the main facts. Consider a differential equation y −uy = λy with a real periodic potential u. We have:
f ∂2g = 0
g∂ 2 f + [f g − f g]0
0
so that we get a self-adjoint problem when the boundary conditions are such that the term [f g − f g]0 vanishes. In this case the spectrum is real. Proposition. The poles of the Bloch wave Ψ(x, λ) and the periodic and antiperiodic levels are all real. The periodic and antiperiodic levels form a sequence β1 > β2 ≥ β3 > β4 ≥ β5 > · · · such that β1 is a periodic level, β2 , β3 are antiperiodic levels, β4 , β5 are periodic, and so on. The allowed zones are the intervals [β2j , β2j−1 ]. The forbidden zones are the intervals [β1 , ∞] and the [β2j+1 , β2j ]. There is at least one-pole of Ψ(x, λ) in each forbidden zone, except [β1 , +∞]. Proof. It follows from eq. (11.67) that the poles in λ of Ψ± (x, λ) are located where y2 (, λ) = 0. For these values of λ, y2 (x, λ) is solution of the Sturm–Liouville problem with boundary conditions y(0, λ) = y(, λ) = 0. This problem is self-adjoint, so that these values of λ are real. Similarly, the periodic and antiperiodic levels correspond to the boundary conditions (valid in the case of a periodic potential) y(, λ) = ±y(0, λ) and y (, λ) = ±y (0, λ), which also lead to a self-adjoint problem. These levels are therefore also real. We have shown that all the roots of ∆(λ) are real. We now show that ∂λ t(λ) has a definite sign in the allowed zones. We have ∂λ t(λ) = ∂λ y1 (, λ)+∂λ y2 (, λ). Since v = ∂λ yj obeys the differential equation v − (u + λ)v = y with initial conditions v(0) = v (0) = 0, one has: x ∂λ yj (x, λ) = (y1 (x)y2 (ξ) − y1 (ξ)y2 (x))yj (ξ)dξ, j = 1, 2 0 x ∂λ yj (x, λ) = (y1 (x)y2 (ξ) − y1 (ξ)y2 (x))yj (ξ)dξ 0
11.6 Finite-zone solutions
411
so that:
(A(λ)y22 (ξ) + B(λ)y1 (ξ)y2 (ξ) + C(λ)y12 (ξ))dξ
∂λ t(λ) = 0
where A(λ) = y1 (, λ), B(λ) = y1 (, λ) − y2 (, λ), C(λ) = −y2 (, λ). The quadratic form appearing in the integrand has discriminant B 2 − 4AC = (y1 − y2 )2 + 4y2 y1 = t2 (λ) − 4 = ∆(λ) < 0 so it is of fixed sign in the whole domain of integration. It follows that ∂λ t(λ) = 0 in an allowed zone, and that the sign of ∂λ t(λ) is the same as the sign of C(λ) = −y2 (, λ). In particular y2 (, λ) cannot vanish in an allowed zone. We see that t(λ) either crosses the lines t = ±2 or is tangent to them, but cannot have an extremum in the region |t(λ)| < 2. From this, it follows that the periodic and antiperiodic levels are distributed as indicated in the proposition. Obviously, the sign of ∂λ t(λ) changes when one goes from one allowed zone to the next one, so that y2 (, λ) has at least one zero in each forbidden zone [β2j+1 , β2j ] with β2j+1 = β2j and j ≥ 1. We now return to the case where the periodic potential u(x) is produced by the algebro-geometric construction on the hyperelliptic curve
Fig. 11.1. The graph of the function t(λ) for real λ.
412
11 The KdV hierarchy
Γ(λ, µ). As we have seen, the two Bloch waves Ψ± (x, λ) are the two values of the Baker–Akhiezer function Ψ(x, (λ, µ)) at the two points (λ, ±µ) above λ. At a branch point, we have Ψ+ (x, λ) = Ψ− (x, λ) so that the quasi-momentum satisfies eP(λ) = ±1, i.e. the branch points are zone boundaries and therefore real. Since the region λ → ∞ is a forbidden zone and corresponds to R(λ) > 0, we see, following sign changes, that allowed zones correspond to R(λ) < 0 and forbidden zones to R(λ) > 0. In particular, the branch points form a sequence λ1 > λ2 > λ3 > · · · > λ2g+1 with λ1 = β1 and {λi } ⊂ {βi }. There may be degenerate forbidden zones with β2j+1 = β2j which do not correspond to branch points of the spectral curve. To compare eq. (11.56) with this discussion, we set x0 = 0, and we see that for x = 0, the divisor D(0) of zeroes of Ψ(x, λ) coincides with the divisor of its poles. By eq. (11.67) and the discussion which follows it, we see that the elements of D(0) are all real, and lie in forbidden zones. We know that the divisor D(0) is of degree g if Γ is of genus g, so we put one-pole of Ψ(x, λ) in each forbidden zone [λ2j+1 , λ2j ], for j = 1, . . . , g, and no pole in [λ1 , +∞] to get a periodic motion. The equation of motion, eq. (11.60), are thus regular and show that all points of D(x) stay real for all x, since R(λγi (x) ) > 0, and remain in forbidden zones. Conversely, if all λj and λγi (x) are real, eq. (11.57) shows that u is real. We have also to choose the curve so that the potential is periodic. This will be the case if the points of D(x) describe cyclic motions on g cycles belonging to the real slice of Γ (i.e. λ and µ real and so R(λ) > 0) and if this motion is periodic of period (so that u(x) has period ). As we already remarked, this requires that the moduli be quantized. 
It is worth mentioning that, once a genus g Riemann surface Γ is chosen with moduli quantized to produce a periodic finite-zone real potential, the KdV flows with respect to higher times preserve these conditions, and automatically produce a family of continuous deformations parameters. To express the periodicity conditions further, we view the curve Γ as a two-sheeted covering of the λ plane with cuts on the forbidden zones.
Fig. 11.2. The curve Γ(λ, µ) seen as a two sheeted cover of the λ plane with cuts along the forbidden zones.
We choose for the $a$-cycles loops around the compact cuts on the upper sheet. Then the $b$-cycles start from $\infty$ on the upper sheet, cross the $a$-cycles once, go to the lower sheet through the cut, and return to $\infty$. The regular Abelian differentials are of the form $\lambda^j d\lambda/s$ for $j = 0, \ldots, g-1$. Their $a$-periods are real and their $b$-periods are pure imaginary. Hence normalized regular Abelian differentials are real combinations of the above, so the matrix of $b$-periods is pure imaginary. Moreover, the second kind Abelian differential $\Omega^{(1)}$ is of the form $\lambda^g d\lambda/s$ plus a combination of first kind differentials with real coefficients, so that $U^{(1)}$ is real. More generally, all the $\Omega^{(i)}$ will have all their periods pure imaginary. They are thus the forms considered in eq. (10.19) in Chapter 10. The periodicity condition, eq. (11.65), can only be realized with $m = 0$ and becomes
\[ \ell\, U_j^{(1)} = \frac{\ell}{2i\pi}\oint_{b_j}\Omega^{(1)} = n_j \]
The Bloch momentum $\mathcal{P}$ is naturally defined only up to $2i\pi\mathbb{Z}$. This is consistent with the following:

Proposition. The Bloch quasi-momentum $\mathcal{P}(\lambda)$ is the primitive of the form $\Omega^{(1)}$:
\[ d\mathcal{P} = \Omega^{(1)} \tag{11.69} \]

Proof. Recall that $\mu = e^{\mathcal{P}(\lambda)}$ satisfies $\mu + \mu^{-1} = t(\lambda)$, where $t(\lambda)$ is an entire function of $\lambda$. Hence $\mu$ has no pole or zero at finite distance. Moreover, for periodic motions of the divisor $D(x)$, $\mu$ is a well-defined function on the curve $\Gamma$ (minus the point at $\infty$) since, by eq. (11.66), we can write it as the quotient of two Baker–Akhiezer functions:
\[ \mu(P) = \frac{\Psi(P, x+\ell)}{\Psi(P, x)} \]
We also deduce from this formula the behaviour at $\infty$: $\mu \sim e^{\ell\sqrt{\lambda}}$. Hence $d\mathcal{P} = d\mu/\mu$ is an Abelian differential on $\Gamma$, without singularities at finite distance, and having a double pole at $\infty$, i.e. $d\mu/\mu \sim dz$ where $z = \sqrt{\lambda}$ is the local parameter. For real $\lambda$, $\mu$ is real of fixed sign on a forbidden zone, so that the $a$-periods of $d\mathcal{P} = d\mu/\mu$ vanish. This characterizes $d\mathcal{P} = \Omega^{(1)}$. On the other hand, in an allowed zone $\mu$ is of modulus one, so the $b$-periods of $d\mu/\mu$ are pure imaginary, and of the form $2i\pi n_j$ since $\mu$ is well-defined on the curve, in agreement with the periodicity condition.

From eq. (11.56), we can get still another expression for the quasi-momentum. In the case of a periodic motion of the divisor $D(x)$, it is
given by:
\[ \mathcal{P}(\lambda) = \int_0^\ell \frac{\sqrt{R(\lambda)}}{B(\lambda,x)}\, dx \tag{11.70} \]
At a point $\beta_j$ which is a boundary of a non-degenerate forbidden zone, $\mathcal{P}(\lambda)$ changes from real to pure imaginary, which implies that $\sqrt{R(\lambda)}$ changes sign. Hence all such zone boundaries appear among the branch points of $\Gamma$. All other periodic or antiperiodic levels are therefore points where $t(\lambda)$ is tangent to $t = \pm 2$, and there is an infinite number of them.

These remarks also allow us to compare the spectral curve $\Gamma_T$ of the monodromy matrix $T(\lambda)$ corresponding to a finite-zone potential $u$ with the finite genus curve $\Gamma$ used to build the algebro-geometric solution. By definition, the "curve" $\Gamma_T$ is given by the equation $\mu^2 - t(\lambda)\mu + 1 = 0$. Setting $s = 2\mu - t(\lambda)$, we get the hyperelliptic type equation $s^2 = \Delta(\lambda)$, where $\Delta(\lambda) = t^2(\lambda) - 4$. The entire function $\Delta(\lambda)$ is not a polynomial, but admits an infinite product representation:
\[ \Delta(\lambda) = \prod_{i=1}^{2g+1}\Big(1 - \frac{\lambda}{\lambda_i}\Big)\, \prod_{j=1}^{\infty}\Big(1 - \frac{\lambda}{\beta_j}\Big)^2 \]
where the $\beta_j$ are the points where $t(\lambda)$ is tangent to the lines $\pm 2$, i.e. correspond to the degenerate forbidden zones, and the $\lambda_i$ are the points where $t(\lambda)$ crosses the lines $\pm 2$, i.e. correspond to the branch points of $\Gamma$. This infinite product is convergent because $t(\lambda) \sim 2\cos(\sqrt{-\lambda}\,\ell)$ when $\lambda$ is large, or else $\beta_j \sim -j^2\pi^2/\ell^2$ when $j$ is large. Note that the potential $u$ is determined by $\Gamma$, so the $\beta_j$ are really functions of the moduli $\lambda_i$ of $\Gamma$. The bianalytic transformation $s' = s\,\prod_{j=1}^{\infty}(1 - \frac{\lambda}{\beta_j})^{-1}$ transforms the equation of $\Gamma_T$ into the equation $s'^2 = \prod_{i=1}^{2g+1}(1 - \frac{\lambda}{\lambda_i})$, that is, the equation of $\Gamma$. All points $(s = 0, \lambda = \beta_j)$ are singular points of $\Gamma_T$, and $\Gamma$ is the desingularization of $\Gamma_T$. Conversely, $\Gamma_T$ is obtained from $\Gamma$ by identifying pairs of points $(\lambda = \beta_j, s' = \pm s'(\beta_j))$, which accumulate at $\infty$.
11.7 Action-angle variables

We want to express the restriction of the symplectic forms corresponding to the two Poisson brackets $\{\ \}_{1,2}$ on the finite-zone solutions of the KdV hierarchy.

We begin with the first Poisson bracket, eq. (11.35). This Poisson bracket is degenerate, and its kernel is $\int_0^\ell u(x)\,dx$. So we consider a symplectic leaf where this integral is kept constant. On such a leaf, the associated symplectic form reads:
\[ \omega_1 = \frac{1}{4}\int_0^\ell dx\, \delta u(x) \wedge \int_0^x dy\, \delta u(y) \]
To build finite-zone solutions one chooses a genus $g$ hyperelliptic Riemann surface and a divisor $D = (\gamma_1, \ldots, \gamma_g)$ of degree $g$ on it. These data determine the potential $u$. The variations $\delta u$ are expressed through the variations of the moduli of the curve, and the variations of the divisor, $D$, of the poles of the Baker–Akhiezer function. When all the times $t_{2j-1}$ are set to zero, the Baker–Akhiezer function is equal to one, and its zeroes coincide with its poles. In agreement with eq. (11.55) we denote $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$, where $\mu_{\gamma_i} = \sqrt{R(\lambda_{\gamma_i})}$.

Proposition. The restriction of the symplectic form $\omega_1$ to the finite-zone solution constructed from the curve
\[ \Gamma: \quad \mu^2 = R(\lambda) = \prod_{i=1}^{2g+1}(\lambda - \lambda_i) \]
is expressed in terms of the divisor of poles $\gamma_i = (\lambda_{\gamma_i}, \mu_{\gamma_i})$ of the Baker–Akhiezer function as:
\[ \omega_1 = \sum_{i=1}^{g} \delta\mathcal{P}(\gamma_i) \wedge \delta\lambda_{\gamma_i} \]
where P is the Bloch momentum such that dP = Ω(1) and P = z + O(1/z) at infinity. Proof. By analogy with the discussion in Chapter 5, we introduce a 1-form on the curve Γ with values in 2-forms on phase space: K = Ψ∗ δL ∧ δΨ Ω where Ω is the form defined in eq. (10.21) in Chapter 10. There are several important differences coming from the field theoretical context. First, the notation means: 1 f (x) = lim f (x)dx (11.71) →∞ 0 This is a Whitham average and is also an average on the Liouville torus. Second, variations δΨ have to be defined by keeping the primitive P the (1) Ω (behaving as k = z + O(z −1 ) at ∞) fixed instead of keeping λ k=
fixed. Otherwise, eq. (10.66) in Chapter 10 shows that a term linear in $x$ occurs in $\delta\Psi$, and the average is not defined. Comparing with eq. (10.79) in the same chapter, we can write:
$$K = \langle\Psi^*\,\delta L\wedge\delta\Psi\rangle\,\frac{dk}{\langle\Psi^*\Psi\rangle},\qquad \Omega = \frac{dk}{\langle\Psi^*\Psi\rangle}\tag{11.72}$$
Just as taking the variation $\delta$ by keeping $\lambda$ fixed introduces poles at the branch points, that is, where $d\lambda = 0$, taking the variation $\delta$ by keeping $k$ fixed introduces poles at the points where $dk = \Omega^{(1)} = 0$. We write as usual that the sum of the residues of $K$ on $\Gamma$ vanishes.

The residue of $K$ at $\infty$ reproduces the form $\omega_1$, up to a factor 2. Indeed, using $k$ as a local parameter, we have $\Psi = e^{kx}(1+\alpha/k+\cdots)$ and $\Psi^* = e^{-kx}(1+\cdots)$. Since we vary while keeping $k$ fixed, we have $\delta\Psi = e^{kx}(\delta\alpha/k + O(1/k^2))$. Remembering that $\delta L = -\delta u$, we get
$$\operatorname{Res}_\infty K = \langle\delta u\wedge\delta\alpha\rangle$$
There is an extra minus sign because the local parameter is really $1/k$. To relate $\alpha$ to the potential $u$, notice that by definition $\Omega^{(1)} = d(z+O(z^{-1}))$ at $\infty$, so that $\lambda = k^2+c+O(k^{-1})$ for some constant $c$. Reproducing the reasoning leading to eq. (11.52), we find $u = 2\partial_x\alpha - c$. When the curve $\Gamma$ is such that the potential $u$ is periodic, we have $\Psi(x+\ell) = e^{k\ell}\Psi(x)$, so that $\alpha$ is periodic. Hence $\langle u\rangle = -c$. Recalling that we keep $\langle u\rangle$ fixed, we have $\delta c = 0$, and so $\delta\alpha = \frac12\int_0^x\delta u(y)\,dy$, giving
$$\operatorname{Res}_\infty K = 2\,\omega_1$$
The form $K$ has poles at finite distance at the poles of $\Psi$ (the poles of $\Psi^*$ are cancelled by $\Omega$) and at the zeroes of $\Omega^{(1)}$, coming from $\delta\Psi$. At a pole $\gamma_i$ of $\Psi$ we have $\delta\Psi = \frac{\delta k_{\gamma_i}}{k-k_{\gamma_i}}(\Psi+O(1))$, since we keep $k$ constant in the variation. We get a contribution:
$$\operatorname{Res}_{\gamma_i}K = \operatorname{Res}_{\gamma_i}\,\frac{\langle\Psi^*\,\delta L\,\Psi\rangle}{\langle\Psi^*\Psi\rangle}\wedge\frac{\delta k_{\gamma_i}\,dk}{k-k_{\gamma_i}}$$
Varying $L\Psi = \lambda\Psi$, we have $\langle\Psi^*\delta L\Psi\rangle = -\langle\Psi^*(L-\lambda)\delta\Psi\rangle + \delta\lambda\,\langle\Psi^*\Psi\rangle$. Integrating by parts, and using $(L-\lambda)\Psi^* = 0$, we get:
$$\langle\Psi^*(L-\lambda)\delta\Psi\rangle = \frac1\ell\Bigl[\Psi^*\,\partial_x\delta\Psi - \partial_x\Psi^*\,\delta\Psi\Bigr]_0^\ell = \frac1\ell\,W(\delta\Psi,\Psi^*)\Big|_0^\ell$$
Taking $\ell$ large but close enough to an almost-period of $u$, we write:
$$\Psi(x+\ell) = e^{k\ell}\,\Psi(x),\qquad \Psi^*(x+\ell) = e^{-k\ell}\,\Psi^*(x)$$
so that this quantity vanishes. Finally, $\langle\Psi^*\delta L\Psi\rangle = \delta\lambda\,\langle\Psi^*\Psi\rangle$, and the contribution of the pole $\gamma_i$ is:
$$\operatorname{Res}_{\gamma_i}K = \delta\lambda_{\gamma_i}\wedge\delta k_{\gamma_i}$$
We now analyse what happens at the zeroes $s_i$ of $dk$. We have:
$$\delta\Psi = \frac{d\Psi}{d\lambda}\,\delta\lambda + \sum_j\frac{d\Psi}{dm_j}\,\delta m_j$$
where the $m_j$ are the moduli of $\Gamma$. The polar part at $s_i$ comes from the first term, and we can replace $\delta\Psi \to \frac{d\Psi}{d\lambda}\delta\lambda$. Moreover, $\delta L$ is the operator of multiplication by $-\delta u$, so
$$\Bigl\langle\Psi^*\,\delta L\,\frac{d\Psi}{d\lambda}\Bigr\rangle\wedge\delta\lambda\;\Omega = \Bigl\langle(\delta L-\delta\lambda)\,\Psi^*\,\frac{d\Psi}{d\lambda}\Bigr\rangle\wedge\delta\lambda\;\Omega$$
where we added the $\delta\lambda$ term, which cancels in the wedge product. Varying $(L-\lambda)\Psi^* = 0$, we can write
$$\operatorname{Res}_{s_i}K = -\operatorname{Res}_{s_i}\,\Bigl\langle(L-\lambda)\,\delta\Psi^*\,\frac{d\Psi}{d\lambda}\Bigr\rangle\wedge\delta\lambda\;\Omega$$
Integrating by parts gives
$$\operatorname{Res}_{s_i}K = -\operatorname{Res}_{s_i}\Bigl[\Bigl\langle\delta\Psi^*\,(L-\lambda)\frac{d\Psi}{d\lambda}\Bigr\rangle + \frac1\ell\,W\Bigl(\delta\Psi^*,\frac{d\Psi}{d\lambda}\Bigr)\Big|_0^\ell\Bigr]\wedge\delta\lambda\;\Omega$$
Differentiating $(L-\lambda)\Psi = 0$ with respect to $\lambda$, we have $(L-\lambda)\frac{d\Psi}{d\lambda} = \Psi$. Using that $\Psi$ and $\Psi^*$ are Bloch waves and the fact that $\delta k = 0$, we find
$$\frac1\ell\,W\Bigl(\delta\Psi^*,\frac{d\Psi}{d\lambda}\Bigr)\Big|_0^\ell = \frac{dk}{d\lambda}\,W(\delta\Psi^*,\Psi)$$
Since $dk$ vanishes at $s_i$, to get a residue from this term one has to take the polar contribution in $\delta\Psi^*$, which is proportional to $\delta\lambda$ and hence disappears in the wedge product. We finally get
$$\operatorname{Res}_{s_i}K = -\operatorname{Res}_{s_i}K_1,\qquad K_1 = \langle\delta\Psi^*\,\Psi\rangle\wedge\delta\lambda\;\Omega$$
To evaluate the sum of the residues of $K_1$ at the $s_i$, we write that the total sum of residues of $K_1$ vanishes. The poles of $K_1$ are at the $s_i$, at the poles $\gamma_i^*$ of $\Psi^*$, and at infinity. At infinity, $\Omega$ has a double pole but $\delta\lambda$ and $\langle\delta\Psi^*\Psi\rangle$ have simple zeroes, so there is no residue. At $\gamma_i^*$, we write $\delta\Psi^* = \frac{\delta k_{\gamma_i^*}}{k-k_{\gamma_i^*}}(\Psi^*+\cdots)$, and using eq. (11.72) for $\Omega$, we find
$$\sum_{s_i}\operatorname{Res}_{s_i}K_1 = -\sum_{\gamma_i^*}\delta k_{\gamma_i^*}\wedge\delta\lambda_{\gamma_i^*}$$
Recalling that $\gamma_i^* = \sigma(\gamma_i)$, where $\sigma$ is the hyperelliptic involution $(\lambda,k)\to(\lambda,-k)$, we get $\lambda_{\gamma_i^*} = \lambda_{\gamma_i}$ and $k_{\gamma_i^*} = -k_{\gamma_i}$. Hence the result
$$\sum_{s_i}\operatorname{Res}_{s_i}K = -\sum_{\gamma_i}\delta k_{\gamma_i}\wedge\delta\lambda_{\gamma_i}$$
Writing that the sum of all the residues of $K$ vanishes, $2\omega_1 + \sum_i\delta\lambda_{\gamma_i}\wedge\delta k_{\gamma_i} - \sum_i\delta k_{\gamma_i}\wedge\delta\lambda_{\gamma_i} = 0$, that is, $\omega_1 = \sum_i\delta k_{\gamma_i}\wedge\delta\lambda_{\gamma_i} = \sum_i\delta P(\gamma_i)\wedge\delta\lambda_{\gamma_i}$, from which the proposition follows.

We can analyse the second symplectic structure in a similar way. The symplectic form $\omega_2$ is expressed most simply in terms of the Miura variable $p$, such that $u = p' + p^2$. Since the second Poisson bracket is given by eq. (11.39), it has the kernel $\int_0^\ell p(x)\,dx$ in the variable $p$. So this quantity has to be fixed. The symplectic form in the variable $p$ reads:
$$\omega_2 = -\frac1\ell\int_0^\ell dx\,\delta p(x)\wedge\int_0^x dy\,\delta p(y)$$
With the same notation as in the previous proposition:

Proposition. The restriction of the symplectic form $\omega_2$ to the finite-zone solution is expressed as:
$$\omega_2 = 2\sum_{i=1}^{g}\delta P(\gamma_i)\wedge\frac{\delta\lambda_{\gamma_i}}{\lambda_{\gamma_i}}$$
Proof. Now we introduce the 1-form $K$ on the curve $\Gamma$, with values in 2-forms on phase space:
$$K = \langle\Psi^*\,\delta L\wedge\delta\Psi\rangle\,\frac{\Omega}{\lambda}$$
Again, the variations are done by keeping the function $k$ fixed. In contrast with the previous case, there is no pole at $\infty$, but a new pole appears at $\lambda = 0$. Let us compute the residue of $K$ at $\lambda = 0$. We will show that
$$\operatorname{Res}_{\lambda=0}K = \omega_2$$
The Miura variable $p$ is related to $\Psi$ by
$$p = \frac{\partial_x\Psi}{\Psi}\Big|_{\lambda=0}$$
and we can express $\Psi(x)\equiv\Psi(\lambda=0,x)$ in terms of $p$ as:
$$\Psi(x) = e^{\int_0^x p(y)\,dy}$$
When $u$ is periodic with period $\ell$, we can choose $p$ periodic, and $\Psi$ is then a Bloch wave with Bloch momentum given by $P(\lambda=0) = \frac1\ell\int_0^\ell p(y)\,dy$, hence in the kernel of $\omega_2$. The residue of $K$ at $\lambda = 0$ is computed from:
$$\langle\Psi^*\,\delta L\wedge\delta\Psi\rangle = -\Bigl\langle\Psi^*\Psi\,(\delta p'+2p\,\delta p)\wedge\int_0^x\delta p\Bigr\rangle$$
$$= -\Bigl\langle 2p\,\Psi^*\Psi\,\delta p\wedge\int_0^x\delta p\Bigr\rangle + \Bigl\langle(\Psi^*\Psi)'\,\delta p\wedge\int_0^x\delta p\Bigr\rangle - \frac1\ell\Bigl[\Psi^*\Psi\,\delta p\wedge\int_0^x\delta p\Bigr]_0^\ell$$
$$= -\frac{W(\Psi,\Psi^*)}{\ell}\int_0^\ell\delta p\wedge\int_0^x\delta p\; -\;\frac1\ell\,\Psi^*\Psi\,\delta p\wedge\delta\!\int_0^\ell p$$
where we have used $(\Psi^*\Psi)' = -W(\Psi,\Psi^*) + 2p\,\Psi^*\Psi$ and the fact that $\Psi^*\Psi$ and $p$ are periodic, so the last term is proportional to $\delta\int_0^\ell p$, hence vanishes. Now the residue is easily computed once we notice that
$$\Omega = \frac{d\lambda}{W(\Psi,\Psi^*)}$$
This is because the right-hand side has zeroes at the poles of $\Psi$ and $\Psi^*$. Moreover, the Wronskian vanishes at the branch points, thus cancelling the zeroes of $d\lambda$. Finally, at $\infty$ we can normalize $\Psi$ and $\Psi^*$ such that $W(\Psi,\Psi^*)\simeq 2z$, so that the form behaves as $dz$ ($\lambda = z^2$). This uniquely identifies $\Omega$. Dividing by the Wronskian then yields $\operatorname{Res}_{\lambda=0}K = -\frac1\ell\int_0^\ell\delta p\wedge\int_0^x\delta p = \omega_2$.

The analysis of the residues of $K$ at the poles of $\Psi$ and $\Psi^*$ is exactly the same as in the previous case, replacing $\Omega$ by $\Omega/\lambda$ everywhere, which accounts for the result $\delta P(\gamma_i)\wedge\delta\lambda_{\gamma_i}/\lambda_{\gamma_i}$. Finally, at the points $s_i$ the analysis is also similar, but the form $K_1$ involves $\Omega/\lambda$ and so has no pole at $\infty$ but a possible pole at $\lambda = 0$. In fact, there is no pole at $\lambda = 0$ because $\delta\lambda$ vanishes there: the function $k(\lambda)$ has an expansion of the form
$$k(\lambda) = \frac1\ell\int_0^\ell p(y)\,dy + a\lambda + \cdots$$
where the first term is the Bloch momentum at $\lambda = 0$. Variations are done keeping $k$ fixed and $\frac1\ell\int_0^\ell p(y)\,dy$ fixed. So $\delta\lambda = -(\delta\log a)\,\lambda + \cdots$ vanishes at $\lambda = 0$.

11.8 Analytical description of solitons

The soliton solutions can be viewed as a singular limit of the finite-zone solutions in which the branch points $\lambda_j$ coincide in pairs. On these degenerate Riemann surfaces everything becomes rational, and all calculations can be performed to the end.
Consider the curve $s^2 = \lambda\prod_{i=1}^n(\lambda-\lambda_i)^2$, where we set $\lambda_i = p_i^2$ with $0 < p_1 < p_2 < \cdots$. This singular curve is desingularized by setting $s = z\prod_i(\lambda-\lambda_i)$, which leads to $\lambda = z^2$. Hence the desingularized curve is the Riemann sphere, which we identify with the complex $z$-plane. The singular curve is obtained from the $z$-plane by identifying the points $z = \pm p_i$, which are both mapped to the point $(s=0,\ \lambda=\lambda_i)$.

A general Baker–Akhiezer function $\Psi(t,z)$ on the Riemann sphere is given by:
$$\Psi(t,z) = e^{\xi(z,t)}\prod_{i=1}^n\frac{z-z_i(t)}{z-z_i(0)}\,\Psi(0,z)\tag{11.73}$$
If such a function comes from a function defined on the singular curve, and this is the case if it is a singular limit of a finite-zone solution, it must take the same value at identified points, so we must have $\Psi(t,p_i) = \Psi(t,-p_i)$, or
$$\prod_{j=1}^n\frac{p_i+z_j(t)}{p_i-z_j(t)} = a_i\,e^{2\xi(p_i,t)},\qquad i=1,\dots,n\tag{11.74}$$
where the $a_i$ are time-independent constants. Equation (11.74) is a linear system for the symmetric functions of the $z_j(t)$. It determines the time evolution of the divisor of zeroes of $\Psi$.

On the other hand, the soliton solution is well known in terms of tau-functions, see eq. (11.44). From this we can construct the Baker–Akhiezer function through the Sato formula:
$$\frac{\Psi(t,z)}{\Psi(0,z)} = \frac{\tau(t-[z^{-1}])}{\tau(t)}\;\frac{\tau(t)\big|_{t=0}}{\tau(t-[z^{-1}])\big|_{t=0}}\;e^{\xi(z,t)}\tag{11.75}$$
Recall that $\tau(t)$ depends on the times only through the variables $X_i(t) = X_i(0)\,e^{2\xi(p_i,t)}$, so that:
$$X_i(t-[z^{-1}]) = \frac{z-p_i}{z+p_i}\,X_i(t)$$
Plugging this into eq. (11.44) for $\tau(t)$, we find a formula of the type:
$$\frac{\tau(t-[z^{-1}])}{\tau(t)} = \frac{\prod_i(z-z_i(t))}{\prod_i(z+p_i)}\tag{11.76}$$
This shows that the right-hand side of eq. (11.75) is exactly of the form of eq. (11.73). We shall now prove that the $z_i(t)$ defined by eq. (11.74) and
eq. (11.76) are the same. For this it is sufficient to show that the $z_i(t)$ obtained from eq. (11.76) satisfy eq. (11.74). We note the relation
$$\tau_n(X) = \tau_{n-1}(X) + X_i\,\tau_{n-1}\Bigl(\frac{(p_i-p_k)^2}{(p_i+p_k)^2}\,X_k\Bigr)$$
where $\tau_{n-1}(X)$ is given by the formula for $(n-1)$ solitons ($X_i$ removed), and in the coefficient of $X_i$ all arguments $X_k$ in $\tau_{n-1}(X)$ are replaced as indicated. This formula is obvious from eq. (11.44). With its help, we can compare the residues of the pole at $z = -p_i$ on both sides of eq. (11.76). We find:
$$-\frac{1}{z+p_i}\,\frac{\prod_j(p_i+z_j(t))}{\prod_{j\neq i}(p_i-p_j)} = -\frac{2p_i}{z+p_i}\,X_i\,\frac{\tau_{n-1}\bigl(\tfrac{p_i-p_j}{p_i+p_j}X_j\bigr)}{\tau_n(X)}$$
Similarly, comparing the two sides of eq. (11.76) at $z = p_i$, we find:
$$\frac{\prod_j(p_i-z_j(t))}{2p_i\,\prod_{j\neq i}(p_i+p_j)} = \frac{\tau_{n-1}\bigl(\tfrac{p_i-p_j}{p_i+p_j}X_j\bigr)}{\tau_n(X)}$$
Combining the two equations yields
$$\prod_j\frac{p_i+z_j(t)}{p_i-z_j(t)} = \prod_{j\neq i}\frac{p_i-p_j}{p_i+p_j}\,X_i(t)\tag{11.77}$$
This is exactly eq. (11.74). We have also found the value of the constants $a_i$ in that equation:
$$a_i = \prod_{j\neq i}\frac{p_i-p_j}{p_i+p_j}\,X_i(0)$$
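The key step above, $X_i(t-[z^{-1}]) = \frac{z-p_i}{z+p_i}X_i(t)$, rests on the series identity $2\sum_{j\ \mathrm{odd}}(p/z)^j/j = \log\frac{z+p}{z-p}$, since the time shift $t_k\to t_k - z^{-k}/k$ enters $2\xi(p_i,t)$ through exactly these terms. A quick numerical confirmation (variable names ours):

```python
import math

p, z = 0.7, 2.5   # need |p/z| < 1 for convergence
# 2 * sum over odd j of (p/z)^j / j  equals  log((z+p)/(z-p))
series = 2 * sum((p / z)**j / j for j in range(1, 400, 2))
assert abs(series - math.log((z + p) / (z - p))) < 1e-12

# The shift t_k -> t_k - z^{-k}/k therefore multiplies X = X(0) e^{2 xi(p,t)}
# by exp(-series) = (z - p)/(z + p), as used in eq. (11.76).
ratio = math.exp(-series)
assert abs(ratio - (z - p) / (z + p)) < 1e-12
```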
The equations of motion for the divisor $D(t) = \{z_i(t)\}$ are obtained by taking the limit of eq. (11.63), in which we have to replace $\lambda_{\gamma_i}(t) = z_i^2(t)$. They read:
$$\nabla(z^2)\,z_i = -\frac{\prod_j(z_i^2-p_j^2)}{\prod_j(z^2-p_j^2)}\;\frac{\prod_{j\neq i}(z^2-z_j^2)}{\prod_{j\neq i}(z_i^2-z_j^2)}$$
In particular, taking the coefficient of $z^{-2}$ we find:
$$\partial_x z_i = -\frac{\prod_j(z_i^2-p_j^2)}{\prod_{j\neq i}(z_i^2-z_j^2)}\tag{11.78}$$
The solution of these equations is of course given by solving the algebraic system eq. (11.74).
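In the one-soliton case ($n = 1$) everything is explicit: $\tau = 1 + X_1(t)$ and $u = 2\partial_x^2\log\tau$ is the familiar travelling soliton. The sketch below checks by finite differences that it solves KdV, assuming the standard normalization $u_t + 6uu_x + u_{xxx} = 0$ (which may differ from eq. (11.1) by rescalings of $u$ and the times); all names are ours:

```python
import math

p, h = 1.1, 1e-3

def u(x, t):
    """u = 2 d^2/dx^2 log(1 + exp(2 p x - 8 p^3 t)) = 2 p^2 sech^2(p x - 4 p^3 t)."""
    th = 2 * p * x - 8 * p**3 * t
    return 2 * p**2 / math.cosh(th / 2)**2

def d(f, x, t, dx=0, dt=0):
    """Central finite differences: first order in t or x, or third order in x."""
    if dt == 1:
        return (f(x, t + h) - f(x, t - h)) / (2 * h)
    if dx == 1:
        return (f(x + h, t) - f(x - h, t)) / (2 * h)
    if dx == 3:
        return (f(x + 2*h, t) - 2*f(x + h, t) + 2*f(x - h, t) - f(x - 2*h, t)) / (2 * h**3)

x0, t0 = 0.4, 0.2
# residual of u_t + 6 u u_x + u_xxx, zero up to discretization error
residual = d(u, x0, t0, dt=1) + 6 * u(x0, t0) * d(u, x0, t0, dx=1) + d(u, x0, t0, dx=3)
assert abs(residual) < 1e-2
```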
We can also compute the degenerate limit of the Baker–Akhiezer function starting from eq. (11.56). It suffices to notice that:
$$\frac{\sqrt{R(\lambda)}}{B(\lambda,x)}\bigg|_{\lambda=z^2} = z - \frac12\sum_i\Bigl(\frac{1}{z-z_i}+\frac{1}{z+z_i}\Bigr)\partial_x z_i\tag{11.79}$$
so that, performing the integral in eq. (11.56), we get eq. (11.73). One can use the rationality of the spectral curve in this degenerate case to express the tau-function as a rational function of the divisor $D(t)$.

Proposition. The tau-function admits a simple expression in terms of the divisor $D(t)$:
$$\tau(t) = (-1)^{\frac{n(n-1)}{2}}\,2^n\prod_{j=1}^n p_j\;\frac{\prod_{i<j}(p_i+p_j)\ \prod_{i<j}(z_i+z_j)}{\prod_{i,j}(p_i-z_j)}$$
Proof. We start from the logarithm of eq. (11.76) and apply the operator $\partial_z - \nabla(z^2)$, which vanishes on $\log\tau(t-[z^{-1}])$. We get:
$$\nabla(z^2)\log\tau(t) = \sum_i\frac{1}{z-z_i} - \sum_i\frac{1}{z+p_i} + \sum_i\frac{1}{z-z_i}\,\nabla(z^2)z_i$$
Since the tau-function depends on the times only through the $z_i(t)$, we have $\nabla(z^2)\log\tau(t) = \sum_i\partial_{z_i}\log\tau(t)\,\nabla(z^2)z_i$. Identifying the residues of the poles in $z$ at $z = \pm p_l$ and $z = z_l$, one finds the $n$ conditions:
$$\sum_i\frac{1}{p_l^2-z_i^2}\,\partial_{z_i}\log\tau(t)\,\partial_x z_i = \sum_i\frac{1}{p_l^2-z_i^2}\,\partial_x z_i,\qquad l=1,\dots,n$$
This is a linear system for the quantities $\partial_{z_i}\log\tau(t)\,\partial_xz_i$, the solution of which requires the inversion of the Cauchy matrix $M_{ij} = 1/(p_i^2-z_j^2)$. Recalling eq. (9.33) in Chapter 9, we find:
$$(M^{-1})_{ij} = (-1)^{n-1}\,\frac{\prod_k(p_k^2-z_i^2)\ \prod_{k\neq i}(p_j^2-z_k^2)}{\prod_{k\neq j}(p_j^2-p_k^2)\ \prod_{k\neq i}(z_i^2-z_k^2)} = \partial_x z_i\,\frac{\prod_{k\neq i}(p_j^2-z_k^2)}{\prod_{k\neq j}(p_j^2-p_k^2)}$$
This gives:
$$\partial_{z_i}\log\tau = (-1)^{n-1}\sum_{j,l}\frac{\prod_{k\neq i,l}(p_j^2-z_k^2)\ \prod_{k\neq j}(p_k^2-z_l^2)}{\prod_{k\neq j}(p_j^2-p_k^2)\ \prod_{k\neq l}(z_l^2-z_k^2)}$$
We consider this expression as a rational function of $z_i$. It has poles at $z_i = p_j$ for any $j$ and at $z_i = \pm z_j$ for $j\neq i$, and it goes to $0$ as $z_i\to\infty$.
One sees easily that the residue at $z_i = p_j$ is equal to $-1$, while the residue at $z_i = z_j$ vanishes. We finally get:
$$\partial_{z_i}\log\tau = \sum_j\frac{1}{p_j-z_i} + \sum_{l\neq i}\frac{r_l}{z_i+z_l}$$
with
$$r_l = \sum_j\frac{\prod_{k\neq j}(z_l^2-p_k^2)\ \prod_{k\neq i,l}(p_j^2-z_k^2)}{\prod_{k\neq j}(p_j^2-p_k^2)\ \prod_{k\neq l,i}(z_l^2-z_k^2)}$$
In fact we have $r_l = 1$. This is equivalent to the identity:
$$\sum_j\frac{\prod_{k\neq i,l}(p_j^2-z_k^2)}{\prod_{k\neq j}(p_j^2-p_k^2)}\;\frac{1}{z_l^2-p_j^2} = \frac{\prod_{k\neq i,l}(z_l^2-z_k^2)}{\prod_k(z_l^2-p_k^2)}$$
which is easily checked by comparing the residues in $z_l^2$ on both sides. This gives the final simple result:
$$\partial_{z_i}\log\tau = \sum_j\frac{1}{p_j-z_i} + \sum_{l\neq i}\frac{1}{z_i+z_l}$$
Integrating this formula, we find:
$$\tau = C\,\frac{\prod_{i<j}(z_i+z_j)}{\prod_{i,j}(p_i-z_j)}\tag{11.80}$$
where the constant $C$ does not depend on the $z_i$ and is hence independent of the times. To find it, note that when the times are sent to $-\infty$ we have $\tau = 1$. Equation (11.76) implies that the $z_i(-\infty)$ are equal to the $-p_i$, up to ordering. Inserting these values into eq. (11.80) gives
$$C = (-1)^{\frac{n(n-1)}{2}}\,2^n\prod_j p_j\ \prod_{i<j}(p_i+p_j)$$
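The identity behind $r_l = 1$ is a pure partial-fraction statement and can be checked numerically at random points; a sketch for $n = 4$ with excluded indices $i = 1$, $l = 2$ (0-based 0 and 1), variable names ours:

```python
import numpy as np

rng = np.random.default_rng(0)
p2 = rng.uniform(1.0, 2.0, 4)      # the p_k^2
z2 = rng.uniform(3.0, 4.0, 4)      # the z_k^2
i, l = 0, 1                        # the two excluded indices

# sum_j prod_{k != i,l}(p_j^2 - z_k^2) / [prod_{k != j}(p_j^2 - p_k^2) (z_l^2 - p_j^2)]
lhs = sum(
    np.prod([p2[j] - z2[k] for k in range(4) if k not in (i, l)])
    / np.prod([p2[j] - p2[k] for k in range(4) if k != j])
    / (z2[l] - p2[j])
    for j in range(4)
)
# prod_{k != i,l}(z_l^2 - z_k^2) / prod_k (z_l^2 - p_k^2)
rhs = (np.prod([z2[l] - z2[k] for k in range(4) if k not in (i, l)])
       / np.prod(z2[l] - p2))
assert np.isclose(lhs, rhs)
```

The identity is the residue expansion of the degree-$(n-2)$ polynomial $\prod_{k\neq i,l}(w-z_k^2)$ at the interpolation nodes $w = p_j^2$, evaluated at $w = z_l^2$.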
To end this section, we would like to discuss the Poisson structures of the KdV equation in these variables. As we know, we have a whole hierarchy of such structures. Let $\omega_k$, $k = 1,2$, be the restriction of the $k$-th symplectic form to the manifold of finite-zone solutions. We have seen that:
$$\omega_k = c_k\sum_{i=1}^n\delta P(\lambda_{\gamma_i})\wedge\frac{\delta\lambda_{\gamma_i}}{\lambda_{\gamma_i}^{k-1}}\tag{11.81}$$
where $c_k$ is a normalization constant and the Bloch momentum $P(\lambda)$ is defined as:
$$P(\lambda) = \frac12\log\frac{\Psi(\lambda,x=\ell)}{\Psi(\lambda,x=-\ell)}$$
To compute this Bloch momentum in the soliton limit, we use eq. (11.73) and send $\ell\to\infty$. Equation (11.74) shows that, up to ordering, we have $z_i(\ell)\to p_i$ and $z_i(-\ell)\to -p_i$. Using eq. (11.70) and eq. (11.79), which is symmetric in the $z_i$, we get:
$$P(z) = z\ell - \frac12\sum_j\log\frac{z+p_j}{z-p_j}\qquad(\mathrm{mod}\ i\pi)$$
Hence, recalling that $\lambda_{\gamma_i} = z_i^2$, we find:
$$\omega_k = c_k\sum_{i=1}^n\sum_j\delta\log\frac{z_i-p_j}{z_i+p_j}\wedge\frac{\delta z_i}{z_i^{2k-3}} = 2c_k\sum_{ij}\frac{\delta z_i\wedge\delta p_j}{(z_i^2-p_j^2)\,z_i^{2k-4}}$$
The form $\omega_2$, normalized with $c_2 = 2$, corresponds to the Magri–Virasoro bracket:
$$\omega_2 = 4\sum_{ij}\frac{\delta z_i\wedge\delta p_j}{z_i^2-p_j^2}\tag{11.82}$$
In the coordinates $(X_i(0),p_i)$, it reads:
$$\omega_2 = 2\sum_i\frac{\delta p_i}{p_i}\wedge\frac{\delta X_i}{X_i} - 8\sum_{i<j}\frac{\delta p_i\wedge\delta p_j}{p_i^2-p_j^2}\tag{11.83}$$
To see this, we differentiate eq. (11.77), getting
$$\frac{\delta X_i}{X_i} = -2p_i\sum_j\frac{\delta z_j}{z_j^2-p_i^2} + 2p_i\sum_{j\neq i}\frac{\delta p_j}{p_i^2-p_j^2} + \Lambda_i\,\delta p_i$$
where $\Lambda_i$ is a coefficient whose value does not matter. Inserting this into eq. (11.82), we arrive at eq. (11.83). We can write the Poisson brackets:
$$\{p_i,p_j\}_2 = 0,\qquad \{p_i,X_j\}_2 = \frac12\,p_iX_j\,\delta_{ij},\qquad \{X_i,X_j\}_2 = 2\,\frac{p_ip_j}{p_i^2-p_j^2}\,X_iX_j$$
As a simple consistency check, we use the Hamiltonians, eq. (11.47), and compute
$$\partial_{t_{2j+1}}X_i = \{H_{2j-1},X_i\}_2 = 2\,p_i^{2j+1}\,X_i$$
in agreement with the fact that $H_{2j-1}$ generates the flow $t_{2j+1}$ with $\{\ ,\ \}_2$.
11.9 Local fields In this and the next section, we study the Whitham average of local fields in the KdV hierarchy. By definition, the space of local fields is the vector space generated by monomials of the form O(u, u , u , . . .) where the prime denotes ∂x . In particular, we wish to know when the Whitham average of a local field vanishes. One such a case is when the local field we consider is a derivative with respect of any time of the hierarchy of another local field. This is a motivation for presenting local fields in the form: ∂ ν EO,ν (S2 , S4 , . . .) (11.84) O(u, u , u , . . .) = |ν|≥0
where ν = (i1 , i3 , . . .) is a multi-index, ∂ ν = ∂ti11 ∂ti33 · · · , and |ν| = i1 + 3i3 + · · ·. This is not all the story, however, because some linear combinations in the right-hand side of eq. (11.84) identically vanish by the equations of motion of the KdV hierarchy. We define a null vector as such a combination which vanishes when we take the equations of motion into account. Our first task is to describe the null vectors. We will analyse the Whitham average in the next section. The possibility of going from one presentation to the other in eq. (11.84) relies on the possibility of replacing the odd derivatives ∂x2j−1 u by the higher times derivatives ∂t2j−1 , according to the equations of motion ∂t2j−1 u = [(L
2j−1 2
)+ , L] =
1 u(2j−1) + · · · 22j−1
Similarly, the even derivatives ∂x2j u can be replaced by the densities of the integrals of motion S2j : S2j = Res∂ L
2j−1 2
=−
1 22j−1
u(2j−2) + · · ·
Let us compare the dimensions of the space L and the space E generated by elements of the form ∂ ν E(S2 , S4 , . . .), where E(S2 , S4 , . . .) is a monomial of S2 , S4 , . . .. Since both spaces are infinite, we introduce a grading by attributing to ∂x weight 1 and to u weight 2. As a consequence, ∂t2j−1 has weight 2j − 1 and S2j has weight 2j. This makes the two spaces L and E graded vector spaces: ∞ ∞ L= Ln , E = En n=0
n=0
426
11 The KdV hierarchy
At each grade n, the vector spaces Ln and En are finite-dimensional. We define the characters: χ(L) =
∞
q dim Ln , n
χ(E) =
n=0
∞
q n dim En
n=0
The space Ln is made of monomials in u, u , u , . . . of weight n. It is quite clear that 1 1 = (1 − q) = 1 + q 2 + q 3 + 2q 4 + 2q 5 + · · · χ(L) = 1 − qj 1 − qj j≥2
j≥1
because the number of local fields of weight k is the number of ways to
2n 3n 1 2 · · ·. write 2n1 + 3n2 + · · · = k, so that χ(L) = n1 q n2 q Similarly, the character of the vector space E can be easily computed and is found to be: 1 1 1 = χ(E) = 2j−1 2j 1−q 1−q 1 − qj j≥1
j≥1 3
j≥1 5
= 1 + q + 2q + 3q + 5q + 7q + · · · 2
4
The first infinite product counts the factors ∂ ν and the second product counts the factors EO,ν in eq. (11.84). Note that we have χ(E) > χ(L), meaning that for each n dim En ≥ dim Ln . However, the equations of motion of the KdV hierarchy imply that many expressions of the form of those in the left-hand side of eq. (11.84) vanish. So the equality in eq. (11.84) is meant modulo the equations of motion of the KdV hierarchy. Let us give some examples of null vectors: level 1 : ∂t1 · 1 = 0, level 2 : ∂t21 · 1 = 0, level 3 : ∂t31 · 1 = 0,
∂t3 · 1 = 0
level 4 : ∂t41 · 1 = 0,
∂t1 ∂t3 · 1 = 0,
(∂t21 S2 − 4S4 + 6S22 ) = 0,
level 5 : ∂t51 · 1 = 0,
∂t21 ∂t3 · 1 = 0,
∂t5 · 1 = 0,
∂t1 ((∂t21 S2
− 4S4 +
6S22 )
= 0,
(∂t3 S2 − ∂t1 S4 ) = 0
We have written all the null vectors explicitly to show that their numbers exactly match the character formula. The non-trivial null vector at level 4 expresses S4 in terms of u: 8S4 = −u + 3u2 . With this identification, the non-trivial null vector at level 5, ∂t3 S2 − ∂t1 S4 = 0, gives the KdV equation eq. (11.1).
427
11.9 Local fields
The goal of our study is to find all the null vectors. They will have a simple description, given in eqs. (11.90, 11.97) below, at the price of introducing a fermionic language. A generating function for the monomials spanning the vector space E is: E(u, v) ≡ e j u2j−1 ∂t2j−1 e n v2n J2n (11.85) where the coefficients J2n are defined by log S(λ) ≡ −
1 J2n λ−n n
(11.86)
n>0
One can express any monomial in the S2n in terms of the J2n and vice versa. To describe the null vectors, we introduce fermionic and bosonic fields as in Chapter 9: βr λ−r−1/2 , β ∗ (λ) = βr∗ λ−r−1/2 , (11.87) β(λ) = r∈Z+ 12
r∈Z+ 12
H(λ) =
Hn λ−n−1 =: β(λ)β ∗ (λ) :
n∈Z
The vacuum |0 is characterized by βr |0 = 0, βr∗ |0 = 0, for r > 0. By the Campbell–Hausdorff formula we have:
e
n
v2n J2n
= 0|e−
n>0
v2n Hn −
e
1 n>0 n J2n H−n
|0
Hence the monomials in J2n are in one to one correspondence with the components of the vector Z|0 in Fock space, where: Z ≡ e−
1 n>0 n J2n H−n
=e
dλ 2iπ
log S(λ)H(λ)
(11.88)
The full set of elements of the space E is obtained by acting on this vector with the time derivatives ∂t2j−1 . Null vectors admit a particularly simple description in this setting. −j Proposition. Let ∇(λ) = j≥1 λ ∂t2j−1 be the operator defined in eq. (11.22), and Q be the fermionic operator: dλ Q= β(λ)∇(λ) (11.89) 2iπ Then the equations of motion of the KdV hierarchy imply: Q Z|0 = 0
(11.90)
428
11 The KdV hierarchy
Proof. We need the two formulae dλ −1 ∇(λ)Z = Z S (λ )∇(λ)S(λ )H(λ ) 2iπ
(11.91)
and β(λ)Z = S −1 (λ) Z β(λ)
(11.92)
These formulae are straightforward consequences of the fact that all Hn , n > 0, mutually commute, and eq. (9.43) in Chapter 9 applies. So we can write dλ dλ −1 Q Z|0 = Z S (λ)S −1 (λ )∇(λ)S(λ )β(λ)H(λ )|0 2iπ 2iπ dλ dλ ∂x log S(λ ) − ∂x log S(λ) β(λ)H(λ )|0 (11.93) =Z 2iπ 2iπ λ − λ In the last step we used eq. (11.23) valid for |λ| > |λ |. We examine separately the ∂x log S(λ ) and the ∂x log S(λ) terms. The ∂x log S(λ ) term reads: dλ dλ 1 ∂ log S(λ )β(λ)H(λ )|0 x 2iπ 2iπ λ − λ |λ|>|λ | One can do the integral over λ. Poles can occur at λ = 0 and λ = λ . At λ = 0, the integrand is actually regular. In fact, potentially dangerous terms come from β(λ)H(λ )|0 . But this is regular at λ = 0 because we have β(λ)H(λ ) =: β(λ)β(λ )β ∗ (λ ) : −
1 β(λ ), λ − λ
|λ| > |λ |
and by definition of the vacuum and normal ordered product, the term : β(λ)H(λ ) : |0 is regular at λ = 0. The same formula is used to analyse the poles at λ = λ . One has two terms. The first one is dλ 1 : β(λ)β(λ )β ∗ (λ ) : |0 2iπ λ − λ which is zero because at λ = λ we get the product of two fermionic fields at the same point inside the normal product, and this vanishes. The second term is equal to 1 dλ β(λ )|0 − 2iπ (λ − λ )2
429
11.9 Local fields and this obviously vanishes. Next, we examine the ∂x log S(λ) term: dλ dλ 1 ∂ log S(λ)β(λ)H(λ )|0 x 2iπ 2iπ λ − λ |λ|>|λ |
This time we can do the λ integral. But, by the properties of the vacuum vector, the integrand is regular at λ = 0 because H(λ )|0 = O(1) and the integral vanishes. We have shown that Q Z |0 = 0. Equation (11.90) is not sufficient to characterize null vectors. We exhibit now a second equation which, together with the first one, will provide a complete set of constraints. Proposition. Let C be the fermionic operator: C = C0 + C1
where
dλ d β(λ) λ β(λ) 2iπ dλ dλ1 dλ2 λ1 β(λ1 )β(λ2 )∇(λ1 )∇(λ2 ) log 1 − 2iπ 2iπ λ2 C0 =
C1 =
(11.94)
|λ1 |<|λ2 ]
(11.95) (11.96)
Then the equations of motion of the KdV hierarchy imply: C Z|0 = 0
(11.97)
Proof. We first calculate C0 Z|0 . Since β(λ)2 = 0, we immediately get: d dλ −2 C0 Z|0 = Z S (λ)β(λ)λ β(λ)|0 2iπ dλ Next we calculate C1 Z|0 . We have dλ1 dλ2 λ1 C1 Z|0 = − β(λ2 )∇(λ2 ) log 1 − β(λ1 )∇(λ1 )Z|0 2iπ λ2 Υ 2iπ where Υ is the contour |λ1 | < |λ2 ]. We use eqs. (11.91, 11.92) to write dλ −1 β(λ1 )∇(λ1 )Z = Z S (λ)S −1 (λ1 )∇(λ1 )S(λ)β(λ1 )H(λ) 2iπ S(λ) dλ 1 β(λ1 )H(λ) =Z ∂x log S(λ1 ) |λ1 |>|λ] 2iπ λ1 − λ So, one has to evaluate the integral dλ1 dλ λ1 ∂x log S(λ) − ∂x log S(λ1 ) log 1 − β(λ1 )H(λ)|0 λ2 λ1 − λ |λ1 |>|λ] 2iπ 2iπ
430
11 The KdV hierarchy
This is exactly the same type of integral we met in eq. (11.93). Again, we see that the term ∂x log S(λ1 ) vanishes because in the λ integral, the integrand is regular at λ = 0. Only the term ∂x log S(λ) contributes. This time, however, the double pole gives a non-vanishing contribution: 1 dλ1 1 λ1 =− log 1 − 2 2iπ λ (λ − λ) λ 2 1 2−λ |λ1 |>|λ] Hence, we find C1 Z|0 = −
dλ2 β(λ2 )∇(λ2 )Z 2iπ
|λ]<|λ2 |
dλ ∂x log S(λ) β(λ)|0 2iπ λ2 − λ
In this formula, ∇(λ2 ) acts on Z and ∂x log S(λ). In the terms coming from the action of ∇(λ2 ) on Z, we have three integrals. One of them can be evaluated because it does not involve any S(λ). We finally get a total vanishing contribution. The terms coming from the action of ∇(λ2 ) on ∂x log S(λ) can be put in the form dλ dλ2 ∂x log S(λ2 )∂x log S(λ) −Z (λ2 − λ)2 |λ2 |>|λ| 2iπ 2iπ ∂x2 log S(λ) − ∂x2 log S(λ2 ) − (∂x log S(λ2 ))2 : β(λ2 )β(λ) : |0 + (λ2 − λ)2 In the second term, one can always perform one of the integrals. In the first term, we take the half sum with λ and λ2 exchanged, getting an integral which localizes at λ = λ2 . Putting everything together, we get 1 d dλ C1 Z|0 = Z [2∂x2 log S(λ) + (∂x log S(λ))2 ]β(λ)λ β(λ)|0 4 2iπλ dλ Remembering eq. (11.30), we finally obtain: dλ d (C0 + C1 )Z|0 = (u + λ)β(λ)λ β(λ)|0 2iπλ dλ d β(λ)|0 = O(λ). This integral vanishes because β(λ)λ dλ
Remark. It is useful to write the operators Q and C explicitly. For Q we find Q=
β−j+ 1 ∂t2j−1
j≥1
2
From this expression it is quite clear that Q2 = 0
431
11.9 Local fields Similarly, we find C=
β−r
1
− 2rβr +
r≥ 1 2
r− 2 j1 ≥1 j2 =1
1 r − j2 +
1 βr+1−j1 −j2 ∂t2j1 −1 ∂t2j2 −1 2
Note that Q and C commute.
We now show that eqs. (11.90, 11.97), contain all the information about the KdV hierarchy by enumerating all null vectors and comparing characters. ∗ be the Fock space consisting of elements of the form Let F−n
0|βs1 · · · βsm βr∗1 · · · βr∗m+n with si , ri ≥ 12 and all different. Note that we have n + m operators βr∗i of charge −1, and m operators βsi of charge +1, so that the total charge of the state is −n. Attributing a weight 2s at βs and 2r at βr∗ turns the ∗ of charge −n into a graded vector space. dual Fock space F−n Introducing a parameter q to count the weight and x to count the ∗ is easily calculated: charge, the character of F−n dx n ∗ χ(F−n ) = (1 + q 2j−1 x)(1 + q 2j−1 x−1 ) x 2iπx j≥1
∗ ) = Changing x = q 2 x in the above integral, we find the relation χ(F−n 2 ∗ ∗ ) = q n χ(F ∗ ). On the other hand, F ∗ is ), so that χ(F−n q 2n−1 χ(F−n+1 0 0 isomorphic to the bosonic Fock space generated by the Hn , which has weight 2n. Its character is therefore j≥1 (1 − q 2j )−1 . Hence we have shown 1 2 ∗ χ(F−n ) = qn (11.98) 1 − q 2j j≥1
∗ be the dual Proposition. Let F−2 ∗ → F ∗ is injective. cation C : F−2 0
Fock space of charge −2. The appli-
Introduce a linear transformation on the space of fermions β˜r = Proof. r Nrr βr , where the matrix Nrr is defined by β˜r = βr ,
r≤−
1 2 1
β˜r = 2rβr −
r− 2 j1 ≥1 j2
1 βr+1−j1 −j2 ∂t2j1 −1 ∂t2j2 −1 , r − j2 + 12 =1
r≥
1 2
432
11 The KdV hierarchy
∗ ∗ ˜ ˜∗ satisfy the The dual fermions are β˜−r = (t N −1 )r,r β−r so that β, β canonical anticommutation relations. Because the transformation N is triangular and leaves β−r , βr∗ , r ≥ 12 invariant, the vacuum is invariant and the Fock spaces are the same. In the β˜r basis we can write β˜−r β˜r C= r≥ 12
∗ β ˜∗ . We have the sl2 Let us call X+ ≡ C and introduce X− = r≥ 1 β˜−r r 2 algebra [X+ , X− ] = H0 , [H0 , X± ] = ±2X± ∗ ˜ ˜∗ where H0 = r : βr β−r := r : βr β−r : is the charge operator The ∗ spaces F−n are eigenspaces of H0 . So; we have a representation of the sl2 algebra on the full Fock space F ∗ = n Fn∗ . ∗ Note, moreover, that X± , H0 are of grade zero, so their action on F−n preserves the gradation of these spaces, and we can restrict them to subspaces of given grade. Subspaces of given grade are finite-dimensional. For each grade, we can decompose F ∗ into a direct sum of irreducible finite-dimensional representations of sl2 . Finally, on a finite-dimensional ∗ irreducible representation, X+ : Fn−2 → Fn∗ is injective for n ≤ 0. The weights were so chosen that, taking into account the weights of ∂t2j−1 defined above, the operators Q and C have weight zero. Hence they preserve the gradings of the spaces on which they act. This is an important observation in the proof of the following: Proposition. The character of the space of vectors in E, solutions of the equations eq. (11.90) and eq. (11.97), is 1 χ1 = 1 − qj j≥2
It is equal to the character of the space of local fields. As a consequence eqs. (11.90, 11.97) capture the complete information about the KdV hierarchy. Proof. The character we are looking for is of the form 1 χ1 = ·χ 1 − q 2j−1 j≥1
where the first factor comes from the u-exponential in eq. (11.85), while the second factor, χ, comes from the v-exponential, subjected to the conditions eqs. (11.90, 11.97). To compute χ, we count the dimension of the
433
11.10 Whitham’s equations
dual spaces. The C conditions are taken into account by considering the ∗ C. Due to the previous proposition, the character of this space F0∗ /F−2 ∗ ). Next, we have to take into account the Q condispace is χ(F0∗ ) − χ(F−2 tions. Because Q and C commute, this is achieved by replacing the spaces ∗ by the spaces H∗ = F ∗ /F ∗ F−n −n −n −n−1 Q. It follows that: ∗ ) χ = χ(H0∗ ) − χ(H−2
(11.99)
∗ ), we take into account that Q is a To compute the characters χ(H−n nilpotent operator with trivial cohomology. In fact, we can construct a homotopy: −1 ∗ βj− Q∗ = 1 ∂t 2j−1 j≥1
2
The operator QQ∗ + Q∗ Q acting on any vector reproduces it up to a constant. Hence ∗ ∗ ∗ ∗ → F−n ) = Im (Q : F−n−2 → F−n−1 ) Ker (Q : F−n−1
Summing over this complex, we get, using eq. (11.98): 2
2 2 ∗ ) = q n − q (n+1) + q (n+2) + · · · χ(H−n j≥1
Inserting into eq. (11.99), we get χ = (1 − q)
j≥1 (1
1 1 − q 2j
− q 2j )−1 .
Equations (11.90, 11.97), which code for the non-linear KdV hierarchy, are actuatly linear in Z. The non-linearity comes from the explicit form of Z as an exponential, as in eq. (11.88). 11.10 Whitham’s equations We study eqs. (11.90, 11.97) in the case of finite-zone solutions. They acquire a beautiful geometrical meaning when we consider their average over the Liouville torus in the spirit of the Whitham theory (see Chapter 10). Specifically, C reflects the Riemann bilinear identities and Q gives rise to the Whitham equations directly. For any local quantity O(u, u , . . .), the Whitham average is defined by 1 O(u(x, t), . . .)dx O
= lim →∞ 2 − 2π 2π dθg dθ1 ··· O(u(θ, {λi }), . . .) = 2π 2π 0 0
434
11 The KdV hierarchy
where, in the second expression, we have replaced the average over x by an average over the Liouville torus, which is justified for almost all trajectories. Let us remark that if the expression, O, is the x-derivative of a bounded function, its Whitham average vanishes. However, we will show below that there exist more subtle vanishing conditions. We consider finite-zone solutions constructed with the hyperelliptic curve eq. (11.49) and the dynamical divisor D(x) = {γi = (λγi , µγi )}, defined in eq. (11.55). With these data, we can write: - - ∂θ −1 dλγ1 · · · dλγg O(u({λγi }, {λi }), . . .) -- - O
= N ∂γ a1 an where the normalization factor N is defined so that 1
= 1. The Jacobian determinant |∂θ/∂γ| is easily computed. Proposition. We have N
- - i<j (λγi − λγj ) −1 - ∂γ - = ∆ R(λ )
−1 - ∂θ -
where
∆ = det ai
λj−1 dλ R(λ)
(11.100)
γi
i
i,j=1,...,g
Proof. The angles on the torus are given by θk = are the normalized Abelian differentials. So dθk = ωk (γi )
γi i
ωk , where ωk
i
But ωk =
g
j=1 ckj
√λ
j−1
R(λ)
dλ, where the coefficients ckj are determined by
normalizing the a-periods. Hence - j−1 - ∂θ i<j (λγi − λγj ) - - = det c det λγi = det c - ∂γ R(λγi ) R(λγi ) i The factor det c cancels out in the normalization factor, which is found by imposing 1
= 1 giving N = det c ∆. By eqs. (11.84, 11.64), with every local field O we can associate a symmetric function of the points γi , LO (λγ1 , . . . , λγg ) = EO,0 (S2 , S4 , . . .). This function depends on the moduli λi and the λγi . In the Whitham average,
11.10 Whitham’s equations
435
all terms with ν > 0 in eq. (11.84) vanish because they are exact derivatives. We have dλγg dλγ1 −1 O
= ∆ LO (λγ1 , . . . , λγg ) (λγi −λγj ) ··· R(λγg ) R(λγ1 ) a1 an i<j (11.101) We can expand the symmetric function LO (λγ1 , . . . , λγg ) on the Schur polynomials introduced in eq. (9.62) in Chapter 9: LO = LYO SY Y
Using the determinant formula for SY associated with the Young diagram Y = [ni ], we see that averaging reduces to computing dλ O
= ∆−1 LYO det λni +j−i R(λ) a j Y These formulae are particularly useful for describing the Whitham average because the variables λγi are separated and we have only one dimensional integrals to compute. By eq. (11.64), the value of the coefficients J2n defined in eq. (11.86) are: g 2g+1 1 n n J2n = λ γi − λi 2 i=1
i=1
Using the boson–fermion correspondance of Chapter 9, we can write Z in eq. (11.88) as: g g i=1 λγi (11.102) Z|0 = GΓ β ∗ (λγ1 )β ∗ (λγ2 ) · · · β ∗ (λγg )|g i<j (λγi − λγj ) where GΓ depends only on the curve Γ: 2g+1 1 1 λni H−n GΓ = exp 2 n n≥1
i=1
In the Whitham theory, we assume that the moduli, λi , become functions of the slow modulation times T2j−1 = t2j−1 . ∂t2j−1 O = ∂t2j−1 O|λi +
∂λi ∂λ O ∂t2j−1 i i
436
11 The KdV hierarchy
Upon averaging, the first term drops out and we are left with ∂t2j−1 O
=
∂λi ∂λ O
≡ ∂T2j−1 O
∂t2j−1 i i
The modulation equations are obtained by keeping the leading terms in in eqs. (11.90, 11.97). They become Q0 Z
|0 = 0, with Q0 =
C0 Z
|0 = 0
β−j+ 1 ∂T2j−1
(11.104)
dλ d β(λ) λ β(λ) 2iπ dλ
(11.105)
2
j≥1
and C0 =
(11.103)
The rest of this section is devoted to the analysis of these equations. We need some preparation. On a Riemann surface, there is a natural pairing between meromorphic differentials. If Ω1 and Ω2 are two such differentials on Γ, we define (Ω1 • Ω2 ) =
g i=1
Ω2 −
Ω1
aj
Ω2
aj
bj
(11.106)
Ω1 bj
The Riemann bilinear identities express this quantity in terms of residues (see Chapter 15). We also have a pairing between cycles. If C1 and C2 are two cycles, the pairing is simply g (n1j m2j − m1j n2j ) C1 ◦ C2 = j=1
g
where Ci = j=1 (nij aj + mij bj ). We can write this intersection number in a way similar to eq. (11.106). Proposition. Let ωi be the normalized holomorphic differentials. Let ηi , i = 1, . . . , g, be the second kind differentials dual to the ωi , normalized by (ωi • ηj ) = δij , then C1 ◦ C2 =
(ηi • ηj ) = 0,
g j=1 C 1
(ωi • ωj ) = 0
ηj −
ωj C2
ωj
C2
ηj
C1
(11.107)
437
11.10 Whitham’s equations
Proof. The normalization conditions of ωj and ηj mean that the matrix P , defined by ωi ωi a b j j P = aj ηi bj ηi is a symplectic matrix: t
P J P = J,
J=
0 Id −Id 0
Since J 2 = −Id, the right inverse of P is −J t P J. Using the fact that the right inverse and the left inverse are the same, we deduce that t P JP = J. Now let Ci | = (nij , mij ). We can rewrite the intersection number as C1 ◦ C2 = C1 |J|C2 . Using the relation t P JP = J, this is equal to C1 ◦ C2 = C1 | t P JP |C2 . This is equivalent to eq. (11.107) because C1 | t P is the vector of periods of the forms ωj , ηj along the cycle C1 , and similarly for P |C2 . For a hyperelliptic curve, one has the explicit formula: Proposition. The intersection number is given by 1 dλ dλ 1 2 C(λ1 , λ2 ) C1 ◦ C2 = 4iπ C1 R(λ1 ) C2 R(λ2 )
(11.108)
where the antisymmetric polynomial C(λ1 , λ2 ) is defined by R(λ1 ) ∂ C(λ1 , λ2 ) = R(λ1 ) − (λ1 ↔ λ2 ) ∂λ1 λ1 − λ2 Proof. The first term in the expression of C1 ◦ C2 reads R(λ1 ) 1 dλ2 d dλ1 4iπ C2 R(λ2 ) C1 dλ1 λ1 − λ2
(11.109)
The integral over λ1 can be performed and gets contributions only at intersection points of C1 and C2 . Let the curves have a positive intersection at λ = λ0 . We get a contribution 1 1 = 2iπ R(λ0 )δ(λ0 − λ2 ) R(λ0 ) − + λ0 + i − λ2 λ0 − i − λ2 The integral over λ2 now gives 1/2. The second term is treated similarly and also gives 1/2.
438
11 The KdV hierarchy
It is important to realize that the average eq. (11.101) can vanish for particular antisymmetric polynomials: MO (λγ1 , . . . , λγg ) ≡ (λγi − λγj )LO (λγ1 , . . . , λγg ) i<j
There are two general reasons why such an integral can vanish: • The first one is when MO is “an exact form” (−1)i M (λγ1 , . . . , λ( MO (λγ1 , . . . , λγg ) = γi , . . . , λγg )P (λγi ) i
where P (λ) is a polynomial such that P (λ)/ R(λ) has vanishing a-periods. In particular, if deg P ≥ 2g, we can write √ S d P √ =√ + (Q R) R R dλ with deg S ≤ 2g − 1 and deg Q = deg P − 2g. The exact derivative term has vanishing periods. • The second one, less trivial, is when MO (λγ1 , . . . , λγg ) ( = (−1)i+j M (λγ1 , . . . , λ( γi , . . . , λγj , . . . , λγg )C(λγi , λγj ) (11.110) i,j
since we are integrating over non-intersecting a-cycles. This second condition, which is a direct consequence of Riemann bilinear identities, ensures that the second of eqs. (11.103) is automatically satisfied. Proposition. The equation C0 Z
|0 = 0
(11.111)
follows from eq. (11.110). Proof. We have to evaluate g C0 Z
|0 = ∆−1 i=1
ai
λgγ dλγi i C0 GΓ β ∗ (λγ1 ) · · · β ∗ (λγg )|g R(λγi )
The commutation relation β(λ)GΓ = λ−g− 2 1
R(λ) GΓ β(λ)
11.10 Whitham’s equations implies
C0 GΓ = GΓ
439
d dλ −2g−1 R(λ)β(λ)λ β(λ) λ 2iπ dλ
So, we have to compute
d dλ −2g−1 R(λ)β(λ)λ β(λ)β ∗ (λγ1 ) · · · β ∗ (λγg )|g λ 2iπ dλ
This is done using Wick’s theorem. Since we are using the charged vacuum, the contraction is g|β(z)β ∗ (w)|g =
z g w−g , z−w
|z| > |w|
We have three terms corresponding to zero, one and two contractions respectively. The term with zero contraction reads E0 =
dλ −2g d λ R(λ) : β(λ) β(λ)β ∗ (λγ1 ) · · · β ∗ (λγg ) : |g 2iπ dλ
d This term vanishes because β(λ) dλ β(λ)|g = λ2g |g + 2 + O(λ2g+1 ). The term with one contraction can be written as
E1 = 2
g i=1
dλ λ−g d γi (−1) R(λ) R(λ)λn 2iπ λ − λγi dλ i
n≥0
∗ (λ ) · · · β ∗ (λ ) : |g : β−g−n− 1 β ∗ (λγ1 ) · · · β γi γg 2
When we perform the λ integral, we get a contribution at the pole λγi which produces an exact form and therefore does not contribute in the Whitham average. The point λ = 0 does not contribute because the integrand is regular there. The term with two contractions reads E2 =
ij
(−1)i+j
dλ −g −g λ λ 2iπ γi γj
R(λ) d R(λ) −i↔j λ − λγi dλ λ − λγj
∗ (λ ) · · · β ∗ (λ ) · · · β ∗ (λ ) : |g : β ∗ (λγ1 ) · · · β γi γj γg
where the hat means that the corresponding quantity is omitted. Performing the λ integral, we get an expression of the form eq. (11.110) vanishing under the Whitham average.
440
11 The KdV hierarchy
Proposition. The equation Q0 Z
|0 = 0
(11.112)
implies the modulation equations ∂ ∂T2p−1
Ω(2q−1) =
∂ ∂T2q−1
Ω(2p−1)
where Ω(2p−1) are the normalized second kind differentials with a pole at ∞ such that Ω(2p−1) = d(z 2p−1 + O(z −1 )) at ∞. Proof. To extract this particular modulation equation, one has to extract a particular component in eq. (11.112). Consider the co-vector ∗ ∗ λ2 λ−1/2 dλ 0| : βp− 1β q− 1 1
2
2
d 1 λ 2 β(λ) : dλ
where the factor λ−1/2 dλ = 2dz is introduced to get a 1-form on Γ. ∗ Applying the Wick theorem, with contraction 0|βp− 1 β−j+ 1 |0 = δpj , we 2
have
2
d 1 ∗ ∗ Q λ 2 β(λ)βp− 1β q− 12 0 2 dλ 1 d 1 1 d 1 ∗ ∗ = 0|λ 2 λ 2 β(λ)βp− λ 2 β(λ)βq− 1 ∂T2q−1 − 0|λ 2 1 ∂T2p−1 2 2 dλ dλ On the other hand, we easily get 1
0|λ 2
1 d 1 1 ∗ ∗ 0|β ∗ (λ)βp− ) 0|β ∗ (λ)βp− 1 − 0|λ 2 λ 2 β(λ)βp− 1 C0 = (p − 1 2 2 2 2 dλ hence, having in mind eq. (11.111), we can write
d 1 ∗ ∗ Q λ 2 β(λ)βp− 1β q− 12 0 2 dλ 1 1 = (p − ) 0|β ∗ (λ)βp− 1 ∂T2q−1 − (q − ) 0|β ∗ (λ)βq− 1 ∂T2p−1 2 2 2 2 This yields the equation 1
0|λ 2
(2q − 1)∂T2p−1 0| : βq− 1 β ∗ (λ) : Z
|0 2
= (2p − 1)∂T2q−1 0| : βp− 1 β ∗ (λ) : Z
|0 2
It remains to evaluate 0|β ∗ (λ)βp− 1 Z
|0 2 g λgγ dλγi −1 i =∆ 0|β ∗ (λ)βp− 1 GΓ β ∗ (λγ1 ) · · · β ∗ (λγg )|g 2 R(λ ) γi i=1 ai
441
11.10 Whitham’s equations Pushing GΓ to the left and writing βp− 1 = 2
dλ1 p−1 2iπ λ1 β(λ1 ),
we arrive at
1
λg+ 2 0|β (λ)βp− 1 GΓ β (λγ1 ) · · · β (λγg )|g = 2 R(λ) 3 dλ1 p−g− 2 λ R(λ1 ) 0|β ∗ (λ)β(λ1 )β ∗ (λγ1 ) · · · β ∗ (λγg )β−g+ 1 · · · β− 1 |0 2 2 2iπ 1 ∗
∗
∗
The last vacuum expectation value is just the determinant of a (g + 1) × (g + 1) matrix 1 λ−j+1 λ−λ1 det 1 λ−j+1 γi λγ −λ1 i,j=1,...,g
i
The λ1 integral can be evaluated: 1 1
λg+ 2 1 λg+ 2 p−g− 3 dλ1 2 R(λ1 ) = R(λ) λ λ − λ1 + R(λ) |λ1 |>|λ| 2iπ R(λ) where ()+ means the polynomial part at ∞. Putting back the factor λ−1/2 dλ, we finally get: λ−1/2 dλ 0|β ∗ (λ)βp− 1 Z
|0 2
3 g √λ λp−g− 2 R(λ) dλ R(λ) + = ∆−1 det dλ p−g− 32 λgγi γi √ λ γi R(λγi ) ai 2iπ R(λγi )
+
g−j+1 λ √ dλ R(λ) g−j+1 dλγi λγi √ ai 2iπ R(λγi )
Here the indices i, j = 1, . . . , g, so that the matrix inside the determinant is (g + 1) × (g + 1). Note that this is a second kind differential. At infinity it behaves as (z 2p−2 + O(1))dz. Moreover, it is evident that its a-periods 1 Ω(2p−1) . all vanish. It is equal to 2p−1 References [1] D.J. Korteweg and G. de Vries, On the change of form of long waves advancing in a rectangular channel and on a new type of long stationary wave. Philos. Mag. 39 (1895) 422–443. [2] C.S. Gardner, J.M. Greene, M.D. Kruskal and R.M. Miura, Method for solving the Korteweg–de Vries equation. Phys. Rev. Lett. 19 (1967) 1095. [3] P.D. Lax, Integrals of nonlinear equations of evolution and solitary waves. Comm. Pure Appl. Math. 21 (1968) 467–490.
442
11 The KdV hierarchy
[4] V.E. Zakharov and L.D. Faddeev, Korteweg–de Vries equation: a completely integrable Hamiltonian system. Funct. Anal. Appl. 8 4 (1971) 280–287. [5] S. Novikov, S.V. Manakov, L.P. Pitaevskii and V.E. Zakharov, Theory of solitons. The inverse scattering method. Consultants Bureau (1984). [6] L.D. Faddeev and L.A. Takhtajan, Hamiltonian Methods in the Theory of Solitons. Springer (1986). [7] L.A. Dickey, Soliton Equations and Hamiltonian Systems. World Scientific (1991). [8] O. Babelon, D. Bernard and F. Smirnov, Null-vectors in integrable field theory. Commun. Math. Phys. 186 (1997) 601–648. [9] O. Babelon, D. Bernard and F. Smirnov, Form factors, KdV and deformed hyperelliptic curves. Nucl. Phys. B (Proc. Suppl.) 58 (1997) 21–33. [10] I. Krichever and D.H. Phong, Symplectic forms in the theory of solitons. Surveys in Differential Geometry 4 (1998) 239–313, International Press.
12 The Toda field theories
In this chapter, we study Toda field theories, which are generalizations of the Liouville equation. The equations of motion are both integrable, i.e. they admit a zero curvature representation, and conformally invariant. They allow us to see the interplay between conformal symmetry and the classical Yang–Baxter equation. The sine-Gordon theory is a Toda field (2 . In particular, soliton theory associated with the affine Lie algebra sl solutions are constructed by the action of very special elements of the dressing group on the vacuum solution. Compared to the KP situation, we have here an example with two singularities, each one accomodating one of the two relativistic light-cone variables x ± t, and their corresponding hierarchy of times. We study soliton and finite-zone solutions of the sineGordon equation. In the next chapter we will apply the inverse scattering method to discuss further solutions. 12.1 The Liouville equation In his studies on surfaces with constant curvature, Liouville introduced the equation (∂t2 − ∂x2 )ϕ = −4e2ϕ (12.1) In the light-cone coordinates x± = x ± t, ∂x± = 12 (∂x ± ∂t ), the equation reads ∂x+ ∂x− ϕ = e2ϕ . Remarkably, Liouville was able to give the general solution of this non-linear partial differential equation: e2ϕ = −
∂F (x+ )∂G(x− ) (F (x+ ) − G(x− ))2
(12.2)
where F and G are arbitrary functions of the single variables x+ and x− respectively and ∂F (x+ ) = ∂x+ F (x+ ), ∂G(x− ) = ∂x− G(x− ). Such 443
444
12 The Toda field theories
functions are called chiral functions. A very important property of this equation is its invariance under changes of coordinates: x+ = f (x+ ),
x− = g(x− ),
ϕ = ϕ −
1 log (∂f ∂g) 2
We see that the equation for the primed variables is the same as for the unprimed ones. We call this invariance the conformal invariance of the Liouville equation. There is an important connection between the Liouville equation and the Schroedinger equation, eq. (11.5) in Chapter 11. The field e−ϕ satisfies two chiral equations: (∂x2− − u(x− ))e−ϕ = 0, (∂x2+
−ϕ
− u(x+ ))e
= 0,
u = (∂x− ϕ)2 − ∂x2− ϕ,
∂x+ u = 0
u = (∂x+ ϕ) −
∂x− u = 0
2
∂x2+ ϕ,
Indeed, the two Schroedinger equations are obtained readily by computing ∂x2± e−ϕ and the chirality of u and u is then proved using the Liouville equation. Conversely, starting from two arbitrary chiral potentials u(x− ) and u(x+ ), we construct solutions ξi (x− ) and ξ i (x+ ), i = 1, 2, of the two Schroedinger equations, normalized such that their Wronskians are: W (ξ1 , ξ2 ) = 1,
W (ξ 1 , ξ 2 ) = −1
where the Wronskian is defined as W (f, g) = f g − f g . It is easy to check that ϕ(x+ , x− ), given by the formula, e−ϕ(x+ ,x− ) = ξ1 (x− )ξ 1 (x+ ) + ξ2 (x− )ξ 2 (x+ )
(12.3)
satisfies the Liouville equation. One can relate this solution to Liouville’s solution eq. (12.2) by writing it as: 1 G F −G F 1 e−ϕ = √ =√ √ −√ √ −F G F −G F −G This leads us to set: F ξ1 = √ , F
1 ξ2 = − √ , F
ξ1 = √
1 , −G
ξ2 = √
G −G
The Wronskian conditions W (ξ1 , ξ2 ) = 1 and W (ξ 1 , ξ 2 ) = −1 are automatically satisfied. It follows that F = −ξ 1 /ξ 2 and G = ξ2 /ξ1 . Finally, one
445
Zero-curvature representation
can use this identification to compute the potentials u and u in terms of F and G. 1 u = − S(F ), 2
1 u = − S(G), 2
S(f ) =
f 3 f 2 − f 2 f 2
where S(f ) is the so-called Schwarzian derivative of f . Note that the two solutions ξ1 , ξ2 are defined up to a linear transformation of determinant 1. This translates into a homographic transformation for G which leaves the potential u invariant, and similarly for F and u. When the two homographic transformations of F and G are inverse to each other, the Liouville field ϕ is invariant. Invariance of the Liouville equation under change of coordinates reflects itself into covariance properties of the Schroedinger equation. Changing x = f (x ), we have: ξ(x) ξ (x ) = √ , ∂f
u (x ) = u(x)(∂f )2 +S(f ),
(∂ 2 −u )ξ = (∂f ) 2 (∂ 2 −u)ξ 3
These equations exhibit the covariance properties of the objects involved. Specifically, ξ is a differential of weight − 12 , u is a Schwartzian connection, and (∂ 2 − u)ξ is a differential of weight 32 . In the next section, we generalize this setup to a large class of twodimensional field theories, called Toda field theories, based on Lie algebras. 12.2 The Toda systems and their zero-curvature representations Toda field theories are two-dimensional generalizations of the Toda chains studied in Chapter 4. We use the same notations for Lie algebras as in that chapter. Let G be a simple Lie algebra of rank r, and consider a Cartan decomposition: G = N− ⊕ H ⊕ N+ Let Φ(x, t) be the Toda field taking values in the Cartan subalgebra Φ(x, t) =
r
Φi (x, t)Hi
i=1
where Hi form an orthonormal basis of the Cartan algebra. By a straightforward generalization of eq. (4.24), we define the Toda field theories by
446
12 The Toda field theories
their equations of motion:
(∂t2 − ∂x2 )Φ = −2
Hα exp (2α(Φ))
(12.4)
α simple
To write these equations in components, we introduce ϕi = Λ(i) (Φ) for i = 1, . . . , r, where Λ(i) are the r fundamental weights of G. Since we have Λ(i) (Hαj ) = (αi2,αi ) δij , and αi (Φ) = j Λ(j) (Φ) aji , where aji are the elements of the Cartan matrix, the Toda equations of motion can be written as: j (∂t2 − ∂x2 )ϕi = −αi2 e 2 j ϕ aji (12.5) In particular in the sl2 case, the Toda field has only one component ϕ1 = Λ(Φ), and setting ϕ1 = 12 ϕ, we recover eq. (12.1). All these equations share with the Liouville equation the important property of being invariant under a change of coordinates: Proposition. The Toda field equations are invariant under the transformations x± → x± with x+ = f (x+ ), x− = g(x− ), and 1 Φ (x+ , x− ) = Φ(f (x+ ), g(x− )) + Hρ ln(∂f (x+ )∂g(x− )) 2 Here Hρ ∈ H is the Weyl vector characterized by the property αj (Hρ ) = 1 for all simple roots αj . Proof. The proof is obvious once we write the field equations in the light-cone coordinates: 1 ∂x+ ∂x− Φ = Hα e2α(Φ) 2 α simple
We introduce now a zero curvature representation for the Toda field equations. Proposition. Let Ax = ∂t Φ +
nα eα(Φ) (Eα + E−α )
(12.6)
nα eα(Φ) (Eα − E−α )
(12.7)
α simple
At = ∂x Φ −
α simple
Then the Toda field equations eq. (12.4) can be rewritten as the zero curvature equation ∂x At − ∂t Ax − [Ax , At ] = 0. The constants nα are such that n2α (Eα , E−α ) = 1.
12.3 Solution of the Toda field equations
447
Proof. The proof is exactly the same as for the open Toda chain, and we do not repeat it. It will be often convenient to work with the light-cone coordinates. In these coordinates, Ax± = 12 (Ax ± At ). Define the elements of G:
E± =
nαj E±αj
αj simple
then Ax+ = ∂x+ Φ + e−ad Φ E− ,
Ax− = −∂x− Φ + ead Φ E+
(12.8)
Example. Let us give the example associated with the Lie algebra G = sl2 . Let E+ , E− , H, be its generators with commutation relations [H, E± ] = ±2E± ,
[E+ , E− ] = H
We have Φ = 12 ϕH, and the Lax connection reads, in the fundamental representation: 1 1 ∂x ϕ 0 eϕ − 2 ∂x− ϕ and A Ax+ = 2 ϕ+ = x− 1 − 12 ∂x+ ϕ 0 e 2 ∂x− ϕ The zero curvature condition ∂x+ Ax− − ∂x− Ax+ − [Ax+ , Ax− ] is equivalent to the Liouville equation (12.1). 12.3 Solution of the Toda field equations The Toda field equations being conformally invariant, one can solve them by splitting the chiralities, as in the Liouville case. To do that, we use the zero-curvature representation. When Φ is a solution of the Toda field equations, we have Fx+ x− = 0 (and conversely), so we can solve the linear system (∂x± −Ax± )Ψ = 0. Equivalently, we can write Ax± as a pure gauge (12.9) Ax± = ∂x± Ψ · Ψ−1 We denote by B± = H ⊕ N± the two Borel subalgebras of G. Proposition. Let Q± ∈ exp B± be defined by two decompositions of Ψ as: Ψ = e±Φ N∓ Q±
with
N∓ ∈ exp N∓ ,
Q± ∈ exp B± (12.10)
448
12 The Toda field theories
then Q± satisfy the following equations: ∂x− Q− Q−1 − = 0, ∂x+ Q+ Q−1 +
= 0,
∂x+ Q− Q−1 − = −P + E−
(12.11)
∂x− Q+ ·
(12.12)
Q−1 +
= P + E+
where P (x− ) and P (x+ ) are chiral fields with values in the Cartan subalgebra. Proof. We write Ψ in two different ways: Ψ = e−Φ G1 = eΦ G2 . Plugging this into eq. (12.9) we get: 2adΦ ∂x− G1 · G−1 E+ , ∂x− G2 · G−1 1 =e 2 = −2∂x− Φ + E+
(12.13)
−1 −2adΦ E− ∂x+ G1 · G−1 1 = 2∂x+ Φ + E− , ∂x+ G2 · G2 = e
(12.14)
Let us prove eqs. (12.11). Using the Gauss decomposition G1 = N+ Q− with N+ ∈ exp N+ and Q− ∈ exp B− , we obtain: −1 −1 2adΦ N+ (∂x− Q− Q−1 E+ − )N+ + ∂x− N+ N+ = e
or, multiplying on the right by N+ and on the left by N+−1 , −1 −1 2adΦ E+ N+ ∂x− Q− Q−1 − = −N+ ∂x− N+ + N+ e
Since the left-hand side is in B− and the right-hand side is in N+ , they −1 2adΦ E . This both vanish, so that ∂x− Q− Q−1 + − = 0 and ∂x− N+ N+ = e proves that Q− only depends on x+ . Next, using eq. (12.14) and again the decomposition G1 = N+ Q− , we obtain: −1 −1 N+ (∂x+ Q− Q−1 − )N+ + ∂x+ N+ N+ = 2∂x+ Φ + E−
(12.15)
The right-hand side has lowest height −1 given by the E− term. So the −1 lowest height term in ∂x+ Q− Q−1 − is also equal to E− . Since ∂x+ Q− Q− ∈ B− , it is necessarily of the form −P + E− , with P ∈ H only depending on x+ . The equations for Q+ are proved similarly. One can reconstruct the Toda field, Φ, from the knowledge of Q± . Let |Λ(i) , i = 1, . . . , r = rank G be highest weight vectors for the fundamental representations of G. Recall the main properties of |Λ(i) : H|Λ(i) = Λ(i) (H)|Λ(i) ,
Eα |Λ(i) = 0 for α > 0
We denote by Λ(i) | the conjugate highest weight which satisfies (see Chapter 16): Λ(i) | = Λ(i) (H) Λ(i) |,
Λ(i) |E−α = 0 for α > 0
449
12.3 Solution of the Toda field equations
We can compute any scalar product of the form Λ(i) |X |Λ(i) , where X is any element of the universal enveloping algebra, by pushing all the Eα to the right and all the E−α to the left using the commutation relations, and the fact that |Λ(i) is a common eigenvector of all the H. Finally, we normalize the scalar product by: Λ(i) |Λ(i) = 1 With these definitions at hand, we have: Proposition. For any fundamental representation with highest weight Λ(i) , define: ξ (i) = Λ(i) | e−Φ Ψ = Λ(i) | Q+ ξ
(i)
(i) = Ψ−1 e−Φ |Λ(i) = Q−1 − |Λ
(12.16)
(i)
The vectors ξ (i) and ξ are chiral: ∂x+ ξ (i) = 0 and ∂x− ξ The Toda field Φ can be reconstructed by the formula: e−2Λ
(i) (Φ)
= ξ (i) · ξ
(i)
(i)
(i) = Λ(i) |Q+ Q−1 − |Λ
= 0.
(12.17)
Proof. First e−Φ Ψ = N− Q+ and Λ(i) |N− = Λ(i) | by the highest weight condition. So Λ(i) | e−Φ Ψ = Λ(i) | Q+ depends only on x+ . Similarly (i) (i) Ψ−1 e−Φ |Λ(i) = Q−1 − |Λ depends only on x− . By the definition of ξ (i)
(i)
and ξ , we see that Ψ and Ψ−1 cancel in the scalar product ξ (i) · ξ , (i) leaving Λ(i) | e−2Φ |Λ(i) = e−2Λ (Φ) . The knowledge of these quantities for i = 1, . . . , r completely characterizes Φ. Equations (12.12) determine Q± in terms of the and P with values in H. For example, consider the (P + E+ )Q+ in the Liouville case, G = sl2 . We have: q11 q12 p , (P + E+ ) = Q+ = 0 q22 0
two chiral fields P equation ∂x− Q+ = 1 −p
The vector ξ reads ξ = (1, 0)Q+ = (q11 , q12 ) and the first order differential equation for Q+ yields: (∂x2− − u)ξ = 0,
u = p + p2
The relation between u and p is called the Miura transformation. The first order linear system for Q+ is a matrix version of the Schroedinger equation, which is recovered as an equation for the first row of the matrix Q+ .
450
12 The Toda field theories
The chiral fields P and P are the two arbitrary functions parametrizing the general solution of Toda field equations. Note that the splitting of chiralities in Toda field theories brings us back to the Drinfeld–Sokolov linear systems of Chapter 10. We describe more explicitly the case of sl(n + 1). The n fundamental representations are the vector representation and its wedge products. The vector representation acts on Cn+1 with basis | j , j = 1, . . . , (n + 1). The highest weight vector is | 1 . The elements of the Cartan algebra are the Ei,i±1 , where Eij = | i j | are traceless diagonal matrices, and E± = the canonical matrices acting on Cn+1 . In the vector representation the −1 ¯ | 1 . Let us decompose chiral fields are ξ(x) = 1 |Q + and ξ(x) = Q− Pj = the fields P and P¯ as P = j Pj Ejj and P¯ = j P¯j Ejj , with ¯ ¯ ξ¯j the components of the chiral ¯fields: ξ(x) = Pj = 0. Denote¯ by ξj and | ξ , and ξ(x) = j j j j ξj | j . The functions ξj and ξj satisfy the differential equations of order (n + 1): (∂x+ − P¯n+1 ) · · · (∂x+ − P¯1 ) ξ¯j = 0 (∂x− − Pn+1 ) · · · (∂x− − P1 ) ξj = 0
(12.18) (12.19)
Indeed, the first row of Q+ is (ξ1 , . . . , ξn+1 ) and the explicit form of E+ immediately yields eqs. (12.19). Proposition. The components of the Toda field along the fundamental weights Λ(p) are given by: e−2Λ
(p) (Φ)
¯ = det (∂xi − ξ · ∂xj + ξ),
i, j = 0, . . . , p − 1
(12.20)
Proof. The highest weight vector of the Λ(p) is the wedge product |Λ(p) = p−1 | 1 ∧ · · · ∧ | p , and we have E− | 1 = | p for p = 1, . . . , n. Thus, p−1 |Λ(p) = | 1 ∧ E− | 1 ∧ · · · ∧ E− | 1
¯(p) Acting with Q−1 − we obtain an expression for the chiral field ξ : −1 −1 −1 p−1 (p) ξ¯(p) = Q−1 − |Λ = Q− | 1 ∧ Q− E− | 1 ∧ · · · ∧ Q− E− | 1 (12.21) j We now use the equation of motion (12.12) to express Q−1 − E− | 1 in terms j ¯ of the derivatives ∂x+ ξ. Using eq. (12.12) and differentiating it, we get: −1 −1 ¯ Q−1 − E− | 1 = −∂x+ Q− | 1 + Q− P | 1 −1 −1 ¯ 2 2 ¯ ¯2 Q−1 − E− | 1 = ∂x+ Q− | 1 + Q− P E− + E− P − P − ∂x+ P | 1 j −1 j j Q−1 − E− | 1 = (−1) ∂x+ Q− | 1 + · · ·
451
12.3 Solution of the Toda field equations The extra terms cancel in the wedge product (12.21). Therefore, ξ¯(p) = (−1)
p(p−1) 2
−1 p−1 −1 Q−1 − | 1 ∧ (∂x+ Q− )| 1 ∧ · · · ∧ (∂x+ Q− )| 1
= (−1)
p(p−1) 2
ξ¯ ∧ ∂x+ ξ¯ ∧ · · · ∧ ∂xp−1 ξ¯ +
= (−1)
p(p−1) 2
ξ¯j1 ∂x+ ξ¯j1 | j1 ∧ · · · ∧ | jp det .. . j1 <···<jp ¯ ∂xp−1 + ξj
1
ξ¯jp ∂x+ ξ¯jp .. . p−1 ¯ · · · ∂x+ ξjp ··· ···
A similar expression is obtained for the chiral fields: ξ (p) = Λ(p) |Q+ = (−1)
p(p−1) 2
ξ ∧ ∂x− ξ ∧ · · · ∂xp−1 ξ −
Equation (12.20) follows from exp(−2Λ(p) (Φ)) = ξ (p) · ξ¯(p) . Noting that k ···k j1 | ∧ · · · ∧ jp | k1 ∧ · · · ∧ | kp = j11···jpp , we can also write e−2Λ
(p) (Φ)
=
j1 <···<jp
ξj1 ∂x− ξj1 det .. .
∂xp−1 − ξj1
¯ ξjp ξj1 ∂x− ξjp ∂x+ ξ¯j1 det .. .. . . p−1 ¯ ∂ · · · ∂xp−1 ξ x+ ξj1 jp − ··· ···
ξ¯jp ¯ ∂x+ ξjp .. . p−1 ¯ · · · ∂x+ ξjp ··· ···
This is the sln+1 generalization of the Liouville solution eq. (12.3). It is interesting to give a direct and algebraic proof that the formula we just obtained for Φ does indeed satisfy the Toda field equations. We first rewrite the field equations directly in terms of the tau-functions:
τi (x+ , x− ) = exp −2Λ(i) (Φ) (12.22) Proposition. In terms of the functions τi , the field equations become: (αi , αi ) −aji τj (12.23) τi (∂x− ∂x+ τi ) − (∂x+ τi )(∂x− τi ) = − 2 j=i
where aij is the Cartan matrix of G. Proof. Recall the Toda field equations, eqs. (12.5), for the components −a ϕi = Λ(i) (Φ) = − 12 log τi . We have exp 2 j ϕj aji = j τj ji . Equation (12.23) for the tau-functions then follows using aii = 2 and 1 ∂x+ ∂x− log τi = 2 τi ∂x− ∂x+ τi − ∂x+ τi ∂x− τi . τi
452
12 The Toda field theories
Remark. Using Hirota’s differential operators we can rewrite eq. (12.23) as: D+ D− τi · τi = −(αi , αi )
−aji
τj
j=i
In general 0 ≤ aij aji ≤ 4 and aij ≤ 0 for j = i, so the right-hand side of this equation may be a polynomial of degree greater than 2 in the τj . However, in the sln+1 case, we have aij = −1 for j = i ± 1 (the other aij vanish) so this equation takes a bilinear form: D+ D− τi · τi = −2τi−1 τi+1
We now show that eqs. (12.23) can be derived directly from the chiral equations, eqs. (12.12, 12.11). The tau-functions, eq. (12.17), are directly expressed in terms of G = Q+ Q−1 − : τi = Λ(i) |G|Λ(i) The left-hand side of eq. (12.23) can be written as: τi (∂x− ∂x+ τi ) − (∂x+ τi )(∂x− τi ) = Λ(i) | ⊗ Λ(i) |G ⊗ ∂x− ∂x+ G −∂x+ G ⊗ ∂x− G|Λ(i) ⊗ |Λ(i) Using the equations of motion of Q± , we see that the fields P and P¯ drop out because they always hit the highest weight vector giving the scalar factors Λ(i) (P ) or Λ(i) (P ). We get: τi (∂x− ∂x+ τi ) − (∂x+ τi )(∂x− τi ) = − Λ(i) |G|Λ(i) Λ(i) |E+ GE− |Λ(i) + Λ(i) |E+ G|Λ(i) Λ(i) |GE− |Λ(i) This can be rewritten as τi (∂x− ∂x+ τi ) − (∂x+ τi )(∂x− τi ) = − 12 Ξ(i) | G ⊗ G |Ξ(i)
(12.24)
where |Ξ(i) = |Λ(i) ⊗ E− |Λ(i) − E− |Λ(i) ⊗ |Λ(i) = |Λ(i) ∧ E− |Λ(i) The proof of the equations of motion, eq. (12.23), now consists of comparing the right-hand sides of eq. (12.23) and eq. (12.24). We need the following lemma: Lemma. In the tensor product of two copies of the fundamental representation with highest weight Λ(i) , |Ξ(i) is a highest weight vector with weight − j=i aji Λ(j) and norm Ξ(i) |Ξ(i) = (αi , αi ).
12.3 Solution of the Toda field equations
453
Proof. We must show that Eα kills |Ξ(i) for all positive roots α. It is enough to consider α simple. In the tensor product, Eα is represented by Eα ⊗ I + I ⊗ Eα . Hence we have to show that: (Eα ⊗ I + I ⊗ Eα )|Ξ(i) = |Λ(i) ⊗ Eα E− |Λ(i) − Eα E− |Λ(i) ⊗ |Λ(i) = 0 But for α a simple positive root, Eα E− |Λ(i) = [Eα , E−β ]|Λ(i) = Hα |Λ(i) = Λ(i) (Hα )|Λ(i) β simple
where we have used that for α and β simple, [Eα , E−β ] = 0 for β = α. The first part of the lemma follows. For the second part, we must prove that |Ξ(i) is an eigenstate of (Hα ⊗ I + I ⊗ Hα ) and find the eigenvalue. Let us compute: (Hα ⊗ I + I ⊗ Hα )|Ξ(i) = Hα |Λ(i) ⊗ E− |Λ(i) − Hα E− |Λ(i) ⊗ |Λ(i) +|Λ(i) ⊗ Hα E− |Λ(i) − E− |Λ(i) ⊗ Hα |Λ(i) Next we notice that E− |Λ(i) = E−αi |Λ(i) , because E−β |Λ(i) vanishes unless β = αi . To see it, we compute the norm of this vector: Λ(i) |Eβ E−β |Λ(i) = Λ(i) (Hβ ) This vanishes if β = αi by definition of the fundamental weights. Hence Hα E− |Λ(i) = −αi (Hα ) + Λ(i) (Hα ) E−αi |Λ(i) , which gives: (Hα ⊗ I + I ⊗ Hα )|Ξ(i) = [2Λ(i) (Hα ) − αi (Hα )]|Ξ(i) Thus the weight of |Ξ(i) is 2Λ(i) − αi = − j=i aji Λ(j) because αi = (j) (i) (i) (i) j Λ aji . Finally, we have: Ξ |Ξ = Λ (Hαi ) = (αi , αi ). We now finish the proof of the equations of motion. We have to compute Ξ(i) | G ⊗ G |Ξ(i) . But G ⊗ G is the representation of G in the tensor product representation. Since this scalar product only depends on the values of the weights and the Lie algebra structure, we can compute it once we know the highest weight vector and its normalization: 1
|Ξ(i) = (αi , αi ) 2
< j=i
This yields eq. (12.22).
⊗(−aij ) |Λ(j)
454
12 The Toda field theories 12.4 Hamiltonian formalism
The Toda field equation eq. (12.4) can be derived from the Lagrangian density: 1 1 L = (∂t Φ, ∂t Φ) − (∂x Φ, ∂x Φ) − 2 2
exp(2α(Φ))
αi simple
As usual we can go to the Hamiltonian formalism by introducing the conjugate momentum Π = ∂t Φ. Weexpand it on an orthonormal basis of the Cartan subalgebra as Π(x) = i Πi (x)Hi . The canonical equal-time Poisson bracket between Φi and its conjugate momentum Πi is 1 {Πi (x), Φj (y)} = δij δ(x − y) 2 Since we compute equal time Poisson brackets, we will often assume in the following that t = 0, and we do not write the time dependence explicitly. Replacing ∂t Φ(x) by Π(x) in the expression of Ax , we get: A(x) ≡ Ax (x) = Π + eα(Φ) nα (Eα + E−α ) α simple
The Poisson brackets of A takes the familiar r-matrix form. In exactly the same way as for the Toda chain in Chapter 4, we have: Proposition. There exists a matrix r12 ∈ G ⊗G, independent of the fields Π and Φ, such that: {A1 (x), A2 (y)} = [r12 , A1 (x) + A2 (y)]δ(x − y)
(12.25)
We have r12 =
1 2
α positive
Eα ⊗ E−α − E−α ⊗ Eα + λ · C12 (Eα , E−α )
(12.26)
where C12 the tensor Casimir: C12 =
i
Hi ⊗ Hi +
α positive
Eα ⊗ E−α + E−α ⊗ Eα (Eα , E−α )
(12.27)
As explained in Chapter 4, the two values λ = ± 12 of the parameter ± which are solutions of multiplying the tensor Casimir yield matrices r12 the classical Yang–Baxter equation ± ± ± ± ± ± , r13 ] + [r12 , r23 ] + [r13 , r23 ]=0 [r12
455
12.4 Hamiltonian formalism These two solutions, which play an important role below, are: + = r12
1 H i ⊗ Hi + 2 i
− =− r12
α positive
1 Hi ⊗ Hi − 2 i
Eα ⊗ E−α (Eα , E−α )
α positive
E−α ⊗ Eα (Eα , E−α )
(12.28) (12.29)
Recall that the wave function Ψ(x) is the solution of the linear system (∂x − Ax ) Ψ(x) = 0, normalized with the boundary condition Ψ(0) = 1. As explained in Chapter 3, the Poisson brackets of the matrix elements of Ψ(x) can be computed explicitly and take the following form: Proposition. The wave function Ψ(x) satisfies the quadratic bracket relation: ± {Ψ1 (x), Ψ2 (x)} = [r12 , Ψ1 (x)Ψ2 (x)]
(12.30)
+ − We can use either r12 or r12 , since the difference is the tensor Casimir C12 which commutes with Ψ(x) ⊗ Ψ(x).
With each highest weight vector |Λ(r) we associate the two chiral vec(r) tors ξ (r) and ξ by eqs. (12.16). The Poisson brackets between these chiral vectors can be computed and obey what is called an exchange algebra. Since we compute the Poisson brackets at equal time, t = 0, we do not distinguish the notations x+ = x− = x. Proposition. The exchange algebra of the chiral fields reads: (r )
(r )
± {ξ1 (x), ξ2 (y)} = −ξ1 (x)ξ2 (y) r12 (r)
(r)
(r )
(r )
∓ ξ 1 (x)ξ 2 (y) {ξ 1 (x), ξ 2 (y)} = −r12 (r)
(r )
(r) {ξ1 (x), ξ 2 (y)} (r) (r ) {ξ 1 (x), ξ2 (y)}
(r)
= =
(r )
(r) − ξ1 (x) · r12 · ξ 2 (y) (r) (r ) + ξ2 (y) · r12 · ξ 1 (x)
for x>
(12.31)
for x = y for x = y
Proof. We have: (r )
{ξ1 (x), ξ2 (y)} = Λ(r) | ⊗ Λ(r ) |{e−Φ1 (x) Ψ1 (x), e−Φ2 (y) Ψ2 (y)} (r)
Λ(r) | ⊗ Λ(r ) | exp[−Φ1 (x)] exp[−Φ2 (y)] · {Ψ1 (x), Ψ2 (y)} − {Φ1 (x), Ψ2 (y)}Ψ1 (x) − {Ψ1 (x), Φ2 (y)}Ψ2 (y) Thus we need to evaluate the Poisson brackets of the wave function at different points and the Poisson brackets between the wave function Ψ(x)
456
12 The Toda field theories
and the Toda field Φ(y). The latter is equal to: 0 {Φ1 (x), Ψ2 (y)} = −θ(y − x) Ψ2 (y)Ψ−1 2 (x) C12 Ψ2 (x)
where 0 = C12
1 2
(12.32)
Hi ⊗ Hi
i
and θ(x − y) is the step function. To show it, we take the Poisson bracket of (∂y − A(y))Ψ(y) = 0 with Φ(x), getting: [∂y − A2 (y)]{Φ1 (x), Ψ2 (y)} = {Φ1 (x), A2 (y)}Ψ2 (y) 0 δ(x−y). The result The right-hand side is known: {Φ1 (x), A2 (y)} = −C12 follows by integrating this differential equation and imposing the boundary condition expressing the ultralocality of the model {Φ1 (x), Ψ2 (y)} = 0 if x > y. Let us now compute the Poisson brackets of the wave function at different points. For x > y we can write Ψ(x) = Ψ(x, y)Ψ(y), where Ψ(x, y) is the transport matrix from y to x. Using the ultralocality property of the Poisson bracket, we have {Ψ1 (x), Ψ2 (y)} = Ψ1 (x, y){Ψ1 (y), Ψ2 (y)}. Equation (12.30) then yields: −1 {Ψ1 (x), Ψ2 (y)} = Ψ1 (x)Ψ2 (y)[−r12 + Ψ−1 1 (y)Ψ2 (y)r12 Ψ1 (y)Ψ2 (y)] (12.33) Combining this with eq. (12.32), we finally evaluate the Poisson brackets between components of the chiral field ξ(x) as: (r)
(r )
{ξ1 (x), ξ2 (y)} = Λ(r) | ⊗ Λ(r ) | exp[−Φ1 (x)] exp[−Φ2 (y)] 0 ·[−Ψ1 (x)Ψ2 (y) · r12 + Ψ1 (x)Ψ−1 1 (y) · [r12 − C12 ].Ψ1 (y)Ψ2 (y)] ± + . Choosing r12 = r12 , we have In this formula, one can take r12 = r12 + (r ) (r ) 0 1 ⊗ Λ |r12 = 1 ⊗ Λ |C12 , and we see that the last term vanishes. This proves the first of eqs. (12.31) for x > y. The other cases are proved similarly.
12.5 Conformal structure The solutions of the Toda field theory were parametrized by two chiral fields P and P . The transformation which relates the original fields to these chiral fields is highly non-local. There exist, however, remarkable quantities which are local in terms of both sets of fields. The purpose of this section is to describe them and to find their Poisson bracket algebra. As we will see, the Virasoro algebra appears as a subalgebra, so that we are in fact dealing with the conformal symmetry algebra of the theory.
12.5 Conformal structure
457
We consider only one chirality, and recall the linear systems eq. (12.12) and eq. (12.13):
[∂x−
[∂x− − P − E+ ]Q+ = 0 + 2∂x− Φ − E+ ]N− Q+ = 0
(12.34) (12.35)
These equations have exactly the same structure, but one is expressed in terms of P while the other is expressed in terms of ∂x− Φ, the derivative of the original field of the theory. The relation between the two equations is a gauge transformation N− ∈ N− . Notice that the fields ξ (r) (x) are invariant under this gauge transformation because Λ(r) |N− = Λ(r) |. The vector ξ (r) (x) is just the first row of Q+ or N− Q+ . For finite-dimensional highest weight representations of G, one can deduce from the first order linear systems eq. (12.34) and eq. (12.35) a single differential equation of higher order for the components of vector ξ (r) (x), L·ξ =0 (12.36) Since the components of ξ (r) (x) are invariant under the gauge transformation N− , the coefficients of this equation will also be invariant. Moreover, these coefficients are local differential polynomials in terms of P or Φ. These are the local quantities we were looking for. In the following, we shall restrict ourselves to the sl(n + 1) case in the fundamental representation. In this representation, the vectorξ(x) has (n + 1) components, ξ(x) = (ξi (x)), i = 1, . . . , (n + 1) and E+ = i Ei,i+1 , basis of (n + 1) × (n + 1) matrices. Finally whereEij is the canonical P = i Pi Eii with i Pi = 0. Using eq. (12.34) or eq. (12.35) one can write the operator L in eq. (12.36) explicitly, cf. eq. (12.19): L = ∂xn+1 − −
n−1
ui ∂xi −
i=0
with L = (∂x− − Pn+1 ) · · · (∂x− − P1 ),
Pi = 0
(12.37)
i
From eq. (12.35) L admits a similar expression but with P → −2∂x− Φ. It is easy to compute un−1 : 1 un−1 = (P, P ) + (Hρ , ∂x− P ) = 2(∂x− Φ, ∂x− Φ) − 2(Hρ , ∂x2− Φ) 2
(12.38)
458
12 The Toda field theories
where Hρ is the element in H such that [Hρ , E+ ] = E+ , namely in our case Hρ = Diag( n2 , n2 − 1, . . . , − n2 ), yielding (Hρ , ∂x− P ) = n∂x− P1 + (n − 1)∂x− P2 + · · · + ∂x− Pn Using the Toda equations of motion to eliminate the higher order time derivatives of Φ, we also have un−1 = H − P, where: 1 1 H = (Π, Π) + (Φx , Φx ) + e2α(Φ) − (Hρ , Φxx ) 2 2 α simple
P = (Π, Φx ) − (Hρ , Πx ) These are the energy and momentum densities respectively. The function un−1 in eq. (12.38) is the generalization of the potential u introduced in the Liouville theory. We now compute its Poisson brackets. As before we set t = 0 and identify x− = x. 0 = 1 Proposition. Let C12 i Hi ⊗ Hi . Then 2 0 {P1 (x), P2 (y)} = (∂x − ∂y )δ(x − y)C12 (12.39) 3 {un−1 (x), un−1 (y)} = un−1 ∂x + ∂x un−1 − (Hρ , Hρ )∂x δ(x − y)
We recognize the standard Virasoro algebra, see Chapter 11. The value of the central charge is (Hρ , Hρ ) = n(n + 1)(n + 2)/12. Proof. We have to compute the Poisson brackets of P . To do this we start from the exchange algebra, and remark that the scalar product ξ (r) (x)|Λ(r) satisfies: 6
7 log ξ (r) (x)|Λ(r) , log ξ (r ) (y)|Λ(r ) 1 (r) Λ (Hi )Λ(r ) (Hi ) (x − y) (12.40) =− 2 i
where (x) is the sign of x. Note also that eq. (12.34) implies that x
ξ (r) (x)|Λ(r) = e
Λ(r) (P )
θ
where θ is some integration constant. Equation (12.39) follows by differentiating eq. (12.40) with respect to x and y and using the independence of the fundamental weights. The Poisson bracket for un−1 then follows readily, either from the expression of un−1 in terms of P or in terms of Φ and Π. We now explore the Poisson algebra of the other invariant coefficients ui in eq. (12.37). Since the quantity un−1 is the generator of the conformal
12.5 Conformal structure
459
symmetry underlying the theory, we call this Poisson bracket algebra an extended conformal algebra. Comparing with Chapter 10, we know that the Poisson brackets of the ui coincides with the second Hamiltonian structure of the generalized KdV equation. We are going to rederive this result starting from the exchange algebra eq. (12.31). In the course of this computation we will exhibit the extended conformal properties of the fields ξi . Any component ξi (x), i = 1, . . . , n + 1, of the vector ξ(x) obeys the differential equation L·ξi = 0 with L a differential operator of order n+1. Once we know the functions ξi (x), the operator L can be reconstructed as: ξ ξ1 · · · ξn+1 ξ1 · · · ξn+1 ξ L · ξ = det (12.41) .. . . . =0 .. . . (n+1) (n+1) · · · ξn+1 ξ (n+1) ξ1 It follows that the coefficients ui of L are given by Wronskian type expressions of the ξi , and we can directly calculate their Poisson brackets knowing those of the ξi . To express the result conveniently we need to recall some notations about pseudo-differential operators, see Chapter 10. We have introduced the derivation symbol ∂ = ∂x (here identified with ∂x− ), with the usual Leibnitz rule ∂.a = a.∂ + (∂a), where (∂a) means ∂x a(x), and the integration symbol ∂ −1 with the following computational rules: ∞ i+v −1 −1 −i−1 v ∂ ∂ = ∂∂ = 1, ∂ (∂ v f )∂ −i−1−v(12.42) f= (−1) v v=0
i The elements A = N −∞ ai ∂ form an associative algebra with unit, called the algebra of formal pseudo-differential operators in one variable. It is equipped with a linear form satisfying the fundamental trace property AB = BA , called the Adler trace, and defined by: A = dxa−1 With these notations we have: Proposition. Let X = i ∂ −i−1 Xi and let fX (L) = LX . Then one has: 1 −1 {fX (L), ξi (x)} = (XL)+ + ∂ ([X, L]−1 ) ξi (x) (12.43) n+1
460
12 The Toda field theories
Proof. Using eq. (12.42) we find: (XL)+ =
k−i−1
(−1)
k−i−s
k>i≥0 s=0
k−1−s i
∂ k−i−s−1 (Xi uk )∂ s
and [X, L]−1 = −
k+i
(−1)
k>i≥0
k ∂ k−i (Xi uk ) i
So we have to prove that: 1 k k+i ∂ k−i−1 (Xi uk )ξq (x) (−1) {fX (L), ξq (x)} = − i (n + 1) +
k−i−1
(−1)
k+i+s
k>i≥0 s=0
k>i≥0
k−s−1 i
∂ k−i−s−1 (Xi uk )∂ s ξq
(12.44)
From eq. (12.41), the coefficients ui in the operator L are given by ui = (−1)i+1 det Mi , where Mi is the matrix of elements 5 a
(Mi )ab = ∂ ξb
with
a = 0, 1, ..., n + 1, a = i b = 1, ..., n + 1
Let [∆i (x)]kl with k = 0, . . . , n + 1, k = i and l = 1, . . . , n + 1 be the minors of the matrix Mi . By using the Leibnitz rules for the Poisson brackets, we obtain {fX (L), ξq (x)} = − dzXi (z)(−1)i [∆i (z)]kl ∂zk {ξl (z), ξq (x)} (12.45) where all repeated indices are summed over. The Poisson brackets of the chiral fields were obtained in eq. (12.31): − + − {ξ1 (z), ξ2 (x)} = −ξ1 (z)ξ2 (x)r12 − θ(z − x)ξ1 (z)ξ2 (x)(r12 − r12 )
We have thus two contributions to eq. (12.45). The contribution of the term independent of θ(z − x) is
− dzXi (z)(−1)i [∆i (z)]kl ∂ k ξm (z)ξn (x)rmn,lq
12.5 Conformal structure
461
Remembering that k=i [∆i (z)]kl ∂ k ξm (z) = (det Mi ) δlm , we see that the − = above expression vanishes since the r-matrices are such that: Tr1 r12 − − Tr2 r12 = Tr12 r12 = 0. Calculating the θ(z − x) dependent term, we get k Xi (z)(−1)i [∆i (z)]kl ∂za ξm (z)ξn (x) dz {fX (L), ξq (x)} = a a=k
·(r+ − r− )mn,lq ∂zk−a θ(z − x) The term a = k in the sum may be excluded, again due to the trace ± properties of the matrices r12 . To evaluate this expression further we + − remark that r12 −r12 = C12 , where C12 is the Casimir element. To evaluate C12 for two vector representations of sln+1 we first compute it in gln+1 according to eq. (12.27). Here the Eα are the Eab for a < b, and E−α = E ba , while Hi = Eii , so we get for C12 the permutation operator P12 = Eab ⊗ Eba , or Pij,kl = δil δjk . We restrict this to sln+1 by requiring that 1 1 + P12 . Using this the partial traces of C12 vanish, yielding C12 = − n+1 result, we finally write the Poisson brackets (12.45) as: {fX (A), ξq (x)} =
1 Uq (x) − Vq (x) (n + 1)
where Uq (x) comes from the identity factor in C12 and Vq (x) from the factor P12 . k i ∂za ξl (z)ξq (x)∂zk−a−1 δ(z − x) Uq (x) = dzXi (z)(−1) [∆i (z)]kl a a=k k Vq (x) = ∂za ξq (z)ξl (x)∂zk−a−1 δ(z − x) dzXi (z)(−1)i [∆i (z)]kl a a=k
The Uq (x) term is easy to deal with. Noticing that, for k = i and a ≤ n+1, we have [∆i (z)]kl ∂ a ξl (z) = (det Mi ) δka + (−1)k+i+1 (det Mk ) δia (12.46) l
we get Uq (x) = −
k>i
(−1)k+i
k ∂ k−i−1 (Xi uk )ξq (x) i
After integrating over z the expression of Vq (x) becomes k i+k+a+1 Vq (x) = (−1) ∂ k−a−1 (Xi [∆i ]kl ∂ a ξq )ξl (x) a a=k
(12.47)
462
12 The Toda field theories
The identity
(∂ c A)B
=
c b=0
(−1)b
c ∂ c−b (A∂ b B) allows us to rewrite d
this as k−a−1 k k−a−1 k+a+i+b+1 (−1) Vq (x) = a b a=k
b=0
×∂ k−a−b−1 (Xi [∆i ]kl ∂ a ξq ∂ b ξl ) Suming over l using eq. (12.46) and performing the derivatives, we obtain k−1−s k+i (−1) Vq (x) = − i s≥0 9 : k k − 1 − a ∂ k−i−1−s (Xi uk )∂ s ξq (−1)a × a k−1−s a
The sum over a in the curly bracket is equal to (−1)s , so that finally k−s−1 k+i+s ∂ k−i−s−1 (Xi uk )∂ s ξq (−1) Vq (x) = − i s≥0
which ends the proof. It is now very easy to derive the Poisson bracket algebra of the coefficient uk : Proposition. The Poisson brackets of the uk reproduce the second Poisson bracket structure of the KdV hierarchy: {fX (L), fY (L)} = (LX)+ (LY )− − (XL)+ (Y L)− (12.48) 1 − dx(∂ −1 [X, L]−1 )[Y, L]−1 n+1 Proof. We start from Lξi = 0 , so that {fX (L), L}ξi + L{fX (L), ξi } = 0, or equivalently 1 −1 {fX (L), L}ξi + L (XL)+ + ∂ [X, L]−1 ξi = 0 n+1 Next, using again the differential equation for ξ, we rewrite this equation as: 6 {fX (L), L} + L(XL)+ − (LX)+ L (12.49)
7 1 (L∂ −1 [X, L]−1 − ∂ −1 [X, L]−1 L) ξi = 0 + n+1
12.6 Dressing transformations
463
Because L(XL)+ − (LX)+ L = (LX)− L − L(XL)− , the differential operator in the left-hand side of eq. (12.49) is of order at most n. Since it annihilates the (n + 1) linearly independent functions ξi , it is identically zero. The result then follows by multiplying eq. (12.49) by Y and taking the Adler trace. Remark. Equation (12.43) describes the conformal properties of the field ξi . To extract it, let us apply it with X = ∂ −n Xn−1 , so that fX (L) = 2π − 0 dxXn−1 (x)un−1 (x). We then get
{fX (L), ξi (x)} = Xn−1 ∂ − ∆(1) Xn+1 ξi (x) with ∆(1) = n2 = Λ(1) (Hρ ), where Λ(1) is the highest weight of the vector representation. This shows that ξi (x) is a differential of weight −n/2. Using ξ (r) = ξ (1) ∧· · ·∧∂ (r−1) ξ (1) , we see that ξ (r) is a differential of weight −∆(r) = −Λ(r) (Hρ ) = −(r − 1)(n − r)/2. In fact, each ξ (1) has conformal weight −n/2 and each ∂ has weight 1.
12.6 Dressing transformations We now describe the group of dressing transformations in Toda field theories. In Chapter 3, dressing transformations were introduced as special gauge transformations preserving the form of the Lax connection. On the wave function they read (see eq. (3.98) in Chapter 3): −1 −1 Ψ → Ψg = Θ+ Ψg+ = Θ− Ψg− ,
−1 −1 Θ−1 (12.50) − Θ+ = Ψg− g+ Ψ
Here g is an element of G = exp G, and g± , Θ± are obtained by solving a suitable factorization problem in G which we now explain. The last condition in eq. (12.50) ensures that the two gauge transformations by Θ± produce the same gauge transformed connection. We use this condition to determine Θ± by a factorization problem in G. The right factors, g± , play no role in the transformation of the connection, but are essential for understanding the structure of the dressing group. Proposition. Let g ∈ G = exp G. Define a factorization problem −1 g = g− g+
(12.51)
by requiring that g± ∈ exp B± , and that the components of g+ and g− on exp(H) are inverse to each other. Then the gauge transformations Θ± in eq. (12.50) preserve the form of the Toda connection. Proof. We want the gauge transformed connection Agx± = (∂x± Ψg ) Ψg −1 to be of the form: g Agx± = ±∂x± Φg + e∓ad Φ E∓
464
12 The Toda field theories
for some field Φg ∈ H. If this is the case, the gauge transformation acts on the Toda field Φ. We have: −1 Agµ = ∂µ Θ± Θ−1 ± + Θ± Aµ Θ±
with Ax± = ±∂x± Φ + e∓ad Φ E∓
First, one has to see that Agx− decomposes on elements of height (0, 1) of the Lie algebra G, and Ax+ on elements of height (0, −1). The choice Θ± ∈ exp B± precisely ensures this condition. To show it, we use the fact that one can perform the gauge transformation using either Θ+ or Θ− . Using Θ+ we find that Agx− contains only positive heights, while using Θ− we see that its maximal height is 1, so finally Agx− decomposes on elements of height (0, 1). Similarly, Agx+ decomposes on elements of height (0, −1). Let us write Θ+ = e∆+ M+ , Θ− = e∆− M− ,
∆+ ∈ H, M+ ∈ exp N+ ∆− ∈ H, M− ∈ exp N−
Performing the gauge transformation Θ+ we have:
Agx− = ∂x− (e∆+ M+ )M+−1 e−∆+ + e∆+ M+ −∂x− Φ + ead Φ E+ M+−1 e−∆+ Projecting this equation on H, we obtain: ∂x− Φg = ∂x− Φ − ∂x− ∆+ Now performing the gauge transformation with Θ− , we get:
Agx− = ∂x− (e∆− M− )M−−1 e−∆− + e∆− M− −∂x− Φ + ead Φ E+ M−−1 e−∆− g
Comparing the terms of height +1 we get ead Φ E+ = ead (Φ+∆− ) E+ , so we must have αi (Φg ) = αi (Φ + ∆− ) for all αi simple, which implies Φ g = Φ + ∆−
(12.52)
The same analysis performed with Ax+ gives ∂x+ Φg = ∂x+ Φ + ∂x+ ∆− and Φg = Φ − ∆+ . We see that these four equations are compatible if and only if ∆− = −∆+ i.e if Θ+ and Θ− have opposite components on exp H.
12.6 Dressing transformations
465
Proposition. The induced action on the chiral fields ξ(x+ ) and ξ(x− ) is: −1 ξ g (x+ ) = ξ(x+ ) g− ,
g
ξ (x− ) = g+ ξ(x− )
(12.53)
Note that the factors g± of g act separately on the two chiral sectors. The dressed Toda field is: e−2Λ
(r) (Φg (x,t))
−1 = ξ (r) (x− ) g− · g+ ξ¯(r) (x+ )
Proof. We have −1 ξ g = Λ|(e−Φ Ψg ) = e−Λ(Φ ) Λ| (Θ− Ψ) g− −1 g −1 = e−Λ(Φ −Φ−∆− ) Λ| e−Φ Ψ g− = ξ · g− g
g
(12.54)
In eq. (12.54) we used eq. (12.52) and the properties of the highest weight vector Λ|. The transformation law for ξ is found in a similar way. −1 Proposition. Let g = g− g+ and h = h−1 − h+ in G. Their multiplication in the dressing group is:
(g+ , g− ) • (h+ , h− ) = (g+ h+ , g− h− )
(12.55)
In particular, the plus and minus components commute. Proof. This follows from the general statement in Chapter 3 but is also obvious from the action on the chiral fields ξ, ξ. We denote by G∗ the new group equipped with this multiplication law. For infinitesimal transformations, g 1 + X, and g± 1 + X± with X = X+ − X− , the composition law corresponds to: [X, Y ]R = [X+ , Y+ ] − [X− , Y− ]
(12.56)
This defines a new Lie algebra G ∗ which is the Lie algebra of G∗ . As in Chapter 4, we introduce the operators R± such that X± = R± X. Then we have: + − X2 ), R− X = Tr2 (r12 X2 ) (12.57) R+ X = Tr2 (r12 ± are the matrices in eqs. (12.28, 12.29). This exhibits the deep where r12 relation between the group of dressing transformations and the r-matrix formalism. Using the action of dressing transformations on the chiral fields ξ(x+ ) and ξ(x− ), it is easy to understand the Poisson–Lie group nature of the
466
12 The Toda field theories
group of dressing transformations, see Chapter 14. Indeed, the exchange algebra ± {ξ1 (x), ξ2 (y)} = −ξ1 (x)ξ2 (y)r12 for x>
Proposition. Let G∗ be equipped with the Poisson brackets: {(g+ )1 , (g+ )2 }G∗ {(g− )1 , (g− )2 }G∗ {(g− )1 , (g+ )2 }G∗ {(g+ )1 , (g− )2 }G∗
= = = =
± −[r12 , (g+ )1 (g+ )2 ] ∓ −[r12 , (g− )1 (g− )2 ] − −[r12 , (g− )1 (g+ )2 ] + −[r12 , (g+ )1 (g− )2 ]
−1 ¯ ξ → g+ ξ¯ leaves the exchange algebra (12.31) then the action ξ → ξg− invariant. −1 Proof. Consider the action of g− on ξ: ξ g = ξg− . Introduce the Poisson −1 brackets on g− : −1 −1 ± −1 −1 )1 , (g− )2 } = [−r12 , (g− )1 (g− )2 ] {(g−
We view ξ g as a function on the product of the group G∗ by the phase space M equipped with the product Poisson structure, i.e. −1 )2 } = 0 {ξ1 (x), (g−
Then we have: −1 −1 )1 (g− )2 {ξ1g (x), ξ2g (y)}G∗ ×M = {ξ1 (x), ξ2 (y)}M (g−
−1 −1 +ξ1 (x)ξ2 (y){(g− )1 , (g− )2 }G∗ / ± −1 0 −1 −1 −1 = ξ1 (x)ξ2 (y) r12 (g− )1 (g− )2 + {(g− )1 , (g− )2 }G∗
± = ξ1g (x)ξ2g (y)r12
The other cases are treated similarly. −1 Note that for the factorized element, g = g− g+ , the Poisson bracket on G∗ reads: + − ± ∓ {g1 , g2 }G∗ = −g1 r12 g2 − g2 r12 g1 + g1 g2 r12 + r12 g1 g2
Since dressing transformations are not symplectic transformations, they are not generated by scalar Hamiltonians. However, since they are Lie– Poisson actions, there exists a non-Abelian analogue of the moment map.
12.7 The affine sinh-Gordon model
467
In other words, there exists a non-Abelian Hamiltonian generating the dressing transformations. If we consider the system on the finite interval of length with periodic boundary conditions, then Ψ(x + ) = Ψ(x) · T , where T ≡ Ψ() is the ± monodromy matrix. The Poisson brackets of T are: {T1 , T2 } = [r12 , T1 T2 ]. The non-Abelian Hamiltonian of dressing transformations turns out to be the monodromy matrix T . This means that the variations of the chiral fields under infinitesimal dressing transformations are given by the following Poisson brackets: Proposition. The monodromy matrix generates dressing transformations: δX ξ(x) = −ξ(x)X− = Tr2 X2 T2−1 {ξ1 (x), T2 } δX ξ(x) = X+ ξ(x) = Tr2 X2 T2−1 {ξ 1 (x), T2 } (12.58) Proof. We have to compute the Poisson bracket of the monodromy matrix and the chiral fields. Using eqs. (12.32, 12.33) we have: ± {T1 , e−Φ2 (x) Ψ2 (x)} = −T1 e−Φ2 (x) Ψ2 (x)r12 −Φ2 (x) ± 0 + T1 Ψ−1 (r12 − C12 )Ψ1 (x)Ψ2 (x) 1 (x)e ± , we see that the Projecting on 1 ⊗ Λ(r) | and choosing the plus sign in r12 + second term cancels, yielding {T1 , ξ2 (x)} = −T1 ξ2 r12 , or equivalently: − {ξ1 (x), T2 } = −ξ1 (x)T2 r12
The first part of the proposition then follows from eq. (12.57). The second part is proved similarly. We see that Toda field theories provide a particularly simple example of the action of dressing transformations and non-Abelian Hamiltonians. 12.7 The affine sinh-Gordon model We now show that the sinh-Gordon theory belongs to the general class (2 . The of Toda field theories. It is associated with the affine Lie algebra sl usual formula for the Lax connection is written in the loop representation of this algebra, with vanishing central charge. In the following we will consider non-zero central charge. The benefit will be that all the algebraic structure of Toda theories then applies to the sinh-Gordon theory as well. In particular, the existence of highest weight representations allows us to disentangle the action of the group of dressing transformations. As we
468
12 The Toda field theories
will see, this leads directly to the N -soliton formula in terms of vertex operators. (2 admits the Cartan decomposition: sl (2 = N − ⊕ H⊕ The affine algebra sl ( N+ . In the principal gradation, a basis of sl2 is given by (see Chapter 16): = {H, d, K} H + = {E (2n−1) = λ2n−1 E+ , E (2n−1) = λ2n−1 E− , H (2n) = λ2n H, n > 0} N + − − = {E (2n+1) = λ2n+1 E+ , E (2n+1) = λ2n+1 E− , H (2n) = λ2n H, n < 0} N + − (12.59) In particular, the simple root vectors can be taken as E±α1 = λ±1 E± and E±α2 = λ±1 E∓ . The commutation relations are: H (r) , H (s) = Kr δr+s,0 (s) (r+s) H (r) , E± = ±2E± K (r) (s) E+ , E− = H (r+s) + rδr+s,0 2 Following the general construction of Toda field theories, we define E± = E±α1 + E±α2 = λ±1 (E+ + E− ) The connection Ax± is given by Ax± = ±∂x± Φ + me∓adΦ E∓
(12.60)
(2 . Let us of sl where the field Φ takes values in the Cartan subalgebra H decompose it on the generators H, d and K of H: 1 1 Φ= H ϕ+d η+ K ζ 2 4 The zero curvature condition, ∂x+ Ax− − ∂x− Ax+ − [Ax+ , Ax− ] = 0, can (2 . It gives be worked out using only the Lie algebra structure of sl ∂x+ ∂x− ϕ = m2 e2η (e2ϕ − e−2ϕ ) ∂x+ ∂x− η = 0
(12.61) (12.62)
∂x+ ∂x− ζ = m2 e2η (e2ϕ + e−2ϕ )
(12.63)
Thanks to the field η, the above equations are conformally invariant. This is in contrast with the sinh-Gordon equation which is not conformally
12.7 The affine sinh-Gordon model
469
invariant. In fact, performing a change of coordinates x+ = f (x+ ) and x− = g(x− ), the equations are invariant if we redefine the fields by: ϕ (x+ , x− ) = ϕ(f (x+ ), g(x− )) ζ (x+ , x− ) = ζ(f (x+ ), g(x− )) η (x+ , x− ) = η(f (x+ ), g(x− )) + log (∂f ∂g) There are two real forms of eqs. (12.61–12.63). One is when ϕ, ζ and η are all real, and the other one is when ϕ is pure imaginary and ζ, η are real. These two forms correspond to the sinh-Gordon and sine-Gordon case respectively (when one sets η = 0, which can be done consistently and decouples the equation for ϕ). In a loop representation, K = 0, the field ζ and its equation of motion, eq. (12.63), disappear. Setting η = 0, we are left with the standard sinhGordon equation. Let us write the Lax connection in this case: 1 ∂t ϕ m(λeϕ + λ−1 e−ϕ ) 2 Ax = (12.64) − 12 ∂t ϕ m(λe−ϕ + λ−1 eϕ ) 1 ∂x ϕ −m(λeϕ − λ−1 e−ϕ ) 2 At = (12.65) − 12 ∂x ϕ −m(λe−ϕ − λ−1 eϕ ) The benefit of having extended the loop algebra to the full affine algebra is that we have now at our disposal highest weight representations and the general structures of Toda field theories can be applied straightforwardly, provided that the action of “group” elements on the highest weight vector is defined. In the following we shall restrict ourselves to integrable highest weights and work freely with formal expressions. Final formulae will be (2 , where they will evaluated explicitly in the level 1 representations of sl be seen to make sense. As usual with Toda field theories, with a highest weight vector |Λ one ¯ t) defined by : associates two sets of fields ξ(x, t) and ξ(x, ξ(x, t) = Λ|e−Φ Ψ(x, t),
ξ(x, t) = Ψ−1 (x, t)e−Φ |Λ
(12.66)
These fields are chiral: ∂x+ ξ = 0 and ∂x− ξ = 0. For any highest weight Λ we can reconstruct Λ(Φ) by the formula: ¯ +) exp (−2Λ(Φ)) = ξ(x− ) · ξ(x (2 algebra has two fundamental highest weights, which we The affine sl shall denote by Λ− and Λ+ , see Chapter 16. They are characterized by Λ± (H) = ± 12 , Λ± (K) = 1 and Λ± (d) = 0, so that we have Λ± (Φ) = 1 4 (±ϕ + ζ). This is enough to reconstruct the fields ϕ and ζ. The field
470
12 The Toda field theories
η cannot be obtained by these highest weight projections. This is not a problem if we are interested in the sinh-Gordon model, since then the free field η is equal to 0. Defining the tau-functions: τ± = exp −2Λ± (Φ) we have: e−ϕ =
τ+ τ−
and e−ζ = τ+ τ−
(12.67)
In terms of the tau-functions, the equations of motion take the form: τ± (∂x− ∂x+ τ± ) − (∂x+ τ± )(∂x− τ± ) = −m2 e2η τ∓2
(12.68)
When η = 0, this is just the Hirota bilinear form of the sinh-Gordon equation. ¯ + ) for the simplest We now describe the chiral fields ξ(x− ) and ξ(x solution, the vacuum solution of eq. (12.61): ϕvac = 0,
ζvac = 2m2 x+ x−
ηvac = 0,
One can insert this solution into the linear system and compute the vacuum wave function Ψvac (x, t). We have Φvac = 12 m2 x+ x− K and the linear system becomes: 1 (∂x+ − m2 x− K − mE− )Ψ = 0, 2
1 (∂x− + m2 x+ K − mE+ )Ψ = 0 2 1
˜ − ). The first equation is readily solved by Ψ = e 2 m x+ x− K emx+ E− Ψ(x Inserting this into the second equation, and using [E+ , E− ] = K, which ˜ = 0. implies exp (−mx+ adE− )E+ = E+ + mx+ K, one gets (∂x− − mE+ )Ψ We finally obtain: m2 x+ x− K 2
m2 x+ x − K 2
emx− E+ emx+ E− (12.69) The two expressions are equal thanks to the Campbell–Haussdorf formula. One can then compute the chiral fields: Ψvac (x, t) = e
emx+ E− emx− E+ = e−
2
ξvac (x− ) = Λ|emx− E+ ,
ξ vac (x+ ) = e−mx+ E− |Λ
(12.70)
The reconstruction formula for Φ reads
exp (−2Λ(Φ)) = ξvac (x− ).ξ¯vac (x+ ) = exp −m2 x+ x− Λ(K)
as it should be. The tau-functions of the vacuum solution are: τ+ = τ− = τ0 = exp[−m2 x+ x− ]
(12.71)
12.8 Dressing transformations and soliton solutions
471
Remark. We can embed the vacuum equations of motion ∂x+ ∂x− ζvac = 2m2 into (r)
a larger hierarchy. Introduce the variables x± for r odd, and consider the following connection, generalizing eqs. (12.60) evaluated at Φvac , 1 (r) Avac (r) = ± ∂ (r) ζ K + mE∓ x± 4 x± where E± = λ±r (E+ + E− ), with r odd. Since [E+ , E− ] = Krδrs , the zero curvature (r) (r) condition reduces to ∂x(r) ∂x(s) ζvac = 2m2 rδrs . A solution is ζvac = 2m2 r r x+ x− . (r)
(r)
+
(s)
−
(r)
(r)
In the same way as before, we can calculate Ψvac (x+ , x− ) (r)
2
− m2
(r)
Ψvac (x+ , x− ) = e
= e
m2 2
r
r
(r) (r)
rx+ x− (r) (r)
rx+ x−
K m
e
K m
e
r
r
(r)
(r)
x− E+ (r)
(r)
x+ E−
em
em
r
r
(r)
(r)
x+ E− (r)
(r)
x− E+
The vacuum chiral fields are thus: (r)
ξvac (x− ) = Λ|em
r
(r)
(r)
x− E +
,
ξ vac (x+ ) = e−m (r)
r
(r)
(r)
x+ E−
|Λ
(12.72)
(r)
The x± are the collection of elementary times defining the sinh-Gordon hierarchy, see Chapter 3. Each chirality is attached separately to the poles at λ = 0, ∞ in the loop representation.
12.8 Dressing transformations and soliton solutions As shown in eq. (12.53), dressing transformation act on the chiral fields ξ and ξ¯ by: −1 , ξ g (x− ) = ξ(x− ) · g−
g
ξ (x+ ) = g+ · ξ(x+ )
(2 . The dressing group In our case, we start from an element g ∈ exp sl element (g+ , g− ) is defined by the factorization problem −1 g = g− g+
± ) with g± ∈ B± = (exp H)(exp N
and, moreover, we require that g− and g+ have inverse components on the Cartan torus. −1 g+ , we When we dress the vacuum solution with an element g = g− g get a new solution Φ such that: e−2Λ(Φ
g)
= Λ|emx− E+ g e−mx+ E− |Λ
To construct new solutions of the sinh-Gordon equation, one has to choose particular elements g of the affine group. Remarkable elements can be constructed with vertex operators in the (2 algebra. Let us recall here two level one representations of the affine sl
472
12 The Toda field theories
the main facts, and refer to Chapter 16 for more details. One introduces bosonic oscillators pn for n odd, such that [pm , pn ] = mδn+m,0 and p†n = p−n . They generate a Fock space over the vacuum state |0 which is specified by pn |0 = 0 for n > 0. Let Z(λ) be the generating function: √ λn Z(λ) = −i 2 p−n n n odd
The level one vertex operator representations with highest weight Λ± of (2 are then obtained by setting: the Lie algebra sl n
i d n n λ−n (E+ + E− ) = √ λ Z(λ) 2 dλ odd
λ−n H n +
n even
n n λ−n (E+ − E− ) = ± V (λ)
(12.73)
n odd
where V (λ) denotes the vertex operator: V (λ) =
1 2
√
: e−i
2Z(λ)
:=
1 2
√
: ei
2Z(−λ)
:
The double dots means that the expressions are normal ordered, i.e. we write all pn with n > 0 to the right. The representations with highest weight Λ± correspond to the plus and minus sign respectively, in eq. (12.73). Recall the formula, (see eq. (16.25)): λ−µ 2 V (λ)V (µ) = : V (λ)V (µ) : |λ| > |µ| λ+µ This means that inside expectation values V 2 (µ) = 0. So there is an (2 such that ρΛ+ (g) = ρΛ+ (exp aV (µ)) = 1 + aV (µ). element g ∈ exp sl In the representation ρΛ− we have ρΛ− (g) = 1 − aV (µ) due to the sign change in eq. (12.73). More generally, we consider the product of such elements g = g1 g2 · · · gN . We have: ρΛ± (g) = (1 ± 2a1 V (µ1 ))(1 ± 2a2 V (µ2 )) · · · (1 ± 2aN V (µN )) (N )
Proposition. Let τ±
be the tau-functions:
τ± (x± ) = Λ± |emx− E+ (N )
N (1 ± 2ai V (µi ))e−mx+ E− |Λ± i=1
(12.74)
12.8 Dressing transformations and soliton solutions
473
Then we have: τ± =1+ (±)p τ0 (N )
N
p=1
k1
with
X k1 · · · X kp
µki − µkj 2 (12.75) µki + µkj
ki
Xi = ai exp −2m(µi x− + µ−1 i x+ )
and τ0 = exp (−m2 x+ x− ). The parameters µi are interpreted as the rapidities of the solitons and ai are related to their positions. Proof. The commutation relations between E± and V (λ) are [E± , V (λ)] = −2λ±1 V (λ) see eq. (16.29) in Chapter 16. They imply that ∓1
emx± E∓ V (µi )e−mx± E∓ = e−2mµi
x±
V (µi )
Commuting emx− E+ to the right, and e−mx+ E− to the left amounts to the replacement ai → Xi and, moreover, produces the factor τ0 since [E+ , E− ] = K, so that we get: (N )
τ±
= τ0 Λ± |
N (1 + 2Xi V (µi ))|Λ± i=1
The explicit formula (12.75) then follows from Wick’s theorem applied to the vertex operators V (µi ). Remark. One can incorporate the whole hierarchy of times, as in eq. (12.72), to n n this expression. Using the commutation relations [E+ + E− , V (λ)] = −2λn V (λ), Xi is replaced by s (s) −s (s) Xi = ai exp −2m (µi x− + µi x+ ) and τ0 by exp (−m2
s (s) (s)
s sx+ x− ).
The N -soliton tau-functions can also be written as the determinant of an N × N matrix: √ (N ) µi µj τ± = det(1 ± V) with Vij = 2 Xi Xj (12.76) τ0 µi + µj This is proved by using the Cauchy identity: √ µi µj µi − µj 2 = det 2 µi + µj µi + µj i<j
474
12 The Toda field theories
and the expansion formula eq. (9.47) in Chapter 9. For the sine-Gordon equation, the parameters ai , µi in eq. (12.74) satisfy specific reality conditions. Replacing ϕ by iϕ in eq. (12.67), the onesoliton solution reads e−iϕ =
τ+ 1+X = , τ− 1−X
X = ae−2m((µ+µ
−1 )x−(µ−µ−1 )t)
For this to be of modulus 1, one takes a = i eγ with γ real and = ±1. At fixed t, when x goes from −∞ to +∞, ϕ goes from − π to 0. The topological charge is defined as Q=
1 (ϕ(+∞) − ϕ(−∞)) π
Solitons, with charge +1, correspond to = +1 and antisolitons, with charge −1, to = −1. The “breathers” correspond to pairs of complex conjugated rapidities ¯ In this case we have: ¯) and positions (X1 = X, X2 = −X). (µ1 = µ, µ2 = µ
2
2 Im µ 2 1 + X1 + X2 + µµ11 −µ 1 + 2i Im X + X X |X|2 1 2 +µ2 Re µ −iϕ = = e
2
2 µ1 −µ2 µ 1 − X1 − X2 + µ1 +µ2 X1 X2 1 − 2i Im X + Im |X|2 Re µ This shows that ϕ(x = −∞) = ϕ(+∞) = 0, and the topological charge is 0. Moreover, if we perform a Lorentz boost to go to the rest frame where |µ| = 1, we see that ϕ is periodic in time, hence the denomination of this solution. 12.9 N -soliton dynamics The space of N -soliton solutions forms a finite-dimensional submanifold in the infinite-dimensional phase space of sine-Gordon, on which the symplectic form of sine-Gordon can be restricted. This yields a finitedimensional symplectic manifold. The soliton solutions then become a relativistically invariant N -body problem first introduced by Ruijsenaars, which is a relativistic generalization of the Calogero model. Consider the sine–Gordon model with equation of motion: ∂x+ ∂x− ϕ = 2m2 sin(2ϕ) The natural symplectic form is the canonical one: +∞ ΩSG = dx δπ(x) ∧ δϕ(x) −∞
(12.77)
(12.78)
12.9 N -soliton dynamics
475
where π(x) = ∂t ϕ(x) is the momentum conjugate to the field ϕ(x), and δ denotes the differential on the phase space. The N -soliton solution is + given by e−iϕ = ττ− , where τ± are given by eq. (12.75). These solutions depend on 2N parameters ai , µi which provide coordinates on the N soliton submanifold. We wish to compute the restriction of the symplectic form eq. (12.78) on this 2N -dimensional manifold. We present the proofs in the soliton case. This means that the µi are real, and the ai are pure imaginary. The analysis for the antisolitons is the same and for breathers some technicalities must be added. We now express the restriction of ΩSG on the N -soliton submanifold. Proposition. In the coordinates ai and µi , the restriction of ΩSG is given by: N 4µi µj δµi δµj δai δµi ω=2 ∧ +2 ∧ (12.79) ai µi µi µj µ2i − µ2j i=1
i<j
Proof. Consider first the one-soliton solution. In this case, the computation can be done directly using the formula e−iϕ = (1 + X)/(1 − X), ˙ which gives π = 2iX/(1 − X 2 ). One finds: +∞ 1 δµ δa · = −2 dx∂x ΩSG ∧ (12.80) 2 1−X µ a restricted −∞ Since X is pure imaginary, and X(+∞) = 0 and X(−∞) = ∞, the integral evaluates to the numerical constant −2. So the formula is proved for one soliton. Consider next the general case with an arbitrary number of solitons with parameters (µi , ai ). Since the symplectic form can be computed at any time, we evaluate it at t → ±∞. In these limits the sine-Gordon field becomes asymptotically equal to the sum of one-soliton in out out solutions with parameters (µin i , ai ) and (µi , ai ): out µin = µi i = µi in µi − µj 2 aiout = ai µi + µj < |µj | > |µi |
These formulae are obtained as follows. Let us look at the quantities Xi = ai exp(−2m(µi x− + µ−1 i x+ )) entering the definition of the N soliton solution. When t is large, Xi is of order 1 when x is located µi −µ−1 i t. The values µi +µ−1 i 4m(µ2j −µ2i )t/µi µj (µi +µ−1 i )
at x ∼ xi = Xj ∼ aj e
of the other Xj when x ∼ xi are
. Hence for large positive time, if µ2j > µ2i ,
476
12 The Toda field theories
X_j(x_i, t) is very large, while if μ_j² < μ_i², X_j(x_i, t) is very small. So we can split the N solitons into two subsets: one, I_+, is such that μ_j² > μ_i², and the other one, I_−, is such that μ_j² < μ_i². We can evaluate the tau-functions, eq. (12.76), when x ∼ x_i. One has to keep the terms containing the maximum number of X_j, j ∈ I_+. These terms read:

$$\tau_\pm(x_i,t)\sim\prod_{j\in I_+}X_j\prod_{k<l;\,k,l\in I_+}\Big(\frac{\mu_k-\mu_l}{\mu_k+\mu_l}\Big)^2\left(1\pm X_i\prod_{j\in I_+}\Big(\frac{\mu_i-\mu_j}{\mu_i+\mu_j}\Big)^2\right)$$

where we need to keep the second term because X_i ∼ 1. So we have

$$e^{-i\varphi}=\frac{\tau_+}{\tau_-}=\frac{1+X_i\prod_{j\in I_+}\big(\frac{\mu_i-\mu_j}{\mu_i+\mu_j}\big)^2}{1-X_i\prod_{j\in I_+}\big(\frac{\mu_i-\mu_j}{\mu_i+\mu_j}\big)^2}$$
which is indeed a one-soliton formula, but with the modified parameter a^out. Since asymptotically the solitons decouple, the crossed terms in the symplectic form vanish (the overlap integrals are zero). Therefore the symplectic form reduces to the sum of the one-soliton expressions, but with the modified in and out parameters:

$$\omega = 2\sum_{i=1}^{N}\frac{\delta a_i^{\rm in}}{a_i^{\rm in}}\wedge\frac{\delta\mu_i}{\mu_i} = 2\sum_{i=1}^{N}\frac{\delta a_i^{\rm out}}{a_i^{\rm out}}\wedge\frac{\delta\mu_i}{\mu_i}$$

Both forms are equal to eq. (12.79), as can be checked easily. As a by-product, we see that the transformation from the in-variables (μ_i^in, a_i^in) to the out-variables (μ_i^out, a_i^out) is symplectic, as it should be.

The 2-form (12.79) is non-degenerate. It therefore defines a symplectic structure on the restricted phase space. The corresponding Poisson brackets are found by inverting the symplectic form (12.79):

$$\{\mu_i,\mu_j\}=0,\qquad \{\mu_i,a_j\}=-\frac12\,\mu_i a_j\,\delta_{ij},\qquad \{a_i,a_j\}=-\frac12\,\frac{4\mu_i\mu_j}{\mu_i^2-\mu_j^2}\,a_i a_j\qquad(12.81)$$

Knowing the Poisson brackets between the parameters (μ_i, a_i), we can compute the Poisson brackets between the elements of the matrix V in eq. (12.76). Remarkably, they can be written in the r-matrix form:

$$\{V_1,V_2\}=[r_{12},V_1]-[r_{21},V_2]\qquad(12.82)$$
where

$$r_{12}=\sum_{ij;kl}r_{ij;kl}\,E_{ij}\otimes E_{kl},\qquad r_{ij;kl}=-\frac{1}{16}\,\frac{\mu_i+\mu_j}{\mu_i-\mu_j}\,\big(V_{jk}\delta_{il}+V_{jl}\delta_{ik}+V_{ik}\delta_{jl}+V_{il}\delta_{jk}\big)$$
Equation (12.82) can be checked by straightforward computation. From the definition of the sine-Gordon field in terms of the tau-functions, we see that it can be expressed in terms of only the eigenvalues Q_i of the matrix V:

$$e^{-i\varphi}=\frac{\det(1+V)}{\det(1-V)}=\prod_{i=1}^{N}\frac{1+Q_i}{1-Q_i}\qquad(12.83)$$
Recall that, in the N-soliton case, we can assume a_j = ie^{γ_j} and μ_j > 0, so that the matrix V is of the form iṼ, where Ṽ is a real symmetric matrix. The matrix Ṽ can be diagonalized with a real orthogonal matrix U. Moreover, its eigenvalues Q̃_j are real positive. To show the positivity assertion, we show that Ṽ is positive. For any real vector y we have

$$(y,\tilde V y)=2\sum_{ij}\frac{x_ix_j}{\mu_i+\mu_j}$$

with $x_i=\sqrt{\mu_i}\,e^{\gamma_i}e^{-m(\mu_ix_-+\mu_i^{-1}x_+)}y_i$. But since all μ_i are assumed positive, we have for real x_i:

$$\sum_{ij}\frac{x_ix_j}{\mu_i+\mu_j}=\int_0^\infty dz\sum_{ij}e^{-z(\mu_i+\mu_j)}x_ix_j=\int_0^\infty dz\Big(\sum_j e^{-z\mu_j}x_j\Big)^2\;\geq\;0$$
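The positivity argument is easy to test numerically: a matrix of the form $M_{ij}=2c_ic_j/(\mu_i+\mu_j)$ with positive μ_i is a Gram matrix, and for distinct μ_i and non-zero weights it is positive definite. A minimal sketch (NumPy; the sample values are arbitrary placeholders for the weights $\sqrt{\mu_i}\,e^{\gamma_i}e^{-m(\mu_ix_-+\mu_i^{-1}x_+)}$):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
mu = rng.uniform(0.5, 3.0, N)       # distinct positive rapidities, as assumed in the text
c = np.exp(rng.normal(size=N))      # arbitrary positive weights

# Vt_ij = 2 c_i c_j / (mu_i + mu_j): the real symmetric matrix of the text
Vt = 2.0 * np.outer(c, c) / (mu[:, None] + mu[None, :])

# Gram identity: x^T Vt x = 2 * int_0^inf ( sum_j e^{-z mu_j} c_j x_j )^2 dz >= 0
eig = np.linalg.eigvalsh(Vt)
print(eig.min() > 0)                # positive definite for distinct mu_i
```

The integral representation above is exactly the statement that Vt is the Gram matrix of the functions $z\mapsto c_je^{-z\mu_j}$ in $L^2(0,\infty)$.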
So the eigenvalues Q_j of V are of the form i × (real positive number), i.e. Q_j = ie^{q_j}. Finally, note that eq. (12.82) implies {Q_i, Q_j} = 0.

We want to express the evolution of solitons in terms of the variables Q_i. It is convenient to consider the evolution in the light-cone coordinates separately, and we take for definiteness the evolution with respect to x_− = x − t. Let U be the real orthogonal matrix which diagonalizes the symmetric matrix V, and define a symmetric matrix L by:

$$L=-2m\,U\mu U^{-1},\qquad V=U^{-1}QU\qquad(12.84)$$

where Q denotes the diagonal matrix diag(Q_i) and μ the diagonal matrix diag(μ_i), with μ_i the rapidities of the solitons. The matrix L plays the role of a Lax operator. Obviously, the quantities ${\rm tr}(L^n)=(-2m)^n\sum_{i=1}^N\mu_i^n$ are conserved during the evolution of the solitons. They are in involution under the Poisson bracket (12.81). The time evolution of L takes the Lax form, and we have:
Proposition. The x_− evolution of L is given by a Lax equation L̇ = [M, L] with M = U̇U^{-1}, where the dot means ∂/∂x_−. Moreover, L and M can be expressed in terms of the quantities Q_i and Q̇_i:

$$L_{ij}=2\,\frac{\sqrt{\dot Q_i\dot Q_j}}{Q_i+Q_j},\qquad M_{ij}=\frac{\sqrt{\dot Q_i\dot Q_j}}{Q_i-Q_j}\,(1-\delta_{ij})\qquad(12.85)$$

If we introduce the coordinates Q_j = ie^{q_j}, the equations of motion read:

$$\ddot q_i=\sum_{k\neq i}\frac{2\dot q_i\dot q_k}{\sinh(q_i-q_k)}\qquad(12.86)$$
Proof. The Lax form of the equation of motion follows from the definition of L, eq. (12.84), and the fact that μ is conserved. Let us prove eq. (12.85). We have to be careful about what we mean by the square roots. Denote, as before, V = iṼ, Q = iQ̃, X = iX̃. The tilde quantities are all real positive definite, and we can take their positive square roots. We start from the relation μṼ + Ṽμ = 2|e⟩⟨e|, where |e⟩ is the column vector with components $e_i=\sqrt{\tilde X_i\mu_i}$. Multiplying on the left by U and on the right by U^{-1}, we get, using that U is orthogonal, Q̃L + LQ̃ = −4m|ẽ⟩⟨ẽ| with |ẽ⟩ = U|e⟩. Since Q̃ is diagonal, we obtain from this relation:

$$L_{ij}=-4m\,\frac{\tilde e_i\tilde e_j}{\tilde Q_i+\tilde Q_j}$$

Next, since $\dot{\tilde X}_i=-m\mu_i\tilde X_i$, we have

$$\dot{\tilde V}=-m(\mu\tilde V+\tilde V\mu)=-2m|e\rangle\langle e|=U^{-1}\big(\dot{\tilde Q}+[\tilde Q,M]\big)U$$

which implies $-2m|\tilde e\rangle\langle\tilde e|=\dot{\tilde Q}+[\tilde Q,M]$. In components, this reads

$$-2m\,\tilde e_i\tilde e_j=\dot{\tilde Q}_i\,\delta_{ij}+(\tilde Q_i-\tilde Q_j)M_{ij}$$

If i = j, we find $\tilde e_i^2=-\dot{\tilde Q}_i/2m$, and if i ≠ j, we find the value of M_ij in terms of Q̃_i, Q̇̃_i. This completely determines the antisymmetric matrix M. The equations of motion, eq. (12.86), are obtained by writing the Lax equation.
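The proposition can be checked numerically by integrating eq. (12.86) and monitoring the spectrum of L. With Q_j = ie^{q_j} and a consistent choice of square-root branch, eq. (12.85) gives the real symmetric matrix $L_{ij}=\sqrt{\dot q_i\dot q_j}/\cosh\!\big(\tfrac{q_i-q_j}{2}\big)$, whose eigenvalues should then be constants of motion. A rough RK4 sketch (NumPy; the initial data are arbitrary, and positive velocities keep the square roots real):

```python
import numpy as np

def accel(q, qd):
    # Ruijsenaars-Schneider equations of motion (12.86)
    dq = q[:, None] - q[None, :]
    np.fill_diagonal(dq, np.inf)          # excludes k = i (1/sinh(inf) = 0)
    return 2.0 * qd * np.sum(qd[None, :] / np.sinh(dq), axis=1)

def lax(q, qd):
    # real form of L_ij = 2 sqrt(Qd_i Qd_j)/(Q_i + Q_j) with Q_j = i e^{q_j}
    dq = q[:, None] - q[None, :]
    return np.sqrt(np.outer(qd, qd)) / np.cosh(dq / 2.0)

q = np.array([-2.0, 0.0, 2.0])
qd = np.array([0.5, 0.7, 0.9])
e0 = np.linalg.eigvalsh(lax(q, qd))

dt, steps = 1e-3, 500
for _ in range(steps):                    # RK4 on the state (q, qd)
    k1q, k1v = qd, accel(q, qd)
    k2q, k2v = qd + dt/2*k1v, accel(q + dt/2*k1q, qd + dt/2*k1v)
    k3q, k3v = qd + dt/2*k2v, accel(q + dt/2*k2q, qd + dt/2*k2v)
    k4q, k4v = qd + dt*k3v, accel(q + dt*k3q, qd + dt*k3v)
    q = q + dt/6*(k1q + 2*k2q + 2*k3q + k4q)
    qd = qd + dt/6*(k1v + 2*k2v + 2*k3v + k4v)

e1 = np.linalg.eigvalsh(lax(q, qd))
print(np.max(np.abs(e1 - e0)))            # should be ~0: the evolution is isospectral
```

Note that tr L = Σ_i q̇_i is conserved already at the level of eq. (12.86), since the right-hand side is antisymmetric under i ↔ k.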
Instead of the variables Q̇_i, let us introduce the variables ρ_i = Q̇_i/Q_i. The Lax matrix becomes

$$L_{ij}=2\sqrt{\rho_i\rho_j}\,\frac{\sqrt{Q_iQ_j}}{Q_i+Q_j}$$

Comparing the matrices V and L, one sees that L has exactly the same form as V with the change of variables (μ_i, a_i) → (Q_i, ρ_i). This symmetry extends to the level of the symplectic structure.

Proposition. The transformation (μ_i, a_i) → (Q_i^{-1}, ρ_i) is a symplectic transformation.

Proof. Let us consider

$$\omega_1=2\sum_{i=1}^{N}\frac{\delta a_i}{a_i}\wedge\frac{\delta\mu_i}{\mu_i}+2\sum_{i<j}\frac{4\mu_i\mu_j}{\mu_i^2-\mu_j^2}\,\frac{\delta\mu_i}{\mu_i}\wedge\frac{\delta\mu_j}{\mu_j}$$

$$\omega_2=2\sum_{i=1}^{N}\frac{\delta\rho_i}{\rho_i}\wedge\frac{\delta Q_i}{Q_i}+2\sum_{i<j}\frac{4Q_iQ_j}{Q_i^2-Q_j^2}\,\frac{\delta Q_i}{Q_i}\wedge\frac{\delta Q_j}{Q_j}$$
We first express ω_1 using the independent variables μ_i, Q_i. We get

$$\omega_1=2\sum_{ij}A_{ij}\,\frac{\delta Q_i}{Q_i}\wedge\frac{\delta\mu_j}{\mu_j}+2\sum_{i<j}B_{ij}\,\frac{\delta\mu_i}{\mu_i}\wedge\frac{\delta\mu_j}{\mu_j}$$

Recall that, using the symplectic form ω_1, we have shown that V obeys an r-matrix relation, hence its eigenvalues Poisson commute. Noting that

$$\begin{pmatrix}0&A\\-A^t&B\end{pmatrix}^{-1}=\begin{pmatrix}(A^t)^{-1}BA^{-1}&-(A^t)^{-1}\\A^{-1}&0\end{pmatrix}$$

we see that the vanishing of the Poisson brackets {Q_i, Q_j} requires B_ij = 0. To compute $A_{ij}=\frac{Q_i}{a_j}\frac{\partial a_j}{\partial Q_i}$ (at μ_k fixed), we vary the matrix V in eq. (12.76) (at μ fixed), getting $dV=\frac12\big(Va^{-1}da+a^{-1}da\,V\big)$, where a = diag(a_i). Similarly, varying V = U^{-1}QU, we get dV = U^{-1}(dQ + [Q, dUU^{-1}])U, so that:

$$dQ+[Q,dUU^{-1}]=\frac12\big(Ua^{-1}da\,U^{-1}Q+QUa^{-1}da\,U^{-1}\big)$$

Looking at the diagonal part of this equation, we get

$$\frac{dQ_i}{Q_i}=\sum_jU_{ij}\frac{da_j}{a_j}U^{-1}_{ji}=\sum_{j,k}U_{ij}U^{-1}_{ji}\,\frac{Q_k}{a_j}\frac{\partial a_j}{\partial Q_k}\,\frac{dQ_k}{Q_k}$$

Setting $\mathcal M_{ij}=U_{ij}U^{-1}_{ji}$, this means $A\mathcal M=1$, or $A=\mathcal M^{-1}$.
We now express the form ω_2 in terms of the variables (Q_i, μ_i) in exactly the same way. In particular, starting from ω_2 one may deduce an r-matrix relation for L, implying that {μ_i, μ_j} = 0, so that

$$\omega_2=2\sum_{ij}C_{ij}\,\frac{\delta Q_i}{Q_i}\wedge\frac{\delta\mu_j}{\mu_j}$$

with $C_{ij}=-\frac{\mu_j}{\rho_i}\frac{\partial\rho_i}{\partial\mu_j}$ (at Q_k fixed). The computation of C_ij is the same as for A_ij, except that U is replaced by U^{-1}, which leads to $\mathcal MC=-1$, so that C = −A and ω_2 = −ω_1. Changing Q_i to Q_i^{-1} changes ω_2 to −ω_2, thereby proving the symplecticity of the transformation.
We introduce a new set of variables p_i by:

$$\rho_i=e^{p_i}\prod_{k\neq i}\frac{Q_i+Q_k}{Q_i-Q_k}=e^{p_i}\prod_{k\neq i}\coth\Big(\frac{q_i-q_k}{2}\Big)\qquad(12.87)$$

It is easy to check that the variables p_i, q_i are canonically conjugate:

$$\{q_i,q_j\}=0,\qquad\{p_i,p_j\}=0,\qquad\{p_i,q_j\}=\delta_{ij}$$
We have already noticed that L obeys an r-matrix relation, which implies that the function T(z) = det(1 + zL) is a generating function of commuting Hamiltonians (which we already knew by construction of L):

$$\{T(z),T(z')\}=0$$

Expanding T(z) in powers of z, we find:

$$T(z)=1+\sum_{p=1}^{N}z^pH_p=1+\sum_{p=1}^{N}z^p\sum_{k_1<\cdots<k_p}\rho_{k_1}\cdots\rho_{k_p}\prod_{k_i<k_j}\Big(\frac{Q_{k_i}-Q_{k_j}}{Q_{k_i}+Q_{k_j}}\Big)^2$$
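The coefficient H_p is the sum of the principal p × p minors of L; with $L_{ij}=2\sqrt{\rho_i\rho_j}\,\sqrt{Q_iQ_j}/(Q_i+Q_j)$, each minor is a Cauchy determinant, which is how the product formula above arises. A small cross-check (NumPy; real positive sample values stand in for Q_i and ρ_i):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
N = 4
Q = rng.uniform(0.5, 3.0, N)
rho = rng.uniform(0.5, 2.0, N)

L = 2.0 * np.sqrt(np.outer(rho, rho)) * np.sqrt(np.outer(Q, Q)) / (Q[:, None] + Q[None, :])

# coefficients of det(1 + z L) via elementary symmetric functions of the eigenvalues
eig = np.linalg.eigvalsh(L)
H_from_det = [sum(np.prod(list(c)) for c in combinations(eig, p)) for p in range(1, N + 1)]

# the combinatorial product formula for H_p
def H(p):
    total = 0.0
    for K in combinations(range(N), p):
        term = np.prod([rho[k] for k in K])
        for a, b in combinations(K, 2):
            term *= ((Q[a] - Q[b]) / (Q[a] + Q[b])) ** 2
        total += term
    return total

print(np.allclose(H_from_det, [H(p) for p in range(1, N + 1)]))  # True
```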
The Hamiltonian generating the evolution in the light-cone coordinate x_− is H_− = Tr(L). The evolution in the other light-cone coordinate, x_+, is generated by H_+ = Tr(L^{-1}) = H_{N−1}/det(L):

$$H_\pm=\sum_je^{\mp p_j}\prod_{k\neq j}\coth\Big(\frac{q_j-q_k}{2}\Big)$$

These are the Hamiltonians written by Ruijsenaars and Schneider. By construction, their corresponding flows are linearized in the variables (μ_i, X_i).

We end this section by presenting the reality conditions corresponding to solitons, antisolitons and breathers. For the sine-Gordon field eq. (12.83) and the Hamiltonians H_± to be real, the coordinates (Q_j, p_j) have to come in pairs (j, j̄), with $Q_{\bar j}=-\overline{Q_j}$ and $p_{\bar j}=\overline{p_j}$. The case j = j̄
corresponds to a soliton or an antisoliton, and the case j ≠ j̄ to a breather. Therefore, in the coordinates q_j such that Q_j = ie^{q_j}:

$$\begin{aligned}&{\rm Im}\,q_s=0&&\text{for }s\text{ a soliton}\\ &{\rm Im}\,q_{\bar s}=\pi&&\text{for }\bar s\text{ an antisoliton}\\ &q_{\bar b}=\overline{q_b}&&\text{for }b\text{ a breather}\end{aligned}$$

Similarly, the momenta p_s and p_s̄ are real, and p_b is complex with $p_{\bar b}=\overline{p_b}$.

12.10 Finite-zone solutions

The simplest example of finite-zone solutions is the one-zone solution. We study it in detail. The multi-zone solutions are then simple generalizations. The one-zone solution is obtained by looking at solutions ϕ = ϕ(z) of the sinh-Gordon equation depending only on the variable z = m(μx_− + μ^{-1}x_+). Then ∂_{x_+}∂_{x_−}ϕ = 2m² sinh(2ϕ) becomes ϕ″ = e^{2ϕ} − e^{−2ϕ}, where ′ means the derivative with respect to z. This integrates to ϕ′² = e^{2ϕ} + e^{−2ϕ} − 2E, where E is a constant. Setting e^{−2ϕ} = t, this equation takes the form:

$$t'^2=4t(t^2-2Et+1)$$

For E = ±1, the right-hand side is a perfect square and the equation is readily solved. For E = +1 one finds the soliton and antisoliton solutions. For general E, the equation can be compared with the equation for the Weierstrass ℘ function:

$$\wp'(z)^2=4\wp^3(z)-g_2\wp(z)-g_3$$

and we have:

$$e^{-2\varphi}=t(z)=\wp(z-z_0)+\frac{2E}{3},\qquad g_2=\frac{16}{3}E^2-4,\qquad g_3=\frac{8}{3}E\Big(\frac{8}{9}E^2-1\Big)$$
and z_0 is an arbitrary constant. Note that the discriminant g_2³ − 27g_3² = 64(E² − 1) vanishes only for the cases E = ±1.

To see how this simple one-zone solution fits into the general formalism of finite-zone solutions, we need to construct a Lax matrix L satisfying the compatibility conditions
[∂x− − Ax− , L] = 0
As explained in Chapter 5, the general technique is to impose a stationarity condition with respect to some time. The one-zone solution only
482
12 The Toda field theories
depends on z = m(µx− + µ−1 x+ ), hence satisfies the stationarity condition: (µ−1 ∂x− − µ∂x+ )ϕ = 0 (12.88) 1 So we choose as Lax matrix L(λ) = m (µ−1 Ax− −µAx+ ). From eqs. (12.64, 12.65) we find 1 1 − 2 ∂x− ϕ mλeϕ ∂x+ ϕ mλ−1 e−ϕ 2 Ax+ = , Ax− = mλ−1 eϕ − 12 ∂x+ ϕ mλe−ϕ 21 ∂x− ϕ
and the Lax matrix is: 1 −1 − 2m (µ ∂x− ϕ + µ∂x+ ϕ) L(λ) = λ −ϕ − µλ eϕ µe
µ −ϕ λ ϕ µe − λe 1 −1 2m (µ ∂x− ϕ + µ∂x+ ϕ)
The spectral curve, Γ : det(L(λ) − Λ) = 0, reads Λ2 =
1 λ2 µ2 −1 2 2ϕ −2ϕ (µ ∂ ϕ + µ∂ ϕ) − e − e + + x x + − 4m2 µ2 λ2
1 −1 ∂ ϕ + Hence, for these special one-zone solutions, the quantity 4m x+ 2 (µ µ∂x− ϕ)2 − e2ϕ − e−2ϕ is a constant independent of x± . Indeed, noting that µ−1 ∂x+ ϕ + µ∂x− ϕ = 2mϕ , we find that this constant is −2E. As an equation in the (Λ, λ) plane, the curve finally reads:
Γ:
Λ2 =
λ2 µ2 + − 2E µ2 λ2
Besides the hyperelliptic involution, Λ → −Λ, this curve has another involution λ → −λ. Let Γ be the curve obtained from Γ by quotienting by this involution. This is done by setting Y = µλ2 Λ,
λ = λ 2
or
Λ=
Y , µλ
λ=
√
λ
The transformation Γ → Γ is one to two. The curve Γ is a two-sheeted unbranched covering of Γ . The equation of the curve Γ reads Γ :
Y 2 = λ (λ2 − 2Eµ2 λ + µ4 )
The points P+ (λ = 0) and P− (λ = ∞) are branch points on Γ . Their pre-images on Γ are two singular points which blow up to four points on its desingularization. This makes the situation more complicated on Γ than on Γ , and this is why we usually work on Γ , and eventualy lift the results to Γ if needed.
12.10 Finite-zone solutions
483
Next, we describe the singularity structure of the wave function Ψ in this one-zone case. The wave function satisfies the equations (∂x± − Ax± )Ψ = 0,
(L(λ) − Λ)Ψ = 0
(12.89)
These equations are naturally written on Γ and we need to transport them to Γ . This is achieved by performing a gauge transformation: ϕ 0ϕ e2 Ψ Ψ = gΨ = 0 λ1 e− 2 where the important factor is 1/λ and the factors e±ϕ/2 have been introduced for later convenience. The gauge transformed connection Ax± and the new Lax matrix L = gLg −1 still obey the Lax equations [∂x± − Ax± , L ] = 0, but are now rational functions of λ = λ2 . Explicitly, the transformed connection is: 0 mλ e2ϕ ∂x+ ϕ m , Ax+ = (12.90) Ax− = mλ−1 −∂x+ ϕ me−2ϕ 0 1 The transformed L is still equal to m (µ−1 Ax− − µAx+ ) thanks to the stationarity condition, eq. (12.88). It is now straightforward to extract the singular parts of Ψ at the branch points λ = 0 and λ = ∞. In the neighbourhood of the point λ = √ ∞ of Γ , we can use the local √ of L is λ /µ. The eigenvector is such parameter λ . The eigenvalue √ eigenvector of L and L = that ψ1 /ψ2 ∼ exp(2ϕ) λ . Since Ψ is an √ 1 −1 m (µ Ax− − µAx+ ), we have Ax− Ψ = m λ Ψ + O(1) around ∞. So we get √ χ1 χ1 m λ x− , Ψ∞ ∼ e = e2ϕ (12.91) √1 χ2 χ 2 λ √ √ Similarly, at the√point λ = 0 the eigenvalue √of L is µ/ λ + O( λ ) and ψ1 /ψ2 ∼ − λ . We have Ax+ Ψ = −m/ λ Ψ + O(1). To get the √ precise form of Ψ at λ = 0, write Ψ0 = exp(− √m x+ )(f, −g/ λ ) with λ √ g = f + O( λ ) and plug into the equations (∂x± − Ax± )Ψ = 0. A careful analysis of these equations shows that f is a constant, independent of x± , and so can be normalized to 1. We finally get: 1 − √m x+ (12.92) Ψ0 ∼ e λ − √1 λ
Ψ
λ
Γ .
We see that has a pole at = 0 on The one-zone analysis can be generalized immediately to provide finitezone solutions to the sinh-Gordon equation. Consider an arbitrary hyperelliptic curve Γ , of genus g, having λ = 0 and λ = ∞ as branch points: Y 2 = λ P (λ ),
deg P = 2g
484
12 The Toda field theories
On this curve we construct Baker–Akhiezer vectors Ψ with g + N − 1 = g + 1 poles (N = 2). We set one of these poles at λ = 0. The other g poles at finite distance are independent of x± . The essential singularities are set at P+ (λ = 0) and P− (λ = ∞) with behaviour specified as in eqs. (12.91, 12.92), so ψ2 has a pole at P+ and a zero at P− . This characterise Ψ uniquely. Proposition. We have: 0 mλ χχ12 ψ1 ∂x− − =0 ψ2 m χχ21 0 ∂x+ log χ1 m ψ1 =0 ∂x+ − m ∂ log χ ψ2 x 2 + λ
(12.93)
Proof. Consider ∂− ψ1 −m χχ12 λ ψ2 . It is a Baker–Akhiezer function. At P− , √ √ √m x it behaves like em λ x− O(1), while at P+ it behaves like e λ + O( λ ) and therefore has a zero at this point. It also has g poles at finite distance inherited from Ψ . So, it is a Baker–Akhiezer function with g poles and a prescribed zero at λ = 0 and therefore vanishes. We prove similarly ∂x χ1 that ∂x− ψ2 − χχ21 ψ1 = 0 and that ∂x+ ψ1 − mψ2 − χ+1 ψ1 = 0. Finally, ∂x χ2
considering ∂x+ ψ2 − λm ψ1 − χ+2 ψ2 , we see that it has a simple pole at λ = 0 but a double zero at λ = ∞, hence also vanishes. We set χ1 /χ2 = exp(2ϕ) and χ1 χ2 = exp(2σ) and compute the zero curvature condition for the system (12.93). We get ∂x+ ∂x− σ = 0 and ∂x+ ∂x− ϕ = 2m2 sinh(2ϕ). The free field σ decouples and can be set equal to 0. Then the connection appearing in eq. (12.93) is exactly the same as Ax± in eq. (12.90). We see that, starting from any hyperelliptic curve with branch points at 0 and ∞, and imposing the essential singularities eqs. (12.91, 12.92) at these two punctures yields readily a finite-zone solution of the sinhGordon equation. Using two punctures allows one to get independent equations of motion with respect to x+ and x− , so producing relativistic generalizations of the finite-zone solutions for the KdV equation. References [1] A.C. Scott, F.Y.F. Chu and D.W. McLaughlin, The soliton: a new concept in applied science. Proc. IEEE 61 (1973) 1443. [2] R.F. Dashen, B. Hasslascher and A. Neveu, Particle spectrum in model field theories from semiclassical functional integral techniques. Phys. Rev. D11 (1975) 3424.
12.10 Finite-zone solutions
485
[3] A.N. Leznov and M.V. Saveliev, Representation of zero curvature for the system of non-linear partial differential equations xα,z z¯ = exp(kx)α and its integrability. Lett. Math. Phys. 3 (1979) 489–494. [4] A.B. Zamolodchikov and Al.B. Zamolodchikov, Factorized Smatrices in two dimensions as the exact solutions of certain relativistic quantum field theory models. Ann. Phys. 120 (1979) 253. [5] J.L. Gervais and A. Neveu, Novel triangle relation and absence of tachyons in Liouville string field theory. Nucl. Phys. B238 (1984) 125. [6] A.A. Belavin, A.M. Polyakov and A.B. Zamolodchikov, Infinite conformal symmetry in two-dimensional quantum field theory. Nucl. Phys. B241 (1984) 333. [7] M. Semenov-Tian-Shansky. Dressing transformations and Poisson group actions. Publ. RIMS 21 (1985) 1237. [8] L.D. Faddeev and L.A. Takhtajan, Hamiltonian Methods in the Theory of Solitons. Springer (1986). [9] S.N.M. Ruijsenaars H. Schneider, A new class of integrable systems and its relation to solitons. Ann. Phys. 170 (1986) 370–405. [10] O. Babelon, Extended conformal algebra and the Yang–Baxter equation. Phys. Lett. B215 (1988) 523. [11] O. Babelon D. Bernard, Affine Solitons: a relation between tau functions, dressing and B¨ acklund transformations. Int. J. Mod. Phys. A8 (1993) 507–543.
13 Classical inverse scattering method
We introduce the basic tools of the classical inverse scattering method created by Gardner, Green, Kruskal and Miura. This very ingenious method was first exploited to solve the KdV equation. Here we apply it to the sine-Gordon equation since it is rewarding to show how the method applies to a more involved situation. An advantage of the inverse scattering method is that it allows us to construct action–angle variables for a full infinite-dimensional phase space of the field theory corresponding to fields decreasing rapidly at infinity. The idea of this approach is to associate a scattering problem with the field configuration. The key point is that the scattering data have a very simple time evolution, and that the field can be reconstructed from these data. Of course particular solutions like soliton solutions, which depend on a finite number of parameters, are also recovered this way. 13.1 The sine-Gordon equation Let us consider the sine-Gordon equation ∂t2 ϕ − ∂x2 ϕ = −
8m2 sin(2βϕ) β
(13.1)
with ϕ ≡ ϕ(x, t) a real scalar field. One can write this equation as the compatibility condition of a linear system (∂x − Ax ) Ψ = 0 (∂t − At ) Ψ = 0 with: Ax = i
β 2 ∂t ϕ −iβϕ m(λe − λ−1 eiβϕ )
m(λeiβϕ − λ−1 e−iβϕ ) − β2 ∂t ϕ
486
(13.2) (13.3) (13.4)
487
13.2 The Jost solutions At = i
β 2 ∂x ϕ −iβϕ −m(λe + λ−1 eiβϕ )
−m(λeiβϕ + λ−1 e−iβϕ ) − β2 ∂x ϕ
(13.5)
As compared to eqs. (12.64, 12.65) in Chapter 12, we have changed ϕ → iϕ to get the sine-Gordon equation, and λ → iλ so that i(∂x − Ax ) and i(∂t − At ) are formally self-adjoint for real λ. In this chapter, we will study the solutions of the sine-Gordon equation on the line with boundary conditions: Qπ lim ϕ(x) = lim ϕ(x) = 0, , Q∈Z x→−∞ x→+∞ β The number Q is the topological charge of the field configuration. These asymptotic conditions are manifestly compatible with the equations of motion and correspond to a minimum of the sine-Gordon potential. We will assume that they are reached rapidly enough. To solve a non-linear partial differential equation by the classical inverse scattering method, we proceed in three steps. • Given the initial data, i.e. ϕ(x, 0) for the sine-Gordon case, we solve the direct scattering problem: [∂x − Ax (λ, ϕ(x, 0))]Ψ = 0 We thus determine some scattering data a(λ, 0), b(λ, 0) for the continuous spectrum and discrete spectum. • We determine the time evolution of the scattering data by using the second equation: [∂t − At (λ, ϕ)]Ψ = 0
(13.6)
In the regions x = ±∞ where the asymptotics of the field ϕ(x) are known, it allows us to get a(λ, t), b(λ, t) . . . . • Knowing the scattering data at time t, we reconstruct the matrix Ax (λ, t) (and thus ϕ(x, t)) by solving the Gelfand–Levitan– Marchenko equation, which is a linear integral equation. In the rest of this chapter, we apply this strategy to solve the sine-Gordon equation. 13.2 The Jost solutions In our case, the direct scattering problem is defined by the equation β ∂ ∂t ϕ m(λeiβϕ − λ−1 e−iβϕ ) 2 Ψ=0 −i ∂x m(λe−iβϕ − λ−1 eiβϕ ) − β2 ∂t ϕ (13.7)
488
13 Classical inverse scattering method
This linear system possesses a few simple properties: Let ∗ denote complex conjugation. For λ real, we have A∗x = σ2 Ax σ2 , (and more generally for complex λ we have (Ax (λ, x))∗ = σ2 Ax (λ∗ , x)σ2 ). Therefore, if Ψ is a solution of eq. (13.7), so is Ψ = σ2 Ψ∗ . This defines a conjugate solution: ψ1 −iψ2∗ = (13.8) Ψ= ψ2 iψ1∗ Similarly, we have Ax (−λ) = σ3 Ax (λ)σ3 , so that σ3 Ψ(x, −λ) is also a solution. ξ1 ψ1 and Ψ = , we may define the For two solutions ξ = ξ2 ψ2 Wronskian W (ξ, ψ) by: W (ξ, ψ) = ξ1 ψ2 − ξ2 ψ1 d Since Tr (Ax ) = 0, we have dx W (ξ, ψ) = 0, and the Wronskian is xindependent. Assume that ϕ → 0 for x → −∞, and ϕ → πQ β for x → +∞. Then the linear system eq. (13.7) becomes respectively ∂ ∂ Qiπ − ik(λ)σ1 Ψ = 0, x → −∞; − ie k(λ)σ1 Ψ = 0, x → +∞ ∂x ∂x
where k(λ) = m(λ − λ−1 ). It follows that the solutions behave asymptotically, for x → ±∞, as 1 1 ik(λ)x Ψ|x→−∞ ∼ c1 e e−ik(λ)x + c2 1 −1 1 1 ik(λ)x e e−ik(λ)x Ψ|x→+∞ ∼ c1 + c2 eiπQ −eiπQ The Jost solutions are solutions of the linear system with specific behaviour at infinity. Noting that Im k(λ) = (1 + |λ|−2 )Im λ, we set the definition: Definition. The Jost solutions f1 and f2 are defined by the following asymptotic behavior: 1 1 ik(λ)x f1 ∼ e , x → +∞; f2 ∼ e−ik(λ)x , x → −∞ eiπQ −1 (13.9) These solutions are chosen in such way that the asymptotics remain bounded if we give a positive imaginary part to λ . Let us examine in detail the asymptotic form of our Jost solutions when |x| → ∞.
489
13.2 The Jost solutions x = −∞ c11 (λ)
f1
f2
1 −1
1 1 eik(λ)x + c12 (λ) e−ik(λ)x 1 −1
x = −∞
x = +∞
1
eik(λ)x
eiQπ
x = +∞
e−ik(λ)x
c21 (λ)
1 eiQπ
eik(λ)x
+ c22 (λ)
1 −eiQπ
e−ik(λ)x
We get some relations between the coefficients cij (λ) by calculating the Wronskians at x = +∞ and x = −∞: x = −∞
x = +∞
W (f1 , f2 )
−2c11 (λ)
−2c22 (λ)eiQπ
W (f1 , f 2 )
2ic12 (λ)
2ic∗21 (λ)
W (f1 , f 1 )
2i |c11 |2 + |c12 |2
2i
Since the Wronskians are constant, we deduce that c11 (λ) = c22 (λ)eiQπ and c12 (λ) = c∗21 (λ). We set c11 (λ) = c22 (λ)eiQπ = a(λ) and c12 (λ) = c∗21 (λ) = −b(λ). We have |a(λ)|2 + |b(λ)|2 = 1
(13.10)
The asymptotic behavior of the Jost solutions can be rewritten as:
f1
x = −∞
x = +∞
1 1 ik(λ)x a(λ) e − b(λ) e−ik(λ)x 1 −1
x = −∞ f2
1 −1
eiQπ
eik(λ)x
x = +∞
e−ik(λ)x
1
−b∗ (λ)
1 eiQπ
eik(λ)x
+ a(λ)
eiQπ −1
e−ik(λ)x
490
13 Classical inverse scattering method
Comparing the asymptotic expansions of f¯i and σ3 fi (−λ) we get, for real λ: b∗ i i ib f¯1 = −i f1 − f2 , f¯2 = f1 + f2 (13.11) a a a a b∗ 1 1 b σ3 f1 (−λ) = e−iπQ f1 + e−iπQ f2 , σ3 f2 (−λ) = f1 + f2 (13.12) a a a a From the last relation one gets the symmetry properties a(−λ) = e−iπQ a∗ (λ),
b(−λ) = −e−iπQ b∗ (λ)
The main property that we shall prove below is that f1 , f2 and a(λ) can be analytically continued to the upper half-plane . Assuming this result for the time being, we can give a more precise definition of the scattering data. For λ in the upper half-plane , −λ∗ is also in the upper half-plane (it is the symmetric of λ with respect to the imaginary axis) and the relation a(−λ) = e−iπQ a∗ (λ) for λ real extends to a(−λ∗ ) = e−iπQ (a(λ))∗ . Note also that f1 and f2 obey the symmetry property: f1 (x, −λ∗ ) = eiπQ σ1 (f1 (x, λ))∗ ,
f2 (x, −λ∗ ) = −σ1 (f2 (x, λ))∗
(13.13)
This is because Ax (x, −λ∗ ) = σ1 (Ax (x, λ))∗ σ1 so that f (x, −λ∗ ) obeys the same equation as σ1 (f (x, λ))∗ , and one can then compare the asymptotics at x = ±∞. Suppose that a(λ) has some complex zeros λn , Im λn > 0. From the symmetry property of a(λ), these zeroes are either purely imaginary or occur in pairs symmetric with respect to the imaginary axis. Because 2a(λ) = −W (f1 , f2 ), the two Jost solutions f1 and f2 become linearly dependent when λ = λn . Looking at the asymptotics, we see that this solution is normalizable. It corresponds to a bound state for the linear scattering problem (13.7). At λn , the solutions f1 , f2 are proportional: f2 (x, λn ) = cn f1 (x, λn ),
a(λn ) = 0 and Im(λn ) > 0
(13.14)
Equation (13.13) implies that, for λn pure imaginary, c∗n = −eiπQ cn , and a similar relation when we have a pair of symmetric roots. The functions a(λ) and b(λ) and the numbers λn and cn constitute the scattering data. The function a(λ) is called the Jost function. Proposition. The Jost solutions f1 (x, λ) and f2 (x, λ) are analytic functions of λ in the upper half-plane Im(λ) > 0. Proof. Denote A±∞ = limx→±∞ Ax , i.e. A−∞ = ik(λ)σ1 and A∞ = ieiπQ k(λ)σ1 . We can write the linear systems for f1,2 as (∂x − A±∞ )f1,2 =
491
13.2 The Jost solutions
(Ax −A±∞ )f1,2 . These equations, plus the boundary conditions, are equivalent to the integral equations +∞ dy G1 (x, y)V1 (y, λ)f1 (y, λ) (13.15) f1 (x, λ) = f10 (x, λ) + −∞
f2 (x, λ) = f20 (x, λ) +
+∞
−∞
dy G2 (x, y)V2 (y, λ)f2 (y, λ)
(13.16)
where we have set V1,2 (x, λ) = Ax (x, λ) − A±∞ (x, λ) and 1 1 0 ik(λ)x 0 e , f2 = e−ik(λ)x f1 = eiπQ −1 The kernels G1 and G2 are: 1 1 G1 (x, y) = − θ(y − x) eiπQ 2
G2 (x, y) =
1 θ(x − y) 2
1 1
eiπQ 1
eik(λ)(x−y)
1 −eiπQ −ik(λ)(x−y) e + −eiπQ 1 1 −1 1 eik(λ)(x−y) + e−ik(λ)(x−y) −1 1 1
where θ(x) is the step function. To analyse the integral equation for f1 , we set f1 = eik(λ)x f˜1 . Using the properties of the step function, we get a Volterra type integral equation: ∞ 0 ˜ ˜ f1 (x, λ) = f1 − dyK(x, y; λ)f˜1 (y, λ) (13.17) x
with 1 K(x, y; λ) = 2
1 eiπQ
eiπQ 1
+
1 −eiπQ
−eiπQ 1
2ik(λ)(y−x)
e
V1 (y, λ)
By iteration, we find the unique solution: f˜1 = (1 − K1 + K12 + · · ·)f˜10 We show that this series converges uniformly, thereby proving the existence of the Jost solution f1 . In the upper half-plane, we can bound the exponential | exp (2ik(λ)(y − x))| ≤ 1 so that |K(x, y; λ)| ≤ M (y; λ), ∞ dyM (y; λ) converges. This allows us where M (y; λ) > 0 is such that th to bound the n iterate in the following way: ∞ ∞ n ˜0 dx1 K(x, x1 ) · · · K(xn−1 , xn )f˜10 | ≤ |(K f1 )(x)| = | x xn−1
n 1 ∞ 0 ˜ dx1 · · · dxn M (x1 ) · · · M (xn ) = dyM (y) |f˜10 | |f1 | n! x x≤x1 ≤···≤xn
492
13 Classical inverse scattering method
Because of the n! in the denominator, the series is absolutely and uni∞ formly bounded by exp ( x dyM (y)). This is the main observation in the theory of Volterra integral equations. Each term in the expansion is analytic in λ for Im λ > 0, hence the uniform limit is also analytic. Note, however, that for λ = 0, V (y, λ) has a pole so that singularities are expected for f1 and f2 . The equation for f2 is treated in the same way.
Remark 1. If we assume, in the previous proof, that the potential ϕ is such that V1 , V2 have compact support (i.e. the potential reaches its limiting values 0, Qπ/β at finite distance), then the integration domain in the integral equation is finite, and one can bound e2ik(λ)(y−x) even for Im(k) < 0. It follows that f1 (x, λ) and f2 (x, λ) are analytic functions of λ in the whole λ-plane except at λ = 0 and λ = ∞. In this case, the scattering data a(λ), b(λ) can be analytically continued in the λ-plane. This remains true if V1 , V2 vanish at ±∞ rapidly enough to compensate for the growth of e2ik(λ)(y−x) when Im(k) < 0. In particular, choosing λ to be a zero λn of a(λ) in the upper half-plane , and looking at the asymptotic expansions of f1 , f2 when x = −∞, we see that cn = −1/b(λn ) in eq. (13.14). The next task is to compute the asymptotics of the Jost solutions when λ → 0 and λ → ∞. Proposition. The Jost solutions f1 , f2 have the following asymptotics in λ, valid for any x: iπQ − 2 iβϕ 1 e ik(λ)x σ 3 iπQ f1 = e e 2 + O( ) , |λ| → ∞ |λ| e− 2 iβϕ 1 1 −ik(λ)x σ 3 e 2 + O( ) , f2 = e |λ| → ∞ −1 |λ| iπQ 2 e ik(λ)x − iβϕ σ3 2 iπQ e + O(|λ|) , |λ| → 0 f1 = e e 2 1 −ik(λ)x − iβϕ σ 3 f2 = e e 2 + O(|λ|) , |λ| → 0 (13.18) −1 iβϕ
Proof. It is easily checked that the factors e± 2 σ3 are such that the right-hand sides of these equations have the correct asymptotics for f1 , f2 when x → ±∞ respectively, see eq. (13.9). Let us consider the case λ → ∞ and analyse the asymptotics of f1 for definiteness, the other cases being similar. We introduce f1 by performing the gauge transformation: f1 = eik(λ)x ei
βϕ σ 2 3
f1
493
13.2 The Jost solutions
x ]f1 = 0 with: It obeys the transformed equation [∂x + ik(λ)(1 − σ1 ) − A β (ϕ˙ − ϕ ) mλ−1 (1 − e−2iβϕ ) 2 Ax = i mλ−1 (1 − e2iβϕ ) − β2 (ϕ˙ − ϕ ) x rapidly vanNotice that the boundary conditions on ϕ are such that A ishes when x → ±∞. Equivalently, f1 is a solution of the integral equation: ∞ 1 0 ik(λ)(y−x)(1−σ1 ) 0 −i πQ Ax (y, λ)f1 (y, λ), f1 = e 2 f1 (x, λ) = f1 − dy e 1 x We decompose this equation on the eigenvectors of the matrix (1 − σ1 ) corresponding to the eigenvalues 2, 0 respectively: 1 1 f1 (x, λ) = F (x, λ) + G(x, λ) 1 −1 This yields the two coupled scalar integral equations: ∞ 6β F (x, λ) = −i dy e2ik(λ)(y−x) (ϕ˙ − ϕ )G(y, λ) 2 x
7 −mλ−1 (1 − cos(2βϕ))F (y, λ) − i sin(2βϕ)G(y, λ) ∞ 6 πQ β G(x, λ) = e−i 2 − i dy (ϕ˙ − ϕ )F (y, λ) 2 x
7 +mλ−1 (1 − cos(2βϕ))G(y, λ) − i sin(2βϕ)F (y, λ) This is a Volterra system, so the iteration procedure converges, exactly as above. Note that there is a Fourier exponential in the first equation, but not in the second one. Now for h(x) rapidly decreasing at +∞ we have: ∞ dy eiλy h(y) = O(λ−1 ) x
It follows that the iteration procedure yields G(x, λ) = e−i and F (x, λ) = O(λ−1 ).
πQ 2
+ O(λ−1 )
Proposition. The Jost function is analytic in the upper half-plane Im (λ) > 0. Furthermore, iπQ 1 − iπQ , |λ| → ∞; a(λ) = e 2 + O (|λ|) , |λ| → 0 a(λ) = e 2 + O |λ| Recall also that a(−λ) = e−iπQ a∗ (λ) for real λ. Proof. This follows from the relation a(λ) = − 12 W (f1 , f2 ), and from the fact that f1 , f2 are analytic in the upper half-plane.
494
13 Classical inverse scattering method
Remark 2. When x → −∞, we see that f1 ∼ a(λ)
1 1 − b(λ) e−2ik(λ)x 1 −1
Letting x → −∞ in the above Volterra system for F, G, we can directly identify: ∞ 5 πQ β a(λ) = e−i 2 − i dy (ϕ˙ − ϕ )F (y, λ) 2 −∞ 8 +mλ−1 (1 − cos(2βϕ))G(y, λ) − i sin(2βϕ)F (y, λ) 5 ∞ 2ik(λ)y β b(λ) = −i dy e (ϕ˙ − ϕ )G(y, λ) 2 −∞ 8 −mλ−1 (1 − cos(2βϕ))F (y, λ) − i sin(2βϕ)G(y, λ) We see in the expression for a(λ) that it can readily be extended in the upper half-plane, since F and G have such an extension. By contrast, since b(λ) is a Fourier transform on the whole real axis, it cannot be extended outside the real λ axis in general. However, if the field ϕ attains its asymptotic values at finite distance, the integral is over a finite interval, and b(λ) admits an analytic continuation in the whole λ-plane. Similarly, if we assume that the field ϕ(x) is C ∞ , the functions F , G, solutions of the integral equation, will also be C ∞ in x, so that λn b(λ) → 0 for λ → ∞ for any n > 0, and λ−n b(λ) → 0 for λ → 0. This is because, integrating by parts, ∞ ∞ 1 dx eiλx h(x) = − dx eiλx h (x) iλ −∞ −∞ where the second integral exists by hypothesis. Iterating this formula, we see that the Fourier transform of a C ∞ function h(x) goes to 0 faster than any power of λ when λ → ∞. Noting that |a|2 + |b|2 = 1, we see that the modulus of a tends rapidly to 1 for λ → 0, ∞ and so all information is contained in its phase.
We will need alternative representations of the Jost solutions f1 and f2 , in which the λ dependence is explicit. It is convenient to first get rid of the explicit ϕ dependence in the asymptotic expansions eq. (13.18), by defining: i i f1 = e 2 (βϕ−Qπ)σ3 f1 , f2 = e 2 βϕσ3 f2 Proposition. These functions admit the following Fourier representations: ∞
ik(λ)x 0 −1 ( f1 (x, λ) = e f1 + dy U1 (x, y) + λ W1 (x, y) eik(λ)y (13.19) x
f2 (x, λ) = e−ik(λ)x f20 +
x
−∞
(2 (x, y) e−ik(λ)y 2 (x, y) + λ−1 W dy U
13.2 The Jost solutions
495
i (x, y), W (i (x, y) are two component vectors, and where U 1 1 0 0 , f2 = f1 = eiπQ −1 Proof. Consider a function f (λ) analytic in the upper half-plane . Since k(λ) = m(λ − λ−1 ), the map λ → k covers twice the upper half k-plane, the two values λ and −1/λ map to the same value of k. So the function f (λ) can be written: f (λ) =
1 1
1 1 1
f (λ) + f (− ) + f (λ) − f (− ) = g1 (k) + λ + g2 (k) 2 λ 2 λ λ
where g1 and g2 are analytic functions of k. If f is bounded at λ = 0 and λ = ∞, this implies that g1 (k) is bounded and g2 (k) tends to zero 1 k, we can write f (λ) = at k → ∞. Alternatively, since λ + λ1 = 2λ−1 + m −1 h1 (k) + λ h2 (k), and the function h1 is bounded when k → ∞, while h2 tends to zero. By the classical Paley–Wiener theorem, one can represent the functions hi in the form: ∞ hi (k) = ci + ui (y)eiky dy (13.20) 0
where the functions ui (y) are sufficiently regular so that the Fourier integral tends to zero when k → ∞, and ci is then the limiting value of hi . The analyticity of hi in the upper half-plane is accounted for by the support of the Fourier transform on the positive half–line. To apply this to the Jost solutions, we note that e−ikx f1 is analytic in the upper halfplane. Moreover, e−ikx f1 is bounded at λ → 0, ∞ by eq. (13.18), hence one can represent it as in eq. (13.20). Multiplying by eikx and changing variables y + x → y in the integral (13.20), we arrive at eq. (13.19). The other equation is obtained similarly. i , W (i are determined by eq. (13.7) as follows. The funcThe kernels U x )f1 = 0 with the tion f1 obeys the gauge transformed equation (∂x − A connection: β (ϕ˙ − ϕ ) meiπQ (λ − λ−1 e−2iβϕ ) 2 Ax = i meiπQ (λ − λ−1 e2iβϕ ) − β2 (ϕ˙ − ϕ ) We will use the notation −1 1 + A 0 + λ−1 A x = λA A
496
13 Classical inverse scattering method
When we insert the representation eq. (13.19) into that equation, terms in λ and λ−2 appear. To rewrite them, we use: λ−2 eik(λ)y = eik(λ)y −
1 ∂y eik(λ)y , imλ
λeik(λ)y =
1 1 ∂y eik(λ)y + eik(λ)y im λ
and perform integration by parts. The differential equation then translates 1 (x, y) and W (1 (x, y): into the following relations on the kernels U
1 0 (x)U 1 + A 1 (x) + A −1 (x) W (1 1 = A A1 (x)∂y U im
1 0 (x)W (1 + A −1 (x) U (1 = A 1 (x) + A 1 ∂x − A−1 (x)∂y W im
∂x +
and the boundary conditions:
1 0 (x)f10 1 (x, x) = −A 1− A1 (x) U im
1 (1 (x, x) = − A −1 (x) + im f0 A−1 (x) W 1+ 1 im i (x, y) and W (i (x, y), these equations allow us Alternatively, knowing U x (x, λ) and therefore the field ϕ(x, t). In to reconstruct the connection A particular, the boundary conditions yield e2iβϕ(x) =
(1 )2 (x, x) im + eiπQ (W (1 )1 (x, x) im + (W
(13.21)
The Gelfand–Levitan–Marchenko equation which we will present below (i (x, y) directly to the scattering data. i (x, y) and W relates U 13.3 Inverse scattering as a Riemann--Hilbert problem One can of course define Jost solutions, f3 , f4 , analytic in the lower halfplane by choosing the appropriate boundary conditions: 1 1 −ik(λ)x f3 ∼ , x → +∞; f4 ∼ eik(λ)x , x → −∞ e −eiπQ 1 (13.22) These solutions are linear combinations of f1 , f2 . By comparing at x = ±∞, we find the relations, valid for real λ: f3 =
b∗ (λ) −iQπ e−iQπ f1 + e f2 , a(λ) a(λ)
f4 =
1 b(λ) f1 + f2 a(λ) a(λ)
(13.23)
13.4 Time evolution of the scattering data
497
We can write b(λ) = − 12 W (f1 , f4 ). Since f1 and f4 are not analytic in the same half-plane, we recover the fact that b(λ) cannot be extended outside the real axis in general. Let us define the matrices Θ± (λ), analytic in the upper and lower half-plane respectively, by (recall that fi are twocomponent vectors): 1 iQπ 1 −ik(λ)σ3 x , e f3 , ∗ f4 = Θ− (λ)e−ik(λ)σ3 x (f2 , f1 ) = Θ+ (λ)e a∗ (λ) a (λ) The factors e−ik(λ)x are introduced so that Θ± (λ) have finite limits when λ → 0 and λ → ∞ in their respective domains of analyticity. We can write eq. (13.23) as 1 −b(λ)e−2ik(λ)x −1 Θ− Θ+ = −b∗ (λ)e2ik(λ)x 1 This is a Riemann–Hilbert problem, typical of a dressing transformation. Note, however, that the matrix Θ+ is degenerate at the zeroes of a(λ) in the upper half-plane and the matrix Θ−1 − is degenerate at the zeroes of a∗ (λ) in the lower half-plane. We are thus led to a Riemann– Hilbert problem with zeroes, as discussed in Chapter 3. In the following, we propose another route to the solution of the inverse scattering problem, by transforming it to the Gelfand–Levitan–Marchenko linear integral equation. 13.4 Time evolution of the scattering data In the previous sections, from a field ϕ(x, 0) at time t = 0, we have defined the scattering data a(λ), b(λ), λn and cn . The second step in the classical inverse scattering method is to compute the time evolution of the scattering data, which turns out to be beautifuly simple. Proposition. The time evolution of the sine-Gordon theory linearizes on the scattering data: a(λ, ˙ t) = 0, ˙λn = 0,
˙ t) = 2im(λ + λ−1 )b(λ, t) b(λ, c˙n = −2im(λn + λ−1 n )cn
Proof. Recall that for the Jost solution f1 (x, λ) we have 1 eik(λ)x f1 (x, λ)|x→+∞ ∼ eiQπ 1 1 ik(λ)x e e−ik(λ)x f1 (x, λ)|x→−∞ ∼ a(λ) − b(λ) 1 −1
(13.24)
498
13 Classical inverse scattering method
Consider now the time evolution of Ψ given by the second equation of the linear system: ∂Ψ ∂t − At Ψ = 0. In the limit x → +∞, it reduces to ∂Ψ 0 1 iQπ −1 Ψ=0 (13.25) + ie m(λ + λ ) 1 0 ∂t Choose Ψ = α(t)f1 (x, t, λ). Then the asymptotic time evolution equation at x → +∞ gives α˙ = −im(λ + λ−1 )α and for x → −∞ it gives d (αa) + im(λ + λ−1 )(αa) = 0 dt d (αb) − im(λ + λ−1 )(αb) = 0 dt ˙ t) = 2im(λ + λ−1 )b(λ, t) as claimed in the Therefore a(λ, ˙ t) = 0 and b(λ, proposition. For bound states, we have by definition a(λn ) = 0. So λn does not evolve. Consider now the wave function fn (x) ≡ f2 (x, λn ) = cn f1 (x, λn ). We have 1 1 −ik(λn )x e eik(λn )x , fn (x)|x→+∞ ∼ cn fn (x)|x→−∞ ∼ eiQπ −1 Take Ψn = α(t)fn . For x → −∞ we have α˙ = im(λn + λ−1 n )α, while for )c = 0. x → +∞ we have c˙n + 2im(λn + λ−1 n n Integrating eqs. (13.24), we get simple time evolutions of the scattering data: a(λ, t) = a(λ, 0), λn (t) = λn (0),
−1 )t
b(λ, t) = e+2im(λ+λ
−2im(λn +λ−1 n )t
cn (t) = e
b(λ, 0)
(13.26)
cn (0)
13.5 The Gelfand--Levitan--Marchenko equation We now explain the inverse problem which amounts to reconstructing the potential from the scattering data. The Gelfand–Levitan–Marchenko i (x, y) equation is a linear integral equation which determines the kernels U ( and Wi (x, y) from the scattering data. Once these kernels are known, the i (x, x), W (i (x, x) local fields are reconstructed from their boundary values U by eq. (13.21). Recall eq. (13.11), which we write in the form: b∗ (λ) f2 (x, λ) = r(λ)f1 (x, λ) + if 1 (x, λ) with r(λ) = − a(λ) a(λ)
(13.27)
13.5 The Gelfand--Levitan--Marchenko equation
499
i (x, y), W (i (x, y), we need to rewrite this equaTo introduce the kernels U tion in terms of the functions fi . Recall that f1 = g f1 with g = exp ( 2i (βϕ − Qπ)σ3 ). We multiply eq. (13.27) by g −1 . Notice that g −1 f2 = πQ ei 2 σ3 f2 and g −1 f¯1 = f1 , because g −1 f¯1 = g −1 σ2 f1∗ = g −1 σ2 g ∗ f1∗ and g −1 σ2 g ∗ = σ2 . We get: ei
πQ σ3 2
1 ¯ f2 (x, λ) = r(λ)f1 (x, λ) + if1 (x, λ) a(λ)
(13.28)
The Gelfand–Levitan–Marchenko equation is essentially the Fourier transform of this equation. To perform this transformation we need a lemma: Lemma. We have the relations +∞ 2π eik(λ)x dλ = δ(x) m −∞ +∞ dλ eik(λ)x =0 λ −∞ +∞ 2π dλ eik(λ)x 2 = δ(x) λ m −∞
(13.29)
Proof. Because of the singularities at λ = 0, ∞, one has to give a careful definition of these integrals on the real axis. We give a principal part definition, that is we set: ∞
−
dλ = lim −∞
→0
1/
dλ + −1/
dλ
(13.30)
Let us prove the first formula. Recall that k(λ) = m(λ − λ−1 ), so that k(−λ−1 ) = k(λ). In the integral from −1/ to − , change λ → −λ−1 to get 1/ +∞ 1 +∞ ikx 2π ik(λ)x ik(λ)x −2 e dλ = lim e (1 + λ )dλ = e dk = δ(x) →0 m −∞ m −∞ The same technique applied to the second formula yields: 1/ +∞ 1 1 ik(λ)x dλ = lim e eik(λ)x (− + )dλ = 0 →0 λ λ λ −∞ Finally, the last equation is equivalent to the first one changing λ to −1/λ.
500
13 Classical inverse scattering method
1 (x, y), W (1 (x, y), y ≥ x, appearing in the Theorem. The kernels U Fourier transform of f1 , eq. (13.19), satisfy the linear integral equations: −
2iπ U 1 (x, y) = F0 (x + y)f10 m ∞
1 (x, z) + F−1 (y + z)W (1 (x, z) dz + F0 (y + z)U x
2iπ ( W 1 (x, y) = F−1 (x + y)f10 (13.31) − m ∞
1 (x, z) + F−2 (y + z)W (1 (x, z) dz + F−1 (y + z)U x
The functions Fj (x) are directly computed in terms of the scattering data by: ∞ Fj (x) = dλ λj eik(λ)x r(λ) − 2iπ eik(λn )x λjn mn (13.32) −∞
n
where we defined the parameters mn =
cn a (λn )
1 and W (1 always appear with their Proof. Note that, in eq. (13.31), U second argument greater than the first one, in agreement with their definition. Hence the system of two equations (13.31) determines these two quantities in their domain of definition. We multiply eq. (13.28) by λj eik(λ)y (for j = 0, −1) and integrate over λ from −∞ to +∞, with a principal part prescription, getting: ∞ f2 (x, λ) ik(λ)y i πQ σ 3 e 2 dλ λj e a(λ) −∞ ∞
¯ = dλ λj eik(λ)y r(λ)f1 (x, λ) + if1 (x, λ) (13.33) −∞
Recall the Fourier representations: ∞ (1 (x, z))eikz dz 1 (x, z) + λ−1 W f1 (x) = eikx f10 + (U x ∞ 0 −ikx 1 (x, z) + λ−1 W ( 1 (x, z))e−ikz dz f 1 (x) = e f1 + (U x
where the second equation is derived from the first by complex conjugating and multiplying by σ2 . We evaluate the right-hand side of eq. (13.33) using
13.5 The Gelfand--Levitan--Marchenko equation
501
the lemma. We find for j = 0, −1 respectively: ∞ dλ eiky (r(λ)f1 (x) + if1 (x)) −∞
0 2π 2π 1 (x, y) = R0 (x + y)f10 + i δ(x − y)f1 + i θ(y − x)U m m ∞
1 (x, z) + R−1 (y + z)W (1 (x, z) (13.34) dz R0 (y + z)U + x
and similarly: ∞ dλ λ−1 eiky (r(λ)f1 (x) + if1 (x)) −∞
2π ( 1 (x, y) = R−1 (x + y)f10 + i θ(y − x)W m ∞
1 (x, z) + R−2 (y + z)W (1 (x, z) dz R−1 (y + z)U + x
where we have introduced the notation: ∞ Rj (x) = dλ λj eik(λ)x r(λ), −∞
j = 0, −1, −2
To evaluate the left-hand side of eq. (13.33) we use the analyticity properties of the Jost solutions, and the residue theorem. Recall that the λ integrals are defined with the prescription (13.30). We close the contour in the upper half-plane by introducing a small half-circle C of center 0 and radius and a large half-circle C1/ of center 0 and radius 1/ . In the upper half-plane , the integrand has poles at the zeroes λn of a(λ). So the residue theorem gives: ∞ πQ f2 (x, λ) ik(λ)y f2 (x, λn ) i πQ σ 3 dλ λj = 2iπei 2 σ3 eik(λn )y λjn e 2 e a(λ) a (λn ) −∞ n πQ f2 (x, λ) ik(λ)y dλ λj (13.35) + ei 2 σ3 e a(λ) C +C1/ da . To proceed we need to evaluate the integrals on the where a (λ) = dλ half-circles. Using the asymptotic expansions eq. (13.18) we see that on these circles f2 (x, λ) = e−ik(λ)x × regular, and similarly a(λ) is regular. In particular, at ∞ we have iπQ f2 (x, λ) = e 2 e−ik(λ)x (f20 + O(λ−1 )) a(λ)
502
13 Classical inverse scattering method
. We consider, first for j = 0, integrals of the form C eik(λ)(y−x) dλ, where k is large. The existence of such integrals requires y ≥ x, otherwise the exponential explodes. Assuming in the following that this condition is satisfied, we have |eik(λ)(y−x) | ≤ 1 for λ in the upper half-plane, hence the integral on C is bounded by π and can be neglected when → 0. On the other hand, the integral over C1/ reduces by the residue theorem to the integral on the real axis: ∞ 2π eik(λ)(y−x) dλ ∼ eimλ(y−x) dλ = δ(x − y) m C1/ −∞ so that the last term in eq. (13.35) is equal to 0
2π m δ(x
− y)e
iπQ(1+σ3 ) 2
f20 .
Taking into account the value of f1 , this term precisely cancels the δ(x−y) term in the right-hand .side of eq. (13.34). For j = −1, we have to consider integrals of the form C eik(λ)x dλ/λ with x > 0. Write λ = eiθ with 0 < θ < π, and fix η > 0 small enough. The integral takes the form: π mx imx ik(λ)x −1 e λ dλ ∼ i dθe− sin θ− cos θ C
0
On the interval [η, π − η] the integrand decays exponentially when → 0. On the intervals [0, η] and [π − η, π] the integrand is bounded by 1, and the integral by 2η. So the contribution on C can be neglected. A similar analysis shows that the contribution on C1/ can also be neglected. We now evaluate f2 (x, λn ) appearing in eq. (13.35). Precisely at the zeroes λn of a(λ) we have f2 (x, λn ) = cn f1 (x, λn ), which translates into 1 f2 (x, λn ) = cn e−i 2 πQσ3 f1 (x, λn ). Replacing f1 (x, λn ) by its expression, eq. (13.19), one gets:
eik(λn )y λjn
j f2 (x, λn ) ik(λn )y cn λn −i 12 πQσ3 eik(λn )x f10 = e e a (λn ) a (λn ) ∞ ik(λn )z −1 ( + dz e (U1 (x, z) + λn W1 (x, z)) x
Combining everything finally yields the Gelfand–Levitan–Marchenko equations. 13.6 Soliton solutions The solution of the Gelfand–Levitan–Marchenko equation (13.31) is particularly simple when we take R(x) = 0. This corresponds to b(λ) = 0,
503
13.6 Soliton solutions
which means that there is no reflection in the auxiliary scattering problem. Corresponding potentials are called reflectionless potentials. Then the kernels Fj are degenerate and the Gelfand–Levitan–Marchenko equations reduce to a finite linear system. The sine-Gordon solutions ϕ(x, t) we get in this way are just the multi-soliton solutions. If there is no reflection, the scattering data are λn and mn . The Gelfand– Levitan–Marchenko kernels read: Fj (x, y) = −2iπ mn λjn eik(λn )(x+y) n
Looking at the y dependence in the Gelfand–Levitan–Marchenko equation (1 (x, y) can be expanded as: 1 (x, y) and W shows that U ∗ 1 (x, y) = e−ik(λn )(x+y) m∗n un (x) U n
(1 (x, y) = W
∗
e−ik(λn )(x+y) m∗n wn (x)
n
The y exponentials have been chosen to remain bounded when y → ∞, so that the z integrals in the Gelfand–Levitan–Marchenko equations ∗ converge. The factor m∗n e−ik(λn )x has been introduced to simplify later formulae. Inserting these forms into the Gelfand–Levitan–Marchenko equations and identifying the coefficient of exp(ik(λn )y) on both sides yields: im∗p 1 ∗ u ¯n (x) = f10 + e−2ik(λp )x (up (x) + λ−1 n wp (x)) ∗ m k(λn ) − k(λp ) p im∗p λn ∗ w ¯n (x) = f10 + e−2ik(λp )x (up (x) + λ−1 n wp (x)) ∗ m k(λn ) − k(λp ) p Since the right-hand sides of these equations are identical, we get wn (x) = 1 λ∗n un (x). Substituting back into the equation for un and noting that k(λn ) − k(λ∗p ) = m(λn − λ∗p )(1 + 1/(λn λ∗p )), the equation simplifies to: u ¯n (x) = mf10 +
p
im∗p −2ik(λ∗ )x p e up (x) λn − λ∗p
In the following, we restrict ourselves to pure soliton and antisoliton solutions (no breathers), so that the λp are pure imaginary. To connect with the notations of Chapter 12 we set λp = iµp . The Gelfand–Levitan– Marchenko equation becomes, in matrix notation: u ¯ = mf + V u
(13.36)
504
13 Classical inverse scattering method
u1n , f is the vector with where u is the vector with components un = u2n components fn = f10 en , where en = 1 for all n, and we have defined the matrix: µp mp −2m(µp +µ−1 p )x X p , Xp = e (13.37) Vnp = µn + µp µp
Note that, since a(−λ∗ ) = eiπQ (a(λ))∗ and c∗n = −eiπQ cn , we have m∗p = mp , and the matrix V is real in this pure solitonic case. Note that, if one −1 includes the time dependence, i.e. mp → mp e2m(µp −µp )t , the exponential appearing in V becomes exp (−2m[µp (x − t) + µ−1 p (x + t)]), which is the familiar form encountered in Chapter 12. ¯ = −u, we get −u = Taking the bar of eq. (13.36) and using that u mf¯ + V u ¯ so that, eliminating u ¯: (1 + V 2 )u = −mf¯ − mV f Writing (1 + V 2 ) = (1 + ieiπQ V )(1 − ieiπQ V ), this is solved by: u1 = imeiπQ (1 − ieiπQ V )−1 e,
u2 = −im(1 + ieiπQ V )−1 e
We can then compute the field ϕ by eq. (13.21). This is done in the: Proposition. We have: iπQ
(1 )1 (x, x) = im det (1 + ie V ) im + (W det (1 − ieiπQ V ) iπQ (1 )2 (x, x) = im det (1 − ie V ) im + eiπQ (W det (1 + ieiπQ V ) so that e−iβϕ =
τ+ τ−
(13.38) (13.39)
as in eqs. (12.67, 12.76) in Chapter 12.
Proof. Let us prove the first equation. We have (1 )1 (x, x) = i (W Xn (u1 )n = −meiπQ Tr (M ), M ≡ (1−ieiπQ V )−1 e⊗X n
where X is the vector of components Xn as in eq. (13.37). Hence
im + W1 (x, x) = im 1 + ieiπQ Tr (M ) Note that M is of rank 1, so 1 + Tr (ieiπQ M ) = det (1 + ieiπQ M ). Moreover:
1 + ieiπQ M = (1 − ieiπQ V )−1 1 − ieiπQ (V − e ⊗ X)
505
13.7 Poisson brackets of the scattering data Finally, we remark that (V − e ⊗ X) = −µV µ−1 so that 1 + ieiπQ M = (1 − ieiπQ V )−1 µ(1 + ieiπQ V )µ−1
Taking the determinant proves eq. (13.38). Equation (13.39) is obtained similarly. Equation (13.21) gives: e2iβϕ =
det2 (1 − ieiπQ V ) det2 (1 + ieiπQ V )
which identifies with the tau-function formula. The parameters an , µn in Chapter 12 are related to mn , λn in this chapter by: λn = iµn ,
mn = −2ieiπQ µn an
(13.40)
13.7 Poisson brackets of the scattering data We now consider the sine-Gordon equation from the Hamiltonian point of view. The aim is to compute the Poisson brackets of the scattering data defined in the previous sections. We follow the method of the classical r-matrix introduced by Faddeev, Sklyanin and Takhtajan. We start from the canonical Poisson brackets: {π(x), ϕ(y)} = δ(x − y) and define the Hamiltonian to ∞: 1 2 H= dx π (x) + 2 −
on the interval [−, ], eventually will tend 1 4m2 2 (∂x ϕ) (x) + 2 (1 − cos (2βϕ)) dx 2 β
The equations of motion read ϕ˙ = π and π˙ = ϕ − 8m β sin (2βϕ), which reproduces eq. (13.1). Consider the auxiliary linear problem eq. (13.7) on the interval [−, ]. Let Ψ(−) be the value of a solution at x = − and Ψ() its value at x = . Then we can write: 2
Ψ() = T (λ)Ψ(−) where T (λ) is the monodromy matrix.
506
13 Classical inverse scattering method
Proposition. The monodromy matrix and the scattering data are directly related by: a(λ) b(λ) T (λ) ≡ −b∗ (λ) a∗ (λ) iπQ ik(λ) ik(λ) 1 e e −eik(λ) e−ik(λ) e = lim T (λ) e−ik(λ) eiπQ e−ik(λ) −eik(λ) e−ik(λ) 2 →∞ (13.41) Proof. From the asymptotic form of the Jost solutions and the relation |a|2 + |b|2 = 1 we find 1 eik(λ)x x → −∞ 1 1 b f1 + f2 ∼ iπQ a a 1 e ∗ ik(λ)x e−ik(λ)x x → +∞ e +b a iπQ e −1 Any solution Ψ(x) can be written as ψ1 ( a1 f1 + ab f2 ) + ψ2 f2 and behaves at x = − as: −ik(λ) e eik(λ) ψ1 Ψ(−) = e−ik(λ) −eik(λ) ψ2 while at x = +∞ it behaves as: ∗ ik(λ) a e + beiπQ e−ik(λ) Ψ() = ∗ iπQ ik(λ) a e e − be−ik(λ)
−b∗ eik(λ) + aeiπQ e−ik(λ) −b∗ eiπQ eik(λ) − ae−ik(λ)
ψ1 ψ2
The result follows from writing Ψ() = T (λ)Ψ(−), and conjugating the relation obtained by σ1 . To compute the Poisson brackets of the scattering data we need the r-matrix relation: Proposition. {T,1 (λ), T,2 (µ)} = [r12 (λ, µ), T (λ) ⊗ T (µ)]
(13.42)
where the matrix r12 (λ, µ) is given by β 2 λ2 + µ2 4λµ r12 (λ, µ) = H ⊗H + 2 (E+ ⊗ E− + E− ⊗ E+ ) 4 λ2 − µ2 λ − µ2 (13.43)
13.7 Poisson brackets of the scattering data
507
Proof. As we know from Chapter 3, see eqs. (3.90, 3.91), it suffices to prove the much simpler local relation: {Ax,1 (λ, x), Ax,2 (µ, y)} = [r12 (λ, µ), Ax (λ, x) ⊗ I + I ⊗ Ax (µ, y)]δ(x − y) (13.44) The r-matrix is obtained, up to a factor, by the general formula for Toda models, 1 + − + r12 ) r12 = (r12 2 ± are given by eqs. (12.28, 12.29) in Chapter 12 in The formulae for r12 which we insert the root decomposition, eq. (12.59), in the same chapter. We get in the loop representation: 1 1 + = H ⊗H + λ2n H ⊗ µ−2n H r12 4 2 n>0
+λ2n−1 E+ ⊗ µ−2n+1 E− + λ2n−1 E− ⊗ µ−2n+1 E+
(we take as invariant bilinear form on sl(2) the trace in the 2 × 2 representation, so that (H, H) = 2 and (E+ , E− ) = 1. This accounts for the relative factors). Summing the geometric series yields eq. (13.43) in which − |λ| < |µ|. The formula for r12 is the same but with |λ| > |µ|. The antisymmetric matrix r12 is just the half sum of these two identical rational functions for λ = µ. It is easy to check that the relation eq. (13.44) holds true, and this allows us to adjust the factor β 2 . Proposition. The complete list of Poisson brackets of the scattering data is: (1) continuum–continuum: {a(λ), b(µ)} = β 2
λµ a(λ)b(µ) (λ + µ)(λ − µ + i0)
{a(λ), b∗ (µ)} = −β 2
λµ a(λ)b∗ (µ) (λ + µ)(λ − µ + i0)
{b(λ), b∗ (µ)} = −iπβ 2 λ |a(λ)|2 δ(λ − µ) {a(λ), a(µ)} = 0,
{a(λ), a∗ (µ)} = 0,
{b(λ), b(µ)} = 0
(2) continuum–discrete: {a(λ), λn } = 0,
{a(λ), λ∗n } = 0
{b(λ), λn } = 0,
{b(λ), λ∗n } = 0
508
13 Classical inverse scattering method
{a(λ), cn } = −β 2
{b(λ), cn } = 0,
{b(λ), c∗n } = 0
λλn a(λ)cn , 2 λ − λ2n
{a(λ), c∗n } = β 2
λλ∗n a(λ)c∗n λ2 − λ∗2 n
(3) discrete–discrete: {λn , λm } = 0,
{λn , λ∗m } = 0
{cn , cm } = 0,
{cn , c∗m } = 0
β2 (13.45) λn cm δnm , {λ∗n , cm } = 0 unless λm = −λ∗n 2 Proof. Define EL (λ) and ER (λ) as the matrices: iπQ ik(λ) ik(λ) −eik(λ) e−ik(λ) e e e , ER (λ) = EL (λ) = e−ik(λ) eiπQ e−ik(λ) −eik(λ) e−ik(λ) {λn , cm } =
so that T (λ) = 12 lim→∞ EL (λ)T (λ)ER (λ). Inserting this into eq. (13.42), we get the Poisson brackets of the elements of the matrix T (λ): {T1 (λ), T2 (µ)} = r+ (λ, µ)T1 (λ)T2 (µ) − T1 (λ)T2 (µ)r− (λ, µ) where we have defined: r+ (λ, µ) = EL (λ) ⊗ EL (µ)r12 (λ, µ)EL−1 (λ) ⊗ EL−1 (µ), r− (λ, µ) =
−1 ER (λ)
⊗
−1 ER (µ)r12 (λ, µ)ER (λ)
⊗ ER (µ),
→∞ →∞
In order to evaluate these r-matrices, one first computes:
EL HEL−1 = eiπQ e2ik E+ + e−2ik E−
1 − eiπQ H ± e2ik E+ ∓ e−2ik E− EL E± EL−1 = 2 −1 HER = e−2ik E+ + e2ik E− ER
1 −1 ER E± E R = − H ± e−2ik E+ ∓ e2ik E− 2 and then obtain r± , before taking the → ∞ limit: 4 2λµ r± (λ, µ) = 2 H ⊗H 2 β λ − µ2 λ − µ ±2i(k(λ)+k(µ)) + E + ⊗ E+ + e λ+µ λ + µ ±2i(k(λ)−k(µ)) + E + ⊗ E− + e λ−µ
λ − µ ∓2i(k(λ)+k(µ)) E− ⊗ E− e λ+µ λ + µ ∓2i(k(λ)−k(µ)) E− ⊗ E+ e λ−µ
13.7 Poisson brackets of the scattering data
509
We take the limit → ∞ in the sense of distribution theory. This is done using the formula: 1 lim P e±ix = ±iπδ(x) →∞ x Indeed, if f is analytic with slow growth at ∞ and x > 0 one can compute − ∞ 1 ±ix + f (x)dx = iπf (0) e x −∞ by considering the closed contour obtained by adding a half-circle C and a half-circle at infinity. The integral on this last circle vanishes in the limit → ∞, and the integral on C gives −iπf (0). From this we deduce a formula more suited to our case, lim P
→∞
1 e±2i(k(λ)−k(µ)) = ±iπδ(λ − µ) λ−µ
which is obtained by the change of variables λ − µ → 2(k(λ) − k(µ)) in the delta function. We have similar formulae for λ + µ, but due to the symmetry properties under λ → −λ, a(−λ) = eiπQ a∗ (λ) and b(−λ) = −e−iπQ b∗ (λ), we can restrict ourselves to λ > 0 and µ > 0, and ignore the terms in δ(λ + µ). We can now take the limit → ∞ in r± , getting: r± (λ, µ) =
β2 λ + µ λ − µ
P − H ⊗H 8 λ−µ λ+µ iπβ 2 (λ + µ)(E+ ⊗ E− − E− ⊗ E+ )δ(λ − µ) ∓ 4
From this we compute the Poisson brackets of the scattering data. For instance: λµ iπ 2 {a(λ), b(µ)} = β a(λ)b(µ) − (λ + µ)δ(λ − µ)b(λ)a(µ) λ2 − µ2 4
λ − µ 2 1 β (λ + µ) P a(λ)b(µ) = − iπδ(µ − λ) − 4 λ−µ λ+µ λµ = β2 a(λ)b(µ) (λ − µ + i0)(λ + µ) where in the last step we have used the identity: 1 1 = P ∓ iπδ(x) x ± i0 x
510
13 Classical inverse scattering method
Note that the left-hand side and the right-hand side are analytic in the upper λ half-plane , as it should be. The other Poisson brackets are computed similarly. There are sixteen Poisson brackets in {T1 , T2 } but the independent ones are listed in the proposition. The Poisson brackets for the discrete spectrum would require a detailed analysis, but they can be obtained quickly using the following trick. We insisted already on the fact that the function b(µ) cannot be analytically continued in the upper half-plane. The situation changes, however, if the field ϕ(x) is compactly supported. Then b(µ) is analytic in the plane, and we have seen that: 1 cn = − b(λn ) Assuming that we are in such a case, setting µ = λm into the equation for {a(λ), b(µ)}, we get immediately {a(λ), cm } = −β 2 λλm /(λ2 −λ2m )a(λ)cm . an in this equation with λ → λn , one gets Letting further a(λ) = (λ − λn )˜ β2 {λn , cm } = 2 λn cm δnm . The remaining Poisson brackets are computed similarly. 13.8 Action--angle variables Due to the boundary conditions of the field ϕ, which differ at x = −∞ and x = +∞, the generating function of conserved quantities is a modified trace of the monodromy matrix, specifically Tr (T (λ)ρ), where ρ is the iπQ iπQ diagonal matrix ρ = Diag (e− 2 , e 2 ). This is a consequence of the zero curvature condition which implies, (see eq. (3.72) in Chapter 3): ∂t T (λ, t) = At (λ, )T (λ, t) − T (λ, t)At (λ, −) and the explicit values At (λ, ) = −im(λ + λ−1 )eiπQ σ1 and At (λ, −) = −im(λ + λ−1 )σ1 for → +∞. Computing T (λ, t) from eq. (13.41), we can express Tr (T (λ)ρ) in terms of the scattering data: Tr (T (λ)ρ) ∼ e
iπQ 2
e−2ik(λ) a(λ) + e−
iπQ 2
e2ik(λ) a∗ (λ)
So the generating functional for conserved quantities can be taken as a(λ). Since {a(λ), a(µ)} = 0 these conserved quantities Poisson commute. Remembering the asymptotics iπQ 1 − iπQ , |λ| → ∞; a(λ) = e 2 + O (|λ|) , |λ| → 0 a(λ) = e 2 + O |λ| we see that we can expand log a(λ) around λ = 0 and λ = ∞: ∞
log a(λ) = −
iπQ + In (iλ)−n , + 2 n=1
|λ| → ∞
(13.46)
511
13.8 Action--angle variables ∞
log a(λ) =
iπQ − In (iλ)n , + 2
|λ| → 0
(13.47)
n=1
We will now calculate the In± in two different ways. The first one will give In± in terms of the original sine-Gordon field ϕ, while the second one will express In± in terms of the scattering data. In order to compute In+ in terms of ϕ, we perform a gauge transformaβ ˜ so that the gauge transformed connection A˜x reads: tion Ψ → ei 2 ϕσ3 Ψ, β A˜x = i (ϕ˙ − ϕ )H + imλ(1 − λ−2 e−2iβϕ )E+ + imλ(1 − λ−2 e2iβϕ )E− 2 Note that A˜x takes the same value at x = ±∞ and we have Tr (T˜ (λ)) = Tr (T (λ)ρ). To compute this trace we can directly apply eq. (11.8) in Chapter 11, where we have found that Tr (T˜ (λ)) = eP (λ) + e−P (λ) For smooth (C ∞ ) fields ϕ, the quantity P (λ) admits asymptotic expansions for λ → 0 and λ → ∞ which can be found using the Ricatti equation. The coefficients of this asymptotic expansion are integrals over local densities in the field ϕ containing higher and higher derivatives. Hence the smoothness condition is essential for their existence. On the other hand, for such smooth fields we have seen that b(λ) vanishes at λ = 0, ∞ as well as all its derivatives. This means that b(λ) has zero asymptotic expansion at these points. Since |a|2 = 1 − |b|2 we see that |a| = 1 in the asymptotic sense, or a(λ)∗ ∼ 1/a(λ). Using this fact, we can compare the asymptotic expansions in both sides of the equation: eP (λ) + e−P (λ) = e
iπQ 2
e−2ik(λ) a(λ) + e−
iπQ 2
e2ik(λ) a∗ (λ)
and identify in the asymptotic sense: P (λ) =
iπQ − 2ik(λ) + log a(λ), 2
λ→∞
To compute the left-hand side, we recall that P (λ) = − v(x, λ)dx, where v(x, λ) is a solution of the Ricatti equation v +v 2 = V and V is determined in eq. (11.8) in terms of A˜x . We compute this expansion at the lowest nontrivial order, so that O(λ−2 ) terms are neglected. We get: V = −m2 λ2 −
β2 β (ϕ˙ − ϕ )2 + 2m2 cos(2βϕ) − i (ϕ˙ − ϕ ) + O(λ−2 ) 4 2
512
13 Classical inverse scattering method
One inserts v = ±imλ + · · · and observes that there is no O(1) term. Up to a choice of sign, and substituting k(λ) = mλ − mλ−1 , we have:
im β 2 iβ 1 2 v = −ik(λ) − ( ϕ ˙ − ϕ ) + 1 − cos(2βϕ) + ( ϕ ˙ − ϕ ) + O( 2 ) 2 2 λ 8m 4m λ Since (ϕ˙ − ϕ ) vanishes at x = ±∞, we obtain: P (λ) = −2ik(λ) − imλ−1 where:
H= −
β2 (H − P ) + O(λ−2 ) 4m2
1 2 1 2 4m2 ϕ˙ + ϕ + 2 (1 − cos (2βϕ)) dx 2 2 β P = ϕϕ ˙ dx −
In−
(λ → 0) we perform the gauge transformation Similarly, to compute −i β2 ϕσ3 ˜ Ψ, so that the gauge transformed connection A˜x reads: Ψ→e β A˜x = i (ϕ˙ + ϕ )H − imλ−1 (1 − λ2 e2iβϕ )E+ − imλ−1 (1 − λ2 e−2iβϕ )E− 2 Note, however, that, due to the sign change in the gauge transformation, we now have Tr T˜ = Tr (T ρ−1 ) = eiπQ Tr (T ρ). The same computation as above yields: P (λ) = −2ik(λ) + imλ
β2 (H + P ) + O(λ2 ) 4m2
Comparing with the asymptotic expansions in: Tr T˜ = eP (λ) + e−P (λ) = e−iπQ (e we find P (λ) = −
iπQ 2
e−2ik(λ) a(λ) + e−
iπQ − 2ik(λ) + log a(λ), 2
iπQ 2
e2ik(λ) a∗ (λ))
λ→0
from which we identify: I1± =
β2 (H ∓ P ) 4m
Now we reconstruct a(λ) from its analyticity properties, and get alternative expressions for the quantities In± . Recall that the function a(λ) is analytic in the upper half-plane, behaves at λ = 0, ∞ as: iπQ iπQ 1 , |λ| → ∞; e 2 a(λ) = eiπQ +O (|λ|) , |λ| → 0 e 2 a(λ) = 1+O |λ|
513
13.8 Action--angle variables iπQ
iπQ
and obeys (e 2 a(−λ∗ ))∗ = e 2 a(λ). Therefore we can reconstruct a(λ) if we know its modulus on the real axis and the position of its zeroes. ∞ λ − λi iπQ 1 log |a(µ)| exp − e 2 a(λ) = dµ (13.48) λ − λ∗i iπ −∞ λ − µ + i0 i
The right-hand side is analytic in the upper half-plane. It is invariant under complex conjugation and changing λ → −λ∗ because for each λi there is a λj = −λ∗i and one can change variables µ → −µ in the integral (|a(−µ)| = |a(µ)|). For λ → ∞ one gets the correct asymptotic value 1. Finally, for λ real, using: 1 1 =P − iπδ(λ − µ) λ − µ + i0 λ−µ the modulus of the right-hand side is exp ( log |a(µ)|δ(λ−µ)dµ) = |a(λ)|. Using the asymptotic value at λ = 0, and noting that the integral vanishes for λ = 0 (by µ → −µ) we must have eiπQ = i (λi /λ∗i ). Note that a pair of roots symmetric with respect to the imaginary axis contribute a +1 in this product, but pure imaginary roots contribute a −1, so, modulo 2, Q must be equal to the number of pure imaginary roots, i.e. the total number of solitons and antisolitons must have the same parity as the topological charge Q. This is as it should be since a soliton has Q = 1 and an antisoliton has Q = −1. On the real axis we have |a(λ)|2 = 1 − |b(λ)|2 and we can replace 1 π log |a(λ)| by 1 ρ(λ) = log (1 − |b(λ)|2 ) 2π in eq. (13.48). Thus a(λ) can be reconstructed from the knowledge of |b(λ)| on the real axis and the zeroes λn . For a smooth field ϕ, ρ(µ) decreases fast at µ → 0, ∞, so that one can expand 1/(λ − µ) in powers of λ/µ or µ/λ in the integral. We get asymptotic expansions at λ → ∞: ∞ ∞ iπQ 1 λnj − λ∗n j log a(λ) = − − µn−1 ρ(µ)dµ + +i 2 λn n −∞ n=1
and at λ → 0: log a(λ) =
iπQ + 2
∞ n=1
j
1 λ n − n j
1 1 − ∗n λnj λj
−i
∞
−∞
µ−n−1 ρ(µ)dµ
514
13 Classical inverse scattering method
Comparing with eqs. (13.46, 13.47), we identify the In± . We find In± = 0 for n even, and for n odd In± = ±I±n , where In is defined for n ∈ Z by: (n−1)/2
In = (−1)
λn − λ∗n j
j
j
in
−
∞
µn−1 ρ(µ)dµ ∈ R
0
In particular, for n = ±1 we obtain: ∞ ±1 ∗±1 4m λj − λj dµ − H ±P = 2 ± µ±1 ρ(µ) β i µ −∞ j
If, for simplicity, we don’t consider breathers and set λj = iξj , we get: ∞ dµ 4m (µ + µ−1 )|ρ(µ)| H = 2 (ξj + ξj−1 ) + β µ 0 j ∞ 4m dµ P = 2 (ξj − ξj−1 ) + (µ − µ−1 )|ρ(µ)| β µ 0 j
Setting kj = written as:
8 m (ξ β2 2 j
− ξj−1 ), M =
8 m β2
and k =
m 2 (µ
− µ−1 ), this can be
∞ 8 dk kj2 + M 2 + 2 k 2 + m2 |ρ(k)| √ 2 β −∞ k + m2 j ∞ 8 dk P = kj + 2 k|ρ(k)| √ β −∞ k 2 + m2 j
H =
which nicely exhibits the decomposition of the theory into a sum of relativistic modes. Note that the soliton j has mass M and momentum kj , while the continuous spectrum is a superposition of modes of mass m. Remark. One can extract the Poisson brackets of the solitonic modes from eq. (13.45) and recover eq. (12.81) in Chapter 12. The parameters an , µn are related to mn , λn by eq. (13.40). Moreover, from eqs. (13.32, 13.48), for purely solitonic solutions we have: πQ λ − λj cn mn = , a(λ) = e−i 2 a (λn ) λ + λj j so that: an = 2iei
πQ 2
cn
µn + µj µn − µj
j=n
13.8 Action--angle variables
515
Then a straightforward computation using eqs. (13.45), which mean that log µn and log cn are canonically conjugated variables, yields: β2 β2 4µi µj ai aj µi aj δij , {ai , aj } = {µi , µj } = 0, {µi , aj } = 2 2 µ2i − µ2j which identifies with eq. (12.81) up to the factor β 2 (we should set β = −i to compare the two formulae).
With this, we end this chapter on the classical inverse scattering method, thereby paying due tribute to Gardner, Greene, Kruskal and Miura, without whom this book would not exist. References [1] C.S. Gardner, J.M. Greene, M.D. Kruskal and R.M. Miura, Method for solving the Korteweg–de Vries equation. Phys. Rev. Lett. 19 (1967) 1095. [2] E.K. Sklyanin, On the complete integrability of the Landau–Lifchitz equation. Preprint LOMI E-3-79. Leningrad (1979). [3] S.Novikov, S.V. Manakov, L.P. Pitaevskii and V.E. Zakharov, Theory of Solitons, the Inverse Scattering Method. Consultants Bureau, NY (1984). [4] L.D. Faddeev and L.A. Takhtajan, Hamiltonian Methods in the Theory of Solitons. Springer (1986).
14 Symplectic geometry
The aim of this chapter is to provide a concise presentation of classical mechanics in the framework of symplectic geometry. This geometrical approach of mechanics is essential to gain any understanding of integrable systems theory. We assume the reader has some basic knowledge of elementary differential geometry and differential forms, but we present all the symplectic theory we need. We then explain the notion of symplectic reduction under a Lie group action, a concept which frequently appears in our discussion of integrable systems. The chapter ends by a discussion of a more recent topic, Poisson–Lie groups, which is used in the analysis of dressing transformations, and whose importance has to be stressed in connection with quantum group theory.
14.1 Poisson manifolds and symplectic manifolds

In this chapter we investigate the formulation of classical mechanics using Poisson brackets. We work on a phase space M and consider the differentiable functions on M. We denote by F(M) the algebra of such functions. A Poisson bracket is a bilinear antisymmetric derivation of the algebra F(M):

{ , } : F(M) × F(M) → F(M)

such that:

{f1, f2} = −{f2, f1},   antisymmetry
{f1, αf2 + βf3} = α{f1, f2} + β{f1, f3}, α, β constants
{f1, f2 f3} = {f1, f2} f3 + f2 {f1, f3},   Leibniz rule
{f1, {f2, f3}} + {f3, {f1, f2}} + {f2, {f3, f1}} = 0,   Jacobi identity

Since the Poisson bracket is linear in f and obeys the Leibniz rule, with any function H ∈ F(M) we can associate a vector field XH on M by
XH f = {H, f}. When H is the Hamiltonian of the system, this vector field defines the time evolution by

ḟ = XH f = {H, f},   ∀f ∈ F(M)

Definition. A Poisson manifold M is a manifold on which a Poisson bracket is defined.

The important feature of Poisson brackets in classical mechanics is that if f1 and f2 are two conserved quantities, i.e. {H, f1} = {H, f2} = 0, then {f1, f2} is also conserved, due to the Jacobi identity. In general a Poisson bracket is degenerate, which means that there are functions f on M such that {f, g} = 0 for all g. The set of such functions is called the centre of the Poisson algebra. If the centre is non-trivial, i.e. contains non-constant functions, one can reduce the dynamical system by setting all functions of the centre to constant values. This defines a foliation of M, and the Poisson bracket is non-degenerate on the leaves. In a local coordinate system x^j on M we can write:

{f1, f2}(x) = Σ_{ij} P^{ij}(x) ∂f1/∂x^i ∂f2/∂x^j
due to the bilinearity and the Leibniz rule. Antisymmetry requires that P^{ij}(x) = −P^{ji}(x). The Jacobi identity reads:

P^{is} ∂s P^{jk} + P^{ks} ∂s P^{ij} + P^{js} ∂s P^{ki} = 0

where ∂s = ∂/∂x^s and summation over s is understood. Assume now that the matrix P^{ij} is invertible. In particular, this means that the centre of the Poisson algebra is trivial. This can occur only when the dimension of M is even. Denote by (P^{−1})_{ij} the inverse matrix of P^{ij}, so that ∂s P^{ij} = −P^{ia} ∂s(P^{−1})_{ab} P^{bj}. Inserting this into the Jacobi identity yields:

Σ_{s,a,b} P^{is} P^{ja} P^{kb} [ ∂s(P^{−1})_{ab} + ∂b(P^{−1})_{sa} + ∂a(P^{−1})_{bs} ] = 0

Since P^{ij} is invertible, this is equivalent to the linear conditions:

∂a(P^{−1})_{bc} + ∂b(P^{−1})_{ca} + ∂c(P^{−1})_{ab} = 0

These conditions can be interpreted as the closedness of the 2-form:

ω = − Σ_{i<j} (P^{−1})_{ij} dx^i ∧ dx^j

that is, dω = 0. This 2-form is invariant under changes of coordinates, and so is globally defined on M. We emphasize that the matrices entering the definition of the Poisson bracket and the symplectic form are inverse to each other.
Definition. A symplectic manifold (M, ω) is a manifold M equipped with a non-degenerate closed 2-form ω, dω = 0.

Given a function H on a symplectic manifold, the Hamiltonian vector field XH is defined using the interior product iX by:

dH = −iXH ω,   i.e.   dH = −ω(XH, ·)
Using local coordinates x^i on M we have

ω = Σ_{i<j} ωij dx^i ∧ dx^j

Setting XH = Σ_i X^i_H ∂i, we get X^i_H = Σ_j ω^{ij} ∂j H, where ω^{ij} is the inverse matrix of ωij. Knowing the symplectic form ω one can reconstruct the Poisson bracket as follows:

{f1, f2} = Xf1(f2) = −Xf2(f1) = ω(Xf1, Xf2)   (14.1)

In components, we have

{f1, f2} = − Σ_{ij} ω^{ij} ∂i f1 ∂j f2
On a symplectic space, one can define the notion of symplectic transformations. Consider a bijection γ : M → M and the transform of the symplectic form ω under γ. This is the form (γ ∗ ω)m (V, W ) = ωγ(m) (γ∗ V, γ∗ W ) where m is a point of M , and γ∗ is the differential of γ, sending a tangent vector at m to a tangent vector at γ(m). We say that the transformation γ is symplectic if γ ∗ ω = ω. For an infinitesimal transformation, γ is specified by a vector field X on M , and this condition is equivalent to LX ω = 0, where LX is the Lie derivative, LX = diX + iX d. We translate the symplecticity property of ω on Poisson brackets and show that it reads: γ{f, g} = {γf, γg} (14.2) We recall that the action of γ on functions is given by (γf )(m) = f (γ −1 (m)). Note that we have introduced the inverse of γ so that the property γ1 · (γ2 · f ) = (γ1 γ2 ) · f holds. If γ is symplectic, we have:
Xγf (m) = γ∗ Xf (γ −1 (m))
This is because d commutes with the pullback operation, i.e. d(γf) = γ^{−1∗} df, so applying this to V ∈ Tm M we get d(γf)m(V) = df_{γ^{−1}(m)}(γ∗^{−1}V), that is:

ωm(Xγf, V) = ω_{γ^{−1}(m)}(Xf(γ^{−1}(m)), γ∗^{−1}V) = ωm(γ∗ Xf ◦ γ^{−1}(m), V)

where in the last step we have used that γ is symplectic. We use this to prove eq. (14.2). We have

{γf, γg}m = ωm(Xγf(m), Xγg(m)) = ω_{γ(γ^{−1}(m))}(γ∗ Xf ◦ γ^{−1}, γ∗ Xg ◦ γ^{−1})

Using that γ is symplectic, this is equal to

ω_{γ^{−1}(m)}(Xf ◦ γ^{−1}(m), Xg ◦ γ^{−1}(m)) = {f, g}(γ^{−1}(m)) = (γ{f, g})(m)

Proposition. Any Hamiltonian flow is a symplectic transformation.

Proof. Let H be the Hamiltonian with associated vector field XH such that dH = −iXH ω. We have:

LXH ω = (iXH d + diXH)ω = d(iXH ω) = −d²H = 0

where we have used dω = 0.

Example 1. A standard example of a symplectic space is given by M = R^{2n} with coordinates (pi, qi) and symplectic form:

ω = Σ_i dpi ∧ dqi

These coordinates are called canonical coordinates. The corresponding Poisson bracket reads:

{qi, qj} = 0,   {pi, pj} = 0,   {pi, qj} = δij

The Hamiltonian vector field corresponding to the function H is:

XH = Σ_i ( −(∂H/∂qi) ∂pi + (∂H/∂pi) ∂qi )
In fact this example is generic, at least locally. This is the Darboux theorem.
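The proposition that Hamiltonian flows are symplectic can also be seen numerically. A minimal sketch, assuming two uncoupled harmonic oscillators (an arbitrary choice) whose time-t flow is an explicit block rotation: we check that the flow matrix M preserves the matrix J of ω = Σ_i dpi ∧ dqi, i.e. MᵀJM = J.

```python
# Numerical check that a Hamiltonian flow preserves the symplectic form.
# For H = (p1^2 + q1^2)/2 + w*(p2^2 + q2^2)/2 the time-t flow in the
# coordinates (q1, p1, q2, p2) is an explicit block rotation.
import numpy as np

def flow_matrix(t, w):
    """Time-t flow of two uncoupled oscillators with frequencies 1 and w."""
    def rot(a):
        # solves qdot = p, pdot = -q  (resp. with frequency w)
        return np.array([[np.cos(a), np.sin(a)], [-np.sin(a), np.cos(a)]])
    M = np.zeros((4, 4))
    M[:2, :2] = rot(t)
    M[2:, 2:] = rot(w * t)
    return M

# Matrix of omega = sum_i dpi ^ dqi in (q1, p1, q2, p2): omega(u, v) = u^T J v
J = np.array([[0., -1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., -1.],
              [0., 0., 1., 0.]])

M = flow_matrix(0.7, np.sqrt(2.0))
assert np.allclose(M.T @ J @ M, J)   # the flow is a symplectic map
```

In dimension two, preserving ω is the same as preserving area; the four-dimensional example shows the genuinely symplectic condition.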
Fig. 14.1. Foliation of phase space by the surfaces Σt and Φs.

Theorem. On any symplectic manifold (M, ω) one can introduce, locally around a point m0, canonical coordinates (qi, pi) such that ω = Σ_i dpi ∧ dqi. Moreover, one can choose p1 to be any given function H on M such that dH(m0) ≠ 0.

Proof. The proof will be done by induction on the dimension (2n) of M. We choose a coordinate system y ∈ R^{2n} around m0, and define p1(y) = H(y). We can assume that p1(m0) = 0. We then introduce the Hamiltonian vector field Xp1 associated with p1, which is non-vanishing in the considered neighbourhood of m0. Choose a hypersurface Σ0 passing through m0 and transverse to Xp1, i.e. such that Xp1 is not tangent to Σ0. We want to define q1 such that {p1, q1} = 1 and q1 = 0 on Σ0. This means that on any trajectory of the flow of Xp1 we want to achieve q̇1 = {p1, q1} = 1. Hence q1(y) − q1(z) = q1(y) = t, where z is the point at which the trajectory crosses Σ0, that is q1(z) = 0. By the assumption of transversality, for any y close to Σ0 one defines q1(y) as the time needed to go from z ∈ Σ0 to y along the trajectory of the flow of Xp1. The neighbourhood of m0 is foliated by the hypersurfaces Σt where q1(y) = t. On the other hand, it is also foliated by the hypersurfaces Φs where p1(y) = s. We now show that one can simultaneously solve:

∂s y = −Xq1(y),   ∂t y = Xp1(y)

thereby allowing us to write y = y(s, t, z) with z ∈ Σ0 ∩ Φ0. This is because for any function f(y) one has by definition ∂s f = −Xq1 f = {−q1, f} and
∂t f = Xp1 f = {p1, f}, so that, using the Jacobi identity:

∂s(∂t f) − ∂t(∂s f) = −{q1, {p1, f}} + {p1, {q1, f}} = {{p1, q1}, f} = 0

since {p1, q1} = 1 is constant. The vector field Xp1 is tangent to Φ0 (in fact to any Φs, because Xp1 p1 = {p1, p1} = 0), and similarly Xq1 is tangent to Σt. They are both transverse to Σ0 ∩ Φ0 and independent, because {p1, q1} = ω(Xp1, Xq1) = 1 ≠ 0. For any vector V tangent to Σ0 ∩ Φ0 we have ω(V, Xp1) = ω(V, Xq1) = 0, because ω(V, Xp1) = V · p1, ω(V, Xq1) = V · q1, and p1 = q1 = 0 are constant on this intersection. It follows that the restriction of ω to Σ0 ∩ Φ0 is non-degenerate. By the induction hypothesis we assume that we have found canonical coordinates pi, qi, i ≥ 2, on the (2n − 2)-dimensional symplectic variety Σ0 ∩ Φ0. We extend these coordinates as functions on M by setting pj(y) = pj(z) and qj(y) = qj(z) for j ≥ 2 and y = y(p1, q1, z) with z ∈ Σ0 ∩ Φ0. This amounts to keeping them constant along the flows of Xp1 and Xq1, so that

{p1, pj} = {p1, qj} = {q1, pj} = {q1, qj} = 0

It remains to show that the symplectic form ω on M is equal to ω̃ = Σ_{i=1}^n dpi ∧ dqi. We first show that this is true at any point p1 = q1 = 0 of Σ0 ∩ Φ0. Any vector V tangent to M at this point can be decomposed as a sum of a vector V1 in the space spanned by Xp1 and Xq1, and a vector V2 tangent to Σ0 ∩ Φ0. Computing ω(V1 + V2, W1 + W2), we have seen that ω(V1, W2) = ω(V2, W1) = 0, while by the induction hypothesis ω(V2, W2) = (Σ_{j≥2} dpj ∧ dqj)(V2, W2). Finally, ω(V1, W1) = (dp1 ∧ dq1)(V1, W1), because it is sufficient to compute this for V1 = Xp1 and W1 = Xq1, and both members are then equal to {p1, q1} = 1. To show that the equality holds on M, consider the Hamiltonian evolution U(s,t) under the flows of Xp1 and Xq1. In the coordinates (pi, qi) it reads (p1, q1, p2, q2, . . .) → (p1 + s, q1 + t, p2, q2, . . .), so that the form ω̃ = Σ_i dpi ∧ dqi is obviously invariant.
On the other hand, the symplectic form ω is also invariant under Hamiltonian evolutions. Since we have shown that ω = ω̃ for p1 = q1 = 0, these two forms coincide everywhere.

Example 2. Another very natural example of a symplectic space is the cotangent bundle M = T∗N of any differentiable manifold N. On this bundle is defined a canonical 1-form α given by:

αx(X) = p(π∗X)

where X ∈ Tx(T∗N) and x ∈ T∗N. One can write x = (q, p), with q = π(x) ∈ N (π is the projection on N), and p is a 1-form belonging to
the fibre over q. We define the symplectic form as the closed form ω = dα. To show that it is non-degenerate we express it in terms of local coordinates. If q = (q1, . . . , qn) is a system of local coordinates on N, a basis of the tangent space of N at q is (∂q1, . . . , ∂qn), and a basis of the cotangent space at q is (dq1, . . . , dqn). In particular, any point p in the fibre of T∗N above q is of the form p = Σ_i pi dqi. A tangent vector X = (δq, δp) at the point (q, p) of T∗N has projection π∗X = δq = Σ_i (δq)i ∂qi, so that α(X) = Σ_i pi (δq)i, and the canonical form reads:

α = p1 dq1 + · · · + pn dqn

Then ω = dα = Σ_i dpi ∧ dqi is non-degenerate.

14.2 Coadjoint orbits
Our aim is to present a non-trivial example of a symplectic structure on the coadjoint orbits of a Lie group, which plays an important role in the study of integrable systems. We first recall some notions about the adjoint and coadjoint actions of Lie algebras and Lie groups. Let G be a connected Lie group with Lie algebra G. The group G acts on G by the adjoint action, denoted Ad:

X −→ (Ad g)(X) = gXg−1,   g ∈ G, X ∈ G
Similarly, the coadjoint action of G on the dual G∗ of the Lie algebra G (i.e. the vector space of linear forms on the Lie algebra) is defined by:

(Ad∗g · Ξ)(X) = Ξ(Ad g−1(X)),   g ∈ G, Ξ ∈ G∗, X ∈ G

The infinitesimal version of these actions provides actions of the Lie algebra G on G and G∗, denoted by ad and ad∗, given by:

ad X(Y) = [X, Y],   X, Y ∈ G
ad∗X · Ξ(Y) = −Ξ([X, Y]),   X, Y ∈ G, Ξ ∈ G∗
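For matrix groups these formulas are easy to exercise numerically. A small sketch for G = SO(3), where Ad(g)X = gXg⁻¹ and ad(X)Y = [X, Y] = d/dt Ad(e^{tX})Y |_{t=0}; the truncated-series exponential below is just an illustration device, not a production routine.

```python
# Adjoint action of a matrix group and its infinitesimal version.
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by truncated Taylor series (fine for small matrices)."""
    out, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def Ad(g, X):
    return g @ X @ np.linalg.inv(g)

def comm(X, Y):
    return X @ Y - Y @ X

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
X, Y = X - X.T, Y - Y.T          # two elements of so(3)
g = expm(X)                      # an element of SO(3)

# Ad(g) is a Lie-algebra automorphism: Ad(g)[X, Y] = [Ad(g)X, Ad(g)Y]
assert np.allclose(Ad(g, comm(X, Y)), comm(Ad(g, X), Ad(g, Y)))

# Finite-difference check that d/dt Ad(e^{tX})Y |_{t=0} = [X, Y] = ad X (Y)
t = 1e-6
assert np.allclose((Ad(expm(t * X), Y) - Y) / t, comm(X, Y), atol=1e-4)
```

The coadjoint action is obtained from these formulas by dualizing, as in the displayed definitions.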
Coadjoint orbits in G ∗ are equipped with the canonical Kostant–Kirillov symplectic structure. Before defining it we need some simple facts concerning the functions on G ∗ . Denote the space of such functions by F(G ∗ ). The coadjoint action of G on G ∗ induces a coadjoint action of G on functions on G ∗ , also denoted by Ad∗ . If g ∈ G and h ∈ F(G ∗ ), Ad∗ g . h(Ξ) = h(Ad∗ g −1 (Ξ)) and a similar formula for the infinitesimal action ad∗ . For a function h on G ∗ the differential dh may be viewed as an element of G. This is because,
G∗ being a vector space, the differential dh is a linear form on G∗, i.e. an element of G∗∗ ≃ G, and one can write, for δΞ ∈ G∗:

h(Ξ + δΞ) = h(Ξ) + δΞ(dh) + O((δΞ)²)

From the Lie algebra structure on G, we can construct a Poisson bracket on the space F(G∗) of functions on G∗. This is the Kostant–Kirillov bracket.

Definition. If h1 and h2 ∈ F(G∗), the Kostant–Kirillov bracket is defined by:

{h1, h2}(Ξ) = Ξ([dh1, dh2])

It is obvious that the bracket { , } is antisymmetric and verifies the Jacobi identity. Choosing two linear functions h1(Ξ) = Ξ(X) and h2(Ξ) = Ξ(Y) with X, Y ∈ G, we have dh1 = X and dh2 = Y, and the Kostant–Kirillov Poisson bracket reads:

{Ξ(X), Ξ(Y)} = Ξ([X, Y])

The right-hand side is the linear function Ξ → Ξ([X, Y]). This Poisson bracket is very natural, but one has to be aware that it is a degenerate Poisson bracket.

Proposition. The kernel of the Kostant–Kirillov bracket { , } is the set I(G∗) of Ad∗-invariant functions.

Proof. Let us first express the Ad∗-invariance property of a function h on G∗. Performing an infinitesimal transformation, we have:

h(Ξ + t ad∗X · Ξ) = h(Ξ) + t (ad∗X · Ξ)(dh) + O(t²)

so h is invariant if (ad∗X · Ξ)(dh) = Ξ([dh, X]) = 0 for all Ξ ∈ G∗ and all X ∈ G. Assuming now that k ∈ F(G∗) is in the kernel of { , }, we have {k, f}(Ξ) = Ξ([dk, df]) = 0 for all f ∈ F(G∗), in particular for f = Ξ(X) with any X ∈ G. Thus k is ad∗-invariant. The converse is obvious.

This proposition means that the kernel of the Kostant–Kirillov bracket { , } is the set of functions which are constant on the orbits of the coadjoint action of G. Let I1, I2, . . . be the primitive invariant functions, i.e. any invariant function is a function of them, and denote by I1⁰, I2⁰, . . . the constant values they take on a specific orbit. We consider the ideal of the function algebra generated by the non-constant functions I1 − I1⁰, I2 − I2⁰, . . . . It is also an ideal of the Poisson algebra.
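For so(3)∗ ≅ R³ these notions become concrete: identifying [X, Y] with the cross product x × y and Ξ(Y) with ξ · y, the bracket of linear functions is {Ξ(X), Ξ(Y)}(ξ) = ξ · (x × y), the coadjoint directions are ad∗X · Ξ = x × ξ (so the orbits are spheres), and the Casimir |ξ|² lies in the kernel. A hedged numerical sketch of these identifications:

```python
# The Kostant-Kirillov bracket on so(3)* ~ R^3, under the identifications
# [X, Y] <-> x cross y and Xi(Y) = xi . y.
import numpy as np

def kk_bracket_linear(x, y, xi):
    """{Xi(X), Xi(Y)}(xi) = Xi([X, Y]) for linear functions on so(3)*."""
    return np.dot(xi, np.cross(x, y))

def coad(x, xi):
    """ad*X . Xi = x cross xi under the identification above."""
    return np.cross(x, xi)

rng = np.random.default_rng(1)
x, y, xi = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)

# Antisymmetry of the bracket
assert np.isclose(kk_bracket_linear(x, y, xi), -kk_bracket_linear(y, x, xi))

# The Casimir |xi|^2 is constant along every coadjoint direction,
# so it Poisson-commutes with all linear functions: it lies in the kernel.
grad_casimir = 2 * xi                 # gradient of |xi|^2
assert np.isclose(np.dot(grad_casimir, coad(x, xi)), 0.0)
```

The orbits of constant |ξ| are the spheres, and the bracket restricted to a sphere is the non-degenerate structure described next.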
The quotient of the function algebra by this ideal can be identified with the functions on the orbit, and the quotient Poisson bracket yields a Poisson bracket on
the orbit, which by construction is non-degenerate, and therefore defines a symplectic structure on the orbit. More explicitly:

Proposition. For any two tangent vectors at the point Ξ of the orbit, VX = ad∗X · Ξ and VY = ad∗Y · Ξ, define

ωK(VX, VY) = Ξ([X, Y])   (14.3)
This form is closed and non-degenerate on any G-orbit. It induces the Kostant–Kirillov bracket.

Proof. First note that since we are on an orbit of G, the vectors ad∗X · Ξ, X ∈ G, describe the whole tangent space at Ξ. To show that ω is closed, let us recall the definition of exterior differentiation:

dη(X0, . . . , Xp) = Σ_{j=0}^{p} (−1)^j Xj · η(X0, . . . , X̂j, . . . , Xp)
  + Σ_{0≤i<j≤p} (−1)^{i+j} η([Xi, Xj], X0, . . . , X̂i, . . . , X̂j, . . . , Xp)
In our case, the first term vanishes because VX · Ξ([Y, Z]) = −Ξ([X, [Y, Z]]), and we apply the Jacobi identity. The second term vanishes for the same reason once one notices that [VX, VY] = −V[X,Y]. This is because, by the general definition of the Lie derivative, we have:

LX·m Y·m ≡ [X·m, Y·m] = (d/dt) e^{−Xt} Y·(e^{Xt} m) |t=0 = −[X, Y]·m   (14.4)
To show that the form ω is non-degenerate on the orbit, assume that the tangent vector VX is such that ω(VX , VY ) = 0 for all Y ∈ G. This means Ξ([X, Y ]) = 0, ∀Y , that is ad∗ X ·Ξ(Y ) = 0, ∀Y . Hence VX = ad∗ X ·Ξ = 0. To compute the Poisson bracket associated with ωK , we need the Hamiltonian vector field of any function f : Xf (Ξ) = −ad∗ df · Ξ To show it, note that df (ad∗ Y · Ξ) = Ξ([df, Y ]), which is also ωK (ad∗ df· Ξ, ad∗ Y · Ξ). Hence ωK (Xf , Xg )(Ξ) = Ξ([df, dg]). The 2-form ωK defining the Kostant–Kirillov symplectic structure on coadjoint orbits is closed, but not exact. However, the coadjoint action defines a map ϕ from the group G to the orbit OΞ0 by
ϕ : g → Ad∗g · Ξ0 ≡ Ξ. The pullback ω = ϕ∗ωK of ωK on G is exact, and one can write ω = δα, with

α = −Ξ0(g−1 δg)

To check this, note that ϕ∗(gX) = ad∗ gXg−1 · Ξ. Hence

ϕ∗ωK(gX, gY) = ωK(ϕ∗(gX), ϕ∗(gY)) = Ξ(g[X, Y]g−1) = Ξ0([X, Y])

On the other hand,

δα(gX, gY) = gX · α(gY) − gY · α(gX) − α([gX, gY])

The first two terms vanish because they are derivatives of constant functions (α(gY) = −Ξ0(Y)). By definition of the Lie bracket, [gX, gY] = g[X, Y], so that the last term is equal to Ξ0([X, Y]), hence ω = ϕ∗ωK = δα.

14.3 Symmetries and Hamiltonian reduction

Let G be a Lie group and G its Lie algebra. Consider a symplectic manifold M on which G operates. We say that the action of G is symplectic if for any g ∈ G the transformation m → gm is symplectic. In view of eq. (14.2) this means:

{f1(gm), f2(gm)} = {f1, f2}(gm)

For any X ∈ G, consider the one-parameter group gt = exp(tX). In the limit t → 0 we define the action of X on functions by:

X · f(m) = (d/dt) f(e−tX · m)|t=0   (14.5)

so that we get a representation of the Lie algebra G on functions:

X · (Y · f) − Y · (X · f) = [X, Y] · f   (14.6)

Notice that X · f = −LX·m f, where L is the Lie derivative. Finally, the symplecticity condition reads:

{X · f1, f2} + {f1, X · f2} = X · {f1, f2}   (14.7)

Proposition. Let G be a Lie group acting on M by symplectic diffeomorphisms. The action of any one-parameter subgroup of G is locally Hamiltonian. This means that there exists a function HX, locally defined on M, such that:

X · f = {HX, f}   (14.8)
Proof. The condition eq. (14.7) is obviously necessary to have X · f = {HX, f}. It is also sufficient. To show it, we use the canonical Darboux coordinates. Writing X · m = Σ_i (Xpi ∂pi + Xqi ∂qi), we have:

{X · f, h} + {f, X · h} − X · {f, h}
  = Σ_{i,j} [ (∂pj Xqi − ∂pi Xqj) ∂qi f ∂qj h + (∂pj Xpi + ∂qi Xqj) ∂pi f ∂qj h ]
  − Σ_{i,j} [ (∂qj Xqi + ∂pi Xpj) ∂qi f ∂pj h + (∂qj Xpi − ∂qi Xpj) ∂pi f ∂pj h ]

The condition that this vanishes identically for all f and h is equivalent to dΩX = 0, where ΩX = −iX·m ω = Σ_i (Xqi dpi − Xpi dqi). So there exists, at least locally, a function HX such that ΩX = dHX, or:

Xqi = ∂HX/∂pi,   Xpi = −∂HX/∂qi

Then

X · f = Σ_i ( −(∂HX/∂qi)(∂f/∂pi) + (∂HX/∂pi)(∂f/∂qi) ) = {HX, f}
This proves eq. (14.8).

If one knows that there is an invariant 1-form α such that ω = dα, one can give an explicit formula for the function HX, which is then globally defined:

HX(m) = α(X · m)   (14.9)

Indeed, since α is invariant we have LX α = 0. Then 0 = LX α = (iX d + diX)α = ω(X, ·) + d(α(X)), so that, comparing with dHX = −iX ω, we see that HX = α(X · m). Using eq. (14.8) and the Jacobi identity, we find that X · (Y · f) − Y · (X · f) = {{HX, HY}, f}, so that eq. (14.6) yields {H[X,Y] − {HX, HY}, f} = 0. Because constants commute with any function, we cannot conclude that H[X,Y] = {HX, HY}. This motivates the:

Definition. Consider a Lie group G acting on a symplectic manifold M by a symplectic action. This action is said to be Poissonian if the Hamiltonians HX of the one-parameter subgroups are globally defined, depend linearly on X, and are such that

H[X,Y] = {HX, HY}
In the previous case, where there exists an invariant 1-form α, this property is always satisfied. Indeed, we have {HX, HY} = ω(X, Y) = dα(X, Y) = X · α(Y) − Y · α(X) − α([X, Y]). By invariance of α we have X · α(Y) = LX α(Y) = α([X, Y]) = −Y · α(X), so that {HX, HY} = α([X, Y]) = H[X,Y].

Example. This particular situation is important because it occurs in the case of a cotangent bundle. In this case we already know that the 1-form α exists and is globally defined. We consider particular diffeomorphisms of M = T∗N, namely those which are induced by a diffeomorphism of the base N. We are going to show that α is invariant under such diffeomorphisms, and in particular under those which are induced by group actions on N. Any diffeomorphism φ of N induces a transformation on T∗N as follows: a point (q, p) of T∗N is determined by a point q ∈ N and a linear form p on the tangent space TqN to N at q. The differential of φ at q, which we denote by φ∗, maps TqN into Tφ(q)N. Its transpose φ∗ maps T∗φ(q)N to T∗qN, hence φ∗−1 maps T∗qN to T∗φ(q)N. The induced transformation φ̃ on M = T∗N is given by

φ̃(q, p) = (φ(q), φ∗−1(p))

Proposition. The 1-form α on T∗N is invariant under transformations induced by transformations of the base manifold N.

Proof. The transformation φ̃ of M induces a transformation φ̃∗ on TM given by φ̃∗(δq, δp) = (φ∗ δq, φ∗−1 δp). Recall the definition α(δq, δp) = p(δq). Quite generally, we have:

(φ̃∗α)(q,p)(δq, δp) = αφ̃(q,p)(φ̃∗(δq, δp)) = φ∗−1(p)(φ∗ δq) = p(δq) = α(δq, δp)

In particular, assuming that a Lie group G acts on the base manifold N, this action lifts to a Poissonian action on M = T∗N, and for any X ∈ G we have HX(m) = α(X · m) = p(X · q) for m = (q, p). When the action of a Lie group G on a symplectic manifold M is Poissonian, any X ∈ G is associated with a function HX such that X · f = {HX, f}, and X → HX is linear. Hence there exists a function P : M −→ G∗ such that one can write HX(m) = ⟨P(m), X⟩, where ⟨ , ⟩ is the pairing between G and its dual.
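The Poissonian property of cotangent lifts can be checked symbolically. A hedged sketch for rotations acting on N = R³: with the bracket convention {pi, qj} = δij of Example 1 and the sign choice HX(q, p) = p · (q × x) (one consistent way to account for the action on functions through the inverse; with the opposite convention one gets the usual angular momentum q × p), the identity {HX, HY} = H[X,Y] holds with [X, Y] ↔ x × y.

```python
# Symbolic check that the lifted rotation Hamiltonians on T*R^3 satisfy
# {H_x, H_y} = H_{x cross y}, i.e. the action is Poissonian.
import sympy as sp

q = sp.Matrix(sp.symbols('q1 q2 q3'))
p = sp.Matrix(sp.symbols('p1 p2 p3'))
x = sp.Matrix(sp.symbols('x1 x2 x3'))
y = sp.Matrix(sp.symbols('y1 y2 y3'))

def pb(f, g):
    """Canonical bracket with {p_i, q_j} = delta_ij, as in the text:
    {f, g} = sum_i (df/dp_i dg/dq_i - df/dq_i dg/dp_i)."""
    return sp.expand(sum(sp.diff(f, p[i]) * sp.diff(g, q[i])
                         - sp.diff(f, q[i]) * sp.diff(g, p[i]) for i in range(3)))

def H(v):
    # one consistent sign choice for the lifted Hamiltonian (see lead-in)
    return p.dot(q.cross(v))

assert sp.simplify(pb(H(x), H(y)) - H(x.cross(y))) == 0
```

The components of the resulting moment map are, up to this sign convention, the components of the angular momentum.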
Definition. The application m → P(m) ∈ G∗ is called the moment map.

The moment map has the following covariance property with respect to the action of G:

Proposition. The value of the moment map at the point g · m is related to its value at the point m by:

P(g · m) = Ad∗g P(m)

Proof. This is equivalent to HX(g · m) = Hg−1Xg(m). Since we assume that G is connected, it is sufficient to show this for an infinitesimal g = 1 + Y. Then we have to show that dHX(Y · m) = H[X,Y](m). Using eq. (14.5) we have dHX(Y · m) = −Y · HX = −{HY, HX}, where we used eq. (14.8) in the last step. Since the group action is Poissonian, this is H[X,Y].

The moment map associates conserved quantities with symmetries of the Hamiltonian. This is the Noether theorem.

Theorem. Let G be a Lie group acting by a Poissonian action on a symplectic manifold M, and let H be a Hamiltonian invariant under the action of G. Then the moment P is conserved under the flow of H.

Proof. Let us fix X ∈ G and consider the function ⟨P(m), X⟩ = HX(m) on M. Its time derivative under the flow of H is ∂t HX = {H, HX} = −X · H = 0, because H is invariant under the group action.

In the situation of the theorem, one can use the conserved quantities to reduce the number of degrees of freedom. This is called Hamiltonian reduction, because one is able to define a new symplectic variety of smaller dimension on which the reduced motion takes place. Let M be a symplectic manifold and let G be a group acting on M by a Poissonian action, with moment map P. Let us fix a particular value µ of the moment and consider the set of points of phase space where P(m) = µ. By the Noether theorem the motion takes place on this set:

Mµ ≡ P−1(µ),   µ ∈ G∗

We have to assume that µ is not a critical value of P, that is, at all m ∈ Mµ we have dP(m) ≠ 0. Hence there exists a tangent space at m to Mµ. Since Mµ is defined by dim G equations, we have:

dim Mµ = dim M − dim G   (14.10)
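The Noether theorem is easy to illustrate numerically: for a rotation-invariant Hamiltonian on T∗R², the moment of the SO(2) action (the angular momentum L = q1 p2 − q2 p1) is conserved along the flow. A minimal sketch, assuming the quartic potential H = |p|²/2 + |q|²/2 + |q|⁴/4 (an arbitrary rotation-invariant choice) and a fixed-step RK4 integrator:

```python
# Noether's theorem in practice: angular momentum is conserved along the
# flow of a rotation-invariant Hamiltonian on T*R^2.
import numpy as np

def rhs(z):
    q, p = z[:2], z[2:]
    dV = q + np.dot(q, q) * q          # gradient of |q|^2/2 + |q|^4/4
    return np.concatenate([p, -dV])    # qdot = dH/dp, pdot = -dH/dq

def rk4_step(z, h):
    k1 = rhs(z); k2 = rhs(z + h/2*k1); k3 = rhs(z + h/2*k2); k4 = rhs(z + h*k3)
    return z + h/6*(k1 + 2*k2 + 2*k3 + k4)

def ang_mom(z):
    return z[0]*z[3] - z[1]*z[2]       # L = q1 p2 - q2 p1

z = np.array([1.0, 0.0, 0.3, 0.8])     # initial (q1, q2, p1, p2)
L0 = ang_mom(z)
for _ in range(2000):
    z = rk4_step(z, 1e-3)
assert abs(ang_mom(z) - L0) < 1e-8     # conserved up to integrator error
```

Here the level set Mµ = {L = L0} is three-dimensional (odd), and quotienting by the residual rotations yields the even-dimensional reduced phase space described below in the text.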
Note that Mµ is not in general a symplectic variety, and is not in general of even dimension. However, there is a residual action of the group G
on Mµ. We have seen that the action of G is transformed by P into the coadjoint action on G∗. So the stabilizer Gµ of the moment µ, that is the group of g ∈ G such that Ad∗g µ = µ, preserves Mµ.

Definition. The reduced phase space Fµ is the quotient:

Fµ ≡ Mµ/Gµ = P−1(µ)/Gµ

Here we assume that the quotient is well-defined as a differentiable manifold. In general there are particular values of µ for which this quotient is ill-defined. However, in a generic situation we do not have to enter into such subtleties. The nice feature of the reduced phase space is that it is naturally equipped with a symplectic structure, and in particular is of even dimension.

Proposition. Let ξ and η be two vectors tangent to Fµ at the point f. Consider, at a point m ∈ Mµ above f, any two tangent vectors ξ′ and η′ to Mµ projecting on ξ, η respectively. We then set:

ωf(ξ, η) = ωm(ξ′, η′)

This is independent of the chosen representatives m, ξ′, η′ and defines a symplectic form on Fµ.

Proof. We first show that, for given m, ωm(ξ′, η′) is independent of the choice of representatives ξ′, η′. This amounts to showing that ωm(V, W) = 0 for V vertical, i.e. tangent to the orbit of Gµ, and W tangent to Mµ. Since V is vertical we can write V = X · m with X ∈ Gµ. We can consider the Hamiltonian HX defined on M and note that ωm(V, W) = −dHX(W). But on Mµ, HX = ⟨P, X⟩ = µ(X) is constant. Since dHX(W) is the derivative of HX in the direction of W, which is tangent to Mµ, this derivative vanishes. The quantity ωm(ξ′, η′) is independent of the choice of the point m above f since, by invariance of ω, we have ωgm(gξ′, gη′) = ωm(ξ′, η′) for any g ∈ Gµ. This shows that ωF, the form that we have defined on Fµ, is well-defined, bilinear and antisymmetric. To show that ωF is closed, note that the restriction ω|Mµ of ω to Mµ is obviously closed. On the other hand, if π is the projection Mµ → Mµ/Gµ, we have just shown that ω|Mµ = π∗ωF. Since d commutes with π∗ we have π∗dωF = 0.
This means that dωF (π∗ X, π∗ Y ) = 0 for all X, Y . Since π∗ is surjective we have dωF = 0. Finally, we have to show that ωF is non-degenerate. Vertical vectors are in the kernel of ω|Mµ , in fact they are the whole kernel of ω|Mµ as we
now explain. We have seen that vertical vectors are orthogonal under ω to Tm(Mµ). More precisely, we have (G · m)⊥ = Tm(Mµ), because both sides have the same dimension, dim M − dim G, since ω is non-degenerate. But the kernel of the restriction of ω to Mµ is

Ker ω|Mµ(m) = Tm(Mµ) ∩ Tm(Mµ)⊥ = Gµ · m
These vectors project to 0 when we take the quotient under Gµ , hence ωF is non-degenerate. We often need to compute the Poisson brackets of functions on the reduced phase space Fµ , knowing the Poisson brackets on M . Any function f˜ on Fµ uniquely extends to a Gµ invariant function on Mµ . However, to be able to compute Poisson brackets on M we have to extend this function further to a function f defined in a vicinity of Mµ in M . Even requiring complete G invariance is not sufficient to lift f˜ to M . This is because while dim M = dim Mµ + dim G, the fibre along G at m ∈ Mµ is not transverse to Mµ , since Gµ leaves Mµ invariant. The general procedure consists of choosing arbitrary extensions f of f˜ outside of Mµ . Two extensions differ by a function vanishing on Mµ . Then we will show how to compute the reduced Poisson bracket as some modification of {f, g} (computed on M ) independent of the arbitrary choices (see eq. (14.11) below). The difference of the Hamiltonian vector fields of two extensions of the same function on Fµ is controlled by the following: Lemma. Let f be a function defined in a vicinity of Mµ and vanishing on Mµ . Then the Hamiltonian vector field Xf associated with f is tangent to the orbit G · m at any point m ∈ Mµ . Proof. The subvariety Mµ is defined by the equations HX i = µi for some basis Xi of G. Since f vanishes on Mµ one can write f = (HXi − µi )fi for some functions fi defined in the vicinity of Mµ . For any tangent vector V at a point m ∈ Mµ one has, using that (HXi − µi ) ∂fi vanishes on Mµ : df (m)(V ) =
i
dHXi (m)(V )fi (m) = −ω (
fi (m) Xi · m , V )
i
where we used that the Hamiltonian vector field associated with HXi is Xi .m. Hence Xf = fi (m) Xi · m ∈ G · m.
As a consequence of this lemma, we have a method for computing the reduced Poisson bracket. We take two functions defined on Mµ and invariant under Gµ, and extend them arbitrarily. Then we compute their Hamiltonian vector fields on M. It turns out that they can be “projected” on the tangent space to Mµ by adding a vector tangent to the orbit G · m. These projections are independent of the extensions, and the reduced Poisson bracket is given by the value of the symplectic form on M acting on them.

Proposition. Let f be a function defined in a vicinity of Mµ and Gµ-invariant on Mµ. At each point m ∈ Mµ one can choose a vector Vf · m ∈ G · m such that Xf + Vf · m ∈ Tm(Mµ), and Vf · m is determined up to a vector in Gµ · m.

Proof. Recall that the symplectic orthogonal of G · m is exactly Tm(Mµ), so we want to solve ω(Xf + Vf · m, X · m) = 0 for all X ∈ G. Note that for X, Y ∈ G, ω(X · m, Y · m) = {HX, HY} = H[X,Y] = P([X, Y]) = µ([X, Y]), since the action is Poissonian and m ∈ Mµ. So the equation to be solved reads LX·m f = µ([Vf, X]). Both members are linear in X ∈ G and vanish when X ∈ Gµ (the left-hand side because f is Gµ-invariant, the right-hand side because Gµ stabilizes µ), so the equation can be seen as an equation on G/Gµ. On this quotient, the mapping (X̄, Ȳ) → µ([X, Y]) is a skew-symmetric non-degenerate (since we have quotiented by the kernel Gµ) bilinear form, hence the equation can be uniquely solved for V̄f.

Proposition. Let f̃, g̃ be two functions on the reduced phase space Fµ. We lift them to functions f, g defined on a vicinity of Mµ and Gµ-invariant on Mµ. The reduced Poisson bracket is given by:

{f̃, g̃}red = ω(Xf + Vf · m, Xg + Vg · m) = {f, g} − µ([Vf, Vg])   (14.11)

Proof. We want to show that the Hamiltonian vector field associated with f̃ for the reduced symplectic form is π∗(Xf(m) + Vf · m) for m ∈ Mµ. Note that this is independent of the choice of Vf modulo Gµ, and that Xf(m) + Vf · m is by construction tangent to Mµ.
If V is an arbitrary tangent vector to Mµ at m, by definition of the reduced symplectic form, one has to check that: df˜(π∗ V ) = −ωred (π∗ (Xf (m) + Vf · m), π∗ V ) = −ωm (Xf (m) + Vf · m, V ) (14.12) We have seen that the symplectic orthogonal of G · m is Tm (Mµ ), so that ωm (Vf · m, V ) = 0. On the other hand, since f is Gµ -invariant we have df˜(π∗ V ) = df (V ) = −ωm (Xf , V ). This proves eq. (14.12) and the first equality in eq. (14.11). To get the second form of the reduced Poisson
bracket, note that ω(Xf , Vg · m) = −µ([Vf , Vg ]). This is because, since Xf + Vf · m ∈ (G · m)⊥ , one has, for m ∈ Mµ : ω(Xf , Vg · m) = −ω(Vf · m, Vg · m) = −H[Vf ,Vg ] (m) = −µ([Vf , Vg ])
Remark. Note that if f | Mµ = 0 we have Xf + Vf · m ∈ Gµ · m, hence {f˜, g˜}red = ω (Xf + Vf · m, Xg + Vg · m) = 0, so that eq. (14.11) is independent of the arbitrary choices of f and g.
In the applications, we are often given functions f and g on M which are G-invariant. It is then obvious that {f, g} is also G-invariant (by invariance of ω), hence its restriction to Mµ is Gµ-invariant. Moreover, the associated Hamiltonian vector fields Xf, Xg are tangent to Mµ, since the G-invariance of f implies:

0 = df(X · m) = −ω(Xf, X · m) = −dHX(Xf),   X ∈ G

therefore the functions HX are constant along Xf, i.e. Xf is tangent to Mµ. In that case, one can take Vf = 0. It follows that for such functions the reduced Poisson bracket on Fµ is simply given by the ordinary Poisson bracket on M.

Proposition. If f and g are G-invariant functions on M, they define functions f̃ and g̃ on the reduced phase space Fµ, and we have:

{f̃, g̃}red = {f, g}

where the right-hand side is also G-invariant and defines a function on Fµ.

14.4 The case M = T∗G

Let us now apply these general considerations to the case where N is a Lie group G and M = T∗G. We have two actions of G on itself, namely

Lg : n → gn,   Rg : n → ng−1

The first one is called the left action and the second one the right action. The differential Lg∗ of the map n → gn sends TnG → TgnG. In particular, if n = e, the unit element of G, this differential maps TeG = G to TgG. For any fixed X ∈ G, we get a left-invariant vector field g · X on G. As above, the action on functions on G is defined by (gf)(n) = f(g−1n) so
that (g(hf)) = ((gh)f). For an infinitesimal transformation g = exp(tX) with t small we have:

X · f(n) = (d/dt) f(e^{−tX} · n)|_{t=0} = −L_{X·n} f
To build a phase space we need the cotangent bundle M = T∗G. We can use left translations to identify M with G × G∗, where G∗ is the dual of the Lie algebra G:

p ∈ T∗g G −→ (g, ξ)   where   p = L∗g−1 ξ
Indeed, Lg−1∗ maps Tg G to G, hence its transposed L∗g−1 maps G∗ to T∗g G. Explicitly, p(V) = ξ(g^{−1}V), which is simply ξ(X) when V = g · X is a left-invariant vector field. Note that if Xi is a basis of the Lie algebra G, the left-invariant vector fields g · Xi provide a basis of each Tg G. Since M = T∗G there is a canonical 1-form α defined as follows: if (v, κ) is a tangent vector to T∗G at the point (g, ξ), so that v ∈ Tg G and κ ∈ G∗, we have:

α(v, κ) = ξ(g^{−1}v)

This canonical 1-form is both left-invariant and right-invariant, because, according to the general construction, it is invariant under any diffeomorphism of the base G of the cotangent bundle. Hence the symplectic form ω = dα is invariant under both actions. We can compute the Hamiltonians which generate infinitesimal left and right translations on functions. We have seen that in the case of a cotangent bundle HX(m) = α(−X · m), where X · m is the infinitesimal group action on M and the minus sign is introduced because we consider the action on functions. For right translations g → gh^{−1} and h = e^{tXR} we get:

HXR(g, ξ) = α(g · XR) = ξ(XR)

For left translations g → hg and h = e^{tXL} we get:

HXL(g, ξ) = α(−XL · g) = −ξ(g^{−1}XL g)

From this, one sees that the moment maps for left and right actions are given by:

PL(XL) = −(gξg^{−1}, XL),   PR(XR) = (ξ, XR)

In applications, we consider the case when only subgroups HL and HR act by left and right translations on G. The moments live in the duals of the Lie algebras HL and HR, hence require natural projections from G∗, induced by restriction to the subalgebras. Specifically, if ξ is a linear form
on G its restriction to H is an element of H∗, and we denote it by PH∗ ξ. We have shown the:

Proposition. Let HL and HR be two subgroups of G acting by left and right translations respectively on T∗G. The moment maps associated with these actions are:

PL(g, ξ) = −PH∗L (gξg^{−1}),   PR(g, ξ) = PH∗R ξ
We often need to compute Poisson brackets of functions on T ∗ G. We have natural elementary functions on this phase space, namely the quantities ξ(X) for any given X ∈ G, and the matrix elements ρij (g) of g in any faithful representation of G. Any other function can be expressed as a function of these elementary ones. So it is enough to give the Poisson brackets of these elementary functions. Proposition. The Poisson brackets of the elementary functions on T ∗ G read: {ρij (g), ρkl (g)} = 0, {ξ(X), ρij (g)} = ρij (gX), {ξ(X), ξ(Y )} = ξ([X, Y ]) Proof. These relations are consequences of the fact that HX = ξ(X) is the Hamiltonian generating right translations. Since, by the general theory, the action is Poissonian, we have {HX , HY } = H[X,Y ] and this gives the last equation. Moreover, {HX , ρij (g)} is the infinitesimal action of the right translation by X on the function ρij (g), namely the action (hρij )(g) = ρij (gh) (note that we have here gh because this is an action on functions). So {HX , ρij (g)} = ρij (gX), proving the second equation. Finally, the first equation is obvious since the ρij (g) only depend on the position variables and not the momenta. The Poisson bracket on ξ is the Kirillov bracket. We often drop the explicit reference to the representation ρ in the above formulae, but it is important to keep in mind that the g occurring in these equations is a function on phase space, and not a point on the base.
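As an illustration (our own sketch, not an example from the text), the Kirillov bracket can be checked symbolically for G = so(3), where G∗ ≅ R³ and the bracket takes the classical form {f, g}(ξ) = ξ · (∇f × ∇g); the Jacobi identity holds and the Casimir ξ·ξ is central:

```python
import sympy as sp

xi1, xi2, xi3 = sp.symbols('xi1 xi2 xi3')
xi = sp.Matrix([xi1, xi2, xi3])

def pb(f, g):
    # Kirillov bracket on so(3)* ~ R^3: {f, g}(xi) = xi . (grad f x grad g)
    gf = sp.Matrix([sp.diff(f, v) for v in (xi1, xi2, xi3)])
    gg = sp.Matrix([sp.diff(g, v) for v in (xi1, xi2, xi3)])
    return xi.dot(gf.cross(gg))

f, g, k = xi1**2 + xi2, xi3*xi1, xi2*xi3**2
jacobi = pb(f, pb(g, k)) + pb(g, pb(k, f)) + pb(k, pb(f, g))
print(sp.simplify(jacobi))                                    # 0
print(sp.simplify(pb(xi1**2 + xi2**2 + xi3**2, f)))           # 0: the Casimir is central
```

The choice of test functions f, g, k is arbitrary; any polynomials work since the bracket is a derivation in each slot.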
14.5 Poisson–Lie groups

Consider two Poisson manifolds M1 and M2. The cartesian product M1 × M2 is also equipped with a natural Poisson structure as follows. The space of functions on M1 × M2 is the tensor product of the space of functions on M1 and the space of functions on M2. That is, one can write any such function in the form f(x, y) = Σi fi^{(1)}(x) fi^{(2)}(y), where the sum is in
general infinite and requires some topology for its precise definition. We then define for two functions f(x, y) and g(x, y) the Poisson bracket:

{f, g}M1×M2 = Σij ( {fi^{(1)}, gj^{(1)}}M1 fi^{(2)} gj^{(2)} + {fi^{(2)}, gj^{(2)}}M2 fi^{(1)} gj^{(1)} )
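As a quick symbolic check (an illustration of ours, with each factor taken to be a canonical (q, p) plane), a function depending only on the first factor Poisson-commutes with a function depending only on the second:

```python
import sympy as sp

q1, p1, q2, p2 = sp.symbols('q1 p1 q2 p2')

def pb(f, g):
    # product Poisson bracket on two canonical planes M1 x M2
    return (sp.diff(f, q1)*sp.diff(g, p1) - sp.diff(f, p1)*sp.diff(g, q1)
            + sp.diff(f, q2)*sp.diff(g, p2) - sp.diff(f, p2)*sp.diff(g, q2))

f = q1**2 * p1        # a function on M1 only
g = sp.sin(q2) * p2   # a function on M2 only
print(sp.simplify(pb(f, g)))  # 0
```

Within a single factor the bracket is the usual canonical one, e.g. pb(q1, p1) = 1.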
This obeys all the properties of a Poisson bracket, and implies that functions on M1 Poisson commute with functions on M2. In particular, if G is a Lie group endowed with a Poisson structure, the product G × G has a Poisson structure and one may wonder whether the multiplication (g, h) → gh from G × G to G is compatible with the respective Poisson structures. More precisely, if we have two Poisson manifolds M and N and a map φ : M → N, this map is said to be Poisson if for any two functions f1, f2 on N we have {f1 ◦ φ, f2 ◦ φ}M = {f1, f2}N ◦ φ. In our case the multiplication is Poisson if we have:

{f1(gh), f2(gh)}G×G = {f1, f2}G(gh)

where in the left-hand side f1,2(gh) are to be viewed as functions on G × G.

Definition. A Poisson–Lie group G is a Lie group G equipped with a Poisson structure such that the multiplication in G, viewed as a map G × G → G, is a Poisson mapping.

To describe the Poisson structure on G one can use the Lie algebra G to label the derivatives of any function at a point g ∈ G. We first choose a basis Ea of the Lie algebra G:

[Ea, Eb]G = fab^c Ec
(14.13)
and consider the right-invariant vector fields ∇a^R defined as:

∇a^R f(g) = (d/dt) f(e^{tEa} g)|_{t=0}

which form a basis of the tangent space Tg G. The Poisson bracket of two functions f1, f2 on G can be written as a bilinear combination of derivatives with coefficients η^{ab}(g) as:

{f1, f2}G(g) = Σ_{a,b} η^{ab}(g) (∇a^R f1)(g)(∇b^R f2)(g)    (14.14)
The coefficients η^{ab}(g) contain all the information on the Poisson structure, and we will express the Lie–Poisson condition on them. For this, it is convenient to introduce the element η(g) ∈ G ⊗ G:

η(g) = Σ_{a,b} η^{ab}(g) Ea ⊗ Eb
Any function on G can be expressed in terms of elementary functions, i.e. the matrix elements of a faithful representation ρ of G. It is sufficient to express the Poisson brackets of such elementary functions. If we consider the function g → ρ(g) we have ∇a^R ρ(g) = ρ(Ea g) = ρ(Ea)ρ(g). The Poisson bracket of two matrix elements then reads:
{ρij(g), ρkl(g)}G = (ρij ⊗ ρkl)(η(g) g ⊗ g)

In the following we drop the explicit mention of the representation ρ and write this formula in the usual tensor notation:

{g1, g2}G = η12(g) g ⊗ g
(14.15)
Note that the antisymmetry of the Poisson bracket (14.14) requires η12 = −η21.

Proposition. The Lie–Poisson property is equivalent to the following cocycle condition on η:

η(gh) = η(g) + Adg · η(h)

where the adjoint action on G ⊗ G is defined as Adg · η(h) = (g ⊗ g) η(h) (g^{−1} ⊗ g^{−1}).

Proof. We have {(gh)1, (gh)2}G×G = {g1, g2}G h1 h2 + g1 g2 {h1, h2}G since ρ(gh) = ρ(g)ρ(h). This has to be equal to η(gh)(gh)1(gh)2. Comparing the two expressions, the cocycle condition follows.

This condition is called a cocycle condition because it means that η is a 1-cocycle for the Hochschild group cohomology of G with values in the representation G ⊗ G. Looking at the vicinity of the identity e in G, that is writing g = exp(tX) and h = exp(t′Y), with t, t′ small and η(e^{tX}) = η(e) + t de η(X) + · · ·, the cocycle condition implies, at order 0 in t, t′, that η(e) = 0, and at second order that

de η([X, Y]) = [∆X, de η(Y)] − [∆Y, de η(X)]
(14.16)
with ∆X = X ⊗ 1 + 1 ⊗ X. This means that the linear function de η on G is a 1-cocycle for the Lie algebra cohomology with values in the same representation G ⊗ G. One can use de η to introduce a Lie algebra structure on G ∗ . For any function f on G the differential de f at the identity is a linear function on G, hence an element of G ∗ . Considering two functions f1 and f2 , eq. (14.14) shows that the differential de {f1 , f2 } only depends on de f1 and de f2 and is proportional to de η, since η(e) = 0.
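A coboundary ansatz η(g) = r − Adg·r (the choice adopted in Section 14.7 below) satisfies the cocycle condition identically; a small numerical sanity check of ours, with an arbitrary r ∈ gl(2) ⊗ gl(2) represented as a 4×4 matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
r = rng.standard_normal((4, 4))                  # arbitrary element of gl(2) ⊗ gl(2)
g = rng.standard_normal((2, 2)) + 3*np.eye(2)    # generic invertible group elements
h = rng.standard_normal((2, 2)) + 3*np.eye(2)

def Ad(g, m):
    # adjoint action on the tensor square: (g⊗g) m (g⊗g)^{-1}
    G = np.kron(g, g)
    return G @ m @ np.linalg.inv(G)

def eta(g):
    # coboundary ansatz: eta(g) = r - Ad_g r
    return r - Ad(g, r)

print(np.allclose(eta(g @ h), eta(g) + Ad(g, eta(h))))  # True: cocycle condition
```

The identity is algebraic (the Adg r terms telescope), so it holds for any r, g, h.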
Definition. The Poisson bracket {, }G defines a Lie algebra structure on G ∗ by the following formula: [de f1 , de f2 ]G ∗ = de {f1 , f2 }G
(14.17)
The Jacobi identity for the Lie algebra structure is a direct consequence of the Jacobi identity for the Poisson bracket on G. Introducing a basis (E^a) in G∗, dual to the basis (Ea) in G, the differential at the identity can be written as de f = Σa E^a (∇a f) ∈ G∗, where ∇a f = (d/dt) f(e^{tEa})|_{t=0}. Similarly,

de η = E^c ∇c η = Cc^{ab} E^c ⊗ Ea ⊗ Eb

With these notations eq. (14.17) reads:

[E^a, E^b]G∗ = Cc^{ab} E^c
(14.18)
so the structure constants are Cc^{ab} = (∇c η^{ab}). We shall denote by G∗ the connected Lie group with Lie algebra G∗. Let G be a Poisson–Lie group and let D = G ⊕ G∗. This is called the classical double. One can introduce a Lie algebra structure on D which extends the Lie algebra structures on G and G∗ and such that the elements of G and G∗ do not commute. In terms of a basis (Ea) ∈ G and its dual (E^a) ∈ G∗, this structure reads:

[Ea, Eb] = fab^c Ec
[E^a, Eb] = fbc^a E^c − Cb^{ac} Ec
[E^a, E^b] = Cc^{ab} E^c
(14.19)
Proposition. The above brackets define a Lie algebra structure on D.

Proof. One defines also [Eb, E^a] = −[E^a, Eb] so the antisymmetry is obvious. We need to verify the Jacobi identity. Due to the Jacobi identity on G and G∗ one has only to verify the two cases [E^a, [E^b, Ec]] + [Ec, [E^a, E^b]] + [E^b, [Ec, E^a]] = 0 and [E^a, [Eb, Ec]] + [Ec, [E^a, Eb]] + [Eb, [Ec, E^a]] = 0. These relations reduce to:

Cd^{ab} fcl^d − Cl^{bd} fdc^a + Cl^{ad} fdc^b + Cc^{db} fld^a − Cc^{da} fld^b = 0

when using the Jacobi identity on the structure constants Cc^{ab} and fbc^a. This is just the cocycle relation, eq. (14.16).
Remark. With this bracket on the double, one can construct a solution of the Yang–Baxter equation:

r12 = Σa Ea ⊗ E^a ∈ G ⊗ G∗
The Yang–Baxter equation:

[r12, r13] + [r12, r23] + [r13, r23] = [Ea, Eb] ⊗ E^a ⊗ E^b + Ea ⊗ [E^a, Eb] ⊗ E^b + Ea ⊗ Eb ⊗ [E^a, E^b] = 0

is identically satisfied due to the definition of the commutators on D.
14.6 Action of a Poisson–Lie group on a symplectic manifold

Let G be a Poisson–Lie group and M be a symplectic manifold. We assume that G acts on M. In this case the natural compatibility condition between the group action and the Poisson structures on G and M is:

Definition. The action of a Poisson–Lie group on a symplectic manifold is a Lie–Poisson action if for any g ∈ G and any functions f1 and f2 on M, we have:

{f1(g · m), f2(g · m)}G×M = {f1, f2}M(g · m)
(14.20)
Here the Poisson structure on G × M is the product Poisson structure. At the infinitesimal level, let X ∈ G and denote by X · m the vector field on M corresponding to the infinitesimal transformation generated by X. For any function f on M we define the action of X on f by:

(X · f)(m) = (d/dt) f(e^{−tX} · m)|_{t=0} = ⟨ζf(m), X⟩
where ⟨ , ⟩ denotes the pairing between G and G∗. This defines a function ζf : m → ζf(m) ∈ G∗.

Proposition. The infinitesimal form of eq. (14.20) for a Lie–Poisson action is:

{X · f1, f2}M + {f1, X · f2}M + ⟨[ζf1, ζf2]G∗, X⟩ = X · {f1, f2}M    (14.21)

or, equivalently, {ζf1, f2}M + {f1, ζf2}M + [ζf1, ζf2]G∗ = ζ{f1,f2}M.

Proof. The definition of the product Poisson bracket is equivalent to:

{f1(g · m), f2(g · m)}G×M = {f1(g · m), f2(g · m)}M + {f1(g · m), f2(g · m)}G

In the right-hand side of this formula, in the term { , }M the functions f1,2(g·m) are viewed as functions of m, and g is a parameter, while in the term { , }G they are viewed as functions of g and m is a parameter. So the Lie–Poisson condition reads:

{f1(g · m), f2(g · m)}M + {f1(g · m), f2(g · m)}G = {f1, f2}M(g · m)
Setting g = e^{−tX} and taking t infinitesimal, the first Poisson bracket becomes −{X · f1(m), f2(m)}M − {f1(m), X · f2(m)}M and the right-hand side becomes −X · {f1, f2}M(m). The second Poisson bracket is −⟨de{f1(g · m), f2(g · m)}G, X⟩, where de is the differential of a function on G taken at g = e. By definition of the Lie bracket on G∗ this is −⟨[de f1, de f2]G∗, X⟩. One gets eq. (14.21) noting that ζf(m) = −de f(g · m).

Introducing two dual bases of the Lie algebras G and G∗, Ea ∈ G and E^a ∈ G∗ with ⟨E^a, Eb⟩ = δ^a_b, eq. (14.21) becomes, using ζf = Σa (Ea · f) E^a:

{Ea · f1, f2}M + {f1, Ea · f2}M − Ca^{bc} (Eb · f1)(Ec · f2) = Ea · {f1, f2}M    (14.22)

It follows immediately from eq. (14.21) that a Lie–Poisson action cannot be symplectic if the algebra G∗ is non-Abelian. Hence we cannot expect that infinitesimal group actions are locally generated by Hamiltonians as in the symplectic case. There is, however, a generalization of this notion, in the Lie–Poisson case, by what are called non-Abelian Hamiltonians.

Proposition. Assume that a Poisson–Lie group G acts on M by a Lie–Poisson action. Then, there exists a function Γ, locally defined on M and taking values in the group G∗, with Lie algebra G∗, such that for any function f on M,

X · f = ⟨Γ^{−1}{f, Γ}M, X⟩,   ∀ X ∈ G    (14.23)

Equivalently, ζf(m) = Γ^{−1}{f, Γ}M(m). We will refer to Γ as the non-Abelian Hamiltonian of the Lie–Poisson action.

Proof. Introduce the Darboux coordinates (qi, pi) on the symplectic manifold M. For any X ∈ G expand X · m = Σi (X^{pi} ∂pi + X^{qi} ∂qi) and introduce the form ΩX = Σi (X^{qi} dpi − X^{pi} dqi). Finally, let Ω be the G∗-valued 1-form Ω = Σa E^a ΩEa. As in the symplectic case, eq. (14.21) is then equivalent to the following zero-curvature condition for Ω:

dΩ + [Ω, Ω]G∗ = 0

Therefore, locally on M, Ω = Γ^{−1}dΓ with Γ ∈ G∗. This proves eq. (14.23). The converse is true: an action generated by a non-Abelian Hamiltonian as in eq. (14.23) is Lie–Poisson since we have:

X · {f1, f2}M − {X · f1, f2}M − {f1, X · f2}M = ⟨[Γ^{−1}{f1, Γ}M, Γ^{−1}{f2, Γ}M]G∗, X⟩
In the Abelian case, we have Γ(m) = exp(−P(m)), where P is the momentum taking values in the Abelian Lie algebra G∗. This is because, in the symplectic case, eq. (14.23) becomes X · f = ⟨{P, f}, X⟩ = {HX, f}. Hence Γ is the non-Abelian generalization of the moment map.

14.7 The groups G and G∗

As a preparation for our study of dressing transformations, we apply the previous results to a more specific situation. Let G be a Lie algebra with a bilinear invariant form denoted by Tr, with associated connected Lie group G. We denote by C the tensor Casimir in G ⊗ G. We equip G with a Lie–Poisson structure by choosing a cocycle η(g). A simple way to fulfil the cocycle condition is to take for η(g) a coboundary

η12(g) = r12 − Adg · r12

where r is a constant element in G ⊗ G. Then eq. (14.15) becomes:

{g1, g2}G = [r12, g1 g2]
(14.24)
This is the Sklyanin bracket. In this case the Lie–Poisson property is easy to check. One has to show that:

{(gh)1, (gh)2}G×G = [r12, (gh)1 (gh)2]

This is obvious when we notice that, in the product Poisson structure on G × G, g and h Poisson commute. The computation then reduces to the computation we have done for the Sklyanin approach to the closed Toda chain, see Chapter 6. Since in G × G, g and h Poisson commute, we have:

{(gh)1, (gh)2}G×G = {g1, g2}G h1 h2 + g1 g2 {h1, h2}G = [r12, g1 g2] h1 h2 + g1 g2 [r12, h1 h2] = [r12, g1 h1 g2 h2]

Note that r12 is defined only up to the addition of a multiple of the Casimir element, which drops out of η(g). We define r12^± = r12 ± ½ C12. Antisymmetry of the Poisson bracket is ensured by choosing r12 antisymmetric, so that r12^± = −r21^∓. The Jacobi identity for the Poisson bracket is ensured by requiring that r12^± are solutions of the classical Yang–Baxter equation:

[r12^±, r13^±] + [r12^±, r23^±] + [r13^±, r23^±] = 0    (14.25)

Equation (14.25) is an equation in G ⊗ G ⊗ G, and the indices on r^± refer to the copies of G on which r^± is acting. Using the bilinear form Tr to
identify the vector spaces G∗ and G, the elements r12^± of G ⊗ G can be mapped into elements R^± ∈ G ⊗ G∗ ≅ End G defined by:

R^±(X) = Tr2 (r12^± (1 ⊗ X)),   ∀ X ∈ G    (14.26)
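For sl2, the standard choice r = e⊗f + ¼ h⊗h (quoted here as a known example, not derived in the text) solves the classical Yang–Baxter equation (14.25), which a short numerical computation confirms:

```python
import numpy as np

e = np.array([[0., 1.], [0., 0.]])
f = np.array([[0., 0.], [1., 0.]])
h = np.array([[1., 0.], [0., -1.]])
I = np.eye(2)

# r = e⊗f + (1/4) h⊗h, stored as pairs (a_k, b_k) with r = sum_k a_k ⊗ b_k
pairs = [(e, f), (h/2, h/2)]

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

r12 = sum(kron3(a, b, I) for a, b in pairs)
r13 = sum(kron3(a, I, b) for a, b in pairs)
r23 = sum(kron3(I, a, b) for a, b in pairs)

comm = lambda x, y: x @ y - y @ x
cybe = comm(r12, r13) + comm(r12, r23) + comm(r13, r23)
print(np.allclose(cybe, 0))  # True
```

The three placements r12, r13, r23 are built by inserting the identity in the spectator slot, exactly as the index notation prescribes.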
Note that we have R^+ − R^− = Id. The Poisson bracket (14.24) on G induces a Lie algebra structure on G∗ by eq. (14.17). Identifying the vector spaces G and G∗ by Tr, the bracket on G∗ is mapped to the R-bracket:

[X, Y]_R = [R^±(X), Y] + [X, R^∓(Y)] = [R(X), Y] + [X, R(Y)]    (14.27)

with R = ½(R^+ + R^−). We gave a detailed analysis of this bracket in Chapter 4, and all the results apply here. We simply recall that because R^+ − R^− = Id, any X ∈ G admits a decomposition as:

X = X+ − X−,   X± = R^±(X) = Tr2 (r12^± (1 ⊗ X))    (14.28)

In terms of the components X+ and X−, the commutator in G∗ becomes:

[X, Y]_R = [X+, Y+] − [X−, Y−]

In particular, the plus and minus components commute in G∗. Moreover, R^± are Lie algebra homomorphisms so that X± live in two subalgebras G± of G. Recall also that the image of G∗ in G− ⊕ G+ is the set of X = X+ − X− such that θ(X+) = X−, see Chapter 4. By exponentiation, the subalgebras G± correspond to connected Lie subgroups G± of G, and the group G∗ can be viewed as the set of pairs (g−, g+), subjected to some condition θ(g+) = g−, with product law:

(g−, g+) · (h−, h+) = (g− h−, g+ h+)
(14.29)
Any element g ∈ G (in a neighbourhood of the identity) admits a unique factorization as:

g = g−^{−1} g+
(14.30)
with θ(g+) = g−. This associates with g ∈ G a unique element (g−, g+) of G∗ through a factorization problem. So as sets, G and G∗ are identified, but they have different group structures. The group G∗ itself becomes a Poisson–Lie group if we introduce on it the Semenov-Tian-Shansky Poisson bracket:

{(g+)1, (g+)2}G∗ = −[r12^±, (g+)1 (g+)2]
{(g−)1, (g−)2}G∗ = −[r12^∓, (g−)1 (g−)2]
{(g−)1, (g+)2}G∗ = −[r12^−, (g−)1 (g+)2]
{(g+)1, (g−)2}G∗ = −[r12^+, (g+)1 (g−)2]    (14.31)
or, for the factorized element g = g−^{−1} g+:

{g1, g2}G∗ = −g1 r12^+ g2 − g2 r12^− g1 + g1 g2 r12^± + r12^∓ g1 g2
(14.32)
The multiplication in G∗ is a Poisson map for the brackets (14.31). The group G∗ is therefore a Poisson–Lie group.

14.8 The group of dressing transformations

We use the above results to understand the Poisson structure of the dressing transformations introduced in Chapter 3.

Definition. Let G be a Poisson–Lie group associated with an r-matrix, and G∗ its dual Poisson–Lie group. We define an action of G∗ on G which we call a dressing transformation. We identify G∗ to G as sets, via the factorization problem g = g−^{−1} g+. The dressing of x ∈ G by g ∈ G∗ is defined by:

(g = g−^{−1} g+ ∈ G∗, x ∈ G) → ᵍx = (x g x^{−1})± x g±^{−1} ∈ G
(14.33)
In this equation (x g x^{−1})± refers to the factorization

(x g x^{−1})−^{−1} (x g x^{−1})+ = x g x^{−1}
and this implies that the two signs give the same result for g x. Proposition. The action x → g x is a group action of G∗ on G. Proof. We have to show that g (h x) = gh x, that is: −1
(h x g h x
−1 )± h x g± = (x(gh)x−1 )± x(gh)−1 ±
(14.34)
Introducing the notation, for any h ∈ G∗ : Θh± = (xhx−1 )± , so that h x = h h h −1 Θh± xh−1 ± , and using the sign freedom to write the first x in ( x g x )± with the minus sign and the second one with the plus sign, the left-hand side of eq. (14.34) reads:
−1 −1 h −1 gh x Θ Θh± x h−1 Θh− x h−1 + − + ± g± ±
−1 Since g = g− g+ , and due to the definition of the group law in G∗ , −1 = Θ(gh) −1 Θ(gh) h−1 − gh+ = (gh). Moreover, since by definition x(gh)x − + we get: (gh) −1 (gh) h −1 −1 h −1 Θh− x h−1 Θ+ = Θh− Θ− Θ+ Θ+ − gh+ x
so that one reads the factorization:

(Θ^h_− x h−^{−1} g h+ x^{−1} (Θ^h_+)^{−1})± = Θ^{gh}_± (Θ^h_±)^{−1}

From this the result follows immediately.

The infinitesimal form of eq. (14.33) is, for any X ∈ G with X = X+ − X−:

δX x = Y± x − x X±   with   Y± = (x X x^{−1})±    (14.35)
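As a concrete sketch (our own toy model, not the book's example), one can realize the factorization with G = GL2, g− lower unitriangular and g+ upper triangular, ignoring the condition θ, and check numerically that the dressing (14.33) is well defined (both signs agree) and is a group action:

```python
import numpy as np

rng = np.random.default_rng(2)

def factor(G):
    # G = G_minus^{-1} @ G_plus, G_minus lower unitriangular, G_plus upper triangular
    m = -G[1, 0] / G[0, 0]
    G_minus = np.array([[1.0, 0.0], [m, 1.0]])
    return G_minus, G_minus @ G

def embed(gm, gp):
    # image of (g_minus, g_plus) in G
    return np.linalg.inv(gm) @ gp

def dress(gm, gp, x, sign):
    W = x @ embed(gm, gp) @ np.linalg.inv(x)   # x g x^{-1}
    wm, wp = factor(W)                         # (x g x^{-1})_±
    if sign == '+':
        return wp @ x @ np.linalg.inv(gp)
    return wm @ x @ np.linalg.inv(gm)

def rand_pair():
    gm = np.array([[1.0, 0.0], [rng.standard_normal(), 1.0]])
    gp = np.triu(rng.standard_normal((2, 2))) + 2*np.eye(2)
    return gm, gp

gm, gp = rand_pair()
hm, hp = rand_pair()
x = rng.standard_normal((2, 2)) + 3*np.eye(2)

# both signs give the same dressed element
print(np.allclose(dress(gm, gp, x, '+'), dress(gm, gp, x, '-')))
# group action: g(h x) = (gh) x with (gh)_± = g_± h_±
lhs = dress(gm, gp, dress(hm, hp, x, '+'), '+')
rhs = dress(gm @ hm, gp @ hp, x, '+')
print(np.allclose(lhs, rhs))
```

Both checks succeed up to floating-point error, mirroring the two assertions proved above.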
One of the main properties of this action is that it is a Lie–Poisson action of G∗ on G if the groups G and G∗ are equipped with the Poisson structures defined in eqs. (14.24) and (14.31), i.e. we have:

{ᵍx1, ᵍx2}G∗×G = [r12^±, ᵍx1 ᵍx2]

We will prove this fact by exhibiting a non-Abelian Hamiltonian for dressing transformations, so that they are automatically Lie–Poisson.

Proposition. The non-Abelian Hamiltonian of the dressing transformations (14.33), which is an element of G∗∗ ≅ G, is the identity function of G, i.e. it is x itself:

X · x = −δX x = Tr2 (x2^{−1} {x2, x1}G X2),   ∀ X ∈ G, x ∈ G    (14.36)

Proof. First note that, as usual, the action on functions is defined with the inverse group element, so that the action of X on the function x is −δX x = −(x X x^{−1})± x + x X±. From eq. (14.24), in which we take Γ(x) = x, we have:

Tr2 (x2^{−1} {x2, x1}G X2) = −Tr2 (x2^{−1} [r12^±, x1 x2] X2)
= −Tr2 (r12^± x1 x2 X2 x2^{−1}) + Tr2 (x1 r12^± X2)
= −(x X x^{−1})± x + x X± = −δX x

It is a remarkable fact that there exists a non-Abelian Hamiltonian, since the group G with the Sklyanin bracket is not a symplectic manifold, the bracket being degenerate.

References
[1] V. Arnold, Méthodes mathématiques de la mécanique classique. MIR, Moscow (1976).
[2] R. Abraham and J. Marsden, Foundations of Mechanics. Benjamin, Reading, Massachusetts (1978).
[3] M. Semenov-Tian-Shansky, Dressing transformations and Poisson group actions. Publ. RIMS 21 (1985) 1237.
[4] V. Drinfeld, Hamiltonian structures on Lie groups, Lie bialgebras and the geometric meaning of classical Yang–Baxter equations. Soviet Math. Dokl. 27 (1983) no. 1, 68–71.
[5] J.H. Lu, Multiplicative and Affine Poisson Structures on Lie Groups. PhD Thesis, University of California at Berkeley (1990).
15 Riemann surfaces
Riemann surfaces play a ubiquitous role in the analytic study of integrable systems. Here it is fundamental to see Riemann surfaces both as smooth analytical one-dimensional varieties and as the desingularization of the locus of an algebraic equation P(x, y) = 0. We explain the notion of line bundle which arises naturally in the study of integrable systems. This allows us to provide a proof of the Riemann–Roch theorem. This theorem is the main enumerative tool in our applications. Riemann himself discovered the profound implications of theta functions and notably the geometry of the theta divisor in the subject. The starting point is Riemann's theorem which we use to exhibit explicit solutions of integrable systems in terms of theta functions. We close the chapter by sketching Birkhoff's proof of the Riemann–Hilbert factorization theorem, which plays a central role throughout the book.

15.1 Smooth algebraic curves

Riemann surfaces are compact smooth analytic varieties of dimension 1. This means that around each point p there is a neighbourhood and a local parameter z(p) mapping it homeomorphically to an open disc |z| < 1 of the complex numbers. Moreover, in the intersection of two such neighbourhoods the corresponding local parameters z1(p) and z2(p) must be related by an analytic bijection. Hence, locally a smooth curve looks like the complex line. Finally, a Riemann surface is compact, hence it is a closed surface without boundary. For our purposes it is very important to look at Riemann surfaces from an algebraic viewpoint, that is as the locus in C2 of an algebraic equation P(x, y) = 0, where P is a polynomial in the complex variables x and y.
At a generic point of this locus one has ∂xP ≠ 0 and ∂yP ≠ 0, hence both x and y can be taken as analytic local parameters in the vicinity of this point. If P has degree N in y we can find N analytic solutions y = fj(x) for j = 1, ..., N defined in some open set of the x variable. These are the N branches of the curve presented as an N-fold covering of the complex line of the x variable. This situation can be analytically continued until one gets to a point where some branches meet, that is the equation in y, P(x, y) = 0, has a multiple root, so that ∂yP = 0. Generically one still has ∂xP ≠ 0 at such a point and one can choose y instead of x as local parameter in the vicinity of this point, which is a perfectly smooth point on the curve. The covering projection (x, y) → x, however, is branched at this point, that is several branches coalesce to one point. We say that P(x, y) = 0 expresses the curve as an N-fold branched covering of the complex line. Branch points occur when the discriminant of P(x, y), viewed as a polynomial in y, vanishes. This is a polynomial in x, hence has a finite number of roots above which the covering projection branches. A more complicated situation arises at points where both partial derivatives of P(x, y) vanish. This means that locally the curve looks like the intersection of several lines, hence is not smooth. In order to make contact with our general definition of a Riemann surface one has to perform an operation called desingularization, which basically consists of replacing the singular point by several ordinary points while leaving the analytic structure of the neighbourhood untouched. An easy way to understand how this can be achieved is to consider birational transformations. They are mappings of the complex plane to itself (x, y) → (x′, y′), with x′ = x′(x, y) and y′ = y′(x, y), where x′ and y′ are rational functions of x and y such that one can also express x and y as rational functions of x′ and y′.
At a non-singular point of such a transformation it is clearly bijective and preserves the analytic structure. A simple example is the quadratic transformation x′ = 1/x and y′ = 1/y, which is obviously bijective and analytic on its domain of definition. When transforming the equation P(x, y) = 0 under such a mapping chosen to have its singular set precisely at a singular point of the curve one may blow up the singular point into several ordinary points of the transformed curve. Let us see how this works on a simple example. Consider the curve y² = x² + x³, which is singular at (0, 0), and the birational transformation x = x′, y = x′y′, which can be inverted by y′ = y/x. It is obviously bijective except at the singular point (0, 0) where y′ is indeterminate. Substituting into the equation of the curve one gets y′² = 1 + x′ and we see that the two ordinary points (x′ = 0, y′ = ±1) project to the one singular point x = y = 0, while any other point of the transformed
curve bijectively corresponds to just one point of the initial curve. This bijection preserves the analytic structure of both curves. A similar construction can be done around any singular point of an algebraic curve and then patched with the analytic structure around ordinary points to get a smooth Riemann surface associated with the equation P(x, y) = 0. We say that this Riemann surface is the desingularized curve of the equation P(x, y) = 0. Finally, to get a compact surface one has to consider points at infinity. We transform the equation P(x, y) = 0 under the quadratic transformation x = 1/x′, y = 1/y′ so that a chart around ∞ becomes an analytic chart around 0. In general one gets a singular point at (0, 0) which has to be desingularized by the above method. We shall give a simple example of this procedure in the next section, in which we study a type of Riemann surface frequently occurring in integrable models: the hyperelliptic curves.
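The blow-up substitution for the example above is easy to check symbolically (an illustrative sketch of ours):

```python
import sympy as sp

x, yp = sp.symbols("x y1")   # y1 stands for y'
# the curve y^2 = x^2 + x^3 with the blow-up y = x*y'
expr = (x*yp)**2 - (x**2 + x**3)
# dividing out the x^2 factor leaves the smooth transformed curve y'^2 = 1 + x
print(sp.simplify(expr / x**2))   # y1**2 - x - 1
```

The quotient is exactly the smooth curve y′² = 1 + x, with the singular point replaced by the two points y′ = ±1 above x = 0.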
15.2 Hyperelliptic curves

A hyperelliptic curve is the locus of an equation of the form:

y² = P(x),   P(x) = ∏_{i=1}^N (x − ai)    (15.1)
where P(x) is a polynomial in x of degree N. In order not to have singular points we shall assume ai ≠ aj for i ≠ j. There is an analytic involution of the curve into itself (x, y) → (x, −y) which is called the hyperelliptic involution, and the existence of such an automorphism in fact characterizes hyperelliptic curves. Of course the curve can be expressed as a two-sheeted covering of the complex line branched above the points x = ai. Around such a point (x = ai, y = 0) one can take y as a perfectly smooth local parameter. For example, if P(x) = xP1(x) with P1(0) ≠ 0 one can express x = y² + O(y³) around (0, 0). Note that the situation would be entirely similar for an equation of the type yⁿ = xP1(x) around (0, 0) except that now n branches coalesce at the origin. The only tricky point is to consider the situation at infinity. Performing the transformation x = 1/x′, y = 1/y′ one gets

y′² ∏_{i=1}^N (1 − x′ai) − x′^N = 0
and we see that the origin is singular for N > 1. It is now necessary to distinguish two cases, N odd and even.
When N = 2g + 1, for some integer g, one can perform the birational transformation x′ = x′ and y′ = (x′^g / ∏(1 − x′ai)) y″ and the transformed equation reads

y″² − x′ ∏(1 − x′ai) = 0

We see that the singular point (x′ = 0, y′ = 0) gives rise to a single point (x′ = 0, y″ = 0) on the desingularised curve, hence we say that there is just one point at infinity, but it is a branch point of the covering (x, y) → x. The local parameter around ∞ is y″ and we have locally:

x = (1/y″²)(1 + O(y″)),   y = (1/y″^{2g+1})(1 + O(y″))
Note that x and y are meromorphic functions on the Riemann surface with poles only at ∞ where, for example, x has a pole of order 2. Finally, one frequently says that √x is a local parameter around ∞ since this is equivalent to 1/y″.

When N = 2g + 2 one performs the birational transformation x′ = x′ and y′ = (x′^{g+1} / ∏(1 − x′ai)) y″, which yields y″² = ∏(1 − x′ai). Hence we are in the generic situation, i.e. there are two points corresponding to the singular point (x′ = 0, y′ = 0), specifically ∞+ = (x′ = 0, y″ = +1) and ∞− = (x′ = 0, y″ = −1). In other words, all the points of the desingularised curve bijectively and analytically correspond to points of the original curve except the two points ∞± which map to the same ∞ = (x′ = 0, y′ = 0). We say that the singular curve is obtained from the non-singular one by identifying two points. Around ∞± either x′ or y″ are good local parameters and one gets for example

x = 1/x′,   y = (1/x′^{g+1})(1 + O(x′))
We see that the meromorphic function x has two simple poles on the curve, one at ∞+ and the other at its hyperelliptic conjugate ∞− . The curve can be seen as a two-sheeted branched covering of the Riemann sphere branched over the 2g + 2 points (ak , 0). Let us remark that in both cases, N = 2g +1 and N = 2g +2, eq. (15.1) allows us to present the hyperelliptic curve as a two-sheeted branched covering of the Riemann sphere with 2g + 2 branch points. In fact, for N = 2g + 1 we have the 2g + 1 points (ak , 0) plus the single point at infinity ∞, while for N = 2g + 2 we have two points at infinity and they are not branch points of the covering. This will allow us to conclude that in both cases the underlying topological surface is of genus g as we now explain.
15.3 The Riemann–Hurwitz formula

Let us recall that for a triangulated surface where the triangulation has F faces, V vertices and A edges one defines the Euler–Poincaré characteristic χ = F − A + V = 2 − 2g which is a topological invariant. Here g is called the genus of the Riemann surface. Any surface of genus g is homeomorphic to a sphere with g handles. Let us now assume that a Riemann surface is presented as an N-sheeted branched covering of some base space of Euler characteristic χ0. Choose a triangulation of the base space such that all branch points are included as vertices. Now, for each triangle consider its N pre-images under the covering projection, and similarly the N pre-images of each edge. While ordinary vertices have N pre-images, each base point corresponding to a branch point has fewer than N pre-images, and the reduction is given by the order of the branch point, i.e. the number of branches which coalesce at the branch point minus 1. This is also called the index of the branch point. Altogether one gets a triangulation of the Riemann surface with NF triangles, NA edges and NV − B vertices, where B is the total number of branch points counted with their index on the Riemann surface. Hence the Euler characteristic χ of the Riemann surface is related to χ0 by χ = Nχ0 − B. This is the Riemann–Hurwitz formula:

2g − 2 = N(2g0 − 2) + B
(15.2)
Let us apply this formula to the computation of the genus of a hyperelliptic curve. The base is the Riemann sphere, for which there is a triangulation with eight faces, twelve edges and six vertices, so that χ0 = 2 or g0 = 0. We have seen that the covering has B = 2g + 2 branch points of index 1. Since N = 2, the genus computed by the Riemann–Hurwitz formula is precisely the number g parametrizing the degree of the polynomial P(x) in the equation of the curve y² = P(x).
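As a hedged illustration, the Riemann–Hurwitz formula can be packaged as a small helper (the function name is ours) and checked against the hyperelliptic computation above:

```python
# Riemann-Hurwitz: chi = N*chi0 - B, i.e. 2g - 2 = N*(2*g0 - 2) + B,
# solved here for the genus g of the covering surface.
def genus_from_riemann_hurwitz(N, g0, B):
    """Genus of an N-sheeted branched cover of a genus-g0 surface,
    where B is the total branching index."""
    two_g_minus_2 = N * (2 * g0 - 2) + B
    assert two_g_minus_2 % 2 == 0, "branching data must make 2g - 2 even"
    return two_g_minus_2 // 2 + 1

# Hyperelliptic check: N = 2, base sphere (g0 = 0), B = 2g + 2 simple
# branch points must reproduce the genus g of y^2 = P(x).
for g in range(6):
    assert genus_from_riemann_hurwitz(N=2, g0=0, B=2 * g + 2) == g
print("hyperelliptic genus recovered for g = 0..5")
```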
15.4 The field of meromorphic functions of a Riemann surface

Consider a Riemann surface of genus g. A complex valued function on this surface is analytic around a point if its expression in terms of a local parameter is analytic. Similarly, it has a pole of order n if its expression in terms of a local parameter has a pole of order n, and so on. These definitions are invariant under analytic reparametrizations. It is impossible to get an everywhere analytic non-constant function on a compact
Riemann surface; this is basically the Liouville theorem. We are generally interested in meromorphic functions, which have a finite number of poles. Obviously the meromorphic functions form a field, which is called the function field of the surface. When the curve is given by an equation P(x, y) = 0, any rational function of x and y is a meromorphic function on the curve with poles located at arbitrary points, and one can show that the most general meromorphic function can be so constructed. The field of meromorphic functions is just the field of rational functions of x and y modulo the equation of the curve. This allows us to give a completely algebraic description of Riemann surfaces. Conversely, let f be a non-trivial meromorphic function on a Riemann surface. The Cauchy theorem still holds true on a Riemann surface. Integrating over a small circle whose interior contains no poles or zeroes of f − a, one gets:

0 = (1/2πi) ∮ df/(f − a) = number of zeroes of (f − a) − number of poles of f

which shows that f takes each value the same number of times, say N times. This allows us to present a general Riemann surface as an N-sheeted branched covering of the Riemann sphere by p → z = f(p). For any z on the Riemann sphere consider its N pre-images pj under f on the Riemann surface. Now take any other meromorphic function h, and consider the elementary symmetric functions of the h(pj). They do not depend on the order of the sheets, hence define meromorphic functions of z on the Riemann sphere, that is rational functions of z. It is then clear that h obeys a polynomial equation P(f, h) = 0. By similar arguments one can show that one can choose h so that the polynomial P is irreducible, and then any other meromorphic function is a rational function of f and h. We see that any abstract Riemann surface may be viewed as a smooth compact algebraic curve. Let us consider the example of a hyperelliptic curve y² = P2g+1(x), where P2g+1 is a polynomial of degree 2g + 1.
Any meromorphic function can be written as f(x, y) = (A(x) + y B(x))/C(x) with A, B, C polynomials in x, since one can eliminate y² using the equation of the curve and eliminate y in the denominator. In this case we get a simple construction of a meromorphic function with g + 1 poles at (x1, y1), . . . , (xg+1, yg+1) by taking

f(x, y) = (y + Q(x)) / Π_i (x − xi)

where Q(x) is a polynomial of degree g determined by requiring that Q(xi) = yi, so that the numerator of f vanishes at the point (xi, −yi). Note
that f vanishes at ∞ and has g zeroes at finite distance. Also note that the meromorphic function x has a double pole at ∞, hence takes each value twice. The existence of such a function characterizes hyperelliptic curves.

15.5 Line bundles on a Riemann surface

It is important to generalize the notion of function on a Riemann surface by considering line bundles on the surface, and sections of these bundles. Such line bundles occur naturally in the study of integrable systems. Let us consider a covering {Uα} of the Riemann surface by open sets Uα and assume that for each non-void intersection Uα ∩ Uβ some continuous functions (called transition functions) tαβ are given which are neither vanishing nor ∞ on the intersection. Moreover, we assume that on each non-void triple intersection Uα ∩ Uβ ∩ Uγ one has tαβ tβγ tγα = 1 and that tαβ tβα = 1. This defines a line bundle ξ. When the transition functions are differentiable it is a differentiable bundle, while if the transition functions are analytic it is an analytic bundle. We shall be concerned with analytic bundles on a Riemann surface. A section of the bundle ξ is a collection of functions fα on each Uα such that on each intersection we have fα = tαβ fβ. If all functions fα are holomorphic this is a holomorphic section of ξ, and the space of such analytic sections will be denoted by Γ(ξ). If all functions fα are meromorphic it is called a meromorphic section. If fα has a pole or a zero of order m at some point of Uα ∩ Uβ, the same is true for fβ since tαβ is analytic without zero, and we say that the section has a zero or a pole of order m. Geometrically, an analytic line bundle can be seen as a triple (E, B, π) such that for any point b ∈ B there exists an open set Ub and an analytic isomorphism π−1(Ub) ≃ Ub × C. Any point above Uα can be written (p, φα) with p ∈ Uα and φα a number. If, moreover, p ∈ Uβ, the same point can be written (p, φβ).
The two descriptions patch if one can write φα = tαβ (p)φβ for some analytic non-vanishing function tαβ defined on Uα ∩Uβ . The triple intersection condition is obviously satisfied. Moreover, for any non-vanishing analytic functions fα defined on Uα an equivalent description of the line bundle is obtained by sending the point (p, φα ) of the line bundle to (p, fα (p)φα ). In the new description, the transition functions are now tαβ (p)fα (p)/fβ (p), so that transition functions differing by multiplication by a ratio fα /fβ define the same line bundle ξ. A local or global section of the line bundle ξ can be viewed intrinsically as a map which associates with each point of the Riemann surface a point in the fibre above it. In a local trivialization Uα × C this point can be written (p, fα (p)). In the intersection Uα ∩ Uβ the two descriptions are related by fα = tαβ fβ . We recover the above definition of sections.
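As an illustration of the triple-intersection condition tαβ tβγ tγα = 1, one may check it numerically for derivative-type transition functions (the kind that will appear in the canonical bundle example below); the three Möbius charts used here are our own hypothetical choice:

```python
# Numerical check of the cocycle condition t_ab * t_bc * t_ca = 1 for
# derivative-type transition functions, using three Mobius charts on the
# Riemann sphere (a hypothetical choice for illustration):
#   a: z,   b: w = 1/z,   c: u = (z - 1)/(z + 1).
def deriv(f, x, h=1e-6):
    """Central finite-difference derivative of f at the complex point x."""
    return (f(x + h) - f(x - h)) / (2 * h)

w_of_z = lambda z: 1 / z
u_of_w = lambda w: (1 - w) / (1 + w)      # u = (z - 1)/(z + 1) with z = 1/w
z_of_u = lambda u: (1 + u) / (1 - u)      # inverse chart change

z0 = 0.7 + 0.4j                           # a point in the triple overlap
w0 = w_of_z(z0)
u0 = u_of_w(w0)

t_zw = deriv(w_of_z, z0)                  # dw/dz at z0
t_wu = deriv(u_of_w, w0)                  # du/dw at w0
t_uz = deriv(z_of_u, u0)                  # dz/du at u0

product = t_zw * t_wu * t_uz              # the chain rule forces this to be 1
assert abs(product - 1) < 1e-6
print("cocycle condition verified")
```

The check succeeds automatically by the chain rule, which is exactly why derivatives of chart changes define consistent transition functions.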
Note that the quotient of two sections is a meromorphic function since the transition functions cancel, and the product of a section by a function is a section. One can differentiate differentiable sections of ξ with respect to z̄ since ∂z̄ fα = tαβ ∂z̄ fβ, hence {∂z̄ fα} is a section.

Example 1. A very important example of a line bundle is provided by the canonical bundle. Consider a covering Uα such that each open set Uα is analytically isomorphic to a domain of the complex numbers by a coordinate p → zα(p). In each non-trivial intersection Uα ∩ Uβ the coordinate zβ can be expressed as an analytic function of the coordinate zα, and conversely. One can define:

καβ(p) = (dzβ/dzα)(p)
which is holomorphic and non-vanishing in the intersection. The canonical bundle is defined by these transition functions. This definition is canonical since under a change of local parameters zα → wα the transition functions transform according to καβ → (wβ′/wα′) καβ, and the derivatives wα′ are analytic and non-vanishing on Uα. They therefore define the same bundle. A global section of this bundle is a collection of analytic functions fα such that fα = καβ fβ on intersections, that is fα dzα = fβ dzβ. It can be viewed as a globally defined holomorphic differential form of the type (1, 0), i.e. a form involving dz only.

Example 2. Another important example is provided by the so-called point bundles. Let the covering be as above and assume that p is a point on the Riemann surface belonging to some Uα and to no other Uβ. One can assume that zα(p) = 0. Define the transition functions tαβ = zα on non-trivial Uα ∩ Uβ and choose tβγ = 1 on all other non-trivial intersections. This defines a line bundle ξp. Note that our point bundle admits at least one analytic section σ, namely σα(zα) = zα and σβ = 1. The patching conditions σβ = tβγ σγ are obeyed for any β, γ including α. This section has just one zero at p. We see that an analytic line bundle may have non-trivial holomorphic sections while holomorphic functions on a compact Riemann surface are constant. Also, note that a bundle ξ is the trivial bundle if and only if it admits an analytic non-vanishing section fα, since in this case one can write tαβ = fα/fβ. One can introduce a group structure on line bundles. Given two line bundles ξ and σ one can assume that they are defined on a common suitably refined covering Uα by transition functions tαβ and sαβ. Then
the product ξσ is defined by the transition functions tαβ sαβ and the inverse ξ−1 is defined by 1/tαβ, while the neutral element is just the trivial bundle, i.e. the cartesian product of the Riemann surface by the complex numbers, which can be defined by tαβ = 1. Note that the above definitions are coherent with the redefinitions t′αβ = fα/fβ · tαβ. This group law is commutative, hence an additive notation is frequently used. If {fα} is a section of ξ and {gα} a section of σ then {fα gα} is a section of ξσ and {1/fα} is a meromorphic section of ξ−1.

15.6 Divisors

One can build more complicated bundles from point bundles at points p1, . . . , pk by taking their product, that is Σ_j nj ξpj in additive notation. Here the nj are positive or negative integers. A formal sum of points pj on the Riemann surface with multiplicities nj is called a divisor, denoted by D = Σ_j nj pj. The line bundle ξ = Σ_j nj ξpj is the line bundle associated with the divisor D(ξ) = Σ_j nj pj. For any meromorphic function f on the Riemann surface with zeroes at pj of order nj and poles at qk of order mk one defines the divisor of the function f as the formal sum

D(f) = Σ_j nj pj − Σ_k mk qk
The divisor of a section of a line bundle is similarly defined. A divisor is positive if all nj are positive, hence a meromorphic section is analytic if and only if its divisor is positive. Since ξp has a section with divisor p, the line bundle ξ associated with D has a section fξ with divisor D. When the divisor D is the divisor of a meromorphic function f, the associated bundle ξ has a section fξ with divisor D. The section f−1 fξ is analytic and non-vanishing, so that ξ is the trivial bundle. Conversely, if the bundle ξ associated with D is trivial, the section fξ gives rise to a meromorphic function of divisor D, since we can divide fξ by an analytic non-vanishing section. This allows us to introduce an equivalence relation between divisors: two divisors D1 and D2 are equivalent if their associated bundles ξ1 and ξ2 are such that ξ1 − ξ2 is the trivial bundle, i.e. if D1 − D2 is the divisor of a meromorphic function.

With a divisor D = Σ_j nj pj is associated a number

deg(D) = Σ_j nj

called the degree of the divisor. Two equivalent divisors have the same degree since a meromorphic function has the same number of zeroes and
poles. Note that if the line bundle ξ has a meromorphic section σ of divisor D, and η is the line bundle associated with D, with meromorphic section τ having the same divisor D, then ξ − η has a holomorphic non-vanishing section σ/τ, hence ξ = η.

15.7 Chern class

It is well known from differential geometry that one can associate with differentiable vector bundles an integer called the Chern class, which can be computed as the integral of some curvature form. For line bundles it can be shown that this integer classifies differentiable bundles. Moreover, the Chern class is compatible with the group structure, hence the Chern class of the “product” ξ + σ is just the sum c(ξ) + c(σ) of the Chern classes of ξ and σ. Simple proofs of these facts, adapted from the differential geometric case, can be found in the references. This leads to an index theorem stating that for any meromorphic section f of a line bundle ξ one has:

deg(D(f)) = c(ξ)

The trivial bundle has Chern class 0, hence this theorem reduces in this case to the above mentioned fact that a meromorphic function has the same number of zeroes and poles. This implies in particular that only bundles with non-negative Chern class may have non-zero holomorphic sections. Note that the point bundle ξp has Chern class 1 since it has a holomorphic section with just one zero at p.

15.8 Serre duality

Having introduced these natural definitions, we can now study the space of most interest to us, i.e. the space Γ(ξ) of holomorphic sections of the line bundle ξ. In order to do that it turns out to be very useful to introduce new spaces H1(Σ, O(ξ)) associated with the Riemann surface Σ and the line bundle ξ. Their definition is analogous to that of line bundles, except that the multiplicative structure is replaced by an additive one. Specifically, consider a covering Uα (where we assume that each Uα is connected and simply connected) and for each non-void intersection Uα ∩ Uβ a holomorphic local section fαβ of ξ.
Such sections are assumed to obey fαβ + fβγ + fγα = 0 on each non-void triple intersection Uα ∩ Uβ ∩ Uγ. The space H1(Σ, O(ξ)) is the space of {fαβ} modulo trivial ones, i.e. modulo those of the form fαβ = fα − fβ for holomorphic sections fα of ξ over Uα. Note that there is no non-vanishing condition here. If trivialisations of the bundle are defined over the Uα one can represent these sections
by complex valued functions related by transition functions. Let fα and fαβ be represented over Uα by functions fα^α and fαβ^α, while fαβ and fβ are represented on Uβ by fαβ^β and fβ^β respectively. Then the equation expressing the triviality of fαβ reads on Uα ∩ Uβ:

fαβ^α = fα^α − tαβ fβ^β

or a similar expression for the β trivialization. Note that one can define a ∂̄ operator on sections of ξ since ∂̄ vanishes on transition functions. Let us give an alternative description of H1(Σ, O(ξ)). One can always solve fαβ = fα − fβ with fα differentiable sections of ξ on Uα by partition of unity arguments∗. Consider the type (0, 1) forms (i.e. forms in dz̄)
∂̄fα = (∂fα/∂z̄) dz̄

On an intersection we have ∂̄fα − ∂̄fβ = ∂̄fαβ = 0, hence these forms patch to a globally defined section σ of ξ with values in type (0, 1) forms. Conversely, if such a differentiable section is given, one can write it as ∂̄fα on each Uα because, by an application of the Cauchy integral formula, one can always solve ∂̄f = σ on a connected set with f a differentiable function. Then one can define on an intersection fαβ = fα − fβ, which is analytic since ∂̄ vanishes on it. Of course fαβ vanishes if and only if the set of all fα defines a section of ξ, i.e. if fα = fβ. Hence H1(Σ, O(ξ)) can be identified with the set of differentiable sections of type (0, 1) of ξ, denoted by Γ1d(ξ), modulo the image by ∂̄ of the set of differentiable sections of ξ:

H1(Σ, O(ξ)) ≡ Γ1d(ξ)/∂̄Γd(ξ)

It can be shown that H1(Σ, O(ξ)) is finite-dimensional. This description is useful for understanding the Serre duality between the space H1(Σ, O(ξ)) and the space Γ(κ − ξ), defined by a non-singular pairing:

H1(Σ, O(ξ)) × Γ(κ − ξ) → C

As we have seen in the discussion of the canonical bundle, a section of κ − ξ can be viewed as a holomorphic section of −ξ with values in type (1, 0) forms. Take a differentiable element of Γ1d(ξ), locally of the form f(z, z̄)dz̄, and a holomorphic section of κ − ξ, locally of the form g(z)dz, and consider their wedge product, locally f(z, z̄)g(z) dz ∧ dz̄. In the product, the transition functions tαβ of ξ and 1/tαβ of −ξ cancel so that one

∗ If Σα rα = 1 with supp(rα) ⊂ Uα, define fα = Σγ rγ fαγ, where rγ fαγ is extended by 0 in Uα − Uγ, and note that fα − fβ = Σγ rγ(fαγ − fβγ) = Σγ rγ fαβ = fαβ.
ends up with a globally defined volume form on the Riemann surface that we integrate. This defines the pairing. Note that if f is identified with 0, that is if f(z, z̄)dz̄ = ∂̄h, one can integrate by parts, obtaining the integral of h ∂̄g ∧ dz, which vanishes since g is holomorphic. Hence the pairing is well-defined between the considered spaces. Finally, the essential point is that the pairing is non-singular. Indeed, any linear form acting on f dz̄ may be written ∫ f(z, z̄)g(z, z̄) dz ∧ dz̄ for some distribution g; since this form vanishes when f dz̄ = ∂̄h, the distribution g is weakly analytic, hence is an analytic function by the classical Weyl lemma. It follows that Γ(κ − ξ) and H1(Σ, O(ξ)) are finite-dimensional spaces of the same dimension.
15.9 The Riemann–Roch theorem

Let Γ(ξ) be the finite-dimensional space of holomorphic sections of the line bundle ξ and c(ξ) its Chern class.

Theorem. On a Riemann surface of genus g with canonical bundle κ we have, for any line bundle ξ:

dim Γ(ξ) − dim Γ(κ − ξ) = c(ξ) + 1 − g    (15.3)
Proof. We first show that

χ(ξ) = dim Γ(ξ) − dim H1(Σ, O(ξ)) − c(ξ)

is independent of the line bundle ξ. As a first step, we show that χ(ξ + ξp) = χ(ξ) for any point bundle ξp (see Example 2), with analytic section σp vanishing at p. Note that any local or global analytic section of ξ can be multiplied by σp to produce a section of ξ + ξp vanishing at p. This is clearly an injective homomorphism of the spaces of sections. It fails to be surjective if some global sections of ξ + ξp do not vanish at p. Hence we have two cases:

(a) There exists a global section of ξ + ξp non-vanishing at p. We have dim Γ(ξ + ξp) = dim Γ(ξ) + 1.

(b) All global sections of ξ + ξp vanish at p. We have dim Γ(ξ + ξp) = dim Γ(ξ).

Now let us consider local sections over intersections Uα ∩ Uβ (we assume that p does not belong to such intersections and p ∈ Uα). The homomorphism fαβ → σp fαβ is clearly bijective in this case, since σp does not vanish outside p, and trivial elements are mapped to trivial elements. This induces a surjective homomorphism H1(Σ, O(ξ)) → H1(Σ, O(ξ + ξp)) which fails to be injective if there exists a non-trivial set of sections fαβ of ξ mapping to a trivial set of sections fα − fβ of ξ + ξp, that is σp fαβ = fα − fβ.
In case (a) above, we can subtract the global non-vanishing section from all fγ so as to achieve fα(p) = 0. Then one can divide all fγ by σp, getting f′α such that fαβ = f′α − f′β. Hence the mapping is bijective and we have dim H1(Σ, O(ξ)) = dim H1(Σ, O(ξ + ξp)).

In case (b) we will show that dim H1(Σ, O(ξ)) = dim H1(Σ, O(ξ + ξp)) + 1 by constructing a section f′αβ which maps under σp on a trivial section, σp f′αβ = fα − fβ, where we have chosen fα(p) ≠ 0. If we had f′αβ = f′α − f′β we would get fα − σp f′α = fβ − σp f′β, hence defining a global section of ξ + ξp which by hypothesis vanishes at p, yielding fα(p) = 0, a contradiction. If, however, fα(p) = 0 one can divide it by σp and f′αβ is trivial. This shows that non-trivial elements {f′αβ} are parametrized by fα(p) and form a space of dimension 1 modulo trivial elements.

Since c(ξ + ξp) = c(ξ) + 1, in both cases we get χ(ξ + ξp) = χ(ξ). Starting from ξ − ξp we get χ(ξ) = χ(ξ − ξp). Hence for any divisor D and line bundle η associated with D we have χ(ξ + η) = χ(ξ). The rest of the proof is easier. There exists D such that dim Γ(ξ + η) ≠ 0. Otherwise one would get c(η) = const − dim H1(Σ, O(ξ + η)) ≤ const, where const = dim H1(Σ, O(ξ)) − dim Γ(ξ) is independent of η. This is impossible since c(η) can be arbitrarily large. Let σ be a non-trivial analytic section of ξ + η. Since η has a meromorphic section ση of divisor D we see that ξ has a non-trivial meromorphic section σ/ση. Hence ξ is the line bundle associated with this meromorphic section. We have obtained the:

Proposition. Any line bundle on the Riemann surface Σ has a non-trivial meromorphic section, of some divisor D, and is the line bundle associated with this divisor.

We can now take for ξ the trivial bundle in χ(ξ + η) = χ(ξ) and we see that χ(η) is independent of η, as previously claimed.
Taking into account the Serre duality formula, we get:

χ = dim Γ(ξ) − dim Γ(κ − ξ) − c(ξ)

When ξ is the trivial bundle, dim Γ(ξ) = 1 since global analytic functions on Σ are constants, hence χ = 1 − dim Γ(κ). When ξ = κ one gets χ = dim Γ(κ) − 1 − c(κ). Adding these two relations, χ = −(1/2)c(κ). To compute c(κ) we view the Riemann surface Σ as an n-sheeted covering of the Riemann sphere using a meromorphic function f on Σ. We start from a meromorphic differential ω on the Riemann sphere and take its pullback f∗(ω) on Σ. On the sphere we can take ω = dz. Its divisor has degree −2, since it has a double pole at infinity. Hence its pullback has poles of total multiplicity 2n above ∞. Moreover, for each branch point of order ν the mapping f is locally z = f(w) = wν, so that f∗(dz) = νwν−1 dw has a zero of order ν − 1, which
is the multiplicity index of the branch point. So the total multiplicity of the zeroes is precisely the total multiplicity B of the branch points of the covering, which by the Riemann–Hurwitz formula, eq. (15.2), with g0 = 0 is equal to 2g − 2 + 2n. The Chern class of κ is therefore 2g − 2. Hence χ = 1 − g, yielding the Riemann–Roch theorem. Moreover, we see that dim Γ(κ) = g, which means that the space of globally defined analytic one-forms is of dimension g.

Consider now the meromorphic functions on Σ. Let M(−D) be the set of meromorphic functions on Σ whose divisor is bigger than −D; i.e. f ∈ M(−D) iff the orders of its poles are less than or equal to those specified by −D and the orders of its zeroes are greater than or equal to those specified by −D. Let ξ be the line bundle associated with the divisor D. The space of holomorphic sections of ξ is isomorphic to M(−D), because this line bundle has a meromorphic section of divisor D and any other section is obtained by multiplication by a meromorphic function. The section will be holomorphic iff the divisor of the function is greater than −D. Hence dim M(−D) = dim Γ(ξ). We define i(D) = dim Γ(κ − ξ). This is the dimension of the space of differentials with a divisor greater than D. Recalling that c(ξ) = deg D, the Riemann–Roch theorem can be written as:

dim M(−D) = i(D) + deg D − g + 1    (15.4)

In general the Riemann–Roch formula eq. (15.3) relates two unknown quantities. However, if c(ξ) > 2g − 2, we see that c(κ − ξ) < 0, hence dim Γ(κ − ξ) = 0 (because c(κ − ξ) is the degree of the divisor of any meromorphic section of κ − ξ, which then necessarily has poles), and we get:

dim Γ(ξ) = c(ξ) + 1 − g,   if c(ξ) > 2(g − 1)    (15.5)
Corollary. The dimension of the space of meromorphic functions having at most k prescribed poles and at least l prescribed zeroes is greater than or equal to k − l + 1 − g. Equality occurs when k − l > 2g − 2. If k − l ≥ g the equality is satisfied for generic positions of the prescribed zeroes and poles.

Let us take for simplicity l = 0. Then dim Γ(κ − ξ) is the dimension of the space of holomorphic differentials having k prescribed zeroes. But the space of holomorphic differentials is of dimension g and we want to impose k linear conditions. This is generically impossible for k ≥ g. Hence the useful statement:

deg D ≥ g ⇒ i(D) = 0   generically
Note that we have previously illustrated this situation by constructing a meromorphic function with g + 1 prescribed poles on a hyperelliptic surface.
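For an elliptic curve (g = 1), eq. (15.4) with i(D) = 0 gives dim M(−D) = deg D, and this can be checked against the classical Weierstrass functions. The sketch below is our own illustration (the helper names are ours): it counts the monomials ℘^a (℘′)^b with b ≤ 1, which have a pole of order 2a + 3b at the origin and no other pole, and whose distinct pole orders make them independent.

```python
# Check of dim M(-D) = deg D on an elliptic curve (g = 1), D = n * O,
# using the Weierstrass monomial basis 1, wp, wp', wp^2, wp*wp', ...
def dim_M(n, g=1):
    """Riemann-Roch prediction for deg D = n > 2g - 2 (so that i(D) = 0)."""
    return n + 1 - g

def count_weierstrass_basis(n):
    """Count monomials wp^a * wp'^b (b <= 1) of pole order 0 < 2a + 3b <= n,
    plus the constant function."""
    count = 1  # the constant function
    for a in range(n + 1):
        for b in (0, 1):
            order = 2 * a + 3 * b
            if 0 < order <= n:
                count += 1
    return count

for n in range(1, 13):
    assert count_weierstrass_basis(n) == dim_M(n), n
print("dim M(-n*O) = n checked for n = 1..12")
```

Note that for n = 1 the count is 1 (constants only): there is no elliptic function with a single simple pole, in agreement with the vanishing of the sum of residues.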
15.10 Abelian differentials

Consider a Riemann surface Σ of genus g. Let ai, bi be a basis of cycles on Σ with canonical intersection matrix (ai · aj) = (bi · bj) = 0, (ai · bj) = δij. This means that one can take differentiable loops t → ai(t) and t → bi(t) such that there is no intersection between the loops ai and aj or bj with j ≠ i, while ai and bi intersect at just one point p. Moreover, at p the tangent vectors to ai and bi form a positively oriented basis of the tangent space at p (for the orientation given by the complex structure). One can then continuously deform these loops without changing the intersection index, which is the sum of signs ±1 at each intersection according to the orientation of the tangent vectors. In particular, one can deform the loops ai and bi so that they have a common base point and then cut the Riemann surface along them, getting a polygon with some edges identified. The boundary of this polygon can be described as a1 · b1 · a1⁻¹ · b1⁻¹ · · · ag · bg · ag⁻¹ · bg⁻¹, where the identifications are obvious. The common base point becomes all the vertices of the polygon.

The globally defined analytic 1-forms on Σ are called Abelian differentials of the first kind. They form a space of dimension g over the complex numbers. Note that such a differential has 2g − 2 zeroes on the Riemann surface, since c(κ) = 2g − 2 and it has no pole. There is a natural pairing between these forms and loops, obtained by integrating the form along the loop. It can be shown that the pairing between a-cycles and differentials is non-degenerate (note they have the same dimension g). This is a consequence of the Riemann bilinear identities that we shall describe below. We choose a basis of first kind Abelian differentials, which we denote by ωj, j = 1, . . . , g, normalized with respect to the a-cycles:

∮_{aj} ωi = δij    (15.6)

The matrix of b-periods is then defined as the matrix B with matrix elements:

Bij = ∮_{bi} ωj    (15.7)

Taking the example of a hyperelliptic surface y² = P2g+1(x), where P(x) is a polynomial of degree 2g + 1, a basis of regular Abelian differentials is
provided by the forms:

ωj = x^j dx/y,   j = 0, . . . , g − 1
These forms are regular except perhaps at the branch points and at ∞. At a branch point the local parameter is y and we have y² = a(x − b) + · · ·, hence x^j dx/y = (2b^j/a)(1 + · · ·) dy, which is regular. At ∞ we take x̃ = 1/x and ỹ = y x̃^{g+1}, so that ỹ² = a x̃ + · · · and x^j dx/y = b ỹ^{2(g−j−1)}(1 + · · ·) dỹ for some constant b, which is regular for j ≤ g − 1 since ỹ is the local parameter. Of course these forms are unnormalized. Note that their (2g − 2) zeroes are located at x = 0 and x = ∞.

Similarly, Abelian differentials of the second kind are meromorphic differentials with poles of order at least 2 and vanishing residues. Given a point p on Σ, there exists an Abelian differential of the second kind whose only singularity is a pole of second order at p. Indeed, by eq. (15.3) we see that dim Γ(κ + 2ξp) ≥ c(κ + 2ξp) + 1 − g = g + 1. An element in Γ(κ + 2ξp) comes from an Abelian differential multiplied by σp², where σp is the section of the point bundle ξp vanishing at p. Since the space of regular differentials is of dimension g, we see that there exists at least one section in Γ(κ + 2ξp) whose division by σp² has a pole at p, which is necessarily of second order. Adding a proper combination of differentials of the first kind, one can always ensure that all a-periods of the second kind differential vanish. Such a second kind differential is called normalized.

We can apply the Cauchy theorem to a meromorphic globally defined 1-form of type (1, 0), yielding the vanishing of the sum of the residues at its poles. Note that the residue of a meromorphic 1-form is intrinsically defined since Res = (1/2πi) ∮ ω, where the contour is a small loop around the given singularity. So we define Abelian differentials of the third kind as general meromorphic differentials with first order poles whose sum of residues vanishes.
Given two points p and q there exists a unique normalized (all a-periods vanish) third kind differential whose only singularities are a pole of order 1 at p with residue 1, and a pole of order 1 at q with residue −1.

15.11 Riemann bilinear identities

On a Riemann surface on which we have chosen canonical cycles there is a pairing between meromorphic differentials. Specifically, let Ω1 and Ω2 be two meromorphic differentials on Σ. The pairing (Ω1 • Ω2) is defined by integrating them along the canonical cycles as follows:

(Ω1 • Ω2) = Σ_{j=1}^g ( ∮_{aj} Ω1 ∮_{bj} Ω2 − ∮_{aj} Ω2 ∮_{bj} Ω1 )
The Riemann bilinear identity expresses this quantity in terms of residues:

Proposition. Let g1 be a function defined on the Riemann surface, cut along the canonical cycles, such that dg1 = Ω1. We have:

(Ω1 • Ω2) = 2iπ Σ_{poles} res(g1 Ω2)    (15.8)

Proof. One computes ∮ g1 Ω2 on the boundary of the polygon representing the Riemann surface. On one hand, this produces the sum of residues in eq. (15.8). On the other hand, we compute this integral explicitly:

∫_{aj · aj⁻¹} g1 Ω2 = ∫_{aj} g1 Ω2 − ∫_{aj} (g1 + ∮_{bj} Ω1) Ω2

since g1 is shifted by ∮_{bj} Ω1 when crossing the cut aj. Similarly, one gets:

∫_{bj · bj⁻¹} g1 Ω2 = ∫_{bj} g1 Ω2 − ∫_{bj} (g1 − ∮_{aj} Ω1) Ω2
Adding the contributions for all j one gets (Ω1 • Ω2).

Corollary. The matrix of b-periods B is symmetric.

Proof. The pairing between the normalized holomorphic differentials is trivial, (ωi • ωj) = 0 for i, j = 1, . . . , g, since g1 is then holomorphic and there are no residues. This reads Bij = Bji.

Corollary. Let Ω2 be a normalized differential of the second kind with a pole of order n, with principal part z^{−n} dz at z = 0 for some local parameter z. Let Ω1 = ωk be a normalized holomorphic differential expanded as

ωk = (Σ_{i=0}^∞ ci z^i) dz

around z = 0. One has:

∮_{bk} Ω2 = 2πi c_{n−2}/(n − 1)

By linearity, if Ω(P) is a normalized second kind differential with principal part dP(z), where P(z) = Σ_{n=1}^N pn z^{−n}, then we have

(1/2iπ) ∮_{bk} Ω(P) = −Res(ωk P)    (15.9)
15.12 Jacobi variety
We have seen that differentiable line bundles on a Riemann surface are classified by their Chern class. This is not so for analytic line bundles. To describe their classification, it is sufficient to consider the different analytic structures on line bundles of Chern class 0. The space of such structures is called the Jacobi variety. Since analytic line bundles are the same as equivalence classes of divisors on the Riemann surface, the Jacobi variety identifies with equivalence classes of divisors of degree 0. It is thus necessary to characterize the divisors of meromorphic functions. In order to do that, consider a divisor of degree 0, which can always be written D = Σ_i (pi − qi), with not necessarily distinct points. Choose paths γi from qi to pi and associate with D the point in C^g of coordinates:

ρk(D) = Σ_i ∫_{γi} ωk,   k = 1, . . . , g
Such sums are called Abel sums. If the paths are homotopically deformed these integrals remain constant by the Cauchy theorem. If one makes a loop around ak, then ρl → ρl + δkl. If one makes a loop around bk, then ρl → ρl + Bkl. Hence the maps ρk give a well-defined point on the torus:

J(Σ) = C^g / (Z^g + B Z^g)    (15.10)

where B is the matrix of the b-periods. If one permutes the points pi or the points qi independently, the point in the torus does not change. To see it, let the paths γ̃1 connect q1 to p2, γ̃2 connect q2 to p1, and σ connect p2 to p1. One has ∫_{γ̃1} ω = ∫_{γ1} ω − ∫_σ ω up to periods and ∫_{γ̃2} ω = ∫_{γ2} ω + ∫_σ ω up to periods, so ∫_σ ω cancels in the sum. Note that J(Σ) is an additive group and that the above map from line bundles to points of J(Σ) is a homomorphism. The theorems of Abel and Jacobi state that this point on the torus J(Σ) characterizes the divisor D up to equivalence, so that the Jacobian variety can be identified with the g-dimensional complex torus J(Σ).

Theorem (Abel). A divisor D = Σ_i (pi − qi) is the divisor of a meromorphic function if and only if, for any first kind Abelian differential ω, the Abel sum Σ_i ∫_{γi} ω vanishes modulo the periods of ω, for any choice of paths γi from qi to pi.

Theorem (Jacobi). For any point λ ∈ J(Σ) and a fixed reference divisor D0 = Σ_{i=1}^g qi, one can find a divisor of g points D = p1 + · · · + pg on Σ such that ρk(D − D0) maps to λ. Moreover, for generic λ the divisor D is unique.
Proof. Let f be a meromorphic function with divisor D = Σ_i (pi − qi), and consider for λ ∈ C the pencil of divisors Dλ = Σ_i (pλi − qi) of the meromorphic functions f + λ (the poles are those of f, the zeroes vary analytically with λ). Finally, let φ(λ) be the point in C/(Z + BZ) obtained by integrating ω along paths from qi to pλi. The map φ is obviously analytic and can be extended to an analytic map on the Riemann sphere, because when λ → ∞ we have pλi → qi. But such a map from the Riemann sphere to a torus is necessarily constant, since dz is a regular differential on the torus, hence φ∗(dz) has to vanish on the Riemann sphere. Since φ(∞) = 0 one gets φ = 0.

In order to prove the converse and the Jacobi theorem one has to exhibit particular functions or divisors. This will also be a consequence of the powerful Riemann theorem that we will show later on. To show the generic uniqueness of the g points mapping to a given λ, note that we have g equations for g unknowns, hence the space of solutions is generically of dimension 0. If we had two solutions, there would exist a meromorphic function f whose poles and zeroes are respectively these solutions. Then f + λ for λ ∈ C relates the divisor of poles to a one-parameter family of equivalent divisors, hence the solution space would be of dimension 1, a contradiction.

One can embed the Riemann surface Σ into its Jacobian J(Σ) by the Abel map. Specifically, choose a point q0 ∈ Σ and define the vector A(p) with coordinates Ak(p) modulo the lattice of periods:

A : Σ −→ J(Σ)    (15.11)
Ak(p) = ∫_{q0}^{p} ωk    (15.12)
Clearly, the Abel map depends on the point q0. But changing this point just amounts to a translation in J(Σ). The Abel map is an analytic embedding of the Riemann surface into the g-dimensional torus J(Σ), i.e. it is injective.

15.13 Theta-functions

One can show, using Riemann bilinear type identities, that the imaginary part of the period matrix B is a positive definite quadratic form. This allows us to define the Riemann theta-function:

θ(z1, . . . , zg) = Σ_{m∈Z^g} e^{2πi(m,z) + πi(Bm,m)}   (15.13)

Since the series is convergent, it defines an analytic function on C^g.
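The convergence and the automorphy properties (15.14) below are easy to check numerically. As an illustration (not from the text), here is a minimal sketch for genus g = 1, taking the period "matrix" B = i (so Im B > 0) and truncating the sum over Z:

```python
import cmath

def theta(z, B, N=30):
    # Truncated Riemann theta series for g = 1:
    # theta(z) = sum_m exp(2*pi*i*m*z + pi*i*B*m^2); Im B > 0 ensures convergence.
    return sum(cmath.exp(2j*cmath.pi*m*z + 1j*cmath.pi*B*m*m)
               for m in range(-N, N + 1))

B = 1j            # g = 1 period "matrix", Im B = 1 > 0
z = 0.3 + 0.2j

# Automorphy properties (15.14):
t0 = theta(z, B)
assert abs(theta(z + 1, B) - t0) < 1e-9                 # theta(z + l) = theta(z)
factor = cmath.exp(-1j*cmath.pi*B - 2j*cmath.pi*z)
assert abs(theta(z + B, B) - factor*t0) < 1e-9          # theta(z + Bl) relation
```

The truncation at |m| ≤ 30 is far more than enough here, since the terms decay like e^{−π m²}.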
The theta-function has simple automorphy properties with respect to the period lattice of the Riemann surface: for any l ∈ Z^g and z ∈ C^g

θ(z + l) = θ(z),   θ(z + Bl) = exp[−iπ(Bl, l) − 2iπ(l, z)] θ(z)   (15.14)
The divisor of the theta-function is the set of points in the Jacobian torus where θ(z) = 0. Note that this is an analytic subvariety of dimension g − 1 of the torus, well-defined due to the automorphy property. The fundamental theorem of Riemann expresses the intersection of the image of the embedding of Σ into J(Σ) with the divisor of the theta-function.

Theorem. Let w = (w1, . . . , wg) ∈ C^g be arbitrary. Either the function θ(A(p) − w) vanishes identically for p ∈ Σ or it has exactly g zeroes p1, . . . , pg such that:

A(p1) + · · · + A(pg) = w − K   (15.15)

where K is the so-called vector of Riemann's constants, depending on the curve Σ and the point q0 but independent of w.

Proof. We first dissect the Riemann surface Σ as explained above, obtaining a polygon with boundary a1 · b1 · a1⁻¹ · b1⁻¹ · · · ag · bg · ag⁻¹ · bg⁻¹ in C. Consider the analytic function on the polygon (or more precisely on the Riemann surface cut along the previous loops):

f(p) = θ(A(p) − w)

Assuming that f does not vanish identically, it has discrete zeroes pi. Since it has no pole, the number of these zeroes is given by:

number of zeroes = (1/2iπ) ∮ df/f

where the integral is taken on the boundary of the polygon. This integral is a sum of terms on the arcs ak and bk. The integrals on the arcs bk and bk⁻¹ are related by a translation by the ak period, hence cancel. Similarly, the difference of the integrals on ak and ak⁻¹ reduces, by the automorphy property, to 2iπ ∫_{ak} ωk = 2iπ. Thus the number of zeroes is g. To prove the second identity we proceed similarly by considering the integral:

(1/2iπ) ∮ gk (df/f),   with dgk = ωk and gk(q0) = 0
computed on the edge of the polygon. On one hand, this integral is equal to the sum of residues, which occur at the zeroes of f, and produces:

Σ_{j=1}^{g} gk(pj) = Σ_j Ak(pj)

On the other hand, it can be computed as a sum over arcs using the automorphy properties of the theta-function:

∫_{aj⁻¹} gk d(log f) = −∫_{aj} (gk + Bjk)(d(log f) − 2iπ dAj(p))

∫_{aj} gk d(log f) + ∫_{aj⁻¹} gk d(log f) ≡ 2iπ ∫_{aj} gk ωj

modulo B periods. But we have:

∫_{bj} gk d(log f) + ∫_{bj⁻¹} gk d(log f) = ∫_{bj} (gk + δjk) d(log f) − ∫_{bj} gk d(log f) ≡ δjk (2iπ wj − 2iπ Aj(q1))

modulo periods, where q1 is the base point of all the loops at the boundary. Putting everything together, one gets the Riemann formula with K given by a complicated expression, independent of w.

Corollary (Jacobi's theorem). Any point in the Jacobian J(Σ) is the image of some degree g divisor p1 + · · · + pg on Σ.

Proof. One has to find g points such that A(p1) + · · · + A(pg) = z modulo periods for given z. We find these points by solving the equation θ(A(p) − K − z) = 0.

The divisor of the zeroes of a theta-function also has a nice characterization in terms of points on the Riemann surface.

Theorem. The zero divisor of the theta-function can be written as

X = −K − Σ_{i=1}^{g−1} A(ηi)

Proof. Let X be a point of the Θ divisor, i.e. θ(X) = 0. Consider the function

θ(A(p) + X)   (15.16)
By Riemann's theorem, the zeroes of this function (if it does not vanish identically) are such that

A(p1) + · · · + A(pg) + X = −K

Among these zeroes one necessarily finds q0, the base point of the Abel map (since θ(A(q0) + X) = θ(X) = 0), say pg = q0. Hence we have X = −K − Σ_{i=1}^{g−1} A(pi). Conversely, if X is of this form, one solves eq. (15.16), producing a divisor of g points Σ_{i=1}^{g} qi such that Σ_{i=1}^{g} A(qi) = Σ_{i=1}^{g−1} A(pi). A solution is obviously qi = pi for i = 1, . . . , g − 1 and qg = q0. This solution is generically unique up to permutation due to the Jacobi theorem. Equation (15.16) for p = q0 then reads θ(X) = 0. Note that the space of points of the form X = −K − Σ_{i=1}^{g−1} A(pi) and the set of solutions of θ(X) = 0 are both closed in the Jacobian, hence they are equal.

The Riemann theorem can be used to express meromorphic functions in terms of theta-functions. Let f be a meromorphic function with g poles at points δ1, . . . , δg, an additional pole at the point q⁺ and one of its zeroes at a specified point q⁻, i.e. with divisor greater than D = −δ1 − · · · − δg − q⁺ + q⁻. By the Riemann–Roch theorem such a function generically exists and is unique. Let w, w⁺, w⁻, w0 be the vectors defined by the formulae:

w = Σ_{s=1}^{g} A(δs) + K
w⁺ = A(q⁺) + Σ_{s=2}^{g} A(δs) + K
w⁻ = A(q⁻) + Σ_{s=2}^{g} A(δs) + K
w0 = w⁺ + w⁻ − w

Let us define the function

f(p) = [θ(A(p) − w⁻) θ(A(p) − w0)] / [θ(A(p) − w) θ(A(p) − w⁺)]   (15.17)
From the Riemann theorem it follows that the two factors in the denominator vanish generically at the points δ1 , . . . , δg and q + , δ2 , . . . , δg , respectively. Similarly, the two factors in the numerator vanish at q − , δ2 , . . . , δg and g other points. The zeroes at δ2 , . . . , δg cancel between the numerator and the denominator, thereby leaving us with the correct divisor of
zeroes and poles. It remains to show that the function f is well-defined on Σ. This is because, due to the definition of w0, the automorphy factors of the theta-functions in eq. (15.14) cancel between the numerator and the denominator when p describes b-cycles on Σ. The converse of Abel's theorem results from an analogous construction.

15.14 The genus 1 case

The application of these ideas to the genus 1 case is the classical theory of elliptic functions. A genus 1 analytic surface can be viewed as the quotient of C by a lattice, whose periods are denoted, by a classical convention, 2ω1, 2ω2. In this case, the theorems of Abel and Jacobi identify the curve and its Jacobi variety. Note that dz is a well-defined regular analytic differential on this torus, and spans the one-dimensional space of Abelian differentials. For any meromorphic function f on the torus, i.e. periodic with respect to the lattice, the differential f dz is a meromorphic differential, so the sum of its residues vanishes. Thus f has at least two poles or a pole of order 2, and a meromorphic function with just one simple pole is impossible. The main example is the Weierstrass ℘-function:

℘(z) = 1/z² + Σ_{(m,n)≠(0,0)} [ 1/(z − 2mω1 − 2nω2)² − 1/(2mω1 + 2nω2)² ]
An analogue of the theta-function is provided by the Weierstrass sigma-function:

σ(z) = z Π_{(m,n)≠(0,0)} (1 − z/ωmn) exp[ z/ωmn + (1/2)(z/ωmn)² ]   (15.18)

with ωmn = 2mω1 + 2nω2. This function is related to the ℘-function by the equations:

ζ(z) = σ′(z)/σ(z),   ℘(z) = −ζ′(z)   (15.19)
The ℘-function is doubly periodic, and the sigma- and zeta-functions transform according to:

ζ(z + 2ωl) = ζ(z) + 2ηl,   σ(z + 2ωl) = −σ(z) e^{2ηl(z+ωl)}

The Riemann bilinear identity applied to the forms Ω1 = −℘(z)dz and Ω2 = dz yields 2(η1ω2 − η2ω1) = iπ. These functions have the symmetries:

σ(−z) = −σ(z),   ζ(−z) = −ζ(z),   ℘(−z) = ℘(z).
Their behaviour in the neighbourhood of zero is:

σ(z) = z + O(z⁵),   ζ(z) = z⁻¹ + O(z³),   ℘(z) = z⁻² + O(z²)
It is useful for the study of the elliptic Calogero model to introduce the Lamé function:

Φ(x, z) = [σ(z − x)/(σ(x)σ(z))] e^{ζ(z)x}   (15.20)
It has the symmetry property Φ(−x, z) = −Φ(x, −z). The function Φ(x, z) is a doubly-periodic function of the variable z, Φ(x, z + 2ωl) = Φ(x, z), and has an expansion of the form:

Φ(x, z) = (−z⁻¹ + ζ(x) + O(z)) e^{ζ(z)x}   (15.21)

at the point z = 0. As a function of x, it has the following monodromy properties:

Φ(x + 2ωl, z) = Φ(x, z) exp[2(ζ(z)ωl − ηl z)]   (15.22)

and has a pole at the point x = 0: Φ(x, z) = x⁻¹ + O(x). The function Φ is a solution of the Lamé equation:

( d²/dx² − 2℘(x) ) Φ(x, z) = ℘(z) Φ(x, z)   (15.23)

Choosing the periods ω1 = ∞ and ω2 = iπ/2, we obtain the hyperbolic functions:

σ(z) → sinh(z) exp(−z²/6),   ζ(z) → coth(z) − z/3,   ℘(z) → 1/sinh²(z) + 1/3

and

Φ(x, z) → [sinh(z − x)/(sinh(z) sinh(x))] e^{x coth z}
In the rational limit, we have:

σ(z) → z,   ζ(z) → 1/z,   ℘(z) → 1/z²,   Φ(x, z) → (1/x − 1/z) e^{x/z}   (15.24)
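As a quick consistency check (an illustration, not part of the text), the Lamé equation (15.23) and the pole behaviour at x = 0 can be verified numerically in the hyperbolic limit, where all the functions involved are elementary; the second derivative is approximated by central finite differences:

```python
import math

def wp(u):                        # hyperbolic limit of the Weierstrass p-function
    return 1.0/math.sinh(u)**2 + 1.0/3.0

def Phi(x, z):                    # hyperbolic limit of the Lame function (15.20)
    return math.sinh(z - x)/(math.sinh(z)*math.sinh(x)) * math.exp(x/math.tanh(z))

x, z, h = 0.7, 1.3, 1e-4
# second derivative in x by central finite differences
d2 = (Phi(x + h, z) - 2*Phi(x, z) + Phi(x - h, z))/h**2
# Lame equation (15.23): Phi'' - 2 wp(x) Phi = wp(z) Phi
lhs = d2 - 2*wp(x)*Phi(x, z)
rhs = wp(z)*Phi(x, z)
assert abs(lhs - rhs) < 1e-4

# pole behaviour at x = 0: Phi(x, z) = 1/x + O(x), with no constant term
eps = 1e-4
assert abs(Phi(eps, z) - 1/eps) < 1e-2
```

The absence of a constant term in the expansion at x = 0 is visible in the second assertion: the ζ-type contributions cancel against the exponential factor.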
15.15 The Riemann–Hilbert factorization problem

In this section, we give the proof of the Riemann–Hilbert theorem on the Riemann sphere, see eq. (3.49) in Chapter 3. Let U+ be the disc |x| < 1 + η, and U− be the disc |x| > 1 − η. Let C be the circle |x| = 1.
Let us give ourselves a matrix A analytic in the ring U+ ∩ U−, such that det A ≠ 0. We consider the kernel:

K(x, t) = A⁻¹(x)A(t) − 1

which is analytic in the above ring and vanishes for x = t. On continuous functions on C we define an operator ℱ:

(ℱF)(x) = F(x) + (1/2iπ) ∫_C (K(x, t)/(x − t)) F(t) dt

This is a Fredholm operator because it is of the form 1 plus a compact operator. Hence Im ℱ is closed and of finite codimension. So one can choose a matrix of polynomials P(x) such that det P(x) ≠ 0 for |x| sufficiently large and such that there exists a function F satisfying

(ℱF)(x) = A⁻¹(x)P(x),   x ∈ C

The function F(x) has an analytical extension to the ring U+ ∩ U− given by

F(x) = A⁻¹(x)P(x) + (1/2iπ) ∫_C (K(x, t)/(t − x)) F(t) dt   (15.25)

It follows that one can expand F(x) in a Laurent series and write F(x) = F+(x) + F−(x), where F±(x) are analytic in U± respectively and F−(x) vanishes at x = ∞. Note that for |x| > 1, one has

F−(x) = −(1/2iπ) ∫_C (F(t)/(t − x)) dt

Subtracting from eq. (15.25), we get for 1 < |x| < 1 + η:

F+(x) = A⁻¹(x)P(x) + (1/2iπ) ∫_C ((F(t) + K(x, t)F(t))/(t − x)) dt

Multiplying by A(x), we get in the same domain

A(x)F+(x) = P(x) + (1/2iπ) ∫_C (A(t)F(t)/(t − x)) dt ≡ F̃−(x)

The function F̃−(x) is in fact analytic for |x| > 1 and behaves as P(x) for x → ∞, so that its determinant does not vanish for |x| sufficiently large. It follows that det F+(x) and det F̃−(x) do not vanish identically and therefore have a finite number of zeroes at finite distance. We remove each zero successively by the following procedure. Consider an equation of the form

AG+ = G−   (15.26)
Suppose that det G+(x) has a simple zero at x0 in U+. This means that there is a linear combination of its columns vanishing at x0. This combination is realized by multiplying on the right by a constant matrix M such that det M ≠ 0. We have of course AG+M = G−M. Let us assume that it is the first column of G+M which vanishes at x0. Multiplying on the right by diag(1/(x − x0), 1, . . . , 1), we remove the zero x0 without modifying the analytic properties of the right-hand side. Iterating the procedure, we get a pair G+, G− satisfying eq. (15.26) with det G+(x) ≠ 0 for x ∈ U+. It follows that the zeroes of det G−(x) are not in U+ and can be removed by the same procedure without modifying the analytic properties of G+(x). At the end we get a matrix G−(x) behaving at ∞ as

G−(x) = A−(x) diag(x^{κ1}, . . . , x^{κN}),   det A−(∞) ≠ 0

Setting A+ = G+, we have finally factorized

A = A− diag(x^{κ1}, . . . , x^{κN}) A+⁻¹,   det A± ≠ 0
Remark. When A is close to the identity matrix one can write A = 1 + εA′, and the map ℱ is surjective for ε sufficiently small. In that case one can take P(x) = 1, and it is then clear that F+(x) = 1 + O(ε), F̃−(x) = 1 + O(ε), so that their determinants do not vanish. Hence all the indices κi vanish.
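The factorization can be made concrete in the simplest situation: a scalar function A with zero winding number on the circle, where all the indices κi vanish and A = A− A+⁻¹ is obtained by splitting the Fourier series of log A into non-negative and negative modes. The sketch below (an illustration under these simplifying assumptions, not the matrix construction of the text) uses a discrete approximation of the Fourier coefficients:

```python
import cmath

# Scalar Riemann-Hilbert factorization A = A_minus * A_plus^{-1} on |x| = 1,
# for a function with zero winding number: split the Fourier series of log A.
N = 256
def A(x):                          # analytic and non-vanishing near |x| = 1
    return (2 + x) * (3 + 1/x) / 6

# Fourier coefficients of log A sampled on the unit circle
pts = [cmath.exp(2j*cmath.pi*k/N) for k in range(N)]
logs = [cmath.log(A(p)) for p in pts]
def coeff(n):
    return sum(l*cmath.exp(-2j*cmath.pi*k*n/N) for k, l in enumerate(logs))/N

def A_plus(x):     # analytic and non-vanishing for |x| < 1
    return cmath.exp(-sum(coeff(n)*x**n for n in range(0, N//4)))
def A_minus(x):    # analytic and non-vanishing for |x| > 1, constant at infinity
    return cmath.exp(sum(coeff(-n)*x**(-n) for n in range(1, N//4)))

x = cmath.exp(0.7j)                # a point on the circle
assert abs(A(x) - A_minus(x)/A_plus(x)) < 1e-10
```

Here A(x) = (2 + x)(3 + 1/x)/6 has winding number zero on |x| = 1, so log A is single-valued there and the splitting works; a nonzero winding number would first have to be removed by a factor x^κ, in accordance with the diagonal factor of the theorem.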
References
[1] G. Springer, Introduction to Riemann Surfaces. Addison–Wesley (1957).
[2] R.C. Gunning, Lectures on Riemann Surfaces. Princeton University Press (1966).
[3] P. Griffiths and J. Harris, Principles of Algebraic Geometry. Wiley (1978).
[4] D. Mumford, Tata Lectures on Theta, Vols. I and II. Birkhäuser (1983).
[5] J.P. Serre, Algebraic Groups and Class Fields. Springer (1997).
[6] E.T. Whittaker and G.N. Watson, A Course of Modern Analysis. Cambridge University Press (1902).
[7] J. Fay, Theta Functions on Riemann Surfaces. Springer Lecture Notes (1973).
16 Lie algebras
We present basic facts about Lie groups and Lie algebras. We describe semi-simple Lie algebras and their representations, which can be characterized in terms of roots and weights. We discuss infinite-dimensional Lie algebras, called affine Kac–Moody algebras, which are at the heart of the study of field theoretical integrable systems. In particular we construct the so-called level one representations using the techniques of Fock spaces and vertex operators introduced in Chapter 9.

16.1 Lie groups and Lie algebras

A Lie group is a group G which is at the same time a differentiable manifold, and such that the group operation (g, h) → gh⁻¹ is differentiable. Due to a theorem of Montgomery and Zippin, the differentiable structure is automatically real analytic. The maps h → gh and h → hg are called respectively left and right translations by g. Their differentials at the point h map the tangent space Th(G) respectively to Tgh(G) and Thg(G). We will denote by g · X and X · g the images of X ∈ Th(G) under these maps. This notation is coherent because, differentiating the associativity condition in G, one gets (g · X) · h = g · (X · h) and g · (h · X) = (gh) · X. In particular, this last relation shows that, for any X in the tangent space of G at the unit element e, the vector field with value g · X at g is invariant under any left translation. Conversely, any such left-invariant vector field is of the form g · X. So in the following we identify left-invariant vector fields on G and elements of Te(G). This finite-dimensional vector space is called the Lie algebra of G and will be denoted by G. Alternatively, one can define the vector field X(g) by its action on any function f, that is, (X · f)(g) is the derivative of f along the tangent
vector X at g. This defines a new function X · f. The left invariance of the vector field X(g) means that (X · f)(hg) = (X · fh)(g), where fh denotes the left-translated function fh : g → f(hg). In other words, the differential operator X commutes with left translations. More generally, for X1, X2, . . . ∈ G, one can consider linear combinations of differential operators (of any order) of the form X1 · (X2 · · · (Xk · f) · · ·), which obviously form an associative algebra of left-invariant differential operators. The Lie algebra is embedded into this associative algebra as the set of first order differential operators. One defines the Lie bracket [X, Y] = XY − YX in terms of the associative algebra product. The main point is that, while XY and YX are second order differential operators, their commutator is a first order differential operator, so that [X, Y] belongs to the Lie algebra. It is clear from this definition that (X, Y) → [X, Y] is bilinear, antisymmetric and obeys the Jacobi identity:

[[X, Y], Z] + [[Z, X], Y] + [[Y, Z], X] = 0   (16.1)
The associative algebra of left-invariant differential operators on G is called the universal enveloping algebra of the Lie algebra G and will be denoted by U(G). Finally, there is a natural action of the Lie group G on its Lie algebra G called the adjoint action. Note that for any X in the tangent space at e and any g, g · X · g −1 is also in the tangent space at e to G. We define the adjoint action: Ad (g)(X) = g · X · g −1 ,
X∈G
(16.2)
and note that Adgh = Adg Adh , so this is a group action of G on G. Let X ∈ G and consider the left-invariant vector field g · X. For small t we can solve the differential equation g˙ = g · X with initial condition g(0) = e. The solution g(t) ∈ G is such that g(s + t) = g(s)g(t) for s, t small. This is because both members solve the above differential equation with initial value g(s) for t = 0. One can then use this property to show that the solution of the differential equation extends to a domain larger than initially defined (if we have a solution for |s| ≤ then g(s + t) is a solution for |s + t| ≤ 2 and still solves the equation there), and successively extends to all of R. The solution, defined for all t, is denoted by exp (tX). In particular, for t = 1 this defines the exponential map from G to G (obviously exp (tX) belongs to the connected component of e in G, so in the following we assume that G is connected). The exponential map allows us to relate subgroups of G to subalgebras of G. Let H be a subalgebra of G, and H be the smallest subgroup of G containing all the exp (X) for X ∈ H. One can show that H is a Lie
subgroup of G with Lie algebra H. The precise definition of a Lie subgroup is quite tricky; in particular, H may be embedded in a very complicated way in G, as shown by the simple example below. Conversely, given H, its Lie algebra H is the set of X ∈ G such that t → e^{tX} is a continuous curve in H.

Example. Consider the torus R²/Z² with the Abelian group law induced by the addition in R². This is a Lie group G, with Abelian Lie algebra R². Consider the one-parameter subgroup H = {exp(tX) | t ∈ R}, which is a Lie subgroup. When X has irrational slope, this subgroup is dense in G. In particular, any neighbourhood U of e contains infinitely many components of H.

A much nicer situation is obtained when H is closed in G, and fortunately this is the situation of interest for our purposes. We are mostly interested in the case when the Lie group G acts differentiably on a manifold M, and H is the stabilizer of a point m ∈ M. In this case, since the operation is continuous, H is automatically closed in G. Remarkably, this is sufficient to ensure that H is a closed Lie subgroup of G, thanks to a theorem of E. Cartan.

Theorem. If H is a closed subgroup of a Lie group G, there exists a unique analytic structure on H such that H is a Lie subgroup of G.

Proof. We sketch the proof. The idea is to define H = {X ∈ G | exp(tX) ∈ H, ∀t} and show that this is a subalgebra of G. Easy computations show that:

( exp(tX/n) exp(tY/n) )ⁿ = exp( t(X + Y) + O(1/n) )

( exp(tX/n) exp(tY/n) exp(−tX/n) exp(−tY/n) )^{n²} = exp( t²[X, Y] + O(1/n) )

so that, if X, Y ∈ H, both (X + Y) and [X, Y] are in H since H is closed. One then uses the closedness of H to show that the Lie subgroup of G with Lie subalgebra H is in fact equal to H (more precisely to the connected component of the identity in H).

In the situation described above where G acts on M, one can show that the application g → g · m of G onto the orbit Om of m is open, so that Om is isomorphic to the homogeneous space G/H.
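The two limit formulas above can be checked numerically for matrix groups. A small sketch (illustrative only) with 2 × 2 matrices, where exp is computed by its Taylor series:

```python
# Numerical check of the two limit formulas in Cartan's theorem, for 2x2 matrices.
def mul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mexp(a, terms=30):             # matrix exponential by Taylor series
    r = [[1.0, 0.0], [0.0, 1.0]]
    t = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        t = [[t[i][j]/n for j in range(2)] for i in range(2)]
        t = mul(t, a)
        r = [[r[i][j] + t[i][j] for j in range(2)] for i in range(2)]
    return r

def power(a, n):
    r = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        r = mul(r, a)
    return r

def scale(a, s):
    return [[s*a[i][j] for j in range(2)] for i in range(2)]

X = [[0.0, 0.3], [0.0, 0.0]]
Y = [[0.0, 0.0], [0.2, 0.0]]
XY = [[0.06, 0.0], [0.0, -0.06]]   # the commutator [X, Y] = XY - YX

n = 200
# (exp(X/n) exp(Y/n))^n ~ exp(X + Y)
lhs = power(mul(mexp(scale(X, 1/n)), mexp(scale(Y, 1/n))), n)
rhs = mexp([[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)])
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-2 for i in range(2) for j in range(2))

# (exp(X/n) exp(Y/n) exp(-X/n) exp(-Y/n))^(n^2) ~ exp([X, Y])
g = mul(mul(mexp(scale(X, 1/n)), mexp(scale(Y, 1/n))),
        mul(mexp(scale(X, -1/n)), mexp(scale(Y, -1/n))))
lhs2 = power(g, n*n)
rhs2 = mexp(XY)
assert all(abs(lhs2[i][j] - rhs2[i][j]) < 1e-2 for i in range(2) for j in range(2))
```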
Examples. The most natural examples of Lie groups are provided by so-called algebraic subgroups of the general linear group GL(n). These are subvarieties of GL(n) defined by polynomial equations compatible with the multiplication law. For example, the special linear group is defined by the equation det g = 1, and the product of two such matrices has determinant 1. Hence these are naturally closed Lie subgroups of GL(n). The other standard examples are the subgroups of orthogonal and symplectic matrices.

For any Lie group G, a representation ρ on a vector space V is a differentiable group homomorphism G → GL(V). For g ∈ G and v ∈ V one denotes g · v = ρ(g)(v), so that (gh) · v = g · (h · v). The differential of ρ at e maps the Lie algebra G to the Lie algebra gl(V) of GL(V). Similarly, left-invariant vector fields on G are mapped to left-invariant vector fields on GL(V), and so are their Lie brackets. Hence we get a representation of G on gl(V), which we shall also denote by ρ. In other words, we have [ρ(X), ρ(Y)] = ρ([X, Y]). Such a representation is faithful if ρ : G → gl(V) is injective (i.e. for X ≠ 0 there exists v ∈ V such that ρ(X)(v) ≠ 0). In this case G may be seen as a subalgebra of gl(V). There is a natural representation of any Lie group on its Lie algebra, i.e. V = G, given by the adjoint representation. This induces a representation of G on G, also called the adjoint representation:

adX(Y) = [X, Y]   (16.3)

It is easy to check that this is a representation of G, using the Jacobi identity. Almost all results on Lie algebras are obtained by studying this representation.
16.2 Semi-simple Lie algebras

Because there is such an interplay between Lie groups and Lie algebras, we study here Lie algebras from an algebraic viewpoint. In this section we consider Lie algebras over the complex numbers, e.g. complexifications G_R ⊗ C of real Lie algebras. We will often use a basis (Ta), a = 1, . . . , dim G, of the complex Lie algebra G. The Lie bracket is then expressed as:

[Ta, Tb] = f_ab^c Tc

The coefficients f_ab^c are called structure constants. Note that in this basis the matrix elements of the adjoint representation are (ad Ta)^c_b = f_ab^c.
The adjoint representation eq. (16.3) allows us to define a natural bilinear form on G, called the Killing form, by:

(X, Y) = Tr(adX adY)

This bilinear form is invariant in the sense that:

([X, Y], Z) = (X, [Y, Z])

This results immediately from the cyclic invariance of the trace and the fact that X → adX is a representation. The invariance property also means that (adY X, Z) + (X, adY Z) = 0. A Lie algebra is said to be semi-simple if it does not contain any non-trivial Abelian ideal. The Cartan criterion says that this is the case if and only if the Killing form is non-degenerate. In one direction this is easy. If X belongs to an Abelian ideal I, and Y is arbitrary, choose a basis Ta of G such that Ta is a basis of I for a = 1, . . . , r. Then adY adX(Ta) vanishes for a ≤ r and belongs to I for a > r, hence adY adX has no diagonal element in this basis, and its trace vanishes. We see that any Abelian ideal is in the kernel of the Killing form. A Lie algebra G is called simple if it is semi-simple and its only ideals are either {0} or the algebra G itself. For any ideal I in a semi-simple algebra, its orthogonal under the Killing form is also an ideal. Moreover, by invariance and non-degeneracy of the Killing form one sees that I ∩ I⊥ is an Abelian ideal (for X ∈ I, Y ∈ I⊥, and Z arbitrary, ([X, Y], Z) = (X, [Y, Z]) = 0), hence vanishes. It follows that G is the direct sum of its simple ideals, this being an orthogonal decomposition. We now introduce the concept of a Cartan subalgebra in a semi-simple Lie algebra. First, an element X of G is called semi-simple if adX is a diagonalizable matrix in the adjoint representation. A Cartan subalgebra H of a semi-simple Lie algebra G is a maximal Abelian subalgebra of G whose elements are all semi-simple. The existence of such an algebra is a very non-trivial result.
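For a concrete illustration (not part of the text), one can compute the Killing form of sl(2) directly from its structure constants in the basis (H, E, F) and observe that it is non-degenerate, in accordance with the Cartan criterion:

```python
# Killing form of sl(2) computed from the adjoint representation, in the basis
# T = (H, E, F) with [H, E] = 2E, [H, F] = -2F, [E, F] = H.
basis = ['H', 'E', 'F']
# bracket[(a, b)] = components of [T_a, T_b] in the basis (structure constants)
bracket = {
    ('H', 'E'): {'E': 2},  ('E', 'H'): {'E': -2},
    ('H', 'F'): {'F': -2}, ('F', 'H'): {'F': 2},
    ('E', 'F'): {'H': 1},  ('F', 'E'): {'H': -1},
}
def ad(a):
    # matrix of ad T_a : T_b -> [T_a, T_b]; rows/columns indexed by (H, E, F)
    return [[bracket.get((a, b), {}).get(c, 0) for b in basis] for c in basis]

def killing(a, b):                 # (X, Y) = Tr(ad X ad Y)
    m = [[sum(ad(a)[i][k]*ad(b)[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    return sum(m[i][i] for i in range(3))

K = [[killing(a, b) for b in basis] for a in basis]
assert K == [[8, 0, 0], [0, 0, 4], [0, 4, 0]]   # non-degenerate: sl(2) is semi-simple
```

The zero pattern of K already shows the structure of eq. (16.4) below: the Cartan element H is orthogonal to E and F, and (E, E) = (F, F) = 0 while (E, F) ≠ 0.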
To construct it one starts with a regular element, that is an element X of G such that adX has a maximal number of distinct eigenvalues (as a matrix in the adjoint representation). Then the subalgebra of G on which adX is nilpotent is a Cartan subalgebra. One can show that any two Cartan subalgebras are related by a Lie algebra automorphism, and their common dimension is called the rank of the Lie algebra, denoted rank G. In the adjoint representation, the endomorphisms ad(H) for H ∈ H form a system of commuting diagonalizable endomorphisms. We can thus diagonalize them simultaneously. Let Eα ∈ G be the common eigenvectors:

ad(H) · Eα = α(H) Eα
The application α : H ∈ H → α(H) ∈ C is a linear form defined over H. That is, α belongs to the dual of the Cartan algebra: α ∈ H∗. These forms are called the roots of the Lie algebra. We shall denote their set by ∆. They satisfy a few simple properties: (i) if α is a root then so is −α, (ii) a non-zero root is non-degenerate (i.e. the eigenspace is of dimension 1), (iii) if α is a root and t ∈ C, tα is not a root, except for t = ±1. Let {Hi} be a basis of the Cartan subalgebra. Then {Hi, Eα} form a basis of the Lie algebra G, on which the Killing form has a very simple structure, namely:

(Hi, Eα) = 0,   (Eα, Eβ) = 0 if α + β ≠ 0   (16.4)
This is because (H, [H′, Eα]) = α(H′)(H, Eα) = ([H, H′], Eα) = 0, and ([H, Eα], Eβ) = α(H)(Eα, Eβ) = −([H, Eβ], Eα) = −β(H)(Eα, Eβ). As a consequence, the restriction of the Killing form to the Cartan subalgebra is non-degenerate, otherwise the Killing form would be degenerate on the Lie algebra. It is convenient to introduce the isomorphism between H and its dual H∗ induced by the Killing form:

α ∈ H∗ → Hα ∈ H,   with α(H) = (Hα, H),   ∀H ∈ H

This defines, for any α ∈ H∗, an element Hα ∈ H depending linearly on α. We may then define a bilinear form on H∗ by:

(α, β) = (Hα, Hβ) = α(Hβ),   α, β ∈ H∗
This form is non-degenerate because the Killing form is non-degenerate on the Cartan subalgebra. Moreover, the Hα for α ∈ ∆ span the Cartan subalgebra. In the basis {Hi, Eα} the Lie bracket reads:

[Hi, Hj] = 0
[Hi, Eα] = α(Hi) Eα
[Eα, Eβ] = (Eα, E−α) Hα   if α + β = 0   (16.5)
[Eα, Eβ] = Cα,β Eα+β   if α + β ∈ ∆
[Eα, Eβ] = 0   if α + β ∉ ∆

with Cα,β some structure constants. Here we remark that [Eα, Eβ] either vanishes or is proportional to Eα+β when α + β ≠ 0, because [H, [Eα, Eβ]] = [[H, Eα], Eβ] + [Eα, [H, Eβ]] = (α(H) + β(H))[Eα, Eβ]. If,
however, α + β = 0, this shows that [Eα, E−α] is in the Cartan subalgebra, and we have (H, [Eα, E−α]) = ([H, Eα], E−α) = α(H)(Eα, E−α) = (H, (Eα, E−α)Hα). For each root α the three generators Hα, Eα, E−α form an sl(2) subalgebra of G. This allows us to study the α-chain through β, that is the set of roots of the form β + nα, using the commutation relations:

[Hα, E±α] = ±α(Hα) E±α,   [Eα, E−α] = (Eα, E−α) Hα

The vectors (ad E±α)^j Eβ for j ∈ N are obviously linearly independent root vectors in G, for the roots β ± jα, if they do not vanish. They span a representation space for the considered sl(2) and, since this representation is of finite dimension, the chain must be of finite length. Let p ≤ 0 be the minimal index such that β + pα is a root, and q ≥ 0 be the maximal index such that β + qα is a root. Let β′ = β + pα and consider the vectors vj = (ad Eα)^j Eβ′ for j ∈ N. By the minimality of p, we have ad E−α v0 = 0. Using this property, we can compute:

ad Hα vj = (β′(Hα) + jα(Hα)) vj
ad E−α vj = −j(Eα, E−α) (β′(Hα) + (j − 1)α(Hα)/2) vj−1   (16.6)

Since vq−p+1 vanishes, but vq−p does not, we have β′(Hα) + (q − p)α(Hα)/2 = 0, or:

2(β, α)/(α, α) = −(p + q) ∈ Z   (16.7)

This result allows us to show that the Killing form induces a positive definite scalar product on the real vector space Σ_{α∈∆} RHα. By duality this defines a positive definite scalar product on Σ_{α∈∆} Rα. Indeed, computing the Killing form on the basis of G provided by the Hi and the E±α, we have

(α, β) = (Hα, Hβ) = Tr(ad Hα ad Hβ) = Σ_γ (α, γ)(β, γ)

Taking α = β and dividing by (α, α)², we get:

1/(α, α) = Σ_γ ((α, γ)/(α, α))²

so that (α, α) is a rational number. It follows that (α, β) is a rational number, hence real. Then for any x = Σ_α xα α with xα ∈ R we have (x, x) = Σ_γ (x, γ)² ≥ 0, and this vanishes only if x = 0.
For later use let us write another consequence of eq. (16.6):

[E−α, [Eα, Eβ]] = (Eα, E−α) q(1 − p) ((α, α)/2) Eβ   (16.8)

Both members are homogeneous in the normalizations of E±α and Eβ, so one can replace Eβ = v−p with the notations of eq. (16.6), and then use eq. (16.7). With any root α one can associate a reflection wα acting on H∗ by:

wα(x) = x − 2 ((α, x)/(α, α)) α

These orthogonal reflections are called Weyl reflections. They preserve the root system, i.e. if β is a root so is wα(β). This is because, using eq. (16.7), wα(β) = β + (p + q)α is in the chain β + pα, . . . , β + qα. The Weyl group is the discrete group generated by these reflections. While roots span H∗, they are not linearly independent in general, and one can choose a subset of them which forms a basis. There exists a subset Π of roots αi, i = 1, . . . , r, such that any other root α can be written α = Σ_i ni αi, where the ni are integers all of the same sign. When all ni are ≥ 0 we say that α is a positive root, and otherwise α is called a negative root. The αi are called simple roots. So they are positive roots which cannot be written as the sum of two positive roots. The choice of Π is not unique, but any two such choices are related by a unique transformation of the Weyl group. To show the existence of a basis of simple roots, choose a hyperplane in H∗ which does not contain any root. Half of the roots are then on one side of this hyperplane, and we call them positive roots. If a positive root can be written as a sum of two positive roots we call it decomposable, otherwise we call it simple. Obviously, any positive root can be written as a linear combination of simple roots with positive integer coefficients. In particular the simple roots span H∗. We show that they are linearly independent. Note that for two simple roots α and β their difference (β − α) is not a root, because if (β − α) were a positive root we could decompose β = (β − α) + α as a sum of positive roots, in contradiction with the simplicity of β, while if (α − β) were positive, we would similarly get a contradiction with the simplicity of α. Hence the simple root condition means that p = 0 in eq. (16.7), so that:

−2(α, β)/(α, α) = n ∈ N

In particular, the scalar product (α, β) is non-positive for α, β simple roots. The α-chain through β consists of the roots of the form β + nα
for n = 0, 1, . . . , −2(α, β)/(α, α). Assume now that there is a linear relation between the simple roots, which we can write in the form Σ_s rs αs = Σ_{s′} r_{s′} α_{s′}, with the rs and r_{s′} real and positive, and the set {s} disjoint from the set {s′}. From this equality one gets (Σ_s rs αs)² = Σ_{s,s′} rs r_{s′} (αs, α_{s′}). The left-hand side is obviously non-negative, while the right-hand side is non-positive because s ≠ s′, so (αs, α_{s′}) ≤ 0. It follows that rs = r_{s′} = 0, so the simple roots are linearly independent. At the extreme opposite of the simple roots are highest roots. They are of the form θ = Σ_i ni αi, where the ni are maximal ≥ 0 integers. For a simple Lie algebra one can show that the highest root is unique, and that all ni > 0. Let αi be a set of simple roots. One defines the Cartan matrix, which is independent of the choice of basis (since two bases are related by the Weyl group), by:

aij = 2(αj, αi)/(αi, αi),   i, j = 1, . . . , rank G   (16.9)
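As an illustration (a hypothetical example, not from the text), the Cartan matrix of A2 can be computed from an explicit planar realization of the simple roots, and one can check that the Weyl reflections preserve the root system:

```python
import math

# Simple roots of A2 = sl(3) realized in the plane, normalized so (alpha, alpha) = 2
a1 = (math.sqrt(2), 0.0)
a2 = (-1/math.sqrt(2), math.sqrt(3/2))
def dot(u, v): return u[0]*v[0] + u[1]*v[1]

simple = [a1, a2]
A = [[round(2*dot(aj, ai)/dot(ai, ai)) for aj in simple] for ai in simple]
assert A == [[2, -1], [-1, 2]]          # Cartan matrix of A2, eq. (16.9)

# the Weyl reflections w_alpha preserve the root system
roots = [a1, a2, (a1[0] + a2[0], a1[1] + a2[1])]
roots += [(-r[0], -r[1]) for r in roots]
def reflect(a, x):                      # w_alpha(x) = x - 2 (alpha, x)/(alpha, alpha) alpha
    c = 2*dot(a, x)/dot(a, a)
    return (x[0] - c*a[0], x[1] - c*a[1])
for a in roots:
    for x in roots:
        y = reflect(a, x)
        assert any(abs(y[0]-r[0]) < 1e-9 and abs(y[1]-r[1]) < 1e-9 for r in roots)
```

Here the highest root is θ = α1 + α2, with all ni = 1 > 0, as stated above for a simple algebra.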
It is such that aii = 2 and, for i ≠ j, aij ≤ 0 with aij = 0 ⇒ aji = 0. Moreover, the aij for i ≠ j are non-positive integers such that 0 ≤ aij aji ≤ 4. This last condition comes from the fact that the scalar product is positive definite on H∗. The Cartan matrix is non-degenerate: det(a) ≠ 0, because det(a) is proportional to the determinant of the matrix of scalar products of the simple roots, which are linearly independent. With the Cartan matrix, we can give a presentation of the Lie algebra G by generators and relations. For each simple root αi the elements:

h_i = (2/(αi, αi)) Hαi,   e_i^+ = Eαi,   e_i^− = (1/(Eαi, E−αi)) E−αi

generate an sl(2) subalgebra with standard commutation relations. The Cartan matrix allows us to reconstruct the Lie algebra from this set of sl(2) subalgebras. Given a Cartan matrix aij satisfying the properties mentioned above, we may define G as the Lie algebra generated by the sets (h_i, e_i^+, e_i^−) with the relations (called the Serre relations):

[h_i, h_j] = 0
[h_i, e_j^±] = ±aij e_j^±
[e_i^+, e_j^−] = δij h_i
(ad e_i^±)^{1−aij} · e_j^± = 0   for i ≠ j   (16.10)
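The Serre relations can be checked explicitly in the defining representation of sl(3), where e1 = E12, e2 = E23 and a12 = a21 = −1; the following sketch (an illustration, not part of the text) verifies them with bare 3 × 3 matrices:

```python
# The Serre relations (16.10) checked in sl(3), with e1 = E12, e2 = E23,
# f_i = e_i^-, h1 = diag(1,-1,0), h2 = diag(0,1,-1) and a12 = a21 = -1.
def mat(entries):                  # 3x3 matrix from a dict {(i, j): value}
    return [[entries.get((i, j), 0) for j in range(3)] for i in range(3)]

def mul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def brk(a, b):                     # commutator [a, b]
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

e1, e2 = mat({(0, 1): 1}), mat({(1, 2): 1})
f1, f2 = mat({(1, 0): 1}), mat({(2, 1): 1})
h1, h2 = mat({(0, 0): 1, (1, 1): -1}), mat({(1, 1): 1, (2, 2): -1})

zero = mat({})
assert brk(h1, h2) == zero                                  # [h_i, h_j] = 0
assert brk(e1, f1) == h1 and brk(e2, f2) == h2              # [e_i^+, e_i^-] = h_i
assert brk(e1, f2) == zero and brk(e2, f1) == zero          # [e_i^+, e_j^-] = 0, i != j
a12 = -1
assert brk(h1, e2) == [[a12*x for x in row] for row in e2]  # [h_i, e_j^+] = a_ij e_j^+
assert brk(e1, brk(e1, e2)) == zero                         # (ad e1)^{1-a12} e2 = 0
```

Note that [e1, e2] = E13 does not vanish, in accordance with the chain α2, α1 + α2 having length −a12 = 1.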
The last condition is just the condition that the $\alpha_i$-chain starting at $\alpha_j$ is of length $-a_{ij}$. The elements $h_i$ generate the Cartan subalgebra $\mathcal{H}$. The fact that these relations yield a finite-dimensional Lie algebra is a theorem of J.-P. Serre. Let $\mathcal{N}_\pm$ be the subalgebra generated by the $e_i^\pm$. We have:
$$\mathcal{G} = \mathcal{N}_- \oplus \mathcal{H} \oplus \mathcal{N}_+$$
The subalgebras $\mathcal{B}_\pm = \mathcal{H} \oplus \mathcal{N}_\pm$ are called Borel subalgebras. The classification of finite-dimensional simple Lie algebras is then reduced to the classification of Cartan matrices satisfying the properties above. This leads to four infinite series, $A_n = sl(n+1)$, $B_n = so(2n+1)$, $C_n = sp(2n)$, $D_n = so(2n)$, and a few exceptional algebras called $E_6$, $E_7$, $E_8$, $F_4$ and $G_2$ (see the References). A consequence of Serre's theorem is the existence of an involutive automorphism $\omega$ of the Lie algebra $\mathcal{G}$, called the Chevalley automorphism. To define it we give its action on the generators:
$$\omega(h_i) = -h_i, \qquad \omega(e_i^+) = -e_i^-, \qquad \omega(e_i^-) = -e_i^+$$
and check that it preserves the relations. Hence it extends to the whole Lie algebra. For any root $\alpha$ of the Lie algebra one can choose $E_{-\alpha} = -\omega(E_\alpha)$. Changing $E_\alpha \to \lambda E_\alpha$, we have $(E_\alpha, E_{-\alpha}) \to \lambda^2 (E_\alpha, E_{-\alpha})$, so that we can always achieve the condition $(E_\alpha, E_{-\alpha}) = 1$. Notice that in general $\lambda$ will be a complex number.

16.3 Linear representations

Recall that a linear representation of a finite-dimensional Lie algebra $\mathcal{G}$ on a vector space $V$ is a homomorphism $\rho$ from $\mathcal{G}$ to $\mathrm{End}\, V$. We can define the sum and the product of two representations $(\rho_1, V_1)$ and $(\rho_2, V_2)$. The sum is the representation on the direct sum $V_1 \oplus V_2$ such that $\rho_{V_1 \oplus V_2}$ maps elements of $\mathcal{G}$ into block-diagonal endomorphisms whose restrictions to $V_{1,2}$ coincide with their images under $\rho_{1,2}$; in other words, $\rho_{V_1 \oplus V_2}$ is block diagonal. The product is the representation on the tensor product $V_1 \otimes V_2$ with $\rho_{V_1 \otimes V_2}(X) = \rho_1(X) \otimes 1 + 1 \otimes \rho_2(X)$ for any element $X \in \mathcal{G}$. A representation on a vector space $V$ is said to be indecomposable if it cannot be decomposed into the sum of subrepresentations. Note that for a general algebra $A$ the sum of two representations is a representation, but the tensor product is not. The Lie algebra case appears as very special, and this is due to the existence of the algebra homomorphism, called the coproduct:
$$\Delta : U(\mathcal{G}) \to U(\mathcal{G}) \otimes U(\mathcal{G}), \qquad \Delta(X) = X \otimes 1 + 1 \otimes X \qquad (16.11)$$
The elements of the Cartan subalgebra are represented by a family of commuting diagonalizable endomorphisms. They can thus be simultaneously diagonalized. Let $|\lambda\rangle$ be an eigenvector and $\lambda(H)$ the corresponding eigenvalues, which depend linearly on $H$. The various $\lambda$ are linear forms on the Cartan subalgebra $\mathcal{H}$, i.e. $\lambda \in \mathcal{H}^*$, and are called weights. The weights may have multiplicities, so we denote by $|\lambda\rangle_a$ the weight vectors with the same weight $\lambda$:
$$H|\lambda\rangle_a = \lambda(H)\,|\lambda\rangle_a$$
The weight vectors $|\lambda\rangle_a$ form a basis of the representation space $V$. Their number, degeneracy included, is the dimension of the representation. The representation space $V$ contains representations of the $sl(2)$ subalgebras generated by $(H_\alpha, E_\alpha, E_{-\alpha})$ for any root $\alpha$. From the knowledge of finite-dimensional representations of $sl(2)$ we get the basic integrality condition:
$$2\,\frac{(\lambda,\alpha)}{(\alpha,\alpha)} \in \mathbb{Z} \quad \text{for all } \alpha \in \Delta$$
Note that the weight system of a representation is invariant under the action of the Weyl group: if $\lambda$ is a weight, so is $w_\alpha(\lambda)$ for any root $\alpha$. The difference of two weights of a given irreducible representation always belongs to the root lattice. We may thus introduce an order between weights of a representation by $\lambda_1 > \lambda_2$ iff $\lambda_1 - \lambda_2 > 0$. Any finite-dimensional representation possesses a highest weight, since its number of weights is finite. For an irreducible representation the highest weight $\Lambda$ is unique and non-degenerate. The corresponding eigenvector $|\Lambda\rangle$ is called the highest weight vector of the representation. It is such that:
$$H|\Lambda\rangle = \Lambda(H)\,|\Lambda\rangle, \qquad E_{\alpha_i}|\Lambda\rangle = 0$$
for $H \in \mathcal{H}$ and $\alpha_i$ any simple root. This follows from the fact that, since $\Lambda$ is a highest weight, $\Lambda + \alpha_i$ is not a weight. Given a representation with highest weight $\Lambda$, one defines its Dynkin indices $\delta_i$ by:
$$\delta_i = 2\,\frac{(\Lambda,\alpha_i)}{(\alpha_i,\alpha_i)} \in \mathbb{N}$$
with $\alpha_i$ the simple roots. The proof that this is a non-negative integer is the same as in the case of the adjoint representation. By definition, the fundamental weights $\Lambda_j$ are the highest weights with Dynkin indices $\delta_{ij}$. They are specified by:
$$2\,\frac{(\Lambda_j,\alpha_i)}{(\alpha_i,\alpha_i)} = \delta_{ij}$$
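As an illustration (ours, not the book's), the Dynkin indices of the adjoint representation can be read off from the Cartan matrix: in the basis of simple roots, the pairing $2(\alpha_j,\alpha_i)/(\alpha_i,\alpha_i)$ is the Cartan matrix itself, and the highest weight of the adjoint representation is the highest root.

```python
import numpy as np

# A sketch (not from the book): Dynkin indices delta_i = 2(Lambda, alpha_i)/(alpha_i, alpha_i)
# for the adjoint representation of sl(3), whose highest weight is the
# highest root theta = alpha_1 + alpha_2.
a = np.array([[2, -1], [-1, 2]])      # Cartan matrix of sl(3)
theta = np.array([1, 1])              # theta in the basis of simple roots
delta = theta @ a                     # Dynkin indices of the adjoint rep
assert (delta == np.array([1, 1])).all()
print("adjoint representation of sl(3) has Dynkin indices", delta)
```

Dynkin indices $(1,1)$ mean the adjoint of $sl(3)$ has highest weight $\Lambda_1 + \Lambda_2$, the sum of the two fundamental weights.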
The number of fundamental weights is equal to the rank of $\mathcal{G}$, and there exist representations whose highest weights are the fundamental weights (called fundamental representations). Any highest weight $\Lambda$ of a finite-dimensional representation may be decomposed on the fundamental weights: $\Lambda = \sum_j \delta_j \Lambda_j$, with $\delta_j$ the Dynkin indices. More generally, the weight lattice is the set of $\lambda \in \mathcal{H}^*$ such that $2(\lambda,\alpha)/(\alpha,\alpha) \in \mathbb{Z}$ for any root $\alpha$. Any weight of any representation is on the weight lattice. As a $\mathbb{Z}$-module, the weight lattice has a basis provided by the fundamental weights. The highest weight representations may also be viewed as quotients of Verma modules. The Verma module associated with a highest weight vector $|\Lambda\rangle$ is the space $U(\mathcal{N}_-)|\Lambda\rangle$, with $U(\mathcal{N}_-)$ the enveloping algebra of $\mathcal{N}_-$. Then the irreducible representation with highest weight $\Lambda$ is isomorphic to the quotient:
$$\left( U(\mathcal{N}_-)|\Lambda\rangle \right) / M_\Lambda$$
where $M_\Lambda$ is the maximal proper submodule of $U(\mathcal{N}_-)|\Lambda\rangle$, which is shown to exist and is unique (by maximality). The above quotient is finite-dimensional when $\Lambda$ belongs to the weight lattice. The Verma module construction shows that any weight of the weight lattice is conjugated by the Weyl group to the highest weight of some representation. The roots themselves generate a lattice called the root lattice. It is a sublattice of the weight lattice. On the universal enveloping algebra one can define a $\dagger$ operation such that $(XY)^\dagger = Y^\dagger X^\dagger$ and $(\lambda X)^\dagger = \bar{\lambda} X^\dagger$. In a Cartan–Weyl basis it reads $H^\dagger = H$, $E_\alpha^\dagger = E_{-\alpha}$, which is compatible with the commutation relations since $C_{-\alpha,-\beta} = -C_{\alpha,\beta}$, and $C_{\alpha,\beta}$ and the $\alpha(H)$ are real. In highest weight representations the $\dagger$ operation is just Hermitian conjugation and allows us to introduce complex conjugated representations. In particular, the state $\langle\Lambda|$, dual to the highest weight vector $|\Lambda\rangle$, satisfies:
$$\langle\Lambda|H = \Lambda(H)\,\langle\Lambda|, \qquad \langle\Lambda|E_{-\alpha} = 0 \ \text{ for } \alpha > 0$$
since $\langle\Lambda|E_{-\alpha} = (E_\alpha|\Lambda\rangle)^\dagger = 0$. The Casimir operator $C$ is the following operator, quadratic in the Lie algebra generators $(T_a)$ forming a basis of the Lie algebra, hence living in the universal enveloping algebra of $\mathcal{G}$:
$$C = \sum_{a,b} T_a\, K^{ab}\, T_b$$
with $K^{ab}$ the matrix inverse of the Killing form $K_{ab}$. Its main property is that it is in the centre of the enveloping algebra, so that, in any given representation, the Casimir operator commutes with the endomorphisms representing the elements of the Lie algebra. It thus acts proportionally to the identity on irreducible representations. If $\Lambda$ is the highest weight of the representation, its value is:
$$C(\Lambda) = (\Lambda, \Lambda + 2\rho)$$
with $\rho$ the Weyl vector, equal to the sum of the fundamental weights:
$$\rho = \sum_j \Lambda_j$$
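The scalar action of the Casimir can be tested numerically. The sketch below (ours, not the book's) builds the spin-1 representation of $sl(2)$, computes the Killing form in the adjoint representation, and checks that $C = \sum_{a,b} T_a K^{ab} T_b$ is proportional to the identity; in this normalization the eigenvalue on spin $j$ is $j(j+1)/2$, i.e. $1$ for $j = 1$.

```python
import numpy as np

# A sketch (not from the book): the quadratic Casimir C = sum_ab T_a K^{ab} T_b
# on the spin-1 representation of sl(2), with K the Killing form.
H  = np.diag([2., 0., -2.])
Ep = np.zeros((3, 3)); Ep[0, 1] = Ep[1, 2] = np.sqrt(2.)
Em = Ep.T
basis = [H, Ep, Em]
com = lambda x, y: x @ y - y @ x

def ad(x):
    # matrix of ad(x) in the basis (H, E+, E-); the coordinates of a matrix
    # c = aH + bE+ + dE- are read off its entries: a = c[0,0]/2,
    # b = c[0,1]/sqrt(2), d = c[1,0]/sqrt(2).
    cols = []
    for b in basis:
        c = com(x, b)
        cols.append([c[0, 0] / 2., c[0, 1] / np.sqrt(2.), c[1, 0] / np.sqrt(2.)])
    return np.array(cols).T

K = np.array([[np.trace(ad(a) @ ad(b)) for b in basis] for a in basis])
C = sum(np.linalg.inv(K)[a, b] * basis[a] @ basis[b]
        for a in range(3) for b in range(3))

# C should equal j(j+1)/2 * Id = 1 * Id for j = 1
assert np.allclose(C, np.eye(3))
print("Casimir acts as a scalar on the spin-1 representation")
```

The same computation with the spin-$j$ matrices reproduces $j(j+1)/2$, which matches $(\Lambda, \Lambda + 2\rho)$ with $(\alpha,\alpha) = 1/2$ in the Killing normalization of $sl(2)$.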
We frequently meet the tensor Casimir operator living in $\mathcal{G} \otimes \mathcal{G}$, given by $C_{12} = \sum_{a,b} K^{ab}\, T_a \otimes T_b$. Note that we have, using eq. (16.11):
$$C_{12} = \frac{1}{2}\left( \Delta C - C \otimes 1 - 1 \otimes C \right)$$
The main property of the tensor Casimir is that $[C_{12}, \Delta(X)] = 0$ for any $X \in \mathcal{G}$.

16.4 Real Lie algebras

Up to now we considered complex Lie algebras. Examples are provided by complexification of real Lie algebras. More precisely, let $\mathcal{G}$ be a real Lie algebra. This means that we have a basis $X_a$ of the Lie algebra such that the structure constants are real, and we consider linear combinations of the $X_a$ with real coefficients. Its complexification $\mathcal{G}^{\mathbb{C}}$ is the set of elements $Z = X + iY$ with $X, Y \in \mathcal{G}$. On $\mathcal{G}^{\mathbb{C}}$ we define a conjugation $c : X + iY \to X - iY$. We have:
$$c^2 = 1, \qquad [c(Z_1), c(Z_2)] = c([Z_1, Z_2]), \qquad c(\lambda Z) = \bar{\lambda}\, c(Z)$$
Conversely, given any such conjugation $c$, we can write $\mathcal{G}^{\mathbb{C}} = \mathcal{G}_+ \oplus \mathcal{G}_-$, where $c|_{\mathcal{G}_\pm} = \pm 1$, and $\mathcal{G}^{\mathbb{C}}$ can be viewed as the complexification of $\mathcal{G}_+$, which is a real Lie algebra. Different real Lie algebras may have the same complexification. For example, $sl(2,\mathbb{C})$ is the common complexification of the two real Lie algebras $sl(2,\mathbb{R})$ and $su(2)$. The algebra $sl(2,\mathbb{R})$ is the Lie algebra of $2 \times 2$ traceless real matrices, with basis:
$$E_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad E_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \qquad H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
and commutation relations $[H, E_\pm] = \pm 2E_\pm$ and $[E_+, E_-] = H$. On the other hand, $su(2)$ is the Lie algebra of antihermitian traceless $2 \times 2$ matrices, i.e. linear combinations with real coefficients of the matrices $t_k = i\sigma_k$, where the $\sigma_k$ are the Pauli matrices:
$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
The commutation relations are $[t_i, t_j] = -2\epsilon_{ijk} t_k$. We see that the structure constants of $su(2)$ are real. The Lie algebras $sl(2,\mathbb{R})$ and $su(2)$ are referred to as the non-compact and compact real forms of $sl(2,\mathbb{C})$, respectively. Notice that although the algebra $su(2)$ is real, the matrices representing it have complex entries. It is an important problem to classify the real forms of a given complex Lie algebra. This amounts to classifying the conjugations $c$. For this purpose, note that one can build a basis of the Lie algebra $\mathcal{G}^{\mathbb{C}}$ in which all the structure constants are real, as follows. Choose the basis $E_{\pm\alpha}$, $H_\alpha$ such that $\omega(E_\alpha) = -E_{-\alpha}$ (where $\omega$ is the Chevalley automorphism) and $(E_\alpha, E_{-\alpha}) = 1$. Then $[E_\alpha, E_{-\alpha}] = H_\alpha$ and $[H_\alpha, E_{\pm\beta}] = \pm\beta(H_\alpha) E_{\pm\beta}$, where the $\beta(H_\alpha)$ are real. Setting $[E_\alpha, E_\beta] = C_{\alpha,\beta} E_{\alpha+\beta}$ and applying $\omega$, we get the relation:
$$C_{-\alpha,-\beta} = -C_{\alpha,\beta} \qquad (16.12)$$
To show that these structure constants are real, we compute:
$$[[E_\alpha, E_\beta], [E_{-\alpha}, E_{-\beta}]] = -C_{\alpha,\beta}^2\, H_{\alpha+\beta} = -C_{\alpha,\beta}^2\, (H_\alpha + H_\beta)$$
On the other hand, using the Jacobi identity and eq. (16.8), we get:
$$[[E_\alpha, E_\beta], [E_{-\alpha}, E_{-\beta}]] = [E_{-\beta}, [E_{-\alpha}, [E_\alpha, E_\beta]]] + [E_{-\alpha}, [E_{-\beta}, [E_\beta, E_\alpha]]] = -q'(1-p')\,\frac{(\beta,\beta)}{2}\, H_\alpha - q(1-p)\,\frac{(\alpha,\alpha)}{2}\, H_\beta$$
Here $p$ and $q$ refer to the $\alpha$-chain through $\beta$, and $p'$, $q'$ to the $\beta$-chain through $\alpha$. If $C_{\alpha,\beta} \neq 0$, i.e. $\alpha + \beta$ is a root, $H_\alpha$ and $H_\beta$ are linearly independent, so identifying the coefficients we get:
$$C_{\alpha,\beta}^2 = q(1-p)\,\frac{(\alpha,\alpha)}{2}$$
and a similar formula with $p'$ and $q'$. Recalling that $p \leq 0$ and $(\alpha,\alpha) > 0$, we see that $C_{\alpha,\beta}$ is real. The basis of the Lie algebra that we have constructed is called a Weyl basis.
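The reality of the $su(2)$ structure constants quoted above is easy to confirm directly. A sketch (ours, not the book's):

```python
import numpy as np

# A sketch (not from the book): verifying [t_i, t_j] = -2 eps_{ijk} t_k for
# t_k = i * sigma_k, so the structure constants of su(2) are real.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
t = [1j * s for s in (s1, s2, s3)]

def eps(i, j, k):        # totally antisymmetric symbol, eps_{123} = 1
    return {(0,1,2): 1, (1,2,0): 1, (2,0,1): 1,
            (2,1,0): -1, (0,2,1): -1, (1,0,2): -1}.get((i, j, k), 0)

for i in range(3):
    for j in range(3):
        lhs = t[i] @ t[j] - t[j] @ t[i]
        rhs = sum(-2 * eps(i, j, k) * t[k] for k in range(3))
        assert np.allclose(lhs, rhs)
print("su(2) structure constants are real")
```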
The real Lie algebra $\mathcal{G}$ spanned over $\mathbb{R}$ by the $E_{\pm\alpha}$ and $H_\alpha$ is the analogue of the non-compact $sl(2,\mathbb{R})$ in the general case. We obviously have $\mathcal{G}^{\mathbb{C}} = \mathcal{G} + i\mathcal{G}$, and the real form is selected by the conjugation $c'(X + iY) = X - iY$. It is a theorem of H. Weyl that for any semi-simple complex Lie algebra there exists a real form which is the Lie algebra of a compact Lie group. The Lie algebra of a semi-simple compact Lie group is characterized by the fact that its Killing form is negative definite. Indeed, if $G$ is a compact Lie group, choose any positive definite scalar product on $\mathcal{G}$, its Lie algebra. Since $G$ is compact, one can use the Haar integral on $G$ to take the average of this bilinear form and obtain a positive definite invariant scalar product on $\mathcal{G}$. This means that the Lie group $G$ acts by orthogonal matrices in the adjoint representation (for this scalar product). Hence, in an orthonormal basis the matrices $\mathrm{ad}_X$ are antisymmetric, and the Killing form $(X,X) = -\sum_{ij} (\mathrm{ad}_X)^2_{ij} \leq 0$ vanishes only when $\mathrm{ad}_X = 0$, i.e. when $X$ is in the centre of $\mathcal{G}$. We see that $\mathcal{G}$ decomposes as the orthogonal sum of its centre and a semi-simple algebra $[\mathcal{G},\mathcal{G}]$ on which the Killing form is negative definite. Conversely, starting from the Weyl basis, one can construct the compact form as follows: consider the generators
$$X_\alpha = E_\alpha - E_{-\alpha}, \qquad Y_\alpha = i(E_\alpha + E_{-\alpha}), \qquad Z_\alpha = iH_\alpha \qquad (16.13)$$
The real vector space spanned by these elements is a real Lie algebra, thanks to eq. (16.12). Moreover, it has a negative definite Killing form. To show this, recall the orthogonality relations, eq. (16.4); it is then sufficient to look at each subspace $(X_\alpha, Y_\alpha, Z_\alpha)$ independently, where the check is simple, thereby proving the Weyl theorem. For instance, $(X_\alpha, X_\alpha) = (E_\alpha - E_{-\alpha}, E_\alpha - E_{-\alpha}) = -2(E_\alpha, E_{-\alpha}) = -2$ and $(X_\alpha, Y_\alpha) = (E_\alpha - E_{-\alpha}, i(E_\alpha + E_{-\alpha})) = 0$, and so on. This compact real form corresponds to the conjugation:
$$c(E_{\pm\alpha}) = -E_{\mp\alpha}, \qquad c(H_\alpha) = -H_\alpha \qquad (16.14)$$
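For $sl(2)$ the whole compact-form construction can be verified in a few lines. A sketch (ours, not the book's): the generators (16.13) have real structure constants, and the trace form, which is proportional to the Killing form, is negative definite on their real span.

```python
import numpy as np

# A sketch (not from the book): for sl(2), the generators (16.13) span a real
# Lie algebra (isomorphic to su(2)) with a negative definite invariant form.
Ep = np.array([[0, 1], [0, 0]], dtype=complex)
Em = np.array([[0, 0], [1, 0]], dtype=complex)
H  = np.diag([1.0, -1.0]).astype(complex)
X, Y, Z = Ep - Em, 1j * (Ep + Em), 1j * H
com = lambda a, b: a @ b - b @ a

# real structure constants: [X,Y] = 2Z, [Y,Z] = 2X, [Z,X] = 2Y
assert np.allclose(com(X, Y), 2 * Z)
assert np.allclose(com(Y, Z), 2 * X)
assert np.allclose(com(Z, X), 2 * Y)

# Gram matrix of the trace form is -2 * Id: negative definite
G = np.array([[np.trace(A @ B).real for B in (X, Y, Z)] for A in (X, Y, Z)])
assert np.allclose(G, -2 * np.eye(3))
print("compact real form of sl(2) verified")
```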
This conjugation selects the analogue of $su(2)$ in the $sl(2)$ case. In general, the representations of a (real) compact Lie group $G$ may be complex. Choosing on the representation space $V$ an arbitrary sesquilinear form and averaging it over the group $G$, one gets an invariant sesquilinear form, i.e. $\langle gv, gw \rangle = \langle v, w \rangle$. Hence all elements of $G$ are represented by unitary matrices, and elements of $\mathcal{G}$ by antihermitian matrices. The generators of eq. (16.13) are such that $X_\alpha^\dagger = -X_\alpha$, $Y_\alpha^\dagger = -Y_\alpha$ and $Z_\alpha^\dagger = -Z_\alpha$. This also reads $E_\alpha^\dagger = E_{-\alpha}$, $H_\alpha^\dagger = H_\alpha$, or, more abstractly, $X^\dagger = -c(X)$ for any $X$ in the complexified Lie algebra.
In particular, any maximal Abelian subalgebra of $\mathcal{G}$ is a Cartan subalgebra, because antihermitian matrices are always diagonalizable, with purely imaginary eigenvalues. Its image by the exponential map is the Weyl torus. One can choose a basis $H_j$ of the Cartan algebra such that any element of the torus is of the form $h = \exp(\sum_j \theta_j H_j)$ with $\exp(2\pi H_j) = 1$ for all $j$. If $|\lambda\rangle$ is a weight vector in $V$, we have $h|\lambda\rangle = \chi_\lambda(h)|\lambda\rangle$, where $\chi_\lambda(h)$ is a character, i.e. $\chi_\lambda(hh') = \chi_\lambda(h)\chi_\lambda(h')$. So we have $\chi_\lambda(h) = \exp(\sum_j \theta_j \lambda(H_j))$. The condition $\exp(2\pi H_j) = 1$ gives $\lambda(H_j) \in i\mathbb{Z}$ for all $j$. This defines a lattice in $\mathcal{H}^*$ called the weight lattice of the group $G$, which is a sublattice of the weight lattice of the Lie algebra. Moreover, since the adjoint representation of $G$ is well-defined, the root lattice is a sublattice of the weight lattice of $G$. In general the weight lattice of $G$ is a sublattice of the weight lattice of $\mathcal{G}$, and is equal to it only when $G$ is simply connected. This is because any positive weight of the weight lattice of $\mathcal{G}$ is the highest weight of some representation of $\mathcal{G}$, which can then be lifted to $G$. Hence, for a compact semi-simple simply connected Lie group, the weight lattices of the group and its Lie algebra are the same. Note that $2\lambda(H_\alpha)/(\alpha,\alpha) \in \mathbb{Z}$ for any weight $\lambda$ of $\mathcal{G}$, so in this case the elements $H_j$ such that $\exp(2\pi H_j) = 1$ are of the form $H_j = 2iH_{\alpha_j}/(\alpha_j,\alpha_j)$, where the $\alpha_j$ are the simple roots. When $G$ is a semi-simple compact connected Lie group, its centre is a finite group contained in all maximal tori of $G$. Assuming that the centre of $G$ is trivial, its weight lattice is equal to the root lattice, because in this case the adjoint representation is a faithful representation of $G$ and so generates all representations of $G$ by taking tensor products. It follows that the root lattice generates the weight lattice. This allows us to describe the various compact Lie groups $G$ with Lie algebra $\mathcal{G}$.
Starting from the universal cover of any one of them, which we call $\tilde{G}$ (and which can be shown to be compact), with centre $Z$, the other compact Lie groups are of the form $\tilde{G}/D$, where $D$ is any subgroup of the discrete Abelian group $Z$. They have centre $Z/D$, isomorphic to the quotient of their weight lattice by the root lattice. Moreover, their first homotopy group is isomorphic to the quotient of the weight lattice of $\mathcal{G}$ by their own weight lattice. We see that global topological properties of compact Lie groups are remarkably encoded in the structure of their tangent space at the unit element. The classification of real forms of complex Lie algebras is also the basis of the study of symmetric spaces. We have obtained two conjugations $c'$ and $c$, which select the non-compact and compact real forms of $\mathcal{G}^{\mathbb{C}}$. Note that $c'$ commutes with the conjugation $c$ defined in eq. (16.14). Hence we
can diagonalize $c'$ in the eigenspaces of $c$, and conversely. This yields the decompositions, called Cartan decompositions, of the real Lie algebras $\mathcal{G}$ and $\mathcal{G}'$:
$$\mathcal{G} = \mathfrak{t} \oplus \mathfrak{p}, \qquad \mathcal{G}' = \mathfrak{t} \oplus i\mathfrak{p}$$
The Lie algebra $\mathfrak{t} = \mathcal{G} \cap \mathcal{G}'$ is generated by the $(E_\alpha - E_{-\alpha})$, and $\mathfrak{p}$ is spanned by the $i(E_\alpha + E_{-\alpha})$ and the $iH_\alpha$. Moreover, we have the relations:
$$[\mathfrak{t},\mathfrak{t}] \subset \mathfrak{t}, \qquad [\mathfrak{t},\mathfrak{p}] \subset \mathfrak{p}, \qquad [\mathfrak{p},\mathfrak{p}] \subset \mathfrak{t}$$
and similarly with $\mathfrak{p} \to i\mathfrak{p}$. For example, in the case of $sl(n)$ this corresponds to the decomposition into symmetric and antisymmetric matrices. At the Lie group level, with the algebra $\mathfrak{t}$ corresponds a compact group $K$, and with the Lie algebras $\mathcal{G}$ and $\mathcal{G}'$ correspond appropriate Lie groups $G$ and $G'$, respectively compact and non-compact. One gets symmetric spaces $G/K$ and $G'/K$ of the compact and non-compact type respectively. This is the situation we have encountered in Chapter 7. Many more conjugations exist, but we will not enter into this subject.

16.5 Affine Kac–Moody algebras

We start from a finite-dimensional simple Lie algebra $\mathcal{G}$ and construct the loop algebra, which consists of formal Laurent polynomials, $\tilde{\mathcal{G}} = \mathcal{G} \otimes \mathbb{C}[\lambda, \lambda^{-1}]$, with Lie bracket:
$$[X \otimes \lambda^n, Y \otimes \lambda^m] = [X,Y] \otimes \lambda^{n+m}$$
The affine Kac–Moody algebra $\hat{\mathcal{G}}$ is the central extension of the loop algebra $\tilde{\mathcal{G}}$ by a central element denoted by $K$ (this means that the formal element $K$ commutes with all other elements). It is convenient to further extend this algebra by including the derivation $d = \lambda\partial_\lambda$. Thus, the affine Kac–Moody algebra is:
$$\hat{\mathcal{G}} = \tilde{\mathcal{G}} \oplus \mathbb{C}K \oplus \mathbb{C}d$$
and the Lie brackets are defined (with $(X,Y)$ the Killing form on $\mathcal{G}$) by:
$$[X \otimes \lambda^n, Y \otimes \lambda^m] = [X,Y] \otimes \lambda^{n+m} + \tfrac{1}{2}\, n\,\delta_{m+n,0}\,(X,Y)\,K, \qquad [d, X \otimes \lambda^n] = n\, X \otimes \lambda^n, \qquad [K, X \otimes \lambda^n] = [K,d] = 0 \qquad (16.15)$$
Note that the coefficient
$$\omega(X \otimes \lambda^n, Y \otimes \lambda^m) \equiv \tfrac{1}{2}\, n\,\delta_{m+n,0}\,(X,Y) \qquad (16.16)$$
of the central element $K$ satisfies the cocycle condition $\omega([X,Y],Z) + \omega([Z,X],Y) + \omega([Y,Z],X) = 0$, ensuring that the Jacobi identity is satisfied. An invariant bilinear form on the Kac–Moody algebra is given by:
$$(X \otimes \lambda^n, Y \otimes \lambda^m) = (X,Y)\,\delta_{n+m,0}, \qquad (K,K) = (d,d) = 0, \qquad (K,d) = 1$$
and $(K, X \otimes \lambda^n) = (d, X \otimes \lambda^n) = 0$. The fact that this form is invariant is easy to check by direct computation. Alternatively, denoting a general element of $\hat{\mathcal{G}}$ by $\hat{X} = \tilde{X}(\lambda) + X_K K + X_d d$, we have in the affine Kac–Moody algebra:
$$[\hat{X}, \hat{Y}] = [\tilde{X}, \tilde{Y}] + \oint \frac{d\lambda}{2i\pi}\, \frac{1}{2}\left( \partial_\lambda \tilde{X}(\lambda), \tilde{Y}(\lambda) \right) K, \qquad [K, \hat{X}] = [K, d] = 0, \qquad [d, \hat{X}] = \lambda \frac{d}{d\lambda} \tilde{X}(\lambda) \qquad (16.17)$$
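The cocycle condition can be tested numerically for the loop algebra of $sl(2)$. A sketch (ours, not the book's); we use the trace form $\mathrm{tr}(xy)$, which is proportional to the Killing form, which is all the identity requires.

```python
import numpy as np
from itertools import product

# A sketch (not from the book): checking
# w([X,Y],Z) + w([Z,X],Y) + w([Y,Z],X) = 0 for the cocycle (16.16),
# w(x l^a, y l^b) = (1/2) a delta_{a+b,0} (x, y), on the sl(2) loop algebra.
rng = np.random.default_rng(0)
def rand_sl2():
    m = rng.normal(size=(2, 2))
    return m - np.trace(m) / 2 * np.eye(2)

com = lambda x, y: x @ y - y @ x
def w(xa, yb):
    (x, a), (y, b) = xa, yb
    return 0.5 * a * (1 if a + b == 0 else 0) * np.trace(x @ y)

def bracket(xa, yb):          # loop-algebra bracket, no central term
    (x, a), (y, b) = xa, yb
    return (com(x, y), a + b)

for a, b, c in product(range(-2, 3), repeat=3):
    X, Y, Z = (rand_sl2(), a), (rand_sl2(), b), (rand_sl2(), c)
    s = w(bracket(X, Y), Z) + w(bracket(Z, X), Y) + w(bracket(Y, Z), X)
    assert abs(s) < 1e-12
print("cocycle condition verified")
```

The sum collapses because the delta forces $a + b + c = 0$ while the invariance of the trace form makes the three triple traces equal.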
It is worth noticing that affine Kac–Moody algebras are subalgebras of the Lie algebra $gl(\infty)$ introduced in Chapter 9. To see it, one has to associate an infinite-dimensional matrix with $\lambda^n X$, where $X$ is a $k \times k$ matrix. We represent $\lambda$ by the shift operator $S$ with matrix elements $S_{IJ} = \delta_{I+1,J}$ for $I,J \in \mathbb{Z}$, and $\lambda^n X$ by $X \otimes S^n$. In other words, one has
$$(\lambda^n X)_{i+kI,\, j+kJ} = X_{ij}\, \delta_{I+n,J}$$
The loop algebra structure is obviously preserved by this identification; moreover, one can check that the cocycles eq. (9.12) and eq. (16.16) also match. Hence $\widehat{gl}(k)$ is embedded into $gl(\infty)$ as the subalgebra of infinite matrices with period $k$ along the diagonal. Let $\alpha_i$ be the simple roots of the finite-dimensional simple Lie algebra $\mathcal{G}$ and $\theta$ its highest root. The affine Kac–Moody algebra $\mathcal{G} \otimes \mathbb{C}[\lambda, \lambda^{-1}] \oplus \mathbb{C}K$ is generated by the following elements:
$$(E_{\alpha_i},\; H_{\alpha_i},\; E_{-\alpha_i}), \quad i = 1, \ldots, \mathrm{rank}\,\mathcal{G}, \qquad (E_{-\theta} \otimes \lambda,\; K - H_\theta,\; E_\theta \otimes \lambda^{-1}) \qquad (16.18)$$
Each triplet forms an $sl(2)$ subalgebra. These triplets are associated with the simple roots of the affine Kac–Moody algebra. The derivation $d = \lambda\partial_\lambda$ is not in the algebra generated by these elements and has to be added by hand. The $\lambda$ dependence in this presentation corresponds to what is
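The shift-operator embedding of the loop algebra can be illustrated with a finite truncation. A sketch (ours, not the book's): for non-negative powers the truncation is exact, since Kronecker products multiply factor by factor.

```python
import numpy as np

# A sketch (not from the book): the loop-algebra embedding lambda^n X -> X (x) S^n
# with S the shift operator, truncated to an 8x8 shift matrix.  The bracket
# matches: [X (x) S^a, Y (x) S^b] = [X, Y] (x) S^{a+b}.
N = 8
S = np.eye(N, k=1)                    # shift: S[I, I+1] = 1
rng = np.random.default_rng(1)
X, Y = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
emb = lambda x, n: np.kron(np.linalg.matrix_power(S, n), x)
com = lambda a, b: a @ b - b @ a

for a in range(3):
    for b in range(3):
        assert np.allclose(com(emb(X, a), emb(Y, b)), emb(com(X, Y), a + b))
print("loop-algebra bracket preserved by the shift embedding")
```

The central term of course cannot be seen at finite size; it appears only in the regularized $gl(\infty)$ cocycle.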
called the homogeneous gradation. The gradation is defined by the degree in $\lambda$, which is counted by $d$. A slight modification of this construction allows us to define the twisted affine Kac–Moody algebras. Assume that $\mathcal{G}$ has an automorphism $\tau$ of order $N$, i.e. $\tau^N = 1$, and let $\zeta = e^{2i\pi/N}$. One extends $\tau$ to an automorphism $\hat{\tau}$ of the Kac–Moody algebra by setting:
$$\hat{\tau}(X \otimes \lambda^n) = \tau(X) \otimes (\zeta\lambda)^n, \qquad \hat{\tau}(K) = K, \qquad \hat{\tau}(d) = d \qquad (16.19)$$
Since $\hat{\tau}$ is an automorphism, the set of its fixed points is a Lie algebra, which is called the twisted affine Kac–Moody algebra associated with $\tau$ and denoted by $\hat{\mathcal{G}}_\tau$. If the automorphism $\tau$ is an inner automorphism, $\hat{\mathcal{G}}_\tau$ is isomorphic to the untwisted algebra $\hat{\mathcal{G}}$. If, however, $\tau$ is not an inner automorphism, one gets an essentially different algebra. It is known that this situation occurs only when $N = 2$ or $N = 3$, and only for particular simple Lie algebras. Let us illustrate the use of an inner automorphism to obtain the presentation of the affine Kac–Moody algebra in the principal gradation. Consider the Weyl vector $\rho = \sum_i \Lambda_i$, where the $\Lambda_i$ are the fundamental weights of $\mathcal{G}$, so that $(\rho, \alpha_i) = 1$ for any simple root $\alpha_i$ of $\mathcal{G}$. Moreover, if $\theta$ is the highest root of $\mathcal{G}$, we define the dual Coxeter number $h^*$ by $(\rho, \theta) = h^* - 1$. Let $\tau$ be the inner automorphism of $\mathcal{G}$:
$$\tau(X) = e^{-\frac{2i\pi}{h^*} H_\rho}\, X\, e^{\frac{2i\pi}{h^*} H_\rho}, \qquad \tau(H) = H, \qquad \tau(E_\alpha) = e^{-\frac{2i\pi}{h^*}(\rho,\alpha)}\, E_\alpha$$
and extend it to an automorphism $\hat{\tau}$ of the affine algebra as in eq. (16.19), with $\zeta = e^{\frac{2i\pi}{h^*}}$. The algebra of its fixed points is isomorphic to our affine algebra, and is linearly generated by the $H \otimes \lambda^{mh^*}$ and the $E_\alpha \otimes \lambda^{mh^* + (\rho,\alpha)}$. It follows that the elements of degree $\pm 1, 0$ are:
$$(E_{\alpha_i} \otimes \lambda,\; H_{\alpha_i},\; E_{-\alpha_i} \otimes \lambda^{-1}), \quad i = 1, \ldots, \mathrm{rank}\,\mathcal{G}, \qquad (E_{-\theta} \otimes \lambda,\; K - H_\theta,\; E_\theta \otimes \lambda^{-1}) \qquad (16.20)$$
These elements generate the whole fixed point algebra. This presentation differs from the presentation in eq. (16.18) by the way in which the degrees in $\lambda$ are distributed. Affine Kac–Moody algebras may also be presented by generators and relations using Cartan matrices and their associated sets of generators. Specifically, an affine Cartan matrix is a finite-dimensional matrix $a_{ij}$ such that $a_{ii} = 2$, the $a_{ij}$ for $i \neq j$ are non-positive integers with $a_{ij} = 0 \Rightarrow a_{ji} = 0$, and the dimension of its kernel is 1. Note that the only difference with the Cartan matrix of a semi-simple algebra is that its determinant vanishes. A classification of such Cartan matrices may be found in the References, where it is shown that irreducible ones yield exactly the
standard and twisted algebras constructed above, for any simple Lie algebra $\mathcal{G}$. In analogy to the finite-dimensional case, the affine Kac–Moody algebra with Cartan matrix $a_{ij}$ is defined as the Lie algebra generated by the elements $(e_i^+, e_i^-, h_i)$ with the Serre relations:
$$[h_i, h_j] = 0, \qquad [h_i, e_j^\pm] = \pm a_{ij}\, e_j^\pm, \qquad [e_i^+, e_j^-] = \delta_{ij}\, h_i, \qquad (\mathrm{ad}\, e_i^\pm)^{1-a_{ij}} \cdot e_j^\pm = 0 \ \text{ for } i \neq j \qquad (16.21)$$
One gets an infinite-dimensional algebra because $\det(a) = 0$. The elements $h_i$ generate the Cartan subalgebra $\hat{\mathcal{H}}$. Let $n_i$ be such that $\sum_i n_i a_{ij} = 0$. Since, by hypothesis, the kernel of $a_{ij}$ is one-dimensional, such coefficients are unique up to a multiplicative constant. It is usually convenient to normalize them such that $\sum_j n_j = h^*$, with $h^*$ the dual Coxeter number; the coefficients $n_j$ are then all non-negative integers. By construction, the element
$$K = \sum_i n_i h_i$$
is central. The derivation $d$ is not an element of the algebra generated by the $(e_i^+, e_i^-, h_i)$. It has to be added by hand. Its commutation relations depend on the gradation one chooses. For example, the principal gradation obtained in eq. (16.20) corresponds to the choice:
$$[d, h_i] = 0, \qquad [d, e_i^+] = e_i^+, \qquad [d, e_i^-] = -e_i^-$$
In particular, the rank of the (untwisted) affine Kac–Moody algebra $\hat{\mathcal{G}}$ is $(1 + \mathrm{rank}\,\mathcal{G})$ if one does not include the derivation in its definition, and $(2 + \mathrm{rank}\,\mathcal{G})$ if one does. As for finite-dimensional Lie algebras, one has the decomposition:
$$\hat{\mathcal{G}} = \hat{\mathcal{N}}_- \oplus \hat{\mathcal{H}} \oplus \hat{\mathcal{N}}_+$$
with $\hat{\mathcal{N}}_\pm$ the subalgebras generated by the $e_i^\pm$ and $\hat{\mathcal{H}}$ the Cartan subalgebra. One may also introduce roots, which are points in the dual $\hat{\mathcal{H}}^*$ of the Cartan subalgebra $\hat{\mathcal{H}}$, and systems of simple roots. However, in contrast to the finite-dimensional case, the number of roots is infinite, and roots may have multiplicities. Weights are elements of $\hat{\mathcal{H}}^*$. The fundamental weights $\Lambda_j$ are such that
$$\Lambda_j(h_i) = \delta_{ij} \qquad (16.22)$$
By definition, integrable highest weights $\Lambda$ are integer linear combinations of the fundamental weights: $\Lambda = \sum_j \delta_j \Lambda_j$, with $\delta_j$ integers. The coefficients $\delta_j$ are called Dynkin indices. Unitary highest weight representations may be defined as in the finite-dimensional case. Let $\Lambda$ be an integrable highest weight and $|\Lambda\rangle$ the corresponding highest weight vector. By definition, one assumes that
$$h_i|\Lambda\rangle = \Lambda(h_i)\,|\Lambda\rangle, \qquad e_j^+|\Lambda\rangle = 0$$
The highest weight representation $V(\Lambda)$, with highest weight $\Lambda$, is then defined as:
$$V(\Lambda) = \left( U(\hat{\mathcal{N}}_-)|\Lambda\rangle \right) / M_\Lambda$$
with $U(\hat{\mathcal{N}}_-)$ the universal enveloping algebra of $\hat{\mathcal{N}}_-$ and $M_\Lambda$ the maximal submodule of $U(\hat{\mathcal{N}}_-)|\Lambda\rangle$. More concretely, vectors in $V(\Lambda)$ are obtained by multiple action of the generators $e_i^-$ on the highest weight vector $|\Lambda\rangle$. Note that if $\Lambda = \sum_i \delta_i \Lambda_i$, then the central element $K = \sum_i n_i h_i$ acts on $V(\Lambda)$ as the $\mathbb{C}$-number $K = \sum_i n_i \delta_i$. This number is called the level of the representation. Note that the adjoint representation is not a highest weight representation. Let us present the affine Kac–Moody algebra $\widehat{sl}(2)$ in more detail. Let $E_+, E_-, H$ be the three generators of the Lie algebra $sl(2)$:
$$[H, E_\pm] = \pm 2E_\pm, \qquad [E_+, E_-] = H$$
We normalize the Killing form on $sl(2)$ by $(H,H) = 2$, $(E_+, E_-) = 1$. The loop algebra $\widetilde{sl}(2)$ is the Lie algebra of traceless $2 \times 2$ matrices whose entries are Laurent polynomials in $\lambda$: $\widetilde{sl}(2) = sl(2) \otimes \mathbb{C}[\lambda, \lambda^{-1}]$. The affine Lie algebra $\widehat{sl}(2)$ is the central extension of $\widetilde{sl}(2)$:
$$\widehat{sl}(2) = \widetilde{sl}(2) \oplus \mathbb{C}K \oplus \mathbb{C}d$$
with $K$ the central element and $d$ the derivation $d = \lambda\frac{\partial}{\partial\lambda}$. Let us write the decomposition $\widehat{sl}(2) = \hat{\mathcal{N}}_- \oplus \hat{\mathcal{H}} \oplus \hat{\mathcal{N}}_+$. First one can choose, as in eq. (16.18), the simple root vectors $E_{\alpha_1} = E_+$, $E_{\alpha_2} = \lambda E_-$. Together with the Cartan algebra generators $H, K, d$ and $E_{-\alpha_1} = E_-$, $E_{-\alpha_2} = \lambda^{-1} E_+$, they generate the whole algebra. The simple root vectors are of degree 0 and 1. This is called the homogeneous gradation. It is more convenient to define a gradation such that all simple root vectors have degree 1, the so-called principal gradation. To do that we choose simple root vectors $E_{\alpha_1} = \lambda E_+$, $E_{\alpha_2} = \lambda E_-$, and $E_{-\alpha_1} = \lambda^{-1} E_-$, $E_{-\alpha_2} = \lambda^{-1} E_+$. Together with the Cartan algebra generators $H, K, d$, they generate the algebra. The degree 0 elements are:
$$\hat{\mathcal{H}} = \{H, d, K\}$$
The positive degree elements ($n > 0$) are:
$$\hat{\mathcal{N}}_+ = \{ E_+^{(2n-1)} = E_+ \otimes \lambda^{2n-1},\; E_-^{(2n-1)} = E_- \otimes \lambda^{2n-1},\; H^{(2n)} = H \otimes \lambda^{2n} \}$$
and the negative degree ones ($n < 0$) are:
$$\hat{\mathcal{N}}_- = \{ E_+^{(2n+1)} = E_+ \otimes \lambda^{2n+1},\; E_-^{(2n+1)} = E_- \otimes \lambda^{2n+1},\; H^{(2n)} = H \otimes \lambda^{2n} \}$$
We can exhibit the isomorphism between the affine algebra $\widehat{sl}(2)$ in the homogeneous gradation and this presentation, which corresponds to the principal gradation. First, replace the parameter $\lambda$ by $\lambda^2$ in the homogeneous presentation. The simple root vectors of the homogeneous gradation are then $E_+$ and $\lambda^2 E_-$. Then perform a conjugation by $\exp(\log(\lambda) H/2)$. This conjugation sends $E_+$ to $\lambda E_+$ and $E_-$ to $\lambda^{-1} E_-$, and extends to an isomorphism of the two algebras. Note that $d \to d - \frac{1}{2}H$. In the principal gradation, the commutation relations read:
$$[H^{(r)}, H^{(s)}] = r\,\delta_{r+s,0}\, K, \qquad [H^{(r)}, E_\pm^{(s)}] = \pm 2 E_\pm^{(r+s)}, \qquad [E_+^{(r)}, E_-^{(s)}] = H^{(r+s)} + \tfrac{1}{2}\, r\,\delta_{r+s,0}\, K \qquad (16.23)$$
In the notation of eq. (16.21), we have $h_1 = H + \frac{1}{2}K$ and $h_2 = -H + \frac{1}{2}K$. Let $H^*, K^*, d^*$ be the dual basis of the basis $H, K, d$ of the Cartan algebra. With the root vectors $E_\pm^{(r)}$ are associated the roots $\pm 2H^* + r d^*$, and with the root vectors $H^{(r)}$ the roots $r d^*$. The root diagram of $\widehat{sl}(2)$ is drawn in Fig. 16.1. The affine $\widehat{sl}(2)$ algebra possesses two fundamental highest weights, denoted by $\Lambda^+$ and $\Lambda^-$. They are characterized by eq. (16.22). Expanding on the dual basis $H^*, K^*$, one gets $\Lambda^\pm = \pm\frac{1}{2}H^* + K^*$, or equivalently:
$$\Lambda^\pm(H) = \pm\tfrac{1}{2}, \qquad \Lambda^\pm(K) = 1, \qquad \Lambda^\pm(d) = 0$$
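The conjugation relating the two gradations can be checked symbolically. A sketch (ours, not the book's), using sympy: $\exp(\log(\lambda)H/2) = \mathrm{diag}(\sqrt{\lambda}, 1/\sqrt{\lambda})$ indeed sends $E_+ \to \lambda E_+$ and $E_- \to \lambda^{-1} E_-$.

```python
import sympy as sp

# A sketch (not from the book): the conjugation by exp(log(lambda) H / 2)
# mapping the homogeneous to the principal gradation of sl(2)^.
lam = sp.symbols('lambda', positive=True)
Ep = sp.Matrix([[0, 1], [0, 0]])
Em = sp.Matrix([[0, 0], [1, 0]])
M = sp.diag(sp.sqrt(lam), 1 / sp.sqrt(lam))   # exp(log(lambda) H / 2)

assert sp.simplify(M * Ep * M.inv() - lam * Ep) == sp.zeros(2, 2)
assert sp.simplify(M * Em * M.inv() - Em / lam) == sp.zeros(2, 2)
print("conjugation realizes the principal gradation")
```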
Note that the levels of these fundamental representations are equal to one, i.e. $K$ takes the value 1 on them.

16.6 Vertex operator representations

We now recall the vertex operator construction of the level one representations of $\widehat{sl}(2)$ in the principal gradation. We introduce oscillators $p_n$, for $n$ odd, such that
$$[p_m, p_n] = m\,\delta_{n+m,0}$$
[Fig. 16.1 appears here: the root diagram of $\widehat{sl}(2)$ in the $(H^*, d^*)$ plane, showing the root vectors $E_\pm^{(\pm 1)}$, $E_\pm^{(\pm 3)}$, $H^{(\pm 2)}$ and the simple roots $\alpha_1$, $\alpha_2$.]
Fig. 16.1. The root diagram of $\widehat{sl}(2)$.

Note that the choice $n$ odd ensures that there is no centre in this algebra. Assume $p_n^\dagger = p_{-n}$. The vacuum $|0\rangle$ is defined by $p_n|0\rangle = 0$ for $n > 0$. Its dual $\langle 0|$ is defined by $\langle 0|p_n = 0$ for $n < 0$, and the normalization condition $\langle 0|0\rangle = 1$. This allows us to compute the vacuum expectation value $\langle 0|\mathcal{O}|0\rangle$ of any operator $\mathcal{O}$. The representation space is the Fock space generated by the $p_{-n}$, $n > 0$, acting on the vacuum. We define the normal ordering on monomials of the $p_n$ by putting the $p_n$ with $n > 0$ to the right; we denote it by $:\;:$. Define the operators acting on the Fock space:
$$Q(z) = -i\sqrt{2}\, \sum_{n \ \mathrm{odd}} \frac{z^n}{n}\, p_{-n}$$
We have:
$$\langle 0|Q(z_1)Q(z_2)|0\rangle = \log\left( \frac{z_1 + z_2}{z_1 - z_2} \right), \qquad |z_1| > |z_2| \qquad (16.24)$$
This is because
$$\langle 0|Q(z_1)Q(z_2)|0\rangle = -2 \sum_{n_1, n_2} \frac{z_1^{n_1} z_2^{n_2}}{n_1 n_2}\, \langle 0|p_{-n_1} p_{-n_2}|0\rangle$$
Here $n_1, n_2$ are odd integers, but the properties of the vacuum select $n_1 < 0$ and $n_2 > 0$, and we have $\langle 0|p_{-n_1} p_{-n_2}|0\rangle = -n_1 \delta_{n_1+n_2,0}$, using $p_{-n_1} p_{-n_2} = p_{-n_2} p_{-n_1} - n_1 \delta_{n_1+n_2,0}$. The sum reduces to:
$$\langle 0|Q(z_1)Q(z_2)|0\rangle = 2 \sum_{n_2 > 0} \frac{1}{n_2} \left( \frac{z_2}{z_1} \right)^{n_2} = \log\left( \frac{z_1+z_2}{z_1-z_2} \right), \qquad |z_1| > |z_2|$$
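The resummation can be confirmed numerically (the sum runs over odd $n_2$ only). A sketch (ours, not the book's):

```python
import numpy as np

# A sketch (not from the book): numerical check of eq. (16.24),
# log((z1+z2)/(z1-z2)) = 2 * sum_{n odd, n>0} (z2/z1)^n / n, for |z1| > |z2|.
z1, z2 = 1.7, 0.4
series = 2 * sum((z2 / z1) ** n / n for n in range(1, 400, 2))
assert abs(series - np.log((z1 + z2) / (z1 - z2))) < 1e-12
print("series identity of eq. (16.24) verified")
```

This is the elementary identity $2\sum_{n \ \mathrm{odd}} x^n/n = \log\frac{1+x}{1-x}$ with $x = z_2/z_1$.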
The vertex operator $V(r,z)$ is defined by:
$$V(r,z) = \frac{1}{2}\, : \exp\left( irQ(z) \right) :$$
Proposition. The normal ordered form of a product of two vertex operators is given by:
$$V(r,z_1)\, V(s,z_2) = \left( \frac{z_1 - z_2}{z_1 + z_2} \right)^{rs}\, : V(r,z_1) V(s,z_2) :, \qquad |z_1| > |z_2| \qquad (16.25)$$
Proof. Let
$$Q_\pm = -i\sqrt{2}\, \sum_{\mp n > 0} p_{-n}\, \frac{z^n}{n}$$
so that $Q = Q_+ + Q_-$ and $Q_+|0\rangle = 0$. Then, by definition of the normal order, $V(r,z) = \frac{1}{2} e^{irQ_-(z)} e^{irQ_+(z)}$. To compute $: V(r,z_1)V(s,z_2) :$ we need to commute $\exp(irQ_+(z_1))$ to the right of $\exp(isQ_-(z_2))$. Now it is clear that the commutator of $Q_+(z_1)$ and $Q_-(z_2)$ is a $\mathbb{C}$-number. To evaluate this number one can take its vacuum expectation value. One has, using $Q_+|0\rangle = 0$ and $\langle 0|Q_- = 0$,
$$\langle 0|[Q_+(z_1), Q_-(z_2)]|0\rangle = \langle 0|Q_+(z_1)Q_-(z_2)|0\rangle = \langle 0|Q(z_1)Q(z_2)|0\rangle$$
which is given by eq. (16.24). Moreover, if $A$ and $B$ are two operators such that $[A,B]$ is a $\mathbb{C}$-number, one has $e^A e^B = e^B e^A e^{[A,B]}$. So we arrive at eq. (16.25), since $e^{[A,B]} = ((z_1-z_2)/(z_1+z_2))^{rs}$. The level one vertex operator representations of the Lie algebra $\widehat{sl}(2)$ are obtained as follows:
$$\sum_{n \ \mathrm{odd}} z^{-n}\left( E_+^{(n)} + E_-^{(n)} \right) = P(z) \equiv \sum_{n \ \mathrm{odd}} p_n z^{-n} \qquad (16.26)$$
$$\sum_{n \ \mathrm{even}} z^{-n} H^{(n)} + \sum_{n \ \mathrm{odd}} z^{-n}\left( E_+^{(n)} - E_-^{(n)} \right) = \pm V(z) \qquad (16.27)$$
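The exchange identity $e^A e^B = e^B e^A e^{[A,B]}$ used in the proof can be verified in a finite-dimensional setting where $[A,B]$ is central. A sketch (ours, not the book's), using the Heisenberg algebra of strictly upper triangular $3 \times 3$ matrices:

```python
import numpy as np

# A sketch (not from the book): e^A e^B = e^B e^A e^{[A,B]} when [A,B] is
# central, checked on strictly upper triangular 3x3 matrices (nilpotent, so
# the exponential series terminates and is exact).
def expm_nil(M):
    return np.eye(3) + M + M @ M / 2

E12 = np.zeros((3, 3)); E12[0, 1] = 1
E23 = np.zeros((3, 3)); E23[1, 2] = 1
A, B = 0.7 * E12, -1.3 * E23
C = A @ B - B @ A                 # proportional to E13: central

lhs = expm_nil(A) @ expm_nil(B)
rhs = expm_nil(B) @ expm_nil(A) @ expm_nil(C)
assert np.allclose(lhs, rhs)
print("e^A e^B = e^B e^A e^[A,B] verified")
```

In the vertex operator computation the role of $A$, $B$ is played by $irQ_+(z_1)$ and $isQ_-(z_2)$, whose commutator is the $\mathbb{C}$-number evaluated above.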
where $V(z)$ denotes the vertex operator:
$$V(z) = V(-\sqrt{2}, z) = \frac{1}{2}\, : e^{-i\sqrt{2}\, Q(z)} : \; = \; \frac{1}{2}\, : e^{i\sqrt{2}\, Q(-z)} : \qquad (16.28)$$
Proposition. The operators $E_\pm^{(n)}$ and $H^{(n)}$ defined in eq. (16.27) provide representations of the affine algebra eq. (16.23) with $K = 1$. These representations correspond to the fundamental highest weights $\Lambda^\pm$, according to the sign in eq. (16.27). They are the fundamental level one representations of $\widehat{sl}(2)$.

Proof. We differentiate eq. (16.25) with respect to $z_1$ and get, for $|z_1| > |z_2|$:
$$\frac{1}{r}\frac{\partial}{\partial z_1} V(r,z_1)V(s,z_2) = \frac{i}{2}\, : \frac{dQ(z_1)}{dz_1}\, e^{irQ(z_1)} :\, V(s,z_2) = \frac{2sz_2}{z_1^2 - z_2^2} \left( \frac{z_1-z_2}{z_1+z_2} \right)^{rs} : V(r,z_1)V(s,z_2) : \; + \; \frac{i}{2} \left( \frac{z_1-z_2}{z_1+z_2} \right)^{rs} : \frac{dQ(z_1)}{dz_1}\, e^{irQ(z_1)}\, V(s,z_2) :$$
Here we have used the fact that in the normal ordered product everything commutes, so that one can differentiate the exponential straightforwardly. We then set $r = 0$. Defining:
$$\Gamma(z_1,z_2) = \frac{\sqrt{2}\, s\, z_1 z_2}{z_1^2 - z_2^2}\, V(s,z_2)\; + \; : P(z_1)V(s,z_2) :$$
we get:
$$P(z_1)V(s,z_2) = \Gamma(z_1,z_2), \qquad |z_1| > |z_2|$$
Similarly, we differentiate eq. (16.25) with respect to $z_2$ and then perform the exchange $(z_1, z_2, r) \to (z_2, z_1, s)$. We get:
$$V(s,z_2)P(z_1) = \Gamma(z_1,z_2), \qquad |z_1| < |z_2|$$
Expanding $P(z) = \sum_n p_n z^{-n}$, where $n$ is odd, so that
$$p_n = E_+^{(n)} + E_-^{(n)} = \oint_C \frac{dz}{2i\pi}\, z^{n-1}\, P(z)$$
we can write the commutator $[p_n, V(s,z_2)]$ as:
$$[p_n, V(s,z_2)] = \oint_{C_1 - C_2} \frac{dz_1}{2i\pi}\, z_1^{n-1}\, \Gamma(z_1,z_2)$$
where $C_1$ is a circle around the origin with $|z_1| > |z_2|$, while $C_2$ is a circle around the origin with $|z_1| < |z_2|$. This contour integral is given by the residues at the two poles $z_1 = \pm z_2$, and we finally obtain, setting $s = -\sqrt{2}$:
$$[E_+^{(n)} + E_-^{(n)}, V(z)] = -2z^n\, V(z) \qquad (16.29)$$
16 Lie algebras
Similarly, starting from eq. (16.25) and setting V (z) = gets: [Vn , V (z2 )] =
C1 −C2
dz1 n−1 z 2iπ 1
z 1 − z2 z 1 + z2
nV
(n) z −n ,
one
2 : V (z1 )V (z2 ) :
The residue is at z1 = −z2 and is easily computed, noting that: V (z2 )V (−z2 ) := 1/4. One finds [Vn , V (z)] = 2(−1)n z n P (z) + (−1)n nz n . Separating n even and odd this reads (with the sign of eq. (16.27)): (n)
(n)
[E+ + E− , V (z)] = −2z n P (z) − nz n (16.30) From eqs. (16.29, 16.30), one gets by expanding V (z) into its components (note that cancels): [H (n) , V (z)] = 2z n P (z) + nz n ,
(m)
(n+m)
[H (n) , E± ] = ±2E±
(16.31)
1 (n) (m) (m) [E± , E+ − E− ] = −H (n+m) ∓ nδn+m,0 2
(16.32)
(n) (n) Finally, we have P (z) = n pn z −n so that E+ +E− = pn , and from the (n) (n) (m) (m) commutation relations of pn we get [E+ + E− , E+ + E− ] = nδn+m,0 . Combining with eq. (16.30), this gives: (n)
(m)
[E± , E+
1 (m) + E− ] = ±H (n+m) + nδn+m,0 2
(16.33)
Equations (16.31, 16.32, 16.33) are equivalent to the commutation relations eq. (16.23) for K = 1. We have obtained a level one repesentation It remains to identify the highest weight. It is provided by the of sl(2). vacuum vector |0 because 1 √ V (z)|0 = e−i 2Q− (z) |0 2
(16.34) (n)
(n)
contains only positive odd powers of z. It follows that (E+ −E− )|0 = 0 (n) (n) and H (n) |0 = 0 for n > 0. Moreover, since E+ + E− = pn we have (n) (n) (E+ + E− )|0 = 0 for n > 0. This implies that the vacuum is annihilated by all positive root vectors, hence is a highest weight vector. Equation (16.34) also gives H (0) |0 = 12 |0 so that the corresponding weight is 12 . It is known that this representation on Fock space is irreducible.
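Spelling out the last step (with $H^{(0)}$ identified with the zero mode $V_0$ of $V(z)$, an identification inferred from the relations above rather than quoted from eq. (16.27)): expanding eq. (16.34) in powers of $z$, only the constant term survives at order $z^0$, so

```latex
V(z)\,|0\rangle \;=\; \tfrac{1}{2}\, e^{-i\sqrt{2}\,Q_-(z)}\,|0\rangle
\;=\; \tfrac{1}{2}\,|0\rangle + O(z)
\qquad\Longrightarrow\qquad
H^{(0)}\,|0\rangle \;=\; V_0\,|0\rangle \;=\; \tfrac{1}{2}\,|0\rangle .
```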
Index
Abel map, 29, 136, 184, 399, 552, 553
Abelian differentials, 183, 280, 549
abelianization, 35, 64
action–angle variables, 10, 161, 500
adjoint action, 563
adjoint linear system, 135, 339
adjoint representation, 565
Adler trace, 333, 451
AKS scheme, 90, 113, 335
Arnold theorem, 10
Bäcklund transformations, 69
Baker–Akhiezer function, 145, 182, 184, 215, 222, 279, 322, 328, 338, 342, 380, 393
bihamiltonian structure, 356, 385
bilinear identities, 339
Bloch solutions, 191, 221, 378, 401
bosonization, 303
Calogero–Moser model, 202, 208, 236, 238
canonical bundle, 543
canonical coordinates, 157, 189, 193, 510
canonical cycles, 549
canonical transformation, 7, 8
Cartan matrix, 569
Cartan subalgebra, 107, 437, 566
Casimir operator, 44, 45, 47, 100, 294, 301, 573
Cauchy determinant, 305
central extension, 61, 358
chiral fields, 436, 460
classical double, 527
coadjoint action, 40, 92
coadjoint orbit, 40, 43, 92, 101, 113, 124, 152
conformal invariance, 386, 436, 454, 460
coproduct, 571
cotangent bundle, 228, 512
Darboux theorem, 9, 510
degenerate Poisson bracket, 508
degree of divisor, 544
desingularization, 130, 166, 537, 539
divisor, 544
double, see classical double
dressing transformation, 72, 74, 454, 463, 532
Drinfeld–Sokolov construction, 358, 441, 448
dualization, 85
dynamical divisor, 127, 132, 138, 139, 179, 189, 215, 396
eigenvector bundle, 128, 150, 178, 213
elementary flows, 48
elliptic functions, 203, 557
equivalent divisors, 544
Euler top, 19, 32, 39, 47
exchange algebra, 447
exponential map, 563
factorization, 54, 56, 94, 150, 454, 487
fermions, 297, 388
finite zone solutions, 170, 222, 400, 474
Fock space, 297
Fuchs relation, 254, 264, 278
Gaudin model, 235
Gelfand–Dickey, 37, 66, 332
Gelfand–Levitan–Marchenko, 490
genus, 123, 540
geodesics, 25, 107
Grassmannian, 309, 325
Hamilton–Jacobi equation, 160
Hamiltonian reduction, 24, 107, 116, 125, 203, 228, 238, 359, 519
Hamiltonian vector field, 9, 509
hierarchy, 52, 72, 244, 335
highest root, 569
highest weight, 440, 572
Hirota equation, 274, 296, 307, 387, 443
Hirota operators, 274, 296, 307, 387
Hitchin systems, 227
hyperelliptic curve, 177, 372, 393, 474, 538
irregular singular points, 246
isomonodromic flows, 260, 261
isospectral, 12
Iwasawa decomposition, 107
Jacobi identity, 15
Jacobi matrices, 103
Jacobi problem, 25
Jacobi–Trudy formula, 317
Jacobian torus, 139, 552
Jacobian variety, 29, 552
Jost solutions, 479
Kac–Moody algebra, 175, 459, 578
Kadomtsev–Petviashvili, see KP
KdV equation, 350
KdV Hamiltonians, 350, 382
KdV hierarchy, 348, 379
Kepler problem, 17
Killing form, 565
Kirillov symplectic form, 514
Korteweg–de Vries, see KdV
Kostant–Kirillov bracket, 41, 43, 86, 153, 202, 350, 513, 525
Kowalevski top, 22, 118, 165
KP equation, 338
KP hierarchy, 222, 274, 308, 323, 335
Lagrange top, 20, 33, 40
Lamé function, 203, 558
Lax connection, 62, 72, 438, 460
Lax equation, 11, 34, 41, 92, 139, 176, 204, 392
Lax pair, 11, 13, 92, 93, 119, 175
level of representation, 583
Lie–Poisson action, 528
line bundle, 542
linear system, 52, 62, 122, 182, 394, 477
linearization, 140, 399
Liouville equation, 435
Liouville theorem, 7, 10
Liouville tori, 10, 160, 198
loop algebra, 41, 91, 578
matrix of periods, 550
Miura transformation, 354, 441, 448
moment map, 109, 228, 240, 359, 518, 524
monodromy matrix, 62, 70, 77, 190, 254, 377, 458, 496
Neumann model, 23, 26, 33, 40, 48, 125, 134, 143, 158, 160
Noether theorem, 518
non-Abelian Hamiltonian, 76, 458, 529, 533
normal order, 298
null vectors, 418
operator ∇, 59, 271, 347, 383
Painlevé property, 284
path-ordered exponential, 62
phase shift, 390, 466
Plücker relations, 310
point bundle, 543
Poisson bracket, 6, 507
Poisson manifold, 508
Poisson–Lie group, 76, 457, 525, 526
Poissonian action, 517
polar part, 34
product of bundles, 543
Prym variety, 170
pseudo-differential operators, 322, 333, 379, 451
R± projectors, 76, 177, 456
r-bracket, 85, 531
r-matrix, 14, 43, 70, 76, 85, 98, 100, 176, 206, 497
rank of an algebra, 566
reality conditions, 195, 402
reduced Poisson bracket, 111, 521, 522
reduction group, 39
regular singular points, 246
representation, 565
representation of Lie algebras, 106
Riccati equations, 271, 379
Riemann bilinear identity, 152, 282, 430, 551, 552
Riemann problem, 256
Riemann surface, 29
Riemann theorem, 217, 554
Riemann's constants, 554
Riemann–Hilbert, 53, 56, 72, 78, 231, 256, 487, 559
Riemann–Hurwitz formula, 123, 168, 177, 212, 278, 540
Riemann–Roch, 132, 183, 229, 546, 548, 549
root, 95, 107, 566
Ruijsenaars–Schneider model, 471
Sato formula, 59, 274, 328, 346
scattering data, 478, 481, 488
Schlesinger equations, 263, 267
Schlesinger transformation, 265
Schroedinger discrete, 178
Schroedinger equation, 376
Schur polynomials, 317
Schwarzian derivative, 437
section of bundle, 542
Semenov-Tian-Shansky bracket, 457, 532
semi-infinite wedge product, 312
semi-simple Lie algebra, 566
separation of variables, 18, 27, 28, 158, 410
Serre relations, 570, 580
sigma model, 67, 81
sine-Gordon hierarchy, 462, 464
sine-Gordon model, 68, 458, 477
Sklyanin bracket, 70, 192, 446, 497, 530
soliton solutions, 77, 308, 388, 412, 463, 493
spectral curve, 122, 177, 211, 391
spectral parameter, 32, 34, 61
stationary flows, 171
Stokes matrix, 251
Stokes sectors, 248, 251
symmetric spaces, 107, 108, 115, 239
symplectic form, 43, 157, 186, 194, 219, 407, 415, 466, 508
symplectic transformations, 509
tau-function, 57, 106, 151, 185, 268, 281, 295, 305, 315, 327, 346, 387, 413, 443, 463
tensor notation, 13
theta-function, 141, 147, 184, 218, 395, 554
Toda chain, closed, 175
Toda chain, open, 95
Toda field, 437
topological charge, 465, 478
twisted algebra, 579
ultralocality, 70, 447
Verma module, 572
vertex operator, 295, 304, 463, 584
Virasoro algebra, 356, 386, 449
Volterra group, 335
wave function, 52, 56, 62, 67, 147, 179, 221, 322, 338, 376, 439
weight, 571, 581
Weyl group, 103, 569
Weyl vector, 438
Whitham average, 367, 425
Whitham equations, 371, 431
Wick theorem, 298
Yang–Baxter equation, 15, 45, 86, 100, 207, 528
Yang–Baxter modified equation, 85, 90, 100
Young diagram, 316
Zakharov–Shabat, 34, 63, 72, 244
zero curvature, 50, 62, 245, 438
zones, allowed, 196
zones, forbidden, 196