Peyresq Lectures on
Nonlinear Phenomena
Editors
Robin Kaiser
James Montaldi
Institut Non Linéaire de Nice, France
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
PEYRESQ LECTURES ON NONLINEAR PHENOMENA Copyright © 2000 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-4315-4
Printed in Singapore by Uto-Print
Preface

Nonlinear science is a very broad domain, with its feet in mathematics, physics, chemistry, biology and medicine, as well as in less exact sciences such as economics and sociology. Nineteenth century science was mostly linear, and the progress towards an understanding of the diverse behaviour of nonlinear systems is among the most important general scientific advances of the twentieth century.

The lectures contained in this book took place at two summer schools, the INLN Summer Schools on Nonlinear Phenomena, in June 1998 and June 1999. The Institut Non Linéaire de Nice (INLN) is a pluridisciplinary institute interested in many aspects of nonlinear science, and the principal purpose of this ongoing series of summer schools is to introduce doctoral students, either from the INLN or from other institutions, to a range of topics that are outside their own domain of research. The eight courses represented by these lecture notes therefore cover a broad area, describing analytic, geometric and experimental approaches to subjects as diverse as wound healing, turbulence, elasticity, classical mechanics, semi-classical quantum theory, water waves and trapping atoms. It is hoped that the publication of these notes will be useful to others in the field(s) of nonlinear science.

We would like to take this opportunity, as organizers of the two summer schools, to thank the Fondation Nicolas-Claude Fabri de Peiresc, which hosts our stay in the beautiful village of Peyresq in the French Alps, and in particular the president Mady Smets, for providing a wonderfully relaxed atmosphere, allowing the participants and lecturers to interact easily on both scientific and personal levels. We would also like to thank the local Direction Régionale du CNRS for partially funding these summer schools.

Robin Kaiser
James Montaldi
Valbonne, 2000
Addresses of Contributors

Basile Audoly, Yves Pomeau
Laboratoire de Physique Statistique de l'École Normale Supérieure
24 Rue Lhomond, 75231 Paris Cedex 05, France

Dominique Delande
Laboratoire Kastler-Brossel
Tour 12, Etage 1, Université Pierre et Marie Curie
4, place Jussieu, F-75252 Paris Cedex 05, France

Gérard Iooss, Michel Le Bellac, Eric Lombardi, Robin Kaiser, James Montaldi
Institut Non Linéaire de Nice
UMR CNRS-UNSA 6618
1361 route des Lucioles
F-06560 Valbonne, France

Philip Maini
Centre for Mathematical Biology
Mathematical Institute
24-29 St. Giles'
Oxford OX1 3LB
CONTENTS
Preface
Addresses of Contributors

Elasticity and Geometry
B. Audoly & Y. Pomeau
  1. Introduction
  2. Differential geometry of 2D manifolds
    2.1. Developable surfaces
    2.2. Geometry of the Poincaré half-plane
  3. Thin plate elasticity
    3.1. Euler-Lagrange functional
    3.2. Scaling of the FvK equations
    3.3. Geometry and the FvK equations
    3.4. Thin shells: an example
  4. Buckling in thin film delamination
    4.1. The straight sided blister
    4.2. Telephone cord delamination

Quantum Chaos
D. Delande
  1. What is Quantum Chaos?
    1.1. Classical chaos
    1.2. Quantum dynamics
    1.3. Semiclassical dynamics
    1.4. Physical situations of interest
    1.5. A simple example: the hydrogen atom in a magnetic field
  2. Time scales - Energy scales
  3. Statistical properties of energy levels - random matrix theory
    3.1. Level dynamics
    3.2. Statistical analysis of the spectral fluctuations
    3.3. Regular regime
    3.4. Chaotic regime - random matrix theory
    3.5. Random matrix theory - continued
  4. Semiclassical approximation
    4.1. Regular systems - EBK/WKB quantization
    4.2. Semiclassical propagator
    4.3. Green's function
    4.4. Trace formula
    4.5. Convergence properties of the trace formula
    4.6. An example: the Helium atom
  5. Conclusion

The Water-wave Problem as a Spatial Dynamical System
G. Iooss
  1. Introduction
  2. Formulation as a reversible dynamical system
    2.1. Case of one layer with surface tension at the free surface
    2.2. Case of two layers without surface tension
  3. The linearized problem
  4. Basic codimension one reversible normal forms
    4.1. Case (i)
    4.2. Case (ii)
    4.3. Case (iii)
    4.4. Case (iv)
  5. Typical results for finite depth problems
  6. Infinite depth case
    6.1. Spectrum of the linearized problem
    6.2. Normal forms in infinite dimensions
    6.3. Typical results

Cold Atoms and Multiple Scattering
R. Kaiser
  1. Classical model of Doppler cooling
    1.1. Internal motion: elastically bound electron
    1.2. Radiation forces acting on the atom: "classical approach"
    1.3. Resonant radiation pressure
    1.4. Dipole force
    1.5. Doppler cooling
  2. Interferences in multiple scattering
    2.1. Scattering cross section of single atoms
    2.2. Multiple scattering samples in atomic physics
    2.3. Dwell time
    2.4. Coherent backscattering of light
    2.5. Strong localization of light in atom?
  3. Conclusion

An Introduction to Zakharov Theory of Weak Turbulence
M. Le Bellac
  1. Introduction
  2. Hamiltonian formalism for water waves
    2.1. Fundamental equations
    2.2. Hamilton's equations of motion
    2.3. The perturbative expansion
  3. The normal form of the Hamiltonian
    3.1. H_0 = sum of harmonic oscillators
    3.2. Nonlinear terms: three wave interactions
    3.3. Nonlinear terms: four wave interactions
    3.4. Dimensional analysis and scaling laws
    3.5. Miscellaneous remarks
  4. Kinetic equations
    4.1. Derivation of the kinetic equations
    4.2. Conservation laws
  5. Stationary spectra of weak turbulence
    5.1. Dimensional estimates
    5.2. Zakharov transformation
    5.3. Examples and final remarks

Phenomena Beyond All Orders and Bifurcations of Reversible Homoclinic Connections near Higher Resonances
E. Lombardi
  1. Introduction
    1.1. Phenomena beyond all orders in dynamical systems
    1.2. A little toy model: from phenomena beyond any algebraic order to oscillatory integrals
  2. Exponential tools for evaluating monofrequency oscillatory integrals
    2.1. Rough exponential upper bounds
    2.2. Sharp exponential upper bounds
    2.3. Exponential equivalent: general theory
    2.4. Exponential equivalent: strategy for nonlinear differential equation
  3. Resonances of reversible vector fields
    3.1. Definitions
    3.2. Linear classification of reversible fixed points
    3.3. Nomenclature
  4. The 0^{2+} iω resonance
  5. The (iω_0)^2 iω_1 resonance
    5.1. Exponential asymptotics of bi-oscillatory integrals
    5.2. Strategy for non linear differential equations

Mathematical Modelling in the Life Sciences: Applications in Pattern Formation and Wound Healing
P. K. Maini
  1. Introduction
  2. Models for pattern formation and morphogenesis
    2.1. Chemical pre-pattern models
    2.2. Cell movement models
    2.3. Cell rearrangement models
    2.4. Applications
    2.5. Coupling pattern generators
    2.6. Domain growth
    2.7. Discussion
  3. Models for wound healing
    3.1. Corneal wound healing
    3.2. Dermal healing
    3.3. Discussion
  4. Conclusions

Relative Equilibria and Conserved Quantities in Symmetric Hamiltonian Systems
J. Montaldi
  1. Introduction
    1.1. Hamilton's equations
    1.2. Examples
    1.3. Symmetry
    1.4. Central force problem
    1.5. Lie group actions
  2. Noether's theorem and the momentum map
    2.1. Noether's theorem
    2.2. Equivariance of the momentum map
    2.3. Reduction
    2.4. Singular reduction
    2.5. Symplectic slice and the reduced space
  3. Relative equilibria
  4. Bifurcations of (relative) equilibria
    4.1. One degree of freedom
    4.2. Higher degrees of freedom
  5. Geometric bifurcations
  6. Examples
    6.1. Point vortices on the sphere
    6.2. Point vortices in the plane
    6.3. Molecules
ELASTICITY AND GEOMETRY

BASILE AUDOLY, YVES POMEAU
Laboratoire de Physique Statistique de l'École Normale Supérieure

We outline the general principles of thin plate elasticity, emphasizing their connection with the classical results of differential geometry. The relevant FvK equations can be solved in some specific cases, even though they are strongly nonlinear. We present two types of solutions. The first concerns the contact of a spherical shell with a flat plane at increasing pressing force; the second concerns the buckling of a thin film under pressure on a flat substrate, where we explain the observed "telephone-cord" pattern of delamination.
1 Introduction
This paper follows from a set of lectures given by the two authors in the beautiful setting of the Rencontres de Peyresq, in the high country north of Nice, in late spring 1999. Those lectures were devoted to some recent results in thin plate elasticity. This venerable field of classical mechanics is witnessing a renewal of interest, in part because of the new attraction of physicists and applied mathematicians to everything linking classical geometry and observations made in everyday life. The lectures focused first on the general principles of thin plate elasticity, emphasizing as much as we could their connection with the beautiful results of classical differential geometry. Below, we present a rather detailed derivation of Gauss's Theorema egregium, which states the condition under which two surfaces can be mapped onto each other without changing the curvilinear distances. Although this is often presented as obvious, the connection between the Theorema egregium and the laws of elasticity of thin plates is not so simple. We hope to make it clearer in our derivation of the Föppl-von Kármán equations for thin plates.

Later, we use those equations to analyze two physical problems. The first concerns the way a spherical shell deforms when pressed on a plane, as when a tennis ball bounces on a racket. We show that two regimes can be observed, depending on the strength of the force. At low forces, the ball makes contact on a flat disc. When the force gets bigger, the ball inverts itself on a cap, and the contact is then limited to the circular ridge between the inverted and the non-inverted parts. The details of the geometry of the ridge are deduced from an analysis of the elasticity equations.

Finally, we discuss, again using the same elasticity equations, the problem of buckling of a delaminated film. Quite often, a film coating some bulk material is under compression because of the way it has been deposited. This film may relax the compression by buckling out of the surface of the bulk material. In many instances a very specific pattern is observed for the buckled film, the so-called telephone cord delamination. We show that this may be explained as the result of a secondary bifurcation of a tunnel-like structure of delaminated film, the Euler column.

2 Differential geometry of 2D manifolds
The equations for the elasticity of thin plates (FvK equations hereafter) were derived at the beginning of the twentieth century by Föppl, and they are notorious for their complex nonlinear structure. Only recently have various investigations revealed the possibility of getting explicit solutions in various limits, which may be put globally under the heading of large deformations. Those solutions rely heavily on the connection between the FvK equations and the underlying geometry.

One central question in this geometry of surfaces, closely linked to elasticity problems, is to find the conditions for a given surface to be isometrically deformable. By this, we mean a deformation leaving unchanged the (intrinsic) distances measured along the surface. If one thinks of a piece of paper, this intrinsic distance is just the length of a line drawn between two points on the paper. This length remains the same when the paper is rolled in one way or another without tearing, whence the name "intrinsic". Although the definition of this intrinsic length is relatively straightforward in the present case, it rapidly becomes far more subtle when higher-dimensional spaces are considered, and even for non-planar 2D surfaces (like the surface of a sphere, for instance). Riemannian geometry is the geometry of surfaces (and of their generalization to higher dimensions, the so-called manifolds) such that the distances are invariant, that is, independent of the coordinates chosen on the surface itself. That this is a crucial question in elasticity theory is evident when one notices that the elastic energy precisely accounts for the amount of stretching undergone by the material under the deformation. This stretching is measured by how much the distances between material points vary.

In the present section, we consider the geometrical problem only, and we shall deal in a rather casual way with deep results of differential geometry related to this question of deformation of surfaces.
Far more elaborate presentations of this topic (necessary anyway when dealing with manifolds of dimension higher than 2) can be found in [1]. The problem we shall look at is the following one: under what conditions is it possible to find a one-to-one map between a plane and a surface given
by a Cartesian equation z = Z(x,y), without changing the lengths along the surface? Later, we shall also examine the existence of an isometric, one-to-one mapping between two given surfaces. We are looking for a local map of a point of coordinates (x, y) in the horizontal plane to the point of coordinates x' = x + u(x,y), y' = y + v(x,y), z' = Z(x,y), which is situated on the surface. The functions u(x,y) and v(x,y) define "practically" the mapping under consideration. The constraint (imposed on u and v) is that the length element along the surface is the same as the length element on the plane, that is, $ds'^2 = dx'^2 + dy'^2 + dz'^2 = ds^2 = dx^2 + dy^2$.
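The orientation of the tangent plane of the surface at the origin (x, y) = (0,0) can be chosen arbitrarily with the help of a rigid-body rotation. We will therefore assume that it is horizontal. Then, the mapping is close to the identity near the origin. The Taylor expansion of Z(x, y) is quadratic in x and y near the origin, and we expect u and v to be small (actually, u and v are generically cubic in x, y near x = y = 0). Expanding $dx'^2$ and $dy'^2$ at first order in u and v, and at second order in Z (see explanation below), one gets:

$$ds'^2 = \left[d(x + u(x,y))\right]^2 + \left[d(y + v(x,y))\right]^2 + \left[dZ(x,y)\right]^2$$

$$\phantom{ds'^2} = dx^2\left[1 + 2\frac{\partial u}{\partial x} + \left(\frac{\partial Z}{\partial x}\right)^2\right] + dy^2\left[1 + 2\frac{\partial v}{\partial y} + \left(\frac{\partial Z}{\partial y}\right)^2\right] + 2\,dx\,dy\left[\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} + \frac{\partial Z}{\partial x}\frac{\partial Z}{\partial y}\right].$$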
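Now the condition of invariance of the length element under the mapping becomes the condition that $ds'^2$ is the same quadratic form as $ds^2$. This yields three conditions: one for the coefficient of $dx^2$ to be one, another for the coefficient of $dy^2$ to be one too, and the last one for the coefficient of the cross term $dx\,dy$ to vanish:

$$\frac{\partial u}{\partial x} + \frac{1}{2}\left(\frac{\partial Z}{\partial x}\right)^2 = 0, \tag{2}$$

$$\frac{\partial v}{\partial y} + \frac{1}{2}\left(\frac{\partial Z}{\partial y}\right)^2 = 0, \tag{3}$$

$$\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} + \frac{\partial Z}{\partial x}\frac{\partial Z}{\partial y} = 0. \tag{4}$$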
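The three conditions can be checked numerically on a simple developable surface. The sketch below is only an illustration under stated assumptions: it takes a parabolic cylinder Z = x²/(2R) with an arbitrary value R = 2, uses the displacement u = -x³/(6R²), v = 0 obtained by integrating the first condition, and evaluates the three residuals by finite differences. It also compares the length of the image of a short segment with and without the in-plane displacement. All function names are hypothetical helpers, not notation from the text.

```python
import math

R = 2.0  # assumed radius of curvature at the origin (illustrative value)

def Z(x, y):
    # Parabolic cylinder: quadratic Taylor expansion of a cylinder of radius R
    return x * x / (2.0 * R)

def u(x, y):
    # In-plane displacement obtained by integrating du/dx = -(1/2) (dZ/dx)^2
    return -x ** 3 / (6.0 * R * R)

def v(x, y):
    # No displacement is needed along the straight direction of the cylinder
    return 0.0

def d(f, x, y, axis, h=1e-5):
    # Central finite difference of f with respect to x or y
    if axis == "x":
        return (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

def residuals(x, y):
    # Left-hand sides of the three isometry conditions; all should vanish
    r_xx = d(u, x, y, "x") + 0.5 * d(Z, x, y, "x") ** 2
    r_yy = d(v, x, y, "y") + 0.5 * d(Z, x, y, "y") ** 2
    r_xy = d(u, x, y, "y") + d(v, x, y, "x") + d(Z, x, y, "x") * d(Z, x, y, "y")
    return r_xx, r_yy, r_xy

def mapped_length(x0, use_u=True, n=2000):
    # Length of the image of the segment [0, x0] on the x axis,
    # approximated by a polyline through the mapped points (x + u, Z)
    total = 0.0
    xa, za = 0.0, 0.0
    for i in range(1, n + 1):
        x = x0 * i / n
        xb = x + (u(x, 0.0) if use_u else 0.0)
        zb = Z(x, 0.0)
        total += math.hypot(xb - xa, zb - za)
        xa, za = xb, zb
    return total

if __name__ == "__main__":
    print(residuals(0.3, -0.1))
    print(mapped_length(0.2) - 0.2, mapped_length(0.2, use_u=False) - 0.2)
```

With the displacement included, the length mismatch is of fifth order in the segment length, whereas the purely vertical projection (u = 0) already fails at third order.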
As we aim at eliminating u and v, there is one more condition than the number of unknown functions (three versus two), and one condition has to be satisfied for solutions of (2,3,4) to exist. This condition is imposed on the function Z(x,y), which is given data in the problem. It is obtained by differentiating the first equation twice with respect to y, the second one twice with respect to x, and the last one once with respect to x and once with respect to y. Subtracting the last result from the sum of the first two, the u's and v's cancel out and there remains an equation for Z only:

$$\frac{\partial^2 Z}{\partial x^2}\frac{\partial^2 Z}{\partial y^2} - \left(\frac{\partial^2 Z}{\partial x\,\partial y}\right)^2 = 0. \tag{5}$$
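The cancellation can be written out explicitly. The following is a sketch of the algebra, using subscripts for partial derivatives (a shorthand introduced here for compactness, not notation from the text):

```latex
% Differentiate the first condition twice in y, the second twice in x,
% and the third once in x and once in y:
\begin{align*}
u_{xyy} + \tfrac12\,\partial_y^2\!\left(Z_x^2\right) &= 0, \\
v_{xxy} + \tfrac12\,\partial_x^2\!\left(Z_y^2\right) &= 0, \\
u_{xyy} + v_{xxy} + \partial_x\partial_y\!\left(Z_x Z_y\right) &= 0 .
\end{align*}
% Sum of the first two minus the third: u and v drop out.
\begin{align*}
0 &= \tfrac12\,\partial_y^2\!\left(Z_x^2\right)
   + \tfrac12\,\partial_x^2\!\left(Z_y^2\right)
   - \partial_x\partial_y\!\left(Z_x Z_y\right) \\
  &= \left(Z_{xy}^2 + Z_x Z_{xyy}\right) + \left(Z_{xy}^2 + Z_y Z_{xxy}\right)
   - \left(Z_{xx} Z_{yy} + Z_{xy}^2 + Z_x Z_{xyy} + Z_y Z_{xxy}\right) \\
  &= Z_{xy}^2 - Z_{xx} Z_{yy} .
\end{align*}
```

All third-derivative terms cancel in pairs, and what remains is the determinant of the Hessian of Z, up to an overall sign.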
This has a simple geometrical interpretation. Let us write Z(x,y) in the coordinate system diagonalizing its Taylor expansion near x = y = 0:

$$Z(x,y) \approx \frac{x^2}{2R_1} + \frac{y^2}{2R_2},$$

where $R_1$, $R_2$ are the so-called principal radii of curvature of the surface at x = y = 0. Therefore, equation (5) amounts to $\frac{1}{R_1 R_2} = 0$, or equivalently to the statement that at least one of the radii of curvature is infinite. Surfaces such that this holds true everywhere are called developable. When this condition is satisfied, integration of (2,3) yields $u(x,y) \approx -\frac{x^3}{6R_1^2}$ and $v(x,y) \approx -\frac{y^3}{6R_2^2}$.

It is now a straightforward exercise to get, by the same method, the existence condition of an isometry between two smooth surfaces of Cartesian equations $z = Z_a(x,y)$ and $z = Z_b(x,y)$. One can take those two surfaces as tangent to the horizontal plane at x = y = 0, then redo the same calculation as before, but imposing that the lengths on the two surfaces remain the same under two mappings. Those mappings depend on two pairs of functions $u_{a,b}(x,y)$ and $v_{a,b}(x,y)$, and are mappings from the plane to the surfaces a and b, such that $x'_a = x + u_a(x,y)$, $y'_a = y + v_a(x,y)$ and $z'_a = Z_a(x'_a,y'_a)$, with a similar set with the subscript b instead of a. Now one imposes that the two length elements $(dx'_a)^2 + (dy'_a)^2 + (dZ'_a)^2$ and $(dx'_b)^2 + (dy'_b)^2 + (dZ'_b)^2$ are the same quadratic form in dx and dy, which yields:

$$\frac{\partial u_a}{\partial x} + \frac{1}{2}\left(\frac{\partial Z_a}{\partial x}\right)^2 = \frac{\partial u_b}{\partial x} + \frac{1}{2}\left(\frac{\partial Z_b}{\partial x}\right)^2,$$

$$\frac{\partial v_a}{\partial y} + \frac{1}{2}\left(\frac{\partial Z_a}{\partial y}\right)^2 = \frac{\partial v_b}{\partial y} + \frac{1}{2}\left(\frac{\partial Z_b}{\partial y}\right)^2,$$

and

$$\frac{\partial u_a}{\partial y} + \frac{\partial v_a}{\partial x} + \frac{\partial Z_a}{\partial x}\frac{\partial Z_a}{\partial y} = \frac{\partial u_b}{\partial y} + \frac{\partial v_b}{\partial x} + \frac{\partial Z_b}{\partial x}\frac{\partial Z_b}{\partial y}.$$
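Because of the obvious similarity of these equations with those of the previous case, one may use the same method to get rid of the functions $u_{a,b}$ and $v_{a,b}$. One finds in the end that the Gaussian curvatures of the surfaces a and b have to be the same:

$$\frac{\partial^2 Z_a}{\partial x^2}\frac{\partial^2 Z_a}{\partial y^2} - \left(\frac{\partial^2 Z_a}{\partial x\,\partial y}\right)^2 = \frac{\partial^2 Z_b}{\partial x^2}\frac{\partial^2 Z_b}{\partial y^2} - \left(\frac{\partial^2 Z_b}{\partial x\,\partial y}\right)^2. \tag{6}$$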
This is the so-called Theorema egregium of Gauss (meaning approximately "outstanding", or "out of the crowd" theorem). This theorem is sometimes described as showing that the Gaussian curvature is a bending invariant: suppose that one can deform the surface isometrically ("bend it"), then its Gaussian curvature must remain the same at every point. This property is obviously a constraint on isometric deformations, but it is still in general a difficult question to know whether non-trivial isometries exist for a given surface. For instance, a plane or a cylinder are deformable surfaces, but not a sphere, nor even a convex surface (when the edge is attached).

In the coming two subsections, we shall discuss two questions of differential geometry. The first has to do with some properties of developable surfaces, which will be useful later on for the elasticity of thin plates. The second has a more mathematical bent and aims at showing an example of application of the ideas of differential geometry in a well-defined case, the so-called Poincaré half-plane.

2.1 Developable surfaces
By definition, such a surface may be mapped onto a plane without stretching, and it is $C^2$ smooth (an important assumption). Let us first state the Theorema egregium in its general form (so far we have stated it only for almost horizontal surfaces). Its extension is almost trivial, because it only requires writing the Gaussian curvature in an arbitrary system of coordinates. This can be done in a number of ways, and the final result is that the product of the principal curvatures (the inverses of the principal radii of curvature) is equal to

$$G(x,y) = \frac{1}{R_1 R_2} = \frac{\dfrac{\partial^2 Z}{\partial x^2}\dfrac{\partial^2 Z}{\partial y^2} - \left(\dfrac{\partial^2 Z}{\partial x\,\partial y}\right)^2}{\left[1 + \left(\dfrac{\partial Z}{\partial x}\right)^2 + \left(\dfrac{\partial Z}{\partial y}\right)^2\right]^2}, \tag{7}$$

which reduces to the left-hand side of (5) when the tangent plane is horizontal. Therefore the algebraic condition that the Gaussian curvature vanishes is always

$$\frac{\partial^2 Z}{\partial x^2}\frac{\partial^2 Z}{\partial y^2} - \left(\frac{\partial^2 Z}{\partial x\,\partial y}\right)^2 = 0.$$
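The algebra will be made simpler later on with the notation

$$[Z_a, Z_b] = \frac{\partial^2 Z_a}{\partial x^2}\frac{\partial^2 Z_b}{\partial y^2} + \frac{\partial^2 Z_b}{\partial x^2}\frac{\partial^2 Z_a}{\partial y^2} - 2\,\frac{\partial^2 Z_a}{\partial x\,\partial y}\frac{\partial^2 Z_b}{\partial x\,\partial y},$$

so that

$$G(x,y) = \frac{[Z,Z]}{2\left[1 + \left(\dfrac{\partial Z}{\partial x}\right)^2 + \left(\dfrac{\partial Z}{\partial y}\right)^2\right]^2}.$$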
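As a quick numerical illustration of the bracket notation, the sketch below evaluates the Gaussian curvature of a surface given as a graph z = Z(x, y), using central finite differences for the derivatives. The three test surfaces (a spherical cap, a cylinder and a cone, with an arbitrary radius R = 2) and all function names are assumptions chosen for the example; they do not come from the text.

```python
import math

def first_derivs(f, x, y, h=1e-3):
    # Central differences for df/dx and df/dy
    fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return fx, fy

def second_derivs(f, x, y, h=1e-3):
    # Central differences for d2f/dx2, d2f/dy2 and d2f/dxdy
    f0 = f(x, y)
    fxx = (f(x + h, y) - 2.0 * f0 + f(x - h, y)) / h ** 2
    fyy = (f(x, y + h) - 2.0 * f0 + f(x, y - h)) / h ** 2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h ** 2)
    return fxx, fyy, fxy

def bracket(fa, fb, x, y, h=1e-3):
    # [fa, fb] = fa_xx fb_yy + fb_xx fa_yy - 2 fa_xy fb_xy
    axx, ayy, axy = second_derivs(fa, x, y, h)
    bxx, byy, bxy = second_derivs(fb, x, y, h)
    return axx * byy + bxx * ayy - 2.0 * axy * bxy

def gaussian_curvature(Z, x, y, h=1e-3):
    # G = [Z, Z] / (2 (1 + Z_x^2 + Z_y^2)^2)
    zx, zy = first_derivs(Z, x, y, h)
    return bracket(Z, Z, x, y, h) / (2.0 * (1.0 + zx * zx + zy * zy) ** 2)

R = 2.0  # illustrative radius

def sphere(x, y):
    # Lower cap of a sphere of radius R, tangent to the plane z = 0
    return R - math.sqrt(R * R - x * x - y * y)

def cylinder(x, y):
    # Cylinder of radius R with its axis along y (developable)
    return R - math.sqrt(R * R - x * x)

def cone(x, y):
    # Right circular cone, evaluated away from its apex (developable)
    return math.hypot(x, y)

if __name__ == "__main__":
    print(gaussian_curvature(sphere, 0.3, 0.2))    # close to 1/R^2
    print(gaussian_curvature(cylinder, 0.3, 0.2))  # close to 0
    print(gaussian_curvature(cone, 1.0, 0.5))      # close to 0
```

The sphere is evaluated away from the origin on purpose: there the tangent plane is tilted, so the denominator matters, and the result still equals 1/R².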
A classical problem, called the Monge-Ampère equation, amounts to finding the unknown function(s) [a] Z(x, y) such that G(x, y) is prescribed, in a bounded domain for instance. In the case of zero curvature, the Monge-Ampère equation G(x, y) = 0 has an interesting general solution. Consider a one-parameter family of planes of Cartesian equation:

$$\mathcal{L}(x, y|s) = a(s) + b(s)x + c(s)y,$$

where a(s), b(s) and c(s) are smooth arbitrary functions of a parameter s. The envelope of this family of planes is the surface tangent everywhere to one plane in the family. As a first result, we show that this surface is tangent to those planes along straight lines (the generatrices). Consider two planes with neighboring indices s and s + δs, with δs small. Those two planes cross along a straight line, the intersection of the planes of equations

$$z = a(s) + b(s)x + c(s)y$$

and

$$z = a(s + \delta s) + b(s + \delta s)x + c(s + \delta s)y.$$

Take the difference between those two equations and divide by δs. In the limit of a vanishing δs, the limit line of intersection of the two planes has the Cartesian equations:

$$z = a(s) + b(s)x + c(s)y, \tag{8}$$

$$\frac{da}{ds} + \frac{db}{ds}x + \frac{dc}{ds}y = 0. \tag{9}$$
These are two Cartesian equations of planes, showing that the envelope of the family of planes must generically include the straight lines whose equations are obtained in this way. It remains to show that the surface so generated has zero Gaussian curvature. This is shown directly, by computing G(x,y) as given by (7). The calculation is not straightforward, and so we shall break it down into its main steps. Notice first that equation (9) can be considered as giving s implicitly as a function of x, y. By differentiation of (8) with respect to x, one gets:

$$\frac{\partial z}{\partial x} = b(s) + \frac{\partial s}{\partial x}\,\frac{\partial\left(a(s) + b(s)x + c(s)y\right)}{\partial s} \tag{10}$$

and a similar equation for the derivative of z with respect to y. From the second equation (9), $\frac{\partial (a(s) + b(s)x + c(s)y)}{\partial s} = 0$ whenever one takes points on the surface tangent to the family of planes. Accordingly $\frac{\partial z}{\partial x} = b(s)$ and $\frac{\partial z}{\partial y} = c(s)$. Therefore,

$$\frac{\partial^2 z}{\partial x^2} = \frac{db}{ds}\frac{\partial s}{\partial x}, \qquad \frac{\partial^2 z}{\partial y^2} = \frac{dc}{ds}\frac{\partial s}{\partial y}, \qquad \frac{\partial^2 z}{\partial y\,\partial x} = \frac{db}{ds}\frac{\partial s}{\partial y} = \frac{dc}{ds}\frac{\partial s}{\partial x}.$$

Putting this into the definition of [z,z]:

$$[z,z] = 2\left[\frac{\partial^2 z}{\partial x^2}\frac{\partial^2 z}{\partial y^2} - \left(\frac{\partial^2 z}{\partial x\,\partial y}\right)^2\right],$$

one gets that [z, z] is zero for a surface that is the envelope of a one-parameter family of planes. This shows that the general solution of the Monge-Ampère equation G(x, y) = 0 = [z(x, y), z(x, y)] (that is, for zero Gaussian curvature) is $z = \mathcal{L}(x, y|s(x,y))$, where s(x, y) is given by (9). Let us notice at this point that the surface found in this way is not necessarily physically representable by a folded sheet of paper, as it may cross itself, either along a line or at a conical point from which a bundle of generatrices stems.

[a] There might be more than one solution: think of the case of zero Gaussian curvature.

2.2 Geometry of the Poincaré half-plane

This subsection shows on a concrete example how differential geometry works. The example is the Poincaré half-plane, that is, the upper half-plane, with x arbitrary and y positive. This half-plane is endowed with the metric
$$ds^2 = \frac{dx^2 + dy^2}{y^2}. \tag{12}$$

This metric is invariant under dilations centered on the x axis and is actually related to a projective metric in higher space dimensions [2]. The mathematics of the Poincaré half-plane is an immense field of investigation, and we shall content ourselves with a few simple remarks aiming to show how differential geometry works in a specific case. The first simple remark is that this metric cannot be realized by a surface sitting above the half-plane, since for large y the length element becomes too small to be written as $ds^2 = dx^2 + dy^2 + dZ^2$, where Z(x,y) is some function of x and y.

A fundamental notion in differential geometry is that of a geodesic line. By definition, a geodesic has the shortest arc length between two points. This holds true for a positive definite metric, such as the one under consideration. For a non-positive definite metric, one replaces this condition of shortest distance by the condition that this distance is extremal in the variational sense. The geodesic lines of the Poincaré half-plane are called horocycles, and they minimize the arc length $\int \frac{\sqrt{dx^2 + dy^2}}{y}$. The minimization is easier if one assumes that the Cartesian equation of the horocycle is given by x(y), and one gets as Euler-Lagrange condition the so-called Jacobi equation:
$$\frac{d}{dy}\left(\frac{x'}{y\sqrt{1 + x'^2}}\right) = 0, \tag{13}$$

where $x' = \frac{dx}{dy}$. This can be solved at once to give the general parametric equation for the horocycles: $x = -R\cos\phi + x_0$ and $y = R\sin\phi$, where R is an arbitrary positive constant and $\phi$ an angle in $[0, \pi]$; $x_0$ is the center of the semicircle drawn by the horocycle.

There are two other remarkable properties of the Poincaré half-plane. The first one is that it has constant negative Gaussian curvature. In the present framework, one cannot define Gaussian curvature as we did before, because there is no explicit realization of the Poincaré half-plane as a real 2D surface. Actually the Gaussian curvature may be defined intrinsically as well, through the general formula giving the total rotation of the tangent to a closed curve (the Gauss-Bonnet theorem). By definition, there is no rotation of the tangent along a geodesic. Therefore, it is easier to consider this total rotation along a closed curve made of geodesic arcs. In ordinary plane geometry, the total rotation of the tangent to a smooth closed curve is $2\pi$. For a polygon with straight edges, this is still the same $2\pi$ if one blunts the vertices a bit, in such a way that the tangent at the vertex of index j rotates by $\alpha_j - \pi$, where $\alpha_j$ is the outer angle of the vertex. Therefore $\sum_j(\alpha_j - \pi) = 2\pi$ for any closed polygon, a familiar result indeed. For a triangle, this yields $\sum_j \alpha_j = 5\pi$ or $\sum_j(2\pi - \alpha_j) = \pi$, showing that the sum of the inner angles $2\pi - \alpha_j$ is $\pi$.
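The semicircle solution can be tested numerically. The sketch below builds the semicircle centered on the x axis through two given points, integrates the length element of the metric along it, and compares the result with the length of the straight Euclidean segment between the same points. As an independent check it uses the standard closed-form distance of the half-plane model, cosh d = 1 + |p - q|²/(2 y_p y_q), which is not derived in the text and is quoted here as an outside fact. The points and function names are illustrative assumptions.

```python
import math

def geodesic_through(p, q):
    # Center x0 (on the x axis) and radius R of the semicircle through p and q.
    # Assumes p and q have different x coordinates.
    (x1, y1), (x2, y2) = p, q
    x0 = ((x2 * x2 + y2 * y2) - (x1 * x1 + y1 * y1)) / (2.0 * (x2 - x1))
    return x0, math.hypot(x1 - x0, y1)

def path_length(curve, n=5000):
    # Length of t -> (x(t), y(t)), t in [0, 1], for ds = sqrt(dx^2 + dy^2) / y,
    # approximated segment by segment with y taken at the segment midpoint
    total = 0.0
    xa, ya = curve(0.0)
    for i in range(1, n + 1):
        xb, yb = curve(i / n)
        total += math.hypot(xb - xa, yb - ya) / (0.5 * (ya + yb))
        xa, ya = xb, yb
    return total

def semicircle_arc(p, q):
    # Arc from p to q, parametrized as x = -R cos(phi) + x0, y = R sin(phi)
    x0, R = geodesic_through(p, q)
    phi_p = math.atan2(p[1], x0 - p[0])
    phi_q = math.atan2(q[1], x0 - q[0])
    def curve(t):
        phi = phi_p + t * (phi_q - phi_p)
        return (-R * math.cos(phi) + x0, R * math.sin(phi))
    return curve

def chord(p, q):
    # Straight Euclidean segment from p to q, for comparison
    return lambda t: (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

def closed_form_distance(p, q):
    # Standard half-plane distance formula (quoted, not derived here)
    d2 = (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
    return math.acosh(1.0 + d2 / (2.0 * p[1] * q[1]))

if __name__ == "__main__":
    p, q = (0.0, 1.0), (2.0, 1.0)
    print(path_length(semicircle_arc(p, q)))  # length along the semicircle
    print(path_length(chord(p, q)))           # length along the straight chord
    print(closed_form_distance(p, q))
```

The semicircle rises to larger y, where the metric makes lengths cheaper, which is why it beats the horizontal chord between two points at the same height.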
Figure 1. Geodesic polygon in the Poincaré half-plane.
We borrow the proof of the Gauss-Bonnet theorem for the Poincaré half-plane from C. L. Siegel [3]. Consider a polygon whose sides are arcs of horocycles, and compute the area $\Omega$ inside this polygon. The area element consistent with the metric is $\frac{dx\,dy}{y^2}$. The area inside the polygon is just the integral $\Omega = \int \frac{dx\,dy}{y^2}$ extended to the inside of the polygon. Integrating over y turns the area into a sum of line integrals along each geodesic arc bounding the polygon:

$$\Omega = -\sum_k \int_{C_k} \frac{dx}{y},$$

where $C_k$ is the arc of index k bounding the polygon. The line integrals over $C_k$ are computed by using as variable the angle $\phi$ that was introduced when solving Jacobi's equation for the geodesics: $dx = R\sin\phi\,d\phi$ and $y = R\sin\phi$, and the line integral becomes $\Omega = \sum_k \int_{\beta_k}^{\gamma_k} (-d\phi) = -\sum_k(\gamma_k - \beta_k)$, where $\beta_k$ and $\gamma_k$ are the values of $\phi$ at the ends of the arc $C_k$ (see Fig. 1). Now one notices that $(\gamma_k - \beta_k)$ is precisely the amount by which the normal has rotated from one end of this arc to the other. The sum $\sum_k(\gamma_k - \beta_k)$ is $2\pi$ minus the sum of the angles necessary to turn the normal at each vertex when going from one arc to the next. These angles are the $(\alpha_k - \pi)$'s, with $\alpha_k$ the outer angle at vertex k: $-\Omega = \sum_k(\gamma_k - \beta_k) = 2\pi - \sum_k(\alpha_k - \pi)$. This is a particular case of a more general formula, the Gauss-Bonnet formula, which relates in general the integral of the geodesic curvature along a closed curve to the integral of the Gaussian curvature inside the enclosed area. This formula reads, for a polygon bounded by geodesic arcs:

$$\sum_k(\alpha_k - \pi) + \int_\Omega K_G\,d\omega = 2\pi, \tag{14}$$

where $K_G$ is the Gaussian curvature and $d\omega$ the area element. This formula can be written in a more general form for a general (non piecewise geodesic) closed
curve. The Gauss-Bonnet formula in its form (14) shows that the Gaussian curvature of the Poincaré half-plane is −1.

Another important property of the Poincaré half-plane is that it has a continuous group of isometries, the so-called modular group. Its general element depends on three real variables (it is actually easier to use four variables with one constraint). With $z = x + iy$, the general transformation reads:

$$z \to z' = \frac{az + b}{cz + d}. \tag{15}$$

In (15), a, b, c and d are real numbers such that $ad - bc = 1$. The latter condition puts the map into a kind of canonical form. Actually, the map is not changed by multiplication of all four parameters a, b, c and d by a non-zero real constant, a multiplication that changes the value of $ad - bc$. Setting this to 1 selects one representative among all possible choices of (a, b, c, d) for the same transformation. Because the map $z \to z'$ is analytic, it conserves the angles (it is a "conformal" map). That it is a group is checked directly: the product of two maps with parameters a, b, c, d and a', b', c', d' has parameters $a'' = aa' + bc'$, $b'' = ab' + bd'$, $c'' = ca' + dc'$, $d'' = cb' + dd'$, and one has $a''d'' - b''c'' = (ad - bc)(a'd' - b'c')$, which shows that the condition of unit determinant is conserved by the product. The unit element is a = d = 1, b = c = 0. The inverse is computed by getting z as a function of z'.

The map (15) comes naturally into play in the Poincaré half-plane. As we show, it indeed leaves invariant the length element defined in (12). It is simpler to show this for the transformations close to the identity, that is, for maps of the form $z' = z + \epsilon f(z)$, with $\epsilon$ small and real and f(z) a function of z to be defined. First of all, one notices that because of its form, the metric is obviously invariant under translations (small or large) in the x direction as well as under dilations from the origin. The corresponding small maps correspond to f(z) = 1 for translations and to f(z) = z for dilations. The last possibility for f(z) corresponds to a = d = 1, b = 0 and $c = -\epsilon$, namely to $f(z) = z^2$. To first order in $\epsilon$, the Euclidean length element becomes

$$dx'^2 + dy'^2 = dz'\,d\bar{z}' = dz\,d\bar{z}\left(1 + 2\epsilon(z + \bar{z})\right),$$

while the denominator $y^2$ becomes

$$y'^2 = -\frac{1}{4}(z' - \bar{z}')^2 \approx -\frac{1}{4}(z - \bar{z})^2\left(1 + 2\epsilon(z + \bar{z})\right),$$

which shows that, to first order in $\epsilon$, $\frac{dx^2 + dy^2}{y^2}$ is left unchanged.

We shall end this short account of some elementary properties of the Poincaré half-plane by mentioning that it underlies the theory of spinors in
quantum mechanics [4]. Also, we notice that the associated metric has a non-trivial isometry group. The general question of finding the full isometry group associated with a quadratic form seems to be quite difficult [b] (besides the now obvious condition that the map has to be between points with the same Gauss curvature). Think for instance of the upper half-plane with the indefinite metric $ds^2 = \frac{dx^2 - dy^2}{y^2}$. The geodesics now become pieces of hyperbolae, but it is not obvious that the isometry group is bigger than its obvious part, the dilations and the translations in the x direction. However, there is a chance that the isometry group is big, since this indefinite metric (meaning that the squared length can be positive, zero or negative) has constant positive Gauss curvature equal to +1.

3 Thin plate elasticity
The Theorema egregium gives the condition under which a surface is mapped into another surface while keeping the distances invariant. This has an obvious application in the elasticity of thin plates and shells, since one naturally expects such deformations to keep the strain energy at its minimum: if the length along the ideal zero-thickness surface is unchanged, any energy of deformation will be due to the small thickness of the plate, that is to its 3D structure. This elastic energy will likely be some sort of small correction in a systematic expansion in the small parameter represented by the plate thickness (or, more precisely, by the aspect ratio). However, the mechanical problem is not such a straightforward application of the geometrical concepts, for two reasons. First, the energy of deformation may involve at the same order in-plane stretching effects and (3D) bending effects (which are absent in a purely geometrical description). Second, the Gauss-inspired differential geometry relies upon the crucial assumption of smoothness of the surface to establish the condition of isometric mapping (actually $C^2$-smoothness in technical terms, meaning existence and continuity of the tangent plane). However it happens quite often that a sheet or a shell under strong deformation will actually minimize its elastic energy by becoming singular in the sense of the 2D geometry of surfaces, the regularization being due to the three dimensional structure: for instance, a crumpled plate will get a radius of curvature of the order of magnitude of its thickness $h$, something that would become a singular ridge in the zero-thickness, ideal-plate limit. If one crumples a piece of paper (or better a

b An interesting, if not important, question related to this topic is the formation of wrinkles on human skin. Facial wrinkles may be seen as due to the fact that no isometric sliding of the skin exists on the surface of the skull.
12 B. Audoly & Y. Pomeau
sheet of transparency, a much better material than paper for this purpose) one can notice crescent-like marks (or scars) left on the paper, once it is uncrumpled. Those scars show up at very localized places, manifesting a focusing of stresses such that the elastic limit of the material (polymethacrylate for a piece of transparency) has been so far exceeded that some irreversible phenomenon took place there, leaving a scar on the otherwise smooth sheet. A very simple observation explained in [6] shows (although in a rather non trivial way) how differential geometry and elasticity are related there. This kind of remark is more than anecdotal, as it is also relevant for car crashes, where the body is strongly deformed at well-defined locations, due to the same phenomenon of stress focusing. The folds on fabrics flattened by ironing fall in the same class.

A plate is a piece of thin elastic material of constant thickness (this latter condition could be relaxed at the price of a great notational burden). In its rest state a plate is a plane. Although the equations of elasticity for thin rods were derived by Euler in the eighteenth century, it took another century and a half to derive the equations of elasticity of thin plates, something done in 1905 by Föppl. Since those equations are widely known as the von Kármán equations, we shall call them hereafter the FvK equations. They allow one to compute consistently the energy of deformation of a thin plate of elastic material in the Hookean case^c. By definition, a thin plate is a piece of 3D elastic material of thickness much less than its width in any other dimension. In this limit of a large aspect ratio, and by assuming that the plate remains almost horizontal, the first FvK equation for an equilibrium deformation reads:

$$\frac{h^3 E}{12(1-\sigma^2)}\,\Delta^2\zeta - h\,[\zeta,\chi] = 0, \tag{16}$$
where $\chi$ is the so-called Airy potential and $\zeta(x,y)$ the (small) deviation of the plate out of the $z = 0$ plane ($x$, $y$ are the Cartesian coordinates in this plane), $h$ is the thickness of the plate and $E$ ($\sigma$) its Young modulus (Poisson ratio). Moreover $\Delta^2$ stands for the bilaplacian (that is the square of the Laplacian):

$$\Delta^2 = \frac{\partial^4}{\partial x^4} + \frac{\partial^4}{\partial y^4} + 2\,\frac{\partial^4}{\partial x^2\,\partial y^2}.$$

c The Hookean approximation forbids one to consider the irreversible deformations alluded to above. Considering them would require taking into account irreversible phenomena in elasticity, something that is much beyond the scope of this work. Note however that this occurrence of scars, for instance, is a way to study, at least indirectly, the irreversible behavior of solids without breaking large pieces of material.
Elasticity and Geometry
13
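Equation (16) is to be completed by the second FvK equation:

$$\Delta^2\chi + E\,[\zeta,\zeta] = 0. \tag{17}$$

The symbol $[U,V]$ in (16,17), with $U$, $V$ either $\zeta(x,y)$ or $\chi(x,y)$ as above, is defined as

$$[U,V] = \frac{\partial^2 U}{\partial x^2}\frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 U}{\partial y^2}\frac{\partial^2 V}{\partial x^2} - 2\,\frac{\partial^2 U}{\partial x\,\partial y}\frac{\partial^2 V}{\partial x\,\partial y}. \tag{18}$$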
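As an illustration of the bracket defined in (18), the following minimal sketch evaluates $[U,V]$ by central finite differences. The function name `bracket`, the step `d`, the radius `R` and the two test shapes are assumptions introduced here for illustration; they are not part of the original text. For a shallow cap $\zeta = (x^2+y^2)/(2R)$ one expects $[\zeta,\zeta] = 2/R^2$, and for the saddle $\zeta = xy/R$ one expects $-2/R^2$.

```python
def bracket(U, V, x, y, d=1e-3):
    """Central-difference estimate of [U, V] = U_xx V_yy + U_yy V_xx - 2 U_xy V_xy."""
    def second_derivs(f):
        fxx = (f(x + d, y) - 2.0 * f(x, y) + f(x - d, y)) / d**2
        fyy = (f(x, y + d) - 2.0 * f(x, y) + f(x, y - d)) / d**2
        fxy = (f(x + d, y + d) - f(x + d, y - d)
               - f(x - d, y + d) + f(x - d, y - d)) / (4.0 * d**2)
        return fxx, fyy, fxy

    uxx, uyy, uxy = second_derivs(U)
    vxx, vyy, vxy = second_derivs(V)
    return uxx * vyy + uyy * vxx - 2.0 * uxy * vxy


R = 2.0  # hypothetical radius of curvature, chosen for illustration


def cap(x, y):
    """Shallow spherical cap: both principal curvatures equal to 1/R."""
    return (x * x + y * y) / (2.0 * R)


def saddle(x, y):
    """Saddle: principal curvatures +1/R and -1/R."""
    return x * y / R


bracket_cap = bracket(cap, cap, 0.3, -0.2)
bracket_saddle = bracket(saddle, saddle, 0.3, -0.2)
```

With this definition, $[\zeta,\zeta]$ is positive for the cap and negative for the saddle, in line with the sign of the Gaussian curvature in the small-slope limit.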
The FvK equations (16,17) describe the elasticity of a plate. We shall now examine three basic questions related to the FvK equations, the question of the boundary conditions being left to the next section:

i) These equations are the Euler-Lagrange equations for an energy functional (being derived through a consistent scheme from the general equations of elasticity, the Euler-Lagrange structure carries over).

ii) Although this looks like a simple question, the orders of magnitude consistent with the FvK equations are not so trivial. The main point we are going to show is that, for large deformations, the flexural term (that is $\frac{h^3E}{12(1-\sigma^2)}\Delta^2\zeta$ in (16)) is negligible, because it involves the largest power of the small thickness $h$. Without this term, and for convenient boundary conditions, the energy minimum is a purely geometrical problem that reduces to that of a developable surface. But this last problem may have no smooth solution, so that a more complicated "basic" (= without bending energy) solution has to be found, where the flexural term may become locally relevant (the radius of curvature of the surface becomes locally much smaller than its average value, a typical boundary layer situation).

iii) There is a rather tight connection between the proof we gave of the Theorema egregium and the structure of the FvK equations. This is shown below by deriving that part of the FvK equations that takes into account the stretching energy only. Moreover, the derivation presented below will permit us to introduce in the next Section the boundary conditions for the FvK equations in a kind of rational way.

3.1 Euler-Lagrange functional
As can be verified after integration by parts, and skipping for the moment the boundary terms (considered in the next Section), the FvK equations are
deduced by Euler-Lagrange variation with respect to $\chi$ and $\zeta$ of the functional:

$$F(\zeta,\chi) = \int_A dx\,dy \left[\frac{h^3 E}{24(1-\sigma^2)}\,(\Delta\zeta)^2 - \frac{1}{2E}\left((\Delta\chi)^2 - \chi\,[\zeta,\zeta]\right)\right]. \tag{19}$$

This integral is over the whole area of the plate $A$, again assumed to be close to the plane $z = 0$. One can get rid of the Young modulus in (17) by redefining a scaled Airy potential $\chi' = \chi/E$, so that the functional to be studied becomes:

$$F'(\zeta,\chi') = Eh \int_A dx\,dy \left(\frac{h^2}{24(1-\sigma^2)}\,(\Delta\zeta)^2 - \frac{1}{2}\left((\Delta\chi')^2 - \chi'\,[\zeta,\zeta]\right)\right). \tag{20}$$
As this functional includes a cubic term, $\chi'[\zeta,\zeta]$, with the largest power (any other term is formally quadratic in either $\zeta$ or $\chi'$), this energy has no well defined sign and cannot be bounded from below (or from above). A simple trick circumvents this difficulty: let us consider $\chi'$ as a function of $[\zeta,\zeta]$ (that is of the Gaussian curvature as it enters in the second FvK equation (17)). Then, once this is done, the Airy stress function has disappeared from the energy integral, which now depends only on the shape of the plate given by $\zeta(x,y)$. From the algebraic point of view (that is by forgetting that $\zeta$ and $\chi'$ are functions of $x$ and $y$, and taking them as ordinary real variables), one can see the formal elimination of $\chi'$ as what would be done when looking for the (local) extremum of a function $F''(\zeta,\chi') = \frac{a}{2}\zeta^2 - \frac{1}{2}\chi'^2 + \chi'\zeta^2$, with $\chi'$, $\zeta$ ordinary real variables, and $a$ a coefficient (its role will be explained shortly). Indeed $F''$ has no absolute minimum because its highest power is the cubic $\chi'\zeta^2$, which can be made arbitrarily large, negative or positive, and much larger in absolute value than any other term in $F''$ by taking $\chi'$ large negative and $\zeta$ large, positive or negative. However, this function has a stationary point with respect to $\chi'$, found by solving for $\chi'$ the equation $\partial F''/\partial\chi' = 0$, which gives $\chi' = \zeta^2$. Once put into $F''$, this gives $F'' = \frac{a}{2}\zeta^2 + \frac{1}{2}\zeta^4$, which has one or three local extrema as a function of $\zeta$, depending on the sign of $a$: if it is positive, then the unique minimum of $F''$ is at $\zeta = 0$, while if $a$ is negative, the extremum of $F''$ at $\zeta = 0$ becomes a maximum, and two new minima at $\zeta^2 = -a/2$ have a lower energy than $F''(\zeta = 0) = 0$. The trick is now extended to the functional case, the role of $a$ being played by the boundary and loading conditions: if the plate is pushed on its lateral boundaries, the solution $\zeta = 0$ may become a maximum of the elastic energy. From this solution, a non zero minimum branches off with a finite plate deformation when $a$ changes sign.
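The algebraic version of the trick can be sketched in a few lines. This is a minimal illustration, assuming the toy function $F''(\zeta,\chi') = \frac{a}{2}\zeta^2 - \frac{1}{2}\chi'^2 + \chi'\zeta^2$ written above; the function names are hypothetical and chosen here only for readability.

```python
import math


def toy_energy(zeta, chi, a):
    """Toy function F''(zeta, chi') = a*zeta^2/2 - chi'^2/2 + chi'*zeta^2."""
    return 0.5 * a * zeta**2 - 0.5 * chi**2 + chi * zeta**2


def eliminate_chi(zeta):
    """Solve dF''/dchi' = -chi' + zeta^2 = 0 for chi'."""
    return zeta**2


def reduced_energy(zeta, a):
    """F'' after elimination of chi'; equals a*zeta^2/2 + zeta^4/2."""
    return toy_energy(zeta, eliminate_chi(zeta), a)


def minima(a):
    """Positions of the minima of the reduced energy as a function of zeta."""
    if a >= 0:
        return [0.0]
    z = math.sqrt(-a / 2.0)  # from a*zeta + 2*zeta^3 = 0 with zeta != 0
    return [-z, z]
```

For instance, `minima(1.0)` contains only `0.0`, whereas `minima(-2.0)` returns two symmetric points at which `reduced_energy` is negative, i.e. lower than its value at $\zeta = 0$.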
Once $F'$ has been made stationary by tuning $\chi'$ at fixed, but arbitrary, $\zeta(x,y)$, it remains, as in the case just studied,
to vary the shape, i.e. the function $\zeta(x,y)$, until one reaches the minimum of the functional $F'$ (which has a true minimum now, because its highest power in $\zeta$ is a positive quartic, like the function $\frac{1}{2}\zeta^4$ in $F''$). Therefore, one has to consider the integral

$$\int dx\,dy \left(-\frac{1}{2}(\nabla^2\chi')^2 - \chi'\,[\zeta,\zeta]\right) \tag{21}$$

as a functional of $[\zeta,\zeta] = g(\mathbf{r})$ only, by putting into it the value of $\chi'$ that is the solution of $\Delta^2\chi' + [\zeta,\zeta] = 0$. The last two terms in $F'(\zeta,\chi')$ become the following quadratic functional in $g(\mathbf{r})$:

$$\mathop{\mathrm{extr}}_{\chi'} \int dx\,dy \left(-\frac{1}{2}(\nabla^2\chi')^2 - \chi'\,[\zeta,\zeta]\right) = \frac{1}{2}\int dx\,dy \int dx'\,dy'\; K(\mathbf{r},\mathbf{r}')\,g(\mathbf{r})\,g(\mathbf{r}'), \tag{22}$$

where $K(\mathbf{r},\mathbf{r}')$ is the Green's function of the bilaplacian for the equation (17), a positive definite operator. This operator is positive because the operator $\Delta^2$ is itself positive: multiplying (17) by $\chi'$ and integrating over $dx\,dy$, one gets the integral of a square after integration by parts, which shows that $\Delta^2$ is positive, as is its inverse $K(\mathbf{r},\mathbf{r}')$.

3.2 Scaling of the FvK equations
We have in mind plates that are deformed because they are constrained, for instance, to follow a given non planar curve. This implies in particular that the strength of an external force is not the scaling parameter, contrary to the case considered by Landau [7]. Accordingly, the scaling follows from the form of the Euler-Lagrange functional given in (20):

$$F'(\zeta,\chi') = Eh \int_A dx\,dy \left(\frac{h^2}{24(1-\sigma^2)}\,(\Delta\zeta)^2 - \frac{1}{2}\left((\Delta\chi')^2 - \chi'\,[\zeta,\zeta]\right)\right). \tag{23}$$

The only parameters in $F'(\zeta,\chi')$ are length scales [$(1-\sigma^2)$ is a pure number assumed to be of order 1]: the plate thickness $h$ and some typical length scale coming from the boundary condition, which will be denoted by $R$ and could be seen for instance as the typical value of the inverse torsion of the (non planar) curve that bounds the plate. With this assumption, the radius of curvature of the surface should also be of order $R$, so that $[\zeta,\zeta] \approx 1/R^2$. Balancing the second and third terms in the integrand of (23), and taking $\Delta \approx 1/R^2$, one gets $\chi' \approx R^2$, so that the order of magnitude of the first term in the integrand of (23) is $h^2/R^2$, while the order of magnitude of the second and last ones is 1. In the limit where the FvK equations apply, that is for $h/R$ small compared to 1, the flexural term, $\frac{h^2}{24(1-\sigma^2)}(\Delta\zeta)^2$ in (23), is negligible. However, as we shall
see, by neglecting this bending energy, one may find Euler-Lagrange minima represented by non $C^2$-smooth functions $\zeta$, which makes $\frac{h^2}{24(1-\sigma^2)}(\Delta\zeta)^2$ infinite at some well defined points and/or along some curves. This flexural term cannot be neglected there and another local scaling has to be used. This is what defines in general (although in a rather loose sense) the inner problem in an application of the boundary layer method: some well defined approximation can be used almost everywhere in the solution of a PDE, because it amounts to neglecting terms that are formally small. But it may happen (and actually this happens quite often) that the solution of this "outer problem" becomes singular somewhere. Wherever this happens the neglected terms are no longer negligible. The region where another approximation scheme has to be called upon is the boundary layer or "inner region". The scaling of the inner solution does not depend upon the outer scale $R$, and is such that all terms in the FvK equation are of the same order. This is because of the general principles of scaling theory: here the outer solution is defined by the balance of the second and last terms in the FvK energy, both representing the stretching energy. Near the singularities, the first term diverges, while it was neglected in the outer solution. Therefore the scaling of the inner solution is defined by the constraint that the first term of FvK (the flexural term) is of the same order of magnitude as the second and third ones. This balance is indeed realized by the condition that every length scale is of order $h$, and that $\chi' \approx h^2$ for the inner solution. This inner scaling seems to contradict the basic assumption of the FvK theory, namely that the radius of curvature of the plate is everywhere much larger than $h$. However this can be fixed by introducing a dimensionless parameter, controlling the local departure from planarity near the singularity.
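The outer and inner balances described above can be summarized by a rough order-of-magnitude sketch. It assumes, as in the text, that every length (including the deflection) scales like a single length `L`, and it drops all numerical factors; the function name and the sample values are illustrative assumptions only.

```python
def fvk_term_orders(h, L):
    """Orders of magnitude of the three terms in the integrand of (23)
    when all lengths scale like L (numerical factors dropped)."""
    zeta = L                  # deflection ~ L, so that the curvature is ~ 1/L
    laplacian = 1.0 / L**2    # the Laplacian acts like 1/L^2
    gauss = (zeta * laplacian)**2   # [zeta, zeta] ~ 1/L^2
    chi = L**2                # from balancing (Delta chi')^2 with chi' [zeta, zeta]
    bending = h**2 * (laplacian * zeta)**2
    stretching = (laplacian * chi)**2
    coupling = chi * gauss
    return bending, stretching, coupling


# Outer problem: L = R much larger than h, the bending term is ~ (h/R)^2.
outer = fvk_term_orders(h=0.5, L=4.0)
# Inner problem: L = h, all three terms are of the same order.
inner = fvk_term_orders(h=0.5, L=0.5)
```

In this sketch the first entry of `outer` is $(h/R)^2$ while the other two are of order one, and all three entries of `inner` are of order one, which is the inner scaling stated above.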
If the singularity is an ordinary fold, the small parameter is just $(\pi - \theta_c)$, $\theta_c$ being the fold angle (> 0), defined such that $\theta_c = \pi$ for a plane. Another type of singularity, compatible with developability in the sense that lengths are conserved, but without requirement of smoothness, is the so-called "d-cone" [6]. This is a cone such that a circle drawn with its center at the tip of the cone will have a perimeter equal to $2\pi$ times its radius, as on a Euclidean plane. Like the fold mentioned before, such a d-cone can be assumed flat enough if its angle at the tip is close to $\pi/2$, $\pi/2$ being the angle for a flat plane. If this angle is so chosen, the radius of curvature near the tip is of order $h/\theta_c$, and can be made as large as wanted for $\theta_c$ small, although it scales with $h$ as required by arguments following from the boundary layer analysis. To show that the stress is focused near the singularities requires more refined estimates. Actually the stress scales like a second derivative of $\chi'$, and
so it does not seem to make a difference to have $\chi'$ of order $h^2$ in an area of size $h$ or $\chi'$ of order $R^2$ in an area of size $R$, because in both cases the order of magnitude of the second derivative of $\chi'$ is 1. The situation is in fact more complicated than that, since the outer solution is a piece of developable surface for which $\chi'$ vanishes at the order $R^2$, and so does not yield a contribution to the stress at the order of magnitude given by a simple estimate. We reconsider this question of stress focusing later on, when discussing in more detail the structure of the solution in a specific example.

3.3 Geometry and the FvK equations
In this subsection, we show the connection between the general principles of the geometry of surfaces and the FvK equations. Gauss' "Theorema egregium" tells us whether a surface, once deformed in one way or another, can be mapped back onto its undeformed state without stretching. If no such stretching is needed, the elastic energy in the deformation will remain quite small. It will come from the bending part of the FvK energy, of order $(h/R)^2$ compared to the stretching part. The FvK equations allow one to say more. Given some departure from the Gauss condition (pointwise equality of the Gaussian curvatures on the disturbed and undisturbed surfaces), the FvK equations give a way to measure the elastic energy stored in the deformation. Their complicated structure comes from the non local character of the Theorema egregium. A simple example shows this point: take a regular cone, with a circular base. Its Gaussian curvature is zero almost everywhere. At almost any point on its surface, one of the principal curvatures is indeed zero, the one along the straight generatrices stemming from the tip, so that the Gaussian curvature is zero, except at the tip where it is undefined. Let us now try to flatten this cone onto a plane. Even though its Gaussian curvature is zero almost everywhere, this is obviously impossible without a large amount of stretching (or tearing...). Think of a circle of radius $R$ with its center at the tip. The perimeter of this circle, drawn on the cone, is $2\pi R\sin\alpha_c$, where $\alpha_c$ is half the cone angle at the tip. For $\alpha_c = \pi/2$, the cone is flat, and one recovers the usual perimeter/radius relation on a plane. Otherwise, the perimeter is less than its value on a flat surface, and a large azimuthal stretching is needed to bring the cone onto the plane.
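The perimeter argument for the cone can be made concrete with a short sketch. It only uses the relation $2\pi R\sin\alpha_c$ quoted above; the function name and the definition of the "stretch" as the relative lengthening needed to reach the flat perimeter $2\pi R$ are assumptions made here for illustration.

```python
import math


def cone_circle(radius, half_angle):
    """Circle of radius `radius` (measured along the cone) centred at the tip.

    Returns its perimeter on the cone and the relative azimuthal stretching
    needed to bring it to the flat-plane perimeter 2*pi*radius.
    """
    perimeter = 2.0 * math.pi * radius * math.sin(half_angle)
    flat_perimeter = 2.0 * math.pi * radius
    stretch = flat_perimeter / perimeter - 1.0
    return perimeter, stretch


_, stretch_small_circle = cone_circle(0.1, math.pi / 6)
_, stretch_large_circle = cone_circle(10.0, math.pi / 6)
```

In this estimate the required stretch does not depend on the radius of the circle: the two values above coincide, and both vanish only when the half-angle is $\pi/2$.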
From the geometrical point of view, the Gauss condition (zero Gaussian curvature on both surfaces) is violated at one point only (the tip); nevertheless, the elastic energy needed to pull the cone onto a plane is not local at all: the cone needs to be stretched everywhere to be flattened on the plane. To implement those ideas in a computational scheme, that is to express
Figure 2. Parameterization of a mapping of a plane onto a given surface.
the amount of stretching needed to pull a plane onto a surface of given Gaussian curvature, we start as before from an arbitrary pointwise mapping of a plane to a surface. This depends on three functions of $x$ and $y$: $u$, $v$ and $Z$, such that $x$ is mapped into $x \to x' = x + u(x,y)$, $y$ into $y \to y' = y + v(x,y)$ and $z = 0$ into $z = Z(x,y)$ (see fig. 2). The amount of stretching is measured by the departure of the length element in the deformed plate from its rest value. This is precisely the kind of quantity we had set to zero to impose isometric mapping, and to show the Theorema egregium. The length element $ds = \sqrt{dx^2 + dy^2}$ is changed by the mapping into $ds'$ such that:

$$ds'^2 = dx'^2 + dy'^2 + dz^2 = dx^2 + dy^2 + \omega_{xx}\,dx^2 + \omega_{yy}\,dy^2 + 2\,\omega_{xy}\,dx\,dy,$$

where the coefficients $\omega_{xx}$, $\omega_{yy}$ and $\omega_{xy}$ describe the straining arising from the mapping $(x,y,0) \to (x',y',z)$. For the case of an almost horizontal surface ($u$ and $v$ are then small), the components of the strain are the quantities which appeared in equations (2-4):

$$\omega_{xx} = 2\,\frac{\partial u}{\partial x} + \left(\frac{\partial Z}{\partial x}\right)^2, \qquad \omega_{yy} = 2\,\frac{\partial v}{\partial y} + \left(\frac{\partial Z}{\partial y}\right)^2, \qquad \omega_{xy} = \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} + \frac{\partial Z}{\partial x}\frac{\partial Z}{\partial y}.$$

According now to the general principle of Hookean elasticity, one writes the elastic energy of deformation as a quadratic form in the strain parameters $\omega_{ij}$. This quadratic form is not the most general, as it is constrained by the geometrical symmetries of the problem. The next step in the calculation is formally very simple to do, but far less easy to justify. Notice first that the quantities like $\partial u/\partial x$ that appear in $\omega_{ij}$ are dimensionless, and so can be considered as small by themselves (that is without being
compared to another physical parameter). Almost no material, besides rubber and its kin, can stand a strain of order unity: usually most solid materials will break, or at least be irreversibly damaged, when strained above some small limit, like $10^{-3}$ or $10^{-2}$. Therefore it seems to be a good approximation to neglect the quadratic terms in the above expressions for the $\omega$'s. The difficulty there is that the nonlinear terms are also necessary to account for transformations that do not strain the material at all, the ordinary rotations. To show this, let us compute $\omega_{xx}$ under a pure in-plane rotation of angle $\theta$. In this case, $u = x(\cos\theta - 1) + y\sin\theta$ and $v = y(\cos\theta - 1) - x\sin\theta$. The complete expression for $\omega_{xx}$, which includes the terms nonlinear in the derivatives of $u$ and $v$, reads for plane strain:

$$\omega_{xx} = 2\,\frac{\partial u}{\partial x} + \left(\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial x}\right)^2\right).$$

Putting in the expression of $\omega_{xx}$ just given the value of the derivatives $\partial u/\partial x$ and $\partial v/\partial x$ for a plane rotation, one gets $\omega_{xx} = 2(\cos\theta - 1) + ((\cos\theta - 1)^2 + \sin^2\theta)$, which vanishes, as it should, if one includes all terms in the expression, coming from the linear piece, $2(\cos\theta - 1)$, and from the nonlinear piece, $((\cos\theta - 1)^2 + \sin^2\theta)$, as well. Therefore, the expression of the strain limited to the linear part in the derivatives of the displacement can only be true if one disregards large scale rotations: such a large scale rotation would lead to a non zero contribution to $\omega_{xx}$, when limited to its linear part, although no such term actually exists. This puts a strong limitation on the applicability of the theory presented below, even within the range of validity of linear elasticity. It is indeed possible to write the FvK equations in a covariant form that takes into account the possibility of large scale rotations with no contribution to the physical straining. Although this sort of equation has its interest, being the most general one for this sort of problem, its complicated form makes it quite unsuitable for practical calculations. Therefore, we shall limit ourselves for the moment to this linear approximation, that is to the expressions of the components of $\omega_{ij}$ limited to terms linear in the derivatives of $u$ and $v$. In contrast, the vertical deflection, $Z$, does not show up at all in the strain at the linear order^d, so the nonlinearities in $Z$ ($\zeta$ in the FvK equations) have to be kept.

d This is because any purely normal deformation of a plane is isometric to first order in the deformation.
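The cancellation under a rigid rotation can be checked numerically with a minimal sketch. It assumes the displacement field of a pure in-plane rotation given above, $u = x(\cos\theta - 1) + y\sin\theta$ and $v = y(\cos\theta - 1) - x\sin\theta$; the function name is hypothetical.

```python
import math


def strain_xx_under_rotation(theta):
    """omega_xx for a rigid in-plane rotation of angle theta.

    Returns (linear part, full expression including the quadratic terms).
    """
    du_dx = math.cos(theta) - 1.0   # derivative of u = x(cos t - 1) + y sin t
    dv_dx = -math.sin(theta)        # derivative of v = y(cos t - 1) - x sin t
    linear = 2.0 * du_dx
    full = linear + du_dx**2 + dv_dx**2
    return linear, full


linear_part, full_strain = strain_xx_under_rotation(0.3)
```

For a rotation of 0.3 radian the linear part is close to $-\theta^2$, a spurious strain of a few percent, whereas the full expression vanishes to rounding accuracy.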
As already said, we shall assume that the energy of deformation brought by straining is quadratic in the parameters defining this deformation. In this particular field, this is known as Hooke's law, after Robert Hooke, a contemporary (but by far not a friend) of Newton, who made his place in the history of Science with a particularly short sentence, now called Hooke's law. He wrote it in Latin as an anagram (that is by putting together the letters of the sentence). The Latin rendering of the anagram is ut tensio sic vis, meaning approximately that the forces or stresses (vis) are proportional (sic) to the strain (tensio). One can only admire how efficient Latin (the lingua franca of Science for centuries) was at carrying information. Anyway, Hooke's law (if one assumes that, in the anagram, sic means "proportional", which is not completely obvious) is transformed, when one goes to the energies, into assuming an energy proportional to the square of the strains. Since the components of $\omega$ transform under rotations like the corresponding products of coordinates, that is like $xx$, $xy$ and $yy$, it is easy to guess the quantities that will remain unchanged under rotations and will be products of two components of $\omega$: they will have to be combinations of fourth powers of $x$ and $y$ that are invariant under rotations. Those combinations are $(x^2 + y^2)^2$ and its expanded form $x^4 + y^4 + 2x^2y^2$. To the first corresponds the square of the trace of $\omega$, that is $(\omega_{xx} + \omega_{yy})^2$. To the second, the combination $(\omega_{xx})^2 + (\omega_{yy})^2 + 2(\omega_{xy})^2$. One may check too that under rotations, the two quadratic forms $(\omega_{xx} + \omega_{yy})^2$ and $(\omega_{xx})^2 + (\omega_{yy})^2 + 2(\omega_{xy})^2$ remain separately invariant, as expected. The elastic energy should make no difference between systems of coordinates rotated by an arbitrary angle (assuming again an isotropic plate). Therefore it should be a linear superposition of those two quadratic invariants.
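The invariance of the two quadratic forms can be verified with a small sketch. It assumes that $\omega$ transforms as a symmetric rank-two tensor with components $\omega_{xx}$, $\omega_{xy}$, $\omega_{yy}$; the function names and the sample values are illustrative.

```python
import math


def rotate_strain(wxx, wyy, wxy, theta):
    """Components of the symmetric tensor [[wxx, wxy], [wxy, wyy]]
    in axes rotated by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    rxx = c * c * wxx + 2.0 * c * s * wxy + s * s * wyy
    ryy = s * s * wxx - 2.0 * c * s * wxy + c * c * wyy
    rxy = (c * c - s * s) * wxy + c * s * (wyy - wxx)
    return rxx, ryy, rxy


def invariants(wxx, wyy, wxy):
    """The two quadratic forms: squared trace and sum of squared components."""
    return (wxx + wyy)**2, wxx**2 + wyy**2 + 2.0 * wxy**2


before = invariants(0.01, -0.02, 0.005)
after = invariants(*rotate_strain(0.01, -0.02, 0.005, theta=0.7))
```

Both entries of `before` and `after` agree to rounding accuracy, whatever the rotation angle.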
Since $\omega$ usually depends on position, this elastic energy is given by an integral over the surface of the plate:

$$U_{el} = h \int dx\,dy \left\{\frac{\lambda}{2}\,(\omega_{xx} + \omega_{yy})^2 + \mu\left((\omega_{xx})^2 + (\omega_{yy})^2 + 2(\omega_{xy})^2\right)\right\}, \tag{24}$$

where the quantities called $\lambda$ and $\mu$ are positive (otherwise the plate would lower its energy by spontaneously straining itself without any constraint coming from the boundary conditions or from some other outside stress source). In front of the integral giving the value of $U_{el}$ lies a factor $h$. This part of the energy is proportional to the thickness of the plate. Straining, at constant in-plane strain, a plate twice as thick as another one requires as much energy as straining two plates of the original thickness, whence the energy of straining is proportional to the thickness. Although this looks like a rather obvious result, it does not carry over to the case of the bending energy. The coefficients $\lambda$ and $\mu$ are called the Lamé coefficients, with the dimension of a pressure. This
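derivation of the FvK equations uses that $U_{el}$ can be written as

$$h \int_A dx\,dy\; e_{el}(u_{,x}, u_{,y}, v_{,x}, v_{,y}, Z_{,x}, Z_{,y}),$$

where $u_{,x}$ stands for $\partial u/\partial x$, and similarly for $u_{,y}$, $v_{,x}$, ... By definition:

$$e_{el} = \frac{\lambda}{2}\,(\omega_{xx} + \omega_{yy})^2 + \mu\left\{(\omega_{xx})^2 + (\omega_{yy})^2 + 2(\omega_{xy})^2\right\}.$$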
It remains now to derive from the expression of the energy the equations of equilibrium of thin plates. This is done in the usual way, that is by assuming that the energy is the lowest possible under the given external conditions. Those conditions may be seen for instance as the value of the deformation field at the edges of the plate. The FvK equations, as derived from the energy (24), will include the stretching part of the FvK energy only. This is because no contribution coming from the dependence of the strain across the plate is included. We shall derive this (bending) energy at the end. The equilibrium conditions are derived from the energy functional as the Euler-Lagrange conditions giving the energy minimum under variations of the relevant parameters. Those physical parameters are the two components, $u$ and $v$, of the in-plane displacement, plus the deviation $\zeta(x,y)$ in the $z$-direction. This will give three differential equations. However, the two equations coming from the Euler-Lagrange condition for the variations of $u$ and $v$ may be reduced to one single equation, thanks to the introduction of the Airy potential. The energy $U_{el}$ is a quadratic form in the derivatives of $u$ and $v$. Therefore, in the first variation with respect to $u$ and $v$, one will get two equations in the form:

$$\frac{\partial\sigma_{xx}}{\partial x} + \frac{\partial\sigma_{xy}}{\partial y} = 0, \tag{25}$$

$$\frac{\partial\sigma_{xy}}{\partial x} + \frac{\partial\sigma_{yy}}{\partial y} = 0, \tag{26}$$

where the stress tensor $\sigma$ is defined as follows. The Euler-Lagrange condition for stationarity of $U_{el}$ indeed reads

$$\frac{\partial}{\partial x}\frac{\partial e_{el}}{\partial u_{,x}} + \frac{\partial}{\partial y}\frac{\partial e_{el}}{\partial u_{,y}} = 0,$$

which results from the condition of stationarity with respect to variations of $u$, and another similar condition for the variations with respect to $v$. By
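identification with (25, 26), one gets:

$$\sigma_{xx} = \frac{\partial e_{el}}{\partial u_{,x}}, \tag{27}$$

$$\sigma_{yy} = \frac{\partial e_{el}}{\partial v_{,y}}, \tag{28}$$

$$\sigma_{xy} = \frac{\partial e_{el}}{\partial u_{,y}} = \frac{\partial e_{el}}{\partial v_{,x}}. \tag{29}$$

That $\sigma_{xy} = \sigma_{yx}$ follows from the fact that $e_{el}$ depends on $u_{,y}$ and $v_{,x}$ [...] while we expect no free function at all. Therefore, $\chi$ should satisfy an equation deduced from Cauchy-Poisson. Actually this equation is found by noticing that the definition of $\chi$ as a function of $u$, $v$ and $Z$ involves three relations. Those three relations allow one to get the three components of the strain, $\omega_{xx}$, $\omega_{yy}$ and $\omega_{xy}$, as functions of $\chi$. The result is:

$$\sigma_{xx} = \frac{\partial^2\chi}{\partial y^2} = 2(\lambda + 2\mu)\,\omega_{xx} + 2\lambda\,\omega_{yy}, \tag{30}$$

$$\sigma_{yy} = \frac{\partial^2\chi}{\partial x^2} = 2(\lambda + 2\mu)\,\omega_{yy} + 2\lambda\,\omega_{xx}, \tag{31}$$

$$\sigma_{xy} = -\frac{\partial^2\chi}{\partial x\,\partial y} = 4\mu\,\omega_{xy}. \tag{32}$$

One deduces from this system the components of the strain $\omega$ as functions of the second derivatives of $\chi$:

$$\omega_{xx} = \frac{(\lambda + 2\mu)\,\chi_{,yy} - \lambda\,\chi_{,xx}}{8\mu(\lambda + \mu)}, \tag{33}$$

$$\omega_{yy} = \frac{(\lambda + 2\mu)\,\chi_{,xx} - \lambda\,\chi_{,yy}}{8\mu(\lambda + \mu)}, \tag{34}$$

and

$$\omega_{xy} = -\frac{\chi_{,xy}}{4\mu}. \tag{35}$$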
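Indeed those three equations begin to look like the ones we met before when proving the Theorema egregium: this theorem states the condition under which the components of the strain are all zero, which is equivalent to saying that the strain $\omega$ vanishes, that is precisely a condition deduced from the cancellation of the three components of $\omega$ just written in (33, 34 and 35). When proving the Theorema egregium, we have used the identity

$$\frac{\partial^2\omega_{xx}}{\partial y^2} + \frac{\partial^2\omega_{yy}}{\partial x^2} - 2\,\frac{\partial^2\omega_{xy}}{\partial x\,\partial y} = 2\left(\frac{\partial^2 Z}{\partial x\,\partial y}\right)^2 - 2\,\frac{\partial^2 Z}{\partial x^2}\frac{\partial^2 Z}{\partial y^2}.$$

Therefore, by applying the same combination of second derivatives to the right hand side of the expression of $\omega$ in terms of the derivatives of $\chi$, one obtains an equation relating the Airy potential (actually a linear combination of its fourth derivatives with respect to $x$ and $y$) and the Gaussian curvature. One gets the unpolished expression:

$$\frac{\partial^2 (Z_{,x}^2)}{\partial y^2} + \frac{\partial^2 (Z_{,y}^2)}{\partial x^2} - 2\,\frac{\partial^2 (Z_{,x}Z_{,y})}{\partial x\,\partial y} = \frac{1}{8\mu(\lambda + \mu)}\left[(\lambda + 2\mu)\left(\frac{\partial^4\chi}{\partial x^4} + \frac{\partial^4\chi}{\partial y^4}\right) - 2\lambda\,\frac{\partial^4\chi}{\partial x^2\,\partial y^2} + 4(\lambda + \mu)\,\frac{\partial^4\chi}{\partial x^2\,\partial y^2}\right], \tag{36}$$

which becomes, after proper rewriting:

$$\Delta^2\chi + E\,[Z,Z] = 0,$$

with $E = \frac{8\mu(\lambda + \mu)}{\lambda + 2\mu}$. This is the second FvK equation (17). The non bending part of the first FvK equation is derived along very similar lines by writing the condition of stationarity of $U_{el}$ under variations of $Z$. This gives as a first result:

$$\frac{\partial}{\partial x}\Big(2\lambda Z_{,x}(\omega_{xx} + \omega_{yy}) + 4\mu Z_{,x}\,\omega_{xx} + 4\mu Z_{,y}\,\omega_{xy}\Big) + \frac{\partial}{\partial y}\Big(2\lambda Z_{,y}(\omega_{xx} + \omega_{yy}) + 4\mu Z_{,y}\,\omega_{yy} + 4\mu Z_{,x}\,\omega_{xy}\Big) = 0. \tag{37}$$

Now the first FvK equation follows by substituting into (37) the expression of the components of $\omega$ in terms of the second derivatives of $\chi$. The final result is bilinear in $Z$ and $\chi$ and is just the part of the first FvK equation not involving the bending, that is $[\chi, Z] = 0$.

3.4 Thin shells: an example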
We discuss in this Section the contact of a thin elastic shell (actually either a ping pong or a tennis ball) with a rigid plane. Experiments show a first order (discontinuous) transition between two different configurations: the first is characterized by a flat contact between the shell and the rigid plane, whereas
in the second case the shell inverts itself and makes contact with the plane along a circular ridge only. This striking phenomenon can be explained, although in qualitative terms, by comparing the energy of deformation of the two configurations. Below we shall first explain this phenomenon in qualitative terms only, and then we shall sketch a more quantitative approach that, rather surprisingly, uses the FvK equations. This is a bit surprising because the FvK equations concern flat plates without any rest curvature, whereas a ball is what is called a shell in elasticity theory, because of its equilibrium curvature.

In their book on Elasticity, Landau and Lifshitz present the solution of the problem of a spherical shell strained by a normal force pointing inwards, a solution due to Pogorelov [8]. When the intensity of the force is small, the deformation is localized near the point of application and grows linearly with the force. For large forces a circular fold appears around the point of application. In the latter case, the scaling laws for the radius of the fold can be deduced from order of magnitude estimates of the various contributions to the elastic energy. As shown by Pogorelov, if the applied force is $F$, then the deformation $\epsilon$ is proportional to the square of $F$ for $F$ large:

$$F \sim \frac{E h^2}{R}\,\epsilon \quad \text{for } \epsilon < h, \tag{38}$$

$$F \sim \frac{E h^{5/2}}{R}\,\epsilon^{1/2} \quad \text{for } \epsilon > h, \tag{39}$$

where $E$ is the Young modulus and $R$ the radius of the shell at rest. In the case studied by Pogorelov, the transition from linear to square-root behavior of $F$ at increasing $\epsilon$ is continuous, and happens roughly when $\epsilon$ is of order $h$. This concerns a force applied with a sharp point of radius of curvature much less than the shell radius. On the contrary, we consider below the opposite case, namely when the force has a wide support, for instance when a ball is crushed between two parallel planes, or in the dynamics of a tennis ball bouncing on a racket, both situations being similar to the classical Hertz contact problem. The experiments, as reported in [5], show two regimes, depending on the strength of the applied force:

i) For weak forces, the deformation is small as well, and part of the ball gets into contact with the rigid plate along a disc.

ii) For larger forces, the disc of contact unfolds and reverses itself to become an inverted spherical cap, the contact with the rigid plane being now along a circular ridge that separates the inverted cap from the rest of the
Figure 3. Geometry of the ball at large forces in configuration (II), left, and close-up view of the circular fold, right.
sphere (see fig. 3). The elastic energy is stored in the latter case almost exclusively in the fold.

Later on we shall call the two configurations (I) and (II). The transition takes place for a deformation (as measured by the change of distance to the center of the sphere in the radial direction) of order $2h$. The dependence of the force on the deformation is like $F \sim \epsilon$ for $\epsilon < 2h$ and $F \sim \epsilon^{1/2}$ for $\epsilon > 2h$, which shows two different behaviors. Furthermore, a hysteresis is observed as the force is first increased and then decreased, which implies a discontinuous transition from one regime to the other, as is observed too. A detailed interpretation of the experiments would require taking into account various irreversible phenomena in the material the balls are made of. We shall not consider this question here, and limit ourselves to the discussion of the purely elastic and reversible phenomena.

To estimate the elastic energies associated with configuration (I), one notices that this energy is made of two pieces: first, the flattening of the spherical cap changes the lengths measured along a piece of the shell, which goes from spherical (the rest state) to a flat disc. This involves a stretching energy, because the Gaussian curvature everywhere changes from $R^{-2}$ to 0. Secondly, a fold is created along the edge of the contact region, the contribution of which has to be considered.

The stretching contribution to the energy is computed as usual, as the integral of the square of the strain over the volume. The radius of the disc of contact is $r = \sqrt{2R\epsilon}$ for $\epsilon$ small. The straining of the length is estimated as follows. The unperturbed cap, bounded by a circular ridge of radius $r$, is a spherical disc of radius $R\alpha$ along the surface of the sphere, with $\sin\alpha = r/R$. This curvilinear distance $R\alpha$ becomes $R\sin\alpha$ once the cap is flattened to become a disc of radius $r$. Therefore, the strain (the derivative of a displacement), once integrated along a radius of this disc, is equal to the difference between the unperturbed and the perturbed length, namely to $R(\sin\alpha - \alpha) \approx -R\alpha^3/6$. The gradient is approximately this displacement divided by $r$, which gives, up to a numerical factor, a strain of order $R\alpha^3/r \approx r^2/R^2$, so that the elastic energy of strain is $2\pi E h \int dr\, r\, (r^2/R^2)^2$, $h$ being the thickness of the sphere, as usual, an integral computed over the disc of contact. Therefore, up to constants of order 1, this elastic energy is approximately

$$E h\,\frac{r^6}{R^4} \sim \frac{E h\,\epsilon^3}{R}.$$
Now we have to compare this energy to the energy stored in the ridge between the inverted cap and the unperturbed sphere. Actually, such a ridge exists whether the sphere rests upon the ridge or upon the full disc. In both cases (flat disc of contact or inverted cap) there is a jump of orientation of the tangent plane across the ridge. When there is a disc of contact [situation (I)] the jump is precisely the angle $\alpha$ we have just defined, while in case (II) (inverted cap) the discontinuity is twice $\alpha$. The straining energy of the flattened cap disappears in the case of the inverted cap, but the angular discontinuity near the ridge is bigger for the inverted cap by a factor of two, and so one expects a bigger contribution to the elastic energy as well.

The transition from configuration (I) to (II) requires some care to be properly understood. We have just estimated the elastic energy due to the flattening. The energy of the ridge is not so straightforward. It is computed in two steps. First one shows that the Gaussian curvature along the ridge is much bigger than the curvature of the unperturbed sphere. In a first approximation this allows one to neglect the fact that this sphere has a curvature at rest, and so to compute the energy of deformation by using the FvK theory, valid at zero rest curvature. The next step is to compute the order of magnitude of the Gaussian curvature in the ridge area. One of the radii of curvature, $R_1$, is the radius of curvature of the intersection of the deformed sphere with a plane passing along the axis of symmetry. This radius $R_1$ is the smallest radius, and it is of order $\delta/\alpha$, $\delta$ being the thickness of the ridge, a length scale that should tend to zero as $h$ tends to zero at constant $R$. The radius of curvature in the perpendicular direction is found by an argument borrowed from Pogorelov: this is the radius of curvature along a parallel. It goes from $+R$ on the surface of the undeformed sphere to $-R$ inside the inverted cap.
Therefore it remains of order $R$ everywhere inside the ridge area. Therein, the Gaussian curvature is of order $\alpha/(\delta R)$. To estimate $\delta$ one has to balance the stretching and bending energy contributions of the ridge. The density of bending energy per unit area is $E h^3 \alpha^2/\delta^2$, while the density of stretching energy is
$$E h \left(\frac{\alpha\,\delta}{R}\right)^2.$$
Therefore the two energies balance for $\delta \sim (hR)^{1/2}$, which yields, by putting this into the expression of the FvK energy, $U_{el,II} \sim E\,\epsilon^{3/2} h^{5/2}/R$ for the elastic energy of the ridge. Therefore the transition from the flat contact to the contact along the circular ridge takes place when the two elastic energies $U_{el,I}$ and $U_{el,II}$ are of the same order of magnitude, which happens for $\epsilon \sim h$, as observed. Indeed, it is essential that for "large" $\epsilon$ (meaning large compared to $h$), the elastic energy $U_{el,I}$ grows faster, like $\epsilon^3$, than $U_{el,II}$, which grows as $\epsilon^{3/2}$.
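The scaling argument above can be made concrete with a few lines of Python. The sketch below is only an illustration: all prefactors of order 1 are set to exactly 1 (so the crossover lands exactly at $\epsilon = h$), and the function names and the sample shell dimensions are hypothetical choices, not values taken from the experiments.

```python
import math

def energy_flat_contact(eps, h, R, E=1.0):
    """Configuration (I): stretching energy of the flattened cap, ~ E h eps^3 / R."""
    return E * h * eps**3 / R

def energy_ridge(eps, h, R, E=1.0):
    """Configuration (II): energy of the circular ridge, ~ E h^(5/2) eps^(3/2) / R."""
    return E * h**2.5 * eps**1.5 / R

def ridge_width(h, R):
    """Ridge width from the bending/stretching balance, ~ (h R)^(1/2)."""
    return math.sqrt(h * R)

def crossover_deformation(h, R, E=1.0):
    """Deformation where both estimates are equal, by bisection in log space."""
    lo, hi = 1e-6 * h, 1e6 * h
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if energy_flat_contact(mid, h, R, E) < energy_ridge(mid, h, R, E):
            lo = mid          # flat contact still cheaper: crossover is at larger eps
        else:
            hi = mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    h, R = 0.4e-3, 20e-3      # assumed thickness and radius (metres), for illustration
    print("ridge width delta ~", ridge_width(h, R))
    print("crossover eps / h  =", crossover_deformation(h, R) / h)
```

Below the crossover the flat contact (I) is the cheaper configuration, above it the inverted cap (II) wins, in line with the two force regimes quoted earlier.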
It is possible to go beyond this scaling approach and to show how the structure of the ridge can be related to a solution of the FvK equations in a well defined limit. For that purpose, one notices first that in the ridge area, the parameters defining the surface (for instance the orientation of the normal) vary much more rapidly in the direction perpendicular to the ridge (along which the radius of curvature $R_1$ is much smaller than $R$) than parallel to the ridge, with a radius of curvature of order $R$, as we have just seen. This gives the following idea: let us write the FvK equations in a local frame, in the plane tangent to the surface near the circular ridge. Let $x$ be the coordinate perpendicular to the local direction of the ridge and $y$ the local azimuthal coordinate. Because of the axisymmetry, any quantity depends locally on the combination $X = x - y^2/(2r)$, because the curve $X = 0$ is locally the circle drawn by the ridge. Therefore, the second derivative $\partial^2/\partial y^2$ is approximately $-\frac{1}{r}\frac{d}{dX}$, which yields
$$[\chi,\zeta] \approx -\frac{1}{r}\left(\frac{d^2\chi}{dX^2}\frac{d\zeta}{dX} + \frac{d\chi}{dX}\frac{d^2\zeta}{dX^2}\right).$$
At the dominant order, $\Delta^2\zeta \approx d^4\zeta/dX^4$, since a derivative with respect to $X$ brings in a factor of order $\delta^{-1}$, much bigger than $r^{-1}$. With those approximations the FvK equations become:
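$$D\,\frac{d^4\zeta}{dX^4} + \frac{h}{r}\,\frac{d}{dX}\left(\frac{d\chi}{dX}\frac{d\zeta}{dX}\right) = 0, \tag{40}$$
$$\frac{d^4\chi}{dX^4} - \frac{E}{2r}\,\frac{d}{dX}\left(\frac{d\zeta}{dX}\right)^2 = 0, \tag{41}$$
with the notation, rather standard in the field, $D = \frac{E h^3}{12(1-\sigma^2)}$. The system (40) and (41) can be readily integrated once, to give:
$$D\,\frac{d^3\zeta}{dX^3} + \frac{h}{r}\,\frac{d\chi}{dX}\frac{d\zeta}{dX} = 0, \tag{42}$$
$$\frac{d^3\chi}{dX^3} + \frac{E}{2r}\left[\alpha^2 - \left(\frac{d\zeta}{dX}\right)^2\right] = 0. \tag{43}$$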
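The constant of integration in (42) has been taken as zero, because far from the ridge (namely for $X \to \pm\infty$) the stress vanishes, which is proportional to the second derivative of $\chi$, while $d\zeta/dX$ tends to $\pm\alpha$ depending on which side of the ridge one considers. All dimensional parameters in (42) and (43) can be eliminated by the change of variables
$$X = \left(\frac{2 D r^2}{h E \alpha^2}\right)^{1/4}\tilde X, \qquad \frac{d\chi}{dX} = \alpha\left(\frac{E D}{2h}\right)^{1/2}\tilde\chi \qquad\text{and}\qquad \frac{d\zeta}{dX} = \alpha\,\tilde\zeta.$$
This gives the set of numerical coupled equations:
$$\frac{d^2\tilde\zeta}{d\tilde X^2} + \tilde\chi\,\tilde\zeta = 0, \tag{44}$$
and
$$\frac{d^2\tilde\chi}{d\tilde X^2} + 1 - \tilde\zeta^2 = 0, \tag{45}$$
where the boundary conditions are $\tilde\zeta \to \pm 1$ and $\tilde\chi \to 0$ as $\tilde X \to \pm\infty$; $\tilde\chi$ is an even function of $\tilde X$ and $\tilde\zeta$ is odd. Finally, those two equations derive from a Lagrange function.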
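One Lagrange function with this property can be written down explicitly. The expression below is a reconstruction offered as a check: it is fixed only up to an overall normalization, it is not necessarily the form used by the authors, and it assumes the scaled equations read $\tilde\zeta'' + \tilde\chi\tilde\zeta = 0$ and $\tilde\chi'' + 1 - \tilde\zeta^2 = 0$, with primes denoting $d/d\tilde X$.

```latex
\begin{align*}
\mathcal{L}(\tilde\zeta,\tilde\chi,\tilde\zeta',\tilde\chi')
  &= \tfrac12\,\tilde\zeta'^{\,2} - \tfrac14\,\tilde\chi'^{\,2}
     + \tfrac12\,\tilde\chi\,\bigl(1-\tilde\zeta^{2}\bigr), \\[4pt]
\frac{d}{d\tilde X}\frac{\partial\mathcal{L}}{\partial\tilde\zeta'}
  - \frac{\partial\mathcal{L}}{\partial\tilde\zeta}
  &= \tilde\zeta'' + \tilde\chi\,\tilde\zeta = 0, \\[4pt]
\frac{d}{d\tilde X}\frac{\partial\mathcal{L}}{\partial\tilde\chi'}
  - \frac{\partial\mathcal{L}}{\partial\tilde\chi}
  &= -\tfrac12\,\tilde\chi'' - \tfrac12\bigl(1-\tilde\zeta^{2}\bigr) = 0
  \;\Longleftrightarrow\; \tilde\chi'' + 1 - \tilde\zeta^{2} = 0 .
\end{align*}
```

The relative sign between the two kinetic terms shows that the functional is not bounded below, so the ridge profile is a stationary point of it rather than a minimum.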
It is not too difficult to check that this gives a complete solution of the problem of the elastic energy of the ridge, consistent with the order of magnitude estimate given before. Note too that the transition layer between the flattened disc and the sphere is described by the same equations, except that the boundary condition for $d\zeta/dX$ at $X$ tending to $+\infty$ is 0, and that it remains $\alpha$ for $X$ tending to $-\infty$, the limit $X$ large negative at the scale $\delta$ representing the unperturbed spherical surface, and $X$ large positive the flat disc of contact. The boundary conditions for the scaled equations are deduced in an obvious way for this case too.

4 Buckling in thin film delamination
To illustrate the use of the FvK equations, we study the patterns obtained during the delamination of thin coated films. Coated materials have many applications in industry: coatings make it possible to obtain wear-resistant metal-cutting tools, thermal barriers in the aircraft and automobile industry, insulating layers in microelectronics, etc. Coated materials are often obtained by vapor deposition of a thin film on a substrate at high temperature. The film and the substrate are made of different materials, and because of the mismatch of thermal expansion coefficients between them, the film acquires a biaxial residual stress $\sigma_0$ upon cooling. When this residual stress is compressive ($\sigma_0 < 0$), and large enough, an elastic instability may take place: to release its compression, the film tends to lift off the substrate, sometimes fracturing the interface [9,10]. Complex patterns of delamination have been observed, such as telephone cord blisters: under various experimental conditions, the delaminated portion of the film looks like a tunnel with wavy edges [11,12]. The aim of the present section is to understand these patterns qualitatively.

The instability leading to delamination results from a complex competition between the cohesion of the interface and the elasticity of the film, which itself involves both the bending and the stretching contributions in the energy (19). We choose to focus on the elastic effects, and do not attempt any analytical modelling of the film-substrate cohesion: we impose the location of the edge of the delaminated region. Despite this severe simplification, we shall be able to recover, and therefore interpret, the telephone cord-like patterns of delamination.

4.1 The straight sided blister
One of the simplest illustrations of the FvK equations is provided by the buckling of the straight, infinite elastic strip. In this first subsection, we consider a buckling mode which keeps the translational symmetry of the strip: the plate buckles into a 1D, i.e. cylindrical, pattern, as in fig. 4. More general buckling modes are discussed later in 4.2. We recall that $E$ and $\sigma$ are the Young's modulus and the Poisson ratio of the plate, respectively, while $h$ is its thickness, $b$ the half-width of the strip, and $\sigma_0$ the initial, biaxial compression. The elasticity of the strip is governed by the FvK equations (16) and (17). The Airy potential $\chi(x,y)$ satisfies, by definition:
$$\frac{\partial^2\chi}{\partial y^2} = \sigma_{xx}, \qquad \frac{\partial^2\chi}{\partial x^2} = \sigma_{yy}, \qquad \frac{\partial^2\chi}{\partial x\,\partial y} = -\sigma_{xy}. \tag{46}$$
The Airy potential, $\chi$, and the vertical deflection, $\zeta$, will be decomposed into a contribution due to the initial compression $\sigma_0$, indexed by "0", and a change from this initial state, due to buckling. The latter contributions are labelled by a prime:
$$\chi = \chi^0 + \chi', \qquad \chi^0 = -\sigma_0\,\frac{x^2+y^2}{2}, \tag{47}$$
$$\zeta = \zeta^0 + \zeta', \qquad \zeta^0 = 0. \tag{48}$$

Figure 4. Buckling of the elastic strip into the Euler column, solution of the FvK equations above the critical initial compression.
One can check that the above definition of $\chi^0$ indeed yields the uniform, biaxial compression $\sigma^0_{xx} = -\sigma_0$, $\sigma^0_{yy} = -\sigma_0$, $\sigma^0_{xy} = 0$ when plugged back into eq. (46). We shall henceforth drop the primes in $\chi'$ and $\zeta'$. Straightforward use of the FvK equations (16) and (17) yields the equations satisfied by the new unknown functions, $\chi$ and $\zeta$:
$$\frac{h^3 E}{12(1-\sigma^2)}\,\Delta^2\zeta + \sigma_0 h\,\Delta\zeta - h\,[\zeta,\chi] = 0, \tag{49}$$
$$\Delta^2\chi + \frac{E}{2}\,[\zeta,\zeta] = 0. \tag{50}$$
The only effect of the initial compression is in the new term ($\sigma_0 h\,\Delta\zeta$). These equations require boundary conditions along the edge of the blister (again, this edge is fixed and imposed: we do not discuss the selection of $b$). Because the film is much thinner than the substrate, clamped boundary conditions should be imposed along the edge:
$$u(x=\pm b, y) = 0, \qquad v(x=\pm b,y) = 0, \qquad \zeta(x=\pm b,y) = 0, \qquad \frac{\partial\zeta}{\partial x}(x=\pm b,y) = 0. \tag{51}$$
We now discuss the $y$-independent solutions of eq. (49-50) with the boundary conditions (51). Because we restrict our investigations to 1D profiles, there is no Gaussian curvature in the buckled configuration$^{e}$ (cylinders are developable surfaces); as a result, $[\zeta,\zeta] = 0$ in eq. (50). This equation therefore reduces to $\Delta^2\chi = \chi_{,x^4} + \chi_{,y^4} + 2\chi_{,x^2y^2} = 0$. Using the definition of $\chi$ in eq. (46), $\chi_{,y^4} = (\sigma_{xx})_{,yy}$ has to vanish, because of the assumed translational invariance in the $y$ direction. Similarly, $\chi_{,x^2y^2} = (\sigma_{yy})_{,yy}$ vanishes. Eq. (50) therefore reduces to $\chi_{,x^4} = 0$, which is readily integrated into:
$$\chi(x,y) = \frac{\chi_3}{6}\,x^3 + \frac{\chi_2}{2}\,x^2 + \chi_1\,x + \bar\chi(y), \tag{52}$$
where $\chi_{3,2,1}$ and $\bar\chi$ are unknown functions of $y$. From the $x \to (-x)$ symmetry and by $y$-invariance, $\chi_3 = 0$; by translational invariance, $\chi_2$ is a constant, which we call $\sigma_{yy}$ for obvious reasons. For the same reasons, $\bar\chi(y) = \sigma_{xx}\,y^2/2$; $\chi_1$ is a non-physical constant because only the second derivatives of $\chi$ appear in the stress. We finally end up with:
$$\chi(x,y) = \frac{\sigma_{yy}}{2}\,x^2 + \frac{\sigma_{xx}}{2}\,y^2, \tag{53}$$
where $\sigma_{xx}$ and $\sigma_{yy}$ are constants which remain to be determined. Quite remarkably, the tangential stresses in the buckled configuration are uniform along the Euler column. The solution of (49):
$$D\,\zeta^{(4)}(x) - h\,(-\sigma_0 + \sigma_{xx})\,\zeta''(x) = 0, \qquad \zeta(x=\pm b) = 0, \qquad \zeta'(x=\pm b) = 0 \tag{54}$$
is now easy. Note that we use the flexural rigidity of the plate, $D = h^3E/(12(1-\sigma^2))$, as introduced earlier. When $\sigma_0 - \sigma_{xx} > 0$:
$$\zeta(x) = \frac{\zeta_m}{2}\left[1 + \cos\left(\left[\frac{h(\sigma_0-\sigma_{xx})}{D}\right]^{1/2} x\right)\right]. \tag{55}$$
The inequality means that the final transverse component of the stress, $-\sigma_0 + \sigma_{xx}$, will remain compressive in the delaminated region, as suggested by intuition. Compatibility with the boundary conditions $\zeta(\pm b) = 0$ yields:
$$\left(\frac{h(\sigma_0-\sigma_{xx})}{D}\right)^{1/2} b = \pi. \tag{56}$$
Strictly speaking, the right side of this equation should be of the form $(2n+1)\pi$, with $n = 0, 1, \dots$ When $n \geq 1$, however, the film has several undulations in the $x$ direction, and the corresponding configuration is obviously unstable.

$^{e}$ This does not mean that tangential stresses are absent in the buckled configuration. Only the converse is true (in the absence of stretching, the deformed surface is almost everywhere developable).
Therefore, we only need to consider $n = 0$. Equation (56) gives $\sigma_{xx}$ as a function of $b$. The maximal deflection, $\zeta_m$, and the longitudinal component of the stress, $\sigma_{yy}$, still have to be found. To do this, the remaining boundary conditions have to be used: $u(\pm b,y) = 0$ and $v(\pm b,y) = 0$. Unfortunately, $u$ and $v$ have been eliminated from the formulation of the FvK equations, wherein the only unknowns are $\chi$ and $\zeta$. Therefore, we have to go back to the constitutive equations of the material, the Hooke law. As explained in §3.3, linear elasticity assumes proportionality between stress $\sigma_{ij}$ and strain $u_{ij}$. In 3D isotropic materials, these relations take the form:
$$u_{xx} = \frac{1}{E}\bigl(\sigma_{xx} - \sigma(\sigma_{yy} + \sigma_{zz})\bigr), \tag{57}$$
$$u_{yy} = \frac{1}{E}\bigl(\sigma_{yy} - \sigma(\sigma_{xx} + \sigma_{zz})\bigr), \tag{58}$$
$$u_{xy} = \frac{1+\sigma}{E}\,\sigma_{xy}. \tag{59}$$
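Since we consider elastic plates, all these quantities can be averaged over the thickness of the plate. Moreover, the free edge condition at $z = \pm h/2$ yields $\sigma_{iz} = 0$. Finally, in the definition of the strain, only the nonlinearities related to the $z$ component have to be retained (see §3.3). The relations above lead to:
$$\frac{\partial u}{\partial x} + \frac12\left(\frac{\partial\zeta}{\partial x}\right)^2 = \frac{1}{E}\,(\sigma_{xx} - \sigma\,\sigma_{yy}), \tag{60}$$
$$\frac{\partial v}{\partial y} + \frac12\left(\frac{\partial\zeta}{\partial y}\right)^2 = \frac{1}{E}\,(\sigma_{yy} - \sigma\,\sigma_{xx}), \tag{61}$$
$$\frac12\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) + \frac12\,\frac{\partial\zeta}{\partial x}\frac{\partial\zeta}{\partial y} = \frac{1+\sigma}{E}\,\sigma_{xy}. \tag{62}$$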
We mention in passing that the elimination of the functions $u$ and $v$ from these equations leads to the second FvK equation (17). We are now able to enforce the remaining boundary conditions: the first one, $u(x=\pm b,y) = 0$, can be rewritten as $\int_{-b}^{b}\frac{\partial u}{\partial x}\,dx = 0$. Using eq. (60), an equation involving $\chi$ and $\zeta$ only can then be derived:
$$\int_{-b}^{b}\left[\frac{1}{E}\,(\sigma_{xx} - \sigma\,\sigma_{yy}) - \frac12\left(\frac{\partial\zeta}{\partial x}\right)^2\right]dx = 0. \tag{63}$$
With the help of the explicit form of $\zeta$ determined in eq. (55), one obtains:
$$\frac{1}{E}\,(\sigma_{xx} - \sigma\,\sigma_{yy}) = \frac{\pi^2}{16}\,\frac{\zeta_m^2}{b^2}. \tag{64}$$
The last boundary condition, $v(x=\pm b,y) = 0$, can be rewritten $\partial v/\partial y = 0$ along the edge. Using eq. (61) together with the invariance by translation along $y$, one obtains:
$$\sigma_{yy} = \sigma\,\sigma_{xx}. \tag{65}$$
This equation a priori holds along the edge of the blister only (it stems from a boundary condition). Since $\sigma_{xx}$ and $\sigma_{yy}$ are constants, it is in fact valid everywhere. One can now extract the three quantities $\zeta_m$, $\sigma_{xx}$ and $\sigma_{yy}$ from equations (56), (64) and (65). This yields:
$$\zeta_m = h\left[\frac{4}{3}\left(\frac{\sigma_0}{\sigma_c} - 1\right)\right]^{1/2}, \tag{66}$$
where the critical compression $\sigma_c$ is defined as:
$$\sigma_c = \frac{D\pi^2}{h\,b^2} = \left(\frac{h}{b}\right)^2\frac{E\,\pi^2}{12(1-\sigma^2)}. \tag{67}$$
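The closed-form results (56) and (64)-(67) are easy to evaluate numerically. The following Python sketch does so and checks the integrated edge condition (63) against the cosine profile (55). The material and geometric values in the example (a film with $E$ = 100 GPa, $\sigma$ = 0.3, $h$ = 1 µm, $b$ = 40 µm, $\sigma_0 = 0.02\,E$) and the function names are assumptions chosen for illustration only.

```python
import math

def euler_column(E, sigma, h, b, sigma0):
    """Straight-sided blister: critical load, stress changes and amplitude."""
    D = E * h**3 / (12.0 * (1.0 - sigma**2))        # flexural rigidity
    sigma_c = D * math.pi**2 / (h * b**2)            # critical compression, eq. (67)
    if sigma0 <= sigma_c:
        raise ValueError("sigma0 must exceed sigma_c for the buckled solution")
    sigma_xx = sigma0 - sigma_c                      # from eq. (56)
    sigma_yy = sigma * sigma_xx                      # eq. (65)
    zeta_m = h * math.sqrt(4.0 / 3.0 * (sigma0 / sigma_c - 1.0))  # eq. (66)
    return {"sigma_c": sigma_c, "sigma_xx": sigma_xx,
            "sigma_yy": sigma_yy, "zeta_m": zeta_m}

def edge_condition_residual(E, sigma, b, sol, n=2000):
    """Left-hand side of eq. (63) for the profile (55), by the midpoint rule."""
    k = math.pi / b
    dx = 2.0 * b / n
    strain = (sol["sigma_xx"] - sigma * sol["sigma_yy"]) / E
    total = 0.0
    for i in range(n):
        x = -b + (i + 0.5) * dx
        slope = -0.5 * sol["zeta_m"] * k * math.sin(k * x)   # d(zeta)/dx
        total += (strain - 0.5 * slope**2) * dx
    return total

if __name__ == "__main__":
    E, sigma, h, b = 100e9, 0.3, 1e-6, 40e-6         # assumed illustrative values (SI)
    sol = euler_column(E, sigma, h, b, 0.02 * E)
    print("sigma0 / sigma_c =", 0.02 * E / sol["sigma_c"])
    print("zeta_m / h       =", sol["zeta_m"] / h)
    print("residual of (63) =", edge_condition_residual(E, sigma, b, sol))
```

With these assumed numbers the initial compression is several tens of times the critical one, and the maximal deflection is several film thicknesses.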
The buckled configuration in eq. (56) is a solution of the FvK equations only when the initial compression is large enough: $\sigma_0 > \sigma_c$. The square-root dependence of $\zeta_m$ on $(\sigma_0 - \sigma_c)$ is characteristic of a supercritical bifurcation. The residual stresses in the buckled column are:
$$\sigma_{xx}^{(\mathrm{buckled})} = -\sigma_0 + \sigma_{xx} = -\sigma_0 + (\sigma_0 - \sigma_c) = -\sigma_c, \tag{68}$$
$$\sigma_{yy}^{(\mathrm{buckled})} = -\sigma_0 + \sigma_{yy} = -(1-\sigma)\,\sigma_0 - \sigma\,\sigma_c. \tag{69}$$
The shear component, $\sigma_{xy}^{(\mathrm{buckled})}$, vanishes for obvious symmetry reasons. Note that, in this problem, nonlinearities arise not only from the FvK equations, but also from the geometric nonlinearities in the boundary conditions.

4.2 Telephone cord delamination
Using the results of the previous section, we can interpret qualitatively the wrinkles observed in thin film delamination. The reader may consult a recently published paper for a more quantitative exposition of the same arguments [13].

Under typical physical conditions, the initial film compression is quite high: in reference [11], $\sigma_0$ as high as a few percent of $E$ is reported. Because in eq. (67) $\sigma_c/E$ scales like the square of the small parameter $h/b$, this corresponds to initial compression well above the buckling threshold $\sigma_c$: typically, $\sigma_0/\sigma_c$ is in the range 25-75. In these conditions, the Euler column is well above the weakly nonlinear regime$^{f}$. In the very nonlinear limit, $\sigma_0 \gg \sigma_c$, eq. (69) shows that there remains an important longitudinal compression in the Euler column.

This important residual compression motivates a stability analysis of the Euler column itself with respect to longitudinal deformations. In [13], we show that the Euler column is indeed unstable for typical experimental compressions. Depending on the Poisson ratio of the film, the most unstable mode should be antisymmetric or symmetric (see fig. 5). Antisymmetric secondary buckling might well correspond to the observed telephone cord-like patterns.

Figure 5. Secondary buckling of the Euler column under the effect of the residual longitudinal compression (amplitude of the deflection is arbitrary). When (a) $\sigma = 0.3$ the most unstable mode is antisymmetric (S−), while (b) for $\sigma = 0.2$, it is symmetric (S+). Instability of the delamination front caused by this secondary buckling has been represented by the lateral arrows, leading to delamination patterns in the inserts.

$^{f}$ Note that this is compatible with the above analysis of the Euler column, in which no approximation has been made: the Euler column is an exact solution of the FvK equations; it is not only valid in the weakly nonlinear limit.
On the other hand, varicose patterns of delamination should be expected when the Poisson ratio of the film is low ($\sigma < 0.255$). Such patterns have not been observed so far, and experiments using macroscopic plates are being conducted to test this prediction.

References
1. M. Spivak. Differential Geometry, Vol. V. Publish or Perish Inc., Berkeley (1982). D. J. Struik. Lectures on classical differential geometry, 2nd ed., Dover, New York (1961).
2. M. Berger. Éléments de géométrie, Vol. 5, Nathan Ed., Paris (1975).
3. C. L. Siegel. Topics in complex function theory, Vol. 2, p. 14 et sq. Wiley classics, New York (1988).
4. F. A. M. Frescura, B. J. Hiley. Geometric interpretation of the Pauli spinor. Am. J. Phys., 49:152-157, 1981.
5. L. Pauchard, Y. Pomeau and C. Rica. Déformations des coques élastiques, C. R. Acad. Sc., Série IIb, t. 324, 411 (1997).
6. M. Ben Amar and Y. Pomeau. Crumpled paper, Proc. Roy. Soc. London, A 453, 1 (1997). A. Föppl. Vorlesungen über technische Mechanik, Bd. 5, p. 132 et sq. Leipzig (1907).
7. L. D. Landau et E. M. Lifshitz. Théorie de l'Élasticité, Éditions Mir, Moscou (1967).
8. A. V. Pogorelov. Bendings of Surfaces and Stability of Shells, American Mathematical Society, Providence (1988).
9. D. Nir. Stress relief forms of diamond-like carbon thin films under internal compressive stress. Thin Solid Films, 112:41-49, 1984.
10. A. S. Argon, V. Gupta, H. S. Landis, and J. A. Cornie. Intrinsic toughness of interfaces between SiC coatings and substrates of Si or C fibre. Journal of Materials Science, 24:1207-1218, 1989.
11. G. Gille and B. Rau. Buckling instability and adhesion of carbon layers. Thin Solid Films, 120:109-21, 1984.
12. G. Gioia and M. Ortiz. Adv. Appl. Mech. 33, 119 (1997).
13. B. Audoly. Stability of straight delamination blisters. To appear in Phys. Rev. Lett., 1999.
QUANTUM CHAOS
DOMINIQUE DELANDE
Laboratoire Kastler-Brossel

Chaos is a well defined concept for classical systems. In these lectures, I study the manifestations of chaos for microscopic objects, for which a quantum description must be used. Various examples, mainly but not exclusively coming from atomic physics, are used to illustrate our current understanding of the problem.

1 What is Quantum Chaos?

1.1 Classical Chaos
Chaos is usually defined for classical systems, i.e. systems whose dynamics can be described by deterministic equations of evolution in some phase space. The general form of these equations is:
$$\frac{d\mathbf{X}}{dt} = \mathbf{f}(\mathbf{X}) \tag{1}$$
where $\mathbf{X}$ is a vector (in phase space) representing the relevant physical properties of the system [1] - in the simplest case, it can be the position and momentum of a single particle. In this case, the number of components of the vector $\mathbf{X}$, i.e. the dimension of phase space, is twice the number $d$ of degrees of freedom of the system. In the following, we will be interested in systems with a small number of degrees of freedom, typically $d < 3$. The function $\mathbf{f}$ depends only on the position $\mathbf{X}$ in phase space, which expresses the deterministic character of the dynamics.

In the specific case of a time-independent Hamiltonian system for a single particle, the phase space coordinates are the position $\mathbf{q}$ and momentum $\mathbf{p}$, and the equations of motion can be expressed using the Hamilton function $H(\mathbf{q},\mathbf{p})$ as [2]:
$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \tag{2}$$
$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \tag{3}$$
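As a concrete illustration of Hamilton's equations (2) and (3), the sketch below integrates them for a single pendulum, $H = p^2/2 - \cos q$, with the leapfrog (velocity Verlet) scheme. The choice of Hamiltonian, the units (mass and frequency set to 1), the integrator and all function names are assumptions made for this example; this one-degree-of-freedom system is integrable, so the motion it produces is regular.

```python
import math

def hamiltonian(q, p):
    """Pendulum in reduced units: H = p^2/2 - cos(q)."""
    return 0.5 * p * p - math.cos(q)

def dH_dq(q):
    return math.sin(q)

def dH_dp(p):
    return p

def leapfrog(q, p, dt, n_steps):
    """Integrate dq/dt = dH/dp, dp/dt = -dH/dq with a symplectic scheme."""
    for _ in range(n_steps):
        p -= 0.5 * dt * dH_dq(q)    # half step in momentum
        q += dt * dH_dp(p)          # full step in position
        p -= 0.5 * dt * dH_dq(q)    # second half step in momentum
    return q, p

if __name__ == "__main__":
    q0, p0 = 1.0, 0.0
    q, p = leapfrog(q0, p0, dt=0.01, n_steps=10000)
    print("final state:", q, p)
    print("energy drift:", hamiltonian(q, p) - hamiltonian(q0, p0))
```

A symplectic integrator is used because it preserves phase space volume, a property of Hamiltonian flows that matters later when classical densities are compared with quantum ones.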
Basically, classical chaos is exponential sensitivity to initial conditions: two neighbouring trajectories diverge exponentially with time, i.e. the distance between the two trajectories generically increases as $\exp(\lambda t)$ where $\lambda$ is called the Lyapounov exponent of the system [1]. Sensitivity to initial conditions is responsible for the decrease of correlations over long times, loss of memory of the initial conditions and ultimately for deterministic unpredictability of the long time behaviour of the system. Most often, when the system is sensitive to initial conditions, it is also mixing and ergodic [1], i.e. a typical trajectory uniformly fills up the entire phase space at long time.

For the low-dimensional systems we are interested in, the dynamics is often a mixed regular-chaotic one, depending on the initial conditions; also, when a parameter is changed in the Hamilton function, the transition from regularity to chaos is usually smooth with intermediate mixed dynamics. Such mixed systems are rather complicated and not too well understood - at least for quantum effects to be discussed in these lectures - and we will here restrict ourselves to the two extreme simple situations where the motion is almost fully integrable or almost fully chaotic.
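Exponential sensitivity can be observed numerically on a very simple model. The sketch below estimates the Lyapounov exponent of the Chirikov standard map (the stroboscopic map of a periodically kicked rotor), which is used here only as a convenient stand-in for a chaotic Hamiltonian system; the map, the kick strength $K$, the initial condition and the function names are all assumptions of this example. The exponent is obtained by propagating a tangent vector with the linearized map and renormalizing it at every step.

```python
import math

def standard_map(theta, p, k):
    """One kick plus free rotation: p' = p + k sin(theta), theta' = theta + p'."""
    p_new = p + k * math.sin(theta)
    theta_new = (theta + p_new) % (2.0 * math.pi)
    return theta_new, p_new

def lyapunov_exponent(theta, p, k, n_steps=20000):
    """Average logarithmic growth rate per iteration of a tangent vector."""
    d_theta, d_p = 1.0, 0.0
    total = 0.0
    for _ in range(n_steps):
        c = k * math.cos(theta)             # derivative of the kick at the current point
        d_p_new = d_p + c * d_theta         # linearized kick
        d_theta_new = d_theta + d_p_new     # linearized rotation
        theta, p = standard_map(theta, p, k)
        norm = math.hypot(d_theta_new, d_p_new)
        total += math.log(norm)
        d_theta, d_p = d_theta_new / norm, d_p_new / norm   # renormalize
    return total / n_steps

if __name__ == "__main__":
    print("K = 10 (strongly chaotic):", lyapunov_exponent(1.0, 0.5, 10.0))
    print("K = 0  (integrable)      :", lyapunov_exponent(1.0, 0.5, 0.0))
```

For large $K$ the estimate should come out close to $\ln(K/2)$, while for the integrable case $K = 0$ neighbouring trajectories separate at most linearly in time and the exponent vanishes.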
1.2 Quantum dynamics
In quantum mechanics, there is neither any phase space, nor anything looking like a trajectory. Hence, the notion of classical chaos cannot be simply extended to quantum physics. Quantum mechanics uses completely different notions, like the state vector $|\psi\rangle$ belonging to some Hilbert space, which describes all the physical properties of the system. Its evolution is given by the Schrödinger equation:
$$i\hbar\,\frac{d|\psi(t)\rangle}{dt} = H(t)\,|\psi(t)\rangle \tag{4}$$
where $\hbar$ is Planck's constant. The linear Hamiltonian operator $H(t)$ acts in the Hilbert space. The connection between this operator and the classical Hamilton function is far from obvious. The usual rule is that the quantum Hamiltonian is obtained from the classical one through replacement of the classical position by the position operator (which is diagonal in the standard position representation of the state by its wavefunction $\psi(\mathbf{q}) = \langle\mathbf{q}|\psi\rangle$) and replacement of the classical momentum by $-i\hbar\,\partial/\partial\mathbf{q}$. There is a difficulty because the position and momentum operators do not commute, which is solved by using symmetrized combinations ensuring the hermiticity of $H$ [3].

$\psi(\mathbf{q})$ is not directly observable in quantum mechanics. In general - according to the standard Copenhagen interpretation of quantum mechanics - the result of a measurement is some diagonal element of a Hermitian operator, something like $\langle\psi|O|\psi\rangle$ [3]. The physical processes involved in an experimental measurement are quite subtle, difficult and interesting, but beyond the subject of these lectures. They are also the subject of a vast literature [4]. I will not consider this problem and restrict myself to a purely Hamiltonian evolution.

The time-evolution operator $U(t',t)$ is by definition the linear operator mapping the state $|\psi(t)\rangle$ onto the state $|\psi(t')\rangle$. It obeys the following equation (which is equivalent to the Schrödinger equation):
$$i\hbar\,\frac{\partial U(t',t)}{\partial t'} = H(t')\,U(t',t). \tag{5}$$
$U(t',t)$ is the major object for studying the quantum dynamics. Because $H$ is a Hermitian operator, $U(t',t)$ is a linear unitary operator. An immediate consequence is that the overlap between two states is preserved during the time evolution. Indeed, one has:
$$\langle\psi_1(t')|\psi_2(t')\rangle = \langle\psi_1(t)|U^\dagger(t',t)\,U(t',t)|\psi_2(t)\rangle = \langle\psi_1(t)|\psi_2(t)\rangle \tag{6}$$
which implies that two "neighbouring" states will remain neighbours forever. Because of linearity and unitarity, quantum mechanics cannot display any sensitivity to initial conditions, hence cannot be chaotic in the ordinary sense!

However, the previous statement must be considered with care. Indeed, classical mechanics can also be seen as a linear theory if one considers the evolution of a classical phase space density $\rho(\mathbf{q},\mathbf{p},t)$ given by the Liouville equation [2]:
$$\frac{\partial\rho}{\partial t} = \{H,\rho\} \tag{7}$$
where $\{,\}$ denotes the Poisson bracket. The fact that we obtain both in classical and quantum mechanics a linear equation of evolution in some space just implies that the above argument on linearity in quantum mechanics is irrelevant. Discussions on subjects like "Is there any quantum chaos?" are in my opinion completely uninteresting because they focus on the formal aspects of the mathematical apparatus used.

We will here define quantum chaos as the study of quantum systems whose classical dynamics is chaotic. The questions we would like to answer are thus:
• What are the appropriate observables to detect the regular or chaotic classical behaviour of the system?
• More precisely, how is the chaotic or regular behaviour expressed in the energy levels and eigenstates of the quantum system?
• What kind of semiclassical approximations can be used?
These are the questions discussed in these lectures. I will only present selected topics, leaving aside many interesting questions and relevant references. These questions of course point towards an intrinsic definition of quantum chaos that does not refer to the classical dynamics [5]. Thus, the problem of quantum chaos is essentially related to the correspondence between classical and quantum dynamics, the subject of semiclassical physics.
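The conservation of overlaps under unitary evolution, eq. (6), can be checked on the smallest possible example. The sketch below uses a two-level system with the real symmetric Hamiltonian $H = \begin{pmatrix} a & b \\ b & -a\end{pmatrix}$, for which $H^2 = \omega^2$ with $\omega = \sqrt{a^2+b^2}$, so that $U(t) = \cos(\omega t/\hbar) - i\,\sin(\omega t/\hbar)\,H/\omega$ in closed form. The model, the parameter values and the function names are assumptions chosen for illustration.

```python
import math

def evolve(state, a, b, t, hbar=1.0):
    """Apply U(t) = exp(-i H t / hbar) for H = [[a, b], [b, -a]] with real a, b."""
    omega = math.hypot(a, b)                  # H squared is omega^2 times the identity
    c = math.cos(omega * t / hbar)
    s = math.sin(omega * t / hbar)
    u00 = c - 1j * s * a / omega
    u01 = -1j * s * b / omega                 # equal to u10 because H is symmetric
    u11 = c + 1j * s * a / omega
    psi0, psi1 = state
    return (u00 * psi0 + u01 * psi1, u01 * psi0 + u11 * psi1)

def overlap(phi, psi):
    """Scalar product <phi|psi> of two two-component states."""
    return phi[0].conjugate() * psi[0] + phi[1].conjugate() * psi[1]

if __name__ == "__main__":
    a, b = 0.7, 1.3
    state_1 = (1.0 + 0j, 0.0 + 0j)
    state_2 = (math.cos(0.01) + 0j, math.sin(0.01) + 0j)   # a "neighbouring" state
    for t in (0.0, 1.0, 10.0, 100.0):
        ov = overlap(evolve(state_1, a, b, t), evolve(state_2, a, b, t))
        print(t, abs(ov))
```

The modulus of the overlap stays at $\cos(0.01)$ for all times: the two states neither approach nor separate, in contrast with neighbouring classical trajectories.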
1.3 Semiclassical dynamics
The whole idea of a semiclassical analysis is to obtain approximate solutions of the quantum equation of motion (the Schrödinger equation) using only classical ingredients (trajectories...) and Planck's constant $\hbar$. For a macroscopic object, our common knowledge is that an approximate semiclassical solution should be very accurate. Technically, this is true because $\hbar$ is much smaller than any classical quantity of interest (such as the classical action of the particle). One often refers to the "correspondence principle" as an explanation. However, this is a very vague concept which is usually not clearly stated, not proved and whose conditions of validity are not discussed. Actually, it is so vague and qualitative that it should be rejected. Part of these lectures is devoted to a serious scientific discussion of this issue, using the modern knowledge on classical chaos.

In order to make the connection between classical and quantum quantities, it is useful to introduce the Wigner representation, defined as [6]:
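$$W(\mathbf{q},\mathbf{p}) = \frac{1}{(2\pi\hbar)^d}\int \psi\!\left(\mathbf{q}-\frac{\mathbf{x}}{2}\right)\psi^*\!\left(\mathbf{q}+\frac{\mathbf{x}}{2}\right)\exp\!\left(\frac{i\,\mathbf{p}\cdot\mathbf{x}}{\hbar}\right)d\mathbf{x} \tag{8}$$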
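The definition (8) can be evaluated directly for one degree of freedom. The sketch below computes $W(q,p)$ by the trapezoidal rule for two test states: a Gaussian wavepacket of width $\sigma_q$ and mean momentum $p_0$, for which the integral gives $W = (\pi\hbar)^{-1}\exp[-q^2/\sigma_q^2 - \sigma_q^2 (p-p_0)^2/\hbar^2]$, and the first excited state of a harmonic oscillator in reduced units, for which $W(0,0) = -1/(\pi\hbar)$. The test states, the units ($\hbar = 1$), the integration window and the function names are assumptions of this example.

```python
import cmath
import math

def gaussian_packet(q, width, p0, hbar=1.0):
    """Normalized Gaussian wavepacket centred at q = 0 with mean momentum p0."""
    norm = (math.pi * width**2) ** -0.25
    return norm * cmath.exp(-q * q / (2.0 * width**2) + 1j * p0 * q / hbar)

def first_excited_state(q):
    """First excited harmonic-oscillator state in reduced units (real and odd)."""
    return math.sqrt(2.0) * math.pi ** -0.25 * q * math.exp(-0.5 * q * q)

def wigner(psi, q, p, hbar=1.0, half_width=12.0, n=4000):
    """One-dimensional version of eq. (8), trapezoidal rule on [-half_width, half_width]."""
    dx = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for i in range(n + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, n) else 1.0
        total += (weight * psi(q - x / 2.0) * psi(q + x / 2.0).conjugate()
                  * cmath.exp(1j * p * x / hbar))
    return (total * dx / (2.0 * math.pi * hbar)).real

def wigner_gaussian_exact(q, p, width, p0, hbar=1.0):
    """Closed form of the Wigner function of the Gaussian wavepacket."""
    return math.exp(-q * q / width**2 - width**2 * (p - p0) ** 2 / hbar**2) / (math.pi * hbar)

if __name__ == "__main__":
    packet = lambda q: gaussian_packet(q, 1.0, 1.0)
    print(wigner(packet, 0.3, 1.2), wigner_gaussian_exact(0.3, 1.2, 1.0, 1.0))
    print(wigner(first_excited_state, 0.0, 0.0), -1.0 / math.pi)
```

The Gaussian gives a positive, localized phase space blob centred on $(0, p_0)$, whereas the odd state yields a negative value at the origin, which is why $W$ cannot be read as an ordinary probability density.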
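This is a real phase space probability density, or rather a quasi-probability because it can be either positive or negative. Its evolution equation is simple to compute [6]:
$$\frac{\partial W}{\partial t} = -\frac{2}{\hbar}\,H(\mathbf{q},\mathbf{p})\,\sin\!\left(\frac{\hbar\Lambda}{2}\right)W(\mathbf{q},\mathbf{p}) \tag{9}$$
with:
$$\Lambda = \sum_i\left(\frac{\overleftarrow{\partial}}{\partial p_i}\frac{\overrightarrow{\partial}}{\partial q_i} - \frac{\overleftarrow{\partial}}{\partial q_i}\frac{\overrightarrow{\partial}}{\partial p_i}\right) \tag{10}$$
where the left (resp. right) arrow indicates action on the quantity on the left (resp. right) side. An explicit power expansion of the sine function is possible. This is in fact a power expansion in $\hbar$, hence well suited for a semiclassical approximation.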
At lowest (zeroth) order, one finds exactly the classical Liouville equation, thus establishing a link between the quantum and classical dynamics. At next order in $\hbar$ (actually $\hbar^2$), one finds terms involving third partial derivatives of the Hamiltonian. For harmonic systems, these terms vanish, proving that the classical and quantum phase space dynamics completely coincide.

For non-harmonic systems, the corrective terms produce an additional spreading of initially localized wavepackets. For chaotic systems, the classical solutions of the Liouville equation tend to stretch and fold along (exponentially) unstable directions and - because of conservation of volume in phase space - to shrink along (exponentially) attractive directions. This rapidly creates "whorls" and "tendrils" in the classical phase space density, which in turn implies more and more rapid spatial changes of the density. Thus, as time goes on, one expects some higher order partial derivatives to grow exponentially. Although the corresponding terms in the quantum equation of evolution are multiplied by $\hbar^2$, they will unavoidably grow and overcome the classical Liouville term$^{a}$. Hence, after some "break time", the detailed quantum evolution will differ from the classical one. The estimation of this break time is a very difficult question, and different answers are possible, depending on which aspect of the dynamics is under study (local, global...). I will not discuss this important point here, see [4,5].

Of course, for smaller $\hbar$, the higher order terms are smaller and it requires a longer time for them to perturb the dynamics. Hence, the break time has to tend to infinity in the semiclassical limit $\hbar \to 0$. For a fixed time interval, one can always find a sufficiently small $\hbar$ such that the quantum and classical dynamics are almost identical. In other words, over a finite time range, the quantum dynamics tends to the classical one as $\hbar \to 0$. However, this limit is not uniform. For fixed $\hbar$, there is always a finite time after which the quantum dynamics differs from the classical one. In other words, the two limits $t \to \infty$ and $\hbar \to 0$ do not commute. Taking first $\hbar \to 0$, then $t \to \infty$ is studying the long time classical dynamics, i.e. classical chaos. The other limit, $t \to \infty$ then $\hbar \to 0$, is what we are interested in, namely quantum (and semiclassical) chaos. In fact, the semiclassical limit $\hbar \to 0$ is highly singular and quantum chaos is essentially the problem of understanding this limit correctly.

$^{a}$ A rather similar conclusion can be obtained using the so-called Ehrenfest theorem, which gives the time evolution of average values of the position and momentum [3]. Provided the wavefunction is a localized wavepacket, these equations coincide with the classical equations of motion. However, the unavoidable spreading of the wavepacket destroys its localized character and breaks this simple correspondence. Again, the problem with this approach is to estimate precisely how the spreading affects the global dynamics.
1.4 Physical situations of interest
Simple equations of motion may produce a chaotic behaviour. A rather nonintuitive result is that chaos may take place in low dimensional systems. On the other hand, classical chaos can only exist in systems where different de grees of freedom are strongly coupled (this is a consequence of the KAM the orem [1]). This implies that a small perturbation added to a regular system cannot make it chaotic. The simplest possible chaotic systems are thus time-independent 2dimensional systems. It is also simpler to consider bound systems with a discrete energy spectrum. Various model systems have been studied, among which billiards are the simplest ones. A billiard is a compact area in the plane containing a point particle bouncing elastically on the walls. Depending on the shape of the boundary, the motion may be regular or chaotic. Prom the quantum point of view, one has to find the eigenstates of the Laplace operator whose wavefunction vanishes on the boundary [7]. Open (i.e. not bound) systems have also been studied, mainly because the classical phase space structure is usually simpler in such systems. The simplest example is the "three disks system" which is an open billiard with three identical circular obstacles centered on a equilateral triangle. This is an example of "chaotic scattering" [8], where the chaotic behaviour comes from the existence of arbitrarily long and complex trajectories bouncing off the 3 disks without escaping. From the quantum point of view, there are no longer discrete bound states, but rather resonances with complex energies which are poles of the S-matrix or of the Green's function. If we now turn to "experimental" systems, it is obvious that quantum effects are likely to be noticeable only for microscopic systems. The dynamics of nucleons in an atomic nucleus might be chaotic - at least at sufficiently large energy - and the experimental results on highly excited states played a major role in the early development of quantum chaos [7]. 
The drawback is the existence of complex collective effects and the fact that the interaction is not perfectly well known. Atoms are among the best available prototypes for studying quantum chaos and I will use them as examples in these lectures. Compared with other microscopic complex systems (nuclei, atomic clusters, mesoscopic devices...), atoms have the great advantage that all the basic components are well under stood : these are essentially point particles (electrons and nucleus) interact ing through a Coulomb static field, and interacting with the external world through electromagnetic forces. Hence, it is possible to write down an explicit expression of the Hamiltonian. Another crucial advantage of atomic systems
is that they can be studied both theoretically and experimentally. The word "experiment" must here be understood as traditional laboratory experiments, but also as "numerical experiments". Indeed, currently available computers make it possible to numerically compute properties of complex systems described by simple Hamiltonians. During the last fifteen years, the constant interaction between experimental results and numerical simulations has led to major advances in the field of quantum chaos.
Depending on the energy scale involved, different parts of the atomic dynamics are relevant. At "large" energy - of the order of 1 eV - it is the internal dynamics of the atomic electrons (their motion around the nucleus) which may be chaotic. At much lower energy - 1 $\mu$eV - it is the external dynamics of the center of mass of the atom (considered as a single particle) which may display chaotic behaviour under the influence of an external electromagnetic field [9]. The latter case has been made possible by the impressive recent improvements in the control of ultra-cold atomic gases using quasi-resonant laser beams [10].
Let us illustrate the first case by considering simple isolated atoms with few electrons. The simplest atom - hydrogen - can be exactly solved both in classical and quantum mechanics and is thus not chaotic at all. The helium atom brings in the three-body problem, described by the following Hamiltonian ($Z$ is the charge of the nucleus in units of the elementary charge, $m$ the mass of the electrons, $r_1$ and $r_2$ the electron-nucleus distances, $r_{12}$ the inter-electron distance, and $e^2$ the squared elementary charge divided by $4\pi\epsilon_0$):
$$H = \frac{\mathbf{p}_1^2 + \mathbf{p}_2^2}{2m} - \frac{Ze^2}{r_1} - \frac{Ze^2}{r_2} + \frac{e^2}{r_{12}}$$
which is known to be classically essentially chaotic [45]. From the chaos point of view, the interesting situation is when the two electrons have comparable excitations. Strong dynamical correlations between the two electrons are expected, leading to a breakdown of the independent electron picture for highly doubly excited states. Indeed, the most recent experimental results close to the double ionization threshold display extremely complex structures in the ionization cross-section, which have been shown to be related to the onset of (quantum) chaos [12,13]. The helium atom is briefly discussed in section 4.6.
In molecules, the dynamics of the electrons may also be chaotic. In some cases, the motion of the nuclei in the effective potential created by the electrons (which follow the nuclei adiabatically) is chaotic. Some interesting results on the $\mathrm{NO}_2$ molecule have been obtained [14].
At the microscopic level, the electrons in a solid state sample may display chaotic dynamics in, for example, suitable combinations of external fields. This has led to dramatic results showing very clearly the
relevance of periodic orbits for understanding the quantum chaotic dynamics [15].
Another possibility exists for the experimental study of quantum chaos. One can consider wave equations describing some other physical phenomena, which have a structure very similar to the Schrödinger equation. As what we are interested in is in fact "wave chaos" (properties of eigenmodes for example), whatever the waves themselves are, this opens a wide variety of possible experiments. The best example is provided by flat microwave cavities, where solving the Maxwell equations is equivalent to calculating the eigenstates of the corresponding two-dimensional Schrödinger billiard [16]. The advantage is that a measurement of the "wavefunction" is possible.
I now work out in some detail the simplest atomic prototype, which will be discussed as an example in the rest of these lectures.
1.5 A simple example: the hydrogen atom in a magnetic field
We consider the simplest atom - hydrogen - exposed to a strong external uniform magnetic field $\mathbf{B}$ directed along the $z$-axis. Using the symmetric gauge $\mathbf{A} = \frac{1}{2}\,\mathbf{B}\times\mathbf{r}$ for the vector potential, the Hamiltonian is given by ($q$ is the charge of the electron and $\rho^2 = x^2 + y^2$):
$$H = \frac{\mathbf{p}^2}{2m} - \frac{q^2}{4\pi\epsilon_0 r} - \frac{qB}{2m}L_z + \frac{q^2B^2}{8m}\rho^2, \tag{12}$$
where $L_z$ is the $z$-component of the angular momentum. In atomic units ($\hbar = m = |q| = 4\pi\epsilon_0 = 1$), it reads:
$$H = \frac{\mathbf{p}^2}{2} - \frac{1}{r} + \frac{\gamma}{2}L_z + \frac{\gamma^2}{8}\rho^2, \tag{13}$$
where $\gamma$ denotes the magnetic field in atomic units of $2.35 \times 10^5$ T. Because of the azimuthal symmetry around the magnetic field axis, the paramagnetic term $\gamma L_z/2$, responsible for the usual Zeeman effect, is just a constant. The diamagnetic term, $\gamma^2\rho^2/8$, is directly responsible for the onset of chaos in the system. The competition between the Coulomb potential with spherical symmetry and the diamagnetic potential with cylindrical symmetry governs the dynamics. As a crude criterion, chaos is most developed when these two terms have the same order of magnitude. This can be realized in a laboratory experiment with Rydberg states $n \sim 40$-$150$ [17,18,19].
When written in cylindrical coordinates, the Hamiltonian (13) describes a time-independent two-dimensional system belonging to the class of the simplest possibly chaotic systems [1]. This makes this system an almost ideal prototype for the study of quantum chaos [20].
One of the main difficulties in the study of the semiclassical limit $\hbar \to 0$ is that the value of Planck's constant is fixed in laboratory experiments. One can get around this difficulty in atomic systems thanks to the existence of scaling laws. There is a convenient scaling of all variables and external fields which leaves the equations of motion invariant [21]:
$$\mathbf{r} \to \lambda^{-1}\mathbf{r}, \quad \mathbf{p} \to \lambda^{1/2}\mathbf{p}, \quad H \to \lambda H, \quad t \to \lambda^{-3/2}t, \quad \gamma \to \lambda^{3/2}\gamma, \tag{14}$$
where $\lambda$ is any positive real number. This means that different initial conditions with different external fields may have exactly the same classical dynamics. This is no longer true in quantum mechanics, since there is an absolute scale imposed by Planck's constant $\hbar$. Different scaled situations observed experimentally correspond to the same classical dynamics with different effective values of Planck's constant. The scaled energy
$$\epsilon = E\gamma^{-2/3} \tag{15}$$
measures the energy of the electron in units of magnetic field. Because of the scaling law, the classical dynamics, instead of depending both on $E$ and $\gamma$, actually depends only on $\epsilon$, whereas quantum properties depend a priori on both quantities. Hence, in a real (or numerical) experiment, the semiclassical limit $\hbar \to 0$ can be studied just by tuning simultaneously the energy and the characteristics of the external fields according to Eq. (14) towards higher excited states. This possibility has proved extremely important for understanding the classical-quantal correspondence [18].
At low scaled energy (roughly $\epsilon < -0.5$) the classical dynamics is mainly regular (this is the low field limit where the magnetic field is a small perturbation). Increasing $\epsilon$ from $-0.5$, the system smoothly evolves to a fully chaotic situation reached above $\epsilon = -0.13$. Finally, the phase space opens to infinity at $\epsilon = 0$.
From the theoretical point of view, the use of group theory allows extremely efficient numerical experiments [21,20,22], making possible the very accurate computation of highly excited energy levels and wavefunctions. The calculated quantities are found in exact agreement with the (less accurate) experimental measurements (see [17])!
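The scaling law of Eq. (14) and the scaled energy of Eq. (15) can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the 6 T field is a hypothetical laboratory value, the zero-field Rydberg formula $-1/(2n^2)$ is used as a rough estimate of the energy, and the helper `effective_hbar` relies on the standard result that the effective Planck constant of the scaled problem is $\gamma^{1/3}$, which is not spelled out in the text. All function names are illustrative.

```python
TESLA_PER_ATOMIC_UNIT = 2.35e5  # atomic unit of magnetic field quoted in the text


def gamma_from_tesla(b_tesla):
    """Magnetic field expressed in atomic units."""
    return b_tesla / TESLA_PER_ATOMIC_UNIT


def scaled_energy(energy, gamma):
    """Scaled energy of Eq. (15): epsilon = E * gamma**(-2/3), in atomic units."""
    return energy * gamma ** (-2.0 / 3.0)


def rescale(energy, gamma, lam):
    """Scaling law of Eq. (14) applied to the energy and the field."""
    return lam * energy, lam ** 1.5 * gamma


def effective_hbar(gamma):
    """Effective Planck constant of the scaled problem (assumed standard result)."""
    return gamma ** (1.0 / 3.0)


def rydberg_energy(n):
    """Zero-field hydrogen energy -1/(2 n^2); only a rough estimate of E in a field."""
    return -0.5 / n ** 2


gamma = gamma_from_tesla(6.0)  # hypothetical 6 T laboratory field
for n in (40, 60, 100):
    eps = scaled_energy(rydberg_energy(n), gamma)
    print(f"n = {n:3d}   scaled energy = {eps:.3f}")

# Same classical dynamics (same epsilon), smaller effective hbar:
e2, g2 = rescale(rydberg_energy(40), gamma, 0.25)
print(scaled_energy(e2, g2), effective_hbar(g2) / effective_hbar(gamma))
```

With these assumed numbers, the Rydberg states in the range quoted above fall in or near the interval of scaled energies where the dynamics goes from regular to chaotic, and rescaling with $\lambda < 1$ keeps $\epsilon$ fixed while halving the effective Planck constant.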
[Figure 1: schematic time axis marked with $T_{\mathrm{Min}}$ (shortest periodic orbit) and $T_{\mathrm{Heisenberg}} \sim \hbar/\Delta$, and corresponding energy axis marked with $\hbar/T_{\mathrm{Min}}$ and $\Delta$ (mean level spacing). The short-time region is labelled "Periodic Orbit Theory", the long-time region "Universality, Random Matrix Theory".]
Figure 1. The important time and energy scales for a chaotic quantum system. The shortest relevant time scale is $T_{\mathrm{Min}}$, the period of the shortest periodic orbit. The most important quantum time scale is $T_{\mathrm{Heisenberg}}$, associated with the mean energy level spacing. In the semiclassical limit, $T_{\mathrm{Heisenberg}}$ is much larger than $T_{\mathrm{Min}}$. One expects a universal classical behaviour at long times, thus universal statistical properties of the energy levels, described in section 3.4. At short times (long energy range), the specificities of the system appear to be related to the periodic orbits of the system, as explained in section 4.
2 Time scales - Energy scales
For a correct understanding of the connections between the quantum and the classical properties of a chaotic system, it is crucial to know the relevant time scales (and the corresponding energy scales) of the problem. The shortest time scale is simply the typical time scale for the simplest evolution of the system. It is conveniently taken as the period $T_{\mathrm{Min}}$ of the shortest periodic orbit. A slightly longer time scale is given by the time taken for chaos to manifest itself, that is the inverse of the typical Lyapunov exponent. The larger the sensitivity to initial conditions, the shorter this time scale. These two time scales have of course nothing to do with $\hbar$. The corresponding energy scale, $2\pi\hbar/T_{\mathrm{Min}}$, see Fig. 1, is the largest energy scale of interest in the problem.
There is also a basic quantum time scale. To understand its origin, let us consider a time-independent bound quantum system with Hamiltonian $H$, in an arbitrary initial state $|\psi(t=0)\rangle$. Its evolution can be expressed using the discrete eigenstates and eigenvalues of the Hamiltonian $H$
$$H|\phi_i\rangle = E_i|\phi_i\rangle \tag{16}$$
with the following expression:
$$|\psi(t)\rangle = \sum_i c_i \exp\left(-i\frac{E_i t}{\hbar}\right)|\phi_i\rangle, \tag{17}$$
where the constant coefficients $c_i$ are computed from the initial state using:
$$c_i = \langle\phi_i|\psi(0)\rangle. \tag{18}$$
The autocorrelation function of the quantum system is a diagonal element of the time-evolution operator:
$$C(t) = \langle\psi(0)|\psi(t)\rangle = \langle\psi(0)|U(t,0)|\psi(0)\rangle = \sum_i |c_i|^2 \exp\left(-i\frac{E_i t}{\hbar}\right). \tag{19}$$
It is a discrete sum of oscillating terms and, consequently, a quasi-periodic function of time. This is extremely different from a classical autocorrelation function for a chaotic system, which decreases on the characteristic time scale $T_{\mathrm{Min}}$ and does not show any revival at longer times [1]. The Fourier transform of the autocorrelation function is:
$$C(E) = \frac{1}{2\pi\hbar}\int_{-\infty}^{+\infty} e^{iEt/\hbar}\,C(t)\,dt = \sum_i |c_i|^2\,\delta(E - E_i), \tag{20}$$
that is a sum of $\delta$-peaks at the positions of the energy levels. If we now consider the Fourier transform not over the whole range of time from $-\infty$ to $+\infty$, but over a finite time interval, we obtain a smoothed version of the quantum spectrum:
$$C_T(E) = \frac{1}{2\pi\hbar}\int_{-T/2}^{T/2} e^{iEt/\hbar}\,C(t)\,dt = \sum_i |c_i|^2\,\frac{T}{2\pi\hbar}\,\frac{\sin\left[\frac{(E-E_i)T}{2\hbar}\right]}{\frac{(E-E_i)T}{2\hbar}}, \tag{21}$$
where all the peaks are smoothed $\delta$-peaks of width $2\pi\hbar/T$. For short $T$, the different broadened peaks centered at the energy levels $E_i$ overlap, and $C_T(E)$ is a globally smooth function, like its classical counterpart. In such a situation, it is possible (although nothing proves that it is always the case) that the quantum $C_T(E)$ mimics the classical chaotic behaviour. The important point is that, for large $T$, the different peaks do not overlap and the discrete nature of the energy spectrum must appear in $C_T(E)$, whatever the initial state. The typical time needed for resolving individual quantum energy levels is called the Heisenberg time and is simply related to the mean level spacing $\Delta$ through:
$$T_{\mathrm{Heisenberg}} = \frac{2\pi\hbar}{\Delta}. \tag{22}$$
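The role of the Heisenberg time can be made concrete with a small numerical sketch of Eq. (21). The two-level toy spectrum, the equal weights and the choice of units with $\hbar = 1$ are assumptions made only for this illustration; the function names are hypothetical.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption for this illustration)


def smoothed_spectrum(energy, levels, weights, duration):
    """C_T(E) of Eq. (21): each delta peak becomes a sinc-shaped peak of width ~ 2*pi*hbar/T."""
    total = 0.0
    for e_i, w_i in zip(levels, weights):
        x = (energy - e_i) * duration / (2.0 * HBAR)
        sinc = 1.0 if x == 0.0 else math.sin(x) / x
        total += w_i * duration / (2.0 * math.pi * HBAR) * sinc
    return total


def heisenberg_time(mean_spacing):
    """Heisenberg time of Eq. (22)."""
    return 2.0 * math.pi * HBAR / mean_spacing


# Toy spectrum: two levels separated by the mean spacing Delta = 1.
levels, weights = [0.0, 1.0], [0.5, 0.5]
t_h = heisenberg_time(1.0)

for factor in (0.2, 5.0):
    duration = factor * t_h
    on_level = smoothed_spectrum(0.0, levels, weights, duration)
    midway = smoothed_spectrum(0.5, levels, weights, duration)
    print(f"T = {factor} T_H   C_T(level) = {on_level:.4f}   C_T(midpoint) = {midway:.4f}")
```

For an observation time well below the Heisenberg time, the smoothed spectrum is larger midway between the two levels than on a level, so the two peaks merge into a single bump. Well above the Heisenberg time, the value on a level dominates and the discreteness of the spectrum is resolved.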
After this time, the quantum system cannot mimic the classical chaotic behaviour, which has a continuous spectrum. Since $T_{\mathrm{Heisenberg}}$ depends on $\hbar$, one can understand how quantum dynamics tends to classical dynamics as $\hbar$ goes to zero. The mean level spacing is given by Weyl's rule and scales as $\hbar^d$, with $d$ the number of degrees of freedom (see Eq. (25)). For two- (or higher) dimensional systems, $T_{\mathrm{Heisenberg}}$ tends to infinity as $\hbar \to 0$, see Fig. 1.
In some sense, after the Heisenberg time, the quantum system "knows" that the energy spectrum is discrete: it has resolved all individual peaks and the future evolution cannot bring any essential new information. As a consequence, the system cannot explore a new part of the phase space; it freezes its evolution, repeating forever the same type of dynamics.
Other time scales may exist in specific systems. For example, in an open Hamiltonian system, the typical time scale for escaping the chaotic region is obviously important. Also, in mixed chaotic-regular systems, different time scales coexist in the different regions of phase space (and at their boundaries), making general statements extremely difficult. For systems coupled to their environments, dissipation and decoherence of the quantum wavefunction are known to play a very important role [4] and these effects may be dominant over chaotic effects. For the internal motion of electrons in atoms, the most important dissipative effect is spontaneous emission of photons, a process usually rather small, acting significantly only after thousands of classical periods [23].
For the sake of simplicity, we will restrict ourselves to the case where $T_{\mathrm{Min}}$ and $T_{\mathrm{Heisenberg}}$ are the only relevant time scales. In the semiclassical limit $\hbar \to 0$, the corresponding energy scales $2\pi\hbar/T_{\mathrm{Min}}$ and $2\pi\hbar/T_{\mathrm{Heisenberg}} = \Delta$ are both small compared to the energy itself. This means that we will always look at relatively small changes in the energy, such that the classical dynamics does not substantially change over the energy range considered. This is of course possible in the semiclassical regime thanks to the large density of states. For low excited states, such a local approach lacks any relevance.
3 Statistical Properties of Energy Levels - Random Matrix Theory
3.1 Level Dynamics
The goal of traditional spectroscopy is to assign quantum numbers to the different energy levels in order to obtain a complete classification of the spectrum. When little is known about the system, it is difficult to identify the good quantum numbers and their physical interpretation, or even to know whether they exist or not. A simple tool is to look at the level dynamics, that is the evolution of the various energy levels as a function of a parameter. As good quantum numbers are associated with conserved quantities, i.e. operators commuting with the Hamiltonian, energy levels with different sets of good quantum numbers are not coupled and thus generically cross each other [24]. On the contrary, if two states are coupled, the energy levels will repel each other, producing an avoided crossing. The width of the avoided crossing, i.e. the minimum energy difference between the two energy curves, is a direct measure of the strength of the coupling. Thus, looking at the level dynamics gives some qualitative information on the properties of the system.
This is illustrated in Fig. 2, which shows the evolution of the energy levels of a hydrogen atom as a function of the magnetic field strength. At low magnetic field, Fig. 2a, there are only level crossings. A given eigenstate can be unambiguously followed over a wide range of field strengths, since it crosses (or has very small avoided crossings with) the other energy levels, which proves that there are at least approximate good quantum numbers. At higher magnetic field, Fig. 2b, the sizes of the avoided crossings increase and individual states progressively lose their identities. In other words, the good quantum numbers are destroyed.
A crucial observation is that the transition from crossings (or tiny avoided crossings) to large avoided crossings takes place where the classical dynamics evolves from regular to chaotic. The transition is smooth - with the proportion of large avoided crossings progressively increasing - and there is a large intermediate region where crossings and large avoided crossings coexist. This corresponds to the range of scaled energies $\epsilon \in [-0.5, -0.13]$, in complete agreement with the classical transition from regularity to chaos, see section 1.5. From a pure quantum point of view, this phenomenon is extremely difficult to understand: when the magnetic field strength increases, the only change in the matrix representing the Hamiltonian, Eq. (12), in any basis, is a global multiplication of all the matrix elements of $\rho^2$ by a constant factor. The dramatic effect on the energy level dynamics is a direct manifestation of chaos in the quantum properties of the system. In section 4, I will give an explanation of this transition from the regular region where good quantum numbers, i.e. conserved quantities, exist to the chaotic region where they are destroyed.
In the fully chaotic regime, the energy levels and the eigenstates strongly fluctuate when the magnetic field is changed. In that sense, the quantum system shows a high sensitivity to a small perturbation, like its classical equivalent. The energy spectrum of a classically chaotic system displays an extreme intrinsic complication, which means the death of traditional spectroscopy. Such extremely complex spectra have been observed experimentally in atomic systems in external fields [25], on the eigenmodes of microwave billiards (when a parameter of the billiard shape is varied) [16] and numerically on virtually all chaotic systems [24,7]. It should be emphasized that level dynamics in the chaotic regime looks extremely similar whatever the system is, as long as its classical dynamics is chaotic. It is probably the simplest and most universal property.
[Figure 2: two panels showing energy (atomic units) versus magnetic field (atomic units). Panel (a): energies from about -6.0e-04 to -4.5e-04, fields from 1.0e-05 to 3.0e-05. Panel (b): energies from about -5.6e-05 to -5.0e-05, fields from 1.00e-05 to 1.10e-05.]
Figure 2. Map of the energy levels of a hydrogen atom versus magnetic field for typical Rydberg states of the ($L_z = 0$, even parity) series. At low energy (a), the classical dynamics is regular and the energy levels (quasi-) cross. The quantum eigenstates are defined by a set of good quantum numbers. At high energy (b), the classical dynamics is chaotic, the good quantum numbers are lost and the energy levels strongly repel each other. The strong fluctuations in the energy levels are characteristic of a chaotic behaviour.
3.2 Statistical analysis of the spectral fluctuations
This qualitative property has been put on firm ground by the study of the statistical properties of energy levels [7,24,26]. The idea is the following: there are far too many levels and their evolution is far too complicated to deserve a detailed explanation, level by level. In complete analogy with a gas of interacting particles, where the detailed positions of the various particles do not really carry the relevant information, which is rather contained in some statistical properties, we must use a statistical approach for the description of the energy levels of a chaotic quantum system. In order to compare different systems and characterize the spectral fluctuations, we must first define proper quantities. For a complete description, see [7,24].
Density of states
The density of states is:
$$d(E) = \sum_i \delta(E - E_i), \tag{23}$$
where the $E_i$ are the energy levels of the system. The cumulative density of states counts the number of energy levels below energy $E$. It is thus:
$$n(E) = \int_{-\infty}^{E} d(\epsilon)\,d\epsilon = \sum_i \Theta(E - E_i). \tag{24}$$
This is a step function with unit steps at each energy level. When there is a large number of levels, one can define the averaged cumulative density of states $\bar{n}(E)$, a function interpolating $n(E)$ by smoothing the steps. Its derivative is the averaged density of states $\bar{d}(E)$. There are several cases where this is the only relevant quantity for the physics of the system. For example, in a large semiconductor sample, the averaged density of states at the Fermi level is what determines the contribution of electrons to the specific heat at low temperature [27].
The averaged density of states can be determined in the semiclassical approximation (see section 4) by Weyl's rule (also known as the Thomas-Fermi approximation):
$$\bar{d}(E) = \frac{1}{(2\pi\hbar)^d}\int d\mathbf{p}\,d\mathbf{q}\;\delta(H(\mathbf{q},\mathbf{p}) - E). \tag{25}$$
It only depends on the classical Hamilton function $H$ and not on the regular or chaotic nature of the dynamics.
Unfolding the spectrum
The next step is to eliminate the slow changes in the averaged density of states by defining an unfolded spectrum through the following quantity:
$$N(x) = n(\bar{n}^{-1}(x)), \tag{26}$$
which is nothing but the cumulative density of states represented as a function of a rescaled variable such that the "energy levels" now appear equally spaced by one unit on average. These rescaled energy levels $x_i = \bar{n}(E_i)$ have by construction unit density. This makes it possible to compare spectra obtained for different parameters or even for completely different systems.
Nearest Neighbor Spacing Distribution
The simplest quantity is the distribution of nearest neighbor spacings, i.e. of the energy differences between two consecutive levels, $s_i = x_{i+1} - x_i$. This distribution is traditionally denoted $P(s)$. By virtue of the unfolding procedure, the average spacing is one. Its behaviour near $s = 0$ measures the fraction of very small spacings (quasi-degeneracies), hence the degree of level repulsion.
Number Variance
The use of the nearest neighbor spacing distribution is simple, but not very logical from the statistical physics point of view. Indeed, $P(s)$ involves all correlation functions among the energy levels. It is simpler to consider separately the two-point, three-point, etc. correlation functions. The two-point correlation function $R_2$ depends only on the energy difference if the spectrum is stationary (i.e. statistically invariant under a global translation, which is likely for a large unfolded spectrum). Near 0, it again measures the degree of level repulsion. A more global quantity is the number variance $\Sigma^2(L)$, which measures the variance of the number of levels contained in an energy interval of length $L$. It is related to the two-point correlation function by [7]:
$$\Sigma^2(L) = L + 2\int_0^L (L - x)\,(R_2(x) - 1)\,dx. \tag{27}$$
It is a measure of the rigidity of the spectrum, that is, it measures how the spectrum deviates from a uniform spectrum of equally spaced levels.
Spectral Rigidity
A related quantity is the so-called spectral rigidity $\Delta_3(L)$, which measures how much the cumulative density of states differs from its best linear fit on an energy interval of length $L$. The relation is:
$$\Delta_3(L) = \frac{2}{L^4}\int_0^L (L^3 - 2L^2x + x^3)\,\Sigma^2(x)\,dx. \tag{28}$$
It is again an alternative to the two-point correlation function. Its advantage is that it is very robust against imperfections such as spurious or missing energy levels and can be determined rather safely from a limited number of energy levels. This is of major importance, for example, in analyzing experimental atomic [28,29] or nuclear spectra [7].
3.3 Regular Regime
In the regular regime (see Fig. 2a), consecutive energy levels generally do not interact. Thus, from the statistical point of view, they can be considered as independent random variables. The distribution of spacings is the one of uncorrelated levels, that is a Poisson distribution:
$$P(s) = e^{-s}, \tag{29}$$
which nicely reproduces the numerical results obtained on different systems (see Fig. 3a) and also several experimental results [30,16]. Note that the maximum of the distribution is at $s = 0$, which shows that quasi-degeneracies are very probable and that level repulsion is absent. This is a universal result which applies generically to regular systems. Other statistical quantities can be described as well. The two-point correlation function is simply $R_2(x) = 1$, leading to the number variance
$$\Sigma^2(L) = L. \tag{30}$$
Fig. 3b shows the numerical result for the hydrogen atom in a magnetic field in the regular regime. The agreement with the prediction is good, at least for low $L$. The saturation at large $L$ can be quantitatively understood using periodic orbit theory (see section 4). In simple words, it is due to long range correlations in the spectrum induced by periodic orbits.
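The unfolding procedure and the Poisson predictions of Eqs. (29) and (30) can be tried on a synthetic spectrum. The sketch below is not the calculation behind Fig. 3: it assumes a hypothetical model of uncorrelated levels whose averaged cumulative density of states is $\bar{n}(E) = E^2$, and all function names are illustrative.

```python
import bisect
import math
import random
import statistics


def model_levels(n_levels, seed=0):
    """Uncorrelated levels with averaged cumulative density n_bar(E) = E**2 (assumed model)."""
    rng = random.Random(seed)
    x, levels = 0.0, []
    for _ in range(n_levels):
        x += rng.expovariate(1.0)    # independent unit-mean spacings in the unfolded variable
        levels.append(math.sqrt(x))  # E = inverse of n_bar applied to x
    return levels


def unfold(levels, n_bar):
    """Unfolded levels x_i = n_bar(E_i), which have unit mean density."""
    return [n_bar(e) for e in levels]


def spacings(unfolded):
    """Nearest neighbor spacings s_i = x_{i+1} - x_i."""
    return [b - a for a, b in zip(unfolded, unfolded[1:])]


def number_variance(unfolded, length, n_windows=2000, seed=1):
    """Variance of the number of levels in randomly placed windows of the given length."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_windows):
        start = rng.uniform(unfolded[0], unfolded[-1] - length)
        lo = bisect.bisect_left(unfolded, start)
        hi = bisect.bisect_left(unfolded, start + length)
        counts.append(hi - lo)
    return statistics.pvariance(counts)


x = unfold(model_levels(20000), lambda e: e * e)
s = spacings(x)
small = sum(1 for v in s if v < 0.1) / len(s)
print("mean spacing:", sum(s) / len(s))
print("fraction of spacings below 0.1:", small, "Poisson:", 1.0 - math.exp(-0.1))
print("number variance for L = 5:", number_variance(x, 5.0))
```

The raw levels of this model become denser at high energy, so their spacings are not directly comparable along the spectrum. After unfolding, the mean spacing is close to one, nearly ten percent of the spacings are smaller than 0.1 (no level repulsion), and the number variance is close to $L$.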
[Figure 3: panel (a) shows the spacing distribution $P(s)$ versus the level spacing $s$; panel (b) shows the number variance.]
Figure 3. Statistical properties of energy levels for the hydrogen atom in a magnetic field, obtained from numerical diagonalization of the Hamiltonian in the regular regime. (a) Nearest neighbor spacing distribution. The distribution is maximum at 0 and well fitted by a Poisson distribution (dashed line). (b) Number variance. Again, the Poisson prediction (dashed line) works quite well. The saturation at large $L$ is due to the residual effects of periodic orbits and is well understood.
[Figure 4: panel (a) shows the spacing distribution versus the level spacing $s$; panel (b) shows the number variance.]
Figure 4. Same as figure 3, but in the classically chaotic regime. (a) The probability of finding almost degenerate levels is very small (level repulsion). The results are well reproduced by the Wigner distribution (dashed line) and Random Matrix Theory. (b) The number variance is much smaller than in the regular case, showing the rigidity of the energy spectrum. The results agree perfectly with the prediction of Random Matrix Theory (dashed line).
3.4 Chaotic Regime - Random Matrix Theory
In the chaotic regime, the strong level repulsion induces a completely different result for the spacing distribution - see Fig. 4a - with practically no small spacings, and also a lack of large spacings.
A simple model is able to predict the statistical properties of energy levels. It assumes a maximum disorder in the system and that - from a statistical point of view - all basis sets are equivalently good (or bad). It therefore models the Hamiltonian by a set of random matrices which couple any basis state to all the other ones. Depending on the symmetry properties of the Hamiltonian (especially with respect to time reversal, see section 3.5), different ensembles of random matrices have to be considered. Let us assume for the moment that the system is time-reversal invariant and can be represented by a real symmetric matrix $H_{ij}$ in some basis. If the matrix size is $N$ (not to be confused with the number of degrees of freedom), this leaves $N(N+1)/2$ real independent random variables. The natural (normalized) measure over the matrix space is:
$$dH \propto \prod_{i=1..N} dH_{ii} \prod_{i,j=1..N;\,i<j} dH_{ij}, \tag{31}$$
which is invariant under any orthogonal transformation and thus puts all orthonormal bases on the same footing. As a consequence, the probability density $P(H)$ itself must be invariant under any orthogonal transformation.
For simplicity, we will assume that the various matrix elements are independent random variables [b]. With these basic assumptions, it is tedious but rather easy to show that the probability density can be written as [24]:
$$P(H) \propto \exp\left(-\frac{\mathrm{Tr}(H^2)}{4\sigma^2}\right), \tag{32}$$
where $\sigma$ is the only remaining free parameter.
From this equation and expanding the trace of $H^2$ as a function of the matrix elements $H_{ij}$, one easily obtains that all matrix elements have a Gaussian distribution with zero average and variance:
$$\langle H_{ij}^2\rangle = (1 + \delta_{ij})\,\sigma^2. \tag{33}$$
These properties define the Gaussian Orthogonal Ensemble (GOE) of random matrices [c].
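A minimal sketch of how matrices of the GOE can be drawn according to Eq. (33) is given below, using only the Python standard library. The function names are illustrative, and the invariance check is restricted to $2 \times 2$ matrices and a single arbitrary rotation angle, which is of course not a proof of orthogonal invariance.

```python
import math
import random
import statistics


def sample_goe(n, sigma=1.0, rng=random):
    """Real symmetric n x n matrix with independent Gaussian elements obeying Eq. (33)."""
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = rng.gauss(0.0, math.sqrt(2.0) * sigma)  # diagonal variance 2 sigma^2
        for j in range(i + 1, n):
            h[i][j] = h[j][i] = rng.gauss(0.0, sigma)      # off-diagonal variance sigma^2
    return h


def rotated_h11(h, theta):
    """Element (1,1) of a 2 x 2 matrix h after rotating the basis by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return c * c * h[0][0] + 2.0 * c * s * h[0][1] + s * s * h[1][1]


rng = random.Random(0)
samples = [sample_goe(2, sigma=1.0, rng=rng) for _ in range(20000)]
print("variance of H_11:", statistics.pvariance([h[0][0] for h in samples]))
print("variance of H_12:", statistics.pvariance([h[0][1] for h in samples]))
print("variance of rotated H_11:", statistics.pvariance([rotated_h11(h, 0.7) for h in samples]))
```

The factor of two between diagonal and off-diagonal variances is exactly what keeps the statistics of the matrix elements unchanged when the basis is rotated: the rotated diagonal element has the same variance as the original one.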
[b] This hypothesis is not at all crucial. It can be easily relaxed, generating other ensembles of random matrices with similar statistical properties.
[c] An alternate derivation of the GOE is based on information theory. If we look for the probability density which maximizes the entropy $S = -\int P(H)\ln P(H)\,dH$ with the constraint that the average value of $\mathrm{Tr}(H^2)$ is fixed, one rediscovers immediately (using Lagrange multipliers) the GOE. The idea behind this derivation is that we know basically nothing about the distribution and have to take it as general as possible.
Knowing the probability density, we have to extract the statistical properties of the eigenvalues. The ensemble being invariant under any orthogonal transformation, it is simple to use as random variables the $N$ eigenvalues and the $N(N-1)/2$ angles which characterize the orthogonal transformation bringing $H$ to its diagonal form. The joint probability distribution is then obtained by tracing over the $N(N-1)/2$ angles. This is rather straightforward, because of the orthogonal invariance. The angles appear neither in the probability distribution itself, nor in the Jacobian of the transformation. The calculation of the Jacobian is the only tricky point. For a $2 \times 2$ matrix, it is straightforward (reader, you should do the calculation by yourself!) to show that it is $|E_1 - E_2|$, where $E_{1,2}$ are the two eigenvalues. For an $N \times N$ matrix, it is simply the product of all $|E_i - E_j|$ terms [24]. One finally obtains the joint probability density:
$$P(E_1,..,E_N) \propto \left(\prod_{i,j=1..N;\,i<j} |E_i - E_j|\right)\exp\left(-\frac{\sum_i E_i^2}{4\sigma^2}\right). \tag{34}$$
This formula already contains a lot of information. Level repulsion is due to the $|E_i - E_j|$ factors, which exclude level degeneracies. This factor is purely geometrical: it comes from the Jacobian of the transformation from matrix elements to eigenvalues.
Although it looks simple, it is quite difficult to extract from the joint probability density the various statistical quantities of interest. It is easy for $N = 2$ and also feasible in the limit $N \to \infty$, but involves the use of either beautiful old-fashioned mathematics [26] or almost incomprehensible supersymmetry techniques [31]. Most formulas are explicit but not very illuminating; they can be found in [7,26].
The spacing distribution cannot be calculated in closed form, but it happens to be very close to the result obtained for $N = 2$, known as the Wigner distribution:
$$P(s) = \frac{\pi s}{2}\exp\left(-\frac{\pi s^2}{4}\right). \tag{35}$$
This distribution, shown in Fig. 4a, agrees extremely well with the numerical results obtained for the hydrogen atom in a magnetic field. Similar results have been obtained on dozens of quantum chaotic systems, both numerically and experimentally. Experimental examples are the energy levels of highly excited nuclei [7], rovibrational levels of the $\mathrm{NO}_2$ molecule [14], energy levels of the hydrogen atom in a magnetic field [29] and electromagnetic eigenmodes of microwave cavities [16]. The transition from a Poisson distribution in the classically regular regime to a Wigner distribution in the chaotic regime gives a characterization of quantum chaos, at least for highly excited states.
Other statistical properties have been studied and are found in good agreement with the predictions of Random Matrix Theory [28]. For example, the number variance, shown in Fig. 4b, is in perfect agreement with the GOE prediction which, for large $L$, is
$$\Sigma^2(L) \simeq \frac{2}{\pi^2}\ln(2\pi L). \tag{36}$$
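Both the Wigner distribution of Eq. (35) and the size of the number variance in Eq. (36) can be checked with a short sketch. It assumes $\sigma = 1$ and uses the closed-form eigenvalue spacing of a real symmetric $2 \times 2$ matrix, for which Eq. (35) is exact; it says nothing about large matrices. The function names are illustrative.

```python
import math
import random


def wigner_cdf(s):
    """Integral of the Wigner distribution, Eq. (35), from 0 to s."""
    return 1.0 - math.exp(-math.pi * s * s / 4.0)


def goe_2x2_spacings(n_samples, seed=0):
    """Eigenvalue spacings of 2 x 2 GOE matrices (sigma = 1), rescaled to unit mean."""
    rng = random.Random(seed)
    raw = []
    for _ in range(n_samples):
        h11 = rng.gauss(0.0, math.sqrt(2.0))
        h22 = rng.gauss(0.0, math.sqrt(2.0))
        h12 = rng.gauss(0.0, 1.0)
        # |E_1 - E_2| for the matrix [[h11, h12], [h12, h22]]
        raw.append(math.hypot(h11 - h22, 2.0 * h12))
    mean = sum(raw) / len(raw)
    return [s / mean for s in raw]


def fraction_below(spacings, s0):
    """Empirical probability of finding a spacing smaller than s0."""
    return sum(1 for s in spacings if s < s0) / len(spacings)


def goe_number_variance(length):
    """Large-L behaviour of the GOE number variance, Eq. (36)."""
    return 2.0 / math.pi ** 2 * math.log(2.0 * math.pi * length)


spacings = goe_2x2_spacings(50000)
for s0 in (0.1, 0.5, 1.0):
    print(s0, fraction_below(spacings, s0), wigner_cdf(s0), 1.0 - math.exp(-s0))

length = 1.0e6
print("GOE number variance:", goe_number_variance(length))
print("typical GOE fluctuation:", math.sqrt(goe_number_variance(length)))
print("typical Poisson fluctuation:", math.sqrt(length))
```

The last column of the first loop is the Poisson value of Eq. (29), which shows how strongly small spacings are suppressed by level repulsion. The last three lines evaluate Eq. (36) for one million levels and compare the resulting fluctuation with the Poisson value $\sqrt{L}$.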
Note that the number variance is much smaller here than in the regular case. The spectrum is extremely rigid: for $L = 10^6$, $\Sigma^2$ is only of the order of 3. This means that the typical fluctuation of the number of levels is 1 or 2 additional or missing levels over a range of one million levels. In the Poisson model, the typical fluctuation would be $\sqrt{L} = 1000$ levels! This extraordinarily large rigidity is due to the strong couplings existing between all the states in the model. If a fluctuation makes the level repulsion abnormally large between two states, they cannot repel too strongly because they are themselves strongly pushed by the other levels. From maximum disorder at the microscopic level, a globally rigid structure is born.
Finding universal properties in the local statistical properties of energy levels for chaotic systems is not a real surprise. As discussed in the preceding section, this range of energy (mean level spacing $\Delta$) corresponds to a long time behaviour ($\hbar/\Delta \sim T_{\mathrm{Heisenberg}} \gg T_{\mathrm{Min}}$), where chaos is classically fully developed with its universal properties. Universality is also observed in the corresponding quantum dynamics. On the other hand, at shorter times of the order of $T_{\mathrm{Min}}$, non-universal properties exist in the classical behaviour. This also implies a deviation from the predictions of Random Matrix Theory on a large energy scale, as has been numerically and experimentally observed [21,30,29].
3.5 Random Matrix Theory - Continued
Random Matrix Theory can also predict the behaviour of quantities beyond the energy levels. For example, it can predict the distribution of the wavefunction amplitude [32], the lifetimes of resonances in open systems [33,34] or the distributions of transition matrix elements [22].
I now come back to the time-reversal symmetry which is necessary to obtain the GOE. If time-reversal symmetry (or more generally any anti-unitary symmetry) is broken, the Hamiltonian cannot be written as a real symmetric matrix, but rather as a complex Hermitian matrix. One has to change the ensemble of random matrices and define the Gaussian Unitary Ensemble (GUE). The natural measure is now:

$$dH \propto \prod_{i=1..N} dH_{ii} \prod_{i,j=1..N;\ i<j} d\mathrm{Re}H_{ij}\; d\mathrm{Im}H_{ij} \qquad (37)$$

which is invariant under any unitary transformation and thus again puts all orthonormal bases on the same footing. The probability distribution for $H$ is again found to be given by Eq. (32): both $\mathrm{Re}H_{ij}$ and $\mathrm{Im}H_{ij}$ are Gaussian distributed. This adds more level repulsion because two arbitrary states have two chances to be coupled and to repel. Not surprisingly, this is visible in the joint probability distribution, which takes the form:
$$P(E_1,..,E_N) \propto \left(\prod_{i,j=1..N;\ i<j} |E_i - E_j|^2\right) \exp\left(-\frac{\sum_i E_i^2}{4\sigma^2}\right) \qquad (38)$$
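The stronger level repulsion of the GUE can be checked numerically on the smallest possible case. The sketch below is an illustration added to these notes, not a calculation from the lectures: it draws 2×2 real symmetric and complex Hermitian matrices with Gaussian entries and compares how often the two levels come closer than one tenth of the mean spacing. The function name `spacings_2x2`, the variance convention (unit variance on the diagonal, 1/2 for each off-diagonal component) and the sample size are assumptions made for convenience.

```python
import numpy as np

def spacings_2x2(beta, n_samples, rng):
    """Level spacings of random 2x2 matrices, in units of the mean spacing.

    beta = 1: real symmetric matrices (GOE-like).
    beta = 2: complex Hermitian matrices (GUE-like).
    """
    # Diagonal elements: real Gaussian variables with unit variance.
    h11 = rng.normal(size=n_samples)
    h22 = rng.normal(size=n_samples)
    # Off-diagonal element: variance 1/2 for each independent component.
    re12 = rng.normal(scale=np.sqrt(0.5), size=n_samples)
    if beta == 2:
        im12 = rng.normal(scale=np.sqrt(0.5), size=n_samples)
    else:
        im12 = np.zeros(n_samples)
    # Eigenvalue difference of [[h11, h12], [conj(h12), h22]].
    s = np.sqrt((h11 - h22) ** 2 + 4.0 * (re12 ** 2 + im12 ** 2))
    return s / s.mean()

rng = np.random.default_rng(0)
s_goe = spacings_2x2(1, 200_000, rng)
s_gue = spacings_2x2(2, 200_000, rng)

# Fraction of nearly degenerate pairs (spacing below 0.1 mean spacing).
frac_goe = np.mean(s_goe < 0.1)
frac_gue = np.mean(s_gue < 0.1)
print(f"GOE: {frac_goe:.5f}   GUE: {frac_gue:.5f}")
```

Small spacings come out markedly rarer in the Hermitian case, in line with the exponent 2 in Eq. (38): the probability of a small spacing vanishes faster when both the real and the imaginary part of the coupling must be small at the same time.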
The calculations are similar to the GOE case (although sometimes simpler) and the predicted distributions agree very well with numerical results [7,35]. As far as I know, there is no convincing experimental result obtained in this regime.

One also has to consider the special case of half-integer spin systems with time-reversal invariance: there, all levels are doubly degenerate (Kramers degeneracy). If some rotational invariance exists, this degeneracy is hidden and the GOE should be used in each rotational series. If the rotational invariance is broken, every level will be exactly doubly degenerate and the Gaussian Symplectic Ensemble (GSE) of random matrices has to be used [7,24]. It is essentially identical to the GUE, with an exponent 4 instead of 2 in the joint probability density, Eq. (38).

It is important to notice the role of symmetries for level statistics. If a good quantum number survives in a system (for example a discrete two-fold symmetry), the states with the same good quantum number will interact, but they will ignore the other states. Thus, even if each series with a fixed quantum number obeys the GOE statistics, the total spectrum will appear as the superposition of several uncorrelated GOE spectra, which has completely different statistical properties. It is very important to be sure that one has a pure sequence of levels before analyzing it. This may be difficult in a real
experiment because of stray mixing between series; it is usually much easier in numerical experiments.

Finally, intermediate regimes have been studied, for example between the regular Poisson and the chaotic GOE regimes. In general, this transition is not universal.

4 Semiclassical Approximation
The previous section has shown the existence of universal fluctuation properties associated with chaos for quantum systems. These properties appear on a short energy range, of the order of the mean level spacing, that is for times of the order of the Heisenberg time, much longer than the period of the shortest periodic orbit. This also implies that a detailed analysis of all energy levels and eigenstates does not make sense: it brings no interesting information on the physics of the chaotic phenomenon beyond the statistical aspects. On the other hand, this does not mean that these energy levels do not carry any information; it is just that this information has to be extracted in a different way. More precisely, as the individual specificity of a chaotic system manifests itself at relatively short times, before universal chaotic features dominate, it has to be found in the long energy range characteristics of the quantum spectra. For such a short time scale, as discussed in section 2, a semiclassical approximation might be used. It is the goal of this section to show how this can be implemented and eventually used to make some quantitative predictions on quantum chaotic systems which go beyond simple statistical statements.

4.1 Regular Systems - EBK/WKB Quantization
For completely integrable systems, where there exist as many independent constants of motion as degrees of freedom, there is a standard semiclassical theory which is a simple extension of the well known WKB theory for time-independent one-dimensional systems [36]. We assume the integrability of the system [1], which implies the existence of $d$ pairs of canonically conjugate action-angle variables $(\theta_i, I_i)$ for $1 \le i \le d$ such that the Hamiltonian depends only on the actions:

$$H = H(I_1,..,I_d) \qquad (39)$$
For a given trajectory, the actions $I_i$ are constants of motion which define an invariant torus, a $d$-dimensional manifold embedded in the $2d$-dimensional phase space, and the angles $\theta_i$ (defined modulo $2\pi$) evolve linearly with time. A generic trajectory densely and uniformly fills the invariant torus, which implies that invariant tori are stationary structures during the time evolution. Hence, they are the relevant structures for building - in the semiclassical approximation - the eigenstates of the system.

Let us now turn to the technicalities. One writes the wavefunction as:

$$\psi(\mathbf{q}) = A(\mathbf{q}) \exp\left(i\frac{S(\mathbf{q})}{\hbar}\right) \qquad (40)$$
where $A(\mathbf{q})$ and $S(\mathbf{q})$ are real functions. We also assume that the Hamiltonian is of the form:

$$H = \frac{\mathbf{p}^2}{2m} + V(\mathbf{q}) \qquad (41)$$
An elementary manipulation of the time-independent Schrödinger equation shows that it leads without any approximation to the two following real equations:

$$\nabla\left(A^2(\mathbf{q})\nabla S(\mathbf{q})\right) = 0 \qquad (42)$$

$$\frac{(\nabla S(\mathbf{q}))^2}{2m} + V(\mathbf{q}) - E = \frac{\hbar^2}{2m}\frac{\nabla^2 A(\mathbf{q})}{A(\mathbf{q})} \qquad (43)$$

The EBK approximation amounts to neglecting the right-hand side term in the second equation, because it is multiplied by $\hbar^2$ and thus likely to be small in the semiclassical limit. With this approximation, the equation becomes the classical Hamilton-Jacobi equation for the action [2]:

$$H(\mathbf{q},\nabla S(\mathbf{q})) - E = 0 \qquad (44)$$
Hence, its solutions are known and can be written, at least locally:

$$S(\mathbf{q}) = \int \mathbf{p}.d\mathbf{q} \qquad (45)$$

where the integral is calculated along a trajectory (see footnote d). As equation (44) is a purely classical one, we can perform a canonical change of coordinates to action-angle variables in order to solve it. As the actions are constant, we get the trivial solutions:

$$S(\theta_1,..,\theta_d) = \sum_{i=1..d} I_i\theta_i \qquad (46)$$
This is a locally uniquely defined function of the coordinates, i.e. a single-valued solution of the Hamilton-Jacobi equation, and provides us with an approximate solution of the Schrödinger equation. However, this solution is not defined everywhere in configuration space, because the projection of the invariant torus over configuration space is only a finite region of it. At the boundary of this region, there are caustics. The simplest example is a one-dimensional potential well: the oscillatory motion covers only a finite position range. The two extreme positions are turning points of the classical motion, where the velocity changes sign and the particle traces back from where it came. There, $\nabla S(\mathbf{q})$ vanishes, which produces a divergence of the amplitude $A(\mathbf{q})$. This in turn makes the right-hand side of Eq. (43) tend to infinity and invalidates the semiclassical approximation. Each caustic requires a careful specific treatment in order to overcome this problem. Such a treatment goes beyond the scope of these lectures, but the result is simple: the solution, Eq. (46), can be continued through the caustics, provided a $-\pi/2$ phase factor is added to the wavefunction for each caustic crossed.

Footnote d: The first equation, Eq. (42), is nothing but a continuity equation which allows one to compute the amplitude $A$ of the wavefunction once the phase $S$ is known.

When the angle $\theta_i$ is smoothly increased by $2\pi$ (with the other angles fixed), one follows a closed loop on the invariant torus and comes back to the initial point. In order for the wavefunction to be single-valued, the total phase accumulated on such a closed loop must be an integer multiple of $2\pi$. There are two contributions to this phase: the first one is the change in the action $S$ divided by $\hbar$, that is $2\pi I_i/\hbar$; the second one is $-\pi/2$ multiplied by the number of caustics crossed. The single-valued character of the wavefunction thus implies:

$$I_i = \left(n_i + \frac{\alpha_i}{4}\right)\hbar \qquad (47)$$
where $n_i$ is a non-negative integer and $\alpha_i$ is the Maslov index counting the number of caustics. Alternatively, this quantization condition can be rewritten as a function of the original coordinates as:

$$\frac{1}{2\pi}\oint_{\gamma_i} \mathbf{p}.d\mathbf{q} = \left(n_i + \frac{\alpha_i}{4}\right)\hbar \qquad (48)$$
where the integral is evaluated along a closed loop $\gamma_i$ at the surface of the invariant torus. As there are $d$ independent irreducible closed loops at the surface of the invariant torus (or equivalently $d$ actions $I_i$), this provides us with a set of $d$ quantization conditions and $d$ quantum numbers.

These quantization rules are known as the EBK (Einstein, Brillouin, Keller) quantization conditions or invariant torus quantization [37]. The important point is that they do not use the classical trajectories, but the classical invariant tori. For a one-dimensional system, the trajectories coincide with the tori and one rediscovers the standard WKB quantization. This is also true for a degenerate multi-dimensional system like the hydrogen atom, where all trajectories are closed.

The EBK quantization rules can be used in a practical calculation. For example, for the hydrogen atom in a weak magnetic field, the classical dynamics is mainly regular with a phase space full of invariant tori and the EBK scheme can be used. The semiclassical prediction for the energy levels is very accurate and practically indistinguishable from the exact quantum result in Fig. 2a.

Another consequence of the EBK semiclassical quantization is that the eigenstates are localized on the invariant tori and that $d$ good quantum numbers exist. As discussed in section 3.1, this implies that energy levels cross and that the statistical properties of the energy spectrum are well described by a Poisson law. In other words, the EBK quantization rules correctly predict the observed statistical properties of energy levels, see section 3.3.
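As an illustration of how the quantization condition, Eq. (48), is used in practice, here is a minimal sketch for a one-dimensional symmetric potential well, where the torus is the trajectory itself and the Maslov index is 2 (two turning points). It is not taken from the lectures: the helper names `turning_point`, `action` and `ebk_level`, the units ($\hbar = m = 1$) and the restriction to wells that increase monotonically for $q > 0$ are assumptions of this example.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption of this sketch)

def turning_point(V, E, q_max=100.0):
    """Right turning point of a symmetric well: solve V(q) = E on [0, q_max]."""
    lo, hi = 0.0, q_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if V(mid) < E:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def action(V, E, m=1.0, n_grid=2000):
    """Action I(E) = (1 / 2 pi) * (closed-loop integral of p dq)."""
    qt = turning_point(V, E)
    h = 0.5 * math.pi / n_grid
    total = 0.0
    for k in range(n_grid):
        # Substitution q = qt * sin(phi) removes the square-root endpoint singularity.
        phi = (k + 0.5) * h
        q = qt * math.sin(phi)
        p = math.sqrt(max(2.0 * m * (E - V(q)), 0.0))
        total += p * qt * math.cos(phi) * h
    # The loop integral is 4 times the integral from 0 to the turning point.
    return 4.0 * total / (2.0 * math.pi)

def ebk_level(V, n, maslov=2, E_max=50.0):
    """Solve I(E) = (n + maslov / 4) * hbar for E by bisection on [0, E_max]."""
    target = (n + maslov / 4.0) * HBAR
    lo, hi = 1e-12, E_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if action(V, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def harmonic(q):
    return 0.5 * q * q

def quartic(q):
    return 0.25 * q ** 4

harmonic_levels = [ebk_level(harmonic, n) for n in range(3)]
quartic_levels = [ebk_level(quartic, n) for n in range(3)]
print("harmonic:", harmonic_levels)
print("quartic: ", quartic_levels)
```

For the harmonic well the levels come out very close to $(n + 1/2)\hbar\omega$, which is the well-known case where the semiclassical quantization happens to be exact; for the quartic well the numbers are only semiclassical estimates.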
4.2 Semiclassical Propagator
For a chaotic system, the invariant tori do not exist and the preceding analysis totally breaks down. There is no longer any structure which can be used to build global solutions of the Hamilton-Jacobi equation with a single-valued wavefunction. A completely different approach has to be used. As a direct solution of the time-independent Schrödinger equation seems out of reach, one tries to calculate a semiclassical approximation of the unitary evolution operator. This is also more convenient if one wants to compare to the classical dynamics, as the regular or chaotic character manifests itself in the time domain. The propagator is defined as a matrix element of the evolution operator in the configuration space representation:

$$K(\mathbf{q}',t';\mathbf{q},t) = \langle\mathbf{q}'|U(t',t)|\mathbf{q}\rangle \qquad (49)$$
The semiclassical approximation for the propagator is very similar to the one already discussed for the time-independent Schrödinger equation in section 4.1. It relies on a separation of phase and amplitude and on the neglect of higher order terms in $\hbar$. One then finds the time-dependent Hamilton-Jacobi equation for the action [2], which can be locally solved along trajectories. The result is known as the Van Vleck propagator [38,39,40]:

$$K(\mathbf{q}',t';\mathbf{q},t) = \sum_{\text{Clas. Traj.}} \left(\frac{1}{2\pi i\hbar}\right)^{d/2} \left|\mathrm{Det}\left(\frac{\partial^2 R(\mathbf{q}',t';\mathbf{q},t)}{\partial\mathbf{q}'\,\partial\mathbf{q}}\right)\right|^{1/2} \exp\left(i\frac{R(\mathbf{q}',t';\mathbf{q},t)}{\hbar} - i\nu\frac{\pi}{2}\right) \qquad (50)$$
where the sum is over all the classical trajectories going from $(\mathbf{q},t)$ to $(\mathbf{q}',t')$. The function $R(\mathbf{q}',t';\mathbf{q},t)$ is called the classical action, although it is different from the action used previously which, according to [2], should be called reduced action. The difference is that $R$ is suitable when the time interval $(t,t')$ is fixed while $S$ is used at fixed energy. Altogether, the two functions differ by $E(t'-t)$. $R$ is just the integral of the Lagrangian along the trajectory:

$$R(\mathbf{q}',t';\mathbf{q},t) = \int_t^{t'} L(\mathbf{q},\dot{\mathbf{q}},\tau)\,d\tau \qquad (51)$$
The non-negative integer $\nu$ counts the number of caustics encountered along the trajectory and is called a Morse index (see footnote e). A few remarks should be made on this formula:
• The structure of this formula is completely analogous to the one used in the energy domain, with a phase expressed as a purely classical quantity evaluated along a trajectory, divided by $\hbar$, and a smoothly varying amplitude.
• The fact that the same quantity $R$ appears in the phase and the amplitude is not surprising. It ensures the unitarity of the time evolution. It is the counterpart - in the time domain - of the continuity equation in the energy domain, Eq. (42). In fact, the prefactor $\sqrt{\mathrm{Det}}$ is of purely classical origin. It just represents how a classical phase space density initially localized in $\mathbf{q}$ and uniformly spread in $\mathbf{p}$ evolves according to the Liouville equation, Eq. (7).
• At the caustics, the amplitude diverges and the semiclassical approximation breaks down. However, beyond the caustics, the semiclassical approximation recovers its validity, provided the appropriate $-\pi/2$ phase factor is added (through the Morse index), in complete analogy with the EBK approximation.
• At short time difference $t'-t$, there is only one classical trajectory connecting $(\mathbf{q},t)$ to $(\mathbf{q}',t')$ (more or less a straight line). The existence of multiple trajectories connecting the starting and ending points is analogous to the existence of multiple paths at the surface of an invariant torus (see section 4.1). For a chaotic system, at long times, the trajectories become very complicated and their number grows exponentially, which renders the use of the semiclassical propagator more and more difficult.
• A completely different derivation of the Van Vleck propagator is possible using the Feynman path integral [41] formulation of quantum mechanics. The propagator can be exactly written as a superposition of contributions of all paths connecting the starting and ending points. The phase of each contribution is the integral of the Lagrangian along the path divided by $\hbar$. In the semiclassical limit, the sum over paths can be calculated by the stationary phase approximation. The paths with stationary phase are precisely the classical trajectories, and the prefactor in the stationary phase integration exactly gives the Van Vleck amplitude. This approach explains why the contributions of the different classical trajectories have to be added coherently in the propagator. It also explains why the semiclassical propagator can cross the caustics safely.

Footnote e: The Maslov and Morse indices are not necessarily equal, as the first one deals with trajectories at fixed energy and the second at fixed $t'-t$.

4.3 Green's function
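The next step is to go from the time domain to the energy domain. The Green's function is:

$$G(E) = \frac{1}{E - H} = \frac{i}{\hbar}\int_{-\infty}^{0} U(\tau,0) \exp\left(i\frac{E\tau}{\hbar}\right) d\tau \qquad (52)$$

where the last equality is valid for $E$ in the lower half complex plane. In configuration space, this gives:

$$G(\mathbf{q}',\mathbf{q},E) = \langle\mathbf{q}'|G(E)|\mathbf{q}\rangle = \frac{i}{\hbar}\int_{-\infty}^{0} K(\mathbf{q}',\tau;\mathbf{q},0) \exp\left(i\frac{E\tau}{\hbar}\right) d\tau \qquad (53)$$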
In order to get a semiclassical approximation for the Green's function, one plugs the Van Vleck propagator, Eq. (50), into Eq. (53). The integral over $\tau$ can be performed by stationary phase approximation. The stationary phase condition reads:

$$\frac{\partial R(\mathbf{q}',\tau;\mathbf{q},0)}{\partial\tau} + E = 0 \qquad (54)$$
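From the Lagrange-Hamilton equation [2,40], the partial derivative is minus the Hamilton function. Hence, stationary phase selects trajectories going from $\mathbf{q}$ to $\mathbf{q}'$ with precisely energy $E$. This allows one to write the semiclassical Green's function at energy $E$ as a sum over classical trajectories with energy $E$, a physically satisfactory result. The phase of each contribution is the sum of $R$ (calculated along the orbit) and $E\tau$, which gives the reduced action $S$. The detailed calculation of the various prefactors is not very difficult, but rather tedious; see [40] for details. It is convenient to distinguish the coordinate $q_\parallel$ chosen along the trajectory and the coordinates $\mathbf{q}_\perp$ transverse to the orbit. One finally obtains the semiclassical Green's function [38,42,40]:

$$G(\mathbf{q}',\mathbf{q},E) \propto \sum_{\text{Clas. Traj.}} \frac{1}{|\dot{q}_\parallel \dot{q}'_\parallel|^{1/2}} \left|\mathrm{Det}\left(\frac{\partial^2 S(\mathbf{q},\mathbf{q}',E)}{\partial\mathbf{q}_\perp\,\partial\mathbf{q}'_\perp}\right)\right|^{1/2} \exp\left(i\frac{S(\mathbf{q},\mathbf{q}',E)}{\hbar} - i\nu\frac{\pi}{2}\right) \qquad (55)$$

with

$$S(\mathbf{q},\mathbf{q}',E) = \int_{\mathbf{q}}^{\mathbf{q}'} \mathbf{p}.d\mathbf{q} \qquad (56)$$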
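The stationary phase approximation used to go from Eq. (53) to Eq. (55) can be illustrated on a one-dimensional toy integral. The sketch below is an added example, not part of the lectures: it compares a brute-force numerical evaluation of the integral of $\exp(-x^2)\exp(ix^2/2\hbar)$ over the real line with the leading stationary phase estimate $\sqrt{2\pi i\hbar}$ around the stationary point $x = 0$. The integrand, the function names and the grid parameters are assumptions chosen so that everything stays elementary.

```python
import numpy as np

def oscillatory_integral(hbar, n_points=400_001, x_max=8.0):
    """Trapezoidal estimate of the integral of exp(-x^2) exp(i x^2 / (2 hbar))."""
    x = np.linspace(-x_max, x_max, n_points)
    f = np.exp(-x ** 2) * np.exp(1j * x ** 2 / (2.0 * hbar))
    dx = x[1] - x[0]
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))

def stationary_phase(hbar):
    """Leading-order estimate: amplitude 1 at x = 0, phase x^2 / 2 with unit curvature."""
    return np.sqrt(2j * np.pi * hbar)

def relative_error(hbar):
    numeric = oscillatory_integral(hbar)
    return abs(numeric - stationary_phase(hbar)) / abs(numeric)

for hbar in (0.1, 0.01, 0.001):
    print(f"hbar = {hbar:6.3f}   relative error = {relative_error(hbar):.2e}")
```

The relative error shrinks roughly in proportion to $\hbar$, which is the sense in which the stationary phase result is the leading term of a semiclassical expansion. It also shows why the method is unreliable when the action involved is not large compared to $\hbar$.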
Again, the prefactor simply represents the classical evolution of a phase space density with fixed energy, according to the Liouville equation. This semiclassical approximation breaks down for very short trajectories. Indeed, the integral over $\tau$ cannot be performed by stationary phase approximation in such a case. A specific short time expansion is possible when $S(\mathbf{q},\mathbf{q}',E)$ is not much larger than $\hbar$. It basically consists in ignoring the effect of the potential and using the free Green's function [38].

The Green's function by itself is not very illuminating. In order to obtain some information on the energy spectrum and eigenstates, one needs a more global quantity.

4.4 Trace Formula
The Green's function, Eq. (52), has a singularity at each energy level, like the density of states, Eq. (23). They are actually related by the simple equation:

$$d(E) = \frac{1}{\pi}\mathrm{Im}\,\mathrm{Tr}\,G(E) = \frac{1}{\pi}\mathrm{Im}\int d\mathbf{q}\; G(\mathbf{q},\mathbf{q},E) \qquad (57)$$
If one uses the semiclassical Green's function, Eq. (55), the density of states is obtained as a sum over closed trajectories, starting and ending at position $\mathbf{q}$. The last integral over position $\mathbf{q}$ can again be performed by stationary phase. The Lagrange equations tell us that the partial derivative of $S(\mathbf{q},\mathbf{q}',E)$ with respect to $\mathbf{q}'$ is the final momentum $\mathbf{p}'$, while its partial derivative with respect to $\mathbf{q}$ is minus the initial momentum $\mathbf{p}$. Stationary phase thus selects closed orbits where the initial and final momenta are equal, that is periodic orbits. Putting everything together gives the celebrated trace formula (also known as the Gutzwiller trace formula, from one of its authors), written here for a two-dimensional system [42,39,38,40]:

$$d(E) = \bar{d}(E) + \frac{1}{\pi\hbar} \sum_{\text{p.p.o. } k}\ \sum_{\text{repetitions } r} \frac{T_k}{\sqrt{\left|\mathrm{Det}\left(1 - M_k^r\right)\right|}} \cos\left[r\left(\frac{S_k}{\hbar} - \nu_k\frac{\pi}{2}\right)\right] \qquad (58)$$
where the sum is performed over all primitive periodic orbits (i.e. periodic orbits which do not retrace the same path several times) and all their repetitions $r > 0$. $T_k$ is the period of the orbit, $S_k$ its action, $\nu_k$ its Maslov index and $M_k$ the $2\times 2$ monodromy matrix describing the linear change of the transverse coordinates after one period; for details, see [38,40].

The term $\bar{d}(E)$, whose expression is given in Eq. (25), comes from the "zero-length" trajectories. Indeed, for such trajectories, the semiclassical approximation for the Green's function breaks down and a repaired formula (see previous section) has to be used, which produces this smooth term.

The trace formula deserves several comments:
• The trace formula is a central result in the area of quantum chaos, as it expresses a purely quantum quantity (the density of states) as a function of classical quantities (related to periodic orbits) and the constant $\hbar$.
• It uses only periodic orbits, which shows that they are the skeleton of the chaotic phase space. In that sense, they replace the invariant tori used for regular systems.
• Each periodic orbit contributes to the density of states with an oscillatory contribution. The period of these modulations corresponds to a change of the argument of the cosine function, $S_k/\hbar$, by $2\pi$. As the derivative of the action with respect to energy is the period $T_k$ of the orbit, the corresponding characteristic energy scale is $2\pi\hbar/T_k$. In the semiclassical regime, this is much larger than the mean level spacing $\Delta = 2\pi\hbar/T_{\rm Heisenberg}$. Hence, the trace formula describes the long range correlations in the energy spectrum, not the short range fluctuations described by Random Matrix Theory, see section 3.4.
• The trace formula breaks the simple connection between a given energy level and a simple structure in phase space. An energy level is a $\delta$ singularity in the density of states, while each orbit contributes to a modulation of the density of states with finite amplitude. Thus, building a $\delta$ peak requires a coherent conspiracy of an infinite number of periodic orbits.
• The present formula is restricted to isolated periodic orbits, such that the phase space distance to the closest periodic orbit is larger than $\hbar$. For non-isolated periodic orbits, the simple stationary phase treatment fails. A specific treatment is required and various similar formulas can be written. In particular, for integrable systems, the sum over periodic orbits can be performed analytically using a Poisson sum formula [37]: the result is exactly equivalent to the EBK quantization scheme described in section 4.1.
• The formula is valid only at lowest order in Planck's constant $\hbar$. Including higher orders in the various stationary phase approximations is tedious, but feasible [43].
• If one is not interested in the density of states, but in some other physical quantity, it is often possible to get similar expressions. The general strategy is to express the quantity of interest using the Green's function of the system, then to use the semiclassical Green's function. For example, the photo-ionization cross-section of a hydrogen atom in a magnetic field has been calculated in [44] as a sum over periodic orbits starting and ending at the nucleus.

The practical use of the trace formula is difficult. Indeed, extracting individual energy levels by adding oscillatory contributions requires in principle an infinite number of periodic orbits. In practice, it may be argued that it is enough to sum up all orbits with periods up to the Heisenberg time.
Indeed, longer orbits will produce modulations on an energy scale smaller than the mean level spacing, and are thus expected to cancel out and to be irrelevant (only useful to make the peaks narrower, not to move their positions). In the semiclassical limit, $T_{\rm Heisenberg}$ is so much longer than $T_{\rm Min}$ that the proliferation of long orbits makes the procedure impractical. However, for relatively low excited states, $T_{\rm Heisenberg}$ is not much larger than $T_{\rm Min}$ and the trace formula has been successfully used to compute several states [40,11].

Another possibility is to use an open system with resonances instead of bound states. There, the density of states has only bumps related to the resonances, and the use of a finite (and hopefully small) number of periodic orbits may correctly reproduce the quantum properties. This is shown in Fig. 5 for the hydrogen atom in a magnetic field at scaled energy $\epsilon = 0.5$. With a few hundred periodic orbits, we are able to reproduce the finest details in the apparently random fluctuations of the photo-ionization cross-section. This is a striking illustration of the strength of semiclassical methods.

4.5 Convergence Properties of the Trace Formula
Because of the proliferation of long periodic orbits, it is not clear whether the sum in the trace formula converges or not. There is a competition between the exponential proliferation of periodic orbits and the decrease of the individual amplitudes (long orbits tend to be very unstable, hence creating large denominators in the trace formula). For a generic bound system, the proliferation overcomes the decrease of amplitudes and the sum does not converge [38]. This means that, depending on the order in which the various periodic orbit contributions are added, the result can be anything!

However, it is rather clear that the periodic orbits are not independent from each other and that the information they contain is somewhat structured: most of the information contained in the very long orbits can be essentially extracted from shorter periodic orbits. The idea is thus to use the structure of the classical orbits to make the trace formula more convergent. The first step is to pass from the density of states to the so-called spectral determinant defined by:

$$f(E) = \prod_i (E - E_i) \qquad (59)$$
which has a zero at each energy level. It can be rewritten as: f(E) = exp ( - - f
Im TtG(e) de j
(60)
This quantity is of course highly divergent but can be regularized through multiplication by a smooth quantity having no zero [38]. This corresponds to removing in the Green's function the smooth contribution of the zero length trajectories. One can thus define a "dynamical Zeta function" by:
Z(E) = exp f - -
f
Im TrG reg (e) de j
(61)
where $G_{\rm reg}$ contains only the periodic orbit contributions. By inserting the semiclassical Green's function in Eq. (61), a rather simple manipulation allows one to sum over the repetitions and to obtain the infinite product [38]:

$$Z(E) = \prod_{m=0}^{\infty}\ \prod_{\text{p.p.o. } k} \left(1 - \frac{\exp\left(i\frac{S_k}{\hbar} - i\nu_k\frac{\pi}{2}\right)}{|\Lambda_k|^{1/2}\Lambda_k^m}\right) \qquad (62)$$

where $\Lambda_k$ is the eigenvalue (larger than 1 in magnitude) of the monodromy matrix $M_k$.

The transformation from an infinite sum to an infinite product does not cure the lack of convergence. Of course, the zeros of the infinite product are not the zeros of its individual terms (which do not have any for real energy, because $|\Lambda_k|$ is always larger than 1). However, the larger $m$, the larger the denominator and the more convergent the infinite product over primitive periodic orbits. Hence, the most significant zeros - the most important ones for the physical properties - will come only from the $m = 0$ term (see footnote f).

When expanding the infinite product, there are some crossed terms between orbits appearing with a positive sign which might cancel approximately with negative terms from more complicated orbits. The idea of the cycle expansion is to group such terms so that maximum cancellation takes place. Suppose that I have two simple orbits labelled 0 and 1 and a more complicated orbit labelled 01 which is roughly orbit 0 followed by orbit 1. If the action (resp. Maslov index) of orbit 01 was exactly the sum of the actions (resp. Maslov indices) of orbits 0 and 1, and its instability eigenvalue $\Lambda$ the product of the instability eigenvalues of orbits 0 and 1, complete cancellation would take place. As these properties cannot be exact, only partial cancellation takes place. But, for longer and longer orbits, the cancellation is better and better and the cycle expansion might be convergent (although there is no proof). This simple idea can be generalized to take into account all the orbits if there exists an efficient coding scheme - also known as a good symbolic dynamics - for the periodic orbits. Then, there are some cases, like the 3-disks scattering problem [38], where the cycle expansion can be made convergent and can be used to efficiently calculate the quantum properties of the system.

Figure 5. The photo-ionization cross-section of the hydrogen atom in a magnetic field $\gamma$, at constant scaled energy $\epsilon = E\gamma^{-2/3} = 0.5$ (plotted versus $\gamma^{-1/3}$), where the classical motion is fully chaotic. The smooth part of the cross-section is removed, in order to emphasize the apparently erratic fluctuations. Upper panel: the "exact" quantum cross-section calculated numerically. Lower panel: the semiclassical approximation of the cross-section, as calculated using periodic orbit theory. All the fine details - which look like random fluctuations - are well reproduced by periodic orbit theory.

4.6 An Example: the Helium Atom
The idea of the cycle expansion has been successfully used by Wintgen and coworkers [45] to calculate some energy levels of the helium atom. Although the system is not fully chaotic, most of the dynamics is. Of special interest is the eZe configuration where the electrons and the nucleus lie on a straight line, with the electrons on opposite sides of the nucleus. In such a configuration, one can find a symbolic dynamics for the periodic orbits: any periodic orbit can be uniquely labelled by the sequence in which the electrons hit the nucleus. The motion transverse to the eZe configuration is stable and can be taken into account. By calculating the zeros of the infinite product, Wintgen et al. have been able to perform a fully semiclassical calculation of several energy levels of the helium atom. Some results are displayed in Table 1. For the ground state, the cycle expansion result differs by only 0.7% from the exact quantum result. For excited states, it is even better.

Footnote f: In some cases, the $m > 0$ terms are absolutely convergent.

State    Periodic Orbit    Cycle Expansion    Exact quantum
1s1s     -3.0984           -2.9248            -2.9037
2s2s     -0.8044           -0.7727            -0.7779
2s3s                       -0.5902            -0.5899
4s7s                       -0.1426            -0.1426
5s5s     -0.131            -0.129             -0.129

Table 1. Some energy levels (in atomic units) of the $^1S^e$ series of the helium atom, compared to the simple semiclassical quantization using only the simplest periodic orbit and the more refined "cycle expansion" which includes a set of unstable periodic orbits. The agreement is remarkable, which proves the efficiency of semiclassical methods for this chaotic system (courtesy of D. Wintgen).

Thus, these authors have been able to solve a problem open since the beginning of the century, when pioneers of quantum mechanics tried to quantize helium after having successfully quantized the hydrogen atom. However, these pioneers had no idea of the classically chaotic nature of phase space; they were not even thinking of a trace formula, not to speak of cycle expansion. They could not have the key idea that the correspondence between a classical orbit and an energy level is not one-to-one, but that an infinite number of periodic orbits is needed to build a chaotic quantum eigenstate. It is only after 70 years of work on classical and semiclassical dynamics that their goal could be met.

5 Conclusion
In these lectures, I hope to have convinced the audience that we have some partial answers to the question raised in the introduction. Chaos manifests itself in the quantum properties of the systems, like the energy levels and the eigenstates, in at least two ways:
• On a narrow energy interval - roughly at the level of individual eigenstates - the quantum structures display strong, apparently random, fluctuations and a high sensitivity to any small change of an external parameter. This is the quantum counterpart of the classical sensitivity to initial conditions.
• On a large energy scale, where spectral properties are averaged over several states, the specific features of the studied system become manifest and are mainly related to the periodic orbits of the classical system.

For regular systems, efficient semiclassical methods exist. For chaotic systems, we understand the role of periodic orbits. Yet, we are not often able to compute individual highly excited states of a chaotic system from the knowledge of its classical dynamics. Using periodic orbits, we can compute low resolution spectra. Whether periodic orbit formulas are the end of the game or just an intermediate step towards a more global understanding is unknown.

Finally, several very important aspects of quantum chaos have not been discussed in these lectures. The first one is the behaviour of the eigenstates in the chaotic regime. As a first approximation - and in the spirit of Random Matrix Theory - they are just unstructured random waves resulting from the interferences between plenty of classical paths [7,26]. However, this is not completely true and one often observes "scars" of periodic orbits, i.e. an enhanced probability density in the immediate vicinity of a periodic orbit [46,16,15]. Periodic orbit theory explains this phenomenon, which has major experimental consequences. The second major phenomenon not discussed in these lectures is localization. At long times, it happens that - in sharp contrast with the classical behaviour - the quantum behaviour is not ergodic at all and the system remains localized [5,9]. This is related to the freezing of the quantum dynamics after the Heisenberg time, see section 2, and is also connected to the so-called Anderson localization in disordered systems.
I thank Christian Miniatura for providing me with his personal notes and for a careful reading of the manuscript, and Robin Kaiser for his constant and kind encouragement during the writing of these notes.

References
1. Lichtenberg A.J., and Lieberman M.A. (1983), Regular and stochastic motion, Springer-Verlag (New York).
2. Landau L., and Lifchitz E. (1966), Mécanique, Mir (Moscow).
3. Cohen-Tannoudji C., Diu B., and Laloë F. (1973), Mécanique Quantique, Hermann (Paris).
4. Zurek W. (1998), Physica Scripta T76, 186, and references therein.
5. Casati G., and Chirikov B. (1995), Quantum Chaos, Cambridge University Press (Cambridge).
6. Hillery M., O'Connel R.F., Scully M.O., and Wigner E.P. (1994), Phys. Rep. 106, 121.
7. Bohigas O. (1991), in Chaos and quantum physics, edited by M.-J. Giannoni, A. Voros and J. Zinn-Justin, Les Houches Summer School, Session LII (North-Holland, Amsterdam).
8. Smilansky U. (1991), in Chaos and quantum physics, edited by M.-J. Giannoni, A. Voros and J. Zinn-Justin, Les Houches Summer School, Session LII (North-Holland, Amsterdam).
9. Moore F.L. et al (1995), Phys. Rev. Lett. 75, 4598; Amman H. et al (1998), Phys. Rev. Lett. 80, 4111.
10. Arimondo A., Phillips W.D., and Strumia F. (1992), Laser Manipulation of Atoms and Ions, North-Holland (Amsterdam).
11. Wintgen D. (1988), Phys. Rev. Lett. 61, 1803; Tanner G. et al (1991), Phys. Rev. Lett. 67, 2410; Wintgen D., Richter K., and Tanner G. (1992), Chaos 2, 19.
12. Rost J.M., Schulz K., Domke M., and Kaindl G. (1997), J. Phys. B 30, 4663.
13. Grémaud B., and Delande D. (1997), Europhys. Lett. 40, 363.
14. Georges R., Delon A., and Jost R. (1995), J. Chem. Phys. 103, 1732.
15. Mueller G. et al (1995), Phys. Rev. Lett. 75, 2875; Fromhold T.M. et al (1994), Phys. Rev. Lett. 72, 2608.
16. Stockmann H.-J., Stein J., and Kollmann M., in [5]; Stockmann H.-J. et al (1990), Phys. Rev. Lett. 64, 2215; Graf H.-D. et al (1992), Phys. Rev. Lett. 69, 1296.
17. Iu C.H., Welch G.R., Kash M.M., Kleppner D., Delande D., and Gay J.C. (1991), Phys. Rev. Lett. 66, 145.
18. Holle A., Main J., Wiebusch G., Rottke H., and Welge K.H. (1988), Phys. Rev. Lett. 61, 161; Main J., Wiebusch G., Welge K.H., Shaw J., and Delos J.B. (1994), Phys. Rev. A 49, 847.
19. Van der Veldt T., Vassen W., and Hogervorst W. (1993), Europhys. Lett. 21, 903.
20. Friedrich H., and Wintgen D. (1989), Phys. Rep. 183, 37.
21. Delande D. (1991), in Chaos and quantum physics, edited by M.-J. Giannoni, A. Voros and J. Zinn-Justin, Les Houches Summer School, Session LII (North-Holland, Amsterdam).
22. Hasegawa H., Robnik M., and Wunner G. (1989), Prog. Theor. Phys. 98, 198.
23. Cohen-Tannoudji C., Dupont-Roc J., and Grynberg G. (1987), Photons et Atomes, InterEditions (Paris).
24. Haake F. (1991), Quantum signatures of chaos, Springer-Verlag.
25. Iu C.H. et al. (1989), Phys. Rev. Lett. 63, 1133.
26. Mehta M.L. (1991), Random Matrices, Academic Press (London).
27. Akkermans E., Montambaux G., Pichard J.-L., and Zinn-Justin J. (1995), Mesoscopic Quantum Physics, Elsevier Science B.V., North-Holland (Amsterdam).
28. Delande D., and Gay J.C. (1986), Phys. Rev. Lett. 57, 2006.
29. Held H., Schlichter J., Raithel G., and Walther H. (1998), Europhys. Lett. 43, 392.
30. Welch G.R. et al. (1989), Phys. Rev. Lett. 62, 893.
31. Guhr T., Mueller-Groeling A., and Weidenmuller H.A. (1998), Phys. Rep. 299, 189.
32. Kudrolli A., Kidambi V., and Sridhar S. (1995), Phys. Rev. Lett. 75, 822.
33. Grémaud B., Delande D., and Gay J.C. (1993), Phys. Rev. Lett. 70, 615.
34. Ericson T. (1960), Phys. Rev. Lett. 5, 430.
35. Sacha K., Zakrzewski J., and Delande D. (1999), Phys. Rev. Lett. 83, 2922.
36. Berry M.V., and Mount K.E. (1972), Rep. Prog. Phys. 35, 315.
37. Berry M. (1983), in Chaotic behaviour of deterministic systems, Les Houches Summer School 1981, G. Iooss, R. Helleman and R. Stora Eds. (North-Holland).
38. Cvitanovic P. et al., Classical and quantum chaos: a cyclist treatise, http://www.nbi.dk/ChaosBook.
39. Brack M., and Bhaduri R.K. (1997), Semiclassical Physics, Addison-Wesley.
40. Gutzwiller M.C. (1990), Chaos in Classical and Quantum Mechanics, Springer, Berlin.
41. Schulman L.S. (1981), Techniques and Applications of Path Integration, Wiley (New York).
42. Balian R., and Bloch C. (1974), Ann. Phys. 85, 514.
43. Gaspard P. in [5]; Alonso D., and Gaspard P. (1993), Chaos 3, 601.
44. Bogomolny E.G. (1988), JETP Lett. 47, 526; Bogomolny E.G. (1999), JETP 69, 275.
45. Wintgen D. et al. (1994), Prog. Theor. Phys. 116, 121; Tanner G. et al.
(1999), Rep. Prog. Phys., in press.
46. Heller E.J. (1984), Phys. Rev. Lett. 53, 1515.
THE WATER-WAVE PROBLEM AS A SPATIAL DYNAMICAL SYSTEM
GERARD IOOSS
Institut Universitaire de France
Institut Non Linéaire de Nice
The mathematical study of 2D travelling waves in the potential flow of one or several layers of perfect fluid, with free surface and interfaces, can be set as an ill-posed evolution problem, where the horizontal space variable plays the role of a "time". In the finite depth case, the study of near-equilibrium waves reduces to a low-dimensional reversible ordinary differential equation. In most cases, it appears that the problem is a perturbation of an integrable system, where all types of solutions are known. We shall describe the method of study and typical results in these lectures. In addition, we shall give indications on what happens when passing to the limit of infinite depth, which is indeed the case of physical interest. In such a case, the reduction technique fails because the linearized operator possesses a continuous spectrum crossing the imaginary axis, and we are led to use tools adapted to the type of solution we are looking for. We shall develop such tools in the lectures.
1 Introduction
We present below a very old and classical problem concerning waves in perfect fluids, with the purpose of showing how dynamical systems methods may be used to obtain results on the spatial behavior of travelling waves near the basic flat free surface state. In fact this bifurcation problem overlaps several important subjects: i) elliptic partial differential equations in unbounded domains like strips, ii) the theory of reversible systems, iii) the normal form technique and iv) the methods of analysis available for systems close to integrable ones.
2 Formulation as a reversible dynamical system
2.1 Case of one layer with surface tension at the free surface
Let us first consider the case of one layer (thickness $h$) of an inviscid fluid; the flow is assumed potential, under the influence of gravity $g$, with surface tension $T$ acting at the free surface (see figure 1). We are interested in steady waves of permanent form, i.e. travelling waves with constant velocity $c$. Formulating the problem in a moving reference frame, our solutions are steady in time, and we intend to consider the unbounded horizontal coordinate $\xi$ as a "time".
Figure 1. One layer (free surface $\eta = Z(\xi)$, horizontal coordinate $\xi$).
Let us denote by $\rho$ the fluid density; we choose $c$ as the velocity scale, $l = T/\rho c^2$ as the length scale and $\rho c^2$ as the pressure scale. The important dimensionless parameters occurring in the equations are:
$$\lambda = \frac{gh}{c^2} \ \ \text{(inverse of (Froude number)}^2\text{)}, \qquad b = \frac{T}{\rho h c^2} = \frac{l}{h} \ \ \text{(Weber number)}.$$
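As a small illustration of these scalings, the sketch below computes $l$, $\lambda$ and $b$ from dimensional data. The function name and the numerical values (water, 1 cm depth, wave speed 0.25 m/s) are assumptions chosen only for the example; they do not come from the text.

```python
def dimensionless_parameters(g, h, c, T, rho):
    """Return (l, lam, b) for one layer with surface tension.

    l   = T / (rho * c**2)   length scale
    lam = g * h / c**2       inverse of the squared Froude number
    b   = l / h              equal to T / (rho * h * c**2)
    """
    l = T / (rho * c**2)
    lam = g * h / c**2
    b = l / h
    return l, lam, b


# Illustrative values only (assumed, SI units): water, depth 1 cm, speed 0.25 m/s.
l, lam, b = dimensionless_parameters(g=9.81, h=0.01, c=0.25, T=0.074, rho=1000.0)
print(f"l = {l:.3e} m, lambda = {lam:.4f}, b = {b:.4f}, lambda*b = {lam * b:.4f}")
```

The product $\lambda b = gl/c^2$ is the coefficient of $Z$ in the Bernoulli condition, which is why it appears as a single group in the equations.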
The complex potential is denoted by $w(\xi + i\eta)$ and the complex velocity (in dimensionless form) by $w'(\xi + i\eta) = u - iv$. The free surface is denoted by $\eta = Z(\xi)$. The Euler equations are expressed here by the fact that $w$ is analytic in $\zeta = \xi + i\eta$ and by the boundary conditions
$$v = 0 \quad \text{at } \eta = -1/b \ \text{(flat bottom)},$$
the kinematic condition at the free surface
$$uZ'(\xi) - v = 0 \quad \text{at } \eta = Z(\xi) \ \text{(free surface)},$$
and the Bernoulli first integral at the free surface, expressing the condition on the pressure jump, which is proportional to the curvature:
$$\frac{1}{2}(u^2 + v^2) + \lambda b Z - \frac{Z''}{(1 + Z'^2)^{3/2}} = \frac{1}{2} \quad \text{at } \eta = Z(\xi) \ \text{(free surface)}.$$
To formulate our problem as a dynamical system, we first transform the unknown domain into a strip. There are different ways to perform such a change of coordinates. We choose the one used by Levi-Civita [16]. The new unknown is $\alpha + i\beta$ as an analytic function of $w = x + iy$, where $w'(\xi + i\eta) = e^{\beta - i\alpha}$;
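the free surface is given by $y = 0$, the rigid bottom by $y = -1/b$. In our formulation, the unknown is denoted by $U$, where
$$[U(x)](y) = (\alpha_0(x), \alpha(x,y), \beta(x,y))^t,$$
and the system has the form
$$\frac{dU}{dx} = F(\mu, U), \qquad (1)$$
where $\mu = (\lambda, b)$, and
$$F(\mu, U) = \begin{pmatrix} \sinh\beta_0 + \lambda b\, e^{-\beta_0} \int_{-1/b}^{0} (e^{-\beta}\cos\alpha - 1)\, dy \\ \partial\beta/\partial y \\ -\partial\alpha/\partial y \end{pmatrix}. \qquad (2)$$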
Equation (1) has to be understood in the space $H = \mathbb{R} \times \{L^1(-1/b, 0)\}^2$, and $U(x)$ lies in $D = \mathbb{R} \times \{W^{1,1}(-1/b, 0)\}^2 \cap \{\alpha_0 = \alpha|_{y=0},\ \alpha|_{y=-1/b} = 0\}$, where we denote by $\beta_0$ the trace $\beta|_{y=0}$ and by $W^{1,1}(-1/b, 0)$ the space of integrable functions with an integrable first derivative on the interval $(-1/b, 0)$. So, for a fixed $x$, the right hand side of (2) is a function of $y$, and $U(x)$ is required to satisfy the boundary conditions indicated in $D$. A solution of our water-wave problem is any $U \in C^0(D) \cap C^1(H)$ which is a solution of (1), where (e.g.) $C^0$ means continuous and bounded for $x \in \mathbb{R}$. It is clear that $U = 0$ is a particular solution of (1), which corresponds to the flat free surface state. A very important property of (1) is its reversibility: indeed, let us define the symmetry
$$SU = (-\alpha_0, -\alpha, \beta)^t;$$
then it is easy to see that the linear operator $S$ anticommutes with $F(\mu, \cdot)$. This reflects the invariance of our problem under the reflection symmetry $x \to -x$.
2.2 Case of two layers without surface tension
Let us now consider the case of two layers (densities $\rho_1$, $\rho_2$), assuming that there is no surface tension, neither at the free surface nor at the interface. The thickness at rest of the upper layer is $h_2$, while it is $h_1$ for the bottom one (see figure 2). The dimensionless parameters here are $\rho = \rho_2/\rho_1 < 1$, $e = h_1/h_2$, and $\lambda = gh_2/c^2$. The domain can be transformed into two superposed horizontal strips, and we may use the same type of variables as above. One difficulty is that the $x$ coordinate is not the same in each strip! We have to choose as the basic $x$ coordinate the one given by the bottom layer, which then introduces a factor $e^{\beta_{20} - \beta_{10}}$ in the Cauchy-Riemann equations of the upper layer.
Figure 2. Two layers.
In such a formulation, the unknown is defined by
$$[U(x)](y) = (\beta_{10}(x), \beta_{21}(x), \alpha_1(x,y), \beta_1(x,y), \alpha_2(x,y), \beta_2(x,y))^t$$
and the system has the form (1) with $\mu = (\rho, e, \lambda)$ and
$$F(\mu, U) = \begin{pmatrix} -\lambda(1 - \rho)\sin\alpha_{10}\, e^{-3\beta_{10}} - \rho\, e^{3(\beta_{20} - \beta_{10})}\, \partial\alpha_2/\partial y|_{y=0} \\ -\lambda \sin\alpha_{21}\, e^{-3\beta_{21}}\, e^{\beta_{20} - \beta_{10}} \\ \partial\beta_1/\partial y, \quad y \in (-e, 0) \\ -\partial\alpha_1/\partial y, \quad y \in (-e, 0) \\ e^{\beta_{20} - \beta_{10}}\, \partial\beta_2/\partial y, \quad y \in (0, 1) \\ -e^{\beta_{20} - \beta_{10}}\, \partial\alpha_2/\partial y, \quad y \in (0, 1) \end{pmatrix} \qquad (3)$$
where we denote by $\alpha_{10}$, $\beta_{10}$ and $\beta_{20}$ the traces of (resp.) $\alpha_1$, $\beta_1$, $\beta_2$ at $y = 0$, and by $\alpha_{21}$, $\beta_{21}$ the traces of $\alpha_2$ and $\beta_2$ at $y = 1$. Here the basic space $H = \mathbb{R}^2 \times \{W^{1,1}(-e, 0)\}^2 \times \{W^{1,1}(0, 1)\}^2$ needs more regularity than in the previous case, in order to define the term $\partial\alpha_2/\partial y|_{y=0}$ in the first component, and the domain of the operator $F$ is now
$$D = \mathbb{R}^2 \times \{W^{2,1}(-e, 0)\}^2 \times \{W^{2,1}(0, 1)\}^2 \cap \{\alpha_{10} = \alpha_{20},\ \beta_{10} = \beta_1|_{y=0},\ \beta_{21} = \beta_2|_{y=1},\ \alpha_1|_{y=-e} = 0\},$$
while the reversibility symmetry reads
$$SU = (\beta_{10}, \beta_{21}, -\alpha_1, \beta_1, -\alpha_2, \beta_2)^t.$$
The new system (1,3) should be completed by the following two Bernoulli first integrals (interface and free surface):
$$(1 - \rho)\int_{-e}^{0} (e^{-\beta_1}\cos\alpha_1 - 1)\, dy + \frac{1}{2\lambda}(e^{2\beta_{10}} - 1) - \frac{\rho}{2\lambda}(e^{2\beta_{20}} - 1) = 0,$$
$$\int_{-e}^{0} (e^{-\beta_1}\cos\alpha_1 - 1)\, dy + \int_{0}^{1} (e^{-\beta_2}\cos\alpha_2 - 1)\, dy + \frac{1}{2\lambda}(e^{2\beta_{21}} - 1) = 0,$$
which give the first two components of (1,3) after differentiation. In principle we might choose to treat this problem on a codimension-2 manifold, instead of expressing these first two components. It appears that it is much easier to work as we do at present, just keeping in mind that there are two constants which should be fixed, because of these known first integrals.
Finally, it has to be understood that problem (1) is not a usual evolution problem: the initial value problem is ill-posed! This is in fact an elliptic problem in the strip $\mathbb{R} \times (-1/b, 0)$ for problem (1,2), and $\mathbb{R} \times (-e, 1)$ for problem (1,3). However we treat this problem by (local) techniques of dynamical systems theory, as in finite dimension! The key to this is the possibility, for the study of solutions staying close to 0, of applying a center manifold reduction to a reversible ordinary differential equation. The aim of the next section is to show the properties of the linearized operator near 0 which allow, in particular, the use of such a reduction.
3 The linearized problem
Since we are interested in solutions near 0, it is natural to study the problem obtained after linearization near 0. This linearized system reads
$$\frac{dU}{dx} = L_\mu U \qquad (4)$$
in $H$. In all problems, for layers with finite depth, it can be shown that the spectrum of $L_\mu$, which is symmetric with respect to both axes of the complex plane because of reversibility, is composed only of isolated eigenvalues of finite multiplicities, accumulating only at infinity. More precisely, denoting by $ik$ these eigenvalues (not necessarily pure imaginary), one has the classical "dispersion relation" satisfied by the eigenvalues, in the form of a complex equation $f(\mu, k) = 0$. For instance in the case of problem (1,2), we obtain the following dispersion relation:
$$(\lambda b + k^2)\, k^{-1} \sinh(k/b) - \cosh(k/b) = 0, \quad \text{for } k \neq 0, \qquad (5)$$
while in the case of problem (1,3), we obtain
$$\rho(\lambda^2 - k^2)\sinh(ke)\sinh k + (\lambda\sinh k - k\cosh k)[k\cosh(ke) - \lambda\sinh(ke)] = 0. \qquad (6)$$
There is only a small number (no more than 4 in the first problem) of these eigenvalues on (or close to) the imaginary axis, the rest of them being located in a sector $\{ik \in \mathbb{C};\ |k_r| < p|k_i| + r\}$ of the complex plane. For the second problem the eigenvalue 0 is always present, because of the freedom in the choice of two constants (fixed later by the two first integrals), while for the first problem this happens only if $\lambda = 1$. The roots of these dispersion equations are the poles of the resolvent operator $(ikI - L_\mu)^{-1}$. In addition, for both problems, we can obtain an estimate of the form
$$\|(ikI - L_\mu)^{-1}\|_{\mathcal{L}(H)} \le \frac{c}{|k|} \qquad (7)$$
for large $|k|$, where $\mathcal{L}(H)$ is the space of bounded linear operators in $H$. It is fortunate that, for all these problems, the resolvent $(ikI - L_\mu)^{-1}$ can be obtained explicitly, especially in the second problem, because the "bad" trace term in the first component of $L_\mu$ makes it difficult to obtain such an estimate (7) for a choice of basic space other than $H$. This estimate appears to be essential in our method of reduction to a (small dimensional) center manifold. Finally, another common point of all these problems is that when the bottom layer thickness grows, there is an accumulation of eigenvalues on the whole real axis, and in the limit, as we choose a basic space such that $(\alpha, \beta) \to 0$ as $y \to -\infty$, all real eigenvalues disappear (except 0 in the second problem), giving way to the entire real axis forming the essential spectrum: for a real $\sigma \neq 0$ the operator [...]
[...] or at a distance of order 1 from the imaginary axis, then the estimate (7) allows us to find a center manifold (see [15], [19], [22]). Roughly speaking, all "small" bounded continuous solutions taking values in $H$ of the system (1), for values of the (multi)parameter $\mu$ near $\mu_0$, lie on an invariant manifold $M_\mu$ which is smooth (however we lose the $C^\infty$ regularity) and which exists in a neighborhood of 0 independent of $\mu$. The dimension of $M_\mu$ is equal to the sum of the dimensions of the invariant subspaces belonging to pure imaginary eigenvalues for the critical value $\mu_0$ of the (multi)parameter.
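Since the dimension of the center manifold is fixed by the eigenvalues $ik$ lying on the imaginary axis, i.e. by the real roots $k$ of the dispersion relation, it can be instructive to count these roots numerically. The following is a minimal sketch for relation (5) only; the function names, the scan range and the bisection refinement are choices made for this illustration, and relation (5) is divided by $\cosh(k/b)$ to avoid overflow for large $k$.

```python
import math


def g5(k, lam, b):
    """Relation (5) divided by cosh(k/b); for real k != 0 it vanishes iff (5) holds."""
    return (lam * b + k * k) * math.tanh(k / b) / k - 1.0


def positive_real_roots(lam, b, k_max=20.0, n=20_000):
    """Scan (0, k_max] for sign changes of g5 and refine each one by bisection.

    Each simple positive root k gives a pair of simple eigenvalues +ik, -ik.
    """
    roots = []
    k_prev = k_max / n
    f_prev = g5(k_prev, lam, b)
    for j in range(2, n + 1):
        k = j * k_max / n
        f = g5(k, lam, b)
        if f_prev * f < 0.0:
            lo, hi, f_lo = k_prev, k, f_prev
            for _ in range(80):
                mid = 0.5 * (lo + hi)
                f_mid = g5(mid, lam, b)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        k_prev, f_prev = k, f
    return roots


for lam, b in [(1.2, 0.5), (0.8, 0.5), (1.02, 0.2)]:
    print(lam, b, positive_real_roots(lam, b))
```

Expanding (5) near $k = 0$ gives $(\lambda - 1) + k^2\,[1/b + \lambda/(6b^2) - 1/(2b^2)] + O(k^4)$, so a pair of roots merges at $k = 0$ when $\lambda = 1$, in agreement with the remark that 0 is an eigenvalue of the first problem only if $\lambda = 1$. The scan is a heuristic: it can miss roots of even multiplicity, such as the double roots occurring at criticality.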
In other words, the modes corresponding to eigenvalues far from the imaginary axis are functions ("slaves") of the modes belonging to eigenvalues near or on the imaginary axis. In addition, the reversibility property leads to a manifold which is invariant under the reversibility symmetry $S$. The trace of system (1) on $M_\mu$ is also reversible under the restriction $S_0$ of the symmetry $S$. At this point we should emphasize that the physical relevance of this reduction process is linked with the distance of the rest of the eigenvalues from the imaginary axis. So, the domain of validity shrinks to 0 when the thickness of the bottom layer increases, and in such a case we have to turn to another technique (see section 6)!
4 Basic codimension one reversible normal forms
Once the problem is reduced to a reversible ODE, we need to examine various possible critical situations. This basically depends on how many eigenvalues lie on the imaginary axis at criticality $\mu = \mu_0$. The reversibility property reduces the cases of interest to the following:
(i) $L_{\mu_0}$ has only a double 0 eigenvalue on the imaginary axis,
(ii) $L_{\mu_0}$ has only a double 0 and a pair of simple pure imaginary eigenvalues on the imaginary axis,
(iii) $L_{\mu_0}$ has only a pair of double pure imaginary eigenvalues on the imaginary axis,
(iv) $L_{\mu_0}$ has only two pairs of simple pure imaginary, strongly resonant, eigenvalues on the imaginary axis.
There are other problems of interest when more than 4 eigenvalues may lie on the imaginary axis, as is the case for problems with several layers, and with additional parameters like surface and interfacial tensions. There are also other interesting cases, for instance when 0 is a quadruple eigenvalue. This would correspond to a codimension two singularity (it occurs when playing with two parameters), so it is not a generic case (see [7] for a study of this case). We shall not consider such cases here; their study follows the same lines. In addition, for a problem such as the one of section 2.2, we always have a 0 eigenvalue which should be counted in addition to the above ones, with its multiplicity (2). Each of these cases corresponds to a particular critical set in the parameter space, and we denote by $\nu$ the bifurcation parameter, representing an (oriented) distance of $\mu$ from the critical set of $\mu_0$.
The basic technique we use to study such cases is first normal form theory (see for instance chapter 1 of [6]) to simplify the reduced system up to a fixed order. This corresponds to a suitable choice of variables (after a polynomial change), and this allows us to recognize all types of bounded solutions: a remarkable fact here is that in all cases (i) to (iv), we obtain integrable normal forms. This means that for the system truncated at any fixed order, we know all its small bounded solutions. We give below more details of these solutions.
Figure 3. Phase portrait of vector field (8) for $\nu > 0$, $a < 0$ (axes $A$, $B$; the point $A = -3\nu/2a$ is marked on the $A$ axis).
4.1 Case (i)
Here the center manifold is two dimensional. Let us denote by $(A, B)$ the (real) coordinates (or "amplitudes"). Then we need to know how the reversibility symmetry $S_0$ acts on $(A, B)$. There are two theoretical possibilities: $(A, B) \to (A, -B)$ or $(-A, B)$. In all water-wave problems the first case holds, and then the normal form, truncated at quadratic order, reads
$$\frac{dA}{dx} = B, \qquad \frac{dB}{dx} = \nu A + aA^2, \qquad (8)$$
where one can compute explicitly the coefficient $a$. Here, for $\nu > 0$ the critical eigenvalues $\pm\sqrt{\nu}$ are real, while they are pure imaginary for $\nu < 0$. The vector field (8) is integrable, and its phase portrait is given in figure 3, for $\nu > 0$ and $a < 0$. We observe that there is a second steady solution, corresponding to a flat free surface and interface ("conjugate flow"), there are periodic solutions with various amplitudes, and there is a homoclinic solution to 0, corresponding to a "solitary wave" of depression for problem (1,2). This problem was first solved in [1].
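For the quadratic normal form with $dA/dx = B$, $dB/dx = \nu A + aA^2$ and $\nu > 0$, the orbit homoclinic to 0 can be written in closed form as $A(x) = -(3\nu/2a)\,\mathrm{sech}^2(\sqrt{\nu}\,x/2)$, which reaches $A = -3\nu/2a$ at $x = 0$. The sketch below evaluates this profile and the quantity $H = B^2 - \nu A^2 - (2/3)aA^3$, which vanishes along it. The function names and the parameter values are assumptions made for this illustration; the expression of $H$ is the first integral of case (ii) specialized to $c = 0$.

```python
import math


def solitary_wave(x, nu, a):
    """Closed-form homoclinic orbit (A, B) of dA/dx = B, dB/dx = nu*A + a*A**2, nu > 0."""
    kappa = 0.5 * math.sqrt(nu)
    amplitude = -1.5 * nu / a
    sech = 1.0 / math.cosh(kappa * x)
    A = amplitude * sech * sech
    B = -2.0 * kappa * A * math.tanh(kappa * x)  # B = dA/dx
    return A, B


def first_integral(A, B, nu, a):
    """H = B**2 - nu*A**2 - (2/3)*a*A**3, constant along orbits; H = 0 on the homoclinic."""
    return B * B - nu * A * A - (2.0 / 3.0) * a * A**3


nu, a = 0.04, -1.0  # illustrative values with nu > 0, a < 0
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    A, B = solitary_wave(x, nu, a)
    print(f"x = {x:6.1f}  A = {A:.6f}  B = {B:.6f}  H = {first_integral(A, B, nu, a):.2e}")
```

The decay rate $\sqrt{\nu}$ of $A$ at infinity is the real critical eigenvalue, which is the exponential decay mentioned later for homoclinics in the finite depth case.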
Figure 4. Case (ii). Graphs of $f_{K,H}(A)$ for $\nu > 0$, $a > 0$.
4.2 Case (ii)
Here the center manifold is four dimensional. Let us denote by $\pm iq$ the pair of simple eigenvalues, by $(A, B)$ the (real) amplitudes and by $C$ the complex one, corresponding to the oscillating mode. Then we need to know how the reversibility symmetry $S_0$ acts on $(A, B, C, \bar{C})$. Here again, there are two theoretical possibilities. In all our problems we have $(A, B, C, \bar{C}) \to (A, -B, \bar{C}, C)$, and the normal form, truncated at quadratic order, reads
$$\frac{dA}{dx} = B, \qquad \frac{dB}{dx} = \nu A + aA^2 + c|C|^2, \qquad \frac{dC}{dx} = iC(q + d_1\nu + d_2 A), \qquad (9)$$
where the (real) coefficients $a$, $c$, $d_j$ can be explicitly computed. This system is indeed integrable, with the two first integrals
$$K = |C|^2, \qquad H = B^2 - (2/3)aA^3 - \nu A^2 - 2cKA. \qquad (10)$$
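Both claims, the reversibility under $S_0$ and the conservation of $K$ and $H$, are easy to check numerically. The sketch below assumes the truncated normal form $dA/dx = B$, $dB/dx = \nu A + aA^2 + c|C|^2$, $dC/dx = iC(q + d_1\nu + d_2A)$ and integrates it with a classical fourth-order Runge-Kutta step. All function names, coefficient values and the initial condition are assumptions chosen for the illustration (the initial point is taken inside the potential well so that the orbit stays bounded).

```python
def field9(U, nu, a, c, q, d1, d2):
    """Right-hand side of the case (ii) normal form for U = (A, B, C); A, B real, C complex."""
    A, B, C = U
    return (B,
            nu * A + a * A * A + c * abs(C) ** 2,
            1j * C * (q + d1 * nu + d2 * A))


def S0(U):
    """Reversibility symmetry (A, B, C) -> (A, -B, conj(C))."""
    A, B, C = U
    return (A, -B, C.conjugate())


def integrals10(U, nu, a, c):
    """First integrals K = |C|^2 and H = B^2 - (2/3) a A^3 - nu A^2 - 2 c K A."""
    A, B, C = U
    K = abs(C) ** 2
    H = B * B - (2.0 / 3.0) * a * A**3 - nu * A * A - 2.0 * c * K * A
    return K, H


def rk4_step(U, h, *p):
    """One classical Runge-Kutta step of size h for dU/dx = field9(U, *p)."""
    def shifted(V, s):
        return tuple(u + s * v for u, v in zip(U, V))
    k1 = field9(U, *p)
    k2 = field9(shifted(k1, h / 2), *p)
    k3 = field9(shifted(k2, h / 2), *p)
    k4 = field9(shifted(k3, h), *p)
    return tuple(u + h / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
                 for u, s1, s2, s3, s4 in zip(U, k1, k2, k3, k4))


def reversibility_defect(U, *p):
    """Largest component of F(S0 U) + S0 F(U); zero when S0 anticommutes with F."""
    FA, FB, FC = field9(U, *p)
    minus_S0_F = (-FA, FB, -FC.conjugate())
    return max(abs(x - y) for x, y in zip(field9(S0(U), *p), minus_S0_F))


params = (0.1, 1.0, 0.5, 1.0, 0.2, 0.3)   # nu, a, c, q, d1, d2 (illustrative)
U = (-0.08, 0.005, 0.05 + 0.02j)
print("defect:", reversibility_defect(U, *params))
print("start :", integrals10(U, *params[:3]))
for _ in range(2000):
    U = rk4_step(U, 0.01, *params)
print("end   :", integrals10(U, *params[:3]))
```

The conservation of $H$ follows from $dH/dx = 2Bc(|C|^2 - K) = 0$; the numerical drift only reflects the integrator.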
We show in figure 4 the various graphs of the functions $f_{K,H}(A) = (2/3)aA^3 + \nu A^2 + 2cKA + H$ depending on $(K, H)$, for $\nu > 0$, $a > 0$. In this case we have, in addition to the conjugate flow (as above), several types of periodic solutions, quasi-periodic solutions (interior of the triangular region in the $(K, H)$ plane), and homoclinic solutions, one homoclinic to 0 and all others homoclinic to one of the periodic solutions. We represent in figure 5, in the $(A, B)$ plane, all bounded solutions for $0 < cK < \nu^2/4a$. Notice that the homoclinic solution to $A_+$ corresponds here to a generalized solitary wave for problem (1,2), tending at infinity towards a periodic wave. Notice that $A_+ \sim -cK/\nu$ when $|K|$ is very small, meaning that the oscillations at infinity are then very small in this case. For $K = 0$ this corresponds to a solitary wave of elevation for problem (1,2). However we shall see in section 5 that this solution does not exist mathematically, even though one may compute its expansion in powers of the bifurcation parameter up to any order.
Figure 5. Case (ii). Bounded solutions of (9) for various values of $H$ in the $(A, B)$ plane, for $\nu > 0$, $a > 0$, $0 < cK < \nu^2/4a$.
4.3 Case (iii)
Here the center manifold is four dimensional. Let us denote by $\pm iq$ the pair of double eigenvalues at criticality, and by $(A, B)$ the complex amplitudes corresponding respectively to the eigenmode and to the generalized eigenmode. This case is often denoted by "1:1 reversible resonance". We can always assume that the reversibility symmetry $S_0$ acts as $(A, B) \to (\bar{A}, -\bar{B})$. The normal form, at any order, reads:
$$\frac{dA}{dx} = iqA + B + iA\,P[\nu, |A|^2, \tfrac{i}{2}(A\bar{B} - \bar{A}B)],$$
$$\frac{dB}{dx} = iqB + iB\,P[\nu, |A|^2, \tfrac{i}{2}(A\bar{B} - \bar{A}B)] + A\,Q[\nu, |A|^2, \tfrac{i}{2}(A\bar{B} - \bar{A}B)], \qquad (11)$$
where $P$ and $Q$ are real polynomials, of degree one in their arguments for the cubic normal form. Let us define more precisely the coefficients of $Q$:
$$Q(\nu, u, v) = \nu + q_2 u + q_3 v, \qquad (12)$$
which means that for $\nu > 0$ the eigenvalues are at a distance $\sqrt{\nu}$ from the imaginary axis, while for $\nu < 0$ they sit on the imaginary axis. The vector field (11) is integrable, with the two following first integrals:
$$K = \tfrac{i}{2}(A\bar{B} - \bar{A}B), \qquad H = |B|^2 - \int_0^{|A|^2} Q[\nu, u, K]\, du. \qquad (13)$$
It is then possible to describe all small bounded solutions of (11). Indeed, we obtain
$$\left(\frac{d|A|^2}{dx}\right)^2 = f_{K,H}(|A|^2),$$
where
$$f_{K,H}(|A|^2) = 2q_2|A|^6 + 4(\nu + q_3 K)|A|^4 + 4H|A|^2 - 4K^2.$$
Figure 6. Case (iii). Graphs of $f_{K,H}(u)$ in the $(K, H)$ plane, for $\nu > 0$, $q_2 < 0$; $H_m = \nu^2/2q_2$.
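The relation between $d|A|^2/dx$ and $f_{K,H}$ is a purely algebraic consequence of the definitions of $K$ and $H$: along the normal form one has $d|A|^2/dx = A\bar{B} + \bar{A}B$ (the $iq$ and $P$ terms cancel), and $(A\bar{B} + \bar{A}B)^2 = 4|A|^2|B|^2 - 4K^2$. The sketch below checks this at an arbitrary point, and also evaluates the envelope $|A|^2 = (-2\nu/q_2)\,\mathrm{sech}^2(\sqrt{\nu}\,x)$, which solves the equation for $K = H = 0$, $\nu > 0$, $q_2 < 0$. Function names and numerical values are assumptions made for this illustration, using the cubic truncation $Q = \nu + q_2u + q_3v$.

```python
import math


def integrals13(A, B, nu, q2, q3):
    """K = (i/2)(A conj(B) - conj(A) B) and H = |B|^2 - int_0^{|A|^2} Q(nu, u, K) du."""
    K = (0.5j * (A * B.conjugate() - A.conjugate() * B)).real
    u = abs(A) ** 2
    H = abs(B) ** 2 - (nu * u + 0.5 * q2 * u * u + q3 * K * u)
    return K, H


def f_KH(u, K, H, nu, q2, q3):
    """Polynomial f_{K,H}(u) with u = |A|^2."""
    return 2 * q2 * u**3 + 4 * (nu + q3 * K) * u**2 + 4 * H * u - 4 * K * K


def du_dx(A, B):
    """d|A|^2/dx along the normal form, equal to A conj(B) + conj(A) B."""
    return (A * B.conjugate() + A.conjugate() * B).real


def bright_envelope(x, nu, q2):
    """|A|^2 of the orbit homoclinic to 0 for K = H = 0, nu > 0, q2 < 0."""
    return (-2.0 * nu / q2) / math.cosh(math.sqrt(nu) * x) ** 2


A, B = 0.3 + 0.1j, -0.2 + 0.05j          # arbitrary test point
nu, q2, q3 = 0.1, -1.0, 0.4              # illustrative coefficients
K, H = integrals13(A, B, nu, q2, q3)
print(du_dx(A, B) ** 2, f_KH(abs(A) ** 2, K, H, nu, q2, q3))
```

The envelope decays like $e^{-2\sqrt{\nu}|x|}$, consistent with eigenvalues at distance $\sqrt{\nu}$ from the imaginary axis.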
We show in figures 6 and 7 the various graphs of the functions $f_{K,H}$ depending on $(K, H)$, for $\nu > 0$, $q_2 < 0$, and for $\nu < 0$, $q_2 > 0$, which are the most interesting cases. In such cases we obtain large families of periodic and quasi-periodic solutions and, for $q_2 < 0$, a family of solutions homoclinic to 0 (solitary waves with exponentially damped oscillations at infinity, sometimes called "bright" solitary waves), while for $q_2 > 0$ we have a family of solutions homoclinic to periodic solutions (as in case (ii)), which correspond to so-called "black" solitary waves (the amplitude is minimum at $x = 0$).
Figure 7. Case (iii). Graphs of $f_{K,H}(u)$ in the $(K, H)$ plane, for $\nu < 0$, $q_2 > 0$.
4.4 Case (iv)
This case corresponds to 2 pairs of simple eigenvalues $\{\pm i\omega_j,\ j = 1, 2\}$ on the imaginary axis, where for $\nu = 0$ we have $\omega_1/\omega_2 = p/q < 1$, $p$ and $q$ being positive integers. In fact when $\nu$ varies the eigenvalues move on the imaginary axis and this rational ratio is lost, but what is important is that our analysis stays valid for values of $\nu$ near 0 such that the interval of values taken by the ratio between the two eigenvalues does not contain a rational number $p'/q'$ with $p' + q' < p + q$. Here the center manifold is four dimensional, and we denote by $A_1$, $A_2$ the two complex amplitudes. The reversibility symmetry $S_0$ acts as $(A_1, A_2) \to (\bar{A}_1, \bar{A}_2)$. Strongly resonant cases are such that $p = 1$, $q = 2$ or 3. "Reversible 1:2 resonance" denotes the case with $\omega_1/\omega_2 = 1/2$, and we concentrate our analysis on this case. Writing $\omega = \omega_1$, the normal form truncated at cubic order reads:
$$\frac{dA_1}{dx} = iA_1(\omega + a_1\nu + a_{11}|A_1|^2 + a_{12}|A_2|^2) + ib_1\bar{A}_1 A_2,$$
$$\frac{dA_2}{dx} = iA_2(2\omega + a_2\nu + a_{21}|A_1|^2 + a_{22}|A_2|^2) + ib_2 A_1^2, \qquad (14)$$
where all coefficients $a_j$, $b_j$, $a_{ij}$ are real. This system (14) is integrable, with the following two first integrals:
$$K = b_2|A_1|^2 + b_1|A_2|^2,$$
$$H = A_1^2\bar{A}_2 + \bar{A}_1^2 A_2 - b_1^{-1}(a_2 - 2a_1)\nu|A_1|^2 - (2b_1)^{-1}(a_{21} - 2a_{11})|A_1|^4 + (2b_2)^{-1}(a_{22} - 2a_{12})|A_2|^4. \qquad (15)$$
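As a sanity check on these two first integrals, the sketch below integrates the cubic 1:2 normal form, taken here as $dA_1/dx = iA_1(\omega + a_1\nu + a_{11}|A_1|^2 + a_{12}|A_2|^2) + ib_1\bar{A}_1A_2$ and $dA_2/dx = iA_2(2\omega + a_2\nu + a_{21}|A_1|^2 + a_{22}|A_2|^2) + ib_2A_1^2$, and monitors $K$ and $H$. The coefficient values, the initial condition and the function names are assumptions chosen for this illustration; $b_1b_2 > 0$ is chosen so that $K$ bounds the orbit.

```python
def field14(U, nu, w, a1, a2, a11, a12, a21, a22, b1, b2):
    """Right-hand side of the cubic 1:2 resonance normal form for U = (A1, A2)."""
    A1, A2 = U
    u1, u2 = abs(A1) ** 2, abs(A2) ** 2
    dA1 = 1j * A1 * (w + a1 * nu + a11 * u1 + a12 * u2) + 1j * b1 * A1.conjugate() * A2
    dA2 = 1j * A2 * (2 * w + a2 * nu + a21 * u1 + a22 * u2) + 1j * b2 * A1 * A1
    return (dA1, dA2)


def integrals15(U, nu, w, a1, a2, a11, a12, a21, a22, b1, b2):
    """First integrals K and H; w is unused and kept only for a uniform signature."""
    A1, A2 = U
    u1, u2 = abs(A1) ** 2, abs(A2) ** 2
    K = b2 * u1 + b1 * u2
    G = A1 * A1 * A2.conjugate()
    H = (2.0 * G.real
         - (a2 - 2 * a1) * nu * u1 / b1
         - (a21 - 2 * a11) * u1 * u1 / (2 * b1)
         + (a22 - 2 * a12) * u2 * u2 / (2 * b2))
    return K, H


def rk4_step14(U, h, *p):
    """One classical Runge-Kutta step of size h for dU/dx = field14(U, *p)."""
    def shifted(V, s):
        return tuple(u + s * v for u, v in zip(U, V))
    k1 = field14(U, *p)
    k2 = field14(shifted(k1, h / 2), *p)
    k3 = field14(shifted(k2, h / 2), *p)
    k4 = field14(shifted(k3, h), *p)
    return tuple(u + h / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
                 for u, s1, s2, s3, s4 in zip(U, k1, k2, k3, k4))


# nu, omega, a1, a2, a11, a12, a21, a22, b1, b2 (illustrative values)
params = (0.05, 1.0, 0.3, -0.2, 0.5, -0.4, 0.7, 0.1, 0.6, 0.8)
U = (0.1 + 0.05j, -0.08 + 0.02j)
print("start:", integrals15(U, *params))
for _ in range(2000):
    U = rk4_step14(U, 0.005, *params)
print("end  :", integrals15(U, *params))
```

The conservation of $K$ comes from $d|A_1|^2/dx = ib_1(\bar{A}_1^2A_2 - A_1^2\bar{A}_2)$ and $d|A_2|^2/dx = ib_2(A_1^2\bar{A}_2 - \bar{A}_1^2A_2)$, whose weighted sum cancels.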
It is then possible to identify all periodic and quasi-periodic solutions, and homoclinics to periodic solutions, depending on the values of the coefficients.
5 Typical results for finite depth problems
In the previous section, we investigated the normal forms, and obtained the various types of solutions that we might expect for our complete problems. For finite depth problems, thanks to the use of a center manifold reduction, we are now working with a reversible ordinary differential equation, and the above normal form is just the principal part of this ODE, whose higher order terms are not in such a nice form. The natural mathematical problem now consists in proving persistence results.
In summary, the persistence of periodic solutions of the normal form can in general be proved through an adaptation of the Lyapunov-Schmidt technique [9],[17]. The persistence of quasi-periodic solutions is much more delicate, and can only be proved in a subset of the 2 dimensional space of first integrals where these solutions exist for the normal form (see a complete proof in [8], and another example in [9]). The persistence of solutions homoclinic to periodic solutions, provided that they are not too small, requires some technical work (which runs well in the reduced 4 dimensional space); see for instance [9] for case (ii) and [10] for case (iii). For the method of computing all coefficients in such non trivial normal forms, see [3].
Now, for the normal form of case (ii), there is a family of orbits homoclinic to a family of periodic solutions whose amplitude can be chosen arbitrarily small. Case (ii) has been investigated by many authors (see for instance [2], [20], [21] and [9] and references therein). There are solutions homoclinic to oscillations at infinity whose size is smaller than any power of the bifurcation parameter, which reflects the fact that such oscillations cannot be avoided when we consider the full untruncated system. The extremely delicate fact that these oscillations are exponentially small and yet present was proved by Sun and Shen [21] for the water-wave problem (1,2), and is thoroughly studied by Lombardi in [17] for a wide class of problems. Moreover, despite the fact that a solution homoclinic to 0 exists for the normal form (9), this is not true in general for the full system (see [18]), even though one can compute an asymptotic expansion up to any order of such a homoclinic (non-existing) "solution"!
For these results on homoclinics, when they exist, it should be mentioned that the decay at infinity is exponential. There are degenerate cases (codimension two situations) where this is no longer true. For example in case (iii), when the coefficient $q_2$ is close to 0 (see [11]), there can exist in general (for $\nu = 0$) a homoclinic to 0 with a polynomial decay at infinity. It should be noticed that this phenomenon is in fact very different from the similar property of polynomial decay that we shall meet in the next section. The two phenomena are due to very different causes.
6 Infinite depth case
We now consider the case where the bottom layer has infinite depth. This means in our model problems that for problem (1,2) we must replace $b$ by 0, except in $\lambda b$, which is the new parameter $\mu$. In problem (1,3), we must replace $e$ by $\infty$, and there are still two parameters $\mu = (\lambda, \rho)$. Let us notice that, for most physical situations, the infinite depth case is much more interesting than the finite depth one. Indeed, the domain of validity of our nonlinear (local) theory is limited by the distance from the imaginary axis of the rest of the spectrum (the non-critical eigenvalues) of the linear operator studied in section 3. It appears that this distance goes to 0 when the depth increases. Now, typically the length scale for water in problem (1,2) is of the order of one centimeter, so this means that for layers more than a few centimeters deep, many of the non-critical eigenvalues are very close to the imaginary axis. It is then clear that our study becomes just academic! As usual, we need to study the worst limiting case, which is here the infinite depth case, and physical cases are in fact regular perturbations of this limiting case. We shall see below that this has dramatic consequences for the mathematical analysis!
6.1 Spectrum of the linearized problem
Passing to the infinite depth limit in the dispersion relations (5,6), we find far fewer eigenvalues. In fact all real eigenvalues disappear, except 0 for problem (1,3). However the real axis belongs entirely to the spectrum of $L_\mu$ (see [12],[4],[13]). Indeed, for any nonzero real $\sigma$, the operator $\sigma I - L_\mu$ is one-to-one but not onto (one says that the real axis forms the "essential spectrum" of $L_\mu$). Its range is not closed in $H$, and the codimension of its closure is nonzero (see [13]). It follows that the spectrum of $L_\mu$ crosses the imaginary axis at 0, and that we cannot use the center manifold reduction to an ODE. Despite this awful fact, we still have isolated eigenvalues of finite multiplicities outside the real axis, and situations with pairs of eigenvalues on the imaginary axis such as the ones described in cases (iii), (iv) of section 4. In addition, we still have the resolvent estimate (7), due to a good choice of the space $H$.
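For problem (1,2), the eigenvalues remaining on the imaginary axis can be located by a formal computation: for fixed real $k > 0$, $\tanh(k/b) \to 1$ as $b \to 0^+$, so relation (5) with $\mu = \lambda b$ held fixed becomes $k^2 - k + \mu = 0$. The sketch below is based on this formal limit only (it says nothing about the essential spectrum); the function names are hypothetical. It compares the limiting roots with the finite depth relation at small $b$.

```python
import math


def infinite_depth_roots(mu):
    """Positive real roots of k**2 - k + mu = 0 (formal b -> 0 limit of (5), mu = lam*b)."""
    disc = 1.0 - 4.0 * mu
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    candidates = {0.5 * (1.0 - s), 0.5 * (1.0 + s)}
    return sorted(k for k in candidates if k > 0.0)


def finite_depth_residual(k, mu, b):
    """Relation (5) divided by cosh(k/b), written with mu = lam*b."""
    return (mu + k * k) * math.tanh(k / b) / k - 1.0


for mu in (0.2, 0.25, 0.3):
    roots = infinite_depth_roots(mu)
    print(mu, roots, [finite_depth_residual(k, mu, 0.01) for k in roots])
```

In this formal limit the two pairs of simple imaginary eigenvalues present for $\mu < 1/4$ merge into a pair of double eigenvalues $\pm i/2$ at $\mu = 1/4$, which is the situation of case (iii), and no real root $k$ remains for $\mu > 1/4$.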
6.2 Normal forms in infinite dimensions
Although we cannot reduce our problems to finite dimensional ODEs, we would still like to believe that the eigenvalues near the imaginary axis govern the bounded solutions. This justifies developing a theory of normal forms which separates the finite dimensional critical space from the rest (the "hyperbolic" part of the spectrum, including 0). This leads to "partial normal forms", where there are coupling terms, especially in the infinite dimensional part of the system (see [4],[12]). However there are some additional difficulties:
i) In cases where 0 is an eigenvalue embedded in the essential spectrum, our technique uses the explicit form of the resolvent operator near the real axis to detect the right continuous linear form which can be used for the projection on the eigenspace.
ii) In the space $H$ the linear operator does not have an "easy" (even formal) adjoint. This adjoint and some of its eigenvectors are usually necessary for expressing projections on the critical finite dimensional space. In our problems, we again use the explicit form of the resolvent operator near the (double) eigenvalues to make the projections explicit (see [14]).
6.3 Typical results
The method we now use requires specifying a priori the type of solution we are looking for. This is a major difference with the case where a (center manifold) reduction to an ODE is possible. For periodic solutions, we use an adaptation of the Lyapunov-Schmidt method, except that the presence of 0 in the spectrum gives some trouble (resonant terms). It appears that we can formulate all these problems in such a way that there is no such resonant term. As a result, there are as many periodic solutions as in the truncated normal form (see [13]). For solutions homoclinic to 0 (solitary waves), we first invert the infinite dimensional part of the system in normal form, using the Fourier transform. Indeed, the Fourier transform of the linearized problem involves the above resolvent operator, from which we have eliminated, via a suitable projection, the poles given by the eigenvalues sitting on the imaginary axis. The fact that the resolvent operator is not analytic near 0 (0 is not a pole, since we eliminated it, but there is still a jump of the resolvent in crossing the real axis [12]) implies that this "hyperbolic part" of the solution decays polynomially at infinity. The principal part of the solution at finite distance still comes from the finite dimensional truncated normal form, but it decays faster at infinity than the other part of the solution, which makes this tail predominant at infinity. This is the main difference with the finite depth case, where the principal part coming from the normal form is valid for all values of $x$ (see [12] for the proofs related to problem (1,2)).
References
[1] C.J. Amick, K. Kirchgässner. A theory of solitary waves in the presence of surface tension. Arch. Rat. Mech. Anal. 105, 1-49, 1989.
[2] J.T. Beale. Exact solitary water waves with capillary ripples at infinity. Comm. Pure Appl. Math. 44, 211-257, 1991.
[3] F. Dias, G. Iooss. Capillary-gravity solitary waves with damped oscillations. Phys. D 65, 399-423, 1993.
[4] F. Dias, G. Iooss. Capillary-gravity interfacial waves in infinite depth. Eur. J. Mech. B/Fluids 15, 3, 367-393, 1996.
[5] G. Iooss. Capillary-Gravity water-waves problem as a dynamical system. Adv. Series in Nonlinear Dynamics 7, p. 42-57, A. Mielke, K. Kirchgässner Eds., World Sci. 1995.
[6] G. Iooss, M. Adelmeyer. Topics in Bifurcation theory and Applications. Adv. Series in Nonlinear Dynamics 3, World Sci. Pub. 1992.
[7] G. Iooss. A codimension two bifurcation for reversible vector fields. Fields Institute Comm. 4, 201-217, 1995.
[8] G. Iooss, J. Los. Bifurcation of spatially quasi-periodic solutions in hydrodynamic stability problems. Nonlinearity 3, 851-871, 1990.
[9] G. Iooss, K. Kirchgässner. Water waves for small surface tension: an approach via normal form. Proc. Roy. Soc. Edinburgh 122A, 267-299, 1992.
[10] G. Iooss, M.C. Pérouème. Perturbed homoclinic solutions in reversible 1:1 resonance vector fields. J. Diff. Equ. 102, 62-88, 1993.
[11] G. Iooss. Existence d'orbites homoclines à un équilibre elliptique, pour un système réversible. C.R. Acad. Sci. Paris, Série I 324, 993-997, 1997.
[12] G. Iooss, P. Kirrmann. Capillary gravity waves on the free surface of an inviscid fluid of infinite depth. Existence of solitary waves. Arch. Rat. Mech. Anal. 136, 1-19, 1996.
[13] G. Iooss. Gravity and Capillary-Gravity periodic travelling waves for two superposed fluid layers, one being of infinite depth. J. Math. Fluid Mech. 1, 24-61, 1999.
[14] T. Kato. Perturbation theory for Linear Operators. Springer Verlag, 1966.
[15] K. Kirchgässner. Wave solutions of reversible systems and applications. J. Diff. Equ. 45, 113-127, 1982.
[16] T. Levi-Civita. Détermination rigoureuse des ondes permanentes d'ampleur finie. Math. Annalen 93, 264-314, 1925.
[17] E. Lombardi. Orbits homoclinic to exponentially small periodic orbits for a class of reversible systems. Application to water waves. Arch. Rat. Mech. Anal. 137, 227-304, 1997.
[18] E. Lombardi. Non-persistence of homoclinic connections for perturbed integrable reversible systems. Journal of Dynamics and Differential Equations 11, 129-208, 1999.
[19] A. Mielke. Reduction of quasilinear elliptic equations in cylindrical domains with applications. Math. Meth. Appl. Sci. 10, 51-66, 1988.
[20] S.M. Sun. Existence of generalized solitary wave solution for water with positive Bond number less than 1/3. J. Math. Anal. Appl. 156, 471-504, 1991.
[21] S.M. Sun, M.C. Shen. Exponentially small estimate for the amplitude of capillary ripples of generalized solitary wave. J. Math. Anal. Appl. 172, 533-566, 1993.
[22] A. Vanderbauwhede, G. Iooss. Center manifold theory in infinite dimensions. Dynamics Reported 1 new series, 125-163, 1992.
COLD ATOMS AND MULTIPLE SCATTERING
ROBIN KAISER
Institut Non Linéaire de Nice
In this article we use a classical description of laser cooling of atoms. In a second part we describe the use of cold atoms for multiple scattering experiments and discuss some effects which appear for dense atomic media.
Laser cooling of atoms is usually described using a quantized atomic system (e.g. a two-level system) and a classical description of the light. This is the so-called semi-classical description of laser cooling. It is also possible to quantize the light field, as in the well-known dressed-state approach 1. In this paper, however, we take the opposite route and use the most classical description possible to understand laser cooling of atoms 2. As we will see, it is not possible to eliminate the quantum effects completely: the atomic resonance frequency is known to depend on Planck's constant, and the recoil of an atom after absorbing or emitting one photon is also quantized. But given these two points, the basic effects of so-called Doppler cooling can be computed with a classical model. If one accepts the resonance frequency as an input parameter and a phenomenological diffusion constant for the residual heating, one could even apply this model to other situations, such as acoustic waves. However, it seems difficult to use this scheme to cool systems other than individual atoms efficiently, even if it is worth noting that solid samples of matter have now been cooled using laser light 3.
1 Classical model of Doppler cooling
Let us study the center-of-mass motion of atoms interacting with quasi-resonant light. The radiative forces experienced by the atoms depend on the detuning $\delta$ between the laser frequency $\omega_L$ and the atomic resonance frequency $\omega_{at}$. If, for example, one wants to compute the radiation pressure, one needs to know the scattering cross section which, for particles with an internal resonance, can be much larger than the geometrical size of the particle. In order to take these internal resonance effects into account, we model the atom as a nucleus surrounded by an elastically bound electron, with a resonance frequency $\omega_{at}$. The laser light drives the electron and thus induces a dipole $\vec{d} = q\vec{r} = q(\vec{r}_e - \vec{R})$ which, in the driven regime, oscillates at the driving frequency $\omega_L$. It is the interaction between this driven dipole and the electromagnetic field of the laser which acts on the center of mass of the atom. We will hence proceed in two steps: first compute the dipole induced by the laser light, and second study the motion of this oscillating dipole in the electromagnetic field.
1.1 Internal motion: elastically bound electron
In this model we suppose that the distance $\vec{r} = \vec{r}_e - \vec{R}$ between the electron and the nucleus of the atom follows the equation
$$\frac{d^2\vec{r}}{dt^2} + \Gamma\frac{d\vec{r}}{dt} + \omega_{at}^2\,\vec{r} = \frac{\vec{f}}{m_e} \qquad (1)$$
The total force $\vec{f}$ acting on the electron is composed of the force $\vec{f}_E$ due to the electric field at the position $\vec{r}_e = \vec{R} + \vec{r}$ of the electron:
$$\vec{f}_E = q\vec{E}(\vec{r}_e,t)$$
and of a component due to the magnetic field:
$$\vec{f}_B = q\frac{d\vec{r}_e}{dt}\wedge\vec{B}(\vec{r}_e,t)$$
The ratio between the amplitudes of these two forces is of the order of
$$\frac{f_B}{f_E} \sim \frac{v_e B}{E} = \frac{v_e}{c} \ll 1$$
and we can hence neglect the effect of the magnetic field when computing the relative motion of the electron. Furthermore, the mass of the nucleus being much larger than that of the electron, the distance $\vec{r} = \vec{r}_e - \vec{R}$ between the electron and the nucleus of the atom is determined by the motion of the electron. We will use the complex notation for the electric field of a monochromatic, linearly polarized light wave with polarization vector $\vec{\epsilon}$:
$$\vec{E}(\vec{r},t) = E_0(\vec{r})\,\vec{\epsilon}\,\exp(-i\omega_L t) \qquad (2)$$
Using eqs. (1) and (2), and taking into account only the electric field force, one gets a driven solution $\vec{r}(t) = \vec{r}_0\exp(-i\omega_L t)$ with:
$$-\omega_L^2\,\vec{r}_0 - i\omega_L\Gamma\,\vec{r}_0 + \omega_{at}^2\,\vec{r}_0 = \frac{qE_0}{m_e}\,\vec{\epsilon}$$
Defining the polarizability $\alpha(\omega_L)$ of the atomic dipole by:
$$\vec{d} = q\vec{r} = \epsilon_0\,\alpha(\omega_L)\vec{E}$$
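one thus obtains:
$$\alpha(\omega_L) = \frac{q^2}{\epsilon_0 m_e}\,\frac{1}{\omega_{at}^2 - \omega_L^2 - i\omega_L\Gamma}$$
We will use the real and the imaginary part of $\alpha(\omega_L)$, $\alpha = \alpha' + i\alpha''$:
$$\alpha' = \frac{q^2}{\epsilon_0 m_e}\,\frac{\omega_{at}^2 - \omega_L^2}{(\omega_{at}^2 - \omega_L^2)^2 + (\omega_L\Gamma)^2}$$
$$\alpha'' = \frac{q^2}{\epsilon_0 m_e}\,\frac{\omega_L\Gamma}{(\omega_{at}^2 - \omega_L^2)^2 + (\omega_L\Gamma)^2}$$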
With real notations for the electric field and for the dipole $\vec{D}$ one thus has
$$\vec{D} = \mathrm{Re}\,\vec{d} = \mathrm{Re}\left[\epsilon_0\,\alpha(\omega_L)\vec{E}\right]$$
For a wave propagating along $Oz$ such as
$$\vec{E}(\vec{r},t) = E_0(\vec{r})\,\vec{\epsilon}\,\exp[-i(\omega_L t - kz)]$$
one gets:
$$\vec{D} = \epsilon_0\,|\alpha|\,E_0(\vec{r})\,\vec{\epsilon}\,\cos[-(\omega_L t - kz) + \varphi_\alpha]$$
where $\alpha = |\alpha|\exp(i\varphi_\alpha)$. The induced dipole follows the driving electric field with some delay. This delay depends on the detuning between the laser frequency and the resonance frequency of the dipole. If the electric field oscillates very slowly compared to the dipole resonance frequency ($\omega_L \ll \omega_{at}$), we have $\alpha' \gg \alpha''$ and the induced dipole follows the electric field excitation almost immediately, with a static polarizability:
$$\alpha_{stat} = \alpha' = \alpha_0 = \frac{q^2}{\epsilon_0 m_e\,\omega_{at}^2}$$
For a resonant excitation ($\omega_L = \omega_{at}$) we have $\alpha' = 0$, i.e. a purely imaginary polarizability, and the dipole is in phase quadrature with the driving field. Defining the detuning $\delta = \omega_L - \omega_{at}$ as the difference between the laser frequency $\omega_L$ and the atomic resonance frequency $\omega_{at}$, one gets for a quasi-resonant excitation ($|\delta| \ll \omega_L$, $\omega_L \simeq \omega_{at}$) (figure 1):
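Figure 1. Atomic polarizability $\alpha$ in units of $\alpha_0\omega_L/\Gamma$: real part $\alpha'$ (dashed line) and imaginary part $\alpha''$ (solid line) as a function of the detuning $\omega_L - \omega_{at}$ in units of $\Gamma$.
$$\alpha' = \alpha_0\,\frac{-\delta\,\omega_L/2}{\delta^2 + \Gamma^2/4} \qquad (4)$$
$$\alpha'' = \alpha_0\,\frac{\Gamma\omega_L/4}{\delta^2 + \Gamma^2/4} \qquad (5)$$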
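The quality of the quasi-resonant approximation (4)-(5) can be checked numerically against the full polarizability of the elastically bound electron. The following is a minimal sketch, not part of the original lecture: the function names and the choice of units ($\Gamma = 1$, $\omega_{at} = 10^6\,\Gamma$, $\alpha_0 = 1$) are assumptions made for illustration only.

```python
def alpha_exact(omega_l, omega_at, gamma, alpha0=1.0):
    """Full polarizability, written with alpha0 = q^2 / (eps0 m_e omega_at^2)."""
    denominator = complex(omega_at**2 - omega_l**2, -omega_l * gamma)
    return alpha0 * omega_at**2 / denominator


def alpha_lorentzian(delta, omega_l, gamma, alpha0=1.0):
    """Quasi-resonant form: real part as in eq. (4), imaginary part as in eq. (5)."""
    lorentz = delta**2 + gamma**2 / 4.0
    real_part = alpha0 * (-delta * omega_l / 2.0) / lorentz
    imag_part = alpha0 * (gamma * omega_l / 4.0) / lorentz
    return complex(real_part, imag_part)


def relative_error(delta, omega_at=1.0e6, gamma=1.0):
    """Relative distance between the exact and the approximate polarizability."""
    omega_l = omega_at + delta
    exact = alpha_exact(omega_l, omega_at, gamma)
    approx = alpha_lorentzian(delta, omega_l, gamma)
    return abs(exact - approx) / abs(exact)


if __name__ == "__main__":
    for delta in (-2.0, 0.0, 3.0):  # detuning in units of gamma
        print(delta, relative_error(delta))
```

For detunings of a few $\Gamma$ the two expressions differ only at the level of $\delta/\omega_L$, which is why the Lorentzian form is used in the rest of the section.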
Remarks
i) In order to compute this polarizability we have approximated the electric field $E_0(\vec{r}_e)$ at the position of the electron by the field at the position $\vec{R}$ of the center of mass $M$ of the atom. This approximation is valid if the distance between the electron and the center of mass of the atom is small compared to the scale on which the electric field varies, i.e. small compared to the wavelength of the laser: $|\vec{r}| = |\vec{r}_e - \vec{R}| \ll \lambda$. This approximation is called the electric dipole approximation, as one can consider the atom as a point dipole on the scale of the wavelength $\lambda$. Note that even in this dipole approximation, real atoms have a more complex internal structure (e.g. Zeeman sublevels) and exhibit some features which cannot be described by a classical dipole oscillation.
ii) The damping of the dipole can be explained by the radiation of the oscillating dipole. This radiation depends on the frequency of the oscillation and, strictly speaking, one should replace $\Gamma$ by $\Gamma\,\omega_L^2/\omega_{at}^2$. But we will only be interested in frequencies close to the resonant frequency and we will thus neglect the change of the damping on the scale of $\delta$. □
1.2 Radiation forces acting on the atom: "Classical approach"
The force acting on the center of mass of the atom, considered now as an oscillating dipole, has two components: one due to the electric field and one due to the magnetic field of the incident laser field, propagating along the $Oz$ axis. The force $\vec{f}_E$ due to the electric field, which we take polarized along the $Ox$ axis:
$$\vec{E}(\vec{r},t) = E_0(\vec{r})\,\vec{e}_x\exp(-i\omega_L t)$$
is directed parallel to this electric field:
$$\vec{f}_E = \sum q\vec{E}(\vec{r},t) \propto \vec{e}_x$$
The magnetic force $\vec{f}_B$, on the contrary, is directed along the axis of propagation of the laser (for a linearly polarized plane wave). The electron driven by the electric field has a velocity along the $Ox$ axis, $\frac{d\vec{r}_e}{dt} = \left|\frac{d\vec{r}_e}{dt}\right|\vec{e}_x$, and for a magnetic field along $Oy$, $\vec{B}(\vec{r}_e,t) = B_0(\vec{r}_e)\,\vec{e}_y\exp(-i\omega_L t)$, one gets:
$$\vec{f}_B = q\frac{d\vec{r}_e}{dt}\wedge\vec{B}(\vec{r}_e,t) = q\left|\frac{d\vec{r}_e}{dt}\right|B_0(\vec{r}_e)\exp(-i\omega_L t)\,\vec{e}_z$$
It is thus clear that one cannot neglect the effect of the magnetic field when computing the force acting on the center of mass of the atom! One can keep in mind the model of the electron driven by the electric field, with the magnetic force acting on this moving charged particle. Although this is not a complete description, it allows one, for example, to understand why the radiation pressure force is along the axis of propagation of the laser light, without using a quantum treatment of the laser field. Despite $f_B/f_E \sim v_e/c \ll 1$ for a single charge, the magnetic force cannot be dropped here, because the electric forces acting on the two charges of the atom nearly cancel. The total electric force on the atom is
$$\vec{f}_E = |q|\,\vec{E}(\vec{R},t) - |q|\,\vec{E}(\vec{r}_e,t)$$
The two components of the electric force have opposite signs. However, the net electric force is not zero, as the electric field is not the same at the location $\vec{r}_e$ of the electron and at the location $\vec{R}$ of the nucleus. The electric force is hence a differential effect. The magnetic force, on the other hand, depends on the velocities of the charged particles. As the electron moves much faster than the nucleus, one only needs to consider the magnetic force acting on the electron. But one has to consider this force together with the electric force acting on both charges.
Electric force
The electric force can be evaluated by making a first order expansion of $\vec{E}(\vec{r}_e,t)$:
$$\vec{E}(\vec{r}_e,t) \simeq \vec{E}(\vec{R},t) + \left\{\left[(\vec{r}_e - \vec{R})\cdot\overrightarrow{\mathrm{grad}}\right]\vec{E}(\vec{r}_e,t)\right\}_{\vec{r}_e = \vec{R}}$$
Taking $\vec{d} = q(\vec{r}_e - \vec{R})$, the electric force acting on the atom is:
$$\vec{f}_E = \left\{\left[\vec{d}\cdot\overrightarrow{\mathrm{grad}}\right]\vec{E}(\vec{r}_e,t)\right\}_{\vec{r}_e = \vec{R}}$$
with components along each axis $\vec{e}_i$ ($i = x, y, z$):
$$f_{E,i} = \sum_j d_j\,\frac{\partial}{\partial x_j}E_i(\vec{r}_e,t)$$
The spatial scale of variation of $\vec{E}(\vec{r}_e,t)$ is the wavelength $\lambda$. The electric force would be zero if the electric field were uniform, and its effect on the atomic dipole only appears at first order in $|\vec{r}_e - \vec{R}|/\lambda$. We have now computed the instantaneous force $\vec{f}_E$, which needs to be averaged over the fast optical frequency in order to describe the slow motion of the center of mass of the atom.
Magnetic force
For the magnetic force we restrict ourselves to:
$$\vec{f}_B = q\frac{d\vec{r}_e}{dt}\wedge\vec{B}(\vec{r}_e,t)$$
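and at the lowest order one has:
$$\vec{f}_B = q\frac{d\vec{r}_e}{dt}\wedge\vec{B}(\vec{R},t)$$
As $\vec{d} = q\vec{r} = \epsilon_0\,\alpha(\omega_L)\vec{E}$, one can write:
$$\vec{f}_B = \frac{d}{dt}\left(\vec{d}\wedge\vec{B}(\vec{R},t)\right) - \vec{d}\wedge\frac{d}{dt}\vec{B}(\vec{R},t)$$
Taking $\frac{d}{dt}\vec{B}(\vec{R},t) \simeq \frac{\partial}{\partial t}\vec{B}(\vec{R},t)$ (the velocities of the charges are small compared to $c$) and using Maxwell's equation $\frac{\partial\vec{B}}{\partial t} = -\overrightarrow{\mathrm{rot}}\,\vec{E}$, one can express the magnetic force as a function of the electric field:
$$\vec{f}_B = \frac{d}{dt}\left(\vec{d}\wedge\vec{B}(\vec{R},t)\right) + \vec{d}\wedge\overrightarrow{\mathrm{rot}}\,\vec{E}$$
or, for the $\vec{e}_i$ component:
$$f_{B,i} = \frac{d}{dt}\left(\vec{d}\wedge\vec{B}(\vec{R},t)\right)_i + \sum_j d_j\left(\frac{\partial E_j}{\partial x_i} - \frac{\partial E_i}{\partial x_j}\right)$$
The time average of the first term is zero and we hence neglect this part in the following.
Total force
The total average force $\langle\vec{f}\rangle$ exerted on an atom by a light wave is: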
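$$\langle\vec{f}\rangle = \langle\vec{f}_E\rangle + \langle\vec{f}_B\rangle = \left\langle\left[\vec{d}\cdot\overrightarrow{\mathrm{grad}}\right]\vec{E}(\vec{r}_e,t)\right\rangle + \left\langle\vec{d}\wedge\overrightarrow{\mathrm{rot}}\,\vec{E}\right\rangle$$
with components along $\vec{e}_i$:
$$\langle f_i\rangle = \left\langle\sum_j d_j\,\frac{\partial E_j}{\partial x_i}\right\rangle$$
This force seems to derive from a time-averaged potential:
$$W = -\langle\vec{d}\cdot\vec{E}\rangle = -\left\langle\sum_j d_j E_j\right\rangle$$
where the gradient is taken only on the electric field $\vec{E}$ and not on the dipole $\vec{d}$:
$$\langle\vec{f}\rangle = -\overrightarrow{\mathrm{grad}}\,W = \left\langle\sum_j d_j\,\overrightarrow{\mathrm{grad}}\,E_j\right\rangle$$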
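The step from the separate electric and magnetic contributions to the compact component form of the total force can be made explicit. The following worked equation is a sketch under the assumptions used in the text: the fields are evaluated at $\vec{R}$, and the total time derivative $\frac{d}{dt}(\vec{d}\wedge\vec{B})$ averages to zero over an optical period. The index notation $\partial_i = \partial/\partial x_i$ is introduced here for brevity.

```latex
\begin{align*}
\langle f_i \rangle
  &= \Big\langle \sum_j d_j\,\partial_j E_i \Big\rangle
   + \Big\langle \sum_j d_j\,\big(\partial_i E_j - \partial_j E_i\big) \Big\rangle \\
  &= \Big\langle \sum_j d_j\,\partial_i E_j \Big\rangle .
\end{align*}
```

The term $\sum_j d_j\,\partial_j E_i$ of the electric force is cancelled exactly by the second half of the magnetic contribution, which is why only derivatives with respect to the direction $i$ of the force survive.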
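Average radiation force
Let us consider the following electric field: $\vec{E}(\vec{r},t) = \vec{e}_x\,|E_0(\vec{r})|\exp(ikz - i\omega_L t)$, with $|E_0(\vec{r})|$ real. Returning to real notations for the fields and the dipoles one has:
$$\vec{D} = \mathrm{Re}\,\vec{d} = \mathrm{Re}\left(\epsilon_0\,\alpha\,\vec{E}_0(\vec{r})\exp(-i\omega_L t)\right)$$
For a wave propagating along $Oz$ one gets $\vec{E}_0(\vec{r}) = \vec{e}_x\,|E_0(\vec{r})|\exp(ikz)$ and
$$\vec{D} = \epsilon_0\,|E_0(\vec{r})|\,\vec{e}_x\left(\alpha'\cos(\omega_L t - kz) + \alpha''\sin(\omega_L t - kz)\right)$$
To calculate the average force let us first take
$$\overrightarrow{\mathrm{grad}}\,\mathrm{Re}\left[E_0(\vec{r})\exp(-i\omega_L t)\right] = \mathrm{Re}\left\{\overrightarrow{\mathrm{grad}}\left[|E_0(\vec{r})|\exp(ikz)\right]\exp(-i\omega_L t)\right\}$$
$$= \overrightarrow{\mathrm{grad}}\left[|E_0(\vec{r})|\right]\cos(\omega_L t - kz) + k\vec{e}_z\,|E_0(\vec{r})|\sin(\omega_L t - kz)$$
We then obtain the instantaneous force:
$$\vec{f} = \left[\epsilon_0\,|E_0(\vec{r})|\left(\alpha'\cos(\omega_L t - kz) + \alpha''\sin(\omega_L t - kz)\right)\right]\times\left[\overrightarrow{\mathrm{grad}}\left[|E_0(\vec{r})|\right]\cos(\omega_L t - kz) + k\vec{e}_z\,|E_0(\vec{r})|\sin(\omega_L t - kz)\right]$$
The time-averaged force is thus
$$\langle\vec{f}\rangle = \epsilon_0\left(\frac{\alpha'}{4}\,\overrightarrow{\mathrm{grad}}\left[|E_0(\vec{r})|^2\right] + \frac{\alpha''}{2}\,k\vec{e}_z\,|E_0(\vec{r})|^2\right) \qquad (6)$$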
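The time average leading to (6) only uses $\langle\cos^2\rangle = \langle\sin^2\rangle = 1/2$ and $\langle\sin\cos\rangle = 0$. A short numerical check of the component along $Oz$ is sketched below; it is not part of the original lecture. The function names, the restriction to a field amplitude varying only along $z$, and the arbitrary test values are assumptions made for illustration.

```python
import math


def instantaneous_force(phase, a_re, a_im, e0, grad_e0, k, eps0=1.0):
    """z-component of the force at optical phase = omega_L t - k z."""
    dipole = eps0 * e0 * (a_re * math.cos(phase) + a_im * math.sin(phase))
    field_gradient = grad_e0 * math.cos(phase) + k * e0 * math.sin(phase)
    return dipole * field_gradient


def averaged_force(a_re, a_im, e0, grad_e0, k, eps0=1.0, samples=1000):
    """Average of the instantaneous force over one optical period."""
    total = 0.0
    for n in range(samples):
        phase = 2.0 * math.pi * n / samples
        total += instantaneous_force(phase, a_re, a_im, e0, grad_e0, k, eps0)
    return total / samples


def formula_6(a_re, a_im, e0, grad_e0, k, eps0=1.0):
    """Right-hand side of eq. (6), using d(e0^2)/dz = 2 e0 grad_e0."""
    grad_e0_squared = 2.0 * e0 * grad_e0
    return eps0 * (a_re / 4.0 * grad_e0_squared + a_im / 2.0 * k * e0**2)


if __name__ == "__main__":
    args = (0.7, 1.3, 2.0, 0.5, 3.0)  # alpha', alpha'', |E0|, d|E0|/dz, k
    print(averaged_force(*args), formula_6(*args))
```

The first term of the result is the dipole force (proportional to $\alpha'$ and to the intensity gradient), the second the radiation pressure (proportional to $\alpha''$ and to $k$).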
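1.3 Resonant radiation pressure
The second term of (6) is called the resonant radiation pressure $\vec{f}_{rad}$. It is aligned along the direction of propagation of the laser ($\vec{e}_z$) and it is proportional to the laser intensity $I_{inc} = \frac{1}{2}\epsilon_0 c\,|E_0|^2$:
$$\vec{f}_{rad} = \alpha'' k\,\vec{e}_z\,\frac{I_{inc}}{c} \qquad (7)$$
One can thus define a scattering cross section $\sigma_{at} = \alpha'' k$, which depends on the detuning $\delta$ (figure 2):
$$\sigma_{at}(\delta) = \alpha_0 k\,\frac{\Gamma\omega_L/4}{\delta^2 + \Gamma^2/4} = \frac{\Gamma/2}{\delta^2 + \Gamma^2/4}\,\frac{q^2}{2\epsilon_0 c\,m_e}$$
At resonance one has
$$\sigma_{at}^{res} = \alpha_0 k\,\frac{\omega_L}{\Gamma} \qquad (8)$$
Taking for the damping constant the radiation losses due to the oscillating electron ($\Gamma = \frac{q^2}{6\pi\epsilon_0 m_e c^3}\,\omega_L^2 \simeq \frac{q^2}{6\pi\epsilon_0 m_e c^3}\,\omega_{at}^2$), one finds at resonance:
$$\sigma_{at}^{res} = \frac{3\lambda_{at}^2}{2\pi} \qquad (9)$$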
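As a consistency check, the resonant cross section can be evaluated from the Lorentzian form of $\sigma_{at}(\delta)$ with the classical radiative damping and compared with $3\lambda^2/2\pi$. The sketch below is not from the original text: the function names are hypothetical, the SI constants are the usual CODATA values, and the wavelength of 780 nm (Rubidium, as used later in the article) is chosen only as an example.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
Q = 1.602176634e-19       # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg
C = 299792458.0           # speed of light, m/s


def classical_damping(omega):
    """Radiative damping rate of an oscillating electron at frequency omega."""
    return Q**2 * omega**2 / (6.0 * math.pi * EPS0 * M_E * C**3)


def cross_section(delta, gamma):
    """Scattering cross section sigma_at(delta) = alpha'' k near resonance."""
    lorentz = delta**2 + gamma**2 / 4.0
    return (gamma / 2.0) / lorentz * Q**2 / (2.0 * EPS0 * C * M_E)


def resonant_cross_section(wavelength):
    """Cross section at delta = 0 with the classical damping rate."""
    omega = 2.0 * math.pi * C / wavelength
    return cross_section(0.0, classical_damping(omega))


if __name__ == "__main__":
    lam = 780e-9
    print(resonant_cross_section(lam), 3.0 * lam**2 / (2.0 * math.pi))
```

The charge and mass of the electron drop out of the resonant value, which only depends on the wavelength: this is why the same cross section is found for a real atom on a closed transition.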
Remarks
"Comets": this radiation pressure is responsible for the neutral tail of comets. In these tails small particles (of diameter less than 1 micron) are pushed away from the sun. The radiation pressure, which scales as surface/distance², dominates for small particles over the gravitational attraction, which scales as volume/distance².
"Quantum approach": It is possible to evaluate the radiation pressure force of a plane monochromatic wave acting on an atom by a linear momentum conservation argument. As a classical electric field is not an eigenstate of the momentum operator, we use a quantum description of the light field in terms of photons. For a wave propagating along the $Oz$ axis, each absorption process gives rise to a momentum transfer of $\Delta\vec{p} = \hbar k\,\vec{e}_z$. The emission of photons occurs in a random direction such that, on average, the momentum transfer due to the emission processes is zero after several fluorescence cycles. One thus gets an average momentum transfer of $\langle\Delta\vec{p}\rangle = \hbar k\,\vec{e}_z$ per fluorescence cycle. The average force $\vec{f}_{av}$, depending on the number of fluorescence cycles per second $\frac{dN}{dt}$ (hence also on the laser intensity and detuning), is thus given by:
$$\vec{f}_{av} = \frac{dN}{dt}\,\hbar\vec{k}$$
directed along the axis of propagation of the incident laser light. This argument is based on a quantum treatment of the laser light (the electric field being quantized in terms of photons) and is not along the main line of the calculation followed in this article. However, it clearly shows that the direction of the force acting on the atom can be along the direction of propagation of the laser light, and hence transverse to the electric field! A similar argument could give the force acting on a mirror when reflecting light.
Figure 2. Cross section
It is possible to reconcile the "classical" and "quantum" descriptions of the radiation pressure by rewriting the force (7) as:
$$\vec{f}_{rad} = \hbar\vec{k}\,\frac{\Gamma}{2}\,\frac{I_{inc}}{I_{sat}}\,\frac{\Gamma^2/4}{\delta^2 + \Gamma^2/4} \qquad (10)$$
with $I_{sat} = \frac{\hbar\omega_L\Gamma}{2\sigma_{at}^{res}}$. In the case of Rubidium atoms one gets, for example, $I_{sat} = 1.6\,mW/cm^2$. This expression (10) allows for a simple physical explanation of the radiation pressure force. During each fluorescence cycle one has a transfer of momentum $\hbar\vec{k}$ from the laser field to the atom. The time scale for one fluorescence cycle depends on the lifetime $\Gamma^{-1}$ of the excited state of the atom and on the laser intensity needed to reexcite the atoms after the spontaneous emission. The number $\frac{dN}{dt}$ of absorbed and reemitted photons per unit time is:
$$\frac{dN}{dt} = \frac{\Gamma}{2}\,\frac{I_{inc}}{I_{sat}}\,\frac{\Gamma^2/4}{\delta^2 + \Gamma^2/4}$$
which has the resonant frequency dependence. □
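The quoted saturation intensity for Rubidium can be recovered with a few lines of arithmetic. The sketch below is not part of the original text and rests on explicit assumptions: $I_{sat}$ is taken as $\hbar\omega_L\Gamma/(2\sigma_{at}^{res})$, which is what one obtains by equating (7) and (10) at resonance; $\sigma_{at}^{res} = 3\lambda^2/2\pi$ as in (9); and the values $\lambda = 780$ nm and $\Gamma/2\pi = 6$ MHz are those quoted for Rubidium later in the article. The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 299792458.0         # speed of light, m/s


def saturation_intensity(wavelength, gamma):
    """I_sat = hbar * omega * gamma / (2 * sigma_res), in W/m^2."""
    omega = 2.0 * math.pi * C / wavelength
    sigma_res = 3.0 * wavelength**2 / (2.0 * math.pi)
    return HBAR * omega * gamma / (2.0 * sigma_res)


def scattering_rate(delta, gamma, intensity, i_sat):
    """Number of fluorescence cycles per second, dN/dt, in the classical model."""
    lorentz = delta**2 + gamma**2 / 4.0
    return (gamma / 2.0) * (intensity / i_sat) * (gamma**2 / 4.0) / lorentz


if __name__ == "__main__":
    gamma_rb = 2.0 * math.pi * 6.0e6            # rad/s
    i_sat = saturation_intensity(780e-9, gamma_rb)
    print("I_sat in mW/cm^2:", i_sat / 10.0)    # 1 mW/cm^2 = 10 W/m^2
    print("dN/dt at resonance, I = I_sat:", scattering_rate(0.0, gamma_rb, i_sat, i_sat))
```

Note that this classical expression grows linearly with the intensity; it contains no saturation of the transition, which a two-level quantum treatment would add.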
1.4 Dipole force
The first term of (6) is called the dipole force $\vec{f}_{dip}$. A particle with a real polarizability $\alpha'$ is attracted to a spatial region where its potential energy $W = -\vec{d}\cdot\vec{E}$ is minimum. With $\vec{d} = \epsilon_0\alpha'\vec{E}$ one has: $W = -\epsilon_0\alpha'\,|\vec{E}|^2$. One thus obtains a force oriented towards regions of high electric field in the case of $\alpha' > 0$ (high field seekers, such as dielectric spheres in air) and towards low electric field for $\alpha' < 0$ (low field seekers, such as air bubbles in champagne).
Remark Whereas the radiation pressure force can be explained by fluorescence cycles of absorption followed by spontaneous emission, the dipole force can be expressed in a quantum approach in terms of absorption followed by stimulated emission processes. □
Let us consider the case of a standing wave obtained from two counterpropagating plane waves: $|\vec{E}|^2 = 4\,|E_0|^2\cos^2(kz)$. The dipole force for an atom located at $z$ is in this case, from the first term of (6):
$$\vec{f}_{dip} = -2\alpha' k\,\frac{I_{inc}}{c}\,\sin(2kz)\,\vec{e}_z$$
where $I_{inc}$ is the incident intensity of each of the two plane waves producing the standing wave. Note that this force changes sign when moving along $Oz$.
Order of magnitude: For an on-resonant laser ($\delta = 0$) with an intensity of $I = 1\,mW/cm^2$ the radiation pressure force is $f_{rad} = \alpha'' k\,\frac{I}{c} \simeq 10^{-20}\,N$. This force is $10^4$ times larger than the gravitational force: $f_g = Mg \simeq 10^{-24}\,N$! Even though each momentum transfer is quite small, the radiation pressure forces are huge because, after the emission of a photon, the atoms can quickly be reexcited to the upper state, and up to $10^7$ cycles of fluorescence per second can be achieved. One condition for this to happen is that, after a spontaneous emission, the atom falls back into the initial ground state, from where it can quickly be reexcited (so-called closed transitions). In the elastically bound electron model this condition does not appear explicitly, except for the fact that we suppose one single atomic frequency to be present. In some atoms this condition can be fulfilled (sometimes at the expense of an additional "repumping" laser), but in the case of molecules it is much more difficult to find closed transitions useful for laser cooling.
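The orders of magnitude quoted above can be reproduced directly. The following sketch is an illustration only and makes assumptions that the text does not state explicitly: the atom is taken to be Rubidium 87 (mass 86.909 u), the resonant cross section is $3\lambda^2/2\pi$ at $\lambda = 780$ nm, and the force is written as $\sigma_{at}^{res} I/c$, which is eq. (7) at resonance. The function names are hypothetical.

```python
import math

C = 299792458.0               # speed of light, m/s
AMU = 1.66053906660e-27       # atomic mass unit, kg
MASS_RB87 = 86.909 * AMU      # assumed isotope: Rubidium 87
G_EARTH = 9.81                # gravitational acceleration, m/s^2


def radiation_pressure_force(intensity, wavelength):
    """Resonant radiation pressure force sigma_res * I / c, in newtons."""
    sigma_res = 3.0 * wavelength**2 / (2.0 * math.pi)
    return sigma_res * intensity / C


def gravitational_force(mass):
    """Weight of a single atom, in newtons."""
    return mass * G_EARTH


if __name__ == "__main__":
    f_rad = radiation_pressure_force(10.0, 780e-9)   # 1 mW/cm^2 = 10 W/m^2
    f_g = gravitational_force(MASS_RB87)
    print("f_rad =", f_rad, "N")
    print("f_g   =", f_g, "N")
    print("ratio =", f_rad / f_g)
```

The ratio comes out at several thousand, consistent with the factor of about $10^4$ stated in the text.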
1.5 Doppler cooling
We now apply the radiation pressure forces to study laser cooling of atoms, i.e. the reduction of the width of the velocity distribution of a sample of atoms. The simplest idea for such a cooling was proposed in 1975 by T. Hänsch and A. Schawlow 5 for neutral atoms and by D. Wineland and H. Dehmelt 6 for ions. Consider the case of two lasers counterpropagating along $Oz$ with the same frequency $\omega_L$. This argument can be generalized to three dimensions, but for simplicity we will restrict ourselves to one dimension. An atom interacting with such a laser configuration is subjected to the radiation pressure forces calculated in section 1.2. A detailed analysis of this situation has to include the effects of both the resonant radiation pressure and the dipole force. But a basic explanation of the so-called Doppler cooling can be given by simply adding independently the resonant radiation pressure forces of the two propagating laser fields. Let us consider the case of the laser frequency being detuned below the atomic resonance frequency: $\delta = \omega_L - \omega_{at} < 0$ ("red detuning"). For an atom at rest ($v_z = 0$), the excitation by the laser light is not efficient, as the resonance condition is fulfilled for neither laser. If an atom is now moving ($v_z \neq 0$) it experiences different frequencies from the two laser fields. In the atom's rest frame, one of the laser frequencies is shifted towards higher frequencies, the other one towards lower frequencies. For a "red detuning" ($\delta < 0$) the atom shifts into resonance ($\delta - k_z v_z = 0$) with the laser propagating opposite to the atom's velocity: $\delta = k_z v_z < 0$. The other laser, on the contrary, is shifted further out of resonance. The net force for an atom moving towards $+Oz$ is directed along $-Oz$, i.e. opposite to its velocity. In the same way, an atom moving towards $-Oz$ experiences a force along $+Oz$.
Figure 3. Radiation pressure force as a function of detuning $\delta/\Gamma$
By summing the two velocity dependent forces one gets, with $\vec{k}_1 = -\vec{k}_2 = k\vec{e}_z$:
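$$f_z^{tot}(v_z) = f_{1,z}(v_z) + f_{2,z}(v_z)$$
$$= \hbar k_{1,z}\,\frac{\Gamma}{2}\,\frac{I_{inc}}{I_{sat}}\,\frac{\Gamma^2/4}{(\delta - k_{1,z}v_z)^2 + \Gamma^2/4} + \hbar k_{2,z}\,\frac{\Gamma}{2}\,\frac{I_{inc}}{I_{sat}}\,\frac{\Gamma^2/4}{(\delta - k_{2,z}v_z)^2 + \Gamma^2/4}$$
$$= \hbar k\,\frac{\Gamma}{2}\,\frac{I_{inc}}{I_{sat}}\left[\frac{\Gamma^2/4}{(\delta - kv_z)^2 + \Gamma^2/4} - \frac{\Gamma^2/4}{(\delta + kv_z)^2 + \Gamma^2/4}\right]$$
For small velocities one can get a linearized expression of this force
$$f_z(v_z) = -\gamma m v_z \qquad (11)$$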
with a friction coefficient $\gamma$:
$$\gamma = -\frac{\hbar k^2}{m}\,\frac{I_{inc}}{I_{sat}}\,\frac{2\delta\Gamma\,(\Gamma^2/4)}{(\delta^2 + \Gamma^2/4)^2}$$
The velocities of the atoms around $v_z = 0$ will thus decrease exponentially:
$$v_z(t) = v_z(t_0)\exp[-\gamma(t - t_0)]$$
The friction coefficient $\gamma$ is maximum for $\delta = -\frac{\Gamma}{2\sqrt{3}}$. For a laser with an intensity of $I = I_{sat} = 1.6\,mW/cm^2$:
$$\gamma = \frac{9}{4\sqrt{3}}\,\frac{\hbar k^2}{m}$$
This is a very fast process. Because of this friction, this type of cooling has been called "optical molasses". If one changed the frequency of the laser to positive ("blue") detuning, one would get a heating process for the atoms (increasing their velocities). The limit of such a cooling process is given by the fluctuations of the forces. These fluctuations are of quantum nature and depend on the recoil momentum $\hbar k$. At each cycle of fluorescence a photon is emitted in a random direction. This yields a random walk in momentum space with a step of size $\delta p = \hbar k$. One thus gets an increase in the kinetic energy of the atomic distribution:
$$\frac{d\langle(\Delta p)^2\rangle}{dt} = 2D$$
The diffusion coefficient $D$ is given by the ratio of the square of the step size $\delta p$ and the time scale $\tau$ for a fluorescence cycle:
$$D = \frac{(\delta p)^2}{\tau}$$
The average time $\tau$ between two spontaneous emissions is given by:
$$\frac{1}{\tau} = \frac{\Gamma}{2}\,\frac{I_{inc}}{I_{sat}}\,\frac{\Gamma^2/4}{\delta^2 + \Gamma^2/4} \qquad (12)$$
This diffusion gives rise to an increase in the kinetic energy, i.e. a heating of the atoms. At equilibrium the heating due to the fluctuations and the cooling due to the friction effect compensate:
$$\left.\frac{d\langle(\Delta p)^2\rangle}{dt}\right|_{eq} = -2\gamma\,\langle(\Delta p)^2\rangle_{eq} + 2D = 0 \qquad (13)$$
Figure 4. Equilibrium temperature $k_B T/\hbar\Gamma$ as a function of detuning $\delta/\Gamma$
and one obtains the temperature (figure 4):
$$\frac{1}{2}k_B T = \frac{\langle(\Delta p)^2\rangle_{eq}}{2m} = \frac{D}{2\gamma m} = -\frac{\hbar}{8}\,\frac{\delta^2 + \Gamma^2/4}{\delta}$$
The minimum temperature is hence:
$$k_B T_{min} = \frac{\hbar\Gamma}{4}$$
which is obtained for $\delta = -\frac{\Gamma}{2}$.
Remark A precise calculation needs to take into account the fluctuations of the number of fluorescence cycles, and not only the random direction of the spontaneously emitted photons, as well as the three dimensional aspect of the random walk. This slightly changes the numerical value of the above result, yielding in a one dimensional configuration a so-called Doppler limit of
$$k_B T^{(1D)} = \frac{7}{20}\,\hbar\Gamma$$
and in a standard three dimensional situation a Doppler temperature of
$$k_B T_{Dopp} = \frac{\hbar\Gamma}{2}$$
For Rubidium atoms this Doppler temperature is
$$T_{Dopp}^{Rb} = \frac{\hbar\Gamma}{2k_B} \simeq 140\,\mu K$$
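The molasses force, its linearization and the Doppler temperature can be put into numbers. The sketch below is not from the original text. It assumes Rubidium 87 (mass 86.909 u), $\lambda = 780$ nm and $\Gamma/2\pi = 6$ MHz, writes the force of each beam as $\hbar k\,\frac{\Gamma}{2}\frac{I_{inc}}{I_{sat}}\frac{\Gamma^2/4}{\delta^2+\Gamma^2/4}$ with a Doppler-shifted detuning, and takes the friction coefficient as the slope of the total force at $v_z = 0$. All function names and the parameter `s` (standing for $I_{inc}/I_{sat}$) are introduced here for illustration.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J s
K_B = 1.380649e-23            # Boltzmann constant, J/K
AMU = 1.66053906660e-27       # atomic mass unit, kg
MASS_RB87 = 86.909 * AMU      # assumed isotope: Rubidium 87


def scattering_force(delta, gamma, s, k):
    """Radiation pressure of one beam; s stands for I_inc / I_sat."""
    lorentz = delta**2 + gamma**2 / 4.0
    return HBAR * k * (gamma / 2.0) * s * (gamma**2 / 4.0) / lorentz


def molasses_force(v, delta, gamma, s, k):
    """Sum of the forces of two counterpropagating beams on an atom of velocity v."""
    return (scattering_force(delta - k * v, gamma, s, k)
            - scattering_force(delta + k * v, gamma, s, k))


def friction_coefficient(delta, gamma, s, k, mass):
    """gamma_friction such that the force is -gamma_friction * mass * v near v = 0."""
    lorentz = delta**2 + gamma**2 / 4.0
    return -(HBAR * k**2 / mass) * s * 2.0 * delta * gamma * (gamma**2 / 4.0) / lorentz**2


def doppler_temperature_3d(gamma):
    """Three dimensional Doppler temperature hbar * gamma / (2 k_B)."""
    return HBAR * gamma / (2.0 * K_B)


if __name__ == "__main__":
    gamma = 2.0 * math.pi * 6.0e6
    k = 2.0 * math.pi / 780e-9
    delta = -gamma / (2.0 * math.sqrt(3.0))
    fric = friction_coefficient(delta, gamma, 1.0, k, MASS_RB87)
    print("friction coefficient (1/s):", fric)
    print("velocity damping time (s):", 1.0 / fric)
    print("Doppler temperature (K):", doppler_temperature_3d(gamma))
```

Under these assumptions the velocity damping time comes out in the range of tens of microseconds, which illustrates why the process is called very fast.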
These are extremely low temperatures, below what had been obtained by other techniques before. In addition, experiments by W.D. Phillips and coworkers in 1988 7 have resulted in even lower temperatures. In order to explain these lower temperatures one needs to include the spatially modulated dipole forces in the standing wave, together with polarization effects and the more complex internal structure of the atoms (optical pumping) 8,9. But we will not discuss here such "Sisyphus" cooling 10 or other elegant sub-recoil cooling techniques 11,8,9.
2 Interferences in multiple scattering
In the preceding section we have studied the effect of the atom-laser interaction on the atomic momentum and position distribution. This is the main interest of the community working on laser manipulation of atoms. When considering atoms as scatterers, one will on the other hand be more interested in the light scattered by the atoms. We want to study in this section to what extent atoms are different from standard scatterers such as TiO2 or other Rayleigh and Mie scatterers.
2.1 Scattering cross section of single atoms
Atoms can be considered as very strong scatterers, with a cross section as large as the resonant value $\sigma_{at}^{res}$ of eq. (9). The low-frequency limit $\omega_L \to 0$ can be deduced from eq. (3):
$$\sigma_{at} \propto \omega_L^4$$
the known $\omega^4$ dependence for Rayleigh scatterers at low frequencies. As the frequency increases one usually gets into the Mie regime, with more or less sharp resonances and a cross section which tends to twice the geometrical cross section:
$$\sigma \xrightarrow[\omega\to\infty]{} 2\pi r^2$$
Remark The factor of 2 compared to the geometrical cross section can be understood as being due to the light diffracted by the sphere ($\theta_{diff} \sim \lambda/r$), which reduces the intensity in the forward beam. This result is valid in the far field limit ($L \gg r^2/\lambda$). When looking at the cross section of an object by measuring the intensity profile after a short distance ($L \ll r^2/\lambda$), the diffracted light is not yet separated from the geometrical image of the sphere and one only recovers the geometrical cross section $\pi r^2$. □
From the scattering cross section one defines a mean free path $l_{MFP}$ as the average distance between two successive scattering events:
$$l_{MFP} = \frac{1}{n\sigma}$$
where $n$ is the density of scatterers in the medium. In this respect one can consider the atomic resonance frequency as a first sharp resonance of the scattering cross section. In real atoms there are many transitions, corresponding to different resonances in the cross section. An atomic spectrum is thus equivalent to a frequency dependent cross section. In these terms the finesse of a resonance in the case of atoms is extremely large. For Rubidium atoms e.g. one has:
With $\omega = \frac{2\pi c}{\lambda}$ one gets, for $\lambda = 780\,nm$ and with a natural linewidth of $\Gamma = 6\,MHz$, a finesse of
$$\frac{\omega}{\Gamma} \simeq 6\cdot 10^7$$
Such large values have been obtained in other types of scatterers 13,14, but in order to make use of these large values of the finesse in a multiple scattering experiment one needs to be able to produce many scatterers with the same resonance frequency (with a precision of the width of the resonance). This seems to be unrealistic with scatterers such as microspheres. Atoms of the same element, however, all have exactly the same resonance frequency and are in this respect a unique way of studying high finesse cross sections in multiple scattering 15. If one does not only want to study an extinction cross section (attenuation of the incident beam), it is also of interest to analyze the differential scattering cross section and the absorption cross section. Let us only consider the far field radiation by a dipole excited by the incident electric field of a linearly polarized laser, $E_0\,\vec{\epsilon}\exp(-i\omega_L t)$. Using the results of section 1.1, one obtains a dipole $\vec{d}$:
$$\vec{d} = \epsilon_0\,(\alpha' + i\alpha'')\,E_0\,\vec{\epsilon}\exp(-i\omega_L t)$$
The scattered field by a dipole is given, at a distance $r$ in the direction of the unit vector $\vec{n}$, by:
$$\vec{E}_{scat} = \frac{1}{4\pi\epsilon_0}\left\{k^2\,(\vec{n}\wedge\vec{d})\wedge\vec{n}\,\frac{\exp(ikr)}{r} + \left[3\vec{n}\,(\vec{n}\cdot\vec{d}) - \vec{d}\right]\left(\frac{1}{r^3} - \frac{ik}{r^2}\right)\exp(ikr)\right\}$$
which in the far field limit ($kr \gg 1$) gives:
$$\vec{E}_{scat} = \frac{k^2}{4\pi\epsilon_0}\,\frac{\exp(ikr)}{r}\,(\vec{n}\wedge\vec{d})\wedge\vec{n}$$
The scattered field will not always be in phase with the incident field, because of the complex polarizability $\alpha$ of the dipole. The forward scattered field interferes with the incident field. In this direction an imaginary polarizability (on resonance $\alpha = i\alpha''$) corresponds to a destructive interference, as the dipole is then in phase quadrature with the incident field. For a collection of scatterers this can be compared to an imaginary part of the index of refraction and to an attenuation of the incident beam 12. In any other direction, however, there is no interference with the incident beam and only the scattered field has to be considered, with a frequency dependent phase shift.
2.2 Multiple scattering samples in atomic physics
Atomic vapors can be used for multiple scattering experiments, and an optical thickness larger than one can be obtained both for room temperature and for laser cooled samples. One has to pay attention to the various situations one can produce in atomic vapors. As examples let us consider three accessible situations. First, an oven of sodium atoms 16 (actually many other elements can be used) at a temperature of about 200° Celsius can yield a density of $10^{13}$ atoms/cc and, with a sample thickness of $L = 1\,cm$, an on-resonant optical thickness of several $10^3$. In this case, however, the Doppler shifts of the atoms lead to an inhomogeneous broadening of the cross section. For a detuning of 1 GHz, however, the Doppler broadening can be neglected. The optical thickness is reduced but still remains larger than unity. In this situation one might have to consider collective effects, such as the mean field and its quantum corrections 15,17,18,19, which will modify the multiple scattering properties. Second, a magneto-optical trap of $10^{10}$ Rubidium atoms at a few 100 µK can be obtained in a volume of a few mm³. An optical thickness of several 10 has been obtained in several experiments, in particular those which aim at reaching Bose-Einstein condensation of cold atoms. These samples have the advantage of a negligible Doppler effect. If the cooling and trapping light is present, however, the multiple scattering leads to a repulsion between the atoms, inducing a correlation between the positions of the scatterers. Third, a Bose-Einstein condensate of cold atoms 20,21 has considerably higher densities of atoms, of the order of a few $10^{14}$ atoms/cc. In this case, however, the sample does not have the same local fluctuations in the dielectric constant, as all atoms are in the same quantum state. This is an extreme case of dependent scattering and transparency, and superradiance effects have recently been reported in this situation 22. Optical excitation of such condensates is the subject of many recent theoretical investigations 21.
2.3 Dwell time
Another aspect of scattering by an atom with an internal resonance is that there is a frequency dependent phase shift of the scattered light. This effect can also be expressed in the time domain as a retardation effect. Different formulations of such retardation times are used in the multiple scattering community, such as the Wigner time or the delay time 15,23,24,25,26. We will use a simple delay time interpretation by defining the dwell time as:
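$$\tau_D = \frac{\partial\varphi(\omega)}{\partial\omega}$$
For a slab of glass with an index of refraction of $n = 1.5$, for example, the light transmitted through a thickness $L$ is phase shifted by:
$$\exp(i\varphi(\omega)) = \exp\left(i\omega\,\frac{nL}{c}\right)$$
In this simple case the dwell time is:
$$(\tau_D)_{glass} = \frac{\partial\varphi}{\partial\omega} = \frac{nL}{c}$$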
and is $n$ times longer than through free space. In a simple approach this would be the travel time for a light pulse through the sample. Applying this idea to a delay time for light scattering by a damped dipole, we get with $\alpha = |\alpha|\exp(i\varphi_\alpha(\omega))$:
$$\tan\varphi_\alpha = \frac{\alpha''}{\alpha'} = -\frac{\Gamma}{2\delta}$$
a dwell time of:
$$(\tau_D)_{dip} = \frac{2}{\Gamma}\,\frac{\Gamma^2/4}{\delta^2 + \Gamma^2/4}$$
For Rubidium atoms e.g., with $\Gamma^{-1} = 25\,ns$, one has an on-resonance ($\delta = 0$) delay time of
$$(\tau_D)_{res} = \frac{2}{\Gamma} = 50\,ns$$
which would correspond to a 15 m travel distance in free space! When using the radiative transfer equation in multiple scattering, various velocities have to be defined, such as the group velocity $v_g$ and the transport velocity $v_E$, which can strongly depend on the dwell time 15,27. Atomic samples with large dwell times thus seem to be a good testing ground for studying the influence of the dwell time in multiple scattering experiments.
2.4 Coherent backscattering of light
When discussing sample parameters for multiple scattering experiments, additional aspects have to be studied. One can for example look for a difference between the mean free path $l_{MFP}$ (mean distance between two scattering events) and the transport mean free path $l^*$ (mean distance needed to lose the initial direction of propagation):
$$l^* = \frac{l_{MFP}}{1 - \langle\cos\theta\rangle}$$
where $\theta$ is the angle between the incident and the scattered light. Even though the dipole radiation pattern is not isotropic, one has $\langle\cos\theta\rangle = 0$ and hence $l^* = l_{MFP}$.
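That the anisotropic dipole pattern still gives $\langle\cos\theta\rangle = 0$ follows from its forward-backward symmetry, and can be verified by a direct integration over the sphere. The sketch below is an illustration with assumed conventions that are not spelled out in the text: the incident light propagates along $Oz$ and is polarized along $Ox$, and the far field intensity radiated in the direction $\vec{n}$ is taken proportional to $1 - (\vec{n}\cdot\vec{e}_x)^2$. The function names are hypothetical.

```python
import math


def mean_cos_theta(n_theta=200, n_phi=200):
    """Intensity-weighted average of cos(theta) for a dipole along x."""
    numerator = 0.0
    denominator = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * math.pi / n_theta          # scattering angle from Oz
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            n_x = math.sin(theta) * math.cos(phi)      # component of n along the dipole
            weight = (1.0 - n_x**2) * math.sin(theta)  # pattern times solid angle element
            numerator += math.cos(theta) * weight
            denominator += weight
    return numerator / denominator


def transport_mean_free_path(l_mfp, g):
    """l* = l_MFP / (1 - <cos theta>)."""
    return l_mfp / (1.0 - g)


if __name__ == "__main__":
    g = mean_cos_theta()
    print("<cos theta> =", g)
    print("l* / l_MFP  =", transport_mean_free_path(1.0, g))
```

For scatterers larger than the wavelength (Mie regime) the pattern is peaked in the forward direction, $\langle\cos\theta\rangle$ approaches 1, and $l^*$ becomes much longer than $l_{MFP}$.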
One particular effect in multiple scattering has been the subject of de tailed studies in recent years: coherent backscattering of waves by a random medium. Scattering of wave by a static random medium of scatterers gives rise to a so-called speckle pattern 28 . Such speckle pattern are observed whether the medium is optically thin, with single scattering being dominant, or opti cally thick in the multiple scattering regime. When averaging over different
Cold Atoms and Multiple Scattering
115
realizations of the scatterer distribution, the scattered intensity will have a smooth angular dependance. The main argument in this explanation is that the detected field is the coherent sum of electric fields with random phases
j
The average detected intensity will then be:
> =
5Z^jexp(iVj)
A first approach will be to suppose the interference terms to average to zero and thus obtain
;
(15)
i]
<> = (x )
This argument, however, neglects the particular case of backscattering. Let us group two by two all scattering paths contributing to the detected field, by taking for each path its reverse path (figure 5). Assuming that the dephasing for the forward and the reverse path are identical [29], the phase difference of the two paths will be
$$\Delta\varphi = (\vec k_{inc} + \vec k_{out}) \cdot (\vec r_1 - \vec r_N)$$
One can thus see that if the relative position of the scatterers changes randomly, the phase difference is generally also a random parameter and the interference terms in eq. (2.4) cancel. However, for the particular case of backscattering
$$\vec k_{inc} + \vec k_{out} = 0$$
the two reverse paths have exactly the same phase shift regardless of the position of the scatterers. Always having such a constructive interference will give rise to an enhanced backscattering peak when averaging over the sample distribution (figure 6).
Remark: Note that for a static sample one does not always have a maximum intensity in the backward direction. Indeed, even though paths interfere constructively two by two in this direction, the various multiple scattering paths (1 - 2 - ... - N and 1' - 2' - ... - N' for example) do not have a fixed phase relation between them. □
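The pairing argument can be illustrated with a small Monte Carlo sketch. It rests on simplifying assumptions that are not in the text: all paths have unit amplitude, the phases are uniformly distributed, and "away from backscattering" is modelled by giving each reverse path an independent random extra phase.

```python
import numpy as np

def mean_intensity(n_paths, n_realizations, backscattering, seed=0):
    """Average detected intensity for paired direct/reverse scattering paths."""
    rng = np.random.default_rng(seed)
    # Random phase accumulated along each direct path (one row per realization)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=(n_realizations, n_paths))
    if backscattering:
        # k_inc + k_out = 0: the reverse path has exactly the same phase
        dphi = np.zeros_like(phi)
    else:
        # Otherwise the phase difference between reverse paths is random
        dphi = rng.uniform(0.0, 2.0 * np.pi, size=phi.shape)
    field = np.sum(np.exp(1j * phi) + np.exp(1j * (phi + dphi)), axis=1)
    return float(np.mean(np.abs(field) ** 2))

n_paths = 20
i_back = mean_intensity(n_paths, 20000, backscattering=True)
i_off = mean_intensity(n_paths, 20000, backscattering=False, seed=1)
enhancement = i_back / i_off
print("enhancement factor in the backward direction:", enhancement)
```

In this idealized model the averaged backscattered intensity is about twice the incoherent background, while a single realization (one row of `phi`) remains a speckle with no guaranteed maximum in the backward direction, in line with the remark above.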
Figure 5. Various contributions to backscattering
This enhanced backscattering relies on the reciprocity of the reverse paths of scattering and on the constructive interference between these two paths. The total width at half maximum of the coherent backscattering cone is given for a half infinite medium by the transport mean free path [30,31,32,33]:
$$\Delta\theta \simeq \frac{0.7}{k\,l^*}$$
where $k$ is the wavevector in the scattering medium. This result usually holds for $kl^* \gg 1$. When $kl^*$ becomes of the order of unity, the so-called Joffe-Regel criterion for strong localization is reached [15,34]. The coherent backscattering cone is a signature of interference effects in multiple scattering. It has been observed with many classical scatterers, but only recently in atomic samples [35] (see figure 7).
Albedo
Multiple scattering effects have been studied in atom physics [36,37,17], and radiation trapping and superradiance have been observed. One important aspect in multiple scattering of light by atoms is the frequency spectrum of the scattered light. The radiation spectrum has a complex shape [38,39,40] and
Figure 6. Speckle pattern and averaged intensity
does not only show an elastic component at the drive frequency. In the case of a two-level atom an inelastic component appears for larger intensities, as the upper state population becomes more and more important, and features such as the Mollow triplet have been observed [41]. These inelastic components have a spectral width of the order of the natural linewidth of the atomic transition, and it seems interesting to investigate what the influence of these components will be on multiple scattering properties such as coherent backscattering. For a two-level atom the total scattering rate is given by [1]:
$$\Gamma' = \frac{\Gamma}{2}\,\frac{s}{1+s}$$
where $\Gamma$ is the width of the excited state and $s$ the saturation parameter:
$$s = \frac{I_{inc}}{I_{sat}}\,\frac{\Gamma^2/4}{\delta^2 + \Gamma^2/4}$$
This total rate can be separated into an elastic component $\Gamma'_{elas}$, having the same frequency spectrum as the incident laser, and an inelastic component
Figure 7. Enhanced backscattering observed from an optically thick laser-cooled sample of Rb atoms
$\Gamma'_{inelas}$ with a broadened spectrum and a triplet structure. One can show that
$$\Gamma'_{elas} = \frac{\Gamma}{2}\,\frac{s}{(1+s)^2} \qquad \Gamma'_{inelas} = \frac{\Gamma}{2}\,\frac{s^2}{(1+s)^2}$$
If we suppose that only the elastically scattered light contributes to a coherent backscattering cone (a hypothesis which is presently under investigation), one could define an equivalent of an albedo for standard scatterers by:
$$a = \frac{\Gamma'_{elas}}{\Gamma'_{elas} + \Gamma'_{inelas}} = \frac{1}{1+s}$$
which decreases as the atomic transition becomes saturated. Several parameters will be of interest in such a study. Take for example an incident light with a broad spectrum (of $\Delta\nu = 6$ MHz e.g.). The coherence length for such a laser is of the order of $\Delta x \approx c/(2\pi\Delta\nu) \approx 8$ m, which will have to be compared to the length and time scales of the problem. Another aspect will be the time dependence of the scattered light. In time correlation experiments [42,43,44], for example, this spectral width has to be compared to the detection bandwidth, which varies by orders of magnitude between standard CCD cameras and photomultipliers.
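The rate formulas of this section are easy to tabulate. The following sketch simply evaluates them; the function names are illustrative, and the sample values of the saturation parameter are arbitrary.

```python
import math

C = 299792458.0  # speed of light in m/s

def saturation_parameter(i_over_isat, delta, gamma):
    """s = (I_inc / I_sat) * (Gamma^2/4) / (delta^2 + Gamma^2/4)."""
    return i_over_isat * (gamma ** 2 / 4.0) / (delta ** 2 + gamma ** 2 / 4.0)

def scattering_rates(s, gamma):
    """Return (total, elastic, inelastic) scattering rates of a two-level atom."""
    total = 0.5 * gamma * s / (1.0 + s)
    elastic = 0.5 * gamma * s / (1.0 + s) ** 2
    inelastic = 0.5 * gamma * s ** 2 / (1.0 + s) ** 2
    return total, elastic, inelastic

def albedo(s):
    """Fraction of elastically scattered light, a = 1 / (1 + s)."""
    return 1.0 / (1.0 + s)

def coherence_length(delta_nu):
    """Delta x ~ c / (2 pi Delta nu) for a source of spectral width Delta nu (Hz)."""
    return C / (2.0 * math.pi * delta_nu)

gamma = 1.0  # rates expressed in units of Gamma
for s in (0.01, 0.1, 1.0, 10.0):
    total, elastic, inelastic = scattering_rates(s, gamma)
    print(f"s = {s:5.2f}  total = {total:.4f}  elastic = {elastic:.4f}  albedo = {albedo(s):.3f}")
print("coherence length for 6 MHz:", coherence_length(6.0e6), "m")
```

The elastic and inelastic parts add up to the total rate for every $s$, and the albedo falls from 1 towards 0 as the transition saturates.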
2.5 Strong localization of light in atoms?
One "holy grail" in the multiple scattering community is the observation of strong localization of light. First results were reported in December 1997 with semiconductor powders [45]. Indeed it is expected that only a high contrast in the index of refraction (> 2.5) would yield strong localization. One problem in reaching this regime can be understood with the following arguments. Consider point scatterers. Let us take the Joffe-Regel criterion for strong localization: $k\,l_{MFP} \lesssim 1$. One way to realize this would be to use large wavelength radiation. In this limit the cross section of Rayleigh scattering scales as
$$\sigma \propto \omega^4 \to 0 \quad (\omega \to 0) \qquad \text{or} \qquad \sigma \sim \pi R^2\,(kR)^4$$
where $R$ is the size of the scatterer. With the mean free path scaling as $l_{MFP} \sim 1/(n\sigma) \propto \omega^{-4}$, one obtains for the Joffe-Regel parameter in the low frequency limit ($\omega \to 0$):
$$k\,l_{MFP} \propto \omega^{-3} \gg 1$$
However, if one uses resonant scattering, the cross section can be of the order of $\sigma_{res} \sim \lambda^2$ and strong localization might be expected for $n\lambda^3 \sim 1$. In this case, however, collective radiation effects might become too important to be neglected and could yield a larger mean free path than for independent scatterers. Today it is possible to obtain samples with $n\lambda^3 > 1$ in Bose-Einstein condensates of cold atoms, but localization effects in such samples have not yet been observed. One consequence of atoms being closer than one wavelength is the correlation arising from recurrent scattering, which can also be interpreted as dipole-dipole (Van der Waals) effects. For dilute samples, such that $k\,l_{MFP} \gg 1$, the correlation can lead to an increased "effective" mean free path (transport mean free path), as the radiation pattern can be affected to give enhanced forward scattering (superradiance e.g.). In dense media, the polarizability is modified
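due to the local field effect (Lorentz-Lorenz formula [46,47]) and we have
$$\chi = \frac{n\alpha}{1 - \frac{4\pi}{3}\,n\alpha} \qquad \text{or} \qquad \frac{\varepsilon - 1}{\varepsilon + 2} = \frac{4\pi}{3}\,n\alpha$$
with $\varepsilon = 1 + 4\pi\chi$. Using
$$\alpha = \frac{\alpha_0}{2}\,\frac{\omega_L}{\delta - i\Gamma/2} = \frac{\alpha_0\,\omega_L}{2}\,\frac{\delta + i\Gamma/2}{\delta^2 + \Gamma^2/4} = \alpha' + i\alpha''$$
one has
$$\varepsilon = 1 + 4\pi n\,\frac{\alpha}{1 - \frac{4\pi}{3}\,n\alpha} = 1 + 4\pi n\,\frac{\alpha_0}{2}\,\frac{\omega_L}{\delta - \frac{2\pi}{3}\,n\alpha_0\omega_L - i\Gamma/2}$$
which can be seen as a Lorentz-Lorenz shift $\Delta\omega_{LL}$ of
$$\Delta\omega_{LL} = -\frac{2\pi}{3}\,n\,\alpha_0\,\omega_L$$
or, using $\Gamma/\omega_L = \frac{2}{3}\,\alpha_0 k^3$, i.e. $\alpha_0\,\omega_L = \frac{3\Gamma}{2k^3}$, one has
$$\Delta\omega_{LL} = -\frac{\Gamma}{2}\,\frac{n\lambda^3}{4\pi^2} \qquad (16)$$
Using the relation
$$\Gamma = \frac{1}{3\pi\varepsilon_0}\,\frac{d^2\,\omega^3}{\hbar c^3}$$
with $d$ the atomic dipole moment, one can rewrite this shift as
$$\hbar\,\Delta\omega_{LL} = -\frac{n\,d^2}{3\varepsilon_0}$$
This red-shift of the resonance is thus expected to be small for dilute samples, whereas for BEC samples e.g. one could expect shifts of several times the natural linewidth of the transition. One can also write this result as:
$$\varepsilon = 1 + 4\pi n\,\frac{\alpha_0\,\omega_L}{2}\,\frac{\delta + \Delta\omega_{LL} + i\Gamma/2}{(\delta + \Delta\omega_{LL})^2 + \Gamma^2/4}$$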
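To get a feeling for the size of the shift, the two expressions above can be evaluated numerically. This is a rough sketch: the wavelength (780 nm) and linewidth ($2\pi \times 6$ MHz) are assumed values typical of a Rb sample, and the densities are arbitrary illustrative numbers, none of them taken from the text.

```python
import math

HBAR = 1.054571817e-34   # J s
EPS0 = 8.8541878128e-12  # F/m
C = 299792458.0          # m/s

def shift_over_gamma(density, wavelength):
    """|Delta omega_LL| / Gamma = n lambda^3 / (8 pi^2), from eq. (16)."""
    return density * wavelength ** 3 / (8.0 * math.pi ** 2)

def shift_from_dipole(density, wavelength, gamma):
    """|Delta omega_LL| = n d^2 / (3 eps0 hbar), with d^2 deduced from Gamma."""
    omega = 2.0 * math.pi * C / wavelength
    d_squared = 3.0 * math.pi * EPS0 * HBAR * C ** 3 * gamma / omega ** 3
    return density * d_squared / (3.0 * EPS0 * HBAR)

wavelength = 780e-9               # assumed optical wavelength (m)
gamma = 2.0 * math.pi * 6.0e6     # assumed natural linewidth (rad/s)
for density in (1e17, 1e20, 1e21):  # atoms per m^3, illustrative values
    ratio = shift_over_gamma(density, wavelength)
    check = shift_from_dipole(density, wavelength, gamma) / gamma
    print(f"n = {density:.0e} m^-3  |shift|/Gamma = {ratio:.3e}  (dipole form: {check:.3e})")
```

Both forms give the same number, which confirms that eq. (16) and the expression in terms of the dipole moment are equivalent once $\Gamma$ is expressed through $d$.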
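At resonance, for $\delta = 0$, one thus has
$$\varepsilon = 1 + 4\pi n\,\frac{\alpha_0\,\omega_L}{2}\,\frac{\Delta\omega_{LL} + i\Gamma/2}{(\Delta\omega_{LL})^2 + \Gamma^2/4}$$
which, for the absorptive part, is
$$4\pi n\,\frac{\alpha_0\,\omega_L}{2}\,\frac{\Gamma/2}{(\Delta\omega_{LL})^2 + \Gamma^2/4}$$
and hence $\left(\frac{\Delta\omega_{LL}}{\Gamma/2}\right)^2 + 1$ times smaller than without local field effects. This reduction factor
$$\frac{1}{\left(\frac{\Delta\omega_{LL}}{\Gamma/2}\right)^2 + 1} = \frac{1}{\left(\frac{n\lambda^3}{4\pi^2}\right)^2 + 1} \qquad (17)$$
is equivalent to a reduced resonant cross section:
$$\sigma_{res} = \frac{3\lambda^2}{2\pi}\,\frac{1}{\left(\frac{n\lambda^3}{4\pi^2}\right)^2 + 1}$$
The "on-resonant" mean free path thus is increased:
$$l_{MFP} = \frac{1}{n\,\sigma_{res}} = \left[\left(\frac{n\lambda^3}{4\pi^2}\right)^2 + 1\right]\frac{2\pi}{3n\lambda^2}$$
and one obtains the Joffe-Regel parameter (figure 8)
$$k\,l_{MFP} = \frac{1}{3}\,\frac{\left(\frac{n\lambda^3}{4\pi^2}\right)^2 + 1}{\frac{n\lambda^3}{4\pi^2}}$$
which is minimal for $\frac{n\lambda^3}{4\pi^2} = 1$ and is then
$$k\,l_{MFP} = \frac{2}{3} < 1$$
This model seems to predict that strong localization of light in dense cold atomic vapors can only be obtained for on-resonant excitation in a narrow window of density. One has, however, to investigate more precisely how the fluctuations of such a "mean field effect" will modify not only the position of the resonance but also its width. The bare shift of the resonance could be taken care of by taking e.g.:
$$\delta + \Delta\omega_{LL} = 0$$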
Figure 8. Joffe-Regel parameter $k\,l_{MFP}$ as a function of density $n\lambda^3/4\pi^2$
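One then has:
$$l_{MFP} = \frac{2\pi}{3n\lambda^2}\left[\left(\frac{\delta}{\Gamma/2} - \frac{n\lambda^3}{4\pi^2}\right)^2 + 1\right]$$
and hence the detuning dependent Joffe-Regel parameter (figure 9):
$$k\,l_{MFP} = \frac{1}{3}\,\frac{\left(\frac{\delta}{\Gamma/2} - \frac{n\lambda^3}{4\pi^2}\right)^2 + 1}{\frac{n\lambda^3}{4\pi^2}}$$
This should compensate for the Lorentz-Lorenz shift and it should then be possible to keep the same cross section as without local field effects. In that case the Joffe-Regel criterion seems to be fulfilled when
$$\left[k\,l_{MFP}\right]_{min} = \frac{1}{3}\,\frac{4\pi^2}{n\lambda^3} < 1$$
i.e. when
$$n\lambda^3 > \frac{4\pi^2}{3} \qquad (18)$$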
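The behaviour of the Joffe-Regel parameter with density and detuning can be explored with a few lines of code. The sketch below uses the reduced variables $x = n\lambda^3/4\pi^2$ and $\delta/(\Gamma/2)$; the function names and the scan range are choices made here for illustration only.

```python
import math

def joffe_regel_parameter(x, detuning=0.0):
    """k * l_MFP as a function of x = n lambda^3 / (4 pi^2) and detuning = delta / (Gamma/2)."""
    return ((detuning - x) ** 2 + 1.0) / (3.0 * x)

def minimise_on_resonance(x_min=0.05, x_max=5.0, step=0.001):
    """Brute-force scan of k*l_MFP at zero detuning; returns (x, k*l_MFP) at the minimum."""
    n_points = int(round((x_max - x_min) / step)) + 1
    grid = [x_min + i * step for i in range(n_points)]
    best = min(grid, key=joffe_regel_parameter)
    return best, joffe_regel_parameter(best)

x_best, kl_best = minimise_on_resonance()
print("on resonance: minimum k*l =", kl_best, "at n*lambda^3/(4 pi^2) =", x_best)

# With the detuning chosen to cancel the Lorentz-Lorenz shift, k*l = 1/(3x),
# so the criterion k*l < 1 requires n*lambda^3 > 4 pi^2 / 3, as in eq. (18).
threshold = 4.0 * math.pi ** 2 / 3.0
print("threshold n*lambda^3 with compensated shift:", threshold)
```

On resonance the scan recovers the narrow density window around $x = 1$, whereas with the shift compensated the parameter decreases monotonically with density.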
Figure 9. Joffe-Regel parameter $k\,l_{MFP}$ as a function of detuning $\delta/\Gamma$
Note that in this result only the optical wavelength is relevant (as opposed to the threshold for Bose-Einstein condensation, where one requires $n\lambda_{dB}^3 > 2.612$ and where the de Broglie wavelength is the important parameter). The Lorentz-Lorenz correction has been described in more detail for optically thin media by Friedberg et al. [46]. The precise shift depends on the geometrical configuration, which gives different numerical factors. It is interesting to notice that even the anti-resonant terms (usually neglected in the rotating wave approximation) contribute significantly to the shift. Furthermore, the collisional shift of the resonance cannot be neglected, as it also varies linearly with density (as is well known in atomic clocks):
$$\Delta\omega_{coll} \propto n$$
This effect is difficult to distinguish from the Lorentz-Lorenz correction, as both scale as the density of the atoms. However, in a nonlinear experiment with hot atoms it has been shown that the Lorentz-Lorenz correction has to be taken into account for a precise evaluation of the shift of the atomic resonance in dense media. Furthermore, the Doppler effect results in an inhomogeneous broadening of the line, and the "effective" average cross section for resonant
detuning will be reduced. As the temperature of the atoms does not appear explicitly in eq. (18), it is important to note that, in addition to the shift of the resonance, collisional broadening of the line, which also scales as the density of the atomic gas, cannot be neglected.
3 Conclusion
In this paper we have described the basic cooling mechanism of atoms by laser light using an (almost) completely classical description. This approach can be very useful to estimate the cooling (in-)efficiency in other situations (dielectric spheres with internal resonances e.g.). In the second part we have presented a well-known feature of interference effects in multiple scattering, which we have applied to atoms as scatterers. Extending basic criteria to high density, we have addressed the question of how the Joffe-Regel criterion is modified and where one can expect strong localization of light in dense atomic samples. This very interesting regime, however, needs more thorough investigation. We think that multiple scattering in dense cold atomic media is a very promising topic which is now accessible with e.g. Bose-Einstein condensates.
References
1. C. Cohen-Tannoudji, J. Dupont-Roc, G. Grynberg, Processus d'interaction entre photons et atomes, InterEditions, Paris, 1988.
2. G. Grynberg, A. Aspect, C. Fabre, Introduction aux lasers et à l'optique quantique, Ellipses, 1997.
3. C.E. Mungan, M.I. Buchwald, B.C. Edwards, R.I. Epstein, T.R. Gosnell, Phys. Rev. Lett. 78, 1030 (1997).
4. C. Cohen-Tannoudji, Collège de France lectures 1982.
5. T.W. Hänsch, A. Schawlow, Opt. Comm. 13, 68 (1975).
6. D. Wineland, H. Dehmelt, Bull. Am. Phys. Soc. 20, 637 (1975).
7. W.D. Phillips, Rev. Mod. Phys. 70, 721 (1998).
8. S. Chu, Rev. Mod. Phys. 70, 685 (1998).
9. C. Cohen-Tannoudji, Rev. Mod. Phys. 70, 707 (1998).
10. J. Dalibard, C. Cohen-Tannoudji, J.O.S.A. B 6, 2023 (1989).
11. A. Aspect, E. Arimondo, R. Kaiser, N. Vansteenkiste, C. Cohen-Tannoudji, Rev. Mod. Phys. 61, 826-829 (1988).
12. V.B. Berestetskii, E.M. Lifshitz, L.P. Pitaevskii, Quantum Electrodynamics, 2nd edition 1982 (Course of Theoretical Physics vol. 4), (Butterworth-Heinemann ed., Reed Elsevier, ISBN 0 7506 3371 9) p. 246.
13. H. Mabuchi, H.J. Kimble, Opt. Lett. 19, 749 (1994).
14. J.C. Knight, N. Dubreuil, V. Sandoghdar, J. Hare, V. Lefèvre-Seguin, J.M. Raimond, S. Haroche, Opt. Lett. 20, 1515 (1995).
15. A. Lagendijk, B. van Tiggelen, Phys. Rep. 270, 143 (1996).
16. G.L. Lippi, G.P. Barozzi, S. Barbay, J.R. Tredicce, Phys. Rev. Lett. 76, 2452 (1996).
17. L. Mandel, E. Wolf, Optical Coherence and Quantum Optics, Cambridge University Press (1995).
18. A.E. Siegman, Lasers, University Science Books (1986).
19. M. Fleischhauer, S. Yelin, Phys. Rev. A 59, 2427 (1999).
20. M.H. Anderson, J.R. Ensher, M.R. Matthews, C.E. Wieman, E.A. Cornell, Science 269, 198 (1995).
21. http://amo.phy.gasou.edu:80/bec.html/
22. S. Inouye, A.P. Chikkatur, D.M. Stamper-Kurn, J. Stenger, D.E. Pritchard, W. Ketterle, Science 285, 571 (1999).
23. A. Steinberg, P.G. Kwiat, R. Chiao, Phys. Rev. Lett. 71, 708 (1993).
24. H.M. Brodowsky, W. Heitmann, G. Nimtz, Phys. Rev. A 222, 125 (1996).
25. E.H. Hauge, J.A. Støvneng, Rev. Mod. Phys. 61, 917 (1984).
26. V.S. Olkhovsky, E. Recami, Phys. Rep. 214, 339 (1992).
27. New Aspects of Electromagnetic and Acoustic Wave Diffusion, Springer Tracts in Modern Physics, vol. 144, POAN Research group (ed.), 1998.
28. Laser Speckle and Related Phenomena, Topics in Applied Physics, vol. 9, J.C. Dainty (ed.), Springer-Verlag (1975).
29. This assumption will break down in the case of a Faraday effect or in the case of moving scatterers, leading to a decrease of the coherent backscattering cone.
30. Y. Kuga, A. Ishimaru, J. Opt. Soc. Am. A 1, 831 (1984).
31. P.E. Wolf, G. Maret, Phys. Rev. Lett. 55, 2696 (1985).
32. M.P. van Albada, A. Lagendijk, Phys. Rev. Lett. 55, 2692 (1985).
33. D. Wiersma, M. van Albada, B. van Tiggelen, A. Lagendijk, Phys. Rev. Lett. 74, 4193 (1995).
34. A.F. Joffe, A.R. Regel, Progress in Semiconductors 4, 237 (1960).
35. G. Labeyrie, F. de Tomasi, J.-C. Bernard, C.A. Müller, C. Miniatura, R. Kaiser, Phys. Rev. Lett. 83, 5266 (1999).
36. R.H. Dicke, Phys. Rev. 93, 99 (1954).
37. M. Gross, S. Haroche, Phys. Rep. 93, 301 (1982).
38. D. Polder, M.F.H. Schuurmans, Phys. Rev. A 14, 1468 (1976).
39. C. Cohen-Tannoudji, S. Reynaud, J. Phys. 38, L173 (1977).
40. C. Cohen-Tannoudji, S. Reynaud, J. Phys. B 10, 365 (1977).
41. B.R. Mollow, Phys. Rev. 188, 1969 (1969); B.R. Mollow, Progress in Optics, vol. XIX, 1, (E. Wolf ed.) North Holland (1981).
42. G. Birkl, M. Gatzke, I.H. Deutsch, S.L. Rolston, W.D. Phillips, Phys. Rev. Lett. 75, 2823 (1995).
43. M. Weidemüller, A. Hemmerich, A. Görlitz, T. Esslinger, T.W. Hänsch, Phys. Rev. Lett. 75, 4583 (1995).
44. C. Jurczak, K. Sengstock, R. Kaiser, N. Vansteenkiste, C.I. Westbrook, A. Aspect, Opt. Commun. 115, 480 (1995).
45. D. Wiersma, P. Bartolini, A. Lagendijk, R. Righini, Nature 390, 671 (1997).
46. R. Friedberg, S.R. Hartmann, J.T. Manassah, Phys. Rep. 7, 101 (1973).
47. Y. Castin, K. Mølmer, Phys. Rev. A 51, R3426 (1995).
AN INTRODUCTION TO ZAKHAROV THEORY OF WEAK TURBULENCE
MICHEL LE BELLAC
Institut Non Linéaire de Nice
These lectures give a (hopefully!) pedagogical introduction to the theory of nonlinear interactions of waves and weak turbulence. After having derived the perturbative expansion of the Hamiltonian of surface waves, we write down the kinetic equations for the number density of waves. Then stationary solutions to these kinetic equations allow us to find the weak turbulence spectrum.
1 Introduction
These lectures address the problems of nonlinear interaction of waves and of weak turbulence. Actually, we shall deal mostly with the problem of surface waves, but many of the concepts which are introduced in these lectures also apply to other kinds of wave: sound waves, spin waves, Langmuir waves in plasmas, Rossby waves in the atmosphere etc. In introductory textbooks, waves are usually treated in the linear (or, equivalently, harmonic) approximation: the simplest example is probably wave propagation in a harmonic solid. An elementary one-dimensional model is as follows: one considers a set of $N$ identical masses $m$ vibrating on a line, each mass being linked to its two neighbours by identical springs (Fig. 1). The equilibrium position of the $i$th mass is denoted by $x_i$, its displacement from equilibrium by $q_i$ and its momentum by $p_i$; the Hamiltonian reads
$$H = \sum_i \frac{p_i^2}{2m} + \frac{K}{2}\sum_i (q_{i+1} - q_i)^2 \qquad (1.1)$$
where $K$ is the spring constant. This Hamiltonian is diagonalised by going to normal modes
$$Q_k = \frac{1}{\sqrt N}\sum_j q_j\,e^{-ikx_j} \qquad P_k = \frac{1}{\sqrt N}\sum_j p_j\,e^{-ikx_j} \qquad (1.2)$$
and one finds
$$H = \sum_k\left[\frac{P_k P_{-k}}{2m} + \frac12\,m\,\omega_k^2\,Q_k Q_{-k}\right] \qquad (1.3)$$
Figure 1. A chain of springs
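Here $\omega_k$ is the dispersion law
$$\omega_k = 2\sqrt{\frac{K}{m}}\,\left|\sin\frac{ka}{2}\right| \qquad (1.4)$$
where $a$ is the lattice spacing. When $ka \ll 1$, the dispersion law becomes linear:
$$\omega_k \simeq a\sqrt{\frac{K}{m}}\,|k| \qquad (1.5)$$
It is convenient to introduce the complex variables $a_k$ and $a_k^*$
$$Q_k = \frac{1}{\sqrt{2m\omega_k}}\,(a_k + a_{-k}^*) \qquad P_k = -i\sqrt{\frac{m\omega_k}{2}}\,(a_k - a_{-k}^*) \qquad (1.6)$$
These variables have the following Poisson brackets
$$\{a_k, a_{k'}^*\} = -i\,\delta_{k,k'} \qquad (1.7)$$
Then $H$ is a sum of harmonic oscillator variables
$$H = \sum_k \omega_k\,a_k^* a_k \qquad (1.8)$$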
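The dispersion law (1.4) can be checked directly by diagonalising the equations of motion of the chain. The sketch below assumes periodic boundary conditions and uses arbitrary illustrative values for $N$, $K$ and $m$.

```python
import numpy as np

def chain_frequencies(n_masses, spring_k, mass):
    """Normal-mode frequencies of a periodic chain from its dynamical matrix."""
    # Equation of motion: m q_i'' = K (q_{i+1} - 2 q_i + q_{i-1})
    d = np.zeros((n_masses, n_masses))
    for i in range(n_masses):
        d[i, i] += 2.0 * spring_k / mass
        d[i, (i + 1) % n_masses] -= spring_k / mass
        d[i, (i - 1) % n_masses] -= spring_k / mass
    eigenvalues = np.linalg.eigvalsh(d)
    # Clip tiny negative round-off before taking the square root
    return np.sqrt(np.clip(eigenvalues, 0.0, None))

def dispersion_law(n_masses, spring_k, mass, spacing=1.0):
    """omega_k = 2 sqrt(K/m) |sin(k a / 2)| on the allowed wave vectors."""
    k = 2.0 * np.pi * np.arange(n_masses) / (n_masses * spacing)
    return 2.0 * np.sqrt(spring_k / mass) * np.abs(np.sin(k * spacing / 2.0))

numerical = np.sort(chain_frequencies(16, spring_k=3.0, mass=2.0))
analytical = np.sort(dispersion_law(16, spring_k=3.0, mass=2.0))
print(np.max(np.abs(numerical - analytical)))
```

The two sorted lists of frequencies agree to round-off, including the zero mode at $k = 0$ which corresponds to a uniform translation of the chain.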
In solid state physics, vibrations are quantised, and instead of the complex numbers $a_k$ and $a_k^*$, one introduces Hermitian conjugate operators $a_k$ and $a_k^\dagger$ in (1.6); from the commutation relations of $q_j$ and $p_j$ one deduces the commutation relations
$$[a_k, a_{k'}^\dagger] = \hbar\,\delta_{k,k'} \qquad (1.9)$$
and the Hamiltonian reads
$$H = \sum_k \omega_k\left(a_k^\dagger a_k + \frac{\hbar}{2}\right) \qquad (1.10)$$
The factor of $\hbar/2$ in (1.10) gives the "zero point energy", namely the energy of the ground state. Energies are usually measured from the ground state, so that this factor is often neglected, although it may lead to interesting effects in quantum field theory [1]. The Hamiltonians (1.8) or (1.10) have the fundamental property that they conserve the "number of waves" with wave-vector $k$. This property is easier to understand in quantum mechanics, where $H$ obviously commutes with the operator $N_k = a_k^\dagger a_k$, which counts the number of quanta in mode $k$: $[H, N_k] = 0$. In the harmonic approximation, there is thus no possible energy transfer between different modes. Energy transfer between modes can have two origins:
(i) Interaction with the environment: this is assumed, for example, when one wishes to justify the thermal distribution of modes for non-interacting waves (or quanta): the best example is that of blackbody radiation, since photons do not interact between themselves apart from higher order effects in quantum electrodynamics [2].
(ii) Nonlinear interactions between modes: this means that one must add to $H$ terms which are cubic, quartic or of higher order in the operators $a_k$ and $a_k^\dagger$.
In what follows, we shall be concerned only with (ii). In particular, in sections 2 and 3, we shall compute the cubic and quartic terms in the case of water waves. In classical mechanics, the Hamiltonian is a function of $a_k$ and $a_k^*$, and the equations of motion are
$$i\,\partial_t a_k = \frac{\partial H}{\partial a_k^*} \qquad (1.11)$$
while in quantum mechanics $H$ is a function of the operators $a_k$ and $a_k^\dagger$, and the corresponding equations are
$$i\hbar\,\partial_t a_k = [a_k, H] \qquad (1.12)$$
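As a minimal illustration of the classical equation of motion (1.11), one can integrate it for a single mode. The quartic term in the Hamiltonian below is a hypothetical toy nonlinearity chosen for this sketch, not one of the interaction terms derived in these lectures; for this choice the exact solution is a pure rotation of $a$ in the complex plane with an amplitude-dependent frequency.

```python
import cmath

def rhs(a, omega, g):
    """da/dt from i da/dt = dH/da*, with the toy Hamiltonian H = omega |a|^2 + (g/2) |a|^4."""
    return -1j * (omega * a + g * abs(a) ** 2 * a)

def evolve(a0, omega, g, t_final, n_steps):
    """Fourth-order Runge-Kutta integration of the single-mode equation of motion."""
    dt = t_final / n_steps
    a = complex(a0)
    for _ in range(n_steps):
        k1 = rhs(a, omega, g)
        k2 = rhs(a + 0.5 * dt * k1, omega, g)
        k3 = rhs(a + 0.5 * dt * k2, omega, g)
        k4 = rhs(a + dt * k3, omega, g)
        a += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a

a0 = 0.8 + 0.3j
omega, g, t_final = 2.0, 0.5, 3.0
a_numerical = evolve(a0, omega, g, t_final, 3000)
# Exact solution: |a|^2 is conserved and the frequency is shifted to omega + g |a0|^2
a_exact = a0 * cmath.exp(-1j * (omega + g * abs(a0) ** 2) * t_final)
print(abs(a_numerical - a_exact), abs(a_numerical) ** 2 - abs(a0) ** 2)
```

With $g = 0$ this reduces to the harmonic case $a(t) = a(0)\,e^{-i\omega t}$; in both cases $|a|^2$, the classical counterpart of $N_k$, is conserved because a single mode has nothing to exchange energy with.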
The plan of the lectures is as follows: in section 2 we show that the equations of motion for surface waves can be cast into a Hamiltonian form and we derive the explicit expression of the Hamiltonian. Although this Hamiltonian turns out to have a very simple form, it cannot be used as it stands, and one has to learn how to compute perturbatively, using as a small parameter the product $k\eta$ of the wave-vector $k$ by the wave amplitude $\eta$. The Hamiltonian takes the form
$$H = H_0 + H_1 + H_2 + \cdots \qquad (1.13)$$
where $H_0$ is of order $(k\eta)^0$ and corresponds to the harmonic approximation (1.8) or (1.10); $H_1$ is of order $(k\eta)^1$ and is cubic in the variables $a_k$ and $a_k^*$, $H_2$ is of order $(k\eta)^2$ and is quartic in $a_k$ and $a_k^*$, etc.^a The explicit form of $H_0$, $H_1$ and $H_2$ is derived in section 3. Furthermore we explain that cubic terms can be eliminated thanks to a canonical transformation, provided a "non-resonance condition" holds. This will introduce a very important distinction between the "decay case", where cubic terms cannot be eliminated, and the "non-decay case", where this elimination can be performed, so that $H_1$ may be put to zero. In deep water, gravity waves belong to the "non-decay case", while capillary waves belong to the "decay case". Section 4 is devoted to the derivation of kinetic equations for the average value of $N_k$, $\langle N_k\rangle$. The stationary solutions to the kinetic equations will give the spectrum of weak turbulence [3], whose derivation is the ultimate goal of these lectures. Some particular cases will then be discussed. Let us give at once an example: in the case of deep water capillary waves, the wave spectrum $n(k) = \langle N_k\rangle$ is found to be
$$n(k) = 9.85\,P^{1/2}\left(\frac{\rho^3}{\sigma}\right)^{1/4} k^{-17/4} \qquad (1.14)$$
where $P$ is the energy flux, $\rho$ the mass density and $\sigma$ the surface tension. In comparison with the Kolmogorov spectrum K41 (see (5.1)), one gets not only the power of $k$, but also the absolute normalisation, while the constant in K41 is left undetermined. The spectrum in (1.14) is called a weak turbulence spectrum because it is governed by the (weakly) nonlinear interactions. Let us conclude this introduction with some comments on our conventions.
(i) Dimensional analysis: $M$ denotes a mass, $L$ a length and $T$ a time; for example, the dimension of an energy is $ML^2T^{-2}$, $\hbar$ has dimension $ML^2T^{-1}$, etc. $D$ denotes the dimension of space.
(ii) Fourier transforms: we adopt periodic conditions in a box of volume $L^D$ ($L$ should not be confused with the symbol used in dimensional analysis!), so that
$$f(\vec r) = \sum_{\vec k} f(\vec k)\,e^{i\vec k\cdot\vec r} \qquad f(\vec k) = \frac{1}{L^D}\int d^Dr\,f(\vec r)\,e^{-i\vec k\cdot\vec r} \qquad (1.15)$$
Useful identities are
$$\int d^Dr\,e^{i(\vec k - \vec p)\cdot\vec r} = L^D\,\delta_{\vec k,\vec p} \qquad \sum_{\vec k} e^{i\vec k\cdot(\vec r - \vec r\,')} = L^D\,\delta^{(D)}(\vec r - \vec r\,') \qquad (1.16)$$
where $\delta_{\vec k,\vec p}$ is a Kronecker $\delta$ and $\delta^{(D)}(\vec r - \vec r\,')$ a Dirac $\delta$-function.
^a In a moderate gale, waves in the ocean have a typical wavelength of ~ 150 m and an amplitude of ~ 3 m, so that $k\eta \sim 0.1$. We shall see (section 3.3) that the relevant small parameter in this case is $(k\eta)^2 \sim 0.01$.
2 Hamiltonian formalism for water waves
2.1 Fundamental equations
Wave propagation requires restoring forces: in the case of sound waves in a solid, the restoring force is elasticity. In the case of water waves, the restoring forces are gravity and surface tension, corresponding to gravity waves and capillary waves respectively. We shall see that one type of wave is usually dominant: gravity waves are dominant at long wavelengths, capillary waves at short wavelengths. Turning now to the mathematical description of water waves, the horizontal variables will be denoted by $x$ and $y$: $(x,y) = \vec r$, and the vertical one by $z$. The surface of the fluid is denoted by $\eta$ (at equilibrium $\eta = 0$):
$$z = \eta(x,y) \qquad (2.1)$$
and for simplicity we restrict ourselves, for the time being, to gravity waves; surface tension will be added in the final equations: see (2.20). The usual assumptions are that the flow is potential and incompressible
$$\vec v = \vec\nabla\varphi \qquad \vec\nabla\cdot\vec v = \nabla^2\varphi = 0 \qquad (2.2a)$$
Then Bernoulli's equation may be written as
$$\partial_t\varphi + \frac12(\vec\nabla\varphi)^2 + gz + \frac{p - p_0}{\rho} = 0 \qquad (2.3)$$
where $g$ is the acceleration of gravity, $\rho$ the mass density, $p$ the pressure and $p_0$ the atmospheric pressure. Neglecting surface tension, Bernoulli's equation on the surface of the fluid (2.1) becomes
$$\left[\partial_t\varphi + \frac12(\vec\nabla\varphi)^2 + gz\right]_{z=\eta} = 0$$
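As many equations will be written as conditions on the fluid surface (2.1), it is convenient to introduce the following notation
$$\overline{F}(x,y) = F(x,y,z = \eta(x,y)) \qquad (2.4)$$
so that Bernoulli's equation reads
$$\overline{\partial_t\varphi} + \frac12\,\overline{(\vec\nabla\varphi)^2} + g\eta = 0 \qquad (2.2b)$$
The so-called "kinematical condition" is obtained by taking the material derivative of (2.1); using the notation defined in (2.4), it reads
$$\partial_t\eta + \overline{\partial_x\varphi}\,\partial_x\eta + \overline{\partial_y\varphi}\,\partial_y\eta = \overline{\partial_z\varphi} \qquad (2.2c)$$
or, by using velocity components
$$\partial_t\eta + \overline{v}_x\,\partial_x\eta + \overline{v}_y\,\partial_y\eta = \overline{v}_z$$
We denote by $\overline{\varphi}$ the value of the velocity potential on the fluid surface
$$\overline{\varphi}(x,y) = \varphi(x,y,z = \eta(x,y)) \qquad (2.5)$$
One should of course be aware that, for example, $\partial_x\overline{\varphi} \neq \overline{\partial_x\varphi}$. Useful equations are (we use only one horizontal variable, $x$, in order to simplify the writing)
$$\partial_t\overline{\varphi} = \overline{\partial_t\varphi} + \overline{\partial_z\varphi}\,\partial_t\eta = \overline{\partial_t\varphi} + \overline{v}_z\,\partial_t\eta \qquad (2.6)$$
and
$$\partial_x\overline{\varphi} = \overline{\partial_x\varphi} + \overline{\partial_z\varphi}\,\partial_x\eta = \overline{v}_x + \overline{v}_z\,\partial_x\eta \qquad (2.7)$$
Finally there is a fourth equation coming from the boundary condition at the bottom of the tank, which we assume to be flat and defined by $z = -h$, where $h$ is the depth of the fluid
$$v_z\big|_{z=-h} = \partial_z\varphi\big|_{z=-h} = 0 \qquad (2.2d)$$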
(Almost) all the theory of water waves is contained in equations (2.2a-d)! The detailed derivation of these equations may be found in standard textbooks [4], starting from Euler's equations for ideal fluids. We wish to show that these equations follow from a Hamiltonian formalism; the existence of such a formalism is a priori plausible for non-dissipative hydrodynamics, but it is not obvious in practice [5]. Furthermore, as everything can be done with Euler's equations, why bother with a Hamiltonian description? There are several reasons:
(i) The analogy with other types of wave (sound waves, spin waves etc.) is much clearer.
(ii) The general strategy for a perturbative expansion is easier to visualize.
(iii) The usefulness of performing canonical transformations is also clearer.
(iv) Finally the availability of an invariant measure on phase-space (Liouville measure) may be a decisive advantage of the Hamiltonian approach.
2.2 Hamilton's equations of motion
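We wish to show that the Bernoulli equation (2.2b) and the kinematical condition (2.2c) are equivalent to Hamiltonian equations of motion, where $\eta$ and the value $\overline{\varphi}$ of the velocity potential on the fluid surface are canonically conjugate variables
$$\frac{\delta H}{\delta\overline{\varphi}(x)} = \partial_t\eta \qquad \frac{\delta H}{\delta\eta(x)} = -\partial_t\overline{\varphi} \qquad (2.8)$$
where the Hamiltonian $H$ is given by ($\theta$ is the step function)
$$H = \frac12\int dx\,dz\,\theta(\eta - z)\,(\vec\nabla\varphi)^2 + \frac12\,g\int dx\,\eta^2 \qquad (2.9)$$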
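One might find it puzzling that $H$ as defined in (2.9) depends on $\varphi$, which is a function of $x$ and $z$, while $\overline{\varphi}$ is a function of $x$ only. We are going to show that, provided we restrict ourselves to harmonic velocity potentials, $\nabla^2\varphi = 0$, then $H$ can be written in terms of $\eta$ and $\overline{\varphi}$ alone. Purists might object that one should derive the condition $\nabla^2\varphi = 0$ from the Hamiltonian formalism. Varying $\varphi$ at constant $\eta$ and integrating by parts, one finds
$$\theta(\eta - z)(\partial_x\varphi)(\partial_x\delta\varphi) = \partial_x\left[\theta(\eta - z)(\partial_x\varphi)\,\delta\varphi\right] - \delta(\eta - z)(\partial_x\eta)(\partial_x\varphi)\,\delta\varphi - \theta(\eta - z)(\partial_x^2\varphi)\,\delta\varphi \qquad (2.10)$$
and, thanks to the boundary condition (2.2d)
$$\theta(\eta - z)(\partial_z\varphi)(\partial_z\delta\varphi) = \partial_z\left[\theta(\eta - z)(\partial_z\varphi)\,\delta\varphi\right] + \delta(\eta - z)(\partial_z\varphi)\,\delta\varphi - \theta(\eta - z)(\partial_z^2\varphi)\,\delta\varphi \qquad (2.11)$$
The last terms of the previous two equations add up to zero thanks to the harmonicity of $\varphi$; the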
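variation of $H$ at constant $\eta$ is thus
$$\delta H\big|_\eta = \int dx\,\left[-\overline{v}_x\,\partial_x\eta + \overline{v}_z\right]\delta\varphi(x, z = \eta(x)) \qquad (2.12)$$
Following our convention (2.4), $\delta\varphi(x, z = \eta(x)) = \overline{\delta\varphi}(x)$, and schematically
$$\delta H = \frac{\delta H}{\delta\varphi}\bigg|_\eta\,\overline{\delta\varphi} + \frac{\delta H}{\delta\eta}\bigg|_\varphi\,\delta\eta \qquad (2.13)$$
Using the relation (see (2.6) and (2.7))
$$\delta\overline{\varphi} = \overline{\delta\varphi} + \overline{v}_z\,\delta\eta \qquad (2.14)$$
we may rewrite (2.13) as
$$\delta H = \frac{\delta H}{\delta\varphi}\bigg|_\eta\,\delta\overline{\varphi} + \left(\frac{\delta H}{\delta\eta}\bigg|_\varphi - \overline{v}_z\,\frac{\delta H}{\delta\varphi}\bigg|_\eta\right)\delta\eta \qquad (2.15)$$
These equations allow us to identify the partial derivatives which we need
$$\frac{\delta H}{\delta\overline{\varphi}}\bigg|_\eta = \frac{\delta H}{\delta\varphi}\bigg|_\eta \qquad \frac{\delta H}{\delta\eta}\bigg|_{\overline{\varphi}} = \frac{\delta H}{\delta\eta}\bigg|_\varphi - \overline{v}_z\,\frac{\delta H}{\delta\varphi}\bigg|_\eta \qquad (2.16)$$
The first equation in (2.16), together with (2.12) and the kinematical condition (2.2c), gives
$$\frac{\delta H}{\delta\overline{\varphi}}\bigg|_\eta = -\overline{v}_x\,\partial_x\eta + \overline{v}_z = \partial_t\eta \qquad (2.17)$$
while the second equation in (2.16) and Bernoulli's equation (2.2b) lead to
$$\frac{\delta H}{\delta\eta}\bigg|_{\overline{\varphi}} = \frac12\,\overline{(\vec\nabla\varphi)^2} + g\eta - \overline{v}_z\,\partial_t\eta = -\partial_t\overline{\varphi} \qquad (2.18)$$
where we have also made use of (2.6) and of the relation
$$\frac{\delta}{\delta\eta(u)}\,\theta(\eta(x) - z) = \delta(u - x)\,\delta(\eta(x) - z)$$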
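Thus (2.8) is equivalent to (2.2b) and (2.2c). It is also clear that one should be able to rewrite the Hamiltonian (2.9) by using only variables which are taken on the surface of the fluid. In doing this, one relies on the harmonicity of the potential, on the periodic conditions in $x$ and on the boundary condition (2.2d); one could not obtain this simplification with an arbitrary shape of the bottom of the tank! In order to transform (2.9), we integrate by parts once more
$$\partial_x\left[\theta(\eta - z)\,\varphi\,\partial_x\varphi\right] = \theta(\eta - z)\left[(\partial_x\varphi)^2 + \varphi\,\partial_x^2\varphi\right] + \delta(\eta - z)(\partial_x\eta)\,\varphi\,\partial_x\varphi$$
$$\partial_z\left[\theta(\eta - z)\,\varphi\,\partial_z\varphi\right] = \theta(\eta - z)\left[(\partial_z\varphi)^2 + \varphi\,\partial_z^2\varphi\right] - \delta(\eta - z)\,\varphi\,\partial_z\varphi$$
and, taking into account $\nabla^2\varphi = 0$, we arrive at the final form of $H$
$$H = \frac12\int dx\,\overline{\varphi}(x,t)\left[\overline{v}_z - \overline{v}_x\,\partial_x\eta\right] + \frac12\,g\int dx\,\eta^2 \qquad (2.19)$$
However one should not be misled by the apparent simplicity of (2.19)! A naive interpretation of (2.19) would lead to
$$\frac{\delta H}{\delta\overline{\varphi}} = \frac12\left[\overline{v}_z - \overline{v}_x\,\partial_x\eta\right]$$
This result is wrong by a factor of 1/2, because a dependence with respect to $\overline{\varphi}$ is also hidden in $\overline{v}_x$ and $\overline{v}_z$! However it would be extremely difficult to take functional derivatives on (2.19), and it seems necessary to follow the approach described above. Let us finally write the full Hamiltonian with two horizontal variables, and including surface tension
$$H = \frac12\,\rho\int d^2r\,\overline{\varphi}(\vec r,t)\left(\overline{\partial_z\varphi} - \vec\nabla_\perp\eta\cdot\overline{\vec\nabla_\perp\varphi}\right) + \frac12\,g\rho\int d^2r\,\eta^2 + \frac12\,\sigma\int d^2r\,(\vec\nabla_\perp\eta)^2 \qquad (2.20)$$
where $\vec\nabla_\perp = (\partial_x, \partial_y)$ is the horizontal gradient, $\sigma$ the surface tension, and we have reestablished the mass density $\rho$. Actually we have used an approximate form of the potential energy $V_\sigma$ associated with surface tension, which will, however, be sufficient for our purposes. The exact expression reads
$$V_\sigma = \sigma\int d^2r\left[\sqrt{1 + (\vec\nabla_\perp\eta)^2} - 1\right]$$
2.3 The perturbative expansion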
Our goal is now to expand the Hamiltonian in powers of the small parameter $k\eta$, where $k$ is the wave vector. We are going to write Fourier transforms of $\overline{\varphi}$ and $\eta$, where of course $k$ is not limited. We have to assume that there exists an effective ultraviolet cutoff on $k$; the existence of such a cutoff is very natural. Indeed, we are working in the ideal fluid approximation, where dissipation is neglected. As shown for example in [7], the dissipation rate can be estimated to be $2\nu k^2$, where $\nu$ is the kinematic viscosity. Thus dissipation occurs at large wave-vectors, or at small scales, and our approximation of an ideal fluid breaks down at large values of $k$. Let us write the Fourier expansion of the velocity potential
$$\varphi(\vec r, z) = \sum_{\vec k}\varphi(\vec k)\,e^{kz}\,e^{i\vec k\cdot\vec r} \qquad (2.21)$$
The form of
(2.22)
k
We expand (2.22) in powers of kt) = EjE, V(*i) (l + *H?(*0 + • ■ •) ei -dr
V&
" i-r = E*„* 2 (f(ki) +fciv(fci)r?(fc2)ei£2f+ • • -)e«*
(2.23)
where we have used the Fourier expansion of 77
We can now obtain the Fourier components of Tp =£/d2re-'*->(r>
m
We have been working consistently to order kr}, and in the spirit of a perturbative expansion, we may replace tp by Tp in the last term of (2.24) and invert this equation ¥>(*) = {%) - Y
+ °(fcl?)2
kMM&Wh)
(2-25)
Now let ip(r,z) = dzip(f,z), so that from (2.21) W, z) = Y, V(k)kekze*f
(2.26)
k
Expanding once more exp(kr)) in powers of krj and using (2.25) and (2.26) we obtain tp(k) on the fluid surface ?(£) = k
% , + & [*i - * • *i]?(*i Mb) + 0(kV)2
(2.27)
*1,*2
We may now plug (2.27) into the expression (2.20) of the Hamiltonian and use Parseval's theorem j A2rTp{f) ^(f) = L2 Y^tf)
VK-*)
(2.28)
An Introduction to Zakharov Theory of Weak Turbulence 137
or generalisations in order to obtain the perturbative expansion, which will be detailed in section 3. In view of the factor of L2 in (2.28), it is convenient to work with the Hamiltonian density V. = H/L2; this Hamiltonian density is expanded in powers of krj H = H0 + Hi+H2
+ ---
(2.29)
where Hn is of order (^77)". The preceding computations allow us to get the first two terms in (2.29), but it is clear, that, at least in principle, the same strategy would allow us to obtain terms of arbitrarily high order in n, although, in practice, calculations become more and more cumbersome when n increases. 3
The normal form of the Hamiltonian
3.1
y.o=sum ofharmonic
oscillators
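Let us now write the explicit expression of $\mathcal{H}_0$ from (2.20), (2.27) and (2.28). Since all variables are from now on taken on the fluid surface, we suppress the bar over $\varphi$ and $\psi$
$$\mathcal{H}_0 = \frac12\,\rho\sum_{\vec k}\left[k\,\varphi_{\vec k}\,\varphi_{-\vec k} + \left(g + \frac{\sigma k^2}{\rho}\right)\eta_{\vec k}\,\eta_{-\vec k}\right] \qquad (3.1)$$
This expression is easily transformed into a sum of harmonic oscillators through a generalisation of (1.6)
$$a_{\vec k} = \sqrt{\frac{\rho}{2}}\left[\lambda_k^{-1}\,\eta_{\vec k} + i\lambda_k\,\varphi_{\vec k}\right] \qquad a_{\vec k}^* = \sqrt{\frac{\rho}{2}}\left[\lambda_k^{-1}\,\eta_{-\vec k} - i\lambda_k\,\varphi_{-\vec k}\right] \qquad (3.2)$$
with
$$\lambda_k = \sqrt{\frac{k}{\omega_k}} \qquad \omega_k^2 = gk + \frac{\sigma k^3}{\rho} \qquad (3.3)$$
The inverse formulae are
$$\sqrt{\rho}\,\eta_{\vec k} = \frac{\lambda_k}{\sqrt2}\,(a_{\vec k} + a_{-\vec k}^*) \qquad \sqrt{\rho}\,\varphi_{\vec k} = -\frac{i}{\sqrt2}\,\lambda_k^{-1}\,(a_{\vec k} - a_{-\vec k}^*) \qquad (3.4)$$
A simple calculation gives $\mathcal{H}_0$ in the standard form
$$\mathcal{H}_0 = \sum_{\vec k}\omega_k\,a_{\vec k}^*\,a_{\vec k} \qquad (3.5)$$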
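In quantum mechanics, $a_{\vec k}$ and $a_{\vec k}^*$ would be replaced by operators $a_{\vec k}$ and $a_{\vec k}^\dagger$ obeying canonical commutation relations
$$[a_{\vec k}, a_{\vec k'}^\dagger] = \frac{\hbar}{L^2}\,\delta_{\vec k,\vec k'} \qquad (3.6)$$
Of course, nobody is thinking seriously of quantising water waves, but quantum mechanics is sometimes more convenient in order to check dimensional analysis. The commutation relations (3.6) are equivalent to the equal time commutation relations of the canonically conjugate fields $\eta$ and $\rho\varphi$
$$[\eta(\vec r, t), \rho\varphi(\vec r\,', t)] = i\hbar\,\delta^{(2)}(\vec r - \vec r\,') \qquad (3.7)$$
The dispersion law (3.3) gives the group velocity $c_g(k)$
$$c_g(k) = \frac{d\omega_k}{dk} = \frac{1}{2\omega_k}\left(g + \frac{3\sigma k^2}{\rho}\right) \qquad (3.8)$$
The limiting cases are
(i) Gravity waves for small values of $k$
$$\omega_k = \sqrt{gk} \qquad (3.9)$$
(ii) Capillary waves for large values of $k$
$$\omega_k = \sqrt{\frac{\sigma}{\rho}}\,k^{3/2} \qquad (3.10)$$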
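The dispersion law (3.3) and the group velocity (3.8) can be explored numerically, for instance to locate the wave vector at which the group velocity is smallest. The sketch below uses the rounded MKS values for water ($\sigma = 7.4 \times 10^{-2}$, $\rho = 10^3$, $g = 10$); the golden-section search and the search interval are choices made here for illustration.

```python
import math

G = 10.0        # acceleration of gravity (m s^-2)
SIGMA = 7.4e-2  # surface tension (N m^-1)
RHO = 1.0e3     # mass density (kg m^-3)

def omega(k):
    """Deep-water dispersion law: omega^2 = g k + sigma k^3 / rho."""
    return math.sqrt(G * k + SIGMA * k ** 3 / RHO)

def group_velocity(k):
    """c_g = d omega / dk = (g + 3 sigma k^2 / rho) / (2 omega)."""
    return (G + 3.0 * SIGMA * k ** 2 / RHO) / (2.0 * omega(k))

def minimise_group_velocity(k_lo=1.0, k_hi=1.0e4, n_iter=200):
    """Golden-section search for the minimum of c_g, performed in log(k)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log(k_lo), math.log(k_hi)
    for _ in range(n_iter):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if group_velocity(math.exp(c)) < group_velocity(math.exp(d)):
            b = d
        else:
            a = c
    return math.exp(0.5 * (a + b))

k_min = minimise_group_velocity()
wavelength_cm = 100.0 * 2.0 * math.pi / k_min
cg_min_cm = 100.0 * group_velocity(k_min)
print(f"k = {k_min:.1f} m^-1, wavelength = {wavelength_cm:.2f} cm, c_g = {cg_min_cm:.1f} cm/s")
```

The minimum falls at a wavelength of a few centimetres, in the crossover region where the gravity and capillary terms of (3.3) are comparable.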
For water waves, with the numerical values in MKS units $\sigma = 7.4 \times 10^{-2}$, $\rho = 10^3$ and $g = 10$, one checks (see Fig. 2) that $c_g$ goes through a minimum in a region of $k$ where gravity and capillarity are equally important; in terms of wavelengths, this minimum is found at $\lambda = 4.33$ cm, corresponding to a group velocity $c_g = 17.9$ cm s$^{-1}$. Of course, we have recovered in (3.3), (3.9) and (3.10) results which are derived in an elementary and direct manner from Euler's equations in standard textbooks on fluid mechanics [4].
3.2 Nonlinear terms: three wave interactions
The expression (2.27) of tpk gives at once the cubic terms of the Hamiltonian (of course, one must not forget the second term in (2.20))
An Introduction to Zakharov Theory of Weak Turbulence 139
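(remember that the bars on $\psi$ have been suppressed)

$\mathcal{H}_1 = \frac{1}{2}\sum_{\vec k_1,\vec k_2,\vec k_3}\left(V_{123}\,a_1^*a_2a_3 + \mathrm{c.c.}\right)\delta_{\vec k_1,\vec k_2+\vec k_3} + \frac{1}{6}\sum_{\vec k_1,\vec k_2,\vec k_3}\left(U_{123}\,a_1a_2a_3 + \mathrm{c.c.}\right)\delta_{\vec k_1+\vec k_2+\vec k_3,0}$   (3.12)

where we have adopted the shorthand notation $a_1 = a_{\vec k_1}$ etc. Now comes an important point: $\mathcal{H}_1$ in (3.12) can be simplified thanks to a canonical transformation which will be given explicitly later on (eq. (3.18)). Without going through all the machinery of canonical transformations, let us give an intuitive argument for this simplification: $a_{\vec k}(t)$ contains a rapidly oscillating factor $\exp(-i\omega_k t)$. Let us get rid of this factor by defining a slowly varying amplitude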
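$A_{\vec k}(t) = a_{\vec k}(t)\,e^{i\omega_k t}$   (3.13)

The equation of motion for $a_{\vec k}(t)$ is (we neglect the term proportional to $U$ in (3.12): this will be justified later on)

$i\partial_t a_{\vec k} = \omega_k a_{\vec k} + \sum_{\vec k_1,\vec k_2}\left[\frac{1}{2}V_{k12}\,a_1a_2\,\delta_{\vec k,\vec k_1+\vec k_2} + V^*_{1k2}\,a_1a_2^*\,\delta_{\vec k_1,\vec k+\vec k_2}\right]$   (3.14)

so that

$i\partial_t A_{\vec k} = \sum_{\vec k_1,\vec k_2}\left[\frac{1}{2}V_{k12}\,e^{i(\omega_k-\omega_1-\omega_2)t}A_1A_2\,\delta_{\vec k,\vec k_1+\vec k_2} + V^*_{1k2}\,e^{i(\omega_k+\omega_2-\omega_1)t}A_1A_2^*\,\delta_{\vec k_1,\vec k+\vec k_2}\right]$   (3.15)

If $\Delta\omega = \omega_k - \omega_1 - \omega_2$ and $\Delta\omega' = \omega_1 - \omega_k - \omega_2$ do not vanish, then the oscillating factors on the right-hand side average out and lead to $\partial_t A_{\vec k} \simeq 0$: the cubic terms will then be unimportant. On the contrary, if the so-called "decay condition"

$\vec k = \vec k_1 + \vec k_2, \qquad \omega_k = \omega_{k_1} + \omega_{k_2}$   (3.16)

may be realised, or, in other words, if the dispersion law is of the "decay type", we have a resonance condition. This is the well-known "resonance problem", or "small divisor problem" of classical mechanics, which has been known since Poincaré. If the resonance condition is realised, one cannot eliminate the resonant terms through a canonical transformation. We shall thus have to distinguish between two completely different cases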
Figure 2. Dispersion law for deep-water waves (group velocity $c_g$ as a function of $k$ in m$^{-1}$, with the gravity and capillary branches indicated)
(i) The dispersion law is compatible with (3.16): it is then of the decay type. Physically it means that a wave can decay into two other waves while conserving wave vector and frequency (in the corresponding quantum problem, it means that an elementary excitation can decay into two elementary excitations while conserving energy and momentum). In that case the cubic terms cannot be eliminated by a canonical transformation and they give the leading contribution, being of first order in $k\eta$. Note that the terms proportional to $U$ in (3.12) are always non-resonant and can always be eliminated: we shall not write them explicitly in what follows.

(ii) The dispersion law is of the non-decay type. Then, barring accidental cancellations (see section 3.5), elastic scattering of two waves is always possible

$\vec k_1 + \vec k_2 = \vec k_3 + \vec k_4, \qquad \omega_{k_1} + \omega_{k_2} = \omega_{k_3} + \omega_{k_4}$   (3.17)

and the leading term will come from four wave interactions and will be of order $(k\eta)^2$.

Figure 3. Kinematics of the decay case. The two surfaces $\omega = k^\alpha$ with origins $O$ and $O_1$ intersect along the $O_1A$ curve for $\alpha > 1$.

It is easy to see (Fig. 3) that in the case of a power-like dispersion law of the kind $\omega_k \propto k^\alpha$, the decay case corresponds to $\alpha > 1$ and the non-decay case to $\alpha < 1$, except in the one dimensional case which is always of the non-decay type. As a simple example, let us assume a decay law $\omega_k = ck^2$ (shallow water capillary waves). Then the angle $\theta$ between $\vec k$ and $\vec k_1$ is fixed by $\cos\theta = k_1/k$. Deep water gravity waves ($\omega_k = \sqrt{gk}$) are of the non-decay type: four wave interactions ($\mathcal{H}_2$) give the leading contribution in perturbation theory. On the other hand deep water capillary waves ($\omega_k = \sqrt{\sigma/\rho}\,k^{3/2}$) are of the decay type and the cubic terms ($\mathcal{H}_1$) are dominant.

To summarize: in the decay case (e.g. deep water capillary waves), the cubic terms are dominant and the terms proportional to $U$ in (3.12) may be eliminated through a canonical transformation $a_{\vec k} \to b_{\vec k}$. In the next two sections we shall be interested in the "resonant manifold" (3.16) only, so that, for notational simplicity, we shall keep $a_{\vec k}$ instead of $b_{\vec k}$ in the equations. However, if one wishes to preserve the Hamiltonian structure, it is compulsory to take properly into account the canonical transformation [9]. In the non-decay case (e.g. deep water gravity waves), all cubic terms may be eliminated, and the dominant term is $\mathcal{H}_2$.
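The distinction between decay and non-decay dispersion laws can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $\omega = |\vec k|^\alpha$ in two dimensions, fixes $\vec k = (1, 0)$, and scans $\vec k_1$ over a polar grid. A sign change of the mismatch $\omega_1 + \omega_2 - \omega_k$ signals, by continuity, a non-trivial solution of the decay condition (3.16). Grid sizes and function names are arbitrary choices.

```python
import math

def resonance_mismatch(alpha, r, theta):
    """omega(k1) + omega(k - k1) - omega(k) for k = (1, 0), omega = |k|^alpha.

    k1 has modulus r and makes the angle theta with k.
    """
    k1x, k1y = r * math.cos(theta), r * math.sin(theta)
    k2 = math.hypot(1.0 - k1x, -k1y)
    return r**alpha + k2**alpha - 1.0

def is_decay_type(alpha, n_r=200, n_theta=181):
    """True if the mismatch changes sign on a grid of k1 values.

    A sign change means that k = k1 + k2 and omega_k = omega_1 + omega_2
    can be satisfied with both k1 and k2 different from zero.
    """
    values = []
    for i in range(1, n_r + 1):
        r = 2.0 * i / n_r                        # |k1| in (0, 2]
        for j in range(n_theta):
            theta = math.pi * j / (n_theta - 1)  # 0..pi is enough by symmetry
            values.append(resonance_mismatch(alpha, r, theta))
    return min(values) < -1e-9 and max(values) > 1e-9
```

For $\alpha = 2$ the mismatch is $2r^2 - 2r\cos\theta$, so it vanishes exactly when $\cos\theta = k_1/k$, which is the relation quoted above for $\omega_k = ck^2$; `resonance_mismatch(2.0, r, math.acos(r))` reproduces this for any $0 < r < 1$.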
3.3 Nonlinear terms: four wave interactions
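In the non-decay case, the resonant manifold is defined by (3.17); this manifold is of codimension 3 in an 8-dimensional space. In order to eliminate the cubic terms, we write the canonical transformation $a_{\vec k} \to b_{\vec k}$

$a_{\vec k} = a_{\vec k}^{(0)} + a_{\vec k}^{(1)} + \cdots$

$a_{\vec k}^{(1)} = \sum_{\vec k_1,\vec k_2}\Gamma^{(1)}_{k,k_1k_2}\,b_1b_2\,\delta_{\vec k,\vec k_1+\vec k_2} - 2\sum_{\vec k_1,\vec k_2}\Gamma^{(1)}_{k_2,kk_1}\,b_1^*b_2\,\delta_{\vec k_2,\vec k+\vec k_1} + \sum_{\vec k_1,\vec k_2}\Gamma^{(2)}_{k,k_1k_2}\,b_1^*b_2^*\,\delta_{\vec k+\vec k_1+\vec k_2,0}$   (3.18)

with $a_{\vec k}^{(0)} = b_{\vec k}$. The transformation is canonical (namely it conserves Poisson brackets) provided

$\Gamma^{(2)}_{k,k_1k_2} = \Gamma^{(2)}_{k_1,kk_2} = \Gamma^{(2)}_{k_2,kk_1}$

and the elimination of cubic terms gives

$\Gamma^{(1)}_{k,k_1k_2} = -\frac{1}{2}\,\frac{V_{k12}}{\omega_k - \omega_1 - \omega_2}, \qquad \Gamma^{(2)}_{k,k_1k_2} = -\frac{1}{2}\,\frac{U^*_{k12}}{\omega_k + \omega_1 + \omega_2}$   (3.19)

It is easy to check that this choice allows one to eliminate the cubic terms. It would seem that one could stop at this point, but it is necessary to be more careful for two reasons

(i) One would like to keep in $\mathcal{H}_2$ only terms of the form $b_1^*b_2^*b_3b_4$, which have as many powers of $b$ as of $b^*$, and to eliminate terms like $b_1b_2b_3b_4$ or $b_1^*b_2b_3b_4$

(ii) One would like to preserve the Hamiltonian structure of the theory.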
It can be shown (but the calculation is cumbersome [10]) that both conditions may be realised, provided one takes into account terms of order $a_{\vec k}^{(2)}$ (namely cubic in $b$) in (3.18). However (3.18) limited to order $a_{\vec k}^{(1)}$ gives correct results provided one restricts oneself to the resonant manifold (3.17). In the non-decay case, a long but straightforward calculation gives the form of $\mathcal{H}_2$

$\mathcal{H}_2 = \frac{\rho}{2}\sum_{\vec k_1,\vec k_2,\vec k_3,\vec k_4}\psi_{\vec k_1}\psi_{\vec k_2}\eta_{\vec k_3}\eta_{\vec k_4}\,L_{k_1,k_2,k_3,k_4}\,\delta_{\vec k_1+\vec k_2+\vec k_3+\vec k_4,0}$

$L_{k_1,k_2,k_3,k_4} = \frac{1}{4}k_1k_2\left[-2k_1 - 2k_2 + |\vec k_1+\vec k_3| + |\vec k_1+\vec k_4| + |\vec k_2+\vec k_3| + |\vec k_2+\vec k_4|\right]$   (3.20)

Then (3.20) is written from (3.4) in terms of harmonic oscillator variables and the final form is obtained through the canonical transformation (3.18)

$\mathcal{H}_2 = \frac{1}{4}\sum_{\vec k_1,\vec k_2,\vec k_3,\vec k_4}T_{k_1k_2,k_3k_4}\,b_1^*b_2^*b_3b_4\,\delta_{\vec k_1+\vec k_2,\vec k_3+\vec k_4}$

$T_{k_1k_2,k_3k_4} = W_{k_1k_2,k_3k_4} + \frac{V_{k_1+k_2,k_1,k_2}\,V_{k_3+k_4,k_3,k_4}}{\omega_{k_1} + \omega_{k_2} - \omega_{\vec k_1+\vec k_2}} + \mathrm{perm.}$   (3.21)

where $V$ is defined in (3.12) and $W$ is the coefficient of $a_1^*a_2^*a_3a_4$ in the expression of $\mathcal{H}_2$. The second term in the last line of (3.21) is reminiscent of second order perturbation theory in quantum mechanics, and it can actually be represented graphically as a Feynman graph (Fig. 4). The full expressions of $V$, $W$ and $T$ are rather cumbersome and we refer to the literature for their explicit form [3,8,10]. Expression (3.21) is valid on the resonant manifold (3.17) only, but that is all that will be needed in the next section.

Figure 4. Feynman graphs (with $\omega_1 + \omega_2 = \omega_3 + \omega_4$)

The Hamiltonian density $\mathcal{H}_2$ in (3.21) conserves the wave action

$N = \sum_{\vec k} b^*_{\vec k}b_{\vec k}$   (3.22)
because there are as many $b^*$'s as $b$'s in (3.21). However this conservation law is only approximate, since (3.21) is valid on the resonant manifold only. This could be corrected by using a canonical transformation including cubic terms, but even with this improvement, terms of order $(k\eta)^3$ contributing to $\mathcal{H}_3$ would spoil the conservation law.

3.4 Dimensional analysis and scaling laws

Comparing (3.5) and (3.12), we see that the dimension of $Va$ is the same as that of $\omega$: $[\omega] = [Va]$; as $[a] = M^{1/2}T^{-1/2}$ (see (3.2) or (3.6)), the dimension of $V$ is

$[V] = M^{-1/2}T^{-1/2}$   (3.23)

while the dimension of $W$ is

$[W] = M^{-1}$   (3.24)

Let us first consider deep water capillary waves: the only dimensionful quantities at our disposal are $\rho$, $\sigma$ and $k$, and dimensional analysis implies

$V \propto \left(\frac{\sigma k^9}{\rho^3}\right)^{1/4}$   (3.25)

The same result can of course be obtained from the explicit expression of the three wave interaction in (3.12) and the dimension of $X_k$ in (3.3): $X_k \propto \rho^{1/4}\sigma^{-1/4}k^{-1/4}$. For deep water gravity waves, the dimensionful quantities are $g$, $\rho$ and $k$, so that dimensional analysis gives

$W \propto \rho^{-1}k^3$   (3.26)

Note that dimensional analysis is useless in the case of shallow water waves, because there is one more dimensionful quantity, namely the depth $h$. Then one must use the explicit expressions (see section 3.5). An interesting example is that of shallow water capillary waves, where
$V \propto \left(\frac{\sigma}{\rho^3h}\right)^{1/4}k^2$   (3.27)

The power laws (3.25-27) are called scaling laws; they are valid provided one type of wave is dominant.
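The dimensional analysis behind these scaling laws amounts to solving a small linear system for the exponents. The sketch below is illustrative: dimensions are encoded as (mass, length, time) exponent triples, and the helper `solve_exponents` (a hypothetical name) performs an exact Gauss-Jordan elimination with rational arithmetic.

```python
from fractions import Fraction

# Dimensions as (mass, length, time) exponents.
RHO = (1, -3, 0)      # density
SIGMA = (1, 0, -2)    # surface tension
G = (0, 1, -2)        # gravity
K = (0, -1, 0)        # wave number

def solve_exponents(quantities, target):
    """Exponents x_j such that prod_j quantity_j^x_j has dimension `target`.

    Solves the 3x3 linear system exactly by Gauss-Jordan elimination.
    """
    n = 3
    rows = [[Fraction(quantities[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

# [V] = M^(-1/2) T^(-1/2), built from (rho, sigma, k): deep water capillary waves.
V_EXPONENTS = solve_exponents((RHO, SIGMA, K),
                              (Fraction(-1, 2), 0, Fraction(-1, 2)))

# [W] = M^(-1), built from (g, rho, k): deep water gravity waves.
W_EXPONENTS = solve_exponents((G, RHO, K), (-1, 0, 0))
```

`V_EXPONENTS` lists the powers of $\rho$, $\sigma$ and $k$, and `W_EXPONENTS` those of $g$, $\rho$ and $k$; the power of $k$ in each case is the scaling exponent called $m$ later in the text.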
3.5 Miscellaneous remarks

We conclude this section with some remarks.

(1) Shallow water. When the depth $h$ is finite, the expressions of $\omega_k$ and of the nonlinear terms are modified. One finds, for example

$\omega_k^2 = \left(gk + \frac{\sigma}{\rho}k^3\right)\tanh(kh)$   (3.28)

and

$L_{\vec k_1,\vec k_2} = -\left(\vec k_1\cdot\vec k_2 + k_1k_2\tanh(k_1h)\tanh(k_2h)\right)$   (3.29)
One can immediately take the shallow water limit $kh \ll 1$ in these expressions.

(2) Deep water gravity waves in $D = 1$. There are two trivial solutions to (3.17), but there is also a non-trivial one

$k_1 = a(1+\zeta)^2, \qquad k_2 = a(1+\zeta)^2\zeta^2, \qquad k_3 = -a\zeta^2, \qquad k_4 = a(1+\zeta+\zeta^2)^2$   (3.30)

where $a$ is a dimensionful parameter and $0 < \zeta < 1$. This equation gives the non-trivial resonant manifold. It was shown recently that $\mathcal{H}_2$ is zero on this non-trivial resonant manifold [11]. This means that $\mathcal{H}_2$ may be eliminated through a canonical transformation. Should the process go on, it would mean that one dimensional gravity waves in deep water form an integrable system. "Unfortunately" $\mathcal{H}_3 \neq 0$ [12], and integrability is valid only up to quartic terms. Of course, the four wave interaction is non-zero on the trivial manifold, but this does not affect integrability, because $\mathcal{H}$ is of the form (in quantum mechanics)

$\mathcal{H} = \sum_{k}\omega_k\,b_k^\dagger b_k + \sum_{k_1,k_2}T_{k_1,k_2}\,(b^\dagger_{k_1}b_{k_1})(b^\dagger_{k_2}b_{k_2})$   (3.31)

and the eigenvectors are obvious in Fock space.

(3) Continuum normalisation. In the next two sections, we shall follow Zakharov's conventions and use a continuum normalisation. Let us recall that

$\sum_{\vec k} \to \frac{L^2}{(2\pi)^2}\int d^2k$   (3.32)

The convention for Fourier transforms is

$\psi(\vec k) = \int\frac{d^2r}{2\pi}\,e^{-i\vec k\cdot\vec r}\,\psi(\vec r) = \frac{L^2}{2\pi}\,\psi_{\vec k}$   (3.33)
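Similarly

$a(\vec k) = \frac{L^2}{2\pi}\,a_{\vec k}$   (3.34)

so that the dimension of $a(\vec k)$ is now $M^{1/2}L^2T^{-1/2}$. The free Hamiltonian (and not the Hamiltonian density) $H_0$ is now

$H_0 = \int d^2k\;\omega(k)\,a^*(\vec k)a(\vec k)$   (3.35)

while $H_1$ becomes, instead of (3.12)

$H_1 = \frac{1}{2}\int d^2k_1\,d^2k_2\,d^2k_3\;\delta^{(2)}(\vec k_1-\vec k_2-\vec k_3)\left[V(\vec k_1,\vec k_2,\vec k_3)\,a^*(\vec k_1)a(\vec k_2)a(\vec k_3) + \mathrm{c.c.}\right]$   (3.36)

where

$V(\vec k_1,\vec k_2,\vec k_3) = \frac{1}{2\pi}\,V_{k_1k_2k_3}$   (3.37)

Similarly for the four wave interaction

$T(\vec k_1,\vec k_2,\vec k_3,\vec k_4) = \frac{1}{(2\pi)^2}\,T_{k_1k_2,k_3k_4}$   (3.38)

The equations of motion are

$i\partial_t a(\vec k,t) = \frac{\delta H}{\delta a^*(\vec k)}$   (3.39)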
where the RHS is now a functional derivative.
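The non-trivial one-dimensional resonant manifold (3.30) of remark (2) can be verified directly. The sketch below assumes the deep water gravity dispersion law $\omega = \sqrt{g|k|}$ and the parametrisation $k_1 = a(1+\zeta)^2$, $k_2 = a(1+\zeta)^2\zeta^2$, $k_3 = -a\zeta^2$, $k_4 = a(1+\zeta+\zeta^2)^2$; the function names and the value of $g$ are illustrative.

```python
import math

def nontrivial_quartet(a, zeta):
    """Wave numbers (k1, k2, k3, k4) of the non-trivial 1D resonant manifold."""
    k1 = a * (1.0 + zeta) ** 2
    k2 = a * (1.0 + zeta) ** 2 * zeta**2
    k3 = -a * zeta**2
    k4 = a * (1.0 + zeta + zeta**2) ** 2
    return k1, k2, k3, k4

def omega_gravity(k, g=10.0):
    """Deep water gravity waves in one dimension: omega = sqrt(g |k|)."""
    return math.sqrt(g * abs(k))

def resonance_defects(a, zeta):
    """Return (k1 + k2 - k3 - k4, omega1 + omega2 - omega3 - omega4)."""
    k1, k2, k3, k4 = nontrivial_quartet(a, zeta)
    dk = k1 + k2 - k3 - k4
    dw = (omega_gravity(k1) + omega_gravity(k2)
          - omega_gravity(k3) - omega_gravity(k4))
    return dk, dw
```

Both defects vanish identically: the wave numbers on each side sum to $a(1+\zeta)^2(1+\zeta^2)$ and the frequencies to $\sqrt{ga}\,(1+\zeta)^2$, so the numerical values returned are zero up to rounding.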
4 Kinetic equations
Up to now, we have been dealing with a purely dynamical problem. We now turn to a statistical description, because we are interested in a situation where there is a large number of excited waves. We are then not going to ask for detailed information on all amplitudes and phases of the different modes, but we shall limit ourselves to the knowledge of a probability distribution. The main object of interest will be $n(\vec k,t)$, the number density of waves in wave-vector space. The equation governing the time evolution of $n(\vec k,t)$ is called a kinetic equation.
4.1 Derivation of the kinetic equations

The decay and non-decay cases turn out to be governed by somewhat different kinetic equations. We shall treat in some detail the former case, which turns out to be simpler, and will quote only the results for the latter. We recall the equations of motion

$i\frac{\partial a(\vec k)}{\partial t} = \omega_k\,a(\vec k) + \frac{\delta H_1}{\delta a^*(\vec k)}$   (4.1)

with (see (3.36))

$H_1 = \frac{1}{2}\int\left[V_{123}\,a_1^*a_2a_3 + \mathrm{c.c.}\right]\delta^{(2)}(\vec k_1-\vec k_2-\vec k_3)\,d^2k_1\,d^2k_2\,d^2k_3$   (4.2)

The non-resonant $U$-term in (3.12) has been neglected; it can be eliminated through a canonical transformation, and in any case it does not play any role on the resonant manifold (3.16). We introduce an averaging procedure consistent with the dynamics: this is possible because there exists an invariant measure (Liouville measure) on phase space. To order zero of perturbation theory, we shall make two basic assumptions for the correlation functions $\langle a(\vec k_1)\ldots a(\vec k_n)\rangle$:
(i) Random phase approximation (RPA)

(ii) Gaussian statistics: all cumulants of order $\geq 3$ vanish.

These are standard physicists' assumptions, but no detailed justification is given by Zakharov (the justification is postponed to the second volume of the book [3], which is still unpublished ...). The random phase approximation means that

$\langle a(\vec k)\rangle^{(0)} = \langle |a(\vec k)|\,e^{i\varphi(\vec k)}\rangle^{(0)} = 0$

because the phase averages out to zero; the superscript (0) means "order zero of perturbation theory". Thus we have, for example,

$\langle a(\vec k)a(\vec k')\rangle^{(0)} = 0$

$\langle a(\vec k)a^*(\vec k')\rangle^{(0)} = n(\vec k)\,\delta^{(2)}(\vec k-\vec k')$   (4.3)

$\langle a_1^*a_2^*a_3a_4\rangle^{(0)} = n(\vec k_1)n(\vec k_2)\left[\delta^{(2)}(\vec k_1-\vec k_3)\,\delta^{(2)}(\vec k_2-\vec k_4) + \delta^{(2)}(\vec k_1-\vec k_4)\,\delta^{(2)}(\vec k_2-\vec k_3)\right]$

where we have used in the last equation the assumption of Gaussian statistics, which allows us to express a correlation function of order four in terms of
correlation functions of order two; note that with the conventions of section 3.5, $n$ has the dimension of Planck's constant, $ML^2T^{-1}$. It must be emphasized that these equations will be modified by perturbative corrections: for example, the correlation function of order three vanishes to zeroth order, but will get non-zero corrections from nonlinear interactions. Taking these assumptions into account, we now derive the kinetic equations. The equation of motion for $a(\vec k)$ is

$i\partial_t a(\vec k) - \omega_k a(\vec k) = \int\left[\frac{1}{2}V_{k12}\,a_1a_2\,\delta(\vec k-\vec k_1-\vec k_2) + V^*_{1k2}\,a_1a_2^*\,\delta(\vec k_1-\vec k-\vec k_2)\right]d^2k_1\,d^2k_2$   (4.4)

In order to derive the expression of $\partial_t n$, we multiply the previous equation by $a^*(\vec k)$, write the equation of motion for $a^*(\vec k)$, multiply it by $a(\vec k)$, and finally subtract the second equation from the first with the result

$\partial_t n(\vec k) = \mathrm{Im}\int\left[V_{k12}J_{k12}\,\delta(\vec k-\vec k_1-\vec k_2) - V_{1k2}J_{1k2}\,\delta(\vec k_1-\vec k-\vec k_2) - V_{2k1}J_{2k1}\,\delta(\vec k_2-\vec k-\vec k_1)\right]d^2k_1\,d^2k_2$   (4.5)

where $J_{123}\,\delta(\vec k_1-\vec k_2-\vec k_3) = \langle a_1^*a_2a_3\rangle$. Now, from (4.3), $J^{(0)}_{123} = 0$, so that $\partial_t n(\vec k) = 0$ to first order in $V$. We need to compute $J_{123}$ to first order in $V$ in order to obtain a non-trivial result to second order in $V$ for $\partial_t n(\vec k)$. Following the same steps as before, we derive an equation for $J_{123}$

$\left[i\partial_t + (\omega_1-\omega_2-\omega_3)\right]J_{123} = \int\left[-\frac{1}{2}V^*_{145}J_{4523}\,\delta(\vec k_1-\vec k_4-\vec k_5) + V^*_{425}J_{1534}\,\delta(\vec k_4-\vec k_2-\vec k_5) + V^*_{435}J_{1524}\,\delta(\vec k_4-\vec k_3-\vec k_5)\right]d^2k_4\,d^2k_5$   (4.6)

with $J_{1234}\,\delta(\vec k_1+\vec k_2-\vec k_3-\vec k_4) = \langle a_1^*a_2^*a_3a_4\rangle$. In deriving (4.6), we have already made use of the random phase approximation. It would not be difficult to write down the exact equation, and, indeed, it is clear that we could derive a hierarchy of exact equations, involving correlation functions of higher and higher order, similar to the BBGKY hierarchy in plasma physics. A useful result can only be obtained by truncating this
hierarchy. This will be done thanks to the assumption of Gaussian statistics, which allows us to write $J_{1234}$ in terms of a product of $n(\vec k)$'s

$\langle a_1^*a_2^*a_3a_4\rangle^{(0)} = \langle a_1^*a_3\rangle^{(0)}\langle a_2^*a_4\rangle^{(0)} + \langle a_1^*a_4\rangle^{(0)}\langle a_2^*a_3\rangle^{(0)} = n_1n_2\left[\delta(\vec k_1-\vec k_3)\,\delta(\vec k_2-\vec k_4) + \delta(\vec k_1-\vec k_4)\,\delta(\vec k_2-\vec k_3)\right]$

To first non-trivial order of perturbation theory, we thus arrive at

$\left[i\partial_t + (\omega_1-\omega_2-\omega_3)\right]J^{(1)}_{123} = V^*_{123}\left[n_1n_2 + n_1n_3 - n_2n_3\right] = A$   (4.7)

This equation is easily integrated

$J^{(1)}_{123}(t) = B\,e^{i\Delta\omega\,t} + \frac{A}{\Delta\omega}$   (4.8)

with $\Delta\omega = \omega_1-\omega_2-\omega_3$. The oscillatory term may be neglected and we get

$J^{(1)}_{123} = \frac{V^*_{123}\,(n_1n_2 + n_1n_3 - n_2n_3)}{\omega_1-\omega_2-\omega_3+i\epsilon}$   (4.9)

where $\epsilon \to 0^+$. The justification of this $i\epsilon$ ($\Delta\omega \to \Delta\omega + i\epsilon$) is causality and is identical to that used in Landau's argument for the solution of Vlasov's equation [13]. One can derive this prescription intuitively by adiabatic switching of the interaction: $V \to Ve^{\epsilon t}$. Plugging this result into equation (4.5) for $\partial_t n$, we obtain a closed nonlinear equation

$\partial_t n(\vec k,t) = \pi\int\left[|V_{k12}|^2f_{k12}\,\delta(\vec k-\vec k_1-\vec k_2)\,\delta(\omega_k-\omega_1-\omega_2) - |V_{1k2}|^2f_{1k2}\,\delta(\vec k_1-\vec k-\vec k_2)\,\delta(\omega_1-\omega_k-\omega_2) - |V_{2k1}|^2f_{2k1}\,\delta(\vec k_2-\vec k-\vec k_1)\,\delta(\omega_2-\omega_k-\omega_1)\right]d^2k_1\,d^2k_2$   (4.10)

where

$f_{k12} = n_1n_2 - n_k(n_1+n_2)$   (4.11)

Equation (4.10) is reminiscent of a Boltzmann equation, although the analogy, according to Zakharov, is more misleading than useful. It is of the general form

$\partial_t n(\vec k,t) = I_{\vec k}[n(\vec k')]$   (4.12)

where $I_{\vec k}[n]$, which is a functional of $n$, is called the collision term. In the non-decay case, the four wave kinetic equation is derived along similar lines

$\partial_t n(\vec k,t) = \frac{\pi}{2}\int|T_{k123}|^2f_{k123}\,\delta(\vec k+\vec k_1-\vec k_2-\vec k_3)\,\delta(\omega_k+\omega_1-\omega_2-\omega_3)\,d^2k_1\,d^2k_2\,d^2k_3$   (4.13)

where $T$ is defined in (3.21) and

$f_{k123} = n_2n_3(n_1+n_k) - n_1n_k(n_2+n_3)$   (4.14)

This equation was first derived in 1962 by Hasselmann [14], but it is only recently [15] that it was shown to be completely equivalent to that derived by Zakharov.
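The step from the $i\epsilon$ prescription in (4.9) to the frequency delta functions in (4.10) rests on the identity $\mathrm{Im}\,(\Delta\omega + i\epsilon)^{-1} = -\epsilon/(\Delta\omega^2+\epsilon^2) \to -\pi\,\delta(\Delta\omega)$ as $\epsilon \to 0^+$. The sketch below illustrates this limit numerically for a smooth test function; the test function, the quadrature and the function name are choices made here for illustration and are not taken from the text.

```python
import math

def lorentzian_weighted_integral(g, eps, n=20000):
    """Integral of g(x) * eps / (x^2 + eps^2) over the real line.

    The substitution x = eps * tan(u) turns the Lorentzian measure into du,
    so a midpoint rule in u over (-pi/2, pi/2) resolves the narrow peak.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -0.5 * math.pi + (i + 0.5) * h
        total += g(eps * math.tan(u))
    return total * h

def gaussian(x):
    """Smooth test function with g(0) = 1."""
    return math.exp(-x * x)

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, lorentzian_weighted_integral(gaussian, eps))
```

As $\epsilon$ decreases the result approaches $\pi\,g(0)$, which is the statement that the Lorentzian acts as $\pi\,\delta(x)$ under the integral.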
4.2 Conservation laws
In the weak nonlinear approximation, the average energy density is given by

$E = \int\omega(k)\,n(\vec k)\,d^2k$   (4.15)

The time variation of $E$ is determined from the kinetic equation; to be specific, let us consider the four-wave case (4.13)

$\frac{dE}{dt} = \frac{\pi}{2}\int\omega(k)\,|T_{k123}|^2f_{k123}\,\delta(\vec k+\vec k_1-\vec k_2-\vec k_3)\,\delta(\omega_k+\omega_1-\omega_2-\omega_3)\,d^2k\,d^2k_1\,d^2k_2\,d^2k_3$   (4.16)

One uses a trick familiar from the study of Boltzmann's equation [16]: thanks to the symmetry properties of $T$ and $f$

$\omega(k) \to \frac{1}{4}(\omega_k+\omega_1-\omega_2-\omega_3)$

so that $dE/dt = 0$, provided the integrals converge. In the four wave case, we have in addition to energy conservation the approximate conservation of the "number of waves", or of "wave action"

$N = \int n(\vec k,t)\,d^2k$   (4.17)

One can also derive other results analogous to those familiar in the case of Boltzmann's equation; defining for example an "entropy"

$S(t) = \int\ln n(\vec k,t)\,d^2k$   (4.18)

one obtains an "H-theorem" in the form $dS/dt \geq 0$. This result implies that one should find an equilibrium distribution which obeys $I_{\vec k}[n_{\rm eq}(\vec k')] = 0$. It is not difficult to check that the Rayleigh-Jeans distribution

$n_{\rm eq}(\vec k) = \frac{T}{\omega_k}$   (4.19)
is precisely the equilibrium solution of $I_{\vec k}[n] = 0$. However this equilibrium distribution is not what we are looking for, because we are interested in open systems, where energy is pumped into the system and then dissipated, through viscosity or wave breaking. What we are looking for are stationary non-equilibrium distributions. Let $\Gamma(\vec k)$ represent the interaction with the environment

$\partial_t n(\vec k) = I_{\vec k}[n(\vec k',t)] + \Gamma(\vec k)\,n(\vec k,t)$   (4.20)

so that with a stationary non-equilibrium distribution we have

$\Gamma(\vec k)\,n(\vec k) + I_{\vec k}[n(\vec k')] = 0$   (4.21)

The solution to this equation will give precisely the weak turbulence spectrum $n(\vec k)$. Remember that it is called weak turbulence because $n(\vec k)$ is controlled by the weak nonlinearities of the Hamiltonian. Let us now obtain some constraints on our theory. From the "H-theorem"

$\partial_t S = \int d^2k\,(\partial_t n)\,n^{-1} = \int d^2k\;I_{\vec k}[n]\,n^{-1} \geq 0$   (4.22)

so that

$\int\Gamma(\vec k)\,d^2k \leq 0$   (4.23)

and this last integral is equal to minus the entropy removed from the system: if entropy is created in the system, it should be removed by the environment in a stationary state. On the other hand the collision and decay processes conserve energy and momentum, so that the total energy and momentum given to the system vanish

$\int\Gamma(\vec k)\,\omega(k)\,n(\vec k)\,d^2k = 0, \qquad \int\Gamma(\vec k)\,\vec k\,n(\vec k)\,d^2k = 0$   (4.24)

The first of the preceding equations shows that $\Gamma(\vec k)$ must have alternating signs, and it is possible to show that $\Gamma(\vec k) < 0$ at small scales. Thus we may expect a situation analogous to that of fully developed turbulence, where energy is pumped at large scales ($k \lesssim k_{\min}$) and dissipated at small scales ($k \gtrsim k_{\max}$), and where there exists an "inertial range"

$k_{\min} \ll k \ll k_{\max}$

such that $I_{\vec k}[n(\vec k')] = 0$ in the inertial range. At the ends of the inertial range, $n(\vec k)$ should match source and sink. The universality of weak turbulence distributions relies on the fact that they are independent of $\Gamma(\vec k)$. Our program for the last section is thus as follows

(i) Find the solutions of $I_{\vec k}[n] = 0$

(ii) Check the convergence of the collision integrals

(iii) Match with source and sink

(iv) Check the stability of the solutions.

In fact, points (iii) and (iv) are extremely involved and will not be dealt with here: we refer to [3] for details. Point (ii) will be briefly touched upon, and we shall be mainly interested in point (i).
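The claim made in this section that the Rayleigh-Jeans distribution annihilates the collision term can be checked at the level of the integrands: the combinations $f_{k12} = n_1n_2 - n_k(n_1+n_2)$ and $f_{k123} = n_2n_3(n_1+n_k) - n_1n_k(n_2+n_3)$ vanish for $n = T/\omega$ whenever the corresponding frequency resonance condition holds. The sketch below is a minimal numerical illustration; the function names and the sample frequencies are arbitrary.

```python
def rayleigh_jeans(omega, temperature=1.0):
    """Equilibrium occupation n = T / omega."""
    return temperature / omega

def f_three_wave(n_k, n_1, n_2):
    """Three wave combination: n1 n2 - nk (n1 + n2)."""
    return n_1 * n_2 - n_k * (n_1 + n_2)

def f_four_wave(n_k, n_1, n_2, n_3):
    """Four wave combination: n2 n3 (n1 + nk) - n1 nk (n2 + n3)."""
    return n_2 * n_3 * (n_1 + n_k) - n_1 * n_k * (n_2 + n_3)

def three_wave_on_resonance(omega_1, omega_2):
    """f for Rayleigh-Jeans occupations with omega_k = omega_1 + omega_2."""
    omega_k = omega_1 + omega_2
    return f_three_wave(rayleigh_jeans(omega_k),
                        rayleigh_jeans(omega_1),
                        rayleigh_jeans(omega_2))

def four_wave_on_resonance(omega_1, omega_2, omega_3):
    """f for Rayleigh-Jeans occupations with omega_k + omega_1 = omega_2 + omega_3."""
    omega_k = omega_2 + omega_3 - omega_1
    return f_four_wave(rayleigh_jeans(omega_k),
                       rayleigh_jeans(omega_1),
                       rayleigh_jeans(omega_2),
                       rayleigh_jeans(omega_3))
```

In both cases the result factorises as a product of occupations times the frequency mismatch divided by $T$, which is why it vanishes on resonance and only there.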
5 Stationary spectra of weak turbulence
5.1 Dimensional estimates

In this section, it will be instructive to keep the space dimension $D$ arbitrary. Let $dE/dk$ be the energy density per unit of $k$ (the energy density $E$ has dimension $ML^{2-D}T^{-2}$, so that $dE/dk$ has dimension $ML^{3-D}T^{-2}$), and $P$ be the energy flux pumped into the system, which has dimension $ML^{2-D}T^{-3}$. In Kolmogorov's theory of fully developed turbulence, one assumes that the energy spectrum $dE/dk$ in the inertial range depends on $P$, $k$ and $\rho$ only; then dimensional analysis for $D = 3$ gives at once the famous K41 law

$\frac{dE}{dk} = C\,\rho^{1/3}P^{2/3}k^{-5/3}$   (5.1)

where $C$ is a dimensionless constant. More sophisticated assumptions may be made in order to "derive" K41 [17], but they lead to little improvement in our understanding of the problem. In our case we have one more dimensionful quantity at our disposal, namely the frequency $\omega(k)$. We can form the dimensionless quantity

$\xi = \frac{P\,k^{5-D}}{\rho\,\omega^3(k)}$   (5.2)

and the most general result is

$\frac{dE}{dk} = \rho\,\omega^2(k)\,k^{D-6}f(\xi)$   (5.3)

If $dE/dk$ does not depend on $\omega(k)$, one recovers (5.1) in dimension $D = 3$. But, in the general case, we need further information in order to proceed. This information is provided by the scale invariance of the interaction and of the dispersion law.
In many physically relevant cases, the frequency and the interaction obey scaling laws

$\omega(\lambda k) = \lambda^\alpha\,\omega(k), \qquad V(\lambda\vec k_1,\lambda\vec k_2,\lambda\vec k_3) = \lambda^m\,V(\vec k_1,\vec k_2,\vec k_3)$   (5.4)

For example, in the case of deep water capillary waves, $\alpha = 3/2$ from (3.10) and $m = 9/4$ from (3.11). Let us assume that a power law for $n(k)$ gives a stationary solution of (4.10) in the inertial range

$n(\omega) = A\,\omega^{-s/\alpha}, \qquad n(k) = A\,k^{-s}$   (5.5)

We may write

$V(\vec k,\vec k_1,\vec k_2) = V_0\,k^m f_1\!\left(\frac{\vec k_1}{k},\frac{\vec k_2}{k}\right)$   (5.6)

and the collision integral reads

$I(k) \propto \int d^Dk_1\,d^Dk_2\,|V|^2n^2\,\delta^{(D)}(\vec k-\vec k_1-\vec k_2)\,\delta(\omega_k-\omega_1-\omega_2) \propto k^{2D}\,V_0^2k^{2m}\,A^2k^{-2s}\,k^{-D}\,\omega^{-1} = A^2V_0^2\,k^{D+2m-2s}\,\omega^{-1}$   (5.7)

Equation (5.7) has, of course, a purely dimensional origin, and is obtained from simple power counting. Now, assuming isotropy, the energy flux $P$ in $k$-space obeys

$\frac{dP}{dk} = S_D\,\omega(k)\,\Gamma(k)\,n(k) = -S_D\,\omega(k)\,I(k)$   (5.8)

where we have used the stationarity condition (4.21); $S_D$ is the surface of the sphere in dimension $D$ ($S_D = \pi(2k)^{D-1}$). Thus

$P(k) = -\pi\int_0^k\omega(k')\,I(k')\,(2k')^{D-1}\,dk' = \pi A^2V_0^2\,a(m,D,\alpha,s)\int_0^k k'^{D-1}k'^{D+2m-2s}\,dk' = \pi A^2V_0^2\,a(m,D,\alpha,s)\,\frac{k^{2(D+m-s)}}{2(D+m-s)}$   (5.9)

where $a(m,D,\alpha,s)$ is a dimensionless integral. Thus, in the inertial range where there is a constant energy transfer from the source to the sink

$s = m + D, \qquad n(k) \simeq \left(\frac{P}{V_0^2}\right)^{1/2}k^{-s}$   (5.10)

and the exponent of $k$ in (5.5) is determined. Note that the normalisation of $n$ is also fixed. In the four wave case, an analogous but longer calculation gives

$s = \frac{2m}{3} + D, \qquad n(k) \propto P^{1/3}\,k^{-s}$   (5.11)

However, in this case, we have in addition conservation of wave action, with a new exponent $s'$

$s' = \frac{2m}{3} + D - \frac{\alpha}{3}, \qquad n(k) \propto Q^{1/3}\,k^{-s'}$   (5.12)

where $Q$ is the flux of wave action. Energy and wave action fluxes move in opposite directions (Fig. 5).
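The spectral exponents obtained from power counting are easy to tabulate. The sketch below is illustrative and rests on assumptions that should be stated plainly: it takes $s = m + D$ for three wave interactions, and for four wave interactions the standard weak turbulence exponents $s = 2m/3 + D$ (energy flux) and $s' = 2m/3 + D - \alpha/3$ (wave action flux), which are adopted here because they reproduce the exponents $4$ and $23/6$ quoted for deep water gravity waves. The function names are hypothetical.

```python
from fractions import Fraction as F

def three_wave_exponent(m, dim):
    """Exponent s of n(k) ~ k^(-s) for a constant energy flux, s = m + D."""
    return m + dim

def four_wave_exponents(m, alpha, dim):
    """Return (s, s') for constant energy flux and constant wave action flux.

    Assumed formulas: s = 2m/3 + D and s' = s - alpha/3.
    """
    s_energy = F(2, 3) * m + dim
    s_action = s_energy - alpha / 3
    return s_energy, s_action

# Values of (m, alpha) quoted in the text, with D = 2.
CAPILLARY_DEEP = three_wave_exponent(F(9, 4), 2)
CAPILLARY_SHALLOW = three_wave_exponent(F(2), 2)
GRAVITY_DEEP = four_wave_exponents(F(3), F(1, 2), 2)
```

Exact rational arithmetic is used so that exponents such as $17/4$ and $23/6$ appear as fractions rather than rounded decimals.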
Figure 5. A system with source and sink
In order to derive the exact expression of $P$, we have to be more careful than in the preceding calculation. Writing from (5.7) the collision integral as

$I(k,s) = A^2\,k^{D+2m-2s}\,\omega^{-1}\,J(s)$

we obtain the energy flux in the form

$P(k) = -\pi\,2^{D-1}A^2\,\frac{k^{2(D+m-s)}}{2(D+m-s)}\,J(s)$   (5.13)

For $s = D + m$ the denominator vanishes, but so does $J(s)$, since by definition the collision integral should vanish for a stationary distribution. We have an indeterminacy, which is however easily lifted

$P = \pi\,2^{D-1}A^2\,\frac{1}{2}\left.\frac{dJ(s')}{ds'}\right|_{s'=m+D}$   (5.14)

We shall explain later on (after (5.30)) how to compute this derivative.

5.2 Zakharov transformation

It remains to check explicitly that a power law for $n(k)$ does lead to a vanishing collision integral, and we must look more accurately at its explicit calculation. Let us examine the first term in the square bracket of (4.10). We limit ourselves to the case $D = 2$

$I_1 = \int d^2k_1\,d^2k_2\,|V_{k12}|^2f_{k12}\,\delta(\vec k-\vec k_1-\vec k_2)\,\delta(\omega-\omega_1-\omega_2)$   (5.15)
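with $\omega_k = \omega$. The $\vec k_2$-integral is performed immediately; assuming a dispersion law $\omega(k) = ck^\alpha$, with $c = 1$ in order to avoid inessential details, we have

$I_1 = \int k_1\,dk_1\,d\theta\;\delta\!\left(k^\alpha - k_1^\alpha - (k^2+k_1^2-2kk_1\cos\theta)^{\alpha/2}\right)|V_{k12}|^2f_{k12} = \alpha^{-2}\int d\omega_1\,d\omega_2\,\frac{(\omega_1\omega_2)^{2/\alpha-1}}{\Delta}\,|V_{k12}|^2f_{k12}\,\delta(\omega-\omega_1-\omega_2)$   (5.16)

where we have used $\omega$, rather than $k$, as integration variable. In (5.16) $kk_1\sin\theta = \Delta$ is twice the area of the triangle built with the vectors $\vec k$, $\vec k_1$ and $\vec k_2$

$\Delta = \frac{1}{2}\left[2(k^2k_1^2 + k^2k_2^2 + k_1^2k_2^2) - k^4 - k_1^4 - k_2^4\right]^{1/2}$   (5.17)

Then (5.16) becomes

$I_1 = \alpha^{-2}V_0^2\int d\omega_1\,d\omega_2\,\frac{(\omega_1\omega_2)^{2/\alpha-1}}{\Delta}\,\omega^{2m/\alpha}\,g\!\left(\frac{\omega_1}{\omega}\right)f_{k12}\,\delta(\omega-\omega_1-\omega_2)$   (5.18)

where we have taken a scaling law of the form (5.6) for $V$

$|V_{k12}|^2 = V_0^2\,k^{2m}\,g\!\left(\frac{\omega_1}{\omega}\right)$   (5.19)

If we now assume a power law for the distribution: $n(\omega) = \omega^{-\nu}$, $\nu = s/\alpha$, we have for $f_{k12}$, with $\omega_2 = \omega - \omega_1$

$f_{k12} = n_1n_2 - n_\omega(n_1+n_2) = (\omega_1\omega_2)^{-\nu}\left[1 - \left(\frac{\omega_1}{\omega}\right)^{\nu} - \left(\frac{\omega_2}{\omega}\right)^{\nu}\right]$   (5.20)

Summarizing

$I_1 = \alpha^{-2}V_0^2\int d\omega_1\,d\omega_2\,\frac{(\omega_1\omega_2)^{2/\alpha-1-\nu}}{\Delta}\,\omega^{2m/\alpha}\,g\!\left(\frac{\omega_1}{\omega}\right)\left[1 - \left(\frac{\omega_1}{\omega}\right)^{\nu} - \left(\frac{\omega_2}{\omega}\right)^{\nu}\right]\delta(\omega-\omega_1-\omega_2)$   (5.21)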
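Let us now look at the second term of (4.10), obtained from (5.21) with the substitution $\omega \leftrightarrow \omega_1$ in the integrand and a minus sign

$I_2 = -\int d^2k_1\,d^2k_2\,|V_{1k2}|^2f_{1k2}\,\delta(\vec k_1-\vec k-\vec k_2)\,\delta(\omega_1-\omega-\omega_2) = -\alpha^{-2}V_0^2\int d\omega_1\,d\omega_2\,\frac{(\omega_1\omega_2)^{2/\alpha-1}}{\Delta}\,\omega_1^{2m/\alpha}\,g\!\left(\frac{\omega}{\omega_1}\right)\delta(\omega_1-\omega-\omega_2)\,(\omega\omega_2)^{-\nu}\left[1 - \left(\frac{\omega}{\omega_1}\right)^{\nu} - \left(\frac{\omega_2}{\omega_1}\right)^{\nu}\right]$   (5.22)

The regions of integration are different in (5.21) and (5.22): in the first case $0 < \omega_1 < \omega$ and in the latter $\omega < \omega_1 < \infty$. In order to ensure identical regions of integration, we use the following transformation

$\omega_1 = \frac{\omega^2}{\omega_1'}, \qquad \omega_2 = \frac{\omega\,\omega_2'}{\omega_1'}$   (5.23)

We note that in this transformation

$\frac{\partial(\omega_1,\omega_2)}{\partial(\omega_1',\omega_2')} = \left(\frac{\omega}{\omega_1'}\right)^3, \qquad \delta(\omega_1-\omega-\omega_2) = \frac{\omega_1'}{\omega}\,\delta(\omega-\omega_1'-\omega_2')$   (5.24)

Furthermore

$\Delta(\vec k,\vec k_1,\vec k_2) = \left(\frac{\omega}{\omega_1'}\right)^{2/\alpha}\Delta(\vec k,\vec k_1',\vec k_2')$   (5.25)

while

$f_{1k2} = n_\omega n_2 - n_1(n_\omega+n_2) = \left(\frac{\omega}{\omega_1'}\right)^{-2\nu}(\omega_1'\omega_2')^{-\nu}\left[1 - \left(\frac{\omega_1'}{\omega}\right)^{\nu} - \left(\frac{\omega_2'}{\omega}\right)^{\nu}\right]$   (5.26)

and

$(\omega_1\omega_2)^{2/\alpha-1} = \left(\frac{\omega}{\omega_1'}\right)^{6/\alpha-3}(\omega_1'\omega_2')^{2/\alpha-1}, \qquad \omega_1^{2m/\alpha}\,g\!\left(\frac{\omega}{\omega_1}\right) = \left(\frac{\omega}{\omega_1'}\right)^{2m/\alpha}\omega^{2m/\alpha}\,g\!\left(\frac{\omega_1'}{\omega}\right)$   (5.27)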
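Plugging (5.24-27) into (5.22) and relabeling the integration variables $\omega_1' \to \omega_1$, $\omega_2' \to \omega_2$, we get the final result for $I_2$

$I_2 = -\alpha^{-2}V_0^2\int d\omega_1\,d\omega_2\,\frac{(\omega_1\omega_2)^{2/\alpha-1-\nu}}{\Delta}\,\omega^{2m/\alpha}\,g\!\left(\frac{\omega_1}{\omega}\right)\delta(\omega-\omega_1-\omega_2)\left[1 - \left(\frac{\omega_1}{\omega}\right)^{\nu} - \left(\frac{\omega_2}{\omega}\right)^{\nu}\right]\left(\frac{\omega}{\omega_1}\right)^{\frac{2(m+2)}{\alpha}-2\nu-1}$   (5.28)

Setting

$x = -\frac{2(m+2)}{\alpha} + 2\nu + 1$   (5.29)

and adding the third term in (4.10), obtained from (5.22) by the substitution $\omega_1 \leftrightarrow \omega_2$, allows us to cast the collision integral into

$I(\omega) = \alpha^{-2}V_0^2\int\frac{d\omega_1\,d\omega_2}{\Delta}\,(\omega_1\omega_2)^{2/\alpha-1-\nu}\,\omega^{2m/\alpha}\,g\!\left(\frac{\omega_1}{\omega}\right)\delta(\omega-\omega_1-\omega_2)\left[1 - \left(\frac{\omega_1}{\omega}\right)^{\nu} - \left(\frac{\omega_2}{\omega}\right)^{\nu}\right]\left[1 - \left(\frac{\omega_1}{\omega}\right)^{x} - \left(\frac{\omega_2}{\omega}\right)^{x}\right]$   (5.30)

The integrand in (5.30) vanishes for $\nu = 1$ (Rayleigh-Jeans) and $\nu = (m+2)/\alpha$ (in general $\nu = (m+D)/\alpha$). Thus the Rayleigh-Jeans and Kolmogorov distributions are the only stationary power law solutions of the kinetic equations; then the energy flux $P$ (see (5.14)) is easily computed from $(\partial I/\partial\nu)|_{\nu=(m+2)/\alpha}$. The existence of other (i.e. non power law) solutions is an open problem.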
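The vanishing of the transformed integrand can be illustrated numerically. The sketch below assumes that, after the Zakharov transformation, the integrand is proportional to the product of two brackets of the form $1 - x_1^p - x_2^p$ with $x_i = \omega_i/\omega$ and $x_1 + x_2 = 1$, the first with $p = \nu$ and the second with $p = x = 2\nu + 1 - 2(m+D)/\alpha$. Function names are illustrative, and the common positive prefactor of the integrand is omitted.

```python
def bracket(p, x1):
    """1 - x1^p - x2^p on the resonance x1 + x2 = 1 (x_i = omega_i / omega)."""
    return 1.0 - x1**p - (1.0 - x1) ** p

def zakharov_exponent(nu, m, alpha, dim=2):
    """Exponent x = 2 nu + 1 - 2 (m + D) / alpha of the second bracket."""
    return 2.0 * nu + 1.0 - 2.0 * (m + dim) / alpha

def integrand_factor(nu, m, alpha, x1, dim=2):
    """Product of the two brackets; zero for stationary power law spectra."""
    return bracket(nu, x1) * bracket(zakharov_exponent(nu, m, alpha, dim), x1)

if __name__ == "__main__":
    m, alpha = 9.0 / 4.0, 3.0 / 2.0          # deep water capillary waves
    for nu in (1.0, (m + 2.0) / alpha, 2.0):
        print(nu, [integrand_factor(nu, m, alpha, x1) for x1 in (0.1, 0.37, 0.8)])
```

Each bracket vanishes identically on the resonance only when its exponent equals one, which happens for $\nu = 1$ in the first bracket and for $\nu = (m+D)/\alpha$ in the second; for any other $\nu$ the product is non-zero at generic $x_1$.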
5.3 Examples and final remarks
All examples will be given in the case $D = 2$, assuming isotropy of the distributions.

(i) Capillary waves in shallow water

$\alpha = 2, \qquad m = 2, \qquad f_1(x) = 1$   (5.31)

Note that the function $f_1$ is remarkably simple! One finds

$n(k) \propto P^{1/2}\left(\frac{\rho^3h}{\sigma}\right)^{1/4}k^{-4}$   (5.32)

(ii) Capillary waves in deep water

$\alpha = \frac{3}{2}, \qquad m = \frac{9}{4}$

One finds

$n(k) = C\,P^{1/2}\left(\frac{\rho^3}{\sigma}\right)^{1/4}k^{-17/4}$   (5.33)

As the function $f_1$ is more complicated than that in (5.31), the constant $C \simeq 9.85$ must be computed numerically.

(iii) Deep water gravity waves. The four-wave interaction can be treated along similar lines, but the calculations are still more cumbersome. From the values

$m = 3, \qquad \alpha = \frac{1}{2}$

one finds

$n(k) = C_1\,(P\rho^2)^{1/3}\,k^{-4}$   (energy)

$n(k) = C_2\,g^{1/6}\,(\rho^2Q)^{1/3}\,k^{-23/6}$   (wave action)   (5.34)

where $C_1$ and $C_2$ are calculable constants.

(iv) Convergence of the integrals. The assumption of a power law for $n(k)$ may lead to convergence problems when $k \to 0$ or $k \to \infty$. Of course, there are infrared and ultraviolet cut-offs coming from source and sink respectively. However, divergent integrals would be extremely sensitive to the cut-off, and the results for the spectra would be meaningless. It is easily checked that $k \to \infty$ gives no difficulty, but problems may arise from $k \to 0$. Fortunately the convergence is improved because of cancellations in suitable combinations of distribution functions. The $k \to 0$ behaviour is linked to that of the interaction for small values of one of the wave vectors

$|V(\vec k,\vec k_1,\vec k_2)|^2 \simeq V_0^2\,k_1^{m_1}\,k^{2m-m_1}, \qquad k_1 \ll k$   (5.35)

From the explicit expression of $V$ (see (3.11)), one finds $m_1 = 7/2$. The condition for the convergence may be shown to be

$2m_1 > 2m + 2 - 4\alpha$   (5.36)

This condition is realised for deep water capillary waves.

(v) Stability of the results. In Kolmogorov's theory of fully developed turbulence, one assumes isotropy of the energy spectrum $dE/dk$, even if energy pumping is anisotropic. The opposite situation may happen in weak turbulence: even if energy pumping is isotropic (i.e. $\Gamma(\vec k)$ depends only on $k$), a linear stability analysis shows that formation of structures may occur spontaneously. For example, the isotropic spectrum (5.33) of deep water capillary waves is unstable, and there is a tendency towards an anisotropic spectrum
The spectrum is isotropic at large scales, but becomes anisotropic at small scales: even an isotropic pumping will lead to an anisotropic spectrum. By contrast, the isotropic spectrum is stable for deep water gravity waves, as well as for shallow water capillary waves.

Acknowledgments

I would like to thank Frederic Dias, Francoise Guerin and Alain Pumir for their help while preparing these notes.

References

[1] H. Casimir: Proc. Kon. Ned. Akad. Wetenschap B 51, 793 (1948). C. Itzykson and J.-B. Zuber: Quantum Field Theory, Addison-Wesley (1980), section 3-2-4.
[2] Itzykson-Zuber, ref. [1], section 4-3-4.
[3] For a detailed exposition of the theory, see V. Zakharov, Y. L'vov and G. Falkovich: Kolmogorov Spectra of Turbulence I: Wave Turbulence, Springer (1992). However, the reader should be aware of the numerous misprints in this book.
[4] See, e.g.: L. Landau and E. Lifshitz: Fluid Mechanics, Pergamon Press (1980), sections 12 and 62. D. Acheson: Elementary Fluid Dynamics, Oxford University Press (1990). T. Faber: Fluid Dynamics for Physicists, Cambridge University Press (1995), chapter 5.
[5] R. Seliger and G. Whitham: Proc. Roy. Soc. A 305, 1 (1968).
[6] V. Zakharov: preprint.
[7] Landau-Lifshitz, ref. [4], section 25.
[8] H. Yuen and B. Lake: Nonlinear Dynamics of Deep-Water Gravity Waves.
[9] V. Zakharov: preface to the Russian version of the book by Yuen and Lake.
[10] V. Krasitskii: J. Fluid Mech. 272, 1 (1994).
[11] A. Dyachenko and V. Zakharov: Phys. Lett. A 190, 144 (1994).
[12] A. Dyachenko, Y. Lvov and V. Zakharov: Physica D 87, 233 (1995).
[13] See e.g.: E. Lifshitz and L. Pitaevskii: Physical Kinetics, Pergamon Press (1981), section 29.
[14] K. Hasselmann: J. Fluid Mech. 12, 481 (1962).
[15] A. Dyachenko and Y. Lvov: Am. Met. Soc. 3237 (1995).
[16] See e.g.: R. Balian: From Microphysics to Macrophysics, Springer (1992), section 15.3.
[17] Landau-Lifshitz, ref. [4], sections 33-34. U. Frisch: Turbulence, Cambridge University Press (1995).
PHENOMENA BEYOND ALL ORDERS AND BIFURCATIONS OF REVERSIBLE HOMOCLINIC CONNECTIONS NEAR HIGHER RESONANCES

ERIC LOMBARDI
Institut Non Linéaire de Nice

These lectures are devoted to phenomena beyond all orders in dynamical systems. We explain how these phenomena are connected with oscillatory integrals and how to obtain an exponentially small equivalent of an oscillatory integral when it involves solutions of nonlinear differential equations. The method proposed herein enables us to study the problem of existence of homoclinic connections to 0 for vector fields admitting a $0^2i\omega$ or a $(i\omega_0)^2i\omega_1$ resonance. These problems could not be solved by a direct application of the Melnikov method since in these cases the Melnikov function is given by an exponentially small oscillatory integral.
1 Introduction

1.1 Phenomena beyond all orders in dynamical systems

In many physical problems one can construct "solutions" in terms of power series of a small parameter $\varepsilon$, which read $Y(t,\varepsilon) = \sum_{n=0}^{k}\varepsilon^nY_n(t) + o(\varepsilon^k)$. In some cases, this power series is defined at any order but it diverges. This divergence may express that the system has a solution for which such an expansion misses some exponentially small term like $e^{-1/\varepsilon}Z(t)$. Such a term is said to lie beyond all algebraic orders. Such problems, in which these very small terms have great practical interest, are known in many branches of science including dendritic crystal growth, quantum tunneling, KAM theory, theory of water-waves (which was our original motivation for this work) and others. A collection of these apparently unrelated problems can be found in [20]. A first example is a model of crystal growth governed by the equation

$\varepsilon^2\theta''' + \theta' = \cos\theta$.   (1)
For $\varepsilon = 0$ this equation admits a front connecting the two fixed points $\pm\frac{\pi}{2}$. The question is then to determine whether a front connecting $\pm\frac{\pi}{2}$ exists for the perturbed equation. This question was studied by Hammersley and Mazzarino in [9] and by Amick and McLeod in [2], who proved that for $\varepsilon \neq 0$ such a front does not exist. This problem was also studied by Kruskal and Segur in [14], using Matching Asymptotic Expansions (M.A.E.). They give arguments
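which suggest that for $\varepsilon \neq 0$ there is no heteroclinic orbit connecting $\pm\frac{\pi}{2}$. Their argument is formal, but it is very general and it can be used for a large class of problems, including P.D.E.
A second example is a perturbed KdV equation
$$\varepsilon'^2 \frac{\partial^5 u}{\partial x^5} + \frac{\partial^3 u}{\partial x^3} + 6u\frac{\partial u}{\partial x} + \frac{\partial u}{\partial t} = 0$$
for which one is interested in the existence of solitary waves, i.e. in the existence of traveling waves $u(x,t) = \frac{c}{3}\, Y\big((x - ct)\sqrt{c}\big)$ such that $Y(\xi) \to 0$ as $\xi \to \pm\infty$. This amounts to looking for a homoclinic connection to 0 of the fourth order equation ($\varepsilon = \varepsilon'\sqrt{c}$)
$$\varepsilon^2 \frac{d^4 Y}{d\xi^4} + \frac{d^2 Y}{d\xi^2} - Y + Y^2 = 0. \qquad (2)$$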
For $\varepsilon' = 0$, the KdV equation admits a one parameter family of solitary waves explicitly given by $u(x,t) = \frac{1}{2} c \cosh^{-2}\big(\frac{\sqrt{c}}{2}(x - ct)\big)$. Hammersley and Mazzarino [8] and Amick and McLeod [3] proved that for $\varepsilon \neq 0$ there is no homoclinic connection to 0. This equation was also studied by Pomeau et al. in [19] using M.A.E. combined with Borel summation. They give arguments which suggest that for $\varepsilon \neq 0$ there is no homoclinic connection to 0, but that there exist homoclinic connections to exponentially small periodic orbits.
A third example occurs in water wave theory when studying the existence of solitary waves for an inviscid, incompressible and irrotational fluid layer governed by the Euler equations, under the influence of gravity and small surface tension (Bond number $b < 1/3$), for a Froude number $F$ close to 1 (see [11] and [15]). The existence of generalized solitary waves with exponentially small ripples at infinity has been obtained in [21] and [15]. Moreover, the non existence of true solitary waves for Bond number smaller than and close to 1/3 and for Froude number close to 1 was proved by Sun in [22].
A fourth example is a chain of nonlinear oscillators coupled with their nearest neighbors, governed by
$$\ddot{X}_n + V'(X_n) = \gamma\,(X_{n+1} - 2X_n + X_{n-1}), \qquad n \in \mathbb{Z}, \qquad (3)$$
where $X_n$ is a function of $t \in \mathbb{R}$, $\gamma$ is assumed to be positive and $V$ is a smooth function such that $V'(0) = 0$ and $V''(0) = 1$. The problem is then to determine the existence of traveling waves, i.e. solutions of (3) of the form $X_n(t) = x(t - n\tau)$. G. Iooss and K. Kirchgässner proved in [12] that there exist in the parameter plane $(\tau, \gamma)$ curves in the neighborhood of which there always exist nanopterons, i.e. homoclinic connections to exponentially small periodic orbits, and there is generically no homoclinic connection to 0.
Figure 1. The $0^2 i\omega$ and the $(i\omega_0)^2 i\omega_1$ resonances (spectrum for $\mu < 0$, $\mu = 0$ and $\mu > 0$).
Now, if we want to determine the common denominators of all these examples, i.e. what causes the appearance of exponentially small oscillations and the disappearance of hetero or homoclinic connections, we first observe that all these problems can be reformulated as dynamical systems
$$\frac{dX}{d\xi} = V(X, \mu)$$
in finite dimensions (O.D.E. case, $X \in \mathbb{R}^n$) or infinite dimensions (P.D.E. case, $X \in H$, $H$ a Hilbert or Banach space), studied near a fixed point placed at the origin. Moreover, the parameter $\mu$ can be chosen such that the appearance of exponentially small oscillations and the disappearance of hetero or homoclinic connections occur for $\mu = 0$. Finally the time $\xi$ of the dynamical system is not necessarily the physical time. For instance, for traveling waves, "$\xi = x - ct$". A second observation is that all these systems are reversible, i.e. that the vector field $V$ anticommutes with some symmetry $S$. The last observation is that for $\mu = 0$, when the phenomena of appearance or disappearance occur, the spectrum of the differential $D_X V(0,0)$ presents a very specific configuration. A first example is the $0^2 i\omega$ resonance, which occurs for the perturbed KdV equation and for the water wave problem. In this case a first part of the spectrum of $D_X V(0, \mu)$ is bounded away from the imaginary axis and a second part admits the bifurcation described in Figure 1. A second example is the $(i\omega_0)^2 i\omega_1$ resonance, which occurs for the chain of coupled nonlinear oscillators. The bifurcation of the spectrum corresponding to this resonance is described in Figure 1. We can observe that for these two resonances, even after bifurcation, for $\mu > 0$ there remains on the imaginary axis a
pair of purely imaginary eigenvalues which coexists with a set of hyperbolic eigenvalues with small real parts. Hence, a "rapid oscillatory part and a slow hyperbolic part" coexist in such vector fields admitting a $0^2 i\omega$ or a $(i\omega_0)^2 i\omega_1$ resonance.
Other examples of problems involving relevant exponentially small terms can be found in the Hamiltonian literature on the splitting of separatrices. This study was initiated by Poincaré in [18]. Later on, a regular method for studying the splitting of separatrices was proposed by Melnikov [17], and Arnold gave this method an elegant form [4]. Starting from the work of Holmes, Marsden and Scheurle [10], the rapidly forced pendulum governed by the equation
$$\ddot{x} = \sin x + \mu \varepsilon^p \sin\frac{t}{\varepsilon} \qquad (4)$$
became one of the most popular models involving this difficulty. Holmes, Marsden and Scheurle gave exponentially small estimates of the splitting for $p \geq 8$. This result was improved by several authors up to $p = 0$ (see [6] for the case $p = 0$ and for a good history of this subject). More recently Gelfreich gave a proof of the asymptotic formula of the splitting up to $p = -2$ [7]. Here again, a "rapid oscillatory part and a slow hyperbolic part" coexist in the system.
1.2 A little toy model: from phenomena beyond any algebraic order to oscillatory integrals
To illustrate how phenomena beyond all orders are induced by the coexistence in a system of a "rapid oscillatory part and a slow hyperbolic part", we begin with a toy model in $\mathbb{R}^3$ which reads
$$\frac{dA}{dx} = -\omega B, \qquad \frac{dB}{dx} = \omega A + \rho(\varepsilon^2 - \alpha^2), \qquad \frac{d\alpha}{dx} = \varepsilon^2 - \alpha^2, \qquad (5)$$
where $(A, B, \alpha) \in \mathbb{R}^3$ and $\omega$ is a positive fixed number. This system has two parameters: $\varepsilon > 0$, which is small, and $\rho$, which can be any real number. This system develops typical "phenomena beyond any algebraic order" which can be seen without any difficulty because all the bounded solutions can be computed explicitly. Hence, this system makes it possible to understand what kind of mathematical difficulties are hidden behind these phenomena, and
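what kind of mathematical tools have to be developed for dealing with any system where such phenomena are involved. To study this system it is more convenient to set
$$Z = A + iB, \qquad \alpha = \varepsilon\beta, \qquad x = t/\varepsilon,$$
and to rewrite (5) with these new coordinates. We obtain
$$\frac{dZ}{dt} = \frac{i\omega}{\varepsilon} Z + i\rho\varepsilon(1 - \beta^2), \qquad (6a)$$
$$\frac{d\beta}{dt} = 1 - \beta^2. \qquad (6b)$$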
This system is reversible, which means that if $(Z(t), \beta(t))$ is a solution, then $S(Z(-t), \beta(-t))$ is another solution, where $S$ is the reflection given by $S(Z, \beta) = (\bar{Z}, -\beta)$. We call a reversible solution of (6) a solution which satisfies $S(Z(-t), \beta(-t)) = (Z(t), \beta(t))$.
For $\rho = 0$, the system is uncoupled and the phase portrait is fully symmetric (see Fig. 2): the truncated system admits two families of periodic orbits of arbitrary size given by
$$P^{\pm}_{k,\varphi}(t) = \big(k e^{i(\omega t/\varepsilon + \varphi)}, \pm 1\big)$$
($k = 0$ corresponds to the two fixed points) and a family of heteroclinic orbits connecting $P^-_{k,\varphi}$ to $P^+_{k,\varphi}$ which read
$$H_{k,\varphi,t_0}(t) = \big(k e^{i(\omega t/\varepsilon + \varphi)}, \tanh(t + t_0)\big).$$
Among these solutions there is a one parameter family of reversible ones given by $H_{k,0,0}$ with $k \in \mathbb{R}$ ($H_{-k,0,0}(t) = H_{k,\pi,0}(t)$), and there is a unique (up to a phase shift) front connecting $(0, \pm 1)$ which reads $h(t) = H_{0,0,0}(t) = (0, \tanh t)$. Finally, observe that the periodic orbits are exchanged by symmetry, i.e.
$$S P^{+}_{k,\varphi}(t) = P^{-}_{k,-\varphi}(-t).$$
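The reversibility of (6) and the exchange of the periodic orbits by $S$ can be checked numerically. The following is a minimal sketch, assuming the form of system (6) and of the reflection $S(Z,\beta) = (\bar Z, -\beta)$ written above; the function names are illustrative and not part of the original notes.

```python
import cmath

def vector_field(Z, beta, omega, eps, rho):
    """Right-hand side of system (6): returns (dZ/dt, dbeta/dt)."""
    dZ = 1j * omega / eps * Z + 1j * rho * eps * (1.0 - beta ** 2)
    dbeta = 1.0 - beta ** 2
    return dZ, dbeta

def S(Z, beta):
    """Reflection S(Z, beta) = (conj(Z), -beta)."""
    return Z.conjugate(), -beta

def anticommutation_defect(Z, beta, omega, eps, rho):
    """Size of S V(Y) + V(S Y); it vanishes for a reversible vector field."""
    sv = S(*vector_field(Z, beta, omega, eps, rho))
    vs = vector_field(*S(Z, beta), omega, eps, rho)
    return abs(sv[0] + vs[0]) + abs(sv[1] + vs[1])

def periodic_orbit(t, k, phi, sign, omega, eps):
    """Periodic orbit P^{+/-}_{k,phi}(t) of the truncated system (sign = +1 or -1)."""
    return k * cmath.exp(1j * (omega * t / eps + phi)), float(sign)
```

For instance, `anticommutation_defect(0.3 + 0.7j, 0.4, 2.0, 0.1, 1.5)` should vanish up to rounding errors, which is the anticommutation $SV = -V\circ S$ behind the statement that $S(Z(-t),\beta(-t))$ is again a solution.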
The question is then to determine how this phase portrait is deformed by the higher order terms ($\rho \neq 0$). In other words, do the previously found orbits persist for the perturbed system?
For periodic solutions the answer is yes. Moreover, because of the particular form of the perturbation, these orbits persist without any deformation. Indeed $P^{\pm}_{k,\varphi}$ is still a solution of the perturbed system. For more general perturbations, the periodic orbits are deformed, and the proof of their persistence can be given by the Lyapunov-Schmidt method.
For the heteroclinic connections the situation is not so clear. A first question is the persistence of the reversible connections.
Figure 2. Phase portrait for $\rho = 0$.
A very first way to make up one's mind is to look for a reversible front connecting the two fixed points $(0, -1)$ and $(0, 1)$ using a classical asymptotic approach: one can compute a formal reversible solution of (6) connecting $(0, -1)$ to $(0, 1)$ in the form
$$Y(t) = \Big(\sum_{n \geq 0} \varepsilon^n Z_n(t), \tanh t\Big)$$
where $Z_0 = Z_1 = 0$ and
$$Z_n(t) = -\frac{i\rho}{(i\omega)^{n-1}} \frac{d^{n-2}}{dt^{n-2}}\Big(\frac{1}{\cosh^2(t)}\Big) \qquad \text{for } n \geq 2.$$
So this first approach predicts the persistence of the reversible front (provided that the power series converges).
A second approach for studying the persistence of the reversible connection is to formulate the problem in terms of stable manifolds and to illustrate it geometrically. We denote by $W^{s,*}_{+,k}$ the stable manifold of the periodic solution $P^+_k$ of the truncated system. The manifold $W^{s,*}_{+,k}$ is a two dimensional cylinder in $\mathbb{R}^3$ (one dimension for time and one dimension for the phase shift). The radius of this cylinder is the size of the periodic orbit, i.e. $k$. A fundamental remark is the following: since $P^+_k$ and $P^-_k$ are exchanged by symmetry, there exists a reversible heteroclinic orbit connecting $P^-_k$ and $P^+_k$ if and only if the stable manifold of $P^+_k$ intersects the symmetry space $\Sigma^+ = \{Y : SY = Y\}$. In this case, $\Sigma^+$ is a line in $\mathbb{R}^3$. For $k > 0$, the intersection consists of two points, which lead to the two reversible heteroclinic orbits connecting the periodic orbits $P^-_k$ and $P^+_k$. For $k = 0$, there is a unique point of intersection, which leads to the unique reversible front of the truncated system. Figure 3 represents the intersection between $W^{s,*}_{+,k}$ and $\Sigma^+$ in two cases, $k > 0$ and $k = 0$, for a fixed value of $\varepsilon$.
Figure 3. Stable manifold of the periodic solution $P^+_k$ of the truncated system (cases $k > 0$ and $k = 0$).
Let us denote by $W^{s}_{+,k}$ the stable manifold of the periodic solution $P^+_k$ of the perturbed system. The stable manifold $W^{s}_{+,k}$ is obtained by perturbation of the stable manifold $W^{s,*}_{+,k}$ of the periodic solution of the truncated system. In Figure 3 we can observe that for $k = 0$ the situation is not robust: unless there is a miracle, there should not exist any reversible front for the perturbed system, whereas for $k$ large enough the two points of intersection between $W^{s}_{+,k}$ and $\Sigma^+$ should persist, i.e. there should exist two reversible heteroclinic
orbits connecting $P^-_k$ and $P^+_k$. The natural question is then to determine, for a fixed value of the bifurcation parameter $\varepsilon$, the smallest size $k_c(\varepsilon)$ of the periodic solutions $P^-_k$ and $P^+_k$ which admit a reversible heteroclinic orbit connecting them: is it 0 or not? And if it is not 0, we would like to compute $k_c$ with respect to $\varepsilon$. When $k_c > 0$, a subsidiary question is then to determine the behavior, when $t$ goes to $-\infty$, of the one dimensional stable manifold $W^{s}_{+,0}$ of the fixed point $(0, 1)$. This second approach predicts the non persistence of the reversible front.
So, we have two heuristic arguments (an asymptotic one and a geometrical one) which lead to two opposite predictions.
However, here we can overcome this difficulty since all the bounded solutions of the perturbed system can be computed explicitly. For a more general perturbation, explicit computations cannot be done, and the exponential smallness of the Melnikov function leads to serious difficulties. One aim of this work is to give mathematical tools to study such general systems.
Now, let us perform explicit computations to check the validity of our heuristic arguments.
Lemma 1.1 The bounded solutions of the perturbed system (6) are
$$P^{\pm}_{k,\varphi}(t) = \big(k e^{i(\omega t/\varepsilon + \varphi)}, \pm 1\big)$$
and
$$Y_{t_0,z_0}(t) = \Big(z_0 e^{i\omega t/\varepsilon} + i\rho\varepsilon \int_0^t \frac{e^{i\omega(t-s)/\varepsilon}}{\cosh^2(s + t_0)}\, ds, \ \tanh(t + t_0)\Big)$$
where $k, \varphi, t_0 \in \mathbb{R}$ and $z_0 \in \mathbb{C}$.
The proof of this lemma is made in two steps. We first observe that the bounded solutions of (6b) are $\beta \equiv \pm 1$ and $\beta(t) = \tanh(t + t_0)$. Then it remains to solve (6a), which is a linear oscillator of high frequency $\omega/\varepsilon$ forced by an explicitly known, analytic, exponentially decaying right-hand side.
Using the explicit formula giving $Y_{t_0,z_0}$ we easily compute the reversible solutions.
Lemma 1.2
1. The perturbed system admits a one parameter family of reversible heteroclinic orbits $H_\lambda$ explicitly given by
$$H_\lambda(t) = \Big(\lambda e^{i\omega t/\varepsilon} + i\rho\varepsilon \int_0^t \frac{e^{i\omega(t-s)/\varepsilon}}{\cosh^2(s)}\, ds, \ \tanh t\Big)$$
with $\lambda \in \mathbb{R}$.
2. The solution $H_\lambda$ connects $P^-_{k(\lambda,\varepsilon),-\varphi(\lambda,\varepsilon)}$ to $P^+_{k(\lambda,\varepsilon),\varphi(\lambda,\varepsilon)}$, where
$$k(\lambda,\varepsilon)\, e^{i\varphi(\lambda,\varepsilon)} = \lambda + \rho\varepsilon \int_0^{+\infty} \frac{\sin(\omega s/\varepsilon)}{\cosh^2(s)}\, ds + i\rho\varepsilon \int_0^{+\infty} \frac{\cos(\omega s/\varepsilon)}{\cosh^2(s)}\, ds.$$
3. When $\lambda$ varies from $-\infty$ to $+\infty$, $k(\lambda, \varepsilon)$ takes twice all the values in $]k_c(\varepsilon), +\infty[$ and once the value $k_c(\varepsilon)$, which reads
$$k_c(\varepsilon) = \rho\varepsilon \int_0^{+\infty} \frac{\cos(\omega s/\varepsilon)}{\cosh^2(s)}\, ds = \frac{\pi\rho\omega}{2\sinh(\omega\pi/2\varepsilon)} \underset{\varepsilon \to 0}{\sim} \pi\rho\omega\, e^{-\omega\pi/2\varepsilon}.$$
This lemma confirms our former geometric arguments: for the perturbed system, and for each fixed $\varepsilon > 0$, there exists a smallest size $k_c(\varepsilon)$ (exponentially small) such that for every $k$ with $0 < k < k_c(\varepsilon)$ there is no reversible heteroclinic orbit which connects $P^-_k$ and $P^+_k$, whereas for the truncated system there exist heteroclinic orbits which connect two periodic orbits of the same arbitrarily small size. The "surprise" is that this smallest size is exponentially small. It lies beyond any algebraic order of $\varepsilon$. This phenomenon cannot be detected using a classical expansion of the solutions in powers of $\varepsilon$.
The next question is to determine the behavior of the one-dimensional stable manifold $W^{s}_{+,0}$ of the fixed point $(0, 1)$ when $t$ goes to $-\infty$. Here again, using the explicit formulas giving the solutions we obtain
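Lemma 1.3 A parameterization of the stable manifold $W^{s}_{+,0}$ of the fixed point $(0, 1)$ is given by
$$Y_s(t) = \Big(-i\rho\varepsilon \int_t^{+\infty} \frac{e^{i\omega(t-s)/\varepsilon}}{\cosh^2(s)}\, ds, \ \tanh t\Big).$$
As $t \to -\infty$ it approaches the periodic orbit $P^-_{K(\varepsilon),-\pi/2}$, with
$$K(\varepsilon) = \rho\varepsilon \int_{-\infty}^{+\infty} \frac{e^{i\omega s/\varepsilon}}{\cosh^2(s)}\, ds = \frac{\pi\rho\omega}{\sinh(\omega\pi/2\varepsilon)} \underset{\varepsilon \to 0}{\sim} 2\pi\rho\omega\, e^{-\omega\pi/2\varepsilon}.$$
So the front does not persist. The stable manifold of $(0, 1)$ does not connect $(0, -1)$ to $(0, 1)$, but it connects an exponentially small periodic orbit $P^-_{K(\varepsilon),-\pi/2}$ to $(0, 1)$. The stable manifold of $(0, 1)$ for the perturbed system develops exponentially small oscillations at $-\infty$ which cannot be detected with a classical asymptotic expansion of the solution in powers of $\varepsilon$.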
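The closed forms of the two exponentially small sizes can be compared with a direct quadrature. The sketch below assumes the expressions $k_c(\varepsilon) = \rho\varepsilon\int_0^{+\infty}\cos(\omega s/\varepsilon)\cosh^{-2}(s)\,ds$ and $K(\varepsilon) = 2k_c(\varepsilon)$ read off from the two lemmas above; the Simpson rule, the cutoff and the function names are illustrative choices, not part of the original notes.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def kc_quadrature(omega, eps, rho, cutoff=40.0, n=40000):
    """rho*eps*int_0^cutoff cos(omega*s/eps)/cosh(s)^2 ds (the tail beyond cutoff is negligible)."""
    g = lambda s: math.cos(omega * s / eps) / math.cosh(s) ** 2
    return rho * eps * simpson(g, 0.0, cutoff, n)

def kc_closed_form(omega, eps, rho):
    """pi*rho*omega / (2*sinh(omega*pi/(2*eps)))."""
    return math.pi * rho * omega / (2.0 * math.sinh(omega * math.pi / (2.0 * eps)))

def kc_leading_order(omega, eps, rho):
    """Leading exponentially small behaviour pi*rho*omega*exp(-omega*pi/(2*eps))."""
    return math.pi * rho * omega * math.exp(-omega * math.pi / (2.0 * eps))
```

The quadrature is only meaningful for moderate values of $\omega/\varepsilon$: for very small $\varepsilon$ the integral is far below the rounding errors of the sum, which is one concrete way to see why such quantities cannot be captured by naive numerics or by expansions in powers of $\varepsilon$.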
A third phenomenon beyond any algebraic order can be found when computing the distance $d(\varepsilon)$ between the stable manifold $W^{s}_{+,0}$ of $(0, 1)$ and the symmetry line $\Sigma^+$. Once again, this distance is given by an oscillatory integral and it is exponentially small.
In all cases, the exponential smallness of $k_c(\varepsilon)$, $K(\varepsilon)$ and $d(\varepsilon)$ is due to the fact that these quantities are given by oscillatory integrals of the form
$$I(\varepsilon) = \int_{-\infty}^{+\infty} e^{i\omega t/\varepsilon} f(X(t))\, dt$$
where $f$ is an analytic function and $X(t)$ is a particular solution of the system. The toy model has been designed so that all the solutions can be explicitly computed. Hence, in this particular case the integrand of the oscillatory integral is explicitly known, whereas in general this is not true. So, in general we have to face the following problem:
Problem 1.4 Assume we study a nonlinear differential equation in finite (O.D.E. case) or infinite (P.D.E. case) dimensions of the form
$$\frac{dY}{dt} = F(Y, t, \varepsilon) \qquad (7)$$
and that we want to compute
$$I(\varepsilon) = \int_{-\infty}^{+\infty} e^{i\omega t/\varepsilon} g(Y_0(t, \varepsilon))\, dt$$
where $\omega$ is positive, $g$ is a given function, and $Y_0$ is a particular solution of (7) characterized by its initial value or, more frequently, by its behavior at infinity (for instance, it goes to a fixed point, a periodic orbit...). The problem is to determine what kind of information on $Y_0$ we need in order to compute, or at least to bound, the oscillatory integral $I(\varepsilon)$.
In this lecture we present rigorous mathematical methods, partially inspired by the M.A.E. approach, which make it possible to compute the size of oscillatory integrals involving solutions of nonlinear differential equations. These tools enable us to prove typical results such as the existence of homoclinic connections to exponentially small periodic orbits and the non persistence of homoclinic connections to 0 for reversible vector fields studied near a resonance. Section 4 deals with the $0^{2+} i\omega$ resonance and Section 5 with the $(i\omega_0)^2 i\omega_1$ resonance. The crux of the analysis is the description of the analytic continuation of the solutions, which makes it possible to catch exponentially small terms which are "hidden beyond all orders on the real axis".
2 Exponential tools for evaluating mono frequency oscillatory integrals
This section is devoted to the study of Problem 1.4, i.e. to the computation of upper bounds and equivalents of oscillatory integrals
$$\int_{-\infty}^{+\infty} f(t, \varepsilon)\, e^{i\omega t/\varepsilon}\, dt$$
involving solutions of nonlinear analytic differential equations.
2.1 Rough exponential upper bounds
We begin with a very simple lemma which gives exponential upper bounds for mono-frequency oscillatory integrals.
Lemma 2.1 (First Mono-Frequency Exponential Lemma) Let $\omega$, $\ell$, $\lambda$ be three real positive numbers. Let $H^\lambda_\ell$ be the set of functions $f : B_\ell \times ]0,1] \to \mathbb{C}$ such that $f(\cdot, \varepsilon)$ is holomorphic in $B_\ell = \{\xi \in \mathbb{C} : |\mathrm{Im}(\xi)| < \ell\}$ and
$$\|f\|_{H^\lambda_\ell} := \sup_{\varepsilon \in ]0,1],\ \xi \in B_\ell} \big(|f(\xi, \varepsilon)|\, e^{\lambda |\mathrm{Re}(\xi)|}\big) < +\infty.$$
Then for every $f \in H^\lambda_\ell$ and $\varepsilon \in ]0,1]$, $I^{\pm}(f, \varepsilon) = \int_{-\infty}^{+\infty} f(t, \varepsilon)\, e^{\pm i\omega t/\varepsilon}\, dt$ satisfies
$$|I^{\pm}(f, \varepsilon)| \leq \frac{2}{\lambda}\, \|f\|_{H^\lambda_\ell}\, e^{-\omega\ell/\varepsilon}.$$
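Proof. We only do the proof for $I^+$. For $I^-$, perform the change of time $t' = -t$ in the integral and observe that $\tilde{f} : \xi \mapsto f(-\xi, \varepsilon)$ also belongs to $H^\lambda_\ell$ and that $\|\tilde{f}\|_{H^\lambda_\ell} = \|f\|_{H^\lambda_\ell}$. Let $f \in H^\lambda_\ell$, $\varepsilon$ and $\ell' < \ell$ be fixed. Since $f$ is holomorphic in $B_\ell$, the integral of $f e^{i\omega\xi/\varepsilon}$ along the path $\Gamma_1$ given in Fig. 4 is equal to zero. Pushing $R$ to $+\infty$, we get
$$I^+(f, \varepsilon) = \int_{-\infty}^{+\infty} f(i\ell' + t, \varepsilon)\, e^{i\omega(i\ell' + t)/\varepsilon}\, dt.$$
The estimate then follows by letting $\ell'$ tend to $\ell$, where the exponential comes from the oscillating term computed on the line $\mathrm{Im}(\xi) = \ell'$. □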
Figure 4. Path $\Gamma_1$.
Lemma 2.1 gives a very efficient way to obtain exponential upper bounds because the membership to $H^\lambda_\ell$ is stable by addition, multiplication and "composition", which can be summed up as follows.
Lemma 2.2 $H^\lambda_\ell$ is an algebra, and if $f \in H^\lambda_\ell$ and $g$ is holomorphic in a domain containing the range of $f$ and satisfies $g(0) = 0$, then $g \circ f \in H^\lambda_\ell$.
Remark 2.3 Lemma 2.1 is called mono-frequency because functions belonging to $H^\lambda_\ell$ cannot have a second "high frequency" like $f_0(\xi, \varepsilon) = \cos(\omega_0 \xi/\varepsilon) \cosh^{-1}\xi$. Indeed, for every $\ell > 0$, $\|f_0\|_{H^\lambda_\ell} = +\infty$ holds because of the oscillating term. Oscillatory integrals involving such functions are studied in Subsection 5.1.
Remark 2.4 How to use Lemma 2.1 for solving Problem 1.4. Our aim is to compute the size, or at least an upper bound of the size, of an integral of the form
$$\int_{-\infty}^{+\infty} g(Y_0(t), t, \varepsilon)\, e^{i\omega t/\varepsilon}\, dt,$$
where $g$ is an analytic function and $Y_0$ is a particular solution of a differential equation of the form
$$\frac{dY}{dt} = F(Y, t, \varepsilon)$$
with $F$ analytic with respect to $(Y, t)$. To apply Lemma 2.1 we have to proceed in three steps:
• We must find the singularity of $\xi \mapsto F(\cdot, \xi, \cdot)$ which has the smallest imaginary part. This determines the width of the complex strip $B_\ell$: $\ell$ must be chosen such that $F$ has no singularity in $B_\ell$.
• Studying the holomorphic equation $\frac{dY}{d\xi} = F(Y, \xi, \varepsilon)$, $\xi \in B_\ell$, we must extend the solutions of our first real equation into $B_\ell$. This step is often performed using the Contraction Mapping Theorem in an appropriate space of functions of the type $H^\lambda_\ell$.
• Then, we can apply Lemma 2.1 to obtain an exponential upper bound of $I(\varepsilon)$.
This strategy works either in finite or infinite dimensions.
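The first of the three steps above can be illustrated on the model function $\xi \mapsto \cosh^{-2}\xi$, whose singularity closest to the real axis is the pole at $i\frac{\pi}{2}$. The following is a minimal sketch (names and grid parameters are illustrative assumptions): it samples the modulus of the function on the horizontal line $\mathrm{Im}(\xi) = \ell$ and shows how the supremum blows up like $1/\cos^2\ell$ as $\ell$ approaches $\frac{\pi}{2}$, which is what limits the width of the admissible strip.

```python
import cmath
import math

def inv_cosh2(xi):
    """Model function 1/cosh(xi)^2, singular where cosh(xi) = 0."""
    return 1.0 / cmath.cosh(xi) ** 2

def sup_on_line(f, height, x_max=10.0, n=2001):
    """Largest |f(x + i*height)| over a uniform grid of x in [-x_max, x_max]."""
    step = 2.0 * x_max / (n - 1)
    return max(abs(f(complex(-x_max + i * step, height))) for i in range(n))
```

Since $|\cosh(x + i\ell)|^2 = \sinh^2 x + \cos^2\ell$, the sampled supremum is attained at $x = 0$ and equals $1/\cos^2\ell$, so `sup_on_line(inv_cosh2, 1.5)` is already of order $2 \times 10^2$.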
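2.2 Sharp exponential upper bounds
In order to test the sharpness of Lemma 2.1 one can apply it to the basic example
$$J(\varepsilon) = \int_{-\infty}^{+\infty} e^{i\omega t/\varepsilon} \cosh^{-2}(t)\, dt,$$
which can be explicitly computed using residues,
$$J(\varepsilon) = \frac{\pi\omega}{\varepsilon \sinh(\pi\omega/2\varepsilon)} \underset{\varepsilon \to 0}{\sim} \frac{2\pi\omega}{\varepsilon}\, e^{-\pi\omega/2\varepsilon}.$$
The function $\xi \mapsto \cosh^{-2}\xi$ has a double pole at $i\frac{\pi}{2}$. This pole limits the size of the complex strip to $\ell < \frac{\pi}{2}$. Applying Lemma 2.1 with a fixed $\lambda \in ]0, 2]$, one then finds
$$|J(\varepsilon)| \leq \frac{2}{\lambda} \Big\|\frac{1}{\cosh^2(\xi)}\Big\|_{H^\lambda_\ell} e^{-\omega\ell/\varepsilon} \leq \frac{M}{\cos^2\ell}\, e^{-\omega\ell/\varepsilon}, \qquad \varepsilon > 0, \ \forall \ell < \frac{\pi}{2},$$
where $M$ does not depend on $\ell$ or $\varepsilon$. To obtain the right coefficient in the exponential we can set $\ell = \frac{\pi}{2} - \delta\varepsilon$. We get
$$|J(\varepsilon)| \leq \frac{M e^{\delta\omega}}{\sin^2(\delta\varepsilon)}\, e^{-\omega\pi/2\varepsilon} \underset{\varepsilon \to 0}{\sim} \frac{M e^{\delta\omega}}{\delta^2\varepsilon^2}\, e^{-\omega\pi/2\varepsilon}.$$
With this "trick" we obtain the right coefficient in the exponential, but the polynomial degeneracy is not correct: the bound is of order $\varepsilon^{-2}$ whereas $J(\varepsilon)$ is of order $\varepsilon^{-1}$. This is due to the fact that $\|\cdot\|_{H^\lambda_\ell}$ is uniform in the strip $B_\ell$ and does not describe what happens near the singularities. So we give in this section a refined version of the Exponential Lemma 2.1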
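which gives the right coefficient in the exponential and the right polynomial degeneracy.
Lemma 2.5 (Second Mono-Frequency Exponential Lemma) Let $\omega$, $\gamma$, $\sigma$, $\lambda$ be four positive numbers. Let $\delta$ be in $[1, +\infty[$. Let $H^{\gamma,\lambda}_{\sigma,\delta}$ be the Banach space of functions $f : \bigcup_{0 < \varepsilon \leq \varepsilon_0} B_{\sigma - \delta\varepsilon} \times \{\varepsilon\} \to \mathbb{C}$ such that $f(\cdot, \varepsilon)$ is holomorphic in $B_{\sigma - \delta\varepsilon}$ and $\|f\|_{H^{\gamma,\lambda}_{\sigma,\delta}} < +\infty$, where
$$\|f\|_{H^{\gamma,\lambda}_{\sigma,\delta}} := \sup_{\varepsilon \in ]0,\varepsilon_0],\ \xi \in B_{\sigma - \delta\varepsilon}} \Big[ |f(\xi, \varepsilon)| \Big( |\sigma^2 + \xi^2|^\gamma \chi_1(\xi) + (1 - \chi_1(\xi))\, e^{\lambda |\mathrm{Re}(\xi)|} \Big) \Big],$$
with $\chi_1(\xi) = 1$ for $|\mathrm{Re}(\xi)| < 1$ and $\chi_1(\xi) = 0$ otherwise. Then for every $\gamma > 1$, there exists $M(\gamma, \lambda, \sigma, \omega)$ such that for every $f \in H^{\gamma,\lambda}_{\sigma,\delta}$ and $\varepsilon \in ]0, \varepsilon_0]$, $I^{\pm}(f, \varepsilon) = \int_{-\infty}^{+\infty} f(t, \varepsilon)\, e^{\pm i\omega t/\varepsilon}\, dt$ satisfies
$$|I^{\pm}(f, \varepsilon)| \leq \frac{M}{\varepsilon^{\gamma - 1}}\, e^{-\omega\sigma/\varepsilon}\, \|f\|_{H^{\gamma,\lambda}_{\sigma,\delta}}.$$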
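The difference between the rough and the sharp bound can be made concrete on the basic example $J(\varepsilon)$ of this subsection. The sketch below assumes the closed form $J(\varepsilon) = \pi\omega/(\varepsilon\sinh(\pi\omega/2\varepsilon))$ and compares it with the two "shapes" $e^{\delta\omega}\sin^{-2}(\delta\varepsilon)\,e^{-\omega\pi/2\varepsilon}$ (Lemma 2.1 with $\ell = \frac{\pi}{2} - \delta\varepsilon$) and $\varepsilon^{-1}e^{-\omega\pi/2\varepsilon}$ (Lemma 2.5 with $\gamma = 2$, $\sigma = \frac{\pi}{2}$); the multiplicative constants are dropped and the function names are illustrative.

```python
import math

def J_exact(omega, eps):
    """Closed form of int e^{i omega t/eps} / cosh(t)^2 dt over the real line."""
    return math.pi * omega / (eps * math.sinh(math.pi * omega / (2.0 * eps)))

def rough_shape(omega, eps, delta=1.0):
    """Shape of the bound from Lemma 2.1 with l = pi/2 - delta*eps (constant dropped)."""
    return math.exp(delta * omega) / math.sin(delta * eps) ** 2 * math.exp(-omega * math.pi / (2.0 * eps))

def sharp_shape(omega, eps):
    """Shape of the bound from Lemma 2.5 with gamma = 2, sigma = pi/2 (constant dropped)."""
    return math.exp(-omega * math.pi / (2.0 * eps)) / eps
```

The ratio `J_exact / sharp_shape` stays bounded and tends to $2\pi\omega$ as $\varepsilon \to 0$, whereas `rough_shape / J_exact` grows like $1/\varepsilon$: the rough bound has the right exponential but one power of $\varepsilon$ too many in the denominator.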
Remark 2.6 Observe that $|\sigma^2 + \xi^2|^\gamma = |(\xi - i\sigma)(\xi + i\sigma)|^\gamma$. Hence, roughly speaking, $H^{\gamma,\lambda}_{\sigma,\delta}$ is the space of the holomorphic functions in $B_{\sigma - \delta\varepsilon}$ which decay exponentially at infinity and which explode at most as $|\xi \mp i\sigma|^{-\gamma}$ for $\xi$ near $\pm i\sigma$.
Proof. The proof is done as for the First Mono-Frequency Exponential Lemma 2.1, using the rectangular path drawn in Figure 4. The details are left to the reader.
Remark 2.7 We check on our basic example $J(\varepsilon)$ that this lemma gives the right polynomial degeneracy,
$$|J(\varepsilon)| \leq M \Big\|\frac{1}{\cosh^2(\xi)}\Big\|_{H^{2,1}_{\pi/2,1}} \frac{1}{\varepsilon}\, e^{-\omega\pi/2\varepsilon} = \frac{M'}{\varepsilon}\, e^{-\omega\pi/2\varepsilon}.$$
Remark 2.8 The membership to $H^{\gamma,\lambda}_{\sigma,\delta}$ does not mean that functions have a pole at $\pm i\sigma$. Functions like $(\xi, \varepsilon) \mapsto e^{-\xi^2}$, which is entire; $(\xi, \varepsilon) \mapsto \cosh^{-2}(\xi/2)$ and $(\xi, \varepsilon) \mapsto \cosh^{-1}(\xi + i\varepsilon^2) \cosh^{-1}(\xi - i\varepsilon^2)$, which respectively have poles at $\pm i\pi$ and at $\pm i\frac{\pi}{2} \pm i\varepsilon^2$; and $(\xi, \varepsilon) \mapsto e^{i\varepsilon\tanh(\xi)} \cosh^{-2}(\xi)$, which has an essential singularity at $i\frac{\pi}{2}$, all belong to $H^{2,1}_{\pi/2,1}$. This comes from the fact that we work in $B_{\frac{\pi}{2} - \varepsilon}$ and not in $B_{\frac{\pi}{2}}$. The membership simply describes how the function grows near $i(\sigma - \delta\varepsilon)$.
Figure 5. Description of $f \in H^{\gamma,\lambda}_{\sigma,\delta}$: $|f(\xi)| \leq \|f\|\, |\sigma^2 + \xi^2|^{-\gamma}$ near the imaginary axis and $|f(\xi)| \leq \|f\|\, e^{-\lambda|\mathrm{Re}(\xi)|}$ away from it.
Remark 2.9 Unlike $H^\lambda_\ell$, $H^{\gamma,\lambda}_{\sigma,\delta}$ is not an algebra. However the following lemma enables us to use it in nonlinear differential equations.
Lemma 2.10 For every $f, g \in H^{\gamma,\lambda}_{\sigma,\delta}$, $\varepsilon^\gamma f g$ lies in $H^{\gamma,\lambda}_{\sigma,\delta}$.
So, Lemma 2.5 can be used like the Exponential Lemma 2.1 to solve Problem 1.4 (see Remark 2.4), with holomorphic continuations of solutions in $H^{\gamma,\lambda}_{\sigma,\delta}$ instead of $H^\lambda_\ell$. Here again, these continuations can be obtained by the Contraction Mapping Theorem.
2.3 Exponential equivalent: general theory
In this section, our aim is to determine what kind of information on a function $f$ we need in order to compute an equivalent of an oscillatory integral $I^{\pm}(f, \varepsilon)$. Moreover, we must be able to obtain this kind of information when $f$ is not explicitly known and involves solutions of O.D.E. or P.D.E. The membership to $H^{\gamma,\lambda}_{\sigma,\delta}$ only gives upper bounds of $I(f, \varepsilon)$. A very first idea is to believe that the equivalent of the oscillatory integral $I(f, \varepsilon)$ is given by the leading part of $f$ on $\mathbb{R}$. The following example shows that this is not always true.
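Example 2.11 Let $f$ be given by $f(t, \varepsilon) = f_1(t, \varepsilon) + f_2(t, \varepsilon)$ with
$$f_1(t, \varepsilon) = \frac{1}{\cosh^2(t)} \qquad \text{and} \qquad f_2(t, \varepsilon) = \frac{\varepsilon^\nu}{\cosh^4(t)}.$$
The leading part of $f$ on $\mathbb{R}$ is $f_1$ ($f \sim f_1$ as $\varepsilon \to 0$). The oscillatory integrals can be computed using residues:
$$I_1 := I(f_1, \varepsilon) = \frac{\pi\omega}{\varepsilon \sinh(\pi\omega/2\varepsilon)}, \qquad I_2 := I(f_2, \varepsilon) = \frac{\varepsilon^\nu \pi\omega}{6\varepsilon \sinh(\pi\omega/2\varepsilon)} \Big(\frac{\omega^2}{\varepsilon^2} + 4\Big).$$
Hence, as $\varepsilon \to 0$,
$$I(f, \varepsilon) \sim I_1 \sim \frac{2\pi\omega}{\varepsilon}\, e^{-\pi\omega/2\varepsilon} \qquad \text{for } \nu > 2,$$
$$I(f, \varepsilon) \sim I_1 + I_2 \sim \frac{2\pi\omega}{\varepsilon} \Big(1 + \frac{\omega^2}{6}\Big) e^{-\pi\omega/2\varepsilon} \qquad \text{for } \nu = 2,$$
$$I(f, \varepsilon) \sim I_2 \sim \frac{\pi\omega^3}{3\varepsilon^{3-\nu}}\, e^{-\pi\omega/2\varepsilon} \qquad \text{for } 0 < \nu < 2.$$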
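The two closed forms used in this example can be checked by quadrature, and the competition between $I_1$ and $I_2$ can be read off their ratio. The sketch below assumes the formulas $\int e^{iat}\cosh^{-2}(t)\,dt = \pi a/\sinh(\pi a/2)$ and $\int e^{iat}\cosh^{-4}(t)\,dt = \pi a(a^2+4)/(6\sinh(\pi a/2))$ with $a = \omega/\varepsilon$; the Simpson rule, the cutoff and the function names are illustrative choices.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def fourier_sech_power(a, power, cutoff=40.0, n=40000):
    """int_R e^{i a t} / cosh(t)^power dt, computed as 2*int_0^cutoff cos(a t)/cosh(t)^power dt."""
    g = lambda t: math.cos(a * t) / math.cosh(t) ** power
    return 2.0 * simpson(g, 0.0, cutoff, n)

def closed_form(a, power):
    """Closed forms for power = 2 and power = 4."""
    base = math.pi * a / math.sinh(math.pi * a / 2.0)
    return base if power == 2 else base * (a * a + 4.0) / 6.0

def ratio_I2_over_I1(omega, eps, nu):
    """I_2 / I_1 = eps^nu * (omega^2/eps^2 + 4) / 6."""
    return eps ** nu * (omega ** 2 / eps ** 2 + 4.0) / 6.0
```

For small $\varepsilon$ the ratio behaves like $\omega^2\varepsilon^{\nu-2}/6$, so $f_2$, although negligible on the real axis, dominates the oscillatory integral as soon as $\nu < 2$: what matters is the behavior near the singularity $i\frac{\pi}{2}$, where $f_2$ has a pole of order 4.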
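We define below a set of holomorphic functions $f$ for which we are able to compute an equivalent of the oscillatory integral induced by $f$. Lemma 2.15 below ensures that the relevant part of the function for computing the oscillatory integral induced by $f$ is not its leading part on the real axis but its leading part near the singularities.
Definition 2.12 Assume $\gamma, \lambda > 0$, $\sigma > 0$, $\delta \geq 1$, $m_0, m_1 > 1$, $0 < r < 1$ and $\nu > 0$. Denote by $E^{\gamma,\lambda}_{\sigma,\delta}(m_0, m_1, r, \nu)$ the set of functions $f$ such that
1. $f \in H^{\gamma,\lambda}_{\sigma,\delta}$,
2. for every $\varepsilon \in ]0, \varepsilon_0]$ and every $z \in \Sigma_{\delta,r,\varepsilon} := ]-\varepsilon^{r-1}, \varepsilon^{r-1}[ \times ]-\varepsilon^{r-1}, -\delta[$, $f$ reads
$$f(\pm(i\sigma + \varepsilon z), \varepsilon) = \frac{1}{\varepsilon^\gamma} \Big( f_0^{\pm}(z) + \varepsilon^\nu f_1^{\pm}(z, \varepsilon) \Big) \qquad (8)$$
where $f_0^{\pm} \in H^{m_0}_\delta$ and $f_1^{\pm} \in H^{m_1}_{\delta,r}$, with
$$H^{m_0}_\delta = \Big\{ f : \Omega_\delta \to \mathbb{C},\ f \text{ holomorphic in } \Omega_\delta := \mathbb{R} \times ]-\infty, -\delta[,\ \|f\|_{H^{m_0}_\delta} := \sup_{z \in \Omega_\delta} \big(|f(z)|\, |z|^{m_0}\big) < +\infty \Big\}$$
and
$$H^{m_1}_{\delta,r} = \Big\{ f : \bigcup_{\varepsilon \in ]0,\varepsilon_0]} \big(\Sigma_{\delta,r,\varepsilon} \times \{\varepsilon\}\big) \to \mathbb{C},\ f(\cdot, \varepsilon) \text{ holomorphic in } \Sigma_{\delta,r,\varepsilon},\ \|f\|_{H^{m_1}_{\delta,r}} := \sup_{\varepsilon \in ]0,\varepsilon_0],\ z \in \Sigma_{\delta,r,\varepsilon}} \big(|f(z, \varepsilon)|\, |z|^{m_1}\big) < +\infty \Big\}.$$
Figure 6. Description of $f \in E^{\gamma,\lambda}_{\sigma,\delta}$: exponential decay $e^{-\lambda|\mathrm{Re}(\xi)|}$ away from the imaginary axis, and $f = \varepsilon^{-\gamma}(f_0 + \varepsilon^\nu f_1)$ near the singularities.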
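Remark 2.13 The four functions $f_0^{\pm}$ and $f_1^{\pm}$ are uniquely defined since they are given by
$$f_0^{\pm}(z) := \lim_{\varepsilon \to 0} \varepsilon^\gamma f(\pm(i\sigma + \varepsilon z), \varepsilon) \qquad \text{and} \qquad f_1^{\pm}(z, \varepsilon) = \varepsilon^{-\nu} \big( \varepsilon^\gamma f(\pm(i\sigma + \varepsilon z), \varepsilon) - f_0^{\pm}(z) \big).$$
Let us denote by $P^{\pm}$ the operators $P^{\pm} : f \mapsto f_0^{\pm}$.
Remark 2.14 The membership to $H^{\gamma,\lambda}_{\sigma,\delta}$ simply ensures that $f(\cdot, \varepsilon)$ decays exponentially for $|\mathrm{Re}(\xi)| \to \infty$ and explodes at most as $|\xi \mp i\sigma|^{-\gamma}$ for $\xi$ close to $\pm i(\sigma - \delta\varepsilon)$. The membership to $E^{\gamma,\lambda}_{\sigma,\delta}$ specifies the behavior of the function $f(\cdot, \varepsilon)$ near $\pm i(\sigma - \delta\varepsilon)$. In the domain $]-\varepsilon^r, \varepsilon^r[ \times ]\sigma - \varepsilon^r, \sigma - \delta\varepsilon[$ and for $\xi$ close to $i(\sigma - \delta\varepsilon)$, $f(\xi, \varepsilon)$ explodes like $\frac{1}{\varepsilon^\gamma} f_0^+(-i\delta) = \frac{1}{\varepsilon^\gamma} P^+(f)(-i\delta)$. So, $P^+(f)$ is the "leading part of $f$ near $i(\sigma - \delta\varepsilon)$". We call it the principal part of $f$. Similarly, $P^-(f)$ is the principal part of $f$ near $-i(\sigma - \delta\varepsilon)$.
Moreover, the membership to $E^{\gamma,\lambda}_{\sigma,\delta}(m_0, m_1, r, \nu)$ does not mean that functions have poles at $\pm i\sigma$. All the following functions belong to the space $E^{2,1}_{\pi/2,1}(2, 2, \frac{1}{2}, 2)$, but only the first two have poles at $\pm i\frac{\pi}{2}$. This comes from the fact that we work in $B_{\frac{\pi}{2} - \varepsilon}$ and not in $B_{\frac{\pi}{2}}$.
• The function $f : (\xi, \varepsilon) \mapsto \cosh^{-2}(\xi)$ has poles of order 2 at $\pm i\frac{\pi}{2}$ and $P^{\pm}(f) = -z^{-2}$;
• The function $f : (\xi, \varepsilon) \mapsto \varepsilon^2 \cosh^{-4}(\xi)$ has poles of order 4 at $\pm i\frac{\pi}{2}$ and $P^{\pm}(f) = z^{-4}$;
• The function $f : (\xi, \varepsilon) \mapsto e^{-\xi^2}$ is entire and $P^{\pm}(f) = 0$;
• The function $f : (\xi, \varepsilon) \mapsto \cosh^{-2}(\xi/2)$ has poles at $\pm i\pi$ and $P^{\pm}(f) = 0$;
• The function $f : (\xi, \varepsilon) \mapsto \cosh^{-1}(\xi + i\varepsilon^2) \cosh^{-1}(\xi - i\varepsilon^2)$ has poles at $\pm i\frac{\pi}{2} \pm i\varepsilon^2$ and $P^{\pm}(f) = -z^{-2}$;
• The function $f : (\xi, \varepsilon) \mapsto e^{i\varepsilon\tanh(\xi)} \cosh^{-2}(\xi)$ has an essential singularity at $i\frac{\pi}{2}$ and $P^{+}(f) = -z^{-2} e^{i/z}$.
The membership to $E^{\gamma,\lambda}_{\sigma,\delta}$ gives enough information on a function to compute an equivalent of the oscillatory integral involving it.
Lemma 2.15 (Third Mono-Frequency Exponential Lemma) For every $\gamma > 1$, $\delta \geq 1$, $\sigma, \lambda > 0$, $m_0, m_1 > 1$, $0 < r < 1$, $\nu, \omega > 0$ and every function $f \in E^{\gamma,\lambda}_{\sigma,\delta}(m_0, m_1, r, \nu)$, the oscillatory integrals $I^{\pm}(f, \varepsilon) = \int_{-\infty}^{+\infty} e^{\pm i\omega t/\varepsilon} f(t, \varepsilon)\, dt$ satisfy
$$I^{\pm}(f, \varepsilon) = \frac{e^{-\omega\sigma/\varepsilon}}{\varepsilon^{\gamma - 1}} \Big( \Lambda^{\pm} + \underset{\varepsilon \to 0}{O}(\varepsilon^{\nu^*}) \Big)$$
where
$$\Lambda^{\pm} := \int_{-i\eta + \mathbb{R}} P^{\pm}(f)(z)\, e^{i\omega z}\, dz \qquad \text{for } \eta > \delta$$
($\Lambda^{\pm}$ does not depend on $\eta$) and $\nu^* := \min(\nu, (1 - r)(m_0 - 1), (1 - r)(\gamma - 1))$.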
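The constant $\Lambda^+$ can be evaluated numerically in the simplest case. The sketch below assumes the principal part $P^+(f)(z) = -z^{-2}$ of $f = \cosh^{-2}$ (so $\gamma = 2$, $\sigma = \frac{\pi}{2}$) and approximates the integral over the line $-i\eta + \mathbb{R}$ by a Simpson rule on a truncated interval; the truncation width, the step and the function names are illustrative assumptions. A residue computation gives $\Lambda^+ = 2\pi\omega$, which reproduces the equivalent $J(\varepsilon) \sim \frac{2\pi\omega}{\varepsilon} e^{-\omega\pi/2\varepsilon}$ of the basic example.

```python
import cmath
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals; f may be complex valued."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def principal_part_cosh2(z):
    """Principal part of cosh^{-2} near i*pi/2 (gamma = 2): P(z) = -1/z^2."""
    return -1.0 / z ** 2

def lambda_plus(principal_part, omega, eta, half_width=1000.0, n=100000):
    """Approximate int_{-i*eta + R} P(z) e^{i omega z} dz, truncated to |Re z| <= half_width."""
    def integrand(x):
        z = x - 1j * eta
        return principal_part(z) * cmath.exp(1j * omega * z)
    return simpson(integrand, -half_width, half_width, n)
```

Evaluating `lambda_plus(principal_part_cosh2, 1.0, eta)` for two different values of `eta` gives the same number up to the truncation error, in agreement with the statement that $\Lambda^{\pm}$ does not depend on $\eta$.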
The proof of this lemma is given in [16]. This lemma ensures that the relevant parts of $f \in E^{\gamma,\lambda}_{\sigma,\delta}$ for computing an equivalent of the oscillatory integrals $I^{\pm}(f, \varepsilon)$ are $P^{\pm}(f)$. Moreover, it gives a very efficient way to compute equivalents of oscillatory integrals involving solutions of nonlinear differential equations, since the principal part $P^{\pm}$ of a sum (resp. of a product) is the sum (resp. the product) of the principal parts:
Lemma 2.16 $P^{\pm} : E^{\gamma,\lambda}_{\sigma,\delta}(m_0, m_1, r, \nu) \to H^{m_0}_\delta$ are linear operators and, for every $f, g \in E^{\gamma,\lambda}_{\sigma,\delta}$, $\varepsilon^\gamma f g$ lies in $E^{\gamma,\lambda}_{\sigma,\delta}$ and $P^{\pm}(\varepsilon^\gamma f g) = P^{\pm}(f)\, P^{\pm}(g)$.
2.4 Exponential equivalent: strategy for nonlinear differential equations
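To use the Third Mono-Frequency Exponential Lemma 2.15 for computing equivalents of oscillatory integrals $I^{\pm}(f, \varepsilon)$ when $f$ involves solutions of real nonlinear differential equations, we must be able to prove that a given solution $y$ of a real nonlinear analytic differential equation
$$\frac{dY}{dt} = F(Y, t, \varepsilon) \qquad (9)$$
admits a holomorphic continuation which belongs to $E^{\gamma,\lambda}_{\sigma,\delta}$. We propose here a strategy to achieve this in three steps. This strategy is described for simplicity in the one dimensional case. It can be readily extended to $n$ dimensional spaces of type $\prod_{j=1}^{n} E^{\gamma_j,\lambda}_{\sigma,\delta}$.
Step 1. Continuation far away from singularities. To prove the existence of a holomorphic continuation of $y$ in $D_{\sigma,r,\varepsilon} = B_\sigma \setminus \{\xi \in \mathbb{C} : |\mathrm{Re}(\xi)| < \frac{1}{2}\varepsilon^r,\ |\mathrm{Im}(\xi \pm i\sigma)| < \frac{1}{2}\varepsilon^r\}$, one can use the Contraction Mapping Theorem in the space of holomorphic functions defined in $D_{\sigma,r,\varepsilon}$.
Step 2. Continuation near $i\sigma$. We first introduce two systems of coordinates in space and time: a first one $(Y, \xi)$, which is the initial one on $\mathbb{R}$, is called the outer system of coordinates; a second one $(\hat{Y}, z) := (\varepsilon^\gamma Y, (\xi - i\sigma)/\varepsilon)$, which is useful to describe what happens near $i\sigma$, is called the inner system of coordinates. For an orbit $Y(\xi)$ in the outer system of coordinates, let us denote by $\hat{Y}(z) = \varepsilon^\gamma Y(i\sigma + \varepsilon z)$ the corresponding orbit in the inner system of coordinates, and vice versa. We use the same notations for complex domains: for a domain $\Sigma$ of $\mathbb{C}$ in the outer system of coordinates, let us denote by $\hat{\Sigma} = (\Sigma - i\sigma)/\varepsilon$ the corresponding one in the inner system of coordinates, and vice versa.
Step 1 gives a holomorphic continuation of $y(\xi)$ in $D_{\sigma,r,\varepsilon}$. Then we must prove the existence of a holomorphic continuation of $y(\xi)$ in $\Sigma_{\sigma,r,\varepsilon} := ]-\varepsilon^r, \varepsilon^r[ \times ]\sigma - \varepsilon^r, \sigma - \delta\varepsilon[$ (see Fig. 8) which satisfies (8)$^+$, i.e. the existence of a holomorphic continuation of $\hat{y}(z)$ in $\hat{\Sigma}_{\sigma,r,\varepsilon}$ (the domain $\Sigma_{\delta,r,\varepsilon}$ of Definition 2.12) which satisfies
$$\hat{y}(z, \varepsilon) = \hat{y}_0(z) + \varepsilon^\nu \hat{y}_1(z, \varepsilon) \qquad \text{with } \hat{y}_0 \in H^{m_0}_\delta \text{ and } \hat{y}_1 \in H^{m_1}_{\delta,r}. \qquad (10)$$
Such a continuation in $\hat{\Sigma}_{\sigma,r,\varepsilon}$ is a holomorphic solution of the full system (9) rewritten in the inner system of coordinates,
$$\frac{d\hat{Y}}{dz} = \hat{F}(\hat{Y}, z, \varepsilon) := \varepsilon^{\gamma+1} F\big(\varepsilon^{-\gamma} \hat{Y}, i\sigma + \varepsilon z, \varepsilon\big), \qquad (11)$$
which satisfies (10) and which coincides on $\Sigma_{\sigma,r,\varepsilon} \cap D_{\sigma,r,\varepsilon}$ with the continuation obtained at Step 1.
Step 2.1. Inner system. We first look for a solution $\hat{y}_0$ of
$$\frac{d\hat{Y}}{dz} = \hat{F}_0(\hat{Y}, z) := \hat{F}(\hat{Y}, z, 0) \qquad (12)$$
in $H^{m_0}_\delta$. This system is called the "inner system" and it represents "the relevant part of the system near $i\sigma$".
Figure 7. Step 1: continuation in the domain $D_{\sigma,r,\varepsilon}$.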
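The change to the inner system of coordinates can be tested on an explicit example. The sketch below assumes the scalar equation $\frac{dY}{d\xi} = -2\tanh(\xi)\,Y$, chosen here only because its solution $Y(\xi) = \cosh^{-2}(\xi)$ is known (it is not an equation taken from the notes), with $\gamma = 2$ and $\sigma = \frac{\pi}{2}$. It checks that $\hat{Y}(z) = \varepsilon^\gamma Y(i\sigma + \varepsilon z)$ solves the rescaled equation (11) and that $\hat{Y}$ converges to the inner solution $-z^{-2}$ as $\varepsilon \to 0$; all names are illustrative.

```python
import cmath
import math

SIGMA = math.pi / 2.0   # imaginary part of the singularity of cosh^{-2}
GAMMA = 2               # order of the pole

def Y(xi):
    """Explicit solution Y(xi) = 1/cosh(xi)^2 of dY/dxi = F(Y, xi)."""
    return 1.0 / cmath.cosh(xi) ** 2

def F(Yv, xi):
    """Outer vector field of the illustrative equation: F(Y, xi) = -2*tanh(xi)*Y."""
    return -2.0 * cmath.tanh(xi) * Yv

def Y_hat(z, eps):
    """Orbit in the inner coordinates: eps^gamma * Y(i*sigma + eps*z)."""
    return eps ** GAMMA * Y(1j * SIGMA + eps * z)

def F_hat(Yh, z, eps):
    """Inner vector field: eps^(gamma+1) * F(eps^(-gamma)*Y_hat, i*sigma + eps*z)."""
    return eps ** (GAMMA + 1) * F(eps ** (-GAMMA) * Yh, 1j * SIGMA + eps * z)

def derivative(f, z, h=1e-5):
    """Central finite difference of a holomorphic function along the real direction."""
    return (f(z + h) - f(z - h)) / (2.0 * h)
```

Here $\hat{Y}(z) = -\varepsilon^2/\sinh^2(\varepsilon z)$, so that the limit $\varepsilon \to 0$ gives exactly the principal part $-z^{-2}$ quoted in Remark 2.14 for $\cosh^{-2}$.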
Figure 8. Step 2: continuation in $\Sigma_{\sigma,r,\varepsilon}$.
Figure 9. Domains $D_{\sigma,r,\varepsilon}$ and $\Sigma_{\sigma,r,\varepsilon}$.
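Step 2.2. Matching of holomorphic continuations. To complete our procedure we must prove the existence of a solution $\hat{Y}$ of the full system (9), written in the inner coordinates as (11), which satisfies (10), where $\hat{y}_0$ has been obtained at Step 2.1. Moreover $\hat{Y}$ must coincide with $\hat{y}$ on $\hat{\Sigma}_{\sigma,r,\varepsilon} \cap \hat{D}_{\sigma,r,\varepsilon}$, where $y$ is the holomorphic continuation obtained at Step 1. For that purpose, we can see the full equation as a perturbation of the inner equation,
$$\frac{d\hat{Y}}{dz} = \hat{F}(\hat{Y}, z, \varepsilon) = \hat{F}_0(\hat{Y}, z) + \hat{F}_1(\hat{Y}, z, \varepsilon),$$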
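where the perturbation term is given by $\hat{F}_1(\hat{Y}, z, \varepsilon) := \hat{F}(\hat{Y}, z, \varepsilon) - \hat{F}(\hat{Y}, z, 0)$. Then, we can solve the equation satisfied by $\hat{y}_1$ using once more the Contraction Mapping Theorem in $H^{m_1}_{\delta,r}$.
Step 3. Continuation near $-i\sigma$: symmetrization. Steps 1 and 2 give a holomorphic continuation of $y$ in $D_{\sigma,r,\varepsilon} \cup \Sigma_{\sigma,r,\varepsilon}$. We still denote by $y$ this continuation. Moreover, $y$ is real valued on the real axis. Hence, the isolated zero theorem ensures that $y$ admits a holomorphic continuation in $D_{\sigma,r,\varepsilon} \cup \Sigma_{\sigma,r,\varepsilon} \cup \overline{\Sigma}_{\sigma,r,\varepsilon}$ such that $y(\bar{\xi}) = \overline{y(\xi)}$. Finally, since $\xi \mapsto \bar{\xi}$ is an isometry, Steps 1, 2, 3 ensure that $y$ belongs to $E^{\gamma,\lambda}_{\sigma,\delta}(m_0, m_1, r, \nu)$ and that
$$P^-(y)(z) = \overline{P^+(y)(-\bar{z})}.$$
Figure 10. Step 3: continuation in $\overline{\Sigma}_{\sigma,r,\varepsilon}$.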
Remark 2.17 This strategy of step by step continuation on different subdomains is, for this problem, a rigorous version of the formal theory of Matching Asymptotic Expansions (the words outer and inner systems of coordinates, and inner system, come from this theory). A clear use of this theory for this kind of problem was made by Kruskal and Segur in [14].
3 Resonances of reversible vector fields
3.1 Definitions
Definition 3.1 Let $V : U\times\Lambda\to\mathbb{R}^n$, $(u,\lambda)\mapsto V(u,\lambda)$, be a $p$-parameter family of vector fields in $\mathbb{R}^n$ defined in an open set $U$ of $\mathbb{R}^n$, with parameter $\lambda$ lying in an open set $\Lambda$ of $\mathbb{R}^p$. The family is said to be reversible when there exists a symmetry $S\in GL_n(\mathbb{R})$, $S^2=\mathrm{Id}$, which satisfies
$$S\,V(u,\lambda) = -V(Su,\lambda)$$
for every $u\in U$ and $\lambda\in\Lambda$.
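The reversibility condition of Definition 3.1 is easy to test numerically on a concrete field. The sketch below is only an illustration: the planar field `V`, its damped variant `V_damped` and the helper `is_reversible` are hypothetical names chosen here, not objects from the text. It samples random points and checks $SV(u,\lambda)=-V(Su,\lambda)$ with $S=\mathrm{diag}(1,-1)$.

```python
import numpy as np

# Hypothetical example (not from the text): u1' = u2, u2' = lam*u1 - u1**2.
def V(u, lam):
    u1, u2 = u
    return np.array([u2, lam * u1 - u1**2])

# Same field with a damping term, which breaks reversibility.
def V_damped(u, lam):
    u1, u2 = u
    return np.array([u2, lam * u1 - u1**2 - 0.1 * u2])

# Candidate symmetry: reflection u2 -> -u2, so that S @ S = Id.
S = np.diag([1.0, -1.0])

def is_reversible(field, S, lam, samples=200, seed=0, tol=1e-12):
    """Check S V(u, lam) = -V(S u, lam) on randomly sampled points u."""
    rng = np.random.default_rng(seed)
    for u in rng.normal(size=(samples, S.shape[0])):
        residual = S @ field(u, lam) + field(S @ u, lam)
        if np.linalg.norm(residual) > tol:
            return False
    return True
```

A random sampling test of this kind can only refute reversibility, or make it plausible; it is not a proof.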
In other words the symmetry is the same for all $\lambda\in\Lambda$.

Definition 3.2 A fixed point $u_0$ of a family of vector fields $V(u,\lambda)$ with $u\in U$, $\lambda\in\Lambda$ is a point $u_0$ such that $V(u_0,\lambda)=0$ holds for every $\lambda\in\Lambda$.

When studying the dynamics of
$$\frac{du}{dt}=V(u,\lambda),\qquad u\in\mathbb{R}^n,$$
for a reversible family $V(u,\lambda)$ of vector fields near a reversible fixed point $u_0$ where the spectrum of the differential $L(\lambda)=D_uV(u_0,\lambda)$ at the fixed point has purely imaginary eigenvalues (non hyperbolic fixed point) for some critical value $\lambda_c$ of the parameter, a first question is the behavior of the eigenvalues of $L(\lambda)$ for $\lambda$ close to $\lambda_c$. For a real reversible matrix $L$, if $v\in\ker((L-\sigma\,\mathrm{Id})^q)$, then $Sv\in\ker((L+\sigma\,\mathrm{Id})^q)$ and $\bar v\in\ker((L-\bar\sigma\,\mathrm{Id})^q)$. So the eigenvalues of $L(\lambda)$ come in pairs $(\sigma,-\sigma)$ for $\sigma\in(\mathbb{R}\cup i\mathbb{R})\setminus\{0\}$ and in quadruplets $(\sigma,-\sigma,\bar\sigma,-\bar\sigma)$ for $\sigma\in\mathbb{C}$ such that $\mathrm{Re}(\sigma)\,\mathrm{Im}(\sigma)\neq0$. Moreover, the eigenvalues of a pair $(\sigma,-\sigma)$ (resp. of a quadruplet $(\sigma,-\sigma,\bar\sigma,-\bar\sigma)$) have generalized eigenspaces with the same Jordan decomposition. So, if $\sigma(\lambda)$ is a simple purely imaginary eigenvalue for $\lambda=\lambda_c$, it cannot leave the imaginary axis for $\lambda$ close to $\lambda_c$. Hence a bifurcation of the spectrum can only occur at multiple purely imaginary eigenvalues. This leads to the following definition.

Definition 3.3 A reversible fixed point $u_0$ of a reversible family of vector fields $V(u,\lambda)$ is said to be resonant for $\lambda=\lambda_c$ when the spectrum of $D_uV(u_0,\lambda_c)$ contains multiple purely imaginary eigenvalues.

3.2 Linear classification of reversible fixed points
One aim of this work is to study the dynamics of
$$\frac{du}{dt}=V(u),\qquad u\in\mathbb{R}^N,$$
for vector fields $V(u)$ reversible with respect to some symmetry $S$, near a reversible fixed point $u_0$, when the spectrum of the differential $L=D_uV(u_0)$ at the fixed point has purely imaginary eigenvalues (non hyperbolic fixed point). For simplicity we place the fixed point at the origin ($u_0=0$). We are especially interested in the existence of periodic orbits and homoclinic connections near the origin. So we are only interested in the dynamics of $V$ up to a linear change of coordinates, and more generally up to a change of coordinates of the form $u=\Phi(u_1)$ with $\Phi(0)=0$, where $\Phi$ is a diffeomorphism near the origin. So a very first step is to perform a linear change of coordinates $u=Pu_1$ with $P\in GL_N(\mathbb{R})$ to obtain an equivalent equation
$$\frac{du_1}{dt}=V_1(u_1),\qquad u_1\in\mathbb{R}^N,$$
where $S_1=P^{-1}SP$ and $L_1=P^{-1}LP=D_{u_1}V_1(0)$ are simultaneously as simple as possible. Observe that $V_1(u_1)=P^{-1}V(Pu_1)$ and $L_1$ are reversible with respect to the symmetry $S_1$. Theorem 3.4 below gives the classification of the reversible pairs of $N\times N$ matrices, i.e. $\{(L,S)\in(\mathcal{M}_N(\mathbb{R}))^2 \,/\, S^2=\mathrm{Id},\ LS=-SL\}$, with respect to the equivalence relation $\sim$ defined by simultaneous conjugacy, i.e.
$$(L,S)\sim(L_1,S_1)\iff\exists P\in GL_N(\mathbb{R}),\quad L_1=P^{-1}LP\ \text{ and }\ S_1=P^{-1}SP.$$
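Two facts used above can be checked numerically on a random example: the spectrum of a reversible matrix is symmetric under $\sigma\mapsto-\sigma$, and simultaneous conjugacy maps reversible pairs to reversible pairs. The sketch below is an illustration under stated assumptions: the construction $L=A-SAS$ (which anticommutes with $S$ by construction) and the helper names are choices made here, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6

# A symmetry S (S @ S = Id) and a matrix L built so that S L = -L S:
# with L = A - S A S one gets S L = S A - A S = -(L S).
S = np.diag([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
A = rng.normal(size=(N, N))
L = A - S @ A @ S

def is_reversible_pair(L, S, tol=1e-8):
    """True when S is a symmetry and anticommutes with L."""
    n = L.shape[0]
    return (np.allclose(S @ S, np.eye(n), atol=tol)
            and np.allclose(S @ L, -(L @ S), atol=tol))

def spectrum_is_symmetric(L, tol=1e-8):
    """True when every eigenvalue sigma has a partner close to -sigma."""
    ev = np.linalg.eigvals(L)
    return all(np.min(np.abs(ev + s)) < tol for s in ev)

# Simultaneous conjugacy by a generic (hence invertible) matrix P.
P = rng.normal(size=(N, N))
P_inv = np.linalg.inv(P)
L1, S1 = P_inv @ L @ P, P_inv @ S @ P
```

Here `(L1, S1)` is again a reversible pair with the same spectrum as `L`, as the equivalence relation requires.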
Moreover it gives for each equivalence class a representative $(L,S)$ where $L$ is a Jordan normal matrix and $S$ is the direct sum of blocks of the form $I_p$, $-I_q$, and where $S$ is a diagonal matrix. In other words, this theorem says that it is always possible to perform a linear change of coordinates to obtain an equivalent equation governed by a reversible vector field $V_1$ such that $(L_1,S_1)$ is one of the previous representatives. Then we introduce a notation, inspired by the notation for Jordan matrices used by V. Arnold in [5], to name each class. To state the theorem we introduce the following convenient notations to describe matrices which are block-diagonal and anti-block-diagonal:
• for a $k\times k$ matrix $M$ and a $p\times p$ matrix $N$ we denote by $M\boxbslash N$ (resp. by $M\boxslash N$) the block-diagonal matrix (resp. the anti-block-diagonal matrix)
$$M\boxbslash N=\begin{pmatrix}M&0\\0&N\end{pmatrix}\qquad\left(\text{resp. } M\boxslash N=\begin{pmatrix}0&N\\M&0\end{pmatrix}\right);$$
• for $k_j\times k_j$ matrices $M_j$ we denote by $\boxbslash_{j=1}^{r}M_j$ (resp. by $\boxslash_{j=1}^{r}M_j$) the block-diagonal matrix (resp. the anti-block-diagonal matrix)
$$\boxbslash_{j=1}^{r}M_j=\begin{pmatrix}M_1&0&0\\0&\ddots&0\\0&0&M_r\end{pmatrix},\qquad\boxslash_{j=1}^{r}M_j=\begin{pmatrix}0&0&M_r\\0&\iddots&0\\M_1&0&0\end{pmatrix}.$$
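For a real matrix $L$, denote by $\mu_L$ its minimal polynomial. It has the form
$$\mu_L(X):=\prod_{j=1}^{P}(X-\lambda_j)^{m(\lambda_j)}\ \prod_{j=1}^{R}\big((X-\sigma_j)(X-\bar\sigma_j)\big)^{m(\sigma_j)}$$
where $\lambda_j\in\mathbb{R}$ and $\sigma_j\in\mathbb{C}\setminus\mathbb{R}$. Denote by $\mathrm{Jor}(L)$ its real Jordan normal form, which reads
$$\mathrm{Jor}(L)=\boxbslash_{j=1}^{P}\,\boxbslash_{k=1}^{m(\lambda_j)}\,\boxbslash_{\ell=1}^{n_k(\lambda_j)}J_k(\lambda_j)\ \boxbslash\ \boxbslash_{j=1}^{R}\,\boxbslash_{k=1}^{m(\sigma_j)}\,\boxbslash_{\ell=1}^{n_k(\sigma_j)}J_k(\sigma_j)$$
where $n_k(\xi)$ is the number of Jordan blocks of size $k$ corresponding to the eigenvalue $\xi$ and where $J_k(\lambda)$ (resp. $J_k(x+iy)$) is the $k\times k$ matrix (resp. the $2k\times2k$ matrix)
$$J_k(\lambda)=\begin{pmatrix}\lambda&1&0&0\\0&\ddots&\ddots&0\\0&0&\ddots&1\\0&0&0&\lambda\end{pmatrix},\qquad J_k(x+iy)=\begin{pmatrix}\begin{smallmatrix}x&-y\\y&x\end{smallmatrix}&I_2&0&0\\0&\ddots&\ddots&0\\0&0&\ddots&I_2\\0&0&0&\begin{smallmatrix}x&-y\\y&x\end{smallmatrix}\end{pmatrix}.\qquad(13)$$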
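The real Jordan blocks just defined can be assembled explicitly, which is convenient for experimenting with the classification. The following is a minimal sketch; the function names are hypothetical, and the sign convention of the $2\times2$ rotation-type block ($x$ on the diagonal, $-y$ above, $y$ below) is an assumption chosen to agree with the example pair given in the nomenclature subsection.

```python
import numpy as np

def jordan_real(k, lam):
    """k x k Jordan block J_k(lam) for a real eigenvalue lam."""
    return lam * np.eye(k) + np.diag(np.ones(k - 1), 1)

def jordan_complex(k, x, y):
    """2k x 2k real Jordan block J_k(x + iy) for the pair x +/- iy."""
    rot = np.array([[x, -y], [y, x]])
    shift = np.diag(np.ones(k - 1), 1)      # ones on the block super-diagonal
    return np.kron(np.eye(k), rot) + np.kron(shift, np.eye(2))

def block_diag(*mats):
    """Block-diagonal matrix built from square matrices."""
    n = sum(m.shape[0] for m in mats)
    out = np.zeros((n, n))
    i = 0
    for m in mats:
        s = m.shape[0]
        out[i:i + s, i:i + s] = m
        i += s
    return out
```

For instance, `block_diag(jordan_real(2, 0.0), jordan_complex(1, 0.0, 1.0))` is a $4\times4$ matrix with a double non semi-simple eigenvalue 0 and the simple pair $\pm i$.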
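Theorem 3.4 (Classification of real reversible matrices)
1. Let $(L,S)$ be two $N\times N$ matrices such that $S$ is a symmetry, $S^2=\mathrm{Id}$, and $SL=-LS$ holds. Then the minimal polynomial of $L$ reads
$$\mu_L(X)=X^m\prod_{j=1}^{P}(X^2-\lambda_j^2)^{m(\lambda_j)}\prod_{j=1}^{Q}(X^2+\omega_j^2)^{m(i\omega_j)}\prod_{j=1}^{R}\big((X^2-\sigma_j^2)(X^2-\bar\sigma_j^2)\big)^{m(\sigma_j)}$$
where $\lambda_j,\omega_j>0$ and $\sigma_j\in\mathbb{C}$ with $\mathrm{Re}(\sigma_j),\mathrm{Im}(\sigma_j)>0$. The numbers $m$, $P$, $Q$, $R$ may be equal to 0. So the normal Jordan form of $L$ reads
$$\mathrm{Jor}(L)=\boxbslash_{k=1}^{m}\boxbslash_{\ell=1}^{n_k(0)}J_k(0)\ \boxbslash\ \boxbslash_{j=1}^{P}\boxbslash_{k=1}^{m(\lambda_j)}\boxbslash_{\ell=1}^{n_k(\lambda_j)}\big(J_k(\lambda_j)\boxbslash J_k(-\lambda_j)\big)\ \boxbslash\ \boxbslash_{j=1}^{Q}\boxbslash_{k=1}^{m(i\omega_j)}\boxbslash_{\ell=1}^{n_k(i\omega_j)}J_k(i\omega_j)\ \boxbslash\ \boxbslash_{j=1}^{R}\boxbslash_{k=1}^{m(\sigma_j)}\boxbslash_{\ell=1}^{n_k(\sigma_j)}\big(J_k(\sigma_j)\boxbslash J_k(-\sigma_j)\big).$$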
2. $(L,S)\sim(\mathrm{Jor}(L),\mathrm{Sor}(L,S))$ where $\mathrm{Sor}(L,S)$ is the matrix
fp>.(L,S)
fe=i \
9k(t,S)
\
<=i
/
<=i
iu/iere S* (resp. Sk,Sk) given by
S f c :=diag(l,-l,l,-l,---),
p
m(A,)n|.(A,)
i = i *=i
is the k x k (resp. 2k x 2k) diagonal matrix
S* = S(-1) J 'S2, J ^ S t j=o
with pk(L,S)
'=1
1
)^-
(14)
j=o
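and $q_k(L,S)$ are given by inverse induction
$$p_N(L,S)=\dim\big(\ker(L)\cap\mathrm{im}(L^{N-1})\cap\ker(S-\mathrm{Id})\big),\qquad q_N(L,S)=\dim\big(\ker(L)\cap\mathrm{im}(L^{N-1})\cap\ker(S+\mathrm{Id})\big),$$
where $\mathrm{im}(L)$ is the image (or the range) of $L$, and
$$p_k(L,S)=\dim\big(\ker(L)\cap\mathrm{im}(L^{k-1})\cap\ker(S-\mathrm{Id})\big)-\sum_{\ell=k+1}^{N}p_\ell(L,S),$$
$$q_k(L,S)=\dim\big(\ker(L)\cap\mathrm{im}(L^{k-1})\cap\ker(S+\mathrm{Id})\big)-\sum_{\ell=k+1}^{N}q_\ell(L,S).$$
3. So, $(L,S)\sim(L_1,S_1)$ if and only if $\mathrm{Jor}(L)=\mathrm{Jor}(L_1)$ and $(p_k(L,S),q_k(L,S))=(p_k(L_1,S_1),q_k(L_1,S_1))$ for $1\le k\le N$.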
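The inverse induction defining $p_k(L,S)$ and $q_k(L,S)$ translates directly into rank computations. The sketch below is one possible numerical implementation, not the author's: the helper names are hypothetical, subspace dimensions are obtained from singular values with an ad hoc tolerance, and intersections use $\dim(U\cap W)=\dim U+\dim W-\dim(U+W)$.

```python
import numpy as np

def null_space(A, tol=1e-10):
    """Orthonormal basis (as columns) of the kernel of A."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def col_space(A, tol=1e-10):
    """Orthonormal basis (as columns) of the image of A."""
    u, s, _ = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return u[:, :rank]

def dim_intersection(U, W, tol=1e-10):
    """dim(U ∩ W) = dim U + dim W - dim(U + W) for column bases U, W."""
    if U.shape[1] == 0 or W.shape[1] == 0:
        return 0
    rank_sum = np.linalg.matrix_rank(np.hstack([U, W]), tol=tol)
    return U.shape[1] + W.shape[1] - rank_sum

def symmetry_indices(L, S, tol=1e-10):
    """Return dictionaries p[k], q[k], k = 1..N, by inverse induction."""
    N = L.shape[0]
    Id = np.eye(N)
    p, q = {}, {}
    for k in range(N, 0, -1):
        image = col_space(np.linalg.matrix_power(L, k - 1), tol)
        for sign, out in ((+1.0, p), (-1.0, q)):
            # ker(L) ∩ ker(S - sign*Id) is the kernel of the stacked matrix.
            kernel = null_space(np.vstack([L, S - sign * Id]), tol)
            d = dim_intersection(kernel, image, tol)
            out[k] = d - sum(out[l] for l in range(k + 1, N + 1))
    return p, q

# Small illustrative pair: one 0-Jordan block of size 2 carrying S_2,
# and one 0-Jordan block of size 1 carrying -S_1.
L_example = np.array([[0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]])
S_example = np.diag([1.0, -1.0, -1.0])
p_example, q_example = symmetry_indices(L_example, S_example)
```

For this pair the only non-zero indices are `p_example[2]` and `q_example[1]`, both equal to 1, as one reads off the block structure.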
Remark 3.5 This theorem ensures that for reversible matrices $L,L_1\in GL_N(\mathbb{R})$ (i.e. invertible), $(L,S)\sim(L_1,S_1)$ holds if and only if $\mathrm{Jor}(L)=\mathrm{Jor}(L_1)$. In other words, for reversible matrices which have the same invertible Jordan normal form there is only one possible symmetry up to simultaneous conjugacy. In this case, the theorem gives for each Jordan block the corresponding symmetry. However, for reversible matrices which have the same non invertible Jordan normal form, there are several possible symmetries, which lead to different equivalence classes for the simultaneous conjugacy. Each class is characterized by a non invertible Jordan normal form and by a sequence of nonnegative integers $(p_k,q_k)$, $1\le k\le N$, which satisfy $p_k+q_k=n_k(0)$, where $n_k(0)$ is the number of 0-Jordan blocks of size $k$ in the Jordan normal form: for a class characterized by $(p_k,q_k)$, the symmetry corresponding to the direct sum of the $n_k(0)$ 0-Jordan blocks of size $k$, i.e. $\boxbslash_{\ell=1}^{n_k(0)}J_k(0)$, is given by
$$\Big(\boxbslash_{\ell=1}^{p_k}S_k\Big)\boxbslash\Big(\boxbslash_{\ell=1}^{q_k}(-S_k)\Big).$$

3.3 Nomenclature
To denote each class, we must introduce a name which describes the Jordan normal matrix and the sequence $(p_k,q_k)$, i.e. the structure of the symmetry corresponding to the 0-Jordan blocks. For Jordan normal matrices we can use the nomenclature introduced by V. Arnold in [5]: a Jordan block $J_k(\lambda)$ corresponding to a real eigenvalue is denoted by $\lambda^k$, and a Jordan block $J_k(\sigma)$ corresponding to a pair of complex conjugate eigenvalues $(\sigma,\bar\sigma)$ is denoted by $\sigma^k\bar\sigma^k$. Then a general Jordan normal matrix is denoted by the formal product of the names of its Jordan blocks. Hence, $\lambda_0\lambda_0\lambda_1^3\sigma^2\bar\sigma^2$ represents the Jordan normal matrix
$$J_1(\lambda_0)\boxbslash J_1(\lambda_0)\boxbslash J_3(\lambda_1)\boxbslash J_2(\sigma).$$
For reversible Jordan normal matrices, we know that if $\lambda\in(\mathbb{R}\cup i\mathbb{R})\setminus\{0\}$ is an eigenvalue, then $-\lambda$ is also an eigenvalue with the same Jordan blocks. So we can omit the names of the blocks corresponding to $-\lambda$. Similarly, we simply write $\sigma^k$ instead of $\sigma^k\bar\sigma^k(-\sigma)^k(-\bar\sigma)^k$. So in what follows, $\lambda\lambda^2\sigma^3$ represents the reversible Jordan normal matrix
$$J_1(\lambda)\boxbslash J_1(-\lambda)\boxbslash J_2(\lambda)\boxbslash J_2(-\lambda)\boxbslash J_3(\sigma)\boxbslash J_3(-\sigma).$$
To obtain very short names, instead of writing $\lambda\lambda\lambda\lambda^2\lambda^2$ we write $3.\lambda(2.\lambda^2)$.

Finally, to denote a class up to simultaneous conjugacy we must add to the name of the Jordan normal form something which represents the sequence $(p_k,q_k)$, $1\le k\le N$ (as already explained, this sequence describes the symmetry corresponding to the 0-Jordan blocks). For that purpose we incorporate the sequence in the name of the 0-Jordan block: we denote by $(p.0^{k+})(q.0^{k-})$ the class of the pair
$$\Big(\boxbslash_{\ell=1}^{p+q}J_k(0),\ \boxbslash_{\ell=1}^{p}S_k\ \boxbslash\ \boxbslash_{\ell=1}^{q}(-S_k)\Big).$$
Moreover, when $p$ (resp. $q$) is equal to 0 we do not write the term $(p.0^{k+})$ (resp. $(q.0^{k-})$). Hence, for example, we denote by $2.0^{+}0^{2-}(i\omega)$ the class of the reversible pairs which are simultaneously conjugated to the pair $(L,S)$ with
$$L=\begin{pmatrix}0&0&0&0&0&0\\0&0&0&0&0&0\\0&0&0&1&0&0\\0&0&0&0&0&0\\0&0&0&0&0&-\omega\\0&0&0&0&\omega&0\end{pmatrix},\qquad S=\begin{pmatrix}1&0&0&0&0&0\\0&1&0&0&0&0\\0&0&-1&0&0&0\\0&0&0&1&0&0\\0&0&0&0&1&0\\0&0&0&0&0&-1\end{pmatrix}.$$
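The example pair can be checked directly. The snippet below is a small sanity check, with an arbitrary illustrative value of $\omega$: it verifies that $S$ is a symmetry, that $SL=-LS$, that the spectrum is $\{0,\pm i\omega\}$ with 0 of algebraic multiplicity 4, and that $S$ acts as $-1$ on the eigenvector of the size-2 zero block, which is what the superscript in $0^{2-}$ records.

```python
import numpy as np

omega = 2.0  # illustrative value; any omega > 0 would do

L = np.zeros((6, 6))
L[2, 3] = 1.0          # 0-Jordan block of size 2 on coordinates 3, 4
L[4, 5] = -omega       # real Jordan block J_1(i*omega) on coordinates 5, 6
L[5, 4] = omega
S = np.diag([1.0, 1.0, -1.0, 1.0, 1.0, -1.0])

is_symmetry = np.array_equal(S @ S, np.eye(6))
anticommutes = np.array_equal(S @ L, -(L @ S))

eigenvalues = np.linalg.eigvals(L)
n_zero = int(np.sum(np.abs(eigenvalues) < 1e-6))
nonzero_imag = np.sort(eigenvalues[np.abs(eigenvalues) >= 1e-6].imag)

e3 = np.zeros(6)
e3[2] = 1.0            # eigenvector of the size-2 zero block
acts_as_minus_one = np.array_equal(S @ e3, -e3)
```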
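We name resonant fixed points using the nomenclature introduced for reversible pairs. So for instance, we say that $u_0$ is a $0^{2+}i\omega$ resonant fixed point of $V(u,\lambda)$ for $\lambda=\lambda_c$, or equivalently that $V(u,\lambda)$ admits a $0^{2+}i\omega$ resonance at $u_0$ for $\lambda=\lambda_c$, when
• $V(u_0,\lambda)=0$ for $\lambda$ close to $\lambda_c$,
• the spectrum of $D_uV(u_0,\lambda_c)$ is $\{0,\pm i\omega\}$ where 0 is a double non semi-simple eigenvalue and $\pm i\omega$ are simple eigenvalues,
• $S\varphi_0=+\varphi_0$ where $\varphi_0$ is an eigenvector of $D_uV(u_0,\lambda_c)$ associated with the eigenvalue 0.

4 The $0^{2+}i\omega$ resonance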
This section is devoted to the $0^{2+}i\omega$ resonance: let us study a one parameter family of analytic vector fields in $\mathbb{R}^4$,
$$\frac{du}{dt}=V(u,\lambda),\qquad u\in\mathbb{R}^4,\ \lambda\in[-\lambda_0,\lambda_0],\ \lambda_0>0,\qquad(15)$$
near 0, assuming that 0 is an equilibrium, i.e. $V(0,\lambda)=0$ for $\lambda\in[-\lambda_0,\lambda_0]$. In addition, the vector field is supposed to be reversible, i.e. there exists a reflection $S\in GL_4(\mathbb{R})$ such that for every $u$ and $\lambda$, $V(Su,\lambda)=-SV(u,\lambda)$ holds. We are interested in the neighborhood of the degenerate case when the spectrum of the differential at the origin $D_uV(0,0)$ is $\{\pm i\omega,0\}$ where 0 is a double non semi-simple eigenvalue, and we denote by $(\varphi_0,\varphi_1,\varphi_+,\varphi_-)$ a basis of eigenvectors and generalized eigenvectors, $D_uV(0,0)\varphi_0=0$, $D_uV(0,0)\varphi_1=\varphi_0$, $D_uV(0,0)\varphi_\pm=\pm i\omega\varphi_\pm$, and by $(\varphi_0^*,\varphi_1^*,\varphi_+^*,\varphi_-^*)$ the corresponding dual basis. There exist only two types of such vector fields, since necessarily $S\varphi_0=\pm\varphi_0$ holds. The vector fields corresponding to $S\varphi_0=\varphi_0$ are said to admit a $0^{2+}i\omega$ resonance, the other ones are said to admit a $0^{2-}i\omega$ resonance. In this section, we only study vector fields admitting a $0^{2+}i\omega$ resonance, because of their physical interest.

Two last generic hypotheses on the linear and the quadratic parts of the vector field are made. The first one concerns the linear part of the vector field, $c=\langle\varphi_1^*,D^2_{\lambda u}V(0,0)\varphi_0\rangle\neq0$. It ensures that the bifurcation really occurs: we do not study the hyper-degenerate case when the double eigenvalue 0 stays at 0 for $\lambda\neq0$. The last hypothesis, $a:=\tfrac12\langle\varphi_1^*,D^2_{uu}V(0,0)[\varphi_0,$ […] 0 with $|P^*_k(t)|<Mk$. It also admits a unique reversible homoclinic connection $h$ to 0 and two families of reversible connections $h^j_k$, $k>0$, $j=1,2$, homoclinic to $P^*_k(t)$, satisfying $h^j_0=h$ and $h^j_k(t)\to P^*_k(t\pm\varphi_j)$ as $t\to\pm\infty$. Using the Lyapunov-Schmidt method we check that the periodic orbits persist for the full system, i.e. for $\rho=1$, the full system (16) admits a one parameter family of reversible periodic orbits $P_k(t)$, $k>0$, with $|P_k(t)|<Mk$. The question of the persistence of the homoclinic connections is far more
intricate. Indeed, roughly speaking, looking for a reversible orbit $Y$ homoclinic to $P_k$ ($Y(t)\to P_k(t\pm\varphi)$ as $t\to\pm\infty$) under the form $Y=h+P_k+v$, where $v$ is a perturbation term, one has to solve an equation involving an oscillatory integral of the type
$$I(\varepsilon)=\int_{-\infty}^{+\infty}e^{i\omega t/\varepsilon}\,g(h(t)+v(t))\,dt.\qquad(17)$$
This equation has solutions provided that $k>|I(\varepsilon)|$. So, as for our toy model, there appears a critical size $k_c(\varepsilon)$ for the reversible homoclinic connections, given by $k_c(\varepsilon)=|I(\varepsilon)|$. The critical size $k_c(\varepsilon)$ is the smallest size of a periodic orbit which admits a reversible connection homoclinic to itself. So we would like to know whether $I(\varepsilon)$ vanishes or not. For $v=0$, $I(\varepsilon)=M_e(\varepsilon)$ is the classical Melnikov function and it is exponentially small: $M_e(\varepsilon)=\int_{-\infty}^{+\infty}e^{i\omega t/\varepsilon}g(h(t))\,dt\approx e^{-\pi\omega/\varepsilon}$. The standard perturbation theory simply ensures that $I(\varepsilon)=M_e(\varepsilon)+O(\varepsilon^2)=e^{-\pi\omega/\varepsilon}+O(\varepsilon^2)$. Since the remainder $O(\varepsilon^2)$ is much larger than the exponentially small leading term, such an estimate of $I(\varepsilon)$ does not enable us to determine whether $I(\varepsilon)$ vanishes or not. To answer this question, we first use the First Mono-frequency Exponential Lemma 2.1 to obtain an exponentially small upper bound of $|I(\varepsilon)|$. This way we obtain

Theorem 4.1 For any $\ell\in\,]0,\pi[$, any $\rho\in[0,1]$ and for every $\varepsilon$ small enough, the full system (16) admits two families of reversible connections homoclinic to the periodic orbits $P_k$ provided that $k$, which is essentially the size of the periodic orbit, satisfies $k>K(\ell)\,\varepsilon^2e^{-\omega\ell/\varepsilon}$.

This theorem ensures that every vector field admitting a $0^{2+}i\omega$ resonance admits reversible connections homoclinic to exponentially small periodic orbits. Then, with the Third Mono-frequency Exponential Lemma 2.15, we prove

Theorem 4.2 There exists a real analytic function defined on $[0,1]$, $\Lambda(\rho)=\sum_{n\ge1}\rho^n\Lambda_n$, such that for any $\rho\in[0,1]$ with $\Lambda(\rho)\neq0$ and for $\varepsilon$ small enough, the full system does not admit any homoclinic connection to 0 whose principal part is $h$. The first coefficient $\Lambda_1$ is explicitly known in terms of Bessel functions and in terms of the coefficients of the power expansion of the remainder $R$. Hence, if $\Lambda_1\neq0$ (which is generically satisfied), there is at most a finite number of values of $\rho$ for which the full system (16) admits, for $\varepsilon$ small enough, a homoclinic connection to 0 whose principal part is $h$.
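The reason why an $O(\varepsilon^2)$ remainder is useless here can be seen on a toy oscillatory integral. The sketch below is an illustration only: it replaces $g(h(t))$ by $\cosh^{-2}(t/2)$, which is an assumption made for this example (chosen because its singularities closest to the real axis sit at $\pm i\pi$), and compares a plain quadrature with the classical closed form $\int_{-\infty}^{+\infty}e^{iat}\cosh^{-2}(t/2)\,dt=4\pi a/\sinh(\pi a)$, where $a$ plays the role of $\omega/\varepsilon$.

```python
import numpy as np

def toy_melnikov(a, half_width=60.0, step=0.01):
    """Uniform-grid quadrature of the toy integral with frequency a."""
    t = np.arange(-half_width, half_width + step, step)
    integrand = np.exp(1j * a * t) / np.cosh(t / 2.0) ** 2
    # On a uniform grid with negligible tails, the trapezoidal rule is a plain sum.
    return (step * np.sum(integrand)).real

def toy_melnikov_exact(a):
    """Closed form 4*pi*a / sinh(pi*a), of order 8*pi*a*exp(-pi*a) for large a."""
    return 4.0 * np.pi * a / np.sinh(np.pi * a)

a = 5.0                                   # e.g. omega = 1, epsilon = 0.2
numeric = toy_melnikov(a)
exact = toy_melnikov_exact(a)
leading = 8.0 * np.pi * a * np.exp(-np.pi * a)
```

With $\omega=1$ and $\varepsilon=0.2$ the integral is of order $10^{-5}$ while $\varepsilon^2=4\cdot10^{-2}$, so an error term of size $\varepsilon^2$ completely hides the sign and the zeros of the integral.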
So generically, vector fields admitting a $0^{2+}i\omega$ resonance at the origin do not admit any homoclinic connection to 0, whereas they always admit homoclinic connections to exponentially small periodic orbits. The detailed proofs of Theorems 4.1 and 4.2 are given in [16]. The persistence results given by these two theorems are summed up in Figure 11. A sketch of proof for Theorem 4.2 is given in the following subsection.

Figure 11. Domain of existence of reversible homoclinic connections to a periodic orbit of size $k$ for the truncated system ($\rho=0$) and their domains of persistence and non-persistence for the full system ($\rho\neq0$, $0<\ell<\pi$) obtained in this chapter.
Sketch of proof for Theorem 4.2. We argue by contradiction. Assume that $Y$ is a reversible homoclinic connection to 0 of (16) and denote by $v=Y-h$ the perturbation term, where $h$ is the reversible homoclinic connection of the truncated system, i.e. of the system (16) with $\rho=0$. Thus, the perturbation term $v$ satisfies the equation
$$\frac{dv}{dt}-DN(h,\varepsilon).v=g(v,t,\rho,\varepsilon)\qquad(18)$$
with $g(v,t,\rho,\varepsilon)=Q(v,\varepsilon)+\rho R(h+v,\varepsilon)$ where $Q(v,\varepsilon)=N(h+v,\varepsilon)-N(h,\varepsilon)-DN(h,\varepsilon).v$. Then, the crux of the analysis is the following: for an antireversible function $f$ which goes to 0 exponentially at infinity, there exists a reversible function $u$ which goes to 0 exponentially at infinity and which satisfies
$$\frac{du}{dt}-DN(h,\varepsilon).u=f\qquad(19)$$
if and only if the function $f$ satisfies the solvability condition
$$\int_0^{+\infty}\langle r_-(t),f(t)\rangle\,dt=0$$
where the vector $r_-$ is the solution of the homogeneous equation (19) with $f=0$ explicitly given by
$$r_-(t):=(0,0,-\sin\psi_\varepsilon(t),\cos\psi_\varepsilon(t))\quad\text{with}\quad\psi_\varepsilon(t)=(\omega+a\varepsilon^2)\frac{t}{\varepsilon}+2b\varepsilon\tanh\big(\tfrac{t}{2}\big).$$
Thus, if the full system (16) admits a reversible homoclinic connection $Y$ to 0, then $v=Y-h$ necessarily satisfies
$$I(\varepsilon,\rho):=\int_0^{+\infty}\langle r_-(t),Q(v(t),\varepsilon)+\rho R(h(t)+v(t),\varepsilon)\rangle\,dt=0.$$
Our aim is then to prove that $I(\varepsilon,\rho)$ is exponentially small but does not vanish, which leads to a contradiction. The classical Melnikov approach is based on the idea that the leading part of $I(\varepsilon,\rho)$ comes from the leading part $h$ of $Y$ on $\mathbb{R}$. So one introduces the Melnikov function
$$M_e(\rho,\varepsilon)=\rho\int_0^{+\infty}\langle r_-(t),R(h(t),\varepsilon)\rangle\,dt=\frac{\rho}{2}\int_{-\infty}^{+\infty}e^{i\left((\frac{\omega}{\varepsilon}+a\varepsilon)t+2b\varepsilon\tanh(\frac{t}{2})\right)}\big(R_B(h(t),\varepsilon)+iR_A(h(t),\varepsilon)\big)\,dt$$
where $R=(R_\alpha,R_\beta,R_A,R_B)$, expecting that it is the leading part of $I(\varepsilon,\rho)$. However, here $M_e(\rho,\varepsilon)$ is given by an oscillatory integral which can be computed explicitly using residues and which is exponentially small,
$$M_e(\rho,\varepsilon)=\rho\,e^{-\omega\pi/\varepsilon}\Big(\Lambda_1+O_{\varepsilon\to0}(\varepsilon)\Big),$$
where $\Lambda_1$ is explicitly known in terms of Bessel functions and in terms of the coefficients of the power expansion of the remainder $R$. Using classical perturbation theory, one can only obtain that $I(\varepsilon,\rho)=M_e(\rho,\varepsilon)+O(\varepsilon^2)$. Such an estimate does not enable us to determine whether $I(\varepsilon,\rho)$ vanishes or not. So, to obtain an equivalent of the oscillatory integral $I(\varepsilon,\rho)$ with Lemma 2.15, we follow the strategy proposed in subsection 2.4 to prove that $Y=h+v$ belongs to
$$E^{2,\lambda}_{\pi,\delta}(2,3,\tfrac12,\tfrac12)\times E^{3,\lambda}_{\pi,\delta}(3,4,\tfrac12,\tfrac12)\times E^{2,\lambda}_{\pi,\delta}(2,3,\tfrac12,\tfrac12)\times E^{2,\lambda}_{\pi,\delta}(2,3,\tfrac12,\tfrac12)$$
for $\lambda\in\,]0,1[$ and some $\delta>1$. This way we obtain that
$$I(\varepsilon,\rho)=\Big(\rho\Lambda_1+\sum_{n\ge2}\rho^n\Lambda_n\Big)e^{-\pi\omega/\varepsilon}.$$
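The passage from the pairing $\langle r_-(t),R\rangle$ to the complex exponential in the Melnikov function can be checked in one line. The computation below only uses the expression of $r_-$ and the components $R=(R_\alpha,R_\beta,R_A,R_B)$, taken real. The reduction from the half-line to the whole line additionally assumes that the real part of the integrand is even in $t$ and its imaginary part odd, which is what the factor $\tfrac12$ suggests; this parity is an assumption stated here, not a property quoted from the text.

```latex
\begin{align*}
\langle r_-(t), R\rangle
  &= -\sin\psi_\varepsilon(t)\,R_A + \cos\psi_\varepsilon(t)\,R_B \\
  &= \operatorname{Re}\Big( e^{i\psi_\varepsilon(t)}\,\big(R_B + i R_A\big) \Big),
\end{align*}
% since, for real R_A and R_B,
%   e^{i\psi}(R_B + iR_A)
%     = (\cos\psi\,R_B - \sin\psi\,R_A) + i(\sin\psi\,R_B + \cos\psi\,R_A).
% Under the parity assumption above,
\begin{equation*}
\int_0^{+\infty} \langle r_-(t), R\rangle\,dt
  = \frac12 \int_{-\infty}^{+\infty} e^{i\psi_\varepsilon(t)}\,\big(R_B + i R_A\big)\,dt .
\end{equation*}
```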
Observe that the leading part $h$ of $Y$ on $\mathbb{R}$ and the perturbation term $v$ contribute to the size of the integral at the same order of $\varepsilon$. For the numerical constant $\Lambda(\rho)=\sum_{n\ge1}\rho^n\Lambda_n$, the first term $\rho\Lambda_1$ comes from $h$ and the other terms come from $v$. Thus generically $\Lambda(\rho)\neq0$, so $I(\varepsilon,\rho)$ is exponentially small but does not vanish, and thus there is no reversible homoclinic connection to 0.

5 The $(i\omega_0)^2i\omega_1$ resonance
This third section is devoted to the $(i\omega_0)^2i\omega_1$ resonance: let us study a one parameter family of analytic vector fields in $\mathbb{R}^6$,
$$\frac{du}{dt}=V(u,\lambda),\qquad u\in\mathbb{R}^6,\ \lambda\in[-\lambda_0,\lambda_0],\ \lambda_0>0,\qquad(20)$$
near 0, assuming that 0 is an equilibrium, i.e. $V(0,\lambda)=0$ for $\lambda\in[-\lambda_0,\lambda_0]$. In addition, the vector field is supposed to be reversible, i.e. there exists a reflection $S\in GL_6(\mathbb{R})$ such that for every $u$ and $\lambda$, $V(Su,\lambda)=-SV(u,\lambda)$ holds. We are interested in the neighborhood of the degenerate case when the spectrum of the differential at the origin $D_uV(0,0)$ is $\{\pm i\omega_0,\pm i\omega_1\}$ where $\pm i\omega_0$ are double non semi-simple eigenvalues, and we denote by $(\varphi_0^\pm,\psi_0^\pm,\varphi_1^\pm)$ a basis of eigenvectors and generalized eigenvectors, $D_uV(0,0)\varphi_0^\pm=\pm i\omega_0\varphi_0^\pm$, $D_uV(0,0)\psi_0^\pm=\pm i\omega_0\psi_0^\pm+\varphi_0^\pm$, $D_uV(0,0)\varphi_1^\pm=\pm i\omega_1\varphi_1^\pm$, such that $\varphi_0^-=\overline{\varphi_0^+}$ and $\psi_0^-=\overline{\psi_0^+}$. Denote by $(\varphi_0^{\pm,*},\psi_0^{\pm,*},\varphi_1^{\pm,*})$ the corresponding dual basis. Moreover, we make non resonance assumptions,
$$\frac{\omega_1}{\omega_0}\neq\frac{p}{q}\ \text{ for } p+q<5;\quad[\ldots];\quad\frac{\omega_1}{\omega_0}\notin\mathbb{N}\iff\omega^*=\min_{p\in\mathbb{N}}|\omega_1-p\omega_0|>0.\qquad(21)$$
Two last generic hypotheses on the linear and the cubic parts of the vector field are made. The first one concerns the linear part of the vector field, $q_1^0=\langle\psi_0^{+,*},D^2_{\lambda u}V(0,0)\varphi_0^+\rangle\neq0$. It ensures that the bifurcation really occurs: we do not study the hyper-degenerate case when the double eigenvalues $\pm i\omega_0$ stay on the imaginary axis for $\lambda\neq0$. The last hypothesis, $q_2^0:=\langle\psi_0^{+,*},D^3_{uuu}V(0,0)[\varphi_0^+,\varphi_0^+,\varphi_0^-]\rangle<0$, ensures that the normal form system admits homoclinic connections to 0. This last condition is the same as the one obtained in [13] for the $(i\omega)^2$ resonance (also called 1:1 resonance). One example of such vector fields in infinite dimensions occurs when studying traveling waves in a chain of nonlinear coupled oscillators (see [12]).
To study the dynamics of (20) near the origin, one can use normal form theory (see [1]) to obtain an equivalent system ($u=\Phi(Y,\lambda)$ where $\Phi$ is a polynomial diffeomorphism of degree 3 close to identity)
$$\frac{dY}{dt}=N(Y,\varepsilon)+\rho R(Y,\varepsilon)\qquad(22)$$
where $\varepsilon=\sqrt{q_1^0\lambda}$; $N$ is the normal form of order 3; $R$ represents the higher order terms and $\rho$ is an additional parameter: $\rho=0$ corresponds to the normal form system and for $\rho=1$, (20) is equivalent to (22). For $\rho=0$, the normal form system admits a one parameter family of reversible periodic orbits $P^*_k(t)$, $k>0$, with $|P^*_k(t)|<Mk$. It also admits two reversible homoclinic connections to 0, $h$ and $\mathfrak{h}$, and four families of reversible connections $h^j_k$, $\mathfrak{h}^j_k$, $k>0$, $j=1,2$, homoclinic to $P^*_k(t)$ and satisfying $h^j_0=h$, $\mathfrak{h}^j_0=\mathfrak{h}$, $h^j_k(t)\to P^*_k(t\pm\varphi_j)$, $\mathfrak{h}^j_k(t)\to P^*_k(t\pm$ […] 0 with $|P_k(t)|<Mk$. The question of the persistence of the homoclinic connections is far more intricate. Indeed, roughly speaking, looking for a reversible orbit $Y$ homoclinic to $P_k$ ($Y(t)\to P_k(t\pm\varphi)$ as $t\to\pm\infty$) under the form $Y=h+P_k+v$, where $v$ is a perturbation term, one has to solve an equation involving an oscillatory integral of the type
$$I(\varepsilon)=\int_{-\infty}^{+\infty}e^{i\omega_1t/\varepsilon}\,g(h(t)+v(t))\,dt.\qquad(23)$$
This equation has solutions provided that $k>|I(\varepsilon)|$. For $v=0$, $I(\varepsilon)=M_e(\varepsilon)$ is the classical Melnikov function and it is exponentially small. Indeed, it has the form (24), where $\omega^*$ is defined in (21). Such an integral cannot be estimated using the Third Mono-frequency Exponential Lemma 2.15, which gives exponential equivalents of oscillatory integrals $\int_{-\infty}^{+\infty}e^{i\omega_1t/\varepsilon}f(t,\varepsilon)\,dt$ provided $f$ admits a holomorphic continuation in a suitable space of bounded functions. Here, for every small $\ell>0$, $\sup_{|\mathrm{Im}\,\xi|<\ell,\ \varepsilon>0}|\cosh^{-1}(\xi)\cos(\omega_0\xi/\varepsilon)|=+\infty$ because of the second rapid oscillations induced by $\omega_0$. To solve this difficulty, we make here only a partial complexification of time: in subsection 5.1 we give a first Bi-Frequencies Exponential Lemma 5.3 which gives an exponential upper bound
for a bi-oscillatory integral
$$I(f,\varepsilon)=\int_{-\infty}^{+\infty}f\Big(t,\frac{\omega_0t}{\varepsilon},\varepsilon\Big)e^{i\omega_1t/\varepsilon}\,dt\qquad(25)$$
provided that $f:(\xi,s,\varepsilon)\mapsto f(\xi,s,\varepsilon)$ belongs to a space ${}^bH^\lambda_\ell$ of functions holomorphic with respect to $\xi\in\mathbb{C}$ and $2\pi$-periodic with respect to $s\in\mathbb{R}$. Such a lemma enables us to deal with integrals of the type of (24). In subsection 5.2, we briefly explain how to prove that a solution $\mathcal{Y}$ of a nonlinear differential equation can be written as $\mathcal{Y}(t):=y(t,\omega_0t/\varepsilon,\varepsilon)$. Using this strategy we can finally prove

Theorem 5.1 For any $\ell\in\,]0,\frac{\pi}{2}[$, any $\rho\in[0,1]$ and for every $\varepsilon$ small enough, the full system (22) admits four families of reversible connections homoclinic to the periodic orbits $P_k$ provided that $k$, which is essentially the size of the periodic orbit, satisfies $k>M(\ell)e^{-\omega^*\ell/\varepsilon}$.

This theorem ensures that every vector field admitting a $(i\omega_0)^2i\omega_1$ resonance with $q_2^0<0$ admits reversible connections homoclinic to exponentially small periodic orbits. To study the persistence of the reversible homoclinic connection to 0 (which corresponds to $k=0$) we must determine whether $I(\varepsilon)$ defined in (23) vanishes or not. For that purpose, we give in subsection 5.1 a second Bi-Frequencies Exponential Lemma 5.6 which gives an exponential equivalent of bi-oscillatory integrals. Using this lemma, we can prove

Theorem 5.2 There exists a real analytic function defined on $[0,1]$, $\Lambda(\rho)=\sum_{n\ge1}\rho^n\Lambda_n$, such that for any $\rho\in[0,1]$ with $\Lambda(\rho)\neq0$ and for $\varepsilon$ small enough, the full system does not admit any reversible homoclinic connection to 0 whose principal part is $h$. The first coefficient $\Lambda_1$ is explicitly known in terms of Bessel functions and in terms of the coefficients of the power expansion of the remainder $R$. Hence, if $\Lambda_1\neq0$ (which is generically satisfied), there is at most a finite number of values of $\rho$ for which the full system admits, for $\varepsilon$ small enough, a reversible homoclinic connection to 0 whose principal part is $h$.

So generically, vector fields admitting a $(i\omega_0)^2i\omega_1$ resonance with $q_2^0<0$ at the origin do not admit any reversible homoclinic connection to 0, whereas they always admit homoclinic connections to exponentially small periodic orbits. The detailed proofs of the Theorems and Lemmas given in this section are available in [16].
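The way a second fast frequency changes the exponential decay rate can be illustrated on a toy bi-oscillatory integral. The example below is an assumption made for illustration, not the Melnikov function of the text: it takes $f(\xi,s)=\cosh^{-1}(\xi)\cos(s)$, for which the integral $\int e^{i\omega_1t/\varepsilon}\cosh^{-1}(t)\cos(\omega_0t/\varepsilon)\,dt$ has the closed form $\frac{\pi}{2}\big[\cosh^{-1}\big(\tfrac{\pi(\omega_1+\omega_0)}{2\varepsilon}\big)+\cosh^{-1}\big(\tfrac{\pi(\omega_1-\omega_0)}{2\varepsilon}\big)\big]$, obtained from the classical Fourier transform of $1/\cosh$.

```python
import numpy as np

def sech(x):
    return 1.0 / np.cosh(x)

def toy_bi_oscillatory(eps, w0, w1, half_width=50.0, step=0.005):
    """Uniform-grid quadrature of the toy bi-oscillatory integral."""
    t = np.arange(-half_width, half_width + step, step)
    integrand = sech(t) * np.cos(w0 * t / eps) * np.exp(1j * w1 * t / eps)
    # On a uniform grid with negligible tails, the trapezoidal rule is a plain sum.
    return step * np.sum(integrand)

def toy_closed_form(eps, w0, w1):
    """Closed form from the Fourier transform of 1/cosh."""
    return 0.5 * np.pi * (sech(0.5 * np.pi * (w1 + w0) / eps)
                          + sech(0.5 * np.pi * (w1 - w0) / eps))

eps, w0, w1 = 0.1, 1.0, 1.3
numeric = toy_bi_oscillatory(eps, w0, w1)
exact = toy_closed_form(eps, w0, w1)

w_star = abs(w1 - w0)                                      # = 0.3 for these values
rate_with_w_star = np.pi * np.exp(-0.5 * np.pi * w_star / eps)
rate_with_w1 = np.pi * np.exp(-0.5 * np.pi * w1 / eps)
```

In this toy case the integral is governed by $e^{-\pi\omega^*/2\varepsilon}$ with $\omega^*=|\omega_1-\omega_0|$, which is larger by many orders of magnitude than the mono-frequency guess $e^{-\pi\omega_1/2\varepsilon}$.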
5.1 Exponential asymptotics of bi-oscillatory integrals
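Lemma 5.3 (First Bi-Frequencies Exponential Lemma) Let $\ell$, $\lambda$ be two positive numbers. Let ${}^bH^\lambda_\ell$ be the set of functions $f$ satisfying
1. $f:B_\ell\times\mathbb{R}\times]0,1]\to\mathbb{C}$; $\xi\mapsto f(\xi,s,\varepsilon)$ is holomorphic in $B_\ell:=\{\xi\in\mathbb{C}\,/\,|\mathrm{Im}(\xi)|<\ell\}$; $s\mapsto f(\xi,s,\varepsilon)$ is $2\pi$-periodic and of class $C^1$ in $\mathbb{R}$; $(\xi,s)\mapsto f(\xi,s,\varepsilon)$ is continuous in $B_\ell\times\mathbb{R}$;
2. $\|f\|_{{}^bH^\lambda_\ell}:=\sup_{(\xi,s,\varepsilon)\in B_\ell\times\mathbb{R}\times]0,1]}\Big(\big(|f(\xi,s,\varepsilon)|+|\tfrac{\partial f}{\partial s}(\xi,s,\varepsilon)|\big)e^{\lambda|\mathrm{Re}(\xi)|}\Big)<+\infty$.
Let $\omega_0$, $\omega_1$ be two real positive numbers. Define $\omega^*$ by $\omega^*=\min_{p\in\mathbb{N}}|\omega_1-p\omega_0|$. Then for every $f\in{}^bH^\lambda_\ell$ and $\varepsilon\in]0,1]$, the bi-oscillatory integral $I(f,\varepsilon)$ defined by (25) satisfies […]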
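Lemma 5.3 gives a very efficient way to obtain exponential upper bounds because the membership to ${}^bH^\lambda_\ell$ is stable by addition, multiplication and "composition": the space ${}^bH^\lambda_\ell$ is an algebra, and for every $f\in{}^bH^\lambda_\ell$ and every $g$ holomorphic in a domain containing the range of $f$ and satisfying $g(0)=0$, $g\circ f\in{}^bH^\lambda_\ell$ holds. For computing an equivalent of the bi-oscillatory integral, we need more information on $f$.

Definition 5.4 Assume $\gamma,\lambda>0$, $\sigma>0$, $\delta>1$, $m_0,m_1>1$, $0<r<1$ and $\nu>0$. Denote by ${}^bE^{\gamma,\lambda}_{\sigma,\delta}(m_0,m_1,r,\nu)$ the set of functions $f$ such that
1. $f:\bigcup_{\varepsilon\in]0,\varepsilon_0]}\big(B_{\sigma-\delta\varepsilon}\times\mathbb{R}\times\{\varepsilon\}\big)\to\mathbb{C}$; $\xi\mapsto f(\xi,s,\varepsilon)$ is holomorphic in $B_{\sigma-\delta\varepsilon}$; $s\mapsto f(\xi,s,\varepsilon)$ is $2\pi$-periodic and of class $C^1$ in $\mathbb{R}$; $(\xi,s)\mapsto f(\xi,s,\varepsilon)$ is continuous in $B_{\sigma-\delta\varepsilon}\times\mathbb{R}$ and $\|f\|<+\infty$ where
$$\|f\|:=\sup_{0<\varepsilon\le\varepsilon_0,\ (\xi,s)\in B_{\sigma-\delta\varepsilon}\times\mathbb{R}}\Big(|f(\xi,s,\varepsilon)|+\big|\tfrac{\partial f}{\partial s}(\xi,s,\varepsilon)\big|\Big)\times\Big(|(\xi-i\sigma)(\xi+i\sigma)|^\gamma\chi_1(\xi)+(1-\chi_1(\xi))e^{\lambda|\mathrm{Re}(\xi)|}\Big)$$
with $\chi_1(\xi)=1$ for $|\mathrm{Re}(\xi)|<1$ and $\chi_1(\xi)=0$ otherwise;
2. for every $\varepsilon\in]0,\varepsilon_0]$ and every $z\in\Sigma_{\delta,r,\varepsilon}:=\,]-\varepsilon^{r-1},\varepsilon^{r-1}[\,\times\,i\,]-\varepsilon^{r-1},-\delta[$, $f$ reads
$$f(\pm(i\sigma+\varepsilon z),s,\varepsilon)=\frac{1}{\varepsilon^\gamma}f_0^\pm(z,s)+\frac{1}{\varepsilon^{\gamma-\nu}}f_1^\pm(z,s,\varepsilon)$$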
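where $f_0^\pm\in{}^bH^{m_0}_\delta$ and $f_1^\pm\in{}^bH^{m_1}_{\delta,r}$ with
$${}^bH^{m_0}_\delta:=\Big\{f:\Omega_\delta\times\mathbb{R}\to\mathbb{C},\ \text{where }\Omega_\delta:=\mathbb{R}\times i\,]-\infty,-\delta[,\ z\mapsto f(z,s)\text{ holomorphic in }\Omega_\delta,\ s\mapsto f(z,s)\ 2\pi\text{-periodic in }\mathbb{R},$$
$$(z,s)\mapsto f(z,s)\text{ continuous in }\Omega_\delta\times\mathbb{R},\ \|f\|_{{}^bH^{m_0}_\delta}:=\sup_{(z,s)\in\Omega_\delta\times\mathbb{R}}\big(|f(z,s)|\,|z|^{m_0}\big)<+\infty\Big\}$$
and
$${}^bH^{m_1}_{\delta,r}:=\Big\{f:\bigcup_{\varepsilon\in]0,\varepsilon_0]}\big(\Sigma_{\delta,r,\varepsilon}\times\mathbb{R}\times\{\varepsilon\}\big)\to\mathbb{C},\ z\mapsto f(z,s,\varepsilon)\text{ holomorphic in }\Sigma_{\delta,r,\varepsilon},\ s\mapsto f(z,s,\varepsilon)\ 2\pi\text{-periodic in }\mathbb{R},$$
$$(z,s)\mapsto f(z,s,\varepsilon)\text{ continuous in }\Sigma_{\delta,r,\varepsilon}\times\mathbb{R},\ \|f\|_{{}^bH^{m_1}_{\delta,r}}:=\sup\big(|f(z,s,\varepsilon)|\,|z|^{m_1}\big)<+\infty\Big\}.$$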
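Remark 5.5 The two functions $f_0^\pm$ and $f_1^\pm$ are uniquely defined. Indeed, they are given by $f_0^\pm(z,s):=\lim_{\varepsilon\to0}\varepsilon^\gamma f(\pm(i\sigma+\varepsilon z),s,\varepsilon)$ and $f_1^\pm(z,s,\varepsilon)=\varepsilon^{-\nu}\big(\varepsilon^\gamma f(\pm(i\sigma+\varepsilon z),s,\varepsilon)-f_0^\pm(z,s)\big)$. Denote by $P_0^\pm$ the operators $P_0^\pm:{}^bE^{\gamma,\lambda}_{\sigma,\delta}(m_0,m_1,r,\nu)\to{}^bH^{m_0}_\delta$, $f\mapsto f_0^\pm$.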
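The decay rate $\omega^*=\min_p|\omega_1-p\omega_0|$ introduced in Lemma 5.3, together with the integer realising the minimum, is straightforward to compute. The helper below is a minimal sketch with a hypothetical name; it assumes that $p$ ranges over the nonnegative integers and, when two integers realise the minimum, it returns the smallest one.

```python
import math

def omega_star(w0, w1):
    """Return (omega_star, p_star) with omega_star = min over p >= 0 of |w1 - p*w0|.

    In case of a tie between two integers, the smallest one is returned.
    """
    if w0 <= 0 or w1 <= 0:
        raise ValueError("w0 and w1 must be positive")
    lower = math.floor(w1 / w0)
    # Only the two integers surrounding w1/w0 can realise the minimum.
    candidates = [lower, lower + 1]
    p_star = min(candidates, key=lambda p: (abs(w1 - p * w0), p))
    return abs(w1 - p_star * w0), p_star
```

For example, `omega_star(1.0, 1.5)` returns `(0.5, 1)`: both $p=1$ and $p=2$ realise the minimum and the smallest is kept.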
Lemma 5.6 (Second Bi-Frequencies Exponential Lemma) Assume $\gamma>1$, $\delta>1$, $\sigma,\lambda>0$, $m_0,m_1>1$, $0<r<1$, $\nu>0$. Let $\omega_0$, $\omega_1$ be two positive numbers. Define $\omega^*$ by $\omega^*=\min_{p\in\mathbb{N}}|\omega_1-p\omega_0|$. For $\frac{\omega_1}{\omega_0}\notin\mathbb{N}+\frac12$, this minimum is reached for a unique value of $p$ denoted by $p^*$. When $\frac{\omega_1}{\omega_0}\in\mathbb{N}+\frac12$, it is reached for 2 values of $p$ and we denote by $p^*$ the smallest one. Then, for every function $f\in{}^bE^{\gamma,\lambda}_{\sigma,\delta}(m_0,m_1,r,\nu)$ the bi-oscillatory integral $I(f,\varepsilon)$ satisfies
$$I(f,\varepsilon)=\frac{1}{\varepsilon^{\gamma-1}}\,e^{-\omega^*\sigma/\varepsilon}\Big(\Lambda+O_{\varepsilon\to0}(\varepsilon^{\nu^*})\Big)$$
where $\nu^*:=\min(\nu,(1-r)(m_0-1),(1-r)(\gamma-1))$ and $\Lambda$ is given by
1. for $\frac{\omega_1}{\omega_0}\notin\mathbb{N}+\frac12$,
$$\Lambda:=\int_{-i\eta-\infty}^{-i\eta+\infty}dz\,e^{i\omega^*z}\,\frac{1}{2\pi}\int_{-\pi}^{\pi}f(z,s)\,e^{ip^*s}\,ds\quad\text{with }\eta>\delta,$$
where $f=f_0^+$ when $\omega_1-p^*\omega_0>0$ and $f=f_0^-$ when $\omega_1-p^*\omega_0<0$;
2. for $\omega_1=(p^*+\frac12)\omega_0$, $\Lambda:=$ […] with $\eta>\delta$.

This lemma ensures that the relevant parts of $f\in{}^bE^{\gamma,\lambda}_{\sigma,\delta}$ for computing an equivalent of the oscillatory integrals $I(f,\varepsilon)$ are $P_0^\pm(f)$. Moreover, it gives a very efficient way to compute equivalents of oscillatory integrals involving solutions of non linear differential equations, since the principal part $P_0^\pm$ of a sum (resp. of a product) is the sum (resp. the product) of the principal parts: $P_0^\pm:{}^bE^{\gamma,\lambda}_{\sigma,\delta}(m_0,m_1,r,\nu)\to{}^bH^{m_0}_\delta$ are linear operators and for every $f,g\in{}^bE^{\gamma,\lambda}_{\sigma,\delta}$, $\varepsilon^\gamma fg$ lies in ${}^bE^{\gamma,\lambda}_{\sigma,\delta}$ and $P_0^\pm(\varepsilon^\gamma fg)=P_0^\pm(f)P_0^\pm(g)$. The exponential lemma 5.6 can be proved using the exponential lemma given in section 2 for mono-frequency oscillatory integrals. Observe that here the interaction between the two frequencies $\omega_0$ and $\omega_1$ changes the decay rate in the exponential: here we have $\omega^*$ instead of $\omega_1$ for the mono-frequency case.

5.2 Strategy for non linear differential equations

To use the Bi-Frequencies Exponential Lemma 5.3 to compute exponential upper bounds of bi-oscillatory integrals $I(f,\varepsilon)$ when $f$ involves solutions of real nonlinear differential equations, we must be able to prove that a given solution $\mathcal{Y}$ of a real nonlinear analytic differential equation
$$\frac{d\mathcal{Y}}{dt}=F(\mathcal{Y},\varepsilon)\qquad(26)$$
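can be written $\mathcal{Y}(t)=y(t,\frac{\omega_0t}{\varepsilon},\varepsilon)$ where $y$ belongs to ${}^bH^\lambda_\ell$. For that purpose we look for solutions of the first order P.D.E.
$$\frac{\partial y}{\partial\xi}+\frac{\omega_0}{\varepsilon}\frac{\partial y}{\partial s}=F(y,\varepsilon)\qquad(27)$$
since, by the chain rule, every solution $y$ of (27) gives a solution $\mathcal{Y}$ of (26) by setting $\mathcal{Y}(t)=y(t,\frac{\omega_0t}{\varepsilon},\varepsilon)$. Observe that if $y_1(\xi,s)=y_2(\xi,s)+\Phi(\xi-\frac{\varepsilon}{\omega_0}s)$ with $\Phi(0)=0$, then $y_1$ and $y_2$ lead to the same function $\mathcal{Y}(t)$. This indetermination, induced by the introduction of the two time variables $\xi$ and $s$, leads to some technical difficulties. However, we can use the method of characteristics to transform (27) into an integral equation of the form $y=T(F(y,\varepsilon))$ where $T$ is a suitable integral operator. Then, we solve this last equation using the contraction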
mapping theorem in spaces of type ${}^bH^\lambda_\ell$. To obtain solutions of (26) which read $\mathcal{Y}(t)=y(t,\frac{\omega_0t}{\varepsilon},\varepsilon)$ where $y$ belongs to ${}^bE^{\gamma,\lambda}_{\sigma,\delta}$, we must combine the above strategy with the one proposed in subsection 2.4 for obtaining solutions of (26) in $E^{\gamma,\lambda}_{\sigma,\delta}$, which is the suitable set of functions for computing equivalents of mono-frequency oscillatory integrals.

References

1. Iooss G., Adelmeyer M., Topics in bifurcation theory and applications. Advanced Series in Nonlinear Dynamics, 3, World Scientific, 1992.
2. Amick C.J., McLeod J.B., Arch. Rational Mech. Anal. 109, 1990, 139-171.
3. Amick C.J., McLeod J.B., A singular perturbation problem in water waves. Stab. Appl. Anal. of Cont. Media, 1992, 127-148.
4. Arnold V.I., Instability of dynamical systems with several degrees of freedom. Doklady Akad. Nauk SSSR 156, 1964, 581-585.
5. Arnold V.I., Geometrical methods in the theory of ordinary differential equations. Springer, 1983.
6. Delshams A., Seara T.M., An asymptotic expression for the splitting of separatrices of rapidly forced pendulum. Comm. Math. Phys. 150, 1992, 433-463.
7. Gelfreich V.G., Reference system for splitting of separatrix. Nonlinearity 10, 1997, 175-193. Gelfreich V.G., Separatrix splitting for a high-frequency perturbation of the pendulum (1996, unpublished).
8. Hammersley J.M., Mazzarino G., Computational aspects of some autonomous differential equations. Proc. Roy. Soc. London Ser. A 424 (1989).
9. Hammersley J.M., Mazzarino G., IMA J. Appl. Math. 42, 1989, 43-75.
10. Holmes P., Marsden J., Scheurle J., Exponentially small splittings of separatrices with applications to KAM theory and degenerate bifurcations. Contemporary Mathematics Vol. 81, 1988, 213-244.
11. Iooss G., Kirchgässner K., Water waves for small surface tension: an approach via normal form. Proceedings of the Royal Society of Edinburgh 122A, 1992, 267-299.
12. Iooss G., Kirchgässner K., Traveling waves in a chain of coupled nonlinear oscillators. Comm. Math. Phys., to appear.
13. Iooss G., Pérouème M.C., Perturbed homoclinic solutions in 1:1 resonance vector fields. Journal of Differential Equations 102, 1993, 62-88.
14. Kruskal M.D., Segur H., Asymptotics beyond all orders in a model of crystal growth. Studies in Applied Mathematics 85 (1991) 129-181.
15. Lombardi E., Homoclinic orbits to exponentially small periodic orbits for a class of reversible systems. Application to water waves. Arch. Rational Mech. Anal. 137 (1997) 227-304.
16. Lombardi E., Oscillatory integrals and phenomena beyond any algebraic order; with applications to homoclinic orbits in reversible systems. Springer Verlag Lecture Notes in Mathematics, to appear.
17. Melnikov V.K., On the stability of the center for time periodic perturbations. Trans. Moscow Math. Soc. 12, 1963, 1-57.
18. Poincaré H., Les méthodes nouvelles de la mécanique céleste (vol. 2). Paris: Gauthier-Villars (1893).
19. Pomeau Y., Ramani A., Grammaticos B., Structural stability of the Korteweg-De Vries solitons under a singular perturbation. Physica D 31, 1988, 127-134.
20. Segur H., Tanveer S., Levine H., Asymptotics beyond all orders. NATO ASI Series B Vol. 284 (1991).
21. Sun S.M., Shen M.C., Exponentially small estimate for the amplitude of capillary ripples of generalized solitary wave. J. Math. Anal. Appl. 172, 1993, 533-566.
22. Sun S.M., Non-existence of truly solitary waves in water with small surface tension. Proc. Roy. London, 1999, in press.
MATHEMATICAL MODELLING IN THE LIFE SCIENCES: APPLICATIONS IN PATTERN FORMATION AND WOUND HEALING

PHILIP K. MAINI
Centre for Mathematical Biology, Oxford

1 Introduction
Spatial and spatio-temporal patterns occur widely in the life sciences as well as in chemistry. Perhaps the best-known example of spatio-temporal pattern formation is the spontaneous generation of propagating fronts, target pat terns, spiral waves and toroidal scrolls in the Belousov-Zhabotinsky reaction, in which bromate ions oxidise malonic acid in a reaction catalysed by cerium, which has the states Ce 3 + and Ce 4 + . Sustained periodic oscillations are ob served in the cerium ions. If, instead, one uses the catalyst Fe 2 + and Fe 3 + and phenanthroline, the periodic oscillations are visualised as colour changes between reddish-orange and blue (see, for example, Murray, 1993, Johnson and Scott, 1996 for review). Similar types of patterning arise in physiology and one of the most widely-studied and important areas of wave propagation concerns the electrical activity in the heart (Panfilov and Holden, 1997) which stimulates muscle contraction resulting in the heart beating. Understanding the mechanisms underlying spatio-temporal pattern for mation is a central goal in embryology. Although genes control pattern for mation, genetics does not give us an understanding of the actual mechanisms involved in patterning. Many models of how different processes can conspire to produce pattern have been proposed and analysed. They range from gradienttype models involving a simple source-sink mechanism (Wolpert, 1969); to cellular automata models in which the tissue is discretised and rules are intro duced as to how different elements interact with each other (see, for example, Bard, 1981); to more complicated models which incorporate more sophisti cated chemistry, physics and biology, and which propose that patterns are set up due to self-organisation rather than as a consequence of externally imposed :ues. All of these models focus on the key question of how cells respond to and interact with signalling cues. 
This is a crucial question in a number of areas of the biomedical sciences. For example, in tumour formation, one wishes to understand how the signalling cues involved in the regulation of cellular processes no longer function properly. Many of the physical processes that occur
during the formation and spread of tumour cells are also of vital importance in wound healing, where their function benefits the organism rather than being destructive to it. Recent advances in molecular and cellular biology have led to the rapid development of experimental research into the biochemical mechanisms underlying the processes of wound healing. Wound healing is an enormously complex dynamic spatio-temporal process and new insights are being gained by focussing on the interaction of the specific processes involved in a particular aspect of healing.

In all the above areas, a number of complex mechanical and biochemical processes interact in a highly nonlinear way. Such systems are amenable to mathematical modelling, and the role of the modeller is to suggest explanations, based on biologically plausible mechanisms, of observed behaviour and to make experimentally testable predictions.

In this article, I will review, in Section 2, some of the commonly used models for pattern formation in early development. In Section 3 I will describe some models that have been used to address particular phenomena in wound healing. At the end of each section, a discussion of the particular applications is presented. Section 4 contains a general discussion and points to future directions.

2 Models for pattern formation and morphogenesis
Cell fate and position within the embryo can be strongly influenced by environmental factors. Therefore, to answer questions on pattern formation, one must really address the issue of how the embryo organises the complex spatio-temporal sequence of signalling cues necessary to develop structure in a controlled and coordinated manner. Structure can form through tissue movement and rearrangement, cell-cell interaction, or in response to chemical signals. We first consider the last type of model.

2.1 Chemical pre-pattern models
The simplest chemical pre-pattern model is the gradient model proposed by Wolpert (1969) in which a source-sink mechanism, coupled with diffusion and degradation, leads to a spatial gradient in a single morphogen. He proposed that this provided positional information for cells, which differentiated according to a series of threshold values. More complicated spatial patterns can be generated by the reaction and diffusion of a number of chemicals. This phenomenon is known as diffusion-driven instability and was first proposed by Turing in a seminal paper (Turing, 1952). The reaction kinetics he considered were stabilizing and diffusion is, of course, a homogenizing process. Yet combined in the appropriate way, these two stabilizing influences can conspire to produce an instability which can result in spatially heterogeneous chemical profiles - a spatial pattern. This is an example of an emergent property: a spatially uniform steady state, linearly stable in the absence of diffusion, becomes linearly unstable in the presence of diffusion.
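The positional-information idea in the gradient model can be made concrete with a small calculation. The following is a minimal sketch only, under assumptions that are not in the text: a one-dimensional semi-infinite tissue, a source held at concentration `c0` at `x = 0`, constant diffusion coefficient `D` and linear degradation at rate `k`. At steady state the profile then satisfies $D c'' - k c = 0$, giving $c(x) = c_0 e^{-x/\lambda}$ with $\lambda = \sqrt{D/k}$, and each threshold value maps to a single position. All names and parameter values are hypothetical.

```python
import math

def gradient(x, c0, D, k):
    """Steady-state morphogen profile c(x) = c0 * exp(-x / lam), lam = sqrt(D / k).

    Assumes (hypothetically) a fixed source c0 at x = 0, linear decay at
    rate k, and a semi-infinite one-dimensional domain.
    """
    lam = math.sqrt(D / k)
    return c0 * math.exp(-x / lam)

def threshold_position(T, c0, D, k):
    """Position at which the profile falls to the threshold T (0 < T <= c0)."""
    lam = math.sqrt(D / k)
    return lam * math.log(c0 / T)

# Dimensionless values chosen purely for illustration.
c0, D, k = 1.0, 1.0, 4.0
thresholds = [0.5, 0.25, 0.125]

# Each threshold defines a boundary between two cell fates.
boundaries = [threshold_position(T, c0, D, k) for T in thresholds]
```

In this sketch a series of thresholds in geometric progression produces equally spaced boundaries, which illustrates how a single monotonic gradient can be read out as several distinct regions.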
To derive the partial differential equation form of Turing's model, let us first consider a single chemical, with concentration $c(\mathbf{x}, t)$ at position $\mathbf{x} \in \mathbb{R}^3$ and time $t \in (0, \infty)$. Consider an arbitrary volume $V \subset \mathbb{R}^3$ with boundary $\partial V$. Then

rate of change of chemical in $V$ = $-$ flux out of $V$ + net production in $V$,

i.e.

$$\frac{d}{dt}\int_V c \, dv = -\int_{\partial V} \mathbf{F}\cdot d\mathbf{S} + \int_V f(c)\, dv \tag{1}$$

where $\mathbf{F}$ is the flux of chemical per unit area and $f(c)$ is net chemical production per unit volume. On using the divergence theorem and the fact that the volume $V$ is arbitrary, this equation becomes

$$\frac{\partial c}{\partial t} = -\nabla\cdot\mathbf{F} + f(c). \tag{2}$$
The function $\mathbf{F}$ is determined by Fick's Law, which states that chemical flux is proportional to the concentration gradient, i.e.

$$\mathbf{F} = -D\nabla c \tag{3}$$

where $D$ is the diffusion coefficient. Therefore the reaction-diffusion equation governing the spatio-temporal evolution of $c$ takes the form:

$$\frac{\partial c}{\partial t} = \nabla\cdot(D\nabla c) + f(c). \tag{4}$$
To complete the model formulation we need to specify initial conditions, $c(\mathbf{x}, 0) = c_0(\mathbf{x})$, and boundary conditions. The latter may typically be periodic, zero flux or fixed, depending on the phenomenon being modelled. The above derivation can be generalised easily to a system of interacting chemicals, leading (in the case of constant diffusion coefficients) to

$$\frac{\partial \mathbf{u}}{\partial t} = D\nabla^2\mathbf{u} + \mathbf{f}(\mathbf{u}), \tag{5}$$

where $\mathbf{u}$ is a vector of chemical concentrations, $\mathbf{u} = (u_1, u_2, \ldots, u_n)^T$; $\mathbf{f} = (f_1(\mathbf{u}), f_2(\mathbf{u}), \ldots, f_n(\mathbf{u}))^T$ models chemical interaction; and $D$ is an $n \times n$ diffusion matrix. In the simplest examples, $D$ is a diagonal matrix. More generally, $D$ can have off-diagonal terms to model cross-diffusion.
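To illustrate how the reaction-diffusion equation (4) with zero flux boundary conditions might be solved numerically, here is a minimal sketch in one space dimension. It uses an explicit Euler time step and a cell-centred finite-difference Laplacian; this scheme, the function names and all parameter values are assumptions made for illustration and do not come from the text. Setting the reaction term to zero shows the two properties discussed above: zero flux boundaries conserve the total amount of chemical, and diffusion homogenises the profile.

```python
import numpy as np

def euler_step(c, D, dx, dt, f):
    """One explicit Euler step of dc/dt = D * c_xx + f(c) with zero flux ends.

    Copying the end values into ghost cells makes the discrete flux
    through both boundaries vanish.
    """
    padded = np.concatenate(([c[0]], c, [c[-1]]))
    laplacian = (padded[2:] - 2.0 * padded[1:-1] + padded[:-2]) / dx**2
    return c + dt * (D * laplacian + f(c))

def no_reaction(c):
    """Reaction term f(c) = 0, i.e. pure diffusion."""
    return np.zeros_like(c)

# Hypothetical dimensionless set-up on the unit interval.
n_cells = 50
dx = 1.0 / n_cells
D = 1.0
dt = 0.25 * dx**2 / D            # within the explicit stability limit dx^2 / (2 D)
x = (np.arange(n_cells) + 0.5) * dx
c_initial = np.where(x < 0.5, 1.0, 0.0)   # chemical confined to the left half

c_final = c_initial.copy()
for _ in range(2000):
    c_final = euler_step(c_final, D, dx, dt, no_reaction)

total_before = c_initial.sum() * dx
total_after = c_final.sum() * dx
```

Replacing `no_reaction` by a kinetic function, and `c` by a pair of arrays with different diffusion coefficients, gives a simple way to explore systems of the form (5), although an implicit scheme would normally be preferred for stiff kinetics.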
The classical reaction-diffusion model is a system of two chemicals, $u$ and $v$, reacting and diffusing as follows:

$$\frac{\partial u}{\partial t} = D_1\nabla^2 u + f(u, v) \tag{6}$$

$$\frac{\partial v}{\partial t} = D_2\nabla^2 v + g(u, v). \tag{7}$$
We will assume zero flux boundary conditions. The functions $f$ and $g$ are rational functions of $u$ and $v$ (see examples later). Using standard linear analysis (see, for example, Murray, 1993) it can be shown that a spatially uniform steady state of the above system can undergo diffusion-driven instability if the following conditions hold:

(C.1) $f_u + g_v < 0$

(C.2) $f_u g_v - f_v g_u > 0$

(C.3) $D_1 g_v + D_2 f_u > 2\sqrt{D_1 D_2 (f_u g_v - f_v g_u)}$

where all the partial derivatives are evaluated at the spatially uniform steady state. One possible scenario for pattern formation is that in which $f_u$ and $g_u$ are positive, the latter implying that $u$ activates $v$, while $g_v$ and $f_v$ are negative, so that $v$ inhibits $u$ (see, for example, Dillon et al., 1994). Condition (C.1) then implies $|f_u| < |g_v|$, so from (C.3) $D_1 < D_2$. That is, the activator diffuses more slowly than the inhibitor. This is an example of the classic property of many self-organising systems, namely short-range activation, long-range inhibition.

In Turing's original model, $f$ and $g$ were linear so that, if the uniform steady state became unstable, then the chemical concentrations would grow exponentially. This, of course, is biologically unrealistic. Since Turing's paper, a number of models have been proposed wherein $f$ and $g$ are nonlinear so that when the uniform steady state becomes unstable, it may or may not evolve to a bounded, stationary, spatially non-uniform, steady state (a spatial pattern) depending on the nonlinear terms. These models may be classified into four types:

(i) Phenomenological Models: The functions $f$ and $g$ are chosen so that one of the chemicals is an activator, the other an inhibitor. An example is the Gierer-Meinhardt model (1972), for which

$$f(u, v) = \alpha - \beta u + \frac{\gamma u^2}{v}, \qquad g(u, v) = \delta u^2 - \eta v \tag{8}$$

where $\alpha, \beta, \gamma, \delta$ and $\eta$ are positive constants, $u$ activates $v$ and $v$ inhibits $u$.
(ii) Hypothetical Models: Derived from a hypothetically proposed series of chemical reactions. For example, Schnakenberg (1979) proposed a series of trimolecular autocatalytic reactions involving two chemicals as follows

$$X \underset{k_2}{\overset{k_1}{\rightleftharpoons}} A, \qquad B \xrightarrow{k_3} Y, \qquad 2X + Y \xrightarrow{k_4} 3X.$$

Using the Law of Mass Action, which states that the rate of reaction is directly proportional to the product of the active concentrations of the reactants, and denoting the concentrations of $X$, $Y$, $A$ and $B$ by $u$, $v$, $a$ and $b$, respectively, we have

$$f(u, v) = k_2 a - k_1 u + k_4 u^2 v, \qquad g(u, v) = k_3 b - k_4 u^2 v \tag{9}$$

where $k_1, \ldots, k_4$ are (positive) rate constants. Assuming that there is an abundance of $A$ and $B$, $a$ and $b$ can be considered to be approximately constant.

(iii) Empirical Models: The kinetics are fitted to experimental data. For example, the Thomas (1975) immobilized-enzyme substrate-inhibition mechanism involves the reaction of uric acid (concentration $u$) with oxygen (concentration $v$). Both reactants diffuse from a reservoir maintained at constant concentration $u_0$ and $v_0$, respectively, onto a membrane containing the immobilized enzyme uricase. They react in the presence of the enzyme with empirical rate $V_m uv/(K_m + u + u^2/K_s)$, so that

$$f(u, v) = \alpha(u_0 - u) - \frac{V_m uv}{K_m + u + u^2/K_s}, \qquad g(u, v) = \beta(v_0 - v) - \frac{V_m uv}{K_m + u + u^2/K_s} \tag{10}$$

where $\alpha$, $\beta$, $V_m$, $K_m$ and $K_s$ are positive constants.

(iv) Actual Chemical Reactions: Although Turing predicted, in 1952, the spatial patterning potential of chemical reactions, this phenomenon has only recently been realised in actual chemical reactions. Therefore, it is now possible in certain cases to write down detailed reaction schemes and derive, using the Law of Mass Action, the kinetic terms.
The first Turing patterns were observed in the chlorite-iodide-malonic acid-starch reaction (CIMA reaction) (Castets et al., 1990; De Kepper et al., 1991). The model proposed for this reaction by Lengyel and Epstein (1991) stresses three processes: the reaction between malonic acid (MA) and iodine to create iodide, and the reactions between chlorine dioxide and iodide and between chlorite and iodide. These reactions take the form

$$\mathrm{MA} + \mathrm{I}_2 \rightarrow \mathrm{IMA} + \mathrm{I}^- + \mathrm{H}^+$$

$$\mathrm{ClO}_2 + \mathrm{I}^- \rightarrow \mathrm{ClO}_2^- + \tfrac{1}{2}\mathrm{I}_2$$

$$\mathrm{ClO}_2^- + 4\mathrm{I}^- + 4\mathrm{H}^+ \rightarrow \mathrm{Cl}^- + 2\mathrm{I}_2 + 2\mathrm{H}_2\mathrm{O}.$$

The rates of these reactions can be determined experimentally. By making the experimentally realistic assumption that the concentrations of malonic acid, chlorine dioxide and iodine are constant, Lengyel and Epstein derived the following model:

$$\frac{\partial u}{\partial t} = k_1 - u - \frac{4uv}{1 + u^2} + \nabla^2 u$$

$$\frac{\partial v}{\partial t} = k_2\left[k_3\left(u - \frac{uv}{1 + u^2}\right) + c\nabla^2 v\right]$$

where $u$, $v$ are the concentrations of iodide and chlorite, respectively, and $k_1$, $k_2$, $k_3$ and $c$ are positive constants.

Murray (1982) calculates and compares the parameter spaces determined by (C.1)-(C.3) for linear instability in the Gierer-Meinhardt, Schnakenberg and Thomas models. In many cases, particularly in one dimension, the results of the linear analysis carry over to the weakly nonlinear case, but the fully nonlinear system can only be analysed by numerical simulation. There is now a great deal of literature on this subject and general results on the patterning properties of reaction-diffusion equations can be found in the books by Britton (1986), Edelstein-Keshet (1988), Fife (1979), Grindrod (1996), Murray (1993) and Segel (1984).

The gradient models and the Turing-type models differ in two crucial aspects. In the gradient model, the chemical pre-pattern is set up by a simple process which can only produce a simple gradient. To use this gradient to generate complicated pattern, it is hypothesized that a complex series of thresholds exists and that cells have the machinery to interpret multiple thresholds. In Turing's model, complex spatial patterns arise due to a complex chemical interaction, but the interpretation of the pre-pattern is via a single threshold (Nagorcka, 1989) and is therefore simpler than that in the gradient model.
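The conditions (C.1)-(C.3) are straightforward to check numerically for a given kinetic scheme. The sketch below does this for the Schnakenberg kinetics (9), whose uniform steady state is $u^* = (k_2 a + k_3 b)/k_1$, $v^* = k_3 b/(k_4 u^{*2})$ (obtained by setting $f = g = 0$). The function names and the parameter values $a = 0.1$, $b = 0.9$ with unit rate constants are illustrative assumptions, not values taken from the text.

```python
import math

def schnakenberg_jacobian(a, b, k1=1.0, k2=1.0, k3=1.0, k4=1.0):
    """Partial derivatives (f_u, f_v, g_u, g_v) of the kinetics (9),
    evaluated at the spatially uniform steady state."""
    u = (k2 * a + k3 * b) / k1
    v = k3 * b / (k4 * u**2)
    fu = -k1 + 2.0 * k4 * u * v
    fv = k4 * u**2
    gu = -2.0 * k4 * u * v
    gv = -k4 * u**2
    return fu, fv, gu, gv

def diffusion_driven_instability(fu, fv, gu, gv, D1, D2):
    """Return True if conditions (C.1), (C.2) and (C.3) all hold."""
    det = fu * gv - fv * gu
    c1 = fu + gv < 0.0
    c2 = det > 0.0
    # (C.3) is only meaningful when det > 0, so guard the square root.
    c3 = c2 and (D1 * gv + D2 * fu > 2.0 * math.sqrt(D1 * D2 * det))
    return c1 and c2 and c3

# Hypothetical parameter values chosen only for illustration.
jac = schnakenberg_jacobian(a=0.1, b=0.9)
equal_diffusion = diffusion_driven_instability(*jac, D1=1.0, D2=1.0)
unequal_diffusion = diffusion_driven_instability(*jac, D1=1.0, D2=10.0)
```

With these values the steady state is stable to uniform perturbations, no instability is possible with equal diffusion coefficients, and the instability appears once $D_2$ is sufficiently larger than $D_1$, in line with the requirement $D_1 < D_2$ derived above.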
2.2 Cell movement models

Cell movement models are based on the assumption that a spatial pattern arises in cell density, and cells then differentiate in a density-dependent manner. Cell aggregation occurs when the cell dispersing factors (for example, diffusion) are overcome by aggregating factors such as chemotaxis (movement up chemical gradients), or factors generated by the mechanical interaction of cells with the extracellular matrix (ECM) on which they move. These include haptotaxis (movement up adhesive gradients) or passive convection arising as the result of deformation of the ECM due to cell traction.

Chemotactic models have been analysed by a number of authors and shown to lead to spatial pattern formation (see, for example, Keller and Segel, 1971; Maini et al., 1991). These models involve reaction and diffusion, but spatial patterning arises in this case due to the advective term introduced by chemotaxis. The typical model takes the form:

$$\frac{\partial n}{\partial t} = D_n\nabla^2 n - \nabla\cdot(\chi(u)\, n\nabla u) + f(n, u) \tag{11}$$

$$\frac{\partial u}{\partial t} = D_u\nabla^2 u + g(n, u), \tag{12}$$

where $n(\mathbf{x}, t)$, $u(\mathbf{x}, t)$ denote cell density and chemoattractant concentration, respectively, at position $\mathbf{x}$ and time $t$; $D_n$, $D_u$ are diffusion coefficients, $f$, $g$ are terms incorporating production and degradation and $\chi(u)$ is the chemotactic sensitivity. The form of $\chi$ depends on the mode of cell-chemoattractant interaction (Othmer and Stevens, 1997).

The first partial differential equation model incorporating the role of mechanical cues in the formation of cell aggregation was proposed by Oster et al. (1983) and since then such models have been extensively studied (Murray, 1993). They consist of conservation equations for cells and extracellular matrix, which take the general form of the equations above, together with a force-balance equation, which is that for a viscoelastic material and is the main difference from the models above.

The mechanical model proposed by Oster et al. (1983) has three dependent variables: $n(\mathbf{x}, t)$, $\rho(\mathbf{x}, t)$ and $\mathbf{u}(\mathbf{x}, t)$ which represent, respectively, cell density, matrix density and matrix displacement at position $\mathbf{x}$ and time $t$. The cell equation is

$$\frac{\partial n}{\partial t} = -\nabla\cdot\mathbf{J} + rn(N - n) \tag{13}$$

where net cell production is assumed to be of logistic form, $r$ and $N$ are positive constants, and $\mathbf{J}$, the cell flux, is given by

$$\mathbf{J} = -D\nabla n + \alpha n\nabla\rho + n\frac{\partial\mathbf{u}}{\partial t}.$$
The first term on the right-hand side models random motion and the third term accounts for passive advection. Cells can also move by attaching cell processes to adhesive sites within the matrix and crawling along. As cells exert forces on the extracellular matrix they generate adhesive gradients which serve as guidance cues to motion. The movement up such gradients is termed haptotaxis. The assumption that the number of adhesive sites is proportional to ECM density leads to the second term on the right-hand side, where $\alpha$ is the haptotactic coefficient, assumed to be a non-negative constant. Hence the equation for cell motion is

$$\frac{\partial n}{\partial t} = D\nabla^2 n - \alpha\nabla\cdot(n\nabla\rho) - \nabla\cdot\left(n\frac{\partial\mathbf{u}}{\partial t}\right) + rn(N - n). \tag{14}$$

The equation for ECM density is much simpler as the only contribution to matrix flux is advection, and matrix secretion is assumed negligible. Hence, $\rho$ satisfies

$$\frac{\partial\rho}{\partial t} = -\nabla\cdot\left(\rho\frac{\partial\mathbf{u}}{\partial t}\right). \tag{15}$$
To derive the equation for the matrix displacement, $\mathbf{u}(\mathbf{x}, t)$, we first note that for cellular and embryonic processes, inertial terms are negligible in comparison to viscous and elastic forces, that is, motion ceases instantly when the applied forces are turned off. Hence the traction forces generated by the cells are balanced by the viscoelastic forces within the ECM. Therefore the equilibrium equations are

$$\nabla\cdot\sigma + \rho\mathbf{F} = 0 \tag{16}$$

where $\sigma$ is the composite stress tensor of the cell-ECM milieu and $\rho\mathbf{F}$ accounts for body forces. Oster et al. (1983) model the cell-matrix composite as a linear viscoelastic material with stress tensor

$$\sigma = \sigma_\rho + \sigma_n. \tag{17}$$

Here $\sigma_\rho$ is the usual viscoelastic stress tensor (see, for example, Landau and Lifshitz, 1970),

$$\sigma_\rho = \underbrace{\mu_1\frac{\partial\varepsilon}{\partial t} + \mu_2\frac{\partial\theta}{\partial t}\mathbf{I}}_{\text{viscous}} + \underbrace{\frac{E}{1 + \nu}\left(\varepsilon + \frac{\nu}{1 - 2\nu}\theta\mathbf{I}\right)}_{\text{elastic}} \tag{18}$$

where $\theta = \nabla\cdot\mathbf{u}$ is the dilatation, $\varepsilon = \frac{1}{2}[\nabla\mathbf{u} + \nabla\mathbf{u}^T]$ is the strain tensor, $\mathbf{I}$ is the unit tensor, $\mu_1$, $\mu_2$ are the shear and bulk viscosities, respectively, and $E$, $\nu$ are the Young's modulus and the Poisson ratio, respectively.
The stress due to cell traction is modelled by

$$\sigma_n = \frac{\tau\rho n}{1 + \lambda n^2}\mathbf{I} \tag{19}$$

where $\tau$ and $\lambda$ are positive constants. This satisfies the conditions that there is no traction without matrix and that traction per cell decreases with increasing cell density (contact inhibition).

If the cell-matrix composite is attached to an external substratum, for example a subdermal basement layer, then the simplest way to model the body force is to assume

$$\mathbf{F} = s\mathbf{u} \tag{20}$$
where $s$ is the modulus of elasticity of the substrate to which the composite is attached.

With appropriate boundary conditions (for example zero flux on $n$ and $\rho$, with $\mathbf{u}$ fixed) the above equations for cell density, matrix density and displacement define a simple version of the more complicated mechanical model presented by Oster et al. (1983).

Linear and nonlinear analyses, plus numerical simulation, show that models within this general mechanical framework can exhibit steady-state spatial patterns (Perelson et al., 1986) and spatio-temporal patterns (Ngwa and Maini, 1995).

Other movement models hypothesize that cells move to minimize energy (Cocho et al., 1987a,b; Steinberg, 1970; Sulsky, 1984). Such models can be set up mathematically and solved to show cell sorting and patterning behaviour consistent with a number of experimentally observed phenomena.

2.3 Cell rearrangement models
Theoretical studies in this area include the early purse-string model of Odell et al. (1981) for tissue folding in which, in response to a large deformation, cells were proposed to actively contract and in so doing cause a large deformation in neighbouring cells which, in turn, also contract, setting up a propagating contraction wave which leads to tissue folding. This model was applied to a variety of developmental problems, and provided the precursor to the "mechanochemical theory" of developmental pattern formation reviewed above. This approach emphasises the link between tissue mechanics and chemical regulation, and has been applied widely in both developmental biology and medicine. Subsequently, Weliky and Oster developed a discrete-cell modelling approach in which morphogenesis occurs via mechanical rearrangement
of neighbours in an epithelial sheet. They assume that the boundary of the epithelial sheet is pulled over the surface of the egg and show that the resultant model can produce many experimentally observed aspects of both Fundulus epiboly and notochord morphogenesis in Xenopus laevis (Weliky and Oster, 1990; Weliky et al., 1991). More recently, Davidson et al. (1995) used a computational finite element model to test various explanations for sea urchin invagination.

In all these models individual cell movements within the tissue are determined by the balance of mechanical forces acting on the cell. Such models can exhibit tissue folding, thickening, invagination, exogastrulation and intercalation, and have been shown to capture many of the key aspects of processes such as gastrulation, neural tube formation, and ventral furrow formation in Drosophila. Models for tissue motion are not amenable to mathematical analysis and tend to be highly computation-based.

2.4 Applications

Turing considered the chemicals in his model to be growth hormones, so that the spatial pattern in chemical concentrations would result in spatially nonuniform growth and hence pattern. He applied his theory to account for budding in plant stems and to growth-induced shape changes in the early embryo which he proposed could account for gastrulation. Since his seminal paper, reaction-diffusion models have been proposed to account for a vast number of patterning processes in nature, too great to be completely reviewed here, so we consider only a few examples to give a flavour of the applications.

Gierer and Meinhardt (1972) used their model to account for pattern formation in Hydra and showed that this model could account for the remarkable regenerative properties exhibited by this organism.
Reaction-diffusion models have been proposed to account for compartmentalisation in insect development and to provide an explanation for the occurrence of various mutants (see Meinhardt, 1982, for review). However, for Drosophila it now appears that patterning is due to a cascade of protein interactions that are consistent with the gradient-type models and are not of Turing-type.

Meinhardt (1995) has shown that reaction-diffusion type models solved on a growing domain can produce the spectacular variety of patterns seen on shells (Figure 1), while Nijhout (1990) has shown that such models, together with a small number of sources and sinks, can exhibit the vast array of pigmentation patterns observed on butterfly wings. Shell patterns can also be produced by an integro-partial differential equation model aimed at capturing neural activity along the front of the growing shell (Ermentrout et al., 1986).
Figure 1. (a) Spatial pattern exhibited on the shell Oliva porphyria and a simulation of a reaction-diffusion model on a large array of cells. (Reproduced with permission from Meinhardt, 1995.)
Although this model is based on very different biology to the reaction-diffusion model and has a different mathematical form, it still falls under the general category of short-range activation, long-range inhibition models. Reaction-diffusion and cell movement models have been applied to animal coat markings (Bard, 1981; Cocho et al., 1987a,b; Murray, 1981; Murray and Myerscough, 1991) and to skeletal patterning in the limb, for which gradient models have also been proposed (see Maini and Solursh, 1991, for a critical review). The above models propose different scenarios for pattern formation and, to date, it is a highly controversial issue as to which model is the appropriate one. Chemical morphogens, the name given to the chemicals proposed to form pre-patterns, have yet to be unequivocally identified, so it is difficult
to fix parameter values. For the mechanical type model mentioned above, parameters are also difficult to determine. Moreover, the original model is based on the Kelvin model of linear viscoelasticity and there is no justification for taking this particular form. The Maxwell model will also give rise to pattern formation, but for different parameter values (Byrne and Chaplain, 1996). Thus, without constraints on the parameter values, one can produce similar patterns with each model and therefore cannot distinguish between them on that basis.

It is also difficult to distinguish between models from a biological viewpoint. For example, in some cases one observes both a chemical pattern and a cell aggregation pattern but it is difficult biologically to determine which came first. The chemical pre-pattern model would say that the chemical pattern arose first and cells responded to this by differentiating, resulting in cell condensations. The cell-chemotaxis model scenario would be that both patterns arise simultaneously, while under the assumptions in the mechanical model, the patterns would be explained by cell aggregations first forming and then secreting a chemical pattern as a result of differentiation.

Although these models are based on very different biological assumptions, many of them share common properties. For example, the patterning in reaction-diffusion and in many cell movement models arises from the interaction of the processes of short-range activation, long-range inhibition. On the one hand, this has the disadvantage of making it very difficult to use models to distinguish between mechanisms; on the other hand, it does mean that one can make general conclusions and predictions that are mechanism-independent. This leads to the idea of developmental constraints, which proposes that only certain patterns are possible, regardless of the mechanism (Oster and Murray, 1989). Figure 2 illustrates one such developmental constraint.
A key property of many developmental processes is their robustness in the face of naturally occurring random fluctuations. This has been a major problem for reaction-diffusion theory, as it is well-known that the patterns it produces are not robust (Bard and Lauder, 1974). In other words, Turing-type models can exhibit multiple stable solutions in large regions of parameter space. Recently it has been shown that boundary conditions can play a crucial role in stabilizing patterns. For example, if one chooses fixed boundary conditions for one chemical and zero flux boundary conditions for the other, then this reduces the number of admissible solutions and thus diminishes the regions in parameter space in which one obtains multiple stable solutions. In effect, the boundary conditions serve to select only certain patterns (Dillon et al., 1994).

In higher dimensions, this problem becomes more acute as one now has
Figure 2. (a) Simulation of a cell-chemotaxis model of the form (11)-(12) showing the effect of domain size on cell density (arrow denotes increasing cell density). As the domain narrows, the diamond pattern changes to a simpler, wavy stripe pattern. This is an example of a developmental constraint. (b) Examples of diamond patterns on snakes: (i) Crotalus adamanteus; (ii) Coluber hippocrepis (note the effect of the tapering domain). (Reproduced with permission from Murray and Myerscough, 1991.)
the added problem of degeneracy. For example, for certain parameters, there may be two or more admissible solutions with the same linear growth rate and it is then not clear which solution is selected. Using nonlinear bifurcation analysis, Ermentrout (1991) showed that the nonlinear terms play a key role in pattern selection, with quadratic terms favouring spots, while cubic terms favour stripes. More recently, Benson et al. (1998) have shown how a spatially-varying parameter can unravel such degeneracies and select one pattern over another. The role of spatially-varying parameters has received little attention (although it should be noted that the Gierer-Meinhardt model was initially designed to explain how localized structures could arise in places determined by a pre-existing gradient) but they can play a crucial role in the patterning process. For example, Wolpert and Hornbruch (1990) showed experimentally that double-anterior chick limb buds gave rise to two humeri, even though the size of the bud was the same as that of a normal limb bud, which only produces one humerus. This contradicts the standard Turing model, which predicts that patterning complexity is intimately linked to domain size. Maini et al. (1992) showed that a Turing model with spatially-varying diffusion coefficients could give rise to results that are consistent with Wolpert and Hornbruch's experiments. Results of dye-spreading experiments suggest that the hypothesis of spatially-varying diffusion is very plausible (Brummer et al., 1991).

These studies have all been carried out on the reaction-diffusion model system because it is the simplest, mathematically speaking, of the models reviewed in this section. It is still an open question as to whether the results on robustness, pattern selection and spatially-varying parameters carry through to the other model types.

In all the above applications, patterns occur simultaneously throughout the whole domain.
However, in some cases, patterns arise as the result of propagation. For example, in the alligator embryo, the pigmentation stripes occur as a propagating pattern moving down the body from head to tail. Murray et al. (1990) have shown that a cell-chemotactic model of the form discussed above can give rise to such patterns. They were able to make experimentally confirmed predictions on how the number of stripes varies with the length of the embryo and present biological evidence that supports the view that this is a cell movement process.

More recently, Painter et al. (1999a) have studied the formation of the primitive streak. This is a novel patterning process wherein, during the blastoderm stage of early chick development, a column of cells, known as the primitive streak, advances to about 3/5 of the way across the disk-shaped blastoderm, before regressing. They have shown how this could arise from a
Figure 3. Coupling two reaction-diffusion models can lead to a complicated pattern of stripes interspersed with spots, as observed in the thirteen-lined ground squirrel and Pomacanthus maculatus. (Reproduced with permission from Aragón et al., 1998.)
cell-chemotaxis type model and have made a number of testable predictions.

2.5 Coupling pattern generators
In many cases, pattern formation arises as the result of the interaction of more than one pattern generator. For example, epidermal-dermal interactions play a crucial role in the development of skin organs, such as hair, teeth, feathers and scales. Nagorcka et al. (1987) considered a tissue-tissue interaction model that coupled a mechanochemical cell movement model in the dermis with a
reaction-diffusion model in the epidermis. They showed that such a model can give rise to patterns on two different length scales, namely, large spots interspersed with small spots, and these are similar to the scale patterns observed in certain lizards and to the feather germ patterns observed on certain birds. More recently, Aragón et al. (1998) have shown how the coupling between two systems of reaction-diffusion models can produce complex patterns (see Figure 3).

The life cycle of the cellular slime mould Dictyostelium discoideum serves as an excellent paradigm for morphogenesis as it exhibits the processes of signal transduction, signal relay, cell movement and aggregation, all of which play important roles in early embryonic development. Hence, for many years, it has attracted the interest of developmental biologists and theoreticians alike. Starvation conditions trigger a developmental programme which is initiated by cell-cell signalling via the extracellular messenger cyclic 3',5'-adenosine monophosphate (cAMP). The chemotactic response to this signal leads, through the phenomenon of cell streaming, to the formation of a multicellular organism composed typically of $10^4$-$10^5$ cells. This organism passes through an intermediate motile (slug) phase during which cells differentiate into pre-spore and pre-stalk types, before developing a fruiting body, aiding the dispersal of spores from which, under favourable conditions, new amoebae develop. The comparative simplicity of morphogenesis in Dictyostelium has made it an attractive model system for the study of self-organisation, and many of the molecular and cellular mechanisms which are involved in cell aggregation, collective movement and differentiation have now been identified.

Several models have been proposed to account for many of the aforementioned phenomena. Here, we shall focus on a model for cell aggregation proposed by Höfer et al.
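(1995a,b); we refer the reader to these papers for a fuller description of the biology and parameter values, as well as references to other models. The model takes the form:

$$\frac{\partial n}{\partial t} = \nabla\cdot(\mu\nabla n - \chi(v)\, n\nabla u) \tag{21}$$

$$\frac{\partial u}{\partial t} = \lambda[\phi(n) f_1(u, v) - (\phi(n) + \delta) f_2(u)] + \nabla^2 u \tag{22}$$

$$\frac{\partial v}{\partial t} = -g_1(u)\, v + g_2(u)(1 - v), \tag{23}$$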
where $n$, $u$ and $v$ denote cell density, extracellular cAMP concentration and fraction of active cAMP receptors, respectively.

The second and third equations are a simplified model of the cAMP-cell receptor dynamics (see Martiel and Goldbeter, 1987, for full details) modified by cell density effects; the rate of cAMP synthesis per cell is $f_1(u, v) = (bv + v^2)(a + u^2)/(1 + u^2)$, where $a$ and $b$ are positive constants. This models autocatalytic production with saturation, mediated by receptor binding. The functions $f_2$ and $g_1$ are assumed to be linear in $u$, to model linear degradation of cAMP and binding of active receptors to cAMP, respectively, while $g_2$ is assumed to be a positive constant, accounting for the resensitization of the desensitized fraction of receptors at constant rate. This subsystem exhibits excitable dynamics leading to the formation of spiral waves of cAMP concentration. Cell density effects are modelled by the factor $\phi(n) = n/(1 - \rho n/(K + n))$, where $\rho$ and $K$ are positive constants.

This excitable system is coupled with a chemotaxis equation for cell density, with constant diffusion coefficient $\mu$ and chemotactic sensitivity $\chi(v) = \chi_0 v^m/(A^m + v^m)$, $m > 1$, which accounts for adaptation, where $\chi_0$ and $A$ are positive constants. Hence an appreciable chemotactic response requires a minimal fraction of active receptors, yet, for a large fraction of active receptors, the response saturates. Note that many models of chemotaxis assume $\chi$ to be a constant. Under that assumption, cells would respond to a pulse of chemoattractant by moving towards the wave in the wavefront, then moving with the wave in the waveback, resulting in a net movement away from the source of attractant, rather than towards it. This is the so-called "chemotactic wave paradox" (Soll et al., 1993). The form of $\chi(v)$ chosen above resolves this paradox (Höfer et al., 1994).

Using experimentally determined parameter values, the above model captures the key features of the aggregation process (see Figure 4). The model is consistent with the observation that as streaming proceeds, the wavespeed and wavelength of the spiral patterns decrease (Gross et al., 1976).
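The switch-like behaviour of the receptor-dependent chemotactic sensitivity $\chi(v) = \chi_0 v^m/(A^m + v^m)$ can be seen by evaluating it at a few values of the active receptor fraction. The following is a minimal sketch; the function name and the values of `chi0`, `A` and `m` are hypothetical and are not the experimentally determined parameters used in the papers cited above.

```python
def chemotactic_sensitivity(v, chi0, A, m):
    """Sensitivity chi(v) = chi0 * v**m / (A**m + v**m) for 0 <= v <= 1."""
    return chi0 * v**m / (A**m + v**m)

# Hypothetical parameter values, chosen only to show the shape of chi(v).
chi0, A, m = 1.0, 0.5, 4

low = chemotactic_sensitivity(0.1, chi0, A, m)    # few active receptors
half = chemotactic_sensitivity(A, chi0, A, m)     # v = A gives chi0 / 2
high = chemotactic_sensitivity(1.0, chi0, A, m)   # all receptors active
```

For $m > 1$ the response is negligible when few receptors are active and approaches $\chi_0$ when most are, so cells respond strongly only while their receptors are still sensitive, for example in the front of a passing cAMP wave.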
Previously, this decrease has been explained by assuming that biochemical changes must be occurring in the cell-cAMP system and, indeed, it has now been established that changes of this sort can occur. However, in the above model, this behaviour arises naturally, because as the cells form streams, they alter the conditions through which the cAMP waves are propagating. This is initially equivalent to increasing the excitability of the medium which, in turn, leads to an increase in the rotation frequency of the spiral core. As a result (Tyson and Keener, 1988), the wavespeed and wavelength of the spiral patterns decrease.

2.6 Domain Growth
None of the above models accounts for domain growth. However, a number of recent studies have shown that domain growth can play a vital part in the pattern formation process. For example, Kondo and Asai (1995) showed that the pigmentation pattern in the juvenile angelfish Pomacanthus semicirculatus changes its number of stripes as the fish grows. Briefly, when the wavelength of the stripes increases twofold due to domain growth, further stripes are inserted to restore the wavelength to its original value. They showed that this is consistent with a reaction-diffusion model. More recently, Painter et al. (1999b) showed that a cell-chemotaxis model coupled with a reaction-diffusion model on a growing domain could account also for the fact that the inserted stripes are thinner than the original stripes. This phenomenon is not easily explained by a reaction-diffusion model alone.

Figure 4. Spatio-temporal evolution of (a) cell density, and (b) cAMP concentration in a numerical simulation of (21)-(23). (Reproduced with permission from Hofer et al., 1995b).

It is believed that domain growth also plays a role in the determination of the sites of tooth primordia in alligators. The first seven tooth primordia of the lower alligator jaw occur (taking one half of the jaw) in the spatio-temporal sequence 4-1-5-2-6-3-7, that is, the fourth primordium to form is the posterior-most while the seventh primordium to form is the anterior-most. Kulesa et al. (1996) have shown how a reaction-diffusion model coupled with an inhibitor model on a growing domain could explain this sequence.

More recently, Crampin et al. (1999) have shown that domain growth can lead to the robust selection of certain types of patterns in reaction-diffusion models, while Holloway and Harrison (1999) have shown that interspecific variation in branching patterns in certain plants can be accounted for by a system of reaction-diffusion equations on a growing two-dimensional domain in which growth is controlled by the concentration of one of the chemicals.

2.7 Discussion
Development of spatial pattern and form is unquestionably one of the central areas in biology. It is a very complex process that involves the interaction of a large number of components acting at different levels, yet the models presented in this section focus on only a small number of components. It is clear that even these simple models offer theoreticians an enormous range of challenging problems from modelling, numerical computation and mathematical viewpoints. An important question to ask, therefore, is what they actually say about the biology.

The models presented here are really of two types. The first are conceptual: for example, it is remarkable that the simple mechanism of short-range activation, long-range inhibition can give rise to such an enormous range of patterns. One can think of this as being a primary mechanism for generating spatial pattern. Such models provide a framework in which one can test theories of patterning, as well as making experimentally testable predictions.

The second type of model is where more is known about the biochemistry and parameter values. Pattern formation in Dictyostelium discoideum provides the most well-studied case of this type of model. Although the model described in this section is based on only three equations, the excitable subsystem of the model arises from a much larger model which can be simplified by using the fact that there are a number of different timescales involved. The recent work by Weijer and coworkers has gone a long way in explaining a number of patterning phenomena that occur in Dictyostelium discoideum by including ideas from chemotaxis, excitable media and computational fluid dynamics (see Vasiev and Weijer, 2000, and references therein). Such single-cell systems, which are simpler to manipulate than embryos, may be important in providing crucial insights into some of the workings of more complicated multicellular organisms, and are therefore attracting more and more attention. For example, there is now a great deal of literature on the modelling of bacterial patterns (see Ben-Jacob et al., 2000, and references therein).

All the above models are based on considering pattern formation at a macroscopic level. There is now an enormous amount known at the molecular level about pattern formation. One of the main future challenges is to develop models that can integrate the molecular and cellular levels.

3 Models for wound healing
Wound healing is an enormously complicated phenomenon involving different processes interacting on different spatial and temporal scales (see, for example, the books by Clark and Henson, 1988, and Asmussen and Sollner, 1993). The method of healing varies depending on whether the wound is a surface wound (epidermal) or a deep wound (dermal), and on whether it is in the adult or the embryo. In this section we focus on a model for epithelial wound healing and one for dermal wound healing, and show how the ideas of modelling used in the previous section can be applied.

3.1 Corneal Wound Healing
Cell migration and proliferation are central to the healing of wounds in the corneal epithelium, and biological evidence suggests that both processes are regulated by a protein called epidermal growth factor (EGF). The source of EGF is an area of controversy and the model of Dale et al. (1994) sets out to investigate the possibility that the exposed underlying tissue within the wound acts as an additional source to the overlying tear film layer. Here, we summarise the model of Dale et al. (1994) and refer the reader to the original paper and references therein for full biological and modelling details. The model is a pair of reaction-diffusion equations for the cell density $n(x, t)$ and EGF concentration $c(x, t)$ at position $x$ and time $t$ and takes the form (suitably nondimensionalised)

$$\frac{\partial n}{\partial t} = \underbrace{\nabla\cdot\left(D_n(c)\nabla n\right)}_{\text{cell migration}} + \underbrace{s(c)\,n(2-n)}_{\text{mitotic generation}} \; \underbrace{-\,n}_{\text{natural loss}} \tag{24}$$

$$\frac{\partial c}{\partial t} = \underbrace{D_c\nabla^2 c}_{\text{diffusion}} + \underbrace{f(n)}_{\text{production by cells}} \; \underbrace{-\,\frac{\mu n c}{\hat{c}+c} - \delta c}_{\text{decay of active EGF}} \tag{25}$$

where $D_n(c) = \alpha c + \beta$, and $D_c$, $\mu$, $\delta$, $\alpha$, $\beta$ and $\hat{c}$ are all positive constants. The model assumes that the cell diffusion coefficient is increased by EGF and $s(c)$ is an increasing function of $c$ to account for the EGF-enhanced cell mitosis rate. The function $f(n)$ is taken to be of the form $f(n) = A + B(n)$, where

$$B(n) = \begin{cases} \sigma & \text{if } n < 0.2 \\ \sigma(2 - 5n) & \text{if } 0.2 \le n \le 0.4 \\ 0 & \text{if } n > 0.4 \end{cases}$$
and $A$ and $\sigma$ are positive constants. The constant $A$ accounts for the constant source of EGF due to the tear film, while the function $B(n)$ is chosen to model EGF production due to wounding. The authors show that the detailed form of this function is not important to the behaviour of the model. Using parameter values derived from the biological literature, this model system is solved on a one-dimensional domain (a realistic approximation for the case of surface slash wounds, where cell fronts move in from either side of the slash to close the wound) to investigate the behaviour of travelling wave solutions which move from a region where the cell density and EGF concentration are at their unwounded levels, $n = c = 1$ (as $x \to -\infty$), into a region of no cell density, $n = 0$, with the EGF concentration at its wounded level, $c = f(0)/\delta$ (as $x \to +\infty$). An important conclusion from this work is that a realistic speed of healing is only attained if the function $B(n)$ is included (Figure 5). Hence, for the wound healing scenario envisaged by this model, biologically realistic healing times can only be achieved by assuming that the tear film source of EGF is supplemented by EGF production in the wounded tissue.

Figure 5. Numerical solution for the corneal wound healing model showing cell density and EGF concentration profiles as functions of space at equal time intervals. (a) The tear film is the only source of EGF, that is, $f(n) = A$. (b) An additional source of EGF is included, that is, $f(n) = A + B(n)$. Note that in (b) healing occurs much more rapidly. (Reproduced with permission from Dale et al., 1994).

For biologically realistic parameter values an analytical approximation to the minimum wavespeed can be derived as $\sqrt{8\beta\, s(f(0)/\delta)}$. An important biological implication of this result is that the rate of healing of corneal epithelial wounds can be increased by increasing the cell diffusion coefficient or the secretion rate of EGF. However, increasing the chemical diffusion coefficient does not have a significant effect. The model can also be used to make experimentally testable predictions on how the speed of healing will change as the result of topical application of EGF.

3.2 Dermal healing
A crucial aspect of wound healing concerns the mechanical interaction of cells with their external environment. Cells deform and remodel the extracellular matrix (ECM) on which they move and ECM materials, in turn, affect cellular properties and cell orientation. Using the mechanochemical model framework (see Murray, 1993, for a review, and Murray et al., 1988, Murray and Tranquillo, 1992), Olsen et al. (1995) developed a mechanochemical model for dermal wound healing. We refer the reader to the original paper for full details, including experimental justification for each term within the model. The model consists of two cell types, fibroblasts and myofibroblasts, with densities $n$ and $m$, respectively; a generic growth factor, with concentration $c$; and the ECM, with density $\rho$. These quantities obey the general conservation equation

$$\frac{\partial Q}{\partial t} = -\nabla\cdot\mathbf{J}_Q + f_Q \tag{26}$$

where $Q$ is the quantity in question, the first term on the right-hand side models motion with flux $\mathbf{J}_Q$, and the second term models production and degradation. To complete the model, a force balance equation is needed to account for the mechanical interaction of the cells with the ECM. For simplicity, we present here the one-dimensional version of the model, where $x$ is space and $t$ is time. The fibroblast cell equation takes the form:

$$\frac{\partial n}{\partial t} = D_n\frac{\partial^2 n}{\partial x^2} - \frac{\partial}{\partial x}\left[\chi(c,n)\frac{\partial c}{\partial x} + n\frac{\partial u}{\partial t}\right] + R(c)\,n\left(1 - \frac{n}{K}\right) - \frac{k_1 c\,n}{C_k + c} + k_2 m - d_n n \tag{27}$$

Implicit in this equation is the assumption that there are three main factors contributing to cell flux: random diffusion, with constant diffusion coefficient $D_n$; chemotaxis, with chemotactic sensitivity $\chi(c,n)$; and advection in response to the displacement, $u(x, t)$, of the ECM. The four remaining terms on the right-hand side model cell kinetics and include logistic cell growth with linear rate enhanced by growth factor, fibroblast conversion to myofibroblast phenotype mediated by growth factor, conversion from myofibroblast back to fibroblast cell type, and cell death. The myofibroblast equation takes the form

$$\frac{\partial m}{\partial t} = \frac{\partial}{\partial x}\left[-m\frac{\partial u}{\partial t}\right] + \epsilon_r R(c)\,m\left(1 - \frac{m}{K}\right) + \frac{k_1 c\,n}{C_k + c} - k_2 m - d_m m \tag{28}$$

Here, it is assumed that the dominant contribution to myofibroblast flux is advection, and that mitosis takes the same form as that for fibroblasts, modulated by a constant scale factor $\epsilon_r$. The growth factor satisfies the equation

$$\frac{\partial c}{\partial t} = D_c\frac{\partial^2 c}{\partial x^2} + \frac{\partial}{\partial x}\left[-c\frac{\partial u}{\partial t}\right] + S(n,m,c) - d_c c \tag{29}$$

Implicit in this equation is the assumption that the dominant contributions to growth factor flux are random diffusion, with constant diffusion coefficient $D_c$, and advection. The remaining terms on the right-hand side model biosynthesis and degradation. The ECM moves primarily by advection and satisfies the equation

$$\frac{\partial \rho}{\partial t} = \frac{\partial}{\partial x}\left[-\rho\frac{\partial u}{\partial t}\right] + B(n,m,c,\rho) \tag{30}$$

where $B(n,m,c,\rho)$ represents ECM biosynthesis and degradation. Finally, modelling the ECM as a linear, isotropic, viscoelastic material, the displacement $u$ satisfies the force balance equation

$$\mu\frac{\partial^3 u}{\partial x^2\,\partial t} + E\frac{\partial^2 u}{\partial x^2} + \frac{\partial \tau(n,\rho)}{\partial x} = F(\rho,u) \tag{31}$$
where the first two terms on the left-hand side model viscous and elastic forces, respectively, and the third term models cell traction forces. These forces are balanced by the body forces $F(\rho,u)$. Note that this is very much a simplification as a more realistic model should include anisotropy and plasticity.

The above five equations, with appropriate initial and boundary conditions (see below), constitute the mechanochemical model framework. Solving these equations in one spatial dimension is an approximation to "slash" wounds. Numerical simulations show that these equations admit solutions in which a travelling front of cell density moves into the wound, causing it to heal. Using biologically realistic forms for the functions $\chi(c,n)$, $R(c)$, $S(n,m,c)$, $B(n,m,c,\rho)$, $\tau(n,\rho)$, $F(\rho,u)$, and estimates derived from experimental data for the parameters $D_n$, $D_c$, $K$, $k_1$, $k_2$, $C_k$, $d_n$, $d_m$, $d_c$, $\epsilon_r$, $\mu$ and $E$, it can be shown that this model exhibits solutions for the decay of growth factor and rate of wound closure that closely agree with experimental results (see Olsen et al., 1995, for full details).

This model can be used to investigate abnormal wound healing. The full model is very complicated so simpler caricatures are considered. To investigate the potential of the above model framework to exhibit spatially-varying contracted steady states, corresponding to fibrocontractive diseases, Olsen et al. (1998) considered a simpler version of the model which focusses only on the mechanical aspects of the interaction. The non-dimensionalised version of this caricature model takes the form

$$\frac{\partial n}{\partial t} = D_n\frac{\partial^2 n}{\partial x^2} + \frac{\partial}{\partial x}\left[-n\frac{\partial u}{\partial t}\right] + n(1-n) \tag{32}$$

$$\frac{\partial \rho}{\partial t} = \frac{\partial}{\partial x}\left[-\rho\frac{\partial u}{\partial t}\right] \tag{33}$$

$$\mu\frac{\partial^3 u}{\partial x^2\,\partial t} + \frac{\partial^2 u}{\partial x^2} + \frac{\partial \tau(n,\rho)}{\partial x} = F(\rho,u) \tag{34}$$

This caricature considers only the fibroblast cell type and assumes a simple form for logistic growth. It also assumes that there is negligible synthesis and degradation of ECM on the timescale of wound closure. This is a reasonable assumption to make in the stages prior to tissue remodelling during the process of wound healing. By defining the initial wound space as $-1 < x < 1$ and using symmetry at $x = 0$ (the wound centre), we may restrict attention to the semi-infinite domain $0 < x < \infty$. The boundary conditions are thus

$$\frac{\partial n}{\partial x}(0,t) = \frac{\partial \rho}{\partial x}(0,t) = u(0,t) = 0 \quad\text{and}\quad n(\infty,t) = \rho(\infty,t) = 1,\; u(\infty,t) = 0.$$

The initial conditions are

$$n(x,0) = H(x-1), \qquad \rho(x,0) = \rho_i + (1-\rho_i)H(x-1), \qquad u(x,0) = 0,$$

where the initial ECM density $\rho_i$ inside the wound is due to the early, provisional wound matrix which is low in collagen and satisfies $0 < \rho_i < 1$, and $H(\cdot)$ is the Heaviside step function.

Consider now the healed steady state, $n = 1$. Linearising the matrix equation about the initial profile, we have

$$\rho(x) = \begin{cases} \rho_i\,(1 - du/dx), & 0 < x < 1 \\ 1 - du/dx, & x > 1 \end{cases} \tag{35}$$

as suggested by the small-strain restriction (since the convective flux should be small). Substituting this into the steady state equation for $u$ results in a second order ordinary differential equation for $u$ which can be written in the (rescaled) form

$$u' = v \tag{36}$$

$$v' = \begin{cases} \dfrac{s\rho_i u\,(1 - v)}{1 - \rho_i T(\rho_i(1 - v))}, & 0 < x < 1 \\[2ex] \dfrac{s u\,(1 - v)}{1 - T(1 - v)}, & x > 1 \end{cases} \tag{37}$$

where $T(\rho) = \partial\tau(1,\rho)/\partial\rho$, and $u$ satisfies the boundary conditions $u(0) = u(\infty) = 0$. Here it is assumed that the body force, $F(\rho,u)$, is due to external tethering to the basement membrane, and it is modelled by a linear spring, that is, $F(\rho, u) = s\rho u$, where $s$ is a constant.

Standard phase plane analysis of (36)-(37) shows that for $x > 1$, the origin is a saddle (centre) iff $T(1) < 1$ ($> 1$). Linear stability analysis of the caricature model (32)-(34) shows that a necessary (but not sufficient) condition for the healed steady state to be stable is $T(1) < 1$ (see Olsen et al., 1998). As this must be the case for the model to be realistic biologically, we have that the origin of the ordinary differential equation system (36)-(37) is a saddle, and the boundary condition $u(\infty) = 0$ implies that the solution must converge towards the origin along the stable manifold as $x$ tends to $\infty$. By tracing backwards in $x$ from infinity along the stable manifold, the solution reaches a point in the $(u, u')$-phase plane corresponding to $x = 1$ where $u = u_1$, say. This must match the solution for $0 < x < 1$.

Now, at the wound centre, $u(0) = 0$, but $v(0)$ is unspecified and is determined by matching to the "outer" solution at $x = 1$. For $0 < x < 1$, it can be shown that the origin can either be a saddle or a centre, depending on the form of the function $\tau(n, \rho)$ and the values of the other parameters. If the origin is a saddle point, then the solution in $0 < x < 1$ will be either monotonic increasing with increasing gradient or monotonic decreasing with decreasing gradient. If the origin is a centre, then the solution in $0 < x < 1$ may be oscillatory. Figure 6 illustrates the qualitative construction of such a solution and Figure 7 illustrates various possible forms of steady state solutions based on this construction.

The traction term is modelled by $\tau(n, \rho) = \tau_0 n\rho/(R^2 + \rho^2)$, where $\tau_0$ and $R$ are constant parameters. This form accounts for the fact that traction forces depend on adhesion between cell surface receptors and binding sites on collagen fibres, but the ability of a cell to extend and retract protrusions within a collagen substrate is inhibited at relatively high collagen densities. With this choice, steady states for (32)-(34) of the form illustrated in Figure 7(a)-(e) can be found by numerical simulation. Hence, this caricature model enables us to more fully understand the properties of the full model and shows clearly that the model can exhibit spatially-varying contracted steady states, and is thus consistent with clinical observations on normal healing. For full details, see Olsen et al. (1998).
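The rise-and-fall behaviour of the traction term described above can be checked with a short numerical sketch. The Python fragment below assumes the form $\tau(n,\rho) = \tau_0 n\rho/(R^2 + \rho^2)$, with the saturation constant written as `R` (a label chosen for this sketch), and evaluates its derivative with respect to $\rho$ at $n = 1$. The parameter values `tau0 = 1.0` and `R = 0.5` are hypothetical and are not taken from Olsen et al.

```python
def traction(n, rho, tau0=1.0, R=0.5):
    """Cell traction tau(n, rho) = tau0 * n * rho / (R**2 + rho**2).

    tau0 and R are illustrative values (assumptions of this sketch).
    """
    return tau0 * n * rho / (R**2 + rho**2)


def traction_slope(rho, tau0=1.0, R=0.5):
    """Derivative of tau(1, rho) with respect to rho, computed by hand:
    tau0 * (R**2 - rho**2) / (R**2 + rho**2)**2.
    """
    return tau0 * (R**2 - rho**2) / (R**2 + rho**2) ** 2


if __name__ == "__main__":
    # Traction grows with collagen density up to rho = R, then declines.
    for rho in (0.1, 0.25, 0.5, 1.0, 2.0):
        print(f"rho = {rho:4.2f}   tau = {traction(1.0, rho):.4f}   "
              f"slope = {traction_slope(rho):+.4f}")
```

Under these assumptions the traction per unit cell density peaks at $\rho = R$, where the slope changes sign; beyond that density, additional collagen reduces the traction that cells can exert.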
Figure 6. Qualitative illustration of a possible solution trajectory to (36)-(37) for the case in which the origin is a centre for $0 < x < 1$ and a saddle point for $x > 1$ with $u(x) \to 0$ from below as $x \to \infty$. See also Figure 7(b). Dashed curves denote phase trajectories, with the contracted solution curve highlighted by solid arrows. See text for details. (Reproduced with permission from Olsen et al., 1997).
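The saddle/centre distinction used in this construction follows from linearising the steady-state system about the origin. The sketch below assumes that, near $(u, v) = (0, 0)$, the system reduces to $u' = v$, $v' = k u$ with $k = s\rho_0/(1 - \rho_0 T(\rho_0))$, where $\rho_0 = 1$ outside the wound and $\rho_0 = \rho_i$ inside it; this linearised form, the function name and all numerical values are assumptions made for illustration. The eigenvalues are $\pm\sqrt{k}$, so the origin is a saddle when $k > 0$ and a centre when $k < 0$.

```python
import math


def classify_origin(T_value, rho0=1.0, s=1.0):
    """Classify the origin of the linearised system u' = v, v' = k*u.

    k = s*rho0 / (1 - rho0*T_value), where T_value stands for T(rho0).
    Returns the type of the origin and |eigenvalue| = sqrt(|k|).
    The form of k is an assumption of this sketch.
    """
    if s <= 0:
        raise ValueError("the spring constant s is taken to be positive")
    denom = 1.0 - rho0 * T_value
    if denom == 0.0:
        raise ValueError("degenerate case: 1 - rho0*T(rho0) = 0")
    k = s * rho0 / denom
    if k > 0:
        return "saddle", math.sqrt(k)    # real eigenvalues +/- sqrt(k)
    return "centre", math.sqrt(-k)       # imaginary eigenvalues +/- i*sqrt(-k)


if __name__ == "__main__":
    print(classify_origin(0.5))             # outside the wound, T(1) < 1
    print(classify_origin(1.5))             # outside the wound, T(1) > 1
    print(classify_origin(10.0, rho0=0.2))  # inside the wound, hypothetical values
```

For $\rho_0 = 1$ the test $k > 0$ reduces to $T(1) < 1$, matching the condition stated in the text for $x > 1$.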
Figure 7. Possible qualitative forms of the solution $u(x)$ of the boundary value problem (36)-(37), representing contracted tissue displacement profiles. The point $(u, u') = (0,0)$ must be a saddle point for $x > 1$ in the $(u, u')$-phase plane, with $u$ increasing to zero and $u'$ decreasing to zero monotonically along the stable manifold in the top-left quadrant as $x \to \infty$. For $0 < x < 1$, the origin may be either a saddle point, in which case the profiles for $u$ and $u'$ are monotonic decreasing as shown in (a), or a centre, in which case $u$ and $u'$ oscillate about the origin as shown in (b-f); within this region, any number of oscillations is possible; for example, (f) is equivalent to (b) modulo one period. Note that the above steady-state profiles but with reversed signs of $u$ and $u'$ are also admissible solutions of (36)-(37), representing expanded tissue displacement profiles since $u(1)$ would be positive. Recall that $x = 1$ is the initial wound boundary. (Reproduced with permission from Olsen et al., 1997).

We now consider the application of the model to fibroproliferative wound healing disorders. These disorders are characterised by the generation of abnormally large amounts of tissue during the healing process, leading to, for example, keloid scarring. Numerical simulations of the full model show that it can exhibit solutions in which an excess of cells is observed, corresponding to a pathological state. To understand this more fully, a caricature model of the full system is, again, investigated. In this case, however, we focus purely on the chemical aspects of the mechanochemical framework by considering the cell-chemical sub-model

$$\frac{\partial n}{\partial t} = D_n\frac{\partial^2 n}{\partial x^2} - \frac{\partial}{\partial x}\left(\chi(c,n)\frac{\partial c}{\partial x}\right) + \left(1 + \frac{Pc}{Q + c}\right)n\left(1 - \frac{n}{K}\right) - d_n n \tag{38}$$

$$\frac{\partial c}{\partial t} = D_c\frac{\partial^2 c}{\partial x^2} + \frac{\kappa_c n c}{\gamma + c} - d_c c \tag{39}$$

where $\chi(c,n) = \alpha n/(\beta + c)^2$, and $\alpha$, $\beta$, $P$, $Q$, $\kappa_c$ and $\gamma$ are positive constants (see Olsen et al., 1996, for full details).

This caricature model has two uniform steady states, $(n, c) = (0,0)$ and $(K, 0)$, corresponding, respectively, to the trivial, or non-healing, state, and the normal dermal state. For appropriate parameter values, two other steady states exist which have both $n$ and $c$ non-zero, with $n > K$. These are the pathological, or diseased, steady states. Results from bifurcation analysis of (38)-(39), in the absence of diffusion, show that for a critical value, $\kappa_c^{*}$, of $\kappa_c$ (which can be found in terms of the other parameters) the dermal steady state remains locally stable but loses global stability as the pathological steady states appear. At a second critical value, $\kappa_c = \hat{\kappa}_c$, the dermal steady state loses stability and the pathological state with higher cell density level becomes globally stable.

A travelling wave analysis of the model shows that trajectories from the dermal state to the pathological state are possible and a minimum wavespeed can be determined. Numerical simulations of the system show that such travelling waves do exist, but that reducing $\kappa_c$ can cause the waves to stop and to regress (see Figure 8). This suggests that the reduction of the rate at which cells secrete growth factor can cause the disease to regress back to the normal dermal state.

Figure 8. Numerical simulations of (38)-(39) showing progression to pathological steady state (a), and cessation and regression for the case where $\kappa_c$ is reduced to zero after a certain time (b). (Reproduced with permission from Olsen et al., 1997).

More detailed analysis of this model determines analytically how the bifurcation values of $\kappa_c$ depend on the other parameters in the model. In particular, the model exhibits hysteresis and therefore, counter-intuitively, $\kappa_c$ must be reduced considerably in order to progress from the diseased state to a healed state. This provides a clinically-testable method to help reduce this type of fibroproliferative disorder.

It should be noted that although the above studies were carried out for simplified versions of the full mechanochemical model, the results do indeed hold for the full model.

3.3 Discussion
The above models have been presented to illustrate how continuum type models can be used to address problems in normal and abnormal wound healing. The model for corneal wound healing has recently been extended to be more biologically realistic by including more than one cell type. The resultant model is a system of coupled integro-partial differential equations which exhibits mitotic profiles that are more biologically realistic than those observed in the original model (Gaffney et al., 1999). The mechanochemical modelling framework can be extended to include cell alignment and matrix orientation; the latter is thought to play a crucial role in determining the severity of scar tissue formation (Olsen et al., 1999, Dallon et al., 1999). This particular modelling framework assumes that the extracellular matrix is a linear viscoelastic material. However, it is clearly more complicated than that, and a more realistic model, considering the ECM as a viscoelastic-plastic material, has recently been proposed (Tracqui et al., 1995).

None of the above models investigates angiogenesis, the process by which new blood vessels form. This is of great interest also in tumour formation, where after reaching a certain size, limited by the availability of nutrients via simple diffusion, a tumour can only grow further by establishing its own blood supply via the release of so-called tumour angiogenesis factor. This is what allows the tumour to grow and undergo metastasis, causing the growth of secondary tumours which are usually fatal. For the mathematical modelling of tumour angiogenesis, the reader is referred to the paper by Byrne and Chaplain (1995), while the papers by Pettet et al. (1996) and Olsen et al. (1997) address wound healing angiogenesis.

4 Conclusions
In this paper I have considered the problems of spatial pattern formation in biology and of wound healing by illustrating a few applications. These seemingly unrelated processes share the common underlying theme of cell response to, and interaction with, signalling cues. The models are conceptually simple and closely related, yet exhibit a bewildering array of behaviours, from spiral waves, spots and stripes, with application in pattern formation, to travelling waves of invasion in wound healing.

The material in Sections 2 and 3 has illustrated how mathematical models can be used to gain important insights into the underlying mechanisms responsible for spatio-temporal pattern formation in biology and normal/abnormal wound healing and to make experimentally testable predictions. It is clear that even these so-called simple models pose challenging problems to the mathematician. Many of the results presented here were obtained from numerical simulation. An important future aim will be to make some of these results mathematically rigorous.

Over the past decade there have been huge advances in molecular biology. A key problem for the next decade will be to combine this knowledge with research in cellular biology to gain a fuller understanding of the systems being studied. Important scientific advances in this area will only be made by a truly interdisciplinary approach and it is clear that mathematical modelling and numerical computation will have important roles to play in this endeavour. It is also clear that biology and medicine will continue to be a source of novel, difficult and challenging problems for mathematicians and numerical analysts.

References

1. J.L. Aragón, C. Varea, R.A. Barrio and P.K. Maini, Spatial patterning in modified Turing systems: Application to pigmentation patterns on marine fish, FORMA, 13, 213-221 (1998)
2. P.D. Asmussen and B. Sollner, Wound care: Principles of wound healing, Beiersdorf Medical Bibliothek (1993)
3. J.B.L. Bard, A model for generating aspects of zebra and other mammalian coat patterns, J. theor. Biol., 93, 363-385 (1981)
4. J.B.L. Bard and I. Lauder, How well does Turing's theory of morphogenesis work? J. theor. Biol., 45, 501-531 (1974)
5. D.L. Benson, P.K. Maini and J.A. Sherratt, Unravelling the Turing bifurcation using spatially varying diffusion coefficients, J. Math. Biol., 37, 381-417 (1998)
6. E. Ben-Jacob, I. Cohen, I. Golding and Y. Kozlovsky, Modeling branching and chiral colonial patterning of lubricating bacteria, In: Proceedings of IMA Workshop on Pattern Formation, H.G. Othmer and P.K. Maini (eds), to appear (2000)
7. N.F. Britton, Reaction-Diffusion Equations and Their Applications to Biology, Academic Press, London (1986)
8. F. Brummer, G. Zempel, P. Buhle, J.-C. Stein and D.F. Hulser, Retinoic acid modulates gap junction permeability: A comparative study of dye spreading and ionic coupling in cultured cells, Exp. Cell. Res., 196, 158-163 (1991)
9. H.M. Byrne and M.A.J. Chaplain, Mathematical models for tumour angiogenesis: numerical simulations and nonlinear wave solutions, Bull. Math. Biol., 57, 416-486 (1995)
10. H.M. Byrne and M.A.J. Chaplain, On the importance of constitutive equations in mechanochemical models of pattern formation, Appl. Math. Lett., 9, 85-90 (1996)
11. V. Castets, E. Dulos, J. Boissonade and P. De Kepper, Experimental evidence of a sustained Turing-type equilibrium chemical pattern, Phys. Rev. Lett., 64(3), 2953-2956 (1990)
12. R.A.F. Clark and P.M. Henson (eds.), The Molecular and Cellular Biology of Wound Repair, Plenum Press, New York (1988)
13. G. Cocho, R. Pérez-Pascual and J.L. Rius, Discrete systems, cell-cell interactions and color pattern of animals. I. Conflicting dynamics and pattern formation, J. theor. Biol., 125, 419-435 (1987a)
14. G. Cocho, R. Pérez-Pascual, J.L. Rius and F. Soto, Discrete systems, cell-cell interactions and color pattern of animals. II. Clonal theory and cellular automata, J. theor. Biol., 125, 437-447 (1987b)
15. E.J. Crampin, E.A. Gaffney and P.K. Maini, Reaction and Diffusion on Growing Domains: Scenarios for Robust Pattern Formation, Bull. Math. Biol., 61, 1093-1120 (1999)
16. P.D. Dale, P.K. Maini and J.A. Sherratt, Mathematical modelling of corneal epithelium wound healing, Math. Biosciences, 124, 127-147 (1994)
17. J. Dallon, J. Sherratt and P.K. Maini, Mathematical modelling of extracellular matrix dynamics using discrete cells: Fiber orientation and tissue regeneration, J. theor. Biol., 199, 449-471 (1999)
18. A. Davidson, M.A.R. Koehl, R. Keller and G.F. Oster, How do sea urchins invaginate? Using biomechanics to distinguish between mechanisms of primary invagination, Dev., 121, 2005-2018 (1995)
19. P. De Kepper, V. Castets, E. Dulos and J. Boissonade, Turing-type chemical patterns in the chlorite-iodide-malonic acid reaction, Physica D, 49, 161-169 (1991)
20. R. Dillon, P.K. Maini and H.G. Othmer, Pattern formation in generalised Turing systems: I. Steady-state patterns in systems with mixed boundary conditions, J. Math. Biol., 32, 345-393 (1994)
21. L. Edelstein-Keshet, Mathematical Models in Biology, New York, Random House (1988)
22. B. Ermentrout, Stripes or spots? Nonlinear effects in bifurcation of reaction-diffusion equations on the square, Proc. Roy. Soc. Lond., A434, 413-417 (1991)
23. B. Ermentrout, J. Campbell and G. Oster, A model for shell patterns based on neural activity, The Veliger, 28, 369-388 (1986)
24. P. Fife, Mathematical Aspects of Reacting and Diffusing Systems, Lect. Notes in Biomath., 28, Springer-Verlag, Berlin, Heidelberg, New York (1979)
25. E.A. Gaffney, P.K. Maini, J.A. Sherratt and S. Tuft, The mathematical modelling of cell kinetics in corneal epithelial wound healing, J. theor. Biol., 197, 15-40 (1999)
26. A. Gierer and H. Meinhardt, A theory of biological pattern formation, Kybernetik, 12, 30-39 (1972)
27. P. Grindrod, The Theory and Applications of Reaction-Diffusion Equations: Patterns and Waves, Oxford University Press (1996)
28. J.D. Gross, M.J. Peacey and D.J. Trevan, Signal emission and signal propagation during early aggregation in Dictyostelium discoideum, J. Cell Sci., 22, 645-656 (1976)
29. T. Hofer, P.K. Maini, J.A. Sherratt, M.A.J. Chaplain, P. Chauvet, D. Metevier, P.C. Montes and J.D. Murray, A resolution of the chemotactic wave paradox, Appl. Math. Lett., 7, 1-5 (1994)
30. T. Hofer, J.A. Sherratt and P.K. Maini, Dictyostelium discoideum: cellular self-organization in an excitable biological medium, Proc. Roy. Soc. Lond., B 259, 249-257 (1995a)
31. T. Hofer, J.A. Sherratt and P.K. Maini, Cellular pattern formation during Dictyostelium aggregation, Physica D, 85, 425-444 (1995b)
32. D.M. Holloway and L.G. Harrison, Algal morphogenesis: modelling interspecific variation in Micrasterias with reaction-diffusion patterned catalysis of cell surface growth, Phil. Trans. R. Soc. Lond., B354, 417-433 (1999)
33. B.R. Johnson and S.K. Scott, New approaches to chemical patterns, Chem. Soc. Rev., 265-273 (1996)
34. E.F. Keller and L.A. Segel, Travelling bands of bacteria: a theoretical analysis, J. theor. Biol., 30, 235-248 (1971)
35. S. Kondo and R. Asai, A reaction-diffusion wave on the skin of the marine angelfish Pomacanthus, Nature, 376, 765-768 (1995)
36. P.M. Kulesa, G.C. Cruywagen, S.R. Lubkin, P.K. Maini, J. Sneyd, M.W.J. Ferguson and J.D. Murray, On a model mechanism for the spatial patterning of teeth primordia in the Alligator, J. theor. Biol., 180, 287-296 (1996)
37. L. Landau and E. Lifshitz, Theory of Elasticity, Pergamon Press, New York (1970)
38. I. Lengyel and I.R. Epstein, Modeling of Turing structures in the chlorite-iodide-malonic acid-starch reaction system, Science, 251, 650-652 (1991)
39. P.K. Maini and M. Solursh, Cellular mechanisms of pattern formation in the developing limb, Int. Rev. Cytology, 129, 91-133 (1991)
40. P.K. Maini, D.L. Benson and J.A. Sherratt, Pattern formation in reaction-diffusion models with spatially inhomogeneous diffusion coefficients, IMA J. Math. Appl. Med. & Biol., 9, 197-213 (1992)
41. P.K. Maini, M.R. Myerscough, K.H. Winters and J.D. Murray, Bifurcating spatially heterogeneous solutions in a chemotaxis model for biological pattern formation, Bull. Math. Biol., 53, 701-719 (1991)
42. J.L. Martiel and A. Goldbeter, A model based on receptor desensitization for cyclic AMP signaling in Dictyostelium cells, Biophys. J., 52, 807-828 (1987)
43. H. Meinhardt, Models of Biological Pattern Formation, Academic Press (1982)
44. H. Meinhardt, The Algorithmic Beauty of Sea Shells, Springer-Verlag (1995)
45. J.D. Murray, A pre-pattern formation mechanism for animal coat markings, J. theor. Biol., 88, 161-199 (1981)
46. J.D. Murray, Parameter space for Turing instability in reaction-diffusion mechanisms: a comparison of models, J. theor. Biol., 98, 143-163 (1982)
47. J.D. Murray, Mathematical Biology, Springer-Verlag (1993)
48. J.D. Murray, D.C. Deeming and M.W.J. Ferguson, Size dependent pigmentation pattern formation in embryos of Alligator mississippiensis: time of initiation of pattern generation mechanism, Proc. Roy. Soc. Lond., B 239, 279-293 (1990)
49. J.D. Murray, P.K. Maini and R.T. Tranquillo, Mechanochemical models for generating biological pattern and form in development, Physics Reports, 171, 59-84 (1988)
50. J.D. Murray and M.R. Myerscough, Pigmentation pattern formation on snakes, J. theor. Biol., 149, 339-360 (1991)
51. J.D. Murray and R.T. Tranquillo, Continuum of fibroblast-driven wound contraction: inflammation-mediation, J. theor. Biol., 158, 135-172 (1992)
52. B.N. Nagorcka, Wavelike isomorphic prepatterns in development, J. theor.
Biol, 137, 127-162 (1989) 53. B.N. Nagorcka, V.S. Manoranjan and J.D. Murray, Complex spatial pat terns from tissue interactions — an illustrative model, J. theor. Biol, 128, 359-374 (1987) 54. G.A. Ngwa and P.K. Maini, Spatio-temporal patterns in a mechani cal model for mesenchymal morphogenesis, J. Math. Biol, 33, 489-520
Mathematical Modelling in the Life Sciences
235
(1995) 55. H.F. Nijhout, A comprehensive model for colour pattern formation in butterflies, Proc. R. Soc. Lond. B 239, 81-113 (1990) 56. G.M. Odell, G. Oster, P. Alberch and B. Burnside, The mechanical basis of morphogenesis. I. Epithelial folding and invagination, Devi. Biol., 85, 446-462 (1981) 57. L. Olsen, J.A. Sherratt and P.K. Maini, A mechanochemical model for adult dermal wound contraction and the permanence of the contracted tissue displacement profile, J. theor. Biol., 177, 113-128 (1995) 58. L. Olsen, J.A. Sherratt and P.K. Maini, A mechanochemical model for adult dermal wound contraction: On the permanence of the contracted tissue displacement profile, J. theor. Biol., 177, 113-128 (1995) 59. L. Olsen, J.A. Sherratt and P.K. Maini, A mathematical model for fibroproliferative wound healing disorders, Bull. Math. Biol., 58, 787-808 (1996) 60. L. Olsen, P.K. Maini and J.A. Sherratt, Spatially varying equilibria of mechanical models: application to dermal wound contraction, Math. Biosciences, 147,113-129 (1998) 61. L. Olsen, J.A. Sherratt, P.K. Maini and F. Arnold, A mathematical model for the capillary endothelial cell-extracellular matrix interactions in wound-healing angiogenesis, IMA J.Math.Appl.Med. & Biol, 14(4), 261-281 (1997) 62. L. Olsen, P.K. Maini, J.A. Sherratt and J. Dallon, Mathematical mod elling of anisotropy in fibrous connective tissue, Math. Biosciences, 158, 145-170 (1999) 63. G.F. Oster and J.D. Murray, Pattern formation models and development, Zool., 251, 186-202 (1989) 64. G.F. Oster, J.D. Murray and A.K. Harris, Mechanical aspects of mesenchymal morphogenesis, J. Embryol.exp. Morph., 78, 83-125 (1983) 65. H.G. Othmer and A. Stevens, Aggregation, blowup, and collapse: The ABC's of taxis in reinforced random walks, SIAM J. Appl. Math., 57, 1044-1081 (1997) 66. A.V. Panfilov and A.V. Holden (eds), Computational Biology of the Heart, Chichester, John Wiley & Sons (1997) 67. K.J. Painter, P.K. Maini and H.G. 
Othmer, A chemotactic model for the advance and retreat of the primitive streak in avian development, Bull. Math. Biol. (to appear) (1999a) 68. K.J. Painter, P.K. Maini and H.G. Othmer, Stripe formation in juvenile Pomacanthus explained by a generalised Turing mechanism with chemotaxis, PNAS, 96, 5549-5554 (1999b)
236 P. K. Maini
69. A.S. Perelson, P.K. Maini, J.D. Murray, J.M. Hyman and G.F. Oster, Nonlinear pattern selection in a mechanical model for morphogenesis, J. Math. Bioi, 24, 525-541 (1986) 70. G. Pettet, M.A.J. Chaplain, D.L.S. McElwain and H.M. Byrne, On the role of angiogenesis in wound healing, Proc. Roy. Soc. Lond., B263, 1487-1493 (1996) 71. J. Schnakenberg, Simple chemical reaction systems with limit cycle be haviour, J. theor. Bioi, 8 1 , 389-400 (1979) 72. L.A. Segel, Modelling Dynamic Phenomena in Molecular and Cellular Biology, Cambridge, Cambridge University Press (1984) 73. D.R. Soil, D. Wessels and A. Sylwester, The motile behavior of amoebae in the aggregation wave in Dictyostelium discoideum, In: Experimental and Theoretical Advances in Biological Pattern Formation, H.G. Othmer, P.K. Maini and J.D. Murray, (eds.), Plenum Press, London, pp 325-328 (1993) 74. M.S. Steinberg, Does differential adhesion govern self-assembly processes in histogenesis? Equilibrium configurations and the emergence of a hier archy among populations of embryonic cells, J. exp. Zool., 173, 395-434 (1970) 75. D. Sulsky, S. Childress and J.K. Percus, A model of cell sorting, J. theor. Bioi, 106, 275-301 (1984) 76. D. Thomas, Artifical enzyme membranes, transport, memory and oscil latory phenomena, in Analysis and Control of Immobilized Enzyme Sys tems, ed. D. Thomas and J.-P. Kernevez, Springer, Berlin, Heidelberg, New York, 115-150 (1975) 77. P. Tracqui, D.E. Woodward, G.C. Cruywagen, J. Cook and J.D. Murray, A mechanical model for fibroblast-driven wound healing, J. Bioi Sys tems, 3, 1075-1085 (1995) 78. A.M. Turing, The chemical basis of morphogenesis, Phil Trans. Roy. Soc. Lond. B 327, 37-72 (1952) 79. J.J. Tyson and J.P. Keener, Singular perturbation theory of traveling waves in excitable media (a review), Physica D, 32, 327-361 (1988) 80. B. Vasiev and C.J. Weijer, Modelling Dictystelium Discoideum mor phogenesis, In: Proceedings of IMA Workshop on Pattern Formation, H.G. Othmer and P.K. 
Maini, (eds) to appear (2000). 81. M. Weliky, S. Minsuk, R. Keller and G.F. Oster, Notochord morpho genesis in Xenopus laevis: simulation of cell behaviour underlying tissue convergence and extension, Dev., 113, 1231-1244 (1991) 82. M. Weliky and G.F. Oster, The mechanical basis of cell rearrange ment. I. Epithelial morphogenesis during Fundulus epiboly, Dev., 109,
Mathematical Modelling in the Life Sciences
237
373-386 (1990) 83. L. Wolpert, Positional information and the spatial pattern of cellular differentiation, J. theor. Biol, 25, 1-47 (1969) 84. L. Wolpert and A. Hornbruch, Double anterior chick limb buds and models for cartilage rudiment specification, Development, 109, 961-966 (1990)
RELATIVE EQUILIBRIA AND CONSERVED QUANTITIES IN SYMMETRIC HAMILTONIAN SYSTEMS
JAMES MONTALDI
Institut Non Linéaire de Nice
1 Introduction
In this introduction, we first recall the basic phase space structures involved in Hamiltonian systems: the symplectic form, the Poisson brackets, and the Hamiltonian function and vector fields, together with the relationships between them. Afterwards we describe a few examples of Hamiltonian systems, both of the classical 'kinetic+potential' type and others that use the symplectic/Poisson structure more explicitly.

There are many applications of the ideas in these notes that have been investigated by different people, but which I shall not cover. The major example, the one for which classical mechanics was invented, is the gravitational $N$-body problem. But there are many others too, such as rigid bodies, coupled rigid bodies, coupled rods, underwater sea vehicles, ... not to mention the infinite dimensional systems such as water waves, fluid flow, plasmas and elasticity. The interested reader should consult the books in the list of references at the end of these notes. Note that many of the items in the list of references are not in fact referred to in the text!

1.1 Hamilton's equations
The archetypal Hamiltonian system describes the motion of a particle in a potential well. If the particle has mass $m$, and $V(x)$ is the potential energy at the point $x$ (in whatever Euclidean space), then Newton's laws state that
$$m\ddot{x} = -\nabla V(x).$$
In the 18th century, Lagrange introduced the phase space by defining $y = \dot{x}$ and passing to a first order differential equation, and Hamilton carried this further by introducing his now-famous equations
$$\dot{q} = \partial H/\partial p, \qquad \dot{p} = -\partial H/\partial q,$$
where $q$ replaces $x$, $p = m\dot{x}$ is the momentum, and
$$H(q,p) = \frac{1}{2m}|p|^2 + V(q)$$
is the Hamiltonian, or total energy (kinetic + potential). This first order system is equivalent to Newton's law above, as is easily checked.

The principal advantage of the approach of Lagrange and Hamilton is that it is more readily generalized to systems where the configuration space is not a Euclidean space, but is a manifold. Such systems usually come about because of constraints (e.g. for the rigid body the constraints are that the distance between any two particles is fixed, and the configuration space is then the set of rotations and translations in Euclidean 3-space).

The other advantage of Hamilton's approach is that it lends itself to generalizations to systems that are not of the kinetic + potential type, such as the model of a system of $N$ point vortices in the plane or on a sphere, which we will see below, or Euler's equations modelling the "reduced" motion of a rigid body.

These two generalizations lead to defining the dynamics in terms of a Hamiltonian on a phase space, where the phase space has the additional structure of being a Poisson or a symplectic manifold. The 'canonical' Poisson structure is given by
$$\{f,g\} = \sum_{j=1}^{n}\left(\frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q_j} - \frac{\partial f}{\partial q_j}\frac{\partial g}{\partial p_j}\right),$$
where $f$ and $g$ are any two smooth functions on the phase space. The canonical symplectic structure is given by
$$\omega = \sum_{j=1}^{n} dp_j \wedge dq_j = d\alpha,$$
where $\alpha = \sum_{j=1}^{n} p_j\, dq_j$ is the canonical Liouville 1-form.(a) We shall see other examples of Poisson and symplectic structures in the course of these lectures.

(a) Not all authors agree on the choices of signs in the definition of the symplectic form or the Poisson brackets, so when using formulae involving either from a text or paper, it is necessary first to check the definitions. Our choice ensures that $X_{\{f,g\}} = [X_f, X_g]$.

The Hamiltonian vector field $X_H$ is determined by the Hamiltonian $H$, a smooth function on the phase space, in either of the following ways:
$$\dot{F} = \{H, F\}, \qquad \omega(v, X_H) = dH(v), \tag{1.1}$$
where $\dot{F} = X_H(F)$ is the time-rate of change of the function $F$ along the trajectories of the dynamical system. Combining these two expressions gives the useful formula relating the symplectic and Poisson structures,
$$\omega(X_f, X_g) = \{f, g\} \tag{1.2}$$
for any smooth functions $f$, $g$.

The first property of such Hamiltonian systems is their conservative nature: the Hamiltonian function $H$ is conserved under the dynamics and so too is the natural volume in phase space (Liouville's theorem). This has an important effect, not only on the type of dynamics encountered in such systems, but also on the types of generic bifurcations that can occur. Indeed, the first consequence of these conservation laws (energy and volume) is that one cannot have attractors in Hamiltonian systems, and in particular the notion of asymptotic stability is not available.

A further feature of Hamiltonian systems is that symmetries lead to conserved quantities. The two best-known examples of this are rotational symmetry leading to conservation of angular momentum, and translational symmetry to conservation of ordinary linear momentum. These further conserved quantities could in principle complicate the types of dynamics and bifurcations one sees. However, a well-defined process called reduction (or symplectic reduction) can be used to replace the symmetry and conservation laws by a family of reduced Hamiltonian systems, parametrized by these conserved quantities, whose bifurcations are those expected in generic systems. That said, there is a further complication, which is that the phase space(s) for these reduced systems may be singular, or change or degenerate in a family, and we are only just beginning to understand the effects of these degenerations on the dynamics and bifurcations.

1.2 Examples
Spherical pendulum. A spherical pendulum is a particle constrained to move on the surface of a sphere under the influence of gravity. As coordinates, one can take spherical polar coordinates $\theta, \phi$ ($\theta$ measuring the angle with the downward vertical, and $\phi$ the angle with a fixed horizontal axis). Of course, this coordinate system has the defect of being singular at $\theta = 0, \pi$. The kinetic energy is
$$T(q,\dot{q}) = \tfrac{1}{2} m\ell^2\left(\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right),$$
while the potential energy is $V(q) = mg\ell(1 - \cos\theta)$. The momenta conjugate to the spherical polars are
$$p_\theta = \partial\mathcal{L}/\partial\dot\theta = m\ell^2\dot\theta, \qquad p_\phi = \partial\mathcal{L}/\partial\dot\phi = m\ell^2\sin^2\theta\,\dot\phi,$$
where $\mathcal{L} = T - V$ is the Lagrangian, and the Hamiltonian is then
$$H = \frac{1}{2m\ell^2}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right) + mg\ell(1 - \cos\theta).$$
The equations of motion are given as usual by Hamilton's equations. In particular, one can see from these equations that the angular momentum $p_\phi$ about the vertical axis is conserved, since $H$ is independent of $\phi$ (one has $\dot{p}_\phi = -\partial H/\partial\phi = 0$).
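The conservation statements for the spherical pendulum can be checked numerically. The following is a minimal sketch, not part of the original notes: it integrates Hamilton's equations for the Hamiltonian above with a classical Runge-Kutta step and reports the drift in $H$ and $p_\phi$. The parameter values, the step size, the initial condition and all function names are assumptions chosen for illustration.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text).
MASS, LENGTH, GRAVITY = 1.0, 1.0, 9.81
INERTIA = MASS * LENGTH**2


def hamiltonian(state):
    """H = (p_theta^2 + p_phi^2/sin^2(theta)) / (2 m l^2) + m g l (1 - cos(theta))."""
    theta, _phi, p_theta, p_phi = state
    kinetic = (p_theta**2 + p_phi**2 / math.sin(theta)**2) / (2.0 * INERTIA)
    return kinetic + MASS * GRAVITY * LENGTH * (1.0 - math.cos(theta))


def hamilton_rhs(state):
    """Right-hand side of Hamilton's equations in the coordinates (theta, phi)."""
    theta, _phi, p_theta, p_phi = state
    s, c = math.sin(theta), math.cos(theta)
    return (
        p_theta / INERTIA,                                        # dH/dp_theta
        p_phi / (INERTIA * s**2),                                 # dH/dp_phi
        p_phi**2 * c / (INERTIA * s**3) - MASS * GRAVITY * LENGTH * s,  # -dH/dtheta
        0.0,                                                      # -dH/dphi = 0
    )


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = hamilton_rhs(state)
    k2 = hamilton_rhs(shifted(state, k1, dt / 2.0))
    k3 = hamilton_rhs(shifted(state, k2, dt / 2.0))
    k4 = hamilton_rhs(shifted(state, k3, dt))
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, steps):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


if __name__ == "__main__":
    start = (1.0, 0.0, 0.0, 1.0)  # (theta, phi, p_theta, p_phi), away from theta = 0, pi
    end = integrate(start, 1e-3, 2000)
    print("energy drift:", abs(hamiltonian(end) - hamiltonian(start)))
    print("p_phi drift: ", abs(end[3] - start[3]))
```

The integrator is not symplectic, so the energy is only conserved up to the discretization error; $p_\phi$ is conserved exactly here because its rate of change vanishes identically.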
where $z_j$ is a complex number representing the position of the $j$-th vortex. This system is Hamiltonian, with the Hamiltonian given by a pairwise interaction depending on the mutual distances:
This is clearly not of the form 'kinetic+potential'. The Poisson structure is
{/> s} = E A 7 l (/** - »*/*) 3
and the symplectic form is w(u, v) = £V Xj (UJVJ - UjVj). Being of dimension 3, the Euclidean symmetry has 3 conserved quantities associated to it—see equation (6.2). A similar model can be obtained for point vortices on the sphere, providing a simple model for cyclones and hurricanes in planetary atmospheres. See Section 6.
1.3 Symmetry
A transformation of the phase space $T : \mathcal{P} \to \mathcal{P}$ is a symmetry of the Hamiltonian system if
(i) $H(Tx) = H(x)$ for all $x \in \mathcal{P}$,
(ii) $T$ preserves the symplectic structure: $T^*\omega = \omega$, or
(ii') $T$ preserves the Poisson structure: $\{f \circ T, g \circ T\}(x) = \{f, g\}(T(x))$.

There are three basic ways that symmetries affect Hamiltonian systems:
(a) The image by $T$ of a solution is also a solution;
(b) A solution with initial point fixed by $T$ lies entirely within the set $\mathrm{Fix}(T,\mathcal{P})$, where $\mathrm{Fix}(T,\mathcal{P}) = \{x \in \mathcal{P} \mid Tx = x\}$;
(c) If $T$ is part of a continuous group, then the group gives rise to conserved quantities (Noether's theorem).

The first of these is clear, and in fact is also true of more general symmetries for which $H \circ T - H$ is constant and $T^*\omega = c\,\omega$ ($c$ a constant). This occurs for example for homotheties of the plane in the planar point vortex model described above. The second, (b), is less obvious but very well known; it follows from a very simple calculation. If $Tx = x$ then
$$\sigma_t(x) = \sigma_t(Tx) = T\sigma_t(x),$$
where $\sigma_t$ is the time $t$ flow associated to the Hamiltonian system, and so $\sigma_t(x)$ is fixed by $T$. If $T$ is part of a compact group, then not only is $\mathrm{Fix}(T,\mathcal{P})$ invariant, but it is a symplectic submanifold, and the restrictions of the Hamiltonian and the symplectic form (or Poisson structure) to $\mathrm{Fix}(T,\mathcal{P})$ define a Hamiltonian system which coincides with the restriction of the given Hamiltonian system (an exercise for the reader). This technique of restricting to fixed point spaces is sometimes called discrete reduction.

In these notes we will be concentrating on the effect of (c). The central force problem provides the basic motivating example of this.

1.4 Central force problem
Consider a particle of mass $m$ moving in the plane under a conservative force whose potential depends only on the distance to the origin (a similar analysis is possible for the spherical pendulum). It is then natural to use polar coordinates $(r, \phi)$, which are adapted to the rotational symmetry of the problem, so that $V = V(r)$. The velocity of the particle is $\dot{x} = (\dot{r}\cos\phi - r\dot\phi\sin\phi,\ \dot{r}\sin\phi + r\dot\phi\cos\phi)$, so that the kinetic energy is
$$T = \tfrac{1}{2} m\left(\dot{r}^2 + r^2\dot\phi^2\right).$$
The Lagrangian is given by $\mathcal{L} = T - V$, and the Hamiltonian is $H = T + V$ with associated momentum variables given by $p_r = \partial\mathcal{L}/\partial\dot{r} = m\dot{r}$ and $p_\phi = \partial\mathcal{L}/\partial\dot\phi = mr^2\dot\phi$. Substituting for the velocities $\dot{r}$ and $\dot\phi$ in terms of the momenta determines the Hamiltonian to be
$$H(r,\phi,p_r,p_\phi) = \frac{1}{2m}\left(p_r^2 + \frac{1}{r^2}p_\phi^2\right) + V(r).$$
Then Hamilton's equations with respect to these variables are
$$\dot{r} = \frac{1}{m}p_r, \qquad \dot{p}_r = \frac{1}{mr^3}p_\phi^2 - V'(r), \qquad \dot\phi = \frac{1}{mr^2}p_\phi, \qquad \dot{p}_\phi = 0. \tag{1.4}$$
The last equation says that $p_\phi$ is preserved under the dynamics. In fact, $p_\phi$ is the angular momentum about the origin $r = 0$.

Since $p_\phi$ is preserved, let us consider a motion with initial condition for which $p_\phi = \mu$. Then $(r, p_r)$ evolve as
$$\dot{r} = \frac{1}{m}p_r, \qquad \dot{p}_r = \frac{\mu^2}{mr^3} - V'(r). \tag{1.5}$$
This is in fact a Hamiltonian system, with Hamiltonian $H_\mu(r,p_r)$ obtained by substituting $\mu$ for $p_\phi$, so that
$$H_\mu(r,p_r) = \frac{1}{2m}p_r^2 + \frac{\mu^2}{2mr^2} + V(r).$$
This is a 1-degree of freedom problem, called the reduced system, with "effective" potential energy
$$V_\mu(r) = \frac{\mu^2}{2mr^2} + V(r),$$
and for a given potential energy function $V(r)$, one can study how the behaviour of the system depends on $\mu$. For example, with the gravitational potential $V(r) = -1/r$, one obtains an effective potential of the form in Figure 1 below, where the first graph shows the potential $V(r)$ as a function of $r$, while the second and third show $V_\mu(r)$ for increasing values of $\mu$.
It is clear that for $\mu = 0$ there is no equilibrium for the reduced system, while for $\mu > 0$ there is an equilibrium at $r_\mu$ satisfying $V_\mu'(r_\mu) = 0$; here $r_\mu = \mu^2/m$. Indeed, in this example it is a stable equilibrium, since the effective potential has a minimum at $r_\mu$.
Figure 1. Effective potential for increasing values of $\mu$, for $V(r) = -1/r$
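As a quick check of the reduced central force problem with $V(r) = -1/r$, the sketch below evaluates the effective potential $V_\mu$, confirms that $r_\mu = \mu^2/m$ is a critical point and a local minimum, and computes the angular velocity $\dot\phi = \mu/(m r_\mu^2)$ obtained by substituting $r = r_\mu$ into the equation for $\dot\phi$ in (1.4). The function names and the sample values of $\mu$ and $m$ are illustrative assumptions.

```python
def effective_potential(r, mu, m=1.0):
    """V_mu(r) = mu^2 / (2 m r^2) + V(r), with V(r) = -1/r."""
    return mu**2 / (2.0 * m * r**2) - 1.0 / r


def effective_potential_derivative(r, mu, m=1.0):
    """V_mu'(r) = -mu^2 / (m r^3) + 1/r^2."""
    return -mu**2 / (m * r**3) + 1.0 / r**2


def equilibrium_radius(mu, m=1.0):
    """Critical point of V_mu for mu != 0: r_mu = mu^2 / m."""
    return mu**2 / m


def angular_velocity(mu, m=1.0):
    """Constant value of d(phi)/dt on the orbit r = r_mu, p_r = 0."""
    r_mu = equilibrium_radius(mu, m)
    return mu / (m * r_mu**2)


if __name__ == "__main__":
    for mu in (0.5, 1.0, 2.0):
        r_mu = equilibrium_radius(mu)
        print(mu, r_mu, effective_potential_derivative(r_mu, mu), angular_velocity(mu))
```

For $m = 1$ the angular velocity reduces to $1/\mu^3$, so orbits with larger angular momentum sit further out and rotate more slowly.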
Since $r = r_\mu$ and $p_r = 0$ is an equilibrium of the reduced system, it is natural to substitute these values into the original equation (1.4). The two remaining equations are then
$$\dot\phi = \frac{\mu}{mr_\mu^2}, \qquad \dot{p}_\phi = 0.$$
This describes a simple periodic orbit in the original phase space: {r,4>,pr,p
1.5 Lie group actions
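These notes assume the reader has a basic knowledge of actions of Lie groups on manifolds. Here I recall a few basic formulae and properties that are used. A useful reference is the new book by Chossat and Lauterbach [5].

Let $G$ be a Lie group acting smoothly on a manifold $\mathcal{P}$, and let $\mathfrak{g}$ be its Lie algebra. We denote this action by $(g, x) \mapsto g \cdot x$. The orbit through $x$ is $G \cdot x = \{g \cdot x \mid g \in G\}$. To each element $\xi \in \mathfrak{g}$ there is associated a vector field on $\mathcal{P}$ which we denote $\xi_{\mathcal{P}}$. It is defined as follows:
$$\xi_{\mathcal{P}}(x) = \frac{d}{dt}\Big|_{t=0}\left(\exp(t\xi) \cdot x\right).$$
The tangent space to the group orbit at $x$ is then $\mathfrak{g} \cdot x = \{\xi_{\mathcal{P}}(x) \mid \xi \in \mathfrak{g}\}$. A simple calculation relates the vector field at $x$ with its image at $g \cdot x$:
$$dg_x\, \xi_{\mathcal{P}}(x) = (\mathrm{Ad}_g\, \xi)_{\mathcal{P}}(g \cdot x), \tag{1.6}$$
where $\mathrm{Ad}_g\, \xi$ is the adjoint action of $g$ on $\xi$, which in the case of matrix groups is just $\mathrm{Ad}_g\, \xi = g\xi g^{-1}$. The adjoint representation of $\mathfrak{g}$ on $\mathfrak{g}$ is the infinitesimal version obtained by differentiating the adjoint action of $G$:
$$\mathrm{ad}_\xi\, \eta = \frac{d}{dt}\Big|_{t=0} \mathrm{Ad}_{\exp(t\xi)}\, \eta = [\xi, \eta].$$
Dual to the adjoint action on $\mathfrak{g}$ is the coadjoint action on $\mathfrak{g}^*$:
$$\langle \mathrm{Coad}_g\, \mu, \eta\rangle := \langle \mu, \mathrm{Ad}_{g^{-1}}\, \eta\rangle, \tag{1.7}$$
and similarly there is the infinitesimal version,
$$\langle \mathrm{coad}_\xi\, \mu, \eta\rangle := \langle \mu, \mathrm{ad}_{-\xi}\, \eta\rangle = \langle \mu, [\eta, \xi]\rangle. \tag{1.8}$$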
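For a matrix group these formulae are easy to test numerically. The sketch below does so for $G = SO(3)$, under the assumption (used later in the notes) that $\mathfrak{so}(3)$ is identified with $\mathbf{R}^3$ by sending $b$ to the skew-symmetric matrix $\hat b$ with $\hat b u = b \times u$. It checks that $\mathrm{Ad}_g \hat b = g\hat b g^{-1}$ corresponds to the rotated vector $gb$, that the matrix commutator corresponds to the cross product, and that a finite-difference derivative of $\mathrm{Ad}_{\exp(t\xi)}\eta$ reproduces $[\xi,\eta]$. All helper names are hypothetical and the sample matrices are arbitrary.

```python
import math


def hat(b):
    """Skew-symmetric matrix with hat(b) u = b x u."""
    x, y, z = b
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def adjoint(g, xi):
    """Ad_g xi = g xi g^{-1}; for a rotation matrix g^{-1} is the transpose."""
    return matmul(matmul(g, xi), transpose(g))


def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(3)] for i in range(3)]


def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))


if __name__ == "__main__":
    g = matmul(rot_z(0.7), rot_x(-1.2))
    a, b = [1.0, 0.5, -0.25], [0.3, -1.1, 2.0]
    print("Ad_g hat(b) vs hat(g b):", max_abs_diff(adjoint(g, hat(b)), hat(matvec(g, b))))
    print("[hat a, hat b] vs hat(a x b):", max_abs_diff(bracket(hat(a), hat(b)), hat(cross(a, b))))
    # exp(t hat(e_z)) is the rotation rot_z(t); differentiate Ad at t = 0 numerically.
    t = 1e-6
    moved = adjoint(rot_z(t), hat(b))
    derivative = [[(moved[i][j] - hat(b)[i][j]) / t for j in range(3)] for i in range(3)]
    print("d/dt Ad vs bracket:", max_abs_diff(derivative, bracket(hat([0.0, 0.0, 1.0]), hat(b))))
```

Under the same identification, and pairing $\mathbf{R}^3$ with itself by the dot product, formula (1.7) says that the coadjoint action of a rotation is again the rotation itself.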
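Examples 2.5 describe the coadjoint actions for the groups SO(3), SE(2) and SL(2).

Given $x \in \mathcal{P}$, the isotropy subgroup of $x$ is
$$G_x = \{g \in G \mid g \cdot x = x\}.$$
The Lie algebra $\mathfrak{g}_x$ of $G_x$ consists of those $\xi \in \mathfrak{g}$ for which $\xi_{\mathcal{P}}(x) = 0$, and the fixed point set of a subgroup $K$ of $G$,
$$\mathrm{Fix}(K,\mathcal{P}) = \{x \in \mathcal{P} \mid K \cdot x = x\},$$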
consists of those points whose isotropy subgroup contains $K$. It is not hard to show that it is a submanifold of $\mathcal{P}$. Moreover, those points with isotropy precisely $K$ form an open (possibly empty) subset of $\mathrm{Fix}(K,\mathcal{P})$.

Stratification by orbit type. If $G$ acts on a manifold $\mathcal{P}$, then the orbit space $\mathcal{P}/G$ is smooth at points where $G_p$ is trivial, and more generally where the orbit type in a neighbourhood of $p$ is constant. For each subgroup $H$ of $G$ one defines the orbit type stratum $\mathcal{P}_{(H)}$ to be the set of points $p$ for which $G_p$ is conjugate to $H$. This is a union of $G$-orbits, and its image in $\mathcal{P}/G$ is also called the orbit type stratum (now in the orbit space). These orbit type strata are submanifolds of $\mathcal{P}$ and $\mathcal{P}/G$, and they fit together to form a locally trivial stratification (i.e. locally it has a product structure). For dynamical systems, the importance of this partition into orbit type strata is that for an equivariant vector field, the strata are preserved by the dynamics.

Slice to a group action. A slice to a group action at $x \in \mathcal{P}$ is a submanifold of $\mathcal{P}$ which is transverse to the orbit through $x$ and of complementary dimension. If possible, the slice is chosen to be invariant under the isotropy subgroup $G_x$ (this is always possible if $G_x$ is compact). A basic result of the theory of Lie group actions is that under the orbit map $\mathcal{P} \to \mathcal{P}/G$ the slice projects to a neighbourhood of the image of $G \cdot x$ in the orbit space.

Principle of symmetric criticality. This principle is the variational version of discrete reduction, and provides a useful method for finding critical points of invariant functions. It states that, if $G$ acts on a manifold $\mathcal{P}$, and if $f : \mathcal{P} \to \mathbf{R}$ is a smooth invariant function, then $x \in \mathrm{Fix}(G,\mathcal{P})$ is a critical point of $f$ if and only if it is a critical point of the restriction $f|_{\mathrm{Fix}(G,\mathcal{P})}$ of $f$ to $\mathrm{Fix}(G,\mathcal{P})$. One proof is to use an invariant Riemannian metric to define an equivariant vector field $\nabla f$, which, being equivariant, is tangent to $\mathrm{Fix}(G,\mathcal{P})$. For a full proof, valid also in infinite dimensions, see [58].
2 Noether's Theorem and the Momentum Map
The purpose of this section is to bring together facts about symmetry and conserved quantities that are useful for studying bifurcations. They are all found in various places, more or less explicitly, but not together in a single source. Furthermore, there appear to be misconceptions about whether "non-equivariant" momentum maps cause extra problems. Essentially, anything true for the equivariant ones remains true for non-equivariant ones (which are in fact equivariant, but for a modified action, as we shall see below). Many of the details of this chapter can be found in the books [16,8,10,13].

Examples 2.1 We will be using 3 examples of symplectic group actions in this section to illustrate various points. These arise in the models of systems of point vortices.

(A) $G = SO(3)$ acts by rotations on the sphere $\mathcal{P} = S^2$, with symplectic form given by the usual area form with total area $4\pi$ (for example, in spherical polars, $\omega = \sin\theta\, d\theta \wedge d\phi$). The Lie algebra $\mathfrak{so}(3)$ can be represented by skew-symmetric matrices, and the vector field corresponding to the skew-symmetric matrix $\xi$ is simply $x \mapsto \xi x$.

(B) $G = SE(2)$ acts on the plane $\mathcal{P} = \mathbf{R}^2$, with its usual symplectic form $\omega = dx \wedge dy$. This group acts by translations and rotations; indeed, $SE(2) \simeq \mathbf{R}^2 \rtimes SO(2)$ (semidirect product), where $\mathbf{R}^2$ is the normal subgroup of translations of the plane, and $SO(2)$ is the group of rotations about some point, e.g. the origin. The Lie algebra $\mathfrak{se}(2)$ is represented by constant vector fields (corresponding to the translation subgroup) and by infinitesimal rotations.

(C) $G = SL(2) = SL(2,\mathbf{R})$ acts by isometries on the hyperbolic plane $\mathcal{P} = \mathcal{H}$. There are several ways to realize this action, of which perhaps the best-known is to use Möbius transformations on the upper-half plane. However, we will use one that is more in keeping with the others, which is to represent the hyperbolic plane as one sheet of a 2-sheeted hyperboloid in $\mathbf{R}^3$:
$$\mathcal{H} = \{(x,y,z) \in \mathbf{R}^3 \mid z^2 - x^2 - y^2 = 1,\ z > 0\}. \tag{2.1}$$
(The hyperbolic metric on $\mathcal{H}$ is induced from the Minkowski metric $dx^2 + dy^2 - dz^2$ on $\mathbf{R}^3$.) The symplectic form on $\mathcal{H}$ is given by
$$\omega_x(u,v) = -\frac{1}{\|x\|^2}\, R(x) \cdot u \wedge v, \tag{2.2}$$
where $R(x,y,z) = (x,y,-z)$ and $u, v \in T_x\mathcal{H}$. One way to realize the action of $SL(2)$ on $\mathcal{H}$ is to embed $\mathcal{H}$ into the set of $2 \times 2$ trace zero matrices $\mathfrak{sl}_2(\mathbf{R})$:
x« ( ; ) - * = ( , ! ,
»_+/)■
(,s>
Then $A \cdot x = A\hat{x}A^{-1}$, for $A \in SL(2)$. Note that the image of the embedding consists of those matrices $X$ of trace zero, unit determinant and such that $X_{12} > X_{21}$. Under this identification, the symplectic form (2.2) becomes the Kostant-Kirillov-Souriau symplectic form on the coadjoint orbit (see Example 2.5(C)). □
2.1 Noether's theorem
With such a set-up, the famous theorem of Emmy Noether states that any 1-parameter group of symmetries is associated to a conserved quantity for the dynamics. In fact one needs some hypothesis, such as the phase space being simply connected or the group being semisimple (see [8] for details). For example, the circle group acting on the torus does not produce a globally well-defined conserved quantity.

How do these conserved quantities come about? We already have a procedure for passing from Hamiltonian function to Hamiltonian vector field, and here we apply the reverse procedure. For each $\xi \in \mathfrak{g}$ let $\phi_\xi : \mathcal{P} \to \mathbf{R}$ be a function that satisfies Hamilton's equation
$$d\phi_\xi = \omega(-, \xi_{\mathcal{P}}), \tag{2.4}$$
if such a function exists; that is, $\xi_{\mathcal{P}}$ is the Hamiltonian vector field of $\phi_\xi$ in the sense of (1.1). Of course each of these $\phi_\xi$ is only defined up to a constant, since only $d\phi_\xi$ is determined.

Such functions are known as momentum functions, and a symplectic action for which such momentum functions exist is said to be a Hamiltonian action. See [8] for conditions under which symplectic actions are Hamiltonian.

Theorem 2.2 (Noether) Consider a Hamiltonian action of the Lie group $G$ on the symplectic (or Poisson) manifold $\mathcal{P}$, and let $H$ be an invariant Hamiltonian. Then the flow of the Hamiltonian vector field leaves the momentum functions $\phi_\xi$ invariant.

PROOF: A simple algebraic computation:
$$X_H(\phi_\xi) = \{H, \phi_\xi\} = -\{\phi_\xi, H\} = -\xi_{\mathcal{P}}(H) = 0.$$
This last equation holds because $H$ is $G$-invariant. □
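The computation in the proof can be illustrated numerically in the simplest case. The sketch below is an illustration under stated assumptions, not part of the original notes: it takes the phase space $T^*\mathbf{R}^2$ with coordinates $(q_1,q_2,p_1,p_2)$, the canonical Poisson bracket with the sign convention of Section 1.1, a rotation-invariant Hamiltonian with $V(r) = -1/r$, and the angular momentum $q_1p_2 - q_2p_1$ as momentum function for rotations (up to the sign convention). Derivatives are approximated by central differences; the names `H_central`, `H_tilted` and the test point are hypothetical.

```python
import math


def partial(f, z, i, h=1e-5):
    """Central finite-difference approximation of the i-th partial derivative of f at z."""
    zp, zm = list(z), list(z)
    zp[i] += h
    zm[i] -= h
    return (f(zp) - f(zm)) / (2.0 * h)


def poisson_bracket(f, g, z):
    """{f, g} = sum_j (df/dp_j dg/dq_j - df/dq_j dg/dp_j), with z = (q_1..q_n, p_1..p_n)."""
    n = len(z) // 2
    total = 0.0
    for j in range(n):
        total += partial(f, z, n + j) * partial(g, z, j)
        total -= partial(f, z, j) * partial(g, z, n + j)
    return total


def H_central(z, m=1.0):
    """Rotation-invariant Hamiltonian |p|^2 / (2m) - 1/|q|."""
    q1, q2, p1, p2 = z
    return (p1**2 + p2**2) / (2.0 * m) - 1.0 / math.hypot(q1, q2)


def H_tilted(z):
    """The same Hamiltonian with a term q_1 added, which breaks rotational symmetry."""
    return H_central(z) + z[0]


def angular_momentum(z):
    q1, q2, p1, p2 = z
    return q1 * p2 - q2 * p1


if __name__ == "__main__":
    z = (0.8, -0.3, 0.2, 1.1)
    print("{H, phi} for invariant H:     ", poisson_bracket(H_central, angular_momentum, z))
    print("{H, phi} for non-invariant H: ", poisson_bracket(H_tilted, angular_momentum, z))
```

For the invariant Hamiltonian the bracket vanishes up to discretization error, while for the perturbed one it equals $q_2$ at the chosen point, so the angular momentum is no longer conserved.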
Momentum map. We leave questions of dynamics now, and consider the structure of the set of momentum functions. The first observation is that the momentum functions $\phi_\xi$ can be chosen to depend linearly on $\xi$. For example, let $\{\xi_1, \dots, \xi_d\}$ be a basis for $\mathfrak{g}$, and let $\phi_{\xi_1}, \dots, \phi_{\xi_d}$ be Hamiltonian functions for the associated vector fields. Then for $\xi = a_1\xi_1 + \cdots + a_d\xi_d$, one can put $\phi_\xi = a_1\phi_{\xi_1} + \cdots + a_d\phi_{\xi_d}$. It is easy to check that such $\phi_\xi$ satisfy the necessary equation (2.4). Thus, for each point $p \in \mathcal{P}$ we have a linear functional $\xi \mapsto \phi_\xi(p)$, which we call $\Phi(p)$, so that (in coordinates with respect to the dual basis)
$$\Phi(p) = \left(\phi_{\xi_1}(p), \dots, \phi_{\xi_d}(p)\right)$$
is a map $\Phi : \mathcal{P} \to \mathfrak{g}^*$, where $\mathfrak{g}^*$ is the vector space dual of the Lie algebra $\mathfrak{g}$. Such a map is called a momentum map. The defining equation for the momentum map is
$$\langle d\Phi_p(v), \xi\rangle = \omega_p(v, \xi_{\mathcal{P}}(p)), \tag{2.5}$$
for all $p \in \mathcal{P}$, all $v \in T_p\mathcal{P}$ and all $\xi \in \mathfrak{g}$. An immediate and important consequence of (2.5) is:
$$\ker(d\Phi_p) = (\mathfrak{g} \cdot p)^\omega, \tag{2.6}$$
$$\mathrm{im}(d\Phi_p) = \mathfrak{g}_p^\circ \subset \mathfrak{g}^*. \tag{2.7}$$
Here, $U^\omega$ is the linear space that is orthogonal to $U$ with respect to the symplectic form, and $\mathfrak{g}_p^\circ$ is the annihilator of $\mathfrak{g}_p$ in $\mathfrak{g}^*$. In particular, the momentum map is a submersion in a neighbourhood of any point where the action is free (or locally free: $\mathfrak{g}_p = 0$).

The fact that $\Phi$ is only determined by the differential condition (2.5) means that it is only defined up to a constant. It follows that if $\Phi$ is a momentum map, then $\Phi_1$ is a momentum map if and only if there is some $C \in \mathfrak{g}^*$ for which, for all $p \in \mathcal{P}$,
$$\Phi_1(p) = \Phi(p) + C.$$
We return to the possibility of choosing a different $\Phi$ below.

Examples 2.3 Here we see momentum maps for three symplectic actions that arise for the point-vortex models, firstly for point vortices on the sphere, secondly for those in the plane and thirdly on the hyperbolic plane. We also give the general formula for momentum maps for cotangent actions.

(A) Let $G = SO(3)$ act diagonally on the product $\mathcal{P} = S^2 \times \dots \times S^2$ ($N$ copies). On the $j$-th factor we put the symplectic form $\omega_j = \lambda_j\omega_0$, where $\omega_0$ is the canonical area form on the unit sphere ($\int_{S^2}\omega_0 = 4\pi$). Then a momentum map is given by
$$\Phi(x_1, \dots, x_N) = \sum_j \lambda_j x_j, \tag{2.8}$$
where the Lie algebra $\mathfrak{so}(3)$ (consisting of skew-symmetric $3 \times 3$ matrices) is identified with $\mathbf{R}^3$ in the "usual way": $B \in \mathfrak{so}(3)$ corresponds to $b \in \mathbf{R}^3$ satisfying $Bu = b \times u$ for all vectors $u \in \mathbf{R}^3$. It is clear that this momentum map is equivariant with respect to the coadjoint action, which under this identification becomes the usual action of $SO(3)$ on $\mathbf{R}^3$.

(B) An analogous example is $G = SE(2)$ acting on a product of $N$ planes, with symplectic form $\omega = \oplus_j \lambda_j\omega_j$, where $\omega_j$ is the standard symplectic form on the $j$-th plane. If we write $SE(2)$ as $\mathbf{R}^2 \rtimes SO(2)$ (semidirect product), then the natural momentum map is
$$\Phi(x_1, \dots, x_N) = \left(\sum_j \lambda_j Jx_j,\ \tfrac{1}{2}\sum_j \lambda_j |x_j|^2\right), \tag{2.9}$$
where $J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ is the matrix for rotation through $\pi/2$.

(C) A further analogous example is $G = SL(2)$ acting on a product of $N$ copies of the hyperbolic plane, $\mathcal{P} = \mathcal{H}^N$, with symplectic form $\omega = \oplus_j \lambda_j\omega_j$, where $\omega_j$ is the standard symplectic form on the $j$-th plane, see (2.2). The natural momentum map is given by
$$\Phi(x_1, \dots, x_N) = \sum_j \lambda_j x_j. \tag{2.10}$$
□

Remark 2.4 Consider any group acting on a manifold $X$, the configuration space. Classical mechanics of the "kinetic + potential" type takes place on the cotangent bundle of a configuration space, and in this setting the given action on $X$ induces a symplectic action on the cotangent bundle, by the formula
$$g \cdot (x,p) = \left(g \cdot x,\ (dg_x)^{-T}p\right),$$
where $A^{-T}$ is the inverse transpose of the operator $A$. Such actions are called cotangent actions or cotangent lifts and they always preserve the canonical symplectic form $\omega$ on the cotangent bundle. The momentum map for cotangent actions always exists, and is given by
$$\langle \Phi(x,p), \xi\rangle = \langle p, \xi_X(x)\rangle,$$
where the pairing on the right is between $T_x^*X$ and $T_xX$. For example, if $\mathcal{P} = T^*\mathbf{R}^3$ is the phase space for a central force problem, which has $SO(3)$ symmetry, then after identifying $\mathfrak{so}(3)^*$ with $\mathbf{R}^3$ as above, the momentum map $\Phi : \mathcal{P} \to \mathfrak{so}(3)^*$ is just the angular momentum.

2.2 Equivariance of the momentum map
A natural question arises: since a momentum map $\Phi : \mathcal{P} \to \mathfrak{g}^*$ is defined on a space $\mathcal{P}$ with an action of the group $G$, is there an action of $G$ on $\mathfrak{g}^*$ for which $\Phi$ commutes with (intertwines) the two actions? The answer was given in the affirmative by Souriau [16]. Usually, but not always, this turns out to be the coadjoint action of $G$ on $\mathfrak{g}^*$, and before stating Souriau's result we give some examples of coadjoint actions.
252
J. Montaldi
Examples 2.5 The coadjoint actions for the groups described in Examples 2.1 are as follows.

(A) For $G = SO(3)$, the Lie algebra $\mathfrak{g} = \mathfrak{so}(3)$ consists of the $3 \times 3$ skew-symmetric matrices, and via the inner product $\langle A, B\rangle = \mathrm{tr}(AB^T)$ we can identify the dual $\mathfrak{g}^*$ with $\mathfrak{g}$. An easy computation shows that the coadjoint action is given by $\mathrm{Coad}_A\,\mu = A\mu A^T$. With the usual identification of $\mathfrak{so}(3)$ with $\mathbb{R}^3$ (see Example 2.3) the coadjoint action is just the usual action of SO(3) by rotations, so that the orbits for the coadjoint action are the spheres centered at the origin, and the origin itself. The isotropy subgroup is a circle subgroup SO(2) of SO(3) for points on the spheres, and all of SO(3) for the origin. This variation of the symmetry type of the orbits has interesting repercussions for the dynamics, and in particular for the families of relative equilibria.

(B) Consider now the 3-dimensional non-compact group $G = SE(2)$ of Euclidean motions of the plane. As a group this is a semidirect product $\mathbb{R}^2 \rtimes SO(2)$, where $\mathbb{R}^2$ acts by translations and SO(2) by rotations about some given point (the "origin"). Elements $(u, R) \in \mathbb{R}^2 \rtimes SO(2)$ can be identified with the matrices
\[ \begin{pmatrix} R & u \\ 0 & 1 \end{pmatrix} \]
by introducing homogeneous coordinates. A calculation shows that the adjoint action is $\mathrm{Ad}_{(u,R)}(v, B) = (Rv - Bu,\ B)$, since $B$ and $R$ commute. The coadjoint action is then
\[ \mathrm{Coad}_{(u,R)}(\nu, \psi) = (R\nu,\ \psi + R\nu u^T). \tag{2.11} \]
Note that since elements of $\mathfrak{so}(2)$ are skew-symmetric, only the skew-symmetric part of $\psi + R\nu u^T$ is relevant. One can therefore replace $\psi + R\nu u^T$ by $\psi + \tfrac12(R\nu u^T - u\nu^T R^T)$. A nice representation of this using complex numbers is given in Section 6.

The coadjoint orbits are again of two types: first, the cylinders with axis along the 1-dimensional subspace of $\mathfrak{g}^*$ which annihilates the translation subgroup (or its subalgebra), and secondly the individual points on that axis; that is, the points of the form $(0, \psi)$. In this case the two types of isotropy subgroup for the coadjoint action are, firstly, the translations in a given direction (orthogonal to $\nu$), so isomorphic to $\mathbb{R}$, or, in the case of a single point on the axis, the whole group. In both cases the isotropy subgroup is non-compact, a fact to be contrasted with the modified coadjoint action to be defined below.

Figure 2. Coadjoint orbits for SO(3) and for SE(2)

(C) For $G = SL(2)$, let $A \in SL(2)$ and $\mu \in \mathfrak{sl}(2)^*$; then
\[ \mathrm{Coad}_A\,\mu = A^{-T}\mu A^T. \]
Notice that the determinant of the matrix in $\mathfrak{sl}(2)^*$ is constant on each coadjoint orbit. In fact the coadjoint orbits are of 4 types: (i) the 1-sheeted hyperboloids (where $G_\mu \simeq \mathbb{R}$), (ii) one sheet of the 2-sheeted hyperboloids (where $G_\mu \simeq SO(2)$), (iii) each sheet of the cone with the origin removed (where $G_\mu \simeq \mathbb{R}$), and (iv) the origin itself (where $G_\mu = SL(2)$). □

It is easy to see that the momentum maps given in Examples 2.3(A,C) are equivariant with respect to the coadjoint actions described above, which is not surprising in the light of the theorem below. However, this is not always true in the case of $G = SE(2)$. To describe the action that makes $\Phi$ equivariant we follow Souriau and define the cocycle
\[ \theta : G \to \mathfrak{g}^*, \qquad g \mapsto \Phi(g\cdot x) - \mathrm{Coad}_g\,\Phi(x). \tag{2.12} \]
It is of course necessary to show that this expression is independent of $x$, which it is provided $V$ is connected. We leave the details to the reader: it suffices to differentiate with respect to $x$ and use the invariance of the symplectic form. The map $\theta$ defined above allows one to define a modified coadjoint action, by
\[ \mathrm{Coad}^\theta_g\,\mu := \mathrm{Coad}_g\,\mu + \theta(g). \tag{2.13} \]
A short calculation shows that this is indeed an action. Moreover, this action is by affine transformations whose underlying linear transformations are the coadjoint action.
Theorem 2.6 (Souriau) Let the Lie group $G$ act on the connected symplectic manifold $V$ in such a way that there is a momentum map $\Phi : V \to \mathfrak{g}^*$. Let $\theta : G \to \mathfrak{g}^*$ be defined by (2.12). Then $\Phi$ is equivariant with respect to the modified action on $\mathfrak{g}^*$:
\[ \Phi(g\cdot x) = \mathrm{Coad}^\theta_g\,\Phi(x). \]
Furthermore, if $G$ is either semisimple or compact then the momentum map can be chosen so that $\theta = 0$.

For proofs see [16] or [8]; the first proof of equivariance in the compact case appears to be in [48].

Examples 2.7 In Examples 2.3 we gave the momentum maps for the three point-vortex models, and pointed out that for SO(3) and SL(2) the momentum map is equivariant with respect to the usual coadjoint action (not surprisingly in view of the theorem above, since SO(3) is compact and SL(2) is semisimple). However, this is not always true in the planar case:

(B) Consider the momentum map given in Example 2.3(B) for the point vortex model in the plane:
\[ \Phi(x_1,\dots,x_N) = \Bigl(\sum_j \lambda_j Jx_j,\ \tfrac12\sum_j \lambda_j |x_j|^2\Bigr). \]
To find the action that makes this momentum map equivariant, we compute
\[
\begin{aligned}
\Phi(Ax_1+u,\dots,Ax_N+u) &= \Bigl(A\sum_j \lambda_j Jx_j,\ \tfrac12\sum_j \lambda_j|x_j|^2 + \sum_j \lambda_j\, Ax_j\cdot u\Bigr) + \Lambda\bigl(Ju,\ \tfrac12|u|^2\bigr) \\
&= \mathrm{Coad}_{(u,A)}\,\Phi(x_1,\dots,x_N) + \Lambda\bigl(Ju,\ \tfrac12|u|^2\bigr),
\end{aligned}
\tag{2.14}
\]
where $\Lambda = \sum_j \lambda_j \in \mathbb{R}$, and Coad is given in (2.11). The cocycle associated to this momentum map is thus given by $\theta(u,A) = \Lambda\bigl(Ju,\ \tfrac12|u|^2\bigr)$. If $\Lambda = 0$ then $\Phi$ is equivariant with respect to the usual coadjoint action, while if $\Lambda \neq 0$ it is equivariant with respect to a modified coadjoint action. Furthermore, one can show that in this latter case there is no constant vector $C \in \mathfrak{g}^*$ for which $\Phi + C$ is coadjoint-equivariant. Indeed, it is enough to see that the orbits for this modified coadjoint action are in fact paraboloids, with axis the annihilator in $\mathfrak{g}^*$ of $\mathbb{R}^2 \subset \mathfrak{g}$, and these are not translates of the coadjoint orbits, which are either cylinders or points. See Figure 3. Furthermore, a short calculation shows that the isotropy subgroups for this action are all compact: $G_\mu \simeq SO(2)$ for all $\mu \in \mathfrak{se}(2)^*$, which is quite different from the coadjoint action. □
Figure 3. Modified coadjoint orbits for SE(2)
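The computation behind (2.14) can be checked numerically. The following is a minimal sketch rather than part of the original notes: it assumes the planar momentum map has the form $\Phi(x) = (\sum_j \lambda_j Jx_j,\ \tfrac12\sum_j\lambda_j|x_j|^2)$, takes $J$ to be rotation by a quarter turn (an assumed convention), and uses hypothetical helper names. It also checks that $\psi - |\nu|^2/(2\Lambda)$ is unchanged by a Euclidean motion when $\Lambda \neq 0$, which is one way to see the paraboloid orbits.

```python
import math

def J(v):
    """Rotate a plane vector by a quarter turn (assumed convention for J)."""
    return (-v[1], v[0])

def rotate(angle, v):
    """Apply the rotation A in SO(2) with the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def momentum(points, strengths):
    """Assumed form: Phi(x) = (sum lam_j J x_j, 1/2 sum lam_j |x_j|^2)."""
    nu0 = nu1 = psi = 0.0
    for p, lam in zip(points, strengths):
        jp = J(p)
        nu0 += lam * jp[0]
        nu1 += lam * jp[1]
        psi += 0.5 * lam * (p[0] ** 2 + p[1] ** 2)
    return (nu0, nu1), psi

def rhs_2_14(points, strengths, angle, u):
    """First line of (2.14), written out term by term."""
    big_lambda = sum(strengths)
    nu, psi = momentum(points, strengths)
    a_nu = rotate(angle, nu)
    mixed = 0.0
    for p, lam in zip(points, strengths):
        ap = rotate(angle, p)
        mixed += lam * (ap[0] * u[0] + ap[1] * u[1])
    ju = J(u)
    nu_out = (a_nu[0] + big_lambda * ju[0], a_nu[1] + big_lambda * ju[1])
    psi_out = psi + mixed + 0.5 * big_lambda * (u[0] ** 2 + u[1] ** 2)
    return nu_out, psi_out

def paraboloid_invariant(points, strengths):
    """psi - |nu|^2 / (2 Lambda); requires Lambda != 0."""
    nu, psi = momentum(points, strengths)
    return psi - (nu[0] ** 2 + nu[1] ** 2) / (2.0 * sum(strengths))

# Illustrative data (arbitrary): three vortices with total strength 2.5.
points = [(1.0, 0.0), (-0.5, 2.0), (0.3, -1.2)]
strengths = [1.0, 2.0, -0.5]
angle, u = 0.7, (0.4, -1.1)

moved = []
for p in points:
    ap = rotate(angle, p)
    moved.append((ap[0] + u[0], ap[1] + u[1]))

lhs = momentum(moved, strengths)
rhs = rhs_2_14(points, strengths, angle, u)
```

Comparing `lhs` with `rhs`, and `paraboloid_invariant(moved, strengths)` with `paraboloid_invariant(points, strengths)`, gives agreement up to rounding error.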
2.3 Reduction
Since by Noether's theorem the dynamics preserve the level sets of the momentum map $\Phi$, it makes sense to treat the dynamical problem one level set at a time. However, these level sets are not in general symplectic submanifolds, and the system on them is therefore not a Hamiltonian system. It turns out, though, that if one passes to the orbit space of one of these level sets of $\Phi$, then the resulting reduced space is symplectic, and the induced dynamics are Hamiltonian. Historically, this process of reduction was first used by Jacobi, in what is called "elimination of the nodes". However, its systematic treatment is much more recent, and is due to Meyer [47] and independently to Marsden and Weinstein [43].

As usual suppose the Lie group $G$ acts in a Hamiltonian fashion on the symplectic manifold $V$, and let $\Phi : V \to \mathfrak{g}^*$ be a momentum map which is equivariant with respect to a (possibly modified) coadjoint action as discussed above. Consider a value $\mu \in \mathfrak{g}^*$ of the momentum map. Then since $\Phi$ is equivariant, the isotropy subgroup $G_\mu$ of this modified coadjoint action acts on the level set $\Phi^{-1}(\mu)$. Define the reduced space to be
\[ V_\mu := \Phi^{-1}(\mu)/G_\mu. \tag{2.15} \]
That is, two points of $\Phi^{-1}(\mu)$ are identified if and only if they lie in the same group orbit. This defines $V_\mu$ as a set, but to do dynamics one needs to give it more structure. If the group $G$ acts freely near $p$ (i.e. $G_p$ is trivial), then $V_\mu$ is a smooth manifold near the image of $p$ in $V_\mu$. This is for two reasons: firstly, $\Phi$ is a submersion near $p$ by (2.7), and secondly, the orbit space by $G_\mu$ will have no singularities. This is the case of regular reduction. We discuss singular reduction briefly below.

It is important to know whether the induced dynamics on the reduced spaces are also Hamiltonian. The answer of course is "yes". To see this it is necessary to define the symplectic form $\omega_\mu$ on $V_\mu$. Let $\bar u, \bar v \in T_{\bar p}V_\mu$ be the projections of $u, v \in T_p\Phi^{-1}(\mu) \subset T_pV$, and define
\[ \omega_\mu(\bar u, \bar v) := \omega(u, v). \]
Of course, one must show this to be well defined, non-degenerate and closed: exercises left to the reader! The Hamiltonian is an invariant function on $V$, so its restriction to $\Phi^{-1}(\mu)$ is $G_\mu$-invariant and so induces a well-defined function on $V_\mu$, denoted $H_\mu$. The dynamics induced on the reduced space is then determined by a vector field $X_\mu$ which satisfies Hamilton's equation: $dH_\mu = \omega_\mu(X_\mu, -)$.

The orbit momentum map. It is clear that $V_\mu = \Phi^{-1}(\mu)/G_\mu$ can also be defined by $V_\mu = \Phi^{-1}(\mathcal{O}_\mu)/G$, where $\mathcal{O}_\mu$ is the coadjoint orbit through $\mu$. Since $\Phi^{-1}(\mathcal{O}_\mu)/G \subset V/G$ it is natural to use the orbit momentum map $\varphi : V/G \to \mathfrak{g}^*/G$, defined by the commutative diagram
\[
\begin{array}{ccc}
V & \xrightarrow{\ \Phi\ } & \mathfrak{g}^* \\
\downarrow & & \downarrow \\
V/G & \xrightarrow{\ \varphi\ } & \mathfrak{g}^*/G
\end{array}
\tag{2.16}
\]
where the vertical arrows are the quotient maps. Then $V_\mu = \varphi^{-1}(\mathcal{O}_\mu)$. This is very useful for studying bifurcations as the momentum value varies. However, it may not be so useful if $G_\mu$ is not compact, for there the orbit space $\mathfrak{g}^*/G$ is not in general a reasonable space (it is not even Hausdorff, for example, for the coadjoint action of SE(2) near the $\psi$-axis).
2.4 Singular reduction
If the action of $G$ on $V$ is not free, then the reduced space is no longer a manifold. However, it was shown by Sjamaar and Lerman [67] that if $G$ is compact then the reduced space can be stratified (that is, decomposed into finitely many submanifolds which fit together in a nice way), and each stratum has a symplectic form which determines the dynamics on that stratum. In fact these strata are simply the sets of constant orbit type in $\Phi^{-1}(\mu)$, or rather their images in $V_\mu$. For the case of a proper action of a non-compact group see [20]. The theorem follows from the local normal form of Marle and Guillemin-Sternberg [8,42].
2.5 Symplectic slice and the reduced space
Recall that a slice to a group action at a point $p \in V$ is a submanifold $S$ through $p$ satisfying $T_pS \oplus \mathfrak{g}\cdot p = T_pV$. If $G_p$ is compact, $S$ can be chosen to be invariant under $G_p$. The slice, or more precisely $S/G_p$, provides a local model for the orbit space $V/G$. In the symplectic world, one needs to take into account the symplectic structure, and one wants the "symplectic slice" to provide a local model for the reduced space $V_\mu$. In the case of free (or locally free) actions this is fairly straightforward, but in the general case it is more delicate because the momentum map is singular. For this reason, the symplectic slice is usually taken to be a subspace of $T_pV$ rather than a submanifold of $V$.

Definition 2.8 Suppose $G_p$ is compact. Define $N \subset T_pV$ to be a $G_p$-invariant subspace satisfying $T_pV = N \oplus \mathfrak{g}\cdot p$. The symplectic slice is then
\[ N_1 := N \cap \ker(d\Phi_p). \]
It follows from the implicit function theorem that local coordinates can be chosen that identify a transversal to $G_\mu\cdot p$ within the possibly singular set $\Phi^{-1}(\mu)$ with a subset of the symplectic slice $N_1$.

3 Relative Equilibria
An equilibrium point is a point in the phase space that is invariant under the dynamics: $p \in V$ for which $X_H(p) = 0$, or equivalently $dH_p = 0$. One way to define a relative equilibrium is as a group orbit that is invariant under the dynamics. Although geometrically appealing, this is not the most physically transparent definition.

Definition 3.1 A relative equilibrium is a trajectory $\gamma(t)$ in $V$ such that for each $t \in \mathbb{R}$ there is a symmetry transformation $g_t \in G$ for which $\gamma(t) = g_t\cdot\gamma(0)$.

In other words, the trajectory is contained in a single group orbit. It is clear that if a group orbit is invariant under the dynamics, then all the trajectories in it are relative equilibria; and conversely, if $\gamma(t)$ is the trajectory through $p$, then $g\cdot\gamma(t)$ is the trajectory through $g\cdot p$ and consequently the entire group orbit is invariant, as claimed above. For $N$-body problems in space, relative equilibria are simply motions where the shape of the body does not change, and such motions are always rigid rotations about some axis.

Proposition 3.2 Let $\Phi$ be a momentum map for the $G$-action on $V$ and let $H$ be a $G$-invariant Hamiltonian on $V$. Let $p \in V$ and let $\mu = \Phi(p)$. Then
the following are equivalent:
1. The trajectory $\gamma(t)$ through $p$ is a relative equilibrium.
2. The group orbit $G\cdot p$ is invariant under the dynamics.
3. $\exists\,\xi \in \mathfrak{g}$ such that $\gamma(t) = \exp(t\xi)\cdot p$ for all $t \in \mathbb{R}$.
4. $\exists\,\xi \in \mathfrak{g}$ such that $p$ is a critical point of $H_\xi = H - \phi_\xi$.
5. $p$ is a critical point of the restriction of $H$ to the level set $\Phi^{-1}(\mu)$.

Remarks 3.3 (i) The vector $\xi$ appearing in (3) is the angular velocity of the relative equilibrium. It is the same as the vector $\xi$ appearing in (4). The angular velocity is only unique if the action is locally free at $p$; in general it is well-defined modulo $\mathfrak{g}_p$.
(ii) If $\Phi^{-1}(\mu)$ is singular then it has a natural stratification (see §2.4), and condition (5) of the proposition should be interpreted as saying that $p$ is a stratified critical point; that is, all derivatives of $H$ along the stratum containing $p$ vanish at $p$.
(iii) Notice that (3) implies that relative equilibria cannot meander around a group orbit, but must move in a rather rigid fashion. It follows from this equation that the trajectory is in fact a dense linear winding on a torus, at least if $G$ is compact. The dimension of the torus is at most equal to the rank of the group $G$. For $G = SO(3)$, the rank is 1, and this means that any RE that is not an equilibrium is in fact a periodic orbit.

PROOF: The equivalence of (1) and (2) is outlined above.

Equivalence of (1) and (3): (3) $\Rightarrow$ (1) is clear. For the converse, since $\gamma(t)$ lies in $G\cdot p$, its derivative $X_H(p) = \dot\gamma(0)$ lies in the tangent space $\mathfrak{g}\cdot p = T_p(G\cdot p)$. Let $\xi \in \mathfrak{g}$ be such that $X_H(p) = \xi_V(p)$. Then by equivariance $X_H(g\cdot p) = (\mathrm{Ad}_g\,\xi)_V(g\cdot p)$, see (1.6). Let $g_t = \exp(t\xi)$. Then, since $\mathrm{Ad}_{g_t}\xi = \xi$,
\[ \frac{d}{dt}(g_t\cdot p) = \xi_V(g_t\cdot p) = X_H(g_t\cdot p). \]
That is, $t \mapsto g_t\cdot p$ is the unique solution through $p$.

Equivalence of (3) and (4): (3) is equivalent to $X_H(p) = \xi_V(p)$. Using the symplectic form, this is in turn equivalent to $dH(p) = d\phi_\xi(p)$.

Equivalence of (4) and (5): If $\Phi$ is submersive at $p$ then $\Phi^{-1}(\mu)$ is a submanifold of $V$ and this is just Lagrange multipliers, since $d\phi_\xi(p) = \langle d\Phi(p), \xi\rangle$. In the case that $\Phi$ is singular, the result follows from the principle of symmetric criticality (see the end of the Introduction for a statement). If $\Phi$ is singular at $p$, then by (2.7) $\mathfrak{g}_p \neq 0$. Consider the restriction of $H$ to $\mathrm{Fix}(G_p, V)$. By the
theorem of Sjamaar and Lerman, the set $\Phi^{-1}(\mu)$ is stratified by the subsets of constant orbit type (see §2.4). Consider then the set $V_{G_p}$ of points with isotropy precisely $G_p$ (this is an open subset of $\mathrm{Fix}(G_p, V)$ containing $p$), and restrict both $\Phi$ and $H$ to this submanifold. Now $\Phi$ restricted to $V_{G_p}$ is of constant rank, so that $\Phi^{-1}(\mu) \cap V_{G_p}$ is a submanifold. The result then follows as before, since by the principle of symmetric criticality $H$ restricted to $V_{G_p}$ has a critical point at $p$ if and only if $H$ has a critical point at $p$. □

One of the earliest systematic investigations of relative equilibria was in a paper of Riemann, where he classified all possible relative equilibria in a model of affine fluid flow that is now called the Riemann ellipsoid problem (or affine rigid body, or pseudo-rigid body); see [25] and [66] for details. The configuration space is the set of all $3 \times 3$ invertible matrices, and the symmetry group is $SO(3) \times SO(3)$. In that paper he identified the 6 conserved quantities and found geometric restrictions on the possible forms of relative equilibria, using the fact that the momentum is conserved and that the solutions are of the form given in (3) of the proposition above. In terms of general group actions, the geometric condition Riemann used is the following, which follows immediately from Proposition 3.2 above together with the conservation of momentum.

Corollary 3.4 Let $p \in V$ be a point of a relative equilibrium of angular velocity $\xi$, and let $\theta$ be the cocycle associated to the momentum map. Then $\mathrm{coad}^\theta_\xi\,\Phi(p) = 0$.

If $G$ is compact, so that the adjoint and coadjoint actions can be identified and $\theta = 0$ for a suitable choice of momentum map, this means that the angular velocity and momentum of a relative equilibrium commute. For example, for any system with symmetry SO(3) this means that at any relative equilibrium, the angular velocity and the value of the momentum are parallel.
Definition 3.5 A relative equilibrium through $p$ is said to be non-degenerate if the restriction of the Hessian $d^2H_\xi(p)$ to the symplectic slice $N_1$ is a non-degenerate quadratic form.

Definition 3.6 A point $\mu \in \mathfrak{g}^*$ is a regular point of the (modified) coadjoint action if in a neighbourhood of $\mu$ all the isotropy subgroups are conjugate.

Examples of regular points are: all points except the origin for the coadjoint action of SO(3), all points except the special axis for the coadjoint action of SE(2), and all points for the modified coadjoint action of SE(2) described in Example 2.7(B).

The following result, the first on the structure of the families of relative equilibria, was observed by V.I. Arnold in [2]. The proof is an application of
the implicit function theorem.

Theorem 3.7 (Arnold) Suppose that $p$ lies on a non-degenerate relative equilibrium, with $\mathfrak{g}_p = 0$ and $\mu = \Phi(p)$ a regular point of the (modified) coadjoint action. Then in a neighbourhood of $p$ there exists a smooth family of relative equilibria parametrized by $\mu \in \mathfrak{g}^*$.

Lyapounov stability. A compact invariant subset $S$ of phase space is said to be Lyapounov stable if "any motion that starts nearby remains nearby", or more precisely, if for every neighbourhood $U$ of $S$ there is another neighbourhood $U' \subset U$ such that every trajectory intersecting $U'$ is entirely contained in $U$. A compact subset is said to be Lyapounov stable relative to or modulo $G$ if the $U$ and $U'$ above are only required to be $G$-invariant subsets.

The principal tool for showing an equilibrium to be Lyapounov stable is Dirichlet's criterion, which is that if the equilibrium point is a non-degenerate local minimum of the Hamiltonian, then it is Lyapounov stable. The proof consists of noting that in this case the level sets of the Hamiltonian are topologically spheres surrounding the equilibrium, and so by conservation of energy, if a trajectory lies within one of these spheres, it remains within it. It is reasonably clear firstly that it is sufficient if any conserved quantity has a local minimum at the equilibrium point, not necessarily the Hamiltonian itself, and secondly that the local minimum may in fact be degenerate. These observations lead to the notion of...

Extremal relative equilibria. A special role is played by extremal relative equilibria. This is partly due to Dirichlet's criterion for Lyapounov stability, and partly because of their robustness. A relative equilibrium is said to be extremal if the reduced Hamiltonian $H_\mu$ on $V_\mu$ has a local extremum (max or min) at that point.
This is usually established by showing the restriction to the symplectic slice of the Hessian of $H_\xi = H - \phi_\xi$ to be positive (or negative) definite, for some $\xi$ for which $H_\xi$ has a critical point at $p$; see Proposition 3.2. It is of course conceivable that a relative equilibrium is extremal while the Hessian matrix is degenerate. The simplest case of this is for $H(x,y) = x^2 + y^4$ in the plane. In fact this arises in the case of 7 identical point vortices in the plane. The configuration where they lie at the vertices of a regular heptagon is a relative equilibrium, and the Hessian of $H$ on the symplectic slice is only positive semi-definite. However, a lengthy calculation shows that the relevant fourth order terms do not vanish, and the reduced Hamiltonian does indeed have a local minimum there.

For the following statement, recall that any momentum map is equivariant with respect to an appropriate action of $G$ on $\mathfrak{g}^*$ (Section 2), and for $\mu \in \mathfrak{g}^*$
we write $G_\mu$ for the isotropy subgroup of $\mu$ for that action.

Theorem 3.8 ([48,34]) Let $G$ act properly on $V$ with momentum map $\Phi$, and suppose $p \in \Phi^{-1}(\mu)$ is an extremal relative equilibrium, with $G_\mu$ compact. Then,
(i) The relative equilibrium is Lyapounov stable, relative to $G$;
(ii) There is a $G$-invariant neighbourhood $U$ of $p$ such that, for all $\mu' \in \Phi(U)$, there is a relative equilibrium in $U \cap \Phi^{-1}(\mu')$.

The proof is mostly point-set topology on the orbit space, using the orbit momentum map (2.16), though part (ii) uses the deeper property of (local) openness of the momentum map. In fact the proof of (i) also holds if $\mu$ is a regular point for the (modified) coadjoint action, even if $G_\mu$ is not compact. By Proposition 2.4 of [35] this result can be refined to conclude that the relative equilibrium is stable relative to $G_\mu$, as described by Patrick [59]. Note that the compactness of $G_\mu$ ensures that $\mathfrak{g}_\mu$ has a $G_\mu$-invariant complement in $\mathfrak{g}$, as required in [35]. A consequence of using point-set topology is that there is very little information on the structure of the family of relative equilibria; for such information see Section 5.

The crucial remaining point is how to determine whether a given relative equilibrium is extremal. In the case of free actions this was done by the so-called energy-momentum and/or energy-Casimir methods of Arnold, Marsden and others. Recently this was extended to the general case of proper actions:

Proposition 3.9 (Lerman, Singer [35]) Let $p \in \Phi^{-1}(\mu)$ be a relative equilibrium satisfying the same hypotheses as the theorem above, and let $\xi \in \mathfrak{g}$ be any angular velocity of the RE as in Proposition 3.2. If the restriction to the symplectic slice of the quadratic form $d^2H_\xi$ is definite, then the RE is extremal, and so Lyapounov stable relative to $G_\mu$.

4 Bifurcations of (relative) equilibria
In this section we take an extremely brief look at the typical bifurcations of equilibria in families of Hamiltonian systems as a single parameter is varied. These results also apply to relative equilibria, provided the reduced space is smooth in a neighbourhood of the relative equilibrium; failing that, they apply to the stratum containing the relative equilibrium. Bifurcations of relative equilibria near singular points of the reduced space have not been investigated systematically.
4.1 One degree of freedom
Saddle-centre bifurcation. This is the generic bifurcation of equilibria in 1 d.o.f. See Fig. 4. For $\lambda < 0$ there are two equilibrium points, at $(x,y) = (\pm\sqrt{-\lambda}, 0)$: one local extremum and one saddle. As $\lambda \to 0$ these coalesce, and they disappear for $\lambda > 0$. One notices also a homoclinic orbit for $\lambda < 0$, connecting the saddle point with itself, which can also be seen as a limit of the family of periodic orbits surrounding the stable equilibrium.

In terms of eigenvalues of the associated linear system, this bifurcation can be seen as a pair of simple imaginary eigenvalues (a centre) which move along the imaginary axis, collide at 0 and "then" emerge along the real axis (a saddle). This description is slightly misleading, as the centre and saddle coexist. At the point of bifurcation, the linear system is $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$; that is, it has non-zero nilpotent part.
Note that this bifurcation is compatible with an antisymplectic (i.e. time-reversing) symmetry $(x,y) \mapsto (x,-y)$.
Figure 4. Saddle-centre bifurcation
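The equilibria and their types can be tabulated directly. The sketch below is illustrative only: it assumes the one-degree-of-freedom normal form $H(x,y) = \tfrac12 y^2 + \tfrac13 x^3 + \lambda x$ with higher order terms dropped, which reproduces the equilibria $(\pm\sqrt{-\lambda}, 0)$ stated above; the function name is hypothetical.

```python
import math

def saddle_centre_equilibria(lam):
    """Equilibria of H = y^2/2 + x^3/3 + lam*x as (x, type) pairs; y = 0 throughout."""
    # Hamilton's equations: x' = dH/dy = y,  y' = -dH/dx = -(x^2 + lam).
    if lam > 0:
        return []                      # no equilibria on this side
    if lam == 0:
        return [(0.0, "degenerate")]   # linear part is nilpotent here
    root = math.sqrt(-lam)
    found = []
    for x in (root, -root):
        # The linearisation [[0, 1], [-2x, 0]] has eigenvalues m with m^2 = -2x.
        m_squared = -2.0 * x
        found.append((x, "centre" if m_squared < 0 else "saddle"))
    return found
```

For instance, `saddle_centre_equilibria(-4.0)` lists a centre at $x = 2$ and a saddle at $x = -2$, while any positive parameter value gives an empty list.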
Symmetric pitchfork. This is usually caused by symmetry, and consists of two saddle-centre bifurcations occurring simultaneously. On one side of the critical parameter value there are 3 coexisting equilibria, while on the other side there is only one. In fact there are two types of pitchfork: a supercritical pitchfork involves two centres collapsing into a central saddle, leaving a single centre, and a subcritical pitchfork involves two saddles collapsing into a central centre, leaving a single saddle. See Figures 5 and 6. These results and normal forms are derived from Singularity/Catastrophe Theory.
Figure 5. Supercritical Pitchfork
[Figure 6 panels, left to right: $\lambda < 0$, $\lambda = 0$, $\lambda > 0$. Normal form: $H(x,y) = \tfrac12 y^2 - x^4 - \lambda x^2 + \text{h.o.t.}$]
Figure 6. Subcritical Pitchfork
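The count of equilibria on either side of the subcritical pitchfork can be read off from the normal form. The following sketch assumes the truncated normal form $H(x,y) = \tfrac12 y^2 - x^4 - \lambda x^2$ (higher order terms dropped); the function name is hypothetical.

```python
import math

def subcritical_pitchfork_equilibria(lam):
    """Equilibria of H = y^2/2 - x^4 - lam*x^2 as (x, type) pairs; y = 0 throughout."""
    # y' = -dH/dx = 4x^3 + 2*lam*x, which vanishes at x = 0 and at x^2 = -lam/2.
    xs = [0.0]
    if lam < 0:
        r = math.sqrt(-lam / 2.0)
        xs = [-r, 0.0, r]
    found = []
    for x in xs:
        # Eigenvalues m of the linearisation satisfy m^2 = 12x^2 + 2*lam.
        m_squared = 12.0 * x * x + 2.0 * lam
        if m_squared > 0:
            kind = "saddle"
        elif m_squared < 0:
            kind = "centre"
        else:
            kind = "degenerate"
        found.append((x, kind))
    return found
```

With `lam = -2.0` this gives two saddles at $x = \pm 1$ flanking a centre at the origin; with `lam = 2.0` only a saddle at the origin remains, matching the description of the subcritical case.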
4.2 Higher degrees of freedom

From the point of view of bifurcations of equilibria, or of critical points of the Hamiltonian, the results for 1 degree of freedom carry over to higher dimensions by adding a sum of quadratic terms in the other variables. So, for example, the saddle-centre bifurcation has as "normal form"
\[ H_\lambda(x,y) = \tfrac12 y_1^2 + \tfrac13 x_1^3 + \lambda x_1 + \tfrac12 \sum_{j \ge 2} \bigl(\pm x_j^2 \pm y_j^2\bigr) + \text{h.o.t.} \]
Whether any of the equilibria are stable depends of course on the signs of the quadratic terms.

On the other hand, it is a much more subtle question as to whether any of the associated dynamics, such as the heteroclinic connections, survive this
passage into higher dimensions. For the saddle-centre bifurcation, see [22]; for the time-reversible case, see the lectures of Eric Lombardi in this volume and, for more detail, [39].

5 Geometric Bifurcations
In this section we discuss bifurcations of the families of relative equilibria (REs) due to degenerations in the geometry of the momentum map. One understands fairly well now the geometry of the family of relative equilibria in the neighbourhood of a point where the reduced phase spaces change in dimension, which occurs at special values of the momentum map, provided however that the group action on the phase space is (locally) free, so that the momentum map is submersive. However, the general structure of relative equilibria near points with continuous isotropy is not so well understood, although some recent progress has been made.

Notation. Throughout the theorems below, we will suppose that $p \in V$ is a point on a non-degenerate RE, that the angular velocity of this RE is $\xi$ and the momentum value is $\mu$. Recall that an RE is said to be non-degenerate if the Hessian $d^2H_\xi$ restricted to the symplectic slice is a non-degenerate quadratic form.

In an important paper [60], George Patrick investigates the structure of the set of relative equilibria, as quoted in the following theorem. He also studied the nearby dynamics and introduced the notion of drift around relative equilibria in terms of the linearized vector field there, but we will not be describing that aspect here.

Theorem 5.1 (Patrick [60]) Assume $G$ is compact, $G_p$ is finite and $G_\xi \cap G_\mu$ is a maximal torus. Then in a neighbourhood of $p$ the set of relative equilibria forms a smooth symplectic submanifold of $V$ of dimension $\dim(G) + \mathrm{rank}(G)$.

Note that, as matrices, since $\xi$ and $\mu$ commute they are simultaneously diagonalizable. Consequently, they are both contained in a common maximal torus, so that $G_\xi \cap G_\mu$ always contains a maximal torus. The condition of the theorem is therefore a generic condition. For example, for SO(3) the condition is satisfied if and only if $\xi$ and $\mu$ are not both zero.
This theorem has been refined by Patrick and Mark Roberts [62], who show that, assuming a generic transversality hypothesis and that as before $G_p$ is finite, the set of relative equilibria near $p$ is a stratified set, with the strata corresponding to the conjugacy class of the group $G_\xi \cap G_\mu$.
The following result is more of a bifurcation theorem, as it is aimed at counting the number of relative equilibria on each reduced phase space, near a given non-degenerate RE. Recall that, if $\mu$ is a regular point for the coadjoint action and $G_p$ is trivial, then by Arnold's theorem (Theorem 3.7) there is a unique RE on each nearby reduced space.

Theorem 5.2 (Montaldi [48]) Assume $G_\mu$ is compact and $G_p$ trivial. Then for regular $\mu'$ near $\mu$ there are at least $w(G_\mu)$ REs on the reduced space $V_{\mu'}$, where $w(G_\mu)$ is the order of the Weyl group of $G_\mu$. Furthermore, if $\xi$ is regular then there are precisely this number of relative equilibria on $V_{\mu'}$. (For non-regular $\mu'$ see [48].)

Note that if $G_\mu$ is a torus, then $w(G_\mu) = 1$, and this result reduces to Arnold's in the case that $G$ is compact. The lower bound of $w(G_\mu)$ for general $\xi$ follows from the Morse inequalities on the coadjoint orbits, and so presupposes that the REs on $V_{\mu'}$ are all non-degenerate. Without that assumption one can use the Lyusternik-Schnirelman category, giving a lower bound of $\tfrac12\dim(\mathcal{O}_{\mu'}) + 1$.

The proof of this result relies on the local normal form of Marle [42] and Guillemin-Sternberg [28]. Here I will outline the idea of the proof in the case that $\mu = 0$. The idea is to use the reduced Hamiltonian rather than the augmented Hamiltonian $H_\xi$. So, the relative equilibria in question are critical points of the Hamiltonian restricted to $V_{\mu'}$, and near $p$ one has
\[ V_{\mu'} \simeq V_0 \times \mathcal{O}_{\mu'} \subset V_0 \times \mathfrak{g}^*. \]
The reduced space $V_0$ can be identified with the symplectic slice $N_1$ (see §2.5), so that by hypothesis $p \in V_0 = V_0 \times \{0\}$ is a non-degenerate critical point of the restriction of $H$ to $V_0$. Write coordinates $(y, \nu) \in V_0 \times \mathfrak{g}^*$. Then for each $\nu$, the function $H(\cdot, \nu)$ has an isolated non-degenerate critical point $y = y(\nu)$. Define $h : \mathfrak{g}^* \to \mathbb{R}$ by
\[ h(\nu) = H(y(\nu), \nu). \]
Then one can show that the restriction $h_{\mu'}$ of $h$ to $\mathcal{O}_{\mu'}$ has a critical point at $\nu$ iff $H|_{V_{\mu'}}$ has a critical point at $(y(\nu), \nu)$. In this manner, the problem is reduced to finding critical points of the restrictions of a smooth function $h$ to coadjoint orbits $\mathcal{O}_{\mu'}$, and then one can use Morse theory or Lyusternik-Schnirelman techniques.

The above proof assumes that $G_p$ is trivial. However, if $G_p$ is finite, then the proof can be modified to show that the nearby relative equilibria correspond to critical points of a smooth $G_p$-invariant function $h$, constructed in the same manner as above, but on $V_0 \times \mathfrak{g}^*$ rather than on the full orbit space $V_0 \times_{G_p} \mathfrak{g}^*$. This time though, different critical points correspond to
the same relative equilibrium if they lie in the same $G_p$-orbit on $\mathcal{O}_{\mu'}$. This argument gives the following 'equivariant' version of Theorem 5.2.

Theorem 5.3 (Montaldi, Roberts [50]) If $G_p$ is finite, then it acts on nearby coadjoint orbits $\mathcal{O}_{\mu'}$. If a subgroup $\Sigma$ of $G_p$ is such that $\mathrm{Fix}(\Sigma, \mathcal{O}_{\mu'})$ consists of isolated points, then they are all relative equilibria.

This result uses the fact that $H$ is $G_p$-invariant, and not that $G_p$ acts symplectically. In [50] this bifurcation result is applied to finding relative equilibria of molecules, and in [37] it is applied to finding relative equilibria of systems of point vortices on the sphere; in both we use antisymplectic symmetries of $H$ as well as symplectic ones. An action of $G$ where the elements act either symplectically or antisymplectically, i.e. $g^*\omega = \pm\omega$, is said to be semisymplectic [51]. In [50], the stability of the bifurcating relative equilibria is also calculated using these methods.

The reader should also see [65,36,62,57] for further developments.

6 Examples
Let us look at three examples of symmetric Hamiltonian systems.
• Point vortices on the sphere.
• Point vortices in the plane.
• Molecules (as classical mechanical systems).

The motivation for choosing these models is that the first is relatively simple: it has no points where the action fails to be free (unless there are only 2 point vortices), and the group is compact. The bifurcations that arise are therefore of the types described in Section 5. The second is similar, except that the group of symmetries is no longer compact.

The study of the classical mechanics of molecules is also of interest in molecular spectroscopy, where it is common practice to label particular quantum states in terms of the corresponding classical behaviour. We treat this example extremely briefly!

6.1 Point vortices on the sphere
The model is a finite set of point vortices on the unit sphere in 3-space. A point vortex is an infinitesimal region of vorticity in a 2-dimensional fluid flow, though we ignore the fluid, and just concentrate on the vortices. The equations of motion for this system were obtained by V.A. Bogomolov [21]. A study of the dynamics of 3 point vortices has been carried out by Kidambi
and Newton [30] and by Pekarsky and Marsden [63]. The case of $N$ identical point vortices has been treated in [37].

If $x_1, \dots, x_N$ are the distinct locations of these point vortices (unit vectors in $\mathbb{R}^3$), then the differential equation describing their motion is
\[ \dot x_i = \sum_{j \neq i} \lambda_j\, \frac{x_j \times x_i}{1 - x_i\cdot x_j}. \]
Here $\lambda_1, \dots, \lambda_N$ are the strengths of the vortices: each $\lambda_j$ is a non-zero real number. It turns out that this vector field is Hamiltonian, with
\[ H(x) = -\sum_{j<k} \lambda_j\lambda_k \log(1 - x_j\cdot x_k), \]
and Poisson structure
\[ \{f, g\}(x) = -\sum_j \lambda_j^{-1}\, d_jf \times d_jg \cdot x_j, \]
where $d_jf = d_jf(x) \in \mathbb{R}^3$ is the differential of $f$ with respect to the point $x_j \in S^2$, and $x = (x_1, \dots, x_N)$.
The phase space is $V = S^2 \times S^2 \times \dots \times S^2 \setminus \Delta$, where $\Delta$ is the 'big diagonal' where at least one pair of points coincides, which is removed to avoid collisions. The symplectic form on $V$ is given by
\[ \omega = \lambda_1\omega_1 \oplus \dots \oplus \lambda_N\omega_N, \]
where $\omega_j$ is the standard area form on the $j$th copy of $S^2$. This system has full rotational symmetry $G = SO(3)$, and hence a 3-component conserved quantity. After identifying $\mathfrak{so}(3)$ with $\mathbb{R}^3$ as usual, this momentum map is the so-called centre of vorticity:
\[ \Phi(x) = \sum_j \lambda_j x_j. \]
There are a number of immediate general consequences for this system that can be drawn from the Hamiltonian structure. For example, if all the vorticities are of the same sign, then $H(x) \to +\infty$ as $x \to \Delta$ in $V$, and it follows that $H$ attains its minimum at some point; this point is necessarily an equilibrium point, and the set on which this minimum is attained is Lyapounov stable (one would expect this set to be a finite union of SO(3)-orbits). Moreover, for any $\mu \in \mathfrak{so}(3)^* \simeq \mathbb{R}^3$ the same argument can be applied on $\Phi^{-1}(\mu)$, and the minimum on that set is necessarily a relative equilibrium, by Proposition 3.2.
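Conservation of the centre of vorticity can be checked directly on the vector field, without integrating it. The sketch below assumes the equations of motion in the form $\dot x_i = \sum_{j\neq i} \lambda_j\,(x_j \times x_i)/(1 - x_i\cdot x_j)$; the helper names and the sample configuration are illustrative, not taken from the text. It evaluates $\sum_i \lambda_i \dot x_i$, which should vanish by antisymmetry, and the tangency condition $x_i\cdot\dot x_i = 0$.

```python
import math

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))

def normalise(a):
    """Scale a non-zero vector to unit length."""
    n = math.sqrt(dot(a, a))
    return tuple(c / n for c in a)

def velocity(xs, lams):
    """Assumed vector field: x_i' = sum_{j != i} lam_j (x_j x x_i) / (1 - x_i . x_j)."""
    out = []
    for i, xi in enumerate(xs):
        v = [0.0, 0.0, 0.0]
        for j, xj in enumerate(xs):
            if j == i:
                continue
            w = cross(xj, xi)
            d = 1.0 - dot(xi, xj)
            for k in range(3):
                v[k] += lams[j] * w[k] / d
        out.append(tuple(v))
    return out

def momentum_rate(xs, lams):
    """Time derivative of Phi(x) = sum_i lam_i x_i along the vector field."""
    vs = velocity(xs, lams)
    return tuple(sum(lam * v[k] for lam, v in zip(lams, vs)) for k in range(3))

# Illustrative configuration: three vortices with strengths of mixed sign.
xs = [normalise((1.0, 0.0, 0.2)), normalise((0.0, 1.0, -0.3)), normalise((-1.0, -1.0, 1.0))]
lams = [1.0, -2.0, 0.5]
rate = momentum_rate(xs, lams)
tangency = [dot(x, v) for x, v in zip(xs, velocity(xs, lams))]
```

Both `rate` and `tangency` come out as zero up to rounding, whatever the signs of the strengths.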
However, if the vorticities are of mixed signs, then there is no general statement about the existence of equilibria or relative equilibria, except for $N = 3$ where all REs are known (see below).

If there are some identical vortices, then there is an extra finite symmetry group, a subgroup of the permutation group $S_N$. This extra finite symmetry group is used to considerable effect in [37], from which Figures 7 and 8 are taken. Moreover, the time-reversing symmetries obtained from the reflexions in $O(3)$ are also used together with the principle of symmetric criticality to prove the existence of many types of relative equilibria. For example, let $C_h$ denote the group of order 2 generated by reflexion in the horizontal plane. Then $x \in \mathrm{Fix}(C_h)$ if all the vortices are on the equator. An application of the arguments above shows that if all the vorticities are of the same sign then there is a point of $\mathrm{Fix}(C_h, \mathcal{P})$ where the restriction of the Hamiltonian attains its minimum, and by the principle of symmetric criticality (see §1.5) it follows that this is an equilibrium point for the full system (though no longer a local minimum in general). The same extension of this argument as before provides relative equilibria in $\mathrm{Fix}(C_h, \Phi^{-1}(\mu))$. In general there will be several relative equilibria in each component of $\mathrm{Fix}(C_h, \Phi^{-1}(\mu))$. However, it is shown in [37] that if all the vorticities are of the same sign then in each component of $\mathrm{Fix}(C_h, \mathcal{P})$ there is a unique equilibrium point.

2 vortices. If there are only 2 point vortices, then every solution is a relative equilibrium. Indeed, if $\Phi(x) = \mu \neq 0$ then the two points rotate at the same angular velocity about the axis containing $\mu$. The only case where $\mu = 0$ is $\lambda_1 = \lambda_2$ and $x_1 = -x_2$, which is an equilibrium point.
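The two-vortex statement can be checked directly. The sketch below assumes the equations of motion in the form $\dot x_j = \frac{1}{4\pi}\sum_{k\neq j}\lambda_k\,(x_k\times x_j)/(1 - x_j\cdot x_k)$ (the prefactor is an assumption and only rescales time); the function name `velocity` and the sample data are illustrative. With this normalisation the velocity of each vortex should equal $\xi \times x_j$ for the single vector $\xi = \Phi(x)/(4\pi(1 - x_1\cdot x_2))$, which is a rigid rotation about the axis through $\mu = \Phi(x)$.

```python
import numpy as np

def velocity(x, lam):
    """Assumed vector field on the sphere; x is (N, 3), lam is (N,)."""
    v = np.zeros_like(x)
    n = len(lam)
    for j in range(n):
        for k in range(n):
            if k != j:
                v[j] += lam[k] * np.cross(x[k], x[j]) / (1.0 - x[j] @ x[k])
    return v / (4.0 * np.pi)

# Two vortices of mixed sign at an arbitrary (illustrative) separation.
x = np.array([[0.0, 0.0, 1.0],
              [np.sin(1.0), 0.0, np.cos(1.0)]])
lam = np.array([1.0, -2.5])

phi = lam @ x                                   # centre of vorticity, mu
xi = phi / (4.0 * np.pi * (1.0 - x[0] @ x[1]))  # candidate angular velocity
rigid = np.cross(xi, x)                         # rows are xi x x_j

is_relative_equilibrium = np.allclose(velocity(x, lam), rigid)
```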
3 vortices. There have been two recent studies of the system of 3 point vortices, by Kidambi and Newton [30] (who also treat the 2-vortex case in an appendix) and by Pekarsky and Marsden [63]. The former describes not only the relative equilibria, but also self-similar collapse, where triple collision occurs in finite time with the three vortices retaining the same shape up to similarity. The latter describes the REs and their stability using the techniques described in these lectures (many of which were in fact developed by Marsden and co-workers). One of the results of Kidambi and Newton is that there are two classes of RE: those lying on a great circle and those lying at the vertices of an equilateral triangle, and any RE belongs to one of these classes. Moreover, every equilateral triangle on the sphere is an RE, as we shall prove below. The following results are mostly taken from [30] and [63], and some are discussed below. Denote $\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1$ by $\sigma_2(\lambda)$.

Figure 7. Relative equilibria for 3 identical vortices on the sphere.

Proposition 6.1 For the system of $N = 3$ point vortices on the sphere,
(i) There exist equilibria iff $\sigma_2(\lambda) > 0$, and they always lie on a great circle.
(ii) All equilateral configurations are REs, and they are Lyapounov stable modulo $SO(2)$ if $\sigma_2(\lambda) > 0$, and are unstable if $\sigma_2(\lambda) < 0$.
(iii) Self-similar collapse occurs iff $\sigma_2(\lambda) = 0$.
(iv) The configuration in which the vortices lie on a great circle, at the vertices of a right-angled isosceles triangle, is always an RE. Furthermore, the triangle with $x_1$ at the right angle is Lyapounov stable provided
$$\lambda_2^2 + \lambda_3^2 > 2\sigma_2(\lambda).$$

The phase space is of dimension 6 in this case, and so the orbit space $\mathcal{P}/SO(3)$ is of dimension 3, and points in the orbit space correspond to the shapes of the triangle formed by the 3 vortices. The obvious set of coordinates consisting of the three pairwise distances has a problem for great circle configurations, since near such a configuration these distances do not determine the configuration uniquely. Indeed, these three distances are $O(3)$ invariants, as they do not distinguish the orientation, and configurations on a great circle have non-trivial isotropy for the $O(3)$-action, so it is not surprising that these coordinates have a problem there. For a good set of $SO(3)$-invariants, one must use the oriented volume as well, which is what is done in the two works cited above. However, the three distances do form a good set of coordinates away from the great circle configurations, and we will restrict our attention to those.

So, let $r_1$ be the chord distance $\|x_2 - x_3\|$ etc. Then the Hamiltonian and the orbit momentum map (2.16) are given, up to an additive constant in $H$, by
$$\begin{aligned}
H(r_1, r_2, r_3) &= -\frac{\lambda_1\lambda_2}{4\pi}\log(r_3^2) - \frac{\lambda_2\lambda_3}{4\pi}\log(r_1^2) - \frac{\lambda_3\lambda_1}{4\pi}\log(r_2^2), \\
\varphi(r_1, r_2, r_3) &= |\Phi|^2 = (\lambda_1 + \lambda_2 + \lambda_3)^2 - \lambda_1\lambda_2 r_3^2 - \lambda_2\lambda_3 r_1^2 - \lambda_3\lambda_1 r_2^2.
\end{aligned} \tag{6.1}$$
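The expressions in (6.1) follow from a short computation with unit vectors. The sketch below is a reconstruction rather than part of the original text; it writes $r_{jk} = \|x_j - x_k\|$ for the chord distances (so $r_1 = r_{23}$, $r_2 = r_{31}$, $r_3 = r_{12}$) and assumes the normalisation $H(x) = -\frac{1}{4\pi}\sum_{j<k}\lambda_j\lambda_k\log(1 - x_j\cdot x_k)$.

```latex
\begin{align*}
r_{jk}^2 &= \|x_j - x_k\|^2 = 2 - 2\,x_j\cdot x_k
  \quad\Longrightarrow\quad 1 - x_j\cdot x_k = \tfrac12 r_{jk}^2, \\
|\Phi|^2 &= \Big\|\sum_j \lambda_j x_j\Big\|^2
  = \sum_j \lambda_j^2 + 2\sum_{j<k}\lambda_j\lambda_k\, x_j\cdot x_k \\
 &= \sum_j \lambda_j^2 + \sum_{j<k}\lambda_j\lambda_k\,(2 - r_{jk}^2)
  = \Big(\sum_j \lambda_j\Big)^2 - \sum_{j<k}\lambda_j\lambda_k\, r_{jk}^2, \\
H &= -\frac{1}{4\pi}\sum_{j<k}\lambda_j\lambda_k \log\big(\tfrac12 r_{jk}^2\big)
  = -\frac{1}{4\pi}\sum_{j<k}\lambda_j\lambda_k \log(r_{jk}^2)
    + \frac{\log 2}{4\pi}\,\sigma_2(\lambda).
\end{align*}
```

The last term is a constant, which is why it can be dropped from the reduced Hamiltonian without affecting the critical points.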
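The reduced spaces are then $\mathcal{P}_\mu := \varphi^{-1}(\mu^2)$. The relative equilibria are determined by the critical points of the restriction of $H$ to the reduced spaces, and are therefore critical points of $H - \eta\varphi$ for some Lagrange multiplier $\eta$. Away from the great circle configurations this forces $r_1 = r_2 = r_3 = r$ with $\eta = 1/(4\pi r^2)$, so these relative equilibria are the equilateral triangles, and at such a point the Hessian of $H - \eta\varphi$ is a positive multiple of
$$\begin{pmatrix} \lambda_2\lambda_3 & 0 & 0 \\ 0 & \lambda_3\lambda_1 & 0 \\ 0 & 0 & \lambda_1\lambda_2 \end{pmatrix}.$$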
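The Lagrange-multiplier computation behind this diagonal matrix can be sketched as follows. This is a reconstruction under a stated assumption: the reduced Hamiltonian is normalised as $H = -\frac{1}{4\pi}\big(\lambda_1\lambda_2\log r_3^2 + \lambda_2\lambda_3\log r_1^2 + \lambda_3\lambda_1\log r_2^2\big)$ and $\varphi$ is as in (6.1). A different positive prefactor in $H$ would rescale $\eta$ but not the conclusion.

```latex
\begin{align*}
\frac{\partial}{\partial r_1}\big(H - \eta\varphi\big)
  &= -\frac{\lambda_2\lambda_3}{2\pi r_1} + 2\eta\,\lambda_2\lambda_3\, r_1 = 0
  \quad\Longrightarrow\quad r_1^2 = \frac{1}{4\pi\eta}, \\
\intertext{and similarly for $r_2$ and $r_3$ (each $\lambda_j \neq 0$), so
$r_1 = r_2 = r_3 = r$ and $\eta = 1/(4\pi r^2)$. At such a point}
\frac{\partial^2}{\partial r_1^2}\big(H - \eta\varphi\big)
  &= \frac{\lambda_2\lambda_3}{2\pi r^2} + 2\eta\,\lambda_2\lambda_3
   = \frac{\lambda_2\lambda_3}{\pi r^2},
\end{align*}
```

with vanishing mixed derivatives, so the Hessian is $\frac{1}{\pi r^2}\,\mathrm{diag}(\lambda_2\lambda_3, \lambda_3\lambda_1, \lambda_1\lambda_2)$ in the coordinates $(r_1, r_2, r_3)$.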
Now, the tangent space to $\varphi^{-1}(\mu^2)$ is spanned by $(\lambda_1, -\lambda_2, 0)$ and $(\lambda_1, 0, -\lambda_3)$, and a computation then shows that the Hessian of the reduced Hamiltonian is definite if and only if $\sigma_2(\lambda) > 0$. Indeed, restricting the diagonal form $\mathrm{diag}(\lambda_2\lambda_3, \lambda_3\lambda_1, \lambda_1\lambda_2)$ to this basis gives $\lambda_1\lambda_2\lambda_3 \begin{pmatrix} \lambda_1 + \lambda_2 & \lambda_1 \\ \lambda_1 & \lambda_1 + \lambda_3 \end{pmatrix}$, whose determinant is $(\lambda_1\lambda_2\lambda_3)^2\,\sigma_2(\lambda)$. The computations for the configurations of vortices lying on great circles are longer, and I will not go into them further here; details can be found in the original papers [30,63].

Remarks 6.2 (i) The case $\sigma_2(\lambda) = 0$ remains to be understood.
(ii) It is not known whether any other form of collapse (i.e. not self-similar) can occur.
(iii) A further bifurcation occurs at equilateral REs which lie on a great circle, which to my knowledge has not been investigated.

4 vortices. Much less is known in general about the case of 4 vortices. It is shown in [63] that the configuration where the four vortices lie at the vertices of a regular tetrahedron is always a relative equilibrium. It is also easy to show (from the equations of motion) that a square lying in a great circle is always an RE, independently of the values of the vorticities. However, in contrast to the 3-vortex case, squares not lying in great circles are not REs unless all the vortices are identical. In the case that all the vorticities coincide, a classification of symmetric REs is given in [37], from which Figure 8 is taken. Moreover, using the implicit function theorem one can show that many of the REs shown to exist in [37] persist under small perturbations of the Hamiltonian, and so in particular under small changes of the values of the vorticities.

There is one interesting occurrence of a symmetric pitchfork bifurcation: consider the family of square configurations on the co-latitude $\theta$ (type $C_{4v}(R)$ in the figure). When the ring is close to the North pole (say), the RE is stable. As $\theta$ increases, one pair of eigenvalues approaches 0, and at $\theta = \arccos(1/\sqrt{3})$ there is a pitchfork bifurcation (of type I in the terminology of §4.1). The bifurcating pair of relative equilibria are of type $C_{2v}(R, R')$.

Figure 8. Relative equilibria for 4 identical vortices on the sphere.

Stability of a ring of vortices. It was shown by Dritschel and Polvani [64] that the stability of a single ring of identical vortices depends on the latitude. They show that if $\theta$ is the angle subtended by any of the vortices with the axis of symmetry of the ring (the colatitude), then the configuration of a regular
ring of $N$ identical vortices is linearly stable as follows:

    N    range of stability
    3    all $\theta$
    4    $\cos^2\theta > 1/3$
    5    $\cos^2\theta > 1/2$
    6    $\cos^2\theta > 4/5$

while for $N > 6$ the ring is never stable. See also [38], where it is shown that the rings are not only linearly stable but Lyapounov stable.

Bifurcations. The changes in stability that occur in the table above involve bifurcations of the relative equilibria, and indeed the supercritical pitchfork bifurcation described in Section 4, since they all involve a loss of $C_2$ symmetry. Consider for example the case of $N = 4$ identical vortices. A single ring (square) near the pole is a Lyapounov stable relative equilibrium. As $\theta \to \cos^{-1}(1/\sqrt{3})$, one of the eigenvalues tends to 0, and for $\theta > \cos^{-1}(1/\sqrt{3})$ there appears a new family of relative equilibria consisting of vortices alternately above and below the vortices in the now-unstable square configuration. These bifurcating REs, denoted $C_{2v}(R, R')$ in [37], are then Lyapounov stable. Other stability transitions have been observed in [37,38], but the corresponding bifurcations have not been studied.

The other type of bifurcation that occurs in this problem is the geometric bifurcation due to the different geometry of the reduced spaces for $\mu = 0$ and $\mu \neq 0$; see Section 5. Consider a relative equilibrium $p_e$ on $\mu = 0$, for example the ring of $N$ identical vortices on the equator. This corresponds to a point with symmetry $D_{Nh}$ in the phase space (the dihedral group in the equatorial plane together with inversion in that plane). Nearby reduced spaces are then locally of the form $\mathcal{P}_\mu \simeq \mathcal{P}_0 \times \mathcal{O}_\mu$, where $\mathcal{O}_\mu$ is the coadjoint orbit through $\mu$, which here is a sphere, as described very briefly in Section 5. The relative equilibria on $\mathcal{P}_\mu$ near $p_e$ are the critical points of some function $h : \mathcal{O}_\mu \to \mathbb{R}$, and moreover this function is invariant under some action of $D_{Nh} \simeq D_N \times C_2$ on $\mathcal{O}_\mu$.
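The stability ranges quoted from [64] can be turned into critical colatitudes, which makes the bifurcation value $\cos^{-1}(1/\sqrt{3})$ for the square explicit. The sketch below only encodes the bounds $\cos^2\theta > 1/3,\ 1/2,\ 4/5$ for $N = 4, 5, 6$; the names `COS2_BOUND` and `critical_colatitude` are illustrative, and the ring is taken in the northern hemisphere.

```python
import math

# Lower bounds on cos^2(theta) for linear stability of a ring of N identical
# vortices, as quoted in the text from [64] (N = 3 is stable for all theta).
COS2_BOUND = {4: 1.0 / 3.0, 5: 1.0 / 2.0, 6: 4.0 / 5.0}

def critical_colatitude(n):
    """Colatitude (radians) at which a northern ring of n vortices loses stability.

    Returns math.pi / 2 for n == 3 (stable all the way to the equator) and
    None for n > 6 (never stable); rings with fewer than 3 vortices are not covered.
    """
    if n < 3:
        raise ValueError("the quoted results cover rings with N >= 3")
    if n == 3:
        return math.pi / 2
    if n > 6:
        return None
    return math.acos(math.sqrt(COS2_BOUND[n]))

# Critical colatitudes in degrees for N = 4, 5, 6.
critical_degrees = {n: math.degrees(critical_colatitude(n)) for n in (4, 5, 6)}
```

For the square this gives a critical colatitude of about 54.7 degrees, and the stable cap shrinks towards the pole as $N$ increases to 6.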
An analysis of this action shows that there must be critical points with symmetry of types $C_{Nv}$ and $C_{2v}$; see Figure 9. The corresponding relative equilibria have configurations of types $C_{Nv}(R)$ (a regular ring) and, if $N = 2m$, $C_{2v}(mR)$ and $C_{2v}((m-1)R, 2p)$, while if $N = 2m + 1$ then $C_{2v}(mR, p)$. Here the configuration $C_{2v}(mR, \ell p)$ consists of $m$ pairs and $\ell$ poles all lying on a common great circle, while the great circle is rotating rigidly about an axis containing the poles. See Figure 8, and see [37] for details.

Figure 9. Level sets of a typical function on a sphere with symmetry $D_{3h}$, showing half of the 8 critical points.

6.2 Point vortices in the plane

This system has a much older history than the model of vortices on the sphere, going back to Helmholtz and Kirchhoff, and has been studied by many people since; for reviews see [17,3,15]. Consider an ideal fluid in the plane whose vorticity is concentrated in $N$ point vortices, of strengths $\lambda_1, \dots, \lambda_N$. These points move according to the differential system
$$\dot{\bar z}_j = \frac{1}{2\pi i} \sum_{k \neq j} \frac{\lambda_k}{z_j - z_k},$$
where $z_j$ is a complex number representing the position of the $j$-th vortex, after identifying the plane with $\mathbb{C}$, see (1.3). The Hamiltonian for this system is
$$H(z) = -\frac{1}{4\pi} \sum_{i<j} \lambda_i \lambda_j \log|z_i - z_j|^2.$$
The symmetry here is the group of 2-D Euclidean motions $SE(2)$, which is not compact. The corresponding conserved quantity (momentum map) is
$$\Phi(z_1, \dots, z_N) = \Big( i \sum_j \lambda_j z_j,\ \tfrac{1}{2} \sum_j \lambda_j |z_j|^2 \Big). \tag{6.2}$$
From the geometric point of view, this is interesting because if the total vorticity $\Lambda = \sum_j \lambda_j$ is non-zero, the coadjoint action on $\mathfrak{se}(2)^*$ must be modified in order that the momentum map be equivariant (see Section 2). If we identify $SE(2)$ with $\mathbb{C} \times U(1)$ and $\mathfrak{u}(1)$ with $\mathbb{R}$, then the modified coadjoint action (2.7) becomes
$$\mathrm{Coad}^{\Lambda}_{(u,\theta)}(\nu, \psi) = \big( e^{i\theta}\nu,\ \psi + \Im(e^{i\theta}\nu\bar{u}) \big) + \Lambda \big( iu,\ \tfrac{1}{2}|u|^2 \big), \tag{6.3}$$
where $\Im(z)$ is the imaginary part of $z$.
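The equivariance of the momentum map under the modified action can be tested numerically. The sketch below assumes that the group element $(u, \theta)$ acts on configurations by $z_j \mapsto e^{i\theta} z_j + u$ (an assumption about the convention), with $\Phi(z) = (i\sum_j \lambda_j z_j,\ \tfrac12\sum_j \lambda_j |z_j|^2)$ and the action $(\nu, \psi) \mapsto (e^{i\theta}\nu,\ \psi + \Im(e^{i\theta}\nu\bar u)) + \Lambda(iu,\ \tfrac12|u|^2)$. The function names are illustrative. It also evaluates $f(\nu, \psi) = |\nu|^2 - 2\Lambda\psi$, which should be constant along orbits.

```python
import numpy as np

def momentum(z, lam):
    """Phi(z) = (nu, psi) = (i * sum lam_j z_j, (1/2) * sum lam_j |z_j|^2)."""
    nu = 1j * np.sum(lam * z)
    psi = 0.5 * np.sum(lam * np.abs(z) ** 2)
    return nu, psi

def modified_coadjoint(u, theta, nu, psi, total):
    """Modified coadjoint action of (u, theta) on (nu, psi); total is Lambda."""
    rotated = np.exp(1j * theta) * nu
    new_nu = rotated + total * 1j * u
    new_psi = psi + np.imag(rotated * np.conj(u)) + 0.5 * total * abs(u) ** 2
    return new_nu, new_psi

def casimir(nu, psi, total):
    """f(nu, psi) = |nu|^2 - 2 * Lambda * psi."""
    return abs(nu) ** 2 - 2.0 * total * psi

rng = np.random.default_rng(1)
z = rng.normal(size=5) + 1j * rng.normal(size=5)   # five vortex positions
lam = np.array([1.0, -0.5, 2.0, 0.3, 1.2])          # total vorticity 4.0
u, theta = 0.7 - 1.3j, 0.9                          # an arbitrary Euclidean motion

moved = momentum(np.exp(1j * theta) * z + u, lam)
acted = modified_coadjoint(u, theta, *momentum(z, lam), lam.sum())

equivariant = np.allclose(moved, acted)
casimir_preserved = np.isclose(casimir(*moved, lam.sum()),
                               casimir(*momentum(z, lam), lam.sum()))
```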
If $\Lambda = 0$ then the orbits are points on the $\psi$-axis, and cylinders around that axis, while if $\Lambda \neq 0$ the orbits are all paraboloids; see Examples 2.5 and 2.7 respectively and Figures 2 and 3. Indeed, one can show that the modified coadjoint orbits are given by the level sets of $f : \mathbb{C} \times \mathbb{R} \to \mathbb{R}$ defined by
$$f(\nu, \psi) = |\nu|^2 - 2\Lambda\psi, \tag{6.4}$$
at least if $\Lambda \neq 0$. If $\Lambda = 0$ then the non-zero level sets are the cylindrical orbits, but the zero level set is the whole $\psi$-axis.

2 vortices in the plane. As in the case of 2 vortices on the sphere, here every solution is again a relative equilibrium. It is simple to prove from the differential equation that if $\Lambda = \lambda_1 + \lambda_2 \neq 0$ then the two point vortices rotate about the fixed point $(\lambda_1 z_1 + \lambda_2 z_2)/\Lambda$. On the other hand, if $\Lambda = 0$ then they translate together towards infinity, in the direction orthogonal to the segment joining them. These two types of motion are both relative equilibria. From a geometrical point of view, the reduced spaces are all just single points, so the corresponding motions are indeed all relative equilibria, and furthermore the relative equilibria are trivially extremal, and so provided $G_\mu$ is compact (i.e. $\Lambda \neq 0$, so $G_\mu \simeq SO(2)$) they are stable modulo $G_\mu$.

3 vortices in the plane. The classical work on three planar point vortices is a beautiful paper by J.L. Synge [69]. We approach this problem as we did for three point vortices on the sphere. That is, points in the quotient of the phase space $(\mathbb{R}^2)^3 \simeq \mathbb{C}^3$ by $SE(2)$ correspond to shapes of oriented triangles. Again we ignore the orientation, which only causes problems near collinear configurations. Then a point in the quotient space is determined by the three lengths $(r_1, r_2, r_3)$, and on that space
$$\begin{aligned}
4\pi H(r_1, r_2, r_3) &= -\lambda_1\lambda_2 \log(r_3^2) - \lambda_2\lambda_3 \log(r_1^2) - \lambda_3\lambda_1 \log(r_2^2), \\
\varphi(r_1, r_2, r_3) &= -\lambda_1\lambda_2 r_3^2 - \lambda_2\lambda_3 r_1^2 - \lambda_3\lambda_1 r_2^2,
\end{aligned} \tag{6.5}$$
where the second is just $f \circ \Phi$. It is remarkable that apart from the constant term in $\varphi$, these are identical to equations (6.1) for 3 point vortices on the sphere. The non-collinear relative equilibria are given by the critical points of $H$ restricted to the level sets of $\varphi$, and the computation has already been done. Thus the relative equilibria are again equilateral triangles, of side $r$ say, with Lagrange multiplier $\eta = 1/(4\pi r^2)$ again. Note however, that this time the relation between $\eta$ and the "angular velocity" $\xi$ is not so simple:
$$0 = d(H - \eta\varphi)(p) = d(H - \eta\, f \circ \Phi)(p) = dH(p) - \eta\, df(\mu)\, d\Phi(p),$$
so that $\xi = \eta\, df(\mu)$. Thus if $\Phi(p) = (\nu, \psi)$ then $\xi = 2\eta(\nu, -\Lambda)$.
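The planar equilateral relative equilibria can be checked directly from the equations of motion, independently of sign conventions for $\xi$. The sketch below assumes the vector field $\dot{\bar z}_j = \frac{1}{2\pi i}\sum_{k\neq j}\lambda_k/(z_j - z_k)$; the function names and sample data are illustrative. For an equilateral triangle of side $r$ it computes the rates of change of the squared side lengths, which should vanish for arbitrary vorticities, and for $\Lambda = 0$ it compares the common translation speed with $|\nu|/(2\pi r^2)$, where $\nu = i\sum_j \lambda_j z_j$.

```python
import numpy as np

def velocity(z, lam):
    """dz_j/dt from conj(dz_j/dt) = (1/(2 pi i)) * sum_{k != j} lam_k / (z_j - z_k)."""
    n = len(z)
    v = np.zeros(n, dtype=complex)
    for j in range(n):
        s = sum(lam[k] / (z[j] - z[k]) for k in range(n) if k != j)
        v[j] = np.conj(s / (2j * np.pi))
    return v

def distance_rates(z, lam):
    """Time derivatives of |z_j - z_k|^2 for all pairs j < k."""
    v = velocity(z, lam)
    n = len(z)
    return np.array([2.0 * np.real((v[j] - v[k]) * np.conj(z[j] - z[k]))
                     for j in range(n) for k in range(j + 1, n)])

# Equilateral triangle of side r, with arbitrary centre and orientation.
r = 1.7
z = (0.3 + 0.1j) + (r / np.sqrt(3.0)) * np.exp(1j * (0.4 + 2.0 * np.pi * np.arange(3) / 3.0))

lam_generic = np.array([1.0, -2.0, 0.7])   # total vorticity non-zero
lam_zero = np.array([1.0, 2.0, -3.0])      # total vorticity zero

rates_generic = distance_rates(z, lam_generic)
rates_zero = distance_rates(z, lam_zero)

v_zero = velocity(z, lam_zero)             # should be a common translation
nu = 1j * np.sum(lam_zero * z)
expected_speed = abs(nu) / (2.0 * np.pi * r ** 2)
```

That the side lengths are stationary for any choice of non-zero vorticities reflects the fact that the shape is a critical point of $H$ on the level sets of $\varphi$, regardless of the values of the $\lambda_j$.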
One consequence of this expression for $\xi$ is that if $\Lambda = 0$ then the "angular velocity" is in fact rectilinear motion, with constant velocity $\xi = 2\eta\nu = (1/(2\pi r^2))\,\nu$, where $\nu = i\sum_j \lambda_j z_j$, which in modulus is independent of the representative triangle. Note that if the 3 vortices are not collinear, then $\nu \neq 0$, so that the function $f$ separates the relevant coadjoint orbits even in the case $\Lambda = 0$. On the other hand, if $\Lambda \neq 0$ and $\eta \neq 0$, then the relative equilibrium is a periodic orbit. Furthermore, the Lyapounov stabilities modulo $G$ of these equilateral relative equilibria are the same as for the spherical case.

The symplectic reduction for a particular class of planar 4-vortex problems has recently been considered by Patrick [61]. In particular he treats the case $\Lambda = 0$, so that the momentum map is coadjoint-equivariant; it would be interesting to see how the results are affected by changing to $\Lambda \neq 0$.

6.3 Molecules
Consider a molecule consisting of $N$ atoms. The Born-Oppenheimer approximation consists of ignoring the movement of the electrons (which is reasonable as they are so light). The system then has $3N$ degrees of freedom, the 3 directions of motion for each nucleus, or $3N - 3$ after fixing the centre of mass. The rotational symmetry of the system gives rise to the conservation of angular momentum $J$. If we put $J = 0$, then there are $3N - 6$ degrees of freedom, which describe the shape of the molecule and correspond to the vibrational motions. In other words, the $J = 0$ reduced space is of dimension $6N - 12$. The simplest motions beyond the equilibria are the periodic orbits, which near the stable equilibrium are given by Lyapounov's theorem and its generalizations.

For $J \neq 0$, the motion has a rotational aspect, and the simplest type of motion is the relative equilibrium. Beyond that are motions that are a combination of rotations and vibrations, so-called rovibrational states. The reduced spaces for $J \neq 0$ are of dimension $6N - 10$.

Consider now the simplest interesting case of a triatomic molecule. The reduced spaces are of dimension 6 (for $J = 0$) or 8 (for $J \neq 0$). There is one complication that we will not discuss here, namely that the reduced space for $J = 0$ is singular at points corresponding to collinear equilibria. Consider then an equilibrium of a triatomic molecule which is not collinear. The geometric bifurcation methods discussed in Section 5, and at the end of §6.1, show that there are relative equilibria which bifurcate from the equilibria, and in fact there are at least 6 such families of RE, parametrized by $\|J\|$ and corresponding to critical points of functions on a sphere. The stabilities of these bifurcating families are discussed in [50]. For a complete investigation into the relative
equilibria of a specific molecule with 3 identical atoms, namely $H_3^+$, see [33].

Another interesting example is the tetra-atomic molecule ammonia NH$_3$. This has an equilibrium where the three hydrogen atoms form an equilateral triangle, and the nitrogen atom is slightly above (or below) the centre of the triangle, so that the molecule is nearly planar. The analysis is similar to the triatomic case, with 6 families of REs that bifurcate from each equilibrium. The stability analyses should also be similar to the triatomic case. However, the presence of two very close stable equilibria (with the nitrogen atom on one side and on the other of the hydrogen plane) and an unstable planar equilibrium between them suggests that there will be further bifurcations. An interesting "semilocal" analysis could be obtained by adding a new parameter $\lambda$ so that the equilibria undergo a pitchfork bifurcation (of type I) at $\lambda = 0$, with the genuine system corresponding to, say, $\lambda = -1$ and an artificial one for $\lambda > 0$ with the symmetric planar equilibrium being stable. One needs to investigate how the pitchfork bifurcation within the reduced space $\mathcal{P}_0$, obtained by varying $\lambda$, couples with the geometric bifurcation, obtained by varying $\|J\|$.

References

Books
1. R. Abraham and J. Marsden, Foundations of Mechanics. Benjamin-Cummings, 1978.
2. V.I. Arnold, Mathematical Methods of Classical Mechanics. Springer-Verlag, 1978.
3. V.I. Arnold and B.A. Khesin, Topological Methods in Hydrodynamics. Springer-Verlag, New York, 1998.
4. V.I. Arnold, V.V. Kozlov and A.I. Neishtadt, Mathematical Aspects of Classical and Celestial Mechanics, 2nd edition. Springer-Verlag, 1997.
5. P. Chossat and R. Lauterbach, Methods in Equivariant Bifurcation Theory and Dynamical Systems. World Scientific, to appear 2000.
6. R.H. Cushman and L.M. Bates, Global Aspects of Classical Integrable Systems. Birkhäuser Verlag, 1997.
7. M. Golubitsky, I. Stewart and D. Schaeffer, Singularities and Groups in Bifurcation Theory, Vol. II. Springer-Verlag, New York, 1988.
8. V. Guillemin and S. Sternberg, Symplectic Techniques in Physics. Cambridge University Press, 1984.
9. V. Guillemin, E. Lerman and S. Sternberg, Symplectic Fibrations and Multiplicity Diagrams. Cambridge University Press, 1996.
10. P. Libermann and C.-M. Marle, Symplectic Geometry and Analytical Mechanics. Reidel, 1987.
11. R. MacKay and J. Meiss, Hamiltonian Dynamical Systems, a reprint collection. Adam Hilger, Bristol, 1988.
12. J. Marsden, Lectures on Mechanics. L.M.S. Lecture Note Series 174, Cambridge University Press, 1992.
13. J. Marsden and T. Ratiu, Introduction to Mechanics and Symmetry. Springer-Verlag, New York, 1994. [Second ed. 1999]
14. K. Meyer and G. Hall, Hamiltonian Systems and the N-Body Problem. Springer-Verlag, New York, 1992.
15. P.G. Saffman, Vortex Dynamics. Cambridge University Press, Cambridge, 1992.
16. J.-M. Souriau, Structure des Systèmes Dynamiques. Dunod, Paris, 1970. [English translation: Structure of Dynamical Systems: A Symplectic View of Physics, Birkhäuser, Boston, 1997.]

Research papers

17. H. Aref, Integrable, chaotic and turbulent vortex motion in two-dimensional flows. Ann. Rev. Fluid Mech. 15 (1983), 345-389.
18. J.M. Arms, A. Fischer and J.E. Marsden, Une approche symplectique pour des théorèmes de décomposition en géométrie ou relativité générale. C. R. Acad. Sci. Paris 281 (1975), 517-520.
19. J.M. Arms, J.E. Marsden and V. Moncrief, Bifurcations of momentum mappings. Comm. Math. Phys. 78 (1981), 455-478.
20. L.M. Bates and E. Lerman, Proper group actions and symplectic stratified spaces. Pacific J. Math. 181 (1997), 201-229.
21. V.A. Bogomolov, Dynamics of vorticity at a sphere. Fluid Dynamics 6 (1977), 863-870.
22. H. Broer, S.-N. Chow, Y. Kim and G. Vegter, A normally elliptic Hamiltonian bifurcation. Z. Angew. Math. Phys. 44 (1993), 389-432.
23. M. Dellnitz, I. Melbourne and J. Marsden, Generic bifurcation of Hamiltonian vector fields with symmetry. Nonlinearity 5 (1992), 979-996.
24. J.J. Duistermaat, Bifurcations of periodic solutions near equilibrium points of Hamiltonian systems. In Bifurcation Theory and Applications, Montecatini, 1983 (ed. L. Salvadori), LNM 1057, Springer, 1984.
25. F. Fassò and D. Lewis, Stability properties of the Riemann ellipsoids. Preprint, 2000.
26. I.M. Gelfand and L.D. Lidskii, On the structure of stability of linear Hamiltonian systems of differential equations with periodic coefficients. Usp. Math. Nauk. 10 (1955), 3-40. (English translation: Amer. Math. Soc. Translations (2) 8 (1958), 143-181.)
27. M. Golubitsky and I. Stewart, Generic bifurcations of Hamiltonian systems with symmetry. Physica D 24 (1987), 391-405.
28. V. Guillemin and S. Sternberg, A normal form for the moment map. In Differential Geometric Methods in Mathematical Physics (S. Sternberg ed.), Mathematical Physics Studies 6, D. Reidel Publishing Company, 1984.
29. G. Iooss and M.-C. Pérouème, Perturbed homoclinic solutions in reversible 1:1 resonance vector fields. J. Differential Equations 102 (1993), 62-88.
30. R. Kidambi and P. Newton, Motion of three point vortices on a sphere. Physica D 116 (1998), 143-175.
31. Y. Kimura, Vortex motion on surfaces with constant curvature. Proc. R. Soc. Lond. A 455 (1999), 245-259.
32. F.C. Kirwan, The topology of reduced phase spaces of the motion of vortices on a sphere. Physica D 30 (1988), 99-123.
33. I. Kozin, R.M. Roberts and J. Tennyson, Symmetry and structure of rotating $H_3^+$. Preprint, University of Warwick, 1999.
34. E. Lerman, J. Montaldi and T. Tokieda, Persistence of extremal relative equilibria. In preparation.
35. E. Lerman and S.F. Singer, Relative equilibria at singular points of the momentum map. Nonlinearity 11 (1998), 1637-1649.
36. E. Lerman and T.F. Tokieda, On relative normal modes. C. R. Acad. Sci. Paris Sér. I 328 (1999), 413-418.
37. C.C. Lim, J. Montaldi and R.M. Roberts, Systems of point vortices on the sphere. Preprint, INLN, 2000.
38. C.C. Lim, J. Montaldi and R.M. Roberts, Stability of relative equilibria for point vortices on the sphere. In preparation.
39. E. Lombardi, Oscillatory integrals and phenomena beyond any algebraic order; with applications to homoclinic orbits in reversible systems. Springer Verlag Lecture Notes in Mathematics, to appear.
40. A.M. Lyapounov, Problème général de la stabilité du mouvement. Ann. Fac. Sci. Toulouse 9 (1907). (Russian original: 1895.)
41. R. MacKay, Stability of equilibria of Hamiltonian systems. Nonlinear Phenomena and Chaos (1986), 254-70. Also reprinted in [11].
42. C.-M. Marle, Modèle d'action hamiltonienne d'un groupe de Lie sur une variété symplectique. Rend. Sem. Mat. Univers. Politecn. Torino 43 (1985), 227-251.
43. J. Marsden and A. Weinstein, Reduction of symplectic manifolds with symmetry. Rep. Math. Phys. 5 (1974), 121-130.
44. J.-C. van der Meer, The Hamiltonian Hopf bifurcation. Lecture Notes in Math. 1160, Springer, 1985.
45. I. Melbourne, Versal unfoldings of equivariant linear Hamiltonian vector fields. Math. Proc. Camb. Phil. Soc. 114 (1993), 559-573.
46. I. Melbourne and M. Dellnitz, Normal forms for linear Hamiltonian vector fields commuting with the action of a compact Lie group. Math. Proc. Camb. Phil. Soc. 114 (1993), 235-268.
47. K. Meyer, Symmetries and integrals in mechanics. In Dynamical Systems (M. Peixoto, ed.), 259-273. Academic Press, New York, 1973.
48. J. Montaldi, Persistence and stability of relative equilibria. Nonlinearity 10 (1997), 449-466.
49. J. Montaldi, Perturbing a symmetric resonance: the magnetic spherical pendulum. In SPT98 - Symmetry and Perturbation Theory II (A. Degasperis and G. Gaeta eds.), World Scientific, 1999.
50. J. Montaldi and R.M. Roberts, Relative equilibria of molecules. J. Nonlin. Sci. 9 (1999), 53-88.
51. J. Montaldi and R.M. Roberts, A note on semisymplectic actions of Lie groups. In preparation.
52. J. Montaldi, R.M. Roberts and I. Stewart, Periodic solutions near equilibria of symmetric Hamiltonian systems. Proc. Roy. Soc. London 325 (1988), 237-293.
53. J. Montaldi, R.M. Roberts and I. Stewart, Existence of nonlinear normal modes of symmetric Hamiltonian systems. Nonlinearity 3 (1990), 695-730.
54. J. Moser, Lectures on Hamiltonian Systems. Memoirs of the A.M.S. 81 (1981). (Also reprinted in [11].)
55. J. Moser, Periodic orbits near equilibrium and a theorem by Alan Weinstein. Communs. Pure Appl. Math. 29 (1976), 727-747.
56. J.P. Ortega, Symmetry, Reduction and Stability in Hamiltonian Systems. Thesis, University of California, Santa Cruz, 1988.
57. J.P. Ortega and T.S. Ratiu, Stability of Hamiltonian relative equilibria. Nonlinearity 12 (1999), 693-720.
58. R. Palais, The principle of symmetric criticality. Commun. Math. Phys. 69 (1979), 19-30.
59. G.W. Patrick, Relative equilibria in Hamiltonian systems: The dynamic interpretation of nonlinear stability on the reduced phase space. J. Geom. Phys. 9 (1992), 111-119.
60. G.W. Patrick, Relative equilibria of Hamiltonian systems with symmetry: linearization, smoothness and drift. J. Nonlin. Sci. 5 (1995), 373-418.
61. G.W. Patrick, Reduction of the planar 4-vortex system at zero momentum. Preprint math-ph/9910012.
62. G.W. Patrick and R.M. Roberts, The transversal relative equilibria of Hamiltonian systems with symmetry. Preprint, University of Warwick, 1999.
63. S. Pekarsky and J.E. Marsden, Point vortices on a sphere: Stability of relative equilibria. J. Mathematical Physics 39 (1998), 5894-5906.
64. L.M. Polvani and D.G. Dritschel, Wave and vortex dynamics on the surface of a sphere. J. Fluid Mech. 255 (1993), 35-64.
65. R.M. Roberts and M.E.R. Sousa Dias, Bifurcations from relative equilibria of Hamiltonian systems. Nonlinearity 10 (1997), 1719-1738.
66. R.M. Roberts and M.E.R. Sousa Dias, Symmetries of Riemann ellipsoids. To appear in Resenhas do IME - USP, 2000.
67. R. Sjamaar and E. Lerman, Stratified symplectic spaces and reduction. Ann. Math. 134 (1991), 375-422.
68. S. Smale, Topology and mechanics I, II. Invent. Math. 10 (1970), 305-331, and 11 (1970), 45-64.
69. J.L. Synge, On the motion of three vortices. Can. J. Math. 1 (1949), 257-270.
70. A. Vanderbauwhede and J.C. van der Meer, A general reduction method for periodic solutions near equilibria in Hamiltonian systems. Fields Institute Comm. 4 (1995), 273-294.
71. A. Weinstein, Normal modes for nonlinear Hamiltonian systems. Invent. Math. 20 (1973), 47-57.
72. A. Weinstein, Bifurcations and Hamilton's principle. Math. Z. 159 (1978), 235-248.