Geometry, Mechanics, and Dynamics: Volume in Honor of the 60th Birthday of J. E. Marsden


Author: Newton P. K. | Weinstein A. | Holmes P.


Editors:

Paul Newton, Phil Holmes, and Alan Weinstein

Paul Newton, Department of Aerospace and Mechanical Engineering and Department of Mathematics, University of Southern California, Los Angeles, CA 90089-1191, USA

Philip Holmes, Department of Applied and Computational Mathematics, Engineering Quadrangle, Princeton University, Princeton, NJ 08544-1000, USA

Alan Weinstein, Department of Mathematics, University of California, Berkeley, Berkeley, CA 94720, USA

Cover Illustration: Permission has been granted for use of the thunderstorm photograph on the cover by Kyle Poage, General Forecaster, National Weather Service, Dodge City, KS 67801, USA. The photo is of a spectacular thunderstorm that occurred at sunset over northwest Kansas in August 1996. The view is to the east from Norton, Kansas. It was taken while Kyle was at Saint Louis University (SLU) and was featured as a cover photo for the SLU Department of Earth and Atmospheric Sciences homepage (8 September 1998). http://www.eas.slu.edu/Photos/photos.html

The photograph of Jerry Marsden on page v was taken by photographer Robert J. Paz, Public Relations, California Institute of Technology, Pasadena, CA 91109, USA.

Mathematics Subject Classification (2000): 00B1D, 37-02, 53-02, 58-02, 70-02, 73

IP data to come

ISBN 0-387-91185-6

Printed on acid-free paper.

© 2002 by Springer-Verlag New York, Inc. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed in the United States of America.

9 8 7 6 5 4 3 2 1

SPIN 10882145

Typesetting: Pages were created from author-prepared LaTeX manuscripts by the technical editors, Wendy McKay and Ross Moore, using modifications of a Springer LaTeX macro package, and other packages for the integration of graphics and consistent stylistic features within articles from diverse sources.

www.springer-ny.com

Springer-Verlag New York, Berlin, Heidelberg

A member of BertelsmannSpringer Science+Business Media GmbH

To Jerry Marsden on the occasion of his 60th birthday, with admiration, affection, and best wishes for many more years of creativity.

Photo by Robert J. Paz

Contents

Preface

I Elasticity and Analysis

1 Some Open Problems in Elasticity by John Ball
2 Finite Elastoplasticity, Lie Groups and Geodesics on SL(d) by Alexander Mielke
3 Asynchronous Variational Integrators by Adrian Lew and Michael Ortiz

II Fluid Mechanics

4 Euler–Poincaré Dynamics of Perfect Complex Fluids by Darryl D. Holm
5 The Lagrangian Averaged Euler (LAE-α) Equations with Free-Slip or Mixed Boundary Conditions by Steve Shkoller
6 Nearly Inviscid Faraday Waves by Edgar Knobloch and José M. Vega
7 The Variational Multiscale Formulation of LES with Application to Turbulent Channel Flows by Thomas J. R. Hughes and Assad A. Oberai

III Dynamical Systems

8 Patterns of Oscillation in Coupled Cell Systems by Martin Golubitsky and Ian Stewart
9 Simple Choreographic Motions of N Bodies: A Preliminary Study by Alain Chenciner, Joseph Gerver, Richard Montgomery and Carles Simó
10 On Normal Form Computations by Jürgen Scheurle and Sebastian Walcher

IV Geometric Mechanics

11 The Optimal Momentum Map by Juan-Pablo Ortega and Tudor S. Ratiu
12 Combinatorial Formulas for Products of Thom Classes by Victor Guillemin and Catalin Zara
13 Gauge Theory of Small Vibrations in Polyatomic Molecules by Robert G. Littlejohn and Kevin A. Mitchell

V Geometric Control

14 Symmetries, Conservation Laws, and Control by Anthony M. Bloch and Naomi E. Leonard

VI Relativity and Quantum Mechanics

15 Conformal Volume Collapse of 3-Manifolds and the Reduced Einstein Flow by Arthur E. Fischer and Vincent Moncrief
16 On Quantizing Semisimple Basic Algebras, I: sl(2, R) by Mark J. Gotay

VII Jerrold Marsden, 1942–

Curriculum Vitae
Some Research Highlights
Graduate Students and Post Doctoral Scholars
Publications

Contributors

Preface

Jerry Marsden, one of the world’s pre-eminent mechanicians and applied mathematicians, celebrated his 60th birthday in August 2002. The event was marked by a workshop on “Geometry, Mechanics, and Dynamics” at the Fields Institute for Research in the Mathematical Sciences, of which he was the founding Director. Rather than merely produce a conventional proceedings, with relatively brief accounts of research and technical advances presented at the meeting, we wished to acknowledge Jerry’s influence as a teacher, a propagator of new ideas, and a mentor of young talent. Consequently, starting in 1999, we sought to collect articles that might be used as entry points by students interested in fields that have been shaped by Jerry’s work. At the same time we hoped to give experts engrossed in their own technical niches an indication of the wonderful breadth and depth of their subjects as a whole.

This book is an outcome of the efforts of those who accepted our invitations to contribute. It presents both survey and research articles in the several fields that represent the main themes of Jerry’s work, including elasticity and analysis, fluid mechanics, dynamical systems theory, geometric mechanics, geometric control theory, and relativity and quantum mechanics. The common thread running through this broad tapestry is the use of geometric methods that serve to unify diverse disciplines and bring a wide variety of scientists and mathematicians together, speaking a language which enhances dialogue and encourages cross-fertilization. We hope that this book will serve as a guide to these exciting and rapidly evolving areas, and that it will be a resource both for the student intent on contributing to one of these fields and for the seasoned practitioner who seeks a broader view.

Jerry is a unique figure in mathematical circles because his work has significantly influenced four often (alas!) separate research communities: pure mathematicians, applied mathematicians, physicists, and engineers. Foundations of Mechanics (with Ralph Abraham [294]), first published in 1967 while Jerry was a graduate student at Princeton, has for the past 35 years been a landmark and inspiration in the field of mechanics; during that time, Jerry and his collaborators have done extraordinary work in a huge variety of sub-fields of mechanics, geometry, and dynamics. Ralph Abraham recalls:

“The first edition of Foundations of Mechanics included, in my Preface, a few words on the genesis of the book as Jerry’s notes of my lectures in early 1966. I well recall the first meeting of that graduate course. At the outset I announced a desire for volunteers to make notes which might be duplicated for the use of students, as there was at that time no text we could follow. And at the end of that first meeting, only one volunteer: Jerry. He was a new face for me, and seemed rather young and quiet, and I told him I hoped that others would volunteer for a team effort on the notes. Well, there were no other volunteers, which was just as well. For shortly after each lecture Jerry would deliver a thick sheaf of handwritten notes, usually without a single error. Many details omitted in my talks were filled in with proofs, references, and so on, in the now-famous Marsden style. And the rest, as they say, is history. By now many people know that Jerry is an ideal coworker and coauthor, and I was lucky to be an early benefactor of his wonderful talents and personality.”

A talented and prolific expositor, Jerry has written numerous other books, from elementary to advanced level, in addition to his many research articles. Mathematical Foundations of Elasticity (with Tom Hughes [300]) introduced a generation of engineers with appetites for abstraction to a unified and global approach to the subject, and his recent book Introduction to Mechanics and Symmetry (with Tudor Ratiu [303]) has been remarkably useful to a wide range of scientists and engineers. When Jerry won the 1990 Norbert Wiener Prize (jointly with Michael Aizenman), he noted in his response to the citation that Wiener was “classifiable neither as a pure nor an applied mathematician. He had breadth and depth that worked together in a mutually supportive way.” The same is true of Jerry: it is no accident that he began his career in mathematical physics, moved to a mathematics department, and is now working in the Division of Engineering and Applied Science at Caltech.

Jerry’s influence on mathematical education has also been significant.
His books on calculus and complex variables are widely used and, with their skillful blend of concreteness and abstraction, have influenced generations of undergraduates. Thorough and wide-ranging in their coverage, they leave the conscientious student with a solid grounding in both theoretical techniques and physical intuition.

Jerry’s Ph.D. and postdoctoral students, some of them now leaders in their fields, have made significant contributions in many areas themselves. In addition, Jerry has worked tirelessly for the mathematical community, serving on editorial boards and arranging conferences and workshops, all the while teaching a stellar array of undergraduate and graduate students and post-docs, first at UC Berkeley, and now at Caltech.

His extraordinarily influential paper with David Ebin [13] on the analysis of ideal fluid flows remains a classic in the field. It followed upon Arnold’s 1966 paper¹ on ideal fluid flows, which showed how the Euler dynamics for rigid bodies and fluids could be viewed as geodesic flow on SO(3) with a left-invariant metric, and on $\mathrm{Diff}_{\mathrm{vol}}(\Omega)$ (the volume-preserving diffeomorphism group of a region $\Omega$ in $\mathbb{R}^3$) with the right-invariant metric defined by the fluid kinetic energy. Ebin and Marsden [13] put this work in the context of Sobolev ($H^s$) manifolds and showed that Arnold’s geodesic flow on $H^s\text{-}\mathrm{Diff}_{\mathrm{vol}}(\Omega)$, the volume-preserving diffeomorphisms of $\Omega$ to itself of Sobolev class $H^s$, comes from a smooth geodesic spray. This allowed them to show that the initial value problem for the Euler equations could be solved using Picard iteration and techniques from ordinary differential equation theory. Stephen Smale has remarked:

“Jerry Marsden has many fine achievements to his credit. But I am particularly fond of his early work with David Ebin on the equations of fluid mechanics. There are many sides to this study. It gave formal ideas of Arnold great substance and provided an elegant way of presenting old and new fundamental work on the existence of solutions of Navier–Stokes and Euler equations. The rigorous group setting and one of the first important uses of infinite dimensional manifolds are there as well. Quite a milestone in mathematics!”

¹ Arnold, V. I. [1966], Sur la géométrie différentielle des groupes de Lie de dimension infinie et ses applications à l’hydrodynamique des fluides parfaits, Ann. Inst. Fourier, Grenoble, 16, 319–361.

His body of work on “reduction theory,” begun with Alan Weinstein, was an outgrowth of ideas developed by Smale (following Jacobi and others), who introduced the use of symmetry ideas in the context of tangent and cotangent bundles of configuration spaces with Hamiltonians in the form of kinetic plus potential energy. The Marsden and Weinstein paper [30] unified approaches of both Smale² and Arnold by putting this “reduction theory” in the context of symplectic manifolds. For instance, in the related Poisson context (developed by his student Richard Montgomery), if one starts with a cotangent bundle $T^*Q$ and a Lie group $G$ acting on $Q$, then the quotient $(T^*Q)/G$ is a bundle over $T^*(Q/G)$ with fiber $\mathfrak{g}^*$, the dual of the Lie algebra of $G$. As described by Marsden [170], “Thus, one can say, perhaps with only a slight danger of oversimplification, that reduction theory synthesizes the work of Smale, Arnold (and their predecessors of course) into a bundle, with Smale as the base and Arnold as the fiber.” Reduction theory has now been used successfully in a wide variety of fields, and we refer the reader to the overview articles by Marsden [170; 227] as well as many of the articles in this volume for current applications.

² Smale, S. [1970], Topology and mechanics, Invent. Math., 10, 305–331; 11, 45–64.

From these works emerge more than specific theorems and techniques, deep and elegant as they may be. Viewing Jerry Marsden’s contributions as a whole, one finds a clear, pedagogical, and fundamental approach to the subject of mechanics that blends geometry, analysis, and dynamics in powerful, yet practical ways. Thus, while developing abstract techniques in dynamical systems theory, Jerry also helped understand specific orbit trajectories (with a group at the Jet Propulsion Lab) that were used in the Genesis Discovery Mission, launched on August 8, 2001 [244]. In the course of developing the averaged fluid equations (with Holm, Ratiu, and Shkoller), he also contributed to their use in turbulent flow computations [267]. While working out infinite dimensional versions of the Melnikov method and Smale–Birkhoff theory to prove the existence of “Smale horseshoes” in the context of partial differential equations, the chaotic oscillations of a forced beam were being analyzed [71]; and while developing symplectic-energy-momentum preserving variational integrators based on discrete variational principles, Jerry contributed to specific projects (with Michael Ortiz) to simulate the crushing of aluminum cans and analyze fracture mechanics and collision problems [253; 284].

In each of the general areas noted above, we have solicited survey and research articles that illustrate more specifically how some of the methods pioneered by Marsden are currently being used.

For Elasticity and Analysis, the paper by J. M. Ball entitled “Some open problems in elasticity” is a self-contained overview which highlights some general open problems in elasticity theory, including some new results showing that local minimizers of the total elastic energy satisfy a weak form of the equilibrium equations. This is followed by the article of A. Mielke, “Finite elastoplasticity, Lie groups and geodesics on SL(d),” which interprets notions of nonlinear plasticity theory in terms of Lie groups, among other things. The contribution of A. Lew and M. Ortiz, entitled “Asynchronous variational integrators,” describes a new class of algorithms for nonlinear elastodynamics which is based upon a discrete version of Hamilton’s principle.

D. D. Holm’s article in Fluid Mechanics, “Euler–Poincaré dynamics of perfect complex fluids,” describes the use of Lagrangian reduction by stages to derive the Euler–Poincaré equations for non-dissipative motion of exotic fluids such as liquid crystals, superfluids, Yang–Mills magnetofluids and spin-glass systems.
Inclusion of defects, such as vortices, in the order parameters is also treated. S. Shkoller’s contribution, “The Lagrangian averaged Euler (LAE-α) equations with free-slip or mixed boundary conditions,” presents a simple proof of well-posedness of the Euler-α equations with novel boundary conditions. E. Knobloch’s and J. Vega’s article, “Nearly inviscid Faraday waves,” explores some of the consequences of introducing small viscosity in the study of surface-gravity-capillary waves excited by vertical vibration of a fluid layer. The contribution of T. J. R. Hughes and A. A. Oberai, “The variational multiscale formulation of LES with applications to turbulent channel flows,” studies turbulent two-dimensional equilibrium and three-dimensional non-equilibrium channel flows using a variational multi-scale formulation of Large Eddy Simulation (LES).

In Dynamical Systems Theory, M. Golubitsky and I. Stewart address “Patterns of oscillation in coupled cell systems.” The dynamics of coupled cell systems both in biological contexts (animal gaits) and physical contexts (coupled pendula/Josephson junctions) are described, with an emphasis on the use of symmetry ideas. In particular, the issue of how the modeling assumptions dictate the kinds of equilibria and periodic solutions is explored. This is followed by the paper of A. Chenciner, J. Gerver, R. Montgomery, and C. Simó, “Simple choreographic motions of N bodies: A preliminary study.” They describe the existence of new periodic solutions to the N body problem in which all N masses trace the same curve without colliding. J. Scheurle and S. Walcher’s “On normal form computations” closes this section by reviewing computational procedures involved in transforming a vector field into a suitable normal form about a stationary point.

For Geometric Mechanics, the paper by J. P. Ortega and T. Ratiu entitled “The optimal momentum map” discusses the (dare we say) classical Marsden–Weinstein reduction procedure and the use of a new optimal momentum map which more efficiently encodes symmetry information of the underlying Hamiltonian system. V. Guillemin’s and C. Zara’s “Combinatorial formulas for products of Thom classes” obtains combinatorial descriptions of the equivariant Thom class dual to the Morse–Whitney stratification of compact Hamiltonian G-manifolds. The paper of R. Littlejohn and K. Mitchell, “Gauge theory of small vibrations in polyatomic molecules,” considers molecular vibrations in the context of gauge theory and fiber bundle theory.

In Geometric Control Theory, the paper by A. Bloch and N. Leonard, entitled “Symmetries, conservation laws, and control,” traces the role of Marsden’s ideas on reduction and symmetries in the setting of nonlinear control theory. Specific applications to the dynamics of rigid spacecraft with a rotor and the dynamics of underwater vehicles are considered in detail.

Finally, for Relativity and Quantum Mechanics, A. E. Fischer and V. Moncrief’s article entitled “Conformal volume collapse of 3-manifolds and the reduced Einstein flow” describes the Hamiltonian reduction of Einstein’s equations of general relativity and the process of volume collapse. They prove that collapse occurs either along circular fibers, embedded tori, or completely to a point, but surprisingly, always with bounded curvature. This is followed by M. Gotay’s contribution “On quantizing semisimple basic algebras,” which examines whether there exists a consistent quantization of the coordinate ring of a basic coadjoint orbit of a semisimple Lie group.

We hope that this collection of articles gives the reader some appreciation of both the unity and diversity of the topics influenced by Jerry Marsden’s approach to mechanics. But here we wish to do more than survey his mathematical and scientific contributions; we also want to celebrate Jerry as a colleague and friend. It therefore seems appropriate to conclude with some personal reminiscences.


Phil Holmes: I first met Jerry in the summer of 1976, at a conference on dynamical systems at Southampton University, organised by David Rand and Brian Griffiths. He joined Nancy Kopell, John Guckenheimer and Ken Cooke as one of four mathematicians from the USA invited to that meeting. I had completed my Ph.D. in Engineering (experimental studies of dispersive wave propagation in structures) at the Institute of Sound and Vibration a couple of years earlier, and had begun working on nonlinear vibration problems with David Rand. We had done some single and finite degree of freedom problems, and I wanted to begin looking at PDEs in structural mechanics from a dynamical systems perspective. I believe it was in late 1975 that someone told me Jerry was working on a book about bifurcations and dimension reduction for such problems. I wrote to ask for more information and back came a huge package, re-taped and tied with string by UK customs, containing a 500+ page photocopy of the typescript of “The Hopf Bifurcation and its Applications” by Marsden and McCracken [297]. In financially-constrained Britain I had never seen more than fifty pages of xerox copies (all copies were xerox copies in those days) at one time, without special permission. I started reading, and I’m still reading Jerry’s papers and trying to catch up.

Jerry and I began corresponding. We met at the Southampton conference and I subsequently visited him in Berkeley during a hectic job-seeking tour of the USA in the Fall of 1976, and again during his visit to Heriot–Watt University in Edinburgh as a Carnegie Fellow in the spring of 1977. This resulted in our first joint paper [54], and was the beginning of a twenty-five year collaboration and friendship which I hope will last at least another twenty-five.
For me, one of the high points of this was our paper [71], in which we gave one of the first examples of a PDE with chaotic solutions (Smale horseshoes), via an infinite-dimensional extension of the Smale–Birkhoff theorem and Melnikov’s method. (John Guckenheimer gave another at about the same time via center-manifold reduction of a reaction-diffusion equation at a codimension-two bifurcation point.) After I had settled in the USA at Cornell University, Jerry invited me to Berkeley for the Spring semester of 1981, during which we wrote a series of papers [73; 77; 82] extending Melnikov type analyses to multi-degree-of-freedom Hamiltonian systems (although not without leaving a few gaps in our proofs to be filled by others, in the time-honored tradition of Poincaré).

While we have not written joint papers in the last ten years, his work at the interface of mechanics and mathematics has remained an inspiration for my own, and we have met once or twice every year and had countless scientific, editorial, organizational, and mathematico-political discussions and collaborations. Jerry is a mainstay of the Journal of Nonlinear Science, which I now edit, and I’m proud to serve as an advisor to the Springer Applied Mathematical Sciences Series which Jerry edits with Larry Sirovich and Stu Antman. I was even prouder to nominate him for the AMS–SIAM Norbert Wiener Prize in 1990, and to support his successful nominations to the Royal Society of Canada and the American Academy of Arts and Sciences.

But rather than these well-deserved honors, I especially wish to celebrate Jerry’s continuing emphasis on mentoring and encouraging young people. Few people outside academia, and few Deans and Presidents within it, realise that a large part of research is actually teaching: teaching bright but sometimes erratically-educated graduate students the necessary background and methods, teaching colleagues and collaborators about new advances, and teaching oneself all the things that no one else did. Jerry is a master teacher: in his many textbooks at all levels, and in his conference presentations and lecture courses, always delivered with elegance, polish, and a little humor. (He is almost the only person I know who can put content into powerpoint, although he’s also careful to explain that it’s not actually powerpoint.)

At Berkeley and Caltech Jerry has had, and continues to have, a succession of wonderful Ph.D. students and postdocs, many of whom have gone on to propagate his grand project of geometrizing mechanics (their names appear elsewhere in this volume). He has been equally generous with his time with young visitors (for many of whom, including myself, he raised the funds to invite), with the students of others, and simply with people who write or approach him to ask questions at conferences and workshops. I’m happy that we’ve been able to include articles contributed by several such colleagues in this Festschrift.

Certainly, Jerry’s interest and involvement in the early struggles of a mechanic poorly trained in mathematics was enormously encouraging to me. In those far-off days, from misty England, he seemed to me a senior scientist: a Professor from distant and fabled Berkeley, David Lodge’s Euphoric State U.
Now that we are both almost seniors, he no longer seems that much older, but he still knows a lot more geometry and analysis, and I’m still taking notes in the second row and having trouble with the homework.

Paul Newton: Like many of us, I first met Jerry in print. As a Freshman at Harvard in 1977, I learned Stokes’ theorem, Green’s theorem and the divergence theorem from his (and Tromba’s) beautiful “Vector Calculus.” Those who are familiar with the first edition and who are aware of Jerry’s fascination with weather patterns will suspect that his favorite aspect of the book must have been the cloud formations on its cover. Mine was the elegant formulation of these theorems in terms of differential forms, something I had never seen in high school! I remember using this work to such an extent (so much for my social life) that today it is held together only by being wedged between two volumes on my shelf.

Fast forward eight years to 1985. While a postdoc at Stanford, I participated (inconspicuously) in a seminar series on dynamics which Jerry, together with Lieberman, ran at Berkeley, and it was there that I first took note of what I now know to be his uncommon blend of open-mindedness and depth of thought, coupled with a generosity of spirit, demonstrated so vividly in his mentoring of young mathematicians. But it was when we both ended up in Los Angeles at roughly the same time (1993 in my case) that we became friends. I was busy working out the details of how the geometric phase manifested itself in the context of vortex dynamics problems, and Jerry’s encouragement and insights were invaluable in helping me move from an early interest in nonlinear dispersive wave models to more general issues in applied dynamical systems theory. Our interests overlapped again when I showed him some problems involving the motion of vortices on a sphere (with applications to weather patterns!). This led him (with Sergey Pekarsky) to begin applying nonlinear stability techniques to relative equilibrium configurations of vortices on a sphere.

As I read and re-read many of his books and papers, I seem to fall farther and farther behind. But occasionally I look up to where it all began, the punished copy of “Vector Calculus” on my bookshelf, and I wonder if I would have become a mathematician had my professor chosen Edwards and Penney instead of Marsden and Tromba.

Alan Weinstein: My work with Jerry has two facets: (1) symplectic reduction, Poisson geometry, and applications to stability of continuum mechanical systems; (2) calculus books. Our original work on reduction, which is probably one of the two or three most influential papers I have written, was stimulated by our listening to lectures of Smale around 1970; Smale had developed the theory in the special case of lifted actions on cotangent bundles. I think that my interest in abstract symplectic geometry, combined with some interest in physics, meshed perfectly with Jerry’s interest in relativity and applied Hamiltonian dynamics.
(One should always mention in this context that symplectic reduction was discovered independently at about the same time by Ken Meyer, though I think it might be fair to say that he conceived of this construction in narrower terms than we did.)

About 10 years later, we were attending the “dynamics seminar” in the Berkeley physics department, where Allan Kaufman and Robert Littlejohn were studying recent work of John Greene and Phil Morrison on the Hamiltonian structure of equations in plasma physics. These authors had a Poisson structure for the Maxwell–Vlasov equations which they found by trial and error and for which they checked the Jacobi identity by hand; according to Morrison, this took them 4 months of work, mostly calculations. Jerry and I spent about 6 months developing the right application of reduction to this problem, after which we could derive the correct bracket in 4 minutes, with the Jacobi identity coming for free. Another payoff was that we discovered an error in the Morrison–Greene formula.


It was Jerry’s interest in applications which kept us going through much of the 1980s, using these Poisson brackets in applications of Arnol’d’s general method for analyzing the stability of continuum-mechanical motions. Much of this work was also done in collaboration with Darryl Holm and Tudor Ratiu, and I eventually dropped out of the “group”, but the subject has continued to evolve in the hands of Jerry and his collaborators (now including notably Steve Shkoller), the latest developments being the study of “α-Euler equations” and wide applications of Lagrangian (as opposed to Hamiltonian) methods for the study of stability.

On the calculus side, I recall that our collaboration started with a discussion at the end of a tennis game. Jerry was in contact with a new publisher who wanted to do a calculus book, and we had some new ideas for calculus teaching, based on the use of bifurcation ideas to replace the early introduction of limits (which eventually appeared only in a spin-off book called Calculus Unlimited). We went through several publishers and many versions of the book, and I never had time to play tennis again. I think that Jerry kept it (and squash) up, though.

Acknowledgments: The editors owe a debt of gratitude to Wendy McKay and Ross Moore. Without their combined expertise in LaTeX and other matters, this volume might not have been presented to Jerry until his 70th birthday. We also wish to thank the editors and staff at Springer-Verlag, particularly Achi Dosanjh and Elizabeth Young, for making this book a reality.

Paul Newton, Santa Barbara, California
Philip Holmes, Princeton, New Jersey
Alan Weinstein, Berkeley, California

March 2002

Part I

Elasticity and Analysis


1 Some Open Problems in Elasticity

John M. Ball

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT Some outstanding open problems of nonlinear elasticity are described. The problems range from questions of existence, uniqueness, regularity and stability of solutions in statics and dynamics to issues such as the modelling of fracture and self-contact, the status of elasticity with respect to atomistic models, the understanding of microstructure induced by phase transformations, and the passage from three-dimensional elasticity to models of rods and shells. Refinements are presented of the author’s earlier work Ball [1984a] on showing that local minimizers of the elastic energy satisfy certain weak forms of the equilibrium equations.

Contents

1 Introduction
2 Elastostatics
  2.1 The Stored-Energy Function and Equilibrium Solutions
  2.2 Existence of Equilibrium Solutions
  2.3 Regularity and the Classification of Singularities
  2.4 Satisfaction of the Euler–Lagrange Equation and Uniform Positivity of the Jacobian
  2.5 Regularity and Self-Contact
  2.6 Uniqueness of Solutions
  2.7 Structure of the Solution Set
  2.8 Energy Minimization and Fracture
3 Dynamics
  3.1 Continuum Thermomechanics
  3.2 Existence of Solutions
  3.3 The Relation Between Statics and Dynamics
4 Multiscale Problems
  4.1 From Atomic to Continuum
  4.2 From Microscales to Macroscales
  4.3 From Three-Dimensional Elasticity to Theories of Rods and Shells
References


1 Introduction

In this paper I highlight some outstanding open problems in nonlinear (sometimes called finite) elasticity theory. While many of these will be well known to experts on analytic aspects of elasticity, I hope that the compilation will be of use both to those new to the field and to researchers in solid mechanics having different perspectives. Of course the selection of problems is a personal one, and indeed represents a list of those problems that I would most like to be able to solve, but cannot. In particular it concentrates on general open problems, or ones that illustrate general difficulties, rather than those related to very specific experimental situations, which is not to imply that the latter are not important or instructive. I have not included any open problems connected with the numerical computation of solutions, since I recently discussed some of these in Ball [2001].

The only new results of the paper are in connection with the problem of showing that local minimizers of the total elastic energy satisfy the weak form of the equilibrium equations. As I pointed out in Ball [1984a], there are hypotheses under which some forms of the equilibrium equations can be proved to hold, and in Section 2.4 I take the opportunity to present some refinements of this old work.

The paper is essentially self-contained, and can be read by those having no knowledge of elasticity theory. For those seeking further background on the subject I have written a short introduction (Ball [1996]) to some of the issues, intended for research students, which I hope is a quick and easy read. For more serious study in the spirit of this paper, the reader is referred to the books of Antman [1995], Ciarlet [1988, 1997, 2000], Marsden and Hughes [1983] and Šilhavý [1997]. Other excellent but older books and survey articles are Antman [1983], Ericksen [1977b], Gurtin [1981] and Truesdell and Noll [1965]. Valuable additional perspectives can be found in the books of Green and Zerna [1968], Green and Adkins [1970], and Ogden [1984].

It is an honour to dedicate this article to Jerry Marsden, both as a friend and in recognition of his important contributions to elasticity, and thus to help celebrate his many talents as a mathematician, thinker and writer.

2 Elastostatics

2.1 The Stored-Energy Function and Equilibrium Solutions

Consider an elastic body which in a reference configuration occupies the bounded domain $\Omega \subset \mathbb{R}^3$. We suppose that $\Omega$ has a Lipschitz boundary $\partial\Omega = \partial\Omega_1 \cup \partial\Omega_2 \cup N$, where $\partial\Omega_1$, $\partial\Omega_2$ are disjoint relatively open subsets of $\partial\Omega$ and $N$ has two-dimensional Hausdorff measure $\mathcal{H}^2(N) = 0$ (i.e., $N$ has zero area). Deformations of the body are described by mappings $y : \Omega \to \mathbb{R}^3$, where $y(x) = (y_1(x), y_2(x), y_3(x))$ denotes the deformed position of the material point $x = (x_1, x_2, x_3)$. We assume that $y$ belongs to the Sobolev space $W^{1,1}(\Omega; \mathbb{R}^3)$, so that in particular the deformation gradient $Dy(x)$ is well defined for a.e. $x \in \Omega$. For each such $x$ we can identify $Dy(x)$ with the $3 \times 3$ matrix $(\partial y_i/\partial x_j)$. We require the deformation $y$ to satisfy the boundary condition

$$ y|_{\partial\Omega_1} = \bar y(\cdot), \tag{2.1} $$

where $\bar y : \partial\Omega_1 \to \mathbb{R}^3$ is a given boundary displacement.

We suppose for simplicity that the body is homogeneous, i.e., the material response is the same at each point. In this case the total elastic energy corresponding to the deformation $y$ is given by

$$ I(y) = \int_\Omega W(Dy(x))\,dx, \tag{2.2} $$

where $W = W(A)$ is the stored-energy function of the material. We suppose that $W : M^{3\times 3}_+ \to \mathbb{R}$ is $C^1$ and bounded below, so that without loss of generality $W \ge 0$. (Here and below, $M^{m\times n}$ denotes the space of real $m \times n$ matrices, and $M^{n\times n}_+$ denotes the space of those $A \in M^{n\times n}$ with $\det A > 0$.) The Piola–Kirchhoff stress tensor is given by

$$ T_R(A) = D_A W(A). \tag{2.3} $$

By formally computing

$$ \frac{d}{d\tau} I(y + \tau\varphi)\Big|_{\tau=0} = 0, $$

we obtain the weak form of the Euler–Lagrange equation for $I$, that is

$$ \int_\Omega D_A W(Dy)\cdot D\varphi\,dx = 0 \tag{2.4} $$

for all smooth $\varphi$ with $\varphi|_{\partial\Omega_1} = 0$. This can be shown (cf. Antman and Osborn [1979]) to be equivalent to the balance of forces on arbitrary subbodies. If $y$, $\partial\Omega_1$ and $\partial\Omega_2$ are sufficiently regular then (2.4) is equivalent to the pointwise form of the equilibrium equations

$$ \operatorname{div} D_A W(Dy) = 0 \quad \text{in } \Omega, \tag{2.5} $$

together with the natural boundary condition of zero applied traction

$$ D_A W(Dy)\,n = 0 \quad \text{on } \partial\Omega_2, \tag{2.6} $$


where $n = n(x)$ denotes the unit outward normal to $\partial\Omega$ at $x$. (More generally, we could have prescribed nonzero tractions of various types on $\partial\Omega_2$, as well as including the potential energy of body forces such as gravity in the expression for the energy (2.2), but for simplicity we have not done this, since the main difficulties we address are already present without these additions.)

To avoid interpenetration of matter, it is natural to require that $y : \Omega \to \mathbb{R}^3$ be invertible. To try to ensure that deformations have this property, we suppose that

$$ W(A) \to \infty \quad \text{as } \det A \to 0+. \tag{2.7} $$

So as to also prevent orientation reversal we define $W(A) = \infty$ if $\det A \le 0$. Then $W : M^{3\times 3} \to [0, \infty]$ is continuous. Clearly if $I(y) < \infty$ then

$$ \det Dy(x) > 0 \quad \text{for a.e. } x \in \Omega. \tag{2.8} $$

Since $y$ is not assumed to be $C^1$, (2.8) does not imply even local invertibility. For studies of local and global invertibility in the context of elasticity, or relevant to it, see Ball [1981], Bauman and Phillips [1994], Ciarlet and Nečas [1985], Fonseca and Gangbo [1995], Giaquinta, Modica and Souček [1994], Meisters and Olech [1963], Šverák [1988] and Weinstein [1985].

We assume that for any elastic material the stored-energy function $W$ is frame-indifferent, i.e.,

$$ W(RA) = W(A) \quad \text{for all } R \in SO(3),\ A \in M^{3\times 3}. \tag{2.9} $$

In addition, if the material has a nontrivial isotropy group $S$, $W$ satisfies the material symmetry condition

$$ W(AQ) = W(A) \quad \text{for all } Q \in S,\ A \in M^{3\times 3}. $$

The case $S = SO(3)$ corresponds to an isotropic material. For incompressible materials the deformation $y$ is required to satisfy the constraint

$$ \det Dy(x) = 1 \quad \text{for a.e. } x \in \Omega. $$

All of the problems and results contained in this article have corresponding incompressible versions, some of which we cite in the references. However, in general we do not state these explicitly.
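The conditions (2.7), (2.9) and the isotropy condition can be spot-checked numerically for a concrete stored-energy function. The following is a minimal sketch, assuming the illustrative choice $W(A) = |A|^2 + h(\det A)$ with $h(\delta) = \delta^2 - 2\log\delta$; this particular $h$ and all function names are assumptions made for the example, not taken from the text.

```python
import math

def det3(A):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def stored_energy(A):
    """Illustrative W(A) = |A|^2 + h(det A), h(d) = d^2 - 2 log d (assumed).

    W is set to +infinity when det A <= 0, as in the text.
    """
    d = det3(A)
    if d <= 0.0:
        return math.inf
    norm_sq = sum(a * a for row in A for a in row)
    return norm_sq + d * d - 2.0 * math.log(d)

def rotation(axis, angle):
    """Rotation by `angle` about coordinate axis 0, 1 or 2 (an element of SO(3))."""
    c, s = math.cos(angle), math.sin(angle)
    i, j = (axis + 1) % 3, (axis + 2) % 3
    R = [[1.0 if r == q else 0.0 for q in range(3)] for r in range(3)]
    R[i][i], R[i][j], R[j][i], R[j][j] = c, -s, s, c
    return R

if __name__ == "__main__":
    A = [[1.2, 0.3, 0.0], [0.1, 0.9, 0.2], [0.0, 0.4, 1.1]]
    R = matmul(rotation(2, 0.7), rotation(0, -1.3))
    print("W(A)  =", stored_energy(A))
    print("W(RA) =", stored_energy(matmul(R, A)))  # frame-indifference (2.9)
    print("W(AR) =", stored_energy(matmul(A, R)))  # isotropy, S = SO(3)
```

Because this $W$ depends on $A$ only through $|A|$ and $\det A$, both of which are unchanged by multiplication with a rotation on either side, the three printed values should agree up to rounding.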

2.2 Existence of Equilibrium Solutions

There are two traditional routes to proving the existence of equilibrium solutions. The first, pioneered by Stoppelli [1954, 1955] and described in the book of Valent [1988], is to use the implicit function theorem in a suitable Banach space $X$ to prove the existence of an equilibrium solution close to a given one, when the data of the problem are slightly perturbed. In order to make this work, it is necessary to use spaces $X$ of sufficiently smooth mappings, for example subspaces of $W^{2,p}(\Omega; \mathbb{R}^3)$ for $p > 3$ or $C^{2+\alpha}(\Omega; \mathbb{R}^3)$, so as to control the nonlinear dependence on $Dy$. In addition, the linearized elasticity operator at the given solution should be invertible as a map from $X$ to a suitable target space $Y$. While this method automatically delivers smooth solutions, it is by its nature restricted to small perturbations (for example, small boundary displacements from a stress-free state), and because of the regularity properties required for the linearized operator it in general only applies to situations when $\partial\Omega_1$ and $\partial\Omega_2$ do not meet, for example when one of them is empty. In particular, mixed boundary conditions typically encountered in applications, for example (2.1) with $\partial\Omega_1$ comprising the two end-faces of a cylindrical rod, are in general not allowed, at least with the techniques as currently developed (see Section 2.7).

The second route is to prove the existence of a global minimizer of $I$ via the direct method of the calculus of variations. In principle such a minimizer should satisfy the equilibrium equations, at least in weak form, but this turns out to be a subtle matter (see Sections 2.3, 2.4). More generally we could ask for conditions ensuring that there exist some kind of local minimizer. Let

$$ \mathcal{A} = \{ y \in W^{1,1}(\Omega; \mathbb{R}^3) : I(y) < \infty,\ y|_{\partial\Omega_1} = \bar y \}, $$

where the boundary condition is understood in the sense of trace. In the definition of $\mathcal{A}$ we could include the requirement that $y$ be invertible; this can be done in various ways, one of which is discussed in Section 2.5.

2.1 Definition. Let $1 \le p \le \infty$. The deformation $y \in \mathcal{A}$ is a $W^{1,p}$ local minimizer of $I$ if there exists $\varepsilon > 0$ such that $I(z) \ge I(y)$ for any $z \in \mathcal{A}$ with $\|z - y\|_{W^{1,p}} \le \varepsilon$.

The problem of proving the existence of local, but not global, minimizers is discussed later (see Problem 9). A typical result on global minimization is the following.

2.2 Theorem. Suppose that $W$ satisfies the hypotheses

(H1) $W$ is polyconvex, i.e., $W(A) = g(A, \operatorname{cof} A, \det A)$ for all $A \in M^{3\times 3}$ for some convex $g$,

(H2) $W(A) \ge c_0\big(|A|^2 + |\operatorname{cof} A|^{3/2}\big) - c_1$ for all $A \in M^{3\times 3}$, where $c_0 > 0$.

Then, if $\mathcal{A}$ is nonempty, there exists a global minimizer $y^*$ of $I$ in $\mathcal{A}$.

Here and below we take $|\cdot|$ to be the Euclidean norm on $M^{3\times 3}$, with corresponding inner product $A \cdot B = \operatorname{tr}(A^T B)$. Theorem 2.2 is a refinement by Müller, Qi and Yan [1994] of the result in Ball [1977a]. For the problem to be nontrivial we need that $\mathcal{H}^2(\partial\Omega_1) > 0$.


The hypothesis (H1) is known to be too strong for the following reason. Let $f : M^{m\times n} \to \mathbb{R} \cup \{+\infty\}$ be Borel measurable and bounded below. We recall that $f$ is said to be quasiconvex at $A \in M^{m\times n}$ if the inequality

$$ \int_\Omega f(A + D\varphi(x))\,dx \ge \int_\Omega f(A)\,dx \tag{2.10} $$

holds for any $\varphi \in C_0^\infty(\Omega; \mathbb{R}^m)$, and is quasiconvex if it is quasiconvex at every $A \in M^{m\times n}$. Here $\Omega \subset \mathbb{R}^n$ is any bounded open set whose boundary $\partial\Omega$ has zero $n$-dimensional Lebesgue measure. A standard scaling argument (see, for example, Ball and Murat [1984]) shows that contrary to appearances these definitions do not depend on $\Omega$. Results of Morrey [1952] and Acerbi and Fusco [1984] imply that if $f : M^{m\times n} \to \mathbb{R}$ is quasiconvex and satisfies the growth condition

$$ C_1|A|^p - C_0 \le f(A) \le C_2(|A|^p + 1) \quad \text{for all } A \in M^{m\times n}, \tag{2.11} $$

where $p > 1$ and where $C_0$ and $C_1 > 0$, $C_2 > 0$ are constants, then

$$ \mathcal{F}(y) = \int_\Omega f(Dy)\,dx \tag{2.12} $$

attains a global minimum on

$$ \mathcal{A} = \{ y \in W^{1,1}(\Omega; \mathbb{R}^m) : y|_{\partial\Omega_1} = \bar y \}. $$

Here we assume that $\Omega$ has Lipschitz boundary $\partial\Omega$, that $\partial\Omega_1 \subset \partial\Omega$ is $\mathcal{H}^{n-1}$ measurable and that $\bar y : \partial\Omega_1 \to \mathbb{R}^m$ is given such that $\mathcal{A}$ is nonempty. As shown by Ball and Murat [1984], quasiconvexity of $f$ is necessary for the existence of a global minimizer for all perturbed functionals of the form

$$ \mathcal{F}_1(y) = \int_\Omega \big[ f(Dy(x)) + h(x, y(x)) \big]\,dx $$

with $h(\cdot, \cdot) \ge 0$ continuous.

These results strongly suggest that (H1) should be replaced by the requirement that $W$ be quasiconvex, a weaker condition than polyconvexity. For example, it is easily seen that a larger class of $W$ for which Theorem 2.2 holds consists of those of the form $W = W_1 + f$, where $W_1$ satisfies (H1) and (H2), and where $f : M^{3\times 3} \to \mathbb{R}$ is quasiconvex and satisfies (2.11). That this really is a larger class can be seen by taking $f = KF$ for a large $K > 0$, where $F$ is quasiconvex but not polyconvex. Such $F$ exist satisfying $F(RAQ) = F(A)$ for all $R, Q \in SO(3)$, $A \in M^{3\times 3}$, and can be constructed by the method of Šverák [1991]¹. More generally we could take $f$ to satisfy

$$ f(A) \ge C_1|A|^p - C_0 \tag{2.13} $$

for some $p > 1$, $C_1 > 0$, $C_0$ and to be the supremum of a nondecreasing sequence of continuous quasiconvex functions $f^k : M^{3\times 3} \to [0, \infty)$, each satisfying a growth condition $0 \le f^k(A) \le \alpha_k|A|^p - \beta_k$ for constants $\alpha_k > 0$, $\beta_k$. (Kristensen [1994] has shown that a lower semicontinuous function $f : M^{3\times 3} \to [0, \infty]$ satisfying (2.13) is the supremum of such a sequence if and only if $f$ is closed $W^{1,p}$ quasiconvex in the sense of Pedregal [1994], namely that Jensen's inequality $\langle \nu, f \rangle \ge f(\bar\nu)$ holds for all homogeneous $W^{1,p}$ gradient Young measures² $\nu$.)

However, as they stand none of the existence theorems for minimizers of integrals of general quasiconvex functions apply to elasticity, since they all assume growth conditions such as (2.11) which are not consistent with the condition (2.7). (The same applies to other results, such as the relaxation theorem of Dacorogna [1982].) In particular, it is not clear whether or not a quasiconvex $W$ satisfying our hypotheses can be written as the supremum of everywhere finite continuous quasiconvex functions. This is not true in general for quasiconvex functions $f : M^{m\times n} \to [0, \infty]$; for example we can take $m = n = 2$, $f(A) = 0$ if $A \in \{A_1, A_2, A_3, A_4\}$ and $f(A) = \infty$ otherwise, where the $A_i$ are diagonal matrices in a Tartar configuration (see Tartar [1993]), for example

$$ A_1 = \operatorname{diag}(-2, 1), \quad A_2 = \operatorname{diag}(1, 2), \quad A_3 = \operatorname{diag}(2, -1), \quad A_4 = \operatorname{diag}(-1, -2). $$

Then $f$ is quasiconvex, since any $y$ with $Dy \in \{A_1, A_2, A_3, A_4\}$ a.e. has constant gradient (this following, for example, from the general result of Chlebík and Kirchheim [2001]). But the argument of Tartar shows that if $f$ were the supremum of continuous quasiconvex functions $f^k : M^{2\times 2} \to [0, \infty)$ we would have $f(0) = 0$, a contradiction.

¹ For example we can take $F = H^{qc}$ to be the quasiconvexification of $H(A) = \min\{ |U(A) - \mathbf{1}|^p,\ |U(A) - \lambda\mathbf{1}|^p \}$, where $\lambda > 1$ and $U(A) = (A^T A)^{1/2}$. The quasiconvexification $H^{qc}$ of $H$ is defined to be the supremum of all quasiconvex functions $\psi \le H$.

² See Young [1969], Tartar [1979], Ball [1989] for the definition and properties of the Young measure $(\nu_x)_{x\in\Omega}$ corresponding to a sequence of mappings $z^{(j)} : \Omega \to \mathbb{R}^s$ satisfying a suitable bound, say $\|z^{(j)}\|_{L^1} \le M < \infty$, where $\Omega \subset \mathbb{R}^n$ is open (or measurable). For each $x \in \Omega$, $\nu_x$ is a probability measure on $\mathbb{R}^s$ giving the limiting distribution of values of $z^{(j)}(p)$ as $j \to \infty$ and $p \to x$. If $f : \mathbb{R}^s \to \mathbb{R}$ is continuous, then the weak limit of $f(z^{(j)})$ in $L^1(E)$, where $E \subset \Omega$ is measurable, is given by the function $x \mapsto \langle \nu_x, f \rangle$, whenever the weak limit exists. In particular, if $z^{(j)} \rightharpoonup z$ in $L^1(E)$, then $z(x) = \bar\nu_x$ for $x \in E$, where $\bar\mu$ denotes the centre of mass of a measure $\mu$. Such a Young measure is homogeneous if $\nu = \nu_x$ is independent of $x$. If $1 < p \le \infty$, a $W^{1,p}$ gradient Young measure is a Young measure $(\nu_x)_{x\in\Omega}$ corresponding to a sequence $z^{(j)} = Dy^{(j)}$ of gradients bounded in $L^p(\Omega; M^{m\times n})$, where we identify $M^{m\times n}$ with $\mathbb{R}^{mn}$.


A further reason for preferring quasiconvexity to polyconvexity is that, unlike quasiconvexity, polyconvexity is not closed with respect to periodic homogenization (Braides [1994]).

Problem 1. Prove the existence of energy minimizers for elastostatics for quasiconvex stored-energy functions satisfying (2.7).

A principal difficulty here is that there is no known useful characterization of quasiconvexity. If $W$ is quasiconvex then $W$ is rank-one convex, that is the map $t \mapsto W(A + t\,a \otimes n)$ is convex for each $A \in M^{m\times n}$ and $a \in \mathbb{R}^m$, $n \in \mathbb{R}^n$. For 40 years it seemed possible that in fact rank-one convexity was equivalent to quasiconvexity, until Šverák [1992] found his well-known counterexample for the dimensions $n \ge 2$, $m \ge 3$. Then Kristensen [1999] used Šverák's example to show that for the same dimensions there is no local characterization of quasiconvexity. In the absence of a characterization leading to a new proof technique, one is forced to make direct use of the definition (2.10), which leads to serious problems of approximation by piecewise affine functions when (2.7) holds.

In Ball [1977a] it was shown how the hypotheses (H1), (H2) can be satisfied for a class of isotropic materials including models of natural rubbers, via theorems exploiting the representation

$$ W(A) = \Phi(v_1, v_2, v_3) \tag{2.14} $$

of the stored-energy function $W$ of an isotropic material, where $\Phi$ is a symmetric function of the singular values $v_i = v_i(A)$, that is of the eigenvalues of $(A^T A)^{1/2}$ (for a different proof of such theorems see Le Dret [1990]). However it is not obvious how to verify (H1) when the material is not isotropic, for example when it has cubic symmetry.

Problem 2. Are there ways of verifying polyconvexity and quasiconvexity for a useful class of anisotropic stored-energy functions?

To illustrate the difficulty in verifying (H1), in the isotropic case it is much more convenient to use the representation (2.14) rather than the equivalent representation $W(A) = h(I_1, I_2, I_3)$ in terms of the principal invariants $I_j = I_j(A)$. Perhaps it is significant that the function $\Phi$ in (2.14) has the same regularity as $W$, while $h$ is less regular (see Ball [1984], Sylvester [1985], Šilhavý [2000]). At any rate the more symmetric form (2.14) lends itself more easily to discussing convexity properties. For nonisotropic materials suitable representations do not seem to be available; for example, in the case of cubic symmetry it does not seem to be convenient to use the usual integrity basis (given, for example, in Green and Adkins [1970]).
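The rank-one convexity condition recalled above is easy to probe numerically for a polyconvex function. The sketch below relies on the standard fact that $t \mapsto \det(A + t\,a \otimes n)$ is affine, so that $h(\det(\cdot))$ is convex along rank-one lines whenever $h$ is convex. The stored-energy function $W(A) = |A|^2 + h(\det A)$ with $h(\delta) = \delta^2 - 2\log\delta$, the sample data and all names are illustrative assumptions.

```python
import math

def det3(A):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def rank_one_line(A, a, n, t):
    """Return A + t * (a tensor n), the point at parameter t on a rank-one line."""
    return [[A[i][j] + t * a[i] * n[j] for j in range(3)] for i in range(3)]

def stored_energy(A):
    """Illustrative polyconvex W(A) = |A|^2 + h(det A), h(d) = d^2 - 2 log d (assumed)."""
    d = det3(A)
    if d <= 0.0:
        return math.inf
    return sum(x * x for row in A for x in row) + d * d - 2.0 * math.log(d)

def second_difference(f, t, h):
    """Symmetric second difference f(t+h) - 2 f(t) + f(t-h)."""
    return f(t + h) - 2.0 * f(t) + f(t - h)

if __name__ == "__main__":
    A = [[1.2, 0.3, 0.0], [0.1, 0.9, 0.2], [0.0, 0.4, 1.1]]
    a, n = (0.2, -0.1, 0.3), (0.5, 0.4, -0.2)
    det_along_line = lambda t: det3(rank_one_line(A, a, n, t))
    w_along_line = lambda t: stored_energy(rank_one_line(A, a, n, t))
    # Vanishes up to rounding because det is affine along rank-one lines.
    print(second_difference(det_along_line, 0.3, 0.5))
    # Nonnegative because W is convex along rank-one lines.
    print(second_difference(w_along_line, 0.3, 0.5))
```

A nonnegative second difference at a few sample points is of course only a consistency check; it cannot establish rank-one convexity, let alone quasiconvexity.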

2.3 Regularity and the Classification of Singularities

The main open question concerning regularity is to decide when global, or local, minimizers of $I$ are smooth. A special case is

Problem 3. When is the minimizer $y^*$ in Theorem 2.2 smooth?

Here smooth means $C^\infty$ in $\Omega$, and $C^\infty$ up to the boundary (except in the neighbourhood of points $x_0 \in \overline{\partial\Omega_1} \cap \overline{\partial\Omega_2}$ where singularities can be expected). Clearly additional hypotheses on $W$ are needed for this to be true. One might assume, for example, that $W : M^{3\times 3}_+ \to \mathbb{R}$ is $C^\infty$, and that (H1) is strengthened by assuming $W$ to be strictly polyconvex (i.e., that $g$ is strictly convex). Also for regularity up to the boundary we would need to assume both smoothness of the boundary (except perhaps at $\overline{\partial\Omega_1} \cap \overline{\partial\Omega_2}$) and that $\bar y$ is smooth. The precise nature of these extra hypotheses is to be determined.

Problem 3 is unsolved even in the simplest special cases. In fact the only situation in which smoothness of $y^*$ seems to have been proved is for the pure displacement problem ($\partial\Omega_2$ empty) with small boundary displacements from a stress-free state. For this case Zhang [1991], following work of Sivaloganathan [1989], gave hypotheses under which the smooth solution to the equilibrium equations delivered by the implicit function theorem was in fact the unique global minimizer $y^*$ of $I$ given by Theorem 2.2.

An even more ambitious target would be to somehow classify possible singularities in minimizers of $I$ given by (2.2) for generic stored-energy functions $W$. If at the same time one could associate with each such singularity a condition on $W$ that prevented it, one would also, by imposing all such conditions simultaneously, possess a set of hypotheses implying regularity. In fact it is possible to go a little way down this road. Consider first the kind of singularity frequently observed at phase boundaries in elastic crystals, in which the deformation gradient $Dy$ is piecewise constant, with values $A$, $B$ on either side of a plane $\{x \cdot n = k\}$. It was shown in Ball [1980] that, under the natural assumption that there is some matrix $A_0$ that is a local minimizer of $W(\cdot)$, every such deformation $y$ that is locally a weak solution of the Euler–Lagrange equation is trivial, that is $A = B$, if and only if $W$ is strictly rank-one convex (i.e., the map $t \mapsto W(A + t\,a \otimes n)$ is strictly convex for every $A$ and all nonzero $a$, $n$). Thus strict rank-one convexity is exactly what is needed to eliminate this particular kind of singularity.

Another physically occurring singularity is that of cavitation. For radial cavitation the deformation has the form $y : B(0,1) \to \mathbb{R}^3$, where $B(0,1)$ is the unit ball in $\mathbb{R}^3$, and

$$ y(x) = \frac{r(|x|)}{|x|}\,x. $$

Thus if $r(0) > 0$, $y$ is discontinuous at $x = 0$, where a hole of radius $r(0)$ is formed. If (H1) holds, then since polyconvexity implies quasiconvexity, the minimizer of $I$ among smooth ($W^{1,3}$ is enough, see below) $y$ satisfying $y(x) = \lambda x$ for $|x| = 1$ (i.e., $r(1) = \lambda$) is given by the homogeneous deformation

$$ \tilde y_\lambda(x) \equiv \lambda x. $$

However, it was shown in Ball [1982] that for a class of stored-energy functions $W$ satisfying (H1) and the growth condition in (H2) but with $2 \le p < 3$, $q < \frac{3}{2}$, $I$ attains a minimum among radial deformations satisfying the boundary condition $y(x) = \lambda x$ for $|x| = 1$, and that for $\lambda > 0$ sufficiently large the minimizer $\bar y$ satisfies $r(0) > 0$. Furthermore $\bar y$ satisfies the weak form of the Euler–Lagrange equation (2.4). As a specific example we can take

$$ W(A) = |A|^2 + h(\det A), \tag{2.15} $$

for $h : (0, \infty) \to \mathbb{R}$ smooth with $h'' > 0$, $\lim_{\delta\to\infty} h(\delta)/\delta = \lim_{\delta\to 0+} h(\delta) = \infty$. Cavitation is a common failure mechanism in polymers; for interesting pictures of almost radial cavitation of roughly spherical rubber particles imbedded in a matrix of Nylon-6 see Lazzeri and Bucknall [1995].

Müller, Qi and Yan [1994] show that if (H2) holds then no deformation with finite energy can exhibit cavitation. A somewhat stronger condition, which by the Sobolev embedding theorem obviously prevents not only cavitation but any other discontinuity in $y$, is that $W(A) \ge c_0|A|^p - c_1$ for all $A$, where $c_0 > 0$ and $p > 3$. In fact even if $p = 3$ any finite-energy deformation is continuous on account of (2.8) and the result of Vodop'yanov, Gol'dshtein and Reshetnyak [1979]. There is an extensive literature on cavitation in elasticity; a sample of the more mathematical developments can be found in the papers of Antman and Negrón-Marrero [1987], James and Spector [1991], Müller and Spector [1995], Polignone and Horgan [1993a,b], Sivaloganathan [1986, 1995, 1999], Sivaloganathan and Spector [2000a,b, 2001], Pericak-Spector and Spector [1997], Stringfellow and Abeyaratne [1989] and Stuart [1985, 1993].
An interesting feature of cavitation is that it provides a realistic example of the Lavrentiev phenomenon, whereby the infimum of the energy is different in different function spaces. Here it takes the form

$$ \inf_{\mathcal{A}_1} I < \inf_{\mathcal{A}_3} I = I(\tilde y_\lambda), $$

where $\mathcal{A}_p = \{ y \in W^{1,p}(B(0,1); \mathbb{R}^3) : y|_{\partial B(0,1)} = \lambda x \}$. More generally, there is a theory of minimization for elasticity with $W$ polyconvex in function spaces not allowing cavitation due to Giaquinta, Modica and Souček [1989, 1994, 1998] (see also the less technically demanding approach of Müller [1988]). Thus the same $W$ can have different minimizers in different function spaces; if we enlarge the space to allow not only cavitation but crack formation (see Section 2.8), then we can have different minimizers in at least three different spaces.
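The role of the exponent $p = 3$ in the cavitation discussion can be illustrated with an explicit radial map. The sketch below assumes the volume-preserving cavity map $r(R) = (R^3 + c^3)^{1/3}$, which opens a hole of radius $c$; this choice, the cut-off radius `eps` and the function names are assumptions for illustration and not the minimizers discussed in the text. For a radial deformation the singular values of $Dy$ are $r'(R)$, $r(R)/R$, $r(R)/R$, so $\int |Dy|^p\,dx$ over $\{\varepsilon < |x| < 1\}$ reduces to a one-dimensional integral.

```python
import math

def cavity_map(R, c):
    """Assumed radial profile r(R) = (R^3 + c^3)^(1/3); r(0) = c is the hole radius."""
    return (R ** 3 + c ** 3) ** (1.0 / 3.0)

def singular_values(R, c):
    """Singular values r'(R), r/R, r/R of Dy for the radial map y = r(|x|) x/|x|."""
    r = cavity_map(R, c)
    radial = R * R / (r * r)   # r'(R) for the assumed profile
    hoop = r / R
    return radial, hoop, hoop

def gradient_power_integral(p, eps, c=0.5, n=4000):
    """Midpoint rule for the integral of |Dy|^p over eps < |x| < 1.

    Uses the substitution R = exp(s), so that R^2 dR = R^3 ds.
    """
    a = math.log(eps)
    h = (0.0 - a) / n
    total = 0.0
    for k in range(n):
        R = math.exp(a + (k + 0.5) * h)
        v1, v2, v3 = singular_values(R, c)
        total += (v1 * v1 + v2 * v2 + v3 * v3) ** (p / 2.0) * R ** 3 * h
    return 4.0 * math.pi * total

if __name__ == "__main__":
    for p in (2.0, 3.0):
        for eps in (1e-3, 1e-6):
            print(p, eps, gradient_power_integral(p, eps))
```

Shrinking the cut-off should leave the $p = 2$ value essentially unchanged while the $p = 3$ value keeps growing like $\log(1/\varepsilon)$, which is consistent with cavitating maps belonging to $W^{1,p}$ only for $p < 3$.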


In cavitation there is a change of topology of the deformed configuration associated with the Lavrentiev phenomenon, but one-dimensional examples in Ball and Mizel [1985] for integrals of the form

$$ I(y) = \int_a^b f(x, y(x), y_x(x))\,dx $$

show that the phenomenon can occur when the minimizer $y$ is continuous with unbounded gradient. This leads to the question:

Problem 4. Can the Lavrentiev phenomenon occur for elastostatics under growth conditions on the stored-energy function ensuring that all finite-energy deformations are continuous?

Of course if $y^*$ is smooth then the Lavrentiev phenomenon cannot hold under the hypotheses of Theorem 2.2. Some interesting recent progress on Problem 4 is due to Foss [2001], Mizel, Foss and Hrusa [2002], who provide examples of the Lavrentiev phenomenon in two dimensions for a homogeneous isotropic polyconvex stored-energy function $W$ satisfying (2.7) and the corresponding growth condition $W(A) \ge c_0|A|^p - c_1$ for all $A \in M^{2\times 2}_+$ and some $p > 2$. In these examples the reference configuration is given by a sector of a disk described in polar coordinates by $\Omega_\alpha = \{ (r, \theta) : 0 < r < 1,\ 0 < \theta < \alpha \}$, and the boundary conditions are of the 'container' type that the origin is fixed, that $y(\Omega_\alpha) \subset \Omega_\beta$, and that $y(1, \theta) = (1, \frac{\beta}{\alpha}\theta)$, where $0 < \beta < \frac{3}{4}\alpha$. Whether such examples can be constructed for mixed boundary conditions of the type (2.1), even in two dimensions, or be associated with singularities in the interior of $\Omega$, remains open. The Lavrentiev phenomenon has important implications for the numerical computation of minimizers (see Ball [2001] and the references therein). For a useful survey of the phenomenon in one and more dimensions see Buttazzo and Belloni [1995].

There are striking examples of variational problems of the form (2.12) for which the global minimizer is not smooth. The first such example was found by Nečas [1977], who showed that if $m = n^2$ with $n$ sufficiently large then there exists a strictly convex $f = f(Dy)$ whose corresponding integral

$$ \mathcal{F}(y) = \int_{B(0,1)} f(Dy)\,dx \tag{2.16} $$

has as global minimizer

$$ y^*_{ij}(x) = \frac{x_i x_j}{|x|}, \quad x \in B(0,1), $$

subject to its own (smooth) boundary-values on $\partial B(0,1)$. Here $y^*$ is Lipschitz but not $C^1$. Then Hao, Leonardi and Nečas [1996] modified the example to work for $n \ge 5$ with minimizer

$$ y^*_{ij} = \frac{x_i x_j}{|x|} - \frac{1}{n}|x|\delta_{ij}. \tag{2.17} $$
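The matrix-valued map (2.17) can be evaluated directly, which gives a quick check of two structural properties: it takes values in the traceless symmetric matrices, and it is positively homogeneous of degree one. The sketch below is an illustration only; the function name and the sample point are assumptions.

```python
import math

def hao_leonardi_necas_map(x):
    """Evaluate y*_ij(x) = x_i x_j / |x| - |x| delta_ij / n from (2.17), n = len(x).

    The value at x = 0 is taken to be the zero matrix, by continuity.
    """
    n = len(x)
    r = math.sqrt(sum(c * c for c in x))
    if r == 0.0:
        return [[0.0] * n for _ in range(n)]
    return [[x[i] * x[j] / r - (r / n if i == j else 0.0) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    x = [0.3, -0.2, 0.5, 0.1, -0.4]          # sample point with n = 5
    Y = hao_leonardi_necas_map(x)
    print("trace:", sum(Y[i][i] for i in range(len(x))))
    s = 2.5
    Ys = hao_leonardi_necas_map([s * c for c in x])
    print("homogeneity defect:",
          max(abs(Ys[i][j] - s * Y[i][j]) for i in range(5) for j in range(5)))
```

Both printed numbers should be zero up to rounding. The lack of $C^1$ regularity comes entirely from the factor $1/|x|$ at the origin.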


By a more sophisticated method, Šverák and Yan [2000] showed that there exists a convex $f$ such that (2.17) gives a minimizer for $n = 3$. In fact, working in the five-dimensional space of $3 \times 3$ traceless symmetric matrices we thus obtain an example with $n = 3$, $m = 5$. Šverák and Yan also obtained a similar example for the case $n = 4$, $m = 3$. Note that the above maps $y^*$ are one-homogeneous, i.e., $y^*(sx) = s\,y^*(x)$ for all $s \ge 0$. In contrast Phillips [2001] has shown that when $n = 2$ any one-homogeneous weak solution $y$ to a strongly elliptic system of the form $\operatorname{div} A(Dy) = 0$ is linear.

Even if $y^*$ is not smooth everywhere, we can ask for smoothness outside a closed set of Lebesgue measure zero. That such a result is true is strongly suggested by the classical partial regularity theorem of Evans [1986], which (with the incorporation of a weakening of the growth condition due to Acerbi and Fusco [1988]) states that any global minimizer has this property for a class of integrals of the form (2.16) with $f$ satisfying (2.11) and the strong quasiconvexity condition that for some $p \ge 2$

$$ \int_\Omega \big[ f(A + D\varphi) - f(A) \big]\,dx \ge \gamma \int_\Omega \big( |D\varphi|^2 + |D\varphi|^p \big)\,dx $$

for all $A \in M^{m\times n}$, all $\varphi \in C_0^\infty(\Omega; \mathbb{R}^m)$. Recently, Kristensen and Taheri [2001] have proved the same result but for $W^{1,p}$ local minimizers. However, it is not known how to extend these theorems to the case of elasticity when (2.7) holds (see Ball [1998] for a brief discussion).

In proving regularity or partial regularity, it is not sufficient to just use the fact (if it is a fact, see below) that $y^*$ satisfies the weak form of the Euler–Lagrange equation. This follows from the example of Müller and Šverák [2001] of a Lipschitz mapping $y : \Omega \to \mathbb{R}^2$, with $\Omega \subset \mathbb{R}^2$ a bounded domain, that is nowhere $C^1$ and solves the weak form of the Euler–Lagrange equation for an integral

$$ I(y) = \int_\Omega F(Dy)\,dx $$

with $F$ strictly quasiconvex. As shown by Kristensen and Taheri [2001] $y$ can even be a $W^{1,\infty}$ local minimizer.

There seems to be little indication from experiment that natural rubbers have equilibrium solutions with singularities other than those involving cavitation or other forms of fracture. Thus it seems a reasonable conjecture that minimizers are smooth for models of natural rubber for which the stored-energy function satisfies growth conditions prohibiting discontinuities for deformations of finite energy. In view of the above and other counterexamples for elliptic systems, if minimizers are smooth it must be for special reasons applying to elasticity. Three plausible such reasons are: (a) the fact that the integrand $W(Dy)$ does not depend explicitly on $y$ (though dependence on $x$ is allowed), (b) the fact that the dimensions $m = n = 3$ are low, (c) the frame-indifference of $W$ (see (2.9)). A fourth possible reason is (d) invertibility of $y$, which we discuss now.

That invertibility could have an effect on regularity was first shown by Bauman, Owen and Phillips [1991], who gave an example of an essentially two-to-one equilibrium solution in a ball for two-dimensional elasticity, with stored-energy function of the form (2.15), that is $C^1$ but not smooth, and is such that $\det Dy$ vanishes at the centre of the ball. An instructive example (resulting from discussions with X. Yan and J. Bevan) is that of minimizing the two-dimensional energy for an incompressible material

$$ I(y) = \int_B |Dy|^2\,dx, $$

where $B = B(0,1)$ is the unit disc in $\mathbb{R}^2$, and $y : B \to \mathbb{R}^2$, in the set of admissible mappings

$$ \mathcal{A} = \{ y \in W^{1,2}(B; \mathbb{R}^2) : \det Dy = 1 \text{ a.e.},\ y|_{\partial B} = \bar y \}, \tag{2.18} $$

where in polar coordinates $\bar y : (r, \theta) \mapsto \big( \tfrac{1}{\sqrt{2}}\,r,\ 2\theta \big)$. Then there exists a global minimizer $y^*$ of $I$ in $\mathcal{A}$. (Note that $\mathcal{A}$ is nonempty since $\bar y \in \mathcal{A}$.) But since by degree theory there are no $C^1$ maps $y$ satisfying the constraint and boundary condition in (2.18), it is immediate that $y^*$ is not $C^1$.

For interesting maximum principles satisfied by smooth equilibrium solutions in two-dimensional elasticity, with stored-energy function of the form (2.15), see Bauman, Owen and Phillips [1991a, 1992].
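The boundary map in the incompressible example can be checked by hand or numerically. In Cartesian coordinates the polar map $(r, \theta) \mapsto (r/\sqrt{2}, 2\theta)$ reads $x \mapsto \frac{1}{\sqrt{2}\,|x|}(x_1^2 - x_2^2,\ 2x_1x_2)$. The following sketch, with assumed function names and a finite-difference Jacobian, confirms that its Jacobian determinant equals 1 away from the origin and that it is two-to-one.

```python
import math

def double_cover(x1, x2):
    """Cartesian form of the polar map (r, theta) -> (r / sqrt(2), 2 theta)."""
    r = math.hypot(x1, x2)
    if r == 0.0:
        return (0.0, 0.0)
    s = 1.0 / (math.sqrt(2.0) * r)
    return (s * (x1 * x1 - x2 * x2), s * (2.0 * x1 * x2))

def jacobian_det(f, x1, x2, h=1e-6):
    """Determinant of the 2x2 Jacobian of f at (x1, x2), by central differences."""
    fxp, fxm = f(x1 + h, x2), f(x1 - h, x2)
    fyp, fym = f(x1, x2 + h), f(x1, x2 - h)
    a = (fxp[0] - fxm[0]) / (2.0 * h)   # d y1 / d x1
    b = (fyp[0] - fym[0]) / (2.0 * h)   # d y1 / d x2
    c = (fxp[1] - fxm[1]) / (2.0 * h)   # d y2 / d x1
    d = (fyp[1] - fym[1]) / (2.0 * h)   # d y2 / d x2
    return a * d - b * c

if __name__ == "__main__":
    for point in ((0.3, -0.4), (0.05, 0.02), (-0.7, 0.6)):
        print(point, jacobian_det(double_cover, *point))
    print(double_cover(0.3, -0.4), double_cover(-0.3, 0.4))  # antipodal points agree
```

The image of the unit disc is the disc of radius $1/\sqrt{2}$ covered twice, which matches the area count $2 \cdot \pi/2 = \pi$ required by $\det Dy = 1$.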

2.4 Satisfaction of the Euler–Lagrange Equation and Uniform Positivity of the Jacobian

Here we return to the computation formally leading to the weak form (2.4) of the Euler–Lagrange equation, under the assumption (2.7). If $y^* \in W^{1,\infty}(\Omega; \mathbb{R}^3)$ is a $W^{1,\infty}$ local minimizer of $I$ in

$$ \mathcal{A} = \{ y \in W^{1,1}(\Omega; \mathbb{R}^3) : y|_{\partial\Omega_1} = \bar y \}, $$

and if (2.8) holds in the stronger form that for some $\varepsilon > 0$

$$ \det Dy^*(x) \ge \varepsilon \quad \text{for a.e. } x \in \Omega, \tag{2.19} $$

then it is easily seen that $y^*$ satisfies (2.4). In fact we can then pass to the limit $\tau \to 0$ using bounded convergence in the difference quotient

$$ \int_\Omega \frac{1}{\tau} \big[ W(Dy^* + \tau D\varphi) - W(Dy^*) \big]\,dx, \tag{2.20} $$

since by (2.19) we have $\det\big( Dy^*(x) + \tau D\varphi(x) \big) \ge \varepsilon/2$ for a.e. $x \in \Omega$ once $|\tau|$ is sufficiently small. However, if only (2.7) is assumed, or if $y^*$ is not assumed in advance to be in $W^{1,\infty}(\Omega; \mathbb{R}^3)$, then it is not obvious how to pass to the limit.


Problem 5. Prove or disprove that, under reasonable growth conditions on $W$, global or suitably defined local minimizers of $I$ satisfy the weak form (2.4) of the Euler–Lagrange equation.

Problem 6. Prove or disprove that, under reasonable growth conditions on $W$, global or suitably defined local minimizers of $I$ satisfy (2.19).

If $W(A) \to \infty$ as $|A| \to \infty$ and if $y^* \notin W^{1,\infty}$, or if (2.19) does not hold, then $W(Dy^*)$ is essentially unbounded. This is at first sight inconsistent with $y^*$ being a minimizer, but we know from the one-dimensional examples in Ball and Mizel [1985] and from the example of cavitation that it can pay to have the integrand infinite somewhere so that it is smaller somewhere else. In general, it is not possible to estimate the integrand in the difference quotient (2.20) in terms of $W(Dy^*)$, the only relevant quantity that is obviously integrable. This difficulty was pointed out by Antman [1976], who was the first to address the issue of satisfaction of the Euler–Lagrange equation for one-dimensional problems of elasticity when (2.7) holds; in this context a device essentially due to Tonelli [1921] can be used to prove that the Euler–Lagrange equation holds (see also Ball [1981a]) without any supplementary growth conditions on $W$.

It is perhaps worth making the simple observation that a smooth deformation $y$ may satisfy $I(y) < \infty$ and $\det Dy(x) > 0$ a.e. without (2.19) holding. As an example we may take $\Omega = B(0,1)$ and $y(x) = |x|^2 x$, with $W(A) = -\log\det A + g(A)$, where $g : M^{3\times 3} \to \mathbb{R}$ is smooth.

For a class of strongly elliptic stored-energy functions having the form $W(A) = g(A) + h(\det A)$, where $g : M^{3\times 3} \to \mathbb{R}$ and $h : (0, \infty) \to [0, \infty)$ are smooth with $h(\delta) \to \infty$ as $\delta \to 0+$ at a polynomial rate, Bauman, Owen and Phillips [1991] show that if $y \in C^{1,\beta}$ satisfies the energy-momentum weak form of the Euler–Lagrange equation in (2.22) below, then in fact $y$ is a smooth solution of the Euler–Lagrange equation (2.5) and the strict positivity condition (2.19) holds.
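The example $y(x) = |x|^2 x$ mentioned above can be made explicit. Its gradient is $Dy = |x|^2\mathbf{1} + 2\,x \otimes x$, with eigenvalues $3|x|^2$, $|x|^2$, $|x|^2$, so $\det Dy = 3|x|^6$ is positive except at the origin but not bounded away from zero, while $-\log\det Dy$ is integrable over the unit ball with value $\frac{4\pi}{3}(2 - \log 3)$. The sketch below checks both facts numerically; the function names and the midpoint quadrature are choices made for illustration.

```python
import math

def det3(A):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def deformation_gradient(x):
    """Gradient of y(x) = |x|^2 x, namely |x|^2 * identity + 2 * (x tensor x)."""
    r2 = sum(c * c for c in x)
    return [[(r2 if i == j else 0.0) + 2.0 * x[i] * x[j] for j in range(3)]
            for i in range(3)]

def minus_log_det_integral(n=20000):
    """Midpoint rule for the integral of -log det Dy over the unit ball,

    using det Dy = 3 |x|^6 and spherical shells of area 4 pi R^2.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        R = (k + 0.5) * h
        total += -math.log(3.0 * R ** 6) * 4.0 * math.pi * R * R * h
    return total

if __name__ == "__main__":
    for scale in (1.0, 0.1, 0.01):
        x = (0.3 * scale, -0.2 * scale, 0.5 * scale)
        print(scale, det3(deformation_gradient(x)))   # tends to 0 with |x|
    print(minus_log_det_integral(), 4.0 * math.pi / 3.0 * (2.0 - math.log(3.0)))
```

So the logarithmic term contributes a finite amount to $I(y)$ even though no uniform lower bound of the form (2.19) is available.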
As was pointed out in Ball [1984a], it is possible to derive different first-order necessary conditions for a minimizer when (2.7) holds. (Later Giaquinta, Modica and Souček [1989] derived the same first-order conditions in their framework of Cartesian currents, under somewhat stronger hypotheses.) We give here improved versions of these results. We consider the following conditions that may be satisfied by $W$:

(C1) $|D_A W(A)A^T| \le K\big( W(A) + 1 \big)$ for all $A \in M^{3\times 3}_+$, where $K > 0$ is a constant; and

(C2) $|A^T D_A W(A)| \le K\big( W(A) + 1 \big)$ for all $A \in M^{3\times 3}_+$, where $K > 0$ is a constant.


As usual, $|\cdot|$ denotes the Euclidean norm on $M^{3\times 3}$, for which the inequalities $|A \cdot B| \le |A|\,|B|$ and $|AB| \le |A|\,|B|$ hold. But of course the conditions are independent of the norm used up to a possible change in the constant $K$.

2.3 Proposition. Let $W$ satisfy (C2). Then $W$ satisfies (C1).

Proof. Since $W$ is frame-indifferent the matrix $D_A W(A) A^T$ is symmetric (this is equivalent to the symmetry of the Cauchy stress tensor; see (2.28) below). Hence
\[
|D_A W(A) A^T|^2 = [D_A W(A) A^T] \cdot [A\,(D_A W(A))^T] = [A^T D_A W(A)] \cdot [A^T D_A W(A)]^T \le |A^T D_A W(A)|^2 ,
\]
from which the result follows.

Example. Let
\[
W(A) = (A^T A)_{11} + \frac{1}{\det A} .
\]
Then $W$ satisfies (C1) and (2.9), but not (C2).
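The claims in the Example can be checked by a short computation. The following is only a sketch: it uses the standard identity $D_A(\det A)^{-1} = -(\det A)^{-1}A^{-T}$, and the unit vectors $e_1, e_2$ and the simple-shear family $A_t$ are notation introduced here for illustration, not taken from the text. Condition (2.9) is not examined.

```latex
% W(A) = (A^T A)_{11} + 1/\det A, where (A^T A)_{11} = |A e_1|^2.
\[
D_A W(A) = 2\,(A e_1)\otimes e_1 - (\det A)^{-1} A^{-T} .
\]
% (C1): multiply on the right by A^T, using (a \otimes b) A^T = a \otimes (A b).
\[
D_A W(A)\,A^T = 2\,(A e_1)\otimes(A e_1) - (\det A)^{-1}\mathbf{1},
\qquad
|D_A W(A)\,A^T| \le 2|A e_1|^2 + \frac{\sqrt{3}}{\det A} \le 2\,W(A) ,
\]
% so (C1) holds with K = 2.
% (C2): multiply on the left by A^T.
\[
A^T D_A W(A) = 2\,(A^T A e_1)\otimes e_1 - (\det A)^{-1}\mathbf{1} .
\]
% Illustrative test family (an assumption of this sketch):
% A_t = \mathbf{1} + t\, e_1 \otimes e_2, so that \det A_t = 1 and A_t e_1 = e_1.
\[
W(A_t) = 2, \qquad A_t^T A_t e_1 = e_1 + t\,e_2, \qquad
|A_t^T D_A W(A_t)|^2 = 3 + 4t^2 \to \infty \quad (t \to \infty) ,
\]
% so no constant K can satisfy (C2).
```

The point of the shear family is that $W$ stays bounded along it while the off-diagonal entry $2t$ of $A_t^T D_A W(A_t)$ grows without bound.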

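2.4 Theorem. For some $1 \le p < \infty$ let $y \in \mathcal{A} \cap W^{1,p}(\Omega;\mathbb{R}^3)$ be a $W^{1,p}$ local minimizer of $I$ in $\mathcal{A}$.

(i) Let $W$ satisfy (C1). Then
\[
\int_\Omega [D_A W(Dy)\,Dy^T] \cdot D\varphi(y)\, dx = 0 \tag{2.21}
\]
for all $\varphi \in C^1(\mathbb{R}^3;\mathbb{R}^3)$ such that $\varphi$ and $D\varphi$ are uniformly bounded and satisfy $\varphi(y)|_{\partial\Omega_1} = 0$ in the sense of trace.

(ii) Let $W$ satisfy (C2). Then
\[
\int_\Omega [W(Dy)\mathbf{1} - Dy^T D_A W(Dy)] \cdot D\varphi\, dx = 0 \tag{2.22}
\]
for all $\varphi \in C_0^1(\Omega;\mathbb{R}^3)$.

We use the following simple lemma.

2.5 Lemma. (a) If $W$ satisfies (C1) then there exists $\gamma > 0$ such that if $C \in M^{3\times 3}_+$ and $|C - \mathbf{1}| < \gamma$ then
\[
|D_A W(CA)\,A^T| \le 3K(W(A) + 1) \quad \text{for all } A \in M^{3\times 3}_+ .
\]
(b) If $W$ satisfies (C2) then there exists $\gamma > 0$ such that if $C \in M^{3\times 3}_+$ and $|C - \mathbf{1}| < \gamma$ then
\[
|A^T D_A W(AC)| \le 3K(W(A) + 1) \quad \text{for all } A \in M^{3\times 3}_+ . \tag{2.23}
\]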

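Proof. We prove (a); the proof of (b) is similar. We first show that there exists $\gamma > 0$ such that if $|C - \mathbf{1}| < \gamma$ then
\[
W(CA) + 1 \le \tfrac{3}{2}(W(A) + 1) \quad \text{for all } A \in M^{3\times 3}_+ . \tag{2.24}
\]
For $t \in [0,1]$ let $C(t) = tC + (1-t)\mathbf{1}$. Choose $\gamma \in (0, \frac{1}{6K})$ sufficiently small so that $|C - \mathbf{1}| < \gamma$ implies that $|C(t)^{-1}| \le 2$ for all $t \in [0,1]$. This is possible since $|\mathbf{1}| = \sqrt{3} < 2$. For $|C - \mathbf{1}| < \gamma$ we have that
\begin{align*}
W(CA) - W(A) &= \int_0^1 \frac{d}{dt} W(C(t)A)\, dt \\
&= \int_0^1 D_A W(C(t)A) \cdot [(C - \mathbf{1})A]\, dt \\
&= \int_0^1 [D_A W(C(t)A)\,(C(t)A)^T] \cdot [(C - \mathbf{1})C(t)^{-1}]\, dt \\
&\le K \int_0^1 (W(C(t)A) + 1)\,|C - \mathbf{1}|\,|C(t)^{-1}|\, dt \\
&\le 2K\gamma \int_0^1 (W(C(t)A) + 1)\, dt .
\end{align*}
Let $\theta(A) = \sup_{|C - \mathbf{1}| < \gamma} W(CA)$. Then $W(CA) - W(A) \le \theta(A) - W(A) \le 2K\gamma(\theta(A) + 1)$, from which (2.24) follows. Finally, if $|C - \mathbf{1}| < \gamma$ we have from (C1) and (2.24) that
\[
|D_A W(CA)\,A^T| = |D_A W(CA)\,(CA)^T C^{-T}| \le K(W(CA) + 1)\,|C^{-T}| \le 3K(W(A) + 1) ,
\]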

as required.
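The step "from which (2.24) follows" in the proof of Lemma 2.5 is brief, so the arithmetic is spelled out below. This is a sketch under two assumptions not stated at that point in the text: that $\theta(A)$ is finite and that $W \ge 0$ (so that $W(A) + 1 > 0$).

```latex
% Rearranging \theta(A) - W(A) \le 2K\gamma(\theta(A) + 1):
\[
(1 - 2K\gamma)\,(\theta(A) + 1) \le W(A) + 1 .
\]
% Because 0 < \gamma < 1/(6K), we have 2K\gamma < 1/3 and so 1/(1 - 2K\gamma) < 3/2:
\[
W(CA) + 1 \le \theta(A) + 1 \le \frac{W(A) + 1}{1 - 2K\gamma} \le \tfrac{3}{2}\,(W(A) + 1) ,
\]
% which is (2.24). For the final bound, |C^{-T}| = |C^{-1}| \le 2 (the case t = 1 of |C(t)^{-1}| \le 2):
\[
K\,(W(CA) + 1)\,|C^{-T}| \le K \cdot \tfrac{3}{2}\,(W(A) + 1) \cdot 2 = 3K\,(W(A) + 1) .
\]
```

The factor $3K$ in the lemma is thus the product of $K$ from (C1), $\tfrac32$ from (2.24), and $2$ from the bound on $|C^{-1}|$.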

Proof of Theorem 2.4. (i) Given $\varphi$ as in the theorem, define for $|\tau|$ sufficiently small
\[
y_\tau(x) := y(x) + \tau\varphi(y(x)) .
\]
Then $y_\tau \in \mathcal{A}$ and
\[
Dy_\tau(x) = [\mathbf{1} + \tau D\varphi(y(x))]\,Dy(x) \quad \text{a.e. } x \in \Omega .
\]
In particular $\det Dy_\tau(x) > 0$ for a.e. $x \in \Omega$ and $\lim_{\tau \to 0} \|y_\tau - y\|_{W^{1,p}} = 0$. Hence $I(y_\tau) \ge I(y)$ for $|\tau|$ sufficiently small. But
\begin{align*}
\frac{1}{\tau}[I(y_\tau) - I(y)] &= \frac{1}{\tau} \int_\Omega \int_0^1 \frac{d}{ds} W([\mathbf{1} + s\tau D\varphi(y(x))]\,Dy(x))\, ds\, dx \\
&= \int_\Omega \int_0^1 DW([\mathbf{1} + s\tau D\varphi(y(x))]\,Dy(x)) \cdot [D\varphi(y(x))\,Dy(x)]\, ds\, dx .
\end{align*}
Since by Lemma 2.5(a) the integrand is bounded by the integrable function
\[
3K(W(Dy(x)) + 1) \sup_{z \in \mathbb{R}^3} |D\varphi(z)| ,
\]
we may pass to the limit $\tau \to 0$ using dominated convergence to obtain (2.21).

(ii) This follows in a similar way to (i) from Lemma 2.5(b). Since most of the details have already been written down in Bauman, Owen and Phillips [1991a] we just sketch the idea. Let $\varphi \in C_0^1(\Omega;\mathbb{R}^3)$. For sufficiently small $\tau > 0$ the mapping $\theta_\tau$ defined by $\theta_\tau(z) := z + \tau\varphi(z)$ belongs to $C^1(\Omega;\mathbb{R}^3)$, satisfies $\det D\theta_\tau(z) > 0$, and coincides with the identity on $\partial\Omega$. Standard arguments from degree theory imply that $\theta_\tau$ is a diffeomorphism of $\Omega$ to itself. Thus the 'inner variation'
\[
y_\tau(x) := y(z_\tau), \quad x = z_\tau + \tau\varphi(z_\tau)
\]
defines a mapping $y_\tau \in \mathcal{A}$, and
\[
Dy_\tau(x) = Dy(z_\tau)\,[\mathbf{1} + \tau D\varphi(z_\tau)]^{-1} \quad \text{a.e. } x \in \Omega .
\]
Since $y \in W^{1,p}$ it follows easily that $\|y_\tau - y\|_{W^{1,p}} \to 0$ as $\tau \to 0$. Changing variables we obtain
\[
I(y_\tau) = \int_\Omega W(Dy(z)\,[\mathbf{1} + \tau D\varphi(z)]^{-1}) \det(\mathbf{1} + \tau D\varphi(z))\, dz ,
\]
from which (2.22) follows using (2.23) and dominated convergence.
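To see where the energy-momentum tensor in (2.22) comes from, one can differentiate the integrand of $I(y_\tau)$ formally at $\tau = 0$. The computation below is a sketch only: it assumes enough smoothness to differentiate pointwise, and it leaves aside the justification of passing the derivative under the integral, which is what (2.23) and dominated convergence provide.

```latex
% Two standard facts:
%   d/d\tau (\mathbf{1} + \tau D\varphi)^{-1} |_{\tau=0} = -D\varphi,
%   d/d\tau \det(\mathbf{1} + \tau D\varphi) |_{\tau=0} = \operatorname{tr} D\varphi = \mathbf{1} \cdot D\varphi.
\begin{align*}
\frac{d}{d\tau}\Big[ W\big(Dy\,(\mathbf{1} + \tau D\varphi)^{-1}\big)\,\det(\mathbf{1} + \tau D\varphi) \Big]_{\tau = 0}
&= -D_A W(Dy) \cdot (Dy\,D\varphi) + W(Dy)\,\operatorname{tr} D\varphi \\
&= \big[\, W(Dy)\mathbf{1} - Dy^T D_A W(Dy) \,\big] \cdot D\varphi ,
\end{align*}
% using A \cdot (BC) = (B^T A) \cdot C for the inner product A \cdot B = \operatorname{tr}(A^T B).
```

Integrating over $\Omega$ and using that $y$ is a local minimizer then yields the integrand of (2.22).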

In order to give an interpretation of Theorem 2.4(i), let us make the following

Invertibility Hypothesis. $y$ is a homeomorphism of $\Omega$ onto $\Omega' := y(\Omega)$, $\Omega'$ is a bounded domain, and the change of variables formula
\[
\int_\Omega f(y(x)) \det Dy(x)\, dx = \int_{\Omega'} f(z)\, dz \tag{2.26}
\]
holds whenever $f : \mathbb{R}^3 \to \mathbb{R}$ is measurable, provided that one of the integrals in (2.26) exists.

Sufficient conditions for this hypothesis to hold are given in Ball [1981] and Šverák [1988].

2.6 Theorem. Assume that the hypotheses of Theorem 2.4(i) and the Invertibility Hypothesis hold. Then
\[
\int_{\Omega'} \sigma(z) \cdot D\varphi(z)\, dz = 0 \tag{2.27}
\]
for all $\varphi \in C^1(\mathbb{R}^3;\mathbb{R}^3)$ such that $\varphi|_{y(\partial\Omega_1)} = 0$, where the Cauchy stress tensor $\sigma$ is defined by
\[
\sigma(z) := T(y^{-1}(z)), \quad z \in \Omega' ,
\]
and
\[
T(x) = (\det Dy(x))^{-1} D_A W(Dy(x))\,Dy(x)^T . \tag{2.28}
\]

Proof. Since by assumption $y(\Omega)$ is bounded, we can assume that $\varphi$ and $D\varphi$ are uniformly bounded. Thus (2.27) follows immediately from (2.21), (2.26) and (2.28).

Thus Theorem 2.4(i) asserts that $y$ satisfies the spatial (Eulerian) form of the equilibrium equations. Theorem 2.4(ii), on the other hand, involves the so-called energy-momentum tensor $W(A)\mathbf{1} - A^T D_A W(A)$, and is a multi-dimensional version of the Du Bois Reymond or Erdmann equation of the one-dimensional calculus of variations.

The hypotheses (C1) and (C2) imply that $W$ has polynomial growth. More precisely, we have

2.7 Proposition. Suppose $W$ satisfies (C1) or (C2). Then for some $s > 0$,
\[
W(A) \le M(|A|^s + |A^{-1}|^s) \quad \text{for all } A \in M^{3\times 3}_+ .
\]
Proof. Let $V \in M^{3\times 3}$ be symmetric. For $t \ge 0$
\begin{align*}
\left| \frac{d}{dt} W(e^{tV}) \right| &= |[D_A W(e^{tV})\,e^{tV}] \cdot V| \\
&= |[e^{tV} D_A W(e^{tV})] \cdot V| \\
&\le K(W(e^{tV}) + 1)\,|V| . \tag{2.29}
\end{align*}
From this it follows that
\[
W(e^V) + 1 \le (W(\mathbf{1}) + 1)\,e^{K|V|} . \tag{2.30}
\]
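The passage from (2.29) to (2.30) is a Gronwall-type integration. A minimal sketch, assuming $W \ge 0$ so that the auxiliary function $g$ (a name introduced here) is strictly positive:

```latex
% Let g(t) = W(e^{tV}) + 1 > 0. Then (2.29) reads |g'(t)| \le K|V|\,g(t), hence
\[
\frac{d}{dt} \ln g(t) \le K|V|
\quad\Longrightarrow\quad
\ln g(1) - \ln g(0) \le K|V| .
\]
% Since g(0) = W(\mathbf{1}) + 1 and g(1) = W(e^V) + 1, exponentiating gives (2.30):
\[
W(e^V) + 1 \le (W(\mathbf{1}) + 1)\,e^{K|V|} .
\]
```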

Now set $V = \ln U$, where $U = U^T > 0$, and denote by $v_i$ the eigenvalues of $U$. Since
\[
|\ln U| = \left( \sum_{i=1}^3 (\ln v_i)^2 \right)^{1/2} \le \sum_{i=1}^3 |\ln v_i| ,
\]
it follows that
\begin{align*}
e^{K|\ln U|} &\le (v_1^K + v_1^{-K})(v_2^K + v_2^{-K})(v_3^K + v_3^{-K}) \\
&\le 3^{-3} \left( \sum_{i=1}^3 (v_i^K + v_i^{-K}) \right)^3 \\
&\le C \left( \sum_{i=1}^3 v_i^{3K} + \sum_{i=1}^3 v_i^{-3K} \right) \\
&\le C_1 (|U|^{3K} + |U^{-1}|^{3K}) ,
\end{align*}
where $C > 0$, $C_1 > 0$ are constants. From (2.30) we thus obtain
\[
W(U) \le M(|U|^{3K} + |U^{-1}|^{3K}) ,
\]
where $M = C_1(W(\mathbf{1}) + 1)$. The result now follows from the polar decomposition $A = RU$ of an arbitrary $A \in M^{3\times 3}_+$, where $R \in SO(3)$, $U = U^T > 0$.

It is easily seen that if $W$ is isotropic then both (C1) and (C2) are equivalent to the condition that
\[
|(v_1 \Phi_{,1}, v_2 \Phi_{,2}, v_3 \Phi_{,3})| \le K(\Phi(v_1, v_2, v_3) + 1)
\]
for all $v_i > 0$ and some $K > 0$, where $\Phi$ is given by (2.14) and $\Phi_{,i} = \partial\Phi/\partial v_i$. In particular, both (C1) and (C2) hold for the class of Ogden materials (Ogden [1972a,b]), for which $\Phi$ has the form
\[
\Phi(v_1, v_2, v_3) = \sum_{i=1}^M a_i \varphi(\alpha_i) + \sum_{i=1}^N b_i \psi(\beta_i) + h(v_1 v_2 v_3) ,
\]
where
\[
\varphi(\alpha) = v_1^\alpha + v_2^\alpha + v_3^\alpha , \quad
\psi(\beta) = (v_2 v_3)^\beta + (v_3 v_1)^\beta + (v_1 v_2)^\beta ,
\]
$a_i > 0$, $b_i > 0$, $\alpha_i \ne 0$, $\beta_i \ne 0$, and where $h : (0,\infty) \to [0,\infty)$ is convex, with $h(\delta) \to \infty$ as $\delta \to 0$, provided that $|\delta h'(\delta)| \le K_1(h(\delta) + 1)$ for all $\delta > 0$.
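The isotropic form of (C1) and (C2) can be checked term by term for an Ogden material. The following sketch treats one term of each kind; the resulting constant $K$ is only one admissible choice, and the exponents are assumed nonzero with $v_i > 0$.

```latex
% Power terms: v_j \,\partial_{v_j} \varphi(\alpha) = \alpha v_j^\alpha, so
\[
\big| (v_j\,\partial_{v_j}\varphi(\alpha))_{j=1}^3 \big|
= |\alpha| \Big( \sum_{j=1}^3 v_j^{2\alpha} \Big)^{1/2}
\le |\alpha| \sum_{j=1}^3 v_j^{\alpha} = |\alpha|\,\varphi(\alpha) .
\]
% Area terms: e.g. v_1 \,\partial_{v_1} \psi(\beta) = \beta[(v_3 v_1)^\beta + (v_1 v_2)^\beta], so each component is
% bounded in modulus by |\beta|\,\psi(\beta), and
\[
\big| (v_j\,\partial_{v_j}\psi(\beta))_{j=1}^3 \big| \le \sqrt{3}\,|\beta|\,\psi(\beta) .
\]
% Volume term: with \delta = v_1 v_2 v_3, v_j \,\partial_{v_j} h(\delta) = \delta h'(\delta) for each j, so
\[
\big| (v_j\,\partial_{v_j} h(\delta))_{j=1}^3 \big| = \sqrt{3}\,|\delta h'(\delta)| \le \sqrt{3}\,K_1\,(h(\delta) + 1) .
\]
% Adding the contributions (all terms of \Phi are nonnegative) gives the condition with
\[
K = \max\Big\{ \max_i |\alpha_i|,\ \sqrt{3}\,\max_i |\beta_i|,\ \sqrt{3}\,K_1 \Big\} .
\]
```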

2.5 Regularity and Self-Contact

An interesting approach to the problem of invertibility in mixed boundary-value problems (i.e., to the non-interpenetration of matter) is due to Ciarlet and Nečas [1985]. They proposed minimizing
\[
I(y) = \int_\Omega W(Dy)\, dx
\]
subject to the boundary condition (2.1) and the global constraint
\[
\int_\Omega \det Dy(x)\, dx \le \text{volume}\,(y(\Omega)) ,
\]
and they gave hypotheses under which the minimum was attained, these hypotheses being weakened by Qi [1988]. They further showed that any minimizer is one-to-one almost everywhere, and that assuming sufficient regularity of the free boundary $y(\partial\Omega_2)$ the tangential components of the normal stress vector vanish there. Consequently they identified the above constrained boundary-value problem as corresponding to the case of smooth (i.e., frictionless) self-contact. A related but somewhat different formulation has recently been proposed by Pantz [2001a]; see also Giaquinta, Modica and Souček [1994].

Problem 7. Justify the Ciarlet-Nečas minimization problem, or an appropriate modification of it, as a model of smooth self-contact.

The problem here is to construct suitable variations in the neighbourhood of a region of self-contact of a minimizer to establish that in some sense the tangential stress components vanish there. This is non-trivial because in principle the two parts of the boundary in contact with one another could be wildly deformed and interlocked in a very complex configuration. If such a result could be obtained, a more ambitious target would be to establish the regularity properties of the free boundary in both the self-contacting and non self-contacting regions.

2.6 Uniqueness of Solutions

For mixed boundary-value problems of elasticity nonuniqueness of equilibrium solutions is commonplace, the most familiar examples being those associated with buckling of rods, plates and shells. Buckling can occur even for pure zero-traction boundary conditions, such as in the eversion of part of a spherical shell. For the pure zero-traction problem one can even have nonuniqueness among homogeneous dilatations (see Ball [1982]). Nonuniqueness of these types is expected to hold, and to some extent can be proved rigorously, when the stored-energy function satisfies favourable hypotheses such as strict polyconvexity (though see Section 2.7). For stored-energy functions corresponding to elastic crystals, for which there are many minimum energy configurations with a continuum of different sets of phase boundaries, the extent of non-uniqueness is of course much greater.

For pure displacement boundary conditions, with a strictly polyconvex (or strictly quasiconvex) stored-energy function satisfying favourable growth conditions, the situation as regards uniqueness is less clear. John [1972b] proved uniqueness for smooth deformations with uniformly small strains (but possibly large rotations). In the same paper he gave a heuristic counter-example to uniqueness for the case of an annular two-dimensional body, and this has been made rigorous by Post and Sivaloganathan [1997] (see Section 2.7), who also proved nonuniqueness for an analogous three-dimensional problem with $\Omega$ a torus. But what if $\Omega$ is homeomorphic to a ball? In this case we have already seen that cavitation provides one counterexample to uniqueness, though the cavitating solution is discontinuous.

Problem 8. Prove or disprove the uniqueness of sufficiently smooth equilibrium solutions to pure displacement boundary-value problems for homogeneous bodies when the stored-energy function $W$ is strictly polyconvex and $\Omega$ is homeomorphic to a ball.

The answer to this problem probably depends on both the geometry of $\Omega$ and the boundary conditions. For example, suppose that $\Omega$ is a ball, and that the boundary conditions correspond to severely squeezing the ball until it has a dumb-bell shape consisting of two roughly ball-shaped regions connected by a narrow passage. In this case one might expect, though it is not obvious how to prove it, that there might be equilibrium solutions in which material from one half of $\Omega$ is pulled through into the other half, but prevented from returning by the constriction. On the other hand, an elegant result of Knops and Stuart [1984] implies uniqueness for the case when the boundary displacements are linear and $\Omega$ is star-shaped (see also Taheri [2001b]).

2.7 Structure of the Solution Set

Problem 9. Devise general methods for proving the existence of local minimizers of $I$ that are not global minimizers, and of other weak solutions of the equilibrium equations.

For the existence of local minimizers there are two natural approaches. First we could try to use the direct method of the calculus of variations in a suitable subset of $\mathcal{A}$. For example, under the hypotheses of Theorem 2.2 suppose that we want to prove the existence of a local minimizer with respect to some metric $d$ on $\mathcal{A}$. Assume that $d$ is such that if $z^{(j)} \in \mathcal{A}$ with $z^{(j)} \rightharpoonup z$ in $W^{1,1}(\Omega;\mathbb{R}^3)$ and $\sup_j I(z^{(j)}) < \infty$ then $d(z^{(j)}, z) \to 0$. Let $U \subset \mathcal{A}$ be open with respect to $d$, with closure $\bar U$ and boundary $\partial U$. By the direct method, $I$ attains a minimum $\hat y$ on $\bar U$. Suppose now that we can prove that
\[
\inf_{\partial U} I > \inf_{U} I > \inf_{\mathcal{A}} I .
\]
Then $\hat y \in U$ and is a local, but not global, minimizer with respect to $d$. I believe that it should be possible to implement this method in some realistic examples, but have not seen it done. The only results on local minimizers in nonlinear elasticity using the direct method that I am aware of are due to Post and Sivaloganathan [1997], who prove the existence of local but not global minimizers for certain two-dimensional problems (see Section 2.6) for which the domain $\Omega$ has nontrivial topology by global minimization in a weakly closed homotopy class, and to Taheri [2001a], who generalizes the results in Post and Sivaloganathan [1997] to a wider class of domains.

The second approach is to find by some method a sufficiently smooth solution $\hat y$ to the equilibrium equations and attempt to show directly that it is a local minimizer. For local minimizers in $W^{1,\infty}(\Omega;\mathbb{R}^3)$ (weak local minimizers) this can be done in principle by checking positivity of the second variation. However for local minimizers in $W^{1,p}(\Omega;\mathbb{R}^3)$ with $1 \le p < \infty$, or in $L^q(\Omega;\mathbb{R}^3)$, $1 \le q \le \infty$, the task is made much more difficult by the absence of a known generalization to higher dimensions of the Weierstrass fundamental sufficiency theorem of the one-dimensional calculus of variations (for a discussion see Ball [1998]). Sometimes it is possible to circumvent the lack of such a theory. For example, in a dead-load traction problem arising from the bi-axial load experiments of Chu and James [1993, 1995] on CuAlNi single crystals, it is proved in Ball and James [2002], Ball, Chu and James [2002] (see also Ball, Chu and James [1995]) that certain $\hat y$ with $D\hat y = A = $ constant are local (but not global) minimizers in $L^1(\Omega;\mathbb{R}^3)$, by an argument exploiting the geometric incompatibility of $A$ with deformation gradients having lower energy.

How can one prove the existence of equilibrium solutions that are not local minimizers?
In exceptional cases one may know an equilibrium solution explicitly (for example a trivial solution) and be able to show that it does not satisfy some necessary condition for a local minimizer. If we can also prove the existence of a global minimizer then we have at least two equilibrium solutions. This can be done, for example, for the case of some mixed boundary-value problems when the stored-energy function is polyconvex but not quasiconvex at the boundary (see Ball and Marsden [1984]). Another approach would be to try to use Morse theory or mountain-pass methods, but it is not clear how to do this so that, for example, appropriate conditions of Palais-Smale type can be veriﬁed; for results in an interesting model problem see Zhang [2001]. More generally, one can ask for a description of how the set of equilibrium solutions varies as a function of relevant parameters such as boundary displacements or loads. For the pure traction problem near a stress-free state an interesting study of this type is that of Chillingworth, Marsden and Wan [1982, 1983] and Wan and Marsden [1983].


Problem 10. Develop local and global bifurcation theories for nonlinear elastostatics that apply to mixed displacement-traction boundary conditions.

As an illustration, the most well-known bifurcation problem in elasticity is that of buckling of a thin rod. Although this problem has been treated from the perspective of rod theory in hundreds of papers since the time of Euler [1744], there is no rigorous three-dimensional theory that justifies the usual picture of buckling, for example the existence of critical buckling loads or displacements, with corresponding branches of bifurcating buckled solutions. There are at least two difficulties in providing such a theory.

The first is that unless the problem is formulated in a somewhat unrealistic way, there is no sufficiently explicit trivial compressed solution about which to linearize the equilibrium equations. For example, suppose that in a stress-free reference configuration a homogeneous isotropic elastic rod occupies the region $\Omega = (0, L) \times D$, where $D \subset \mathbb{R}^2$ is the cross-section. A natural boundary-value problem to consider, corresponding to clamped ends, consists of the equilibrium equations (2.4) and the boundary conditions (2.1), (2.6), with the choices $\partial\Omega_1 = \{0, L\} \times D$, $\partial\Omega_2 = (0, L) \times \partial D$, and
\[
\bar y(0, x') = (0, x') , \quad \bar y(L, x') = (\lambda L, x') , \quad x' \in D , \tag{2.31}
\]
where $\lambda > 0$. For $\lambda \ne 1$ the homogeneous deformation $y(x_1, x') = (\lambda x_1, x')$ does not in general satisfy the zero traction condition (2.6). For example, for the compressive case $\lambda < 1$ the rod will typically want to bulge, leading to boundary layers near $x_1 = 0$ and $x_1 = L$. In order to have a homogeneously deformed trivial solution
\[
y(x) = (\lambda x_1, \mu x_2, \mu x_3) ,
\]
one can replace (2.31) by the conditions
\[
y_1(0, x') = 0 , \quad y_1(L, x') = \lambda L , \quad x' \in D ,
\]
corresponding to the less realistic case of frictionless end-faces constrained to lie in the planes $\{0\} \times \mathbb{R}^2$ and $\{\lambda L\} \times \mathbb{R}^2$. To prevent sliding of the end-faces one could add the further constraint that
\[
\int_D y_2\, dx' = \int_D y_3\, dx' = 0 \quad \text{at } x_1 = 0, L .
\]
In this case the natural boundary conditions at $x_1 = 0, L$ for the variational problem are that the stress vector $t$ across the end-faces has constant transverse components $t_2, t_3$ which are equal at $x_1 = 0, L$. If we try to prescribe compressive loads at $x_1 = 0, L$ rather than displacements we encounter other difficulties (see Ball [1996a] for a discussion of one of these).

The second more serious difficulty has already been mentioned, namely the lack of regularity of solutions to the linearized equilibrium equations as one approaches points of $\partial\Omega_1 \cap \partial\Omega_2$, or points of discontinuity of the applied traction in a pure traction formulation of the problem, which prohibits use of the implicit function theorem in natural spaces. Perhaps it might be possible to work in spaces with suitable weights in the neighbourhood of $\partial\Omega_1 \cap \partial\Omega_2$. But it seems odd that fine details of what goes on near $\partial\Omega_1 \cap \partial\Omega_2$ should have a significant bearing on the buckling phenomenon, so perhaps there is a different approach to be discovered that circumvents this difficulty.

Once a local bifurcation picture has been established, the next thing to understand is what happens to bifurcating solutions for large parameter values. For the case when $\partial\Omega_1 \cap \partial\Omega_2$ is empty global results have recently been obtained by Healey and Rosakis [1997], Healey and Simpson [1998] and Healey [2000].

2.8 Energy Minimization and Fracture

Many of the problems described above have generalizations to variational models of fracture. Since typical fracture problems are described by deformations that have jump discontinuities across two-dimensional crack surfaces, fracture cannot in general be modelled in the context of Sobolev spaces. A generalization of the energy functional (2.2) to deformations allowing for fracture is
\[
I(y) = \int_\Omega W(Dy)\, dx + \int_{S_y} g(y^+ - y^-, \nu_y)\, d\mathcal{H}^2 , \tag{2.32}
\]
where $y$ belongs to the class SBV($\Omega$) of mappings of special bounded variation, i.e., those whose gradient is a bounded measure having no Cantor part. In (2.32) $S_y$ denotes the set of jump points of $y$, $\nu_y$ the measure theoretic normal to $S_y$, and $y^\pm$ the traces of $y$ on either side of $S_y$. The second integral represents the surface energy of cracks, as postulated in the Griffith theory of fracture (see, for example, Cherepanov [1998]), the simplest case $g = $ constant corresponding to a contribution to the energy proportional to the total crack surface area $\mathcal{H}^2(S_y)$. Despite much deep work on such models (see, for example, Acerbi, Fonseca and Fusco [1997], Ambrosio [1989, 1990], Ambrosio and Braides [1995], Ambrosio, Fusco and Pallara [1997, 2000], Ambrosio and Pallara [1997], Braides [1998], Braides and Coscia [1993, 1994], Braides, Dal Maso and Garroni [1999], Buttazzo [1995]), and their apparent potential for making an impact on understanding fracture, there have been only isolated attempts to discover their implications for practical problems of fracture mechanics (see, for example, Francfort and Marigo [1998], Bourdin, Francfort and Marigo [2000]).

Problem 11. Clarify the status of models based on the energy functional (2.32) with respect to classical fracture mechanics and to nonlinear elastostatics.


Two key issues are fracture initiation and stability, which are both related to the study of local minimizers for the functional (2.32). A technical obstacle in such a study is the lack of a general method of calculating a general variation of I about a given y in the direction of nearby deformations having possibly very diﬀerent sets of jump points. An understanding of local minimizers would also clarify the status of the nonlinear elastostatics model based on (2.2) with respect to that based on (2.32), and thereby demystify the apparent sensitivity of the elastostatics model to growth behaviour for very large strains.

3 Dynamics

3.1 Continuum Thermomechanics

We recall briefly the elements of continuum thermomechanics. The basic balance laws are the balance of linear momentum
\[
\frac{d}{dt} \int_E \rho_R\, y_t\, dx = \int_{\partial E} t_R\, dS + \int_E b\, dx , \tag{3.1}
\]
the balance of angular momentum
\[
\frac{d}{dt} \int_E \rho_R\, x \wedge y_t\, dx = \int_{\partial E} x \wedge t_R\, dS + \int_E x \wedge b\, dx , \tag{3.2}
\]
and the balance of energy
\[
\frac{d}{dt} \int_E \left( \tfrac12 \rho_R |y_t|^2 + U \right) dx = \int_E b \cdot y_t\, dx + \int_{\partial E} t_R \cdot y_t\, dS + \int_E r\, dx - \int_{\partial E} q_R \cdot n\, dS . \tag{3.3}
\]
Here $y = y(x, t)$ denotes the deformation, $t_R$ the Piola–Kirchhoff stress vector, $\rho_R > 0$ the (constant) density in the reference configuration, $b$ the body force, $U$ the internal energy, $q_R$ the heat flux vector and $r$ the heat supply. The balance laws are assumed to hold for all Lipschitz domains $E \subset \Omega$, and the unit outward normal to $\partial E$ is denoted by $n$. In addition to the balance laws, thermomechanical processes are required to obey the Second Law of Thermodynamics, which we assume to hold in the form of the Clausius–Duhem inequality
\[
\frac{d}{dt} \int_E \eta\, dx \ge -\int_{\partial E} \frac{q_R \cdot n}{\theta}\, dS + \int_E \frac{r}{\theta}\, dx \tag{3.4}
\]
for all $E$, where $\eta$ is the entropy and $\theta$ the temperature. Standard arguments now show that $t_R = T_R n$, where $T_R$ is the Piola–Kirchhoff stress tensor, and that for sufficiently smooth processes (3.1), (3.3), (3.4) reduce to the pointwise forms
\begin{align*}
\rho_R\, y_{tt} - \operatorname{div} T_R - b &= 0 , \tag{3.5} \\
\frac{d}{dt} \left( \tfrac12 \rho_R |y_t|^2 + U \right) - b \cdot y_t - \operatorname{div}(y_t T_R) + \operatorname{div} q_R - r &= 0 , \tag{3.6} \\
\eta_t + \operatorname{div} \frac{q_R}{\theta} - \frac{r}{\theta} &\ge 0 , \tag{3.7}
\end{align*}
and that (3.2) is equivalent to the symmetry of the Cauchy stress tensor $T = (\det Dy)^{-1} T_R (Dy)^T$. Eliminating $r$ from (3.6), (3.7), using (3.5) and denoting by
\[
\psi = U - \theta\eta \tag{3.8}
\]
the Helmholtz free energy, we obtain that for sufficiently smooth processes
\[
-\psi_t - \theta_t \eta + T_R \cdot Dy_t - \frac{q_R \cdot \operatorname{grad} \theta}{\theta} \ge 0 . \tag{3.9}
\]
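The elimination of $r$ that leads to (3.9) can be written out in three steps. This is a sketch for smooth processes, assuming $\theta > 0$ and reading the kinetic energy density in (3.6) as $\tfrac12\rho_R|y_t|^2$, consistent with (3.3).

```latex
% Step 1: multiply (3.7) by \theta and expand \theta \operatorname{div}(q_R/\theta):
\[
\theta \eta_t + \operatorname{div} q_R - \frac{q_R \cdot \operatorname{grad}\theta}{\theta} - r \ge 0 .
\]
% Step 2: by (3.5), \frac{d}{dt}(\tfrac12 \rho_R |y_t|^2) = y_t \cdot (\operatorname{div} T_R + b), while
% \operatorname{div}(y_t T_R) = y_t \cdot \operatorname{div} T_R + T_R \cdot Dy_t. Substituting in (3.6):
\[
\operatorname{div} q_R - r = -U_t + T_R \cdot Dy_t .
\]
% Step 3: insert Step 2 into Step 1 and use \psi_t = U_t - \theta_t \eta - \theta \eta_t from (3.8):
\[
-\psi_t - \theta_t \eta + T_R \cdot Dy_t - \frac{q_R \cdot \operatorname{grad}\theta}{\theta} \ge 0 .
\]
```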

Adopting the prescription of Coleman and Noll [1963], we assume that given an arbitrary deformation $y = y(x, t)$ and temperature field $\theta = \theta(x, t)$ we can choose a body force $b = b(x, t)$ and heat supply $r = r(x, t)$ to balance (3.5), (3.6), so that (3.9) becomes an identity to be satisfied by the constitutive equations. For the case of a thermoelastic material, for which $T_R, \eta, \psi, q_R$ are assumed to be functions of $Dy, \theta, \operatorname{grad}\theta$, this leads to the relations
\[
\psi = \psi(Dy, \theta), \quad T_R = D_A \psi, \quad \eta = -D_\theta \psi , \tag{3.10}
\]
and then (3.9) reduces to the inequality
\[
-\frac{q_R \cdot \operatorname{grad} \theta}{\theta} \ge 0 . \tag{3.11}
\]
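The way (3.9) forces the relations (3.10) can be sketched as follows. The shorthand $g = \operatorname{grad}\theta$ is introduced here for brevity, and the argument assumes, in the spirit of the Coleman and Noll prescription, that the rates $Dy_t$, $\theta_t$, $g_t$ can be chosen arbitrarily at a given state $(Dy, \theta, g)$.

```latex
% Suppose \psi = \psi(Dy, \theta, g). By the chain rule,
\[
\psi_t = D_A \psi \cdot Dy_t + D_\theta \psi\, \theta_t + D_g \psi \cdot g_t .
\]
% Substituting into (3.9):
\[
(T_R - D_A \psi) \cdot Dy_t - (\eta + D_\theta \psi)\, \theta_t - D_g \psi \cdot g_t
- \frac{q_R \cdot g}{\theta} \ge 0 .
\]
% The left-hand side is affine in the independent rates Dy_t, \theta_t, g_t, so their coefficients must vanish:
\[
T_R = D_A \psi, \qquad \eta = -D_\theta \psi, \qquad D_g \psi = 0 ,
\]
% i.e. \psi = \psi(Dy, \theta), and what remains is the heat conduction inequality (3.11).
```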

(Note that, although this inequality must be satisfied by the constitutive equation for $q_R$, for processes involving shocks (3.11) is not equivalent to (3.7), since the cancellations in the argument used to obtain (3.9) are no longer valid.) For thermoelastic materials the balance of angular momentum is satisfied identically as a consequence of the requirement that $T_R$ is frame-indifferent, i.e.,
\[
T_R(RA, \theta) = R\,T_R(A, \theta) \quad \text{for all } R \in SO(3) ,
\]
which is equivalent to the condition that
\[
\psi(RA, \theta) = \psi(A, \theta) \quad \text{for all } R \in SO(3) . \tag{3.12}
\]
The condition of material symmetry becomes
\[
\psi(AQ, \theta) = \psi(A, \theta) \quad \text{for all } Q \in S , \tag{3.13}
\]
where $S$ is the isotropy group.

The equations of isothermal thermoelasticity are obtained from (3.5), (3.10) by assuming that $\theta(x, t) = \theta_0 = $ constant. Thus the balance of linear momentum becomes
\[
\rho_R\, y_{tt} - \operatorname{div} D_A W(Dy) - b = 0 , \tag{3.14}
\]
where $W(A) = \psi(A, \theta_0)$. As regards the entropy inequality, we again adopt the Coleman and Noll point of view, choosing $r$ to balance (3.6). (Here we follow Dafermos [2000], who gives a similar reduction for isentropic thermoelasticity.) Since, from (3.11), $q_R = 0$ when $\operatorname{grad}\theta = 0$, (3.7) becomes
\[
\left( \tfrac12 \rho_R |y_t|^2 + \psi \right)_t - b \cdot y_t - \operatorname{div}(y_t T_R) \le 0 . \tag{3.15}
\]
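The reduction of (3.7) to the dissipation inequality (3.15) can be sketched in two lines, assuming $\theta_0 > 0$ and that $r$ is chosen so that (3.6) holds:

```latex
% Isothermal case: \theta = \theta_0 and q_R = 0, so (3.7) reads \eta_t - r/\theta_0 \ge 0, i.e. r \le \theta_0 \eta_t.
% With q_R = 0, (3.6) gives
\[
\frac{d}{dt} \Big( \tfrac12 \rho_R |y_t|^2 + U \Big) - b \cdot y_t - \operatorname{div}(y_t T_R) = r \le \theta_0\, \eta_t .
\]
% By (3.8), U = \psi + \theta_0 \eta, hence U_t - \theta_0 \eta_t = \psi_t and
\[
\Big( \tfrac12 \rho_R |y_t|^2 + \psi \Big)_t - b \cdot y_t - \operatorname{div}(y_t T_R) \le 0 .
\]
```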

For the more general case of a thermoviscoelastic material (of strain-rate type), $T_R, \eta, \psi, q_R$ are assumed to be functions of $Dy, Dy_t, \theta, \operatorname{grad}\theta$. By the same method we find that
\[
\psi = \psi(Dy, \theta), \quad \eta = -D_\theta \psi ,
\]
and that
\[
S \cdot Dy_t - \frac{q_R \cdot \operatorname{grad}\theta}{\theta} \ge 0 ,
\]
where
\[
T_R = D_A \psi + S(Dy, Dy_t, \theta, \operatorname{grad}\theta) .
\]
In the isothermal case we obtain the equation of motion
\[
\rho_R\, y_{tt} - \operatorname{div} D_A W(Dy) - \operatorname{div} S(Dy, Dy_t) - b = 0 , \tag{3.16}
\]
where $W = \psi(Dy, \theta_0)$, $S(Dy, Dy_t) = S(Dy, Dy_t, \theta_0, 0)$, together with the energy inequality (3.15). The frame-indifference of $S$ takes the form
\[
S(Dy, Dy_t) = Dy\, \Sigma(U, U_t) , \tag{3.17}
\]
for some matrix-valued function $\Sigma$, where $U = (Dy^T Dy)^{1/2}$.
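That a stress of the form (3.17) is frame-indifferent can be verified directly. The sketch below assumes that frame-indifference is tested against time-dependent rotations $R(t) \in SO(3)$ superposed on the motion, so that $Dy$ is replaced by $R\,Dy$; this reading of the requirement is an assumption made here, not a statement quoted from the text.

```latex
% Under y \mapsto R(t) y we have Dy \mapsto R\,Dy and Dy_t \mapsto R\,Dy_t + R_t\,Dy.
% The right stretch tensor is unchanged:
\[
\big( (R\,Dy)^T (R\,Dy) \big)^{1/2} = (Dy^T R^T R\, Dy)^{1/2} = (Dy^T Dy)^{1/2} = U ,
\]
% and therefore so is U_t. Consequently
\[
S(R\,Dy, (R\,Dy)_t) = R\,Dy\, \Sigma(U, U_t) = R\, S(Dy, Dy_t) .
\]
% By contrast, a stress depending on Dy_t directly picks up the extra term R_t\,Dy,
% which does not vanish unless R is constant in time.
```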

3.2 Existence of Solutions

Problem 12. Prove the global existence and uniqueness of solutions to initial boundary-value problems for properly formulated dynamic theories of nonlinear elasticity.

To discuss this problem let us begin with isothermal thermoelasticity. The governing equations are (3.14). These equations need to be supplemented by boundary conditions such as (2.1), (2.6) and by the initial conditions
\[
y(x, 0) = y_0(x), \quad y_t(x, 0) = y_1(x) . \tag{3.18}
\]
If the body force is conservative, so that
\[
b = -\operatorname{grad}_y h(x, y) , \tag{3.19}
\]
then (3.14) formally comprises a Hamiltonian system, and could be alternatively obtained by applying Hamilton’s principle to the functional
\[
\int_0^T \int_\Omega \left( \tfrac12 \rho_R |y_t|^2 - W(Dy) - h(x, y) \right) dx\, dt .
\]
In particular, solutions formally satisfy the balance of energy
\[
E(y, y_t) = E(y_0, y_1) , \quad t \ge 0 , \tag{3.20}
\]
where
\[
E(y, v) = \int_\Omega \left( \tfrac12 \rho_R |v|^2 + W(Dy) + h(x, y) \right) dx .
\]
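The formal energy balance (3.20) follows from differentiating $E$ along a smooth solution. The computation below is a sketch: it assumes $h$ has no explicit time dependence, and that the boundary conditions make the boundary term vanish, for instance time-independent displacement data on $\partial\Omega_1$ and zero traction on $\partial\Omega_2$ (the precise form of (2.1), (2.6) is not restated in this part of the text).

```latex
% Smooth solution of (3.14) with b = -\operatorname{grad}_y h:
\begin{align*}
\frac{d}{dt} E(y, y_t)
&= \int_\Omega \big( \rho_R\, y_t \cdot y_{tt} + D_A W(Dy) \cdot Dy_t + \operatorname{grad}_y h \cdot y_t \big)\, dx \\
&= \int_\Omega y_t \cdot \big( \rho_R\, y_{tt} - \operatorname{div} D_A W(Dy) - b \big)\, dx
 + \int_{\partial\Omega} y_t \cdot D_A W(Dy)\, n\, dS \\
&= \int_{\partial\Omega} y_t \cdot D_A W(Dy)\, n\, dS .
\end{align*}
% The boundary integral vanishes if y_t = 0 on \partial\Omega_1 and D_A W(Dy) n = 0 on \partial\Omega_2,
% which gives dE/dt = 0 and hence (3.20).
```

For weak solutions the integration by parts is not justified, which is consistent with the loss of energy at shocks.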

However, weak solutions of the quasilinear wave equation (3.14) do not in general satisfy (3.20), since singularities such as shock waves can dissipate energy. Correspondingly, although equality holds in the dissipation inequality (3.15) for smooth solutions, in general it does not do so for weak solutions. Interpreted in the sense of distributions or measures, (3.15) acts as an admissibility criterion for weak solutions.

In one dimension (3.14) takes the form
\[
\rho_R\, y_{tt} - \sigma(y_x)_x - b = 0 , \tag{3.21}
\]
where $\sigma(y_x) = W'(y_x)$, which setting $u_1 = \rho_R y_t$, $u_2 = y_x$ is equivalent to the system of two conservation laws
\[
u_t - f(u)_x = g , \tag{3.22}
\]
where
\[
f(u) = \begin{pmatrix} \sigma(u_2) \\ \rho_R^{-1} u_1 \end{pmatrix} , \quad
g = \begin{pmatrix} b \\ 0 \end{pmatrix} .
\]
This system is strictly hyperbolic if $\sigma' = W'' > 0$, so that $W$ is strictly convex. Two approaches have been employed to study (3.22), the Glimm scheme, Glimm [1965], and variants of it such as front-tracking (introduced by Dafermos [1972]), and the method of compensated compactness as pioneered by Tartar [1979, 1982] and DiPerna [1983, 1985].
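The reduction of (3.21) to the first-order system (3.22) and the hyperbolicity condition can be checked directly. A brief sketch, with $\lambda_\pm$ denoting the eigenvalues of the Jacobian (notation introduced here):

```latex
% With u = (u_1, u_2) = (\rho_R y_t, y_x):
\[
\partial_t u_1 = \rho_R\, y_{tt} = \sigma(u_2)_x + b , \qquad
\partial_t u_2 = y_{xt} = (\rho_R^{-1} u_1)_x ,
\]
% which is u_t - f(u)_x = g. The Jacobian of f and its eigenvalues are
\[
Df(u) = \begin{pmatrix} 0 & \sigma'(u_2) \\ \rho_R^{-1} & 0 \end{pmatrix} ,
\qquad
\lambda_\pm = \pm \sqrt{\sigma'(u_2)/\rho_R} .
\]
% The eigenvalues are real and distinct exactly when \sigma'(u_2) = W''(u_2) > 0.
```

The quantity $\sqrt{\sigma'/\rho_R}$ is the local wave speed, so loss of strict convexity of $W$ corresponds to loss of real, distinct characteristic speeds.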


The Glimm scheme and variants apply to strictly hyperbolic systems of the form (3.22) with $u \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}^n$, $g \in \mathbb{R}^n$. They involve a semi-explicit construction of the solutions in terms of approximation of the initial data by piecewise constant functions, together with an analysis of wave interactions. They are restricted to initial data having small total variation, and thus, via total variation estimates on the solution, to solutions of small total variation. Glimm’s original work assumed that the system was ‘genuinely nonlinear’, but this restriction was removed by Liu [1981]. Thanks to work of Bressan [1988, 1995], Bressan and Colombo [1995], Bressan [Crasta and Piccoli], Bressan and Goatin [1999], Bressan and Le Floch [1997], Bressan and Lewicka [2000], Bressan, Liu and Yang [1999] and Liu and Yang [1999a,b,c], the solutions obtained in these ways are now known to be unique in appropriate function classes. For genuinely nonlinear systems of two conservation laws, such as (3.21) with $W'' > 0$, $W''' \ne 0$, more is known (see [Dafermos, 2000, Chapter XI]). Most of this work is for solutions on the whole real line; for a treatment of (3.21) with displacement boundary conditions see Liu [1977].

The method of compensated compactness, on the other hand, has up to now been restricted to systems of at most two conservation laws, such as (3.21). Starting from a sequence of approximate solutions obtained from the method of vanishing viscosity (or by a variational time-discretization scheme, see Demoulini, Stuart and Tzavaras [2000]), it uses information coming from the existence of a suitable family of ‘entropies’ (quantities for which there is a corresponding conservation law satisfied by smooth solutions) to pass to the limit using weak convergence. However, there is no corresponding uniqueness theorem. These results are described in the books of Bressan [2000], Dafermos [2000] and Serre [2000].
In a recent development, Bianchini and Bressan [2001] have made a breakthrough by obtaining for the first time total variation estimates directly from the vanishing viscosity method.

For the three-dimensional equations (3.14) very little is known. Hughes, Kato and Marsden [1977] proved that if $W$ satisfies the strong ellipticity condition
\[
D^2 W(A)(a \otimes n, a \otimes n) \ge \mu |a|^2 |n|^2 \tag{3.23}
\]
for all $A \in M^{3\times 3}_+$, $a, n \in \mathbb{R}^3$, where $\mu > 0$, then for smooth initial data (3.18) defined on the whole of $\mathbb{R}^3$ with $\det Dy_0 > 0$, there exists a unique smooth solution on a small time interval $[0, T)$, $T > 0$. This result was extended to pure displacement boundary conditions by Kato [1985]. For related results see Dafermos and Hrusa [1985] and [Dafermos, 2000, Chapter V]. There seem to be no short-time existence results known for mixed displacement-traction boundary conditions. Interesting results concerning large time existence for sufficiently smooth and small initial data on the whole of $\mathbb{R}^3$ have been obtained by John [1988]. For corresponding results for incompressible elasticity see Hrusa and Renardy [1988], Ebin and Saxton [1986], Ebin and Simanca [1990, 1992] and Ebin [1993, 1996].

In the variables $A = Dy$, $p = \rho_R y_t$, (3.14) becomes the system
\begin{align*}
A_t &= D , \quad D_{ij} = \rho_R^{-1} p_{i,j} , \tag{3.24} \\
p_t &= \operatorname{div} D_A W(Dy) + b , \tag{3.25}
\end{align*}
which is hyperbolic if
\[
D^2 W(A)(a \otimes n, a \otimes n) > 0 \quad \text{for all } A \in M^{3\times 3}_+ \text{ and nonzero } a, n \in \mathbb{R}^3 .
\]

There is no theory of weak solutions for such multi-dimensional systems. In particular, it is unclear what conditions on W are natural for existence, and whether these conditions will be the same as those guaranteeing existence for elastostatics, namely quasiconvexity or polyconvexity. The system (3.24), (3.25) is special in the sense that there is an involution $A_{ij,k} - A_{ik,j} = 0$ which is satisfied by all weak solutions. Exploiting this in the context of a general system having involutions, Dafermos [1996, 2000] proves a theorem implying that if W is quasiconvex and satisfies (3.23) then any Lipschitz solution A, p of (3.24), (3.25) on $\mathbb{R}^3 \times [0, T]$, T > 0, is unique within the class of weak solutions admissible with respect to the entropy W, of uniformly small local oscillation, and satisfying the same initial data as A, p. An unpublished idea of LeFloch, found independently by Qin [1998], leads to the observation that for polyconvex W the hypothesis of uniformly small oscillation can be removed. These results are interesting because they so far represent the only use of quasiconvexity and polyconvexity in the context of dynamics. See Šverák [1995] for an idea of how quasiconvexity (in an augmented space) might be used to prove existence by passage to the limit using weak convergence, in the spirit of compensated compactness.

For the full system of three-dimensional nonlinear thermoelasticity (3.5)–(3.7), which has the additional conservation law (3.6), the state of knowledge (or rather lack of it) is similar. For these systems an additional difficulty is that of ensuring invertibility of solutions, and in particular the condition det Dy(x, t) > 0.

For a thermoviscoelastic material, one can hope that a sufficiently well-behaved viscous part S of the stress will guarantee existence without any convexity conditions on W. Indeed, in the one-dimensional isothermal case the equation of motion takes the form

$$\rho_R y_{tt} - \sigma(y_x, y_{xt})_x - b = 0, \tag{3.26}$$

for which Dafermos [1969] and Antman and Seidman [1996] have proved existence and uniqueness for a general class of σ. The special case of the equation

$$\rho_R y_{tt} - \sigma(y_x)_x - y_{xxt} = 0 \tag{3.27}$$


has been treated in numerous papers (see, for example, Greenberg, MacCamy and Mizel [1967], Andrews [1980], Pego [1987]). For corresponding results for thermoviscoelasticity see Racke and Zheng [1997]. For the isothermal case in three dimensions it is natural to consider in place of (3.27) the equation

$$\rho_R y_{tt} - \operatorname{div} D_A W(Dy) - \Delta y_t = 0,$$

for which a theory of existence is available (see Rybka [1992], Friesecke and Dolzmann [1997]). However, the corresponding viscous stress $S = Dy_t$ is not of the form (3.17), and so is not frame-indifferent. The only existence theory for weak or strong solutions of (3.16) with S frame-indifferent appears to be that of Potier-Ferry [1981, 1982], who, for pure displacement boundary conditions, established global existence and uniqueness of solutions for initial data close to a smooth equilibrium having strictly positive second variation. Potier-Ferry assumed that the linearized elasticity operator at the equilibrium is strongly elliptic, and that a corresponding positivity condition holds for the linearized viscous stress. He used methods of Sobolevskii [1966] (for an alternative treatment also based on Sobolevskii's work see Xu and Marsden [1996] and Xu [2000]). A recent monograph covering various aspects of the analysis of thermoelasticity is that of Jiang and Racke [2000].

A different approach to the existence of solutions in elasticity is to weaken the concept of solution to that of a measure-valued solution, in which the unknown is a Young measure $\nu_{x,t}$ in appropriate variables satisfying an integral identity obtained by formally passing to the weak limit in a sequence of approximate solutions. Using a variational time-discretization method, the global existence of such solutions has been proved by Demoulini [2000] for the viscoelastic equation (3.16) with S frame-indifferent, and by Demoulini, Stuart and Tzavaras [2001] for the equations (3.14) of elastodynamics with W polyconvex (exploiting the idea of LeFloch and Qin [1998]). However, they are unable to handle the constraint det Dy > 0. Of course the significance of such results would be greatly enhanced if there were examples known of cases in which there was no corresponding weak solution.

3.3 The Relation Between Statics and Dynamics

For suitable boundary conditions, the Second Law of Thermodynamics endows the equations of motion of continuum thermodynamics with a Lyapunov function, that is, a function of the state variables that is nonincreasing along solutions. For example, suppose that the mechanical boundary conditions are that y = y(x, t) satisfies

$$y(\cdot, t)|_{\partial\Omega_1} = \bar y(\cdot)$$

and the condition that the applied traction on $\partial\Omega_2$ is zero, and that the thermal boundary condition is

$$\theta(\cdot, t)|_{\partial\Omega} = \theta_0,$$

where $\theta_0 > 0$ is a constant. Assume that the heat supply r is zero, and that the body force is given by (3.19). Then from (3.1), (3.3) and (3.4) with E = Ω we find that the ballistic free energy

$$E = \int_\Omega \Big[ \tfrac{1}{2}\rho_R |y_t|^2 + U - \theta_0 \eta + h \Big]\, dx \tag{3.28}$$

is nonincreasing along solutions. (This is a result of Duhem [1911] for the case of thermoelasticity. See Coleman and Dill [1973], Ericksen [1966] and Ball [1986, 1992] for further discussion and references.) For a thermoviscoelastic material, if v(·, t) → 0, y(·, t) → y(·) and θ(·, t) → $\theta_0$ as t → ∞ the integrand in (3.28) formally tends to W(Dy) + h(x, y), where $W(Dy) = \psi(Dy, \theta_0)$ and ψ is the Helmholtz free energy

$$\psi(Dy, \theta_0) = U(Dy, \theta_0) - \theta_0\, \eta(Dy, \theta_0).$$

This motivates minimization of

$$I(y) = \int_\Omega \big[ W(Dy) + h(x, y) \big]\, dx.$$

(For pure zero traction boundary conditions, when uniformly rotating equilibria are to be expected, we do not expect that $y_t \to 0$; the corresponding entropy maximization problem is studied by Lin [1990].) As applied to such problems, the calculus of variations can be viewed as representing a crude version of dynamics in which true dynamic orbits are replaced by all paths in a phase space of mappings $\{y, y_t, \theta\}$ along which E is nonincreasing.

Problem 13. Develop a qualitative dynamics for dynamic theories of elasticity.

Of course a prerequisite for such a qualitative dynamics is a global existence theory for solutions. Given such a theory, the points at issue are the usual ones for dissipative dynamical systems, namely whether solutions converge to equilibrium states as t → ∞, the structure of regions of attraction, the existence of stable and unstable manifolds of equilibria, the existence of a global attractor, and so on. In particular one can ask whether dynamic orbits generically realize suitably deﬁned local minimizing sequences for the ballistic free energy. This is especially interesting in the case when the ballistic free energy does not attain a minimum (as in models of elastic crystals — see Section 4.2). In fact for the one-dimensional viscoelastic model of this type studied by Ball, Holmes, James, Pego, and Swart [1991] and Friesecke and McLeod [1996], it is known that no dynamic solutions realize global minimizing sequences; it is unclear whether or not this is a one-dimensional phenomenon.
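The role of the energy as a Lyapunov function can be illustrated on the one-dimensional viscoelastic equation (3.27). The following is only a minimal numerical sketch and is not taken from the works cited above. It assumes $\rho_R = 1$, the illustrative stress law σ(p) = p + p³ (so that W(p) = p²/2 + p⁴/4), homogeneous Dirichlet data on (0, 1), and a simple finite-difference scheme with a semi-implicit Euler time step. The names `simulate`, `sigma` and `energy` are hypothetical. The time discretization need not reproduce the monotone decay of the continuous energy exactly, so the sketch merely records the discrete energy for inspection.

```python
import numpy as np

def simulate(n=20, dt=2e-4, t_end=2.0, rho=1.0):
    """Integrate rho*y_tt = sigma(y_x)_x + y_xxt on (0, 1) with y = 0 at both ends."""
    dx = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    y = np.zeros(n + 1)               # initial displacement (assumed zero)
    v = np.sin(np.pi * x)             # initial velocity (assumed), vanishing at the ends

    def sigma(p):
        # Assumed stress law; the source allows a general class of sigma.
        return p + p**3

    def energy(y, v):
        p = np.diff(y) / dx           # discrete strain on each cell
        kinetic = 0.5 * rho * np.sum(v[1:-1] ** 2) * dx
        stored = np.sum(0.5 * p**2 + 0.25 * p**4) * dx
        return kinetic + stored

    energies = [energy(y, v)]
    for _ in range(int(round(t_end / dt))):
        p = np.diff(y) / dx           # strain y_x
        q = np.diff(v) / dx           # strain rate y_xt
        force = np.diff(sigma(p) + q) / dx   # (sigma(y_x) + y_xt)_x at interior nodes
        v[1:-1] += dt * force / rho   # update velocity first ...
        y[1:-1] += dt * v[1:-1]       # ... then position (semi-implicit Euler)
        energies.append(energy(y, v))
    return np.array(energies)

energies = simulate()
```

Plotting `energies` against time is one way to inspect the dissipation. A different stored energy could be explored by changing `sigma` and the stored-energy term in `energy` consistently, although no claim is made here about how this crude scheme behaves for nonconvex W.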

Problem 14. Develop criteria for the dynamic stability and instability of equilibria.

Koiter [1976] is among those who have drawn attention to the problem of justifying the energy criterion for stability, that an equilibrium solution is stable if it is a local minimizer of the corresponding elastic energy (for example, of the ballistic free energy for a thermoviscoelastic material). To keep the discussion simple, consider the case of isothermal motion of a thermoviscoelastic material, for which the equation of motion is given by (3.16), and assume that the body force is zero. The corresponding Lyapunov function is

$$E(y, v) = \int_\Omega \Big[ \tfrac{1}{2}\rho_R |v|^2 + W(Dy) \Big]\, dx,$$

for which $y = y^*$, v = 0 is a local minimizer provided $y^*$ is a local minimizer of

$$E(y) = \int_\Omega W(Dy)\, dx.$$

Since there are different types of local minimizer corresponding to different metrics d on different spaces X of deformations y, in particular $W^{1,\infty}$ local minimizers and $W^{1,p}$ local minimizers for 1 ≤ p < ∞, it is not clear which kinds of local minimizers are needed to ensure dynamical stability. As emphasised, for example, by Knops and Wilkes [1973], the standard argument for establishing Lyapunov stability with respect to a metric d requires more than just that $y^*$ is a strict local minimizer with respect to d (that is, $I(y) > I(y^*)$ whenever $y \neq y^*$ and $d(y, y^*)$ is sufficiently small). What is needed, in addition to the continuity of I with respect to d and the continuity in time of dynamic orbits with respect to d, is that $y^*$ lies in a potential well, namely that for some ε > 0

$$\inf_{d(y, y^*) = \varepsilon} I(y) > I(y^*).$$

For a way of verifying that a strict local minimizer in a space based on W 1,p for 1 < p < ∞ satisﬁes this requirement when W is strictly polyconvex see Ball and Marsden [1984], and for the case when W is strictly quasiconvex see Evans and Gariepy [1987] and Sychev [1999]. However, we do not know in general how to prove that a given y∗ is a strict W 1,p local minimizer. Further, for W that are not quasiconvex almost nothing is known. These questions are related to the open problem, already mentioned in Section 2.7, of generalizing to higher dimensions the Weierstrass fundamental suﬃciency theorem. The only rigorous result justifying the energy criterion in any generality (in particular, in three dimensions) seems to be that of Potier-Ferry [1982], who for pure displacement boundary conditions establishes asymptotic stability, with respect to the norm of W 2,p × W 2,p , p > 3, of smooth equilibria having strictly positive second variation, under the hypotheses described in Section 3.2.


Finally, little is known about criteria justifying instability of an equilibrium solution. An instructive example is that of Friesecke and McLeod [1997], who for a problem of one-dimensional viscoelasticity exhibit an equilibrium solution $\bar u$ which, with respect to a topology in which the dynamics is well-posed, (a) is dynamically stable, but (b) is such that there is a continuous path in the phase space (not a dynamical solution) leaving $\bar u$ along which the energy strictly decreases, so that in particular $\bar u$ is not a local minimizer.

4 Multiscale Problems

4.1 From Atomic to Continuum

Problem 15. Establish the status of elasticity theory with respect to atomistic models. Is it possible to derive elasticity theory from atomistic models? Such models range from full quantum many-body theory to approximations such as density-functional theory, Thomas–Fermi theory, and models in which electronic eﬀects are not explicitly considered but incorporated into interatomic potentials. There is an extensive physics and materials literature on such models and on methods for bridging the atomistic and continuum lengthscales (see Phillips [2001] for an introduction). Here I will concentrate on what little is known rigorously for the case of elastic crystals. However, another important class of materials is that of cross-linked polymers, which involves some diﬀerent issues that are considered from the point of view of statistical mechanics in Deam and Edwards [1976], Edwards and Vilgis [1988]. For crystals, the ﬁrst question to answer is why they occur in the ﬁrst place, that is why at low temperature the minimum energy conﬁguration of a very large number of atoms is spatially periodic. This is the famous unsolved ‘crystal problem’, nicely reviewed by Radin [1987]. Likewise, there is no fundamental understanding of the statistical mechanics of crystals, which would explain their stability and instabilities. Given this impasse, in attempts to pass from atomistic to continuum models of crystals some initial atomic order is always assumed. In the context of free-energy minimization, one can draw a distinction between two kinds of approach. In the ﬁrst, an appropriate limit of a discrete energy functional is sought along sequences of explicit atomic conﬁgurations. For example the atoms may be assumed to occupy a periodic lattice in a reference conﬁguration, and to be displaced according to a given suﬃciently smooth continuum deformation y (the Cauchy–Born hypothesis), the number of atoms being sent to inﬁnity with a suitable scaling for the energy. 
In this approach there is no attempt to explain why the atoms adopt the assumed conﬁgurations. A


recent example is the work of Blanc, Le Bris and Lions [2001], who obtain in a suitable scaling a limiting energy of the form

$$I(y) = \int_\Omega W(Dy)\, dx$$

in the cases of (a) a two-body interaction, (b) Thomas–Fermi theory. As is well known, the case of two-body interactions leads to a W satisfying the Cauchy relations, namely that the linearized elasticity coefficients at a natural state (say Dy = 1)

$$c_{ijkl} = \frac{\partial^2 W(1)}{\partial A_{ij}\, \partial A_{kl}}$$

possess the symmetries $c_{ijkl} = c_{jikl} = c_{klij} = c_{ikjl}$. These symmetries are known not to hold in general (see Love [1927], Weiner [1983]). Blanc, Le Bris and Lions also obtain second-order terms in the expansion of the energy with respect to the scaling parameter, identifying these with surface energies. For fundamental results on linear deformations with the Cauchy–Born hypothesis see Lieb and Simon [1977] and Fefferman [1985]; in these papers the dependence on the deformation gradient enters implicitly through the given Bravais lattice. For more recent extensions of the results of Lieb and Simon [1977] see Catto, Le Bris and Lions [1998].

The second approach is to consider the true minimizers of the discrete problems, and to try to understand what functional their limit minimizes. One example of this approach is the interesting study by Friesecke and Theil [2001] of a model two-dimensional problem of a lattice of particles linked by harmonic springs between their nearest and next nearest neighbours. They determine open regions of atomic parameters in which the Cauchy–Born hypothesis holds in the appropriate limit, and open regions in which it does not. Another interesting recent study in this spirit is that of Penrose [2001], who considers a two-dimensional model problem of a lattice of rotatable disks with one-body wall interactions and angle-dependent two-body interactions. By suitably restricting the statistical ensemble, so that, for example, certain angles between atoms in the deformed configuration are constrained to lie in certain intervals (these constraints being designed to deter dislocations), he proves the existence of an elastic free energy W corresponding to taking the thermodynamic limit with prescribed linear boundary data. He also deduces a convexity property of W weaker than rank-one convexity, and suggests that W might in fact be quasiconvex.
Other work in this spirit is that of Braides, Dal Maso and Garroni [1999], Braides and Gelli [2001a,b] and Foccardi and Gelli [2001], who calculate the Γ-limits of certain discrete functionals with nearest-neighbour


or pairwise interactions, obtaining a limiting functional allowing fracture of the general form (2.32) together with a corresponding function space on which to minimize it (namely SBV(Ω) or GSBV(Ω)). This acts as a reminder that, since the predictions of minimization problems can depend on the function space, as we have seen in Section 2.3 in connection with the Lavrentiev phenomenon, a proper atomistic to continuum derivation should deliver not only the limiting governing equations or energy, but the appropriate function space as well. For a proposed scheme for the passage from atomistic to continuum models for thin ﬁlms, rods or tubes see Friesecke and James [2000]. There seems to be no work on atomistic derivations of dynamic theories of elasticity, or even of elastostatics in the context of deformations that are not global energy minimizers. In some situations it is desirable to simultaneously use a continuum and a discrete model. For example, one may wish to study the interaction of a defect or other localized region (such as the vicinity of a crack tip), in which atomistic eﬀects may be important, with the surrounding bulk material, where a continuum theory is appropriate. One way of handling the resulting matching problem is the quasicontinuum method (see Tadmor, Ortiz and Phillips [1996]). A rigorous understanding of such methods is lacking.
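The Cauchy relations for two-body interactions mentioned earlier in this section can be checked numerically on a toy model. The sketch below is an illustration under stated assumptions and is not the setting of Blanc, Le Bris and Lions: it takes a simple cubic lattice with harmonic springs between nearest and next-nearest neighbours, all at rest in the reference lattice (so that Dy = 1 is a natural state), evaluates the energy per unit cell for a homogeneous deformation in the spirit of the Cauchy–Born hypothesis, and differentiates twice by central differences. The spring constants `k1`, `k2` and the function names are hypothetical.

```python
import itertools
import numpy as np

def bonds():
    """Nearest and next-nearest neighbour vectors of a simple cubic lattice."""
    vecs = []
    for r in itertools.product((-1, 0, 1), repeat=3):
        if sum(c * c for c in r) in (1, 2):
            vecs.append(np.array(r, dtype=float))
    return vecs

def energy_density(A, k1=1.0, k2=0.4):
    """Pair energy per unit cell when every bond r is deformed to A r."""
    W = 0.0
    for r in bonds():
        rest = np.linalg.norm(r)
        k = k1 if np.isclose(rest, 1.0) else k2
        # factor 1/2 from the spring energy, 1/2 because each bond is counted twice
        W += 0.25 * k * (np.linalg.norm(A @ r) - rest) ** 2
    return W

def elasticity_tensor(h=1e-4):
    """c_ijkl = d^2 W / dA_ij dA_kl at A = 1, by central differences."""
    one = np.eye(3)
    c = np.zeros((3, 3, 3, 3))
    for i, j, k, l in itertools.product(range(3), repeat=4):
        E1 = np.zeros((3, 3))
        E1[i, j] = h
        E2 = np.zeros((3, 3))
        E2[k, l] = h
        c[i, j, k, l] = (
            energy_density(one + E1 + E2) - energy_density(one + E1 - E2)
            - energy_density(one - E1 + E2) + energy_density(one - E1 - E2)
        ) / (4.0 * h * h)
    return c

c = elasticity_tensor()
# Cauchy relation: c_ijkl should agree with c_ikjl up to discretization error.
cauchy_gap = np.max(np.abs(c - c.transpose(0, 2, 1, 3)))
```

For this pair-potential model `cauchy_gap` should be at the level of the finite-difference error, so that in particular `c[0, 0, 1, 1]` and `c[0, 1, 0, 1]` agree. An energy with genuine many-body terms would not in general be expected to pass the same check.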

4.2 From Microscales to Macroscales

Materials undergoing phase transformations involving a change of shape at a critical temperature typically develop characteristic patterns of microstructure, in which the deformation gradient has large variations on a ﬁne length-scale that varies from material to material but can be as small as a few atomic spacings. Such microstructures often contain twinned regions consisting of many parallel layers separated by sharp interfaces, the deformation gradient alternating between two distinct values in adjacent layers. Why does ﬁne microstructure form? To what extent can we predict its morphology? What are the properties of the material at a macroscale much larger than the microscale of the microstructure? Whereas it would be desirable to answer such questions in the context of a suitably formulated dynamical theory, it is neither clear what such a theory should be (especially as regards the kinetics of interfaces), nor do we currently have the techniques to give answers corresponding to any such theory. Hence we will discuss some more speciﬁc open problems that arise in static models of such phase transformations. Consider a single crystal of thermoelastic material that undergoes a diffusionless phase transformation involving a change of shape at the temperature θc . For deﬁniteness, suppose that there is an interval E of temperatures, containing θc , such that for θ ∈ E, θ > θc , the minimum energy


configuration (called austenite) of the crystal is cubic. Taking the reference configuration to be undistorted austenite at $\theta = \theta_c$, we suppose that for θ ∈ E, $\theta \le \theta_c$, a minimum energy configuration (called martensite) is given by the transformation strain U(θ). We suppose that the Helmholtz free-energy function ψ(A, θ) attains a finite minimum with respect to $A \in M^{3\times 3}_+$ for each θ ∈ E, so that by adding to it a suitable function of θ we may assume for the purposes of free-energy minimization that

$$\min_{A \in M^{3\times 3}_+} \psi(A, \theta) = 0.$$

Let

$$K(\theta) = \{ A \in M^{3\times 3}_+ : \psi(A, \theta) = 0 \}$$

be the set of energy-minimizing deformation gradients. Note that by (3.12) and (3.13)

$$SO(3)\, K(\theta)\, S = K(\theta),$$

where we take S to be the subgroup $P^{24}$ of SO(3) consisting of the 24 rotations mapping a cube into itself. It is thus reasonable to suppose that for θ ∈ E,

$$K(\theta) = \begin{cases} \alpha(\theta)\, SO(3), & \theta > \theta_c, \\ SO(3) \cup \bigcup_{i=1}^{N} SO(3)\, U_i(\theta_c), & \theta = \theta_c, \\ \bigcup_{i=1}^{N} SO(3)\, U_i(\theta), & \theta < \theta_c, \end{cases} \tag{4.1}$$

where α(·) describes the thermal expansion, with $\alpha(\theta_c) = 1$, and where the $U_i(\theta)$, 1 ≤ i ≤ N, are the distinct positive definite symmetric matrices of the form $Q^T U(\theta) Q$, $Q \in P^{24}$. We assume that N is independent of θ ∈ E. If $U_i(\theta_c) \neq 1$ the transformation is first-order, while if $U_i(\theta_c) = 1$ it is second-order. We say that each $U_i$ describes a different variant of martensite. For example, in the case of a cubic-to-tetragonal transformation we may take

$$U(\theta) = U_1(\theta) = \operatorname{diag}(\eta_3, \eta_1, \eta_1),$$

where $\eta_1(\theta) > 0$, $\eta_3(\theta) > 0$, and then we find that N = 3, and that

$$U_2(\theta) = \operatorname{diag}(\eta_1, \eta_3, \eta_1), \qquad U_3(\theta) = \operatorname{diag}(\eta_1, \eta_1, \eta_3).$$

For other transformations we get diﬀerent numbers of variants; for example, for cubic-to-orthorhombic transformations N = 6, and for cubic to monoclinic transformations N = 12. Note that in adopting (4.1) we exclude large shears leaving the crystal lattice invariant (see Ericksen [1977b]), the inclusion of which would lead to K(θ) consisting of an inﬁnite number of energy wells for each θ, and to an energy-minimization problem of a diﬀerent character to that based on (4.1) (see Fonseca [1988]).
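The variant counts quoted above can be reproduced by direct enumeration. The sketch below is a minimal illustration: it builds the 24 rotations of the cube as signed permutation matrices of determinant one, forms $Q^T U Q$ for each, and counts the distinct results. The particular transformation strains used for the orthorhombic and monoclinic cases, and all numerical values, are assumptions chosen for illustration rather than data from the source.

```python
import itertools
import numpy as np

def cubic_rotations():
    """The 24 rotations mapping a cube into itself, as signed permutation matrices."""
    rots = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1.0, -1.0), repeat=3):
            Q = np.zeros((3, 3))
            for row, (col, s) in enumerate(zip(perm, signs)):
                Q[row, col] = s
            if np.isclose(np.linalg.det(Q), 1.0):
                rots.append(Q)
    return rots

def variants(U, decimals=10):
    """Distinct matrices Q^T U Q as Q ranges over the cubic rotation group."""
    seen = {}
    for Q in cubic_rotations():
        V = Q.T @ U @ Q
        key = tuple((np.round(V, decimals) + 0.0).ravel())  # + 0.0 removes signed zeros
        seen.setdefault(key, V)
    return list(seen.values())

# Hypothetical transformation strains (illustrative values only).
tetragonal = np.diag([1.1, 0.9, 0.9])
orthorhombic = np.array([[1.00, 0.05, 0.0],
                         [0.05, 1.00, 0.0],
                         [0.0, 0.0, 0.9]])
monoclinic = np.array([[1.00, 0.05, 0.02],
                       [0.05, 1.00, 0.02],
                       [0.02, 0.02, 0.9]])

counts = {name: len(variants(U)) for name, U in
          [("tetragonal", tetragonal), ("orthorhombic", orthorhombic), ("monoclinic", monoclinic)]}
```

The count is 24 divided by the number of cubic rotations leaving U invariant, so non-generic parameter values (for example coincident eigenvalues) can give fewer variants than the generic numbers quoted in the text.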


The total free energy corresponding to the deformation $y : \Omega \to \mathbb{R}^3$ is given by

$$I_\theta(y) = \int_\Omega \psi(Dy, \theta)\, dx.$$

Zero-energy microstructures (at the temperature θ) correspond to sequences of deformations $y^{(j)}$ such that

$$\lim_{j \to \infty} I_\theta\big(y^{(j)}\big) = 0. \tag{4.2}$$

If we assume a mild growth condition on ψ, such as

$$\psi(A, \theta) \ge c_0 |A|^p - c_1 \quad \text{for all } A \in M^{3\times 3}_+,$$

where $c_0 > 0$, $c_1$ and p > 1 are constants, then (4.2) is equivalent to the statement that $Dy^{(j)} \to K(\theta)$ in measure, or that the Young measure $(\nu_x)_{x \in \Omega}$ corresponding to (a subsequence of) $Dy^{(j)}$ satisfies

$$\operatorname{supp} \nu_x \subset K(\theta) \quad \text{a.e. } x \in \Omega.$$

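The set of macroscopic deformation gradients corresponding to zero-energy microstructures is the set of gradients $Dy : \Omega \to M^{3\times 3}_+$ such that $Dy^{(j)} \rightharpoonup Dy$ in $L^p$ for some sequence $y^{(j)}$ satisfying (4.2). Equivalently, following the results of Kinderlehrer and Pedregal [1991, 1994], this is the set of gradients Dy such that

$$Dy(x) \in K(\theta)^{qc} \quad \text{a.e. } x \in \Omega,$$

where for a compact set $K \subset M^{3\times 3}$, $K^{qc}$ denotes the quasiconvexification of K. Equivalent definitions of $K^{qc}$ are

$$\begin{aligned} K^{qc} &= \{ \bar\nu : \nu \text{ is a homogeneous } W^{1,\infty} \text{ Young measure with } \operatorname{supp} \nu \subset K \} \\ &= \{ A \in M^{3\times 3} : \varphi(A) \le \max_{B \in K} \varphi(B) \text{ for all quasiconvex } \varphi \} \\ &= \bigcap \{ E \supset K : E \text{ quasiconvex} \}. \end{aligned}$$

Here a set E is quasiconvex if it is the zero set $\varphi^{-1}(0)$ of a nonnegative quasiconvex function φ. We also have that

$$K^{qc} = \Big\{ A \in M^{3\times 3}_+ : \inf_{\mathcal{A}_A} I_\theta(y) = 0 \Big\},$$

where $\mathcal{A}_A = \{ y \in W^{1,1}(\Omega; \mathbb{R}^3) : y|_{\partial\Omega} = Ax \}$.

We can similarly define the polyconvexification $K^{pc}$ and the rank-one convexification $K^{rc}$ of a compact set $K \subset M^{3\times 3}_+$ by

$$\begin{aligned} K^{pc} &= \{ A \in M^{3\times 3} : \varphi(A) \le \max_{B \in K} \varphi(B) \text{ for all polyconvex } \varphi \}, \\ K^{rc} &= \{ A \in M^{3\times 3} : \varphi(A) \le \max_{B \in K} \varphi(B) \text{ for all rank-one convex } \varphi \}. \end{aligned}$$

Clearly

$$K^{rc} \subset K^{qc} \subset K^{pc}. \tag{4.3}$$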

Problem 16. Determine $K(\theta)^{qc}$ for $\theta \le \theta_c$.

For $\theta > \theta_c$ we have that $K(\theta)^{qc} = K(\theta)$ (cf. Ball and James [1992]). For $\theta \le \theta_c$ the problem is open. In particular, $K(\theta)^{qc}$ is not known for the cubic-to-tetragonal case either when $\theta < \theta_c$ or $\theta = \theta_c$. Ball and James [1992] computed $K^{qc}$ for the case of two wells

$$K = SO(3)U \cup SO(3)V,$$

with $U = U^T > 0$, $V = V^T > 0$, det U = det V and with SO(3)U rank-one connected to SO(3)V, which occurs for orthorhombic-to-monoclinic transformations. In this case by linear changes of variables we can assume that

$$U = \operatorname{diag}(\eta_1, \eta_2, \eta_3), \qquad V = \operatorname{diag}(\eta_2, \eta_1, \eta_3), \tag{4.4}$$

where $\eta_1 > 0$, $\eta_2 > 0$, $\eta_3 > 0$ and $\eta_1 \neq \eta_2$. (This includes the case $U = U_1(\theta)$, $V = U_2(\theta)$ of two tetragonal wells.) The answer in this case is (see Ball and James [2003]) that $K^{qc}$ consists of those $A \in M^{3\times 3}_+$ such that

$$A^T A = \begin{pmatrix} a & c & 0 \\ c & b & 0 \\ 0 & 0 & \eta_3^2 \end{pmatrix},$$

where $ab - c^2 = \eta_1^2 \eta_2^2$, $a + b + 2|c| \le \eta_1^2 + \eta_2^2$. The proof is by calculating $K^{pc}$, showing that $K^{pc} \subset K^{rc}$, and using (4.3). Friesecke [2000] has announced that he can calculate $K(\theta)^{pc}$, $\theta < \theta_c$, in the cubic-to-tetragonal case. However, whether in general $K(\theta)^{pc} = K(\theta)^{qc}$ is unknown. (Despite this, in their study of nonclassical austenite-martensite interfaces Ball and Carstensen [1999] were able to show that for $\theta < \theta_c$ the identity matrix is rank-one connected to $K(\theta)^{qc}$ if and only if it is rank-one connected to $K(\theta)^{pc}$.)

Problem 17. For free-energy functions ψ(A, θ) of elastic crystals, determine for which $\partial\Omega_1 \subset \partial\Omega$ and $g : \partial\Omega_1 \to \mathbb{R}^3$ the minimum of

$$I_\theta(y) = \int_\Omega \psi(Dy, \theta)\, dx$$

in $\mathcal{A} = \{ y \in W^{1,1}(\Omega; \mathbb{R}^3) : y|_{\partial\Omega_1} = g \}$ is attained, and for which it is not.

A solution to this problem (also to the corresponding problem including applied loads on ∂Ω2 ) would help clarify the validity of the hypothesis of Ball and James [1987] that the formation of microstructure is associated with non-attainment of minimum energy. For example, is non-attainment generic or exceptional? It is probably overly optimistic to expect a general answer to Problem 17. A simpler special case for which the answer is in general unknown is when


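$\partial\Omega_1 = \partial\Omega$, g(x) = Ax, and $A \in K(\theta)^{qc} \setminus K(\theta)$. In this case the problem is equivalent to asking whether for such A there exists a deformation y satisfying

$$y|_{\partial\Omega} = Ax \quad \text{and} \quad Dy(x) \in K(\theta) \quad \text{almost everywhere}.$$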

For the corresponding two-well problem in which K(θ) is replaced by $K = SO(3)U \cup SO(3)V$, with U, V given by (4.4), there is no y with $y|_{\partial\Omega} = Ax$ and Dy(x) ∈ K almost everywhere. This non-attainment result was proved by Ball and Carstensen [1999] using the result of Ball and James [1991] that any y with $Dy(x) \in K^{qc}$ a.e. is a plane strain, the point being that a plane strain y cannot coincide with a linear mapping Ax on the boundary of a three-dimensional region Ω unless Dy(x) = A a.e. In the corresponding two-dimensional problem the answer is different. In fact, if $\Omega \subset \mathbb{R}^2$ and $K = SO(2)U \cup SO(2)V$, where

$$U = \operatorname{diag}(\eta_1, \eta_2), \qquad V = \operatorname{diag}(\eta_2, \eta_1),$$

and $\eta_1 > 0$, $\eta_2 > 0$, then Müller and Šverák [1996] modified the theory of convex integration due to Gromov [1986] to show that there exists y with $y|_{\partial\Omega} = Ax$, Dy(x) ∈ K a.e. for any $A \in K^{qc}$. (For variations on the method see Dacorogna and Marcellini [1999], Müller and Sychev [2001], Sychev [2001] and Kirchheim [2001].) Whether these exotic minimizers exist in three dimensions, and if so whether they are physically relevant, is unclear. If, as seems likely, they do exist, then it is natural to ask whether they are admissible, in the sense that they can be obtained as limits of minimizers for a corresponding functional incorporating interfacial energy, for example

$$I_\theta^\varepsilon(y) = \int_\Omega \Big[ \psi\big(Dy(x), \theta\big) + \varepsilon^2 \big|D^2 y(x)\big|^2 \Big]\, dx$$

in the limit ε → 0. For general information on the models and techniques described in this section see Ball and James [2003], Bhattacharya [2001], Hane [1997], Müller [1999], Luskin [1996] and Pedregal [1991, 2000].
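The explicit description of $K^{qc}$ for the two wells (4.4) quoted earlier in this section lends itself to a direct membership test. The sketch below simply encodes the stated conditions on $A^T A$ and applies them to a few matrices; the function names, the tolerance and the numerical values of $\eta_1, \eta_2, \eta_3$ are assumptions made for illustration. The test laminate is built from a rotation Q about $e_3$ chosen so that QU and V agree on the direction (1, −1, 0), which makes QU − V a rank-one matrix.

```python
import numpy as np

def in_two_well_hull(A, eta1, eta2, eta3, tol=1e-9):
    """Check the stated conditions on A^T A for membership of K^qc (two wells (4.4))."""
    if np.linalg.det(A) <= 0.0:
        return False
    C = A.T @ A
    a, b, c = C[0, 0], C[1, 1], C[0, 1]
    block_ok = np.allclose([C[0, 2], C[1, 2], C[2, 2]],
                           [0.0, 0.0, eta3**2], rtol=0.0, atol=tol)
    det_ok = abs(a * b - c * c - (eta1 * eta2) ** 2) <= tol
    sum_ok = a + b + 2.0 * abs(c) <= eta1**2 + eta2**2 + tol
    return bool(block_ok and det_ok and sum_ok)

def rotation_z(angle):
    """Rotation about the e3 axis."""
    cs, sn = np.cos(angle), np.sin(angle)
    return np.array([[cs, -sn, 0.0], [sn, cs, 0.0], [0.0, 0.0, 1.0]])

eta1, eta2, eta3 = 1.1, 0.9, 1.0      # illustrative values only
U = np.diag([eta1, eta2, eta3])
V = np.diag([eta2, eta1, eta3])

# Rotation Q about e3 with Q U t = V t for t = (1, -1, 0)/sqrt(2).
t = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)
Ut, Vt = U @ t, V @ t
Q = rotation_z(np.arctan2(Vt[1], Vt[0]) - np.arctan2(Ut[1], Ut[0]))

laminate = 0.3 * (Q @ U) + 0.7 * V    # average of two rank-one connected matrices
average = 0.5 * (U + V)               # plain average of U and V, not a laminate

results = {
    "U": in_two_well_hull(U, eta1, eta2, eta3),
    "laminate": in_two_well_hull(laminate, eta1, eta2, eta3),
    "average": in_two_well_hull(average, eta1, eta2, eta3),
}
```

The plain average of U and V fails the determinant condition, which illustrates that $K^{qc}$ is in general much smaller than the convex hull of K, whereas the laminate average passes. The tolerance `tol` matters because such simple laminates satisfy the inequality with equality up to rounding.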

4.3 From Three-Dimensional Elasticity to Theories of Rods and Shells

A rod is a three-dimensional body whose form is close to that of a curve in $\mathbb{R}^3$. We can describe a reasonably wide class of such bodies as those which occupy in a reference configuration the bounded domain

$$\Omega^h = \{ r(s) + Q(s)(0, x') : s \in (0, L),\ x' \in hD \},$$

where $r : (0, L) \to \mathbb{R}^3$ is a smooth embedded curve parametrized by arclength, so that $|r'(s)| = 1$, where the cross-section $D \subset \mathbb{R}^2$ is a bounded


domain with 0 ∈ D, and where Q : (0, L) → SO(3) is a smooth mapping with $Q(s)e_1 = r'(s)$ for each s ∈ (0, L), which describes how the cross-section is rotated. The parameter h > 0 measures the thickness of the rod. The simple case of an initially straight rod of circular cross-section corresponds to the choice $r(s) = s e_1$, $D = B(0, 1) \subset \mathbb{R}^2$, Q(s) = 1, so that $\Omega^h = (0, L)e_1 \times hD$.

A shell is a three-dimensional body whose form is close to that of a two-dimensional surface. A class of such bodies consists of those occupying in a reference configuration the bounded domain

$$\Omega^h = \{ r(s_1, s_2) + \tau\, n(s_1, s_2) : (s_1, s_2) \in S,\ |\tau| \le h \},$$

where $S \subset \mathbb{R}^2$ is a bounded domain, and $r : S \to \mathbb{R}^3$ is a smooth embedded oriented surface with unit normal $n(s_1, s_2)$. Here h > 0 is the thickness of the shell. A plate is a flat shell, corresponding to

$$\Omega^h = S \times (-h, h). \tag{4.5}$$

When h is small, such thin rods and shells are traditionally described respectively by one-dimensional rod and two-dimensional shell models, in which the independent variables are respectively (s, t) and $(s_1, s_2, t)$, where t is time. There is an immense literature on the many such theories, well surveyed in the books of Antman [1995], and Ciarlet [1997, 2000]. However, there are only the beginnings of a rigorous theory justifying such models with respect to three-dimensional elasticity.

Problem 18. Give a rigorous derivation of models of rods, plates and shells, showing that their solutions well approximate appropriate solutions to three-dimensional elasticity for small values of the thickness parameter h.

There seem to be no results of this type for dynamical theories of elasticity, so we concentrate on what is known for elastostatics. Here one would ideally like results showing that the solution sets for boundary-value problems of three-dimensional elasticity converge as h → 0 to corresponding sets for an appropriate rod or shell theory, together with appropriate error estimates. In passing to the limit h → 0 other parameters such as loads may need to be scaled with h. Taking into account such scaling, one would like the convergence and error estimates to be uniform with respect to parameters such as loads entering the boundary conditions, so that in particular the description of buckling according to the three-dimensional theory could be correlated with that for the rod or shell theory identified. One of the many difficulties to be overcome in order to achieve such results is to understand how boundary layers behave in the limit h → 0. Such boundary layers will occur, for example, at the ends of a rod, where, according to Saint-Venant's principle, one expects the limiting rod theory to see


only resultant loads and moments applied to the ends. Higher-order corrections in h can be expected to yield more sophisticated theories, for example involving numbers of directors (vectors depending on the independent variables giving a better description of the three-dimensional deformation). An isolated theory that addresses some of these difficulties is Mielke's treatment (Mielke [1988, 1990]) of Saint-Venant's problem for an initially straight rod of uniform cross-section and prescribed resultant loads and moments at its two ends, in which via a six-dimensional centre manifold he identifies a Cosserat theory of rods whose solutions attract for long rods those of three-dimensional elasticity having uniformly small strains and the same resultant loads and moments. In this connection, for three-dimensional nonlinear elasticity Ericksen [1977b,a, 1983] has derived equations describing beautiful semi-inverse solutions for helical deformations of a rod. For developments see Muncaster [1979, 1983], and for an existence theory for the corresponding problem defined on cross-sections see Ball [1977]. For plates with a St. Venant–Kirchhoff stored-energy function Monneau [2001] has devised a scheme which shows that for periodic boundary conditions and sufficiently small external forces, there is a solution of the three-dimensional equilibrium equations which converges as h → 0 to the solution of the corresponding Kirchhoff–Love plate theory, together with error estimates.

However, the principal method that has so far produced rigorous results of the desired type is Γ-convergence (see De Giorgi and Franzoni [1979], Dal Maso [1993]). The first application of Γ-convergence to nonlinear elasticity was that of Acerbi, Buttazzo and Percivale [1991], who used it to derive a one-dimensional model for an elastic string.
Then Le Dret and Raoult [1995a,b, 1996, 1998] and Ben Belgacem [1997] used it to derive a corresponding two-dimensional membrane theory (see also Braides, Fonseca and Francfort [2000]). Le Dret and Raoult [2000] have also investigated which Cosserat theories of plates Γ-converge to the membrane theory limit as the thickness goes to zero. Bhattacharya and James [1999] used Γ-convergence to derive equations for thin films of martensitic material, an interesting conclusion being that in the two-dimensional theory there can be exact austenite–martensite interfaces (for developments see Shu [2000]). In interesting recent work Friesecke, James and Müller [2001] have derived a theory of nonlinear bending of plates starting from the nonlinear elastic energy

$$I^h(y) = \int_{\Omega^h} W\big(Dy(x)\big)\, dx,$$

where $\Omega^h$ is given by (4.5). This is a more delicate problem than for the membrane theory since for the boundary conditions for which the bending theory is expected to be valid, $I^h(y^h)$ is expected to be of order $h^3$ for minimizers $y^h$, whereas for boundary conditions leading to the membrane theory we expect that $I^h(y^h)$ is of order h. Hence the limit h → 0 corresponds to a singular perturbation. The corresponding bending theory has energy functional

$$\frac{1}{24} \int_S Q_2(II)\, ds_1\, ds_2,$$

where II denotes the second fundamental form

$$II_{ij} = n_{,i} \cdot y_{,j}, \qquad n = y_{,1} \wedge y_{,2},$$

and where

$$Q_2(A) = \min_{a \in \mathbb{R}^3} Q_3(A + a \otimes e_3 + e_3 \otimes a), \qquad Q_3(A) = D^2 W(1)(A, A).$$

The proof is via a refinement of a rigidity result for SO(3) of John [1961, 1972a] (see also Kohn [1982]). John [1965, 1971] also rigorously obtains equations for shells of isotropic material, assuming that the radius of curvature of the shell is large and the maximum strain is uniformly small, providing interior estimates for the validity of the approximation. For other related work see Pantz [2000, 2001b]. For plates satisfying general boundary conditions one expects some theory incorporating both the membrane and bending cases, but the form this should take is unclear. For work in this direction see Ciarlet [2000] and Ciarlet and Roquefort [2000].
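The relaxation defining $Q_2$ from $Q_3$ can be made concrete for an isotropic material. The sketch below assumes, purely for illustration, the isotropic quadratic form $Q_3(F) = 2\mu |\operatorname{sym} F|^2 + \lambda (\operatorname{tr} F)^2$ with hypothetical Lamé constants `mu` and `lam`; it embeds a 2×2 matrix in the upper-left block of a 3×3 matrix, minimizes over $a \in \mathbb{R}^3$ by solving the linear system for the quadratic in a, and compares with the closed form $2\mu |\operatorname{sym} A|^2 + \frac{2\mu\lambda}{2\mu + \lambda} (\operatorname{tr} A)^2$ obtained by carrying out the same minimization by hand. All function names are hypothetical.

```python
import numpy as np

def Q3(F, mu=1.0, lam=1.5):
    """Assumed isotropic quadratic form 2*mu*|sym F|^2 + lam*(tr F)^2."""
    S = 0.5 * (F + F.T)
    return 2.0 * mu * np.sum(S * S) + lam * np.trace(F) ** 2

def bilinear(F, G, **kw):
    """Symmetric bilinear form associated with Q3, by polarization."""
    return 0.25 * (Q3(F + G, **kw) - Q3(F - G, **kw))

def embed(A2):
    """Place a 2x2 matrix in the upper-left block of a 3x3 matrix."""
    A = np.zeros((3, 3))
    A[:2, :2] = A2
    return A

def shift(a):
    """The matrix a (x) e3 + e3 (x) a."""
    e3 = np.array([0.0, 0.0, 1.0])
    return np.outer(a, e3) + np.outer(e3, a)

def Q2(A2, **kw):
    """Minimize Q3(A + shift(a)) over a in R^3; returns (minimum value, minimizer)."""
    A = embed(A2)
    basis = [shift(e) for e in np.eye(3)]
    H = np.array([[bilinear(Bi, Bj, **kw) for Bj in basis] for Bi in basis])
    g = np.array([bilinear(A, Bi, **kw) for Bi in basis])
    a_star = -np.linalg.solve(H, g)       # stationary point of the quadratic in a
    return Q3(A + shift(a_star), **kw), a_star

A2 = np.array([[1.0, 0.3], [0.3, -0.5]])  # illustrative symmetric 2x2 matrix
value, a_star = Q2(A2)
mu, lam = 1.0, 1.5
closed_form = 2.0 * mu * np.sum(A2 * A2) + 2.0 * mu * lam / (2.0 * mu + lam) * np.trace(A2) ** 2
```

In this isotropic example only the third component of the minimizing a is nonzero, reflecting a thickness strain adjusting to the in-plane trace.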

Acknowledgments: I am indebted to S. S. Antman, J. J. Bevan, K. Bhattacharya, C. M. Dafermos, G. Friesecke, R. D. James, M. Jungen, J. Kristensen, V. J. Mizel, O. Penrose, A. Raoult, J. Sivaloganathan, and A. Taheri for valuable suggestions and comments. The article was completed while I was visiting the Tata Institute for Fundamental Research in Bangalore, to whose members and staﬀ I am grateful for their support and warm hospitality.

References

Acerbi, E., G. Buttazzo and D. Percivale [1991], A variational definition of the strain energy for an elastic string. J. Elasticity, 25:137–148.
Acerbi, E., I. Fonseca and N. Fusco [1997], Regularity results for equilibria in a variational model of fracture. Proc. Royal Soc. Edinburgh, 127A:889–902.
Acerbi, E. and N. Fusco [1984], Semicontinuity problems in the calculus of variations. Arch. Rational Mech. Anal., 86:125–145.
Acerbi, E. and N. Fusco [1988], A regularity theorem for minimizers of quasiconvex integrals. Arch. Rational Mech. Anal., 99:261–281.


Ambrosio, L. [1989], Variational problems in SBV. Acta Appl. Math., 17:1–40.
Ambrosio, L. [1990], Existence theory for a new class of variational problems. Arch. Rational Mech. Anal., 111:291–322.
Ambrosio, L. and A. Braides [1995], Energies in SBV and variational models in fracture. In Homogenization and applications to material sciences (Nice 1995), volume 9 of GAKUTO Internat. Ser. Math. Sci. Appl., pages 1–22, Tokyo. Gakkötosho.
Ambrosio, L., N. Fusco and D. Pallara [1997], Partial regularity of free discontinuity sets II. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 24:39–62.
Ambrosio, L., N. Fusco and D. Pallara [2000], Functions of Bounded Variation and Free Discontinuity Problems. Oxford Mathematical Monographs. Oxford University Press.
Ambrosio, L. and D. Pallara [1997], Partial regularity of free discontinuity sets I. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 24:1–38.
Andrews, G. [1980], On the existence of solutions to the equation $u_{tt} = u_{xxt} + \sigma(u_x)_x$. J. Differential Eqns, 35:200–231.
Antman, S. S. [1976], Ordinary differential equations of nonlinear elasticity. II. Existence and regularity theory for conservative boundary-value problem. Arch. Rational Mech. Anal., 61:353–393.
Antman, S. S. [1983], The influence of elasticity on analysis: Modern developments. Bull. Amer. Math. Soc., 9:267–291.
Antman, S. S. [1995], Nonlinear Problems of Elasticity, volume 107 of Applied Mathematical Sciences. Springer-Verlag, New York.
Antman, S. S. and P. V. Negrón-Marrero [1987], The remarkable nature of radially symmetric equilibrium states of aeolotropic nonlinearly elastic bodies. J. Elasticity, 18:131–164.
Antman, S. S. and J. E. Osborn [1979], The principle of virtual work and integral laws of motion. Arch. Rational Mech. Anal., 69:231–262.
Antman, S. S. and T. Seidman [1996], Quasilinear hyperbolic-parabolic equations of one-dimensional viscoelasticity. J. Differential Eqns, 124:132–185.
Ball, J. M. [1977], Constitutive inequalities and existence theorems in nonlinear elastostatics. In R.J. Knops, editor, Nonlinear Analysis and Mechanics, Heriot-Watt Symposium, Vol. 1. Pitman.
Ball, J. M. [1977a], Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal., 63:337–403.
Ball, J. M. [1980], Strict convexity, strong ellipticity, and regularity in the calculus of variations. Proc. Camb. Phil. Soc., 87:501–513.
Ball, J. M. [1981], Global invertibility of Sobolev functions and the interpenetration of matter. Proc. Royal Soc. Edinburgh, 88A:315–328.
Ball, J. M. [1981a], Remarques sur l’existence et la régularité des solutions d’élastostatique non linéaire. In H. Berestycki and H. Brezis, editors, Recent Contributions to Nonlinear Partial Differential Equations. Pitman.
Ball, J. M. [1982], Discontinuous equilibrium solutions and cavitation in nonlinear elasticity. Phil. Trans. Royal Soc. London A, 306:557–611.


Ball, J. M. [1984], Differentiability properties of symmetric and isotropic functions. Duke Math. J., 51:699–728.
Ball, J. M. [1984a], Minimizers and the Euler-Lagrange equations. In Trends and applications of pure mathematics to mechanics (Palaiseau, 1983), pages 1–4. Springer, Berlin.
Ball, J. M. [1986], Minimizing sequences in thermomechanics. In Proc. Meeting on “Finite Thermoelasticity”, pages 45–54, Roma. Accademia Nazionale dei Lincei.
Ball, J. M. [1989], A version of the fundamental theorem for Young measures. In M. Rascle, D. Serre and M. Slemrod, editors, Proceedings of conference on “Partial differential equations and continuum models of phase transitions,” pages 3–16. Springer Lecture Notes in Physics. No. 359.
Ball, J. M. [1992], Dynamic energy minimization and phase transformations in solids. In Proceedings of ICIAM 91. SIAM.
Ball, J. M. [1996], Nonlinear elasticity and materials science; a survey of some recent developments. In P.J. Aston, editor, Nonlinear Mathematics and Its Applications, pages 93–119. Cambridge University Press.
Ball, J. M. [1996a], Review of Nonlinear Problems of Elasticity, by Stuart S. Antman. Bull. Amer. Math. Soc., 33:269–276.
Ball, J. M. [1998], The calculus of variations and materials science. Quart. Appl. Math., 56:719–740.
Ball, J. M. [2001], Singularities and computation of minimizers for variational problems. In R. DeVore, A. Iserles and E. Suli, editors, Foundations of Computational Mathematics. Cambridge University Press.
Ball, J. M. and C. Carstensen [1999], Compatibility conditions for microstructures and the austenite-martensite transition. Materials Science & Engineering A, 273–275:231–236.
Ball, J. M., C. Chu and R. D. James [1995], Hysteresis during stress-induced variant rearrangement. J. de Physique IV, C8:245–251.
Ball, J. M., C. Chu and R. D. James [2002], Metastability and martensite. In preparation.
Ball, J. M., P. J. Holmes, R. D. James, R. L. Pego and P. J. Swart [1991], On the dynamics of fine structure. J. Nonlinear Sci., 1:17–90.
Ball, J. M. and R. D. James [2003], From Microscales to Macroscales in Materials. Book, in preparation.
Ball, J. M. and R. D. James [2002], Incompatible sets of gradients and metastability. In preparation.
Ball, J. M. and R. D. James [1987], Fine phase mixtures as minimizers of energy. Arch. Rational Mech. Anal., 100:13–52.
Ball, J. M. and R. D. James [1991], A characterization of plane strain. Proc. Roy. Soc. London A, 432:93–99.
Ball, J. M. and R. D. James [1992], Proposed experimental tests of a theory of fine microstructure, and the two-well problem. Phil. Trans. Roy. Soc. London A, 338:389–450.


Ball, J. M. and J. E. Marsden [1984], Quasiconvexity at the boundary, positivity of the second variation, and elastic stability. Arch. Rational Mech. Anal., 86:251–277.
Ball, J. M. and V. J. Mizel [1985], One-dimensional variational problems whose minimizers do not satisfy the Euler-Lagrange equations. Arch. Rational Mech. Anal., 90:325–388.
Ball, J. M. and F. Murat [1984], $W^{1,p}$-quasiconvexity and variational problems for multiple integrals. J. Functional Analysis, 58:225–253.
Bauman, P., N. C. Owen and D. Phillips [1991], Maximal smoothness of solutions to certain Euler-Lagrange equations from nonlinear elasticity. Proc. Royal Soc. Edinburgh, 119A:241–263.
Bauman, P., N. C. Owen and D. Phillips [1991a], Maximum principles and a priori estimates for a class of problems from nonlinear elasticity. Annales de l’Institut Henri Poincaré - Analyse non linéaire, 8:119–157.
Bauman, P., N. C. Owen and D. Phillips [1992], Maximum principles and a priori estimates for an incompressible material in nonlinear elasticity. Comm. in Partial Diff. Eqns, 17:1185–1212.
Bauman, P. and D. Phillips [1994], Univalent minimizers of polyconvex functionals in 2 dimensions. Arch. Rational Mech. Anal., 126:161–181.
Ben Belgacem, H. [1997], Une méthode de Γ-convergence pour un modèle de membrane non linéaire. C. R. Acad. Sci. Paris Sér. I Math., 324:845–849.
Bhattacharya, K. [2001], Microstructure of martensite. A continuum theory with applications to the shape-memory effect. Oxford University Press, (to appear).
Bhattacharya, K. and R. D. James [1999], A theory of thin films of martensitic materials with applications to microactuators. J. Mech. Phys. Solids, 47:531–576.
Bianchini, S. and A. Bressan [2001], A center manifold technique for tracing viscous waves. Preprint.
Blanc, X., C. Le Bris and P.-L. Lions [2001], Convergence de modèles moléculaires vers des modèles de mécanique des milieux continus. C. R. Acad. Sci. Paris Sér. I Math., 332:949–956.
Bourdin, B., G. A. Francfort and J.-J. Marigo [2000], Numerical experiments in revisited brittle fracture. J. Mech. Phys. Solids, 48:797–826.
Braides, A. [1994], Loss of polyconvexity by homogenization. Arch. Rational Mech. Anal., 127:183–190.
Braides, A. [1998], Approximation of Free-Discontinuity Problems, volume 1694 of Lecture Notes in Mathematics. Springer-Verlag, Berlin.
Braides, A. and A. Coscia [1993], A singular perturbation approach to variational problems in fracture mechanics. Math. Models Methods Appl. Sci., 3:303–340.
Braides, A. and A. Coscia [1994], The interaction between bulk energy and surface energy in multiple integrals. Proc. Royal Soc. Edinburgh, 124A:737–756.
Braides, A., I. Fonseca and G. Francfort [2000], 3D-2D asymptotic analysis for inhomogeneous thin films. Indiana Univ. Math. J., 49:1367–1404.


Braides, A. and M. S. Gelli [2001a], Limits of discrete systems with long-range interactions. Preprint.
Braides, A. and M. S. Gelli [2001b], Limits of discrete systems without convexity hypotheses. Preprint.
Braides, A., G. Dal Maso and A. Garroni [1999], Variational formulation for softening phenomena in fracture mechanics: the one-dimensional case. Arch. Rational Mech. Anal., 146:23–58.
Bressan, A. [1988], Contractive metrics for nonlinear hyperbolic systems. Indiana J. Math., 37:409–421.
Bressan, A. [1995], The unique limit of the Glimm scheme. Arch. Rational Mech. Anal., 130:205–230.
Bressan, A. [2000], Hyperbolic Systems of Conservation Laws. Oxford Lecture Series in Mathematics and Its Applications. Oxford University Press.
Bressan, A. and R. M. Colombo [1995], The semigroup generated by 2 × 2 conservation laws. Arch. Rational Mech. Anal., 133:1–75.
Bressan, A., G. Crasta and B. Piccoli [2000], Well-posedness of the Cauchy problem for n × n systems of conservation laws. Mem. Amer. Math. Soc, 146(694).
Bressan, A. and P. G. Le Floch [1997], Uniqueness of weak solutions to hyperbolic systems of conservation laws. Arch. Rational Mech. Anal., 140:301–317.
Bressan, A. and P. Goatin [1999], Oleinik type estimates and uniqueness for n × n conservation laws. J. Differential Eqns, 156:26–49.
Bressan, A. and M. Lewicka [2000], A uniqueness condition for hyperbolic systems of conservation laws. Discrete Contin. Dynam. Systems, 6:673–682.
Bressan, A., T.-P. Liu and T. Yang [1999], $L^1$ stability estimates for n × n conservation laws. Arch. Rational Mech. Anal., 149:1–22.
Buttazzo, G. [1995], Energies on BV and variational models in fracture mechanics. In Curvature flows and related topics (Levico, 1994), volume 5 of GAKUTO Internat. Ser. Math. Sci. Appl., pages 25–36, Tokyo. Gakkötosho.
Buttazzo, G. and M. Belloni [1995], A survey on old and recent results about the gap phenomenon. In Recent Developments in Well-Posed Variational Problems, pages 1–27, edited by R. Lucchetti and J. Revalski, Kluwer Academic Publishers, Dordrecht.
Catto, I., C. Le Bris and P.-L. Lions [1998], The Mathematical Theory of Thermodynamic Limits: Thomas-Fermi Type Models. Oxford University Press.
Cherepanov, G. P., editor [1998], Fracture. Krieger, Malabar, Fl.
Chillingworth, D. R. J., J. E. Marsden and Y. H. Wan [1982], Symmetry and bifurcation in three-dimensional elasticity, I. Arch. Rational Mech. Anal., 80:295–331.
Chillingworth, D. R. J., J. E. Marsden and Y. H. Wan [1983], Symmetry and bifurcation in three-dimensional elasticity, II. Arch. Rational Mech. Anal., 83:363–395.
Chlebík, M. and B. Kirchheim [2001], Rigidity for the four gradient problem, (to appear).


Chu, C. and R. D. James [1993], Biaxial loading experiments on Cu-Al-Ni single crystals. In Experiments in Smart Materials and Structures, pages 61–69. ASME. AMD-Vol. 181.
Chu, C. and R. D. James [1995], Analysis of microstructures in Cu-14.0%Al-3.9%Ni by energy minimization. J. de Physique IV, C8:143–149.
Ciarlet, P. G. [2000], Un modèle bi-dimensionnel non linéaire de coque analogue à celui de W. T. Koiter. C. R. Acad. Sci. Paris Sér. I Math., 331:405–410.
Ciarlet, P. G. [1988], Mathematical Elasticity, Vol. I: Three-Dimensional Elasticity. North-Holland Publishing Co., Amsterdam.
Ciarlet, P. G. [1997], Mathematical Elasticity. Vol. II: Theory of Plates. North-Holland Publishing Co., Amsterdam.
Ciarlet, P. G. [2000], Mathematical Elasticity. Vol. III: Theory of Shells. North-Holland Publishing Co., Amsterdam.
Ciarlet, P. G. and J. Nečas [1985], Unilateral problems in nonlinear three-dimensional elasticity. Arch. Rational Mech. Anal., 87:319–338.
Ciarlet, P. G. and A. Roquefort [2000], Justification d’un modèle bi-dimensionnel non linéaire de coque analogue à celui de W. T. Koiter. C. R. Acad. Sci. Paris Sér. I Math., 331(5):411–416.
Coleman, B. D. and E.H. Dill [1973], On thermodynamics and the stability of motion of materials with memory. Arch. Rational Mech. Anal., 51:1–53.
Coleman, B. D. and W. Noll [1963], The thermodynamics of elastic materials with heat conduction and viscosity. Arch. Rational Mech. Anal., 13:167–178.
Dacorogna, B. [1982], Quasiconvexity and relaxation of non convex variational problems. J. Funct. Anal., 46:102–118.
Dacorogna, B. and P. Marcellini [1999], Implicit Partial Differential Equations. Birkhäuser Boston Inc., Boston, MA.
Dafermos, C. M. [1969], The mixed initial boundary-value problem for the equations of nonlinear one-dimensional viscoelasticity. J. Differential Eqns, 6:71–86.
Dafermos, C. M. [1972], Polygonal approximations of solutions of the initial value problem for a conservation law. J. Math. Anal. Appl., 38:33–41.
Dafermos, C. M. [1996], Entropy and the stability of classical solutions of hyperbolic systems of conservation laws. In Recent Mathematical Methods in Nonlinear Wave Propagation (Montecatini Terme, 1994), volume 1640 of Lecture Notes in Math., pages 48–69, Berlin. Springer.
Dafermos, C. M. [2000], Hyperbolic Conservation Laws in Continuum Physics, volume 325 of Grundlehren der Mathematischen Wissenschaften. Springer.
Dafermos, C. M. and W. J. Hrusa [1985], Energy methods for quasilinear hyperbolic initial boundary-value problems. Arch. Rational Mech. Anal., 87:267–292.
Dal Maso, G. [1993], An Introduction to Γ-convergence. Birkhäuser Boston Inc., Boston, MA.
De Giorgi, E. and T. Franzoni [1979], On a type of variational convergence. In Proceedings of the Brescia Mathematical Seminar, Vol. 3 (Italian), pages 63–101, Milan. Univ. Cattolica Sacro Cuore.


Deam, R. T. and S. F. Edwards [1976], The theory of rubber elasticity. Philos. Trans. Roy. Soc. London Ser. A, 280:317–353.
Demoulini, S. [2000], Weak solutions for a class of nonlinear systems of viscoelasticity. Arch. Rational Mech. Anal., 155:299–334.
Demoulini, S., D. M. A. Stuart and A.E. Tzavaras [2000], Construction of entropy solutions for one-dimensional elastodynamics via time discretisation. Ann. Inst. H. Poincaré Anal. Non Linéaire, 17:711–731.
Demoulini, S., D. M. A. Stuart and A.E. Tzavaras [2001], A variational approximation scheme for three-dimensional elastodynamics with polyconvex energy. Arch. Rational Mech. Anal., 157:325–344.
DiPerna, R. J. [1983], Convergence of approximate solutions of conservation laws. Arch. Rational Mech. Anal., 82:27–70.
DiPerna, R. J. [1985], Compensated compactness and general systems of conservation laws. Trans. A.M.S., 292:283–420.
Duhem, P. [1911], Traité d’Énergétique ou de Thermodynamique Générale. Gauthier-Villars, Paris.
Ebin, D. G. [1993], Global solutions of the equations of elastodynamics of incompressible neo-Hookean materials. Proc. Nat. Acad. Sci. U.S.A., 90:3802–3805.
Ebin, D. G. [1996], Global solutions of the equations of elastodynamics for incompressible materials. Electron. Res. Announc. Amer. Math. Soc., 2:50–59 (electronic).
Ebin, D. G. and R.A. Saxton [1986], The initial value problem for elastodynamics of incompressible bodies. Arch. Rational Mech. Anal., 94:15–38.
Ebin, D. G. and S.R. Simanca [1990], Small deformations of incompressible bodies with free boundary. Comm. Partial Differential Equations, 15:1589–1616.
Ebin, D. G. and S.R. Simanca [1992], Deformations of incompressible bodies with free boundaries. Arch. Rational Mech. Anal., 120:61–97.
Edwards, S. F. and T.A. Vilgis [1988], The tube model theory of rubber elasticity. Rep. Progr. Phys., 51:243–297.
Ericksen, J. L. [1966], Thermoelastic stability. In Proc 5th National Cong. Appl. Mech., pages 187–193.
Ericksen, J. L. [1977b], On the formulation of St.-Venant’s problem. In Nonlinear analysis and mechanics: Heriot-Watt Symposium (Edinburgh, 1976), Vol. I, pages 158–186. Res. Notes in Math., No. 17. Pitman, London.
Ericksen, J. L. [1977b], Special topics in elastostatics. In C.-S. Yih, editor, Advances in Applied Mechanics, volume 17, pages 189–244. Academic Press.
Ericksen, J. L. [1983], Ill-posed problems in thermoelasticity theory. In Proceedings of a NATO/London Mathematical Society advanced study institute held in Oxford, July 25–August 7, 1982, pages 71–93. D. Reidel Publishing Co., Dordrecht.
Euler, L. [1744], Additamentum I de curvis elasticis, methodus inveniendi lineas curvas maximi minimivi proprietate gaudentes. Bousquet, Lausanne. In Opera Omnia I, Vol. 24, 231–297.


Evans, L. C. [1986], Quasiconvexity and partial regularity in the calculus of variations. Arch. Rational Mech. Anal., 95:227–268.
Evans, L. C. and R. F. Gariepy [1987], Some remarks concerning quasiconvexity and strong convergence. Proc. Roy. Soc. Edinburgh, 106A:53–61.
Fefferman, C. [1985], The thermodynamic limit for a crystal. Comm. Math. Phys., 98(3):289–311.
Foccardi, M. and M. S. Gelli [2001], A finite-differences approximation of fracture energies for non-linear elastic materials. Preprint.
Fonseca, I. [1988], The lower quasiconvex envelope of the stored energy function of an elastic crystal. J. Math. Pures Appl., 67:175–195.
Fonseca, I. and W. Gangbo [1995], Local invertibility of Sobolev functions. SIAM J. Math. Anal., 26:280–304.
Foss, M. [2001], On Lavrentiev’s Phenomenon. PhD thesis, Carnegie-Mellon University.
Francfort, G. A. and J.-J. Marigo [1998], Revisiting brittle fracture as an energy minimization problem. J. Mech. Phys. Solids, 46:1319–1342.
Friesecke, G. [2000], personal communication.
Friesecke, G. and G. Dolzmann [1997], Implicit time discretization and global existence for a quasi-linear evolution equation with nonconvex energy. SIAM J. Math. Anal., 28:363–380.
Friesecke, G. and R. D. James [2000], A scheme for the passage from atomic to continuum theory for thin films, nanotubes and nanorods. J. Mech. Phys. Solids, 48:1519–1540.
Friesecke, G., R. D. James and S. Müller [2001], Rigorous derivation of nonlinear plate theory and geometric rigidity. C. R. Acad. Sci. Paris Sér. I Math., (to appear).
Friesecke, G. and J. B. McLeod [1996], Dynamics as a mechanism preventing the formation of finer and finer microstructure. Arch. Rational Mech. Anal., 133:199–247.
Friesecke, G. and J. B. McLeod [1997], Dynamic stability of non-minimizing phase mixtures. Proc. Roy. Soc. London Ser. A, 453:2427–2436.
Friesecke, G. and F. Theil [2001], Validity and failure of the Cauchy–Born hypothesis in a 2D mass-spring lattice. Preprint.
Giaquinta, M., G. Modica and J. Souček [1989], Cartesian currents, weak diffeomorphisms and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal., 106:97–159. Addendum, ibid., 109:385–392, 1990.
Giaquinta, M., G. Modica and J. Souček [1994], A weak approach to finite elasticity. Calc. Var. Partial Differential Equations, 2:65–100.
Giaquinta, M., G. Modica and J. Souček [1998], Cartesian Currents in the Calculus of Variations. Volumes I, II. Springer-Verlag, Berlin. Cartesian currents.
Glimm, J. [1965], Solutions in the large for nonlinear hyperbolic systems of equations. Comm. Pure Appl. Math., 18:697–715.
Green, A. E. and J. E. Adkins [1970], Large Elastic Deformations. Oxford University Press, second edition.


Green, A. E. and W. Zerna [1968], Theoretical Elasticity. Clarendon Press, Oxford, second edition.
Greenberg, J. M., R. C. MacCamy and V. J. Mizel [1967], On the existence, uniqueness, and stability of solutions of the equation $\sigma'(u_x)u_{xx} + \lambda u_{xtx} = \rho_0 u_{tt}$. J. Math. Mech., 17:707–728, 1967/1968.
Gromov, M. [1986], Partial Differential Relations. Springer-Verlag, Berlin.
Gurtin, M. E. [1981], Topics in Finite Elasticity. SIAM, 1981.
Hane, K. [1997], Microstructures in Thermoelastic Martensites. PhD thesis, Department of Aerospace Engineering and Mechanics, University of Minnesota.
Hao, W., S. Leonardi and J. Nečas [1996], An example of irregular solution to a nonlinear Euler-Lagrange elliptic system with real analytic coefficients. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 23:57–67.
Healey, T. J. [2000], Global continuation in displacement problems of nonlinear elastostatics via the Leray-Schauder degree. Arch. Rational Mech. Anal., 152:273–28.
Healey, T. J. and P. Rosakis [1997], Unbounded branches of classical injective solutions to the forced displacement problem in nonlinear elastostatics. J. Elasticity, 49:65–78.
Healey, T. J. and H. Simpson [1998], Global continuation in nonlinear elasticity. Arch. Rational Mech. Anal., 143:1–28.
Hrusa, W. J. and M. Renardy [1988], An existence theorem for the Dirichlet problem in the elastodynamics of incompressible materials. Arch. Rational Mech. Anal., 102:95–117. Corrections ibid 110:373–375, 1990.
Hughes, T. J. R., T. Kato and J. E. Marsden [1977], Well-posed quasilinear hyperbolic systems with applications to nonlinear elastodynamics and general relativity. Arch. Rational Mech. Anal., 63:273–294.
James, R. D. and S. J. Spector [1991], The formation of filamentary voids in solids. J. Mech. Phys. Solids, 39:783–813.
Jiang, S. and R. Racke [2000], Evolution equations in thermoelasticity. Chapman & Hall/CRC, Boca Raton, FL.
John, F. [1961], Rotation and strain. Comm. Pure Appl. Math., 14:391–413.
John, F. [1965], Estimates for the derivatives of the stresses in a thin shell and interior shell equations. Comm. Pure Appl. Math., 18:235–267.
John, F. [1971], Refined interior equations for thin elastic shells. Comm. Pure Appl. Math., 24:583–615.
John, F. [1972a], Bounds for deformations in terms of average strains. In Inequalities, III (Proc. Third Sympos., Univ. California, Los Angeles, Calif., 1969; dedicated to the memory of Theodore S. Motzkin), pages 129–144. Academic Press, New York.
John, F. [1972b], Uniqueness of non-linear elastic equilibrium for prescribed boundary displacements and sufficiently small strains. Comm. Pure Appl. Math., 25:617–634.
John, F. [1988], Almost global existence of elastic waves of finite amplitude arising from small initial disturbances. Comm. Pure Appl. Math., 41:615–666.


Kato, T. [1985], Abstract Differential Equations and Nonlinear Mixed Problems. Lezioni Fermi. Scuola Normale Superiore, Pisa; Accademia Nazionale dei Lincei, Rome.
Kinderlehrer, D. and P. Pedregal [1991], Characterizations of Young measures generated by gradients. Arch. Rational Mech. Anal., 115:329–365.
Kinderlehrer, D. and P. Pedregal [1994], Gradient Young measures generated by sequences in Sobolev spaces. J. Geom. Anal., 4:59–90.
Kirchheim, B. [2001], Deformations with finitely many gradients and stability of quasiconvex hulls. C. R. Acad. Sci. Paris Sér. I Math., 332:289–294.
Knops, R. J. and C.A. Stuart [1984], Quasiconvexity and uniqueness of equilibrium solutions in nonlinear elasticity. Arch. Rational Mech. Anal., 86:233–249.
Knops, R. J. and E. W. Wilkes [1973], Theory of elastic stability. In S. Flugge, editor, Encyclopedia of Physics, volume VIa/1-4. Springer-Verlag, Berlin.
Kohn, R. V. [1982], New integral estimates for deformations in terms of their nonlinear strains. Arch. Rational Mech. Anal., 78:131–172.
Koiter, W. T. [1976], A basic open problem in the theory of elastic stability. In Applications of Methods of Functional Analysis to Problems in Mechanics (Joint Sympos., IUTAM/IMU, Marseille, 1975), pages 366–373. Lecture Notes in Math., 503. Springer, Berlin.
Kristensen, J. [1994], Lower Semicontinuity of Variational Integrals. PhD thesis, Technical University of Lyngby.
Kristensen, J. [1999], On the non-locality of quasiconvexity. Ann. Inst. H. Poincaré, Anal. Non Linéaire, 16:1–13.
Kristensen, J. and A. Taheri [2001], Partial regularity of strong local minimisers. Preprint.
Lazzeri, A. and C. B. Bucknall [1995], Applications of a dilatational yielding model to rubber-toughened polymers. Polymer, 36:2895–2902.
Le Dret, H. [1990], Sur les fonctions de matrices convexes et isotropes. C. R. Acad. Sci. Paris Sér. I Math., 310:617–620.
Le Dret, H. and A. Raoult [1995a], From three-dimensional elasticity to nonlinear membranes. In Asymptotic methods for elastic structures (Lisbon, 1993), pages 89–102. de Gruyter, Berlin.
Le Dret, H. and A. Raoult [1995b], The nonlinear membrane model as variational limit of nonlinear three-dimensional elasticity. J. Math. Pures Appl., 74:549–578.
Le Dret, H. and A. Raoult [1996], The membrane shell model in nonlinear elasticity: a variational asymptotic derivation. J. Nonlinear Sci., 6:59–84.
Le Dret, H. and A. Raoult [1998], From three-dimensional elasticity to the nonlinear membrane model. In Nonlinear partial differential equations and their applications. Collège de France Seminar, Vol. XIII (Paris, 1994/1996), pages 192–206. Longman, Harlow.
Le Dret, H. and A. Raoult [2000], Variational convergence for nonlinear shell models with directors and related semicontinuity and relaxation results. Arch. Ration. Mech. Anal., 154:101–134.


Lieb, E. H. and B. Simon [1977], The Thomas-Fermi theory of atoms, molecules and solids. Adv. Math., 23:22–116.
Lin, P. [1990], Maximization of the entropy for an elastic body free of surface traction. Arch. Rational Mech. Anal., 112:161–191.
Liu, T.-P. [1977], Initial boundary-value problems in gas dynamics. Arch. Rational Mech. Anal., 64:137–168.
Liu, T.-P. [1981], Admissible solutions of hyperbolic conservation laws. Memoirs AMS, 30(240).
Liu, T.-P. and T. Yang [1999a], $L^1$ stability for 2 × 2 systems of hyperbolic conservation laws. J. Amer. Math. Soc., 12:729–774.
Liu, T.-P. and T. Yang [1999b], $L^1$ stability of conservation laws with coinciding Hugoniot and characteristic curves. Indiana Univ. Math. J, 48:237–247.
Liu, T.-P. and T. Yang [1999c], Well-posedness theory for hyperbolic conservation laws. Comm. Pure Appl. Math, 52:1553–1586.
Love, A. E. H. [1927], A Treatise on the Mathematical Theory of Elasticity. Cambridge University Press, fourth edition (revised and enlarged); Reprinted by Dover, New York, 1944.
Luskin, M. [1996], On the computation of crystalline microstructure. Acta Numerica, 5:191–258.
Marsden, J. E. and T.J.R. Hughes [1983], Mathematical Foundations of Elasticity. Prentice-Hall.
Meisters, G. H. and C. Olech [1963], Locally one-to-one mappings and a classical theorem on Schlicht functions. Duke Math. J., 30:63–80.
Mielke, A. [1988], Saint-Venant’s problem and semi-inverse solutions in nonlinear elasticity. Arch. Rational Mech. Anal., 102:205–229. Corrigendum ibid. 110:351–352, 1990.
Mielke, A. [1990], Normal hyperbolicity of center manifolds and Saint-Venant’s principle. Arch. Rational Mech. Anal., 110:353–372.
Mizel, V. J., M. Foss and W. J. Hrusa [2002], The Lavrentiev gap phenomenon in nonlinear elasticity, (to appear).
Monneau, R. [2001], Justification de la théorie non linéaire de Kirchhoff-Love, comme application d’une nouvelle méthode d’inversion singulière. C. R. Acad. Sci. Paris Sér. I Math., (to appear).
Morrey, C. B. [1952], Quasi-convexity and the lower semicontinuity of multiple integrals. Pacific J. Math., 2:25–53.
Müller, S. [1988], Weak continuity of determinants and nonlinear elasticity. C. R. Acad. Sci. Paris Sér. I Math., 307:501–506.
Müller, S. [1999], Variational methods for microstructure and phase transitions. In Calculus of variations and geometric evolution problems, volume 1713 of Lecture Notes in Math., pages 85–210. Springer, Berlin.
Müller, S., T. Qi and B.S. Yan [1994], On a new class of elastic deformations not allowing for cavitation. Ann. Inst. Henri Poincaré, Analyse Nonlinéaire, 11:217–243.


Müller, S. and S. J. Spector [1995], An existence theory for nonlinear elasticity that allows for cavitation. Arch. Rational Mech. Anal., 131:1–66.
Müller, S. and V. Šverák [1996], Attainment results for the two-well problem by convex integration. In J. Jost, editor, Geometric analysis and the calculus of variations, pages 239–251. International Press.
Müller, S. and V. Šverák [2001], Convex integration for Lipschitz mappings and counterexamples to regularity. Annals of Math., (to appear).
Müller, S. and M. A. Sychev [2001], Optimal existence theorems for nonhomogeneous differential inclusions. J. Funct. Anal., 181:447–475.
Muncaster, R. G. [1979], Saint-Venant’s problem in nonlinear elasticity: a study of cross sections. In Nonlinear analysis and mechanics: Heriot-Watt Symposium, Vol. IV, pages 17–75. Pitman, Boston, Mass.
Muncaster, R. G. [1983], Saint-Venant’s problem for slender prisms. Utilitas Math., 23:75–101, 1983.
Nečas, J. [1977], Example of an irregular solution to a nonlinear elliptic system with analytic coefficients and conditions for regularity. In Theory of Nonlinear Operators, pages 197–206, Berlin. Akademie-Verlag.
Ogden, R. W. [1972a], Large deformation isotropic elasticity - on the correlation of theory and experiment for incompressible rubberlike solids. Proc. Roy. Soc. London A, 326:562–584.
Ogden, R. W. [1972b], Large deformation isotropic elasticity: on the correlation of theory and experiment for compressible rubberlike solids. Proc. Roy. Soc. London A, 328:567–583.
Ogden, R. W. [1984], Nonlinear Elastic Deformations. Ellis Horwood.
Pantz, O. [2000], Dérivation des modèles de plaques membranaires non linéaires à partir de l’élasticité tri-dimensionnelle. C. R. Acad. Sci. Paris Sér. I Math., 331:171–174.
Pantz, O. [2001a], Quelques Problèmes de Modélisation en Élasticité Nonlinéaire. PhD thesis, Université Paris 6.
Pantz, O. [2001b], Une justification partielle du modèle de plaque en flexion par Γ-convergence. C. R. Acad. Sci. Paris Sér. I Math., 332:587–592.
Pedregal, P. [1991], Parametrized Measures and Variational Principles, volume 30 of Progress in nonlinear differential equations and their applications. Birkhäuser, Basel.
Pedregal, P. [1994], Jensen’s inequality in the calculus of variations. Differential Integral Equations, 7:57–72.
Pedregal, P. [2000], Variational Methods in Nonlinear Elasticity. SIAM, Philadelphia.
Pego, R. L. [1987], Phase transitions in one-dimensional nonlinear viscoelasticity: admissibility and stability. Arch. Rational Mech. Anal., 97:353–394.
Penrose, O. [2001], Statistical mechanics of nonlinear elasticity. Markov Processes and Related Fields, (to appear).
Pericak-Spector, K. A. and S. J. Spector [1997], Dynamic cavitation with shocks in nonlinear elasticity. Proc. Roy. Soc. Edinburgh, 127A:837–857.


Phillips, D. [2001], On one-homogeneous solutions to elliptic systems in two dimensions. C. R. Acad. Sci. Paris Sér. I Math., (to appear).
Phillips, R. [2001], Crystals, defects and microstructures. Cambridge University Press.
Polignone, D. A. and C. O. Horgan [1993a], Cavitation for incompressible anisotropic non-linearly elastic spheres. J. Elasticity, 33:27–65.
Polignone, D. A. and C. O. Horgan [1993b], Effects of material anisotropy and inhomogeneity on cavitation for composite incompressible anisotropic nonlinearly elastic spheres. Internat. J. Solids Structures, 30:3381–3416.
Post, K. D. E. and J. Sivaloganathan [1997], On homotopy conditions and the existence of multiple equilibria in finite elasticity. Proc. Royal Soc. Edinburgh, 127A:595–614.
Potier-Ferry, M. [1981], The linearization principle for the stability of solutions of quasilinear parabolic equations. I. Arch. Rational Mech. Anal., 77:301–320.
Potier-Ferry, M. [1982], On the mathematical foundations of elastic stability theory. I. Arch. Rational Mech. Anal., 78:55–72.
Qi, Tang [1988], Almost-everywhere injectivity in nonlinear elasticity. Proc. Royal Soc. Edinburgh, 109A:79–95.
Qin, T. [1998], Symmetrizing the nonlinear elastodynamic system. J. Elasticity, 50:245–252.
Racke, R. and S. Zheng [1997], Global existence and asymptotic behavior in nonlinear thermoviscoelasticity. J. Differential Equations, 134:46–67.
Radin, C. [1987], Low temperature and the origin of crystalline symmetry. Internat. J. Modern Phys. B, 1:1157–1191.
Rybka, P. [1992], Dynamical modelling of phase transitions by means of viscoelasticity in many dimensions. Proc. Royal Soc. Edinburgh, 121A:101–138.
Serre, D. [2000], Systèmes de Lois de Conservation, Vols I,II. Diderot, Paris, 1996. English translation: Systems of Conservation Laws, Vols I,II, Cambridge Univ. Press, Cambridge.
Shu, Y. C. [2000], Heterogeneous thin films of martensitic materials. Arch. Ration. Mech. Anal., 153:39–90.
Sivaloganathan, J. [1986], Uniqueness of regular and singular equilibria for spherically symmetric problems of nonlinear elasticity. Arch. Rational Mech. Anal., 96:97–136.
Sivaloganathan, J. [1989], The generalised Hamilton-Jacobi inequality and the stability of equilibria in nonlinear elasticity. Arch. Rational Mech. Anal., 107:347–369.
Sivaloganathan, J. [1995], On the stability of cavitating equilibria. Quart. Appl. Math., 53:301–313.
Sivaloganathan, J. [1999], On cavitation and degenerate cavitation under internal hydrostatic pressure. Proc. R. Soc. Lond. Ser. A, 455:3645–3664.
Sivaloganathan, J. and S. J. Spector [2000a], On the optimal location of singularities arising in variational problems of nonlinear elasticity. J. Elasticity, 58:191–224.

58

John M. Ball

Sivaloganathan, J. and S. J. Spector [2000b], On the existence of minimizers with prescribed singular points in nonlinear elasticity. J. Elasticity, 59:83–113. In recognition of the sixtieth birthday of Roger L. Fosdick (Blacksburg, VA, 1999). Sivaloganathan, J. and S. J. Spector [2001], A construction of inﬁnitely many singular weak solutions to the equations of nonlinear elasticity. Preprint. Sobolevskii, P. E. [1966], Equations of parabolic type in Banach space. Amer. Math. Soc. Transl., 49:1–62. Stoppelli, F. [1954], Un teorema di esistenza e di unicita relativo alle equazioni dell’elastostatica isoterma per deformazioni ﬁnite. Recherche Mat., 3:247–267. Stoppelli, F. [1955], Sulla svilluppibilita in serie de potenze di un parametro delle soluzioni delle equazioni dell’elastostatica isoterma. Recherche Mat., 4:58–73, 1955. Stringfellow, R. and R. Abeyaratne [1989], Cavitation in an elastomer - comparison of theory with experiment. Materials Science and Engineering A Structural Materials Properties, Microstructure and Processing, 112:127–131. Stuart, C. A. [1985], Radially symmetric cavitation for hyperelastic materials. Ann. Inst. H. Poincar´ e. Anal. Non. Lin´ eaire, 2:33–66. Stuart, C. A. [1993], Estimating the critical radius for radially symmetric cavitation. Quart. Appl. Math., 51:251–263. ˇ Sver´ ak, V. [1988], Regularity properties of deformations with ﬁnite energy. Arch. Rational Mech. Anal., 100:105–127. ˇ ak, V. [1991], Quasiconvex functions with subquadratic growth. Proc. Roy. Sver´ Soc. Lond. A, 433:723–732. ˇ Sver´ ak, V. [1992], Rank-one convexity does not imply quasiconvexity. Proc. Royal Soc. Edinburgh, 120A:185–189. ˇ ak, V. [1995], Lower-semicontinuity of variational integrals and compensated Sver´ compactness. In Proc. International Congress of Mathematicians, Zurich 1994, Basel. Birkha¨ user. ˇ ak, V. and X. Yan [2000], A singular minimizer of a smooth strongly conSver´ vex functional in three dimensions. Calc. Var. 
Partial Diﬀerential Equations, 10:213–221. Sychev, M. A. [1999], A new approach to Young measure theory, relaxation and convergence in energy. Ann. Inst. H. Poincar´ e Anal. Non Lin´eaire, 16:773–812. Sychev, M. A. [2001], Few remarks on diﬀerential inclusions. Preprint. Sylvester, J. [1985], On the diﬀerentiability of O(n) invariant functions of symmetric matrices. Duke Math. J., 52:475–483. Tadmor, E. B. , M. Ortiz and R. Phillips [1996], Quasicontinuum analysis of defects in solids. Phil. Mag. A, 73:1529–1563. Taheri, A. [2001a], On Artin’s braid group and polyconvexity in the calculus of variations. Preprint. Taheri, A. [2001b], Quasiconvexity and uniqueness of stationary points in the multi-dimensional calculus of variations. Preprint.

1. Some Open Problems in Elasticity

59

Tartar, L. [1979], Compensated compactness and applications to partial diﬀerential equations. In R.J. Knops, editor, Nonlinear Analysis and Mechanics; Heriot-Watt Symposium, Vol. IV, pages 136–192. Pitman Research Notes in Mathematics. Tartar, L. [1982], The compensated compactness method applied to systems of conservation laws. In Systems of Nonlinear Partial Diﬀerential Equations, J. M. Ball, editor, pages 263–285. NATO ASI Series, Vol. C111, Reidel. Tartar, L. [1993], Some remarks on separately convex functions. In Proceedings of conference on Microstructures and phase transitions, IMA, Minneapolis, 1990. Tonelli, L. [1921], Fondamenti di Calcolo delle Variazioni, Volumes I, II. Zanichelli, 1921–23. Truesdell, C. and W. Noll [1965], The non-linear ﬁeld theories of mechanics. In S. Fl¨ ugge, editor, Handbuch der Physik, Berlin. Springer. Vol. III/3. Valent, T. [1988], Boundary Value Problems of Finite Elasticity, volume 31 of Springer Tracts in Natural Philosophy. Springer-Verlag. Vodop’yanov, S. K., V. M. Gol’dshtein and Yu. G. Reshetnyak [1979], The geometric properties of functions with generalized ﬁrst derivatives. Russian Math. Surveys, 34:19–74. ˇ Silhav´ y, M. [1997], The Mechanics and Thermodynamics of Continuous Media. Springer. ˇ Silhav´ y, M. [2000], Diﬀerentiability properties of rotationally invariant functions. J. Elasticity, 58:225–232. Wan, Y. H. and J. E. Marsden [1983], Symmetry and bifurcation in threedimensional elasticity, Part III: Stressed reference conﬁgurations. Arch. Rational Mech. Anal., 84:203–233. Weiner, J. H. [1983], Statistical Mechanics of Elasticity. Wiley, New York. Weinstein, A. [1985], A global invertibility theorem for manifolds with boundary. Proc. Royal Soc. Edinburgh, 99:283–284. Xu, C.-Y. [2000], Asymptotic Stability of Equilibria for Nonlinear Semiﬂows with Applications to Rotating Viscoelastic Rods. PhD thesis, Department of Mathematics, University of California, Berkeley, 2000. Xu, C.-Y. and J. E. 
Marsden [1996], Asymptotic stability for equilibria of nonlinear semiﬂows with applications to rotating viscoelastic rods. I. Topol. Methods Nonlinear Anal., 7:271–297. Young, L. C. [1969], Lectures on the Calculus of Variations and Optimal Control Theory. Saunders, 1969. Reprinted by A.M.S. Chelsea. Zhang, K. [1991], Energy minimizers in nonlinear elastostatics and the implicit function theorem. Arch. Rational Mech. Anal., 114:95–117. Zhang, K. [2001], A two-well structure and intrinsic mountain pass points. Calc. Var. Partial Diﬀerential Equations, 13:231–264.

2 Finite Elastoplasticity, Lie Groups and Geodesics on SL(d)

Alexander Mielke

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT The notion of nonlinear plasticity with finite deformations is interpreted in the sense of Lie groups. In particular, the plastic tensor $P = F_p^{-1}$ is considered as an element of the Lie group SL(d). Moreover, the plastic dissipation defines a left-invariant Finsler metric on the tangent bundle of this Lie group. In the case of single-crystal plasticity this metric is given in terms of the different slip systems and is piecewise affine on each tangent space. For von Mises plasticity the metric is a left-invariant Riemannian metric. A main goal is to study the associated distance metric and the geodesic curves.

Contents
1 Introduction
2 Formulations of Elastoplasticity
3 Mathematical Formulation Using Lie Groups
4 Dissipation Functionals
  4.1 von Mises Plasticity
  4.2 Single-Crystal Plasticity
5 The Metric and Geodesics on G
6 Geodesics for von Mises Dissipation
A Set-Valued Calculus of Variations
References

1 Introduction

We give a formulation of perfect plasticity in the context of finite strain. This formulation is based on two functionals, namely the stored-energy density $\hat\psi$ (also called the Helmholtz free energy) and the dissipation functional $\hat\Delta$. The deformation of a body $\Omega \subset \mathbb{R}^d$ is a mapping $\varphi : \Omega \to \mathbb{R}^d$, and the deformation gradient $D\varphi \in \mathbb{R}^{d\times d}$ is denoted by $F$. In perfect finite elastoplasticity the function $\hat\psi$ depends on the deformation gradient $F$ and the plastic variable $P = F_p^{-1}$ solely via the elastic part $F_e$ of the deformation gradient, i.e. $F_e = FP = F F_p^{-1}$. We write $\psi = \hat\psi(F, P) = \Psi(FP)$. This multiplicative split was introduced in Lee [1969] and its use is nowadays well established, see e.g. Simó and Ortiz [1985]; Simó [1988]; Miehe and Stein [1992]; Hackl [1997]; Gurtin [2000].

The main point in this paper is to work out the Lie group structure which is central to finite plasticity. For the usage of Lie group theory in mechanics we refer to Abraham and Marsden [1978]; Arnol'd [1989]; Marsden and Hughes [1983]. In particular, we are led to the study of left-invariant mechanical systems on Lie groups, which play a fundamental rôle in many areas in mechanics, such as in rigid-body motion (Krishnaprasad and Marsden [1987]), in the deformations of thin rods (Mielke and Holmes [1988]; Mielke [1991]) and in flow problems for inviscid fluids or plasmas (Ebin and Marsden [1970]; Marsden and Weinstein [1983]; Holm, Marsden, Ratiu, and Weinstein [1985]). Since minimization of the dissipation is a fundamental concept in plasticity we encounter geodesic curves with respect to the left-invariant dissipation metric.

In the present work all Lie groups that appear are matrix groups, so the paper can be read without previous knowledge of Lie group theory. The deformation gradient $F$ is considered as an element of the general linear group $GL(d) = GL_+(d, \mathbb{R}) = \{ F \in \mathbb{R}^{d\times d} : \det F > 0 \}$, and the plastic variable $P$ is considered as an element of a Lie subgroup $\mathcal{G} \subset GL(d)$; in many cases $\mathcal{G}$ is the special linear group $SL(d) = \{ P \in GL(d) : \det P = 1 \}$. The dissipation functional $\hat\Delta$ is a mapping from $T\mathcal{G}$ into $[0, \infty)$ such that

$$\hat\Delta(P, \dot P) = \Delta(P^{-1}\dot P) \quad\text{with}\quad \Delta(\alpha\xi) = \alpha\,\Delta(\xi)$$

for all $(P, \dot P) \in T\mathcal{G}$ and $\alpha \ge 0$. The first condition expresses the fact that previous plastic deformations do not influence the dissipation (i.e. for each $G \in \mathcal{G}$ the dissipation of the process $t \mapsto G P(t)$ is the same). The second condition gives rate independence of the model. The function $\Delta$ is defined on the Lie algebra $\mathfrak{g} = T_I\mathcal{G}$ and is assumed to be convex and coercive, i.e., $r_1 \|\xi\| \le \Delta(\xi) \le r_2 \|\xi\|$.

The two functionals $\hat\psi$ and $\hat\Delta$ (or equivalently $\Psi$ and $\Delta$) together with the principle of maximal dissipation (cf. Simó and Ortiz [1985]; Ziegler and Wehrli [1987]; Simó [1988]) define the full equations of elastoplasticity. We give here an alternate form using the subdifferential $\partial\Delta(\xi)$:

$$\begin{aligned}
\Sigma &= \frac{\partial}{\partial F}\hat\psi(F, P) = D\Psi(FP)\,P^T, \\
0 &\in \partial\Delta(P^{-1}\dot P) + P^T \frac{\partial}{\partial P}\hat\psi(F, P) = \partial\Delta(P^{-1}\dot P) + (FP)^T D\Psi(FP).
\end{aligned} \tag{1.1}$$

The second equation is posed on $\mathfrak{g}$. Together with the relation $F = D\varphi$, the elastic equilibrium condition $-\operatorname{div}\Sigma = f_{\mathrm{ext}}$ and suitable boundary conditions this defines the classical plastic flow rule. The equivalence between these equations and the classical mechanical flow rules involving the yield surface is given in Section 2.

As in Carstensen, Hackl, and Mielke [2001] we propose an integrated version of (1.1) which implies (1.1) but not vice versa. To this end we need to define a non-symmetric distance metric on $\mathcal{G}$ as follows:

$$\hat D(P_0, P_1) = \inf\Big\{ I_\Delta\big(P(\cdot)\big) : P \in C^1([0,1], \mathcal{G}),\ P(0) = P_0,\ P(1) = P_1 \Big\}
\quad\text{with}\quad
I_\Delta\big(P(\cdot)\big) = \int_0^1 \Delta\big(P(t)^{-1}\dot P(t)\big)\,dt. \tag{1.2}$$
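The left invariance built into the length functional in (1.2) can be checked numerically on a discretized curve. The following is a minimal Python sketch under stated assumptions, not part of the original text: the sample curve in SL(2), the fixed group element `G`, the weights `gamma2` and `gamma3`, and all function names are hypothetical, and the dissipation used is the von Mises-type form that appears later in this introduction.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)] for i in range(2)]

def inverse2(a):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def delta(xi, gamma2=1.0, gamma3=0.5):
    """Illustrative dissipation sqrt(gamma2*|xi_sym|^2 + gamma3*|xi_anti|^2)."""
    s = 0.5 * (xi[0][1] + xi[1][0])   # off-diagonal entry of the symmetric part
    a = 0.5 * (xi[0][1] - xi[1][0])   # entry of the antisymmetric part
    sym_sq = xi[0][0] ** 2 + xi[1][1] ** 2 + 2.0 * s * s
    anti_sq = 2.0 * a * a
    return math.sqrt(gamma2 * sym_sq + gamma3 * anti_sq)

def path_length(points):
    """Sum of Delta(P_k^{-1}(P_{k+1} - P_k)); the step size cancels by 1-homogeneity."""
    total = 0.0
    for p, q in zip(points, points[1:]):
        increment = [[q[i][j] - p[i][j] for j in range(2)] for i in range(2)]
        total += delta(matmul2(inverse2(p), increment))
    return total

def sample_curve(n):
    """Hypothetical curve in SL(2): P(t) = [[e^t, t], [0, e^-t]] for t in [0, 1]."""
    return [[[math.exp(k / n), k / n], [0.0, math.exp(-k / n)]] for k in range(n + 1)]

G = [[2.0, 1.0], [1.0, 1.0]]          # fixed element of SL(2), determinant 1
curve = sample_curve(200)
length = path_length(curve)
left_length = path_length([matmul2(G, p) for p in curve])    # curve t -> G P(t)
right_length = path_length([matmul2(p, G) for p in curve])   # curve t -> P(t) G
```

Since the discrete increments satisfy $(GP_k)^{-1}(GP_{k+1} - GP_k) = P_k^{-1}(P_{k+1} - P_k)$, `left_length` agrees with `length` up to rounding, whereas `right_length` generally differs because this `G` is not a rotation.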

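The distance metric $\hat D$ is a so-called Finsler–Minkowski metric generated by the infinitesimal metric $\hat\Delta : T\mathcal{G} \to [0, \infty)$; $(G, \dot G) \mapsto \Delta(G^{-1}\dot G)$. The metric is invariant with respect to left translation. In the special situation where $\Delta(\xi) = \langle B\xi, \xi\rangle^{1/2}$ with $B = B^* > 0$ we obtain a Riemannian metric.

By integrating $\hat\psi$ and $\hat\Delta$ over the body $\Omega \subset \mathbb{R}^d$ we define the nonlinear functionals

$$\begin{aligned}
E(t, \varphi, P) &= \int_\Omega \Big[ \Psi(D\varphi\, P) - f_{\mathrm{ext}}(t, \cdot)\cdot\varphi \Big]\,dx - \int_{\Gamma_{\mathrm{tract}}} t_{\mathrm{ext}}(t, \cdot)\cdot\varphi\,da, \\
\hat D_\Omega(P_0, P_1) &= \int_\Omega \hat D\big(P_0(x), P_1(x)\big)\,dx.
\end{aligned}$$

Here, $E$ is the elastic and potential energy stored in the body under given $\varphi : \Omega \to \mathbb{R}^d$ and $P : \Omega \to \mathcal{G}$. The functional $\hat D_\Omega$ gives the minimal dissipation arising from a change of the plastic variable from $P_0 : \Omega \to \mathcal{G}$ to the new state $P_1 : \Omega \to \mathcal{G}$. The global non-differentiated form of the plastic flow problem now takes the form

$$\begin{aligned}
&\text{global stability:} && E\big(t, \varphi(t), P(t)\big) \le E(t, \varphi^*, P^*) + \hat D_\Omega\big(P(t), P^*\big) \\
& && \text{for all } (\varphi^*, P^*) \text{ with } \varphi^*|_{\Gamma_{\mathrm{displ}}} = \varphi_0(t, \cdot); \\
&\text{energy balance:} && E\big(t, \varphi(t), P(t)\big) + \int_s^t \int_\Omega \Delta\big(P(\tau, x)^{-1}\dot P(\tau, x)\big)\,dx\,d\tau \\
& && \quad = E\big(s, \varphi(s), P(s)\big) + \int_s^t \frac{\partial}{\partial\tau} E\big(\tau, \varphi(\tau), P(\tau)\big)\,d\tau.
\end{aligned} \tag{1.3}$$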


A similar energetic formulation for phase transformation problems was given in Mielke and Theil [1999]; Mielke, Theil, and Levitas [2002]; Govindjee, Mielke, and Hall [2001]. Such a formulation has the major advantage that it immediately suggests an incremental formulation which has a variational form. If $0 = t_0 < t_1 < \dots < t_N = T$ is a time discretization and $P_0 : \Omega \to \mathcal{G}$ the initial value, then the time-stepping algorithm reads as follows.

Incremental Problem. Given $(\varphi_k, P_k)$ at time $t_k$, find $(\varphi_{k+1}, P_{k+1})$ such that it minimizes the functional

$$T_{k+1} : (\varphi, P) \mapsto E(t_{k+1}, \varphi, P) + \hat D_\Omega(P_k, P).$$

In Lemma 2.1 we show that all solutions of this incremental problem satisfy global stability as well as a discretized version of the energy balance in (1.3). Incremental formulations for rate-independent processes which solely depend on minimization are well known in abstract sweeping processes (cf. Monteiro Marques [1993]) and in linearized elastoplasticity, cf. e.g. Han and Reddy [1995]. The usage of variational formulations in finite elastoplasticity is only recent, see Ortiz and Repetto [1999]; Carstensen, Hackl, and Mielke [2001]; Miehe and Schotte [2000]; Miehe [2000]. The usage of left-invariant dissipation metrics in this context is new. In the present work we only deal with the case of ideal plasticity without hardening; in subsequent work (Mielke [2002]) the general case with hardening will be treated, where also more general associative plastic flow rules are considered.

The above theory generates a need for a characterization of the Finsler–Minkowski distance metric $\hat D : \mathcal{G} \times \mathcal{G} \to [0, \infty)$ as defined in (1.2). By left invariance (i.e. $\hat\Delta(P, \dot P) = \Delta(P^{-1}\dot P)$) we have

$$\hat D(P_0, P_1) = D(P_0^{-1} P_1) \quad\text{with}\quad D(P) := \hat D(I, P).$$

Recalling $\Delta(\xi) \le r_2\|\xi\|$ we immediately obtain the upper bound $D(P) \le r_2 \|\log P\|$ whenever a logarithm $\xi \in \mathfrak{g}$ with $P = e^\xi$ exists: along the curve $t \mapsto e^{t\xi}$ we have $P^{-1}\dot P = \xi$ and hence $I_\Delta = \Delta(\xi)$. This already indicates that the Lie algebraic exponential map does not provide the geodesic curves, i.e., the curves $t \mapsto P(t) = e^{t\xi}$ are not minimizers in (1.2) in general. In Section 6 we provide a geodesic exponential map which is surjective, see also $M$ in (1.5) below.

Sections 4 to 6 are concerned only with the study of the metric $\hat D$ and its shortest paths. In Theorem 5.1 we show that the infimum in (1.2) is always attained, i.e., there is always a curve $P \in C^{\mathrm{Lip}}([0,1], \mathcal{G})$ with $P(0) = P_0$ and $P(1) = P_1$ such that $\hat D(P_0, P_1) = I_\Delta\big(P(\cdot)\big)$. As usual, a geodesic curve $P : (t_0, t_1) \to \mathcal{G}$ is a curve which is locally a shortest path. The difficulty for Finsler metrics arises from two facts: first, $\Delta : \mathfrak{g}\setminus\{0\} \to [0, \infty)$ may be nondifferentiable and, secondly, the subdifferential $\mathcal{Q} = \partial\Delta(0) \subset \mathfrak{g}^*$ at 0 may fail to be strictly convex, where

$$\mathcal{Q} = \partial\Delta(0) = \big\{ \eta \in \mathfrak{g}^* : \langle \eta, \xi\rangle \le \Delta(\xi) \text{ for all } \xi \big\}.$$


In this general situation we do not have a full characterization of shortest paths or geodesic curves. However, we derive a nonsmooth counterpart to Noether's theorem. For instance, from the left invariance of $\hat\Delta$ it follows that for each shortest path $P$ there exists an $\eta \in \mathfrak{g}^*$ such that

$$P^T \eta P^{-T} \in \partial\Delta(P^{-1}\dot P) \tag{1.4}$$

holds almost everywhere along the curve. We continue to use the notation $(\cdot)^T$ for the transposed matrix, where throughout the Euclidean scalar product in $\mathbb{R}^d$ is used. See Theorem 5.2 for the exact statement in Lie group notation. Examples of nonsmooth shortest paths and geodesics are given there as well.

In Section 6 we study von Mises plasticity where $\Delta$ generates a left-invariant Riemannian metric on $\mathcal{G} = SL(d)$ which is additionally right invariant with respect to SO(d). The latter invariance corresponds to isotropy of the elastoplastic model. Starting from

$$\Delta(\xi) = \big( \gamma_2 \|\xi_{\mathrm{sym}}\|^2 + \gamma_3 \|\xi_{\mathrm{anti}}\|^2 \big)^{1/2}
\quad\text{where}\quad
\xi_{\mathrm{sym}} = \tfrac12(\xi + \xi^T), \quad \xi_{\mathrm{anti}} = \tfrac12(\xi - \xi^T)$$

with $\gamma_2, \gamma_3 > 0$ we find that all geodesics are smooth and have the form

$$P(t) = M\big(t(\alpha + \omega)\big) := P(0)\, e^{t\alpha} e^{t\omega} \tag{1.5}$$

with $\alpha = \frac{1}{\gamma_2}\eta^T$ and $\omega = \big(\frac{1}{\gamma_2} + \frac{1}{\gamma_3}\big)\eta_{\mathrm{anti}}$. Here $\eta$ is defined such that (1.4) holds. This leads to the formula

$$D(P) = \hat D(I, P) = \min\Big\{ \big( \gamma_2 \|\xi_{\mathrm{sym}}\|^2 + \gamma_3 \|\xi_{\mathrm{anti}}\|^2 \big)^{1/2} : P = e^{\xi_{\mathrm{sym}} - \delta\xi_{\mathrm{anti}}}\, e^{(1+\delta)\xi_{\mathrm{anti}}} \Big\}
\quad\text{where } \delta = \gamma_3/\gamma_2.$$
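The structure of (1.5) can be tested numerically against the conservation law (1.4). The sketch below is an illustration under explicit assumptions rather than the construction of Section 6: it takes $P(0) = I$, sets $\eta = B\xi(0)$ with $B\xi = \gamma_2\xi_{\mathrm{sym}} + \gamma_3\xi_{\mathrm{anti}}$ (so the constant normalization by $\Delta(\xi)$ in (1.4) is dropped), uses the closed form $P^{-1}\dot P = e^{-t\omega}\alpha e^{t\omega} + \omega$ obtained by differentiating $e^{t\alpha}e^{t\omega}$, and relies on a truncated Taylor series for the matrix exponential. All function names and the sample data are hypothetical.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def combine(c1, a, c2, b):
    """Entrywise linear combination c1*a + c2*b."""
    n = len(a)
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(n)] for i in range(n)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def expm(a, terms=60):
    """Matrix exponential via truncated Taylor series (adequate for small matrices of moderate norm)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = combine(1.0, result, 1.0, term)
    return result

def sym(x):
    return combine(0.5, x, 0.5, transpose(x))

def anti(x):
    return combine(0.5, x, -0.5, transpose(x))

def apply_B(x, g2, g3):
    return combine(g2, sym(x), g3, anti(x))

def delta(x, g2, g3):
    """sqrt(<B x, x>) with the Frobenius pairing."""
    bx = apply_B(x, g2, g3)
    return math.sqrt(sum(u * v for ru, rv in zip(bx, x) for u, v in zip(ru, rv)))

def check_geodesic(xi0, g2, g3, times):
    """Return the worst residual of B xi(t) = P^T eta P^{-T} and the spread of Delta(xi(t))."""
    eta = apply_B(xi0, g2, g3)
    alpha = scale(1.0 / g2, transpose(eta))
    omega = scale(1.0 / g2 + 1.0 / g3, anti(eta))
    residual, speeds = 0.0, []
    for t in times:
        exp_a, exp_o = expm(scale(t, alpha)), expm(scale(t, omega))
        inv_a, inv_o = expm(scale(-t, alpha)), expm(scale(-t, omega))
        p, p_inv = matmul(exp_a, exp_o), matmul(inv_o, inv_a)
        xi = combine(1.0, matmul(matmul(inv_o, alpha), exp_o), 1.0, omega)
        lhs = matmul(matmul(transpose(p), eta), transpose(p_inv))
        rhs = apply_B(xi, g2, g3)
        residual = max(residual, max(abs(u - v) for ru, rv in zip(lhs, rhs) for u, v in zip(ru, rv)))
        speeds.append(delta(xi, g2, g3))
    return residual, max(speeds) - min(speeds)

XI0 = [[0.3, 0.5, 0.0], [-0.2, -0.1, 0.4], [0.1, 0.0, -0.2]]   # traceless initial velocity in sl(3)
residual, speed_spread = check_geodesic(XI0, 1.0, 2.0, [0.0, 0.25, 0.5, 1.0])
```

In this sketch both `residual` and `speed_spread` stay at rounding level, which reflects two facts used above: $\alpha$ is a multiple of $\eta^T$, so $e^{t\alpha^T}$ commutes with $\eta$, and $\omega$ is antisymmetric, so $e^{t\omega}$ is a rotation.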

2 Formulations of Elastoplasticity

The aim of this section is to motivate the model for elastoplasticity at finite deformations from the mechanical point of view; for more background we refer to Simó [1988]; Hackl [1997]; Gurtin [2000]. Then we reformulate the mechanical notions by using methods from convex analysis based on subdifferentials. This leads to a formulation involving energy balance and local (infinitesimal) stability. We will then argue that local stability can be replaced by global stability, which then justifies the incremental problem in variational form.


We consider perfect finite plasticity, i.e., we do not take into account plastic hardening effects. The body is given by a domain $\Omega \subset \mathbb{R}^d$, where the dimension $d$ is usually equal to 3 but $d \in \{1, 2\}$ is also relevant in certain special situations. The deformation is $\varphi : \Omega \to \mathbb{R}^d$ and its gradient $F = D\varphi$ is decomposed multiplicatively into the elastic part $F_e$ and the plastic part $F_p$:

$$D\varphi = F = F_e F_p. \tag{2.1}$$

The plastic part $F_p$ is not a gradient. It corresponds to the atomic disarrangement in single crystals arising through slips. In polycrystals $F_p$ also includes rearrangements of the grains.

The first main assumption of perfect finite plasticity is that the stored-energy density (per unit volume) $\psi$ depends on $F$ and $F_p$ only through the elastic part of the deformation gradient:

$$\psi = \hat\psi(F, P) = \Psi(F_e) = \Psi(FP) \quad\text{where } P = F_p^{-1}. \tag{2.2}$$

For notational simplicity we omit all dependence on the material point $x \in \Omega$, but of course our theory can handle general inhomogeneous materials with $\psi = \Psi(x, FP)$. Using the elastic stress tensor (first Piola–Kirchhoff tensor)

$$\Sigma = \frac{\partial}{\partial F}\hat\psi(F, P) = D\Psi(FP)\,P^T \tag{2.3}$$

the elastic equilibrium equation reads

$$\begin{aligned}
-\operatorname{div}\Sigma &= f_{\mathrm{ext}}(t, \cdot) && \text{in } \Omega, \\
\varphi &= \varphi_0(t, \cdot) && \text{on } \Gamma_{\mathrm{displ}}, \\
\Sigma\nu &= t_{\mathrm{ext}}(t, \cdot) && \text{on } \Gamma_{\mathrm{tract}} = \partial\Omega \setminus \Gamma_{\mathrm{displ}}.
\end{aligned} \tag{2.4}$$

We additionally need a flow rule for the internal plastic variable $P = F_p^{-1}$. This is obtained from the principle of maximal dissipation or equivalently from the associated flow rule. The aim of this section is to derive another formulation using a dissipation metric and the calculus of subdifferentials. We start with the conventional formulation. The plastic back stress $\hat Q$ can be defined as

$$\hat Q = -\frac{\partial}{\partial P}\hat\psi(F, P) = -F^T D\Psi(FP). \tag{2.5}$$

Note that the tensor $Q = P^T \hat Q$ depends only on the elastic part $F_e = FP$ of the deformation. The yield function $y$ in general depends on $\hat Q$ and $P$, and $y = 0$ defines the yield surface. Again assuming that $y$ is invariant under all plastic deformations we postulate

$$y = \hat y(\hat Q, P) = Y(Q) = Y(P^T \hat Q). \tag{2.6}$$


The principle of maximal dissipation generates the associated flow rule (cf. Simó and Ortiz [1985]; Ziegler and Wehrli [1987]; Simó [1988])

$$P^{-1}\dot P = \lambda \frac{\partial Y(Q)}{\partial Q}, \quad\text{with } \lambda \ge 0,\ Y(Q) \le 0 \text{ and } \lambda Y(Q) = 0. \tag{2.7}$$

Now (2.7) together with the elastic equilibrium condition (2.4) form the full set of equations describing the evolution of elastoplasticity. The flow rule (2.7), which now has the Karush–Kuhn–Tucker form, can be reformulated by introducing

$$\mathcal{Q} = \big\{ Q \in \mathbb{R}^{d\times d} : Y(Q) \le 0 \big\}
\quad\text{and}\quad
X_{\mathcal{Q}}(Q) = \begin{cases} 0 & \text{for } Q \in \mathcal{Q}, \\ \infty & \text{otherwise,} \end{cases}$$

where we always assume that $\mathcal{Q}$ is convex and that there exist $r_1, r_2 > 0$ such that

$$B(0, r_1) \subset \mathcal{Q} \subset B(0, r_2). \tag{2.8}$$

Using the outer normal cone $N_Q\mathcal{Q}$, the flow rule (2.7) now takes the form

$$P^{-1}\dot P \in \partial X_{\mathcal{Q}}(Q) = N_Q\mathcal{Q}. \tag{2.9}$$

Note that (2.9) is more general, since $\mathcal{Q}$ may have corners while $\partial\mathcal{Q} = \{ Q : Y(Q) = 0 \}$ is $C^1$ if $Y \in C^1$. For general convex functions $f : Z \to \mathbb{R}$ on a Banach space $Z$ the subdifferential $\partial f(z)$ at the point $z \in Z$ is defined via

$$\partial f(z) = \big\{ z^* \in Z^* : f(z + \tilde z) \ge f(z) + \langle z^*, \tilde z\rangle \text{ for all } \tilde z \in Z \big\}, \tag{2.10}$$

where $Z^*$ is the dual Banach space and $\langle z^*, z\rangle$ the dual pairing. The outer normal cone $N_z C$ of a closed convex set $C$ at the point $z \in C$ is

$$N_z C = \big\{ z^* \in Z^* : \langle z^*, \tilde z - z\rangle \le 0 \text{ for all } \tilde z \in C \big\}. \tag{2.11}$$

Using the Legendre transform $(\mathcal{L}f)(z^*) = \sup_{z \in Z} \big[ \langle z^*, z\rangle - f(z) \big]$ we define the dissipation functional

$$\Delta(\xi) = (\mathcal{L}X_{\mathcal{Q}})(\xi) = \sup\big\{ \langle \eta, \xi\rangle : \eta \in \mathcal{Q} \big\}. \tag{2.12}$$

With (2.8) we conclude $r_1\|\xi\| \le \Delta(\xi) \le r_2\|\xi\|$. Moreover $\Delta : \mathbb{R}^{d\times d} \to [0, \infty)$ is convex (as $\mathcal{Q}$ is convex) and homogeneous of degree 1. Under the additional assumption $\Delta(-\xi) = \Delta(\xi)$, which is not necessary for the present work, the function $\Delta$ defines a metric on $\mathbb{R}^{d\times d}$. Without this condition we call $\Delta$ a nonsymmetric metric.
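The support-function definition (2.12) and the bounds that follow from (2.8) can be illustrated with a small polygonal yield set. The following Python sketch is a two-dimensional toy stand-in for $\mathcal{Q} \subset \mathbb{R}^{d\times d}$: the vertex list, the radii `R1` and `R2`, and the function names are assumptions chosen for illustration. Because the polygon is not symmetric about the origin, the resulting $\Delta$ is a nonsymmetric metric in the sense described above.

```python
import math

# Vertices of a convex polygon Q containing the origin in its interior (toy yield set).
VERTICES = [(1.0, 0.0), (0.0, 1.0), (-2.0, 0.0), (0.0, -1.0)]

def delta(xi, vertices=VERTICES):
    """Support function sup{<eta, xi> : eta in Q}; for a polygon the sup is attained at a vertex."""
    return max(eta[0] * xi[0] + eta[1] * xi[1] for eta in vertices)

def norm(xi):
    return math.hypot(xi[0], xi[1])

R1 = 1.0 / math.sqrt(2.0)   # largest ball B(0, R1) contained in Q (distance to the nearest edge)
R2 = 2.0                    # smallest ball B(0, R2) containing Q (distance to the farthest vertex)

directions = [(math.cos(math.radians(k)), math.sin(math.radians(k))) for k in range(360)]
bounds_hold = all(
    R1 * norm(x) - 1e-12 <= delta(x) <= R2 * norm(x) + 1e-12 for x in directions
)
forward, backward = delta((1.0, 0.0)), delta((-1.0, 0.0))   # differ, so Delta(-xi) != Delta(xi)
```

The same construction applies to any finite vertex set; only the two radii have to be recomputed for the chosen polygon.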


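We define the dissipation functional $\hat\Delta(P, \dot P) = \Delta(P^{-1}\dot P)$. The classical duality theory for convex functions applied to $\Delta = \mathcal{L}X_{\mathcal{Q}}$ gives

$$\xi \in \partial X_{\mathcal{Q}}(Q) \iff Q \in \partial\Delta(\xi).$$

With $\xi = P^{-1}\dot P$ and $\partial_2\hat\Delta(P, \dot P) = P^{-T}\partial\Delta(P^{-1}\dot P)$ we find two equivalent forms of the flow rule:

$$Q \in \partial\Delta(P^{-1}\dot P) \iff \hat Q \in \partial_2\hat\Delta(P, \dot P). \tag{2.13}$$

We employ a standard relation for subdifferentials of convex functions $f$ which are homogeneous of degree 1:

$$\partial f(z) = \big\{ z^* \in Z^* : z^* \in \partial f(0) \text{ and } \langle z^*, z\rangle = f(z) \big\}. \tag{2.14}$$

Together with the definition of $\hat Q$ in (2.5) the right-hand side of (2.13) gives our final pointwise version of the flow rule:

$$0 \in \frac{\partial}{\partial P}\hat\psi(F, P) + \partial_2\hat\Delta(P, \dot P) \subset \mathbb{R}^{d\times d}. \tag{2.15}$$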
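As a worked illustration of (2.13) and (2.14), consider the simplest isotropic model case, which is an assumption made here for illustration and not a case singled out in the text: $\Delta(\xi) = \sigma\|\xi\|_F$ on $\mathbb{R}^{d\times d}$ with a constant $\sigma > 0$, so that $\mathcal{Q}$ is the Frobenius ball of radius $\sigma$. The relation (2.14) combined with the Cauchy–Schwarz inequality gives the subdifferential explicitly:

```latex
% Assumed model case: \Delta(\xi) = \sigma \|\xi\|_F with \sigma > 0.
\partial\Delta(0) = \{ \eta : \|\eta\|_F \le \sigma \} = \mathcal{Q},
\qquad
\partial\Delta(\xi) = \Big\{ \sigma\, \frac{\xi}{\|\xi\|_F} \Big\} \quad (\xi \ne 0).

% Hence Q \in \partial\Delta(P^{-1}\dot P) splits into two regimes:
%   elastic regime:  \dot P = 0              and  \|Q\|_F \le \sigma,
%   plastic flow:    P^{-1}\dot P = \lambda Q,  \lambda > 0,  and  \|Q\|_F = \sigma.
% This is (2.7) with Y(Q) = \|Q\|_F - \sigma, up to a rescaling of \lambda.
```

In this special case the set-valued inclusion reproduces the Karush–Kuhn–Tucker conditions of (2.7) without any smoothness assumption on the yield surface.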

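To obtain field equations we consider the integrated functionals

$$\begin{aligned}
E(t, \varphi, P) &= \int_\Omega \Big[ \hat\psi\big(D_x\varphi(x), P(x)\big) - f_{\mathrm{ext}}(t, x)\cdot\varphi(x) \Big]\,dx - \int_{\Gamma_{\mathrm{tract}}} t_{\mathrm{ext}}(t, x)\cdot\varphi(x)\,da, \\
\hat\Delta_\Omega(P, \dot P) &= \int_\Omega \hat\Delta\big(P(x), \dot P(x)\big)\,dx.
\end{aligned}$$

Now, the whole problem of elastoplasticity can be written using variational derivatives and the subdifferential as follows:

$$\begin{aligned}
&\text{elastic equilibrium:} && D_\varphi E(t, \varphi, P) = 0, \\
&\text{plastic flow rule:} && 0 \in D_P E(t, \varphi, P) + \partial_2\hat\Delta_\Omega(P, \dot P).
\end{aligned} \tag{2.16}$$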

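Our final formulation, which we derive now, has the major advantage that it does not involve derivatives but relies only on minimization of functionals. Applying (2.14) with $z^* = \hat Q = -\frac{\partial}{\partial P}\hat\psi(F, P)$ and $f = \hat\Delta(P, \cdot)$ we can calculate the time derivative of the bulk energy $e(t) = E\big(t, \varphi(t), P(t)\big)$ along any solution $t \mapsto \big(\varphi(t), P(t)\big)$:

$$\begin{aligned}
\frac{d}{dt}e(t)
&= \frac{\partial}{\partial t}E\big(t, \varphi(t), P(t)\big) + D_\varphi E\big(t, \varphi(t), P(t)\big)[\dot\varphi(t)] + D_P E\big(t, \varphi(t), P(t)\big)[\dot P(t)] \\
&= \frac{\partial}{\partial t}E\big(t, \varphi(t), P(t)\big) + 0 + \int_\Omega \big\langle -\hat Q(t, x), \dot P(t, x)\big\rangle\,dx \\
&= \frac{\partial}{\partial t}E\big(t, \varphi(t), P(t)\big) - \int_\Omega \hat\Delta\big(P(t, x), \dot P(t, x)\big)\,dx \\
&= \frac{\partial}{\partial t}E\big(t, \varphi(t), P(t)\big) - \hat\Delta_\Omega\big(P(t), \dot P(t)\big).
\end{aligned}$$

This leads to the energy balance

$$\begin{aligned}
E\big(t, \varphi(t), P(t)\big) &- E\big(s, \varphi(s), P(s)\big) + \int_s^t \hat\Delta_\Omega\big(P(\tau), \dot P(\tau)\big)\,d\tau \\
&= \int_s^t \frac{\partial}{\partial\tau}E\big(\tau, \varphi(\tau), P(\tau)\big)\,d\tau \\
&= -\int_s^t \Big[ \int_\Omega \frac{\partial}{\partial\tau} f_{\mathrm{ext}}(\tau, x)\cdot\varphi(\tau, x)\,dx + \int_{\Gamma_{\mathrm{tract}}} \frac{\partial}{\partial\tau} t_{\mathrm{ext}}(\tau, x)\cdot\varphi(\tau, x)\,da \Big]\,d\tau.
\end{aligned} \tag{2.17}$$

Moreover (2.16) implies the local conditions

$$\begin{aligned}
D_\varphi E\big(t, \varphi(t), P(t)\big)[\psi] &= 0, \\
D_P E\big(t, \varphi(t), P(t)\big)[S] + \hat\Delta_\Omega\big(P(t), S\big) &\ge 0
\end{aligned} \tag{2.18}$$

for all $\psi$ with $\psi|_{\Gamma_{\mathrm{displ}}} = \varphi_0(t, \cdot)$ and all $S$. Note that (2.17) and (2.18) are equivalent to (2.16) and hence to (2.4) and (2.7).

Since we are considering rate-independent elastoplasticity we may interpret (2.18) as a local form of a stability condition. In fact, we may assume that $\varphi(t)$ is the (global) minimizer of $E\big(t, \cdot, P(t)\big)$. Similarly we may assume that $P(t)$ is such that, under the given loading (i.e., $t$ fixed), we cannot lower the energy from $E\big(t, \varphi(t), P(t)\big)$ to a value $E(t, \varphi^*, P^*)$ such that the gain in energy is larger than the associated dissipation obtained from moving $P(t)$ continuously into $P^*$. To measure this dissipation we introduce a nonsymmetric distance metric on matrices via

$$\hat D(P_0, P_1) = \inf\Big\{ \int_0^1 \hat\Delta\big(P(\tau), \dot P(\tau)\big)\,d\tau : P \in C^1([0,1], \mathbb{R}^{d\times d}),\ P(0) = P_0,\ P(1) = P_1 \Big\}. \tag{2.19}$$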


Note that we have $\hat D(P_0, P_1) = \hat D(I, P_0^{-1}P_1)$ but we do not assume symmetry, i.e., we may have $\hat D(P_0, P_1) \ne \hat D(P_1, P_0)$. The triangle inequality

$$\hat D(P_0, P_2) \le \hat D(P_0, P_1) + \hat D(P_1, P_2) \tag{2.20}$$

is an immediate consequence of the definition. Moreover we have $\hat D(I, e^\xi) = \Delta(\xi) + O(|\xi|^2)$ for $\xi \to 0$. Defining the integrated version of the metric

$$\hat D_\Omega(P_0, P_1) = \int_\Omega \hat D\big(P_0(x), P_1(x)\big)\,dx$$

we arrive at our final global formulation of elastoplasticity:

$$\begin{aligned}
&\text{global stability:} && E\big(t, \varphi(t), P(t)\big) \le E(t, \varphi^*, P^*) + \hat D_\Omega\big(P(t), P^*\big) \\
& && \text{for all } \varphi^* \text{ with } \varphi^*|_{\Gamma_{\mathrm{displ}}} = \varphi_0(t, \cdot) \text{ and all } P^*; \\
&\text{energy balance:} && E\big(t, \varphi(t), P(t)\big) + \int_s^t \Delta_\Omega\big(P(\tau)^{-1}\dot P(\tau)\big)\,d\tau \\
& && \quad = E\big(s, \varphi(s), P(s)\big) + \int_s^t \frac{\partial}{\partial\tau}E\big(\tau, \varphi(\tau), P(\tau)\big)\,d\tau.
\end{aligned} \tag{2.21}$$

Clearly this formulation implies the previous ones but not vice versa. Equivalence can be obtained, however, under certain convexity assumptions as in linear elastoplasticity. A major advantage of the formulation (2.21) is that it immediately gives rise to a suitable incremental formulation when the time interval $[0, T]$ is discretized via $0 = t_0 < t_1 < \dots < t_N = T$ and the initial value $(\varphi_0, P_0)$ at $t_0 = 0$ is given:

Incremental Problem: Given $(\varphi_k, P_k)$ at time $t_k$, find $(\varphi_{k+1}, P_{k+1})$ such that it minimizes the functional

$$T_{k+1} : (\varphi, P) \mapsto E(t_{k+1}, \varphi, P) + \hat D_\Omega(P_k, P). \tag{2.22}$$

We show that any solution of (2.22) satisfies a suitable discretized version of (2.21).

2.1 Lemma. Let $(\varphi_k, P_k)$, $k = 1, \dots, N$, be a solution of the incremental problem (2.22). Then, for $k = 1, \dots, N$ we have

(i) stability of $(\varphi_k, P_k)$ at time $t_k$, i.e.,

$$E(t_k, \varphi_k, P_k) \le E(t_k, \varphi^*, P^*) + \hat D_\Omega(P_k, P^*) \quad\text{for all } (\varphi^*, P^*),$$

(ii) the two-sided discretized energy estimate

$$\begin{aligned}
\int_{t_{k-1}}^{t_k} \frac{\partial}{\partial s}E(s, \varphi_k, P_k)\,ds
&= E(t_k, \varphi_k, P_k) - E(t_{k-1}, \varphi_k, P_k) \\
&\le E(t_k, \varphi_k, P_k) - E(t_{k-1}, \varphi_{k-1}, P_{k-1}) + \hat D_\Omega(P_{k-1}, P_k) \\
&\le E(t_k, \varphi_{k-1}, P_{k-1}) - E(t_{k-1}, \varphi_{k-1}, P_{k-1})
= \int_{t_{k-1}}^{t_k} \frac{\partial}{\partial s}E(s, \varphi_{k-1}, P_{k-1})\,ds.
\end{aligned}$$

Proof. To shorten the notation we let $z_k = (\varphi_k, P_k)$. The stability (i) is obtained as follows. For all $z^* = (\varphi^*, P^*)$ we have

$$\begin{aligned}
E(t_k, z^*) + \hat D_\Omega(P_k, P^*)
&= E(t_k, z^*) + \hat D_\Omega(P_{k-1}, P^*) + \hat D_\Omega(P_k, P^*) - \hat D_\Omega(P_{k-1}, P^*) \\
&\ge E(t_k, z_k) + \hat D_\Omega(P_{k-1}, P_k) + \hat D_\Omega(P_k, P^*) - \hat D_\Omega(P_{k-1}, P^*) \\
&\ge E(t_k, z_k).
\end{aligned}$$

Here, the first inequality uses that $z_k$ is a global minimizer (cf. (2.22)) and the second estimate is the triangle inequality (2.20). The lower estimate in (ii) follows from the stability of $z_{k-1}$ at time $t_{k-1}$:

$$E(t_k, z_k) - E(t_{k-1}, z_{k-1}) + \hat D_\Omega(P_{k-1}, P_k) \ge E(t_k, z_k) - E(t_{k-1}, z_k).$$

The upper estimate is a consequence of $z_k$ being a minimizer:

$$E(t_k, z_k) - E(t_{k-1}, z_{k-1}) + \hat D_\Omega(P_{k-1}, P_k) \le E(t_k, z_{k-1}) - E(t_{k-1}, z_{k-1}).$$

This proves the result.
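The statements of Lemma 2.1 can be observed in a scalar toy model. The following Python sketch is an additive one-dimensional analogue introduced purely for illustration, not the multiplicative Lie group setting of this paper: the energy $\tfrac12(\ell - p)^2$ with prescribed load $\ell$, the dissipation distance $\sigma|p_1 - p_0|$, the load history and all function names are assumptions. For this model the incremental minimizer is known in closed form.

```python
def energy(load, p):
    """Toy stored energy; the elastic strain is load - p."""
    return 0.5 * (load - p) ** 2

def dissipation(p_old, p_new, sigma):
    """Toy dissipation distance sigma*|p_new - p_old|; it satisfies the triangle inequality."""
    return sigma * abs(p_new - p_old)

def incremental_step(load, p_old, sigma):
    """Exact minimizer of energy(load, p) + dissipation(p_old, p, sigma) over p."""
    trial = load - p_old
    if abs(trial) <= sigma:
        return p_old                      # elastic step: the plastic variable does not move
    return load - sigma if trial > 0 else load + sigma

def run(loads, sigma, p0=0.0):
    """Solve the incremental problem along the load history, starting from p0 at loads[0]."""
    history = [p0]
    for load in loads[1:]:
        history.append(incremental_step(load, history[-1], sigma))
    return history

def is_stable(load, p, sigma, candidates):
    """Check (i): energy(p) <= energy(q) + dissipation(p, q) for all sampled competitors q."""
    return all(
        energy(load, p) <= energy(load, q) + dissipation(p, q, sigma) + 1e-12
        for q in candidates
    )

def energy_estimate_holds(loads, history, sigma, k):
    """Check the two-sided estimate (ii) at step k."""
    lower = energy(loads[k], history[k]) - energy(loads[k - 1], history[k])
    middle = (energy(loads[k], history[k]) - energy(loads[k - 1], history[k - 1])
              + dissipation(history[k - 1], history[k], sigma))
    upper = energy(loads[k], history[k - 1]) - energy(loads[k - 1], history[k - 1])
    return lower <= middle + 1e-12 and middle <= upper + 1e-12

LOADS = [0.0, 0.5, 1.5, 2.5, 1.0, -1.0]   # hypothetical loading and unloading path
SIGMA = 1.0                               # hypothetical yield threshold
history = run(LOADS, SIGMA)
candidates = [0.01 * j for j in range(-500, 501)]
all_stable = all(is_stable(LOADS[k], history[k], SIGMA, candidates) for k in range(len(LOADS)))
all_estimates = all(energy_estimate_holds(LOADS, history, SIGMA, k) for k in range(1, len(LOADS)))
```

The initial state is stable for the initial load, which is what the lower estimate needs at the first step; in the toy model stability is equivalent to $|\ell - p| \le \sigma$.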

The minimization problem in the incremental problem (2.22) can be solved in two steps, since the functional $T_{k+1}$ has the form

$$T_{k+1}(\varphi, P) = \int_\Omega \Big[ \Psi\big((D_x\varphi)P\big) + \hat D(P_k, P) \Big]\,dx - L_{\mathrm{ext}}(t_{k+1}, \varphi),$$

where $L_{\mathrm{ext}}(t, \cdot)$ denotes the external loading. We first minimize pointwise in $x \in \Omega$ with respect to $P$ and obtain the reduced incremental energy density

$$\Psi^{\mathrm{red}}_{P_k}(F) = \min\big\{ \Psi(FP) + \hat D(P_k, P) : P \in \mathcal{G} \big\}.$$

To study the existence question for solutions $(\varphi_{k+1}, P_{k+1})$ it is now essential to prove certain convexity properties of $\Psi^{\mathrm{red}}_{P_k}(\cdot) : \mathbb{R}^{d\times d} \to [0, \infty]$ such as rank-one and quasi-convexity, see Ortiz and Repetto [1999]; Carstensen, Hackl, and Mielke [2001] for first results in this direction. For the present metric $\hat D$ this will be done in subsequent work.
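To see what the pointwise minimization produces in the simplest possible case, one can carry it out for a scalar additive analogue. This is an assumption made for illustration only: the energy $\tfrac12(F - p)^2$ and the distance $\sigma|p - p_k|$ replace $\Psi(FP)$ and $\hat D(P_k, P)$, and none of the convexity questions of the matrix case arise.

```latex
% Scalar additive toy model (assumed): e := F - p_k, q := p - p_k.
\Psi^{\mathrm{red}}_{p_k}(F)
  = \min_{q \in \mathbb{R}} \Big[ \tfrac12 (e - q)^2 + \sigma |q| \Big]
  = \begin{cases}
      \tfrac12 e^2                      & \text{if } |e| \le \sigma \quad (q = 0), \\[2pt]
      \sigma |e| - \tfrac12 \sigma^2    & \text{if } |e| > \sigma \quad (q = e - \sigma\,\mathrm{sign}\,e).
    \end{cases}
```

The reduced density is quadratic inside the elastic range and grows linearly outside it; in this toy case it is convex in $F$, whereas in the matrix setting the corresponding property is exactly the open question raised above.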

3 Mathematical Formulation Using Lie Groups

In general situations of plasticity with finite elastic and plastic deformations the deformation gradient $F$ is assumed to lie in $GL(3) = GL_+(3, \mathbb{R}) = \{ P \in \mathbb{R}^{3\times 3} : \det P > 0 \}$. The plastic variable $P = F_p^{-1}$ is assumed to lie in a Lie subgroup $\mathcal{G}$ of $GL(3)$. For example, $\mathcal{G} = SL(3) = \{ P \in GL(3) : \det P = 1 \}$ is the case of incompressible plastic deformations. If only one slip system should be taken into account, then we let $\mathcal{G} = \{ I + \alpha\, n \otimes m : \alpha \in \mathbb{R} \}$, where $|n| = |m| = 1$ and $n \cdot m = 0$. Hence, in the following $\mathcal{G} \subset GL(d)$ will denote a general Lie subgroup which contains the plastic variable: $P(t, x) \in \mathcal{G}$. By $\mathfrak{g} = T_I\mathcal{G}$ we denote the associated Lie algebra.

Our plasticity model is based solely on two constitutive functions: the elastic energy density $\hat\psi(F, P)$ and a plastic dissipation potential $\hat\Delta(P, \dot P)$;

$$\hat\psi : GL(d) \times \mathcal{G} \to \mathbb{R} \quad\text{and}\quad \hat\Delta : T\mathcal{G} \to [0, \infty).$$

Here again we neglect any dependence on $x \in \Omega$ which could be included easily. As a consequence the elastic stress $\Sigma = \frac{\partial}{\partial F}\hat\psi(F, P)$ is an element of $T^*_F GL(d)$ and the plastic backstress $\hat Q = -\frac{\partial}{\partial P}\hat\psi(F, P)$ is an element of $T^*_P\mathcal{G}$. Hence, the flow rule (2.13) has to be considered on $T^*\mathcal{G}$.

To introduce the (symmetry) axiom of our plasticity model we use the left and right translations $L_G$ and $R_G$ on $GL(d)$:

$$L_G : \mathcal{G} \to \mathcal{G},\ P \mapsto GP; \qquad R_G : \mathcal{G} \to \mathcal{G},\ P \mapsto PG.$$

The derivative $DL_G$ maps the tangent space $T_P\mathcal{G}$ bijectively onto $T_{L_G P}\mathcal{G}$. The basic axiom of multiplicative plasticity can now be formulated by introducing an action of $\mathcal{G}$ on $GL(d) \times T\mathcal{G}$ via

$$\tau : \mathcal{G} \times \big(GL(d) \times T\mathcal{G}\big) \to GL(d) \times T\mathcal{G}, \qquad
\big(G, (F, P, \dot P)\big) \mapsto \big(R_G^{-1}F,\ L_G P,\ DL_G\dot P\big).$$

For $\tau\big(G, (F, P, \dot P)\big)$ we use the abbreviation

$$\tau_G(F, P, \dot P) = \big(R_G^{-1}F,\ L_G P,\ DL_G\dot P\big) = [\![\, (F G^{-1},\ GP,\ G\dot P) \,]\!].$$

In the whole of this section, we will use the abstract Lie group notation but add for clarity the corresponding matrix notation in double brackets $[\![\, \dots \,]\!]$.

In addition to the plastic symmetry there are two standard symmetries which are already present in pure elasticity. The first symmetry is called objectivity or frame indifference and is due to the Euclidean symmetry of the space $\mathbb{R}^d$. The rotations $R \in O(d)$ act on $F_e$ via left translation. The second symmetry is the material symmetry which is incorporated in a subgroup $\mathcal{S}$ of $O(d)$ and which acts on $F_e$ via right translation.

(A1) Objectivity (frame indifference): For all $R \in SO(d)$ and all $(F, P) \in GL(d) \times \mathcal{G}$ we have

$$\hat\psi(L_R F, P) = [\![\, \hat\psi(RF, P) \,]\!] = \hat\psi(F, P).$$

(A2) Invariance under superimposed plastic deformations:

$$\hat\psi \circ \tau_G|_{GL(d)\times\mathcal{G}} = \hat\psi \quad\text{and}\quad \hat\Delta \circ \tau_G|_{T\mathcal{G}} = \hat\Delta \quad\text{for all } G \in \mathcal{G};$$

$$[\![\, \hat\psi(F G^{-1}, GP) = \hat\psi(F, P) \quad\text{and}\quad \hat\Delta(GP, G\dot P) = \hat\Delta(P, \dot P) \,]\!].$$

(A3) Rate independence and dissipativity: The function $\hat\Delta(P, \cdot) : T_P\mathcal{G} \to [0, \infty)$ is convex and homogeneous of degree 1 with $r_1\|\xi\| \le \hat\Delta(P, L_P\xi) \le r_2\|\xi\|$.

(A4) Material symmetry: For all $S \in \mathcal{S}$ and all $(F, P, \dot P) \in GL(d) \times T\mathcal{G}$ we have

$$\hat\psi(F, P) = \hat\psi(F, R_S P) = [\![\, \hat\psi(F, PS) \,]\!], \qquad
\hat\Delta(P, \dot P) = \hat\Delta(R_S P, DR_S\dot P) = [\![\, \hat\Delta(PS, \dot P S) \,]\!].$$

Assumptions (A2) and (A3) imply

$$\hat\Delta(P, \dot P) = \Delta\big(DL_P^{-1}\dot P\big) = [\![\, \Delta(P^{-1}\dot P) \,]\!],$$

where $\Delta : \mathfrak{g} = T_I\mathcal{G} \to [0, \infty)$ is convex and homogeneous of degree 1. Assumptions (A1) and (A2) give

$$\hat\psi(F, P) = \Psi(FP) = \tilde\Psi\big((FP)^T FP\big).$$

The material symmetry $\mathcal{S} \subset O(d)$ in (A4) reads for the reduced functionals $\Delta$ and $\Psi$

$$\Delta(\mathrm{Ad}_S\,\xi) = \Delta(\xi) \quad\text{and}\quad \Psi(F_e S) = \Psi(F_e) \tag{3.1}$$

for $S \in \mathcal{S}$, $\xi \in \mathfrak{g}$ and $F_e \in GL(d)$. Here,

$$\mathrm{Ad}_R\,\xi = DL_R\, DR_R^{-1}\,\xi = [\![\, R\xi R^{-1} \,]\!]$$

is the adjoint action of $\mathcal{G}$ on $\mathfrak{g}$. We will also need the co-adjoint action of $\mathcal{G}$ on $\mathfrak{g}^*$, which is defined by

$$G \mapsto \mathrm{CoAd}_G = \mathrm{Ad}^*_{G^{-1}} = \mathrm{Ad}_G^{-*} \in \mathrm{Lin}(\mathfrak{g}^*).$$
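For the single-slip example mentioned at the start of this section, the group structure can be verified in two lines. The short derivation below is added for illustration; the abbreviation $N = n \otimes m$ is introduced here and is not notation from the text.

```latex
% N := n \otimes m with |n| = |m| = 1 and n \cdot m = 0.
N^2 = (n \otimes m)(n \otimes m) = (m \cdot n)\, n \otimes m = 0,
\qquad
(I + \alpha N)(I + \beta N) = I + (\alpha + \beta) N + \alpha\beta N^2 = I + (\alpha + \beta) N .

% Consequences:
%   e^{\alpha N} = I + \alpha N,  so the set is a one-parameter (abelian) subgroup with Lie algebra span\{N\};
%   (I + \alpha N)^{-1} = I - \alpha N;
%   \det(I + \alpha N) = 1 + \alpha\, (m \cdot n) = 1,  so the subgroup lies in SL(d).
```

On this subgroup the left-invariant rate is $P^{-1}\dot P = \dot\alpha N$, so any admissible $\Delta$ reduces to a function of the scalar slip rate $\dot\alpha$.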


The construction of suitable energy densities $\Psi$ is well understood. Typically one assumes that the mapping $F \mapsto \Psi(FP)$ is such that it satisfies the conditions of pure elasticity, that is $\Psi(F_e) = +\infty$ for $\det F_e \le 0$ and $\Psi(F_e) \to \infty$ for $\det F_e \searrow 0$ or $\|F_e\| \to \infty$. Moreover, $\Psi$ is assumed to be quasi- or polyconvex. The construction of plastic dissipation densities is discussed in the next section.

4 Dissipation Functionals

We discuss two different cases, namely von Mises plasticity (cf. Simó and Ortiz [1985]; Simó [1988]; Miehe and Stein [1992]) and single-crystal plasticity (cf. Ortiz and Repetto [1999]; Gurtin [2000]). The former is a model for polycrystals such as metals where the presence of grains averages out the different crystallographic directions; the material is assumed to be isotropic, i.e., $\mathcal{S} = SO(d)$.

4.1 von Mises Plasticity

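The Lie group $\mathcal{G}$ of plastic directors is given by the special linear group $\mathcal{G} = SL(d) = \{ P \in GL(d) : \det P = 1 \}$. The associated Lie algebra $\mathfrak{g} = T_I\mathcal{G}$ is $sl(d) = \{ \xi \in \mathbb{R}^{d\times d} : \operatorname{trace}\xi = 0 \}$. The material symmetry is given by $\mathcal{S} = SO(d) = \{ R \in GL(d) : R^T R = I \}$. Denoting by $\|\xi\|_F$ the matrix Frobenius norm

$$\|\xi\|_F^2 = \xi : \xi = \operatorname{trace}(\xi\xi^T) = \sum_{j,k=1}^d \xi_{jk}^2$$

we obtain $\|\mathrm{Ad}_R\,\xi\|_F = \|\xi\|_F$ for $R \in SO(d)$ and $\xi \in \mathbb{R}^{d\times d}$. Moreover we have the following result.

4.1 Proposition. Let $\xi_{\mathrm{sym}} = \tfrac12(\xi + \xi^T)$ and $\xi_{\mathrm{anti}} = \tfrac12(\xi - \xi^T)$ and take any function $\delta : \mathbb{R}^2 \to [0, \infty)$ which is convex and homogeneous of degree 1. Then $\Delta(\xi) = \delta\big(\|\xi_{\mathrm{sym}}\|_F, \|\xi_{\mathrm{anti}}\|_F\big)$ satisfies (3.1) for $\mathcal{S} = SO(d)$.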


Proof. We simply use the fact that $\operatorname{Ad}_R$ leaves invariant the splitting $\mathbb R^{d\times d} = \operatorname{Sym}(d) \oplus \operatorname{Anti}(d)$.

We consider the special case that $\Delta$ is given by a scalar product, i.e.,
$$\Delta_B(\xi) = \langle B\xi, \xi\rangle^{1/2} \ge \gamma\|\xi\|_F, \tag{4.1}$$
where $B : \mathfrak g \to \mathfrak g^*$ is symmetric and positive definite ($\gamma > 0$). This case is equivalent to assuming that $G$ is equipped with a left-invariant Riemannian metric. From standard invariant theory it follows that for every symmetric operator $B$ which is Ad-invariant, i.e., $\Delta_B(\operatorname{Ad}_R \xi) = \Delta_B(\xi)$ for $R \in SO(d)$ and $\xi \in \mathbb R^{d\times d}$, there exist real numbers $\gamma_1$, $\gamma_2$ and $\gamma_3$ such that
$$B\xi = \gamma_1 \operatorname{trace}(\xi)\, I + \gamma_2\, \xi_{\rm sym} + \gamma_3\, \xi_{\rm anti}. \tag{4.2}$$
Here positive definiteness is equivalent to $d\gamma_1 + \gamma_2 > 0$, $\gamma_2 > 0$ and $\gamma_3 > 0$.

Moreover, for $\xi \in \mathfrak g = \mathfrak{sl}(d)$ we have $\operatorname{trace}\xi = 0$ and hence $\gamma_1$ is irrelevant. On $\mathfrak g = \mathfrak{sl}(d)$ we have
$$\Delta_B(\xi) = \big(\gamma_2 \|\xi_{\rm sym}\|_F^2 + \gamma_3 \|\xi_{\rm anti}\|_F^2\big)^{1/2}. \tag{4.3}$$
Note that the plastic flow rule gives $\dot P = P(\xi_{\rm sym} + \xi_{\rm anti})$, such that $\xi_{\rm sym}$ corresponds to plastic stretching whereas $\xi_{\rm anti}$ corresponds to plastic spin. The controversy about the relevance of plastic spin can be rephrased in our setting in terms of the relative size of $\gamma_3$ with respect to $\gamma_2$. We will discuss this question after the construction of the geodesic curves.
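The invariance (3.1) of the von Mises density (4.3) can be checked numerically. The following is a minimal sketch, assuming $d = 3$, illustrative values $\gamma_2 = 2$ and $\gamma_3 = 0.5$, and a randomly chosen traceless $\xi$; the helper names (`delta_B`, `adjoint`, `rot_z`, `rot_x`) are hypothetical and not taken from the text.

```python
import math
import random


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius(a):
    return math.sqrt(sum(x * x for row in a for x in row))


def sym_anti(xi):
    """Split xi into its symmetric and antisymmetric parts."""
    n = len(xi)
    sym = [[0.5 * (xi[i][j] + xi[j][i]) for j in range(n)] for i in range(n)]
    anti = [[0.5 * (xi[i][j] - xi[j][i]) for j in range(n)] for i in range(n)]
    return sym, anti


def delta_B(xi, gamma2, gamma3):
    """Dissipation density (4.3) on sl(d)."""
    sym, anti = sym_anti(xi)
    return math.sqrt(gamma2 * frobenius(sym) ** 2 + gamma3 * frobenius(anti) ** 2)


def adjoint(R, xi):
    """Ad_R xi = R xi R^{-1}, using R^{-1} = R^T for a rotation R."""
    return matmul(matmul(R, xi), transpose(R))


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


# Illustrative data (assumption): a random element of sl(3) and one rotation.
random.seed(0)
xi = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
trace = sum(xi[i][i] for i in range(3))
for i in range(3):
    xi[i][i] -= trace / 3.0  # remove the trace so that xi lies in sl(3)

R = matmul(rot_z(0.7), rot_x(-1.1))
value_before = delta_B(xi, 2.0, 0.5)
value_after = delta_B(adjoint(R, xi), 2.0, 0.5)
print(value_before, value_after)
```

The two printed values should agree up to rounding, because conjugation by a rotation maps symmetric matrices to symmetric ones and antisymmetric matrices to antisymmetric ones without changing their Frobenius norms.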

4.2 Single-Crystal Plasticity

In single crystals plasticity occurs through movements of dislocations, and the plastic strain $F_p = P^{-1}$ can be thought of as atomic disarrangements. The motion of dislocations is viewed to take place on $n$ crystallographically determined slip systems $S^\alpha = d^\alpha \otimes n^\alpha$, $\alpha = 1, \dots, n$, where $n^\alpha$ is the unit normal vector to the $\alpha$th slip plane and $d^\alpha$ is the associated slip direction with $|d^\alpha| = 1$ and $d^\alpha \cdot n^\alpha = 0$. The plastic flow then takes the form (see Gurtin [2000])
$$\dot F_p = \Big(\sum_{\alpha=1}^n \nu_\alpha S^\alpha\Big) F_p,$$
where $\nu_\alpha \ge 0$ is the slip rate of the slip system $S^\alpha$ (negative slip rates are realized here by using the negative slip system $-S^\alpha$). The crystal symmetry group $\mathcal S \subset O(d)$ associates a permutation $\pi_R \in \operatorname{Perm}(n)$ to each $R \in \mathcal S$ such that
$$\pi_{R \circ \widetilde R} = \pi_R \circ \pi_{\widetilde R} \quad\text{and}\quad \big(R d^\alpha, R n^\alpha\big) = \big(d^{\pi_R(\alpha)}, n^{\pi_R(\alpha)}\big) \iff S^{\pi_R(\alpha)} = R S^\alpha R^T \quad\text{for } \alpha = 1, \dots, n.$$


For our plastic director $P = F_p^{-1}$ we obtain $\dot P = -P \sum_{\alpha=1}^n \nu_\alpha S^\alpha$, which shows that
$$\operatorname{span}\{S^\alpha : \alpha = 1, \dots, n\} \subset \mathfrak g = T_I G.$$
This generates in a natural way the associated Lie group via the smallest Lie algebra $\mathfrak g$ containing all the slip systems. From $d^\alpha \cdot n^\alpha = 0$ we know $\operatorname{trace}(S^\alpha) = 0$ and hence $\mathfrak g \subset \mathfrak{sl}(d)$ and $G \subset SL(d)$. According to [Gurtin, 2000, Section 4.2], the dissipation associated to $\xi = -\sum_{\alpha=1}^n \nu_\alpha S^\alpha$ is given by $\sum_{\alpha=1}^n \kappa_\alpha |\nu_\alpha|$ with positive constants $\kappa_\alpha$ satisfying the material symmetry relation $\kappa_{\pi_R(\alpha)} = \kappa_\alpha$ for all $R \in \mathcal S$. Since in general the coefficients $\nu_\alpha$ are not uniquely determined by $\xi$ we define
$$\Delta(\xi) = \min\Big\{ \sum_{\alpha=1}^n \kappa_\alpha |\nu_\alpha| \ :\ \xi = -\sum_{\alpha=1}^n \nu_\alpha S^\alpha \Big\}.$$
To match this construction with our abstract theory we need to show $r_1\|\xi\| \le \Delta(\xi) \le r_2\|\xi\|$ for all $\xi \in \mathfrak g$. The lower estimate is trivial with $r = \min_\alpha \kappa_\alpha$, since
$$\|\xi\|_F \le \sum_{\alpha=1}^n \|S^\alpha\|_F\, |\nu_\alpha| = \sum_{\alpha=1}^n |\nu_\alpha| \le (1/r)\, \Delta(\xi).$$
The upper estimate holds if and only if $\mathfrak g$ equals the span of all $S^\alpha$. The latter condition is nontrivial, as is seen for the case $d = 3$ and $n = 2$ with $S^1 = e_1 \otimes e_2$ and $S^2 = e_2 \otimes e_3$. Clearly, $\operatorname{span}\{S^1, S^2\}$ is two-dimensional, but the smallest Lie algebra $\mathfrak g$ containing this span is $\operatorname{span}\{S^1, S^2, S^3\}$ with $S^3 = [S^1, S^2] = S^1 S^2 - S^2 S^1 = e_1 \otimes e_3$.
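The closing example can be reproduced with a few lines of integer arithmetic. This is a small sketch with hypothetical helper names (`outer`, `bracket`); it computes the commutator of the two slip systems and checks that adding $S^3$ already closes the span under the Lie bracket.

```python
def outer(u, v):
    """Dyadic product u (x) v as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
S1 = outer(e1, e2)
S2 = outer(e2, e3)
S3 = bracket(S1, S2)

zero = [[0, 0, 0] for _ in range(3)]
print("S3 =", S3)
print("[S1, S3] =", bracket(S1, S3))
print("[S2, S3] =", bracket(S2, S3))
# The (1,3) entry vanishes for every combination a*S1 + b*S2 but not for S3,
# so S3 is not contained in span{S1, S2}.
print("entries (1,3):", S1[0][2], S2[0][2], S3[0][2])
```

Since the brackets of $S^3$ with $S^1$ and $S^2$ vanish, no further generators appear and the generated Lie algebra is three-dimensional.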

5 The Metric and Geodesics on G

To formulate the elastoplastic problem in Section 2 we used the distance metric $\widehat D$ on $G$, which is induced by the (nonsymmetric) norm $\Delta$ on $\mathfrak g$ via the left-invariant Finsler–Minkowski metric $\widehat\Delta(P, \dot P) = \Delta\big(DL_P^{-1}\dot P\big) = \Delta\big(P^{-1}\dot P\big)$:
$$\widehat D_\Delta(P_0, P_1) = \inf\big\{ I_\Delta(P(\cdot)) \ :\ P \in C^1([0,1], G),\ P(0) = P_0,\ P(1) = P_1 \big\} \quad\text{with}\quad I_\Delta(P(\cdot)) = \int_0^1 \Delta\big(DL_{P(t)}^{-1} \dot P(t)\big)\, dt. \tag{5.1}$$
As a short notation we introduce
$$D_\Delta : G \to [0,\infty),\qquad P \mapsto \widehat D_\Delta(I, P), \tag{5.2}$$
such that by left-invariance $\widehat D_\Delta(P_0, P_1) = D_\Delta(P_0^{-1} P_1)$. If $\Delta(-\xi) = \Delta(\xi)$ for all $\xi \in \mathfrak g$ then $D_\Delta(P^{-1}) = D_\Delta(P)$.

In general we cannot expect that there is a $C^1$ function such that the infimum is attained. We also have to take into account curves with corners, namely $P \in W^{1,\infty}([0,1], G) = C^{\rm Lip}([0,1], G)$ where $\dot P \in L^\infty(0,1)$. A curve $P \in C^{\rm Lip}([0,1], G)$ is called a shortest path from $P(0)$ to $P(1)$ (with respect to $\Delta$) if $I_\Delta(P) = \widehat D_\Delta(P(0), P(1))$. A curve $P \in C^{\rm Lip}((t_1, t_2), G)$ is called a geodesic curve (with respect to $\Delta$) if for each $t \in (t_1, t_2)$ there exists $\varepsilon > 0$ such that $P|_{[t-\varepsilon, t+\varepsilon]}$ is a shortest path. Every part $P|_{[\tau_1, \tau_2]}$ of a shortest path is again a shortest path. Moreover, shortest paths are in general not unique.

Example.

Consider the Lie group $G = \{P = \operatorname{diag}(a, b) \in GL(2) : a, b > 0\}$ and the left-invariant metric $\widehat\Delta(P, \dot P) = |\dot a|/a + |\dot b|/b$. Then $D_\Delta(\operatorname{diag}(a, b)) = |\log a| + |\log b|$, and any curve $t \mapsto \operatorname{diag}(a(t), b(t))$ is a shortest path if $a : [0,1] \to \mathbb R$ and $b : [0,1] \to \mathbb R$ are monotone. A curve is a geodesic if and only if $a$ and $b$ are locally monotone. In particular, nonmonotone $a$ or $b$ is admissible if different monotonicity regions are separated by intervals of constant values. E.g. $(a(t), b(t)) = (\max\{1, |t|\}, 1)$ is a geodesic over $t \in \mathbb R$.
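The role of monotonicity in this example can be illustrated numerically. The sketch below assumes the target point $\operatorname{diag}(3, 0.5)$ and two hypothetical test paths; on each subinterval where $a$ is monotone, the integral of $|\dot a|/a$ equals $|\log a(t_1) - \log a(t_0)|$, so the length is approximated by summing such increments on a fine grid.

```python
import math


def path_length(a, b, n=20000):
    """Approximate the integral of |a'|/a + |b'|/b over [0, 1].

    The sum of |log a(t_{k+1}) - log a(t_k)| is the total variation of
    log a on the grid, and likewise for b.
    """
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        total += abs(math.log(a(t1)) - math.log(a(t0)))
        total += abs(math.log(b(t1)) - math.log(b(t0)))
    return total


# Both paths run from diag(1, 1) to diag(3, 0.5) (illustrative assumption).
def a_monotone(t):
    return 1.0 + 2.0 * t


def a_detour(t):
    # Overshoots the target value 3 and comes back, hence not monotone.
    return 1.0 + 2.0 * t + 3.0 * math.sin(math.pi * t)


def b_monotone(t):
    return 1.0 - 0.5 * t


distance = abs(math.log(3.0)) + abs(math.log(0.5))
direct = path_length(a_monotone, b_monotone)
detour = path_length(a_detour, b_monotone)
print(distance, direct, detour)
```

The monotone path reproduces $|\log a| + |\log b|$, while the overshooting path is strictly longer, in line with the characterization of shortest paths above.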

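The next example, which has been mentioned to the author by D. Mittenhuber (Mittenhuber [2000]), shows that shortest paths for left-invariant Finsler metrics in general must have corners. Here the notion of a corner means that the function $P : [0,1] \to G$ is not $C^1$ when the parametrization is such that $\Delta\big(DL_{P(t)}^{-1}\dot P(t)\big) \ge c > 0$. The existence of nonsmooth shortest paths might indicate that certain experimental facts concerning the switching between different slip systems can be modelled easily using Finsler metrics. This topic will be pursued in future research.

Example. Consider
$$G = \Big\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \in GL(2) \ :\ a > 0,\ b \in \mathbb R \Big\}$$
with the associated Lie algebra
$$\mathfrak g = \Big\{ \begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix} \in \mathfrak{gl}(2) \ :\ \alpha, \beta \in \mathbb R \Big\}$$
and the left-invariant Finsler metric induced by
$$\Delta \begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix} = |\alpha| + |\beta|.$$
Writing $P = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$, the Finsler metric reads $\Delta\big(DL_P^{-1}\dot P\big) = (|\dot a| + |\dot b|)/a$. To find the shortest path between $P_0 = I$ and $P_1 = \begin{pmatrix} a_1 & b_1 \\ 0 & 1 \end{pmatrix}$ we have to minimize
$$\int_0^1 \Big( \frac{|\dot a(t)|}{a(t)} + \frac{|\dot b(t)|}{a(t)} \Big)\, dt \quad\text{subject to}\quad \begin{pmatrix} a(0) \\ b(0) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \ \text{and}\ \begin{pmatrix} a(1) \\ b(1) \end{pmatrix} = \begin{pmatrix} a_1 \\ b_1 \end{pmatrix}.$$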

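The integral over the first term gives the total variation of $\log a(\cdot)$ and is bounded from below by $\log a_{\max} + \log(a_{\max}/a_1)$, where $a_{\max} = \max\{a(t) : t \in [0,1]\}$. The integral over the second term is bounded from below by $|b(1) - b(0)|/a_{\max} = |b_1|/a_{\max}$. In fact, these lower bounds are attained by the path
$$\big(a(t), b(t)\big) = \begin{cases} \big(1 + 3t(a_{\max} - 1),\ 0\big) & \text{for } t \in [0, \tfrac13], \\ \big(a_{\max},\ (3t-1)\, b_1\big) & \text{for } t \in [\tfrac13, \tfrac23], \\ \big(a_{\max} + (a_1 - a_{\max})(3t - 2),\ b_1\big) & \text{for } t \in [\tfrac23, 1]. \end{cases}$$
The shortest path is now obtained by minimizing with respect to $a_{\max} \ge \max\{1, a_1\}$. This gives
$$D_\Delta \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} = \min\Big\{ \log\frac{m^2}{a} + \frac{|b|}{m} \ :\ m \ge \max\{1, a\} \Big\} = \begin{cases} \log\frac1a + |b| & \text{for } a \in (0,1] \text{ and } |b| \le 2, \\ \log a + |b|/a & \text{for } a \ge \max\{1, |b|/2\}, \\ \log\big(b^2/(4a)\big) + 2 & \text{for } |b| \ge 2 \text{ and } a \le |b|/2. \end{cases}$$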
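The three-case formula can be compared with a brute-force minimization over $m = a_{\max}$. The following is a rough sketch; the function names, the search interval $[\max\{1,a\}, 50]$ and the grid size are assumptions made for illustration only.

```python
import math


def dist_closed_form(a, b):
    """Three-case formula for D_Delta of the matrix [[a, b], [0, 1]]."""
    b = abs(b)
    if b >= 2.0 and a <= b / 2.0:
        return math.log(b * b / (4.0 * a)) + 2.0
    if a <= 1.0:
        return math.log(1.0 / a) + b      # in this branch |b| <= 2
    return math.log(a) + b / a            # in this branch a >= max(1, |b|/2)


def dist_by_search(a, b, m_upper=50.0, n=20000):
    """Minimize log(m^2/a) + |b|/m over a grid of m >= max(1, a)."""
    m_lower = max(1.0, a)
    best = float("inf")
    for k in range(n + 1):
        m = m_lower + (m_upper - m_lower) * k / n
        best = min(best, math.log(m * m / a) + abs(b) / m)
    return best


# One sample point for each of the three cases (illustrative values).
samples = [(0.5, 1.0), (3.0, 2.0), (1.0, 6.0)]
for a, b in samples:
    print(a, b, dist_closed_form(a, b), dist_by_search(a, b))
```

The unconstrained minimizer of $\log(m^2/a) + |b|/m$ is $m = |b|/2$; it is admissible only in the third case, whereas in the first two cases the minimum is attained at the boundary value $m = \max\{1, a\}$.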

In particular, we see that the shortest path is unique (up to reparametrization). Hence, almost all shortest paths have one or two corners. (Note that $\dot a\, \dot b \equiv 0$ along all shortest paths.)

For the general case we are able to estimate the dissipation metric from above via
$$D_\Delta(P) \le \sum_{i=1}^n \Delta(\xi_i) \quad\text{if } P = e^{\xi_1} \cdots e^{\xi_n}.$$
(Let $P(t) = e^{\xi_1} \cdots e^{\xi_k} e^{(nt-k)\xi_{k+1}}$ for $t \in [k/n, (k+1)/n]$, and then use that $P(t)^{-1}\dot P(t) = n\,\xi_{k+1}$.) In fact, we may approximate $\widehat D_\Delta(I, P)$ as well as we like by increasing $n$ and optimizing $\xi_1, \dots, \xi_n$. The simple estimate $D_\Delta(e^\xi) \le \Delta(\xi)$ is of restricted use, since the (Lie algebraic) exponential map $\xi \mapsto e^\xi$ from $\mathfrak g$ into $G$ is not surjective in general. For a surjective geodesic exponential map $M_\delta : \mathfrak g \to G$, see Corollary 6.2. Of course we have the local expansion $D_\Delta(e^\xi) = \Delta(\xi) + O(\|\xi\|^2)$ for $\xi \to 0$.

Next we show that there always exists a shortest path between any two points on $G$. After that we use the left-invariance and a nonsmooth version of Noether's theorem to characterize shortest paths.

5.1 Theorem. For any $P_1 \in G$ there exists a shortest path $P(\cdot) \in C^{\rm Lip}([0,1], G)$ from $I$ to $P_1$.

Proof. We need to find $P \in C^{\rm Lip}([0,1], G)$ with $P(0) = I$, $P(1) = P_1$ and
$$I_\Delta(P(\cdot)) = \inf\big\{ I_\Delta(Q) \ :\ Q \in C^1([0,1], G),\ Q(0) = I,\ Q(1) = P_1 \big\}. \tag{5.3}$$
This just means that the infimum is in fact a minimum if the set of candidates is increased from $C^1$ to $C^{\rm Lip}$. The right-hand side in (5.3) is by definition $D_\Delta(P_1)$, and we can choose an infimizing sequence $Q^{(k)}$ in $C^1([0,1], G)$ with $Q^{(k)}(0) = I$, $Q^{(k)}(1) = P_1$ and $\delta_k := I_\Delta(Q^{(k)}) \to D_\Delta(P_1)$. We now use the rate-independency of the integral and reparametrise the time variable. Let
$$\tau^{(k)}(t) = c_k \int_0^t \Big[ 1 + \widehat\Delta\big(Q^{(k)}(s), \dot Q^{(k)}(s)\big) \Big]\, ds$$

with $c_k = 1/(1 + \delta_k)$; then $\tau^{(k)} \in C^1([0,1], \mathbb R)$ satisfies
$$\tau^{(k)}(0) = 0,\qquad \tau^{(k)}(1) = 1 \qquad\text{and}\qquad \dot\tau^{(k)}(t) \ge c_k > 0.$$
Hence, the inverse function $t = T^{(k)}(\tau)$ exists and we can set $P^{(k)}(\tau) = Q^{(k)}\big(T^{(k)}(\tau)\big)$. Clearly, $P^{(k)} \in C^1([0,1], G)$ with
$$P^{(k)}(0) = I,\qquad P^{(k)}(1) = P_1, \qquad\text{and}\qquad I_\Delta(P^{(k)}) = \delta_k$$
by rate-independency. Thus, $(P^{(k)})_{k\in\mathbb N}$ is also an infimizing sequence. Additionally we know
$$\widehat\Delta\big(P^{(k)}(\tau), \dot P^{(k)}(\tau)\big) = \dot T^{(k)}(\tau)\, a^{(k)}(\tau) = \frac{a^{(k)}(\tau)}{c_k\big(1 + a^{(k)}(\tau)\big)} \le 1 + \delta_k,$$
where $a^{(k)}(\tau) = \widehat\Delta\big(Q^{(k)}(T^{(k)}(\tau)), \dot Q^{(k)}(T^{(k)}(\tau))\big)$. This new infimizing sequence has the advantage that
$$\xi^{(k)} := \Big( t \mapsto DL_{P^{(k)}(t)}^{-1} \dot P^{(k)}(t) \Big)$$
is bounded in the linear space $L^\infty((0,1), \mathfrak g)$. From
$$P^{(k)}(t) = I + \int_0^t P^{(k)}(s)\, \xi^{(k)}(s)\, ds \tag{5.4}$$
we conclude that $P^{(k)}$ is bounded in the linear space $C^1([0,1], \mathbb R^{d\times d})$. Thus, we may extract a subsequence $(k_l)_{l\in\mathbb N}$ such that $(P^{(k_l)})_{l\in\mathbb N}$ converges uniformly on $[0,1]$ to a limit $P \in C^{\rm Lip}([0,1], \mathbb R^{d\times d})$ and $(\xi^{(k_l)})_{l\in\mathbb N}$ converges weak* to a limit $\xi \in L^\infty((0,1), \mathfrak g)$ with $\|\xi\|_\infty \le 1 + D_\Delta(P_1)$ (or weakly in $L^2((0,1), \mathfrak g)$). Clearly, the limits $P$ and $\xi$ satisfy (5.4) as well, and thus $P \in W^{1,\infty}([0,1], G)$.

From standard theorems in the calculus of variations (cf. Ball [1977], Dacorogna [1989]) we know that convexity in $\dot P$ implies weak lower semicontinuity, and we conclude
$$I_\Delta(P) \le \liminf_{l\to\infty} I_\Delta\big(P^{(k_l)}\big) = \liminf_{l\to\infty} \delta_{k_l} = D_\Delta(P_1).$$
This proves the result.

We now want to characterize shortest paths by deriving suitable differential equations like those for geodesic curves in Riemannian geometry. To motivate the analysis we first assume that $\Delta : \mathfrak g \to [0,\infty)$ is a $C^2$-function (and no longer homogeneous of degree 1). The Euler–Lagrange equation associated to the functional $I_\Delta(P)$ is given by
$$\frac{d}{dt}\Big[ D_{\dot P}\widehat\Delta(P, \dot P) \Big] - D_P\widehat\Delta(P, \dot P) = 0. \tag{5.5}$$
In matrix notation we have
$$D_P\widehat\Delta(P, \dot P) = -P^{-T}\, D\Delta\big(P^{-1}\dot P\big)\, \dot P^T P^{-T} \in T_P^* G \qquad\text{and}\qquad D_{\dot P}\widehat\Delta(P, \dot P) = P^{-T}\, D\Delta\big(P^{-1}\dot P\big).$$
Thus, it is possible to write out the equation explicitly. To obtain a proper formulation in Lie group notation it is advantageous to use the Hamiltonian form on $T^*G$ instead of the Lagrangian form on $TG$. Define the conjugate variables $(P, Q) \in T^*G$ via $Q = D_{\dot P}\widehat\Delta(P, \dot P)$ and the Hamiltonian $\widehat H(P, Q) = \langle Q, \dot P\rangle - \widehat\Delta(P, \dot P)$, where $\dot P$ has to be eliminated. In our special case denote by $f : \mathfrak g^* \to \mathfrak g$ the inverse of the function $D\Delta : \mathfrak g \to \mathfrak g^*$ and obtain
$$\dot P = DL_P\, f\big(DL_P^{-*} Q\big), \quad\text{where } DL_P^{-*} = \big(DL_P^{-1}\big)^*, \qquad \widehat H(P, Q) = H\big(DL_P^{-*} Q\big)$$
with $H(\eta) = \sup\{ \langle\eta, \xi\rangle - \Delta(\xi) : \xi \in \mathfrak g \}$. We have $f(\eta) = DH(\eta)$, and the Hamiltonian equations which are equivalent to (5.5) are, in matrix notation,
$$\dot P = P\, DH\big(P^T Q\big), \qquad \dot Q = -Q\, \big(DH(P^T Q)\big)^T. \tag{5.6}$$


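To obtain a simple Lie algebraic formulation of the left-invariant system we introduce left-invariant coordinates $(P, \eta) \in T_0 G := G \times \mathfrak g^*$ via $\eta = DL_P^{-*} Q$. Moreover, we need the Lie–Poisson structure $J_{LP}$ on $\mathfrak g^*$ with $J_{LP}(\eta) \in \operatorname{Skew}(\mathfrak g, \mathfrak g^*)$ and
$$\langle J_{LP}(\eta)\xi_1, \xi_2\rangle = \langle \eta, [\xi_1, \xi_2]\rangle \quad\text{for all } \xi_1, \xi_2 \in \mathfrak g,$$
where $[\xi_1, \xi_2]$ denotes the Lie bracket on $\mathfrak g$. According to Abraham and Marsden [1978], (5.5) is equivalent to
$$\dot P = DL_P\, DH(\eta), \qquad \dot\eta = J_{LP}(\eta)\, DH(\eta). \tag{5.7}$$
Clearly, the Hamiltonian $\widehat H$ is constant along solutions, namely
$$\frac{d}{dt} H\big(DL_{P(t)}^{-*} Q(t)\big) \equiv 0.$$
Moreover, the Hamiltonian system is invariant under the $G$-action $(\tau_G)_{G\in G}$ with $\tau_G : (P, Q) \mapsto (L_G P, DL_G^{-*} Q)$. By Noether's theorem (cf. Abraham and Marsden [1978]; Mielke [1991]) it follows that the co-adjoint action generates the associated first integrals. Define the momentum map
$$J : T^*G \to \mathfrak g^*, \qquad (P, Q) \mapsto \operatorname{Ad}^*_{P^{-1}}\big(DL_P^* Q\big) = Q P^T;$$
then for all solutions we have $J(P(t), Q(t)) \equiv \text{const.}$, see Abraham and Marsden [1978]. Here we used the coadjoint action $\operatorname{Ad}^*$ of $G$ on $\mathfrak g^*$, which is defined via
$$\langle \operatorname{Ad}^*_G \eta, \xi\rangle = \langle \eta, \operatorname{Ad}_G \xi\rangle = \langle \eta, DL_G\, DR_G^{-1}\, \xi\rangle \quad\text{for all } \xi, \eta.$$
Note that in matrix notation we have $\operatorname{Ad}^*_G \eta = \mathcal P\big(G^T \eta\, G^{-T}\big)$, where $\mathcal P : \mathbb R^{d\times d} \to \mathfrak g^*$ is the projection defined via $\operatorname{trace}\big((\mathcal P\eta)\xi^T\big) = \operatorname{trace}\big(\eta\xi^T\big)$ for all $\xi \in \mathfrak g$. Letting $J(P(t), Q(t)) = \eta \in \mathfrak g^*$ and reintroducing $\dot P$ by elimination of $Q$ we find the "half-integrated" form of the Euler–Lagrange equation for left-invariant densities $\Delta$:
$$D\Delta\big(DL_P^{-1}\dot P\big) = \operatorname{Ad}^*_P \eta \quad\text{with a constant } \eta \in \mathfrak g^*. \tag{5.8}$$
In the Hamiltonian context this can be rewritten as
$$\dot P = DL_P\, DH\big(\operatorname{Ad}^*_P \eta\big) \quad\text{with a constant } \eta \in \mathfrak g^*. \tag{5.9}$$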


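The generalization of the ODEs for the stationary curves of the functional $I_\Delta(P)$ to the case of nonsmooth $\Delta$ involves the subdifferential $\partial\Delta(\xi)$ as defined in (2.10). Recall
$$Q = \partial\Delta(0) \subset \mathfrak g^* \qquad\text{and}\qquad \partial\Delta(\xi) = \{ \eta \in Q \ :\ \Delta(\xi) = \langle \eta, \xi\rangle \}$$
from (2.14). We derive integrated forms (5.8) and (5.9) for shortest paths by directly adapting the proof of Noether's theorem to subdifferential techniques.

5.2 Theorem. Let $P : [0,1] \to G$ be a shortest path between $P(0)$ and $P(1)$ with $P(0) \ne P(1)$. Then there exists a nonzero $\eta \in \mathfrak g^*$ such that
$$\operatorname{Ad}^*_{P(t)} \eta \in \partial\Delta\big(DL_{P(t)}^{-1}\dot P(t)\big) \quad\text{for a.e. } t \in [0,1]. \tag{5.10}$$
Equivalently, we have $DL_P^{-1}\dot P \in \partial\mathcal X_Q\big(\operatorname{Ad}^*_P \eta\big) = N_{\operatorname{Ad}^*_P \eta} Q$, where $N_\eta Q$ denotes the normal cone, cf. (2.11).

Proof. We choose $\xi \in \mathfrak g$ and $\rho \in C_c^\infty((0,1), \mathbb R)$, and define $Q_\rho(t) = e^{\rho(t)\xi} P(t)$. From $\rho(0) = \rho(1) = 0$ and the fact that $P$ is a shortest path we conclude $I_\Delta(Q_\rho) \ge I_\Delta(P) = D_\Delta\big(P(0)^{-1}P(1)\big)$. Moreover, we find
$$DL_{Q_\rho}^{-1}\dot Q_\rho = DL_{P(t)}^{-1}\big( \dot\rho(t)\, \xi P(t) + \dot P(t) \big) = \dot\rho\, \kappa(t) + \Xi(t)$$
with $\kappa(t) = \operatorname{Ad}_{P(t)}^{-1} \xi$. For fixed $\rho$ consider the functional $K_\rho : \mathfrak g \to [0,\infty)$ defined by
$$K_\rho(\xi) := \int_0^1 \Delta\big( \dot\rho(t) \operatorname{Ad}_{P(t)}^{-1} \xi + \Xi(t) \big)\, dt. \tag{5.11}$$
This functional is convex and by our assumption it attains the global minimum in $\xi = 0$. This implies $0 \in \partial K_\rho(0) \subset \mathfrak g^*$. Let $f(t, \xi) = \Delta\big(\Xi(t) + \dot\rho(t) \operatorname{Ad}_{P(t)}^{-1}\xi\big)$; then $\partial f(t, 0) = \dot\rho(t) \operatorname{Ad}_{P(t)}^{-*} \partial\Delta(\Xi(t))$ and Lemma A.1 below gives
$$\partial K_\rho(0) = \Big\{ \int_0^1 \dot\rho(t)\, \eta(t)\, dt \ :\ \eta(t) \in \operatorname{Ad}^*_{P(t)^{-1}} \partial\Delta(\Xi(t)) \text{ a.e.} \Big\}.$$
From $0 \in \partial K_\rho(0)$ for all $\rho \in W_0^{1,\infty}((0,1), \mathbb R)$ and Lemma A.2 below we conclude that
$$E = \operatorname*{ess\,\bigcap}_{t \in (0,1)} \operatorname{Ad}_{P(t)}^{-*} \partial\Delta(\Xi(t)) \subset \mathfrak g^*$$
is nonempty. However, for any $\eta \in E$ equation (5.10) is satisfied, as $\Xi = DL_P^{-1}\dot P$. Clearly, $\eta = 0$ would imply $\dot P(t) = 0$ for a.e. $t \in [0,1]$. Hence, we conclude $\eta \ne 0$.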


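This theorem provides us with a sufficient condition for the differentiability of shortest paths and hence of geodesic curves. If the boundary $\partial Q$ of $Q$ is of class $C^1$, then the normal cones $N_\eta Q$ are one-dimensional rays $\{\lambda N(\eta) : \lambda \ge 0\}$, where $N : \partial Q \to \{\xi \in \mathfrak g : \|\xi\| = 1\}$ is continuous. Thus, up to time reparametrization (5.10) is equivalent to $\dot P = DL_P\, N\big(\operatorname{Ad}^*_P \eta\big)$, whose solutions are continuously differentiable in time.

5.3 Corollary. Let $\partial Q$ be of class $C^1$ (or, equivalently, let $\{\xi \in \mathfrak g : \Delta(\xi) \le 1\}$ be strictly convex); then all geodesic curves are $C^1$ functions in time.

Example. To see the impact of this result we return to Example 5, where the Abelian Lie group $G = \{\operatorname{diag}(\alpha, \beta) \in GL(2) : \alpha, \beta > 0\}$ is considered with $\Delta(\xi) = |\xi_1| + |\xi_2|$, where $\xi = (\xi_1, \xi_2) \in \mathfrak g = \mathbb R^2$. We have $\partial\Delta(\xi) = \big(\operatorname{Sign}(\xi_1), \operatorname{Sign}(\xi_2)\big)$, where $\operatorname{Sign}(\xi_1) = +1$, $[-1, 1]$, and $-1$ for $\xi_1 > 0$, $\xi_1 = 0$, and $\xi_1 < 0$, respectively. Using $\operatorname{Ad}^*_P(\eta) = \eta$, as $G$ is Abelian, equation (5.10) takes the form
$$\eta = \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} \in \begin{pmatrix} \operatorname{Sign}\big(\dot\alpha(t)/\alpha(t)\big) \\ \operatorname{Sign}\big(\dot\beta(t)/\beta(t)\big) \end{pmatrix} \quad\text{for fixed } \eta \in \mathbb R^2.$$
This clearly holds if and only if $\dot\alpha$ and $\dot\beta$ do not change sign. Then only the cases $\eta = (\pm1, \pm1)$ are relevant.

We might consider solutions of (5.10) as generalized geodesic curves. It is unclear whether they are real geodesic curves, since the usual argument in Riemannian geometry involves strict convexity of $\widehat\Delta(P, \dot P)$ in $\dot P$, which is certainly not the case here. The next example shows that we have to expect nonsmooth curves as soon as $\Delta : \mathfrak g \setminus \{0\} \to [0,\infty)$ is not differentiable.

Example. Consider $G = GL(d)$ and $\Delta(\xi) = \|\xi_{\rm sym}\| + \|\xi_{\rm anti}\|$ such that $Q = \{\eta_{\rm sym} + \eta_{\rm anti} : \|\eta_{\rm sym}\| \le 1,\ \|\eta_{\rm anti}\| \le 1\}$. Using the Sign function with $\operatorname{Sign}(0) = \{\xi : \|\xi\| \le 1\}$ and $\operatorname{Sign}(\xi) = \xi/\|\xi\|$ for $\xi \ne 0$, we find the formula $\partial\Delta(\xi) = \operatorname{Sign}_{\rm sym}(\xi_{\rm sym}) + \operatorname{Sign}_{\rm anti}(\xi_{\rm anti})$. Now define the nonsmooth curve $P : [-1, 1] \to GL(d)$ via
$$P(t) = \begin{cases} Q(t)\Omega(t) = e^{(\sigma - \omega)t} e^{\omega t} & \text{for } t \le 0, \\ \Omega(t) = e^{\omega t} & \text{for } t \ge 0, \end{cases}$$
with $\omega = -\omega^T$, $\sigma = \sigma^T$ and $\|\omega\| = \|\sigma\| = 1$. For $t < 0$ we find
$$\partial\Delta\big(DL_P^{-1}\dot P\big) = \partial\Delta\big(\operatorname{Ad}^*_{\Omega(t)} \sigma\big) = \operatorname{Ad}^*_{\Omega(t)} \sigma + \operatorname{Sign}_{\rm anti}(0),$$
and for $t > 0$ we have
$$\partial\Delta\big(DL_P^{-1}\dot P\big) = \partial\Delta(\omega) = \operatorname{Sign}_{\rm sym}(0) + \omega.$$
For the left-hand side in (5.10) we obtain, with $\eta = \sigma + \omega$, for $t \le 0$ the relation
$$\operatorname{Ad}^*_{P(t)} \eta = \operatorname{Ad}^*_{\Omega(t)} \operatorname{Ad}^*_{Q(t)} (\sigma + \omega) = \operatorname{Ad}^*_{\Omega(t)} (\sigma + \omega) = \operatorname{Ad}^*_{\Omega(t)} \sigma + \omega,$$
and for $t \ge 0$ we have $\operatorname{Ad}^*_{P(t)} \eta = \operatorname{Ad}^*_{\Omega(t)} (\sigma + \omega) = \operatorname{Ad}^*_{\Omega(t)} \sigma + \omega$. Hence, we see that (5.10) is satisfied for $t \in [-1, 1] \setminus \{0\}$.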

6 Geodesics for von Mises Dissipation

In the case of von Mises plasticity one assumes isotropy of the dissipation functional on $G = SL(d)$. Moreover, it is assumed to be generated by a scalar product, see (4.1):
$$\Delta_B(\xi) = \langle B\xi, \xi\rangle^{1/2}.$$
The geodesic curves can be obtained in two ways. On the one hand we can use the theory of Riemannian metrics, where it is easy to show that the geodesic curves are obtained also by minimizing the energy
$$\int_0^1 \tfrac12\, \Delta\big(DL_{P(t)}^{-1}\dot P(t)\big)^2\, dt.$$
Then our smooth theory starting with (5.5) applies. For a more general Lie group approach we refer to Mittenhuber [2000]. On the other hand we can specialize the general nonsmooth theory to this situation. We have
$$\partial\Delta_B(0) = \{ \eta \in \mathfrak{sl}(d) \ :\ \langle \eta, B^{-1}\eta\rangle \le 1 \} \qquad\text{and}\qquad \partial\Delta_B(\xi) = \frac{1}{\langle B\xi, \xi\rangle^{1/2}}\, B\xi \quad\text{for } \xi \ne 0.$$
Thus, under the assumption $\dot P \ne 0$ the half-integrated equation (5.8) takes the form
$$\dot P = \lambda(t)\, DL_P\, B^{-1} \operatorname{Ad}^*_P \eta, \tag{6.1}$$
where $\eta \in \mathfrak g^*$ is fixed and $\lambda(t)$ is a positive scalar fixing the time rate. Without loss of generality we take $\lambda \equiv 1$ from now on. In matrix notation for $G = GL(d)$ or $G = SL(d)$ we have $\dot P = P\, B^{-1}\big(P^T \eta P^{-T}\big)$. In order to solve (6.1) we need additional information. In von Mises plasticity we can use the isotropy, which manifests itself in the material symmetry group $\mathcal S = SO(d)$ through
$$\Delta_B(\operatorname{Ad}_R \xi) = \Delta_B(\xi). \tag{6.2}$$
For $B$ this is equivalent to $B^{-1} \operatorname{Ad}^*_R = \operatorname{Ad}_{R^{-1}} B^{-1}$ for all $R \in SO(d)$. In the case $G = SL(d)$ this leads, according to (4.2), to
$$B^{-1}\eta = \frac{1}{\gamma_2}\eta_{\rm sym} + \frac{1}{\gamma_3}\eta_{\rm anti} = \frac{1}{2\gamma_2}\big(\eta + \eta^T\big) + \frac{1}{2\gamma_3}\big(\eta - \eta^T\big) \quad\text{with } \gamma_2, \gamma_3 > 0. \tag{6.3}$$

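6.1 Theorem. If $\Delta_B$ is isotropic (see (6.2) or (6.3)) then the unique solution of (6.1) with $\lambda \equiv 1$ and $P(0) = I$ is given by
$$P(t) = e^{t\alpha} e^{t\omega}, \quad\text{with } \alpha = \frac{1}{\gamma_2}\eta^T \text{ and } \omega = \Big(\frac{1}{\gamma_2} + \frac{1}{\gamma_3}\Big)\eta_{\rm anti}. \tag{6.4}$$

Proof. We make the ansatz $P(t) = e^{t\alpha} e^{t\omega}$ with $\omega = -\omega^T \in \mathfrak{so}(d)$. With $\operatorname{Ad}^*_{P_1 P_2} = \operatorname{Ad}^*_{P_2} \operatorname{Ad}^*_{P_1}$, we obtain $\dot P = e^{t\alpha}(\alpha + \omega)e^{t\omega}$ and
$$P\, B^{-1} \operatorname{Ad}^*_P \eta = P \operatorname{Ad}_{e^{-t\omega}} B^{-1} \operatorname{Ad}^*_{e^{t\alpha}} \eta = e^{t\alpha}\, B^{-1}\big( e^{t\alpha^T} \eta\, e^{-t\alpha^T} \big)\, e^{t\omega}.$$
Clearly, we have found a solution if
$$\alpha^T \eta = \eta\, \alpha^T \qquad\text{and}\qquad \alpha + \omega = B^{-1}\eta$$
holds. The second equation gives
$$\alpha_{\rm sym} = \frac{1}{\gamma_2}\eta_{\rm sym} \qquad\text{and}\qquad \alpha_{\rm anti} = \frac{1}{\gamma_3}\eta_{\rm anti} - \omega.$$
This reduces the first equation to
$$\eta\omega - \omega\eta = \Big(\frac{1}{\gamma_2} + \frac{1}{\gamma_3}\Big)\big( \eta_{\rm sym}\eta_{\rm anti} - \eta_{\rm anti}\eta_{\rm sym} \big)$$
and we find the desired result.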
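The explicit solution (6.4) can be tested against the matrix form $\dot P = P\,B^{-1}(P^T \eta P^{-T})$ of (6.1) with $\lambda \equiv 1$. The sketch below is only a numerical plausibility check under stated assumptions: $d = 3$, illustrative values $\gamma_2 = 2$, $\gamma_3 = 0.5$, an arbitrarily chosen traceless $\eta$, a plain Taylor series for the matrix exponential, and a central difference for $\dot P$. All function names are hypothetical.

```python
def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, c):
    return [[c * x for x in row] for row in a]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def expm(a, terms=40):
    """Truncated Taylor series; adequate for the small matrices used here."""
    result, term = eye(len(a)), eye(len(a))
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result


def sym_part(m):
    return scale(add(m, transpose(m)), 0.5)


def anti_part(m):
    return scale(sub(m, transpose(m)), 0.5)


def B_inv(eta, g2, g3):
    """Isotropic inverse operator (6.3)."""
    return add(scale(sym_part(eta), 1.0 / g2), scale(anti_part(eta), 1.0 / g3))


def exponents(eta, g2, g3):
    """alpha and omega from (6.4)."""
    alpha = scale(transpose(eta), 1.0 / g2)
    omega = scale(anti_part(eta), 1.0 / g2 + 1.0 / g3)
    return alpha, omega


def P_of(t, eta, g2, g3):
    alpha, omega = exponents(eta, g2, g3)
    return matmul(expm(scale(alpha, t)), expm(scale(omega, t)))


def P_inverse(t, eta, g2, g3):
    alpha, omega = exponents(eta, g2, g3)
    return matmul(expm(scale(omega, -t)), expm(scale(alpha, -t)))


def residual(t, eta, g2, g3, h=1e-5):
    """Largest entry of dP/dt - P B^{-1}(P^T eta P^{-T})."""
    P = P_of(t, eta, g2, g3)
    dP = scale(sub(P_of(t + h, eta, g2, g3), P_of(t - h, eta, g2, g3)),
               1.0 / (2.0 * h))
    coadjoint = matmul(matmul(transpose(P), eta),
                       transpose(P_inverse(t, eta, g2, g3)))
    rhs = matmul(P, B_inv(coadjoint, g2, g3))
    return max(abs(x) for row in sub(dP, rhs) for x in row)


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Illustrative traceless eta (assumption).
eta = [[0.3, 0.5, -0.2], [0.1, -0.4, 0.6], [0.7, -0.3, 0.1]]
for t in (0.0, 0.5, 1.0):
    print(t, residual(t, eta, 2.0, 0.5), det3(P_of(t, eta, 2.0, 0.5)))
```

The residual should be at the level of the finite-difference error, and the determinant should stay equal to 1 up to rounding, since both $\alpha$ and $\omega$ are traceless and hence $P(t) \in SL(3)$.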

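6.2 Corollary. Under the above assumptions we have
$$D_B\big(R^T P R\big) = D_B(P) \qquad\text{and}\qquad D_B(P) = \min\big\{ \langle B\xi, \xi\rangle^{1/2} \ :\ P = M_\delta(\xi) \big\} \tag{6.5}$$
with $\delta = \gamma_3/\gamma_2$ and
$$M_\delta : \mathfrak{sl}(d) \to SL(d); \qquad \xi \mapsto e^{\xi_{\rm sym} - \delta\xi_{\rm anti}}\, e^{(1+\delta)\xi_{\rm anti}}.$$
Here $M_\delta$ is a surjective map for $\delta > 0$. In particular, the limit $\gamma_3 \searrow 0$ (plastic spin is dissipationless) leads to the formula
$$D_{B_0}(P) = \gamma_2^{1/2}\, \big\| \tfrac12 \log\big(P P^T\big) \big\|_F.$$

Proof. For $\xi \in \mathfrak{sl}(d)$ let $P(t) = M_\delta(t\xi)$; then (6.4) holds with
$$\alpha = \xi_{\rm sym} - \delta\, \xi_{\rm anti} \qquad\text{and}\qquad \omega = (1 + \delta)\, \xi_{\rm anti}.$$
Moreover, we have $P^{-1}\dot P = e^{-t\omega}(\alpha + \omega)e^{t\omega}$. Hence
$$I_{\Delta_B}(P(\cdot)) = \Delta_B(\alpha + \omega) = \big( \gamma_2\|\xi_{\rm sym}\|_F^2 + \gamma_3\|\xi_{\rm anti}\|_F^2 \big)^{1/2}.$$
As we know by Theorem 5.1 there exists a shortest path. By the above this path has to have the form $P(t) = M_\delta(t\xi)$ and thus (6.5) follows. The special case $\delta = 0$ follows since $M_\delta(\xi)$ depends continuously on $\delta \in \mathbb R$ with $M_0(\xi) = e^{\xi_{\rm sym}} e^{\xi_{\rm anti}}$. Thus, the equation $P = e^{\xi_{\rm sym}} e^{\xi_{\rm anti}}$ has a unique solution for $\xi_{\rm sym}$ from $P P^T = e^{2\xi_{\rm sym}}$. (We use the unique decomposition $P = VR$ with $V = V^T > 0$ and $R \in SO(d)$, also called Cartan decomposition.) Here $\xi_{\rm anti}$ is unique up to solutions of $e^\xi = I$.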


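In many crystal experiments the plastic spin appears to be very small. Hence, the case $\gamma_3 \to \infty$ should be studied, since $\gamma_3 \gg \gamma_2$ makes plastic spin less favorable than plastic stretching. The above distance metric can be rewritten as
$$D_{\gamma_2, \gamma_3}(P) = \gamma_2^{1/2} \min\Big\{ \Big( \|\sigma\|^2 + \frac{\gamma_2}{\gamma_3}\|\omega\|^2 \Big)^{1/2} \ :\ \sigma = \sigma^T,\ \omega = -\omega^T,\ P = e^{\sigma - \omega} e^{(1 + \frac1\delta)\omega} \Big\},$$
and we conjecture that the limit for $\gamma_3 \to \infty$ is given by
$$D_{\text{no spin}}(P) = \inf\big\{ \gamma_2^{1/2}\|\sigma\| \ :\ \sigma = \sigma^T,\ \omega = -\omega^T,\ P = e^{\sigma - \omega} e^{\omega} \big\}.$$
In case the matrix $P$ has no representation in the desired form the infimum is taken to be $+\infty$.

Finally we return to the Hamiltonian formulation for the geodesic curves. We recall the transformed Hamiltonian system (5.7),
$$\dot P = DL_P\, DH(\eta) = DL_P\, B^{-1}\eta, \qquad \dot\eta = J_{LP}(\eta)\, DH(\eta) = \big(B^{-1}\eta\big)^T \eta - \eta\, \big(B^{-1}\eta\big)^T, \tag{6.6}$$
with $H(\eta) = \tfrac12 \langle B^{-1}\eta, \eta\rangle$. Here $J_{LP} : \mathfrak g^* \to \operatorname{Skew}(\mathfrak g, \mathfrak g^*)$ is the Lie–Poisson structure, which in our case reads as $J_{LP}(\eta)\xi = \xi^T\eta - \eta\xi^T$, using matrix notation. Because of the isotropy (6.3) the second equation in (6.6) decouples as follows:
$$\dot\eta_{\rm sym} = \gamma\, \big( \eta_{\rm sym}\eta_{\rm anti} - \eta_{\rm anti}\eta_{\rm sym} \big), \qquad \dot\eta_{\rm anti} = 0, \qquad\text{where } \gamma = \frac{1}{\gamma_2} + \frac{1}{\gamma_3}. \tag{6.7}$$
This system can be solved explicitly by
$$\eta(t) = \eta_{\rm sym}(t) + \eta_{\rm anti}(t) = e^{-t\gamma\omega}\, \sigma\, e^{t\gamma\omega} + \omega,$$
where $\omega = -\omega^T$ and $\sigma = \sigma^T$. From this and the first equation in (6.6) we find $P(t)$ via the ansatz $P(t) = \widetilde P(t)\, e^{t\gamma\omega}$, which leads to $\dot{\widetilde P} = \widetilde P\, \frac{1}{\gamma_2}(\sigma - \omega)$. This justifies the ansatz in Theorem 6.1, and shows that there are no other possibilities.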

A Set-Valued Calculus of Variations

In this appendix we provide the two lemmas which were needed to prove Theorem 5.2. In the ﬁrst lemma we give a formula for the subdiﬀerential of a function which is deﬁned as an integral over convex functions. This result and certain generalizations of it are well-known, so we omit the proof, see Barbu [1994]; Barbu and Precupanu [1986].


A.1 Lemma. Let $Y$ be a finite-dimensional Banach space and $Y^*$ its dual. Assume that the function $f : [0,1] \times Y \to [0,\infty)$ is Lebesgue–Borel measurable, that $f(t, \cdot)$ is lower semicontinuous and convex, and that $f(\cdot, y)$ is integrable. Define the functional $J(y) = \int_0^1 f(t, y)\, dt$; then $J : Y \to [0,\infty)$ is convex and
$$\partial J(\eta) = \Big\{ \int_0^1 w(t)\, dt \in Y^* \ :\ w \in L^1\big((0,1), Y^*\big),\ w(t) \in \partial f(t, \eta) \text{ for a.e. } t \in [0,1] \Big\}.$$

The second lemma generalizes the fundamental lemma in the calculus of variations, which states that a function $a : (0,1) \to \mathbb R$ must be constant if it satisfies $\int_0^1 \dot\rho(t)\, a(t)\, dt = 0$ for all $\rho \in W_0^{1,\infty}(0,1)$.

A.2 Lemma. For each $t \in [0,1]$ let $C(t) \subset \mathbb R^n$ be a convex, compact set such that the mapping $(t, a) \mapsto \mathcal X_{C(t)}(a)$ from $[0,1] \times \mathbb R^n$ into $[0,\infty]$ is measurable. Moreover, assume that for each $\rho \in W_0^{1,\infty}(0,1)$ we have
$$0 \in K_\rho := \Big\{ \int_0^1 \dot\rho(t)\, a(t)\, dt \ :\ a : [0,1] \to \mathbb R^n \text{ measurable},\ a(t) \in C(t) \text{ a.e.} \Big\} \subset \mathbb R^n.$$

Then there exists a vector $b^* \in \mathbb R^n$ such that $b^* \in C(t)$ for a.e. $t \in [0,1]$.

Proof. We first prove the result for piecewise constant maps $t \mapsto C(t)$ and $\rho$ with piecewise constant derivative. Assume $[0,1] = \cup_{i=1}^n A_i$ with $A_i \cap A_j = \emptyset$ for $i \ne j$, $A_i$ measurable and $|A_i| = \int_{A_i} dt > 0$. Moreover, assume $C(t) = C_i$ and $\dot\rho(t) = \lambda_i/|A_i|$ for a.e. $t \in A_i$. Then, convexity of $C_i$ implies $\int_{A_i} \dot\rho(t)\, a(t)\, dt \in \lambda_i C_i$. Hence, we have to show that $\cap_{i=1}^n C_i \ne \emptyset$ under the assumption
$$0 \in \sum_{i=1}^n \lambda_i C_i = \Big\{ \sum_{i=1}^n \lambda_i a_i \ :\ a_i \in C_i \Big\}$$
for all $\lambda_1, \dots, \lambda_n \in \mathbb R$ with $\sum_{i=1}^n \lambda_i = 0$. We prove this by induction on $n$. For $n = 2$ we have $0 \in \lambda_1 C_1 + \lambda_2 C_2$ with $\lambda_1 + \lambda_2 = 0$ if and only if $0 \in C_1 - C_2 = \{a_1 - a_2 : a_j \in C_j\}$. But this is the same as $C_1 \cap C_2 \ne \emptyset$. For the induction from $n$ to $n+1$ let $B_n = \cap_{i=1}^n C_i$, which is nonempty by the induction hypothesis. We need to show $B_n \cap C_{n+1} \ne \emptyset$. First note the simple identity
$$B_n = \bigcap\Big\{ \sum_{i=1}^n \mu_i C_i \ :\ \mu_1 + \dots + \mu_n = 1 \Big\}.$$
From $0 \in \sum_{i=1}^{n+1} \lambda_i C_i$ for all $\lambda_1, \dots, \lambda_{n+1}$ with $\sum_{i=1}^{n+1} \lambda_i = 0$ we conclude
$$0 \in \bigcap\Big\{ \sum_{i=1}^n \mu_i C_i - C_{n+1} \ :\ \sum_{i=1}^n \mu_i = 1 \Big\} = \bigcap\Big\{ \sum_{i=1}^n \mu_i C_i \ :\ \sum_{i=1}^n \mu_i = 1 \Big\} - C_{n+1} = B_n - C_{n+1}.$$
But this exactly means $B_n \cap C_{n+1} = \cap_{i=1}^{n+1} C_i \ne \emptyset$. This proves the assertion for piecewise constant functions $t \mapsto (\dot\rho(t), C(t))$.

The general case is obtained by discretization. For $m \in \mathbb N$ we define the subintervals
$$I_k^m = \big( (k-1)2^{-m},\ k 2^{-m} \big) \quad\text{for } k = 1, \dots, 2^m$$
and the sets
$$C_k^m = \Big\{ \int_0^1 a(t)\, dt \ :\ a(t) \in C\big((k-1+t)2^{-m}\big) \text{ for a.e. } t \in [0,1] \Big\}.$$
By taking all $\rho \in W_0^{1,\infty}(0,1)$ such that $\dot\rho$ is constant on $I_k^m$ we have that $0 \in \sum_{k=1}^{2^m} \lambda_k C_k^m$ for $\sum_k \lambda_k = 0$. From the above we conclude
$$B^m = \bigcap_{k=1}^{2^m} C_k^m \ne \emptyset.$$
By construction we have $C_k^m = \tfrac12 C_{2k-1}^{m+1} + \tfrac12 C_{2k}^{m+1}$. Since $B^{m+1} \subset C_i^{m+1}$ for all $i$ we find $B^{m+1} \subset C_j^m$ for all $j$, and finally $B^{m+1} \subset B^m$. Because all $B^m$ are compact, we know that $B^* = \cap_{m=1}^\infty B^m$ is nonempty and satisfies $B^* \subset B^m$ for all $m$. This implies $B^* \subset C_k^m$ for all $m \in \mathbb N$ and $k = 1, \dots, 2^m$. By the measurability of the family $t \mapsto C(t)$ and the convexity of $C(t)$ we conclude that for almost all $t$ the sets $C^m_{[t 2^m]}$ converge for $m \to \infty$ to $C(t)$. Hence $\emptyset \ne B^* \subset C(t)$ for a.e. $t \in [0,1]$.

References

Abraham, R. and J. E. Marsden [1978], Foundations of Mechanics. Addison–Wesley, 2nd edition.
Arnol'd, V. I. [1989], Mathematical Methods of Classical Mechanics. Graduate Texts in Mathematics, 60, 2nd edition, Springer-Verlag, New York.
Ball, J. M. [1977], Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rat. Mech. Analysis 63, 337–403.
Barbu, V. [1994], Mathematical Methods in Optimization of Differential Systems. Mathematics and its Applications, 310. Kluwer Academic Publ., Dordrecht.
Barbu, V. and Th. Precupanu [1986], Convexity and Optimization in Banach Spaces. D. Reidel Publishing Co., Dordrecht.
Carstensen, C., K. Hackl, and A. Mielke [2001], Nonconvex potentials and microstructures in finite-strain plasticity. Proc. Royal Soc. London, September 2001, (in press).
Dacorogna, B. [1989], Direct Methods in the Calculus of Variations. Springer-Verlag.
Ebin, D. G. and J. E. Marsden [1970], Groups of diffeomorphisms and the notion of an incompressible fluid. Ann. of Math. 92, 102–163.
Govindjee, S., A. Mielke, and G. J. Hall [2001], The free-energy of mixing for n-variant martensitic phase transformations using quasi-convex analysis. J. Mech. Physics Solids, (to appear).
Gurtin, M. E. [2000], On the plasticity of single crystals: free energy, microforces, plastic-strain gradients. J. Mech. Physics Solids 48, 989–1036.
Hackl, K. [1997], Generalized Standard Media and Variational Principles in Classical and Finite Strain Elastoplasticity. J. Mech. Phys. Solids 45, 667–688.
Han, W. and B. D. Reddy [1995], Computational plasticity: the variational basis and numerical analysis. Comp. Mech. Advances 2, 283–400.
Holm, D. D., J. E. Marsden, T. Ratiu, and A. Weinstein [1985], Nonlinear stability of fluid and plasma equilibria. Phys. Rep. 123, pp. 116.
Krishnaprasad, P. S. and J. E. Marsden [1987], Hamiltonian structures and stability for rigid bodies with flexible attachments. Arch. Rat. Mech. Anal. 98, 71–93.
Lee, E. H. [1969], Elastic-plastic deformation at finite strains. J. Appl. Mech. 36, 1–6.
Marsden, J. E. and T. J. R. Hughes [1983], Mathematical Foundations of Elasticity. Prentice Hall.
Marsden, J. E. and A. Weinstein [1983], Coadjoint orbits, vortices, and Clebsch variables for incompressible fluids. Physica D 7, 305–323.
Miehe, C. [2000], Elements of computational inelasticity at finite strains, (in preparation).
Miehe, C. and E. Stein [1992], A Canonical Model of Multiplicative Elasto-Plasticity. Formulation and Aspects of the Numerical Implementation. European J. Mechanics, A/Solids 11, 25–43.
Miehe, C. and J. Schotte [2000], A two-scale micro-macro-approach to anisotropic finite plasticity of polycrystals. In "Multifield problems, State of the Art; A.-M. Sändig, W. Schiehlen, W. L. Wendland (eds); Springer-Verlag 2000", pp. 104–111.
Mielke, A. [1991], Hamiltonian and Lagrangian Flows on Center Manifolds. Lecture Notes in Mathematics Vol. 1489, Springer-Verlag, Berlin.
Mielke, A. [2002], Energetic formulation of multiplicative elastoplasticity using dissipation distances. Preprint Univ. Stuttgart, February 2002.
Mielke, A. and P. J. Holmes [1988], Spatially complex equilibria of buckled rods. Arch. Rat. Mech. Anal. 101, 319–348.
Mielke, A. and F. Theil [1999], A mathematical model for rate-independent phase transformations with hysteresis. In "Models of Continuum Mechanics in Analysis and Engineering, (H.-D. Alber, R. Farwig eds.), Shaker Verlag, 1999", 117–129.
Mielke, A., F. Theil, and V. I. Levitas [2002], A variational formulation of rate-independent phase transformations using an extremum principle. Archive Rat. Mech. Analysis, (in print).
Mittenhuber, D. [2000], Pseudo-Riemannian metrics on Lie groups. Preprint TU Darmstadt, Febr. 2000.
Mittenhuber, D. [2000], Personal Communication by e-mail, August 2000.
Monteiro Marques, M. D. P. [1993], Differential Inclusions in Nonsmooth Mechanics (Shocks and Dry Friction). Birkhäuser.
Ortiz, M. and E. A. Repetto [1999], Nonconvex energy minimization and dislocation in ductile single crystals. Journal of the Mechanics and Physics of Solids 47, 397–462.
Simó, J. C. [1988], A framework for finite strain elastoplasticity based on maximum plastic dissipation and the multiplicative decomposition. Part I: Continuum formulation. Comp. Meth. Appl. Mech. Engrg. 66, 199–219.
Simó, J. C., and M. Ortiz [1985], A Unified Approach to Finite Deformation Elastoplastic Analysis Based on the Use of Hyperelastic Constitutive Equations. Comput. Meths. Appl. Mech. Engrg. 49, 221–245.
Ziegler, H. and C. Wehrli [1987], The Derivation of Constitutive Relations from the Free Energy and the Dissipation Function. In Advances in Applied Mechanics 25, (T. Y. Wu & J. W. Hutchinson, eds.) Academic Press, New York.

3 Asynchronous Variational Integrators

A. Lew and M. Ortiz

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT We describe a class of asynchronous variational integrators (AVI) for nonlinear elastodynamics. The AVIs are characterized by the following distinguishing attributes: i) the algorithms permit the selection of independent time steps in each element, and the local time steps need not bear an integral relation to each other; ii) the algorithms derive from a spacetime form of a discrete version of Hamilton's principle. As a consequence of this variational structure, the algorithms conserve local energy and momenta exactly, subject to solvability of the local time steps. Numerical tests reveal that, even when local energy balance is not enforced exactly, the global and local energy behavior of the AVIs is quite remarkable, a property which can probably be traced to the symplectic nature of the algorithm.

Contents

1 Introduction
2 The Discrete Problem
  2.1 Spatial Discretization
  2.2 Time Discretization
  2.3 Discrete Variational Principle
  2.4 Conservation Properties of AVIs
  2.5 Time-Adaption and Space-Time Formulation
3 Numerical Examples
  3.1 Two-Dimensional Neo-Hookean Block
  3.2 Three-Dimensional L-Shaped Beam
4 Summary
References

1 Introduction

Dynamical systems that exhibit well-separated multiple time scales arise in many areas of application, including molecular dynamics, microstructural evolution in materials, and structural mechanics. In a numerical treatment of these problems, it is natural to consider time stepping algorithms with multiple time steps chosen to resolve the various time scales in the problem. In the engineering finite-element literature, the so-called subcycling algorithms (Neal and Belytschko [1989], Belytschko and Mullen [1976]) fit that description. These algorithms were developed mainly so that stiff elements, or regions of the model, could advance at smaller time steps than more compliant elements. In its original version, the method grouped the nodes of the mesh and assigned a different time step to each group. Adjacent groups of nodes were constrained to have integer time step ratios (see Belytschko and Mullen [1976]), a condition that was relaxed subsequently (Neal and Belytschko [1989]; Belytschko [1981]). Recently, an implicit multi-time step integration method was developed and analyzed by Smolinski and Wu [1998].

Since the original formulation of subcycling algorithms, a sea change has occurred in the understanding of discrete dynamics, most notably since the development of the theory of variational integrators (Kane, Marsden, and Ortiz [1999]; Kane, Marsden, Ortiz, and West [2000]; Marsden, Patrick, and Shkoller [1998] and Marsden and West [2001]). Widely used algorithms, such as some versions of Newmark's algorithm, can be recast into the discrete mechanics framework, as shown by Kane, Marsden, Ortiz, and West [2000]. Even without deliberately adjusting the time step to achieve exact conservation, these algorithms possess remarkable energy-conservation properties that probably originate from their symplectic and variational nature. One of the attractive features of variational methods is that if a problem has symmetries and, as a consequence, corresponding conserved quantities exist, then these quantities are automatically conserved by the algorithm.
These recent theoretical developments provide a sound basis for formulating powerful multiresolution methods that offer considerable freedom of local time step selection while possessing exact local energy and momentum conservation properties. In this article, we describe one such class of integrators, termed asynchronous variational integrators (AVI), characterized by the following distinguishing attributes:

1. When applied to dynamical systems defined by the finite element method, AVIs permit the selection of different time steps for each element. The local time steps need not bear an integral relation to each other, and the integration of the elements may, therefore, be carried out asynchronously.

2. The time-integration algorithm is given by a spacetime form of a discrete version of Hamilton's principle. As a consequence of this variational structure, the algorithm conserves local energy and momenta exactly, subject to solvability of the local time steps.

3. Asynchronous Variational Integrators


The use of a variable time step sidesteps the restrictions imposed by well-known theorems (Ge and Marsden [1988]), which rule out the possibility that constant time stepping algorithms be simultaneously symplectic and energy- and momentum-preserving. A symplectic-energy-momentum time integrator for finite-dimensional dynamical systems was developed in Kane, Marsden, and Ortiz [1999], where the time step of the complete system was computed in order to preserve the total energy. Conditions for the solvability of the time step were also provided by Kane, Marsden, and Ortiz [1999].

The integrators described in this article are symplectic-energy-momentum preserving. The variational structure of the algorithms furnishes a local energy balance equation which may be satisfied by allowing for different and variable time steps in each element. The resulting algorithms satisfy local, as opposed to merely global, energy and momentum balance. Numerical examples in two and three dimensions demonstrate the remarkable versatility and conservation properties of AVIs. The present article is extracted from a much more extended version (Lew, West, Marsden, and Ortiz [2001]), which may be consulted for further details of the theory and algorithmic implementation.

2 The Discrete Problem

In describing the dynamic response of elastic bodies under loading, we select a reference configuration $\mathcal{B} \subset \mathbb{R}^3$ of the body at time $t_0$. The coordinates of points $X \in \mathcal{B}$ are used to identify material particles throughout the motion. The motion of the body is described by the deformation mapping

$$x = \varphi(X, t), \qquad X \in \mathcal{B}. \tag{2.1}$$

Thus, $x$ is the location of material particle $X$ at time $t$. The material velocity and acceleration fields follow from (2.1) as $\dot\varphi(X, t)$ and $\ddot\varphi(X, t)$, $X \in \mathcal{B}$, respectively, where a superposed dot denotes partial differentiation with respect to time at fixed material point $X$. The deformation mapping is subject to essential boundary conditions on the displacement boundary $\partial_d\mathcal{B} \subset \partial\mathcal{B}$. For definiteness, the potential energy of the body is assumed to be of the form

$$V[\varphi(\cdot, t), t] = \int_{\mathcal{B}} W(D_1\varphi, X)\,dV - \int_{\mathcal{B}} \rho\,\mathbf{B}\cdot\varphi\,dV - \int_{\partial_\tau\mathcal{B}} \mathbf{T}\cdot\varphi\,dS, \tag{2.2}$$

where $W$ is the strain energy density per unit volume, $\mathbf{B}(X, t)$ is the body force per unit mass, $\mathbf{T}(X, t)$ is the prescribed traction applied on the traction boundary $\partial_\tau\mathcal{B} = \partial\mathcal{B} \setminus \partial_d\mathcal{B}$, $\rho$ is the mass density in $\mathcal{B}$, and $D_i$ denotes the partial derivative of a function with respect to its $i$-th argument. In addition, the kinetic energy of the body is assumed to be of the form

$$T[\dot\varphi(\cdot, t)] = \int_{\mathcal{B}} \frac{\rho}{2}\,|\dot\varphi|^2\,dV. \tag{2.3}$$

The corresponding Lagrangian of the body is

$$L[\varphi(\cdot, t), \dot\varphi(\cdot, t), t] = T[\dot\varphi] - V[\varphi, t]. \tag{2.4}$$

Consider now a motion of the body during the time interval $[t_0, t_f]$. The action attendant to the motion is

$$S[\varphi(\cdot, \cdot)] = \int_{t_0}^{t_f} L(\varphi, \dot\varphi, t)\,dt. \tag{2.5}$$

We note that, upon insertion of (2.4) in (2.5), the evaluation of the action functional entails a spacetime integral. Within the framework just outlined, Hamilton's principle postulates that the motion $\varphi(X, t)$ of the body which joins prescribed initial and final conditions renders the action functional $S$ stationary with respect to all admissible variations, i.e., variations of $\varphi(X, t)$ vanishing at $t_0$ and $t_f$ and satisfying the essential boundary conditions on $\partial_d\mathcal{B}$. A standard calculation shows that the Euler–Lagrange equations corresponding to Hamilton's principle are

$$D_1 L(\varphi, \dot\varphi, t) - \frac{d}{dt} D_2 L(\varphi, \dot\varphi, t) = 0 \tag{2.6}$$

for all $t \in [t_0, t_f]$.

We recall that, in Lagrangian mechanics, energy conservation follows as a consequence of the invariance of the action under time-translation. In order to establish this connection, introduce the shifted action

$$S_\alpha[\varphi] = \int_{t_0-\alpha}^{t-\alpha} L\big(\varphi(\cdot, \xi+\alpha), \dot\varphi(\cdot, \xi+\alpha), \xi\big)\,d\xi \tag{2.7a}$$

$$\phantom{S_\alpha[\varphi]} = \int_{t_0}^{t} L\big(\varphi(\cdot, \xi), \dot\varphi(\cdot, \xi), \xi-\alpha\big)\,d\xi, \tag{2.7b}$$

where $t \in [t_0, t_f]$. Thus, $S_\alpha$ is the value of the action for a motion identical in all respects to $\varphi$ but occurring a time $\alpha$ earlier. Differentiating equations (2.7a) and (2.7b) with respect to $\alpha$ and evaluating the result along trajectories gives

$$\left.\frac{\partial S_\alpha}{\partial\alpha}\right|_{\alpha=0} = H\,\Big|_{t_0}^{t} = -\int_{t_0}^{t} D_3 L(\varphi, \dot\varphi, \xi)\,d\xi, \tag{2.8}$$

where

$$H = D_2 L(\varphi, \dot\varphi, t)\cdot\dot\varphi - L(\varphi, \dot\varphi, t) \tag{2.9}$$

is the total energy of the body. In particular, for an autonomous Lagrangian $D_3 L = 0$, and (2.8) simply states that the total energy of the body is conserved along trajectories.

2.1 Spatial Discretization

Next we consider a finite-element triangulation $\mathcal{T}$ of $\mathcal{B}$. The corresponding finite-dimensional space of finite-element solutions consists of deformation mappings of the form

$$\varphi_h(X) = \sum_{a\in\mathcal{T}} x_a N_a(X), \tag{2.10}$$

where $N_a$ is the shape function corresponding to node $a$, and $x_a$ represents the position of the node in the deformed configuration. A key observation underlying the formulation of AVIs is that, owing to the extensive character of the Lagrangian (2.4), the following element-by-element additive decomposition holds:

$$L = \sum_{K\in\mathcal{T}} L_K, \tag{2.11}$$

where $L_K$ is the contribution of element $K \in \mathcal{T}$ to the total Lagrangian, which follows by restricting (2.4) to $K$. Each elemental or local Lagrangian $L_K$ can in turn be written as a function of the nodal positions and velocities of the element, i.e.,

$$L_K[\varphi_h(\cdot, t), \dot\varphi_h(\cdot, t), t] \equiv L_K(x_K(t), \dot x_K(t), t), \tag{2.12}$$

where $x_K$ is the vector of positions of all the nodes in element $K$. In particular, for the Lagrangian (2.4) the local Lagrangians have the form

$$L_K(x_K, \dot x_K, t) = T_K(\dot x_K) - V_K(x_K, t), \tag{2.13}$$

where $V_K(x_K, t)$ is the elemental potential energy, and

$$T_K(\dot x_K) = \frac{1}{2}\,\dot x_K^T M_K\,\dot x_K \tag{2.14}$$

is the elemental kinetic energy. Here $M_K$ is the element mass matrix, which is constant by conservation of mass and will be assumed to be expressible in diagonal or lumped form.

2.2 Time Discretization

A key feature of the AVIs is that the elements and nodes defining the triangulation of the body are updated asynchronously in time. To this end, we endow each element $K \in \mathcal{T}$ with a discrete time set

$$\Theta_K = \big\{ t_0 = t_K^1 < \dots < t_K^{N_K-1} < t_K^{N_K} \big\}, \tag{2.15}$$

with $t_K^{N_K-1} < t_f \le t_K^{N_K}$. In addition, we write $x_K^j \equiv x_K(t_K^j)$, $t_K^j \in \Theta_K$, for the discrete element coordinates, and

$$\Theta = \bigsqcup_{K\in\mathcal{T}} \Theta_K \tag{2.16}$$

for the entire time set. We shall also need to keep proper time at all nodes in the mesh. To this end, we let

$$\Theta_a = \bigsqcup_{\{K\in\mathcal{T} \,\mid\, a\in K\}} \Theta_K = \big\{ t_0 = t_a^1 \le \dots \le t_a^{N_a-1} \le t_a^{N_a} \big\} \tag{2.17}$$

denote the ordered nodal time set for node $a$. In these definitions, the symbol $\bigsqcup$ denotes disjoint union. For simplicity, we assume that $t_K^j \ne t_{K'}^{j'}$ for any pair of elements $K$ and $K'$. The case of time coincidences between elements can be treated simply by taking the appropriate limits.

Figure 2.1. Spacetime diagram of the motion of a three-element, one-dimensional mesh. The reference configuration is shown on the left, while the deformed configuration is on the right. The trajectories of the nodes are depicted as dashed lines in both configurations. The horizontal segments above each element $K$ define the set $\Theta_K$.

We additionally write $x_a^i = x_a(t_a^i)$, $t_a^i \in \Theta_a$, for the discrete nodal coordinates, and let

$$X = \big\{ x_a^i,\ a \in \mathcal{T},\ i = 1, \dots, N_a \big\} \tag{2.18}$$

denote the set of nodal coordinates defining the discrete trajectory. The particular class of AVIs under consideration here is obtained by allowing each node $a \in \mathcal{T}$ to follow a linear trajectory within each time interval $[t_a^i, t_a^{i+1}]$. The corresponding nodal velocities are piecewise constant in time. The nodal trajectories thus constructed are defined in the time intervals $[t_0, t_a^{N_a}]$. An $x$-$t$ diagram of the motion of a three-element one-dimensional mesh is shown in Fig. 2.1 by way of illustration. Higher-order AVI methods can also be devised by considering piecewise polynomial nodal trajectories.
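As an illustration of the bookkeeping behind the nodal time sets (2.17), the following minimal Python sketch merges the time sets of the elements incident on each node into one ordered list per node. The function name `nodal_time_sets`, the element labels, and the numerical time values are hypothetical and chosen only for this example; repeated entries (such as the common initial time) are kept, in the spirit of the disjoint union.

```python
import heapq

def nodal_time_sets(element_nodes, element_times):
    """Build the ordered time set of each node from the element time sets.

    element_nodes : dict mapping an element label to the tuple of its nodes
    element_times : dict mapping an element label to its sorted list of times
    """
    incident = {}
    for K, nodes in element_nodes.items():
        for a in nodes:
            incident.setdefault(a, []).append(element_times[K])
    # heapq.merge keeps repeated values, so a time shared by two elements
    # appears once per element, as in a disjoint union.
    return {a: list(heapq.merge(*lists)) for a, lists in incident.items()}

# Hypothetical three-element, one-dimensional mesh with nodes 0..3.
element_nodes = {"K1": (0, 1), "K2": (1, 2), "K3": (2, 3)}
element_times = {
    "K1": [0.0, 0.4, 0.8, 1.2],
    "K2": [0.0, 0.3, 0.6, 0.9, 1.2],
    "K3": [0.0, 0.5, 1.0, 1.5],
}
theta_a = nodal_time_sets(element_nodes, element_times)
```

In this sketch an end node inherits the time set of its single element, while an interior node sees the interleaved times of its two neighbors.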


We note that the pair of sets $(\Theta, X)$ completely defines the trajectories of the discrete system. A class of discrete dynamical systems is obtained by considering discrete action sums of the form (see, e.g., Marsden and West [2001] for a recent review on discrete dynamics and variational integrators)

$$S_d(X, \Theta) = \sum_{K\in\mathcal{T}}\ \sum_{1 \le j < N_K} L_K^j, \tag{2.19}$$

where the discrete Lagrangian $L_K^j$ approximates the incremental action of element $K$ over the interval $[t_K^j, t_K^{j+1}]$, i.e.,

$$L_K^j \approx \int_{t_K^j}^{t_K^{j+1}} L_K\,dt. \tag{2.20}$$

A particular choice of discrete Lagrangian, resulting in explicit integrators of the central-difference type, is given by

$$L_K^j = \int_{t_K^j}^{t_K^{j+1}} T_K\big(\dot x_K(t)\big)\,dt - \big(t_K^{j+1} - t_K^j\big)\,V_K\big(x_K^{j+1}, t_K^{j+1}\big). \tag{2.21}$$

In general, $L_K^j$ depends on some subset of $X$ of nodal coordinates, and some subset of $\Theta$ of elemental times.
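The discrete Lagrangian (2.21) can be evaluated in closed form in the simplest setting. The sketch below assumes, for illustration only, that the nodal velocities of the element stay constant over the whole element interval (as for an isolated element whose nodes receive no impulses from neighbors in between) and that the mass matrix is lumped. The function name `discrete_lagrangian`, the two-node spring potential, and all numerical values are hypothetical.

```python
import numpy as np

def discrete_lagrangian(x0, x1, t0, t1, mass, potential):
    """Discrete Lagrangian (2.21) for a linear trajectory from (t0, x0) to (t1, x1).

    mass      : lumped (diagonal) mass entries, one per degree of freedom
    potential : callable V(x, t) returning the elemental potential energy
    Assumes constant nodal velocities over [t0, t1].
    """
    h = t1 - t0
    v = (x1 - x0) / h                                # constant velocity on the interval
    kinetic_action = 0.5 * h * np.dot(mass * v, v)   # exact time integral of T_K
    return kinetic_action - h * potential(x1, t1)    # V_K sampled at the interval end

# Hypothetical two-node spring element in one dimension.
stiffness, rest_length = 100.0, 1.0

def spring_potential(x, t):
    return 0.5 * stiffness * (x[1] - x[0] - rest_length) ** 2

L_j = discrete_lagrangian(
    np.array([0.0, 1.0]), np.array([0.0, 1.1]), 0.0, 0.1,
    np.array([1.0, 1.0]), spring_potential,
)
```

In an actual asynchronous calculation the nodal velocities may change inside an element interval, in which case the kinetic term has to be accumulated over the nodal sub-intervals instead.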

2.3 Discrete Variational Principle

The discrete version of Hamilton's principle states that the discrete trajectory having prescribed initial and final end points renders the discrete action sum stationary with respect to admissible variations of the coordinate set $X$ (see Marsden and West [2001] and Marsden, Patrick, and Shkoller [1998]). Note that since each element carries its own set of time steps, the final configurations of the elements, corresponding to $t_K^{N_K} \ge t_f$, $K \in \mathcal{T}$, are not synchronized in general. Thus, $x_a^i$ is not to be varied if $t_a^i \ge t_f$, as the node belongs to the final configuration of some element in the mesh. Similarly, the initial nodal positions at $t_0$ are not to be varied. The discrete Hamilton's principle leads to the discrete Euler–Lagrange equations:

$$D_a^i S_d = 0, \tag{2.22}$$

for all $a \in \mathcal{T}$ and $1 < i < N_a$. Here and subsequently, $D_a^i$ denotes differentiation with respect to $x_a^i$. The discrete Euler–Lagrange equations (2.22) define equations of motion for the discrete problem. If the discrete time set $\Theta$ is given a priori, then the discrete equations of motion (2.22) determine the coordinate sets $X$ which define the discrete trajectories of the system.

For the particular case of the discrete Lagrangian (2.21), the discrete Euler–Lagrange equations follow explicitly in the form (cf. Lew, West, Marsden, and Ortiz [2001] for further details of the derivation)

$$p_a^{i+1/2} - p_a^{i-1/2} = I_a^i, \tag{2.23}$$

where

$$p_a^{i+1/2} \equiv M_a\,\frac{x_a^{i+1} - x_a^i}{t_a^{i+1} - t_a^i} \equiv M_a\,v_a^{i+1/2} \tag{2.24}$$

are discrete linear momenta and $M_a$ nodal mass matrices. In addition, we define

$$I_K^j \equiv -\big(t_K^j - t_K^{j-1}\big)\,D_1 V_K\big(x_K^j, t_K^j\big), \tag{2.25}$$

which may be regarded as the impulses exerted by element $K$ on its nodes at time $t_K^j$. In equation (2.23), $I_a^i$ represents the component of $I_K^j$ corresponding to node $a$, with $t_a^i = t_K^j$. Eq. (2.23) may be interpreted as describing a sequence of percussions imparted by the elements on their nodes at discrete instants of time. Thus, the element $K$ accumulates and stores impulses $I_K^j$ over the time interval $(t_K^{j-1}, t_K^j)$. At the end of the interval, the element releases its stored impulses by imparting percussions on its nodes, causing the linear momentum of the nodes to be altered. The resulting nodal trajectories are piecewise linear in time, as initially assumed. We note that adjacent elements interact by transferring linear momentum through their common nodes.

2.4 Conservation Properties of AVIs

As in the continuous case, conserved quantities and conservation laws follow consistently as a consequence of the symmetries of the discrete action sum. It may be shown (Lew, West, Marsden, and Ortiz [2001]) that discrete forms of global and local linear and angular momentum balance may be derived directly in this manner. It also follows that the solutions of the discrete Euler–Lagrange equations satisfy linear and angular momentum balance automatically.

The conservation of local and global energy is a somewhat more delicate matter with far-reaching consequences. Thus, a consistent local energy-balance equation may be derived by examining the effect of time translations on the discrete action. The result is (cf. Lew, West, Marsden, and Ortiz [2001] for further details)

$$D_K^j S_d = 0 \tag{2.26}$$

for all $K \in \mathcal{T}$ and all $1 < j < N_K$. Here and subsequently, $D_K^j$ denotes differentiation with respect to $t_K^j$. This identity may be interpreted as a discrete element-wise, or local, energy-balance equation, which must be satisfied in addition to the discrete Euler–Lagrange equations. It therefore generalizes the consistent global energy-balance equation derived by Kane, Marsden, and Ortiz [1999].

For the particular case of the discrete Lagrangian (2.21), the discrete energy-balance equations take the form (cf. Lew, West, Marsden, and Ortiz [2001] for a detailed derivation)

$$\big(t_K^j - t_K^{j-1}\big)\,D_2 V_K\big(x_K^j, t_K^j\big) + \sum_{a\in K} \frac{1}{2}\big(v_a^{i-1/2}\big)^T M_a\,v_a^{i-1/2} + V_K\big(x_K^j, t_K^j\big) = \sum_{a\in K} \frac{1}{2}\big(v_a^{i+1/2}\big)^T M_a\,v_a^{i+1/2} + V_K\big(x_K^{j+1}, t_K^{j+1}\big), \tag{2.27}$$

where $K \in \mathcal{T}$, $1 < j < N_K$ and $t_a^i = t_K^j$. Evidently, eq. (2.27) furnishes a discrete version of local energy balance, and it expresses the precise way in which the discrete trajectories conserve local energy. It should be noted that the local energy balance implied by eq. (2.27) does not require the element energies to remain constant. Indeed, energy may be exchanged reversibly between the elements and their incident nodes, which provides a mechanism for energy transfer.

2.5 Time-Adaption and Space-Time Formulation

The discrete Euler–Lagrange and energy-balance equations (2.22) and (2.26) may be collected to form the extended system of equations:

$$D_a^i S_d(X, \Theta) = 0, \tag{2.28}$$

$$D_K^j S_d(X, \Theta) = 0, \tag{2.29}$$

which determines both the discrete displacements $X$ as well as the discrete times $\Theta$, provided that the system of equations admits solutions. This results in time adaption, inasmuch as the time set $\Theta$ is not prescribed at the outset but is determined as part of the solution instead. The resulting method generalizes that proposed by Kane, Marsden, and Ortiz [1999], which allows for one adaptable time variable only and thus results in global energy conservation only.

An alternative interpretation of eqs. (2.28) and (2.29) is as joint discrete Euler–Lagrange equations corresponding to a spacetime discretization of the spacetime domain $\mathcal{B}$. In this approach, the spatial coordinates $X$ and the temporal coordinates $\Theta$ are placed on an equal footing, and regarded jointly as spacetime coordinates.

Of course, the viability of the spacetime approach relies on the solvability of the spacetime discrete Euler–Lagrange equations (2.28)–(2.29). However, Kane, Marsden, and Ortiz [1999] pointed out that it is not always possible to determine a positive time step from the discrete energy-conservation equation, especially near turning points where velocities are small. Kane, Marsden, and Ortiz [1999] overcame this difficulty by formulating a minimization problem that returns the exact spacetime solution whenever one exists.

Figure 2.2. Graphical interpretation of the algorithm in the $(\hat q, \hat p)$ plane. There are two intersections of the constant energy and momentum surfaces. The cross denotes a solution rendering a negative value of $\hat h^{i+1/2}$, while the circle indicates the positive solution.

In the context of AVIs, the following simple example demonstrates that solvability cannot always be counted on, especially for explicit algorithms. The example concerns a simple harmonic oscillator with mass $m$ and spring constant $\kappa$. For this system, the discrete spacetime Euler–Lagrange equations corresponding to the discrete Lagrangian (2.21) are

$$p^{i+1/2} - p^{i-1/2} = -h^{i-1/2}\,\kappa\,q^i, \tag{2.30}$$

$$\frac{1}{2m}\big(p^{i-1/2}\big)^2 + \frac{1}{2}\kappa\,\big(q^i\big)^2 = \frac{1}{2m}\big(p^{i+1/2}\big)^2 + \frac{1}{2}\kappa\,\big(q^{i+1}\big)^2 = H, \tag{2.31}$$

where

$$p^{i+1/2} = m\,\frac{q^{i+1} - q^i}{h^{i+1/2}} \tag{2.32}$$

and we write

$$h^{i+1/2} = t^{i+1} - t^i. \tag{2.33}$$

It should be noted that eq. (2.30) describes a variable time step central-difference scheme, and therefore the algorithm is explicit. In terms of the dimensionless variables

$$\hat p = \frac{p}{\sqrt{2mH}}, \qquad \hat q = q\,\sqrt{\frac{\kappa}{2H}}, \qquad \hat h = \frac{h}{\sqrt{m/\kappa}}, \tag{2.34}$$

equations (2.30), (2.31) and (2.32) may be recast in the form:

$$\hat p^{i+1/2} - \hat p^{i-1/2} = -\hat h^{i-1/2}\,\hat q^i, \tag{2.35}$$

$$\big(\hat p^{i-1/2}\big)^2 + \big(\hat q^i\big)^2 = \big(\hat p^{i+1/2}\big)^2 + \big(\hat q^{i+1}\big)^2 = 1, \tag{2.36}$$

$$\hat h^{i+1/2} = \frac{\hat q^{i+1} - \hat q^i}{\hat p^{i+1/2}}. \tag{2.37}$$

The problem is now to solve these equations for $(\hat q^{i+1}, \hat p^{i+1/2}, \hat h^{i+1/2})$, subject to the constraint $\hat h^{i+1/2} > 0$, given $\hat q^i$, $\hat p^{i-1/2}$, and $\hat h^{i-1/2} > 0$.

This problem can readily be solved graphically in the phase plane $(\hat q, \hat p) \in \mathbb{R}^2$, Fig. 2.2. Equation (2.36) defines a constant energy surface, which in the present case reduces to a circle, and equation (2.35) defines the constant linear momentum surface, which here reduces to a horizontal line. The intersections of this line with the circle return two possible solutions of the system. The value of $\hat h^{i+1/2}$ is given by the inverse of the slope of the segment joining $(\hat q^i, 0)$ with $(\hat q^{i+1}, \hat p^{i+1/2})$. Valid solutions correspond to segments with positive slopes.

It is clear from this construction that solutions fail to exist for sufficiently large $|\hat h^{i-1/2}\,\hat q^i|$, as under such conditions the constant linear-momentum line does not intersect the constant-energy circle. Since both $\hat q^i$ and $\hat h^{i-1/2}$ are given as initial conditions, this lack of solvability implies that the explicit algorithm may not be able to conserve energy over some time steps. It does not appear to be known at present whether it is always possible to formulate discrete Lagrangians, most likely implicit, such that the discrete spacetime Euler–Lagrange equations (2.28)–(2.29) are always solvable.
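The graphical construction can also be carried out numerically. The following Python sketch solves the dimensionless equations (2.35)–(2.37) for one step and reports failure when the momentum line misses the energy circle. The function name `next_step` and the sample values are hypothetical; when both intersections yield a positive time step, the sketch returns the smaller one, which is a choice made here for illustration and is not prescribed by the text.

```python
import math

def next_step(q, p_prev, h_prev):
    """Solve (2.35)-(2.37) for (q_next, p_next, h_next) in dimensionless form.

    Returns None when no solution with a positive time step exists.
    """
    p_next = p_prev - h_prev * q           # momentum update (2.35)
    disc = 1.0 - p_next ** 2               # q_next**2 on the energy circle (2.36)
    if disc < 0.0 or p_next == 0.0:
        return None                        # line misses the circle, or (2.37) is undefined
    root = math.sqrt(disc)
    candidates = []
    for q_next in (root, -root):           # the two intersections with the circle
        h_next = (q_next - q) / p_next     # time step (2.37)
        if h_next > 0.0:
            candidates.append((h_next, q_next))
    if not candidates:
        return None
    h_next, q_next = min(candidates)       # assumption: prefer the smaller positive step
    return q_next, p_next, h_next

solvable = next_step(0.6, 0.8, 0.1)        # moderate previous step: a solution exists
unsolvable = next_step(0.6, 0.8, 3.5)      # large previous step: returns None
```

The second call illustrates the loss of solvability discussed above: the updated momentum has magnitude larger than one, so no point of the unit circle satisfies (2.36).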

3 Numerical Examples

We conclude this article with selected examples of application of the AVI corresponding to the discrete Lagrangian (2.21). In these examples, the elemental time steps are determined from the Courant condition, which provides an estimate of the stability limit for explicit integration (cf., e.g., Hughes [1987]). In consequence of this choice of local time step, the local energy-balance equations (2.29) are not satisfied exactly by the algorithm. As we shall see, however, the numerical solution still exhibits excellent energy-conserving properties.

Because of the algorithm's asynchronous nature, a suitable scheduling procedure which determines the order of operations while ensuring causality must be carefully designed. One particularly efficient implementation consists of maintaining a priority queue (see, e.g., Knuth [1998]) containing the elements of the triangulation. The elements in the priority queue are ordered according to the next time at which they are to become active. Thus, the top element in the queue, and consequently the next element to be processed, is the element whose next activation time is closest to the present time.

The general flow of the calculations is as follows. The priority queue is popped in order to determine the next element to be processed. The new configuration of this active element is computed from the current velocities of the nodes. Subsequently, these velocities are modified by impulses computed based on the new element configuration. Finally, the next activation time for the element is computed as a fraction of the Courant limit and the element is pushed into the queue.
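The flow of calculations just described can be sketched in a few lines of Python. The example below is a simplified stand-in, not the authors' implementation: it replaces the finite-element mesh by a one-dimensional chain of linear springs, uses a fixed time step per element, stops each element at its last activation time not exceeding the final time, and then drifts all nodes to that common time. The function name `avi_spring_chain`, the step-size estimate, and all numerical values are assumptions made for illustration.

```python
import heapq
import numpy as np

def avi_spring_chain(x, v, mass, stiffness, rest_length, steps, t_final):
    """Advance a 1D chain of springs with one fixed time step per element.

    Element K joins nodes K and K+1; steps[K] is its time step.
    Returns positions and velocities at t_final and the update count per element.
    """
    x, v = x.astype(float).copy(), v.astype(float).copy()
    n_el = len(stiffness)
    t_node = np.zeros(len(x))            # time at which each node was last moved
    t_elem = np.zeros(n_el)              # time at which each element last fired
    updates = np.zeros(n_el, dtype=int)
    queue = [(steps[K], K) for K in range(n_el)]   # (next activation time, element)
    heapq.heapify(queue)

    while queue:
        t, K = heapq.heappop(queue)      # element with the earliest activation time
        if t > t_final:
            continue                     # this element is done; do not reschedule it
        for a in (K, K + 1):             # drift the element nodes to time t
            x[a] += v[a] * (t - t_node[a])
            t_node[a] = t
        # Impulse as in (2.25): elapsed element time times minus the potential gradient
        stretch = x[K + 1] - x[K] - rest_length[K]
        impulse = (t - t_elem[K]) * stiffness[K] * stretch
        v[K] += impulse / mass[K]        # percussion on the left node
        v[K + 1] -= impulse / mass[K + 1]  # equal and opposite on the right node
        t_elem[K] = t
        updates[K] += 1
        heapq.heappush(queue, (t + steps[K], K))

    x += v * (t_final - t_node)          # bring all nodes to the common final time
    return x, v, updates

# Hypothetical data: a stiff spring between two compliant ones.
mass = np.ones(4)
stiffness = np.array([1.0, 100.0, 1.0])
rest_length = np.ones(3)
# Assumed step choice: a fixed fraction of 2/omega for each isolated spring.
omega = np.sqrt(stiffness * (1.0 / mass[:-1] + 1.0 / mass[1:]))
steps = 0.1 * 2.0 / omega
x0 = np.array([0.0, 1.0, 2.1, 3.1])      # middle spring initially stretched
v0 = np.zeros(4)
x1, v1, updates = avi_spring_chain(x0, v0, mass, stiffness, rest_length, steps, 1.0)
```

Because every impulse is applied with opposite signs to the two nodes of an element, the total linear momentum of the chain is unchanged by each update, and the stiff element is visited far more often than its compliant neighbors.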

3.1 Two-Dimensional Neo-Hookean Block

Our first example concerns a square block 1 m in size, fixed on one side and traction-free on the remaining three sides, released from rest from a stretched configuration, Fig. 3.1. The block is free of body forces. The material is a compressible Neo-Hookean solid characterized by a strain-energy density of the form:

$$W(F) = \frac{1}{2}\lambda\,\log^2 J - \mu\,\log J + \frac{1}{2}\mu\,\operatorname{tr}\big(F^T F\big), \tag{3.1}$$

where $F = D_1\varphi$ is the deformation gradient, $J = \det F$ is the Jacobian of the deformation, and $\lambda$ and $\mu$ are material constants. The values of the material constants used in calculations are: $\lambda = 93$ GPa, $\mu = 10$ GPa, and $\rho = 7800$ kg/m$^3$. The initial stretch applied to the block is 1.2. The finite-element mesh contains a distribution of element sizes in order to have a corresponding distribution of elemental time steps. The mesh is composed of 380 quadratic six-noded triangular elements and 821 nodes.

Figure 3.1. Geometry of the two-dimensional Neo-Hookean block example. (Dimension labels in the figure: 1.2 m, 1 m, 1 m.)

A sequence of snapshots of the AVI solution is shown in Fig. 3.2. In addition, Fig. 3.3 shows a comparison of the AVI solution and a baseline solution obtained using Newmark's second-order explicit algorithm (cf., e.g., Hughes [1987]). A first noteworthy feature of the AVI solution is that, despite its asynchronous character, it advances smoothly in time without ostensible jerkiness or vacillation. The AVI and Newmark solutions appear to remain in lockstep over long runs and to be of comparable quality, Fig. 3.3.

Figure 3.2. Neo-Hookean block example. Snapshots of the deformed shape of the block at intervals of $2 \times 10^{-4}$ s. Time increases from left to right and from top to bottom of the figure. (Axes: x [m] and y [m].)

Figure 3.3. Neo-Hookean block example. Comparison of the deformed configurations at t = 16 ms computed using Newmark's second-order explicit algorithm (dashed lines) and the AVI (solid lines). The time corresponds to 2,208,000 Newmark steps, or 8 complete oscillation cycles. (Axes: x [m] and y [m].)

The main advantage of the AVI is illustrated in Fig. 3.4, which depicts the number of updates in each of the elements of the mesh. As is evident from the figure, the large elements in the mesh are updated much less frequently than the fine elements. Some relevant statistics are collected in Table 3.1.

Table 3.1. Neo-Hookean block example. Number of elemental updates after 10 ms of simulation.

                     AVI            Newmark
Maximum              1,374,413      1,380,000
Minimum              42,759         1,380,000
Total in the mesh    302,000,000    524,400,000

Figure 3.4. Neo-Hookean block example. Contour plot of the $\log_{10}$ of the number of times each element is updated by the AVI after 10 ms of simulation. (Axes: X [m] and Y [m]; contour levels range from 4.73 to 6.04.)

Overall, in the present example the number of AVI updates is roughly 60% of the number of Newmark updates. It should be carefully noted, however, that in the example under consideration the vast majority of the elements in the mesh are small in size, and the number of large elements is correspondingly small. It is easy to set up examples in which the update count of the Newmark algorithm bears an arbitrarily large ratio to the update count of the AVI. A case which arises in practice with some frequency concerns a roughly uniform triangulation of the domain which contains a small number of high aspect-ratio elements. The presence of a single bad element suffices to drive down the critical time step for explicit integration to an arbitrarily small value. This problem often besets explicit dynamics, especially in three dimensions where bad elements, or slivers, are difficult to eliminate entirely. The AVI algorithm effectively sidesteps this difficulty, as bad elements drive down their own time steps only, and not the time steps of the remaining elements in the mesh. In this manner, the overall calculation is shielded from the tyranny of the errant few.
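The update counts quoted for this example can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only the figures reported in Table 3.1 and the element count of the mesh; the variable names are arbitrary.

```python
n_elements = 380                      # elements in the mesh
newmark_steps = 1_380_000             # Newmark updates every element at every step
newmark_total = n_elements * newmark_steps
avi_total = 302_000_000               # total AVI updates reported in Table 3.1
avi_min = 42_759                      # least frequently updated AVI element

ratio = avi_total / newmark_total
print(newmark_total)                          # 524400000
print(round(ratio, 3))                        # 0.576
print(round(newmark_steps / avi_min, 1))      # 32.3
```

The first two numbers reproduce the Newmark total in the table and the "roughly 60%" figure, while the last shows that the least active element is updated about 32 times less often under the AVI than under Newmark's algorithm.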

Figure 3.5. Neo-Hookean block example. Total energy as a function of time as computed by the AVI.

The excellent energy-conservation properties of Newmark's second-order explicit algorithm have been extensively documented in the engineering literature. In calculations, this good behavior manifests itself in the way in which the energy oscillates near the exact value, without displaying ostensible overall growth or decay. These empirical observations have some basis in theory, inasmuch as Newmark's second-order explicit algorithm may be shown to be symplectic (Kane, Marsden, Ortiz, and West [2000]). This in turn renders results on the long-time energy behavior of symplectic methods applicable to Newmark's algorithm (see, e.g., Hairer and Lubich [1997] and Reich [1999]). In particular, the theory of backward error analysis establishes that, for sufficiently small time steps $\Delta t$, symplectic methods have energy errors of order $(\Delta t)^r$ for exponentially long times, up to $(\Delta t)\,e^{C/(\Delta t)}$, where $r$ is the order of the method and $C$ is a constant.

Figure 3.6. Neo-Hookean block example. Instantaneous and accumulated local energy residual as a function of time for an element of the mesh. (Axes: t [ms], 0 to 100; $D_K^j S_d$ [J] and its accumulated sum [J].)

Our numerical tests suggest that the AVI algorithm possesses excellent energy-conservation properties as well. Thus, for instance, Fig. 3.5 shows the time evolution of the total energy of the block. It is remarkable that, despite not enforcing energy-balance exactly, the energy of the solid remains nearly constant throughout the calculations, up to 12,500,000 updates of the smallest element in the mesh, or 50 periods of oscillation of the block. The residual of the energy equation (2.27), given by the left hand side of eq. (2.26), for a single element of the mesh is also of considerable interest. The evolution of this residual in time, and the accumulated residual, are shown in Fig. 3.6 for an element chosen at random. This accumulated residual equals the excess energy stored by the element as a consequence of the lack of enforcement of local energy conservation. These numerical tests suggest that the local energy behavior of the AVI algorithm is also excellent. Thus, the accumulated energy residual remains below 1% of the value of the elemental energy at all times.

3.2 Three-Dimensional L-Shaped Beam

A second example concerns a three-dimensional free-standing L-shaped beam released from rest from a distorted configuration, Fig. 3.7. The material is identical to that in the preceding example. The mesh comprises 621 10-node tetrahedral elements and 1262 nodes. The local time step is computed as a fixed fraction of the Courant time step of the element.

Figure 3.7. Geometry and initial loading of the L-shaped beam. (Dimension labels in the figure: 3 m, 1 m, 2 m, 1 m.)

Figure 3.8. L-shaped beam example. Deformed configuration snapshots at intervals of 1 ms.

Figure 3.9. L-shaped beam example. Total energy as a function of time as computed by the AVI. (Axes: t [ms], 0 to 100; Total Energy [MJ], 648 to 654.)

A sequence of snapshots of the AVI solution is shown in Fig. 3.8. After 100 ms, the maximum and minimum number of elemental updates are 432,877 and 49,792, respectively, while the total number of elemental updates is $9 \times 10^7$. By way of comparison, the number of updates required by Newmark's algorithm is $27 \times 10^7$, or a factor of three larger than the AVI update count. The energy behavior of the AVI algorithm is again remarkable, both as regards global energy conservation, Fig. 3.9, and local energy balance, Fig. 3.10.

Figure 3.10. L-shaped beam example. Instantaneous and accumulated local energy residual as a function of time for an element of the mesh. (Axes: t [ms], 0 to 100; $D_K^j S_d$ [J] and its accumulated sum [J].)

4 Summary

We have described a class of asynchronous variational integrators (AVI) for finite-element nonlinear dynamics. The AVIs are characterized by the following distinguishing attributes: i) the algorithms permit the selection of independent time steps in each element, and the local time steps need not bear an integral relation to each other; ii) the algorithms derive from a spacetime form of a discrete version of Hamilton's principle. As a consequence of this variational structure, the algorithms conserve local energy and momenta exactly, subject to solvability of the local time steps. Numerical tests reveal that, even when local energy balance is not enforced exactly, the global and local energy behavior of the AVIs is quite remarkable, a property which can probably be traced to the symplectic nature of the algorithm.

In closing, we point out that the AVI methodology is not restricted to finite element calculations. Indeed, AVIs are applicable to any dynamical system in which the Lagrangian is expressible as a sum of component sub-Lagrangians. A case in point concerns molecular dynamics based on empirical potentials such as the embedded atom method, for which the total energy of the system is the sum of atom-by-atom contributions. For systems of this type, a treatment entirely identical to that described in this article permits updating each sub-system asynchronously with a frequency dictated by the sub-system's natural timescale. In this manner, AVIs provide a theoretically sound and computationally efficient basis for multiscale analysis of general dynamical systems in the time domain. In particular, the variational structure of the algorithms ensures proper global balance of conserved quantities for the entire system, as well as local detailed balance between the sub-systems.

References

Belytschko, T. [1981], Partitioned and adaptive algorithms for explicit time integration. In W. Wunderlich, E. Stein, and K.-J. Bathe, editors, Nonlinear Finite Element Analysis in Structural Mechanics, 572–584. Springer-Verlag.

Belytschko, T. and R. Mullen [1976], Mesh partitions of explicit–implicit time integrators. In K.-J. Bathe, J. T. Oden, and W. Wunderlich, editors, Formulations and Computational Algorithms in Finite Element Analysis, 673–690. MIT Press.

Bridges, T. J. and S. Reich [1999], Multi-symplectic integrators: numerical schemes for Hamiltonian PDEs that conserve symplecticity. (preprint).

Ge, Z. and J. E. Marsden [1988], Lie–Poisson integrators and Lie–Poisson Hamilton–Jacobi theory, Phys. Lett. A, 133, 134–139.

Gonzalez, O. [1996], Time integration and discrete Hamiltonian systems, J. Nonlinear Sci., 6, 449–468.

Gonzalez, O. and J. C. Simó [1996], On the stability of symplectic and energy-momentum algorithms for non-linear Hamiltonian systems with symmetry, Comp. Meth. Appl. Mech. Eng., 134, 197–222.

Gotay, M. J., J. Isenberg, J. E. Marsden, and R. Montgomery [1997], Momentum maps and classical relativistic fields, Part I: Covariant field theory. (Unpublished). Available from http://www.cds.caltech.edu/~marsden.

Kane, C., J. E. Marsden, and M. Ortiz [1999], Symplectic energy-momentum integrators, J. Math. Phys., 40, 3353–3371.

Kane, C., J. E. Marsden, M. Ortiz, and M. West [2000], Variational integrators and the Newmark algorithm for conservative and dissipative mechanical systems, Int. J. Num. Meth. Eng., 49, 1295–1325.

Knuth, D. [1998], The Art of Computer Programming, Addison-Wesley.

Hughes, T. J. R. [1987], The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. Prentice-Hall, Englewood Cliffs, N.J.

Hairer, E. and C. Lubich [2000], Long-time energy conservation of numerical methods for oscillatory differential equations, SIAM Journal on Numerical Analysis, 38, 414–441.

Hairer, E. and C. Lubich [1997], The life-span of backward error analysis for numerical integrators, Numerische Mathematik, 76, 441–462.

Lew, A., M. West, J. E. Marsden, and M. Ortiz [2001], Asynchronous Variational Integrators, in preparation.

Marsden, J. E. and T. J. R. Hughes [1994], Mathematical Foundations of Elasticity. Prentice Hall, 1983. Reprinted by Dover Publications, NY, 1994.

Marsden, J. E., G. W. Patrick, and S. Shkoller [1998], Multisymplectic geometry, variational integrators and nonlinear PDEs, Comm. Math. Phys., 199, 351–395.

Marsden, J. E., S. Pekarsky, S. Shkoller, and M. West [2001], Variational methods, multisymplectic geometry and continuum mechanics, J. Geometry and Physics, 38, 253–284.

Marsden, J. E. and S. Shkoller [1999], Multisymplectic geometry, covariant Hamiltonians and water waves, Math. Proc. Camb. Phil. Soc., 125, 553–575.

Marsden, J. E. and M. West [2001], Discrete variational mechanics and variational integrators, Acta Numerica, 10, 357–514.

Neal, M. O. and T. Belytschko [1989], Explicit-explicit subcycling with non-integer time step ratios for structural dynamic systems, Computers & Structures, 6, 871–880.

Ortiz, M. [1986], A note on energy conservation and stability of nonlinear time stepping algorithms, Computers and Structures, 24, 167–168.

Reich, S. [1999], Backward error analysis for numerical integrators, SIAM Journal on Numerical Analysis, 36, 1549–1570.

Simó, J. C., N. Tarnow, and K. K. Wong [1992], Exact energy-momentum conserving algorithms and symplectic schemes for nonlinear dynamics, Comp. Meth. Appl. Mech. Eng., 100, 63–116.

Smolinski, P. and Y.-S. Wu [1998], An implicit multi-time step integration method for structural dynamics problems, Computational Mechanics, 22, 337–343.

West, M. [2001], Variational Runge-Kutta methods for ODEs and PDEs. (preprint).

Part II

Fluid Mechanics

4 Euler–Poincaré Dynamics of Perfect Complex Fluids

Darryl D. Holm

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT Lagrangian reduction by stages is used to derive the Euler–Poincaré equations for the nondissipative coupled motion and micromotion of complex fluids. We mainly treat perfect complex fluids (PCFs) whose order parameters are continuous material variables. These order parameters may be regarded geometrically either as objects in a vector space, or as coset spaces of Lie symmetry groups with respect to subgroups that leave these objects invariant. Examples include liquid crystals, superfluids, Yang–Mills magnetofluids and spin-glasses. A Lie–Poisson Hamiltonian formulation of the dynamics for perfect complex fluids is obtained by Legendre transforming the Euler–Poincaré formulation. These dynamics are also derived by using the Clebsch approach. In the Hamiltonian and Lagrangian formulations of perfect complex fluid dynamics, Lie algebras containing two-cocycles arise as a characteristic feature. After discussing these geometrical formulations of the dynamics of perfect complex fluids, we give an example of how to introduce defects into the order parameter as imperfections (e.g., vortices) that carry their own momentum. The defects may move relative to the Lagrangian fluid material and thereby produce additional reactive forces and stresses.

Contents

1 Introduction
2 The Example of Liquid Crystals
  2.1 Background for Liquid Crystals
  2.2 Four Action Principles for Perfect Liquid Crystals
  2.3 Hamiltonian Dynamics of Perfect Liquid Crystals
  2.4 Summary for Perfect Liquid Crystals
3 Action Principles and Lagrangian Reduction
  3.1 Lagrangian Reduction by Stages
  3.2 Hamiltonian Dynamics of PCFs
  3.3 Clebsch Approach for PCF Dynamics
  3.4 Conclusions for PCFs
4 A Strategy for Introducing Defect Dynamics
  4.1 Vortices in Superfluid ⁴He
References

1 Introduction

Definitions. The hydrodynamic motion of a complex fluid depends on variables called order parameters that describe the macroscopic variations of the internal structure of the fluid parcels. These macroscopic variations may form observable patterns, as seen, for example, via the gradients of optical scattering properties in liquid crystals arising due to the spatially varying orientations of their molecules, as discussed in, e.g., Chandrasekhar [1992] and de Gennes and Prost [1993]. This micro-order of a complex fluid is described by an auxiliary macroscopic continuum field of geometrical objects associated with each fluid element and taking values in a vector space (or a manifold) called the order parameter space. The canonical example is the description of the local directional asymmetries of nematic liquid crystal molecules by a spatially and temporally varying macroscopic continuum field of unsigned unit vectors called “directors,” see, e.g., Chandrasekhar [1992] and de Gennes and Prost [1993].

Thus, the presence of micro-order breaks the symmetry group O of the uniform fluid state to a subgroup P. This spontaneous symmetry breaking occurs in every phase transition of condensed matter into an ordered state. The symmetry subgroup P ⊂ O that remains is the isotropy group of whatever geometrical object appears (e.g., a vector, a spin, a director, etc.) when the symmetries of the uniform fluid state are broken to produce the micro-order. That is, the remaining symmetry subgroup P is the symmetry group of the micro-order. Equivalently, the order parameter may also be regarded as taking its values in the coset space C = O/P and its space and time variations may be represented by a space and time dependent curve in the Lie symmetry group O through the action of O on its coset space C. Thus, associating a geometrical object in a vector space (say, a director) with an order parameter is a way of visualizing the coset space C, see, e.g., Mermin [1979] and Volovick [1992] for physical examples. The symmetry group O of the original uniform fluid state is called the order parameter group, or the broken symmetry. The order parameter dynamics that generates a curve parameterized by space and time in the broken symmetry group for a complex fluid is called its micromotion, although it refers to continuum properties at the coarse-grained macroscopic scale.

Spatial and temporal variations in the micro-order are measured relative to a reference configuration. Let O act transitively from the right on a manifold M. Suppose the subgroup P ⊂ O leaves invariant an arbitrarily chosen reference point $m_0 \in M$, i.e., $m_0 p = m_0$, ∀p ∈ P. The reference point $m_0$ then corresponds to the coset [e] = eP = P of O/P, where e is the identity element of O. For another choice of reference point, say $m_0' = m_0 h$ with h ∈ O, the isotropy subgroup becomes the conjugate subgroup $P \to P' = h^{-1} P h$ and the corresponding coset space $O/P'$ is, thus, isomorphic to the original coset space O/P. Hence, the conjugacy equivalence classes of the order parameter coset spaces account for the arbitrariness in the choice of reference point $m_0$. One may think equally well of the order parameter group as acting either on a manifold M with “origin” $m_0$, or on a coset space O/P, where P is the stabilizer of the reference point $m_0$.

For example, let O be the group of proper orthogonal transformations SO(3) acting transitively on directors in $\mathbb{R}^3$ (unit vectors with ends identified) and choose $m_0$ to be the vertical director in an arbitrary reference frame; so $m_0$ is invariant under the O(2) group of rotations around the vertical axis in that frame and reflections across its horizontal plane. This example applies to cylindrically symmetric nematic liquid crystal properties. The isotropy group P is O(2), and the liquid crystal director may be represented in the coset space SO(3)/O(2), which is isomorphic to $S^2/\mathbb{Z}_2$, the unit sphere $S^2$ with diametrically opposite points identified, i.e., the real projective plane $\mathbb{RP}^2$. Thus, after a reference configuration has been chosen, the micro-order of nematic liquid crystals may be represented equivalently as a space and time dependent curve in the broken symmetry group O acting on either the order parameter manifold $M = \mathbb{RP}^2$, or on the coset space SO(3)/O(2) of the broken symmetry SO(3).

In addition to its micromotion, the motion of a complex fluid also possesses the usual properties of classical fluid dynamics. In particular, the motion of a complex fluid involves the advection of thermodynamic state variables such as heat and mass, regarded as fluid properties taking values in a vector space $V^*$. The complete dynamical equations for complex fluids must describe both their motion and their micromotion. In general, these two types of motion will be nonlinearly coupled to each other. This coupling typically arises because properties of both types of motion appear in the stress tensor governing the total momentum.
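The nematic example above (SO(3) acting from the right on unsigned directors, with stabilizer O(2)) can be checked numerically. The following is only a minimal sketch: the helper names (`act_right`, `same_director`, `rot_x`, `rot_z`) and the sample angles are illustrative assumptions, and the half-turn about a horizontal axis is used here as an SO(3) element that sends the vertical director to its negative. The sketch tests that sample stabilizer elements p fix the reference director, and that the conjugated elements h⁻¹ p h fix the new reference director m0 h.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose, which is also the inverse for a rotation matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def act_right(m, g):
    """Right action m -> m g of a matrix g on a row vector m."""
    return [sum(m[k] * g[k][j] for k in range(3)) for j in range(3)]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def same_director(a, b, tol=1e-12):
    """Directors are unsigned, so a and b agree if a = b or a = -b."""
    return (all(abs(x - y) < tol for x, y in zip(a, b))
            or all(abs(x + y) < tol for x, y in zip(a, b)))

m0 = [0.0, 0.0, 1.0]                 # vertical reference director
spin = rot_z(0.7)                    # rotation about the vertical axis
flip = rot_x(math.pi)                # half-turn about a horizontal axis: m0 -> -m0
stabilizer_samples = [spin, flip, matmul(spin, flip)]

h = matmul(rot_x(0.4), rot_z(1.1))   # arbitrary change of reference point
m0_new = act_right(m0, h)            # m0' = m0 h

fixed_old = all(same_director(act_right(m0, p), m0) for p in stabilizer_samples)
# Conjugated elements p' = h^{-1} p h should fix the new reference director.
fixed_new = all(
    same_director(act_right(m0_new, matmul(transpose(h), matmul(p, h))), m0_new)
    for p in stabilizer_samples
)
print(fixed_old, fixed_new)
```

Row vectors are used so that the matrix product acts from the right, matching the convention in the text.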
This definition of complex fluid motion in terms of the continuum dynamics of both its order parameter and its usual fluid properties encompasses a wide range of models for complex fluid motion, including binary fluids, multi-phase fluids, polymeric materials, spin glasses, various other types of magnetic materials, superfluids and, of course, liquid crystals. The order parameter for each of these models provides a continuum (i.e., coarse-grained) description of the complex fluid’s internal degrees of freedom, or micro-order. Here we will discuss mainly the case in which these order parameters are continuous material variables, that is, they are continuous functions carried along with the fluid parcels¹. Such media shall be called perfect complex fluids. This name (abbreviated PCF) is chosen to contrast with the perfect simple fluid (which has no internal micro-degrees of freedom) and to provide a geometrical basis for treating defects as imperfections in the order parameter field that can propagate, or move relative to the material labels. The relative motion of defects through the medium generally introduces additional reactive forces and sources of dissipation. We believe the dynamics of defects in complex fluids is best approached after first discussing the dynamics of perfect complex fluids, which is interesting in its own right.

¹ In some cases, order parameters are determined from constraint relations that are Eulerian in nature, e.g., the volume fraction in two-phase interpenetrating flow as in Holm and Kupershmidt [1986]. This case will not be discussed here.

Here we shall concentrate on deriving the nonlinear dynamical equations for the ideal continuum motion and micromotion of perfect complex fluids. Once the equations for the nonlinear dynamics of their ideal (nondissipative) motion and micromotion are established, dissipative processes must be included for most physical applications of complex fluids. By tradition, this is accomplished phenomenologically in these models, by introducing kinetic coefficients, such as viscosity, mobility, thermal diffusivity, etc., so as to obey the requirements of the Clausius–Duhem relation that the entropy production rate be positive when the dynamics of all thermodynamic variables (including order parameters) are included, as in Dunn and Serrin [1985], Hohenberg and Halperin [1977]. Here we shall ignore dissipation entirely, trusting that it can be added later by using the standard phenomenological methods.

For the case that the order parameter group O is the proper orthogonal group SO(3), a geometrical approach to complex fluids exists as part of the rational theory of continuum dynamics for materials with orientational internal degrees of freedom, such as liquid crystals. Rational theories of such complex media began with Cosserat and Cosserat [1909]. The Cosserat theories were recapitulated at various times by many different people. See, e.g., Kléman [1983] and Eringen [1997] for recent developments and proposed applications of this approach for treating, e.g., liquid crystal dynamics in the tradition of the rational theory of continuum media.
The present paper starts with the example of the Ericksen–Leslie theory of nematic liquid crystals and develops the geometrical framework for continuum theories of perfect complex ﬂuids. In this geometrical framework, the motion and micromotion are nonlinearly and self-consistently coupled to one another by the composite actions of the diﬀeomorphisms and the order parameter group. The micromotion follows a curve in the order parameter group depending on time and material coordinate, and the motion is a time-dependent curve in the group of diﬀeomorphisms, which acts on the material coordinates of the ﬂuid parcels to carry them from their reference conﬁguration to their current positions. A feedback develops between the composite motion and micromotion, because the stress tensor aﬀecting the velocity of the diﬀeomorphisms depends on the gradient of the order parameter. The mathematical basis for our development is the method of Lagrangian reduction by stages, due to Cendra, Marsden, and Ratiu [2001]. As we shall see, obtaining the Euler–Poincar´e equations for perfect complex ﬂuids requires two stages of Lagrangian reduction, ﬁrst by the order parameter group and then by the diﬀeomorphism group.


The main results in this paper. The Euler–Poincaré approach provides a unified framework for modeling the dynamics of perfect complex fluids that preserves and extends the mathematical structure inherent in the dynamics of classical fluids and liquid crystals in the Eulerian description. This paper provides detailed derivations and discusses applications of the Euler–Lagrange equations, the Lagrange–Poincaré equations, the Euler–Poincaré equations, the Clebsch equations and the Lie–Poisson Hamiltonian structure of perfect complex fluids. The action principles for these equations appear at their various stages of transformation under reduction by symmetries. The theme of this paper is the reduction of degrees of freedom by transforming to variables in the action principles that are invariant under the symmetries of the Lagrangian. (This is the Lagrangian version of Poisson reduction on the Hamiltonian side.) The transformation is done in stages, not all at once, for the additional perspective we hope it brings and to illustrate how Lagrangian reduction can be iterated in condensed matter applications. It may also be possible to impose nonholonomic constraints by stages, by transforming to variables that either respect the constraints, or appear in their specifications. The main new results for liquid crystals and PCFs in this paper are:

1. Four action principles for PCF dynamics and their associated motion and micromotion equations at the various stages of reduction: the Euler–Lagrange equations, the Lagrange–Poincaré equations, the Euler–Poincaré equations, and the Clebsch equations.
2. The canonical and Lie–Poisson Hamiltonian formulations of these equations, their Lie algebraic structures and the Poisson maps between them.
3. The momentum conservation laws and Kelvin–Noether circulation theorems for these equations.
4. The reduced equations for one-dimensional dependence on either time or space, and the relation of these reduced equations to the Euler–Poisson equations for the dynamics of generalized tops.
5. A strategy for composing the dynamics of defects in a complex fluid with its underlying PCF dynamics.

Outline. In Section 2, we develop these results for the motion and micromotion in the example of nematic liquid crystals, in forms that shall parallel the results of the general theory derived in the following section. The Euler–Lagrange equations from Hamilton’s principle specialize to the Ericksen–Leslie equations, upon making the appropriate choices of the kinetic and potential energies. The Lagrange–Poincaré equations and Euler–Poincaré equations that follow from applying two successive stages of Lagrangian reduction of Hamilton’s principle with respect to its symmetries provide geometrical variants and generalizations of the Ericksen–Leslie equations. In Section 3, we perform two successive stages of Lagrangian symmetry reduction to derive first the Lagrange–Poincaré equations and then the Euler–Poincaré equations for PCFs with an arbitrary order parameter group. These Euler–Poincaré equations are Legendre-transformed to their Lie–Poisson Hamiltonian form. Their derivation by the Clebsch approach is also given. Finally, in Section 4 we discuss a strategy of including defect dynamics into the theory of PCF dynamics. This strategy introduces an additional set of fluid variables that describe the motion of the defects relative to the material coordinates of the PCF. We shall illustrate this strategy in the example of rotating superfluid ⁴He, in which U(1) is the broken symmetry and the defects are quantum vortices.

2 The Example of Liquid Crystals

2.1 Background for Liquid Crystals

We begin with a hands-on example that illustrates the utility of the ideas we shall develop in this paper. Liquid crystals provide a ubiquitous application that embodies these ideas and supplies a guide for developing them. Liquid crystals are the prototype for complex fluids. For extensive reviews, see Chandrasekhar [1992] and de Gennes and Prost [1993]. An orientational order parameter for a molecule of arbitrary shape is given in [Chandrasekhar, 1992, p. 40], as

\[ S = \tfrac{1}{2}\,\big\langle\, 3\,\chi \otimes \chi - \mathrm{Id} \otimes \mathrm{Id} \,\big\rangle , \tag{2.1} \]

where $\langle\,\cdot\,\rangle$ is a statistical average and $\chi \in SO(3)$ is a rotation that specifies the local molecular orientation relative to a fixed reference frame. Thus, in index notation,

\[ S_{klKL} = \tfrac{1}{2}\,\big\langle\, 3\,\chi_{kK}\,\chi_{lL} - \delta_{kl}\,\delta_{KL} \,\big\rangle . \tag{2.2} \]

This order parameter is traceless in both pairs of indices, since $\chi^T = \chi^{-1}$. For cylindrically shaped molecules (e.g., nematics, cholesterics or smectics²), we may choose the 3-axis, say, as the reference axis of symmetry, i.e., choose K = 3 = L and set

\[ n_k \equiv \chi_{k3} , \quad\text{so that}\quad |\mathbf{n}|^2 = \chi^T_{3k}\,\chi_{k3} = 1 , \tag{2.3} \]

and S becomes,

\[ S_{kl33} \equiv S_{kl} = \tfrac{1}{2}\,\big\langle\, 3\,n_k n_l - \delta_{kl} \,\big\rangle . \tag{2.4} \]

² Smectics form layers, so besides their director orientation they have an additional order parameter for their broken translational symmetry.

Note: The order parameter S does not distinguish between n and −n. Physically, S may be regarded as the quadrupole moment of a local molecular charge distribution. For a clear description of the use of this order parameter in assessing nematic order-disorder phase transitions using the modern theory of critical phenomena, see Lammert, Rokhsar, and Toner [1995]. Instead of considering such phase transitions, here we shall be interested in the continuum dynamics associated with the interaction of this order parameter with material deformations.

In passing to a continuum mechanics description, one replaces statistical averages by a local space and time dependent unit vector, or “director” n(x, t). Then, the continuum order parameter $S(\mathbf{n})$ corresponding to the statistical quantity S in (2.4) is the symmetric traceless tensor,

\[ S_{kl}(\mathbf{n}) \equiv \tfrac{1}{2}\,\big( 3\,n_k n_l - \delta_{kl} \big) , \tag{2.5} \]

which satisfies $S(\mathbf{n}) = S(-\mathbf{n})$ and admits n as an eigenvector, $S \cdot \mathbf{n} = \mathbf{n}$. The hydrodynamic tensor order parameter S represents the deviation from isotropy of any convenient tensor property of the medium. For example, the residual dielectric and diamagnetic energy densities of a nematic liquid crystal due to anisotropy may be expressed in terms of the tensor order parameter S as,

\[ \tfrac{1}{2}\,\Delta\epsilon\;\mathbf{E} \cdot S(\mathbf{n}) \cdot \mathbf{E} \quad\text{and}\quad \tfrac{1}{2}\,\Delta\mu\;\mathbf{B} \cdot S(\mathbf{n}) \cdot \mathbf{B} , \tag{2.6} \]

for (external) electric and magnetic fields E and B, respectively. (For simplicity, we neglect any dependence of the electric and magnetic polarizabilities of the medium on the gradients ∇n, although this possibility is allowed.)
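The algebraic properties claimed for the continuum order parameter (2.5) are easy to confirm numerically. The sketch below is illustrative only: the function names, the sample angles and the sample field E are assumptions, not part of the original text. It builds $S_{kl}(\mathbf{n})$ for a unit director and checks that it is traceless, that $S(\mathbf{n}) = S(-\mathbf{n})$, that $S \cdot \mathbf{n} = \mathbf{n}$, and that the quadratic form in (2.6) reduces to $\tfrac{1}{2}\big(3(\mathbf{E}\cdot\mathbf{n})^2 - |\mathbf{E}|^2\big)$.

```python
import math

def order_parameter(n):
    """S_kl(n) = (3 n_k n_l - delta_kl) / 2 for a unit director n, as in (2.5)."""
    return [[0.5 * (3.0 * n[k] * n[l] - (1.0 if k == l else 0.0))
             for l in range(3)] for k in range(3)]

def apply(S, v):
    """Matrix-vector product S . v."""
    return [sum(S[k][l] * v[l] for l in range(3)) for k in range(3)]

def quadratic_form(S, v):
    """Scalar v . S . v."""
    return sum(v[k] * S[k][l] * v[l] for k in range(3) for l in range(3))

theta, phi = 0.9, 2.3                       # arbitrary sample angles
n = [math.sin(theta) * math.cos(phi),
     math.sin(theta) * math.sin(phi),
     math.cos(theta)]
S = order_parameter(n)
S_flipped = order_parameter([-c for c in n])

trace = sum(S[k][k] for k in range(3))
Sn = apply(S, n)

E = [0.3, -1.2, 0.5]                        # sample external field
E_dot_n = sum(E[k] * n[k] for k in range(3))
E_sq = sum(c * c for c in E)
# Anisotropic energy density in (2.6), divided by its coefficient.
energy_per_coeff = 0.5 * quadratic_form(S, E)

print("trace of S:", trace)
print("max |S(n) - S(-n)|:",
      max(abs(S[k][l] - S_flipped[k][l]) for k in range(3) for l in range(3)))
print("max |S.n - n|:", max(abs(Sn[k] - n[k]) for k in range(3)))
print("energy check:", energy_per_coeff, 0.25 * (3.0 * E_dot_n ** 2 - E_sq))
```

The last line shows that the anisotropic energy depends on the field only through $(\mathbf{E}\cdot\mathbf{n})^2$ and $|\mathbf{E}|^2$, which is why it cannot distinguish n from −n.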

2.2 Four Action Principles for Perfect Liquid Crystals

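Liquid crystal action. The standard equations for the continuum dynamics of liquid crystals without defects are the Ericksen–Leslie equations, due to Ericksen [1960, 1961] and Leslie [1966, 1968], and reviewed, e.g., in Leslie [1979]. These equations express the dynamics of the director n and may be derived from an action principle δS = 0 with action $S = \int dt\, L$ in the following class,

\[ S = \int dt \int d^3X \; L\big( \dot{\mathbf{x}},\, J,\, \mathbf{n},\, \dot{\mathbf{n}},\, \nabla\mathbf{n} \big) , \tag{2.7} \]

with notation $\dot{\mathbf{x}} = \partial \mathbf{x}(\mathbf{X}, t)/\partial t$, so that overdot denotes material time derivative, $J = \det(\partial \mathbf{x}/\partial \mathbf{X})$ and ∇n(X, t) has spatial components given by the chain rule expression,

\[ \nabla_i \mathbf{n} = \Big( \frac{\partial \mathbf{x}}{\partial \mathbf{X}} \Big)^{-1}_{iA} \frac{\partial \mathbf{n}}{\partial X_A} \equiv \frac{\partial X_A}{\partial x_i} \frac{\partial}{\partial X_A}\, \mathbf{n}(\mathbf{X}, t) \equiv \mathbf{n}_{,i} . \tag{2.8} \]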


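Note that the coupling between the fluid dynamics x(X, t) and director dynamics n(X, t) occurs in the Lagrangian density L through the inverse deformation gradient, $(\partial \mathbf{x}/\partial \mathbf{X})^{-1}$, via the chain rule expression above for $\nabla_i \mathbf{n}(\mathbf{X}, t)$. Varying the action S in the fields x and n at fixed material position X and time t gives

\[
\begin{aligned}
\delta S = - \int dt \int d^3X \, \bigg\{ \;
& \delta x_p \bigg[ \Big( \frac{\partial L}{\partial \dot{x}_p} \Big)^{\!\cdot}
 + J \frac{\partial}{\partial x_p} \frac{\partial L}{\partial J}
 - J \frac{\partial}{\partial x_m} \Big( J^{-1} \frac{\partial L}{\partial \mathbf{n}_{,m}} \cdot \mathbf{n}_{,p} \Big) \bigg] \\
& + \delta \mathbf{n} \cdot \bigg[ \Big( \frac{\partial L}{\partial \dot{\mathbf{n}}} \Big)^{\!\cdot}
 - \frac{\partial L}{\partial \mathbf{n}}
 + J \frac{\partial}{\partial x_m} \Big( J^{-1} \frac{\partial L}{\partial \mathbf{n}_{,m}} \Big) \bigg] \bigg\} ,
\end{aligned}
\tag{2.9}
\]

with natural (homogeneous) conditions, expressing continuity of

\[ \frac{\partial L}{\partial J} \quad\text{and}\quad \hat{n}_m \frac{\partial L}{\partial \mathbf{n}_{,m}} , \tag{2.10} \]

on a boundary, or at a material interface, whose unit normal has spatial Cartesian components $\hat{n}_m$. (We always sum over repeated indices.)

Action principle #1: Euler–Lagrange equations. Stationarity of the action, δS = 0, i.e., Hamilton’s principle, thus yields the following Euler–Lagrange equations in the Lagrangian fluid description,

\[ \delta x_p : \quad \Big( \frac{\partial L}{\partial \dot{x}_p} \Big)^{\!\cdot}
 + J \frac{\partial}{\partial x_p} \frac{\partial L}{\partial J}
 - J \frac{\partial}{\partial x_m} \Big( J^{-1} \frac{\partial L}{\partial \mathbf{n}_{,m}} \cdot \mathbf{n}_{,p} \Big) = 0 , \tag{2.11} \]

\[ \delta \mathbf{n} : \quad \Big( \frac{\partial L}{\partial \dot{\mathbf{n}}} \Big)^{\!\cdot}
 - \frac{\partial L}{\partial \mathbf{n}}
 + J \frac{\partial}{\partial x_m} \Big( J^{-1} \frac{\partial L}{\partial \mathbf{n}_{,m}} \Big) = 0 . \tag{2.12} \]

The material volume element $J = \det(\partial \mathbf{x}/\partial \mathbf{X})$ satisfies an auxiliary kinematic equation, obtained from its definition,

\[ \dot{J} = J \, \frac{\partial X_A}{\partial x_i} \frac{\partial \dot{x}_i}{\partial X_A} . \tag{2.13} \]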
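The kinematic equation (2.13) can be checked numerically for a homogeneous motion x = F(t) X, for which the deformation gradient is F(t) itself. The sketch below is an illustration under stated assumptions: the particular matrix function `F` is a made-up example, and `det3` and `inv3` are small helper routines written for this check. It compares a centered finite-difference estimate of the time derivative of J = det F with the right-hand side of (2.13).

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix stored as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactor matrix)."""
    d = det3(m)
    cof = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [x for x in range(3) if x != i]
            c = [x for x in range(3) if x != j]
            minor = m[r[0]][c[0]] * m[r[1]][c[1]] - m[r[0]][c[1]] * m[r[1]][c[0]]
            cof[i][j] = (-1) ** (i + j) * minor
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def F(t):
    """Hypothetical deformation gradient F[i][A] = dx_i/dX_A at time t."""
    return [[math.exp(0.3 * t), 0.5 * t, 0.0],
            [0.1 * math.sin(t), 1.0 + 0.2 * t, 0.0],
            [0.0, 0.4 * t * t, math.exp(-0.1 * t)]]

def F_dot(t):
    """Exact time derivative of F, i.e. d(x_dot_i)/dX_A."""
    return [[0.3 * math.exp(0.3 * t), 0.5, 0.0],
            [0.1 * math.cos(t), 0.2, 0.0],
            [0.0, 0.8 * t, -0.1 * math.exp(-0.1 * t)]]

t = 0.7
J = det3(F(t))
F_inv = inv3(F(t))                 # F_inv[A][i] = dX_A/dx_i
Fd = F_dot(t)
# Right-hand side of (2.13): J (dX_A/dx_i) (d x_dot_i / dX_A).
rhs = J * sum(F_inv[A][i] * Fd[i][A] for A in range(3) for i in range(3))

step = 1e-5
lhs = (det3(F(t + step)) - det3(F(t - step))) / (2.0 * step)   # numerical dJ/dt
print("dJ/dt (finite difference):", lhs)
print("right-hand side of (2.13):", rhs)
```

For a homogeneous motion this is Jacobi's formula for the derivative of a determinant. Setting the trace of the velocity gradient to zero then keeps J constant.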

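Imposing constant J, say J = 1, gives incompressible flow. The last term in equation (2.11) is the divergence of the Ericksen stress tensor arising due to the dependence of the potential energy of the medium on the strain ∇n, see Ericksen [1960, 1961]. With the proper choice of Lagrangian density, namely,

\[ L = \tfrac{1}{2}\,\rho\,|\dot{\mathbf{x}}|^2 + \tfrac{1}{2}\,I\,|\dot{\mathbf{n}}|^2 + p\,(J - 1) + q\,\big( |\mathbf{n}|^2 - 1 \big) - F(\mathbf{n}, \nabla\mathbf{n}) , \tag{2.14} \]

where ρ and I are material constants, one finds that the Euler–Lagrange equations (2.11) and (2.12) produce the Ericksen–Leslie equations for incompressible liquid crystal flow, Chandrasekhar [1992] and de Gennes and Prost [1993],

\[ \rho\,\ddot{x}_i + \frac{\partial}{\partial x_j} \Big( p\,\delta_{ij} + \mathbf{n}_{,i} \cdot \frac{\partial F}{\partial \mathbf{n}_{,j}} \Big) = 0 , \tag{2.15} \]

\[ I\,\ddot{\mathbf{n}} - 2q\,\mathbf{n} + \mathbf{h} = 0 , \quad\text{with}\quad \mathbf{h} = \frac{\partial F}{\partial \mathbf{n}} - \frac{\partial}{\partial x_j} \frac{\partial F}{\partial \mathbf{n}_{,j}} . \tag{2.16} \]

In the Lagrangian density (2.14), the Lagrange multipliers p and q enforce the incompressibility condition J = 1 and the director normalization condition $|\mathbf{n}|^2 = 1$, respectively. The standard choice for the function F(n, ∇n) is the Oseen–Zöcher–Frank Helmholtz free energy density, as discussed, e.g., in Chandrasekhar [1992] and de Gennes and Prost [1993], namely³,

\[
\begin{aligned}
F(\mathbf{n}, \nabla\mathbf{n}) ={}& \underbrace{k_2\,(\mathbf{n} \cdot \nabla \times \mathbf{n})}_{\text{chirality}}
 + \underbrace{\tfrac{1}{2}\,k_{11}\,(\nabla \cdot \mathbf{n})^2}_{\text{splay}}
 + \underbrace{\tfrac{1}{2}\,k_{22}\,(\mathbf{n} \cdot \nabla \times \mathbf{n})^2}_{\text{twist}}
 + \underbrace{\tfrac{1}{2}\,k_{33}\,|\mathbf{n} \times \nabla \times \mathbf{n}|^2}_{\text{bend}} \\
& + \tfrac{1}{2}\,\Delta\epsilon\;\mathbf{E} \cdot S(\mathbf{n}) \cdot \mathbf{E}
 + \tfrac{1}{2}\,\Delta\mu\;\mathbf{B} \cdot S(\mathbf{n}) \cdot \mathbf{B} ,
\end{aligned}
\tag{2.17}
\]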

with $k_2 = 0$ for nematics (but nonzero for cholesterics) and $\mathbf{n} \cdot \nabla \times \mathbf{n} = 0$ for smectics. Since smectics form layers that break translational symmetry, their order parameter group may be taken as the Euclidean group E(3), or perhaps as SO(3)×U(1) in simple cases. We shall concentrate our attention on nematics in this section. However, the general theory for arbitrary order parameter groups developed in the next section would also encompass smectics.

Key concepts: Material angular frequency and spatial strain of rotation. The rate of rotation of a director n in the rest frame of the fluid material element that carries it is given by

\[ \boldsymbol{\nu} = \mathbf{n} \times \dot{\mathbf{n}} . \tag{2.18} \]

In terms of this material angular frequency ν (which is orthogonal to the director n), the dynamical equation (2.16) becomes simply,

\[ I\,\dot{\boldsymbol{\nu}} = \mathbf{h} \times \mathbf{n} . \tag{2.19} \]

Likewise, the amount by which a specified director field n(x) rotates under an infinitesimal spatial displacement from $x_i$ to $x_i + dx_i$ is given by,

\[ \boldsymbol{\gamma}_i = \mathbf{n} \times \nabla_i \mathbf{n} \quad\text{or}\quad \boldsymbol{\gamma} \cdot d\mathbf{x} \equiv \boldsymbol{\gamma}_i\, dx_i = \mathbf{n} \times d\mathbf{n} . \tag{2.20} \]
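The step from (2.16) to (2.19) is short and can be written out as follows. This is a sketch of the intermediate algebra, assuming only $|\mathbf{n}|^2 = 1$ (so that $\mathbf{n} \cdot \dot{\mathbf{n}} = 0$) and equation (2.16) in the form $I\,\ddot{\mathbf{n}} = 2q\,\mathbf{n} - \mathbf{h}$. The Lagrange multiplier q drops out because its term is parallel to n. The last line uses the triple product identity to recover $\dot{\mathbf{n}}$ from ν.

```latex
% Assumptions: |n|^2 = 1, hence n . \dot n = 0, and (2.16): I \ddot n = 2 q n - h.
\begin{align*}
\dot{\boldsymbol{\nu}}
  &= \frac{d}{dt}\big( \mathbf{n} \times \dot{\mathbf{n}} \big)
   = \dot{\mathbf{n}} \times \dot{\mathbf{n}} + \mathbf{n} \times \ddot{\mathbf{n}}
   = \mathbf{n} \times \ddot{\mathbf{n}} , \\
I\,\dot{\boldsymbol{\nu}}
  &= \mathbf{n} \times \big( 2q\,\mathbf{n} - \mathbf{h} \big)
   = -\,\mathbf{n} \times \mathbf{h}
   = \mathbf{h} \times \mathbf{n} , \\
\boldsymbol{\nu} \times \mathbf{n}
  &= \big( \mathbf{n} \times \dot{\mathbf{n}} \big) \times \mathbf{n}
   = \dot{\mathbf{n}}\,|\mathbf{n}|^2 - \mathbf{n}\,\big( \mathbf{n} \cdot \dot{\mathbf{n}} \big)
   = \dot{\mathbf{n}} .
\end{align*}
```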

3 This notation treats spatial and director components on the same (Cartesian) basis. Later, we will distinguish between these types of components, see, e.g., equation (2.83).


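Remarks on Kinematics. An orthogonal matrix with components $\chi_{kL}$ is associated with an orthonormal frame of unit vectors (l, m, n) whose components $(l_k, m_k, n_k) = (\chi_{k1}, \chi_{k2}, \chi_{k3})$ satisfy

\[ \epsilon_{jkl}\,\big( d\chi\,\chi^{-1} \big)_{lk} = \big( \mathbf{l} \times d\mathbf{l} + \mathbf{m} \times d\mathbf{m} + \mathbf{n} \times d\mathbf{n} \big)_j , \tag{2.21} \]

which is O(3) right invariant. Accordingly, the spatial rotational strain, with components

\[ (\mathbf{n} \times d\mathbf{n})_j = \epsilon_{jkl}\,\big( d\chi_{l3}\,\chi_{k3} \big) , \tag{2.22} \]

and the material angular velocity,

\[ (\mathbf{n} \times \dot{\mathbf{n}})_j = \epsilon_{jkl}\,\big( \dot{\chi}_{l3}\,\chi_{k3} \big) , \tag{2.23} \]

are each elements of $T S^2/\mathbb{Z}_2$, i.e., they are tangent to the sphere $|\mathbf{n}|^2 = 1$ and O(2) right invariant under SO(2) rotations about n and under reflections n → −n. Also, $\mathbf{n} \times d\mathbf{n}$ and $\mathbf{n} \times \dot{\mathbf{n}}$ are O(2) right invariant elements of the Lie algebra, $\mathfrak{so}(3) \simeq T\,SO(3)/SO(3)$, so they are elements of $\mathfrak{so}(3)/O(2)$. Being orthogonal to n, their evolution preserves $|\mathbf{n}|^2$. Therefore, $|\mathbf{n}|^2 = 1$ may be taken as an initial condition, and n may be reconstructed from $\dot{\mathbf{n}} = \boldsymbol{\nu} \times \mathbf{n}$. The spatial rotational strain $\mathbf{n} \times d\mathbf{n} = \boldsymbol{\gamma} \cdot d\mathbf{x}$ satisfies the kinematic relation,

\[ \big( \boldsymbol{\gamma} \cdot d\mathbf{x} \big)^{\cdot} = \big( 2\,\boldsymbol{\nu} \times \boldsymbol{\gamma}_i + \nabla_i \boldsymbol{\nu} \big)\, dx_i , \tag{2.24} \]

obtained from its definition. This kinematic relation involves only $\boldsymbol{\gamma}_i$, ν and $x_i$, which suggests transforming variables from $(\mathbf{n}, \dot{\mathbf{n}}, \nabla\mathbf{n})$ to $\boldsymbol{\gamma}_i$ and ν. (Note that $\boldsymbol{\gamma}_i$ and ν do not distinguish between n and −n.) Each component of $\boldsymbol{\gamma}_i$ is orthogonal to the director n. That is, $\mathbf{n} \cdot \boldsymbol{\gamma}_i = 0$, for i = 1, 2, 3. Using this fact and $|\mathbf{n}|^2 = 1$ gives the relations,

\[ \nabla_i \mathbf{n} = -\,\mathbf{n} \times \boldsymbol{\gamma}_i \quad\text{and}\quad \nabla \times \mathbf{n} = -\,\mathbf{n}\,\mathrm{tr}\,\gamma + n_m\,\boldsymbol{\gamma}_m , \tag{2.25} \]

or, equivalently, in index notation,

\[ \nabla_i n^a = -\,\epsilon_{abc}\, n^b \gamma_i^c \quad\text{and}\quad \epsilon_{ijk}\,\nabla_j n_k = -\,n_i\,\gamma_m^m + n_m\,\gamma_m^i , \tag{2.26} \]

where lower indices are spatial and upper indices denote director components. Consequently, we have the following useful transformation formulas,

\[ \nabla_i \mathbf{n} \cdot \nabla_j \mathbf{n} = \boldsymbol{\gamma}_i \cdot \boldsymbol{\gamma}_j , \quad \nabla_i \mathbf{n} \times \nabla_j \mathbf{n} = \boldsymbol{\gamma}_i \times \boldsymbol{\gamma}_j \quad\text{and}\quad \mathbf{n} \cdot \nabla \times \mathbf{n} = -\,\mathrm{tr}\,\gamma . \tag{2.27} \]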
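The relations (2.25) and (2.27) between ∇n and the rotational strain γ can be spot-checked numerically on a smooth unit director field. The following is a rough sketch under stated assumptions: the angle functions inside `director` and the sample point `x0` are arbitrary illustrative choices, and gradients are approximated by centered finite differences, so the residuals are small but not exactly zero.

```python
import math

def director(x):
    """Hypothetical smooth unit director field n(x), built from two angle fields."""
    theta = 0.5 + 0.3 * x[0] + 0.2 * x[1] * x[2]
    phi = 0.7 * x[1] - 0.4 * x[2] + 0.1 * x[0] ** 2
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def grad_n(x, step=1e-4):
    """Centered differences: result[i][a] approximates d n^a / d x_i."""
    rows = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += step
        xm[i] -= step
        n_plus, n_minus = director(xp), director(xm)
        rows.append([(n_plus[a] - n_minus[a]) / (2.0 * step) for a in range(3)])
    return rows

x0 = [0.2, -0.5, 0.8]
n = director(x0)
dn = grad_n(x0)
gamma = [cross(n, dn[i]) for i in range(3)]      # gamma[i][a] = gamma_i^a, as in (2.20)

tr_gamma = sum(gamma[m][m] for m in range(3))
curl_n = [dn[1][2] - dn[2][1], dn[2][0] - dn[0][2], dn[0][1] - dn[1][0]]

# Residual of grad_i n = -n x gamma_i, first relation in (2.25).
res_grad = max(abs(dn[i][a] + cross(n, gamma[i])[a])
               for i in range(3) for a in range(3))
# Residual of curl n = -n tr(gamma) + n_m gamma_m, second relation in (2.25).
res_curl = max(abs(curl_n[a] + n[a] * tr_gamma
                   - sum(n[m] * gamma[m][a] for m in range(3)))
               for a in range(3))
# Residuals of the three formulas in (2.27).
res_dot = max(abs(dot(dn[i], dn[j]) - dot(gamma[i], gamma[j]))
              for i in range(3) for j in range(3))
res_cross = max(abs(cross(dn[i], dn[j])[a] - cross(gamma[i], gamma[j])[a])
                for i in range(3) for j in range(3) for a in range(3))
res_trace = abs(dot(n, curl_n) + tr_gamma)

print(res_grad, res_curl, res_dot, res_cross, res_trace)
```

All five residuals should be at the level of the finite-difference error, since each identity relies only on $|\mathbf{n}|^2 = 1$ and $\mathbf{n} \cdot \nabla_i \mathbf{n} = 0$.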


Connection, curvature, singularities and topological indices. The Eulerian curl of spatial rotational strain $\boldsymbol{\gamma}_i$ yields the remarkable relation,

\[ B_{ij} \equiv \boldsymbol{\gamma}_{i,j} - \boldsymbol{\gamma}_{j,i} + 2\,\boldsymbol{\gamma}_i \times \boldsymbol{\gamma}_j = 0 , \tag{2.28} \]

where $B_{ij}$ vanishes, provided the vector field n is continuous. In geometrical terms, vanishing of $B_{ij}$ is the Maurer–Cartan relation for the flat connection one-form (or, left-invariant Cartan one-form) $\boldsymbol{\gamma}_i\,dx_i = \mathbf{n} \times d\mathbf{n}$, as discussed in, e.g., Flanders [1989]. That is, the curvature two-form $B_{ij}\,dx_i \wedge dx_j$ vanishes in the absence of singularities (disclinations) in the director field n(x). Thus, $B_{ij}$ may be regarded as the areal density of disclinations in a nematic liquid crystal. In Section 3, we shall discuss how to proceed when defects exist and the disclination density $B_{ij}$ does not vanish. A second remarkable formula also stems from the curl-γ equation (2.28), which implies the relation,

\[ \frac{\partial \gamma_j^a}{\partial x_i}\, dx_i \wedge dx_j = \epsilon_{abc}\, \frac{\partial n^b}{\partial x_i} \frac{\partial n^c}{\partial x_j}\, dx_i \wedge dx_j = \epsilon_{abc}\, dn^b \wedge dn^c , \tag{2.29} \]

for a continuous director field. Contracting with $n^a$ and taking the exterior derivative of this relation implies

\[
\begin{aligned}
\frac{\partial}{\partial x_k} \Big( n^a \frac{\partial \gamma_j^a}{\partial x_i} \Big)\, dx_i \wedge dx_j \wedge dx_k
 &= \epsilon_{abc}\, dn^a \wedge dn^b \wedge dn^c \\
 &= \det(\nabla \mathbf{n})\, d^3x = \det(\gamma)\, d^3x .
\end{aligned}
\tag{2.30}
\]

By $|\mathbf{n}|^2 = 1$, these expressions must vanish, when the director field has no singularities. Thus, in the absence of singularities, there is a vector v, for which

\[ \epsilon_{kij}\, n^a \frac{\partial \gamma_j^a}{\partial x_i} = \epsilon_{kij}\, \epsilon_{abc}\, n^a \frac{\partial n^b}{\partial x_i} \frac{\partial n^c}{\partial x_j} = \epsilon_{kij}\, v_{j,i} = (\mathrm{curl}\, \mathbf{v})_k , \tag{2.31} \]

or, equivalently,

\[ \epsilon_{abc}\, n^a\, \nabla n^b \times \nabla n^c = \mathrm{curl}\, \mathbf{v} . \tag{2.32} \]

Up to an inessential multiplicative factor, this formula is the well known Mermin–Ho relation between three-dimensional superfluid texture n and vorticity curl v in dipole-locked ³He-A, found in Mermin and Ho [1976]. Since n is a unit vector, a simple calculation in spherical polar coordinates transforms this relation to

\[ \epsilon_{abc}\, n^a\, \nabla n^b \times \nabla n^c = \nabla\phi \times \nabla \cos\theta = \mathrm{curl}\, \mathbf{v} , \tag{2.33} \]

for $\mathbf{n} = (\sin\theta \cos\phi,\ \sin\theta \sin\phi,\ \cos\theta)^T$ in terms of the polar angle θ and azimuthal angle φ on the sphere. Hence,

\[ \mathrm{curl}\, \mathbf{v} \cdot d\mathbf{S} = \epsilon_{abc}\, n^a\, dn^b \wedge dn^c = d\phi \wedge d(\cos\theta) , \tag{2.34} \]


which is the area element on the sphere. Thus, by Stokes’ theorem for a given closed curve C in $\mathbb{R}^3$ with unit tangent vector n(x), the integral

\[ 2\pi\, \mathrm{Wr} \equiv \oint_C \mathbf{v} \cdot d\mathbf{x} = \int_S \epsilon_{abc}\, n^a\, \gamma_k^b\, dx_k \wedge \gamma_l^c\, dx_l , \tag{2.35} \]

is the (signed) area swept out by n on a portion of the unit sphere upon traversing the curve C which is the boundary of the surface S in three dimensions. This is also equal to the writhe of the curve C, see Fuller [1978], as cited in Kamien [1998]. Thus, if the field lines of the three dimensional vector field n(x) form a closed curve C, the writhe of this curve is given by the area integral in equation (2.35) over a surface whose boundary is C. Equation (2.35) is the second remarkable formula implied by the curl-γ equation (2.28). Finally, the integral of v · curl v over three dimensional space gives a number,

\[ N = \frac{1}{4\pi} \int_{S^3} \mathbf{v} \cdot \mathrm{curl}\, \mathbf{v}\; d^3x , \tag{2.36} \]

which takes integer values and is called the Hopf index or degree of mapping for the map $\mathbf{n}(\mathbf{x}) : S^3 \to S^2$. The Hopf degree of mapping N counts the number of times the map $\mathbf{n}(\mathbf{x}) : S^3 \to S^2$ covers the unit sphere, as discussed in, e.g., Flanders [1989]. Equivalently, the Hopf degree of mapping N counts the number of linkages of the three dimensional divergenceless vector field curl v with itself. See Mineev [1980] for the physical interpretation of this formula in terms of disclinations in nematic liquid crystal experiments. See also Trebin [1982], Kléman [1983, 1989], and references therein for additional discussions of the differential geometry of defects in liquid crystal physics.

The quantity $\epsilon_{abc}\, n^a\, \nabla n^b \times \nabla n^c$ in equation (2.33) appears in many differential geometric contexts in physics: in the Mermin–Ho relation for the vorticity of superfluid ³He-A in terms of the “texture” n due to Mermin and Ho [1976]; in the Skyrmion Lagrangian, as discussed in, e.g., Trebin [1982]; as the n-field topological Wess-Zumino term in the O(3) nonlinear sigma model, discussed in, e.g., Yabu and Kuratsuji [1999], Tsurumaru and Tsutsui [1999]; and as the instanton number density in coset models. For references to the last topic see Coquereaux and Jadcyk [1994] and for a discussion of the associated Poisson-Lie models, see Stern [1999]. This same term also produces forces on vortices in ³He-A(B) as found in Hall [1985] and in ferromagnets, discussed in Kuratsuji and Yabu [1998]. Also, we have seen that this term allows the calculation of the “writhe” of a self-avoiding closed loop of, say, DNA, or some other polymer filament with unit tangent vector n, see, e.g., Fuller [1978], Goldstein, Powers, and Wiggins [1998], Kamien [1998]. See also Klapper [1996], and Goriely and Tabor [1997], for additional recent discussions of the differential geometric writhe of DNA conformations. Finally, we have identified the relation of the Hopf degree of mapping to the linkage number of the writhe flux, curl v. Perhaps surprisingly, in the case of liquid crystals, it will turn out that this topological index, the writhe of a fluid loop, and the flux of curvature $B_{ij}$ through a fluid surface (which measures the presence of singularities in the director field) are all three created in nonabelian PCF dynamics.

Action principle #2: Lagrange–Poincaré equations. Intrigued by the geometrical properties of the variables $\boldsymbol{\gamma}_i$ and ν, we shall seek to express the Ericksen–Leslie equations solely in terms of these variables, by applying symmetry reduction methods to Hamilton’s principle for liquid crystal dynamics. The trace of the spatial rotational strain γ gives

\[ \gamma_m^a \equiv \epsilon_{abc}\, n^b\, \nabla_m n^c \quad\Rightarrow\quad \mathrm{tr}\,\gamma \equiv \gamma_m^m = -\,\mathbf{n} \cdot \nabla \times \mathbf{n} , \tag{2.37} \]

which ﬁgures in the energies of chirality and twist in equation (2.17) for F (n, ∇n). We shall now drop the anisotropic dielectric and diamagnetic terms, which exert torques on the director angular momentum due to external E and B ﬁelds given by, hEM × n = 32 ∆ (E · n)E × n + 32 ∆µ (B · n)B × n ,

(2.38)

but which do not contribute in the motion equation. Upon neglecting these torques, which may always be restored later (see the Appendix), the remaining Oseen–Frank free energy $F(n, \nabla n)$ may be expressed (modulo a divergence term that will not contribute to the Euler–Lagrange equations) entirely as a function of a subset of the invariants of $\gamma$, namely, $\operatorname{tr}\gamma$, $\operatorname{tr}(\gamma + \gamma^T)^2$ and $\operatorname{tr}(\gamma - \gamma^T)^2$, with the help of the identities below, obtained using $|n|^2 = 1$. In the notation of Eringen [1997], one has

$$ I_1 \equiv |\nabla\times n|^2 = (n\cdot\nabla\times n)^2 + |n\times(\nabla\times n)|^2, \tag{2.39} $$

$$ I_2 \equiv n_{i,j}\,n_{i,j} = (\nabla\cdot n)^2 + |\nabla\times n|^2 + \big(n_j n_{i,j} - n_i n_{j,j}\big)_{,i} = |\nabla\times n|^2 + n_{j,i}\,n_{i,j}, \tag{2.40} $$

$$ I_3 \equiv |(n\cdot\nabla)n|^2 = |n\times(\nabla\times n)|^2. \tag{2.41} $$

Geometrically, the divergence term

$$ \big(n_j n_{i,j} - n_i n_{j,j}\big)_{,i} = 2\kappa_1\kappa_2, \tag{2.42} $$

appearing in equation (2.40) for $I_2$ is (twice) the Gaussian curvature of the surface Σ whose normal is n(x). This equation follows from Rodrigues' formula

$$ dn = -\,\kappa\, dx, \tag{2.43} $$

for the principal directions of the surface Σ. Such a surface exists globally, provided n · curl n = 0; see Weatherburn [1974], as cited in Kléman [1983].
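The pointwise identities (2.37), (2.39), (2.41) and the second equality in (2.40) are purely algebraic once $|n|^2 = 1$ is imposed, so they can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the helper names `sample_director` and `identity_residuals` are hypothetical, and the matrix `G[i][j]` stands in for the gradient $n_{i,j}$ at a single point, projected so that $n_i n_{i,j} = 0$.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def sample_director(seed):
    """Random unit vector n and gradient matrix G[i][j] = n_{i,j} (hypothetical data).

    G is projected so that n_i n_{i,j} = 0, which is what |n|^2 = 1 implies.
    """
    rng = random.Random(seed)
    n = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    norm = math.sqrt(dot(n, n))
    n = [x / norm for x in n]
    G = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
    for j in range(3):
        proj = sum(n[k] * G[k][j] for k in range(3))
        for i in range(3):
            G[i][j] -= n[i] * proj
    return n, G


def identity_residuals(n, G):
    """Left side minus right side for each identity; all should vanish."""
    curl = [G[2][1] - G[1][2], G[0][2] - G[2][0], G[1][0] - G[0][1]]
    n_dot_curl = dot(n, curl)
    n_x_curl = cross(n, curl)
    # gamma^a_m = eps_{abc} n_b n_{c,m}: column m of gamma is n x (d_m n)
    tr_gamma = sum(cross(n, [G[0][m], G[1][m], G[2][m]])[m] for m in range(3))
    I1 = dot(curl, curl)
    I2 = sum(G[i][j] ** 2 for i in range(3) for j in range(3))
    n_grad_n = [sum(G[i][j] * n[j] for j in range(3)) for i in range(3)]
    I3 = dot(n_grad_n, n_grad_n)
    mixed = sum(G[j][i] * G[i][j] for i in range(3) for j in range(3))
    return {
        "2.37": tr_gamma + n_dot_curl,
        "2.39": I1 - (n_dot_curl ** 2 + dot(n_x_curl, n_x_curl)),
        "2.40": I2 - (I1 + mixed),
        "2.41": I3 - dot(n_x_curl, n_x_curl),
    }


if __name__ == "__main__":
    n, G = sample_director(seed=0)
    for label, value in identity_residuals(n, G).items():
        print(label, value)
```

The first equality in (2.40) and the curvature formula (2.42) involve second derivatives of n and are not covered by this pointwise check.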


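Straightforward calculations give the following linear relations, see Eringen [1997],

$$ \operatorname{tr}(\gamma_S^2) = -\tfrac12 I_1 + I_2 + \tfrac12 I_3, \qquad \operatorname{tr}(\gamma_A^2) = -\tfrac12\,(I_1 + I_3), \tag{2.44} $$

where $\gamma_S$ and $\gamma_A$ are the symmetric and antisymmetric parts of $\gamma$,

$$ \gamma_S = \tfrac12\,(\gamma + \gamma^T) \quad\text{and}\quad \gamma_A = \tfrac12\,(\gamma - \gamma^T). \tag{2.45} $$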

Hence, modulo the divergence term in (2.40), the Oseen–Frank free energy in (2.17) may be written in terms of the invariants of $\gamma$ as $F\big(\operatorname{tr}\gamma,\ \operatorname{tr}(\gamma_S^2),\ \operatorname{tr}(\gamma_A^2)\big)$.

The reduced Lagrange–Poincaré action. By transforming to the rotational strain $\gamma_m$ and frequency $\nu$, the action S for liquid crystals in (2.7) may thus be reduced, that is, rewritten in fewer variables, by defining

$$ S = \int dt \int d^3X\; L(\dot x,\, J,\, \nu,\, \gamma_m), \tag{2.46} $$

where the reduced variables $\nu$ and $\gamma_m$ are perpendicular to the director n. A calculation using their definitions shows that the variations of J and $\nu$ satisfy

$$ \delta J = J\,\frac{\partial X_A}{\partial x_i}\frac{\partial\,\delta x_i}{\partial X_A}, \qquad \delta\nu = (n\times\delta n)^{\cdot} - 2\,\nu\times(n\times\delta n), \tag{2.47} $$

in terms of the variational quantity n × δn. Likewise, we calculate the variation of the rotational strain $\gamma_m$ as

$$ \delta\gamma_m = J^{-1}\frac{\partial}{\partial X_A}\Big(J\,\frac{\partial X_A}{\partial x_m}\; n\times\delta n\Big) - 2\,\gamma_m\times(n\times\delta n) - \gamma_p\, J^{-1}\frac{\partial}{\partial X_A}\Big(J\,\frac{\partial X_A}{\partial x_m}\,\delta x_p\Big). \tag{2.48} $$

This variational expression for $\gamma_m$ may be rearranged into the more suggestive form,

$$ \delta\gamma_m + \gamma_p\,\nabla_m\delta x_p = 2\,(n\times\delta n)\times\gamma_m + \nabla_m(n\times\delta n), \tag{2.49} $$

in which its similarity with the kinematic relation for $\gamma$ in (2.24) is more readily apparent, especially when the kinematic equations (2.13) and (2.24) are rewritten in the notation of equation (2.8) as

$$ \dot J = J\,\nabla_i\dot x_i \quad\text{and}\quad \dot\gamma_m + \gamma_p\,\nabla_m\dot x_p = 2\,\nu\times\gamma_m + \nabla_m\nu. \tag{2.50} $$


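The variation of the Lagrange–Poincaré action S in (2.46) for liquid crystals in the Lagrangian fluid description may now be rewritten as

$$
\begin{aligned}
\delta S ={}& -\int dt\int d^3X\,\bigg\{\delta x_p\bigg[\Big(\frac{\partial L}{\partial\dot x_p}\Big)^{\!\cdot} + J\,\frac{\partial}{\partial x_p}\frac{\partial L}{\partial J} - J\,\frac{\partial}{\partial x_m}\Big(J^{-1}\,\gamma_p\cdot\frac{\partial L}{\partial\gamma_m}\Big)\bigg] \\
&\qquad + (n\times\delta n)\cdot\bigg[\Big(\frac{\partial L}{\partial\nu}\Big)^{\!\cdot} - 2\,\nu\times\frac{\partial L}{\partial\nu} + J\,\frac{\partial}{\partial x_m}\Big(J^{-1}\frac{\partial L}{\partial\gamma_m}\Big) - 2\,\gamma_m\times\frac{\partial L}{\partial\gamma_m}\bigg]\bigg\} \\
& + \int dt\int d^3X\,\bigg\{\Big(\delta x_p\,\frac{\partial L}{\partial\dot x_p} + (n\times\delta n)\cdot\frac{\partial L}{\partial\nu}\Big)^{\!\cdot} \\
&\qquad + \frac{\partial}{\partial X_A}\bigg[\delta x_p\,\frac{\partial X_A}{\partial x_m}\Big(J\,\delta_{mp}\frac{\partial L}{\partial J} - \gamma_p\cdot\frac{\partial L}{\partial\gamma_m}\Big) + \frac{\partial X_A}{\partial x_m}\,(n\times\delta n)\cdot\frac{\partial L}{\partial\gamma_m}\bigg]\bigg\}.
\end{aligned}
\tag{2.51}
$$

Consequently, the action principle δS = 0 yields the following Lagrange–Poincaré equations for liquid crystals,

$$
\begin{aligned}
\delta x_p:&\quad \Big(\frac{\partial L}{\partial\dot x_p}\Big)^{\!\cdot} + J\,\frac{\partial}{\partial x_p}\frac{\partial L}{\partial J} - J\,\frac{\partial}{\partial x_m}\Big(J^{-1}\,\gamma_p\cdot\frac{\partial L}{\partial\gamma_m}\Big) = 0, \\
n\times\delta n:&\quad \Big(\frac{\partial L}{\partial\nu}\Big)^{\!\cdot} - 2\,\nu\times\frac{\partial L}{\partial\nu} + J\,\frac{\partial}{\partial x_m}\Big(J^{-1}\frac{\partial L}{\partial\gamma_m}\Big) - 2\,\gamma_m\times\frac{\partial L}{\partial\gamma_m} = 0,
\end{aligned}
\tag{2.52}
$$

with natural boundary conditions, cf. equation (2.10),

$$ \frac{\partial L}{\partial J} = 0 \quad\text{and}\quad \hat n_m\,\frac{\partial L}{\partial\gamma_m} = 0, \tag{2.53} $$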

on a boundary, or material interface, whose unit normal has spatial Cartesian components $\hat n_m$. These are the dynamical boundary conditions for liquid crystal motion. They ensure that the fluid pressure $\partial L/\partial J$ and the normal stress are continuous across a fluid interface. The term $\gamma_p\cdot\partial L/\partial\gamma_m$ contributes to the stress tensor of the complex fluid and arises from the dependence of its free energy upon the rotational strain $\gamma_i$. Again J = 1 for incompressible flow, as imposed by the Lagrange multiplier p, the fluid pressure. (The Lagrange multiplier q is no longer necessary.) Upon specializing to the Oseen–Frank-type Lagrangian (2.14), stationarity of the action $S(\dot x, J, \nu, \gamma)$ in equation (2.46) recovers the Ericksen–Leslie equations (2.15) written in these variables. This is an example of Lagrangian reduction, from the variables $(n, \dot n, \nabla n)$ to $(\nu, \gamma)$. The equations in fewer variables that result from the first stage of Lagrangian reduction were named the Lagrange–Poincaré equations in Marsden and Scheurle [1995] and Cendra, Marsden, and Ratiu [2001]. One of the goals of this paper is to characterize the geometrical properties of such equations for PCFs, using the mathematical framework of Lagrangian reduction by stages established in Cendra, Marsden, and Ratiu [2001]. From this viewpoint, the first stage of the Lagrangian reduction for liquid crystals is finished.


Action principle #3: Euler–Poincaré equations. The Lagrange–Poincaré equations (2.52) and their kinematic relations (2.50) are still expressed in the Lagrangian fluid description. We shall pass to the Eulerian fluid description by applying a second stage of Lagrangian reduction, now defined by the following right actions of the diffeomorphism group,

$$
\begin{aligned}
u(x,t) &= \dot x(X,t)\, g^{-1}(t), \\
D(x,t)\, d^3x &= d^3X\, g^{-1}(t), \\
\nu(x,t) &= \nu(X,t)\, g^{-1}(t), \\
\gamma_i(x,t)\, dx_i &= \Big(\gamma_i(X,t)\,\frac{\partial x_i(X,t)}{\partial X_A}\, dX_A\Big)\, g^{-1}(t),
\end{aligned}
\tag{2.54}
$$

where the right action denoted x(X, t) = X g(t) defines the fluid motion as following a time-dependent curve g(t) in the diffeomorphism group as it acts on the reference configuration of the fluid with coordinate X. In the Eulerian fluid description, the action S in Hamilton's principle δS = 0 is written as

$$ S = \int dt\int d^3x\; \ell(u, D, \nu, \gamma), \tag{2.55} $$

in terms of the Lagrangian density $\ell$ given by

$$ \ell(u, D, \nu, \gamma)\, d^3x = L\big(\dot x\, g^{-1}(t),\; J g^{-1}(t),\; \nu\, g^{-1}(t),\; \gamma\, g^{-1}(t)\big)\,\big(d^3X\, g^{-1}(t)\big). \tag{2.56} $$

The variations of these Eulerian fluid quantities are computed from their definitions to be,

$$ \delta u_j = \frac{\partial\eta_j}{\partial t} + u_k\frac{\partial\eta_j}{\partial x_k} - \eta_k\frac{\partial u_j}{\partial x_k}, \tag{2.57} $$

$$ \delta D = -\,\frac{\partial (D\eta_j)}{\partial x_j}, \tag{2.58} $$

$$ \delta\nu = \frac{\partial\Sigma}{\partial t} + u_m\frac{\partial\Sigma}{\partial x_m} - 2\,\nu\times\Sigma - \eta_m\frac{\partial\nu}{\partial x_m}, \tag{2.59} $$

$$ \delta\gamma_m = \frac{\partial\Sigma}{\partial x_m} - 2\,\gamma_m\times\Sigma - \eta_k\frac{\partial\gamma_m}{\partial x_k} - \gamma_k\frac{\partial\eta_k}{\partial x_m}, \tag{2.60} $$

where $\Sigma(x,t) \equiv (n\times\delta n)(X,t)\, g^{-1}(t)$ and $\eta \equiv \delta g\, g^{-1}(t)$. One may compare these Eulerian variations with the Lagrangian variations in (2.47) and (2.48). See also Holm, Marsden, and Ratiu [1998] for more discussion of such constrained variations in Eulerian fluid dynamics. Here the quantities D and $\gamma_m$ satisfy the following Eulerian kinematic equations, also obtained from their definitions, cf. the Lagrangian kinematic relations (2.50),

$$ \frac{\partial D}{\partial t} = -\,\frac{\partial (D u_j)}{\partial x_j}, \tag{2.61} $$

$$ \frac{\partial\gamma_m}{\partial t} = \frac{\partial\nu}{\partial x_m} + 2\,\nu\times\gamma_m - u_k\frac{\partial\gamma_m}{\partial x_k} - \gamma_k\frac{\partial u_k}{\partial x_m}. \tag{2.62} $$


Note the similarity in form between the Eulerian variations of D and $\gamma_m$, and their corresponding kinematic equations. This similarity arises because both the variations and the evolution equations for these quantities are obtained as infinitesimal Lie group actions.

Remark. The kinematic formula (2.62) for the evolution of $\gamma_m$ immediately implies the following γ-circulation theorem for the spatial rotational strain,

$$ \frac{d}{dt}\oint_{c(u)}\gamma_m\, dx_m = 2\oint_{c(u)}\nu\times\gamma_m\, dx_m. \tag{2.63} $$

Thus, the circulation of $\gamma$ around a loop c(u) moving with the fluid is only conserved when $\gamma_m\times\nu$ is a gradient. Otherwise, the curl of $\gamma_m\times\nu$ generates circulation of $\gamma$ around fluid loops. By Stokes' theorem and equation (2.29) we have,

$$ \oint_{c(u)}\gamma^a_m\, dx_m = \iint_{S(u)}\frac{\partial\gamma^a_m}{\partial x_j}\, dx_j\wedge dx_m = \iint_{S(u)}\epsilon_{abc}\, dn_b\wedge dn_c, \tag{2.64} $$

for a surface S(u) whose boundary is the fluid loop c(u). Consequently, the γ-circulation theorem (2.63) implies that a nonvanishing curl of $\gamma_m\times\nu$ generates a time-changing flux of $\epsilon_{abc}\,\nabla n_b\times\nabla n_c$ for a = 1, 2, 3, through those surfaces whose boundaries move with the fluid.

The nonabelian 2-cocycle terms create writhe and linkages. We use the kinematic equation (2.62) for $\gamma$ and the definition of $\nu$ to calculate the evolution of $\operatorname{curl} v\cdot dS = n_a\,\epsilon_{abc}\, dn_b\wedge dn_c$ as

$$ \big(\partial_t + \pounds_u\big)\big(\operatorname{curl} v\cdot dS\big) = 2\, d\nu_b\wedge dn_b = d\big(n\cdot 2\,\nu\times\gamma_m\, dx_m\big) = d\big(n\cdot\operatorname{ad}_\nu\gamma_m\, dx_m\big). $$

This means the nonabelian aspect of the generalized 2-cocycle (the ∇ν in the gamma equation) creates writhe in the boundary of a surface moving with the fluid under the gamma-evolution. That is, by Stokes' Law,

$$ \frac{d}{dt}\oint_{c(u)} v\cdot dx = \oint_{c(u)} n\cdot 2\,\nu\times\gamma_m\, dx_m = \oint_{c(u)} n\cdot\operatorname{ad}_\nu\gamma_m\, dx_m, $$

and the writhe of a fluid loop is not preserved. Likewise, we calculate

$$ \big(\partial_t + \pounds_u\big)\big(v\cdot dx\wedge d(v\cdot dx)\big) \neq \text{exact form}. $$

Therefore, we find

$$ \frac{d}{dt}\int v\cdot\operatorname{curl} v\; d^3x \neq 0. $$


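Thus, perhaps surprisingly, the linkage number, the Hopf degree of mapping in equation (2.36), is also not preserved by the gamma-evolution. So the nonabelian generalized 2-cocycle term in the gamma equation also creates linkages in the Mermin–Ho quantity, curl v. Conversely, the preservation of these quantities is an Abelian characteristic.

Euler–Poincaré action variations. We compute the variation of the action (2.55) in Eulerian variables at fixed time t and spatial position x as,

$$ \delta S = \int dt\int d^3x\,\Big[\frac{\delta\ell}{\delta u_j}\,\delta u_j + \frac{\delta\ell}{\delta D}\,\delta D + \frac{\delta\ell}{\delta\nu}\cdot\delta\nu + \frac{\delta\ell}{\delta\gamma_m}\cdot\delta\gamma_m\Big] \tag{2.65} $$

$$
\begin{aligned}
={}& \int dt\int d^3x\,\bigg\{\eta_j\bigg[-\frac{\partial}{\partial t}\frac{\delta\ell}{\delta u_j} - \frac{\delta\ell}{\delta u_k}\frac{\partial u_k}{\partial x_j} - \frac{\partial}{\partial x_k}\Big(u_k\frac{\delta\ell}{\delta u_j}\Big) + D\,\frac{\partial}{\partial x_j}\frac{\delta\ell}{\delta D} \\
&\qquad\quad - \frac{\delta\ell}{\delta\nu}\cdot\frac{\partial\nu}{\partial x_j} - \frac{\delta\ell}{\delta\gamma_m}\cdot\frac{\partial\gamma_m}{\partial x_j} + \frac{\partial}{\partial x_m}\Big(\gamma_j\cdot\frac{\delta\ell}{\delta\gamma_m}\Big)\bigg] \\
&\quad + \Sigma\cdot\bigg[-\frac{\partial}{\partial t}\frac{\delta\ell}{\delta\nu} - \frac{\partial}{\partial x_m}\Big(u_m\frac{\delta\ell}{\delta\nu} + \frac{\delta\ell}{\delta\gamma_m}\Big) + 2\,\nu\times\frac{\delta\ell}{\delta\nu} + 2\,\gamma_m\times\frac{\delta\ell}{\delta\gamma_m}\bigg] \\
&\quad + \frac{\partial}{\partial t}\Big[\eta_j\frac{\delta\ell}{\delta u_j} + \Sigma\cdot\frac{\delta\ell}{\delta\nu}\Big] \\
&\quad + \frac{\partial}{\partial x_m}\bigg[\eta_j\Big(u_m\frac{\delta\ell}{\delta u_j} - D\,\delta_{jm}\frac{\delta\ell}{\delta D} - \gamma_j\cdot\frac{\delta\ell}{\delta\gamma_m}\Big) + \Sigma\cdot\Big(u_m\frac{\delta\ell}{\delta\nu} + \frac{\delta\ell}{\delta\gamma_m}\Big)\bigg]\bigg\},
\end{aligned}
\tag{2.66}
$$

where we have substituted the variational expressions (2.57)–(2.60) and integrated by parts. The dynamical equations resulting from Hamilton's principle δS = 0 are obtained by requiring the coefficients of the arbitrary variations $\eta_j$ and $\Sigma$ to vanish. We assume these variations themselves vanish at the temporal endpoints and we defer discussing the boundary terms for a moment. Hence, we obtain the Euler–Poincaré equations for liquid crystals,

$$
\eta_j:\quad \frac{\partial}{\partial t}\frac{\delta\ell}{\delta u_j} = -\frac{\delta\ell}{\delta u_k}\frac{\partial u_k}{\partial x_j} - \frac{\partial}{\partial x_k}\Big(u_k\frac{\delta\ell}{\delta u_j}\Big) + D\,\frac{\partial}{\partial x_j}\frac{\delta\ell}{\delta D} - \frac{\delta\ell}{\delta\nu}\cdot\frac{\partial\nu}{\partial x_j} - \frac{\delta\ell}{\delta\gamma_m}\cdot\frac{\partial\gamma_m}{\partial x_j} + \frac{\partial}{\partial x_m}\Big(\gamma_j\cdot\frac{\delta\ell}{\delta\gamma_m}\Big),
\tag{2.67}
$$

$$
\Sigma:\quad \frac{\partial}{\partial t}\frac{\delta\ell}{\delta\nu} = -\frac{\partial}{\partial x_m}\Big(u_m\frac{\delta\ell}{\delta\nu} + \frac{\delta\ell}{\delta\gamma_m}\Big) + 2\,\nu\times\frac{\delta\ell}{\delta\nu} + 2\,\gamma_m\times\frac{\delta\ell}{\delta\gamma_m}.
\tag{2.68}
$$

Equation (2.68) agrees with the Lagrangian version of the micromotion equation governing $\nu$ in (2.52). To see this, it is helpful to recall that $\delta\ell/\delta\nu$ is an Eulerian density, so a Jacobian is involved in the transformation to the Lagrangian version.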


Momentum conservation. In momentum conservation form, the liquid crystal motion equation (2.67) in the Eulerian fluid description becomes

$$ \frac{\partial}{\partial t}\frac{\partial\ell}{\partial u_j} = -\frac{\partial}{\partial x_m}\bigg[u_m\frac{\partial\ell}{\partial u_j} + \delta_{mj}\Big(\ell - D\frac{\partial\ell}{\partial D}\Big) - \gamma_j\cdot\frac{\partial\ell}{\partial\gamma_m}\bigg], \tag{2.69} $$

for simple algebraic dependence of the Lagrangian $\ell$ on u, D, $\nu$ and $\gamma_m$. This momentum conservation law is in agreement with the direct passage to Eulerian coordinates of the motion equation (2.52) in the Lagrangian fluid description. For this transformation, it is helpful to recognize from equation (2.56) that $\ell = (L J^{-1})\, g(t)^{-1}$ implies, by the chain rule,

$$ \frac{\partial L}{\partial J}\, g(t)^{-1} = \ell - D\,\frac{\partial\ell}{\partial D}, \quad\text{since}\quad D(x,t) = J^{-1}(X,t)\, g(t)^{-1}. \tag{2.70} $$

The momentum conservation law (2.69) acquires additional terms if the Lagrangian also depends on gradients of u, D, $\nu$ and $\gamma_m$.

Noether's theorem. Noether's theorem associates conservation laws to continuous symmetries of Hamilton's principle. See, e.g., Olver [1993] for a clear discussion of the classical theory and Jackiw and Manton [1980] for its applications in gauge theories. The momentum conservation equation (2.69) also emerges from Noether's theorem, since the action S in equation (2.55) admits spatial translations, that is, since this action is invariant under the transformations,

$$ x_j \to x_j' = x_j + \eta_j(x,t) \quad\text{with}\quad \eta_j = c_j, \tag{2.71} $$

for constants $c_j$, with j = 1, 2, 3. To see how equation (2.69) emerges from Noether's theorem, simply add the term $\partial(\ell\,\eta_j)/\partial x_j$ (arising from transformations of the spatial coordinate) to the endpoint and boundary terms in the variational formula (2.65) arising from variations at fixed time t and spatial position x, then specialize to $\eta_j = c_j$.

Kelvin–Noether circulation theorem for liquid crystals. Rearranging the motion equation (2.67) and using the continuity equation for D in (2.61) gives the Kelvin–Noether circulation theorem, cf. Holm, Marsden, and Ratiu [1998],

$$ \frac{d}{dt}\oint_{c(u)}\frac{1}{D}\frac{\delta\ell}{\delta u_j}\, dx_j = -\oint_{c(u)}\frac{1}{D}\bigg[\frac{\delta\ell}{\delta\nu}\cdot d\nu + \frac{\delta\ell}{\delta\gamma_m}\cdot d\gamma_m - \frac{\partial}{\partial x_m}\Big(\gamma_j\cdot\frac{\delta\ell}{\delta\gamma_m}\Big)\, dx_j\bigg]. \tag{2.72} $$

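Hence, stresses in the director field of a liquid crystal and gradients in its angular frequency $\nu$ and rotational strain $\gamma$ can generate fluid circulation. Equivalently, by Stokes' theorem, these gradients of liquid crystal properties can generate vorticity, defined as $\omega \equiv \operatorname{curl}\,(D^{-1}\,\delta\ell/\delta u)$. For incompressible flows, we set D = 1 in these equations and write the vorticity dynamics as,

$$ \frac{\partial\omega_i}{\partial t} + (u\cdot\nabla)\,\omega_i - (\omega\cdot\nabla)\,u_i = \Big(\nabla\nu^a\times\nabla\frac{\delta\ell}{\delta\nu^a}\Big)_i + \Big(\nabla\gamma^a_m\times\nabla\frac{\delta\ell}{\delta\gamma^a_m}\Big)_i + \epsilon_{ijk}\,\frac{\partial}{\partial x_j}\frac{\partial}{\partial x_m}\Big(\gamma_k\cdot\frac{\delta\ell}{\delta\gamma_m}\Big). \tag{2.73} $$

Thus, spatial gradients in the director angular frequency $\nu$ and rotational strain $\gamma$ are sources of the fluid vorticity in a liquid crystal.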

2.3 Hamiltonian Dynamics of Perfect Liquid Crystals

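The Euler–Lagrange–Poincaré formulation of liquid crystal dynamics obtained so far allows passage to the corresponding Hamiltonian formulation via the following Legendre transformation of the reduced Lagrangian $\ell$ in the velocities u and $\nu$, in the Eulerian fluid description,

$$ m_i = \frac{\delta\ell}{\delta u_i}, \qquad \sigma = \frac{\delta\ell}{\delta\nu}, \qquad h(m, D, \sigma, \gamma_m) = m_i u_i + \sigma\cdot\nu - \ell(u, D, \nu, \gamma_m). \tag{2.74} $$

Accordingly, one computes the derivatives of h as

$$ \frac{\delta h}{\delta m_i} = u_i, \qquad \frac{\delta h}{\delta\sigma} = \nu, \qquad \frac{\delta h}{\delta D} = -\frac{\delta\ell}{\delta D}, \qquad \frac{\delta h}{\delta\gamma_m} = -\frac{\delta\ell}{\delta\gamma_m}. \tag{2.75} $$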
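As an illustration of the Legendre transformation (2.74) and the derivative relations (2.75), the sketch below works with a pointwise, purely algebraic Lagrangian density. The quadratic form of `lagrangian`, the constants `I_ROT` and `K`, and the single strain vector `gamma` are all assumptions made for this example; they are not the Oseen–Frank Lagrangian of the text.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


I_ROT = 0.5  # hypothetical rotational inertia of the director
K = 2.0      # hypothetical elastic constant multiplying |gamma|^2


def lagrangian(u, D, nu, gamma):
    """Assumed density: kinetic + rotational - internal - elastic energy."""
    return (0.5 * D * dot(u, u) + 0.5 * I_ROT * dot(nu, nu)
            - 0.5 * D * D - 0.5 * K * dot(gamma, gamma))


def legendre_transform(u, D, nu, gamma):
    """Return (m, sigma, h) as defined in (2.74) for the assumed density."""
    m = [D * x for x in u]              # m_i = dl/du_i
    sigma = [I_ROT * x for x in nu]     # sigma = dl/dnu
    h = dot(m, u) + dot(sigma, nu) - lagrangian(u, D, nu, gamma)
    return m, sigma, h


def hamiltonian(m, D, sigma, gamma):
    """Closed form of h in the variables (m, D, sigma, gamma)."""
    return (dot(m, m) / (2.0 * D) + dot(sigma, sigma) / (2.0 * I_ROT)
            + 0.5 * D * D + 0.5 * K * dot(gamma, gamma))


def central_diff(f, x, eps=1e-5):
    """Second-order finite-difference derivative of a scalar function."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    u, D = [0.3, -0.2, 0.5], 1.2
    nu, gamma = [0.1, 0.4, -0.7], [0.2, 0.0, -0.1]
    m, sigma, h = legendre_transform(u, D, nu, gamma)
    dh_dm0 = central_diff(lambda x: hamiltonian([x, m[1], m[2]], D, sigma, gamma), m[0])
    dh_dD = central_diff(lambda x: hamiltonian(m, x, sigma, gamma), D)
    dl_dD = central_diff(lambda x: lagrangian(u, x, nu, gamma), D)
    print("dh/dm_0 - u_0   =", dh_dm0 - u[0])
    print("dh/dD + dl/dD   =", dh_dD + dl_dD)
```

Note that the derivative of h with respect to D is taken at fixed m, while the derivative of the Lagrangian is taken at fixed u; the two agree up to sign, as stated in (2.75).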

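Consequently, the Euler–Poincaré equations (2.67)–(2.68) together with the auxiliary kinematic equations (2.61)–(2.62) for liquid crystal dynamics in the Eulerian description imply the following equations for the Legendre-transformed variables $(m, D, \sigma, \gamma)$,

$$
\begin{aligned}
\frac{\partial m_i}{\partial t} ={}& -m_j\,\frac{\partial}{\partial x_i}\frac{\delta h}{\delta m_j} - \frac{\partial}{\partial x_j}\Big(m_i\frac{\delta h}{\delta m_j}\Big) - D\,\frac{\partial}{\partial x_i}\frac{\delta h}{\delta D} \\
& + \frac{\partial\gamma_j}{\partial x_i}\cdot\frac{\delta h}{\delta\gamma_j} - \frac{\partial}{\partial x_j}\Big(\gamma_i\cdot\frac{\delta h}{\delta\gamma_j}\Big) - \sigma\cdot\frac{\partial}{\partial x_i}\frac{\delta h}{\delta\sigma},
\end{aligned}
\tag{2.76}
$$

$$ \frac{\partial D}{\partial t} = -\frac{\partial}{\partial x_j}\Big(D\,\frac{\delta h}{\delta m_j}\Big), \tag{2.77} $$

$$ \frac{\partial\sigma}{\partial t} = -\frac{\partial}{\partial x_j}\Big(\sigma\,\frac{\delta h}{\delta m_j} - \frac{\delta h}{\delta\gamma_j}\Big) - 2\,\sigma\times\frac{\delta h}{\delta\sigma} - 2\,\gamma_j\times\frac{\delta h}{\delta\gamma_j}, \tag{2.78} $$

$$ \frac{\partial\gamma_i}{\partial t} = -\gamma_j\,\frac{\partial}{\partial x_i}\frac{\delta h}{\delta m_j} - \frac{\partial\gamma_i}{\partial x_j}\frac{\delta h}{\delta m_j} + \frac{\partial}{\partial x_i}\frac{\delta h}{\delta\sigma} - 2\,\gamma_i\times\frac{\delta h}{\delta\sigma}. \tag{2.79} $$

These equations are Hamiltonian. That is, they may be expressed in the form

$$ \frac{\partial z}{\partial t} = \{z, h\} = b\cdot\frac{\delta h}{\delta z}, \tag{2.80} $$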


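where $z \in (m, D, \sigma, \gamma)$ and the Hamiltonian matrix b defines the Poisson bracket

$$ \{f, h\} = \int d^n x\; \frac{\delta f}{\delta z}\cdot b\cdot\frac{\delta h}{\delta z}, \tag{2.81} $$

which is bilinear, skew symmetric and satisfies the Jacobi identity, $\{f, \{g, h\}\} + \text{c.p.}(f, g, h) = 0$. Assembling the liquid crystal equations (2.76)–(2.79) into the Hamiltonian form (2.80) gives,

$$
\frac{\partial}{\partial t}
\begin{pmatrix} m_i \\ D \\ \gamma_i \\ \sigma \end{pmatrix}
= -A
\begin{pmatrix} \delta h/\delta m_j \\ \delta h/\delta D \\ \delta h/\delta\gamma_j \\ \delta h/\delta\sigma \end{pmatrix}
\tag{2.82}
$$

where

$$
A = \begin{pmatrix}
m_j\partial_i + \partial_j m_i & D\,\partial_i & (\partial_j\gamma_i - \gamma_{j,i})\,\cdot & \sigma\cdot\partial_i \\
\partial_j D & 0 & 0 & 0 \\
\gamma_j\partial_i + \gamma_{i,j} & 0 & 0 & -\partial_i + 2\,\gamma_i\times \\
\partial_j\sigma & 0 & -\partial_j + 2\,\gamma_j\times & 2\,\sigma\times
\end{pmatrix}.
$$

In components, this Hamiltonian matrix expression becomes,

$$
\frac{\partial}{\partial t}
\begin{pmatrix} m_i \\ D \\ \gamma_i^\alpha \\ \sigma_\alpha \end{pmatrix}
= -B
\begin{pmatrix} \delta h/\delta m_j \\ \delta h/\delta D \\ \delta h/\delta\gamma_j^\beta \\ \delta h/\delta\sigma_\beta \end{pmatrix}
\tag{2.83}
$$

where

$$
B = \begin{pmatrix}
m_j\partial_i + \partial_j m_i & D\,\partial_i & \partial_j\gamma_i^\beta - \gamma^\beta_{j,i} & \sigma_\beta\,\partial_i \\
\partial_j D & 0 & 0 & 0 \\
\gamma_j^\alpha\partial_i + \gamma^\alpha_{i,j} & 0 & 0 & -\delta^\alpha_\beta\,\partial_i - t^\alpha_{\beta\kappa}\,\gamma_i^\kappa \\
\partial_j\sigma_\alpha & 0 & -\delta_\alpha^\beta\,\partial_j + t^\beta_{\alpha\kappa}\,\gamma_j^\kappa & -t^\kappa_{\alpha\beta}\,\sigma_\kappa
\end{pmatrix},
$$

where $t^\alpha_{\beta\kappa} = 2\,\epsilon_{\alpha\beta\kappa}$ represents (twice) the vector cross product for liquid crystals. In this expression, the operators act to the right on all terms in a product by the chain rule and, as usual, the summation convention is enforced on repeated indices. At this point we have switched to using both lower and upper Greek indices for the internal degrees of freedom, so that we will agree later with the more general theory, in which upper Greek indices refer to a basis set in a Lie algebra, and lower Greek indices refer to the corresponding dual basis. Lower Latin indices still denote spatial components.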
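The correspondence between the cross-product entries of the vector form A and the structure-constant entries of the component form B can be checked directly. The sketch below assumes that t is totally antisymmetric, $t_{\alpha\beta\kappa} = 2\epsilon_{\alpha\beta\kappa}$, so that raising or lowering indices does not matter for so(3); the function names are hypothetical.

```python
import random


def levi_civita(a, b, c):
    """Permutation symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def t(a, b, k):
    """Structure constants t = 2 * epsilon (assumed totally antisymmetric)."""
    return 2 * levi_civita(a, b, k)


def two_cross(x, y):
    """Return 2 (x cross y), the operator written as '2 x times' in A."""
    return [2 * (x[1] * y[2] - x[2] * y[1]),
            2 * (x[2] * y[0] - x[0] * y[2]),
            2 * (x[0] * y[1] - x[1] * y[0])]


def minus_t_contract(x, v):
    """Component a of -t_{a b k} x_k v_b (last-column entries of B)."""
    return [-sum(t(a, b, k) * x[k] * v[b] for b in range(3) for k in range(3))
            for a in range(3)]


def plus_t_contract(x, w):
    """Component a of t_{b a k} x_k w_b (bottom-row gamma entry of B)."""
    return [sum(t(b, a, k) * x[k] * w[b] for b in range(3) for k in range(3))
            for a in range(3)]


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.uniform(-1, 1) for _ in range(3)]
    v = [rng.uniform(-1, 1) for _ in range(3)]
    print(two_cross(x, v))
    print(minus_t_contract(x, v))
    print(plus_t_contract(x, v))
```

All three printed vectors should coincide up to rounding, which is the sense in which the entries of B reproduce the operators 2γ× and 2σ× appearing in A.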


Remarks about the Hamiltonian matrix. The Hamiltonian matrix in equation (2.83) was discovered some time ago in the context of spin-glasses and Yang–Mills magnetohydrodynamics (YM-MHD) by using the Hamiltonian approach in Holm and Kupershmidt [1988]. There, it was shown to be a valid Hamiltonian matrix by associating its Poisson bracket as defined in equation (2.81) with the dual space of a certain Lie algebra of semidirect-product type that has a generalized two-cocycle on it. The mathematical discussion of this Lie algebra and its generalized two-cocycle is given in Holm and Kupershmidt [1988]. A related Poisson bracket for spin glass fluids is given in Volovik and Dotsenko [1980]. A different Poisson bracket for nematic liquid crystals is given in Kats and Lebedev [1994], who discuss a constrained Poisson bracket that in general does not satisfy the Jacobi identity. The liquid crystal equations in Kats and Lebedev [1994] also differ from the Ericksen–Leslie equations by being first order in the time derivatives of the director, rather than second order, as in the Ericksen–Leslie theory. See also Isaev, Kovalevskii, and Peletminskii [1995] for a discussion of this first order theory using the Poisson bracket approach. The present work ignores the first order (kinematic) theory in what follows and concentrates on second order (dynamic) theory. Being dual to a Lie algebra, our matrix in equation (2.83) is in fact a Lie–Poisson Hamiltonian matrix. See, e.g., Marsden and Ratiu [1999] and references therein for more discussions of such Hamiltonian matrices. For our present purposes, its rediscovery in the PCF context links the physical and mathematical interpretations of the variables in the theory of PCFs with earlier work in the gauge theory approach to condensed matter, see, e.g., Kleinert [1989].
These gauge theory aspects emerge upon rewriting the Lie–Poisson Hamiltonian equations in terms of covariant derivatives with respect to the space-time connection one-form given by $\nu\, dt + \gamma_m\, dx_m$, as done in Holm and Kupershmidt [1988]. The gauge theory approach to liquid crystal physics is reviewed in, e.g., Trebin [1982], Kleman [1983], and Kleman [1989]. The generalized two-cocycle in the Hamiltonian matrix (2.83) is somewhat exotic for a classical fluid. This generalized two-cocycle consists of the partial derivatives in equation (2.83) appearing with the Kronecker deltas in the σ–γ cross terms. The first hint of these terms comes from the exterior derivative dν appearing in the kinematic equation (2.62) for $\gamma_m$. Finding such a feature in the continuum theory of liquid crystals may provide a bridge for transferring ideas and technology between the classical and quantum fluid theories⁴. See Volovick [1992] and Volovick and Vachaspati [1996] for discussions of similar opportunities for technology transfer in the theory of superfluid Helium. The implications of the generalized two-cocycle for the solutions of the liquid crystal equations can be seen by considering two special cases: static solutions with one-dimensional spatial dependence; and spatially homogeneous, but time-dependent, liquid crystal dynamics.

⁴ In quantum field theory, these partial derivative operators in the Poisson bracket (or commutator relations) are called non-ultralocal terms, or Schwinger terms, after Schwinger [1951, 1959]. These terms lead to the so-called "quantum anomalies." Of course, no quantum effects are considered here. However, the Poisson bracket (2.83) still contains classical Schwinger terms.

Static perfect liquid crystal solutions with z-variation. Static (zero-velocity, steady) solutions with one-dimensional spatial variations in, say, the z-direction obey equations (2.76)–(2.79) specialized to

$$
\begin{aligned}
-D\,\frac{d}{dz}\frac{\delta h}{\delta D} &= \gamma_3\cdot\frac{d}{dz}\frac{\delta h}{\delta\gamma_3} + \sigma\cdot\frac{d}{dz}\frac{\delta h}{\delta\sigma} = 0, \\
\frac{d}{dz}\frac{\delta h}{\delta\gamma_3} &= 2\,\gamma_3\times\frac{\delta h}{\delta\gamma_3} + 2\,\sigma\times\frac{\delta h}{\delta\sigma}, \\
\frac{d}{dz}\frac{\delta h}{\delta\sigma} &= 2\,\gamma_3\times\frac{\delta h}{\delta\sigma}.
\end{aligned}
\tag{2.84}
$$

The sum of terms in the first equation of the set (2.84) vanishes to give zero pressure gradient, as a consequence of the latter two equations. We compare the latter two equations with the E(3) Lie–Poisson Hamiltonian systems, given by

$$
\begin{aligned}
-\frac{d\Pi}{dt} &= \frac{\partial H}{\partial\Pi}\times\Pi + \frac{\partial H}{\partial\Gamma}\times\Gamma, \\
-\frac{d\Gamma}{dt} &= \frac{\partial H}{\partial\Pi}\times\Gamma.
\end{aligned}
\tag{2.85}
$$

When the Hamiltonian $H(\Pi, \Gamma)$ in these equations is specialized to

$$ H = \sum_{i=1}^{3}\frac{(\Pi_i)^2}{2 I_i} + M g\,\chi\cdot\Gamma, \tag{2.86} $$

where $I_i$, i = 1, 2, 3, and $Mg\chi$ are constants, then the E(3) Lie–Poisson equations (2.85) specialize to the classical Euler–Poisson equations for a heavy top, discussed in, e.g., Marsden and Ratiu [1999]. By comparing these two equation sets, one observes that (at least for algebraic h) the static perfect liquid crystal flows with z-variation in equations (2.84) and the E(3) Lie–Poisson tops governed by equations (2.85) are Legendre duals to each other under the map,

$$ \frac{d}{dz} = -2\,\frac{d}{dt}, \qquad h(\gamma_3, \sigma) = \Pi\cdot\gamma_3 + \Gamma\cdot\sigma - H(\Pi, \Gamma), \tag{2.87} $$

so that

$$ \frac{\partial h}{\partial\gamma_3} = \Pi, \qquad \frac{\partial h}{\partial\sigma} = \Gamma, \qquad \gamma_3 = \frac{\partial H}{\partial\Pi}, \qquad \sigma = \frac{\partial H}{\partial\Gamma}. \tag{2.88} $$

Hence, we arrive at the result:


The class of static one dimensional flows of a perfect liquid crystal is Legendre-isomorphic to the class of E(3) Lie–Poisson tops.

These tops conserve $\Pi\cdot\Gamma$ and $|\Gamma|^2$, but in general they are not integrable.

Spatially homogeneous, unsteady flows of perfect liquid crystals. Spatially homogeneous solutions of equations (2.76)–(2.79) obey the dynamical equations,

$$ \frac12\,\frac{d\sigma}{dt} = \frac{\delta h}{\delta\sigma}\times\sigma + \frac{\delta h}{\delta\gamma_m}\times\gamma_m, \tag{2.89} $$

$$ \frac12\,\frac{d\gamma_m}{dt} = \frac{\delta h}{\delta\sigma}\times\gamma_m. \tag{2.90} $$
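The statement that the E(3) Lie–Poisson tops conserve Π·Γ and |Γ|² can be illustrated by integrating equations (2.85) with the heavy-top Hamiltonian (2.86). The sketch below uses a classical fourth-order Runge–Kutta step; the values of `INERTIA`, `MG_CHI` and the initial state are arbitrary choices made for this example, and the integrator is a generic one, not a structure-preserving scheme.

```python
INERTIA = (1.0, 2.0, 3.0)   # hypothetical principal moments I_i
MG_CHI = (0.0, 0.0, 1.0)    # hypothetical constant vector M g chi


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rhs(state):
    """Right-hand side of (2.85) for H in (2.86); state = Pi followed by Gamma."""
    pi, gam = state[:3], state[3:]
    omega = [p / I for p, I in zip(pi, INERTIA)]   # dH/dPi
    a = cross(omega, pi)       # dH/dPi x Pi
    b = cross(MG_CHI, gam)     # dH/dGamma x Gamma
    c = cross(omega, gam)      # dH/dPi x Gamma
    return [-(a[i] + b[i]) for i in range(3)] + [-c[i] for i in range(3)]


def rk4_step(state, dt):
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (p + 2 * q + 2 * r + w) / 6.0
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]


def integrate(state, dt, steps):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


def invariants(state):
    """Return (Pi . Gamma, |Gamma|^2, H)."""
    pi, gam = state[:3], state[3:]
    energy = sum(p * p / (2.0 * I) for p, I in zip(pi, INERTIA)) + dot(MG_CHI, gam)
    return dot(pi, gam), dot(gam, gam), energy


if __name__ == "__main__":
    start = [1.0, 0.5, -0.3, 0.0, 0.6, 0.8]
    end = integrate(start, dt=1e-3, steps=2000)
    print("initial:", invariants(start))
    print("final:  ", invariants(end))
```

The two Casimirs and the energy should agree between the initial and final states up to the truncation error of the integrator, while the state itself changes.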

For a single value of the spatial index, say m = 3, these are nothing more than the E(3) top equations (2.85) with time re-parameterized by t → −2t. Hence, in this case, the Hamiltonian internal dynamics of a spatially homogeneous liquid crystal is essentially identical to the E(3) Lie–Poisson dynamics of a top. In the multi-component case, one sums over m = 1, 2, 3, in the second term of equation (2.89) and, thus, the resulting dynamics is more complex than the simple top. Hence, we have:

The class of spatially homogeneous, time-dependent perfect liquid crystal flows is isomorphic to the generalization (2.89)–(2.90) of the E(3) Lie–Poisson tops.

Action principle #4: Clebsch representation. Another representation of Hamilton's principle for liquid crystals in the Eulerian fluid description may be found by constraining the Eulerian action S in equation (2.55) by using Lagrange multipliers to enforce the kinematic equations (2.61) and (2.62). The constrained action is, thus,

$$
S = \int dt\int d^3x\,\bigg\{\ell(u, D, \nu, \gamma) + \phi\,\Big(\frac{\partial D}{\partial t} + \frac{\partial (D u_j)}{\partial x_j}\Big) + \beta_m\cdot\Big(\frac{\partial\gamma_m}{\partial t} - \frac{\partial\nu}{\partial x_m} + 2\,\gamma_m\times\nu + u_k\frac{\partial\gamma_m}{\partial x_k} + \gamma_k\frac{\partial u_k}{\partial x_m}\Big)\bigg\},
\tag{2.91}
$$

with Lagrange multipliers $\phi$ and $\beta_m$. Stationarity of S under variations in $u_k$ and $\nu$ implies the relations

$$ \delta u_k:\quad \frac{\delta\ell}{\delta u_k} - D\,\frac{\partial\phi}{\partial x_k} + \beta_m\cdot\frac{\partial\gamma_m}{\partial x_k} - \frac{\partial}{\partial x_m}\big(\gamma_k\cdot\beta_m\big) = 0, \tag{2.92} $$

$$ \delta\nu:\quad \frac{\delta\ell}{\delta\nu} + \frac{\partial\beta_m}{\partial x_m} + 2\,\beta_m\times\gamma_m = 0. \tag{2.93} $$

These are the Clebsch relations for the momentum of the motion $m_k = \delta\ell/\delta u_k$ and the director angular momentum of the micromotion $\sigma = \delta\ell/\delta\nu$.


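Stationary variations of S in D and $\gamma_m$ give, respectively, the dynamical equations for the canonical momenta, $\pi_D = \phi$ and $\pi_{\gamma_m} = \beta_m$, as

$$ \delta D:\quad \frac{\partial\phi}{\partial t} + u_k\frac{\partial\phi}{\partial x_k} - \frac{\delta\ell}{\delta D} = 0, \tag{2.94} $$

$$ \delta\gamma_m:\quad \frac{\partial\beta_m}{\partial t} + \frac{\partial}{\partial x_k}\big(\beta_m u_k\big) - \beta_k\frac{\partial u_m}{\partial x_k} - 2\,\nu\times\beta_m - \frac{\delta\ell}{\delta\gamma_m} = 0. \tag{2.95} $$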

Finally, variations in the Lagrange multipliers φ and β m imply the kinematic equations (2.61) and (2.62), respectively. These two kinematic equations combine with the four variational equations (2.92) through (2.95) to recover the Eulerian motion and micromotion equations after a calculation using the Clebsch relations, the Kelvin–Noether form of the motion equation, and the dynamical equations for the Clebsch potentials. At the end of Section 3, we shall systematize this type of calculation and, thus, clarify its meaning as a Poisson map. For now, we simply remark that the evolutionary Clebsch relations are Hamilton’s canonical equations for the Hamiltonian obtained from the constrained action in (2.91) by the usual Legendre transformation. Perhaps not unexpectedly, this Hamiltonian agrees exactly with that in equation (2.74) obtained from the Legendre transformation in u and ν alone. As we shall discuss more generally in the next section, the Clebsch representations for the momentum and director angular momentum provide a Poisson map from the canonical Poisson bracket in the Clebsch variables to the Lie–Poisson bracket for the Hamiltonian matrix with generalized two-cocycle found in equation (2.83). Historically, the Hamiltonian approach has been very fruitful in modeling the hydrodynamics of complex ﬂuids and quantum liquids, including superﬂuids, going back to the seminal work of Khalatnikov and Lebedev [1978, 1980], and Dzyaloshinskii and Volovick [1980]. 
The Clebsch approach has provided a series of physical examples of Lie–Poisson brackets: for superfluids in Holm and Kupershmidt [1982]; superconductors in Holm and Kupershmidt [1983a]; Yang–Mills plasmas (chromohydrodynamics) in Gibbons, Holm, and Kupershmidt [1982], and Gibbons, Holm, and Kupershmidt [1983]; magnetohydrodynamics, multifluid plasmas, and elasticity, in Holm and Kupershmidt [1983b]; Yang–Mills magnetohydrodynamics in Holm and Kupershmidt [1984]; and its relation to superfluid plasmas in Holm and Kupershmidt [1987] and spin-glasses in Holm and Kupershmidt [1988]. Many, but not all, of these Lie–Poisson brackets fit into the present Euler–Poincaré framework for PCFs. The Euler–Poincaré framework also accommodates many of the various types of Poisson brackets (such as "rigid body fluids") studied over the years by Grmela, Edwards, Beris, and others, as summarized in Beris and Edwards [1994]. For liquid crystals, these authors develop a bracket description both for Ericksen–Leslie equations and the Doi–Edwards theory based on the conformation tensor C, which is related to the director theory by C = n ⊗ n. The extension of the present results to this case may be accomplished, e.g., by following the Clebsch approach of Holm and Kupershmidt [1983b], who treated the corresponding case of Lie–Poisson brackets for nonlinear elasticity. The treatment in Beris and Edwards [1994] ignores the geometrical content of the Lie–Poisson formulation in preference for its tensor properties alone.

2.4 Summary for Perfect Liquid Crystals

We now recapitulate the steps in the procedure we have followed in deriving the Euler–Lagrange–Poincaré–Clebsch equations and the Lie–Poisson Hamiltonian formulations of the dynamics of perfect liquid crystals.

1. Define the order parameter group and its coset space.
2. Write Hamilton's principle in the Lagrangian fluid description.
3. Make two stages of reduction: (i) first, to introduce the reduced set of variables in the Lagrangian fluid description; and (ii) second, to pass to the Eulerian fluid description.
4. Legendre transform to obtain the Hamiltonian formulation.

The alternative Clebsch procedure starts directly with an action for Hamilton's principle that is defined in the Eulerian fluid description and constrained by the Eulerian kinematic equations. Its Hamiltonian formulation is canonical and passes to a Lie–Poisson formulation via the Poisson map that is defined by the Clebsch representations of the momentum and internal angular momentum in equations (2.92) and (2.93), respectively.

Many physical extensions of these results for perfect liquid crystals are available, e.g., to include MHD, compressibility, anisotropic dielectric and diamagnetic effects, linear wave excitation properties, etc. However, we wish to spend most of the rest of this paper setting the formulations we have established here for perfect liquid crystal dynamics into the geometrical framework of Lagrangian reduction by stages developed in Cendra, Marsden, and Ratiu [2001]. This geometrical setting will take advantage of the unifying interpretation of order parameters as coset spaces of broken symmetry groups. (The coset interpretation of order parameters for liquid crystals, superfluids and spin glasses is reviewed, e.g., in Mermin [1979].) The present formulations are geometrical variants of the Ericksen–Leslie equations for liquid crystal dynamics that illuminate some of their mathematical features from the viewpoint of Lagrangian reduction.

3 Action Principles and Lagrangian Reduction

As we have seen, the passage to reduced variables $\nu$ and $\gamma_m$ for liquid crystals restricts the variables $n$, $\dot n$, $\nabla n$ to the coset space SO(3)/O(2) of rotations that properly affect the director n and imposes invariance of the theory under the reflections n → −n. The reduced variables transform properly under SO(3), because the O(2) isotropy subgroup of n has been factored out of them. Thus, we "mod out" or "reduce" the symmetry-associated degrees of freedom by passing to variables that transform properly under rotations in SO(3) and admit the $Z_2$ reflections n → −n. The removal of degrees of freedom associated with symmetries is the essential idea behind Marsden–Weinstein group reduction in Marsden and Weinstein [1974]. Marsden–Weinstein reduction first appeared in the Hamiltonian setting. However, this sort of reduction by symmetry groups has been recently extended to the Lagrangian setting, see Cendra, Marsden, and Ratiu [2001] and Marsden, Ratiu, and Scheurle [2000]. The remainder of the paper applies the mathematical framework of Lagrangian reduction by stages due to Cendra, Marsden, and Ratiu [2001] to express some of the properties of PCF dynamics in the Eulerian description for an arbitrary order parameter group. A synthesis of the nonlinear dynamics for the motion and micromotion of various perfect complex fluid models is possible, due to their common mathematical basis. The mathematical basis common to all ideal fluid motion, both classical and complex fluids, is Hamilton's principle, see, e.g., Serrin [1959],

$$ \delta S = \delta\int L\, dt = 0. \tag{3.1} $$

In the Lagrangian (or material) representation for fluids, the motion is described by the Euler–Lagrange equations for this action principle. In the Eulerian (or spatial) representation for fluids, the Euler–Lagrange equations for the dynamics are replaced by the Euler–Poincaré equation.
The distinction between the Euler–Lagrange and the Euler–Poincaré equations is exemplified by the distinction between rigid body motion expressed in terms of the Euler angles and their time derivatives on the tangent space T SO(3) of the Lie group of proper rotations SO(3), and that same motion expressed in body angular velocity variables in its Lie algebra so(3). Poincaré [1901] was the first to write the latter equations on an arbitrary Lie algebra; hence, the name Euler–Poincaré equations. Euler–Poincaré equations may be understood and derived via the theory of Lagrangian reduction as in Cendra, Holm, Marsden, and Ratiu [1999], and Cendra, Marsden, and Ratiu [2001]. Euler–Poincaré equations arise when Euler–Lagrange equations and their corresponding Hamilton principles are mapped from a velocity phase space TQ to the quotient space TQ/G (a vector bundle) by a Lie-group action of a symmetry group G on the configuration space Q. If L is a G-invariant Lagrangian on TQ, this process maps it to a reduced Lagrangian and a corresponding reduced variational principle for the Euler–Poincaré dynamics on TQ/G in which the variations are constrained. See Weinstein [1996] and Cendra, Marsden, and Ratiu [2001] for expositions of the mathematical framework that underlies Lagrangian reduction by stages, and Holm, Marsden, and Ratiu [1998] for a discussion of Euler–Poincaré equations and their many applications in classical ideal fluid dynamics from the viewpoint of the present paper. See Marsden, Ratiu, and Scheurle [2000] for additional insight and recent results in Lagrangian reduction.

The order parameters of PCFs are material variables. The Lagrangian in Hamilton's principle (3.1) for PCFs is the map,

$$ L : TG\times V^*\times TO \longrightarrow \mathbb{R}. \tag{3.2} $$

That is, the velocity phase space for the PCF Lagrangian L in material variables is the Cartesian product of three spaces: T G, the tangent space of the Lie group G of ﬂuid motions (the diﬀeomorphisms that take the ﬂuid parcels from their reference conﬁguration to their current positions in the domain of ﬂow), V ∗ , the vector space of advected quantities carried with the ﬂuid motion, and T O, the tangent space of the Lie group O of ﬂuid micromotions (O is the order parameter Lie group). The advected quantities in V ∗ include the volume element or mass density and whatever else is carried along with the ﬂuid parcels, such as the magnetic ﬁeld intensity in the case of magnetohydrodynamics. The new feature of PCFs relative to the simple ﬂuids with advected parameters treated in Holm, Marsden, and Ratiu [1998]. is the dependence of their Lagrangian on T O. The order parameter coset space at each material point is acted upon by the order parameter Lie group. (We choose the convention of group action from the right.) Since the order parameter is a material property, the diﬀeomorphism group G also acts on the order parameter group, as O × G → O, denoting action from the right. In this Section, we shall use Hamilton’s principle (3.1) with Lagrangian (3.2) to obtain the dynamical equations for the motion and micromotion of PCFs whose order parameters are deﬁned as coset spaces of Lie groups. In doing so, we shall begin by assuming this Lagrangian is invariant under the right action of the order parameter Lie group O on its tangent space T O. (This right action on the space of internal variables leaves the other components of the conﬁguration space T G and V ∗ ﬁxed.) We shall

4. Dynamics of Perfect Complex Fluids


assume this Lagrangian is also invariant under the right action of the diffeomorphisms $G$, which relabel the fluid parcels. (This action of $G$ does indeed affect the material variables defined on $T\mathcal{O}$ and $V^*$.) Under these symmetry assumptions we shall perform the following two group reductions,
\[
\big( TG \times V^* \times (T\mathcal{O}/\mathcal{O}) \big)/G \;\simeq\; \mathfrak{g} \times (V^* \times \mathfrak{o})\,g^{-1}(t) ,
\]
with respect to the right actions of first $\mathcal{O}$ and then $G$, by applying group reduction to the velocity phase space of this Lagrangian in two stages,
\[
\text{1st stage:}\quad (TG \times V^* \times T\mathcal{O})/\mathcal{O} \;\simeq\; TG \times V^* \times \mathfrak{o} , \tag{3.3}
\]
\[
\text{2nd stage:}\quad (TG \times V^* \times \mathfrak{o})/G \;\simeq\; \mathfrak{g} \times (V^* \times \mathfrak{o})\,g^{-1}(t) . \tag{3.4}
\]

Here we denote isomorphisms as, e.g., $\mathfrak{o} \simeq T\mathcal{O}/\mathcal{O}$ and $\mathfrak{g} \simeq TG/G$, where the Lie algebras $\mathfrak{o}$ and $\mathfrak{g}$ correspond, respectively, to the Lie groups $\mathcal{O}$ and $G$. The first stage is Lagrangian reduction by the right action of $\mathcal{O}$, the order parameter group (footnote 5). The second stage is Lagrangian reduction of the first result by the right action of the diffeomorphisms $G$ in the first factor and by composition of functions in the second factor. Because of the assumed invariances of our Lagrangian, these two stages of reduction of the velocity phase spaces will each yield a reduced Lagrangian and a corresponding reduced variational principle for the dynamics. The group actions at each stage are assumed to be free and proper, so the reduced spaces will be local principal fiber bundles (footnote 6). The mathematical formulation of the process of Lagrangian reduction by stages and the introduction of various connections on the Lagrange–Poincaré bundles that arise in Lagrangian reduction are discussed in Cendra, Marsden, and Ratiu [2001]. These Lagrange–Poincaré bundles are special cases of Lie algebroids. See Weinstein [1996] for a fundamental description of the relation between Lagrangian mechanics and Lie algebroids.

3.1

Lagrangian Reduction by Stages

We are dealing with a Lagrangian defined by the map
\[
L(g, \dot{g}, a_0, \dot{\chi}, d\chi) : TG \times V^* \times T\mathcal{O} \longrightarrow \mathbb{R} ,
\]
where $G$ is the diffeomorphism group that acts on both the vector space $V^*$ of advected material quantities and the order parameter group $\mathcal{O}$. We assume that $L$ has the following invariance properties,
\[
L(g, \dot{g}, a_0, \dot{\chi}, d\chi) = L(g, \dot{g}, a_0, \dot{\chi}\psi, d\chi\,\psi) = L(gh, \dot{g}h, a_0 h, \dot{\chi}\psi h, d\chi\,\psi h) , \tag{3.5}
\]
for all $\psi \in \mathcal{O}$ and $h \in G$. In particular, we shall choose $\psi = \chi^{-1}(t)$ in the first stage and $h = g^{-1}(t)$ in the second stage of the reduction, so that
\[
L(g, \dot{g}, a_0, \dot{\chi}, d\chi) = L(g, \dot{g}, a_0, \dot{\chi}\chi^{-1}, d\chi\,\chi^{-1}) , \tag{3.6}
\]
after the first stage of reduction, and
\[
L(g, \dot{g}, a_0, \dot{\chi}, d\chi) = L(e, \dot{g}g^{-1}, a, \dot{\chi}\chi^{-1}g^{-1}, d\chi\,\chi^{-1}g^{-1}) \equiv l(\xi, a, \nu, \gamma) \tag{3.7}
\]
after the second stage, with $\xi \equiv \dot{g}g^{-1}$, $a \equiv a_0 g^{-1}$, $\nu \equiv (\dot{\chi}\chi^{-1})g^{-1}$ and $\gamma \cdot dx \equiv (d\chi\,\chi^{-1})g^{-1}$. After the first stage of reduction, the reduced action principle yields the Lagrange–Poincaré equations, and after the second stage we shall obtain the Euler–Poincaré equations for a perfect complex fluid with an arbitrary order parameter group.

Footnote 5. Variants exist. In particular, for liquid crystals, only a part of the Lie algebra $\mathfrak{o}$ is required; namely, $so(3)/O(2)$, the part of the Lie algebra $so(3)$ that is invariant under the $O(2)$ isotropy group of the director, $\mathbf{n}$.

Footnote 6. A natural flat connection appears on this bundle, but this bundle picture should be made intrinsic and global, while including defect dynamics. A strategy for obtaining equations for the defect dynamics will be discussed in the next section. However, the global bundle picture is for future work.

3.1.1 Lagrange–Poincaré equations

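The first stage
\[
TG \times V^* \times T\mathcal{O} \longrightarrow TG \times V^* \times \mathfrak{o} ,
\]
of the two-stage symmetry reduction in (3.3)–(3.4) affects only the internal variables and passes from coordinates on the order parameter Lie group, $\mathcal{O}$, to coordinates on its Lie algebra, $\mathfrak{o}$, obtained from the tangent vectors of the order parameter Lie group at the identity by the isomorphism $\mathfrak{o} \simeq T\mathcal{O}/\mathcal{O}$. The results at this first stage consist of Euler–Lagrange equations for the fluid motion, coupled through additional components of the stress tensor to equations of a type called Lagrange–Poincaré equations in Marsden and Scheurle [1995], Cendra, Marsden, and Ratiu [2001]. In our case, these Lagrange–Poincaré equations describe the micromotion in the Lagrangian (or material) fluid description. The first stage of reduction results in the Lagrange–Poincaré action principle,
\[
\delta \int dt \int d^3X \; L(\dot{x}, J, \nu, \gamma) = 0 , \tag{3.8}
\]
written in the material representation and denoted as follows: $L(\dot{x}, J, \nu, \gamma)$ is the reduced Lagrangian on $TG \times V^* \times \mathfrak{o}$; $J(X,t) = \det(\partial x/\partial X) \in V^*$ is the volume element; $\nu(X,t) = \dot{\chi}\chi^{-1}(X,t) \in \mathfrak{o}$ is the material angular frequency; and $\gamma \cdot dx = d\chi\,\chi^{-1}(X,t) \in \mathfrak{o}$, with components denoted as $\gamma_m$, given by
\[
\gamma_m\, dx_m (X,t) = d\chi\,\chi^{-1}(X,t) = \gamma_m \frac{\partial x_m}{\partial X_A}\, dX_A = \gamma_A^{\rm mat}(X,t)\, dX_A , \tag{3.9}
\]
where $\gamma \cdot dx$ is the Cosserat strain one-form introduced in Cosserat and Cosserat [1909] and the superposed "dot" $(\ )^{\cdot}$ denotes time derivative at fixed material position $X$. These material quantities satisfy auxiliary kinematic equations, obtained by differentiating their definitions,
\[
\big( J^{-1} d^3x \big)^{\cdot} = \big( d^3X \big)^{\cdot} = 0 , \tag{3.10}
\]
\[
( \gamma \cdot dx )^{\cdot} = \big( d\chi\,\chi^{-1} \big)^{\cdot} = d\nu + \operatorname{ad}_\nu (\gamma \cdot dx) . \tag{3.11}
\]
The material angular frequency $\nu = \dot{\chi}\chi^{-1}$ and the material Cosserat strain one-form $\gamma \cdot dx = d\chi\,\chi^{-1}$ take their values in the right-invariant Lie algebra $\mathfrak{o}$ of the order parameter Lie group $\mathcal{O}$. The ad-operation appearing in equation (3.11) denotes multiplication, or commutator, in the Lie algebra $\mathfrak{o}$. The dynamical Lagrange–Poincaré equations determine the complex fluid's motion with fluid trajectory $\phi_t(X) = x(X,t)$ with $\phi_t \in G$ and micromotion $\chi(X,t) \in \mathcal{O}$ in the material fluid description. These equations take the following forms,
\[
J^{-1} \Big( \frac{\partial L}{\partial \dot{x}_p} \Big)^{\!\cdot} + \frac{\partial}{\partial x_p} \frac{\partial L}{\partial J} - \frac{\partial}{\partial x_m} \Big\langle J^{-1} \frac{\partial L}{\partial \gamma_m} , \gamma_p \Big\rangle = 0 , \tag{3.12}
\]
\[
\Big( \frac{\partial L}{\partial \nu} \Big)^{\!\cdot} - \operatorname{ad}^*_\nu \frac{\partial L}{\partial \nu} + J \frac{\partial}{\partial x_m} \Big( J^{-1} \frac{\partial L}{\partial \gamma_m} \Big) - \operatorname{ad}^*_{\gamma_m} \frac{\partial L}{\partial \gamma_m} = 0 . \tag{3.13}
\]
The $\operatorname{ad}^*$-operation appearing in (3.13) is defined in terms of the ad-operation and the symmetric pairing $\langle \cdot , \cdot \rangle$ between elements of the right Lie algebra $\mathfrak{o}$ and its dual $\mathfrak{o}^*$ as, e.g.,
\[
\Big\langle - \operatorname{ad}^*_\nu \frac{\partial L}{\partial \nu} , \Sigma \Big\rangle = \Big\langle \frac{\partial L}{\partial \nu} , \operatorname{ad}_\nu \Sigma \Big\rangle = \Big\langle \frac{\partial L}{\partial \nu} , [\, \nu , \Sigma \,] \Big\rangle . \tag{3.14}
\]
In a Lie algebra basis satisfying $[e_\alpha , e_\beta] = t^{\iota}_{\alpha\beta}\, e_\iota$ and its dual basis $e^\kappa$ satisfying $\langle e^\kappa , e_\iota \rangle = \delta^\kappa_\iota$, we may write this formula as
\[
\Big\langle - \operatorname{ad}^*_\nu \frac{\partial L}{\partial \nu} , \Sigma \Big\rangle = \Big\langle \frac{\partial L}{\partial \nu} , \operatorname{ad}_\nu \Sigma \Big\rangle = \Big( \frac{\partial L}{\partial \nu} \Big)_{\!\kappa}\, t^{\kappa}_{\alpha\beta}\, \nu^\alpha \Sigma^\beta . \tag{3.15}
\]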
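The sign convention in (3.14)–(3.15) can be checked numerically in a concrete case. The following is a minimal sketch, assuming the Lie algebra is $so(3)$ with structure constants given by the Levi-Civita symbol and identifying the dual with $\mathbb{R}^3$ through the dot product; the function names `eps`, `ad`, `ad_star` and `pair` and the sample vectors are illustrative choices, not part of the original text. Under these assumptions $\operatorname{ad}_\nu \Sigma$ is the cross product $\nu \times \Sigma$.

```python
def eps(a, b, k):
    """Levi-Civita symbol for indices in {0, 1, 2} (assumed so(3) structure constants)."""
    return (a - b) * (b - k) * (k - a) // 2


def ad(nu, sigma):
    """ad_nu(sigma) = [nu, sigma], with components t^k_{ab} nu^a sigma^b."""
    return [sum(eps(a, b, k) * nu[a] * sigma[b]
                for a in range(3) for b in range(3))
            for k in range(3)]


def ad_star(nu, mu):
    """ad*_nu(mu), defined by <ad*_nu mu, Sigma> = -<mu, ad_nu Sigma> for all Sigma."""
    return [-sum(mu[k] * eps(a, b, k) * nu[a]
                 for a in range(3) for k in range(3))
            for b in range(3)]


def pair(mu, sigma):
    """Pairing between a dual element mu and a Lie algebra element sigma."""
    return sum(m * s for m, s in zip(mu, sigma))


# Arbitrary integer test vectors, so that the comparison is exact.
nu, mu, sigma = [1, 2, 3], [4, -1, 2], [-2, 5, 1]
lhs = pair(ad_star(nu, mu), sigma)
rhs = -pair(mu, ad(nu, sigma))
print(lhs, rhs)
```

With this identification the coadjoint operation reduces to a cross product as well, so `ad_star(nu, mu)` and `ad(nu, mu)` return the same components; this is special to $so(3)$ and should not be expected for a general order parameter algebra.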

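Thus, for the sign conventions we choose in (3.14), the $\operatorname{ad}^*$-operation is defined as the negative transpose of the ad-operation. The dynamical equations (3.12) and (3.13) follow from the Lagrange–Poincaré action principle (3.8) for PCF dynamics in the material fluid description. These Lagrange–Poincaré equations may be calculated directly, as
\[
\begin{aligned}
0 &= \delta \int dt \int d^3X \; L(\dot{x}, J, \nu, \gamma) \\
&= \int dt \int d^3X \left[ \frac{\partial L}{\partial \dot{x}_p}\, \delta\dot{x}_p + \frac{\partial L}{\partial J}\, \delta J + \Big\langle \frac{\partial L}{\partial \nu} , \delta\nu \Big\rangle + \Big\langle \frac{\partial L}{\partial \gamma_m} , \delta\gamma_m \Big\rangle \right] \\
&= \int dt \int d^3X \left\{ \delta x_p \left[ - \Big( \frac{\partial L}{\partial \dot{x}_p} \Big)^{\!\cdot} - J \frac{\partial}{\partial x_p} \frac{\partial L}{\partial J} + J \frac{\partial}{\partial x_m} \Big\langle J^{-1} \frac{\partial L}{\partial \gamma_m} , \gamma_p \Big\rangle \right] \right. \\
&\qquad \left. + \Big\langle - \Big( \frac{\partial L}{\partial \nu} \Big)^{\!\cdot} + \operatorname{ad}^*_\nu \frac{\partial L}{\partial \nu} - J \frac{\partial}{\partial x_m} \Big( J^{-1} \frac{\partial L}{\partial \gamma_m} \Big) + \operatorname{ad}^*_{\gamma_m} \frac{\partial L}{\partial \gamma_m} \, , \, \Sigma \Big\rangle \right\} \\
&\quad + \int dt \oint d^2S \; \hat{n}_A \frac{\partial X_A}{\partial x_m} \left[ \Big\langle \frac{\partial L}{\partial \gamma_m} , \Sigma \Big\rangle - \Big\langle \frac{\partial L}{\partial \gamma_m} , \gamma_p \Big\rangle \delta x_p + J \frac{\partial L}{\partial J}\, \delta x_m \right] .
\end{aligned} \tag{3.16}
\]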

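Here we define $\Sigma \equiv \delta\chi\,\chi^{-1}$, in terms of which we calculate,
\[
\delta\nu = \dot{\Sigma} - [\, \nu , \Sigma \,] = \dot{\Sigma} - \operatorname{ad}_\nu \Sigma , \tag{3.17}
\]
\[
\delta\gamma_A^{\rm mat} = \frac{\partial \Sigma}{\partial X_A} - [\, \gamma_A^{\rm mat} , \Sigma \,] = \frac{\partial \Sigma}{\partial X_A} - \operatorname{ad}_{\gamma_A^{\rm mat}} \Sigma . \tag{3.18}
\]
Since $\gamma_m = \gamma_A^{\rm mat}\, (\partial X_A/\partial x_m)$, this means that
\[
\begin{aligned}
\delta\gamma_m &= - \gamma_p \frac{\partial X_B}{\partial x_m} \frac{\partial}{\partial X_B}\, \delta x_p + \frac{\partial X_A}{\partial x_m}\, \delta\gamma_A^{\rm mat} \\
&= - \gamma_p \frac{\partial X_B}{\partial x_m} \frac{\partial}{\partial X_B}\, \delta x_p + \frac{\partial X_A}{\partial x_m} \frac{\partial \Sigma}{\partial X_A} - \operatorname{ad}_{\gamma_m} \Sigma .
\end{aligned} \tag{3.19}
\]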

Thus, the variations in $\gamma_m$ couple the two Lagrange–Poincaré equations. We also drop endpoint terms that arise from integrating by parts in time, upon taking $\delta x_p$ and $\Sigma$ to vanish at these endpoints. The natural boundary conditions
\[
\frac{\partial L}{\partial J} = 0 \quad \text{and} \quad \hat{n}_m \frac{\partial L}{\partial \gamma_m} = 0 , \tag{3.20}
\]
ensure that the fluid pressure and the normal stress are continuous across a fluid interface.

3.1.2 Euler–Poincaré equations

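The passage next from the Lagrangian fluid description of continuum mechanics to the Eulerian fluid description will yield the Euler–Poincaré equations. We obtain these equations by applying to Hamilton's principle the second stage,
\[
TG \times V^* \times \mathfrak{o} \longrightarrow \mathfrak{g} \times (V^* \times \mathfrak{o})\,g^{-1}(t) ,
\]
of the two-stage Lagrangian reduction in (3.3)–(3.4). This second stage of reduction results in the Euler–Poincaré action principle,
\[
\delta \int l(\xi, a, \nu, \gamma)\, dt = 0 , \tag{3.21}
\]
with constrained variations
\[
\begin{aligned}
\delta\xi &= \frac{\partial \eta}{\partial t} + [\, \xi , \eta \,] , \qquad \delta a = - a\,\eta , \\
\delta\nu &= \frac{\partial \Sigma}{\partial t} + \xi \cdot \nabla\Sigma - \operatorname{ad}_\nu \Sigma - \nu\,\eta , \\
\delta(\gamma \cdot dx) &= d\Sigma + \operatorname{ad}_\Sigma (\gamma \cdot dx) - (\gamma \cdot dx)\,\eta ,
\end{aligned} \tag{3.22}
\]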

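where $\eta(t) = \delta g(t)\,g(t)^{-1} \in \mathfrak{g}$ and $\Sigma(t) = \delta\chi(t)\,\chi(t)^{-1} \in \mathfrak{o}$ both vanish at the endpoints. The Euler–Poincaré action principle produces the following equations defined on $\mathfrak{g} \times (V^* \times \mathfrak{o})\,g^{-1}(t)$ for the motion and micromotion, in which $\partial/\partial t$ denotes Eulerian time derivative at fixed spatial position $x$,
\[
\frac{\partial}{\partial t} \frac{\delta l}{\delta \xi} = - \operatorname{ad}^*_\xi \frac{\delta l}{\delta \xi} + \frac{\delta l}{\delta a} \diamond a + \frac{\delta l}{\delta \nu} \diamond \nu + \frac{\delta l}{\delta \gamma_m} \diamond \gamma_m , \tag{3.23}
\]
\[
\frac{\partial}{\partial t} \frac{\delta l}{\delta \nu} = - \operatorname{div} \Big( \xi\, \frac{\delta l}{\delta \nu} + \frac{\delta l}{\delta \gamma} \Big) + \operatorname{ad}^*_\nu \frac{\delta l}{\delta \nu} + \operatorname{ad}^*_{\gamma_m} \frac{\delta l}{\delta \gamma_m} . \tag{3.24}
\]
These are the Euler–Poincaré equations for a perfect complex fluid. In these equations, $l$ is the reduced Lagrangian on $\mathfrak{g} \times (V^* \times \mathfrak{o})\,g^{-1}(t)$. Also,
\[
\operatorname{ad}_\xi \eta \equiv [\, \xi , \eta \,] , \quad \text{with } \xi , \eta \in \mathfrak{g} , \tag{3.25}
\]
is the commutator in the Lie algebra of vector fields, $\mathfrak{g}$. In addition, we define the two operations $\operatorname{ad}^*_\xi$ and $\diamond$ as
\[
\Big\langle \operatorname{ad}^*_\xi \frac{\delta l}{\delta \xi} , \eta \Big\rangle \equiv - \Big\langle \frac{\delta l}{\delta \xi} , \operatorname{ad}_\xi \eta \Big\rangle , \tag{3.26}
\]
and
\[
\Big\langle \frac{\delta l}{\delta a} \diamond a , \eta \Big\rangle \equiv - \Big\langle \frac{\delta l}{\delta a} , a\,\eta \Big\rangle . \tag{3.27}
\]
The concatenation $a\,\eta$ denotes the right Lie algebra action of $\eta \in \mathfrak{g}$ on $a \in V^*$ (by Lie derivative). The pairing $\langle \cdot , \cdot \rangle$ now includes spatial integration and, thus, allows for integration by parts. Similar definitions hold for $(\delta l/\delta\gamma_m \diamond \gamma_m)$ and $(\delta l/\delta\nu \diamond \nu)$. In components, these quantities are given by,
\[
\begin{aligned}
\Big\langle \frac{\delta l}{\delta \gamma_m} \diamond \gamma_m , \eta \Big\rangle &= - \Big\langle \frac{\delta l}{\delta \gamma_m} , \gamma_m\,\eta \Big\rangle = \Big\langle \frac{\partial}{\partial x_m} \Big( \frac{\delta l}{\delta \gamma_m^\beta}\, \gamma_j^\beta \Big) - \frac{\delta l}{\delta \gamma_m^\beta} \frac{\partial \gamma_m^\beta}{\partial x_j} , \eta_j \Big\rangle , \\
\Big\langle \frac{\delta l}{\delta \nu} \diamond \nu , \eta \Big\rangle &= - \Big\langle \frac{\delta l}{\delta \nu} , \nu\,\eta \Big\rangle = \Big\langle - \frac{\delta l}{\delta \nu^\beta} \frac{\partial \nu^\beta}{\partial x_j} , \eta_j \Big\rangle , \\
\Big\langle \frac{\delta l}{\delta D} \diamond D , \eta \Big\rangle &= - \Big\langle \frac{\delta l}{\delta D} , D\,\eta \Big\rangle = \Big\langle D \frac{\partial}{\partial x_j} \frac{\delta l}{\delta D} , \eta_j \Big\rangle .
\end{aligned} \tag{3.28}
\]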


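Remark. At this point, one might have also introduced (3 + 1) covariant derivatives acting on $\mathfrak{o}$-valued functions of space and time, as
\[
D_m \equiv \frac{\partial}{\partial x_m} - \operatorname{ad}_{\gamma_m} , \quad \text{and} \quad D_t \equiv \frac{\partial}{\partial t} - \operatorname{ad}_\nu , \tag{3.29}
\]
with associated curvature (or Yang–Mills magnetic field) given by
\[
\operatorname{ad}_{B_{ij}} \equiv [\, D_i , D_j \,] , \tag{3.30}
\]
whose components are expressed, cf. equation (2.28),
\[
B_{ij}^\alpha = \gamma^\alpha_{i,j} - \gamma^\alpha_{j,i} + t^\alpha_{\beta\kappa}\, \gamma_i^\beta \gamma_j^\kappa , \tag{3.31}
\]
as in Holm and Kupershmidt [1988]. However, the operations ad, $\operatorname{ad}^*$, and Lie derivative are sufficient for our present purposes.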
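As an illustration of (3.29)–(3.31) in the simplest special case, the sketch below takes spatially constant $\gamma_i$, so that the derivative terms in (3.31) drop out and $D_i$ reduces to $-\operatorname{ad}_{\gamma_i}$. It then checks that the commutator of the two ad-matrices equals the ad-matrix of $B_{ij}$. The choice of $so(3)$ with Levi-Civita structure constants, the helper names, and the sample vectors are assumptions made for this example only.

```python
def eps(a, b, k):
    """Levi-Civita symbol for indices in {0, 1, 2} (assumed so(3) structure constants)."""
    return (a - b) * (b - k) * (k - a) // 2


def ad_matrix(gamma):
    """Matrix of ad_gamma, with entries (ad_gamma)^k_b = t^k_{ab} gamma^a."""
    return [[sum(eps(a, b, k) * gamma[a] for a in range(3)) for b in range(3)]
            for k in range(3)]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]


def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]


def curvature_constant(gamma_i, gamma_j):
    """B_ij^a = t^a_{bk} gamma_i^b gamma_j^k, i.e. (3.31) without derivative terms."""
    return [sum(eps(b, k, a) * gamma_i[b] * gamma_j[k]
                for b in range(3) for k in range(3))
            for a in range(3)]


# Arbitrary constant sample values of gamma_i and gamma_j.
gamma_i, gamma_j = [1, 0, 2], [0, 3, -1]
# For constant gamma, [D_i, D_j] = [ad_{gamma_i}, ad_{gamma_j}] (the two minus signs cancel).
lhs = commutator(ad_matrix(gamma_i), ad_matrix(gamma_j))
rhs = ad_matrix(curvature_constant(gamma_i, gamma_j))
print(lhs == rhs)
```

The agreement is the Jacobi identity in disguise; the check says nothing about the derivative terms in (3.31), which require a spatially varying $\gamma$.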

Eulerian kinematic equations. By definition, an Eulerian advected quantity $a \in V^* g(t)^{-1}$ satisfies
\[
\frac{\partial a}{\partial t} + a\,\xi = 0 .
\]
This advection relation may be written equivalently as
\[
\frac{\partial a}{\partial t} + \pounds_\xi\, a = 0 ,
\]
where $\pounds_\xi$ is the Lie derivative with respect to $\xi = \dot{g}(t)\,g^{-1}(t)$, the Eulerian fluid velocity, often denoted also as $u(x,t)$. The Eulerian versions of the Lagrangian kinematic equations (3.10) and (3.11) are given in terms of the Lie derivative by
\[
\Big( \frac{\partial}{\partial t} + \pounds_\xi \Big) \big( D\, d^3x \big) = 0 , \tag{3.32}
\]
\[
\Big( \frac{\partial}{\partial t} + \pounds_\xi \Big) ( \gamma \cdot dx ) = d\nu + \operatorname{ad}_\nu ( \gamma \cdot dx ) . \tag{3.33}
\]
In these equations, the quantity $D(x,t) = J^{-1}(X,t)\,g(t)^{-1}$ is the Eulerian mass density and the quantities $\nu(x,t) = \nu(X,t)\,g(t)^{-1}$ and
\[
\gamma(x,t) \cdot dx = \Big( \gamma(X,t) \cdot \frac{\partial x}{\partial X}(X,t) \cdot dX \Big)\, g(t)^{-1}
\]
are the Eulerian counterparts of the right-invariant material quantities $\nu(X,t)$ and $\gamma(X,t)$ in equations (3.11).


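Remark. We note that equation (3.33) implies the $\gamma$-circulation theorem for PCFs, cf. equation (2.63),
\[
\frac{d}{dt} \oint_{c(\xi)} \gamma \cdot dx = \oint_{c(\xi)} \operatorname{ad}_\nu ( \gamma \cdot dx ) . \tag{3.34}
\]
Thus, the circulation of $\gamma$ around a loop $c(\xi)$ moving with the fluid is conserved when $\operatorname{ad}_\nu \gamma$ is a gradient. Otherwise, the curl of this quantity generates circulation of $\gamma$ around fluid loops. The Euler–Poincaré equations (3.23) and (3.24) may be obtained directly from the Euler–Poincaré action principle (3.21), as follows.

Euler–Poincaré action variations. We compute the variation of the action (3.21) in Eulerian variables at fixed time $t$ and spatial position $x$ as,
\[
\begin{aligned}
\delta S &= \int dt \left[ \Big\langle \frac{\delta l}{\delta \xi} , \delta\xi \Big\rangle + \Big\langle \frac{\delta l}{\delta a} , \delta a \Big\rangle + \Big\langle \frac{\delta l}{\delta \nu} , \delta\nu \Big\rangle + \Big\langle \frac{\delta l}{\delta \gamma_m} , \delta\gamma_m \Big\rangle \right] \\
&= \int dt \left\{ \Big\langle - \frac{\partial}{\partial t} \frac{\delta l}{\delta \xi} - \operatorname{ad}^*_\xi \frac{\delta l}{\delta \xi} + \frac{\delta l}{\delta a} \diamond a + \frac{\delta l}{\delta \nu} \diamond \nu + \frac{\delta l}{\delta \gamma_m} \diamond \gamma_m \, , \, \eta \Big\rangle \right. \\
&\qquad \left. + \Big\langle - \frac{\partial}{\partial t} \frac{\delta l}{\delta \nu} - \frac{\partial}{\partial x_m} \Big( \xi_m \frac{\delta l}{\delta \nu} + \frac{\delta l}{\delta \gamma_m} \Big) + \operatorname{ad}^*_\nu \frac{\delta l}{\delta \nu} + \operatorname{ad}^*_{\gamma_m} \frac{\delta l}{\delta \gamma_m} \, , \, \Sigma \Big\rangle \right\} \\
&\quad + \int dt \int d^3x \left\{ \frac{\partial}{\partial t} \Big( \frac{\delta l}{\delta \xi_j}\, \eta_j + \frac{\delta l}{\delta \nu^\beta}\, \Sigma^\beta \Big) - d \Big\langle \eta \,\lrcorner\, a , \frac{\delta l}{\delta a} \Big\rangle \right. \\
&\qquad \left. + \frac{\partial}{\partial x_m} \Big[ \frac{\delta l}{\delta \xi_j}\, \eta_j\, \xi_m + \Big( \xi_m \frac{\delta l}{\delta \nu^\beta} + \frac{\delta l}{\delta \gamma_m^\beta} \Big) \Sigma^\beta - \frac{\delta l}{\delta \gamma_m^\beta}\, \gamma_j^\beta\, \eta_j \Big] \right\} ,
\end{aligned} \tag{3.35}
\]
where we have used the variational expressions in (3.22) and integrated by parts. Here $\eta \,\lrcorner\, a$ in the first boundary term denotes substitution of the vector field $\eta$ into the tensor differential form $a$, and $d \langle \eta \,\lrcorner\, a , \delta l/\delta a \rangle$ is the divergence term we neglected earlier that arises from integration by parts in the definition of the $\diamond$-operation in equation (3.27). The dynamical Euler–Poincaré equations (3.23) and (3.24) are thus obtained from the Euler–Poincaré action principle (3.21), by requiring the coefficients of the arbitrary variations $\eta$ and $\Sigma$ to vanish in the variational formula (3.35). The remaining terms in (3.35) yield Noether's theorem for this system, which assigns a conservation law to each symmetry of the Euler–Poincaré variational principle.

Momentum conservation. In momentum conservation form, the PCF motion equation in the Eulerian fluid description (3.23) becomes, for algebraic dependence of the Lagrangian density $l$ on $(\xi, a, \nu, \gamma_m)$,
\[
\frac{\partial}{\partial t} \frac{\partial l}{\partial \xi_j} = - \frac{\partial}{\partial x_m} \Big( \xi_m \frac{\partial l}{\partial \xi_j} + l\, \delta_{mj} - \frac{\partial l}{\partial \gamma_m^\beta}\, \gamma_j^\beta \Big) + d \Big\langle \frac{\partial}{\partial x_j} \,\lrcorner\, a , \frac{\partial l}{\partial a} \Big\rangle . \tag{3.36}
\]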


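In this equation, expressed in Cartesian coordinates, there is an implied sum over the various types of advected tensor quantities, $a$. This momentum conservation law also arises from Noether's theorem, as a consequence of the invariance of the variational principle (3.21) under spatial translations. In fact, the simplest derivation of this equation is obtained by evaluating the variational formula (3.35) on the equations of motion and using the translational symmetry of the Lagrangian in Noether's theorem with $\eta_j = \partial/\partial x_j$. For an algebraic Lagrangian density $l(\xi, D, \nu, \gamma_m)$, this momentum conservation law becomes, cf. equation (2.69),
\[
\frac{\partial}{\partial t} \frac{\partial l}{\partial \xi_j} = - \frac{\partial}{\partial x_m} \Big( \xi_m \frac{\partial l}{\partial \xi_j} + \Big( l - D \frac{\partial l}{\partial D} \Big) \delta_{mj} - \frac{\partial l}{\partial \gamma_m^\beta}\, \gamma_j^\beta \Big) . \tag{3.37}
\]

Kelvin–Noether circulation theorem for PCFs. Rearranging the motion equation (3.23) and using the continuity equation for $D$ in (3.32) gives
\[
\frac{d}{dt} \oint_{c(\xi)} \frac{1}{D} \frac{\delta l}{\delta \xi_j}\, dx_j = \oint_{c(\xi)} \frac{1}{D} \Big( \frac{\delta l}{\delta a} \diamond a + \frac{\delta l}{\delta \nu} \diamond \nu + \frac{\delta l}{\delta \gamma_m} \diamond \gamma_m \Big) , \tag{3.38}
\]
where the circulation loop $c(\xi)$ moves with the fluid velocity $\xi$ and we have used the following relation, valid for one-form densities,
\[
\operatorname{ad}^*_\xi \frac{\delta l}{\delta \xi} = \pounds_\xi \frac{\delta l}{\delta \xi} , \quad \text{with} \quad \frac{\delta l}{\delta \xi} = \frac{\delta l}{\delta \xi_j}\, dx_j \otimes d^3x . \tag{3.39}
\]
This relation may be checked explicitly in Cartesian coordinates, as follows,
\[
\begin{aligned}
\Big\langle \pounds_\xi \frac{\delta l}{\delta \xi} , \eta \Big\rangle &= \int \Big( \frac{\partial}{\partial x^j} \Big( \xi^j \frac{\delta l}{\delta \xi^i} \Big) + \frac{\delta l}{\delta \xi^j} \frac{\partial \xi^j}{\partial x^i} \Big)\, \eta^i \, d^3x \\
&= - \int \frac{\delta l}{\delta \xi^j} \Big( \xi^i \frac{\partial \eta^j}{\partial x^i} - \eta^i \frac{\partial \xi^j}{\partial x^i} \Big)\, d^3x \\
&= - \Big\langle \frac{\delta l}{\delta \xi} , \operatorname{ad}_\xi \eta \Big\rangle = \Big\langle \operatorname{ad}^*_\xi \frac{\delta l}{\delta \xi} , \eta \Big\rangle .
\end{aligned} \tag{3.40}
\]
See Holm, Marsden, and Ratiu [1998] for more explanation and discussion of the Kelvin–Noether circulation theorem for Euler–Poincaré systems. In components, the Kelvin–Noether circulation theorem (3.38) for PCFs with Lagrangian $l(\xi, D, \nu, \gamma_m)$ may be written using equation (3.28) as,
\[
\frac{d}{dt} \oint_{c(\xi)} \frac{1}{D} \frac{\delta l}{\delta \xi_j}\, dx_j = \oint_{c(\xi)} \frac{1}{D} \Big[ D \frac{\partial}{\partial x_j} \frac{\delta l}{\delta D} - \frac{\delta l}{\delta \nu^\beta} \frac{\partial \nu^\beta}{\partial x_j} + \frac{\partial}{\partial x_m} \Big( \frac{\delta l}{\delta \gamma_m^\beta}\, \gamma_j^\beta \Big) - \frac{\delta l}{\delta \gamma_m^\beta} \frac{\partial \gamma_m^\beta}{\partial x_j} \Big]\, dx_j . \tag{3.41}
\]
Thus, gradients of angular frequency and order parameter strain may cause fluid circulation.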


3.2


Hamiltonian Dynamics of PCFs

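The Legendre Transformation. One passes from Euler–Poincaré equations on a Lie algebra $\mathfrak{g}$ to Lie–Poisson equations on the dual $\mathfrak{g}^*$ by means of the Legendre transformation; see, e.g., Holm, Marsden, and Ratiu [1998]. In our case, we start with the reduced Lagrangian $l$ on $\mathfrak{g} \times (V^* \times \mathfrak{o})\,g(t)^{-1}$ and perform a Legendre transformation in the variables $\xi$ and $\nu$ only, by writing
\[
\mu = \frac{\delta l}{\delta \xi} , \qquad \sigma = \frac{\delta l}{\delta \nu} , \qquad h(\mu, a, \sigma, \gamma) = \langle \mu , \xi \rangle + \langle \sigma , \nu \rangle - l(\xi, a, \nu, \gamma) . \tag{3.42}
\]
One then computes the variational derivatives of $h$ as
\[
\frac{\delta h}{\delta \mu} = \xi , \qquad \frac{\delta h}{\delta \sigma} = \nu , \qquad \frac{\delta h}{\delta a} = - \frac{\delta l}{\delta a} , \qquad \frac{\delta h}{\delta \gamma} = - \frac{\delta l}{\delta \gamma} . \tag{3.43}
\]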

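Hence, the Euler–Poincaré equations (3.23)–(3.33) for PCF dynamics in the Eulerian description imply the following equations for the Legendre-transformed variables $(\mu, a, \sigma, \gamma)$, cf. equations (2.76)–(2.79),
\[
\begin{aligned}
\frac{\partial \mu}{\partial t} &= - \operatorname{ad}^*_{\delta h/\delta\mu}\, \mu - \frac{\delta h}{\delta a} \diamond a - \frac{\delta h}{\delta \gamma_m} \diamond \gamma_m - \frac{\delta h}{\delta \sigma} \diamond \sigma , \\
\frac{\partial a}{\partial t} &= - \pounds_{\delta h/\delta\mu}\, a , \\
\frac{\partial\, \gamma \cdot dx}{\partial t} &= - \pounds_{\delta h/\delta\mu}\, ( \gamma \cdot dx ) + d\, \frac{\delta h}{\delta \sigma} + \operatorname{ad}_{\delta h/\delta\sigma}\, ( \gamma \cdot dx ) , \\
\frac{\partial \sigma}{\partial t} &= - \operatorname{div} \Big( \frac{\delta h}{\delta \mu}\, \sigma - \frac{\delta h}{\delta \gamma} \Big) + \operatorname{ad}^*_{\delta h/\delta\sigma}\, \sigma - \operatorname{ad}^*_{\gamma_m} \frac{\delta h}{\delta \gamma_m} .
\end{aligned} \tag{3.44}
\]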

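As for the case of liquid crystals discussed earlier, these equations are Hamiltonian and may be expressed in terms of a Lie–Poisson bracket.

Lie–Poisson bracket for PCFs. Assembling the PCF equations (3.44) into Hamiltonian form gives, symbolically,
\[
\frac{\partial}{\partial t}
\begin{bmatrix} \mu \\ a \\ \gamma \\ \sigma \end{bmatrix}
= - \, C
\begin{bmatrix} \delta h/\delta \mu \\ \delta h/\delta a \\ \delta h/\delta \gamma \\ \delta h/\delta \sigma \end{bmatrix} , \tag{3.45}
\]
where
\[
C =
\begin{bmatrix}
\operatorname{ad}^*_{\Box}\, \mu & \Box \diamond a & \Box \diamond \gamma & \Box \diamond \sigma \\
\pounds_{\Box}\, a & 0 & 0 & 0 \\
\pounds_{\Box}\, \gamma & 0 & 0 & - ( \operatorname{grad} - \operatorname{ad}_\gamma ) \\
\pounds_{\Box}\, \sigma & 0 & - ( \operatorname{div} - \operatorname{ad}^*_\gamma ) & - \operatorname{ad}^*_{\Box}\, \sigma
\end{bmatrix} ,
\]
with boxes $\Box$ indicating where the matrix operations occur. In the $\gamma$–$\sigma$ entry of the Hamiltonian matrix (3.45), one recognizes the covariant spatial derivative defined in equation (3.29), and finds its adjoint operator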


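in the $\sigma$–$\gamma$ entry. More explicitly, in terms of indices and differential operators, and for $a = D$, the mass density, this Hamiltonian matrix form becomes
\[
\frac{\partial}{\partial t}
\begin{bmatrix} \mu_i \\ D \\ \gamma_i^\alpha \\ \sigma_\alpha \end{bmatrix}
= - \, \mathsf{D}
\begin{bmatrix} \delta h/\delta \mu_j \\ \delta h/\delta D \\ \delta h/\delta \gamma_j^\beta \\ \delta h/\delta \sigma_\beta \end{bmatrix} , \tag{3.46}
\]
where
\[
\mathsf{D} =
\begin{bmatrix}
\mu_j \partial_i + \partial_j \mu_i & D \partial_i & \partial_j \gamma_i^\beta - \gamma^\beta_{j,i} & \sigma_\beta \partial_i \\
\partial_j D & 0 & 0 & 0 \\
\gamma_j^\alpha \partial_i + \gamma^\alpha_{i,j} & 0 & 0 & - \delta^\alpha_\beta \partial_i - t^\alpha_{\beta\kappa} \gamma_i^\kappa \\
\partial_j \sigma_\alpha & 0 & - \delta^\beta_\alpha \partial_j + t^\beta_{\alpha\kappa} \gamma_j^\kappa & - t^\kappa_{\alpha\beta} \sigma_\kappa
\end{bmatrix} .
\]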

Here, the summation convention is enforced on repeated indices. Upper Greek indices refer to the Lie algebraic basis set, lower Greek indices refer to the dual basis and Latin indices refer to the spatial reference frame. The partial derivative $\partial_j = \partial/\partial x_j$, say, acts to the right on all terms in a product by the chain rule. For the case that $t^\alpha_{\beta\kappa}$ are the structure constants $\epsilon_{\alpha\beta\kappa}$ for the Lie algebra $so(3)$, the Lie–Poisson Hamiltonian matrix (2.83) for liquid crystals is recovered, modulo an inessential factor of 2.

Remark. As mentioned earlier in our discussion of Hamiltonian dynamics of liquid crystals, the Hamiltonian matrix in equation (3.46) was discovered some time ago in the context of investigating the relation between spin-glasses and Yang–Mills magnetohydrodynamics (YM-MHD) by using the Hamiltonian approach in Holm and Kupershmidt [1988]. There, it was shown to be a valid Hamiltonian matrix by associating its Poisson bracket with the dual space of a certain Lie algebra of semidirect-product type that has a generalized two-cocycle on it. This generalized two-cocycle contributes the grad and div terms appearing in the more symbolic expression of this Hamiltonian matrix in equation (3.45). The mathematical discussion of this Lie algebra and its generalized two-cocycle, as well as the corresponding Lie–Poisson Hamiltonian equations for spin-glass fluids and YM-MHD, are given in Holm and Kupershmidt [1988]. The present work provides a rationale for the derivation of such Lie–Poisson brackets from the Lagrangian side.

Spatially one-dimensional static solutions with z-variation. Static (steady, zero-velocity) solutions for PCFs, with constant pressure and one-dimensional spatial variations in, say, the $z$-direction obey equations (3.46) for $i = 3 = j$, rewritten as
\[
\begin{aligned}
- D \frac{d}{dz} \frac{\delta h}{\delta D} &= \gamma_3^\alpha \frac{d}{dz} \frac{\delta h}{\delta \gamma_3^\alpha} + \sigma_\alpha \frac{d}{dz} \frac{\delta h}{\delta \sigma_\alpha} = 0 , \\
\frac{d}{dz} \frac{\delta h}{\delta \gamma_3^\alpha} &= t^\beta_{\alpha\kappa}\, \gamma_3^\kappa \frac{\delta h}{\delta \gamma_3^\beta} - t^\kappa_{\alpha\beta}\, \sigma_\kappa \frac{\delta h}{\delta \sigma_\beta} , \\
\frac{d}{dz} \frac{\delta h}{\delta \sigma_\alpha} &= - t^\alpha_{\beta\kappa}\, \gamma_3^\kappa \frac{\delta h}{\delta \sigma_\beta} .
\end{aligned} \tag{3.47}
\]
As for the case of liquid crystals, the sum of terms in the first equation of the set (3.47) vanishes to give zero pressure gradient, as a consequence of the latter two equations. Under the Legendre transformation
\[
h(\gamma_3 , \sigma) = \Pi_\alpha \gamma_3^\alpha + \Gamma^\alpha \sigma_\alpha - H(\Pi , \Gamma) , \tag{3.48}
\]
these equations become,
\[
\begin{aligned}
\frac{d}{dz} \Pi_\alpha &= t^\beta_{\alpha\kappa} \frac{\delta H}{\delta \Pi_\kappa}\, \Pi_\beta - t^\kappa_{\alpha\beta} \frac{\delta H}{\delta \Gamma^\kappa}\, \Gamma^\beta , \\
\frac{d}{dz} \Gamma^\alpha &= - t^\alpha_{\beta\kappa} \frac{\delta H}{\delta \Pi_\kappa}\, \Gamma^\beta .
\end{aligned} \tag{3.49}
\]

These Legendre-transformed equations are the Poincaré [1901] generalization of Euler's equations for a heavy top, expressing them on an arbitrary Lie algebra with structure constants $t^\alpha_{\beta\kappa}$. Thus, the steady, spatially one-dimensional solutions for all PCFs have the underlying Lie algebra structure discovered in Poincaré [1901].

Spatially homogeneous, time-dependent PCF flows. Spatially homogeneous solutions of equations (3.46) obey the dynamical equations,
\[
\begin{aligned}
\frac{d \sigma_\alpha}{dt} &= - t^\beta_{\alpha\kappa}\, \gamma_j^\kappa \frac{\delta h}{\delta \gamma_j^\beta} + t^\kappa_{\alpha\beta}\, \sigma_\kappa \frac{\delta h}{\delta \sigma_\beta} , \\
\frac{d \gamma_i^\alpha}{dt} &= t^\alpha_{\beta\kappa}\, \gamma_i^\kappa \frac{\delta h}{\delta \sigma_\beta} .
\end{aligned} \tag{3.50}
\]
For a single spatial index, say $\gamma_i$, these are again the Poincaré [1901] equations generalizing Euler's equations for a heavy top to an arbitrary Lie algebra. Of course, the corresponding Hamiltonian matrix for this system is the lower right corner of the matrix in equation (3.46). When $t^\alpha_{\beta\kappa} = \epsilon_{\alpha\beta\kappa}$ for the Lie algebra $so(3)$, Poincaré's equations (3.50) correspond to the Leggett equations for $^3$He-A with spin density $\sigma_\alpha$ and spin anisotropy vector $\gamma_i^\alpha$; see Leggett [1975]. For special solutions of these and other related equations in the context of $^3$He-A, see Golo and Monastyrskii [1977, 1978], Golo, Monastyrskii, and Novikov [1979].

Evolution of the disclination density. Holm and Kupershmidt [1988] use the chain rule and the defining relation for the disclination density, cf. equation (2.28),
\[
B_{ij}^\alpha \equiv \gamma^\alpha_{i,j} - \gamma^\alpha_{j,i} + t^\alpha_{\beta\kappa}\, \gamma_i^\beta \gamma_j^\kappa , \tag{3.51}
\]


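to transform the Hamiltonian matrix (3.46) to a new Hamiltonian matrix, whose Lie–Poisson Hamiltonian dynamics may be written as
\[
\frac{\partial}{\partial t}
\begin{bmatrix} \mu_i \\ D \\ B_{ij}^\alpha \\ \sigma_\alpha \end{bmatrix}
= - \, \mathsf{E}
\begin{bmatrix} \delta h/\delta \mu_k \\ \delta h/\delta D \\ \delta h/\delta B_{lm}^\beta \\ \delta h/\delta \sigma_\beta \end{bmatrix} ,
\]
where
\[
\mathsf{E} =
\begin{bmatrix}
\mu_k \partial_i + \partial_k \mu_i & D \partial_i & - B^\beta_{lm,i} + \partial_m B^\beta_{li} - \partial_l B^\beta_{mi} & \sigma_\beta \partial_i \\
\partial_k D & 0 & 0 & 0 \\
B^\alpha_{ij,k} + B^\alpha_{ik} \partial_j - B^\alpha_{jk} \partial_i & 0 & 0 & - t^\alpha_{\beta\kappa} B^\kappa_{ij} \\
\partial_k \sigma_\alpha & 0 & t^\beta_{\alpha\kappa} B^\kappa_{lm} & - t^\kappa_{\alpha\beta} \sigma_\kappa
\end{bmatrix} .
\]
The corresponding PCF dynamics for the disclination density then emerges as
\[
\frac{\partial B_{ij}^\alpha}{\partial t} = - \big( B^\alpha_{ij,k} + B^\alpha_{ik} \partial_j - B^\alpha_{jk} \partial_i \big) \frac{\delta h}{\delta \mu_k} + t^\alpha_{\beta\kappa}\, B^\kappa_{ij} \frac{\delta h}{\delta \sigma_\beta} . \tag{3.52}
\]
As expected, this equation preserves the trivial solution $B_{ij}^\alpha = 0$, which is the case when the $\gamma_m$-strain field is continuous and the complex fluid has no defects. We shall mention a strategy for dealing with defects in Section 4.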
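To make the spatially homogeneous equations (3.50) concrete, the sketch below integrates them for $so(3)$ and a single spatial index. It assumes Levi-Civita structure constants and identifies the dual with $\mathbb{R}^3$, so that (3.50) reads $\dot{\sigma} = \partial h/\partial\gamma \times \gamma + \partial h/\partial\sigma \times \sigma$ and $\dot{\gamma} = \partial h/\partial\sigma \times \gamma$. The heavy-top-like Hamiltonian, the coefficients `INERTIA` and `G`, the initial data, and the Runge–Kutta integrator are all illustrative assumptions, not taken from the text.

```python
def cross(a, b):
    """Cross product in R^3, i.e. the so(3) bracket under the assumed identification."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


INERTIA = (1.0, 2.0, 3.0)  # hypothetical coefficients in the assumed Hamiltonian
G = (0.0, 0.0, 1.0)        # hypothetical fixed vector coupling to gamma


def hamiltonian(state):
    """Assumed h(sigma, gamma) = sum_a sigma_a^2 / (2 I_a) + G . gamma."""
    sigma, gamma = state[:3], state[3:]
    return sum(s * s / (2.0 * i) for s, i in zip(sigma, INERTIA)) + dot(G, gamma)


def rhs(state):
    """Right-hand side of (3.50) for so(3) with one spatial index."""
    sigma, gamma = state[:3], state[3:]
    dh_dsigma = [s / i for s, i in zip(sigma, INERTIA)]
    dh_dgamma = list(G)
    dsigma = [x + y for x, y in zip(cross(dh_dgamma, gamma), cross(dh_dsigma, sigma))]
    dgamma = cross(dh_dsigma, gamma)
    return dsigma + dgamma


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return [s + c * dt * ki for s, ki in zip(state, k)]
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5))
    k3 = rhs(shifted(k2, 0.5))
    k4 = rhs(shifted(k3, 1.0))
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def invariants(state):
    """Energy h and the two quantities |gamma|^2 and sigma . gamma."""
    sigma, gamma = state[:3], state[3:]
    return (hamiltonian(state), dot(gamma, gamma), dot(sigma, gamma))


def integrate(state, dt, steps):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


start = [1.0, 0.5, -0.3, 0.2, -0.4, 0.9]
end = integrate(start, 0.005, 2000)
drift = [abs(a - b) for a, b in zip(invariants(start), invariants(end))]
print(drift)
```

For this assumed Hamiltonian the three printed drifts should be small, reflecting conservation of $h$, $|\gamma|^2$ and $\sigma \cdot \gamma$ along solutions of (3.50) up to integration error; the last two are conserved for any Hamiltonian, which can be seen directly from the cross-product form.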

3.3

Clebsch Approach for PCF Dynamics

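Following Serrin [1959], we call the auxiliary constraints imposed by the Eulerian kinematic equations the Lin constraints. As we shall see, the diamond operation $\diamond$ defined in equation (3.27) arises naturally in imposing the Lin constraints. Taking variations of the constrained Eulerian action,
\[
S = \int dt \left\{ l(\xi, a, \nu, \gamma) + \Big\langle v , \frac{\partial a}{\partial t} + \pounds_\xi\, a \Big\rangle + \Big\langle \beta , \frac{\partial \gamma}{\partial t} + \pounds_\xi\, \gamma - d\nu - \operatorname{ad}_\nu \gamma \Big\rangle \right\} , \tag{3.53}
\]
yields the following PCF Clebsch relations,
\[
\begin{aligned}
\delta\xi :&\quad \frac{\delta l}{\delta \xi} - v \diamond a - \beta \diamond \gamma = 0 , \\
\delta\nu :&\quad \frac{\delta l}{\delta \nu} + d\beta - \operatorname{ad}^*_\gamma \beta = 0 , \\
\delta a :&\quad \frac{\delta l}{\delta a} - \frac{\partial v}{\partial t} - \pounds_\xi\, v = 0 , \\
\delta v :&\quad \frac{\partial a}{\partial t} + \pounds_\xi\, a = 0 , \\
\delta\gamma :&\quad \frac{\delta l}{\delta \gamma} - \frac{\partial \beta}{\partial t} - \pounds_\xi\, \beta + \operatorname{ad}^*_\nu \beta = 0 , \\
\delta\beta :&\quad \frac{\partial \gamma}{\partial t} + \pounds_\xi\, \gamma - d\nu - \operatorname{ad}_\nu \gamma = 0 .
\end{aligned} \tag{3.54}
\]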


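We shall show that these Clebsch relations recover the Euler–Poincaré equations (3.23)–(3.24). (In what follows, we shall ignore boundary and endpoint terms that arise from integrating by parts.) The diamond operation is defined by
\[
\langle v \diamond a , \eta \rangle \equiv - \langle v , \pounds_\eta\, a \rangle = - \langle v , a\,\eta \rangle . \tag{3.55}
\]
This operation is antisymmetric,
\[
\langle v \diamond a , \eta \rangle = - \langle a \diamond v , \eta \rangle , \tag{3.56}
\]
as obtained from,
\[
\langle v , \pounds_\eta\, a \rangle + \langle \pounds_\eta\, v , a \rangle = 0 , \quad \text{or,} \quad \langle v , a\,\eta \rangle + \langle v\,\eta , a \rangle = 0 , \tag{3.57}
\]
and the symmetry of the pairing $\langle \cdot , \cdot \rangle$. The diamond operation also satisfies the chain rule under the Lie derivative,
\[
\langle \pounds_\xi ( v \diamond a ) , \eta \rangle = \langle ( \pounds_\xi\, v ) \diamond a , \eta \rangle + \langle v \diamond ( \pounds_\xi\, a ) , \eta \rangle . \tag{3.58}
\]
This property can be verified, as follows,
\[
\begin{aligned}
\langle \pounds_\xi\, v \diamond a , \eta \rangle + \langle v \diamond \pounds_\xi\, a , \eta \rangle &= \langle v\,\xi\,\eta , a \rangle - \langle v\,\eta\,\xi , a \rangle \\
&= \langle a , v\, ( \operatorname{ad}_\xi \eta ) \rangle = - \langle a \diamond v , ( \operatorname{ad}_\xi \eta ) \rangle \\
&= \langle \operatorname{ad}^*_\xi ( a \diamond v ) , \eta \rangle = \langle \pounds_\xi ( a \diamond v ) , \eta \rangle ,
\end{aligned} \tag{3.59}
\]
where we have used $\langle v\,\xi , a\,\eta \rangle + \langle v\,\xi\,\eta , a \rangle = 0$, implied by (3.57), in the first step. Finally, we have the useful identity,
\[
\langle \beta \diamond d\nu , \eta \rangle = - \langle d\beta \diamond \nu , \eta \rangle , \tag{3.60}
\]
as obtained from $(d\nu)\,\eta = d(\nu\,\eta)$ and
\[
\langle \beta , d(\nu\,\eta) \rangle + \langle d\beta , \nu\,\eta \rangle = 0 . \tag{3.61}
\]
These three properties of the $\diamond$ operation and the PCF Clebsch relations (3.54) together imply
\[
\begin{aligned}
\Big( \frac{\partial}{\partial t} + \pounds_\xi \Big) ( v \diamond a + \beta \diamond \gamma ) &= \frac{\delta l}{\delta a} \diamond a + \frac{\delta l}{\delta \nu} \diamond \nu + \frac{\delta l}{\delta \gamma_m} \diamond \gamma_m \\
&\quad + \Big[ ( \operatorname{ad}^*_\nu \beta ) \diamond \gamma + \beta \diamond \operatorname{ad}_\nu \gamma - ( \operatorname{ad}^*_\gamma \beta ) \diamond \nu \Big] .
\end{aligned} \tag{3.62}
\]
Here the term $\beta \diamond d\nu$ has been rewritten using (3.60) and the $\delta\nu$-relation in (3.54). The term in square brackets is seen to vanish, upon pairing it with a vector field, integrating by parts and again using the properties of the $\diamond$ operation. This manipulation recovers the PCF motion equation (3.23) as
\[
\Big( \frac{\partial}{\partial t} + \pounds_\xi \Big) \frac{\delta l}{\delta \xi} = \frac{\delta l}{\delta a} \diamond a + \frac{\delta l}{\delta \nu} \diamond \nu + \frac{\delta l}{\delta \gamma_m} \diamond \gamma_m , \tag{3.63}
\]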


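since, as we have seen,
\[
\pounds_\xi \frac{\delta l}{\delta \xi} = \operatorname{ad}^*_\xi \frac{\delta l}{\delta \xi} , \tag{3.64}
\]
for one-form densities such as $\delta l/\delta\xi$. The PCF micromotion equation (3.24) is also recovered from the Clebsch relations (3.54). This is accomplished by taking the time derivative of the $\delta\nu$-formula, substituting the $\delta\beta$- and $\delta\gamma$-formulas, and using linearity of $\operatorname{ad}^*$ to find
\[
\begin{aligned}
\Big( \frac{\partial}{\partial t} + \pounds_\xi \Big) \frac{\delta l}{\delta \nu} + d\, \frac{\delta l}{\delta \gamma} - \operatorname{ad}^*_\gamma \frac{\delta l}{\delta \gamma}
&= - d ( \operatorname{ad}^*_\nu \beta ) + \operatorname{ad}^*_{d\nu} \beta + \operatorname{ad}^*_\gamma ( \operatorname{ad}^*_\nu \beta ) + \operatorname{ad}^*_{( \operatorname{ad}_\nu \gamma )} \beta \\
&= - \operatorname{ad}^*_\nu\, d\beta + \operatorname{ad}^*_\nu ( \operatorname{ad}^*_\gamma \beta ) \\
&= \operatorname{ad}^*_\nu \frac{\delta l}{\delta \nu} .
\end{aligned} \tag{3.65}
\]
Hence, the Clebsch relations (3.54) also recover the micromotion equation (3.24).

Remarks. From the Hamiltonian viewpoint, the pairs $(v, a)$ and $(\beta, \gamma)$ are canonically conjugate variables and the Clebsch map $(v, a, \beta, \gamma) \to (\mu, \sigma)$, with
\[
\frac{\delta l}{\delta \xi} \equiv \mu = v \diamond a + \beta \diamond \gamma , \tag{3.66}
\]
\[
\frac{\delta l}{\delta \nu} \equiv \sigma = - d\beta + \operatorname{ad}^*_\gamma \beta , \tag{3.67}
\]
is a Poisson map from the canonical Poisson bracket to the Lie–Poisson Hamiltonian structure given in equation (3.46), in which $a = D$. Of course, there is no obstruction against allowing $a$ to be any advected quantity, as discussed in Holm, Marsden, and Ratiu [1998]. The generalized two-cocycle associated with the Hamiltonian matrix in (3.46) arises from the term $d\beta$ in the $\sigma$-part of this Poisson map. Various other applications of the Lin constraint and Clebsch representation approach in formulating and analyzing ideal fluid and plasma dynamics as Hamiltonian systems appear in Holm and Kupershmidt [1983b], Marsden and Weinstein [1983], Zakharov, Musher, and Rubenchik [1985], Zakharov and Kusnetsov [1997].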
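The antisymmetry (3.56) of the diamond operation can be seen in a one-dimensional periodic setting. The sketch below assumes $a$ is a scalar function, so that $\pounds_\eta a = \eta\, a'$, and $v$ is its dual density, so that $\pounds_\eta v = (\eta v)'$. The grid size, the sample fields, and the centered-difference derivative are illustrative choices; the centered difference is used because it is skew-adjoint under the discrete pairing, which mimics integration by parts exactly.

```python
import math

N = 64                     # number of grid points on the periodic interval [0, 2*pi)
H = 2.0 * math.pi / N      # grid spacing
X = [i * H for i in range(N)]


def ddx(f):
    """Centered difference on the periodic grid (skew-adjoint for the pairing below)."""
    return [(f[(i + 1) % N] - f[i - 1]) / (2.0 * H) for i in range(N)]


def pair(f, g):
    """Discrete version of the L2 pairing, i.e. the integral of f*g over one period."""
    return H * sum(p * q for p, q in zip(f, g))


def diamond_v_a(v, a, eta):
    """<v diamond a, eta> = -<v, L_eta a>, with L_eta a = eta * a' for a scalar a."""
    da = ddx(a)
    return -pair(v, [e * d for e, d in zip(eta, da)])


def diamond_a_v(a, v, eta):
    """<a diamond v, eta> = -<a, L_eta v>, with L_eta v = (eta * v)' for a density v."""
    return -pair(a, ddx([e * w for e, w in zip(eta, v)]))


# Arbitrary smooth periodic sample fields.
a = [math.sin(x) for x in X]
v = [1.0 + 0.5 * math.cos(x) for x in X]
eta = [2.0 + math.sin(x) for x in X]

lhs = diamond_v_a(v, a, eta)
rhs = diamond_a_v(a, v, eta)
print(lhs, rhs, lhs + rhs)
```

The two pairings come out equal and opposite up to rounding, which is the discrete counterpart of (3.57). A one-sided difference would not have this property, so the agreement depends on the choice of a skew-adjoint derivative.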

3.4

Conclusions for PCFs

Perfect complex fluids (PCFs) have internal variables whose micromotion is coupled to the fluid's motion. Examples of PCFs include spin-glass fluids, superfluids and liquid crystals. PCF internal variables are materially advected order parameters that may be represented equivalently as either geometrical objects, or as coset spaces of Lie groups. The new feature of PCFs relative to simple fluids with advected parameters treated in Holm, Marsden, and Ratiu [1998] is the dependence of their Lagrangian
\[
L : TG \times V^* \times T\mathcal{O} \longrightarrow \mathbb{R} ,
\]
on $T\mathcal{O}$, the tangent space of their order parameter group. Moreover, the diffeomorphisms $G$ act on $T\mathcal{O}$. We treat Lagrangians that are invariant under the right actions of both the order parameter group $\mathcal{O}$ and the diffeomorphisms $G$. In this case, reaching the Euler–Poincaré fluid description requires two stages of Lagrangian reduction,
\[
\big( TG \times V^* \times (T\mathcal{O}/\mathcal{O}) \big)/G \;\simeq\; \mathfrak{g} \times (V^* \times \mathfrak{o})\,g^{-1}(t) ,
\]
rather than the single stage of Lagrangian reduction (with respect to the "relabeling transformations" of $G$) employed for simple fluids in Holm, Marsden, and Ratiu [1998]. After studying the example of nematics in Section 2, we derived the Euler–Poincaré dynamics of PCFs in two stages of Lagrangian reduction in Section 3. The first stage produced the Lagrange–Poincaré equations derived from an action principle defined on the right-invariant Lie algebra of the order parameter group in the Lagrangian (or material) fluid description. The second stage of Lagrangian reduction passed from the material fluid description to the Eulerian (or spatial) fluid description and produced the Euler–Poincaré equations for PCFs. We also derived these Euler–Poincaré equations using the Clebsch approach. In addition, we used a Legendre transformation to obtain the Lie–Poisson Hamiltonian formulation of PCF dynamics in the Eulerian fluid description. The Lie–Poisson Hamiltonian formulation of these equations agreed with that found earlier in Holm and Kupershmidt [1987, 1988], who treated spin-glass fluids, Yang–Mills magnetohydrodynamics and superfluid $^4$He and $^3$He.
Thus, we found that Lagrangian reduction by stages provides a rationale for deriving this Lie–Poisson Hamiltonian formulation from the Lagrangian side. This approach also ﬁts well with some gauge theoretical descriptions of condensed matter physics.

4

A Strategy for Introducing Defect Dynamics

Many other potential applications of the Euler–Poincaré framework abound in the physics of condensed matter. For example, besides the perfect liquid crystal dynamics treated here explicitly, the superfluid hydrodynamics of the various phases of $^3$He may be treated similarly. In particular, the geometrical framework of Lagrangian reduction by stages is well-adapted to the standard identification of the phases of $^3$He with the independent cosets of the order parameter group $SO(3) \times SO(3) \times U(1)$, as discussed, e.g., in Mineev [1980] and Volovick [1992]. Magnetic materials may also be treated this way. The seminal papers on the geometrical properties of magnetic materials are Dzyaloshinskii [1977], Volovik and Dotsenko [1980], and Dzyaloshinskii and Volovick [1980]. Other recent studies of the dynamics of magnetic materials and superfluid $^3$He in directions relevant to the present paper appear, e.g., in Holm and Kupershmidt [1988], Balatskii [1990], Isayev and Peletminsky [1997], Isaev, Kovalevskii, and Peletminskii [1997].

Most of these physical applications involve defects (imperfections, or "glitches" that appear as discontinuities in the order parameter) and one must describe their dynamics, as well. This is an area of intense investigation in many contexts in condensed matter physics. Our approach is based on an analogy with the theory of the Hall effect in a neutral multicomponent fluid plasma when inertia is negligible in one of the components (the electrons). The Hamiltonian context for Hall effects in a neutral ion-electron plasma was considered by Holm [1987] for a normal-fluid plasma and by Holm and Kupershmidt [1987] for a multicomponent electromagnetically charged superfluid plasma. In this context, the introduction of an independent gauge field associated with the momentum of a distribution of superfluid vortex lines generalizes to apply for defects or vortices in any continuous medium possessing an order parameter description that arises from spontaneous symmetry breaking. In the remainder of this paper, we shall use the idea of reactive forces arising via the Hall effect to discuss the case of quantum vortices in superfluid Helium-II.
In particular, we shall apply the Hall effect analogy to describe the hydrodynamics of the quantum vortex tangle in superfluid turbulence as an additional "third fluid" in Landau's two-fluid model. The third fluid associated with the vortex tangle carries momentum and moves with its own independent velocity in superfluid flows.

4.1

Vortices in Superﬂuid 4 He

In superfluid $^4$He the order parameter group is $U(1)$ and the defects are called vortices. These are quantum vortices, since their circulation comes in integer multiples of $\kappa = h/m \simeq 10^{-3}\,{\rm cm}^2/{\rm sec}$. Conservation of the number of quantum vortices moving through superfluid $^4$He (and across the streamlines of the normal fluid component) is expressed by
\[
\frac{d}{dt} \int_S \omega \cdot \hat{n}\, dS = 0 , \tag{4.1}
\]
where the superfluid vorticity $\omega$ is the areal density of vortices and $\hat{n}$ is the unit vector normal to the surface $S$ whose boundary $\partial S$ moves with the vortex line velocity $v_\ell$. When $\omega = \operatorname{curl} v_s$ this is equivalent to a vortex Kelvin theorem
\[
\frac{d}{dt} \oint_{\partial S(v_\ell)} v_s \cdot dx = 0 , \tag{4.2}
\]
which in turn implies the fundamental relation
\[
\frac{\partial v_s}{\partial t} - v_\ell \times \omega = \nabla \mu . \tag{4.3}
\]
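As a quick arithmetic check on the quoted size of the circulation quantum $\kappa = h/m$, the sketch below evaluates it numerically. It assumes $m$ is the mass of a single $^4$He atom, taken here as about 4.002603 atomic mass units, and uses standard SI values for the Planck constant and the atomic mass constant; none of these numerical values appears in the text, and the variable names are illustrative.

```python
# Standard physical constants (assumed values, not taken from the text).
PLANCK_H = 6.62607015e-34          # Planck constant, J s
ATOMIC_MASS_UNIT = 1.66053906660e-27   # atomic mass constant, kg
HE4_MASS_U = 4.002603              # approximate mass of a helium-4 atom, in atomic mass units

# Mass m of one helium-4 atom, assumed to be the m in kappa = h/m.
mass_he4 = HE4_MASS_U * ATOMIC_MASS_UNIT

# Quantum of circulation in SI units (m^2/s) and in CGS units (cm^2/s).
kappa_si = PLANCK_H / mass_he4
kappa_cgs = kappa_si * 1.0e4

print(kappa_cgs)
```

Under these assumptions the result is slightly below $10^{-3}\,{\rm cm}^2/{\rm sec}$, in line with the order of magnitude quoted above.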

The superﬂuid velocity naturally splits into vs = u − A, where u = ∇φ and (minus) the curl of A yields the superﬂuid vorticity ω. The phase φ is then a regular function without singularities. This splitting will reveal that the Hamiltonian dynamics of superﬂuid 4 He with vortices may be expressed as an invariant subsystem of a larger Hamiltonian system in which u and A have independent evolution equations. We begin by deﬁning a phase frequency in the normal velocity frame as ∂φ + vn · ∇φ = ν . ∂t

(4.4)

The mass density ρ and the phase φ are canonically conjugate in the Hamiltonian formulation of the Landau two-ﬂuid model. Therefore, ν = −δh/δρ for a given Hamiltonian h and u = ∇φ satisﬁes δh ∂u + vn · ∇u + (∇vn )T · u = −∇ . ∂t δρ

(4.5)

The mass density ρ satisﬁes the dual equation ∂ρ δh + ∇ · (ρvn ) = −∇ · . ∂t δu

(4.6)

In the last two equations, we see the expected two-cocycle terms for Landau's perfect superfluid, in which $\omega = 0$. Here $u$ and $\rho$ in the superfluid play the roles of $\gamma$ and $\sigma$ for the PCFs. The curvature $\omega$ is nonvanishing now, because of the vortices (defects) that are represented by $A$. Perhaps not surprisingly, the rotational and potential components of the superfluid velocity must satisfy similar equations, but the rotational component must be advected by another velocity, the vortex line velocity $v_\ell$, instead of the normal velocity $v_n$ that advects $u$. Absorbing all gradients into $u$ yields the form of the equation we should expect for $A$,
\[ \frac{\partial A}{\partial t} + v_\ell \times \omega = 0 . \tag{4.7} \]

Taking the difference of the equations for $u$ and $A$ then recovers equation (4.3) as
\[ \frac{\partial v_s}{\partial t} - v_\ell \times \omega = -\nabla\Big( v_n \cdot u + \frac{\delta h}{\delta\rho} \Big) , \quad\text{with}\quad v_s = u - A , \tag{4.8} \]
in which regularity of the phase $\phi$ allows one to set $\operatorname{curl} u = 0$. It remains to determine $v_\ell$ from the Euler–Poincaré formulation. Including the additional degree of freedom $A$ allows the vortex lines to move relative to both the normal and super components of the fluid, and thereby introduces additional reactive forces associated with the momentum of the vortex lines.

For superfluid $^4$He with vortices, the momenta conjugate to the velocities $v_n$ and $v_\ell$ shall be our basic dynamical variables. To develop the Euler–Poincaré formulation of this problem, we must consider a Lagrangian that first of all is invariant under the order parameter group $O = U(1)$. The Lagrangian must also be invariant under two types of diffeomorphisms: one corresponding to the material motion of the normal fluid, $G_n$, and another corresponding to the motion of the vortices, $G_\ell$. Thus we consider a Lagrangian that allows the following direct product of group reductions,

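\[ \big[ TG_n \times \big( V^* \times (TO/O) \big) \big]/G_n \times (TG_\ell \times V^*)/G_\ell \;\simeq\; \mathfrak{g}_n \times (V^* \times \mathfrak{o})\, g_n^{-1}(t) \times (\mathfrak{g}_\ell \times V^*)\, g_\ell^{-1}(t) . \]
We denote the corresponding dependence in this Lagrangian as
\[ l(v_n, S, \nu, u\,;\, v_\ell, n_\ell) . \tag{4.9} \]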

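Here $\mathfrak{g}_n$ and $\mathfrak{g}_\ell$ denote the Lie algebras of vector fields associated to the velocities $v_n$ and $v_\ell$, respectively. Also $\mathfrak{o}$ denotes the Lie algebra of the Abelian gauge group $U(1)$; so $\mathfrak{o}$ contains $\nu$ and $u$. The $V^*$ in each factor denotes the corresponding advected densities: entropy $S$ advected by $v_n$ and vortex inertial mass $n_\ell$ advected by $v_\ell$. We denote the inverse right actions of the two diffeomorphisms as $g_n^{-1}(t)$ and $g_\ell^{-1}(t)$. At a given time $t$, these actions separately map spatial variables back to coordinates moving with the normal material and with the vortices, respectively. According to the Euler–Poincaré action principle, the following dynamical equations are generated by this Lagrangian, cf. (3.23) and (3.24),
\[
\begin{aligned}
\frac{\partial}{\partial t}\frac{\delta l}{\delta v_n} &= -\operatorname{ad}^*_{v_n}\frac{\delta l}{\delta v_n} + \frac{\delta l}{\delta S}\diamond S + \frac{\delta l}{\delta\nu}\diamond\nu + \frac{\delta l}{\delta u_m}\diamond u_m , \\
\frac{\partial}{\partial t}\frac{\delta l}{\delta\nu} &= -\operatorname{div}\Big( \frac{\delta l}{\delta\nu}\, v_n + \frac{\delta l}{\delta u} \Big) , \\
\frac{\partial}{\partial t}\frac{\delta l}{\delta v_\ell} &= -\operatorname{ad}^*_{v_\ell}\frac{\delta l}{\delta v_\ell} + \frac{\delta l}{\delta n_\ell}\diamond n_\ell .
\end{aligned}
\tag{4.10}
\]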

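These are the Euler–Poincaré equations for a superfluid with vortices. The Eulerian kinematic equations are, cf. (2.61)–(2.62),
\[ \frac{\partial S}{\partial t} = -\operatorname{div}(S\, v_n) , \qquad \frac{\partial n_\ell}{\partial t} = -\operatorname{div}(n_\ell\, v_\ell) , \qquad \frac{\partial u}{\partial t} = v_n \times \operatorname{curl} u - \nabla(u \cdot v_n - \nu) . \]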

If $v_\ell$ and $n_\ell$ are absent, these equations reduce to the equations for a PCF with broken $U(1)$ symmetry. The momentum density conjugate to the frequency $\nu$ is the total mass density given by $\rho = -\delta l/\delta\nu$, which satisfies the equation above. So far, this is interesting, but standard in the present context. However, now a new feature develops because of the physical description of superfluids. Physically, nothing is known on the Lagrangian side about the relation of the gauge frequency $\nu$ to the other variables. However, on the Hamiltonian side we know from the Legendre transformation that $\nu = -\delta h/\delta\rho$. Moreover, the thermodynamic energy on the Hamiltonian side is a known function of $\rho$. This means we should leave the Lagrangian side to finish determining the dynamics for superfluid $^4$He with vortices. The following Hamiltonian description for this dynamics is derived in Holm [2001].

4.1 Proposition. The dynamics for superfluid $^4$He with vortices follows from a Lie–Poisson bracket whose Hamiltonian matrix separates into two pieces given by$^7$
\[
\frac{\partial}{\partial t}
\begin{bmatrix} M_i \\ S \\ \rho \\ u_i \end{bmatrix}
= -
\begin{bmatrix}
M_j\partial_i + \partial_j M_i & S\partial_i & \rho\,\partial_i & \partial_j u_i - u_{j,i} \\
\partial_j S & 0 & 0 & 0 \\
\partial_j \rho & 0 & 0 & \partial_j \\
u_j\partial_i + u_{i,j} & 0 & \partial_i & 0
\end{bmatrix}
\begin{bmatrix} \delta h/\delta M_j \\ \delta h/\delta S \\ \delta h/\delta\rho \\ \delta h/\delta u_j \end{bmatrix} ,
\tag{4.11}
\]
and
\[
\frac{\partial}{\partial t}
\begin{bmatrix} N_i \\ n_\ell \end{bmatrix}
= -
\begin{bmatrix}
N_j\partial_i + \partial_j N_i & n_\ell\,\partial_i \\
\partial_j n_\ell & 0
\end{bmatrix}
\begin{bmatrix} \delta h/\delta N_j \\ \delta h/\delta n_\ell \end{bmatrix} ,
\tag{4.12}
\]
where $M = \delta l/\delta v_n$, $N = \delta l/\delta v_\ell$, $\rho = -\delta l/\delta\nu$ and the Hamiltonian $h$ is the Legendre transform of the Lagrangian $l$ in (4.9) with respect to $v_n$, $v_\ell$ and $\nu$.

$^7$ The first of these is the Hamiltonian matrix for a PCF with broken $U(1)$ symmetry. The other Hamiltonian matrix gives the standard semidirect-product Lie–Poisson bracket, without two-cocycles. This combination of Hamiltonian matrices was first introduced in Holm and Kupershmidt [1987] for superfluid plasmas.

4.2 Corollary. If the Hamiltonian has no explicit spatial dependence, then total momentum conservation holds as
\[ \frac{\partial}{\partial t}(M_j + N_j) = \{ M_j + N_j ,\, h \} = -\frac{\partial}{\partial x^k} T_j^{\,k} . \]
Suppose the Hamiltonian density has dependence $h(M, \rho, S, n_\ell, v_s, \omega, A)$, in which $v_s = u - A$, $A = -N/n_\ell$ and $\omega = \operatorname{curl} v_s$. Then, the stress tensor $T_j^{\,k}$ is expressed in terms of derivatives of the Hamiltonian as
\[
T_j^{\,k} = M_j \frac{\partial h}{\partial M_k}
+ v_{sj}\Big( \frac{\partial h}{\partial v_{sk}} + \Big(\operatorname{curl}\frac{\partial h}{\partial\omega}\Big)^{k} \Big)
- v_{sl,j}\,\epsilon^{mlk}\frac{\partial h}{\partial\omega_m}
+ \delta_j^{\,k} P
- A_j \frac{\partial h}{\partial A_k}\Big|_{v_s} ,
\]
where the pressure $P$ is defined as
\[ P = M_l \frac{\partial h}{\partial M_l} + \rho\frac{\partial h}{\partial\rho} + S\frac{\partial h}{\partial S} + n_\ell\frac{\partial h}{\partial n_\ell} - h . \]

Thus, the dependence of the Hamiltonian $h$ on the vorticity $\omega$ introduces reactive stresses due to the motion of the vortices. Details of the choice of Hamiltonian, as well as the derivation and interpretation of the explicit equations, are given in Holm [2001]. Other results include the transformation of these equations to a rotating reference frame and the resulting Taylor–Proudman theorem for superfluid $^4$He with vortices. The generalization of this idea to complex fluids with nonabelian broken symmetries will be discussed elsewhere.

Acknowledgments: I am grateful to A. Balatskii, H. Cendra, J. Hinch, P. Hjorth, J. Louck, J. Marsden, T. Mullin, M. Perlmutter, T. Ratiu, J. Toner and A. Weinstein for constructive comments and enlightening discussions during the course of this work. I am also grateful for hospitality at the Isaac Newton Institute for Mathematical Sciences where part of this work was completed. This research was supported by the U.S. Department of Energy under contracts W-7405-ENG-36 and the Applied Mathematical Sciences Program KC-07-01-01.

Appendix: External torques and partial Lagrangian reduction. The anisotropic dielectric and diamagnetic effects on the director angular momentum due to external electric and magnetic fields can be restored by adding the torques from equation (2.38) to the right hand side of the second equation in (2.52). Knowing they can be restored this way, one could simply ignore the external torques. However, the major applications of liquid crystals involve these torques and their restoration also provides an example of partial Lagrangian reduction. This example also briefly recapitulates the procedure of Lagrangian reduction by stages used in the remainder of the paper.

Partially reduced Lagrange–Poincaré equations. Restoring the effects of external torques requires that we consider Hamilton's principle for a Lagrangian that still retains its dependence on the director $n$,
\[ S = \int dt \int d^3X\; L(\dot{x}, J, n, \nu, \nabla n) . \tag{4.16} \]
In this case, the partial reduction of the Lagrangian dependence from $\dot{n}$ to $\nu = n \times \dot{n}$ proceeds as follows. The variation of $\nu = n \times \dot{n}$ gives
\[ \delta\nu = (n \times \delta n)^{\cdot} - 2\,\dot{n} \times \delta n = (n \times \delta n)^{\cdot} - 2\,\nu \times (n \times \delta n) . \tag{4.17} \]
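The two equalities in (4.17) can be spot-checked numerically. The sketch below is an illustration only, with hypothetical helper names: it draws a random unit director $n$, random $\dot{n}$ and $\delta n$ tangent to the unit sphere at $n$, and an arbitrary $\delta\dot{n}$, then compares $\delta\nu = \delta n \times \dot{n} + n \times \delta\dot{n}$ against both right-hand sides.

```python
import random


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def scale(c, a):
    return tuple(c * x for x in a)


def tangent_part(v, n):
    """Remove the component of v along the unit vector n."""
    return sub(v, scale(dot(v, n), n))


def check_identity(seed=0):
    """Return the max-norm errors of the two equalities in (4.17)."""
    rng = random.Random(seed)
    rand_vec = lambda: tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

    n = rand_vec()
    n = scale(1.0 / dot(n, n) ** 0.5, n)      # unit director, |n| = 1
    n_dot = tangent_part(rand_vec(), n)       # n . n_dot = 0
    delta_n = tangent_part(rand_vec(), n)     # n . delta_n = 0
    delta_n_dot = rand_vec()                  # variation of n_dot

    nu = cross(n, n_dot)
    # Direct variation of nu = n x n_dot.
    delta_nu = add(cross(delta_n, n_dot), cross(n, delta_n_dot))
    # Time derivative of (n x delta_n).
    ddt = add(cross(n_dot, delta_n), cross(n, delta_n_dot))

    rhs_first = sub(ddt, scale(2.0, cross(n_dot, delta_n)))
    rhs_second = sub(ddt, scale(2.0, cross(nu, cross(n, delta_n))))

    err_first = max(abs(x) for x in sub(delta_nu, rhs_first))
    err_second = max(abs(x) for x in sub(delta_nu, rhs_second))
    return err_first, err_second


print(check_identity(seed=1))
```

The first equality holds for any vectors; the second relies on $|n| = 1$ together with $n \cdot \dot{n} = 0$ and $n \cdot \delta n = 0$, which is why the sketch projects the random vectors onto the tangent plane.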

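Moreover, we have the relation
\[ \delta n \cdot A = (n \times \delta n) \cdot (n \times A) = |n|^2\, \delta n \cdot A - (n \cdot \delta n)(n \cdot A) , \tag{4.18} \]
for any vector $A$, since $|n|^2 = 1$, which implies that $n \cdot \delta n = 0$. Substituting the identity (4.17) for $\delta\nu$ and the relation (4.18) into Hamilton's principle implies the following replacements,
\[ \delta n \cdot \Big( \frac{\partial L}{\partial\dot{n}} \Big)^{\cdot} \;\Longrightarrow\; (n \times \delta n) \cdot \Big[ \Big( \frac{\partial L}{\partial\nu} \Big)^{\cdot} - 2\,\nu \times \frac{\partial L}{\partial\nu} \Big] , \tag{4.19} \]
\[ \delta n \cdot \frac{\partial L}{\partial n} \;\Longrightarrow\; (n \times \delta n) \cdot \Big( n \times \frac{\partial L}{\partial n} \Big) . \tag{4.20} \]
Varying the action $S$ in the fields $x$, $n$ and $\nu$ at fixed material position $X$ and time $t$ now gives
\[
\begin{aligned}
\delta S = -\int dt \int d^3X\; \bigg\{ & \delta x_p \Big[ \Big( \frac{\partial L}{\partial\dot{x}_p} \Big)^{\cdot} + J \frac{\partial}{\partial x_p}\frac{\partial L}{\partial J} - J \frac{\partial}{\partial x_m}\Big( J^{-1} \frac{\partial L}{\partial n_{,m}} \cdot n_{,p} \Big) \Big] \\
& + (n \times \delta n) \cdot \Big[ \Big( \frac{\partial L}{\partial\nu} \Big)^{\cdot} - 2\,\nu \times \frac{\partial L}{\partial\nu} \Big] \\
& - (n \times \delta n) \cdot n \times \Big[ \frac{\partial L}{\partial n} - J \frac{\partial}{\partial x_m}\Big( J^{-1} \frac{\partial L}{\partial n_{,m}} \Big) \Big] \bigg\} ,
\end{aligned}
\tag{4.21}
\]
with the same natural (homogeneous) boundary conditions as before. Consequently, the action principle $\delta S = 0$ yields the following partially reduced Lagrange–Poincaré equations for liquid crystals,
\[
\begin{aligned}
\delta x_p : &\quad \Big( \frac{\partial L}{\partial\dot{x}_p} \Big)^{\cdot} + J \frac{\partial}{\partial x_p}\frac{\partial L}{\partial J} - J \frac{\partial}{\partial x_m}\Big( J^{-1} \frac{\partial L}{\partial n_{,m}} \cdot n_{,p} \Big) = 0 , \\
n \times \delta n : &\quad \Big( \frac{\partial L}{\partial\nu} \Big)^{\cdot} - 2\,\nu \times \frac{\partial L}{\partial\nu} - n \times \Big[ \frac{\partial L}{\partial n} - J \frac{\partial}{\partial x_m}\Big( J^{-1} \frac{\partial L}{\partial n_{,m}} \Big) \Big] = 0 .
\end{aligned}
\tag{4.22}
\]
One may compare these with the more completely reduced Lagrange–Poincaré equations for liquid crystals in equations (2.52). The explicit dependence of the Lagrangian on the director $n$ introduces torques and stresses not seen for Lagrangians depending only on $\nu$ and $\gamma = n \times dn$.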

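Partially reduced Euler–Poincaré equations. We are dealing with Hamilton's principle for a Lagrangian in the class
\[ S = \int dt \int d^3x\; \ell(u, D, n, \nu, \nabla n) , \tag{4.23} \]
in terms of the Lagrangian density $\ell$ given by
\[ \ell(u, D, n, \nu, \nabla n)\, d^3x = L\big( \dot{x}\, g^{-1}(t) ,\, J g^{-1}(t) ,\, n\, g^{-1}(t) ,\, \nu\, g^{-1}(t) ,\, \nabla n\, g^{-1}(t) \big) \big( d^3X\, g^{-1}(t) \big) . \tag{4.24} \]
The variations of the Eulerian fluid quantities are computed from their definitions to be
\[ \delta u_j = \frac{\partial\eta_j}{\partial t} + u_k \frac{\partial\eta_j}{\partial x_k} - \eta_k \frac{\partial u_j}{\partial x_k} , \tag{4.25} \]
\[ \delta D = -\frac{\partial (D\eta_j)}{\partial x_j} , \tag{4.26} \]
\[ \delta\nu = \frac{\partial\Sigma}{\partial t} + u_m \frac{\partial\Sigma}{\partial x_m} - 2\,\nu \times \Sigma - \eta_m \frac{\partial\nu}{\partial x_m} , \tag{4.27} \]
\[ \delta n = -\,n \times \Sigma - \eta_k \frac{\partial n}{\partial x_k} , \tag{4.28} \]
where $\Sigma(x, t) \equiv (n \times \delta n)(X, t)\, g^{-1}(t)$ and $\eta \equiv \delta g\, g^{-1}(t)$. We compute the variation of the action (4.23) in Eulerian variables at fixed time $t$ and spatial position $x$ as, cf. equation (2.66),
\[
\begin{aligned}
\delta S &= \int dt \int d^3x\; \Big[ \frac{\delta\ell}{\delta u_j}\delta u_j + \frac{\delta\ell}{\delta D}\delta D + \frac{\delta\ell}{\delta\nu} \cdot \delta\nu + \frac{\delta\ell}{\delta n} \cdot \delta n \Big] \\
&= \int dt \int d^3x\; \bigg\{ \eta_j \Big[ -\frac{\partial}{\partial t}\frac{\delta\ell}{\delta u_j} - \frac{\delta\ell}{\delta u_k}\frac{\partial u_k}{\partial x_j} - \frac{\partial}{\partial x_k}\Big( \frac{\delta\ell}{\delta u_j} u_k \Big) + D \frac{\partial}{\partial x_j}\frac{\delta\ell}{\delta D} - \frac{\delta\ell}{\delta\nu} \cdot \frac{\partial\nu}{\partial x_j} - \frac{\delta\ell}{\delta n} \cdot \frac{\partial n}{\partial x_j} \Big] \\
&\qquad + \Sigma \cdot \Big[ -\frac{\partial}{\partial t}\frac{\delta\ell}{\delta\nu} - \frac{\partial}{\partial x_m}\Big( \frac{\delta\ell}{\delta\nu} u_m \Big) + 2\,\nu \times \frac{\delta\ell}{\delta\nu} + n \times \frac{\delta\ell}{\delta n} \Big] \\
&\qquad + \frac{\partial}{\partial t}\Big[ \frac{\delta\ell}{\delta u_j}\eta_j + \Sigma \cdot \frac{\delta\ell}{\delta\nu} \Big] + \frac{\partial}{\partial x_m}\Big[ \eta_j \Big( \frac{\delta\ell}{\delta u_j} u_m - D \frac{\delta\ell}{\delta D}\delta_{jm} \Big) + \Sigma \cdot \frac{\delta\ell}{\delta\nu} u_m \Big] \bigg\} ,
\end{aligned}
\tag{4.29}
\]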

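where we have substituted the variational expressions (4.25)–(4.28) and integrated by parts. Hence, we obtain the partially reduced Euler–Poincaré equations for liquid crystals,
\[ \eta_j : \quad \frac{\partial}{\partial t}\frac{\delta\ell}{\delta u_j} = -\frac{\delta\ell}{\delta u_k}\frac{\partial u_k}{\partial x_j} - \frac{\partial}{\partial x_k}\Big( \frac{\delta\ell}{\delta u_j} u_k \Big) + D \frac{\partial}{\partial x_j}\frac{\delta\ell}{\delta D} - \frac{\delta\ell}{\delta\nu} \cdot \frac{\partial\nu}{\partial x_j} - \frac{\delta\ell}{\delta n} \cdot \frac{\partial n}{\partial x_j} , \tag{4.30} \]
\[ \Sigma : \quad \frac{\partial}{\partial t}\frac{\delta\ell}{\delta\nu} = -\frac{\partial}{\partial x_m}\Big( \frac{\delta\ell}{\delta\nu} u_m \Big) + 2\,\nu \times \frac{\delta\ell}{\delta\nu} + n \times \frac{\delta\ell}{\delta n} . \tag{4.31} \]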


One may compare these with the more completely reduced Euler–Poincaré equations for liquid crystals in equations (2.67) and (2.68). Again one sees the torques and stresses generated by the explicit dependence of the Lagrangian on the director $n$.

Partially reduced Hamiltonian dynamics of liquid crystals. The Euler–Lagrange–Poincaré formulation of liquid crystal dynamics obtained so far allows passage to the corresponding Hamiltonian formulation via the following Legendre transformation of the partially reduced Lagrangian $\ell$ in the velocities $u$ and $\nu$, in the Eulerian fluid description,
\[ m_i = \frac{\delta\ell}{\delta u_i} , \qquad \sigma = \frac{\delta\ell}{\delta\nu} , \qquad h(m, D, \sigma, n) = m_i u_i + \sigma \cdot \nu - \ell(u, D, \nu, n) . \tag{4.32} \]

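Accordingly, one computes the derivatives of $h$ as
\[ \frac{\delta h}{\delta m_i} = u_i , \qquad \frac{\delta h}{\delta\sigma} = \nu , \qquad \frac{\delta h}{\delta D} = -\frac{\delta\ell}{\delta D} , \qquad \frac{\delta h}{\delta n} = -\frac{\delta\ell}{\delta n} . \tag{4.33} \]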
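To make the Legendre transformation (4.32) and the derivative relations (4.33) concrete, here is a worked example for a deliberately simple, hypothetical Lagrangian density. The constant $I$ (a director moment of inertia) and the potential $F(D, n)$ are assumptions introduced only for illustration; they are not the Lagrangian used in the text.

```latex
% Hypothetical Lagrangian density (illustration only):
\[
\ell(u, D, \nu, n) = \tfrac{1}{2} D |u|^2 + \tfrac{1}{2} I |\nu|^2 - F(D, n) .
\]
% Momenta from (4.32):
\[
m = \frac{\delta\ell}{\delta u} = D u , \qquad
\sigma = \frac{\delta\ell}{\delta\nu} = I \nu .
\]
% Hamiltonian from (4.32):
\[
h(m, D, \sigma, n) = m \cdot u + \sigma \cdot \nu - \ell
  = \frac{|m|^2}{2D} + \frac{|\sigma|^2}{2I} + F(D, n) .
\]
% Check of (4.33):
\[
\frac{\delta h}{\delta m} = \frac{m}{D} = u , \quad
\frac{\delta h}{\delta\sigma} = \frac{\sigma}{I} = \nu , \quad
\frac{\delta h}{\delta D} = -\tfrac{1}{2}|u|^2 + \frac{\partial F}{\partial D}
  = -\frac{\delta\ell}{\delta D} , \quad
\frac{\delta h}{\delta n} = \frac{\partial F}{\partial n}
  = -\frac{\delta\ell}{\delta n} .
\]
```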

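Consequently, the Euler–Poincaré equations (2.67)–(4.31) together with the auxiliary kinematic equations (2.61)–(2.62) for liquid crystal dynamics in the Eulerian description imply the following equations for the Legendre-transformed variables $(m, D, \sigma, n)$,
\[ \frac{\partial m_i}{\partial t} = -m_j \frac{\partial}{\partial x_i}\frac{\delta h}{\delta m_j} - \frac{\partial}{\partial x_j}\Big( m_i \frac{\delta h}{\delta m_j} \Big) - D \frac{\partial}{\partial x_i}\frac{\delta h}{\delta D} + \frac{\partial n}{\partial x_i} \cdot \frac{\delta h}{\delta n} - \sigma \cdot \frac{\partial}{\partial x_i}\frac{\delta h}{\delta\sigma} , \tag{4.34} \]
\[ \frac{\partial D}{\partial t} = -\frac{\partial}{\partial x_j}\Big( D \frac{\delta h}{\delta m_j} \Big) , \tag{4.35} \]
\[ \frac{\partial n}{\partial t} = -\frac{\partial n}{\partial x_j}\frac{\delta h}{\delta m_j} - n \times \frac{\delta h}{\delta\sigma} , \tag{4.36} \]
\[ \frac{\partial\sigma}{\partial t} = -\frac{\partial}{\partial x_j}\Big( \sigma \frac{\delta h}{\delta m_j} \Big) - n \times \frac{\delta h}{\delta n} - 2\,\sigma \times \frac{\delta h}{\delta\sigma} . \tag{4.37} \]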

These equations are Hamiltonian. Assembling the liquid crystal equations (4.34)–(4.37) into the Hamiltonian form (2.80) gives
\[
\frac{\partial}{\partial t}
\begin{bmatrix} m_i \\ D \\ n \\ \sigma \end{bmatrix}
= -
\begin{bmatrix}
m_j\partial_i + \partial_j m_i & D\partial_i & -n_{,i}\,\cdot & \sigma \cdot \partial_i \\
\partial_j D & 0 & 0 & 0 \\
n_{,j} & 0 & 0 & n \times \\
\partial_j \sigma & 0 & n \times & 2\sigma \times
\end{bmatrix}
\begin{bmatrix} \delta h/\delta m_j \\ \delta h/\delta D \\ \delta h/\delta n \\ \delta h/\delta\sigma \end{bmatrix} .
\tag{4.38}
\]
One may compare this with the more completely reduced Hamiltonian matrix form for liquid crystals in equations (2.82). The Jacobi identity for the Poisson bracket defined by this Hamiltonian matrix is guaranteed by associating it to the dual of a semidirect-product Lie algebra that, in this case, has no two-cocycles. The two-cocycles are generated by the further transformation to $\gamma = n \times dn$.

References

Balatskii, A. [1990], Hydrodynamics of an antiferromagnet with fermions, Phys. Rev. B 42, 8103–8109.
Beris, A. N. and B. J. Edwards [1994], Thermodynamics of Flowing Systems with internal microstructure, Oxford University Press.
Cendra, H., D. D. Holm, J. E. Marsden and T. S. Ratiu [1999], Lagrangian Reduction, the Euler–Poincaré Equations, and Semidirect Products. Arnol'd Festschrift Volume II, 186, Amer. Math. Soc. Transl. Ser. 2, pp. 1–25.
Cendra, H., J. E. Marsden, and T. Ratiu [2001], Lagrangian Reduction by Stages. Mem. Amer. Math. Soc. 152, no. 722, viii+108 pp.
Chandrasekhar, S. [1992], Liquid Crystals, Second Edition. Cambridge University Press, Cambridge.
Coquereaux, R. and A. Jadcyk [1994], Riemann Geometry Fiber Bundles Kaluza–Klein Theories and all that . . . ., World Scientific, Lecture Notes in Physics, vol. 16.
Cosserat, E. and F. Cosserat [1909], Théorie des corps deformable. Hermann, Paris.
de Gennes, P. G. and J. Prost [1993], The Physics of Liquid Crystals, Second Edition. Oxford University Press, Oxford.
Dunn, J. E. and J. Serrin [1985], On the thermodynamics of interstitial working, Arch. Rat. Mech. Anal. 88, 95–133.
Dzyaloshinskii, I. E. [1977], Magnetic structure of UO2, Commun. on Phys. 2, 69–71.
Dzyaloshinskii, I. E. and G. E. Volovick [1980], Poisson brackets in condensed matter physics, Ann. of Phys. 125, 67–97.
Ericksen, J. L. [1960], Anisotropic fluids, Arch. Rational Mech. Anal. 4, 231–237.
Ericksen, J. L. [1961], Conservation laws for liquid crystals, Trans. Soc. Rheol. 5, 23–34.
Eringen, A. C. [1997], A unified continuum theory of electrodynamics of liquid crystals, Internat. J. Engrg. Sci. 35, 1137–1157.
Flanders, H. [1989], Differential Forms with Applications to the Physical Sciences, Dover Publications: New York.
Fuller, F. B. [1978], Decomposition of linking number of a closed ribbon: problem from molecular-biology, Proc. Nat. Acad. Sci. USA 75, 3557–3561.
Gibbons, J., D. D. Holm and B. Kupershmidt [1982], Gauge-invariant Poisson brackets for chromohydrodynamics, Phys. Lett. A 90, 281–283.


Gibbons, J., D. D. Holm and B. Kupershmidt [1983], The Hamiltonian structure of classical chromohydrodynamics, Physica D 6, 179–194.
Goldstein, R. E., T. R. Powers and C. H. Wiggins [1998], Viscous nonlinear dynamics of twist and writhe, Phys. Rev. Lett. 80, 5232–5235.
Golo, V. L. and M. I. Monastyrskii [1977], Topology of gauge fields with several vacuums, JETP Lett. 25, 251–254. [Pis'ma Zh. Eksp. Teor. Fiz. 25, 272–276.]
Golo, V. L. and M. I. Monastyrskii [1978], Currents in superfluid $^3$He, Lett. Math. Phys. 2, 379–383.
Golo, V. L., M. I. Monastyrskii and S. P. Novikov [1979], Solutions of the Ginzburg–Landau equations for planar textures in superfluid $^3$He, Comm. Math. Phys. 69, 237–246.
Goriely, A. and M. Tabor [1997], Nonlinear dynamics of filaments. 1. Dynamical instabilities, Phys. D 105, 20–44.
Hall, H. E. [1985], Evidence for intrinsic angular momentum in superfluid $^3$He-A, Phys. Rev. Lett. 54, 205–208.
Hohenberg, P. C. and B. I. Halperin [1977], Theory of dynamical critical phenomena, Rev. Mod. Phys. 49, 435–479.
Holm, D. D. [1987], Hall magnetohydrodynamics: conservation laws and Lyapunov stability, Phys. Fluids 30, 1310–1322.
Holm, D. D. [2001], Introduction to HVBK dynamics, in Quantized Vortex Dynamics and Superfluid Turbulence. (C. F. Barenghi, R. J. Donnelly and W. F. Vinen, eds.) Lecture Notes in Physics, volume 571, Springer-Verlag, pp. 114–130.
Holm, D. D. and B. A. Kupershmidt [1982], Poisson structures of superfluids, Phys. Lett. A 91, 425–430.
Holm, D. D. and B. A. Kupershmidt [1983a], Poisson structures of superconductors, Phys. Lett. A 93, 177–181.
Holm, D. D. and B. A. Kupershmidt [1983b], Poisson brackets and Clebsch representations for magnetohydrodynamics, multifluid plasmas, and elasticity, Physica D 6, 347–363.
Holm, D. D. and B. A. Kupershmidt [1984], Yang–Mills magnetohydrodynamics: nonrelativistic theory, Phys. Rev. D 30, 2557–2560.
Holm, D. D. and B. A. Kupershmidt [1986], Hamiltonian structure and Lyapunov stability of a hyperbolic system of two-phase flow equations including surface tension, Phys. Fluids 29, 986–991.
Holm, D. D. and B. A. Kupershmidt [1987], Superfluid plasmas: Multivelocity nonlinear hydrodynamics of superfluid solutions with charged condensates coupled electromagnetically, Phys. Rev. A 36, 3947–3956.
Holm, D. D. and B. A. Kupershmidt [1988], The analogy between spin glasses and Yang–Mills fluids, J. Math. Phys. 29, 21–30.
Holm, D. D., J. E. Marsden, and T. S. Ratiu [1998], The Euler–Poincaré equations and semidirect products with applications to continuum theories, Adv. in Math. 137, 1–81.


Isaev, A. A., M. Yu. Kovalevskii, and S. V. Peletminskii [1995], Hamiltonian approach to continuum dynamic, Theoret. and Math. Phys. 102, 208–218. [Teoret. Math. Fiz. 102, 283–296.]
Isayev, A. A., M. Yu. Kovalevsky and S. V. Peletminsky [1997], Hydrodynamic theory of magnets with strong exchange interaction, Low Temp. Phys. 23, 522–533.
Isayev, A. A. and S. V. Peletminsky [1997], On Hamiltonian formulation of hydrodynamic equations for superfluid $^3$He-3, Low Temp. Phys. 23, 955–961.
Jackiw, R. and N. S. Manton [1980], Symmetries and conservation laws in gauge theories, Ann. Phys. 127, 257–273.
Kamien, R. D. [1998], Local writhing dynamics, Eur. Phys. J. B 1, 1–4.
Kats, E. I. and V. V. Lebedev [1994], Fluctuational Effects in the Dynamics of Liquid Crystals, Springer: New York.
Khalatnikov, I. M. and V. V. Lebedev [1978], Canonical equations of hydrodynamics of quantum liquids, J. Low Temp. Phys. 32, 789–801.
Khalatnikov, I. M. and V. V. Lebedev [1980], Equation of hydrodynamics of quantum liquid in the presence of continuously distributed singular solitons, Prog. Theo. Phys. Suppl. 69, 269–280.
Klapper, I. [1996], Biological applications of the dynamics of twisted elastic rods, J. Comp. Phys. 125, 325–337.
Kleinert, H. [1989], Gauge Fields in Condensed Matter, Vols. I, II, World Scientific.
Kléman, M. [1983], Points, Lines and Walls in Liquid Crystals, Magnetic Systems and Various Ordered Media, John Wiley and Sons.
Kléman, M. [1989], Defects in liquid crystals, Rep. on Prog. in Phys. 52, 555–654.
Kuratsuji, H. and H. Yabu [1998], Force on a vortex in ferromagnet model and the properties of vortex configurations, J. Phys. A 31, L61–L65.
Lammert, P. E., D. S. Rokhsar and J. Toner [1995], Topological and nematic ordering. I. A gauge theory, Phys. Rev. E 52, 1778–1800.
Leggett, A. J. [1975], A theoretical description of the new phases of $^3$He, Rev. Mod. Phys. 47, 331–414.
Leslie, F. M. [1966], Some constitutive equations for anisotropic fluids, Quart. J. Mech. Appl. Math. 19, 357–370.
Leslie, F. M. [1968], Some constitutive equations for liquid crystals, Arch. Rational Mech. Anal. 28, 265–283.
Leslie, F. M. [1979], Theory of flow phenomena in liquid crystals, in Advances in Liquid Crystals, vol. 4, (G. H. Brown, ed.) Academic, New York pp. 1–81.
Marsden, J. E. and T. S. Ratiu [1999], Introduction to Mechanics and Symmetry, Second Edition, Springer-Verlag, Texts in Applied Mathematics 17.
Marsden, J. E., T. S. Ratiu and J. Scheurle [2000], Reduction theory and the Lagrange–Routh equations, J. Math. Phys. 41, 3379–3429.
Marsden, J. E. and J. Scheurle [1995], The Lagrange–Poincaré equations, Fields Institute Commun. 1, 139–164.


Marsden, J. E. and A. Weinstein [1974], Reduction of symplectic manifolds with symmetry, Rep. Math. Phys. 5, 121–130.
Marsden, J. E. and A. Weinstein [1983], Coadjoint orbits, vortices, and Clebsch variables for incompressible fluids, Physica D 7, 305–323.
Mermin, N. D. [1979], The topological theory of defects in ordered media, Rev. Mod. Phys. 51, 591–648.
Mermin, N. D. and T.-L. Ho [1976], Circulation and angular momentum in the A phase of superfluid Helium-3, Phys. Rev. Lett. 36, 594–597.
Mineev, V. P. [1980], Topologically stable defects and solitons in ordered media, Soviet Science Reviews, Section A: Physics Reviews, vol. 2, (I. M. Khalatnikov, ed.) (Chur, London, New York: Harwood Academic Publishers) pp. 173–246.
Olver, P. J. [1993], Applications of Lie groups to differential equations, Second Edition, Springer-Verlag, New York.
Poincaré, H. [1901], Sur une forme nouvelle des équations de la mécanique, C. R. Acad. Sci. Paris 132, 369–371.
Schwinger, J. [1951], On gauge invariance and vacuum polarization, Phys. Rev. 82, 664–679.
Schwinger, J. [1959], Field theory commutators, Phys. Rev. Lett. 3, 296–297.
Serrin, J. [1959], in Mathematical Principles of Classical Fluid Mechanics, vol. VIII/1 of Encyclopedia of Physics, (S. Flügge, ed.), Springer-Verlag, Berlin, Sections 14–15, pp. 125–263.
Stern, A. [1999], Duality for coset models, Nuc. Phys. B 557, 459–479.
Trebin, H. R. [1982], The topology of non-uniform media in condensed matter physics, Adv. in Physics 31, 195–254.
Tsurumaru, T. and I. Tsutsui [1999], On topological terms in the O(3) nonlinear sigma model, Phys. Lett. B 460, 94–102.
Volovick, G. E. [1992], Exotic Properties of Superfluid $^3$He, World-Scientific, Singapore.
Volovick, G. E. and T. Vachaspati [1996], Aspects of $^3$He and the standard electroweak model, Internat. J. Mod. Phys. B 10, 471–521.
Volovik, G. E. and V. S. Dotsenko [1980], Hydrodynamics of defects in condensed media in the concrete cases of vortices in rotating Helium-II and of disclinations in planar magnetic substances, Sov. Phys. JETP, 58 65–80. [Zh. Eksp. Teor. Fiz. 78, 132–148.]
Weatherburn, C. E. [1974], Differential Geometry in Three Dimensions, vol. 1, Cambridge University Press.
Weinstein, A. [1996], Lagrangian mechanics and groupoids, Fields Inst. Commun. 7, 207–231.
Yabu, H. and H. Kuratsuji [1999], Nonlinear sigma model Lagrangian for superfluid $^3$He-A(B), J. Phys. A 32, 7367–7374.
Zakharov, V. E. and E. A. Kusnetsov [1997], Hamiltonian formalism for nonlinear waves, Usp. Fiz. Nauk 167, 1137–1167.
Zakharov, V. E., S. L. Musher and A. M. Rubenchik [1985], Hamiltonian approach to the description of nonlinear plasma phenomena, Phys. Rep. 129, 285–366.

5 The Lagrangian Averaged Euler (LAE-α) Equations with Free-Slip or Mixed Boundary Conditions

Steve Shkoller

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT I shall present a simple proof of well-posedness for the Lagrangian averaged Euler (LAE-α) or Euler-α equations with either free-slip or mixed-type boundary conditions, and in the process describe certain features of the Riemannian geometry of new subgroups of the volume-preserving diffeomorphism group, corresponding to free-slip or mixed-type boundary data, thus answering in the affirmative a conjecture of Holm, Marsden, and Ratiu [1998a].

Contents
1 Introduction
2 Volume-Preserving Diffeomorphism Subgroups
3 Statement of Main Results
4 Proof of Theorem
References

1 Introduction

The Lagrangian averaged Euler (LAE-α), or Euler-α, equations on a compact, oriented, $C^\infty$, $n$-dimensional Riemannian manifold $(M, g)$ with $C^\infty$ boundary $\partial M$ may be expressed as the following system of partial differential equations (PDE):
\[ \partial_t (1 - \alpha^2\Delta) u + [\operatorname{curl}(1 - \alpha^2\Delta) u] \times u = -\operatorname{grad} p , \qquad \operatorname{div} u = 0 , \qquad u(0) = u_0 , \tag{1.1} \]


where $u(t, x)$ is the average (spatial) velocity field of the fluid, $p(t, x)$ is the corresponding pressure function, and $\alpha > 0$ is the averaging lengthscale. The operator $\Delta$ acting on vector fields (or 1-forms by identification) is given by
\[ \Delta = -2 \operatorname{Def}^* \operatorname{Def} , \qquad \operatorname{Def} u = \tfrac{1}{2}(\nabla u + \nabla u^T) , \]
where $\nabla u^T$ denotes the transpose of the matrix $\nabla u$, and where $\operatorname{Def}^*$ is the $L^2$ formal adjoint of the (rate of) deformation operator $\operatorname{Def}$. When restricted to divergence-free vector fields (coexact 1-forms), $\Delta = -(d\delta + \delta d) + 2\operatorname{Ric}$, where $d$ denotes the exterior derivative, $\delta$ is its $L^2$ formal adjoint, and $\operatorname{Ric}$ is the Ricci curvature of the Riemannian metric $g$ on $M$. In the case that the fluid container is a bounded open set in $\mathbb{R}^n$ and $g = \mathrm{Id}$, the usual Euclidean metric, $\Delta$ is the usual Laplacian on functions, acting componentwise.

The LAE-α PDE (1.1) models the averaged motion of an ideal incompressible fluid, filtering over spatial scales smaller than some a priori fixed spatial scale $\alpha > 0$ (see Holm, Marsden, and Ratiu [1998a,b], Shkoller [1998], and Marsden and Shkoller [2001a,b] for derivations of this model). As such, solutions of LAE-α faithfully reproduce the large-scale motion of the Euler equations for scales larger than $\alpha$, while coarse-graining or smearing out the small unresolvable scales smaller than $\alpha$. Its viscous counterpart, the so-called Lagrangian averaged Navier–Stokes (LANS-α) equations, has been used rather successfully in modeling both decaying and forced isotropic turbulence (see Chen et al. [1998, 1999a,b], Chen et al. [1999c], Mohseni et al. [2000], Nadiga and Shkoller [2001]).

It has been a longstanding problem of turbulence modeling to produce a homogenized model of the Navier–Stokes equations; the Reynolds averaged Navier–Stokes (RANS) equations and Large-Eddy Simulation (LES) models are two commonly used approaches to producing such a model of the large-scale motion of a viscous fluid. Both those approaches require a certain "guess" for the turbulence closure, i.e., the dynamics of the small computationally unresolvable spatial scales, and both methods rely on artificial viscosity to remove the small scale structures. The LANS-α equations, on the other hand, rely on the Lagrangian averaging procedure detailed in Marsden and Shkoller [2001a,b], which yields the system of PDE (1.1); this approach is geometric in nature, in that an ensemble average over geodesics of the $L^2$ right-invariant metric on the group of volume-preserving diffeomorphisms is made. Solutions of the resulting LAE-α equations on manifolds with boundary are geodesics of a new right-invariant $H^1$ metric on


new subgroups of the volume-preserving diffeomorphism group (see Marsden, Ratiu, and Shkoller [2000] and Shkoller [2000]). The Lagrangian averaging procedure may be viewed as an extension (to turbulence modeling) of the program initiated by Arnold [1966] and Ebin and Marsden [1970] in studying ideal fluids (see also Arnold and Khesin [1998]). Instead of using artificial dissipation, the Lagrangian averaging method uses a type of nonlinear dispersion to filter small-scale structures; consequently, intermittency in turbulence simulations is not suppressed, a known shortcoming of the RANS and LES approaches.

The existence of these $H^1$ right-invariant geodesics on the subgroup of the volume-preserving diffeomorphism group whose elements restrict to the identity on the boundary $\partial M$ was proven in Shkoller [2000], and consequently, the well-posedness of the LAE-α equations (1.1) with the no-slip boundary conditions $u = 0$ on $\partial M$ was established in the Sobolev space $\{ u \in H^s(TM) \cap H^1_0(TM) \mid \operatorname{div} u = 0 \}$ for $s > \frac{n}{2} + 1$. The method of proof was founded upon a reformulation of (1.1) given by
\[ \partial_t u + \nabla_u u + U^\alpha(u) + R^\alpha(u) = -(1 - \alpha^2\Delta)^{-1} \operatorname{grad} p , \tag{1.2a} \]
\[ \operatorname{div} u(t, x) = 0 , \tag{1.2b} \]
\[ u = 0 \ \text{on}\ \partial M , \tag{1.2c} \]
\[ u(0, x) = u_0(x) , \tag{1.2d} \]

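where
\[ U^\alpha(u) = \alpha^2 (1 - \alpha^2\Delta)^{-1} \operatorname{Div}\big[ \nabla u \cdot \nabla u^T + \nabla u \cdot \nabla u - \nabla u^T \cdot \nabla u \big] , \]
and
\[ R^\alpha(u) = \alpha^2 (1 - \alpha^2\Delta)^{-1} \Big\{ \operatorname{Tr}\big[ \nabla( R(u, \cdot)u ) + R(u, \cdot)\nabla u + R(\nabla u, \cdot)u \big] - (\nabla_u \operatorname{Ric}) \cdot u + \nabla u^T \cdot \operatorname{Ric}(u) \Big\} , \]
where $R$ is the curvature operator associated to the metric $g$ on $M$. This reformulation is possible when $u = 0$ on $\partial M$ because in this case $\nabla_u u = 0$ on $\partial M$ as well, which means that when $u$ is sufficiently regular, $\nabla_u u$ is in the domain of $(1 - \alpha^2\Delta)$. As such, it is possible to compute the commutator $[\nabla_u , (1 - \alpha^2\Delta)]u$ and thus obtain (1.2) for the no-slip case.

This note is devoted to the study of the free-slip boundary conditions given by
\[ g(u, n) = 0 , \quad [\operatorname{Def} u \cdot n]^{\tan} = 0 \ \text{on}\ \partial M , \tag{1.4} \]
and the mixed-type boundary conditions
\[ u = 0 \ \text{on}\ \Gamma_1 \quad\text{and}\quad g(u, n) = 0 , \quad [\operatorname{Def} u \cdot n]^{\tan} = 0 \ \text{on}\ \Gamma_2 , \tag{1.5} \]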


where $n$ denotes the outward unit normal vector field on $\partial M$. In the case of mixed-type boundary conditions, it is supposed that $M$, $\partial M$ are $C^\infty$, that $\Gamma_1$ and $\Gamma_2$ are two disjoint subsets of $\partial M$ such that if $m_0 \in \Gamma_i$ ($i = 1, 2$), a local chart $U$ (in $M$) about $m_0$ can be chosen so that $U \cap \partial M \subset \Gamma_i$; furthermore, it is assumed that $\Gamma_1 = \partial M/\Gamma_2$ and that $\partial M = \Gamma_1 \cup \Gamma_2$. Free-slip and mixed boundary conditions appear in many physical situations, wherein a stress-free condition must be imposed on (a portion of) the boundary of the fluid container. For both the cases (1.4) and (1.5), it is not clear what boundary data $\nabla_u u$ must satisfy, and thus it is not known if the advection term $\nabla_u u$ is in the domain of $(1 - \alpha^2\Delta)$; consequently, the question of whether or not the representation (1.2) of the LAE-α equations holds in this setting remains open, and the method of proof of well-posedness for the no-slip boundary condition case cannot be used for the free-slip and mixed-type cases. For this reason, I shall present a different proof of well-posedness for these two cases, founded upon the Lie–Poisson formulation of the LAE-α equations, which proves the conjecture (Conjecture 8.1) made by Holm, Marsden, and Ratiu [1998a].

2 Volume-Preserving Diffeomorphism Subgroups

Let the flow of the time-dependent vector field $u(t, x)$ be denoted by $\eta(t, x)$ so that
\[ \partial_t \eta(t, x) = u(t, \eta(t, x)) , \qquad \eta(0, x) = x \tag{2.1} \]
for all $x$ in $\Omega$. For each $t$, denote the map $\eta(t, \cdot)$ by $\eta_t$ so that $\eta_0 = e$, the identity map. Thus, the map $x \mapsto \eta_t(x)$ gives the particle placement field for the fluid, i.e., $\eta_t(x)$ is the position at time $t$ of the fluid particle which had position $x$ at time $t = 0$. Since $u$ is divergence-free, each map $\eta_t$ is volume preserving and is a diffeomorphism. Vector fields $u$ of Sobolev class $H^s$ for $s > \frac{n}{2} + 1$ shall be considered and, correspondingly, $\eta_t \in D^s_\mu$, the group of $H^s$ diffeomorphisms of $(M, g)$, defined by
\[ D^s = \{ \eta \in H^s(M, \tilde{M}) \mid \eta \text{ is bijective},\ \eta^{-1} \in H^s(M, \tilde{M}),\ \eta \text{ leaves } \partial M \text{ invariant} \} , \]
where $M$ is embedded into its double $\tilde{M}$, a compact manifold without boundary of the same dimension, extending the metric $g$ to $\tilde{M}$ in such a way as to ensure that $\partial M$ is a totally geodesic submanifold. Also, let
\[ D^s_\mu = \{ \eta \in D^s \mid \det T\eta = 1 \} \]


denote the submanifold of $D^s$ consisting of volume-preserving diffeomorphisms. Ebin and Marsden [1970] proved that for $s > \frac{n}{2} + 1$, $D^s$ and $D^s_\mu$ are $C^\infty$ differentiable manifolds, and smooth topological groups. It is well known (see Ebin and Marsden [1970]) that the tangent space of $D^s_\mu$ at the identity is the vector space
\[ T_e D^s_\mu = \{ u \in H^s(TM) : \operatorname{div} u = 0 ,\ g(u, n) = 0 \text{ on } \partial M \} , \]
and for $\eta \in D^s_\mu$, $T_\eta D^s_\mu$ is defined by right-translation, so that $T_\eta D^s_\mu = T_e D^s_\mu \circ \eta$. For $\eta \in D^s_\mu$, $s > \frac{n}{2} + 1$, let $T\eta$ denote the tangent map of $\eta$. In local coordinates $x^i$ and for $x$ in $\Omega$, $T\eta(x) = \partial\eta^i/\partial x^j(x)$ is simply the matrix of partial derivatives of $\eta$. It will be necessary to define the subgroups of $D^s_\mu$ which correspond to the boundary conditions (1.2c), (1.4), and (1.5). In what follows, $N$ denotes the normal bundle over $\partial M$.

2.1 Lemma (Theorem 1 of Shkoller [2000]). Set $s > \frac{n}{2} + 1$, and let $n$ denote the outward-pointing normal field along the boundary $\partial M$. Define the sets
\[ D^s_{\mu,N} = \{ \eta \in D^s_\mu : T\eta|_{\partial\Omega} \cdot n \in H^{s-3/2}_\eta(N) , \text{ for all } n \in H^{s-1/2}(N) \} , \]
\[ D^s_{\mu,D} = \{ \eta \in D^s_\mu \mid \eta|_{\partial\Omega} = e \} , \]
and
\[ D^s_{\mu,mix} = \{ \eta \in D^s_\mu \mid \eta \text{ leaves } \Gamma_i \text{ invariant},\ \eta|_{\Gamma_1} = e ,\ T\eta|_{\Gamma_2} \cdot n \in H^{s-3/2}(N|_{\Gamma_2}) , \text{ for all } n \in H^{s-1/2}(N|_{\Gamma_2}) \} . \]
Then $D^s_{\mu,D}$, $D^s_{\mu,N}$, and $D^s_{\mu,mix}$ are $C^\infty$ subgroups of $D^s_\mu$, and the tangent space at the identity of these groups is given by
\[ T_e D^s_{\mu,N} = \{ u \in T_e D^s_\mu : [\operatorname{Def} u \cdot n]^{\tan} = 0 \in H^{s-3/2}(T\partial M) \text{ for all } n \in H^{s-1/2}(N) \} , \]
\[ T_e D^s_{\mu,D} = \{ u \in T_e D^s_\mu : u|_{\partial M} = 0 \} , \]
and
\[ T_e D^s_{\mu,mix} = \{ u \in T_e D^s_\mu : [\operatorname{Def} u \cdot n]^{\tan} = 0 \in H^{s-3/2}(T\Gamma_2) \text{ for all } n \in H^{s-1/2}(N|_{\Gamma_2}) \text{ and } u|_{\Gamma_1} = 0 \} . \]
Next, form the corresponding sets $D^s_N$, $D^s_D$, and $D^s_{mix}$ which do not have the volume-preserving constraint imposed. These sets are $C^\infty$ subgroups of the full diffeomorphism group $D^s$, and have the analogous tangent spaces at the identity without the divergence-free constraint.


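For the remainder of the paper, let
\[ G^s = D^s_N \ \text{or}\ D^s_{mix} \qquad\text{and}\qquad G^s_\mu = D^s_{\mu,N} \ \text{or}\ D^s_{\mu,mix} . \]
For $s > \frac{n}{2} + 1$, let $\langle \cdot , \cdot \rangle$ denote the $H^1$-equivalent right-invariant metric on $G^s_\mu$ given at the identity by
\[ \langle X, Y \rangle_e = (X, Y)_{L^2} + 2\alpha^2 (\operatorname{Def}(X), \operatorname{Def}(Y))_{L^2} , \qquad \forall X, Y \in T_e G^s_\mu . \tag{2.2} \]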

By Proposition 2 of Shkoller [2000], the weak Riemannian metric $\langle\cdot,\cdot\rangle$ on $G^s_\mu$ is well-defined and smooth. For $r \ge 1$, let $V^r$ denote the $H^r$ vector fields on $M$ which satisfy the boundary conditions prescribed to elements of $T_e G^s$, and set $V^r_\mu = \{u \in V^r \mid \operatorname{div} u = 0\}$. The Stokes projector is defined next.

2.2 Lemma (Proposition 1 of Shkoller [2000]). The following decomposition

$$V^r = V^r_\mu \oplus (1 - \alpha^2\Delta)^{-1} \operatorname{grad} H^{r-1}(M) \tag{2.3}$$

is well-defined for $r \ge 1$. Thus, if $F \in V^r$, then there exists $(v, p) \in V^r_\mu \times H^{r-1}(M)/\mathbb{R}$ such that $F = v + (1 - \alpha^2\Delta)^{-1}\operatorname{grad} p$ and the pair $(v, p)$ are solutions of the Stokes problem

$$(1 - \alpha^2\Delta)v + \operatorname{grad} p = (1 - \alpha^2\Delta)F, \qquad \operatorname{div} v = 0,$$

where $v$ satisfies the boundary conditions prescribed to elements of $V^r$. The summands in (2.3) are $\langle\cdot,\cdot\rangle_e$-orthogonal. Now, define the Stokes projector

$$P^\alpha : V^r \to V^r_\mu, \qquad P^\alpha(F) = F - (1 - \alpha^2\Delta)^{-1}\operatorname{grad} p. \tag{2.4}$$

Then, for $s > n/2 + 1$, $P : TG^s \to TG^s_\mu$, given on each fiber by

$$P_\eta : T_\eta G^s \to T_\eta G^s_\mu, \qquad P_\eta(X_\eta) = P^\alpha(X_\eta \circ \eta^{-1}) \circ \eta, \tag{2.5}$$

is a $C^\infty$ bundle map covering the identity.
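As a hedged illustration of the Stokes problem in Lemma 2.2, the following sketch works on a flat periodic box, which is an assumption made only for this example (the text treats bounded domains with boundary conditions). On such a box each Fourier mode $e^{ik\cdot x}$ can be treated separately: $(1 - \alpha^2\Delta)$ multiplies the coefficient by $1 + \alpha^2|k|^2$ and grad multiplies it by $ik$. The function name `stokes_project_mode` is hypothetical.

```python
def stokes_project_mode(F_hat, k, alpha):
    """Split one 2D Fourier coefficient F_hat at wavevector k into
    (v_hat, p_hat) with k . v_hat = 0 and
        F_hat = v_hat + (1 + alpha^2 |k|^2)^(-1) * (i k) * p_hat.
    Assumes a periodic box, not the bounded domains of the text."""
    k2 = k[0] ** 2 + k[1] ** 2
    if k2 == 0:
        # Constant mode: already divergence-free; p is fixed only up to a constant.
        return list(F_hat), 0j
    helm = 1.0 + alpha ** 2 * k2  # Fourier symbol of (1 - alpha^2 Laplacian)
    k_dot_F = k[0] * F_hat[0] + k[1] * F_hat[1]
    # Divergence of the Stokes problem: -|k|^2 p_hat = i * helm * (k . F_hat)
    p_hat = -1j * helm * k_dot_F / k2
    # (1 - alpha^2 Laplacian)^(-1) grad p for this mode
    grad_term = [1j * k[j] * p_hat / helm for j in range(2)]
    v_hat = [F_hat[j] - grad_term[j] for j in range(2)]
    return v_hat, p_hat


# Usage with arbitrary (assumed) sample data
F_hat = [1.0 + 2.0j, -0.5 + 0.25j]
k = (3.0, -2.0)
v_hat, p_hat = stokes_project_mode(F_hat, k, alpha=0.1)
print(abs(k[0] * v_hat[0] + k[1] * v_hat[1]))  # divergence of the projected mode
```

In this periodic setting the factor $1 + \alpha^2|k|^2$ cancels in the velocity, so the sketch reduces to the usual divergence-free projection mode by mode; that simplification should not be expected to carry over to the boundary-value problem treated in the text.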

3 Statement of Main Results

3.1 Theorem. For $s > n/2 + 2$ and $u_0 \in G^s_\mu$, there exists an interval $I = (-T, T)$, depending on $|u_0|_{H^s}$, and a unique geodesic curve $\eta(t, \cdot)$ of the weak Riemannian metric $\langle\cdot,\cdot\rangle$ defined in (2.2) with initial data $\eta(0, \cdot) = e$ and $\partial_t\eta(0, \cdot) = u_0$ such that $\eta$ is in $C^\infty(I, G^s_\mu)$ and has $C^\infty$ dependence on the initial velocity $u_0$. The smooth geodesic $\eta$ is the Lagrangian flow of the divergence-free time-dependent vector field $u(t, x)$ given by

$$\partial_t\eta(t, x) = u(t, \eta(t, x)), \qquad \eta(0) = e,$$

and $u \in C^0(I, T_e G^s_\mu) \cap C^1(I, T_e G^{s-1}_\mu)$ uniquely solves (1.1) with either the free-slip boundary conditions (1.4) or the mixed boundary conditions (1.5), and depends continuously on $u_0$.

The proof of Theorem 3.1 shall be given in Section 4. Some interesting geometric corollaries can now be established.

3.2 Corollary. For $n = \dim(M) = 2$ and $s > 3$, $T = \infty$, so $C^\infty$ geodesics of $\langle\cdot,\cdot\rangle$ on $G^s_\mu$ exist for all time. Thus, the LAE-α equations (1.1) with either free-slip (1.4) or mixed-type (1.5) boundary conditions are globally well-posed.

Proof. Because of Theorem 3.1, it suffices to establish an a priori $H^s$ estimate for $u$ (and hence $\eta$). This will follow from two key conservation laws that are easy to verify by direct computation on the potential vorticity $q(t, x) = *d(1 - \alpha^2\Delta)u$:

$$\frac{d}{dt}\int_M q^m = 0, \quad m \in \mathbb{N}, \tag{3.1}$$

and

$$\frac{d}{dt}\, q(t, \eta(t, \cdot)) = 0. \tag{3.2}$$

From (3.1) with $m = 2$, it follows that $\|q(t, \cdot)\|_{L^2}$ is conserved, and by elliptic regularity

$$\|u(t, \cdot)\|_{H^3} < C \tag{3.3}$$

for some constant $C$. Let us take $s \ge 5$. Then, from (3.2) and Lemma 2.4 of Chapter 17 of Taylor [1996],

$$\|\nabla(1 - \alpha^2\Delta)u\|_{L^\infty} \le C\left(1 + \log\|(1 - \alpha^2\Delta)u\|_{H^{s-2}}\right). \tag{3.4}$$

Equation (1.1) is reexpressed as

$$\partial_t u + P^\alpha(1 - \alpha^2\Delta)^{-1}\left[\nabla_u(1 - \alpha^2\Delta)u - \alpha^2\,\nabla u^T \cdot \Delta u\right] = 0, \tag{3.5}$$


where $P^\alpha$ is the Stokes projector defined in Lemma 2.2. Taking $s$-derivatives of (3.5) and taking the $L^2$ inner-product of this with $D^s u$ gives

$$\frac{d}{dt}\|u\|^2_{H^s} \le C\left\langle D^{s-2}\nabla_u(1 - \alpha^2\Delta)u + D^{s-2}(\nabla u^T \cdot \Delta u),\ D^s u\right\rangle_{L^2}, \tag{3.6}$$

using that $P^\alpha : G^s \to G^s_\mu$ is a bounded projector. Because $\operatorname{div} u = 0$, the first inner-product on the right-hand-side of (3.6) is actually

$$\tfrac{1}{2}\left\langle [D^{s-2}, \nabla_u](1 - \alpha^2\Delta)u,\ D^s u\right\rangle_{L^2};$$

for this, the well-known commutator estimate (see, e.g., Taylor [1996]) gives

$$\|[D^k, \nabla_u]v\|_{L^2} \le C\left(\|u\|_{C^1}\|v\|_{H^k} + \|v\|_{C^1}\|u\|_{H^k}\right), \tag{3.7}$$

which holds for integers $k > n/2 + 1$. For $k = s - 2$,

$$\|[D^{s-2}, \nabla_u](1 - \alpha^2\Delta)u\|_{L^2} \le C\left(\|\nabla u\|_{L^\infty}\|u\|_{H^s} + \|\nabla(1 - \alpha^2\Delta)u\|_{L^\infty}\|u\|_{H^{s-2}}\right) \le C\|u\|_{H^s},$$

where the constant depends on the bounds given by (3.2) and (3.3). For the second term in the inner-product on the right-hand-side of (3.6), set $w = D^3 u$; by the Cauchy–Schwarz inequality,

$$\|\nabla u^T \cdot \Delta u\|_{H^{s-2}} \le C\sum_{m=0}^{s-2}\|w\|_{W^{m-2,4}}\,\|w\|_{W^{s-m-3,4}}.$$

Using the standard interpolation inequalities

$$|v|_{L^4} \le C|Dv|^{1/2}_{L^2}\,|v|^{1/2}_{L^2}, \qquad |D^i v|_{L^2} \le C|v|^{1-i/m}_{L^2}\,|D^m v|^{i/m}_{L^2},$$

and interpolating down to $\|w\|_{L^2}$ and up to $\|u\|_{H^s}$ gives

$$\|\nabla u^T \cdot \Delta u\|_{H^{s-2}} \le C\|u\|_{H^s}^{(2s-8)/(2s-6)},$$

where $C$ depends on $\|u\|_{H^3}$. This is sufficient to establish an a priori $H^s$ bound for $u$ for $s \ge 5$. The bound for $s = 4$ follows by interpolation. The fact that $\eta$ remains on $G^s_\mu$ follows identically the argument in Theorem 1.1 of Shkoller [2001].

There are some interesting geometric corollaries which Theorem 3.1 provides. Define the Riemannian exponential map $\operatorname{Exp}_e : T_e G^s_\mu \to G^s_\mu$ of the right invariant metric $\langle\cdot,\cdot\rangle$ by $\operatorname{Exp}_e(tu) = \eta(t, \cdot)$, where $t > 0$ is sufficiently small, and $\eta(t, \cdot)$ is the geodesic curve on $G^s_\mu$ emanating from $e$ with initial velocity $u$. Because the above theorem guarantees that geodesics of $\langle\cdot,\cdot\rangle$ have $C^\infty$ dependence on initial data, $\operatorname{Exp}_e$ is well defined, satisfies $\operatorname{Exp}_e(0) = e$, and so by the inverse function theorem, the following corollary is proven.


3.3 Corollary. For $s > n/2 + 2$, the Riemannian exponential map $\operatorname{Exp}_e : T_e G^s_\mu \to G^s_\mu$ is a local diffeomorphism, and two elements $\eta_1$ and $\eta_2$ of $G^s_\mu$ that are in a sufficiently small neighborhood of $e$ can be connected by a unique geodesic of $\langle\cdot,\cdot\rangle$ in $G^s_\mu$.

Note that while the group exponential map is only $C^0$ and does not cover a neighborhood of the identity, the Riemannian exponential map on $G^s_\mu$ is smooth by Theorem 3.1, so that in conjunction with the fact that the right multiplication map is $C^\infty$, the topological group $G^s_\mu$ looks very much like a Lie group. As a consequence of the smoothness of $\operatorname{Exp}_e$ and the proof of Theorem 12.1 in Ebin and Marsden [1970], geodesics of $\langle\cdot,\cdot\rangle$, which are the solutions of (1.1), instantly inherit the regularity of the initial data. Thus,

3.4 Corollary. For $s > n/2 + 2$, let $\eta(t, \cdot)$ be a geodesic of the right invariant metric $\langle\cdot,\cdot\rangle$ on $G^s_\mu$, i.e. $\partial_t\eta(t, x) = u(t, \eta(t, x))$ and $u(t, x)$ is the unique solution of (1.1). If $\eta(0, \cdot) \in G^{s+k}_\mu$ and $\dot\eta(0, \cdot) \in T_{\eta(0,\cdot)} G^{s+k}_\mu$ for $0 \le k \le \infty$, then $\eta(t, \cdot)$ is $H^{s+k}$ for all $t \in I$.

4 Proof of Theorem

The proof will be based on the Lie-Poisson formulation of the LAE-α equations. Let $\tilde u$ denote the one-form corresponding to the vector field $u$. Then equation (1.1) may be written as

$$\partial_t(1 - \alpha^2\Delta)\tilde u + L_u(1 - \alpha^2\Delta)\tilde u = -dp, \tag{4.1}$$

where $L_u$ denotes the Lie derivative in the direction $u$. Pulling back equation (4.1) by the flow $\eta_t$ and noting that for any time-dependent one-form $w_t$, $(d/dt)\eta_t^* w_t = \eta_t^*(\partial_t w_t + L_u w_t)$, it is clear that (4.1) is equivalent to

$$\frac{d}{dt}\,\eta_t^*(1 - \alpha^2\Delta)\tilde u = -d\,\eta_t^* p,$$

using the fact that the pull-back commutes with the exterior derivative. Letting the Hodge projector $P$ act on both sides of the above equation gives

$$P\,\eta_t^*(1 - \alpha^2\Delta)\tilde u_t = P(1 - \alpha^2\Delta)\tilde u_0,$$

and by definition of the Hodge projection, this is the same as

$$\eta_t^*(1 - \alpha^2\Delta)\tilde u_t = (1 - \alpha^2\Delta)\tilde u_0 + df,$$

for some time-dependent function $f$. Therefore,

$$\tilde u_t = (1 - \alpha^2\Delta)^{-1}(\eta_t^{-1})^*(1 - \alpha^2\Delta)\tilde u_0 + (1 - \alpha^2\Delta)^{-1}\,d\,(\eta_t^{-1})^* f,$$

which by definition of the Stokes projector $P^\alpha$ given in Lemma 2.2, is equivalent to

$$\tilde u_t = P^\alpha(1 - \alpha^2\Delta)^{-1}(\eta_t^{-1})^*(1 - \alpha^2\Delta)\tilde u_0. \tag{4.2}$$

Equation (4.2) is the Lie-Poisson formula for the LAE-α equations, with the right-hand-side representing the coadjoint action of the volume-preserving diffeomorphism group on coexact one-forms satisfying either the free-slip or mixed-type boundary conditions (of course, if we allow $G^s_\mu$ to be the subgroup $D^s_{\mu,D}$, then the same formula also covers the no-slip case as well). Using (2.1) together with (4.2) gives

$$\partial_t\eta_t = \left[P^\alpha(1 - \alpha^2\Delta)^{-1}(\eta_t^{-1})^*\tilde v_0\right] \circ \eta_t, \tag{4.3}$$

where $\tilde v_0 = (1 - \alpha^2\Delta)\tilde u_0$. Lemma 2.2 states that $P_\eta : T_\eta G^s \to T_\eta G^s_\mu$ is $C^\infty$; next, define $(1 - \alpha^2\Delta)^{-1}_\eta$ in a similar fashion by

$$(1 - \alpha^2\Delta)^{-1}_\eta(X_\eta) = \left[(1 - \alpha^2\Delta)^{-1}(X_\eta \circ \eta^{-1})\right] \circ \eta.$$

Proposition 5 of Shkoller [2000] proves that for $s > n/2 + 1$,

$$(1 - \alpha^2\Delta)^{-1} : H^{s-2}_\eta \downarrow G^s \to TG^s \text{ is } C^\infty,$$

where $H^{s-2}_\eta \downarrow G^s$ denotes the vector bundle with base $G^s$ and fibers isomorphic to $H^{s-2}$ vector fields on $M$. It follows that (4.3) may be written as

$$\partial_t\eta_t = F(\eta_t) := P_\eta(1 - \alpha^2\Delta)^{-1}_\eta\,\big((\eta_t^{-1})^*\tilde v_0\big) \circ \eta_t. \tag{4.4}$$

Since

$$\big((\eta_t^{-1})^*\tilde v_0\big) \circ \eta_t\,(x) = \tilde v_0(x)\,[T\eta(x)]^{-1},$$

$$F : G^s_\mu \to TG^s_\mu, \tag{4.5}$$

since $\tilde v_0[T\eta]^{-1}$ is in the multiplicative algebra $H^{s-2}$, whenever $u_0$ and $\eta_t$ are of class $H^s$. Finally,

$$F \text{ is a } C^\infty \text{ vector field on } G^s_\mu, \tag{4.6}$$

since $P$ and $(1 - \alpha^2\Delta)^{-1}$ are smooth, and the map $\eta \mapsto T\eta$ is also smooth (see Lemma 4 of Shkoller [2000]). Thus, the existence of a unique smooth geodesic curve $\eta_t$ on $I$ is proved by the fundamental theorem of ordinary differential equations on Hilbert manifolds (see, for example, Theorem 2.6 of Lang [1995]), since Lemma 2.1 established that $G^s_\mu$ is a $C^\infty$ Hilbert manifold. The smooth dependence of $\eta_t$ on $u_0$ follows from the smooth dependence of solutions on parameters. Since $u(t, \cdot) = \partial_t\eta_t \circ \eta_t^{-1}$, the existence of $u \in C^0(I, T_e G^s_\mu) \cap C^1(I, T_e G^{s-1}_\mu)$ follows from the fact that the map $(\eta \mapsto \eta^{-1}) : D^s_\mu \to D^s_\mu$ is only $C^0$ while $(\eta \mapsto \eta^{-1}) : D^s_\mu \to D^{s-1}_\mu$ is $C^1$ (see Palais [1968]).

Acknowledgments: Research was partially supported by the NSF-KDI grant ATM-98-73133 and the NSF grant DMS-0105004, a Los Alamos National Laboratory IGPP minigrant, and an Alfred P. Sloan Foundation Research Fellowship.

References

Arnold, V. I. [1966], Sur la geometrie differentielle des groupes de Lie de dimension infinie et ses applications a l'hydrodynamique des fluids parfaits, Ann. Inst. Grenoble, 16, 319–361.

Arnold, V. I. and B. Khesin [1998], Topological Methods in Hydrodynamics, Springer Verlag, New York.

Chen, S. C., C. Foias, D. D. Holm, E. Olson, E. S. Titi, and S. Wynne [1998], Camassa–Holm equations as a closure model for turbulent channel and pipe flow, Phys. Rev. Lett., 81, 5338–5341.

Chen, S. C., C. Foias, D. D. Holm, E. Olson, E. S. Titi, and S. Wynne [1999], A connection between the Camassa–Holm equations and turbulent flows in channels and pipes, The International Conference on Turbulence (Los Alamos, NM, 1998). Phys. Fluids, 11, 2343–2353.

Chen, S. C., C. Foias, D. D. Holm, E. Olson, E. S. Titi, and S. Wynne [1999], The Camassa–Holm equations and turbulence, Predictability: quantifying uncertainty in models of complex phenomena (Los Alamos, NM, 1998). Phys. D, 133, 49–65.

Chen, S. C., D. D. Holm, L. G. Margolin, and R. Zhang [1999], Direct numerical simulations of the Navier–Stokes alpha model, Predictability: quantifying uncertainty in models of complex phenomena (Los Alamos, NM, 1998). Phys. D, 133, 66–83.

Ebin, D. and J. Marsden [1970], Groups of diffeomorphisms and the motion of an incompressible fluid, Ann. of Math., 92, 102–163.

Holm, D. D., J. E. Marsden, and T. S. Ratiu [1998], Euler–Poincaré equations and semidirect products with applications to continuum theories, Adv. in Math., 137, 1–81.

Holm, D. D., J. E. Marsden, and T. S. Ratiu [1998], Euler–Poincaré models of ideal fluids with nonlinear dispersion, Phys. Rev. Lett., 80, 4273–4277.

Lang, S. [1995], Differentiable and Riemannian manifolds, Springer-Verlag, New York.

Marsden, J. E., T. S. Ratiu, and S. Shkoller [2000], A nonlinear analysis of the averaged Euler equations and a new diffeomorphism group, Geom. Funct. Anal., 10, 582–599.

Marsden, J. E. and S. Shkoller [2001], The anisotropic Lagrangian averaged Euler and Navier–Stokes equations, Arch. Rational Mech. Anal., (to appear). E-print, http://xyz.lanl.gov/abs/math.AP/0005033/.

Marsden, J. E. and S. Shkoller [2001], Global Well-posedness for the Lagrangian Averaged Navier–Stokes (LANS-α) equations on bounded domains, Proc. Roy. Soc. London, 359, 1449–1468.

Mohseni, K., S. Shkoller, B. Kosović, J. E. Marsden, D. Carati, A. Wray and R. Rogallo [2000], Numerical simulations of homogeneous turbulence using the Lagrangian averaged Navier–Stokes equations, Center for Turbulence Research, Proceedings of the Summer Program, 271–283.

Nadiga, B. and S. Shkoller [2001], Enhancement of the inverse-cascade of energy in the two-dimensional Lagrangian-averaged Navier–Stokes equations, Phys. of Fluids, 13, 1528–1531.

Palais, R. [1968], Foundations of global nonlinear analysis, Benjamin, New York.

Shkoller, S. [1998], Geometry and curvature of diffeomorphism groups with $H^1$ metric and mean hydrodynamics, J. Funct. Anal., 160, 337–365.

Shkoller, S. [2000], Analysis on groups of diffeomorphisms of manifolds with boundary and the averaged motion of a fluid, J. Differential Geom., 55, 145–191.

Shkoller, S. [2001], Smooth global Lagrangian flow for the 2D Euler and second-grade fluid equations, Appl. Math. Lett., 14, 539–543.

Taylor, M. E. [1996], Partial Differential Equations III, Springer-Verlag.

6 Nearly Inviscid Faraday Waves

Edgar Knobloch
José M. Vega

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT Many powerful techniques from Hamiltonian mechanics are available for the study of ideal hydrodynamics. This article explores some of the consequences of including small viscosity in a study of surface gravity-capillary waves excited by the vertical vibration of a container. It is shown that in this system, as in others, the addition of small viscosity provides a singular perturbation of the ideal fluid problem, and that as a result its effects are nontrivial. The relevance of existing studies of ideal fluid problems is discussed from this point of view.

Contents

1 Introduction
2 The Faraday System
3 Gravity-Capillary Waves in Moderately Large Aspect-Ratio Containers
4 Dynamics of the Reduced Equations
4.1 Two-Mode Model and Basic Solutions
4.2 Numerical Results
4.3 Comparison with the PDE
5 Concluding Remarks
References

1 Introduction

Jerry Marsden has been a driving force in studies of ideal hydrodynamics using methods from Hamiltonian mechanics. Perhaps his most important contribution has been the discovery of a systematic procedure for the construction of noncanonical Hamiltonian structures for such flows. The required noncanonical Poisson brackets are typically singular, implying the presence of additional conserved quantities known as Casimirs. Using these techniques he and his colleagues were able to extend Arnol'd-type stability theorems to a number of flows of importance in geophysics and engineering (Marsden and Morrison [1984]; Holm, Marsden, Ratiu, and Weinstein [1985]; Abarbanel, Holm, Marsden, and Ratiu [1986]; Holm, Marsden, and Ratiu [1986]; Lewis, Marsden, Montgomery, and Ratiu [1986]). These are major contributions to the field of ideal hydrodynamics and are nowadays taught in graduate level courses on the subject.

Although these techniques are powerful, and enable one to obtain results that would be hard to obtain by other means, there remains an important question as to their relevance to flows in the real world. Unless one studies flows in a superfluid, for example $^4$He below the λ-point (i.e., at temperatures below the transition to superfluidity), such flows are inevitably affected by dissipative processes, be they viscous or thermal. These may have an importance beyond being responsible for the decay of the flow on the slow diffusive time scale (Batchelor [1967]; Chorin and Marsden [1979]). In general the presence of small viscosity is responsible for the formation of thin boundary layers where the flow departs drastically from that in the bulk. In such boundary layers vorticity is generated by viscous effects and this vorticity may then diffuse or be convected into the bulk. In such cases the flow in the bulk may be substantially modified.

Boundary layers may be classified as passive or dynamic, depending on their effect on the bulk flow. Passive boundary layers do not affect the flow in the bulk, which will then resemble the potential solution over long times; such boundary layers serve merely to adjust the flow to the physically relevant boundary conditions. In the absence of boundary layer separation such boundary layers are found, for example, in steady flow around obstacles.
Oscillatory boundary layers may likewise be passive if the oscillation amplitude is small and only the leading order oscillatory flow is considered. However, as discussed further below, this is no longer so at second order in amplitude. In this and other cases the boundary layers can become dynamic, and force the flow in the bulk even though this flow remains largely inviscid. In such cases the inviscid flow in the bulk differs substantially from the flow that would be obtained by ignoring the boundary layers altogether, and this effect persists in the limit in which the viscosity vanishes, i.e., in these cases the limit of vanishing viscosity may have at most a tenuous connection with the behavior of the strictly inviscid system (Batchelor [1967]). The present article is devoted to the explication of this phenomenon in the context of a particularly interesting physical system, gravity-capillary waves in a vertically vibrating container (the Faraday system).

The difference between the properties of the Euler equation for an ideal incompressible fluid and the Navier–Stokes equation in the limit of large Reynolds number provides the most famous example of the dangers of ignoring viscosity entirely, in the sense that the 'thermodynamic equilibrium' spectrum that results bears no relation to the energy spectrum in

the so-called inertial range. But there are simple examples of problems not involving turbulence where viscosity, however small, also plays a profound role. Perhaps the simplest is provided by the computation of the Lagrangian drift of a fluid element when a surface gravity-capillary wave passes overhead. This drift is important because its sum over all the fluid elements may be identified with the linear momentum associated with the wave (Knobloch and Pierce [1998], and references therein). In the following we employ Cartesian coordinates, with the $x$-axis along the unperturbed free surface of the fluid and $y$ vertically upwards. An irrotational incompressible flow then satisfies the equation $\nabla^2\phi = 0$, where $u \equiv (\phi_x, \phi_y)$ is the Eulerian velocity, subject to the boundary conditions

$$\phi_y = 0 \quad \text{at } y = -h;$$

$$f_t + \phi_x f_x = \phi_y, \qquad \phi_t + |u|^2/2 + p/\rho + gf = 0 \quad \text{at } y = f.$$

Here $f$ is the free surface deflection, $p = p_0 - T f_{xx}(1 + f_x^2)^{-3/2}$, the excess pressure being a consequence of the presence of the surface tension $T$, and $g$ and $\rho$ are, respectively, the acceleration due to gravity and the fluid density. A formulation of this type assumes that the fluid remains irrotational if it is irrotational initially. This is so only if the fluid is strictly inviscid. Since a particle starting at $x = a$ at $t = 0$ is at

$$x = a + \int_0^t v(a, t')\,dt'$$

at time $t$, the Lagrangian velocity of the fluid element at time $t$ is given, to second order, by

$$v(a, t) = u(a, t) + \left(\int_0^t u(a, t')\,dt'\right) \cdot \nabla_a u(a, t).$$

For a progressive sinusoidal wave of (small) amplitude $A$, $f = A\cos(kx - \omega t) + O(A^2)$ and $\phi = [f_t \cosh k(y + h)]/[k \sinh kh] + O(A^2)$. If $A$ is constant in space it is possible to show that the time-averaged Eulerian velocity $\langle u\rangle$ vanishes to second order but the time-averaged Lagrangian drift $\langle v\rangle$ does not:

$$\langle v\rangle = \left(\frac{\omega k A^2 \cosh 2k(y + h)}{2\sinh^2 kh},\ 0\right). \tag{1.2}$$

This drift is known as the Stokes drift. However, in the presence of small viscosity, this result is misleading. The argument that follows goes back to the work of Schlichting [1932]. Observe that for sufficiently small viscosity (namely $\beta h \gg 1$, $\beta/k \gg 1$, where $\beta = (\omega/2\nu)^{1/2}$) the inviscid solution


applies everywhere except in the two thin oscillatory viscous boundary layers of $O(\beta^{-1})$ thickness along the top and bottom, whose contribution can be superposed on top of the irrotational flow just computed. Therefore, if in the bottom boundary layer we write $u = \nabla\phi + u'$, then at leading order $u' = (u', v')$ satisfies the linearized vorticity equation

$$\frac{\partial\Omega'}{\partial t} = \nu\left(\frac{\partial^2\Omega'}{\partial x^2} + \frac{\partial^2\Omega'}{\partial y^2}\right), \qquad \Omega' = \nabla \times u',$$

subject to the boundary conditions

$$u' = -\phi_x, \quad v' = 0 \quad \text{at } y = -h; \qquad u' = 0 \quad \text{for } \beta(y + h) \gg 1.$$

This problem has the solution

$$u' = -\omega A\,\operatorname{cosech} kh\; e^{-\beta(y+h)}\cos[kx - \omega t + \beta(y + h)], \qquad v' = -\int_{-h}^{y}\left[\partial u'(x, z, t)/\partial x\right]dz.$$

With these expressions it is possible to compute a time-averaged Reynolds stress in the oscillatory boundary layer,

$$\langle u'v'\rangle = \frac{\omega^2 k A^2}{4\beta\sinh^2 kh}\left[2(\beta\tilde y\sin\beta\tilde y + \cos\beta\tilde y)e^{-\beta\tilde y} - e^{-2\beta\tilde y} - 1\right],$$

correct to second order in the wave amplitude $A$. Here $\tilde y \equiv y + h$. This Reynolds stress drives a mean flow $(U(y), 0)$ according to the mean momentum equation $\nu\,\partial^2 U/\partial y^2 = \partial\langle u'v'\rangle/\partial y$, i.e.,

$$\nu\frac{\partial U}{\partial y} = \langle u'v'\rangle - \langle u'v'\rangle_\infty, \tag{1.5}$$

where $\langle u'v'\rangle_\infty$ represents the Reynolds stress just outside of the boundary layer. Letting $\beta(y + h) \to \infty$ one finds that $\langle u'v'\rangle_\infty = -\omega^2 k A^2/(4\beta\sinh^2 kh)$. In view of the requirement $U(-h) = 0$ equation (1.5) now implies that

$$U(y) \to U_\infty = \frac{3\omega k A^2}{4\sinh^2 kh} \quad \text{for } \beta(y + h) \gg 1.$$

Thus the time-averaged Eulerian velocity at the edge of the boundary layer is (a) finite at second order, and (b) independent of $\nu$ (for sufficiently small $\nu$), provided only that $\nu > 0$! Since this Eulerian mean flow also carries the fluid elements with it, its effect must be added to the Stokes drift (1.2) computed on the basis of inviscid theory. Thus the net Lagrangian drift for $\beta h \gg 1$, $\beta/k \gg 1$ is in fact

$$\langle v\rangle_\infty = \left(\frac{5\omega k A^2}{4\sinh^2 kh},\ 0\right),$$


a value that is 5/2 times the inviscid value (Longuet–Higgins [1953]; Batchelor [1967]; Phillips [1977]; Craik [1982]). As recognized already by Longuet–Higgins [1953], a somewhat similar effect is present at the free surface as well. It is clear therefore that the oscillatory viscous boundary layers must be retained even in the limit of arbitrarily small viscosity, and that these are effective at driving large scale mean flows even when the viscosity $\nu$ is arbitrarily small.

In the following we discuss in some detail the corresponding phenomena in the Faraday system, where oscillatory viscous boundary layers are inevitably present, and explore the interaction between the Faraday instability and the mean flow driven in these boundary layers. In systems of small to moderate aspect ratio such mean flows are entirely of viscous origin (Nicolás and Vega [1996]; Higuera, Nicolás, and Vega [2000]), but in the larger aspect ratio systems of interest below the situation is rather more subtle because of the presence of an additional inviscid mean flow. For inviscid free waves this mean flow is associated with spatial modulation of a single mode, as described by the celebrated Davey-Stewartson equations (Davey and Stewartson [1974]; Pierce and Knobloch [1994]). If viscosity is retained and the system forced, as in a shear flow, a similar set of equations but with complex coefficients can be derived (Davey, Hocking, and Stewartson [1974]). In general the mean flow present will contain both viscous and inviscid contributions, even in nearly inviscid flows. It is because of these effects that one cannot mimic the effects of viscosity on an oscillating fluid system by simply adding dissipation post facto to an otherwise inviscid theory.
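The factor 5/2 quoted in the introduction can be checked with a few lines of arithmetic. The sketch below evaluates the Stokes drift (1.2) at the bottom, $y = -h$, adds the streaming velocity $U_\infty = 3\omega k A^2/(4\sinh^2 kh)$, and also integrates the right-hand side of (1.5) across the boundary layer by the trapezoid rule. With $s = \beta(y + h)$ and $\nu\beta^2 = \omega/2$, that integral should come out as 3/2, which is what produces the coefficient 3/4 in $U_\infty$. The function names and the sample wave parameters are illustrative assumptions, not values from the text.

```python
import math


def stokes_drift(y, A, k, omega, h):
    """Horizontal component of the inviscid Stokes drift (1.2) at height y."""
    return omega * k * A**2 * math.cosh(2.0 * k * (y + h)) / (2.0 * math.sinh(k * h) ** 2)


def streaming_velocity(A, k, omega, h):
    """Eulerian mean flow U_infinity at the edge of the bottom boundary layer."""
    return 3.0 * omega * k * A**2 / (4.0 * math.sinh(k * h) ** 2)


def boundary_layer_integral(s_max=40.0, n=40000):
    """Trapezoid-rule integral over s = beta*(y+h) of
    2 (s sin s + cos s) e^{-s} - e^{-2s}, i.e. the bracket in the
    Reynolds stress with its far-field value (-1) subtracted."""
    def G(s):
        return 2.0 * (s * math.sin(s) + math.cos(s)) * math.exp(-s) - math.exp(-2.0 * s)

    ds = s_max / n
    total = 0.5 * (G(0.0) + G(s_max))
    for i in range(1, n):
        total += G(i * ds)
    return total * ds


# Illustrative (assumed) wave parameters
A, k, omega, h = 0.001, 50.0, 25.0, 0.05
drift_bottom = stokes_drift(-h, A, k, omega, h)
net_drift = drift_bottom + streaming_velocity(A, k, omega, h)
print("net drift / Stokes drift at the bottom:", net_drift / drift_bottom)
print("boundary-layer integral:", boundary_layer_integral())
```

Note that the ratio does not depend on the chosen parameters, since both contributions scale as $\omega k A^2/\sinh^2 kh$ at $y = -h$.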

2 The Faraday System

Surface gravity-capillary waves excited parametrically by the vertical oscillation of a container provide a convenient and well-studied system (Miles and Henderson [1990]; Fauve [1995]; Kudrolli and Gollub [1997]), where the issues raised in the preceding section come to the fore. We nondimensionalize distances with the unperturbed depth $h$ and time with the gravity-capillary time $[g/h + T/(\rho h^3)]^{-1/2}$. In two dimensions the resulting viscous problem is then described by the dimensionless equations (Vega, Knobloch, and Martel [2001]),

$$\psi_{xx} + \psi_{yy} = \Omega, \qquad \Omega_t - \psi_y\Omega_x + \psi_x\Omega_y = C_g(\Omega_{xx} + \Omega_{yy}), \tag{2.1}$$

$$f_t - \psi_x - \psi_y f_x = (\psi_{yy} - \psi_{xx})(1 - f_x^2) - 4f_x\psi_{xy} = 0 \quad \text{at } y = f, \tag{2.2}$$

$$(1 - S)f_x - S\left(f_x/\sqrt{1 + f_x^2}\right)_{xx} - \psi_{yt} + \psi_{xt}f_x - (\psi_x + \psi_y f_x)\Omega + (\psi_x^2 + \psi_y^2)_x/2 + (\psi_x^2 + \psi_y^2)_y f_x/2 - 4\mu\omega^2\cos(2\omega t)f_x$$
$$= -C_g\left[3\psi_{xxy} + \psi_{yyy} - (\psi_{xxx} + \psi_{xyy})f_x\right] + 2C_g\left[\frac{2\psi_{xy}f_x^2 + (\psi_{xx} - \psi_{yy})f_x}{1 + f_x^2}\right]_x + 2C_g\,\frac{(\psi_{xxy} - \psi_{yyy})f_x^2 - \psi_{xyy}(1 - f_x^2)f_x}{1 + f_x^2} \quad \text{at } y = f, \tag{2.3}$$

$$\int_0^L \Omega_y\,dx = \psi = \psi_y = 0 \quad \text{at } y = -1, \qquad \int_0^L f\,dx = 0, \tag{2.4}$$

where $\psi$ is the streamfunction, defined such that $u \equiv (-\psi_y, \psi_x)$ is the velocity, $\Omega \equiv \nabla \times u$ is the vorticity, and $f$ is again the free surface deflection. The latter is required to satisfy volume conservation as in (2.4d). In an annular container of dimensionless length $L$ periodic boundary conditions are applied to all quantities; in this case the boundary condition (2.4a) guarantees that the pressure is also periodic in $x$. The resulting problem depends on

$L$, the aspect ratio,
$\mu$, the nondimensional vibration amplitude,
$2\omega$, the nondimensional vibration frequency,
$C_g = \nu/[gh^3 + (Th/\rho)]^{1/2}$, the capillary-gravity number,
$S = T/(T + \rho g h^2)$, the gravity-capillary balance parameter.

Here $\nu$ is the kinematic viscosity. Thus $C_g$ and $S$ are related to the usual capillary number $C = \nu[\rho/(Th)]^{1/2}$ and the Bond number $B = \rho g h^2/T$ by

$$C_g = C/(1 + B)^{1/2} \quad \text{and} \quad S = 1/(1 + B).$$
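As a small worked check of these definitions, the sketch below computes $C_g$, $S$, $C$ and $B$ from dimensional quantities and confirms the two stated relations numerically. The function name `faraday_parameters` is hypothetical, and the sample values (roughly water in a 1 cm deep layer, SI units) are assumptions chosen only for illustration.

```python
import math


def faraday_parameters(nu, T, rho, g, h):
    """Dimensionless groups of the Faraday problem from dimensional inputs:
    kinematic viscosity nu, surface tension T, density rho,
    gravitational acceleration g, unperturbed depth h."""
    Cg = nu / math.sqrt(g * h**3 + T * h / rho)   # capillary-gravity number
    S = T / (T + rho * g * h**2)                  # gravity-capillary balance
    C = nu * math.sqrt(rho / (T * h))             # usual capillary number
    B = rho * g * h**2 / T                        # Bond number
    return {"Cg": Cg, "S": S, "C": C, "B": B}


# Assumed sample values: approximately water, 1 cm depth
p = faraday_parameters(nu=1.0e-6, T=0.072, rho=1000.0, g=9.81, h=0.01)
print(p)
print("Cg - C/sqrt(1+B):", p["Cg"] - p["C"] / math.sqrt(1.0 + p["B"]))
print("S - 1/(1+B):", p["S"] - 1.0 / (1.0 + p["B"]))
```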

Note that $0 \le S \le 1$ and that the extreme values $S = 0, 1$ correspond to the purely gravitational ($T = 0$) and the purely capillary ($g = 0$) limits, respectively. The formulation employed above uses the streamfunction $\psi$ and not the velocity potential $\phi$, since formulations of the Faraday problem in terms of the latter miss both the mechanism for the generation of (Eulerian) mean flows already discussed in §1, and the possibility that vorticity will diffuse from the viscous boundary layers along walls and the free surface into the nominally inviscid interior. These boundary layers form because in the presence of viscosity the tangential velocity must vanish along any wall while the tangential stress along the free surface is also required to vanish. Neither of these two effects is restored by the a posteriori addition of damping to a fundamentally inviscid formulation, i.e., a formulation based on the velocity potential. In fact, for times that are not too long the vorticity contamination of the bulk does remain negligible, so that the flow in the bulk is correctly described by an inviscid formulation but with boundary conditions determined by a boundary layer analysis as in §1. The basic assumption made below is that viscosity is small, namely

$$C_g \ll 1. \tag{2.5}$$


However, as already mentioned, this does not mean that viscous effects can be safely ignored. Indeed, the subtleties arise already at the level of the linear problem. The normal modes of the unforced problem, linearized around $\psi = f = 0$, take the form $(\psi, f) = (\Psi, F)e^{\lambda t + ikx}$. In the limit (2.5) there are two types of such modes (Kakutani and Matsuuchi [1975]; Martel and Knobloch [1997]):

A. The nearly inviscid modes (or surface modes) obey the dispersion relation

$$\lambda = i\omega - (1 + i)\alpha_1 C_g^{1/2} - \alpha_2 C_g + O(C_g^{3/2}), \tag{2.7}$$

where

$$\omega = [k\sigma(1 - S + Sk^2)]^{1/2}, \qquad \alpha_1 = \frac{k(\omega/2)^{1/2}}{\sinh(2k)}, \qquad \alpha_2 = \frac{k^2(1 + 8\sigma^2 - \sigma^4)}{4\sigma^2}, \tag{2.8}$$

and $\sigma \equiv \tanh k$. Eq. (2.7) provides a good approximation for both the frequency $\pm\mathrm{Im}(\lambda)$ and the damping rate,

$$\delta \equiv -\mathrm{Re}(\lambda) = \alpha_1 C_g^{1/2} + \alpha_2 C_g, \tag{2.9}$$

for small but fixed values of $C_g$, see Fig. 2.1. However, as noted in Martel and Knobloch [1997], if the (corrected) third term in (2.7) is omitted the resulting approximation breaks down as soon as $k \gtrsim k_m \sim |\ln C_g|$. Since these moderately large values of $k$ are also of interest this term is retained in what follows. The eigenfunction associated with the dispersion relation (2.7) is given (up to a constant factor) by

$$(\Psi, F) = (\Psi_0, 1) + O(C_g^{1/2}), \qquad \Psi_0 = \frac{\omega\sinh[k(y + 1)]}{k\sinh k}.$$
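The expansion (2.7)–(2.9) is straightforward to evaluate numerically. The following minimal sketch computes $\omega$, $\alpha_1$, $\alpha_2$ and the truncated eigenvalue $\lambda$ for given $k$, $C_g$ and $S$, dropping the $O(C_g^{3/2})$ remainder. The function name `surface_mode` is hypothetical; the values $C_g = 10^{-6}$, $S = 0.5$ are those quoted in the caption of Fig. 2.1, while the sampled wavenumbers are arbitrary.

```python
import math


def surface_mode(k, Cg, S):
    """Nearly inviscid (surface) mode from (2.7)-(2.8), truncated after the
    O(Cg) term. Returns (omega, alpha1, alpha2, lam) with lam complex."""
    sigma = math.tanh(k)
    omega = math.sqrt(k * sigma * (1.0 - S + S * k**2))
    alpha1 = k * math.sqrt(omega / 2.0) / math.sinh(2.0 * k)
    alpha2 = k**2 * (1.0 + 8.0 * sigma**2 - sigma**4) / (4.0 * sigma**2)
    root = math.sqrt(Cg)
    # lambda = i*omega - (1 + i)*alpha1*Cg^(1/2) - alpha2*Cg
    lam = complex(-alpha1 * root - alpha2 * Cg, omega - alpha1 * root)
    return omega, alpha1, alpha2, lam


Cg, S = 1.0e-6, 0.5
for k in (1.0, 2.0, 5.0, 10.0):
    omega, a1, a2, lam = surface_mode(k, Cg, S)
    delta = -lam.real  # damping rate (2.9)
    print(f"k={k:5.1f}  Im(lambda)={lam.imag:.6f}  delta={delta:.3e}  "
          f"O(Cg^1/2) part={a1 * math.sqrt(Cg):.3e}  O(Cg) part={a2 * Cg:.3e}")
```

Printing the two contributions to $\delta$ separately makes it easy to see at which wavenumbers the $O(C_g)$ term overtakes the $O(C_g^{1/2})$ term, which is the regime the text refers to when it retains the third term in (2.7).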

These modes therefore exhibit a significant free-surface deflection; moreover, they are irrotational in the bulk, outside two thin boundary layers of thickness $O((C_g/\omega)^{1/2})$ attached to the bottom plate and the free surface. Since the decay rate of these modes is $O(C_g^{1/2})$ for small $C_g$ these modes are near-marginal in nearly inviscid fluids. Note that the horizontal wavenumber $k$ is only restricted by the periodicity condition and thus can take any value of the form $2\pi N/L$, where $N$ is an integer; in the limit $L \to \infty$ the allowed wavenumbers become dense on the real line. In the following we assume that the basic disturbance consists of a pair of counterpropagating wavetrains with wavenumber $\pm k$ and frequency $\omega$ determined from the above dispersion relation, and that the mean flow arises from nonlinear

Figure 2.1. The nearly inviscid dispersion relation, $\mathrm{Im}\,\lambda$ and $\mathrm{Re}\,\lambda$ vs. $k$, for $C_g = 10^{-6}$, $S = 0.5$, from Eq. (2.7) using the $O(C_g^{1/2})$ results (dashed line) and the $O(C_g)$ results (solid line). These parameters correspond to the experiments of Henderson and Miles [1994].

interactions involving these two modes. Here $\omega$ represents half the forcing frequency. Thus the relevant nearly inviscid modes are either of long wavelength ($k \to 0$) or are concentrated around the two counterpropagating modes. The long wave modes constitute the nearly inviscid mean flow; in the strictly inviscid case, this flow is the mean flow considered in inviscid theories (Davey and Stewartson [1974]; Pierce and Knobloch [1994]). However, because of its long wavelength this mean flow does not appear if the aspect ratio is of order unity (Nicolás and Vega [1996]; Higuera, Nicolás,

and Vega [2000]).

B. The viscous modes (or hydrodynamical modes) obey the dispersion relation

$$\lambda = -C_g[k^2 + q_n(k)^2] + O(C_g^2),$$

where for each $k > 0$, $q_n > 0$ is the $n$-th root of $q\tanh k = k\tan q$, and hence decay on an $O(C_g^{-1})$ timescale, i.e., more slowly than the surface modes when $C_g$ is sufficiently small. Consequently these modes are also near-marginal. Since the associated eigenfunction is

$$\Psi = \sin q_n\sinh(ky) - \sinh k\sin(q_n y) + O(C_g), \qquad F = O(C_g),$$

these modes do not result in any significant free-surface deformation at leading order. On the other hand they are rotational throughout the domain and, when forced at the edge of the oscillatory boundary layers along the bottom (Schlichting [1932]) and the free surface (Longuet–Higgins [1953]) by the mechanism described in §1, they constitute the viscous mean flow. In view of its slow decay this flow must be included in any realistic nearly inviscid description.

With this in mind it is now possible to perform a multiscale analysis of the viscous fluid equations using $C_g$, $L^{-1}$ and $\mu$ as unrelated small parameters. We focus on two well-separated scales in both space ($x \sim 1$ and $x \gg 1$) and time ($t \sim 1$ and $t \gg 1$), and derive equations for small, slowly varying amplitudes $A$ and $B$ of left- and right-propagating waves. Since viscosity is small, we must distinguish three regions in the physical domain, namely, the two oscillatory boundary layers (of thickness $O(C_g^{1/2})$) and the remaining part (or bulk) of the domain (see Fig. 2.2). The boundary layers must be considered in order to obtain the correct boundary conditions for the solution in the bulk. The details of the derivation are quite involved and can be found in a recent paper (Vega, Knobloch, and Martel [2001]), where explicit conditions for the validity of the resulting equations as a description of the two-dimensional nearly inviscid Faraday system are also derived. The resulting equations take the form

$$A_t - v_g A_x = i\alpha A_{xx} - (\delta + id)A + i(\alpha_3|A|^2 - \alpha_4|B|^2)A + i\alpha_5\mu\bar B + i\alpha_6\int_{-1}^{0} g(y)\langle\psi^m_y\rangle^x\,dy\,A + i\alpha_7\langle f^m\rangle^x A, \tag{2.11}$$

$$B_t + v_g B_x = i\alpha B_{xx} - (\delta + id)B + i(\alpha_3|B|^2 - \alpha_4|A|^2)B + i\alpha_5\mu\bar A - i\alpha_6\int_{-1}^{0} g(y)\langle\psi^m_y\rangle^x\,dy\,B + i\alpha_7\langle f^m\rangle^x B, \tag{2.12}$$

$$A(x + L, t) \equiv A(x, t), \qquad B(x + L, t) \equiv B(x, t), \tag{2.13}$$

where $\mu$ denotes the (small) amplitude of the periodic forcing. The first seven terms in these equations, accounting for inertia, propagation at the group velocity $v_g$, dispersion, damping, detuning, cubic nonlinearity and


E. Knobloch and J. M. Vega

Figure 2.2. Sketch of the primary and secondary boundary layers, indicating their widths in comparison to the layer depth. [Sketch labels: primary boundary layers of width $O(C_g^{1/2})$, secondary boundary layers of width $O((C_g L)^{1/2})$, and the bulk between them.]

parametric forcing, are familiar from existing weakly nonlinear, nearly inviscid theories (Ezerskii, Rabinovich, Reutov, and Starobinets [1986]). These theories lead to expression (2.9) for the damping $\delta$ and the expressions
\[ v_g = \omega'(k), \qquad \alpha = -\omega''(k)/2, \tag{2.14} \]
\[ \alpha_3 = \frac{\omega k^2\,[(1-S)(9-\sigma^2)(1-\sigma^2) + Sk^2(7-\sigma^2)(3-\sigma^2)]}{4\sigma^2\,[(1-S)\sigma^2 - Sk^2(3-\sigma^2)]} + \frac{[8(1-S) + 5Sk^2]\,\omega k^2}{4(1-S+Sk^2)}, \tag{2.15} \]
\[ \alpha_4 = \frac{\omega k^2}{2}\left[\frac{(1-S+Sk^2)(1+\sigma^2)^2}{(1-S+4Sk^2)\sigma^2} + \frac{4(1-S) + 7Sk^2}{1-S+Sk^2}\right], \tag{2.16} \]
\[ \alpha_5 = \omega k\sigma, \tag{2.17} \]

where $\omega = \omega(k)$ is the dispersion relation (2.8), and are recovered in the present formulation. In particular, the cubic coefficients coincide with those obtained in strictly inviscid formulations (Pierce and Knobloch [1994]; see also Miles [1993], Hansen and Alstrom [1997] and references therein). The coefficient $\alpha_3$ diverges at (excluded) resonant wavenumbers satisfying $\omega(2k) = 2\omega(k)$. The last two terms describe the coupling to the mean flow in the bulk (be it viscous or inviscid in origin) in terms of (a local average $\langle\cdot\rangle^x$ of) the streamfunction $\psi^m$ for this flow and the associated free surface elevation $f^m$. The coefficients of these terms and the function $g$ are given

6. Nearly Inviscid Faraday Waves


by
\[ \alpha_6 = \frac{k\sigma}{2\omega}, \qquad \alpha_7 = \frac{\omega k(1-\sigma^2)}{2\sigma}, \qquad g(y) = \frac{2\omega k\cosh[2k(y+1)]}{\sinh^2 k}. \]
The new terms are therefore conservative, implying that at leading order the mean flow does not extract energy from the system. This result is consistent with the small steepness of the associated surface displacement and its small speed compared with the speed $|\nabla\psi|$ due to the surface waves. The mean flow variables in the bulk depend weakly on time but strongly on both $x$ and $y$, and evolve according to the equations

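\[ \psi^m_{xx} + \psi^m_{yy} = \Omega^m, \qquad \Omega^m_t - [\psi^m_y + (|A|^2 - |B|^2)g(y)]\,\Omega^m_x + \psi^m_x\Omega^m_y = C_g(\Omega^m_{xx} + \Omega^m_{yy}), \tag{2.18} \]
\[ \psi^m_x - f^m_t = \beta_1(|B|^2 - |A|^2)_x, \qquad \psi^m_{yy} = \beta_2(|A|^2 - |B|^2), \qquad \text{at } y = 0, \tag{2.19} \]
\[ (1-S)f^m_x - Sf^m_{xxx} - \psi^m_{yt} + C_g(\psi^m_{yyy} + 3\psi^m_{xxy}) = -\beta_3(|A|^2 + |B|^2)_x, \qquad \text{at } y = 0, \tag{2.20} \]
\[ \int_0^L \Omega^m_y\,dx = \psi^m = 0, \qquad \psi^m_y = -\beta_4[\,iA\bar B e^{2ikx} + \text{c.c.} + |B|^2 - |A|^2\,], \qquad \text{at } y = -1, \tag{2.21} \]
\[ \psi^m(x+L,y,t) \equiv \psi^m(x,y,t), \qquad f^m(x+L,t) \equiv f^m(x,t), \tag{2.22} \]
subject to the constraint
\[ \int_0^L f^m(x,t)\,dx = 0. \tag{2.23} \]
Here
\[ \beta_1 = 2\omega/\sigma, \qquad \beta_2 = 8\omega k^2/\sigma, \qquad \beta_3 = (1-\sigma^2)\omega^2/\sigma^2, \qquad \beta_4 = 3(1-\sigma^2)\omega k/\sigma^2. \]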

Thus the mean ﬂow is forced by the surface waves in two ways. The right sides of the boundary conditions (2.19a) and (2.20) provide a normal forcing mechanism; this mechanism is the only one present in strictly inviscid theory (Davey and Stewartson [1974]; Pierce and Knobloch [1994]) and does not appear unless the aspect ratio is large. The right sides of the boundary conditions (2.19b) and (2.21c) describe two shear forcing mechanisms, a


tangential stress at the free surface and a tangential velocity at the bottom wall. Note that, as in the simpler example considered in §1, neither of these forcing terms vanishes in the limit of small viscosity (i.e., as $C_g \to 0$). The shear nature of these forcing terms leads us to retain the viscous term in (2.18b) even when $C_g$ is quite small. In fact, when $C_g$ is very small, the effective Reynolds number of the mean flow is quite large. Thus the mean flow itself generates additional boundary layers near the top and bottom of the container, and these must be thicker than the original boundary layers for the validity of the analysis. This puts an additional restriction on the validity of the equations (Vega, Knobloch, and Martel [2001]).

There is a third, less effective but inviscid, volumetric forcing mechanism associated with the second term in the vorticity equation (2.18b), which looks like a horizontal force $(|A|^2 - |B|^2)\,g(y)\,\Omega^m_x$ and is sometimes called the vortex force. This term plays an important role in the generation of Langmuir circulation (Leibovich [1983]). Although this term vanishes in the absence of mean flow, it can change the stability properties of the flow and enhance or limit the effect of the remaining forcing terms. However, this is not the case in the limit considered in §3 below.

In the following we refer to Eqs. (2.11)–(2.13) and (2.18)–(2.23) as the general coupled amplitude-mean-flow (GCAMF) equations. These equations differ from the exact equations forming the starting point for the analysis in the presence of the forcing terms in the boundary conditions (2.19)–(2.21), and in two essential simplifications: the fast oscillations associated with the surface waves have been filtered out, and the boundary conditions are applied at the unperturbed location of the free surface, $y = 0$. The forcing terms capture completely the effect of the primary viscous boundary layers on the bulk. The GCAMF equations are invariant under reflection,
\[ \psi^m \to -\psi^m, \qquad \Omega^m \to -\Omega^m, \qquad A \leftrightarrow B, \qquad x \to -x, \tag{2.24} \]

and hence admit reflection-symmetric solutions. The simplest such solutions are the spatially uniform standing waves given by $A = B = Re^{i\theta}$, where $\theta$ is a constant and the amplitude $R$ is given by
\[ \delta^2 + [d + (\alpha_3 - \alpha_4)R^2]^2 = \alpha_5^2\mu^2, \]
with an associated reflection-symmetric streaming flow that is periodic in $x$ with period $\pi/k$ (see Eq. (2.21c)). Since this mean flow does not couple to the amplitudes $A$, $B$ (i.e., the mean flow terms are absent from Eqs. (2.11)–(2.12)), the presence of this flow does not affect the standing waves. These much-studied waves bifurcate from the flat state at
\[ \mu = \mu_c \equiv \frac{(\delta^2 + d^2)^{1/2}}{\alpha_5}, \]

and do so supercritically if d < 0 and subcritically if d > 0, see Fig. 2.3. Note that µ can be of order µc without violating the conditions for the validity of


the GCAMF equations, and that these equations describe correctly both cases $d < 0$ and $d > 0$. In the former case, the waves are stable near threshold, but may lose stability at finite amplitude through the action of the mean flow as the forcing amplitude increases. Like the secondary saddle-node bifurcation which stabilizes the spatially uniform standing waves when $d > 0$ (see Fig. 2.3), this bifurcation is well within the regime of validity of the GCAMF equations. Thus the mean flow is involved only in possible secondary instabilities of the primary standing wave branch.

Figure 2.3. The primary bifurcation from the flat state to spatially uniform standing wave solutions. The GCAMF equations describe correctly all states with $|\mu - \mu_c| \sim \mu_c$, including the secondary saddle-node bifurcation present when $d > 0$ and the stable solutions beyond it. [Sketch: $R_0^2$ versus $\mu$, with the branches for $d > 0$ and $d < 0$ emerging from $\mu = \mu_c$.]
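The branches sketched in Fig. 2.3 can be traced directly from the amplitude relation $\delta^2 + [d + (\alpha_3 - \alpha_4)R^2]^2 = \alpha_5^2\mu^2$ quoted above. The following is a minimal sketch, not taken from the original analysis: the function names and all numerical values are hypothetical, and the combination $\alpha_3 - \alpha_4$ is passed as a single argument `alpha34`, assumed negative in the usage below.

```python
import math

def threshold(delta, d, alpha5):
    """Critical forcing mu_c = (delta^2 + d^2)^(1/2) / alpha5."""
    return math.hypot(delta, d) / alpha5

def standing_wave_branches(mu, delta, d, alpha34, alpha5):
    """Non-negative roots R^2 of delta^2 + (d + alpha34*R^2)^2 = (alpha5*mu)^2.

    alpha34 stands for alpha_3 - alpha_4 (assumed nonzero).
    """
    disc = (alpha5 * mu) ** 2 - delta ** 2
    if disc < 0:
        return []                      # forcing too weak: flat state only
    s = math.sqrt(disc)
    roots = {(-d + s) / alpha34, (-d - s) / alpha34}
    return sorted(r for r in roots if r >= 0)

if __name__ == "__main__":
    # Hypothetical coefficients: delta = alpha5 = 1, alpha_3 - alpha_4 = -1.
    for d in (-0.5, 0.5):
        mu_c = threshold(1.0, d, 1.0)
        for mu in (0.95, 1.1, 1.2):
            print(d, round(mu_c, 4), mu,
                  standing_wave_branches(mu, 1.0, d, -1.0, 1.0))
```

With these sample values the relation, as written, yields a single branch above $\mu_c$ when $d < 0$ and two coexisting branches between $\mu = \delta/\alpha_5$ and $\mu_c$ when $d > 0$, i.e., the saddle-node structure described in the caption.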

The special case $d = 0$ (zero detuning) and $\mu = \mu_c$ defines a codimension-two point for the analysis since both $L$ (or equivalently $\omega$) and $\mu$ must be chosen appropriately. In this case the direction of branching is determined by higher order terms neglected in the analysis, such as the real parts of the coefficients of the cubic terms, and this is so for sufficiently small but nonzero values of $d$ as well. In other words, the limit $d \to 0$ (although well-defined within the GCAMF equations) may not describe correctly the corresponding behavior of the underlying fluid equations appropriately close to threshold, i.e., for $|\mu - \mu_c| \ll \mu_c$. However, even in this case the GCAMF equations capture correctly any secondary instabilities involving the mean flow, provided these occur at $\mu \sim \mu_c$. A similar remark applies to other codimension-two points as well.
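The slow decay of the viscous modes discussed earlier in this section can be made concrete by solving $q\tanh k = k\tan q$ numerically. The sketch below is illustrative only: the function names and the sample values of $k$ and $C_g$ are assumptions, and only the leading-order part of $\lambda = -C_g[k^2 + q_n^2]$ is evaluated. It uses the fact that, because $\tanh k < k$, the $n$-th positive root lies in $(n\pi,\, n\pi + \pi/2)$, where the residual is monotonically decreasing.

```python
import math

def viscous_mode_roots(k, n_modes=3, tol=1e-12):
    """First n_modes positive roots q_n of q*tanh(k) = k*tan(q), by bisection."""
    tk = math.tanh(k)

    def h(q):
        return q * tk - k * math.tan(q)

    roots = []
    for n in range(1, n_modes + 1):
        # h > 0 at q = n*pi and h -> -infinity as q -> n*pi + pi/2 from below.
        lo, hi = n * math.pi, n * math.pi + math.pi / 2 - 1e-9
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if h(mid) > 0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def decay_rates(k, Cg, n_modes=3):
    """Leading-order growth rates lambda_n = -Cg*(k^2 + q_n^2), all negative."""
    return [-Cg * (k * k + q * q) for q in viscous_mode_roots(k, n_modes)]

if __name__ == "__main__":
    print(viscous_mode_roots(1.0))
    print(decay_rates(1.0, 1e-4))   # hypothetical values of k and C_g
```

The rates scale linearly with $C_g$, in contrast to the $O(C_g^{1/2})$ damping of the surface modes, which is why these modes are retained in the nearly inviscid description.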

3 Gravity-Capillary Waves in Moderately Large Aspect-Ratio Containers

The GCAMF equations describe small amplitude slowly varying wavetrains whenever the parameters $C_g$, $L^{-1}$ and $\mu$ are small, but otherwise unrelated to one another. Any relation between them will therefore lead to further simplification. To derive such simplified equations we consider the distinguished limit
\[ \delta L^2/\alpha = \Delta \sim 1, \qquad dL^2/\alpha = D \sim 1, \qquad \mu L^2/\alpha \equiv M \sim 1, \tag{3.1} \]
with $1 \lesssim k \lesssim |\ln C_g|$, and $|\ln C_g|$ taken for simplicity to be $O(1)$ as well. The simplified equations will then be formally valid for $1 \ll L \ll C_g^{-1/2}$ if $k \sim 1$. These are derived under the assumption $1 - S \sim 1$ using a multiple scale method with $x$ and $t$ as fast variables and
\[ \zeta = x/L, \qquad \tau = t/L, \qquad T = t/L^2 \tag{3.2} \]

as slow variables. In terms of these variables the local horizontal average $\langle\cdot\rangle^x$ becomes an average over the fast variable $x$. Note that assumption (3.1) imposes an implicit relation between $L$ and $C_g$. When $1 - S \sim 1$ the nearly inviscid and viscous mean flows can be clearly distinguished from one another as discussed in §2, and the viscous mean flow can be identified by taking appropriate averages of the whole mean flow over an intermediate timescale $\tau$, i.e., the mean flow variables $\psi^m$, $\Omega^m$ and $f^m$ take the form
\[ \psi^m(x,y,\zeta,\tau,T) = \psi^v(x,y,\zeta,T) + \psi^i(x,y,\zeta,\tau,T), \tag{3.3} \]
\[ \Omega^m(x,y,\zeta,\tau,T) = \Omega^v(x,y,\zeta,T) + \Omega^i(x,y,\zeta,\tau,T), \tag{3.4} \]
\[ f^m(x,\zeta,\tau,T) = f^v(x,\zeta,T) + f^i(x,\zeta,\tau,T), \tag{3.5} \]
with
\[ \left|\int_0^\tau \psi^i_x\,d\tau\right| + \left|\int_0^\tau \psi^i_\zeta\,d\tau\right| + \left|\int_0^\tau \psi^i_y\,d\tau\right| + \left|\int_0^\tau \Omega^i\,d\tau\right| + \left|\int_0^\tau f^i\,d\tau\right| \tag{3.6} \]
bounded as $\tau \to \infty$. Thus the nearly inviscid mean flow is purely oscillatory (i.e., it has a zero mean) on the timescale $\tau$. Since its frequency is of the order of $L^{-1}$ (see (3.2)), which is large compared with $C_g$, the inertial term for this flow is large in comparison with the viscous terms (see Eq. (2.18)), except in two secondary boundary layers, of thickness of the order of $(C_g L)^{1/2}$ ($\ll 1$), attached to the bottom plate and the free surface. Note that, as required for the consistency of the analysis, these boundary layers are much thicker than the primary boundary layers associated with the surface waves (see Fig. 2.2), which provide the boundary conditions (2.19)–(2.21) for the mean flow. Moreover, the width of these secondary


boundary layers remains small as $\tau \to \infty$ and (to leading order) the vorticity of this nearly inviscid mean flow remains confined to these boundary layers. This is because, according to condition (3.6), the nearly inviscid mean flow is purely oscillatory on the timescale $\tau$. Consequently, condition (3.6) is essential for the validity of the analysis that follows, and the mathematical definition of the nearly inviscid mean flow through Eqs. (3.3)–(3.6) is the only consistent one; without this condition vorticity would diffuse outside the boundary layers and affect the structure of the whole ‘nearly inviscid’ solution even at leading order. In fact, vorticity does diffuse (and is convected) from the boundary layers, but this vorticity transport is included in the viscous mean flow.

The vorticity associated with the nearly inviscid mean flow is readily seen to be of, at most, the order of $|A|^2 - |B|^2$ and $(|A|^2 + |B|^2)(C_g L)^{-1/2}$ in the upper and lower secondary boundary layers, respectively; the jump in the associated streamfunction $\psi^i$ across each boundary layer is $O(C_g L)$ times smaller. This jump only affects higher order terms; as a consequence the secondary boundary layers can be completely ignored and no additional contributions to the boundary conditions on the nearly inviscid flow need be included in (2.19) and (2.21). Outside these boundary layers, the complex amplitudes and the flow variables associated with the nearly inviscid mean flow are expanded as
\[ (A, B) = L^{-1}(A_0, B_0) + L^{-2}(A_1, B_1) + \cdots, \tag{3.8} \]
\[ (\psi^i, f^i, \Omega^i) = L^{-2}(\phi^i_0, F^i_0, 0) + L^{-3}(\phi^i_1, F^i_1, W^i_0) + \cdots, \tag{3.9} \]
\[ (\psi^v, f^v, \Omega^v) = L^{-2}(\phi^v_0, 0, W^v_0) + L^{-3}(\phi^v_1, F^v_0, W^v_1) + \cdots. \tag{3.10} \]

Substitution of (3.1)–(3.6), (3.8)–(3.10) into (2.11)–(2.23) leads to the following:

(i) From (2.18)–(2.21), at leading order,
\[ \phi^i_{0xx} + \phi^i_{0yy} = 0 \ \text{ in } -1 < y < 0, \qquad \phi^i_0 = 0 \ \text{ at } y = -1, \qquad \phi^i_{0x} = 0 \ \text{ at } y = 0, \]
together with $F^i_{0x} = 0$. Thus
\[ \phi^i_0 = (y+1)\,\Phi^i_0(\zeta,\tau,T), \qquad F^i_0 = F^i_0(\zeta,\tau,T). \]
At second order, the boundary conditions (2.19a) and (2.20) yield
\[ \phi^i_{1x}(x,0,\zeta,\tau,T) = F^i_{0\tau} - \Phi^i_{0\zeta} + \beta_1(|B_0|^2 - |A_0|^2)_\zeta, \]
\[ (1-S)F^i_{1x} - SF^i_{1xxx} = \Phi^i_{0\tau} - (1-S)F^i_{0\zeta} - \beta_3(|A_0|^2 + |B_0|^2)_\zeta \]
at $y = 0$. Since the right hand sides of these two equations are independent of the fast variable $x$ and both $\phi^i_1$ and $F^i_1$ must be bounded in $x$, it follows


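that
\[ \Phi^i_{0\zeta} - F^i_{0\tau} = \beta_1(|B_0|^2 - |A_0|^2)_\zeta, \qquad \Phi^i_{0\tau} - v_p^2F^i_{0\zeta} = \beta_3(|A_0|^2 + |B_0|^2)_\zeta, \tag{3.12} \]
where
\[ v_p = (1-S)^{1/2} \tag{3.13} \]
is the phase velocity of long wavelength surface gravity waves. Equations (3.12) must be integrated with the following additional conditions, which result from (2.22)–(2.23) and (3.6),
\[ \Phi^i_0(\zeta+1,\tau,T) \equiv \Phi^i_0(\zeta,\tau,T), \qquad F^i_0(\zeta+1,\tau,T) \equiv F^i_0(\zeta,\tau,T), \tag{3.14} \]
\[ \int_0^1 F^i_0\,d\zeta = 0, \qquad \left|\int_0^\tau \Phi^i_{0\zeta}\,d\tau\right| + \left|\int_0^\tau F^i_0\,d\tau\right| = \text{bounded as } \tau \to \infty. \tag{3.15} \]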

(ii) The leading order contributions to equations (2.11)–(2.12) yield $A_{0\tau} - v_gA_{0\zeta} = B_{0\tau} + v_gB_{0\zeta} = 0$. Thus
\[ A_0 = A_0(\xi, T), \qquad B_0 = B_0(\eta, T), \]
where $\xi$ and $\eta$ are the characteristic variables
\[ \xi = \zeta + v_g\tau, \qquad \eta = \zeta - v_g\tau. \tag{3.17} \]
Moreover, according to (2.13),
\[ A_0(\xi+1, T) \equiv A_0(\xi, T), \qquad B_0(\eta+1, T) \equiv B_0(\eta, T). \tag{3.18} \]
Substitution of these expressions into (3.12) followed by integration of the resulting equations yields
\[ \Phi^i_0 = \frac{\beta_1v_p^2 + \beta_3v_g}{v_g^2 - v_p^2}\left[|A_0|^2 - |B_0|^2 - \langle|A_0|^2 - |B_0|^2\rangle^\zeta\right] + v_p\left[F^+(\zeta + v_p\tau, T) - F^-(\zeta - v_p\tau, T)\right], \tag{3.19} \]
\[ F^i_0 = \frac{\beta_1v_g + \beta_3}{v_g^2 - v_p^2}\left[|A_0|^2 + |B_0|^2 - \langle|A_0|^2 + |B_0|^2\rangle^\zeta\right] + F^+(\zeta + v_p\tau, T) + F^-(\zeta - v_p\tau, T), \tag{3.20} \]


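where $\langle\cdot\rangle^\zeta$ denotes the mean value in the slow spatial variable $\zeta$, i.e.,
\[ \langle G\rangle^\zeta = \int_0^1 G\,d\zeta, \tag{3.21} \]
and the functions $F^\pm$ are such that
\[ F^\pm(\zeta + 1 \pm v_p\tau, T) \equiv F^\pm(\zeta \pm v_p\tau, T), \qquad \langle F^\pm\rangle^\zeta = 0. \tag{3.22} \]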
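The expressions (3.19)–(3.20) can be checked against the forced wave system (3.12) by direct differentiation. The following sketch does this numerically with central differences. Everything in it is an illustrative assumption rather than part of the original analysis: the parameter values, the sample profiles chosen for $|A_0|^2(\xi)$ and $|B_0|^2(\eta)$, and the sample functions $F^\pm$ (periodic with zero mean, as (3.22) requires).

```python
import math

TWO_PI = 2.0 * math.pi
# Hypothetical parameter values, chosen only so that v_g != v_p.
beta1, beta3, vg, vp = 1.3, 0.7, 0.6, 0.9
P = (beta1 * vp**2 + beta3 * vg) / (vg**2 - vp**2)   # prefactor in (3.19)
Q = (beta1 * vg + beta3) / (vg**2 - vp**2)           # prefactor in (3.20)

def a2(xi):                       # sample |A0|^2, 1-periodic, mean 1.0
    return 1.0 + 0.3 * math.cos(TWO_PI * xi)

def b2(eta):                      # sample |B0|^2, 1-periodic, mean 0.5
    return 0.5 + 0.2 * math.sin(TWO_PI * eta)

A_MEAN, B_MEAN = 1.0, 0.5

def f_plus(s):                    # sample F^+, zero mean
    return 0.10 * math.sin(TWO_PI * s)

def f_minus(s):                   # sample F^-, zero mean
    return 0.05 * math.cos(2.0 * TWO_PI * s)

def phi0(z, t):                   # Phi_0^i from (3.19)
    diff = a2(z + vg * t) - b2(z - vg * t) - (A_MEAN - B_MEAN)
    return P * diff + vp * (f_plus(z + vp * t) - f_minus(z - vp * t))

def f0(z, t):                     # F_0^i from (3.20)
    total = a2(z + vg * t) + b2(z - vg * t) - (A_MEAN + B_MEAN)
    return Q * total + f_plus(z + vp * t) + f_minus(z - vp * t)

def d_dz(f, z, t, h=1e-5):
    return (f(z + h, t) - f(z - h, t)) / (2.0 * h)

def d_dt(f, z, t, h=1e-5):
    return (f(z, t + h) - f(z, t - h)) / (2.0 * h)

def residuals(z, t):
    """Left minus right sides of the two equations in (3.12)."""
    forcing1 = beta1 * d_dz(lambda u, v: b2(u - vg * v) - a2(u + vg * v), z, t)
    forcing2 = beta3 * d_dz(lambda u, v: a2(u + vg * v) + b2(u - vg * v), z, t)
    r1 = d_dz(phi0, z, t) - d_dt(f0, z, t) - forcing1
    r2 = d_dt(phi0, z, t) - vp**2 * d_dz(f0, z, t) - forcing2
    return r1, r2

if __name__ == "__main__":
    print(residuals(0.37, 1.9))   # both entries should be at round-off level
```

The residuals vanish to the accuracy of the finite differences for any such choice of profiles, provided $v_g \neq v_p$.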

The particular solution of (3.19)–(3.20) yields the usual inviscid mean flow included in nearly inviscid theories (see Pierce and Knobloch [1994] and references therein); the averaged terms are a consequence of the conditions (3.15), i.e., of volume conservation (cf. Pierce and Knobloch [1994]) and the requirement that the nearly inviscid mean flow has a zero mean on the timescale $\tau$; the latter condition is never imposed in strictly inviscid theories but is essential in the limit we are considering, as explained above. To avoid the breakdown of the solution (3.19)–(3.20) at $v_p = v_g$ we assume that
\[ |v_p - v_g| \sim 1. \tag{3.23} \]
The functions $F^\pm$ remain undetermined at this stage. In fact, they are not needed below because the evolution of both the viscous mean flow and the complex amplitudes is decoupled from these functions. However, at next order one finds that $F^\pm$ remain constant on the timescale $T$, but decay exponentially due to viscous effects (resulting from viscous dissipation in the secondary boundary layer attached to the bottom plate) on the timescale $t \sim (L/C_g)^{1/2}$.

(iii) The evolution equations for $A_0$ and $B_0$ on the timescale $T$ are readily obtained from equations (2.11)–(2.13), invoking (3.1)–(3.6), (3.19)–(3.20), (3.22) and eliminating secular terms (i.e., requiring $|A_1|$ and $|B_1|$ to be bounded on the timescale $\tau$):
\[ A_{0T} = i\alpha A_{0\xi\xi} - (\Delta + iD)A_0 + i\left[(\alpha_3 + \alpha_8)|A_0|^2 - \alpha_8\langle|A_0|^2\rangle^\xi - \alpha_4\langle|B_0|^2\rangle^\eta\right]A_0 + i\alpha_5M\langle\bar B_0\rangle^\eta + i\alpha_6\int_{-1}^{0} g(y)\,\langle\langle\phi^v_{0y}\rangle^x\rangle^\zeta\,dy\,A_0, \tag{3.24} \]
\[ B_{0T} = i\alpha B_{0\eta\eta} - (\Delta + iD)B_0 + i\left[(\alpha_3 + \alpha_8)|B_0|^2 - \alpha_8\langle|B_0|^2\rangle^\eta - \alpha_4\langle|A_0|^2\rangle^\xi\right]B_0 + i\alpha_5M\langle\bar A_0\rangle^\xi - i\alpha_6\int_{-1}^{0} g(y)\,\langle\langle\phi^v_{0y}\rangle^x\rangle^\zeta\,dy\,B_0, \tag{3.25} \]
subject to (3.18). Here $\xi$ and $\eta$ are the comoving variables defined in (3.17), and $\langle\cdot\rangle^x$, $\langle\cdot\rangle^\zeta$, $\langle\cdot\rangle^\xi$ and $\langle\cdot\rangle^\eta$ denote mean values over the variables $x$, $\zeta$, $\xi$ and $\eta$, respectively. Note that $\zeta$ averages over functions of $A_0$ are equivalent


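to $\xi$ averages, while those over functions of $B_0$ are equivalent to $\eta$ averages. The real coefficient $\alpha_8$ is given by
\[ \alpha_8 = [\alpha_6(2\omega/\sigma)(\beta_1v_p^2 + \beta_3v_g) + \alpha_7(\beta_1v_g + \beta_3)]/(v_g^2 - v_p^2). \]
Eqs. (3.24)–(3.25) are independent of $F^\pm$ because of the second condition in (3.22). Since
\[ \langle|A_0|^2 - |B_0|^2\rangle^\tau = \langle|A_0|^2\rangle^\xi - \langle|B_0|^2\rangle^\eta \to 0 \quad \text{as } T \to \infty \]
the long time behavior of the viscous mean flow is described by
\[ \phi^v_{0xx} + \phi^v_{0yy} = W^v_0 \quad \text{in } -1 < y < 0, \tag{3.27} \]
\[ W^v_{0T} - \phi^v_{0y}W^v_{0x} + \phi^v_{0x}W^v_{0y} = Re^{-1}(W^v_{0xx} + W^v_{0yy}) \quad \text{in } -1 < y < 0, \tag{3.28} \]
\[ \phi^v_{0x} = \phi^v_{0yy} = 0 \quad \text{at } y = 0, \tag{3.29} \]
\[ \langle\langle W^v_{0y}\rangle^x\rangle^\zeta = \phi^v_0 = 0 \quad \text{at } y = -1, \tag{3.30} \]
\[ \phi^v_{0y} = -\beta_4\left[\,i\langle A_0\bar B_0\rangle^\tau e^{2ikx} + \text{c.c.}\right] \quad \text{at } y = -1, \tag{3.31} \]
\[ \phi^v_0(x+L, \zeta+1, y, T) \equiv \phi^v_0(x, \zeta, y, T), \tag{3.32} \]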

where the effective Reynolds number associated with this viscous mean flow is
\[ Re = 1/(C_gL^2). \tag{3.33} \]
Remarks. Some remarks about these equations and boundary conditions are now in order.

a. The viscous mean flow is driven by the short gravity-capillary waves through the inhomogeneous term in the boundary condition (3.31). Since $\langle A_0\bar B_0\rangle^\tau$ depends on both $\zeta$ and $T$ (unless either $A_0$ or $B_0$ is spatially uniform) the boundary condition implies that $\phi^v_0$ (and hence $W^v_0$) depends on both the fast and slow horizontal spatial variables $x$ and $\zeta$. This dependence cannot be obtained in closed form (except, of course, in the uninteresting limit $Re \to 0$), and one must resort to numerical computations for realistically large values of $L$.

b. The higher order oscillatory terms absent from the boundary condition (3.31) oscillate on the intermediate timescale $\tau$, and hence generate secondary boundary layers. However, the contributions from these boundary layers are all subdominant and have no effect at the order considered. Moreover, the free-surface deflection accompanying the viscous mean flow is also small, $f^v \sim L^{-3}$ (see Eq. (3.10)), and so plays no role in the evolution of this flow, as expected of a flow involving the excitation of viscous modes (see §2).


c. The dominant forcing of the viscous mean flow comes from the lower boundary. This forcing vanishes exponentially when $k \gg 1$ leaving only a narrow range of wavenumbers within which such a mean flow is forced while $\delta = O(C_g)$, see Fig. 2.1. Thus in most cases in which a viscous mean flow is present one may assume that
\[ \delta = O(C_g^{1/2}). \]
Note, however, that in fully three-dimensional situations (such as that in Douady, Fauve, and Thual [1989]) in which lateral walls are included a viscous mean flow will be present even when $k \gg 1$ because the forcing of the mean flow in the oscillatory boundary layers along the lateral walls remains.

d. According to the scaling (3.1) and the definitions (2.8), (2.9) and (3.33), the effective Reynolds number $Re$ is large, and ranges from logarithmically large values if $k \sim |\ln C_g|$ to $O(C_g^{-1/2})$ if $k \sim 1$. However, even in the latter limit we must retain the viscous terms in (3.28) in order to account for the second boundary conditions in (3.29)–(3.31). Of course, if $Re \gg 1$ vorticity diffusion is likely to be confined to thin layers, but the structure and location of all these layers cannot be anticipated in any obvious way, and one must again rely on numerical computations.

e. Note that the change of variables
\[ A_0 = \tilde A_0e^{-i\theta}, \qquad B_0 = \tilde B_0e^{i\theta}, \]
where
\[ \theta'(T) = -\alpha_6\int_{-1}^{0} g(y)\,\langle\langle\phi^v_{0y}\rangle^x\rangle^\zeta\,dy, \]
reduces Eqs. (3.24)–(3.25) to the much simpler form
\[ \tilde A_{0T} = i\alpha\tilde A_{0\xi\xi} - (\Delta + iD)\tilde A_0 + i\left[(\alpha_3 + \alpha_8)|\tilde A_0|^2 - (\alpha_4 + \alpha_8)\langle|\tilde A_0|^2\rangle^\xi\right]\tilde A_0 + i\alpha_5M\langle\bar{\tilde B}_0\rangle^\eta, \tag{3.36} \]
\[ \tilde B_{0T} = i\alpha\tilde B_{0\eta\eta} - (\Delta + iD)\tilde B_0 + i\left[(\alpha_3 + \alpha_8)|\tilde B_0|^2 - (\alpha_4 + \alpha_8)\langle|\tilde B_0|^2\rangle^\eta\right]\tilde B_0 + i\alpha_5M\langle\bar{\tilde A}_0\rangle^\xi, \tag{3.37} \]
\[ \tilde A_0(\xi+1, T) \equiv \tilde A_0(\xi, T), \qquad \tilde B_0(\eta+1, T) \equiv \tilde B_0(\eta, T), \tag{3.38} \]

from which the mean ﬂow is absent. This decoupling is a special property of the regime deﬁned by Eq. (3.1). The resulting equations provide perhaps the simplest description of the Faraday system at large aspect ratio, and it is for this reason that they have been extensively studied (Martel, Knobloch, and Vega [2000]). We summarize some of their properties in the next section.
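The cancellation behind remark e can be written out in a few lines. In the sketch below $I(T)$ is shorthand introduced here (it does not appear in the original) for the integral multiplying $A_0$ and $B_0$ in (3.24)–(3.25); it is real because $g$ and the streamfunction are real, so $\theta$ is real and the moduli $|A_0| = |\tilde A_0|$, $|B_0| = |\tilde B_0|$, and hence all cubic and averaged terms, are unchanged.

```latex
\begin{align*}
I(T) &\equiv \int_{-1}^{0} g(y)\,\langle\langle\phi^v_{0y}\rangle^x\rangle^\zeta\,dy ,
\qquad A_0 = \tilde A_0 e^{-i\theta}, \quad B_0 = \tilde B_0 e^{i\theta}, \\
A_{0T} &= \bigl(\tilde A_{0T} - i\theta'\tilde A_0\bigr)e^{-i\theta},
\qquad i\alpha_5 M\langle\bar B_0\rangle^\eta
      = i\alpha_5 M\langle\bar{\tilde B}_0\rangle^\eta\,e^{-i\theta}, \\
\tilde A_{0T} &= \bigl[\text{remaining terms of (3.24), written with tildes}\bigr]
      + i\,(\theta' + \alpha_6 I)\,\tilde A_0 , \\
\tilde B_{0T} &= \bigl[\text{remaining terms of (3.25), written with tildes}\bigr]
      - i\,(\theta' + \alpha_6 I)\,\tilde B_0 .
\end{align*}
```

Both mean-flow contributions therefore drop out simultaneously for $\theta' = -\alpha_6 I$, which is the choice made in remark e.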

4 Dynamics of the Reduced Equations

In this section we describe some basic properties of the nonlocal equations (3.36)–(3.38) in the invariant subspace $\tilde A_0(\cdot, \tau) = \tilde B_0(\cdot, \tau) \equiv C(\cdot, \tau)$, say, in which the dynamics are described by the partial differential equation (PDE)
\[ C_\tau = i\alpha C_{xx} - (\Delta + iD)C + i\left[(\alpha_3 + \alpha_8)|C|^2 - (\alpha_4 + \alpha_8)\langle|C|^2\rangle\right]C + i\alpha_5M\langle\bar C\rangle, \tag{4.1} \]
subject to periodic boundary conditions. Henceforth the variable $x$ stands for either $\eta$ or $\xi$, depending on whether $C$ stands for $\tilde A_0$ or $\tilde B_0$. Eq. (4.1) describes standing wave solutions of Eqs. (3.36)–(3.38). It is possible to show that such standing waves (hereafter SW) are the preferred state at onset (Riecke, Crawford, and Knobloch [1988]) although at larger values of the forcing amplitude such waves may become unstable with respect to perturbations transverse to this subspace (Martel, Knobloch, and Vega [2000]); if this is so the dynamics of Eqs. (3.36)–(3.37) and (4.1) will differ. After an appropriate rescaling (and taking the complex conjugate in (4.1) if $\alpha_3 - \alpha_4 < 0$, and changing the sign of $\alpha$ and $d$ if $\alpha_3 + \alpha_8 < 0$) the standing waves obey an equation of the form
\[ C_\tau = i\alpha C_{xx} - (1 + id)C + i\left[|C|^2 + (\Lambda - 1)\langle|C|^2\rangle\right]C + \mu\langle\bar C\rangle, \tag{4.2} \]
\[ C(x+1, \tau) = C(x, \tau). \tag{4.3} \]
Thus the relative size of the nonlinear terms is measured by the single parameter $\Lambda \equiv 1 - (\alpha_4 + \alpha_8)/(\alpha_3 + \alpha_8)$.

The results of solving equations (3.36)–(3.38) and (4.2)–(4.3) for identical parameter values are summarized in the bifurcation diagrams shown in Fig. 4.1. These are constructed by noting that equations (4.2)–(4.3) imply
\[ \frac{d}{d\tau}\|C\|^2_{L^2} = -2\|C\|^2_{L^2} + \mu\left(\langle\bar C\rangle^2 + \text{c.c.}\right), \]
so that successive intersections of a trajectory with the hypersurface
\[ \|C\|^2_{L^2} = \tfrac{1}{2}\mu\left(\langle\bar C\rangle^2 + \text{c.c.}\right) \]
are always well defined. In fact this surface contains all the steady states, while each periodic trajectory intersects it at least twice in each period, at the turning points in $\|C\|_{L^2}$. In the bifurcation diagrams we plot successive maxima of $\|C\|_{L^2}$ at each value of $\mu$ after transients have died away. In the general case ($|\tilde A_0| \neq |\tilde B_0|$) we likewise plot the outward intersections with the hypersurface
\[ \|\tilde A_0\|^2_{L^2} + \|\tilde B_0\|^2_{L^2} = \mu\left(\langle\bar{\tilde A}_0\rangle\langle\bar{\tilde B}_0\rangle + \text{c.c.}\right), \]


corresponding to maxima in $\|\tilde A_0\|^2_{L^2} + \|\tilde B_0\|^2_{L^2}$. Although this procedure for generating bifurcation diagrams is convenient for most purposes it suffers from the disadvantage that it is insensitive to phase drift. Thus additional diagnostics are necessary to identify such drifts, as discussed in detail in Martel, Knobloch, and Vega [2000].

In both cases the first instability produces uniform steady solutions and these subsequently lose stability in a symmetry-breaking pitchfork bifurcation, giving rise to time-independent but spatially nonuniform states. Both these bifurcations preserve the identity $\tilde A_0 = \tilde B_0$ and hence are common to both sets of equations. Both also preserve the spatial reflection symmetry $R : x \to -x$. In the case shown in Figs. 4.1a,b the resulting nonuniform but reflection-symmetric states subsequently undergo a Hopf bifurcation and produce a branch of oscillatory solutions. Shortly thereafter chaos sets in, interspersed with nonuniform temporally periodic motion. Some of the observed transitions are the result of crises while others appear to be due to period-doubling cascades. Observe that the details of this behavior differ in the two figures, indicating that the invariant subspace $\tilde A_0 = \tilde B_0$ does not remain attracting for all values of $\mu$. In the following we restrict attention to the origin of this more complicated dynamical behavior in Eqs. (4.2)–(4.3).

Analysis of the system (4.2)–(4.3) is complicated by the absence of wavenumber-dependent dissipation: the damping is identical for all modes. As a result the theorem of Duan, Ly, and Titi [1996] establishing the existence of a finite-dimensional inertial manifold for a nonlocal Ginzburg–Landau equation of the same type and with the same boundary conditions does not apply. Nonetheless, for the weakly damped nonlinear Schrödinger (NLS) equation with direct external forcing, Ghidaglia [1988] was able to demonstrate the existence of a weak finite-dimensional attractor.
This result was improved upon by Wang [1995] who used an energy equation to obtain strong convergence, showing that the attractor is in fact a strong, ﬁnite-dimensional, global attractor. Subsequent work (see, e.g., Goubet [1996]; Oliver and Titi [1998]) has dealt with the task of proving additional regularity properties of the attractor. In particular, Oliver and Titi [1998] showed that the global attractor for the weakly damped driven (but local) NLS equation with direct forcing is analytic, indicating that the Fourier expansion of a solution on the attractor converges exponentially fast (as the number of terms is increased) to the exact solution. We believe that these properties continue to hold for the nonlocal equation with parametric forcing, and explain why a simple two-mode truncation of the PDE discussed next describes the PDE dynamics so well over a large range of parameter values.
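The norm balance $\frac{d}{d\tau}\|C\|^2_{L^2} = -2\|C\|^2_{L^2} + \mu(\langle\bar C\rangle^2 + \text{c.c.})$ used earlier in this section to construct the bifurcation diagrams follows from (4.2) because the dispersive and cubic terms contribute only imaginary parts to $\langle\bar CC_\tau\rangle$. The sketch below checks this for an arbitrary trigonometric polynomial. The parameter values and Fourier amplitudes are hypothetical, and spatial means are computed as uniform-grid averages, which are exact for the low-order modes used.

```python
import cmath
import math

N = 64                                    # grid points on 0 <= x < 1
X = [j / N for j in range(N)]
# Hypothetical parameters and Fourier amplitudes (modes n = -2..2).
alpha, d, lam, mu = 0.1, 0.3, 2.0 / 3.0, 1.7
amps = {0: 0.8 + 0.2j, 1: 0.3 - 0.1j, -1: 0.1 + 0.25j, 2: -0.05 + 0.1j, -2: 0.02j}

def mean(values):
    return sum(values) / len(values)

def field(x):
    return sum(c * cmath.exp(2j * math.pi * n * x) for n, c in amps.items())

def field_xx(x):
    return sum(-(2.0 * math.pi * n) ** 2 * c * cmath.exp(2j * math.pi * n * x)
               for n, c in amps.items())

C = [field(x) for x in X]
Cxx = [field_xx(x) for x in X]
mean_sq = mean([abs(c) ** 2 for c in C])            # <|C|^2> = ||C||^2 on [0, 1)
mean_conj = mean([c.conjugate() for c in C])        # <C-bar>

# Right-hand side of (4.2), evaluated pointwise on the grid.
C_tau = [1j * alpha * cxx - (1 + 1j * d) * c
         + 1j * (abs(c) ** 2 + (lam - 1.0) * mean_sq) * c + mu * mean_conj
         for c, cxx in zip(C, Cxx)]

# d/dtau ||C||^2 = < C-bar C_tau + c.c. >
lhs = mean([2.0 * (c.conjugate() * ct).real for c, ct in zip(C, C_tau)])
rhs = -2.0 * mean_sq + mu * 2.0 * (mean_conj ** 2).real

if __name__ == "__main__":
    print(lhs, rhs)    # the two numbers should agree to round-off
```

Note that neither $\alpha$, $d$ nor $\Lambda$ enters the balance, which is why the section surface is defined independently of these parameters.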

4.1 Two-Mode Model and Basic Solutions

In view of the fact that both the primary and secondary bifurcations preserve the reﬂection symmetry R : x → −x we focus on the class of

Figure 4.1. Bifurcation diagrams for the two systems described in (3.36)–(3.38) and (4.2)–(4.3), with α = 0.1, d = 0 and Λ = 2/3. Courtesy C. Martel. [Two panels, (a) and (b), with vertical axes labeled Cmax and A, Bmax (range 0 to 3) and horizontal axis µ from 0 to 5.5.]


reflection-invariant states of the form
\[ C(x, \tau) = \frac{1}{\sqrt 2}\,c_0(\tau) + \sum_{n=1}^{\infty} c_n(\tau)\cos(2\pi nx). \tag{4.7} \]
Projection of equation (4.2) onto the first two modes then leads to the dynamical system (Higuera, Porter, and Knobloch [2002])
\[ \dot c_0 = -(1 + id)c_0 + i\Lambda(|c_0|^2 + |c_1|^2)\frac{c_0}{2} + i(\bar c_0c_1 + \bar c_1c_0)\frac{c_1}{2} + \mu\bar c_0, \tag{4.8} \]
\[ \dot c_1 = -(1 + id_1)c_1 + i\Lambda(|c_0|^2 + |c_1|^2)\frac{c_1}{2} + i(\bar c_0c_1 + \bar c_1c_0)\frac{c_0}{2} + i|c_1|^2\frac{c_1}{4}, \tag{4.9} \]
where $d_1 = d + 4\pi^2\alpha$. These equations are equivariant under the operations
\[ R_0 : (c_0, c_1) \to (-c_0, c_1), \qquad R_1 : (c_0, c_1) \to (c_0, -c_1), \tag{4.10} \]

where $R_0 = \hat\kappa T_{1/2}$ and $R_1 = T_{1/2}$, and $T_{1/2} : x \to x + \tfrac{1}{2}$, $C \to C$, and $\hat\kappa : C \to -C$ represent two symmetries of the original equations (4.2)–(4.3). These actions generate the group $D_2$. Eqs. (4.8)–(4.9) contain three types of fixed points whose properties are summarized below. In what follows we set $d = 0$ (both for simplicity and for comparison with Martel, Knobloch, and Vega [2000]) and write
\[ c_0 \equiv x_0 + iy_0, \qquad c_1 \equiv x_1 + iy_1, \]
where $x_0$, $x_1$, $y_0$ and $y_1$ are all real.

Trivial state (O): This solution has the full symmetry $D_2$. Its stability is determined by the four eigenvalues $\pm\mu - 1$ and $-1 \pm i\omega$, where $\omega \equiv 4\pi^2\alpha$. The first two give the growth rate of perturbations within the invariant plane $c_1 = 0$, while the complex conjugate pair describes perturbations within the invariant plane $c_0 = 0$. When $\mu = 1$ there is a supercritical pitchfork bifurcation giving rise to a branch of spatially uniform states $U$: $c_0 \neq 0$, $c_1 = 0$; note that there are no fixed points of the form $c_0 = 0$, $c_1 \neq 0$.

Uniform steady states (U): These solutions take the form $c_0 \neq 0$, $c_1 = 0$, where
\[ |\Lambda|\,|c_0|^2 = 2\sqrt{\mu^2 - 1}, \qquad \cos 2\vartheta = 1/\mu, \tag{4.11} \]
and $c_0 = |c_0|e^{i\vartheta}$, and are invariant under $R_1$ but not under $R_0$; when necessary we distinguish between the two $R_0$-related branches using the notation $U_\pm$ (the $\pm$ reflects the sign of the $x_0$ coordinate). Since $d = 0$ these solutions are always stable to perturbations within the plane $c_1 = 0$ with the corresponding eigenvalues, $s$, satisfying
\[ (s + 1)^2 - (5 - 4\mu^2) = 0. \tag{4.12} \]


Note that these eigenvalues are complex when $\mu > \sqrt 5/2$. Stability with respect to the mode $c_1$ is described by the characteristic equation
\[ (s + 1)^2 + \omega^2 + \mu^2 - 1 + \frac{2}{\Lambda}(\mu^2 - 1) - \frac{2\omega(\Lambda + 1)}{|\Lambda|}\sqrt{\mu^2 - 1} = 0. \tag{4.13} \]
Thus, when $s = 0$, the uniform states $U$ undergo a pitchfork bifurcation which breaks the $R_1$ symmetry and produces time-independent nonuniform states with $n = 1$ (NU). Note that because of the form of Eqs. (4.12)–(4.13) Hopf bifurcations are not possible.

Nonuniform steady states (NU): The fixed points NU have no symmetry; consequently, the NU states come in quartets, related by the actions of $R_0$, $R_1$, and $R_0R_1$. Depending on the value of $\Lambda$, the NU states may become unstable, with increasing $\mu$, at either a saddle-node or a Hopf bifurcation. If a Hopf bifurcation occurs it generates four symmetry-related periodic orbits. The fate of these and other time-dependent solutions is investigated in the following section.
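The statements of this section about the $U$ states can be spot-checked numerically. The sketch below implements the right-hand side of (4.8)–(4.9), builds the $U$ state from (4.11), and compares finite-difference Jacobian blocks with the trace and determinant implied by (4.12) and (4.13). The function names and sample parameters are illustrative assumptions. The sign of $\vartheta$ is not given in (4.11); taking $\sin 2\vartheta$ with the sign of $\Lambda$ is inferred here from the fixed-point condition of (4.8).

```python
import cmath
import math

def rhs(c0, c1, mu, lam, d=0.0, alpha=0.1):
    """Right-hand side of the two-mode system (4.8)-(4.9)."""
    d1 = d + 4.0 * math.pi**2 * alpha
    n2 = abs(c0)**2 + abs(c1)**2
    cross = (c0.conjugate() * c1 + c1.conjugate() * c0).real
    dc0 = -(1 + 1j*d)*c0 + 0.5j*lam*n2*c0 + 0.5j*cross*c1 + mu*c0.conjugate()
    dc1 = -(1 + 1j*d1)*c1 + 0.5j*lam*n2*c1 + 0.5j*cross*c0 + 0.25j*abs(c1)**2*c1
    return dc0, dc1

def uniform_state(mu, lam):
    """c0 on the U branch from (4.11), valid for d = 0 and mu > 1."""
    r = math.sqrt(2.0 * math.sqrt(mu**2 - 1.0) / abs(lam))
    theta = 0.5 * math.copysign(math.acos(1.0 / mu), lam)
    return r * cmath.exp(1j * theta)

def block_jacobian(mu, lam, block, h=1e-6):
    """2x2 real Jacobian at U w.r.t. (Re, Im) of c0 (block=0) or c1 (block=1)."""
    c0 = uniform_state(mu, lam)

    def f(u, v):
        z = complex(u, v)
        out = rhs(z, 0j, mu, lam) if block == 0 else rhs(c0, z, mu, lam)
        return out[block]

    u0, v0 = (c0.real, c0.imag) if block == 0 else (0.0, 0.0)
    fu = (f(u0 + h, v0) - f(u0 - h, v0)) / (2.0 * h)
    fv = (f(u0, v0 + h) - f(u0, v0 - h)) / (2.0 * h)
    return [[fu.real, fv.real], [fu.imag, fv.imag]]

def trace_det(J):
    return J[0][0] + J[1][1], J[0][0]*J[1][1] - J[0][1]*J[1][0]

def det_from_412(mu):
    # (s+1)^2 - (5 - 4 mu^2) = 0  <=>  s^2 + 2s + 4(mu^2 - 1) = 0
    return 4.0 * (mu**2 - 1.0)

def det_from_413(mu, lam, alpha=0.1):
    w, z = 4.0 * math.pi**2 * alpha, math.sqrt(mu**2 - 1.0)
    return 1.0 + w**2 + z**2 + (2.0/lam)*z**2 - 2.0*w*(lam + 1.0)*z/abs(lam)

if __name__ == "__main__":
    mu, lam = 1.2, 2.0 / 3.0
    print(rhs(uniform_state(mu, lam), 0j, mu, lam))
    print(trace_det(block_jacobian(mu, lam, 0)), det_from_412(mu))
    print(trace_det(block_jacobian(mu, lam, 1)), det_from_413(mu, lam))
```

Both blocks have trace $-2$, which is the algebraic reason why (4.12)–(4.13) take the form $(s+1)^2 = \text{const}$ and exclude Hopf bifurcations from the $U$ branch.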

4.2 Numerical Results

In this section we present the results of a careful numerical investigation of Eqs. (4.8)–(4.9) using a combination of AUTO (Doedel, Champneys, Fairgrieve, Kuznetsov, Sandstede, and Wang [1997]) and XPPAUT (Ermentrout [2000]). In addition to the simple bifurcations mentioned above these equations can exhibit extremely complicated dynamics. We find that over a large range of parameters this complex behavior is organized by a codimension-one heteroclinic connection between the uniform and trivial states, a global bifurcation which can be best understood in the context of a two-parameter study. We therefore set $d = 0$, $\alpha = 0.1$ and vary $\Lambda$ along with the forcing amplitude $\mu$. Fig. 4.2 shows the important local bifurcation sets in the $(\mu, \Lambda)$ plane: the $n = 1$ neutral stability curve (labeled SB) and the loci of Hopf and saddle-node (SN) bifurcations on the NU branch which bifurcates from the U state along the neutral curve. Fig. 4.3 shows the bifurcation diagrams obtained on traversing this plane in the direction of increasing $\mu$ at several different (but fixed) values of $\Lambda$. Fig. 4.2 reveals the presence of two singularities. There is a degeneracy when $\Lambda = 0$: at this value of $\Lambda$ spatially uniform states exist only at $\mu = 1$ and at no other value of $\mu$. It is thus not surprising that there are many bifurcation sets emanating from the singular point $(\mu, \Lambda) = (1, 0)$. In the present problem there is, in addition, evidence of singular behavior at $\Lambda \approx -1.1428$, where the amplitude of the NU branch (but not the U branch) becomes infinite. As $\Lambda$ decreases toward this value the two saddle-node bifurcations on the NU branch (at $\mu \sim 2.33$ and $\mu \sim 5.67$) occur at roughly constant $\mu$ values but at larger and larger amplitude (see Fig. 4.3g). When $\Lambda < -1.1428$ these two saddle-node bifurcations no longer occur at all (see Fig. 4.3h).
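The neutral stability curve labeled SB can be obtained in closed form from (4.13): setting $s = 0$ and writing $z = \sqrt{\mu^2 - 1}$ turns the characteristic equation into a quadratic in $z$. The sketch below solves it for a given $\Lambda$. The function name is hypothetical and the default $\alpha = 0.1$ follows the parameter choice stated above.

```python
import math

def sb_thresholds(lam, alpha=0.1):
    """Values of mu at which (4.13) admits s = 0, for d = 0 and lam != 0."""
    w = 4.0 * math.pi**2 * alpha
    # With z = sqrt(mu^2 - 1), s = 0 in (4.13) reads a*z^2 + b*z + c = 0.
    a = 1.0 + 2.0 / lam
    b = -2.0 * w * (lam + 1.0) / abs(lam)
    c = 1.0 + w**2
    if a == 0.0:
        zs = [-c / b] if b != 0.0 else []
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0.0:
            return []                      # no symmetry-breaking point
        root = math.sqrt(disc)
        zs = [(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)]
    return sorted(math.sqrt(1.0 + z * z) for z in zs if z >= 0.0)

if __name__ == "__main__":
    for lam in (0.9, 2.0 / 3.0, 2.0 / 9.0, -2.0 / 9.0):
        print(lam, sb_thresholds(lam))
```

For the sample values $\Lambda = 2/3$ and $\Lambda = -2/9$ the quadratic has two and one admissible roots, respectively, so within this two-mode description a second (rightmost) SB point exists on the $U$ branch only for the positive value.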

Figure 4.2. Local bifurcation sets with d = 0 and α = 0.1: symmetry-breaking bifurcation (SB) on the U branch, and Hopf and saddle-node (SN) bifurcations on the NU branch. Courtesy M. Higuera and J. Porter. [Plot in the (µ, Λ) plane, with µ from 2 to 8 on the horizontal axis and Λ from −1 to 3 on the vertical axis; curves labeled SB, Hopf and SN.]

The bifurcation diagrams of Fig. 4.3 show the U and NU branches, as well as recording the fate of the branches of periodic orbits (when present) generated in Hopf bifurcations on the NU branch (Figs. 4.3a-f). For typical parameter values the NU branch is S-shaped, with the Hopf bifurcations occurring on the lower part. For example, a cut (not shown) at $\Lambda = 1$ barely crosses the locus of Hopf bifurcations but does so twice in quick succession indicating the presence of two Hopf bifurcations back to back (see Fig. 4.2); connecting these bifurcations is a stable branch of periodic orbits. With $\Lambda = 0.9$ (Fig. 4.3a) there is a period-doubling bifurcation on this original branch but the cascade (not shown) is incomplete (there are just two period-doublings followed by two reverse period-doublings). Bifurcation “bubbles” of this type are familiar from problems related to the Shil’nikov bifurcation (Knobloch and Weiss [1981]; Glendinning and Sparrow [1984]). For $\Lambda = 2/3$ (Fig. 4.3b), the value corresponding to Fig. 4.1, there is (presumably) a complete period-doubling cascade and one can easily find a variety of periodic and chaotic attractors (see Fig. 4.4). Evidence that this cascade is not the whole story, however, is provided in Fig. 4.3c. The figure shows that, for $\Lambda = 0.645$, the branch of periodic orbits has split apart, each half terminating in a Shil’nikov-type homoclinic connection with the uniform state. The abruptness of this transition suggests the presence of other periodic orbits with which the original periodic branch is colliding. This interpretation is further supported by a second abrupt transition which occurs by $\Lambda = 0.633$ (Fig. 4.3d); the branch of periodic states produced in the second Hopf bifurcation (at $\mu \approx 4.8$) now terminates in a

[Figure 4.3: eight panels of |c| versus µ: (a) Λ = 0.9, (b) Λ = 2/3, (c) Λ = 0.645, (d) Λ = 0.633, (e) Λ = 2/9, (f) Λ = −2/9, (g) Λ = −1.1425, (h) Λ = −1.2. Curve labels: U, NU, SB, Hopf.]

Figure 4.3. Series of bifurcation diagrams, $|c| \equiv (|c_0|^2 + |c_1|^2)^{1/2}$ versus µ, for different values of Λ. Stable (unstable) solutions are rendered with thick (thin) lines. Branches of periodic solutions originating in Hopf bifurcations are also shown. Courtesy M. Higuera and J. Porter.

homoclinic bifurcation on the NU states rather than the U states. As Λ is decreased even further (see Fig. 4.3e) the first homoclinic bifurcation (with the U state) moves very close to the initial Hopf bifurcation, occurring at µ ≈ 1.112 when Λ = 2/9, while the second homoclinic bifurcation (on the NU branch) moves closer to the rightmost saddle-node bifurcation. The branch of periodic solutions corresponding to the former is almost invisible on the scale of the figure. A comparison of Figs. 4.3e and 4.3f shows that when Λ is small in magnitude the bifurcation diagrams on either side of Λ = 0 are qualitatively similar. The main differences are the change in scale (larger µ values for negative Λ) and the absence of the rightmost symmetry-breaking (SB) bifurcation when Λ < 0: although the NU branch comes very close to the U branch for large µ the two branches remain

[Figure 4.4: panels (a)–(c), attractors plotted in the (x0, y1) plane.]

Figure 4.4. Attractors for d = 0, α = 0.1, Λ = 2/3 and (a) µ = 1.86, (b) µ = 2.2, (c) µ = 2.5. Courtesy M. Higuera and J. Porter.

distinct, in contrast to the situation for Λ > 0. It turns out that the interesting periodic and chaotic behavior which one finds for values of Λ such as those used in Figs. 4.3b-f is associated with a heteroclinic bifurcation involving both O and U. The bifurcation sets for this global connection, U → O → U, are shown in Fig. 4.5. In this figure there are three curves of heteroclinic bifurcations which emerge

[Figure 4.5: bifurcation sets in the (µ, Λ) plane, with curves labeled SB, SN, Hopf, and Het and the cut Λ = 2/9 marked; the inset covers roughly 2.54 ≤ µ ≤ 2.62 and 0.18 ≤ Λ ≤ 0.195.]

Figure 4.5. Heteroclinic (Het) bifurcation sets (solid lines) representing the cycle U → O → U . The inset shows an enlargement of one of these curves near its termination in the codimension-two heteroclinic cycle U → NU → O → U . Note that the cut Λ = 2/9 passes through four heteroclinic bifurcations. Courtesy M. Higuera and J. Porter.

from (µ, Λ) = (1, 0) into the region Λ > 0 and three that emerge into


the region Λ < 0. For Λ > 0 two of these connect up smoothly forming a loop while the third oscillates back and forth an infinite number of times before terminating in a codimension-two heteroclinic bifurcation point at (µ, Λ) ≈ (2.5803, 0.1877). The heteroclinic cycle at this point involves all three types of fixed points: O, U, and the NU state between the two saddle-node bifurcations on the NU branch. For Λ < 0 the three curves of heteroclinic bifurcations remain separate (the upper two are almost indistinguishable on the scale of the figure). Two of them continue out to large values of µ (they have been followed to µ > 50) while the third wiggles back and forth before terminating in another codimension-two heteroclinic cycle involving O, U, and NU. This point, (µ, Λ) ≈ (5.065, −0.159), is marked in Fig. 4.5 by a small circle; the wiggles are not visible on this scale. This point differs from the previous codimension-two point for Λ > 0 in a fundamental way because it involves the small amplitude NU state (after the first Hopf bifurcation) whose stable and unstable manifolds are each two-dimensional. Thus the codimension-two heteroclinic cycle for Λ > 0 involves three points with one-dimensional unstable manifolds; the connection O → U is structurally stable (due to the invariance of the uniform plane) while the connections U → NU and NU → O are each of codimension-one. For Λ < 0 the connections O → U and NU → O are both structurally stable but the third, U → NU, is itself of codimension two. Fig. 4.5 also shows the cut Λ = 2/9. This cut corresponds to the bifurcation diagram of Fig. 4.3e and crosses the heteroclinic bifurcation set four times. We use this Λ value to investigate further the dynamics associated with this bifurcation. Along this path the first Hopf bifurcation (at µ ≈ 1.106) occurs almost immediately after the birth of the NU branch (see Fig. 4.5). Between this Hopf bifurcation and the leftmost saddle-node bifurcation on the NU branch at µ ≈ 2.674 there are no stable fixed points; in this region one can easily find chaotic attractors, such as those shown in Fig. 4.6, as well as a variety of interesting periodic solutions (see Fig. 4.7).
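The codimension counts quoted above for the individual connections are consistent with a standard transversality count. The following is a minimal sketch, assuming hyperbolic fixed points (so that the stable dimension is the phase-space dimension minus the unstable dimension) and assuming that, within the invariant uniform plane, O has a one-dimensional unstable manifold and U is a sink. The function name is illustrative and does not come from the source.

```python
def connection_codimension(phase_dim, unstable_dim_source, stable_dim_target):
    """Generic codimension of a connecting orbit source -> target.

    W^u(source) and W^s(target) generically meet in a set of dimension
    unstable_dim_source + stable_dim_target - phase_dim. A connecting
    orbit needs an intersection of dimension at least one, so the number
    of parameters that must be tuned is the shortfall (never negative).
    """
    generic_dim = unstable_dim_source + stable_dim_target - phase_dim
    return max(0, 1 - generic_dim)


if __name__ == "__main__":
    # Lambda > 0: U and NU each have a 1D unstable (3D stable) manifold in 4D.
    print("U -> NU, Lambda > 0:", connection_codimension(4, 1, 3))
    print("NU -> O, Lambda > 0:", connection_codimension(4, 1, 3))
    # Lambda < 0: NU has 2D stable and 2D unstable manifolds.
    print("U -> NU, Lambda < 0:", connection_codimension(4, 1, 2))
    print("NU -> O, Lambda < 0:", connection_codimension(4, 2, 3))
    # O -> U counted inside the 2D invariant uniform plane (assumption:
    # O is a saddle and U a sink within that plane).
    print("O -> U, in plane:   ", connection_codimension(2, 1, 2))
```

Counting O → U inside the invariant plane rather than in the full four-dimensional space is what makes that connection structurally stable in this sketch; the count in the full space would not capture the role of the invariant subspace.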

[Figure 4.6: panels (a)–(c), chaotic attractors plotted in the (x0, y1) plane.]

Figure 4.6. Chaotic attractors for d = 0, α = 0.1, Λ = 2/9 and (a) µ = 1.51, (b) µ = 2.0, (c) µ = 2.54. Courtesy M. Higuera and J. Porter.

[Figure 4.7: panels (a)–(c), periodic attractors plotted in the (x0, y1) plane.]

Figure 4.7. Z2-symmetric periodic attractors for d = 0, α = 0.1, Λ = 2/9 and (a) µ = 1.41, R0R1-symmetry; (b) µ = 1.64, R1-symmetry; (c) µ = 1.875, R0-symmetry. Courtesy M. Higuera and J. Porter.

Notice that the periodic orbits in Fig. 4.7 have Z2 symmetry, i.e., they are invariant under one of the reflections: R0, R1, R0R1. Although these particular periodic orbits are somewhat exotic (in the sense that they do not belong to one of the basic families of periodic solutions analyzed below but resemble something like the ‘multi-pulse’ orbits identified in perturbations of the Hamiltonian problem) there are also sequences of simpler periodic orbits which come close to both O and U. These orbits, characterized by their symmetry (or lack thereof) and by the number of oscillations they experience near O, are related in a fundamental way to the heteroclinic connection U → O → U. A bifurcation diagram obtained by following many of these solutions numerically is displayed in Fig. 4.8, along with four representative orbits. This figure shows the period (half-period for symmetric orbits) as a function of µ. Two of the branches shown (the ones with lowest period) close on themselves to form isolas but most of the solutions terminate in homoclinic (U± → U±) gluing bifurcations or heteroclinic (U± → U∓) symmetry-switching bifurcations. This is evident from the dramatic increase in period which occurs as the periodic orbits approach the fixed points. In the gluing bifurcations two asymmetric periodic orbits come together (using U+ or U−) to create a single R1-symmetric periodic orbit. In the symmetry-switching bifurcations two R0-symmetric periodic orbits transform (using both U+ and U−) into two R0R1-symmetric periodic orbits. In this second case the symmetry neither increases nor decreases but switches from one Z2 symmetry to another. Under appropriate conditions each of these processes is associated, as in the usual Shil’nikov scenario (Glendinning and Sparrow [1984]; Wiggins [1988]), with cascades of saddle-node and either period-doubling or symmetry-breaking bifurcations; the Z2-symmetric orbits must undergo symmetry-breaking prior to any period-doubling bifurcations since such orbits do not (generically) have negative Floquet multipliers (Swift and

[Figure 4.8: upper panel, period versus µ for Λ = 2/9 with points (1), (2), and (3) marked; middle row, orbit panels in the (x0, y1) plane near heteroclinic bifurcation (1) at µ = 1.28, 1.315, 1.445, and 1.462 (labels A, R1, R0, R0R1; groupings "Gluing" and "Symmetry-switching"); lower panel, period versus µ for 1.2 ≤ µ ≤ 1.7 with branches labeled A, R1, R0, and R0R1.]

Figure 4.8. Cascades of gluing (A + A ↔ R1) and symmetry-switching (R0 + R0 ↔ R0R1 + R0R1) bifurcations for d = 0, α = 0.1 and Λ = 2/9. These accumulate from opposite sides on the principal heteroclinic bifurcations, the first two of which, labeled (1) and (2), are shown (upper panel). At point (3) there is a homoclinic connection to NU. The lower panel shows an enlargement of the region near point (1). The diagrams show the period (half-period) of asymmetric (symmetric) periodic orbits as a function of µ. Courtesy M. Higuera and J. Porter.

Wiesenfeld [1984]). Note also that the way the two branches (e.g., an asymmetric and an R1-symmetric branch) merge with increasing period


differs from that of the corresponding Shil’nikov problem in three dimensions with symmetry (Glendinning [1984]). This is because the reflection symmetry in the latter case must be a complete inversion (Tresser [1984]; Wiggins [1988]), while in our case the relevant symmetry R1 is not (see Eq. (4.10)); in particular R1 does not act on the swirling part of the flow near U± in the plane c1 = 0. In our case the two types of branches oscillate “in phase” around the homoclinic or heteroclinic points as their period increases (cf. Fig. 4.8), while they oscillate “out of phase” in the three-dimensional case with inversion symmetry. These differences between the standard situation and ours are a direct consequence of the fact that our two-mode truncation is four-dimensional, allowing new types of connection that are not possible in three dimensions. Note that in Fig. 4.8 we have only investigated the first two of the main heteroclinic bifurcations (recall that there are four such bifurcations when Λ = 2/9) and that there are many periodic solutions (e.g., those of Fig. 4.7) which have not been shown; these may form isolas or terminate at other, subsidiary, connections. In short, the full situation is extremely complex.

4.3 Comparison with the PDE

Since it is the dynamics of the PDE (4.2)–(4.3) that are of ultimate interest, one would like to understand how faithfully their behavior is represented by a truncated set of ordinary differential equations (ODEs). While there is no a priori reason to assume that a finite number of modes can accurately capture the effect of the nonlinear terms, it turns out that in many problems they do (Knobloch, Proctor, and Weiss [1993]; Doelman [1991]; Rucklidge and Matthews [1996]). Higuera, Porter, and Knobloch [2002] find numerically that these equations frequently have reflection-symmetric attractors (in x) and that these are well described by the restriction to the cosine subspace. In addition, the numerical simulations indicate that the influence of the higher modes is often negligible, particularly for periodic orbits and chaotic attractors which are approximately heteroclinic. Fig. 4.9 shows that the heteroclinic behavior found within the two-mode model (4.8)–(4.9) also occurs in the full PDE. To examine the influence of higher modes (n > 1) on the dynamics we have computed $|c_0|$, $|c_1|$, and $\sum_{n=2}^{N} |c_n|$ as functions of time, after first allowing transients to die away. The solutions in Fig. 4.10 represent typical chaotic attractors that can be found for Λ = 2/9 and 1.5 ≲ µ ≲ 2.8, together with the time series representing their harmonic content. Notice that in all cases the amplitude of the higher modes (bold curves in the right-hand set of panels) remains small, indicating that these modes do not play a significant role in the dynamics. While such a low-dimensional description is not unexpected for small amplitudes (i.e., near onset at µ = 1), Eqs. (4.2)–(4.3) continue to be described by the two-mode truncation even relatively far from the primary bifurcation. Notice that, e.g., for Λ = 2/9 and µ ≳ 1.875 the uniform states are

[Figure 4.9: four orbit panels in the (x0, y1) plane at µ = 1.25, 1.26, 1.33, and 1.36 (labels A, R1, R0, R0R1).]

Figure 4.9. Stable periodic orbits of the PDE (4.2)–(4.3) with different symmetries for Λ = 2/9. Gluing and symmetry-switching bifurcations, as in the ODEs, appear to be present. Courtesy M. Higuera and J. Porter.

unstable to at least two nonuniform modes and one might therefore suppose that a two-mode truncation will be of dubious validity. However, we often find that the system (4.8)–(4.9) continues to apply (see Figs. 4.10b,c). This increased range of validity is likely due to the prominence of the heteroclinic bifurcation since for orbits which are approximately heteroclinic the potentially complicated dynamics of the full PDE are controlled mainly by symmetries and by the local properties of the fixed points O and U where most time is spent; recall that O and U are the same in both the PDE (4.2)–(4.3) and the ODE model (4.8)–(4.9). Also important is the fact that due to the spatial averaging of the forcing term in Eq. (4.2) the origin is always stable with respect to nonuniform modes. The higher modes are thus quickly damped under the attracting influence of the trivial state. We conclude that the evident low-dimensional behavior of the PDE (4.2)–(4.3) is related to the presence of the heteroclinic bifurcation involving the origin and its associated cascades. Whenever one is relatively close to these bifurcations in parameter space (see Fig. 4.5) the dynamics will typically be dominated by the many periodic and chaotic attractors associated with them. For parameter values outside of this regime (e.g., µ ≳ 3 when Λ = 2/9) the dynamics are no longer heteroclinic and hence are more likely to involve other modes. When Λ = 2/3, the value used in Martel, Knobloch, and Vega [2000] for Fig. 4.1, the heteroclinic bifurcation does not actually occur (see Fig. 4.5), but the dynamics may nonetheless be dominated by the various periodic orbits and related chaotic attractors which exist in nearby regions of parameter space; gluing bifurcations still occur even though the full cascade does not. Fig. 4.11 shows several chaotic attractors for Λ = 2/3 demonstrating that the dynamics are again dominated by the first two modes. As for Λ = 2/9, this low-dimensional behavior does not hold for all values of µ and the two-mode ODE model eventually fails. But in contrast to the case Λ = 2/9, when Λ = 2/3 this failure can arise for two reasons. The first failure of Eqs. (4.8)–(4.9) is due to an R symmetry-breaking bifurcation, which occurs at µ ∼ 3.4. In this case it is not the two-mode nature of the model

[Figure 4.10: panels (a)–(c); left, attractor in the (x0, y1) plane; right, $|c_0|$, $|c_1|$, and $\sum_{n=2}^{N} |c_n|$ versus τ for 200 ≤ τ ≤ 210.]

Figure 4.10. Relative importance of the Fourier components for Λ = 2/9: (a) chaotic attractor at µ = 1.51, (b) at µ = 2.0, (c) at µ = 2.8. The curves show $|c_0|$, $|c_1|$, and $\sum_{n=2}^{N} |c_n|$ (bold). Courtesy M. Higuera and J. Porter.

that becomes inappropriate (the uniform state does not lose stability to the n = 2 mode until µ ≈ 4.093) but the restriction to the cosine subspace. Fig. 4.12a shows a solution that possesses low-dimensional character but is not reflection-symmetric and is therefore not contained within the system (4.8)–(4.9). After a narrow interval (3.4 ≲ µ ≲ 3.46) the dynamics recover their reflection-symmetric character, and subsequently (see Fig. 4.1) a second window of stable uniform states appears for 3.5 ≲ µ ≲ 4.3. At µ ∼ 4.3 the system becomes abruptly chaotic, with many modes partaking in the dynamics. This situation, however, does not persist uniformly as µ increases further. For example, at µ = 4.65 the trajectories spend a long

[Figure 4.11: panels (a)–(c); left, attractor in the (x0, y1) plane; right, $|c_0|$, $|c_1|$, and $\sum_{n=2}^{N} |c_n|$ versus τ for 150 ≤ τ ≤ 160.]

Figure 4.11. Relative importance of the different Fourier components when Λ = 2/3 for chaotic attractors at: (a) µ = 1.85, (b) µ = 1.925, and (c) µ = 3.2. The curves show $|c_0|$, $|c_1|$, and $\sum_{n=2}^{N} |c_n|$. Courtesy M. Higuera and J. Porter.

time near the invariant even subspace (cn = 0 if n is odd), occasionally coming under the influence of unstable periodic orbits in this subspace and being briefly ejected from the even subspace (see Fig. 4.13). These excursions are associated with episodic phase drift of the solution (type I drift in the terminology of Martel, Knobloch, and Vega [2000]). This interesting behavior is reminiscent of the so-called blowout bifurcation (see, e.g., Ashwin, Buescu, and Stewart [1996]). In the present case the attractor is completely contained in the even subspace (with dynamics dominated by the first two even modes, n = 0, 2) over a moderately large interval, 5.0 ≲ µ ≲ 6.5, but loses stability, apparently in the above manner, as µ decreases below µ ≈ 5.0. We remark that blowout bifurcations provide a

[Figure 4.12: panels (a) and (b), space-time diagrams with axes x (from −0.5 to 0.5) and τ.]

Figure 4.12. Space-time diagrams corresponding to (a) a quasiperiodic attractor without reflection symmetry, Λ = 2/3 and µ = 3.4; and (b) a periodic attractor with reflection symmetry, Λ = 2/9 and µ = 3.46. Courtesy M. Higuera and J. Porter.

general mechanism by which attractors in invariant subspaces lose stability with respect to perturbations out of the subspace.
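The harmonic-content diagnostic used in this section (the time series of |c0|, |c1|, and the sum of the higher-mode amplitudes) can be sketched as follows. This is only an illustration under stated assumptions: it supposes a reflection-symmetric field sampled on a uniform grid over −1/2 ≤ x < 1/2 and expanded as C(x) = Σ c_n cos(2πnx), which may differ in normalization from the chapter's own definition of c_n. The function names are hypothetical.

```python
import math


def cosine_mode_amplitudes(samples, n_modes):
    """Return [|c_0|, ..., |c_n_modes|] for samples of C(x) on x_j = -1/2 + j/M.

    Assumes C(x) = sum_n c_n cos(2 pi n x) with complex c_n and
    n_modes < M/2, so that the discrete cosine projections are exact.
    """
    m = len(samples)
    xs = [-0.5 + j / m for j in range(m)]
    amplitudes = []
    for n in range(n_modes + 1):
        # The mean of cos^2 is 1/2 for n >= 1, hence the factor 2.
        weight = 1.0 if n == 0 else 2.0
        c_n = weight * sum(
            s * math.cos(2.0 * math.pi * n * x) for s, x in zip(samples, xs)
        ) / m
        amplitudes.append(abs(c_n))
    return amplitudes


def harmonic_content(samples, n_modes):
    """Return (|c_0|, |c_1|, sum of |c_n| for 2 <= n <= n_modes)."""
    amps = cosine_mode_amplitudes(samples, n_modes)
    return amps[0], amps[1], sum(amps[2:])


if __name__ == "__main__":
    # Synthetic snapshot with small higher modes (illustrative values only).
    grid_size = 64
    coeffs = [1.0 + 2.0j, 0.5, 0.1j, -0.05]
    snapshot = [
        sum(c * math.cos(2.0 * math.pi * n * (-0.5 + j / grid_size))
            for n, c in enumerate(coeffs))
        for j in range(grid_size)
    ]
    print(harmonic_content(snapshot, 3))
```

Applying `harmonic_content` to successive snapshots of a simulation would give time series of the kind plotted in Figs. 4.10 and 4.11; a field without reflection symmetry would also need the sine projections, which this sketch omits.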

5 Concluding Remarks

In this paper we have summarized the results of a systematic derivation of the amplitude equations describing the evolution of slowly varying wavetrains on the surface of a nearly inviscid liquid excited by small amplitude vertical vibration of its container. Because of the presence of oscillatory viscous boundary layers along the rigid boundaries and the free surface, viscous mean flows are driven in the largely inviscid interior of the fluid. These augment any inviscid mean flows that may be present and the two together interact with the parametrically excited waves producing them. This nontrivial interaction between the mean flows and the waves is a consequence of the presence of the hydrodynamic modes which decay, for Cg ≪ 1, more slowly than gravity-capillary waves, and hence are easily excited by the oscillations. The resulting equations, albeit still complex, provide a significant simplification of the original problem in that the boundary conditions are now applied at the undeformed surface, and the fast oscillation frequency

[Figure 4.13: time series of |c0|, |c2|, and |c1| for 460 ≤ τ ≤ 560.]

Figure 4.13. Norm of the first three modes versus τ for Λ = 2/3 and µ = 4.65. The thin, medium, and thick lines denote |c0|, |c2|, and |c1|, respectively. Note the episodic excitation of the mode c1. Courtesy M. Higuera and J. Porter.

associated with the vibration of the container has been eliminated. As part of the analysis explicit expressions for all the coefficients are obtained, as are explicit conditions for the validity of the resulting equations (Vega, Knobloch, and Martel [2001]). As such the resulting equations represent a novel system for the study of pattern formation and subsequent instabilities of the resulting patterns via the excitation of mean flows. In certain specific cases these equations can be simplified further. We discussed one such case, in which the mean flow decouples from the amplitude equations for the left- and right-traveling waves. The remaining equations are still not trivial, in that they are nonlocal and include both dispersion and damping, although no wavenumber-dependent dissipation. Equations of this type were studied by Martel, Knobloch, and Vega [2000] and provide perhaps the simplest description of the Faraday system in an extended domain under precisely stated conditions. It is important to emphasize that this description differs from those obtained by ad hoc procedures. In particular, the usual approach of formulating the problem as an inviscid one at leading order, and adding some damping after the fact to mimic the role of viscosity, fails on two levels: it omits the basic mechanisms that drive the (viscous) mean flow (Schlichting [1932]), and it omits the back-reaction of this flow on the waves that are responsible for it. Even the simplest


description of the Faraday system that results includes nonlocal terms in the amplitude equations whose origin can be traced to the fact that amplitude inhomogeneities are advected at the group velocity on a timescale that is much faster than the timescale on which the waves equilibrate. An additional nonlocal contribution arises from the requirement that mass be conserved (Pierce and Knobloch [1994]). Since the Reynolds number of the associated flow can be (indeed must be) substantial, the equations for this flow must in general be solved numerically, as already done in other circumstances (Nicolás, Rivas, and Vega [1997, 1998]). A careful examination of the analysis that led us to equations (2.11)–(2.23) shows that these in fact apply under the conditions

$$k(|\psi_x| + |\psi_y|) \ll \omega, \qquad |f| + |f_x| \ll 1, \qquad L^{-1} \ll k, \tag{5.1}$$

or equivalently,

$$k(|A| + |B|) + |f^m_x| \ll 1, \qquad k|\psi^m_x| \ll \omega, \tag{5.2}$$

and the condition

$$L \ll v_g/(\delta + |d| + |\alpha_5|\mu). \tag{5.3}$$

Here $v_g$ is the (nondimensional) group velocity of the surface waves, defined in (2.14), $\alpha_5$ is given in (2.16) and we assumed that the smallest spatial scale is $k^{-1}$. The condition (5.1) can be stated succinctly as requiring that the nonlinearity be weak and the aspect ratio of the system be large compared to the nondimensional wavelength of the surface waves; the condition (5.3) requires that the terms accounting for inertia and propagation at the group velocity in the amplitude equations (2.11)–(2.12) be much larger than the remaining terms. In addition, the requirements

$$k^{3/2}(1 - S + Sk^2)^{-1/2} \ll C_g^{-1}, \qquad (1 - S)k^2 + Sk^4 \gg C_g^2, \tag{5.4}$$

or equivalently,

$$C_g \ll \omega, \qquad C_g^{1/2}\omega^{3/2} \ll 1 - S + (S\omega/C_g), \tag{5.5}$$

are imposed implicitly both on the carrier wavenumber k and on all wavenumbers associated with the (viscous) mean flow. These conditions guarantee that the thickness of the associated boundary layers will be small compared to the depth (if k ≪ 1) or compared to the wavelength (if k ≫ 1), see Fig. 2.2. Since the lowest wavenumber of the mean flow is k = 2π/L the condition (5.4) implies, in particular, that

$$(1 - S)L^{-2} + (2\pi)^2 S L^{-4} \gg C_g^2. \tag{5.6}$$

Several additional assumptions appear in the course of the analysis (Vega, Knobloch, and Martel [2001]).
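The step from (5.4) to (5.6) can be written out explicitly. The following is a sketch, assuming that the second inequality in (5.4) reads $(1 - S)k^2 + Sk^4 \gg C_g^2$ and that it is evaluated at the lowest mean-flow wavenumber $k = 2\pi/L$:

```latex
\begin{align*}
\left[(1 - S)k^2 + Sk^4\right]_{k = 2\pi/L}
  &= (2\pi)^2 (1 - S) L^{-2} + (2\pi)^4 S L^{-4} \gg C_g^2 \\
\Longleftrightarrow\quad
(1 - S) L^{-2} + (2\pi)^2 S L^{-4} &\gg \frac{C_g^2}{(2\pi)^2}.
\end{align*}
```

Since $(2\pi)^{-2} < 1$, requiring the left-hand side to be large compared with $C_g^2$ itself, as in (5.6), is sufficient for this inequality; the two forms differ only by an O(1) numerical factor on the right-hand side.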


It is evident that strictly inviscid treatments of the problem and the powerful techniques that are available for such treatments miss qualitatively important properties of vibrating systems. Similar issues arise in the theory of vibrating liquid bridges (Nicolás and Vega [1996]) and related systems (Higuera, Nicolás, and Vega [2000]), where mean flows generated in the viscous boundary layers can be used to control the amplitude of any convection that may be present. Whether the approach described here for the Faraday system will yield a quantitatively precise description of existing experiments on the Faraday system with nearly inviscid fluids (Ezerskii, Rabinovich, Reutov, and Starobinets [1986]; Douady, Fauve, and Thual [1989]; Tufillaro, Ramshankar, and Gollub [1989]; Kudrolli and Gollub [1997]) remains to be seen, however. Any experiments in a narrow annulus will suffer from effects due to oscillatory boundary layers at the lateral (radial) boundaries which are difficult to minimize. Likewise precise experiments on liquid bridges are difficult under terrestrial conditions, and stability predictions of the type given by Kruse, Mahalov, and Marsden [1999] remain to be confirmed. The relation between the type of theory described here and earlier work (Kovačič and Wiggins [1992]; Haller and Wiggins [1993, 1995a,b]) on the origin of complex dynamics in the forced weakly damped nonlinear Schrödinger equation is also of interest. This work focused on the near-Hamiltonian limit and exploited generalizations of the Mel’nikov theory to PDEs to establish the presence of a variety of multipulse orbits homoclinic or heteroclinic to a slow manifold. In contrast, our approach has focused on the dynamics substantially farther from this limit. Although much of the dynamical behavior found numerically in the nonlocal parametrically forced damped nonlinear Schrödinger equation derived here could be understood in detail using a two-mode model system, the relation of the cascades of gluing and symmetry-switching bifurcations that appear to be responsible for it to the near-Hamiltonian dynamics analyzed for this class of systems by Kovačič and Wiggins [1992] and Haller and Wiggins [1993, 1995a,b] remains to be examined. Indeed, because of the parametric nature of the forcing (and in particular the resulting symmetry C → −C) the behavior found here bears a greater resemblance to that discussed by Rucklidge and Matthews [1996] in their study of the dynamics of the shearing instability in magnetoconvection than to the damped nonlinear Schrödinger equation with direct forcing. Like our system the former has D2 symmetry and exhibits global bifurcations involving both the origin (corresponding to the conduction state) and the convective state SS. The latter state is reflection-symmetric and can undergo a pitchfork bifurcation to a tilted convection state STC. From a symmetry point of view these states play the same role as O, U and NU in our problem. The essential difference between our system and that studied by Rucklidge and Matthews lies in the fact that in our case the leading stable eigenvalues of both O and U are complex (the former in the c0 = 0 subspace, and the latter in the c1 = 0 subspace). The dynamical


behavior that results is new and is discussed in detail in Higuera, Porter, and Knobloch [2002] and Porter [2001]. Truncated Galerkin expansions of the type that led us to this behavior have, of course, also been used to study the effect of direct forcing on the sine-Gordon equation, a system closely related to ours. Here, too, the study of the finite-dimensional system proved of substantial help in understanding the PDE simulations (Bishop, Forest, McLaughlin, and Overman [1990]; McLaughlin, Overman, Wiggins, and Xiong [1996]). It should therefore not come as a complete surprise that the two-mode model constructed here captures so much of the behavior found numerically in the PDE (4.2)–(4.3) by Martel, Knobloch, and Vega [2000].

Acknowledgments: We are very grateful for long-term assistance from our colleagues M. Higuera, C. Martel, J. Nicolás and J. Porter with whom the results reported here were obtained. This work was supported in part by the National Aeronautics and Space Administration under Grant NAG3-2152 and by the Spanish Dirección General de Enseñanza Superior under Grant PB97-0556.

References

Abarbanel, H. D. I., D. D. Holm, J. E. Marsden, and T. S. Ratiu [1986], Nonlinear stability analysis of stratified fluid equilibria, Phil. Trans. Roy. Soc. London A, 318:349–409.
Ashwin, P., J. Buescu, and I. Stewart [1996], From attractor to saddle: a tale of transverse instability, Nonlinearity, 9:703–738.
Batchelor, G. K. [1967], An Introduction to Fluid Dynamics, Cambridge Univ. Press.
Bishop, A. R., M. G. Forest, D. W. McLaughlin, and E. A. Overman II [1990], A modal representation of chaotic attractors for the driven, damped pendulum chain, Phys. Lett. A, 144:17–25.
Chorin, A. and J. E. Marsden [1979], A Mathematical Introduction to Fluid Mechanics, Springer-Verlag.
Craik, A. D. D. [1982], The drift velocity of water waves, J. Fluid Mech., 116:187–205.
Davey, A., L. M. Hocking, and K. Stewartson [1974], On nonlinear evolution of three-dimensional disturbances in plane Poiseuille flow, J. Fluid Mech., 63:529–536.
Davey, A. and K. Stewartson [1974], On three-dimensional packets of surface waves, Proc. R. Soc. London, Ser. A, 338:101–110.
Doedel, E. J., A. R. Champneys, T. F. Fairgrieve, Y. Kuznetsov, B. Sandstede, and X. J. Wang [1997], AUTO 97: Continuation and bifurcation software for ordinary differential equations (available via FTP from directory pub/doedel/auto at ftp.cs.concordia.ca).
Doelman, A. [1991], Finite-dimensional models of the Ginzburg–Landau equation, Nonlinearity, 4:231–250.
Douady, S., S. Fauve, and O. Thual [1989], Oscillatory phase modulation of parametrically forced surface waves, Europhys. Lett., 10:309–315.
Duan, J., H. V. Ly, and E. S. Titi [1996], The effects of nonlocal interactions on the dynamics of the Ginzburg–Landau equation, Z. Angew. Math. Phys., 47:432–455.
Ermentrout, B. [2000], XPPAUT, Dynamical systems software with continuation and bifurcation capabilities (available via FTP from directory /pub/bardware at ftp.math.pit.edu).
Ezerskii, A. B., M. I. Rabinovich, V. P. Reutov, and I. M. Starobinets [1986], Spatiotemporal chaos in the parametric excitation of a capillary ripple, Sov. Phys. JETP, 64:1228–1236.
Fauve, S. [1995], Parametric instabilities, in G. Martínez Mekler and T. H. Seligman, editors, Dynamics of Nonlinear and Disordered Systems, pp. 67–115, World Scientific.
Ghidaglia, J. M. [1988], Finite-dimensional behaviour for weakly damped driven Schrödinger equations, Ann. Inst. H. Poincaré – Anal. Non-Linéaire, 5:365–405.
Glendinning, P. [1984], Bifurcations near homoclinic orbits with symmetry, Phys. Lett. A, 103:163–166.
Glendinning, P. and C. Sparrow [1984], Local and global behavior near homoclinic orbits, J. Stat. Phys., 35:645–696.
Goubet, O. [1996], Regularity of attractor for a weakly damped nonlinear Schrödinger equation, Appl. Anal., 60:99–119.
Haller, G. and S. Wiggins [1993], Orbits homoclinic to resonances: the Hamiltonian case, Physica D, 66:298–346.
Haller, G. and S. Wiggins [1995a], N-pulse homoclinic orbits in perturbations of resonant Hamiltonian systems, Arch. Rat. Mech. Anal., 130:25–101.
Haller, G. and S. Wiggins [1995b], Multi-pulse jumping orbits and homoclinic trees in a modal truncation of the damped-forced nonlinear Schrödinger equation, Physica D, 85:311–347.
Hansen, P. L. and P. Alstrom [1997], Perturbation theory of parametrically driven capillary waves at low viscosity, J. Fluid Mech., 351:301–344.
Henderson, D. M. and J. W. Miles [1994], Surface-wave damping in a circular cylinder with a fixed contact line, J. Fluid Mech., 275:285–299.
Higuera, M., J. A. Nicolás, and J. M. Vega [2000], Coupled amplitude-streaming flow equations for the evolution of counter-rotating, nearly-inviscid surface waves in finite axisymmetric geometries, Preprint.
Higuera, M., J. Porter, and E. Knobloch [2002], Heteroclinic dynamics in the nonlocal parametrically driven Schrödinger equation, Physica D, 162:155–187.
Holm, D. D., J. E. Marsden, T. S. Ratiu, and A. Weinstein [1985], Nonlinear stability of fluid and plasma equilibria, Phys. Rep., 123:1–116.
Holm, D. D., J. E. Marsden, and T. S. Ratiu [1986], Nonlinear stability of the Kelvin–Stuart cat’s eyes flow, in Lects. in Appl. Math., 23:171–186.
Kakutani, T. and K. Matsuuchi [1975], Effect of viscosity on long gravity waves, J. Phys. Soc. Japan, 39:237–246.
Knobloch, E. and R. Pierce [1998], On mean flows associated with travelling water waves, Fluid Dyn. Res., 22:61–71.
Knobloch, E., M. R. E. Proctor, and N. O. Weiss [1993], Finite-dimensional description of doubly diffusive convection, in Turbulence in Fluid Flows: A Dynamical Systems Approach, G. R. Sell, C. Foias, and R. Temam (eds), Springer-Verlag, New York, IMA Volumes in Mathematics and its Applications 55, pp. 59–72.
Knobloch, E. and N. O. Weiss [1981], Bifurcations in a model of double-diffusive convection, Phys. Lett. A, 85:127–130.
Kovačič, G. and S. Wiggins [1992], Orbits homoclinic to resonances, with an application to chaos in a model of the forced and damped sine-Gordon equation, Physica D, 57:185–225.
Kruse, K.-P., A. Mahalov, and J. E. Marsden [1999], On the Hamiltonian structure and three-dimensional instabilities of rotating liquid bridges, Fluid Dyn. Res., 24:37–59.
Kudrolli, A. and J. P. Gollub [1997], Patterns and spatio-temporal chaos in parametrically forced surface waves: A systematic survey at large aspect ratio, Physica D, 97:133–154.
Leibovich, S. [1983], On wave-current interaction theories of Langmuir circulations, Ann. Rev. Fluid Mech., 15:391–427.
Lewis, D., J. E. Marsden, R. Montgomery, and T. S. Ratiu [1986], The Hamiltonian structure for dynamic free boundary problems, Physica D, 18:391–404.
Longuet-Higgins, M. S. [1953], Mass transport in water waves, Phil. Trans. R. Soc. London, Ser. A, 245:535–581.
Marsden, J. E. and P. J. Morrison [1984], Noncanonical Hamiltonian field theory and reduced MHD, Contemp. Math., 28:133–150.
Martel, C. and E. Knobloch [1997], Damping of nearly inviscid water waves, Phys. Rev. E, 56:5544–5548.
Martel, C., E. Knobloch, and J. M. Vega [2000], Dynamics of counterpropagating waves in parametrically forced systems, Physica D, 137:94–123.
McLaughlin, D. W., E. A. Overman II, S. Wiggins, and C. Xiong [1996], Homoclinic orbits in a four-dimensional model of a perturbed NLS equation: A geometric singular perturbation study, in Dynamics Reported, vol. 5, Springer-Verlag, New York; p. 190.
Miles, J. W. [1993], On Faraday waves, J. Fluid Mech., 248:671–683.
Miles, J. and D. Henderson [1990], Parametrically forced surface waves, Ann. Rev. Fluid Mech., 22:143–165.
Nicolás, J. A., D. Rivas, and J. M. Vega [1997], The interaction of thermocapillary convection and low-frequency vibration in nearly-inviscid liquid bridges, Z. Angew. Math. Phys., 48:389–423.
Nicolás, J. A., D. Rivas, and J. M. Vega [1998], On the steady streaming flow due to high frequency vibration in nearly-inviscid liquid bridges, J. Fluid Mech., 354:147–174.
Nicolás, J. A. and J. M. Vega [1996], Weakly nonlinear oscillations of axisymmetric liquid bridges, J. Fluid Mech., 328:95–100.
Oliver, M. and E. Titi [1998], Analyticity of the attractor and the number of determining modes for a weakly damped driven nonlinear Schrödinger equation, Indiana Univ. Math. J., 47:49–73.
Phillips, O. M. [1977], The Dynamics of the Upper Ocean, Cambridge Univ. Press.
Pierce, R. D. and E. Knobloch [1994], On the modulational stability of traveling and standing water waves, Phys. Fluids, 6:1177–1190.
Porter, J. B. [2001], Global bifurcations with symmetry, Ph.D. Thesis, University of California at Berkeley.
Riecke, H., J. D. Crawford, and E. Knobloch [1988], Time-modulated oscillatory convection, Phys. Rev. Lett., 61:1942–1945.
Rucklidge, A. M. and P. C. Matthews [1996], Analysis of the shearing instability in nonlinear convection and magnetoconvection, Nonlinearity, 9:311–351.
Schlichting, H. [1932], Berechnung ebener periodischer Grenzschichtströmungen, Phys. Z., 33:327–335.
Swift, J. W. and K. Wiesenfeld [1984], Suppression of period doubling in symmetric systems, Phys. Rev. Lett., 52:705–708.
Tresser, C. [1984], About some theorems by L. P. Shil’nikov, Ann. Inst. Henri Poincaré – Phys. Theorique, 40:440–461.
Tufillaro, N. B., R. Ramshankar, and J. P. Gollub [1989], Order-disorder transition in capillary ripples, Phys. Rev. Lett., 62:422–425.
Vega, J. M., E. Knobloch, and C. Martel [2001], Nearly inviscid Faraday waves in annular containers of moderately large aspect ratio, Physica D, 154:313–336.
Wang, X. [1995], An energy equation for the weakly damped driven nonlinear Schrödinger equations and its application to their attractors, Physica D, 88:167–175.
Wiggins, S. [1988], Global Bifurcations and Chaos: Analytical Methods, Springer-Verlag, New York.

7 The Variational Multiscale Formulation of LES with Application to Turbulent Channel Flows

Thomas J. R. Hughes
Assad A. Oberai

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT We begin by recalling old times when the senior author and Jerry Marsden collaborated on research in mechanics in the 1970s. We note that both our recent interests have been focused on turbulence, although with different approaches. The common theme is reliance on variational structure. We then get down to business and describe our approach, the variational multiscale formulation of LES. Application is made to turbulent two-dimensional equilibrium and three-dimensional non-equilibrium channel flows. Simple, constant-coefficient Smagorinsky-type eddy viscosities, without wall damping functions, are used to model the transfer of energy from small resolved scales to unresolved scales, an approach which is not viable within the traditional LES framework. Nevertheless, very good results are obtained.

Contents
1 Introduction
2 Theory
    2.1 Incompressible Navier–Stokes Equations
    2.2 Variational Multiscale Method
3 Numerical Results
    3.1 Preliminaries
4 Conclusions
References

1 Introduction

I first “met” Jerry Marsden in the late 1960s in Barnes & Noble in New York City. (This was when there was only one Barnes & Noble bookstore.) On Saturdays I would spend my afternoons paging through technical books in Barnes & Noble, McGraw-Hill, and many other less famous bookstores in Manhattan, trying to sate the intellectual appetite of a young engineer. I recall encountering the book Foundations of Mechanics by Ralph Abraham and Jerrold E. Marsden of Princeton University for the first time (Abraham and Marsden [1966]). I thought I knew a bit about mechanics but I had no idea what this book was about! Nevertheless, it was intriguing and it was easy to discern the beauty and elegance within its pages. This book was something special. I did not buy it, but every time I returned to Barnes & Noble I would look at it and take away a little something. What was a manifold, a vector bundle, a differential form, symplectic geometry, a Klein bottle, etc., and what did they have to do with mechanics? My curiosity was piqued.

In 1969 I went to Berkeley to study for my Ph.D. Berkeley was my choice because of its preeminence in finite element research, excellence in mechanics and mathematics, and the reputation of its faculty. I was urged by a former Berkeley Ph.D. and a colleague of mine at General Dynamics, John Baylor, to join the research team of Karl Pister, whose reputation as a scholar, gentleman, leader, and man of character turned out to be absolutely accurate (but that is another nice story). When I arrived at Berkeley and met Karl he urged me to pursue the Ph.D. in Engineering Science, a program, not a department, that allowed considerable latitude in course work. This would enable me to study all the mechanics and mathematics I was so eager to learn, which I did.

So off I went studying mechanics and mathematics. I do not remember when I first realized that the same Jerrold E. Marsden was a faculty member in the mathematics department at Berkeley but it was probably pretty early in my studies. I had no contact with him until I had taken some more mathematics courses, in particular, ones in mathematical physics, analysis, topology, and differential manifolds and modern differential geometry, and until I saw an announcement of a graduate seminar to be given by Chernoff and Marsden regarding the application of global analysis to incompressible flows and nonlinear elasticity. It was time to meet Professor Marsden. I made an appointment to discuss the seminar with him.

From the first time we met, Jerry and I hit it off. It turned out that Jerry wanted to incorporate elasticity in the seminar but did not know much about it. He proposed that this would be my contribution to the seminar and he would mentor me on global analysis. Deal. We began to collaborate. Somewhere along the line I decided to cash in some of my mathematics courses on a master’s degree in mathematics. I wrote my master’s thesis on existence, uniqueness, and regularity in linear elastodynamics under Jerry’s supervision (Hughes and Marsden [1978]). We also studied nonlinear elastodynamics and, with the help of Tosio Kato, proved some basic theorems which were also applicable to general relativity (Hughes, Kato, and Marsden [1977]). Eventually our research in elasticity and continuum mechanics led to a year-long graduate course which became the basis of our book (Marsden and Hughes [1983]).

After I left Berkeley for Caltech, our intense period of research wound down. Both Jerry and I pursued independent tracks. Interestingly, our scientific orbits seem to have recently intersected once again. Jerry has gotten interested in integration schemes for dynamics problems (Marsden, Patrick, and Shkoller [1998]; Kane, Marsden, Ortiz, and West [2000]), a topic that I worked on assiduously for many years (Hughes [2000]) and one that I still have a deep interest in. And Jerry and I are now both working on turbulence. Perhaps the dynamics and turbulence seeds were planted years ago in joint work of ours with Alex Chorin and Marge McCracken (see Chorin, Hughes, Marsden, and McCracken [1978]). Our approaches to turbulence are different: Jerry’s is based on the Lagrangian averaged equations and mine on a multiscale decomposition of resolved scales, but we are both heavily relying on variational machinery, a recurring theme in our research. There may be a possibility that we can once again mutually benefit from each other’s ideas. With that end in mind, I thought that I would attempt, with the help of Assad Oberai, a simple exposition of the variational multiscale method. Here goes.

In Section 2, we describe the basic machinery. For background references, see Hughes [1995]; Hughes, Feijóo, Mazzei, and Quincy [1998]; Hughes, Mazzei, and Jansen [2000b]. Antecedents include Hughes, Mazzei, Oberai, and Wray [2001a]; Hughes, Oberai, and Mazzei [2001b] and the pioneering work of the Temam group (Dubois, Jauberteau, and Temam [1993, 1998, 1999]). A few sample calculations are presented in Section 3. Conclusions are drawn in Section 4.

2 Theory

2.1 Incompressible Navier–Stokes Equations

Let $\Omega$ be an open, connected, bounded subset of $\mathbb{R}^d$, $d = 2$ or $3$, with piecewise smooth boundary $\Gamma = \partial\Omega$; $\Omega$ represents the fixed spatial domain of our problem. The time interval of interest is denoted $]0, T[$, $T > 0$, and thus the space-time domain is $Q = \Omega \times ]0, T[$; its lateral boundary is denoted $P = \Gamma \times ]0, T[$. The setup is illustrated in Fig. 2.1.

[Figure 2.1: sketch of the space-time domain $Q = \Omega \times ]0, T[$ with lateral boundary $P = \Gamma \times ]0, T[$, $\Gamma = \partial\Omega$, and the time axis running from $0$ to $T$.]

Figure 2.1. Space-time domain for the initial/boundary value problem.

The initial/boundary-value problem consists of solving the following set of partial differential equations for $u : \bar{Q} \to \mathbb{R}^d$, the velocity, and $p : Q \to \mathbb{R}$, the pressure (divided by density),

$$\frac{\partial u}{\partial t} + \nabla \cdot (u \otimes u) + \nabla p = \nu \Delta u + f \quad \text{in } Q \tag{2.1}$$

$$\nabla \cdot u = 0 \quad \text{in } Q \tag{2.2}$$

$$u = 0 \quad \text{on } P \tag{2.3}$$

$$u(0^+) = u(0^-) \quad \text{on } \Omega \tag{2.4}$$

where $f : Q \to \mathbb{R}^d$ is the given body force (per unit mass); $\nu$ is the kinematic viscosity, assumed positive and constant; $u(0^-) : \Omega \to \mathbb{R}^d$ is the given initial velocity; and $\otimes$ denotes the tensor product (e.g., in component notation $[u \otimes v]_{ij} = u_i v_j$). Equations (2.1)–(2.4) are, respectively, the linear momentum balance, the incompressibility constraint, the no-slip boundary condition and the initial condition.

Note that when we write a function with only one argument, it is assumed to refer to time. For example, $u(t) = u(\cdot, t)$, where the spatial argument $x \in \Omega$ is suppressed for simplicity. Furthermore,

$$u(t^\pm) = \lim_{\varepsilon \downarrow 0} u(t \pm \varepsilon) \quad \forall t \in [0, T]. \tag{2.5}$$

This notation allows us to distinguish between $u(0^+)$ and $u(0^-)$, the solution and its given initial value, respectively. In our variational formulation of the initial/boundary-value problem we will only satisfy (2.4) in a weak sense.

2.2 Variational Multiscale Method

2.2.1 Space-Time Formulation of the Incompressible Navier–Stokes Equations

We consider a space-time formulation with weakly imposed initial condition. Let $\mathcal{V} = \mathcal{V}(Q)$ denote the trial solution and weighting function spaces, which are assumed to be identical. We assume $U = \{u, p\} \in \mathcal{V}$ implies $u = 0$ on $P$ and $\int_\Omega p(t)\, d\Omega = 0$ for all $t \in ]0, T[$. Let $(\cdot, \cdot)_D$ denote the $L_2(D)$ inner product, where $D = \Omega$ or $Q$. The variational formulation is stated as follows: Find $U \in \mathcal{V}$ such that $\forall W = \{w, q\} \in \mathcal{V}$

$$B(W, U) = (W, F) \tag{2.6}$$

where

$$B(W, U) = \left( w(T^-), u(T^-) \right)_\Omega - \left( \frac{\partial w}{\partial t}, u \right)_Q - \left( \nabla w, u \otimes u \right)_Q + \left( q, \nabla \cdot u \right)_Q - \left( \nabla \cdot w, p \right)_Q + \left( \nabla^s w, 2\nu \nabla^s u \right)_Q \tag{2.7}$$

and

$$(W, F) = (w, f)_Q + \left( w(0^+), u(0^-) \right)_\Omega. \tag{2.8}$$

This formulation implies weak satisfaction of the momentum equations and incompressibility constraint, in addition to the initial condition. The boundary condition is built into the definition of $\mathcal{V}$.

2.2.2 Separation of Scales

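Let

$$\mathcal{V} = \bar{\mathcal{V}} \oplus \mathcal{V}' \tag{2.9}$$

where $\bar{\mathcal{V}}$ is identified with a standard finite-dimensional space (e.g., a spectral or finite element space). Various characterizations of $\mathcal{V}'$ are possible. Note that $\mathcal{V}'$ is infinite-dimensional. In the discrete case, $\mathcal{V}'$ can be replaced with finite-dimensional approximations which are viewed as enrichments of $\bar{\mathcal{V}}$. We think of $\bar{\mathcal{V}}$ as representing large scales and $\mathcal{V}'$ as representing small scales. The decomposition (2.9) enables us to split (2.6) into two sub-problems:

$$B(\bar{W}, \bar{U} + U') = (\bar{W}, F) \tag{2.10}$$

$$B(W', \bar{U} + U') = (W', F) \tag{2.11}$$

where

$$U = \bar{U} + U' \tag{2.12}$$

$$W = \bar{W} + W' \tag{2.13}$$

in which $\bar{U}, \bar{W} \in \bar{\mathcal{V}}$ and $U', W' \in \mathcal{V}'$. Let

$$B_1(W, \bar{U}, U') = \frac{d}{d\varepsilon} B(W, \bar{U} + \varepsilon U') \Big|_{\varepsilon = 0} = \left( w(T^-), u'(T^-) \right)_\Omega - \left( \frac{\partial w}{\partial t}, u' \right)_Q - \left( \nabla w, \bar{u} \otimes u' + u' \otimes \bar{u} \right)_Q + \left( q, \nabla \cdot u' \right)_Q - \left( \nabla \cdot w, p' \right)_Q + \left( \nabla^s w, 2\nu \nabla^s u' \right)_Q. \tag{2.14}$$

$B_1(W, \bar{U}, U')$ is the linearized Navier–Stokes operator. Because the convective term is the only nonlinear term in (2.7), we have $B(W, \bar{U} + U') = B(W, \bar{U}) + B_1(W, \bar{U}, U') - (\nabla w, u' \otimes u')_Q$. With the aid of (2.14) we may therefore rewrite (2.10) and (2.11) as

$$B(\bar{W}, \bar{U}) + B_1(\bar{W}, \bar{U}, U') = \left( \nabla \bar{w}, u' \otimes u' \right)_Q + (\bar{W}, F) \tag{2.15}$$

$$B_1(W', \bar{U}, U') - \left( \nabla w', u' \otimes u' \right)_Q = -\left( B(W', \bar{U}) - (W', F) \right). \tag{2.16}$$

This amounts to a pair of coupled, nonlinear variational equations. Given the small scales (i.e., $U'$), (2.15) enables solution for the large scales (i.e., $\bar{U}$). Likewise, the large scales drive the small scales through (2.16). Note that the right-hand side of (2.16) is the residual of the large scales projected onto $\mathcal{V}'$.

2.2.3 Modeling of Subgrid Scales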

In the discrete case we need to include a model to account for the transfer of energy through the cutoff induced by the finite-dimensional approximation of $\mathcal{V}'$. We consider simple eddy viscosity terms added to the small-scale equation, (2.16), viz.,

$$B'(W', \bar{U}, U') = -\left( B(W', \bar{U}) - (W', F) \right) \tag{2.17}$$

where

$$B'(W', \bar{U}, U') \equiv B_1(W', \bar{U}, U') - \left( \nabla w', u' \otimes u' \right)_Q + \left( \nabla^s w', 2\nu_T' \nabla^s u' \right)_Q. \tag{2.18}$$

This is the modeled small-scale equation which replaces (2.16). The large-scale equation, (2.15), remains the same.

2.2.4 Eddy Viscosity Models

The assumed forms of $\nu_T'$ are inspired by the Smagorinsky model:

Small–Small

$$\nu_T' = (C_S' \Delta')^2 \, |\nabla^s u'| \tag{2.19}$$

and

Large–Small

$$\nu_T' = (C_S' \Delta')^2 \, |\nabla^s \bar{u}| \tag{2.20}$$

where $C_S'$ is a constant and $\Delta'$ is a length scale. See Hughes, Mazzei, and Jansen [2000b]; Hughes, Mazzei, Oberai, and Wray [2001a]; Hughes, Oberai, and Mazzei [2001b] for a discussion concerning their evaluation. In the first case, $\nu_T'$ depends exclusively on small-scale velocity components. In the second case, $\nu_T'$ depends on the large-scale components.

2.2.5 Summary

A concise way of writing the combined modeled system of (2.15) and (2.17) is

$$\bar{B}(W, U) = (W, F) \tag{2.21}$$

where

$$\bar{B}(W, U) \equiv B(W, U) + \left( \nabla^s w', 2\nu_T' \nabla^s u' \right)_Q. \tag{2.22}$$

The space $\mathcal{V} = \bar{\mathcal{V}} \oplus \mathcal{V}'$ is finite dimensional in this case. (2.22) is the form used in our numerical calculations.
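To make the two eddy viscosity choices of Section 2.2.4 concrete, here is a minimal sketch on a doubly periodic two-dimensional grid. It is not the channel-flow code used in this chapter: the function names, the sharp Fourier cutoff, the norm $|\nabla^s u| = (2\,\nabla^s u : \nabla^s u)^{1/2}$, and the choice of the grid spacing as the length scale are all assumptions made for illustration.

```python
import numpy as np


def split_scales(field, n_bar):
    """Sharp Fourier cutoff: modes with |j|, |l| < n_bar/2 form the large scales."""
    n = field.shape[0]
    j = np.fft.fftfreq(n, d=1.0 / n)  # integer mode indices
    keep = (np.abs(j)[:, None] < n_bar / 2) & (np.abs(j)[None, :] < n_bar / 2)
    large = np.real(np.fft.ifft2(np.fft.fft2(field) * keep))
    return large, field - large


def strain_norm(ux, uy, length=2.0 * np.pi):
    """|grad^s u| = sqrt(2 S:S) on a periodic square (assumed definition)."""
    n = ux.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    kx, ky = k[:, None], k[None, :]

    def ddx(f):
        return np.real(np.fft.ifft2(1j * kx * np.fft.fft2(f)))

    def ddy(f):
        return np.real(np.fft.ifft2(1j * ky * np.fft.fft2(f)))

    sxx, syy = ddx(ux), ddy(uy)
    sxy = 0.5 * (ddy(ux) + ddx(uy))
    return np.sqrt(2.0 * (sxx**2 + syy**2 + 2.0 * sxy**2))


def eddy_viscosity(ux, uy, n_bar, kind, c_s=0.1, delta=None):
    """nu_T' = (C_S' Delta')^2 |grad^s u'| (small-small) or |grad^s u_bar| (large-small)."""
    n = ux.shape[0]
    delta = 2.0 * np.pi / n if delta is None else delta  # assumed length scale
    ux_bar, ux_prime = split_scales(ux, n_bar)
    uy_bar, uy_prime = split_scales(uy, n_bar)
    if kind == "small-small":
        norm = strain_norm(ux_prime, uy_prime)
    elif kind == "large-small":
        norm = strain_norm(ux_bar, uy_bar)
    else:
        raise ValueError("kind must be 'small-small' or 'large-small'")
    return (c_s * delta) ** 2 * norm


# Illustrative field: one large-scale mode (wavenumber 2) plus one small-scale mode (12).
n, n_bar = 32, 16
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")
ux = np.sin(2.0 * Y) + 0.1 * np.sin(12.0 * Y)
uy = np.zeros_like(ux)
nu_ss = eddy_viscosity(ux, uy, n_bar, "small-small")
nu_ls = eddy_viscosity(ux, uy, n_bar, "large-small")
```

In this toy field the Small–Small viscosity responds only to the wavenumber-12 component and the Large–Small viscosity only to the wavenumber-2 component, which mirrors the distinction drawn between (2.19) and (2.20).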

3 Numerical Results

3.1 Preliminaries

We consider a rectangular channel $\Omega = [0, L_x] \times [-\delta, \delta] \times [0, L_z] \subset \mathbb{R}^3$. The coordinate directions $x$, $y$, $z$, following the usual convention, are aligned with the streamwise, wall-normal and spanwise directions, respectively. The velocity components are likewise denoted $u = (u_x, u_y, u_z) = (u, v, w)$. $Re_\tau = u_\tau \delta / \nu$ is the Reynolds number based on the wall-shear velocity, $u_\tau = \sqrt{\tau / \rho}$, in which $\rho$ is density, $\tau$ is the wall shear, and $\nu$ is the kinematic viscosity. The boundary conditions are periodic in the $x$- and $z$-directions, and no-slip at $y = \pm\delta$. The non-dimensional distance from the wall is defined as $y^+ = (\delta - |y|) u_\tau / \nu$. Note that the flow is driven by a prescribed unit pressure gradient in the minus $x$-direction. The problem configuration is schematically illustrated in Fig. 3.1.

[Figure 3.1: sketch of the channel with coordinate axes $x$, $y$, $z$, extents $L_x$ and $L_z$, and walls at $y = \pm\delta$.]

Figure 3.1. Problem set-up for the channel flow.

Two problems are considered: fully-developed two-dimensional plane channel flow at $Re_\tau = 180$ and a non-equilibrium three-dimensional channel flow at $Re_\tau = 180$ in which a spanwise shearing motion is imposed instantaneously to the fully-developed two-dimensional flow. The approach used is based on that of Moser, Moin, and Leonard [1983], Kim, Moin, and Moser [1987], and Lopez and Moser [1999]. For further details see also Hughes, Oberai, and Mazzei [2001b].

The approach employed utilizes Fourier-spectral discretization in the $x$- and $z$-directions, and modified Legendre polynomials in the $y$-direction. Each of the modified Legendre polynomials satisfies the no-slip boundary conditions. The concept is schematically illustrated in Fig. 3.2. The indices, $n$, associated with the Legendre polynomials in the $y$-direction are arranged in ascending order along the vertical axis. The indices $j$ and $l$, associated with the $k_x = 2\pi j / L_x$ and $k_z = 2\pi l / L_z$ wave numbers, are arranged along the axes in the horizontal plane as shown. Each triple of indices, $(j, n, l)$, identifies a Fourier–Galerkin basis function. The large-scale space of functions, denoted $\bar{\mathcal{V}}$, consists of the span of the basis functions represented by the inner box. Note that for this set, the indices satisfy the bounds

$$-\frac{\bar{N}_x}{2} < j < \frac{\bar{N}_x}{2}, \qquad 0 \le n < \bar{N}_y, \qquad -\frac{\bar{N}_z}{2} < l < \frac{\bar{N}_z}{2}. \tag{3.1}$$

The space of all functions, denoted $\mathcal{V}$, consists of the span of all basis functions represented by the outer box. For this set, the indices satisfy

$$-\frac{N_x}{2} < j < \frac{N_x}{2}, \qquad 0 \le n < N_y, \qquad -\frac{N_z}{2} < l < \frac{N_z}{2}. \tag{3.2}$$

Obviously, $N_x > \bar{N}_x$, $N_y > \bar{N}_y$ and $N_z > \bar{N}_z$. The space of small scales is the difference, namely, $\mathcal{V}' = \mathcal{V} \setminus \bar{\mathcal{V}}$. It consists of the span of all basis functions in the outer box but not those in the inner box. Every triple of indices in this region represents a Fourier–Galerkin basis function of the span of small scales, $\mathcal{V}'$.

[Figure 3.2: two nested boxes in index space, with axes $j$ ($k_x$), $n$ ($y$) and $l$ ($k_z$); the inner box extends to $\pm\frac{1}{2}\bar{N}_x$, $\bar{N}_y$, $\pm\frac{1}{2}\bar{N}_z$ and the outer box to $\pm\frac{1}{2}N_x$, $N_y$, $\pm\frac{1}{2}N_z$.]

Figure 3.2. The space $\mathcal{V}$ of all scales is represented by the outer box. The space $\bar{\mathcal{V}}$ of large scales is represented by the inner box. The space $\mathcal{V}' = \mathcal{V} \setminus \bar{\mathcal{V}}$ is the space of small scales.

All LES calculations were performed with $32^3$ resolution. In the multiscale case, this means $N_x = N_y = N_z = N = 32$. For the large-scale subspace, we selected $\bar{N}_x = \bar{N}_y = \bar{N}_z = \bar{N} = 16$. This represents a nominal rather than an optimized value. (See Hughes, Oberai, and Mazzei [2001b] for results at the optimized value of $\bar{N}$.) The fraction of modes in the large-scale subspace is given by

$$f = \frac{\bar{N}_x - 1}{N_x - 1} \cdot \frac{\bar{N}_z - 1}{N_z - 1} \cdot \frac{\bar{N}_y - 2}{N_y - 2} = \frac{(\bar{N} - 1)^2 (\bar{N} - 2)}{(N - 1)^2 (N - 2)}. \tag{3.3}$$
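As a quick arithmetic check of (3.3), the short sketch below evaluates $f$ for $N = 32$ and $\bar{N} = 16$. The function name `large_scale_fraction` is ours and is introduced only for illustration; the count of $N - 1$ Fourier modes per periodic direction follows from the strict inequalities in (3.1) and (3.2).

```python
def large_scale_fraction(n_bar, n):
    """Fraction f of modes in the large-scale subspace, equation (3.3)."""
    return (n_bar - 1) ** 2 * (n_bar - 2) / ((n - 1) ** 2 * (n - 2))


# Mode indices allowed by the strict bounds -N/2 < j < N/2: N - 1 values for even N.
n_fourier_modes = len(range(-32 // 2 + 1, 32 // 2))

f = large_scale_fraction(16, 32)
print(f"Fourier modes per direction: {n_fourier_modes}")  # 31
print(f"f = {f:.3f}, 1 - f = {1 - f:.3f}")  # f = 0.109, 1 - f = 0.891
```

With these values only about one mode in nine belongs to the large-scale space, so the model term in (2.22) acts on the great majority of the resolved modes.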

The fraction of modes in the small-scale space is $1 - f$. For the cases considered, $f = 0.109$ and $1 - f = 0.891$. The dimensions of the channel are $L_x = 2\pi$, $L_z = 4\pi/3$, and $\delta = 1$. Reference DNS data was obtained from Moser, Kim, and Mansour [1999] for the two-dimensional case and from Coleman, Kim, and Le [1996] for the three-dimensional case.

Two grids are employed in the calculations: one is a $48 \times 48 \times 48$ quadrature grid, and the other is a $32 \times 32 \times 32$ grid used to compute length scales for the Smagorinsky and multiscale models. The points are equally spaced in the $x$- and $z$-directions and are the Gauss quadrature points in the $y$-direction. The length scale is computed from

$$\Delta'(\tilde{y}) = \Delta(\tilde{y}) = \sqrt[3]{h_x h_z h_y(\tilde{y})} \tag{3.4}$$

where

$$h_x = L_x / 32 = 2\pi / 32 = 0.196 \tag{3.5}$$

$$h_z = L_z / 32 = 4\pi / (3 \cdot 32) = 0.131 \tag{3.6}$$

and $h_y(\tilde{y})$ is the $y$-spacing at Gauss point $\tilde{y}$, that is, the distance between adjacent Gauss points of the 32-point rule, in which $\tilde{y}$ is the location corresponding to the 48-point rule. We note that there are no coincident locations of quadrature points for the two rules, so no ambiguity can arise. The smallest $y$-length scale, adjacent to the wall, is 0.0027 ($\Delta y^+ = 0.486$), and the largest, at the center of the channel, is 0.0966 ($\Delta y^+ = 17.392$). Dealiasing was employed but the higher order functions used in the models are not dealiased exactly by the 3/2 rule.

In the figures, we have used the notation $\langle \cdot \rangle$ to denote averages over $x,z$-planes, and $\langle\langle \cdot \rangle\rangle$ to denote averages over $x,z$-planes and time. For the Smagorinsky and multiscale models, we used $C_S = C_S' = 0.1$ throughout. In calculating the $x,y$-component of the Reynolds stress, denoted $R_{xy}$, we included the effect of the model. We employed a variational projection method which can be shown to be conservative analytically (Hughes, Engel, Mazzei, and Larson [2000a]). This was numerically verified.

Two-dimensional case at $Re_\tau = 180$. We compare the Large–Small and Small–Small multiscale methods with the Dynamic model (see Germano, Piomelli, Moin, and Cabot [1991], Ghosal, Lund, Moin, and Akselvoll [1995], Lilly [1992]) and DNS data. Results are presented in Figures 3.3–3.7. In Fig. 3.3 we compare results for the mean streamwise velocity. The Large–Small case is not as accurate as the Dynamic model in the log layer, both lying above the DNS data. The Small–Small values are the most accurate, slightly better than the Dynamic model. In Fig. 3.4 we compare root-mean-square (RMS) values of the streamwise velocity fluctuations. The Large–Small and Small–Small results overshoot the peak DNS value somewhat more than the Dynamic model. Throughout most of the channel, both multiscale models are in better agreement with DNS data than is the Dynamic model.
RMS values of the wall-normal and spanwise velocity fluctuations are presented in Figs. 3.5 and 3.6. The accuracy of both Large–Small and Small–Small is considerably better than that of the Dynamic model. The $x,y$-component of Reynolds stress is presented in Fig. 3.7. All results are very accurate. Note that, due to the inclusion of the model in the calculation of the Reynolds stress, the multiscale model results do not vanish at the wall whereas the Dynamic model results do.

In summary, the Large–Small and Dynamic model results are commensurate in that Dynamic is better than Large–Small in accurately capturing the mean streamwise velocity log layer and the peak of the RMS streamwise velocity fluctuations, whereas Large–Small is better than Dynamic throughout most of the channel for the RMS streamwise velocity fluctuations, and uniformly more accurate for the RMS wall-normal and spanwise velocity fluctuations and the Reynolds stress. However, the most accurate model overall is Small–Small.

Figure 3.3. Mean streamwise velocity ($Re_\tau = 180$, $\bar{N} = 16$). [Plot of the mean streamwise velocity against $y^+$ on a logarithmic axis; curves: DNS, Dynamic, Large–Small, Small–Small.]

Figure 3.4. RMS values of streamwise velocity fluctuations $\tilde{u} = \langle (u - \langle u \rangle)^2 \rangle^{1/2}$. [Plot of $\tilde{u}$ against $y^+$; curves: DNS, Dynamic, Large–Small, Small–Small.]

Figure 3.5. RMS values of wall-normal velocity fluctuations $\tilde{v} = \langle (v - \langle v \rangle)^2 \rangle^{1/2}$. [Plot of $\tilde{v}$ against $y^+$; curves: DNS, Dynamic, Large–Small, Small–Small.]

Figure 3.6. RMS values of spanwise velocity fluctuations $\tilde{w} = \langle (w - \langle w \rangle)^2 \rangle^{1/2}$. [Plot of $\tilde{w}$ against $y^+$; curves: DNS, Dynamic, Large–Small, Small–Small.]

Figure 3.7. $x,y$-component of Reynolds stress ($Re_\tau = 180$, $\bar{N} = 16$). [Plot of $R_{xy}$ against $|1 - y|$; curves: DNS, Dynamic, Large–Small, Small–Small.]

Three-dimensional case at $Re_\tau = 180$. In the three-dimensional case the lower wall is sheared in the minus $z$-direction in accord with the DNS calculations of Coleman, Kim, and Le [1996]. We compared the Dynamic model, the Smagorinsky model with Van Driest damping (see Smagorinsky [1963], Van Driest [1956], and Moin and Kim [1982]), and the two multiscale models with DNS data (Coleman, Kim, and Le [1996]). Since the results depend on the initial conditions, we performed 16 realizations for each model. The initial conditions were selected to be instantaneous values of the velocity field sampled from the time interval used to calculate statistics for the fully-developed case described previously. For each model, we used velocity fields computed for that model. Statistics were then computed from the 16 realizations.

Mean values of wall shear stress at the lower wall and maximum turbulent kinetic energy are plotted in Figs. 3.8 and 3.9. In both sets of data, results are normalized by their initial values. From Fig. 3.8 we see that the Dynamic model predicts the mean wall shear very well for early times ($t \le 0.2$), but thereafter departs significantly from the DNS data. The Smagorinsky model with Van Driest damping is somewhat better in that it is in fair agreement with DNS data until $t = 1.0$, but remains roughly constant thereafter in contrast with the DNS data, which increases. The multiscale models are in good overall agreement with the DNS data for all times. The Large–Small model is in particularly good agreement even at long times.

From Fig. 3.9 we see that the Smagorinsky model with Van Driest damping is the least accurate in predicting the maximum turbulent kinetic energy. It is in good agreement with DNS data until $t = 0.3$ but thereafter is in significant error. The Dynamic model is better in that it is in good agreement with the DNS data until $t = 0.6$, but it also loses accuracy beyond this point. Both multiscale models are better. Again, the Large–Small model is in particularly good agreement at long times.

Overall, both multiscale models are significantly more accurate than the Dynamic model and the Smagorinsky model with Van Driest damping. The accuracy of the Large–Small model is particularly noteworthy at long times.

Figure 3.8. Non-equilibrium case. Mean streamwise shear stress $\tilde{\tau}(t) = \tau(t) / \tau(0)$. [Plot of $\tilde{\tau}(t)$ against $t$; curves: DNS, Dynamic, Large–Small, Small–Small, Smagorinsky with Van Driest damping.]

Figure 3.9. Non-equilibrium case. Max turbulent kinetic energy $\tilde{q}(t) = q(t) / q(0)$. [Plot of $\tilde{q}(t)$ against $t$; curves: DNS, Dynamic, Large–Small, Small–Small, Smagorinsky with Van Driest damping.]
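As a cross-check on the grid data quoted in Section 3.1, the following sketch recomputes the horizontal spacings and the extreme wall-normal length scales. It rests on two assumptions of ours: that the "Gauss quadrature points" are the Gauss–Legendre points on $[-\delta, \delta]$ with $\delta = 1$, and that the smallest quoted $y$-length scale is the distance from the wall to the first point of the 32-point rule. All variable names are illustrative.

```python
import numpy as np

# 32-point Gauss-Legendre rule on [-1, 1] (assumed to be the rule meant in the text).
nodes, _ = np.polynomial.legendre.leggauss(32)

half_height = 1.0                        # channel half-height delta
wall_gap = half_height - nodes.max()     # wall to first Gauss point (assumed meaning)
center_gap = np.diff(nodes).max()        # spacing of the two points nearest y = 0

h_x = 2.0 * np.pi / 32                   # L_x / 32
h_z = 4.0 * np.pi / (3 * 32)             # L_z / 32

# Length scale (3.4) evaluated with the centerline spacing.
delta_center = (h_x * h_z * center_gap) ** (1.0 / 3.0)

print(f"h_x = {h_x:.3f}, h_z = {h_z:.3f}")                            # h_x = 0.196, h_z = 0.131
print(f"wall gap = {wall_gap:.4f}, center gap = {center_gap:.4f}")    # 0.0027, 0.0966
print(f"Delta at the centerline = {delta_center:.3f}")
```

Under these assumptions the computed spacings agree with the values 0.196, 0.131, 0.0027 and 0.0966 given in Section 3.1, which supports reading the wall-normal points as Gauss–Legendre points.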

4 Conclusions

We described the variational multiscale formulation and presented numerical results for two-dimensional equilibrium and three-dimensional non-equilibrium channel flows. Our goal in the numerical simulation was to assess the variational multiscale formulation’s ability to solve wall-bounded flows. In particular, we wanted to demonstrate the feasibility of using simple, constant-coefficient Smagorinsky-type eddy viscosities without wall damping functions to model the evolution of the small scales, knowing full well that such an approach is not viable within the traditional LES framework. The remarkable result is that even with these simple models the multiscale method achieves very good accuracy, arguably better than the accuracy of the Dynamic model. We believe that these results provide convincing evidence of the power of multiscale concepts in LES, and reinforce conclusions previously drawn for homogeneous isotropic flows (see Hughes, Mazzei, Oberai, and Wray [2001a]). For a more comprehensive investigation of channel flows, see Hughes, Oberai, and Mazzei [2001b].

Acknowledgments: We wish to express sincere appreciation to V. Lopez and R. Moser who provided us with their channel-flow code, which served as the basis of the numerical calculations reported herein. We also wish to thank P. Moin for suggesting we apply our method to turbulent boundary layers, and W. Cabot, S. Ghosal and S. Lichter for helpful conversations. This research was partially supported by NASA Ames Research Center Cooperative Agreement No. NCC 2–5363 and ONR Grant No. 00014–99–1–0122.

References

Abraham, R. and J. E. Marsden [1966], Foundations of Mechanics. W. A. Benjamin and Co., Reading, Massachusetts.
Chorin, A. J., T. J. R. Hughes, J. E. Marsden, and M. F. McCracken [1978], Product formulas and numerical algorithms, Communications on Pure and Applied Mathematics 31, 205–256.
Coleman, G., J. Kim, and A.-T. Le [1996], A numerical study of three-dimensional wall-bounded flows, International Journal of Heat and Fluid Flow 17, 333–342.
Dubois, T., F. Jauberteau, and R. Temam [1993], Solution of the incompressible Navier–Stokes equations by the nonlinear Galerkin method, Journal of Scientific Computing 8, 167–194.
Dubois, T., F. Jauberteau, and R. Temam [1998], Incremental unknowns, multilevel methods and the numerical simulations of turbulence, Computer Methods in Applied Mechanics and Engineering 159, 123–189.
Dubois, T., F. Jauberteau, and R. Temam [1999], Dynamic Multilevel Methods and the Numerical Simulation of Turbulence. Cambridge University Press, Cambridge, U.K.
Germano, M., U. Piomelli, P. Moin, and W. Cabot [1991], A dynamic subgrid-scale eddy viscosity model, Physics of Fluids 3(7), 1760–1765.
Ghosal, S., T. Lund, P. Moin, and K. Akselvoll [1995], A dynamic localization model for large-eddy simulation of turbulent flows, Journal of Fluid Mechanics 286, 229–255.
Hughes, T. J. R. [1995], Multiscale phenomena: Green’s functions, the Dirichlet-to-Neumann formulation, subgrid scale models, bubbles, and the origins of stabilized methods, Computer Methods in Applied Mechanics and Engineering 127, 387–401.
Hughes, T. J. R. [2000], The Finite Element Method: Linear Static and Dynamic Finite Element Analysis. Dover Publications, Mineola, New York.
Hughes, T. J. R., G. R. Feijóo, L. Mazzei, and J.-B. Quincy [1998], The variational multiscale method: a paradigm for computational mechanics, Computer Methods in Applied Mechanics and Engineering 166(1–2), 3–24.
Hughes, T. J. R., T. Kato, and J. E. Marsden [1977], Well-posed quasi-linear second-order hyperbolic systems with applications to nonlinear elastodynamics and general relativity, Arch. Rational Mech. Anal. 63, 273–294.
Hughes, T. J. R. and J. E. Marsden [1978], Classical elastodynamics as a symmetric hyperbolic system, Journal of Elasticity 8, 97–110.
Hughes, T. J. R., L. Mazzei, A. A. Oberai, and A. A. Wray [2001a], The multiscale formulation of large eddy simulation: decay of homogeneous isotropic turbulence, Physics of Fluids 13(2), 505–512.
Hughes, T. J. R., A. A. Oberai, and L. Mazzei [2001b], Large eddy simulation of turbulent channel flows by the variational multiscale method, Physics of Fluids 13(6), 1784–1799.
Hughes, T. J. R., G. Engel, L. Mazzei, and M. Larson [2000a], The continuous Galerkin method is locally conservative, Journal of Computational Physics 163, 467–488.
Hughes, T. J. R., L. Mazzei, and K. Jansen [2000b], Large eddy simulation and the variational multiscale method, Computing and Visualization in Science 3, 47–59.
Kane, C., J. E. Marsden, M. Ortiz, and M. West [2000], Variational integrators and the Newmark algorithm for conservative and dissipative mechanical systems, International Journal for Numerical Methods in Engineering 49, 1295–1325.
Kim, J., P. Moin, and R. Moser [1987], Turbulence statistics in fully developed channel flow at low Reynolds number, Journal of Fluid Mechanics 177, 133–166.
Lilly, D. [1992], A proposed modification of the Germano subgrid-scale closure method, Physics of Fluids A 4, 633–635.
Lopez, V. and R. Moser [1999], Private communication.
Marsden, J. E. and T. J. R. Hughes [1983, 1994], Mathematical Foundations of Elasticity. Prentice-Hall, Englewood Cliffs, New Jersey. Reprinted by Dover Publications, Mineola, New York, 1994.
Marsden, J. E., G. W. Patrick, and S. Shkoller [1998], Multisymplectic geometry, variational integrators, and nonlinear PDEs, Communications in Mathematical Physics 199, 351–395.
Moin, P. and J. Kim [1982], Numerical investigation of turbulent channel flow, Journal of Fluid Mechanics 118, 341–377.
Moser, R., J. Kim, and N. Mansour [1999], Direct numerical simulation of turbulent channel flows up to $Re_\tau = 590$, Physics of Fluids 11, 943–945.
Moser, R., P. Moin, and A. Leonard [1983], A spectral numerical method for the Navier–Stokes equations with applications to Taylor–Couette flow, Journal of Computational Physics 52, 524–544.
Smagorinsky, J. [1963], General circulation experiments with the primitive equations, I: The basic experiment, Monthly Weather Review 91, 99–164.
Van Driest, E. [1956], On turbulent flow near a wall, Journal of the Aerospace Sciences 23, 1007–1011.

Part III

Dynamical Systems

8 Patterns of Oscillation in Coupled Cell Systems

Martin Golubitsky
Ian Stewart

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT Coupled oscillators or coupled cell systems are used as models in a variety of physical and biological contexts. Each of these models includes assumptions about the internal dynamics of a cell (a pendulum, a neuron, or a single laser) and assumptions about how the cells are coupled to each other. In a primitive sense, coupled cell systems are just moderate-sized systems of ODE; for example, an eight-cell system with four-dimensional internal dynamics (such as a Hodgkin–Huxley system) yields a 32-dimensional system of ODE. In a more sophisticated sense, coupled cell systems have additional structure; we want to be able to compare the dynamics in different cells (are they synchronous, or a half-period out of phase, or do they have a more complicated phase relation?). In this paper we explore the extra structure that is associated with a coupled cell system. We argue that those permutations of the cells that are assumed to be symmetries of the cell system constitute a modelling assumption, one that in large measure dictates the kinds of equilibria and time-periodic solutions that are expected in such models. We survey certain general results in the context of specific models, including locomotor central pattern generators for quadruped motion and coupled pendula. These results lead to a model for multirhythms. Coupled cell dynamics are a worthwhile subject of study and we begin here to discuss some of the fascinating features of this area.

Contents

1 Introduction
2 Model Cell Systems
  2.1 Speciation ($S_N$)
  2.2 Quadrupedal Gaits ($\mathbb{Z}_2 \times \mathbb{Z}_4$)
  2.3 Pendulum Ring Coupled by Torsion Springs ($D_N$)
  2.4 Coupled Hypercolumns ($D_4$)
3 A Formal Definition of a Coupled Cell System
  3.1 Coupled Cell Systems with Additional Structure
  3.2 Symmetry and Modelling
4 Spatio-Temporal Patterns in Coupled Cell Systems
  4.1 A Classification Theorem for Spatio-Temporal Symmetries
  4.2 Examples of Spatio-Temporal Symmetries
  4.3 Spatio-Temporal Symmetries in Hamiltonian Systems
5 Spontaneous Symmetry-Breaking
  5.1 Linear Theory
  5.2 Nonlinear Theory
  5.3 Genericity Questions in Coupled Cell Systems
  5.4 The Equivariant Moser–Weinstein Theorem
6 The Coupling Decomposition
References

1 Introduction

Coupled oscillators or coupled cell systems have been much studied as models for certain physical or biological systems (Josephson junction arrays [Hadley, Beasley, and Wiesenfeld, 1988], coupled lasers [Wang and Winful, 1988; Bracikowski and Roy, 1990], central pattern generators [Kopell and Ermentrout, 1986, 1988, 1990; Rand, Cohen, and Holmes, 1988], speciation [Cohen and Stewart, 2000], and so on). Many classical mechanical systems can be interpreted as coupled cell systems: for example, chains of like rods [Synge and Griffith, 1959, p. 270], normal mode vibrations of a loaded string [Fowles, 1986, p. 301], linear motion of a triatomic molecule [Fowles, 1986, p. 299], and n-body dynamics [Griffiths, 1985, p. 132]. However, few attempts have been made to formalize the concept of a coupled cell system and develop a general, abstract theory. We begin that process in this paper.

An N-cell coupled cell system is often written in the form

\[
\frac{dx_i}{dt} = f_i(x_i, \lambda) + \sum_{j \to i} h_{ij}(x_j, x_i), \qquad x_i \in \mathbb{R}^{k_i}; \quad i = 1, \dots, N \tag{1.1}
\]

where $f_i$ is the internal dynamics of the ith cell, $h_{ij}$ is the coupling from cell j to cell i, and $\lambda$ is a vector of parameters. In these models the total coupling at cell i is just the sum of coupling contributions from those cells j that are actually connected to cell i (symbolized here as $j \to i$). See [Kopell and Ermentrout, 1986, 1988, 1990; Rand, Cohen, and Holmes, 1988]. This structure represents a rather special case of the general concept introduced in Section 3, but it serves as motivation.

In a coupled cell system we emphasize the comparative dynamics of all cells, as opposed to the dynamics of the whole system, and it is this comment that distinguishes the study of coupled cell systems from the study of systems of ordinary differential equations. Of course, the two points of view are intimately related, but they are not the same. In particular, from the coupled-cell viewpoint the output signal from each cell is assumed to have its own significance. For example, in the context of time-periodic solutions two cells are often described as being ‘a half-period out of phase’. In rings of cells, solutions may be described as forming ‘discrete rotating waves’. Two cells i, j can be described as ‘synchronous’ (that is, satisfying the condition $x_i(t) = x_j(t)$) even when the trajectory $x(t)$ is chaotic.

For these reasons, we must consider a coupled cell system to be a system of ODE equipped with a distinguished set of projections whose images are the individual cells. If we view each cell as representing a point in space, then coupled cell systems are discrete-space continuous-time systems. They therefore represent a fascinating compromise between ODE and PDE, without the technical complications typically associated with the latter.

We are intrigued by the structure implicit in coupled cell systems that permits patterned solutions to exist robustly, and ask: What structure in coupled cell systems allows specific cells to have identical time series, definite phase relations, or other identifiable spatio-temporal patterns? One answer is symmetry, and that is the one that we focus on here. In coupled cell systems symmetries appear naturally as permutations of the cells, and exist only when (subsets of) the cells are identical.
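As an illustration of the point-to-point form (1.1), the following minimal sketch evaluates the right-hand side for scalar cells and steps it forward with explicit Euler. The function names, the cubic internal dynamics, the diffusive coupling, and the three-cell unidirectional ring are all assumptions chosen for illustration; none of them come from the models cited above.

```python
def coupled_rhs(x, lam, f, h):
    """Right-hand side of (1.1) for scalar cells.

    f is a list of internal dynamics f_i(x_i, lam), one per cell.
    h maps a pair (i, j), meaning "cell j is connected to cell i", to h_ij(x_j, x_i).
    """
    dx = [f[i](x[i], lam) for i in range(len(x))]
    for (i, j), h_ij in h.items():
        dx[i] += h_ij(x[j], x[i])
    return dx


def euler_step(x, lam, f, h, dt):
    """One explicit Euler step; accuracy is not the point of this sketch."""
    return [xi + dt * dxi for xi, dxi in zip(x, coupled_rhs(x, lam, f, h))]


def internal(xi, lam):
    """Hypothetical internal dynamics shared by all cells."""
    return lam * xi - xi ** 3


def diffusive(xj, xi):
    """Hypothetical coupling that vanishes when the two cells agree."""
    return 0.5 * (xj - xi)


# Unidirectional ring of three identical cells: cell i receives input from cell i - 1.
N = 3
f = [internal] * N
h = {(i, (i - 1) % N): diffusive for i in range(N)}

# A synchronous initial state stays synchronous, because every cell
# evaluates exactly the same expression at every step.
x = [0.2, 0.2, 0.2]
for _ in range(100):
    x = euler_step(x, 1.0, f, h, 0.01)
```

The dictionary of edges plays the role of the relation $j \to i$: removing an entry removes one coupling term without touching the internal dynamics.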

2 Model Cell Systems

As motivation for the concept of a coupled cell system we now introduce four coupled cell models, each having a different symmetry group: speciation, animal gaits, coupled pendula, and coupled hypercolumns in the visual cortex. Each example has an eight-cell version and it is curious to note that the symmetry groups corresponding to these four models ($S_8$, $\mathbb{Z}_4 \times \mathbb{Z}_2$, $D_8$, $D_4$) are all different. We return to these examples once we have developed appropriate general techniques for their analysis. We note that there are no first-principles derivations for the form of the coupled cell systems (in particular, the internal dynamics of each cell) in three of the four examples (speciation, animal gaits, and hypercolumns in the visual cortex), though the coupled cell form that we abstracted in (1.1) is used by a number of authors. Our chief point is that the kind of states that coupled cell models can produce depends crucially on the symmetries of the system.

2.1 Speciation ($S_N$)

Our first example arises in a model of speciation (the formation of new species) in evolutionary biology; see Cohen and Stewart [2000]. Examples include ‘Darwin’s finches’ in the Galápagos Islands, where what was initially a single species of finch has diversified into 14 species over a period of about 5 million years. In fact, evolutionary changes in Darwin’s finches can be observed today, over periods of just a few years [Ridley, 1996].

Speciation is usually discussed in terms of genotype, that is, genetics. In contrast, we shall focus on the phenotype (the organism’s form and behavior), because the dynamics of evolution is driven by natural selection, which acts on phenotypes. The principal role of genes is to make it possible for the phenotype to change. (For recent support for this approach, see Pennisi [2000], Rundle, Nagel, Boughman, and Schluter [2000], and Huey, Gilchrist, Carlson, Berrigan, and Serra [2000].)

Until recently, most explanations of speciation have invoked geographical or environmental discontinuities or non-uniformities. For example, the mechanism known as ‘allopatry’ involves an initial species being split into two geographically isolated groups, say by one group moving to new territory, later isolated from the original territory by floods or other geographical changes. Once separated, the two groups can evolve independently. See [Mayr, 1963, 1970] for details. Such theories are based on the belief that discontinuous or non-uniform effects must have discontinuous or non-uniform causes. The conventional wisdom was that if the organisms of two nascent species are not isolated, they will be able to interbreed, and ‘gene-flow’ will maintain them as a single species. Therefore gene-flow must be disrupted in some way, and the obvious possibility is geographical isolation.

However, it is now recognized that discontinuous or non-uniform effects can have continuous or uniform causes. Indeed, these are the phenomena addressed in bifurcation theory and symmetry-breaking. Towards the end of the 1990s evolutionary biologists increasingly began to consider mechanisms for ‘sympatric’ speciation. Here, organisms remain intermingled throughout the process of speciation. In sympatric speciation, gene-flow is disrupted by more subtle mechanisms than geographical isolation, in particular natural selection, which eliminates ‘hybrid’ offspring arising from matings between members of the two different speciating groups before the hybrids become breeding adults.

Cohen and Stewart [2000] developed a context in which sympatric speciation is explicitly represented as a form of spontaneous symmetry-breaking in a coupled cell system with all-to-all coupling ($S_N$ symmetry). Stewart, Elmhirst, and Cohen [2002] made numerical studies of such models. Since individual organisms can die or breed, their numbers can change, so it is unsatisfactory to model the system with a fixed number of immortal organisms. The cells of the system are therefore taken to be coarse-grained clusters of related organisms in phenotypic space; these clusters act as carriers for phenotypes. Cohen and Stewart [2000] refer to these cells as ‘PODs’: Placeholders for Organism Dynamics.

The motivation behind the model is that a single species is invariant under all permutations of its organisms, whereas a mixture of species is invariant only under the smaller group of permutations that preserve each species. The appropriate symmetry group is therefore the symmetric group $S_N$ of all permutations on N symbols, where N is the number of PODs in the model. (This number is a modelling choice, rather than being determined by biological considerations: typically something in the range 10–100 seems reasonable.) The model also assumes that the relevant phenotypes can be described by continuous characters, such as beak length for birds, and may therefore be inappropriate for characters that are determined by a single gene or a small gene-complex.

The model demonstrates that sympatric speciation can occur in a population where all organisms can potentially interbreed, and in an environment that is uniform at any instant but may change as time passes. A speciation event (bifurcation) is triggered if environmental changes render the uniform state (a single species) unstable, so that the symmetry of the uniform state breaks. Such an instability occurs if the organisms can survive more effectively by adopting different strategies, rather than by all adopting the same strategy (subject to genetic feasibility).

Consider a system of N PODs. The state of POD j is described by a vector $x_j$ belonging to phenotypic space $\mathbb{R}^r$, where $1 \le j \le N$. A point in phenotypic space represents a phenotype. Each entry $x_j^i$ in $x_j = (x_j^1, \dots, x_j^r)$ represents a phenotypic character. Throughout the following discussion, for simplicity, we focus on the case $r = 1$, so each cell is 1-dimensional. Let $a = (a_1, \dots, a_s)$ represent environmental influences (climate, food resources, other organisms, ...). Assume that on the appropriate time scale changes in phenotype can be described by a dynamical system

\[
\frac{dx_j}{dt} = f_j(x_1, \dots, x_N; a_1, \dots, a_s) \tag{2.1}
\]

for suitable functions $f_j : \mathbb{R}^N \times \mathbb{R}^s \to \mathbb{R}$. The key observation is that the system should have $S_N$-symmetry. Intuitively, this just means that the dynamical equations should treat all cells in the same way. Thus we assume that $F = (f_1, \dots, f_N)$ is $S_N$-equivariant. Figure 2.1 shows typical time-series of phenotypic variables (for a choice of F that we do not specify here): the split into two species is evident.
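The requirement that F be $S_N$-equivariant can be checked numerically at a sample point by comparing $F(\sigma x)$ with $\sigma F(x)$ for every permutation $\sigma$. The sketch below does this for a hypothetical vector field in which each cell depends on its own state and on the population mean; this F is an assumption made for illustration and is not the (unspecified) field behind Figure 2.1.

```python
from itertools import permutations


def F(x, lam):
    """A hypothetical S_N-equivariant vector field with r = 1.

    f_j depends on x_j and on the mean of all cells, which is permutation-invariant.
    """
    mean = sum(x) / len(x)
    return [lam * xj - xj ** 3 + (mean - xj) for xj in x]


def act(sigma, x):
    """Permutation action: the state of cell j is moved to cell sigma[j] (0-indexed)."""
    y = [0.0] * len(x)
    for j, xj in enumerate(x):
        y[sigma[j]] = xj
    return y


def is_equivariant(field, x, lam, tol=1e-12):
    """Check field(sigma x) = sigma field(x) for every sigma in S_N at the point x."""
    fx = field(x, lam)
    for sigma in permutations(range(len(x))):
        lhs = field(act(sigma, x), lam)
        rhs = act(sigma, fx)
        if any(abs(a - b) > tol for a, b in zip(lhs, rhs)):
            return False
    return True


def G(x, lam):
    """A field that singles out cell 0, so it treats the cells differently."""
    return [x[0]] + [0.0] * (len(x) - 1)
```

A check at one point cannot prove equivariance, but it does detect fields such as G that single out a particular cell. The loop visits all N! permutations, so it is only practical for small N.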

2.2 Quadrupedal Gaits ($\mathbb{Z}_2 \times \mathbb{Z}_4$)

[Figure 2.1: two panels of time series, with panel titles λ = 5 and λ = 25; horizontal axis: time.]

Figure 2.1. Bifurcation to two species in model with N = 25 PODs. Time series of all PODs are superimposed. (Left) one species; (right) two species (one with 9 PODs and one with 16 PODs).

Quadrupedal gaits provide excellent examples of periodic states with spatio-temporal symmetries. In the pace, trot, and bound a four-legged animal partitions its legs into two pairs: the legs in each pair move in synchrony, while legs in different pairs move with a half-period phase shift. The two pairs in a bound consist of the forelegs and the hind legs; the two pairs in a pace consist of the left legs and the right legs; and the two pairs in a trot consist of the two diagonal pairs of legs. The quadruped walk has a more complicated cadence: each leg moves independently with a quarter-period phase shift in the order left hind, left fore, right hind, and right fore. As in the pace, the left legs move and then the right legs move, but the left legs and the right legs do not move in unison.

Collins and Stewart [1993, 1994] pointed out that each of these gaits can be distinguished by symmetry in the following sense. Spatio-temporal symmetries are permutations of the legs coupled with phase shifts (translations of time). So interchanging the two fore legs and the two hind legs of a bounding animal does not change the gait, while interchanging the two left legs and the two right legs leads to a half-period phase shift. In a walk, permuting the legs in the order left hind to left fore to right hind to right fore leads to a quarter-period phase shift. We list the spatio-temporal symmetries of each of these gaits in Table 2.1.

Gait     Symmetries (leg permutation, phase shift)
pace     ((1 3)(2 4), 0)      ((1 2)(3 4), 1/2)    ((1 4)(2 3), 1/2)
trot     ((1 3)(2 4), 1/2)    ((1 2)(3 4), 1/2)    ((1 4)(2 3), 0)
bound    ((1 3)(2 4), 1/2)    ((1 2)(3 4), 0)      ((1 4)(2 3), 1/2)
walk     ((1 3 2 4), 1/4)     ((1 2)(3 4), 1/2)    ((1 4 2 3), 3/4)

Table 2.1. Gait symmetries: 1=left hind leg; 2=right hind leg; 3=left foreleg; 4=right foreleg.
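The entries of Table 2.1 can be recomputed from the verbal description of each gait. In the sketch below each gait is encoded by the phase (as a fraction of the period) at which each leg moves, and a leg permutation is paired with a phase shift when relabelling the legs changes every phase by the same amount. The phase encoding and the helper name `phase_shift` are illustrative assumptions rather than part of any gait model.

```python
from fractions import Fraction

# Leg labels follow Table 2.1: 1 = left hind, 2 = right hind, 3 = left fore, 4 = right fore.
HALF, QUARTER = Fraction(1, 2), Fraction(1, 4)
GAITS = {
    "pace":  {1: 0, 3: 0, 2: HALF, 4: HALF},         # left legs, then right legs
    "trot":  {1: 0, 4: 0, 2: HALF, 3: HALF},         # diagonal pairs
    "bound": {1: 0, 2: 0, 3: HALF, 4: HALF},         # hind legs, then forelegs
    "walk":  {1: 0, 3: QUARTER, 2: HALF, 4: 3 * QUARTER},
}

# Permutations written as maps from each leg to its image.
PERMS = {
    "(1 3)(2 4)": {1: 3, 3: 1, 2: 4, 4: 2},
    "(1 2)(3 4)": {1: 2, 2: 1, 3: 4, 4: 3},
    "(1 4)(2 3)": {1: 4, 4: 1, 2: 3, 3: 2},
    "(1 3 2 4)":  {1: 3, 3: 2, 2: 4, 4: 1},
    "(1 4 2 3)":  {1: 4, 4: 2, 2: 3, 3: 1},
}


def phase_shift(phases, perm):
    """Return theta if (perm, theta) is a spatio-temporal symmetry, else None.

    The pair is a symmetry when moving every leg i to perm[i] changes the
    phase by the same amount theta (mod 1).
    """
    shifts = {(phases[perm[i]] - phases[i]) % 1 for i in phases}
    return shifts.pop() if len(shifts) == 1 else None


for gait, phases in GAITS.items():
    row = {name: phase_shift(phases, perm) for name, perm in PERMS.items()}
    print(gait, {name: str(theta) for name, theta in row.items() if theta is not None})
```

Exact rational arithmetic avoids rounding issues when phases are compared modulo one period. The four-cycles are symmetries only of the walk, which is why the walk row of the table contains permutations that do not appear in the other rows.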

Biologists often assume that animal nervous systems contain a variety of central pattern generators (CPGs), each (partially) directing a specific function. For example, locomotor CPGs are supposed to control the rhythms associated with standard quadrupedal gaits. Locomotor CPGs are themselves often modelled by a coupled cell system where each cell is a cluster of neurons that is responsible for directing motion in a single leg. It is usually assumed that the various clusters are identical and coupled. The simplest such model consists of four cells, one for each leg.

Golubitsky, Stewart, Buono, and Collins [1998] and Buono and Golubitsky [2001] argue that, because of the spatio-temporal symmetries present in the gaits walk, trot, and pace, this four-cluster structure cannot be an appropriate model for quadrupedal gaits. The reason is that with four cells, symmetry forces the trot and pace gaits to correspond to conjugate solutions in the model; that is, these two solutions must exist simultaneously and be stable simultaneously. But many animals pace but do not trot (a camel, for instance) and many animals trot but do not pace (a squirrel, for instance). Although gait selection could in principle be accomplished by using different initial conditions, this option is not especially attractive and we seek something more robust.

These authors then show that there is a unique eight-cell model that can produce walk, trot, and pace, while avoiding the conjugacy problem. In this model the motion of each leg is directed by the output from two cells: see Figure 2.2. For purposes of visualization we may assume that only the first four cells send signals to the animal’s four legs. Suppose that $x_j(t)$ is the time series associated to the jth cell. Then, with the period normalized to 1, the gait ‘pace’ corresponds to a solution where

\[
x_2(t) = x_1\left(t + \tfrac{1}{2}\right), \qquad x_3(t) = x_1(t), \qquad x_4(t) = x_1\left(t + \tfrac{1}{2}\right).
\]

Observe that this network consists of two unidirectional rings of four cells each. The coupling within a single ring is called ipsilateral and the coupling between rings is called contralateral. In the figure, different types of lines are used to represent each type of coupling. The symmetry group of the network is generated by two elements: the transposition $\kappa$ that interchanges the left and right rings, and the four-cycle $\omega$ that permutes the cells in each ring simultaneously. Thus, the symmetry group of this network is $\Gamma = \mathbb{Z}_4(\omega) \times \mathbb{Z}_2(\kappa)$.

Finally, this network can be generalized to a CPG model for myriapods with N pairs of legs by coupling two directed rings with 2N cells each, leading to a network with 4N cells. The symmetry group of this network is $\mathbb{Z}_{2N}(\omega) \times \mathbb{Z}_2(\kappa)$.
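The structure of the symmetry group of the eight-cell network can be verified by writing $\omega$ and $\kappa$ as permutations of the cells and generating the group they span. The sketch below uses the cell numbering of Figure 2.2 (odd cells in the left ring, even cells in the right ring); the direction chosen for the four-cycle and all function names are assumptions made for illustration.

```python
def compose(p, q):
    """Composition "p after q" of permutations stored as tuples of 0-indexed images."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(generators):
    """Closure of a list of permutations under composition (a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def from_cycles(cycles, n=8):
    """Build a permutation of cells 1..n from cycle notation."""
    image = list(range(n))
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            image[a - 1] = b - 1
    return tuple(image)


# omega advances both rings by one cell; kappa swaps the left and right rings.
omega = from_cycles([(1, 3, 5, 7), (2, 4, 6, 8)])
kappa = from_cycles([(1, 2), (3, 4), (5, 6), (7, 8)])
Gamma = generate([omega, kappa])

print(len(Gamma))                                      # order of the group
print(compose(omega, kappa) == compose(kappa, omega))  # the generators commute
```

The group has eight elements and the generators commute, as expected for $\mathbb{Z}_4 \times \mathbb{Z}_2$. For the myriapod network only the cycle lists passed to `from_cycles` would need to change.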

2.3 Pendulum Ring Coupled by Torsion Springs ($D_N$)

This example is one of the simplest nonlinear Hamiltonian coupled cell systems: it is perhaps best thought of as a chain of nonlinear oscillators with periodic boundary conditions, but we will think of it as a ring to keep the number of cells finite. Consider a ring of N identical simple pendula, coupled in nearest-neighbor fashion by torsion springs (Figure 2.3). The elastic force exerted by such a spring is proportional to the difference between the angular positions of its endpoints.

[Figure 2.2: network diagram. Left ring: cells 1 (LH), 3 (LF), 5 (LH), 7 (LF); right ring: cells 2 (RH), 4 (RF), 6 (RH), 8 (RF).]

Figure 2.2. Eight-cell network for quadruped locomotor central pattern generator. The signals from cells 1 and 5 are sent to the left hind leg and the signals from cells 3 and 7 are sent to the left foreleg. Similar statements hold for the right side of the network. Ipsilateral coupling is indicated by solid lines and contralateral coupling is indicated by dashed lines.

Figure 2.3. Ring of identical pendula coupled by torsion springs (bold lines). Each pendulum swings in a vertical plane through the center of a regular N -gon (here N = 6).

We can represent this as a coupled cell system, where each cell corresponds to a pendulum. There is a trivial equilibrium in which each pendulum is stationary and hangs vertically downwards. The problem we address here is the existence of small-amplitude time-periodic oscillations near that equilibrium. In §4 we classify possible spatio-temporal symmetries of periodic states of this system. In §5.4.1 we use symmetry methods and the coupled cell viewpoint to prove that generically (that is, for almost all values of the gravitational constant) there exist at least $\lceil \frac{3N}{2} - 1 \rceil$ distinct families of small-amplitude time-periodic oscillations, each parametrized by energy. The existence of these families of solutions depends on the fact that the coupled cell system has extra structure, namely, there is an internal cell symmetry due to the mechanical nature of the Hamiltonian system. See §4.3.1.
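To make the symmetries of the pendulum ring concrete, the sketch below writes down one possible set of equations of motion and three symmetry operations: rotation of the ring, reflection of the ring, and the internal symmetry that negates every angle and momentum. The Hamiltonian, the normalization (unit mass and length), and the parameter names `g` and `c` are assumptions made for illustration; the text does not specify them.

```python
import math


def pendulum_ring(theta, p, g=1.0, c=0.5):
    """Vector field for N unit-mass, unit-length pendula on a ring.

    Assumed Hamiltonian (normalized units, not taken from the text):
        H = sum_i [ p_i^2 / 2 - g cos(theta_i) + (c / 2) (theta_{i+1} - theta_i)^2 ],
    so each torsion spring exerts a torque proportional to an angle difference.
    """
    n = len(theta)
    dtheta = list(p)
    dp = [
        -g * math.sin(theta[i])
        + c * (theta[(i + 1) % n] - 2 * theta[i] + theta[(i - 1) % n])
        for i in range(n)
    ]
    return dtheta, dp


def rotate(v):
    """Generator of the rotations in D_N: cell i takes the state of cell i - 1."""
    return v[-1:] + v[:-1]


def reflect(v):
    """A reflection in D_N: reverse the order of the cells."""
    return v[::-1]


def negate(v):
    """Internal Z_2 symmetry: (theta, p) -> (-theta, -p) in every cell."""
    return [-a for a in v]


def commutes_with(sym, theta, p, tol=1e-12):
    """Check that applying sym before or after the vector field gives the same result."""
    dtheta, dp = pendulum_ring(theta, p)
    dtheta_s, dp_s = pendulum_ring(sym(theta), sym(p))
    return all(
        abs(a - b) < tol
        for a, b in zip(dtheta_s + dp_s, sym(dtheta) + sym(dp))
    )
```

Under these assumptions all three operations commute with the vector field, which is the equivariance property behind the total symmetry group $D_N \times \mathbb{Z}_2$.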

2.4 Coupled Hypercolumns ($D_4$)

Neurons in the primary visual cortex V1 are known to be sensitive to the orientation of contours in the visual field. As discussed in Bressloff, Cowan, Golubitsky, Thomas, and Wiener [2001], the pattern of interconnection of these neurons has interesting symmetry properties, and these symmetries seem to be responsible in part for the types of geometric patterns that are reported in visual hallucinations.

Using microelectrodes, voltage-sensitive dyes, and optical imaging, scientists have accumulated information about the distribution of orientation-selective cells in V1, and about their pattern of interconnection. These studies can be interpreted to suggest that approximately every millimeter there is an iso-orientation patch with a given orientation preference and that a set of orientation patches covering the orientation domain $[0, \pi)$ (for each eye) occurs (in humans) in a millimeter-square slab of V1. This slab was called a hypercolumn by Hubel and Wiesel [1974]. Thus there seem to be at least two length scales:

(a) Local: cells less than a millimeter apart tend to make inhibitory connections with most of their neighbors in a roughly isotropic fashion, and

(b) Lateral: cells make excitatory contact only every millimeter or so along their axons with cells in similar iso-orientation patches.

The experimental description of the local and lateral connections in V1 is illustrated in Figure 2.4. The neurons in each hypercolumn are all-to-all coupled, while the connections between hypercolumns couple only those neurons that are sensitive to the same contour orientation. Moreover, if two hypercolumns lie in a direction $\phi$ from each other in V1, then only those neurons sensitive to contours oriented at angle $\phi$ are connected. Except for boundaries these connections are the same at every hypercolumn in V1.

The simplest discrete model for orientation tuning in hypercolumns is a model system of four hypercolumns arranged in a square, as suggested by Nancy Kopell and shown in Figure 2.5 (left). In this model the jth hypercolumn consists of two cells: one, $H_j$, is sensitive to horizontal contours and the other, $V_j$, is sensitive to vertical contours. As suggested by Figure 2.4, the connections between hypercolumns are restricted to connecting those cells that have like sensitivity, and then only when the cells are aligned along the line of their orientation preference. The result is shown in Figure 2.5 (left). This network has $D_4$ symmetry, since it is the same network as the octagonal one shown in Figure 2.5 (right), where the $D_4$ symmetry is transparent.
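The claim that the four-hypercolumn network has $D_4$ symmetry can be checked by brute force: list the two kinds of connections and count the relabellings of the eight cells that preserve both. The sketch below assumes a specific layout (hypercolumns 1 and 3 on the top row, 2 and 4 on the bottom row) and treats every connection as undirected; both are assumptions made for illustration, as are the names used.

```python
from itertools import permutations

CELLS = ["H1", "V1", "H2", "V2", "H3", "V3", "H4", "V4"]

# Local connections: the two cells inside each hypercolumn.
LOCAL = {frozenset(e) for e in [("H1", "V1"), ("H2", "V2"), ("H3", "V3"), ("H4", "V4")]}

# Lateral connections: like-oriented cells aligned along their preferred direction.
LATERAL = {frozenset(e) for e in [("H1", "H3"), ("H2", "H4"),    # horizontal neighbours
                                  ("V1", "V2"), ("V3", "V4")]}   # vertical neighbours


def preserves(edges, mapping):
    """True if the relabelling maps the edge set onto itself."""
    return {frozenset(mapping[c] for c in e) for e in edges} == edges


def automorphisms():
    """All relabellings of the cells that preserve both coupling types."""
    found = []
    for image in permutations(CELLS):
        mapping = dict(zip(CELLS, image))
        if preserves(LOCAL, mapping) and preserves(LATERAL, mapping):
            found.append(mapping)
    return found


symmetries = automorphisms()
print(len(symmetries))                        # order of the symmetry group
print(sorted({m["H1"] for m in symmetries}))  # cells reachable from H1
```

With this layout the connections form an octagon whose edges alternate between local and lateral, which is the picture on the right of Figure 2.5. The search visits all 8! relabellings, which is affordable here but would not scale to larger networks.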

[Figure 2.4: diagram with labels "local connections", "lateral connections", and "hypercolumn".]

Figure 2.4. Illustration of isotropic local and anisotropic lateral connection patterns.

3 A Formal Definition of a Coupled Cell System

With these examples in mind, we give a formal definition of a coupled cell system that is intimately related to its symmetries. Later we discuss the solution types that are consistent with and forced by symmetry. We also specialize the notion of a coupled cell system to the Hamiltonian context. Let $\mathcal{N} = \{1, \dots, N\}$ and let $P_j$ be a manifold for $j \in \mathcal{N}$.

3.1 Definition. A coupled cell system is a dynamical system

\[
\frac{dx}{dt} = F(x) \tag{3.1}
\]

defined on the space $P = P_1 \times \cdots \times P_N$, where the $P_j$ are the cells of the system and the projections $\pi_j : P \to P_j$ are the cell projections. Let $x(t)$ be a trajectory of (3.1). Then the jth cell trajectory is $x_j(t) = \pi_j(x(t))$.

[Figure 2.5: two network diagrams on the cells H1, H2, H3, H4, V1, V2, V3, V4.]

Figure 2.5. (Left) Schematic eight-cell network for orientation tuning model. (Right) Equivalent network.

Abstractly, this completes the definition, but we need to be able to do two things: interpret a coupled cell system in terms of its individual cells and how they are coupled, and decompose the dynamical system into different levels of coupling. That is, we need to set up links between the abstract concept and the intuitive one employed in areas of application. The basic idea is that F can be decomposed as a sum of terms, which correspond to various types of coupling. A formal definition is postponed to §6 to avoid complicating a relatively simple idea with technicalities: we summarize the basic ideas here.

The vector field F can be written (in an essentially unique way) as a sum of terms that depend on none of the $x_j$, on just one of them, on just two of them, and so on. Let $\Phi_k$ be the terms that depend on exactly k of the $x_j$. Each $\Phi_k$ can be further decomposed according to which $x_j$ actually occur (that is, the value of $\Phi_k(x)$ is not independent of $x_j$). The constant part of F is $\Phi_0$. We can write $\Phi_1 = f_1 + \cdots + f_N$ where $f_j$ depends only on $x_j$. Then the ith component of $f_i$ defines the internal dynamics of cell i. In a similar manner (the details require a little care) we can define the coupling from cell i to cell j. When the system has ‘point to point’ coupling, as in (1.1), this takes care of the whole of F, but in general there might be ‘three-cell’ coupling terms involving three different $x_j$, and so on. Such terms can also be given a canonical meaning.

Associated with a coupled cell system is a ‘decorated directed graph’ (more generally a labelled oriented simplicial complex) whose nodes correspond to the N cells of the system and whose edges (or higher-dimensional simplices) correspond to various types of coupling. An edge from node j to node i exists if and only if $F_i$ contains terms that depend only on $x_i$ and $x_j$, and so on. The resulting graph (or complex) provides a schematic description of which cells influence which, but not of what these influences actually are.

The key ingredient for this paper is symmetry. Suppose that a group $\Gamma \subset S_N$ permutes the nodes in $\mathcal{N}$. Nodes are said to be identical (or to have the same type) if they lie in the same $\Gamma$-orbit. Edges are said to be identical (or to have the same type) if they lie in the same $\Gamma$-orbit, where $\Gamma$ is now acting on pairs $(i, j)$ with $i \neq j$. In practice we draw nodes of the same type with the same kind of symbol (circle, box, ...) and we draw edges of the same type with the same kind of arrow (single head, two heads, double shaft, ...).

Our four examples give four different examples of cell complexes: an N-node simplex, two rings of four nodes each, a ring of N nodes, and a ring of eight nodes (with $D_4$ symmetry). Note that the animal gaits model in Figure 2.2 and the coupled hypercolumn model in Figure 2.5 (right) each have two different types of arrows in their definitions: the first case distinguishes between ipsilateral and contralateral coupling and the second case distinguishes between local and lateral coupling.

A symmetry $\gamma \in \Gamma$ acts on the phase space $P$ by

\[
\gamma(x_1, \dots, x_N) = (x_{\gamma^{-1}(1)}, \dots, x_{\gamma^{-1}(N)}). \tag{3.2}
\]
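The inverse in (3.2) is what makes the permutation action compatible with composition: moving the state of cell i to cell $\gamma(i)$ gives $(\gamma\delta)x = \gamma(\delta x)$. The short sketch below implements the action for 0-indexed cells and checks this property over all of $S_3$; the function names and the symbolic states are illustrative assumptions.

```python
from itertools import permutations


def act(gamma, x):
    """Action (3.2): (gamma x)_i = x_{gamma^{-1}(i)}, with cells 0-indexed.

    gamma is a tuple with gamma[i] the image of i, so the state of cell i
    is moved to cell gamma[i].
    """
    y = [None] * len(x)
    for i, xi in enumerate(x):
        y[gamma[i]] = xi
    return y


def compose(gamma, delta):
    """Product gamma * delta, meaning apply delta first, then gamma."""
    return tuple(gamma[delta[i]] for i in range(len(delta)))


x = ["a", "b", "c"]          # symbolic cell states make the bookkeeping visible
gamma = (1, 2, 0)            # the three-cycle 0 -> 1 -> 2 -> 0
print(act(gamma, x))         # ['c', 'a', 'b']

# Compatibility with composition for every pair of permutations in S_3.
is_left_action = all(
    act(compose(g, d), x) == act(g, act(d, x))
    for g in permutations(range(3))
    for d in permutations(range(3))
)
print(is_left_action)
```

Had the action been defined by $(\gamma x)_i = x_{\gamma(i)}$ instead, the order of composition would be reversed, which matters as soon as the group is non-abelian.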

3.2 Definition. The coupled cell system is symmetric under $\Gamma$ if F is $\Gamma$-equivariant. By extension we also say that the coupled cell system is $\Gamma$-equivariant.

The equivariance assumption implies that the internal dynamics of nodes of the same type are identical. Similarly, coupling terms corresponding to edges of the same type are identical, and the same goes for multi-cell coupling terms. In particular:

3.3 Definition. A cell complex has identical cells if $\Gamma$ acts transitively on the nodes.

Each of the systems in our four examples consists of identical cells.
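Node types and edge types are simply orbits of $\Gamma$, so they can be computed mechanically. The sketch below does this for the eight-cell gait network, with cell k+1 of Figure 2.2 stored as index k. The directions of the ipsilateral and contralateral edges are assumptions made for illustration, as are the function names.

```python
def compose(p, q):
    """Composition "p after q" of permutations stored as tuples of images."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate(generators):
    """Closure of a list of permutations under composition."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def orbits(group, points, move):
    """Partition points into orbits; move(g, pt) applies a group element to a point."""
    remaining, result = set(points), []
    while remaining:
        pt = remaining.pop()
        orbit = {move(g, pt) for g in group}
        remaining -= orbit
        result.append(orbit)
    return result


omega = (2, 3, 4, 5, 6, 7, 0, 1)      # advance both rings by one cell
kappa = (1, 0, 3, 2, 5, 4, 7, 6)      # swap the left and right rings
Gamma = generate([omega, kappa])

node_types = orbits(Gamma, range(8), lambda g, i: g[i])

# Ordered pairs of distinct cells joined by ipsilateral or contralateral coupling.
ipsilateral = [(i, (i + 2) % 8) for i in range(8)]
contralateral = [(i, i ^ 1) for i in range(8)]
edge_types = orbits(Gamma, ipsilateral + contralateral,
                    lambda g, e: (g[e[0]], g[e[1]]))

print(len(node_types), len(edge_types))
```

A single node orbit means that the network has identical cells in the sense of Definition 3.3, while the two edge orbits correspond to the two kinds of arrows drawn in Figure 2.2.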

3.1 Coupled Cell Systems with Additional Structure

There are three types of additional structure that are routinely placed on coupled cell systems of differential equations; in general, these structures change the kind of dynamics that one can expect from the coupled cell system. These structures are: restrictions on the type of coupling, Hamiltonian cells, and internal symmetries.

The cell system (1.1) is restricted because the coupling at each node i is just the sum of couplings from all nodes connected to i; we call this point-to-point coupling. Other types of special coupling include diffusive, synaptic, nearest neighbor, dead cells stay dead, and linear. For example, the coupled pendula, animal gaits CPG, and simple hypercolumn models are all assumed to have nearest neighbor coupling, while the speciation model is an example of all-to-all coupling. Note that in the gaits model and the hypercolumn model, the couplings between nearest neighbor cells are not identical; indeed, generally we assume that couplings are identical only when that feature is forced by symmetry.


In some models the internal cell dynamics is restricted by extra structure. For example, Hamiltonian coupled cell systems are coupled cell systems where each cell is assumed to be Hamiltonian. In these models we assume that the permutation group of the cell complex acts symplectically, that is, the symplectic structures on the phase spaces of any two cells related by a permutation symmetry are identical. Coupled pendula provide an example of a Hamiltonian coupled cell system.

Another way that the internal cell dynamics may be restricted is through the existence of symmetry. In this case we refer to the permutation symmetries of the cells as global symmetries and the symmetries within each cell as local symmetries. As we shall see, the coupled pendula model has a transpositional $\mathbb{Z}_2$ symmetry related to the fact that it models a mechanical system; the total symmetry group of this coupled cell system is $D_N \times \mathbb{Z}_2$. See Dionne, Golubitsky, and Stewart [1996] and Dias [1998] for a more detailed discussion of coupled cell systems with internal symmetries.

It has recently been observed that symmetries of a subset of cells in a coupled cell system can be responsible for symmetric patterns in the dynamics of those cells. For details, see Golubitsky, Pivato, and Stewart [2002]. This phenomenon of interior symmetries is natural in a coupled cell formalism and is suited to observational tests, but has no natural counterpart in equivariant dynamical systems theory.

3.2 Symmetry and Modelling

Coupled cell systems are often used as models in a schematic sense: the exact form that the model equations may have is unknown. All that is known is which cells have equal influences on other cells. The examples on speciation, animal gaits, and hypercolumns all fall into this category. In these cases, it is the symmetry of the coupled cell system that is the important modelling assumption, not the detailed equations for the cells.

For example, in the animal gaits locomotor CPG model, the cells themselves may represent individual neurons or, as is more likely, collections of neurons. Should the internal dynamics of each cell be modelled by a single Hodgkin–Huxley system, or for simplicity by Morris–Lecar or FitzHugh–Nagumo equations, or more realistically by a collection of Hodgkin–Huxley systems? Should the cell coupling be modelled by nearest-neighbor point-to-point coupling or more realistically by couplings that include dependence on all cells? In many cases, such issues are secondary because there is no well-established physical or biological reason to make any particular choice.

In this sense, the most important modelling assumption for the locomotor CPG model is the symmetry assumption; that is, the coupled cell system has $\Gamma = \mathbb{Z}_2 \times \mathbb{Z}_4$ symmetry. In these circumstances, the only a priori assumption on the form of the coupled cell system that we should make is $\Gamma$-equivariance. That is, we need to study $\Gamma$-symmetric coupled cell systems defined on the state space $(\mathbb{R}^k)^8$. We begin this process in Section 4.

4 Spatio-Temporal Patterns in Coupled Cell Systems

We begin this section by reviewing the definitions of spatial symmetries of equilibria and of spatio-temporal symmetries of time-periodic solutions of $\Gamma$-equivariant systems of ODE. Suppose that
$$\dot{x} = f(x) \tag{4.1}$$
is a system of differential equations with $x \in \mathbf{R}^n$ and symmetry group $\Gamma$. The symmetry group of an equilibrium $x_0$ of (4.1) is just the isotropy subgroup of $x_0$, that is, the spatial symmetries $\gamma \in \Gamma$ that fix $x_0$. For example, suppose $\Gamma = S_N$, where $S_N$ acts on $\mathbf{R}^N$ by permuting coordinates. It is a straightforward exercise to show that up to conjugacy the isotropy subgroups of this action of $S_N$ all have the form
$$S_{n_1} \times \cdots \times S_{n_k} \tag{4.2}$$

where $n_1 + \cdots + n_k = N$. Thus, in the speciation model (2.1), equilibria correspond to decomposition of the population into $k$ species, where $k \le N$. See Figure 2.1 for an example where $N = 25$. The equilibrium on the left represents one species and has isotropy subgroup $S_{25}$, while the one on the right represents two species and has isotropy subgroup $S_9 \times S_{16}$.

The symmetries of time-periodic solutions are more complicated to describe than the symmetries of equilibria. To begin, suppose that $x(t)$ is a $T$-periodic solution of (4.1) and that $\gamma \in \Gamma$. We discuss the ways in which $\gamma$ can be a symmetry of $x(t)$; the main tool is the uniqueness theorem for solutions to the initial value problem for (4.1). We know that $\gamma x(t)$ is another $T$-periodic solution of (4.1). Should the two trajectories intersect, then the common point of intersection would be the same initial point for the two solutions. Uniqueness of solutions implies that the trajectories of $\gamma x(t)$ and $x(t)$ would be identical. So either the two trajectories are identical or they do not intersect.

Suppose that the two trajectories are identical. Then uniqueness of solutions implies that there exists $\theta \in \mathbf{S}^1 = [0, T]$ (with the endpoints identified) such that $\gamma x(t) = x(t - \theta)$, or
$$\gamma x(t + \theta) = x(t). \tag{4.3}$$
We call $(\gamma, \theta) \in \Gamma \times \mathbf{S}^1$ a spatio-temporal symmetry of the solution $x(t)$. A spatio-temporal symmetry of $x(t)$ for which $\theta = 0$ is called a spatial symmetry, since it fixes the point $x(t)$ at every moment of time. The group of all spatio-temporal symmetries of $x(t)$ is denoted $\Sigma_{x(t)} \subset \Gamma \times \mathbf{S}^1$.

Next we show how the symmetry group $\Sigma_{x(t)}$ can be identified with a pair of subgroups $H$ and $K$ of $\Gamma$ and a homomorphism from $H$ into $\mathbf{S}^1$ with

8. Patterns of Oscillation in Coupled Cell Systems

kernel $K$. Define
$$K = \{\gamma \in \Gamma : \gamma x(t) = x(t)\ \ \forall t\}, \qquad H = \{\gamma \in \Gamma : \gamma\{x(t)\} = \{x(t)\}\}. \tag{4.4}$$

The subgroup K ⊂ Σx(t) is the group of spatial symmetries of x(t) and the subgroup H consists of those symmetries that preserve the trajectory of x(t) — in short, the spatial parts of spatio-temporal symmetries of x(t). Indeed, the groups H ⊂ Γ and Σx(t) ⊂ Γ × S1 are isomorphic; the isomorphism is just the restriction to Σx(t) of the projection of Γ × S1 onto Γ.
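The definitions in (4.3) and (4.4) can be tried out on a sampled solution. The following is a minimal sketch, not taken from the text: it assumes a hypothetical 1-periodic three-cell wave $x_i(t) = \cos(2\pi(t + i/3))$, lets $\Gamma = S_3$ act by $(\gamma x)_i = x_{\mathrm{perm}[i]}$, and searches a grid of `M` time samples (an assumed discretization) for the phase shift $\theta$ with $\gamma x(t) = x(t - \theta)$. The names `wave`, `phase_shift` and `classify` are illustrative only.

```python
from fractions import Fraction
from itertools import permutations
from math import cos, pi

M = 12  # samples per period (assumed grid; the period is normalised to 1)
# One period of cos(2*pi*t) sampled at t = j/M, stored once so that shifted
# copies of the wave can be compared exactly.
TABLE = [cos(2 * pi * j / M) for j in range(M)]


def wave(i, k):
    """Cell i of the sample solution x_i(t) = cos(2*pi*(t + i/3)) at t = k/M."""
    return TABLE[(k + 4 * i) % M]


def phase_shift(perm):
    """Return theta with (gamma x)(t) = x(t - theta), or None if gamma is not in H.

    The permutation acts on the cells by (gamma x)_i = x_{perm[i]}.
    """
    for s in range(M):
        if all(wave(perm[i], k) == wave(i, (k - s) % M)
               for i in range(3) for k in range(M)):
            return Fraction(s, M)
    return None


def classify():
    """Return H as a dict {permutation: theta} and K as the list with theta = 0."""
    H = {}
    for perm in permutations(range(3)):
        theta = phase_shift(perm)
        if theta is not None:
            H[perm] = theta
    K = [g for g, theta in H.items() if theta == 0]
    return H, K


H, K = classify()
for g, theta in sorted(H.items()):
    print(g, theta)
```

For this sample wave only the three cyclic permutations preserve the trajectory, so $H \cong \mathbf{Z}_3$ and $K$ is trivial; the map from each permutation to its $\theta$ is the homomorphism $H \to \mathbf{S}^1$ with kernel $K$.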

4.1

A Classiﬁcation Theorem for Spatio-Temporal Symmetries

There are three straightforward algebraic restrictions placed on the pair $H$ and $K$ defined in (4.4) in order for them to correspond to symmetries of a periodic solution. Recall that the fixed-point subspace of a subgroup $\Sigma \subset \Gamma$ is
$$\operatorname{Fix}(\Sigma) = \{x \in \mathbf{R}^n : \sigma x = x\ \ \forall \sigma \in \Sigma\}$$
and that fixed-point subspaces are flow invariant, that is, $f : \operatorname{Fix}(\Sigma) \to \operatorname{Fix}(\Sigma)$.

4.1 Lemma. Let $x(t)$ be a periodic solution of (4.1) and let $H$ and $K$ be the subgroups of $\Gamma$ defined in (4.4). Then
(a) $K$ is a normal subgroup of $H$ and $H/K$ is either cyclic or $\mathbf{S}^1$.
(b) $K$ is an isotropy subgroup for the $\Gamma$-action.
(c) $\dim \operatorname{Fix}(K) \ge 2$.

Proof. For each $\gamma \in H$ there is a unique $\theta \in \mathbf{S}^1$ such that $(\gamma, \theta)$ is a spatio-temporal symmetry of $x(t)$. Uniqueness of solutions implies that the mapping $\Theta : H \to \mathbf{S}^1$ defined by $\Theta(\gamma) = \theta$ is a group homomorphism. By definition, the kernel of this homomorphism is $K$, thus verifying (a). Let $x_0 = x(0)$ and suppose that $\sigma x_0 = x_0$. Then $\sigma x(t)$ is another (periodic) solution with initial condition $x_0$. It follows that $\sigma x(t) = x(t)$ and that $\sigma \in K$. Therefore, the isotropy subgroup of $x_0$ is in $K$. Conversely, by definition, $\sigma \in K$ fixes $x_0$, and (b) is valid. Also by definition the trajectory $\{x(t)\}$ lies in $\operatorname{Fix}(K)$; since a nonconstant periodic trajectory cannot lie in a one-dimensional subspace, (c) must be valid.

4.2 Definition. When $H/K \cong \mathbf{Z}_m$ the periodic solution $x(t)$ is called either a standing wave or (usually for $m \ge 3$) a discrete rotating wave; and when $H/K \cong \mathbf{S}^1$ it is called a rotating wave.

In fact, the pair $H$ and $K$ must satisfy two restrictions in addition to those listed in Lemma 4.1. We discuss one of those in detail here. Let $\Gamma$ be a finite group acting on $\mathbf{R}^n$ and let $x(t)$ be a periodic solution of a $\Gamma$-equivariant system of ODE. Define
$$L_K = \bigcup_{\gamma \notin K} \operatorname{Fix}(\gamma) \cap \operatorname{Fix}(K).$$

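Since $K$ is an isotropy subgroup (Lemma 4.1(b)), $L_K$ is the union of proper subspaces of $\operatorname{Fix}(K)$. More precisely, suppose that $\operatorname{Fix}(\gamma) \supset \operatorname{Fix}(K)$ for some $\gamma \notin K$. Then the isotropy subgroup of every point in $\operatorname{Fix}(K)$ contains both $K$ and $\gamma$. Therefore, the isotropy subgroup of any point in $\operatorname{Fix}(K)$ is larger than $K$, and $K$ is not an isotropy subgroup, which is a contradiction. We claim that
$$H \text{ fixes a connected component of } \operatorname{Fix}(K) \setminus L_K. \tag{4.5}$$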

To verify (4.5) we first show that any $\delta$ in the normalizer $N(K)$ permutes connected components of $\mathbf{R}^n \setminus L_K$. Observe that, for $\gamma \notin K$,
$$\delta(\operatorname{Fix}(\gamma) \cap \operatorname{Fix}(K)) = \operatorname{Fix}(\delta\gamma\delta^{-1}) \cap \operatorname{Fix}(\delta K \delta^{-1}) = \operatorname{Fix}(\delta\gamma\delta^{-1}) \cap \operatorname{Fix}(K).$$
Moreover, $\delta\gamma\delta^{-1} \notin K$. (If it were, then $\gamma$ would be in $\delta^{-1} K \delta = K$, which it is not.) Therefore $\delta : L_K \to L_K$. Since $\delta$ is invertible, $\delta : \mathbf{R}^n \setminus L_K \to \mathbf{R}^n \setminus L_K$ and $\delta$ permutes the connected components of $\mathbf{R}^n \setminus L_K$.

Since $H/K$ is cyclic, we can choose an element $h \in H$ that projects onto a generator of $H/K$. We now show that $h$ (and hence $H$) must fix one of the connected components of $\mathbf{R}^n \setminus L_K$. Suppose that the trajectory of $x(t)$ intersects the flow-invariant subspace $\operatorname{Fix}(\gamma) \cap \operatorname{Fix}(K)$. Flow-invariance of $\operatorname{Fix}(\gamma)$ implies that $\gamma$ is a spatial symmetry of the solution $x(t)$, and by definition $\gamma \in K$. Therefore the trajectory of $x(t)$ does not intersect $L_K$. Since $h$ is a spatio-temporal symmetry of $x(t)$, it preserves the trajectory of $x(t)$. Therefore, $h$ must map the connected component of $\mathbf{R}^n \setminus L_K$ that contains the trajectory of $x(t)$ into itself, thus verifying (4.5).

The main theorem of this section is a characterization of the possible spatio-temporal symmetries of periodic solutions.

4.3 Theorem (Buono and Golubitsky [2001]). Let $\Gamma$ be a finite group acting on $\mathbf{R}^n$. There is a periodic solution to some $\Gamma$-equivariant system of ODE on $\mathbf{R}^n$ with spatial symmetries $K$ and spatio-temporal symmetries $H$ if and only if
(a) $H/K$ is cyclic.
(b) $K$ is an isotropy subgroup.
(c) $\dim \operatorname{Fix}(K) \ge 2$. If $\dim \operatorname{Fix}(K) = 2$, then either $H = K$ or $H = N(K)$.
(d) $H$ fixes a connected component of $\mathbf{R}^n \setminus L_K$.

Moreover, when these conditions hold, there exists a smooth Γ-equivariant vector ﬁeld with an asymptotically stable limit cycle with the desired symmetries.


4.4 Corollary. For pairs $(H, K)$ satisfying conditions (a)-(d) of Theorem 4.3, the property of having periodic solutions with spatial symmetries $K$ and spatio-temporal symmetries $H$ is robust in $\Gamma$-equivariant systems of ODE on $\mathbf{R}^n$.

The case when the internal dynamics of each cell of a coupled cell system has dimension $k \ge 2$ motivates the following corollary to Theorem 4.3.

4.5 Corollary. Let $\Gamma$ be a finite group acting on $V$ and suppose that $W = V^k$ for some $k \ge 2$. Then there is a hyperbolic periodic solution to some $\Gamma$-equivariant system of ODE on $\mathbf{R}^n$ with spatial symmetries $K$ and spatio-temporal symmetries $H$ if and only if
(a) $H/K$ is cyclic.
(b) $K$ is an isotropy subgroup.
(c) If $\dim \operatorname{Fix}(K) = 2$, then either $H = K$ or $H = N(K)$.

A Two-Cell Coupled Cell Example

Our first example of a coupled cell system is the simplest possible one: the two-cell system pictured in Figure 4.1.

Figure 4.1. A two-cell coupled cell system.

The corresponding system of ODEs is
$$\begin{aligned} \dot{x}_1 &= f(x_1, x_2) \\ \dot{x}_2 &= f(x_2, x_1) \end{aligned} \tag{4.6}$$

where $x_1, x_2 \in \mathbf{R}^k$. The symmetry group for the two-cell system is $\mathbf{Z}_2(\kappa)$ where $\kappa(x_1, x_2) = (x_2, x_1)$. According to symmetry there are three possible types of periodic solutions in this cell system and they correspond to: $(H, K) = (\mathbf{Z}_2, \mathbf{Z}_2)$, $(H, K) = (\mathbf{Z}_2, 1)$, and $(H, K) = (1, 1)$. Suppose that $x(t) = (x_1(t), x_2(t))$ is a 1-periodic solution to (4.6).
• If $x(t)$ corresponds to $(H, K) = (\mathbf{Z}_2, \mathbf{Z}_2)$, then it is a synchronous solution where $x_2(t) = x_1(t)$.
• If $x(t)$ corresponds to $(H, K) = (\mathbf{Z}_2, 1)$, then it is an out of phase solution where $x_2(t) = x_1(t + \tfrac12)$.
• If $x(t)$ corresponds to $(H, K) = (1, 1)$, then $x(t)$ is asymmetric, but then $(x_2(t), x_1(t))$ is also a 1-periodic solution.

It follows from Corollary 4.5 that there are stable limit cycles with each of these symmetry types when $k \ge 2$; indeed it is not too difficult to find examples of each type of periodic solution. When $k = 1$ Theorem 4.3 precludes the existence of both synchronous and out of phase periodic solutions. Note that $\operatorname{Fix}(\mathbf{Z}_2) = \{(x_1, x_1)\}$. So synchronous solutions cannot exist since $\dim \operatorname{Fix}(\mathbf{Z}_2) = 1$, and out of phase solutions cannot exist since $\kappa$ does not fix a connected component of $\mathbf{R}^2 \setminus L_1 = \mathbf{R}^2 \setminus \operatorname{Fix}(\kappa)$. Asymmetric periodic solutions can exist when $k = 1$.

Three Cells in a Line

Consider the three-cell coupled cell system pictured in Figure 4.2.

Figure 4.2. Three cells in a line.

The corresponding system of ODEs is
$$\begin{aligned} \dot{x}_1 &= f(x_1, x_2, x_3) \\ \dot{x}_2 &= g(x_1, x_2, x_3) \\ \dot{x}_3 &= f(x_3, x_2, x_1) \end{aligned} \tag{4.7}$$
where $x_1, x_2, x_3 \in \mathbf{R}^k$ and $g(x_3, x_2, x_1) = g(x_1, x_2, x_3)$. The symmetry group for this three-cell system is $\mathbf{Z}_2(\kappa)$, where $\kappa(x_1, x_2, x_3) = (x_3, x_2, x_1)$, and there are still three types of possible periodic solutions: synchronous, out of phase, and asymmetric. Suppose that $x(t) = (x_1(t), x_2(t), x_3(t))$ is a 1-periodic solution to (4.7). Then
• If $x(t)$ is a synchronous solution, then $x_3(t) = x_1(t)$.
• If $x(t)$ is an out of phase solution, then $x_3(t) = x_1(t + \tfrac12)$ and $x_2(t) = x_2(t + \tfrac12)$. That is, the second cell oscillates with twice the frequency of the other cells.
• If $x(t)$ is asymmetric, then $(x_3(t), x_2(t), x_1(t))$ is a 1-periodic solution.

Again it follows from Corollary 4.5 that there are stable limit cycles with each of these symmetry types when $k \ge 2$. When $k = 1$ Theorem 4.3 precludes the existence of the out of phase periodic solutions.
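Condition (c) of Theorem 4.3 is easy to evaluate for the two examples above. The sketch below is illustrative only and rests on one stated assumption: when a group permutes cells whose internal state space is $\mathbf{R}^k$, a fixed point must take equal values on each orbit of cells, so $\dim \operatorname{Fix}(K)$ is $k$ times the number of cell orbits of $K$. The helper names `orbits` and `fix_dim` are hypothetical.

```python
def orbits(n_cells, generators):
    """Orbits of the cells under the group generated by the given permutations.

    Each generator is a tuple g with g[i] = image of cell i (cells are 0-indexed).
    """
    parent = list(range(n_cells))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for g in generators:
        for i in range(n_cells):
            parent[find(i)] = find(g[i])

    classes = {}
    for i in range(n_cells):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())


def fix_dim(n_cells, generators, k):
    """dim Fix(K) on (R^k)^n_cells when K permutes cells: k per orbit of cells."""
    return k * len(orbits(n_cells, generators))


two_cell_swap = (1, 0)       # kappa for the two-cell system (4.6)
three_cell_swap = (2, 1, 0)  # kappa for three cells in a line (4.7)
for k in (1, 2):
    print(k, fix_dim(2, [two_cell_swap], k), fix_dim(3, [three_cell_swap], k))
```

With $k = 1$ this gives dimension 1 for the two-cell system, so condition (c) fails for $K = \mathbf{Z}_2$ there, and dimension 2 for three cells in a line, so synchronous solutions are not excluded by condition (c) in that case.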

4.2

Examples of Spatio-Temporal Symmetries

We now present two examples where spatio-temporal symmetries have important interpretations for the associated periodic solutions.

4.2.1

Animal Gaits

Golubitsky, Stewart, Buono, and Collins [1998, 1999] argue that the eight-cell double-ring network pictured in Figure 2.2 is the simplest network that will produce periodic solutions having the rhythms of the quadruped gaits walk, trot, and pace. The symmetry group of this network is $\Gamma = \mathbf{Z}_4(\omega) \times \mathbf{Z}_2(\kappa)$.

A symmetry type $(H, K)$ of a periodic solution is primary when $H = \Gamma$. If the cell system consists of identical cells (that is, $\Gamma$ acts transitively on the cells), then the signals emanating from each cell in a primary periodic solution are identical up to a phase shift. In this generalized sense the signals sent from each cell in a primary periodic solution are synchronous. It is a straightforward exercise to classify the primary periodic solution types in the network pictured in Figure 2.2; the results are listed in Table 4.2. Note that primary periodic solutions in this network also include models of the bound, the pronk, and an unusual gait called the jump (which has been seen in bucking broncos, as well as in gerbils and rats).

K           Γ/K    Spatio-temporal symmetries    Gait
Γ           1      (none)                        pronk
⟨ω⟩         Z_2    (κ, 1/2)                      pace
⟨κω⟩        Z_2    (κ, 1/2)                      trot
⟨κ, ω²⟩     Z_2    (ω, 1/2)                      bound
⟨κω²⟩       Z_4    (ω, 1/4), (κ, 1/2)            walk±
⟨κ⟩         Z_4    (ω, 1/4)                      jump±

Table 4.2. Symmetries of primary periodic solutions in a Γ = Z_4 × Z_2 model. (The phase-diagram column of the table is not reproduced here.)

4.2.2

Multirhythms

Coupled cell dynamics can lead to situations where different cells are forced by symmetry to oscillate at different frequencies (Golubitsky and Stewart [1986]; Golubitsky, Stewart, and Schaeffer [1988]; Armbruster and Chossat [1999]). As we have seen, certain cells can be forced to oscillate at twice the frequency of other cells, but the range of possibilities is much more complicated. The basic principle is simple (though combinatorial bells and whistles can be added). Let $\gamma$ be an $m$-cycle that is a spatio-temporal symmetry of a coupled cell system having corresponding phase shift $\tfrac{1}{m}$. Suppose, in addition, that $\gamma$ cyclically permutes cells $1, \ldots, m$ and fixes cell $m + 1$. Then cell $m + 1$ must oscillate $m$ times as quickly as cell 1, with one caveat that we will return to in a moment. A simple example of a four-cell system that illustrates this point is given in Figure 4.3. In this coupled cell system


a ponies on a merry-go-round solution $(\mathbf{Z}_3, 1)$ will force cell 4 to oscillate at three times the frequency of the other three cells.

Figure 4.3. Unidirectional ring of three cells (cells 1, 2, 3) with a center cell (cell 4).
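The frequency statement for the fixed cell can be read off from (4.3) in one line. As a short worked step, assuming a 1-periodic solution and the convention $\gamma x(t) = x(t - \theta)$ used there, take the component of that identity belonging to cell $m+1$, which $\gamma$ leaves in place:

```latex
\[
  (\gamma x)_{m+1}(t) = x_{m+1}(t)
  \quad\text{and}\quad
  \gamma x(t) = x\!\left(t - \tfrac{1}{m}\right)
  \;\Longrightarrow\;
  x_{m+1}(t) = x_{m+1}\!\left(t - \tfrac{1}{m}\right)
  \quad\text{for all } t .
\]
```

So $x_{m+1}$ is $\tfrac{1}{m}$-periodic, that is, it completes $m$ oscillations while the cycled cells complete one. For the network of Figure 4.3, $m = 3$ and the center cell has period $\tfrac13$.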

We now return to the caveat: suppose two different cycles with nontrivial temporal symmetries exist. Then they can force two different frequency relations between the cells, and it is quite curious how these two frequency restrictions are resolved into one relation, as we now show. Consider a five-cell system consisting of two rings, one with three cells and one with two cells, as shown in Figure 4.4. The symmetry group of this system is $\Gamma = \mathbf{Z}_3 \times \mathbf{Z}_2 \cong \mathbf{Z}_6$. Note that the internal dynamics of cells 4 and 5 do not have to be the same as that of cells 1, 2, and 3 (indeed, they do not even have to have the same dimensions).

Figure 4.4. Five cell system made of a ring of three (cells 1, 2, 3) and a ring of two (cells 4, 5).

Suppose that a 1-periodic solution
$$X(t) = \big(x_1(t), x_2(t), x_3(t), y_1(t), y_2(t)\big)$$
to this coupled cell system exists. Suppose that this solution has two spatio-temporal symmetries
$$\big((1\,2\,3), \tfrac13\big) \quad\text{and}\quad \big((4\,5), \tfrac12\big).$$
The first symmetry forces the $x_j$ to be in ponies form with (nominally) the frequency of the $y_i$ equal to three times the frequency of the $x_j$. The second symmetry forces the $y_i$ to be a half period out of phase and the $x_j$ to oscillate at twice the frequency of the $y_i$. This apparent nonsense is resolved as follows. The product of the two symmetries is
$$\gamma = \big((1\,2\,3)(4\,5), \tfrac16\big),$$
explicitly exhibiting the isomorphism $\mathbf{Z}_3 \times \mathbf{Z}_2 \cong \mathbf{Z}_6$. Thus $X(t)$ actually has the form
$$X(t) = \big(x(t), x(t + \tfrac13), x(t + \tfrac23), y(t), y(t + \tfrac12)\big),$$
where three times the frequency of $x$ is twice the frequency of $y$.

Does such a solution actually exist? Corollary 4.5 states that it does, at least if all nonlinearities consistent with $\mathbf{Z}_6$ symmetry are permitted to be present. The difficulty is to find a solution corresponding to the pair $(\mathbf{Z}_6, 1)$ in the coupled cell system context. The difficulty is compounded by the fact that no such solution is supported by a primary Hopf bifurcation in this coupled-cell system. The reason is that in Hopf bifurcation the available representations of the symmetry group $\mathbf{Z}_6 \cong \mathbf{Z}_3 \times \mathbf{Z}_2$ are sums of irreducible components of the permutation representation on $\mathbf{R}^{5k}$, where $k$ is the dimension of the state space of a single cell. However, there does exist a more complicated bifurcation scenario that contains such a representation: primary Hopf bifurcation to a $\mathbf{Z}_3$ discrete rotating wave, followed by a secondary Hopf bifurcation using the nontrivial $\mathbf{Z}_2$ representation. We therefore seek a 3:2 resonant solution arising from such a scenario.

Let $x_1, x_2, x_3 \in \mathbf{R}$ be the state variables for the ring of three cells and let $y_1 = (y_1^1, y_2^1)$, $y_2 = (y_1^2, y_2^2) \in \mathbf{R}^2$ be the state variables for the ring of two cells. Consider the system of ODE
$$\begin{aligned}
\dot{x}_1 &= -x_1 - x_1^3 + 2(x_1 - x_2) + D(y_1 + y_2) + 3\big((y_2^1)^2 + (y_2^2)^2\big) \\
\dot{x}_2 &= -x_2 - x_2^3 + 2(x_2 - x_3) + D(y_1 + y_2) + 3\big((y_2^1)^2 + (y_2^2)^2\big) \\
\dot{x}_3 &= -x_3 - x_3^3 + 2(x_3 - x_1) + D(y_1 + y_2) + 3\big((y_2^1)^2 + (y_2^2)^2\big) \\
\dot{y}_1 &= B_1 y_1 - |y_1|^2 y_1 + B_2 y_2 + 0.4\,(x_1^2 + x_2^2 + x_3^2)\,C \\
\dot{y}_2 &= B_1 y_2 - |y_2|^2 y_2 + B_2 y_1 + 0.4\,(x_1^2 + x_2^2 + x_3^2)\,C
\end{aligned} \tag{4.8}$$
where
$$B_1 = \begin{pmatrix} -\tfrac12 & -1 \\ 1 & -\tfrac12 \end{pmatrix}, \quad
B_2 = \begin{pmatrix} -1 & -1 \\ 1 & -1 \end{pmatrix}, \quad
D = \begin{pmatrix} 0.20 & -0.11 \end{pmatrix}, \quad
C = \begin{pmatrix} 0.10 \\ 0.22 \end{pmatrix}.$$
Starting at the initial condition
$$x_1^0 = 1.78, \quad x_2^0 = -0.85, \quad x_3^0 = -0.08, \quad y_1^0 = (-0.16, 0.79), \quad y_2^0 = (0.32, -0.47)$$
we obtain Figures 4.5 and 4.6.
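A numerical experiment with (4.8) can be set up along the following lines. This is a minimal sketch under stated assumptions, not the authors' code: the second matrix is read as $B_2$, $D$ is taken as a row vector acting on $y_1 + y_2$ and $C$ as a column vector, the state is stored as a flat list, and a hand-written fixed-step RK4 integrator with an arbitrary step size stands in for whatever scheme produced the figures. The helpers `rot3` and `swap2` (hypothetical names) implement the permutations $(1\,2\,3)$ and $(4\,5)$, so the $\mathbf{Z}_3 \times \mathbf{Z}_2$-equivariance of the vector field can be checked directly.

```python
B1 = ((-0.5, -1.0), (1.0, -0.5))
B2 = ((-1.0, -1.0), (1.0, -1.0))   # assumed to be the matrix B_2 of (4.8)
D = (0.20, -0.11)                  # assumed row vector acting on y1 + y2
C = (0.10, 0.22)                   # assumed column vector


def rhs(v):
    """Right-hand side of (4.8); v = [x1, x2, x3, y1_1, y1_2, y2_1, y2_2]."""
    x = v[0:3]
    y = (v[3:5], v[5:7])
    drive = (D[0] * (y[0][0] + y[1][0]) + D[1] * (y[0][1] + y[1][1])
             + 3.0 * (y[0][1] ** 2 + y[1][1] ** 2))
    dx = [-x[j] - x[j] ** 3 + 2.0 * (x[j] - x[(j + 1) % 3]) + drive
          for j in range(3)]
    forcing = 0.4 * (x[0] ** 2 + x[1] ** 2 + x[2] ** 2)
    dy = []
    for a, b in ((0, 1), (1, 0)):          # (this cell, the other cell)
        norm2 = y[a][0] ** 2 + y[a][1] ** 2
        for r in range(2):                 # component of the 2-vector
            dy.append(B1[r][0] * y[a][0] + B1[r][1] * y[a][1]
                      - norm2 * y[a][r]
                      + B2[r][0] * y[b][0] + B2[r][1] * y[b][1]
                      + forcing * C[r])
    return dx + dy


def rot3(v):
    """Relabel cells 1, 2, 3 cyclically: (x1, x2, x3) -> (x2, x3, x1)."""
    return [v[1], v[2], v[0]] + v[3:]


def swap2(v):
    """Exchange cells 4 and 5: (y1, y2) -> (y2, y1)."""
    return v[:3] + v[5:7] + v[3:5]


def rk4_step(v, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(v)
    k2 = rhs([a + 0.5 * h * b for a, b in zip(v, k1)])
    k3 = rhs([a + 0.5 * h * b for a, b in zip(v, k2)])
    k4 = rhs([a + h * b for a, b in zip(v, k3)])
    return [a + h * (p + 2 * q + 2 * r + s) / 6.0
            for a, p, q, r, s in zip(v, k1, k2, k3, k4)]


def integrate(v, h, steps):
    """Advance the state by the given number of RK4 steps."""
    for _ in range(steps):
        v = rk4_step(v, h)
    return v


v0 = [1.78, -0.85, -0.08, -0.16, 0.79, 0.32, -0.47]
defect3 = max(abs(a - b) for a, b in zip(rhs(rot3(v0)), rot3(rhs(v0))))
defect2 = max(abs(a - b) for a, b in zip(rhs(swap2(v0)), swap2(rhs(v0))))
print("equivariance defects:", defect3, defect2)
print("state after 10 steps:", integrate(v0, 0.01, 10))
```

Both defects should vanish up to rounding, which confirms that the reconstructed right-hand side commutes with the two generators. Recording the trajectory over a long time window and plotting the components would be the natural next step for comparing with Figures 4.5 and 4.6.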

4.3

Spatio-Temporal Symmetries in Hamiltonian Systems

Figure 4.5. Integration of (4.8): (top) cells 1–2–3 out of phase by one-third period; (bottom) cells 4–5 out of phase by one-half period. Both panels show time series.

Figure 4.6. Integration of (4.8): (top) time series of cells 1 and 4 indicating that triple the frequency of cell 4 equals double the frequency of cell 1; (bottom) plot of cell 1 versus cell 4 (axes $x_1$ and $x_4$) showing a closed curve that indicates a time-periodic solution.

We now discuss the Hamiltonian version of Theorem 4.3. We begin by developing a theory of Hamiltonian coupled cell systems, by analogy with the dissipative case (the one described so far). We use standard concepts from Hamiltonian dynamics without further comment: see Abraham and Marsden [1978] and Arrowsmith and Place [1990]. In Hamiltonian systems the phase space is a symplectic manifold, and for the purposes of local bifurcation theory it can be assumed to be a symplectic vector space $P = \mathbf{R}^{2n}$ with coordinates $(q, p) = (q_1, \ldots, q_n; p_1, \ldots, p_n)$ where $q$ is position and $p$ is velocity. The dynamics is determined by a Hamiltonian
$$H : P \to \mathbf{R}$$
and we assume that $H \in C^\infty$. Hamilton's equations for the dynamics are
$$\dot{q}_j = \frac{\partial H}{\partial p_j}, \qquad \dot{p}_j = -\frac{\partial H}{\partial q_j}. \tag{4.9}$$

Because of the form of these equations, $\dot{H} \equiv 0$, so the Hamiltonian is conserved by the flow. The level sets of $H$, given by $H = c$ for constant $c$,

are called energy levels. Let $\Gamma \subset O(2n)$ be a finite group and $\Omega$ be the symplectic 2-form on $\mathbf{R}^{2n}$. The group $\Gamma$ acts symplectically if $\gamma^* \Omega = \Omega$ for all $\gamma \in \Gamma$. Recall that fixed-point subspaces of symplectic actions are symplectic, hence even-dimensional.

4.6 Theorem. Let $\Gamma$ be a finite group acting symplectically on $\mathbf{R}^{2n}$. There is a periodic solution to some $\Gamma$-equivariant Hamiltonian system of ODE on $\mathbf{R}^{2n}$ with spatial symmetries $K$ and spatio-temporal symmetries $H$ if and only if
(a) $H/K$ is cyclic.
(b) $K$ is an isotropy subgroup.
(c) $\dim \operatorname{Fix}(K) \ge 2$. If $\dim \operatorname{Fix}(K) = 2$, then either $H = K$ or $H = N(K)$.
Moreover, when these conditions hold, there exists a smooth $\Gamma$-equivariant Hamiltonian vector field having an elliptic periodic solution with the desired symmetries.

The proof of this theorem is virtually identical to that of Theorem 4.3. As before, conditions (a)–(c) are necessary conditions. Note that condition (d) of Theorem 4.3 is superfluous in the Hamiltonian setting, since the symplectic structure implies that the codimension of $\operatorname{Fix}(\gamma) \cap \operatorname{Fix}(K)$ in $\operatorname{Fix}(K)$ is at least two; hence the complement of $L_K$ is always connected.

Conversely, choose the closed curve $C$ with the desired symmetry properties, as in the proof of Theorem 4.3. See Buono and Golubitsky [2001]. Then choose a nonnegative Hamiltonian in a small neighborhood of $C$ whose zero set is $C$. Extend the Hamiltonian to be $\Gamma$-invariant on all of $\mathbf{R}^{2n}$ in a way analogous to the construction of the vector field in the proof of Theorem 4.3. We can also assume that the Hamiltonian is chosen so that $C$ is the trajectory of an elliptic periodic solution.

4.3.1

Coupled Pendula

In this subsection we discuss the spatio-temporal symmetries of periodic solutions to the ring of $N$ identical simple pendula introduced in §2.3. Denote the position of pendulum $j$ (taken modulo $N$) by $q_j$ and its angular velocity by $p_j = \dot{q}_j$. Let the mass of each pendulum bob be $m$, normalize the length to 1, let gravity be $g$, and let the modulus of elasticity for each spring be $\alpha$. Choose units so that $m = 1$, $g = 1$. Then the Hamiltonian is
$$H(q, p) = \sum_{j} \Big( \tfrac12 p_j^2 - \cos q_j + \tfrac14 \alpha \big[ (q_{j-1} - q_j)^2 + (q_{j+1} - q_j)^2 \big] \Big), \tag{4.10}$$
where each spring appears twice in the sum, so that it contributes energy $\tfrac12 \alpha (q_{j+1} - q_j)^2$. The equations of motion are
$$\dot{q}_j = p_j, \qquad \dot{p}_j = -\sin q_j + \alpha (q_{j-1} - 2 q_j + q_{j+1}). \tag{4.11}$$
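The link between the pendulum Hamiltonian and the equations of motion (4.11) can be checked numerically through Hamilton's equations (4.9). The sketch below is illustrative and makes one explicit assumption: the spring energy is taken as $\tfrac12 \alpha (q_{j+1} - q_j)^2$ for each spring, counted once per spring around the ring. The function names and the sample state are hypothetical; derivatives are approximated by central differences.

```python
from math import cos, sin


def hamiltonian(q, p, alpha):
    """Energy of the ring: kinetic + gravitational + one term per spring."""
    n = len(q)
    return sum(0.5 * p[j] ** 2 - cos(q[j])
               + 0.5 * alpha * (q[(j + 1) % n] - q[j]) ** 2
               for j in range(n))


def vector_field(q, p, alpha):
    """Right-hand side of (4.11): returns (dq/dt, dp/dt)."""
    n = len(q)
    dq = list(p)
    dp = [-sin(q[j]) + alpha * (q[(j - 1) % n] - 2.0 * q[j] + q[(j + 1) % n])
          for j in range(n)]
    return dq, dp


def hamilton_defect(q, p, alpha, eps=1e-6):
    """Largest mismatch between (4.11) and a finite-difference form of (4.9)."""
    dq, dp = vector_field(q, p, alpha)
    worst = 0.0
    for j in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += eps
        q_minus[j] -= eps
        dH_dq = (hamiltonian(q_plus, p, alpha)
                 - hamiltonian(q_minus, p, alpha)) / (2 * eps)
        p_plus, p_minus = list(p), list(p)
        p_plus[j] += eps
        p_minus[j] -= eps
        dH_dp = (hamiltonian(q, p_plus, alpha)
                 - hamiltonian(q, p_minus, alpha)) / (2 * eps)
        worst = max(worst, abs(dp[j] + dH_dq), abs(dq[j] - dH_dp))
    return worst


q_sample = [0.3, -1.1, 0.7, 0.2, -0.5]   # arbitrary test state, N = 5
p_sample = [0.1, 0.4, -0.6, 0.0, 0.9]
print(hamilton_defect(q_sample, p_sample, alpha=0.8))
```

The printed defect should be at the level of finite-difference error, indicating that (4.11) is the Hamiltonian vector field of this energy with coupling strength $\alpha$.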


Note that the coupling in this model is assumed to be nearest neighbor and diffusive. In the pendulum system the symmetry group of the Hamiltonian is not $D_N$ but $D_N \times \mathbf{Z}_2$, where the extra symmetry is an internal one given by $(q, p) \mapsto (-q, -p)$. More precisely, the action of this group on $\mathbf{R}^N \oplus \mathbf{R}^N$ with coordinates $(q, p)$ is:
$$\begin{aligned}
\sigma(q_0, \ldots, q_{N-1}; p_0, \ldots, p_{N-1}) &= (q_1, \ldots, q_{N-1}, q_0; p_1, \ldots, p_{N-1}, p_0) \\
\rho(q_0, \ldots, q_{N-1}; p_0, \ldots, p_{N-1}) &= (q_{N-1}, \ldots, q_0; p_{N-1}, \ldots, p_0) \\
\tau(q_0, \ldots, q_{N-1}; p_0, \ldots, p_{N-1}) &= (-q_0, \ldots, -q_{N-1}; -p_0, \ldots, -p_{N-1})
\end{aligned} \tag{4.12}$$
Here $D_N = \langle \sigma, \rho \rangle$ and $\mathbf{Z}_2 = \langle \tau \rangle$. Let $\Gamma = D_N \times \mathbf{Z}_2$.

Next we ask: what kinds of periodic solution does Theorem 4.6 suggest may exist in the Hamiltonian system (4.11)? Theorem 4.6 states that we need to determine, up to conjugacy, all isotropy subgroups $K$ having $\dim \operatorname{Fix}(K) \ge 2$ and all subgroups $H$ for which $H/K$ is cyclic. In general, this is a combinatorially difficult problem, but the enumeration simplifies when $N$ is prime, which we now assume.

Note that the only isotropy subgroup that contains $\tau$ is $\Gamma$ itself. Therefore, possible isotropy subgroups of $\Gamma$ have one of two possible forms:
$$K = L \times 1 \qquad\text{and}\qquad K = (L \times 1) \cup \big((M \setminus L) \times \{\tau\}\big), \tag{4.13}$$

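where $L \subset M \subset D_N$ and $L$ has index two in $M$. When $N$ is an odd prime there are only four subgroups of $D_N$ up to conjugacy: $1$, $\mathbf{Z}_2(\rho)$, $\mathbf{Z}_N$, and $D_N$. It follows from (4.13) that there are just two additional possible isotropy subgroups: $\mathbf{Z}_2(\rho\tau)$ (from $1 \subset \mathbf{Z}_2(\rho)$) and a twisted copy of $D_N$, written here $\tilde{D}_N$ (from $\mathbf{Z}_N \subset D_N$). Of the seven possibilities only five,
$$1, \quad \mathbf{Z}_2(\rho), \quad \mathbf{Z}_2(\rho\tau), \quad D_N, \quad \Gamma,$$
are isotropy subgroups and they have fixed-point subspace dimension $2N$, $N + 1$, $N - 1$, $2$, and $0$, respectively. So $K = \Gamma$ is not possible.

Finally, we enumerate the pairs $K \subset H$ for which $H/K$ is cyclic. There are 13 such pairs:

$1 \subset 1$, $1 \subset \mathbf{Z}_2(\rho)$, $1 \subset \mathbf{Z}_2(\rho\tau)$, $1 \subset \mathbf{Z}_2(\tau)$, $1 \subset \mathbf{Z}_N$
$\mathbf{Z}_2(\rho) \subset \mathbf{Z}_2(\rho)$, $\mathbf{Z}_2(\rho) \subset \mathbf{Z}_2(\rho) \times \mathbf{Z}_2(\tau)$, $\mathbf{Z}_2(\rho) \subset D_N$
$\mathbf{Z}_2(\rho\tau) \subset \mathbf{Z}_2(\rho\tau)$, $\mathbf{Z}_2(\rho\tau) \subset \mathbf{Z}_2(\rho) \times \mathbf{Z}_2(\tau)$, $\mathbf{Z}_2(\rho\tau) \subset D_N$
$D_N \subset D_N$, $D_N \subset \Gamma$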
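The fixed-point dimensions quoted above can be reproduced for a specific odd prime $N$. The following is a sketch under stated assumptions rather than part of the original argument: group elements are stored as signed permutations acting identically on $q$ and $p$ as in (4.12), the group is generated by closure, and $\dim \operatorname{Fix}$ is computed from the standard trace formula $\frac{1}{|G|} \sum_{g} \operatorname{tr}(g)$. All function names are hypothetical, and "twisted D_N" denotes the group generated by $\sigma$ and $\rho\tau$.

```python
from fractions import Fraction


def compose(g, h):
    """Product g*h of signed permutations, where (g q)[i] = sign_g * q[perm_g[i]]."""
    (pg, sg), (ph, sh) = g, h
    return (tuple(ph[pg[i]] for i in range(len(pg))), sg * sh)


def generate(generators, n):
    """All elements of the finite group generated by the given signed permutations."""
    identity = (tuple(range(n)), 1)
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            new = compose(g, s)
            if new not in group:
                group.add(new)
                frontier.append(new)
    return group


def fixed_point_dimension(generators, n):
    """dim Fix on R^n (+) R^n, using (1/|G|) * (sum of traces) on each summand."""
    group = generate(generators, n)
    trace_sum = sum(sign * sum(1 for i in range(n) if perm[i] == i)
                    for perm, sign in group)
    dim_q = Fraction(trace_sum, len(group))
    assert dim_q.denominator == 1
    return 2 * int(dim_q)


def pendulum_ring_dims(n):
    """Fixed-point dimensions of the seven candidate subgroups for the ring of n pendula."""
    sigma = (tuple((i + 1) % n for i in range(n)), 1)
    rho = (tuple(n - 1 - i for i in range(n)), 1)
    tau = (tuple(range(n)), -1)
    rho_tau = (rho[0], -1)
    candidates = {
        "1": [],
        "Z2(rho)": [rho],
        "Z2(rho tau)": [rho_tau],
        "Z_N": [sigma],
        "D_N": [sigma, rho],
        "twisted D_N": [sigma, rho_tau],
        "Gamma": [sigma, rho, tau],
    }
    return {name: fixed_point_dimension(gens, n)
            for name, gens in candidates.items()}


print(pendulum_ring_dims(5))
```

For $N = 5$ the five isotropy subgroups give $2N = 10$, $N + 1 = 6$, $N - 1 = 4$, $2$ and $0$. The subgroup $\mathbf{Z}_N$ has the same fixed-point space as $D_N$ and the twisted copy has the same (zero) fixed-point space as $\Gamma$, which is why neither of them is an isotropy subgroup.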

When N is not prime the number of isotropy subgroups increases substantially with the number of prime factors. There is, however, another issue that needs to be discussed. In the models for speciation, animal gaits, and the visual cortex, speciﬁc equations for the internal dynamics and the coupling are not known; indeed, in a very real


sense, they may never be known. In the coupled pendulum model, the Hamiltonian for the internal dynamics and the coupling are derivable from ﬁrst principles. Therefore, for such systems, it is useful to have techniques that prove the existence of periodic solutions in the given model equation not just in all possible model systems having the same symmetries. In dissipative systems one method for ﬁnding periodic solutions of a given type in a ﬁxed model is Hopf bifurcation. In Hamiltonian systems, the analogous method for ﬁnding periodic solutions is the Weinstein–Moser theorem. We present the equivariant versions of these techniques in the next chapter. Using this approach we will be able to prove that three of the 13 possibilities do appear in the Hamiltonian system (4.11). See §5.4.1 for further information.

5

Spontaneous Symmetry-Breaking

In Section 4 we discussed the symmetry types of stationary and periodic solutions that one can expect to ﬁnd in equivariant systems of diﬀerential equations. We can apply these theorems only to the class of all equivariant systems — not to an individual system. Bifurcation theory is the traditional method by which solutions of a given symmetry type are proved to exist in a particular model system. Usually we start with a group-invariant equilibrium and ask what states bifurcate from that equilibrium as a parameter is varied. In general, almost anything can happen; but, generically, only rather speciﬁc types of bifurcations are possible. That comment follows from the well-developed theory of spontaneous symmetry-breaking and leads to a set of solutions that are ‘likely to occur’ in speciﬁc models. It is important to emphasize that the ‘likely’ solutions do not include all possible solutions. In this section we review some of equivariant bifurcation theory. See Golubitsky, Stewart, and Schaeﬀer [1988] and Golubitsky and Stewart [2002] for additional detail. Let f : Rn × R → Rn be Γ-equivariant where Γ ⊂ O(n) is ﬁnite, that is, f (γx, λ) = γf (x, λ). Consider the Γ-invariant system of ODE x˙ = f (x, λ) where λ is a bifurcation parameter. Suppose that x = 0 is a trivial group invariant equilibrium, that is, f (0, λ) = 0. Suppose, in addition, that there is a bifurcation at λ = 0; that is, there are eigenvalues of the linearization L = (dx f )(0,0)


on the imaginary axis. By definition, steady-state bifurcation occurs when $L$ has a zero eigenvalue and Hopf bifurcation occurs when $L$ has a complex conjugate pair of purely imaginary eigenvalues. Typically, either steady-state or Hopf bifurcation occurs, but not both, unless additional parameters are available in the model equations. For the moment we assume that only one parameter is present.
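The distinction between the two bifurcation types can be illustrated in the smallest nontrivial setting. The sketch below is not from the text; it assumes a $2 \times 2$ real linearization, for which zero is an eigenvalue exactly when the determinant vanishes, and a purely imaginary pair occurs exactly when the trace vanishes and the determinant is positive. The function name and tolerance are illustrative choices.

```python
def classify_bifurcation(L, tol=1e-12):
    """Classify a 2x2 linearization L = [[a, b], [c, d]] by its eigenvalues."""
    (a, b), (c, d) = L
    trace = a + d
    det = a * d - b * c
    if abs(det) <= tol:
        return "steady-state"   # zero is an eigenvalue
    if abs(trace) <= tol and det > 0:
        return "Hopf"           # eigenvalues are +/- i*sqrt(det)
    return "none"               # no eigenvalue on the imaginary axis


def rotation_family(lam):
    """Sample family L(lambda) with eigenvalues lambda +/- i."""
    return [[lam, -1.0], [1.0, lam]]


for lam in (-0.1, 0.0, 0.1):
    print(lam, classify_bifurcation(rotation_family(lam)))
```

In the sample family the eigenvalues cross the imaginary axis at $\lambda = 0$, which is the only parameter value reported as a Hopf point.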

5.1

Linear Theory

It is easy to check that $\ker L$ is a $\Gamma$-invariant subspace of $\mathbf{R}^n$. It is proved in Golubitsky, Stewart, and Schaeffer [1988] that typically, at a steady-state bifurcation, the subspace $\ker L \subset \mathbf{R}^n$ is an absolutely irreducible representation of $\Gamma$. Recall that a real representation is absolutely irreducible if the only linear maps that commute with $\Gamma$ are scalar multiples of the identity map. It is also shown that typically at a Hopf bifurcation the center subspace $C$ of $L$ is $\Gamma$-simple: either
(a) $C = V \oplus V$ where $V$ is an absolutely irreducible representation of $\Gamma$, or
(b) $C$ itself is irreducible but not absolutely irreducible.

One consequence of these two results is that there is a type of steady-state bifurcation for each absolutely irreducible representation of $\Gamma$ and there is a type of Hopf bifurcation for each irreducible representation of $\Gamma$. Likely solutions are found by determining the new solutions that occur by symmetry-breaking bifurcation from each of these types of bifurcation.

5.2

Nonlinear Theory

There are two steps in analyzing symmetry-breaking bifurcations. First, either a Liapunov–Schmidt or center manifold reduction is used to reduce the question of ﬁnding new solutions to one of ﬁnding solutions to Γ-invariant systems of ODE y˙ = g(y, λ) where y ∈ C and g : C × R → C is Γ-equivariant with respect to the action of Γ on C. These reductions can be performed to preserve symmetry and so that g(0, λ) = 0. The second step — analyzing the bifurcations of the implicitly deﬁned system g — is generally quite diﬃcult. There are, however, two theorems that simplify the search for generically occurring solutions — the Equivariant Branching Lemma and the Equivariant Hopf Theorem.


The symmetry group of an equilibrium $\Sigma \subset \Gamma$ is always an isotropy subgroup. An isotropy subgroup is axial if $\dim \operatorname{Fix}_{\ker L}(\Sigma) = 1$.

5.1 Theorem (Equivariant Branching Lemma). Generically, for each axial subgroup $\Sigma \subset \Gamma$, there is a unique branch of equilibria having symmetry subgroup $\Sigma$.

At a generic Hopf bifurcation $A = (d_y g)_{(0,0)}$ has one purely imaginary pair of complex conjugate eigenvalues, each of multiplicity $m$, where $\dim C = 2m$. It follows that $e^{tA}$ induces an action of $\mathbf{S}^1$ on $C$ that commutes with the action of $\Gamma$; hence there is a naturally defined action of $\Gamma \times \mathbf{S}^1$ on $C$. An isotropy subgroup $\Sigma \subset \Gamma \times \mathbf{S}^1$ is $C$-axial if $\dim \operatorname{Fix}_C(\Sigma) = 2$. If a periodic solution has symmetry subgroup $\Sigma \subset \Gamma \times \mathbf{S}^1$, then, as in Section 4, we can define $K = \Sigma \cap \Gamma$ and $H = \Pi(\Sigma)$ where $\Pi : \Gamma \times \mathbf{S}^1 \to \Gamma$ is projection.

5.2 Theorem (Equivariant Hopf Theorem). Generically, for each $C$-axial subgroup $\Sigma \subset \Gamma \times \mathbf{S}^1$, there is a unique branch of periodic solutions having symmetry subgroup $\Sigma$.

The next two sections are devoted to applications of these bifurcation results to coupled cell systems. We then discuss genericity issues involving coupled cell systems and end the chapter with a discussion of the equivariant Moser–Weinstein theorem, the Hamiltonian analogue of the equivariant Hopf theorem.

5.2.1

SN Steady-State Bifurcations and Speciation Revisited

In Section 1 we introduced a coupled cell model of speciation and exhibited a numerical simulation in which a single species splits into two. A number of general phenomena are associated with such models, independently of many details of the equations, and we now describe some of these. Specific models with a well-defined biological interpretation, such as simulations of speciation in bird populations, have been studied by Elmhirst [2000] and related to the general considerations stemming from symmetry.

Recall that the model deals with a set of $N$ PODs (coarse-grained clumps of organisms) whose phenotypes are represented by $x = (x_1, \ldots, x_N) \in \mathbf{R}^N$. (To include more phenotypic variables, let the $x_j$ be vectors in some $\mathbf{R}^k$. The discussion generalizes to this case.) We normalize all phenotypic variables to be zero prior to bifurcation: that is, we define them as deviations from the mean. The subspaces
$$\begin{aligned} V_0 &= \mathbf{R}(1, 1, \ldots, 1) \\ V_1 &= \{(x_1, \ldots, x_N) : x_1 + \cdots + x_N = 0\} \end{aligned} \tag{5.1}$$
are $S_N$-invariant and $S_N$-irreducible, and $\mathbf{R}^N = V_0 \oplus V_1$. A symmetry-breaking bifurcation of equilibria occurs when the kernel of the linearization is $V_1$, and we can carry out a Liapunov–Schmidt reduction onto this space.

Consider the restriction of the action of $S_N$ to $V_1$. Here, the isotropy subgroups are the same as for the action of $S_N$ on $\mathbf{R}^N$, but the dimension of $\operatorname{Fix}(\Sigma)$ is reduced by 1. In particular $\dim \operatorname{Fix}(\Sigma) = 1$ when $\Sigma$ is the isotropy group of a partition into two blocks of sizes $\{p, N - p\}$, where $p \le N/2$. The coupled cell system is modelled by an $S_N$-equivariant ODE
$$\frac{dx_j}{dt} = f_j(x_1, \ldots, x_N; a_1, \ldots, a_s) \tag{5.2}$$
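The dimension count for $S_N$ acting on $V_1$ can be made concrete. The sketch below assumes only what is stated above: a point fixed by $S_{n_1} \times \cdots \times S_{n_k}$ is constant on each block, so its fixed-point space has dimension $k$ in $\mathbf{R}^N$ and $k - 1$ in $V_1$, and a subgroup is axial exactly when $k = 2$. The function names and the sample value of $u$ are hypothetical.

```python
def fix_dim_on_V1(block_sizes):
    """dim Fix(S_{n1} x ... x S_{nk}) restricted to V1 = {x : x_1 + ... + x_N = 0}."""
    return len(block_sizes) - 1


def axial_block_sizes(N):
    """Block sizes (p, q) of the axial subgroups S_p x S_q of S_N on V1, up to conjugacy."""
    return [(p, N - p) for p in range(1, N // 2 + 1)]


def two_species_state(p, q, u):
    """A point of Fix(S_p x S_q) in V1: p entries u and q entries v with p*u + q*v = 0."""
    v = -p * u / q
    return [u] * p + [v] * q


print(axial_block_sizes(5))
state = two_species_state(9, 16, 1.6)
print(len(state), sum(state))
```

For $N = 25$ there are twelve axial types, one for each $p$ from 1 to 12, and the split into blocks of 9 and 16 PODs is one of them. The sample state sums to zero up to rounding, as membership in $V_1$ requires.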

We can find symmetry-breaking equilibria by applying the Equivariant Branching Lemma. If there is a steady-state bifurcation with kernel $V_1$ then there exist branches of solutions for all axial isotropy subgroups. From §4 it is easy to check that the axial subgroups of $S_N$ in this representation are, up to conjugacy, those of the form $S_p \times S_q$ where $p + q = N$ and $1 \le p \le [N/2]$. So there exist branches of solutions with these isotropy subgroups. Such solutions lie in fixed-point spaces of the form $(u, \ldots, u; v, \ldots, v)$, with exactly two distinct values $u$ and $v$ for phenotypic variables. These solution branches therefore correspond to a split of the population of $N$ identical PODs into two distinct species consisting of $p$ and $q$ PODs respectively. One species has the phenotype $u$ and the other species has the phenotype $v$. Note that $pu + qv = 0$ since $(u, \ldots, u; v, \ldots, v) \in V_1$.

We can also make an interesting universal quantitative prediction: on the above branches the mean value of the phenotypic variables changes smoothly during the bifurcation. The reason is that the fixed-point space of $S_p \times S_q$ consists of all vectors $(u, \ldots, u; v, \ldots, v)$ where there are $p$ $u$'s and $q$ $v$'s, and the mean phenotype is $(pu + qv)/N = 0$. Because we are using normalized phenotypic variables, all $x_i = 0$ prior to bifurcation. Thus the mean phenotype remains constant throughout the bifurcation. However, we are working with the Liapunov–Schmidt reduced problem, which involves a nonlinear change of variables. Therefore the mean varies smoothly in the original phenotypic variables, and is thus approximately constant.

Some studies reported in the literature are consistent with the above predictions. For example, a celebrated instance of polymorphism is the changes in beak size that occur among various species of Darwin's finches in the Galápagos Islands. The prediction of smoothly changing mean is consistent with observations of these finches. The evolution of the different finch species in the Galápagos Islands is thought to have occurred around five million years ago, and so cannot be observed (although small-scale evolution remains rapid enough that significant phenotypic changes can be observed from one year to the next). However, we can observe a surrogate for actual evolution: differences in the phenotype of a given species

in allopatric and sympatric populations. The transition in phenotype from sympatric populations to allopatric ones should be just like the bifurcations in the speciation model: in particular, we expect to see approximately the same mean in either situation. This is the case for the two species Geospiza fortis and G. fuliginosa, which occur in both sympatric and allopatric populations. G. fortis is allopatric on the island known as Daphne, and G. fuliginosa is allopatric on Crossman. The two species are sympatric on a number of islands which Lack placed in three groups for data analysis: Abingdon, Bindloe, James, Jervis; Albemarle, Indefatigable; and Charles, Chatham. Fig. 5.1, adapted from Lack [1968], shows the differences in beak size between these species on the cited groups of islands. The mean beak sizes of both G. fortis and G. fuliginosa are approximately 10 mm in allopatric populations. In all three (groups of) sympatric populations, the mean for G. fortis is about 12 mm, while that for G. fuliginosa is about 8 mm. These figures are consistent with the 'constant mean' prediction.

[Figure 5.1. Beak sizes in allopatric and sympatric populations of Geospiza in the Galápagos Islands. Histograms (0% to 50%) of beak size (mm) for Geospiza fuliginosa and Geospiza fortis on Daphne, on Crossman, and on the three sympatric island groups: Abingdon, Bindloe, James, Jervis; Albemarle, Indefatigable; Charles, Chatham.]

5.2.2

Animal Gaits and Multirhythms Revisited

We begin our discussion by recalling that if $\Gamma$ is an abelian group, then its irreducible representations are either one-dimensional (and absolutely irreducible, since all linear maps are multiples of the identity) or two-dimensional (and nonabsolutely irreducible, since $\Gamma$ commutes with $SO(2)$). Thus, generically, Hopf bifurcation in the presence of an abelian symmetry group reduces to standard Hopf bifurcation, that is, to a single pair of purely imaginary eigenvalues of multiplicity one. Standard Hopf bifurcation, which is just a special case of the Equivariant Hopf Theorem, leads to a unique branch of periodic solutions. The symmetry group pair $K \subset H$ of these solutions is simple to determine: $K$ is the kernel of the action of $\Gamma$ on $\mathbb{C}$ and $H = \Gamma$ (since the bifurcating periodic solution at any parameter value near 0 is unique up to phase shift). Thus, when $\Gamma$ is abelian, bifurcating solutions are primary solutions. Since $\Gamma = \mathbb{Z}_4 \times \mathbb{Z}_2$ in the animal gaits model, only primary gaits can be obtained by Hopf bifurcation from a trivial equilibrium.

The second question that we ask is whether all primary gaits can, in principle, be obtained by Hopf bifurcation from a trivial equilibrium. The answer is yes, at least when the dynamics in each cell is at least two-dimensional. First, for every subgroup $K \subset \Gamma$ for which $\Gamma/K$ is cyclic, there is an irreducible representation of $\Gamma$ with kernel $K$. Second, suppose that the internal dynamics of the cell system pictured in Figure 2.2 is one-dimensional. Then the state space is $\mathbb{R}^8$ and, since $\mathbb{Z}_4 \times \mathbb{Z}_2$ has eight elements, $\mathbb{R}^8 = L^2(\mathbb{Z}_4 \times \mathbb{Z}_2)$. It is a standard theorem from representation theory that every irreducible representation appears at least once in $L^2(\mathbb{Z}_4 \times \mathbb{Z}_2)$, and hence at least twice when the internal dynamics in each cell is at least two-dimensional. It follows that in principle every primary gait can be obtained by Hopf bifurcation from a trivial equilibrium. Indeed, Buono [1998] shows that each of the primary gaits listed in Table 4.2 can be obtained by such a Hopf bifurcation. See also Buono and Golubitsky [2001].

The situation is different in the multirhythm example. In that case (see Figure 4.3) the cell system also has an abelian symmetry group, $\Gamma = \mathbb{Z}_3 \times \mathbb{Z}_2 \cong \mathbb{Z}_6$. The multirhythm periodic solutions have a symmetry group pair $(H, K) = (\mathbb{Z}_6, 1)$, but none of the irreducible representations occurring in the phase space of this cell system has trivial kernel: there are three different irreducible representations and their kernels are $\mathbb{Z}_2$, $\mathbb{Z}_3$, and $\mathbb{Z}_6$. So the multirhythm periodic solution cannot appear by a generic Hopf bifurcation from a trivial equilibrium. Indeed, we found ours by constructing a succession of two Hopf bifurcations with certain properties.
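The statement that every subgroup $K$ of $\mathbb{Z}_4 \times \mathbb{Z}_2$ with cyclic quotient is the kernel of some irreducible representation can be checked by brute force. The following is a minimal sketch, not part of the original argument: it works with the eight complex characters $(g, h) \mapsto \exp(2\pi i(ag/4 + bh/2))$ rather than with the real irreducibles (a conjugate pair of characters has the same kernel as the corresponding real representation), and all function names are illustrative.

```python
from itertools import product

# Gamma = Z4 x Z2, written additively as pairs (g, h).
G = [(g, h) for g in range(4) for h in range(2)]

def add(x, y):
    return ((x[0] + y[0]) % 4, (x[1] + y[1]) % 2)

def generated(gens):
    """Subgroup generated by the given elements (closure under addition)."""
    H = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = add(x, s)
            if y not in H:
                H.add(y)
                frontier.append(y)
    return frozenset(H)

# Every subgroup of Z4 x Z2 is generated by at most two elements.
subgroups = {generated(pair) for pair in product(G, repeat=2)}

def quotient_is_cyclic(K):
    """Gamma/K is cyclic iff some single element generates Gamma together with K."""
    return any(generated(list(K) + [x]) == frozenset(G) for x in G)

def character_kernel(a, b):
    """Kernel of the character (g, h) -> exp(2*pi*i*(a*g/4 + b*h/2))."""
    return frozenset((g, h) for g, h in G if (a * g + 2 * b * h) % 4 == 0)

kernels = {character_kernel(a, b) for a in range(4) for b in range(2)}
cyclic_quotients = {K for K in subgroups if quotient_is_cyclic(K)}

print(len(subgroups), len(kernels), kernels == cyclic_quotients)
```

The final comparison tests exactly the claim in the text: the set of character kernels coincides with the set of subgroups whose quotient is cyclic.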


5.3

Genericity Questions in Coupled Cell Systems

In this subsection we comment on genericity questions concerning bifurcations in coupled cell systems. A simple example is instructive. Consider the two-cell system pictured in Figure 4.1. The general system of differential equations for this cell system is given in (4.6). At a group invariant equilibrium $(x_1, x_1)$, the Jacobian of the system is given in block form by

$$\begin{pmatrix} f_{x_1} & f_{x_2} \\ f_{x_2} & f_{x_1} \end{pmatrix}.$$

The eigenvalues of this matrix are the eigenvalues of the matrices

$$f_{x_1} + f_{x_2} \quad \text{or} \quad f_{x_1} - f_{x_2}.$$

Critical eigenvalues of the first matrix lead to bifurcations that preserve symmetry (the trivial representation of $\mathbb{Z}_2$), while critical eigenvalues of the second matrix lead to symmetry-breaking bifurcations (the nontrivial representation of $\mathbb{Z}_2$). Note that when the internal dynamics of each cell is one-dimensional, the eigenvalues are real, so Hopf bifurcation is not possible. However, when the internal dynamics of each cell is at least two-dimensional, the eigenvalues of each matrix can be chosen arbitrarily. Thus, to achieve generic behavior of coupled cell systems, we may need to consider higher dimensional internal dynamics than is suggested just by the phase space of the coupled cell system. After all, when the internal dynamics is one-dimensional, the phase space is two-dimensional and Hopf bifurcation might have been possible.

A second example is given by a bidirectional ring of four cells with just nearest neighbor coupling. This coupled cell system has $D_4$ symmetry. Suppose that the internal dynamics is $k$-dimensional. The differential equation in the first cell is denoted by $\dot{x}_1 = f(x_1, x_2, x_4)$, where $f(x, y, z) = f(x, z, y)$. The Jacobian of the full $4k$-dimensional system at a group invariant equilibrium (all coordinates equal) is

$$L = \begin{bmatrix} A & B & 0 & B \\ B & A & B & 0 \\ 0 & B & A & B \\ B & 0 & B & A \end{bmatrix}$$

where $A = d_{x_1} f$ and $B = d_{x_2} f$. The eigenvalues of this matrix are determined in [Golubitsky, Stewart, and Schaeffer, 1988, p. 396] and it follows that the eigenvalues of $L$ are the union of the eigenvalues of $A + 2B$, $A - 2B$ and $A$ (twice). There are three irreducible representations of $D_4$ that occur in the ring system phase space: the trivial one-dimensional, a nontrivial one-dimensional, and the standard two-dimensional. Critical eigenvalues of the three matrices lead to bifurcations corresponding to each of these irreducible representations. As in the simple example, Hopf bifurcation cannot occur unless $k \ge 2$.

We ask the following question: can a stable symmetry-breaking $D_4$ Hopf bifurcation occur when the internal dynamics is two-dimensional? The answer is basically no. Such bifurcations occur when $\mathrm{tr}(A) = 0$. It follows that the traces of the other two matrices are $\pm 2\,\mathrm{tr}(B)$. If $\mathrm{tr}(B) \ne 0$, then one of these matrices has an eigenvalue with positive real part, and a stable Hopf bifurcation is not possible. If we need to require that $\mathrm{tr}(B) = 0$, then we have this bifurcation occurring with stable periodic solutions only in codimension two. In these models generic Hopf bifurcations to stable solutions can occur either when $k \ge 3$ or when next nearest neighbor coupling is also allowed. It is clear that the determination of generic bifurcation behavior in coupled cell systems depends to some extent on the dimension of the internal dynamics allowed in each cell and on the ways in which coupling is restricted in the cell system.
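The block structure behind the eigenvalue statement for the four-cell ring, and the trace obstruction when $k = 2$, can be checked directly. The sketch below is only an illustration under assumed data: the matrices `A` and `B` and the vector `w` are arbitrary choices (with tr(A) = 0 and tr(B) ≠ 0), not values taken from any model in the text. It verifies that $L$ acts as $A + 2B$ on states $(w, w, w, w)$, as $A - 2B$ on $(w, -w, w, -w)$, and as $A$ on $(w, 0, -w, 0)$.

```python
def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def mat_comb(M, N, s):
    """Return the matrix M + s*N."""
    return [[M[i][j] + s * N[i][j] for j in range(len(M))] for i in range(len(M))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def apply_ring_jacobian(A, B, cells):
    """Apply L to (x1, x2, x3, x4): A acts on the cell itself, B on both ring neighbours."""
    out = []
    for i in range(4):
        own = mat_vec(A, cells[i])
        prev = mat_vec(B, cells[(i - 1) % 4])
        nxt = mat_vec(B, cells[(i + 1) % 4])
        out.append([o + p + n for o, p, n in zip(own, prev, nxt)])
    return out

def close(u, v, tol=1e-12):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

# Illustrative data (assumptions): tr(A) = 0 with det(A) = 5 > 0, and tr(B) = 0.6.
A = [[1.0, -2.0], [3.0, -1.0]]
B = [[0.5, 0.25], [-0.75, 0.1]]
w = [0.3, -1.7]
neg_w = [-x for x in w]
zero = [0.0, 0.0]

plus = mat_vec(mat_comb(A, B, 2.0), w)    # (A + 2B) w
minus = mat_vec(mat_comb(A, B, -2.0), w)  # (A - 2B) w
Aw = mat_vec(A, w)

sync = apply_ring_jacobian(A, B, [w, w, w, w])
alt = apply_ring_jacobian(A, B, [w, neg_w, w, neg_w])
std = apply_ring_jacobian(A, B, [w, zero, neg_w, zero])

print(all(close(c, plus) for c in sync))
print(close(alt[0], minus) and close(alt[1], [-x for x in minus]))
print(close(std[0], Aw) and close(std[1], zero) and close(std[2], [-x for x in Aw]))
print(trace(A), trace(mat_comb(A, B, 2.0)), trace(mat_comb(A, B, -2.0)))
```

With tr(A) = 0 the last line shows traces of opposite sign for $A + 2B$ and $A - 2B$, so one of the two 2 × 2 blocks necessarily has an eigenvalue with positive real part, which is the obstruction described above.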

5.4

The Equivariant Moser–Weinstein Theorem

The basic 'local bifurcation' existence theorem for periodic orbits in Hamiltonian dynamics is the Liapunov Center Theorem. Suppose that $H$ is a Hamiltonian on $P = \mathbb{R}^{2n}$ and let $p \in P$ be an equilibrium, so that $(dH)_p = 0$. Assume that $p$ is a nondegenerate minimum of $H$, that is, $(dH)_p = 0$ and $(d^2H)_p$ is positive definite. Let $L$ be the linearization of the Hamiltonian vector field at $p$ and let the eigenvalues of $L$ be the purely imaginary pairs $\{\pm\lambda_1, \ldots, \pm\lambda_n\}$. Liapunov proved that if some $\lambda_i$ is a simple eigenvalue and is non-resonant, then there exists a smooth 2-dimensional submanifold of $P$, which passes through $p$ and intersects every energy level near $p$ in a periodic orbit, such that the period of that orbit approaches $2\pi/|\lambda_i|$ as the orbit approaches $p$. By 'non-resonant' we mean that $\lambda_j$ is not an integer multiple of $\lambda_i$ for $j \ne i$.

Weinstein [1973] proved that even when there is resonance, there must exist at least $\frac{1}{2} \dim V_\lambda$ families of periodic solutions on each energy level near $p$, where $V_\lambda$ is the resonance space defined below. The proof was simplified by Moser [1976], and the result has come to be known as the Weinstein–Moser Theorem. However, the Weinstein–Moser Theorem fails to predict all periodic solutions near equilibrium in the equivariant case. For instance, in the Hénon–Heiles system (Hénon and Heiles [1964]), the Weinstein–Moser Theorem predicts at least two (families of) periodic solutions near equilibrium, but actually there are eight. Even taking the symmetry into account, there are three group orbits of periodic solutions. The Equivariant Weinstein–Moser Theorem remedies this difficulty by exploiting the symmetry of the Hamiltonian.


Recall that a symplectic vector space over $\mathbb{R}$ is a vector space $V$ over $\mathbb{R}$ equipped with a symplectic form. A symplectic action of a group $\Gamma$ on a symplectic vector space $V$ is an action that leaves the symplectic form invariant. The theory of group representations can be extended to symplectic representations (Montaldi, Roberts, and Stewart [1988]): in particular, any symplectic representation of a compact Lie group is a direct sum of irreducible symplectic representations, and there exists a unique isotypic decomposition. Moreover, the symplectic irreducibles for compact $\Gamma$ are precisely what Golubitsky, Stewart, and Schaeffer [1988] call $\Gamma$-simple representations. These arise generically in symmetric Hopf bifurcation of dissipative systems.

Suppose that a compact Lie group $\Gamma$ acts symplectically on $P$, let $p \in P$ be a fixed point for $\Gamma$, and suppose that the Hamiltonian $H$ is $\Gamma$-invariant. This symmetry may force some of the $\lambda_i$ to be equal, creating unavoidable resonances. Let $u(t)$ be a periodic orbit of the flow of $H$ having period $T$. Let $S^1$ be the circle group, identified with $\mathbb{R}/2\pi\mathbb{Z}$, and consider the usual action of $\Gamma \times S^1$ on the loop space $C^k(T)$ of $k$-times differentiable $T$-periodic functions, as in equivariant Hopf bifurcation. That is, $\Gamma \times S^1$ acts on $u = u(t)$ by

$$(\gamma, \theta) \cdot u(t) = \gamma u(t + T\theta/2\pi).$$

Define the symmetry group of $u \in C^k(T)$ to be

$$\Sigma_u = \{(\gamma, \theta) \in \Gamma \times S^1 : \gamma u(t + T\theta/2\pi) = u(t)\}.$$

Recall that when $P$ is a vector space over $\mathbb{R}$ and $\Gamma$ acts linearly, $\mathrm{Fix}(\Sigma)$ is a linear subspace. Analogously, if $P$ is a symplectic vector space over $\mathbb{R}$ and $\Gamma$ acts linearly and symplectically, then $\mathrm{Fix}(\Sigma)$ is a symplectic subspace.

Let $X$ be the vector field of $H$, and let $L$ be the linearization of $X$ at $p$. Define the linearized flow to be the flow generated by the ODE $\dot{x} + Lx = 0$ on the tangent space $V = T_pP$ to $P$ at $p$. Let $\lambda$ be a non-zero purely imaginary eigenvalue of $L$ and define the resonance space $V_\lambda \subset V$ to be the (real part of the) sum of the generalized eigenspaces of $L$ for eigenvalues $k\lambda$, where $k \in \mathbb{Z}$.
Assume the following conditions on $H$:

1. $(d^2H)_p$ is a nondegenerate quadratic form.
2. $(d^2H)_p|_{V_\lambda}$ is positive definite.

Condition (1) is equivalent to $L$ being nonsingular, and (2) implies that $L|_{V_\lambda}$ is semisimple (diagonalizable over $\mathbb{C}$). Clearly $L$ is $\Gamma$-equivariant, so $V_\lambda$ is invariant under the action of $\Gamma$. It is also invariant under the linearized flow. Because $L|_{V_\lambda}$ is semisimple, the orbits of the linearized flow are all periodic with period $2\pi/|\lambda|$ and hence define an action of $S^1$ on $V_\lambda$. Explicitly,

$$\theta \cdot v = \exp\left(\frac{\theta}{|\lambda|} L\right) v.$$

This action commutes with the action of $\Gamma$, so together they define a $\Gamma \times S^1$-action on $V_\lambda$. We may now state:

5.3 Theorem (Equivariant Weinstein–Moser Theorem). Suppose that the Hamiltonian $H$ satisfies (1) and (2). Then for every isotropy subgroup $\Sigma$ of the $\Gamma \times S^1$-action on $V_\lambda$, and for all sufficiently small $\varepsilon$, there exist at least $\frac{1}{2} \dim \mathrm{Fix}(\Sigma)$ periodic orbits of $X$ with periods near $2\pi/|\lambda|$ and symmetry group containing $\Sigma$, on the energy surface $H(x) = H(p) + \varepsilon^2$.

For a proof see Montaldi, Roberts, and Stewart [1988]. A rather different approach to an equivariant Liapunov Center Theorem, using the 'constrained Liapunov–Schmidt procedure', can be found in Golubitsky, Marsden, Stewart, and Dellnitz [1995].

Because of the symplectic structure, $\dim \mathrm{Fix}(\Sigma)$ is always even. In practice (though it is more a rule of thumb than a provable theorem) the 'primary' isotropy subgroups $\Sigma$ are those for which $\dim \mathrm{Fix}(\Sigma)$ is small. The most important isotropy subgroups of all, and the most tractable, are those for which $\dim \mathrm{Fix}(\Sigma)$ attains its minimum value, namely 2. These are what we have called C-axial subgroups. So the Equivariant Weinstein–Moser Theorem implies that, under the usual hypotheses, if $\Sigma$ is C-axial then there exists at least one family of periodic solutions with isotropy group equal to $\Sigma$.

The group theory involved in the $\Gamma \times S^1$-action is identical to the action occurring in equivariant Hopf bifurcation. This is a consequence of the loop space technique employed in both contexts and the classification of symplectic irreducibles. We can use this relationship to import results from equivariant Hopf bifurcation into Hamiltonian dynamics. In particular, we can use the existing analysis of $D_n$ Hopf bifurcation (Golubitsky, Stewart, and Schaeffer [1988], Golubitsky and Stewart [1986]) to prove the existence of certain periodic solutions in Hamiltonian systems with $D_n$ symmetry, such as the coupled pendulum system.

5.4.1

Coupled Pendula Revisited

In this subsection we apply the Equivariant Weinstein–Moser Theorem to the Hamiltonian system (4.11). The linearization of (4.11) is

$$L = \begin{bmatrix} 0 & I \\ M & 0 \end{bmatrix}$$

where $I$ is the $N \times N$ identity matrix and $M$ is the circulant matrix

$$M = \begin{bmatrix} -(1+2\alpha) & \alpha & 0 & \cdots & 0 & \alpha \\ \alpha & -(1+2\alpha) & \alpha & 0 & \cdots & 0 \\ 0 & \alpha & -(1+2\alpha) & \alpha & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha & 0 & \cdots & 0 & \alpha & -(1+2\alpha) \end{bmatrix}.$$

First, we derive the eigenvectors and eigenvalues of $L$. Let $\omega = e^{2\pi i/N}$ and define

$$v_k = [1, \omega^k, \omega^{2k}, \ldots, \omega^{(N-1)k}]^T$$

where $0 \le k \le N - 1$. Then $M v_k = \mu_k v_k$, where

$$\mu_k = 2\alpha\left(\cos\frac{2\pi k}{N} - 1\right) - 1.$$

Let $\nu_k = \sqrt{-\mu_k}$ and define $u_k^{\pm} = [v_k, \pm i\nu_k v_k]^T$. An easy calculation shows that the $u_k^{\pm}$ are eigenvectors of $L$ with eigenvalues $\lambda_k^{\pm} = \pm i\nu_k$. The $\lambda_k^{\pm}$ are purely imaginary since $\mu_k < 0$. Generically (in $\alpha$) these eigenvalues are non-resonant, and we henceforth assume that $\alpha$ has been chosen to avoid resonances. The linearized flow on $\mathbb{R}^{2N}$ possesses periodic solutions corresponding to initial conditions $\mathrm{Re}(u_k^{\pm})$. The corresponding solutions take the form

$$q_j(t) = \cos\left(\pm\nu_k t + \frac{2\pi jk}{N}\right), \qquad p_j(t) = \mp\nu_k \sin\left(\pm\nu_k t + \frac{2\pi jk}{N}\right),$$

which are discrete rotating waves of period $2\pi/\nu_k$ such that successive pendula are phase-shifted by $2\pi k/N$.

The general solution of the linearized equation is a superposition of such discrete rotating waves. When nonlinear terms are restored, some of these solutions persist as periodic solutions of the nonlinear equations, for example the synchronous solutions ($k = 0$). The question is: which? The Equivariant Weinstein–Moser Theorem of Montaldi, Roberts, and Stewart [1987], stated in §5.4, provides a partial answer to this question, as we now describe.

First, we recall some useful results from representation theory. Assume that $D_n = \langle \sigma, \rho \rangle$ where $\sigma^n = 1$, $\rho^2 = 1$, $\rho^{-1}\sigma\rho = \sigma^{-1}$. With two exceptions (when $n$ is even) the irreducible representations of $D_n$ over $\mathbb{R}$ are $\xi_0, \xi_1, \ldots, \xi_{[n/2]}$, defined as follows. $\xi_0$ is the trivial representation on $\mathbb{R}$. When $n$ is even, $\xi_{n/2}$ is the representation on $\mathbb{R}$ in which $\sigma$ acts trivially and $\rho$ acts as $-1$. In all other cases, $\xi_k$ is the representation on $\mathbb{R}^2 \equiv \mathbb{C}$ in which $\sigma$ acts as multiplication by $\omega^k = e^{2\pi ik/n}$ and $\rho$ acts by complex conjugation $z \mapsto \bar{z}$. The exceptional cases when $n$ is even arise because then $D_n/\mathbb{Z}_{n/2} \cong D_2$, which has four 1-dimensional irreducibles, which pull back to $D_n$. Two of these give rise to $\xi_0$, $\xi_{n/2}$, but there are two others. All of these representations are absolutely irreducible.

The space $\mathbb{R}^N = \{q\}$ decomposes into $\Gamma$-irreducibles according to

$$\mathbb{R}^N = Q_0 \oplus \cdots \oplus Q_{[N/2]}$$

where the action of $D_N$ on $Q_k$ is isomorphic to $\xi_k$ and the action of $\tau$ is by $-1$. Similarly $\mathbb{R}^N = \{p\}$ decomposes into $\Gamma$-irreducibles according to

$$\mathbb{R}^N = P_0 \oplus \cdots \oplus P_{[N/2]}$$

where the action of $D_N$ on $P_k$ is isomorphic to $\xi_k$ and the action of $\tau$ is by $-1$. The symplectic $\Gamma$-irreducible components of $\mathbb{R}^{2N}$ are $Q_k \oplus P_k$ with actions $\xi_k \oplus \xi_k$. Moreover, these are the symplectic isotypic components. The action of $\tau$ on each component is by $-1$. The action of $S^1$ can be written in the form

$$q + ip \mapsto e^{i\theta}(q + ip).$$

Therefore $\pi \in S^1$ also acts by $-1$, so $(1, \tau, \pi) \in D_N \times \mathbb{Z}_2 \times S^1$ acts trivially. There is a homomorphism $D_N \times \mathbb{Z}_2 \times S^1 \to D_N \times S^1$ defined by

$$(\delta, 1, \theta) \mapsto (\delta, \theta), \qquad (\delta, \tau, \theta) \mapsto (\delta, \theta + \pi),$$

and the action factors through this homomorphism. So in effect we have a $D_N \times S^1$-action, modulo $K = \langle (1, \tau, \pi) \rangle$. In particular, the isotropy subgroups are generated by isotropy subgroups of the $D_N \times S^1$-action together with $K$. Physically, $K$ represents the usual 'internal' symmetry of a simple pendulum: all periodic oscillations are invariant under reflection together with a half-period phase shift.

The problem therefore reduces to finding isotropy subgroups (more specifically, C-axial subgroups) of $D_N \times S^1$ acting by $\xi_k$. To do this we use the results of Golubitsky and Stewart [1986], recorded in Golubitsky, Stewart, and Schaeffer [1988]. These apply to the standard action of $D_N \times S^1$. Here $D_N$ acts as the direct sum of two copies of $\xi_1$. The representations with $\xi_k$ in place of $\xi_1$ can be reduced to the standard case by use of the homomorphism $\alpha_k : D_n \times S^1 \to D_n \times S^1$ sending $\sigma \mapsto \sigma^k$: we omit the details of this reduction.

We next describe how to interpret the symmetries of the solutions given by the standard action. Denote the state of pendulum $j$ at time $t$ by $u_j(t)$, and let $T$ be the overall period of the system of pendula. Then the standard representation leads to three conjugacy classes of C-axial subgroups, whose interpretation is shown in Table 5.3. For illustrative purposes we show the 17 distinct (conjugacy classes of) C-axial solutions when $N = 12$, including the solutions arising from nonstandard actions. See Table 5.4. Here A, B, C, D are waveforms, A + p

indicates waveform A with a phase shift of $p/12$ of the overall period, a prime indicates a phase shift of half a period, and an asterisk indicates that the pendulum oscillates with twice the overall frequency of the system. Each of A, B, C, D must also have an 'internal' $(\tau, \pi)$ symmetry.

Table 5.3. Oscillatory wave patterns in $D_N$-symmetric systems.

Isotropy    Waveform relationships

$N \equiv \pm 1 \pmod 4$
$\tilde{Z}_N$    $u_j(t) = u_0(t + jT/N)$
$Z_2^{\rho}$    $u_j(t) = u_{-j}(t)$, for $j \ne 0$
$Z_2^{(\rho,\pi)}$    $u_j(t) = u_{-j}(t + \frac{1}{2}T)$; $u_0(t)$ twice frequency

$N \equiv 2 \pmod 4$
$\tilde{Z}_N$    $u_j(t) = u_0(t + jT/N)$
$Z_2^{\rho} \oplus Z_2^{c}$    $u_j(t) = u_{-j}(t) = u_{\frac{1}{2}N+j}(t + \frac{1}{2}T) = u_{\frac{1}{2}N-j}(t + \frac{1}{2}T)$, for $1 \le j < \frac{1}{4}N - 1$; $u_0(t) = u_{\frac{1}{2}N}(t + \frac{1}{2}T)$
$Z_2^{(\rho,\pi)} \oplus Z_2^{c}$    $u_j(t) = u_{-j}(t + \frac{1}{2}T) = u_{\frac{1}{2}N+j}(t + \frac{1}{2}T) = u_{\frac{1}{2}N-j}(t)$, for $1 \le j < \frac{1}{4}N - 1$; $u_0(t) = u_{\frac{1}{2}N}(t)$, twice frequency

$N \equiv 0 \pmod 4$
$\tilde{Z}_N$    $u_j(t) = u_0(t + jT/N)$
$Z_2^{\rho} \oplus Z_2^{c}$    $u_j(t) = u_{-j}(t) = u_{\frac{1}{2}N+j}(t + \frac{1}{2}T) = u_{\frac{1}{2}N-j}(t + \frac{1}{2}T)$, for $1 \le j < \frac{1}{4}N - 1$; $u_0(t) = u_{\frac{1}{2}N}(t + \frac{1}{2}T)$; $u_{\frac{1}{4}N}(t) = u_{-\frac{1}{4}N}(t)$, twice frequency
$Z_2^{\rho\sigma} \oplus Z_2^{c}$    $u_j(t) = u_{1-j}(t) = u_{\frac{1}{2}N+j}(t + \frac{1}{2}T) = u_{\frac{1}{2}N+1-j}(t + \frac{1}{2}T)$, for $0 \le j \le \frac{1}{4}N - 1$

As well as existence, we can ask about the linearized stabilities of these solutions. Methods for computing stability, based on Birkhoff normal form, can be found in Montaldi, Roberts, and Stewart [1990].
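The eigenvector computation for the linearization $L = \begin{bmatrix} 0 & I \\ M & 0 \end{bmatrix}$ used in this subsection can be checked numerically. The sketch below is an illustration only: the values N = 12 and α = 0.3 are assumptions chosen for the check, and it takes $\mu_k = 2\alpha(\cos(2\pi k/N) - 1) - 1$ (the eigenvalue of the circulant matrix $M$ on $v_k$) and $\nu_k = \sqrt{-\mu_k}$. It reports the largest entry of $Lu - \lambda u$ over all modes $u_k^{\pm}$ with $\lambda = \pm i\nu_k$.

```python
import cmath
import math

def circulant_M(N, alpha):
    """Coupling matrix: -(1 + 2*alpha) on the diagonal, alpha on the two cyclic neighbours."""
    M = [[0.0] * N for _ in range(N)]
    for j in range(N):
        M[j][j] += -(1.0 + 2.0 * alpha)
        M[j][(j + 1) % N] += alpha
        M[j][(j - 1) % N] += alpha
    return M

def apply_L(M, q, p):
    """Apply L = [[0, I], [M, 0]] to the vector (q, p); returns (p, M q)."""
    N = len(q)
    Mq = [sum(M[i][j] * q[j] for j in range(N)) for i in range(N)]
    return list(p), Mq

def mode(N, alpha, k, sign=+1):
    """Return (nu_k, q, p) with (q, p) = [v_k, sign * i * nu_k * v_k]."""
    mu = 2.0 * alpha * (math.cos(2.0 * math.pi * k / N) - 1.0) - 1.0
    nu = math.sqrt(-mu)  # real because mu < 0 for the alpha used here
    v = [cmath.exp(2j * math.pi * m * k / N) for m in range(N)]
    return nu, v, [sign * 1j * nu * x for x in v]

def residual(N, alpha, k, sign=+1):
    """Largest entry of L u - lambda u for lambda = sign * i * nu_k."""
    nu, q, p = mode(N, alpha, k, sign)
    lam = sign * 1j * nu
    Lq, Lp = apply_L(circulant_M(N, alpha), q, p)
    return max(abs(a - lam * b) for a, b in zip(Lq + Lp, q + p))

N, alpha = 12, 0.3  # illustrative values, not taken from the text
worst = max(residual(N, alpha, k, s) for k in range(N) for s in (+1, -1))
print(worst)
```

A residual at the level of rounding error indicates that each $u_k^{\pm}$ is an eigenvector of $L$ with eigenvalue $\pm i\nu_k$. Note also that $\nu_k = \nu_{N-k}$, so the modes $k$ and $N - k$ always share a frequency.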

Table 5.4. The 17 C-axial solutions for a ring of 12 pendula.

k    0    1     2     3     4     5     6     7     8     9     10    11
0    A    A     A     A     A     A     A     A     A     A     A     A
1    A    A+1   A+2   A+3   A+4   A+5   A+6   A+7   A+8   A+9   A+10  A+11
     A    B     C     D*    C     B     A     B     C     D*    C     B
     A    B     C     C     B     A     A     B     C     C     B     A
2    A    A+2   A+4   A+6   A+8   A+10  A     A+2   A+4   A+6   A+8   A+10
     A    B     B     A     B     B     A     B     B     A     B     B
     A*   B     B     A*    B     B     A*    B     B     A*    B     B
3    A    A+3   A+6   A+9   A     A+3   A+6   A+9   A     A+3   A+6   A+9
     A    B*    A     B*    A     B*    A     B*    A     B*    A     B*
     A*   B     A*    B     A*    B     A*    B     A*    B     A*    B
4    A    A+4   A+8   A     A+4   A+8   A     A+4   A+8   A     A+4   A+8
     A    B     B     A     B     B     A     B     B     A     B     B
     A*   B     B     A*    B     B     A*    B     B     A*    B     B
5    A    A+5   A+10  A+3   A+8   A+1   A+6   A+11  A+4   A+9   A+2   A+7
     A    B     C     D*    C     B     A     B     C     D*    C     B
     A    A     B     C     C     B     A     A     B     C     C     B
6    A    A     A     A     A     A     A     A     A     A     A     A

6

The Coupling Decomposition

Consider a cell system on $N$ nodes, with symmetry group $\Gamma \subset S_N$, where node $i$ has phase space $P_i = \mathbb{R}^{k_i}$. The cell system dynamics is determined by a general $\Gamma$-equivariant vector field $F$ on $P = P_1 \times \cdots \times P_N$. However, in interpretations of such models in applications, it is useful to consider specific 'terms' in the vector field as representing internal dynamics of one component cell, coupling between two specified cells, multi-cell couplings, and so on. Moreover, we may wish to determine whether such terms are linear or absent entirely (the cells are not coupled), and whether the structure of the system is Hamiltonian. In order to give such terminology a precise basis, we develop a decomposition of $F$ into vector fields that correspond to various forms of coupling. Refinements of this decomposition can also be introduced, but here we develop only the main idea. One aim of this decomposition is to provide a rigorous definition of 'point-to-point' coupling. See Definition 6.3.

As motivation, let $N = 3$, all $k_i = 1$, and define $F$ by

$$F(x_1, x_2, x_3) = \begin{bmatrix} 2 + 3x_1^2 + 4x_1x_2 + 5x_2x_3^7 + 6x_1x_2^2x_3 \\ x_3 + 4x_1x_3^2 + x_1x_2x_3 - 2x_1x_2^2x_3^3 \\ 9 + x_1 - x_2 + 3x_3 + x_1^2 + x_2^2 + x_1x_3 + x_2x_3 - 11x_1^3x_2^3x_3 \end{bmatrix}.$$

Given this explicit formula, we can decompose $F$ directly into terms that depend on 0, 1, 2, or 3 of the variables: $F = \Phi_0 + \Phi_1 + \Phi_2 + \Phi_3$ where

$$\Phi_0(x_1, x_2, x_3) = \begin{bmatrix} 2 \\ 0 \\ 9 \end{bmatrix}, \qquad \Phi_1(x_1, x_2, x_3) = \begin{bmatrix} 3x_1^2 \\ x_3 \\ x_1 - x_2 + 3x_3 + x_1^2 + x_2^2 \end{bmatrix},$$

$$\Phi_2(x_1, x_2, x_3) = \begin{bmatrix} 4x_1x_2 + 5x_2x_3^7 \\ 4x_1x_3^2 \\ x_1x_3 + x_2x_3 \end{bmatrix}, \qquad \Phi_3(x_1, x_2, x_3) = \begin{bmatrix} 6x_1x_2^2x_3 \\ x_1x_2x_3 - 2x_1x_2^2x_3^3 \\ -11x_1^3x_2^3x_3 \end{bmatrix}.$$


More importantly, we can obtain the same result with a more abstractly defined decomposition, as follows. Let $\mathcal{N} = \{1, \ldots, N\}$. For each $i \in \mathcal{N}$ choose a base point $b_i \in P_i$. In this paper we assume for simplicity that each $P_i$ is a vector space and we let $b_i = 0$. When the $P_i$ are manifolds technical issues concerning uniqueness arise, which we prefer to ignore here. For each subset $S \subset \mathcal{N}$ define $F_S(x) = F(y)$ where

$$y_i = \begin{cases} x_i & \text{if } i \in S \\ 0 & \text{if } i \notin S. \end{cases}$$

Then define the $S$-coupled part of $F$ to be

$$\Phi_S = \sum_{T \subset S} (-1)^{|S \setminus T|} F_T,$$

where $S \setminus T$ is the set consisting of elements in $S$ that are not in $T$. Finally, define

$$\Phi_k = \sum_{S \subset \mathcal{N},\ |S| = k} \Phi_S.$$

We claim that $F$ is the sum of the $\Phi_k$, and that these components can sensibly be interpreted as the $k$-node coupling terms. First, we need to recall a standard result from combinatorics:

6.1 Lemma. Let $Y$ be a finite set. Then

$$\sum_{X \subset Y} (-1)^{|X|} = \begin{cases} 1 & \text{if } Y = \emptyset \\ 0 & \text{if } Y \ne \emptyset. \end{cases}$$

Proof. Let $|Y| = m$. If $m = 0$ the result is clear. Otherwise let $t_1, \ldots, t_m$ be indeterminates. Consider the identity

$$(1 + t_1) \cdots (1 + t_m) = \sum_{X \subset Y} \prod_{i \in X} t_i.$$

Now substitute $t_i = -1$ for all $i$.

6.2 Proposition. With the above notation:

1. $F = \Phi_0 + \cdots + \Phi_N$.
2. Suppose that $T \subset \mathcal{N}$ and $|T| = k > 0$. Then $\frac{\partial}{\partial x_i} \Phi_T = 0$ for all $i \notin T$.
3. Each $\Phi_k$ is $\Gamma$-equivariant.

Proof. To prove the first statement, observe that

$$\Phi_0 + \cdots + \Phi_N = \sum_{k=0}^{N} \sum_{|S| = k} \sum_{T \subset S} (-1)^{|S \setminus T|} F_T = \sum_{S} \sum_{T \subset S} (-1)^{|S \setminus T|} F_T.$$

The coefficient of $F_T$ is

$$\sum_{S \supseteq T} (-1)^{|S \setminus T|} = \sum_{X \subset U} (-1)^{|X|}$$

where $U = \mathcal{N} \setminus T$. By Lemma 6.1 this coefficient is 0 unless $U = \emptyset$, that is, unless $T = \mathcal{N}$. Hence $\Phi_0 + \cdots + \Phi_N = F_{\mathcal{N}} = F$ as claimed. The second statement follows immediately from the definition of $\Phi_S$. The third statement follows since $\Gamma$ permutes the cells, $F$ is $\Gamma$-equivariant, and $\Phi_k$ is defined as a sum over all subsets $S \subset \mathcal{N}$ that contain $k$ elements.

6.3 Definition. The cell system $F$ has point-to-point coupling if $\Phi_k = 0$ for all $k \ge 3$. More generally, the coupling degree of $F$ is the largest $k$ for which $\Phi_k \ne 0$.

There is a Hamiltonian analogue of all this: decompose the Hamiltonian in the same way. The decomposition of the Hamiltonian induces the above decomposition on the Hamiltonian vector field.
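The inclusion-exclusion definition of $\Phi_S$ can be tried out on the motivating three-cell example. The following is a minimal sketch under stated assumptions: cells are indexed 0, 1, 2 instead of 1, 2, 3, the base points are $b_i = 0$, the sample point `x` is an arbitrary integer choice so that all arithmetic is exact, and the helper names are illustrative.

```python
from itertools import combinations

def F(x):
    """The example vector field with N = 3 and one-dimensional cells."""
    x1, x2, x3 = x
    return (
        2 + 3*x1**2 + 4*x1*x2 + 5*x2*x3**7 + 6*x1*x2**2*x3,
        x3 + 4*x1*x3**2 + x1*x2*x3 - 2*x1*x2**2*x3**3,
        9 + x1 - x2 + 3*x3 + x1**2 + x2**2 + x1*x3 + x2*x3 - 11*x1**3*x2**3*x3,
    )

def subsets(S):
    """All subsets T of S, as tuples."""
    S = tuple(S)
    for r in range(len(S) + 1):
        yield from combinations(S, r)

def F_restricted(F, T, x):
    """F_T(x): evaluate F with every coordinate outside T set to the base point 0."""
    y = tuple(xi if i in T else 0 for i, xi in enumerate(x))
    return F(y)

def Phi_S(F, S, x):
    """S-coupled part: alternating sum of F_T over all subsets T of S."""
    total = [0] * len(x)
    for T in subsets(S):
        sign = (-1) ** (len(S) - len(T))
        total = [t + sign * f for t, f in zip(total, F_restricted(F, T, x))]
    return tuple(total)

def Phi_k(F, k, x):
    """k-node coupling term: sum of Phi_S over all S with |S| = k."""
    total = [0] * len(x)
    for S in combinations(range(len(x)), k):
        total = [t + p for t, p in zip(total, Phi_S(F, S, x))]
    return tuple(total)

x = (2, -1, 3)
parts = [Phi_k(F, k, x) for k in range(4)]
print(parts)
print(tuple(sum(c) for c in zip(*parts)) == F(x))
```

Evaluating the explicit formulas for $\Phi_0, \ldots, \Phi_3$ at the same point gives the same four vectors, and their sum reproduces $F(x)$, as the first part of Proposition 6.2 asserts. Since $\Phi_3 \ne 0$ here, this example has coupling degree 3 and is not point-to-point in the sense of Definition 6.3.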

Acknowledgments: We thank Jeroen Lamb for helpful discussions and the Center for Biodynamics, Boston University for its hospitality and support. This research was supported in part by NSF Grant DMS-0071735.

References Abraham, R. and J. E. Marsden [1978], Foundations of Mechanics, Benjamin– Cummings, New York. Armbruster, D. and P. Chossat [1999], Remarks on multi-frequency oscillations in (almost) symmetrically coupled oscillators, Phys. Lett. A 254, 269–274. Arrowsmith, D. K. and C. M. Place [1990], An Introduction to Dynamical Systems, Cambridge University Press, Cambridge. Barany, E., M. Dellnitz, and M. Golubitsky [1993], Detecting the symmetry of attractors, Physica D 67, 66–87. Bracikowski, C. and R. Roy [1990], Chaos in a multimode solid-state laser system, Chaos 1, 49–64. Bressloﬀ, P. C., J. D. Cowan, M. Golubitsky, P. J. Thomas, and M. C. Wiener [2001], Geometric visual hallucinations, Euclidean symmetry, and the functional architecture of striate cortex. Phil. Trans. Royal Soc. London B 356, 299–330. Buono, P-L. [1998], A Model of Central Pattern Generators for Quadruped Locomotion, Ph.D Dissertation, U Houston.


Buono, P-L. [2001], Models of Central Pattern Generators for Quadruped Locomotion: II. Secondary Gaits, J. Math. Biol. 42 No. 4 327–346. Buono, P-L. and M. Golubitsky [2001], Models of Central Pattern Generators for Quadruped Locomotion: I. Primary Gaits, J. Math. Biol. 42 No. 4, 291–326. Cohen, J. and I. Stewart [2000], Polymorphism viewed as phenotypic symmetrybreaking, in: Nonlinear Phenomena in Physical and Biological Sciences (S.K. Malik ed.), Indian National Science Academy, New Delhi, 1–67. Collins, J. J. and I. Stewart [1993], Coupled nonlinear oscillators and the symmetries of animal gaits, J. Nonlin. Sci. 3, 349–392. Collins, J. J. and I. Stewart [1994], A group-theoretic approach to rings of coupled biological oscillators, Biol. Cybern. 71, 95–103. Dias, A. P. S. [1998], Hopf bifurcation for wreath products, Nonlinearity 11, 247–264. Dionne, B., M. Golubitsky, and I. Stewart [1996], Coupled cells with internal symmetry Part I: wreath products, Nonlinearity 9 (1996) 559–574; Part II: direct products, 575–599. Elmhirst, T. [2000], Symmetry and Emergence in Polymorphism and Sympatric Speciation, Ph.D. Thesis, Math. Inst., U Warwick (to appear). Ermentrout, G. B. and N. Kopell [1991], Multiple pulse interactions and averaging in systems of coupled neural oscillators, J. Math. Biol. 29, 195–217. Field, M. J. [1980], Equivariant dynamical systems, Trans. Amer. Math. Soc. 229, 185–205. Field, M.J. [1996], Lectures on Bifurcations, Dynamics and Symmetry, Research Notes in Mathematics 356 Longman, London. Field, M., I. Melbourne, and M. Nicol [1996], Symmetric attractors for diﬀeomorphisms and ﬂows, Proc. Lond. Math. Soc. (3) 72, 657–696. Fowles, G. R. [1986], Analytical Mechanics, Saunders, Philadelphia. Golubitsky, M., J. E. Marsden, I. Stewart, and M. Dellnitz [1995], The constrained Liapunov–Schmidt procedure and periodic orbits, Fields. Inst. Commun. 4, 81-127. Golubitsky, M., M. Pivato, and I. Stewart [2002], Interior symmetries in coupled cell networks. 
Preprint. Golubitsky, M. and D. G. Schaeﬀer [1985], Singularities and Groups in Bifurcation Theory I, Applied Mathematical Sciences 51, Springer-Verlag, New York. Golubitsky, M. and I. Stewart [1986], Hopf bifurcation with dihedral group symmetry: coupled nonlinear oscillators, in Multiparameter Bifurcation Theory (M. Golubitsky and J. Guckenheimer eds.), Proceedings of the AMS–IMS– SIAM Joint Summer Research Conference, July 1985, Arcata; Contemporary Math. 56 Amer. Math. Soc., Providence RI, 131–173. Golubitsky, M. and I. Stewart [2002], The Symmetry Perspective: From Equilibrium to Chaos in Phase Space and Physical Space. Progress in Mathematics 200, Birkh¨ auser, Basel. Golubitsky, M., I. Stewart, P-L. Buono, and J. J. Collins [1998], A Modular Network for Legged Locomotion, Physica D 115, 56–72


Golubitsky, M., I. Stewart, P-L. Buono, and J. J. Collins [1999], The role of symmetry in animal locomotion, Nature 401, 693–695. Golubitsky, M., I. Stewart, and D. Schaeﬀer [1988], Singularities and Groups in Bifurcation Theory: Vol.II, Appl. Math. Sci. 69, Springer-Verlag, New York. Griﬃths J. B. [1985], The Theory of Classical Dynamics, Cambridge University Press, Cambridge. Guckenheimer, J. and P. Holmes [1983], Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Appl. Math. Sci. 42, Springer-Verlag, New York. Guckenheimer, J. and P. Holmes [1988], Structurally stable heteroclinic cycles, Math. Proc. Camb. Phil. Soc. 103, 189–192. Hadley, P., M. R. Beasley, and K. Wiesenfeld [1988], Phase locking of Josephsonjunction series arrays, Phys. Rev. B 38 No. 13, 8712–8719. H´enon, M. and C. Heiles [1964], The applicability of the third integral of motion; some numerical experiments, Astronom. J. 69, 73–79. Hubel, D. H. and T. N. Wiesel [1974], Sequence regularity and geometry of orientation columns in the monkey striate cortex, J. Comp. Neurol. 158, 267–294. Huey, R. B., G. W. Gilchrist, M. L. Carlson, D. Berrigan, and L. Serra [2000], Rapid evolution of a geographic cline in size in an introduced ﬂy, Science 287, 308–310. Kopell, N. [1988], Toward a theory of modelling central pattern generators, in: Neural Control of Rhythmic Movements in Vertebrates (A.H. Cohen, S. Rossignol and S. Grillner, eds.) New York, Wiley, 369–413. Kopell, N. and G. B. Ermentrout [1986], Symmetry and phaselocking in chains of weakly coupled oscillators, Comm. Pure Appl. Math. 39, 623–660. Kopell, N. and G. B. Ermentrout [1988], Coupled oscillators and the design of central pattern generators, Math. Biosci. 89, 14–23. Kopell, N. and G. B. Ermentrout [1990], Phase transitions and other phenomena in chains of oscillators, SIAM J. Appl. Math. 50, 1014–1052. Lack, D. 
[1968], Darwin’s Finches: an Essay on the General Biological Theory of Evolution, Peter Smith, Gloucester MA. Mayr, E. [1963], Animal Species and Evolution, Belknap Press, Cambridge MA. Mayr, E. [1970], Populations, Species, and Evolution, Harvard University Press, Cambridge MA. Melbourne, I., M. Dellnitz, and M. Golubitsky [1993], The structure of symmetric attractors, Arch. Rational Mech. Anal. 123, 75–98. Montaldi, J. A., R. M. Roberts, and I. Stewart [1987], Nonlinear normal modes of symmetric Hamiltonian systems, in Structure Formation in Physics (G. Dangelmayr and W. Guttinger, eds.), Springer-Verlag, New York, 354–371. Montaldi, J. A., R. M. Roberts, and I. Stewart [1988], Periodic solutions near equilbria of symmetric Hamiltonian systems, Phil. Trans. R. Soc. Lond. A 325, 237–293. Montaldi, J. A., R. M. Roberts, and I. Stewart [1990], Stability of nonlinear normal modes of symmetric Hamiltonian systems, Nonlinearity 3, 731–772.


Morris, C. and H. Lecar [1981], Voltage oscillations in the barnacle giant muscle ﬁber, Biophysical J. 35, 193–213. Moser, J. [1976], Periodic orbits near equilibrium and a theorem by Alan Weinstein, Commun. Pure Appl. Math. 29, 727–747. Pennisi, E. [2000], Nature steers a predictable course, Science 287, 207–208. Rand, R. H., A. H. Cohen, and P. J. Holmes [1988], Systems of coupled oscillators as models of central pattern generators, in: Neural Control of Rhythmic Movements in Vertebrates (A. H. Cohen, S. Rossignol and S. Grillner, eds.) New York, Wiley, 333–367. Ridley, M. [1996] Evolution, Blackwell, Oxford. Rundle, H. D., L. Nagel, J. W. Boughman, and D. Schluter [2000], Natural selection and parallel speciation in sympatric sticklebacks, Science 287, 306–308. Stewart, I., T. Elmhirst, and J. Cohen [2000], Symmetry, stochastics, and sympatric speciation (in preparation). Stewart, I., T. Elmhirst and J. Cohen. Symmetry-breaking as an origin of species. In: Bifurcations, Symmetry, and Patterns (J. Buescu, S. B. S. D. Castro, A. P. S. Dias and I. S. Labouriau, eds), Birkh¨ auser, Basel, (to appear). Synge, J. L. and B. A. Griﬃth [1959], Principles of Mechanics, McGraw–Hill, New York. Wang, S. S. and H. G. Winful [1988], Dynamics of phase-locked semiconductor laser arrays, Appl. Phys. Lett. 52, 1744–1776. Weinstein, A. [1973], Normal modes for nonlinear Hamiltonian systems, Invent. Math. 20, 47–57.

9 Simple Choreographic Motions of N Bodies: A Preliminary Study

Alain Chenciner, Joseph Gerver, Richard Montgomery, Carles Simó

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT A “simple choreography” for an N-body problem is a periodic solution in which all N masses trace the same curve without colliding. We shall require all masses to be equal and the phase shift between consecutive bodies to be constant. The first 3-body choreography for the Newtonian potential, after Lagrange’s equilateral solution, was proved to exist by Chenciner and Montgomery in December 1999 (Chenciner and Montgomery [2000]). In this paper we prove the existence of planar N-body simple choreographies with arbitrary complexity and/or symmetry, and any number N of masses, provided the potential is of strong force type (behaving like $1/r^a$, $a \ge 2$, as $r \to 0$). The existence of simple choreographies for the Newtonian potential is harder to prove, and we fall short of this goal. Instead, we present the results of a numerical study of the simple Newtonian choreographies, and of the evolution with respect to a of some simple choreographies generated by the potentials $1/r^a$, focusing on the fate of some simple choreographies guaranteed to exist for $a \ge 2$ which disappear as a tends to 1.

Contents
1 Introduction
  1.1 Literature
2 Simple Choreographies: The Theorem
  2.1 An Alternative Description
  2.2 Remark on Imposing Additional Symmetries
3 Proof
4 Numerical Investigations
  4.1 Minimization Methods
  4.2 Newton’s Method
5 Main Choreographies, Satellites, Linear Chains
  5.1 On Main and Satellite Choreographies
  5.2 The Linear Chains
6 Evolution of the Choreographies with the Potential
7 Conclusions
References

1 Introduction

We will prove the existence of new families of periodic solutions to the N-body problem. In these solutions all N masses travel along a fixed curve in the plane. These solutions are topologically interesting, and pleasing to the eye. See the figures herein, although it is better to look at animations (see footnote 1). The N-body problem with N equal masses concerns the study of the differential equations

\[ \frac{d^2 x_i}{dt^2} = \nabla_i U(x_1, \dots, x_N). \tag{1.1} \]

Here $U(x) = U(x_1, \dots, x_N)$ is the negative of the potential energy. The vectors $x_i \in \mathbb{R}^d$, $i = 1, 2, \dots, N$, represent the positions of N masses moving in $\mathbb{R}^d$. We will only be concerned here with the planar case, d = 2. We take all the masses to be equal to 1. We assume that U has the form

\[ U(x) = \sum_{1 \le i < j \le N} f(r_{ij}), \tag{1.2} \]

where $r_{ij} = |x_i - x_j|$ is the distance between the ith and jth mass and where the two-body potential $f(r)$ is a smooth non-negative function of $r > 0$ which blows up as r tends to 0. The potential is said to be Newtonian when $f = c/r$ for some $c > 0$. A collision-free solution for the N-body problem in which all masses move on the same planar curve with a constant phase shift will be called a simple choreography. Lagrange [1772] found a simple choreography in the case N = 3, for the Newtonian potential. The three masses form the vertices of an equilateral triangle which rotates rigidly within its circumscribing circle. More generally, place N equal masses on a circle of radius r, so as to form the vertices of a regular N-gon. Rotate this N-gon rigidly about the center of the circle with angular velocity ω. The resulting curve will be a solution to the N-body equations. For concreteness, assume the potential is $f(r) = c/r^a$, $c > 0$, $a > 0$. Then the condition on the radius of the circle is

\[ r\omega^2 = \frac{ac}{r^{a+1}}\,\sigma_{a,N}, \qquad\text{where}\qquad \sigma_{a,N} = \frac{1}{2}\sum_{j=1}^{N-1}\left(2\sin\frac{j\pi}{N}\right)^{-a}. \]

(The mass at angular distance $2\pi j/N$ lies at distance $2r\sin(j\pi/N)$ and attracts with force $ac/(2r\sin(j\pi/N))^{a+1}$; only the radial component, which carries a further factor $\sin(j\pi/N)$, contributes to the centripetal balance.)

Footnote 1. Some animations, to be run under Linux or Unix using gnuplot, are available at http://www.maia.ub.es/dsg.

We will call this the trivial circular simple choreography.

Figure 1.1. Three bodies on the eight.

Figure 1.2. Five bodies on a 4-petal flower.
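As a numerical aside, equations (1.1) and (1.2) and the balance condition for the trivial circular choreography can be checked directly. The following Python sketch is illustrative only: the function names (`potential`, `acceleration`, `sigma`, `regular_polygon`, `circular_residual`) and the use of NumPy are assumptions of this example, not part of the original study. It takes $\sigma_{a,N}$ with the factor 1/2 that results from projecting each pairwise force onto the radial direction, and measures how far the regular N-gon is from satisfying $\ddot x_i = -\omega^2 x_i$.

```python
import numpy as np


def potential(x, c=1.0, a=1.0):
    """U(x) from (1.2) with f(r) = c / r**a; x has shape (N, 2)."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            total += c / np.linalg.norm(x[i] - x[j]) ** a
    return total


def acceleration(x, c=1.0, a=1.0):
    """Right-hand side of (1.1): the gradient of U with respect to each x_i."""
    acc = np.zeros_like(x)
    for i in range(len(x)):
        for j in range(len(x)):
            if i != j:
                d = x[j] - x[i]
                acc[i] += a * c * d / np.linalg.norm(d) ** (a + 2)
    return acc


def sigma(a, n):
    """sigma_{a,N} = (1/2) * sum_{j=1}^{N-1} (2 sin(j*pi/N))**(-a)."""
    j = np.arange(1, n)
    return 0.5 * np.sum((2.0 * np.sin(j * np.pi / n)) ** (-a))


def regular_polygon(n, r):
    """Vertices of a regular N-gon of radius r, vertex 0 on the positive x-axis."""
    theta = 2.0 * np.pi * np.arange(n) / n
    return r * np.column_stack((np.cos(theta), np.sin(theta)))


def circular_residual(n, a=1.0, c=1.0, omega=1.0):
    """Largest component of (acceleration + omega**2 * x) on the balanced N-gon."""
    r = (a * c * sigma(a, n) / omega**2) ** (1.0 / (a + 2.0))
    x = regular_polygon(n, r)
    return np.max(np.abs(acceleration(x, c, a) + omega**2 * x))


print(circular_residual(4))          # Newtonian case, N = 4
print(circular_residual(5, a=2.0))   # strong-force case, N = 5
```

Both printed residuals should be at the level of rounding error, which is a quick sanity check on the sign conventions of (1.1) and (1.2).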

In December of 1999 two of us (see Chenciner and Montgomery [2000]) found another simple choreography for the Newtonian three-body problem. In this new solution three equal masses travel a ﬁxed “ﬁgure eight” shaped curve in the plane (see Figure 1.1). This ﬁgure eight started oﬀ a ﬂurry of work. Soon afterwards another one of us (J. G.) wondered whether the circle and ﬁgure eight might be generalized to other Lissajous-like curves. He soon found initial conditions for N = 4 which led to a simple “chain” choreography in the Newtonian case (see Figure 4.1b). The four masses form a parallelogram at each time instant. Then C. S., the fourth member of our team, found a whole slew of numerical solutions in which all of the bodies move on a single curve, and with quite diﬀerent shapes of curves (see Figures 4.1 and 4.2). He coined the name “choreography” because of the dance-like movement of the bodies in animations. The qualiﬁcation “simple” refers to the fact that all of the bodies lie on a single curve, “multiple” choreographies being reserved for solutions where the bodies move on different curves. As this paper deals only with simple choreographies we shall often skip the word “simple”. Hundreds of simple Newtonian choreography solutions have now been found, the number of “distinct” choreographies growing quickly as a function of N . The largest N achieved so far is N = 799, with the bodies moving on a ﬁgure eight curve. When we say “distinct”, we are counting only what we call the “main” choreographies, that is those which are not derived from a given choreography either by travelling around it a multiple number of times (subharmonics) or by a continuation in which the angular momentum is varied, or even by a combination of these two constructions. The precise deﬁnition of “main” and “satellite” choreographies, together with examples and counting, will be given in Section 5.


1.1 Conjecture. For every N ≥ 3 there is a main simple choreography solution for the equal mass Newtonian N-body problem different from the trivial circular one. The number of such ‘distinct’ main simple choreographies grows rapidly with N.

We will prove this conjecture, but only after replacing the Newtonian two-body potential $f = 1/r$ by a strong-force potential, a suggestion which goes back to Poincaré. For the precise statement, see Theorem 2.2 below. We call a potential strong-force provided there exist positive constants $c$, $\delta$ such that its two-body potential f satisfies:

\[ f(r) \ge c/r^2 \quad \text{whenever } r < \delta. \tag{1.3} \]

Imposing the strong force condition is a cheap way to get around the main obstruction to proving existence of simple choreographies, which is the existence of finite action collision solutions. Such collision solutions are present in the Newtonian case, and indeed for every $f(r) = 1/r^a$ with $a < 2$. Our real interest is establishing the existence of Newtonian choreographies and we will return to this in future papers.

1.1 Literature

The search for periodic solutions of the N-body problem is more tractable in the case of equal masses than in the general case due to the symmetries of mass interchange. To our knowledge, this observation first appears explicitly (but only in the spatial case) in the paper by Davies, Truman, and Williams [1983]. Among other things, these authors were looking for periodic solutions of the equal mass Newtonian N-body problem in $\mathbb{R}^3$ whose configurations were invariant under an orientation-reversing isometry at each instant. After reducing by rotations, they get a two-degree-of-freedom system, parameterized by the angular momentum (the symplectic reduction). They look for periodic orbits whose projections to the reduced phase space are “brake orbits”. A brake orbit is a periodic solution which traces out the image of an interval, going back and forth, ‘braking’ to zero (reduced) velocity and changing direction at the endpoints of this interval. In order to give rise to a periodic solution of the unreduced system, the period of the reduced (brake) orbit must be in resonance with the period of rotation of the system. The existence of such resonant brake orbits was only established numerically. Periodic solutions of this kind were rediscovered recently at least twice: the “pelotes”, found numerically by Hoynant [1999], which include an a priori infinite set of examples where at least 4 bodies travel on one and the same spatial curve, and the “HipHop”, whose existence is proved by Chenciner and Venturelli [2000] as a collisionless minimizer of the action under the appropriate symmetry conditions. The paper by Davies, Truman, and Williams was the incentive for the systematic study by Stewart [1996] of symmetry methods in N-body problems with a non-singular potential.

Another important paper, of which we became aware only after our paper was nearly completed, is Moore [1993]. Moore investigates the possibility of realizing pure braids on N strands by periodic solutions to planar N-body problems. Simple choreographies correspond to certain special types of braids, hence Moore’s paper has close relations to ours. His tool is the gradient flow for the action functional. He obtains the result, rediscovered by Montgomery [1998], that for strong-force potentials any “tangled” braid type can be realized. He asserts the existence of the figure eight solution in the Newtonian case, based on a numerical investigation of the convergence of the gradient flow, and he discusses its dynamical stability. He also discusses the dependence of choreography solutions (and their existence or disappearance) on the exponent a of $f = 1/r^a$, thus presaging the discussion of our Section 6.

Applications of “the eight” are starting to appear. Heggie [2000] has numerical evidence that it can appear as an outcome of the interaction of two binaries.

2 Simple Choreographies: The Theorem

We are interested in periodic solutions in which all N masses travel the same curve q(t). The period of these solutions is not important to us. Our proofs work for any period. Furthermore, in the homogeneous potential case, scaling allows one to obtain any desired period. At this point it is convenient for us to take this period to be N. Thus we are searching for solutions to the N-body problem which have the form

\[ x_j(t) = q(t + j), \qquad j = 0, \dots, N - 1, \tag{2.1} \]

with $q(t) = q(t + N)$ (see Chenciner and Montgomery [2000]) after renumbering. We will say that a curve has a collision if $q(t) = q(t + i)$ for some time t and some integer i, $i = 1, \dots, N - 1$. We want solutions without collisions. With this in mind, let $\mathcal{C} = C^0(S^1, \mathbb{C})$ be the set of all continuous curves $q : S^1 = \mathbb{R}/N\mathbb{Z} \to \mathbb{C}$ endowed with the usual $C^0$-topology. Define the discriminant locus $\mathcal{D} \subset \mathcal{C}$ to be the set of all those curves along which there is some collision.

2.1 Definition. A simple choreography class is a component of $\mathcal{C} \setminus \mathcal{D}$.

The main theoretical result of this paper is:

2.2 Theorem. Given any simple choreography class, there is a periodic solution of any planar strong-force N-body problem (see equation (1.3)) which realizes this class.


Compare with Moore [1993] and Montgomery [1998] in which an analogous result is established for any braid class. Examples of simple choreographies are given by the figures in this paper. Most of these are for Newton’s (non-strong force) potential. In Simó [2001a] several other families are displayed. In Section 5 we prove that the number of “main” simple choreographies increases at least exponentially with N.
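The ansatz (2.1) and the collision condition $q(t) = q(t + i)$ are easy to explore on sampled curves. The sketch below is only an illustration under stated assumptions: the curve is given as complex samples over one period N, the number of samples is a multiple of N, and the helper names `bodies_from_curve` and `min_separation` are hypothetical. A sampled minimum distance can suggest, but of course not prove, that a curve lies outside the discriminant locus.

```python
import numpy as np


def bodies_from_curve(q_samples, n_bodies):
    """Rows x_j(t_m) = q(t_m + j) for samples q_samples[m] = q(m * N / M).

    M = len(q_samples) must be a multiple of N = n_bodies, so that a time
    shift by 1 is a shift by M / N samples.
    """
    m = len(q_samples)
    if m % n_bodies != 0:
        raise ValueError("number of samples must be a multiple of n_bodies")
    shift = m // n_bodies
    # np.roll(q, -k)[m] == q[(m + k) % M], i.e. the curve advanced by k samples
    return np.array([np.roll(q_samples, -j * shift) for j in range(n_bodies)])


def min_separation(q_samples, n_bodies):
    """Smallest sampled distance between two distinct bodies on the curve."""
    x = bodies_from_curve(q_samples, n_bodies)
    best = np.inf
    for i in range(n_bodies):
        for j in range(i + 1, n_bodies):
            best = min(best, np.min(np.abs(x[i] - x[j])))
    return best


N, M = 3, 300
t = N * np.arange(M) / M                      # one period of length N
circle = np.exp(2j * np.pi * t / N)           # bodies at the vertices of a triangle
lissajous_eight = np.sin(2 * np.pi * t / N) + 0.5j * np.sin(4 * np.pi * t / N)
triple_circle = np.exp(2j * np.pi * t)        # period 1: all bodies coincide

print(min_separation(circle, N))
print(min_separation(lissajous_eight, N))
print(min_separation(triple_circle, N))
```

The Lissajous curve used here is just a convenient eight-shaped test curve, not the figure eight solution itself. The third curve has $q(t) = q(t + 1)$, so it lies in the discriminant locus and its sampled separation vanishes up to rounding.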

2.1 An Alternative Description

It is illuminating to have another description of simple choreographies (see Chenciner [2002]). The configuration space for the planar N-body problem is $\mathbb{C}^N$. Think of $S^1 = \mathbb{R}/N\mathbb{Z}$ as a circle of circumference N, drawn in the plane. Inscribe within this $S^1$ a regular N-gon, with vertices labelled in cyclic order and vertex 0 on the positive x-axis. The image of vertex j under a map $x : S^1 \to \mathbb{C}^N \setminus \Delta$ is to represent the initial position $x_j(0)$ of mass j, with $j = 0, 1, \dots, N - 1$. As the N-gon rotates rigidly within the circle, these image points move, thus sweeping out a curve in $\mathbb{C}^N$. Now the group $\mathbb{Z}_N$ acts on $S^1$ by rotations, taking our standard N-gon to itself, with the standard generator γ of the group acting on a point $t \in S^1$ by $t \mapsto \gamma \circ t = t + 1$. This same generator acts on $\mathbb{C}^N$ by permuting the masses: $x = (x_0, \dots, x_{N-1}) \mapsto \gamma \circ x = (x_1, \dots, x_{N-1}, x_0)$.

Now we make a crucial observation. A map $x : S^1 \to \mathbb{C}^N$ is equivariant with respect to this $\mathbb{Z}_N$ action, i.e. $x(\gamma \circ t) = \gamma \circ x(t)$, if and only if $x_j(t) = x_0(t + j)$. In other words, the $\mathbb{Z}_N$-equivariant maps into $\mathbb{C}^N$ correspond precisely to closed curves $q : S^1 \to \mathbb{C}$ in the plane, with the correspondence being given by the equation (2.1) above. This defines a natural correspondence $\mathcal{C} := C^0(S^1, \mathbb{C}) \leftrightarrow C^0(S^1, \mathbb{C}^N)_{\mathbb{Z}_N}$, where the subscript $\mathbb{Z}_N$ denotes equivariance with respect to that group. Moreover, a curve is collision-free in our original sense if and only if its corresponding curve in $C^0(S^1, \mathbb{C}^N)_{\mathbb{Z}_N}$ has no collisions, i.e., no points with $x_i = x_j$ when $i \ne j$, for $i, j = 0, \dots, N - 1$. This establishes a natural correspondence between the space of loops $\mathcal{C} \setminus \mathcal{D}$ of the beginning and the space $C^0(S^1, \mathbb{C}^N \setminus \Delta)_{\mathbb{Z}_N}$, where $\Delta \subset \mathbb{C}^N$ is the set of all possible collisions between any distinct masses. (Δ is sometimes called the “fat diagonal”.) Thus a simple choreography class is the same as a component of the space $C^0(S^1, \mathbb{C}^N \setminus \Delta)_{\mathbb{Z}_N}$ of collision-free equivariant loops in configuration space.

2.2 Remark on Imposing Additional Symmetries

Various other groups act on $S^1$ and on $\mathbb{C}^N$. By imposing these as additional symmetries we can obtain beautiful symmetric patterns for our N-body choreography solutions. Fix a finite group Γ containing $\mathbb{Z}_N$, and acting on both $S^1$ and on $\mathbb{C}^N$ in such a way that it preserves the Lagrangian, and such that the restriction of the action to $\mathbb{Z}_N$ agrees with the previously defined action of $\mathbb{Z}_N$. Replace $C^0(S^1, \mathbb{C}^N \setminus \Delta)_{\mathbb{Z}_N}$ by $C^0(S^1, \mathbb{C}^N \setminus \Delta)_{\Gamma} \subset C^0(S^1, \mathbb{C}^N \setminus \Delta)_{\mathbb{Z}_N}$, the space of Γ-equivariant loops. Then the definition of choreography classes extends to yield that of equivariant choreography classes, and our main theorem still holds in the equivariant case.

The groups Γ we have in mind are cyclic ($\mathbb{Z}_{Nm}$) or dihedral ($D_{Nm}$) extensions of $\mathbb{Z}_N$, or products of these by a subgroup of O(2). Recall that the dihedral group $D_k$, the symmetry group of a regular k-gon, is a nontrivial extension of $\mathbb{Z}_k$ by $\mathbb{Z}_2$ which admits the presentation $\langle s, \sigma \mid s^k = 1,\ \sigma^2 = 1,\ \sigma s \sigma = s^{-1} \rangle$. This group may be put, usually in several ways, in the form of a semi-direct product. For example, $D_6$ is a semi-direct product of $\mathbb{Z}_3$ by $\mathbb{Z}_2 \times \mathbb{Z}_2$, $D_{12}$ is a semi-direct product of $\mathbb{Z}_4$ by $D_3$, etc. We need to define their actions. We shall always take the action of $D_k$ on $S^1$ (of length N) defined by $s \cdot t = t + N/k$, $\sigma \cdot t = -t$, but we may define different actions on $\mathbb{C}^N$. The only condition will be that the restriction of the action to the normal subgroup $\mathbb{Z}_N$ (generated by $s^m$) be the one defined in 2.1, that is $s^m \cdot (x_0, x_1, \dots, x_{N-1}) = (x_1, x_2, \dots, x_0)$. Let us take for example N = 3 and $\Gamma = D_6$. As a first action of $D_6$ on $\mathbb{C}^3$ we define

\[ s \cdot (x_0, x_1, x_2) = (-x_2, -x_0, -x_1), \qquad \sigma \cdot (x_0, x_1, x_2) = (x_0, x_2, x_1). \]

For the second one, we take

\[ s \cdot (x_0, x_1, x_2) = (-x_2, -x_0, -x_1), \qquad \sigma \cdot (x_0, x_1, x_2) = (-x_0, -x_2, -x_1). \]

An example of an equivariant loop for the ﬁrst action of D6 is the Lagrange equilateral solution where the three bodies chase each other around a circle, x0 being at time 0 on the positive intersection of the circle with the horizontal (= real) axis. An example of an equivariant loop for the second action is the eight with x0 being at the origin when t = 0. Note that, on the contrary, the supereight with four bodies (Figure 4.1b) shares equivariance under some action of the group D4 × Z2 on C4 , with the relative equilibrium solution where the four bodies form a rigid square and chase each other around a circle. (These two represent diﬀerent topological, or choreography classes, however). Planar choreographies which enjoy k-fold dihedral symmetry have the pattern of ﬂowers with k petals (see Figures 1.2, 4.1c and 4.2e), or, when the petals overlap tightly, they look like pictures drawn by a children’s drawing toy, the spirograph. We leave to the reader the deﬁnition and representation of the corresponding groups Γ (in the case of Figure 4.1c, for example, the group is D12 , see Chenciner [2002] for more details).

3 Proof

Except for the (fundamental) symmetry considerations, the following proof is essentially due to Poincaré, with the following unimportant differences: Poincaré was working with homology instead of homotopy and looked for periodic orbits in a rotating frame. We use the direct method of the calculus of variations (see Montgomery [2002] for more details). The action for our N-body problem is given by

\[ A(x) = \int_0^T \left[ \tfrac{1}{2} K(\dot x(t)) + U(x(t)) \right] dt, \tag{3.1} \]

where $T = N$, $K(\dot x) = \sum_{i=0}^{N-1} |\dot x_i|^2$ and $U(x) = \sum_{1 \le i < j \le N} f(|x_i - x_j|)$, with f as in (1.3). If $U(x) \ge 0$, which we henceforth assume, and if the action of the curve x is finite, then the derivative $\dot x$ is square integrable, which is to say that x lies in the Sobolev space $H^1(S^1, \mathbb{C}^N)$. If x is a critical point of A, and if x has no collisions, then x is an N-periodic solution to (1.1). This is a basic, well-known result in mechanics and in the calculus of variations. Collisions have to be excluded because (1.1) breaks down at collisions, and because the action is not differentiable at paths with collisions, despite some potentials (e.g., the Newtonian one) being regularizable.

According to “the principle of symmetric criticality” (see for example Palais [1979]) this same statement holds for Γ-equivariant paths. More precisely, let Γ be any finite group acting on both $S^1$ and on $\mathbb{C}^N$ by isometries in such a way that it preserves the potential U. Then Γ preserves the Lagrangian and hence leaves the action unchanged: $A(x) = A(g \circ x)$, for $g \in \Gamma$. Let $H^1(S^1, \mathbb{C}^N)_{\Gamma} \subset H^1(S^1, \mathbb{C}^N)$ be the set of all equivariant paths with square-integrable derivative. Suppose that x is collision-free, and that $dA(x)(v) = 0$ for all $v \in H^1(S^1, \mathbb{C}^N)_{\Gamma}$. Then x is a solution to (1.1). The proof proceeds by using reducibility of Γ-representations to show that $dA(x)(v) = 0$ for all $v \in H^1(S^1, \mathbb{C}^N)_{\Gamma}$ implies that $dA(x)(v) = 0$ for all $v \in H^1(S^1, \mathbb{C}^N)$, and hence that x is a critical point within the bigger loop space $H^1(S^1, \mathbb{C}^N)$ (a direct proof is given in Chenciner [2002]).

Recall $H^1(S^1, \mathbb{C}^N) \subset C^0(S^1, \mathbb{C}^N)$. This is one of the simplest instances of the Sobolev inequalities. The direct method proceeds by fixing a choreography α, that is to say a component of $C^0(S^1, \mathbb{C}^N \setminus \Delta)_{\Gamma}$, intersecting α with the subspace of $H^1$-paths, and then taking the infimum of the action A(x) over all paths x realizing this choreography. By slight abuse of notation we will use the same symbol α to denote the intersection of the class α with the space of $H^1$-paths. Set:

\[ a(\alpha) = \inf_{x \in \alpha} A(x). \tag{3.2} \]

Then, by using the definition of infimum, there is a sequence $x_n \in \alpha \subset H^1(S^1, \mathbb{C}^N \setminus \Delta)_{\Gamma}$ with $A(x_n) \to a(\alpha)$. The idea is to show that $x_n$ converges to a solution to (1.1), and this solution lies in the interior of α.


The Sobolev inequality $|x(t) - x(s)| \le \sqrt{\int |\dot x|^2\,dt}\,\sqrt{|t - s|}$ shows that the set of all $H^1$ paths with action A bounded by a fixed constant forms an equicontinuous family. This same argument shows that the length of any path $x \in \mathcal{C}$ is less than $\sqrt{2A(x)}$. Without loss of generality we may take the center of mass of each of our paths $x_n$ to be identically zero:

\[ \sum_j x(t + j) = 0. \tag{3.3} \]

An easy argument shows that the set of all paths in $\mathcal{C}$ with center of mass at the origin, and with bounded length, is a pointwise bounded family. The Arzelà–Ascoli theorem asserts that any bounded, equicontinuous family of paths in $\mathcal{C}$ contains a convergent subsequence. So without loss of generality, we have the existence of a curve $x^*$ such that $x_n \to x^*$ in the $C^0$-norm. The crux of the matter is to show that this $C^0$ limit $x^*$ is collision-free, or what is the same thing, that minimizing sequences cannot tend to the boundary of a component α. If so, this limit is automatically in α. Fatou’s lemma $A(x^*) \le \lim_n A(x_n)$ then shows that $x^*$ is a minimizer, and hence a critical point for the action restricted to Γ-equivariant loops. The principle of symmetric criticality applies, yielding that $x^*$ is a solution realizing the given choreography. That $x^*$ is collision-free follows directly from

3.1 Proposition. If U is a strong-force potential, then any path with collision has infinite action.

Proof. Suppose the path x suffers a collision, with masses i and j colliding at time $t_c$. Write r for $r_{ij}$. The kinetic term K in the action satisfies $K \ge \dot r^2$. Since x is continuous, we have $r < \delta$ for some time interval $|t - t_c| \le \epsilon$ about the collision time. The strong force assumption yields $U \ge c/r^2$ over this interval. Thus the Lagrangian satisfies $L = \tfrac{1}{2}K + U \ge \tfrac{1}{2}\dot r^2 + c/r^2$. Using $a^2 + b^2 \ge 2ab$ we have that $L \ge \sqrt{2c}\,|\dot r/r|$ for $|t - t_c| \le \epsilon$. But $\int_{t_1}^{t_2} (\dot r/r)\,dt = \log r(t_2) - \log r(t_1)$, and $r(t_c) = 0$. From this we conclude that the partial action $\int_t^{t_c + \epsilon} L\,dt$ diverges at least logarithmically as $t \to t_c$ from above, and so the action of the collision path is infinite.

Remark. This proposition is proved in Poincaré, but under the stronger assumption of (almost) conservation of energy. More precisely, Poincaré makes use of the fact that at collision, kinetic and potential energy are of the same order of magnitude.
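The role of the threshold a = 2 in Proposition 3.1 can be seen on a one-dimensional model. The sketch below is an illustration under explicit assumptions, not part of the proof: it takes the lower bound $\tfrac{1}{2}\dot r^2 + c/r^a$ for the Lagrangian, evaluates it along the model collision path $r(t) = t^{p}$ with $p = 2/(a+2)$ (the self-similar collision rate for a homogeneous potential, chosen here for convenience), and integrates in closed form over $[\epsilon, 1]$. The function names are hypothetical.

```python
import math


def power_integral(m, eps):
    """Integral of t**m over [eps, 1], in closed form."""
    if abs(m + 1.0) < 1e-12:
        return -math.log(eps)
    return (1.0 - eps ** (m + 1.0)) / (m + 1.0)


def partial_action(eps, a, c=1.0):
    """Integral over [eps, 1] of (1/2) r'(t)**2 + c / r(t)**a for r(t) = t**p.

    With p = 2 / (a + 2) both terms scale like t**(-2a / (a + 2)), so the
    integral stays bounded as eps -> 0 exactly when a < 2.
    """
    p = 2.0 / (a + 2.0)
    kinetic = 0.5 * p * p * power_integral(2.0 * p - 2.0, eps)
    potential = c * power_integral(-a * p, eps)
    return kinetic + potential


for a in (1.0, 2.0, 3.0):
    values = [partial_action(eps, a) for eps in (1e-3, 1e-6, 1e-12)]
    print(a, values)
```

For a = 1 the values settle near 11/3, a finite-action collision; for a = 2 they grow like $\log(1/\epsilon)$, and for a = 3 like a power of $1/\epsilon$, in line with the logarithmic divergence used in the proof.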

4 Numerical Investigations

We concentrate on the Newtonian potential. Easier cases (strong-force $1/r^a$, $a \ge 2$) and harder cases ($1/r^a$, $0 < a < 1$) of homogeneous potentials have also been successfully searched for simple choreographies (see Section 6). Except in that section, all figures presented here are of Newtonian solutions. Note, also, that a natural continuation to the logarithmic potential, $f(r) = -\log r$, is possible and it is better done by using $f(r) = 1/(a r^a)$ instead of $f(r) = 1/r^a$ for small positive a, but we shall not report on these results here. All the figures in the paper represent solutions with period $T = 2\pi$. From now on we set $S^1 = \mathbb{R}/2\pi\mathbb{Z}$ and we shall use $S^1_N$ for $\mathbb{R}/N\mathbb{Z}$.

Figure 4.1. Simple choreographies for four bodies under the Newtonian potential. Panels: (a) Action = 44.437886; (b) Action = 48.510294; (c) Action = 55.804721; (d) Action = 60.191825; (e) Action = 65.269875; (f) Action = 67.186712.

For N = 3 only Lagrange’s equilateral solution, the eight and several satellites of the eight (see Section 5) are known (see footnote 2). From now on we skip the trivial circular case in which the N masses form a regular N-gon which rotates within its circumscribing circle. Figure 4.1 presents some simple choreographies for four bodies. The positions of the bodies at some initial time are displayed. The values of the actions are shown. Compare with the action for the circle choreography, A = 36.613230. Several examples with N = 5 can be found in Simó [2001a]. In Figure 4.2 we display some simple choreographies for several values of N. They show just a few of the types found. We refer to Simó [2001a] for additional families. For the numerical computation of simple choreographies, two methods have been used: minimization and Newton’s method. See Simó [2001b] for a detailed presentation of these methods.

Footnote 2. This is no longer true. By the time the galley proofs were received, hundreds of N = 3 choreographies were known (see Simó [2002]).

Figure 4.2. A sample of different kinds of simple choreographies for the Newtonian potential. Panels: (a) Chain with 11 bodies in 10 loops; (b) Chain with 11 bodies in 4 loops; (c) Chain with 8 bodies in 6 loops; (d) Bifurcated chain with 9 bodies; (e) 8 petal daisy with 9 bodies; (f) An asymmetric case with 7 bodies.

4.1 Minimization Methods

These proceed by searching for local minima of A. In general we cannot ensure that the value of the minimum found is a(α), see (3.2). We represent a curve q whose components in $\mathbb{R}^2$ are denoted as (u, v), by an approximation $\hat q = (\hat u, \hat v)$ with

\[ \hat u(t) = \sum_{k=1}^{M} a_k \cos(kt) + b_k \sin(kt), \qquad \hat v(t) = \sum_{k=1}^{M} c_k \cos(kt) + d_k \sin(kt). \tag{4.1} \]

At time t the bodies are located at $q(t + 2\pi j/N)$, $j = 0, \dots, N - 1$, with velocities $\dot q(t + 2\pi j/N)$. These values are substituted in (3.1). The integral $\int_{S^1} L(\hat q(t), d\hat q/dt)\,dt$ is computed using a trapezoid rule with time step $2\pi/n$, where n is a multiple of the number of bodies, $n = pN$, $p \in \mathbb{N}$. Only the values for $t = t_j = 2\pi j/n$, $j = 0, \dots, p - 1$ are needed, because after $2\pi/N$ each body is shifted to the position of the next one. The approximate value of the action $\hat A$ depends on $P = \{a_k, b_k, c_k, d_k,\ k = 1, \dots, M\}$ through (4.1). Because of (3.3), with j replaced by $2\pi j/N$, all the coefficients with $N \mid k$ must be zero. By imposing symmetries on a choreography we can further decrease the cardinality of P. Collision-free solutions are analytic, so the trapezoid rule, which converges very quickly for smooth periodic integrands, is suitable.

In this way we obtain a discretized functional $\hat A(P)$. It is minimized by using the gradient method and variants. Several practical problems appear: a) The action is quite “flat” and lots of local minima seem to exist. b) In case we look for a solution having a passage close to collision, the number of harmonics should be large, of the order of several thousands. Both problems slow down the minimization.

Typically the computations have been stopped when two consecutive estimates of $\hat A$, all of them being local minima along a search line, differ by less than $10^{-10}$. We started with any arbitrary set P or with data obtained after smoothing and filtering a hand drawn curve. As a test of goodness we have used the conservation of the energy and the residual acceleration: the difference between the value given by (1.1) and by using $d^2\hat q/dt^2$ at the times $t = t_j$.
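The discretized functional $\hat A(P)$ described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' program: the coefficient layout (one row `(a_k, b_k, c_k, d_k)` per harmonic), the function names `curve_and_velocity` and `discrete_action`, and the restriction to $f(r) = 1/r^a$ are choices made for this example. It evaluates (4.1), places the N bodies at phase shifts $2\pi j/N$, and applies the trapezoid rule on the p nodes of one N-th of the period.

```python
import numpy as np


def curve_and_velocity(t, coeffs):
    """Evaluate q_hat(t) and its time derivative from (4.1).

    t is a 1-D array of times; coeffs has shape (M, 4) with rows
    (a_k, b_k, c_k, d_k) for k = 1..M. Both results have shape (len(t), 2).
    """
    t = np.asarray(t, dtype=float)
    k = np.arange(1, coeffs.shape[0] + 1)[:, None]      # harmonics as a column
    a, b, c, d = (coeffs[:, i][:, None] for i in range(4))
    cos_kt, sin_kt = np.cos(k * t), np.sin(k * t)
    q = np.stack(((a * cos_kt + b * sin_kt).sum(0),
                  (c * cos_kt + d * sin_kt).sum(0)), axis=-1)
    dq = np.stack(((k * (-a * sin_kt + b * cos_kt)).sum(0),
                   (k * (-c * sin_kt + d * cos_kt)).sum(0)), axis=-1)
    return q, dq


def discrete_action(coeffs, n_bodies, p=64, exponent=1.0):
    """Trapezoid-rule value of (3.1) over the period 2*pi for f(r) = 1/r**exponent."""
    n = p * n_bodies
    h = 2.0 * np.pi / n
    shifts = 2.0 * np.pi * np.arange(n_bodies) / n_bodies
    total = 0.0
    for j in range(p):                                   # nodes t_j, j = 0..p-1
        q, dq = curve_and_velocity(j * h + shifts, coeffs)
        lagrangian = 0.5 * np.sum(dq ** 2)
        for i in range(n_bodies):
            for m in range(i + 1, n_bodies):
                lagrangian += 1.0 / np.linalg.norm(q[i] - q[m]) ** exponent
        total += lagrangian
    # The Lagrangian is 2*pi/N periodic along a choreography, so the sum over
    # all n nodes equals N times the sum over the first p nodes.
    return n_bodies * h * total


# Circle choreography for N = 4, Newtonian potential, period 2*pi (omega = 1).
sigma_4 = 0.5 * sum((2.0 * np.sin(j * np.pi / 4)) ** -1.0 for j in range(1, 4))
radius = sigma_4 ** (1.0 / 3.0)
circle = np.array([[radius, 0.0, 0.0, radius]])          # u = r cos t, v = r sin t
print(discrete_action(circle, 4))
```

For the circle the integrand is constant, so the quadrature is exact and the printed value can be compared with the action A = 36.613230 quoted for the four-body circle choreography. A minimizer would then vary the rows of `coeffs` (keeping the harmonics with $N \mid k$ at zero) to decrease this value.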

4.2 Newton’s Method

Let $\Phi_{2\pi/N}$ be the flow of (1.1) for a time interval $2\pi/N$. Starting with given values of positions $x_j$ and velocities $\dot x_j$, for $j = 0, \dots, N - 1$ at time t = 0, the transport by $\Phi_{2\pi/N}$ should give the same values, with the indices shifted cyclically by one unit. This gives a set of 4N scalar equations, which is solved by Newton’s method starting at an approximate solution found by minimization. The map $\Phi_{2\pi/N}$ is computed by numerical integration of (1.1). Simultaneous integration of the variational equations is also needed. In most of the cases, parallel shooting (see, e.g., Stoer and Bulirsch [1983]) has been required, especially if passages close to collision occur. Typically Newton iterations are stopped when the “closing error” is below $10^{-12}$. As a by-product, linear stability properties have been obtained. Note that these 4N equations are not independent. Use has been made of (3.3), and the rotation and time-shift invariance, to decrease by 6 the dimension of the system to be solved. The only choreography found to be linearly stable, up to now, for the Newtonian potential is the eight (again, see Section 5). Furthermore, on the manifold of angular momentum zero, where the eight lives, the hypothesis of the KAM theorem has been checked to hold by a numerical computation of the torsion (see Simó [2002]).
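The 4N closing equations that Newton's method drives to zero can be written down explicitly. The sketch below only evaluates this closing error; it does not perform the Newton iteration, the variational equations, or the parallel shooting mentioned above. It assumes the Newtonian potential $f = 1/r$ with unit masses, and it replaces the integrator used in the study by a plain fixed-step Runge–Kutta scheme, which is a stand-in chosen for brevity. All function names are hypothetical.

```python
import numpy as np


def rhs(state, n_bodies):
    """First-order form of (1.1) for f(r) = 1/r; state = (positions, velocities)."""
    x = state[: 2 * n_bodies].reshape(n_bodies, 2)
    v = state[2 * n_bodies:]
    acc = np.zeros_like(x)
    for i in range(n_bodies):
        for j in range(n_bodies):
            if i != j:
                d = x[j] - x[i]
                acc[i] += d / np.linalg.norm(d) ** 3
    return np.concatenate((v, acc.ravel()))


def flow(state, duration, n_bodies, steps=400):
    """Approximate the flow of (1.1) over `duration` with classical RK4 steps."""
    h = duration / steps
    z = np.array(state, dtype=float)
    for _ in range(steps):
        k1 = rhs(z, n_bodies)
        k2 = rhs(z + 0.5 * h * k1, n_bodies)
        k3 = rhs(z + 0.5 * h * k2, n_bodies)
        k4 = rhs(z + h * k3, n_bodies)
        z = z + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return z


def closing_error(state, n_bodies, period=2.0 * np.pi):
    """Phi_{T/N}(state) minus the state with body indices shifted by one."""
    state = np.asarray(state, dtype=float)
    moved = flow(state, period / n_bodies, n_bodies)
    x0 = state[: 2 * n_bodies].reshape(n_bodies, 2)
    v0 = state[2 * n_bodies:].reshape(n_bodies, 2)
    # After time T/N, body j must sit where body j + 1 started.
    target = np.concatenate((np.roll(x0, -1, axis=0).ravel(),
                             np.roll(v0, -1, axis=0).ravel()))
    return moved - target


# Check on the trivial circular choreography with N = 3 and omega = 1.
n = 3
sigma_3 = 0.5 * sum(1.0 / (2.0 * np.sin(j * np.pi / n)) for j in range(1, n))
r = sigma_3 ** (1.0 / 3.0)
theta = 2.0 * np.pi * np.arange(n) / n
positions = r * np.column_stack((np.cos(theta), np.sin(theta)))
velocities = r * np.column_stack((-np.sin(theta), np.cos(theta)))
z0 = np.concatenate((positions.ravel(), velocities.ravel()))
print(np.max(np.abs(closing_error(z0, n))))
```

On the exact circular solution the closing error is limited only by the integrator; perturbing `z0` makes it grow, and a Newton step would use the Jacobian of `closing_error` (reduced by the six invariances mentioned above) to correct the initial data.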

5 Main Choreographies, Satellites, Linear Chains

5.1 On Main and Satellite Choreographies

As we announced in the Introduction, starting with one choreography, we show how to construct, under some hypotheses, a family of new choreographies, by either travelling around the initial choreography a multiple number of times, or by a continuation in which the angular momentum is varied, or by a combination of these two constructions. This suggests distinguishing between main and satellite choreographies, as we have done preceding Conjecture 1.1 in the first section: a main choreography is one which is not the satellite of another one.

5.1.1 Subharmonics

Let us start with a description of the Poincaré map in the neighborhood of an N-periodic choreography $x(t) = (q(t), q(t + 1), \dots, q(t + N - 1))$. Recall (Subsection 2.1) that the solution x(t) is characterized by the fact that, for all t, $x(t + 1) = Sx(t)$, where $S : \mathbb{C}^N \to \mathbb{C}^N$ is the isometry of the configuration space defined by $S(x_0, x_1, \dots, x_{N-1}) = (x_1, \dots, x_{N-1}, x_0)$. This fact has been strongly used in the numerical methods of the previous section.

Let us fix the energy and the angular momentum to the value they have for our solution. After reduction of the translational and rotational symmetries, we get a (4N − 7)-dimensional manifold (counting the dimension over $\mathbb{R}$) to which S can be extended naturally. Let us call $\Sigma_0$ a piece of (4N − 8)-dimensional submanifold transverse to the periodic orbit at a point $(x(t_0), \dot x(t_0))$. Let $\Sigma_1, \dots, \Sigma_{N-1}$ be the images of $\Sigma_0$ by $S, \dots, S^{N-1}$. These submanifolds are transverse to the periodic orbit at the points $(x(t_0 + 1), \dot x(t_0 + 1)), \dots, (x(t_0 + N - 1), \dot x(t_0 + N - 1))$, respectively. Let $P_i : \Sigma_i \to \Sigma_{i+1}$, $i = 0, \dots, N - 1$, be the Poincaré maps (of course, $\Sigma_N = \Sigma_0$). One verifies readily that $S \circ P_i = P_{i+1} \circ S$. Let us define $P : \Sigma_0 \to \Sigma_0$ by the formula $P = S^{-1} \circ P_0$. One deduces from the above and from the fact that $S^N = \mathrm{Id}$, that the first return map $\mathcal{P} = P_{N-1} \circ \cdots \circ P_1 \circ P_0$ to $\Sigma_0$ is equal to $P^N$. This is what characterizes the first return maps along choreographies: they admit an Nth root (which is nothing but the return map to the corresponding section in the quotient by S, which acts freely in the neighborhood of a choreography).

Now, for any choreography of N bodies, a subharmonic solution gives rise to a choreography each time it corresponds to a periodic point, say of order k, of $P = \sqrt[N]{\mathcal{P}}$, the Poincaré map of the quotient by S. This is because, lifted to the phase space, such a subharmonic will give rise to a choreography $\tilde x(t) \sim x(t)$, $t \in [0, \tilde T]$, of period $\tilde T \sim kN$ (if we chose the period T of x(t) to be equal to N), precisely $\tilde x(t + \tilde T/N) = S^k \tilde x(t)$, as long as (k, N) = 1. This is because 1) if (k, N) = 1, N is a generator of $\mathbb{Z}/k\mathbb{Z}$, so that all the inverse images of the periodic solution of P under quotient by S belong to the same periodic orbit; 2) the time spent by this orbit to go from $\Sigma_i$ to $\Sigma_{i+1}$ is independent of i because the vector field commutes with S.
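The coprimality condition can be illustrated with a few lines of bookkeeping. The relation $\tilde x(t + \tilde T/N) = S^k \tilde x(t)$ says that after one N-th of the new period, body j takes the place previously held by body j + k (indices modulo N). The sketch below, whose function names and sample values of k and N are purely illustrative, follows this relabelling and checks that it visits every body exactly when k and N are coprime, which is when the lifted orbit puts all N bodies on a single curve.

```python
from math import gcd


def visiting_order(k, n_bodies):
    """Body labels reached from body 0 by applying S**k repeatedly."""
    order, j = [], 0
    for _ in range(n_bodies):
        order.append(j)
        j = (j + k) % n_bodies
    return order


def is_single_curve(k, n_bodies):
    """True when S**k generates the cyclic group Z_N, so all bodies share one curve."""
    return len(set(visiting_order(k, n_bodies))) == n_bodies


for k, n_bodies in [(37, 3), (6, 3), (5, 4), (6, 4)]:
    print(k, n_bodies, visiting_order(k, n_bodies), is_single_curve(k, n_bodies))
```

The printed flags agree with the test `gcd(k, n_bodies) == 1`; when the test fails, the bodies split into several groups travelling on distinct curves.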

5.1.2 Relative Choreographies

Another possibility is to change the angular momentum level. Call a solution of (1.1) of the form $x(t) = (q(t), q(t + 1), \dots, q(t + N - 1))$ a relative simple choreography of period N if there is a rotation $R_\beta$ of fixed angle β, such that, for all t, $q(t + N) = R_\beta q(t)$. In a rotating frame with angular velocity β/N, it becomes an honest choreography. If the angle of this rotation is a rational multiple m/d of 2π then x is periodic with period T = dN and $X(t) = (q(t), q(t + d), \dots, q(t + (N - 1)d))$ is a choreography (in the fixed frame) provided there are no collisions, that is if d and N are mutually prime. If the angle is not a rational multiple of 2π, the solution is quasiperiodic. If the Poincaré return map of an initial choreography $q^0$ is nondegenerate, and if that choreography has angular momentum $C_0$, then according to the implicit function theorem there will exist a family of relative simple choreographies near $q^0$ with angular momentum C taking any value within an interval about $C_0$. Most of these will be quasiperiodic, but a dense set will have rational rotational angle.

5.1.3 Satellites of the Eight

Let us now describe some basic facts regarding the dynamics near the figure eight choreography (see Simó [2002]). We shall show that it has many “satellite” choreographies. Indeed, as a fixed point of the above Poincaré map P (with fixed energy and zero angular momentum), the figure eight orbit is totally elliptic with torsion: the frequencies do change locally when one gets away from the fixed point. Hence there exists a family of periodic points (subharmonics) parameterized by a rational rotation vector, whose components tend to limit values given by the eigenvalues of the Poincaré map P, approximately 0.00842272 and 0.29809253, when we approach the fixed point. As, for the eight solution, $P = \sqrt[3]{\mathcal{P}}$ is also totally elliptic, this implies the existence of a family of choreographies accumulating to the eight. In turn, some of these choreographies are totally elliptic and the same argument is likely to apply indefinitely, also giving rise to choreographic solenoids.

The figure eight periodic solution can also be continued to different angular momenta. As we saw, a possible way to proceed is to use a rotating frame with angular velocity $\omega$. As first found by Hénon [2000], the periodic orbit becomes a distorted figure eight, with the three bodies travelling on the same path in rotating coordinates. For that purpose Hénon used the same program he had been using in Hénon [1976] to continue the collinear Schubart orbit. According to the preceding discussion, this gives rise to new choreographies, satellites of the eight. If some of these are still totally elliptic (and this will happen for a small enough angular velocity), new satellites of them shall appear, and so on. Figure 5.1 shows an illustration of these two possibilities.

Figure 5.1. Several examples of satellite orbits, the top and bottom ones being choreographies, but not the middle one. See the text for explanation.

On the top we display a satellite choreography of the figure eight, obtained from a periodic point under the Poincaré map around the fixed point and having a component only along the fast frequency. Therefore, it lives on a “subcenter” manifold. The rotation number is 11/37 ≈ 0.297297297, quite close to the limit rotation number at the fixed point. Indeed, the variation of the rotation number is quite flat along that mode. All the bodies describe the same path and this seems to be a local minimum of the action. The dots on the figure (one on the left, one on the right and the third one at the origin) show the initial positions of the three bodies. For reference, the path of the figure eight solution is also plotted.

The middle sub-figure shows a satellite orbit with rotation number 8/27. Note that the denominator is now a multiple of 3. The three bodies describe slightly different paths. On the figure the path of one of the bodies is plotted in continuous lines, while the path of another body is plotted in broken lines. These paths look like curves with rational slope on a torus, slightly shifted from one another. To avoid having too many lines, the path of the third body is omitted, but it is clear where it should be.

On the bottom we display a satellite choreography with non-zero angular momentum. This value, $C \approx 0.03125986$, has been selected to have a solution which precesses and closes after 37 “revolutions” along the eight and 3 full revolutions around the center of mass. That is, $m/d = 3/37$. The three bodies describe, again, the same path. Linear stability has been checked for this choreography. Figure 5.2 displays 1/37 of the period, both in fixed and rotating axes.

Figure 5.2. 1/37 of the bottom orbit in Figure 5.1. Left (a): fixed axes. Right (b): rotating axes. The points marked $I_j$ (resp. $F_j$) for $j = 1, 2, 3$ denote initial (resp. final) conditions.
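The closure rule for relative choreographies (rotation angle a rational multiple $m/d$ of $2\pi$, period $dN$, choreography in the fixed frame when $d$ and $N$ are mutually prime) can be checked mechanically. The following is a minimal sketch; the function name is hypothetical, and the rotation is passed as an exact fraction, which is our own choice of representation.

```python
from fractions import Fraction
from math import gcd


def relative_choreography_closure(rotation: Fraction, n_bodies: int):
    """Return (period, is_choreography) for rotation angle 2*pi*rotation.

    With rotation = m/d in lowest terms, the period is d * N, and the
    fixed-frame motion is a choreography when d and N are mutually prime.
    """
    d = rotation.denominator  # Fraction keeps m/d in lowest terms
    return d * n_bodies, gcd(d, n_bodies) == 1


if __name__ == "__main__":
    # Bottom orbit of Figure 5.1: m/d = 3/37 with N = 3 bodies.
    print(relative_choreography_closure(Fraction(3, 37), 3))
```

For the bottom orbit of Figure 5.1 this gives period 111 = 37 · 3 with $d = 37$ prime to $N = 3$, consistent with the three bodies sharing one path.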

In general, we do not need to start with a totally elliptic periodic solution. It is enough that it have some elliptic eigenvalues. For instance, for the choreographies for $N = 4$ shown in Figure 4.1, the dimension of the center manifold $W^c$ equals 2 in all cases except d), where it is 4, and e), where it is 0, as it also is for the trivial circle case. (We always ignore the 4 pairs of eigenvalues equal to 1 due to the first integrals.)

Remark. To decide if a given choreography is main is not easy. It is not excluded that some of the choreographies presented here as of main type could be related by a continuous family of solutions if we allow for periodic solutions in the complex phase space with complex period. By homotoping the potential we can sometimes connect two different main choreographies. This happens to Figure 4.1e, where the homotopy parameter is the exponent $a$ in the potential $1/r^a$. In Simó [2001a] another choreography, very similar to Figure 4.1e, but having $\dim W^c = 2$, is found by following choreography Figure 4.1e upon changing the potential. Both choreographies arise for the Newtonian potential and belong to the same class. See Section 6 for further discussion. Note that although we do not allow such potential-varying homotopies in our definition of “satellite” choreography, they are nevertheless useful in understanding choreographies.

5.2 The Linear Chains

Among the simplest choreographies are the “linear chains” formed by different “bubbles”. Figures 1.1, 4.1a, 4.1b, 4.2a, 4.2b and 4.2c show examples. All of them seem to be of main type. Working in $S^1_N$, a double point $z$ in a chain (or in a general choreography) has two values of $t$ associated to it, say $t_a$ and $t_b$. The (integer) length of the loop related to $z$ is defined to be $[t_b - t_a]$ (in $S^1_N$), where $[\ ]$ denotes the integer part. As $t_b - t_a \notin \mathbb{Z}$, the complementary length is $N - 1 - [t_b - t_a]$. A linear chain with $J$ bubbles has $J - 1$ double points all lying on the $x$-axis, which we will label $z_1, \dots, z_{J-1}$ in order of increasing $x$-coordinate. Note that, if we try to produce a similar chain without having $z_i$ on the $x$-axis, a kind of principle of minimum interaction of the bubbles leads, by minimization of the action, to a solution as described. Upon reorienting the loop (reversing time) if necessary, we may assume that the corresponding lengths of the left hand loops defined by these double points yield an increasing sequence of integers: $1 \le \ell_1 < \ell_2 < \dots < \ell_{J-1}$. The values $\ell_1, \ell_2 - \ell_1, \dots, \ell_{i+1} - \ell_i, \dots, N - 1 - \ell_{J-1}$ are the lengths associated to the bubbles and characterize the choreography class of a linear chain. Note that if $\ell_{i+1} = \ell_i$ then the bubble between $z_i$ and $z_{i+1}$ can be destroyed without passing through a collision and hence represents the same choreography as a linear chain with one fewer bubble. So we assume that the sequence of lengths is strictly increasing. For completeness we also include the case $J = 1$, i.e., the trivial circular solution, as a linear chain.

5.1 Proposition. For $N$ bodies, $2^{N-3} + 2^{[(N-3)/2]}$ is the number of linear chains.

A proof can be found in Simó [2002]. In particular, the number of main choreographies increases exponentially with $N$.
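The count in Proposition 5.1 is easy to tabulate. The sketch below also cross-checks it against a brute-force enumeration, under our own reading (an assumption, not a statement of the text) that a linear chain is determined by its sequence of positive bubble lengths, which sum to $N - 1$, taken up to left-right reversal. All function names are hypothetical.

```python
def linear_chain_count(n_bodies: int) -> int:
    """Closed formula of Proposition 5.1 (meaningful for n_bodies >= 3)."""
    return 2 ** (n_bodies - 3) + 2 ** ((n_bodies - 3) // 2)


def compositions(total: int):
    """Yield all ordered tuples of positive integers summing to `total`."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


def count_by_enumeration(n_bodies: int) -> int:
    """Count bubble-length sequences summing to N - 1, up to reversal.

    Assumption: a chain and its mirror image (reversed sequence) are the
    same choreography class, so each class is stored once in canonical form.
    """
    classes = {min(c, c[::-1]) for c in compositions(n_bodies - 1)}
    return len(classes)


if __name__ == "__main__":
    for n in range(3, 11):
        print(n, linear_chain_count(n), count_by_enumeration(n))
```

For $N = 3$ both counts give 2, which matches the circle and the figure eight.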

6 Evolution of the Choreographies with the Potential

As anticipated, we can take a family of homogeneous potentials with $f(r) = 1/r^a$, for $a > 0$, in (1.2). It is reasonable to ask several questions: what happens to a choreography which exists for $a = 1$ when $a$ is decreased approaching zero? What is the fate of a choreography which exists for $a = 2$ but fails to exist for $a = 1$? Are the difficulties encountered in trying to prove the conjecture for the Newtonian potential just technical, or are they deeper? It has already been said that from a given choreography for $a = 1$ it is possible to find, by continuation with respect to $a$, another choreography also for $a = 1$. The main goal of this section is to present several numerical results in these directions.

The eight can be continued without difficulty to any value of $a > 0$. It is found to be stable in a short interval, roughly $a \in [0.86, 1.23]$. Concerning the cases $N = 4$ given in Figure 4.1, when $a$ is decreased they reach a saddle-node bifurcation (s-n for short) and the continuation of the family is only possible by increasing $a$ again. With the exception of the mentioned case e), all of them seem to approach a collision (either a single double collision, several double collisions or a triple collision) before reaching $a = 1$ again. Case c) displays a short stability interval around $a = 0.63$.

For $N = 5$ similar things happen. Most of the cases go to a collision (a quadruple collision being now possible), after or before reaching a s-n. Some cases present several s-n before approaching a collision, the variation of $a$ being monotone between two successive s-n. A couple of cases return to $a = 1$, after having a s-n for $a < 1$, before approaching a collision. On the other hand, the example with 5 bodies on a symmetrical eight can be continued to any value $a > 0$, while the linear chain with $J = 4$ (a super-super-eight) can be continued up to $a \approx 0.0288854$, where it has a s-n and $a$ starts to increase again.

Now let us consider a choreography with $N = 4$ which seems not to exist for the Newtonian potential. It should look like Figure 4.1a but with the small loop inside the larger one. It certainly exists for $a = 2$.
Figure 6.1 shows what happens when a continuation for decreasing $a$ is attempted. As a characteristic of a given choreography we have taken the minimum distance $r_{\min} = \min_{1 \le i < j \le N,\ t \in [0, 2\pi]} r_{i,j}(t)$. In Figure 6.1a we display the evolution of $r_{\min}$ with $a$, starting at $a = 2$ (marked as A) and decreasing $a$. Two s-n are seen, marked as B and D. Proceeding along the family, a collision is approached at the point marked as F. In Figure 6.1b the three orbits shown correspond to A, B and C in Figure 6.1a, the size of the inner loop decreasing when $a$ does. The next orbits, D to F, are displayed in Figure 6.1c. The magnification shows a clear approach of F to a binary collision. A very small loop appears for orbit C. It becomes as large as the inner one in D, then part of it moves outside the large loop in E and, finally, most of the small loop appears outside the large one in F. This scenario is frequently observed in the evolution of the action minimization procedure with the Newtonian potential, when it seems that no local minimum exists inside the chosen choreography class.

The case $N = 4$ with a small loop inside is not an exception. Instead of $N = 4$ we can take $N > 4$ and ask for a small loop of integer length $[\ell] = 1$ inside a large loop. Only these two loops are requested for the choreography.
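The quantity $r_{\min}$ used above is straightforward to approximate once the common path is available as a function of time. The following is a rough sketch, assuming the period is normalized to $2\pi$ and that body $i$ is at `path(t + 2*pi*i/N)`; the function name, the sampling resolution and the calling convention are our own choices, not those of the computations reported here.

```python
import math
from itertools import combinations


def min_mutual_distance(path, n_bodies, period=2 * math.pi, samples=720):
    """Approximate r_min = min over pairs i < j and over t of r_ij(t).

    `path(t)` returns the planar point (x, y) of the common curve at time t;
    the bodies are equally spaced in time along that curve.
    The minimum is taken over `samples` equally spaced times only.
    """
    best = math.inf
    for s in range(samples):
        t = period * s / samples
        points = [path(t + i * period / n_bodies) for i in range(n_bodies)]
        for (x1, y1), (x2, y2) in combinations(points, 2):
            best = min(best, math.hypot(x1 - x2, y1 - y2))
    return best


if __name__ == "__main__":
    # Sanity check on the trivial circular choreography with 4 bodies:
    # neighbours on the unit circle are sqrt(2) apart at all times.
    circle = lambda t: (math.cos(t), math.sin(t))
    print(min_mutual_distance(circle, 4))
```

Because the minimum is sampled on a time grid, the value is only an upper bound for the true $r_{\min}$; near a collision (orbit F) a much finer grid, or a local refinement, would be needed.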

Figure 6.1. Evolution of a choreography as a function of $a$ for $r^{-a}$ potentials. (a) The $a$-$r_{\min}$ diagram. (b) The first 3 orbits. (c) Last 4 orbits and a magnification. See the text.

In all cases the behavior seems to be the same. Figure 6.2a displays what happens for the equivalent of point B in Figure 6.1a, i.e., the first s-n encountered when we evolve from $a = 2$ downwards. In this figure we show the small inner loops for several values of $N$: 4 to 8, 12 and 36, the loops going to the left for increasing $N$. To be able to put them in the same window, we have added to each curve the coordinates of the rightmost point in the large loop. For $N = 4$ we have in Figure 6.2a the same loop shown in Figure 6.1b with label B. Note the evolution of the tiny loop which is created in Figure 6.1a for increasing $N$, like a swallowtail unfolding.

Figure 6.2. Details of small loops for different $N$. Left (a): inner loops for $r^{-a}$ potentials with values of $a$ for which a s-n occurs. Right (b): outer loops for the Newtonian potential. See the text for additional explanations.

Let $a_{N,k}$ be the value of $a$ at which the $k$-th s-n is found in the continuation started going down from $a = 2$ with $N$ bodies. The data corresponding to $N = 4$, displayed in Figure 6.1a, are $a_{4,1} \approx 1.0344$, $a_{4,2} \approx 1.5374$. It is quite instructive to look at similar values for other values of $N$. They are given in the next table. In particular, all the $a_{N,1}$ values are greater than 1. This gives evidence of the lack of existence of this very simple choreography for all $N$. Furthermore, a tentative guess of the behavior of $a_{N,1}$ as a function of $N$ for increasing $N$ is $a_{N,1} \approx 1 + c/N^2$, for some $c > 0$. For a value of $a$ slightly larger than 1, the choreography with a small loop of $[\ell] = 1$ inside a large loop should exist for $N$ large enough, while it seems not to exist for $a = 1$. It looks difficult to take this tiny difference into account in an analytical reasoning towards existence proofs.

a_{5,1}  1.1720    a_{5,2}  1.3862    a_{12,1}  1.0449    a_{28,1}  1.0096
a_{6,1}  1.1401    a_{6,2}  1.3255    a_{16,1}  1.0273    a_{32,1}  1.0073
a_{7,1}  1.1103    a_{7,2}  1.2914    a_{20,1}  1.0183    a_{36,1}  1.0057
a_{8,1}  1.0887    a_{8,2}  1.2680    a_{24,1}  1.0129    a_{40,1}  1.0046
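The tentative law $a_{N,1} \approx 1 + c/N^2$ can be probed directly on the tabulated values by looking at the scaled excess $(a_{N,1} - 1) N^2$, which should level off if the guess is right. The sketch below uses only the numbers quoted in this section; the dictionary and function names are our own.

```python
# First saddle-node values a_{N,1} as quoted in the text and in the table.
FIRST_SADDLE_NODE = {
    4: 1.0344, 5: 1.1720, 6: 1.1401, 7: 1.1103, 8: 1.0887,
    12: 1.0449, 16: 1.0273, 20: 1.0183, 24: 1.0129,
    28: 1.0096, 32: 1.0073, 36: 1.0057, 40: 1.0046,
}


def scaled_excess(n_bodies: int, a_value: float) -> float:
    """Estimate of c in a_{N,1} = 1 + c / N^2 from a single data point."""
    return (a_value - 1.0) * n_bodies ** 2


if __name__ == "__main__":
    for n in sorted(FIRST_SADDLE_NODE):
        print(n, round(scaled_excess(n, FIRST_SADDLE_NODE[n]), 3))
```

With four-digit data the estimates for $N \ge 20$ stay between 7 and 8, which is compatible with the guess but, given the rounding of the inputs, does not by itself determine $c$ accurately.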

On the other hand, we can consider the small loop with $[\ell] = 1$ outside the large loop, generalizing to arbitrary $N$ the case shown in Figure 4.1a. The shape of the small loop is shown in Figure 6.2b for different values of $N$: 40, 48, 50, 60, 70 and 100. As before, the size of the loop decreases with increasing $N$, while to keep the loops in the same window we have added the coordinates of the leftmost point on the large loop. The value $N = 48$ has been selected because it is very close to having a cusp point. The evolution of the shape of the small loops is different from the one found in the preceding case.

We finish with a discussion of a different type. Consider an $N$-gon. It seems to be a global minimum for the action. Assume it is travelled $k > 1$ times. Is it still a local minimum? The simplest counterexample we have found appears for $N = 7$, $k = 2$. Taking small deviations from this solution, the minimization leads to an inner loop with $[\ell] = 3$ inside a larger loop. It looks similar to case A in Figure 6.1b, but the inner loop is closer to the outer one. It has an action $A \approx 182.326$, while the 7-gon travelled twice has $A \approx 182.729$. The continuation, starting at $a = 1$, has a s-n for $a \approx 0.304557$. But it returns to $a = 1$, giving a new choreography in the same class. This one is a saddle of the action functional, with $A \approx 186.705$. This is also in contrast with the case of Figure 4.1e, for which the choreography in the same class, obtained by continuation, is also a local minimum of the action.

7 Conclusions

Simple choreographies are $N$-body solutions in which all $N$ masses chase each other around the same curve. We have proved the existence of simple planar choreographies of arbitrary complexity and symmetry for strong-force $N$-body problems. Most of these choreographies vanish as the strong-force potential tends to the Newtonian potential, but still a large number persist. We have investigated this vanishing numerically, and have found a large number of individual Newtonian choreographies. An analytic existence proof for the Newtonian choreographies beyond $N = 3$ remains to be found.

Which simple choreography classes survive in the Newtonian limit, and what determines whether or not they survive? Is the number of these classes finite? Are all the linear chains with $k$ bubbles, $k < N$, represented? Can the figure eight solution with $N$ bodies be continued for all $a > 0$ and even for the logarithmic potential if $N$ is odd? It is also an open question whether simple choreographies can exist when the masses are not all equal and $N \ge 6$. For $N < 6$ it has been proved (Chenciner [2001]) that all masses must be equal, using the fact that choreography for any set of masses implies choreography for equal masses (the arithmetic mean). The existence of choreographies with arbitrary time intervals (not necessarily equal) is completely open.

Acknowledgments: We would like to thank Phil Holmes and Robert MacKay for bringing to our attention the paper of Cris Moore [1993] and the note of Henri Poincaré [1896], respectively. R. M. acknowledges the support of the NSF (grant DMS 9704763) as well as the support of the French government through the position of invited researcher in the ASD group. An important part of the research of C. S. on this topic was carried out during a sabbatical leave spent with the team ASD, IMCCE in Paris, thanks to the support of the CNRS. He is indebted to the institution and all the staff for their hospitality and interest in the work. The support of grants DGICYT PB 94–0215 (Spain) and CIRIT 1998SGR–00042 is also acknowledged by the same author.

References

Chenciner, A. [2001], Are there perverse choreographies?, to appear in the Proceedings of the HAMSYS Conference, Guanajuato (March 19–23, 2001), World Scientific Publishing Co., Singapore.
Chenciner, A. [2002], Action minimizing periodic orbits in the Newtonian n-body problem, in Celestial Mechanics, dedicated to Donald Saari for his 60th birthday (A. Chenciner, R. Cushman, C. Robinson and J. Xia, eds.), Contemporary Mathematics 292, Amer. Math. Soc., 71–90.
Chenciner, A. and Montgomery, R. [2000], A remarkable periodic solution of the three body problem in the case of equal masses, Annals of Mathematics 152, 881–901.
Chenciner, A. and Venturelli, A. [2000], Minima de l’intégrale d’action du Problème newtonien de 4 corps de masses égales dans R^3: orbites “hip-hop”, Celestial Mechanics 77, 139–152.
Davies, I., Truman, A. and Williams, D. [1983], Classical periodic solutions of the equal mass 2N-body Problem, 2n-Ion problem, and the n-electron atom problem, Physics Letters 99A, 15–17.
Heggie, D. C. [2000], A new outcome of binary–binary scattering, Mon. Not. R. Astron. Soc. 318, L61–L63.
Hénon, M. [1976], A family of periodic solutions of the planar three-body problem, and their stability, Celestial Mechanics 13, 267–285.
Hénon, M. [2000], private communication.
Hoynant, G. [1999], Des orbites en forme de rosette aux orbites en forme de pelote, Sciences 99, 3–8.
Lagrange, J. [1772], Essai sur le problème des trois corps, Œuvres, Vol. 6, p. 273.
Montgomery, R. [1998], The N-body problem, the braid group, and action-minimizing periodic solutions, Nonlinearity 11, 363–376.
Montgomery, R. [2002], Action spectrum and Collisions in the three-body problem, in Celestial Mechanics, dedicated to Donald Saari for his 60th birthday (A. Chenciner, R. Cushman, C. Robinson and J. Xia, eds.), Contemporary Mathematics 292, Amer. Math. Soc., 173–184.
Moore, C. [1993], Braids in Classical Gravity, Physical Review Letters 70, 3675–3679.
Palais, R. [1979], The principle of symmetric criticality, Comm. Math. Phys. 69, 19–30.
Poincaré, H. [1896], Sur les solutions périodiques et le principe de moindre action, C.R.A.S. Paris 123, 915–918 (30 Novembre 1896).
Simó, C. [2001a], New families of Solutions in N–Body Problems, Proceedings of the Third European Congress of Mathematics (C. Casacuberta et al., eds.), Progress in Mathematics 201, 101–115, Birkhäuser, Basel.
Simó, C. [2001b], Periodic orbits of the planar N-body problem with equal masses and all bodies on the same path, in The Restless Universe (B. Steves and A. Maciejewski, eds.), 265–284, Institute of Physics Publ., Bristol.
Simó, C. [2002], Dynamical properties of the figure eight solution of the three-body problem, in Celestial Mechanics, dedicated to Donald Saari for his 60th birthday (A. Chenciner, R. Cushman, C. Robinson and J. Xia, eds.), Contemporary Mathematics 292, Amer. Math. Soc., 209–228.
Stewart, I. [1996], Symmetry Methods in Collisionless Many-Body Problems, J. Nonlinear Sci. 6, 543–563.
Stoer, J. and Bulirsch, R. [1983], Introduction to Numerical Analysis, Springer-Verlag, 1983 (second printing).

10 On Normal Form Computations

Jürgen Scheurle
Sebastian Walcher

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT We review the computational procedures involved in transforming a vector field into a suitable normal form about a stationary point, and we introduce an algorithm to compute Poincaré–Dulac normal forms in scenarios where the linearization is not in any particular canonical form. Some examples and applications are presented.

Contents

1 Introduction
2 Normalization with Respect to the Linear Part
3 Linear Algebra, Part One
4 Application to Normal Form Computations
5 Problems
6 Linear Algebra, Part Two
7 Application to Poincaré–Dulac Normal Forms
8 Some Classes of Examples
References

1 Introduction

Normal forms of vector fields are of central importance in stability and bifurcation investigations of stationary points and in the analysis of local bifurcations. Various aspects of normal forms and their computation have been discussed extensively in the literature; we recall only a few references here: Anosov and Arnold [1988], Bibikov [1979], Bruno [1989], Elphick, Tirapegui, Brachet, Coullet, and Iooss [1987], Guckenheimer and Holmes [1986], and Iooss and Adelmeyer [1992]. The practical advantage of normal forms is that certain parameters important for the local analysis of the stationary point can be read off quite easily, once the normal form has been computed up to a certain order. For the Hopf bifurcation this was presented in Marsden and McCracken [1976], which seems to be the first reference where all the necessary computations are carried out completely. (Later on, there were simplifications and extensions by Hassard and Wan [1978], and by others; see also Scheurle and Marsden [1984].) For special classes of differential equations it is advisable to employ special transformations; see e.g., Deprit [1969], Broer [1981], and Walcher [1993].

Nowadays, normal form computations (using computer algebra systems) are routine. However, most algorithms and computations start from the assumption that the linear approximation of the vector field at the stationary point is in some canonical form, for instance in Jordan form. When it comes to real-life computations in higher dimensions, or involving parameters, determining such a canonical form may be a very tough problem. (In their recent paper, Chen and DellaDora [2000] work with the less restrictive assumption of Frobenius normal form, which can be obtained by “rational” methods.)

In this paper we will review the computational problems and procedures involved in normal form computations, and we will propose an approach to computing Poincaré–Dulac normal forms when the linear approximation is arbitrary. The method we present will not always be more efficient than other established methods. But as examples show, it is quite useful in several cases of interest, and it still works when other methods seem to be no longer applicable, in particular for parameter-dependent systems. The underlying idea involves little more than a combination of known facts from linear algebra.

We start the paper by reviewing normal forms and normal form computations in the two most important settings: Poincaré–Dulac, resp. Elphick, Tirapegui, Brachet, Coullet, and Iooss [1987]. After discussing computational and conceptual questions, and considering the underlying linear algebra problems, we introduce a method to compute Poincaré–Dulac normal forms with respect to an arbitrary linearization. (We will discuss only the semisimple case in detail.) As shown by relevant examples, and by some explicit computations, the method can provide the parameters necessary for stability discussions.

2 Normalization with Respect to the Linear Part

Consider an ordinary differential equation $\dot x = f(x)$ in a neighborhood of the point 0 in $\mathbb{K}^n$ (with $\mathbb{K}$ standing for the reals $\mathbb{R}$ or complex numbers $\mathbb{C}$). We assume that $f$ has a Taylor expansion

$$f(x) = Bx + f_2(x) + \cdots + f_m(x) + \text{h.o.t.}, \qquad m \ge 2.$$

Thus the point 0 is stationary, $B = Df(0)$ is the linear approximation, and each $f_j$ is a homogeneous polynomial of degree $j$, while “h.o.t.” stands for “terms of higher order”. (We will always assume that $f$ has continuous derivatives of order $m + 1$ in such a setting.)

In many cases it is desirable to seek a simplification of the terms in the Taylor expansion by way of a suitable coordinate transformation. We will consider only transformations of class $C^{m+1}$. Then the linearization of the transformed equation is conjugate to the linearization of the original, and one may therefore assume (at least in theory) that the linear approximation $B$ is in some canonical form. The remaining task is to get the nonlinear terms in the expansion into satisfactory shape (in practical computations, one will be concerned with a finite number of such terms), and this can be achieved by “near-identity” transformations. A possible approach works degree by degree: Given that $f_2, \dots, f_{r-1}$ (for some $r$, with $2 \le r < m$) are already deemed satisfactory, according to some specified criteria, one next seeks a transformation to change $f_r$. With the ansatz

$$H(x) = x + h_r(x) + \cdots \quad \text{(transformation)},$$
$$g(x) = Bx + f_2(x) + \cdots + f_{r-1}(x) + g_r(x) + \cdots \quad \text{(transformed equation)},$$

the condition

$$DH(x)\, g(x) = f\bigl(H(x)\bigr)$$

is necessary and sufficient for $H$ to send all solutions of $\dot x = g(x)$ to solutions of $\dot x = f(x)$. Expanding the Taylor series, the terms of degree $r$ are seen to satisfy the identity

$$[B, h_r] = f_r - g_r. \qquad (2.1)$$

This “homological equation” may be viewed as the principal linear algebra problem in normal form computations. Whatever criteria one wants to impose on the simplified equation, one must determine $g_r$ so that the equation has a solution, thus $f_r - g_r$ lies in the image of $\operatorname{ad} B$. Obviously, $g_r = 0$ would be the most desirable choice, but this may not be possible. Once $g_r$ and $h_r$ have been determined to satisfy the homological equation, one is free to choose higher degree terms in the expansion of $H$, which will determine the higher degree terms for $g$. The particular choice $H = \exp(h_r)$ (thus $H(y)$ is the solution of $\dot x = h_r(x)$, $x(0) = y$, at time $t = 1$) has the practical advantage that $g$ can be computed easily, via

$$g = \exp(\operatorname{ad} h_r)\, f = f + [h_r, f] + \tfrac{1}{2}\,[h_r, [h_r, f]] + \cdots$$

(See Walcher [1993] or Gaeta [1999] for details.) This transformation also has good structural properties; for instance special classes of vector fields (Hamiltonian, reversible, symmetric with respect to a linear group, etc.) are preserved. We will work with these transformations in our examples.

The question concerning the purpose of the simplification procedure seems to have more than one answer. An obvious, and legitimate, goal is to reduce the number of parameters in the nonlinear terms as much as possible. Considering (2.1) this means choosing $g_r$ in a subspace complementary to the image of $\operatorname{ad} B$. But whenever $B$ is not nilpotent, one can choose $g_r$ in such a way that the transformed system admits a nontrivial linear one-parameter group of symmetries, up to some order. This is a very useful property in reduction and further analysis. In the Poincaré–Dulac normal form, normalization often refers only to the semisimple part of the linearization, and one does not necessarily require the subspace for $g_r$ to be of smallest possible dimension. As was already noted by Takens [1974], it is possible to proceed further and to employ nonlinear terms to remove parameters in higher order nonlinear terms. See also the recent paper on this topic by Gaeta [1999].
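For the reader who wants to see where the homological equation (2.1) comes from, the degree-$r$ comparison in the conjugacy condition can be written out as follows. The sign convention for the bracket used here is an assumption on our part, chosen so that the result agrees with (2.1) and with the expansion of $\exp(\operatorname{ad} h_r) f$ above.

```latex
% Assumed bracket convention (not spelled out in the text):
%   [p, q](x) = Dq(x) p(x) - Dp(x) q(x).
\begin{align*}
DH(x)\,g(x) &= \bigl(I + Dh_r(x) + \cdots\bigr)
               \bigl(Bx + f_2(x) + \cdots + f_{r-1}(x) + g_r(x) + \cdots\bigr),\\
f\bigl(H(x)\bigr) &= B\bigl(x + h_r(x)\bigr) + f_2(x) + \cdots + f_r(x)
               + (\text{terms of degree} > r).
\end{align*}
% For j >= 2, f_j(x + h_r(x) + ...) = f_j(x) + terms of degree >= j + r - 1 > r,
% so only B h_r and f_r contribute to degree r on the right-hand side.
% Degree r on the left:  g_r(x) + Dh_r(x) B x.
\begin{equation*}
g_r(x) + Dh_r(x)\,Bx = B\,h_r(x) + f_r(x)
\quad\Longleftrightarrow\quad
[B, h_r](x) = Dh_r(x)\,Bx - B\,h_r(x) = f_r(x) - g_r(x).
\end{equation*}
```

The terms of degree less than $r$ agree automatically, because $g$ and $f$ share the terms $Bx, f_2, \dots, f_{r-1}$ and $H$ is the identity up to degree $r - 1$.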

3 Linear Algebra, Part One

The homological equation (2.1) is a special case of the following general problem: Let $V$ be a finite dimensional vector space, and $T : V \to V$ a linear map, $T \ne 0$. Given $v \in V$, find $v_0 \in V$ and $w \in V$ such that

$$T w = v - v_0. \qquad (3.1)$$

The obvious solution $w = 0$ and $v_0 = v$ is of little interest here, and we impose an additional, more stringent condition: For $V_0$ a proper subspace of $V$ such that $\operatorname{im}(T) + V_0 = V$, we require that $v_0 \in V_0$. The additional condition $\operatorname{im}(T) \cap V_0 = 0$ may be imposed, but we will not always do so.

A frequently used approach to solve (3.1) is to find a linear map $\widehat T$ with the property $V = \ker(\widehat T) + \operatorname{im}(T)$, thus $V_0 = \ker(\widehat T)$. (Of course, one may specify $V_0$ first, and then determine $\widehat T$ accordingly.) In that case, the equation $T w = v - v_0$ implies $\widehat T\, T w = \widehat T v$. This yields a system of linear equations for $w$. Moreover, whenever $w^*$ satisfies $\widehat T\, T w^* = \widehat T v$, one has $v - T w^* \in \ker(\widehat T)$. Therefore this approach indeed reduces (3.1) to the standard problem of solving a system of linear equations. There are many possible choices for $\widehat T$; we present two distinguished ones.

1. One choice uses the decomposition $T = T_s + T_n$ into semisimple and nilpotent part, and $\widehat T := T_s$. Indeed, $V = \ker(T_s) \oplus \operatorname{im}(T_s)$, as well as $\operatorname{im}(T_s) \subseteq \operatorname{im}(T)$, are known facts from linear algebra.
2. Another choice employs a (positive definite) scalar product $\sigma$ on $V$, and $\widehat T := T^*$, the adjoint of $T$ with respect to $\sigma$. As is well known, $V$ is the orthogonal direct sum of $\ker(T^*)$ and $\operatorname{im}(T)$.
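The second choice can be made concrete in a few lines. The sketch below assumes that $T$ is given as a matrix with respect to an orthonormal basis, so that the adjoint is simply the transpose, and it works over the rationals to avoid rounding. The helper names (`solve_consistent`, `split`) are our own; the elimination routine is a plain Gauss-Jordan solver, not a method taken from the text.

```python
from fractions import Fraction


def transpose(A):
    return [list(col) for col in zip(*A)]


def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve_consistent(A, b):
    """One solution of A x = b (assumed consistent); free variables are set to 0."""
    rows, cols = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    pivot_cols = []
    r = 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free variable
        M[r], M[pivot] = M[pivot], M[r]
        lead = M[r][c]
        M[r] = [x / lead for x in M[r]]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    solution = [Fraction(0)] * cols
    for i, c in enumerate(pivot_cols):
        solution[c] = M[i][-1]
    return solution


def split(T, v):
    """Return (v0, w) with T w = v - v0 and v0 in the kernel of the transpose of T."""
    Tt = transpose(T)
    # Normal equations T^t T w = T^t v; they are always consistent.
    w = solve_consistent(mat_mul(Tt, T), mat_vec(Tt, v))
    v0 = [x - y for x, y in zip(v, mat_vec(T, w))]
    return v0, w


if __name__ == "__main__":
    T = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # a nilpotent example
    print(split(T, [1, 2, 3]))
```

For the nilpotent example the part of $v$ that cannot be removed is its last component, which spans $\ker(T^*)$ here.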

4 Application to Normal Form Computations

In the case of the homological equation (2.1), one has $V = \mathcal{P}_r$, the space of homogeneous vector-valued polynomials of degree $r$, and $T$ is the restriction of $\operatorname{ad} B = [B, \cdot\,]$ to $\mathcal{P}_r$. Generally one may say that every choice of $V_0$ (depending on the degree $r$) yields a normal form for the differential equation under consideration. Let us take a closer look at the distinguished choices introduced above.

Poincaré–Dulac (see Dulac [1912]): The decomposition $B = B_s + B_n$ induces the semisimple-nilpotent decomposition $\operatorname{ad} B = \operatorname{ad} B_s + \operatorname{ad} B_n$. Thus one may choose $V_0 = \ker(\operatorname{ad} B_s)$, whence the normal form is characterized by the property $[B_s, f_r] = 0$. (If $B_s$ is diagonal, this yields the familiar resonance conditions for the vector monomials in the normal form.)

Elphick, Tirapegui, Brachet, Coullet, and Iooss (see Elphick, Tirapegui, Brachet, Coullet, and Iooss [1987]): The standard scalar product $\langle \cdot , \cdot \rangle$ on $\mathbb{K}^n$ induces a scalar product $\sigma_r$ on $\mathcal{P}_r$ via

$$\sigma_r\Bigl(\sum_{|I| = r} a_I x^I,\ \sum_{|J| = r} b_J x^J\Bigr) = \sum_{|I| = r} I!\, \langle a_I, b_I \rangle.$$

(We use the familiar abbreviations $x^I = x_1^{i_1} \cdots x_n^{i_n}$, and $I! = i_1! \cdots i_n!$ for $I = (i_1, \dots, i_n)$.) This scalar product is sometimes called the Bargmann scalar product. Its characterizing property is

$$(\operatorname{ad} B)^* = \operatorname{ad}(B^*),$$

with the adjoints $(\,\cdot\,)^*$ taken relative to $\langle \cdot , \cdot \rangle$, respectively to $\sigma_r$. Thus one may choose $V_0 = \ker(\operatorname{ad} B^*)$, and normal forms are characterized by the property $[B^*, f_r] = 0$.
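When $B_s$ is diagonal with eigenvalues $\lambda_1, \dots, \lambda_n$, the vector monomial $x^I e_j$ commutes with $B_s$ exactly when $i_1 \lambda_1 + \cdots + i_n \lambda_n = \lambda_j$, which is the resonance condition mentioned above. The following sketch lists the resonant monomials of a given degree. It compares numbers with `==`, so it is only reliable for eigenvalues that are exactly representable (integers or Gaussian integers); that restriction, and the function name, are our own.

```python
from itertools import product


def resonant_monomials(eigenvalues, degree):
    """List (exponents I, component j) with <I, lambda> == lambda_j and |I| == degree.

    Exact equality is used, so eigenvalues should be integers or Gaussian
    integers; general floating-point eigenvalues would need a tolerance.
    """
    n = len(eigenvalues)
    found = []
    for exponents in product(range(degree + 1), repeat=n):
        if sum(exponents) != degree:
            continue
        weight = sum(k * lam for k, lam in zip(exponents, eigenvalues))
        for j, lam_j in enumerate(eigenvalues):
            if weight == lam_j:
                found.append((exponents, j))
    return found


if __name__ == "__main__":
    # Linear part with eigenvalues +i and -i (a pair of purely imaginary eigenvalues).
    print(resonant_monomials((1j, -1j), 2))
    print(resonant_monomials((1j, -1j), 3))
```

For the eigenvalue pair $\pm i$ there are no resonant terms of degree 2, and in degree 3 only $x_1^2 x_2\, e_1$ and $x_1 x_2^2\, e_2$ survive.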

5 Problems

In principle the previous sections open up a path to the computation of normal forms. For instance, in order to solve (2.1) in the Poincaré–Dulac scenario, start by solving

$$(\operatorname{ad} B_s)\,(\operatorname{ad} B)\, h_r = (\operatorname{ad} B_s)\, f_r,$$

and let

$$g_r = f_r - [B, h_r].$$

In the scenario of Elphick, Tirapegui, Brachet, Coullet, and Iooss, solve $(\operatorname{ad} B^*)\,(\operatorname{ad} B)\, h_r = (\operatorname{ad} B^*)\, f_r$ first. It is not necessary to assume that $B$ is in some pre-processed form here, although the systems of linear equations that have to be solved may be of very large size, and simplifications are certainly welcome.

There is yet another reason for requiring $B$ to be in some special form, and this reason is connected with the purpose of the normalization procedure. In the scenario of Elphick, Tirapegui, Brachet, Coullet, and Iooss, one may in principle let $B$ be arbitrary, but the additional assumption that $B_s$ is diagonal ensures that $B_s$ is normal (even self-adjoint) with respect to the standard scalar product $\langle \cdot , \cdot \rangle$, and this property ensures a canonical reduction procedure, since $B_s$ commutes with the Taylor polynomial in normal form. (Incidentally, this is the same effect that Poincaré–Dulac has, so judicious choice of coordinates does matter here.) Of course, by construction $B^*$ always commutes with the normalized nonlinear terms, but $B^*$ commutes with $B$ only if $B$ is normal with respect to the standard scalar product. (Actually, it can be shown that systems in normal form with respect to a nilpotent linear part do not, in general, admit nontrivial commuting vector fields.) Recall that, whenever $B$ is not nilpotent, theory guarantees the existence of a positive definite scalar product on $\mathbb{K}^n$ so that the semisimple part $B_s$ is normal with respect to this product. But there seems to be no easy computational access to such a scalar product. Of course there are important classes of matrices which are normal with respect to the standard scalar product, like skew-symmetric matrices, and for these the Elphick normalization yields normal forms with nontrivial commuting vector fields. Still, one should keep in mind that the scalar product generally is another ingredient which has to be determined in a suitable manner. Thus Poincaré–Dulac, in its naturally invariant setting, may have some advantage.

The decomposition $B = B_s + B_n$, which is of central importance in the Poincaré–Dulac setting, is not necessarily easy to find, although it can be obtained by “rational” methods which do not require the actual computation of, say, the Jordan canonical form. But it is worth noting here that the question whether $B = B_s$ can be decided with relative ease: This is the case if and only if the minimum polynomial of $B$ has no multiple roots. (Recall that the minimum polynomial is the normalized polynomial of smallest degree which vanishes upon substitution of $B$.)

10. On Normal Form Computations

6 Linear Algebra, Part Two

As in Section 3, let $V$ be a finite dimensional vector space and $T : V \to V$ linear. We will discuss the Poincaré–Dulac scenario in greater detail: Given $v \in V$, find $v_0 \in \ker(T_s)$ and $w \in V$ such that (3.1) is satisfied. Our goal is to find an approach to this problem which does not involve solving a system of linear equations. Of course, certain properties of $T$ will have to be used in such a computation.

We first discuss the semisimple case $T = T_s$. Let $p \in \mathbb{K}[\tau]$ be a normalized polynomial over $\mathbb{K}$ in the indeterminate $\tau$, thus

$$p = \tau^m + \sum_{i=1}^{m} \alpha_i \tau^{m-i}, \qquad \text{with } \alpha_1, \ldots, \alpha_m \in \mathbb{K},$$

with the property that

$$p(T)v = T^m v + \sum_{i=1}^{m} \alpha_i T^{m-i} v = 0 .$$

Such a polynomial exists; for instance one may take the minimum polynomial of $T$. (For certain $v$ there are polynomials of smaller degree with the desired property.) Since $T$ is semisimple, the roots of its minimum polynomial are simple. Therefore we may assume that $\alpha_{m-1} \neq 0$ in case $\alpha_m = 0$, and we will require this in the following.

6.1 Proposition.

Let $p$ be given, with the above properties.

(a) In case $\alpha_m \neq 0$ the equation (3.1) is satisfied with $v_0 = 0$ and

$$w = -\frac{1}{\alpha_m}\Big(T^{m-1} v + \sum_{i=1}^{m-1} \alpha_i T^{m-i-1} v\Big).$$

(b) In case $\alpha_m = 0$ and $\alpha_{m-1} \neq 0$, one has $p(\tau) = \tau \cdot p^*(\tau)$, with

$$p^*(\tau) = \tau^{m-1} + \sum_{i=1}^{m-1} \alpha_i \tau^{m-1-i}.$$

Let

$$q_1(\tau) = -\frac{1}{\alpha_{m-1}}\Big(\tau^{m-2} + \sum_{i=1}^{m-2} \alpha_i \tau^{m-i-2}\Big), \qquad q_2(\tau) = \frac{1}{\alpha_{m-1}},$$

so that $q_1(\tau) \cdot \tau + q_2(\tau) \cdot p^*(\tau) = 1$, and

$$w = q_1(T)v = -\frac{1}{\alpha_{m-1}}\Big(T^{m-2} v + \sum_{i=1}^{m-2} \alpha_i T^{m-i-2} v\Big),$$

$$v_0 = v - Tw = v - T q_1(T) v = q_2(T)\, p^*(T) v .$$

Then $v_0$ and $w$ satisfy equation (3.1). One may replace $w$ by

$$\tilde{w} := w + \frac{\alpha_{m-2}}{\alpha_{m-1}}\, v_0 \in \operatorname{im}(T).$$

Sketch of proof. Part (a) is an immediate consequence of $p(T)v = 0$. With regard to part (b) note that $T q_2(T)\, p^*(T) v = q_2(T)\, p(T) v = 0$, which shows $q_2(T)\, p^*(T) v \in \ker(T)$. Now the equality

$$v = q_1(T)\, T v + q_2(T)\, p^*(T) v$$

implies the assertion.
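The formulas of Proposition 6.1 can be tried out directly. The following is a minimal Python sketch, not part of the original text: it assumes that $T$ is given as a square matrix (a list of rows), that the coefficients $\alpha_1, \ldots, \alpha_m$ of a polynomial with $p(T)v = 0$ are already known, and that (3.1) reads $v = v_0 + Tw$. The names `mat_vec` and `split_vector` are hypothetical helpers chosen for illustration.

```python
from fractions import Fraction


def mat_vec(T, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in T]


def split_vector(T, v, alpha):
    """Return (v0, w) with v = v0 + T w and T v0 = 0, following Proposition 6.1.

    alpha = [alpha_1, ..., alpha_m] are the coefficients of the monic polynomial
    p(tau) = tau^m + sum_i alpha_i tau^(m-i); the caller must ensure p(T) v = 0
    (assumption, not checked here).
    """
    coeffs = [Fraction(1)] + [Fraction(a) for a in alpha]  # leading coefficient first
    if coeffs[-1] != 0:
        # Case (a): alpha_m != 0, use tau^(m-1) + ... + alpha_(m-1).
        q, pivot = coeffs[:-1], coeffs[-1]
    else:
        # Case (b): alpha_m = 0, use tau^(m-2) + ... + alpha_(m-2).
        q, pivot = coeffs[:-2], coeffs[-2]
        if pivot == 0:
            raise ValueError("0 must be at most a simple root of p")
    # Horner scheme: acc = q(T) v, using only matrix-vector products.
    acc = [Fraction(0)] * len(v)
    for c in q:
        acc = [a + c * x for a, x in zip(mat_vec(T, acc), v)]
    w = [-a / pivot for a in acc]
    v0 = [x - y for x, y in zip(v, mat_vec(T, w))]
    return v0, w
```

For instance, the matrix with rows (1, 1) and (2, 2) has minimum polynomial $\tau^2 - 3\tau$, so `split_vector([[1, 1], [2, 2]], [1, 0], [-3, 0])` falls under case (b). No linear system is solved at any point; only products of $T$ with vectors are formed.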

Note that the procedure only requires 0 to be a simple root of $p$. This is somewhat less restrictive than semisimplicity. Let us address the problem of how to find a polynomial $p$ with the desired properties. If the minimum polynomial of $T$ is known then one may take $p$ as this minimum polynomial. In any case, a possible strategy is to test $v, Tv, T^2 v, \ldots$ successively for linear dependence, and choose $m$ as the smallest positive integer such that $v, Tv, \ldots, T^m v$ are linearly dependent. An elementary argument shows that the coefficient of $T^m v$ is then nonzero, so it may be set equal to 1. Moreover, 0 is a root of multiplicity at most one for $p$, since the minimum polynomial of $T$ is a multiple of $p$.

Let us now consider the general case, with $T$ not necessarily semisimple. We require the decomposition $T = T_s + T_n$ into the semisimple and nilpotent part here. Let $p \in \mathbb{K}[\tau]$ be the minimum polynomial of $T_s$. Thus we know how to find $v_0 \in \ker(T_s)$ and $w^* \in \operatorname{im}(T_s)$ such that $T_s w^* = v - v_0$. There remains the task to find $w$ so that $T w = T_s w^*$. We may assume that $w \in \operatorname{im}(T_s)$, since this subspace is $T$-invariant and the restriction of $T$ to this subspace is invertible. The crucial observation here is that there is a polynomial $q_3$ so that $w = T_s\, q_3(T_s)\, w$. To see this, note

$$w = -\frac{1}{\alpha_m}\Big(T_s^{m-1} + \sum_{i=1}^{m-1} \alpha_i T_s^{m-1-i}\Big) T_s w$$

in case $\alpha_m \neq 0$. In case $\alpha_m = 0$ there is some $z$ so that $w = T_s z$, and the equation

$$w = T_s z = -\frac{1}{\alpha_{m-1}}\Big(T_s^{m-2} + \sum_{i=1}^{m-2} \alpha_i T_s^{m-i-2}\Big) T_s^2 z$$

yields the desired property. Since $T_s$ and $T_n$ commute, one may rewrite the equation as

$$T_s w^* = (T_s + T_n)\, w = T_s \big(\operatorname{id} + T_n\, q_3(T_s)\big)\, w .$$

This shows that

$$w = \big(\operatorname{id} + T_n\, q_3(T_s)\big)^{-1} w^*$$

solves the equation. It should be emphasized that we really have gained something here, since $T_n\, q_3(T_s)$ is a nilpotent linear map, and $\operatorname{id} + T_n\, q_3(T_s)$ is therefore easy to invert via the geometric series. Solving a system of linear equations is not necessary here.
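The inversion via the geometric series can be sketched in a few lines of Python. This is an illustration only, under the assumption that the nilpotent map $N = T_n\, q_3(T_s)$ has already been assembled as a square matrix; the function name `apply_inverse_of_id_plus` is hypothetical. It evaluates $(\operatorname{id} + N)^{-1} u = u - Nu + N^2 u - \cdots$, which terminates because $N^n = 0$ for a nilpotent $n \times n$ matrix.

```python
from fractions import Fraction


def mat_vec(N, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in N]


def apply_inverse_of_id_plus(N, u):
    """Return (id + N)^(-1) u for a nilpotent square matrix N.

    Uses the finite geometric series u - N u + N^2 u - ...; raises ValueError
    if the terms have not vanished after len(N) steps, i.e. N^n u != 0.
    """
    result = [Fraction(x) for x in u]
    term = [Fraction(x) for x in u]
    for _ in range(len(N)):
        term = [-x for x in mat_vec(N, term)]  # next term (-N)^k u
        if not any(term):
            return result
        result = [r + t for r, t in zip(result, term)]
    raise ValueError("series did not terminate: N is not nilpotent on u")
```

With `N = [[0, 1], [0, 0]]` and `u = [1, 1]` the series stops after one correction term. The cost is at most $n$ matrix-vector products, and no elimination step is needed.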

7 Application to Poincaré–Dulac Normal Forms

Let us return to the homological equation (2.1) in the space $\mathcal{P}_r$ of homogeneous vector polynomials. For the sake of simplicity and brevity, we restrict our attention to the case that $B = B_s$ is semisimple, whence $\operatorname{ad} B_s$ is semisimple on $\mathcal{P}_r$. (Note that the same procedure works if the restriction of $\operatorname{ad} B_s$ to $\mathcal{P}_r$ is invertible, or the polynomial has a simple root 0.) Using the results of the previous section, we have the following strategy to solve (2.1) and obtain $g_r$ in Poincaré–Dulac form:

(i) Determine an integer $m$ and scalars $\alpha_1, \ldots, \alpha_m$ such that

$$(\operatorname{ad} B)^m f_r + \alpha_1 (\operatorname{ad} B)^{m-1} f_r + \cdots + \alpha_{m-1} (\operatorname{ad} B) f_r + \alpha_m f_r = 0 .$$

Moreover, ascertain that $\alpha_{m-1} \neq 0$ in case $\alpha_m = 0$.

(ii) In case $\alpha_m \neq 0$ set

$$h_r = -\frac{1}{\alpha_m}\Big((\operatorname{ad} B)^{m-1} f_r + \alpha_1 (\operatorname{ad} B)^{m-2} f_r + \cdots + \alpha_{m-1} f_r\Big), \qquad g_r = 0 .$$

(iii) In case $\alpha_m = 0$ set

$$h_r^* = -\frac{1}{\alpha_{m-1}}\Big((\operatorname{ad} B)^{m-2} f_r + \alpha_1 (\operatorname{ad} B)^{m-3} f_r + \cdots + \alpha_{m-2} f_r\Big), \qquad g_r = f_r - [B, h_r^*] .$$

One may take $h_r = h_r^*$, or choose

$$h_r = h_r^* + \frac{\alpha_{m-2}}{\alpha_{m-1}}\, g_r \in \operatorname{im}(\operatorname{ad} B).$$

In most cases it will be necessary to compute (and possibly store) $f_r$, $[B, f_r]$, $[B, [B, f_r]]$, $\ldots$ In principle it is possible to compute the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_r$ from the minimum polynomial of $B$, which one needs to know in any case. (To see this, recall that a normalized polynomial with simple roots is uniquely determined by these roots. Furthermore the roots of the minimum polynomial of $B$, that is, the eigenvalues of $B$, determine the roots of the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_r$.) This approach is actually feasible in a number of relevant cases, as will be seen in the following section. An alternative approach, as mentioned in Section 6, is to successively test the vector fields above for linear dependence.

We do not contend that this approach is always better, or more efficient, than proceeding as suggested by the results of Sections 3 and 4, via solving $(\operatorname{ad} B)^2 h_r = (\operatorname{ad} B) f_r$. But it may be of advantage for instance in the case of parameter-dependent systems, since solving (large) parameter-dependent systems of linear equations is a formidable problem.

8 Some Classes of Examples

Here we discuss some classes of practically relevant examples to illustrate the approach. In all these cases the minimum polynomials of $\operatorname{ad} B$ are directly available from the minimum polynomial of $B$. (Recall that for eigenvalues $\lambda_1, \ldots, \lambda_n$ of $B$, the eigenvalues of $\operatorname{ad} B$ on $\mathcal{P}_r$ are the numbers $\sum_i m_i \lambda_i - \lambda_j$, where the $m_i$ are nonnegative integers with $\sum_i m_i = r$. Thus one has the correspondence of the minimum polynomials.)
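The eigenvalue rule just recalled translates into a short computation. The sketch below is an illustration, not the authors' procedure: it assumes that $B$ is semisimple with known rational eigenvalues, so that the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_r$ is the monic polynomial whose simple roots are the distinct numbers $\sum_i m_i \lambda_i - \lambda_j$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def ad_eigenvalues(eigs, r):
    """Distinct values sum_i m_i*lam_i - lam_j with nonnegative m_i, sum m_i = r."""
    values = set()
    # A multiset of r eigenvalues corresponds to one multi-index (m_1, ..., m_n).
    for combo in combinations_with_replacement(eigs, r):
        total = sum(combo)
        for lam in eigs:
            values.add(total - lam)
    return values


def monic_poly_from_roots(roots):
    """Coefficients (leading first) of the product of (tau - root) over all roots."""
    coeffs = [Fraction(1)]
    for root in roots:
        # Multiply the current polynomial by (tau - root).
        coeffs = [a - root * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs
```

With eigenvalues 0 and $\rho = 1$, the call `monic_poly_from_roots(ad_eigenvalues([0, 1], 2))` produces the coefficient list of $\tau^4 - 2\tau^3 - \tau^2 + 2\tau$, from which the scalars $\alpha_1, \ldots, \alpha_m$ of Section 7 can be read off. Exact rational arithmetic is used so that the test $\alpha_m = 0$ is reliable.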

Dimension Two, Linear Part with Simple Eigenvalue 0

8.1 Proposition. Let $B$ have the eigenvalues 0 and $\rho \neq 0$. Then the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_r$ is

$$(\tau + \rho)\, \tau\, (\tau - \rho)\, (\tau - 2\rho) \cdots (\tau - r\rho) .$$

Sketch of proof. The eigenvalues of $\operatorname{ad} B$ on $\mathcal{P}_r$ are the numbers

$$m_1 \cdot 0 + m_2 \cdot \rho - 0 \quad \text{and} \quad m_1 \cdot 0 + m_2 \cdot \rho - \rho ,$$

with integers $m_1 \geq 0$ and $m_2 \geq 0$ so that $m_1 + m_2 = r$.

Let us use this for normal form computations up to degree 3:

(i) Degree two normalization for $f = B + f_2 + \cdots$. The minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_2$ is given by $\tau(\tau^3 - 2\rho \tau^2 - \rho^2 \tau + 2\rho^3)$ (so $m = 4$, $\alpha_1 = -2\rho$, $\alpha_2 = -\rho^2$, $\alpha_3 = 2\rho^3$, $\alpha_4 = 0$). Since $\alpha_4 = 0$, equation (iii) of Section 7 gives

$$h_2 = -\frac{1}{2\rho^3}\Big((\operatorname{ad} B)^2 f_2 - 2\rho\, (\operatorname{ad} B) f_2 - \rho^2 f_2\Big), \qquad g_2 = f_2 - [B, h_2] .$$

In order to proceed with the normalization, we need the degree three term of $g$. We will compute it using the exponential of $h_2$, as outlined in Section 2. Collecting terms of small degrees, we have

$$g = \exp(\operatorname{ad} h_2) f = B + f_2 + f_3 + \cdots + [h_2, B] + [h_2, f_2] + \cdots + \tfrac{1}{2} [h_2, [h_2, B]] + \cdots .$$

Using $[h_2, B] = g_2 - f_2$, one finds

$$g_3 = f_3 + [h_2, f_2] + \tfrac{1}{2} [h_2, g_2 - f_2] = f_3 + \tfrac{1}{2} [h_2, g_2 + f_2] .$$

(ii) Degree three normalization for $f = B + f_2 + f_3 + \cdots$; here we assume that $[B, f_2] = 0$ (thus we have renamed the results of the degree two normalization). By Proposition 8.1 the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_3$ is $(\tau + \rho)\tau(\tau - \rho)(\tau - 2\rho)(\tau - 3\rho)$, that is,

$$\tau(\tau^4 - 5\rho \tau^3 + 5\rho^2 \tau^2 + 5\rho^3 \tau - 6\rho^4) ,$$

so that $\alpha_4 = -6\rho^4$ and $\alpha_5 = 0$, and hence

$$h_3 = \frac{1}{6\rho^4}\Big((\operatorname{ad} B)^3 f_3 - 5\rho\, (\operatorname{ad} B)^2 f_3 + 5\rho^2 (\operatorname{ad} B) f_3 + 5\rho^3 f_3\Big), \qquad g_3 = f_3 - [B, h_3] .$$

Let us illustrate why this procedure to compute the Poincaré–Dulac normal form is of practical interest. (For a vector field $f$ and a scalar function $\mu$ we denote the Lie derivative by $L_f(\mu)$; thus $L_f(\mu)(x) = D\mu(x) f(x)$.)

8.2 Proposition. Let $f = B + \cdots$ be in normal form up to degree three. There is a nonzero linear form $\mu$ such that $L_B(\mu) = 0$.

(a) In case $L_{f_2}(\mu) \neq 0$ the stationary point 0 is a saddle-node.

(b) In case $L_{f_2}(\mu) = 0$, $L_{f_3}(\mu) \neq 0$ the stationary point 0 is either a saddle or a node. More precisely, one has $L_{f_3}(\mu) = \sigma \cdot \mu^3$ for some scalar $\sigma$, and the stationary point is a saddle in case $\rho \cdot \sigma < 0$, and a node in case $\rho \cdot \sigma > 0$.

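Sketch of proof. In the special case $B = \operatorname{diag}(0, \rho)$ the normal form is given by

$$Bx + \begin{pmatrix} \alpha_1 x_1^2 \\ \beta_1 x_1 x_2 \end{pmatrix} + \begin{pmatrix} \alpha_2 x_1^3 \\ \beta_2 x_1^2 x_2 \end{pmatrix} + \cdots ,$$

and the stationary point is a saddle-node if $\alpha_1 \neq 0$, and a saddle or a node if $\alpha_1 = 0$, $\alpha_2 \neq 0$. Note that $\mu(x) := x_1$ is (up to scalar factors) the only linear first integral of $B$, and that $L_{f_2}(\mu) = \alpha_1 \mu^2$, $L_{f_3}(\mu) = \alpha_2 \mu^3$. The assertion in the general case follows immediately, being the coordinate-independent version of the above.

It is also worth noting that one finds $\mu$ from a system of linear equations. Let us illustrate this with a sample computation. Consider

$$f(x) = \begin{pmatrix} 1 & 1 \\ \sigma & \sigma \end{pmatrix} x + \begin{pmatrix} 2 x_1 x_2 \\ -x_2^2 \end{pmatrix} .$$

The eigenvalues of the linear part are 0 and $\rho := 1 + \sigma$. (We will assume that $\sigma \neq -1$.) Note that the linear form $\mu(x) = \sigma x_1 - x_2$ is annihilated by $L_B$. Using the procedure described above, we get

$$(\operatorname{ad} B) f_2 = \left[ \begin{pmatrix} x_1 + x_2 \\ \sigma x_1 + \sigma x_2 \end{pmatrix}, \begin{pmatrix} 2 x_1 x_2 \\ -x_2^2 \end{pmatrix} \right] = \begin{pmatrix} 2\sigma x_1^2 + 2\sigma x_1 x_2 + 3 x_2^2 \\ -4\sigma x_1 x_2 - \sigma x_2^2 \end{pmatrix},$$

$$(\operatorname{ad} B)^2 f_2 = \begin{pmatrix} (2\sigma + 2\sigma^2) x_1^2 + (14\sigma + 2\sigma^2) x_1 x_2 + (-3 + 9\sigma) x_2^2 \\ -6\sigma^2 x_1^2 + (-4\sigma - 4\sigma^2) x_1 x_2 + (-7\sigma - \sigma^2) x_2^2 \end{pmatrix} .$$

This yields, with the prefactor $-1/(2\rho^3)$ from the formula for $h_2$,

$$h_2 = \frac{-1}{2(1 + \sigma)^3} \begin{pmatrix} (-2\sigma - 2\sigma^2) x_1^2 + (-2 + 6\sigma - 4\sigma^2) x_1 x_2 + (-9 + 3\sigma) x_2^2 \\ -6\sigma^2 x_1^2 + (4\sigma + 4\sigma^2) x_1 x_2 + (1 - 3\sigma + 2\sigma^2) x_2^2 \end{pmatrix},$$

$$g_2 = \frac{1}{2(1 + \sigma)^3} \begin{pmatrix} (-4\sigma + 10\sigma^2 - 4\sigma^3) x_1^2 + (4 - 16\sigma + 16\sigma^2) x_1 x_2 + (6 - 12\sigma) x_2^2 \\ (-6\sigma^2 + 12\sigma^3) x_1^2 + (8\sigma - 20\sigma^2 + 8\sigma^3) x_1 x_2 + (-2 + 8\sigma - 8\sigma^2) x_2^2 \end{pmatrix} .$$

Since

$$L_{g_2}(\mu) = \frac{1 - 2\sigma}{(1 + \sigma)^2}\, \mu^2 ,$$

we find that the stationary point 0 is a saddle-node whenever $\sigma \neq \tfrac{1}{2}$. This collection of computations should be sufficient to illustrate the procedure, and to convince the reader that it is straightforward. We will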

not record the following computations explicitly. The expressions become longer, but they are easily handled by any computer algebra system. After normalization up to degree three, the next interesting result is Lg3 (µ) =

6σ 2 + 9σ + 15σ 2 + 10σ 3 − 3σ 5 − σ 6 µ3 . 10 (1 + σ)

In the yet unresolved case $\sigma = \tfrac{1}{2}$ we find $L_{g_3}(\mu) = \alpha \mu^3$ with some positive $\alpha$, and this shows that the stationary point 0 is a repelling node. Thus, computation of the normal form up to degree three is sufficient to determine the type of the stationary point for every admissible value of $\sigma$.
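The degree two step of this sample computation can be reproduced with exact arithmetic. The following Python sketch is an illustration only and makes several assumptions: the bracket convention $[g, f](x) = Df(x)\, g(x) - Dg(x)\, f(x)$ (the one that reproduces the displayed value of $(\operatorname{ad} B) f_2$), a fixed sample value $\sigma = 2$ instead of a symbolic parameter, and a hypothetical dictionary representation of polynomials in $x_1, x_2$.

```python
from fractions import Fraction

# A polynomial in (x1, x2) is a dict {(i, j): coeff} meaning coeff * x1**i * x2**j;
# a planar vector field is a pair of such polynomials.


def p_add(p, q, c=1):
    """Return p + c*q, dropping zero coefficients."""
    out = dict(p)
    for key, val in q.items():
        out[key] = out.get(key, 0) + c * val
    return {k: v for k, v in out.items() if v != 0}


def p_mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (a, b), u in p.items():
        for (c, d), v in q.items():
            out[(a + c, b + d)] = out.get((a + c, b + d), 0) + u * v
    return {k: v for k, v in out.items() if v != 0}


def p_diff(p, var):
    """Partial derivative with respect to x1 (var=0) or x2 (var=1)."""
    out = {}
    for key, u in p.items():
        if key[var] > 0:
            new = list(key)
            new[var] -= 1
            out[tuple(new)] = u * key[var]
    return out


def bracket(g, f):
    """[g, f](x) = Df(x) g(x) - Dg(x) f(x), so that (ad B) f = bracket(B, f)."""
    comps = []
    for i in range(2):
        comp = {}
        for j in range(2):
            comp = p_add(comp, p_mul(p_diff(f[i], j), g[j]))
            comp = p_add(comp, p_mul(p_diff(g[i], j), f[j]), -1)
        comps.append(comp)
    return tuple(comps)


def combine(terms):
    """Linear combination sum(c * field) of planar vector fields."""
    out = ({}, {})
    for c, field in terms:
        out = tuple(p_add(out[i], field[i], c) for i in range(2))
    return out


def lie_derivative(f, mu):
    """L_f(mu) = D mu * f for a scalar polynomial mu."""
    out = {}
    for j in range(2):
        out = p_add(out, p_mul(p_diff(mu, j), f[j]))
    return out


sigma = Fraction(2)  # sample value (assumption); requires sigma != -1
rho = 1 + sigma
B = ({(1, 0): 1, (0, 1): 1}, {(1, 0): sigma, (0, 1): sigma})
f2 = ({(1, 1): 2}, {(0, 2): -1})

ad1 = bracket(B, f2)   # (ad B) f2
ad2 = bracket(B, ad1)  # (ad B)^2 f2
# h2 = -1/(2 rho^3) * ((ad B)^2 f2 - 2 rho (ad B) f2 - rho^2 f2)
h2 = combine([(-1 / (2 * rho**3), ad2), (1 / rho**2, ad1), (1 / (2 * rho), f2)])
g2 = combine([(1, f2), (-1, bracket(B, h2))])
mu = {(1, 0): sigma, (0, 1): -1}  # linear first integral sigma*x1 - x2
```

For this value of $\sigma$ one can then check that `bracket(B, g2)` is the zero field, so $g_2$ is in normal form, and that `lie_derivative(g2, mu)` is the multiple $(1 - 2\sigma)/(1 + \sigma)^2 = -1/3$ of $\mu^2 = 4x_1^2 - 4x_1 x_2 + x_2^2$. A computer algebra system would handle symbolic $\sigma$ in the same way; the dictionary representation is used here only to keep the sketch self-contained.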

Dimension Two, Linear Part having Purely Imaginary Eigenvalues

8.3 Proposition. Let $B$ have purely imaginary eigenvalues $\pm i\omega$, $\omega \neq 0$. Then the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_r$ is equal to

$$(\tau^2 + \omega^2)(\tau^2 + 9\omega^2) \cdots \big(\tau^2 + (r + 1)^2 \omega^2\big) \quad \text{for } r \text{ even},$$

$$\tau(\tau^2 + 4\omega^2)(\tau^2 + 16\omega^2) \cdots \big(\tau^2 + (r + 1)^2 \omega^2\big) \quad \text{for } r \text{ odd}.$$

Sketch of proof. The eigenvalues are of the form $i\omega(m_1 - m_2 \pm 1)$.

Again, let us determine the normal form of $f = B + f_2 + f_3 + \cdots$ up to degree three.

(i) Degree two normalization for $f = B + f_2 + \cdots$; the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_2$ is given by $\tau^4 + 10\omega^2 \tau^2 + 9\omega^4$, therefore $g_2 = 0$ and

$$h_2 = -\frac{1}{9\omega^4}\Big((\operatorname{ad} B)^3 f_2 + 10\omega^2 (\operatorname{ad} B) f_2\Big) .$$

Again, compute the degree three term of $g$, according to Section 2. Collecting terms of small degrees, and using $[h_2, B] = -f_2$, one finds

$$g_3 = f_3 + \tfrac{1}{2} [h_2, f_2] .$$

(ii) Degree three normalization for $f = B + f_2 + f_3 + \cdots$; again we assume that $[B, f_2] = 0$ (thus we have renamed the results of the degree two normalization). The minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_3$ is given by

$$\tau(\tau^4 + 20\omega^2 \tau^2 + 64\omega^4) ,$$

and hence

$$h_3 = -\frac{1}{64\omega^4}\Big((\operatorname{ad} B)^3 f_3 + 20\omega^2 (\operatorname{ad} B) f_3\Big), \qquad g_3 = f_3 - [B, h_3] .$$

The usefulness of these computations is evident from the next result, whose proof is analogous to that of Proposition 8.2.

8.4 Proposition. Let $f = B + f_3 + \cdots$ be in normal form up to degree three. Then there is a positive definite quadratic form $\mu$ (unique up to scalar factors) such that $L_B(\mu) = 0$. Moreover $L_{f_3}(\mu) = \alpha \mu^2$ for some scalar $\alpha$. In case $\alpha \neq 0$ the stationary point 0 is a (weak) focus, which is stable in case $\alpha < 0$ and unstable in case $\alpha > 0$.

Again we present a sample computation for the purpose of illustration. Let

$$f(x) = \begin{pmatrix} 1 & \sigma \\ -\sigma & -1 \end{pmatrix} x + \begin{pmatrix} 0 \\ x_2^2 \end{pmatrix} .$$

The eigenvalues of the linear part are $\pm i \sqrt{\sigma^2 - 1}$. (We will assume that $\sigma > 1$.) Here the positive definite quadratic form $\mu(x) = \sigma x_1^2 + 2 x_1 x_2 + \sigma x_2^2$ is annihilated by $L_B$. Using the procedure described above, we get (among other results)

$$(\operatorname{ad} B) f_2 = \left[ \begin{pmatrix} x_1 + \sigma x_2 \\ -\sigma x_1 - x_2 \end{pmatrix}, \begin{pmatrix} 0 \\ x_2^2 \end{pmatrix} \right] = \begin{pmatrix} -\sigma x_2^2 \\ -2\sigma x_1 x_2 - x_2^2 \end{pmatrix},$$

$$(\operatorname{ad} B)^3 f_2 = \begin{pmatrix} -6\sigma^3 x_1^2 - 12\sigma^2 x_1 x_2 + (-13\sigma + 7\sigma^3) x_2^2 \\ 6\sigma^2 x_1^2 + (-2\sigma + 14\sigma^3) x_1 x_2 + (-1 + 7\sigma^2) x_2^2 \end{pmatrix} .$$

With $\omega^2 = \sigma^2 - 1$ this yields

$$h_2 = \frac{-1}{9(\sigma^2 - 1)^2} \begin{pmatrix} -6\sigma^3 x_1^2 - 12\sigma^2 x_1 x_2 + (-3\sigma - 3\sigma^3) x_2^2 \\ 6\sigma^2 x_1^2 + (18\sigma - 6\sigma^3) x_1 x_2 + (9 - 3\sigma^2) x_2^2 \end{pmatrix} .$$

Again, we do not record the intermediate results explicitly. After normalization up to degree three, we find

$$L_{g_3}(\mu) = -\frac{\sigma}{2(\sigma^2 - 1)^2}\, \mu^2 ,$$

whence the stationary point 0 is asymptotically stable by Proposition 8.4.
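As a small worked check (added for illustration, using only the definitions $L_B(\mu)(x) = D\mu(x)\, Bx$ and the data of this example), one can verify directly that the quadratic form $\mu(x) = \sigma x_1^2 + 2 x_1 x_2 + \sigma x_2^2$ is annihilated by $L_B$:

```latex
\begin{align*}
L_B(\mu)(x) &= D\mu(x)\, Bx \\
  &= (2\sigma x_1 + 2 x_2)(x_1 + \sigma x_2) + (2 x_1 + 2\sigma x_2)(-\sigma x_1 - x_2) \\
  &= \bigl(2\sigma x_1^2 + (2\sigma^2 + 2)\, x_1 x_2 + 2\sigma x_2^2\bigr)
   - \bigl(2\sigma x_1^2 + (2\sigma^2 + 2)\, x_1 x_2 + 2\sigma x_2^2\bigr) = 0 .
\end{align*}
```

The form is positive definite for $\sigma > 1$, since its coefficient matrix has diagonal entries $\sigma > 0$ and determinant $\sigma^2 - 1 > 0$.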

Some Higher-Dimensional Examples

One could argue that for two-dimensional systems a coordinate-invariant approach is not really necessary, since one can still resort to brute force methods in the computation of eigenvalues and eigenvectors, and actually perform the necessary coordinate transformations. But it is obvious that such an argument no longer holds for systems of higher dimension. Here we will indicate that our approach still works in such scenarios, by giving closed expressions for some minimum polynomials. (This is the only necessary ingredient for our normal form computations. The rest is routine, even if possibly tedious.)

The first example is in dimension three, and the minimum polynomial of $B$ is $\tau^3 + \omega^2 \tau$. (For instance, skew-symmetric matrices are of this type.) The proof is as usual.

8.5 Proposition. Let $B$ have the eigenvalue 0 and the purely imaginary eigenvalues $\pm i\omega$, $\omega \neq 0$. Then the eigenvalues of $\operatorname{ad} B$ on $\mathcal{P}_r$ are the numbers $i k \omega$ with integers $|k| \leq r + 1$, and the minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_r$ is equal to

$$\tau(\tau^2 + \omega^2)(\tau^2 + 4\omega^2) \cdots \big(\tau^2 + (r + 1)^2 \omega^2\big) .$$

Thus the computation of the normal form, up to any degree, is reduced to routine computations, as before. As is well-known, the symmetry properties of a vector field in normal form can be employed to find a reduced equation on a suitable orbit space, using the polynomial invariants of $B$. In the given scenario, this yields a counterpart to Propositions 8.2 and 8.4: There are a linear form $\mu_1$ and a positive semidefinite quadratic form $\mu_2$ which are annihilated by $L_B$. If $f = B + \cdots$ is in normal form up to degree three then

$$L_f(\mu_1) = \alpha_1 \mu_1^2 + \alpha_2 \mu_2 + \alpha_3 \mu_1^3 + \alpha_4 \mu_1 \mu_2 + \cdots ,$$

$$L_f(\mu_2) = \beta_1 \mu_1 \mu_2 + \beta_2 \mu_2^2 + \cdots ,$$

with constants $\alpha_i$ and $\beta_i$. This is just the coordinate-invariant version of the familiar case when

$$B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -\omega \\ 0 & \omega & 0 \end{pmatrix},$$

with $\mu_1(x) = x_1$ and $\mu_2(x) = x_2^2 + x_3^2$. Thus one gets stability results from the reduced two-dimensional system.

Finding $\mu_1$ and $\mu_2$ may not be straightforward, but this task can be approached by a strategy similar to the one of Section 6, employing the minimum polynomial of $L_B$ on the space of linear, resp. quadratic, polynomials.

The second class of examples, in dimension four, includes most skew-symmetric $4 \times 4$-matrices. For the sake of brevity we only record degrees 2 and 3 here.

8.6 Proposition. Let $B$ have the purely imaginary eigenvalues $\pm i\omega_1$ and $\pm i\omega_2$, where $0 < \omega_1 < \omega_2$. The minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_2$ is equal to

$$\prod_k (\tau^2 + \alpha_k) ,$$

where

$$\alpha_1 = \omega_1^2, \quad \alpha_2 = 9\omega_1^2, \quad \alpha_3 = \omega_2^2, \quad \alpha_4 = 9\omega_2^2,$$

$$\alpha_5 = (2\omega_1 + \omega_2)^2, \quad \alpha_6 = (2\omega_1 - \omega_2)^2, \quad \alpha_7 = (\omega_1 + 2\omega_2)^2, \quad \alpha_8 = (\omega_1 - 2\omega_2)^2,$$

provided that all these numbers are pairwise different and nonzero. The minimum polynomial of $\operatorname{ad} B$ on $\mathcal{P}_3$ is equal to

$$\tau \prod_k (\tau^2 + \beta_k) ,$$

where

$$\beta_1 = 4\omega_1^2, \quad \beta_2 = 16\omega_1^2, \quad \beta_3 = 4\omega_2^2, \quad \beta_4 = 16\omega_2^2,$$

$$\beta_5 = (3\omega_1 + \omega_2)^2, \quad \beta_6 = (3\omega_1 - \omega_2)^2, \quad \beta_7 = (\omega_1 + \omega_2)^2, \quad \beta_8 = (\omega_1 - \omega_2)^2,$$

$$\beta_9 = (\omega_1 + 3\omega_2)^2, \quad \beta_{10} = (\omega_1 - 3\omega_2)^2,$$

provided that all these numbers are pairwise different and nonzero.

The proof is as usual, with the eigenvalues of $B$ being known. Some low-order resonances (like $\omega_2 = 2\omega_1$) had to be excluded in the statement of Proposition 8.6. This is not a serious problem, since the minimum polynomials still have the same roots in the exceptional cases, only their degrees are smaller. If there are no low-order resonances, one can employ the stability criteria of Takens [1974]: One has to find two positive semidefinite quadratic forms of rank 2 that are mapped to 0 by $L_B$, and use the normal form up to degree three. Thus we see that this class of examples can be treated directly with the tools developed here. More complicated examples will be discussed in a forthcoming paper.

References

Anosov, D. V. and V. I. Arnold (Eds.) [1988], Dynamical Systems I, Encyclopedia of Mathematical Sciences, Springer-Verlag, New York, Berlin.

Bibikov, Yu. N. [1979], Local theory of analytic ordinary differential equations, Lecture Notes in Mathematics 702, Springer-Verlag, New York, Berlin.

Broer, H. [1981], Formal normal form theorems for vector fields and consequences for bifurcations in the volume preserving case, in: Dynamical systems and turbulence (D. A. Rand and L. S. Young, eds), Lecture Notes in Mathematics 898, Springer-Verlag, New York, Berlin, 54–74.

Bruno, A. D. [1989], Local methods in nonlinear differential equations, Springer-Verlag, New York, Berlin.

Chen, G. and J. DellaDora [2000], Further reductions of normal forms for dynamical systems, J. Differential Eqs. 166, 79–106.

Deprit, A. [1969], Canonical transformations depending on a small parameter, Celestial Mech. 1, 12–30.

Dulac, H. [1912], Solution d'un système d'équations différentielles dans le voisinage des valeurs singulières, Bull. Soc. Math. France 40, 324–383.

Elphick, C., Tirapegui, E., Brachet, M. E., Coullet, P. and Iooss, G. [1987], A simple global characterization for normal forms of singular vector fields, Physica D 29, 95–127.

Gaeta, G. [1999], Poincaré renormalized forms, Ann. Inst. Henri Poincaré 70, 461–514.

Guckenheimer, J. and P. Holmes [1986], Nonlinear oscillations, dynamical systems and bifurcations of vector fields, Springer-Verlag, New York, Berlin.

Hassard, B. and Y. H. Wan [1978], Bifurcation formulae derived from center manifold theory, J. Math. Anal. Appl. 63, 297–312.

Iooss, G. and M. Adelmeyer [1992], Topics in bifurcation theory and applications, World Scientific, Singapore.

Marsden, J. and M. McCracken [1976], The Hopf bifurcation and its applications, Springer-Verlag, New York, Berlin.

Scheurle, J. and J. E. Marsden [1984], Bifurcation to quasiperiodic tori in the interaction of steady state and Hopf bifurcation, SIAM J. Math. Anal. 15, 1055–1074.

Takens, F. [1974], Singularities of vector fields, Inst. Hautes Études Sci. Publ. Math. 43, 47–100.

Walcher, S. [1993], On transformations into normal form, J. Math. Anal. Appl. 180, 617–632.

Part IV

Geometric Mechanics

11 The Optimal Momentum Map

Juan-Pablo Ortega
Tudor S. Ratiu

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT The presence of symmetries in a Hamiltonian system usually implies the existence of conservation laws that are represented mathematically in terms of the dynamical preservation of the level sets of a momentum mapping. The symplectic or Marsden–Weinstein reduction procedure takes advantage of this and associates to the original system a new Hamiltonian system with fewer degrees of freedom. However, in a large number of situations, this standard approach does not work or is not efficient enough, in the sense that it does not use all the information encoded in the symmetry of the system. In this work, a new momentum map will be defined that is capable of overcoming most of the problems encountered in the traditional approach.

Contents

1 Introduction
2 Preliminaries and Technical Results
2.1 Generalized Distributions
2.2 Poisson Manifolds and Poisson Tensors
2.3 Proper Actions, Tubes, and Slices
3 A New Momentum Map and an Optimal Noether Theorem
3.1 The Optimal Momentum Map
3.2 The Optimal Momentum Map for Proper Globally Hamiltonian Actions
3.3 Universality properties of the optimal momentum map
4 Optimal Reduction
4.1 Reduction Lemmas
4.2 The Optimal Reduction Method
4.3 Comparison of the Optimal and the Marsden–Weinstein Reductions
References

1 Introduction

Let $(M, \omega, h)$ be a Hamiltonian system and $G$ be a Lie group with Lie algebra $\mathfrak{g}$, acting canonically on $M$; $\omega$ denotes the symplectic two-form on the phase space $M$ and $h : M \to \mathbb{R}$ is the Hamiltonian function. The triplet $(M, \omega, h)$ is called a $G$–Hamiltonian system, or one says that $(M, \omega, h)$ has symmetry $G$, if $h$ is a $G$–invariant function. The $G$-action on $M$ is said to be globally Hamiltonian if there exists a map $\mathbf{J} : M \to \mathfrak{g}^*$, equivariant with respect to the $G$-action on $M$ and the coadjoint action on the dual $\mathfrak{g}^*$ of the Lie algebra $\mathfrak{g}$, such that, for each $\xi \in \mathfrak{g}$ the vector field associated to the infinitesimal generator $\xi_M$ is Hamiltonian with Hamiltonian function $\mathbf{J}^\xi := \langle \mathbf{J}, \xi \rangle$ (the symbol $\langle \cdot, \cdot \rangle$ denotes the natural pairing of $\mathfrak{g}$ with its dual $\mathfrak{g}^*$). The map $\mathbf{J}$ is called the momentum map associated to the canonical $G$-action on $M$.

The main interest in finding the symmetries of a given system lies in the conservation laws associated to them provided by the following classical result due to E. Noether (see Noether [1918]).

1.1 Theorem (Noether). Let $(M, \omega, h)$ be a $G$–Hamiltonian system. If the $G$-action on $M$ is globally Hamiltonian with associated momentum map $\mathbf{J} : M \to \mathfrak{g}^*$, then $\mathbf{J}$ is a constant of the motion for $h$, that is:

$$\mathbf{J} \circ F_t = \mathbf{J},$$

where $F_t$ is the flow of $X_h$, the Hamiltonian vector field associated to $h$.

In other words, for each $\mu \in \mathfrak{g}^*_{\mathbf{J}} := \mathbf{J}(M)$, the (connected components of the) level set $\mathbf{J}^{-1}(\mu)$ is preserved by the dynamics induced by any $G$–invariant Hamiltonian. Notice that this allows us to look at $\mathfrak{g}^*_{\mathbf{J}}$ as a set of labels that index a family of sets that are invariant under the flows associated to $G$–invariant Hamiltonian functions.

The problem with this classical approach to the interplay between symmetries and conservation laws resides in the fact that in a number of important situations it cannot be implemented or, even if it can be implemented, it is grossly inefficient in the sense that the sets labeled by $\mathfrak{g}^*_{\mathbf{J}}$ are not the smallest subsets of $M$ preserved by $G$–invariant dynamics. The following situations exemplify these problems:

(i) The simplest situation in which the labeling by $\mathfrak{g}^*_{\mathbf{J}}$ is not optimal is when the level sets $\mathbf{J}^{-1}(\mu)$ are not connected. Notice that $G$–invariant dynamics preserves not only $\mathbf{J}^{-1}(\mu)$, that is, the sets labeled by $\mathfrak{g}^*_{\mathbf{J}}$, but also their connected components. Even though in several important situations (for instance, canonical representations of compact connected groups; Lerman [1995], Theorem 2.1) the level sets of the momentum map are connected, this is not the case in general.

(ii) Singularities and the law of conservation of the isotropy. Let $m \in M$ be such that $G_m \neq \{e\}$. If $\dim G_m > 0$ then $\mathbf{J}(m) = \mu \in \mathfrak{g}^*$ is a singular value of $\mathbf{J}$ since $\operatorname{range}(T_m \mathbf{J}) = (\mathfrak{g}_m)^\circ$, where $\mathfrak{g}_m \neq \{0\}$ is the Lie algebra of the isotropy subgroup $G_m$ of $m \in M$ and $(\mathfrak{g}_m)^\circ$ denotes the annihilator of $\mathfrak{g}_m$ in $\mathfrak{g}^*$. Suppose now that the $G$-action on $M$ is proper and denote $H := G_m$. In that situation, the connected components of the set $M_H = \{z \in M \mid G_z = H\}$ are embedded symplectic submanifolds of $(M, \omega)$. Moreover, given that the flow of any $G$–invariant Hamiltonian is $G$–equivariant, the connected components of $M_H$ are preserved by any of these flows (law of conservation of the isotropy), that is, in the presence of points with nontrivial symmetry, the smallest sets left invariant by $G$–invariant dynamics are not the connected components of $\mathbf{J}^{-1}(\mu)$ but rather those of $\mathbf{J}^{-1}(\mu) \cap M_H$.

(iii) Symmetries given by finite groups. This is an important particular case of (ii). Many relevant systems possess symmetries that are expressed through the canonical action of a finite group $G$ on $M$. Since in that case the Lie algebra $\mathfrak{g} = \{0\}$, the associated momentum map $\mathbf{J}$ is the constant map equal to zero. Therefore, in this scenario, the Noether Theorem is empty of content.

(iv) Symmetries without a momentum map. As the statement of Noether's Theorem implies, one needs to ensure a priori the existence of a momentum map which then gives the conserved quantities associated to the given canonical symmetry. However, even if the system possesses a canonical symmetry, the existence of a momentum map is by no means guaranteed. For example, let $M = S^1 \times S^1 = \mathbb{T}^2$ with the symplectic form $\omega = d\theta_1 \wedge d\theta_2$. Let $G = S^1$ act on $M$ by $e^{i\phi} \cdot (e^{i\theta_1}, e^{i\theta_2}) := (e^{i(\phi + \theta_1)}, e^{i\theta_2})$. A simple calculation shows (see Weinstein [1976]) that this action is canonical but that it does not admit an associated globally defined momentum map.

The conservation laws of a Hamiltonian system allow one to apply symplectic or Marsden–Weinstein reduction (Marsden and Weinstein [1974]; Meyer [1973]) in order to obtain a new system with fewer degrees of freedom. However, in some of the above mentioned problematic situations, performing reduction becomes either impossible or a task subject to arbitrary choices.

The main goal of this paper is the construction of a momentum map, which we will call the optimal momentum map, that does not suffer from the inconveniences pointed out in the classical approach. This mapping is always defined in the presence of a canonical symmetry, it uses the symmetries of the system in order to provide optimal conservation laws, and can be used, following the traditional Marsden–Weinstein scheme, to symplectically reduce the system in virtually every possible situation.

We emphasize that this paper is just an introduction to the optimal momentum map. Many of its properties are still being investigated. For instance, the convexity properties of its image will be presented in a future publication. Some natural questions about this object have been recently answered; for instance, how to carry out orbit reduction and reduction by stages in the optimal framework is already well understood (Marsden, Misiolek, Ortega, Perlmutter, and Ratiu [2001]; Ortega [2001a]). Incidentally, this shows how to perform standard Hamiltonian singular reduction by stages without using the so-called stages hypothesis.

The paper is organized as follows. Section 2 introduces some notations and technical results that will be needed later on. The definition of the optimal momentum map is given in Section 3. This then leads to a Noether Theorem valid also in the above mentioned problematic situations. Section 4 shows how the optimal momentum map can be used to perform Marsden–Weinstein reduction. In addition, it is shown that these new reduced spaces coincide with the usual ones, both in the regular (Marsden and Weinstein [1974]) and singular (Sjamaar and Lerman [1991]; Bates and Lerman [1997]; Ortega [1998]; Ortega and Ratiu [2002]) situations.

2 Preliminaries and Technical Results

2.1 Generalized Distributions

We begin by recalling some results on generalized distributions due to Stefan [1974a,b] and Sussman [1973]. The reader is also encouraged to consult the excellent review in Appendix 3 of the book by Libermann and Marle [1987].

2.1 Definition. Let $M$ be a differentiable manifold. A generalized distribution $D$ on $M$ is a subset of the tangent bundle $TM$ such that, for any point $m \in M$, the fiber $D_m = D \cap T_m M$ is a vector subspace of $T_m M$. The dimension of $D_m$ is called the rank of the distribution $D$ at $m$. A differentiable section of $D$ is a differentiable vector field $X$ defined on an open subset $U$ of $M$, such that for any point $z \in U$, $X(z) \in D_z$. An immersed connected submanifold $N$ of $M$ is said to be an integral manifold of the distribution $D$ if, for every $z \in N$, $T_z i(T_z N) \subset D_z$, where $i : N \to M$ is the canonical injection. The integral submanifold $N$ is said to be of maximal dimension at a point $z \in N$ if $T_z i(T_z N) = D_z$.

(i) The generalized distribution $D$ is differentiable if, for every point $z \in M$, and for every vector $v \in D_z$, there exists a differentiable section $X$ of $D$, defined on an open neighborhood $U$ of $z$, such that $X(z) = v$.

(ii) The generalized distribution $D$ is completely integrable if, for every point $z \in M$, there exists an integral manifold of $D$ everywhere of maximal dimension which contains $z$.

(iii) The generalized distribution $D$ is involutive if it is invariant under the (local) flows associated to differentiable sections of $D$.

Remark. Our definition of involutive distribution is more general than the traditional one which states that $D$ is involutive if $[X, Y]$ takes values in $D$ whenever $X$ and $Y$ are vector fields with values in $D$. The two concepts of involutivity are equivalent only when the dimension of $D_m$ is the same for any $m \in M$.

2.2 Theorem (Generalized Frobenius Theorem). A differentiable distribution $D$ on a manifold $M$ is completely integrable if and only if it is involutive.

Proof. See Stefan [1974a,b]; Sussman [1973].

In our discussion we will be interested in the specific case in which the generalized distribution is given by an everywhere defined family of vector fields, that is, there is a family of smooth vector fields $\mathcal{D}$ whose elements are vector fields $X$ defined on an open subset $\mathrm{Dom}(X) \subset M$ such that, for any $z \in M$, the generalized distribution $D$ is given by

$$D_z = \mathrm{span}\{X(z) \in T_zM \mid X \in \mathcal{D} \text{ and } z \in \mathrm{Dom}(X)\}.$$

Note that in such a case the distribution $D$ is differentiable by construction. We will say that $D$ is the generalized distribution spanned by $\mathcal{D}$. One of the reasons for our interest in this special case is that when these distributions are completely integrable, a very useful characterization of their integral manifolds can be given. In order to describe it we introduce some terminology following Libermann and Marle [1987]. Let $X$ be a vector field defined on an open subset $\mathrm{Dom}(X)$ of $M$ and $F_t$ be its flow. For any fixed $t \in \mathbb{R}$ the domain $\mathrm{Dom}(F_t)$ of $F_t$ is an open subset of $\mathrm{Dom}(X)$ such that $F_t : \mathrm{Dom}(F_t) \to \mathrm{Dom}(F_{-t})$ is a diffeomorphism. If $Y$ is a second vector field defined on the open set $\mathrm{Dom}(Y)$ with flow $G_t$ we can consider, for two fixed values $t_1, t_2 \in \mathbb{R}$, the composition of the two diffeomorphisms $F_{t_1} \circ G_{t_2}$ as defined on the open set $\mathrm{Dom}(G_{t_2}) \cap (G_{t_2})^{-1}(\mathrm{Dom}(F_{t_1}))$ (which may be empty). This prescription allows us to define inductively the composition of an arbitrary number of locally defined flows. We will be interested in the flows associated to the vector fields in $\mathcal{D}$ that define the distribution $D$. The following conventions will be used throughout the paper. Let $k \in \mathbb{N}^*$ be a positive natural number, $\mathcal{X} = (X_1, \ldots, X_k)$ be an ordered family of $k$ elements of $\mathcal{D}$, and $T = (t_1, \ldots, t_k) \in \mathbb{R}^k$ be a $k$-tuple such that $F^i_{t_i}$ denotes the (locally defined) flow of $X_i$ at time $t_i$, $i \in \{1, \ldots, k\}$. We will denote by $F_T$ the locally defined diffeomorphism

$$F_T = F^1_{t_1} \circ F^2_{t_2} \circ \cdots \circ F^k_{t_k}$$

constructed using the prescription given above. Any diffeomorphism from an open subset of $M$ onto another open subset of $M$ that is constructed in the same fashion as $F_T$ is said to be generated by the family $\mathcal{D}$. It can be proven that the composition of diffeomorphisms generated by $\mathcal{D}$ and the inverses of diffeomorphisms generated by $\mathcal{D}$ are themselves diffeomorphisms generated by $\mathcal{D}$ (Libermann and Marle [1987], Proposition 3.3, Appendix 3). In other words, the family of diffeomorphisms generated by $\mathcal{D}$ forms a pseudogroup of transformations (see page 74 of Paterson [1999]) that will be denoted by $G_D$. Two points $x$ and $y$ in $M$ are said to be $G_D$-equivalent if there exists a diffeomorphism $F_T \in G_D$ such that $F_T(x) = y$. The relation of being $G_D$-equivalent is an equivalence relation whose equivalence classes are called the $G_D$-orbits.

2.3 Theorem. Let $D$ be a differentiable generalized distribution on the smooth manifold $M$ spanned by a family of vector fields $\mathcal{D} \subset \mathfrak{X}(M)$. The following properties are equivalent:

(i) The distribution $D$ is invariant under the pseudogroup of transformations generated by $\mathcal{D}$, that is, for each $F_T \in G_D$ and for each $z \in M$ in the domain of $F_T$, $T_zF_T(D_z) = D_{F_T(z)}$.

(ii) The distribution $D$ is completely integrable and its integral manifolds are the $G_D$-orbits.

Proof. See Stefan [1974a,b] or Sussman [1973]. For a compact presentation, combine Theorems 3.9 and 3.10 in Appendix 3 of Libermann and Marle [1987].
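To see what the invariance condition (i) of Theorem 2.3 rules out, the following sketch works through a standard example that is not taken from the text. The function $\phi$ and the vector fields $X_1$, $X_2$ are illustrative choices made here, not objects used elsewhere in the paper.

```latex
% Requires amsmath. Illustrative example: M = R^2 with coordinates (x, y).
% phi is any smooth function with phi(x) = 0 for x <= 0 and phi(x) > 0 for x > 0,
% for instance phi(x) = exp(-1/x) for x > 0.
\begin{align*}
\mathcal{D} &= \{X_1, X_2\}, \qquad
X_1 = \frac{\partial}{\partial x}, \qquad
X_2 = \phi(x)\,\frac{\partial}{\partial y},\\[4pt]
D_{(x,y)} &=
\begin{cases}
\operatorname{span}\{\partial/\partial x\}, & x \le 0,\\
T_{(x,y)}\mathbb{R}^2, & x > 0.
\end{cases}
\end{align*}
% The flow of X_1 is a translation, so its tangent map is the identity:
\begin{align*}
F_t(x,y) &= (x + t,\, y), \qquad T_zF_t = \mathrm{id},\\
z = (-1, 0),\ t = 2:\quad
T_zF_t(D_z) &= \operatorname{span}\{\partial/\partial x\}
\;\neq\; D_{F_t(z)} = T_{(1,0)}\mathbb{R}^2 .
\end{align*}
% Hence condition (i) fails. Consistently with this, the only G_D-orbit is all of R^2
% (move right with X_1, adjust y with X_2, move back with X_1), and its dimension
% exceeds dim D_z = 1 at every point z with x <= 0, so (ii) fails as well.
```

The example is deliberately smooth: the failure comes from the jump in the dimension of $D_z$ along a flow line of $X_1$, not from any lack of differentiability.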

2.2 Poisson Manifolds and Poisson Tensors

A Poisson manifold is a pair (M, { · , · }), where M is a diﬀerentiable manifold and { · , · } is a bilinear operation on C ∞ (M ) such that (C ∞ (M ) , { · , · }) is a Lie algebra, and { · , · } is a derivation (that is, the Leibniz identity holds) in each argument; (C ∞ (M ) , { · , · }) is called a Poisson algebra. The derivation property of the Poisson bracket deﬁnes for each h ∈ C ∞ (M ) the Hamiltonian vector ﬁeld Xh by df (Xh ) = {f, h} for any f ∈ C ∞ (M ). Obviously, any Hamiltonian system on a symplectic manifold is a Poisson dynamical system relative to the Poisson bracket induced by the symplectic structure. The converse relation is given by the Symplectic Stratiﬁcation Theorem, (see Kirillov [1976], Weinstein [1983], Libermann and Marle [1987], or Marsden and Ratiu [1999]) which states that any Poisson manifold (M, { · , · }) is partitioned into symplectic leaves.

Each leaf is, by definition, the set of points that can be linked to a given one by a finite number of smooth curves, each of which is a piece of an integral curve of a locally defined Hamiltonian vector field. The symplectic leaves are connected immersed symplectic manifolds in $M$ (relative to the inclusion map), whose Poisson bracket coincides with that of $M$. The tangent space at $z \in M$ to a leaf consists of all vectors that are equal to the value of some Hamiltonian vector field at $z$. The symplectic leaves are invariant under the flow of any Hamiltonian vector field on $M$. The derivation property of the Poisson bracket implies that for any two functions $f, g \in C^\infty(M)$, the value of the bracket $\{f, g\}(z)$ at an arbitrary point $z \in M$ (and therefore $X_f(z)$ as well) depends on $f$ only through $df(z)$ (for a justification of this argument, see Abraham, Marsden, and Ratiu [1988], Theorem 4.2.16), which allows us to define a contravariant antisymmetric two-tensor $B \in \Lambda^2(T^*M)$ by

$$B(z)(\alpha_z, \beta_z) = \{f, g\}(z),$$

where $df(z) = \alpha_z$ and $dg(z) = \beta_z \in T_z^*M$. This tensor is called the Poisson tensor of $M$. We will denote by $B^\sharp : T^*M \to TM$ the vector bundle map associated to $B$, that is,

$$B(z)(\alpha_z, \beta_z) = \langle \alpha_z, B^\sharp(\beta_z) \rangle.$$

The Poisson tensor permits another formulation of the results regarding symplectic leaves in terms of the characteristic distribution. Given a Poisson manifold $(M, \{\cdot,\cdot\})$ with associated Poisson tensor $B$, the characteristic distribution $D$ on $M$ is defined by $D := B^\sharp(T^*M)$. It can be proven (Libermann and Marle [1987], Theorem 12.1, Chapter III) that the characteristic distribution $D$ is differentiable, completely integrable, and that its integral manifolds are the symplectic leaves of $(M, \{\cdot,\cdot\})$.

2.4 Proposition. Let $(M, \omega)$ be a symplectic manifold and $B \in \Lambda^2(T^*M)$ be the associated Poisson tensor. Then for any $m \in M$ and any vector subspace $V \subset T_mM$,

$$V^\omega = B^\sharp(m)(V^\circ),$$

where $V^\omega := \{v \in T_mM \mid \omega(m)(u, v) = 0 \text{ for all } u \in V\}$ is the $\omega$-orthogonal complement of $V$ in $T_mM$.

Proof. Let $\alpha_m \in V^\circ$ and $f \in C^\infty(M)$ be such that $df(m) = \alpha_m$. Then, for any $v \in V$:

$$\omega(m)\big(B^\sharp(m)(\alpha_m), v\big) = \omega(m)(X_f(m), v) = df(m) \cdot v = \langle \alpha_m, v \rangle = 0,$$

which proves that $B^\sharp(m)(V^\circ) \subset V^\omega$. Given that in the symplectic case $B^\sharp(m)$ is an isomorphism for all $m \in M$, a dimension count concludes the proof.
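Proposition 2.4 can be checked by hand in the simplest symplectic manifold. The sketch below is an illustration added here, assuming the convention $\omega(X_f, v) = df \cdot v$ used in the proof above; the choice of $M = \mathbb{R}^2$, $\omega = dx \wedge dy$, and $V$ is hypothetical and serves only as an example.

```latex
% Requires amsmath. Illustrative check of V^omega = B^sharp(V^circ)
% on M = R^2 with omega = dx ^ dy and V = span{d/dx} in T_m R^2.
\begin{align*}
\omega(X_f, v) = df\cdot v
\quad&\Longrightarrow\quad
X_f = \frac{\partial f}{\partial y}\frac{\partial}{\partial x}
    - \frac{\partial f}{\partial x}\frac{\partial}{\partial y},\\[4pt]
V^\circ &= \{\alpha \in T_m^*\mathbb{R}^2 \mid \alpha(\partial/\partial x) = 0\}
         = \operatorname{span}\{dy\},\\
V^\omega &= \Big\{ a\tfrac{\partial}{\partial x} + b\tfrac{\partial}{\partial y}
            \;\Big|\; \omega\big(\tfrac{\partial}{\partial x},\,
            a\tfrac{\partial}{\partial x} + b\tfrac{\partial}{\partial y}\big) = b = 0 \Big\}
          = \operatorname{span}\{\partial/\partial x\},\\
B^\sharp(m)(dy) &= X_y(m) = \frac{\partial}{\partial x}
\quad\Longrightarrow\quad
B^\sharp(m)(V^\circ) = \operatorname{span}\{\partial/\partial x\} = V^\omega .
\end{align*}
```

Here $V$ is a Lagrangian line, so $V^\omega = V$; the same computation with $V = T_m\mathbb{R}^2$ gives $V^\circ = \{0\}$ and $V^\omega = \{0\}$.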

2.5 Definition. A smooth mapping $\varphi : M_1 \to M_2$ between the two Poisson manifolds $(M_1, \{\cdot,\cdot\}_1)$ and $(M_2, \{\cdot,\cdot\}_2)$ is called canonical or Poisson if for all $g, h \in C^\infty(M_2)$ we have

$$\varphi^*\{g, h\}_2 = \{\varphi^*g, \varphi^*h\}_1.$$

For future reference we state a result whose proof can be found, for instance, in Marsden and Ratiu [1999], Proposition 10.3.2.

2.6 Proposition. Let $\varphi : M_1 \to M_2$ be a smooth map between two Poisson manifolds $(M_1, \{\cdot,\cdot\}_1)$ and $(M_2, \{\cdot,\cdot\}_2)$. Then $\varphi$ is a Poisson map if and only if

$$T\varphi \circ X_{h \circ \varphi} = X_h \circ \varphi$$

for any $h \in C^\infty(M_2)$. In particular, if $h \in C^\infty(M_2)$, $F^2_t$ is the flow of $X_h$, and $F^1_t$ is the flow of $X_{h \circ \varphi}$, then $F^2_t \circ \varphi = \varphi \circ F^1_t$.

2.3 Proper Actions, Tubes, and Slices

The following definitions and results are standard in Lie theory (see Bredon [1972]; Palais [1961]). In what follows we will deal mostly with proper actions. Recall that the action $\Phi : G \times M \to M$ is called proper if for any two convergent sequences $\{m_n\}$ and $\{g_n \cdot m_n\}$ in $M$, there exists a convergent subsequence $\{g_{n_k}\}$ in $G$. Proper actions have compact isotropy subgroups and Hausdorff orbit spaces.

2.7 Definition. Let $G$ be a Lie group and $H \subset G$ be a subgroup. Suppose that $H$ acts on the left on the manifold $A$. The twist action of $H$ on the product $G \times A$ is defined by

$$h \cdot (g, a) = (gh, h^{-1} \cdot a).$$

Note that this action is free and proper by the freeness and properness of the action on the $G$-factor. The twisted product $G \times_H A$ is defined as the orbit space corresponding to the twist action.

2.8 Proposition. The twisted product $G \times_H A$ is a $G$-space relative to the left action defined by $g' \cdot [g, a] := [g'g, a]$. If the action of $H$ on $A$ is proper, the $G$-action on $G \times_H A$ just defined is proper.

2.9 Definition. Let $M$ be a manifold and $G$ be a Lie group acting on $M$. Let $m \in M$ and denote $H := G_m$. A tube about the orbit $G \cdot m$ is a $G$-equivariant diffeomorphism

$$\varphi : G \times_H A \to U,$$

where $U$ is a $G$-invariant neighborhood of $G \cdot m$ in $M$ and $A$ is some open ball centered at the origin in a representation space of $H$.

Note that if the $G$-action on $M$ is proper then the $G$-action on any tube $G \times_H A$ is also proper since the isotropy subgroup $H$ is compact and, consequently, its action on $A$ is proper. Proposition 2.8 guarantees the claim.

2.10 Definition. Let $M$ be a manifold and $G$ be a Lie group acting properly on $M$. Let $m \in M$ and denote $H := G_m$. Let $S$ be a submanifold of $M$ such that $m \in S$ and $H \cdot S = S$. We say that $S$ is a slice at $m$ if the map

$$G \times_H S \to U, \qquad [g, s] \mapsto g \cdot s$$

is a tube about $G \cdot m$, for some $G$-invariant open neighborhood $U$ of $G \cdot m$.

2.11 Theorem. Let $M$ be a manifold and $G$ be a Lie group acting properly on $M$. Let $m \in M$, denote $H := G_m$, and let $S$ be a submanifold of $M$ such that $m \in S$. Then the following statements are equivalent:

(i) There is a tube $\varphi : G \times_H A \to U$ about $G \cdot m$ such that $\varphi([e, A]) = S$.

(ii) $S$ is a slice at $m$.

(iii) $G \cdot S$ is an open neighborhood of $G \cdot m$ and there is an equivariant retraction $r : G \cdot S \to G \cdot m$ such that $r^{-1}(m) = S$.

The ball $A$ appearing in (i) can always be chosen to be an $H$-invariant neighborhood of $0$ in the vector space $T_mM/T_m(G \cdot m)$, where $H$ acts linearly and orthogonally by $h \cdot [v] = [h \cdot v]$.

2.12 Theorem (Slice Theorem). Let $M$ be a manifold and $G$ be a Lie group acting properly on $M$. Then there is a slice for the $G$-action at any $m \in M$.

As a first consequence of the Slice Theorem we have the following result that will be used in the sequel.

2.13 Proposition. Let $G$ be a Lie group acting properly on the manifold $M$, $U$ an open $G$-invariant subset of $M$, and $f \in C^\infty(U)^G$. Then for any $z \in U$ there exist a $G$-invariant open neighborhood $V \subset U$ of $z$ and a $G$-invariant smooth function $F \in C^\infty(M)^G$ such that $f|_V = F|_V$.

Proof. Let $U_1 \subset U$ be an open $G$-invariant neighborhood of $z$ that by the Slice Theorem can be modeled by the tube $U_1 \simeq G \times_{G_z} A_r$, where $A_r$ is the open ball of radius $r$ in the vector space $T_zM/T_z(G \cdot z)$ on which $G_z$ acts orthogonally. Define $g : A_r \to \mathbb{R}$ as the smooth $G_z$-invariant function given by $g(v) := f([e, v])$.
Since $G_z$ is compact, there exists a $G_z$-invariant bump function $\phi : A_r \to [0, 1]$ such that

$$\phi|_{A_{r/2}} = 1 \quad\text{and}\quad \phi|_{A_r \setminus A_{3r/4}} = 0,$$

where $A_s \subset A_r$ denotes the ball of radius $s$ centered at the origin. Define $f' \in C^\infty(U_1)^G$ by $f'([h, v]) := \phi(v)g(v)$, for any $h \in G$ and $v \in A_r$. Since $f'$ and all its derivatives vanish on the boundary of $U_1$, its extension off $U_1$ by the identically zero function yields $F \in C^\infty(M)^G$. Take $V \simeq G \times_{G_z} A_{r/2}$. It is clear that $F|_V = f'|_V = f|_V$.

The following result, proved for the first time in Ortega [1998], will be of great importance in the sequel.

2.14 Proposition. Let $G$ be a Lie group acting properly on the smooth manifold $M$. Let $m \in M$ be a point with isotropy subgroup $H := G_m$. Then

$$\big((T_m(G \cdot m))^\circ\big)^H = \mathrm{span}\{df(m) \mid f \in C^\infty(M)^G\}.$$

Proof. We first show that $df(m) \in \big((T_m(G \cdot m))^\circ\big)^H$ for $f \in C^\infty(M)^G$. It is clear that for any $\xi \in \mathfrak{g}$,

$$\langle df(m), \xi_M(m) \rangle = \frac{d}{dt}\Big|_{t=0} f(\exp t\xi \cdot m) = \frac{d}{dt}\Big|_{t=0} f(m) = 0.$$

Hence, $df(m) \in (T_m(G \cdot m))^\circ$. Now, $df(m)$ is also $H$-fixed since for any $h \in H$ and any $v = \frac{d}{dt}\big|_{t=0} m(t) \in T_mM$ with $m(0) = m$,

$$\langle h \cdot df(m), v \rangle = \langle df(m), h^{-1} \cdot v \rangle = \frac{d}{dt}\Big|_{t=0} f\big(h^{-1} \cdot m(t)\big) = \frac{d}{dt}\Big|_{t=0} f\big(m(t)\big) = \langle df(m), v \rangle.$$

Since the vector $v$ is arbitrary, $h \cdot df(m) = df(m)$, as required. Since we are going to work locally, we prove the converse inclusion in the tubular model provided by the Slice Theorem. Thus, the manifold $M$ will be replaced by $G \times_H V$, where $V = T_mM/T_m(G \cdot m)$ and the point $m \in M$ is represented by $[e, 0] \in G \times_H V$. It is easy to verify that

$$T_{[e,0]}(G \cdot [e, 0]) = \{T_{(e,0)}\pi(\xi, 0) \in T_{[e,0]}(G \times_H V) \mid \xi \in \mathfrak{g}\} \cong \mathfrak{g}/\mathfrak{h} \times \{0\},$$

where $\pi : G \times V \to G \times_H V$ is the canonical projection. Clearly,

$$\big(T_{[e,0]}(G \cdot [e, 0])\big)^\circ \cong \{0\} \times V^* \cong V^*. \qquad (2.1)$$

At this point we introduce the following

2.15 Lemma.¹ Let $H$ be a compact Lie group acting linearly on the vector space $V$, as well as on its dual $V^*$ via the associated contragredient representation. Then, the restriction to $(V^*)^H$ of the dual map associated to the inclusion $i_{V^H} : V^H \to V$ is an $H$-equivariant isomorphism from $(V^*)^H$ to $(V^H)^*$.

¹ We thank Tanya Schmah for her quick proof of this lemma.

Proof. Note that for any $\beta \in V^*$, $i^*_{V^H}(\beta) = \beta|_{V^H}$. Take an $H$-invariant inner product $\langle\langle \cdot, \cdot \rangle\rangle$ on $V$; this is always available by the compactness of $H$. Now let $W$ be the $H$-invariant orthogonal complement to $V^H$ with respect to $\langle\langle \cdot, \cdot \rangle\rangle$. Any element $\alpha \in (V^H)^*$ can be extended to $\beta \in V^*$ by setting $\beta|_W = 0$. Moreover, note that $i^*_{V^H}(\beta) = \beta|_{V^H} = \alpha$ and also, for any $v \in V^H$, $w \in W$, and $h \in H$, we have

$$\langle h \cdot \beta, v + w \rangle = \langle \beta, h^{-1} \cdot (v + w) \rangle = \langle \beta, v + h^{-1} \cdot w \rangle = \langle \beta, v + w \rangle,$$

since both $w$ and $h^{-1} \cdot w$ are in $W$. This implies that $\beta \in (V^*)^H$, and hence $i^*_{V^H}|_{(V^*)^H}$ is surjective. For injectivity, suppose $\beta|_{V^H} = 0$ for some $\beta \in (V^*)^H$. Let $v \in V$ be such that $\langle \beta, w \rangle = \langle\langle v, w \rangle\rangle$ for all $w \in V$. Then, for any $h \in H$ and any $w \in V$ we have that

$$\langle\langle h \cdot v, w \rangle\rangle = \langle\langle v, h^{-1} \cdot w \rangle\rangle = \langle \beta, h^{-1} \cdot w \rangle = \langle h \cdot \beta, w \rangle = \langle \beta, w \rangle = \langle\langle v, w \rangle\rangle,$$

which, by the nondegeneracy of the inner product, implies that $h \cdot v = v$ and hence $v \in V^H$. But $\beta|_{V^H} = 0$, so $\langle\langle v, v \rangle\rangle = \langle \beta, v \rangle = 0$, which implies that $v = 0$, which in turn implies $\beta = 0$. Hence $i^*_{V^H}|_{(V^*)^H}$ is an isomorphism. The $H$-equivariance of $i^*_{V^H}|_{(V^*)^H}$ follows trivially from the following chain of equalities, which are satisfied for any $h \in H$, $\beta \in (V^*)^H$, and $v \in V^H$:

$$\langle h \cdot i^*_{V^H}(\beta), v \rangle = \langle i^*_{V^H}(\beta), h^{-1} \cdot v \rangle = \langle \beta, h^{-1} \cdot v \rangle = \langle \beta, v \rangle = \langle i^*_{V^H}(\beta), v \rangle.$$

Now, using Lemma 2.15 in (2.1) we get

$$\big((T_{[e,0]}(G \cdot [e, 0]))^\circ\big)^H \cong (V^*)^H \cong (V^H)^*.$$

In the tubular model, the $G$-invariant functions $f \in C^\infty(G \times_H V)^G$ are characterized by the condition $f \circ \pi \in C^\infty(V)^H$. The claim then follows if we show that

$$(V^*)^H = \{dg(0) \in V^* \mid g \in C^\infty(V)^H\}.$$

Let $g \in C^\infty(V)^H$ and $h \in H$ be arbitrary. Then, for any $v = \frac{d}{dt}\big|_{t=0} c(t) \in V$ with $c(0) = 0$, we have

$$\langle h \cdot dg(0), v \rangle = \langle dg(0), h^{-1} \cdot v \rangle = \frac{d}{dt}\Big|_{t=0} g\big(h^{-1} \cdot c(t)\big) = \frac{d}{dt}\Big|_{t=0} g\big(c(t)\big) = \langle dg(0), v \rangle.$$

Since $v \in V$ is arbitrary, it follows that $h \cdot dg(0) = dg(0)$. To prove the converse, we begin by decomposing $V$ into its irreducible $H$-components:

$$V = W_1 \oplus \cdots \oplus W_k \oplus U_1 \oplus \cdots \oplus U_r,$$

where $\dim W_1 = \cdots = \dim W_k = 1$, and $\dim U_i > 1$ for $i \in \{1, \ldots, r\}$. Thus,

$$V^H = W_1 \oplus \cdots \oplus W_k. \qquad (2.2)$$

Let $\{w_1, \ldots, w_k\}$ be a basis of $V^H$ adapted to the splitting (2.2). Define $\pi_1, \ldots, \pi_k \in V^*$ by

$$\pi_i(w_j) = \delta_{ij}, \quad i, j \in \{1, \ldots, k\}, \qquad \pi_i|_{U_p} = 0, \quad i \in \{1, \ldots, k\},\ p \in \{1, \ldots, r\}.$$

By construction, the functionals $\pi_1, \ldots, \pi_k \in V^*$ are linear invariants of the $H$-action on $V$. Moreover, they are the only ones. Indeed, since $\pi_1, \ldots, \pi_k$ is a basis of $(V^H)^*$, there are no additional independent linear invariants on $V^H$. If $\alpha : U_1 \oplus \cdots \oplus U_r \to \mathbb{R}$ is another nontrivial linear invariant, there is some $p \in \{1, \ldots, r\}$ such that $\alpha|_{U_p}$ is not the zero functional. Therefore, $\ker(\alpha|_{U_p}) \neq \{0\}$ is a nontrivial $H$-invariant subspace of $U_p$. Since this is impossible by the irreducibility of $U_p$, it follows that such an $\alpha$ cannot exist. We have thus shown that $\pi_1, \ldots, \pi_k \in V^*$ or, in general, any basis of $(V^H)^*$, span the set of all linear invariants of the $H$-action on $V$. By the Hilbert Theorem, the ring of $H$-invariant polynomials on $V$ is finitely generated. We complete the set $\{\pi_1, \ldots, \pi_k\}$ to a generating system $\{\pi_1, \ldots, \pi_k, \pi_{k+1}, \ldots, \pi_q\}$ of this ring. The Schwarz Theorem (Schwarz [1974]; Mather [1977]) guarantees that every $H$-invariant function $f \in C^\infty(V)^H$ can be locally written as

$$f = g(\pi_1, \ldots, \pi_q), \quad\text{with } g \in C^\infty(\mathbb{R}^q).$$

Let now $\alpha \in (V^*)^H \cong (V^H)^*$ be arbitrary. The form $\alpha \in (V^H)^*$ can be expanded as $\alpha = \alpha_1\pi_1 + \cdots + \alpha_k\pi_k$, with $\alpha_1, \ldots, \alpha_k \in \mathbb{R}$. Let $g \in C^\infty(\mathbb{R}^q)$ be such that

$$\frac{\partial g}{\partial \pi_i}(0) = \alpha_i, \quad i \in \{1, \ldots, k\}.$$

With this choice, the function $f := g(\pi_1, \ldots, \pi_q)$ belongs to $C^\infty(V)^H$ and satisfies

$$df(0) = \frac{\partial g}{\partial \pi_1}(0)\,\pi_1 + \cdots + \frac{\partial g}{\partial \pi_k}(0)\,\pi_k = \alpha_1\pi_1 + \cdots + \alpha_k\pi_k = \alpha,$$

where we used that $d\pi_j(0) = 0$ for $j \in \{k+1, \ldots, q\}$ because the invariants $\pi_j$ in this range of the indices are at least quadratic. Since $\alpha$ is arbitrary, the result follows.

Remark. The properness condition in the statement of the previous proposition is essential (and is not tied to the existence of slices) since there are examples of non-proper actions where this result does not hold. Indeed, consider the irrational flow on the torus. Since the orbits of this action fill the torus densely, the only invariant functions in this particular case are the constant functions. Hence the right-hand side of the equality in Proposition 2.14 is trivial. However, if the torus in question is bigger than one-dimensional, the vector space $(T_m(G \cdot m))^\circ$ is nontrivial.

We now recall a few facts that we will need later on about the interplay between group actions and symplectic and Poisson structures.

2.16 Definition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold (resp. $(M, \omega)$ a symplectic manifold), let $G$ be a Lie group, and let $\Phi : G \times M \to M$ be a smooth left action of $G$ on $M$. We say that the action $\Phi$ is canonical if $\Phi$ acts by canonical transformations; that is, for any $f, h \in C^\infty(M)$ and any $g \in G$,

$$\Phi_g^*\{f, h\} = \{\Phi_g^*f, \Phi_g^*h\} \qquad (\text{resp. } \Phi_g^*\omega = \omega).$$

2.17 Proposition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and let $B \in \Lambda^2(T^*M)$ be the associated Poisson tensor. Let $G$ be a Lie group acting canonically on $M$. Then, for any $m \in M$ such that $G_m =: H$ and any vector subspace $V \subset T_m^*M$:

(i) $B^\sharp(m) : T_m^*M \to T_mM$ is $H$-equivariant.

(ii) If the Poisson bracket $\{\cdot,\cdot\}$ is induced by a symplectic form $\omega$ then

$$B^\sharp(m)(V^H) = \big(B^\sharp(m)(V)\big)^H,$$

where the $H$-superscript denotes the set of $H$-fixed points in the corresponding spaces.

Proof. Part (i) is a trivial consequence of the canonical character of the action. Part (ii) follows from Part (i) and the nondegeneracy of $B^\sharp(m)$ in the symplectic case.
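The equality in Proposition 2.14 can be made concrete in a small example. The following sketch is an illustration added here and not part of the original text: it assumes the rotation action of $G = SO(2)$ on $M = \mathbb{R}^2$ (proper, since $G$ is compact) and uses the Schwarz Theorem quoted above to write every smooth invariant as $f = g(x^2 + y^2)$.

```latex
% Requires amsmath. Illustrative check of
%   ((T_m(G.m))^circ)^H = span{ df(m) | f in C^infty(M)^G }
% for G = SO(2) acting on R^2 by rotations.
\begin{align*}
f(x,y) &= g(x^2 + y^2), \qquad
df(x,y) = 2\,g'(x^2+y^2)\,(x\,dx + y\,dy).\\[6pt]
\intertext{Case $m = (x_0, y_0) \neq (0,0)$, so $H = G_m = \{e\}$:}
T_m(G\cdot m) &= \operatorname{span}\Big\{-y_0\tfrac{\partial}{\partial x}
                 + x_0\tfrac{\partial}{\partial y}\Big\},\\
\big((T_m(G\cdot m))^\circ\big)^H &= (T_m(G\cdot m))^\circ
   = \operatorname{span}\{x_0\,dx + y_0\,dy\}
   = \operatorname{span}\{df(m)\}.\\[6pt]
\intertext{Case $m = (0,0)$, so $H = G_m = SO(2)$:}
T_m(G\cdot m) &= \{0\}, \qquad (T_m(G\cdot m))^\circ = (\mathbb{R}^2)^*,\\
\big((\mathbb{R}^2)^*\big)^{SO(2)} &= \{0\}
   = \operatorname{span}\{df(0)\} \quad\text{since } df(0) = 0 .
\end{align*}
```

At the origin the annihilator alone would be too large; taking $H$-fixed covectors is what restores the equality.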

3 A New Momentum Map and an Optimal Noether Theorem

In this section we introduce the main ideas of the paper.

3.1 The Optimal Momentum Map

3.1 Definition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $G$ be a Lie group acting canonically on $M$, $U$ be a $G$-invariant open subset of $M$, and let $C^\infty(M)^G$ (respectively $C^\infty(U)^G$) be the set of $G$-invariant functions on $M$ (respectively, $G$-invariant functions on $U$). Let $\mathcal{E}$ be the set of Hamiltonian vector fields associated to all the elements of $C^\infty(U)^G$, for all the open $G$-invariant subsets $U$ of $M$, that is,

$$\mathcal{E} = \{X_f \mid f \in C^\infty(U)^G, \text{ with } U \subset M \text{ open and } G\text{-invariant}\},$$

and $E$ be the smooth generalized distribution on $M$ spanned by $\mathcal{E}$. We will call $E$ the $G$-characteristic distribution.

Remark. If the $G$-action on $M$ is proper, the definition of the distribution $E$ admits some simplification. Indeed, by definition, for any $m \in M$, there is an $r \in \mathbb{N}$ such that $E(m) = \mathrm{span}\{X_{f_1}(m), \ldots, X_{f_r}(m)\}$, where $f_i \in C^\infty(U_i)^G$ for any $i \in \{1, \ldots, r\}$. By Proposition 2.13, for each $i \in \{1, \ldots, r\}$ there exists an open $G$-invariant subset $V_i \subset U_i$ containing the point $m$ and a function $F_i \in C^\infty(M)^G$ such that $f_i|_{V_i} = F_i|_{V_i}$. Consequently,

$$E(m) = \mathrm{span}\{X_{f_1}(m), \ldots, X_{f_r}(m)\} = \mathrm{span}\{X_{F_1}(m), \ldots, X_{F_r}(m)\}.$$

This proves that the family of vector fields

$$\mathcal{E} = \{X_f \mid f \in C^\infty(M)^G\} \qquad (3.1)$$

spans the distribution $E$.

3.2 Definition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $D \subset TM$ be a smooth generalized distribution on $M$. The distribution $D$ is called Poisson or canonical if the condition $df|_D = dg|_D = 0$ for $f, g \in C^\infty(M)$ implies that $d\{f, g\}|_D = 0$.

3.3 Proposition. The $G$-characteristic distribution $E$ is smooth, completely integrable, and Poisson. Its integral manifolds are given by the orbits of the pseudogroup $G_E$ of local diffeomorphisms generated by $\mathcal{E}$.

Proof. The generalized distribution $E$ is smooth since it is spanned by all vector fields in $\mathcal{E}$. We will prove its complete integrability by using Theorem 2.3 which, at the same time, provides us with the characterization of the integral manifolds in terms of the $G_E$-orbits. So, let $m \in M$ and, for simplicity in the exposition, take $F_T \in G_E$ to be the time-$T$ map of the flow $F_t$ of $X_f$, with $f \in C^\infty(U)^G$ and $U$ an open $G$-invariant neighborhood of $m$ (the general case in which $F_T$ is the composition of a finite number of flows follows from what we are going to do by a straightforward induction argument). Recall that the $G$-invariance of $f$ implies that $X_f$ and its flow $F_t$ are $G$-equivariant and consequently $\mathrm{Dom}(F_T)$ is a $G$-invariant open subset of $U$. The theorem follows if we are able to show that $T_mF_T(E(m)) = E(F_T(m))$. Let $X_g(m) \in E(m)$, with $g \in C^\infty(W)^G$, for $W$ an open $G$-invariant subset of $\mathrm{Dom}(F_T)$. Since any Hamiltonian flow is always a Poisson map, by Proposition 2.6 we have that

$$T_mF_T\big(X_g(m)\big) = T_mF_T\big(X_{g \circ F_{-T} \circ F_T}(m)\big) = X_{g \circ F_{-T}}\big(F_T(m)\big) \in E(F_T(m)),$$

since $g \circ F_{-T} \in C^\infty(F_T(W))^G$ by the $G$-equivariance of $F_T$ and the $G$-invariance of $g$. This implies that $T_mF_T(E(m)) \subset E(F_T(m))$. Conversely, let $X_g(F_T(m)) \in E(F_T(m))$. Again, by using Proposition 2.6,

$$X_g\big(F_T(m)\big) = T_mF_T\big(X_{g \circ F_T}(m)\big),$$

which, by the equivariance of $F_T$, concludes the proof of the integrability of $E$ (for simplicity we omitted straightforward domain issues). As to $E$ being canonical, let $f, g \in C^\infty(M)$ be such that $df|_E = dg|_E = 0$. Consider $m \in M$ and let $h \in C^\infty(U)^G$ be arbitrary, with $U$ an open $G$-invariant neighborhood of $m$. Then,

$$d\{f, g\}(m) \cdot X_h(m) = X_h[\{f, g\}](m) = \{\{f, g\}, h\}(m) = -\{\{h, f\}, g\}(m) - \{\{g, h\}, f\}(m) = \{X_h[f], g\}(m) - \{X_h[g], f\}(m) = 0,$$

as required.

3.4 Definition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $G$ be a Lie group acting canonically on $M$, and $E$ the associated integrable $G$-characteristic distribution. Let $\mathcal{J} : M \to M/G_E$ be the canonical projection of $M$ onto the $G_E$-orbit space. We will call $\mathcal{J}$ the optimal momentum map of the canonical $G$-action on $(M, \{\cdot,\cdot\})$. We will refer to $M/E := M/G_E$ as the momentum space.

A straightforward consequence of the previous definition is the following

3.5 Theorem (Optimal Noether Theorem). Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold, $G$ be a Lie group acting canonically on $M$, $E$ be the associated integrable $G$-characteristic distribution, and $\mathcal{J} : M \to M/G_E$ be the optimal momentum map. Then $\mathcal{J}$ is a constant of the motion for the dynamics generated by any $G$-invariant Hamiltonian $h$; that is, $\mathcal{J} \circ F_t = \mathcal{J}$, where $F_t$ is the flow of $X_h$.

Remark. By the very construction of $\mathcal{J}$, its level sets are the smallest immersed submanifolds (actually, we will see that under certain hypotheses they are embedded) respected by all $G$-equivariant Hamiltonian dynamics. This justifies the use of optimal in its denomination. Notice that in contrast with the ordinary momentum map, $\mathcal{J}$ is always defined, which solves some of the problems of the traditional approach to the study of symmetries pointed out in the introduction. In particular, we have the following examples.

Example. Conservation laws without momentum maps: Consider the example presented in the introduction consisting of a canonical symmetry to which it is impossible to associate a globally defined momentum map. Let $M = S^1 \times S^1 = \mathbb{T}^2$ be the two-torus with the symplectic form $\omega = d\theta_1 \wedge d\theta_2$. Consider $G = S^1$ acting on $M$ by

$$e^{i\phi} \cdot (e^{i\theta_1}, e^{i\theta_2}) := (e^{i(\phi + \theta_1)}, e^{i\theta_2}).$$

In order to compute the optimal momentum map $\mathcal{J}$, the first ingredient that we need is the $S^1$-characteristic distribution $E$. It is easy to see that in this case every $S^1$-invariant function $f \in C^\infty(\mathbb{T}^2)^{S^1}$ can be written as $f(e^{i\theta_1}, e^{i\theta_2}) = g(e^{i\theta_2})$, with $g \in C^\infty(S^1)$. Its associated Hamiltonian vector field is given by

$$X_f = \frac{\partial g}{\partial \theta_2}\frac{\partial}{\partial \theta_1}.$$

Since $g$ is an arbitrary function on the circle, we can identify $M/E$ with the second circle $S^1$ in the torus $\mathbb{T}^2$. The optimal momentum map is therefore given by the expression:

$$\mathcal{J} : \mathbb{T}^2 \to S^1, \qquad (e^{i\theta_1}, e^{i\theta_2}) \mapsto e^{i\theta_2}.$$

It is a remarkable fact that in this case the optimal momentum map is $S^1$-valued and moreover, it coincides with the Lie group valued momentum map introduced by McDuff [1988], Weitsman [1993], and Alekseev, Malkin, and Meinrenken [1997] that one would obtain by considering our example as a quasi-Hamiltonian $S^1$-space (see the prior references for an explanation of this term).
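The computation behind the torus example can be spelled out as follows. This is a sketch added for illustration, assuming the convention $\mathbf{i}_{X_f}\omega = df$ (the one used in the proof of Proposition 2.4); with the opposite sign convention every vector field below changes sign, but the orbits and the conclusions do not.

```latex
% Requires amsmath. Torus example: omega = d(theta_1) ^ d(theta_2),
% invariant functions f(theta_1, theta_2) = g(theta_2).
\begin{align*}
\mathbf{i}_{X_f}\omega = df = g'(\theta_2)\,d\theta_2
\quad&\Longrightarrow\quad
X_f = g'(\theta_2)\,\frac{\partial}{\partial\theta_1},\\
F_t(\theta_1, \theta_2) &= \big(\theta_1 + t\,g'(\theta_2),\ \theta_2\big)
\quad\text{(flow of } X_f\text{)}.
\end{align*}
% Taking g = sin(theta_2) and g = -cos(theta_2), whose derivatives never vanish
% simultaneously, shows that E is spanned by d/d(theta_1) at every point:
\begin{align*}
E(\theta_1,\theta_2) &= \operatorname{span}\Big\{\frac{\partial}{\partial\theta_1}\Big\},
\qquad
G_E\cdot(\theta_1,\theta_2) = S^1\times\{e^{i\theta_2}\}.
\end{align*}
% No standard momentum map exists: the infinitesimal generator of the action is
% d/d(theta_1), and a momentum function J would have to satisfy
\begin{align*}
dJ = \mathbf{i}_{\partial/\partial\theta_1}\,\omega = d\theta_2 ,
\end{align*}
% which is impossible because d(theta_2) is closed but not exact on the torus.
```

The $G_E$-orbits are thus exactly the circles $\theta_2 = \text{const}$, which is why the momentum space is the second circle factor.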

Example. The optimal momentum map of a Poisson non-Hamiltonian action: The previous example needed the introduction of group valued momentum maps in order to encode the conservation laws associated to the symmetries of the problem. We now give an example where even such momentum maps are not available. Nevertheless we will see that the optimal momentum map can carry out that job. Let $(\mathbb{R}^3, \{\cdot,\cdot\})$ be the Poisson manifold formed by the Euclidean three-dimensional space $\mathbb{R}^3$ together with the Poisson structure induced by the Poisson tensor $B$ that in Euclidean coordinates takes the form:

$$B = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}.$$

With this Poisson bracket, the Hamiltonian vector field $X_f$ associated to any smooth function $f \in C^\infty(\mathbb{R}^3)$ is given by

$$X_f(x, y, z) = \frac{\partial f}{\partial y}\frac{\partial}{\partial x} + \left(\frac{\partial f}{\partial z} - \frac{\partial f}{\partial x}\right)\frac{\partial}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial}{\partial z}. \qquad (3.2)$$

Consider the action of the additive group $(\mathbb{R}, +)$ on $\mathbb{R}^3$ given by $\lambda \cdot (x, y, z) := (x + \lambda, y, z)$ for any $\lambda \in \mathbb{R}$ and any $(x, y, z) \in \mathbb{R}^3$. In view of (3.2), it is clear that this action does not have a standard associated momentum map. Nevertheless, it is a Poisson action and therefore we can construct an optimal momentum map for it. Indeed, notice first that the invariant functions $f \in C^\infty(\mathbb{R}^3)^{\mathbb{R}}$ are all of the form $f(x, y, z) \equiv \bar{f}(y, z)$, with $\bar{f} \in C^\infty(\mathbb{R}^2)$ arbitrary. This implies that the $G_E$-orbits on $\mathbb{R}^3$ coincide with those of the $\mathbb{R}^2$-action on $\mathbb{R}^3$ given by $(\mu, \nu) \cdot (x, y, z) := (x + \mu, y + \nu, z - \mu)$, for any $(\mu, \nu) \in \mathbb{R}^2$ and any $(x, y, z) \in \mathbb{R}^3$. Therefore, $M/G_E$ can be identified with $\mathbb{R}$ and the associated optimal momentum map takes the form

$$\mathcal{J} : \mathbb{R}^3 \to \mathbb{R}, \qquad (x, y, z) \mapsto x + z.$$
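The claims made in this example can be verified by a direct computation. The block below is a sketch added here; it only uses the tensor $B$ and formula (3.2) from the text, written as $X_f = B\,\nabla f$ with subscripts denoting partial derivatives.

```latex
% Requires amsmath. Direct checks for the constant Poisson tensor B on R^3.
\begin{align*}
X_f = B\,\nabla f
 &= \begin{pmatrix} 0 & 1 & 0\\ -1 & 0 & 1\\ 0 & -1 & 0\end{pmatrix}
    \begin{pmatrix} f_x \\ f_y \\ f_z \end{pmatrix}
  = \begin{pmatrix} f_y \\ f_z - f_x \\ -f_y \end{pmatrix}.
\end{align*}
% (a) No standard momentum map: the generator of the action is d/dx, and
%     X_f = d/dx would require
\begin{align*}
f_y = 1, \qquad f_z - f_x = 0, \qquad -f_y = 0,
\end{align*}
%     which is contradictory.
% (b) Invariant functions f = \bar f(y, z) have f_x = 0, hence
\begin{align*}
X_f = \bar f_y\Big(\frac{\partial}{\partial x} - \frac{\partial}{\partial z}\Big)
    + \bar f_z\,\frac{\partial}{\partial y},
\qquad
E = \operatorname{span}\Big\{\frac{\partial}{\partial x} - \frac{\partial}{\partial z},\
    \frac{\partial}{\partial y}\Big\}.
\end{align*}
% (c) The function x + z is constant along E and is a Casimir:
\begin{align*}
X_f[x + z] = \bar f_y - \bar f_y = 0,
\qquad
B\,\nabla(x+z) = B\begin{pmatrix}1\\0\\1\end{pmatrix}
 = \begin{pmatrix}0\\-1+1\\0\end{pmatrix} = 0 .
\end{align*}
```

Step (b) recovers the $\mathbb{R}^2$-action $(\mu, \nu) \cdot (x, y, z) = (x + \mu, y + \nu, z - \mu)$ quoted in the example, whose orbits are the planes $x + z = \text{const}$.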

It is easy to verify that the Hamiltonian flow associated to any invariant function $f(x, y, z) \equiv \bar{f}(y, z)$ preserves the level sets of $\mathcal{J}$; moreover, the function $\mathcal{J}$ is a Casimir of the Poisson manifold $(\mathbb{R}^3, \{\cdot,\cdot\})$.

Example. A canonical linear action: Consider $\mathbb{C}^3$ with the symplectic form $\omega$ given by

$$\omega\big((z_1, z_2, z_3), (z_1', z_2', z_3')\big) = -\mathrm{Im}\,\big\langle (z_1, z_2, z_3), (z_1', z_2', z_3') \big\rangle.$$

Consider now the natural action of the Lie group $SU(3)$ on $\mathbb{C}^3$ via matrix multiplication. This action is canonical and, since it is linear, it is globally Hamiltonian. Moreover, given that the isotropy subgroup of any point in $\mathbb{C}^3$ with respect to this action has dimension at least three, the ordinary momentum map is always singular. We will see that its image can be naturally identified with the momentum space associated to the $SU(3)$-characteristic distribution. Indeed, given that every $SU(3)$-invariant function in $\mathbb{C}^3$ is a function of

$$f(z_1, z_2, z_3) = \tfrac{1}{2}\big(|z_1|^2 + |z_2|^2 + |z_3|^2\big)$$

and the Hamiltonian flow of $X_f$ is given by $F_t(z_1, z_2, z_3) = (z_1e^{-it}, z_2e^{-it}, z_3e^{-it})$, the orbit space $\mathbb{C}^3/G_E$ coincides with $\mathbb{C}^3/S^1$, where $S^1$ acts on $\mathbb{C}^3$ by

$$e^{i\phi} \cdot (z_1, z_2, z_3) = (e^{i\phi}z_1, e^{i\phi}z_2, e^{i\phi}z_3). \qquad (3.3)$$

This quotient space can be identified with $(\mathbb{CP}(2) \times \mathbb{R}^+) \cup \{*\}$, where $\{*\}$ denotes a singleton or, said differently, with the cone $\mathring{C}(\mathbb{CP}(2))$ on $\mathbb{CP}(2)$. Indeed, if $\pi : \mathbb{C}^3 \to \mathbb{C}^3/S^1$ is the canonical projection and $z = (z_1, z_2, z_3)$, then the mapping that assigns to $\pi(z_1, z_2, z_3)$ the pair $([z/\|z\|], \|z\|)$ if $z \neq 0$, and $*$ if $z = 0$, provides the needed identification (the symbol $[z/\|z\|]$ denotes the element $\pi(z/\|z\|) \in \mathbb{CP}(2)$). We have the following expression for the optimal momentum map:

$$\mathcal{J} : \mathbb{C}^3 \to (\mathbb{CP}(2) \times \mathbb{R}^+) \cup \{*\}, \qquad z \mapsto \begin{cases} \big([z/\|z\|], \|z\|\big) & \text{if } z \neq 0, \\ * & \text{if } z = 0. \end{cases}$$

Remark. The compact case and the Theory of Invariants: The example that we just described lies in a very big class of systems for which the computation of the $G$-characteristic distribution $E$, and therefore of the optimal momentum map $\mathcal{J}$, is particularly simple. We are referring to canonical $G$-actions with $G$ a compact Lie group. It turns out that, according to a theorem of Gotay and Tuynman [1991], every canonical action of a compact Lie group on a symplectic manifold can be reduced to the study of a symplectic linear representation of $G$ on a certain finite-dimensional vector space $V \simeq \mathbb{R}^{2n}$. Once we have reduced the problem to the linear representation of a compact Lie group, we have at our disposal the Theory of Invariants. For our purposes, the most interesting result in this theory is the Hilbert–Weyl Theorem (Weyl [1946]; Poénaru [1976]; Kempf [1987]) which guarantees that the algebra of $G$-invariant polynomials is finitely generated, that is, one can always find a finite number of $G$-invariant polynomials $\{\sigma_1, \ldots, \sigma_k\}$ such that every $G$-invariant polynomial $P \in \mathcal{P}(V)^G$ can be written as a polynomial function of them. More specifically, given $P \in \mathcal{P}(V)^G$, there is some $\hat{P} \in \mathbb{R}[X_1, \ldots, X_k]$ such that $P = \hat{P}(\sigma_1, \ldots, \sigma_k)$. Note that the generating family $\{\sigma_1, \ldots, \sigma_k\}$ can be chosen to be minimal. In that situation we say that $\{\sigma_1, \ldots, \sigma_k\}$ is a Hilbert basis of $\mathcal{P}(V)^G$. In applications, it is convenient to choose a Hilbert basis formed of homogeneous polynomials. Note also that the Hilbert basis is not necessarily free and that therefore there are in general relations between its elements. The generalization of the Hilbert–Weyl Theorem to smooth functions has been carried out by Schwarz [1974], who proved that if $f$ is a germ of a function in $C^\infty(V)^G$ and $\{\sigma_1, \ldots, \sigma_k\}$ is a Hilbert basis of $\mathcal{P}(V)^G$, then there is a germ $\hat{f} \in C^\infty(\mathbb{R}^k)$ such that $f = \hat{f}(\sigma_1, \ldots, \sigma_k)$. Consequently, using (3.1) we can write the $G$-characteristic distribution in this case as

$$E = \mathrm{span}\{X_{\sigma_1}, \ldots, X_{\sigma_k}\}.$$
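Two small cases may help to fix the notions of Hilbert basis and of relations between its elements. They are illustrations added here, not examples from the original text, and use the finite (hence compact) group $\mathbb{Z}_2 = \{\pm 1\}$ acting by sign change; the hat notation follows the remark above.

```latex
% Requires amsmath. Two illustrative Hilbert bases for Z_2 = {+1, -1}.
%
% (a) V = R, with -1 acting by x -> -x. The invariant polynomials are the even
%     ones, and a Hilbert basis is {sigma_1}:
\begin{align*}
\sigma_1 = x^2, \qquad
P(x) = \sum_{n} a_n x^{2n} = \hat P(\sigma_1), \quad
\hat P(s) = \sum_n a_n s^n .
\end{align*}
%     In the smooth category the same statement holds: every smooth even function
%     is a smooth function of x^2, for example
\begin{align*}
\cos x = \hat f(x^2), \qquad
\hat f(s) = \sum_{n \ge 0} \frac{(-1)^n}{(2n)!}\, s^n .
\end{align*}
%
% (b) V = R^2, with -1 acting by (x, y) -> (-x, -y). A Hilbert basis is
\begin{align*}
\sigma_1 = x^2, \qquad \sigma_2 = xy, \qquad \sigma_3 = y^2,
\end{align*}
%     and it is not free, since its elements satisfy the relation
\begin{align*}
\sigma_1\,\sigma_3 = \sigma_2^{\,2} .
\end{align*}
```

Case (b) shows concretely why the text speaks of relations: the three generators cannot be reduced to two, yet they are algebraically dependent.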

3.2 The Optimal Momentum Map for Proper Globally Hamiltonian Actions

In order to illustrate the content of $\mathcal{J}$, we now identify in the classical language the conservation laws induced by $\mathcal{J}$. In the following paragraphs, we assume that $(M, \omega)$ is a symplectic manifold and that $G$ is a Lie group acting properly on $M$ in a globally Hamiltonian fashion with associated momentum map $\mathbf{J} : M \to \mathfrak{g}^*$.

3.6 Theorem. Let $(M, \omega)$ be a symplectic manifold and $G$ be a Lie group acting properly on $M$ in a globally Hamiltonian fashion with associated momentum map $\mathbf{J} : M \to \mathfrak{g}^*$. If $E$ is the $G$-characteristic distribution, then for any $m \in M$:

$$E(m) = \ker T_m\mathbf{J} \cap T_mM_{G_m}. \qquad (3.4)$$

Moreover, the $G_E$-orbit of the point $m$, and therefore the level set $\mathcal{J}^{-1}(\rho)$, $\rho \in M/E$, of $\mathcal{J}$ containing the point $m \in M$, is the connected component $\big(\mathbf{J}^{-1}(\mu) \cap M_H\big)_{c.c.m}$ which contains $m$ of the embedded submanifold $\mathbf{J}^{-1}(\mu) \cap M_H$; that is,

$$\mathcal{J}^{-1}(\rho) = \big(\mathbf{J}^{-1}(\mu) \cap M_H\big)_{c.c.m},$$

where $\mu = \mathbf{J}(m) \in \mathfrak{g}^*$ and $H := G_m$ is the isotropy subgroup of $m \in M$.

Proof. Expression (3.4) is simply a consequence of the following chain of equalities:

$$\begin{aligned}
E(m) &= \mathrm{span}\{X_f(m) \mid f \in C^\infty(M)^G\} \\
&= B^\sharp(m)\big(\mathrm{span}\{df(m) \mid f \in C^\infty(M)^G\}\big) \\
&= B^\sharp(m)\Big(\big((T_m(G \cdot m))^\circ\big)^{G_m}\Big) && \text{(by Proposition 2.14)} \\
&= \Big(B^\sharp(m)\big((T_m(G \cdot m))^\circ\big)\Big)^{G_m} && \text{(by Proposition 2.17)} \\
&= \big((T_m(G \cdot m))^\omega\big)^{G_m} && \text{(by Proposition 2.4)} \\
&= (\ker T_m\mathbf{J})^{G_m} = \ker T_m\mathbf{J} \cap T_mM_{G_m}.
\end{aligned}$$

We now prove the claim in the statement about the integral manifolds of $E$. Let again $m \in M$ be such that $\mu = \mathbf{J}(m) \in \mathfrak{g}^*$, $\mathcal{J}(m) = \rho$, and $H := G_m \subset G$ is its isotropy subgroup. As we said in the introduction, the subset $M_H \subset M$ is a symplectic submanifold of $M$. Moreover, it is easy to see (Otto [1987]; Ortega [1998]; Ortega and Ratiu [2002]) that the restriction $\mathbf{J}|_{M_H}$ of $\mathbf{J}$ to $M_H$ is a constant rank mapping and hence, by the Fibration Theorem (Abraham, Marsden, and Ratiu [1988], Theorem 3.5.18), $\mathbf{J}|_{M_H}^{-1}(\mu) = \mathbf{J}^{-1}(\mu) \cap M_H$ is a submanifold of $M_H$, and consequently of $M$, which contains the point $m$. Given that for any point $z \in \mathbf{J}^{-1}(\mu) \cap M_H$ we have that

$$T_z\big(\mathbf{J}^{-1}(\mu) \cap M_H\big) = T_z\big(\mathbf{J}|_{M_H}^{-1}(\mu)\big) = \ker T_z\mathbf{J} \cap T_zM_H = E(z),$$

we can conclude that the connected component $\big(\mathbf{J}^{-1}(\mu) \cap M_H\big)_{c.c.m}$ containing the point $m$ of the submanifold $\mathbf{J}^{-1}(\mu) \cap M_H$ is an integral manifold of $E$ containing $m$. At the same time, the characterization of the level sets of $\mathcal{J}$ as $G_E$-orbits implies, via the standard Noether Theorem and the principle of conservation of the isotropy, that $\mathcal{J}^{-1}(\rho) = G_E \cdot m \subset \big(\mathbf{J}^{-1}(\mu) \cap M_H\big)_{c.c.m}$. The result follows from the uniqueness of the maximal integral manifolds of a generalized distribution (Libermann and Marle [1987], Theorem 2.3, Appendix 3).

Remark. The previous theorem justifies again the use of the adjective optimal in the denomination of $\mathcal{J}$ since it proves that in the proper globally Hamiltonian case, its level sets coincide with the smallest invariant subsets of $M$ under $G$-equivariant dynamics. In other words, the optimal momentum map $\mathcal{J}$ is capable of implementing in one shot both the classical Noether Theorem and the law of conservation of the isotropy. The use of the name momentum for $\mathcal{J}$ is reasonable since, as follows from Theorem 3.6, there are cases in which $\mathcal{J}$ and $\mathbf{J}$ are basically the same map. For example, suppose that we are in the hypotheses of Theorem 3.6 and, additionally, we assume the $G$-action to be free (there are no singularities) and the level sets of $\mathbf{J}$ connected. In this situation, the map $\mathcal{J}(m) = \rho \mapsto \mathbf{J}(m) = \mu$ is a bijection $\varphi$ between $M/G_E$ and $\mathfrak{g}^*_{\mathbf{J}} := \mathbf{J}(M)$. Indeed, it is well defined since if we take another $m' \in M$ such that $\mathcal{J}(m') = \rho$, then $m$ and $m'$ are in the same level set of $\mathcal{J}$ which, in our hypotheses and by Theorem 3.6, is a (connected) level set of $\mathbf{J}$, and hence $\varphi(\mathcal{J}(m')) = \mathbf{J}(m') = \mu = \mathbf{J}(m) = \varphi(\mathcal{J}(m))$. The map $\varphi$ is onto by construction and one-to-one by the connectedness of the level sets of $\mathbf{J}$. Note that in this case, the commutative diagram formed by the maps

$$\mathbf{J} : M \to \mathfrak{g}^*_{\mathbf{J}}, \qquad \mathcal{J} : M \to M/G_E, \qquad \varphi : M/G_E \to \mathfrak{g}^*_{\mathbf{J}}, \qquad \varphi \circ \mathcal{J} = \mathbf{J},$$

provides an identification between $\mathcal{J}$ and $\mathbf{J}$. This diagram is a corollary of the universality properties of the optimal momentum map that we study in the following subsection.

3.3

Universality properties of the optimal momentum map

In this section we will show that the optimal momentum map is universal in the sense of category theory, that is, any other momentum map that we may define factors through $\mathcal{J}$. Before making this statement more explicit we need to introduce a few properties of the orbit space $M/G_E$ and of the optimal momentum map. We start with the following.

3.7 Definition. A pair $(X, C^\infty(X))$, where $X$ is a topological space and $C^\infty(X) \subset C^0(X)$ is a subset of the continuous functions on $X$, is called a variety with smooth functions $C^\infty(X)$. If $Y \subset X$ is a subset of $X$, the pair $(Y, C^\infty(Y))$ is said to be a subvariety of $(X, C^\infty(X))$ if $Y$ is a topological space endowed with the relative topology defined by that of $X$ and
\[
C^\infty(Y) = \{ f \in C^0(Y) \mid f = F|_Y \text{ for some } F \in C^\infty(X) \}.
\]
Sometimes $C^\infty(Y)$ is called the set of Whitney smooth functions on $Y$ with respect to $X$. A map $\varphi : X \to Z$ between two varieties is said to be smooth when it is continuous and $\varphi^* C^\infty(Z) \subset C^\infty(X)$.

In our discussion we are interested in the varieties constructed using generalized integrable distributions $D$ on the manifold $M$. If we denote by $M/D$ the space formed by the integral manifolds of $D$, then the pair $(M/D, C^\infty(M/D))$ is a variety whose set of smooth functions $C^\infty(M/D)$ is defined by the requirement that the canonical projection $\pi : M \to M/D$ is a smooth map; that is,
\[
C^\infty(M/D) := \{ f \in C^0(M/D) \mid f \circ \pi \in C^\infty(M) \}.
\]
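As a small illustration of this definition (a sketch under assumptions that are ours, not the authors'), take $M = \mathbb{T}^2$ with angle coordinates $(\theta_1, \theta_2)$ and let $D$ be the regular distribution spanned by $\partial/\partial\theta_1$. Its integral manifolds are the circles $\theta_2 = \text{const}$, so the quotient variety can be written down explicitly:

```latex
% Illustrative assumption: M = T^2 and D = span{ d/d(theta_1) }.
\[
  M=\mathbb{T}^2,\qquad
  D=\operatorname{span}\Big\{\frac{\partial}{\partial\theta_1}\Big\},\qquad
  \pi:\mathbb{T}^2\to M/D\cong S^1,\quad
  \pi(e^{i\theta_1},e^{i\theta_2})=e^{i\theta_2},
\]
\[
  C^\infty(M/D)
  =\{f\in C^0(S^1)\mid f\circ\pi\in C^\infty(\mathbb{T}^2)\}
  =C^\infty(S^1).
\]
```

The last equality holds because $\pi$ is a surjective submersion here, so $f \circ \pi$ is smooth exactly when $f$ is. For singular distributions the set $C^\infty(M/D)$ generally has no such manifold description, which is why the definition is phrased through $\pi$.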

Note that if $(M, \{\cdot,\cdot\})$ is a Poisson manifold and $D \subset TM$ is a Poisson integrable distribution, the pair $(C^\infty(M/D), \{\cdot,\cdot\}_{M/D})$ is a well-defined Poisson algebra (Ortega and Ratiu [1998], Theorem 2.12), where the bracket $\{\cdot,\cdot\}_{M/D}$ is given by
\[
\{f, g\}_{M/D}(\pi(m)) = \{f \circ \pi, g \circ \pi\}(m), \tag{3.5}
\]
for every $m \in M$. In the particular case in which $E$ is the $G$-characteristic distribution, Proposition 3.3 guarantees that $(C^\infty(M/E), \{\cdot,\cdot\}_{M/E}) := (C^\infty(M/G_E), \{\cdot,\cdot\}_{M/G_E})$ is a well-defined Poisson algebra. Moreover, the construction of the bracket (3.5) implies that the optimal momentum map $\mathcal{J}$ is a smooth Poisson morphism.

3.8 Proposition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $G$ be a Lie group acting canonically on it, with $E$ and $\mathcal{J}$ being the associated $G$-characteristic distribution and optimal momentum map, respectively. Let $m \in M$ be arbitrary such that $\mathcal{J}(m) = \rho \in M/G_E$. Then, for any $g \in G$, the map $\Phi_g(\rho) = \mathcal{J}(g \cdot m) \in M/G_E$ defines a smooth Poisson $G$-action on $M/G_E$ with respect to which $\mathcal{J}$ is $G$-equivariant.

Proof. We just have to check that $\Phi$ is well defined since, if that is the case, the equivariance of $\mathcal{J}$ follows by construction. Let $m, m' \in M$ be such that $\mathcal{J}(m) = \mathcal{J}(m') = \rho$. This implies that $m$ and $m'$ live in the same integral manifold of $E$, that is, in the same $G_E$-orbit. Hence, there exists $F_T \in G_E$ such that $F_T(m) = m'$. Since $F_T$ is the composition of a finite number of $G$-equivariant Hamiltonian flows associated to $G$-invariant Hamiltonians, it is $G$-equivariant and therefore
\[
\mathcal{J}(g \cdot m') = \mathcal{J}(g \cdot F_T(m)) = \mathcal{J}(F_T(g \cdot m)) = \mathcal{J}(g \cdot m) = \Phi_g(\rho),
\]
as required. The smoothness and the Poisson character of $\Phi$ follow from the fact that this action is the projection onto $M/E$ of the smooth Poisson action on the manifold $M$, via the smooth optimal momentum map.

Example. We look at the Poisson and $G$-structures of the spaces $M/E$ found in the examples 3.1 and 3.1.

• $S^1$ acting on $\mathbb{T}^2$: In this case $M/E$ coincides with $S^1$ and the optimal momentum map is given by $\mathcal{J}(e^{i\theta_1}, e^{i\theta_2}) = e^{i\theta_2}$. The group $S^1$ acts trivially on $M/E \simeq S^1$ and the Poisson structure $\{\cdot,\cdot\}_{M/E}$ is trivial. Indeed, let $f, g \in C^\infty(S^1)$ and $(e^{i\theta_1}, e^{i\theta_2}) \in \mathbb{T}^2$ be arbitrary. Then
\[
\{f, g\}_{M/E}(e^{i\theta_2}) = \{f, g\}_{M/E}\big(\mathcal{J}(e^{i\theta_1}, e^{i\theta_2})\big) = \{f \circ \mathcal{J}, g \circ \mathcal{J}\}_{\mathbb{T}^2}(e^{i\theta_1}, e^{i\theta_2}) = d\theta_1 \wedge d\theta_2 \Big( \frac{\partial f}{\partial \theta_2} \frac{\partial}{\partial \theta_1}, \frac{\partial g}{\partial \theta_2} \frac{\partial}{\partial \theta_1} \Big) = 0.
\]

• SU(3) acting on $\mathbb{C}^3$: As we saw, $M/E \simeq (\mathbb{CP}(2) \times \mathbb{R}_+) \cup \{*\}$. The Lie group SU(3) acts on this set by leaving the point $*$ fixed and by
\[
A \cdot \Big( \Big[\frac{z}{\|z\|}\Big], \|z\| \Big) = \Big( \Big[\frac{Az}{\|Az\|}\Big], \|Az\| \Big)
\]
when $A \in \mathrm{SU}(3)$ and $z \neq 0$. As to the Poisson structure on $M/E$, it can be easily shown that for any $f, g \in C^\infty(M/E)$ and any $([z], r) \in M/E \simeq (\mathbb{CP}(2) \times \mathbb{R}_+) \cup \{*\}$,
\[
\{f, g\}_{M/E}([z], r) = \omega_{\mathbb{CP}(2)}([z])\big( X_{f_r}([z]), X_{g_r}([z]) \big), \qquad \{f, g\}_{M/E}(*) = 0,
\]
where $\omega_{\mathbb{CP}(2)}$ is the natural symplectic structure on $\mathbb{CP}(2)$ (coming from considering it as one of the regular symplectic reduced spaces of the $S^1$-action (3.3) on $\mathbb{C}^3$) and $f_r, g_r \in C^\infty(\mathbb{CP}(2))$ are defined by $f_r([z]) := f([z], r)$ and $g_r([z]) := g([z], r)$, for any $[z] \in \mathbb{CP}(2)$.

In order to illustrate the universality properties of the optimal momentum map we introduce the category of Hamiltonian symmetric systems with a momentum map.

3.9 Definition. Let $(M, \{\cdot,\cdot\})$ be a Poisson manifold and $(P, C^\infty(P), \{\cdot,\cdot\}_P)$ a Poisson variety, with $G$ being a Lie group acting canonically on $M$ and $P$. Let $K : M \to P$ be a smooth $G$-equivariant Poisson map. We say that $K$ is a momentum map for the $G$-action on $M$ if the Hamiltonian flows associated to $G$-invariant smooth functions leave invariant the level sets of $K$. In that situation we say that $(M, \{\cdot,\cdot\}, G, K : M \to P)$ is a Hamiltonian $G$-space with momentum map $K$.

3.10 Theorem (Universality of the optimal momentum map). The optimal momentum map is a universal object in the category of Hamiltonian symmetric systems with a momentum map. More specifically, if $(M, \{\cdot,\cdot\}, G, K : M \to P)$ is any Hamiltonian $G$-space with momentum map $K$ and $\mathcal{J} : M \to M/E$ is the optimal momentum map defined using the canonical $G$-action on $M$, then there exists a unique $G$-equivariant Poisson morphism $\varphi : M/E \to P$ such that the following diagram commutes:
\[
\begin{array}{ccc}
M & \xrightarrow{\ K\ } & P \\
{\scriptstyle\mathcal{J}}\big\downarrow & \nearrow{\scriptstyle\varphi} & \\
M/E & &
\end{array}
\]

Proof. The function $\varphi$ is given, for any $\rho = \mathcal{J}(m) \in M/E$, by the expression $\varphi(\rho) := K(m)$. The map $\varphi$ is well defined since if $m' \in \mathcal{J}^{-1}(\rho)$, then there exists a finite composition $F_T$ of flows associated to $G$-invariant Hamiltonians such that $m' = F_T(m)$. Given that $K$ is a momentum map we have that $K(m') = K(F_T(m)) = K(m) = \varphi(\rho)$. The smoothness of $\varphi$, as well as its $G$-equivariance and Poisson character, are a simple diagram chasing exercise. The uniqueness is guaranteed by the fact that the diagram (3.10) commutes and by the surjectivity of $\mathcal{J}$.
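The uniqueness and equivariance steps of this proof can be spelled out in a few lines. The following is only a sketch of the diagram chase, writing $\mathcal{J}$ for the optimal momentum map and $\varphi_1, \varphi_2$ for two hypothetical maps that both make the diagram commute:

```latex
% Uniqueness: any rho in M/E is of the form J(m) because J is surjective.
\[
  \varphi_1\circ\mathcal{J}=K=\varphi_2\circ\mathcal{J}
  \;\Longrightarrow\;
  \varphi_1(\rho)=\varphi_1(\mathcal{J}(m))=K(m)
  =\varphi_2(\mathcal{J}(m))=\varphi_2(\rho)
  \quad\text{for every }\rho=\mathcal{J}(m)\in M/E .
\]
% Equivariance: uses g . rho = J(g . m) and the G-equivariance of K.
\[
  \varphi(g\cdot\rho)=\varphi(\mathcal{J}(g\cdot m))=K(g\cdot m)
  =g\cdot K(m)=g\cdot\varphi(\rho),\qquad g\in G .
\]
```

The second line uses the $G$-action on $M/E$ defined in Proposition 3.8, namely $g \cdot \rho = \mathcal{J}(g \cdot m)$, together with the assumed $G$-equivariance of $K$ from Definition 3.9.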

4

Optimal Reduction

As we already said in the introduction, the most efficient way to profit from the conservation laws encoded in the symmetries of a globally Hamiltonian system is to carry out the so-called symplectic or Marsden–Weinstein reduction (Marsden and Weinstein [1974]), which we briefly review. Let $(M, \omega, h, G, \mathbf{J} : M \to \mathfrak{g}^*)$ be a globally Hamiltonian symmetric system where we assume that the $G$-action on $M$ is free and proper. Let $\mu \in \mathfrak{g}^*_{\mathbf{J}}$ be an arbitrary element in the image of the momentum map $\mathbf{J}$. The Marsden–Weinstein reduction theorem says that the quotient $\mathbf{J}^{-1}(\mu)/G_\mu$ is a symplectic manifold with symplectic form $\omega_\mu$ uniquely determined by the equality $\pi_\mu^* \omega_\mu = i_\mu^* \omega$, where $i_\mu : \mathbf{J}^{-1}(\mu) \to M$ and $\pi_\mu : \mathbf{J}^{-1}(\mu) \to \mathbf{J}^{-1}(\mu)/G_\mu$ are the natural injection and projection, respectively. The dynamics induced by $X_h$ projects naturally onto the reduced space $\mathbf{J}^{-1}(\mu)/G_\mu$. In this section we see how the new formulation in terms of the optimal momentum map allows us to mimic this procedure, creating the possibility to reduce symmetric systems in all the situations in which $\mathcal{J}$ is defined and freeing us from the strong restrictions posed by the classical formulations of the reduction theorems.
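A standard instance of this review may help fix ideas. The sketch below assumes the diagonal $S^1$-action on $M = \mathbb{C}^3 \setminus \{0\}$ (where it is free and proper), the standard symplectic form, and the normalization $\mathbf{J}(z) = \tfrac12\|z\|^2$; the sign and the factor depend on conventions that this excerpt does not fix.

```latex
% Assumptions (ours): diagonal S^1-action, standard symplectic form on C^3,
% momentum map normalized as J(z) = |z|^2 / 2 (sign is convention dependent).
\[
  e^{i\theta}\cdot z=(e^{i\theta}z_1,e^{i\theta}z_2,e^{i\theta}z_3),\qquad
  \mathbf{J}(z)=\tfrac12\|z\|^2,
\]
\[
  \mathbf{J}^{-1}(\mu)=\{z\in\mathbb{C}^3 : \|z\|^2=2\mu\}\cong S^5,\qquad
  G_\mu=S^1,\qquad
  \mathbf{J}^{-1}(\mu)/G_\mu\cong\mathbb{CP}(2)\quad(\mu>0).
\]
```

Since $S^1$ is abelian, the coadjoint isotropy $G_\mu$ is the whole group, and each positive level set reduces to a copy of $\mathbb{CP}(2)$ whose symplectic form scales with $\mu$.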

4.1

Reduction Lemmas

The first ingredient needed in the reduction of a symmetric system is the family of level sets of the associated momentum map. Since by construction the level sets of the optimal momentum map $\mathcal{J}$ are the integral manifolds of a smooth distribution, they are always smooth immersed submanifolds of $M$. Moreover, in Theorem 3.6 we saw that if $\mathcal{J}$ is associated to a proper globally Hamiltonian action, its level sets are actually embedded submanifolds. In the next result we show that this is also the case under much weaker hypotheses. We start with the following straightforward lemma.

4.1 Lemma. Let $(M, \omega)$ be a symplectic manifold and $G$ be a Lie group acting properly and canonically on $M$. Let $E$ be the $G$-characteristic distribution with optimal momentum map $\mathcal{J} : M \to M/G_E$. Then, for any $\rho \in M/G_E$, the set $\mathcal{J}^{-1}(\rho) \subset M$ is included in a connected component of some isotropy type manifold $M_H$, with $H$ the isotropy subgroup of any $m \in \mathcal{J}^{-1}(\rho)$.

Proof. It is a straightforward consequence of the equality $\mathcal{J}^{-1}(\rho) = G_E \cdot m$, for any $m \in \mathcal{J}^{-1}(\rho)$, since the elements of $G_E$ are $G$-equivariant and therefore preserve isotropy subgroups.

4.2 Definition. In the hypotheses of the previous lemma, we say that an element $\rho \in M/G_E$ satisfies the closedness hypothesis if $\mathcal{J}^{-1}(\rho)$ is closed as a subset of the isotropy type submanifold $M_H$ in which it is sitting.

Example. The closedness hypothesis is always satisfied in the presence of globally Hamiltonian actions. Also, suppose that $\rho \in M/E$ is such that $\mathcal{J}^{-1}(\rho) \subset M_H$. Let $N(H)$ be the normalizer in $G$ of $H$. The canonical $G$-action on $M$ induces a natural canonical action of the group $N(H)/H$ on the symplectic manifold $M_H$. Let $(N(H)/H)^\rho$ be the subgroup of $N(H)/H$ that leaves invariant the connected component $M_H^\rho$ of $M_H$ in which $\mathcal{J}^{-1}(\rho)$ is sitting. If the $(N(H)/H)^\rho$-action on $M_H^\rho$ has a globally defined momentum map associated, then $\rho$ satisfies the closedness hypothesis.

4.3 Proposition. Let $(M, \omega)$ be a symplectic manifold and $G$ be a Lie group acting properly and canonically on $M$. Let $E$ be the $G$-characteristic distribution with optimal momentum map $\mathcal{J} : M \to M/G_E$. Let $\rho$ be an element in $M/G_E$ that satisfies the closedness hypothesis. If $M_H$ is the isotropy type submanifold in which the level set $\mathcal{J}^{-1}(\rho)$ is included by Lemma 4.1, then $\mathcal{J}^{-1}(\rho)$ is a closed embedded submanifold of $M_H$ and therefore an embedded submanifold of $M$. As a consequence, if $M_H$ is closed in $M$ then $\mathcal{J}^{-1}(\rho)$ is a closed embedded submanifold of $M$.

Proof. Let $M_H^\rho$ be the connected component of $M_H$ containing $\mathcal{J}^{-1}(\rho)$. The claim of the proposition will follow if we are able to show that $\mathcal{J}^{-1}(\rho)$ is a closed embedded submanifold of $M_H^\rho$.

Firstly, note that $M_H^\rho$ inherits from the canonical $G$-action on $M$ a free and canonical $L^\rho := (N(H)/H)^\rho$-action, where $N(H)$ denotes the normalizer of $H$ in $G$ and $(N(H)/H)^\rho$ is the closed subgroup of $N(H)/H$ that leaves $M_H^\rho$ invariant. In our subsequent discussion we will assume, in order to simplify the exposition, that $M_H^\rho = M_H$ and $(N(H)/H)^\rho = N(H)/H =: L$. Let $E_L$ be the $L$-characteristic distribution associated to the canonical $L$-action on $M_H$ and $\mathcal{J}_L : M_H \to M_H/E_L$ be the associated optimal momentum map. Let $m \in \mathcal{J}^{-1}(\rho)$. In the sequel we will prove that if $\mathcal{J}_L(m) = \sigma$ then $\mathcal{J}^{-1}(\rho) = \mathcal{J}_L^{-1}(\sigma)$. We first need the following lemma.

4.4 Lemma. Let $G$ be a Lie group acting properly on the manifold $M$. Let $m \in M$ be an arbitrary point and $H := G_m$. Then every function $f \in C^\infty(M_H)^{N(H)} = C^\infty(M_H)^L$ admits a local extension at $m$ to $C^\infty(M)^G$, that is, there exist a $G$-invariant neighborhood $U$ of $m$ in $M$ and a $G$-invariant function $F \in C^\infty(M)^G$ such that $F|_{U \cap M_H} = f|_{U \cap M_H}$.

Proof. Since the claim of the lemma is local we can make use of the slices introduced in Section 2. Let $V$ be an open $G$-invariant neighborhood of the orbit $G \cdot m$ that is modeled by the tube $G \times_H A_r$. It is easy to see that $V_H \simeq N(H) \times_H A_r^H$. Let now $g : A_r^H \to \mathbb{R}$ be the smooth function defined by $g(v) := f([e, v])$ for any $v \in A_r^H$. Using a bump function similar to the one that was used in the proof of Proposition 2.13 we can construct another function $g_1 \in C^\infty(A_r^H)$ such that
\[
g_1|_{A^H_{r/2}} = g|_{A^H_{r/2}} \quad \text{and} \quad g_1|_{A^H_r \setminus A^H_{3r/4}} = 0.
\]
Due to the compactness of $H$, the vector space $A$ can be decomposed as the direct sum $A = A^H \oplus W$ of two $H$-invariant subspaces $A^H$ and $W$. Define $g_2 \in C^\infty(A_r)^H$ by $g_2(v + w) = g_1(v)$, for any $v \in A_r^H$ and $w \in W$. We now let $g_3 \in C^\infty(V)^G$ be given by $g_3([h, v]) = g_2(v)$, for any $[h, v] \in V \simeq G \times_H A_r$. Finally, let $F \in C^\infty(M)^G$ be the function given by
\[
F(z) = \begin{cases} g_3(z) & \text{if } z \in V, \\ 0 & \text{if } z \notin V. \end{cases}
\]
The lemma follows by taking the function $F$ above and $U$ as the open $G$-invariant set modeled by $G \times_H A_{r/2}$.

As a corollary to this lemma we have that $E|_{M_H} = E_L$ and, consequently, $\mathcal{J}^{-1}(\rho) = \mathcal{J}_L^{-1}(\sigma)$, as required. We now show that the distribution $E_L$ has constant rank. Indeed, for any $z \in M_H$ we have that
\begin{align*}
E_L(z) &= \{ X_f(z) \mid f \in C^\infty(M_H)^L \} \\
&= B^\sharp_{M_H}(z)\big( \{ df(z) \mid f \in C^\infty(M_H)^L \} \big) \\
&= B^\sharp_{M_H}(z)\big( (T_z(L \cdot z))^\circ \big) \\
&= (T_z(L \cdot z))^{\omega|_{M_H}} \qquad \text{(by Proposition 2.14)}.
\end{align*}

In particular, this equality shows that $E_L$ is an integrable distribution of constant rank equal to $\dim M_H - \dim \mathfrak{l}^*$. In the previous chain of equalities we denoted by $B^\sharp_{M_H}$ the Poisson tensor associated to the symplectic form $\omega|_{M_H}$ on $M_H$. The proof is concluded by recalling the closedness hypothesis on $\rho$ and a general fact about constant rank smooth foliations (see for instance Theorem 5 on page 51 of Camacho and Lins Neto [1985]), which states that the closed integral leaves of an integrable distribution of constant rank are always embedded submanifolds. This fact proves that $\mathcal{J}_L^{-1}(\sigma) = \mathcal{J}^{-1}(\rho)$ is an embedded submanifold of $M_H$, and thereby of $M$, as required.
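The rank count quoted above can be made explicit. The following sketch uses only two facts already stated in the proof: the $L$-action on $M_H$ is free, and $\omega|_{M_H}$ is nondegenerate because $M_H$ is a symplectic submanifold. Here $\mathfrak{l}$ denotes the Lie algebra of $L$.

```latex
% Freeness gives dim T_z(L.z) = dim l; nondegeneracy of omega|_{M_H} gives
% dim W^omega = dim M_H - dim W for any subspace W of T_z M_H.
\[
  \operatorname{rank}E_L(z)
  =\dim\,(T_z(L\cdot z))^{\omega|_{M_H}}
  =\dim M_H-\dim T_z(L\cdot z)
  =\dim M_H-\dim\mathfrak{l}
  =\dim M_H-\dim\mathfrak{l}^* .
\]
```

The value does not depend on $z \in M_H$, which is exactly the constant rank property needed to invoke the foliation result.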

4.5 Proposition. In the hypotheses of the previous proposition, for any $\rho \in M/G_E$ satisfying the closedness hypothesis, the isotropy subgroup $G_\rho \subset G$ of $\rho$ with respect to the $G$-action on $M/G_E$ defined in Proposition 3.8 is a closed Lie subgroup of $G$. Moreover, for any $m \in \mathcal{J}^{-1}(\rho)$,
\[
T_m(G_\rho \cdot m) = T_m(\mathcal{J}^{-1}(\rho)) \cap T_m(G \cdot m). \tag{4.1}
\]

Proof. In order to show that $G_\rho$ is a Lie subgroup of $G$ it suffices to show (see for instance Warner [1983], Theorem 3.42) that $G_\rho$ is closed in $G$. Let $\{g_n\} \subset G_\rho$ be an arbitrary sequence in $G_\rho$ converging to a limit $g \in G$. The closedness of $G_\rho$ will be guaranteed if we show that the limit $g$ actually belongs to $G_\rho$. Let $m \in M$ be such that $\mathcal{J}(m) = \rho$ and $H := G_m$. The condition $\{g_n\} \subset G_\rho$ implies that for any given $n \in \mathbb{N}^*$, there exists an element $F^n_{T_n} \in G_E$ such that $g_n \cdot m = F^n_{T_n}(m)$. Consequently, the $G$-equivariance of the elements $F^n_{T_n}$ implies that the isotropy subgroups $G_{g_n \cdot m}$ of $g_n \cdot m$ satisfy
\[
g_n H g_n^{-1} = G_{g_n \cdot m} = G_{F^n_{T_n}(m)} = G_m = H,
\]
and hence the sequence $\{g_n\} \subset N(H)$. Since the normalizer $N(H)$ is closed in $G$, the limit $g$ belongs to $N(H)$ and hence the element $g \cdot m$ is sitting in the same connected component of $M_H$ in which the sequence $\{g_n \cdot m\}$ lives. Consequently, the element $g \cdot m$ lies in the closure in $M_H$ of $\mathcal{J}^{-1}(\rho)$. The closedness hypothesis on $\rho$ guarantees that $g \cdot m \in \mathcal{J}^{-1}(\rho) \subset M_H$. In particular, this implies that $g \cdot \rho = g \cdot \mathcal{J}(m) = \mathcal{J}(g \cdot m) = \rho$ or, equivalently, $g \in G_\rho$, as required.

We now prove equality (4.1). The inclusion $T_m(G_\rho \cdot m) \subset T_m(\mathcal{J}^{-1}(\rho)) \cap T_m(G \cdot m)$ is straightforward since the orbit $G_\rho \cdot m$ is included in both $\mathcal{J}^{-1}(\rho)$ and $G \cdot m$. Conversely, let
\[
X_f(m) = \xi_M(m) \in T_m(\mathcal{J}^{-1}(\rho)) \cap T_m(G \cdot m), \tag{4.2}
\]
with $f \in C^\infty(M)^G$ and $\xi \in \mathfrak{g}$. Recall that since the $G$-action on $M$ is canonical, the vector field $\xi_M \in \mathfrak{X}(M)$ is locally Hamiltonian and therefore there is a smooth function, say $J^\xi \in C^\infty(U)$, locally defined in an open neighborhood $U$ of $m$ in $M$, such that $X_{J^\xi} = \xi_M$. Notice that for any $z \in U$,
\[
\{f, J^\xi\}(z) = X_{J^\xi}[f](z) = df(z) \cdot X_{J^\xi}(z) = df(z) \cdot \xi_M(z) = 0,
\]
by the $G$-invariance of the function $f$. Consequently, at any point in $U$, the Lie bracket $[X_f, X_{J^\xi}] = X_{\{J^\xi, f\}} = 0$, and hence, if $F_t$ is the flow of $X_f$ and $G_t$ is the flow of $X_{J^\xi}$ (more explicitly $G_t(z) = \exp t\xi \cdot z$ for any $z \in U$), then $F_t \circ G_s = G_s \circ F_t$ (see for instance Abraham, Marsden, and Ratiu [1988], Proposition 4.2.27). By one of the Trotter product formulas (see Trotter [1958] or Abraham, Marsden, and Ratiu [1988], Corollary 4.1.27), the flow $H_t$ of $X_{f - J^\xi} = X_f - X_{J^\xi}$ is given by
\[
H_t(z) = \lim_{n \to \infty} \big( F_{t/n} \circ G_{-t/n} \big)^n (z) = \lim_{n \to \infty} \big( F^n_{t/n} \circ G^n_{-t/n} \big)(z) = (F_t \circ G_{-t})(z) = F_t(\exp(-t\xi) \cdot z),
\]
for any $z \in U$. Note that by (4.2), the point $m \in M$ is an equilibrium of $X_{f - J^\xi} = X_f - X_{J^\xi}$, hence $F_t(\exp(-t\xi) \cdot m) = m$ or, equivalently, $\exp t\xi \cdot m = F_t(m)$. Applying $\mathcal{J}$ to both sides of this equality, and taking into account that $F_t$ is the flow of a $G$-invariant Hamiltonian vector field, it follows that $\exp t\xi \cdot \rho = \rho$, and hence $\xi \in \mathfrak{g}_\rho$. Thus $\xi_M(m) \in T_m(G_\rho \cdot m)$, as required.
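The last step of this proof compresses several implications into one sentence. A sketch of the chain, writing $\mathcal{J}$ for the optimal momentum map, $F_t$ for the flow of $X_f$ and $H_t$ for the flow of $X_{f - J^\xi}$, and assuming $t$ is small enough for all flows to stay in $U$:

```latex
% F_t is G-equivariant because f is G-invariant and the action is canonical.
\begin{align*}
X_{f-J^\xi}(m) = X_f(m)-\xi_M(m) = 0
  &\;\Longrightarrow\; H_t(m)=m \\
  &\;\Longrightarrow\; F_t(\exp(-t\xi)\cdot m)=m \\
  &\;\Longrightarrow\; \exp(-t\xi)\cdot F_t(m)=m
     && \text{($F_t$ is $G$-equivariant)} \\
  &\;\Longrightarrow\; F_t(m)=\exp(t\xi)\cdot m ,\\
\exp(t\xi)\cdot\rho
  = \mathcal{J}(\exp(t\xi)\cdot m)
  = \mathcal{J}(F_t(m))
  &= \mathcal{J}(m)=\rho .
\end{align*}
```

The final equality $\mathcal{J}(F_t(m)) = \mathcal{J}(m)$ holds because the level sets of $\mathcal{J}$ are by construction invariant under the flows of $G$-invariant Hamiltonians.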

4.2

The Optimal Reduction Method

We continue the study of the ingredients needed for reduction with the following.

4.6 Proposition. Let $(M, \omega)$ be a symplectic manifold and $G$ be a Lie group acting properly and canonically on $M$. Let $E$ be the $G$-characteristic distribution with optimal momentum map $\mathcal{J} : M \to M/G_E$. Then, for any $\rho \in M/G_E$ satisfying the closedness hypothesis, the isotropy subgroup $G_\rho$ acts on the submanifold $\mathcal{J}^{-1}(\rho)$, and the corresponding orbit space $M_\rho := \mathcal{J}^{-1}(\rho)/G_\rho$ is a regular quotient manifold, that is, it can be endowed with the unique smooth structure that makes the canonical projection $\pi_\rho : \mathcal{J}^{-1}(\rho) \to \mathcal{J}^{-1}(\rho)/G_\rho$ a submersion. We will call $M_\rho = \mathcal{J}^{-1}(\rho)/G_\rho$ endowed with this smooth structure the reduced phase space.

Proof. Let $m \in \mathcal{J}^{-1}(\rho) \subset M$ and $H := G_m$. Recall that in the proof of Proposition 4.3 we showed that $\mathcal{J}^{-1}(\rho) \subset M_H$. This implies that the isotropies of all the elements of $\mathcal{J}^{-1}(\rho)$ under the $G_\rho$-action are identical and equal to $H \cap G_\rho$. A classical result (see for instance Exercise 4.1M in Abraham and Marsden [1978]) guarantees that in such a situation the quotient $\mathcal{J}^{-1}(\rho)/G_\rho$ is a regular manifold, and the claim follows.

We are now in a position to state the main result of this section.

4.7 Theorem (Optimal Reduction). Let $(M, \omega)$ be a symplectic manifold and $G$ be a Lie group acting properly and canonically on $M$. Let $E$ be the $G$-characteristic distribution with optimal momentum map $\mathcal{J} : M \to M/G_E$. Then, for any $\rho \in M/G_E$ satisfying the closedness hypothesis, the reduced space $M_\rho = \mathcal{J}^{-1}(\rho)/G_\rho$ has a unique symplectic structure $\omega_\rho$ characterized by
\[
\pi_\rho^* \omega_\rho = i_\rho^* \omega, \tag{4.3}
\]
where $\pi_\rho : \mathcal{J}^{-1}(\rho) \to M_\rho$ is the canonical projection and $i_\rho : \mathcal{J}^{-1}(\rho) \to M$ is the inclusion.

Proof. Let $[z]_\rho = \pi_\rho(z)$ be an arbitrary element of $M_\rho$. Since by Proposition 4.6 the projection $\pi_\rho$ is a surjective submersion, every vector $[v]_\rho \in T_{[z]_\rho} M_\rho$ can be written as $[v]_\rho = T_z\pi_\rho \cdot v$, with $v \in T_z \mathcal{J}^{-1}(\rho)$. Taking also $[w]_\rho = T_z\pi_\rho \cdot w \in T_{[z]_\rho} M_\rho$ arbitrary, we define
\[
\omega_\rho([z]_\rho)\big( T_z\pi_\rho \cdot v, T_z\pi_\rho \cdot w \big) := \omega(z)(v, w). \tag{4.4}
\]
In order to verify that this is a good definition we have to check that it is independent of the representative $z \in \mathcal{J}^{-1}(\rho)$ that defines $[z]_\rho$ and of the vectors $v, w \in T_z \mathcal{J}^{-1}(\rho)$ that define $[v]_\rho, [w]_\rho \in T_{[z]_\rho} M_\rho$, respectively. So, let $z' \in \mathcal{J}^{-1}(\rho)$ and $v', w' \in T_{z'} \mathcal{J}^{-1}(\rho)$ be such that $[z]_\rho = [z']_\rho$ and $[v]_\rho = [v']_\rho$, $[w]_\rho = [w']_\rho$. Let $g \in G_\rho$ be such that $z' = g \cdot z$. Note that since $\pi_\rho = \pi_\rho \circ \Phi_g$ implies $T_z\pi_\rho = T_{z'}\pi_\rho \circ T_z\Phi_g$, the relation $[v]_\rho = [v']_\rho$ can be written as $T_{z'}\pi_\rho(v') = T_{z'}\pi_\rho\big(T_z\Phi_g(v)\big)$. Consequently, $v' - T_z\Phi_g(v) \in \ker T_{z'}\pi_\rho = T_{z'}(G_\rho \cdot z')$ and therefore there are elements $\xi^1, \xi^2 \in \mathfrak{g}_\rho$ such that
\[
v' = T_z\Phi_g(v) + \xi^1_{\mathcal{J}^{-1}(\rho)}(z') \quad \text{and} \quad w' = T_z\Phi_g(w) + \xi^2_{\mathcal{J}^{-1}(\rho)}(z').
\]
We now prove that $\omega_\rho([z']_\rho)([v']_\rho, [w']_\rho) = \omega_\rho([z]_\rho)([v]_\rho, [w]_\rho)$:
\begin{align*}
\omega_\rho([z']_\rho)([v']_\rho, [w']_\rho)
&= \omega(\Phi_g(z))\big( T_z\Phi_g(v) + \xi^1_{\mathcal{J}^{-1}(\rho)}(z'),\ T_z\Phi_g(w) + \xi^2_{\mathcal{J}^{-1}(\rho)}(z') \big) \\
&= (\Phi_g^*\omega)(z)(v, w) + \omega(\Phi_g(z))\big( T_z\Phi_g(v),\ \xi^2_{\mathcal{J}^{-1}(\rho)}(z') \big) \\
&\quad + \omega(\Phi_g(z))\big( \xi^1_{\mathcal{J}^{-1}(\rho)}(z'),\ T_z\Phi_g(w) \big) + \omega(\Phi_g(z))\big( \xi^1_{\mathcal{J}^{-1}(\rho)}(z'),\ \xi^2_{\mathcal{J}^{-1}(\rho)}(z') \big).
\end{align*}
Since the $G$-action is canonical, we have $(\Phi_g^*\omega)(z)(v, w) = \omega(z)(v, w)$. Also, since $\xi^1_{\mathcal{J}^{-1}(\rho)}(z') \in E(z')$, there exists a $G$-invariant function $f \in C^\infty(M)^G$ such that $X_f(z') = \xi^1_{\mathcal{J}^{-1}(\rho)}(z')$. This allows us to write
\[
\omega(\Phi_g(z))\big( \xi^1_{\mathcal{J}^{-1}(\rho)}(z'),\ \xi^2_{\mathcal{J}^{-1}(\rho)}(z') \big) = df(z') \cdot \xi^2_{\mathcal{J}^{-1}(\rho)}(z') = 0,
\]
by $G$-invariance of the function $f$. Furthermore, if $h \in C^\infty(M)^G$ satisfies $X_h(z') = T_z\Phi_g(v)$, then it follows that
\[
\omega(\Phi_g(z))\big( T_z\Phi_g(v),\ \xi^2_{\mathcal{J}^{-1}(\rho)}(z') \big) = dh(z') \cdot \xi^2_{\mathcal{J}^{-1}(\rho)}(z') = 0.
\]
Analogously, we can conclude that $\omega(\Phi_g(z))\big( \xi^1_{\mathcal{J}^{-1}(\rho)}(z'),\ T_z\Phi_g(w) \big) = 0$ and hence
\[
\omega_\rho([z']_\rho)([v']_\rho, [w']_\rho) = \omega(z)(v, w) = \omega_\rho([z]_\rho)([v]_\rho, [w]_\rho),
\]
which guarantees that (4.4) is a good definition of $\omega_\rho$, consistent with (4.3). Notice that $\omega_\rho$ is smooth since $\pi_\rho^*\omega_\rho$ is smooth, and it is also closed since $\omega$ is closed and $\pi_\rho$ is a surjective submersion.

We now show that $\omega_\rho$ is nondegenerate, which concludes the proof. Let $[z]_\rho \in M_\rho$ and $[v]_\rho \in T_{[z]_\rho} M_\rho$ be such that $\omega_\rho([z]_\rho)([v]_\rho, [w]_\rho) = 0$ for all $[w]_\rho \in T_{[z]_\rho} M_\rho$, which implies that $\omega(z)(v, w) = 0$ for all $w \in T_z \mathcal{J}^{-1}(\rho) = E(z)$. Let $f_1, f_2 \in C^\infty(M)^G$ be such that $v = X_{f_1}(z)$ and $w = X_{f_2}(z)$. Since $\omega(z)(v, w) = df_1(z) \cdot X_{f_2}(z) = 0$ for all $f_2 \in C^\infty(M)^G$, we conclude that $df_1(z) \in E(z)^\circ$. Let now $H := G_z$ and $L := N(H)/H$ (as usual, we will suppose for simplicity that $M_H$ is connected). Notice that the $L$-action on $M_H$ being canonical implies that for any $\eta \in \mathfrak{l}$ ($\mathfrak{l}$ denotes the Lie algebra of $L$), the vector field $\eta_{M_H}$ satisfies $\mathcal{L}_{\eta_{M_H}} \omega|_{M_H} = 0$ or, equivalently, $\eta_{M_H}$ is locally Hamiltonian. Let $\{\eta^1, \ldots, \eta^l\}$ be a basis of $\mathfrak{l}$ and $J_L^{\eta^i}$ be the local Hamiltonian for $\eta^i_{M_H}$ defined in a neighborhood $U_i$ of $z$ in $M_H$. So, if $\eta = c_1\eta^1 + \cdots + c_l\eta^l$ is an arbitrary element of $\mathfrak{l}$, then $\eta_{M_H}$ is a locally Hamiltonian vector field with Hamiltonian function $J_L^\eta = c_1 J_L^{\eta^1} + \cdots + c_l J_L^{\eta^l}$, locally defined in $U := U_1 \cap \ldots \cap U_l$. We define $\mathbf{J}_L : U \to \mathfrak{l}^*$ by $\langle \mathbf{J}_L(z), \eta \rangle = J_L^\eta(z)$. It is easy to show that the map $\mathbf{J}_L$ has the following properties:

(i) $\ker(T_z\mathbf{J}_L) = (T_z(L \cdot z))^{\omega|_{M_H}}$, for any $z \in U$.
(ii) $\operatorname{range}(T_z\mathbf{J}_L) = \mathfrak{l}^*$, for any $z \in U$.
(iii) Noether Theorem: The Hamiltonian flow associated to $L$-invariant functions in $M_H$ leaves the connected components of the level sets of $\mathbf{J}_L$ invariant.

Now, using the properties of $\mathbf{J}_L$ and the fact that $df_1(z) \in E(z)^\circ$, we can write
\[
v = X_{f_1}(z) = B^\sharp(z)\big(df_1(z)\big) \in B^\sharp(z)\big(E(z)^\circ\big) \cap E(z) \subset B^\sharp(z)\big( (\ker(T_z\mathbf{J}_L))^\circ \big) \cap T_z M_H.
\]
At the same time,
\[
B^\sharp(z)\big( (\ker(T_z\mathbf{J}_L))^\circ \big) \cap T_z M_H = (\ker(T_z\mathbf{J}_L))^\omega \cap T_z M_H = T_z(L \cdot z) \subset T_z(G \cdot z).
\]
Therefore $v = X_{f_1}(z) \in T_z(G \cdot z) \cap E(z) = T_z(G_\rho \cdot z) = \ker(T_z\pi_\rho)$, by Proposition 4.5. Consequently $[v]_\rho = 0$, as required.

4.8 Theorem (Optimal Reduction of Hamiltonian dynamics). Consider a symplectic manifold $(M, \omega)$. Let $G$ be a Lie group acting properly and canonically on $M$. Let $E$ be the $G$-characteristic distribution with optimal momentum map $\mathcal{J} : M \to M/G_E$ and $h \in C^\infty(M)^G$ be a $G$-invariant Hamiltonian. Then

(i) The Hamiltonian flow $F_t$ of $X_h$ leaves the level sets $\mathcal{J}^{-1}(\rho)$ of $\mathcal{J}$ invariant and commutes with the $G$-action; hence, if $\rho \in M/G_E$ satisfies the closedness hypothesis, $F_t$ induces a flow $F_t^\rho$ on $M_\rho$, uniquely determined by $\pi_\rho \circ F_t \circ i_\rho = F_t^\rho \circ \pi_\rho$, where $i_\rho : \mathcal{J}^{-1}(\rho) \to M$ is the canonical injection and $\pi_\rho : \mathcal{J}^{-1}(\rho) \to M_\rho$ is the projection.

(ii) The flow $F_t^\rho$ is Hamiltonian in $(M_\rho, \omega_\rho)$, with Hamiltonian function $h_\rho \in C^\infty(M_\rho)$ defined by $h_\rho \circ \pi_\rho = h \circ i_\rho$. We will call $h_\rho$ the reduced Hamiltonian. The vector fields $X_h$ and $X_{h_\rho}$ are $\pi_\rho$-related.

(iii) Let $k \in C^\infty(M)^G$ be another $G$-invariant function. Then $\{h, k\}$ is also $G$-invariant and $\{h, k\}_\rho = \{h_\rho, k_\rho\}_{M_\rho}$, where $\{\cdot,\cdot\}_{M_\rho}$ denotes the Poisson bracket on $M_\rho$ associated to the symplectic structure $\omega_\rho$.

Proof. Part (i) is a consequence of the Optimal Noether Theorem and the $G$-equivariance of the flow $F_t$. Parts (ii) and (iii) are a straightforward verification that uses the $G$-invariance of $h$ and the definition of $\omega_\rho$ given by expression (4.3).
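The "straightforward verification" behind part (ii) can be sketched as follows. We assume the convention $\omega(X_h, v) = dh \cdot v$, which matches the computations in the proof of Theorem 4.7, and write $\mathcal{J}$ for the optimal momentum map; note that $X_h(z) \in E(z) = T_z\mathcal{J}^{-1}(\rho)$ because $h$ is $G$-invariant.

```latex
% Convention assumed: \omega(X_h, v) = dh . v  (as in the proof of Theorem 4.7).
\begin{align*}
\omega_\rho([z]_\rho)\big(T_z\pi_\rho\cdot X_h(z),\,T_z\pi_\rho\cdot v\big)
  &= \omega(z)\big(X_h(z),v\big) = dh(z)\cdot v \\
  &= d(h\circ i_\rho)(z)\cdot v = d(h_\rho\circ\pi_\rho)(z)\cdot v \\
  &= dh_\rho([z]_\rho)\cdot\big(T_z\pi_\rho\cdot v\big),
  \qquad v\in T_z\mathcal{J}^{-1}(\rho).
\end{align*}
```

Since $T_z\pi_\rho$ is surjective and $\omega_\rho$ is nondegenerate, this identity forces $T_z\pi_\rho \cdot X_h(z) = X_{h_\rho}([z]_\rho)$, which is the statement that $X_h$ and $X_{h_\rho}$ are $\pi_\rho$-related.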

4.3

Comparison of the Optimal and the Marsden–Weinstein Reductions

Suppose now that the $G$-action in the statement of Theorem 4.7 is globally Hamiltonian, that is, there exists a globally defined equivariant momentum map $\mathbf{J} : M \to \mathfrak{g}^*$ that allows us to perform symplectic reduction in the spirit of Marsden and Weinstein [1974]. What is the relation between the Marsden–Weinstein reduced spaces obtained via the use of $\mathbf{J}$ and the optimal reduced spaces constructed using Theorem 4.7? In Theorem 3.6 we showed that in the globally Hamiltonian case we have $\mathcal{J}^{-1}(\rho) = \mathbf{J}^{-1}(\mu) \cap M_H$, where $\mu = \mathbf{J}(m)$ and $H = G_m$, for some $m \in \mathcal{J}^{-1}(\rho)$ (all along this section we will assume that $\mathbf{J}^{-1}(\mu) \cap M_H$ is connected). Also, it is easy to show that in that situation $G_\rho = N_{G_\mu}(H) = N_G(H) \cap G_\mu$, with $G_\mu$ the coadjoint isotropy of $\mu \in \mathfrak{g}^*$. Indeed, if $g \in G_\rho$ then $g \cdot \rho = \rho$. By the definition of the $G$-action on $M/G_E$, this implies that $\mathcal{J}(g \cdot m) = \mathcal{J}(m)$ or, equivalently, both $g \cdot m$ and $m$ are in the same $G_E$-orbit, that is, there is a $G$-equivariant element $F_T \in G_E$ such that $F_T(m) = g \cdot m$. The $G$-equivariance of $F_T$ implies that $m$ and $g \cdot m$ have the same isotropy subgroup, hence
\[
H = G_m = G_{g \cdot m} = g G_m g^{-1} = g H g^{-1},
\]
which implies that $g \in N(H)$. At the same time, Noether's Theorem for $\mathbf{J}$, as well as its $G$-equivariance, implies that
\[
g \cdot \mu = g \cdot \mathbf{J}(m) = \mathbf{J}(g \cdot m) = \mathbf{J}(F_T(m)) = \mathbf{J}(m) = \mu,
\]
which implies that $g \in G_\mu$. We therefore have that $G_\rho \subset N_{G_\mu}(H)$. The reverse inclusion is trivial once we assume the connectedness of $\mathbf{J}^{-1}(\mu) \cap M_H$. In conclusion, we have that
\[
M_\rho = \big( \mathbf{J}^{-1}(\mu) \cap M_H \big) / N_{G_\mu}(H) = \big( \mathbf{J}^{-1}(\mu) \cap M_H \big) / \big( N_{G_\mu}(H)/H \big). \tag{4.5}
\]
When there are no singularities, the isotropies of all the elements in $\mathcal{J}^{-1}(\rho)$ are trivial ($H = \{e\}$) and therefore $M_H = M$ and $N_{G_\mu}(H) = G_\mu$. Consequently, in this case $M_\rho = \mathbf{J}^{-1}(\mu)/G_\mu$ and the optimal and Marsden–Weinstein reduced spaces coincide. In the presence of singularities, the optimal reduced spaces (4.5) coincide with the singular reduced spaces introduced in Sjamaar and Lerman [1991] (for compact groups at zero momentum) and in Ortega [1998] and Ortega and Ratiu [2002] (for proper actions at arbitrary momentum values). Indeed,
\[
M_\rho = \big( \mathbf{J}^{-1}(\mu) \cap M_H \big) / \big( N_{G_\mu}(H)/H \big) \simeq \big( \mathbf{J}^{-1}(\mu) \cap M_{(H)}^{G_\mu} \big) / G_\mu,
\]
where $M_{(H)}^{G_\mu} = \{ m \in M \mid G_m = gHg^{-1},\ g \in G_\mu \}$. See Sjamaar and Lerman [1991]; Ortega [1998]; Ortega and Ratiu [2002] for the details.

Acknowledgments: We are most grateful for many useful conversations with A. Alekseev, A. Blaom, V. Ginzburg, A. Knutson, B. Kostant, R. Loja Fernandes, J. Marsden, and A. Weinstein. This research was partially supported by the European Commission and the Swiss Federal Government through funding for the Research Training Network Mechanics and Symmetry in Europe (MASIE). Tudor Ratiu’s Research was also partially supported by the Swiss National Science Foundation.

References

Abraham, R., and Marsden, J. E. [1978], Foundations of Mechanics. Second edition, Addison–Wesley.
Abraham, R., Marsden, J. E., and Ratiu, T. S. [1988], Manifolds, Tensor Analysis, and Applications. Volume 75 of Applied Mathematical Sciences, Springer-Verlag.
Alekseev, A., Malkin, A., and Meinrenken, E. [1997], Lie group valued momentum maps. Preprint, dg-ga/9707021.
Arms, J. M., Cushman, R., and Gotay, M. J. [1991], A universal reduction procedure for Hamiltonian group actions. In The Geometry of Hamiltonian Systems, T. S. Ratiu, ed., pages 33–51. Springer-Verlag.
Bates, L., and Lerman, E. [1997], Proper group actions and symplectic stratified spaces. Pacific J. Math., 181(2):201–229.
Bredon, G. E. [1972], Introduction to Compact Transformation Groups. Academic Press.
Camacho, C., and Lins Neto, A. [1985], Geometric Theory of Foliations. Birkhäuser.
Gotay, M. J., and Tuynman, G. M. [1991], A symplectic analogue of the Mostow–Palais Theorem. Symplectic geometry, groupoids, and integrable systems (Berkeley, CA, 1989), Math. Sci. Res. Inst. Publ., 20:173–182, Springer-Verlag.
Kempf, G. [1987], Computing invariants. Springer Lecture Notes in Mathematics, volume 1278, 62–80. Springer-Verlag.
Kirillov, A. A. [1976], Elements of the Theory of Representations. Grundlehren der mathematischen Wissenschaften, volume 220. Springer-Verlag.
Lerman, E. [1995], Symplectic cuts. Mathematical Research Letters, 2:247–258.
Libermann, P., and Marle, C.-M. [1987], Symplectic Geometry and Analytical Mechanics. Reidel.
Marsden, J. E., Misiolek, G., Ortega, J.-P., Perlmutter, M., and Ratiu, T. S. [2001], Symplectic Reduction by Stages. Preprint.
Marsden, J. E., and Ratiu, T. S. [1999], Introduction to Mechanics and Symmetry. Texts in Applied Mathematics, volume 17. Second Edition. Springer-Verlag.
Marsden, J. E., and Weinstein, A. [1974], Reduction of symplectic manifolds with symmetry. Rep. Math. Phys., 5(1):121–130.
Mather, J. [1977], Differentiable invariants. Topology, 16:145–156.
McDuff, D. [1988], The moment map for circle actions on symplectic manifolds. J. Geom. Phys., 5:149–160.
Meyer, K. R. [1973], Symmetries and integrals in mechanics. In Dynamical Systems, pp. 259–273. M. M. Peixoto, ed. Academic Press.
Noether, E. [1918], Invariante Variationsprobleme. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-physikalische Klasse, pp. 235–258.
Ortega, J.-P. [1998], Symmetry, Reduction, and Stability in Hamiltonian Systems. Ph.D. Thesis. University of California, Santa Cruz. June, 1998.
Ortega, J.-P. [2001a], Optimal reduction. In preparation.
Ortega, J.-P. [2001b], Singular dual pairs. Preprint, INLN.
Ortega, J.-P., and Ratiu, T. S. [1998], Singular reduction of Poisson manifolds. Letters in Mathematical Physics, 46:359–372.
Ortega, J.-P., and Ratiu, T. S. [2002], Hamiltonian Singular Reduction. To appear in Birkhäuser, Progress in Mathematics.
Otto, M. [1987], A reduction scheme for phase spaces with almost Kähler symmetry. Regularity results for momentum level sets. J. Geom. Phys., 4:101–118.
Palais, R. [1961], On the existence of slices for actions of non-compact Lie groups. Ann. Math., 73:295–323.
Paterson, A. L. T. [1999], Groupoids, Inverse Semigroups, and their Operator Algebras. Progress in Mathematics, volume 170. Birkhäuser.
Poènaru, V. [1976], Singularités $C^\infty$ en présence de symétrie. Lecture Notes in Mathematics, volume 510. Springer-Verlag.
Schwarz, G. W. [1974], Smooth functions invariant under the action of a compact Lie group. Topology, 14:63–68.
Sjamaar, R., and Lerman, E. [1991], Stratified symplectic spaces and reduction. Ann. of Math., 134:375–422.
Stefan, P. [1974a], Accessibility and foliations with singularities. Bull. Amer. Math. Soc., 80:1142–1145.
Stefan, P. [1974b], Accessible sets, orbits and foliations with singularities. Proc. Lond. Math. Soc., 29:699–713.
Sussman, H. [1973], Orbits of families of vector fields and integrability of distributions. Trans. Amer. Math. Soc., 180:171–188.
Trotter, H. F. [1958], Approximation of semi-groups of operators. Pacific J. Math., 8:887–919.
Warner, F. W. [1983], Foundations of Differentiable Manifolds and Lie Groups. Graduate Texts in Mathematics, vol. 94. Springer-Verlag.
Weinstein, A. [1976], Lectures on Symplectic Manifolds. Expository lectures from the CBMS Regional Conference held at the University of North Carolina, March 8–12, 1976. Regional Conference Series in Mathematics, number 29. American Mathematical Society.
Weinstein, A. [1983], The local structure of Poisson manifolds. J. Differential Geometry, 18:523–557.
Weitsman, J. [1993], A Duistermaat–Heckman formula for symplectic circle actions. Internat. Math. Res. Notices, 12:309–312.
Weyl, H. [1946], The Classical Groups. Second Edition. Princeton University Press.

12 Combinatorial Formulas for Products of Thom Classes

Victor Guillemin
Catalin Zara

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT Let $G$ be a torus of dimension $n > 1$ and $M$ be a compact Hamiltonian $G$-manifold with $M^G$ finite. A circle $S^1$ in $G$ is generic if $M^G = M^{S^1}$. For such a circle the moment map associated with its action on $M$ is a perfect Morse function. Let $\{W_p^+ ; p \in M^G\}$ be the Morse–Whitney stratification of $M$ associated with this function and let $\tau_p^+$ be the equivariant Thom class dual to $W_p^+$. These classes form a basis of $H^*_G(M)$ as a module over $S(\mathfrak{g}^*)$ and, in particular,
\[
\tau_p^+ \tau_q^+ = \sum c^r_{pq} \tau_r^+
\]
with $c^r_{pq} \in S(\mathfrak{g}^*)$. For a large class of manifolds of this type we obtain a combinatorial description of these $\tau_p^+$'s and, from this description, a combinatorial formula for $c^r_{pq}$.

Contents

1 Products of Thom classes
2 G-Actions on Graphs
3 Combinatorial Formulas for Thom Classes
4 Combinatorial Intersection Numbers
5 Examples
  5.1 Cancellations Occurring in the Individual Terms
  5.2 The Flag Variety G = SL(3, C)/B
  5.3 The Zero-Dimensional Thom Class
  5.4 The (n − 1)-Dimensional Projective Space
6 An Integral Transform
7 Alternative Interpolation Schemes
8 Controlled Paths
References

1 Products of Thom classes

Let $M^{2d}$ be a compact Hamiltonian $S^1$-manifold with moment map $\phi : M \to \mathbb{R}$. If $M^{S^1}$ is finite, then $\phi$ is a Morse function, and its critical points, $p \in M^{S^1}$, are all of even index. This has important consequences for the topology of $M$: If we put an $S^1$-invariant Riemannian metric, $B$, on $M$ and let $v = \nabla_B \phi$ be the gradient vector field associated with $B$ and $\phi$, then, for every critical point $p \in M^{S^1}$, the unstable manifold at $p$,

$$W_p^+ = \Bigl\{ q \in M ;\ \lim_{t \to -\infty} \exp tv\,(q) = p \Bigr\},$$

supports a cohomology class, i.e., the complement of $W_p^+$ in its closure $\overline{W}_p^+$ is a union of $W_q^+$s of codimension at least two, and hence $\overline{W}_p^+$ is a “cycle”: the homology class $[\overline{W}_p^+]$ is well defined, and so is the dual class in cohomology. These dual “Thom” classes

$$\tau_p^+, \qquad p \in M^{S^1} \tag{1.1}$$

are an additive basis of the cohomology ring $H^*(M, \mathbb{R})$. The same is true of the stable manifolds

$$W_p^- = \Bigl\{ q \in M ;\ \lim_{t \to \infty} \exp tv\,(q) = p \Bigr\}$$

and their dual Thom classes

$$\tau_p^-, \qquad p \in M^{S^1}.$$

The main topic of this paper is the symplectic version of what is sometimes called the multiplicative Morse problem: Given $p$ and $q$ in $M^{S^1}$, then $\tau_p^+ \tau_q^+$ can be expanded as a sum

$$\tau_p^+ \tau_q^+ = \sum_r c^r_{pq}\, \tau_r^+. \tag{1.2}$$

What are the $c^r_{pq}$s? Closely related to this is the question of determining the cohomology pairings:

$$c_{pqr} = \int_M \tau_p^+ \tau_q^+ \tau_r^-. \tag{1.3}$$

Neither of these questions is easy to answer even when the structure of the cohomology ring itself is well understood. For instance if $M$ is the coadjoint orbit of a compact Lie group, the computation of the $c^r_{pq}$s is an important


open problem in the theory of the Schubert calculus and is the focus of a lot of recent activity. (See, for instance, Bernstein, Gelfand, and Gelfand [1973], Billey and Haiman [1995], Billey [1999], Kogan [2000], and Knutson [2001].)

In this paper we consider the equivariant version of this problem. We assume the action of $S^1$ on $M$ can be extended to a Hamiltonian action of a torus $G$ of dimension $n > 1$, and replace the $\tau_p^+$s in (1.1) by their equivariant counterparts. These equivariant Thom classes generate the equivariant cohomology ring $H_G(M, \mathbb{R})$ as a module over the ring $H_G(pt) = S(\mathfrak{g}^*)$, so one gets as above an identity

$$\tau_p^+ \tau_q^+ = \sum_r c^r_{pq}\, \tau_r^+,$$

but now the $c^r_{pq}$s are elements of the polynomial ring $S(\mathfrak{g}^*)$. When $\deg \tau_r^+ = \deg \tau_p^+ + \deg \tau_q^+$, they are polynomials of degree zero (i.e., real numbers) and, in fact, coincide with the $c^r_{pq}$s in (1.2). Thus, they are in principle a much larger list of unknown quantities. We show, however, that they are, in some sense, easier to compute due to the fact that, in equivariant cohomology, one has a much richer store of intersection invariants to play around with. More explicitly, if $X$ and $Y$ are submanifolds of $M$ and $\tau_X$ and $\tau_Y$ are their dual Thom classes, then the intersection number

$$\#(X \cap Y) = \int_M \tau_X \tau_Y \tag{1.4}$$

is zero except when $X$ and $Y$ are of complementary dimension. On the other hand, if $X$ and $Y$ are $G$-invariant and $\tau_X$ and $\tau_Y$ are their equivariant Thom classes, then the expression (1.4) (which is now an element of $S(\mathfrak{g}^*)$) can be non-zero no matter what the relative dimensions of $X$ and $Y$ are. Moreover, by the localization theorem of Atiyah–Bott–Berline–Vergne, (1.4) is a sum of local intersection invariants

$$\#(X \cap Y)_p \in Q(\mathfrak{g}^*),$$

where $Q(\mathfrak{g}^*)$ is the quotient field of $S(\mathfrak{g}^*)$ and $p$ is a fixed point, and each of these is itself an intersection invariant.

Of particular interest for us are certain intersection invariants of this type associated with the moment map $\phi$. Suppose that $p$ and $q$ are critical points of $\phi$ and that there are no critical values of $\phi$ in the interval $\bigl(\phi(p), \phi(q)\bigr)$. Let $\phi(p) < c < \phi(q)$ and let

$$M_c = \phi^{-1}(c)/S^1$$


be the symplectic reduction of $M$ at $c$. By the Marsden–Weinstein theorem, $M_c$ is a symplectic orbifold, and the action of $G$ on $M$ induces an action of the group $G_1 = G/S^1$ on $M_c$. The reduced spaces $(W_p^+)_c$ and $(W_q^-)_c$ are $G$-invariant symplectic sub-orbifolds of $M_c$ and so their equivariant intersection “number”

$$\#\bigl((W_p^+)_c \cap (W_q^-)_c\bigr) \tag{1.5}$$

is well-defined as an element of the subring $S(\mathfrak{g}_1^*)$ of $S(\mathfrak{g}^*)$. Moreover, if $M_c^{G_1}$ is finite, then for every $v \in M_c^{G_1}$, the local intersection number (i.e., the local contribution of the point $v$ in the ABBV localization formula)

$$\#\bigl((W_p^+)_c \cap (W_q^-)_c\bigr)_v \tag{1.6}$$

is well-defined as an element of $Q(\mathfrak{g}_1^*)$.

We now describe the role of these intersection numbers in the computation of the $c^r_{pq}$s. We say that $M$ is a GKM manifold if, for all non-critical values $c$, $M_c^{G_1}$ is finite. Thus, being GKM is a necessary and sufficient condition for the invariants (1.6) to be well-defined. We recall some other characterizations of these manifolds.

1.1 Theorem. $M$ is GKM iff, for every $p \in M^G$, the weights $\alpha_{i,p}$ of the linear isotropy representation of $G$ on $T_pM$ are pair-wise linearly independent; i.e., $\alpha_{i,p}$ is not a multiple of $\alpha_{j,p}$ if $i \neq j$.

1.2 Theorem. $M$ is GKM iff, for every codimension one subtorus $H$ of $G$, the connected components of $M^H$ are either fixed points of $G$ or imbedded copies of $S^2$. Moreover, if a connected component $X$ is a copy of $S^2$, then the action of $G$ on $X$ is symplectomorphic to the standard action of $G/H = S^1$ on $S^2$.

1.3 Theorem. $M$ is GKM iff the one skeleton of $M$,

$$\{\, p \in M,\ \dim G_p \ge n - 1 \,\}, \tag{1.7}$$

is a finite union of embedded $S^2$s.

Proof. Theorem 1.3 is a consequence of Theorem 1.2; and it is not hard to see that if the hypotheses of Theorem 1.2 hold, then $M$ is GKM. For the other implications, see Guillemin and Zara [2001].

The intersection properties of the embedded $S^2$s in the set (1.7) can be described by an intersection graph, and a beautiful observation of Goresky–Kottwitz–MacPherson is that one can read off the structure of the equivariant cohomology ring of $M$ from the “action” of $G$ on this graph. More explicitly, let $\Gamma$ be the graph whose vertices are the fixed points of $G$ and


whose edges, $e$, are copies, $X_e$, of the $S^2$s in Theorem 1.3. The graph structure on this collection of vertices and edges is given by defining the pair of vertices incident to an edge $e$ to be the set $\partial e = \{p, q\} = X_e^G$. To orient the edge $e$, we specify that one of the two vertices $\{p, q\}$ be the initial vertex, $i(e)$, of $e$, and the other be its terminal vertex, $t(e)$. If $H$ is the stabilizer group of $X_e$, then this orientation corresponds to an orientation of the circle group $G/H$, i.e., by specifying an orientation of $G/H$, we can regard the two-sphere $X_e$ as being rotated in a counterclockwise sense about its axis of symmetry, with $i(e)$ as its “south pole” and $t(e)$ as its “north pole”. (Hence, given a vector $\xi$ in $\mathfrak{g}$ such that $\xi \notin \mathfrak{h}$ for all stabilizer groups $H$ in Theorem 1.2, one gets a consistent orientation of all the edges of $\Gamma$.)

The action of $G$ on the set (1.7) can be described graph theoretically by two pieces of data: a function $\rho$, that assigns to each oriented edge $e$ of $\Gamma$ a one-dimensional representation $\rho_e$ of $G$, and a function $\kappa$, that assigns to each vertex $p$ a $d$-dimensional representation, $\kappa_p$. These functions are defined by letting $\rho_e$ be the representation of $G$ on $T_pX_e$, with $p = i(e)$, and letting $\kappa_p$ be the representation of $G$ on $T_pM$. It is easily checked that $\rho$ and $\kappa$ satisfy the axioms:

$$\kappa_p \simeq \bigoplus_{i(e)=p} \rho_e \tag{1.8}$$

$$\rho_{\bar e} = \rho_e^* \tag{1.9}$$

$$\kappa_p\big|_{G_e} = \kappa_q\big|_{G_e}, \tag{1.10}$$

where $\bar e$ is the edge obtained from $e$ by reversing its orientation, $G_e$ is the kernel of $\rho_e$, and $p$ and $q$ are the vertices $i(e)$ and $t(e)$. In particular, by (1.8), $\kappa_p$ is determined by the $\rho_e$s with $i(e) = p$. Since $\rho_e$ is a one-dimensional representation, it is determined by its weight, $\alpha_e$; so the “action” of $G$ on $\Gamma$ associated with $\rho$ and $\kappa$ consists essentially of a labeling of the oriented edges $e$ of $\Gamma$ by weights, $\alpha_e$. The axioms (1.8)–(1.10) impose, of course, some condition on this labeling. For instance (1.9) is equivalent to $\alpha_e = -\alpha_{\bar e}$.

Now let $H_G(M)$ be the equivariant cohomology ring of $M$ and $H_G(M^G)$ be the equivariant cohomology ring of $M^G$. Since $M^G$ is a finite disjoint union of fixed points and these fixed points are also the vertices $V_\Gamma$ of $\Gamma$, it follows that

$$H_G(M^G) = \mathrm{Maps}\bigl(V_\Gamma, S(\mathfrak{g}^*)\bigr).$$

Moreover, if $i : M^G \to M$ is the inclusion, then the map

$$i^* : H_G(M) \to H_G(M^G)$$


is injective, by a well-known result of Kirwan. The theorem of Goresky–Kottwitz–MacPherson which we alluded to above asserts:

1.4 Theorem. A map $h : V_\Gamma \to S(\mathfrak{g}^*)$ is in the image of $i^*$ iff, for every edge $e$,

$$h_p\big|_{\mathfrak{g}_e} = h_q\big|_{\mathfrak{g}_e} \tag{1.11}$$

where $p$ and $q$ are the vertices of $e$ and $\mathfrak{g}_e$ is the annihilator of $\alpha_e$ in $\mathfrak{g}$.

This leads us to define the equivariant cohomology ring, $H(\Gamma, \alpha)$, of $\Gamma$ to be the set of all maps $h : V_\Gamma \to S(\mathfrak{g}^*)$ satisfying (1.11). Each of the Thom classes (1.1) gets mapped by $i^*$ onto an element of $H(\Gamma, \alpha)$ and we continue to use the notation $\tau_p^+$ for this “combinatorial” Thom class. The main result of this article is a formula for this Thom class as a kind of path integral over certain paths in $\Gamma$. This formula, by the way, is true not only for graphs that come from manifolds. We define in Section 2 a notion of an “action” of a torus $G$ on a graph $\Gamma$, and show that this formula is true provided it is true for the subgraphs of $\Gamma$ that are “fixed” by the codimension two subgroups of $G$. Fortunately, the GKM condition ensures that a graph coming from a manifold has this property.

Before stating this result we describe a few basic properties of these combinatorial Thom classes. We continue to denote by $\phi$ the restriction of the moment map $\phi$ to $M^G$. Identifying $M^G$ with $V_\Gamma$, one can think of $\phi$ as a real-valued function on $V_\Gamma$. By Theorem 1.2, $\phi$ takes on distinct values on the vertices $i(e)$ and $t(e)$ of an oriented edge $e$. We say that this edge is ascending if $\phi(i(e)) < \phi(t(e))$ and descending if the reverse inequality is true. More generally, if $\gamma$ is a path in $\Gamma$, we say that $\gamma$ is ascending if each of its edges is ascending. For every vertex $p \in V_\Gamma$, define the index of $p$, $\sigma_p$, to be the number of descending edges $e$ with $i(e) = p$.

1.5 Theorem. The Thom class $\tau_p^+$ has the following properties:

1. Its support is the set of vertices of $\Gamma$ which can be joined to $p$ by an ascending path.
2. The value of $\tau_p^+$ at $p$ is

$$\nu_p^+ = \prod_{i(e)=p}{}^{\!\!\prime}\ \alpha_e,$$

where the product $\prod'$ is over all descending edges with $i(e) = p$.

In certain instances these properties uniquely characterize $\tau_p^+$.

1.6 Theorem. Suppose that the indexing function $\sigma : V_\Gamma \to \mathbb{Z}$, $p \mapsto \sigma_p$, is strictly increasing along ascending paths. Then $\tau_p^+$ is the unique element of $H(\Gamma, \alpha)$ with properties 1 and 2 above.
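The combinatorial data in Theorems 1.5 and 1.6 (ascending paths, the index $\sigma_p$, and the support of $\tau_p^+$) can be illustrated by a minimal Python sketch. The adjacency-list encoding, the function names, and the triangle example are assumptions made for illustration; they are not taken from the paper.

```python
from collections import deque

def flow_up(adjacency, phi, p):
    """Vertices joined to p by an ascending path (p itself included)."""
    seen, queue = {p}, deque([p])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            # the oriented edge v -> w is ascending when phi increases along it
            if phi[v] < phi[w] and w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

def index(adjacency, phi, p):
    """sigma_p: the number of descending edges e with i(e) = p."""
    return sum(1 for w in adjacency[p] if phi[w] < phi[p])

def index_increases_along_ascending_edges(adjacency, phi):
    """Hypothesis of Theorem 1.6, checked one ascending edge at a time."""
    return all(index(adjacency, phi, v) < index(adjacency, phi, w)
               for v in adjacency for w in adjacency[v] if phi[v] < phi[w])

# Hypothetical example: a triangle with an injective function phi on its vertices.
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
phi = {0: 0.0, 1: 1.0, 2: 2.0}
```

Under these assumptions `flow_up(triangle, phi, p)` is the set on which $\tau_p^+$ may be non-zero. Checking the index condition edge by edge suffices, because a function that increases strictly along every ascending edge increases strictly along every ascending path.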


We now describe our “path-integral” formula for $\tau_p^+$. As mentioned above, this formula involves the Hamiltonian action of the subgroup $S^1$ of $G$ on $M$ and the intersection invariants (1.5) and (1.6). Let $\xi \in \mathfrak{g}$ be the infinitesimal generator of this subgroup and let $e$ be an ascending edge of $\Gamma$ with $p = i(e)$ and $q = t(e)$. For any point $c$ on the interval between $\phi(p)$ and $\phi(q)$, the $S^1$-reduced space $(X_e)_c$ consists of a single point, $v \in M_c$. Let $\iota_e$ be the local intersection number (1.6).

1.7 Theorem. For every $q \in V_\Gamma$,

$$\tau_p^+(q) = \sum E(\gamma) \tag{1.12}$$

where the sum is over all ascending paths $\gamma : p = p_0 \to p_1 \to \cdots \to p_m = q$ joining $p$ to $q$, and the summands are defined by

$$E(\gamma) = (-1)^m\, \nu_q^+\, \frac{\iota_{e_1}}{\hat\alpha_m} \prod_{k=2}^{m} \frac{\iota_{e_k}}{\hat\alpha_{k-1} - \hat\alpha_k}, \tag{1.13}$$

where $e_i = (p_{i-1}, p_i)$, $i = 1, \dots, m$, are the edges of $\gamma$ and

$$\hat\alpha_k = \frac{\alpha_{e_k}}{\alpha_{e_k}(\xi)}.$$

Remarks. (1) The local intersection number $\iota_e$ is equal to the global intersection number (1.5) provided that there are no ascending paths in $\Gamma$ of length greater than one joining $p = i(e)$ to $q = t(e)$. In particular, if $\gamma$ is a longest path joining $p$ to $q$, then all the intersection numbers in (1.13) are global intersection numbers and, in particular, are elements of $S(\mathfrak{g}^*)$.

(2) In Section 4 we give a purely combinatorial definition of $\iota_e$ (see (4.7)).

As a corollary of Theorem 1.7, one gets for (1.3) the formula

$$c_{pqr} = \sum \delta_t\, E(\gamma_1) E(\gamma_2) E(\gamma_3) \tag{1.14}$$

summed over all configurations of paths, $\gamma_1$, $\gamma_2$ and $\gamma_3$, where $\gamma_1$ is an ascending path from $p$ to $t$, $\gamma_2$ an ascending path from $q$ to $t$, $\gamma_3$ a descending path from $r$ to $t$, and

$$\delta_t = \Bigl( \prod_{i(e)=t} \alpha_e \Bigr)^{-1}. \tag{1.15}$$

In particular:

Figure 1.1. Configuration of paths ($\gamma_1$ from $p$ to $t$, $\gamma_2$ from $q$ to $t$, $\gamma_3$ from $r$ to $t$).

1.8 Theorem. If the hypotheses of Theorem 1.6 are satisfied, $c^r_{pq} = c_{pqr}$ and hence $c^r_{pq}$ is equal to the sum (1.14).

A few words about the organization of this paper. In Section 2 we give a brief account of the theory of $G$-actions on graphs (based, for the most part, on material in Guillemin and Zara [2000]). In Section 3 we derive a preliminary version of Theorem 1.7 and then, in Section 4, deduce from it the version above, after first describing how to define the invariants (1.6) combinatorially. In Section 5 we attempt to demystify what is perhaps the most puzzling feature of the formula (1.12), the fact that all the summands are rational functions (elements of the quotient field, $Q(\mathfrak{g}^*)$), whereas the sum itself is a polynomial. This indicates that a lot of mysterious cancellations are occurring in this summation; and we show how these cancellations occur in a few simple but enlightening examples. A key ingredient in the proof of Theorem 1.7 is a “flip-flop” operation which describes how Thom classes change as one passes through critical points of the Morse function $\phi$. An interpretation of this operation as an integral transform is discussed in Section 6. In Section 7 we discuss alternative schemes for computing combinatorial Thom classes on graphs, and using one of these alternative schemes, compute the combinatorial Thom classes of certain Grassmannians. In Section 8 we show that these schemes can be made to yield a streamlined version of the path integral formula (1.12) in which far fewer paths are involved.

There is an alternative way of reducing the number of paths involved in the path integral formula (3.27) which exploits the fact that each of the summands (4.9) is not just a rational function of $x \in \mathfrak{g}^*$, but also depends on an exterior parameter, $\xi \in \mathcal{P}$, where $\mathcal{P}$ is the open cone (2.3). One can renormalize (3.27) by letting $\xi$ tend to zero; and, if one does this carefully, the limits of the individual terms in (3.27) stay finite and most of them go to zero. (A detailed account of this method will appear in Zara [2001].)


It is clear from this summary that, beginning with Section 2, this paper is more about graphs than about manifolds. One reason for this is that the proofs of Theorem 1.7 and of our other main results are much easier to follow when couched in the language of graphs. Another reason, however, is that there are certain graphs (e.g., Cayley graphs associated to reflection groups) that don't arise in the manifold setting but to which our results nonetheless apply (see, for example, Guillemin, Holm, and Zara [2001]).

We would like to thank Tara Holm and Sue Tolman for helping us with the computations involved in these examples and Allen Knutson for pointing out to us antecedents in the combinatorial literature for formulas of type (1.12) and (1.14).

2 G-Actions on Graphs

Let $\Gamma$ be a finite $d$-valent graph. Given an oriented edge $e$ of $\Gamma$, we denote by $i(e)$ the initial vertex of $e$ and by $t(e)$ the terminal vertex; and we denote by $\bar e$ the edge obtained from $e$ by reversing its orientation. Thus $i(\bar e) = t(e)$ and $t(\bar e) = i(e)$.

2.1 Definition. Let $\rho$ be a function which assigns to each oriented edge $e$ of $\Gamma$ a one dimensional representation, $\rho_e : G \to S^1$; and let $\kappa$ be a function which assigns to each vertex $p$ a $d$-dimensional representation $\kappa_p$ of $G$. We say that $\rho$ and $\kappa$ define an action of $G$ on $\Gamma$ if the axioms (1.8)–(1.10) are satisfied.

Let $\alpha_e$ be the weight of the representation $\rho_e$. By (1.8), the weights $\alpha_e$, $i(e) = p$, determine the representation $\kappa_p$ up to isomorphism; so the action of $G$ on $\Gamma$ is basically just a labelling of the edges of $\Gamma$ by weights. We denote the function $e \mapsto \alpha_e$ by $\alpha$ and call it the axial function of the action of $G$ on $\Gamma$. The axioms (1.8)–(1.10) can be reformulated as statements about $\alpha$:

2.2 Proposition. Axiom (1.9) is satisfied iff $\alpha_e = -\alpha_{\bar e}$ and axiom (1.10) is satisfied iff one can order $\alpha_{e_k}$, $i(e_k) = p$, $e_k \neq e$ and $\alpha_{e'_k}$, $i(e'_k) = q$, $e'_k \neq \bar e$ so that

$$\alpha_{e'_k} = \alpha_{e_k} + c_k \alpha_e. \tag{2.1}$$

(We leave the proof of these assertions as an easy exercise.)

2.3 Definition. The action of $G$ on $\Gamma$ is a GKM action if, for all vertices $p$, the weights $\alpha_e$, $i(e) = p$, are pair-wise linearly independent.

From now on we assume, unless we state otherwise, that the action of $G$ on $\Gamma$ has this property. For every vertex $p$ of $\Gamma$, let $E_p$ be the set of oriented edges $e$, with $i(e) = p$.


2.4 Definition. A connection on $\Gamma$ is a function which assigns to every oriented edge $e$, a bijective map $\theta_e : E_{i(e)} \to E_{t(e)}$ satisfying $\theta_{\bar e} = \theta_e^{-1}$. This connection is compatible with the action of $G$ if, for every oriented edge $e$, with $i(e) = p$ and every edge $e_k \in E_p$, $e_k \neq e$,

$$\alpha_{e'_k} = \alpha_{e_k} + c_k \alpha_e, \qquad \text{where } e'_k = \theta_e(e_k). \tag{2.2}$$

Thus the existence of a $G$-compatible connection is a slight sharpening of the identity (2.1). It is easy to see that $G$-compatible connections exist, and we assume henceforth that $\Gamma$ is equipped with such a connection.

Let $V_\Gamma$ be the set of vertices of $\Gamma$ and $E_\Gamma$ the set of oriented edges. Motivated by the theorem of Goresky–Kottwitz–MacPherson we define the equivariant cohomology ring $H(\Gamma, \alpha)$ of $\Gamma$, to be the set of maps $h : V_\Gamma \to S(\mathfrak{g}^*)$ satisfying the compatibility condition (1.11) for all $e \in E_\Gamma$. This ring has a natural grading [1]

$$H^k(\Gamma, \alpha) = H(\Gamma, \alpha) \cap \mathrm{Maps}\bigl(V_\Gamma, S^k(\mathfrak{g}^*)\bigr)$$

and contains $S(\mathfrak{g}^*)$ as a subring: the ring of constant maps of $V_\Gamma$ into $S(\mathfrak{g}^*)$.

[1] This definition is, unfortunately, inconsistent with the topological definition which assigns to $H^k$ the degree $2k$. It is, however, more natural in this algebraic context.

The proof of Theorem 1.7 requires a number of results about the structure of $H(\Gamma, \alpha)$ as an $S(\mathfrak{g}^*)$ module. These results were proved in an earlier paper of ours on “equivariant Morse theory on graphs” (Guillemin and Zara [2000]), and we refer to this paper for a detailed treatment of the material in the next few paragraphs. Let

$$\mathcal{P} = \{\, \xi \in \mathfrak{g},\ \alpha_e(\xi) \neq 0 \text{ for all } e \in E_\Gamma \,\}. \tag{2.3}$$

Given an element $\xi \in \mathcal{P}$ we say that an oriented edge $e$ is ascending with respect to $\xi$ if $\alpha_e(\xi) > 0$. For every vertex $p$, let the index $\sigma_p$ of $p$, be the number of ascending edges $e$, with $t(e) = p$.

2.5 Definition. The $k$th Betti number, $b_k(\Gamma)$, is the number of vertices $p$ of $\Gamma$ for which $\sigma_p = k$.

Remark. The definition of $\sigma_p$ depends upon the choice of $\xi$ but $b_k(\Gamma)$ turns out not to. (See [Guillemin and Zara, 1999, Theorem 2.6].)
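The independence of $b_k(\Gamma)$ from $\xi$ can be checked by hand in a small case. The sketch below is only an illustration under stated assumptions: it takes the complete graph on $n$ vertices with the weight $x_j - x_p$ on the edge from $j$ to $p$ (the example used in Section 3), represents $\xi$ by its coordinates $x_i(\xi)$, and uses function names of our own choosing.

```python
from itertools import permutations

def index(p, xi):
    """sigma_p on the complete graph: the number of ascending edges e with t(e) = p.

    The edge from j to p carries the weight x_j - x_p, so it is ascending
    exactly when (x_j - x_p)(xi) = xi[j] - xi[p] > 0.
    """
    return sum(1 for j in range(len(xi)) if j != p and xi[j] - xi[p] > 0)

def betti_numbers(xi):
    """b_k = number of vertices of index k, for k = 0, ..., n - 1."""
    n = len(xi)
    b = [0] * n
    for p in range(n):
        b[index(p, xi)] += 1
    return b

# Polarizing vectors with the same coordinates in every possible order:
# the indices of individual vertices change, the Betti numbers do not.
all_betti = {tuple(betti_numbers(xi)) for xi in permutations((0.5, 1.0, 2.0, 3.5))}
```

With these conventions `all_betti` contains the single tuple `(1, 1, 1, 1)`, since a vector $\xi$ with distinct coordinates gives each index $0, \dots, n-1$ to exactly one vertex.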


A function $\phi : V_\Gamma \to \mathbb{R}$ is a ($\xi$-compatible) Morse function if, for every ascending edge $e$, $\phi(i(e)) < \phi(t(e))$. It is not obvious, and in fact not true, that Morse functions always exist. A necessary and sufficient condition for the existence of a Morse function is that, for every ascending path in $\Gamma$, the initial vertex of this path is distinct from its terminal vertex (i.e., there are no ascending “loops”). If a Morse function exists, however, one can easily perturb it so that it is injective as a map of $V_\Gamma$ into $\mathbb{R}$. From now on we let $\phi$ be a fixed Morse function with this property.

The topological results discussed in Section 1 prompt one to make the following Morse-theoretic conjectures about the equivariant cohomology ring of a graph.

2.6 Conjecture. $H(\Gamma, \alpha)$ is a free $S(\mathfrak{g}^*)$ module with $b_k(\Gamma)$ generators of degree $k$.

2.7 Conjecture. $H(\Gamma, \alpha)$ is freely generated as an $S(\mathfrak{g}^*)$ module by a family of “Thom classes”

$$\tau_p^+ \in H^k(\Gamma, \alpha), \qquad k = \sigma_p,$$

satisfying

$$\operatorname{support} \tau_p^+ \subseteq F_p \tag{2.4}$$

and

$$\tau_p^+(p) = \prod_{e \in E_p^-} \alpha_e \quad (:= \nu_p^+), \tag{2.5}$$

where $F_p$ is the set of vertices which can be reached from $p$ along an ascending path and $E_p^-$ is the set of descending edges in $E_p$.

It is clear that Conjecture 2.7 implies Conjecture 2.6, and it is not difficult to prove that Conjecture 2.6 implies Conjecture 2.7 (see [Guillemin and Zara, 2001, §2.4.3]). Therefore, since Conjecture 2.6 doesn't depend on the choice of an orientation of $\Gamma$ (i.e., the choice of a polarizing vector $\xi \in \mathcal{P}$), the same is true of Conjecture 2.7. In particular, if we reverse the orientation (replace $\xi$ by $-\xi$), we get from Conjecture 2.7 the existence of Thom classes $\tau_p^-$, $p \in V_\Gamma$, associated with the Morse function $-\phi$.

Unfortunately these conjectures are not true in general; however there is a useful necessary and sufficient condition for them to be true involving certain subgraphs of $\Gamma$.

2.8 Definition. A subgraph $\Gamma_1$ of $\Gamma$ is totally geodesic if, for every pair of edges, $e$ and $e'$, of $\Gamma_1$, with $i(e) = i(e')$, $\theta_e(e')$ is also an edge of $\Gamma_1$.

Note that if $\Gamma_1$ is a totally geodesic subgraph of $\Gamma$, then the restriction of $\alpha$ to it is, by (2.2), an axial function on $\Gamma_1$; so each of these subgraphs is equipped with an action of $G$. An important example of a totally geodesic subgraph is the following. Let $\mathfrak{h}^*$ be a subspace of $\mathfrak{g}^*$, and let $\Gamma_{\mathfrak{h}^*}$ be the subgraph whose edges are the set

$$\{\, e \in E_\Gamma,\ \alpha_e \in \mathfrak{h}^* \,\}.$$


(It is clear, by (2.1) and (2.2), that this is totally geodesic.) One of the main results of Guillemin and Zara [2000] is the following.

2.9 Theorem. Conjecture 2.7 is true for $\Gamma$ iff, for every two-dimensional subspace $\mathfrak{h}^*$ of $\mathfrak{g}^*$, Conjecture 2.7 is true for $\Gamma_{\mathfrak{h}^*}$.

Thus, to verify that Conjecture 2.7 holds for $\Gamma$ it suffices to verify it for these subgraphs (which is usually much easier than verifying it for $\Gamma$ itself).

The proof of Theorem 2.9 involves a graph-theoretic version of symplectic reduction. We say that $c$ is a critical value of the Morse function $\phi : V_\Gamma \to \mathbb{R}$ if $c = \phi(p)$ for some $p \in V_\Gamma$ and, otherwise, $c$ is a regular value. Let $c$ be a regular value of $\phi$ and let $V_c$ be the set of oriented edges $e$ of $\Gamma$ with $\phi(i(e)) < c < \phi(t(e))$. We showed in Guillemin and Zara [2000] that $V_c$ is the set of vertices of a hypergraph, $\Gamma_c$. Thus the elements of $V_c$ are both edges of the graph $\Gamma$ and vertices of this hypergraph. It is useful to distinguish between the two roles they play by saying that “an edge $e$ intersects $\Gamma_c$ in a vertex $v_e$.”

Let $\mathfrak{g}^*_\xi$ be the annihilator of $\xi$ in $\mathfrak{g}^*$. For each oriented edge $e$ of $\Gamma$, we define a map $\rho_e : \mathfrak{g}^* \to \mathfrak{g}^*_\xi$ by setting

$$\rho_e \alpha = \alpha - \frac{\alpha(\xi)}{\alpha_e(\xi)}\, \alpha_e.$$

This extends to a ring homomorphism

$$\rho_e : S(\mathfrak{g}^*) \to S(\mathfrak{g}^*_\xi) \tag{2.6}$$

and, from (2.6), we get a ring homomorphism

$$K_c : H(\Gamma, \alpha) \to \mathrm{Maps}\bigl(V_c, S(\mathfrak{g}^*_\xi)\bigr)$$

by setting

$$K_c(g)(v_e) = \rho_e\, g_{i(e)} = \rho_e\, g_{t(e)}. \tag{2.7}$$

(The two terms on the right are equal by (1.11).) We showed in Guillemin and Zara [2000] that $K_c$ maps $H(\Gamma, \alpha)$ into the cohomology ring $H(\Gamma_c, \alpha_c)$ of the hypergraph $\Gamma_c$. We won't bother to review here the definition of this hypergraph cohomology ring (which is quite tricky) since one of the main theorems of Guillemin and Zara [2000] asserts that, if the hypotheses of Theorem 2.9 hold and if $\xi$ satisfies a certain genericity condition (which we spell out below), then the map

$$K_c : H(\Gamma, \alpha) \to H(\Gamma_c, \alpha_c)$$

is a surjection. Hence, thanks to this theorem, one can define $H(\Gamma_c, \alpha_c)$ to be the image of $K_c$.

A key step in the proof of Theorem 2.9 is a theorem which describes how the structure of the ring $H(\Gamma_c, \alpha_c)$ changes as one passes through a critical


value of $\phi$. More explicitly, suppose that $c$ and $c'$ are regular values of $\phi$ and that there is a unique vertex $p$ with $c < \phi(p) < c'$. Also, suppose that, for $e_1, e_2, e_3, e_4 \in E_p$,

$$\frac{1}{\alpha_{e_1}(\xi)}\, \rho_{e_2} \alpha_{e_1} \neq \frac{1}{\alpha_{e_3}(\xi)}\, \rho_{e_4} \alpha_{e_3} \tag{2.8}$$

except when the two sides of (2.8) are forced to be equal (i.e., except when $e_1 = e_2$ and $e_3 = e_4$ or $e_1 = e_3$ and $e_2 = e_4$). The inequality (2.8) is unfortunately not satisfied for all elements $\xi$ of the set (2.3), but one can show that those $\xi$s for which it is satisfied form an open dense subset of this set.

Let $r$ be the index of $p$ and let $s = d - r$. Let $e_i$, $i = 1, \dots, r$, be the descending edges in $E_p$ and let $e_a$, $a = r + 1, \dots, d$, be the ascending edges in $E_p$. Let

$$\Delta_c = \{ e_i ;\ i = 1, \dots, r \} \qquad \text{and} \qquad \Delta_{c'} = \{ e_a ;\ a = r + 1, \dots, d \}.$$

Then $\Delta_c$ is a subset of $V_c$, $\Delta_{c'}$ a subset of $V_{c'}$ and

$$V_c - \Delta_c = V_{c'} - \Delta_{c'} = V_0,$$

where $V_0$ is the intersection of $V_c$ and $V_{c'}$. Let

$$V^\# = V_0 \cup (\Delta_c \times \Delta_{c'}).$$

Then one has projection maps

$$\pi_c : V^\# \to V_c \qquad \text{and} \qquad \pi_{c'} : V^\# \to V_{c'}$$

and, from these projection maps, pull-back maps, $\pi_c^*$ and $\pi_{c'}^*$, embedding the rings

$$\mathrm{Maps}\bigl(V_c, S(\mathfrak{g}^*_\xi)\bigr) \tag{2.9}$$

and

$$\mathrm{Maps}\bigl(V_{c'}, S(\mathfrak{g}^*_\xi)\bigr) \tag{2.10}$$

into the ring

$$\mathrm{Maps}\bigl(V^\#, S(\mathfrak{g}^*_\xi)\bigr). \tag{2.11}$$

Moreover the ring

$$\mathrm{Maps}\bigl(\Delta_c, S(\mathfrak{g}^*_\xi)\bigr) \tag{2.12}$$

sits in the ring (2.9) as the set of maps $h : V_c \to S(\mathfrak{g}^*_\xi)$ supported on $\Delta_c$, and the ring

$$\mathrm{Maps}\bigl(\Delta_{c'}, S(\mathfrak{g}^*_\xi)\bigr) \tag{2.13}$$

sits inside the ring (2.10); so all the rings (2.9)–(2.13) can be regarded as subrings of (2.11).


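Let $y_1, \dots, y_{n-1}$ be a basis of $\mathfrak{g}^*_\xi$ and let $x$ be a fixed element of $\mathfrak{g}^*$ with $\langle x, \xi \rangle = 1$. Let

$$\alpha_{e_i} = m_i \bigl( x - \beta_i(y) \bigr) \qquad \text{and} \qquad \alpha_{e_a} = m_a \bigl( x - \beta_a(y) \bigr),$$

with $m_i < 0 < m_a$, $1 \le i \le r < a \le d$, and with the $\beta$s in $\mathfrak{g}^*_\xi$. Consider the maps $\tau_c : \Delta_c \to \mathfrak{g}^*_\xi$, $\tau_{c'} : \Delta_{c'} \to \mathfrak{g}^*_\xi$, $\tau^\# : \Delta_c \times \Delta_{c'} \to \mathfrak{g}^*_\xi$ given by

$$\tau_c(e_i) = \beta_i, \qquad \tau_{c'}(e_a) = \beta_a, \qquad \tau^\#(e_i, e_a) = \beta_i - \beta_a.$$

The first two of these maps depend on the choice of $x$; however, $\tau^\#$ is intrinsically defined since $\tau^\#(e_i, e_a)$ is just

$$\frac{1}{\alpha_{e_i}(\xi)}\, \rho_{e_a} \alpha_{e_i}.$$

Also, by the genericity condition (2.8), the map $\tau^\#$ sends $\Delta_c \times \Delta_{c'}$ injectively into $\mathfrak{g}^*_\xi$ and, as a consequence, $\tau_c$ and $\tau_{c'}$ map $\Delta_c$ and $\Delta_{c'}$ injectively into $\mathfrak{g}^*_\xi$. Define the cohomology ring $H(\Delta_c, \tau_c)$ to be the set of all maps of $\Delta_c$ into $S(\mathfrak{g}^*_\xi)$ of the form

$$h = \sum_{i=0}^{r-1} g_i\, \tau_c^i, \qquad g_i \in S(\mathfrak{g}^*_\xi),$$

and define $H(\Delta_{c'}, \tau_{c'})$ to be the set of all maps of $\Delta_{c'}$ into $S(\mathfrak{g}^*_\xi)$ of the form

$$h' = \sum_{i=0}^{s-1} g'_i\, \tau_{c'}^i, \qquad g'_i \in S(\mathfrak{g}^*_\xi).$$

The theorem we alluded to above asserts:

2.10 Theorem. For every $f \in H(\Gamma_c, \alpha_c)$ and every $f_i \in H(\Delta_c, \tau_c)$, for each $i = 1, \dots, s - 1$, there exists a unique $f' \in H(\Gamma_{c'}, \alpha_{c'})$ and unique $f'_j \in H(\Delta_{c'}, \tau_{c'})$, for each $j = 1, \dots, r - 1$, such that

$$f' + \sum_{j=1}^{r-1} (\tau^\#)^j f'_j = f + \sum_{i=1}^{s-1} (\tau^\#)^i f_i. \tag{2.14}$$

Remarks. (1) This theorem gives one a concrete picture of how $H(\Gamma_c, \alpha_c)$ changes as one goes through a critical point of $\phi$. Namely it shows that $H(\Gamma_{c'}, \alpha_{c'})$ can be obtained from $H(\Gamma_c, \alpha_c)$ by a “blow-up” followed by a “blow-down”. (Compare with [Guillemin and Zara, 2001, Theorem 2.3.2].)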


(2) This theorem can also be used to map cohomology classes in $H(\Gamma_c, \alpha_c)$ into cohomology classes in $H(\Gamma_{c'}, \alpha_{c'})$. With $f_i = 0$, $i = 1, \dots, s - 1$, in (2.14) one gets from the cohomology class $f \in H(\Gamma_c, \alpha_c)$ a unique cohomology class $f' \in H(\Gamma_{c'}, \alpha_{c'})$. (This observation is heavily exploited in the next section.)

An important ingredient in the proof of Theorem 2.10 is the following.

2.11 Theorem. If $f$ is in $H(\Gamma_c, \alpha_c)$, then its restriction to $\Delta_c$ is in $H(\Delta_c, \tau_c)$.

(It is in the proof of this result that the hypotheses of Theorem 2.9 are needed.)

3 Combinatorial Formulas for Thom Classes

We describe in this section how to compute the combinatorial Thom class $\tau_{p_0}^+$ at an arbitrary point $p$ on the flow-up $F_{p_0}$. We recall that $\tau_{p_0}^+$ is canonically defined only if the index function $\sigma : V_\Gamma \to \mathbb{Z}$ is strictly increasing along ascending paths in $\Gamma$. Assuming that $\sigma$ has this property, we show below that there is a simple inductive method for computing $\tau_{p_0}^+$ on a critical level $c$ of $\phi$ if one knows the values of $\tau_{p_0}^+$ on lower critical levels. Then, later in this section, we show that this method works even when the hypothesis about $\sigma$ is dropped.

Let $\phi(p_0) = c_0$ and $\sigma_{p_0} = m$. The first step in this induction is to set $\tau_{p_0}^+(q) = 0$ for all vertices $q$ with $\phi(q) < c_0$ and set $\tau_{p_0}^+(p_0) = \nu_{p_0}^+$, as in (2.5). Now let $c > c_0$ and suppose, by induction, that $\tau_{p_0}^+(q)$ is defined for all $q$ with $\phi(q) < c$ and is zero unless $q$ is in $F_{p_0}$. Let $p$ be a vertex with $\phi(p) = c$. Let $\sigma_p = r$ and let $e_k$, $k = 1, \dots, r$, be the descending edges in $\Gamma$ with $p = i(e_k)$. Then the vertices $q_k = t(e_k)$ are points where $\tau_{p_0}^+$ is already defined.

3.1 Lemma. There exists a unique polynomial $\psi \in S(\mathfrak{g}^*)$ such that

$$\psi \equiv \tau_{p_0}^+(q_k) \mod \alpha_{e_k}, \qquad k = 1, \dots, r. \tag{3.1}$$

Remark. The uniqueness part of this lemma is where the hypothesis on $\sigma$ is used. If $f \in S^m(\mathfrak{g}^*)$ and

$$f \equiv 0 \mod \alpha_{e_k}, \qquad k = 1, \dots, r,$$

then $f = h\, \alpha_{e_1} \cdots \alpha_{e_r}$, $h \in S^{m-r}(\mathfrak{g}^*)$. Hence, if $m < r$, then $f$ is identically zero.


Using this lemma, set $\tau_{p_0}^+(p) = \psi$, and continue with the induction until the set of vertices of $\Gamma$ is exhausted. It is clear from (3.1) that this construction gives us a map $\tau_{p_0}^+ : V_\Gamma \to S^m(\mathfrak{g}^*)$ satisfying (1.11) and that this map is supported on $F_{p_0}$.

By giving a constructive proof of the existence part of Lemma 3.1 the induction argument we just sketched can be converted into a formula for $\tau_{p_0}^+$, and this is the main goal of this section. Note that the solution of (3.1) is basically an interpolation problem: finding a polynomial with prescribed values at $\alpha_{e_1}, \dots, \alpha_{e_r}$. To solve this problem constructively, we review a few elementary facts about “interpolation”.

The basic problem in interpolation theory is to find a polynomial

$$p(x) = \sum_{i=1}^{n} g_i\, x^{i-1} \tag{3.2}$$

which takes prescribed values

$$p(x_i) = f_i \tag{3.3}$$

at $n$ distinct points $x_i$ on the complex line. The solution of this problem is more or less trivial. The polynomial

$$p(x) = \sum_{j=1}^{n} f_j \prod_{k \neq j} \frac{x - x_k}{x_j - x_k} \tag{3.4}$$

satisfies (3.3) and is the only polynomial of degree less than $n$ which does satisfy (3.3). Moreover, from (3.4) one gets an explicit formula for the $g_i$s in (3.2). Let

$$\prod_{\ell \neq j} (x - x_\ell) = \sum_{i=0}^{n-1} (-1)^{n-1-i}\, \sigma^j_{n-1-i}\, x^i,$$

where $\sigma^j_r$ is the $r$-th elementary symmetric function in the $x_\ell$, $\ell \neq j$. Then, by (3.4),

$$g_i = (-1)^{n-i} \sum_{j=1}^{n} \frac{\sigma^j_{n-i}}{\prod_{\ell \neq j} (x_j - x_\ell)}\, f_j. \tag{3.5}$$

One consequence of (3.5) is an inversion formula for the Vandermonde matrix, $A$, with entries

$$a_{ij} = x_i^{\,j-1}, \qquad 1 \le i, j \le n.$$

If $B = A^{-1}$, then, by (3.5) and (3.3),

$$b_{ij} = (-1)^{n-i}\, \frac{\sigma^j_{n-i}}{\prod_{\ell \neq j} (x_j - x_\ell)}. \tag{3.6}$$


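In particular,

$$b_{nj} = \prod_{\ell \neq j} \frac{1}{x_j - x_\ell} \tag{3.7}$$

and

$$b_{1j} = \prod_{\ell \neq j} \frac{-x_\ell}{x_j - x_\ell}. \tag{3.8}$$

It is sometimes convenient to write the inversion formula (3.6) in terms of the elementary symmetric functions $\sigma_r = \sigma_r(x_1, \dots, x_n)$ rather than in terms of the $\sigma^j_r$s. To do so, we note that

$$\sigma^j_k = \sum_{r=0}^{k} (-1)^r\, \sigma_{k-r}\, x_j^r. \tag{3.9}$$

(To derive (3.9), observe that

$$\prod_{\ell \neq j} (x - x_\ell) = \prod_{\ell} (x - x_\ell)\, \frac{1}{x - x_j} = \prod_{\ell} (x - x_\ell)\, \frac{1}{x} \sum_{i=0}^{\infty} \Bigl( \frac{x_j}{x} \Bigr)^i$$

and compare coefficients of $x^{n-k-1}$ on both sides.) Substituting (3.9) into (3.6) one gets an alternative inversion formula for the Vandermonde matrix

$$b_{ij} = \sum_{r=0}^{n-i} (-1)^{n-i-r}\, \frac{\sigma_{n-i-r}\, x_j^r}{\prod_{\ell \neq j} (x_j - x_\ell)}. \tag{3.10}$$

Finally we note a couple of trivial consequences of (3.7) and (3.8). From (3.7) and the identity

$$\sum_j b_{nj}\, a_{jk} = \delta_{kn},$$

we conclude that the sum

$$\sum_j \frac{x_j^{\,k-1}}{\prod_{\ell \neq j} (x_j - x_\ell)}$$

is zero if $k$ is less than $n$, and is 1 if $k = n$; and from (3.8) and the identity

$$\sum_j b_{1j}\, a_{j1} = 1,$$

we conclude that

$$\sum_j \prod_{\ell \neq j} \frac{-x_\ell}{x_j - x_\ell} = 1. \tag{3.11}$$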
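The inversion formulas (3.6) and (3.10) and the identity (3.11) can be sanity-checked numerically. The following is a sketch in exact rational arithmetic, not part of the paper: the function names and the sample points are assumptions, and the points are assumed distinct.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elem_sym(values, r):
    """r-th elementary symmetric function of `values` (equal to 1 for r = 0)."""
    return sum((prod(c) for c in combinations(values, r)), Fraction(0))

def vandermonde(xs):
    """A with a_ij = x_i^(j-1), stored with 0-based indices."""
    return [[x ** j for j in range(len(xs))] for x in xs]

def inverse_by_3_6(xs):
    """B with b_ij = (-1)^(n-i) sigma^j_(n-i) / prod_{l != j} (x_j - x_l)."""
    n = len(xs)
    rows = []
    for i in range(1, n + 1):                      # 1-based row index, as in (3.6)
        row = []
        for j in range(n):
            others = xs[:j] + xs[j + 1:]           # the x_l with l != j
            denom = prod(xs[j] - x for x in others)
            row.append(Fraction((-1) ** (n - i)) * elem_sym(others, n - i) / denom)
        rows.append(row)
    return rows

def inverse_by_3_10(xs):
    """The same matrix, written with the sigma_r of all n points as in (3.10)."""
    n = len(xs)
    sigma = [elem_sym(xs, r) for r in range(n + 1)]
    rows = []
    for i in range(1, n + 1):
        row = []
        for j in range(n):
            denom = prod(xs[j] - x for k, x in enumerate(xs) if k != j)
            numer = sum((-1) ** (n - i - r) * sigma[n - i - r] * xs[j] ** r
                        for r in range(n - i + 1))
            row.append(Fraction(numer) / denom)
        rows.append(row)
    return rows

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lhs_3_11(xs):
    """Left-hand side of (3.11)."""
    return sum(prod(-x / (xs[j] - x) for k, x in enumerate(xs) if k != j)
               for j in range(len(xs)))

xs = [Fraction(v) for v in (2, -1, 5, 3)]          # hypothetical sample points
A, B = vandermonde(xs), inverse_by_3_6(xs)
identity = [[int(i == j) for j in range(len(xs))] for i in range(len(xs))]
```

For these sample points `matmul(B, A)` and `matmul(A, B)` both equal `identity`, the two inversion formulas produce the same matrix, and `lhs_3_11(xs)` equals 1. A numerical check at one point is of course only evidence for the identities, not a proof.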


In the applications which we make of these identities below, the $x_i$'s are variables and the $f_i$'s are polynomials in these variables, and we want to know when the $g_i$'s are also polynomials in these variables. To answer this question we show that these identities have a simple “topological” interpretation: Suppose one is given a graph $\Gamma$ and an action of $G$ on $\Gamma$ defined by an axial function $\alpha : E_\Gamma \to \mathfrak{g}^*$. One of the main results of an earlier paper of ours is that there is a canonical integration operation

$$\int_\Gamma : H(\Gamma, \alpha) \to S(\mathfrak{g}^*)$$

defined by

$$\int_\Gamma f = \sum_{p \in V_\Gamma} f_p\, \delta_p , \tag{3.12}$$

where $\delta_p$ is given by (1.15). (See [Guillemin and Zara, 1999, Section 2.4]. This formula is the formal analogue of the standard localization theorem (Atiyah and Bott [1984], Berline and Vergne [1982]) in equivariant DeRham theory.) In particular let $\Delta$ be the complete graph on $n$ vertices. Denote these vertices by $1, \dots, n$, and let $x_1, \dots, x_n$ be a basis of $\mathfrak{g}^*$. It is easy to check that the map $\alpha : E_\Delta \to \mathfrak{g}^*$, which assigns the weight $x_i - x_j$ to the edge joining $i$ to $j$, is an axial function, and that the map

$$\tau : V_\Delta \to \mathfrak{g}^* , \qquad \tau(i) = x_i$$

is an element of $H^1(\Delta, \alpha)$. We claim that $1, \tau, \dots, \tau^{n-1}$ generate $H(\Delta, \alpha)$ as a module over $S(\mathfrak{g}^*)$. To see this, let $\nu_i$ be the cohomology class

$$\nu_i = \sum_{r=0}^{n-i} (-1)^{n-i-r}\, \sigma_{n-i-r}\, \tau^r , \qquad i = 1, \dots, n .$$

Then (3.10) simply asserts that

$$\int_\Delta \nu_i\, \tau^{j-1} = \delta_{ij} .$$

In particular if $f$ is any cohomology class, then one can express $f$ as a sum

$$f = \sum_{i=1}^{n} g_i\, \tau^{i-1} , \qquad \text{where} \qquad g_i = \int_\Delta \nu_i f \in S(\mathfrak{g}^*) .$$

This proves the assertion:

3.2 Proposition. Suppose $f_1, \dots, f_n$ are polynomials in $x_1, \dots, x_n$ and the function

$$p(x) = \sum_{i=1}^{n} g_i\, x^{i-1}$$

solves the interpolation problem $p(x_i) = f_i$. Then the $g_i$'s are polynomials in $x_1, \dots, x_n$ iff $x_i - x_j$ divides $f_i - f_j$.

Proof. If $x_i - x_j$ divides $f_i - f_j$, then the map

$$f : V_\Delta \to S(\mathfrak{g}^*) , \qquad i \mapsto f_i$$

is in $H(\Delta, \alpha)$, so, by the discussion preceding the proposition, the coefficients $g_i = \int_\Delta \nu_i f$ lie in $S(\mathfrak{g}^*)$.
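As an illustration of Proposition 3.2 (a worked case added here, not part of the original argument), take $n = 2$. Solving $p(x_1) = f_1$, $p(x_2) = f_2$ for $p(x) = g_1 + g_2 x$ gives the following, and the sample data $f_i = x_i^2$ shows the divisibility condition at work.

```latex
g_2 = \frac{f_1 - f_2}{x_1 - x_2}, \qquad
g_1 = f_1 - x_1 g_2 = \frac{x_1 f_2 - x_2 f_1}{x_1 - x_2}.
% Example: f_1 = x_1^2, f_2 = x_2^2, so x_1 - x_2 divides f_1 - f_2 and
g_2 = x_1 + x_2, \qquad g_1 = -x_1 x_2, \qquad
p(x) = -x_1 x_2 + (x_1 + x_2)\, x .
% Non-example: f_1 = 1, f_2 = 0 gives g_2 = 1/(x_1 - x_2), which is not a polynomial.
```

In the first case $p(x_1) = -x_1 x_2 + x_1^2 + x_1 x_2 = x_1^2$, as required.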

Let’s come back now to Theorem 2.10 and the application of it which we discussed at the end of Section 2. As in Theorem 2.10, let $c$ and $c'$ be regular values of $\phi$, and suppose that there is just one vertex of $\Gamma$, $p$, with $c < \phi(p) < c'$. By setting $f_1 = f_2 = \dots = f_{s-1} = 0$ in (2.14), one gets a map

$$T_{c,c'} : H(\Gamma_c, \alpha_c) \to H(\Gamma_{c'}, \alpha_{c'}) \tag{3.13}$$

sending $f_0$ to $f_0'$, and, by the results above, one can give a fairly concrete description of this map. Let’s order the edges $e_1, \dots, e_d \in E_p$ so that $e_1, \dots, e_r$ are descending and $e_{r+1}, \dots, e_d$ are ascending, and let $\Delta_c$ and $\Delta_{c'}$ be the vertices of $\Gamma_c$ and $\Gamma_{c'}$ corresponding to the $e_j$'s, $1 \le j \le r$, and the $e_a$'s, $r + 1 \le a \le d$. Then

$$V_c = V_0 \cup \Delta_c \qquad \text{and} \qquad V_{c'} = V_0 \cup \Delta_{c'} ,$$

where $V_0$ are the vertices which are common to $\Gamma_c$ and $\Gamma_{c'}$. We identify $\Delta_c$ with the set $\{1, \dots, r\}$ and $\Delta_{c'}$ with the set $\{r + 1, \dots, d\}$. Let $f_0$ be in $H(\Gamma_c, \alpha_c)$ and let $f_0' = T_{c,c'}(f_0)$. Then, by (2.14) and (3.8),

$$f_0'(a) = \sum_{j=1}^{r} \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k}\, f_0(j) , \tag{3.14}$$

for $a$ in $\Delta_{c'}$ and $j$ and $k$ in $\Delta_c$; and

$$f_0' = f_0 \quad \text{on } V_0 . \tag{3.15}$$

The identity (3.14) has the following simple interpretation. Let

$$p(x) = \sum_{j=1}^{r} \prod_{k \neq j} \frac{x - \beta_k}{\beta_j - \beta_k}\, f_0(j) . \tag{3.16}$$


Then, by (3.4), $p(x)$ solves the interpolation problem

$$p(\beta_j) = f_0(j) . \tag{3.17}$$

On the other hand, by Theorem 2.11,

$$f_0|_{\Delta_c} \in H(\Delta_c, \tau_c) ;$$

so, by Proposition 3.2, $p(x)$ is a polynomial in $x, \beta_1(y), \dots, \beta_r(y)$ and hence a polynomial in $(x, y_1, \dots, y_{n-1})$, i.e., an element of the ring $S(\mathfrak{g}^*)$. In fact, if $f_0 \in H^m(\Gamma_c, \alpha_c)$ and $r > m$, then $p(x)$ is the unique element of $S^m(\mathfrak{g}^*)$ satisfying (3.17). By (3.14), $p(\beta_a) = f_0'(a)$, so (3.14) simply says that

$$f_0'|_{\Delta_{c'}} = p(\tau_{c'}) .$$

Thus, to summarize, we have proved:

3.3 Theorem. The map

$$T_{c,c'} : H^m(\Gamma_c, \alpha_c) \to H^m(\Gamma_{c'}, \alpha_{c'})$$

is the identity map on $V_0$, and on $\Delta_c$ is the “flip–flop”

$$f_0 = p(\tau_c) \to p(x) \to p(\tau_{c'}) = f_0' . \tag{3.18}$$

Since $V_c$ and $V_{c'}$ are finite sets, the map $T_{c,c'}$ is defined by a matrix with entries $T_{c,c'}(v, v')$, $(v, v') \in V_c \times V_{c'}$. An important property of this matrix is the Markov property:

$$\sum_{v \in V_c} T_{c,c'}(v, v') = 1 . \tag{3.19}$$

Proof. It suffices to check this for $a \in \Delta_{c'}$, i.e., it suffices to check that

$$\sum_{j=1}^{r} T_{c,c'}(j, a) = 1 .$$

However, by (3.14), this sum is equal to

$$\sum_{j=1}^{r} \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k} ,$$

which is equal to 1 by (3.11), with $x_\ell = \beta_\ell - \beta_a$.
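For instance, in the smallest nontrivial case $r = 2$ the Markov property can be verified by hand (a worked case added for illustration):

```latex
\sum_{j=1}^{2} \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k}
  = \frac{\beta_a - \beta_2}{\beta_1 - \beta_2} + \frac{\beta_a - \beta_1}{\beta_2 - \beta_1}
  = \frac{(\beta_a - \beta_2) - (\beta_a - \beta_1)}{\beta_1 - \beta_2}
  = 1 .
```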


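We next give a more intrinsic description of $T_{c,c'}$ and of the polynomial $p$ in (3.16). We recall that $\alpha_{e_i} = m_i (x - \beta_i(y))$ and $\alpha_{e_a} = m_a (x - \beta_a(y))$, with $i = 1, \dots, r$ and $a = r + 1, \dots, d$. Hence

$$T_{c,c'}(j, a) = \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k} = \prod_{k \neq j} \frac{\alpha_{e_k} - (m_k/m_a)\, \alpha_{e_a}}{\alpha_{e_k} - (m_k/m_j)\, \alpha_{e_j}}$$

and therefore

$$T_{c,c'}(j, a) = \frac{\rho_{e_a}\bigl( \prod_{k \neq j} \alpha_{e_k} \bigr)}{\rho_{e_j}\bigl( \prod_{k \neq j} \alpha_{e_k} \bigr)} ,$$

where $\rho_e$ is the map (2.6). Similarly the polynomial $p$ is just

$$\sum_{j} \frac{\prod_{k \neq j} \alpha_{e_k}}{\rho_{e_j}\bigl( \prod_{k \neq j} \alpha_{e_k} \bigr)}\, f_0(j) . \tag{3.20}$$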

By iterating (3.13) we extend the definition of $T_{c,c'}$ to arbitrary regular values of $\phi$ with $c < c'$. Let $c_i$, $i = 0, \dots, \ell$, be regular values of $\phi$ with $c_0 = c$ and $c_\ell = c'$, and such that there exists a unique vertex, $p_i$, with $c_{i-1} < \phi(p_i) < c_i$. Let $T_i = T_{c_{i-1} c_i}$ and let

$$T : H(\Gamma_c, \alpha_c) \to H(\Gamma_{c'}, \alpha_{c'}) \tag{3.21}$$

be the map

$$T = T_\ell \circ \dots \circ T_1 . \tag{3.22}$$

We list a few properties of this map:

(1) This map is defined by a matrix with entries

$$T(v, v') , \qquad (v, v') \in V_c \times V_{c'} ,$$

and since all the factors on the right hand side of (3.22) have the Markov property (3.19), this matrix also has this property.

(2) The matrix version of (3.22) asserts that

$$T(v, w) = \sum T_\ell(v_{\ell-1}, w) \cdots T_2(v_1, v_2)\, T_1(v, v_1) ,$$

summed over all sequences $v_1, \dots, v_{\ell-1}$ with $v_k \in V_{c_k}$. By (3.14) and (3.15), a large number of the matrix entries in this formula are either 1 or 0: If $e_k$ is an ascending edge which intersects $\Gamma_{c_k}$ in $v_k$ and $\Gamma_{c_{k-1}}$ in $v_{k-1}$, then

$$T_k(v', v_k) = \begin{cases} 0 & \text{if } v' \neq v_{k-1} , \\ 1 & \text{if } v' = v_{k-1} . \end{cases}$$


This fact can be exploited to write the sum above more succinctly. For every pair of edges, $e$ and $e'$, with $t(e) = i(e') = p$, let $e_1, \dots, e_r$ be the descending edges in $E_p$, ordered so that $e_r = \bar{e}$, and let

$$Q(e, e') = \frac{\rho_{e'}\bigl( \prod_{i=1}^{r-1} \alpha_{e_i} \bigr)}{\rho_{e}\bigl( \prod_{i=1}^{r-1} \alpha_{e_i} \bigr)} .$$

Then $T(v, w)$ can be written as a weighted sum:

$$T(v, w) = \sum_\gamma Q(\gamma)$$

over all ascending paths $\gamma$ in $\Gamma$ whose initial edge intersects $\Gamma_c$ in $v$ and whose terminal edge intersects $\Gamma_{c'}$ in $w$. The weighting of the path $\gamma$ is given by

$$Q(\gamma) = \prod_{i=1}^{m} Q(e_{i-1}, e_i) , \tag{3.23}$$

where the $e_i$'s are the edges of $\gamma$, ordered so that for $i > 1$, $t(e_{i-1}) = i(e_i)$.

(3) The map (3.22) can also be viewed as a series of “flip–flops”. Let $f_0$ be an element of $H(\Gamma_c, \alpha_c)$, and let $f_i = T_i \cdots T_1 f_0$. Then $T_i$ maps $f_{i-1}$ to $f_i$ by a map of the form (3.18). Let’s denote the polynomial $p$ in (3.18) by $\psi_{p_i}$. We claim:

3.4 Proposition. If $p_i$ is joined to $p_j$ by an ascending edge $e$, then

$$\psi_{p_i} \equiv \psi_{p_j} \mod \alpha_e .$$

Proof. This is equivalent to asserting that

$$\rho_e \psi_{p_i} = \rho_e \psi_{p_j} ; \tag{3.24}$$

however, (3.24) is, by definition, the common value of $f_k(v_k)$, $i \le k < j$, at the vertices, $v_k$, at which $e$ intersects $\Gamma_{c_k}$.

(4) In particular let $p_0$ be an arbitrary vertex of $\Gamma$; and choose $c$ and $c'$ such that there are no critical values of $\phi$ on the interval $(\phi(p_0), c)$ and such that $c' > \max \phi(p)$, $p \in V_\Gamma$. Order the edges $e_1, \dots, e_d$ in $E_{p_0}$ so that $e_1, \dots, e_r$ are descending and $e_{r+1}, \dots, e_d$ are ascending. For $r + 1 \le a \le d$, let $v_a$ be the vertex at which $e_a$ intersects $\Gamma_c$, and let

$$f_0 : V_{\Gamma_c} \to S^r(\mathfrak{g}^*_\xi)$$

be the map defined by

$$f_0(v) = \begin{cases} 0 & \text{if } v \notin \{ v_{r+1}, \dots, v_d \} , \\ \rho_{e_a}\bigl( \prod_{i=1}^{r} \alpha_{e_i} \bigr) & \text{if } v = v_a . \end{cases} \tag{3.25}$$

3.5 Proposition. $f_0$ is an element of $H(\Gamma_c, \alpha_c)$.

Proof. By (2.7), $f_0 = K_c \tau^+_{p_0}$. (This proof assumes that there exists a Thom class, $\tau^+_{p_0}$, having the properties listed in Theorem 1.5. Alternatively, Proposition 3.5 can be proved directly using a more sophisticated definition of $H(\Gamma_c, \alpha_c)$ than that which we gave in Section 2. For more details see [Guillemin and Zara, 2000, Section 4].)

By applying the sequence of flip–flops, $T_i$, to the $f_0$ above, we get a polynomial, $\psi_{p_i} \in S^r(\mathfrak{g}^*)$, for each vertex $p_i$ of $\Gamma$ with $\phi(p_i) > c$. On the other hand, we can define $\tau_p$ for $\phi(p) < c$ to be equal to (2.5) at $p_0$ and equal to zero otherwise. By (3.24), $\tau_p$ satisfies the cocycle condition (1.11) at all vertices except $p_0$, and by (3.25) it satisfies this condition at $p_0$ as well. Thus, if the index function, $\sigma : V_\Gamma \to \mathbb{Z}$, is strictly increasing along ascending paths, this settles the existence part of Lemma 3.1 and justifies the induction method for constructing $\tau^+_{p_0}$ which we outlined at the beginning of this section. On the other hand, if $\sigma$ fails to satisfy this hypothesis, the assignment $p \to \tau_p$ still defines an element of $H^r(\Gamma, \alpha)$ with the properties listed in Theorem 1.5; however, it won’t be the only element with these properties and may not even be the optimal element with these properties.

(5) From (3.23) one gets the following “path integral” formula for $\tau^+_p$. If $e$ is an ascending edge of $\Gamma$, let $p = t(e)$ and let $e_1, \dots, e_r$ be the descending edges in $E_p$, ordered so that $e_r = \bar{e}$. Let

$$Q(e) = \frac{\prod_{i=1}^{r-1} \alpha_{e_i}}{\rho_e\bigl( \prod_{i=1}^{r-1} \alpha_{e_i} \bigr)} . \tag{3.26}$$

Then, by (3.20), (3.23) and (3.25),

$$\tau^+_{p_0}(p) = \sum E(\gamma) , \tag{3.27}$$

summed over all ascending paths in $\Gamma$ joining $p_0$ to $p$. $E(\gamma)$ is defined by

$$E(\gamma) = Q(e_m)\, Q(\gamma)\, \rho_{e_1}(\nu^+_{p_0}) , \tag{3.28}$$

where $e_1$ is the initial edge of $\gamma$ and $e_m$ is the terminal edge of $\gamma$.

4 Combinatorial Intersection Numbers

We show below how to recast the formula (3.28) into the form (1.13), and also show that, if the hypothesis of Theorem 1.6 is satisfied, then one can deduce from (1.12) the formula that we described in Section 1 for the products of Thom classes. First, however, we examine this hypothesis in more detail: Suppose the graph $\Gamma$ is connected and admits a family of Thom classes, $\tau^+_p$, $p \in V_\Gamma$, which generate $H(\Gamma, \alpha)$ as a free module over the ring $S(\mathfrak{g}^*)$ and have the properties (2.4) and (2.5). By (1.11),

$$\dim H^0(\Gamma, \alpha) = 1 ;$$

hence there is a unique vertex, $p_0$, with $\sigma_{p_0} = 0$. Let $p$ be an arbitrary vertex of $\Gamma$ and let $\gamma$ be an ascending path with terminal endpoint $p$. If $\gamma$ is of maximal length, its initial vertex has to be $p_0$, since every other vertex has a descending edge. Let $\phi(p)$ be the length of this longest path. If $p$ can be joined to $q$ by an ascending edge, $\phi(p)$ is strictly less than $\phi(q)$, so the map $\phi : V_\Gamma \to \mathbb{Z}$, $p \to \phi(p)$, is a Morse function.

4.1 Theorem. The index function $\sigma$ is strictly increasing along ascending paths iff $\phi = \sigma$ (i.e., iff the Morse function $\phi$ is self-indexing). Moreover, if $\phi$ has this property then, for every pair of vertices, $p \in V_\Gamma$ and $q \in F_p$, the length of the longest ascending path from $p$ to $q$ is $\sigma_q - \sigma_p$.

It suffices to prove the last assertion, and it suffices by induction to prove this assertion for paths of length one. This we do by proving a slightly stronger assertion.

4.2 Theorem. Let $e$ be an ascending edge joining $p$ to $q$. If $e$ is the only ascending path from $p$ to $q$, then $\sigma_q \le \sigma_p + 1$.

Proof. Let $\Gamma_e$ be the totally geodesic subgraph of $\Gamma$ consisting of the single edge $e$ and vertices $p$ and $q$. The Thom class $\tau_e$ of $\Gamma_e$ is defined by

$$\tau_e(p) = \prod_{\substack{i(e') = p \\ e' \neq e}} \alpha_{e'} , \qquad \tau_e(q) = \prod_{\substack{i(e') = q \\ e' \neq \bar{e}}} \alpha_{e'} ,$$

and $\tau_e(r) = 0$ if $r \neq p, q$. It is easily checked that $\tau_e \in H^{d-1}(\Gamma, \alpha)$.

4.3 Lemma. A cohomology class $\tau \in H(\Gamma, \alpha)$ is supported on $\{p, q\}$ iff $\tau = h \tau_e$, $h \in H(\Gamma_e, \alpha)$.

Suppose now that $e$ satisfies the hypotheses of Theorem 4.2. Then $\tau^+_p \tau^-_q$ is supported on $\{p, q\}$; so, by the lemma,

$$\tau^+_p \tau^-_q = h \tau_e , \qquad h \in H(\Gamma_e, \alpha) . \tag{4.1}$$

In particular

$$\sigma_p + d - \sigma_q = \deg \tau^+_p + \deg \tau^-_q \ge \deg \tau_e = d - 1 ,$$

so $\sigma_q \le \sigma_p + 1$.


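Coming back to the formula (3.27), we first consider the simplest summands in this formula, those associated with paths $\gamma$ of length one. For each $q \in V_\Gamma$, denote by $E^-_q$ and $E^+_q$ the descending and ascending edges in $E_q$ and let $\nu_q = \nu^+_q$ be defined as in (2.5). Let $\gamma$ be an ascending path of length one consisting of a single edge $e$, with $i(e) = p$ and $t(e) = q$. Then, by (3.26) and (3.28),

$$E(\gamma) = \frac{\nu_q}{-\alpha_e} \cdot \frac{\prod \rho_e(\alpha_{e_i})}{\prod \rho_e(\alpha_{e_j})} , \tag{4.2}$$

where the product in the numerator is over the edges $e_i \in E^-_p$ and the product in the denominator is over the edges $e_j \in E^-_q - \{\bar{e}\}$. Let $\theta_e : E_p \to E_q$ be the connection along this edge and let $\theta_{\bar{e}} = \theta_e^{-1} : E_q \to E_p$. We define

$$E_{p,q} = \{ e' \in E^-_p \; ; \; \theta_e(e') \in E^+_q \} \tag{4.3}$$

and

$$E_{q,p} = \{ e' \in E^-_q \; ; \; \theta_{\bar{e}}(e') \in E^+_p \} - \{\bar{e}\} . \tag{4.4}$$

Note that $\theta_e$ restricts to a bijection

$$\theta_e : E^-_p - E_{p,q} \to \bigl( E^-_q - \{\bar{e}\} \bigr) - E_{q,p} .$$

If $e' \in E_p$, then (2.2) implies

$$\rho_e(\alpha_{e'}) = \rho_e(\alpha_{\theta_e(e')}) . \tag{4.5}$$

Therefore, if $e_i \in E^-_p - E_{p,q}$, then the terms corresponding to $e_i$ and $\theta_e(e_i)$ in (4.2) cancel each other and we obtain

$$\frac{E(\gamma)}{\nu_p} = \frac{\nu_q}{-\alpha_e\, \nu_p} \cdot \frac{\rho_e(Z_{p,q})}{\rho_e(Z_{q,p})} = \frac{-\nu_q}{\alpha_{pq}\, \nu_p} \cdot \Theta_{pq} ,$$

where

$$Z_{p,q} = \prod_{e' \in E_{p,q}} \alpha_{e'} , \qquad Z_{q,p} = \prod_{e' \in E_{q,p}} \alpha_{e'} ,$$

and

$$\Theta_{pq} = \frac{\prod \rho_e(\alpha_{e_i})}{\prod \rho_e(\alpha_{e_j})} = \frac{\rho_e(Z_{p,q})}{\rho_e(Z_{q,p})} . \tag{4.6}$$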

If $\gamma$ is the only ascending path from $p$ to $q$, then $\Theta_{pq}$ has an interpretation as an “intersection number”: By (4.1), the quotient

$$\frac{\tau^+_p \tau^-_q}{\tau_e}$$

is an element of $H(\Gamma_e, \alpha_e)$. Let $c$ be a point on the interval $(\phi(p), \phi(q))$ and let $v_e$ be the vertex of $\Gamma_c$ corresponding to $e$. If we apply the Kirwan map $K_c : H(\Gamma, \alpha) \to H(\Gamma_c, \alpha_c)$ to this quotient and evaluate at $v_e$, we get an element of $S(\mathfrak{g}^*_\xi)$. We claim that

$$\frac{\Theta_{pq}}{\alpha_e(\xi)} = K_c\!\left( \frac{\tau^+_p \tau^-_q}{\tau_e} \right)(v_e) , \tag{4.7}$$

and we define the combinatorial local intersection number $\iota_{pq}$ to be given by (4.7).

Proof. A direct computation shows that

$$K_c(\tau^+_p)(v_e) = \prod \rho_e(\alpha_{e'}) \qquad \text{and} \qquad K_c(\tau^-_q)(v_e) = \frac{K_c(\tau_e)(v_e)}{\prod \rho_e(\alpha_{e'})} ;$$

hence (4.7) follows from (4.6).

We now show that the right hand side of (4.7) can be interpreted as a “pairing” of the cohomology classes $K_c(\tau^+_p)$ and $K_c(\tau^-_q)$. We pointed out in Section 3 that the localization formula in equivariant DeRham theory enables one to define an integration operation on $H(\Gamma, \alpha)$. The analogue of this result for $\Gamma_c$ asserts that there is an integration operation

$$\int_{\Gamma_c} : H(\Gamma_c, \alpha_c) \to S(\mathfrak{g}^*_\xi) ,$$

mapping $f \in H(\Gamma_c, \alpha_c)$ to the sum

$$\sum_{v \in V_c} f(v)\, \delta_v , \qquad \text{with} \qquad \delta_v = \bigl( K_c(\tau_e)(v) \bigr)^{-1} ,$$

where $e$ is the edge of $\Gamma$ which intersects $\Gamma_c$ at the vertex $v = v_e$. In particular, consider the product in $H(\Gamma_c, \alpha_c)$ of $K_c(\tau^+_p)$ and $K_c(\tau^-_q)$. If $e$ is the only ascending path in $\Gamma$ joining $p$ to $q$, then this product is zero except at the point $v_e$; so by (4.7)

$$\frac{\Theta_{pq}}{\alpha_e(\xi)} = \int_{\Gamma_c} K_c(\tau^+_p)\, K_c(\tau^-_q) , \tag{4.8}$$

which is the formal analogue of the intersection number (1.5).


Remarks. (1) By Theorem 4.2, $\sigma_q \le \sigma_p + 1$. One can see by inspection that the right hand side of (4.8), which is by definition an element of $S(\mathfrak{g}^*_\xi)$, is of degree $\sigma_p + 1 - \sigma_q$. In particular, if $\sigma$ is a self-indexing Morse function, then the right hand side of (4.8) is just a constant.

(2) If the edge $e$ is not the only path joining $p$ to $q$, then the identity (4.7) is still true; however the right hand side of (4.7) is in $Q(\mathfrak{g}^*_\xi)$ and is interpreted as the formal analogue of the local intersection number (1.6).

We now return to the general case. Let $p \xrightarrow{\gamma'} q$ be an ascending path from $p$ to $q$, let $q \xrightarrow{\gamma''} r$ be an ascending path from $q$ to $r$, and let

$$\gamma : p \xrightarrow{\gamma'} q \xrightarrow{\gamma''} r$$

be the ascending path from $p$ to $r$ obtained by joining $\gamma'$ and $\gamma''$. A direct computation shows that

$$\frac{E(\gamma)}{\nu_p} = \frac{E(\gamma')}{\nu_p} \cdot \frac{E(\gamma'')}{\nu_q} \cdot \frac{\alpha_{e_i}}{\rho_{e_a}(\alpha_{e_i})} ,$$

where $e_i$ is the last edge of $\gamma'$ and $e_a$ is the first edge of $\gamma''$, both pointing upward. Let

$$\gamma : p = p_0 \to p_1 \to \dots \to p_{m-1} \to p_m = q$$

be an ascending path. We express the contribution $E(\gamma)$ by breaking up the path $\gamma$ into its constituent edges. Then

$$\frac{E(\gamma)}{\nu_p} = \frac{E(p p_1)}{\nu_p} \cdots \frac{E(p_{m-1} q)}{\nu_{p_{m-1}}} \cdot \prod_{k=1}^{m-1} \frac{\alpha_{p_{k-1} p_k}}{\rho_{p_k p_{k+1}}(\alpha_{p_{k-1} p_k})}$$

$$= \frac{-\nu_{p_1} \Theta_{p p_1}}{\alpha_{p p_1} \nu_p} \cdot \frac{-\nu_{p_2} \Theta_{p_1 p_2}}{\alpha_{p_1 p_2} \nu_{p_1}} \cdots \frac{-\nu_q \Theta_{p_{m-1} q}}{\alpha_{p_{m-1} q}\, \nu_{p_{m-1}}} \cdot \prod_{k=1}^{m-1} \frac{\alpha_{p_{k-1} p_k}}{\rho_{p_k p_{k+1}}(\alpha_{p_{k-1} p_k})} .$$

Therefore the contribution $E(\gamma)$ of the path $\gamma$ is

$$E(\gamma) = \nu_q \cdot \prod_{k=1}^{m} \Theta_{p_{k-1} p_k} \cdot \frac{(-1)^m}{\alpha_{p_{m-1} q} \prod_{k=1}^{m-1} \rho_{p_k p_{k+1}}(\alpha_{p_{k-1} p_k})} . \tag{4.9}$$

In view of (4.7), we can also write this in the form (1.13), where $e_i$ is the edge of $\Gamma$ joining $p_{i-1}$ to $p_i$ and $\iota_e$ is the local intersection number (4.7). If we reverse the orientation of $\Gamma$, replacing $\xi$ with $-\xi$ and the Morse function $\phi$ by $-\phi$, then we get a formula similar to (1.12) for $\tau^-_p$:

$$\tau^-_p(q) = \sum E(\gamma) , \tag{4.10}$$


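where the sum is now over descending paths from $p$ to $q$. Moreover, the $E(\gamma)$'s in (4.10) are easy to compute in terms of the $E(\gamma)$'s in (1.13). To see this, consider, as above, the simplest example of an ascending path in $\Gamma$: an ascending edge $e$ joining $p$ to $q$. By (4.3) and (4.4),

$$\theta_e E_{p,q} = \{ e' \in E^+_q , \; \theta_{\bar{e}}(e') \in E^-_p \} , \qquad \theta_{\bar{e}} E_{q,p} = \{ e' \in E^+_p , \; \theta_e(e') \in E^-_q \} - \{ e \} .$$

So, by (4.5) and (4.6),

$$\Theta_{qp} = \Theta_{pq} . \tag{4.11}$$

Now let $\gamma$ be an ascending path of length $m$ from $p$ to $q$ and let $\bar{\gamma}$ be the same path traced in the reverse direction. Then, by (4.9) and (4.11),

$$E(\bar{\gamma}) = (-1)^m\, \frac{\hat{\alpha}_m}{\hat{\alpha}_1} \cdot \frac{\nu^-_p}{\nu^+_q} \cdot E(\gamma) , \tag{4.12}$$

where

$$\nu^-_p = \prod_{e' \in E^+_p} \alpha_{e'} , \qquad \hat{\alpha}_m = \frac{\alpha_{e_m}}{\alpha_{e_m}(\xi)} , \qquad \hat{\alpha}_1 = \frac{\alpha_{e_1}}{\alpha_{e_1}(\xi)} .$$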

We are now finally in position to compute the cohomology pairing (1.3). By (3.12) and (4.10), the integral

$$c_{pqr} = \int_\Gamma \tau^+_p \tau^+_q \tau^-_r$$

is equal to

$$\sum \delta_t\, E(\gamma_1) E(\gamma_2) E(\gamma_3) ,$$

where the sum is over all triples $\gamma_1, \gamma_2, \gamma_3$ consisting of an ascending path, $\gamma_1$, from $p$ to $t$, an ascending path, $\gamma_2$, from $q$ to $t$, and a descending path, $\gamma_3$, from $r$ to $t$. (See Figure 1.1.) Thus, in particular, if there exist no such configurations, $c_{pqr} = 0$.

Now suppose that the hypothesis of Theorem 1.6 is satisfied, i.e., that $\sigma$ is a self-indexing Morse function. Then we claim that

$$\int_\Gamma \tau^+_p \tau^-_q = \delta_{pq} . \tag{4.13}$$

In fact if $q \notin F_p$, then the supports of $\tau^+_p$ and $\tau^-_q$ are non-overlapping, so the integral in (4.13) is automatically zero; and if $q = p$, then the support of $\tau^+_p \tau^-_q$ consists of the single point $p$ and it is easy to verify that the integral in (4.13) is equal to one. Thus (4.13) is trivially true except when $q \in F_p$ and $q \neq p$. In this case however, $\sigma_q > \sigma_p$, so

$$k = \deg \tau^+_p \tau^-_q = \deg \tau^+_p + \deg \tau^-_q = d - \sigma_q + \sigma_p < d ,$$

hence the integral (4.13) is zero by degree considerations. If we substitute the sum $\sum_s c^s_{pq} \tau^+_s$ for $\tau^+_p \tau^+_q$ in (1.3), then we obtain for $c^s_{pq}$ the formula (1.14).

5 Examples

Each of the summands (1.13) is a rational function: an element of the quotient ﬁeld Q(g∗ ); however, the sum itself is a polynomial, so the singularities in the individual summands are mysteriously cancelling each other out. We discuss below a few simple examples in which one can see how some of these cancellations are happening.

5.1 Cancellations Occurring in the Individual Terms

Suppose $\gamma$ is a longest ascending path from $p$ to $q$. Let $e_1, \dots, e_m$ be the edges of $\gamma$, ordered so that $t(e_{k-1}) = p_k = i(e_k)$. Then $e_k$ is the only path joining $p_k$ to $p_{k+1}$; hence the intersection numbers, $\iota_{e_k}$, are all global intersection numbers of the form (4.8) and are in $S(\mathfrak{g}^*)$. Hence the factor

$$\prod \iota_{e_k}$$

in the formula (1.13) is in $S(\mathfrak{g}^*)$. If, in addition, the Morse function $\phi$ is self-indexing, then this factor is a polynomial of degree zero, i.e., is just a constant.

5.2 The Flag Variety G = SL(3, C)/B

Graph theoretically, the flag variety $SL(n, \mathbb{C})/B$ is the permutahedron: a Cayley graph associated with the Weyl group of $SL(n, \mathbb{C})$, the symmetric group $S_n$. Each vertex of this graph corresponds to a permutation $\pi \in S_n$, and two permutations $\pi$ and $\pi'$ are adjacent in $\Gamma$ if and only if there exists a transposition $\tau_{ij}$, $1 \le i < j \le n$, with $\pi' = \pi \tau_{ij}$. Moreover, if $e$ is the edge joining $\pi$ to $\pi \tau_{ij}$, the weight labelling $e$ is

$$\alpha_e = \begin{cases} \epsilon_j - \epsilon_i , & \text{if } \pi(j) > \pi(i) , \\ \epsilon_i - \epsilon_j , & \text{if } \pi(j) < \pi(i) , \end{cases}$$

where $\epsilon_1, \dots, \epsilon_n$ are the standard basis vectors of the lattice $\mathbb{Z}^n$. The connection $\theta_e$ along this edge is given by

$$\theta_{\pi, \pi\tau}(\pi, \pi\tau') = (\pi\tau, \pi\tau\tau') .$$

If $\xi = (\xi_1, \dots, \xi_n) \in P$, with $\xi_1 < \dots < \xi_n$, then the function

$$\phi : V_\Gamma \to \mathbb{Z} , \qquad \phi(\pi) = \operatorname{length}(\pi)$$

is a self-indexing $\xi$-compatible Morse function on $\Gamma$. The permutahedron is a bipartite graph, with the two sets of vertices corresponding to even, respectively odd, permutations. In the special case $n = 3$, this graph is a complete bipartite graph, and the corresponding labeling is shown in Figure 5.1. Here $x_1 = \epsilon_2 - \epsilon_1$ and $x_2 = \epsilon_3 - \epsilon_2$, and we have used the notation (312) for the permutation $1 \to 3$, $2 \to 1$, $3 \to 2$.

[Figure 5.1. The flag variety: the complete bipartite graph on the vertices (123), (213), (132), (231), (312), (321), with edges labelled $x_1$, $x_2$ and $x_1 + x_2$.]

The quantities $\Theta_{pq}$ given by (4.6) are all equal to 1, with the exception of $\Theta_{(123)(321)}$, which is

$$\Theta_{(123)(321)} = \frac{1}{\rho_{x_1 + x_2}(x_1 x_2)} = - \frac{\bigl( x_1(\xi) + x_2(\xi) \bigr)^2}{\bigl( x_2(\xi) x_1 - x_1(\xi) x_2 \bigr)^2} .$$

There are two ascending paths from (213) to (321), namely

$$\gamma_1 : (213) \to (231) \to (321) \qquad \text{and} \qquad \gamma_2 : (213) \to (312) \to (321) ,$$

and their contributions to $\tau_{(213)}(321)$ are

$$E(\gamma_1) = -x_1 x_2 (x_1 + x_2) \cdot \frac{1}{x_1} \cdot \frac{1}{\rho_{x_1}(x_1 + x_2)} = \frac{x_1(\xi)\, x_2 (x_1 + x_2)}{x_2(\xi) x_1 - x_1(\xi) x_2} ,$$

$$E(\gamma_2) = -x_1 x_2 (x_1 + x_2) \cdot \frac{1}{x_2} \cdot \frac{1}{\rho_{x_2}(x_1 + x_2)} = - \frac{x_2(\xi)\, x_1 (x_1 + x_2)}{x_2(\xi) x_1 - x_1(\xi) x_2} ,$$

hence $\tau_{(213)}(321) = E(\gamma_1) + E(\gamma_2) = -x_1 - x_2$. The other classes can be computed similarly and are given by
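The cancellation between $E(\gamma_1)$ and $E(\gamma_2)$ can be checked numerically. The sketch below assumes that the map $\rho_e$ of (2.6), which is not reproduced in this section, acts on linear forms by $\rho_\alpha(\beta) = \beta - (\beta(\xi)/\alpha(\xi))\,\alpha$; this matches the expression $\alpha_{e_j} - (\alpha_{e_j}(\xi)/\alpha_{e_i}(\xi))\,\alpha_{e_i}$ used in Section 5.3. The linear forms $x_1, x_2$ are represented by their values at a sample point, and `a1`, `a2` stand for $x_1(\xi)$, $x_2(\xi)$; all names are illustrative.

```python
from fractions import Fraction


def rho(alpha, alpha_xi, beta, beta_xi):
    """Assumed form of (2.6): beta - (beta(xi) / alpha(xi)) * alpha, on sample values."""
    return beta - beta_xi / alpha_xi * alpha


def path_contributions(x1, x2, a1, a2):
    """Return (E(gamma_1), E(gamma_2)) for the two ascending paths from (213) to (321)."""
    common = -x1 * x2 * (x1 + x2)
    e1 = common / x1 / rho(x1, a1, x1 + x2, a1 + a2)
    e2 = common / x2 / rho(x2, a2, x1 + x2, a1 + a2)
    return e1, e2


# Sample values: x1 = 3, x2 = 5 at a test point, x1(xi) = 2, x2(xi) = 7.
E1, E2 = path_contributions(Fraction(3), Fraction(5), Fraction(2), Fraction(7))
```

Each contribution has the denominator $x_2(\xi) x_1 - x_1(\xi) x_2$, but their sum is the polynomial $-(x_1 + x_2)$, independent of $\xi$.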

$$\begin{array}{c|cccccc}
 & \tau_{(123)} & \tau_{(213)} & \tau_{(132)} & \tau_{(231)} & \tau_{(312)} & \tau_{(321)} \\ \hline
(123) & 1 & 0 & 0 & 0 & 0 & 0 \\
(213) & 1 & -x_1 & 0 & 0 & 0 & 0 \\
(132) & 1 & 0 & -x_2 & 0 & 0 & 0 \\
(231) & 1 & -x_1 - x_2 & -x_2 & x_2 (x_1 + x_2) & 0 & 0 \\
(312) & 1 & -x_1 & -x_1 - x_2 & 0 & x_1 (x_1 + x_2) & 0 \\
(321) & 1 & -x_1 - x_2 & -x_1 - x_2 & x_2 (x_1 + x_2) & x_1 (x_1 + x_2) & -x_1 x_2 (x_1 + x_2)
\end{array}$$

5.3 The Zero-Dimensional Thom Class

Suppose the graph $\Gamma$ is connected and hence has a unique vertex, $p_0$, of index zero. Then $\tau_{p_0}$ is the unique generator of $H^0(\Gamma, \alpha)$ with $\tau_{p_0}(p_0) = 1$. Thus

$$\tau_{p_0}(p) = 1 \tag{5.1}$$

for all vertices $p$. We show how to deduce (5.1) from (1.12). Choose the constants $c$ and $c'$ in (3.21) so that $p_0$ is the only vertex with $\phi(p_0) < c$ and such that $\phi(p)$ is the smallest critical value of $\phi$ greater than $c'$. By the Markov property of the map (3.22),

$$1 = \sum_{w \in V_c} Q(v, w)$$

for every vertex $v \in V_{c'}$. In particular, let $E^-_p = \{ e_i , \; i = 1, \dots, r \}$ and let $v_i \in V_{c'}$ be the vertex at which $e_i$ intersects $\Gamma_{c'}$. Then, by (3.28),

$$\tau_{p_0}(p) = \sum_{i=1}^{r} Q(e_i) \sum_{w} Q(v_i, w) = \sum_{i=1}^{r} Q(e_i)$$

and by (3.26)

$$\tau_{p_0}(p) = \sum_{i=1}^{r} \prod_{j \neq i} \frac{\alpha_{e_j}}{\alpha_{e_j} - \bigl( \alpha_{e_j}(\xi) / \alpha_{e_i}(\xi) \bigr)\, \alpha_{e_i}} .$$

If we let

$$x_i = - \frac{1}{\alpha_{e_i}(\xi)}\, \alpha_{e_i} ,$$

this becomes

$$\sum_{i=1}^{r} \prod_{j \neq i} \frac{-x_j}{x_i - x_j} ,$$

which is equal to 1 by (3.11). Thus $\tau_{p_0}(p) = 1$.

5.4 The (n − 1)-Dimensional Projective Space

Graph theoretically this is just the complete graph $\Delta$ on $n$ vertices. Denote these vertices by $p_1, \dots, p_n$, and assign to the edge $e$ joining $p_i$ to $p_j$ the weight $\alpha_e = x_i - x_j$. (As we pointed out in Section 3, this defines an axial function on $\Delta$.) Let $\xi$ be an $n$-tuple of real numbers with $\xi_1 > \xi_2 > \dots > \xi_n$ and orient the edges of $\Delta$ by decreeing that an edge $e$ is ascending if $\alpha_e(\xi) > 0$. With this orientation, the function mapping $p_i$ to $i$ is a $\xi$-compatible Morse function. We compute the Thom class $\tau_{p_i}$. If $i = 1$, then, from the computation above,

$$\tau_{p_1}(p) = 1$$

for all vertices $p$. If $i > 1$, we can regard the vertices $p_i, p_{i+1}, \dots, p_n$ as the vertices of a complete graph, $\Delta'$, with the same axial function as above. Consider the sum

$$\sum E(\gamma) \tag{5.2}$$

over all ascending paths joining $p = p_i$ to $q = p_j$, where $j > i$. The individual summands can be written in the form

$$\frac{\nu_q}{\nu'_q}\, E'(\gamma) ,$$

where, by (1.13),

$$E'(\gamma) = (-1)^m\, \nu'_q\, \frac{\iota_{e_1}}{\alpha_m} \prod_{k=2}^{m} \frac{\iota_{e_k}}{\alpha_{k-1} - \alpha_k}$$

and $\nu'_q$ is the product

$$\prod \alpha_e$$

over all descending edges $e \in E^-_q$ which join $q$ to vertices in $\Delta'$. Then (5.2) becomes

$$\frac{\nu_q}{\nu'_q} \Bigl( \sum E'(\gamma) \Bigr) .$$

However, by (1.12), the expression in parentheses computes the zeroth Thom class of the subgraph $\Delta'$ at $q$, and hence is equal to one. Thus

$$\tau_{p_i}(q) = \frac{\nu_q}{\nu'_q} = \prod{}' \alpha_e ,$$

where $\prod'$ is the product over all the edges $e$ of $\Delta$ which join $q$ to the vertices $p_k$, for $k = 1, \dots, i - 1$.
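The resulting class can be spot-checked against the cocycle condition. The sketch below assumes, following the convention stated at the start of this subsection, that the edge from $q = p_j$ to $p_k$ carries the weight $x_j - x_k$, so that $\tau_{p_i}(p_j) = \prod_{k<i} (x_j - x_k)$ for $j \ge i$ and $0$ for $j < i$. Divisibility of $\tau_{p_i}(p_j) - \tau_{p_i}(p_l)$ by $x_j - x_l$ is tested only through a necessary condition: the difference must vanish once $x_j$ is set equal to $x_l$. Function names are hypothetical.

```python
from math import prod


def thom_class_projective(xs, i, j):
    """tau_{p_i}(p_j) on the complete graph; i and j are 1-based vertex labels."""
    if j < i:
        return 0  # p_j lies outside the flow-up of p_i
    return prod(xs[j - 1] - xs[k - 1] for k in range(1, i))


def cocycle_spot_check(xs, i, j, l):
    """Necessary condition for (x_j - x_l) to divide tau(p_j) - tau(p_l)."""
    ys = list(xs)
    ys[j - 1] = ys[l - 1]  # specialize to the hyperplane x_j = x_l
    return thom_class_projective(ys, i, j) == thom_class_projective(ys, i, l)
```

A check at one sample point does not prove divisibility, but a failure would expose a wrong sign or index range in the product.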

6 An Integral Transform

Consider the transformation

$$f + \sum_{i=1}^{s-1} (\tau^\#)^i f_i \longrightarrow f' + \sum_{j=1}^{r-1} (\tau^\#)^j f'_j$$

defined by (2.14). Let $f_0$ be the restriction of $f$ to $\Delta_c$ and let $f'_0$ be the restriction of $f'$ to $\Delta_{c'}$. We claim:

6.1 Theorem. The class $f'_0$ is obtained by an “integral transform” from $f_0, f_1, \dots, f_{s-1}$. Moreover, if

$$h = f + \sum_{i=1}^{s-1} (\tau^\#)^i f_i$$

is (homogeneous) of degree $< r$, then $f'$ is independent of $f_1, \dots, f_{s-1}$, i.e., depends only on $f$.

Proof. Recall (by (3.14)) that

$$f'_0(a) = \sum_{i=0}^{s-1} \sum_{j} \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k}\, \bigl( (\tau^\#(\,\cdot\,, a))^i f_i \bigr)(j) .$$

Let

$$h_a = \sum_{i=0}^{s-1} (\tau^\#(\,\cdot\,, a))^i f_i \in H(\Delta_c, \tau_c) .$$

Since $H(\Delta_c, \tau_c)$ is generated over $S(\mathfrak{g}^*)$ by $1, \tau^\#(\,\cdot\,, a), \dots, (\tau^\#(\,\cdot\,, a))^{r-1}$, one can write

$$h_a = \sum_{i=0}^{r-1} (\tau^\#(\,\cdot\,, a))^i g_{i,a} , \tag{6.1}$$

where the $g_{i,a}$'s are constants, i.e., fixed elements of $S(\mathfrak{g}^*)$. Then

$$f'_0(a) = \sum_{i=0}^{r-1} \sum_{j=1}^{r} \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k}\, (\beta_j - \beta_a)^i\, g_{i,a} .$$

We now use the fact that if $(b_{j\ell})$ is the inverse of the Vandermonde matrix

$$a_{ij} = x_i^{j-1} , \qquad 1 \le i, j \le r ,$$

then

$$b_{1j} = \prod_{k \neq j} \frac{-x_k}{x_j - x_k} .$$

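In particular

$$\sum_{j=1}^{r} \prod_{k \neq j} \frac{-x_k}{x_j - x_k}\, x_j^{\ell - 1} = \sum_{j=1}^{r} b_{1j}\, a_{j\ell} = \delta_{1,\ell} .$$

Applying this identity with $x_j = \beta_j - \beta_a$, we obtain that

$$f'_0(a) = \sum_{i=0}^{r-1} \delta_{1,i+1}\, g_{i,a} = g_{0,a} .$$

We determine $g_{0,a}$ from (6.1):

$$g_{0,a} = \sum_{j=1}^{r} \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k}\, h_a(j) = \sum_{j=1}^{r} \delta_j\, h_a(j) \prod_{k \neq j} (\beta_a - \beta_k) ,$$

where

$$\delta_j = \Bigl( \prod_{k \neq j} (\beta_j - \beta_k) \Bigr)^{-1}$$

is given by (1.15). Let $\nu^+ \in H(\Delta_c \times \Delta_{c'})$ be given by

$$\nu^+(j, a) = \prod_{k \neq j} (\beta_a - \beta_k) ,$$

and let $\nu^+_a \in H(\Delta_c, \tau_c)$ be the “restriction” of $\nu^+$ to $\Delta_c \times \{a\}$. Then

$$g_{0,a} = \sum_{j=1}^{r} \delta_j\, h_a(j)\, \nu^+(j, a) = \int_{\Delta_c} \nu^+(\,\cdot\,, a)\, h_a ,$$

hence

$$f'_0(a) = g_{0,a} = \int_{\Delta_c} \nu^+(\,\cdot\,, a)\, h_a = \int_{\Delta_c} f_0\, \nu^+(\,\cdot\,, a) + \sum_{i=1}^{s-1} \int_{\Delta_c} f_i \bigl( \tau^\#(\,\cdot\,, a) \bigr)^i \nu^+(\,\cdot\,, a) .$$

Therefore

$$f'_0 = \int_{\Delta_c} \langle f_0, f_1, \dots, f_{s-1} \rangle \cdot \langle 1, \tau^\#, \dots, (\tau^\#)^{s-1} \rangle\, \nu^+ ,$$

where

$$\langle f_0, f_1, \dots, f_{s-1} \rangle \cdot \langle 1, \tau^\#, \dots, (\tau^\#)^{s-1} \rangle = f_0 + f_1 \tau^\# + \dots + f_{s-1} (\tau^\#)^{s-1} .$$

This makes very explicit the fact that $f'_0$ is obtained from $f_0, f_1, \dots, f_{s-1}$ by an “integral transform”.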

Moreover, for $i \ge 1$ and $1 \le j \le r$,

$$(\tau^\#(j, a))^i\, \nu^+(j, a) = (\beta_j - \beta_a)^i \prod_{k \neq j} (\beta_a - \beta_k) = -(\beta_j - \beta_a)^{i-1} \prod_{k=1}^{r} (\beta_a - \beta_k) ,$$

hence

$$f'_0(a) = \int_{\Delta_c} f_0\, \nu^+_a - \prod_{k=1}^{r} (\beta_a - \beta_k) \sum_{i=1}^{s-1} \int_{\Delta_c} f_i \bigl( \tau^\#(\,\cdot\,, a) \bigr)^{i-1} .$$

Therefore, if

$$f + \sum_{i=1}^{s-1} (\tau^\#)^i f_i$$

is (homogeneous) of degree less than $r$, then, for all $i = 1, \dots, s - 1$, the form $f_i \bigl( \tau^\#(\,\cdot\,, a) \bigr)^{i-1}$ is of degree less than $r - 1$, hence

$$\int_{\Delta_c} \bigl( \tau^\#(\,\cdot\,, a) \bigr)^{i-1} f_i = 0 .$$

Therefore

$$f'_0 = \int_{\Delta_c} f_0\, \nu^+ ,$$

which proves that $f'_0$ depends only on $f$.
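The identity at the heart of this proof, namely that $\sum_j \prod_{k \neq j} \frac{\beta_a - \beta_k}{\beta_j - \beta_k} (\beta_j - \beta_a)^i$ equals 1 for $i = 0$ and 0 for $1 \le i \le r - 1$, can be confirmed on sample rational values. This is a sketch only; `transform_kernel_sum` is a hypothetical name, and $\tau^\#(j, a)$ is taken to be $\beta_j - \beta_a$, as in the display above.

```python
from fractions import Fraction
from math import prod


def transform_kernel_sum(betas, beta_a, i):
    """Sum over j of prod_{k != j} (beta_a - beta_k)/(beta_j - beta_k) * (beta_j - beta_a)^i."""
    return sum(
        prod((beta_a - bk) / (bj - bk) for k, bk in enumerate(betas) if k != j)
        * (bj - beta_a) ** i
        for j, bj in enumerate(betas)
    )
```

Only the $i = 0$ term of (6.1) therefore survives, which is why $f'_0(a) = g_{0,a}$.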

7 Alternative Interpolation Schemes

The standard action of the torus $T^n \subset U(n)$ on the Grassmannian $Gr(\mathbb{C}^n, k)$ is a GKM action, and the graph associated with it is the Johnson graph $J_{n,k}$. The vertices of this graph are the $k$-element subsets of the $n$-element set $\{1, \dots, n\}$; and two such subsets are adjacent if they intersect in a $(k - 1)$-element set. It is convenient to identify the vertices of $J_{n,k}$ with $k$-tuples

$$p = (i_1, \dots, i_k) ,$$

ordered so that $1 \le i_1 < i_2 < \dots < i_k \le n$. Thus given such a $k$-tuple, the $k$-element set associated with it is

$$S_p = \{ i_1, \dots, i_k \} .$$

If $p$ and $q$ are adjacent, then $S_p$ contains exactly one element, $i$, which is not in $S_q$, and $S_q$ contains exactly one element, $j$, not in $S_p$. The axial function on $J_{n,k}$ is defined by setting

$$\alpha_e = x_j - x_i$$

on the edge joining $p$ to $q$, where $\{x_1, \dots, x_n\}$ is a basis of $\mathbb{R}^n$. Also, if $\xi = (1, 2, \dots, n) \in \mathfrak{g}$ is a polarizing vector, then a $\xi$-compatible Morse function is

$$p = (i_1, \dots, i_k) \to (i_1 - 1) + (i_2 - 2) + \dots + (i_k - k) .$$

In fact it is easy to see that this Morse function is the canonical “self-indexing” Morse function described in Theorem 4.1. For $k = 1$, $J_{n,1}$ is a complete graph with $n$ vertices, and if $p = i$, $q = j$ (with $i < j$), then

$$\tau_p(q) = \prod_{m=1}^{i-1} (x_m - x_j) .$$

We now determine the Thom classes for $J_{n,2}$. It suffices to consider the case $p = (1, i_2)$ (by the “truncation” trick which we used to compute the Thom classes of $P^{n-1}$ in Section 5). Let $q = (j_1, j_2)$, with $j_1 < j_2$. It is clear that if $j_2 < i_2$, then $\tau_p(q) = 0$. Assume now that $j_1 < i_2 \le j_2$. Let $1 \le m < i_2$, $m \neq j_1$. Then

$$\tau_p(j_1, j_2) - \tau_p(j_1, m) \equiv 0 \mod (x_{j_2} - x_m)$$

by the cocycle condition. But $\tau_p(j_1, m) = 0$; hence $x_{j_2} - x_m$ divides $\tau_p(j_1, j_2)$. However, $\deg \tau_p = \sigma_p = i_2 - 2$; hence

$$\tau_p(j_1, j_2) = \prod_{\substack{m=1 \\ m \neq j_1}}^{i_2 - 1} (x_m - x_{j_2}) . \tag{7.1}$$

Finally, assume that $i_2 \le j_1 < j_2$. Let $f = \tau_p(q)$ and let $f_r = \tau_p(r, j_2)$, for $r = 1, \dots, i_2 - 1$. Then $f$ is a polynomial of degree $i_2 - 2$; so it is completely determined by the $i_2 - 1$ cocycle conditions

$$f \equiv f_r \mod (x_{j_1} - x_r) .$$

In fact, since $f_r$ doesn’t depend on $x_{j_1}$, $f$ is given explicitly by (3.4):

$$f = \sum_{r=1}^{i_2 - 1} \prod_{\substack{m=1 \\ m \neq r}}^{i_2 - 1} \frac{x_m - x_{j_1}}{x_m - x_r}\, f_r .$$

Thus, by (7.1),

$$\tau_p(q) = \sum_{r=1}^{i_2 - 1} \prod_{\substack{m=1 \\ m \neq r}}^{i_2 - 1} \frac{(x_m - x_{j_1})(x_m - x_{j_2})}{x_m - x_r} .$$
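The interpolated formula for $p = (1, i_2)$ can be evaluated in exact arithmetic and compared with (7.1) in the range $j_1 < i_2 \le j_2$, with zero when $j_2 < i_2$, and with the cocycle conditions when $i_2 \le j_1$. The sketch below uses hypothetical function names and 1-based indices as in the text; the cocycle conditions are tested by specializing $x_{j_1} = x_r$.

```python
from fractions import Fraction
from math import prod


def thom_class_jn2(xs, i2, j1, j2):
    """Evaluate the displayed sum for p = (1, i2) and q = (j1, j2)."""
    x = lambda m: xs[m - 1]
    return sum(
        prod((x(m) - x(j1)) * (x(m) - x(j2)) / (x(m) - x(r))
             for m in range(1, i2) if m != r)
        for r in range(1, i2)
    )


def formula_7_1(xs, i2, j1, j2):
    """Right-hand side of (7.1): product over m < i2, m != j1, of (x_m - x_{j2})."""
    x = lambda m: xs[m - 1]
    return prod(x(m) - x(j2) for m in range(1, i2) if m != j1)
```

When $j_1 < i_2$, every summand with $r \neq j_1$ contains the vanishing factor $x_{j_1} - x_{j_1}$, so the sum collapses to (7.1).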

Notice, by the way, that this formula is valid for all $j_1, j_2$. If $j_1 < i_2$, then the only non-zero summand on the right is the one corresponding to $j_1$, and this summand is equal to (7.1). Moreover, if both $j_1$ and $j_2$ are less than $i_2$, then all summands on the right are zero. In general, if $p = (i_1, i_2)$ and $q = (j_1, j_2)$, then

$$\tau_p(q) = \sum_{r = i_1}^{i_2 - 1} \frac{\prod_{m=1, m \neq r}^{i_2 - 1} (x_m - x_{j_1})(x_m - x_{j_2})}{\prod_{m = i_1, m \neq r}^{i_2 - 1} (x_m - x_r)} . \tag{7.2}$$

A similar argument shows that, if $p = (i_1, i_2, i_3)$ and $q = (j_1, j_2, j_3)$ for $J_{n,3}$, then $\tau_p(q)$ is given by

$$\sum_{r = i_2}^{i_3 - 1} \; \sum_{\substack{s = i_1 \\ s \neq r}}^{i_3 - 1} \frac{\prod_{m=1, m \neq r, s}^{i_3 - 1} (x_m - x_{j_1})(x_m - x_{j_2})(x_m - x_{j_3})}{\prod_{m = i_2, m \neq r}^{i_3 - 1} (x_m - x_r) \; \prod_{m = i_1, m \neq r, s}^{i_3 - 1} (x_m - x_s)} . \tag{7.3}$$

For $k > 3$ the formulas become considerably more complicated. (For a description of these classes for general $k$, see Zara [2001].) The expressions (7.2) and (7.3) don’t appear to be polynomials, but there is a suggestive way of writing them as “integrals” which makes clear that they are. For instance, to show that (7.3) is a polynomial, let

$$f_{r,s} = \prod_{\substack{m = 1 \\ m \neq r, s}}^{i_3 - 1} (x_m - z_1)(x_m - z_2)(x_m - z_3) ,$$

where $z_1$, $z_2$ and $z_3$ are variables not depending on $x_1, \dots, x_n$. Fix an $r$ such that $i_2 \le r \le i_3 - 1$ and let $\Delta_r$ be the complete graph having vertices, $m$, with $i_1 \le m < i_3$ and $m \neq r$. Equip this graph with the axial function, $\alpha_r$, which assigns to the edge joining $m$ to $s$ the weight $x_m - x_s$. For $r$ fixed the assignment $s \to f_{r,s} =: \gamma_r(s)$ defines a cohomology class $\gamma_r \in H(\Delta_r, \alpha_r) \otimes \mathbb{C}[z_1, z_2, z_3]$, so we can integrate it over $\Delta_r$ to get a polynomial

$$f_r = \int_{\Delta_r} \gamma_r \in \mathbb{C}[x_1, \dots, x_n] \otimes \mathbb{C}[z_1, z_2, z_3] . \tag{7.4}$$

Now let $\Delta$ be the complete graph on the vertices, $r$, with $i_2 \le r \le i_3 - 1$ and equip this graph with the axial function, $\alpha$, which assigns to the edge joining $r$ to $m$ the weight $x_r - x_m$. It is easy to check that the assignment $r \to f_r =: \gamma(r)$ defines a cohomology class $\gamma \in H(\Delta, \alpha) \otimes \mathbb{C}[z_1, z_2, z_3]$; so we can integrate it over $\Delta$ to get a polynomial

$$f(z; x) = \int_\Delta \gamma . \tag{7.5}$$

We leave it to the reader to check that if one unwinds the definition of the integrals (7.4) and (7.5) one gets for this polynomial the expression

$$\sum_{r = i_2}^{i_3 - 1} \; \sum_{\substack{s = i_1 \\ s \neq r}}^{i_3 - 1} \frac{\prod_{m=1, m \neq r, s}^{i_3 - 1} (x_m - z_1)(x_m - z_2)(x_m - z_3)}{\prod_{m = i_2, m \neq r}^{i_3 - 1} (x_m - x_r) \; \prod_{m = i_1, m \neq r, s}^{i_3 - 1} (x_m - x_s)} . \tag{7.6}$$

Hence if one makes the substitution $z_\ell \to x_{j_\ell}$, $\ell = 1, 2, 3$, then (7.6) gets converted into (7.3). Thus this sum is a polynomial, as claimed. It is also not hard to see that the map defined by (7.3) is indeed the Thom class. For instance, consider the adjacent vertices $q' = (j_1, j_2, j'_3)$ and $q'' = (j_1, j_2, j''_3)$. Then

$$\tau_p(q') - \tau_p(q'') = f(x_{j_1}, x_{j_2}, x_{j'_3}; x) - f(x_{j_1}, x_{j_2}, x_{j''_3}; x) ,$$

and this difference is clearly divisible by $x_{j'_3} - x_{j''_3}$. Moreover, if $q = (j_1, j_2, j_3)$ and one of the numbers $j_1, j_2, j_3$ is less than $i_1$, then the numerator of the $(r, s)$ summand in (7.3) vanishes identically. The same is true if two of these numbers are less than $i_2$ or if all three of these numbers are less than $i_3$.

We conclude this section by making a few general comments about the argument we’ve just sketched. Given a graph $\Gamma$, the computation of its combinatorial Thom classes is basically an interpolation problem: If $q$ is a vertex, one attempts to compute $\tau_p$ at $q$ by solving

$$\tau_p(q) \equiv \tau_p(q_i) \mod \alpha_i , \qquad i = 1, \dots, k + 1 , \tag{7.7}$$

with $k = \sigma_p$, where the $q_i$'s are neighbors of $q$ at which $\tau_p$ has already been computed, and $\alpha_i$ is the value of the axial function on the edge joining $q$ to $q_i$. To ensure that such $q_i$'s exist one looks for a filtering of $V_\Gamma$ by sets $V_i$, such that $V_0 \subseteq \{p\} \cup F^c_p$ (where $F^c_p = V_\Gamma \setminus F_p$ is the complement of the flow-up), and

$$V_i \subseteq V_{i-1} \cup \{ q' \; ; \; \#(V_{i-1} \cap A_{q'}) \ge k + 1 \} ,$$

where $A_{q'}$ is the set of vertices of $\Gamma$ adjacent to $q'$. If $q$ is contained in $V_j$, then one can, in principle, determine $\tau_p$ at $q$ by solving a sequence of $k + 1$ systems of equations of type (7.7). For instance, in the interpolation scheme we described in Section 3 our choice of the $V_i$'s was dictated by our choice of a Morse function, $\phi : V_\Gamma \to \mathbb{R}$. Letting $c_0, c_1, \dots$ be the critical values of $\phi$, we chose $V_i$ to be the set of vertices $q'$ with $\phi(q') \le c_i$. In the case of Johnson graphs, this turns out not to be a particularly good choice. For instance, in the example above ($J_{n,2}$ with $p = (i_1, i_2)$), a much more efficient interpolation scheme is obtained as follows:

$$V_0 = \{ (j_1, j_2) \; ; \; j_1 < i_1 \} \cup \{ (j_1, j_2) \; ; \; i_1 \le j_1 < j_2 \le i_2 \}$$

$$V_1 = V_0 \cup \{ (i_1, j_2) \; ; \; i_2 < j_2 \} \cup \{ (j_1, i_2) \; ; \; i_1 < j_1 < i_2 \}$$

12. Combinatorial Formulas for Products of Thom Classes

401

V0

i2

V1 − V0 V2 − V1 V3 − V2 (1, 1)

i1

i2

Figure 7.1. Filtering for Jn,2

V2 = V1 ∪ (j1 , j2 ) ; i1 < j1 ≤ i2 < j2 V3 = V2 ∪ (j1 , j2 ) ; 1 < i2 < j1 < j2 This gives one an interpolation scheme in which the computation of τp (q) involves at most 3 interpolations. (This contrasts favorably with the Morse scheme which can involve as many as 2(n − 2) interpolations!) We exploit these observations in the next section by showing that there is a path integral formula for τp (q) which, for the Johnson graph, automatically drops out the identities (7.2)–(7.3) and which, in general, involves far fewer summands that the formula (1.12).

8 Controlled Paths

Let $p$ be a vertex of $\Gamma$ of index $k$, and let $F_p$ be the flow-up from $p$. For every vertex $q \in F_p$, we specify a control set $C_q$. This is, by definition, a collection of upward pointing edges with terminal vertex $q$; if $q = p$, then $C_p$ is the set of all ascending edges with terminal vertex $p$. For the moment, the only condition we impose on this set is that its cardinality be at least $k+1$. (In this section we assume that the Thom class $\tau_p$ is uniquely defined. This is the case if, for example, for every $q \in F_p$, the index of $q$ is greater than the index of $p$.)

Now let's attempt to compute the Thom class $\tau_p$ at $q$ as follows. Let $e_1, \dots, e_\ell$ be the edges of $C_q$, let $q_i$ be the initial vertex of $e_i$ and let $\alpha_i$ be the value of the axial function on $e_i$. Suppose that $f_i = \tau_p(q_i)$, $i = 1, \dots, \ell$, has been computed. By Lagrange interpolation, a polynomial $f$ of degree $\ell - 1$ which satisfies the congruences

$$f - f_i \equiv 0 \mod \alpha_i , \tag{8.1}$$

for $i = 1, \dots, \ell$, is uniquely determined by these congruences. But $\tau_p(q)$ is a polynomial of degree at most $\ell - 1$ and satisfies the congruences (8.1), hence $\tau_p(q)$ is the polynomial given by Lagrange interpolation.

We construct this polynomial explicitly as follows. Fix a polarizing vector $\xi$, and choose coordinates $x, y_1, \dots, y_{n-1}$ on $\mathfrak{g}$ such that $x(\xi) = 1$ and $y_j(\xi) = 0$ for $j = 1, \dots, n-1$. Then $\alpha_i = m_i (x - b_i(y))$ and

$$\tau_p(q) = \sum_{i=1}^{\ell}\ \prod_{j \neq i} \frac{x - b_j(y)}{b_i(y) - b_j(y)}\ f_i\big(b_i(y), y\big) . \tag{8.2}$$
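The interpolation step behind (8.2) can be illustrated in the simplest setting. The following is only a minimal sketch in Python: it suppresses the $y$-dependence, so each $\alpha_i$ is taken to be $x - b_i$ with a constant node $b_i$, and the nodes and data values are hypothetical choices of ours, not taken from any particular graph.

```python
from fractions import Fraction

def lagrange_interpolate(nodes, values):
    """Return f with f(x) = sum_i values[i] * prod_{j != i} (x - b_j) / (b_i - b_j).

    One-variable analogue of (8.2): values[i] plays the role of f_i(b_i),
    and the result satisfies f = f_i modulo (x - b_i).
    """
    nodes = [Fraction(b) for b in nodes]
    values = [Fraction(v) for v in values]

    def f(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (b_i, v_i) in enumerate(zip(nodes, values)):
            term = v_i
            for j, b_j in enumerate(nodes):
                if j != i:
                    term *= (x - b_j) / (b_i - b_j)
            total += term
        return total

    return f

# Hypothetical data: three control edges, alpha_i = x - b_i.
nodes = [0, 1, 3]
values = [2, 3, 11]          # the values of x^2 + 2 at the nodes
f = lagrange_interpolate(nodes, values)
```

Because three nodes determine a polynomial of degree at most two, `f` here reproduces $x^2 + 2$ exactly; exact rational arithmetic is used so that the congruences can be checked without rounding.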

From (8.2) one gets a path integral formula for $\tau_p(q)$ which is very similar to (3.27). Namely

$$\tau_p(q) = \sum_{\gamma} E(\gamma) . \tag{8.3}$$

However, now the sum is over all controlled paths joining $p$ to $q$, "controlled" meaning an ascending path joining $p$ to $q$ whose edges all lie in the control sets. Also the formula for each of the summands in (8.3) is slightly different from that given by (3.23), (3.26), (3.28). Namely, if $p_0 = p$ and $p_m = q$, then

$$E(\gamma) = Q_{p_0 p_1}\, Q_{p_0 p_1,\, p_1 p_2} \cdots Q_{p_{m-2} p_{m-1},\, p_{m-1} q}\, Q_{p_{m-1} q} ,$$

where

$$Q_{p_0 p_1} = (-1)^{\mathrm{ind}(p_0)} \prod_{e \in C_{p_0}} \rho_{p_0 p_1}(\alpha_e) ,$$

$$Q_{p_{k-1} p_k,\, p_k p_{k+1}} = \prod_{\substack{e \in C_{p_k} \\ e \neq p_{k-1} p_k}} \frac{\rho_{p_k p_{k+1}}(\alpha_e)}{\rho_{p_{k-1} p_k}(\alpha_e)} ,$$

$$Q_{p_{m-1} q} = \prod_{\substack{e \in C_q \\ e \neq p_{m-1} q}} \frac{\alpha_e}{\rho_{p_{m-1} q}(\alpha_e)} .$$

The main issue here is how to choose the control sets Cq , so as to get as few summands as possible. Let’s examine from this perspective two examples which we’ve already considered. Example 1: The complete graph on n vertices. Let the vertices of this graph be {pi , i = 1, . . . , n}, and let the poset structure be the standard one: pi < pj ⇔ i < j. If Fpi is the ﬂow-up from pi and q = pj ∈ Fpi , take Cq to be the edges joining p1 , . . . , pi to q. This is clearly an optimal choice of control sets and gives the formula for τpi described in Section 5. There is just one controlled path joining pi to q.


Example 2: The Johnson graph $J_{n,2}$. Let $p = (i_1, i_2)$, $q = (j_1, j_2)$. (Note that the index of $p$ is $i_1 + i_2 - 3$.) We choose the control sets as follows.

If $j_1 < i_2 = j_2$, then

$$C_q = \{ ((m, i_2), q) \ ;\ 1 \leq m \leq i_1 \} \cup \{ ((m, j_1), q) \ ;\ 1 \leq m \leq i_2 ,\ m \neq j_1 \} .$$

There is only one controlled path from $p$ to $q$, which is $\gamma : p \to q$, and $E(\gamma)$ is given by (7.2).

If $i_1 = j_1 < i_2 < j_2$, then

$$C_q = \{ ((m, j_2), q) \ ;\ 1 \leq m \leq i_1 \} \cup \{ ((m, i_1), q) \ ;\ 1 \leq m \leq i_2 ,\ m \neq i_1 \} .$$

There is only one controlled path from $p$ to $q$, which is $\gamma : p \to q$, and $E(\gamma)$ is given by (7.2).

If $i_1 < j_1 < i_2 < j_2$, then

$$C_q = \{ ((m, j_2), q) \ ;\ 1 \leq m \leq i_1 \} \cup \{ ((m, j_1), q) \ ;\ 1 \leq m \leq i_2 ,\ m \neq j_1 \} .$$

There is only one controlled path from $p$ to $q$, namely $\gamma : p \to (j_1, i_2) \to q$, and again $E(\gamma)$ is given by (7.2).

If $i_1 < i_2 \leq j_1 < j_2$, then

$$C_q = \{ ((m, j_2), q) \ ;\ 1 \leq m < i_2 \} \cup \{ ((m, j_1), q) \ ;\ 1 \leq m < i_1 \} .$$

There are $i_2 - i_1$ controlled paths:

$$\gamma_{i_1} : p \to (i_1, j_2) \to (j_1, j_2) ; \qquad \gamma_r : p \to (r, i_2) \to (r, j_2) \to q , \quad i_1 < r < i_2 .$$

For every $r = i_1, \dots, i_2 - 1$, the term $E(\gamma_r)$ depends on $\xi$. If we set $\xi_{j_1} = 1$ and all the other coordinates zero, then we obtain (7.2).

With these choices of control sets one gets for $\tau_p(q)$ the formula (7.2) in the previous section. Again the set of controlled paths joining $p$ to $q$ is much smaller than the set of all ascending paths joining $p$ to $q$. For instance all controlled paths are of length less than three.

How should the control sets $C_q$ be selected for an arbitrary graph? A few obvious desiderata are the following:

1. Let $q \in F_p$ and let $N_q$ be the set of upward pointing edges with terminal vertex $q$ and with initial vertex not contained in $F_p$. Then a good strategy is to choose $C_q$ to contain $N_q$, since this makes as many as possible of the summands in (8.3) equal to zero. Note, by the way, that if the index of $p$ is $k$, the cardinality of $N_q$ is less than or equal to $k$; so the interpolation condition requires that at least one of the edges in $C_q$ joins $q$ to a vertex of $F_p$.


2. Sara Billey pointed out to us that for the permutahedron (see Section 5) the condition $\#N_q = k$ characterizes those vertices on the flow-up $F_p$ which are non-singular fixed points of the Schubert variety corresponding to the permutation $p$ (see Lakshmibai and Sandhya [1990]). This suggests that one take the condition $\#N_q = k$ to be one's definition of non-singularity for points of $F_p$. Then, at non-singular points,

$$\tau_p(q) = \mathrm{Const} \cdot \prod_{e \in N_q} \alpha_e .$$

Notice that in Example 1 all points of $F_p$ are non-singular and in Example 2 the points $(j_1, j_2)$ with $i_1 \leq j_1 < i_2 \leq j_2$ are non-singular.

3. A singular point $q \in F_p$ might be adjacent to several non-singular points $q'$ of $F_p$ with $q' < q$. If so, a good strategy is to insist that $C_q$ contain the edges joining these vertices to $q$. In particular, we say that $q$ is a singular point of $F_p$ of degree one if there are at least $k+1$ such edges; and for these points all the edges in $C_q$ can be required to be of this type. By induction it is clear how to define singular points of degree $r$ on the flow-up $F_p$, and for these singular points all the edges in $C_q$ can be required to have, as initial vertices, singular points of degree less than $r$. This requirement, by the way, drastically reduces the number of choices for the $C_q$'s (as is evident, for instance, in the two examples above).

Acknowledgments: Research by Victor Guillemin was partially supported by NSF grant DMS 890771. Research by Catalin Zara was partially supported by the Clay Mathematics Institute Liftoﬀ Program.

References

Atiyah, M. F. and R. Bott [1984], The moment map and equivariant cohomology, Topology 23, 1–28.

Berline, N. and M. Vergne [1982], Classes caractéristiques équivariantes, C.R. Acad. Sci., Paris 295, 539–541.

Bernstein, I. N., I. M. Gelfand, and S. I. Gelfand [1973], Schubert cells and cohomology of the spaces G/P, Russian Math. Surveys 28, 1–26.

Billey, S. and M. Haiman [1995], Schubert polynomials for the classical group, J. Amer. Math. Soc. 8, 443–482.

Billey, S. [1999], Kostant polynomials and the cohomology ring for G/B, Duke Math. J. 96, 205–224.

Goresky, M., R. Kottwitz and R. MacPherson [1998], Equivariant cohomology, Koszul duality and the localization theorem, Invent. Math. 131, 25–83.

Guillemin, V., T. Holm, and C. Zara [2001], A GKM description of the equivariant cohomology ring of homogeneous spaces, Technical Report Math. SG/0112184.

Guillemin, V. and C. Zara [1999], Equivariant de Rham theory and graphs, Asian J. of Math. 3, 49–76.

Guillemin, V. and C. Zara [2000], Morse Theory on Graphs, Technical Report Math. CO/0007161.

Guillemin, V. and C. Zara [2001], One-skeleta, Betti numbers and Equivariant Cohomology, Duke Math. J. 107, 283–349.

Knutson, A. [2001], Descent cycling in Schubert calculus, (in preparation).

Kogan, M. [2000], Schubert geometry of flag varieties and Gelfand-Cetlin Theory, Ph.D. thesis, MIT.

Lakshmibai, V. and B. Sandhya [1990], Criterion for smoothness of Schubert varieties in Sl(n)/B, Proc. Indian Acad. Sci. Math. Sci. 100, 45–52.

Zara, C. [2000], One-skeleta and the equivariant cohomology of GKM manifolds, Ph.D. thesis, MIT.

Zara, C. [2001], Generators for the equivariant cohomology ring of GKM manifolds, (in preparation).

13 Gauge Theory of Small Vibrations in Polyatomic Molecules

Robert G. Littlejohn
Kevin A. Mitchell

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT The problem of small vibrations in polyatomic molecules is examined from the standpoint of gauge theory and fiber bundle theory. The Eckart conventions and their privileged status are given a geometrical interpretation (the Eckart coordinates are shown to be Riemann normal coordinates and the Eckart frame is a non-Abelian version of Poincaré gauge). The Hamiltonian is developed in covariant Taylor series and averaged over rapid vibrations to second order. The averaged Hamiltonian is expressed in terms of geometrical objects such as the Riemann and Coriolis curvature tensors on shape space.

Contents

1 Introduction
2 The Perturbation Problem
   2.1 Small Vibrations in One Dimension
   2.2 Small Vibrations of a Diatomic Molecule
   2.3 The Hamiltonian for Small Vibrations in Polyatomic Molecules
3 Special Gauges, Frames and Coordinates
   3.1 A Model Problem
   3.2 Non-Abelian Poincaré Gauge
   3.3 Riemann Normal Coordinates and the Eckart Conventions
4 Expanding and Averaging the Hamiltonian
   4.1 The Covariant Expansion of the Hamiltonian
   4.2 The Perturbation Calculation
5 Conclusions
References

1 Introduction

It is truly a pleasure to dedicate this article to Jerry Marsden, who has not only been the inspiration for much of our own work, but who has also been a good friend. We are especially grateful for the series of lectures which Jerry gave to one of us (R.L.) and his students perhaps eight or nine years ago on applications of geometrical methods to molecular dynamics. The field of geometry, symmetry and dynamics would be greatly impoverished without Jerry, and we hope he will continue to have as much influence on it (and us) in the future as he has had in the past.

The problem of small vibrations in rotating systems is an intrinsically interesting one. Here we find corrections to rigid body behavior for a body which is "stiff" but not infinitely so, so that it is capable of small amplitude, high frequency vibrations as it undergoes slower rotations. Since a rigid body is an idealization which does not exist in nature, we can see in this problem how the idealization comes about as the limit of a more realistic system.

One of the most physically important and well studied examples of a nearly rigid body is a molecule, with important fundamental work going back to the 1930s. The current status of this field and references to the earlier literature are presented in standard books on molecular physics (for example, Wilson, Decius, and Cross [1955], Papoušek and Aliev [1982] and Zare [1988]). The standard approach to small vibrations is to expand the Wilson–Howard–Watson Hamiltonian (Watson [1968]) about an equilibrium position. This Hamiltonian is committed to the Eckart conventions (Eckart [1935], Wilson, Decius, and Cross [1955], Louck and Galbraith [1976], Biedenharn and Louck [1981] and Ezra [1982]), which consist of a privileged choice of body frame (or gauge) as a field over shape space, and a privileged choice of coordinates on shape space.
The form of this Hamiltonian and the operations performed on it in the standard analysis of small vibrations are intricately entwined with the Eckart conventions, so that it is very difficult to see the geometrical meaning of the various terms and expressions which result. It is generally believed in the molecular literature that the Eckart conventions have overwhelming advantages for problems involving small amplitude motions, although we have found it difficult to find a completely clear examination of the issue. One of the results of this article is to confirm many of these assumptions and to place them within the geometrical framework of fiber bundle theory.

The traditional molecular literature is fundamentally coordinate-based and non-geometrical in nature. This situation began to change in the 1980s, with the realization (by Guichardet [1984], Iwai [1986], Tachibana and Iwai [1986] and Shapere and Wilczek [1989]) that the separation of rotations from internal motions involves a certain non-Abelian, SO(3) gauge field (the Coriolis field), and that the proper geometrical framework for understanding this problem is that of fiber bundle theory. The situation is reviewed


by Littlejohn and [1997], who integrate the newer approaches with the traditional literature. These newer developments have been expressed in terms of the geometry of configuration space, rather than phase space, and so have been somewhat independent of the earlier and more general theory of reduction (Abraham and Marsden [1978], Marsden and Ratiu [1994]). But of course we are dealing here with a special case of reduction theory, in which the symmetry group acts primarily on configuration space (probably the most important case from a physical standpoint).

The purpose of this article is to examine small vibrations of a molecule from a completely covariant and gauge-invariant standpoint. One of our results is to show that the Eckart frame and coordinates do have a privileged status in the analysis of small vibrations, and to explain the geometrical significance of this fact. As we will show, the Eckart coordinates turn out to be identical with Riemann normal coordinates on shape space and the Eckart frame is a non-Abelian version of Poincaré gauge.

One of the main results of a theory of small vibrations is the form of the Hamiltonian averaged over the rapid vibrations (the normal form), our Eqs. (4.9) and (4.10). This Hamiltonian contains information about shifts in energy levels due to the "Coriolis coupling," the "centrifugal distortion" and other physical effects. Since energy level shifts are physically observable, they cannot depend on conventions for body frame (gauge) or coordinate system, Eckart or otherwise. Therefore we believed when we started this work that these physical effects must be expressible in manifestly gauge-invariant and coordinate-invariant form, presumably involving the Coriolis curvature tensor $B_{\mu\nu}$ (rather than the gauge potential $A_\mu$), the Riemann curvature tensor $R^\mu{}_{\nu\alpha\beta}$, etc. Indeed this is precisely what we found. Nevertheless, the Eckart conventions are involved in deriving these results, as we will show.
We proceed by analyzing small vibrations in diatomic molecules, then we set up the reduced Hamiltonian for polyatomic molecules and discuss choices of gauges which simplify the perturbation analysis. A model problem involving small vibrations in a U(1) (electromagnetic) gauge field leads us to the choice of Poincaré gauge, and similarly we are led to choose Riemann normal coordinates to simplify the expansion of the Hamiltonian. These are shown to be identical to Eckart's conventions. We next develop the Hamiltonian in a covariant Taylor series expansion, and then average it (transform it to normal form). The results are briefly discussed. Finally we present some conclusions and ideas for new applications.

2 The Perturbation Problem

2.1 Small Vibrations in One Dimension

The small vibrations of a particle of mass $m$ moving near the bottom of a one-dimensional potential $V(x)$ are described by the Hamiltonian

$$H = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2} + \kappa V_3 \frac{x^3}{6} + \kappa^2 V_4 \frac{x^4}{24} + \dots , \tag{2.1}$$

where we have expanded the potential about the minimum at $x = 0$, denoted the $n$-th derivative of the potential at $x = 0$ by $V_n$, introduced the frequency of small vibrations $\omega$ by writing $V_2 = m\omega^2$, dropped the constant term $V(0)$, and introduced a formal ordering parameter $\kappa$ which indicates the order of the successive terms. Equation (2.1) is the starting point for the transformation to Birkhoff normal form (Birkhoff [1927], Eckhardt [1986]), which we carry out in action-angle variables $(\theta, I)$,

$$x = \left( \frac{2I}{m\omega} \right)^{1/2} \sin\theta , \qquad p = (2Im\omega)^{1/2} \cos\theta . \tag{2.2}$$

The sequence of canonical transformations which formally eliminates the angle dependence of the Hamiltonian is conveniently carried out using Lie transform methods (Cary [1981], Dragt and [1976]), which through terms quadratic in the action yield the averaged Hamiltonian,

$$K = I\omega + \kappa^2 (I\omega)^2 \left( -\frac{5V_3^2}{48 m^3 \omega^6} + \frac{V_4}{16 m^2 \omega^4} \right) + \dots . \tag{2.3}$$

The first correction in $K$ occurs at second order in $\kappa$, since all terms first order in $\kappa$ average to zero; this correction contains the nonlinear frequency shift, a physically important quantity.
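A numerical sanity check of (2.3) is possible without any Lie transforms, because in one degree of freedom the action is $I = (1/2\pi)\oint p\,dx$. The sketch below is ours and is not part of the derivation: it sets $\kappa = 1$, picks hypothetical values for $m$, $\omega$, $V_3$, $V_4$ and a small energy $E$, computes the action by quadrature, and compares $E$ with the truncated normal form. It assumes the potential is monotonic between the minimum and twice the harmonic turning point on each side, which holds for the small anharmonicities chosen here.

```python
import math

def potential(x, m, omega, V3, V4):
    """Potential of (2.1) with kappa = 1 and V(0) dropped."""
    return 0.5 * m * omega**2 * x**2 + V3 * x**3 / 6.0 + V4 * x**4 / 24.0

def turning_point(E, outer, m, omega, V3, V4):
    """Bisect for V(x) = E between the minimum x = 0 and the point `outer`."""
    inner = 0.0
    for _ in range(200):
        mid = 0.5 * (inner + outer)
        if potential(mid, m, omega, V3, V4) < E:
            inner = mid
        else:
            outer = mid
    return 0.5 * (inner + outer)

def action(E, m, omega, V3, V4, n=4000):
    """I = (1/pi) * integral of sqrt(2 m (E - V)) dx between the turning points."""
    x_h = math.sqrt(2.0 * E / (m * omega**2))        # harmonic turning point
    x_minus = turning_point(E, -2.0 * x_h, m, omega, V3, V4)
    x_plus = turning_point(E, 2.0 * x_h, m, omega, V3, V4)
    c, h = 0.5 * (x_plus + x_minus), 0.5 * (x_plus - x_minus)
    total = 0.0
    for k in range(n):
        # Midpoint rule in phi, with x = c + h sin(phi) removing the
        # square-root endpoint singularities.
        phi = -0.5 * math.pi + (k + 0.5) * math.pi / n
        x = c + h * math.sin(phi)
        p = math.sqrt(max(0.0, 2.0 * m * (E - potential(x, m, omega, V3, V4))))
        total += p * h * math.cos(phi)
    integral = total * math.pi / n
    return integral / math.pi

def averaged_energy(I, m, omega, V3, V4):
    """Right-hand side of (2.3) with kappa = 1, truncated after second order."""
    second = (I * omega)**2 * (-5.0 * V3**2 / (48.0 * m**3 * omega**6)
                               + V4 / (16.0 * m**2 * omega**4))
    return I * omega + second

# Hypothetical parameters, chosen so that the anharmonic terms are small.
m, omega, V3, V4 = 1.0, 1.0, 0.6, 0.1
E = 0.05
I = action(E, m, omega, V3, V4)
K = averaged_energy(I, m, omega, V3, V4)
```

With these values the residual `E - K` should be small compared with the second-order correction `K - I * omega`, which is the behavior one expects if the remaining discrepancy is of higher order in the action.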

2.2 Small Vibrations of a Diatomic Molecule

Consider now the small vibrations of a diatomic molecule, from the classical standpoint (see, for example, Wilson, Decius, and Cross [1955], Kroto [1975], or Bunker [1979]). The Hamiltonian for the radial coordinate (the reduced Hamiltonian for the two-body problem) is

$$H = \frac{p^2}{2m} + \frac{L^2}{2mr^2} + V(r) , \tag{2.4}$$

where p is the momentum conjugate to r, m is the reduced mass of the two atoms (of the order of an atomic mass), L is the magnitude of the angular momentum (a constant), and V (r) is the Born–Oppenheimer potential, assumed to have a minimum at r = r0 . We do not simply expand the


total potential (true plus centrifugal) about its minimum, because there are slightly nontrivial ordering issues.

To expand and order this Hamiltonian in a physically realistic manner, we must specify physically interesting values of the parameters and initial conditions. These depend on the physical circumstances, so the ordering scheme is not unique, but the following is a common and reasonable approach. We begin with atomic units, in which $m_e$ (the electron mass), $\hbar$ (Planck's constant) and $e$ (the electron charge) are all set to unity. Then a typical nuclear (or atomic) mass is of the order of $10^4$, which we regard as order $\kappa^{-4}$ (this is standard Born–Oppenheimer ordering). The potential $V(r)$ is independent of the nuclear mass and so is regarded as of order $\kappa^0 = 1$. This applies not only to the depth $V(r_0)$ of the potential but also to its approximate range $r_0$, its spring constant $k = V_2 = V''(r_0)$, etc. However, the frequency $\omega = (k/m)^{1/2}$ of small vibrations involves the nuclear mass, and thus turns out to be of order $\kappa^2$. Thus these vibrations are slower than typical electronic motions by a factor of $\kappa^{-2}$ (that is, about 100).

Next we assume the initial conditions are consistent with a vibrational quantum number of order unity (that is, independent of $\kappa$). This is a reasonable assumption at ordinary temperatures. Thus the vibrational amplitude $x = r - r_0$ (the displacement from equilibrium) is of order $(\hbar/m\omega)^{1/2}$, that is, of order $\kappa$. This in turn implies that the vibrational energy is of order $\kappa^2$, the vibrational velocity $v = \dot{x}$ is of order $\kappa^3$, and the vibrational momentum $p = mv$ is of order $\kappa^{-1}$.

Finally there is the question of the order of magnitude of the angular momentum $L$, which also depends on the initial conditions. We will assume, in accordance with thermal equilibrium, that the vibrational energy and the rotational kinetic energy, $L^2/2mr^2$, are comparable, that is, of order $\kappa^2$. This implies that $L$, or, equivalently, the angular momentum quantum number $\ell$, is of order $\kappa^{-1}$. Note that with this ordering, the moment of inertia $M$ is of order $\kappa^{-4}$, and hence the rotational frequency $\Omega_r = M^{-1} L$ is of order $\kappa^3$. This is a factor of $\kappa$ times slower than the vibrational frequency, so we have an adiabatic separation of time scales between the vibrational and rotational motions. The molecule vibrates on the order of 10 times during each rotation.

Other ordering schemes than the one we have presented are possible, but correspond to different physical circumstances. Our assumptions regarding the rotational kinetic energy imply that the amount of centrifugal distortion is small, namely, of order $\kappa$. Thus, the shape of the molecule at a minimum of the potential energy is nearly the same as the shape in a relative (rotating) equilibrium, and the difference between the two can be handled by perturbation theory. This may be a disappointment to those interested in the theory of relative equilibria, which is an important part of the rest of this volume. The situation would be different for rapidly rotating molecules; in this case, studies have been carried out by Jellinek and Li [1989], Kozin, Roberts, and Tennyson [2000] and others. However, such rapid rotation is unusual from a physical standpoint, and is of less common


interest than the case we consider here. The ordering scheme we have developed is equivalent to that developed by Nielson (discussed in Papoušek and Aliev [1982]).

Although we have used quantum concepts to work out the ordering of various quantities, we will carry out a classical treatment of the Hamiltonian. This is reasonable since the classical treatment closely parallels the proper quantum treatment, and in any case is good practice before doing the quantum calculation. To introduce a formal ordering parameter consistent with the Nielson ordering scheme, we make the substitutions

$$m = m'/\kappa^4 , \qquad r = r_0 + \kappa x' , \qquad p = p'/\kappa \qquad \text{and} \qquad L = L'/\kappa$$

in the Hamiltonian (2.4). We note that $(r, p) \to (x', p')$ is a canonical transformation. Then we drop the primes, expand both the true potential $V(r_0 + \kappa x)$ and the centrifugal potential $L^2/2m(r_0 + \kappa x)^2$ in $\kappa$, and drop the constant $V(r_0)$. Finally, we cancel an overall factor of $\kappa^2$ from the Hamiltonian, or, equivalently, set $t = t'/\kappa^2$, which means working with a new time variable in which the vibrational time scale is of order unity. The result is the Hamiltonian $H = \sum_{n=0}^{\infty} \kappa^n H_n$, where

$$H_0 = \frac{p^2}{2m} + \frac{m\omega^2 x^2}{2} + \frac{L^2}{2mr_0^2} , \tag{2.5}$$

$$H_n = (-1)^n (n+1) \frac{L^2}{2mr_0^2} \left( \frac{x}{r_0} \right)^n + \frac{V_{n+2}}{(n+2)!}\, x^{n+2} , \qquad n > 0 . \tag{2.6}$$
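The centrifugal part of (2.5) and (2.6) is just the binomial series of $(r_0 + \kappa x)^{-2}$, and this is easy to confirm numerically. The following sketch is ours; the parameter values are hypothetical and the function names are not from the text. It compares the exact centrifugal potential with partial sums of the series, where the $n = 0$ term is the constant appearing in $H_0$.

```python
def centrifugal_exact(x, kappa, L, m, r0):
    """Scaled centrifugal potential L^2 / (2 m (r0 + kappa x)^2)."""
    return L**2 / (2.0 * m * (r0 + kappa * x)**2)

def centrifugal_series(x, kappa, L, m, r0, order):
    """Sum over n <= order of kappa^n (-1)^n (n+1) (L^2 / 2 m r0^2) (x / r0)^n."""
    prefactor = L**2 / (2.0 * m * r0**2)
    return sum(kappa**n * (-1)**n * (n + 1) * prefactor * (x / r0)**n
               for n in range(order + 1))

# Hypothetical values with kappa * x / r0 = 0.025, well inside the
# radius of convergence of the series.
L, m, r0, x, kappa = 1.0, 1.0, 2.0, 0.5, 0.1
errors = [abs(centrifugal_exact(x, kappa, L, m, r0)
              - centrifugal_series(x, kappa, L, m, r0, order))
          for order in range(4)]
```

Each additional order should reduce the truncation error by roughly a factor of $\kappa x / r_0$, which is the sense in which $\kappa$ orders the centrifugal terms.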

The third term in Eq. (2.5) is the rotational kinetic energy at lowest order; it is a constant, but we retain it in the Hamiltonian because it depends on $L$, and we often wish to know how the energy depends on $L$. Otherwise it has no effect on the following analysis. The first term in Eq. (2.6) can be thought of as representing the effects of centrifugal distortion. This is particularly clear in the case $n = 1$, where this term is proportional to $x$ and can be incorporated into the harmonic oscillator in $H_0$ by completing the square, that is, by shifting the origin. The new origin is approximately the equilibrium configuration in the rotating system. Indeed, it would be possible to expand the total potential (true plus centrifugal) about the minimum of the total potential, rather than that of the true potential only, as done here. The result would be a reorganization of the expansion, with some first order terms being absorbed into zeroth order terms. This does not, however, seem to offer any great advantages.

We now transform this Hamiltonian to action-angle variables as before and apply the normal form or averaging transformation. Writing $\sum_n \kappa^n K_n$ for the averaged Hamiltonian, we find $K_n = 0$ for $n$ odd, and

$$K_0 = I\omega + \frac{L^2}{2mr_0^2} , \tag{2.7}$$

$$K_2 = (I\omega)^2 \left( -\frac{5V_3^2}{48 m^3 \omega^6} + \frac{V_4}{16 m^2 \omega^4} \right) + (I\omega)\, \frac{L^2}{2mr_0^2} \left( \frac{3}{m\omega^2 r_0^2} + \frac{V_3}{m^2 \omega^4 r_0} \right) - \left( \frac{L^2}{2mr_0^2} \right)^2 \frac{2}{m\omega^2 r_0^2} . \tag{2.8}$$

Hamilton's equation $\Omega_r = \partial K/\partial L$ gives the angular velocity $\Omega_r$ of rotation of the molecule (valid for a diatomic), which for a rigid body would be $M^{-1} L$, where $M$ is the moment of inertia. If we use this rigid body formula also in the case of the vibrating molecule, we effectively define $M^{-1}$ for a non-rigid molecule. We then find that $M^{-1}$ is given by its equilibrium value $1/mr_0^2$ plus one correction proportional to $L^2$ and another proportional to the vibrational action $I$. The first of these corrections indicates the centrifugal distortion (that is, the distortion to the shape of the rotating molecule due to centrifugal forces), while the second is the correction to the moment of inertia due to the vibrations, averaged over the rapid vibrations.

The other one of Hamilton's equations, $\Omega_v = \partial K/\partial I$, gives the vibrational frequency $\Omega_v$, which is $\omega$ (the frequency of small vibrations) plus a correction proportional to $I$ (due to nonlinearity of the potential) and another proportional to $L^2$. The latter indicates nonvanishing of the average of the centrifugal forces over the rapid vibrations (compared to the slower rotations).

This calculation was classical, but a good approximation to the correct quantum result is obtained by replacing $L^2$ by $\ell(\ell+1)\hbar^2$ and $I$ by $(n + 1/2)\hbar$, where $n$ is the vibrational quantum number. From the observed spectrum of the molecule (differences between energy levels) it is possible to determine the parameters of the Hamiltonian ($\omega$, $r_0$, $V_3$, $V_4$).
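The statements about $\Omega_r$ and the effective $M^{-1}$ can be made concrete with a short computation. The sketch below is ours and rests on two assumptions: it uses the terms of $K_0 + \kappa^2 K_2$ as we read them from (2.7) and (2.8), and the parameter values are hypothetical. It differentiates $K$ numerically and compares $\Omega_r / L$ with the hand-differentiated expression, which consists of $1/mr_0^2$ plus one correction proportional to $I$ and one proportional to $L^2$.

```python
def K(I, L, m, omega, r0, V3, V4, kappa):
    """Averaged Hamiltonian K0 + kappa^2 K2, as read from (2.7)-(2.8)."""
    c = L**2 / (2.0 * m * r0**2)            # rigid-rotor energy at equilibrium
    Iw = I * omega
    K0 = Iw + c
    K2 = (Iw**2 * (-5.0 * V3**2 / (48.0 * m**3 * omega**6)
                   + V4 / (16.0 * m**2 * omega**4))
          + Iw * c * (3.0 / (m * omega**2 * r0**2)
                      + V3 / (m**2 * omega**4 * r0))
          - c**2 * 2.0 / (m * omega**2 * r0**2))
    return K0 + kappa**2 * K2

def frequencies(I, L, params, h=1e-6):
    """Central-difference estimates of Omega_v = dK/dI and Omega_r = dK/dL."""
    omega_v = (K(I + h, L, *params) - K(I - h, L, *params)) / (2.0 * h)
    omega_r = (K(I, L + h, *params) - K(I, L - h, *params)) / (2.0 * h)
    return omega_v, omega_r

def inverse_moment_of_inertia(I, L, m, omega, r0, V3, V4, kappa):
    """Effective M^{-1} = Omega_r / L, obtained by differentiating K by hand."""
    base = 1.0 / (m * r0**2)
    vibrational = I * omega * (3.0 / (m * omega**2 * r0**2)
                               + V3 / (m**2 * omega**4 * r0))
    centrifugal = -(L**2 / (2.0 * m * r0**2)) * 4.0 / (m * omega**2 * r0**2)
    return base * (1.0 + kappa**2 * (vibrational + centrifugal))

# Hypothetical parameters: (m, omega, r0, V3, V4, kappa).
params = (1.0, 1.0, 2.0, -0.3, 0.2, 0.1)
I, L = 0.5, 1.5
omega_v, omega_r = frequencies(I, L, params)
M_inv = inverse_moment_of_inertia(I, L, *params)
```

The centrifugal correction enters with a negative sign, so in this reading the effective moment of inertia grows with $L^2$, consistent with the molecule stretching as it rotates faster.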

2.3 The Hamiltonian for Small Vibrations in Polyatomic Molecules

Let us now consider the small vibrations of a polyatomic molecule with $N \geq 3$ atoms, modelled as a system of point masses interacting via a Born–Oppenheimer potential $V$. The first problem is to write down the reduced Hamiltonian (that is, reduced with respect to the translational and rotational invariance of the system). This is an old subject. Here we follow the notation of Littlejohn and [1997], which is based on local coordinate patches and local sections on the quotient (shape) space. It is understood that all constructions (coordinates, section and fields over shape space) are local, which is all we need for the problem of small vibrations.

We let $\{\mathbf{r}_{s\alpha},\ \alpha = 1, \dots, N\}$ be the positions of the $N$ atoms, referred to the inertial or "space" frame (hence the $s$ subscript). The translational degrees of freedom are eliminated by going to the center-of-mass frame, in which only $N - 1$ vectors $\{\boldsymbol{\rho}_{s\alpha},\ \alpha = 1, \dots, N-1\}$ are needed to specify the configuration. We choose these vectors to be mass-weighted Jacobi vectors, which means that the kinetic energy is Euclidean, that is, it has the form $\frac{1}{2} \sum_\alpha |\dot{\boldsymbol{\rho}}_{s\alpha}|^2$. The translation-reduced configuration space is $\mathbb{R}^{3N-3}$, which is foliated by the action of SO(3) to produce an SO(3) fiber bundle plus singular orbits, the latter consisting of the collinear configurations. Generic orbits are fibers, diffeomorphic to SO(3). The quotient space $\mathbb{R}^{3N-3}/\mathrm{SO}(3)$ is "shape space," the base space of the bundle plus the singular (collinear) shapes.

We introduce a coordinate system $q^\mu$, $\mu = 1, \dots, 3N-6$, essentially arbitrary at this point, on shape space. We also introduce a section of the fiber bundle, also essentially arbitrary at this point, which is equivalent to the definition of a body frame (a configuration on the section is one for which the body frame and space frame coincide). The $s$ subscript is omitted on vectors and tensors referred to the body frame. The Hamiltonian is expressed in terms of three fields over shape space, defined by

$$\mathbf{M} = \sum_\alpha \left( |\boldsymbol{\rho}_\alpha|^2\, \mathsf{I} - \boldsymbol{\rho}_\alpha \otimes \boldsymbol{\rho}_\alpha \right) , \tag{2.9}$$

$$\mathbf{A}_\mu = \mathbf{M}^{-1} \cdot \sum_\alpha \boldsymbol{\rho}_\alpha \times \frac{\partial \boldsymbol{\rho}_\alpha}{\partial q^\mu} , \tag{2.10}$$

$$g_{\mu\nu} = \sum_\alpha \frac{\partial \boldsymbol{\rho}_\alpha}{\partial q^\mu} \cdot \frac{\partial \boldsymbol{\rho}_\alpha}{\partial q^\nu} - \mathbf{A}_\mu \cdot \mathbf{M} \cdot \mathbf{A}_\nu , \tag{2.11}$$

where $\mathbf{M}$ is the moment of inertia tensor, $\mathsf{I}$ is the identity tensor, $\mathbf{A}_\mu$ is the Coriolis gauge potential, and $g_{\mu\nu}$ is the metric tensor on shape space. Then the translation- and rotation-reduced Hamiltonian is

$$H = \tfrac{1}{2}\, \mathbf{L} \cdot \mathbf{M}^{-1} \cdot \mathbf{L} + \tfrac{1}{2}\, v_\mu g^{\mu\nu} v_\nu + V(q) , \tag{2.12}$$

where $\mathbf{L}$ is the angular momentum and where $v_\mu$ is the covariant "shape" velocity, given in terms of $p_\mu$, the momentum conjugate to $q^\mu$, by

$$v_\mu = p_\mu - \mathbf{L} \cdot \mathbf{A}_\mu . \tag{2.13}$$
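To make the first term of (2.12) concrete, here is a minimal sketch (ours, in plain Python) that builds the moment of inertia tensor from a set of mass-weighted Jacobi vectors, using the standard form $\mathbf{M} = \sum_\alpha (|\boldsymbol{\rho}_\alpha|^2 \mathsf{I} - \boldsymbol{\rho}_\alpha \otimes \boldsymbol{\rho}_\alpha)$, and evaluates the vertical kinetic energy $\frac{1}{2} \mathbf{L} \cdot \mathbf{M}^{-1} \cdot \mathbf{L}$. The shape and the angular momentum below are hypothetical, and the gauge potential and metric are deliberately left out because they require derivatives along a chosen section.

```python
def inertia_tensor(rhos):
    """M = sum_a (|rho_a|^2 * Identity - rho_a (x) rho_a) for Jacobi vectors rho_a."""
    M = [[0.0] * 3 for _ in range(3)]
    for rho in rhos:
        r2 = sum(c * c for c in rho)
        for i in range(3):
            for j in range(3):
                M[i][j] += (r2 if i == j else 0.0) - rho[i] * rho[j]
    return M

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def solve3(M, b):
    """Solve M y = b by Cramer's rule (M must be invertible: noncollinear shape)."""
    d = det3(M)
    y = []
    for k in range(3):
        Mk = [[b[i] if j == k else M[i][j] for j in range(3)] for i in range(3)]
        y.append(det3(Mk) / d)
    return y

def vertical_kinetic_energy(L, rhos):
    """(1/2) L . M^{-1} . L, the first term of (2.12)."""
    y = solve3(inertia_tensor(rhos), L)
    return 0.5 * sum(Li * yi for Li, yi in zip(L, y))

# Hypothetical noncollinear shape: two Jacobi vectors, i.e. N = 3 atoms.
rhos = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
L = (0.0, 0.0, 3.0)
M = inertia_tensor(rhos)
T_vert = vertical_kinetic_energy(L, rhos)
```

For a collinear shape the tensor built this way is singular, which is one way to see why the collinear configurations have to be treated separately from the generic fibers.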

The three terms in the Hamiltonian (2.12) are the vertical kinetic energy, the horizontal kinetic energy and the potential energy. We also need the symplectic 1-form to find equations of motion, which is $\theta = p_\mu\, dq^\mu + L_i \lambda^i$, where $\lambda^i$, $i = 1, 2, 3$, are the left-invariant 1-forms on SO(3), transported to the rotation fibers in $\mathbb{R}^{3N-3}$. The absence of the $s$-subscript on $\boldsymbol{\rho}_\alpha$, $\mathbf{M}$, $\mathbf{A}_\mu$ and $\mathbf{L}$ indicates the body frame. Thus, the components $L_i$ satisfy the usual (body frame) Poisson bracket relations, $\{L_i, L_j\} = -\epsilon_{ijk} L_k$.

Now suppose that $q_0$ is an equilibrium shape; namely, one for which $\partial V/\partial q^\mu = 0$. We assume that $q_0$ is a noncollinear shape (it lies on a generic orbit of SO(3)). We wish to study small vibrations about this equilibrium. We use Nielson ordering for this purpose, which as above involves writing

$$m_\alpha = m'_\alpha/\kappa^4 , \qquad q^\mu = q_0^\mu + \kappa x^\mu , \qquad p_\mu = p'_\mu/\kappa , \qquad \mathbf{L} = \mathbf{L}'/\kappa \qquad \text{and} \qquad t = t'/\kappa^2 .$$

This is less trivial than in the diatomic case, because now $\mathbf{L}$ is a nontrivial dynamical variable. In addition, we set $\boldsymbol{\rho}_\alpha = \boldsymbol{\rho}'_\alpha/\kappa^2$ (because the Jacobi vectors are "mass-weighted," that is, they have absorbed factors of the square root of the atomic masses to make the kinetic energy Euclidean),

$$\mathbf{M} = \mathbf{M}'/\kappa^4 , \qquad \mathbf{A}_\mu = \mathbf{A}'_\mu , \qquad g_{\mu\nu} = g'_{\mu\nu}/\kappa^4 , \qquad g^{\mu\nu} = g'^{\mu\nu} \kappa^4 , \qquad v_\mu = v'_\mu/\kappa \qquad \text{and} \qquad v^\mu = v'^\mu \kappa^3 ,$$

which follow from the definitions above. The gauge potential $\mathbf{A}_\mu$ is independent of $\kappa$ because the angle of rotation of the "falling cat" is invariant when all masses are scaled by the same amount. We substitute these into the Hamiltonian, drop the primes, scale $H$ by $\kappa^2$ (which is the effect of the scaling of time), expand the potential and drop the constant $V(q_0)$. The result is

$$H = \tfrac{1}{2}\, \mathbf{L} \cdot \mathbf{M}^{-1} \cdot \mathbf{L} + \tfrac{1}{2}\, v_\mu g^{\mu\nu} v_\nu + \tfrac{1}{2}\, V_{,\mu\nu}\, x^\mu x^\nu + \tfrac{1}{6}\, \kappa V_{,\mu\nu\sigma}\, x^\mu x^\nu x^\sigma + \dots , \tag{2.14}$$

where $V_{,\mu\nu}$, $V_{,\mu\nu\sigma}$, etc., are the $q$-derivatives of the potential evaluated at equilibrium (commas represent ordinary derivatives), and where it is understood that $\mathbf{M}$, $\mathbf{A}_\mu$ (contained in $v_\mu$) and $g^{\mu\nu}$ are evaluated at $q_0^\mu + \kappa x^\mu$. If these dependencies are also expanded out, $H$ is arranged as a power series in $\kappa$ (it is also quadratic in $\mathbf{L}$, quadratic in $p_\mu$, and a power series in $x^\mu$). The symplectic form also becomes ordered in $\kappa$; it is now

$$\theta = p_\mu\, dx^\mu + \frac{1}{\kappa}\, L_i \lambda^i . \tag{2.15}$$

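Symplectic forms which are ordered like this into "large" and "small" parts are a standard occurrence in problems with multiple time scales. They are easily handled in perturbation theory by using a Poisson bracket which is correspondingly ordered; in this case, we have

$$\{f, g\} = \sum_\mu \left( \frac{\partial f}{\partial x^\mu} \frac{\partial g}{\partial p_\mu} - \frac{\partial f}{\partial p_\mu} \frac{\partial g}{\partial x^\mu} \right) - \kappa\, \mathbf{L} \cdot \left( \frac{\partial f}{\partial \mathbf{L}} \times \frac{\partial g}{\partial \mathbf{L}} \right) . \tag{2.16}$$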

Examples of such perturbation problems are presented in Littlejohn [1979] (for guiding center motion) and Littlejohn [1993] (for the Stern–Gerlach problem). In this case, the slow angular momentum Poisson bracket indicates the slow time scale of rotations, in comparison to vibrations. Corresponding problems in quantum mechanics can be handled by a kind of Weyl symbol perturbation theory, for which see Littlejohn and Flynn [1991], [1992] and Weigert and Littlejohn [1993], or in other cases by Van Vleck perturbation theory (Zare [1988]).

The Hamiltonian (2.12) is written in an arbitrary coordinate system and gauge (it is manifestly covariant), in contrast to the usual custom in the


study of small vibrations in the molecular literature, where the Hamiltonian is specialized to the Eckart conventions for gauge and shape coordinates. Moreover, it is the usual custom in the molecular literature to complete the square in the angular momentum $\mathbf{L}$ in a different way, thereby introducing new tensor-like fields (the "modified" inverse moment of inertia tensor and others) which are not true tensor fields. The effect is to obscure the geometrical meaning of the result. Our aim will be to maintain manifest covariance throughout.

One can give a geometrical interpretation to the Eckart frame (discussed in Littlejohn and [1997]), but this by itself does not explain whether the Eckart frame really has special virtues for the analysis of small vibrations. On the other hand, a straightforward expansion of the Hamiltonian (2.14) in a power series in $\kappa$ produces quite a few terms even through second order, and it is natural to ask whether a special choice of gauge or coordinate system will simplify these or the perturbation analysis itself.

The issue is already seen in the zeroth order term of Eq. (2.14),

$$H_0 = \tfrac{1}{2}\, \mathbf{L} \cdot \mathbf{M}_0^{-1} \cdot \mathbf{L} + \tfrac{1}{2} \left( p_\mu - \mathbf{L} \cdot \mathbf{A}_{0\mu} \right) g_0^{\mu\nu} \left( p_\nu - \mathbf{L} \cdot \mathbf{A}_{0\nu} \right) + \tfrac{1}{2}\, V_{,\mu\nu}\, x^\mu x^\nu , \tag{2.17}$$

where the 0-subscripts indicate that the fields are evaluated at $q_0$. It is not even clear that this Hamiltonian has effected a separation of vibrational and rotational motions at lowest order, due to the presence of the terms in $\mathbf{L} \cdot \mathbf{A}_{0\mu}$. Thus one is motivated to perform a gauge transformation such that in the new gauge, $\mathbf{A}_\mu$ vanishes at the equilibrium shape. The desirability of doing this was apparently first noted by Casimir [1931]. The geometrical condition for this is that the section should be orthogonal to the equilibrium fiber where it intersects that fiber; in fact, the Eckart section has this property, but so do many other choices of gauge, so it is still not clear that the Eckart choice is the best.

In any case, in such a gauge, the vertical kinetic energy becomes a function only of $\mathbf{L}$ (it is the rigid body kinetic energy for the molecule in the equilibrium shape), and the rest of $H_0$ is a harmonic oscillator. To bring out the harmonic oscillator more clearly, we may first choose our coordinates $q^\mu$ to be orthogonal at the equilibrium shape, so that $g_{0\mu\nu} = g_0^{\mu\nu} = \delta_{\mu\nu}$. The horizontal kinetic energy is then $\frac{1}{2} \sum_\mu p_\mu^2$. Then by a further, orthogonal transformation of coordinates, we can diagonalize $V_{,\mu\nu}$. With $\omega_\mu^2$ denoting the (presumed positive) eigenvalues, the unperturbed Hamiltonian takes on the form

$$H_0 = \tfrac{1}{2}\, \mathbf{L} \cdot \mathbf{M}_0^{-1} \cdot \mathbf{L} + \tfrac{1}{2} \sum_\mu \left( p_\mu^2 + \omega_\mu^2 x_\mu^2 \right) , \tag{2.18}$$

in which it is clear that rotations and vibrations are decoupled. Higher order terms will couple these degrees of freedom, of course, and introduce shifts in the energy levels of $H_0$; that is where the real work lies. Before proceeding, however, we will examine a model problem which will help

13. Theory of Small Vibrations


us address the question of privileged choices of gauge for simplifying the perturbation calculation.
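The orthogonal transformation that diagonalizes $V_{,\mu\nu}$ and leads to Eq. (2.18) can be illustrated in the smallest nontrivial case. The following is a minimal sketch for two shape coordinates, assuming the coordinates are already orthonormal at equilibrium so that the metric plays no role; the function name `normal_modes_2d` and the sample Hessian are illustrative choices, not taken from the text.

```python
import math

def normal_modes_2d(vxx, vxy, vyy):
    """Diagonalize the symmetric Hessian [[vxx, vxy], [vxy, vyy]] by a rotation.

    Returns (omega_1, omega_2, angle), where omega_k**2 are the eigenvalues
    and `angle` is the rotation taking the old axes to the normal-mode axes.
    """
    # Rotation angle that removes the off-diagonal element.
    angle = 0.5 * math.atan2(2.0 * vxy, vxx - vyy)
    c, s = math.cos(angle), math.sin(angle)
    # Diagonal elements of the Hessian in the rotated (normal-mode) frame.
    lam1 = c * c * vxx + 2.0 * c * s * vxy + s * s * vyy
    lam2 = s * s * vxx - 2.0 * c * s * vxy + c * c * vyy
    if lam1 <= 0.0 or lam2 <= 0.0:
        raise ValueError("Hessian is not positive definite at this shape")
    return math.sqrt(lam1), math.sqrt(lam2), angle

# Hypothetical Hessian with eigenvalues 3 and 1.
omega_1, omega_2, angle = normal_modes_2d(2.0, 1.0, 2.0)
print(omega_1 ** 2, omega_2 ** 2, angle)
```

For more than two coordinates one would use a general symmetric eigenvalue routine instead of the closed-form rotation; the positivity check corresponds to the "presumed positive" eigenvalues in the text.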

3 Special Gauges, Frames and Coordinates

3.1 A Model Problem

In order to address issues concerning small vibrations in the presence of a gauge field, let us consider an Abelian, $U(1)$ problem before attacking the non-Abelian, $SO(3)$ problem of small vibrations in a molecule. Specifically, let us consider a 3-dimensional harmonic oscillator perturbed by a magnetic field $\mathbf{B} = \nabla\times\mathbf{A}$. The Hamiltonian is

\[ H = \tfrac12\, (\mathbf{p} - \mathbf{A})^2 + \tfrac12 \sum_i \omega_i^2 x_i^2 , \tag{3.1} \]

where we have set $m = 1$ and absorbed other physical constants into the vector potential $\mathbf{A}$. The frequencies of the harmonic oscillator in the three directions are $\omega_i$, $i = 1, 2, 3$; these are assumed to be rationally independent, for simplicity. The magnetic field is allowed to be inhomogeneous, but the vibrational amplitude is assumed to be small compared to the scale length of the magnetic field, so field gradients are sampled only weakly. The effect of these assumptions is captured by an expansion and ordering of the vector potential,

\[ \mathbf{A}(\mathbf{x}) = \mathbf{A}(0) + \kappa\, \mathbf{x}\cdot\nabla\mathbf{A}(0) + \tfrac12 \kappa^2\, \mathbf{x}\mathbf{x} : \nabla\nabla\mathbf{A}(0) + \dots , \tag{3.2} \]

where $\kappa$ is the ordering parameter. We expect the magnetic field to have an effect on the vibrational frequencies and to give them a dependence on the actions. These effects should be gauge-invariant, that is, they should be expressible as a function of the magnetic field and its gradients alone. However, if we work in an arbitrary gauge, then even the unperturbed Hamiltonian is unattractive,

\[ H_0 = \tfrac12\, \left[ \mathbf{p} - \mathbf{A}(0) \right]^2 + \tfrac12 \sum_i \omega_i^2 x_i^2 , \tag{3.3} \]

because of the presence of the constant vector A(0). It is possible to do the perturbation calculation in an arbitrary gauge, but it is certainly more convenient to transform to a gauge in which A(0) = 0. Similar transformations are suggested at each higher order, the eﬀect of which is to express the vector potential near x = 0 as a power series in x in which the coeﬃcients depend only on B and its gradients. If these transformations are not made, the perturbation analysis can still be carried out, but since the answers do turn out to depend only on B and its gradients, any terms in the expansion of A which do not contribute to the ﬁnal answers (for example, the


symmetric part of $\nabla\mathbf{A}(0)$) drop out of the analysis. Thus, it is convenient to perform a series of gauge transformations to eliminate these superfluous terms. The details are unimportant, but what emerges is a power series representation of Poincaré gauge, given by

\[ \begin{aligned} \mathbf{A}(\mathbf{x}) &= \int_0^1 t\, dt\; \mathbf{B}(t\mathbf{x})\times\mathbf{x} \\ &= \left[ \tfrac12 \mathbf{B}(0) + \tfrac13\, \mathbf{x}\cdot\nabla\mathbf{B}(0) + \tfrac18\, \mathbf{x}\mathbf{x} : \nabla\nabla\mathbf{B}(0) + \dots \right] \times\mathbf{x} . \end{aligned} \tag{3.4} \]

Poincaré gauge is defined by the integral formula on the first line (which is Poincaré's formula for uncurling a vector field); we remark that a completely equivalent definition is $\mathbf{x}\cdot\mathbf{A}(\mathbf{x}) = 0$ (the gauge is transverse in real space). Poincaré gauge is used in low-energy quantum electrodynamics (Cohen-Tannoudji [1989]), where it is useful for expressing the interaction of a localized charge and current distribution with the electromagnetic field in terms of the field and its gradients at the center of the distribution. The purpose here is similar, since the amplitude of vibration is small compared to field scale lengths.

Even in Poincaré gauge, the perturbation analysis of the Hamiltonian (3.1) is not entirely trivial. It helps to use the following formula for the second order averaged Hamiltonian, valid for an arbitrary, bound, nonresonant system of $N$ degrees of freedom. Let $H = H_0 + \kappa H_1 + \kappa^2 H_2 + \dots$, where the action-angle variables are $(\boldsymbol{\theta}, \mathbf{I})$, where $H_0$ depends only on $\mathbf{I}$, where $\boldsymbol{\omega} = \partial H_0/\partial\mathbf{I}$, and where the other terms in $H$ are expanded in Fourier series according to

\[ H_k(\boldsymbol{\theta}, \mathbf{I}) = \sum_{\mathbf{n}} H_{k\mathbf{n}}(\mathbf{I})\, \exp(i\,\mathbf{n}\cdot\boldsymbol{\theta}) \quad \text{for } k \ge 1 . \]

Vectors $\mathbf{I}$, $\boldsymbol{\theta}$, $\boldsymbol{\omega}$ and $\mathbf{n}$ are $N$-vectors, and $\mathbf{n}$ consists of integers. Then the first order averaged Hamiltonian is $K_1 = H_{10}$, the $\mathbf{n} = (0, \dots, 0)$ Fourier component of $H_1$, and the second order averaged Hamiltonian is given by

\[ K_2 = H_{20} - \tfrac12 \sum_{\mathbf{n}\ne 0} \mathbf{n}\cdot\frac{\partial}{\partial\mathbf{I}} \left( \frac{|H_{1\mathbf{n}}|^2}{\mathbf{n}\cdot\boldsymbol{\omega}} \right) . \tag{3.5} \]
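Returning briefly to the series in Eq. (3.4): its numerical coefficients 1/2, 1/3, 1/8 follow from inserting the Taylor expansion of $\mathbf{B}(t\mathbf{x})$ into the integral, since the term of degree $n$ in $\mathbf{x}$ carries the factor $\int_0^1 t \cdot t^n/n!\, dt = 1/[(n+2)\,n!]$. The short sketch below evaluates that factor exactly; the function name `poincare_coefficient` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial

def poincare_coefficient(n):
    """Coefficient of the n-th gradient term in the Poincare-gauge series.

    It equals the integral over t in [0, 1] of t * t**n / n!, that is,
    1 / ((n + 2) * n!).
    """
    return Fraction(1, (n + 2) * factorial(n))

# Coefficients multiplying B(0), x.grad B(0) and xx:grad grad B(0).
coefficients = [poincare_coefficient(n) for n in range(3)]
print(coefficients)
```

The same integral fixes the coefficients of the non-Abelian series whenever the gauge potential is given by a radial integral of the field tensor.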

With the help of Eq. (3.5) and working in Poincaré gauge, we can carry out the perturbation analysis of small vibrations in a magnetic field through second order. The averaged Hamiltonian is given by $K = K_0 + \kappa^2 K_2$, where

\[ K_0 = \sum_i \omega_i I_i , \tag{3.6} \]

\[ K_2 = \tfrac14 \sum_{i\ne j} B_{ij}^2\, \frac{\omega_i I_i - \omega_j I_j}{\omega_i^2 - \omega_j^2} , \tag{3.7} \]

where $B_{ij} = \epsilon_{ijk} B_k$, and where it is understood that the magnetic field is evaluated at $\mathbf{x} = 0$. Not surprisingly, the first nonvanishing correction


only involves the strength of the magnetic ﬁeld at the origin (gradients will appear at higher order); also not surprising is the fact that the correction is linear in the actions, since a harmonic oscillator in a constant magnetic ﬁeld is altogether a linear problem.
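Because the problem with a constant field is linear, Eq. (3.7) can be checked against exact normal-mode frequencies. The sketch below does this for the special case of a uniform field along axis 3, so that only the motion in the 1-2 plane is affected, with $\kappa = 1$ absorbed into the field strength. The characteristic equation used for the exact frequencies comes from the equations of motion $\ddot{\mathbf{x}} = -\omega_i^2 x_i\,\hat{\mathbf{e}}_i + \dot{\mathbf{x}}\times\mathbf{B}$; the function names and the sample numbers are illustrative assumptions.

```python
import math

def exact_frequencies(w1, w2, b):
    """Exact frequencies in the 1-2 plane for a uniform field b along axis 3.

    They are the roots of (w1^2 - s)(w2^2 - s) - b^2 s = 0 with s = Omega^2,
    returned in increasing order.
    """
    total = w1 * w1 + w2 * w2 + b * b
    disc = math.sqrt(total * total - 4.0 * w1 * w1 * w2 * w2)
    return math.sqrt((total - disc) / 2.0), math.sqrt((total + disc) / 2.0)

def perturbative_frequency(w_i, w_j, b):
    """Frequency w_i + dK2/dI_i from Eq. (3.7), with B_12 = -B_21 = b."""
    return w_i + 0.5 * b * b * w_i / (w_i * w_i - w_j * w_j)

# Sample values: rationally independent frequencies and a weak field.
w1, w2, b = 1.0, math.sqrt(2.0), 0.01
low, high = exact_frequencies(w1, w2, b)
print(low - perturbative_frequency(w1, w2, b))
print(high - perturbative_frequency(w2, w1, b))
```

The two printed differences are of fourth order in the field strength, which is the size of the terms neglected in Eq. (3.7). The comparison assumes the field is weak enough that the ordering of the two frequencies is unchanged.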

3.2 Non-Abelian Poincaré Gauge

Let us now return to the problem of small vibrations in the molecule, where the gauge field $\mathbf{A}_\mu$ is non-Abelian, and ask whether something like Poincaré gauge could be useful there. An obvious guess is that a non-Abelian version of Poincaré gauge should satisfy $x^\mu \mathbf{A}_\mu(x) = 0$, where as above $x^\mu$ is the coordinate difference from the equilibrium $q_0^\mu$. Amazingly enough, this condition is once again equivalent to an integral formula connecting the gauge potential $\mathbf{A}_\mu$ and the Coriolis field tensor $\mathbf{B}_{\mu\nu}$, defined by

\[ \mathbf{B}_{\mu\nu} = \frac{\partial\mathbf{A}_\nu}{\partial q^\mu} - \frac{\partial\mathbf{A}_\mu}{\partial q^\nu} - \mathbf{A}_\mu\times\mathbf{A}_\nu . \tag{3.8} \]

Namely, the integral formula is

\[ \mathbf{A}_\mu(x) = \int_0^1 dt\; t\, x^\nu\, \mathbf{B}_{\nu\mu}(tx) . \tag{3.9} \]

The result is amazing because the relation between $\mathbf{A}_\mu$ and $\mathbf{B}_{\mu\nu}$ in Eq. (3.8) is nonlinear, but it is linear in Eq. (3.9). The equivalence of $x^\mu \mathbf{A}_\mu(x) = 0$ and Eq. (3.9) is known in particle physics (Halpern [1979]), but otherwise we do not know its history.

To prove these results, we use bold face symbols for 3-vectors (for example, $\mathbf{A}_\mu$ and $\mathbf{B}_{\mu\nu}$), and sans serif symbols for the corresponding $3\times 3$ antisymmetric matrices belonging to the Lie algebra of $SO(3)$ (for example, $\mathsf{A}_\mu$ and $\mathsf{B}_{\mu\nu}$), where, for example, $\mathsf{A}_{\mu ij} = \epsilon_{ijk} A_{\mu k}$. Now let $\mathsf{A}_\mu(x)$ represent an arbitrary choice of gauge (that is, body frame). First we show that it is always possible to transform to a new gauge where $x^\mu \mathsf{A}'_\mu(x) = 0$. A general gauge transformation is specified by a matrix field $\mathsf{S}(x) \in SO(3)$,

\[ \mathsf{A}'_\mu = \mathsf{S}\,\mathsf{A}_\mu\,\mathsf{S}^t + \frac{\partial\mathsf{S}}{\partial x^\mu}\,\mathsf{S}^t . \tag{3.10} \]

Thus, if we demand that $x^\mu \mathsf{A}'_\mu = 0$, we obtain a differential equation for $\mathsf{S}$,

\[ x^\mu \frac{\partial\mathsf{S}}{\partial x^\mu} = -x^\mu\, \mathsf{S}\,\mathsf{A}_\mu , \tag{3.11} \]

which can always be solved (locally) by integrating along radial lines in the $x^\mu$ coordinates (we choose initial conditions $\mathsf{S}(0) = \mathsf{I}$).

A geometrical interpretation of this construction is illustrated in Fig. 3.1. In the figure, $q_0$ is the equilibrium shape on shape space $SS$, $F_0$ is the fiber above it, diffeomorphic to $SO(3)$, and $Q_0$ is a specific configuration on


[Figure 3.1: diagram with the labels $F_0$, $F$, $S$, $Q_0$, $Q$, $SS$, $q_0$, $q$.]

Figure 3.1. Poincaré gauge can be described as the section which is the horizontal lift of radial lines emanating from the equilibrium shape $q_0$. The lines are radial in some coordinate system with origin at $q_0$. If Riemann normal coordinates are used, the section is flat and coincides with the section of the Eckart frame.

the fiber. Relative to some coordinates $x^\mu$ on shape space with origin at $q_0$, we draw radial lines emanating from $q_0$, that is, lines with coordinates $x^\mu(\lambda) = \lambda\xi^\mu$, where $\xi^\mu$ is a constant vector (a vector in the tangent space at $q_0$). The horizontal lifts of these lines, starting at $Q_0$, sweep out the section $S$ of the Poincaré gauge. The condition $x^\mu \mathbf{A}_\mu = 0$ is equivalent to the condition that the fiber $F$ above a point $q$ on a radial line is orthogonal to the lifted line in the section. This in turn is the condition for the horizontal lift. With this construction, the Poincaré gauge depends on the coordinates used to define the radial lines.

Now given the condition $x^\mu \mathbf{A}_\mu(x) = 0$, Eq. (3.9) follows. We prove this by writing

\[ \mathbf{A}_\mu(x) = \int_0^1 dt\, \frac{d}{dt}\left( t\,\mathbf{A}_\mu \right) = \int_0^1 dt \left( \mathbf{A}_\mu + t\, x^\nu \mathbf{A}_{\mu,\nu} \right) , \tag{3.12} \]

where all fields under the integral are evaluated at $t x^\mu$. Then by using Eq. (3.8) to eliminate $\mathbf{A}_{\mu,\nu}$ in favor of $\mathbf{B}_{\mu\nu}$, then integrating by parts and using $x^\mu \mathbf{A}_\mu = 0$, we easily prove Eq. (3.9).

We can now use the non-Abelian version of Poincaré gauge to expand the gauge potential $\mathbf{A}_\mu$ about the equilibrium $x^\mu = 0$, and to express the result in terms of the curvature tensor $\mathbf{B}_{\mu\nu}$. The result is like Eq. (3.4):

\[ \mathbf{A}_\mu(x) = \tfrac12\, \mathbf{B}_{\alpha\mu}\, x^\alpha + \tfrac13\, \mathbf{B}_{\alpha\mu,\beta}\, x^\alpha x^\beta + \dots , \tag{3.13} \]


where the fields on the right hand side are evaluated at equilibrium, $x^\mu = 0$. The gauge potential is expressed in terms of the Coriolis curvature form and its derivatives at equilibrium. Thus, the use of Poincaré gauge in the expansion of Eqs. (2.13) and (2.14) simplifies the result and expresses it purely in terms of the Coriolis curvature tensor.
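The step from Eq. (3.12) to Eq. (3.9) can be spelled out. The following is one way to fill in the algebra, a sketch that uses only Eq. (3.8) and the gauge condition. It differentiates the gauge condition rather than integrating by parts, so it is a variant of the route indicated in the text, not a transcription of it; the notation $\mathbf{A}_{\mu,\nu} = \partial\mathbf{A}_\mu/\partial x^\nu$ is assumed.

```latex
\begin{align*}
% Gauge condition y^\nu A_\nu(y) = 0, differentiated with respect to y^\mu
% and then evaluated at y = t x:
&\mathbf{A}_\mu(tx) + t\,x^\nu \mathbf{A}_{\nu,\mu}(tx) = 0 , \\
% Eq. (3.8) with indices (\nu,\mu), contracted with t x^\nu at the point t x:
&t\,x^\nu \mathbf{B}_{\nu\mu}(tx)
   = t\,x^\nu \mathbf{A}_{\mu,\nu}(tx)
   - t\,x^\nu \mathbf{A}_{\nu,\mu}(tx)
   - \bigl[t\,x^\nu \mathbf{A}_\nu(tx)\bigr]\times\mathbf{A}_\mu(tx) \\
&\phantom{t\,x^\nu \mathbf{B}_{\nu\mu}(tx)}
   = t\,x^\nu \mathbf{A}_{\mu,\nu}(tx) + \mathbf{A}_\mu(tx) , \\
% The right-hand side is the integrand of Eq. (3.12), hence
&\int_0^1 dt\; t\,x^\nu \mathbf{B}_{\nu\mu}(tx)
   = \int_0^1 dt\,\bigl[\mathbf{A}_\mu(tx) + t\,x^\nu \mathbf{A}_{\mu,\nu}(tx)\bigr]
   = \mathbf{A}_\mu(x) .
\end{align*}
```

The nonlinear term of Eq. (3.8) drops out because it is proportional to the gauge condition itself, which is why the final relation is linear.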

3.3 Riemann Normal Coordinates and the Eckart Conventions

However, the expansion of the Hamiltonian (2.14) still has quite a few terms in it, including those coming from an expansion of the metric tensor $g^{\mu\nu}$ about the equilibrium. The first derivatives of the metric tensor can be expressed in terms of the Christoffel symbols $\Gamma^\mu_{\alpha\beta}$ at equilibrium, the second derivatives in terms of the Riemann tensor $R^\mu{}_{\nu\alpha\beta}$, etc. The suggestion naturally arises that Riemann normal coordinates (Misner, Thorne, and Wheeler [1973]) would simplify the expansion, certainly at least by making $\Gamma^\mu_{\alpha\beta}$ and therefore the derivatives of the metric tensor vanish at equilibrium. Further advantages are noted below.

To construct Riemann normal coordinates, we choose a fixed basis in the tangent space to shape space at $q_0$, we let $\xi^\mu$ be the components of a tangent vector with respect to this basis, and we then construct the geodesics passing through $q_0$ with tangent vectors $\xi^\mu$ at parameter $\lambda = 0$. Then the point on a geodesic with parameter $\lambda$ and initial tangent vector $\xi^\mu$ is assigned the Riemann normal coordinates $x^\mu = \lambda\xi^\mu$. Thus, radial lines in Riemann normal coordinates are geodesics. Since $d^2x^\mu/d\lambda^2 = 0$, we have the identity $\Gamma^\mu_{\alpha\beta}\, x^\alpha x^\beta = 0$ in Riemann normal coordinates. This reminds us of the condition $x^\mu \mathbf{A}_\mu = 0$ for Poincaré gauge.

In general, a geodesic on shape space is a trajectory $q(t)$ of a system of free particles with zero angular momentum, as is fairly obvious by setting $V = 0$ and $\mathbf{L} = 0$ in Eq. (2.12). The corresponding trajectory up in the bundle is of course a straight line with constant velocity, also with $\mathbf{L} = 0$. The bundle trajectory is also the horizontal lift of the trajectory in shape space (since $\mathbf{L} = 0$ is the condition for the horizontal lift). Thus, if Riemann normal coordinates are used to define the radial lines which when lifted produce the section for Poincaré gauge, that section will consist of all straight lines passing through a point $Q_0$ (see Fig. 3.1) on the equilibrium fiber $F_0$ which are orthogonal to $F_0$ at $Q_0$. The section itself is then simply the flat subspace of $\mathbb{R}^{3N-3}$ of dimension $3N - 6$ which is orthogonal to the equilibrium fiber $F_0$ at $Q_0$ (in fact, it passes through the origin and so is a vector subspace). But as explained in Littlejohn and Reinsch [1997], this is precisely the geometrical description of the Eckart frame (or gauge). Therefore the Eckart frame is the same as Poincaré gauge, relative to Riemann normal coordinates on shape space.

Eckart's conventions include not only a frame, but also a set of shape


coordinates. These are constructed by ﬁrst choosing a set of Euclidean coordinates with origin at Q0 on the Eckart section, possible since it is a ﬂat subspace of the bundle, itself a Euclidean space, and then projecting those onto shape space. Straight lines passing through the origin in these coordinates on the section are geodesics of zero angular momentum, which project onto radial geodesics on shape space. Thus, Eckart coordinates are Riemann normal coordinates. Furthermore, since the section is orthogonal to the equilibrium ﬁber, the metric on shape space at q0 is the projection of the metric on the section at Q0 , which means that the Eckart coordinates are orthonormal at q0 . Often these coordinates are oriented so as to diagonalize V,µν at equilibrium, thereby transforming the unperturbed Hamiltonian into normal modes as in Eq. (2.18).
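The identity $\Gamma^\mu_{\alpha\beta}\, x^\alpha x^\beta = 0$ used in this section follows in one line from the geodesic equation. The sketch below records that step, assuming $\lambda$ is an affine parameter along the radial geodesic.

```latex
\begin{align*}
% Geodesic equation on shape space:
&\frac{d^2 x^\mu}{d\lambda^2}
   + \Gamma^\mu_{\alpha\beta}(x)\,
     \frac{dx^\alpha}{d\lambda}\frac{dx^\beta}{d\lambda} = 0 , \\
% Radial geodesics in Riemann normal coordinates, x^\mu = \lambda \xi^\mu,
% have vanishing second derivative, so
&\Gamma^\mu_{\alpha\beta}(\lambda\xi)\,\xi^\alpha \xi^\beta = 0
\quad\Longrightarrow\quad
\Gamma^\mu_{\alpha\beta}(x)\,x^\alpha x^\beta = 0 ,
\end{align*}
```

where the last form is obtained by multiplying by $\lambda^2$. It holds at every point $x$ of the coordinate patch, while the stronger statement $\Gamma^\mu_{\alpha\beta} = 0$ holds only at the origin.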

4 Expanding and Averaging the Hamiltonian

4.1 The Covariant Expansion of the Hamiltonian

We can now return to the expansion of the Hamiltonian (2.14), in which the potential $V$ is already expanded and we must in addition expand the metric tensor $g^{\mu\nu}$, the inverse moment of inertia tensor $\mathsf{M}^{-1}$, and the gauge potential $\mathbf{A}_\mu$. However, by Eq. (3.13), the latter is expressed in terms of the Coriolis curvature tensor, so we obtain the expansion of $\mathbf{A}_\mu$ once that for $\mathbf{B}_{\mu\nu}$ is known. These expansions by Taylor series produce coefficients which are the ordinary derivatives of these various tensor fields, evaluated at $q_0$. Unfortunately, ordinary derivatives do not by themselves lead to covariant expressions, so we are motivated to reexpress all ordinary derivatives in terms of covariant derivatives of various tensor fields, including as it turns out the Riemann tensor, evaluated at $q_0$. That this can be done at all is a special feature of Riemann normal coordinates and Poincaré gauge. The result, however, is a set of fully covariant expressions, valid in any coordinates or gauge.

We will omit the derivations of these expansions, and just quote the results. It is assumed that we are working in Riemann normal coordinates $x^\mu$ and Poincaré gauge. First, for the potential $V$, a scalar, it turns out that all ordinary derivatives are identical to covariant derivatives, when evaluated at $x^\mu = 0$. Thus, in this case, we can simply replace the comma by a semicolon, and the expansion of the potential is

\[ V(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\, V_{;\mu_1\dots\mu_n}(0)\; x^{\mu_1}\cdots x^{\mu_n} . \tag{4.1} \]

As for the metric tensor, its expansion in Riemann normal coordinates is discussed by Misner, Thorne, and Wheeler [1973] and has been carried out


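to high order by Yamashita [1984] with computer algebra. Through third order the result is

\[ g^{\mu\nu}(x) = g^{\mu\nu}(0) - \tfrac13\, R^{\mu}{}_{\alpha\beta}{}^{\nu}(0)\, x^\alpha x^\beta - \tfrac16\, R^{\mu}{}_{\alpha\beta}{}^{\nu}{}_{;\gamma}(0)\, x^\alpha x^\beta x^\gamma + \dots , \tag{4.2} \]

where as noted above the first order terms vanish.

As for the inverse moment of inertia tensor $\mathsf{M}^{-1}$ and the Coriolis curvature tensor $\mathbf{B}_{\mu\nu}$, these are hybrid tensors, linking the tangent and cotangent spaces on shape space with the tangent spaces to the fibers (effectively the Lie algebra of $SO(3)$). Thus, the covariant derivatives of these tensors involve the gauge potential $\mathbf{A}_\mu$ as well as the Christoffel symbols $\Gamma^\mu_{\alpha\beta}$, as discussed by Littlejohn and Reinsch [1997]. As it turns out, the expansion of the inverse moment of inertia tensor is particularly simple:

\[ \mathsf{M}^{-1}(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathsf{M}^{-1}_{;\mu_1\dots\mu_n}(0)\; x^{\mu_1}\cdots x^{\mu_n} , \tag{4.3} \]

very much like the expansion of $V$ above. Finally, the gauge potential $\mathbf{A}_\mu$ has the following expansion, valid through third order,

\[ \begin{aligned} \mathbf{A}_\mu(x) = {} & \tfrac12\, \mathbf{B}_{\alpha\mu}(0)\, x^\alpha + \tfrac13\, \mathbf{B}_{\alpha\mu;\beta}(0)\, x^\alpha x^\beta \\ & + \left[ \tfrac18\, \mathbf{B}_{\alpha\mu;\beta\gamma}(0) + \tfrac{1}{12}\, R^{\nu}{}_{\alpha\beta\mu}(0)\, \mathbf{B}_{\gamma\nu}(0) \right] x^\alpha x^\beta x^\gamma + \dots . \end{aligned} \tag{4.4} \]

4.2 The Perturbation Calculation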

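These expansions of the various fields allow us to write out the covariant expansion of the Hamiltonian, for which we write $H = \sum_n \kappa^n H_n$. In the following, it is understood that all fields, $V$, $\mathsf{M}^{-1}$, $g^{\mu\nu}$, $\mathbf{B}_{\mu\nu}$ and $R^\mu{}_{\nu\alpha\beta}$ and their derivatives, are evaluated at $x^\mu = 0$ (the equilibrium). The expansion through second order is given by

\[ H_0 = \tfrac12\, \mathbf{L}\cdot\mathsf{M}^{-1}\cdot\mathbf{L} + \tfrac12\, p_\mu p^\mu + \tfrac12\, V_{;\mu\nu}\, x^\mu x^\nu , \tag{4.5} \]

\[ H_1 = \tfrac12\, \mathbf{L}\cdot\mathsf{M}^{-1}_{;\mu}\cdot\mathbf{L}\; x^\mu + \tfrac12\, \mathbf{L}\cdot\mathbf{B}_{\mu\nu}\; p^\mu x^\nu + \tfrac16\, V_{;\mu\nu\alpha}\, x^\mu x^\nu x^\alpha , \tag{4.6} \]

\[ \begin{aligned} H_2 = {} & \tfrac14\, \mathbf{L}\cdot\mathsf{M}^{-1}_{;\mu\nu}\cdot\mathbf{L}\; x^\mu x^\nu + \tfrac13\, \mathbf{L}\cdot\mathbf{B}_{\mu\alpha;\beta}\; p^\mu x^\alpha x^\beta + \tfrac16\, R_{\mu\alpha\nu\beta}\, p^\mu p^\nu x^\alpha x^\beta \\ & + \tfrac18\, (\mathbf{L}\cdot\mathbf{B}_{\alpha\mu})(\mathbf{L}\cdot\mathbf{B}_{\beta}{}^{\mu})\, x^\alpha x^\beta + \tfrac{1}{24}\, V_{;\mu\nu\alpha\beta}\, x^\mu x^\nu x^\alpha x^\beta . \end{aligned} \tag{4.7} \]

Some further simplification of this expansion can be achieved by using the Kaluza–Klein identities for the $N$-body problem, which are given by Littlejohn and Reinsch [1997]. However, we have found that the results of the averaging transformation, Eq. (4.10), seem to be simpler if we leave the expanded Hamiltonian as it is. We remark that the Kaluza–Klein identities show that all the fields which occur in the expansion of the Hamiltonian (not counting $V$), plus all of their covariant derivatives, can be expressed in terms of just the three fields $\mathsf{M}$, $\mathsf{M}_{;\mu}$ and $\mathbf{B}_{\mu\nu}$. Indeed, such a reduction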


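is necessary to reconcile the covariant expansion we have developed with the standard, noncovariant expansion of the Watson Hamiltonian (see, for example, Papoušek and Aliev [1982]).

We now assume that the Riemann normal coordinates are chosen to be orthogonal (they are now Eckart coordinates) and to diagonalize $V_{;\mu\nu}$, as in Eq. (2.18). Thus, $g_{\mu\nu} = \delta_{\mu\nu}$ at equilibrium. For simplicity we also assume that the frequencies of small vibrations are nondegenerate, or more precisely, that $\{\omega_\mu\}$ are rationally independent and sufficiently far from resonance. Many common molecules are highly symmetric and have degenerate frequencies of vibration, so this condition does not hold for them. See Harter [1993] for an analysis of this case.

We next introduce action angle variables,

\[ x_\mu = \sqrt{\frac{2 I_\mu}{\omega_\mu}}\, \sin\theta_\mu , \qquad p_\mu = \sqrt{2 I_\mu \omega_\mu}\, \cos\theta_\mu , \tag{4.8} \]

and finally carry out the averaging transformation. It turns out that the slow Poisson bracket (the second term in Eq. (2.16)) does not contribute to the result at second order (it occurs in the expansion, but averages to zero). The results of the expansion to second order are the following. We write $K = \sum_n \kappa^n K_n$. We have

\[ K_0 = \tfrac12\, \mathbf{L}\cdot\mathsf{M}^{-1}\cdot\mathbf{L} + \sum_\mu I_\mu \omega_\mu , \tag{4.9} \]

and $K_1 = 0$. For the second order Hamiltonian, we find

\[ \begin{aligned} K_2 = {} & \tfrac14 \sum_{\mu\ne\nu} (\mathbf{L}\cdot\mathbf{B}_{\mu\nu})^2\, \frac{I_\mu\omega_\mu - I_\nu\omega_\nu}{\omega_\mu^2 - \omega_\nu^2} + \frac{1}{12} \sum_{\mu\nu} R_{\mu\nu\mu\nu}\, I_\mu I_\nu \left( \frac{\omega_\mu}{\omega_\nu} + \frac{\omega_\nu}{\omega_\mu} \right) \\ & + \tfrac14 \sum_\mu \frac{I_\mu}{\omega_\mu}\, \mathbf{L}\cdot\mathsf{M}^{-1}_{;\mu\mu}\cdot\mathbf{L} - \tfrac18 \sum_\mu \frac{1}{\omega_\mu^2} \left( \mathbf{L}\cdot\mathsf{M}^{-1}_{;\mu}\cdot\mathbf{L} + \sum_\nu V_{;\nu\nu\mu}\, \frac{I_\nu}{\omega_\nu} \right)^2 \\ & - \frac{1}{48} \sum_{\mu\nu\sigma} \frac{V_{;\mu\nu\sigma}^2}{\omega_\mu\omega_\nu\omega_\sigma} \left[ \frac{I_\nu I_\sigma + I_\mu I_\sigma + I_\mu I_\nu}{\omega_\mu + \omega_\nu + \omega_\sigma} + \frac{I_\nu I_\sigma + I_\mu I_\sigma - I_\mu I_\nu}{\omega_\mu + \omega_\nu - \omega_\sigma} \right. \\ & \qquad\qquad \left. + \frac{I_\nu I_\sigma - I_\mu I_\sigma + I_\mu I_\nu}{\omega_\mu - \omega_\nu + \omega_\sigma} + \frac{-I_\nu I_\sigma + I_\mu I_\sigma + I_\mu I_\nu}{-\omega_\mu + \omega_\nu + \omega_\sigma} \right] \\ & - \frac{1}{16} \sum_{\mu\nu} \frac{V_{;\mu\mu\nu}^2\, I_\mu^2}{\omega_\mu^2\, (4\omega_\mu^2 - \omega_\nu^2)} + \tfrac18 \sum_{\mu\nu} \frac{V_{;\mu\mu\nu}^2\, I_\mu^2}{\omega_\mu^2\, \omega_\nu^2} \\ & + \tfrac18 \sum_{\mu\nu} V_{;\mu\nu\mu\nu}\, \frac{I_\mu I_\nu}{\omega_\mu\omega_\nu} - \frac{1}{16} \sum_\mu V_{;\mu\mu\mu\mu}\, \frac{I_\mu^2}{\omega_\mu^2} . \end{aligned} \tag{4.10} \]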

Most of the work in deriving this result, and the most complicated parts of the result, come from averaging the cubic and quartic contributions to


the potential. These terms have nothing to do with the gauge fields or the curvature of the manifold (they would be present in a multidimensional generalization of Eq. (2.3) on a flat space), and so are relatively uninteresting from the standpoint of the geometry of the fiber bundle. But even if we are only interested in those terms which have a dependence on the angular momentum or the curvature tensors, we cannot ignore the potential, because there is a nonlinear beating between the cubic contributions to the potential and the centrifugal distortion, which is seen in the fourth major sum. Terms of this sort are also present in the diatomic result, Eq. (2.9), which in fact is a special case of Eq. (4.10).

The first sum in Eq. (4.10) is the contribution from the Coriolis forces, and is an obvious non-Abelian generalization of the result found in Eq. (3.7) for the oscillator in the magnetic field. The second sum represents the sampling of the curvature of shape space by the oscillator. The third sum obviously involves the second order effects of the centrifugal distortion of the moment of inertia tensor. Nonlinear beatings of the first order centrifugal distortion of the moment of inertia tensor with itself are seen in the fourth major sum, which produces a term quartic in the angular momentum.

The vibrational actions $I_\mu$ are formal invariants of this Hamiltonian, which effectively is reduced to one degree of freedom, that of the angular momentum $\mathbf{L}$. For fixed values of the actions $I_\mu$, the Hamiltonian is an even quartic in the angular momentum $\mathbf{L}$ (sixth and higher order terms would appear in higher order perturbation theory). Thus, the dynamics of $\mathbf{L}$ on the angular momentum sphere can display a much richer set of oscillations and separatrices than in the case of a rigid body (where $H$ is only a quadratic function of $\mathbf{L}$). This is an old and well studied theme in molecular physics.
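The simplest of the potential terms can be checked by direct averaging. For a single mode, the quartic part of $H_2$ is $\tfrac{1}{24} V_{;\mu\mu\mu\mu} x^4$, and the two quartic sums at the end of Eq. (4.10) combine to $(\tfrac18 - \tfrac{1}{16})\, V_{;\mu\mu\mu\mu} I^2/\omega^2$. The sketch below compares this with a numerical angle average using the substitution of Eq. (4.8); the function names and sample values are illustrative assumptions, and the cubic terms, which enter through the second order formula, are not covered.

```python
import math

def angle_average(f, samples=64):
    """Average f(theta) over one period on an equally spaced grid."""
    return sum(f(2.0 * math.pi * k / samples) for k in range(samples)) / samples

def averaged_quartic(v4, action, omega):
    """Angle average of (1/24) v4 x^4 with x = sqrt(2 I / omega) sin(theta)."""
    amplitude = math.sqrt(2.0 * action / omega)
    return angle_average(lambda th: v4 * (amplitude * math.sin(th)) ** 4 / 24.0)

def quartic_sums_one_mode(v4, action, omega):
    """Single-mode value of the two quartic sums: (1/8 - 1/16) v4 I^2 / omega^2."""
    return (1.0 / 8.0 - 1.0 / 16.0) * v4 * action ** 2 / omega ** 2

# Hypothetical values for the quartic derivative, the action and the frequency.
print(averaged_quartic(0.3, 0.7, 1.3), quartic_sums_one_mode(0.3, 0.7, 1.3))
```

An equally spaced grid integrates a trigonometric polynomial of this degree exactly, so the two numbers agree to rounding error.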
The results of a quantum calculation are quite close to the classical result given above, mainly because of the strong similarity between classical and quantum harmonic oscillator theory. Thus, a very good idea of the correct quantum result is obtained by replacing $I_\mu$ above by $(n_\mu + \tfrac12)\hbar$, where $n_\mu$ is the quantum number of the $\mu$-th normal mode. The result is then a quantum Hamiltonian in the angular momentum alone, which can be solved either by diagonalizing the $(2\ell + 1)\times(2\ell + 1)$ dimensional matrix for $H$, or by semiclassical (Bohr–Sommerfeld) methods on the angular momentum sphere.

The effect of the averaging transformation is to create a collection of tensors, defined on the tangent space to shape space at $q_0$, whose components are simple in the orthonormal frame of the normal modes of the potential. For example, in the third major sum in Eq. (4.10) there occurs a tensor with components $(I_\mu/\omega_\mu)\,\delta_{\mu\nu}$. The final result can be expressed in terms of generally covariant contractions with these tensors.
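The action-angle substitution of Eq. (4.8) and the semiclassical replacement of the actions can both be illustrated for a single mode. The sketch below checks that the substitution turns $\tfrac12(p^2 + \omega^2 x^2)$ into $\omega I$, as in the vibrational part of Eq. (4.9), and then evaluates that term at the quantized action. The function names are illustrative, and units with $\hbar = 1$ are an assumption.

```python
import math

def to_cartesian(action, angle, omega):
    """Map action-angle variables (I, theta) to (x, p) as in Eq. (4.8)."""
    x = math.sqrt(2.0 * action / omega) * math.sin(angle)
    p = math.sqrt(2.0 * action * omega) * math.cos(angle)
    return x, p

def oscillator_energy(x, p, omega):
    """Energy of one normal mode, (p^2 + omega^2 x^2) / 2."""
    return 0.5 * (p * p + omega * omega * x * x)

def quantized_action(n, hbar=1.0):
    """Semiclassical action (n + 1/2) hbar for quantum number n."""
    return (n + 0.5) * hbar

omega = 1.7
x, p = to_cartesian(0.4, 0.9, omega)
print(oscillator_energy(x, p, omega), omega * 0.4)   # both equal omega * I
print(omega * quantized_action(2))                   # vibrational energy for n = 2
```

Substituting `quantized_action(n)` for each $I_\mu$ in the averaged Hamiltonian leaves a function of the angular momentum alone, which is the starting point for the matrix or Bohr–Sommerfeld treatment described above.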

5 Conclusions

We will conclude with the following suggestion for a generalization of this work. Molecular clusters are currently an active area of research. Such clusters are often modelled as a collection of rigid bodies, interacting by some potential. In this model, the Hamiltonian for a cluster is a generalization of Eq. (2.12), in that the "particles" are no longer points, but rather have their own (fixed) moment of inertia tensors and rotational kinetic energy. We have recently worked out this Hamiltonian in general form (Mitchell and Littlejohn [1999]). Suppose we wish to study rovibrational coupling in such systems. How are the Eckart frame and coordinates to be generalized? There is no clue from Eckart's original (coordinate-based) definition. However, the analysis above shows that the coordinates should be Riemann normal coordinates, which are constructed out of radial geodesics (potential free motions of zero angular momentum), whose horizontal lifts define the section for a version of Poincaré gauge. In this case, the potential free motion no longer consists of straight lines (the free rotation of asymmetric tops is somewhat nontrivial). Nevertheless, the covariant Taylor series expansions developed above can be carried over almost without modification. We will report on these developments in the future.

References

Abraham, Ralph and Jerrold E. Marsden [1978], Foundations of Mechanics (Benjamin/Cummings, Reading, Massachusetts).
Biedenharn, L. C. and J. D. Louck [1981], Angular Momentum in Quantum Physics: Theory and Application, edited by Gian-Carlo Rota, Encyclopedia of Mathematics and its Applications, v. 8 (Addison-Wesley, Reading, Massachusetts).
Birkhoff, G. D. [1927], Dynamical Systems, v. IX (AMS Colloquium Publications, New York).
Bunker, Philip R. [1979], Molecular Symmetry and Spectroscopy (Academic, New York).
Cary, John R. [1981], Lie transforms and their use in Hamiltonian perturbation theory, Phys. Rep. 79, 31–95.
Casimir, Hendrik Brugt Gerhard [1931], Rotation of a Rigid Body in Quantum Mechanics (J. B. Wolters, Groningen, den Haag).
Cohen-Tannoudji, Claude [1989], Photons and Atoms: Introduction to Quantum Electrodynamics (Wiley, New York).
Dragt, A. J. and J. M. Finn [1976], Lie series and invariant functions for analytic symplectic maps, J. Math. Phys. 17, 2215–2227.
Eckart, Carl [1935], Some studies concerning rotating axes and polyatomic molecules, Phys. Rev. 47, 552–558.


Eckhardt, Bruno [1986], Birkhoff–Gustavson normal form in classical and quantum mechanics, J. Phys. A 19, 2961–2972.
Ezra, G. S. [1982], Symmetry Properties of Molecules (Springer-Verlag, New York).
Guichardet, A. [1984], On rotation and vibration motions of molecules, Ann. Inst. H. Poincaré 40, 329–342.
Halpern, M. B. [1979], Field strength and dual variable formulation of gauge theory, Phys. Rev. D 19, 517–530.
Harter, William G. [1993], Principles of Symmetry, Dynamics and Spectroscopy (Wiley, New York).
Iwai, Toshihiro [1986], A geometric setting for classical molecular dynamics, Ann. Inst. H. Poincaré 47, 199–219.
Jellinek, Julius and D. H. Li [1989], Separation of the energy of overall rotation in any n-body system, Phys. Rev. Lett. 62, 241–244.
Kozin, I. N., R. M. Roberts and J. Tennyson [2000], Relative equilibria of D2H+ and H2D+, Mol. Phys. 98, 295–307.
Kroto, H. W. [1975], Molecular Rotation Spectra (Wiley, New York).
Littlejohn, Robert G. [1979], A guiding center Hamiltonian: A new approach, J. Math. Phys. 20, 2445–2458.
Littlejohn, Robert G. and William G. Flynn [1991], Geometric phases and the asymptotic theory of coupled wave equations, Phys. Rev. A 44, 5239–5256.
Littlejohn, Robert G. and William G. Flynn [1992], Semiclassical theory of spin-orbit coupling, Phys. Rev. A 45, 7697–7717.
Littlejohn, Robert G. and Stefan Weigert [1993], Adiabatic motion of a neutral spinning particle in an inhomogeneous magnetic field, Phys. Rev. A 48, 924–940.
Littlejohn, Robert G. and Matthias Reinsch [1997], Gauge fields in the separation of rotations and internal motions in the n-body problem, Rev. Mod. Phys. 69, 213–275.
Louck, James D. and Harold W. Galbraith [1976], Eckart vectors, Eckart frames and polyatomic molecules, Rev. Mod. Phys. 48, 69–106.
Marsden, Jerrold E. and Tudor S. Ratiu [1994], Introduction to Mechanics and Symmetry (Springer-Verlag, New York).
Misner, Charles W., Kip S. Thorne and John Archibald Wheeler [1973], Gravitation (W. H. Freeman, San Francisco).
Mitchell, Kevin A. and Robert G. Littlejohn [1999], The rovibrational kinetic energy for complexes of rigid molecules, Mol. Phys. 96, 1305–1315.
Papoušek, D. and M. R. Aliev [1982], Molecular Vibrational-Rotational Spectra (Elsevier, Amsterdam).
Shapere, Alfred and Frank Wilczek, eds. [1989], Geometric Phases in Physics (World Scientific, Singapore).
Tachibana, Akitomo and Toshihiro Iwai [1986], Complete molecular Hamiltonian based on the Born–Oppenheimer adiabatic approximation, Phys. Rev. A 33, 2262–2269.


Watson, James K. G. [1968], Simplification of the molecular vibration-rotation Hamiltonian, Mol. Phys. 15, 479–490.
Weigert, Stefan and Robert G. Littlejohn [1993], The diagonalization of multicomponent wave equations with Born–Oppenheimer example, Phys. Rev. A 47, 3506–3512.
Wilson, E. Bright Jr., J. C. Decius and Paul C. Cross [1955], Molecular Vibrations: The Theory of Infrared and Raman Vibrational Spectra (McGraw–Hill, New York).
Yamashita, Y. [1984], Computer calculation of tensors in Riemann normal coordinates, General Relativity and Gravitation 16, 99–109.
Zare, Richard N. [1988], Angular Momentum (John Wiley & Sons, New York).

Part V

Geometric Control

14 Symmetries, Conservation Laws, and Control

Anthony M. Bloch
Naomi Ehrich Leonard

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT In this paper we describe various aspects of Jerry Marsden's work and influence on control theory and its connections with mechanics. In particular we trace the role of his key ideas on reduction and symmetries in the setting of nonlinear control.

Contents

1 Introduction: Control Theory and Analytical Mechanics
2 Nonlinear Control Systems Possessing Symmetries
3 Poisson Manifolds and Poisson Control Systems
4 Controlled Lagrangian Systems
5 Controlled Euler–Poincaré Systems
5.1 Euler–Poincaré Matching
5.2 Determination of the Control Law
5.3 Euler–Poincaré Stabilization
5.4 The Rigid Spacecraft with a Rotor
5.5 The Dynamics of an Underwater Vehicle
6 Conclusions
References

1 Introduction: Control Theory and Analytical Mechanics

In this paper we discuss aspects of Jerry Marsden's work in nonlinear control theory. In particular, we consider the evolution of his fundamental ideas on symmetry and reduction in the context of nonlinear control systems.


A. M. Bloch and N. E. Leonard

We discuss various developments in this arena, some coauthored by Jerry Marsden with the authors of this article, some with other authors. This represents only a small part of his contribution to the control field, however. Through his books, such as Abraham and Marsden [1978], through his lectures and papers, and through his direct interactions with researchers in the field, Marsden has had an immeasurable influence in introducing mechanics and symmetry into the field of nonlinear control theory. One thinks, for example, of the work of van der Schaft [1983] and Grizzle and Marcus [1985], to name two early examples, but there have been many, many examples since then. For further background on mechanics and control we refer the reader to Bloch and Crouch [1999].

Classical mechanics has been in many ways the primary motivation for the study of dynamical systems and their properties, as in the work of Newton, Kepler, Cartan, Hamilton, Dirac and Whittaker. Mechanics has been treated from a modern geometrical point of view in the texts of Abraham and Marsden [1978] and Marsden and Ratiu [1999], among others. Over the last twenty years, many of the ideas have been incorporated into control theory through the work of Brockett, Willems, and van der Schaft, for example. It would not be much of an exaggeration to say that none of this would have happened without the fundamental book Abraham and Marsden. Indeed, on a personal note, one of the authors (Bloch) remembers very well a conversation in Roger Brockett's office at Harvard in his first or second year of graduate study (1982/83) about Abraham and Marsden. He remembers saying what a wonderful wealth of information was in it and how he despaired of learning it all. In turn, Brockett said that it was a lifetime's work to really understand it. Indeed, as of now this work of a lifetime continues and a wonderful experience it has been.
Control theory's initial success was based on a solid understanding of linear control systems and applications of linear control theory to linearized models of real-world systems. Key areas were aerospace control problems and application of LQR (linear quadratic regulator) theory. In this work the special nature of mechanical systems was not very important, although it did have some impact in special instances. On the other hand, the connection of control to mechanics has been crucial in the development of nonlinear control in recent years. Perhaps the fundamental paper of Brockett [1976] can be considered the inspiration.

The development and application of nonlinear control theory for many years took the route whereby the philosophical treatment and successes of the linear control theory were applied to nonlinear control systems. In some cases, this produced significant results, such as in realization theory, where the existence and uniqueness of realizations has been proved conclusively for suitably “smooth” systems. Recent successes using this approach have also been achieved in regulator theory. However, in other arenas, such as controller synthesis, path planning, and state reconstructability, it has been much more difficult to produce parallels with the linear case. The basic reason is the vast array of nonlinear systems, which defy the approach of mimicking the linear theory.

Specializing systems to the broad class of “mechanical control systems” is one way to restrict the class of nonlinear control systems considered. This specialization introduces additional constraints and structure, which may enable otherwise intractable problems to be answered. For example, the fact that many mechanical systems exhibit Poisson stable vector fields may be employed profitably to prove controllability of the corresponding mechanical control systems. Without this, controllability is much more difficult to establish. In addition, the mechanical systems framework provides a natural setting for the introduction of symmetry and reduction and of geometric structures such as fiber bundles, which help enormously with the analysis. Of course, the most important reason for singling out the study of mechanical control systems is the fact that solutions to such problems can have a major influence on contemporary engineering problems.

The progress and synergy between mechanics and control theory continues unabated. The work of Marsden has had a profound impact, and this continues to be the case as Marsden himself has played an increasingly active role. The authors of this article are fortunate to have worked with Marsden in this area, and we describe below some of our joint work. As remarked above, considerable work in this area owes much to the influence of Marsden, for example the work of Lewis and Murray [1997], but we do not attempt to make a comprehensive list of such papers. In the remainder of this paper we describe a selection of results that show an evolution from tools for the analysis of mechanical control systems to a methodology for control system design. Our focus is on work inspired by or carried out with Jerry Marsden which directly uses symmetry and reduction in nonlinear control problems.
In §2 we describe how symmetries can be defined in the context of nonlinear control systems. The natural extension to Poisson control systems and reduction in controlled systems with symmetry is described in §3 following the work of Sanchez [1986]. In §4 we examine how these notions and analytical tools can be exploited for control synthesis. We describe our joint work with Marsden on the method of controlled Lagrangians in the context of underactuated control systems in which the directions corresponding to control inputs are (abelian) symmetry directions. This work was inspired by the work of Krishnaprasad [1985] on dual-spin spacecraft and the subsequent work of Bloch, Krishnaprasad, Marsden, and Sánchez de Alvarez [1992]. See also Wang and Krishnaprasad [1992]. We focus in §5 on control design for Lagrangian systems which also have nonabelian symmetry in the uncontrolled directions, and we examine the method of controlled Lagrangians for the resulting controlled Euler–Poincaré equations. This theory is illustrated with the design of control laws for stabilization of motions of a rigid spacecraft and a rigid underwater vehicle using internal rotors.

2 Nonlinear Control Systems Possessing Symmetries

As detailed elsewhere in this volume, Jerry Marsden's role in understanding symmetry in mechanics cannot be overestimated, from the paper Marsden and Weinstein [1974] onwards. The bundle picture in nonlinear control was introduced by Brockett [1976], and the role of symmetries was discussed in van der Schaft [1981] and Grizzle and Marcus [1985]. Brockett noted that local descriptions of nonlinear control system dynamics in the form

$$\dot{x} = f(x, u), \tag{2.1}$$

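where $f : M \times U \to TM$, were not adequate descriptions of systems where the inputs depend on the states, and even on the time histories of the states. He defined a nonlinear control system as follows (see Grizzle and Marcus [1985]):

2.1 Definition. A smooth nonlinear control system is defined to be a triple $(B, M, f)$ such that (i) $(B, M, \pi)$ is a fiber bundle with total space $B$, base space $M$, and canonical projection $\pi : B \to M$, and (ii) $f : B \to TM$ is a bundle morphism such that for each $x \in M$ and $u \in U_x = \pi^{-1}(x)$, $f(x, u) \in T_x M$.

One can naturally introduce symmetries into this picture as follows. Let $G$ be a Lie group and let $\Theta : G \times B \to B$ and $\Phi : G \times M \to M$ denote left actions of $G$ on $B$ and $M$ respectively. For fixed $g \in G$ denote by $\Phi_g : M \to M$ the map $x \mapsto \Phi(g, x)$, $x \in M$, and similarly for $\Theta$. Let $\Sigma(B, M, f)$ denote a nonlinear control system defined as above.

2.2 Definition. We say $\Sigma$ has the symmetry $(G, \Theta, \Phi)$ if the diagram in Fig. 2.1 commutes for all $g \in G$.

$$
\begin{array}{ccc}
B & \xrightarrow{\;\Theta_g\;} & B \\
{\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f} \\
TM & \xrightarrow{\;T\Phi_g\;} & TM \\
{\scriptstyle \pi_M}\downarrow & & \downarrow{\scriptstyle \pi_M} \\
M & \xrightarrow{\;\Phi_g\;} & M
\end{array}
$$

Here $\pi_M : TM \to M$ is the tangent bundle projection, and the bundle projection $\pi = \pi_M \circ f : B \to M$ labels the composite vertical arrows on each side.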
Figure 2.1. Commutative condition for control system symmetry.

The special case of “state-space” symmetry can be deﬁned:

2.3 Definition. Suppose $B = M \times U$ is a trivial bundle, for $U$ some manifold. $(G, \Phi)$ is a state-space symmetry of $\Sigma(B, M, f)$ if $(G, \Theta, \Phi)$ is a symmetry of $\Sigma$ for $\Theta_g = (\Phi_g, \mathrm{Id}_U) : (x, u) \mapsto (\Phi_g(x), u)$.

One can also introduce the notion of infinitesimal symmetry. Let $\xi \in \mathfrak{g}$, where $\mathfrak{g}$ is the Lie algebra of $G$, and let $\xi_M$ denote the infinitesimal generator of the action corresponding to $\Phi$. (Recall that this may be thought of intuitively as infinitesimal group motions of the system. Thus, for each $\xi \in \mathfrak{g}$, $\xi_M$ is a vector field on the manifold $M$ and its value at a point $x \in M$ is denoted $\xi_M(x)$.) Let $\Phi_t$ denote the flow of the vector field $\xi_M$. Similarly, let $\xi_B$ denote the infinitesimal generator of the action corresponding to $\Theta$ and $\Theta_t$ the flow of the vector field $\xi_B$.

2.4 Definition. Let $\Sigma$ be a nonlinear control system as above. We say $(G, \Theta, \Phi)$ is an infinitesimal symmetry of $\Sigma$ if for each $x_0 \in M$ there exists an open neighborhood $V$ of $x_0$ and an $\epsilon > 0$ such that

$$T\Phi_t \, f(b) = f(\Theta_t(b)) \tag{2.2}$$

for all $b \in \pi^{-1}(V)$, $|t| < \epsilon$, and $\xi \in \mathfrak{g}$ with $\|\xi\| < 1$, for $\|\cdot\|$ an arbitrary fixed norm on $\mathfrak{g}$.

Now assume $B$ is endowed with an integrable (Ehresmann) connection.

2.5 Definition. $(G, \Theta, \Phi)$ is said to be an infinitesimal state-space symmetry if it is an infinitesimal symmetry and the infinitesimal generators of $\Theta$ are horizontal, i.e., $\xi_B$ is the horizontal lift of $\xi_M$. In this case, since $\Theta$ is determined by $\Phi$, we will omit mention of $\Theta$.

Now assume $M$ has dimension $n$, $G$ has dimension $k$, and the action of $G$ is free. Then one can prove various “reduction” theorems in analogy with those for free (uncontrolled) systems. We refer the reader to Grizzle and Marcus [1985] for most of these but quote here just a simple example:

2.6 Theorem. Suppose $\Sigma(B, M, f)$ has infinitesimal state-space symmetry $(G, \Phi)$. Suppose $G$ is abelian. Then about any point $m \in M$ there exist connection-respecting coordinates $(x_1, \dots, x_n, u)$ for $B$ such that in these coordinates $\Sigma$ is given by

$$\dot{x} = f(x_1, \dots, x_{n-k}, u). \tag{2.3}$$

It is also possible to write down a nonabelian and “global” version of this theorem. This section indicates how symmetries are useful in understanding nonlinear control systems in the absence of mechanical structures. However, as subsequent sections indicate, the combination of symmetries and mechanical structures enables one to say much more.
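As a small illustration of Theorem 2.6, the sketch below checks a state-space symmetry numerically. The example system (a planar kinematic vehicle), the function names, and the numerical values are assumptions chosen for illustration; they are not taken from Grizzle and Marcus [1985]. The abelian group $G = \mathbb{R}^2$ acts by translating the position, so $n = 3$, $k = 2$, and the vector field depends only on the one remaining coordinate $\theta$ and on $u$, as in (2.3).

```python
import numpy as np

def f(x, u):
    """Hypothetical kinematic vehicle: x = (theta, px, py), u = (speed, turn rate)."""
    theta, _, _ = x
    v, omega = u
    return np.array([omega, v * np.cos(theta), v * np.sin(theta)])

def phi(g, x):
    """Action of G = R^2 on the state space: translate (px, py) by g, leave theta fixed."""
    return x + np.array([0.0, g[0], g[1]])

def symmetry_defect(g, x, u):
    """f(Phi_g(x), u) - T Phi_g f(x, u); for translations T Phi_g is the identity."""
    return f(phi(g, x), u) - f(x, u)

x = np.array([0.3, 1.0, -2.0])   # a state (theta, px, py)
u = np.array([1.5, 0.2])         # an input
g = np.array([4.0, -7.0])        # a group element (translation)

# The defect vanishes because f does not depend on the symmetry coordinates (px, py).
print(np.allclose(symmetry_defect(g, x, u), 0.0))
```

In this toy case the coordinates are already adapted to the symmetry; the content of the theorem is that such coordinates exist locally whenever the abelian infinitesimal state-space symmetry is present.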

3 Poisson Manifolds and Poisson Control Systems

In this section we discuss the general theory of Hamiltonian and Poisson control systems and show how the analysis above extends to this setting. We begin with the following setup (see van der Schaft [1983] and references therein):

3.1 Definition. Let $P$ be a smooth manifold of dimension $n$ and $E$, $W$ vector spaces of dimensions $m$, $p$ respectively. A nonlinear control system with external variables is a 4-tuple $\Sigma(E, P, W, f)$ where $\pi : E \to P$ is a smooth fiber bundle and $f = (g, h) : E \to TP \times W$ is a fiber-preserving smooth map, $g : E \to TP$, $h : E \to W$.

Choosing local coordinates $x$ for $P$, fiber-respecting coordinates $(x, u)$ for $E$, and coordinates $w$ for $W$, locally this definition gives

$$\dot{x} = g(x, u), \qquad w = h(x, u). \tag{3.1}$$

We are interested here in mechanical (Lagrangian or Hamiltonian) control systems. The extension of the notion of Hamiltonian and Lagrangian systems to the setting of control was formally proposed in Brockett [1976] and was generalized and formalized by Willems [1979], van der Schaft [1983, 1986], and others. The book Nijmeijer and van der Schaft [1990] gives a nice summary of many of the main ideas.

We begin with the Lagrangian side. The simplest form of Lagrangian control system is a Lagrangian system with external forces: in local coordinates we have

$$
\begin{aligned}
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} &= u_i, \qquad i = 1, \dots, m, \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} &= 0, \qquad i = m + 1, \dots, n.
\end{aligned} \tag{3.2}
$$

More generally, we have the system

$$\frac{d}{dt}\frac{\partial L(q, \dot{q}, u)}{\partial \dot{q}_i} - \frac{\partial L(q, \dot{q}, u)}{\partial q_i} = 0 \tag{3.3}$$

for $q \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$. Similarly, one can define a Hamiltonian control system. Locally one has

$$\dot{q}_i = \frac{\partial H(q, p, u)}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H(q, p, u)}{\partial q_i}, \tag{3.4}$$

for $q, p \in \mathbb{R}^n$ and $u \in \mathbb{R}^m$.
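To make (3.4) concrete, here is a minimal sketch for one degree of freedom. The control Hamiltonian $H(q, p, u) = \tfrac12 p^2 - \cos q - q\,u$ (a unit pendulum with applied torque $u$) and the function names are assumptions chosen for illustration. The derivatives of $H$ are approximated by central differences, so the code applies to any smooth $H(q, p, u)$ supplied in its place.

```python
import math

def H(q, p, u):
    """Hypothetical control Hamiltonian: unit pendulum with torque u."""
    return 0.5 * p**2 - math.cos(q) - q * u

def hamiltonian_control_field(q, p, u, eps=1e-6):
    """Right-hand side of (3.4), using central differences for the partials of H."""
    dH_dp = (H(q, p + eps, u) - H(q, p - eps, u)) / (2 * eps)
    dH_dq = (H(q + eps, p, u) - H(q - eps, p, u)) / (2 * eps)
    return dH_dp, -dH_dq   # (q_dot, p_dot)

q_dot, p_dot = hamiltonian_control_field(0.4, 1.2, 0.7)
# For this H, analytically q_dot = p and p_dot = -sin(q) + u.
print(q_dot, p_dot)
```

For this particular $H$ the control enters as an external force, so the system is also of the form (3.2) on the Lagrangian side.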

This easily generalizes to a Hamiltonian control system on a symplectic or Poisson manifold. Let $M$ be a Poisson manifold and $H_0, H_1, \dots, H_m$ be smooth functions on $M$. Then an affine Hamiltonian control system on $M$ is given by

$$\dot{x} = X_{H_0}(x) + \sum_{j=1}^{m} X_{H_j}(x)\, u_j, \tag{3.5}$$

where $x \in M$ and as usual $X_{H_j}$ is the Hamiltonian vector field corresponding to $H_j$.

In Sanchez [1986] a generalization of Noether's theorem on symmetries and conservation laws to the control setting is given. This extends the local result of van der Schaft [1981]. First, we need the definition of a Poisson control system with external space (see Sanchez [1986]).

3.2 Definition. A Poisson control system with external space is a nonlinear control system $\Sigma(E, P, W, f)$ where $P$ and $W$ are Poisson manifolds such that $f(E)$ is an embedded Lagrangian manifold of the Poisson manifold $TP \times W$.

Omitting the output space $W$, we have the following:

3.3 Definition. A Poisson control system is a nonlinear control system such that the graph of $f$, $\Gamma_f$, is a Lagrangian submanifold of $TP$. The input-state evolution equations take the form

$$\dot{x} = \{x, H^u\},$$

where $\{\cdot\,, \cdot\}$ is the Poisson bracket on $P$ and $H^u$ is the control Hamiltonian.

We now discuss a generalization of Noether's theorem in this controlled setting.

3.4 Definition. Let $\Sigma(E, P, W, f)$ be a Poisson control system and let $\theta, \phi, \psi$ be the actions of a Lie group $G$ on $E, P, W$ respectively. These actions are called Poisson actions for $\Sigma$ if the functions $g$ and $h$ are equivariant with respect to these actions.

Denote by $J_P : P \to \mathfrak{g}^*$ the momentum map corresponding to the action of $\phi$ on $P$ and as usual let $J_P(\xi)(x) = \langle J_P(x), \xi \rangle$. Let $\xi_P$ denote the infinitesimal generator of the action. Define $J_W : W \to \mathfrak{g}^*$ to be the momentum map corresponding to the action of $\psi$ on $W$.

3.5 Proposition. The tangential lift of the momentum map for the lift action $T\phi_g$ is $\dot{J}_P : TP \to \mathfrak{g}^*$, and the corresponding infinitesimal generator for this action is $\dot{\xi}_P$, the complete lift of $\xi_P$ to $TP$.

One can then prove the following:

3.6 Theorem. Let $\Sigma(E, P, W, f)$ be a Poisson control system with external space and $\theta, \psi, \phi$ the Poisson actions of $G$ on $\Sigma$ as above. Then the diagram below commutes.

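$$
\begin{array}{ccc}
E & \xrightarrow{\;g\;} & TP \\
{\scriptstyle h}\downarrow & & \downarrow{\scriptstyle \dot{J}_P} \\
W & \xrightarrow{\;J_W\;} & \mathfrak{g}^*
\end{array}
$$

That is, $\dot{J}_P \circ g = J_W \circ h$.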

The theorem implies that the state-space momentum map $J_P(\xi)$ is constant along the orbits of Hamiltonian vector fields for those Poisson control systems for which $J_W(\xi) = 0$. This formalizes the idea of a lossless passive system with storage function $J_P$ and supply rate $J_W$ introduced in Willems [1972] and van der Schaft [1983].

Reduction can now be analyzed as follows. One considers a fiber bundle $(E, P, \pi)$ and a Lie group $G$ acting freely and properly on $E$ so that $E/G$ is a manifold. Assume further that the action $\theta$ of $G$ on $E$ leaves the fibers invariant. The submersion $\tau : E \to E/G$ is a morphism which takes fibers of $E$ onto fibers of $E/G$. Then one has (Sanchez [1986]):

3.7 Theorem. Let $\Sigma(E, P, f)$ be a Poisson control system and $G$ a Lie group acting freely and properly on $E$ and $P$ by Poisson maps and such that $G$ leaves the fibers of $E$ invariant. Then the control system $\tilde{\Sigma}(E/G, P/G, \tilde{f})$ is a Poisson control system, where $f = \tilde{f} \circ \tau$.

There is also a natural version of this theorem for systems with outputs.

Example 1. A good example of Theorem 3.7 is a rigid spacecraft with a single rotor. In this setting the state space is $P = T^*SO(3)$ and the control space is $T^*S^1$ (corresponding to the rotor). The reduced state space under the action of $G = SO(3)$ is

$$\tilde{P} = P/G = T^*SO(3)/SO(3) \equiv \mathfrak{so}(3)^*$$

and the reduced total space is

$$\tilde{E} = E/(G \times S^1) = \mathfrak{so}(3)^* \times \mathbb{R}.$$

Example 2. Another good example is a rigid underwater vehicle with two internal rotors. In this case the state space is $P = T^*SE(3)$ and the control space is $T^*(S^1 \times S^1)$ (corresponding to the two rotors). The reduced state space under the action of $G = SE(3)$ is

$$\tilde{P} = P/G = T^*SE(3)/SE(3) \equiv \mathfrak{se}(3)^*$$

and the reduced total space is

$$\tilde{E} = E/(G \times S^1 \times S^1) = \mathfrak{se}(3)^* \times \mathbb{R}^2.$$

In the case of a bottom-heavy underwater vehicle, gravity breaks part of the rotational symmetry and the remaining symmetry group is $G = SE(2) \times \mathbb{R}$. The reduced state space under the action of $G = SE(2) \times \mathbb{R}$ is

$$\tilde{P} = P/G = T^*SE(3)/(SE(2) \times \mathbb{R}) \equiv \mathfrak{s}^*$$

and the reduced total space is

$$\tilde{E} = E/(G \times S^1 \times S^1) = \mathfrak{s}^* \times \mathbb{R}^2.$$

The space $\mathfrak{s}^*$ is the dual of the Lie algebra of the double semidirect product $S = SE(3) \ltimes \mathbb{R}^3 = SO(3) \ltimes (\mathbb{R}^3 \times \mathbb{R}^3)$, with $SO(3)$ acting identically on each of the two copies of $\mathbb{R}^3$ (see Leonard [1997a] for details).
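The following sketch illustrates an affine Hamiltonian control system of the form (3.5) on the reduced space $\mathfrak{so}(3)^* \cong \mathbb{R}^3$ that appears in Example 1. It is a simplified stand-in, not the spacecraft-rotor model itself: the inertia values, the choice $H_1(\Pi) = \Pi_3$, and the sign convention $X_H(\Pi) = \Pi \times \nabla H(\Pi)$ for the Lie–Poisson structure are all assumptions made for illustration. The point is that the Casimir $|\Pi|^2$ has zero rate of change for every value of the control, because each Hamiltonian vector field is tangent to the spheres $|\Pi| = \text{const}$.

```python
import numpy as np

INERTIA = np.array([1.0, 2.0, 3.0])   # hypothetical principal moments of inertia

def grad_H0(Pi):
    """Gradient of the drift Hamiltonian H0 = 1/2 * sum(Pi_i^2 / I_i)."""
    return Pi / INERTIA

def grad_H1(Pi):
    """Gradient of the (assumed) control Hamiltonian H1 = Pi_3."""
    return np.array([0.0, 0.0, 1.0])

def X(grad_H, Pi):
    """Lie-Poisson Hamiltonian vector field on so(3)* = R^3 (assumed sign convention)."""
    return np.cross(Pi, grad_H(Pi))

def affine_system(Pi, u):
    """Right-hand side of (3.5) with a single control: X_H0 + u * X_H1."""
    return X(grad_H0, Pi) + u * X(grad_H1, Pi)

Pi = np.array([0.5, -1.0, 2.0])
u = 0.8
Pi_dot = affine_system(Pi, u)
casimir_rate = 2.0 * Pi @ Pi_dot      # d/dt |Pi|^2 along the controlled flow
print(abs(casimir_rate) < 1e-12)
```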

4 Controlled Lagrangian Systems

In this section we turn to control synthesis and describe in broad terms the mathematics, intuition, and calculational procedure for the method of controlled Lagrangians. This provides a general setting for various papers, and has been discussed in a series of papers by Bloch, Leonard and Marsden, and more recently with Chang, Woolsey and Zenkov. Many of these papers are listed in the references. This theory generalizes the theory of control systems with symmetries discussed above to the closed-loop (feedback) setting. The guiding principle behind this methodology is the development of a class of feedback control laws which provide closed-loop dynamics that remain in Lagrangian form (and hence conservative form) and which achieve stabilization. Controls that are dissipative in nature are appended to turn the conservative stabilization into asymptotic stabilization. The advantage of requiring that the closed-loop dynamics be Lagrangian is that stabilization can be understood in terms of energy. In particular, energy methods can be used which provide a Lyapunov function and thereby give information on how to choose the control gains to achieve closed-loop stability. For the case when no dissipative control terms are added, there is a modiﬁcation of the mechanical energy of the system that is exactly conserved by the closed-loop dynamics; it can be interpreted as a combined energy available to the mechanism and the control forces. Accordingly, for ﬁxed control gains which achieve stabilization, one can conclude that the control inputs will remain bounded. The controlled Lagrangian approach begins with a mechanical system with symmetry and dynamics described by an (uncontrolled) Lagrangian equal to kinetic energy minus potential energy. The kinetic energy (given by a metric tensor) is then modiﬁed to produce a new controlled Lagrangian which describes the dynamics of the controlled, closed-loop system. 
The kinetic energy modification preserves the original system symmetry. One can also modify the potential energy of the system and break symmetry; this is useful in achieving complete state-space stabilization. In this section, we summarize the general approach to kinetic energy modification as presented in Bloch, Leonard, and Marsden [2000]. Modification of potential energy and symmetry breaking as a complement to kinetic energy shaping is described in Bloch, Chang, Leonard, and Marsden [2001].

The Setting. Let $Q$ be the configuration space for the system of interest and suppose that a Lie group $G$ acts freely and properly on $Q$. For many practical examples $Q = S \times G$, with $G$ acting only on the second factor by left group multiplication. An inverted planar pendulum on a cart is such an example. The configuration space is $Q = S^1 \times \mathbb{R}$ with $G = \mathbb{R}$, the group of reals under addition (corresponding to translations of the cart). Another such system is the rigid spacecraft with a rotor, which has configuration space $Q = SO(3) \times S^1$. Here the group $G = S^1$ corresponds to rotations of the rotor. Similarly, the rigid underwater vehicle with two rotors has configuration space $Q = SE(3) \times (S^1 \times S^1)$ and group $G = S^1 \times S^1$.

A central objective of the methodology is to control the variables lying in the shape space $Q/G$ (in the case in which $Q = S \times G$, $Q/G = S$) using controls which act directly on the variables lying in $G$. The Lagrangian is assumed invariant under the action of $G$ on $Q$, where the action is on the factor $G$ alone. In many specific examples, such as those given above, the invariance is equivalent to the Lagrangian being cyclic in the $G$-variables. Accordingly, this produces a conservation law for the free system. The construction (before appending dissipation or introducing symmetry-breaking potentials) preserves the invariance of the Lagrangian, thus providing a controlled conservation law.

The kinetic energy modification focuses on modification of the metric tensor $g(\cdot\,, \cdot)$ that defines the kinetic energy of the system $\tfrac12 g(\dot{q}, \dot{q})$. The modification of this tensor relies on a special decomposition of the tangent spaces to the configuration manifold and a subsequent “controlled” modification of this split.
Specifically, the tangent space to $Q$ can be split into a sum of horizontal and vertical parts defined as follows: for each tangent vector $v_q$ to $Q$ at a point $q \in Q$, there is a unique decomposition

$$v_q = \mathrm{Hor}\, v_q + \mathrm{Ver}\, v_q, \tag{4.1}$$

such that the vertical part is tangent to the orbits of the $G$-action and the horizontal part is the metric orthogonal to the vertical space. Equivalently,

$$g(v_q, w_q) = g(\mathrm{Hor}\, v_q, \mathrm{Hor}\, w_q) + g(\mathrm{Ver}\, v_q, \mathrm{Ver}\, w_q), \tag{4.2}$$

where $v_q$ and $w_q$ are arbitrary tangent vectors to $Q$ at the point $q \in Q$. This choice of horizontal space coincides with that given by the mechanical connection; see, for example, Marsden [1992].

The Controlled Lagrangian. The controlled Lagrangian consists of a new kinetic energy defined by a modification of the metric tensor given in (4.2) minus the original potential energy. There are three steps to modification of the metric tensor:

1. a new horizontal space denoted $\mathrm{Hor}_\tau$ is chosen;
2. there is a change $g \to g_\sigma$ of the metric acting on horizontal vectors; and
3. there is a change $g \to g_\rho$ of the metric acting on vertical vectors.

4.1 Definition. Let $\tau$ be a Lie-algebra-valued horizontal one form on $Q$; that is, a one form with values in the Lie algebra $\mathfrak{g}$ of $G$ that annihilates vertical vectors. This means that for all vertical vectors $v$, the infinitesimal generator $[\tau(v)]_Q$ corresponding to $\tau(v) \in \mathfrak{g}$ is the zero vector field on $Q$. The $\tau$-horizontal space at $q \in Q$ consists of tangent vectors to $Q$ at $q$ of the form

$$\mathrm{Hor}_\tau v_q = \mathrm{Hor}\, v_q - [\tau(v)]_Q(q),$$

which also defines $v_q \mapsto \mathrm{Hor}_\tau(v_q)$, the $\tau$-horizontal projection. The $\tau$-vertical projection operator is defined by

$$\mathrm{Ver}_\tau(v_q) := \mathrm{Ver}(v_q) + [\tau(v)]_Q(q).$$

This new horizontal subspace can be regarded as defining a new connection, the $\tau$-connection. The horizontal space itself, which by abuse of notation is also written as just $\mathrm{Hor}$ or $\mathrm{Hor}_\tau$, of course depends on $\tau$ also, but the vertical space does not; it is the tangent to the group orbit. On the other hand, the projection map $v_q \mapsto \mathrm{Ver}_\tau(v_q)$ does depend on $\tau$.

4.2 Definition. Given $g_\sigma$, $g_\rho$ and $\tau$, the controlled Lagrangian is defined to be the following Lagrangian, which has the form of a modified kinetic energy minus the original potential energy:

$$L_{\tau,\sigma,\rho}(v) = \tfrac12 \big[ g_\sigma(\mathrm{Hor}_\tau v_q, \mathrm{Hor}_\tau v_q) + g_\rho(\mathrm{Ver}_\tau v_q, \mathrm{Ver}_\tau v_q) \big] - V(q), \tag{4.3}$$

where $V$ is the potential energy.

Note that the controlled Lagrangian is a modification of the Kaluza–Klein Lagrangian for a particle in a magnetic field (see, for example, Marsden and Ratiu [1999]).

The General Strategy. The definition of controlled Lagrangian given above (4.3) prescribes a parameterized family of candidate Lagrangians that should describe the closed-loop dynamics.
The remainder of the approach then is to choose, if possible, one of the family members that is consistent with the available control authority and that provides stabilization as desired. The issue of ensuring consistency with available control authority is referred to as “matching”, since the problem is to match the dynamics of the controlled system with the dynamics of the uncontrolled system in directions that are not directly controlled. That is, if the dynamics deriving from the controlled Lagrangian are compared to the dynamics deriving from the uncontrolled Lagrangian, there can be new terms in the directly controlled directions (these are interpreted as control inputs), but there can be no new terms in the directions with no control input.
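The horizontal and vertical decomposition (4.1)–(4.2) and the $\tau$-shifted projections of Definition 4.1 can be sketched for the cart-pendulum configuration space $Q = S^1 \times \mathbb{R}$ with coordinates $(\phi, s)$. The metric entries, the inertia parameters, and the particular one form $\tau = \kappa\, d\phi$ below are hypothetical choices made for illustration, not values from Bloch, Leonard, and Marsden [2000]. The vertical direction is the generator of cart translations, and the horizontal part is obtained by metric-orthogonal projection.

```python
import numpy as np

ALPHA, BETA, GAMMA = 1.0, 0.5, 2.0     # hypothetical inertia parameters
E_S = np.array([0.0, 1.0])             # generator of the G = R action (cart translation)

def metric(phi):
    """Assumed kinetic energy metric on Q = S^1 x R in coordinates (phi, s)."""
    c = BETA * np.cos(phi)
    return np.array([[ALPHA, c], [c, GAMMA]])

def ver(phi, v):
    """Vertical part: metric-orthogonal projection of v onto the group orbit direction."""
    g = metric(phi)
    return (E_S @ g @ v) / (E_S @ g @ E_S) * E_S

def hor(phi, v):
    """Horizontal part, so that v = hor(v) + ver(v) as in (4.1)."""
    return v - ver(phi, v)

def tau(phi, v, kappa=0.3):
    """Hypothetical horizontal one form tau = kappa * dphi; it vanishes on E_S."""
    return kappa * v[0]

def hor_tau(phi, v):
    return hor(phi, v) - tau(phi, v) * E_S   # Hor_tau v = Hor v - [tau(v)]_Q

def ver_tau(phi, v):
    return ver(phi, v) + tau(phi, v) * E_S   # Ver_tau v = Ver v + [tau(v)]_Q

phi = 0.7
v, w = np.array([1.0, -2.0]), np.array([0.5, 3.0])
g_mat = metric(phi)
lhs = v @ g_mat @ w
rhs = hor(phi, v) @ g_mat @ hor(phi, w) + ver(phi, v) @ g_mat @ ver(phi, w)
print(np.isclose(lhs, rhs))                                   # identity (4.2)
print(np.allclose(hor_tau(phi, v) + ver_tau(phi, v), v))      # the tau-split still sums to v
```

The $\tau$-projections change which part of a vector is called horizontal, but the vertical subspace itself (multiples of `E_S`) is unchanged, in line with the remark following Definition 4.1.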

Thus, the next step in the approach is to choose $\tau$, $g_\sigma$ and $g_\rho$ so that matching is guaranteed. The matching theorems that have been proven, e.g., the first matching theorem in Bloch, Leonard, and Marsden [2000], provide sufficient conditions for matching and a procedure for choosing $\tau$, $g_\sigma$ and $g_\rho$. Note that $\tau$, $g_\sigma$ and $g_\rho$ should not be completely specified at this stage; otherwise the approach is not so useful, because the remaining flexibility in the choice of these (control) parameters is used to find a controlled Lagrangian that yields closed-loop stability.

Once matching has been achieved, but before stability is investigated, the control law can be determined by comparison of the uncontrolled and controlled dynamics. Because the control terms derive from a modification of the kinetic energy, they will necessarily be functions of accelerations. However, since the closed-loop dynamics are known as the Euler–Lagrange equations associated with the controlled Lagrangian, the accelerations can be eliminated, and the control becomes a feedback law that is a function of configuration and possibly velocity terms. The theory provides the construction of this control law.

The final step is to use the remaining freedom in the choice of $\tau$, $g_\sigma$ and $g_\rho$ to prove closed-loop stability. Because the closed-loop dynamics derive from a Lagrangian (the controlled Lagrangian), the stability analysis can be performed using linearization or the energy-momentum method (or, when appropriate, the energy-Casimir–Arnold method). Accordingly, one can write stabilizability theorems. These theorems provide a construction of stabilizing feedback control laws (see, for example, Bloch, Leonard, and Marsden [2000]). We note that applying the matching theorems and stabilizability theorems in examples is relatively straightforward.
The general matching and stabilization theorems have been applied to a variety of mechanical control systems including balance systems (inverted pendulum systems) and rigid body systems. For example, this methodology provides a control law for stabilization of an inverted spherical pendulum balanced on a cart using two control forces that can move the cart in the horizontal plane. The theory for systems of this type is described in Bloch, Leonard, and Marsden [2000], where details are also given for adding controlled dissipation and proving asymptotic stability.

The theory focusing on kinetic energy modification only provides stabilization modulo the symmetry directions. As mentioned above, the theory has been extended to provide stabilization of systems in the whole phase space. This is achieved by complementing the control term discussed above, which derives from kinetic energy modifications, with a control term that derives from a symmetry-breaking potential. This theory allows for the original system (i.e., before control) to have symmetry breaking in the potential energy; an example is the spherical pendulum balanced on a cart that moves about an inclined plane. This more complete theory and the examples may be found in Bloch, Chang, Leonard, and Marsden [2001].

In the following we focus on the controlled Lagrangian methodology in the special context of systems with additional symmetries in the uncontrolled directions. This is relevant for examples such as the spacecraft and the underwater vehicle controlled with internal rotors as discussed at the end of §3. The theory in this context makes especially clear the role and utility of symmetry and reduction in understanding and controlling mechanical systems. We remark that related work on matching has been done by Auckly, Kapitanski, and White [2000], Hamberg [1999], and Baillieul [1999], and by Ortega, Loria, Nicklasson, and Sira-Ramirez [1998], Ortega, van der Schaft, and Maschke [1999], and Ortega, van der Schaft, Maschke, and Escobar [1999].

5 Controlled Euler–Poincaré Systems

In this section we give the details of a general, readily implementable matching theorem and stabilization theorems in the Euler–Poincaré setting. The Euler–Poincaré setting allows for control systems with two symmetry groups: the first symmetry group $G$ is the abelian one corresponding to the control inputs as described in §4, and the second, nonabelian symmetry group $H$ corresponds to symmetries of the uncontrolled system. This setting is motivated by the applications. For instance, the rigid spacecraft with rotor example has one symmetry group $G = S^1$ corresponding to the controlled rotor and a second symmetry group $H = SO(3)$ corresponding to the rotational invariance of the complete rigid body system. The theory is illustrated for a spacecraft and an underwater vehicle controlled with internal rotors. The presentation follows Bloch, Leonard and Marsden [2001].

5.1 Euler–Poincaré Matching

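Let the configuration space be the Cartesian product of a nonabelian group $H$ with an abelian group $G$. The controls act in the $G$ directions. Let the Lagrangian $L : T(H \times G) \to \mathbb{R}$ be left invariant on $H$ and cyclic in the abelian variables. Denote by $l : \mathfrak{h} \times \mathfrak{g} \to \mathbb{R}$ the restriction of $L$ to the identity of $H$, and for a curve $h(t) \in H$ let $\eta(t) = T_{h(t)} L_{h(t)^{-1}} \dot{h}$. The reduced Lagrangian is given by

$$l(\eta^\alpha, \dot\theta^a) = \tfrac12 g_{\alpha\beta}\eta^\alpha\eta^\beta + g_{\alpha a}\eta^\alpha\dot\theta^a + \tfrac12 g_{ab}\dot\theta^a\dot\theta^b, \tag{5.1}$$

where $\eta^\alpha$ are the variables in $\mathfrak{h}$ and $\theta^a$ are the control variables. The matrices with elements $g_{\alpha\beta}$, $g_{\alpha a}$ and $g_{ab}$ are all constant.

In the absence of control, one conservation law comes about as a result of the $G$ symmetry. The conserved quantity $J_a$ is the momentum conjugate to the cyclic variable $\theta^a$ and is given by

$$J_a = \frac{\partial l}{\partial \dot\theta^a} = g_{a\alpha}\eta^\alpha + g_{ab}\dot\theta^b. \tag{5.2}$$

The equations of motion corresponding to $l$ for the control system where the controls $u_a$ act in the $\theta^a$ directions are called the controlled Euler–Poincaré equations:

$$\frac{d}{dt}\frac{\partial l}{\partial \eta^\alpha} = c^\beta_{\alpha\delta}\,\eta^\delta\,\frac{\partial l}{\partial \eta^\beta}, \tag{5.3}$$

$$\frac{d}{dt}\frac{\partial l}{\partial \dot\theta^a} = u_a, \tag{5.4}$$

where $c^\beta_{\alpha\delta}$ are the structure constants of the Lie algebra $\mathfrak{h}$.

The strategy outlined in §4 is now followed in order to determine a stabilizing feedback control law for an otherwise unstable equilibrium of the system (5.3)–(5.4). The first step is to write down the candidate family of controlled Lagrangians using Definition 4.2:

$$L_{\tau,\sigma,\rho}(v) = \tfrac12 \big[ g_\sigma(\mathrm{Hor}_\tau v_q, \mathrm{Hor}_\tau v_q) + g_\rho(\mathrm{Ver}_\tau v_q, \mathrm{Ver}_\tau v_q) \big] - V(q). \tag{5.5}$$

Assume further that the metric $g_\sigma$ satisfies the following:

1. $g = g_\sigma$ on $\mathrm{Hor}$,
2. $\mathrm{Hor}$ and $\mathrm{Ver}$ are orthogonal for $g_\sigma$.

Then the controlled Lagrangian can be expressed as the original Lagrangian with shifted velocity, plus two additional modification terms.

5.1 Theorem. We have the following formula:

$$L_{\tau,\sigma,\rho}(v) = L\big(v + \tau(v)_Q\big) + \tfrac12 g_\sigma\big(\tau(v)_Q, \tau(v)_Q\big) + \tfrac12 \varpi(v), \tag{5.6}$$

where $v \in T_q Q$ and where $\varpi(v) = (g_\rho - g)\big(\mathrm{Ver}_\tau(v), \mathrm{Ver}_\tau(v)\big)$.

For the proof see Bloch, Leonard, and Marsden [2000]. In coordinates, the controlled Lagrangian from (5.6) is

$$
\begin{aligned}
l_{\tau,\sigma,\rho} ={}& l(\eta^\alpha, \dot\theta^a + \tau^a_\alpha\eta^\alpha) + \tfrac12 \sigma_{ab}\tau^a_\alpha\tau^b_\beta\eta^\alpha\eta^\beta \\
&+ \tfrac12 \varpi_{ab}\big(\dot\theta^a + g^{ac}g_{c\alpha}\eta^\alpha + \tau^a_\alpha\eta^\alpha\big)\big(\dot\theta^b + g^{bc}g_{c\beta}\eta^\beta + \tau^b_\beta\eta^\beta\big),
\end{aligned} \tag{5.7}
$$

where $\varpi_{ab} = \rho_{ab} - g_{ab}$. Here $\sigma_{ab}$ and $\rho_{ab}$ denote the $ab$ components of $g_\sigma$ and $g_\rho$, respectively. To preserve symmetry, $\sigma_{ab}$ and $\rho_{ab}$ are both taken to be constant. From (5.7) the controlled conserved quantity $\tilde{J}_a$ associated with the cyclic variable $\theta^a$ is

$$\tilde{J}_a = \frac{\partial l_{\tau,\sigma,\rho}}{\partial \dot\theta^a} = \rho_{ab}\big(\dot\theta^b + g^{bc}g_{c\alpha}\eta^\alpha + \tau^b_\alpha\eta^\alpha\big). \tag{5.8}$$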

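Since the controlled Lagrangian prescribes the closed-loop system, the closed-loop dynamics can be written as the Euler–Poincaré equations corresponding to $l_{\tau,\sigma,\rho}$:

$$\frac{d}{dt}\frac{\partial l_{\tau,\sigma,\rho}}{\partial \eta^\alpha} = c^\beta_{\alpha\delta}\,\eta^\delta\,\frac{\partial l_{\tau,\sigma,\rho}}{\partial \eta^\beta}, \tag{5.9}$$

$$\frac{d}{dt}\frac{\partial l_{\tau,\sigma,\rho}}{\partial \dot\theta^a} = 0. \tag{5.10}$$

This implies that the control inputs $u_a$ must be chosen so that (5.4) and (5.10) are equivalent. Additionally, as discussed in the general strategy in §4, the controlled Lagrangian must be chosen to be consistent with the available control authority. This is equivalent to requiring that (5.3) matches (5.9). By inspection it is clear that for matching to hold it is sufficient to equate $\dfrac{\partial l}{\partial \eta^\alpha}$ with $\dfrac{\partial l_{\tau,\sigma,\rho}}{\partial \eta^\alpha}$. One then computes

$$\frac{\partial l}{\partial \eta^\alpha} = g_{\alpha\beta}\eta^\beta + g_{\alpha a}\dot\theta^a, \tag{5.11}$$

$$
\begin{aligned}
\frac{\partial l_{\tau,\sigma,\rho}}{\partial \eta^\alpha}
={}& \frac{\partial}{\partial \eta^\alpha} l(\eta^\alpha, \dot\theta^a + \tau^a_\alpha\eta^\alpha) + \frac{\partial}{\partial \dot\theta^a} l(\eta^\alpha, \dot\theta^a + \tau^a_\alpha\eta^\alpha)\,\tau^a_\alpha \\
&+ \sigma_{ab}\tau^a_\alpha\tau^b_\beta\eta^\beta + \varpi_{ab}\big(\dot\theta^a + g^{ac}g_{c\beta}\eta^\beta + \tau^a_\beta\eta^\beta\big)\big(g^{bc}g_{c\alpha} + \tau^b_\alpha\big) \\
={}& g_{\alpha\beta}\eta^\beta + g_{\alpha a}\big(\dot\theta^a + \tau^a_\beta\eta^\beta\big) + g_{ab}\rho^{bc}\tilde{J}_c\,\tau^a_\alpha \\
&+ \sigma_{ab}\tau^a_\alpha\tau^b_\beta\eta^\beta + (\rho_{ab} - g_{ab})\rho^{ad}\tilde{J}_d\big(g^{bc}g_{c\alpha} + \tau^b_\alpha\big),
\end{aligned} \tag{5.12}
$$

where in the second equality, the conservation law and the definition of $\varpi$ are used. Note that the partial derivatives in this expression mean the derivatives with respect to the relevant variable slot of the function. Subtract (5.11) from (5.12), and one finds that matching holds if

$$\big(g_{\alpha b} + \sigma_{ab}\tau^a_\alpha\big)\tau^b_\beta\eta^\beta + g_{ab}\rho^{bc}\tilde{J}_c\,\tau^a_\alpha + (\rho_{ab} - g_{ab})\rho^{ad}\tilde{J}_d\big(g^{bc}g_{c\alpha} + \tau^b_\alpha\big) = 0. \tag{5.13}$$

With two assumptions, matching follows:

Assumption EP–1. $\tau^a_\alpha = -\sigma^{ab}g_{b\alpha}$.

Assumption EP–2. $\sigma^{ab} + \rho^{ab} = g^{ab}$.

5.2 Theorem. Under Assumptions EP–1 and EP–2, the Euler–Poincaré equations for the controlled Lagrangian coincide with the controlled Euler–Poincaré equations.

Proof. With Assumption EP–1, the first term in (5.13) is zero and the remaining expression simplifies to

$$g_{ab}\sigma^{ac} - (\rho_{ab} - g_{ab})(g^{ac} - \sigma^{ac}) = 0. \tag{5.14}$$

Assumption EP–2 is equivalent to (5.14).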

Theorem 5.2 restricts $\tau$, $g_\sigma$, and $g_\rho$ sufficiently so that the family of controlled Lagrangians given by (5.7) is consistent with the available control authority. Note that there remains a nontrivial choice of controlled Lagrangian, i.e., one can interpret $\rho_{ab}$ as a free control parameter which can be chosen for stabilization. Given $\rho_{ab}$, Assumption EP–2 gives $\sigma_{ab}$ and then Assumption EP–1 prescribes $\tau$.
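The matching conditions can be checked numerically. The sketch below picks hypothetical constant metric blocks and a hypothetical $\rho_{ab}$ (none of these numbers come from the source), builds $\sigma$ from Assumption EP–2 and $\tau$ from Assumption EP–1, and evaluates the left-hand side of (5.13) at a random state. The array names are assumptions: `g_hh`, `g_hg`, and `g_gg` stand for $g_{\alpha\beta}$, $g_{\alpha a}$, and $g_{ab}$, and `tau[a, alpha]` stands for $\tau^a_\alpha$.

```python
import numpy as np

# Hypothetical constant metric blocks (dim h = 3, dim g = 2).
g_hh = np.diag([3.0, 4.0, 5.0])                          # g_{alpha beta}
g_hg = np.array([[0.5, 0.0], [0.0, 0.3], [0.2, 0.1]])    # g_{alpha a}
g_gg = np.diag([1.0, 2.0])                               # g_{ab}
rho = np.diag([-2.0, -3.0])                              # free parameter rho_{ab} (illustrative)

g_gg_inv = np.linalg.inv(g_gg)       # g^{ab}
rho_inv = np.linalg.inv(rho)         # rho^{ab}
sigma_inv = g_gg_inv - rho_inv       # Assumption EP-2: sigma^{ab} = g^{ab} - rho^{ab}
sigma = np.linalg.inv(sigma_inv)     # sigma_{ab}
tau = -sigma_inv @ g_hg.T            # Assumption EP-1: tau^a_alpha = -sigma^{ab} g_{b alpha}

def matching_residual(eta, theta_dot, tau, sigma):
    """Left-hand side of (5.13), returned as a vector indexed by alpha."""
    shift = g_gg_inv @ g_hg.T + tau                   # g^{bc} g_{c alpha} + tau^b_alpha
    J_tilde = rho @ (theta_dot + shift @ eta)         # controlled momentum, eq. (5.8)
    term1 = (g_hg + tau.T @ sigma) @ (tau @ eta)
    term2 = tau.T @ g_gg @ rho_inv @ J_tilde
    term3 = shift.T @ (rho - g_gg) @ rho_inv @ J_tilde
    return term1 + term2 + term3

rng = np.random.default_rng(0)
eta, theta_dot = rng.normal(size=3), rng.normal(size=2)
residual = matching_residual(eta, theta_dot, tau, sigma)
print(np.allclose(residual, 0.0))    # the residual vanishes up to rounding
```

Changing `rho` while recomputing `sigma` and `tau` from the two assumptions leaves the residual at zero, which reflects the remaining freedom in $\rho_{ab}$ noted above. The only requirement in this sketch is that $g^{ab} - \rho^{ab}$ be invertible.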

5.2 Determination of the Control Law

The control law ua is chosen so that (5.4) and (5.10) are the same. Note that (5.4) is J˙a = ua and (5.10) is the controlled conservation law J˜˙a = 0. Since gab and ρab are also constant, the controlled conservation law can also be written d d (gab ρbd J˜d ) = (gaα η α + gab θ˙b + gab ταb η α ) = 0. dt dt

(5.15)

Subtracting (5.15) from (5.4) gives d d (gab ρbd J˜d − Ja ) = − (gab ταb η α ) dt dt = gab σ bc gcα η˙ α ,

ua = −

(5.16)

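where the expression in Assumption EP–1 is used for $\tau^b_\alpha$. Accelerations are eliminated using the Euler–Poincaré equations for $\eta$, which hold for both $l$ and $l_{\tau,\sigma,\rho}$. This yields (see Bloch, Leonard, and Marsden [2001] for details)
\[ u_d = g_{db}\sigma^{bc}g_{c\beta}B^{\alpha\beta}\Big(-g_{\alpha e}g^{ae}u_a + c^\psi_{\alpha\delta}\eta^\delta\frac{\partial l}{\partial\eta^\psi}\Big), \]
i.e.,
\[ u_a = D_{ab}\sigma^{bc}g_{c\beta}B^{\alpha\beta}c^\psi_{\alpha\delta}\eta^\delta\frac{\partial l}{\partial\eta^\psi}, \]
where
\[ D^{ba} = g^{ba} + \sigma^{bc}g_{c\beta}B^{\alpha\beta}g_{\alpha e}g^{ae} \tag{5.17} \]
and
\[ B_{\alpha\beta} = g_{\alpha\beta} - g_{\alpha b}g^{ab}g_{a\beta}. \tag{5.18} \]
Here $B^{\alpha\beta}$ and $D_{ab}$ denote the inverses of $B_{\alpha\beta}$ and $D^{ab}$. Define control gains
\[ k^\alpha_a = D_{ab}\sigma^{bc}g_{c\beta}B^{\alpha\beta}. \tag{5.19} \]
Then,
\[ u_a = k^\alpha_a\frac{d}{dt}\frac{\partial l}{\partial\eta^\alpha} = k^\alpha_a c^\psi_{\alpha\delta}\eta^\delta\frac{\partial l}{\partial\eta^\psi} = k^\alpha_a c^\psi_{\alpha\delta}\eta^\delta\big(g_{\psi\beta}\eta^\beta + g_{\psi b}\dot\theta^b\big). \tag{5.20} \]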
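As a rough illustration of (5.17) to (5.19), the sketch below evaluates the gain when the rotor block and the coupled body direction are both one-dimensional, so that all quantities are scalars. The function name, argument names and numerical values are assumptions made for this example only. With the entries of a single-rotor system (rotor inertia J3, body inertia I3 about the rotor axis, and $\sigma_{ab} = \sigma J_3$) the result reduces to $k = J_3/(\sigma I_3 + J_3)$.

```python
from fractions import Fraction


def scalar_gain(g_ab, g_coupling, g_body, sigma_ab):
    """Gain k from (5.17)-(5.19) when every index takes a single value.

    g_ab       : rotor entry g_ab of the kinetic energy metric
    g_coupling : cross term g_{a alpha}
    g_body     : body entry g_{alpha beta}
    sigma_ab   : sigma with lower indices (its reciprocal is sigma^{ab})
    """
    B = g_body - g_coupling * g_coupling / g_ab                    # (5.18)
    D_upper = (1 / g_ab
               + (1 / sigma_ab) * g_coupling * (1 / B) * g_coupling * (1 / g_ab))  # (5.17)
    D_lower = 1 / D_upper                                          # inverse of D^{ab}
    return D_lower * (1 / sigma_ab) * g_coupling * (1 / B)         # (5.19)


# Hypothetical single-rotor numbers, kept as exact rationals.
I3, J3, sigma = Fraction(3), Fraction(2), Fraction(5)
k = scalar_gain(g_ab=J3, g_coupling=J3, g_body=I3 + J3, sigma_ab=sigma * J3)
closed_form = J3 / (sigma * I3 + J3)
```

Passing `Fraction` values keeps the arithmetic exact; with plain integers the divisions would silently become floating point.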

14. Symmetries, Conservation Laws, and Control

5.3 Euler–Poincaré Stabilization

Since the controlled system is in Lagrangian (or Hamiltonian) form, the energy-Casimir or energy-momentum method can be used to determine stability (see e.g., Marsden [1992], or Marsden and Ratiu [1999]). Recall that for mechanical systems, an eigenvalue analysis alone is not sufficient for determining stability. Before proceeding to explicit computation in the examples, we first describe the general analysis of stability.

The general approach is to assume an equilibrium is given and there is a finite collection of Casimir functions (or more generally, conserved quantities) for the free Lagrangian system on the group H without the introduction of the controlled variables. The control variables are then added. Define, using the previous notation, the (reduced) Lagrangian $l_0$ on $\mathfrak{h}$ by
\[ l_0(\eta^\alpha) = \tfrac12 g_{\alpha\beta}\eta^\alpha\eta^\beta. \tag{5.21} \]

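A (relative) equilibrium $\eta_e$ for the corresponding dynamical equations satisfies the equation
\[ c^\beta_{\alpha\delta}\eta^\delta\frac{\partial l_0}{\partial\eta^\beta} = 0. \tag{5.22} \]
Suppose there is a collection of Casimir functions $C^1(M_\alpha), \dots, C^m(M_\alpha)$, where $M_\alpha = \partial l_0/\partial\eta^\alpha = g_{\alpha\beta}\eta^\beta$. Now consider the full uncontrolled Lagrangian $l$ given by (5.7). Using (5.3), $\eta_e$ is still an equilibrium together with $\dot\theta^a_e$ for the full system provided
\[ c^\beta_{\alpha\delta}\eta^\delta_e\big(g_{\beta\gamma}\eta^\gamma_e + g_{\beta a}\dot\theta^a_e\big) = 0. \tag{5.23} \]
This is satisfied if $c^\beta_{\alpha\delta}\eta^\delta_e g_{\beta a}\dot\theta^a_e = 0$ and in particular if $\dot\theta^a_e = 0$. Therefore, from the matching conditions, $l_{\tau,\sigma,\rho}$ also has this equilibrium.$^1$ Set
\[ \tilde M_\alpha = \frac{\partial l_{\tau,\sigma,\rho}}{\partial\eta^\alpha} = \frac{\delta l}{\delta\eta^\alpha} = g_{\alpha\beta}\eta^\beta + g_{\alpha a}\dot\theta^a = G_{\alpha\beta}\eta^\beta + g_{\alpha a}\rho^{ab}\tilde J_b, \tag{5.24} \]
where $G_{\alpha\beta} = g_{\alpha\beta} - g_{a\alpha}\rho^{ab}g_{b\beta}$. To examine stability of the controlled system using the energy-Casimir method consider the energy-Casimir function
\[ E_{\tilde\Phi} = l_{\tau,\sigma,\rho} + \tilde\Phi\big(C^k(\tilde M_\alpha), \tilde J_a\big), \tag{5.25} \]
where $\tilde\Phi$ is a smooth function. Since we have only reshaped energy and not modified the Lagrangian (Hamiltonian) structure, the $C^k$ are Casimir functions for the controlled system. First note that, using Assumption EP–2,
\[ l_{\tau,\sigma,\rho} = \tfrac12 G_{\alpha\beta}\eta^\alpha\eta^\beta + \tfrac12\rho^{ab}\tilde J_a\tilde J_b, \tag{5.26} \]
and consider the case in which
\[ \tilde\Phi\big(C^1, \dots, C^m, \tilde J_a\big) \equiv \tilde\Phi\big(C^1, \dots, C^m\big) + \Psi\big(\tilde J_a\big). \tag{5.27} \]
This specialization is sufficient for applications. Computing the first and second variations at the equilibrium following the standard prescription of the energy-Casimir method gives the following theorem:

$^1$ This can also be seen using the general fact that $l$ and $l_{\tau,\sigma,\rho}$ are reductions of Lagrangians $L$ and $L_{\tau,\sigma,\rho}$ that have the same "locked Lagrangian". This is explained in Bloch, Leonard, and Marsden [2001].

5.3 Theorem. Let $\eta_e$ be an equilibrium for the uncontrolled dynamics given by $l_0$ (5.21). Suppose that $\dot\theta_e$ satisfies (5.23). Then, $(\eta_e, \dot\theta_e)$ is an equilibrium for the controlled system described by $l_{\tau,\sigma,\rho}$ (5.26). This equilibrium is Lyapunov stable for the controlled dynamics if $\rho_{ab}$ and $\tilde\Phi(C^1, \dots, C^m)$ can be found so that
\[ \sum_{k=1}^m D_k\tilde\Phi\,\frac{\partial C^k}{\partial\tilde M_\alpha}\bigg|_e = -\eta^\alpha_e, \tag{5.28} \]
and $G_{\alpha\beta} + G_{\alpha\gamma}\tilde H^{\gamma\delta}G_{\delta\beta}$ is definite, where
\[ \tilde H^{\alpha\beta} = \bigg(\sum_{k,j=1}^m D_{kj}\tilde\Phi\,\frac{\partial C^k}{\partial\tilde M_\alpha}\frac{\partial C^j}{\partial\tilde M_\beta} + \sum_{k=1}^m D_k\tilde\Phi\,\frac{\partial^2 C^k}{\partial\tilde M_\alpha\,\partial\tilde M_\beta}\bigg)\bigg|_e. \tag{5.29} \]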

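The matrix $G_{\alpha\beta}$ is the horizontal part of the metric for the controlled system, i.e., the "controlled inertia" associated with the group H variables. The expression $G_{\alpha\beta} = g_{\alpha\beta} - g_{a\alpha}\rho^{ab}g_{b\beta}$ shows how the control gain $\rho_{ab}$ enters to provide stabilization, i.e., by modifying the inertia to satisfy the definiteness condition of Theorem 5.3. Details of the proof are in Bloch, Leonard, and Marsden [2001].

To obtain asymptotic stability, an additional term can be introduced in the control law to simulate dissipation. Let the complete control law be
\[ u_a = u^{\rm cons}_a + g_{ab}\rho^{bc}u^{\rm diss}_c, \tag{5.30} \]
where $u^{\rm cons}_a = g_{ab}\sigma^{bc}g_{c\alpha}\dot\eta^\alpha$, i.e., the control law term derived above in equation (5.16). Here, $u^{\rm diss}_a$ is the new feedback term that will simulate dissipation. With this control law, the controlled dynamics are computed to be
\[ \frac{d}{dt}\frac{\partial l_{\tau,\sigma,\rho}}{\partial\eta^\alpha} = c^\beta_{\alpha\delta}\eta^\delta\frac{\partial l_{\tau,\sigma,\rho}}{\partial\eta^\beta}, \tag{5.31} \]
\[ \frac{d}{dt}\frac{\partial l_{\tau,\sigma,\rho}}{\partial\dot\theta^a} = \dot{\tilde J}_a = u^{\rm diss}_a, \tag{5.32} \]
where $u^{\rm diss}_a$ is chosen so that $E_{\tilde\Phi}$ is nondecreasing (nonincreasing) in time if the equilibrium is a local maximum (minimum) of $E_{\tilde\Phi}$.

Assume that $\tilde\Phi$ can be taken in the form (5.27). Since the actuation is internal, $C^k$ is constant. Thus,
\[ \frac{d}{dt}E_{\tilde\Phi} = \frac{d}{dt}l_{\tau,\sigma,\rho} + \frac{\partial\tilde\Phi}{\partial C^k}\dot C^k + \frac{\partial\Psi}{\partial\tilde J_a}\dot{\tilde J}_a = \Big(\dot\theta^a + \frac{\partial\Psi}{\partial\tilde J_a}\Big)u^{\rm diss}_a. \tag{5.33} \]
Without loss of generality, assume that $E_{\tilde\Phi}$ has a local maximum at the equilibrium. Choose
\[ u^{\rm diss}_a = c_{ab}\Big(\dot\theta^b + \frac{\partial\Psi}{\partial\tilde J_b}\Big), \tag{5.34} \]
where $c_{ab}$ is a positive definite matrix. Then,
\[ \frac{d}{dt}E_{\tilde\Phi} = c_{ab}\Big(\dot\theta^a + \frac{\partial\Psi}{\partial\tilde J_a}\Big)\Big(\dot\theta^b + \frac{\partial\Psi}{\partial\tilde J_b}\Big) \ge 0. \tag{5.35} \]
In the case that the equilibrium of interest is such that $\dot\theta^a_e = 0$, $\Psi$ can be taken as $\Psi(\tilde J) = \tfrac12\epsilon^{bc}\tilde J_b\tilde J_c$, where $\epsilon^{bc}$ is a sign definite symmetric matrix. Then, (5.35) becomes
\[ \frac{d}{dt}E_{\tilde\Phi} = c_{ab}\big(\dot\theta^a + \epsilon^{ac}\tilde J_c\big)\big(\dot\theta^b + \epsilon^{bd}\tilde J_d\big) \ge 0. \tag{5.36} \]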

The LaSalle invariance principle and the details of the speciﬁc system are used to prove asymptotic stability. In Bloch, Chang, Leonard, Marsden, and Woolsey [2000] the general theory of how to carry this out is examined in some detail. See also Woolsey and Leonard [1999a].
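The sign argument behind the dissipative term can be seen in a one-rotor sketch: with the feedback chosen proportional to $\dot\theta + \partial\Psi/\partial\tilde J$ and a positive gain, the rate of change of the energy-Casimir function is a gain times a square. The code below is only an illustration; the function names, the quadratic choice of Ψ with coefficient `eps`, and the sample states are assumptions made here.

```python
def dissipative_feedback(c, theta_dot, dPsi_dJ):
    """Scalar dissipative feedback: u_diss = c * (theta_dot + dPsi/dJ)."""
    return c * (theta_dot + dPsi_dJ)


def energy_rate(c, theta_dot, dPsi_dJ):
    """Scalar energy rate: dE/dt = (theta_dot + dPsi/dJ) * u_diss."""
    u_diss = dissipative_feedback(c, theta_dot, dPsi_dJ)
    return (theta_dot + dPsi_dJ) * u_diss


def quadratic_dPsi(eps, J_tilde):
    """Derivative of the quadratic choice Psi(J) = 0.5 * eps * J**2."""
    return eps * J_tilde


# Hypothetical sample states (theta_dot, J_tilde) and a positive gain c.
samples = [(-2.0, 0.5), (0.0, 0.0), (1.5, -3.0), (4.0, 4.0)]
rates = [
    energy_rate(c=0.7, theta_dot=td, dPsi_dJ=quadratic_dPsi(-0.1, J))
    for td, J in samples
]
```

Every entry of `rates` is nonnegative, and it is zero only where $\dot\theta + \partial\Psi/\partial\tilde J$ vanishes, which is the set on which an invariance argument then has to be applied.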

5.4 The Rigid Spacecraft with a Rotor

The first application considered is the rigid spacecraft with rotor aligned along the third principal axis as shown in Figure 5.1 and as described in Krishnaprasad [1985] and Bloch, Krishnaprasad, Marsden, and Sánchez de Alvarez [1992]. The rotor spins under the influence of a torque $u$ acting on the rotor, and the objective is to stabilize steady spin about the spacecraft's otherwise unstable, intermediate axis. The stabilizing control law derived in Bloch, Krishnaprasad, Marsden, and Sánchez de Alvarez [1992] is recovered here by application of the method of controlled Lagrangians (although in contrast to the earlier paper, here there is no restriction to the zero level set of the conserved momentum). As such, the method of controlled Lagrangians can be regarded as an algorithmic generalization of the spacecraft control law developed in Bloch, Krishnaprasad, Marsden, and Sánchez de Alvarez [1992].

Figure 5.1. The spacecraft with a rotor attached along the long axis. (Labels in the figure: spinning rotor, rigid carrier.)

The configuration space for the spacecraft with rotor system is $Q = SO(3) \times S^1$; the reduced Lagrangian $l$ on $so(3) \times \mathbb{R}$ is the kinetic energy:
\[ l(\Omega, \dot\phi) = \tfrac12\big(\lambda_1\Omega_1^2 + \lambda_2\Omega_2^2 + I_3\Omega_3^2 + J_3(\Omega_3 + \dot\phi)^2\big) = \tfrac12\begin{pmatrix}\Omega_1 & \Omega_2 & \Omega_3 & \dot\phi\end{pmatrix}\begin{pmatrix}\lambda_1 & 0 & 0 & 0\\ 0 & \lambda_2 & 0 & 0\\ 0 & 0 & I_3 + J_3 & J_3\\ 0 & 0 & J_3 & J_3\end{pmatrix}\begin{pmatrix}\Omega_1\\ \Omega_2\\ \Omega_3\\ \dot\phi\end{pmatrix}. \tag{5.37} \]
Here, $\Omega = (\Omega_1, \Omega_2, \Omega_3)$ is the body angular velocity vector of the carrier, $\phi$ is the relative angle of the rotor, $I_1 > I_2 > I_3$ are the rigid body moments of inertia, $J_1 = J_2$ and $J_3$ are the rotor moments of inertia and $\lambda_i = I_i + J_i$.

Since $G = S^1$ is one-dimensional, $g_{ab}$, $\sigma_{ab}$ and $\rho_{ab}$ are all scalars. From (5.37), $g_{ab} = J_3$. Let $\sigma_{ab} = \sigma J_3$ and $\rho_{ab} = \rho J_3$, where $\sigma$ and $\rho$ are dimensionless scalars. Applying Assumptions EP–1 and EP–2, the controlled Lagrangian is
\[ l_{\tau,\sigma,\rho} = \tfrac12\Big(\lambda_1\Omega_1^2 + \lambda_2\Omega_2^2 + I_3\Omega_3^2 + \tfrac{1}{\sigma}J_3\Omega_3^2 + \tfrac{\sigma}{\sigma - 1}J_3\big(\Omega_3 + \dot\phi - \tfrac{1}{\sigma}\Omega_3\big)^2\Big), \tag{5.38} \]
where $\sigma$ is a free variable and matching is ensured by Theorem 5.2. The controlled conserved quantity is
\[ \tilde l_3 = \frac{\partial l_{\tau,\sigma,\rho}}{\partial\dot\phi} = J_3\Omega_3 + \rho J_3\dot\phi. \]
Using (5.20), the control law is
\[ u = k(\lambda_1 - \lambda_2)\Omega_1\Omega_2, \quad\text{where}\quad \frac{1}{\sigma} = \frac{k}{1 - k}\,\frac{I_3}{J_3}. \tag{5.39} \]
As in Bloch, Krishnaprasad, Marsden, and Sánchez de Alvarez [1992], the equilibrium of interest is $(\Omega_1, \Omega_2, \Omega_3, \dot\phi) = (0, \bar\Omega, 0, 0)$, corresponding to steady rotation about the intermediate axis (unstable for the uncontrolled spacecraft). Application of Theorem 5.3 gives the stabilization result:

5.4 Proposition. For $k > 1 - I_3/\lambda_2$, the equilibrium $(0, \bar\Omega, 0, 0)$ is nonlinearly stable for the feedback controlled system where $u$ is given by (5.39).

For asymptotic stability of the equilibrium, consider
\[ \Psi(\tilde l_3) = \frac{1}{2}\,\frac{\epsilon}{J_3}\,\tilde l_3^{\,2} \]
with $\epsilon < 0$ and $|\epsilon| \ll 1$. By (5.34) a dissipative control term is taken as
\[ u^{\rm diss} = c\Big(\dot\phi + \frac{\epsilon}{J_3}\tilde l_3\Big) = c\big(\epsilon\,\Omega_3 + (1 + \epsilon\rho)\dot\phi\big) \]
with $c > 0$, and the complete control law is
\[ u = k(\lambda_1 - \lambda_2)\Omega_1\Omega_2 + (1 - k)\frac{1}{\rho}u^{\rm diss}. \]
As is shown in Bloch, Chang, Leonard, Marsden, and Woolsey [2000] this leads to asymptotic stability.
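A quick consistency check on the spacecraft example is that the feedback $u = k(\lambda_1 - \lambda_2)\Omega_1\Omega_2$, with σ tied to k as in (5.39), leaves the controlled momentum $\tilde l_3 = J_3\Omega_3 + \rho J_3\dot\phi$ constant. The sketch below evaluates the time derivative of $\tilde l_3$ directly from $\dot\Pi_3 = (\lambda_1 - \lambda_2)\Omega_1\Omega_2$ and $\dot l_3 = u$, where $\Pi_3 = I_3\Omega_3 + l_3$ and $l_3 = J_3(\Omega_3 + \dot\phi)$. Function names and numbers are hypothetical; the last helper only records the algebraic identity $I_3 + J_3/\sigma = I_3/(1 - k)$ and is not a substitute for the stability proof.

```python
from fractions import Fraction


def sigma_from_gain(k, I3, J3):
    """Invert the relation 1/sigma = k/(1 - k) * I3/J3."""
    return (1 - k) * J3 / (k * I3)


def l3_tilde_rate(k, I3, J3, a):
    """d/dt of J3*Omega3 + rho*J3*phi_dot under u = k*a.

    a stands for (lambda1 - lambda2)*Omega1*Omega2, the third component
    of Pi x Omega at the current state.
    """
    sigma = sigma_from_gain(k, I3, J3)
    rho = sigma / (sigma - 1)             # scalar form of Assumption EP-2
    omega3_dot = (1 - k) * a / I3         # from Pi3_dot = a and l3_dot = k*a
    phi_ddot = k * a / J3 - omega3_dot    # from l3 = J3*(Omega3 + phi_dot)
    return J3 * omega3_dot + rho * J3 * phi_ddot


def controlled_inertia(k, I3):
    """Third diagonal entry of G: I3 + J3/sigma, which equals I3/(1 - k)."""
    return I3 / (1 - k)


# Hypothetical values: I3 = 3, J3 = 2, lambda2 = 8, forcing term a = 7.
I3, J3, lam2, a = Fraction(3), Fraction(2), Fraction(8), Fraction(7)
rate = l3_tilde_rate(Fraction(1, 2), I3, J3, a)
inertia_above = controlled_inertia(Fraction(3, 4), I3)   # k above 1 - I3/lam2 = 5/8
inertia_below = controlled_inertia(Fraction(1, 2), I3)   # k below 5/8
```

For gains with $k < 1$, the inequality $k > 1 - I_3/\lambda_2$ is the same statement as $I_3/(1 - k) > \lambda_2$, which the two sample inertias illustrate.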

5.5 The Dynamics of an Underwater Vehicle

In this section, a full presentation (following Bloch, Leonard, and Marsden [2001]) is made of the application of the theory to stabilization of (otherwise unstable) steady translation of an underwater vehicle along its long (streamlined) axis using two internal rotors. Aspects of the application are similar to the spacecraft problem; however, the underwater vehicle dynamics are much richer, notably because of the interaction with the surrounding fluid for both rotational and translational motions.

Typically, steady translation of an underwater vehicle is stabilized by propellers and fins. However, internal rotors may be advantageous in certain settings since they are isolated from the harsh seawater environment and, unlike fins, they can provide stabilization even at low vehicle velocities. The original idea to consider internal rotors for underwater vehicle stabilization was inspired in part by these practical considerations but also notably by the developments and successes in geometric mechanics and control and most especially the work of Marsden and colleagues.

Stabilization of the underwater vehicle with internal rotors was first investigated in Leonard and Woolsey [1998]. The approach proceeded in the Hamiltonian setting and a control law was determined that gave controlled Lie-Poisson dynamics. Here, we show that the method of controlled Lagrangians provides a systematic means for determining such a control law.

The underwater vehicle dynamics are modelled according to Kirchhoff's equations which describe the dynamics of a rigid body in an ideal, unbounded fluid. The vehicle is assumed to be neutrally buoyant, to have three planes of symmetry and to have uniformly distributed mass so that the centers of buoyancy and gravity are coincident. Given these simplifying assumptions, the mass plus added mass matrix of the body-fluid system can be diagonalized and denoted by $(m_1, m_2, m_3)$ as can the inertia plus added inertia matrix which is denoted by $(I_1, I_2, I_3)$. More on the geometry, dynamics, equilibria and stability of the underwater vehicle under these assumptions as well as in the case when centers of buoyancy and gravity are not coincident can be found in Leonard [1997a], Leonard and Marsden [1997], and Holmes, Jenkins, and Leonard [1998]. An approach to stabilization of the underwater vehicle steady translation using propellers and symmetry-breaking potentials (instead of kinetic energy modification) can be found in Leonard [1997b].

Let the underwater vehicle have two independently controlled, symmetric, internal rotors, one aligned along the first principal axis and the other aligned along the second principal axis. The first rotor spins under the influence of a torque $u_1$ acting on it, and the second rotor spins under the influence of a torque $u_2$ acting on it. The configuration space of the underwater vehicle plus rotors system is $Q = SE(3) \times (S^1 \times S^1)$ with the first factor $H = SE(3)$ being the underwater vehicle attitude and position and the second factor $G = S^1 \times S^1$ being the pair of rotor angles. The reduced Lagrangian on $se(3) \times \mathbb{R}^2$ for this system is the system kinetic energy:
\[ l(v, \Omega, \dot\alpha_1, \dot\alpha_2) = \tfrac12\big(m_1v_1^2 + m_2v_2^2 + m_3v_3^2 + \bar I_1\Omega_1^2 + \bar I_2\Omega_2^2 + \lambda_3\Omega_3^2 + J_1(\Omega_1 + \dot\alpha_1)^2 + J_2(\Omega_2 + \dot\alpha_2)^2\big) = \tfrac12\,x^T\mathbb{M}\,x, \tag{5.40} \]
where $x = (v_1, v_2, v_3, \Omega_1, \Omega_2, \Omega_3, \dot\alpha_1, \dot\alpha_2)^T$ and
\[ \mathbb{M} = \begin{pmatrix} m_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & m_2 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & m_3 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & \lambda_1 & 0 & 0 & J_1 & 0\\ 0 & 0 & 0 & 0 & \lambda_2 & 0 & 0 & J_2\\ 0 & 0 & 0 & 0 & 0 & \lambda_3 & 0 & 0\\ 0 & 0 & 0 & J_1 & 0 & 0 & J_1 & 0\\ 0 & 0 & 0 & 0 & J_2 & 0 & 0 & J_2 \end{pmatrix}. \]
Here $v = (v_1, v_2, v_3)$ is the body linear velocity of the vehicle, $\Omega = (\Omega_1, \Omega_2, \Omega_3)$ is the body angular velocity vector of the vehicle, $\alpha_i$ is the relative angle of the $i$th rotor, $i = 1, 2$, $J_1$ and $J^1_2 = J^1_3$ are the first rotor moments of inertia, $J_2$ and $J^2_1 = J^2_3$ are the second rotor moments of inertia, $\bar I_1 = I_1 + J^2_1$, $\bar I_2 = I_2 + J^1_2$, $\lambda_3 = \bar I_3 = I_3 + J^1_3 + J^2_3$ and $\lambda_i = \bar I_i + J_i$, for $i = 1, 2$. The vehicle linear and angular momenta are determined by the Legendre transform to be
\[ P_1 = m_1v_1, \quad P_2 = m_2v_2, \quad P_3 = m_3v_3, \]
\[ \Pi_1 = \lambda_1\Omega_1 + J_1\dot\alpha_1, \quad \Pi_2 = \lambda_2\Omega_2 + J_2\dot\alpha_2, \quad \Pi_3 = \lambda_3\Omega_3. \]
Note that $\tilde M_i = P_i$, $i = 1, 2, 3$ and $\tilde M_{i+3} = \Pi_i$, $i = 1, 2, 3$ following the notation of (5.24). The momenta conjugate to $\alpha_1$ and $\alpha_2$ are
\[ l_1 = \frac{\partial l}{\partial\dot\alpha_1} = J_1(\Omega_1 + \dot\alpha_1), \qquad l_2 = \frac{\partial l}{\partial\dot\alpha_2} = J_2(\Omega_2 + \dot\alpha_2). \]
The equations of motion with the control torques $u_1$ and $u_2$ acting on the rotors are:
\[ \dot\Pi = \Pi \times \Omega + P \times v, \qquad \dot P = P \times \Omega, \qquad \dot l_1 = u_1, \qquad \dot l_2 = u_2. \tag{5.41} \]

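Controlled Lagrangian and Matching. To determine the controller, the controlled Lagrangian is first constructed and then the Euler–Poincaré matching theorem is applied. From (5.40), $g_{ab}$ is given by
\[ \begin{pmatrix} g_{\alpha_1\alpha_1} & g_{\alpha_1\alpha_2}\\ g_{\alpha_2\alpha_1} & g_{\alpha_2\alpha_2}\end{pmatrix} = \begin{pmatrix} J_1 & 0\\ 0 & J_2\end{pmatrix}. \tag{5.42} \]
Let $\sigma_1, \sigma_2$ be dimensionless scalars and let $\sigma_{ab}$ be given by
\[ \begin{pmatrix} \sigma_{\alpha_1\alpha_1} & \sigma_{\alpha_1\alpha_2}\\ \sigma_{\alpha_2\alpha_1} & \sigma_{\alpha_2\alpha_2}\end{pmatrix} = \begin{pmatrix} J_1\sigma_1 & 0\\ 0 & J_2\sigma_2\end{pmatrix}. \tag{5.43} \]
Similarly, let $\rho_1, \rho_2$ be dimensionless scalars and let $\rho_{ab}$ be given by
\[ \begin{pmatrix} \rho_{\alpha_1\alpha_1} & \rho_{\alpha_1\alpha_2}\\ \rho_{\alpha_2\alpha_1} & \rho_{\alpha_2\alpha_2}\end{pmatrix} = \begin{pmatrix} J_1\rho_1 & 0\\ 0 & J_2\rho_2\end{pmatrix}. \tag{5.44} \]
For matching $\tau^a_\alpha$ should be chosen according to Assumption EP–1, i.e.,
\[ \begin{pmatrix} \tau^{\alpha_1}_{v_1} & \tau^{\alpha_1}_{v_2} & \tau^{\alpha_1}_{v_3} & \tau^{\alpha_1}_{\Omega_1} & \tau^{\alpha_1}_{\Omega_2} & \tau^{\alpha_1}_{\Omega_3}\\ \tau^{\alpha_2}_{v_1} & \tau^{\alpha_2}_{v_2} & \tau^{\alpha_2}_{v_3} & \tau^{\alpha_2}_{\Omega_1} & \tau^{\alpha_2}_{\Omega_2} & \tau^{\alpha_2}_{\Omega_3}\end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & -\frac{1}{\sigma_1} & 0 & 0\\ 0 & 0 & 0 & 0 & -\frac{1}{\sigma_2} & 0\end{pmatrix}. \tag{5.45} \]
Further, by Assumption EP–2, $\rho_i$ should satisfy $\dfrac{1}{\sigma_iJ_i} + \dfrac{1}{\rho_iJ_i} = \dfrac{1}{J_i}$, for $i = 1$ and 2, which implies
\[ \rho_i = \frac{\sigma_i}{\sigma_i - 1}. \tag{5.46} \]
Plugging into (5.7) with these choices, the controlled Lagrangian is
\[ l_{\tau,\sigma,\rho}(v, \Omega, \dot\alpha_1, \dot\alpha_2) = \tfrac12\Big(m_1v_1^2 + m_2v_2^2 + m_3v_3^2 + \bar I_1\Omega_1^2 + \bar I_2\Omega_2^2 + \lambda_3\Omega_3^2 + \tfrac{1}{\sigma_1}J_1\Omega_1^2 + \tfrac{1}{\sigma_2}J_2\Omega_2^2 + \tfrac{\sigma_1}{\sigma_1 - 1}J_1\big(\Omega_1 + \dot\alpha_1 - \tfrac{1}{\sigma_1}\Omega_1\big)^2 + \tfrac{\sigma_2}{\sigma_2 - 1}J_2\big(\Omega_2 + \dot\alpha_2 - \tfrac{1}{\sigma_2}\Omega_2\big)^2\Big), \tag{5.47} \]
where $\sigma_1, \sigma_2$ are free variables and matching is ensured by Theorem 5.2. The controlled conserved quantities are, for $i = 1, 2$,
\[ \tilde l_i = \frac{\partial l_{\tau,\sigma,\rho}}{\partial\dot\alpha_i} = J_i\Omega_i + \rho_iJ_i\dot\alpha_i. \]

Control Law. By (5.16), the control law is
\[ u_1 = \frac{1}{\sigma_1}J_1\dot\Omega_1, \qquad u_2 = \frac{1}{\sigma_2}J_2\dot\Omega_2. \tag{5.48} \]
The formula (5.20) is used to get the control law with accelerations eliminated. First, from (5.18), the matrix $B$ is diagonal and has diagonal elements $(m_1, m_2, m_3, \bar I_1, \bar I_2, \lambda_3)$. Then, from (5.17), $D$ is computed to be
\[ \begin{pmatrix} D^{\alpha_1\alpha_1} & D^{\alpha_1\alpha_2}\\ D^{\alpha_2\alpha_1} & D^{\alpha_2\alpha_2}\end{pmatrix} = \begin{pmatrix} \frac{1}{J_1} + \frac{1}{\sigma_1\bar I_1} & 0\\ 0 & \frac{1}{J_2} + \frac{1}{\sigma_2\bar I_2}\end{pmatrix}. \]
The control gains $k^\alpha_a$ are found from (5.19). The only nonzero elements of this gain matrix are
\[ k_1 = k^{\Omega_1}_{\alpha_1} = \frac{J_1}{\sigma_1\bar I_1 + J_1}, \qquad k_2 = k^{\Omega_2}_{\alpha_2} = \frac{J_2}{\sigma_2\bar I_2 + J_2}. \tag{5.49} \]
From (5.20), with the right-hand side of the $\Pi$ equation in (5.41) expanded using the momenta above, the control law is
\[ u_1 = k_1\big((\lambda_2 - \lambda_3)\Omega_2\Omega_3 + J_2\dot\alpha_2\Omega_3 + (m_2 - m_3)v_2v_3\big), \qquad u_2 = k_2\big((\lambda_3 - \lambda_1)\Omega_3\Omega_1 - J_1\dot\alpha_1\Omega_3 + (m_3 - m_1)v_3v_1\big). \tag{5.50} \]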
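The explicit control law can be cross-checked against its compact form $u_i = k_i(\Pi \times \Omega + P \times v)_i$, $i = 1, 2$, obtained by inserting the momenta $P_i = m_iv_i$, $\Pi_1 = \lambda_1\Omega_1 + J_1\dot\alpha_1$, $\Pi_2 = \lambda_2\Omega_2 + J_2\dot\alpha_2$, $\Pi_3 = \lambda_3\Omega_3$. The sketch below does this for one arbitrary state; all names and numbers are hypothetical. Note that in the expanded form used here the term $J_1\dot\alpha_1\Omega_3$ in $u_2$ carries a minus sign, since that is what the cross product $-\Pi_1\Omega_3$ produces.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def control_from_momenta(k, m, lam, J, v, Om, ad):
    """u_i = k_i * (Pi x Omega + P x v)_i for i = 1, 2."""
    P = tuple(m[i] * v[i] for i in range(3))
    Pi = (lam[0] * Om[0] + J[0] * ad[0],
          lam[1] * Om[1] + J[1] * ad[1],
          lam[2] * Om[2])
    t_rot, t_lin = cross(Pi, Om), cross(P, v)
    return (k[0] * (t_rot[0] + t_lin[0]), k[1] * (t_rot[1] + t_lin[1]))


def control_explicit(k, m, lam, J, v, Om, ad):
    """Expanded form of the same two torques, term by term."""
    u1 = k[0] * ((lam[1] - lam[2]) * Om[1] * Om[2]
                 + J[1] * ad[1] * Om[2]
                 + (m[1] - m[2]) * v[1] * v[2])
    u2 = k[1] * ((lam[2] - lam[0]) * Om[2] * Om[0]
                 - J[0] * ad[0] * Om[2]
                 + (m[2] - m[0]) * v[2] * v[0])
    return (u1, u2)


# Hypothetical integer data so the comparison is exact.
args = dict(k=(2, 3), m=(5, 4, 3), lam=(7, 6, 5), J=(1, 2),
            v=(1, 2, 3), Om=(2, -1, 3), ad=(4, 5))
u_compact = control_from_momenta(**args)
u_expanded = control_explicit(**args)
```

The two tuples agree for any state, because the expanded form is just the componentwise expansion of the two cross products.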

Stabilization. The family of relative equilibria of interest corresponds to translation along and rotation about the third principal axis of the vehicle, i.e.,
\[ v = \begin{pmatrix} 0\\ 0\\ \bar v\end{pmatrix}, \qquad \Omega = \begin{pmatrix} 0\\ 0\\ \bar\Omega\end{pmatrix}, \qquad \dot\alpha = \begin{pmatrix} 0\\ 0\end{pmatrix}, \tag{5.51} \]
where $\bar v \neq 0$. Assume the third principal axis corresponds to the longest physical axis of the vehicle (and the first principal axis to the shortest physical axis). Then for the uncontrolled vehicle, this family of relative equilibria is unstable. This ordering of lengths implies
\[ m_3 < m_2 < m_1. \tag{5.52} \]
Two Casimir functions for this problem are as follows:
\[ C_1 = P_1\Pi_1 + P_2\Pi_2 + P_3\Pi_3, \qquad C_2 = \tfrac12\big(P_1^2 + P_2^2 + P_3^2\big). \tag{5.53} \]
The Lyapunov function for studying stability can be taken to be of the form
\[ E_{\tilde\Phi,\Psi} = l_{\tau,\sigma,\rho} + \tilde\Phi(C_1, C_2) + \Psi(\tilde l_1, \tilde l_2). \]
To satisfy (5.28) for Theorem 5.3, the first partial derivatives of $\tilde\Phi$ evaluated at the equilibrium should satisfy
\[ \tilde\Phi'_e = \frac{\partial\tilde\Phi}{\partial C_1}\bigg|_e = -\frac{\bar\Omega}{m_3\bar v}, \tag{5.54} \]
\[ \dot{\tilde\Phi}_e = \frac{\partial\tilde\Phi}{\partial C_2}\bigg|_e = -\frac{1}{m_3} + \frac{\lambda_3\bar\Omega^2}{m_3^2\bar v^2}. \tag{5.55} \]
Given this criterion, it remains to show that $N_{\alpha\beta} = G_{\alpha\beta} + G_{\alpha\gamma}\tilde H^{\gamma\delta}G_{\delta\beta}$ can be made definite. In this case the matrix with elements $G_{\alpha\beta} = g_{\alpha\beta} - g_{a\alpha}\rho^{ab}g_{b\beta}$ is computed to be the diagonal matrix
\[ \mathrm{diag}\big(m_1, m_2, m_3, \bar I_1 + J_1/\sigma_1, \bar I_2 + J_2/\sigma_2, \lambda_3\big). \]
The matrix with elements $\tilde H^{\alpha\beta}$ as defined by (5.29) is computed to be
\[ \tilde H = \begin{pmatrix} \dot{\tilde\Phi}_e & 0 & 0 & \tilde\Phi'_e & 0 & 0\\ 0 & \dot{\tilde\Phi}_e & 0 & 0 & \tilde\Phi'_e & 0\\ 0 & 0 & \tilde H^{33} & 0 & 0 & \tilde H^{36}\\ \tilde\Phi'_e & 0 & 0 & 0 & 0 & 0\\ 0 & \tilde\Phi'_e & 0 & 0 & 0 & 0\\ 0 & 0 & \tilde H^{36} & 0 & 0 & \tilde H^{66}\end{pmatrix}, \]
where
\[ \tilde H^{33} = \tilde\Phi''_e\lambda_3^2\bar\Omega^2 + 2\dot{\tilde\Phi}'_e\lambda_3\bar\Omega\,m_3\bar v + \dot{\tilde\Phi}_e + \ddot{\tilde\Phi}_e m_3^2\bar v^2, \]
\[ \tilde H^{36} = \tilde\Phi'_e + \tilde\Phi''_e\lambda_3\bar\Omega\,m_3\bar v + \dot{\tilde\Phi}'_e m_3^2\bar v^2, \]
\[ \tilde H^{66} = \tilde\Phi''_e m_3^2\bar v^2. \]
Here primes denote derivatives with respect to $C_1$ and dots denote derivatives with respect to $C_2$. Using this, the matrix with elements $N_{\alpha\beta}$ can be determined to be
\[ N = \begin{pmatrix} m_1 + m_1^2\dot{\tilde\Phi}_e & 0 & 0 & m_1\Delta_1\tilde\Phi'_e & 0 & 0\\ 0 & m_2 + m_2^2\dot{\tilde\Phi}_e & 0 & 0 & m_2\Delta_2\tilde\Phi'_e & 0\\ 0 & 0 & m_3 + m_3^2\tilde H^{33} & 0 & 0 & m_3\lambda_3\tilde H^{36}\\ m_1\Delta_1\tilde\Phi'_e & 0 & 0 & \Delta_1 & 0 & 0\\ 0 & m_2\Delta_2\tilde\Phi'_e & 0 & 0 & \Delta_2 & 0\\ 0 & 0 & m_3\lambda_3\tilde H^{36} & 0 & 0 & \lambda_3 + \lambda_3^2\tilde H^{66}\end{pmatrix}, \]
where
\[ \Delta_i = \bar I_i + \frac{J_i}{\sigma_i} = \frac{\bar I_i}{1 - k_i}. \]
The first three diagonal elements of $N$ are
\[ N_{11} = m_1^2\Big(\frac{1}{m_1} - \frac{1}{m_3}\Big) < 0, \qquad N_{22} = m_2^2\Big(\frac{1}{m_2} - \frac{1}{m_3}\Big) < 0, \qquad N_{33} = \ddot{\tilde\Phi}_e m_3^4\bar v^2 < 0, \]
where the fact that $m_3 < m_2 < m_1$ has been used and $\ddot{\tilde\Phi}_e < 0$ has been chosen. Since these first three diagonal elements are negative, $N_{\alpha\beta}$ should be made negative definite.

Consider the special case of equilibrium in which $\bar\Omega = 0$. This is a practical choice as it corresponds to the vehicle translating along its long axis but not spinning. For this equilibrium, $N_{\alpha\beta}$ is negative definite if we take
\[ \dot{\tilde\Phi}'_e = 0, \qquad \tilde\Phi''_e < -\big(\lambda_3m_3^2\bar v^2\big)^{-1}, \qquad \Delta_1 < 0, \qquad\text{and}\qquad \Delta_2 < 0. \]
The conditions on $\Delta_1$ and $\Delta_2$ hold if and only if $k_1 > 1$ and $k_2 > 1$. Therefore, Theorem 5.3 gives the following result.

5.5 Proposition. For $k_1 > 1$ and $k_2 > 1$, the equilibrium $(0, 0, \bar v, 0, 0, 0, 0, 0)$, for $\bar v \neq 0$, is nonlinearly stable for the feedback controlled system, with $u$ as in (5.50).

Asymptotic stability and dissipative controls for the underwater vehicle with rotors are discussed in detail in Woolsey and Leonard [1999a] and in the presence of viscous fluid drag in Woolsey and Leonard [1999b, 2000].
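For the non-spinning equilibrium ($\bar\Omega = 0$) the definiteness test becomes elementary: the first derivative of $\tilde\Phi$ with respect to $C_1$ vanishes at the equilibrium, and if the mixed second derivative is also set to zero the matrix $N$ is diagonal. The sketch below lists that diagonal for one hypothetical parameter set and checks the relation between the modified inertia $\bar I_i + J_i/\sigma_i$ and the gain $k_i = J_i/(\sigma_i\bar I_i + J_i)$. All function names and values are assumptions for illustration.

```python
from fractions import Fraction


def delta(I_bar, J, sigma):
    """Modified inertia Delta = I_bar + J/sigma."""
    return I_bar + J / sigma


def gain(I_bar, J, sigma):
    """Control gain k = J / (sigma*I_bar + J)."""
    return J / (sigma * I_bar + J)


def N_diagonal(m, lam3, deltas, v_bar, Phi_C2C2, Phi_C1C1):
    """Diagonal of N when Omega_bar = 0 and the mixed derivative is zero.

    Phi_C2C2 and Phi_C1C1 are the second derivatives of Phi with respect
    to C2 and C1 at the equilibrium; the first derivative with respect to
    C2 is fixed at -1/m3 by the first-variation condition.
    """
    m1, m2, m3 = m
    Phi_C2 = -1 / m3
    return (m1 + m1 ** 2 * Phi_C2,
            m2 + m2 ** 2 * Phi_C2,
            m3 + m3 ** 2 * (Phi_C2 + Phi_C2C2 * m3 ** 2 * v_bar ** 2),
            deltas[0],
            deltas[1],
            lam3 + lam3 ** 2 * Phi_C1C1 * m3 ** 2 * v_bar ** 2)


# Hypothetical data with m3 < m2 < m1 and sigma chosen so that k > 1.
m = (Fraction(5), Fraction(4), Fraction(3))
I_bar, J, sigma = Fraction(3), Fraction(1), Fraction(-1, 4)
k = gain(I_bar, J, sigma)
d = delta(I_bar, J, sigma)
diag = N_diagonal(m, lam3=Fraction(2), deltas=(d, d), v_bar=Fraction(1),
                  Phi_C2C2=Fraction(-1), Phi_C1C1=Fraction(-1))
```

Since the matrix is diagonal in this special case, negative definiteness is just negativity of every entry of `diag`; here the choice of `Phi_C1C1` satisfies the bound $-1/(\lambda_3m_3^2\bar v^2) = -1/18$.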

6 Conclusions

In this paper we have presented some of the beautiful connections between control theory, mechanics and symmetry as exemplified in the work of Jerry Marsden. This is by no means his only work in the area, however. For example, Jerry has also done fundamental work in nonholonomic mechanical control systems (see Bloch, Krishnaprasad, Marsden, and Murray [1996]).

In work with Jalnapurkar (Jalnapurkar and Marsden [1999, 2000]), Marsden analyzed the problem of stabilizing relative equilibria of general underactuated mechanical systems with symmetry. In Jalnapurkar and Marsden [2000] internal actuation only is considered (i.e., essentially a change of shape of the mechanical system) and this is used to shape the potential, leading to stabilization. This work uses a combination of the potential shaping ideas of van der Schaft [1986], and the energy-momentum method for stabilization due to Simó, Lewis, and Marsden [1991]. As an example it is shown how to asymptotically stabilize the "cowboy" equilibrium of the double spherical pendulum. In Jalnapurkar and Marsden [1999] this work is extended to the case of "external" actuation, or actuation in the group variables. This means that one is no longer restricted to constant momentum surfaces and one can achieve stabilization of a relative equilibrium in a full phase space neighborhood modulo a group action.

In Koon and Marsden [1997], Koon and Marsden studied necessary conditions for optimal control systems with symmetry using the ideas of Lagrangian reduction. They showed that by using Lagrangian reduction for optimal control one can essentially achieve in one step the two steps of using the Pontryagin maximum principle and then using Poisson reduction. They applied this methodology both to unconstrained systems with zero angular momentum such as the satellite with moveable masses and to nonholonomic constrained systems such as the skateboard. In this setting there is a nontrivial momentum equation as derived in Bloch, Krishnaprasad, Marsden, and Murray [1996].

In Zenkov, Bloch, and Marsden [1999], stabilization of nonholonomic systems was considered. This paper analyzes stabilization of a nonholonomic system consisting of a unicycle with rider. It is shown that one can achieve stability of slow steady vertical motions by imposing a feedback control force on the rider's limb. This work uses techniques from the theory of stability of nonholonomic dynamical systems that are given in Zenkov, Bloch, and Marsden [1998]. In particular, use is made of the Lyapunov–Malkin theorem, a variant of the center manifold theorem which turns out to be very useful for analyzing nonholonomic systems which exhibit dissipative behavior. In Zenkov, Bloch, Leonard, and Marsden [2000] a start is made in applying matching theory in the nonholonomic setting.

There is still much to be done in these areas and we are sure that much will be done!

Acknowledgments: The research work of Anthony M. Bloch was partially supported by NSF grants DMS 981283 and 0103895 and AFOSR. The research work of Naomi E. Leonard was partially supported by NSF grant CCR-9980058, ONR grant N00014-98-1-0649 and AFOSR grant F4962001-1-0382.

References

Åström, K. J. and K. Furuta [1996], Swinging up a pendulum by energy control, IFAC World Congress, San Francisco 13.

Abraham, R. and J. E. Marsden [1978], Foundations of Mechanics, Second Edition, Addison-Wesley.

Auckly, D., L. Kapitanski, and W. White [2000], Control of nonlinear underactuated systems, Comm. Pure Appl. Math., 53, 354–369.

Baillieul, J. [1999], Matching conditions and geometric invariants for second-order control systems, in Proc. IEEE Conf. Decision and Control 38, 1664–1670.

Bloch, A. M., D. E. Chang, N. E. Leonard, and J. E. Marsden [2001], Controlled Lagrangians and the stabilization of mechanical systems II: Potential shaping, IEEE Trans. Automatic Control 46, 1556–1571.

Bloch, A. M., D. E. Chang, N. E. Leonard and J. E. Marsden [2000], Potential and kinetic shaping for control of underactuated mechanical systems, in Proc. American Control Conf., 3913–3917.

Bloch, A. M., D. E. Chang, N. E. Leonard, J. E. Marsden and C. A. Woolsey [2000], Asymptotic stabilization of Euler–Poincaré mechanical systems, in Lagrangian and Hamiltonian Methods for Nonlinear Control: A Proceedings Volume from the IFAC Workshop (N. E. Leonard and R. Ortega eds.), Pergamon, 51–56.

Bloch, A. M. and P. E. Crouch [1999], Optimal control, optimization and analytical mechanics, in Mathematical Control Theory (J. Baillieul and J. Willems eds.), 268–321, Springer.

Bloch, A. M., P. S. Krishnaprasad, J. E. Marsden, and R. Murray [1996], Nonholonomic mechanical systems with symmetry, Arch. Rat. Mech. An., 136, 21–99.

Bloch, A. M., P. S. Krishnaprasad, J. E. Marsden, and G. Sánchez de Alvarez [1992], Stabilization of rigid body dynamics by internal and external torques, Automatica 28, 745–756.

Bloch, A. M., N. E. Leonard and J. E. Marsden [1997], Stabilization of mechanical systems using controlled Lagrangians, Proc. IEEE Conf. Decision and Control 36, 2356–2361.

Bloch, A. M., N. E. Leonard and J. E. Marsden [1998], Matching and stabilization by the method of controlled Lagrangians, Proc. IEEE Conf. Decision and Control 37, 1446–1451.

Bloch, A. M., N. E. Leonard and J. E. Marsden [1999a], Stabilization of the pendulum on a rotor arm by the method of controlled Lagrangians, Proc. IEEE Int. Conf. Robotics and Automation, 500–505.

Bloch, A. M., N. E. Leonard and J. E. Marsden [1999b], Potential shaping and the method of controlled Lagrangians, Proc. IEEE Conf. Decision and Control 38, 1653–1657.

Bloch, A. M., N. E. Leonard and J. E. Marsden [2000], Controlled Lagrangians and the stabilization of mechanical systems I: The first matching theorem, IEEE Trans. Automatic Control 45, 2253–2270.

Bloch, A. M., N. E. Leonard and J. E. Marsden [2001], Controlled Lagrangians and the stabilization of Euler–Poincaré mechanical systems, Int. J. Nonlinear and Robust Control 11, 191–214.

Bloch, A. M., J. E. Marsden and G. Sánchez de Alvarez [1997], Stabilization of relative equilibria of mechanical systems with symmetry, Current and Future Directions in Applied Mathematics, (M. Alber, B. Hu, and J. Rosenthal, eds.), Birkhäuser, 43–64.

Bloch, A. M. and J. E. Marsden [1989], Controlling homoclinic orbits, Theoretical and Computational Fluid Dynamics 1, 179–190.

Brockett, R. W. [1976], Control theory and analytical mechanics, in 1976 Ames Research Center (NASA) Conference on Geometric Control Theory, (R. Hermann and C. Martin, eds.), Lie Groups: History Frontiers and Applications, 7, Math. Sci. Press, Brookline, Mass., USA.

Grizzle, J. W. and S. Marcus [1985], The structure of nonlinear control systems possessing symmetries, IEEE Trans. Automatic Control 30, 248–258.

Hamberg, J. [1999], General matching conditions in the theory of controlled Lagrangians, Proc. IEEE Conf. Decision and Control 38, 2519–2523.

Holmes, P., J. Jenkins, and N. E. Leonard [1998], Dynamics of the Kirchhoff equations I: Coincident centers of gravity and buoyancy, Physica D, 118, 311–342.

Jalnapurkar, S. M. and J. E. Marsden [1999], Stabilization of relative equilibria II, Reg. and Chaotic Dyn. 3, 161–179.

Jalnapurkar, S. M. and J. E. Marsden [2000], Stabilization of relative equilibria, IEEE Trans. Automatic Control 45, 1483–1491.

Koon, W. S. and J. E. Marsden [1997], Optimal control for holonomic and nonholonomic mechanical systems with symmetry and Lagrangian reduction, SIAM J. Control and Optim., 35, 901–929.

Koon, W-S., M. Lo, J. E. Marsden and S. D. Ross [2000], Heteroclinic connections between periodic orbits and resonance transitions in celestial mechanics, Chaos, 10, 427–469.

Krishnaprasad, P. S. [1985], Lie–Poisson structures, dual-spin spacecraft and asymptotic stability, Nonl. Anal. Th. Meth. and Appl. 9, 1011–1035.

Leonard, N. E. [1997a], Stability of a bottom-heavy underwater vehicle, Automatica 33, 331–346.

Leonard, N. E. [1997b], Stabilization of underwater vehicle dynamics with symmetry-breaking potentials, Systems and Control Letters 32, 35–42.

Leonard, N. E. and J. E. Marsden [1997], Stability and drift of underwater vehicle dynamics: Mechanical systems with rigid motion symmetry, Physica D 105, 130–162.

Leonard, N. E. and C. A. Woolsey [1998], Internal actuation for intelligent underwater vehicle control, Proc. 10th Yale Workshop on Adaptive and Learning Systems, 295–300.

Lewis, A. and R. Murray [1997], Configuration controllability of simple mechanical systems, SIAM J. Control and Optimization 41, 555–574.

Marsden, J. E. [1992], Lectures on Mechanics, London Mathematical Society Lecture Note Series 174, Cambridge University Press.

Marsden, J. E. and T. S. Ratiu [1999], Introduction to Mechanics and Symmetry, Texts in Applied Mathematics, 17, Springer-Verlag, 1994. 2nd Ed.

Marsden, J. E. and A. Weinstein [1974], Reduction of symplectic manifolds with symmetry, Rep. Math. Phys. 5, 121–130.

Nijmeijer, H. and A. van der Schaft [1990], Nonlinear dynamical control systems, Springer-Verlag, New York.

Ortega, R., A. Loria, P. J. Nicklasson and H. Sira-Ramirez [1998], Passivity-based Control of Euler–Lagrange Systems, Springer-Verlag, Communication & Control Engineering Series.

Ortega, R., A. van der Schaft and B. Maschke [1999], Stabilization of port-controlled Hamiltonian systems via energy balancing, in Stability and Stabilization of Nonlinear Systems, Springer, London, 239–260.

Ortega, R., A. van der Schaft, B. Maschke, and G. Escobar [1999], Energy-shaping of port-controlled Hamiltonian systems by interconnection, in Proc. IEEE Conf. Decision and Control 38, 1646–1651.

Sánchez de Alvarez, G. [1986], Ph.D. Thesis, Berkeley.

Simó, J. C., D. K. Lewis and J. E. Marsden [1991], Stability of relative equilibria I: The reduced energy momentum method, Arch. Rat. Mech. Anal. 115, 15–59.

van der Schaft, A. J. [1981], Symmetries and conservation laws for Hamiltonian systems with inputs and outputs: A generalization of Noether's theorem, Syst. Contr. Letters 1, 108–115.

van der Schaft, A. J. [1983], System theoretic descriptions of physical systems, Doct. Dissertation, University of Groningen, also CWI Tract #3, CWI, Amsterdam.

van der Schaft, A. J. [1986], Stabilization of Hamiltonian systems, Nonlinear Analysis, Theory, Methods and Applications, 10, 1021–1035.

Wang, L. S. and P. S. Krishnaprasad [1992], Gyroscopic control and stabilization, J. Nonlinear Sci. 2, 367–415.

Willems, J. C. [1972], Dissipative dynamical systems, Arch. Rat. Mech. and Anal. 45, 321–351.

Willems, J. C. [1979], System theoretic models for the analysis of physical systems, Ricerche di Automatica 10, 71–106.

Woolsey, C. A. and N. E. Leonard [1999a], Underwater vehicle stabilization by internal rotors, Proc. American Control Conf., 3417–3421.

Woolsey, C. A. and N. E. Leonard [1999b], Global asymptotic stabilization of an underwater vehicle using internal rotors, Proc. IEEE Conf. Decision and Control 38, 2527–2532.

Woolsey, C. A. and N. E. Leonard [2002], Stabilizing underwater vehicle motion using internal rotors, Automatica, (to appear).

Zenkov, D. V., A. M. Bloch and J. E. Marsden [1998], The energy momentum method for the stability of nonholonomic systems, Dyn. Stab. of Systems, 13, 123–166.

Zenkov, D. V., A. M. Bloch, and J. E. Marsden [1999], Stabilization of the unicycle with rider, Proc. IEEE Conf. Decision and Control 38, 3470–3471.

Zenkov, D. V., A. M. Bloch, N. E. Leonard and J. E. Marsden [2000], Matching and stabilization of the unicycle with rider, in Lagrangian and Hamiltonian Methods for Nonlinear Control: A Proceedings Volume from the IFAC Workshop (N. E. Leonard and R. Ortega eds.), Pergamon, 177–178.

Part VI

Relativity and Quantum Mechanics

15 Conformal Volume Collapse of 3-Manifolds and the Reduced Einstein Flow

Arthur E. Fischer
Vincent Moncrief

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT We consider the problem of the Hamiltonian reduction of Einstein's equations on a (3 + 1)-vacuum spacetime that admits a foliation by constant mean curvature compact spacelike hypersurfaces $M$ of Yamabe type −1. After a conformal reduction process, we find that the reduced Einstein flow is described by a time-dependent non-local dimensionless reduced Hamiltonian $H_{\rm reduced}$ which is strictly monotonically decreasing along any non-constant integral curve of the reduced Einstein system. We establish relationships between $H_{\rm reduced}$, the σ-constant of $M$, and the Gromov norm $\|M\|$, show that $H_{\rm reduced}$ has a global minimum at a hyperbolic critical point if and only if the hyperbolic σ-conjecture is true, and show that for rigid hyperbolizable $M$, the hyperbolic fixed point of the reduced Einstein flow is a local attractor. We consider as examples Bianchi models that spatially compactify to manifolds of Yamabe type −1 and show that for the non-hyperbolizable models, the reduced Einstein flow volume-collapses the 3-manifold $M$ along either circular fibers, embedded tori, or completely to a point, as suggested by conjectures in 3-manifold topology. Remarkably, in each of these cases of collapse, the collapse occurs with bounded curvature.

Contents
1 Introduction
2 Some Background Information
3 Hyperbolizable Manifolds and Manifolds of Yamabe Type −1
4 The Reduced Phase Space $P_{\rm reduced}$
5 The Lichnerowicz Transform
6 The Restricted Volume Functional ${\rm vol}_{-1}$
7 The σ-Constant of $M$
8 The Gromov Norm $\|M\|$
9 The Conformal Volume Collapse of 3-Manifolds
10 The Reduced Hamiltonian and Its Properties
11 Reduction from the Spacetime Point of View
12 Geometrization of 3-Manifolds
13 The Hyperbolic Fixed Point, Warped Products, and Lorentz Cones
14 CMC Foliation of $I^+$
15 The Standard Models $I^+/\Gamma$
16 Rigid and Non-Rigid Standard Models
17 Asymptotic Approach to Hyperbolic Geometry
18 The Hyperbolic Fixed Point is a Local Attractor
19 Collapse of Bianchi Models with $\sigma(M) = 0$
20 The Reduced ADM-Hamiltonian
References

1 Introduction

An important issue in classical general relativity is how to reduce the canonical formulation of Einstein’s vacuum field equations to an unconstrained or free Hamiltonian dynamical system on a suitably constructed phase space. In the resulting system, the usual constraint equations should be effectively factored out and, after gauge conditions have been imposed, the full classical Hamiltonian system with constraints should be reduced to a free Hamiltonian system without constraints. Under certain topological and geometrical conditions, this program has been carried out by the authors in Fischer and Moncrief [1996–2000]. In this paper we discuss the results and further applications of that program to issues regarding the collapse of 3-manifolds under the reduced Einstein flow.

The conditions under which (3 + 1)-reduction has been successful assume that the vacuum spacetime admits a foliation by constant mean curvature (CMC) compact spacelike hypersurfaces of Yamabe type $-1$ (described in Section 3). The reduction program then proceeds in two steps. The first step involves finding a suitable reduced phase space of unconstrained dynamical degrees of freedom. The second step involves finding a reduced Hamiltonian on this phase space which is the true non-vanishing Hamiltonian of the theory. We find that under the aforementioned conditions, (3 + 1)-dimensional reduction can be carried out by introducing the reduced “phase” space
\[ P_{\mathrm{reduced}} = \{ (\gamma, p^{TT}) \mid \gamma \in \mathcal{M}_{-1} \text{ and } p^{TT} \in (S^2_d)^{TT}_{\gamma} \} \tag{1.1} \]

where $\mathcal{M}_{-1}$ is the space of Riemannian metrics with constant scalar curvature $R(\gamma) = -1$ and where $(S^2_d)^{TT}_{\gamma}$ is the space of symmetric 2-contravariant tensor density fields on $M$ that are transverse (i.e., divergenceless) and traceless with respect to $\gamma \in \mathcal{M}_{-1}$. We remark that the fully reduced phase space is $P_{\mathrm{reduced}}/\mathcal{D}$, where $\mathcal{D}$ is the group of diffeomorphisms of $M$, but here for ease of exposition we work on the space $P_{\mathrm{reduced}}$ itself. Let $\mathbb{R}$ denote the reals, let $\mathbb{R}^+ = (0, \infty)$, and let $\mathbb{R}^- = (-\infty, 0)$. Associated with $P_{\mathrm{reduced}}$ is the contact manifold $\mathbb{R}^- \times P_{\mathrm{reduced}}$ with contact variables $(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}$. After choosing a temporal coordinate gauge condition $\tau(t) = -(2/(3t))^{1/2}$, one introduces a non-local time-dependent reduced Hamiltonian
\[ H_{\mathrm{reduced}} : \mathbb{R}^- \times P_{\mathrm{reduced}} \longrightarrow \mathbb{R}, \qquad (\tau, \gamma, p^{TT}) \longmapsto -\tau^3 \int_M \varphi^6 \mu_\gamma \]

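where $\varphi = \varphi(\tau, \gamma, p^{TT}) > 0$ is the positive Lichnerowicz conformal factor that is used to transform the free, or unconstrained, reduced variables $(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}$ to the CMC-constrained physical variables $(g, \pi) \in \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}$, and where the reduced variable $\tau$ coincides with the constant mean curvature $\tau(g, \pi)$ of the physical variables $(g, \pi)$. The transform itself, from the reduced variables to the physical variables,
\[ \mathcal{L} : \mathbb{R}^- \times P_{\mathrm{reduced}} \longrightarrow \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}, \qquad (\tau, \gamma, p^{TT}) \longmapsto \mathcal{L}(\tau, \gamma, p^{TT}) = (g, \pi) \]
is an ilh diffeomorphism and is central to conformal reduction. We refer to it as the Lichnerowicz transform. Fixing $\tau \in \mathbb{R}^-$, we also get the diffeomorphism
\[ \mathcal{L}_\tau : P_{\mathrm{reduced}} \longrightarrow \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau, \qquad (\gamma, p^{TT}) \longmapsto \mathcal{L}(\tau, \gamma, p^{TT}) = (g, \pi) \]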

either space of which gives (modulo diffeomorphisms of $M$) a parameterization of the space of isometry classes of maximal globally hyperbolic CMC sliced Ricci-flat spacetimes with topology $\mathbb{R}^- \times M$, where $(g, \pi)$ resides on the unique hypersurface in the spacetime with constant mean curvature $\tau$. With respect to the reduced Hamiltonian, the resulting reduced Einstein equations (modulo diffeomorphisms of $M$) are given by the following time-dependent unconstrained Hamiltonian system,
\[ \frac{\partial \gamma}{\partial t} = \frac{\delta H_{\mathrm{reduced}}}{\delta p^{TT}}(\tau, \gamma, p^{TT}), \qquad \frac{\partial p^{TT}}{\partial t} = -\frac{\delta H_{\mathrm{reduced}}}{\delta \gamma}(\tau, \gamma, p^{TT}) \tag{1.2} \]

where $\delta H_{\mathrm{reduced}}/\delta\gamma$ and $\delta H_{\mathrm{reduced}}/\delta p^{TT}$ denote the functional derivatives of $H_{\mathrm{reduced}}$ with respect to the variables $\gamma$ and $p^{TT}$, respectively. The resulting flow on $P_{\mathrm{reduced}}$ is the reduced Einstein flow. In this paper we discuss various properties of this system of equations and its flow.

An important subclass of manifolds of Yamabe type $-1$ are the hyperbolizable ones, namely, those that admit a hyperbolic metric $\tilde\gamma \in \mathcal{M}_{-1}$, normalized so that the scalar curvature $R(\tilde\gamma) = -1$ (rather than the usual normalization that the sectional curvature is $-1$). Let $(\tilde\gamma, 0) \in P_{\mathrm{reduced}}$ denote the corresponding hyperbolic point in $P_{\mathrm{reduced}}$. In order to understand the phase portrait of any dynamical system, one often looks first for the presence or absence of equilibrium points (i.e., zeros) of the system which correspond to fixed points of its flow, how these points are distributed, and what their stability properties are. One of our main results is that the reduced Einstein flow has a fixed point if and only if $M$ is hyperbolizable, in which case the fixed point is given by the hyperbolic point $(\tilde\gamma, 0) \in P_{\mathrm{reduced}}$. By Mostow rigidity, $(\tilde\gamma, 0)$ is unique up to isometry and so we refer to $(\tilde\gamma, 0)$ as the hyperbolic fixed point of the reduced Einstein flow. Thus, very simply, the reduced Einstein flow has either no fixed points (if $M$ is not hyperbolizable) or has a unique (up to isometry) fixed point if $M$ is hyperbolizable. Thus, importantly, the underlying topology of the universe controls in an essential way the dynamics of the reduced Einstein flow.

Using recent results of Anderson and Moncrief [2001] (see also Anderson [2000]), it can be shown that when a rigid hyperbolic fixed point is present, it is a local attractor for the reduced Einstein flow. Thus (up to isometry) reduced Cauchy data near the hyperbolic fixed point $(\tilde\gamma, 0)$ asymptotically approach $(\tilde\gamma, 0)$. Moreover, as suggested by the strict monotonic decay of $H_{\mathrm{reduced}}$ away from the fixed point, we conjecture that a hyperbolic fixed point is a local attractor even when $M$ is not rigid.
If $M$ is hyperbolizable and if $g_{r_0}$ is a hyperbolic metric on $M$ with hyperbolic radius $r_0 > 0$ (and with sectional curvature $K(g_{r_0}) = -1/r_0^2$), then the hyperbolic manifold $(M, g_{r_0})$ is isometric to $\mathbb{H}^3_{r_0}/\Gamma$, where $\mathbb{H}^3_{r_0} \subset I^+$ is the hyperboloid of hyperbolic radius $r_0$ in $I^+$, the interior of the future pointing light cone in Minkowski space, and where $\Gamma \approx \pi_1(M)$ is a discrete torsion-free co-compact subgroup of the proper orthochronous Lorentz group $L^\uparrow_+ = SO^\uparrow(1, 3)$. The spacetime corresponding to the hyperbolic fixed point $(\tilde\gamma, 0)$ is the standard model $I^+/\Gamma$, which is isometric to the Lorentz cone $\mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0})$ over $(M, g_{r_0})$,
\[ I^+/\Gamma \cong \mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0}) \]
where the Lorentz cone is a warped product Lorentz manifold of $\mathbb{R}^+_1 = (\mathbb{R}^+, -1)$ and $(M, g_{r_0})$, and where either spacetime is a flat spatially compact globally hyperbolic CMC spacetime which is future causally geodesically complete and past causally geodesically incomplete in the contracting direction. At a hyperbolic fixed point $(\tilde\gamma, 0) \in P_{\mathrm{reduced}}$, for arbitrary $\tau \in \mathbb{R}^-$,
\[ H_{\mathrm{reduced}}(\tau, \tilde\gamma, 0) = \left(\tfrac{3}{2}\right)^{3/2} \mathrm{vol}(M, \tilde\gamma) = 3^3\, v_3 \|M\| \tag{1.3} \]

where $\mathrm{vol}(M, \tilde\gamma)$ is the volume of the hyperbolic manifold $(M, \tilde\gamma)$, $\|M\|$ is the Gromov norm of $M$, an important topological invariant of $M$, and $v_3$ is a positive constant that depends only on the dimension $n = 3$ (see Section 8). Thus $H_{\mathrm{reduced}}$ is a constant topological invariant on the ray $(\tau, \tilde\gamma, 0) \subset \mathbb{R}^- \times P_{\mathrm{reduced}}$, $\tau \in \mathbb{R}^-$. Moreover, (up to isometry) this ray is a strict local minimum of $H_{\mathrm{reduced}}$ and is a strict global minimum if and only if the hyperbolic $\sigma$-conjecture is true, an important conjecture in 3-dimensional topology which asserts that $\mathrm{vol}(M, \tilde\gamma) = |\sigma(M)|^{3/2}$, where $\sigma(M)$ is the $\sigma$-constant, another topological invariant of $M$.

Away from the hyperbolic fixed point, along any non-constant integral curve $(\gamma(t), p^{TT}(t)) \in P_{\mathrm{reduced}}$ of the reduced Einstein equations, $H_{\mathrm{reduced}}(\tau(t), \gamma(t), p^{TT}(t))$ is strictly monotonically decreasing, where $\tau(t) = -(2/(3t))^{1/2}$ is the temporal coordinate gauge condition, and moreover
\[ \inf H_{\mathrm{reduced}} \equiv \inf_{(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}} H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) = \left(-\tfrac{3}{2}\sigma(M)\right)^{3/2} \tag{1.4} \]

Since $H_{\mathrm{reduced}}$ is strictly monotonically decreasing along non-constant integral curves of the reduced Einstein flow, we expect that under certain conditions the reduced Hamiltonian is monotonically seeking to decay to its infimum $\inf H_{\mathrm{reduced}} = (-\tfrac{3}{2}\sigma(M))^{3/2}$, in which case it follows that the conformal volume $\mathrm{vol}(M, \gamma(t))$ must also decay to its infimum,
\[ \mathrm{vol}(M, \gamma(t)) \longrightarrow (-\sigma(M))^{3/2} \]
In the case that $M$ is hyperbolizable and that $(\tilde\gamma, 0)$ is the hyperbolic fixed point, this expectation regarding $H_{\mathrm{reduced}}$ and the conformal volume is satisfied if and only if the hyperbolic $\sigma$-conjecture is true, in which case, since the fixed point is a local attractor, this expectation is also satisfied if the initial Cauchy data $(\gamma, p^{TT})$ is near the fixed point $(\tilde\gamma, 0)$.

Since $M$ is of Yamabe type $-1$, $\sigma(M) \le 0$. Thus, more generally, if $H_{\mathrm{reduced}} \to \inf H_{\mathrm{reduced}}$ and if $\sigma(M) = 0$, then the curve $\gamma(t)$ of conformal metrics must volume collapse $M$ in the direction of cosmological expansion. Thus if $H_{\mathrm{reduced}}$ asymptotically approaches its infimum and $\sigma(M) = 0$, the reduced Einstein flow predicts the conformal volume collapse of $M$
\[ \mathrm{vol}(M, \gamma(t)) \longrightarrow 0 \]
We note however that the physical volume
\[ \mathrm{vol}(M, g(t)) \longrightarrow \infty \]
is not collapsing but is going to infinity as befits an expanding universe. Thus the dynamics in $P_{\mathrm{reduced}}$, driving the dynamics in the physical space,


provides an underlying and very different reality from the dynamics in the physical space.

Since the reduced Hamiltonian is monotonically decreasing in the direction of cosmological expansion and since the infimum of the reduced Hamiltonian determines $\sigma(M)$, it is reasonable to ask how the reduced Einstein flow behaves in known cases in the limit of infinite cosmological expansion. To answer this question, we have considered the vacuum solutions of Einstein’s equations which spatially compactify to manifolds of Yamabe type $-1$. These models are the five Bianchi models of types II, III, V and $\mathrm{VII}_h$, $\mathrm{VI}_0$, and VIII, which in turn correspond in Thurston’s classification to manifolds of type Nil, $\mathbb{H}^2 \times \mathbb{R}$, $\mathbb{H}^3$, Sol, and $\widetilde{SL}(2, \mathbb{R})$, respectively. We have shown by explicit calculation, using the known solutions, that in the four non-hyperbolizable cases where $\sigma(M) = 0$, the reduced Hamiltonian asymptotically approaches 0 under the reduced Einstein flow, thereby satisfying the assumption that it asymptotically approaches its infimum. Thus in these cases the reduced Einstein flow volume-collapses the 3-manifold. By explicit calculation one finds that this collapse occurs along either circular fibers, embedded tori, or in one case of total collapse, the entire manifold is collapsed to a point. Remarkably, in each of these cases of collapse, the collapse occurs with bounded curvature, precisely as occurs in the theory of collapsing Riemannian manifolds (Cheeger and Gromov [1986, 1990]).

Thus it seems plausible that in certain cases the reduced Einstein flow may induce a decomposition of $M$ into geometric pieces. Indeed, certain topological conjectures of Anderson [1993, 1997, 1999] relating to the geometrization program of 3-manifolds predict how a sequence of geometries with bounded curvature approaching $\sigma(M)$ degenerates.
Assuming these conjectures, the asymptotic behavior of large classes of Einstein spacetimes may perhaps be characterized rather explicitly in terms of the geometrization program of 3-manifolds. Conversely, it is conceivable that the Einstein flow could be used to try to establish some form of the geometrization conjectures for 3-manifolds, much like Hamilton [1995] used the Ricci flow in a positive way to try to establish the Poincaré conjecture. Further details of the results presented here can be found in Fischer and Moncrief [1996–2002].
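As a numerical sanity check on the fixed-point value (1.3), the following Python sketch evaluates the reduced Hamiltonian at the hyperbolic point. It is only an illustration under stated assumptions: it assumes that for $p^{TT} = 0$ the Lichnerowicz equation (5.1) of Section 5 is solved by a constant $\varphi$, so that the Laplacian term drops out and $\varphi^4 = 3/(2\tau^2)$, and it uses the homothety factor $6^{3/2}$ relating the volume of a metric with $R = -1$ to that of the unit hyperbolic metric with $K = -1$. All function names are hypothetical and are not taken from the paper.

```python
import math


def constant_conformal_factor(tau: float) -> float:
    """Constant positive root of -phi/8 + tau^2 phi^5 / 12 = 0 (case p^TT = 0)."""
    if tau >= 0:
        raise ValueError("tau must be negative (tau in R^-)")
    return (3.0 / (2.0 * tau * tau)) ** 0.25


def lichnerowicz_residual(tau: float, phi: float) -> float:
    """Left side of (5.1) for constant phi and p^TT = 0 (Laplacian term vanishes)."""
    return -phi / 8.0 + tau * tau * phi ** 5 / 12.0


def reduced_hamiltonian_at_fixed_point(tau: float, conformal_volume: float) -> float:
    """H_reduced = -tau^3 * phi^6 * vol(M, gamma) when phi is constant on M."""
    phi = constant_conformal_factor(tau)
    return -(tau ** 3) * phi ** 6 * conformal_volume


def fixed_point_value_from_unit_volume(unit_hyperbolic_volume: float) -> float:
    """Rescale a K = -1 volume to the R = -1 normalization, then apply (3/2)^(3/2)."""
    volume_with_scalar_curvature_minus_one = 6.0 ** 1.5 * unit_hyperbolic_volume
    return 1.5 ** 1.5 * volume_with_scalar_curvature_minus_one


if __name__ == "__main__":
    for tau in (-0.1, -1.0, -7.5):
        # The value should not depend on tau, as stated below (1.3).
        print(tau, reduced_hamiltonian_at_fixed_point(tau, 2.0))
```

Under these assumptions the value is independent of $\tau$, and the two numerical factors combine as $(3/2)^{3/2} \cdot 6^{3/2} = 3^3$, which is the coefficient appearing in (1.3).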

2 Some Background Information

Throughout this paper $M$ will denote a smooth ($C^\infty$) closed (compact without boundary) connected orientable $n$-manifold, $n \ge 3$. For the most part we will be interested in the case $n = 3$, but several of our constructions and results remain true in the more general case $n \ge 3$. Occasionally we shall also consider non-compact $M$ and $n \ge 2$ but we shall always mention


this explicitly. We let “$\approx$” denote diffeomorphic, or isomorphic for groups, and we let “$\cong$” denote isometric for either Riemannian or Lorentz manifolds. Let $\mathcal{M} = \mathrm{Riem}(M)$ denote the space of smooth Riemannian metrics on $M$, let $S_2 = S_2(M)$ denote the space of smooth fields of symmetric 2-covariant tensors on $M$, and let $S^2_d = S^2_d(M)$ denote the space of smooth fields of symmetric 2-contravariant tensor densities on $M$. Let $T\mathcal{M} \approx \mathcal{M} \times S_2$ denote the tangent bundle of $\mathcal{M}$ and let $T^*\mathcal{M} \approx \mathcal{M} \times S^2_d$ denote the weak $L^2$-dual cotangent bundle of $\mathcal{M}$. Thus a pair $(g, \pi) \in T^*\mathcal{M}$ consists of a smooth Riemannian metric $g \in \mathcal{M}$ and a smooth symmetric 2-contravariant tensor density field $\pi \in S^2_d$. The space $T^*\mathcal{M}$ is thought of as the space of gravitational phase variables with $\pi \in S^2_d$ the gravitational momentum. This point of view is useful even when $n > 3$ if one is considering higher dimensional theories of gravity. Let $I = [t_1, t_2] \subset \mathbb{R}$ be a closed interval, $t_1 < t_2$, and let
\[ c : I \longrightarrow T^*\mathcal{M}, \qquad t \longmapsto (g(t), \pi(t)) \]

be an arbitrary smooth curve in $T^*\mathcal{M}$. The starting point for reduction for vacuum (3 + 1)-dimensional spacetimes is the ADM-action (Arnowitt, Deser, and Misner [1962]), which on such curves $c$ takes the form
\[ I_{\mathrm{ADM}}(c) = \int_I \int_M \left( \pi\, \partial_t g - N \mathcal{H}(g, \pi) - 2X \delta^\flat(g, \pi) \right) dt \tag{2.1} \]

where $N = N(t, x)$ is the lapse function, $X = X(t, x)$ is the shift vector field, $\mathcal{H}$ is the Hamiltonian scalar density, $2\delta^\flat$ is the momentum 1-form density, given by (2.2) and (2.3) below, and where the $t$-dependence of all variables has been suppressed. In the above expression, the juxtapositions of $\pi$ and $\partial_t g$, $N$ and $\mathcal{H}$, and $X$ and $\delta^\flat$ are natural (i.e., non-metric) contractions yielding scalar densities. The main idea of reduction is that the ADM formulation of Einstein’s equations is in “already-parameterized form” with a “super-Hamiltonian” given by
\[ H_{\mathrm{super}}(g, \pi) = \int_M \left( N \mathcal{H}(g, \pi) + 2X \delta^\flat(g, \pi) \right) \]

where the lapse function $N$ and shift vector field $X$ act as Lagrange multipliers (see Arnowitt, Deser, and Misner [1962] p. 231, for a discussion of the parametric form of the canonical equations, and see Fischer and Marsden [1972, 1979,?] for further mathematical information about the canonical approach to Einstein’s equations). Consequently, the gravitational phase variables $(g, \pi)$ must solve the Hamiltonian and divergence constraint equations
\[ \mathcal{H}(g, \pi) = \left( \pi \cdot \pi - \tfrac{1}{2} (\mathrm{tr}_g \pi)^2 \right) \mu_g^{-1} - R(g) \mu_g = 0 \tag{2.2} \]
\[ 2\delta^\flat(g, \pi) = 2(\delta_g \pi)^\flat = 0 \tag{2.3} \]


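where $\pi \cdot \pi$ denotes the metric contraction, $\mathrm{tr}_g \pi$ denotes the trace of $\pi$, $\mu_g$ is the unique volume element on $M$ determined by $g$ and a chosen orientation of $M$, $R(g)$ is the scalar curvature of $g$, $\delta(g, \pi) = \delta_g \pi$ is the divergence of $\pi$ with respect to $g$, a vector density on $M$, and where $\delta^\flat(g, \pi) = (\delta(g, \pi))^\flat = (\delta_g \pi)^\flat$ is its covariant form, a 1-form density on $M$, where $(\cdot)^\flat$ denotes the metric lowering flat operation. In a local coordinate system $\{x^i\}$ on $M$, $1 \le i \le 3$,
\[ \mathcal{H}(g, \pi) = \left( g_{ij} g_{kl} \pi^{ik} \pi^{jl} - \tfrac{1}{2} (g_{kl} \pi^{kl})^2 \right) (\det g)^{-1/2} - R(g) (\det g)^{1/2}, \]
\[ (\delta(g, \pi))^i = (\delta_g \pi)^i = -\pi^{ij}{}_{|j}, \]
\[ (\delta^\flat(g, \pi))_i = ((\delta_g \pi)^\flat)_i = -\pi_i{}^k{}_{|k} = -g_{ij} \pi^{jk}{}_{|k} \]
where vertical bar denotes covariant differentiation with respect to $g$, and $X \delta^\flat(g, \pi) = X (\delta_g \pi)^\flat = X \cdot \delta_g \pi = -g_{ij} X^i \pi^{jk}{}_{|k}$.

The gravitational momentum $\pi$ can be defined in terms of the second fundamental form $k \in S_2(M)$ by
\[ \pi = -\left( k - (\mathrm{tr}_g k) g \right)^\sharp \mu_g = -\left( k^\sharp - (\mathrm{tr}_g k) g^{-1} \right) \mu_g \]
which can be inverted to give
\[ k = -\left( \pi^\flat - \tfrac{1}{2} (\mathrm{tr}_g \pi) g \right) \mu_g^{-1}, \]
where $(\cdot)^\sharp$ denotes the metric raising sharp operation, with the exception that $g^\sharp = g^{-1}$ is the inverse of $g$. In local coordinates,
\[ (g^\sharp)^{ij} = (g^{-1})^{ij} = g^{ij}, \]
\[ \pi^{ij} = -\left( k^{ij} - (g^{kl} k_{kl}) g^{ij} \right) (\det g)^{1/2}, \]
\[ k_{ij} = -\left( \pi_{ij} - \tfrac{1}{2} (g_{kl} \pi^{kl}) g_{ij} \right) (\det g)^{-1/2}. \]
Let
\[ \mathcal{C}_{\mathcal{H}} = \{ (g, \pi) \in T^*\mathcal{M} \mid \mathcal{H}(g, \pi) = 0 \} \subset T^*\mathcal{M} \]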

denote the Hamiltonian constraint space, let
\[ \mathcal{C}_\delta = \{ (g, \pi) \in T^*\mathcal{M} \mid \delta(g, \pi) = 0 \} \subset T^*\mathcal{M} \]
denote the divergence constraint space, and let $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \subset T^*\mathcal{M}$ denote the joint constraint space of the canonical equations. If $(g, \pi) \in \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta$, then the constrained pair is thought of as constrained gravitational variables, or ADM Cauchy data, or physical variables, in


contrast to the unconstrained, or free, or conformal, or reduced variables $(\gamma, p^{TT}) \in P_{\mathrm{reduced}}$ that we introduce later. If $(g, \pi) \in \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta$, then by the classical result of Choquet-Bruhat and Geroch [1969], there exists a maximal vacuum Cauchy development $(V_4, {}^{(4)}g)_{(g, \pi)}$, where diffeomorphically (not metrically) $V_4 \approx I \times M$, $I$ is an open interval in $\mathbb{R}$, and $(V_4, {}^{(4)}g)$ is a globally hyperbolic Ricci-flat spacetime with a spacelike embedding $i : M \to (V_4, {}^{(4)}g)$ with forward pointing unit timelike normal $Z_{i(M)}$ on the embedded hypersurface $i(M) \subset V_4$ such that $g = i^*({}^{(4)}g)$, $k = -i^*({}^{(4)}\nabla Z_{i(M)})$ is the second fundamental form $k$ associated with $(g, \pi)$, and where the development is maximal in the sense that there exist no developments that are proper extensions of the given one. This maximal development is unique up to isomorphism of developments, which includes isometry of the spacetime.

Since $H_{\mathrm{super}}$ vanishes on $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta$, one strives to find the true non-vanishing free, or unconstrained, or reduced Hamiltonian $H_{\mathrm{reduced}}$ of the theory on a reduced phase space $P_{\mathrm{reduced}}$ by eliminating the constraints and imposing spatial and temporal coordinate gauge conditions. The resulting unconstrained Hamiltonian equations for the reduced Hamiltonian would then yield a system of free equations whose flow would describe the evolution of the universe. The goal of this paper is to construct such a reduced system and to study some of its resulting properties.

To accomplish this goal we must make certain topological restrictions on $M$, namely that $M$ be of Yamabe type $-1$, and also make certain geometrical restrictions on the gravitational data $(g, \pi)$, namely that $(g, \pi)$ has constant mean curvature. We shall discuss the topological restrictions on $M$ in Section 3. Here we briefly discuss the geometrical restrictions on $(g, \pi)$. The pair $(g, \pi)$ (or $(g, k)$) describes the intrinsic and extrinsic geometry of a hypersurface embedded in an ambient spacetime.
The mean curvature of $(g, \pi)$ (or, equivalently, of $(g, k)$) is defined by the mean curvature function
\[ \tau : T^*\mathcal{M} \longrightarrow C^\infty(M; \mathbb{R}), \qquad (g, \pi) \longmapsto \tau(g, \pi) \equiv \tfrac{1}{2} (\mathrm{tr}_g \pi) \mu_g^{-1} = \mathrm{tr}_g k, \tag{2.4} \]
where $C^\infty(M; \mathbb{R})$ denotes the space of smooth real-valued functions on $M$. If $\tfrac{1}{2} (\mathrm{tr}_g \pi) \mu_g^{-1} = \tau = \text{constant} \in \mathbb{R}$, then $(g, \pi)$ has constant mean curvature. The usual convention on the sign of $k$, as adopted here, is that the sign of $k$ is negative when the tips of the normals on a spacelike hypersurface are further apart than their bases, as for example in the expansion of a model universe, in which case $\tau = \mathrm{tr}_g k < 0$. For $\tau \in \mathbb{R}^-$, let
\[ \mathcal{C}_\tau = \{ (g, \pi) \in T^*\mathcal{M} \mid \tau(g, \pi) = \tfrac{1}{2} (\mathrm{tr}_g \pi) \mu_g^{-1} = \tau \} \tag{2.5} \]


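denote the space of gravitational variables with constant mean curvature $\tau$ so that
\[ \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau \tag{2.6} \]
is the space of Cauchy data with fixed constant mean curvature $\tau < 0$. Let
\[ \mathcal{C}_{\mathbb{R}^-} = \bigcup_{\tau \in \mathbb{R}^-} \mathcal{C}_\tau = \{ (g, \pi) \in T^*\mathcal{M} \mid \tfrac{1}{2} (\mathrm{tr}_g \pi) \mu_g^{-1} \in \mathbb{R}^- \} \]
denote the space of gravitational variables with arbitrary constant negative mean curvature $\tau \in \mathbb{R}^-$. Of importance for us will be
\[ \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-} \tag{2.7} \]
the space of Cauchy data with constant negative mean curvature, which we refer to as the (negative) CMC constraint space. The manifold structure of this space as a subspace of $T^*\mathcal{M}$ can be analyzed as in Fischer and Marsden [1979].

We remark that the spaces $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau$ and $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}$ are invariant under the group $\mathcal{D} = \mathcal{D}(M) = \mathrm{Diff}(M)$ of diffeomorphisms of $M$ and that the fully reduced theory uses $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D}$ as a contact manifold (see below). Alternately, one can argue that one should only factor by the group of small diffeomorphisms, i.e., the subgroup $\mathcal{D}_0$ of diffeomorphisms of $M$ isotopic to the identity (i.e., the connected component of the identity of $\mathcal{D}$), in which case the proper fully reduced space is $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D}_0$. In any event, in this paper, for clarity of exposition, we work “up to isometry” rather than factoring by $\mathcal{D}$ (or $\mathcal{D}_0$). However, the following heuristic counting argument is suggestive. For fixed $\tau \in \mathbb{R}^-$,
\[ \dim((\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D}) = \dim(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau) - \dim \mathcal{D} = (12 - (1 + 3 + 1) - 3)\, \infty^3 = 4\, \infty^3 \tag{2.8} \]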

(see also (4.2)) corresponding to the two degrees of freedom of the gravitational field. However, we note that these are not dynamical degrees of freedom in the sense that there is a dynamical system on this space but rather true degrees of freedom of the gravitational field representing non-isometric Ricci-flat spacetimes. Indeed, if $(g, \pi) \in \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau$, then under our assumptions on the topology of $M$, $(g, \pi)$ can be uniquely embedded (up to diffeomorphism of $M$) in a maximal vacuum Cauchy development $(V_4, {}^{(4)}g)_{(g, \pi)}$ which admits a CMC foliation. Thus $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau$ parameterizes (modulo diffeomorphisms of $M$) the space of isometry classes of maximal globally hyperbolic CMC sliced Ricci-flat spacetimes with topology $\mathbb{R}^+ \times M$, where the parameterization can be thought of as taking a snapshot of each of these spacetimes on the unique hypersurface with constant mean curvature $\tau$ and finding the $(g, \pi)$ that resides there. (Using $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D}$ instead of $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau$ enables us to remove the “modulo diffeomorphisms of $M$” condition.)


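Since $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau$ represents non-isometric spacetimes, there can be no ADM dynamics on this space since if $c_{\mathrm{ADM}}(t) = (g(t), \pi(t)) \in \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta$, $t \in I$, is an ADM curve, i.e., an integral curve of the ADM dynamical system on $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta$, the maximal Cauchy developments $(V_4, {}^{(4)}g)_{(g(t), \pi(t))}$ of each $(g(t), \pi(t))$ in the curve must be isometric,
\[ \left[ (V_4, {}^{(4)}g)_{(g(t), \pi(t))} \right] = \text{constant} \tag{2.9} \]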

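where $[\,\cdot\,]$ represents the isometry class of the spacetime. Thus, although $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D}$ is a symplectic manifold, it is not one on which there is an evolution since this space is too restrictive to support the ADM dynamical system. Now consider the contact manifold
\[ (\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D} = \bigcup_{\tau \in \mathbb{R}^-} (\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D} \]
with “heuristic dimension”
\[ \dim((\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D}) = \dim\Big( \bigcup_{\tau \in \mathbb{R}^-} (\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D} \Big) = 4\, \infty^3 + 1 \tag{2.10} \]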

(see also (4.3)) which now has room for dynamics. Thus if
\[ c^{\max}_{\mathrm{ADM}}(t) = (g(t), \pi(t)) \in \mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-} \]
is a maximal ADM curve (i.e., there exist no proper extensions of $c^{\max}_{\mathrm{ADM}}$) with strictly monotonically increasing constant mean curvature $\tau(t) = \tau(g(t), \pi(t))$, then $c^{\max}_{\mathrm{ADM}}(t)$ describes the (intrinsic and extrinsic) geometry on a one-parameter family of CMC hypersurfaces in the maximal Cauchy development $(V_4, {}^{(4)}g)_{(g(t_0), \pi(t_0))}$ of any fixed point $(g(t_0), \pi(t_0))$ on this curve with constant mean curvature $\tau_0 = \tau(g(t_0), \pi(t_0))$. Thus these maximal integral curves foliate $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D}$. If we now identify the image of each such maximal ADM curve with a single point, the dimension of $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D}$ gets reduced from $4\, \infty^3 + 1$ to $4\, \infty^3$ as expected. Thus the space of images of all maximal ADM curves in $(\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-})/\mathcal{D}$ parameterizes the space of isometry classes of maximal globally hyperbolic CMC Ricci-flat spacetimes with topology $\mathbb{R}^+ \times M$.

In Section 5, we shall see how the Lichnerowicz transform provides a way of shuttling back and forth between the CMC-constraint space $\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}$ and the contact manifold $\mathbb{R}^- \times P_{\mathrm{reduced}}$.
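The coordinate relations of this section between the momentum $\pi$, the second fundamental form $k$, and the mean curvature $\tau$ can be checked pointwise with a small Python sketch. It works at a single point of $M$ with $3 \times 3$ component matrices, assumes $n = 3$, and uses only the coordinate formulas $\pi^{ij} = -(k^{ij} - (g^{kl}k_{kl})g^{ij})(\det g)^{1/2}$, its stated inverse, and $\tau = \tfrac{1}{2}(\mathrm{tr}_g\pi)\mu_g^{-1}$. The helper names and the sample matrices are hypothetical and chosen only for illustration.

```python
import math


def det3(a):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def inv3(a):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactors)."""
    d = det3(a)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [x for x in range(3) if x != i]
            c = [x for x in range(3) if x != j]
            minor = a[r[0]][c[0]] * a[r[1]][c[1]] - a[r[0]][c[1]] * a[r[1]][c[0]]
            inv[j][i] = (-1) ** (i + j) * minor / d
    return inv


def contract(a, b):
    """Full contraction of a covariant with a contravariant symmetric 2-tensor."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


def move_indices(m, t):
    """m_ik m_jl t_kl: raises both indices if m = g^{-1}, lowers them if m = g."""
    return [[sum(m[i][k] * m[j][l] * t[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]


def momentum_from_k(g, k):
    """pi^{ij} = -(k^{ij} - (tr_g k) g^{ij}) sqrt(det g)."""
    g_inv, root = inv3(g), math.sqrt(det3(g))
    tr_k = contract(g_inv, k)
    k_up = move_indices(g_inv, k)
    return [[-(k_up[i][j] - tr_k * g_inv[i][j]) * root for j in range(3)]
            for i in range(3)]


def k_from_momentum(g, pi):
    """k_{ij} = -(pi_{ij} - (1/2)(g_kl pi^kl) g_{ij}) / sqrt(det g)."""
    root = math.sqrt(det3(g))
    tr_pi = contract(g, pi)
    pi_low = move_indices(g, pi)
    return [[-(pi_low[i][j] - 0.5 * tr_pi * g[i][j]) / root for j in range(3)]
            for i in range(3)]


def mean_curvature(g, pi):
    """tau = (1/2) (tr_g pi) / sqrt(det g), which should equal tr_g k."""
    return 0.5 * contract(g, pi) / math.sqrt(det3(g))
```

For any positive definite symmetric `g` and symmetric `k`, applying `momentum_from_k` followed by `k_from_momentum` should return `k` up to rounding, and `mean_curvature(g, pi)` should agree with `contract(inv3(g), k)`. The factor $\tfrac{1}{2}$ in the inversion formula is specific to $n = 3$ in this sketch.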

3 Hyperbolizable Manifolds and Manifolds of Yamabe Type $-1$

In this section we consider manifolds of Yamabe type $-1$ and the important special subclass, the hyperbolizable ones. Let $M$ be a connected $n$-manifold, $n \ge 2$ (not necessarily compact). A space form on $M$ is a complete Riemannian metric $g \in \mathcal{M}$ with constant sectional curvature $K(g) = K \in \mathbb{R}$. In this case, $g$ is an Einstein metric with $\mathrm{Ric}(g) = (n - 1) K g$ and a Yamabe metric (see below) with constant scalar curvature $R(g) = n(n - 1) K$. We define $g$ to be hyperbolic if $g$ has constant negative sectional curvature $K(g) = K = \text{constant} < 0$, in which case we define the hyperbolic radius by $r = (-K)^{-1/2}$ (so that $K = -1/r^2$). We remark that the hyperbolic radius refers to a parameter of the simply connected covering space $\mathbb{H}^n_r$ (see Theorem 15.1) and not to a diameter of $(M, g)$. If the volume $\mathrm{vol}(M, g) = \int_M \mu_g$ of a hyperbolic metric $g$ is finite, we refer to its volume as its hyperbolic volume. If $K = -1$ (or equivalently, if $r = 1$) then $g$ is a unit hyperbolic metric.

We remark that this definition of hyperbolic is slightly more general than that of some authors who require that $g$ have constant sectional curvature $-1$. However, for our purposes it is more convenient to allow the constant sectional curvature to either float or to be normalized by some other condition, such as requiring that a hyperbolic $g$ have scalar curvature $R(g) = -1$ and thus sectional curvature $K(g) = -1/(n(n - 1))$, in contrast to having $K(g) = -1$ and thus $R(g) = -n(n - 1)$. With this definition of hyperbolic, by Mostow rigidity, hyperbolic metrics are unique up to isometry and homothety, but by normalizing the scalar curvature to $-1$, uniqueness is again returned to isometry alone.

We make the following topological definition: $M$ is hyperbolizable if there exists a hyperbolic Riemannian metric on $M$. Our use of the term hyperbolizable, while perhaps a bit cumbersome, is in the spirit of the term Banachable (also perhaps a bit cumbersome), but for our purposes serves to distinguish topological conditions from metric ones.
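The two normalizations of a hyperbolic metric discussed above can be compared with a short Python sketch. It uses only the relations $R = n(n-1)K$ and $r = (-K)^{-1/2}$ stated in this section, together with the standard fact that the homothety $g \mapsto r^2 g$ multiplies volume by $r^n$; with $R = -1$ these relations give $K = -1/(n(n-1))$. The function names are hypothetical and serve only as an illustration.

```python
import math


def scalar_from_sectional(n: int, sectional: float) -> float:
    """Scalar curvature R = n (n - 1) K of a space form of dimension n."""
    return n * (n - 1) * sectional


def sectional_from_scalar(n: int, scalar: float) -> float:
    """Sectional curvature K = R / (n (n - 1)) of a space form of dimension n."""
    return scalar / (n * (n - 1))


def hyperbolic_radius(sectional: float) -> float:
    """Hyperbolic radius r = (-K)^(-1/2) for K < 0."""
    if sectional >= 0:
        raise ValueError("a hyperbolic metric needs K < 0")
    return (-sectional) ** -0.5


def volume_factor(n: int, sectional: float) -> float:
    """Factor r^n relating the volume at curvature K to the unit (K = -1) volume."""
    return hyperbolic_radius(sectional) ** n


if __name__ == "__main__":
    k = sectional_from_scalar(3, -1.0)
    print(k, hyperbolic_radius(k), volume_factor(3, k))
```

For $n = 3$ and $R = -1$ this gives $K = -1/6$, hyperbolic radius $r = \sqrt{6}$, and a volume that is $6^{3/2}$ times the unit hyperbolic volume.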
We shall consider the structure of hyperbolizable manifolds $M$ in more detail in Section 14, but for now we make the following remark. Let $\pi$ be a group. A connected $n$-manifold $M$ is a $K(\pi, 1)$-manifold if its fundamental group $\pi_1(M) = \pi$ and if all of its higher homotopy groups vanish (equivalently, the universal covering space of $M$ is contractible). Among the class of $K(\pi, 1)$-manifolds are the hyperbolizable and the Euclidean manifolds since these manifolds are diffeomorphically covered by $\mathbb{R}^n$.

Now we consider the Yamabe type of a manifold. We return to our standing assumption that $M$ is closed, $n \ge 3$. A metric $\bar g \in \mathcal{M}$ is a Yamabe


metric on $M$ if $\bar g$ has constant scalar curvature, $R(\bar g) = \text{constant} \in \mathbb{R}$. Let
\[ \mathcal{Y}(M) = \{ \bar g \in \mathcal{M} \mid R(\bar g) = \text{constant} \} \subset \mathcal{M} \]
denote the space of Yamabe metrics on $M$. From a theorem of Aubin (see [Aubin1982] p. 135, and references therein), every closed $M$ admits a Riemannian metric with constant negative scalar curvature. We define $M$ to be of Yamabe type $-1$ if the converse is true, namely, if the scalar curvature of every Yamabe metric on $M$ is a negative constant. Then $M$ is of Yamabe type $-1$ if and only if the scalar curvature of every Riemannian metric on $M$ is negative somewhere. $M$ is of Yamabe type 0 if $M$ admits a metric $g$ with $R(g) = 0$ but no metric with $R(g) = \text{constant} > 0$, and $M$ is of Yamabe type $+1$ if $M$ admits a metric with constant positive scalar curvature.

Now we restrict to $n = 3$. $M$ is irreducible if every 2-sphere in $M$ bounds a 3-cell. For simplicity, we assume that the Poincaré conjecture is true, which when taken in the form that there do not exist any fake 3-cells, is equivalent to every $K(\pi, 1)$ 3-manifold being irreducible. Let $\#$ denote connected sum and let $S^3$ denote the 3-sphere diffeomorphically. Then using results of Gromov and Lawson [1983] and Schoen and Yau [1979], the following topological information regarding 3-manifolds of Yamabe type $-1$ is available (see Fischer and Moncrief [1996] for further details).

3.1 Theorem. Let $M$ be a closed connected orientable 3-manifold. Assume that the Poincaré conjecture is true. Then $M$ is of Yamabe type $-1$ if and only if one and only one of the following holds:
(a) $M$ is hyperbolizable;
(b) $M$ is a non-hyperbolizable $K(\pi, 1)$-manifold that does not admit a flat Riemannian metric;
(c) $M$ has a non-trivial connected sum decomposition in which at least one factor is a $K(\pi, 1)$-manifold; i.e., $M \approx M' \# K(\pi, 1)$, where $M' \not\approx S^3$.

We remark that (a) is the vast class of closed orientable hyperbolizable 3-manifolds, that manifolds of type (a) and (b) are irreducible, and that manifolds of type (c) are reducible.
Thus, subject to the Poincaré conjecture, irreducible manifolds of Yamabe type $-1$ must be either of type (a) or (b).

4 The Reduced Phase Space $P_{\mathrm{reduced}}$

We now construct the reduced phase space $P_{\mathrm{reduced}}$ over a closed connected oriented 3-manifold $M$ of Yamabe type $-1$. First, for a general closed $n$-manifold, consider the space
\[ \mathcal{M}_{-1} = \{ g \in \mathcal{M} \mid R(g) = -1 \} \subset \mathcal{M} \]
of Riemannian metrics on $M$ with scalar curvature $-1$. From the aforementioned result of [Aubin1982], $\mathcal{M}_{-1}$ is not empty. Then, by a result of Fischer and Marsden [1975], we have the following.

4.1 Theorem. Let $M$ be a closed connected orientable $n$-manifold, with $n \ge 3$. Then $\mathcal{M}_{-1}$ is a smooth closed non-empty infinite-dimensional and infinite-codimensional ilh-submanifold of $\mathcal{M}$ (ilh = Inverse Limit Hilbert).

Let $\mathcal{P} = \mathrm{Pos}(M, \mathbb{R})$ denote the space of positive functions on $M$. Then $\mathcal{P}$ acts by pointwise multiplication on $\mathcal{M}$. The resulting quotient manifold $\mathcal{M}/\mathcal{P}$ is the space of pointwise conformal structures on $M$. If $M$ is of Yamabe type $-1$, then using methods of Fischer and Tromba [1984], it can be shown that $\mathcal{M}/\mathcal{P}$ and $\mathcal{M}_{-1}$ are ilh diffeomorphic, in which case $\mathcal{M}_{-1}$ is contractible. In this case, $\mathcal{M}_{-1}$ provides an important submanifold representation of the quotient manifold $\mathcal{M}/\mathcal{P}$. Since the motivation for our construction of $P_{\mathrm{reduced}}$ involves the conformal method of solving the constraint equations, the space $\mathcal{M}_{-1}$ plays a central role in the construction of our reduced phase space. Indeed, the existence of the diffeomorphism
\[ \mathcal{M}_{-1} \approx \mathcal{M}/\mathcal{P}, \qquad \gamma \longleftrightarrow [\gamma] = \mathcal{P}\gamma = \{ p\gamma \mid p \in \mathcal{P} \} \]

for manifolds of Yamabe type $-1$ is one of the main mathematical reasons that we restrict to manifolds of this type. Notationally, because in this case metrics in $\mathcal{M}_{-1}$ can be thought of as representing pointwise conformal classes, we refer to such metrics as conformal metrics and denote them by $\gamma$ whereas we denote general metrics in $\mathcal{M}$ by $g$. Similarly, the volume of a conformal metric will be referred to as the conformal volume. For $\gamma \in \mathcal{M}_{-1}$, let
\[ S^{TT}_\gamma \equiv (S^2_d)^{TT}(M, \gamma) = \{ p^{TT} \in S^2_d(M) \mid \delta_\gamma p^{TT} = 0 \text{ and } \mathrm{tr}_\gamma p^{TT} = 0 \} \]
denote the space of symmetric 2-contravariant tensor density fields on $M$ that are transverse (i.e., divergenceless) and traceless with respect to $\gamma$ and let
\[ P_{\mathrm{reduced}} = \bigcup_{\gamma \in \mathcal{M}_{-1}} S^{TT}_\gamma = \{ (\gamma, p^{TT}) \mid \gamma \in \mathcal{M}_{-1} \text{ and } p^{TT} \in S^{TT}_\gamma \} \tag{4.1} \]

denote the reduced phase space. There is a natural projection onto the first factor, $P_{\mathrm{reduced}} \to \mathcal{M}_{-1}$, $(\gamma, p^{TT}) \mapsto \gamma$, so that for $\gamma \in \mathcal{M}_{-1}$, the fiber above $\gamma$ is $S^{TT}_\gamma$. The variables $(\gamma, p^{TT}) \in P_{\mathrm{reduced}}$ are the reduced,


or conformal, or unconstrained, or free variables of our reduction process. Associated with $P_{\mathrm{reduced}}$ is the contact manifold $\mathbb{R}^- \times P_{\mathrm{reduced}}$ with contact variables $(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}$.

We remark that the fully reduced phase space is given by $P_{\mathrm{reduced}}/\mathcal{D}$ or $P_{\mathrm{reduced}}/\mathcal{D}_0$ (see also the remarks in Section 2). Moreover, for manifolds of Yamabe type $-1$ that we are considering here, the space $P_{\mathrm{reduced}}/\mathcal{D}_0$ can be interpreted as the cotangent bundle of the Teichmüller space of conformal structures on $M$ from which it inherits a natural symplectic structure (see Fischer and Moncrief [1996] for more information regarding this point of view). However, for clarity of exposition we work “up to diffeomorphism” on $P_{\mathrm{reduced}}$ rather than the fully reduced phase space $P_{\mathrm{reduced}}/\mathcal{D}_0$ and refer to $P_{\mathrm{reduced}}$ and $\mathbb{R}^- \times P_{\mathrm{reduced}}$ as the “phase space” and the “contact manifold”, respectively. The following heuristic counting argument is suggestive (see also (2.8)),
\[ \begin{aligned} \dim(P_{\mathrm{reduced}}/\mathcal{D}) &= \dim P_{\mathrm{reduced}} - \dim \mathcal{D} \\ &= \dim \mathcal{M}_{-1} + \dim S^{TT}_\gamma - \dim \mathcal{D} \\ &= ((6 - 1) + (6 - 3 - 1) - 3)\, \infty^3 = 4\, \infty^3 \\ &= \dim((\mathcal{C}_{\mathcal{H}} \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D}) \quad \text{for fixed } \tau \in \mathbb{R}^- \end{aligned} \tag{4.2} \]

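and we note that this space is too restrictive for dynamics, representing as it does (after fixing some $\tau \in \mathbb{R}^-$) isometry classes of non-isometric spacetimes sampled on this fixed CMC $\tau$-hypersurface. On the other hand, associated with $\mathcal{P}_{\rm reduced}/\mathcal{D}$ is the contact manifold $\mathbb{R}^- \times (\mathcal{P}_{\rm reduced}/\mathcal{D})$ with heuristic dimension (see also (2.10))

$$\dim\big(\mathbb{R}^- \times (\mathcal{P}_{\rm reduced}/\mathcal{D})\big) = 4\,\infty^3 + 1 = \dim \bigcup_{\tau \in \mathbb{R}^-} (\mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau)/\mathcal{D} \tag{4.3}$$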

which is now large enough to accommodate a dynamical system. The resulting space of equivalence classes of images of maximal integral curves then reduces to a 4∞3 -parameterized space.
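The arithmetic behind the heuristic count in (4.2) can be checked with a short sketch. The function name `pointwise_degrees_of_freedom` and its parameters are hypothetical labels introduced only for illustration. The sketch counts free functions per point of M (the coefficient of ∞³), assuming 6 independent components for a symmetric 2-tensor on a 3-manifold and 3 components for a vector field generating a diffeomorphism.

```python
def pointwise_degrees_of_freedom(sym_components: int = 6,
                                 diffeo_components: int = 3) -> int:
    """Coefficient of infinity^3 in the heuristic count (4.2)."""
    # Metrics in M_{-1}: one function is removed by the condition R(gamma) = -1.
    conformal_metric = sym_components - 1
    # TT momenta: the divergence condition removes 3 functions, the trace condition 1.
    tt_momentum = sym_components - diffeo_components - 1
    # Quotient by the diffeomorphism group D.
    return conformal_metric + tt_momentum - diffeo_components


print(pointwise_degrees_of_freedom())  # prints 4
```

The extra "+1" in (4.3) is the single real parameter τ and is not a per-point count, so it is deliberately left out of the function.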

5 The Lichnerowicz Transform

We now discuss the relationship of the contact manifold $\mathbb{R}^- \times \mathcal{P}_{\rm reduced}$ to conformal reduction and introduce the Lichnerowicz transform. The Choquet-Bruhat–Lichnerowicz–York conformal method (Choquet-Bruhat and York [1980]) of solving the CMC constraint equations (2.2) and (2.3) proceeds as follows. Let $(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times \mathcal{P}_{\rm reduced}$ be arbitrary

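contact variables and let $\varphi = \varphi(\tau, \gamma, p^{TT}) > 0$ denote the unique positive solution of the Lichnerowicz equation

$$\Delta_\gamma \varphi - \tfrac{1}{8}\varphi + \tfrac{1}{12}\tau^2 \varphi^5 - \tfrac{1}{8}\,(p^{TT} \cdot p^{TT})\,\mu_\gamma^{-2}\,\varphi^{-7} = 0 \tag{5.1}$$

where $\Delta_\gamma$ is the Laplacian $-\nabla^i \nabla_i$ with respect to $\gamma$ and $p^{TT} \cdot p^{TT} = \gamma_{ik}\gamma_{jl}\,(p^{TT})^{ij}(p^{TT})^{kl}$. The resulting solution $\varphi$ is used as a conformal factor to transform the contact variables $(\tau, \gamma, p^{TT})$ to the ADM-physical variables $(g, \pi)$. Since $M$ is of Yamabe type −1, results regarding conformal reduction can be interpreted as the existence of a global ilh-diffeomorphism,

$$\mathcal{L} : \mathbb{R}^- \times \mathcal{P}_{\rm reduced} \longrightarrow \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}, \qquad (\tau, \gamma, p^{TT}) \longmapsto (g, \pi) = \mathcal{L}(\tau, \gamma, p^{TT})$$

which we refer to as the Lichnerowicz transform, from the contact manifold $\mathbb{R}^- \times \mathcal{P}_{\rm reduced}$ to the CMC constraint space $\mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}$, and which is given by

$$\begin{aligned}
(g, \pi) = \mathcal{L}(\tau, \gamma, p^{TT}) &= \big(\varphi^4 \gamma,\ \varphi^{-4} p^{TT} + \tfrac{2}{3}\tau \varphi^2 \gamma^{-1}\mu_\gamma\big) \\
&= \big(g,\ \pi^{TT} + \tfrac{2}{3}\tau g^{-1}\mu_g\big)
\end{aligned} \tag{5.2}$$

where $\varphi = \varphi(\tau, \gamma, p^{TT}) > 0$ is the Lichnerowicz conformal factor, the unique positive solution to the Lichnerowicz equation (5.1), and where $\gamma^{-1}$ and $g^{-1}$ denote the inverses of the metrics (in coordinates, $(\gamma^{-1})^{ij} = \gamma^{ij}$ and $(g^{-1})^{ij} = g^{ij}$). Note that $\mathcal{L}$ maps level sets in $\mathbb{R}^- \times \mathcal{P}_{\rm reduced}$ to level sets in $\mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}$, and is diffeomorphic there,

$$\mathcal{L}\,|\,\{\tau\} \times \mathcal{P}_{\rm reduced} : \{\tau\} \times \mathcal{P}_{\rm reduced} \longrightarrow \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_\tau, \qquad (\tau, \gamma, p^{TT}) \longmapsto (g, \pi)$$

so that

$$\tau(\mathcal{L}(\tau, \gamma, p^{TT})) = \tau(g, \pi) = \tau \tag{5.3}$$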

where on the left hand side, $\tau(\cdot, \cdot)$ is the mean curvature function whereas on the right hand side $\tau \in \mathbb{R}^-$. As an important example and special case, we consider the case $p^{TT} = 0$. Then from (5.2), we see immediately that

$$(g, \pi) = \mathcal{L}(\tau, \gamma, 0) = \big(\varphi^4 \gamma,\ \tfrac{2}{3}\tau \varphi^2 \gamma^{-1}\mu_\gamma\big) = \big(g,\ \tfrac{2}{3}\tau g^{-1}\mu_g\big) \tag{5.4}$$

which clearly satisfies $\delta_g \pi = 0$. From the Hamiltonian constraint (2.2), we then find

$$H\big(g, \tfrac{2}{3}\tau g^{-1}\mu_g\big) = -\tfrac{2}{3}\tau^2 \mu_g - R(g)\mu_g = 0$$

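so that

$$R(g) = -\tfrac{2}{3}\tau^2 < 0 \tag{5.5}$$

is a negative constant. Moreover, Lichnerowicz’s equation (5.1) reduces to

$$\Delta_\gamma \varphi - \tfrac{1}{8}\varphi + \tfrac{1}{12}\tau^2 \varphi^5 = 0 \tag{5.6}$$

which has a unique positive constant solution

$$\varphi = \left(\frac{3}{2\tau^2}\right)^{1/4} \tag{5.7}$$

which when substituted into (5.4) yields

$$(g, \pi) = \mathcal{L}(\tau, \gamma, 0) = \left(\frac{3}{2\tau^2}\,\gamma,\ -\big(\tfrac{2}{3}\big)^{1/2} \gamma^{-1}\mu_\gamma\right) \tag{5.8}$$

We remark that the second component in

$$\left(\frac{3}{2\tau^2}\,\gamma,\ -\big(\tfrac{2}{3}\big)^{1/2} \gamma^{-1}\mu_\gamma\right)$$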

is independent of τ , although we note that in this form the metric argument γ of the volume element µγ does not correspond to the metric (3/(2τ 2 ))γ in the ﬁrst component. This special case of pT T = 0 will be of importance in ﬁnding the ADM curve generated by the hyperbolic ﬁxed point (see (13.1)).
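The special case p^TT = 0 lends itself to a quick numerical sanity check. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, φ is taken to be spatially constant (so the Laplacian term in (5.6) vanishes), and τ is any negative number. It checks that (5.7) solves (5.6) and that the coefficient (2/3)τφ² in (5.4) reduces to the τ-independent value −(2/3)^{1/2} of (5.8).

```python
import math


def lichnerowicz_residual(phi: float, tau: float) -> float:
    """Left-hand side of (5.6) for a constant phi (the Laplacian term is zero)."""
    return -phi / 8.0 + (tau ** 2) * phi ** 5 / 12.0


def constant_conformal_factor(tau: float) -> float:
    """Constant solution (5.7): phi = (3 / (2 tau^2))^(1/4)."""
    return (3.0 / (2.0 * tau ** 2)) ** 0.25


def momentum_coefficient(tau: float) -> float:
    """Coefficient of gamma^{-1} mu_gamma in (5.4): (2/3) tau phi^2, with tau < 0."""
    phi = constant_conformal_factor(tau)
    return (2.0 / 3.0) * tau * phi ** 2


for tau in (-0.1, -1.0, -7.5):
    phi = constant_conformal_factor(tau)
    # (5.7) solves (5.6) up to rounding error.
    assert abs(lichnerowicz_residual(phi, tau)) < 1e-12
    # The second component of (5.8) does not depend on tau.
    assert math.isclose(momentum_coefficient(tau), -math.sqrt(2.0 / 3.0))
```

The sign of the coefficient comes from τ < 0: since φ² = (3/2)^{1/2}/|τ|, the product τφ² equals −(3/2)^{1/2}.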

6 The Restricted Volume Functional vol−1

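For a connected closed orientable $n$-manifold $M$, $n \ge 3$, we choose an orientation on $M$ and for each $g \in \mathcal{M}$, we let $\mu_g$ denote the volume element determined on $M$ by $g$ and the orientation of $M$. Let $\operatorname{vol}(M, g) = \int_M \mu_g$ denote the volume of $(M, g)$ and let

$$\operatorname{vol} : \mathcal{M} \longrightarrow \mathbb{R}^+, \qquad g \longmapsto \int_M \mu_g = \operatorname{vol}(M, g)$$

denote the volume functional on $\mathcal{M}$. Let

$$\operatorname{vol}_{-1} = \operatorname{vol}|_{\mathcal{M}_{-1}} : \mathcal{M}_{-1} \longrightarrow \mathbb{R}^+, \qquad \gamma \longmapsto \int_M \mu_\gamma = \operatorname{vol}_{-1}(M, \gamma) = \operatorname{vol}(M, \gamma)$$

denote the volume functional restricted to the closed nonempty submanifold $\mathcal{M}_{-1}$. We are interested in the critical points of $\operatorname{vol}_{-1}$ and the nature of these critical points. Now we have the following result (see Fischer and Moncrief [2000] for more details):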

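6.1 Theorem. Let $M$ be a closed connected oriented $n$-manifold, $n \ge 3$. Then $\bar\gamma \in \mathcal{M}_{-1}$ is a critical point of

$$\operatorname{vol}_{-1} : \mathcal{M}_{-1} \longrightarrow \mathbb{R}^+ \tag{6.1}$$

if and only if $\bar\gamma$ is an Einstein metric with

$$\operatorname{Ric}(\bar\gamma) = -\frac{1}{n}\,\bar\gamma \tag{6.2}$$

If $n = 3$, then $\tilde\gamma \in \mathcal{M}_{-1}$ is a critical point of $\operatorname{vol}_{-1}$ if and only if $\tilde\gamma$ is hyperbolic, in which case $K(\tilde\gamma) = -(1/6)$ and $\operatorname{Ric}(\tilde\gamma) = -(1/3)\tilde\gamma$. Moreover, up to isometry, $\tilde\gamma$ is unique and is a strict local minimum of $\operatorname{vol}_{-1}$, i.e., there exists a neighborhood $U_{\tilde\gamma} \subset \mathcal{M}_{-1}$ of $\tilde\gamma$ such that for all $\gamma' \in U_{\tilde\gamma}$,

$$\operatorname{vol}(M, \gamma') \ge \operatorname{vol}(M, \tilde\gamma) \tag{6.3}$$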

with equality if and only if $\gamma'$ is isometric to $\tilde\gamma$.

Thus for $n \ge 3$, $\operatorname{vol}_{-1}$ has critical points if and only if $M$ admits an Einstein metric with a negative Einstein constant, and if $n = 3$, $\operatorname{vol}_{-1}$ has a critical point if and only if $M$ is hyperbolizable, in which case (up to isometry) the critical point is unique and is a strict local minimum. What is remarkable in all of this is that the volume functional, one of the simplest functionals on $\mathcal{M}$ since it is algebraic in the metric and is computed by integration, can be used to detect Einstein metrics when restricted to $\mathcal{M}_{-1}$ and, in the case $n = 3$, can be used to detect hyperbolic metrics.

We also remark that $\mathcal{M}_{-1}$ and $\operatorname{vol}_{-1}$ are invariant under the group of diffeomorphisms of $M$, i.e., if $f \in \mathcal{D} = \operatorname{Diff}(M)$, then $f^*\mathcal{M}_{-1} = \mathcal{M}_{-1}$ (since if $\gamma \in \mathcal{M}_{-1}$, then $R(f^*\gamma) = R(\gamma) \circ f = -1 \circ f = -1$, so that $f^*\gamma \in \mathcal{M}_{-1}$), and

$$\operatorname{vol}(M, f^*\gamma) = \int_M \mu_{f^*\gamma} = \int_M f^*\mu_\gamma = \int_M \mu_\gamma = \operatorname{vol}(M, \gamma)$$

so that $\operatorname{vol}_{-1}$ is constant on orbits $\gamma \cdot \mathcal{D} = \{ f^*\gamma \mid f \in \mathcal{D} \}$, i.e., is constant in the isometric directions of $\gamma$. Thus, taken literally on $\mathcal{M}_{-1}$, no $\gamma$ could be a strict local minimum of $\operatorname{vol}_{-1}$. Of course this problem disappears when one works with isometry classes of metrics rather than with the metrics themselves.

We also remark that some scale normalization (such as $R(\gamma) = -1$) must be chosen to normalize hyperbolic metrics in order that a hyperbolic metric be a local minimum of volume, since otherwise the volume could be scaled to zero, i.e., for $c \in \mathbb{R}^+$, $\operatorname{vol}(M, cg) = c^{n/2}\operatorname{vol}(M, g) \to 0$ as $c \to 0$.

Let

$$\inf \operatorname{vol}_{-1} \equiv \inf_{\gamma \in \mathcal{M}_{-1}} \operatorname{vol}(M, \gamma),$$

so that we have the global lower bound for the restricted volume functional: for $\gamma \in \mathcal{M}_{-1}$,

$$\operatorname{vol}_{-1}(M, \gamma) \ge \inf \operatorname{vol}_{-1} \tag{6.4}$$

Of importance is whether or not $\inf \operatorname{vol}_{-1} > 0$, in which case (6.4) would provide a positive bound away from zero for the functional $\operatorname{vol}_{-1}$ and therefore prevent volume collapse of $M$ by metrics in $\mathcal{M}_{-1}$. More strongly, if this infimum is realized by an actual metric, say $\bar\gamma$, i.e., if $\operatorname{vol}(M, \bar\gamma) = \inf \operatorname{vol}_{-1}$, then for all $\gamma \in \mathcal{M}_{-1}$,

$$\operatorname{vol}_{-1}(M, \gamma) \ge \inf \operatorname{vol}_{-1} = \operatorname{vol}(M, \bar\gamma) > 0$$

in which case $\bar\gamma$ would be a global minimum for $\operatorname{vol}_{-1}$ (and in particular $\operatorname{vol}_{-1}$ would be bounded away from zero). Conversely, if $\bar\gamma$ is a global minimum for $\operatorname{vol}_{-1}$, then $\operatorname{vol}(M, \bar\gamma) = \inf \operatorname{vol}_{-1}$. Thus if $\bar\gamma$ realizes the infimum, it is a minimum, hence a critical point, and hence an Einstein metric (by Theorem 6.1), and if $n = 3$, then $\bar\gamma$ is hyperbolic and hence a strict local and thus a strict global minimum (up to isometry).

An important question for $n = 3$ is whether the converse is true, i.e., whether a hyperbolic metric $\tilde\gamma$ realizes $\inf \operatorname{vol}_{-1}$, in which case $\tilde\gamma$ would be a strict global minimum of $\operatorname{vol}_{-1}$ over all of $\mathcal{M}_{-1}$. In one form, the hyperbolic σ-conjecture asserts that this is the case.

6.2 Conjecture. (Hyperbolic volume conjecture): Let $M$ be a closed connected oriented hyperbolizable 3-manifold and let $\tilde\gamma \in \mathcal{M}_{-1}$ be a hyperbolic metric on $M$. Then

$$\inf \operatorname{vol}_{-1} = \operatorname{vol}(M, \tilde\gamma) \tag{6.5}$$

Equivalently, γ˜ is a global minimum of vol−1 , i.e., for all γ ∈ M−1 , vol(M, γ) ≥ vol(M, γ˜ )

(6.6)

in which case, up to isometry, γ˜ is a strict global minimum, i.e., equality occurs if and only if γ is isometric to γ˜ . As we shall see in Section 7, this conjecture is one form of the hyperbolic σ-conjecture which can be thought of as having equivalent topological (Conjecture 7.2), geometrical (Conjecture 6.2), and relativistic (Section 10.9 and Conjecture 20.1) formulations. The interesting thing about the above geometrical formulation is that it can be stated in a self-contained way relying only on the volume functional and the “constraint submanifold” M−1 and without relying on the concept of the σ-constant. If the hyperbolic volume conjecture is true, then the restricted volume functional vol−1 : M−1 → R+ can be described succinctly as having a

unique critical point (up to isometry) which is a strict global minimum (in the non-isometric directions). On the other hand, if the hyperbolic volume conjecture is false, then $\operatorname{vol}(M, \tilde\gamma) > \inf \operatorname{vol}_{-1}$, in which case there exists a metric $\gamma' \in \mathcal{M}_{-1}$ such that $\operatorname{vol}(M, \tilde\gamma) > \operatorname{vol}(M, \gamma')$. In this case, $\tilde\gamma$ would not be a global minimum of $\operatorname{vol}_{-1}$, although it would still be a strict local minimum. Thus in our current state of knowledge, $\operatorname{vol}_{-1}$ can be succinctly described as having a unique (up to isometry) critical point $\tilde\gamma$ which is a strict local minimum (in the non-isometric directions) but for which it is unknown whether $\tilde\gamma$ is a global minimum.

7 The σ-Constant of M

Before introducing the reduced Hamiltonian, we pause to discuss the σ-constant and the Gromov norm, two important topological invariants of closed $n$-manifolds, $n \ge 3$. The Yamabe functional is defined by

$$Y : \mathcal{M} \longrightarrow \mathbb{R}, \qquad g \longmapsto \frac{\int_M R(g)\,\mu_g}{\left(\int_M \mu_g\right)^{(n-2)/n}} \tag{7.1}$$

a volume-normalized total scalar curvature functional, weighted to be invariant under homothetic transformations of $g$, i.e., if $c \in \mathbb{R}^+$, then $Y(cg) = Y(g)$. For fixed $g \in \mathcal{M}$, let $[g] = \mathcal{P}g \in \mathcal{M}/\mathcal{P}$ denote the pointwise conformal equivalence class of $g$, and let

$$Y([g]) = \inf_{p \in \mathcal{P}} Y(pg)$$

The σ-constant of $M$, $\sigma(M)$, is defined by

$$\sigma(M) = \sup_{[g] \in \mathcal{M}/\mathcal{P}} Y([g]) = \sup_{[g] \in \mathcal{M}/\mathcal{P}}\ \inf_{p \in \mathcal{P}} Y(pg) \tag{7.2}$$

Thus the σ-constant is defined by first fixing $g \in \mathcal{M}$, minimizing the Yamabe functional in the conformal class $\mathcal{P}g = \{ pg \mid p \in \mathcal{P} \}$, and then maximizing the Yamabe functional over all pointwise conformal classes $\mathcal{M}/\mathcal{P}$. Thus $\sigma(M)$ is defined by a minimax process (first infimum, then supremum) analogous to the minimax process (first supremum, then infimum) used in Morse theory. The first step of this procedure, corresponding to the min part, is the Yamabe problem and has been solved. Thus in each conformal class $\mathcal{P}g$, there exists a constant scalar curvature metric $\bar g \in \mathcal{P}g$ that realizes the infimum of $Y$ over the conformal class $\mathcal{P}g$, i.e.,

$$Y(\bar g) = \inf_{p \in \mathcal{P}} Y(pg) \tag{7.3}$$

and so from (7.1),

$$Y(\bar g) = R(\bar g)\,\operatorname{vol}(M, \bar g)^{2/n} \tag{7.4}$$

The second step of the minimax procedure, maximizing over the conformal classes, is considerably more difficult and has not been solved. However, from (7.2), (7.3), and (7.4), $\sigma(M)$ can be expressed as a supremum over the space of Yamabe metrics $\mathcal{Y}(M)$,

$$\sigma(M) = \sup_{[g] \in \mathcal{M}/\mathcal{P}} \Big( \inf_{p \in \mathcal{P}} Y(pg) \Big) = \sup_{\bar g \in \mathcal{Y}(M)} Y(\bar g) = \sup_{\bar g \in \mathcal{Y}(M)} R(\bar g)\,\operatorname{vol}(M, \bar g)^{2/n} \tag{7.5}$$

If $M$ is of Yamabe type −1 and if $\bar g \in \mathcal{Y}(M)$, then $R(\bar g) = \text{constant} < 0$ so from (7.5),

$$\sigma(M) \le 0 \tag{7.6}$$

Since $Y(\bar g) = R(\bar g)\operatorname{vol}(M, \bar g)^{2/n}$ is homothetically invariant, we can without loss of generality restrict either to metrics of unit volume or to metrics $\gamma \in \mathcal{M}_{-1}$. We remark that the usual procedure is to restrict to metrics of unit volume and to let the constant scalar curvature float; see e.g., Anderson [1997]. However, we take the dual approach and restrict to metrics in $\mathcal{M}_{-1}$ and let the volume float. We take this dual approach since, as we shall see in Sections 12 and 19, an important part of our program involves studying the volume collapse of the conformal metrics for the reduced Einstein flow, which obviously could not occur if the metrics were constrained to have unit volume. Thus, constraining $\gamma \in \mathcal{M}_{-1}$, we have from (7.5) the following equation for $\sigma(M)$ (see Fischer and Moncrief [2000] for more details).

7.1 Proposition. Let $M$ be a closed connected oriented $n$-manifold, $n \ge 3$, of Yamabe type −1. Then

$$\sigma(M) = -\Big( \inf_{\gamma \in \mathcal{M}_{-1}} \operatorname{vol}(M, \gamma) \Big)^{2/n} \tag{7.7}$$

Thus for $\gamma \in \mathcal{M}_{-1}$, we have the global lower bound for the restricted volume functional,

$$\operatorname{vol}_{-1}(M, \gamma) \ge \inf \operatorname{vol}_{-1} = (-\sigma(M))^{n/2} \tag{7.8}$$

However, unless $\sigma(M) < 0$, no positive bound away from zero is contained in (7.8), and indeed, there are no known examples of $M$ with $\sigma(M) < 0$ (see Anderson [1997]). Thus a necessary condition for $\bar\gamma \in \mathcal{M}_{-1}$ to realize $\inf \operatorname{vol}_{-1}$ is that $\sigma(M) < 0$. We say $\bar g \in \mathcal{M}$ realizes $\sigma(M)$ if $Y(\bar g) = \sigma(M)$, in which case $\bar g \in \mathcal{Y}(M)$ and $Y(\bar g) = R(\bar g)\operatorname{vol}(M, \bar g)^{2/n} = \sigma(M)$.

Since $Y$ is homothetically invariant, if $\bar g \in \mathcal{Y}(M)$ realizes $\sigma(M)$ and if $c \in \mathbb{R}^+$, then $c\bar g$ also realizes $\sigma(M)$. Thus $\bar\gamma \in \mathcal{M}_{-1}$ realizes $\sigma(M)$ if and only if $\bar\gamma$ realizes $\inf \operatorname{vol}_{-1}$ if and only if $\bar\gamma$ is a global minimum of $\operatorname{vol}_{-1} : \mathcal{M}_{-1} \to \mathbb{R}^+$, in which case $\operatorname{vol}_{-1}$ is bounded away from zero. In any of these cases, $\sigma(M) < 0$, and $\bar\gamma$ is an Einstein metric with $\operatorname{Ric}(\bar\gamma) = -(1/n)\bar\gamma$.

If $n = 3$ and $\bar g \in \mathcal{M}$ realizes $\sigma(M)$, then $\bar g$ is a hyperbolic metric (and thus $M$ must be hyperbolizable). Thus $M$ being hyperbolizable and $\bar g$ being hyperbolic is a necessary condition for a metric to realize $\sigma(M)$. Conversely, however, it is unknown whether this condition is sufficient, i.e., whether a hyperbolic metric $\bar g$ in fact realizes $\sigma(M)$ (so that $Y(\bar g) = \sigma(M)$), although this is conjectured to be true (see Anderson [1997]). From Proposition 7.1, the following is equivalent to Conjecture 6.2.

7.2 Conjecture. Hyperbolic σ-conjecture (topological formulation): Let $M$ be a closed connected oriented hyperbolizable 3-manifold and let $\bar g \in \mathcal{M}$ be a hyperbolic metric on $M$. Then $\bar g$ realizes $\sigma(M)$,

$$Y(\bar g) = \sigma(M) \tag{7.9}$$

We remark that in this formulation, g¯ need not be normalized to lie in M−1 since the Yamabe functional is homothetically invariant.
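The homothetic invariance used throughout this section, and the passage between the unit-volume and the R = −1 normalizations, can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, the metric is represented solely by its constant scalar curvature and its volume, and it assumes the standard scaling rules R(cg) = R(g)/c and vol(M, cg) = c^{n/2} vol(M, g) for a constant c > 0.

```python
def yamabe_value(scalar_curvature: float, volume: float, n: int) -> float:
    """Y = R vol^(2/n) for a constant scalar curvature metric, as in (7.4)."""
    return scalar_curvature * volume ** (2.0 / n)


def rescale(scalar_curvature: float, volume: float, c: float, n: int):
    """Curvature and volume of c*g, using R(cg) = R(g)/c and vol(cg) = c^(n/2) vol(g)."""
    return scalar_curvature / c, c ** (n / 2.0) * volume


def normalize_to_minus_one(scalar_curvature: float, volume: float, n: int):
    """Rescale a metric with constant R < 0 so that it lies in M_{-1}."""
    if scalar_curvature >= 0:
        raise ValueError("normalization to R = -1 needs R < 0")
    return rescale(scalar_curvature, volume, -scalar_curvature, n)


# Example: n = 3, constant scalar curvature -6, volume 2 (illustrative numbers only).
n, R, vol = 3, -6.0, 2.0
R1, vol1 = normalize_to_minus_one(R, vol, n)
assert abs(R1 + 1.0) < 1e-12
# Y is unchanged by the rescaling ...
assert abs(yamabe_value(R1, vol1, n) - yamabe_value(R, vol, n)) < 1e-9
# ... and in M_{-1} the volume equals (-Y)^(n/2), the form appearing in (7.8).
assert abs(vol1 - (-yamabe_value(R, vol, n)) ** (n / 2.0)) < 1e-9
```

In the normalization R = −1 the Yamabe value is −vol^{2/n}, which is why maximizing Y over Yamabe metrics becomes minimizing the volume over M−1 in (7.7).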

8 The Gromov Norm ‖M‖

The σ-constant of a closed $n$-manifold $M$, $n \ge 3$, is a topological invariant and for $n = 3$ and $M$ hyperbolizable, is conjecturally related to the hyperbolic volume, i.e., to the volume of a normalized hyperbolic metric, here normalized by $R(\tilde\gamma) = -1$. On the other hand, by results of Gromov and Thurston, a normalized hyperbolic volume is itself a topological invariant. Here we show an inequality between these two topological invariants and show that for $n = 3$ the hyperbolic σ-conjecture equates these two topological invariants (up to a constant multiple and a power).

Let $M$ be a closed connected oriented $n$-manifold. Then the $n$-th homology group $H_n(M; \mathbb{Z}) \cong \mathbb{Z}$ has a preferred generator denoted by $[M]$ and called the fundamental class of $M$. $[M]$ can be viewed as a generator of $H_n(M; \mathbb{R}) \cong H_n(M; \mathbb{Z}) \otimes \mathbb{R} \cong \mathbb{R}$ as a real vector space. The homology spaces $H_k(M; \mathbb{R}) = Z_k(M; \mathbb{R})/B_k(M; \mathbb{R})$ of $k$-cycles mod $k$-boundaries can be endowed with a quotient semi-norm $\|\cdot\|$ coming from a norm on $Z_k(M; \mathbb{R})$. Using this semi-norm, the Gromov

norm of $M$ is defined by $\|M\| = \|[M]\|$. By its definition, $\|M\|$ is a topological invariant of $M$ (see Benedetti [1992], C.3–4, for further details). Now let $M$ be a closed connected oriented hyperbolizable $n$-manifold, $n \ge 3$, and let $g_r$ be a hyperbolic metric on $M$ with hyperbolic radius $r > 0$ and constant sectional curvature $K(g_r) = -(1/r^2)$. Then the Gromov–Thurston theorem (Gromov [1982]) asserts that for hyperbolizable $M$

$$\|M\| = \frac{\operatorname{vol}(M, g_r)}{r^n v_n} \tag{8.1}$$

here slightly modified to allow for a hyperbolic metric of arbitrary hyperbolic radius, and where the positive number $v_n$ involves volumes of simplices in the simply connected unit covering manifold $H^n_1$ together with its boundary and thus depends only on $n$ and not on $M$. Thus the hyperbolic volume $\operatorname{vol}(M, g_r)$ is proportional to the Gromov norm $\|M\|$, a topological invariant, by a universal constant for each $n$ and by a necessary scaling factor $r^n$. Thus a normalized hyperbolic volume is itself a topological invariant. Another way of looking at (8.1) is that on a hyperbolizable manifold, the Gromov norm is always realized by a hyperbolic metric. If $g_r$ is a hyperbolic metric with hyperbolic radius $r$, then

$$R(g_r) = n(n-1)K(g_r) = -\frac{n(n-1)}{r^2}.$$

Thus if a hyperbolic metric $\tilde\gamma$ lives in $\mathcal{M}_{-1}$, then $\tilde\gamma$ has hyperbolic radius $r = (n(n-1))^{1/2}$ and hyperbolic volume

$$\operatorname{vol}(M, \tilde\gamma) = (n(n-1))^{n/2}\, v_n \|M\| \tag{8.2}$$

Thus for $n = 3$,

$$\operatorname{vol}(M, \tilde\gamma) = 6^{3/2}\, v_3 \|M\| \tag{8.3}$$

Thus from (7.7) and (8.2), we find that the topological invariants $\|M\|$ and $\sigma(M)$ are related by the following.

8.1 Proposition. Let $M$ be a closed connected oriented hyperbolizable $n$-manifold, $n \ge 3$, and let $\tilde\gamma \in \mathcal{M}_{-1}$ be a hyperbolic metric. Then

$$(n(n-1))^{n/2}\, v_n \|M\| = \operatorname{vol}(M, \tilde\gamma) \ge \inf \operatorname{vol}_{-1} = |\sigma(M)|^{n/2} \tag{8.4}$$

If $n = 3$, then

$$6^{3/2}\, v_3 \|M\| \ge |\sigma(M)|^{3/2} \tag{8.5}$$

and this inequality is an equality if and only if the hyperbolic σ-conjecture is true. Thus for hyperbolizable 3-manifolds, the topological invariants $\sigma(M)$ and $\|M\|$ are equal (up to a numerical factor and a power) if and only if the hyperbolic σ-conjecture is true.
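The normalization constants in (8.2) through (8.5) follow from R = n(n−1)K and K = −1/r². The following sketch makes that bookkeeping explicit. It is an illustration only: the function names are hypothetical, and the Gromov norm and the constant v_n are treated as plain numerical inputs supplied by the caller rather than computed.

```python
import math


def hyperbolic_radius(n: int) -> float:
    """Radius r of a hyperbolic metric with R = -1: r^2 = n(n - 1)."""
    return math.sqrt(n * (n - 1))


def sectional_curvature(n: int) -> float:
    """K = -1 / r^2 for the R = -1 normalization."""
    return -1.0 / hyperbolic_radius(n) ** 2


def normalized_hyperbolic_volume(gromov_norm: float, v_n: float, n: int) -> float:
    """vol(M, gamma~) = (n(n-1))^(n/2) v_n ||M||, as in (8.2)."""
    return hyperbolic_radius(n) ** n * v_n * gromov_norm


def sigma_lower_bound(gromov_norm: float, v_n: float, n: int) -> float:
    """Bound on sigma(M) implied by (8.4): sigma(M) >= -vol(M, gamma~)^(2/n)."""
    return -normalized_hyperbolic_volume(gromov_norm, v_n, n) ** (2.0 / n)


# n = 3: K = -1/6, and the volume factor is 6^(3/2), matching Theorem 6.1 and (8.3).
assert math.isclose(sectional_curvature(3), -1.0 / 6.0)
assert math.isclose(normalized_hyperbolic_volume(1.0, 1.0, 3), 6.0 ** 1.5)
```

With the placeholder inputs v_3 ‖M‖ = 1 the bound reads σ(M) ≥ −6, and by Proposition 8.1 equality would hold exactly when the hyperbolic σ-conjecture is true.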

9 The Conformal Volume Collapse of 3-Manifolds

As we shall see in Section 12, the reduced Hamiltonian tends to approach $\sigma(M)$ along non-constant integral curves of the reduced Einstein flow. In so doing, the flow tends either to volume collapse $M$ in certain cases ($\sigma(M) = 0$) or, conjecturally, to induce a geometrization of $M$ in other cases ($\sigma(M) < 0$), as is suggested by calculations regarding spatially compact cosmological models (see Section 19). In a more geometrical setting, these issues of collapse are discussed by Anderson [1993, 1997, 2000] and we here briefly review his results, modified somewhat to our setting, since Anderson takes metrics with unit volume and varying constant scalar curvature while we adopt the dual approach of fixing the constant scalar curvature and allowing the volume to vary (see also the remarks preceding Proposition 7.1).

A 3-manifold $M$ is a Seifert (fibered) space if $M$ admits a foliation by circles. For example, if $S^1$ acts freely on $M$, then $M$ is the total space of an $S^1$-bundle over a surface and is a Seifert fibered space. More generally, if $S^1$ acts without fixed points (locally free), then $M$ is a Seifert fibered space, and in either case the fibers of $M$ are the orbits of the $S^1$-action. We remark that with the exception of the connected sum $P^3 \# P^3$, all Seifert fibered spaces are irreducible.

The graph manifolds generalize the Seifert fibered spaces. A closed orientable 3-manifold $M$ is a graph manifold if there is a finite collection $\mathcal{T} = \{T^2_i\}$ of disjoint embedded tori $T^2_i \subset M$ such that each component $M_j$ of $M \setminus \cup T^2_i$ is a Seifert fibered space. Thus a graph manifold is a union of Seifert fibered spaces glued together by toral automorphisms along toral boundary components. In particular, a Seifert fibered manifold is a graph manifold. We remark that in general graph manifolds are closed under connected sums so that a graph manifold need not be irreducible. This contrasts with the situation for Seifert spaces, which are irreducible (with the exception of $P^3 \# P^3$).
For an arbitrary graph manifold M , irreducible or not, σ(M ) ≥ 0. Since manifolds of Yamabe type −1 have σ(M ) ≤ 0 (see (7.6)), a graph manifold (or a Seifert ﬁbered space) of Yamabe type −1 must therefore have σ(M ) = 0. The importance of graph manifolds is that they can be characterized geometrically as exactly the class of 3-manifolds which admit a volume collapse by Yamabe metrics with bounded Ricci curvature (or equivalently, since n = 3, with either bounded Riemannian or sectional curvature). Thus if M is a graph manifold, there exists a sequence of Yamabe metrics gi ∈

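$\mathcal{M}$, $i \in \mathbb{Z}^+$ = the positive integers, such that

$$\|\operatorname{Ric}(g_i)\|_{g_i} \le B \quad \text{and} \quad \lim_{i \to \infty} \operatorname{vol}(M, g_i) = 0 \tag{9.1}$$

where $B > 0$, $\operatorname{Ric}(g_i)$ is the Ricci curvature tensor of $g_i$, and $\|\cdot\|_{g_i}$ is the norm of a 2-covariant tensor, computed with respect to the metric $g_i$. In local coordinates, $\|\operatorname{Ric}(g)\|^2_g = g^{ij} g^{kl} R_{ik} R_{jl}$. We shall say that a sequence $\{g_i\}_{i \in \mathbb{Z}^+}$ that satisfies (9.1) volume collapses $M$ with bounded curvature.

Now suppose $M$ is of Yamabe type −1 so that $\sigma(M) \le 0$. Let $\gamma_i \in \mathcal{M}_{-1}$ be a sequence of metrics that tries to realize the σ-constant of $M$, so that

$$Y(\gamma_i) \longrightarrow \sigma(M) \quad \text{as } i \to \infty \tag{9.2}$$

Equivalently (see Section 7), and perhaps more geometrically, $\operatorname{vol}(M, \gamma_i)$ tries to realize $\inf \operatorname{vol}_{-1}$,

$$\operatorname{vol}(M, \gamma_i) \longrightarrow \inf \operatorname{vol}_{-1} = |\sigma(M)|^{3/2} \quad \text{as } i \to \infty \tag{9.3}$$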

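Suppose $\{\gamma_i\}$ has a convergent subsequence $\{\gamma_{i_k}\}$, $\gamma_{i_k} \to \tilde\gamma$, $k \in \mathbb{Z}^+$. Since $\mathcal{M}_{-1}$ is a closed submanifold of $\mathcal{M}$ (see Theorem 4.1), the limit metric $\tilde\gamma$ is in $\mathcal{M}_{-1}$ and by the continuity of the volume functional,

$$\operatorname{vol}(M, \gamma_{i_k}) \longrightarrow \operatorname{vol}(M, \tilde\gamma) \quad \text{as } k \to \infty \tag{9.4}$$

Comparing (9.3) and (9.4) and by the uniqueness of limits,

$$\operatorname{vol}(M, \tilde\gamma) = \inf \operatorname{vol}_{-1} = |\sigma(M)|^{3/2} \tag{9.5}$$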

in which case $\tilde\gamma$ is a minimum of $\operatorname{vol}_{-1}$, hence a critical point, and hence a hyperbolic metric (Theorem 6.1). Thus if a sequence that tries to realize $\inf \operatorname{vol}_{-1}$ has a convergent subsequence, then the limit metric $\tilde\gamma$ must realize $\inf \operatorname{vol}_{-1}$, in which case the hyperbolic σ-conjecture would be true. In particular, a necessary condition for a sequence $\{\gamma_i\}$ that realizes the σ-constant of $M$ to have a convergent subsequence is that $M$ be hyperbolizable.

On the other hand, if $M$ is not hyperbolizable, a sequence $\gamma_i \in \mathcal{M}_{-1}$ that realizes $\sigma(M)$ cannot converge. Thus there must exist subsets in $M$ on which the sequence $\gamma_i$ degenerates and the goal is to understand this degeneration in terms of the topology of $M$. Anderson ([1993]–[1999]) has conjecturally resolved this problem as follows: For a closed connected orientable irreducible 3-manifold $M$ with $\sigma(M) \le 0$, there exists a finite collection of disjoint embedded incompressible tori $\mathcal{T} = \{T^2_i\}$, $T^2_i \subset M$, which separate $M$ into a union of two types of manifolds,

$$M \setminus \cup T^2_i = (\cup H_j) \cup (\cup G_k) \tag{9.6}$$

where each $(H_j, \tilde\gamma_j)$, $\tilde\gamma_j \in \mathcal{M}_{-1}(H_j)$, is a complete connected non-compact hyperbolic manifold of finite volume $\operatorname{vol}(H_j, \tilde\gamma_j)$ such that the collection of boundary components of the hyperbolizable domain $\cup H_j$ forms exactly the collection $\mathcal{T}$. Each $G_k$ is a graph manifold with toral boundary components whose union again gives $\mathcal{T}$. Moreover, the σ-constant of $M$ is given by

$$\sigma(M) = -\Big( \sum_j \operatorname{vol}(H_j, \tilde\gamma_j) \Big)^{2/3} \tag{9.7}$$

modulo a factor of 6, since Anderson’s hyperbolic metrics have constant sectional curvature = −1 and constant scalar curvature = −6, whereas our hyperbolic metrics have constant scalar curvature = −1 and constant sectional curvature = −(1/6).

These conjectures imply that a closed connected orientable irreducible 3-manifold $M$ with $\sigma(M) \le 0$ is a union of complete hyperbolizable manifolds (i.e., each $H_j$ admits a complete hyperbolic metric) and graph manifolds glued together along incompressible toral boundary components. Note in particular that only the hyperbolizable components of $M$ (if there are any) contribute to the σ-constant and that the graph components $\cup G_k$ of $M$ do not. Thus from (9.6) and (9.7), this conjectural information can be organized into the following three cases:

Case I: $\sigma(M) < 0$ and $M$ hyperbolizable. This is the pure hyperbolizable case in which case there are no graph components.

Case II: $\sigma(M) < 0$ and $M$ not hyperbolizable. This is the hybrid case in which case there are both hyperbolizable and graph components.

Case III: $\sigma(M) = 0$ ($M$ is a graph manifold). This is the pure graph case in which case there are no hyperbolizable components.

Thus the cases $\sigma(M) < 0$ (cases I and II) and $\sigma(M) = 0$ (case III) distinguish between the presence or absence of hyperbolizable components, whereas we note that when $\sigma(M) < 0$, $\sigma(M)$ is not a fine enough invariant to distinguish between the absence (pure hyperbolizable case) or presence (hybrid case) of graph components. We can also think of case I and cases II and III as distinguishing between the absence (case I) or presence (cases II and III) of graph components.

Subject to Anderson’s conjectures, the limiting behavior of a sequence $\gamma_i \in \mathcal{M}_{-1}$ on $M$ that tries to realize the σ-constant (9.2) (or (9.3)) with bounded curvature $\|\operatorname{Ric}(\gamma_i)\|_{\gamma_i} \le B$ can be described geometrically as follows.
If σ(M ) < 0 (cases I and II), then from (9.3), vol(M, γi ) → |σ(M )|3/2 > 0 so that the sequence {γi } does not fully volume collapse M in either case. If M is hyperbolizable i.e., if there are no graph components, (case I, the pure hyperbolizable case), then the sequence {γi } has a subsequence {γik }

that converges (up to isometry) on M to the unique (up to isometry) hyperbolic metric γ˜ ∈ M−1 (M ). If there are graph components (case II, the hybrid case), then the sequence {γi } has a subsequence {γik } that converges (up to isometry) on the connected non-compact hyperbolizable components Hj to the unique (up to isometry) complete ﬁnite-volume hyperbolic metric γ˜j ∈ M−1 (Hj ) and volume collapses the graph components Gk of the graph manifold domain ∪Gk along either S1 or T2 ﬁbers, or in the Nil case (see Section 19), totally collapses the graph manifold domain. Thus in this case there is partial volume collapse in the sense that the sequence volume collapses M on the graph components but not on the hyperbolizable components. Moreover, this volume collapse occurs with bounded curvature; see Figure 9.1, adopted from Anderson [1997], where here τ → 0− (or t → ∞) refers to conformal volume collapse in the relativistic case under the reduced Hamiltonian (see Section 12).

Figure 9.1. Conjectural degeneration of a manifold into geometric pieces: Case II (as τ → 0− or t → ∞).

If σ(M ) = 0 (case III), then M is pure graph and from (9.3), vol(M, γi ) → |σ(M )|3/2 = 0 so the sequence {γi } fully volume collapses M along S1 or T2 ﬁbers and again this collapse occurs with bounded curvature as is characteristic of graph manifolds. In these latter two cases of partial and full volume collapse (cases II and III) when graph components are present, the metrics are not converging and there is no limiting metric in general, but in these cases the incompressibly embedded graph manifold structure of M describes how the degeneration occurs. In Sections 12 and 19 we shall see how the integral curves of the reduced Einstein system realize these results naturally in a relativistic setting.
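The three conjectural cases and the limiting volume in (9.3) can be summarized in a small lookup. This is a sketch of the bookkeeping only, conditional on Anderson's conjectures as described above. The function names and the returned labels are hypothetical, and the inputs (the value of σ(M) and whether M is hyperbolizable) are assumed to be known.

```python
def limiting_conformal_volume(sigma: float) -> float:
    """Limit of vol(M, gamma_i) in (9.3) for n = 3: |sigma(M)|^(3/2)."""
    if sigma > 0:
        raise ValueError("Yamabe type -1 requires sigma(M) <= 0")
    return abs(sigma) ** 1.5


def collapse_case(sigma: float, hyperbolizable: bool) -> str:
    """Conjectural case label for an irreducible M with sigma(M) <= 0."""
    if sigma > 0:
        raise ValueError("Yamabe type -1 requires sigma(M) <= 0")
    if sigma == 0:
        if hyperbolizable:
            # Under the conjectures, a hyperbolizable M has sigma(M) < 0.
            raise ValueError("sigma(M) = 0 is inconsistent with M hyperbolizable")
        return "III: pure graph, full volume collapse"
    if hyperbolizable:
        return "I: pure hyperbolizable, convergence to the hyperbolic metric"
    return "II: hybrid, partial volume collapse on the graph components"


print(collapse_case(-1.0, hyperbolizable=False))
print(limiting_conformal_volume(-1.0))
```

Only the sign of σ(M) decides between collapse with zero limiting volume (case III) and a positive limiting volume (cases I and II); telling cases I and II apart needs the additional topological input.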

10 The Reduced Hamiltonian and Its Properties

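Using the Lichnerowicz transform introduced in Section 5 and a particular global temporal coordinate gauge function

$$\tau : \mathbb{R}^+ \longrightarrow \mathbb{R}^-, \qquad t \longmapsto \tau(t) = -\left(\frac{2}{3t}\right)^{1/2} \tag{10.1}$$

where $\mathbb{R}^+ = (0, \infty)$, Hamiltonian reduction of Einstein’s equations can be carried out by introducing a non-local time-dependent dimensionless reduced Hamiltonian

$$H_{\rm reduced} : \mathbb{R}^- \times \mathcal{P}_{\rm reduced} \longrightarrow \mathbb{R}, \qquad (\tau, \gamma, p^{TT}) \longmapsto -\tau^3 \int_M \varphi^6 \mu_\gamma \tag{10.2}$$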

where $\varphi = \varphi(\tau, \gamma, p^{TT}) > 0$ is the Lichnerowicz conformal factor discussed in Section 5 (see Fischer and Moncrief [1997] for the details of this reduction). Here we discuss the resulting reduced system of Einstein equations and the properties of its flow.

That $H_{\rm reduced}$ is non-local follows from the fact that the conformal factor $\varphi = \varphi(\tau, \gamma, p^{TT})$ solves the elliptic Lichnerowicz equation (5.1) and thus is a non-local function of its arguments $(\tau, \gamma, p^{TT})$. Consequently, it seems unavoidable that our Hamiltonian reduction process results in a non-local formulation of dynamics.

Our reduced Hamiltonian has both an explicit and an implicit dependence on time. The explicit dependence is through the overall factor $-\tau^3(t)$, whereas the implicit dependence is through the conformal factor $\varphi = \varphi(\tau(t), \gamma(t), p^{TT}(t))$. Thus our reduction is time-dependent and involves a contact manifold, as opposed to a simpler symplectic one. That $H_{\rm reduced}$ depends essentially upon time, and not merely through the overall factor of $-\tau^3$, results from the inevitable volume expansion (or contraction in the time-reversed case) of our model universes. In fact, the decaying factor $-\tau^3$ precisely cancels the increase in physical volume, leading to a constant conformal volume, only for the very special case in which $M$ is hyperbolizable and the reduced Cauchy data is $(\tilde\gamma, 0)$, where $\tilde\gamma \in \mathcal{M}_{-1}$ is hyperbolic, in which case the resulting spacetime is flat.

We also remark that the model universes under study here could not cease expanding and begin to collapse, since the onset of such a collapse would necessitate a “maximal” hypersurface having $\tau = 0$. But the Hamiltonian constraint $H(g, \pi) = 0$ (2.2) would then yield the inequality

$$R(g) = (\pi \cdot \pi)\,\mu_g^{-2} \ge 0$$

for the scalar curvature of the Riemannian spatial metric $g$, and this is impossible to satisfy on a manifold of Yamabe type −1 (see Section 3).

Therefore the expected maximal range of the constant mean curvature $\tau$ is the interval $(-\infty, 0)$ with $\tau = -\infty$ corresponding to the big bang and $\tau \to 0^-$ corresponding to infinite cosmological expansion. The range of the time coordinate $t = \frac{2}{3\tau^2}$ is then $(0, \infty)$, vanishing at the big bang and tending to positive infinity in the limit of infinite expansion. We remark that to prove that a solution determined by Cauchy data prescribed at some initial time $t_0 \in (0, \infty)$ actually exhausts the range $[t_0, \infty)$ and has the asymptotic volume properties suggested above is a difficult global existence problem that we shall not deal with here. Nevertheless, one of our main motivations for this work is the hope that reduction will lead to advances in the study of the global existence question for Einstein’s equations.

Lastly, with our choice of time function, the reduced Hamiltonian is dimensionless since in physical units,

$$\int_M \varphi^6 \mu_\gamma = \int_M \mu_g = \operatorname{vol}(M, g)$$

has dimensions of spatial volume $\sim (\text{length})^3$ and $\tau$ has dimensions of $(\text{length})^{-1}$. That $\tau$ has dimensions of $(\text{length})^{-1}$ follows from the equation

$$\tau = \operatorname{tr}_g k = -\frac{1}{2N}\, g^{ij} \frac{\partial g_{ij}}{\partial t}$$

(when the shift vector field $X = 0$), so that $\tau$ has dimensions of

$$(N \cdot \text{time})^{-1} = (\text{proper time})^{-1} = (\text{length})^{-1}.$$
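The time gauge (10.1) and its inverse t = 2/(3τ²) are easy to check against each other numerically. The sketch below is illustrative only and uses hypothetical function names. It also evaluates the explicit prefactor −τ³ of (10.2) as a function of t.

```python
import math


def tau_of_t(t: float) -> float:
    """Gauge function (10.1): tau(t) = -(2 / (3 t))^(1/2), defined for t > 0."""
    if t <= 0:
        raise ValueError("t must lie in (0, infinity)")
    return -math.sqrt(2.0 / (3.0 * t))


def t_of_tau(tau: float) -> float:
    """Inverse relation t = 2 / (3 tau^2), defined for tau < 0."""
    if tau >= 0:
        raise ValueError("tau must lie in (-infinity, 0)")
    return 2.0 / (3.0 * tau ** 2)


def explicit_time_factor(t: float) -> float:
    """Explicit prefactor -tau(t)^3 = (2 / (3 t))^(3/2) of the reduced Hamiltonian."""
    return -tau_of_t(t) ** 3


for t in (0.01, 1.0, 250.0):
    assert math.isclose(t_of_tau(tau_of_t(t)), t)   # the two maps are mutually inverse
    assert tau_of_t(t) < 0                           # tau stays in R^-

# tau increases toward 0^- and the prefactor decays as t grows.
assert tau_of_t(1.0) < tau_of_t(10.0) < 0
assert explicit_time_factor(10.0) < explicit_time_factor(1.0)
```

Small t corresponds to large negative τ (the big bang end of the range), and t → ∞ corresponds to τ → 0−, in line with the discussion above.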

The main advantage of having a dimensionless reduced Hamiltonian is that only such a reduced Hamiltonian can have a topological significance, and indeed, one of our main results (1.4) shows that the infimum of our reduced dimensionless Hamiltonian is, up to a constant multiple and a power, equal to the σ-constant of $M$, a topological invariant. With respect to the reduced Hamiltonian $H_{\rm reduced}$, the resulting reduced Einstein equations (modulo diffeomorphisms of $M$) are given by the following unconstrained non-local time-dependent Hamiltonian system

$$\frac{\partial \gamma}{\partial t} = \frac{\delta H_{\rm reduced}}{\delta p^{TT}}(\tau, \gamma, p^{TT}), \qquad \frac{\partial p^{TT}}{\partial t} = -\frac{\delta H_{\rm reduced}}{\delta \gamma}(\tau, \gamma, p^{TT}) \tag{10.3}$$

where $\delta H_{\rm reduced}/\delta \gamma$ and $\delta H_{\rm reduced}/\delta p^{TT}$ denote the functional derivatives of $H_{\rm reduced}$ with respect to the variables $\gamma$ and $p^{TT}$, respectively. We have shown the following with regard to this system of equations.

1. Fixed points of the reduced Einstein flow: The only zeros or equilibrium points of the reduced Einstein equations, or equivalently,

the ﬁxed points of the reduced Einstein ﬂow, occur when M is hyperbolizable, in which case, if γ˜ ∈ M−1 is hyperbolic, then the ﬁxed point is the hyperbolic ﬁxed point (˜ γ , 0) ∈ Preduced , which by Mostow rigidity is unique up to isometry. At a ﬁxed point (˜ γ , 0), Hreduced is γ , 0)} ⊂ R− × Preduced , i.e., for any constant along the ray R− × {(˜ τ ∈ R− , Hreduced (τ, γ˜ , 0) = ( 32 )3/2 vol(M, γ˜ ) = 33 v3 M

(10.4)

where M is the Gromov norm of M . 2. Strict local minima of Hreduced : There exists an open neighborγ , 0) such that hood U(˜γ ,0) ⊂ Preduced of the hyperbolic ﬁxed point (˜ (˜ γ , 0) is a strict local minimum of Hreduced in the following sense. If (τ , γ , pT T ) ∈ R− × U(˜γ ,0) , then Hreduced (τ , γ , pT T ) ≥ ( 32 )

3/2

vol(M, γ˜ )

(10.5)

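with equality if and only if $\gamma \cong \tilde\gamma$ and $p^{TT} = 0$ (however, $\tau$ is not determined, since $H_{\mathrm{reduced}}$ is constant along the entire ray $\mathbb{R}^- \times \{(\tilde\gamma, 0)\}$).

3. Monotonicity of the reduced Hamiltonian: Along any non-constant integral curve $c_{\mathrm{reduced}}(t) = (\gamma(t), p^{TT}(t)) \in P_{\mathrm{reduced}}$ of the reduced Einstein flow, $H_{\mathrm{reduced}}(\tau(t), \gamma(t), p^{TT}(t))$ is strictly monotonically decreasing (where $\tau(t) = -(2/(3t))^{1/2}$). Thus if

$$c_{\mathrm{reduced}} : (t_1, t_2) \subseteq \mathbb{R}^+ \longrightarrow P_{\mathrm{reduced}}, \qquad t \longmapsto c_{\mathrm{reduced}}(t) = (\gamma(t), p^{TT}(t)),$$

$0 \le t_1 < t_2 \le \infty$, is a non-constant integral curve of the reduced Einstein equations, then for $t_1 < t' < t'' < t_2$,

$$H_{\mathrm{reduced}}(\tau(t''), \gamma(t''), p^{TT}(t'')) < H_{\mathrm{reduced}}(\tau(t'), \gamma(t'), p^{TT}(t')) \tag{10.6}$$

4. $H_{\mathrm{reduced}}$ majorizes the conformal volume: For any $(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}$,

$$H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) \ge H_{\mathrm{reduced}}(\tau, \gamma, 0) = (\tfrac{3}{2})^{3/2}\,\mathrm{vol}(M, \gamma) \ge (-\tfrac{3}{2}\sigma(M))^{3/2} \tag{10.7}$$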

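5. Infimum of $H_{\mathrm{reduced}}$ and the σ-constant of $M$:

$$\inf H_{\mathrm{reduced}} \equiv \inf_{(\tau,\gamma,p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}} H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) = (-\tfrac{3}{2}\sigma(M))^{3/2} \tag{10.8}$$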

We remark regarding (1) that an equilibrium point, say $x_0$, of a time-dependent dynamical system $X_t$ must satisfy $X_t(x_0) = 0$ for all $t$ for which the dynamical system is defined. Thus in our case, we must show (as we have) that for each fixed $\tau \in \mathbb{R}^-$, $(\tilde\gamma, 0)$ is a critical point of $H_{\mathrm{reduced}}(\tau, \cdot, \cdot)$ with respect to the variables $\gamma$ and $p^{TT}$.

If $M$ is hyperbolizable, $\tilde\gamma \in \mathcal{M}_{-1}$ hyperbolic, and $\tau \in \mathbb{R}^-$, then

$$H_{\mathrm{reduced}}(\tau, \tilde\gamma, 0) = (\tfrac{3}{2})^{3/2}\,\mathrm{vol}(M, \tilde\gamma) = (\tfrac{3}{2})^{3/2}\, v_3 \|M\|$$

is a strict local minimum of $H_{\mathrm{reduced}}$ in the sense of (2) above. The question then naturally arises of whether or not the hyperbolic fixed point $(\tilde\gamma, 0)$ is a global minimum of $H_{\mathrm{reduced}}$ in the same sense. If the hyperbolic σ-conjecture is true, i.e.,

$$\mathrm{vol}(M, \tilde\gamma) = (-\sigma(M))^{3/2},$$

then

$$\inf H_{\mathrm{reduced}} = (-\tfrac{3}{2}\sigma(M))^{3/2} = (\tfrac{3}{2})^{3/2}\,\mathrm{vol}(M, \tilde\gamma),$$

in which case $(\tau, \tilde\gamma, 0)$ is a global minimum of $H_{\mathrm{reduced}}$, i.e., for any $(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}$,

$$H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) \ge (\tfrac{3}{2})^{3/2}\,\mathrm{vol}(M, \tilde\gamma) \tag{10.9}$$

with equality if and only if $\gamma \cong \tilde\gamma$ and $p^{TT} = 0$ (see also (20.1)). Moreover, the spacetime corresponding to the hyperbolic fixed point is just a standard model, a sort of compactified flat Robertson–Walker spacetime (see Section 11).

Questions about whether or not $(\tilde\gamma, 0)$ is a global minimum of $H_{\mathrm{reduced}}$ are analogous to the classical positive mass problem (now affirmatively solved) in general relativity, which concerns the Hamiltonian for the vacuum Einstein equations in case $M$ is asymptotically Euclidean. In that case, the ADM-mass is defined by

$$16\pi\, m(g) = \lim_{r\to\infty} \int_{S_r} (g_{ij,i} - g_{ii,j})\, dS^j \tag{10.10}$$

The positive mass theorem asserts that for asymptotically Euclidean Riemannian metrics $g$ with $R(g) \ge 0$, the mass $m(g) \ge 0$ with equality if and only if $g$ is flat, thereby giving Cauchy data for flat Minkowski space. Thus the hyperbolic σ-conjecture may be thought of as an analogue to the positive mass theorem for spatially compact spacetimes with $H_{\mathrm{reduced}}$ playing the part of the mass functional $m(g)$ and $(\tau, \tilde\gamma, 0)$ with $\tau < 0$ playing the part of the Cauchy data for flat Minkowski space.

We remark that before the positive mass theorem was proven, it was known that the flat metric on $\mathbb{R}^3$ was a unique (up to isometry) strict local minimum (in the non-isometric directions) of $m(g)$ and it was conjectured (but unknown at the time) that the flat metric was in fact a global minimum. This is analogous to our current understanding of the situation regarding $H_{\mathrm{reduced}}$. For now, however, we only know that the hyperbolic data described above yields a strict local minimum that may or may not be a global minimum. We will discuss various other consequences of the properties of $H_{\mathrm{reduced}}$ with respect to the hyperbolic σ-conjecture, the Gromov norm, stability of solutions, and volume collapse throughout the next sections.

11

Reduction from the Spacetime Point of View

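Associated with any integral curve

$$c_{\mathrm{reduced}} : (t_1, t_2) \subseteq \mathbb{R}^+ \longrightarrow P_{\mathrm{reduced}}, \qquad c_{\mathrm{reduced}}(t) = (\gamma(t), p^{TT}(t))$$

of the reduced Einstein equations, there is a physical (or ADM) curve

$$c_{\mathrm{ADM}} : (t_1, t_2) \subseteq \mathbb{R}^+ \longrightarrow \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}, \qquad c_{\mathrm{ADM}}(t) = (g(t), \pi(t))$$

defined by

$$(g(t), \pi(t)) = L(\tau(t), \gamma(t), p^{TT}(t)) \in \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-} \tag{11.1}$$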

where $\tau(t) = -(2/(3t))^{1/2}$ is the global temporal coordinate gauge function and $L : \mathbb{R}^- \times P_{\mathrm{reduced}} \to \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}$ is the Lichnerowicz transform. The resulting curve $(g(t), \pi(t))$ is then a CMC solution to the ADM evolution equations with shift vector field $X(t, x)$ determined by a suitable slice condition (see Fischer and Moncrief [1997] for details) and with lapse function $N(t, x)$ determined by the lapse equation necessitated by the requirement that the mean curvature be constant and satisfy $\tau(g(t), \pi(t)) = \tau(t)$ for $t \in (t_1, t_2)$. One can compute $\partial\tau/\partial t$ from the ADM evolution equations for $(g, \pi)$. The result is

$$\frac{\partial\tau}{\partial t} = -\tfrac{3}{4}\tau^3 = \Delta_g N + \left( (\pi^{TT} \cdot \pi^{TT})\,\mu_g^{-2} + \tfrac{1}{3}\tau^2 \right) N \tag{11.2}$$

thereby determining the lapse function (see Fischer and Moncrief [1997]). The resulting vacuum (i.e., Ricci-flat) spacetime can then be reconstructed from the line element

$$ds^2 = -N^2\,dt^2 + g_{ij}\,(dx^i + X^i\,dt)(dx^j + X^j\,dt) \tag{11.3}$$

12

Geometrization of 3-Manifolds

The monotonicity of $H_{\mathrm{reduced}}$ along non-constant maximal integral curves of the reduced Einstein equations suggests that along such curves $H_{\mathrm{reduced}}$ is asymptotically decaying to its infimum $\inf H_{\mathrm{reduced}} = (-\tfrac{3}{2}\sigma(M))^{3/2}$ in the direction of cosmological expansion. Thus if

$$c_{\max} : (t_1, \infty) \subseteq \mathbb{R}^+ \longrightarrow P_{\mathrm{reduced}}, \qquad t \longmapsto c_{\max}(t) = (\gamma(t), p^{TT}(t)),$$

$0 \le t_1 < \infty$, is a non-constant maximal integral curve of the reduced Einstein flow, then in certain cases we would expect that

$$H_{\mathrm{reduced}}(\tau(t), \gamma(t), p^{TT}(t)) \longrightarrow \inf H_{\mathrm{reduced}} = |\tfrac{3}{2}\sigma(M)|^{3/2} \quad \text{as } t \longrightarrow \infty \tag{12.1}$$

(where $\tau(t) = -(\tfrac{2}{3t})^{1/2}$), thereby asymptotically realizing $|\tfrac{3}{2}\sigma(M)|^{3/2}$. An important open question is: under what conditions is (12.1) true? We shall explore this question here and in Section 19, and assuming that it is true, we shall explore some of the consequences, conjectural and otherwise.

If (12.1) is true, then from (10.7) the conformal volume $\mathrm{vol}(M, \gamma(t))$ is squeezed between $H_{\mathrm{reduced}}$ and $\inf H_{\mathrm{reduced}}$ as $H_{\mathrm{reduced}}$ decays to its infimum. Thus the conformal volume must also decay to its infimum, i.e.,

$$\mathrm{vol}(M, \gamma(t)) \longrightarrow \inf \mathrm{vol}_{-1} = (-\sigma(M))^{3/2} \quad \text{as } t \longrightarrow \infty \tag{12.2}$$

Since $M$ is of Yamabe type $-1$, $\sigma(M) \le 0$. Thus, additionally assuming for now that $M$ is irreducible, the analysis naturally divides into three cases as in Section 9: (I) $\sigma(M) < 0$ and $M$ hyperbolizable; (II) $\sigma(M) < 0$ and $M$ non-hyperbolizable; and (III) $\sigma(M) = 0$. (We remark that conjecturally $M$ hyperbolizable $\Rightarrow \sigma(M) < 0$.)

In case I, $M$ is hyperbolizable and so there is an exact solution of the reduced Einstein equations, the hyperbolic fixed point $(\tilde\gamma, 0)$ with maximal interval of existence $\mathbb{R}^+$. At this fixed point, $H_{\mathrm{reduced}}(\tau(t), \tilde\gamma, 0) = (\tfrac{3}{2})^{3/2}\,\mathrm{vol}(M, \tilde\gamma)$ is constant and thus trivially asymptotically approaches $\inf H_{\mathrm{reduced}} = |\tfrac{3}{2}\sigma(M)|^{3/2}$ (i.e., (12.1) holds) if and only if the hyperbolic σ-conjecture is true. Note also that the curve of conformal volumes is constant, $\mathrm{vol}(M, \gamma(t)) = \mathrm{vol}(M, \tilde\gamma)$, and thus does not volume collapse $M$. Under the additional assumption that $M$ is a rigid hyperbolizable manifold (see Sections 16 and 18), $(\tilde\gamma, 0)$ is a local attractor for the reduced Einstein flow and thus (12.1) also holds for $(\gamma, p^{TT})$ near $(\tilde\gamma, 0)$.

In case II, the hybrid case, the conformal metric partially volume collapses $M$ by volume collapsing $M$ along its graph components but not along its hyperbolizable components.

In case III, the pure graph case, $\sigma(M) = 0$ and the curve $\gamma(t)$ of conformal metrics fully volume collapses $M$ as $t \to \infty$. We shall consider this situation in more detail in Section 19.

We emphasize that in both cases II and III, the collapse occurs in the reduced phase space $P_{\mathrm{reduced}}$ and not in the physical phase space.

Perhaps the most interesting potential application of our results is in cases II and III. In those cases, by Theorem 3.1, $M$ is a non-hyperbolizable non-Euclidean $K(\pi, 1)$-manifold. Thus the σ-constant can never be realized by an actual metric on $M$ but can only be approached as a limit by a sequence or curve of metrics. In these cases, the reduced Einstein flow has no equilibrium point. Nevertheless, the Einstein flow is still seeking to attain the σ-constant asymptotically insofar as the reduced Hamiltonian is strictly monotonically seeking to decay to its infimum. However, there may well be obstructions, such as the formation of black holes, which may prevent any particular solution from asymptotically approaching $\inf H_{\mathrm{reduced}} = (-\tfrac{3}{2}\sigma(M))^{3/2}$.

Even so, it seems plausible that some subset of solutions might asymptote to this ideal attractor and in so doing, the Einstein flow, through the curve of conformal metrics $\gamma(t)$, may induce a decomposition of $M$ into geometric pieces by degenerating in precisely the way outlined in Section 9. If this is the case, then perhaps the asymptotic behavior of large classes of Einstein spacetimes can be characterized rather explicitly in terms of the degenerations described there, linking the geometrization program of 3-manifolds to the global properties of the Einstein flow. Conversely, it is not inconceivable that the Einstein flow, much like Hamilton's [1995, 1999] Ricci flow, could be used in a positive way to try to establish some form of the geometrization conjectures for 3-manifolds.

13

The Hyperbolic Fixed Point, Warped Products, and Lorentz Cones

We continue with $M$ hyperbolizable, $\tilde\gamma \in \mathcal{M}_{-1}$ hyperbolic, and $(\tilde\gamma, 0) \in P_{\mathrm{reduced}}$ the hyperbolic fixed point. It is useful to think of the hyperbolic fixed point solution of the reduced Einstein equations $(\tilde\gamma, 0) \in P_{\mathrm{reduced}}$ as a semi-globally defined maximal constant integral curve,

$$\tilde c_{\max} : \mathbb{R}^+ \to P_{\mathrm{reduced}}, \qquad t \mapsto \tilde c_{\max}(t) = (\tilde\gamma, 0)$$

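Note that the hyperbolic fixed point and the constant integral curve $\tilde c_{\max}(t) = (\tilde\gamma, 0)$ differ only by a point of view.

First we find the ADM physical curve $\tilde c_{\mathrm{ADM}}$ corresponding to $\tilde c_{\max}$ and then the spacetime corresponding to $\tilde c_{\mathrm{ADM}}$. The ADM-curve corresponding to $\tilde c_{\max}$ is given by (see (11.1))

$$\tilde c_{\mathrm{ADM}} : \mathbb{R}^+ \longrightarrow \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\mathbb{R}^-}, \qquad t \longmapsto \tilde c_{\mathrm{ADM}}(t) = (\tilde g(t), \tilde\pi(t)),$$

where, since $p^{TT} = 0$, from (5.7) and (5.8),

$$\varphi^4 = \frac{3}{2\tau^2} = (\tfrac{3}{2})^2\, t$$

and

$$\tilde c_{\mathrm{ADM}}(t) = (\tilde g(t), \tilde\pi(t)) = L(\tau(t), \tilde c_{\max}(t)) = L(\tau(t), \tilde\gamma, 0) = \left( \frac{3}{2\tau^2(t)}\,\tilde\gamma,\; -(\tfrac{2}{3})^{1/2}\,\tilde\gamma^{-1}\mu_{\tilde\gamma} \right) = \left( (\tfrac{3}{2})^2\, t\,\tilde\gamma,\; -(\tfrac{2}{3})^{1/2}\,\tilde\gamma^{-1}\mu_{\tilde\gamma} \right) \tag{13.1}$$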

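We remark that although $\tilde c_{\max}(t)$ is constant, $\tilde c_{\mathrm{ADM}}(t)$ is no longer constant. The metric component of the curve $\tilde c_{\mathrm{ADM}}(t)$ consists of the curve

$$\tilde g(t) = \frac{3}{2\tau^2(t)}\,\tilde\gamma,$$

which is just a homothetic scaling of the hyperbolic metric $\tilde\gamma \in \mathcal{M}_{-1}$. Thus $\tilde g(t)$ is a curve of hyperbolic metrics with constant sectional curvatures

$$K\!\left( \frac{3}{2\tau^2(t)}\,\tilde\gamma \right) = \frac{2\tau^2(t)}{3}\,K(\tilde\gamma) = -\frac{1}{6}\cdot\frac{2\tau^2(t)}{3} = -\left( \frac{\tau(t)}{3} \right)^2 = -\frac{1}{r^2(t)},$$

with constant scalar curvatures

$$R\!\left( \frac{3}{2\tau^2(t)}\,\tilde\gamma \right) = -\tfrac{2}{3}\tau^2(t) = -\frac{6}{r^2(t)},$$

and with hyperbolic radii $r(t)$ given by

$$r(t) = -\frac{3}{\tau(t)} = 3\left( \frac{3t}{2} \right)^{1/2} \tag{13.2}$$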
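The scaling identities leading to (13.2) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, it uses $K(\tilde\gamma) = -1/6$ (equivalently $R(\tilde\gamma) = -1$ in dimension 3), and it assumes that the reduced Hamiltonian is the rescaled volume $-\tau^3\,\mathrm{vol}(M, g)$, a formula that is not restated in this section. Under that assumption the value $(3/2)^{3/2}\,\mathrm{vol}(M, \tilde\gamma)$ of (10.4) comes out independent of $\tau$.

```python
import math

# Sectional curvature of a 3-metric with scalar curvature -1 (R = 6K in dimension 3).
K_TILDE = -1.0 / 6.0


def tau_of_t(t):
    """Gauge function tau(t) = -(2/(3t))^(1/2) for t > 0."""
    return -math.sqrt(2.0 / (3.0 * t))


def homothety_factor(tau):
    """Constant c = 3/(2 tau^2) in g(t) = c * gamma_tilde."""
    return 3.0 / (2.0 * tau ** 2)


def sectional_curvature(tau):
    """K(c * gamma) = K(gamma) / c for a constant c > 0."""
    return K_TILDE / homothety_factor(tau)


def hyperbolic_radius(tau):
    """r = -3/tau, positive because tau < 0."""
    return -3.0 / tau


def reduced_hamiltonian(tau, vol_tilde):
    """ASSUMPTION: H_reduced = -tau^3 vol(M, g), with vol(M, c*gamma) = c^(3/2) vol(M, gamma)."""
    return -(tau ** 3) * homothety_factor(tau) ** 1.5 * vol_tilde


if __name__ == "__main__":
    for t in (0.1, 1.0, 7.5):
        tau = tau_of_t(t)
        r = hyperbolic_radius(tau)
        print(t, sectional_curvature(tau), -1.0 / r ** 2, reduced_hamiltonian(tau, 1.0))
```

For each sampled $t$ the two curvature columns agree, and the last column stays at $(3/2)^{3/2}$ when the fiducial volume is normalized to 1.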

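Thus from (13.1), $\tilde c_{\mathrm{ADM}}$ can be written in terms of $r(t)$ as

$$\tilde c_{\mathrm{ADM}}(t) = (\tilde g(t), \tilde\pi(t)) = \left( \tfrac{1}{6}\, r^2(t)\,\tilde\gamma,\; -(\tfrac{2}{3})^{1/2}\,\tilde\gamma^{-1}\mu_{\tilde\gamma} \right) \tag{13.3}$$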

To find the resulting vacuum spacetime on $\mathbb{R}^+ \times M$ generated by this curve, one needs to know the lapse function $N(t)$ (the shift vector field $X(t) = 0$). Thus from

$$\tilde g(t) = (\tfrac{3}{2})^2\, t\,\tilde\gamma, \qquad \tilde k(t) = \tfrac{1}{3}\tau(t)\,\tilde g(t), \qquad \text{and} \qquad \frac{\partial \tilde g}{\partial t} = -2N(t)\,\tilde k(t),$$

one finds

$$\frac{\partial \tilde g}{\partial t} = (\tfrac{3}{2})^2\,\tilde\gamma = \frac{1}{t}\,\tilde g(t) = -\tfrac{2}{3}\,N(t)\,\tau(t)\,\tilde g(t),$$

from which one finds that the lapse function is spatially constant and is given by

$$N(t) = (\tfrac{3}{2})^{3/2}\,\frac{1}{t^{1/2}} = -(\tfrac{3}{2})^2\,\tau(t) \tag{13.4}$$

(which also follows from the lapse equation (11.2)). The resulting vacuum (i.e., Ricci-flat) Einstein spacetime geometry on $\mathbb{R}^+ \times M$ can now be reconstructed locally from (13.4) and $\tilde g(t) = (3/2)^2\, t\,\tilde\gamma$,

$$ds^2 = -N^2\,dt^2 + \tilde g_{ij}\,dx^i dx^j = -(\tfrac{3}{2})^3\,\frac{1}{t}\,dt^2 + (\tfrac{3}{2})^2\, t\,\tilde\gamma_{ij}\,dx^i dx^j \tag{13.5}$$
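As a sanity check on (13.4), the sketch below compares the two expressions for the lapse, verifies the relation $1/t = -\tfrac{2}{3}N\tau$, and confirms that a spatially constant $N$ with $\pi^{TT} = 0$ satisfies the lapse equation (11.2) in the reduced form $\partial\tau/\partial t = \tfrac{1}{3}\tau^2 N$. The function names are hypothetical and the time derivative is only approximated by a central finite difference.

```python
import math


def tau_of_t(t):
    """Gauge function tau(t) = -(2/(3t))^(1/2) for t > 0."""
    return -math.sqrt(2.0 / (3.0 * t))


def lapse(t):
    """N(t) = (3/2)^(3/2) / t^(1/2), the first expression in (13.4)."""
    return 1.5 ** 1.5 / math.sqrt(t)


def dtau_dt(t, h=1e-6):
    """Central finite-difference approximation of d tau / dt."""
    return (tau_of_t(t + h) - tau_of_t(t - h)) / (2.0 * h)


def lapse_equation_rhs(t):
    """Right side of (11.2) for spatially constant N and pi^TT = 0: (1/3) tau^2 N."""
    return tau_of_t(t) ** 2 * lapse(t) / 3.0


if __name__ == "__main__":
    for t in (0.5, 1.0, 4.0):
        tau = tau_of_t(t)
        print(t, lapse(t), -2.25 * tau, dtau_dt(t), lapse_equation_rhs(t), -0.75 * tau ** 3)
```

The last three printed columns agree up to the finite-difference error, which is the content of the remark that (13.4) also follows from (11.2).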

Alternately, using the hyperbolic radius

$$r = r(t) = -\frac{3}{\tau(t)} = 3\left( \frac{3t}{2} \right)^{1/2}$$

as time coordinate instead of $t$, the line element can be coordinate transformed on $\mathbb{R}^+ \times M$ to

$$ds^2 = -dr^2 + \frac{r^2}{6}\,\tilde\gamma_{ij}\,dx^i dx^j \tag{13.6}$$

which is identical to the $\kappa = -1$ vacuum Robertson–Walker solution, which is well known to be flat. Equivalently, using the constant mean curvature

$$\tau = \tau(t) = -\left( \frac{2}{3t} \right)^{1/2} = -\frac{3}{r(t)}$$

as time coordinate instead of $t$, either line element (11.3) or (13.6) can be transformed to $\mathbb{R}^- \times M$, where it is given by

$$ds^2 = -\left( \frac{3}{\tau^2} \right)^2 d\tau^2 + \frac{3}{2\tau^2}\,\tilde\gamma_{ij}\,dx^i dx^j \tag{13.7}$$

Since the spacetimes determined by the line elements (11.3), (13.6), and (13.7) are all isometric, the spacetime resulting from $\tilde c_{\mathrm{ADM}}$ and thus from the hyperbolic fixed point $(\tilde\gamma, 0)$ is the flat spacetime given by any one of them.

Let $\tau_0 \in \mathbb{R}^-$ be arbitrary, let $t_0 = 2/(3\tau_0^2)$, and let

$$\tilde c_{\mathrm{ADM}}(t_0) = (g(t_0), \pi(t_0)) = (g_0, \pi_0) = \left( \frac{3}{2\tau_0^2}\,\tilde\gamma,\; -(\tfrac{2}{3})^{1/2}\,\tilde\gamma^{-1}\mu_{\tilde\gamma} \right) = \left( (\tfrac{3}{2})^2\, t_0\,\tilde\gamma,\; -(\tfrac{2}{3})^{1/2}\,\tilde\gamma^{-1}\mu_{\tilde\gamma} \right) \in \mathcal{C}_H \cap \mathcal{C}_\delta \cap \mathcal{C}_{\tau_0}$$

be a point on the curve $\tilde c_{\mathrm{ADM}}(t)$. Then the maximal vacuum Cauchy development $(V_4, {}^{(4)}g)_{(g_0,\pi_0)}$ of $(g_0, \pi_0)$ is isometric to the maximal globally hyperbolic spacetime determined by any of the line elements (11.3), (13.6), or (13.7) on $\mathbb{R}^+ \times M$ in the first two cases and on $\mathbb{R}^- \times M$ in the third case.

We now reconstruct the spacetime generated by the curve $\tilde c_{\mathrm{ADM}}(t)$ globally using a warped product Lorentz manifold. Let $\mathbb{R}^+_1 = (\mathbb{R}^+, -1)$ denote $\mathbb{R}^+$ taken with the negative-definite metric $-1$, let $(M, \tilde\gamma)$ be the model or fiducial fiber, and thinking of $\mathbb{R}^+$ as the base and the fiducial manifold $(M, \tilde\gamma)$ as the fiber, let

$$w_{\sqrt{6}} : \mathbb{R}^+ \longrightarrow \mathbb{R}^+, \qquad r \longmapsto w(r) = \frac{r}{\sqrt{6}}$$

be the warping function. Using this data, let $\gamma_{\sqrt{6}}$ denote the Lorentz warped product metric on the product manifold $\mathbb{R}^+ \times M$. The line element of $\gamma_{\sqrt{6}}$ is then given by

$$ds^2 = -dr^2 + \tfrac{1}{6}\, r^2\,\tilde\gamma_{ij}\,dx^i dx^j \tag{13.8}$$

Let

$$\mathbb{R}^+_1 \times_{r/\sqrt{6}} (M, \tilde\gamma) = (\mathbb{R}^+ \times M, \gamma_{\sqrt{6}}) \tag{13.9}$$

denote the resulting warped product Lorentz manifold, also known as the Lorentz cone over $(M, \tilde\gamma)$ (see Figure 15.1). Comparing the line elements (13.6) and (13.8) shows that

$$(V_4, {}^{(4)}g)_{(g_0,\pi_0)} \cong \mathbb{R}^+_1 \times_{r/\sqrt{6}} (M, \tilde\gamma) \tag{13.10}$$

We shall examine the global geometry of this ﬂat spacetime from the point of view of the standard models I+ /Γ in Section 15.
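The claim that (13.5), (13.6), and (13.7) describe the same geometry amounts to a change of time coordinate in the cone metric $-dr^2 + (r^2/6)\,\tilde\gamma$. A minimal numerical sketch of that substitution follows; the helper names are hypothetical, and the derivative $dr/dx$ is approximated by a central difference rather than computed exactly.

```python
import math


def r_of_t(t):
    """Hyperbolic radius as a function of t: r = 3 (3t/2)^(1/2)."""
    return 3.0 * math.sqrt(1.5 * t)


def r_of_tau(tau):
    """Hyperbolic radius as a function of the mean curvature: r = -3/tau."""
    return -3.0 / tau


def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def cone_in_coordinate(r_func, x):
    """Coefficients (time-time, spatial factor) of -dr^2 + (r^2/6) gamma after r = r_func(x)."""
    dr = derivative(r_func, x)
    return (-dr ** 2, r_func(x) ** 2 / 6.0)


def line_element_13_5(t):
    """Coefficients of dt^2 and of gamma_ij dx^i dx^j in (13.5)."""
    return (-(1.5 ** 3) / t, 2.25 * t)


def line_element_13_7(tau):
    """Coefficients of dtau^2 and of gamma_ij dx^i dx^j in (13.7)."""
    return (-(3.0 / tau ** 2) ** 2, 3.0 / (2.0 * tau ** 2))


if __name__ == "__main__":
    print(cone_in_coordinate(r_of_t, 2.0), line_element_13_5(2.0))
    print(cone_in_coordinate(r_of_tau, -0.7), line_element_13_7(-0.7))
```

Each printed pair agrees up to the finite-difference error, so the three line elements differ only by the choice of time coordinate.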

14

CMC Foliation of I+

Let $\mathbb{R}^4_1 = (\mathbb{R}^4, b^4_1)$ denote the 4-dimensional Minkowski vector space, where $b^4_1$ is the Lorentz inner product on $\mathbb{R}^4$ with signature $(1, 3) = (-+++)$. Let $\gamma$ denote the Minkowski metric on $\mathbb{R}^4$, the flat Lorentz metric induced by $b^4_1$ (in coordinates $\{x^\alpha\}$, $0 \le \alpha \le 3$, $\gamma_{\alpha\beta}\,dx^\alpha dx^\beta = -dt^2 + dx^2 + dy^2 + dz^2$), so that the flat Lorentz manifold $\mathbf{M}^4 = (\mathbb{R}^4, \gamma)$ is Minkowski spacetime. Let $O(1, 3) \subset GL(4, \mathbb{R})$ denote the Lorentz group, defined as the orthogonal group of $b^4_1$, or equivalently the group of linear isometries of $\mathbb{R}^4_1$. Let $SO(1, 3) = O(1, 3) \cap SL(4, \mathbb{R})$ denote the proper Lorentz group (determinant $= 1$), let $O^\uparrow(1, 3)$ denote the orthochronous Lorentz group, the subgroup of time-orientation preserving Lorentz transformations, and let $SO^\uparrow(1, 3) = SO(1, 3) \cap O^\uparrow(1, 3)$ denote the proper orthochronous Lorentz group, which is the connected component of the identity of $O(1, 3)$.

Each of these groups has an inhomogeneous Poincaré version which corresponds to a group of isometries of $\mathbb{R}^4_1$. Thus the Poincaré group

$$P = \mathrm{Isom}(\mathbb{R}^4_1) = O(1, 3) \ltimes \mathbb{R}^4 \cong \mathrm{Isom}(\mathbb{R}^4, \gamma)$$

is the semi-direct product of the Lorentz group $O(1, 3)$ with the group of translations of $\mathbb{R}^4$ and is the full isometry group of $\mathbb{R}^4_1$. Similarly, the proper orthochronous Poincaré group

$$P^\uparrow_+ = SO^\uparrow(1, 3) \ltimes \mathbb{R}^4 = \mathrm{Isom}^\uparrow_+(\mathbb{R}^4_1)$$

is the connected component of the identity of $P$.

Let

$$I^+ = \{ (t, x, y, z) \in \mathbb{R}^4 \mid -t^2 + x^2 + y^2 + z^2 < 0 \text{ and } t > 0 \}$$

denote the interior of the future light cone of Minkowski space $\mathbb{R}^4_1$. Let $\gamma_{I^+} \equiv \gamma|_{I^+}$ denote the restriction of $\gamma$ to $I^+$, and let $\mathbf{I}^+ = (I^+, \gamma_{I^+})$ denote $I^+$ together with its induced flat Lorentz metric as an open submanifold of Minkowski spacetime $\mathbf{M}^4$, so that $\mathbf{I}^+$ is a flat open cone in $\mathbf{M}^4$. As a spacetime, $\mathbf{I}^+$ is maximally globally hyperbolic, future causally geodesically complete, past causally geodesically incomplete (in the contracting direction), and of course has no curvature singularity.

$I^+$ is invariant under the two-component orthochronous Lorentz group $O^\uparrow(1, 3)$, and indeed, the isometry group of $\mathbf{I}^+$ is $\mathrm{Isom}(\mathbf{I}^+) = O^\uparrow(1, 3)|_{I^+} \cong O^\uparrow(1, 3)$ and the subgroup of orientation-preserving isometries is $\mathrm{Isom}^+(\mathbf{I}^+) = SO^\uparrow(1, 3)|_{I^+} \cong SO^\uparrow(1, 3)$.

For $r > 0$, let

$$H^3_r = \{ (t, x, y, z) \in \mathbb{R}^4 \mid -t^2 + x^2 + y^2 + z^2 = -r^2 \text{ and } t > 0 \} \subset I^+$$

and let $\bar g_r \in \mathcal{M}(H^3_r) = \mathrm{Riem}(H^3_r)$ denote the induced Riemannian metric on $H^3_r$ as an embedded submanifold of the flat open cone $\mathbf{I}^+ \subset \mathbf{M}^4$ (equivalently, of $\mathbf{M}^4$ itself). Then the Riemannian manifold $\mathbf{H}^3_r = (H^3_r, \bar g_r)$ is the hyperboloid of radius $r$, a complete connected simply-connected non-compact hyperbolic 3-manifold with constant sectional curvature, Ricci curvature, and scalar curvature

$$K(\bar g_r) = -\frac{1}{r^2}, \qquad \mathrm{Ric}(\bar g_r) = 2K(\bar g_r)\,\bar g_r = -\frac{2}{r^2}\,\bar g_r, \qquad R(\bar g_r) = 6K(\bar g_r) = -\frac{6}{r^2}.$$

When $r = 1$, $\mathbf{H}^3_1$ is the unit hyperboloid. Regarding the extrinsic geometry of $\mathbf{H}^3_r$, we note that if $x \in H^3_r$, then $Z_x = x/r \in T_x^\perp H^3_r$ is a forward pointing unit timelike normal at $x$, from which one computes that the second fundamental form of $\mathbf{H}^3_r$ as an embedded submanifold in the open flat cone $\mathbf{I}^+$ is given by

$$\bar k_r = -\frac{1}{r}\,\bar g_r \in S_2(H^3_r)$$

and the mean curvature is given by

$$\tau_r = \mathrm{tr}_{\bar g_r}\,\bar k_r = -\frac{3}{r} < 0$$

so that $\bar k_r = (\tau_r/3)\,\bar g_r$. Since $\bar k_r$ is proportional to $\bar g_r$, $\mathbf{H}^3_r$ is a totally umbilic hypersurface in $\mathbf{I}^+$ (see O'Neill [1983] p. 108). Thus at each point $x \in \mathbf{H}^3_r$, $\mathbf{H}^3_r$ bends away from the forward pointing normal the same amount in all directions, i.e., if $u_x \in T_x\mathbf{H}^3_r$ is a unit tangent vector to $\mathbf{H}^3_r$, then $\bar k_r(x)(u_x, u_x) = -(1/r)$, and indeed in this case, this amount is independent of $x$, which in general is not required for a totally umbilic hypersurface. In particular, a totally umbilic hypersurface need not have constant mean curvature. However, we remark in passing that $H^3_r \subset I^+$ is an example of a hyperquadric and that all hyperquadrics are totally umbilic (O'Neill [1983], Hamilton [1995, 1999] p. 116) and have constant mean curvature.

Let

$$\bar\pi_r = -\left( \bar k_r - \bar g_r\,\mathrm{tr}_{\bar g_r}(\bar k_r) \right)\mu_{\bar g_r} = -\frac{2}{r}\,\bar g_r^{-1}\mu_{\bar g_r} = \tfrac{2}{3}\tau_r\,\bar g_r^{-1}\mu_{\bar g_r} \in S^2_d(H^3_r) \tag{14.1}$$

denote the corresponding gravitational momentum of the embedded hypersurface $H^3_r$. Since the ambient spacetime $\mathbf{I}^+$ is flat, it follows by definition that this data is flat (versus only Ricci-flat) CMC-Cauchy data, so in particular

$$(\bar g_r, \bar\pi_r) = (\bar g_r, \tfrac{2}{3}\tau_r\,\bar g_r^{-1}\mu_{\bar g_r}) \in \mathcal{C}_H(H^3_r) \cap \mathcal{C}_\delta(H^3_r) \cap \mathcal{C}_{\tau_r}(H^3_r)$$

where $\tau_r = -(3/r)$. Thus,

$$\mathbf{I}^+ = \bigcup_{r \in \mathbb{R}^+} \mathbf{H}^3_r \tag{14.2}$$

gives a totally umbilic CMC-foliation of the ﬂat spacetime I+ by constant sectional curvature hyperboloids H3r . We remark that the foliation could also be parameterized by either K or τ as the foliation parameter. The use of either r or K as the foliation parameter uses a parameter intrinsic to the geometry of H3r , whereas using τ as the foliation parameter uses a parameter relating to the extrinsic geometry of H3r as an embedded hypersurface in the ambient cone I+ . However, in this setting, these parameters are simply related by K = −(1/r2 ) and τ = −(3/r).
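The elementary facts used for the hyperboloids can be checked on sample points. The sketch below is illustrative only: the coordinates $(\chi, \theta, \phi)$ and all function names are assumptions introduced here, not notation from the text. It verifies that a point built from these coordinates lies on $H^3_r$, that $Z_x = x/r$ is a unit timelike vector, and that the coefficient in (14.1) equals $-2/r = \tfrac{2}{3}\tau_r$.

```python
import math


def minkowski(u, v):
    """Lorentz inner product with signature (-+++)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]


def hyperboloid_point(r, chi, theta, phi):
    """Point of H^3_r in (hypothetical) hyperbolic polar coordinates."""
    s = r * math.sinh(chi)
    return (r * math.cosh(chi),
            s * math.sin(theta) * math.cos(phi),
            s * math.sin(theta) * math.sin(phi),
            s * math.cos(theta))


def unit_normal(x, r):
    """Forward pointing normal Z_x = x / r."""
    return tuple(c / r for c in x)


def foliation_parameters(r):
    """Sectional curvature K = -1/r^2 and mean curvature tau = -3/r of H^3_r."""
    return (-1.0 / r ** 2, -3.0 / r)


def momentum_coefficient(r):
    """Coefficient c in pi = c g^{-1} mu_g, from -(k - g tr k) with k = -(1/r) g in dimension 3."""
    k = -1.0 / r
    trace = 3.0 * k
    return -(k - trace)


if __name__ == "__main__":
    r = 2.0
    x = hyperboloid_point(r, 1.3, 0.9, 2.1)
    z = unit_normal(x, r)
    print(minkowski(x, x), minkowski(z, z), foliation_parameters(r), momentum_coefficient(r))
```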

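Now we express $\mathbf{I}^+$ as a warped product Lorentz manifold. Note that the foliation expression $\mathbf{I}^+ = \cup_{r>0}\,\mathbf{H}^3_r$ is related to but is not in and of itself a warped product.

Fix $r_0 \in \mathbb{R}^+$ and let $\mathbf{H}^3_{r_0} = (H^3_{r_0}, \bar g_{r_0})$ be a fiducial or model fiber. As in (13.9), let $\mathbb{R}^+_1 = (\mathbb{R}^+, -1)$, and thinking of $\mathbb{R}^+$ as the base and the fiducial manifold $\mathbf{H}^3_{r_0}$ as the fiber, let

$$w_{r_0} : \mathbb{R}^+ \longrightarrow \mathbb{R}^+, \qquad r \longmapsto w_{r_0}(r) = \frac{r}{r_0}$$

be the warping function (defined as a positive function on the base). With this data, let $\gamma_{r_0}$ denote the warped product Lorentz metric on the product manifold $\mathbb{R}^+ \times H^3_{r_0}$, so that the line element of $\gamma_{r_0}$ is

$$ds^2 = -dr^2 + \left( \frac{r}{r_0} \right)^2 (\bar g_{r_0})_{ij}\,dx^i dx^j \tag{14.3}$$

We let

$$\mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0} = (\mathbb{R}^+ \times H^3_{r_0}, \gamma_{r_0}) \tag{14.4}$$

denote the resulting warped product Lorentz manifold.

14.1 Proposition. Let

$$F : \mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0} = (\mathbb{R}^+ \times H^3_{r_0}, \gamma_{r_0}) \longrightarrow (I^+, \gamma_{I^+}) = \mathbf{I}^+, \qquad (r, x) \longmapsto \frac{r}{r_0}\,x \tag{14.5}$$

Then $F$ is an isometry, so that

$$\gamma_{r_0} = F^*\gamma_{I^+} \tag{14.6}$$

and

$$\mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0} \cong \mathbf{I}^+ \tag{14.7}$$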

are isometric flat Lorentz manifolds, either of which is the vacuum (Ricci-flat) Robertson–Walker ($\kappa = -1$) spacetime.

We remark that as a diffeomorphism, $F$ maps level hypersurfaces $\{r\} \times H^3_{r_0}$ in $\mathbb{R}^+ \times H^3_{r_0}$ to "level" hyperboloids $H^3_r$ in $I^+$.

It is known that a Ricci-flat Robertson–Walker spacetime is flat, and the only possibilities are Minkowski space $\mathbf{M}$ with $\kappa = 0$ and the forward $\mathbf{I}^+$ and backwards $\mathbf{I}^-$ open light cone in Minkowski space for $\kappa = -1$ (see O'Neill [1983] p. 362).

Exactly analogously, the flat Riemannian metric on the open submanifold $\mathbb{R}^4 - \{0\} \subset \mathbb{R}^4$ can first be foliated, $\mathbb{R}^4 - \{0\} = \cup_{r>0}\,\mathbf{S}^3_r$, and then expressed as a warped product Riemannian manifold

$$\mathbb{R}^+ \times_{r/r_0} \mathbf{S}^3_{r_0} \cong \mathbb{R}^4 - \{0\} \tag{14.8}$$

on the product manifold $\mathbb{R}^+ \times S^3_{r_0}$, where $\mathbb{R}^+$ is taken with its usual metric $1$, where $\mathbf{S}^3_{r_0} = (S^3_{r_0}, g_{r_0})$ is the round sphere of radius $r_0$ in $\mathbb{R}^4$ (with constant sectional curvature $K(g_{r_0}) = 1/r_0^2$), and where the local line element of the warped product metric is given by

$$ds^2 = dr^2 + \left( \frac{r}{r_0} \right)^2 (g_{r_0})_{ij}\,dx^i dx^j \tag{14.9}$$

There are two important similarities and two important differences between the foliation of $\mathbf{I}^+$ and the foliation of $\mathbb{R}^4 - \{0\}$. The two similarities are that both total spaces $\mathbf{I}^+$ and $\mathbb{R}^4 - \{0\}$ are flat and that in both cases the foliations are foliations by Riemannian manifolds.

The first difference is that $\mathbf{I}^+$ is a Lorentz manifold whereas $\mathbb{R}^4 - \{0\}$ is a Riemannian manifold, which necessitates that one take the negative-definite metric $-1$ on $\mathbb{R}^+$ in the warped product representation of $\mathbf{I}^+$ as compared to the usual positive-definite metric. The second (and related) difference is that the foliation of $\mathbf{I}^+$ is by non-compact (contractible) manifolds $\mathbf{H}^3_r$ whereas the foliation of $\mathbb{R}^4 - \{0\}$ is by compact (simply connected) manifolds $\mathbf{S}^3_r$.

In the next section, we shall see how to "improve" the foliation of $\mathbf{I}^+$ by considering foliations of the standard models $\mathbf{I}^+/\Gamma$ which are foliated by compact (but not simply connected) spacelike hypersurfaces.
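The isometry of Proposition 14.1 can be illustrated numerically by pulling back the Minkowski metric under $F(r, x) = (r/r_0)\,x$ and comparing with the warped product line element (14.3). The sketch below is a rough check under stated assumptions: the fiber coordinates $(\chi, \theta, \phi)$ and the function names are hypothetical, the fiber metric is taken in the form $r_0^2\,(d\chi^2 + \sinh^2\!\chi\,(d\theta^2 + \sin^2\!\theta\, d\phi^2))$, and the Jacobian of $F$ is approximated by central differences.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # Minkowski metric, signature (-+++)


def hyperboloid_point(r0, chi, theta, phi):
    """Point of H^3_{r0} in (hypothetical) hyperbolic polar coordinates."""
    s = r0 * math.sinh(chi)
    return [r0 * math.cosh(chi),
            s * math.sin(theta) * math.cos(phi),
            s * math.sin(theta) * math.sin(phi),
            s * math.cos(theta)]


def F(r, r0, chi, theta, phi):
    """The map (r, x) -> (r / r0) x of Proposition 14.1, with x given in coordinates."""
    return [r / r0 * c for c in hyperboloid_point(r0, chi, theta, phi)]


def pullback_metric(r, r0, chi, theta, phi, h=1e-5):
    """Finite-difference approximation of F^* eta in the coordinates (r, chi, theta, phi)."""
    q = [r, chi, theta, phi]
    cols = []
    for a in range(4):
        plus, minus = list(q), list(q)
        plus[a] += h
        minus[a] -= h
        fp = F(plus[0], r0, plus[1], plus[2], plus[3])
        fm = F(minus[0], r0, minus[1], minus[2], minus[3])
        cols.append([(p - m) / (2.0 * h) for p, m in zip(fp, fm)])
    return [[sum(ETA[i] * cols[a][i] * cols[b][i] for i in range(4))
             for b in range(4)] for a in range(4)]


def warped_product_metric(r, r0, chi, theta):
    """Matrix of -dr^2 + (r/r0)^2 g_{r0} in the same coordinates."""
    w2 = (r / r0) ** 2
    fiber = [r0 ** 2,
             r0 ** 2 * math.sinh(chi) ** 2,
             r0 ** 2 * math.sinh(chi) ** 2 * math.sin(theta) ** 2]
    diag = [-1.0] + [w2 * f for f in fiber]
    return [[diag[a] if a == b else 0.0 for b in range(4)] for a in range(4)]


if __name__ == "__main__":
    for row in pullback_metric(2.0, 1.5, 0.8, 1.1, 0.4):
        print(["%.6f" % entry for entry in row])
```

The printed matrix is diagonal to within the finite-difference error, with entries matching those of `warped_product_metric`.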

15

The Standard Models I+ /Γ

As with $\mathbf{I}^+$, the hyperboloids $\mathbf{H}^3_r$ are invariant under $O^\uparrow(1, 3)$, and indeed, the isometry group of $\mathbf{H}^3_r$ is the two-component group

$$\mathrm{Isom}(\mathbf{H}^3_r) = O^\uparrow(1, 3)|_{H^3_r} \cong O^\uparrow(1, 3)$$

and the group of orientation-preserving isometries of $\mathbf{H}^3_r$ is the connected group

$$\mathrm{Isom}^+(\mathbf{H}^3_r) = SO^\uparrow(1, 3)|_{H^3_r} \cong SO^\uparrow(1, 3)$$

Thus for each $r > 0$, by restriction the group $O^\uparrow(1, 3)$ acts on the manifold $H^3_r$ and is the isometry group of $\mathbf{H}^3_r$. Thus abstractly $O^\uparrow(1, 3)$ is the isometry group of each $\mathbf{H}^3_r$ independently of $r$. Shortly, we will consider subgroups $\Gamma \subset O^\uparrow(1, 3)$ acting on $\mathbf{I}^+$ which then by restriction act on each $\mathbf{H}^3_r$.

Fix $r > 0$ and let $\Gamma \subset \mathrm{Isom}(\mathbf{H}^3_r) \cong O^\uparrow(1, 3)$ be a subgroup. Then $\Gamma$ acts properly discontinuously on the manifold $H^3_r$ if and only if $\Gamma$ is a discrete subgroup of $\mathrm{Isom}(\mathbf{H}^3_r) \cong O^\uparrow(1, 3)$ taken in the natural topology of $O^\uparrow(1, 3)$. Moreover, a discrete subgroup $\Gamma$ of $\mathrm{Isom}(\mathbf{H}^3_r)$ operates freely on $H^3_r$ if and only if it is torsion-free, i.e., if no non-trivial element has finite order (see Benedetti [1992] for a proof of these facts). Thus the set of subgroups of $\mathrm{Isom}(\mathbf{H}^3_r)$ that act freely and properly discontinuously is in bijective correspondence with the set of discrete torsion-free subgroups of $O^\uparrow(1, 3)$.

Thus if $\Gamma \subset \mathrm{Isom}(\mathbf{H}^3_r)$ is a discrete torsion-free subgroup, then $\Gamma$ acts freely and properly discontinuously on $H^3_r$ and thus the resulting orbit space $H^3_r/\Gamma$ is a quotient manifold. Since $\Gamma$ acts as a group of isometries on $\mathbf{H}^3_r = (H^3_r, \bar g_r)$, the hyperbolic metric $\bar g_r$ on $H^3_r$ passes to a hyperbolic metric $g_r$ on the quotient manifold $H^3_r/\Gamma$. We let

$$\mathbf{H}^3_r/\Gamma = (H^3_r, \bar g_r)/\Gamma = (H^3_r/\Gamma, g_r)$$

denote the resulting quotient hyperbolic manifold, in which case $\mathbf{H}^3_r/\Gamma$ is a complete connected hyperbolic Riemannian manifold with constant sectional curvature $K(g_r) = K(\bar g_r) = -\frac{1}{r^2}$, fundamental group $\pi_1(H^3_r/\Gamma) = \Gamma$, and such that

$$\mathbf{H}^3_r = (H^3_r, \bar g_r) \longrightarrow \mathbf{H}^3_r/\Gamma = (H^3_r/\Gamma, g_r), \qquad x \longmapsto [x] = \Gamma \cdot x = \{ ax \mid a \in \Gamma \}$$

is a Riemannian covering map.

Since the action of $\Gamma$ on $\mathbf{I}^+$ leaves each hyperboloid $\mathbf{H}^3_r \subset \mathbf{I}^+$ invariant, the CMC-foliation of $\mathbf{I}^+ = \cup_{r>0}\,\mathbf{H}^3_r$ passes to a CMC-foliation of $\mathbf{I}^+/\Gamma = \cup_{r>0}\,\mathbf{H}^3_r/\Gamma$. Thus the intrinsic and extrinsic geometry $(g_r, \pi_r)$ of $H^3_r/\Gamma$ as a submanifold of $\mathbf{I}^+/\Gamma$ is locally isometric with the geometry $(\bar g_r, \bar\pi_r)$ on $H^3_r$ as a submanifold of $\mathbf{I}^+$. Thus the formulas for $k_r$ and $\pi_r$ in terms of $r$ and $g_r$, (14.1) and (14.2), are the same as the formulas for $\bar k_r$ and $\bar\pi_r$ in terms of $r$ and $\bar g_r$.

The converse of this construction, amalgamated with some of the other facts mentioned above, then gives part of the classical classification theorem of Killing and Hopf (see Wolf [1972] p. 69). Let $\mathbf{M}^{n+1} = (\mathbb{R}^{n+1}, {}^{(n+1)}\gamma)$ denote the $(n + 1)$-dimensional Minkowski spacetime.

15.1 Theorem. Let $K \in \mathbb{R}^-$, let $r = (-K)^{-1/2} > 0$, and let $n \ge 2$. Then $(M, g_r)$ is a complete connected hyperbolic Riemannian $n$-manifold of constant sectional curvature $K < 0$ and hyperbolic radius $r$ if and only if $(M, g_r)$ is isometric to a quotient

$$(M, g_r) \cong \mathbf{H}^n_r/\Gamma = (H^n_r, \bar g_r)/\Gamma = (H^n_r/\Gamma, g_r)$$

where $\mathbf{H}^n_r \subset \mathbf{M}^{n+1}$ is the $n$-dimensional hyperboloid of hyperbolic radius $r$ and $\Gamma \subset \mathrm{Isom}(\mathbf{H}^n_r) = O^\uparrow(1, n)|_{H^n_r} \cong O^\uparrow(1, n)$ is discrete, torsion-free, and unique up to conjugacy class in $O^\uparrow(1, n)$.
The resulting quotient manifold $H^n_r/\Gamma$ is orientable if and only if $\Gamma \subset \mathrm{Isom}^+(\mathbf{H}^n_r) = SO^\uparrow(1, n)|_{H^n_r} \cong SO^\uparrow(1, n)$ and is compact if and only if $\Gamma$ is co-compact.

Thus there is a bijective correspondence between the moduli space (i.e., the isometry classes) of complete connected unit hyperbolic $n$-manifolds, $n \ge 2$, and the conjugacy classes of discrete torsion-free subgroups of $\mathrm{Isom}(\mathbf{H}^n_1) \cong O^\uparrow(1, n)$. Similarly, there is a bijective correspondence between the moduli space of closed connected orientable unit hyperbolic $n$-manifolds and the conjugacy classes of discrete torsion-free co-compact subgroups of $\mathrm{Isom}^+(\mathbf{H}^n_1) \cong SO^\uparrow(1, n)$, or equivalently, with the moduli space (i.e., the diffeomorphism classes) of closed connected orientable hyperbolizable $n$-manifolds. For $n = 3$, this class of hyperbolizable manifolds is one of the main subclasses of manifolds of Yamabe type $-1$ (see Theorem 3.1).

As another by-product of Theorem 15.1, if $M$ is a hyperbolizable $n$-manifold (by definition connected), then $M$ has a diffeomorphic representation as $H^n_r/\Gamma$ (for some $r > 0$), the homotopy groups of $M$ are given by $\pi_1(M) \cong \Gamma \subset O^\uparrow(1, n)$ and $\pi_k(M) = 0$ for $k \ge 2$, and the diffeomorphism class of $M$ is determined by $\pi_1(M) \cong \Gamma$.

Let $\Gamma \subset SO^\uparrow(1, 3)$ be discrete, torsion-free, and co-compact. Then $\Gamma$ acts naturally on $\mathbf{I}^+$ (on the left) freely and properly discontinuously as a group of isometries. Thus the metric $\gamma_{I^+}$ on $I^+$ passes to the quotient manifold $I^+/\Gamma$ (writing the left action on the right for the sake of typography). We denote the quotient metric by $\gamma_{I^+/\Gamma}$ and denote the resulting quotient spacetime by $\mathbf{I}^+/\Gamma = (I^+, \gamma_{I^+})/\Gamma = (I^+/\Gamma, \gamma_{I^+/\Gamma})$. We refer to the resulting flat spatially compact Lorentz manifold $\mathbf{I}^+/\Gamma$ as a standard model and note that the moduli space of standard models (i.e., the space of isometry classes of standard models) is in bijective correspondence with the conjugacy classes of discrete torsion-free co-compact subgroups of $SO^\uparrow(1, 3)$, or equivalently, with the moduli space (i.e., diffeomorphism classes) of closed connected orientable hyperbolizable 3-manifolds, via the following correspondence,

$$\{ [M] \} \longleftrightarrow \{ [\Gamma] \} \longleftrightarrow \{ [\mathbf{I}^+/\Gamma] \} \tag{15.1}$$

where here $[M]$ denotes the diffeomorphism class of $M$, $[\Gamma]$ denotes the conjugacy class of $\Gamma$ in $SO^\uparrow(1, 3)$, and $[\mathbf{I}^+/\Gamma]$ denotes the isometry class of $\mathbf{I}^+/\Gamma$.

In slightly more detail, if $M$ is a closed connected orientable hyperbolizable 3-manifold, let $g_r$ be a hyperbolic metric on $M$ with hyperbolic radius $r$. Then from Theorem 15.1, there exists a discrete torsion-free co-compact subgroup $\Gamma \subset SO^\uparrow(1, 3)$, unique up to conjugacy class, such that $(M, g_r) \cong \mathbf{H}^3_r/\Gamma$. Using this $\Gamma$, the resulting spacetime $\mathbf{I}^+/\Gamma$ is then the associated standard model. Thus there is the following pathway from hyperbolizable $M$ to standard models,

$$M \text{ hyperbolizable} \to (M, g_r) \to \mathbf{H}^3_r/\Gamma \to \mathbf{I}^+/\Gamma \tag{15.2}$$

Alternately, and more directly, let $\pi_1(M) \cong \Gamma$. Then $\mathbf{I}^+/\Gamma$ is the standard model associated with $M$. Since $\Gamma$ is unique up to conjugacy, $\mathbf{I}^+/\Gamma$ is unique up to isometry.

Although technically not classical Robertson–Walker spacetimes, since the resulting compact hypersurfaces are not homogeneous (and thus not isotropic), the standard models are actually Lorentz covered by the classical Robertson–Walker spacetime $\mathbf{I}^+$ ($\kappa = -1$) and thus are locally isometric to a classical Robertson–Walker spacetime. The resulting spacetimes $\mathbf{I}^+/\Gamma$ have expanding compact hyperbolic spacelike hypersurfaces as depicted in Figure 15.1.

Figure 15.1. A standard model $\mathbf{I}^+/\Gamma$ as a Lorentz cone

Consider again the isometry of Proposition 14.1,

$$F : \mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0} \longrightarrow \mathbf{I}^+, \qquad (r, x) \longmapsto \frac{r}{r_0}\,x \tag{15.3}$$

Then $\Gamma$ acts as a group of isometries (on the left) on both Lorentz manifolds, trivially on the first factor in the warped product, and naturally on $\mathbf{I}^+$, leaving invariant each hyperboloid $\mathbf{H}^3_r$. Thus $F$ is equivariant with respect to these actions, i.e., for $a \in \Gamma$,

$$F(a \cdot (r, x)) = F((r, ax)) = \frac{r}{r_0}\,ax = a\,\frac{r}{r_0}\,x = aF(r, x),$$

and so passes to an isometry on the quotient Lorentz manifolds, where we denote it also by $F$,

$$F : (\mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0})/\Gamma \longrightarrow \mathbf{I}^+/\Gamma \tag{15.4}$$

Note that $(\mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0})/\Gamma$ is not a warped product itself but is the quotient space of a warped product. However, since the action of $\Gamma$ on the warped product is trivial on the first factor and leaves the hyperboloid $\mathbf{H}^3_{r_0}$ invariant, we have the following natural isometry

$$(\mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0})/\Gamma \cong \mathbb{R}^+_1 \times_{r/r_0} (\mathbf{H}^3_{r_0}/\Gamma), \qquad \Gamma(r, x) \longleftrightarrow (r, \Gamma x) \tag{15.5}$$

so that from (15.4)

$$\mathbb{R}^+_1 \times_{r/r_0} (\mathbf{H}^3_{r_0}/\Gamma) \cong (\mathbb{R}^+_1 \times_{r/r_0} \mathbf{H}^3_{r_0})/\Gamma \cong \mathbf{I}^+/\Gamma \tag{15.6}$$

giving a warped product expression for the standard model analogous to the warped product expression for I+ of Proposition 14.1. Alternately phrased, the standard model is a Lorentz cone over H3r0 /Γ.
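The equivariance of $F$ rests only on the linearity of the Lorentz action, which the sketch below illustrates for a single boost. This is an illustration under stated assumptions: the boost along the $x$-axis stands in for a generic element $a \in SO^\uparrow(1, 3)$ (one boost does not generate a co-compact group), and the function names are hypothetical.

```python
import math


def minkowski(u, v):
    """Lorentz inner product with signature (-+++)."""
    return -u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]


def boost_x(beta):
    """Boost with rapidity beta along the x-axis, an element of SO^(1,3)."""
    c, s = math.cosh(beta), math.sinh(beta)
    return [[c, s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(a, x):
    """Matrix-vector product a x."""
    return [sum(a[i][j] * x[j] for j in range(4)) for i in range(4)]


def F(r, r0, x):
    """The map (r, x) -> (r / r0) x of Proposition 14.1."""
    return [r / r0 * c for c in x]


if __name__ == "__main__":
    r0, r = 1.5, 4.0
    # A point of H^3_{r0}: time component r0 cosh(0.6), spatial part r0 sinh(0.6) (0.6, 0, 0.8).
    x = [r0 * math.cosh(0.6), r0 * math.sinh(0.6) * 0.6, 0.0, r0 * math.sinh(0.6) * 0.8]
    a = boost_x(0.9)
    print(F(r, r0, apply(a, x)))
    print(apply(a, F(r, r0, x)))
```

Both printed vectors coincide, and the boosted point remains on the hyperboloid of radius $r_0$, so the action descends to each leaf as used in (15.5).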

Now let $M$ be a closed connected orientable hyperbolizable 3-manifold and let $g_{r_0} \in \mathcal{M}$ be a hyperbolic metric on $M$ with hyperbolic radius $r_0$ and with isometry

$$(M, g_{r_0}) \cong \mathbf{H}^3_{r_0}/\Gamma \tag{15.7}$$

Substituting this isometry into (15.6) gives

$$\mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0}) \cong \mathbf{I}^+/\Gamma \tag{15.8}$$

Define

$$k_{r_0} = -\frac{1}{r_0}\,g_{r_0} = -(-K_{r_0})^{1/2}\,g_{r_0} \in S_2(M) \tag{15.9}$$

to be the totally umbilic second fundamental form. Then the mean curvature $\tau_{r_0} = \mathrm{tr}_{g_{r_0}} k_{r_0} = -\frac{3}{r_0} = -3(-K_{r_0})^{1/2}$, so that $k_{r_0} = (\tau_{r_0}/3)\,g_{r_0}$. Let

$$\pi_{r_0} = -\frac{2}{r_0}\,g_{r_0}^{-1}\mu_{g_{r_0}} = -2(-K_{r_0})^{1/2}\,g_{r_0}^{-1}\mu_{g_{r_0}} = \tfrac{2}{3}\tau_{r_0}\,g_{r_0}^{-1}\mu_{g_{r_0}} \in S^2_d(M) \tag{15.10}$$

be the associated totally umbilic gravitational momentum. Then

$$(g_{r_0}, \pi_{r_0}) = \left( g_{r_0},\; -\frac{2}{r_0}\,g_{r_0}^{-1}\mu_{g_{r_0}} \right) \in \mathcal{C}_H(M) \cap \mathcal{C}_\delta(M) \cap \mathcal{C}_{-3/r_0}(M)$$

If we now compare this data $(g_{r_0}, \pi_{r_0})$ with the associated $(g'_{r_0}, \pi'_{r_0}) \in \mathcal{C}_H(H^3_{r_0}/\Gamma) \cap \mathcal{C}_\delta(H^3_{r_0}/\Gamma) \cap \mathcal{C}_{-3/r_0}(H^3_{r_0}/\Gamma)$ of the hyperboloid $\mathbf{H}^3_{r_0}/\Gamma$ as an embedded submanifold of $\mathbf{I}^+/\Gamma$ (see the remark just before Theorem 15.1), we find that $(g_{r_0}, \pi_{r_0}) \cong (g'_{r_0}, \pi'_{r_0})$ under the isometry given by (15.7). Since the ambient spacetime $\mathbf{I}^+/\Gamma$ is flat and hence Ricci-flat, the uniqueness (up to isometry) of the maximal vacuum Cauchy development then gives

$$(V_4, {}^{(4)}g)_{(g_{r_0},\pi_{r_0})} \cong \mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0}) \cong \mathbf{I}^+/\Gamma \tag{15.11}$$

16

Rigid and Non-Rigid Standard Models

Let $L^\uparrow_+ \equiv SO^\uparrow(1, 3)$ denote the proper orthochronous Lorentz group and let $P^\uparrow_+ \equiv SO^\uparrow(1, 3) \ltimes \mathbb{R}^4$ denote the proper orthochronous Poincaré group. Let $\Gamma \subset L^\uparrow_+ \subset P^\uparrow_+$ be a discrete torsion-free co-compact subgroup of $L^\uparrow_+$ thought of as a lattice in $L^\uparrow_+$. The moduli space, i.e., the isometry classes, of hyperbolic structures with hyperbolic radius $r$ on the compact manifold $H^3_r/\Gamma$ is in bijective correspondence with the deformation space $\mathrm{Def}(\Gamma, L^\uparrow_+)$ of the lattice $\Gamma \subset L^\uparrow_+$ in the proper orthochronous Lorentz group, which by Mostow rigidity consists of one point. On the other hand, the moduli space of flat Lorentz structures on $I^+/\Gamma$ is in bijective correspondence with the deformation space $\mathrm{Def}(\Gamma, P^\uparrow_+)$ of the lattice $\Gamma$ in the properly larger group $P^\uparrow_+$ and this deformation space may be non-trivial, i.e., may be properly larger than a point. If $\mathrm{Def}(\Gamma, P^\uparrow_+)$ is a point, then the lattice $\Gamma$ is rigid and if $\mathrm{Def}(\Gamma, P^\uparrow_+)$ is more than a point, then $\Gamma$ is non-rigid.

Let $\mathbf{I}^+/\Gamma = (I^+/\Gamma, \gamma_{I^+/\Gamma})$ denote the standard model associated with $\Gamma$ as in Section 15. Then the standard model $\mathbf{I}^+/\Gamma$ is rigid (respectively, non-rigid) if the lattice $\Gamma$ is rigid (respectively, non-rigid). Note that the moduli space of flat spacetime structures on $I^+/\Gamma$ always contains the isometry class $[\gamma_{I^+/\Gamma}]$ of the standard one $\gamma_{I^+/\Gamma}$. If $\Gamma$ is rigid, the moduli space of flat spacetime structures on $I^+/\Gamma$ is a point and so must be the isometry class of the standard flat spacetime structure $[\gamma_{I^+/\Gamma}]$, in which case the moduli space is the one-point set $\{[\gamma_{I^+/\Gamma}]\}$. In this case $\gamma_{I^+/\Gamma}$ is unique up to isometry among all flat spacetime structures on $I^+/\Gamma$. On the other hand, if $\Gamma$ is not rigid, there is a finite-dimensional space of parameters (see below) that describes a Riemann moduli space or a Teichmüller space of deformation parameters (depending on how one defines equivalence of flat structures).

Now let $M$ be a closed connected orientable hyperbolizable 3-manifold with $\pi_1(M) \cong \Gamma \subset L^\uparrow_+$ a discrete torsion-free co-compact subgroup. Then $M$ is rigid (respectively, non-rigid) if the lattice $\Gamma$ is rigid (respectively, non-rigid). Fortunately, there is a geometrical criterion for when a hyperbolizable $M$ is either rigid or non-rigid.
For any Riemannian manifold $(M, g)$ and $h \in S_2(M)$, let $d^\nabla h \in C^\infty(\Lambda^2(M) \otimes \Lambda^1(M))$ denote the "exterior derivative on symmetric 2-tensors", defined in local coordinates by

$$(d^\nabla h)_{ijk} = \nabla_k h_{ij} - \nabla_j h_{ik} = h_{ij|k} - h_{ik|j}.$$

Let

$$C_2(M, g) \equiv S_2^{d^\nabla}(M, g) = \{\, h \in S_2(M) \mid d^\nabla h = 0 \,\} = \ker d^\nabla$$

denote the linear space of Codazzi tensors on $(M, g)$. The name arises because the second fundamental form $k \in S_2(M)$ of a hypersurface in an ambient flat space (or spacetime) satisfies the classical Codazzi equation $d^\nabla k = 0$. Let $S_2^{\mathrm{tr}}(M, g) = \{\, h \in S_2(M) \mid \mathrm{tr}_g h = 0 \,\} = \ker \mathrm{tr}_g$ denote the space of traceless symmetric 2-covariant tensor fields, and let

$$C_2^{\mathrm{tr}}(M, g) \equiv S_2^{\mathrm{tr}}(M, g) \cap S_2^{d^\nabla}(M, g) = \ker \mathrm{tr}_g \cap \ker d^\nabla$$

denote the space of traceless Codazzi tensors. We note that $C_2^{\mathrm{tr}}(M, g) \subset S_2^{TT}(M, g) = \{\, h \in S_2(M) \mid \delta_g h = 0 \text{ and } \mathrm{tr}_g h = 0 \,\}$ and that $C_2^{\mathrm{tr}}(M, g)$ is scale invariant, i.e., if $c > 0$, then $C_2^{\mathrm{tr}}(M, cg) = C_2^{\mathrm{tr}}(M, g)$. A computation shows that the symbol $\sigma_{\xi_x}(d^\nabla)$, $\xi_x \in T_x^*M$, of $d^\nabla$ is injective on traceless $h$; thus $d^\nabla$ is elliptic on $S_2^{\mathrm{tr}}(M, g)$, and so $C_2^{\mathrm{tr}}(M, g) = \ker \mathrm{tr}_g \cap \ker d^\nabla$ is finite-dimensional.

Now let $M$ be hyperbolizable and let $g$ be hyperbolic. Then it has been shown by Lafontaine [1983] that the infinitesimal moduli space of flat spacetime perturbations of $[\gamma_{I^+/\Gamma}]$ on $I^+/\Gamma$ is isomorphic to the space of traceless Codazzi tensors $C_2^{\mathrm{tr}}(M, g)$ (and by scale invariance and Mostow rigidity, this space is independent of the choice of hyperbolic metric). Consequently, there is at most a finite-dimensional space of deformations of the standard spacetime $\mathbf{I}^+/\Gamma$, which, moreover, is rigid (equivalently, $M$ is rigid) if and only if $C_2^{\mathrm{tr}}(M, g) = \{0\}$.
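The inclusion of the traceless Codazzi tensors in the TT tensors and their scale invariance are stated above without computation. The following is a short sketch of both, using only the coordinate formula for $d^\nabla$ given in the text; the divergence is written $\nabla^i h_{ij}$, which agrees with $\delta_g h$ up to a sign convention that does not affect the conclusion.

```latex
% Sketch: traceless Codazzi tensors are divergence-free.
% Contract (d^\nabla h)_{ijk} = \nabla_k h_{ij} - \nabla_j h_{ik} = 0 with g^{ik}:
\begin{align*}
0 = g^{ik}(d^\nabla h)_{ijk}
  = g^{ik}\nabla_k h_{ij} - g^{ik}\nabla_j h_{ik}
  = \nabla^i h_{ij} - \nabla_j(\operatorname{tr}_g h).
\end{align*}
% If tr_g h = 0, the second term vanishes, so \nabla^i h_{ij} = 0 and h is TT.

% Sketch: scale invariance. For a constant c > 0 the Christoffel symbols of
% cg and g coincide, so the covariant derivative, and hence d^\nabla, is unchanged:
\begin{align*}
\nabla^{cg} = \nabla^{g}, \qquad
\operatorname{tr}_{cg} h = (cg)^{ij}h_{ij} = c^{-1}\operatorname{tr}_g h .
\end{align*}
% Both kernels are therefore the same for g and cg, which gives
% C_2^{tr}(M, cg) = C_2^{tr}(M, g).
```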

17 Asymptotic Approach to Hyperbolic Geometry

In this section we consider the results of Andersson and Moncrief [2001] regarding the stability of the standard models $\mathbf{I}^+/\Gamma$ in the rigid case. In this case the moduli space of flat spacetime structures on $I^+/\Gamma$ is a point, namely the isometry class $\{[\gamma_{I^+/\Gamma}]\}$. Thus there are no flat non-isometric perturbations of $\gamma_{I^+/\Gamma}$ and so $\gamma_{I^+/\Gamma}$ is a canonical flat solution on $I^+/\Gamma$. At issue then is the stability of $[\gamma_{I^+/\Gamma}]$ in the moduli space of Ricci-flat spacetimes on $I^+/\Gamma$. In Part A below, we summarize and amalgamate some of the results in that paper. In Part B the properties of spacetimes coming from Cauchy data perturbed from the canonical data constructed in Part A are found.

17.1 Theorem. (Andersson–Moncrief [2001]) Let $M$ be a closed connected oriented hyperbolizable 3-manifold with fundamental group $\pi_1(M) \cong \Gamma \subset SO^\uparrow(1, 3)$, a discrete torsion-free co-compact subgroup, and let $\mathbf{I}^+/\Gamma$ denote the standard model associated with $M$.

A. For $\tau_0 < 0$, let $r_0 = -3/\tau_0 > 0$, and let $g_{r_0} \in \mathcal{M}$ be a hyperbolic metric on $M$ with hyperbolic radius $r_0$. Define

$$k_{r_0} \equiv \frac{\tau_0}{3}\, g_{r_0} = -\frac{1}{r_0}\, g_{r_0} \in S_2(M)$$

to be the second fundamental form with mean curvature $\tau_0 = \mathrm{tr}_{g_{r_0}} k_{r_0}$, and define the gravitational momentum

$$\pi_{r_0} \equiv \tfrac{2}{3}\tau_0\, g_{r_0}^{-1}\mu_{g_{r_0}} = -\frac{2}{r_0}\, g_{r_0}^{-1}\mu_{g_{r_0}} \in S_d^2(M)$$

so that $(g_{r_0}, \pi_{r_0})$ is CMC Cauchy data, $(g_{r_0}, \pi_{r_0}) \in C_H(M) \cap C_\delta(M) \cap C_{\tau_0}(M)$. Let $(V_4, {}^{(4)}g)_{(g_{r_0}, \pi_{r_0})}$ denote the maximal vacuum Cauchy development of $(g_{r_0}, \pi_{r_0})$ and let $\mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0})$ denote the Lorentz cone over $(M, g_{r_0})$, a warped product Lorentz manifold. Then

$$(V_4, {}^{(4)}g)_{(g_{r_0}, \pi_{r_0})} \cong \mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0}) \cong \mathbf{I}^+/\Gamma. \tag{17.1}$$

B. Suppose additionally that $M$ is rigid. Then given $\varepsilon = \varepsilon(\tau_0) > 0$, there exists a $\delta = \delta(\varepsilon) > 0$ such that if $(g_0, \pi_0) \in C_H(M) \cap C_\delta(M) \cap C_{\tau_0}(M)$ and

$$\| g_0 - g_{r_0} \|_{H^3} + \| \pi_0 - \pi_{r_0} \|_{H^2} < \delta \tag{17.2}$$

(where $\|\cdot\|_{H^s}$ denotes the $H^s$-Sobolev norm), then

(1) The maximal vacuum spacetime $(V_4, {}^{(4)}g)_{(g_0, \pi_0)}$ is globally foliated by CMC hypersurfaces to the future of $(M, g_0, \pi_0)$ (in the expanding direction). These CMC hypersurfaces are parameterized by $\tau \in [\tau_0, 0)$.

(2) For $\tau \in [\tau_0, 0)$, let $(g_\tau, \pi_\tau)$ denote the ADM Cauchy data induced on the $\tau = \text{constant}$ hypersurface in $(V_4, {}^{(4)}g)_{(g_0, \pi_0)}$ pulled back to $M$. Then there exists a continuous curve of diffeomorphisms $f_\tau \in \mathcal{D}_0(M)$ such that for $\tau \in [\tau_0, 0)$,

(a) $\| f_\tau^*(\tau^2 g_\tau) - \tau_0^2 g_{r_0} \|_{H^3} + \| f_\tau^*(\pi_\tau) - \pi_{r_0} \|_{H^2} < \varepsilon$, and

(b) $\lim_{\tau \to 0^-} \left( \| f_\tau^*(\tau^2 g_\tau) - \tau_0^2 g_{r_0} \|_{H^3} + \| f_\tau^*(\pi_\tau) - \pi_{r_0} \|_{H^2} \right) = 0$,

where $f_\tau^*$ denotes the pull-back by $f_\tau$.

(3) $(V_4, {}^{(4)}g)_{(g_0, \pi_0)}$ is future causally geodesically complete; and

(4) $(V_4, {}^{(4)}g)_{(g_0, \pi_0)}$ is past causally geodesically incomplete.

More generally, if $(g_0', \pi_0') \in C_H \cap C_\delta \cap C_{\tau_0'}$ has constant mean curvature $\tau_0' < 0$ (not necessarily $\tau_0$), then by rescaling by $(\tau_0'/\tau_0)^2$,

$$\left( \left(\frac{\tau_0'}{\tau_0}\right)^{2} g_0',\ \pi_0' \right) \in C_H \cap C_\delta \cap C_{\tau_0}. \tag{17.3}$$

Thus, given $\varepsilon > 0$, if $((\tau_0'/\tau_0)^2 g_0', \pi_0')$ is in the $\delta$-neighborhood of $(g_{r_0}, \pi_{r_0})$ given in (17.2) above, then the rescaled spacetime

$$(V_4', {}^{(4)}g')_{((\tau_0'/\tau_0)^2 g_0',\ \pi_0')},$$

where ${}^{(4)}g' \cong (\tau_0'/\tau_0)^2\, {}^{(4)}g$, satisfies properties (1), (2), (3), and (4). Similarly, if $(g_0', \pi_0') \in C_H \cap C_\delta \cap C_{\tau_0}$ and if for given $\varepsilon > 0$ there exists a diffeomorphism $f_0 \in \mathcal{D}(M)$ such that $\| f_0^* g_0' - g_{r_0} \|_{H^3} + \| f_0^* \pi_0' - \pi_{r_0} \|_{H^2} < \delta$, then $(f_0^* g_0', f_0^* \pi_0') \in C_H \cap C_\delta \cap C_{\tau_0}$ and thus satisfies (17.2). Thus the spacetime $(V_4', {}^{(4)}g')_{(f_0^* g_0',\, f_0^* \pi_0')}$ satisfies properties (1), (2), (3), and (4), where ${}^{(4)}g' \cong \tilde f_0^*({}^{(4)}g)$ and where $\tilde f_0^*({}^{(4)}g)$ is defined by pulling back the lapse function $N_\tau$, $f_0^* N_\tau = N_\tau \circ f_0$, the shift vector field $X_\tau$, $f_0^* X_\tau \equiv (f_0^{-1})_* X_\tau$ (where $(\,\cdot\,)_*$ denotes the push-forward of a vector field), and the curve $g_\tau$, $f_0^* g_\tau$, by the fixed spatial diffeomorphism $f_0$.

Remarks.

(a) Note that at the origin $(g_{r_0}, \pi_{r_0})$ of the $\delta$-neighborhood above, $(V_4, {}^{(4)}g)_{(g_{r_0}, \pi_{r_0})} \cong \mathbf{I}^+/\Gamma$ as in Part A.

(b) Items B(1), B(2a), and B(2b) assert that the "fiducial curve" $\left( (\tau_0/\tau)^2 g_{r_0},\ \pi_{r_0} \right)$ of ADM data induced by the CMC hypersurfaces of $\mathbf{I}^+/\Gamma \cong \mathbb{R}^+_1 \times_{r/r_0} (M, g_{r_0})$ (pulled back to $M$) is asymptotically stable up to isometry, so that if the initial data $(g_0, \pi_0)$ is sufficiently close to the fiducial initial data $(g_{r_0}, \pi_{r_0})$, then the curve $(g_\tau, \pi_\tau)$ (1) exists for all $\tau \in [\tau_0, 0)$; (2a) remains within an $\varepsilon$-tube in $\mathbb{R}^- \times P_{\mathrm{reduced}}$ enclosing the fiducial curve; and (2b) the diameter of the tube approaches zero as $\tau \to 0^-$. Items (1) and (2a) are the assertion that the fiducial curve is stable, which together with (2b) is the assertion that the fiducial curve is asymptotically stable, where both stable and asymptotically stable are up to isometry.

(c) The constant mean curvatures of both sets of Cauchy data, the fiducial set $(g_{r_0}, \pi_{r_0})$ and the perturbed set $(g_0, \pi_0)$, have to be taken to have equal $\tau_0 = \tau(g_{r_0}, \pi_{r_0}) = \tau(g_0, \pi_0)$ in order to ensure that in the resulting spacetimes these sets of Cauchy data lie on corresponding CMC hypersurfaces (in order to compare apples with apples, so to speak). More general CMC Cauchy data $(g, \pi) \in C_H \cap C_\delta \cap C_{\mathbb{R}^-}$ which when rescaled back to the constant $\tau_0$ Cauchy hypersurface also lies in the $\varepsilon$-neighborhood of the fiducial

data $(g_{r_0}, \pi_{r_0})$ will then also generate a maximal spacetime that satisfies conditions B(1–4).

Now we consider the non-rigid case. As discussed in Section 16, this case arises when $M$ is hyperbolizable and admits non-trivial traceless Codazzi tensors with respect to a hyperbolic metric on $M$. In this case the resulting spacetime $\mathbf{I}^+/\Gamma$ is not rigid in the 4-dimensional sense and admits a non-trivial finite-dimensional moduli space of flat perturbations of the canonical flat metric, even though Mostow rigidity prohibits a deformation of the hyperbolic structure itself.

One of the main features of the proof of Theorem 17.1 is that in the rigid case, the moduli space of flat spacetime perturbations is a point and the 4-dimensional geometries on $\mathbf{I}^+/\Gamma$ can be entirely controlled by the higher order Bel–Robinson energies defined and studied in Christodoulou and Klainerman [1993]. In the rigid case, the Bel–Robinson energies are non-degenerate and decay to zero for small data. In contrast, if $M$ is non-rigid, the Bel–Robinson energies, based as they are entirely on curvature, vanish on flat spacetimes and thus cannot control flat perturbations of the spacetime geometries. Thus in this case, one needs an additional tool for the proof of long-time existence and asymptotic behavior.

The reduced Hamiltonian may provide precisely the tool needed to extend the proof of Theorem 17.1 to the non-rigid case, since it has an isolated local minimum at the hyperbolic fixed point $(\tilde\gamma, 0)$ and decays towards this minimum in the direction of expansion. This Hamiltonian bounds at best an $H^1 \times L^2$ Sobolev norm of the Cauchy data $(g, \pi)$, which is far too weak to use for the desired long-time-existence theorem. But to control the Teichmüller parameters in the non-rigid case, all we really need is an additional bound on the finite-dimensional space of modular parameters to complement the Bel–Robinson bounds on curvature, and this, it seems, $H_{\mathrm{reduced}}$ can provide.
Thus an important potential application of our results would be to use them to extend the long-time-existence and asymptotic behavior results of Theorem 17.1 to the case of non-rigid hyperbolizable $M$. Thus, with a proof proposed along the above lines, we have the following.

17.2 Conjecture. Part B of Theorem 17.1 is true without the requirement that $M$ be rigid.

In this context, it is worth remarking that in their study of the nonlinear stability of Minkowski space, Christodoulou and Klainerman [1993] never needed to appeal to the ADM-mass functional, which is roughly the analogue of $H_{\mathrm{reduced}}$ here, but only used Bel–Robinson type energies. However, Minkowski space is known to be isolated as a flat solution of Einstein's equations for the case of asymptotically flat spacetimes and this is more analogous to the rigid case which also only requires curvature type

energy estimates for its analysis.
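The fiducial data of Part A of Theorem 17.1 can be checked by direct arithmetic. The sketch below uses only the relations stated in the theorem ($r_0 = -3/\tau_0$, $\dim M = 3$) together with the mean-curvature formula $\tau = \tfrac12(\mathrm{tr}_g \pi)\mu_g^{-1}$ quoted later in Conjecture 20.1; the symbol $r(\tau)$ for the hyperbolic radius of the slice at mean curvature $\tau$ is introduced here for illustration only.

```latex
% Check of the Part A data (dim M = 3, r_0 = -3/\tau_0).
\begin{align*}
\operatorname{tr}_{g_{r_0}} k_{r_0}
  &= \frac{\tau_0}{3}\operatorname{tr}_{g_{r_0}} g_{r_0}
   = \frac{\tau_0}{3}\cdot 3 = \tau_0, \\
\frac{\tau_0}{3} &= -\frac{1}{r_0}, \qquad
\frac{2}{3}\tau_0 = \frac{2}{3}\Bigl(-\frac{3}{r_0}\Bigr) = -\frac{2}{r_0}, \\
\tfrac12 \operatorname{tr}_{g_{r_0}} \pi_{r_0}
  &= \tfrac12\cdot\tfrac23\,\tau_0\cdot 3\,\mu_{g_{r_0}}
   = \tau_0\,\mu_{g_{r_0}} .
\end{align*}
% Fiducial curve of Remark (b): g_\tau = (\tau_0/\tau)^2 g_{r_0}.
% Rescaling a metric by c^2 multiplies the hyperbolic radius by c, so
\begin{align*}
\tau^2 g_\tau = \tau_0^2\, g_{r_0}, \qquad
r(\tau) = \frac{\tau_0}{\tau}\, r_0 = -\frac{3}{\tau}.
\end{align*}
% Hence the left-hand side of B(2a) vanishes identically along the
% fiducial curve, as it should.
```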

18 The Hyperbolic Fixed Point is a Local Attractor

In this section we apply Theorem 17.1 to show that the hyperbolic fixed point of the reduced Einstein equations is a local attractor. As we have seen in Section 10, the hyperbolic fixed point is a strict local minimum of $H_{\mathrm{reduced}}$, which when coupled with the strict monotonicity of $H_{\mathrm{reduced}}$ suggests that it is a local attractor. However, as discussed in the previous section, $H_{\mathrm{reduced}}$ is too weak to use for the desired long-time existence. Fortunately, the stronger results of Theorem 17.1 can be applied here to get the desired results.

At issue is whether the maximal integral curve for initial data $(\gamma_0, p_0^{TT}) \in P_{\mathrm{reduced}}$ sufficiently close to the hyperbolic fixed point $(\tilde\gamma, 0)$ asymptotically decays in the direction of cosmological expansion to $(\tilde\gamma, 0)$. This is answered affirmatively in the rigid case by Theorem 18.1 below and as depicted in Figure 18.1.

[Figure 18.1. Asymptotic approach to the hyperbolic fixed point for small data. Labels: Infinite Expansion ($\tau = 0$ or $t = \infty$); vertical axis $\tau$; Gravitational phase space; Big Bang ($\tau = -\infty$ or $t = 0$).]

For arbitrary initial data $(\tau_0, \gamma_0, p_0^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}$, let $t_0 = 2/(3\tau_0^2)$, and let

$$c_{\max} : (t_1, t_2) \subseteq \mathbb{R}^+ \longrightarrow P_{\mathrm{reduced}}, \qquad t \longmapsto c_{\max}(t) = \left( \gamma(t), p^{TT}(t) \right), \tag{18.1}$$

$0 \le t_1 < t_0 < t_2 \le \infty$, denote the maximal integral curve of the initial value problem of the reduced Einstein equations so that

$$c_{\max}(t_0) = \left( \gamma(t_0), p^{TT}(t_0) \right) = \left( \gamma_0, p_0^{TT} \right). \tag{18.2}$$

18.1 Theorem. Let $M$ be a closed connected oriented rigid hyperbolizable 3-manifold, let $\tilde\gamma \in \mathcal{M}_{-1}$ be a hyperbolic metric on $M$, and let $(\tilde\gamma, 0) \in P_{\mathrm{reduced}}$ be the hyperbolic fixed point of the reduced Einstein equations. Then $(\tilde\gamma, 0)$ is asymptotically stable (up to isometry). Thus for fixed $\tau_0 \in \mathbb{R}^-$ and $\varepsilon = \varepsilon(\tau_0) > 0$, there exists a $\delta = \delta(\varepsilon) > 0$ such that if $(\gamma_0, p_0^{TT}) \in P_{\mathrm{reduced}}$ satisfies

$$\| \gamma_0 - \tilde\gamma \|_{H^3} + \| p_0^{TT} \|_{H^2} < \delta$$

and if $c_{\max} = \left( \gamma(t), p^{TT}(t) \right)$ is the maximal integral curve of the initial value problem with initial data $(\tau_0, \gamma_0, p_0^{TT})$ as in Eqs. (18.1) and (18.2), then

(1) $c_{\max}$ is positively complete, i.e., $t_2 = \infty$, so that $c_{\max}$ is defined for $t \in (t_1, \infty)$ and the resulting maximal spacetime is future causally geodesically complete;

(2) There exists a continuous curve of diffeomorphisms $f_t \in \mathcal{D}_0(M)$ such that for $t \in [t_0, \infty)$,

(2a) $\| f_t^* \gamma(t) - \tilde\gamma \|_{H^3} + \| f_t^* p^{TT}(t) \|_{H^2} < \varepsilon$;

(2b) $\| f_t^* \gamma(t) - \tilde\gamma \|_{H^3} + \| f_t^* p^{TT}(t) \|_{H^2} \to 0$ as $t \to \infty$,

where $f_t^*$ denotes the pull-back by $f_t$. Consequently, by continuity of $H_{\mathrm{reduced}}$ and $\mathrm{vol}_{-1}$,

$$H_{\mathrm{reduced}}\left( \tau(t), f_t^* \gamma(t), f_t^* p^{TT}(t) \right) = H_{\mathrm{reduced}}\left( \tau(t), \gamma(t), p^{TT}(t) \right) \longrightarrow \left(\tfrac{3}{2}\right)^{3/2} \mathrm{vol}(M, \tilde\gamma)$$

as $t \to \infty$, where $\tau(t) = -\left( \frac{2}{3t} \right)^{1/2}$, and thus also

$$\mathrm{vol}\left( M, f_t^*(\gamma(t)) \right) = \mathrm{vol}\left( M, \gamma(t) \right) \longrightarrow \mathrm{vol}(M, \tilde\gamma).$$

Remarks.

(a) Thus if $M$ is a rigid hyperbolizable manifold, then the hyperbolic fixed point is a local attractor for the reduced Einstein flow, thereby providing an important stability result for the reduced Einstein equations. This result can be briefly described as being a reduced and compactified version of Christodoulou and Klainerman's [1993] small data result for asymptotic stability of Minkowski space, where $\mathbf{I}^+/\Gamma$ plays the role of Minkowski space. We remark that a local attractor is defined by the fact that integral curves for reduced Cauchy data sufficiently near the hyperbolic

fixed point $(\tilde\gamma, 0)$ both stay near and asymptotically decay to the hyperbolic fixed point in the direction of expansion, and that these are independent characteristics.

(b) The initial Cauchy data $(\gamma_0, p_0^{TT}) \in P_{\mathrm{reduced}}$ is prescribed at the fixed initial time $t_0 = 2/(3\tau_0^2)$. This normalization ensures that all integral curves for varying $(\gamma_0, p_0^{TT}) \in P_{\mathrm{reduced}}$ start at the same time $t_0$ and is thus analogous to fixing the constant mean curvature $\tau_0$ of the Cauchy data $(g, \pi)$ in Theorem 17.1.

(c) If the hyperbolic $\sigma$-conjecture is true, then from (2b) above, along the reduced Einstein flow the reduced Hamiltonian asymptotically approaches its infimum $\inf H_{\mathrm{reduced}} = \left( -\tfrac{3}{2}\sigma(M) \right)^{3/2}$ as $t \to \infty$.
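The relation between the time $t$ used in this section and the mean curvature $\tau$ of Theorem 17.1 can be made explicit. The following sketch assumes only the normalization $t_0 = 2/(3\tau_0^2)$ and the formula for $\tau(t)$ quoted in Theorem 18.1, and checks that increasing $t$ corresponds to the direction of expansion $\tau \to 0^-$.

```latex
% Time reparametrization, for \tau < 0 and t > 0:
\begin{align*}
t = \frac{2}{3\tau^2}
\quad\Longleftrightarrow\quad
\tau(t) = -\Bigl(\frac{2}{3t}\Bigr)^{1/2},
\qquad
\frac{dt}{d\tau} = -\frac{4}{3\tau^3} > 0 \quad (\tau < 0).
\end{align*}
% So t is strictly increasing in \tau on \mathbb{R}^-, and the end points match:
\begin{align*}
\tau \to -\infty \;\Longrightarrow\; t \to 0^+,
\qquad
\tau \to 0^- \;\Longrightarrow\; t \to \infty .
\end{align*}
% In particular the interval \tau \in [\tau_0, 0) of Theorem 17.1
% corresponds to t \in [t_0, \infty) in Theorem 18.1.
```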

19 Collapse of Bianchi Models with σ(M ) = 0

For manifolds of Yamabe type −1, the strict monotonic decay of $H_{\mathrm{reduced}}$ in the direction of cosmological expansion along non-constant integral curves of the reduced Einstein equations suggests that the reduced Hamiltonian is seeking to achieve its infimum $\inf H_{\mathrm{reduced}} = \left( -\tfrac{3}{2}\sigma(M) \right)^{3/2}$. But does this ever happen? Does the reduced Einstein flow of the conformal geometry asymptotically approach $\inf H_{\mathrm{reduced}}$ in the limit of infinite cosmological expansion?

To answer this question we consider the vacuum solutions of Einstein's equations which spatially compactify to manifolds of Yamabe type −1. These models are the five Bianchi models of types II, III, VI$_0$, VIII, and V (and in part VII$_h$), which in turn correspond in Thurston's classification to manifolds of type Nil, $H^2 \times \mathbb{R}$, Sol, $\widetilde{SL}(2, \mathbb{R})$, and $H^3$, respectively. The various Bianchi models include both trivial (Bianchi type III) and nontrivial (Bianchi type VIII) circle bundles over higher genus surfaces, nontrivial circle bundles over the torus (Bianchi type II), torus bundles over the circle (Bianchi type VI$_0$), and compact hyperbolic manifolds (Bianchi types V and VII$_h$).

Not all Bianchi models admit spatially compact quotients and, of those that do, some yield manifolds of Yamabe type 0 (which allow constant zero scalar curvature metrics but not constant positive scalar curvature metrics), e.g., Bianchi I models defined over $T^3$ or certain quotients thereof,

while others yield manifolds of Yamabe type +1 (which allow positive constant scalar curvature), e.g., Bianchi IX models deﬁned over S3 or certain quotients thereof. Fortunately, the general theory of which Bianchi models admit spatially compact quotients has been worked out in detail in Tanimoto, Koike, and Hosoya [1994, 1997a,b]. These Bianchi models together with their corresponding Thurston classiﬁcation and certain of their compact quotient manifolds are listed in Table 19.1 below, where “K-S” indicates “Kantowski-Sachs”, “?” indicates “unknown, but conjectured to be so”, “Yam” denotes the Yamabe type of M , “Seifert” means “Seifert ﬁbered”, and Σp is a closed orientable 2-manifold of genus p ≥ 2. Here we summarize our calculations for the Bianchi models that spatially compactify to manifolds of Yamabe type −1. These models are the last ﬁve entries in Table 19.1. We have calculated these examples using explicitly known vacuum metrics for the simplest “standard” metric forms, given, for example, in Wainwright and Ellis [1997]. We have not considered all possible such spatially compact quotients even though that would appear to be quite feasible, but have considered only some representative examples for each of the Bianchi types listed. The details of these calculations will appear elsewhere; see Fischer and Moncrief [2002]. Table 19.1. Bianchi, Thurston, and Yamabe type of a closed 3-manifold

Bianchi    Thurston                        Finite Covers                      Yam   σ(M)    Structure
K-S        S² × R                          Trivial S¹-bundle over S²          +     > 0     Seifert
IX         S³                              Non-trivial S¹-bundles over S²     +     > 0     Seifert
I          R³                              Trivial S¹-bundle over T²          0     0       Seifert
II         Nil                             Non-trivial S¹-bundles over T²     −     0       Seifert
III        H² × R                          Trivial S¹-bundle over Σp          −     0       Seifert
VIII       $\widetilde{SL}(2, R)$          Non-trivial S¹-bundles over Σp     −     0       Seifert
VI0        Sol                             Non-trivial T²-bundles over S¹     −     0       Graph
V, VIIh    H³                              Compact hyperbolic manifolds       −     < 0 ?   Hyperbolic

In the first three cases, the models of Bianchi type II, III, and VIII compactify to a Seifert fibered space and thus in particular are irreducible graph manifolds. In the fourth case, the model of Bianchi type VI$_0$ compactifies to an irreducible graph manifold. Thus in the first four cases $\sigma(M) = 0$. In the fifth case, we considered vacuum Bianchi V metrics as well as a special case of Bianchi type VII$_h$ which compactify to an arbitrary compact hyperbolizable manifold $M$. Thus in the fifth case, $\sigma(M)$ is conjectured to be $< 0$ and to be determined by the hyperbolic volume.

In all five cases that we have calculated, the reduced Einstein flow exists semi-globally in the positive direction, i.e., for all $t \in \mathbb{R}^+$, and moreover, the conformal metrics under the reduced Einstein flow $\gamma(t) \in \mathcal{M}_{-1}$ all evolve with bounded

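Ricci curvature,

$$A \le \| \mathrm{Ric}(\gamma(t)) \|_{\gamma(t)} \le B \qquad \text{for } t \in \mathbb{R}^+,$$

$0 < A \le B < \infty$, and where the curvatures are bounded away from zero since the conformal metrics $\gamma(t)$ evolve in $\mathcal{M}_{-1}$.

In the hyperbolic case, the integral curve is the hyperbolic fixed point $(\tilde\gamma, 0)$. Thus the curve of conformal metrics is constant, $\gamma(t) = \tilde\gamma$, and so the constant hyperbolic curvature is trivially bounded,

$$\| \mathrm{Ric}(\gamma(t)) \|_{\gamma(t)} = \| \mathrm{Ric}(\tilde\gamma) \|_{\tilde\gamma} = \left\| -\tfrac{1}{3}\tilde\gamma \right\|_{\tilde\gamma} = \frac{\sqrt{3}}{3} = A = B > 0 \qquad \text{for all } t \in \mathbb{R}^+.$$

Moreover, from (10.4), the reduced Hamiltonian is constant,

$$H_{\mathrm{reduced}}(\tau(t), \tilde\gamma, 0) = \left(\tfrac{3}{2}\right)^{3/2} \mathrm{vol}(M, \tilde\gamma) \qquad \text{for all } t \in \mathbb{R}^+, \tag{19.1}$$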

and thus trivially asymptotically realizes the hyperbolic volume and thus $\inf H_{\mathrm{reduced}} = \left( -\tfrac{3}{2}\sigma(M) \right)^{3/2}$ if and only if the hyperbolic $\sigma$-conjecture is true. Similarly the constant hyperbolic volume does not volume collapse $M$,

$$\mathrm{vol}\left( M, \gamma(t) \right) = \mathrm{vol}(M, \tilde\gamma) > 0 \qquad \text{for all } t \in \mathbb{R}^+. \tag{19.2}$$

In the four cases where $\sigma(M) = 0$, using the known solutions we have shown by explicit calculation that the reduced Hamiltonian along the integral curve $\left( \gamma(t), p^{TT}(t) \right)$ of the reduced Einstein equations satisfies

$$H_{\mathrm{reduced}}\left( \tau(t), \gamma(t), p^{TT}(t) \right) \longrightarrow 0 \qquad \text{as } t \longrightarrow \infty, \tag{19.3}$$

thereby asymptotically realizing $\left( -\tfrac{3}{2}\sigma(M) \right)^{3/2} = 0$. The curve of conformal metrics $\gamma(t)$ must volume collapse $M$,

$$\mathrm{vol}\left( M, \gamma(t) \right) \longrightarrow 0 \qquad \text{as } t \longrightarrow \infty, \tag{19.4}$$

since from (12.2) the curve of conformal volumes $\mathrm{vol}(M, \gamma(t))$ is squeezed between $(2/3)^{3/2}\, H_{\mathrm{reduced}}\left( \tau(t), \gamma(t), p^{TT}(t) \right)$ and 0. Thus in each of these four cases, the volume collapse under the reduced Einstein flow is interpreted as $H_{\mathrm{reduced}}$ asymptotically realizing $\left( -\tfrac{3}{2}\sigma(M) \right)^{3/2} = 0$.

Moreover, graph manifolds (and thus also Seifert manifolds) have the remarkable characteristic property that they admit sequences of metrics which can collapse to zero volume while having bounded curvature. Thus these four cases provide naturally occurring relativistic examples of how the reduced Einstein flow provides curves of conformal metrics which exhibit the phenomenon of volume collapse with bounded curvature, precisely as occurs in the Cheeger and Gromov [1986, 1990] theory of collapse.

In the four cases where $\sigma(M) = 0$, each of the manifolds is non-hyperbolizable. Thus no metric can actually achieve $\sigma(M) = 0$ and so $H_{\mathrm{reduced}}$ can only approach $\sigma(M) = 0$ along an integral curve on which there occurs some kind of degeneration. As we have seen, the conformal volumes collapse $M$. However, the details of how this collapse occurs vary somewhat among these four cases.

In the three cases where the Bianchi models of type II, III, and VIII compactify to a Seifert fibered space, the volume collapse occurs along circular fibers. Additionally, in the Bianchi type II case, which corresponds to the Thurston type Nil, not only do the circular fibers collapse but the quotient space $M/S^1 \cong T^2$ of $M$ modulo its circular fibers also collapses. Thus these model universes of Bianchi type II exhibit total collapse, as recognized by Gromov [1978] over 20 years ago in a different setting.

In the fourth case, where the Bianchi model of type VI$_0$ compactifies to an irreducible graph manifold, the volume collapse occurs along embedded $T^2$ fibers. In this case the manifold is not Seifert fibered so there are no circular fibers that can collapse. Moreover, the quotient space $M/T^2 \cong S^1$ of $M$ modulo the $T^2$ fibers, although circular, does not collapse.
The Bianchi models with Class A vacuum initial data have been studied extensively by Ringström [2000a,b], who finds that for Bianchi types II, VI$_0$, and VIII, the reduced Hamiltonian does indeed asymptotically converge to zero for all Class A vacuum initial data, thereby adding more examples of $H_{\mathrm{reduced}}$ converging to its infimum.

Thus, in summary, in all five cases that we have calculated, subject to the hyperbolic $\sigma$-conjecture, the reduced Hamiltonian asymptotically approaches its $\sigma$-constant infimum along the flow lines of the reduced Einstein system. In doing so the volumes of the conformal metrics either go to zero (in the first four cases) or to the hyperbolic volume (in the hyperbolic case). In all five cases, the curvature of the conformal metrics is bounded both above and away from zero as $t \to \infty$. In contrast, in all five cases, the volumes of the physical metrics $g(t) = \varphi^4(t)\gamma(t)$ go to infinity and their curvatures go to zero in the forward direction,

$$\mathrm{vol}\left( M, g(t) \right) \longrightarrow \infty \quad \text{and} \quad \| \mathrm{Ric}(g(t)) \|_{g(t)} \longrightarrow 0 \qquad \text{as } t \to \infty, \tag{19.5}$$

as befits the expansion of these universes.
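Two small computations underlie the statements of this section and can be sketched explicitly. The first is the value $\sqrt{3}/3$ of the Ricci norm at the hyperbolic fixed point, reading $\mathcal{M}_{-1}$ as the metrics of scalar curvature $-1$ (which is what $\mathrm{Ric}(\tilde\gamma) = -\tfrac13\tilde\gamma$ implies in dimension 3). The second is the squeeze argument for volume collapse, assuming the bound quoted from (12.2) has the form stated in the text.

```latex
% (i) Ricci norm of a hyperbolic metric with scalar curvature -1, n = 3:
\begin{align*}
\mathrm{Ric}(\tilde\gamma) = -\tfrac13\,\tilde\gamma
\;\Longrightarrow\;
\|\mathrm{Ric}(\tilde\gamma)\|^2_{\tilde\gamma}
 = \tfrac19\,\tilde\gamma^{ik}\tilde\gamma^{jl}\tilde\gamma_{ij}\tilde\gamma_{kl}
 = \tfrac19\cdot 3 = \tfrac13,
\qquad
\|\mathrm{Ric}(\tilde\gamma)\|_{\tilde\gamma} = \frac{\sqrt3}{3}.
\end{align*}
% (ii) Squeeze: if H_reduced -> 0 along the flow, then
\begin{align*}
0 \;\le\; \mathrm{vol}\bigl(M,\gamma(t)\bigr)
  \;\le\; \bigl(\tfrac23\bigr)^{3/2}
  H_{\mathrm{reduced}}\bigl(\tau(t),\gamma(t),p^{TT}(t)\bigr)
  \;\longrightarrow\; 0 .
\end{align*}
% Consistency check at the hyperbolic fixed point, using (19.1):
\begin{align*}
\bigl(\tfrac23\bigr)^{3/2}\bigl(\tfrac32\bigr)^{3/2}\mathrm{vol}(M,\tilde\gamma)
 = \mathrm{vol}(M,\tilde\gamma),
\end{align*}
% so the upper bound is attained there, consistent with (19.2).
```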

20 The Reduced ADM-Hamiltonian

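Using the Lichnerowicz transform $L : \mathbb{R}^- \times P_{\mathrm{reduced}} \to C_H \cap C_\delta \cap C_{\mathbb{R}^-}$, the reduced Hamiltonian $H_{\mathrm{reduced}} : \mathbb{R}^- \times P_{\mathrm{reduced}} \to \mathbb{R}$ defines a reduced ADM-Hamiltonian on the CMC-constraint space,

$$H^{ADM}_{\mathrm{reduced}} = H_{\mathrm{reduced}} \circ L^{-1} : C_H \cap C_\delta \cap C_{\mathbb{R}^-} \to \mathbb{R}, \qquad (g, \pi) \mapsto H^{ADM}_{\mathrm{reduced}}(g, \pi). \tag{20.1}$$

Thus from (10.2) and since $g = \varphi^4 \gamma$,

$$H^{ADM}_{\mathrm{reduced}}(g, \pi) = -\tau^3(g, \pi) \int_M \mu_g = -\tau^3(g, \pi)\, \mathrm{vol}(M, g), \tag{20.2}$$

and we note that $H^{ADM}_{\mathrm{reduced}}(g, \pi)$ is algebraic in $g$ and $\pi$. We also note

$$\inf_{(g, \pi) \in C_H \cap C_\delta \cap C_{\mathbb{R}^-}} H^{ADM}_{\mathrm{reduced}}(g, \pi) = \inf_{(\tau, \gamma, p^{TT}) \in \mathbb{R}^- \times P_{\mathrm{reduced}}} H_{\mathrm{reduced}}(\tau, \gamma, p^{TT}) = \left( -\tfrac{3}{2}\sigma(M) \right)^{3/2}. \tag{20.3}$$

If $M$ is hyperbolizable, $\tilde\gamma \in \mathcal{M}_{-1}$ hyperbolic, and $\tau \in \mathbb{R}^-$, then from (5.8),

$$L(\tau, \tilde\gamma, 0) = \left( \frac{3}{2\tau^2}\, \tilde\gamma,\ -\left(\tfrac{2}{3}\right)^{1/2} \tilde\gamma^{-1} \mu_{\tilde\gamma} \right).$$

Thus

$$H_{\mathrm{reduced}}(\tau, \tilde\gamma, 0) = H^{ADM}_{\mathrm{reduced}}\left( \frac{3}{2\tau^2}\, \tilde\gamma,\ -\left(\tfrac{2}{3}\right)^{1/2} \tilde\gamma^{-1} \mu_{\tilde\gamma} \right) = \left(\tfrac{3}{2}\right)^{3/2} \mathrm{vol}(M, \tilde\gamma), \tag{20.4}$$

which is equal to

$$\inf H^{ADM}_{\mathrm{reduced}} = \left( -\tfrac{3}{2}\sigma(M) \right)^{3/2}$$

if and only if the hyperbolic $\sigma$-conjecture is true, and if and only if $\left(\tfrac{3}{2}\right)^{3/2} \mathrm{vol}(M, \tilde\gamma)$ is a global minimum of $H^{ADM}_{\mathrm{reduced}}$. Thus the hyperbolic $\sigma$-conjecture (7.9) can be given an equivalent relativistic reformulation in terms of $H^{ADM}_{\mathrm{reduced}}$ as follows (see also (10.9)):

20.1 Conjecture. Let $M$ be a closed connected oriented hyperbolizable 3-manifold and let $\tilde\gamma \in \mathcal{M}_{-1}$ be a hyperbolic metric on $M$. Let $(g, \pi) \in C_H \cap C_\delta \cap C_{\mathbb{R}^-}$ be CMC Cauchy data with constant mean curvature $\tau = \tfrac{1}{2}(\mathrm{tr}_g \pi)\mu_g^{-1} < 0$. Then

$$H^{ADM}_{\mathrm{reduced}}(g, \pi) \ge \left(\tfrac{3}{2}\right)^{3/2} \mathrm{vol}(M, \tilde\gamma) \tag{20.5}$$

with equality if and only if $g \cong \frac{3}{2\tau^2}\, \tilde\gamma$ and $\pi = \tfrac{2}{3}\tau\, g^{-1}\mu_g \cong -\left(\tfrac{2}{3}\right)^{1/2} \tilde\gamma^{-1}\mu_{\tilde\gamma}$.

Finally, we remark that in Andersson [2000] a higher dimensional analog ($n \ge 3$) of this conjecture is proven for flat Cauchy data.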
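The value in (20.4) and the equality case of Conjecture 20.1 can be checked directly from (20.2). The sketch below assumes only $\dim M = 3$, $\tau < 0$, the formula $H^{ADM}_{\mathrm{reduced}}(g,\pi) = -\tau^3\,\mathrm{vol}(M,g)$, and $\pi = \tfrac23\tau\,g^{-1}\mu_g$ for the fiducial CMC data.

```latex
% Rescaled hyperbolic metric g = (3 / 2\tau^2) \tilde\gamma in dimension 3:
\begin{align*}
\mu_g = \Bigl(\frac{3}{2\tau^2}\Bigr)^{3/2}\mu_{\tilde\gamma},
\qquad
g^{-1} = \frac{2\tau^2}{3}\,\tilde\gamma^{-1}.
\end{align*}
% Value of the reduced ADM-Hamiltonian (recall -\tau^3 = |\tau|^3 for \tau < 0):
\begin{align*}
-\tau^3\,\mathrm{vol}(M,g)
 = |\tau|^3\Bigl(\frac32\Bigr)^{3/2}\frac{1}{|\tau|^3}\,\mathrm{vol}(M,\tilde\gamma)
 = \Bigl(\frac32\Bigr)^{3/2}\mathrm{vol}(M,\tilde\gamma).
\end{align*}
% Momentum (recall \tau/|\tau| = -1):
\begin{align*}
\frac23\,\tau\,g^{-1}\mu_g
 = \frac23\,\tau\,\Bigl(\frac{3}{2\tau^2}\Bigr)^{1/2}\tilde\gamma^{-1}\mu_{\tilde\gamma}
 = -\frac23\Bigl(\frac32\Bigr)^{1/2}\tilde\gamma^{-1}\mu_{\tilde\gamma}
 = -\Bigl(\frac23\Bigr)^{1/2}\tilde\gamma^{-1}\mu_{\tilde\gamma}.
\end{align*}
% Both results are independent of \tau, matching (20.4) and the
% equality case stated in (20.5).
```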

Acknowledgments: This research was supported in part by NSF grant PHY-9732629 to Yale University. We would also like to thank the Institut des Hautes Études Scientifiques, the Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, and The Erwin Schrödinger International Institute for Mathematical Physics for hospitality and financial support for several periods during the time in which this research was carried out.

References

Anderson, M. [1993], Degeneration of metrics with bounded curvature and applications to critical metrics of Riemannian functionals, Proceedings of Symposia in Pure Mathematics, 54, Part 3, 53–79.
Anderson, M. [1997], Scalar curvature and geometrization conjectures for 3-manifolds. In Comparison Geometry (Berkeley, CA 1993–94), Math. Sci. Res. Inst. Publ. 30, Cambridge University Press, Cambridge.
Anderson, M. [1999], Scalar curvature, metric degenerations and the static vacuum Einstein equations on 3-manifolds, I, Geometric and Functional Analysis, 9, 855–967.
Andersson, L. [2000], The global existence problem in general relativity, preprint IHES/M/00/18 and gr-qc/9911032 v3 11 Feb 2000.
Andersson, L. [2000], Constant mean curvature foliations of flat space-times, preprint.
Andersson, L., and V. Moncrief [2001], The global existence problem in general relativity, (in preparation).
Arnowitt, R., S. Deser, and C. Misner [1962], The dynamics of general relativity. In Gravitation: an Introduction to Current Research, (L. Witten, ed.), John Wiley and Sons, Inc., New York.
Aubin, T. [1982], Nonlinear Analysis on Manifolds. Monge-Ampère Equations, Springer-Verlag, New York.
Benedetti, R., and C. Petronio [1992], Lectures on Hyperbolic Geometry, Springer-Verlag, New York.
Cheeger, J., and M. Gromov [1986], Collapsing Riemannian manifolds while keeping their curvature bounded I, Jour. Diff. Geom., 23, 309–346.
Cheeger, J., and M. Gromov [1990], Collapsing Riemannian manifolds while keeping their curvature bounded II, Jour. Diff. Geom., 32, 269–298.
Choquet-Bruhat, Yvonne, and Robert Geroch [1969], Global aspects of the Cauchy problem in general relativity, Comm. Math. Phys., 14, 329–335.
Choquet-Bruhat, Y., and J. York [1980], The Cauchy problem. In General Relativity and Gravitation: Volume 1, (A. Held, ed.), Plenum Press, New York.
Christodoulou, D., and S. Klainerman [1993], The global nonlinear stability of the Minkowski space, Princeton University Press, Princeton, New Jersey.
Fischer, A., and J. E. Marsden [1972], The Einstein equations of evolution - a geometric approach, J. Math. Phys., 28, 1–38.

Fischer, A., and J. E. Marsden [1975], Deformations of the scalar curvature, Duke Math. J., 42, 519–547.
Fischer, A., and J. E. Marsden [1979], The initial value problem and the canonical formalism of general relativity. In General Relativity, An Einstein Centenary Survey, (S. W. Hawking and W. Israel, eds.), Cambridge University Press, Cambridge, New York, 138–211.
Fischer, A., and J. E. Marsden [1979], Topics in the dynamics of general relativity. In Proceedings of the International School of Physics "Enrico Fermi", Isolated Gravitating Systems in General Relativity, (J. Ehlers, ed.), North Holland, 396–456.
Fischer, A., and V. Moncrief [1996], The structure of quantum conformal superspace. In Global Structure and Evolution in General Relativity, (S. Cotsakis and G. Gibbons, eds.), Springer-Verlag, Berlin, 111–173.
Fischer, A., and V. Moncrief [1997], Hamiltonian reduction of Einstein's equations of general relativity, Nuclear Phys. B, Proc. Suppl., 57, 142–161.
Fischer, A., and V. Moncrief [1999], The Einstein flow, the σ-constant and geometrization of 3-manifolds, Classical Quantum Gravity, 16, L79–L87.
Fischer, A., and V. Moncrief [2000], The reduced Hamiltonian of general relativity and the σ-constant of conformal geometry. In Proceedings of the 2nd Samos Meeting on Cosmology, Geometry and Relativity, (S. Cotsakis and G. W. Gibbons, eds.), Springer-Verlag, New York.
Fischer, A., and V. Moncrief [2000], Hamiltonian reduction, the Einstein flow, and collapse of 3-manifolds, Nuclear Phys. B Proc. Suppl., 88, 83–102.
Fischer, A., and V. Moncrief [2002], Collapse of Bianchi models in the reduced Einstein flow (to appear).
Fischer, A., and A. Tromba [1984], On a purely "Riemannian" proof of the structure and dimension of the unramified moduli space of a compact Riemann surface, Mathematische Annalen, 267, 311–345.
Gromov, M. [1978], Almost flat manifolds, J. Diff. Geom., 13, 231–241.
Gromov, M. [1982], Volume and bounded cohomology, Publ. Math. IHES, 56, 5–99.
Gromov, M. [1999], Metric Structures for Riemannian and Non-Riemannian Spaces, Birkhäuser.
Gromov, M., and H. Lawson, Jr. [1983], Positive scalar curvature and the Dirac operator on complete Riemannian manifolds, Publ. Math. IHES, 58, 83–196.
Hamilton, R. [1995], Formation of singularities in the Ricci flow, Surveys in Differential Geometry, International Press, Boston, 2, 7–136.
Hamilton, R. [1999], Non-singular solutions of the Ricci flow on three-manifolds, Comm. Anal. Geom., 7, 695–729.
Lafontaine, J. [1983], Modules de structures conformes plates et cohomologie de groupes discrets, C. R. Acad. Sci. Paris Sér. I Math., 297, no. 13, 655–658.
O'Neill, Barett [1983], Semi-Riemannian Geometry with Applications to Relativity, Academic Press, New York.
Ringström, Hans [2000a], Curvature blow up in Bianchi VIII and IX vacuum spacetimes, Class. Quant. Grav., 17, 713–731.

Ringström, Hans [2000b], The future asymptotics of Bianchi VIII.
Schoen, R., and S.-T. Yau [1979], On the structure of manifolds with positive scalar curvature, Manuscripta Math., 28, 159–183.
Tanimoto, M., T. Koike, and A. Hosoya [1994], Compact homogeneous universes, J. Math. Phys., 35, 4855–4888.
Tanimoto, M., T. Koike, and A. Hosoya [1997a], Dynamics of compact homogeneous universes, J. Math. Phys., 38, 350–368.
Tanimoto, M., T. Koike, and A. Hosoya [1997b], Hamiltonian structures for compact homogeneous universes, J. Math. Phys., 38, 6560–6577.
Thurston, W. [1997], Three-dimensional Geometry and Topology, volume 1, (S. Levy, ed.), Princeton University Press, Princeton, New Jersey.
Wainwright, J., and G. F. R. Ellis [1997], Dynamical Systems in Cosmology, Cambridge University Press, Cambridge, England.
Wolf, J. [1972], Spaces of Constant Curvature, Second edition, Publish or Perish Press, Berkeley, California.

16 On Quantizing Semisimple Basic Algebras, I: sl(2, R)

Mark J. Gotay

To Jerry Marsden on the occasion of his 60th birthday

ABSTRACT We show that there is a consistent polynomial quantization of the coordinate ring of a basic nilpotent coadjoint orbit of a semisimple Lie group. We also show, at least in the case of a nilpotent orbit in sl(2, R)∗, that any such quantization is essentially trivial. Furthermore, we prove that there is no consistent polynomial quantization of the coordinate ring of a basic semisimple orbit in sl(2, R)∗.

Contents
1 Introduction
2 Semisimple Basic Algebras
3 Quantization
4 Discussion
References

1 Introduction

We continue our study of Groenewold-Van Hove obstructions to quantization. Let M be a symplectic manifold, and suppose that b is a finite-dimensional "basic algebra" of observables on M. Given a Lie subalgebra O of the Poisson algebra C∞(M) containing b, we are interested in determining whether O can be "quantized." (See §§2–3 and Gotay [2000] for the precise definitions.) Already we know that such obstructions exist in many circumstances: In Gotay and Grundling [1999] we proved that there are no finite-dimensional quantizations of (O, b) on a noncompact symplectic manifold, for any such Lie subalgebra O. Based on the work of Avez [1974] or Ginzburg and Montgomery [2000], it is straightforward to show that there are no quantizations of (C∞(M), b) for any compact symplectic manifold M and basic algebra b. Furthermore, in Gotay, Grabowski,

524

M. J. Gotay

and Grundling [2000] we proved that there are no quantizations of the pair (P (M ), b) on a compact symplectic manifold, where P (M ) is the Poisson algebra of polynomials on M generated by b. It remains to understand the case when M is noncompact and the quantizations are inﬁnite-dimensional, which is naturally the most interesting and diﬃcult one. Here one has little control over either the types of basic algebras that can appear (in examples they range from nilpotent to simple), their representations, or the structure of the polynomial algebras they generate. However, in this context it is known from Gotay and Grabowski [2001] that there is an obstruction to quantizing P (M ) when b is nilpotent, but that there is no universal obstruction when b is merely solvable. In this paper we consider the problem of quantizing (P (M ), b) in the other extreme case, viz. when the basic algebra is semisimple. To begin, we recall from Gotay [2000] that if a symplectic manifold M admits b as a basic algebra, then M must be a coadjoint orbit in b∗ . Unfortunately, it is diﬃcult to determine exactly which orbits M ⊂ b∗ are “basic,” i.e. admit b as a basic algebra (cf. §2). Nonetheless, we are able to give conditions which guarantee that various types of orbits will be basic (Proposition 2.1). In particular, principal nilpotent orbits in b∗ are basic. We then prove in §3 that there do exist polynomial quantizations of certain basic orbits, speciﬁcally the nilpotent ones: 1.1 Theorem. Let b be a ﬁnite-dimensional semisimple Lie algebra, and M a basic nilpotent coadjoint orbit in b∗ . Then there exists a polynomial quantization of (P (M ), b). The crucial structural feature underlying Theorem 1.1 is that nilpotent orbits M ⊂ b∗ are conical, so that the (polynomial) ideal I(M ) of M is homogeneous. This allows us to split the coordinate ring of M as a semidirect product P (M ) = (R ⊕ b) P(2) (M ), where P(2) (M ) is the ideal of polynomials all of whose terms are at least quadratic. 
The quantization constructed in the proof of Theorem 1.1 has the property that it is zero on $P_{(2)}(M)$, and so is “essentially trivial.” We then show that any polynomial quantization of a nilpotent orbit in sl(2, R)∗ must be essentially trivial (Proposition 3.3). Thus, while polynomial quantizations of basic nilpotent orbits do exist, this example indicates that they are likely to be uninteresting. If I(M) is not homogeneous, then one might expect that there is an obstruction to quantizing P(M), cf. Gotay [2000]. We show in §3 that this is indeed the case when b = sl(2, R). Thus polynomial quantizations are forced to be trivial for nilpotent orbits in sl(2, R)∗, and are genuinely obstructed for all other basic orbits.

2 Semisimple Basic Algebras

A key ingredient in the quantization process is the choice of a basic algebra of observables in the Poisson algebra C∞(M). This is a (real) Lie subalgebra b of C∞(M) such that:

(B1) b is finitely generated,
(B2) the Hamiltonian vector fields $X_b$, b ∈ b, are complete,
(B3) b is transitive and separating, and
(B4) b is a minimal Lie algebra satisfying these requirements.

A subset b ⊂ C∞(M) is “transitive” if {$X_b(m)$ | b ∈ b} spans $T_mM$ at every point. It is “separating” provided its elements globally separate points of M. Throughout this paper we assume that b is finite-dimensional and semisimple, and we routinely use the Killing form to identify b with b∗.

As previously noted, if the symplectic manifold M admits b as a basic algebra, then M must be a coadjoint orbit of the adjoint group B of b. It is of interest to determine those orbits M ⊂ b∗ which admit b as a basic algebra. Unfortunately, this is not a straightforward matter. For instance, let b = sl(2, R), so that the nonzero orbits are either open half-cones, hyperboloids of one sheet, or components of hyperboloids of two sheets. One can verify that the first two types of orbits are basic for sl(2, R), but that the third type is not. (Instead, the components of hyperboloids of two sheets are basic for subalgebras of triangular matrices.) Note that these orbits are all principal (i.e. have maximal dimension) in sl(2, R)∗. The instances in which M ⊂ b∗ is guaranteed to be basic are listed below.

2.1 Proposition. Let b be a finite-dimensional semisimple Lie algebra, and M ⊂ b∗ a nonzero coadjoint orbit. If either: (i) b is compact and M is principal, (ii) b is compact and simple, and M is arbitrary, or (iii) M is nilpotent and principal, then M admits b as a basic algebra.

Before giving the proof, we make some remarks and recall several important facts. As the sl(2, R) example shows, neither (i) nor (ii) remains valid when b is noncompact.
It also shows that (iii) fails if “nilpotent” is replaced by “semisimple.” It is easy to see that (iii) no longer holds if “principal” is deleted: Let O be a nilpotent half cone in sl(2, R). Then the nilpotent orbit O × {0} ⊂ sl(2, R) ⊕ sl(2, R) has sl(2, R) as a basic algebra, not sl(2, R) ⊕ sl(2, R). Similarly (ii) fails if “simple” is deleted. Finally, regarding (iii), observe that if there is a nonzero nilpotent orbit in b∗, then b is necessarily noncompact.

Given a (noncompact) semisimple Lie algebra b, recall that a “standard triple” is a trio {h, e+, e−} of elements of b satisfying the commutation relations [h, e±] = ±2e± and [e+, e−] = h. Thus {h, e+, e−} spans a subalgebra of b isomorphic to sl(2, R). The neutral element h is semisimple, while e± are nilpotent. Given a nilpotent element e ∈ b, the Jacobson–Morozov theorem (Thm. 9.2.1 in Collingwood and McGovern [1993]) asserts that there exists a standard triple {h, e+, e−} in b with nilpositive element e+ = e.

Proof of Proposition 2.1. Parts (i) and (ii) are proven in §4 of Gotay, Grabowski, and Grundling [2000], so here we consider only the remaining case (iii), the proof of which has been kindly supplied by R. Brylinski. Clearly conditions (B1)–(B3) are satisfied, so we need only check the minimality condition (B4). Suppose a ⊂ b is transitive on M, so that

$b = a + b_e$    (2.1)

for every e ∈ M, where $b_e$ denotes the centralizer of e. Fix a principal nilpotent e+ ∈ M. We first show that e+ is contained in a Borel subalgebra (“BSA”) of b. Let {h, e+, e−} be a standard triple in b with nilpositive element e+. From the representation theory of sl(2, R) we see that the eigenvalues of $\mathrm{ad}_h$ are integral; we may therefore decompose

$b = \bigoplus_{i \in \mathbb{Z}} b_i$    (2.2)

where $b_i$ is the eigenspace of $\mathrm{ad}_h$ corresponding to the eigenvalue i. Since e+ is principal, the neutral element h is generic, so its centralizer $\mathfrak{h} = b_0$ is a Cartan subalgebra (“CSA”) of b. Since furthermore $[b_i, b_j] \subset b_{i+j}$, $k = \mathfrak{h} \oplus n$ is a BSA, where $n = \bigoplus_{i>0} b_i$. Finally, as [h, e+] = 2e+ ∈ $b_2$, it follows that k is the desired BSA.

From the proof of Thm. 5 in Kostant [1963] we know that $b_{e_+} \subset n$, which together with (2.1) implies that b = a + m for every B-conjugate m of n. We will prove this forces a = b. Since b = a + n, we may write h = h′ + n where h′ ∈ a and n ∈ n. So h′ = h − n lies in a and is generic (since h and h′ have the same characteristic polynomial). Thus the centralizer $\mathfrak{h}'$ of h′ is also a CSA of b. A calculation based on the decomposition (2.2) shows that $\mathfrak{h}' \subset k$. This gives rise to the Levi decomposition $k = \mathfrak{h}' \oplus n$.


We next claim that a contains $\mathfrak{h}'$. Indeed, using b = a + n again, we see that each element x ∈ $\mathfrak{h}'$ gives rise to an element x′ = x − $n_x$ of a, where $n_x$ ∈ n. Since a is stable under $\mathrm{ad}_{h'}$, it follows that both x and $n_x$ lie in a. (The reason is that $\mathfrak{h}'_{\mathbb{C}}$ is the zero eigenspace of $\mathrm{ad}_{h'}$ in $b_{\mathbb{C}}$ and $n_{\mathbb{C}}$ is the sum of nonzero eigenspaces. So both x and $n_x$ lie in $a_{\mathbb{C}}$. As both x and $n_x$ are real they must belong to a.) In particular a contains $\mathfrak{h}'$.

We can now finish the proof. We have the triangular decomposition $b = m \oplus \mathfrak{h}' \oplus n$ where m is the unique $\mathrm{ad}_{h'}$-stable complement to k in b. By a result of Borel and Tits [1965], the two Borel subalgebras $\mathfrak{h}' \oplus n$ and $m \oplus \mathfrak{h}'$ are B-conjugate, whence their nilradicals n and m are as well. Since a contains $\mathfrak{h}'$, $a_{\mathbb{C}}$ is the direct sum of $\mathfrak{h}'_{\mathbb{C}}$ and some of its root spaces. Using b = a + n, we see that $a_{\mathbb{C}}$ contains $m_{\mathbb{C}}$. Similarly, using b = a + m, we see that $a_{\mathbb{C}}$ contains $n_{\mathbb{C}}$. Thus $a_{\mathbb{C}} = b_{\mathbb{C}}$ and so a = b.

Let b be a Lie algebra and M a coadjoint orbit in b∗. Consider the symmetric algebra S(b), regarded as the ring of polynomials on b∗. The Lie bracket on b may be extended via the Leibniz rule to a Poisson bracket on S(b), so that the latter becomes a Poisson algebra. Let I(M) be the associative ideal in S(b) consisting of all polynomials which vanish on M and set P(M) = S(b)/I(M). Since M is an orbit, I(M) is also a Lie ideal, hence a Poisson ideal, so the coordinate ring P(M) of M inherits the structure of a Poisson algebra from S(b). We denote the Poisson bracket on P(M) by {·, ·}.

Let $P^k(M)$ denote the subspace of polynomials of degree at most k. (When I(M) ≠ {0}, P(M) is not freely generated as an associative algebra by the elements of b. Consequently, the notion of “homogeneous polynomial” is not necessarily well-defined, but that of “degree” is.) In the cases when it does make sense, we let $P_l(M)$ denote the subspace of homogeneous polynomials of degree l, so that $P^k(M) = \bigoplus_{l=0}^{k} P_l(M)$. We then also introduce $P_{(k)}(M) = \bigoplus_{l \ge k} P_l(M)$.
Notice that when b is semisimple, $P_1(M)$ = b and $P^1(M)$ = R ⊕ b.
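As a small worked illustration of the remark on homogeneity, consider b = sl(2, R) with linear coordinates h, e+, e− on b∗ coming from a standard triple. The symbol κ and the two descriptions of I(M) below are assumptions made for this sketch (they are the usual descriptions of these quadrics, but the text does not state them in this form).

```latex
% Lie-Poisson brackets of the coordinate functions (from the standard-triple relations):
%   \{h, e_\pm\} = \pm 2 e_\pm, \qquad \{e_+, e_-\} = h .
% \kappa is a name introduced only for this illustration.
\kappa := h^2 + 4\, e_+ e_- , \qquad \{\kappa, h\} = \{\kappa, e_+\} = \{\kappa, e_-\} = 0 .

% Nilpotent orbit (\kappa = 0 on M), assuming I(M) = (\kappa): the ideal is homogeneous, so
P(M) = S(b)/(\kappa) = \bigoplus_{l \ge 0} P_l(M), \qquad
\dim P_l(M) = \binom{l+2}{2} - \binom{l}{2} = 2l + 1 .

% Semisimple orbit (\kappa = c \ne 0 on M), assuming I(M) = (\kappa - c): in P(M) one has
h^2 = c - 4\, e_+ e_- ,
% so the same element is "quadratic" on the left and "constant plus quadratic" on the right.
% Only the filtration P^k(M) by degree survives; the pieces P_l(M) are not defined.
```

In the nilpotent case the count 2l + 1 is what one expects if $P_l(M)$ is a single irreducible sl(2)-module, which is consistent with the cyclic-vector argument used later for Proposition 3.3.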

3 Quantization

Fix a basic algebra b on M, and let O be any Lie subalgebra of C∞(M) containing 1 and b. By a quantization of (O, b) we mean a linear map Q from O to the linear space Op(D) of symmetric operators which preserve a fixed dense domain D in some separable Hilbert space ℋ, such that for all f, g ∈ O,

(Q1) Q({f, g}) = i[Q(f), Q(g)],
(Q2) Q(1) = I,
(Q3) if the Hamiltonian vector field $X_f$ of f is complete, then Q(f) is essentially self-adjoint on D,
(Q4) Q represents b irreducibly,
(Q5) D contains a dense set of separately analytic vectors for some set of Lie generators of Q(b), and
(Q6) Q represents b faithfully.

We refer the reader to Gotay [2000] for an extensive discussion of these definitions. We take Planck’s reduced constant to be 1. Here we are interested in the case when O = P(M). Let $\mathcal{A}$ be the associative algebra over C generated by I along with {Q(b) | b ∈ b}, and let $\mathcal{A}_k$ denote the subspace of polynomials of degree at most k in the Q(b). We say that a quantization Q of P(M) is polynomial if it is valued in $\mathcal{A}$. That “polynomials quantize to polynomials” can be regarded as a generalized “Von Neumann rule,” cf. Gotay [2000].

Proof of Theorem 1.1. Let M be a basic nilpotent orbit. Since each nilpotent orbit is conical (Brylinski [1998]), it follows that we may choose a set of generators for I(M) which are homogeneous. As a consequence, the gradation of S(b) by degree passes to the quotient P(M). Thus the notion of homogeneous polynomial does make sense in P(M). Furthermore, by virtue of the commutation relations of b, for each l ≥ 0 the subspaces $P_l(M)$ are ad-invariant: $\{P_1(M), P_l(M)\} \subset P_l(M)$. In view of this, $\{P_k(M), P_l(M)\} \subset P_{k+l-1}(M)$, whence each $P_{(l)}(M)$ is a Lie ideal. We thus have the semidirect sum decomposition

$P(M) = P^1(M) \ltimes P_{(2)}(M)$.    (3.1)

Because of (3.1), we can obtain a polynomial quantization Q of all of P simply by finding an appropriate representation of $P^1(M)$ = R ⊕ b and setting $Q(P_{(2)}(M))$ = {0}! To this end, let B̃ be the connected, simply connected Lie group with Lie algebra b, and let Π be a faithful irreducible unitary representation of B̃ on a Hilbert space ℋ. (For instance, we may take Π to be a generic irreducible component of the left regular representation of B̃ on L²(B̃), cf. §5.6 in Barut and Rączka [1986].) Let D ⊂ ℋ be the dense set of analytic vectors for Π, and define π = −i dΠ ↾ D, cf. §11.4 ibid. Extend π to $P^1(M)$ by setting π(1) = I. Now take Q = π ⊕ 0 (recall (3.1)); then it is straightforward to verify that Q satisfies (Q1)–(Q6) and so is the required quantization of (P(M), b).

Note that the quantization constructed above is infinite-dimensional. Indeed, there can be no finite-dimensional quantizations of a noncompact basic algebra (Gotay and Grundling [1999]); this is a reflection of the fact that semisimple Lie groups of noncompact type have no faithful finite-dimensional unitary representations. Furthermore, since $Q(P_{(2)}(M))$ = {0}, this quantization is essentially trivial. When b = sl(2, R) it turns out that any polynomial quantization is essentially trivial, as we show after some preliminaries.

Henceforth take b = sl(2, R) and let M be an arbitrary coadjoint orbit. It is convenient to complexify. Define

$h = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$ and $e_\pm = \frac{1}{2}\left\{ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \pm \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} \right\}$.

Then {h, e+, e−} is a standard triple in $b_{\mathbb{C}}$ = sl(2, C). Note that $h^2 + 4e_+e_-$ is the Casimir element for $b_{\mathbb{C}}$; consequently $h^2 + 4e_+e_- = c$ is constant on M.

Suppose Q were a polynomial quantization of (P(M), b) on a dense invariant domain D in an infinite-dimensional Hilbert space ℋ. By requiring Q to be complex linear, we can regard it as a “quantization” of $(P(M)_{\mathbb{C}}, b_{\mathbb{C}})$. From now on, we abbreviate $P(M)_{\mathbb{C}}$ = P, etc. We set H = Q(h) and E± = Q(e±), and let (·, ·) denote the anti-commutator. Finally, observe that $H^2 + 4(E_+, E_-)$ is the Casimir element for the representation Q of $b_{\mathbb{C}}$; since by axiom (Q4) this representation is irreducible,

$H^2 + 4(E_+, E_-) = CI$    (3.2)

for some fixed constant C (cf. Prop. 3 in Gotay and Grabowski [2001]). We first establish the following technical result.

3.1 Lemma. For any nonnegative integer r, the set of operators

$S_r = \{H^j E_+^l,\ H^k E_-^m \mid j + l \le r,\ k + m \le r\}$

forms a basis for $\mathcal{A}_r$.

Proof. We proceed by induction on r. The statement is obviously true for $S_0$ = {I}. Now assume $S_{r-1}$ is a basis for $\mathcal{A}_{r-1}$. Any element of $\mathcal{A}_r$ can be written

$\sum_{k+l+m=r} \alpha^r_{klm} H^k E_+^l E_-^m$ + lower degree terms.

Now observe that

$E_+ E_- = (E_+, E_-) - \frac{i}{2} H$.


Applying (3.2) we may use this relation to eliminate all factors of $E_+E_-$ in the leading terms of the expression above, thereby obtaining

$\alpha_r H^r + \sum_{j+l=r,\ l \ge 1} \beta^+_{jl} H^j E_+^l + \sum_{k+m=r,\ m \ge 1} \beta^-_{km} H^k E_-^m$ + lower degree terms    (3.3)

for some coefficients $\alpha_r$, $\beta^+_{jl}$, $\beta^-_{km}$. Together with the induction hypothesis, this shows that $S_r$ spans $\mathcal{A}_r$.

Now suppose there exist coefficients $\alpha_r$, $\beta^+_{jl}$, $\beta^-_{km}$, not all zero, such that the expression (3.3) vanishes. We claim that without loss of generality we may assume $\alpha_r \ne 0$. For suppose $\beta^+_{JL}$ were the first nonzero coefficient in this expression. By taking the commutator of the equation (3.3) = 0 with $E_-$ L-times, applying the commutation relations, and simplifying using (3.2), we obtain a condition of the form (3.3) = 0 where now the coefficient of $H^r$ is nonzero. Similarly, if $\beta^-_{KM}$ were the first nonzero coefficient in (3.3), then taking the commutator with $E_+$ M-times would lead to the same end.

Now repeatedly take the commutator of the equation (3.3) = 0 with H. This yields further independent conditions of the form (3.3) = 0 but with no terms involving $H^r$. By Gaussian elimination, we may then remove all terms on the left hand side of (3.3) = 0 of the types $\beta^+_{jl} H^j E_+^l$ and $\beta^-_{km} H^k E_-^m$ with j, k < r. Thus we end up with

$\alpha_r H^r + A_{r-1} = 0$    (3.4)

where $\alpha_r \ne 0$ and $A_{r-1} \in \mathcal{A}_{r-1}$. Taking the commutator of (3.4) with H yields $[A_{r-1}, H] = 0$. Applying the induction hypothesis, it follows that $A_{r-1}$ can only depend upon H. Thus (3.4) reduces to

$\sum_{k=0}^{r} \alpha_k H^k = 0$.

Factor this equation over C:

$\alpha_r (H - \lambda_r) \cdots (H - \lambda_1) = 0$.    (3.5)

As $\alpha_r \ne 0$, (3.5) implies that the range of $T_{r-1} = (H - \lambda_{r-1}) \cdots (H - \lambda_1)$ is contained in the $\lambda_r$-eigenspace of H. By the induction hypothesis $T_{r-1} \ne 0$, so there exists ψ ∈ D such that $\psi_{r-1} = T_{r-1}\psi$ is a (nonzero) eigenvector of H. In view of the irreducibility assumption (Q4), we conclude from sl(2, R) theory (cf. Lang [1975]) that the set $\{E_+^l \psi_{r-1},\ E_-^m \psi_{r-1} \mid l, m \in \mathbb{N}\}$ contains an infinite number of eigenvectors of H, corresponding to distinct eigenvalues λ. Each such λ must satisfy $\sum_{k=0}^{r} \alpha_k \lambda^k = 0$, which is impossible. Thus $\alpha_r = 0$ and so $S_r$ is a linearly independent set.

We now determine what $Q(h^2)$ must be.

3.2 Lemma. $Q(h^2) = \alpha H^2 + \gamma I$, where α, γ ∈ C.

Proof. By assumption $Q(h^2)$ must be a polynomial of degree r, say, in H, E+, E−, which by Lemma 3.1 we may write in the form (3.3). Since H commutes with $Q(h^2)$, from Lemma 3.1 we see that $Q(h^2)$ can only depend on H:

$Q(h^2) = \sum_{k=0}^{r} \alpha_k H^k$.    (3.6)

Using (Q1) and (Q2) to quantize the classical identity

$3h^2 - \frac{1}{2}\{\{h^2, e_-\}, e_+\} = c$

we obtain

$3Q(h^2) + \frac{1}{2}[[Q(h^2), E_-], E_+] = cI$.    (3.7)

Substituting (3.6) into (3.7) and simplifying yields

$\left(3 - \frac{1}{2} r(r+1)\right) \alpha_r H^r$ + lower degree terms = cI.

From Lemma 3.1 it follows that $Q(h^2)$ is at most quadratic in H. Taking (3.6) with r = 2, again substituting into (3.7) and simplifying, we obtain the advertised expression for $Q(h^2)$, where α = $\alpha_2$ is arbitrary and γ satisfies (with s the parameter labeling the representation, cf. (3.11))

$3\gamma = \alpha(s^2 - 1) + c$.    (3.8)
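The step "substituting (3.6) into (3.7) and simplifying" can be spelled out as follows. This is a sketch under two assumptions that are not stated explicitly in the text: that (A, B) stands for the normalized anti-commutator (AB + BA)/2, which is what makes the relation $E_+E_- = (E_+, E_-) - \frac{i}{2}H$ hold, and that f is a polynomial in H; the auxiliary symbols f and g are introduced only here.

```latex
% From (Q1):  [H, E_\pm] = -i\,Q(\{h, e_\pm\}) = \mp 2i\,E_\pm , \qquad [E_+, E_-] = -i\,Q(\{e_+, e_-\}) = -iH .
\begin{align*}
f(H)\,E_\pm &= E_\pm\, f(H \mp 2i), \\
[[f(H), E_-], E_+] &= E_-E_+\, g(H - 2i) - E_+E_-\, g(H), \qquad g(H) := f(H + 2i) - f(H), \\
E_\pm E_\mp &= \tfrac{1}{4}\,(C - H^2) \mp \tfrac{i}{2}\,H \qquad \text{(by (3.2))}.
\end{align*}
% Leading term for f(H) = H^r:
%   g(H) = 2ir\,H^{r-1} + \dots, \qquad g(H - 2i) - g(H) = 4r(r-1)\,H^{r-2} + \dots
\begin{align*}
[[H^r, E_-], E_+]
  &= iH\,g(H) + E_-E_+\,\bigl(g(H - 2i) - g(H)\bigr) \\
  &= -2r\,H^r - r(r-1)\,H^r + \dots = -r(r+1)\,H^r + \dots
\end{align*}
% so the H^r coefficient of (3.7) is (3 - r(r+1)/2)\,\alpha_r, which vanishes only for r = 2.
% Exact computation for r = 2:  g(H) = 4iH - 4, \quad g(H - 2i) - g(H) = 8, hence
\begin{align*}
[[H^2, E_-], E_+] &= 2C - 6H^2, \\
3(\alpha H^2 + \gamma I) + \tfrac{1}{2}\,\alpha\,(2C - 6H^2) &= (3\gamma + \alpha C)\,I = cI .
\end{align*}
```

Comparing the last line with (3.8) suggests the identification C = 1 − s², which is also what one finds by evaluating the Casimir operator on the vectors $\psi_n$ of (3.11).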

Using (Q1) to quantize the identity

$he_\pm = \pm\frac{1}{4}\{h^2, e_\pm\}$,

applying Lemma 3.2, and simplifying, we obtain $Q(he_\pm) = \alpha(H, E_\pm)$. In turn, using this to quantize the identities

$e_\pm^2 = \pm\frac{1}{2}\{he_\pm, e_\pm\}$,

we find that $Q(e_\pm^2) = \alpha E_\pm^2$. Similarly, upon quantizing

$e_+e_- = \frac{1}{2}\left(h^2 - \{he_+, e_-\}\right)$

and using the formulæ above, we get

$Q(e_+e_-) = \alpha(E_+, E_-) + \frac{\gamma}{2} I$.

Next use these formulæ to quantize the classical identities

$2\{e_+^2, e_-^2\} + \{he_+, he_-\} = ch$

and

$\{(e_+ - e_-)^2, \{e_+^2 - e_-^2,\ h(e_+ + e_-)\}\} + \frac{3}{4}\{(e_+ + e_-)^2, \{(e_+ + e_-)^2,\ h(e_+ - e_-)\}\} = 8ch(e_+ - e_-)$.

After tedious calculations and simplifications, we end up with

$\alpha^2 (C + 3)\, H = cH$    (3.9)

and

$\alpha^3 (C + 9)\,(H, E_+ - E_-) = \alpha c\,(H, E_+ - E_-)$,    (3.10)

respectively. With these formulæ in hand, we are now ready to prove

3.3 Proposition. Let M be a nilpotent orbit in sl(2, R)∗. Then for any polynomial quantization Q of (P(M), sl(2, R)), $Q(P_{(2)}(M))$ = {0}.

Proof. We first claim that $Q(P_2)$ = {0}. To see this, observe that since M is nilpotent, the constant c = 0. Since by (Q6) H ≠ 0, (3.9) implies that either α = 0 or C = −3 in the given representation. But if α = 0, then from (3.8) we conclude that $Q(h^2)$ = 0 which, as we show below, leads to the desired conclusion. In the event that C = −3, we turn to (3.10). Since $(H, E_+ - E_-) \ne 0$ by Lemma 3.1, we must again have α = 0. Thus in any eventuality $Q(h^2)$ = 0 and it follows from (Q1) that $Q(P_2)$ = {0}, since $h^2$ is a cyclic vector for the adjoint action of sl(2, C) on $P_2$ (i.e., every element of $P_2$ can be written as a sum of repeated brackets of elements of sl(2, C) with $h^2$, as the calculations above show).

Finally, it is straightforward to check that $h^l$ is a cyclic vector for the adjoint representation of sl(2, C) on $P_l$. Since for l ≥ 2

$h^l = \frac{1}{2l + 2}\{\{h^2, h^{l-2}e_+\}, e_-\}$

(recall that c = 0), $Q(h^2)$ = 0 together with (Q1) imply that $Q(h^l)$ = 0 for l > 2. Thus $Q(P_{(2)})$ = {0}.
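The classical identities that are quantized between Lemma 3.2 and Proposition 3.3 can be checked mechanically in S(b). The following is a minimal sketch, not part of the original argument: polynomials in (h, e+, e−) are stored as dictionaries of exact rational coefficients, and the helper names (`poly`, `pb`, `K`, and so on) are hypothetical. The identities are verified in S(b) itself, with the Casimir polynomial K = h² + 4e+e− kept symbolic; on the orbit M it takes the constant value c.

```python
from fractions import Fraction as F

# A polynomial is {(a, b, c): coeff}, meaning coeff * h^a * e_plus^b * e_minus^c.
def poly(*terms):
    out = {}
    for coeff, key in terms:
        out[key] = out.get(key, 0) + F(coeff)
    return {k: v for k, v in out.items() if v != 0}

def add(p, q, s=1):                      # p + s*q
    return poly(*[(v, k) for k, v in p.items()], *[(s * v, k) for k, v in q.items()])

def mul(p, q):
    return poly(*[(v * w, tuple(i + j for i, j in zip(k, l)))
                  for k, v in p.items() for l, w in q.items()])

def scal(c, p):
    return poly(*[(F(c) * v, k) for k, v in p.items()])

def d(p, i):                             # partial derivative in the i-th generator
    return poly(*[(v * k[i], k[:i] + (k[i] - 1,) + k[i + 1:])
                  for k, v in p.items() if k[i] > 0])

def power(p, n):
    out = poly((1, (0, 0, 0)))
    for _ in range(n):
        out = mul(out, p)
    return out

h, ep, em = poly((1, (1, 0, 0))), poly((1, (0, 1, 0))), poly((1, (0, 0, 1)))
ZERO = {}
# Brackets of generators: {h, e+} = 2e+, {h, e-} = -2e-, {e+, e-} = h.
B = [[ZERO, scal(2, ep), scal(-2, em)],
     [scal(-2, ep), ZERO, h],
     [scal(2, em), scal(-1, h), ZERO]]

def pb(f, g):                            # Poisson bracket extended by the Leibniz rule
    out = {}
    for i in range(3):
        for j in range(3):
            out = add(out, mul(mul(d(f, i), d(g, j)), B[i][j]))
    return out

K = add(mul(h, h), scal(4, mul(ep, em)))     # Casimir polynomial (equals c on M)
x, y = add(ep, em), add(ep, em, -1)          # e+ + e-  and  e+ - e-  (so x*y = e+^2 - e-^2)

checks = {
    "h e+":    scal(F(1, 4), pb(mul(h, h), ep)) == mul(h, ep),
    "h e-":    scal(F(-1, 4), pb(mul(h, h), em)) == mul(h, em),
    "e+^2":    scal(F(1, 2), pb(mul(h, ep), ep)) == mul(ep, ep),
    "e-^2":    scal(F(-1, 2), pb(mul(h, em), em)) == mul(em, em),
    "e+ e-":   scal(F(1, 2), add(mul(h, h), pb(mul(h, ep), em), -1)) == mul(ep, em),
    "for 3.7": add(scal(3, mul(h, h)), scal(F(1, 2), pb(pb(mul(h, h), em), ep)), -1) == K,
    "cubic":   add(scal(2, pb(mul(ep, ep), mul(em, em))), pb(mul(h, ep), mul(h, em))) == mul(K, h),
    "quartic": add(pb(mul(y, y), pb(mul(x, y), mul(h, x))),
                   scal(F(3, 4), pb(mul(x, x), pb(mul(x, x), mul(h, y))))) == scal(8, mul(K, mul(h, y))),
}
# {{h^2, h^(l-2) e+}, e-} = (2l+2) h^l - 2(l-1) h^(l-2) K, which reduces to (2l+2) h^l when K = 0.
for l in range(2, 6):
    lhs = pb(pb(mul(h, h), mul(power(h, l - 2), ep)), em)
    rhs = add(scal(2 * l + 2, power(h, l)), scal(2 * (l - 1), mul(power(h, l - 2), K)), -1)
    checks[f"h^{l}"] = lhs == rhs

print(checks)
```

Working in S(b) rather than in the quotient P(M) keeps the sketch short; the price is that the right-hand sides carry K where the text writes c.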


When M ⊂ sl(2, R)∗ is not nilpotent (in which case it must be semisimple), it turns out that it is not even possible to polynomially quantize (P(M), b); rather than finding that $Q(P_{(2)}(M))$ = {0}, we get an outright inconsistency.

3.4 Proposition. If M is a basic semisimple orbit in sl(2, R)∗, then there is no polynomial quantization of (P(M), b).

Proof. We mimic the proof of Proposition 3.3; the only difference is that c is now nonzero. As before, H ≠ 0, so by (3.9) $\alpha^2(C + 3) = c$. In particular, since c ≠ 0, α ≠ 0. Since $(H, E_+ - E_-) \ne 0$, (3.10) then gives $\alpha^2(C + 9) = c$. Subtracting the two relations yields $6\alpha^2 = 0$, which is the required contradiction.

Proposition 3.4 is the noncompact analogue of the results obtained in Gotay, Grundling, and Hurst [1996] for b = su(2), in which context every orbit is semisimple. In fact, the only significant difference between the analyses of semisimple orbits in the sl(2, R) and su(2) cases is that the representations for the former are infinite-dimensional, while those for the latter are finite-dimensional. Since moreover the complexifications of these Lie algebras are the same (viz. sl(2, C)), the arguments leading from Lemma 3.2 to Proposition 3.4 don’t distinguish between sl(2, R) and su(2). The same is true of the results in §2 ibid., which we may therefore immediately carry over to the present context, yielding:

3.5 Proposition. Let M be a basic semisimple orbit in sl(2, R)∗. Then $P^1(M)$ = R ⊕ sl(2, R) is the largest Lie subalgebra of the coordinate ring P(M) that can be consistently polynomially quantized.

Thus the obstruction to quantizing polynomial algebras on semisimple orbits in sl(2, R)∗ is very severe: the best one can do is quantize the Lie subalgebra of affine polynomials!

We end this section with a discussion of the assumption that Q be polynomial. In general, when the basic algebra b is compact (or, equivalently, when the coadjoint orbit M is compact) every quantization of (P(M), b) is polynomial. For then the Hilbert space ℋ must be finite-dimensional, and the claim follows from a well known property of enveloping algebras, cf. Prop. 2.6.5 in Dixmier [1977]. Furthermore, when b is nilpotent, it was proven that Q must be polynomial in Gotay and Grabowski [2001]. These results are direct consequences of the irreducibility condition (Q4). However, the analogous statement does not seem to hold for noncompact semisimple basic algebras.


To see this, we provide an alternate version of Lemma 3.2, which does not assume that Q is polynomial ab initio. For what follows, we need to be more specific about the domain D. As a consequence of (Q5), Q ↾ b integrates to a unique unitary representation Π of B̃ on ℋ (Cor. 1 of Flato and Simon [1973]). Naturally associated with Π is the derived representation of b on the domain $C^\omega(\Pi)$ consisting of analytic vectors of Π. We shall henceforth assume that D ⊃ $C^\omega(\Pi)$. Furthermore, for the sake of simplicity, we suppose that the representation Π drops to SL(2, R) from its universal cover B̃. Then from sl(2, R) theory (cf. Lang [1975]) we know that (i) the spectrum ∆ of H consists of certain imaginary integers, (ii) in view of (Q4), for each −in ∈ ∆ the corresponding eigenspace $\mathcal{H}_n$ is 1-dimensional, and (iii) each eigenvector of H is an analytic vector, so that $\mathcal{H}_n$ ⊂ D. Furthermore, the quantizations of b are labeled by certain complex numbers s, and for each −in ∈ ∆, there is a vector $\psi_n \in \mathcal{H}_n$ such that

$H\psi_n = -in\,\psi_n$ and $E_\pm\psi_n = -\frac{i}{2}(s + 1 \pm n)\,\psi_{n \pm 2}$.    (3.11)
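The relations (3.11) can be sanity-checked against the commutation relations implied by (Q1), namely [H, E±] = ∓2i E± and [E+, E−] = −iH, and against the Casimir relation (3.2). The sketch below is illustrative only: vectors are finite combinations of the $\psi_n$ stored as dictionaries, the function names are hypothetical, and (A, B) is assumed to mean (AB + BA)/2. With that normalization the Casimir constant comes out as C = 1 − s².

```python
def apply_H(vec):
    """H psi_n = -i n psi_n, applied to a finite combination {n: coeff}."""
    return {n: -1j * n * c for n, c in vec.items()}

def apply_E(sign, s, vec):
    """E_pm psi_n = -(i/2)(s + 1 +/- n) psi_{n +/- 2}; sign is +1 or -1."""
    return {n + 2 * sign: -0.5j * (s + 1 + sign * n) * c for n, c in vec.items()}

def lin(a, u, b, v):
    """Return the combination a*u + b*v."""
    out = {}
    for coeff, w in ((a, u), (b, v)):
        for n, c in w.items():
            out[n] = out.get(n, 0) + coeff * c
    return out

def is_zero(vec, tol=1e-9):
    return all(abs(c) < tol for c in vec.values())

def check(s, n):
    """Test the sl(2) relations and the Casimir value on the single vector psi_n."""
    psi = {n: 1.0}
    Ep = lambda v: apply_E(+1, s, v)
    Em = lambda v: apply_E(-1, s, v)
    results = []
    # [H, E_pm] + (pm 2i) E_pm should vanish.
    for sign, E in ((+1, Ep), (-1, Em)):
        comm = lin(1, apply_H(E(psi)), -1, E(apply_H(psi)))
        results.append(is_zero(lin(1, comm, 2j * sign, E(psi))))
    # [E_+, E_-] + i H should vanish.
    comm = lin(1, Ep(Em(psi)), -1, Em(Ep(psi)))
    results.append(is_zero(lin(1, comm, 1j, apply_H(psi))))
    # H^2 + 4 * (E_+E_- + E_-E_+)/2 should equal (1 - s^2) times the identity.
    casimir = lin(1, apply_H(apply_H(psi)), 2, lin(1, Ep(Em(psi)), 1, Em(Ep(psi))))
    results.append(is_zero(lin(1, casimir, -(1 - s * s), psi)))
    return all(results)

sample_s = (0.5, 3, 2j, 1.5 + 0.25j)      # arbitrary test values of the label s
print(all(check(s, n) for s in sample_s for n in range(-6, 7)))
```

The check is purely algebraic: it does not decide which pairs (s, n) actually occur in a unitary irreducible representation, only that the formulas (3.11) are compatible with the bracket relations for any such pair.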

By (Q1), H and $Q(h^2)$ commute. From observations (ii) and (iii) above, and the fact that $\bigoplus_{n \in i\Delta} \mathcal{H}_n$ is dense in ℋ, it follows that

$Q(h^2) = \xi(H)$    (3.12)

for some Borel function ξ on the spectrum of H. We now compute ξ. Apply the relation (3.7) to $\psi_n$; from (3.11) and (3.12) we get the recursion relation

$3\xi_n - \frac{1}{8}\Big[(s + (1 + n))(s - (1 + n))(\xi_n - \xi_{n+2}) - (s + (1 - n))(s - (1 - n))(\xi_{n-2} - \xi_n)\Big] = c$,    (3.13)

where $\xi_n$ is defined via $\xi(H)\psi_n = \xi_n\psi_n$. It is straightforward to check that any polynomial solution of this recursion relation is of the form $\xi_n = \gamma - \alpha n^2$ from which, in view of (3.12) and (3.11), we recover the formula derived previously for $Q(h^2)$. But there are other solutions of (3.13) which are transcendental: for instance, consider the discrete series representation with s ≥ 1 an even integer. Then ∆ = −i{s + 1, s + 3, . . .}, and with some effort one can show that the general solution of (3.13) is

$\xi_n = \gamma - \alpha n^2 + \beta\left\{(s^2 - 3n^2 - 1)\left[\Psi\!\left(\frac{1 + n - s}{2}\right) - \Psi\!\left(\frac{1 + n + s}{2}\right)\right] - 6ns\right\}$,

where α, β are arbitrary, γ is given by (3.8), and the digamma function Ψ is the logarithmic derivative of the gamma function. Similar formulæ hold for other allowable values of s.
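For the discrete series with s even, both digamma arguments are positive integers, so their difference is a finite rational sum and the recursion (3.13) can be tested in exact arithmetic. The following sketch does this at the interior points n = s + 3, s + 5, ..., where both neighbours n ± 2 lie in the spectrum; the lowest point n = s + 1, where the $\xi_{n-2}$ term of (3.13) has a vanishing coefficient, is not examined here. The function names and the sample values of s, c, α, β are arbitrary choices made for illustration.

```python
from fractions import Fraction as F

def digamma_diff(m, s):
    """Psi(m) - Psi(m + s) for positive integers m, s, via Psi(z + 1) = Psi(z) + 1/z."""
    return -sum(F(1, m + j) for j in range(s))

def xi(n, s, alpha, beta, gamma):
    """Candidate solution xi_n, with polynomial part gamma - alpha n^2 and a digamma part."""
    m = (1 + n - s) // 2          # equals (1 + n - s)/2, a positive integer for n = s+1, s+3, ...
    bracket = (s * s - 3 * n * n - 1) * digamma_diff(m, s) - 6 * n * s
    return gamma - alpha * n * n + beta * bracket

def recursion_lhs(n, s, alpha, beta, gamma):
    """Left-hand side of (3.13) at an interior point n >= s + 3."""
    x_prev, x_here, x_next = (xi(k, s, alpha, beta, gamma) for k in (n - 2, n, n + 2))
    up = (s + (1 + n)) * (s - (1 + n))
    down = (s + (1 - n)) * (s - (1 - n))
    return 3 * x_here - F(1, 8) * (up * (x_here - x_next) - down * (x_prev - x_here))

s, c = 4, F(7, 3)                              # sample discrete-series label and orbit constant
alpha, beta = F(2, 5), F(-3, 11)               # arbitrary coefficients
gamma = (alpha * (s * s - 1) + c) / 3          # relation (3.8)
residuals = [recursion_lhs(n, s, alpha, beta, gamma) - c for n in range(s + 3, s + 41, 2)]
print(all(r == 0 for r in residuals))
```

Setting `beta = 0` in the same code checks the polynomial solution $\xi_n = \gamma - \alpha n^2$ on its own.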


Thus in the case of sl(2, R) irreducibility enables one to determine $Q(h^2)$ and then, following the template set forth after the proof of Lemma 3.2, all of $Q(P^2(M))$, and so on. But unlike for su(2), irreducibility alone apparently does not suffice to guarantee that Q is polynomial. While Proposition 3.4 shows that polynomial quantizations of (P(M), sl(2, R)) for semisimple M cannot exist, it is unclear whether such transcendental quantizations are similarly obstructed.

4 Discussion

The quantization of (P(M), b) for M ⊂ b∗ nilpotent given above is not the first known example of a consistent quantization: In Gotay [1995] a full quantization of (C∞(T²), t) was exhibited, where t is the basic algebra of trigonometric polynomials of mean zero; and in Gotay and Grabowski [2001] a polynomial quantization of P(T∗R+), with the basic algebra being the affine algebra a(1), was constructed. This last example “works” for exactly the same reason the nilpotent one does, viz. the ideal I(M) is homogeneous. However, in contrast to the case of sl(2, R) (cf. Proposition 3.3), a polynomial quantization of P(T∗R+) with basic algebra a(1) need not be zero on $P_{(2)}$.

In fact, a moment’s reflection shows that there will exist a polynomial quantization of (P(M), b) for any basic algebra b whenever I(M) is homogeneous, for then one has the crucial splitting (3.1). But this construction will fail whenever I(M) is inhomogeneous so that $P_{(2)}(M)$ is not well-defined. It is tempting to conjecture that an obstruction to quantization exists whenever I(M) is inhomogeneous; this is borne out explicitly here in the case of semisimple orbits in sl(2, R)∗ by Proposition 3.4. This correlation is also known to hold in all other examples that have been investigated thus far (Gotay [2000]).

The next step is to extend Propositions 3.3 and 3.4 to higher rank semisimple basic algebras. Clearly, this necessitates using more Poisson theoretic techniques, as opposed to the computational approach taken here. These issues are addressed in Gotay [2001].

Acknowledgments: I thank A. El Gradechi, J. Grabowski, B. Kaneshige, and R. Sjamaar for many helpful discussions. I am especially indebted to R. Brylinski for her input; in particular for providing a proof of Proposition 2.1(iii). This work was supported in part by NSF grants DMS 96-23083 and 00-72434.


References

Avez, A. [1974] Représentation de l’algèbre de Lie des symplectomorphismes par des opérateurs bornés. C.R. Acad. Sc. Paris Sér. A 279 785–787.
Barut, A. O. and Rączka, R. [1986] Theory of Group Representations and Applications. Second Edition (World Scientific, Singapore).
Borel, A. and Tits, J. [1965] Groupes réductifs. Publ. Math. I.H.E.S. 27 55–151.
Brylinski, R. [1998] Geometric quantization of nilpotent orbits. J. Diff. Geom. Appl. 9 5–58.
Collingwood, D. H. and McGovern, W. M. [1993] Nilpotent Orbits in Semisimple Lie Algebras. (Van Nostrand Reinhold, New York).
Dixmier, J. [1977] Enveloping Algebras. (North Holland, Amsterdam).
Flato, M. and Simon, J. [1973] Separate and joint analyticity in Lie groups representations. J. Funct. Anal. 13 268–276.
Ginzburg, V. L. and Montgomery, R. [2000] Geometric quantization and no-go theorems. In: Poisson Geometry, J. Grabowski and P. Urbański, Eds. Banach Center Publ. 51 (Inst. Mat. PAN, Warszawa), 69–77.
Gotay, M. J. [1995] On a full quantization of the torus. In: Quantization, Coherent States and Complex Structures, J.-P. Antoine et al., eds. (Plenum, New York), 55–62.
Gotay, M. J. [2000] Obstructions to quantization. In: Mechanics: From Theory to Computation (Essays in Honor of Juan-Carlos Simó), J. Nonlinear Sci. Eds. (Springer, New York), 171–216.
Gotay, M. J. [2001] On quantizing semisimple basic algebras, II: The general case. Preprint.
Gotay, M. J. and Grabowski, J. [2001] On quantizing nilpotent and solvable basic algebras. Canadian Math. Bull. 44 140–149.
Gotay, M. J., Grabowski, J., and Grundling, H. B. [2000] An obstruction to quantizing compact symplectic manifolds. Proc. Amer. Math. Soc. 128 237–243.
Gotay, M. J. and Grundling, H. [1999] Nonexistence of finite-dimensional quantizations of a noncompact symplectic manifold. In: Differential Geometry and Applications, I. Kolář et al., eds. (Masaryk Univ., Brno), 593–596.
Gotay, M. J., Grundling, H. B., and Hurst, C. A. [1996] A Groenewold-Van Hove theorem for S². Trans. Amer. Math. Soc. 348 1579–1597.
Kostant, B. [1963] Lie group representations on polynomial rings. Amer. J. Math. 85 327–404.
Lang, S. [1975] SL2(R). (Addison-Wesley, Reading).

VII

Jerrold Marsden, 1942–

Curriculum Vitae

Personal Data

1942, August 17                 Born in Ocean Falls, British Columbia, Canada
1965                            B.Sc., University of Toronto
1968                            Ph.D., Princeton University (Arthur Wightman, advisor)
1968–1995                       Lecturer–Professor, University of California, Berkeley
1970–1971                       Visiting Professor, University of Toronto
1971, June                      Visiting Researcher, Institute for Advanced Study, Princeton
1972, Spring                    Visiting Researcher, Institut des Hautes Études Scientifiques
1975, Spring                    Visiting Professor, University of Toronto
1975, May–June                  Professeur d’Échange, Université de Paris VI
1977, Spring                    Carnegie Fellow, Heriot-Watt University
1979, May                       Professeur d’Échange, Université de Paris VI
1979, Fall                      Killam Visiting Scholar, University of Calgary
1981–1982                       Miller Research Professor, University of California, Berkeley
1983, Fall                      Visiting Researcher, Center for Nonlinear Studies, Los Alamos National Laboratory
1984–1986                       Director, Research Group in Nonlinear Systems and Dynamics, UC Berkeley
1988–1995                       Adjunct Professor of Electrical Engineering and Computer Science, UC Berkeley
1991, Spring                    Humboldt Senior Scientist, University of Hamburg
1991, September                 Ordway Scholar, University of Minnesota
1991–1994                       Director, Fields Institute
1992, Spring                    Fairchild Fellow, Caltech
1995–present                    Professor, Control and Dynamical Systems, Caltech
1998, August 15 – September 15  Distinguished Visiting Scientist, The Mathematical Sciences Research Institute (MSRI)
1999, July–October              Humboldt Senior Scientist, University of Munich

Awards and Honors

Relativity Essay Contest (with Arthur Fischer), First Prize, 1973
Relativity Essay Contest (with Arthur Fischer), Second Prize, 1976
Carnegie Fellow, Heriot–Watt University, Edinburgh, 1977
Jeffery–Williams Prize, Canadian Mathematical Society, 1982
Aisenstadt Lectures, Montreal, 1989–90
Humboldt Prize, 1990–91, 1999
Norbert Wiener Prize, 1990, AMS–SIAM
Ordway Scholar, University of Minnesota, 1991
Fairchild Fellow, Caltech, 1992
Redman Lectures, McMaster, January, 1993
Fellow, Royal Society of Canada, May, 1993
Jerrold E. Marsden postdoctoral fellowship named at the Fields Institute, 1994
Plenary Lecture, ICIAM, Hamburg, July, 1995
Fellow, American Academy of Arts and Sciences, April, 1997
Max Planck Research Award, 2000

Selected Professional Service

Editor
Applied Mathematical Sciences Series, Texts in Applied Mathematics, and Interdisciplinary Applied Mathematics, Springer-Verlag, 1980–present.

Editorial Boards
Physica D (1985–1990)
Journal of Nonlinear Science (1990–present)
Journal of Geometry and Physics (1990–present)
Dynamics and Stability of Systems (1990–present)
Proceedings of the Royal Society of Edinburgh (1990–present)
Journal of Mathematical Systems and Control (1990–1995)
Mathematica Journal (1990–present)
Canadian Journal of Applied Mathematics (1990–present)
Journal of Dynamics and Differential Equations (1990–1996)
Foundations of Computational Mathematics (1999–present)
Journal of Mathematical Physics (1998–present)
Journal of Symplectic Geometry (2000–present)

Professional Service
American Mathematical Society (AMS) Committee on Summer Research Conferences (1983–86), Chair (1985–1986)
NSF Mathematics Advisory Committee (1986–1990)
AMS Science Policy Committee (1989–1992)
AMS Program Committee (1993–1996)
MSRI Advisory Committee (1990–1994)
Director, Fields Institute (1990–1994)
Scientific Advisory Panel, Fields Institute (1994–1998)
Council of the American Mathematical Society (1995–1998)
Chair, Scientific Program Committee, ICMS Edinburgh (1998–2000)
Board of Trustees, Institute for Pure and Applied Mathematics, UCLA (1999–present)

Some Research Highlights

Symplectic Reduction Theory. Marsden and Weinstein [30] discovered symplectic reduction theory for mechanical systems with symmetry. This far-reaching generalization of classical work of Jacobi, Liouville, Routh, Poincaré, Arnold, and Smale led to many significant developments in both mechanics and mathematics. Guillemin and Sternberg describe some of these in their 1984 book “Symplectic Techniques in Physics”, referring to this process as Marsden–Weinstein reduction, a term (also known as Marsden–Meyer–Weinstein reduction) in wide use today. The development of this theory continues with many significant papers, such as semi-direct product reduction theory in Marsden, Ratiu, and Weinstein [97], geometric phases (Hannay–Berry phases) realized as reconstruction phases in Marsden, Montgomery, and Ratiu [151], cotangent bundle reduction theory in Marsden and Abraham [294] and Marsden [302], and reduction by stages in Marsden, Misiolek, Perlmutter, and Ratiu [213]. Poisson reduction theory, which has proved useful in certain integrable systems, was developed in Marsden and Ratiu [120].

Lagrangian Reduction Theory. The reduction of variational principles rather than symplectic and Poisson structures is the main theme of the Lagrangian counterpart to symplectic reduction. This profound but simple idea is due to Marsden and Scheurle [171], in which the reduced Euler–Lagrange equations (also called the Lagrange–Poincaré equations, even though these equations are not literally in the works of either Lagrange or Poincaré) were discovered. Semi-direct Euler–Poincaré theory was discovered in Holm, Marsden, and Ratiu [212]. Using the Euler–Poincaré theory, the averaged Euler equations were discovered in Holm, Marsden, and Ratiu [209, 212]. This theory was considerably extended and matured in works such as Marsden, Ratiu, and Scheurle [245] and Cendra, Marsden, and Ratiu [268]. These equations have attracted much attention, both analytical and numerical.
Fluid Mechanics and Plasma Physics. Ebin and Marsden [13], building on the work of Arnold, showed remarkable smoothness and other analytic properties of the Euler equations in Lagrangian representation and, as a consequence, established the first limit of zero viscosity theorem for the Navier–Stokes equations in regions with no boundary. Marsden and Weinstein [76, 86] used reduction theory to establish the Hamiltonian structure of Eulerian fluid dynamics as well as plasma dynamics (Maxwell–Vlasov equations). Many papers, such as Holm, Marsden, Ratiu, and Weinstein [102], explored and developed the Arnold, or energy-Casimir, method for fluid and plasma stability. The geometry and analysis of the averaged Euler and Navier–Stokes equations were developed in, for example, Marsden, Ratiu, and Shkoller [251] and Marsden and Shkoller [269].

Relativistic Fields. Combining reduction theory with PDE techniques and classical field theory, Fischer, Marsden, and Moncrief [69] and Arms, Marsden, and Moncrief [75] showed that in the spatially compact case, the space of solutions of the Einstein equations has a quadratic singularity at a metric g if and only if g has nontrivial symmetries. This was part of the Fischer–Marsden program of “linearization stability” and Hamiltonian and variational structures for nonlinear problems in classical field theory and geometry. The initial-value problem was shown to be well-posed in Hughes, Kato, and Marsden [49] and Fischer and Marsden [63]. The Einstein equations were put in FOSH (first order symmetric hyperbolic) form in Fischer and Marsden [21]. The paper of Choquet-Bruhat and Marsden [39] first proved the local positive mass conjecture using a Morse-theory idea of Brill and Deser.

Dynamical Systems in Mechanics. Holmes and Marsden [71] was one of the first papers to rigorously establish the existence of Smale horseshoes (and hence chaotic solutions) in a PDE, in this case a forced beam equation. Exponentially small splitting of separatrices was investigated in Holmes, Marsden, and Scheurle [140], and the phenomenon of pattern evocation was discovered in Marsden and Scheurle [189]. In stability theory, building on the Arnold energy-Casimir method, the creation and development of the energy-momentum method, especially the block diagonalization structures and their application to elastic, fluid, and plasma equations, was a major achievement (see Marsden [302] and references therein). The converse led to new insights into dissipation-induced instabilities in Bloch, Krishnaprasad, Marsden, and Ratiu [173].
Nonlinear Elasticity. Hughes, Kato, and Marsden [49] proved existence and uniqueness for the initial value problem in nonlinear elasticity, and the book Marsden and Hughes [300] laid the geometric foundations of the subject, which has had an enormous influence. Ball and Marsden [110] gave a counterexample to the energy criterion for nonlinear elasticity and, in the process, discovered quasi-convexity at the boundary. Bifurcations in the traction problem were investigated in detail in Chillingworth, Marsden, and Wan [87], and Hamiltonian structures for rods and shells were developed in Simó, Marsden, and Krishnaprasad [139]. Variational methods for elastic collision algorithms were developed in Kane, Repetto, Ortiz, and Marsden [229] and Pandolfi, Kane, Marsden, and Ortiz [284].

Control Theory. The discovery of the momentum equation for nonholonomic systems in Bloch, Krishnaprasad, Marsden, and Murray [195] has led to deep theoretical as well as practical insight into locomotion generation in the control of mechanical systems. In addition, building on early work of Bloch, Krishnaprasad, Marsden, and Sánchez de Alvarez [163], and motivated by work on underwater vehicles in Leonard and Marsden [204], the paper Bloch, Leonard, and Marsden [262] created the stabilization method of controlled Lagrangians (CL), which has already attracted much attention in the control community. With the collaboration of Chang, the CL method was recently shown to be equivalent to the CH (controlled Hamiltonian) method, closely related to passivity methods.

Variational Integrators. Variational integrators, based on simple but profound consequences of Hamilton’s principle and its Veselov-type discretizations, are having a significant impact on the numerical integration of mechanical systems, both conservative and forced/dissipative. The paper Marsden, Patrick, and Shkoller [217] stands out as the most profound by showing how this theory extends to the PDE context, producing multisymplectic integration algorithms. A related important discovery in Kane, Marsden, Ortiz, and West [252] is that the Newmark algorithm is variational and hence symplectic. These methods were recently extended to the context of asynchronous variational integrators (AVI) in work with Lew, Ortiz, and West.

Dynamical Systems and Space Mission Design. The paper Koon, Lo, Marsden, and Ross [244] extends classical works on the three body problem (due to Conley, McGehee, Simó et al.) to include situations of interest in space mission design, including the Genesis Discovery Mission. Their discovery of new heteroclinic connections and their use to gain a deep understanding of ballistic capture by the moon, as well as the associated discovery of missions to the moons of Jupiter, have contributed to these new NASA mission concepts.

Graduate Students, and Post Doctoral Scholars

Ph.D. Students
[ The following entries show the institution granting the degree and the thesis title for each student, followed by their current location in square brackets. ]

1. Graciela Chichilnisky, Math, UCB, Group actions on spin manifolds, 1970. [ Columbia University ]
2. Murray Robert Cantor, Math, UCB, Global analysis over noncompact spaces, 1970. [ Rational Software ]
3. Marjorie McCracken, Math, UCB, The Stokes equations in L^p, 1975. [ Stanford University Medical Center ]
4. Judith Arms, Math, UCB, Linearization stability of coupled gravitational and gauge fields, 1977. [ University of Washington ]
5. Gabriel Lugo, Math, UCB, Structure of twistor and H-spaces, 1979. [ University of North Carolina, Wilmington ]
6. Tudor Ratiu, Math, UCB, Euler–Poisson equations on Lie algebras, 1980. [ EPFL, Lausanne ]
7. Omar Hijab, Math, UCB, Minimum energy estimation, 1980. [ Temple University ]
8. Dena Patterson, Math, UCB, Calculus students’ use of visualization, 1983.
9. David Bao, Math, UCB, Some aspects in the dynamics of supergravity, 1983. [ University of Houston ]
10. Gloria Sanchez de Alvarez, Math, UCB, Geometric methods of classical mechanics applied to control theory, May 1986. [ University of Los Andes, Merida, Venezuela ]
11. Richard Montgomery, Math, UCB, The bundle picture in mechanics, 1986. [ University of California, Santa Cruz ]
12. Debra Lewis, Math, UCB, Rotating liquid drops, Hamiltonian structure, bifurcation and stability, 1987. [ University of California, Santa Cruz ]
13. Andrew Phelps, Math, UCB (co-supervised with A. J. Krener), A Simplification of Nonlinear Observer Theory, 1987.
14. Uy Mbanefo, Math, UCB, Mixed boundary-value problems of stress singularities, 1988. [ Northrop–Grumman ]
15. George Patrick, Math, UCB, The dynamics of coupled rigid bodies, 1990. [ University of Saskatchewan ]
16. Mark S. Alber, Math, U. Penn, Geometric phases, geometric asymptotics, and integrable systems, 1990. [ University of Notre Dame ]
17. Gil Bor, Math, UCB, Non self dual Yang-Mills fields, June 1991. [ University of Guanajuato ]
18. Shuh-Jye Chern, Math, Cornell, Fluid stability on rotating spheres, 1991. [ National Tsing-Hua University, Taiwan ]
19. Robert Fillipini, Math, UCB, The symplectic geometry of the theorems of Borel–Weil and Peter–Weyl, May 1995.
20. Neil Getz, EECS, UCB, Control of nonminimum phase systems, October 1995. [ Inversion, Inc. ]
21. Wang-Sang Koon, Math, UCB, Nonholonomic mechanical systems, May 1997. [ Caltech ]
22. Anthony Blaom, Math, Caltech, Geometric mechanics and Nekhoroshev estimates, May 1997.
23. Mathew Perlmutter, Math, UCB, Symplectic reduction by stages, May 1999. [ University of Lisbon ]
24. Sameer Jalnapurkar, Math, UCB, Nonholonomic and Lagrangian control systems, May 1999. [ IIT, Bangalore ]
25. Chong Ye Xu, Math, UCB, Asymptotic stability for equilibria of nonlinear semiflows with applications to rotating viscoelastic rods, June 2000.
26. Sergey Pekarsky, CDS, Caltech, Discrete reduction of mechanical systems and multisymplectic geometry of continuum mechanics, June 2000. [ Moody’s Investors Service ]
27. Antonio Hernandez, Math, Caltech, Regularization of the amended potential around symmetric points, November 2001. [ University of Mexico ]
28. Luz Vianey Vela-Arevalo, CDS, Caltech, Hamiltonian systems in molecular dynamics and frequency analysis, August 2001. [ Georgia Tech ]


Ph.D. Students in Progress (Caltech)
Dong Eui Chang (CDS)
Razvan Fetecau (ACM)
Anil Hirani (CS)
Francois Lekien (CDS)
Melvin Leok (CDS)
Shane Ross (CDS)
Matthew West (CDS)

Masters Students
Thomas J. R. Hughes (SESM), 1974
Shankar Sastry (EECS), 1980
Fathi A. Salam (EECS), 1982
Dave Reilly (Math), 1983
Zexiang Li (EECS), 1989
Greg P. Heinzinger (EECS), 1990
Jeff Wendlandt (ME), 1995

Postdoctoral Fellows
Jim Isenberg, UC Berkeley, 1980–1982
Robert Grossman, UC Berkeley, 1986–1988
Vivien Kirk, UC Berkeley, 1990–1992
Hans-Peter Kruse (Hamburg, Humboldt Award), UC Berkeley, 1992–1994
Hassan Gumral (Turkey), UC Berkeley, 1993–1994
Brianno Coller (CDS), Caltech, 1995–1997
Steve Shkoller, San Diego and LANL, 1996–1999
Greg Luther (CDS), Caltech, 1997
Wang-Sang Koon (CDS), Caltech, 1998–
Sanjay Lall (joint with John Doyle) (CDS), Caltech, 1998–2000
Banavara Shashikanth (joint with Richard Murray and Joel Burdick) (CDS), Caltech, 1998–2000
Matt Perlmutter (CDS), Caltech, 1998–2000
Couro Kane (CDS), Caltech, 1997–2000
Scott Kelly (joint with Richard Murray) (ME), Caltech, 1998–1999
Petr Krysl (joint with Peter Schröder) (CS), Caltech, 1998–2001
Sameer Jalnapurkar (CDS), Caltech, 1999–2000
Kamran Mohseni (CDS), Caltech, 2000–2001
Sergey Pekarsky (CDS), Caltech, 2000–2001
Chad Coulliette (CDS), Caltech, 2000–
Marcel Clerc (CDS), Caltech, 2001

Publications

Papers (Chronological)

[ 1965 ] [1] J. E. Marsden [1965], A theorem on harmonic homologies. Canad. Math. Bull., 8, 375–377. [ 1966 ] [2] M. Beattie, J. E. Marsden and R. Sharpe [1966], A universal factorization theorem in topology. Canad. Math. Bull., 9, 201–207. [3] M. Beattie, J. E. Marsden and R. Sharpe [1966], Order in finite affine planes. Canad. Math. Bull., 9, 407–412. [ 1967 ] [4] J. E. Marsden [1967], A correspondence principle for momentum operators. Canad. Math. Bull., 10, 247–250. [ 1968 ] [5] J. E. Marsden [1968], Generalized Hamiltonian mechanics. Arch. Rational Mech. Anal., 28, 323–361. [6] J. E. Marsden [1968], Hamiltonian one parameter groups. Arch. Rational Mech. Anal., 28, 362–396. [7] J. E. Marsden [1968], A Banach space of analytic functions for constant coefficient equations of evolution. Canad. Math. Bull., 11, 599–601. [8] J. E. Marsden [1968], Countable and net convergence. Amer. Math. Monthly, 75, 397–398. [ 1969 ] [9] J. E. Marsden [1969], Hamiltonian systems with spin. Canad. Math. Bull., 12, 203–208. [10] J. E. Marsden [1969], Non smooth geodesic flows and classical mechanics. Canad. Math. Bull., 12, 209–212. [11] D. G. Ebin and J. E. Marsden [1969], Groups of diffeomorphisms and the solution of the classical Euler equations for a perfect fluid. Bull. Amer. Math. Soc., 75, 962–967. [ 1970 ] [12] R. Abraham and J. E. Marsden [1970], Hamiltonian mechanics on Lie groups and hydrodynamics. Proc. Sympos. Pure Math., 16, 237–244. [13] D. G. Ebin and J. E. Marsden [1970], Groups of diffeomorphisms and the motion of an incompressible fluid. Ann. of Math., 92, 102–163.


[14] D. G. Ebin and J. E. Marsden [1970], On the motion of incompressible ﬂuids. Actes Du Congres Intern., 2, 211–214. [15] J. E. Marsden and A. Weinstein [1970], A comparison theorem for Hamiltonian vector ﬁelds. Proc. Amer. Math. Soc., 26, 629–631. [16] P. Chernoﬀ and J. E. Marsden [1970], On continuity and smoothness of group actions. Bull. Amer. Math. Soc., 76, 1044–1049. [ 1972 ] [17] A. E. Fischer and J. E. Marsden [1972], General relativity as a dynamical system on the manifold A of Riemannian metrics which cover diﬀeomorphisms. Lecture Notes in Phys., Springer, 14, 176–188. [18] J. E. Marsden, D. G. Ebin and A. E. Fischer [1972], Diﬀeomorphism groups, hydrodynamics and relativity. Proc. of the 13th Biennial Seminar of Canadian Mathematical Congress, Montreal, (J. Vanstone, ed.), 135–279. [19] A. E. Fischer and J. E. Marsden [1972], The Einstein equation of evolution — a geometric approach. J. Math. Phys., 13, 546–568. [20] J. E. Marsden [1972], Darboux’s theorem fails for weak symplectic forms. Proc. Amer. Math. Soc., 3, 590–592. [21] A. E. Fischer and J. E. Marsden [1972], The Einstein evolution equations as a ﬁrst-order quasi-linear symmetric hyperbolic system, I. Comm. Math. Phys., 28, 1–38. [ 1973 ] [22] A. E. Fischer and J. E. Marsden [1973], New theoretical techniques in the study of gravity. Gen. Relativity Gravitation, 4, 309–317. [23] J. E. Marsden [1973], On product formulas for nonlinear semigroups. J. Funct. Anal., 13, 51–72. [24] J. E. Marsden [1973], On global solutions for nonlinear Hamiltonian evolution equations. Comm. Math. Phys., 30, 79–81. [25] J. E. Marsden [1973], The Hopf bifurcation for nonlinear semigroups. Bull. Amer. Math. Soc., 79, 537–541. [26] J. E. Marsden [1973], On completeness of homogeneous pseudo-Riemannian manifolds. Indiana Univ. Math. J., 22, 1065–1066. [27] A. E. Fischer and J. E. Marsden [1973], General relativity, partial diﬀerential equations, and dynamical systems. Proc. Sympos. Pure Math., 23, 309–327. [28] A. E. 
Fischer and J. E. Marsden [1973], Linearization stability of the Einstein equations. Bull. Amer. Math. Soc., 79, 1065–1066. [29] J. E. Marsden [1973], A proof of the Calderon extension theorem. Canad. Math. Bull., 16, 133–136. [ 1974 ] [30] J. E. Marsden and A. Weinstein [1974], Reduction of symplectic manifolds with symmetry. Rep. Math. Phys., 5, 121–130. [31] A. E. Fischer and J. E. Marsden [1974], Manifolds of Riemannian metrics with prescribed scalar curvature. Bull. Amer. Math. Soc., 80, 479–484.


[32] A. E. Fischer and J. E. Marsden [1974], Global analysis and general relativity. Gen. Relativity Gravitation, 5, 73–77. [33] J. E. Marsden [1974], A formula for the solution of the Navier–Stokes equation based on a method of Chorin. Bull. Amer. Math. Soc., 80, 154–158. [34] A. E. Fischer and J. E. Marsden [1974], General relativity as a Hamiltonian system. Symposia Math. XIV, 193–205. [35] P. Chernoff and J. E. Marsden [1974], Some basic properties of infinite dimensional Hamiltonian systems. Lecture Notes in Math., Springer, 425, 1–160. [ 1975 ] [36] A. E. Fischer and J. E. Marsden [1975], Linearization stability of nonlinear partial differential equations. Proc. Sympos. Pure Math., 27, 219–263. [37] A. E. Fischer and J. E. Marsden [1975], Deformations of the scalar curvature. Duke Math. J., 42, 519–547. [38] J. Arms, A. Fischer and J. E. Marsden [1975], Une approche symplectique pour des théorèmes de décomposition en géométrie et relativité générale. C. R. Acad. Sci., Paris, 281, 517–520. [ 1976 ] [39] Y. Choquet-Bruhat and J. E. Marsden [1976], Sur la positivité de la masse. C. R. Acad. Sci., Paris, 282, 609–612. [40] Y. Choquet-Bruhat and J. E. Marsden [1976], Solution of the local mass problem in general relativity. Comm. Math. Phys., 51, 283–296. [41] P. Chernoff and J. E. Marsden [1976], Some basic properties of infinite dimensional Hamiltonian systems. Colloq. Internat. C.N.R.S., 237, 313–330. [42] A. E. Fischer and J. E. Marsden [1976], Deformations of nonlinear partial differential equations. Colloq. Internat. C.N.R.S., 237, 331–345. [43] A. E. Fischer and J. E. Marsden [1976], The manifold of conformally equivalent metrics. Can. J. Math., 29, 193–209. [44] Y. Choquet-Bruhat, A. E. Fischer and J. E. Marsden [1976], Equations des contraintes sur une variété non compacte. C. R. Acad. Sci., Paris, 284, 975–978. [45] J.-P. Bourguignon, D. G. Ebin and J. E. Marsden [1976], Sur le noyau des opérateurs pseudo-différentiels à symbole surjectif et non injectif. C. R. 
Acad. Sci., Paris, 282, 867–870. [46] J. E. Marsden [1976], Well-posedness of the equations of a non-homogeneous perfect ﬂuid. Comm. Partial Diﬀerential Equations, 1, 215–230. [47] A. E. Fischer and J. E. Marsden [1976], A new Hamiltonian structure for the dynamics of general relativity. Gen. Relativity Gravitation, 7, 915–920.

[ 1977 ]

[48] P. Chernoﬀ and J. E. Marsden [1977], Some remarks on Hamiltonian systems and quantum mechanics. in Foundations of probability theory, statistical inference, and statistical theories of science, Proc. Internat. Res. Colloq., Univ. Western Ontario, London, Ont., 1973, III, 35–53; Univ. Western Ontario Ser. Philos. Sci., 6, Reidel, Dordrecht. [49] T. J. R. Hughes, T. Kato and J. E. Marsden [1977], Well-posed quasi-linear second-order hyperbolic systems with applications to nonlinear elastodynamics and general relativity. Arch. Rational Mech. Anal., 63, 273–394. [50] J. E. Marsden [1977], Attempts to relate the Navier–Stokes equations to turbulence. Lecture Notes in Math., Springer, 615, 1–22. [51] T. J. R. Hughes and J. E. Marsden [1977], Some applications of geometry in continuum mechanics. Rep. Math. Phys., 12, 35–44. [ 1978 ] [52] T. J. R. Hughes and J. E. Marsden [1978], Classical elastodynamics as a linear symmetric hyperbolic system. J. Elast., 8, 97–110. [53] A. Chorin, T. J. R. Hughes, J. E. Marsden and M. McCracken [1978], Product formulas and numerical algorithms. Comm. Pure Appl. Math., 31, 205–256. [54] P. J. Holmes and J. E. Marsden [1978], Bifurcation to divergence and ﬂutter in ﬂow induced oscillations. Automatica, 14, 367–384. [55] P. J. Holmes and J. E. Marsden [1978], Bifurcations of dynamical systems and nonlinear oscillations in engineering systems. Lecture Notes in Math., Springer, 648, 163–206. [56] T. J. R. Hughes and J. E. Marsden [1978], Topics in the mathematical foundations of elasticity. In Nonlinear Analysis and Mechanics, vol. II, (R. J. Knops, ed.), Pitman Research Notes, 27, 30–285. [57] J. E. Marsden [1978], Qualitative methods in bifurcation theory. Bull. Amer. Math. Soc., 84, 1125–1148. [58] J. M. Ball, R. J. Knops and J. E. Marsden [1978], Two examples in nonlinear elasticity. Lecture Notes in Math., Springer, 665, 41–49. [ 1979 ] [59] J. M. Arms and J. E. 
Marsden [1979], The absence of Killing ﬁelds is necessary for linearization stability of Einstein’s equations. Indiana Univ. Math. J., 28, 119–125. [60] P. J. Holmes and J. E. Marsden [1979], Qualitative techniques for bifurcation analysis of complex systems. Annals of New York Academy of Sciences, 316, 608–622. [61] A. E. Fischer and J. E. Marsden [1979], Topics in the dynamics of general relativity. In Isolated Gravitating Systems in General Relativity, (J. Ehlers, ed.), Italian Physical Society, 322–395. [62] Y. Choquet-Bruhat, A. E. Fischer, and J. E. Marsden [1979], Maximal hypersurfaces and positivity of mass. In Isolated Gravitating Systems in General Relativity, (J. Ehlers, ed.), Italian Physical Society, 396–456.


[63] A. E. Fischer and J. E. Marsden [1979], The initial value problem and the dynamical formulation of general relativity. In General Relativity, An Einstein Centenary Survey, (S. W. Hawking and W. Israel, eds.), Cambridge University Press, 138–211. [64] J. E. Marsden [1979], On geometry of the Liapunov–Schmidt procedure. Lecture Notes in Math., Springer, 755, 77–82. [ 1980 ] [65] P. J. Holmes and J. E. Marsden [1980], Dynamical systems and invariant manifolds. In New Approaches to Nonlinear Problems in Dynamics, (P. J. Holmes, ed.), SIAM, 1–28. [66] J. E. Marsden and F. Tipler [1980], Maximal hypersurfaces and foliations of constant mean curvature in general relativity. Phys. Reports, 66, 109–139. [67] A. E. Fischer, J. E. Marsden and V. Moncrief [1980], Symmetry breaking in general relativity. In Essays on General Relativity, (F. J. Tipler, ed.), Academic Press, 79–96. [68] P. J. Holmes and J. E. Marsden [1980], A horseshoe in the dynamics of a forced beam. Ann. New York Acad. Sci., 357, 313–321. [69] A. E. Fischer, J. E. Marsden and V. Moncrief [1980], The structure of the space of solutions of Einstein’s equations, I: One Killing field. Ann. Inst. H. Poincaré Phys. Théor., 33, 147–194. [ 1981 ] [70] J. M. Arms, J. E. Marsden and V. Moncrief [1981], Symmetry and bifurcations of momentum mappings. Comm. Math. Phys., 78, 455–478. [71] P. J. Holmes and J. E. Marsden [1981], A partial differential equation with infinitely many periodic orbits: Chaotic oscillations of a forced beam. Arch. Rational Mech. Anal., 76, 135–166. [72] J. E. Marsden [1981], Four applications of nonlinear analysis to physics and engineering. In New Directions in Applied Math, (P. Hilton and G. Young, eds.), 85–107. [ 1982 ] [73] P. J. Holmes and J. E. Marsden [1982], Horseshoes in perturbations of Hamiltonian systems with two degrees of freedom. Comm. Math. Phys., 82, 523–544. [74] J. E. Marsden [1982], Spaces of solutions of relativistic field theories with constraints. 
Lecture Notes in Math., Springer, 905, 29–44. [75] J. M. Arms, J. E. Marsden and V. Moncrief [1982], The structure of the space of solutions of Einstein’s equations, II: Several Killing ﬁelds and the Einstein–Yang Mills equations. Ann. Physics, 144, 81–106. [76] J. E. Marsden and A. Weinstein [1982], The Hamiltonian structure of the Maxwell–Vlasov equations. Physica D, 4, 394–406. [77] P. J. Holmes and J. E. Marsden [1982], Melnikov’s method and Arnold diffusion for perturbations of integrable Hamiltonian systems. J. Math. Phys., 23, 669–675.


[78] J. E. Marsden [1982], A group theoretic approach to the equations of plasma physics. Canad. Math. Bull., 25, 129–142. [79] J. M. Ball, J. E. Marsden and M. Slemrod [1982], Controllability for distributed bilinear systems. SIAM J. Cont. Opim., 20, 575–597. [80] J. Isenberg and J. E. Marsden [1982], A slice theorem for the space of solutions of Einstein’s equations. Phys. Reports, 89, 179–222. [81] D. R. J. Chillingworth, J. E. Marsden, and Y. H. Wan [1982], Symmetry and bifurcation in three-dimensional elasticity, I. Arch. Rational Mech. Anal., 80, 295–331. [ 1983 ] [82] P. J. Holmes and J. E. Marsden [1983], Horseshoes and Arnold diﬀusion for Hamiltonian systems on Lie groups. Indiana Univ. Math. J., 32, 273–310. [83] J. E. Marsden [1983], The initial value problem and dynamics of gravitational ﬁelds. Proc. 9th Internat. Conf. Gen. Relativity and Gravitation 115–126. [84] M. Buchner, J. E. Marsden, and S. Schecter [1983], Applications of the blowing up construction and algebraic geometry to bifurcation problems. J. Diﬀerential Equations, 48, 404–433. [85] J. E. Marsden [1983], Bifurcation and linearization stability in the traction problem. In Systems of Nonlinear Partial Diﬀerential Equations, (J. M. Ball, ed.), Reidel, 367–372. [86] J. E. Marsden and A. Weinstein [1983], Coadjoint orbits, vortices and Clebsch variables for incompressible ﬂuids. Physica D, 7, 305–323. [87] D. R. J. Chillingworth, J. E. Marsden and Y. H. Wan [1983], Symmetry and bifurcation in three-dimensional elasticity, II. Arch. Rational Mech. Anal., 83, 363–395. [88] J. E. Marsden and Y. H. Wan [1983], Linearization stability and Signorini series for the traction problems in elastostatics. Proc. Roy. Soc. Edinburgh Sect. A, 95, 171–180. [89] D. D. Holm, J. E. Marsden, T. Ratiu and A. Weinstein [1983], Nonlinear stability conditions and a priori estimates for barotropic hydrodynamics. Phys. Letters A, 98, 15–21. [90] M. Golubitsky and J. E. 
Marsden [1983], The Morse Lemma in infinite dimensions via singularity theory. SIAM J. Math. Anal., 14, 1037–1044. [91] M. Buchner, J. E. Marsden, and S. Schecter [1983], Examples for the infinite dimensional Morse Lemma in infinite dimensions. SIAM J. Math. Anal., 14, 1045–1055. [92] J. E. Marsden and Y. H. Wan [1983], Symmetry and bifurcation in three dimensional elasticity, III. Arch. Rational Mech. Anal., 84, 203–233. [93] J. E. Marsden, A. Weinstein, T. Ratiu, R. Schmid, and R. Spencer [1983], Hamiltonian systems with symmetry, coadjoint orbits and plasma physics. Proc. IUTAM–ISIMM Symposium on Modern Developments in Analytical Mechanics, Atti della Accademia delle Scienze di Torino, 117, 289–340.


[94] J. E. Marsden, F. M. A. Salam, and P. P. Varaiya [1983], Chaos and Arnold diffusion in dynamical systems. IEEE Trans. Circuits and Systems, 30, 697–708. [95] J. E. Marsden [1983], Chaotic orbits by Melnikov’s method: A survey of applications. Proc. IEEE Conf. on Decision and Control, 22, 356–359. [96] J. E. Marsden, F. M. A. Salam and P. P. Varaiya [1983b], Arnold diffusion in the dynamics of a four-machine power system undergoing a large fault. Proc. IEEE Conf. on Decision and Control, 22, 1411–1414. [ 1984 ] [97] J. E. Marsden, T. Ratiu and A. Weinstein [1984], Semi-direct products and reduction in mechanics. Trans. Amer. Math. Soc., 281, 147–177. [98] M. Golubitsky, J. E. Marsden and D. Schaeffer [1984], Bifurcation problems with hidden symmetries. In Partial Differential Equations and Dynamical Systems, (W. Fitzgibbon, ed.), Plenum Press, 181–210. [99] J. E. Marsden [1984], Hamiltonian structures for the heavy top and plasmas. In Partial Differential Equations and Dynamical Systems, (W. Fitzgibbon, ed.), Plenum Press, 259–278. [100] J. E. Marsden and J. C. Simo [1983], Stress tensors, Riemannian metrics and the alternative descriptions of elasticity. Lecture Notes in Phys., Springer, 195, 369–383. [101] J. E. Marsden, P. Morrison, and A. Weinstein [1984], The Hamiltonian structure of the BBGKY hierarchy. Contemp. Math., Amer. Math. Soc., Providence, RI, 28, 115–124. [102] D. D. Holm, J. E. Marsden, T. Ratiu, and A. Weinstein [1984], Stability of rigid body motion using the energy-Casimir method. Contemp. Math., Amer. Math. Soc., Providence, RI, 28, 15–24. [103] J. E. Marsden, R. Montgomery and T. Ratiu [1984], Gauged Lie–Poisson structures. Contemp. Math., Amer. Math. Soc., Providence, RI, 28, 101–114. [104] J. E. Marsden, T. Ratiu and A. Weinstein [1984], Reduction and Hamiltonian structures on duals of semidirect product Lie algebras. Contemp. Math., Amer. Math. Soc., Providence, RI, 28, 55–100. [105] J. E. Marsden and P. 
Morrison [1984], Noncanonical Hamiltonian ﬁeld theory and reduced MHD. Contemp. Math., Amer. Math. Soc., Providence, RI, 28, 133–150. [106] H. D. I. Abarbanel, D. D. Holm, J. E. Marsden, and T. Ratiu [1984], Richardson number criterion for the nonlinear stability of three-dimensional stratiﬁed ﬂow. Phys. Rev. Lett., 52, 2352–2355. [107] J. E. Marsden and J. Scheurle [1984], Bifurcation to quasi-periodic tori in the interaction of steady state and Hopf bifurcations. SIAM J. Math. Anal., 15, 1055–1074. [108] J. E. Marsden, F. M. A. Salam, and P. P. Varaiya [1984], Arnold diﬀusion in the swing equations of a power system. IEEE Trans. Circuits and Systems, 31, 673–688.


[109] J. E. Marsden and J. C. Simo [1984], On the rotated stress tensor and the material version of the Doyle–Ericksen formula. Arch. Rational Mech. Anal., 86, 213–231. [110] J. M. Ball and J. E. Marsden [1984], Quasiconvexity, second variations and the energy criterion in nonlinear elasticity. Arch. Rational Mech. Anal., 86, 251–277. [111] J. Isenberg and J. E. Marsden [1984], The York Map is a canonical transformation. J. Geom. Phys., 1, 85–105. [112] J. E. Marsden [1984], Chaos in dynamical systems by the Poincar´e– Melnikov–Arnold method. In Chaos in Nonlinear Dynamical Systems, (J. Chandra, ed.), SIAM, 19–31. [113] J. E. Marsden and J. C. Simo [1984], Stress and Riemannian metrics in nonlinear elasticity. MSRI Proc., 2, 173–184. [114] R. Hazeltine, D. D. Holm, J. E. Marsden and P. Morrison [1984], Generalized Poisson brackets and nonlinear Liapunov stability — Application to reduced MHD. Proc. ICPP Conference, Lausanne, 2, 204–209. [ 1985 ] [115] D. D. Holm, J. E. Marsden, T. Ratiu, and A. Weinstein [1985], Nonlinear stability of ﬂuid and plasma equilibria. Phys. Reports, 123, 1–116. [116] D. Bao, J. E. Marsden, and R. Walton [1985], The Hamiltonian structure of general relativistic perfect ﬂuids. Comm. Math. Phys., 99, 319–345. [117] J. E. Marsden and M. Slemrod [1985], Temporal and spatial chaos in a van der Waals ﬂuid due to periodic thermal ﬂuctuations. Adv. in Appl. Math., 6, 135–158. [ 1986 ] [118] D. D. Holm, J. E. Marsden, and T. Ratiu [1986], Nonlinear stability of the Kelvin–Stuart cat’s eyes ﬂows. Lectures in Appl. Math, Amer. Math. Soc., 23, part 2, 171–186. [119] D. Lewis, J. E. Marsden, R. Montgomery, and T. Ratiu [1986], The Hamiltonian structure for dynamic free boundary problems. Physica D, 18, 391– 404. [120] J. E. Marsden and T. S. Ratiu [1986], Reduction of Poisson manifolds. Lett. Math. Phys., 11, 161–170. [121] D. D. Holm, J. E. Marsden, and T. 
Ratiu [1986], The Hamiltonian structure of continuum mechanics in material, inverse material, spatial and convective representations. Séminaire de Mathématiques Supérieures, Les Presses de L’Université de Montréal, 100, 11–122. [122] D. Lewis, J. E. Marsden and T. Ratiu [1986], Formal stability of liquid drops with surface tension. In Perspectives in Nonlinear Dynamics, (M. F. Shlesinger, R. Cawley, A. W. Saenz and W. Zachary, eds.), World Scientific, 71–83. [123] H. D. I. Abarbanel, D. D. Holm, J. E. Marsden and T. S. Ratiu [1986], Nonlinear stability analysis of stratified fluid equilibria. Philos. Trans. Roy. Soc. London Ser. A, 318, 349–409.

Publications: J. E. Marsden


[124] J. E. Marsden, R. Montgomery, P. J. Morrison, and W. B. Thompson [1986], Covariant Poisson brackets for classical fields. Ann. Physics, 169, 29–48.
[125] D. Eardley, J. Isenberg, J. E. Marsden and V. Moncrief [1986], Homothetic and conformal symmetries of solutions to Einstein’s equations. Comm. Math. Phys., 106, 137–158.
[ 1987 ]
[126] P. S. Krishnaprasad and J. E. Marsden [1987], Hamiltonian structure and stability for rigid bodies with flexible attachments. Arch. Rational Mech. Anal., 98, 71–93.
[127] J. E. Marsden and T. S. Ratiu [1987], Nonlinear Stability in Fluids and Plasmas. Seminar on New Results in Nonlinear Partial Differential Equations, (A. J. Tromba, ed.), Vieweg, 101–134.
[128] J. E. Marsden [1987], Generic Bifurcation of Hamiltonian Systems with Symmetry. Appendix to Golubitsky and Stewart, Physica D, 24, 391–405.
[129] H. Cendra and J. E. Marsden [1987], Lin constraints, Clebsch potentials and variational principles. Physica D, 27, 63–89.
[130] J. E. Marsden and J. Scheurle [1987], The Construction and Smoothness of Invariant Manifolds by the Deformation Method. SIAM J. Math. Anal., 18, 1261–1274.
[131] D. Lewis, J. E. Marsden and T. S. Ratiu [1987], Stability and bifurcation of a rotating liquid drop. J. Math. Phys., 28, 2508–2515.
[132] P. S. Krishnaprasad, J. E. Marsden, and T. Posbergh [1987], Stability analysis of a rigid body with a flexible attachment using the energy-Casimir method. Contemp. Math., Amer. Math. Soc., Providence, RI, 68, 253–273.
[133] H. Cendra, A. Ibort, and J. E. Marsden [1987], Horizontal Lin constraints, Clebsch potentials and variational principles on principal fiber bundles. XV Coll. in Group Theor. Meth. in Phys., (R. Gilmore, ed.), World Sci., 446–450.
[134] H. Cendra, A. Ibort and J. E. Marsden [1987], Variational principles on fiber bundles: a geometric theory of Clebsch potentials and Lin constraints. J. Geom. Phys., 4, 183–206.
[ 1988 ]
[135] R. Grossman, P. S. Krishnaprasad and J. E. Marsden [1988], The dynamics of two coupled rigid bodies. In Dynamical Systems Approaches to Nonlinear Problems in Systems and Circuits, (Salam and Levi, eds.), SIAM, 373–378.
[136] J. E. Marsden [1988], The Hamiltonian formulation of classical field theory. Contemp. Math., Amer. Math. Soc., Providence, RI, 71, 221–235.
[137] N. Sreenath, Y. G. Oh, P. S. Krishnaprasad and J. E. Marsden [1988], The dynamics of coupled planar rigid bodies. Part 1: Reduction, equilibria and stability. Dynamics Stability Systems, 3, 25–49.
[138] Z. Ge and J. E. Marsden [1988], Lie–Poisson integrators and Lie–Poisson Hamiltonian–Jacobi theory. Phys. Lett. A, 133, 134–139.


[139] J. C. Simo, J. E. Marsden, and P. S. Krishnaprasad [1988], The Hamiltonian structure of nonlinear elasticity: The material, spatial, and convective representations of solids, rods, and plates. Arch. Rational Mech. Anal., 104, 125–183.
[140] P. J. Holmes, J. E. Marsden, and J. Scheurle [1988], Exponentially small splittings of separatrices with applications to KAM theory and degenerate bifurcations. Contemp. Math., Amer. Math. Soc., Providence, RI, 81, 213–244.
[ 1989 ]
[141] Y. G. Oh, N. Sreenath, P. S. Krishnaprasad, and J. E. Marsden [1989], The dynamics of coupled planar rigid bodies. Part 2: Bifurcations, periodic solutions, and chaos. J. Dynamics Differential Equations, 1, 269–298.
[142] J. E. Marsden, R. Montgomery, and T. Ratiu [1989], Cartan–Hannay–Berry phases and symmetry. Contemp. Math., Amer. Math. Soc., Providence, RI, 97, 279–295.
[143] J. E. Marsden, J. C. Simo, D. R. Lewis, and T. A. Posbergh [1989], Block diagonalization and the energy momentum method. Contemp. Math., Amer. Math. Soc., Providence, RI, 97, 297–313.
[144] J. C. Simo, T. A. Posbergh and J. E. Marsden [1989], Stability analysis of a rigid body with attached geometrically nonlinear rod by the energy-momentum method. Contemp. Math., Amer. Math. Soc., Providence, RI, 97, 371–398.
[145] A. M. Bloch and J. E. Marsden [1989], Controlling homoclinic orbits. Theoret. Comput. Fluid Mech., 1, 179–190.
[146] D. Lewis and J. E. Marsden [1989], The Hamiltonian-dissipative decomposition of normal forms of vector fields. Proc. of the Conference on Bifurcation Theory and its Numerical Analysis, Xi’an Jiaotong Univ. Press, 51–78.
[147] A. M. Bloch and J. E. Marsden [1989], Control and stabilization of systems with homoclinic orbits. In Proceedings of the 28th IEEE Conference on Decision and Control, Tampa, FL, 1989, IEEE, New York, 1-3, 2238–2242.
[ 1990 ]
[148] A. M. Bloch and J. E. Marsden [1990], Stabilization of rigid body dynamics by the energy-Casimir method. Syst. and Cont. Lett., 14, 341–346.
[149] D. Lewis, J. E. Marsden, T. S. Ratiu and J. C. Simo [1990], Normalizing connections and the energy-momentum method. Proceedings of the CRM conference on Hamiltonian systems, Transformation Groups, and Spectral Transform Methods, CRM Press, (Harnad and Marsden, eds.), 207–227.
[150] S. J. Chern and J. E. Marsden [1990], A note on symmetry and stability for fluid flows. Geo. Astro. Fluid Dyn., 51, 19–26.
[151] J. E. Marsden, R. Montgomery, and T. S. Ratiu [1990], Reduction, symmetry, and phases in mechanics. Memoirs Amer. Math. Soc., 436, 1–110.
[152] J. E. Marsden and J. C. Simo [1990], The energy-momentum method. La “Mecanique Analytique” de Lagrange et son Héritage, Atti della Accademia delle Scienze di Torino, 124, 245–268.


[153] J. C. Simo, T. A. Posbergh and J. E. Marsden [1990], Stability of coupled rigid body and geometrically exact rods: block diagonalization and the energy-momentum method. Phys. Reports, 193, 280–360.
[ 1991 ]
[154] J. E. Marsden, O. M. O’Reilly, F. J. Wicklin, and B. W. Zombro [1991], Symmetry, stability, geometric phases, and mechanical integrators. Nonlinear Science Today, 1, 4–11, and 1, 14–21.
[155] J. Scheurle, J. E. Marsden and P. J. Holmes [1991], Exponentially small estimates for separatrix splittings. Asymptotics beyond all orders, Plenum, (H. Segur and S. Tanveer, eds.), 187–195.
[156] J. C. Simo, D. R. Lewis, and J. E. Marsden [1991], Stability of relative equilibria I: The reduced energy momentum method. Arch. Rational Mech. Anal., 115, 15–59.
[157] J. C. Simo, T. A. Posbergh and J. E. Marsden [1991], Stability of relative equilibria II: Three dimensional elasticity. Arch. Rational Mech. Anal., 115, 61–100.
[158] J. E. Marsden, T. S. Ratiu and G. Raugel [1991], Symplectic connections and the linearization of Hamiltonian systems. Proc. Roy. Soc. Edinburgh Sect. A, 117, 329–380.
[159] D. D. Holm and J. E. Marsden [1991], The rotor and the pendulum. Symplectic Geometry and Mathematical Physics (P. Donato, C. Duval, J. Elhadad, and G. M. Tuynman, eds.), Birkhäuser, Boston, 189–203.
[160] A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden and T. S. Ratiu [1991], Asymptotic stability, instability, and stabilization of relative equilibria. Proc. of ACC., Boston IEEE, 1120–1125.
[ 1992 ]
[161] F. J. Lin and J. E. Marsden [1992], Symplectic reduction and topology for applications in classical molecular dynamics. J. Math. Phys., 33, 1281–1294.
[162] M. Dellnitz, J. E. Marsden, I. Melbourne and J. Scheurle [1992], Generic bifurcations of pendula. Internat. Ser. Num. Math. (G. Allgower, K. Böhmer and M. Golubitsky, eds.), Birkhäuser, Boston, 104, 111–122.
[163] A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden, and G. Sánchez de Alvarez [1992], Stabilization of rigid body dynamics by internal and external torques. Automatica, 28, 745–756.
[164] D. Lewis, T. S. Ratiu, J. C. Simo, and J. E. Marsden [1992], The heavy top, a geometric treatment. Nonlinearity, 5, 1–48.
[165] M. Dellnitz, I. Melbourne and J. E. Marsden [1992], Generic bifurcation of Hamiltonian vector fields with symmetry. Nonlinearity, 5, 979–996.
[166] M. J. Gotay and J. E. Marsden [1992], Stress-energy-momentum tensors and the Belinfante–Rosenfeld formula. Contemp. Math., Amer. Math. Soc., Providence, RI, 132, 367–392.


[167] M. S. Alber and J. E. Marsden [1992], On geometric phases for soliton equations. Comm. Math. Phys., 149, 217–240.
[ 1993 ]
[168] H. P. Kruse, J. E. Marsden and J. Scheurle [1993], On uniformly rotating liquid drops between two parallel states. Lectures in Appl. Math., 29, 307–317.
[169] J. E. Marsden and J. Scheurle [1993], Lagrangian reduction and the double spherical pendulum. ZAMP, 44, 17–43.
[170] J. E. Marsden [1993], Steve Smale and Geometric Mechanics. Proceedings of the Smalefest, Springer-Verlag, (M. Hirsch, J. Marsden, and M. Shub, eds.), 45, 499–516.
[171] J. E. Marsden and J. Scheurle [1993], The reduced Euler–Lagrange equations. Fields Inst. Commun., 1, 139–164.
[172] J. E. Marsden [1993], Bifurcations in Hamiltonian systems with symmetry. Proc. 1st European Conf. on Nonlinear Oscillations, (E. Kreuzer and G. Schmidt, eds.), Akademie Verlag, 45–64.
[ 1994 ]
[173] A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden and T. S. Ratiu [1994], Dissipation Induced Instabilities. Ann. Inst. H. Poincaré, Analyse Non Linéaire, 11, 37–90.
[174] E. Knobloch, A. Mahalov, and J. E. Marsden [1994], Normal forms for three-dimensional parametric instabilities in ideal hydrodynamics. Physica D, 73, 49–81.
[175] J. E. Marsden [1994], Geometric mechanics, stability, and control. Appl. Math. Sciences, Springer, (L. Sirovich, ed.), 100, 265–292.
[176] M. S. Alber and J. E. Marsden [1994], Geometric phases and monodromy at singularities. In Singular Limits of Dispersive Waves, NATO Adv. Sci. Inst. Ser. B Phys., 320, (N. Ercolani et al., eds.) Plenum Press, NY, 273–295.
[177] M. S. Alber, R. Camassa, D. D. Holm, and J. E. Marsden [1994], The geometry of peaked solitons and billiard solutions of a class of integrable pde’s. Lett. Math. Phys., 32, 137–151.
[178] M. S. Alber and J. E. Marsden [1994], Resonant geometric phases for soliton equations. Fields Inst. Commun., 3, 1–26.
[179] J. E. Marsden [1994], Some remarks on geometric mechanics. Duration and Change: Fifty years at Oberwolfach, (M. Artin, H. Kraft, R. Remmert, eds.), Springer-Verlag, 254–274.
[180] M. S. Alber and J. E. Marsden [1994], Complex geometric asymptotics for nonlinear systems on complex varieties. Topol. Methods Nonlinear Anal., 4, 237–251.
[ 1995 ]
[181] M. S. Alber, R. Camassa, D. D. Holm and J. E. Marsden [1995], On the link between umbilic geodesics and soliton solutions of nonlinear PDE’s. Proc. Roy. Soc., 450, 677–692.


[182] M. Golubitsky, J. E. Marsden, I. M. Stewart, and M. Dellnitz [1995], The Constrained Liapunov–Schmidt procedure and periodic orbits. Fields Inst. Commun., 4, 81–127.
[183] J. E. Marsden, T. S. Ratiu, and G. Raugel [1995], Équations d’Euler dans une coque sphérique mince (Euler equations on a thin spherical shell). C. R. Acad. Sci. Paris, 321, 1201–1206.
[184] Z. Ge, H. P. Kruse, J. E. Marsden, and C. Scovel [1995], The convergence of Hamiltonian structures in the shallow water approximation. Canadian Quarterly of Applied Math., 3, 277–302.
[185] N. H. Getz and J. E. Marsden [1995], A dynamic inverse for nonlinear maps. Proc. IEEE Conference on Decision and Control, 34.
[186] N. H. Getz and J. E. Marsden [1995], Joint-space tracking of workspace trajectories in continuous time. Proc. IEEE Conference on Decision and Control, 34.
[187] N. H. Getz and J. E. Marsden [1995], Control for an autonomous bicycle. IEEE International Conference on Robotics and Automation, Nagoya, Japan.
[188] N. H. Getz and J. E. Marsden [1995], Tracking implicit trajectories. IFAC Symposium on Nonlinear Control Systems Design, Tahoe City.
[189] J. E. Marsden and J. Scheurle [1995], Pattern evocation and geometric phases in mechanical systems with symmetry. Dynamics Stability Systems, 10, 315–338.
[ 1996 ]
[190] Z. Ge, H. P. Kruse and J. E. Marsden [1996], The limits of Hamiltonian structures in three-dimensional elasticity, shells and rods. J. Nonlinear Sci., 6, 19–57.
[191] M. S. Alber and J. E. Marsden [1996], Semiclassical monodromy and the spherical pendulum as a complex Hamiltonian system. Fields Inst. Commun., 8, 1–18.
[192] A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden, and T. S. Ratiu [1996], The Euler–Poincaré equations and double bracket dissipation. Comm. Math. Phys., 175, 1–42.
[193] J. E. Marsden, J. Scheurle, and J. Wendlandt [1996], Visualization of orbits and pattern evocation for the double spherical pendulum. ICIAM 95: Mathematical Research, Akademie Verlag, (K. Kirchgässner, O. Mahrenholtz, and R. Mennicken, eds.), 87, 213–232.
[194] V. Kirk, J. E. Marsden, and M. Silber [1996], Branches of stable three-tori using Hamiltonian methods in Hopf bifurcation on a rhombic lattice. Dynamics Stability Systems, 11, 267–302.
[195] A. M. Bloch, P. S. Krishnaprasad, J. E. Marsden, and R. Murray [1996], Nonholonomic mechanical systems with symmetry. Arch. Rational Mech. Anal., 136, 21–99.
[196] C.-Y. Xu and J. E. Marsden [1996], Asymptotic stability for equilibria of nonlinear semiflows with applications to rotating viscoelastic rods I. Topol. Methods Nonlinear Anal., 7, 271–297.

[ 1997 ]

[197] M. S. Alber, G. G. Luther, and J. E. Marsden [1997], Complex billiard Hamiltonian systems and nonlinear waves. In Algebraic Aspects of Integrable Systems: in Memory of Irene Dorfman, (A. S. Fokas and I. M. Gelfand, eds.), Progress in Nonlinear Differential Equations, Birkhäuser, 26, 1–16.
[198] M. S. Alber, G. G. Luther and J. E. Marsden [1997], Energy dependent Schrödinger operators and complex Hamiltonian systems on Riemann surfaces. Nonlinearity, 10, 223–242.
[199] A. M. Bloch, J. E. Marsden and G. Sánchez [1997], Feedback stabilization of relative equilibria of mechanical systems with symmetry. Current and Future Directions in Applied Mathematics, (M. S. Alber, B. Hu, and J. Rosenthal, eds.) Birkhäuser, 43–64.
[200] J. E. Marsden and J. M. Wendlandt [1997], Mechanical systems with symmetry, variational principles and integration algorithms. Current and Future Directions in Applied Mathematics, (M. S. Alber, B. Hu, and J. Rosenthal, eds.) Birkhäuser, 219–261.
[201] W. S. Koon and J. E. Marsden [1997], Optimal control for holonomic and nonholonomic mechanical systems with symmetry and Lagrangian reduction. SIAM J. Control Optim., 35, 901–929.
[202] J. E. Marsden [1997], Geometric foundations of motion and control. Motion, Control and Geometry. BMS, National Academy Press, 3–19.
[203] N. H. Getz and J. E. Marsden [1997], Dynamical methods for polar decomposition and inversion of matrices. Linear Algebra and its Appl., 258, 311–343.
[204] N. E. Leonard and J. E. Marsden [1997], Stability and drift of underwater vehicle dynamics: Mechanical Systems with Rigid Motion Symmetry. Physica D, 105, 130–162.
[205] J. M. Wendlandt and J. E. Marsden [1997], Mechanical integrators derived from a discrete variational principle. Physica D, 106, 223–246.
[206] A. M. Bloch, N. Leonard, and J. E. Marsden [1997], Stabilization of mechanical systems using controlled Lagrangians. Proc. CDC, 36, 2356–2361.
[207] W. S. Koon and J. E. Marsden [1997], The geometric structure of nonholonomic mechanics. Proc. CDC, 36, 4856–4862.
[208] W. S. Koon and J. E. Marsden [1997], The Hamiltonian and Lagrangian approaches to the dynamics of nonholonomic systems. Rep. Math. Phys., 40, 21–62.
[ 1998 ]
[209] D. D. Holm, J. E. Marsden, and T. S. Ratiu [1998], Euler–Poincaré models of ideal fluids with nonlinear dispersion. Phys. Rev. Lett., 349, 4173–4177.
[210] H. Cendra, D. D. Holm, M. J. W. Hoyle, and J. E. Marsden [1998], The Maxwell–Vlasov equations in Euler–Poincaré form. J. Math. Phys., 39, 3138–3157.


[211] D. V. Zenkov, A. M. Bloch, and J. E. Marsden [1998], The Energy Momentum Method for the Stability of Nonholonomic Systems. Dynamics Stability Systems, 13, 123–166.
[212] D. D. Holm, J. E. Marsden and T. S. Ratiu [1998], The Euler–Poincaré Equations and Semidirect Products with Applications to Continuum Theories. Adv. Math., 137, 1–81.
[213] J. E. Marsden, G. Misiolek, M. Perlmutter, and T. Ratiu [1998], Symplectic reduction for semidirect products and central extensions. Differential Geom. Appl., 9, 173–212.
[214] H. Cendra, D. D. Holm, J. E. Marsden and T. S. Ratiu [1998], Lagrangian Reduction, the Euler–Poincaré Equations, and Semidirect Products. Amer. Math. Soc. Transl., 186, 1–25.
[215] S. Pekarsky and J. E. Marsden [1998], Point Vortices on a Sphere: Stability of Relative Equilibria. J. Math. Phys., 39, 5894–5907.
[216] M. S. Alber, G. G. Luther, J. E. Marsden, and J. M. Robbins [1998], Geometric phases, reduction and Lie–Poisson structure for the resonant three-wave interaction. Physica D, 123, 271–290.
[217] J. E. Marsden, G. W. Patrick, and S. Shkoller [1998], Multisymplectic Geometry, Variational Integrators, and Nonlinear PDEs. Comm. Math. Phys., 199, 351–395.
[218] W. S. Koon and J. E. Marsden [1998], The Poisson reduction of nonholonomic mechanical systems. Reports on Math Phys., 42, 101–134.
[219] A. M. Bloch, N. Leonard, and J. E. Marsden [1998], Matching and stabilization by the method of controlled Lagrangians. Proc. CDC, 37, 1446–1451.
[220] A. M. Bloch, P. Crouch, J. E. Marsden, and T. S. Ratiu [1998], Discrete rigid body dynamics and optimal control. Proc. CDC, 37, 2249–2254.
[221] S. Glavaski, J. E. Marsden, and R. M. Murray [1998], Model reduction, centering, and the Karhunen–Loève expansion. Proc. CDC, 37, 2071–2076.
[222] J. E. Marsden and J. Ostrowski [1998], Symmetries in Motion: Geometric Foundations of Motion Control. Nonlinear Sci. Today. (http://link.springer-ny.com).
[ 1999 ]
[223] H. P. Kruse, A. Mahalov, and J. E. Marsden [1999], On the Hamiltonian structure and three-dimensional instabilities of rotating liquid bridges. Fluid Dyn. Research, 24, 37–59.
[224] J. E. Marsden and S. Shkoller [1999], Multisymplectic geometry, covariant Hamiltonians and water waves. Math. Proc. Camb. Phil. Soc., 125, 553–575.
[225] S. M. Jalnapurkar and J. E. Marsden [1999], Stabilization of relative equilibria II. Reg. and Chaotic Dyn., 3, 161–179.
[226] C. Kane, J. E. Marsden, and M. Ortiz [1999], Symplectic energy momentum integrators, J. Math. Phys., 40, 3353–3371.
[227] J. E. Marsden [1999], Park City lectures on mechanics, dynamics and symmetry. Symplectic Geometry and Topology, (Y. Eliashberg and L. Traynor, eds.), Amer. Math. Soc. IAS/Park City Math. Series, 7, 335–430.


[228] J. E. Marsden, S. Pekarsky and S. Shkoller [1999], Discrete Euler–Poincaré and Lie–Poisson equations. Nonlinearity, 12, 1647–1662.
[229] C. Kane, E. A. Repetto, M. Ortiz, and J. E. Marsden [1999], Finite element analysis of nonsmooth contact. Comput. Methods Appl. Mech. Engrg., 180, 1–26.
[230] M. S. Alber, R. Camassa, Y. N. Fedorov, D. D. Holm, J. E. Marsden [1999], On billiard solutions of nonlinear PDE’s. Phys. Lett. A, 264, 171–178.
[231] M. S. Alber, G. G. Luther, J. E. Marsden, and J. M. Robbins [1999], Geometry and control of three-wave interactions. Fields Inst. Commun., 24, 55–80.
[232] A. M. Bloch, N. Leonard and J. E. Marsden [1999], Potential shaping and the method of controlled Lagrangians. Proc. CDC, 38, 1653–1657.
[233] D. V. Zenkov, A. M. Bloch, and J. E. Marsden [1999], Stabilization of the unicycle with rider. Proc. CDC, 38, 3470–3471.
[234] J. E. Marsden, S. Pekarsky, and S. Shkoller [1999], Stability of relative equilibria of point vortices on a sphere and symplectic integrators. Il Nuovo Cimento, 22, 793–802.
[235] S. Lall, J. E. Marsden, and S. Glavaski [1999], Empirical model reduction of controlled nonlinear systems, Proceedings of the IFAC World Congress, F, 473–478.
[236] P. A. Parrilo, S. Lall, F. Paganini, G. C. Verghese, B. C. Lesieutre, and J. E. Marsden [1999], Model reduction for analysis of cascading failures in power systems, Proc. Amer. Control Conf., June 2–4, 1999, 4208–4212.
[237] A. M. Bloch, N. Leonard and J. E. Marsden [1999], Stabilization of the pendulum on a rotor arm by the method of controlled Lagrangians, Proceedings of the International Conference on Robotics and Automation 1999, IEEE, 500–505.
[ 2000 ]
[238] A. M. Bloch, D. E. Chang, N. E. Leonard, J. E. Marsden, C. Woolsey [2000], Asymptotic stabilization of Euler–Poincaré mechanical systems. IFAC Proceedings, Princeton, March 16–18, 2000.
[239] D. V. Zenkov, A. M. Bloch, N. E. Leonard, and J. E. Marsden [2000], Matching and stabilization of the unicycle with rider, IFAC Proceedings, Princeton, March 16–18, 2000.
[240] A. M. Bloch, P. E. Crouch, J. E. Marsden, T. S. Ratiu [2000], An almost Poisson structure for the generalized rigid body equations, IFAC Proceedings, Princeton, March 16–18, 2000.
[241] R. Serban, W. S. Koon, M. Lo, J. E. Marsden, L. R. Petzold, S. D. Ross, and R. S. Wilson [2000], Optimal control for halo orbit missions, IFAC Proceedings, Princeton, March 16–18, 2000.
[242] C. W. Rowley and J. E. Marsden [2000], Reconstruction equations and the Karhunen–Loève expansion for systems with symmetry. Physica D, 142, 1–19.
[243] M. S. Alber, G. G. Luther, J. E. Marsden, and J. M. Robbins [2000], Geometric analysis of optical frequency conversion and its control in quadratic nonlinear media. J. Opt. Soc. Amer. B, 17, 932–941.


[244] W.-S. Koon, M. W. Lo, J. E. Marsden, and Shane D. Ross [2000], Heteroclinic Connections between periodic orbits and resonance transitions in celestial mechanics, Chaos 10, 427–469.
[245] J. E. Marsden, T. S. Ratiu and J. Scheurle [2000], Reduction theory and the Lagrange–Routh equations, J. Math. Phys. Millennium Issue 41, 3379–3429.
[246] J. E. Marsden and M. Perlmutter [2000], The orbit bundle picture of cotangent bundle reduction, C. R. Math. Rep. Acad. Sci. Canada, 22, 33–54.
[247] W. S. Koon, M. W. Lo, J. E. Marsden and S. D. Ross [2000b], Shoot the Moon, AAS/AIAA Astrodynamics Specialist Conference, Florida, 2000, AAS, 000-166.
[248] J. E. Marsden, S. Pekarsky, and S. Shkoller [2000], Symmetry Reduction of Discrete Lagrangian Mechanics on Lie Groups. J. Geom. Physics, 36, 140–150.
[249] S. M. Jalnapurkar and J. E. Marsden [2000], Stabilization of relative equilibria, IEEE Trans. Automat. Control 45, 1483–1491.
[250] S. M. Jalnapurkar and J. E. Marsden [2000], Reduction of Hamilton’s variational principle, Dynamics Stability Systems, 15, 287–318.
[251] J. E. Marsden, T. S. Ratiu, and S. Shkoller [2000], The geometry and analysis of the averaged Euler equations and a new diffeomorphism group, Geom. Funct. Anal., 10, 582–599.
[252] C. Kane, J. E. Marsden, M. Ortiz and M. West [2000], Variational integrators and the Newmark algorithm for conservative and dissipative mechanical systems, Internat. J. Numer. Methods Engrg., 49, 1295–1325.
[253] C. Kane, J. E. Marsden, M. Ortiz, and A. Pandolfi [2000], Frictional Collisions Off Sharp Objects, International Conference on Differential Equations, Berlin, 1999, (B. Fiedler, K. Gröger and J. Sprekels, eds.), World Scientific, 979–984.
[254] J. E. Marsden, S. Pekarsky, and S. Shkoller [2000], Poisson structure and invariant manifolds on Lie groups, International Conference on Differential Equations, Berlin, 1999, (B. Fiedler, K. Gröger, and J. Sprekels, eds.), World Scientific, 1192–1197.
[255] W. S. Koon, M. W. Lo, J. E. Marsden, and S. D. Ross [2000], Dynamical Systems, the Three-Body Problem, and Space Mission Design, International Conference on Differential Equations, Berlin, 1999, (B. Fiedler, K. Gröger, and J. Sprekels, eds.), World Scientific, 1167–1181.
[256] M. West, C. Kane, J. E. Marsden and M. Ortiz [2000], Variational integrators, the Newmark scheme, and dissipative systems, International Conference on Differential Equations, Berlin, 1999, (B. Fiedler, K. Gröger and J. Sprekels, eds.), World Scientific, 1009–1011.
[257] J. E. Marsden, T. S. Ratiu, G. Raugel [2000], The Euler Equations on Thin Domains, International Conference on Differential Equations, Berlin, 1999, (B. Fiedler, K. Gröger and J. Sprekels, eds.), World Scientific, 1198–1203.
[258] K. Mohseni, S. Shkoller, B. Kosović, J. E. Marsden, D. Carati, A. Wray, and R. Rogallo [2000], Numerical Simulations of Homogeneous Turbulence Using the Lagrangian-averaged Navier-Stokes equations, Proc. Center for Turbulence Research, Summer School,


[259] D. E. Chang and J. E. Marsden [2000], Asymptotic stabilization of the heavy top using controlled Lagrangians. Proc. CDC, 39, 269–273.
[260] A. M. Bloch, P. Crouch, D. Holm, and J. E. Marsden [2000], An optimal control formulation for inviscid incompressible ideal fluid flow, Proc. CDC, 39, 1273–1279.
[261] D. V. Zenkov, A. M. Bloch, N. E. Leonard, and J. E. Marsden [2000], Matching and stabilization of low-dimensional nonholonomic systems, Proc. CDC, 39, 1289–1295.
[262] A. M. Bloch, N. Leonard, and J. E. Marsden [2000], Controlled Lagrangians and the stabilization of mechanical systems I: The first matching theorem, IEEE Trans. Automat. Control 45, 2253–2270.
[ 2001 ]
[263] H. Cendra, J. E. Marsden, and T. S. Ratiu [2001], Geometric mechanics, Lagrangian reduction and nonholonomic systems. In Mathematics Unlimited — 2001 and Beyond, (B. Enquist and W. Schmid, eds.), Springer-Verlag, New York, pages 221–273.
[264] J. E. Marsden, S. Pekarsky, S. Shkoller, and M. West [2001], Variational methods, multisymplectic geometry and continuum mechanics, J. Geom. Phys. 38, 253–284.
[265] P. Krysl, S. Lall, and J. Marsden [2001], Dimensional model reduction in nonlinear finite element dynamics of solids and structures, Internat. J. Numer. Methods Engrg. 51, 479–504.
[266] A. M. Bloch, N. Leonard, and J. E. Marsden [2001], Controlled Lagrangians and the stabilization of Euler–Poincaré mechanical systems, Internat. J. Robust Nonlinear Control 11, 191–214.
[267] K. Mohseni, B. Kosović, S. Shkoller, and J. E. Marsden [2001], Numerical simulations of the Lagrangian averaged Navier–Stokes (LANS-α) equations for forced homogeneous isotropic turbulence, AIAA 2001–2645.
[268] H. Cendra, J. E. Marsden, and T. S. Ratiu [2001], Lagrangian reduction by stages, volume 152 of Memoirs. Amer. Math. Soc., Providence, R.I.
[269] J. E. Marsden and S. Shkoller [2001], Global well-posedness for the Lagrangian averaged Navier-Stokes (LANS-α) equations on bounded domains. Philos. Trans. Roy. Soc. London Ser. A 359, 1449–1468.
[270] S. Pekarsky and J. E. Marsden [2001], Abstract mechanical connection and abelian reconstruction for almost Kähler manifolds, Journal of Applied Mathematics 1, 1–28.
[271] J. E. Marsden and M. West [2001], Discrete mechanics and variational integrators, Acta Numerica 10, 357–514.
[272] A. Hirani, J. E. Marsden, and J. Arvo [2001], Averaged template matching equations, Lecture Notes in Comput. Sci., Springer 2134, 528–543.
[273] M. Alber, R. Camassa, Y. Fedorov, D. Holm, and J. E. Marsden [2001], The complex geometry of weak piecewise smooth solutions of integrable nonlinear PDE’s of shallow water and Dym type, Comm. Math. Phys. 221, 197–227.


[274] C. Woolsey, A. M. Bloch, N. E. Leonard, and J. E. Marsden [2001], Physical dissipation and the method of controlled Lagrangians, Proceedings of the European Control Conference, Porto, Portugal, September 2001, 2570–2575.
[275] A. M. Bloch, D. Chang, N. Leonard, and J. E. Marsden [2001], Controlled Lagrangians and the stabilization of mechanical systems II: Potential shaping, IEEE Trans. Automat. Control, 46, 1556–1571.
[276] G. Gomez, W. S. Koon, M. W. Lo, J. E. Marsden, J. Masdemont, and S. D. Ross [2001], Invariant manifolds, the spatial three-body problem and space mission design, Proceedings of AIAA/AAS Astrodynamics Specialist Meeting, Quebec City, Quebec, Canada, August, 2001, AAS 01–301.
[277] W. S. Koon, J. E. Marsden, J. Masdemont, and R. M. Murray [2001], J2 dynamics and formation flight, Proceedings of AIAA Guidance, Navigation, and Control Conference, Montreal, Canada, August, AIAA 2001–4090.
[278] C. Woolsey, A. M. Bloch, N. E. Leonard, and J. E. Marsden [2001], Dissipation and controlled Euler–Poincaré systems, 2001 Conference on Decision and Control, Orlando, FL, December 2001, 3378–3383.
[279] M. G. Clerc and J. E. Marsden [2001], Dissipation-induced instabilities in an optical cavity laser: A mechanical analog near the 1:1 resonance, Physical Review E, 64, (2001), 067603.
[280] J. E. Marsden and A. Weinstein [2001], Comments on the history, theory, and applications of symplectic reduction. In Quantization of Singular Symplectic Quotients. (N. Landsman, M. Pflaum, and M. Schlichenmaier, eds.), Birkhäuser, Boston, pp 1–20.
[281] W. S. Koon, M. Lo, J. E. Marsden, and S. Ross [2001], Resonance and capture of Jupiter comets, Celestial Mech. and Dyn. Astron. 81, 27–38.
[282] W. S. Koon, M. Lo, J. E. Marsden, and S. Ross [2001], Low energy transfer to the moon, Celestial Mech. and Dyn. Astron. 81, 63–73.
[ 2002 ]
[283] D. E. Chang, D. Chichka, and J. E. Marsden [2002], Lyapunov-based transfer between elliptic Keplerian orbits, Discrete and Continuous Dynamical Systems, Series B 2, 57–67.
[284] A. Pandolfi, C. Kane, J. E. Marsden, and M. Ortiz [2002], Time-discretized variational formulation of nonsmooth frictional contact, Internat. J. Numer. Methods Engrg. 53, 1801–1829.
[285] B. N. Shashikanth, J. E. Marsden, J. W. Burdick, and S. D. Kelly [2002], The Hamiltonian structure of a 2D rigid circular cylinder interacting dynamically with N point vortices, Phys. of Fluids, 14, 1214–1227.
[286] R. Serban, W. S. Koon, M. Lo, J. E. Marsden, L. R. Petzold, S. D. Ross, and R. S. Wilson [2002], Halo orbit mission correction maneuvers using optimal control, Automatica 38, 571–583.
[287] W. S. Koon, M. Lo, J. E. Marsden, and S. Ross [2002], Constructing a low energy transfer between Jovian moons, Contemp. Math., Amer. Math. Soc., Providence, RI, 292, 129–146.


Textbooks (Chronological)
[288] Basic Complex Analysis. W. H. Freeman, 1973. Second Edition (with M. Hoffman), 1987. Third Edition, November 1998.
[289] Elementary Classical Analysis. W. H. Freeman, 1974. Second Edition (with M. Hoffman), 1993.
[290] Vector Calculus (with Anthony Tromba). W. H. Freeman, 1976. Fourth Edition, 1996.
[291] Calculus, I, II, III (with Alan Weinstein). Second Edition, Springer-Verlag, 1985.
[292] Calculus Unlimited (with Alan Weinstein). Benjamin-Cummings, 1981.
[293] Basic Multivariable Calculus (with Weinstein and Tromba). W. H. Freeman and Springer-Verlag, 1992.

Monographs (Chronological)
[294] Foundations of Mechanics, (with R. Abraham), 1967. Second Edition, Addison Wesley, 1978.
[295] Lectures on Analysis, by G. Choquet (co-editor). Benjamin, 3 vols., 1969.
[296] Applications of Global Analysis in Mathematical Physics, Berkeley Lecture Note Series, 1976.
[297] The Hopf Bifurcation and Its Applications, (with M. McCracken), Appl. Math. Sci. 19, Springer-Verlag, 1976.
[298] A Mathematical Introduction to Fluid Mechanics, (with A. Chorin), Texts in Appl. Math. 4, Springer-Verlag, 1979. Third Edition, 1993.
[299] Lectures on Geometric Methods in Mathematical Physics. 37, SIAM–CBMS, 1981.
[300] Mathematical Foundations of Elasticity, (with T. Hughes), Prentice Hall, 1983. Reprinted Dover, 1994.
[301] Manifolds, Tensor Analysis, and Applications, (with R. Abraham and T. Ratiu), 1983. Second Edition, Appl. Math. Sci. 75, Springer-Verlag, 1988.
[302] Lectures on Mechanics, Cambridge University Press, 1992.
[303] Introduction to Mechanics and Symmetry, (with T. Ratiu), Texts in Appl. Math. 17, Springer-Verlag, 1994. Second Edition, 1999.

Contributors

Editors:
∗ Philip Holmes, Department of Mechanical and Aerospace Engineering, D–202B Engineering Quad, Princeton University, Princeton, NJ 08544, [email protected]
∗ Paul K. Newton, Department of Aerospace and Mechanical Engineering, University of Southern California, Los Angeles, CA 90089–1191, [email protected]
∗ Alan Weinstein, Department of Mathematics, University of California, Berkeley, Berkeley, CA 94720, [email protected]

Authors:
∗ John M. Ball, Mathematical Institute, University of Oxford, 24–29 St. Giles’, Oxford OX1 3LB, UK, [email protected]
∗ Anthony Bloch, Department of Mathematics, University of Michigan, Ann Arbor, MI 48109–1109, [email protected]
∗ Alain Chenciner, Département de Mathématiques, Université Paris VII - Denis Diderot, 16, rue Clisson, 75013 Paris, France, [email protected]; or Astronomie et Systèmes Dynamiques, IMCCE, UMR 8028 du CNRS, 77, avenue Denfert–Rochereau, 75014 Paris, France, [email protected]
∗ Arthur Fischer, Department of Mathematics, University of California, Santa Cruz, Santa Cruz, CA 95064, [email protected]
∗ Joseph Gerver, Department of Mathematics, Rutgers University, Camden, NJ 08102, [email protected]
∗ Martin Golubitsky, Department of Mathematics, University of Houston, Houston, TX 77204-3008, [email protected]
∗ Mark Gotay, Department of Mathematics, University of Hawaii, 2565 The Mall, Honolulu, HI 96822–2273, [email protected]
∗ Victor Guillemin, Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139–4307, [email protected]
∗ D. D. Holm, Theoretical Division, T–7, MS 284, Los Alamos National Lab, Los Alamos, NM 87545, [email protected]
∗ T. J. R. Hughes, Mechanics and Computation, Durand Building, Stanford University, Stanford, CA 94305, tjr [email protected]


∗ Edgar Knobloch, Department of Physics, University of California, Berkeley, Berkeley, CA 94720, [email protected]; or Department of Applied Mathematics, University of Leeds, Leeds LS2 9JT, UK, [email protected]
∗ Naomi Leonard, Department of Mechanical and Aerospace Engineering, D–202B Engineering Quad, Princeton University, Princeton, NJ 08544, [email protected]
∗ Adrian Lew, Department of Aeronautics, M/C 105-50, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, [email protected]
∗ Robert Littlejohn, Department of Physics, University of California, Berkeley, Berkeley, CA 94720, [email protected]
∗ Alexander Mielke, Mathematisches Institut A, Universität Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany, [email protected]
∗ Kevin Mitchell, P.O. Box 8795, College of William and Mary, Williamsburg, VA 23187-8795, [email protected]
∗ Vincent Moncrief, Department of Mathematics, Yale University, 10 Hillhouse Ave., New Haven, CT 06520–8283, [email protected]
∗ Richard Montgomery, Applied Sciences Building, University of California, Santa Cruz, Santa Cruz, CA 95064, [email protected]

∗ Assad A. Oberai, Department of Aerospace and Mechanical Engineering, Boston University, Boston, MA 02215, [email protected]
∗ Juan-Pablo Ortega, Institut Non Linéaire de Nice, Centre National de la Recherche Scientifique, 1361, route des Lucioles, F-06560 Valbonne, France, [email protected]
∗ Michael Ortiz, Department of Aeronautics and Mechanical Engineering, M/C 105–50, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, [email protected]
∗ Tudor Ratiu, Institut Bernoulli, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland, [email protected]

∗ Jürgen Scheurle, Technische Universität München, Zentrum Mathematik, D-80290 München, Germany, [email protected]
∗ Steve Shkoller, Department of Mathematics, University of California, Davis, Davis, CA 95616, [email protected]

∗ Carles Simó, Departament de Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Gran Via, 585, Barcelona 08007, Spain, [email protected]
∗ Ian Stewart, Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK, [email protected]


∗ José Vega, E.T.S.I. Aeronáuticos, Universidad Politécnica de Madrid, Plaza Cardenal Cisneros 3, 28040 Madrid, Spain, [email protected]
∗ Sebastian Walcher, Lehrstuhl A für Mathematik, RWTH Aachen, D-52056 Aachen, Germany, [email protected]
∗ Catalin Zara, Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139, [email protected]

TeXnical Editors:
∗ Wendy G. McKay, Control and Dynamical Systems, M/C 107-81, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125-8100, [email protected]
∗ Ross R. Moore, Mathematics Department, Macquarie University, Sydney, NSW 2109, Australia, [email protected]

